
PYPILINE: Malicious PyPI Package Detection via Suspicious API Knowledge and Agent Workflow

arXiv Security · Archived Jun 18, 2026



✦ AI Summary · Claude Sonnet

[Submitted on 17 Jun 2026]

Siyuan Pang, Zhengwei Jiang, Yepeng Yao, Zijing Fan, Haozhe Li, Baoxu Liu

The detection of malicious PyPI packages is crucial for maintaining the security of the open source software supply chain. Existing methods, which primarily rely on rules or traditional machine learning, suffer from poor interpretability and difficulty in adapting to novel attacks. To address this, we propose PYPILINE, a novel detection method that combines a suspicious API knowledge base with an Agent workflow.

PYPILINE first conducts static analysis on known malicious packages, extracting abstract syntax trees and generating API call graphs, from which it automatically extracts and constructs a structured suspicious API knowledge base. During the detection phase, this knowledge base is used to enhance reasoning capabilities. Through an Agent workflow, PYPILINE performs in-depth semantic analysis of unknown packages and outputs a structured, interpretable maliciousness assessment report.

The experimental results show that PYPILINE significantly outperforms existing state-of-the-art tools, achieving a precision of 96.7%, a recall of 99.6%, and an F1-score of 98.1%, with its precision surpassing baseline tools by 5.7 to 24.2 percentage points. Additionally, we conducted an empirical study on malicious packages, systematically revealing prevalent attack strategies as well as the most commonly abused APIs. Equipped with tool-calling AI agent workflows for automated vector database retrieval of suspicious API knowledge and mail server delivery of analysis reports, PYPILINE delivers a practical, efficient, and convenient malicious package detection solution to strengthen open-source ecosystem security.
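The static-analysis step described in the abstract (parse source into an abstract syntax tree, collect API calls, compare them against a suspicious API list) can be illustrated with a minimal sketch using Python's standard `ast` module. This is not the authors' implementation: the `SUSPICIOUS_APIS` set, the helper names `dotted_name`, `extract_api_calls`, and `flag_suspicious`, and the sample snippet are all hypothetical and chosen only for illustration. The sketch also does not build a call graph or resolve import aliases.

```python
import ast

# Hypothetical seed list for illustration only. The paper's knowledge base is
# built automatically from known malicious packages and is not reproduced here.
SUSPICIOUS_APIS = {
    "os.system",
    "subprocess.Popen",
    "base64.b64decode",
    "exec",
    "eval",
}


def dotted_name(node):
    """Return a dotted name such as 'os.system' for a call target, or None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or on another call's result


def extract_api_calls(source):
    """Parse source into an AST and return sorted (line, api_name) pairs."""
    tree = ast.parse(source)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name is not None:
                calls.append((node.lineno, name))
    return sorted(calls)


def flag_suspicious(source):
    """Keep only the calls whose dotted name is in the suspicious list."""
    return [
        (line, name)
        for line, name in extract_api_calls(source)
        if name in SUSPICIOUS_APIS
    ]


# Hypothetical snippet resembling an install-time payload launcher.
sample = (
    "import os, base64\n"
    "payload = base64.b64decode('ZWNobyBoaQ==')\n"
    "os.system(payload.decode())\n"
    "print('installed')\n"
)

for line, name in flag_suspicious(sample):
    print(f"line {line}: {name}")
```

Matching on literal dotted names is the simplest possible lookup and misses aliased imports such as `import os as o`. The abstract describes a richer approach in which retrieved knowledge-base entries support an agent's semantic analysis rather than acting as a fixed blocklist.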
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.19063 [cs.CR] (or arXiv:2606.19063v1 [cs.CR] for this version)
DOI: https://doi.org/10.48550/arXiv.2606.19063
Submission history: From Siyuan Pang. [v1] Wed, 17 Jun 2026 13:37:17 UTC (1,213 KB)
