
AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective





Authors: Zhenyi Wang, Siyu Luan
[Submitted on 25 Mar 2026]

Abstract: As machine learning (ML) systems expand in both scale and functionality, the security landscape has become increasingly complex, with a proliferation of attacks and defenses. However, existing studies largely treat these threats in isolation, lacking a coherent framework to expose their shared principles and interdependencies. This fragmented view hinders systematic understanding and limits the design of comprehensive defenses. Crucially, the two foundational assets of ML, data and models, are no longer independent; vulnerabilities in one directly compromise the other. The absence of a holistic framework leaves open questions about how these bidirectional risks propagate across the ML pipeline. To address this critical gap, we propose a unified closed-loop threat taxonomy that explicitly frames model-data interactions along four directional axes. Our framework offers a principled lens for analyzing and defending foundation models. The resulting four classes of security threats represent distinct but interrelated categories of attacks:

(1) Data→Data (D→D): including data decryption attacks and watermark removal attacks;
(2) Data→Model (D→M): including poisoning, harmful fine-tuning attacks, and jailbreak attacks;
(3) Model→Data (M→D): including model inversion, membership inference attacks, and training data extraction attacks;
(4) Model→Model (M→M): including model extraction attacks.

Our unified framework elucidates the underlying connections among these security threats and establishes a foundation for developing scalable, transferable, and cross-modal security strategies, particularly within the landscape of foundation models.
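The four directional axes in the abstract can be read as a lookup from a (source asset, target asset) pair to the attack families listed under it. The following is a minimal sketch of that reading, not the authors' implementation; the names `Asset`, `TAXONOMY`, and `classify` are hypothetical and introduced only for illustration, and the entries are limited to the attack families the abstract itself names.

```python
from enum import Enum


class Asset(Enum):
    """The two foundational ML assets named in the abstract."""
    DATA = "D"
    MODEL = "M"


# (source asset, target asset) -> attack families listed in the abstract.
# Hypothetical structure: the abstract defines the axes, not this encoding.
TAXONOMY = {
    (Asset.DATA, Asset.DATA): ["data decryption", "watermark removal"],
    (Asset.DATA, Asset.MODEL): ["poisoning", "harmful fine-tuning", "jailbreak"],
    (Asset.MODEL, Asset.DATA): [
        "model inversion",
        "membership inference",
        "training data extraction",
    ],
    (Asset.MODEL, Asset.MODEL): ["model extraction"],
}


def classify(attack):
    """Return the axis label (e.g. 'D->M') for an attack family, or None if unlisted."""
    name = attack.strip().lower()
    for (source, target), families in TAXONOMY.items():
        if name in families:
            return f"{source.value}->{target.value}"
    return None


if __name__ == "__main__":
    for attack in ["poisoning", "membership inference", "model extraction"]:
        print(attack, "is on axis", classify(attack))
```

Keying the table on an ordered pair keeps the direction of each threat explicit, which is the property the closed-loop framing relies on: D→M and M→D are distinct entries even though they involve the same two assets.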
Comments: Published at Transactions on Machine Learning Research (TMLR)
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2603.24857 [cs.CR] (or arXiv:2603.24857v1 [cs.CR] for this version)
DOI: https://doi.org/10.48550/arXiv.2603.24857
Submission history: From: Zhenyi Wang. [v1] Wed, 25 Mar 2026 22:53:43 UTC (116 KB)
