Cross-Modal Backdoors in Multimodal Large Language Models
Computer Science > Cryptography and Security
[Submitted on 8 May 2026]
Runhe Wang, Li Bai, Haibo Hu, Songze Li
Developers increasingly construct multimodal large language models (MLLMs) by assembling pretrained components, introducing supply-chain attack surfaces. Existing security research primarily focuses on poisoning backbones such as encoders or large language models (LLMs), while the security risks of lightweight connectors remain unexplored. In this work, we propose a novel cross-modal backdoor attack that exploits this overlooked vulnerability. By poisoning only the connector using a single seed sample and several augmented variants from one modality, the adversary can subsequently activate the backdoor using inputs from other modalities. To achieve this, we first poison the connector to associate a compact latent region with a malicious target. To activate the backdoor from other modalities, we further extract a malicious centroid from the poisoned latent representations and perform input-side optimization to steer inputs toward this latent anchor, without requiring repeated API queries or full-model access. Extensive evaluations on representative connector-based MLLM architectures, including PandaGPT and NExT-GPT, demonstrate both the effectiveness and cross-modal transferability of the proposed attack. The attack achieves up to 99.9% attack success rate (ASR) in same-modality settings, while most cross-modal settings exceed 95.0% ASR under bounded perturbations. Meanwhile, the attack remains highly stealthy, producing negligible leakage on clean inputs and maintaining weight-cosine similarity above 0.97 relative to benign weights. We further show that existing defense strategies fail to effectively mitigate this threat without incurring substantial utility loss. Our findings reveal a fundamental vulnerability in multimodal alignment: a single compromised connector can establish a reusable latent-space backdoor pathway across modalities, highlighting the need for safer modular MLLM design.
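The input-side optimization the abstract describes (steering an input's latent representation toward a malicious centroid under a bounded perturbation) can be sketched roughly as projected gradient descent. This is an illustrative toy, not the paper's code: the linear connector `W`, the centroid, the step size, and the L-infinity bound `eps` are all hypothetical stand-ins for whatever the actual attack uses.

```python
import numpy as np

# Minimal sketch (hypothetical, not the paper's implementation): projected
# gradient descent that perturbs an input so a toy linear "connector"
# f(x) = W @ x maps it close to a target latent centroid, while keeping the
# perturbation inside an L-infinity ball of radius eps.
rng = np.random.default_rng(0)
d_in, d_lat = 32, 8
W = rng.normal(size=(d_lat, d_in))   # stand-in for a frozen, poisoned connector
centroid = rng.normal(size=d_lat)    # latent anchor (e.g. mean of poisoned reps)

x = rng.normal(size=d_in)            # clean input embedding from another modality
delta = np.zeros(d_in)               # perturbation being optimized
eps, lr = 0.5, 1e-3                  # perturbation bound and step size

for _ in range(300):
    residual = W @ (x + delta) - centroid
    grad = 2.0 * W.T @ residual                     # grad of ||f(x+delta) - c||^2
    delta = np.clip(delta - lr * grad, -eps, eps)   # project onto the L-inf ball

before = np.linalg.norm(W @ x - centroid)
after = np.linalg.norm(W @ (x + delta) - centroid)
print(f"distance to centroid: {before:.3f} -> {after:.3f}")
```

In the attack setting, the gradient through the real encoder and connector would be computed by backpropagation rather than analytically, but the projection step that enforces the bounded (stealthy) perturbation is the same idea.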
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2605.07490 [cs.CR]
(or arXiv:2605.07490v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2605.07490
Submission history
From: Runhe Wang
[v1] Fri, 8 May 2026 09:29:50 UTC (1,348 KB)