← Back ◬ AI & Machine Learning May 15, 2026

Talk is (Not) Cheap: A Taxonomy and Benchmark Coverage Audit for LLM Attacks

arXiv Security Archived May 15, 2026 ✓ Full text saved

arXiv:2605.15118v1 Announce Type: new Abstract: We introduce a reusable framework for auditing whether LLM attack benchmarks collectively cover the threat surface: a 4$\times$6 Target $\times$ Technique matrix grounded in STRIDE, constructed from a 507-leaf taxonomy -- 401 data-populated and 106 threat-model-derived leaves -- of inference-time attacks extracted from 932 arXiv security studies (2023--2026). The matrix enables benchmark-external validation -- auditing collective coverage rather th

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 14 May 2026] Talk is (Not) Cheap: A Taxonomy and Benchmark Coverage Audit for LLM Attacks Karthik Raghu Iyer, Yazdan Jamshidi, Nicholas Bray, Alexey A. Shvets We introduce a reusable framework for auditing whether LLM attack benchmarks collectively cover the threat surface: a 4\times6 Target \times Technique matrix grounded in STRIDE, constructed from a 507-leaf taxonomy -- 401 data-populated and 106 threat-model-derived leaves -- of inference-time attacks extracted from 932 arXiv security studies (2023--2026). The matrix enables benchmark-external validation -- auditing collective coverage rather than individual benchmark consistency. Applying it to six public benchmarks reveals that the three primary frameworks (HarmBench, InjecAgent, AgentDojo) occupy non-overlapping cells covering at most 25\% of the matrix, while entire STRIDE threat categories (Service Disruption, Model Internals) lack any standardized evaluation, despite published attacks in these categories achieving 46\times token amplification and 96\% attack success rates through mechanisms which no benchmark tests. The corpus of 2,521 unique attack groups further reveals pervasive naming fragmentation (up to 29 surface forms for a single attack) and heavy concentration in Safety \& Alignment Bypass, structural properties invisible at smaller scale. The taxonomy, attack records, and coverage mappings are released as extensible artifacts; as new benchmarks emerge, they can be mapped onto the same matrix, enabling the community to track whether evaluation gaps are closing. Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL) Cite as: arXiv:2605.15118 [cs.CR] (or arXiv:2605.15118v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2605.15118 Focus to learn more Submission history From: Alexey Shvets [view email] [v1] Thu, 14 May 2026 17:30:36 UTC (184 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-05 Change to browse by: cs cs.CL References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes