Room for Error: Large-Scale Simulation of Over-the-Air Acoustic Attacks
[Submitted on 26 Jun 2026]
Andrew C. Cullen, Neil Marchant, Jiani Xie, Paul Montague, Benjamin I.P. Rubinstein
While voice control is rapidly becoming a ubiquitous vector of human-AI communication, the risks facing these systems remain poorly understood. This is, in part, a product of the difficulty of scaling strictly digital adversarial workflows to the physical world. These scale barriers have led the community to abstract away key acoustic factors relating to detectability and the influence of geometry on acoustics. These methodological and metrological shortcomings undermine our understanding of risk. We illuminate these issues through real-world testing, conceptual discussions, and a novel, high-throughput reality simulation framework. Across more than 8 million adversarial evaluations, we demonstrate that acoustic awareness yields relative Word Error Rate increases of up to 94.5% under Whisper and wav2vec. We employ this framework to formalize and operationalize a Dual-Form Signal to Noise Ratio that decouples source stealth from victim attack efficacy, resolving a crucial limitation in current works. This lays the groundwork for repeatable, verifiable research that embraces, rather than abstracts, the acoustic environment.
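The abstract reports its headline result as a relative Word Error Rate (WER) increase and frames detectability in terms of signal-to-noise ratio. As a rough illustration of those quantities only, the sketch below computes a word-level WER, a relative increase, and a plain SNR in decibels. It is not the paper's framework or its Dual-Form SNR definition, which the abstract does not spell out. The function names are hypothetical, the relative increase is assumed to mean (attacked - baseline) / baseline, and all numbers are made up for illustration.

```python
import math
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_increase(baseline: float, attacked: float) -> float:
    """Assumed definition: (attacked - baseline) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (attacked - baseline) / baseline


def snr_db(signal: Sequence[float], noise: Sequence[float]) -> float:
    """Ratio of mean signal power to mean noise power, in decibels."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    wer = word_error_rate("turn on the lights", "turn off the light")
    print(f"WER: {wer:.2f}")
    print(f"Relative increase: {relative_increase(0.20, 0.30):.1%}")
    print(f"SNR: {snr_db([1, -1, 1, -1], [0.1, -0.1, 0.1, -0.1]):.1f} dB")
```

One way to read the decoupling the abstract describes is that the same SNR calculation could be applied once to the audio as emitted at the source and once to the audio as received at the victim microphone. That reading is an assumption, not something the abstract confirms.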
Comments: 20 pages
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as: arXiv:2606.27701 [cs.SD]
(or arXiv:2606.27701v1 [cs.SD] for this version)
https://doi.org/10.48550/arXiv.2606.27701
Submission history
From: Andrew Cullen
[v1] Fri, 26 Jun 2026 04:00:01 UTC (1,017 KB)