Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles
Computer Science > Cryptography and Security
[Submitted on 5 Jun 2026]
Anwar Shah, Rohan Farooq, Sajid Anwer, Tallha Akram, Usman Ghous, Sajid Ullah Khan
The Internet of Vehicles (IoV) faces a dynamic, adversarial security environment in which attackers adapt to defenses. Existing intrusion detection systems rely on static classifiers that fail to capture sequential decision-making, attacker adaptation, and uncertainty. We formulate IoV security as a sequential attacker-defender interaction and model defense as a reinforcement learning problem under partial observability. We propose Quantum Belief-Integrated Reinforcement Defense (Q-BIRD), which uses a quantum-inspired belief representation to encode the defender's uncertainty about hidden attacker intent as amplitude-based states, enabling non-Bayesian belief evolution. Integrated into a Proximal Policy Optimization (PPO) defender, Q-BIRD selects cost-aware mitigation actions. In simulated environments with adaptive, probing attackers, Q-BIRD reduced cumulative mean damage, damage variance, and attack success rate (ASR) by 60.4%, 90.2%, and 50.0%, respectively, while increasing survival probability by 46.4%. Compared with a classical Bayesian PPO defender, Q-BIRD improved damage variance reduction by a factor of 10.2 and ASR by 50%. Ablation and explainability analyses confirm that the amplitude-based belief is the primary decision signal during strategy transitions, when the classical belief collapses, and that Q-BIRD provides superior IoV security without additional hardware.
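The abstract does not give the Q-BIRD update rule, so the following is only a minimal sketch of how an amplitude-based belief can behave differently from a Bayesian one after the classical belief has collapsed. Everything in it is an assumption made for illustration: the two hidden attacker intents, the function names `bayes_update` and `amplitude_update`, the mixing angle `theta`, and the choice of a plane rotation as the mixing step. It should not be read as the authors' method.

```python
import math

def bayes_update(prior, likelihood):
    """Classical Bayesian update: posterior is proportional to likelihood * prior."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)  # assumed non-zero in this sketch
    return [u / total for u in unnorm]

def amplitude_update(amps, likelihood, theta=0.3):
    """Hypothetical amplitude update over two attacker intents.

    Each amplitude is weighted by sqrt(likelihood), the two amplitudes are
    mixed by a rotation of angle theta (an illustrative choice), and the
    state is renormalised so the squared magnitudes sum to 1.
    """
    a0, a1 = (math.sqrt(l) * a for a, l in zip(amps, likelihood))
    c, s = math.cos(theta), math.sin(theta)
    m0, m1 = c * a0 - s * a1, s * a0 + c * a1
    norm = math.sqrt(abs(m0) ** 2 + abs(m1) ** 2)  # assumed non-zero
    return [m0 / norm, m1 / norm]

def probabilities(amps):
    """Read out probabilities as squared amplitude magnitudes."""
    return [abs(a) ** 2 for a in amps]

# Belief fully collapsed onto intent 0 (e.g. "probing"), while the new
# observation is far more likely under intent 1 (e.g. "attacking").
collapsed = [1.0, 0.0]
likelihood = [0.2, 0.9]

classical = bayes_update(collapsed, likelihood)
quantum_like = probabilities(amplitude_update(collapsed, likelihood))

print("Bayesian belief:      ", classical)
print("Amplitude-based belief:", quantum_like)
```

In this toy setting the Bayesian belief stays pinned to the collapsed hypothesis because a zero prior can never recover, whereas the mixing step in the amplitude update returns some probability mass to the alternative intent. That is one plausible reading of "non-Bayesian belief evolution"; the paper's actual formulation may differ.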
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.07796 [cs.CR]
(or arXiv:2606.07796v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.07796
Submission history
From: Anwar Shah
[v1] Fri, 5 Jun 2026 19:20:38 UTC (17,781 KB)