← Back ◬ AI & Machine Learning Jun 09, 2026

Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles

arXiv Security Archived Jun 09, 2026 ✓ Full text saved

arXiv:2606.07796v1 Announce Type: new Abstract: The Internet of Vehicles (IoV) faces a dynamic, adversarial security environment where attackers adapt to defenses. Existing intrusion detection systems rely on static classifiers that fail to capture sequential decision-making, attacker adaptation, and uncertainty. We formulate IoV security as a sequential attacker-defender interaction and model defense as a reinforcement learning problem under partial observability. We propose Quantum Belief-Inte

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 5 Jun 2026] Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles Anwar Shah, Rohan Farooq, Sajid Anwer, Tallha Akram, Usman Ghous, Sajid Ullah Khan The Internet of Vehicles (IoV) faces a dynamic, adversarial security environment where attackers adapt to defenses. Existing intrusion detection systems rely on static classifiers that fail to capture sequential decision-making, attacker adaptation, and uncertainty. We formulate IoV security as a sequential attacker-defender interaction and model defense as a reinforcement learning problem under partial observability. We propose Quantum Belief-Integrated Reinforcement Defense (Q-BIRD), using quantum-inspired belief representation to encode defender uncertainty about hidden attacker intent via amplitude-based states, enabling non-Bayesian belief evolution. Integrated into a Proximal Policy Optimization (PPO) defender, Q-BIRD selects cost-aware mitigation actions. In simulated environments with adaptive, probing attackers, Q-BIRD reduced cumulative mean damage, damage variance, and attack success rate (ASR) by 60.4%, 90.2%, and 50.0%, respectively, while increasing survival probability by 46.4%. Compared to classical Bayesian PPO, damage variance reduction and ASR improved by 10.2 times and 50%. Ablation and explainability analyses confirm that amplitude-based belief is the primary decision signal during strategy transitions when classical belief collapses, providing superior IoV security without additional hardware. Subjects: Cryptography and Security (cs.CR) Cite as: arXiv:2606.07796 [cs.CR] (or arXiv:2606.07796v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2606.07796 Focus to learn more Submission history From: Anwar Shah [view email] [v1] Fri, 5 Jun 2026 19:20:38 UTC (17,781 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-06 Change to browse by: cs References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes