
ShellGames: Speculative LLM-Driven SSH Deception





Computer Science > Cryptography and Security
[Submitted on 16 Jun 2026]

Authors: Umberto Salviati, Fabio De Gaspari, Mauro Conti, Luigi Vincenzo Mancini

Abstract: Cyber deception and Moving Target Defense are promising strategies that aim to disrupt adversaries by increasing uncertainty. However, sustaining long-lived, credible interactive sessions with adversaries remains an open challenge. Large Language Models (LLMs) offer a promising path toward more dynamic deception systems, but suffer from key limitations that restrict their applicability, including lack of persistent state, output inconsistencies, hallucinations, latency, and susceptibility to behavioral subversion that may reveal the deception.

We propose ShellGames, an LLM-based SSH shell simulator designed to address these limitations. ShellGames combines five complementary techniques: (i) Automatic Chain-of-Thought and few-shot learning to improve correctness; (ii) memory management to maintain system state coherency; (iii) speculative command execution to reduce response latency; (iv) smart routing of complex interactive commands to a sandboxed environment; and (v) subversion detection leveraging the constrained input-output domain of shell environments.

To enable systematic evaluation, we introduce a standardized benchmarking protocol and dataset spanning correctness, consistency, state tracking, and robustness tasks. ShellGames achieves 0.898 command accuracy on correctness (+5.3pp over baselines), 0.918 sequence-level accuracy on consistency (+36pp), 0.98 state tracking accuracy (+18.3pp), and 0.95 accuracy on robustness (+37pp). A user study with n=20 participants confirms that ShellGames achieves realism comparable to a real shell under free exploration and outperforms traditional honeypots on perceived command coverage.
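The abstract does not say how technique (iii), speculative command execution, is implemented. As a rough illustration of the general idea only, the Python sketch below pre-generates outputs for commands that are guessed to come next, so that a correct guess can be answered without waiting for a fresh model call. Every name here (`SpeculativeShell`, `generate`, `predict_next`) is hypothetical and is not taken from the paper; `generate` stands in for a slow LLM call and is assumed to have no side effects.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


class SpeculativeShell:
    """Toy model of speculative execution for a simulated shell.

    Assumption: `generate` is a side-effect-free stand-in for a slow
    LLM call, and `predict_next` is a hypothetical next-command guesser.
    Neither reflects the actual ShellGames implementation.
    """

    def __init__(
        self,
        generate: Callable[[str], str],
        predict_next: Callable[[str], List[str]],
    ) -> None:
        self._generate = generate
        self._predict_next = predict_next
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._pending: Dict[str, Future] = {}
        self.hits = 0
        self.misses = 0

    def run(self, command: str) -> str:
        # If this command was speculated, reuse that (possibly finished) work.
        future = self._pending.pop(command, None)
        if future is not None:
            self.hits += 1
            output = future.result()
        else:
            self.misses += 1
            output = self._generate(command)

        # Remaining speculation was made for the previous state: drop it.
        for stale in self._pending.values():
            stale.cancel()

        # Start generating outputs for the commands guessed to follow.
        self._pending = {
            guess: self._pool.submit(self._generate, guess)
            for guess in self._predict_next(command)
        }
        return output

    def close(self) -> None:
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    shell = SpeculativeShell(
        generate=lambda cmd: f"(simulated output of {cmd!r})",
        predict_next=lambda cmd: ["ls", "pwd"] if cmd.startswith("cd ") else [],
    )
    print(shell.run("cd /tmp"))
    print(shell.run("ls"))
    print("hits:", shell.hits, "misses:", shell.misses)
    shell.close()
```

Discarding all unused speculation after each command is a deliberately conservative choice in this sketch: a cached output is only trusted for the state in which it was produced, which avoids serving a stale result after the session state changes.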
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.17986 [cs.CR] (or arXiv:2606.17986v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.17986
Submission history: From: Umberto Salviati. [v1] Tue, 16 Jun 2026 14:40:08 UTC (90 KB)
