
The Security Budget of Code LLMs: An Information-Theoretic Capacity-Security Bound

arXiv Security, archived Jun 03, 2026




[Submitted on 2 Jun 2026]

Jianwei Tai

AI programming assistants make natural-language prompts a software-development interface, so small prompt perturbations become usability and security risks. We study an information-theoretic trade-off for code LLMs between functional capacity, $\mathrm{Cap} = \mathrm{I}(c^*; c_\pi)$, and perturbation retention, $\mathrm{Sec} = \mathrm{I}(c_\pi; \tilde c_\pi)$. Here $\mathrm{Sec}$ is a retention-channel quantity, not a direct measure of exploit success or vulnerable-code generation. For code completion modeled as $p \to c_\pi$ with perturbed prompt $\tilde p$, we prove $\mathrm{Cap} + \mathrm{Sec} \le \mathrm{H}(c^*) + \mathrm{I}(p; \tilde p)$, decomposing the budget into task entropy and prompt leakage. A deterministic-embedding corollary gives the hidden-state version, and a tokenizer/gzip companion bound gives a model-agnostic ceiling on sequence-level task entropy. Empirically, we estimate embedded $\mathrm{Cap}$ and $\mathrm{Sec}$ from output-only last-token hidden states, excluding prompt context from the $\mathrm{Sec}$ channel. Six individual validation rows across two models, two datasets, INT4/BF16 precision, and estimator ablations satisfy the embedded check $(\mathrm{Cap} + \max_T \mathrm{Sec}) / (\mathrm{H}(z^*) + \max_T \mathrm{I}(p; \tilde p)) \le 1$. Saturation is 0.27 to 0.92 and theorem slack is 2.36 to 26.94 nats; a separate three-seed stability diagnostic has mean saturation 0.87. A context-mixed cosine, used only as a per-problem generation-prompt alignment signal, correlates with pass@1 on CodeLlama-HumanEval ($\rho = 0.36$, $p < 10^{-4}$), Qwen-HumanEval ($\rho = 0.22$, $p = 0.005$), and CodeLlama-MBPP ($\rho = 0.225$, $p = 0.0038$; all $n = 164$). Adaptive stress tests with a 23-perturbation pool, a fixed universal suffix, and prompt-embedding PGD all leave positive slack.
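To make the shape of the bound concrete, the following is a minimal toy sketch in Python, not the paper's estimator or proof. It assumes small discrete alphabets, a reference solution $c^*$ that depends only on the prompt $p$, and a single model channel applied independently to $p$ and to $\tilde p$; under those assumptions the two outputs are linked only through the prompts, so the inequality follows from $\mathrm{I}(c^*; c_\pi) \le \mathrm{H}(c^*)$ and the data-processing inequality. The function names, the perturbation channel, and the readings of "saturation" as the ratio and "slack" as the difference of the two sides are illustrative assumptions.

```python
import numpy as np


def entropy(px):
    """Shannon entropy in nats of a probability vector."""
    px = px[px > 0]
    return float(-(px * np.log(px)).sum())


def mutual_information(joint):
    """Mutual information in nats from a 2-D joint probability table."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum())


def toy_budget(n_prompts=6, n_codes=4, flip=0.2, seed=0):
    """Return (Cap, Sec, budget) for a hypothetical discrete toy model."""
    rng = np.random.default_rng(seed)
    p_prior = np.full(n_prompts, 1.0 / n_prompts)

    # Assumed perturbation channel P(p~ | p): keep the prompt with
    # probability 1 - flip, otherwise move uniformly to another prompt.
    perturb = np.full((n_prompts, n_prompts), flip / (n_prompts - 1))
    np.fill_diagonal(perturb, 1.0 - flip)

    # Assumed model channel P(c | p): random row-stochastic matrix.
    model = rng.dirichlet(np.ones(n_codes), size=n_prompts)

    # Assumed reference channel P(c* | p): deterministic map prompt -> code.
    ref = np.zeros((n_prompts, n_codes))
    ref[np.arange(n_prompts), np.arange(n_prompts) % n_codes] = 1.0

    joint_prompts = p_prior[:, None] * perturb           # P(p, p~)
    joint_ref_out = ref.T @ (p_prior[:, None] * model)   # P(c*, c_pi)
    joint_outputs = model.T @ joint_prompts @ model      # P(c_pi, c~_pi)

    cap = mutual_information(joint_ref_out)
    sec = mutual_information(joint_outputs)
    budget = entropy(joint_ref_out.sum(axis=1)) + mutual_information(joint_prompts)
    return cap, sec, budget


if __name__ == "__main__":
    cap, sec, budget = toy_budget()
    print(f"Cap={cap:.4f}  Sec={sec:.4f}  budget={budget:.4f} nats")
    print(f"saturation={(cap + sec) / budget:.3f}  slack={budget - (cap + sec):.4f} nats")
```

Raising `flip` weakens the link between $p$ and $\tilde p$, which shrinks both $\mathrm{Sec}$ and the prompt-leakage term of the budget in this toy; the exact probabilities here stand in for the hidden-state estimates the abstract describes.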
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.03308 [cs.CR] (or arXiv:2606.03308v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.03308
Submission history: From: Jianwei Tai. [v1] Tue, 2 Jun 2026 08:22:14 UTC (41 KB)
