CyberIntel ⬡ News
★ Saved ◆ Cyber Reads
← Back ◬ AI & Machine Learning Jun 25, 2026

Speculative Decoding at Temperature Zero: A Scoped Safety-Invariance Screen with a 48,072-Sample Expansion

arXiv Security Archived Jun 25, 2026 ✓ Full text saved

arXiv:2606.25097v1 Announce Type: cross Abstract: Speculative decoding accelerates inference by letting a draft model propose tokens for a target model to verify, raising a concrete safety question: at temperature zero, can draft-side behavior leak into safety-scored outputs? We answer with Typical-Acceptance Invariance Screen (TAIS), a behavioral-equivalence screen that pairs target-only and speculative outputs on the same safety battery and requires byte-identity evidence, TOST equivalence at

Full text archived locally
✦ AI Summary · Claude Sonnet


    Computer Science > Machine Learning [Submitted on 23 Jun 2026] Speculative Decoding at Temperature Zero: A Scoped Safety-Invariance Screen with a 48,072-Sample Expansion Sahil Kadadekar Speculative decoding accelerates inference by letting a draft model propose tokens for a target model to verify, raising a concrete safety question: at temperature zero, can draft-side behavior leak into safety-scored outputs? We answer with Typical-Acceptance Invariance Screen (TAIS), a behavioral-equivalence screen that pairs target-only and speculative outputs on the same safety battery and requires byte-identity evidence, TOST equivalence at +/-3pp, and per-task Cohen's h below a calibrated null cutoff of |h| < 0.1. Applied to a 16,783-sample confirmatory core plus 44,066 matched expansion samples (fp16/bf16 execution, canonical and DPO-adversarial drafts, GPTQ-4bit drafts, two seeds, and four safety benchmarks), the tested temperature-zero vLLM stacks show no detectable safety divergence under TAIS. The largest absolute Cohen's h on matched target-only versus speculative refusal is 0.024, roughly an order of magnitude below the conventional trivial-effect floor; 25 of 27 per-task TOST contrasts pass at the +/-3pp margin (the two non-pass contrasts are capability-domain Wald-CI edge cases at identical ceiling rates, not genuine non-equivalence); the DPO-adversarial draft produces byte-identical output to the canonical draft across 4,006 samples; and bf16 changes 36%-53% of output bytes without moving any per-task safety rate outside equivalence. A separate 4,006-sample 70B production-scale probe, which lacks a matched 70B target-only arm and is therefore not counted as a TAIS pass, produces AdvBench refusal 0.839 over 700 AdvBench completions with 95% Wilson CI [0.809, 0.864]. We make no claim about sampling temperatures, untested frameworks, untested model families, or tree-speculation variants such as EAGLE and Medusa. Comments: Preprint Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR) Cite as: arXiv:2606.25097 [cs.LG]   (or arXiv:2606.25097v1 [cs.LG] for this version)   https://doi.org/10.48550/arXiv.2606.25097 Focus to learn more Submission history From: Sahil Kadadekar [view email] [v1] Tue, 23 Jun 2026 19:06:42 UTC (39 KB) Access Paper: HTML (experimental) view license Current browse context: cs.LG < prev   |   next > new | recent | 2026-06 Change to browse by: cs cs.CR References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
    💬 Team Notes
    Article Info
    Source
    arXiv Security
    Category
    ◬ AI & Machine Learning
    Published
    Jun 25, 2026
    Archived
    Jun 25, 2026
    Full Text
    ✓ Saved locally
    Open Original ↗