CyberIntel ⬡ News
◬ AI & Machine Learning Jun 08, 2026

Subtle Injection for Ground-truth Inference of LLM Training Data

arXiv Security Archived Jun 08, 2026

arXiv:2606.06502v1 Announce Type: new



[Submitted on 18 May 2026]

Abraham Itzhak Weinberg

Abstract: As large language models (LLMs) are increasingly trained on scraped web corpora without authorisation, content owners require forensic methods to prove that their documents were included in a model's training set. We propose SIGIL (Subtle Injection for Ground-truth Inference of LLM training data), a framework that embeds imperceptible canary sequences into protected text and code such that any LLM trained on those documents exhibits statistically detectable behavioural signatures when probed with targeted queries.

SIGIL defines five canary strategies (lexical-rare, lexical-phrase, syntactic, semantic, and code-pattern) and a Membership Inference Score (MIS) grounded in the Neyman-Pearson hypothesis testing framework with formal false-positive rate (FPR) control.

Simulator parameters are calibrated against the empirical membership inference literature, yielding realistic heterogeneous results across 36,000 trials: overall AUC = 0.892, rising from 0.831 at 0.1% injection to 0.947 at 10%. Detection rates range from 33% to 78% across model-size and injection-rate conditions. Code Pattern canaries achieve the highest AUC (0.903, Cohen's d = 1.84); Syntactic Structure the lowest (0.875, d = 1.63). All four experimental factors (injection rate, model size, canary strategy, and mixture ratio) have significant independent effects on MIS (p < 0.001). SIGIL maintains AUC > 0.86 even under 100% paraphrasing (AUC = 0.864), confirming robustness through semantic leakage that survives surface-level rewriting.
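The abstract's evaluation quantities (a score threshold chosen for a fixed false-positive rate, the resulting detection rate, AUC, and Cohen's d) can be illustrated with a small sketch. This is not the paper's simulator or its MIS definition: the scores below are synthetic draws from two unit-variance normal distributions, and every name (`threshold_at_fpr`, `empirical_rate`, `auc`, `cohens_d`, `auc_from_d`, `demo`) and parameter value (`shift=1.75`, `alpha=0.05`) is a hypothetical choice made for illustration only.

```python
import math
import random
import statistics


def threshold_at_fpr(null_scores, alpha):
    """Return t such that at most a fraction alpha of null_scores exceed t."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(null_scores)
    n = len(ordered)
    k = min(int(alpha * n), n - 1)  # how many null scores may exceed t
    return ordered[n - k - 1]       # at most k values are strictly greater


def empirical_rate(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


def auc(member_scores, null_scores):
    """Probability that a member score outranks a null score (ties count half)."""
    wins = sum((m > q) + 0.5 * (m == q)
               for m in member_scores for q in null_scores)
    return wins / (len(member_scores) * len(null_scores))


def cohens_d(member_scores, null_scores):
    """Standardised mean difference using the pooled sample variance."""
    n1, n0 = len(member_scores), len(null_scores)
    pooled_var = ((n1 - 1) * statistics.variance(member_scores)
                  + (n0 - 1) * statistics.variance(null_scores)) / (n1 + n0 - 2)
    diff = statistics.mean(member_scores) - statistics.mean(null_scores)
    return diff / math.sqrt(pooled_var)


def auc_from_d(d):
    """AUC implied by d when both score distributions are equal-variance normals."""
    return 0.5 * (1.0 + math.erf(d / 2.0))  # equals Phi(d / sqrt(2))


def demo(seed=0, n=1000, shift=1.75, alpha=0.05):
    """Synthetic example: all numbers here are assumptions, not paper data."""
    rng = random.Random(seed)
    null = [rng.gauss(0.0, 1.0) for _ in range(n)]      # canaries not in training
    member = [rng.gauss(shift, 1.0) for _ in range(n)]  # canaries in training
    t = threshold_at_fpr(null, alpha)
    return {
        "threshold": t,
        "fpr": empirical_rate(null, t),
        "detection_rate": empirical_rate(member, t),
        "auc": auc(member, null),
        "cohens_d": cohens_d(member, null),
    }
```

Calling `demo()` returns the threshold, the false-positive rate on the calibration sample (bounded by `alpha` by construction), and the detection rate, AUC, and effect size on the synthetic member scores. Under the equal-variance normal assumption, AUC = Φ(d/√2); `auc_from_d(1.84)` and `auc_from_d(1.63)` give roughly 0.903 and 0.875, close to the AUCs the abstract reports alongside those effect sizes.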
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.06502 [cs.CR] (or arXiv:2606.06502v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.06502
Submission history
From: Abraham Itzhak Weinberg
[v1] Mon, 18 May 2026 14:48:44 UTC (614 KB)
    Article Info
    Source: arXiv Security
    Category: ◬ AI & Machine Learning
    Published: Jun 08, 2026
    Archived: Jun 08, 2026