TGCM: Topic-Guided Generative Disentanglement of Interleaved APT Technique Sequences
[Submitted on 17 Jun 2026]
Guo-Wei Wong, Ming-Chuan Yang, Shou-De Lin, Wang-Chien Lee, MengChang Chen
In enterprise environments, multiple Advanced Persistent Threat (APT) campaigns often unfold concurrently, producing audit logs in which attack techniques across actors (sources) are interleaved over time. This setting naturally gives rise to an Unknown-K Interleaved Sequence Demixing (UKISD) problem: recovering multiple latent campaigns from an interleaved technique sequence while jointly inferring their number and technique-level assignments. Existing approaches, ranging from statistical pattern mining to provenance-based analysis, typically assume single-campaign settings or rely on rigid heuristics, limiting their effectiveness under realistic conditions involving overlapping campaigns, shared techniques, and variable execution lengths.
We present Topic-Guided Consistency Modeling (TGCM), a generative disentanglement framework to tackle the UKISD problem. TGCM leverages Consistency Models to learn a direct inverse mapping from interleaved multi-campaign observations to structured single-campaign sequences in a single inference step. To favor semantically coherent attack chains, TGCM incorporates a topic-guided prior derived from MITRE ATT&CK narratives, providing high-level tactical constraints during decomposition. We evaluate TGCM on synthetic datasets, established mixed datasets, and incident traces from DARPA TC-E3 and TC-E5, comparing against 15 representative baselines spanning pattern mining, deep learning, and LLM-based methods. Results indicate improved separation robustness over baselines under heavy interleaving and technique sharing, and show that TGCM generalizes zero-shot to a naturally interleaved in-the-wild benchmark (DARPA TC-E5) without retraining.
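To make the UKISD setting concrete, the following is a minimal toy sketch of the problem setup only, not of TGCM itself. The two campaigns, the helper names `interleave` and `demix`, and the uniform random interleaving policy are all illustrative assumptions; the abstract does not specify how synthetic mixtures are built. The technique IDs are real MITRE ATT&CK identifiers, and T1059 appears in both campaigns to mimic technique sharing.

```python
import random


def interleave(campaigns, seed=0):
    """Merge campaigns into one sequence, preserving each campaign's internal order.

    Returns (mixed, labels), where labels[i] is the index of the campaign
    that produced mixed[i]. The uniform choice among active campaigns is an
    assumption made for illustration.
    """
    rng = random.Random(seed)
    cursors = [0] * len(campaigns)
    mixed, labels = [], []
    while True:
        # Campaigns that still have techniques left to emit.
        active = [k for k, c in enumerate(campaigns) if cursors[k] < len(c)]
        if not active:
            break
        k = rng.choice(active)
        mixed.append(campaigns[k][cursors[k]])
        labels.append(k)
        cursors[k] += 1
    return mixed, labels


def demix(mixed, labels):
    """Rebuild per-campaign sequences from technique-level assignments."""
    recovered = {}
    for technique, k in zip(mixed, labels):
        recovered.setdefault(k, []).append(technique)
    return [recovered[k] for k in sorted(recovered)]


# Hypothetical campaigns; both use T1059 (a shared technique).
campaign_a = ["T1566", "T1059", "T1003", "T1041"]
campaign_b = ["T1190", "T1059", "T1070", "T1486"]

mixed, labels = interleave([campaign_a, campaign_b], seed=7)

# A UKISD solver observes only `mixed`; it must infer both the number of
# campaigns K and the per-technique assignments `labels`.
inferred_k = len(set(labels))
recovered = demix(mixed, labels)
```

Here the ground-truth `labels` are handed to `demix`, so recovery is trivial. The difficulty the abstract describes lies in estimating those labels, and K, from `mixed` alone, especially when a shared technique such as T1059 could plausibly belong to either campaign.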
Comments: 13 pages,
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.18651 [cs.CR]
(or arXiv:2606.18651v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.18651
Submission history
From: Guo-Wei Wong
[v1] Wed, 17 Jun 2026 03:35:43 UTC (961 KB)