← Back ◬ AI & Machine Learning May 25, 2026

Sample-wise Targeted Adversarial Attacks on Test-time Adaptation

arXiv Security Archived May 25, 2026 ✓ Full text saved

arXiv:2605.23411v1 Announce Type: cross Abstract: Test-time adaptation (TTA) effectively counters distribution shifts but exposes models to adversarial manipulation via the unlabeled test stream. Existing class-wise targeted attacks remain impractical for stealthy exploitation in this setting: since TTA operates on batches, forcing a subset of samples toward a target label unintentionally pulls similar benign samples along, resulting in a conspicuously high frequency of the target label that is

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Machine Learning [Submitted on 22 May 2026] Sample-wise Targeted Adversarial Attacks on Test-time Adaptation Phuc Duc Nguyen, Quang Duc Nguyen Test-time adaptation (TTA) effectively counters distribution shifts but exposes models to adversarial manipulation via the unlabeled test stream. Existing class-wise targeted attacks remain impractical for stealthy exploitation in this setting: since TTA operates on batches, forcing a subset of samples toward a target label unintentionally pulls similar benign samples along, resulting in a conspicuously high frequency of the target label that is easy to detect. To capture a more realistic threat, we introduce a sample-wise targeted attack. Unlike prior approaches, the attacker aims to misclassify only inputs carrying an attacker-chosen trigger, while preserving the global label distribution of benign queries to evade detection. To achieve this, we propose a meta-learning-based attack with a novel priority-aware gradient alignment strategy that explicitly prioritizes attack success. The strategy formulates the gradient update as an ellipsoidal trust-region problem, mitigating the misalignment between attack success and distributional stealth, while providing theoretical guarantees for effective optimization of the attack objective in the presence of gradient misalignment. Extensive experiments on CIFAR-10-C, CIFAR-100-C, and ImageNet-C across TTA protocols demonstrate that our method achieves high targeted success rates while maintaining a label distribution that is consistent with the no-attack baseline, making it difficult to detect in unlabeled TTA deployment scenarios. Furthermore, we demonstrate that our attack shows strong robustness against existing defenses. Comments: 32 pages, 17 figures Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV) Cite as: arXiv:2605.23411 [cs.LG] (or arXiv:2605.23411v1 [cs.LG] for this version) https://doi.org/10.48550/arXiv.2605.23411 Focus to learn more Submission history From: Quang Duc Nguyen [view email] [v1] Fri, 22 May 2026 09:18:22 UTC (1,474 KB) Access Paper: HTML (experimental) view license Current browse context: cs.LG < prev | next > new | recent | 2026-05 Change to browse by: cs cs.CR cs.CV References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes