Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations
arXiv SecurityArchived Apr 03, 2026✓ Full text saved
arXiv:2604.01635v1 Announce Type: new Abstract: Recent advances in GAN and diffusion models have significantly improved the realism and controllability of facial deepfake manipulation, raising serious concerns regarding privacy, security, and identity misuse. Proactive defenses attempt to counter this threat by injecting adversarial perturbations into images before manipulation takes place. However, existing approaches remain limited in effectiveness due to suboptimal perturbation injection stra
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Cryptography and Security
[Submitted on 2 Apr 2026]
Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations
Yue Li, Linying Xue, Kaiqing Lin, Hanyu Quan, Dongdong Lin, Hui Tian, Hongxia Wang, Bin Wang
Recent advances in GAN and diffusion models have significantly improved the realism and controllability of facial deepfake manipulation, raising serious concerns regarding privacy, security, and identity misuse. Proactive defenses attempt to counter this threat by injecting adversarial perturbations into images before manipulation takes place. However, existing approaches remain limited in effectiveness due to suboptimal perturbation injection strategies and are typically designed under white-box assumptions, targeting only simple GAN-based attribute editing. These constraints hinder their applicability in practical real-world scenarios. In this paper, we propose AEGIS, the first diffusion-guided paradigm in which the AdvErsarial facial images are Generated for Identity Shielding. We observe that the limited defense capability of existing approaches stems from the peak-clipping constraint, where perturbations are forcibly truncated due to a fixed L_\infty-bounded. To overcome this limitation, instead of directly modifying pixels, AEGIS injects adversarial perturbations into the latent space along the DDIM denoising trajectory, thereby decoupling the perturbation magnitude from pixel-level constraints and allowing perturbations to adaptively amplify where most effective. The extensible design of AEGIS allows the defense to be expanded from purely white-box use to also support black-box scenarios through a gradient-estimation strategy. Extensive experiments across GAN and diffusion-based deepfake generators show that AEGIS consistently delivers strong defense effectiveness while maintaining high perceptual quality. In white-box settings, it achieves robust manipulation disruption, whereas in black-box settings, it demonstrates strong cross-model transferability.
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2604.01635 [cs.CR]
(or arXiv:2604.01635v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2604.01635
Focus to learn more
Submission history
From: Yue Li [view email]
[v1] Thu, 2 Apr 2026 05:24:11 UTC (9,471 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.CR
< prev | next >
new | recent | 2026-04
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)