← Back ◬ AI & Machine Learning —

Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!

arXiv Security Archived Mar 17, 2026 ✓ Full text saved

arXiv:2603.12679v1 Announce Type: new Abstract: Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of ``zero cost'' structure-editing transforms (\texttt{nso\_zero}, \texttt{nso\_clique}, \texttt{nso\_split}) that inject dummy neurons. By combining neuron permutation and parameter scaling, NSO makes a radical modification to the network structure and parameters while strictly preserving functional equivalence, thereby disrupting white-box watermark verification. This capability

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 13 Mar 2026] Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good! Yanna Jiang, Guangsheng Yu, Qingyuan Yu, Yi Chen, Qin Wang Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of ``zero cost'' structure-editing transforms (\texttt{nso\_zero}, \texttt{nso\_clique}, \texttt{nso\_split}) that inject dummy neurons. By combining neuron permutation and parameter scaling, NSO makes a radical modification to the network structure and parameters while strictly preserving functional equivalence, thereby disrupting white-box watermark verification. This capability has been a fundamental challenge to the reliability of existing white-box watermarking schemes. We rethink NSO and, for the first time, fully recover from the damage it has caused. We redefine NSO as a graph-consistent threat model within a \textit{producer--consumer} paradigm. This formulation posits that any obfuscation of a producer node necessitates a compatible layout update in all downstream consumers to maintain structural integrity. Building on these consistency constraints on signal propagation, we present \textsc{Canon}, a recovery framework that probes the attacked model to identify redundancy/dummy channels and then \textit{globally} canonicalizes the network by rewriting \textit{all} downstream consumers by construction, synchronizing layouts across \texttt{fan-out}, \texttt{add}, and \texttt{cat}. Extensive experiments demonstrate that, even under strong composed and extended NSO attacks, \textsc{Canon} achieves \textbf{100\%} recovery success, restoring watermark verifiability while preserving task utility. Our code is available at this https URL. Subjects: Cryptography and Security (cs.CR) Cite as: arXiv:2603.12679 [cs.CR] (or arXiv:2603.12679v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2603.12679 Focus to learn more Submission history From: Qin Wang [view email] [v1] Fri, 13 Mar 2026 05:50:26 UTC (7,110 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-03 Change to browse by: cs References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes