Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!
arXiv SecurityArchived Mar 17, 2026✓ Full text saved
arXiv:2603.12679v1 Announce Type: new Abstract: Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of ``zero cost'' structure-editing transforms (\texttt{nso\_zero}, \texttt{nso\_clique}, \texttt{nso\_split}) that inject dummy neurons. By combining neuron permutation and parameter scaling, NSO makes a radical modification to the network structure and parameters while strictly preserving functional equivalence, thereby disrupting white-box watermark verification. This capability
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Cryptography and Security
[Submitted on 13 Mar 2026]
Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!
Yanna Jiang, Guangsheng Yu, Qingyuan Yu, Yi Chen, Qin Wang
Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of ``zero cost'' structure-editing transforms (\texttt{nso\_zero}, \texttt{nso\_clique}, \texttt{nso\_split}) that inject dummy neurons. By combining neuron permutation and parameter scaling, NSO makes a radical modification to the network structure and parameters while strictly preserving functional equivalence, thereby disrupting white-box watermark verification. This capability has been a fundamental challenge to the reliability of existing white-box watermarking schemes.
We rethink NSO and, for the first time, fully recover from the damage it has caused. We redefine NSO as a graph-consistent threat model within a \textit{producer--consumer} paradigm. This formulation posits that any obfuscation of a producer node necessitates a compatible layout update in all downstream consumers to maintain structural integrity. Building on these consistency constraints on signal propagation, we present \textsc{Canon}, a recovery framework that probes the attacked model to identify redundancy/dummy channels and then \textit{globally} canonicalizes the network by rewriting \textit{all} downstream consumers by construction, synchronizing layouts across \texttt{fan-out}, \texttt{add}, and \texttt{cat}. Extensive experiments demonstrate that, even under strong composed and extended NSO attacks, \textsc{Canon} achieves \textbf{100\%} recovery success, restoring watermark verifiability while preserving task utility. Our code is available at this https URL.
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2603.12679 [cs.CR]
(or arXiv:2603.12679v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2603.12679
Focus to learn more
Submission history
From: Qin Wang [view email]
[v1] Fri, 13 Mar 2026 05:50:26 UTC (7,110 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.CR
< prev | next >
new | recent | 2026-03
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)