Beyond the Wrapper: Identifying Artifact Reliance in Static Malware Classifiers using TRUSTEE
arXiv SecurityArchived May 11, 2026✓ Full text saved
arXiv:2605.07034v1 Announce Type: new Abstract: Modern cybersecurity relies heavily on static machine-learning-based malware classifiers. However, transformations such as packing and other non-semantic modifications applied to executable files limit their reliability. Malware classifiers often learn these unnecessary artifacts rather than the true binary behavior because of the high association between maliciousness and packing. Moreover, these malware classifiers are black boxes, making it diff
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Cryptography and Security
[Submitted on 7 May 2026]
Beyond the Wrapper: Identifying Artifact Reliance in Static Malware Classifiers using TRUSTEE
Riyazuddin Mohammed, Lan Zhang
Modern cybersecurity relies heavily on static machine-learning-based malware classifiers. However, transformations such as packing and other non-semantic modifications applied to executable files limit their reliability. Malware classifiers often learn these unnecessary artifacts rather than the true binary behavior because of the high association between maliciousness and packing. Moreover, these malware classifiers are black boxes, making it difficult to understand what they learn. To address this issue, we proposed a two-part framework using the post-hoc interpretability XAI tool TRUSTEE, followed by a manual analysis of the top features. We conducted several controlled experiments by varying the dataset composition ratios to understand their impact on the results. The top-ranked features across all experiments, identified by TRUSTEE, were predominantly packing artifacts, portable executable(PE) metadata, and n-grams at the string level, rather than malicious semantics. These results suggest that these malware classifiers are highly sensitive to dataset composition and can misinterpret packing as malicious behavior. Our proposed framework allows for the reproducible diagnosis of such biases and forms a guideline for building more robust and semantically meaningful malware detection models
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as: arXiv:2605.07034 [cs.CR]
(or arXiv:2605.07034v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2605.07034
Focus to learn more
Submission history
From: Riyazuddin Mohammed [view email]
[v1] Thu, 7 May 2026 23:24:36 UTC (2,387 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.CR
< prev | next >
new | recent | 2026-05
Change to browse by:
cs
cs.LG
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)