CyberIntel ⬡ News
★ Saved ◆ Cyber Reads
← Back ◬ AI & Machine Learning Apr 22, 2026

Adding Compilation Metadata To Binaries To Make Disassembly Decidable

arXiv Security Archived Apr 22, 2026 ✓ Full text saved

arXiv:2604.19628v1 Announce Type: new Abstract: The binary executable format is the standard method for distributing and executing software. Yet, it is also as opaque a representation of software as can be. If the binary format were augmented with metadata that provides security-relevant information, such as which data is intended by the compiler to be executable instructions, or how memory regions are expected to be bounded, that would dramatically improve the safety and maintainability of soft

Full text archived locally
✦ AI Summary · Claude Sonnet


    Computer Science > Cryptography and Security [Submitted on 21 Apr 2026] Adding Compilation Metadata To Binaries To Make Disassembly Decidable Daniel Engel, Freek Verbeek, Pranav Kumar, Binoy Ravindran The binary executable format is the standard method for distributing and executing software. Yet, it is also as opaque a representation of software as can be. If the binary format were augmented with metadata that provides security-relevant information, such as which data is intended by the compiler to be executable instructions, or how memory regions are expected to be bounded, that would dramatically improve the safety and maintainability of software. In this paper, we propose a binary format that is a middle ground between a stripped black-box binary and open source. We provide a tool that generates metadata capturing the compiler's intent and inserts it into the binary. This metadata enables lifting to a correct and recompilable higher-level representation and makes analysis and instrumentation more reliable. Our evaluation shows that adding metadata does not affect runtime behavior or performance. Compared to DWARF, our metadata is roughly 17% of its size. We validate correctness by compiling a comprehensive set of real-world C and C++ binaries and demonstrating that they can be lifted, instrumented, and recompiled without altering their behavior. Comments: 12 pages, 5 figures, 2 tables. Submitted to QRS 2026 Subjects: Cryptography and Security (cs.CR); Programming Languages (cs.PL) Cite as: arXiv:2604.19628 [cs.CR]   (or arXiv:2604.19628v1 [cs.CR] for this version)   https://doi.org/10.48550/arXiv.2604.19628 Focus to learn more Submission history From: Daniel Engel [view email] [v1] Tue, 21 Apr 2026 16:16:10 UTC (39 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev   |   next > new | recent | 2026-04 Change to browse by: cs cs.PL References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
    💬 Team Notes
    Article Info
    Source
    arXiv Security
    Category
    ◬ AI & Machine Learning
    Published
    Apr 22, 2026
    Archived
    Apr 22, 2026
    Full Text
    ✓ Saved locally
    Open Original ↗