← Back ◬ AI & Machine Learning Mar 30, 2026

Reentrancy Detection in the Age of LLMs

arXiv Security Archived Mar 30, 2026 ✓ Full text saved

arXiv:2603.26497v1 Announce Type: new Abstract: Reentrancy remains one of the most critical classes of vulnerabilities in Ethereum smart contracts, yet widely used detection tools and datasets continue to reflect outdated patterns and obsolete Solidity versions. This paper adopts a dependability-oriented perspective on reentrancy detection in Solidity 0.8+, assessing how reliably state-of-the-art static analyzers and AI-based techniques operate on modern code by putting them to the test on two f

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 27 Mar 2026] Reentrancy Detection in the Age of LLMs Dalila Ressi, Alvise Spanò, Matteo Rizzo, Lorenzo Benetollo, Sabina Rossi Reentrancy remains one of the most critical classes of vulnerabilities in Ethereum smart contracts, yet widely used detection tools and datasets continue to reflect outdated patterns and obsolete Solidity versions. This paper adopts a dependability-oriented perspective on reentrancy detection in Solidity 0.8+, assessing how reliably state-of-the-art static analyzers and AI-based techniques operate on modern code by putting them to the test on two fronts. We construct two manually verified benchmarks: an Aggregated Benchmark of 432 real-world contracts, consolidated and relabeled from prior datasets, and a Reentrancy Scenarios Dataset (RSD) of \chadded{143} handcrafted minimal working examples designed to isolate and stress-test individual reentrancy patterns. We then evaluate 12 formal-methods-based tools, 10 machine-learning models, and 9 large language models. On the Aggregated Benchmark, traditional tools and ML models achieve up to 0.87 F1, while the best LLMs reach 0.96 in a zero-shot setting. On the RSD, most tools fail on multiple scenarios, the top performer achieving an F1 of 0.76, whereas the strongest model attains 0.82. Overall, our results indicate that leading LLMs outperform the majority of existing detectors, highlighting concerning gaps in the robustness and maintainability of current reentrancy-analysis tools. Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE) Cite as: arXiv:2603.26497 [cs.CR] (or arXiv:2603.26497v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2603.26497 Focus to learn more Submission history From: Alvise Spanò [view email] [v1] Fri, 27 Mar 2026 15:00:42 UTC (240 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-03 Change to browse by: cs cs.SE References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes