CyberIntel ⬡ News
★ Saved ◆ Cyber Reads
← Back ◬ AI & Machine Learning Jun 15, 2026

SEVRA-BENCH: Social Engineering of Vulnerabilities in Review Agents

arXiv Security Archived Jun 15, 2026 ✓ Full text saved

arXiv:2606.13757v1 Announce Type: new Abstract: Large language model (LLM) reviewers are increasingly used in pull-request (PR) workflows, where their approvals help decide which code is merged into a repository. This raises a question that benchmarks for static vulnerability detection or code generation do not address: can an automated reviewer reject a malicious contribution when the attacker controls both the code change and the accompanying PR text? We introduce SEVRA-BENCH (Social Engineeri

Full text archived locally
✦ AI Summary · Claude Sonnet


    Computer Science > Cryptography and Security [Submitted on 11 Jun 2026] SEVRA-BENCH: Social Engineering of Vulnerabilities in Review Agents Rui Melo, Riccardo Fogliato, Sean Zhou, Pratiksha Thaker, Zhiwei Steven Wu Large language model (LLM) reviewers are increasingly used in pull-request (PR) workflows, where their approvals help decide which code is merged into a repository. This raises a question that benchmarks for static vulnerability detection or code generation do not address: can an automated reviewer reject a malicious contribution when the attacker controls both the code change and the accompanying PR text? We introduce SEVRA-BENCH (Social Engineering of Vulnerabilities in Review Agents), a benchmark that measures how often an automated reviewer approves such adversarial pull requests. Each malicious PR in SEVRA-BENCH is built from a real project commit that previously fixed a vulnerability listed in the Common Vulnerabilities and Exposures (CVE) database. We automatically invert that fix to restore the original vulnerable code and submit it as a pull request wrapped in one of 15 social-engineering framings, which vary the claims made, the supporting evidence, the urgency conveyed, signals of prior approval, and appeals to authority. SEVRA-BENCH contains 1,062 malicious PRs drawn from Common Vulnerabilities and Exposures (CVE)-linked fixes across the top 10 entries of the 2025 Common Weakness Enumeration (CWE) Top 25. In a realistic setting, we evaluate 8 current LLMs as code review agents on PRs that introduce vulnerabilities previously reported in public disclosures. Our results reveal a sharp gap in security capabilities between closed- and open-source models. We hope SEVRA-BENCH will serve as a valuable resource for advancing open-source models and narrowing this gap. Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI) Cite as: arXiv:2606.13757 [cs.CR]   (or arXiv:2606.13757v1 [cs.CR] for this version)   https://doi.org/10.48550/arXiv.2606.13757 Focus to learn more Submission history From: Rui Melo [view email] [v1] Thu, 11 Jun 2026 16:35:56 UTC (2,020 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev   |   next > new | recent | 2026-06 Change to browse by: cs cs.AI References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
    💬 Team Notes
    Article Info
    Source
    arXiv Security
    Category
    ◬ AI & Machine Learning
    Published
    Jun 15, 2026
    Archived
    Jun 15, 2026
    Full Text
    ✓ Saved locally
    Open Original ↗