'TrustFall' Convention Exposes Claude Code Execution Risk
Dark ReadingArchived May 09, 2026✓ Full text saved
Malicious repositories can trigger code execution in Claude Code, Cursor CLI, Gemini CLI, and CoPilot CLI with minimal or no user interaction, thanks to skimpy warning dialogs.
Full text archived locally
✦ AI Summary· Claude Sonnet
APPLICATION SECURITY
CYBER RISK
THREAT INTELLIGENCE
VULNERABILITIES & THREATS
NEWS
'TrustFall' Convention Exposes Claude Code Execution Risk
Malicious repositories can trigger code execution in Claude Code, Cursor CLI, Gemini CLI, and CoPilot CLI with minimal or no user interaction, thanks to skimpy warning dialogs.
Jai Vijayan,Contributing Writer
May 7, 2026
6 Min Read
SOURCE: SAMUEL BOIVIN VIA SHUTTERSTOCK
Developers using the latest versions of AI coding tools like Claude Code, Cursor CLI, Gemini CLI, and CoPilot CLI could inadvertently execute malicious code on their systems with a single keypress, or no keypress at all in continuous integration environments.
That, according to researchers at Adversa AI, is because none adequately warn users of how a malicious repo can auto-approve and spawn a Model Context Protocol (MCP) server without their explicit approval or knowledge. All four coding tools show some form of a trust dialog prompting the user to indicate whether they trust a particular repo, but they do not offer full details on what that consent might actually entail.
Adversa AI identified Claude Code as offering the least information in its trust dialog, and Gemini AI as offering the most, along with a choice in terms of allowing or disallowing an MCP server to execute on the developer's system. But the exposure is the same in all four, according to Adversa's lead researcher, Rony Utevsky.
Related:Reverse Engineering With AI Unearths High-Severity GitHub Bug
"A repository can ship a configuration that auto-approves and immediately launches an MCP server, no tool call from the agent is required," he tells Dark Reading. "The variation is purely in how clearly the dialog tells the user what they are consenting to."
Anthropic itself however has described the issue that Adversa AI identified as existing outside its threat model, and it told Adversa AI that it believes its trust dialog offers sufficient warning to users. Anthropic pointed to how any malicious activity happens only after the user has allowed a repo/folder to be trusted or safe, Utevsky says, adding that Adversa AI has not raised the issue with the other AI coding toolmakers because Anthropic's approach appears to be the general convention.
"Once we identified the issue as a class-level convention rather than a vendor bug, vendor-specific disclosure stopped being the right shape of response: you can responsibly disclose a vulnerability to a vendor, but not a convention," he explains.
A Straightforward Path?
According to Adversa AI, all a threat actor would need to do to pull off an attack is create a repository that includes a malicious MCP server and configuration settings that auto-approve it to run. When a developer clones or opens the repo in the AI coding tool and presses "enter" on what appears to be a routine security check, the AI coding tool unwittingly launches the attacker-controlled code with the developer's full system privileges and no further prompting.
Related:Fresh Wave of GlassWorm VS Code Extensions Slices Through Supply Chain
The payload can vary, and can allow attackers to read local files, including secrets, SSH keys, and tokens; access other projects; install backdoors; and establish a command-and-control connection. In a CI/CD environment, the same attack would unfold with no human interaction at all.
"The impact is full-machine compromise, not just project access," researchers at Adversa AI said in a report this week that focused on attacks using Claude Code. "MCP servers execute as native OS processes with the full privileges of the user running Claude Code." That means they aren't sandboxed or confined in any way. "The payload runs the moment the MCP server process starts," they added.
A Risky Change to the Trust Dialog in Claude Code
The report points to a trust dialog change that Anthropic introduced in Claude Code version 2.1, which removed warning language that previously made the risk more visible to users. That change has turned a routine developer action of cloning or reviewing a repo into a high-risk action, Utevsky says.
"The dialog users see is a simple 'Yes, I trust this folder,'" he explains. "Most developers don't realize 'trusting' hands over that much power." In contrast, earlier versions of Claude Code prior to 2.1 warned about MCP execution explicitly, and offered an option to proceed with MCP servers disabled. Both are no longer present, Utevsky says.
Related:Vercel Employee's AI Tool Access Led to Data Breach
The security researcher says the TrustFall issue joins three exploitable vulnerabilities in Claude Code that could allow a malicious repository to abuse project-scoped settings to silently change how the tool behaves on a developer's machine. The other three vulnerabilities include CVE-2025-59536, CVE-2026-21852, and CVE-2026-33068, all of which Anthropic has patched.
Adversa AI also identified three configuration settings that an attacker could use in their malicious repos to trigger arbitrary code execution on a developer's system, without an explicit prior warning from Claude Code. One of them uses a setting that would automatically approve a malicious MCP server to run the moment the user accepts Claude Code's broad folder trust prompt. The second involves planting the payload directly in the configuration file making it harder for security scanners to flag, and the third pre-authorizes specific tool calls through project settings, enabling code execution without further user interaction.
"In our opinion, the language of the new warning dialog downplays the decision's importance and the severity of the consequences, while providing no information about the project contents," Utevsky says. "It also defaults to 'trust,' so a reflexive press of 'enter' leads to unsafe behavior."
Claude Code's handling of dangerous settings is also internally inconsistent, he believes. Other configuration settings, such as bypassPermissions, invoke a much more alarming dialog with stronger language, and it defaults to "No, exit." "The same product treats less dangerous settings more carefully than this one," Utevsky says.
Not a Vulnerability, But Developers Still Need Defenses
Anthropic's position is that unlike previous vulnerabilities that allowed malicious code execution before a trust dialog even appeared, the issue that Adversa AI has identified involves code execution that happens only after the user has consented to the project. "Whether this meets Anthropic's threshold for a vulnerability is their call," the security vendor noted in its report. "Whether users are making an informed trust decision under the v2.1+ dialog, in our view, is not a close question. They are not."
Reducing exposure to the AI agent threats like these, according to Adversa AI, boils down to tightening controls across developer endpoints and CI/CD pipelines, and bolstering overall visibility into how tools like Claude Code are used.
On developer systems, organizations should focus on inspecting project configurations and monitoring for unexpected behavior when new repositories are opened. Organizations should make sure they validate projects and use behavioral monitoring to detect unusual processes or activity initiated by development tools In CI environments, the most effective safeguard is to avoid running the tool automatically on untrusted code, Adversa said. "Inspecting repo settings, automation actions, and project scaffolding isn't technically complex, but it takes time and discipline," Utevsky says. "It's also unavoidable now, given how common supply chain attacks and intentionally malicious open source packages have become."
Don't miss the latest Dark Reading Confidential podcast, How the Story of a USB Penetration Test Went Viral. Two decades ago Dark Reading posted its first blockbuster piece — a column by a pen tester who sprinkled rigged thumb drives around a credit union parking lot and let curious employees do the rest. This episode looks back at the history-making piece with its author, Steve Stasiukonis. Listen now!
About the Author
Jai Vijayan
Contributing Writer
Jai Vijayan is a seasoned technology reporter with over 20 years of experience in IT trade journalism. He was most recently a Senior Editor at Computerworld, where he covered information security and data privacy issues for the publication. Over the course of his 20-year career at Computerworld, Jai also covered a variety of other technology topics, including big data, Hadoop, Internet of Things, e-voting, and data analytics. Prior to Computerworld, Jai covered technology issues for The Economic Times in Bangalore, India. Jai has a Master's degree in Statistics and lives in Naperville, Ill.
Want more Dark Reading stories in your Google search results?
ADD US NOW
More Insights
Industry Reports
How Enterprises Are Developing Secure Applications
Inside RSAC 2026: security leaders reveal the risks redefining your defense strategy
How Enterprises Are Harnessing Emerging Technologies in Cybersecurity
Ditch the Data Center: Understanding Flexible Cloud Infrastructure Security Management
2025 State of Malware
Access More Research
Webinars
The New Attack Surface: How Attackers Are Exploiting OAuth to Own Your Cloud Workspace
Prompt Injection Is Just the Start: Securing LLMs in AI Systems
Anatomy of a Data Breach: What to Do if it Happens to You
How Well Can You See What's in Your Cloud?
Implementing CTEM: Beyond Vulnerability Management
More Webinars
You May Also Like
APPLICATION SECURITY
Supply Chain Attack Secretly Installs OpenClaw for Cline Users
by Rob Wright
FEB 19, 2026
APPLICATION SECURITY
Chinese Hackers Hijack Notepad++ Updates for 6 Months
by Jai Vijayan, Contributing Writer
FEB 02, 2026
APPLICATION SECURITY
Trump Administration Rescinds Biden-Era Software Guidance
by Alexander Culafi
JAN 29, 2026
APPLICATION SECURITY
Microsoft Fixes Exploited Zero Day in Light Patch Tuesday
by Jai Vijayan, Contributing Writer
DEC 09, 2025
Editor's Choice
THREAT INTELLIGENCE
From Stuxnet to ChatGPT: 20 News Events That Shaped Cyber
byDark Reading Editorial Team
MAY 6, 2026
31 MIN READ
CYBER RISK
Physical Cargo Theft Gets a Boost From Cybercriminals
byRobert Lemos
MAY 4, 2026
5 MIN READ
CYBER RISK
NSA Chief During Snowden Affair Shares Regrets, Reflections 13 Years Later
byDark Reading Editorial Team
APR 28, 2026
Want more Dark Reading stories in your Google search results?
Keep up with the latest cybersecurity threats, newly discovered vulnerabilities, data breach information, and emerging trends. Delivered daily or weekly right to your email inbox.
SUBSCRIBE
LOADING...
Webinars
The New Attack Surface: How Attackers Are Exploiting OAuth to Own Your Cloud Workspace
WED, JUNE 24,2026 AT 1PM EST
Prompt Injection Is Just the Start: Securing LLMs in AI Systems
TUES, MAY 26, 2026, AT 1PM EST
Anatomy of a Data Breach: What to Do if it Happens to You
JUNE 18TH, 2026 | 11:00AM -5:00PM ET | DOORS OPEN AT 10:30AM ET
How Well Can You See What's in Your Cloud?
THURS, JUNE 4, 2026 AT 1:00PM EST
Implementing CTEM: Beyond Vulnerability Management
THURS, MAY 21, 2026 AT 1PM EST
More Webinars
BLACK HAT USA | MANDALAY BAY, LAS VEGAS
The premier cybersecurity event of the year returns to Mandalay Bay with a re‑engineered, six‑day program built to ignite innovation, push boundaries, and bring the global security community together like never before. Use code: DARKREADING to save $200 on a Briefings pass or $100 on a Business pass.
GET YOUR PASS