Can Anthropic Keep Its Exploit-Writing AI Out of the Wrong Hands?
Dark ReadingArchived Apr 10, 2026✓ Full text saved
Its Mythos Preview model, which can allegedly find and exploit critical zero-days, also comes with certain controls, the vendor said.
Full text archived locally
✦ AI Summary· Claude Sonnet
APPLICATION SECURITY
СLOUD SECURITY
VULNERABILITIES & THREATS
CYBERSECURITY OPERATIONS
NEWS
Can Anthropic Keep Its Exploit-Writing AI Out of the Wrong Hands?
Its Mythos Preview model, which can allegedly find and exploit critical zero-days, also comes with certain controls, the vendor said.
Alexander Culafi,Senior News Writer,Dark Reading
April 10, 2026
4 Min Read
SOURCE: ADRIAN VIDAL VIA ALAMY STOCK PHOTO
Anthropic's Mythos model promises major innovations in vulnerability management and security red-teaming, but questions remain regarding how defenders can keep threat actors from taking full advantage.
Anthropic on April 7 unveiled Claude Mythos Preview, a general-purpose large language model (LLM) that the company said in a blog post, "performs strongly across the board, but it is strikingly capable at computer security tasks." The AI firm said Mythos could identify and exploit zero-day vulnerabilities in "every major operating system and every major Web browser" at user direction, including subtle and difficult-to-detect ones. One exploit included a patched 27-year-old flaw in OpenBSD.
Some of these vulnerabilities are complex, but the company says one doesn't need to be a security engineer to properly prompt the model.
"In one case, Mythos Preview wrote a Web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes," the blog read. "It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets."
Related:AI-Led Remediation Crisis Prompts HackerOne to Pause Bug Bounties
The vulnerability detection and exploitation enhancements came as a "downstream consequence" of improving Mythos' code and reasoning capabilities, rather than it being an explicit goal on its developers' part.
"The same improvements that make the model substantially more effective at patching vulnerabilities also make it substantially more effective at exploiting them," Anthropic said.
While the aim is to assist defenders and keep Mythos out of attacker hands, and while Anthropic claims it has identified "thousands" of high-risk and critical security vulnerabilities that it's responsibly disclosing, it's not much of a leap to see how a model like Mythos Preview could be misused, similarly to how threat actors abuse legitimate penetration testing tools like Cobalt Strike.
Enter Project Glasswing: Anthropic Mythos for Cyber Defenders
It is likely in anticipation of this that Anthropic introduced "Project Glasswing," a new initiative the company launched this week in partnership with companies like Apple, AWS, Microsoft, Palo Alto Networks, and CrowdStrike. As part of its product launch, Anthropic claimed Project Glasswing could fundamentally "reshape cybersecurity," and that this would be "an urgent attempt to put these capabilities to work for defensive purposes."
Related:Grafana Patches AI Bug That Could Have Leaked User Data
In practical terms, the AI vendor has extended Mythos Preview access to a group of more than 40 organizations to scan and secure first-party and open source systems. Lee Klarich, chief product and technology officer of Palo Alto Networks, called early Mythos Preview results "compelling" in a LinkedIn blog post.
In addition to granting limited access to partners, Anthropic is committing $100 million in Mythos Preview usage credits to Project Glasswing, as well as $4 million in direct donations to open source security organizations.
As for why Anthropic introduced something so good at exploiting vulnerabilities, Forrester senior analyst Erik Nost tells Dark Reading that it's good PR for Anthropic, as the company is basically saying its AI is so good that it can reshape cybersecurity and software development. Secondly, it also calls attention to the vulnerability detection gaps that the industry has dealt with for 30 years.
Keeping Mythos Preview Out of the Wrong Hands
Nost explains that there are controls in place ensuring Mythos stays in the right hands, though it has become "a race [for defenders] to remediate and patch before other AIs, in the wrong hands, discover these zero-days and rapidly write exploits."
Related:AI-Assisted Supply Chain Attack Targets GitHub
"It's a call to action, a heads-up, to defenders that vulnerability management practices are about to get very different," he says.
Julian Totzek-Hallhuber, senior principal solution architect at Veracode, says that because there is no clear answer for how these tools can stay out of attacker hands, defenders should assume the capability will proliferate, and should prepare accordingly. This means investing in detection instead of just prevention, identifying the behavioral signatures of AI-assisted exploitation, and investing in zero-trust architecture as well as aggressive patching cycles and anomaly-based detection.
Melissa Ruzzi, director of AI at AppOmni, tells Dark Reading a deeper truth: "No one can ever keep anything 100% out of attackers' hands. The best that can be done is to make it more difficult for them to get access to it."
Mythos' potential comes with a caveat: While the early Anthropic examples of discovered vulnerabilities are compelling, two data points do not make a pattern. Totzek-Hallhuber emphasizes that "Anthropic controls both the model and the narrative; independent replication is impossible when the model isn't publicly available."
He adds, "Until independent researchers with access can run their own evaluations, healthy skepticism is the appropriate posture. This is, frankly, another consequence of the restricted access model: the claims can't be tested, so they can't be fully trusted or refuted."
Dark Reading contacted Anthropic to ask for statistics regarding false positives and error rates; the vendor did not respond by press time.
Don't miss the latest Dark Reading Confidential podcast, Security Bosses Are All in on AI: Here's Why, where Reddit CISO Frederick Lee and Omdia analyst Dave Gruber discuss AI and machine learning in the SOC, how successful deployments have (or haven’t) been, and what the future holds for AI security products. Listen now!
About the Author
Alexander Culafi
Senior News Writer, Dark Reading
Alex is an award-winning writer, journalist, and podcast host based in Boston. After cutting his teeth writing for independent gaming publications as a teenager, he graduated from Emerson College in 2016 with a Bachelor of Science in journalism. He has previously been published on VentureFizz, Search Security, Nintendo World Report, and elsewhere. In his spare time, Alex hosts the weekly Nintendo podcast Talk Nintendo Podcast and works on personal writing projects, including two previously self-published science fiction novels.
Want more Dark Reading stories in your Google search results?
ADD US NOW
More Insights
Industry Reports
AI SOC for MDR: The Structural Evolution of Managed Detection and Response
How Enterprises Are Developing Secure Applications
Frost Radar™: Non-human Identity Solutions
2026 CISO AI Risk Report
Gartner IGA Voice of the Customer 2026
Access More Research
Webinars
Tips for Managing Cloud Security in a Hybrid Environment?
Zero Trust Architecture for Cloud environments: Implementation Roadmap
Security in the AI Age
Identity Maturity Under Pressure: 2026 Findings and How to Catch Up
Building a Robust SOC in a Post-AI World
More Webinars
You May Also Like
APPLICATION SECURITY
Trump Administration Rescinds Biden-Era Software Guidance
by Alexander Culafi
JAN 29, 2026
APPLICATION SECURITY
Microsoft Fixes Exploited Zero Day in Light Patch Tuesday
by Jai Vijayan, Contributing Writer
DEC 09, 2025
APPLICATION SECURITY
It Takes Only 250 Documents to Poison Any AI Model
by Jai Vijayan, Contributing Writer
OCT 22, 2025
CYBERATTACKS & DATA BREACHES
DeepSeek Breach Opens Floodgates to Dark Web
by Emma Zaballos
APR 22, 2025
Editor's Choice
CYBERSECURITY OPERATIONS
RSAC 2026: AI Dominates, But Community Remains Key to Security
byKristina Beek,Rob Wright
APR 2, 2026
THREAT INTELLIGENCE
Axios Attack Shows How Complex Social Engineering Is Industrialized
byAlexander Culafi
APR 6, 2026
5 MIN READ
ICS/OT SECURITY
Iranian Threat Actors Disrupt US Critical Infrastructure via Exposed PLCs
byElizabeth Montalbano
APR 8, 2026
4 MIN READ
Want more Dark Reading stories in your Google search results?
2026 Security Trends & Outlooks
THREAT INTELLIGENCE
Cybersecurity Predictions for 2026: Navigating the Future of Digital Threats
JAN 2, 2026
CYBER RISK
Navigating Privacy and Cybersecurity Laws in 2026 Will Prove Difficult
JAN 12, 2026
ENDPOINT SECURITY
CISOs Face a Tighter Insurance Market in 2026
JAN 5, 2026
THREAT INTELLIGENCE
2026: The Year Agentic AI Becomes the Attack-Surface Poster Child
JAN 30, 2026
Download the Collection
Loading...
Keep up with the latest cybersecurity threats, newly discovered vulnerabilities, data breach information, and emerging trends. Delivered daily or weekly right to your email inbox.
SUBSCRIBE
Webinars
Tips for Managing Cloud Security in a Hybrid Environment?
THURS, MAY 7, 2026 AT 1PM EST
Zero Trust Architecture for Cloud environments: Implementation Roadmap
TUES, MAY 12, 2026 AT 1PM EST
Security in the AI Age
TUES, APRIL 28, 2026 AT 1PM EST
Identity Maturity Under Pressure: 2026 Findings and How to Catch Up
WED, MAY 6,2026 AT 1PM EST
Building a Robust SOC in a Post-AI World
THURS, MARCH 19, 2026 AT 1PM EST
More Webinars
White Papers
How Sunrun Transformed Security Operations with AiStrike
Autonomous Pentesting at Machine Speed, Without False Positives
Fixing Organizations' Identity Security Posture
Best practices for incident response planning
Industry Report: AI, SOC, and Modernizing Cybersecurity
Explore More White Papers
BLACK HAT ASIA | MARINA BAY SANDS, SINGAPORE
Experience cutting-edge cybersecurity insights in this four-day event featuring expert Briefings on the latest research, Arsenal tool demos, a vibrant Business Hall, networking opportunities, and more. Use code DARKREADING for a Free Business Pass or $200 off a Briefings Pass.
GET YOUR PASS
GISEC GLOBAL 2026
GISEC GLOBAL is the most influential and the largest cybersecurity gathering in the Middle East & Africa, uniting global CISOs, government leaders, technology buyers, and ethical hackers for three power-packed days of innovation, strategy, and live cyber drills.
📌 BOOK YOUR SPACE