Dark Reading, archived Mar 26, 2026
AI-Powered Dependency Decisions Introduce, Ignore Security Bugs
AI models often hallucinate or make costly mistakes when tasked with recommending software versions, upgrade paths, and security fixes — leading to significant technical debt.
Rob Wright, Senior News Director, Dark Reading
March 26, 2026
Organizations may want to think twice before consulting with AI models on software dependency decisions.
New research from Sonatype found that "frontier" models (defined as the most advanced AI models available at a given moment) often generate faulty or fabricated recommendations for software dependencies, which spells trouble for organizations that lean on AI for upgrade and patching guidance.
Sonatype's research team analyzed 36,870 unique dependency upgrade recommendations across Maven Central, npm, PyPI, and NuGet between June and August 2025. In all, the DevSecOps company studied a total of 258,000 recommendations generated by seven AI models from Anthropic, OpenAI, and Google.
Sonatype published the first part of this study, which focused on OpenAI's GPT-5, in February as part of its 2026 State of the Software Supply Chain report. That study found the LLM often recommended software versions, upgrade paths, or security fixes that didn't actually exist. In fact, nearly 28% of the recommended dependency upgrades were hallucinations.
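One way to catch a recommendation for a version that does not exist is to compare each proposed version against the versions a registry has actually published. The following is a minimal sketch of that check, not Sonatype's method: the `Recommendation` type, the `find_hallucinated` helper, and all package names and versions are hypothetical, and the `published` mapping is assumed to have been fetched from a live registry elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    package: str   # package name as the model wrote it
    version: str   # version string proposed by the model


def find_hallucinated(recommendations, published_versions):
    """Return recommendations whose version was never published.

    published_versions maps package name -> set of version strings and is
    assumed to come from a live registry lookup done elsewhere.
    """
    return [
        rec for rec in recommendations
        if rec.version not in published_versions.get(rec.package, set())
    ]


# Hypothetical data for illustration only
published = {
    "example-lib": {"1.0.0", "1.1.0", "2.0.0"},
    "other-lib": {"0.9.1"},
}
recs = [
    Recommendation("example-lib", "2.0.0"),   # exists
    Recommendation("example-lib", "2.1.0"),   # never published
    Recommendation("other-lib", "1.0.0"),     # never published
    Recommendation("unknown-lib", "3.2.1"),   # package absent from registry data
]

hallucinated = find_hallucinated(recs, published)
rate = len(hallucinated) / len(recs)
print(f"{len(hallucinated)} of {len(recs)} recommendations do not exist ({rate:.0%})")
```

A check like this only catches fabricated versions. It says nothing about whether a version that does exist is safe or compatible.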
Part two of the study, published Tuesday, showed that while newer frontier models with enhanced reasoning — including GPT-5.2, Anthropic's Claude Sonnet 3.7 and 4.5, Claude Opus 4.6, and Google's Gemini 2.5 Pro and 3 Pro — saw improvements, the models still generated a significant number of hallucinations and faulty recommendations.
"In practice, those failures drive wasted AI spend, wasted developer time, unresolved vulnerability exposure, and technical debt before code reaches production," Sonatype said in the report.
Bad Upgrade Advice from AI Models
Sonatype emphasized that the issue isn't with the frontier models' reasoning capabilities, which have improved over those of earlier models. Instead, the models lack real-time intelligence about the dependencies themselves, along with other context.
"The issue is not model scale but rather ecosystem intelligence," according to Sonatype's report. "AI models lack the real-time dependency, vulnerability, compatibility, and enterprise policy context required to make safe remediation decisions."
For example, even the best-performing models still invented about one out of every 16 dependency recommendations. Frontier models also recommended "no change" for approximately a third of components, which reduced the hallucinations.
However, Sonatype said the more "cautious" AI models failed to flag vulnerabilities in components with "no change" designations, resulting in between 800 and 900 critical and high-severity vulnerabilities being left in production code.
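The gap described here can be made visible with a simple audit: for every component where the recommendation is "no change," look up the known advisories for the version already in use. The sketch below assumes advisory data is available as a plain mapping. The function name, data shapes, package names, and advisory IDs are all hypothetical and are not taken from the report.

```python
HIGH_RISK = {"critical", "high"}


def unresolved_after_no_change(components, recommendations, advisories):
    """List known critical/high advisories that a "no change" call leaves in place.

    components:      package -> version currently in use
    recommendations: package -> recommended version, or None for "no change"
                     (a package missing from the mapping is treated as "no change")
    advisories:      (package, version) -> list of (advisory_id, severity)
    """
    leftovers = []
    for package, current in components.items():
        if recommendations.get(package) is not None:
            continue  # an upgrade was proposed; not part of this audit
        for advisory_id, severity in advisories.get((package, current), []):
            if severity in HIGH_RISK:
                leftovers.append((package, current, advisory_id, severity))
    return leftovers


# Hypothetical data for illustration only
components = {"lib-a": "1.2.0", "lib-b": "3.0.1", "lib-c": "0.4.0"}
recommendations = {"lib-a": None, "lib-b": "3.1.0", "lib-c": None}
advisories = {
    ("lib-a", "1.2.0"): [("ADV-0001", "critical"), ("ADV-0002", "low")],
    ("lib-b", "3.0.1"): [("ADV-0003", "high")],
}

for row in unresolved_after_no_change(components, recommendations, advisories):
    print(row)
```

In this made-up data, only `lib-a` is reported: it was left unchanged while carrying a critical advisory, whereas `lib-b` received an upgrade recommendation and `lib-c` has no known advisories.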
In other cases, the models actively introduced vulnerabilities by recommending software versions that contained known bugs, which in some instances put the AI stack itself at increased risk.
"These are the libraries used to train, fine-tune, orchestrate, and serve LLMs," the report stated. "The irony is difficult to ignore: AI agents recommending upgrades inside the AI stack are themselves failing to avoid critical vulnerabilities in the very tools that power them."
Sonatype co-founder and chief technology officer (CTO) Brian Fox says the bad advice provided by AI models creates a significant amount of technical debt that's often easy for organizations to miss. Organizations generally know that AI models make mistakes, he says, but Sonatype's research shows that errors in software dependency recommendations are "subtle, structured, and quietly becoming part of normal development work."
He tells Dark Reading, "The most dangerous version of this problem isn't when the model gives you something obviously broken. It's when it gives you something plausible that preserves risk, misses the better upgrade path, and looks close enough to ship."
Adding Dependency Intelligence & Context to AI
Sonatype's study showed that "grounding" AI models with live intelligence and context led to dramatically better results. The company compared the frontier models to Sonatype's own hybrid approach, which applies real-time intelligence at inference time, and found the latter provided a nearly 70% reduction in critical and high risks to organizations.
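As a rough illustration of what grounding can mean in practice, the sketch below restricts the choice to candidate versions supplied by a trusted data source and ranks them by known risk, so that nothing outside the supplied list can be selected. It is an assumption-laden sketch, not Sonatype's hybrid approach or any real API: the `Candidate` fields, the 0-100 trust score, and the ordering rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    version: str
    critical: int       # known critical vulnerabilities
    high: int           # known high-severity vulnerabilities
    trust_score: float  # hypothetical 0-100 score, higher is better


def pick_upgrade(candidates):
    """Pick the candidate with the fewest critical, then high, vulnerabilities.

    Ties are broken by the higher trust score. Returns None when no
    candidates are supplied, so the caller never invents a version.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c.critical, c.high, -c.trust_score))


# Hypothetical candidates for illustration only
candidates = [
    Candidate("4.2.0", critical=1, high=0, trust_score=91.0),
    Candidate("4.1.3", critical=0, high=2, trust_score=88.0),
    Candidate("4.1.5", critical=0, high=1, trust_score=80.0),
    Candidate("4.1.4", critical=0, high=1, trust_score=85.0),
]

best = pick_upgrade(candidates)
print(best.version)
```

None of the candidates here is free of known issues, so the function settles for the least risky one. A real policy would likely also weigh compatibility and organizational constraints.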
As an experiment, Sonatype equipped GPT-5 Nano, the smallest and cheapest of the GPT-5 models, with a single function-calling tool backed by Sonatype Guide’s version recommendation API. Providing the model with additional intelligence, such as ranked upgrade candidates, vulnerability counts, and the platform's Developer Trust Scores, led to a significant reduction in vulnerabilities compared with ungrounded counterparts.
"Grounding doesn’t just prevent hallucinations; it steers the model toward versions with fewer known vulnerabilities when a perfect option doesn’t exist," the report stated.
Fox says that without live registry data, vulnerability intelligence, or compatibility context, AI models will make mistakes that are costly to fix. And unfortunately, simply adding a human in the loop is unlikely to prevent such errors.
"At that point, you're asking humans to clean up decisions the system never had enough truth to make well in the first place," he says. "Humans should set policy and constraints. The system still needs to be grounded in real-time software intelligence."
About the Author
Rob Wright
Senior News Director, Dark Reading
Rob Wright is a longtime reporter with more than 25 years of experience as a technology journalist. Prior to joining Dark Reading as senior news director, he spent more than a decade at TechTarget's SearchSecurity in various roles, including senior news director, executive editor and editorial director. Before that, he worked for several years at CRN, Tom's Hardware Guide, and VARBusiness Magazine covering a variety of technology beats and trends. Prior to becoming a technology journalist in 2000, he worked as a weekly and daily newspaper reporter in Virginia, where he won three Virginia Press Association awards in 1998 and 1999. He graduated from the University of Richmond in 1997 with a degree in journalism and English. A native of Massachusetts, he lives in the Boston area.