
Anthropic flags AI-driven cyberattacks, warns that cybersecurity has reached a critical inflection point - Industrial Cyber

Industrial Cyber, archived Mar 25, 2026


November 17, 2025

Anthropic has warned that cybersecurity has reached a critical inflection point, with AI models becoming powerful tools for both defensive and offensive operations. The warning comes after Chinese state-sponsored hackers reportedly used Anthropic’s artificial-intelligence technology to automate intrusions into major corporations and foreign governments during a September hacking campaign.

“At the same time, as part of our Safeguards work, we have found and disrupted threat actors on our own platform who leveraged AI to scale their operations,” Anthropic detailed in a recent research report. “Our Safeguards team recently discovered (and disrupted) a case of ‘vibe hacking,’ in which a cybercriminal used Claude to build a large-scale data extortion scheme that previously would have required an entire team of people. Safeguards has also detected and countered Claude’s use in increasingly complex espionage operations, including the targeting of critical telecommunications infrastructure, by an actor that demonstrated characteristics consistent with Chinese APT operations.”

Anthropic noted that a shift has become evident over the past year. The company demonstrated that its models could simulate one of the costliest cyberattacks in history, the 2017 Equifax breach. Claude has also been entered into cybersecurity competitions, outperforming human teams in certain scenarios. Additionally, the AI has helped identify vulnerabilities in Anthropic’s own code, allowing them to be fixed before release.

In mid-September, Anthropic detected suspicious activity later identified as a highly sophisticated espionage campaign.
The attackers exploited AI’s ‘agentic’ capabilities to an unprecedented degree, using the technology not merely as an advisory tool but to execute the attacks directly. The threat actor, assessed with high confidence as a Chinese state-sponsored group, manipulated the Claude Code tool to attempt infiltration of roughly thirty global targets, succeeding in a small number of cases. Targets included tech companies, financial institutions, chemical manufacturers, and government agencies. This appears to be the first documented large-scale cyberattack conducted with minimal human intervention.

Upon detection, Anthropic launched an immediate investigation to determine the scope and nature of the operation. Over the following ten days, the team mapped the full extent of the campaign, banned compromised accounts, notified affected organizations as appropriate, and coordinated with authorities while gathering actionable intelligence.

“Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks,” the report noted. “These attacks are likely to only grow in their effectiveness. To keep pace with this rapidly advancing threat, we’ve expanded our detection capabilities and developed better classifiers to flag malicious activity. We’re continually working on new methods of investigating and detecting large-scale, distributed attacks like this one.”

Anthropic’s review shows cyber capabilities doubling every six months and highlights real-world attacks exploiting AI. Despite robust safeguards, malicious actors continue to probe for weaknesses. The company’s Threat Intelligence team identified and disrupted a recent threat campaign. That team works with the Safeguards organization to strengthen defenses and accelerate the use of AI for securing code and infrastructure, and Anthropic emphasized that AI-driven cyber advantages must not fall to attackers.
The research states: “We should not cede the cyber advantage derived from AI to attackers and criminals. While we will continue to invest in detecting and disrupting malicious attackers, we think the most scalable solution is to build AI systems that empower those safeguarding our digital environments—like security teams protecting businesses and governments, cybersecurity researchers, and maintainers of critical open-source software.”

The attack exploited several AI capabilities that either did not exist or were still in early development just a year ago. The models’ general intelligence has advanced to the point that they can follow complex instructions and understand context, enabling the execution of highly sophisticated tasks. Their specific skills, particularly in software coding, make them well-suited for use in cyberattacks. Models can also act as agents, operating autonomously in loops where they chain together tasks and make decisions with only minimal human input. In addition, they now have access to a wide array of software tools, often through the open standard Model Context Protocol. These tools allow them to search the web, retrieve data, and perform actions that previously required human operators. In cyberattacks, such tools may include password crackers, network scanners, and other security-related software.

In the initial phase, human operators selected the targets and developed an attack framework designed to compromise them with minimal human involvement, using Claude Code as an automated tool. To bypass Claude’s safeguards against harmful behavior, the attackers jailbroke the system. They broke the attack into small, seemingly benign tasks and misled the AI by telling it that it was an employee of a legitimate cybersecurity firm conducting defensive testing.
In the second phase, Claude Code conducted reconnaissance on the target’s systems, quickly identifying high-value databases and reporting its findings to the human operators. It performed in minutes what would have taken a human team significantly longer. In the later phases, Claude identified and tested security vulnerabilities by researching them and writing its own exploit code. The attack framework then used the system to harvest credentials, expand access, and extract large volumes of private data, which Claude categorized by intelligence value. It identified high-privilege accounts, installed backdoors, and carried out data exfiltration with little human oversight. In the final phase, Claude generated detailed documentation of the operation, compiling stolen credentials and system analyses to support planning for the threat actor’s future campaigns.

“Overall, the threat actor was able to use AI to perform 80-90% of the campaign, with human intervention required only sporadically (perhaps 4-6 critical decision points per hacking campaign),” Anthropic said. “The sheer amount of work performed by the AI would have taken vast amounts of time for a human team. At the peak of its attack, the AI made thousands of requests, often multiple per second—an attack speed that would have been, for human hackers, simply impossible to match.”

Anthropic highlighted that Claude didn’t always work perfectly. “It occasionally hallucinated credentials or claimed to have extracted secret information that was in fact publicly-available. This remains an obstacle to fully autonomous cyberattacks.”

The report also found that the AI component demonstrated extensive autonomous capability across all operational phases. “Reconnaissance proceeded without human guidance, with the threat actor instructing Claude to independently discover internal services within targeted networks through systematic enumeration.
Exploitation activities, including payload generation, vulnerability validation, and credential testing, occurred autonomously based on discovered attack surfaces.”

It added that data analysis operations involved the AI parsing large volumes of stolen information to independently identify intelligence value and categorize findings. Claude maintained a persistent operational context across sessions spanning multiple days, enabling complex campaigns to resume seamlessly without requiring human operators to manually reconstruct progress.

According to the report, the operational tempo indicates that the model was operating autonomously rather than providing interactive assistance. Peak activity included thousands of requests, representing sustained request rates of multiple operations per second. The report cited the substantial disparity between data inputs and text outputs as further evidence that the AI actively analyzed stolen information rather than generating explanatory content for human review.

Recognizing that a fundamental change has occurred in cybersecurity, Anthropic advised security teams to experiment with applying AI for defense in areas like Security Operations Center automation, threat detection, vulnerability assessment, and incident response.

“We also advise developers to continue to invest in safeguards across their AI platforms, to prevent adversarial misuse. The techniques described above will doubtless be used by many more attackers—which makes industry threat sharing, improved detection methods, and stronger safety controls all the more critical.”

Anna Ribeiro, Industrial Cyber News Editor. Anna Ribeiro is a freelance journalist with over 14 years of experience in the areas of security, data storage, virtualization and IoT.