
The AI Revolution in Cyber Conflict

Lennart Maschmeyer
Wednesday, April 8, 2026, 9:41 AM
The AI revolution will likely empower cyber defense over offense because AI excels at detection but struggles with deception.

There is no shortage of hype over artificial intelligence (AI), especially in regard to its use in cyber conflict. Some observers predict catastrophic consequences as AI-powered cyberattacks proliferate. These concerns are no longer theoretical. Hacking groups are now actively using AI. In the summer of 2025, an AI model built by a security startup made headlines by becoming the world’s top-ranked hacker. Soon after, a Chinese government hacking group used Anthropic’s Claude model to automate a “sophisticated” cyberattack that successfully compromised several targets. And in February, hacktivists used the same model for a cyberattack against Mexico’s government that stole more than 150 gigabytes of sensitive data. The firm that discovered this hack claimed that AI has definitively changed the cyber conflict game. By leveraging AI, it explained, “[w]annabe threat actors are causing damage in moments and experienced threat actors are amplifying their capabilities overnight to rapidly achieve some of the most impactful malicious outcomes ever recorded.”

If this prediction is true, AI will revolutionize not only cyber conflict but international conflict at large. Dire predictions of catastrophic strategic cyberattacks have existed since the inception of the World Wide Web. Thankfully, they have not materialized, due to the significant operational and organizational constraints involved in major cyber operations. AI automation, however, now promises to overcome these constraints. Even if it does not unleash the cyberwar scenarios scientists have warned of since the 1990s, it may still supercharge low-intensity cyber campaigns to a level where they can substitute for war.

In short, the era of AI-powered cyberattacks has arrived. Consequently, determining the likely impact on cyber conflict and conflict at large is both urgent and important. As one report on the Mexican government intrusion put it, “[f]or any cyber-defender continuing to deny the impact of AI on attacker efficiency, welcome to Exhibit A.” But here lies the crux: Efficiency does not equal effectiveness.

There is undeniable evidence that AI automation enhances efficiency, especially for lower-capacity actors. They are able to do more with less, and faster. There is little evidence, however, that such automation makes operations more effective, especially for capable actors such as nation-states. Effectiveness in this context means the ability to exploit vulnerabilities in systems to gain unauthorized access and manipulate them toward producing desired effects (while avoiding detection). The more reliably actors can do so, and the more these effects contribute toward their strategic goals, the more effective operations are.

Marketing claims such as the one above assert that AI-powered attacks are more sophisticated and damaging, but there is little evidence of this in practice. There have been no game-changing AI-powered cyberattacks by state-sponsored actors that achieved previously impossible outcomes.

That is probably not a coincidence. As I argue in a new article in International Security, the offense in cyber conflict has less to gain from AI automation than the defense. At the most basic level, offense is about deception whereas defense is about detection. The offense tries to sneak in and manipulate systems while the defense aims to detect and neutralize this activity. Crucially, AI models excel at detection but struggle with deception. Consequently, offense automation offers efficiency gains yet limited effectiveness gains—and the higher the stakes become, the lower these gains tend to be.

As tasks get more complex (from vulnerability detection to exploit development, manipulation of systems, and effect production), AI automation offers decreasing utility and growing failure risks. Generative AI models are very good at producing output that reproduces patterns in their training data, but they struggle to generate original, creative, and deceptive output. The higher the stakes in cyber conflict, the more important such creativity and cunning become. Indeed, these skills are the hallmark of the most advanced hacking groups. Automating these higher-end tasks in higher-end operations may thus lower the quality of the output and make it easier to detect. In addition, the non-deterministic behavior of (generative) AI models and their innate tendency toward hallucination introduce further uncertainty to a process already fraught with it. In short, AI automation likely leads to inferior tradecraft while adding failure risks. For advanced actors, AI automation may thus ultimately lower effectiveness compared to a fully “manual” workflow—a cost unlikely to be offset by the relatively limited efficiency gains at this level.

In contrast, AI automation offers significant efficiency and effectiveness gains for the defense. Efficient defense means detecting and neutralizing as many intrusions as possible, as quickly as possible, whereas effectiveness means doing so accurately and reliably. AI automation allows improvements across both dimensions, and these gains tend to increase with the stakes involved for actors. Simply put, the more an organization has to lose, the larger it tends to be. The larger the organization, the larger its network tends to be, and the more data there will be to analyze. AI model performance tends to increase with dataset size.

The result is what I call an “Automation Gap” between cyber offense and defense that widens with the stakes. At the high end of cyber conflict, where state-sponsored actors go after large and well-endowed organizations, AI automation has the least transformative impact. Counterintuitively, at the interstate level, AI adoption by offense and defense is thus likely to tame rather than inflame cyber conflict. There is one exception, though: As cyber offense gets even harder than it already is, actors may risk more to attempt more dramatic effects in an all-or-nothing cyber strike for glory. The result is a higher risk of inadvertent escalation.

As detailed below, there is a wealth of evidence of offense and defense AI automation up to early 2025 that largely supports this theory. But this is a fast-moving area of technological innovation. Model performance keeps increasing. Several high-profile AI-powered offensive operations have rattled cybersecurity researchers. Hence, it is worth considering how the theory holds up against these more recent developments. To put the conclusion first: Counterintuitively, a deeper dive beyond the headlines shows these incidents mostly support the theory.

Xbow: The World’s Top Hacker Is an AI

In mid-2025, an AI model made headlines for becoming the world’s top hacker. The website HackerOne features a leaderboard of the world’s best hackers based on reputation, earned by submitting vulnerabilities and exploits across a range of specialized categories. In June 2025, a hacker by the name of Xbow topped the board—and turned out to be an AI model trained by an eponymous startup. This was a significant milestone for humans’ silicon-based competitors. As one media report put it, “AI is getting so good that it’s outperforming human red teamers.”

However, a closer look at the type of vulnerabilities involved casts some doubt on that perspective. Evidently, Xbow did outperform human competitors on HackerOne—but the main reason for its superior performance seems to have been its ability to detect lower-tier vulnerabilities at high scale. By June 2025, Xbow had submitted close to 1,000 vulnerabilities to HackerOne. A security researcher who analyzed Xbow’s profile noted that many of the vulnerabilities it listed were “some of the more basic things you can find with automation,” adding that “I wouldn’t be so mean as to say these are rudimentary finds, but all of this is much more ‘surface material’ as opposed to more in-depth campaigns.” This outcome is exactly in line with the predictions of the theory, namely that AI significantly improves efficiency at lower-complexity activity. The theory identifies four distinct steps in the offensive and defensive workflows—both of which start with vulnerability detection, a task AI excels at. Accordingly, Xbow’s performance supports that part of the theory. Concerning effectiveness (which in this context would mean detecting the most prized types of vulnerabilities, such as zero-days in iOS), the gains are less clear. Indeed, it seems humans still hold an edge in this regard.

2025 Chinese-Sponsored AI-Automated Cyberattack Using Anthropic AI

Later in 2025, the AI firm Anthropic published a landmark report on another world first: a sophisticated state-sponsored cyberattack relying on AI agents to automate much of its workflow. The humans involved selected approximately 30 targets, developed an attack framework designed to enable automated compromises with minimal human involvement, and let AI agents loose to implement it. The hacking group first had to “jailbreak” the Claude AI by deceiving it into assuming it was an employee of a cybersecurity firm and by breaking down the workflow into individual tasks that masked their malicious intent. These tasks covered all offensive steps, namely vulnerability detection, exploit generation, and manipulating systems toward producing desired effects (in this case, data exfiltration). According to Anthropic, this highly automated workflow, in which AI carried out 80 to 90 percent of tasks, failed for the majority of targets but succeeded in “a small number” of cases.

This case is significant because it constitutes the first known instance of a state-sponsored actor deploying AI automation across the entire offensive workflow. Anthropic argued the future implications were grave, as this incident demonstrated how “threat actors can now use agentic AI systems for extended periods to do the work of entire teams of experienced hackers.” There are reasons to be skeptical, however. First, it is unsurprising that the vendor of an AI tool would play up its capabilities. Accordingly, security researchers called the report “odd” for not including any actual details of tools, techniques, and procedures (or indicators of compromise), and suggested it may have been “made up.”

Most strikingly, even Anthropic’s marketing narrative supports the core tenets of the theory. Anthropic’s findings showed AI automation enhances efficiency. As the report asserts, “threat actors can now use agentic AI systems for extended periods to do the work of entire teams of experienced hackers … more efficiently than any human operator.” The potential for efficiency increases is clear. Yet the predicted limits on effectiveness are equally clear in this case. As noted, the approach failed in the vast majority of cases. If the hacking group had intended to compromise a specific, high-value target, automation would thus most likely have led to failure. Meanwhile, Anthropic’s report reveals that the automated workflow relied almost entirely on known, open-source tools. Known tools can be easily detected by well-equipped defenders—explaining the high failure rate, and again underlining the limitations of AI models in generating original exploits and tooling that reliably evade detection. Veteran security researcher Kevin Beaumont accordingly posited that this workflow’s “operational impact should likely be zero - existing detections will work for open source tooling.” On top of this source of failure risk, there is the unpredictability of AI models themselves, of which Anthropic’s report provides further evidence. As the report notes, Anthropic’s Claude agents “occasionally hallucinated credentials or claimed to have extracted secret information that was in fact publicly-available.” In short, the Automation Gap theory helps to explain the key characteristics of this case as well.

2026 AI-Powered Breach of the Mexican Government

The most recent case provides a stronger challenge to the theory. A small, still unidentified, hacktivist collective used Anthropic’s Claude and OpenAI’s ChatGPT models to compromise parts of the Mexican government. According to security startup Gambit, the hackers also succeeded in exfiltrating 150 gigabytes of sensitive data. In the modest tone typical of threat reports, Gambit’s report concludes that “AI gives a motivated individual the operational leverage of a nation-state … and experienced threat actors are amplifying their capabilities overnight to rapidly achieve some of the most impactful malicious outcomes ever recorded.”

As in the previous case, the report lacks any technical detail, which precludes any analysis of the quality of the tooling the hacktivists used. Accordingly, it is only possible to examine the case and its implications in their wider context. The first thing that stands out is the type of hacking group involved, which Gambit characterizes as “a small number of individuals.” Compared to the previous case, this operation thus fits within what I characterize as the lower end of cyber conflict. It involves small groups with limited resources and skills engaged in low-stakes activity. Typically, such groups go after foreign victims—meaning they remain beyond the reach of the victims’ law enforcement—and the type of information they obtain can be sold on underground forums and marketplaces for money. In this context, Gambit’s conclusion that AI automation significantly enhances the capabilities of such low-stakes actors fully supports one of the key predictions of the theory: AI is most transformative at the low end of the cyber conflict spectrum.

The claim that AI gives an individual the “operational leverage of a nation-state,” whatever that means in practice, is questionable, however. As discussed, nation-state actors stand out due to their significant organizational capacity and resource endowments. These characteristics cannot be replaced by AI use of the kind demonstrated in this case. As Gambit showed, the hackers relied on prompt-based interactions with models, meaning the workflow was at best semiautomated. First, they had to jailbreak the models, as in the previous case, by deceiving them into assuming a benign purpose behind their exploit activity. Next, they carefully guided the AI model’s activities through a series of more than 1,000 individual prompts. Hence, there was a clear bottleneck in the speed of human-machine interaction. And, contrary to Gambit’s claims about AI enhancing the damage of attacks, the scale of this breach is nothing out of the ordinary either. Just a month prior to this intrusion, a hacktivist collective called Chronus managed to compromise the Mexican government with traditional human hackers and exfiltrated more than 15 times the amount of data that this AI-driven operation managed to obtain. Once again, there is no evidence of a significant improvement in the effectiveness of cyberattacks due to AI automation—on the contrary.

Looking Ahead

In short, there is growing evidence that the AI revolution in cyber conflict empowers defenders more than offenders. AI automation enhances offensive efficiency, but not necessarily effectiveness. On the contrary, for the most advanced actors, AI automation likely brings a net loss in effectiveness because it increases failure risks. While this piece has largely focused on evidence of offense automation, the underlying article documents extensive evidence supporting the theory’s predicted gains from defense automation. This evidence will likely grow as defenders increasingly adopt the technology. Indeed, more recent evidence further corroborates the core claim of defense advantage, demonstrating the superior performance of AI-automated defenses both in experiments and in their growing adoption by businesses. (In fact, the absence of evidence of major AI-powered cyberattacks in the context of rising geopolitical tensions could itself be the result of improved defenses.) Cyber offense is already hard, and AI-empowered defenses likely make it even harder. Consequently, interstate cyber conflict is likely to become even less intense than it already is. The utility of cyber operations as an instrument of projecting power is likely to further decline relative to other (kinetic) options. Hence, rational actors will use them less often, in a narrower set of circumstances.

Conversely, the dark worlds of cybercrime and authoritarian repression are likely to receive a major boost. AI automation enables significant increases in the deployment of common, open-source tools against “soft” targets lacking adequate security measures. Smaller, resource-strapped actors already unable to afford and implement advanced cyber defenses will become even more vulnerable. Unable to reap the benefits of advanced AI automated defenses, they will bear the brunt of AI-empowered attacks.

For these predictions, based on the theory and evidence, to continue to hold true, several conditions must be met. First, both offense and defense must continue adopting AI automation. I assume this will be the case because of the clear expected benefits. However, because the rollout will be uneven, there will be cases, especially in the near future, of AI-powered offenses hitting non-AI defenses and vice versa. The nonautomated side will be at a disadvantage here. Such cases will provide added incentives, especially for defenders, to improve automation, however, which should help ensure more universal adoption.

Second, vulnerabilities need to be patched. While AI allows vulnerability detection at machine speed, the pace of patching is still often determined by bureaucracy and fragmented responsibilities. In order to benefit from the improved detection speed and accuracy that AI offers, these processes need to be improved as well. If not, offenders may still win despite the technology being better suited to the needs of defenders.

Third, there must be no fundamental change in the trajectory of AI development. The theory applies to the large transformer-based language models that have come to dominate the industry. The limited utility for the offense results from the intrinsic limitations of these models. Because these limitations are intrinsic, they are unlikely to simply disappear. However, there are alternative approaches, such as “world models,” that may not suffer from these limitations. If such, or other, alternative models take over and their capabilities take off, all bets are off. Again.


Lennart Maschmeyer is a Senior Researcher at the Center for Security Studies at ETH Zurich. He holds a PhD in Political Science from the University of Toronto and an M.Phil in International Relations from the University of Oxford. His current research focuses on the nature of cyber power and the relationship between operational constraints and strategic dynamics in cyber conflict. Lennart is also working on a second project compiling a dataset of threat intelligence reporting to identify potential sources of bias in the data and how these impact prevailing threat perceptions. He is a Fellow at The Citizen Lab.
