Cybersecurity & Tech

AI-Enhanced Social Engineering Will Reshape the Cyber Threat Landscape

Alex O'Neill, Fred Heiding
Monday, May 5, 2025, 2:00 PM
The proliferation of artificial intelligence tools enables bad actors to conduct deceptive attacks more cheaply, quickly, and effectively.
"Cyber attacks" (Christiaan Colen, https://tinyurl.com/yckrfzc8; CC BY-SA 2.0 DEED, https://creativecommons.org/licenses/by-sa/2.0/)

Published by The Lawfare Institute
in Cooperation With
Brookings

In February 2024, a joint Microsoft-OpenAI advisory revealed that advanced persistent threat (APT) groups from Russia, North Korea, Iran, and China had experimented with using large language models (LLMs) to develop malicious code and gather intelligence. According to expert congressional testimony delivered last April, ransomware gangs are “innovating with AI to accelerate and scale attacks and find new attack vectors.” While the February 2024 advisory characterized these forays as “early-stage, incremental moves,” a new OpenAI report describes several cases of real-world operationalization.

These developments are indicative of cyber threat actors’ determination to harness the power of artificial intelligence (AI). The emergence of widely accessible AI systems promises to reshape the threat landscape, enabling bad actors to conduct cyberattacks more cheaply, quickly, and effectively at every phase of an operation. AI systems will pioneer new methods of exploitation that humans have not yet tried or even imagined. For all their positive applications, LLMs like GPT-4 will accelerate the proliferation of hacking tools and know-how, allowing existing threat actors to expand their capabilities more easily and lowering the bar for novices to enter the fray. It is crucial that policymakers anticipate the transformative impact AI will have on the cyber risk environment.

AI tools will strengthen offensive cyber capabilities across the board, but the greatest near-term threat is their capacity to enable “social engineering” operations, in which bad actors deceive victims into compromising their own security. This form of attack is already widespread and costs society billions of dollars every year; some studies have found that more than three-quarters of all malicious cyber operations originate with social engineering tactics like spear phishing and impersonation, which technical defenses struggle to guard against. The arrival of LLMs well-suited to automating such techniques will only deepen attackers’ asymmetrical advantage. While AI tools will also help hackers penetrate hardware and software systems, these capabilities will mature more slowly than those for deceiving human beings. The consequences of supercharged deceptive tactics will extend well beyond cyberspace, exacerbating real-world challenges from the ransomware epidemic to Chinese industrial espionage to North Korea’s nuclear program. A rise in potent social engineering attacks is far from the only way in which AI will influence the cyber threat landscape, but it is a useful starting point for thinking about mitigating the risks these technologies pose.

Automating the Hook, Line, and Sinker 

AI systems, especially LLMs, are well-suited to creating deceptive content like spear phishing emails and deepfake videos. LLMs generate material based on prediction, stringing words together according to patterns gleaned from training data. While these creations are necessarily imprecise, humans naturally tend to fill in information gaps and smooth over imperfections, making people susceptible to close approximations of the truth. Deceptive artificial content need not be perfect; as long as the victim believes it, the attacker wins. LLMs excel at performing this form of exploitation. Breaking into technical systems, by contrast, often requires a high level of precision. For instance, an attacker might need to unlock an encrypted device and use AI to guess the password, in which case only the exact code will suffice. Such tasks are difficult to achieve through approximation, and while future AI tools will be more capable, the latest models still struggle to complete them without assistance. For the same reason, LLM "hallucination" may cause occasional unreliability, but it is far less damaging for spear phishing than for precision-dependent tasks like generating access codes or providing medical advice.

Our research illuminates the drastic effects of automating the phishing attack chain with AI tools. In a recent study, Heiding and his colleagues at Harvard University found that current LLMs can identify potential targets, scrape publicly available information about them, generate personalized lure emails, distribute those emails in ways that maximize impact (for example, by avoiding spam filters or aligning with important deadlines like tax day), and improve based on the results. AI-enhanced spear phishing models have proved capable of performing as well as or better than humans conducting the same operations manually, and they do so in a fraction of the time. All told, automating the attack chain may reduce spear phishing costs by up to 99 percent at scale, eliminating the long-standing trade-off between quality and economy. These factors portend a dangerous new era of phishing in which malign actors have nearly unfettered access to potent tools of deception.

Most major LLM companies build guardrails into their products to prevent abuse, but bypassing them is often easy. Many models, for example, refuse explicit requests to create phishing emails but produce functionally identical text when prompted to generate marketing content. Researchers have found that with minimal fine-tuning, users can circumvent safety standards designed to prevent LLMs and their associated application programming interfaces (APIs) from producing illegal, dangerous, obscene, or otherwise undesirable material. Attackers have also "jailbroken" AI models by creating local versions stripped of safety mechanisms, enabling their use anonymously, at scale, for any purpose. Despite AI companies' efforts at prevention, jailbreaking is becoming increasingly commonplace. It should be assumed that sophisticated threat actors are working to jailbreak frontier AI models and may create versions tailored to specific nefarious activities, such as spear phishing financial institutions or impersonating U.S. political candidates.

Fuel on a Roaring Fire

Social engineering is already among the most widely used tactics for computer exploitation. The number of phishing attacks reported to the FBI's Internet Crime Complaint Center rose from around 115,000 in 2019 to roughly 300,000 in 2023, growth of more than 160 percent in figures that are themselves gross undercounts. Many infamous cyber incidents stemmed from an initial social engineering compromise, such as the 2014 Sony Pictures Entertainment hack and the 2023 MGM Resorts casino compromise, which led to roughly $100 million in losses. Incorporating AI tools will allow threat actors to undertake more numerous and more sophisticated operations than before.

Consider, for example, a type of operation that the North Korean espionage group known as Kimsuky conducts regularly: spear phishing Western foreign policy experts. Instead of combing tediously through think tank rosters, threat actors could instruct an LLM to generate a list of targets who fit specific criteria (perhaps those with expertise in North Korea's nuclear weapons program) and distribute compelling lures carrying a hidden malicious payload. Using LLMs will increase operational efficiency by orders of magnitude while eliminating a long-standing weakness of DPRK threat actors: foreign language skills. A misspelling of the word "foundation" in fraudulent wire instructions helped derail much of the 2016 Bangladesh Bank heist, and forensic analysis of the WannaCry malware indicated that DPRK actors had likely written their clunky ransom notes using Google Translate. Malign cyber actors will benefit from AI-augmented social engineering methodologies in operations as varied as cryptocurrency theft, sabotage, and disinformation.

Attackers have already begun layering artificial video and audio generation to further enhance their social engineering capabilities. In the past, North Korean hackers have gone to extraordinary lengths to deceive targets, once hiring an actor to conduct a fake Spanish-language job interview in order to persuade an ATM network employee to download malicious software. More recently, the Kim regime has illicitly raised millions of dollars by directing DPRK programmers to perform freelance IT work for foreign companies under false or hidden identities. To date, common-sense vetting has been a reasonably effective safeguard against unwittingly hiring a North Korean contractor: According to a joint U.S.-South Korean threat advisory, the first red flag employers should look for is an "unwillingness or inability to appear on camera, conduct video interviews or video meetings" as well as "inconsistencies when [IT workers] do appear on camera, such as time, location, or appearance." AI-powered technologies like deepfakes and real-time translation can defeat these tests, making it much easier for fraudsters to circumvent standard due diligence procedures.

The effects of supercharging malign actors’ social engineering capabilities will reverberate beyond cyberspace. Incorporating LLMs for spear phishing will enable the ransomware groups that extorted more than a billion dollars in 2023 to expand their activities, leading to even more losses and social disruption. State-sponsored cyberattackers will achieve greater success in espionage and sabotage campaigns, bolstering rogue states’ real-world capabilities from intelligence gathering and decision making to industrial production and military power. In the case of North Korea, whose cybercriminals stole more than $1.3 billion in cryptocurrency last year and another $1.5 billion in one February 2025 heist, leveraging AI will likely enable more illicit revenue generation to support the Kim regime’s weapons of mass destruction programs, compounding the grave threats they already pose to international security. 

Preparing for the Coming Wave of AI Social Engineering Attacks

Despite LLMs’ transformative impact, their social engineering capabilities face some limitations. Many models cannot access up-to-date information, which could lead to errors in gathering intelligence or generating spear phishing messages. For example, a model trained on data from 2023 would not know that a Kimsuky think-tank target has since changed jobs. Additionally, as flimsy as LLM guardrails tend to be, they may prevent less determined would-be attackers from exploiting the models. Security programs may be able to detect some AI-generated content and flag it as suspicious. Moreover, social engineering tactics are typically part of a larger attack chain with multiple steps that current AI tools cannot automate effectively, such as creating malware tailored to a particular target network. Overall, LLMs offer powerful social engineering capabilities but still struggle to execute sophisticated cyberattacks from end to end.

To counter AI-enhanced social engineering threats, the federal government should encourage potential targets to leverage the same tools for defensive purposes. Heiding's team has found that LLMs can help detect phishing emails at minimal cost and with lower false-positive rates than human reviewers. LLMs can also guide users toward safe responses to potential phishing emails, such as verifying requests over the phone. In time, AI technologies will be able to provide personalized spam filters that detect suspicious content based on a user's routine and characteristics. Our ongoing research, as well as related studies, indicates that AI-powered automation can streamline labor-intensive technical security assessments, saving organizations time and money while enabling them to build greater defense in depth for the cases in which social engineering attacks do succeed. While development of these critical technologies will fall mainly to the private sector and academia, governments should nurture them by steering investment, funding research, and updating cybersecurity requirements for regulated entities and government systems to include LLM-enabled social engineering defenses.
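As a rough illustration of the defensive pattern described above, the short Python sketch below asks an LLM to score an incoming email's likelihood of being a phishing lure and to explain its reasoning. It is a minimal sketch rather than a vetted production filter: it assumes access to the OpenAI Python client and an API key, and the model name, prompt wording, and flagging threshold are illustrative placeholders, not parameters drawn from the research cited here.

import json

from openai import OpenAI

# Assumes the OPENAI_API_KEY environment variable is set.
client = OpenAI()

SYSTEM_PROMPT = (
    "You are an email security assistant. Given the sender, subject, and body "
    "of an email, return JSON with two fields: 'phishing_likelihood' (a float "
    "from 0 to 1) and 'rationale' (one sentence naming the signals you used, "
    "such as urgency, mismatched domains, or requests for credentials)."
)

def score_email(sender: str, subject: str, body: str) -> dict:
    """Ask the model to rate how likely an email is to be a phishing lure."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable chat model could be substituted
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Sender: {sender}\nSubject: {subject}\n\n{body}"},
        ],
        response_format={"type": "json_object"},  # request machine-readable output
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    verdict = score_email(
        sender="it-support@examp1e-corp.com",  # hypothetical lookalike domain
        subject="Urgent: password expires in 2 hours",
        body="Click the link below and re-enter your credentials immediately.",
    )
    if verdict["phishing_likelihood"] > 0.7:  # arbitrary triage threshold
        print("Flag for review:", verdict["rationale"])
    else:
        print("Likely benign:", verdict["rationale"])

A filter along these lines would sit alongside, not replace, conventional defenses such as domain authentication and user training, and its outputs would need the same evaluation for false positives and false negatives that any security control receives.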

Technical countermeasures represent just one aspect of a broader strategy governments should pursue to mitigate AI-enhanced social engineering threats. Congress should prioritize regulating AI companies, clearly delineating liability for harms that result from abuse of their platforms and mandating rigorous, continual risk assessment. Firms like OpenAI and Anthropic should be required to perform independent, standardized security audits of their LLMs on a regular basis and disclose the results, building on the impressive "system card" evaluations they already publish voluntarily. In parallel, the Cybersecurity and Infrastructure Security Agency and its counterparts should sound the alarm on the rise of AI-enhanced social engineering threats, while the U.S. Securities and Exchange Commission and other regulators should consider strengthening requirements for evidence-based anti-spear phishing training and security technologies.

On the global stage, diplomats and national security practitioners should treat AI-enhanced social engineering, and AI-enhanced cyberattacks more broadly, as a topic of primary concern. Successful existing forums like the Counter Ransomware Initiative could add lines of work on combating AI-enhanced social engineering threats, while diplomats working through the United Nations could promote anti-social engineering capacity building and pursue a global regulatory baseline for AI tools with cybersecurity implications. As the rise of AI transforms international relations in unpredictable ways, friends and adversaries alike will need to cooperate to avoid the dangers of weaponized AI.

*** 

Emerging AI technologies like LLMs promise to supercharge malign actors’ capacity to conduct social engineering cyberattacks. Unlike technical systems, the human brain cannot be easily patched to recognize deceptively realistic spear phishing emails and deepfake videos. By simultaneously boosting the quality and reducing the cost of these types of operations, LLMs will deliver a dangerous advantage to cyberattackers, the effects of which will deepen a number of existing challenges in cyberspace and the physical world. While the rise of AI-enhanced social engineering attacks is inevitable, responsible actors can take steps to prepare for its arrival—before it is too late.


Alex O’Neill until recently worked as a researcher at the Harvard Kennedy School’s Belfer Center for Science and International Affairs, where he led the North Korea Cyber Working Group and studied emerging technology, cyber threats, and illicit finance. Alex’s research has been published by the Belfer Center, the Royal United Services Institute, Lawfare, and other organizations. He received an MSc in Russian and East European Studies from the University of Oxford and a BA with distinction in History from Yale University.
Fred Heiding is a Defense, Emerging Technology, and Strategy (DETS) Program Research Fellow at the Harvard Kennedy School’s Belfer Center for Science and International Affairs. His work focuses on the intersection of technical cybersecurity, business, and policy. Fred is a member of the World Economic Forum's Cybercrime Center, and he co-teaches National and International Security and Managing Risk in Information Systems at the Harvard Kennedy School. He received his PhD in computer science from the Royal Institute of Technology in Sweden and spent the final two years of his doctoral studies as a visiting scholar at Harvard’s School of Engineering and Applied Sciences.