
The U.S. Cannot Prevent Every AI Biothreat—But It Can Outpace Them

Tal Feldman, Jonathan Feldman
Thursday, July 24, 2025, 9:50 AM
Just like LLMs generate text, PLMs generate proteins—but there is no playbook to manage the risks.
(Jernej Furman, https://shorturl.at/5EIxz; CC BY 2.0, https://creativecommons.org/licenses/by/2.0/)

Published by The Lawfare Institute in Cooperation With Brookings

The next biothreat might start as a line of code. A new class of artificial intelligence (AI) systems—known as protein language models (PLMs)—can design novel proteins with the potential to be used as bioweapons at astonishing speed. Originally developed to accelerate drug discovery, these systems can also propose mutations that make viruses more infectious, harder to detect, and resistant to treatment.

The Trump administration’s newly released AI Action Plan includes a biosecurity section focused on access controls and nucleic acid synthesis screening. That emphasis is understandable but insufficient. Screening systems rely on matching against known pathogens, yet the threat from PLMs is precisely that they generate unknown proteins. Synthesis providers can’t evaluate whether a novel gene encodes a harmful function. And the growing availability of benchtop synthesis machines allows well-resourced actors to bypass providers entirely. In short, the danger has shifted upstream, to the model output itself.

At first glance, PLMs resemble large language models (LLMs) such as ChatGPT: Both generate sequences using the same structure—words for LLMs, amino acids for PLMs. But that’s where the similarity ends. Harmful text from LLMs can be detected with filters and keywords. Dangerous proteins cannot be. Whether a protein is safe or harmful depends on complex biological properties—how it folds, what it interacts with, and how it behaves in the body—none of which can be reliably predicted from sequence alone. Today, scientists still need to test these proteins in the lab, using real human biological material, to understand their effects.
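
To make the analogy concrete, here is a minimal sketch of what sampling from a PLM looks like in practice, using ProtGPT2, an openly released autoregressive protein language model on Hugging Face, as an illustration. The model choice and sampling parameters are illustrative only; any comparable open-source PLM would look much the same.

```python
# A minimal sketch of sampling from a protein language model, using the
# openly released ProtGPT2 model on Hugging Face as an illustration.
# The prompt token and sampling parameters follow the model's published
# examples; any comparable open-source PLM would look much the same.
from transformers import pipeline

generator = pipeline("text-generation", model="nferruz/ProtGPT2")

# "<|endoftext|>" is the model's start-of-sequence token. Sampling here
# mirrors ordinary text generation: the vocabulary is amino acids, not words.
outputs = generator(
    "<|endoftext|>",
    max_length=120,
    do_sample=True,
    top_k=950,
    num_return_sequences=3,
)

for out in outputs:
    # Each result is a string of amino-acid letters; nothing in it
    # indicates whether the protein is benign or harmful.
    print(out["generated_text"])
```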

So, PLMs pose a fundamentally different kind of risk: Their outputs cannot be meaningfully screened for danger. U.S. biosecurity strategy is not built for this. Existing safeguards—access restrictions, red teaming, and content filters—were designed for systems where harm can be recognized upfront. But PLMs generate biological code, which cannot be evaluated without experimental testing. And you can’t control a model’s output if you can’t tell whether it’s dangerous.

This challenge demands a policy shift. Instead of trying to block every potentially dangerous output, the United States must be ready to outpace the threat. If an adversary releases a synthetic pathogen, the United States must be able to design a countermeasure in days, manufacture it in weeks, and distribute it nationwide before it becomes a crisis. This is not science fiction—it’s what national security requires in a world where pathogens can be programmed.

Meeting that challenge requires two urgent investments. First, the United States needs AI systems explicitly trained to generate safe, stable, and manufacturable therapeutics. This requires access to high-quality biomedical data, new model architectures, and federal support for research that doesn’t yet have a commercial market. Second, the United States needs a national infrastructure for rapid biomanufacturing, emergency regulatory approvals, and supply chain coordination—all of which must be tested regularly and kept ready to activate at machine speed.

PLMs are both the cause of and solution to this risk. The same models that could be used to design pathogens are already helping scientists discover new drugs. Their ability to generate novel, functional proteins is precisely what makes them indispensable for rapid response. Shutting them down wouldn’t just slow biomedical progress—it would weaken U.S. defenses. If the U.S. can’t stop the technology, it must outrun its weaponization. In the age of generative biology, resilience is the only viable defense.

PLMs Are Not Just Another AI Risk

Most AI safety tools are built on a simple premise: that harmful content can be identified before it causes harm. That premise holds for LLMs, where threats often show up as keywords or explicit text, such as asking the model to help a user build a bomb. But PLMs don’t work that way. The core challenge of securing them is that researchers cannot tell when an output is dangerous.

But PLM outputs are biological code—they are opaque and context dependent. A PLM could produce a protein that binds tightly to human cells. That protein might cure cancer or cause a deadly outbreak. Proteins and their interactions with living organisms are incredibly complex, so identifying harmful proteins often requires extensive laboratory testing—including experiments in cellular or tissue systems—making them difficult to detect and regulate.

PLMs also rarely operate alone. They are typically paired with AI-based structural prediction tools such as AlphaFold or RFdiffusion, which simulate how proteins fold and behave. Together, these systems form a design-and-validation loop: PLMs generate, predictors verify, and downstream services can manufacture the result. This is the same workflow biotech uses to develop therapeutics. It can just as easily be repurposed for harm. That dual-use nature is what makes the problem so urgent. The threat lies not in a single evil model, but in an ecosystem of interoperable tools and scientific infrastructure that can be repurposed with minimal difficulty.
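
The loop is simple enough to sketch. The code below is a schematic with hypothetical placeholder objects rather than real APIs, but it captures the workflow’s shape: a generative model proposes sequences, a structure predictor scores them, and the best candidates move downstream.

```python
# A schematic of the design-and-validation loop described above. The
# objects and methods here (plm.sample, predictor.confidence) are
# hypothetical placeholders, not real APIs; in practice the generator
# would be a PLM and the scorer a structure predictor such as AlphaFold.

def design_loop(plm, predictor, rounds: int = 5, batch: int = 100, keep: int = 10) -> list[str]:
    """Generate candidate sequences, rank them by predicted structure quality,
    and return a shortlist for downstream synthesis and lab testing."""
    shortlist: list[str] = []
    for _ in range(rounds):
        # Step 1: the generative model proposes amino-acid sequences.
        candidates = [plm.sample() for _ in range(batch)]
        # Step 2: a structure-prediction tool scores each candidate.
        ranked = sorted(candidates, key=predictor.confidence, reverse=True)
        # Step 3: keep the most promising designs for the next stage.
        shortlist.extend(ranked[:keep])
    # The same shortlist could feed a drug-discovery pipeline or, misused,
    # a synthesis order -- which is the dual-use problem in miniature.
    return shortlist
```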

Since computational infrastructure is largely open-access, decentralized, and global, regulatory chokepoints are limited. Export controls may delay access to high-performance computing, but they are unlikely to prevent the use of open-source models fine-tuned on public data. Access restrictions on commercial platforms can be circumvented by running models locally. Even if next-generation safety tools could detect a dangerous protein sequence, models can be modified or fine-tuned in private—especially by well-resourced actors.

To be clear, model output alone is not enough. Developing a functional bioweapon still requires access to DNA synthesis services, laboratory infrastructure, and methods of delivery. But those barriers are far lower than they once were—and continue to fall. Commercial synthesis is more accessible, foundational lab tools are widely available, and much of the expertise once concentrated in state-run programs is now public or commodified. The threshold for misuse is no longer high.

In this environment, prevention cannot be the United States’s only strategy. Open-source PLMs are already circulating globally, making it increasingly easy for malicious actors to create pathogens. What matters is how fast U.S. defense systems can respond—and whether the nation has the infrastructure in place to do so. As with cybersecurity, resilience—not containment—must become the cornerstone of national biosecurity policy.

Offense Versus Defense

It’s easy to assume that if AI can help design a pathogen, it should be equally good at designing a cure. But the two tasks are not symmetrical.

Designing a biothreat is comparatively simple: It involves optimizing for lethality and infectiousness, often using existing pathogens as templates. Designing a countermeasure, by contrast, requires satisfying several hard constraints at once. The countermeasure protein has to bind precisely to the target, be nontoxic to humans, and be manufacturable at scale. Most AI-generated candidates fail at least one of these steps, and the set that satisfies all three is narrow.
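
A toy calculation shows why that set is so narrow. If a candidate must independently clear a binding check, a toxicity check, and a manufacturability check, the pass rates multiply. The numbers in the sketch below are invented purely for illustration; the arithmetic, not the values, is the point.

```python
# A toy, back-of-the-envelope simulation: every number below is invented.
# Each filter is a hypothetical stand-in for a real screening step.
import random

random.seed(0)

def passes_binding(candidate) -> bool:
    # Stand-in for a binding-affinity check against the target.
    return random.random() < 0.10

def passes_toxicity(candidate) -> bool:
    # Stand-in for toxicity and immune-response screening.
    return random.random() < 0.30

def passes_manufacturing(candidate) -> bool:
    # Stand-in for expression and scale-up feasibility.
    return random.random() < 0.20

candidates = range(100_000)
survivors = [c for c in candidates
             if passes_binding(c) and passes_toxicity(c) and passes_manufacturing(c)]

# Independent pass rates multiply: 0.10 * 0.30 * 0.20 = 0.6 percent,
# so roughly 600 of 100,000 candidates clear every hurdle.
print(f"{len(survivors)} of {len(candidates)} candidates satisfy all three constraints")
```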

The data problem makes things worse. PLMs are trained on vast public datasets of viral and bacterial sequences. But the data needed to generate effective treatments—on toxicity, immune response, and clinical efficacy—is fragmented, proprietary, or missing entirely. Without it, models can do their best to guess what might bind to a target, but not what will actually work in treating real patients.

The design tools are also different. Many key treatments, like most antibiotics and antivirals, aren’t proteins. They require entirely different modeling approaches—such as cheminformatics—which bring their own data gaps and limitations. And once a candidate is identified, scaling it involves a complex, fragile pipeline: cell-line development, purification, clinical trials, and rapid deployment.

Offense needs to succeed only once. Defense is constant. Pathogens mutate, resistance emerges, and countermeasures must evolve in real time. It’s not a fair fight—and right now, defense is behind.

Building a System That Beats the Clock

Federal agencies are beginning to address AI-driven biorisks, but their initiatives still orbit a prevention-first mindset and are too slow and too narrow for the threats that PLMs now pose. A 2024 Department of Homeland Security report flags the dual-use potential of biological design tools, while the Biden-era National Science and Technology Council’s Framework for Nucleic Acid Synthesis Screening aims to screen dangerous DNA and RNA orders. But these initiatives focus on access controls, red teaming, and screening for known threats. They do not effectively address PLMs’ ability to generate novel protein sequences. Meanwhile, response-side efforts such as the Defense Advanced Research Projects Agency’s (DARPA’s) Reimagining Protein Manufacturing program and the Biomedical Advanced Research and Development Authority’s (BARDA’s) FASTx program have either lapsed or exclude AI entirely. And with Trump’s rescission of Executive Order 14110 earlier this year, the current administration has yet to advance a replacement biosecurity agenda.

These are, at best, attempts at establishing guardrails; there is still no complete defense strategy. Worse, even guardrails often don’t work against PLMs. As noted, many PLMs are open source and can easily be modified or built upon outside any provider’s control. Red teaming requires deep biological expertise and can’t scale without experimental validation. And output filtering depends on reliably identifying dangerous generations—a task no current system can perform with confidence.

If PLMs can generate dangerous biological components faster than scientists can recognize them, then prevention is no longer a viable foundation for defense. What the United States needs is a paradigm shift.

This shift—from static safeguards to dynamic response—is a structural reorientation of how the U.S. thinks about biological threats in the age of AI. U.S. biosecurity officials are no longer dealing with natural timelines, where outbreaks emerge slowly and evolve over months. If attackers can move fast, the country’s only viable defense is to outpace them in the real world.

This requires two foundational investments: funding research and expanding state capacity.

To start, the United States can radically expand its research base for rapid countermeasure design. That means building dedicated programs focused on creating AI systems that can identify a new virus and quickly design an effective antibody to neutralize it. This requires research tuned not to general drug discovery, but to the specific challenge of mapping viral proteins and generating effective treatments.

Getting there will require good data. AI systems can generate these treatments only if they’re trained on the right information. Currently, much of that data is proprietary or fragmented, with researchers unable to access it. A serious federal effort is needed to curate datasets derived from real lab experiments, such as how antibodies perform against real pathogens, how the immune system responds, and which candidates fail in clinical settings.

Even then, data isn’t enough. The United States needs pipelines that are built not just to imagine new molecules or polymers but to produce ones that can actually be manufactured and deployed. Scientists need simulation tools to filter out weak candidates early. They need testing platforms to validate thousands of designs in parallel. And they need it all to operate at the speed of the threat.

This kind of infrastructure won’t emerge from the private sector alone. Companies rarely build for emergencies that might never arrive—and by the time the market materializes, it’s too late. The government must lead. The National Institutes of Health, the National Science Foundation, and DARPA can anchor this effort, but only with a clear mandate: to build the AI tools, datasets, and biomanufacturing foundations needed to respond before the next crisis hits.

Research is only one half of the equation. Even if AI models can rapidly generate viable countermeasures, the United States currently lacks the operational machinery to act on them. Designing an antibody on a computer is not the same as validating its safety, navigating regulatory approvals, and manufacturing it at scale. That entire pipeline—spanning federal agencies and industrial supply chains—was never built to operate on machine timescales. This needs to change. Quickly.

Achieving this goal requires a different kind of investment: building the state’s ability to execute. The U.S. should establish a distributed network of crisis-ready biomanufacturing facilities, capable of switching from peacetime production to emergency response within days. These facilities should be stocked with essential inputs—like reagents, cell lines, and packaging materials—and maintained on standby contracts. Right now, there is no standing federal capacity to produce large volumes of computer-generated antibodies or other therapeutics in real time. That gap will become lethal in the next crisis.

The regulatory system must be retooled as well. The Food and Drug Administration’s emergency use processes were never designed for AI-generated therapies. What’s needed is a fast-track pathway tailored to this new category of countermeasures—one that allows for conditional, limited deployment of treatments while continuing safety monitoring in parallel. The Department of Health and Human Services has recently indicated plans to use AI in clinical trial reform, creating a narrow opening for regulatory modernization.

This new capability must also be tested regularly. The National Security Council, in coordination with agencies such as BARDA, the Administration for Strategic Preparedness and Response, and the Department of Defense, should conduct full-scale exercises that simulate an AI-generated pathogen—from model output to therapeutic deployment. These simulations should test every link in the response chain: compute access, lab integration, manufacturing ramp-up, regulatory approval, and public distribution. No agency can do this alone. The entire system must be stress-tested as a single unit.

Resilience, Not Just Restraint

Protein language models aren’t just a new kind of AI risk—they’re one of the most powerful tools in modern biology. These systems can accelerate vaccine development, enable precision therapies, and unlock treatments for diseases once thought untreatable. But they’re also inherently dual-use. In the wrong hands, they lower the barrier to engineering bioweapons, and they do it fast.

Speed is the strategic challenge. With the rise of AI, the United States and its allies are entering a world where malicious actors may be able to design and deploy synthetic pathogens faster than governments can detect, assess, or respond. The old defenses—model restrictions, content filters, export controls—were built for slower threats. They won’t be enough.

This isn’t a call for panic. It’s a call for readiness. If PLMs compress the timeline, then the United States must get better and faster. That means making AI-driven biosecurity a core government function—not just a research grant or emergency playbook. It means investing ahead of the threat: in data, in models, in manufacturing and regulatory infrastructure, and in the institutions that can move fast when it counts.

In this space, speed is the battlefield. Resilience—not restraint—must be the foundation of national defense.


Tal Feldman is a J.D. candidate at Yale Law School focused on national security and innovation policy. Before law school, he worked as an AI engineer across the federal government, building tools at agencies including the State Department, Federal Reserve, and Department of Defense. He earned a master’s degree in global affairs as a Schwarzman Scholar, where he researched Chinese industrial policy, and a bachelor’s degree in mathematics from Wake Forest University, where he was a Truman Scholar.
Jonathan Feldman is a computer science student at the Georgia Institute of Technology and a research fellow at Harvard University, where he conducts research at the intersection of machine learning and biology.
