Kicking the Tires: A Voluntary Path to Pre-Deployment AI Vetting
The administration lacks authority to mandate frontier model vetting—but existing CAISI and CISA tools enable a voluntary alternative.
The Trump administration is weighing the creation of a “review system” for frontier AI models. According to the New York Times, under this proposed approach, AI labs would provide the federal government with “first access” to “get ahead” of models with significant cyber capabilities, presumably models such as Anthropic’s Mythos. It’s unclear what legal authority would allow the president to accomplish these goals—specifically, to mandate that labs undergo a vetting process and then to share information essential to countering detected risks with other parts of the government.
However, existing authorities would allow for a voluntary “kick the tires” testing period. Labs could opt to share models with materially new capabilities with the Center for AI Standards and Innovation (CAISI), which is housed within the Commerce Department’s National Institute of Standards and Technology; the director of the Cybersecurity and Infrastructure Security Agency (CISA) could then fund an effort to help a broad set of actors—including local, state, and federal actors as well as other public and private entities—take any necessary cybersecurity precautions. This would help labs avoid popular backlash for knowingly introducing models that may threaten critical systems and public well-being, and it could preempt more onerous, formal requirements.
Cyber Threats Posed by AI
Anthropic opted not to release its latest model, Mythos, because of its ability to identify and exploit software vulnerabilities. Instead, Anthropic made the model available to a select group of private stakeholders and, following negotiations with the White House, to the federal government, so that each could take any necessary precautions. Though some dismissed that decision as a PR move, third-party testing validated Anthropic’s findings. The U.K. AI Security Institute, for example, determined that Mythos could oversee a 32-step corporate network attack with some degree of reliability—a process that would take humans 20 hours. OpenAI subsequently developed a model with similar cyber capabilities.
Cybersecurity experts fear that large swaths of society—such as private and public sector entities with mediocre cybersecurity plans—may be caught flat-footed when other such tools become available. Small businesses and nonprofits, for example, “lack the skills and resources to address these challenges before their systems are compromised.” State and local governments may similarly be ill-equipped to take timely measures as AI tools continue to become more sophisticated. This is especially concerning given that local election officials report a dearth of state and federal support for countering emerging cyber threats.
The result is a widening asymmetry: Frontier labs increasingly understand what their models can do weeks or months before the institutions tasked with defending against misuse have any meaningful opportunity to prepare.
Can the President Mandate a Vetting Process?
Whether the president has the legal authority to mandate a vetting process is hard to assess without the Trump administration specifying its own understanding of the law. A preliminary review of national security provisions returns a potential shortlist: the Defense Production Act (DPA), the International Emergency Economic Powers Act (IEEPA), and the Communications Act of 1934. For the sake of brevity, the latter two can be dismissed fairly easily. Reliance on those acts to subject U.S. companies to a government “review system” would involve stretched interpretations of the laws, which courts would likely not condone.
The DPA is also unlikely to provide a stable legal basis for the administration’s plan. President Biden leaned on the DPA to require AI model testing under an executive order he issued in 2023; President Trump rescinded that order in 2025. The breadth of the DPA’s terms has made it a recurring vehicle for expansive actions by the executive branch. Pursuant to the DPA, a president may use an “array of authorities to shape national defense preparedness programs and to take appropriate steps to maintain and enhance the domestic industrial base.” However, numerous investigations of the DPA have pointed out that those authorities have limits.
Enacted in 1950, “the original purpose of the DPA was to ensure that the federal government could compel private industry to produce strategically necessary resources to meet the needs of national defense during an emergency,” according to Ashley Mehra. While the law has subsequently been amended and stretched by creative uses of its vague authorities, it is unlikely to fit the administration’s unique legal needs here. The president will be hard-pressed to find in the DPA a clear hook for compelled vetting of an AI model that has minimal direct connection to a specific, ongoing national defense effort or domestic industrial base consideration. Title I of the DPA empowers the president to direct private parties to prioritize and accept contracts necessary for national defense. Title III enables the president to incentivize the production of certain critical materials and goods, such as with loan guarantees and grants. Title VII includes a range of authorities, including the ability to establish voluntary arrangements among private actors that might otherwise run afoul of antitrust laws, as well as to gather information from private entities. More specifically, the Department of Commerce may rely on Title VII to “conduct assessments of domestic industrial base capabilities.”
Use of Title VII to govern frontier AI models runs counter to the marginal role intended for this part of the DPA. In contrast to Titles I and III, Title VII amounts to a “potpourri” of provisions meant to assist with administration of the act more so than to afford expansive powers. A more common understanding of the information-gathering power afforded therein is the authority to “obtain information from industry and firms, including through testimony or by inspecting their books, records and properties.” Such an inquiry would serve the purpose of identifying any weak points in supply chains that may be relevant to the nation’s readiness for war and related emergencies.
Compelled disclosure of a model to the federal government does not fit neatly into any DPA authorities.
The Legal Path to a Voluntary “Kick the Tires” Period
Though it seems unlikely that the president can force AI labs to participate in a review system, he does not need to if the underlying goal is to translate the results of CAISI’s voluntary evaluations of model capabilities into general cyber readiness assistance. The leading labs already make their models available to CAISI for rigorous testing on a voluntary basis. Given that they have a vested interest in avoiding a headline like “AI Lab Bypasses Federal Testing; Cyberattacks Proliferate,” they may agree to such testing occurring two or three weeks before a model’s general deployment.
A more evocative hypothetical stresses this point. Imagine that a leading lab releases a model with notable cyber capabilities days before the November elections. Political actors may allege that bad actors used the latest AI model to undermine local and state election systems. Such a claim would be hard to dismiss under the status quo. One could easily foresee reports on “Model ____ Blamed for Cyberattacks; Election Results Contested.” Such headlines would become far less likely if a short-term “kick the tires” period became standard practice.
Suppose instead that, prior to that model’s release, CAISI concludes—based on publicly disclosed, objective evaluations tied to clearly established threat models—that the model would indeed expose critical infrastructure; local, state, tribal, and federal actors; and private entities to cyber threats. In that case, CAISI can immediately share that information with CISA. The CISA director would then need to evaluate whether a “specific significant incident is likely to occur imminently” in order to trigger additional authorities under the Homeland Security Act.
A “significant incident” refers to “an incident or a group of related incidents that results, or is likely to result, in demonstrable harm to— (i) the national security interests, foreign relations, or economy of the United States; or (ii) the public confidence, civil liberties, or public health and safety of the people of the United States.” Mass deployment of a model with heightened cyber capabilities seems likely to qualify. As noted above, Mythos is capable of hacks that would require 20 hours of human work. If a model of similar capability were in the hands of even a couple dozen bad actors, the likely result would be widespread economic harm and, perhaps, threats to public health and safety.
The CISA director could then use the Cyber Response and Recovery Fund to help a range of stakeholders—public and private—with “vulnerability assessments and mitigation; technical incident mitigation; malware analysis; analytic support; threat detection and hunting; and, network protections.”
Of course, the biggest “if” here is whether labs would agree to this “kick the tires” period. Labs have reasons to say yes.
A specific, time-bound testing window run by CAISI is preferable to the alternatives on the table—whether that means a mandatory review regime, which would likely run afoul of the law, or the reputational fallout of a post-deployment incident traced back to capabilities the lab knew about but did not flag.
CISA, for its part, would need to commit to handling shared model information with the same care it extends to vulnerability disclosures from private security researchers, lest labs conclude that cooperation invites leakage rather than partnership.
Such an approach would not require new legislation, nor would it require the administration to stretch the DPA past its breaking point—it requires only that the relevant agencies use the authorities they already have, and that the labs recognize a voluntary framework is the one most likely to keep a mandatory one off the table.