‘Voluntary’ Until the Government Is Your Customer
The new executive order on artificial intelligence (AI), “Promoting Advanced Artificial Intelligence Innovation and Security,” goes out of its way to say that its frontier-model evaluation framework is voluntary. Section 3 of the order directs agencies to design a process through which AI developers may engage with the government before releasing covered frontier models to other trusted partners. The section then adds an explicit caveat: Nothing in that section authorizes a mandatory licensing, preclearance, or permitting regime for developing, publishing, releasing, or distributing new AI models, including frontier models. But “voluntary” looks different when your customer is the federal government.
The order directs officials to develop a classified benchmarking process to determine when a model should be designated a “covered frontier model.” It also calls for a voluntary framework: Developers may seek that determination, give the government access for up to 30 days before a model is released to other trusted partners, and help choose the trusted partners whose early access is intended to strengthen critical-infrastructure cybersecurity. Under the order, the director of the National Security Agency (NSA), in consultation with specified national security and cybersecurity officials, determines whether to designate a frontier model as “covered.” The designation authority is vested in the national security establishment, not in the civilian standards office that has run the government’s frontier-model evaluation to date.
The order also arrived quietly, and that quiet is part of the story. A signing ceremony had been scheduled with major technology and AI executives invited to the Oval Office. It was canceled hours before it began, after phone calls from the former White House AI and crypto czar, David Sacks, and other tech leaders reportedly derailed it. The loudest public objection to the draft was that it would slow U.S. AI developers in the race with China. The more legally revealing concern was narrower: that a voluntary review system could harden into a de facto licensing regime. Twelve days later, the president signed a version with a shorter access window—reduced from 90 days to 30—and the disclaimer intact. The disclaimer is the residue of a fight over exactly one question: whether voluntary stays voluntary.
Within days of the order’s signing, OpenAI said it would comply and give the government early access to its covered frontier models. Sam Altman posted that the order “gets the balance right.” Anthropic called it “an important step in strengthening America’s leadership in AI.” Which raises the question: Why would companies embrace an order that cannot compel them?
Part of the answer is conventional: The framework’s terms do not exist yet, and the companies at the table will help write them. Saying yes to a voluntary regime is also the best argument against a mandatory one. But the other part of the answer is procurement. The major frontier developers are already federal suppliers, and they all want more of the government’s business. Just a few months ago, the companies watched a contract dispute become a supply chain risk designation, a government-wide order to stop using Anthropic’s technology, and ongoing, multifront litigation. A court has since temporarily blocked both the designation and the ban, but by then every other frontier developer had witnessed the cost of defying the government. And they are already inside the procurement system, where the customer’s preferences can become solicitation terms, evaluation factors, and contract requirements without waiting for a new statute or government-wide rule.
This is what government contractors do. When the government signals where it is going, the largest firms get in line. Of course, getting in line has never meant silence: Contractors lobby, comment, protest, and litigate. Industry pushback narrowed the Cybersecurity Maturity Model Certification (CMMC), a cybersecurity framework the Pentagon created for defense contractors, and slowed its rollout for years. The anti-diversity, equity, and inclusion (DEI) certification regime, which began in 2025 and was followed in 2026 by a more specific clause for contractors, has triggered litigation over what the government can require as the price of doing federal business. But contractors rarely walk away from the customer. They fight over the terms. The canceled signing ceremony followed a similar pattern: The framework survived, but a key term gave way, reducing the access window from 90 days to 30.
When you work with the government, your customer is a superpower. The government is simultaneously the largest buyer, the rule writer, and the gatekeeper to federal business. No ordinary customer holds those three roles over its suppliers. The order’s disclaimer can be sincere because procurement, not a licensing mandate, supplies the compulsion. The government’s spending power can turn a voluntary frontier framework into a condition of doing business with the government, without a single statute or regulation stating “license.”
This is the same opaque, bilateral, vendor-by-vendor channel I have previously criticized as inadequate public governance: weak at constraining what the government does with these tools, but highly effective at conscripting the market that supplies them. The open question is whether all of this stays visible or becomes a governance system run through classified benchmarks, evaluation preferences, contract clauses, and supplier pressure that outsiders can see only in fragments and challenge only with difficulty.
From Commerce to the NSA
The order did not create government access to frontier models. Since 2024, the Commerce Department’s Center for AI Standards and Innovation (CAISI), a National Institute of Standards and Technology (NIST) office, and its predecessor, the U.S. AI Safety Institute, have operated under publicly announced agreements with OpenAI and Anthropic to evaluate their models before and after release. On May 5, the Commerce Department announced that Google, Microsoft, and xAI had joined, that the original agreements had been renegotiated to align with the administration’s AI Action Plan, and that CAISI would serve as the industry’s primary point of contact within the government for AI testing. CAISI also reported having completed more than 40 such assessments, including on models not yet released. The press release told the world, nearly four weeks before the executive order, that a voluntary, centralized, civilian evaluation function was operating with the leading closed-model frontier developers inside it. In the wake of the order, however, that channel may have narrowed: A week after the order was signed, the Wall Street Journal reported that administration officials had directed CAISI to halt public reporting on its model assessments while the executive order is implemented. Testing continues internally, but the public-facing function is on hold.
Even as the civilian channel’s public work pauses, the signed order builds on that access model and adds a decisive function the civilian program never had. The classified benchmarking process will determine which models count as covered frontier models. NIST is consulted in developing that process, but the NSA director makes the designation itself. The existing evaluation relationships reside at the Commerce Department, in public agreements, inside a standards agency. The covered designation will be made behind a classified benchmark, against an undisclosed threshold, on a basis no one outside the process can see.
The designation should not be mistaken for a seal of approval. It is closer to a classified risk label: a government judgment that the model has crossed a capability threshold serious enough to warrant special handling. But risk labels can still carry market consequences. Unlike anything in the CAISI agreements, the covered designation comes with a federally mediated early-access channel for covered models and a role in selecting which trusted partners access them early. Kevin Frazier and Alan Rozenshtein called the canceled signing ceremony “governance by phone call.” What begins after the signing is regulation by contract.
Three Pathways to a Mandate
A classified designation does not govern anything until consequences are attached to it, and government procurement is where that happens. The order disclaims a licensing regime, but agencies do not need licensing authority to shape federal buying. The agencies already have solicitations, evaluation criteria, responsibility determinations, and contract terms. The frontier-model evaluation framework can therefore become a federal-market mandate without ever becoming a formal license. There are three ways this can happen, listed here in descending order of formal authority and visibility.
The first and cleanest route is for Congress to use procurement directly. Section 508 of the Rehabilitation Act provides the template. It required agencies to procure information and communication technology (ICT) that people with disabilities can use—a federal purchasing standard that went on to reshape the commercial market without ever regulating it. Vendors seeking federal business had to document conformance, and that documentation became a credential far beyond the government. Other public and enterprise buyers demanded it because the federal baseline had become the market baseline, and vendors found it cheaper to build accessibility into the product than to maintain separate federal and commercial lines. The covered-frontier framework has no statute behind it and may never get one. It needs only a federal baseline, and the next two pathways show how to build it without Congress.
The second pathway is formal executive-branch action. A president issues an executive order, and the acquisition system then translates it into procurement policy through Office of Management and Budget (OMB) memoranda, amendments to the Federal Acquisition Regulation (FAR) and agency supplements, deviations (departures from the FAR for classes of contracts or contracting actions), and interim guidance that can take effect before the regulations catch up. Federal AI policy is already moving through that channel in adjacent contexts. Executive Order 14179, “Removing Barriers to American Leadership in Artificial Intelligence,” led OMB to issue guidance on federal AI use and AI acquisition. Executive Order 14319, “Preventing Woke AI in the Federal Government,” was followed by an OMB memo on “Unbiased AI Principles,” which agencies have already translated into contract requirements.
For example, the Department of Energy issued an acquisition letter that treats compliance with these principles as “material” to eligibility for award and payment. The letter further directs contracting officers to insert an AI clause in applicable solicitations and contracts, warning that noncompliance may lead to contract termination. The same pattern appears in the national security context. A new National Security Presidential Memorandum (NSPM-11), issued within days of the order, directs defense and intelligence agencies to terminate, where lawful, contracts with firms showing a pattern of conduct inconsistent with the memorandum’s AI policies, including “contracts under which such companies provide services to the applicable agencies as subcontractors.”
At first, the distinction between a preference and a mandate matters. Participation is not a requirement, and an evaluation advantage is not a floor. But federal assurance frameworks often turn into requirements. The Federal Risk and Authorization Management Program (FedRAMP) and CMMC show the life cycle. FedRAMP began as OMB guidance on how agencies should authorize cloud services. The program, however, quickly became the practical requirement for selling those services to the federal government and was later codified through the FedRAMP Authorization Act. CMMC traces back to the Controlled Unclassified Information (CUI) program created by an executive order in 2010. That government-wide information-control framework later gave rise to a defense-contractor cybersecurity regime, now enforced through DFARS 252.204-7021, which applies when the Department of Defense requires a specified CMMC level for contracts involving federal contract information or CUI.
Cloud authorization and contractor cybersecurity differ from frontier-model assessment, but the institutional pattern is familiar: A centralized risk framework is established through policy, becomes a de facto procurement gate, and then becomes a requirement when codified in statute or embedded in acquisition rules. That life cycle is the answer to anyone calling this a one-time, voluntary moment. Federal risk frameworks do not need to begin as mandates to become conditions of doing business, and they outlast the administrations that create them.
The third pathway is informal regulation by contract. There is no statute, no regulation, and no government-wide clause. When buying advanced AI systems or services, an agency can build participation in the frontier-model process into a solicitation’s evaluation criteria. A contracting officer may also consider participation in assessing a contractor’s responsibility where it bears on capability, operational controls, security, or performance. And a prime contractor can demand the same commitments from its subcontractors or upstream technology providers.
However the procurement preference enters the acquisition, it usually takes one of two forms. As a floor, it is pass/fail: The vendor must meet a minimum condition to be eligible for the award. As a discriminator, it becomes an advantage in evaluation: The vendor is not formally required to participate, but participation improves its technical rating, lowers perceived risk, or strengthens the agency’s confidence in performance. The discriminator is often more powerful at first because it reshapes behavior before anyone announces a mandate. Today’s discriminator can become tomorrow’s floor, and at no point does the order itself compel a developer to participate.
The Cascade Through the AI Stack
The pressure does not stop at the frontier AI labs, because the federal AI market extends well beyond a handful of model developers selling directly to agencies. It includes a vast ecosystem of integrators, cloud platforms, resellers, and AI-enabled service providers. A frontier model may be embedded in any of them: in an application, an enterprise platform, a managed service, or a contractor’s internal delivery tool. Once a requirement lands on a federal-facing contractor, it can move through the stack.
Sometimes the government expressly mandates the cascade. The Department of Energy’s 2026 “Ensuring Unbiased AI Principles” clause flows down to subcontractors at all tiers. But even without a mandatory flowdown, prime contractors certify compliance with contract requirements and can face False Claims Act exposure when failures in the AI supply chain make material certifications false. This gives contractors a strong incentive to push AI obligations, such as provenance, incident reporting, audit cooperation, data segregation, and limits on training with government data, into subcontracts, reseller agreements, and API terms. Under either path, the model provider may never sign a government contract and still face the government’s requirements through the customer it serves.
The federal contract is the trigger: Procurement rules do not directly apply to purely commercial deals, but once a firm serves a customer engaged in federal-facing work, federal requirements can flow through the customer’s contracts and start to reshape behavior well beyond the federal market.
The Process Is the Credential
Vendor risk management explains only part of the cascade; the rest follows from the limits of what agency buyers can evaluate on their own. Agency acquisition teams generally cannot independently assess the cyber capabilities of a cutting-edge model from scratch. The agencies typically lack the technical expertise, test environments, access to classified threat information, and institutional confidence to determine whether a model has the cyber capability that warrants covered status. That is why government procurement already relies on centralized assurance mechanisms in adjacent contexts. FedRAMP, for example, provides agencies with a common baseline for cloud security authorization rather than requiring each buyer to redo the security analysis. Agencies tend to adopt centralized assurance when the risk is technical, recurring, and too complex for individual buyers to evaluate on their own.
That makes a centralized frontier-model assessment function necessary. No agency wants to be responsible for selecting the model that later causes a major cyber incident. A covered-frontier-model designation is not a seal of approval; it is a risk label. The government can assess any model, whether or not its developer volunteers. Participation adds a credential that the developer and the buying agency can point to.
Once a developer can say its model has gone through the government’s frontier-model evaluation process, that fact will carry weight in procurement, even if no solicitation formally requires it. The government has already begun formalizing a reliance on centralized evaluation: In March, CAISI signed a memorandum of understanding with the General Services Administration (GSA) to support the AI evaluation needs of USAi, the GSA’s centralized platform through which federal agencies test and adopt AI.
The order itself also initiates the development of a federal distribution channel. Section 2(c)(iii) directs the Cybersecurity and Infrastructure Security Agency (CISA) to help federal agencies, state and local authorities, and critical-infrastructure operators gain access to cybersecurity tools and services, “including, where appropriate, covered frontier models.” Before the order even gets to its disclaimer, it has already tied the “covered frontier model” label to that channel.
A Standard No One Can See
The problem with all three pathways is that each depends on a black-box benchmarking process that the public cannot see. That governance problem does not disappear because participation is nominally voluntary. The covered-frontier-model designation rests on a classified threshold—determined through a nonpublic executive-branch process—and may become the practical gatekeeper for the federal frontier AI market.
The voluntary framework allows a developer to engage the government to determine whether a model under development meets the threshold, but nothing in the order conditions the NSA director’s determination on a developer’s request. Assessments are shared with developers and researchers “as appropriate,” a case-by-case exercise of discretion that no firm can demand as of right. Section 5(c) adds the common executive order disclaimer: The order creates no right or benefit enforceable at law or in equity. The result is discretion on the way in, no guaranteed process during, and no remedy under the order on the way out. Ordinary procurement challenges may reach the visible solicitation terms, evaluations, or exclusions, but not necessarily the underlying classified benchmark. Procurement can turn the designation into market consequences before any forum gets a meaningful look at the standard doing the work.
None of this means the test set should be public. A frontier-model benchmark that reveals prompts, attack methods, operational scoring rubrics, or threat scenarios will be gamed almost immediately. Confidential testing is defensible, but allowing confidentiality to swallow the standard, the designation, and any meaningful route to challenge a designation with procurement consequences is not.
The Better Institutional Design
The government did not need to invent a new capability. It already had one. CAISI was evaluating frontier models and building the government’s relationships with major labs. The order then moved the decisive designation elsewhere, behind a classified benchmark controlled through a national security process. That was the wrong institutional design, and it was a choice, not a gap.
The choice was presumably deliberate. Advanced cyber capabilities are the NSA’s lane, and the threat picture is classified. But an agency well positioned to assess the threat is not necessarily the agency best positioned to administer a standard that may become market facing through procurement. In earlier work, I argued for centralized testing capacity that could pool expertise, publish common test standards, and provide agencies with on-demand support before AI systems are deployed for federal use. The answer to agencies that cannot do this testing themselves should be a more transparent central capability, not an opaque process that procurement officials, Congress, courts, and independent assessors cannot meaningfully evaluate.
The better structure is to centralize the standards, distribute testing, and anchor the public-facing standards function at CAISI. CAISI sits inside the federal standards agency, has already run frontier-model evaluations, and holds the existing access relationships. CAISI lacks the resources and statutory backing it needs today, but both are arguments for Congress, not for moving the decisive designation function to the NSA. The NSA should contribute the classified threat picture through a confidential annex rather than control the designation behind a curtain. The executive order leaves the door open: NIST sits in the Section 3 consultation chain, and the framework’s design has not yet been written.
This also answers the strongest objection to centralization. Giving one institution prerelease access to every covered frontier model creates capture, security, and concentration risks. The answer is to split the function: publish a common methodology, accredit qualified third-party assessors, and permit secure, distributed testing under common standards. Reserve classified threat details for controlled annexes. Make the public standard visible enough for developers, contractors, agencies, competitors, Congress, and courts to understand what the designation means.
The window for getting this right is short, and the life cycle is the reason. FedRAMP hardened around a public-facing baseline and authorization process. If this framework hardens, it will harden around a classified one. The order’s AI cybersecurity clearinghouse must be established by July 2; the benchmarking process and voluntary framework are due by Aug. 1.
The order can keep its disclaimer. The labs are already inside. Agencies will lean on the government’s assessment, and the acquisition system will do what it has always done: turn the customer’s preference into the supplier’s requirement. The fight over whether the framework is voluntary or mandatory misses where the mandate will come from. Even without new legislation, and even from an administration skeptical of broad AI regulation, oversight of frontier models is coming through a less visible channel: the terms a company must accept to sell AI to the government. The coalition that derailed the signing ceremony can fight visible AI regulation; it has far less purchase on a contest that fragments across thousands of solicitations, evaluation factors, and contract clauses. Public-sector AI is not commercial AI, and the administration is increasingly governing the two in opposite directions. The broader debate over AI regulation will continue, but for companies that want federal business, the rules are already being written.
