The GSA’s Draft AI Clause Is Governance by Sledgehammer
The General Services Administration’s draft AI clause gets the governance problem right—then blows right past it.
The federal government spent the past eight months telling agencies to accelerate artificial intelligence (AI) adoption and treat governance as something to sort out later. That speed-first posture first emerged in the administration’s AI Action Plan and then, more explicitly, in the Pentagon’s January AI Strategy, which identified governance processes as blockers to be eliminated, declared that the risks of not moving fast enough outweigh the risks of “imperfect alignment,” and pushed for “any lawful use” models “free from usage policy constraints.”
Against that backdrop, the General Services Administration’s (GSA’s) proposed contract clause, GSAR 552.239-7001, Basic Safeguarding of Artificial Intelligence Systems, attempts to address the governance gaps in federal AI procurement. It covers everything from data control and portability to “American AI” sourcing and ideological output requirements, and explicitly states that, in any conflict with the contractor’s or service provider’s policies, terms, or commercial agreements, the clause takes precedence. The proposed clause is open for public comment until Friday, March 20.
What is striking to those of us observing the evolution of AI procurement policies is not that the government is finally building governance into AI acquisition. It is that GSA is doing so after months of policy statements pointing in the opposite direction, through a commercial channel, with a clause that reads like too many competing agendas forced into a single instrument.
What the Draft Gets Right
GSA is the federal government’s primary civilian purchasing agency. When agencies buy commercial AI, they often do so through GSA’s acquisition channels, including the Multiple Award Schedule, a government-wide contracting program for commercial products and services.
When buying “commercial” products and services, the Federal Acquisition Regulation (FAR)—the primary rulebook for federal procurement—generally directs agencies toward customary commercial terms and limits the addition of terms inconsistent with customary commercial practice, absent an applicable statutory requirement or an approved waiver. The idea behind commercial government purchasing is straightforward: Buy what the market already sells, on the terms the market already uses, rather than reinventing the wheel every time the government needs something.
For commercial computer software, a category that may not neatly encompass all commercial AI services, the default starting point is the licenses customarily provided to the public, but the government can insist on broader rights.
The new clause is a proposed addition to the General Services Administration Regulation (GSAR), the GSA-specific supplement to the FAR, for solicitations and contracts involving AI capabilities. It reads more like something you would find in a custom defense procurement than a commercial buying channel.
At a high level, the clause tries to do several big things at once: control government-linked data and informational advantages generated through the government’s use of an AI system; prohibit vendor policy refusals and usage constraints for lawful government uses; govern a layered AI stack through the prime vendor; guarantee portability and prevent vendor lock-in; expand government testing and evaluation authority; reduce supply chain risk via American AI sourcing; and impose anti-ideology principles through a commercial clause.
Some requirements are overdue, while others are wildly inconsistent with commercial buying practices. And packing them into a single clause—one that covers far more ground than this analysis can address—only exacerbates the problem. The government has been buying AI without sufficient transparency, testing, leverage over vendor dependence, or control over the informational advantages created within vendor-controlled systems. The draft’s best instinct is that the status quo is untenable.
Restricting Informational Advantage
The clause provides a list of examples of the prohibited use of “Government Data,” which includes “training, fine-tuning, or otherwise improving an LLM or other machine learning or AI models, including those operated by third parties, or to develop or improve the AI System(s) for any other customers or any commercial or non-commercial purposes”; “targeting Government or non-Government entities or informing the Contractor’s or Vendor’s advertising, marketing, sales, monetization, strategy, operations, or other business decisions”; and “retaining, accessing, or using Government data beyond the scope and duration expressly permitted in the contract.”
The clause’s definition of “Government Data” illustrates both the strengths and weaknesses of the draft. The clause correctly recognizes that a vendor must be able to use at least some “Government Data” to perform the contract. Vendors should not, however, be free to repurpose a government customer’s inputs and outputs for unrelated commercial purposes, and most major AI providers already accept some version of that limit in their commercial terms. Yet the clause goes further. It defines “Government Data” to include not only inputs and outputs but also metadata, logs, derivative data, and other usage-linked information generated by government use.
The clause excludes purely technical system-level data that contains no government data or government usage context. That carve-out is critical. Without it, the clause would claim government ownership over basic operational data that every vendor needs to keep the lights on. But the carve-out depends on “usage context” as the dividing line, and the draft never defines where system-performance data ends and data that reveals government insights begins. That ambiguity is the crux of the problem. Not all usage-linked data poses the same risk, and this clause addresses at least three categories that warrant distinction: telemetry, “data dust,” and feedback.
Telemetry is the basic operational data a vendor needs to keep the service running, such as how quickly the system responds, how often it crashes, and how much computing power it uses. No serious buyer should expect a provider to operate a modern AI service in the dark.
“Data dust” is different. It is closest to what technologists call “data exhaust,” the digital trail left by online activity, but the concept here is narrower: the behavioral fingerprint left behind by government use, not the system’s performance. Every prompt, and every pattern in what users accept or reject, can leave a trace. Taken together, these traces begin to reveal what the government is working on, where its users push back, and what the agency might need next. It is the difference between knowing someone entered a room and knowing which drawers they opened, which ones they opened twice, and which ones they walked away from.
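To make that dividing line concrete, here is a minimal, hypothetical sketch of how the same interaction might be logged as pure telemetry versus as usage-linked data dust. The field names and values are invented for illustration; they come from neither the draft clause nor any vendor’s actual logging schema.

```python
# Hypothetical log records for the same government interaction.
# Field names are invented for illustration only; they are not drawn
# from the draft clause or from any vendor's real logging schema.

# Telemetry: system performance with no usage context. This is the kind
# of purely technical data the clause's carve-out appears intended to protect.
telemetry_record = {
    "timestamp": "2025-11-04T14:02:11Z",
    "latency_ms": 412,
    "http_status": 200,
    "tokens_processed": 1_830,
    "gpu_seconds": 0.7,
}

# Usage-linked record ("data dust"): the same request, logged with context
# that starts to reveal what the agency is working on and which outputs
# its users accept or reject.
usage_linked_record = {
    "timestamp": "2025-11-04T14:02:11Z",
    "tenant": "agency-procurement-office",
    "prompt_topic": "draft justification for sole-source award",
    "output_accepted": False,
    "follow_up_prompts": 3,
}
```

The first record looks like the system-level data the carve-out excludes; the second begins to expose agency priorities. The draft never says where records that mix the two should fall.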
Data dust creates a competitive advantage that the government often cannot see. A vendor that systematically learns from government use gains insight into agency demand, evaluation preferences, and capability gaps, which can shape product design, pricing, strategy, and future procurement proposals. Existing procurement-integrity and conflict-of-interest frameworks were not designed for that kind of ambient, continuous insight, which is part of why the risk is so easy to overlook.
The clause’s most important and likely controversial response to that risk is a categorical prohibition on using government data to train, fine-tune, or otherwise improve AI models for any purpose outside performance of the contract itself. Vendors understandably oppose this. I have heard the restriction compared to inviting a master chef into a kitchen, handing them the finest ingredients, and then requiring them to forget every recipe they learned on the way out. The analogy resonates, but it botches the equities. The ingredients are sensitive government data, the recipes are based on government workflows, and the lessons learned could give the chef an unfair competitive advantage in future procurements. The restriction exists because the chef should not be allowed to leave the kitchen with that information. But the prohibition does not distinguish between mission-sensitive data and routine interactions that are indistinguishable from ordinary commercial use. This broad restriction might be necessary for administrative simplicity, but that comes at a cost to the improvement cycle.
The third informational category is feedback. Unlike data dust and telemetry, feedback is not addressed through the prohibited uses list. Instead, the clause states that the government owns all feedback—regardless of who generated it—and limits its use for system improvement when it includes both government data and government confidential information, except when used to perform the contract itself. Rather than telling the contractor what it cannot do with feedback, the clause asserts that the feedback itself belongs to the government. This raises the same question about where to draw the line. A pattern of corrections can reveal agency priorities, but a user flagging a hallucinated citation may simply improve everyone’s baseline performance.
Where Governance Becomes Control
Governance protects visibility, testing, exit rights, and data boundaries within a commercial relationship. Control goes further. It starts to dictate how the vendor builds, operates, and governs its own product. That is the line the clause crosses in several places.
The Safety Stack
The “safety stack” refers to the guardrails a company builds around its AI model that dictate what it will answer, what it will refuse, and how it decides. For example, in my earliest days of using generative AI, I was researching ear-piercing methods for my daughter. I asked ChatGPT something like, “Which is better for kids, needles or a device?” ChatGPT refused to answer because I had wandered into health- and safety-adjacent territory. The refusal likely came from the company’s trust and safety guardrails, which flagged a question about using needles or devices on a child and decided the safest course was not to answer.
In practice, the line between the guardrails and the model itself is not clean. Some refusal behavior is built into the model through its training, and some sits in the operational layer around it. The clause states that the AI system must not refuse to produce outputs or conduct analyses based on the contractor’s or service provider’s discretionary policies, while clarifying that this does not require retraining or altering model weights. That distinction is crucial. In commercial AI systems, refusal behavior may result from the operational controls built around the model, from system instructions, or from the model’s own trained behavior. The clause appears to target the operational layer, even though those sources of behavior are not always cleanly separable.
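A rough way to picture the operational layer is a policy check that runs before the model is ever called. The sketch below is purely illustrative: the refusal topics, the classifier, and the model call are all placeholders of my own, not any vendor’s actual guardrail implementation.

```python
# Minimal sketch of an operational-layer guardrail. The refusal logic lives
# in code wrapped around the model, not in the model's weights. REFUSAL_TOPICS,
# classify_topic(), and call_model() are invented placeholders, not a real vendor API.

REFUSAL_TOPICS = {"medical advice", "legal advice"}

def call_model(prompt: str) -> str:
    """Placeholder for the underlying model call."""
    return f"[model answer to: {prompt}]"

def classify_topic(prompt: str) -> str:
    """Toy stand-in for a policy classifier; real systems use rules or separate models."""
    return "medical advice" if "needle" in prompt.lower() else "general"

def answer(prompt: str) -> str:
    # The discretionary policy check runs before the model is invoked at all.
    if classify_topic(prompt) in REFUSAL_TOPICS:
        return "I'm sorry, I can't help with that."
    return call_model(prompt)
```

Deleting that policy check changes nothing about the model itself, which is roughly the distinction the clause draws when it disclaims any requirement to retrain or alter model weights. The harder cases are refusals the model learned during training, which no operational-layer change can remove.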
Many operational-layer refusals occur because the provider does not trust the model to answer reliably in sensitive areas. A vendor refuses a medical or legal prompt not necessarily because the model cannot generate an answer, but because the vendor knows the response may be confidently wrong. This creates a compliance inconsistency: Removing those refusals might make it harder to satisfy the clause’s separate requirements for truthful, trustworthy, and supposedly neutral outputs, while keeping the refusals could risk their being deemed prohibited discretionary restrictions. The clause does not specify who bears the loss if a vendor relaxes a safeguard and the system produces a confidently wrong output that an agency relies on.
To be fair, not all refusals are reliability driven. Some are purely commercial or reputational—a vendor declining to address certain topics not because the model would be inaccurate, but because the vendor does not want the exposure. Removing those would not degrade output quality. But the clause draws no distinction between reliability-driven refusals and reputational ones. It prohibits discretionary refusals while separately requiring truthful, trustworthy, and neutral outputs, without acknowledging that some of those refusals may be what makes the outputs trustworthy in the first place. The clause also applies to all AI systems, not just large language models (LLMs). The same prohibition on discretionary refusals extends to image recognition, autonomous systems, and other models where vendor-imposed constraints may reflect safety engineering rather than content policy.
Stack Governance
The clause creates what I have been calling a “David and Goliath problem” in AI stack oversight. Government contractors understand flow-down obligations (requirements that the prime contractor passes down to its subcontractors and suppliers). But this is not an ordinary supply chain. The clause broadly defines “Service Provider” as any entity that directly or indirectly provides, operates, or licenses an AI system, and makes the prime contractor responsible for each upstream provider’s adherence to the clause. Notably, the clause specifies that service providers “may or may not be subcontractors,” meaning the prime contractor must ensure compliance by an upstream provider whose only connection to the government contract may be a commercial API or platform agreement.
Consider a company that sells an AI-based document analysis tool built on Claude. That company may be little more than an interface layer on top of Anthropic’s model. Yet the company is responsible for making sure that Anthropic and any other covered providers comply with the clause. The instinct is understandable: If the government contracts only with the visible front-end vendor while the significant operational risk lies deeper in the stack, the clause becomes hollow. But the draft shifts compliance responsibility to what may be the smallest actor in the chain, while the biggest actors remain beyond its contractual reach.
The risk extends beyond contract performance. The clause makes the prime contractor responsible for upstream compliance, which it may have no practical means to verify. A contract governed by these obligations poses serious False Claims Act (FCA) risks: treble damages, per-claim penalties, and a whistleblower mechanism that encourages filing against the most accessible target—the prime vendor holding the contract, not the upstream provider with actual operational control. The clause is littered with FCA pressure points, including certification requirements, disclosure and reporting obligations, and open-ended performance standards that collectively heighten exposure. The FCA risk alone warrants attention well beyond the broader intellectual property (IP) and commercial-practice concerns raised by this clause.
Ownership
The clause does not just try to control how the AI system operates. It also claims ownership over much of what the system produces. The government should absolutely restrict how vendors use government data and the informational advantage they generate from that use. But restricting use and claiming ownership are different legal tools, and this draft does both. Whether leading AI companies will accept these terms when federal contracts represent only a small fraction of their overall commercial revenue remains an open question.
The clause states that the government owns all “Government Data” and all “Custom Developments,” and grants the contractor only a limited, revocable license for contract performance. It then goes further: Any intellectual property rights the contractor acquires in government data, or in any improvements, enhancements, feedback, or derivative works thereof, are automatically assigned to the government at creation, though the contractor retains the underlying AI system and base models.
Government ownership of custom configurations and workflows developed specifically for the government is defensible. The overreach is in the automatic assignment of rights over improvements, enhancements, feedback, and derivative works tied to government data. In many current federal AI deployments, agencies access shared commercial models through APIs, and many major vendors already commit not to use customer inputs and outputs to train their models. But the clause defines government data far more broadly than those commercial commitments cover, extending to metadata, logs, derivative data, and usage-linked information. That gap is real, but the question is whether ownership is the right tool to fill it.
For much of what the clause tries to reach, there may be no clean, transferable asset to own. The informational advantages a vendor accumulates through government use are not separable objects that the government can take possession of. Ownership language shifts the burden: The vendor needs authorization rather than the government needing proof of misuse. That is a genuine structural advantage. But if the real concern is preventing vendors from exploiting the operational residue of government use (i.e., the logs, usage patterns, and behavioral insights described earlier), and maintaining the government’s ability to exit, then strong use restrictions, audit rights, and portability obligations are better targeted to how cloud-based AI works. They also require enforcement capacity that most agencies currently lack—the same capacity gap that makes the clause’s broad defaults attractive in the first place.
Portability and Interoperability
The government should insist on exit rights. Full stop. The federal government has a long and painful history with vendor lock-in, and I have argued elsewhere that the risk extends beyond technical lock-in to behavioral dependency on vendor-controlled systems. Actual portability means the government can recreate essential data, relationships, and workflows with a different provider. The clause’s requirement of open, standardized data formats and APIs, and its prohibition on proprietary formats that require additional licensing, serve that goal. The prohibition on proprietary technologies that create vendor dependencies goes further. It moves from regulating the exit to regulating the architecture—dictating how the vendor builds its product, not just ensuring the government can leave.
Evaluation Rights
The government should absolutely test what it purchases. Yet the draft allows automated government assessments while disclaiming any obligation to disclose the underlying data, methodologies, or systems. Fully disclosed benchmarks may invite gaming—vendors optimizing to the test rather than the mission. But the solution is structured evaluation with defined criteria and a remediation process, not a regime in which the government renders consequential decisions without explaining their basis.
American AI Systems
Concerns about upstream foreign control, foreign-developed model components, and opaque supply-chain dependencies in federal AI systems are not hypothetical, and the government has a legitimate interest in knowing who built and controls the AI systems it uses. OMB M-25-21 (the administration’s primary AI policy guidance) and M-25-22 (AI acquisition guidance) implement the Advancing American AI Act and articulate a policy of “maximizing the use” of AI products and services developed and produced in the United States. The clause converts that preference into a prohibition, requiring contractors to “use only American AI Systems” and barring foreign AI systems, including components “manufactured, developed, or controlled” by non-U.S. entities. In a market built on layered models, open-source components, global development teams, and complex service-provider relationships, that is sweeping language with immediate interpretation and implementation challenges. Does a model qualify as “American” if it was developed by a U.S. company but trained on data processed overseas? That is the kind of line-drawing problem the clause creates but does not resolve.
Where It Turns Into Ideological Policy
Agencies have a valid interest in demanding factual accuracy, transparency about uncertainty, and freedom from intentional vendor-driven ideological bias in systems used for public purposes. The “Unbiased AI Principles” closely track Executive Order 14319, “Preventing Woke AI in the Federal Government,” and OMB M-26-04, “Increasing Public Trust in Artificial Intelligence Through Unbiased AI Principles.” The GSA draft appears to extend that framework more broadly across AI systems. Requiring truthfulness and stronger testing is one thing. Requiring an AI system to be a “neutral, nonpartisan tool” that does not manipulate responses in favor of “ideological dogmas such as Diversity, Equity, Inclusion,” while reserving to the government the right to test for “unsolicited ideological content” using undisclosed methodologies, is another. The tension is obvious: The arbiter of neutral, truthful AI output is the same government that titled its mandate “Preventing Woke AI.” That is a politically freighted performance requirement masquerading as a demand for better output. The clause qualifies these requirements with a “commercial efforts” standard, which softens the obligation. But even a best-efforts version of a politically subjective performance benchmark creates the same contract administration and compliance concerns.
Redefining “Commercial” Procurement
The government is not an ordinary buyer, and commercial acquisitions have never been free of government-unique terms. The concern here is not that the clause adds requirements. It is the aggregation of buyer protections, operational control, sourcing mandates, disclosure burdens, and ideological performance conditions, packed into a single default clause.
The draft does contain a tailoring mechanism: Paragraph (j) allows bilateral revisions of certain sections at the order level, including the data and IP provisions in paragraph (d)—which means the ownership, assignment, and feedback provisions that take up so much of this analysis can be modified. But tailoring does not extend to the American AI requirement, the Unbiased AI Principles, or the government’s evaluation regime, and most agencies will accept the standard terms rather than negotiate departures. Moreover, there is market structure risk. The compliance burden of the nonnegotiable provisions falls disproportionately on mid-sized and smaller AI companies, while large incumbents with dedicated government divisions can absorb or negotiate around the terms.
The strongest defense of this draft is that lighter-touch mechanisms assume a level of technical and contractual capacity that most agencies lack, and that a strong baseline may be the only way to protect the government. But the answer is to build the capacity the system needs, not to paper over its deficiencies with a single, overburdened clause. A better clause would keep the parts that address real AI procurement issues—data-use limits, testing rights, meaningful disclosure, portability, supply chain transparency, incident reporting, and managed change—and remove what is logically inconsistent with those protections.
Federal procurement has always wrestled with competing priorities—flexibility versus uniformity, integrity versus efficiency, socioeconomic goals versus open competition. The data and governance challenges in AI procurement are another version of that structural tension. I do not envy the policymakers trying to balance protecting legitimate government interests with preserving the commercial relationships that make these products worth buying.
The draft gets the diagnosis mostly right. But it has responded to an existing governance gap with a clause that tries to do too much at once through the wrong channel, risking both overreach and distortion. Moving from governance as a “blocker” to governance by sledgehammer is not a cure. It is just the next iteration of the same pathology.
