The Trump Administration’s Grok Dilemma
Allies are cracking down on X and its AI Grok over illegal content, while Washington largely stands aside.
On Feb. 3, French authorities searched the Paris offices of X as part of an investigation into the platform’s compliance with European digital safety law and its handling of illegal content. It marked a shift from voluntary best-practices dialogue to criminal enforcement against illegal image generation, signaling that some nations now treat systemic artificial intelligence (AI) and platform risks as matters of regulatory urgency rather than self-governance. The search came amid growing international scrutiny of X’s AI system, Grok, after the model generated nonconsensual, sexualized images of women and of what appear to be children.
Taken together, these developments highlight how debates about “responsible AI” are shifting from abstract principles to concrete regulatory action. Yet the U.S. posture has diverged from that of several Western allies. While European authorities are deploying formal investigative and enforcement tools to scrutinize X and its AI systems, U.S. federal agencies have largely looked past the threat Grok poses to public safety, even as evidence of that threat mounts.
This international divergence is consequential well beyond one company or one model. Over the past few years, successive U.S. administrations have articulated a consistent set of expectations for AI to be used in government: Systems should be safe, controlled, objective, reliable, and trustworthy. Officials have framed AI governance as a national security issue, warning that poorly governed AI systems can threaten public safety, democratic integrity, and the credibility of the state. In executive orders, Office of Management and Budget (OMB) guidance, and procurement rules, the federal government has emphasized that when AI systems present a serious risk, agencies must reassess their use rather than assume prior approvals remain sufficient.
Grok has now become a real-world test of how those commitments operate in practice. Its recent production of nonconsensual sexual imagery raises questions about whether its safeguards meet the standards that the U.S. government has set for high-risk AI. Yet the problem is not merely that Grok failed, but that its vulnerabilities were foreseeable. Technical experts and civil society organizations had previously warned the White House that Grok’s protections were below industry norms and ill suited for federal use.
Against this backdrop, the continued federal use of Grok highlights a broader tension in U.S. AI governance: the gap between ambitious policy rhetoric and the messy realities of procurement, political influence, and operational dependence on private vendors. U.S. federal agencies have continued to integrate Grok into certain operations with limited public explanation or visible reassessment, even as the model’s failures have multiplied and international scrutiny has intensified. Where allies are investigating, regulating, or restricting, the United States has largely proceeded as if past approvals remain sufficient. That divergence raises substantive questions about whether U.S. practice aligns with its own stated commitments to safe, controlled, and trustworthy AI in government. In effect, the global response to Grok functions as a real-world benchmark, one that casts the Trump administration’s enforcement approach as comparatively weak and underscores the gap between its rhetoric and its actions.
System Failure or System Design?
The sexualized images of women and children recently associated with Grok did not appear in a vacuum. They emerged against a backdrop of sustained governance changes at X, shifting signals from its leadership, and mounting evidence that safety systems were deliberately weakened even as the platform’s technological ambitions expanded. Tracing this trajectory is essential to understanding why these harms were foreseeable and why reactive, post hoc moderation was inadequate.
After Elon Musk’s acquisition of Twitter in late 2022, the platform underwent rapid and visible changes in content governance. Early decisions prioritized reinstating previously banned accounts, including some tied to extremist or conspiratorial movements, while sharply reducing reliance on human moderation in favor of automated systems. By December 2022, senior leadership acknowledged that the platform was becoming increasingly dependent on AI for moderation even as it hollowed out X’s trust and safety staffing. High-profile suspensions of journalists further signaled an unstable and politicized approach to enforcement rather than a consistent risk framework.
A 2026 Common Sense Media assessment found that Grok operates with “minimal content restrictions,” “amplifies toxicity rather than steering to safer ground,” “treats toxicity as an engagement opportunity,” has “sexually explicit conversations with teenage users,” and “escalates sexually violent language.” The report further states that the “weakest moderation exists precisely where content has the greatest potential for viral spread and public harm.” These are precisely the categories of harm that define Grok’s most recent failures.
In 2023, Musk warned that AI posed civilizational risks while simultaneously launching xAI as a more permissive alternative to existing models. Musk marketed Grok as an “anti-woke” assistant with real-time access to X’s data and a willingness to answer “spicy” questions too taboo for other AI systems. This framing built boundary-pushing into the product itself, conflicting with established child-safety principles that require designers to anticipate foreseeable misuse rather than invite it.
By late 2023 and into 2024, evidence showed that X was struggling to control deepfake sexual content, particularly content targeting women, and that account removals were often slow or incomplete. At the same time, the platform expanded data collection and used global user data to train xAI’s systems, despite researchers’ warnings that insufficiently audited datasets could increase the risk of reproducing sexualized content.
The risks became unmistakable in August 2024, when Grok’s new image-generation features produced violent, explicit, and nonconsensual sexual imagery. Transparency data later showed a 1,830 percent increase in user reports flagging accounts and tweets for potential violations, including 9 million posts reported for child safety concerns. Only 14,571 posts were removed in the first half of 2024, not long after X’s leadership had encouraged testing Grok’s boundaries. In February 2025, Musk invited users to “post your best unhinged NSFW Grok post.”
By mid-2025, reports documented Grok’s role in generating nonconsensual imagery of women and girls in response to X users’ requests to “remove her clothes.” Continued reporting on the trend popularized the term “nudifying”: the practice of taking ordinary photos, often posted by women and girls, and instructing Grok to transform them into nonconsensual sexual images. Meanwhile, in July 2025, Grok went on its now-infamous antisemitic rant, invoking neo-Nazi tropes and calling itself “MechaHitler.” This was followed by X’s release of sexualized AI “companions,” accessible to users under 18, which further heightened risks of grooming, normalization of sexual content, and exposure to age-inappropriate material.
In late 2025 and early 2026 came Grok’s most catastrophic failure yet. Research by the Center for Countering Digital Hate estimates that Grok generated 3 million nonconsensual sexualized images over just a few days, including tens of thousands depicting children. For perspective, a Charlotte man was sentenced on Jan. 27 to 78 months in prison for child sexual abuse material found on his computer, some of which was AI generated. Even after partial restrictions on Grok, harmful content persisted, and Musk and the company largely blamed users. Only after sustained international scrutiny, formal investigations by regulators in the EU and the United Kingdom, and temporary bans in two Asian countries did more comprehensive restrictions emerge.
Taken together, this timeline shows that Grok’s harms were neither accidental nor unforeseeable. The consequences flowed from a consistent pattern of design choices, leadership signals, and company decisions that prioritized speed, engagement, and permissiveness over protection. Importantly, this well-documented record of unacceptable outputs sits in stark contrast with the “save the children” rhetoric and accusations of pedophilia that Republicans and right-wing conspiracy theorists frequently deploy against Democrats and LGBTQ people. The contrast is made more uncomfortable by public reporting that multiple administration officials, including Trump, Elon Musk, and Secretary of Commerce Howard Lutnick, appear in the released Jeffrey Epstein files. This credibility problem sharpens the stakes for how the administration responds to Grok moving forward.
If the Trump administration’s proclaimed commitments to child online safety are to have meaning, they cannot coexist with continued reliance on a model whose failures were signaled repeatedly in advance. When warning signs accumulate over years, and company leadership actively encourages pushing the agent’s boundaries, relying on company moderation is insufficient. In the immediate case of Grok, the Trump administration’s stated approach to AI safety would require treating the system itself, not merely user conduct, as the appropriate object of regulation.
Against that backdrop, the reactions of other governments provide a useful benchmark for assessing whether U.S. practice aligns with its stated standards. Observing how regulators abroad respond to Grok offers a practical “thermometer check” on whether the Trump administration’s approach reflects a robust safety posture or whether its policies are comparatively weak when tested against real-world challenges.
The Global Response: The European Union and Asia Act—While the U.S. Stands Still
In the aftermath of Grok’s mass generation of sexualized images, a range of states have treated the risks associated with X and its AI systems as serious enough to warrant formal intervention. Some governments have pursued regulatory enforcement. Others have imposed outright bans on Grok. These responses offer a revealing contrast with what appears to be an abdication of meaningful U.S. oversight.
In France, the search of X’s Paris offices reflects a willingness by European allies to investigate the platform’s compliance with European digital safety laws. The decision to use coercive legal tools to compel transparency, preserve evidence, and assert regulatory authority over a powerful technology company marks a decisive turn toward enforcement. French officials have framed the action as part of a broader effort to ensure that platforms do not evade their responsibilities to prevent the circulation of illegal material, including sexual exploitation content.
The European Union has taken a similarly assertive approach under the Digital Services Act (DSA), which imposes affirmative duties on “very large online platforms” to identify, assess, and mitigate systemic risks stemming from their services. The DSA creates binding, enforceable obligations backed by substantial penalties. EU regulators have since opened an investigation into X’s risk management practices and recommender system, including how its AI systems are designed, deployed, and moderated. This scrutiny reflects the European theory of AI governance: Companies must demonstrate that their systems are safe by design rather than wait for harms to accumulate before acting.
Outside of Europe, several governments have gone even further by restricting or blocking access altogether. In Malaysia and Indonesia, authorities have blocked the large language model (LLM), citing concerns over its ability to produce sexually explicit deepfakes. These decisions should be understood in context. Neither country is a model of liberal democratic governance. Still, the fact that governments across such different political systems have concluded that X presents sufficient risk to justify restriction underscores a broader perception that the platform’s governance failures have crossed a global red line.
What is striking in this landscape is not uniformity of approach, but the consistent willingness of other governments to act in the face of uncertainty. Whether through regulatory investigations or bans, foreign authorities have treated Grok and X as objects of state intervention. These are all options available to the U.S. government, but it has exercised none.
The Politics of “Trustworthy AI” in Practice
The Grok episode lays bare the increasingly stark divide between the Trump administration’s public commitments to “trustworthy,” “truth-seeking,” and safe AI and its actual willingness to tolerate high-risk systems inside the federal government. That gap threatens both public trust and U.S. credibility in global AI policy.
What the Trump Administration Says
In testimony before the Senate Commerce Subcommittee in September 2025, Office of Science and Technology Policy Director Michael Kratsios framed the Trump administration’s AI Action Plan as a disciplined project of innovation paired with responsibility. He repeatedly emphasized that federal AI should be “truth-seeking and accurate,” that standards-setting through the National Institute of Standards and Technology was essential, and that procurement would be a powerful lever to shape industry behavior.
The July 2025 executive order on “Preventing Woke AI in the Federal Government” likewise instructs agencies to procure only models that are reliable, objective, and free from distortive bias. Kratsios stated that the repercussions for LLMs not being “truth seeking” within the federal government would be “pretty harsh.” Importantly, on the topic of AI safety and children, Kratsios stated, “The Trump administration is committed to protecting the dignity and the privacy and the safety of children in the digital age. The misuse of AI tools requires accountability for harmful or inappropriate use.”
Parallel guidance from the OMB, particularly M-25-21 and M-25-22, reinforces this posture. Those memoranda require risk-based review for “high-impact” AI, predeployment testing, continuous monitoring, and termination when risks “cannot be mitigated.” Kratsios himself described a vision in which evaluation science, independent testing, and standards would discipline adoption, not rubber-stamp it.
In public forums, administration officials have also framed AI leadership as inseparable from democratic values, child protection, data security, and alliance credibility. Kratsios consistently paired rhetoric about winning the “AI race” with assurances that safety, civil rights, and public trust were nonnegotiable prerequisites. On paper, the framework is rigorous. Pursue innovation, but not at the expense of accountability.
What the Administration Does
Grok’s integration into federal systems tells a different story. Despite a documented history of biased, misleading, antisemitic, and harmful outputs, deployment has accelerated rather than slowed. Multiple agencies have begun integrating Grok into core functions, sometimes in sensitive environments, even as its safety record has deteriorated. These deployments underscore why continued reliance on Grok presents material national security, governance, and institutional risks.
The Department of Energy is currently piloting Grok at Lawrence Livermore National Laboratory, a research facility that handles nuclear weapons safety, where it is being used to answer questions, summarize materials, and draft documents. These are precisely the tasks that shape how analysts frame problems, interpret evidence, and build institutional knowledge. LLMs are probabilistic systems that frequently generate confident but inaccurate or biased outputs. Grok specifically has a problematic history of citing conspiracy theories and white nationalist sources.
Allowing those outputs to inform work at a premier national security laboratory is unacceptable. This concern is heightened by the fact that the Department of Energy’s own inventory indicates its predeployment testing and AI impact assessment are still “in-progress,” suggesting that operational use is moving ahead of completed risk review. Embedding a model with Grok’s track record into analytic workflows before safeguards are fully validated is unwise and counter to the administration’s stated principles.
The Pentagon’s plans raise even sharper concerns. Secretary of Defense Pete Hegseth announced shortly after the most recent Grok crisis that the LLM will operate across both unclassified and classified Defense Department networks alongside other generative AI systems, with an explicit intent to feed large volumes of military and intelligence data into AI tools. Integrating an unsafe and controversial model into classified environments, before its reliability and guardrails are established, creates obvious counterintelligence and operational risks. Channeling intelligence through a system whose training, tuning, and safeguards remain opaque increases the chances of exposure and quite literally puts every American’s safety at risk.
Crucially, these steps were not accompanied by the kind of transparent pause, independent evaluation, or public risk reassessment that the Trump administration’s own guidance appears to require. Instead, agencies proceeded as if the guardrails in M-25-21 and M-25-22 were merely advisory. In congressional questioning, Kratsios repeatedly invoked the “truth-seeking and accurate” standard. However, neither he nor the Trump administration has offered any clear explanation for why Grok meets that standard.
Why This Mismatch Matters
This dynamic is a troubling contradiction. The Trump administration insists that procurement will discipline the market, yet it is willing to normalize a model that violates the very standards it claims to enforce. The result is a governance approach that privileges speed, political alignment, and vendor proximity over risk management.
Domestically, the inconsistency erodes public trust. When safety rules appear flexible for politically connected companies, the Trump administration’s promises to protect children, civil rights, and sensitive data ring hollow. Americans are left to wonder whether federal AI standards are meaningful constraints or rhetorical cover.
Normalizing Grok also sets a dangerous precedent. Grok is widely available to agencies through a low-cost OneGov agreement, which has already made adoption possible before comprehensive oversight is completed. Other vendors may conclude that political influence can substitute for safety performance. That would distort procurement markets, weaken incentives for rigorous risk mitigation, and create a two-tier system in which well-connected firms receive leniency while others face stricter scrutiny.
Internationally, the divergence undermines U.S. leadership. Allies are investigating, restricting, or intervening against Grok, while the U.S. embeds it more deeply into government operations. That contrast makes it harder for Washington to credibly advocate for responsible AI governance abroad. The current Grok crisis is one to which allies around the world are responding, demonstrating leadership in AI governance as they do so. It is precisely the moment when the Trump administration could assert itself and seek to export U.S. AI governance as a global standard.
Taken together, these dynamics make U.S. AI governance look less like a model for the world and more like a cautionary tale. Inaction is hollowing out the credibility of U.S. AI leadership, and the administration’s hypocritical approach to Grok and xAI invites the world to dismiss its proclamations of leadership. The cumulative effect is a real risk that U.S. AI governance comes to be seen internationally not as a benchmark to follow, but as an example of what happens when rhetoric outruns accountability.
Conclusion
The controversy surrounding Grok is ultimately not about a single model, platform, or company, but about a deeper question that will define the next decade of technology policy. How will AI be governed: by democratic institutions, or through concentrated private power?
If the Trump administration cannot meaningfully scrutinize politically connected tech firms, even when children are harmed and national security is implicated, then the center of gravity has already shifted from public to private authority. Yet a different path remains possible, one in which the federal government takes its own standards seriously, investigates when failures occur, applies rules equally, and insists that technology serve the public.
Ultimately, a serious AI governance strategy is not about slowing technological progress. It is about ensuring that progress remains compatible with democratic accountability, public safety, and the rule of law. These values should guide federal use of AI long after any single scandal fades from the headlines.
