Protecting AI Whistleblowers
Listen to Bullock discuss the AI Whistleblower Protection Act on Lawfare Daily.
In May 2024, OpenAI found itself at the center of a national controversy when news broke that the AI lab was pressuring departing employees to sign contracts with extremely broad nondisparagement and nondisclosure provisions—or else lose their vested equity in the company. This would essentially have required former employees to avoid criticizing OpenAI for the indefinite future, even on the basis of publicly known facts and nonconfidential information.
Although OpenAI quickly apologized and promised not to enforce the provisions in question, the damage had already been done—a few weeks later, a number of current and former OpenAI and Google DeepMind employees signed an open letter calling for a “right to warn” about serious risks posed by AI systems, noting that “[o]rdinary whistleblower protections are insufficient because they focus on illegal activity, whereas many of the risks we are concerned about are not yet regulated.”
The controversy over OpenAI’s restrictive exit paperwork helped convince a number of industry employees, commentators, and lawmakers of the need for new legislation to fill in gaps in existing law and protect AI industry whistleblowers from retaliation. This culminated recently in the AI Whistleblower Protection Act (AI WPA), a bipartisan bill introduced by Sen. Chuck Grassley (R-Iowa) along with a group of three Republican and three Democratic senators. Companion legislation was introduced in the House by Reps. Ted Lieu (D-Calif.) and Jay Obernolte (R-Calif.).
Whistleblower protections such as the AI WPA are minimally burdensome, easy to implement and enforce, and plausibly useful for facilitating government access to the information needed to mitigate AI risks. They also have genuine bipartisan appeal, meaning there is actually some possibility of enacting them. As increasingly capable AI systems continue to be developed and adopted, it is essential that those most knowledgeable about any dangers posed by these systems be allowed to speak freely.
Why Whistleblower Protections?
The normative case for whistleblower protections is simple: Employers shouldn’t be allowed to retaliate against employees for disclosing information about corporate wrongdoing. The policy argument is equally straightforward—company employees often witness wrongdoing well before the public or government becomes aware but can be discouraged from coming forward by fear of retaliation. Prohibiting retaliation is an efficient way of incentivizing whistleblowers to come forward and a strong social signal that whistleblowing is valued by governments (and thus worth the personal cost to whistleblowers).
There is also reason to believe that whistleblower protections could be particularly valuable in the AI governance context. Information is the lifeblood of good governance, and it’s unrealistic to expect government agencies and the legal system to keep up with the rapid pace of progress in AI development. Often, the only people with the information and expertise necessary to identify the risks that a given model poses will be the people who helped create it.
Of course, there are other ways for governments to gather information on emerging risks. Prerelease safety evaluations, third-party audits, basic registration and information-sharing requirements, and adverse event reporting are all tools that help governments develop a sharper picture of emerging risks. But these tools have mostly not been implemented in the U.S. on a mandatory basis, and there is little chance they will be in the near future.
Furthermore, whistleblower disclosures are a valuable source of information even in thoroughly regulated and relatively well-understood contexts like securities trading. In fact, the Securities and Exchange Commission has awarded more than $2.2 billion to 444 whistleblowers since its highly successful whistleblower program issued its first award in 2012. We therefore expect AI whistleblowers to be a key source of information no matter how sophisticated the government’s other information-gathering authorities (which, currently, are almost nonexistent) become.
Whistleblower protections are also minimally burdensome. A bill like the AI WPA imposes no affirmative obligations on affected companies. It doesn’t prevent them from going to market or integrating models into useful products. It doesn’t require them to jump through procedural hoops or prescribe rigid safety practices. The only thing necessary for compliance is to refrain from retaliating against employees or former employees who lawfully disclose important information about wrongdoing to the government. It seems highly unlikely that this kind of common-sense restriction could ever significantly hinder innovation in the AI industry. This may explain why even innovation-focused, libertarian-minded commentators like Martin Casado of Andreessen Horowitz and Dean Ball have reacted favorably to AI whistleblower bills like California SB 53, which would prohibit retaliation against whistleblowers who disclose information about “critical risks” from frontier AI systems. It’s worth noting that Rep. Obernolte, a sponsor of the AI WPA’s House companion bill, has also been the driving force behind the controversial AI preemption provision in the GOP reconciliation bill.
The AI Whistleblower Protection Act
Beyond the virtues of whistleblower protections generally, how does the actual whistleblower bill currently making its way through Congress stack up?
In our opinion, favorably. A few weeks ago, we published a piece on how to design AI whistleblower legislation. The AI WPA checks almost all of the boxes we identified, as discussed below.
Dangers to Public Safety
First, and most important, the AI WPA fills a significant gap in existing law by protecting disclosures about “dangers” to public safety even if the whistleblower can’t point to any law violation by their employer. Specifically, the law protects disclosures related to a company’s failure to appropriately respond to “substantial and specific danger[s]” to “public safety, public health, or national security” posed by AI, or about “security vulnerabilit[ies]” that could allow foreign countries or other bad actors to steal model weights or algorithmic secrets from an AI company. This is significant because the most important existing protection for whistleblowers at frontier AI companies—California’s state whistleblower statute—only protects disclosures about law violations.
It’s important to protect disclosures about serious dangers even when no law has been violated because the law, with respect to emerging technologies like AI, often lags far behind technological progress. When the peer-to-peer file sharing service Napster was founded in 1999, it wasn’t immediately clear whether its practices were illegal. By the time court decisions resolved the ambiguity, a host of new sites using slightly different technology had sprung up and were initially determined to be legal before the Supreme Court stepped in and reversed the relevant lower court decisions in 2005. In a poorly understood, rapidly changing, and almost totally unregulated area like AI development, the prospect of risks arising from behavior that isn’t clearly prohibited by any existing law is all too plausible.
Consider a hypothetical: An AI company trains a new cutting-edge model that beats out its competitors’ latest offerings on a wide variety of benchmarks, redefining the state of the art for the nth time in as many months. But this time, a routine internal safety evaluation reveals that the new model can, with a bit of jailbreaking, be convinced to plan and execute a variety of cyberattacks that the evaluators believe would be devastatingly effective if carried out, causing tens of millions of dollars in damage and crippling critical infrastructure. The company, under intense pressure to release a model that can compete with the newest releases from other major labs, implements safeguards that employees believe can be easily circumvented but otherwise ignores the danger and misrepresents the results of its safety testing in public statements.
In the above hypothetical, is the company’s behavior unlawful? An enterprising prosecutor might be able to make charges stick in the aftermath of a disaster, because the U.S. has some very broad criminal laws that can be creatively interpreted to prohibit a wide variety of behaviors. But the illegality of the company’s behavior is at the very least highly uncertain.
Now, suppose that an employee with knowledge of the safety testing results reported those results in confidence to an appropriate government agency. Common sense dictates that the company shouldn’t be allowed to fire or otherwise punish the employee for such a public-spirited act, but under currently existing law it is doubtful whether the whistleblower would have any legal recourse if terminated. Knowing this, they might well be discouraged from coming forward in the first place. This is why establishing strong, clear protections for AI employees who disclose information about serious threats to public safety is important. This kind of protection is also far from unprecedented—currently, federal employees enjoy a similar protection for disclosures about “substantial and specific” dangers, and there are also sector-specific protections for certain categories of private-sector employees such as (for example) railroad workers who report “hazardous safety or security conditions.”
Importantly, the need to protect whistleblowers has to be weighed against the legitimate interest that AI companies have in safeguarding valuable trade secrets and other confidential business information. A whistleblower law that is too broad in scope might allow disgruntled employees to steal from their former employers with impunity and hand over important technical secrets to competitors. The AI WPA, however, sensibly limits its danger-reporting protection to disclosures made to appropriate government officials or internally at a company regarding “substantial and specific danger[s]” to “public safety, public health, or national security.” This means that, for better or worse, reporting about fears of highly speculative future harms will probably not be protected, nor will disclosures to the media or watchdog groups.
Preventing Contractual Waivers of Whistleblower Rights
Another key provision states that contractual waivers of the whistleblower rights created by the AI WPA are unenforceable. This is important because nondisclosure and nondisparagement agreements are common in the tech industry, and are often so broadly worded that they purport to prohibit an employee or former employee from making the kinds of disclosures that the AI WPA is intended to protect. It was this sort of broad nondisclosure agreement (NDA) that first sparked widespread public interest in AI whistleblower protections during the 2024 controversy over OpenAI’s exit paperwork.
OpenAI’s promise to avoid enforcing the most controversial parts of its NDAs did not change the underlying legal reality that allowed OpenAI to propose the NDAs in the first place, and that would allow any other frontier AI company to propose similarly broad contractual restrictions in the future. As we noted in a previous piece on this subject, there is some chance that attempts to enforce such restrictions against genuine whistleblowers would be unsuccessful, because of either state common law or existing state whistleblower protections. Even so, the threat of being sued for violating an NDA could discourage potential whistleblowers even if such a lawsuit would be unlikely to succeed. A clear federal statutory indication that such contracts are unenforceable would therefore be a welcome development. The AI WPA provides exactly this, resolving the NDA issue by stating that “[t]he rights and remedies provided for in this section may not be waived or altered by any contract, agreement, policy form, or condition of employment.”
Looking Forward
It’s not clear what will happen to the AI Whistleblower Protection Act. It appears as likely to pass as any AI measure we’ve seen, given the bipartisan enthusiasm behind it and the absence of any substantial pushback from industry so far. But federal legislation is difficult to pass in general, and the fact that there has been little vocal opposition to this bill to date doesn’t mean that dissenting voices won’t make themselves heard in the coming weeks.
Regardless of what happens to this specific bill, those who care about governing AI well should continue to support efforts to pass something like the AI WPA. However concerned or unconcerned one may be about the dangers posed by AI, the bill as a whole serves a socially valuable purpose: establishing a uniform whistleblower protection regime for reports about serious dangers, security vulnerabilities, and lawbreaking in a critically important industry.