Cybersecurity & Tech

Lawfare Daily: Cullen O’Keefe on the Impending Wave of AI Agents

Kevin Frazier, Renee DiResta, Cullen O'Keefe, Jen Patja
Wednesday, May 14, 2025, 7:00 AM
What are AI agents and how do we ensure they operate safely?

Published by The Lawfare Institute in Cooperation With Brookings

Cullen O’Keefe, Research Director at the Institute for Law and AI, joins Kevin Frazier, AI Innovation and Law Fellow at Texas Law and a Contributing Editor at Lawfare, and Renée DiResta, Associate Research Professor at the McCourt School of Public Policy at Georgetown and a Contributing Editor at Lawfare, to discuss a novel AI governance framework. They dive into a paper he co-authored on the concept of "Law-Following AI" or LFAI. That paper explores a near-term future. Imagine AI systems capable of tackling complex computer-based tasks with expert human-level skill. The potential for economic growth, scientific discovery, and improving public services is immense. But how do we ensure these powerful tools operate safely and align with our societal values? That’s the question at the core of Cullen’s paper, his recent Lawfare article, and this podcast.

To receive ad-free podcasts, become a Lawfare Material Supporter at www.patreon.com/lawfare. You can also support Lawfare by making a one-time donation at https://givebutter.com/lawfare-institute.

Please note that the transcript below was auto-generated and may contain errors.


Transcript

[Intro]

Cullen O’Keefe: To really drive home the difference between the different types of loyalty that we might want to distinguish between for AI agents, we have this concept of AI henchmen, which are agents that are perfectly loyal. They'll do what the principal asks them to, and they will be willing to break the law, either if they're instructed to or, perhaps the more insidious case, not when they're instructed to, but when they realize it would be in the best interest of their principal for them to break the law.

Kevin Frazier: It is the Lawfare Podcast. I'm Kevin Frazier, AI Innovation and Law Fellow at Texas Law, and a contributing editor at Lawfare, with my co-host Renée DiResta, associate research professor at the McCourt School of Public Policy at Georgetown and a contributing editor at Lawfare, and our guest, Cullen O'Keefe, research director at the Institute for Law and AI.

Cullen O’Keefe: I think it would be bad if we try to make this project hinge on having a philosophical account of both the law generally and the exact application of every single law to every imaginable circumstance in this brand new world of AI agents.

Kevin Frazier: Today we're talking about his work on AI agents. A forthcoming paper he co-authored proposes a constructive path forward on how to adjust legal norms and systems for the age of AI agents. We'll be exploring the concept of Law-Following AI, the innovative thinking behind it, and why it's a crucial concept for building a better future with AI.

[Main podcast]

Kevin Frazier: Alright, today we're talking about a truly transformative development: AI agents. Imagine AI systems capable of tackling complex computer-based tasks with expert human-level skill. The potential for economic growth, scientific discovery, and improving public services is immense. But how do we ensure these powerful tools operate safely and align with our societal values?

Cullen, your paper presents Law-Following AI as a positive vision for integrating AI agents into society. But before we go into what it even means for AI agents to follow the law, what are AI agents? Everyone keeps telling me we're in the year of AI agents and yet I'm still planning my own travel, which is profoundly frustrating. So what are AI agents in theory and where are they right now? Just to kind of make sure we're on the same technical level.

Cullen O’Keefe: Sure. So as you suggest, we're talking about a trajectory of AI technology that's still in very early stages. So I think it's most intuitive to explain AI agents in contrast to the generative AI systems that people are most familiar with now. This is a type of AI like ChatGPT, where you input something, a request of some sort, and get a pretty static output, whether that’s a piece of text or a piece of software code, an image, maybe even a video.

And that's all well and good, it's very useful for a lot of things, but what the companies developing these systems really want to do is be able to automate any task that can be done on a computer. And this is more or less explicitly how they often define the goal of AI agents—being able to do anything that can be done on a computer. The "on a computer" caveat is important because robotics, doing things in the physical world, is often harder, especially if it's a nonroutine task.

But that's the ultimate goal. And we're not there yet, it's not obvious that we're anywhere close to there yet, but companies like OpenAI and Anthropic are rolling out demos of their early AI agents that do things like operate a GUI, a computer, the way a human would: they can enter search terms, click around, and do some very basic tasks. But as you suggest, they're not particularly good at this yet.

There's another track that people have been working on for a long time, which is building what we call scaffolding software around these existing generative AI systems that helps them do more agentic things, things that look more like taking action than outputting text. But really, we're mostly just talking about a trajectory that the companies are working very hard to realize.

Kevin Frazier: Yeah, and thinking through our current state of where things are and looking forward to where things may go, let's play a quick hypo. I have to, because I'm a law professor. So in this hypo, I'm planning my trip, my dream honeymoon. My wife doesn't listen to these, I talk too much about AI, so I can tell you that we haven't booked our honeymoon. Yes, we've been married for more than a year; yes, it's a source of contention. Eventually we will, and maybe I'll use an AI agent to do so.

So I reach out to my AI agent. I say, I'm finally being a good husband. Let's go to Greece, plan this trip. Sounds great, right? I just press it, set it, prego. We see where I'm going: Athens. Until I find out, oh my gosh, that the AI agent, to get that last hotel room, emailed the other person who had that reservation and told them that if they didn't cancel their trip, their child would be abducted, or some crazy scenario, right? These are the sort of fearful situations of AI agents getting out of control.

So I'm trying to tee you up to help explain: why do we need to be thinking about Law-Following AI? What is the importance of setting up this paradigm for AI agents before they're capable enough to undercut that person who stole my beautiful hotel room on the Aegean?

Cullen O’Keefe: Yeah, I think that's right. So one of the features of AI systems that we've observed in past AI systems, and that also has a pretty strong theoretical basis, is that there are no strong guarantees that they'll behave in ways that humans expect them to, first of all, or that they'll behave in accordance with any particular set of values we want them to follow.

So this is what people familiar with the field of AI typically call alignment, and it's typically decomposed into a few different types of alignment. The one that is often considered the most basic type is called intent alignment. And you can think of this using a principal-agent setup that will be familiar to a lot of economists and lawyers listening to this, where the agent does what the principal would want it to.

So Kevin, since I know you're a good law-following person, I know an AI that is aligned to your intent would not do that type of blackmail to secure the last hotel spot–

Kevin Frazier: Thank you for thinking highly of me. Yes, I, I appreciate it.

Cullen O’Keefe: But you know, sometimes there are not good people in the world, and a system that is only aligned with the intent of the user would not have the same type of ethical qualms with regard to users who just ask it to do something and tell it that they don't care if it violates the law.

Or perhaps, in a more nuanced way, a more advanced AI system might reason about whether it's worth it to break the law on behalf of its principal in this situation because it thinks it could get away with it. Because probably, if it was found out that you knew that the AI system was blackmailing people on your behalf, at the very least, hopefully you would face some sort of consequence, or the company would face some sort of consequence. But a more purely intent-aligned system might be quite Machiavellian and very strategically break the law.

So people have known about this for a long time. It's why all the systems on the market today are not attempting to be purely intent aligned in this way. They do have ethical boundaries of some sort, and people have often called this extra set of guardrails value alignment. But there's also a lot of controversy around certain forms of value alignment, because obviously humans don't all agree on sources of value.

So the incident that people might be the most familiar with is the Google Gemini incident a couple years ago, where basically people found that if you entered what would be pretty benign descriptions of things, it would output very counterintuitive results. So for example, it would refuse to generate pictures of white couples but would be very comfortable doing that with couples of other races. It would depict 1945 German soldiers—since the word Nazi was censored—as racially and gender diverse, which is obviously ahistorical and offensive.

So, you know, there's this separate conversation about when we're choosing values: obviously we don't all agree on what values things should have, and this understandably lands companies in a bit of hot water when people who have a different set of values don't like it. And to their credit, it's not just cherry-picked stuff. There's decent literature that finds that the political views of these systems match most closely to something like center-left politics across the developed world. And people are also worried about this from a further-left perspective. There's a lot of literature coming out of the Global South about how these systems reflect the perspectives of people in the Global North.

And so all of this is to say that there's a bit of a quandary here, which is that everyone wants systems to have some sort of guardrails beyond just doing what's in the user's interest, but so far most of that discussion has been around a set of extralegal normative values that sound more in ethics than law. And so the Law-Following AI pitch is, well—as a first pitch, probably not sufficient or holistic—let's try aligning these systems to law, i.e., prevent them from taking actions that break the law.

Kevin Frazier: Renée, do you want to jump in here?

Renée DiResta: Yeah. So your paper actually goes into the notion of AI henchmen. Do you want to describe the kind of explicitly manipulative aspect that you model out in the paper itself, so we can go into this notion of the explicitly manipulative agents that you characterize as henchmen?

Cullen O’Keefe: Sure. Yeah. So the AI henchmen concept is really just trying to make this point that I think we all want AI agents to be loyal, but I think anyone who's thought about the concept of loyalty realizes there are multiple ways to be loyal.

And you know, in the traditional principal-agent literature that is familiar to lawyers, agents have a duty of loyalty to their principal, but it's qualified by a duty to still obey the law. It's not a defense to breaking the law that you were following orders from your principal; you're still liable for lawbreaking even if it's at the orders of, or in service of, your principal.

And so, to really drive home the difference between the different types of loyalty we might want to distinguish between for AI agents, we have this concept of AI henchmen, which are agents that are perfectly loyal. They'll do what the principal asks them to, and they will be willing to break the law, either if they're instructed to or, perhaps the more insidious case, not when they're instructed to, but when they realize it would be in the best interest of their principal for them to break the law, and possibly even to break the law without telling their principal about it so that their principal has plausible deniability over the lawbreaking behavior. You know, this is a thing we associate with sophisticated criminal organizations: the middle management taking the fall for the leadership so that the leadership keeps their hands clean.

And so, you know, AI henchmen could be a pretty nasty thing to have to deal with. So, you know, I think if we're just limiting ourselves to the type of AI agents that we consider in the paper, we're mostly talking about things that can be done on the computer, but a lot of nasty stuff can happen on the computer. You can steal people’s identities, you can steal money, you can blackmail people, you can create defamatory information about them with, you know, new AI tools that could be quite nasty.

You can, you know, be part of a larger criminal conspiracy on the computer. So even if it's other physical humans that are going out and doing the physical dirty work, you can arrange payments and coordinate action. So it's imagining all the different ways that a digital worker could either do a lot of very nasty work itself or could aid in the effectiveness of a larger criminal or illegal enterprise.

Renée DiResta: I think one thing that's interesting, just for listeners who are not as familiar with both the theory and, I think, the practice, is how far along we are on a technological front: just how indistinguishable some of these systems actually are at this point. I know OpenAI has done some of the personhood credential work, where I've intersected with some of your teammates, on the dynamics of how hard it already is to tell when you're engaging with an agent, the rapid acceleration of the technology, and the question of how you even tell.

One thing I really appreciated about your paper—and I definitely encourage listeners to read it; it's not technically challenging to read—is that it's really, really interesting because you go into these kind of vignettes, like: if you're in a Discord server and you're just talking about cryptocurrency, and one of the entities participating in the cryptocurrency server with you is a bot, right?

Or, you know, an AI agent that engages as if it's another cryptocurrency enthusiast, but what it's there for is to pay attention, to monitor the conversation. And when you say something like, man, I made so much money today, well, that's when it decides, okay, now it's time to extort this person, right? And that notion of extortion, the ways in which AI is used to extort people, is something that we see constantly in the work that I do on adversarial abuse.

So it is very much already in the realm of things that are real, and there's this question of where the law intersects. In the work that I've done and intersected with the OpenAI team on, how do you have people indicate, even in a Discord server, I am real, I am human? That's a technologically very, very complicated problem that we're working on solving from a technological standpoint and have not been able to solve yet. And then there's this other question of, from a legal standpoint, what are the legal ramifications for this agent, who controls it, and what happens?

So I'd love you to talk maybe a little bit about even just that vignette, the cyber extortion one, which I think is so accessible to people because spam and scams are at least things that regular people encounter every day, and then maybe we can talk about the global national security criminal masterminds next. But I really liked that example because it is something that I think people can see themselves in, because everybody has had that experience of obnoxious people on the internet trying to scam you.

Cullen O’Keefe: Yeah, and I'll say also that I owe a small debt to this podcast, actually, for helping inspire this vignette, because it was inspired in part by a vignette that Jonathan Zittrain told when he appeared on the podcast. I don't remember how long ago it was, but he talks about how pretty soon it's going to be quite cheap—if you want to really make someone's life hard—to just instruct an AI agent to follow them around from website to website and harass them. You know, make fake accounts, spread nasty news about them on all these different websites, to the point where they have to go totally anonymous or change their identity or something like that.

And that's very close to something that seems possible today. I don't want to be in the business of making hard predictions, but it does seem quite plausible. In today's money, you could probably do this for something like, I don't know, definitely way under a thousand dollars, for the rest of someone's life: just have an AI agent Google this person every day, and if it finds something new, make a new account to try to harass them. So that was some of the inspiration.

And yeah, so the basic story here is that a criminal group, you know, is quite interested in cryptocurrencies. You know, I'm not trying to be too negative on crypto here, but, you know, it has these nice features for criminals, which is that it's a bit harder for various law enforcement agencies to track and then halt transactions in.

So, what they do is they search social media, something like Twitter, to find information on people who have recently been posting big gains on Twitter. They find information about a Discord group where people tend to brag about their big gains. Again, this is a very common thing on social media that people familiar with the crypto scene will recognize.

And if someone accidentally posts their real name or their email, this could make it quite easy for the AI agent to use something like a data broker to figure out more information about the real natural person that's posting about these gains, and then look up more information about who their contacts are in real life and then threaten them with some sort of blackmail. So in the scenario we describe, they use AI tools to create pornography of this person, and it threatens to release it if they don't hand over some of their crypto gains.

And yeah, that's, that's something you could easily imagine someone panicking and doing if they were faced with that threat and found it quite credible.

Kevin Frazier: Yeah. And just to jump in there and say, listeners, please don't do this to me. I don't have any crypto gains. So don't get any ideas.

Cullen O’Keefe: Not a lot of gains this year.

Kevin Frazier: I think everyone's stock portfolio is in the trash, so maybe we don't have to worry about that immediately.

But to ground this a little bit more in actual AI governance—so you've got this great theoretical paper about what could happen, what we could see happen in the near future as we continue in this year of AI agents. We've had a couple folks on the pod who have said, hey, we need to make sure this is a space of, for lack of a better phrase, permissionless innovation; let's not clamp down on AI before we see these risks. On the other side, we've had folks come on the pod and say, well, we should probably just pause this whole pursuit of AI agents if we're seeing these potential harms arise.

So when we think about Law-Following AI as a sort of regulatory device, how would you like to see this applied? Do you think this is something that the labs should be spearheading? Should we see states, should we see Congress, should this be an international accord on LFAI, which is the acronym? Should we see some big LFAI treaty, or what's the point of intervention you'd like to see here?

Cullen O’Keefe: Well, I think one piece of good news is that the labs are already doing something like this, which is part of why we wanted to write the paper.

So OpenAI has this document called the Model Spec that goes through, in a hierarchical way, the principles it wants its models to follow. And in part, this is meant to deal with the exact kind of problem that we're trying to solve here, where the ways that OpenAI wants its model to behave might be in conflict with the ways that its users want the model to behave. It does take this very principal-agent approach where it says, you know, you're supposed to mostly do what the user wants, but one of the restrictions OpenAI puts in there is an exception for things that would violate the law.

Anthropic’s constitutional approach for their system Claude has a similar limitation, although it's a bit less explicit, a bit more buried in there. So, you know, there are hints that the industry wants to move in this direction, and you can see why that would be the case.

To go back to the thing that you opened with, Kevin, I think it would be a big scandal if it was found out that ChatGPT was blackmailing people in order to get a better hotel deal for its users, right? So I expect to be pretty sympathetic to a lot of AI companies’ fear here.

That said, you know, the primary thing that I care about for the purposes of this project is the prospect of AI agents being integrated into the government, and particularly being integrated into the government in ways that would allow them to exercise various hard power functions—things like law enforcement, military, you know, things like investigating citizens, etc.

And I think, you know, as a part of our Anglo-American legal tradition, that's the type of thing we're supposed to be quite worried about. And basically, that's where I have the most optimism for Law-Following AI, and I hope that a lot of people who take a more libertarian approach to this technology will similarly feel that that is something there should be pretty significant guardrails on. You know, it's something most American political traditions really care a lot about.

And so, you know, I think there's a debate to be had about what sort of regulations the private sector should have with regard to how AI agents should behave, but I think the red line that we're trying to generate consensus for in this paper is that AI agents acting as AI henchmen within the government is a pretty intolerable situation. And we can go into more of why it might be even more intolerable than the situation where you have perfectly loyal humans staffing the government in these kind of hard-power positions. But yeah, that's the primary way I see this playing out.

Kevin Frazier: We're speaking on April 22. There's a big debate right now about the word facilitate that's going on in the courts. What does facilitate mean? And I think Renée has a great thread to pull on here about the ambiguity in the law. So, Renée, I'll kick it over to you.

Renée DiResta: Yeah, no, I was curious as I was reading it about just this question of which interpretation of the law is correct. And that is constantly evolving. And I'm not a lawyer, just to be clear, so this was me reading it as a non-lawyer. But just this question of, as case law evolves, as Kevin's saying, with a lot hinging on the interpretation of one word and really critical decisions hinging on the interpretation of one word, how do you port that to a model? Are we expecting AI agents to have an immediate, I don't know, system update? How do you immediately pass that on down and through, right? How do you adapt that instantaneously?

There are also these moments where things will come into conflict, where a decision will come into conflict. The Supreme Court will temporarily halt something. There will be a moment where a decision is stayed, where there's a temporary restraining order, a temporary order. Help me out here, Kevin. I don't want to use the wrong legal terms.

Kevin Frazier: Good, good old TRO or injunction.

Renée DiResta: Thank you—injunction, that's what I was going for. And this is where that question comes in, around what happens technologically. Like, what is transmitted to the system, and how, in those moments: this is what you cannot do for this day, potentially for this hour, for this two-week period, and then, okay, now you can do it again, right? So what does the implementation look like for that, for the technological system?

Kevin Frazier: I mean, I kind of think, and I love this, Renée, what I kept thinking of as you were mentioning the systems update is a sort of Y2K. Are we just going to have Y2Ks for all of these AI agents if, you know, California passes a new law and the agent is super confused about which interpretation it follows: California law, New York law, Italian law? And then you see a change of interpretation.

Renée DiResta: I was just trying to get my head around it, yeah. Like, you know, you don't patch your Windows or whatever, and something goes horribly wrong, right?

Cullen O’Keefe: Yeah. I mean, yes. So I think, I think you all are hitting on the exact right type of questions. You know, we're talking about building a type of AI system that will work in the real world and so we need to have principled answers to these questions.

But one inconvenient fact about the law is that very few people agree on the underlying philosophy of law. I think very few people even have a great definition of what the law is, much less what it requires in all circumstances, and I think it would be bad if we try to make this project hinge on having a philosophical account of both the law generally and the exact application of every single law to every imaginable circumstance in this brand new world of AI agents, right, and one that could command high enough consensus to, you know, pass in our very polarized world.

So I don't think that we are going to be able to have something completely theorized there. Instead, we say that we want to build toward something like a minimum viable Law-Following AI that preserves the status quo of the distribution of power in society, which is that there are limits to what people in the government and the military will do: if they see an act as sufficiently illegal, many of them will refuse to carry it out. And so we do have to come up with some kind of view about how the AI agents are supposed to reason about whether they are being asked to do something illegal and whether to carry it out.

I think the world of AI agents will actually allow for a lot of fascinating possible technical solutions to this. So you can imagine that you can get a hundred different legal opinions from a hundred different AI lawyers. Maybe each one of them is fine-tuned to the opinions of a different district court judge, so this one's pretending to be Judge X and this one's pretending to be Judge Y, and it aggregates all these views and comes to a view, in less than a minute, on whether the thing that it is trying to do would violate the law.

And you know, you can imagine some sort of decision procedure about how it's supposed to aggregate that information. But overall, I think we can start with the qualitative thing that we want, which is for it to refuse to take many illegal actions, possibly not all—and we can talk about that; I think expecting perfect obedience from these things is not realistic and it's not how humans work—but, you know, we want it to refuse to take sufficiently illegal actions.
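A minimal sketch of the kind of aggregation procedure described above might look like the following, assuming hypothetical judge-tuned models that each return an estimated probability that a proposed action is unlawful; the model names, the query function, and the refusal threshold are illustrative assumptions, not anything specified by the paper or the speakers.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LegalOpinion:
    judge_model: str   # hypothetical model fine-tuned on a particular judge's opinions
    p_unlawful: float  # that model's estimated probability the action violates the law

def should_refuse(
    action_description: str,
    judge_models: List[str],
    query: Callable[[str, str], float],  # hypothetical: (model_name, action) -> probability unlawful
    refusal_threshold: float = 0.5,      # illustrative threshold, not from the paper
) -> bool:
    """Return True if the agent should refuse the proposed action.

    Collects one opinion per hypothetical judge-tuned model and refuses
    when the median estimated probability of illegality crosses the threshold.
    """
    opinions = [
        LegalOpinion(judge_model=m, p_unlawful=query(m, action_description))
        for m in judge_models
    ]
    if not opinions:
        return True  # no legal opinions available: fail safe and refuse
    estimates = sorted(o.p_unlawful for o in opinions)
    n = len(estimates)
    if n % 2 == 1:
        median = estimates[n // 2]
    else:
        median = (estimates[n // 2 - 1] + estimates[n // 2]) / 2
    return median >= refusal_threshold

A real system would of course need a far more considered rule (how to weight jurisdictions, handle conflicting authority, and respond to stays and injunctions of the kind Renée raises above), but the basic shape is the same: collect many legal opinions, aggregate them, and refuse when the aggregate crosses a line.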

And then the question we need to figure out as a society is how we can build systems that roughly mirror how we expect law-abiding humans to behave, and how we can ensure that the systems that the government deploys roughly match the expectations that we have of civil servants and military officers: to be obedient to their principals, but also not to obey illegal orders.

You know, there's an even nerdier legal subject that Lawfare co-founder Jack Goldsmith is expert in, which is the authority of the president to interpret the law for the executive branch. So I think there's a very important discussion to be had about when the president interprets the law one way, and the Law-Following AI thinks it'll likely be interpreted a different way, and there's maybe case law, not exactly on point but very close to on point, a third way, and how it should resolve that.

So that's the type of question we'll need to figure out, but what we're really trying to do in this article is open a conversation into these questions rather than purporting to offer a holistic account, in part because a holistic account doesn't exist yet and it's unlikely to by the time we have AI agents, unless they figure it out for us, in which case that would be good.

Kevin Frazier: You, you haven't figured it all out yet, Cullen? I mean, come on, we need, we need a comprehensive solution now.

But I do want to applaud you for that humility and that invitation for more discussion on this topic. As a sort of baseline, I have to say I'm pretty attracted to this Law-Following AI concept because, for example, there was a recent Brookings report showing that the public is still quite skeptical of AI and in particular of AI use by the government.

And if you have just Wild West AI, AI agents with no sort of guardrails, we'll continue to see that public backlash and perhaps a reactionary response. Think of the potential of an AI agent that can, for example, proactively look and find, oh, hey, you're eligible for this benefit that you didn't realize, you should probably sign up. That sounds super positive. But if we don't have something like Law-Following AI that gives the public the assurance that this can be used in a reliable and known fashion, then we may just miss out on those outcomes.

So, you've also shown humility by sharing this paper and this idea with a lot of folks, which is always just a gut-wrenching moment of, oh gosh, what are my colleagues going to say? What's the most compelling piece of critique you've received? What gave you pause? Whose feedback made you say, oh shoot, now I have to rewrite that entire section, or now I have to plan this entirely new paper? What are some of those counterarguments?

Cullen O’Keefe: Yeah, I mean, I think Renée, in the earlier conversation we had, really did hit on one of them, which is: what exactly is the task that we're asking the Law-Following AIs to do when they're trying to execute on this duty that we give them to obey the law, to obey their principals, but only within legal bounds?

So just for example, you know, one popular theory of what the law is, is often associated with Oliver Wendell Holmes, Jr.

It's called the prediction theory of law, where law is a prediction of what courts are likely to decide, and the intuition behind this is that that's ultimately what most citizens care about when they ask a lawyer for advice. They don't really care about, you know, what is the true nature of the law; they say, if I am sued or prosecuted over this, am I likely to win or lose my case?

Yeah, so that might be an intuitive place that you can start, but it's actually probably a lot more complicated than that, in part because there are all these duties that the law imposes on the executive branch in particular that are not likely to be litigated. And so I think figuring out what exactly we want the AI to do, and how we would even generate a training signal to train it on that task, what the ground truth would be, seems pretty hard to me.

I do also, Kevin, to your general point, worry about tradeoffs, about putting up barriers to adopting AI agents in government. So even though this is a piece that's quite focused on the risks of AI agents in government, I do actually think that having a lot of government automated in the medium run could be a pretty big thing.

You know, I think part of the reason that people have a lot of distrust of government is they see all of the ways in which it fails, and I think a lot of the ways in which it fails are things AI could solve. So, you know, if you look at processing times for green cards or passports, that's something that I think AI agents in charge of issuing those documents could dramatically reduce the backlogs of. AI agents could help you file your taxes—imagine if the IRS just gave you an AI agent that you just dump all your documents into and it files them for free. You know, this is a thing most countries have that the U.S. doesn't have. So if we had something like that, I think that could be quite important.

And also, it has to be said that there will be a lot of pressure to adopt these AI agents throughout the national security complex, even if we have a lot of worries about it. If our major adversaries are able to operate at the speed of AI, augmented by their own AI agents, the idea that the national security apparatus is going to sit back and let us iron out the civil liberties and civil rights concerns before automating in kind—maybe that would be the right response ethically, but I think if we look at the way that these types of competitions tend to play out, I wouldn't bet all my money on it. So I would rather, you know, invest in a parallel stream of work.

You know, we can and should debate when it's appropriate to automate the government. Overall, I think there should be lots of parts of the government that are eventually automated, but I think our North Star should be that we can guarantee it will not disturb the balance of power in society, which currently relies on the fact that if you had an aspiring tyrant, large parts of the civil service and the military would refuse to carry out illegal orders, and that they have some common regard for their citizens, and that they, you know, swore an oath to the Constitution that they take seriously. And until we have AI agents that we can expect to live up to that same standard, we should be pretty hesitant to replace these hard-power functions of the government with AI agents.

Renée DiResta: I think that's a really important point. I think that is something that, you know, you would do well to write a whole lot more op-eds on, just to begin making that point in a lot of places, the importance of the civil liberties argument, because I think, you know, there are going to be henchmen. This is the argument that came up quite a bit as I was reading the paper.

I was also thinking a lot about the discussion between closed-source versus open-source models with regard to some of the safety conversations that happened early on, regarding the generation of obscene and illegal content, for example. You remember, I'm sure, those conversations as that began to become a thing that we saw, and that recognition that henchmen will exist. And so this challenge of ensuring that government AI does not become government AI henchmen is a separate problem from the notion that there will be no henchmen, which is an unrealistic state, right?

And so I think that the argument you're making here is very much related to the ethics that we want in our government, the civil liberties protections that we want for our citizens, recognizing at the same time that the dynamic of finding and stopping henchmen is a separate and distinct challenge.

Cullen O’Keefe: Absolutely. And I, you know, that's in part why we put such an emphasis on the governmental case as the, the thing that we care the most about.

You know, I think there is a conversation to be had about whether private citizens should be allowed to procure AI henchmen. It's not obvious to my personal politics that that is a huge liberty interest that people have, but I respect people who have a more techno-libertarian bent on that. And to their credit, there are all types of other ways we can deal with the problem of henchmen existing in the private context. We have law enforcement, we have civil suits for damages, we have self-help, we have, you know, personal cybersecurity and cyber hygiene and whatnot.

And I think society can learn to deal with a lot of those situations as long as we're imagining the systems staying around the human level—I think if you get to vastly superhuman, there's a whole different set of challenges there—because we have learned to deal with, you know, bad people in society, and nowhere has zero crime, and states make different trade-offs on crime versus liberty.

But the remedies for a government run by AI henchmen are far fewer and may only be revolt at that point, and so I think we really don't want to get to that point. And you know, the majesty of the American legal system is coming up with a system in which, to quote some of the famous words of our founding, the law is king in America, and everyone in the government has a duty to uphold the law and follow the law.

We see Law-Following AI as a way of evolving that vision for a world in which most of the government's functions are being carried out by AI agents rather than humans, and the different dynamics that will entail. It will make it easier in some ways to make sure that government agents obey the law, because it's not ethically acceptable to do the sort of brain surgery on humans that would be required to make them perfectly obedient to the law, and no one thinks that would be a good idea. But it will also be harder because, conversely, very few humans are perfectly obedient to their principals; they have their own innate sense of morality, and it's an open design question whether AI will have that same respect for the law.

Kevin Frazier: And this is just such a timely paper. And folks, there's going to be a version of this paper on Lawfare soon in an essay format, and then, as a sort of extra credit assignment, as Renée mentioned, you can read the beautiful, lengthy manuscript. This couldn't be more timely with the recent OMB Memo M-25-21 basically setting up what I've referred to as government by AI. We're already seeing the barriers to integration of AI come down, and as soon as AI agents are available, we could see a pretty rapid uptake by the government in some critical functions, and so having this conversation now is critically important.

And before we let you go start working on your next paper—which I expect in approximately eight weeks, so get on it—what else have we missed? Have we missed anything? Do you have any other big takeaways from your paper before we send you on your merry way?

Cullen O’Keefe: Like I said, I think the main thing we want to do is start a conversation and grow a field of people interested in this question. You know, under my projections about the future, I do think there will be a time when the government is very tempted to automate large fractions of itself, and I think that this is a fascinating set of legal questions that any aspiring law professor should seriously consider working on.

And so, if that sounds like you, if you're interested in answering this question of what it means for an AI to obey the law—you know, which set of laws they should obey is a whole separate question; it probably doesn't need to be every law. How rigorously should they obey the law? Again, we don't think it's a great scandal if someone jaywalks, and similarly there's probably some degree of legal risk that AI agents are allowed to take.

So we have a list of these questions in the long-form article and also a shorter list in the short-form article. So, if you're interested in these questions, I hope you'll reach out and join what we're hoping to build, which is a research community and field really interested in preparing the world for the economy and government of AI agents.

Kevin Frazier: Well, we always love some homework with some additional questions for all those listeners out there. Thanks to Renée, thank you to Cullen for joining, and we'll have to leave it there.

Renée DiResta: Thank you so much.

Cullen O’Keefe: Thanks so much, Kevin. Thanks so much, Renée.

Kevin Frazier: The Lawfare Podcast is produced in cooperation with the Brookings Institution. You can get ad-free versions of this and other Lawfare podcasts by becoming a Lawfare material supporter at our website, lawfaremedia.org/support. You'll also get access to special events and other content available only to our supporters.

Please rate and review us wherever you get your podcasts. Look for our other podcasts, including Rational Security, Allies, The Aftermath, and Escalation, our latest Lawfare Presents podcast series about the war in Ukraine.

Check out our written work at lawfaremedia.org. The podcast is edited by Jen Patja. Our theme song is from Alibi Music. As always, thank you for listening.


Kevin Frazier is an AI Innovation and Law Fellow at UT Austin School of Law and Senior Editor at Lawfare.
Renée DiResta is an Associate Research Professor at the McCourt School of Public Policy at Georgetown. She is a contributing editor at Lawfare.
Cullen O'Keefe is the Director of Research at the Institute for Law & AI (LawAI) and a Research Affiliate at the Centre for the Governance of AI. Cullen's research focuses on legal and policy issues arising from general-purpose AI systems, with a focus on risks to public safety, global security, and rule of law. Prior to joining LawAI, he worked in various policy and legal roles at OpenAI over 4.5 years.
Jen Patja is the editor of the Lawfare Podcast and Rational Security, and serves as Lawfare’s Director of Audience Engagement. Previously, she was Co-Executive Director of Virginia Civics and Deputy Director of the Center for the Constitution at James Madison's Montpelier, where she worked to deepen public understanding of constitutional democracy and inspire meaningful civic participation.
