The Psychology of a New Obedience Paradigm
A review of Emilie A. Caspar, “Just Following Orders: Atrocities and the Brain Science of Obedience” (Cambridge University Press, 2024).

Published by The Lawfare Institute
Why do individuals obey commands to inflict terrible pain or even kill other people, sometimes people they may know personally? Yale psychologist Stanley Milgram’s famous and controversial studies in the 1960s suggested the answer lay in “agentic shift,” a transfer of responsibility from the individual inflicting pain to the person or persons ordering the pain. In Milgram’s studies, the experimenter was usually the authority figure ordering naïve participants to raise the shock level delivered to a supposed learner with every mistake in a supposed learning experiment.
Most psychologists never took Milgram’s agentic shift theory seriously. Milgram had no independent evidence of a perceived shift in responsibility. And his description of the shift focused on the status and authority of the experimenter. Shift theory was thus a form of situational explanation—a situation in which a high-status scientist recruited participants and told them what to do. This theory could not explain the behavior of participants who, in the same situation, refused to obey the experimenter.
Now Emilie Caspar’s book resurrects Milgram’s agentic shift theory of obedience. Combining the results of laboratory studies with interviews with participants in the mass killings in Cambodia in the 1970s and Rwanda in 1994, Caspar argues that we should believe perpetrators who say they were just following orders. Her interviewees often explain themselves with versions of this claim. Participants in her new shock-giving obedience experiments often say the same. And brain recordings from these participants, based on functional magnetic resonance imaging (fMRI) studies, show different patterns for ordered and for free-choice shock giving, suggesting that participants are in a different state when acting under orders.
Chapter 2 reviews experimental research on obedience, with special attention to Milgram’s famous studies of obedience. Caspar represents Milgram’s results only for the “seminal study” of 40 participants that had 65 percent administering the maximum (supposed) shock level of 450 volts. She does not mention the other 20 experimental variations carried out by Milgram and evaluated by Haslam, Loughnan, and Perry (2014) in their meta-analysis of Milgram’s work, nor their conclusion that the relationship between teacher and learner is about as important as the relationship between teacher and experimenter that Milgram focused on.
After considering some flaws of previous obedience research, Caspar then introduces her new paradigm for studying obedience. Two participants, strangers, are recruited. They sit facing one another across a table. The experimenter determines a pain threshold for electric shock for each participant (separately), and both have a hand wired for connection with a shock generator. One participant is randomly assigned as agent, the other as victim; these roles are reversed halfway through the experiment.
In one condition, the agent alone decides in each of 60 trials whether to press a button that produces a pain-threshold shock and an accompanying tone, and earns the agent 0.05 British pounds. The alternative is a button that produces neither shock nor money. An experimenter is seated beside the table looking away from participants.
In the other condition, an experimenter, standing by the table and looking directly at the agent, orders the agent to deliver shock or no-shock for 60 trials. The experimenter usually orders 30 shocks and 30 no-shocks. The author notes that there is no deception in this procedure; the shocks are real and the whole procedure is explained in advance.
Chapter 3 introduces a measure of perceived agency used with the shock paradigm. There is evidence that the perceived interval between action and effect of action is shorter for intentional actions than for passive actions—a finger moved intentionally versus mechanically, for instance. This result leads to the idea that the perceived interval between button press and shock/tone would be shorter for freely chosen than for ordered button presses. And so it turned out in Caspar’s studies. The interpretation Caspar gives to the interval difference is that it measures the participant’s sense of agency or responsibility for a button press—more agency and more responsibility for freely chosen presses than for ordered presses.
This is the sense in which Caspar claims that ordered actions are less agentic than freely chosen actions. The author’s research does not include a measure of experimenter responsibility but does include post-experimental questionnaire measures of agent responsibility. The questionnaire results show what the time estimate shows: Agents feel less responsible for ordered than for free-choice button presses.
The interval difference is an implicit measure of agency but not a brain measure. Caspar uses fMRI or electroencephalogram (EEG) measures to show different patterns of brain activity for freely chosen versus ordered shock giving. This review will not attempt to summarize the chapters reporting these measures, which look at empathy and guilt associated with giving shocks, at agency and guilt of those giving or transmitting orders to shock, and at efforts to reduce obedience. Instead, the review will focus on questions about the new obedience paradigm that is the foundation of the brain measures: These measures are recorded while participants are pressing buttons under free-choice or ordered conditions.
Social psychologists know that human participants in a laboratory experiment can be affected by every aspect of the procedure, starting with recruitment. The effects of cues conveyed by the procedure and by the experimenter’s behavior have been studied as “demand effects.” Participants try to make sense of the procedure they experience. What is the experiment about? What does the experimenter want the participant to do? How will the experimenter evaluate the participant? Participants’ answers to these questions affect their behavior in an experiment.
Unfortunately, the procedure of the agent/victim shock paradigm used by Caspar is not entirely clear in the book, nor in the first publication of research using the paradigm. More detail can be found in the online supplementary materials of the 2016 publication, but several questions remain. How were participants recruited? What were participants told about the study they were asked to volunteer for? What payment was advertised for participating? Beyond gender, what were the demographics of participants: age, education, employment? The female experimenter chose to invite only female participants; do male participants produce similar results? What was the exact wording of the instructions given to participants on arrival at the experiment? What was the procedure for measuring the pain threshold? (Supplemental materials for the 2016 publication include the following: “The shock caused a twitch of the ‘victim’s’ hand that was readily visible to the agent.” Was a twitch the criterion of painful shock?) What was the pattern of the experimenter’s orders to shock and not shock (random? alternating?), and was it the same pattern for both participants?
Some of the results also are not clear. What were the mean and standard deviation of the number of shocks given in the free-choice conditions? Some conditions had one experimenter in the room, some had two; was there any difference in results? Some conditions had an experimenter who ordered 30 shocks in 60 trials; some had an experimenter who ordered 50 shocks in 60 trials. Was there any difference in results? Victim and agent switched roles halfway through the experiment: Was there a difference in results (order effect) for the first half versus the second half? First-half results had random assignment to conditions, clean of possible order effects: What were the results of first-half conditions? What was the correlation across participants between number of shocks given in the various conditions and the questionnaire measures (personality, importance of money, liberal/conservative, empathy, responsibility)?
Questions about the methods and results of the new obedience measure are questions of construct validity: What are free-choice versus ordered shocks measuring? In a final chapter, Caspar reports great difficulty in getting disobedience up to 30 percent, even after removing the monetary incentive. It seems possible that an important motivation for obedience for the all-female participants was helping a young female experimenter get the results she sought—a kind of demand effect.
The implicit measure of agency indicates that agency is reduced for ordered (versus free-choice) button presses—both presses for shock and presses for no-shock. The implicit measure indicates that participants are not lying when, on questionnaire measures, they claim less responsibility for ordered button presses.
But what are we to make of the fMRI and EEG results that show different patterns for free-choice and ordered button presses? In one sense this is not surprising: If the brain is responding to different situations, there must be some difference in brain activity. So, what does it mean that brain differences are correlated with questionnaire differences? Does it mean that participants are telling the truth on the questionnaire? If they were lying about responsibility in the ordered-choices condition, might brain recordings nevertheless show differences between the ordered and free-choice (no lying) conditions?
Finally, it is necessary to ask what the agent/victim shock paradigm can tell us about the importance of obedience in mass political murder. An odd aspect of the paradigm is that each participant experiences both giving and receiving shocks. Giving shocks may be justified by receiving them; indeed, the 2016 publication reports that victim-first participants who receive more shocks later give more shocks as agent. Where is the parallel to this reciprocity in the politicide literature that the research aims to illuminate?
In a larger perspective, Caspar recognizes that obedience to authority is only one of the ingredients that contribute to mass atrocities. In-group love, out-group hate, mass opinion and perceptions of mass opinion, small-group dynamics, fear for self and in-group—all of these come together in mass political murder.
To sum up, the implicit measure of agency—estimation of the interval between button press and tone—supports the idea that individuals feel less responsibility for ordered than for free-choice shocks. But interpretation of the number of ordered shocks administered versus disobeyed remains opaque, as does interpretation of the number of free-choice shocks. There is no measure of perceived experimenter responsibility, that is, no test of Milgram’s agentic shift hypothesis.
There is also a moral issue to be considered. Caspar addresses this in her introduction: “[A] crucial aspect of conducting this research is reiterating again and again that uncovering the neural mechanisms explaining how people can commit atrocious acts out of obedience does not offer an excuse or an escape door for people trying to justify their actions.” It seems likely that not everyone will agree with this assertion. If neuroscience can get beyond bias and defensiveness to “explain” behavior, perhaps perpetrators are at the mercy of unconscious brain activity and free will is an illusion?