The ongoing resurgence of interest in Stanley Milgram’s classic and controversial experiments of 1961–1962 has created opportunities for “obedience to authority” theories to better integrate with recent archival findings about participants’ perspectives.
1 In one of social psychology’s most influential projects, Milgram asked why ordinary people will obey authority when commands conflict with conscience. No less relevant today than 50 years ago, his question about human social behavior is indeed perennial. To address it scientifically, his experiments had an authority (confederate “Experimenter”) direct a research subject (“Teacher”) to deliver putative electroshocks to a peer (confederate “Learner”). Operationalizing hundreds of Teacher-participants’ responses to a situation of constrained choice as “obedient” or “defiant,”
Milgram (1963,
1974) claimed to show that a battery of experimentally controlled situational variables can cause dramatic change in obedience rates. Recent scholarship has opened the “black box” of the Yale Milgram archive as never before, generating new understandings of the experiment’s sociohistorical context (
Gibson 2013;
Hollander and Maynard 2016;
Russell 2014) and new assessments of Milgram’s relevance to the social psychology of genocide (
Brannigan, Nicholson, and Cherry 2015;
Staub 2014). It has also produced new appraisals of the Milgram ethics controversy (
Elms 2014;
Perry 2012) and new theories of participant behavior (
Burger 2014;
Reicher and Haslam 2011;
Russell and Gregory 2011).
In light of burgeoning interest in participants’ perspectives (
Haslam et al. 2015), it is surprising that virtually no studies exist of the debriefing interviews Milgram conducted immediately after each experiment. Gina
Perry’s (2012,
2013) archival research is a notable exception, and for this reason we repeatedly refer to it below (see also
Gibson et al. forthcoming). Whereas commentators have traditionally accepted
Milgram’s (1974) account of debriefing as thoughtful and effective (
Blass 2004), Perry argues that debriefing was, in fact, quite insensitive and primarily designed to offer justifications for helping to harm an innocent person—an aim she presents as thoroughly unethical (cf.
Baumrind 1964). Moreover, even when Milgram began, in spring 1962, to provide a full debriefing featuring the crucial revelation that the Learner did not receive any shocks, this amounted, Perry claims, to a one-sided
monologue… [not] invit[ing] questions or discussion with the subject. … the pattern of debriefing across obedient and disobedient subjects from condition 20 onward is the same. Williams [Experimenter] followed a standard script for the debriefing that is independent of subject reactions. Typically it involved a minute and a half of delivery, the introduction of McDonough [Learner], and a handshake, before the subject was shown the door. (
Perry 2013:85; cf.
Perry 2012:84–90)
A contrasting line of Milgram investigations comes from Alexander Haslam and Stephen Reicher.
Haslam and colleagues (2015), for example, provide an incisive analysis of box 44 of the Yale archive, consisting of subjects’ generally positive responses to questionnaires Milgram mailed several weeks or months after their participation. These authors see such data as evidence supporting engaged followership, a social identity theory widely accepted at present as explaining “obedience” (
Haslam, Reicher, and Birney 2014;
Reicher, Haslam, and Smith 2012). “People are able to inflict harm on others not because they are unaware that they are doing wrong, but rather because—as engaged followers—they know full well what they are doing and believe it to be right” (
Haslam et al. 2015:25). Such engagement involves “identification with a ‘noble’ cause” such as “science as a source of human progress and welfare” (
Haslam et al. 2015:25, 23); furthermore, “this is a cause in whose name they perceive themselves to be acting virtuously and to be doing good” (
Haslam and Reicher 2017:72). In Milgram’s lab, engagement was behavior in accordance with the view that, although compliance may have been difficult, the experiment was scientifically or practically important, worthwhile, or valuable. Participants’ compliance “deriv[ed] primarily from their identification with his [Milgram’s] scientific mission and from their sense of the experiment as emblematic of that mission” (
Haslam and Reicher 2017:68).
Using new archival evidence, we substantially revise Perry’s findings on the debriefing interviews, doing so in order to challenge engaged followership as a uniquely adequate model for explaining behavior in Milgram’s lab. Insofar as participants’ perspectives can elucidate their actions, better understanding of how they voiced these during the interview is essential for appraisal of contending Milgram theories. “Obedience” remains a puzzle, and our article’s title alludes to a key clue hitherto overlooked by the Milgram literature – participants’
inconsistent assessments of the experiment: thus, whereas “obedient” participant 2017 initially evaluated the experiment negatively (“I think it’s ridiculous”), she later assessed her participation positively, after full debriefing (“I’m glad to have helped you”).
2 If Milgram’s full debriefing procedure were really as perfunctory and superficial as Perry claims, why did other subjects display a similarly radical conversion? Why did many express “distress and anger” (
Perry 2013:83) during and immediately after the experiment, only to later—upon leaving the lab or responding weeks or months afterward to the questionnaire—declare themselves “happy to have been of service” (
Haslam et al. 2015)?
3 In addition to obedient participants, why did
defiant ones improve their assessments? In short,
Perry’s (2013) and
Haslam et al.’s (2015) findings contradict each other: whereas Perry finds many angry and disillusioned participants immediately following the experiment (suggesting they did not identify with it), Haslam et al. report that participants generally evaluated the experiment positively (suggesting they
did identify with it). This article explains the tension by arguing that the interview could improve initially ambivalent or negative assessments, especially when it featured the full version of debriefing. At stake in this matter is a convincing theory of “obedience,” a major goal of research and commentary in the Milgram paradigm for more than 50 years.
This article contributes to and links three distinct areas of social psychological interest.
First, Milgram and “obedience to authority.” Like the aforementioned authors, we use the Yale archive to revisit Milgram’s classic experiments. This article responds to
Elms’s (2014:3) dismissal of Perry’s research as mere “Milgram-bashing” that gives “little consideration to the other side of the ethical balance: that is, what does Milgram teach us about how to understand the evil in the world?” In connecting Perry’s important archival work with Milgram’s concern to explain “obedience,” we contribute to a revived theoretical debate about the meaning of his experiments.
Second, public attitudes toward science. We show how subjects in a classic social psychology experiment could react to participation in inconsistent ways, with debriefing improving assessments. We discuss implications of these findings for the engaged followership model of scientific authority and thus contribute to studies of public attitudes toward, understanding of, and trust in, science (
Gauchat 2011;
Miller 2004).
Third, conversation analysis and discursive psychology. We provide evidence that social psychological research activities such as experimental debriefing are interactional accomplishments of detailed, self-organizing practices. Like survey interviewing (
Maynard and Schaeffer 2000) and informed consent acquisition (
Speer and Stokoe 2012), scripted “dehoaxing” (Milgram’s term, found in
Perry 2012:73) is necessarily a collaborative achievement, even in situations of unequal authority, expertise, and power (
Gibson 2013). We likewise contribute to these research areas by analyzing participants’ perspectives in terms of publicly available practices of social interaction rather than as “internal” cognitive-emotional entities: as “evaluations” and “assessments” (
Coulter 1989;
Potter 1998) rather than “attitudes” (
Hitlin and Pinkston 2013).
In sum, this article uses conversation analysis to examine news delivery in Milgram’s full debriefing. We show how the interview could improve assessments of the experiment and discuss implications for engaged followership and public attitudes toward science. We first present our data and methods in greater detail and provide more background about Milgram’s two types of debriefing procedures: deceptive versus full. We then examine the coproduction of news delivery sequences (NDSs) in full debriefing, analyzing these detailed conversational structures in terms of their constituent parts.
Data and Methods
Our data are 56 audio recordings from the Stanley Milgram Archive maintained by the Manuscripts and Archives Department at Yale University Library. The recordings preserve the pre-experimental discussion, experiment, and post-experimental interview for 56 subjects from three experimental conditions, all receiving full debriefing (see Table 1).
The larger project in which this article originated involved the purchase of 117 recordings—from conditions 2 (“Voice-feedback”), 3 (“Proximity”), 20 (“Women as Subjects”), 23 (“Bridgeport”), and 24 (“Relationship”)—that had received prior digitization and sanitization (erasure of participants’ names:
Kaplan 1996).
4 Selection was haphazard rather than random due to the archive’s incompleteness, the relative availability of digitized and sanitized recordings in each condition, and a limited budget. Since debriefing was deceptive rather than full in conditions 2 and 3, we drew the 56 recordings analyzed in this article solely from conditions 20, 23, and 24.
To study Milgram’s debriefing procedures using conversation analysis, we first listened to the entire experimental performance and interview of all 117 participants. During this process, we took detailed notes on the interviews, paying particular attention to full debriefings in conditions 20, 23, and 24. We then listened again to the entire interview for the 56 full-debrief participants, edited our notes, transcribed all 56 full debriefings, and wrote detailed analyses of the transcripts. Finally, we listened once again to all 117 debriefings and participation assessments (see component 9,
Figure 1 below), noting Experimenter practices that elicited improvements on initial assessments and comparing the two debriefing procedures. Overall, our conversation-analytic strategy was, first, to assemble as large a collection as possible of instances of the same interactional phenomenon (news delivery in Milgram’s full debriefing) and, second, to study the collection to gain insight into the phenomenon’s general features (
Sacks 1992).
Milgram’s Two Debriefing Procedures: Deceptive versus Full
As noted,
Perry (2012,
2013) has shed important new light on Milgram’s debriefing procedures. However, missing is a detailed contextualization of both types of debriefing within the interview. Here, we show how our archival research significantly revises Perry’s findings on Milgram’s debriefing practices. This section also provides necessary context to our later analysis of full debriefing and discussion of engaged followership.
Figure 1 illustrates the positioning of both debriefing procedures with respect to the entire interview, conducted by the Experimenter immediately after the experiment.
5
Although interviews were highly structured, the Experimenter allowed participants ample time to develop responses. Interviews therefore ranged in length, with many reaching more than 20 minutes.
6 The main differences between deceptive-debrief conditions 2 and 3 versus full-debrief conditions 20, 23, and 24 are that, by spring 1962, components 3 and 8 had disappeared, 12 replaced 11, and 9 followed 12. Moreover, the Experimenter performed full debriefing, like debriefing overall, in two ways: either
more extensively (condition 20) or
less extensively (conditions 23 and 24). As with the introduction of full debriefing itself, the difference seems primarily due to Milgram’s concern to ensure that female participants (condition 20) left his lab relaxed.
7 Perry (2013:84) argues that a contributing factor was mounting ethical criticism at his funding agency (National Science Foundation) and among psychologist peers. Whereas the
more-extensive full debriefings featured the perspective display sequence (PDS) and the panoply of other “reconciliation practices” discussed further in this article, the
less-extensive later ones lacked this sequence, featured fewer reconciliation practices, and took less time.
Deceptive debriefing was conducted as follows. Teachers completed a demographic questionnaire (component 10) while the Experimenter retrieved the Learner from an adjacent room. The Experimenter then performed a script indicating the Learner’s “problem was that [he] was too nervous. … This machine has previously only been used to shock animals like mice and small rats. … Learner, the amperage is adjusted so the shocks you received were only slightly more painful than the sample shock.” Following conciliatory scripted talk by the Learner showing him unharmed and unresentful (“If I’d been the Teacher, I would have done the same thing. … No harm done.”), the Experimenter confirmed the Teacher had his check for $4.50 and led him outside.
By contrast, full debriefing differed in several key ways. Unlike the deceptive procedure, which typically lasted two minutes or less, full debriefing took longer, especially the more-extensive version in condition 20 (see endnote 6). After demographic component 10, the Experimenter started removing central planks of the cover story: the Learner didn’t receive shocks; he was actually working together with the Experimenter; Teacher behavior was the focus; the experiment was about obedience to authority, not punishment’s effect on memory and learning. Then, the Experimenter called in the Learner to meet the Teacher “under different circumstances.” Also, as noted, whereas participation assessment (component 9) had previously appeared
before deceptive debriefing (component 11), it was now performed
after full debriefing (component 12), immediately prior to the Teacher’s exit. Despite these differences, however, the second debriefing procedure was only relatively “full”: Milgram refrained from total disclosure in all conditions and even in the summer 1962 questionnaire, mailed after all experiments had ended (
Perry 2013).
Though contrasting, the deceptive-debrief and full-debrief interviews both dealt with two issues of moral import highly relevant to participants’ assessments. First, the interviews addressed responsibility, or the problem of assigning blame for the Learner’s receiving shocks against his will. For instance, during full debriefing the Experimenter typically highlighted “obedient” participants’ resistance (“I had the feeling you were reluctant, that you didn’t really wanna do this.”). By mentioning their lack of enthusiasm and/or active resistance, he invited them to present themselves as having been concerned “all along” about the Learner and as less culpable than the Experimenter himself. Second, the interviews emphasized reconciliation among the three parties. In both types of debriefing the Learner reappeared, claiming responsibility for the problems with continuation (e.g., overreacting to the shocks: “I was nervous.”) and professing a lack of resentment (“No hard feelings.”). As noted, Milgram especially emphasized reconciliation in condition 20 due to his concern about “panicked” women. As shown in the next section, a crucial feature of full debriefing was the Experimenter’s arsenal of what we call “reconciliation practices” (e.g., “Do you feel better now?”). In sum, by addressing issues of responsibility and reconciliation, the interviews could improve participants’ initial evaluations of the experiment, especially in full debriefing. Inconsistent assessments, in turn, are directly relevant to theories of behavior in Milgram’s lab such as engaged followership, as discussed later.
Full-debriefing News Delivery in Condition 20
Considerable interactional organization of news delivery characterizes Milgram’s full debriefing. In condition 20, the Experimenter delivers scripted news using what
Maynard (1989,
1997,
2003) calls the (1)
perspective display sequence and (2)
news delivery sequence. This section uses conversation analysis to examine the recurrent and orderly practices of talk and social interaction (
Schegloff 2007) by which the Experimenter elicits Teachers’ assessments of the experiment (PDS) and tells them its true purpose (NDS).
8 Although
Perry (2013:84) presents a partial interview transcript of “defiant” subject 2316, claiming the example is “representative” of full debriefing, missing is discussion of the more extensive version in condition 20. A more complete analysis would also articulate its sequential structure, which participants co-produce (notwithstanding potentially relevant inequalities in authority, expertise, and power between Experimenter and Teacher). The picture that emerges of Milgram’s full-debriefing procedure differs in fundamental respects from that painted by Perry.
Perspective Display Sequence: “What Did You Think of the Experiment?”
In condition 20, the Experimenter typically delivers debriefing news only after first soliciting Teachers’ evaluations of the experiment (32 of 37 interviews, or 86 percent).
9 The PDS exemplifies Milgram’s attempt to show concern for female participants and takes the form of a question about their thoughts or feelings. In institutional talk in medical clinics and in ordinary conversation (
Maynard 2003), the sequence consists of three components: (1) a question or statement projecting (2) an evaluative response, followed by (3) a second evaluation calibrated by the first speaker to the prior evaluation. The PDS thus allows its initiator to fit subsequent talk to the recipient’s stance on a given topic. However, as shown below (e.g., extract 1), in Milgram’s lab the PDS unfolded somewhat differently than in clinical and everyday contexts: First, the Experimenter typically uses the PDS’s third slot to launch the first turn (news announcement) of NDS rather than to offer a second evaluation. Second, the design of this announcement-turn does not vary dramatically in light of the recipient’s evaluation; the Experimenter’s news delivery is fairly standardized (“This man was not being shocked.”). Such differences aside, the PDS in Milgram’s lab is important because it embodies the Experimenter’s orienting to the Teacher’s perspective (
pace Perry 2013), and because it often elicits an evaluation
before the Teacher is debriefed. We can usefully compare the initial assessments with those provided
during and after debriefing in order to study the impact of debriefing on assessments of the experiment.
Table 2 provides the number of participant PDS assessments cross-tabulated by valence and outcome.
The pattern of “obedient” evaluations does not greatly differ from that of “defiant” ones. Of 20 “obedient” participants, only 3 respond positively (e.g., participant 2004 assesses the experiment as “wonderful”). More frequent are ambivalent or plainly negative assessments (12/20, or 60 percent). For instance, “obedient” participant 2034, after 1.8 seconds of silence, hesitates (“Uh::”), falls silent for 0.4 seconds, and evaluates ambivalently with “It’s not what I anticipated.” After similar delay, “obedient” participant 2035 evaluates negatively with, “I thought it was very cruel.” Also noteworthy is that 25 percent (5/20) of “obedient” participants withhold explicit assessment (e.g., with silence or digression). These results, consistent with Perry’s account of widespread “distress and anger” (2013:83), contrast sharply with
Haslam and colleagues’ (2015) questionnaire-based finding of general “happ[iness] to have been of service.”
Although frequency counts are useful summaries, they omit interactional features that can tell us much about the detailed organization of participants’ responses. Extract 1 contextualizes a “defiant” Teacher’s ambivalent assessment (bolded).
At line 1, the Experimenter initiates a PDS. The “unmarked” turn does not project a positive or negative answer but rather leaves open the valence of the response (
Maynard 1991:170). However, the Teacher’s “well” preface (line 3) signals a non-straightforward reply (
Schegloff and Lerner 2009), and she then characterizes herself as having lacked prior knowledge about the experiment. Interpolated particles of aspiration (line 3: symbolized with (h)) can mark “some descriptive problem or insufficiency with a single lexical item” and “head off a problematic incipient action” by the speaker (
Potter and Hepburn 2010:1552). Moreover, they display “sensitivity to how the recipient will understand the action,” thereby softening it (
Potter and Hepburn 2010:1552). Here, they appear to soften a hearing of the response as negatively assessing the experiment, and the Teacher’s post-turn-completion laugh particle (line 5) further contributes to this action (
Shaw, Potter, and Hepburn 2013), though it may also perform what
Jefferson (1984) calls “troubles resistance.” The Experimenter does not immediately respond, reissuing his question at line 7. He thereby marks the insufficiency of the prior response and pursues a more adequate one. This occasions a dispreferred response: following delay/hesitation + well + I don’t know (lines 10–11), the Teacher surmises that the Learner “probably” thought he should have “listened … more” (lines 13–15). But she then immediately accounts for his performance (“=But it’s kind of difficult …”), which is hearable as a critique of and/or complaint about the experiment (lines 15 and 17). Moreover, in her next turn (“I don’t think …,” lines 21–22) she locates the experiment, rather than Learner, as the source of the problem. Thus, though not providing a straightforwardly negative assessment, this participant displays misgivings that amount to an ambivalent response. But note also the Experimenter’s response: rather than expand the PDS with a second assessment, he instead produces an acknowledgment token (“Yeah,” line 19) and moves into a news announcement (lines 24–25, 27–28). Indeed, the Experimenter rarely expands the PDS by reacting to the Teacher’s assessment, and even in such cases (see extract 2), the design of the news announcement itself is largely unchanged.
Whereas subject 2005 responds ambivalently to the PDS, some participants respond more assertively. In so doing, they may negatively evaluate the experiment by reporting what they thought or felt while participating, thereby invoking mental states (
Edwards and Potter 2005).
In extract 2, following initiation of a PDS (line 1), the Teacher assesses the experiment as “very confusing” and “very
baffling” (line 3). The intensifier “very” upgrades these assessments, while her prosody further emphasizes them. Also, her criticism of the experiment uses mental-state avowals (confused, baffled) (
Edwards and Potter 2005) to portray its purpose as opaque. The Experimenter then prompts elaboration (“Oh really?”) at line 5, treating her assessments as accountable. As
Raymond and Stivers (2016) show, recipients of “known answer requests for confirmation” tend to orient to such requests as account solicitations, and the Teacher here responds by confirming and upgrading her stance (line 6) and elaborating on her bafflement (lines 6–7). The Teacher uses such practices to make her negative response even more forceful and explicit. Further, when the Experimenter expands the PDS by challenging the Teacher’s assessment (line 9), she responds with a surprise token (line 10)—perhaps suggesting the Experimenter’s position defies common sense (
Wilkinson and Kitzinger 2006:161)—and a negatively polarized question (lines 13 and 15; see
Heritage and Raymond 2012) that projects a “no” response. However, rather than justify his stance directly, the Experimenter supplies a preannouncement + news announcement (lines 16–17) that begins the full debriefing itself.
News Delivery Sequence
Having elicited participants’ evaluations of the experiment, the Experimenter begins a full debriefing with the NDS. This sequence appears in all 37 condition 20 interviews in our collection and consists of four components: (1) announcement, (2) reaction, (3) elaboration, and (4) assessment. As
Maynard (1997:117) notes, “the assessment turn may mark the completion of an NDS. However, following an initial assessment, a deliverer may produce further, embellishing elaborations that also receive evaluation.” This “expanded” (
Maynard 1997:123) NDS often occurs in our 56 full debriefings, with the Experimenter elaborating extensively on the news and projecting further assessments by the Teacher. Crucially, these later assessments (frequently positive) may
contrast with participants’ initial responses to the PDS (frequently ambivalent or negative), producing inconsistency across the interview as a whole. Although the object of the assessment may slide—for example, PDS assessment of the experiment versus later assessments of personal well-being and participation—it seems clear that full debriefing could transform participants’ initial “distress and anger” (
Perry 2013) into “happ[iness] to have been of service” (
Haslam et al. 2015).
News Announcement: “This Man Was Not Being Shocked.”
Extract 3 specifically illustrates how the Experimenter performs the first component of the NDS.
At line 1, the Experimenter does a preannouncement. He then informs the Teacher of what “we” were “interested in” (“your reactions”). His “as a matter of fact” works as an honesty phrase (
Edwards and Fasulo 2006), framing the forthcoming talk as sincere but contrary to expectation. Honesty phrases are characteristic of the Experimenter’s announcements; they indicate a discrepancy between the experiment’s hitherto asserted purpose—to study the effects of punishment on learning and memory—and its actual one. After elaborating (lines 4–5, 7–9), he announces that the Learner “was ↑not being shocked” (line 10). His prosodic emphasis on the negation term (“↑not”), honesty phrase (“as a matter of fact”), and contrast marker “actually” (
Clift 2001) all distinguish previously asserted appearance (the cover story) from emerging reality. Collectively, these practices work to change the definition of the situation heretofore presented to the Teacher.
Extract 3 also illustrates a recurrent feature of the full debriefings that contrasts with
Perry’s (2013:85) account of them as conducted “independent[ly] of subject reactions.” The Experimenter’s news delivery characteristically includes numerous turn-transition relevance places (
Sacks, Schegloff, and Jefferson 1974): for example, here at lines 3, 6, and 11. At such points, his turns of talk are grammatically and prosodically complete, making speaker transition relevant. Only when the Teacher does not assume speakership does the Experimenter continue. Indeed, he frequently pursues responses (
Pomerantz 1984) by reformulating the announcement (e.g., line 12: “He did not receive any shocks.”). Moreover, some of these pursuits are done in increments (e.g., lines 4 and 7) that, by extending the prior turn, obscure the absence of response while also renewing its relevance (
Ford, Fox, and Thompson 2002). Although Experimenter and Teacher may also orient to unequal expertise and power, it is apparent that the Experimenter regularly treats participants as co-producers of full-debriefing news.
News Reaction: “Oh, I Felt So Sorry for that Man.”
Participants typically react to the Experimenter’s announcement with initial surprise, followed by experiential reports. At minimum, participants’ reactions feature a change-of-state token (“Oh”) indicating the news was previously unknown to them (
Heritage 1984) and/or a repair initiation (“He wasn’t being shocked?”). Such practices can embody surprise, as can initial silence following the Experimenter’s announcement (
Wilkinson and Kitzinger 2006).
Extract 4 illustrates overt surprise. Following the Experimenter’s announcement (line 1), the Teacher initiates repair with a candidate understanding (line 3). The Experimenter rejects this understanding (“No”), and explains the Learner’s true role, along with his own (lines 4–5). As the Experimenter starts another turn (“He uh”), the Teacher responds with a surprise token (line 6). As
Wilkinson and Kitzinger (2006:161) observe, such tokens are “designed to appear as-if-visceral”; in producing them, “people confirm for each other a shared, taken-for-granted world defined by a set of norms, values, and expectations of which the ‘surprising’ behavior, event, or whatever constitutes a breach.” The Teacher’s reaction token marks the Experimenter’s news as a breach of expectations, as does the mild oath “Oh: golly” at line 10.
Similarly, other participants react with professions of “ritualized disbelief” (
Wilkinson and Kitzinger 2006). For example, “obedient” participant 2017 responds with “You’re kidding.” Several Teachers compare the situation to “Candid Camera” (2011, “obedient”: “This is like <$Candid [↑Camera n(h)ow:!$>”), while others “flood out” (
Goffman 1974) with unrestrained emotion displays. Thus, “defiant” participant 2008 reacts with prolonged laughter, then rhetorically exclaims, “I should have KNOW:N!” In such cases, participants exhibit an as-if-visceral reaction to the news. Practices such as repair and candidate understandings, by treating the announcement as unexpected, work to make sense of it and mark a transition between knowledge states.
Following initial surprise, many participants produce experiential reports: claims about what they had
thought or
felt during the experiment. The Experimenter, pursuing reconciliation, supportively aligns with these, by observing, for instance, that the participant was “quite nervous” (2026, “defiant”) or was “reluctant to … inflict pain” (2009, “obedient”). Teachers’ reports account for their behavior during the experiment, performing “counter-dispositional” work (
Edwards and Potter 2005) by displaying concern for the Learner’s well-being. By reporting subjective states such as thoughts, feelings, and inclinations, participants disavow cruelty or malice toward the Learner and present themselves as having been concerned “all along” for him. These reports also embody participants’ efforts to make sense of their earlier actions in light of their current awareness that the cover story was false. Extract 5 shows such experiential reporting.
The Experimenter has just announced that the Learner had not received any shocks. At line 1, he solicits a report of the Teacher’s emotional state (“Does this …”). She sighs in overlap with the question (line 2), then projects a response with an in-breath (line 4) that the Experimenter overlaps with an “uh” token (line 5). The Teacher’s report, with its intensifier “so” and idiomatic “I can’t tell you,” emphasizes how “sorry” she felt for the Learner. Here, she may be deflecting a possible judgment of her “obedience” as callous or cruel. Her claim to have felt concern counters that interpretation, making inferentially available (
Edwards and Potter 2005) to the Experimenter that she indeed pitied the Learner. At line 10, he prompts elaboration, but in a way that can be heard to question her sincerity (“Did you really?”), and the Teacher then accounts for her feelings. Using reported thought (
Haakana 2001), she emphasizes the apparent senselessness of her behavior during the experiment (“for no reason at all”). Additionally, her repeated “just” (lines 12 and 14) places a maximal boundary around its significance: she was doing “nothing more” than hitting him for no good reason (
Drew 1992;
Turowetz 2017). And her “whipping him at the post”—in addition to invoking an archaic and cruel form of punishment—has idiomatic qualities characteristic of complaints (
Drew and Holt 1988). In sum, this Teacher’s news reaction disclaims cruelty toward the Learner and may also complain about the experiment.
News Elaboration: “We Don’t Like to Fool You.”
After the Teacher reacts to the announcement, the Experimenter elaborates. He animates this part of the script with several practices emphasizing that the Learner experienced no pain or harm. For instance, he may state that the cries the Teacher heard were prerecorded and reveal that the Learner is a Yale employee and fellow project member. He may also repeat the news announcement, reasserting that the Learner did not receive any shocks. For instance, following extract 5 (transcript not shown), the Experimenter’s reformulated announcement (“He was not actually receiving any shocks.”) is fitted to, and counters, the Teacher’s report (lines 14 and 16) that she thought she was hurting the Learner. Similarly, in extract 6, the Experimenter redoes his announcement (line 13) following the Teacher’s reaction (line 9). Further, news elaboration characteristically features an apology for the deception (“We really don’t like to fool you this way”) and an account for its necessity. The apology treats the deception as requiring justification while also pursuing reconciliation with the Teacher.
Participants may explicitly accept the Experimenter’s apology, for example, by claiming to understand why deception was necessary. Similarly, “defiant” participants may perform a counter-apology, after learning the experiment’s true purpose, for disrupting it with their resistance.
News Assessment: “I Feel Better.”
As noted, Milgram intended full debriefing to induce Teachers to rationalize their experimental actions and reconcile with the Experimenter and Learner (
Perry 2013). Though the Experimenter pursues indications that participants “feel better” throughout the debriefing, this is especially evident in the assessment phase of the NDS. If participants do not volunteer this information, he typically solicits a self-report with the positively polarized question, “Does that make you feel better?” In ordinary conversation, news recipients’ assessments are structurally recurrent features of news delivery (
Maynard 2003); in the institutional setting of Milgram’s lab, news recipients assessed the Experimenter’s news by professing to feel better.
Just prior to the first line of extract 7, the Experimenter had been elaborating on his news announcement. At line 1, he redoes that announcement, using the adverb “really” to mark a contrast between appearance and reality. The Teacher responds with a beat of laughter (line 2), and following a pause (line 3) and Teacher in-breath (line 4), the Experimenter solicits confirmation that the news makes her “feel better” (line 5). His question is formatted as a positively polarized yes-no interrogative (
Koshik 2002) that projects an affirmative response. Though the Teacher produces such a response (line 7), its non-type-conforming design (lacking a “yes”) may be a subtle means of asserting agency (
Heritage and Raymond 2012). Also, the Experimenter’s laugh particle at line 8 (“eh(h)”) appears after the Teacher’s interpolated particles of aspiration (line 7) and in overlap with her laugh token, and thereby aligns with the Teacher’s response—perhaps embodying a sense that they are “on the same page.”
In addition to solicitations of well-being, the Experimenter performs several more “reconciliation practices” (as we term them) during full debriefing that promote relaxation after the stresses of participation and invite improved assessments of the experiment. These practices structure the remainder of full debriefing: (1) Offer of relaxation item (“Would you like a cigarette?”). Cigarettes, coffee, and tranquilizers are on hand to calm participants during the interview. The Experimenter often offers cigarettes, or the Teacher is carrying them, and they smoke together. (2) Debriefing news delivery (“He wasn’t really receiving shocks.”). The Experimenter drops the cover story. (3) Solicitation of well-being (“Does that make you feel a little better?” “How do you feel now?” “I certainly hope you don’t feel bad about coming down [here to participate.]”). The Experimenter (and Learner) may perform this practice multiple times. (4) Summoning the Learner (“JIM?”). The Learner responds by returning and meeting the Teacher “under different circumstances.” (5) Solicitation of positive attitude toward experiment (“What do you think of it, now that it’s over?”). (6) Announcement of future report and book (“You’ll receive a report in a few months, and I think you’ll find it quite interesting. A book will also be published.”). (7) Leave taking (“And we did certainly appreciate having you here and we enjoyed you very much.”).
Overall, the reconciliation practices allow the Experimenter to perform emotion work (
Hochschild 1983), the labor of managing participants’ reflections and feelings about their experience in Milgram’s lab. This labor, in full debriefing, proves capable of improving participants’ earlier ambivalent and negative assessments. In condition 20, at least 65 percent (24/37) of the Teachers in our collection, just prior to departure, respond positively to the Experimenter’s participation assessment question (see component 9,
Figure 1). For instance, after “defiant” participant 2023 writes her participation assessment, she says, “Well (.) this is after I have spoken with you now. (.) You don’t want my immediate reaction” (25:57). Likewise, in conditions 23 and 24 at least 68 percent (13/19) provide positive responses. These figures are bare minimums, reflecting only participants’ (or the Experimenter’s) verbalized comments about their written responses, thereby omitting participants who wrote a positive response but did not verbalize it. Thus, the pattern of participation assessments, provided at the very end of the interview, inverts that of the earlier PDS assessments, being more consistent with “happ[iness] to have been of service” (
Haslam et al. 2015) than with “distress and anger” (
Perry 2013).
Discussion and Conclusion
In connecting current archival research on Milgram with contemporary theorizing of “obedience,” we have built on
Perry’s (2012,
2013) work while significantly revising her conclusions about Milgram’s debriefing procedures. Full debriefing in particular should be seen in terms of its local context of two- (Experimenter and Teacher) and three-party (Learner included) social interaction—as a multiparty achievement rather than unilateral “monologue.” Although deceptive debriefing was more common than full, the latter was also a major feature of Milgram’s project overall (conditions 20, 23, and 24 included 100 participants). Also, although he indeed intended the interview to reconcile participants with the Experimenter (and Learner), Milgram’s team sought to do this with reconciliation practices that seem, in themselves, less morally dubious to us than to Perry. Rather than Machiavellian justifications for helping to harm the Learner, the Experimenter offered “obedient” participants mildly sanctimonious support for any resistance they showed or reluctance they felt. Below, we discuss in further detail the contribution this article makes to conversation analysis and discursive psychology, Milgram literature and “obedience to authority,” and the study of public attitudes toward science.
First, in the course of showing that Milgram’s full debriefing could improve participants’ evaluation of the experiment, we have argued that social psychological debriefing procedures should be viewed as interactional accomplishments. Although other conversation-analytic studies have examined research activities such as survey interviewing and informed consent acquisition (
Maynard and Schaeffer 2000;
Speer and Stokoe 2012), none to our knowledge has investigated debriefing in real time. A conversation-analytic or discursive psychology approach would allow analysts to identify concerns to which the research participants themselves orient as relevant and consequential (
Schegloff 2007).
As we’ve seen, Milgram intended his debriefing procedure, especially in its full version, to reconcile Teachers with the Experimenter and Learner after the stresses of participation. The Experimenter delivered full debriefing news with a panoply of practices that collectively worked to produce the outcome of “reconciliation”: participants’ positive assessment of the experiment, participation, or personal well-being. Although especially relevant to Milgram’s project, given the extreme nature of the experimental task, similar issues of reconciliation and positive participation assessment arise in any social psychological experiment involving deception. Though the gain in knowledge may be considerable, such research relies on dishonesty—about the nature of the research, the motives of the confederates, and so forth—creating situations that lend themselves to potential abuses of power, authority, and expertise and that must be explained candidly in the debriefing interview. Future studies of the detailed interactional organization of actual debriefings, as captured by video- or audio-recordings, would usefully contribute to ongoing debates in social psychology and elsewhere about the use of deception (see, e.g.,
Cook and Yamagishi 2008). Such studies could fruitfully investigate the ways that researcher and subject negotiate their relative statuses within, and stances toward, the research, along with the power dynamics this may entail. In this vein, despite the long-standing ethical controversy surrounding Milgram (
Baumrind 1964), our conversation analytic approach has provided evidence that power asymmetries—in the form of opportunities to talk—in Milgram’s full debriefing may have been less stark than
Perry (2013) claims. As shown above, the Experimenter provided numerous turn-transition relevance spaces during news delivery, actively pursuing participant responses. Likewise, we found that full debriefings lasted significantly longer than Perry’s “minute and a half” (2013:85). In such ways, the Experimenter took a more conversational, egalitarian approach to full debriefing than to its deceptive counterpart.
Second, our finding of improved assessments has significant implications for engaged followership. Did most “obedient” participants comply because they identified with Milgram’s scientific values and putative research on punishment’s effect on learning and memory, as maintained by Haslam and Reicher? Granted that this article has examined initial PDS assessments from a single condition, 20, it is nevertheless striking that so few participants here, regardless of outcome category, initially assessed the experiment positively (see
Table 2). Missing are the indications of close identification with the Experimenter, engagement with his perceived goals, and approval of his management of the Learner’s dramatic resistance that engaged followership theory would predict. Given that Milgram designed condition 20 (“Women as Subjects”) to differ from the more famous condition 2 (“Voice-feedback”) only by participant gender, and that to this point in the interview (prior to debriefing) little distinguishes the full debriefing version from the deceptive one, doubts arise as to the theory’s ability to comprehensively explain “obedience.” If, as
Haslam et al. (2015) claim, participants’ generally positive questionnaire responses should be regarded as evidence supporting the theory, shouldn’t initially ambivalent and negative responses prior to debriefing be taken to cast doubt on it?
Whereas engaged followership treats positive post-experimental assessments as displaying a static commitment “all along” to the Experimenter’s leadership and to the importance of the experiment, we have found that the interviews could have a dynamic, ameliorative effect on initial reactions by changing, in full debriefing, the assessments’ object from an experiment about “punishment and learning” to one about “obedience to authority.” The basic definition of the situation proved fluid, with participants’ evaluations shifting accordingly. It seems to us that insufficient attention to full-debrief participants’ dynamically changing experience of, first, the putative experiment on “punishment” and, second, the actual experiment on “obedience” has created a mistaken impression: that their sympathy for the latter typically extended to the former, such that they positively assessed the experiment before the cover story was dropped.
This article, then, has made the important, yet relatively modest, argument that Milgram’s full-debriefing interview could improve participants’ initial assessments. We took this approach because PDS assessments, in which participants provide explicit evaluations of the experiment, are relatively rare in the interviews overall, being mostly confined to condition 20, and our strategy has been to compare explicit initial assessments with later ones provided during and after full debriefing. Our approach has thus been conservative and has not examined additional practices by which participants displayed their perspective during and immediately following the experiment. Our immersion in the larger collection of 117 recordings has familiarized us with many instances of negative reaction to participation, and we have little doubt that Perry is right that “distress and anger” were initially widespread.
By contrast, Haslam and Reicher, whose archival research relies solely on the questionnaires sent out weeks or months later, argue that “participants’ post-experimental responses provide little evidence that they were stressed or angry” (2017:71). As discussed earlier, in this article we have sought to explain this tension: the full debriefings could improve assessments. Here we note further that, if we and Perry are correct about the initially negative reactions, and granted Haslam and Reicher’s findings about the questionnaires, then our argument’s scope widens accordingly: the interview often improved assessments, in both deceptive and full debriefing conditions. All this provides compelling reasons for doubting that engaged followership is uniquely adequate for explaining behavior in Milgram’s lab.
Finally, our analysis also contributes to the study of public attitudes toward science. In examining naturally occurring debriefings, we have shown how assessments of science can be affected by participation in research and how such participation can involve researchers and subjects in concerted negotiations about the meaning and value of scientific work. Of course, Milgram’s project constitutes an extreme case in many respects: not only did the experiments go far beyond what would be allowed in current research, but the participants themselves may have differed in important ways from the general population; on average, they were probably somewhat more sympathetic to science—people with a “scientific habitus” (
Gauchat 2011)—and were thus more likely to respond to Milgram’s newspaper ad or mail solicitation in the first place. Further, the Milgram experiments of 1961–1962 occurred during a high point of public trust in science, the 1950s and 1960s, which has since declined. In an era in which trust in experts is waning (
Giddens 1991), it is increasingly important to understand how public participation in science, whether in clinical trials (
Roberts 2002) or social scientific studies (
Maynard and Schaeffer 2000;
Speer and Stokoe 2012), affects people’s dispositions toward the scientific enterprise. Analysis of language and social interaction in scientific research activity, as performed in the present article, provides a valuable complement to meso- and macro-level investigations of trends in public understanding of, attitudes toward, and trust in science (
Gauchat 2011,
2012;
Miller 2004).
In conclusion, this article’s findings on debriefing news delivery in Milgram’s laboratory raise fundamental questions not only about prior archival research on his research practices, but also about the meaning of participants’ actions. Although social psychological processes of social identity such as engaged followership do seem likely to operate in many real-world contemporary and historical settings of compliance with malevolent authority, we have offered a compelling, empirically grounded challenge to that theory as a general model of Milgram participants’ “obedience.” It would seem that no single social psychological process uniquely suffices to explain their actions but rather that compliance resulted from multiple processes involving a complex interplay of situational forces and individual dispositions (
Overy 2014;
Staub 2014) operating in concert with organized practices of social interaction (
Hollander 2015;
Hollander and Turowetz 2017). The best starting point for future interpretations of “obedience” and its real-world relevance may be acknowledgment of such multiplicity and interplay.
Acknowledgments
We thank the Social Psychology Quarterly anonymous reviewers for very helpful feedback on an earlier draft of this article. Doug Maynard and Ceci Ford offered crucial guidance and criticism of the dissertation in which this project originated. We also benefited from jointly presenting this research at the July 2017 International Institute for Ethnomethodology and Conversation Analysis conference at Westerville, Ohio. Cynthia Ostroff at Yale University Library’s Manuscripts and Archives Department helped arrange purchase of Milgram recordings in 2012. Research was funded by National Science Foundation grant number 1103195. This article features transcribed excerpts from audio-recordings of the Milgram Obedience Experiment; permission to publish these is granted by Alexandra Milgram. The authors contributed equally to this article.