Altruistic punishment: A closer look 1 1 In press: Proceedings of the Royal Society B: Biological Sciences, 280(1758). 2 3 4 Do humans really punish altruistically? A closer look 5 Eric J. Pedersena, Robert Kurzbanb,c, and Michael E. McCullougha 6 7 a 8 9 b 10 c Department of Psychology, University of Miami, FL 33124-0751 USA Department of Psychology, University of Pennsylvania, PA 19104-6241 USA Department of Economics, University of Alaska Anchorage, AK 99508 USA 11 12 Key Words: cooperation, altruistic punishment, third-party punishment, affective forecasting, 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 evolutionary psychology Corresponding Author: Michael E. McCullough Department of Psychology University of Miami P.O. Box 248185 Coral Gables, FL 33124-0751 Phone: 305-284-8057 Fax: 305-284-3402 Email: [email protected] Altruistic punishment: A closer look 2 39 Abstract 40 Some researchers have proposed that natural selection has given rise in humans to one or more 41 adaptations for altruistically punishing on behalf of other individuals who have been treated 42 unfairly, even when the punisher has no chance of benefiting via reciprocity or benefits to kin. 43 However, empirical support for the altruistic punishment hypothesis depends on results from 44 experiments that are vulnerable to potentially important experimental artefacts. Here we searched 45 for evidence of altruistic punishment in an experiment that precluded these artefacts. In so doing, 46 we found that victims of unfairness punished transgressors whereas witnesses of unfairness did 47 not. Furthermore, witnesses’ emotional reactions to unfairness were characterized by envy of the 48 unfair individual’s selfish gains rather than by moralistic anger toward the unfair behaviour. In a 49 second experiment run independently in two separate samples, we found that previous evidence 50 for altruistic punishment plausibly resulted from affective forecasting error—that is, limitations 51 on humans’ abilities to accurately simulate how they would feel in hypothetical situations. 52 Together, these findings suggest that the case for altruistic punishment in humans—a view that 53 has gained increasing attention in the biological and social sciences—has been overstated. 54 Altruistic punishment: A closer look 3 55 56 Do humans really punish altruistically? A closer look In many animal species, including humans, individuals punish conspecifics that have 57 harmed them [1–3]. Some researchers have recently argued that humans, unlike other animals, 58 also altruistically punish individuals who have harmed others, even when the punisher has no 59 chance of benefiting via reciprocity or benefits to kin [4–6]. Results from several economics 60 experiments appear to support this claim [4,6,7], but some scholars have questioned both the 61 adaptationist logic behind such theoretical claims [8–10] and the interpretation of the empirical 62 results [8,10–14]. Here we elide these theoretical debates and instead investigate a more basic 63 empirical question: Do people actually spontaneously punish individuals who have only harmed 64 other individuals in anonymous settings in the laboratory? Put differently, do the empirical 65 research findings often marshalled in support of the altruistic punishment hypothesis [e.g., 4,6] 66 provide a reliable guide to the presence or absence of a propensity for altruistic punishment in 67 humans? 68 In previous work, researchers claimed empirical support for the existence of altruistic 69 punishment on the basis of results from public goods game experiments in which the individual 70 being punished had harmed—or failed to help—the putative punisher as well as other victims 71 [4], leaving open the possibility that the punishment was vengeful, rather than altruistic [10]. 72 Results from similar experiments that exclude revenge as a possible motive suggest that 73 investments in punishment in such contexts are conspicuously low [15]. Additional data 74 frequently adduced in support of the altruistic punishment hypothesis come from third-party 75 punishment games [6,7], in which a Dictator chooses to give some portion of a sum of money 76 (or none) to a passive Recipient. A third player can punish the Dictator (at a cost) in response to Altruistic punishment: A closer look 4 77 the Dictator’s transfer to the Recipient. Many third parties in games with this structure pay a cost 78 to punish stingy Dictators, despite receiving no financial benefit from doing so [6,7]. 79 However, five methodological limitations of the standard third-party punishment game 80 might conspire to yield inflated estimates of humans’ propensity to punish strangers for having 81 behaved unfairly toward other strangers. First, in the standard game, subjects are assigned to a 82 third-party role that implies their task is to determine how much to punish the Dictator: Indeed, 83 the only choices third parties can make are whether to punish the Dictator [16]—and, if so, how 84 much. Thus, any error will lead to an increase in the estimated quantity of punishment. Second, 85 punishment in the third-party punishment game is typically administered with the presence (or 86 inferred presence) of an audience: punishment of the Dictator by the third party is witnessed by 87 the initial victim because all players see the results of the game. The presence of an audience 88 introduces reputational considerations that could motivate punishment as a means of pursuing 89 indirect fitness benefits (e.g., by signaling one’s quality as a cooperative partner [17, 18]; or 90 one’s formidability to prevent future exploitation of oneself [19, 20] or one’s friends and kin 91 [21]). Indeed, it has been shown—though with a different paradigm—that observers of unfair 92 treatment punish third parties significantly less when they are assured no one will see their 93 decision [22; however, see also ref. 23]. 94 Third, the third-party punishment game is typically conducted with the “strategy method” 95 [24], which requires third parties to repeatedly respond to a series of hypothetical Dictator 96 choices—in advance of learning of the Dictator’s actual choice—that are progressively more (or 97 progressively less) unfair [6]. Such methods can cause subjects to infer that the experimenters 98 expect them to vary their responses according to some feature that varies across the set of 99 repeated scenarios [25]. Consequently, due to a well-known experimental artefact called demand, Altruistic punishment: A closer look 5 100 subjects might feel compelled to punish at least some of the time, calibrating those decisions to 101 the only feature of the Dictators’ repeated choices that varies: how unfair they are. This is 102 especially problematic in the standard third-party punishment game because rewarding is not 103 allowed; the only way subjects can vary their responses is to vary their amount of punishment. In 104 a notable exception, Alemberg, et al. [26] did add a rewarding option to the typical third-party 105 punishment game (conducted with the strategy method), and a small amount of third-party 106 punishment was observed, on average, when Dictators transferred $0 (of $10) to the Recipient. 107 We note, however, that subjects in this experiment were informed, before making their decisions, 108 that it was possible they would not be paired with another subject—in such a case, their 109 decisions would not be enacted and they would retain all of their money (i.e., participants’ 110 decisions were somewhat hypothetical; see below). 111 Fourth, the strategy method also involves affective forecasting [27] inasmuch as it 112 requires subjects to respond ex ante to Dictator actions that have not yet occurred. Such 113 behavioural commitments can differ from the actual behaviours people enact after experiencing 114 social situations directly because people frequently weight the features of social situations 115 differently during conscious deliberation than they do after experiencing those social situations 116 in real time [28]. For example, as forecasters, people severely overestimate how upset they 117 would feel by (and subsequently, how much they would attempt to avoid interacting with) 118 someone who had made a racist comment; in contrast, subjects who have actually observed 119 another individual express strongly racist attitudes (versus those in control conditions) respond 120 with relative indifference to the racist individual [29]. 121 122 Fifth, previous claims that anger is the predominant emotional response of third-party punishers have relied on self-reports of anger in response to hypothetical scenarios [4,6]. Self- Altruistic punishment: A closer look 6 123 reports of anger are typically highly correlated with self-reports of other, similar emotions— 124 including envy [30]. To the extent that the covariation between self-reported anger and self- 125 reported envy is not statistically controlled, estimates of third parties’ anger toward unfair 126 strangers might actually reflect envy, which can also motivate costly punishment in pursuit of 127 goals that are quite distinct from putatively altruistic goals such as enforcing norms or delivering 128 deterrence benefits to strangers [31]: Specifically, if third parties’ punishment of individuals who 129 have treated another individual unfairly is motivated by envy, but not by anger, then the 130 mechanisms that motivate third-party punishment might process cues that another individual has 131 obtained better outcomes than the self, rather than cues that an individual has violated a norm or 132 harmed an anonymous third party in whom the punisher has no fitness interest [e.g., 6,7]. 133 Here we present two experiments designed to test whether subjects punish altruistically 134 on behalf of strangers in a third-party punishment game that was designed to rectify the 135 methodological problems noted above. We also examined whether previous findings could 136 plausibly be explained as a product of affective forecasting errors. We note that our goal was not 137 to estimate the unique influence of each of these five potential methodological problems; rather, 138 our goal was to test whether the altruistic punishment hypothesis could survive falsification in an 139 experiment that eliminated these problems. Experiment 1 was a modified third-party punishment 140 game in which subjects could either punish or reward—thereby reducing experimental demand 141 for punishment [25], the confounding of error and punishment, and potential audience effects. 142 Also, subjects made decisions about giving or deducting money from Dictators after witnessing 143 the Dictator’s decision, which enabled us to measure third-party punishment without the 144 possibility of affective forecasting errors [27]. Additionally, our measures of emotion were fine- 145 grained enough that it was possible to evaluate the unique motivational roles of anger and envy. Altruistic punishment: A closer look 7 146 In Experiments 2a and 2b, the same third-party punishment game was presented as a 147 hypothetical vignette to subjects from two different research pools. 148 149 Methods 150 151 152 Subjects 153 SD = 2.99; 57% female). They received partial course credit and monetary compensation (see 154 below). 155 Experiment 1: Subjects were 315 University of Miami undergraduates (mean age = 19.12, Experiment 2a: 538 subjects (mean age = 34.37, SD = 12.14; 60% female) were recruited 156 via Amazon Mechanical Turk (http://www.mturk.com/mturk/welcome) and were paid $0.25 for 157 their participation. Participation was restricted to users in the United States. Because participants 158 merely had to read a vignette and then report their forecasts of how they would think, feel, and 159 behave if the hypothetical situation had actually happened to them, participation generally took 160 about 4 minutes. 161 Experiment 2b: We replicated Experiment 2a with University of Miami undergraduates; 162 394 subjects (mean age = 18.74, SD = 1.27; 53% female) participated for partial course credit. 163 Procedure 164 Experiment 1: Subjects were run in individual sessions at a computer station in an 165 isolated room (see ESM S2.1). The entire experiment, including instructions, was conducted on a 166 computer via E-Run with a script created in E-Prime version 2.0. After subjects provided 167 informed consent, they were told they would be interacting with two other players in the building 168 over the computer network and that it was important that those other people remain anonymous; 169 in fact, they interacted with a pre-programmed computer script. Without this deception, the Altruistic punishment: A closer look 8 170 research would have been unfeasible (see ESM S2.1). Subjects were informed that they would be 171 participating in an economic decision-making game that would last for multiple rounds and they 172 would be paid based on the money they earned during the game. Because deception was 173 involved, everyone was paid a flat rate of $9 at the end of the experiment following a debriefing. 174 We used a “funnel debriefing” method designed to detect suspicion and explore subjects’ 175 reactions to having been deceived [32]. Subjects flagged for suspicion were excluded from all 176 analyses presented; re-including them in analyses did not qualitatively affect the results in any 177 way (see ESM S1.2). 178 The decision-making game comprised two rounds (Fig. 1) in which each player was 179 given $5 to use in each round and assigned to one of three roles: Decision-Maker, Receiver, or 180 Observer. (We refer to these roles here as Dictator, Recipient, and Third Party, respectively, to 181 be consistent with labels used in previous work on third-party punishment.) Subjects were not 182 told the exact number of rounds to avoid end of game effects [33], and were told that money 183 earned during each round would be “banked” and thus unaffected by subjects’ behaviour during 184 subsequent rounds. The Dictator ostensibly had the option to give any portion of his or her $5 to 185 the Recipient, or take any portion of the Recipient’s $5; the Third Party would merely see the 186 results of the round and would not be affected by the Dictator’s choice. Subjects were informed 187 that in some rounds all players would be involved, and in other rounds some players might be 188 excluded. Subjects were randomly assigned to be either the Third Party or the Recipient in the 189 first round and the (computer-programmed) Dictator either took $4 or $0 from the Recipient. The 190 computer displayed a summary screen for the round showing the amount of money each player 191 earned for the round. Following the round, subjects completed a lexical decision task (see ESM 192 S1.3) and a series of self-report questions (see below). Altruistic punishment: A closer look 9 193 Prior to role assignment for the second round, subjects were informed that there would be 194 no Third Party in Round 2 [to avoid potential audience effects; 22,34]; one player would be 195 assigned to a different task and be unable to see the results of the interaction. We note that the 196 presence of the experimenter can also induce audience effects e.g., [22]; we took great care to 197 minimize this potential influence by (a) clearly informing participants during the consent process 198 that their data would be stored completely anonymously and could not be connected to them in 199 any way, and (b) minimizing contact with the experimenter by presenting all instructions 200 electronically. Though we cannot rule out experimenter audience effects completely, our results 201 are as insulated from them as we believe it was possible to do in the context of this experiment. 202 All players were given another $5; because previous earnings had been “banked,” all 203 players started Round 2 with $5. The subject was assigned the role of Dictator while the Dictator 204 from Round 1 (who had treated either the subject or the other player fairly or unfairly) was 205 assigned the role of the Recipient (ostensibly by chance). Players were identified consistently 206 throughout, so subjects were aware that the Recipient in Round 2 was the same player that had 207 been the Dictator in Round 1. Subjects were instructed that they could give any amount of their 208 $5 to the Recipient, do nothing, or remove any amount of the Recipient’s $5 (the word 209 “punishment” was never used). Removing money cost one-fourth of the amount removed and, 210 unlike in the first round, was not gained by the subject as income—it simply disappeared. Note 211 that the cost of punishment used here, 1:4, was less expensive than the 1:3 cost typically used in 212 the third-party punishment game; previous research has shown that punishment becomes more 213 likely as the cost of punishment declines (see ref. [10] for review). Following the completion of 214 the round, the experiment ended and the experimenter debriefed the subject through an Altruistic punishment: A closer look 10 215 extensive, staged process to assess the believability of the experiment and to explain why 216 deception was necessary [32]. 217 Experiments 2a and 2b: After providing consent, subjects were instructed to imagine 218 themselves “…in a particular situation in our laboratory. Please try to picture yourself in the 219 situation we are describing. We will ask you to complete a series of questions regarding how you 220 think you would think, feel, and act in this situation.” The layout and instructions of the game 221 were presented as they were in Experiment 1, and the rounds of the game and the self-report 222 measures were the same. However, subjects did not complete a lexical decision task following 223 the first round and were not debriefed following the completion of the experiment (because no 224 deception was involved). 225 226 227 Psychometric information regarding the self-report measures 228 responses toward both of the other players after the first round. They described their feelings 229 toward both players to avoid demand effects that might have occurred by probing only about the 230 Dictator. (Emotional reactions to the other player were not of theoretical interest here and so, in 231 the interest of brevity, we do not report them). 232 Self-rated emotions toward the other players: Subjects were asked to describe their emotional - Anger: 3-item composite of ratings on a scale from 0 (not at all) to 5 (extremely) the 233 extent to which the subject was “angry,” “mad,” and “outraged” at the Dictator 234 (Cronbach’s = .94). 235 236 - Envy: 2-item composite of ratings on a scale from 0 (not at all) to 5 (extremely) the extent to which the subject was “envious” and “jealous” of the Dictator (Cronbach’s = .84). Altruistic punishment: A closer look 11 237 Fairness/moral wrongness of the Round 1 dictator’s behaviour: Subjects were asked to rate 238 both how “fair” and how “morally wrong” the Dictator’s behaviour was toward the Recipient in 239 Round 1 on a scale from 1 (not at all) to 9 (totally). 240 241 242 Results Experiment 1 243 Third parties did not punish on behalf of strangers: A one-sample Wilcoxon test (used 244 because distributions were non-normal) revealed that the sample median of the distribution of 245 dollars punished or rewarded in Round 2 (in terms of the effect on the Recipient, not the cost to 246 the subject) by third-party witnesses of unfairness did not differ significantly from a 247 hypothesized median of zero, Z = -1.48, p = .140, N = 65 (all p-values throughout manuscript are 248 two-tailed). In contrast, victims of unfairness punished a nonzero amount, Z = -3.52, p < .001, N 249 = 61—significantly more than mere witnesses of unfairness, p = .026, N = 126 (two-sample 250 median test; Fig. 2). 251 If the function of punishment is to deter harmdoers from imposing costs on oneself (i.e., 252 to bargain for better treatment for oneself [10,13,20]) or others in the future—or even if its 253 function is to enforce adherence to social norms [6]—then the punishment must be strong 254 enough to erase unfairly gained benefits [1,35]. Otherwise, the harmdoer retains a net profit from 255 the transgression, and thus will retain an incentive to continue to behave unfairly toward others 256 in the future. Because unfair Dictators took $4 from Recipients in Round 1 of Experiment 1, $4 257 was also the minimum amount of punishment that would be expected to deter unfair Dictators 258 from behaving unfairly in the future. Third-party punishment of this magnitude was extremely 259 rare: only 2 of 65 (3%) witnesses imposed at least $4 worth of punishment on unfair Dictators, a Altruistic punishment: A closer look 12 260 proportion no different from the proportion for witnesses of fairness (0 of 80; p = .199, Fisher’s 261 Exact Test). In contrast, 13 of 61 (21%) victims of unfairness punished at least $4, a proportion 262 significantly greater than that for both recipients of fairness (0 of 64; p < .001), and witnesses of 263 unfairness (p = .002). Indeed, most victims of unfairness who punished (13 of 21) imposed at 264 least $4 worth of punishment. 265 According to the self-report measures of emotion, third parties did not become angry at 266 unfairness: when controlling for envy (which was highly correlated with anger, r = .637, p < 267 .001; see ESM S1.5), a 2 (Target: Self, Other) x 2 (Treatment: Fair, Unfair) ANCOVA revealed 268 a significant target*treatment interaction for anger, F (1, 265) = 17.11, p < .001). Witnesses of 269 unfairness (M = .533, SE = .095, N = 65) did not report more anger than did witnesses of fairness 270 (M = .414, SE = .084, N = 80), p = .363, partial 2 = .00, but victims of unfairness (M = 1.28, SE 271 = .098, N = 61) did report more anger than their fairly-treated counterparts (M = .418, SE = .093, 272 N = 64), p < .001, partial 2 = .13 (Fig. 3; see ESM S1.3 and Fig. S1 for a replication with an 273 implicit measure based on reaction time data). Thus, people became angry when treated unfairly 274 but not when they only witnessed the unfair treatment of a stranger [cf. 36]. Importantly, this 275 difference in the anger of witnesses versus victims of unfairness was not due to different 276 perceptions of the transgression’s fairness or moral wrongness (see ESM S1.4 and Fig. S2). 277 Eleven of sixty-five witnesses of unfairness paid some cost to impose costs on the unfair 278 Dictator; with a much larger sample, one might argue, we therefore might have found statistical 279 evidence for mild third-party punishment. However, because the witnesses of unfairness were 280 not angry at the Dictator (see above), we suspected that the predominant emotional response 281 among witnesses of unfairness was envy given that they had observed the unfair Dictator obtain 282 a higher payoff ($9) than they themselves had received [$5; see ref. 37]. We found a significant Altruistic punishment: A closer look 13 283 target*treatment interaction for envy (with anger partialed out), F (1, 265) = 4.53, p = .034: 284 witnesses of unfairness were more envious of the Dictator than were the witnesses of fairness, p 285 = < .001, partial 2 = .07. In contrast, victims of unfairness were no more envious than were their 286 fairly-treated counterparts, p = .306, partial 2 = .00. Thus, had we observed a significant amount 287 of third-party punishment among witnesses of unfairness, it plausibly could have been motivated 288 by envy toward the unfair Dictator rather than by moralistic anger. This difference in the 289 emotions of the witnesses and victims of the Dictator’s unfairness explains why third-party 290 punishment was quite rare and mild: witnesses of unfairness were envious of the Dictator’s ill- 291 gotten gains—but not angry—and so they likely were reluctant to spend their own money to 292 punish the Dictator. 293 During Experiment 1, we also ran a small fifth condition (N = 45; see ESM S1.1) in 294 which witnesses of unfairness started Round 1 with $9, enabling us to test whether witnesses’ 295 economic disadvantage relative to unfair Dictators explained their surplus envy. Witnesses who 296 started Round 1 with $9 were significantly less envious of unfair Dictators (controlling for anger) 297 than were witnesses who started with $5, F (1, 107) = 8.35, p = .005, partial 2 = .07. Self- 298 reported anger (controlling for envy) did not differ between groups, F (1,107) = .581, p = .448, 299 partial 2 = .01. Witnesses of unfairness with $9 did not punish an amount significantly different 300 from zero, Z = -.879, p = .379, N = 45, nor differently than did the witnesses of unfairness with 301 $5, p = .475, N = 110, even though doing so would have cost a smaller proportion of their stake. 302 Thus, the emotional reactions of witnesses of unfairness were characterized by envy rather than 303 moralistic anger. 304 Experiment 2a Altruistic punishment: A closer look 14 305 In Experiments 2a (online sample) and 2b (undergraduate sample), we investigated how 306 subjects’ affective and behavioural forecasts in a hypothetical scenario would compare to the 307 results from Experiment 1 (see Table S1 and Table S2 for descriptive statistics for both 308 experiments). This experiment was conducted not because we thought that participants’ 309 hypothetical responses would provide a reliable assay of how they would behave in a real-life 310 situation, such as the one we explored in Experiment 1, but because we wished to compare 311 participants’ forecasts of how they might behave and feel with participants’ actual behaviour and 312 emotional reactions in Experiment 1. The rounds of the game were identical to Experiment 1, 313 except that subjects were instructed to report how they believed they would act and feel in 314 response to a hypothetical vignette. 315 In Experiment 2a—and in contrast to Experiment 1—witnesses of hypothetical unfairness 316 forecast that they would administer a greater-than-zero amount of punishment, Z = -2.38, p = 317 .017, N = 137, as did victims of hypothetical unfairness, Z = -2.66, p = .008, N = 148; witnesses 318 and victims of hypothetical unfairness did not differ significantly, p = .391. Four additional 319 results suggest that a different psychological process was at work in Experiment 2a than in 320 Experiment 1. First, witnesses of hypothetical unfairness forecast a much higher likelihood of 321 punishing at least $4 (the critical threshold for efficacious punishment) than did witnesses of 322 hypothetical fairness (p = .002, Fisher’s Exact Test). This proportion (22 of 137; 16%) was 323 significantly larger than what we saw in the actual behaviour of Experiment 1 subjects (2 of 65; 324 3%), p = .009, Fisher’s Exact Test, (Fig. 4). Second, the proportion of witnesses (16%) and 325 victims (31 of 148; 21%) of hypothetical unfairness that forecast punishing at least $4 did not 326 differ, p = .361, contrary to Experiment 1. Third, witnesses of hypothetical unfairness forecast a 327 significant amount of anger toward unfair Dictators in Experiment 2a, again in contrast to Altruistic punishment: A closer look 15 328 Experiment 1: there was a significant target*treatment interaction (controlling for envy), F (1, 329 285) = 5.87, p = .016. Witnesses of hypothetical unfairness (M = 1.59, SE = .109, N = 133) 330 forecast more anger than did witnesses of hypothetical fairness (M = .419, SE = .116, N = 147), p 331 < .001, partial 2 = .16. Likewise, victims of hypothetical unfairness (M = 2.04, SE = .110, N = 332 139) forecast more anger than did recipients of hypothetical fairness (M = .345, SE = .108, N = 333 141), p < .001, partial 2 = .28. Fourth, both witnesses (F (1, 131) = 25.48, p < .001, partial 2 = 334 .16) and victims (F (138) = 10.69, p = .001, partial 2 = .07) of hypothetical unfairness forecast 335 significantly more anger (controlling for envy) toward the unfair Dictator than their counterparts 336 reported in Study 1 (Fig. 3). These results are therefore consistent with proposals that norm 337 violations elicit “negative emotions” [4,6], which in turn motivate altruistic punishment, but here 338 they resulted from affective forecasting rather than from responding to real-time events. (Recall, 339 in contrast, that in Experiment 1, which involved real-time behaviour and emotional responses 340 rather than forecasting, no such moral outrage was found). 341 Experiment 2b 342 The pattern of punishment results for Experiment 2b was virtually identical to those of 343 Experiment 2a: the proportion of witnesses of hypothetical unfairness that forecast they would 344 punish at least $4 (5 of 85; 6%) did not differ from that of victims of hypothetical unfairness (9 345 of 101; 9%), p = .580, Fisher’s Exact Test—a pattern that was similar to Experiment 2a, but 346 contrary to Experiment 1. Interestingly, neither witnesses Z = .183, p = .855, N = 85, nor victims 347 of hypothetical unfairness, Z = -.162, p = .872, N = 101, reported they would administer a 348 greater-than-zero amount of punishment. Nevertheless, both witnesses of hypothetical unfairness 349 (p = .031) and victims of hypothetical unfairness (p = .012) forecast they would punish 350 significantly more than did their (hypothetically) fairly-treated counterparts: this is because both Altruistic punishment: A closer look 16 351 witnesses, Z = 2.33, p = .020, N = 97 and recipients of hypothetical fairness, Z = 2.98, p = .003, 352 N = 85, rewarded a greater-than-zero amount. Furthermore—and importantly—witnesses and 353 victims of hypothetical unfairness did not forecast different amounts of punishment, p = .743. 354 Thus, notwithstanding the fact that hypothetical punishment of unfairness appeared largely to 355 have taken the form of withdrawing reward (rather than imposing costs) for subjects in 356 Experiment 2b, the punishment results largely replicated those obtained in Experiment 2a (see 357 Fig. 4). 358 Moreover, the pattern of emotion results of Experiment 2b was identical to that of 359 Experiment 2a: witnesses of hypothetical unfairness (M = 1.43, SE = .100, N = 94) forecast more 360 anger (controlling for envy) than did witnesses of hypothetical fairness (M = .392, SE = .104, N = 361 92), p < .001, partial 2 = .13. Likewise, victims of hypothetical unfairness (M = 1.44, SE = .096, 362 N = 109) forecast more anger than did recipients of hypothetical fairness (M = .259, SE = .098, N 363 = 99), p < .001, partial 2 = .16. Witnesses (F (1, 144) = 18.53, p < .001, partial 2 = .11) but not 364 victims (F (157) = .005, p = .945, partial 2 = .00) of hypothetical unfairness also forecast 365 significantly more anger (controlling for envy) toward the unfair Dictator than the subjects in 366 Experiment 1 actually experienced (Fig. 3). The overall pattern of forecast behaviour and 367 emotion in Experiment 2b suggests that the students who were the subjects in Experiment 2b had 368 a slight tendency to believe that they would reward fair distributions, which the non-student 369 subjects in Experiment 2a did not share, but in every other way the results are identical to those 370 of Experiment 2a: subjects forecast that both experiencing and witnessing unfairness would 371 cause them to become angry and to punish dictators to a greater extent than did subjects who 372 forecast their responses to either receiving or witnessing fair treatment. Furthermore, both 373 experiencers and witnesses of unfairness forecast equivalent likelihoods of punishing at least $4. Altruistic punishment: A closer look 17 374 Discussion 375 Experiment 1 indicates that, under the conditions we investigated, humans do not impose 376 meaningful amounts of third-party punishment on behalf of absolute strangers. The nominal and 377 statistically non-significant amount of punishment we did observe was apparently motivated by 378 envy because of a comparatively unfavourable personal outcome rather than by moralistic anger 379 on behalf of a mistreated stranger. Our finding that the emotional reaction to witnessing 380 unfairness is characterized by envy rather than moralistic anger is particularly inconvenient for 381 the altruistic punishment hypothesis: to categorize a behaviour as an adaptation for altruistic 382 benefit delivery, one needs to provide evidence that the psychological mechanisms that produce 383 the behaviour in question have been designed for that specific function. That is, one needs to 384 demonstrate that the behaviour is not caused by mechanisms designed for a different function. 385 The presence of envy, rather than moralistic anger, in response to witnessing unfairness, suggests 386 that the psychological mechanisms involved in third-party punishment are, at least in part, 387 designed to process cues that another individual has obtained better outcomes than oneself [38]. 388 In contrast, we found no evidence that they are designed to process cues that an anonymous 389 stranger has been harmed. We do not mean to imply that humans do not impose any third-party 390 punishment: under some circumstances, they do [22,35,39]. However, our results cast doubt on 391 the proposal that the mechanisms that motivate third-party punishment are altruistic benefit- 392 delivery systems that are motivated proximately by moralistic anger. 393 Experiments 2a and 2b show furthermore that people inaccurately forecast their affective 394 and behavioural responses to unfairness in experimental games: in particular, subjects who 395 imagined themselves witnessing (rather than experiencing) unfair treatment forecast both more 396 anger and punishment (and, in the case of Experiment 2b, withdrawal of rewarding) than is Altruistic punishment: A closer look 18 397 observed among people who witness unfair treatment in the laboratory. This dissociation 398 between hypothetical and actual third-party punishment raises the possibility that punishment 399 imposed by mere witnesses of unfairness found in prior work resulted from demand 400 characteristics, affective forecasting errors, and the other methodological shortcomings we have 401 cited here [8,25,27]. 402 Limitations 403 As mentioned at the outset, the goal of the experiments presented herein was not to 404 systematically identify which specific methodological conventions were responsible for previous 405 findings of third-party punishment on behalf of strangers. Rather, the goal was to test whether a 406 suite of methodological conventions that are commonly applied within the third-party 407 punishment game collude to create more third-party punishment in that experimental realization 408 than would actually obtain in experiments that remediated those methodological shortcomings. 409 As such, our results cannot speak with certainty to the effects of particular aspects of previous 410 designs, such as the strategy method. A recent survey of studies comparing the strategy method 411 to the direct-response method across a variety of paradigms found that evidence surrounding the 412 effect of using the strategy method is mixed [40], but importantly, no study has been conducted 413 to directly compare the amounts of third-party punishment elicited in experiments using the 414 strategy method versus the direct response method [40]. Therefore, further work is needed to 415 determine the unique contribution of the use of the strategy method (and the other potential 416 methodological artefacts we have identified herein) to the apparently exaggerated evidence for 417 altruistic third-party punishment that previous work has revealed: we emphasize again that doing 418 so was not our goal here. Despite this limitation, our results do strongly suggest that subjects’ Altruistic punishment: A closer look 19 419 forecasts of their likely anger and punishment in response to witnessing unfairness in the 420 standard third-party punishment game [e.g., 6] are exaggerated. 421 Conclusion 422 These findings are of broad significance in the study of human cooperation because many 423 researchers have proceeded under the assumption that altruistic punishment is a robust 424 phenomenon that requires an adaptationist explanation. Indeed, two scientific “problems” for 425 which cooperation researchers over the past decades have been seeking adaptationist solutions 426 might not be problems at all. Consider the puzzle framed by proponents of “strong reciprocity,” 427 such as Gintis [41], who claimed that humans are “strong reciprocators” who are “predisposed to 428 cooperate with others and punish non co-operators, even when this behaviour cannot be justified 429 in terms of self-interest, extended kinship, or reciprocal altruism (p. 169).” 430 With respect to the former problem—“unjustified” predispositions to cooperate without 431 apparent individual benefit—results from recent models suggest that a bias to cooperate, even 432 when one faces cues of an interaction being one-shot, should be expected to coevolve with 433 reciprocity. This is because mistaking a one-shot interaction for a repeated interaction is a less 434 costly error than the reverse [42]. In terms of the latter problem (i.e., the claim that people punish 435 non co-operators even when the punisher does not stand to benefit individually), the results from 436 Experiment 1 call into question the claim that people engage in altruistic third-party punishment 437 at all [See also, 10,13]. We think another way forward in the study of third-party punishment in 438 humans, as in the study of the evolved mechanisms that motivate human cooperation in general, 439 is to intensify the search for direct or indirect benefits for punishers that outweigh the costs of 440 punishment, consistent with all known cases of third-party intervention in non-human animals 441 [43,44]. Altruistic punishment: A closer look 20 442 Acknowledgements 443 We thank Max Burton-Chellow, Steven Pinker, Stuart A. West, and Richard W. Wrangham for 444 their feedback on a previous draft, and David G. Rand for kind assistance with his data. Research 445 supported by grants from the Air Force Office of Scientific Research (Award #FA9550-12-1- 446 0179) to M.E.M and R.K., the Arsht Research on Ethics and Community Program at the 447 University of Miami to M.E.M, and an NSF Graduate Research Fellowship to E.J.P. Altruistic punishment: A closer look 21 448 References 449 450 1 Clutton-Brock, T. H. & Parker, G. A. 1995 Punishment in animal societies. Nature 373, 209-216. (DOI:10.1038/373209a0) 451 452 453 2 West, S. A., Griffin, A. S. & Gardner, A. 2007 Social semantics: altruism, cooperation, mutualism, strong reciprocity and group selection. J Evol Biol 20, 415-432. (DOI:10.1111/j.1420-9101.2006.01258.x) 454 455 3 Bshary, A. & Bshary, R. 2010 Self-serving punishment of a common enemy creates a public good in reef fishes. Curr Biol 20, 2032-2035. (DOI:10.1016/j.cub.2010.10.027) 456 457 4 Fehr, E. & Gächter, S. 2002 Altruistic punishment in humans. Nature 415, 137-140. (DOI:10.1038/415137a) 458 459 5 Boyd, R., Gintis, H., Bowles, S. & Richerson, P. J. 2003 The evolution of altruistic punishment. Proc Natl Acad Sci USA 100, 3531-3535. (DOI:10.1073/pnas.0630443100) 460 461 6 Fehr, E. & Fischbacher, U. 2004 Third-party punishment and social norms. Evol Hum Behav 25, 63-87. (DOI:10.1016/S1090-5138(04)00005-4) 462 463 464 7 Henrich, J., McElreath, R., Barr, A., Ensminger, J., Barrett, C., Bolyanatz, A., Cardenas, J. C., Gurven, M., Gwako, E., Henrich, N. et al. 2006 Costly punishment across human societies. Science 312, 1767-70. (DOI:10.1126/science.1127333) 465 466 467 8 West, S. A., El Mouden, C. & Gardner, A. 2011 Sixteen common misconceptions about the evolution of cooperation in humans. Evol Hum Behav 32, 231-262. (DOI:10.1016/j.evolhumbehav.2010.08.001) 468 469 9 Burnham, T. C. & Johnson, D. D. P. 2005 The biological and evolutionary Logic of human cooperation. Analyse and Kritik 27, 113-135. 470 471 10 McCullough, M. E., Kurzban, R. & Tabak, B. A. In press. Cognitive systems for revenge and forgiveness. Behav Brain Sci 472 473 474 11 Hagen, E. H. & Hammerstein, P. R. 2006 Game theory and human evolution: A critique of some recent interpretations of experimental games. Theor Popul Biol 69, 339-348. (DOI:10.1016/j.tpb.2005.09.005) 475 476 477 478 12 Kümmerli, R., Burton-Chellew, M. N., Ross-Gillespie, A. & West, S. A. 2010 Resistance to extreme strategies, rather than prosocial preferences, can explain human cooperation in public goods games. Proc Natl Acad Sci USA 107, 10125-30. (DOI:10.1073/pnas.1000829107) 479 480 13 Krasnow M. M., Cosmides L., Pedersen E. J., & Tooby J. 2012 What Are Punishment and Reputation for? PLoS ONE 7: e45662. doi:10.1371/journal.pone.0045662 Altruistic punishment: A closer look 22 481 482 483 14 McCullough M. E., Pedersen, E. J., Schroder, J. M., Tabak, B. A., & Carver, C. S. 2012 Harsh childhood environmental characteristics predict exploitation and retaliation in humans. Proc R Soc B 20122104. doi: 10.1098/rspb.2012.2104 484 485 15 Carpenter, J. P. & Matthews, P. H. In press. Norm Enforcement: Anger, indignation, or reciprocity? J Eur Econ Assoc 486 487 16 Orne, M. 1962 On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. Am Psychol 17, 776-783. 488 489 490 491 492 17 Dana, J., Weber, R. A. & Kuang, J. 2007 Exploiting Moral Wiggle Room: Experiments Demonstrating an Illusory Preference for Fairness. Econ Theor 33, 67-80. 18 Nelissen, R. M. A. 2008 The price you pay: cost-dependent reputation effects of altruistic punishment. Evol Hum Behav 29, 242-248. 493 494 19 Johnstone, R. A. & Bshary, R. 2004 The evolution of spiteful behavior. Proc R Soc B 271, 1917-1922. 495 496 20 Sell, A, Tooby, J. & Cosmides, L. 2009 Formidability and the logic of human anger. Proc Natl Acad Sci USA 106, 15073-15078. 497 498 21 Lieberman, D., & Linke, L. 2007 The effect of social category on third party punishment. Evol Psychol 5, 289-305. 499 500 22 Kurzban, R., DeScioli, P. & O’Brien, E. 2007 Audience effects on moralistic punishment. Evol Hum Behav 28, 75-84. (DOI:10.1016/j.evolhumbehav.2006.06.001) 501 502 23 Bolton, G. E., & Zwick, R. (1995). Anonymity versus punishment in ultimatum bargaining. Games and Economic Behavior 10, 95-121. 503 504 505 24 Selten, R. 1967 Die Strategiemethode zur Erforschung des eingeschränkt rationalen Verhaltens im Rahmen eines Oligopolexperiments. In Beiträge zur experimentellen Wirtschaftsforschung, (ed H. Sauermann), pp. 136-168. 506 507 508 25 Weber, S. J. & Cook, T. D. 1972 Subject effects in laboratory research: An examination of subject roles, demand characteristics, and valid inference. Psychol Bul 77, 273-295. (DOI:10.1037/h0032351) 509 510 511 26 Almenberg J., Dreber A., Apicella C. L., & Rand D.G. 2011 Third Party Reward and Punishment: Group Size, Efficiency and Public Goods, in Psychology of Punishment Nova Science Publishers. Eds. NM Palmetti et al. 512 513 27 Wilson, T. D. & Gilbert, D. T. 2005 Affective forecasting: Knowing what to want. Curr Dir Psychol Sci 14, 131-134. Altruistic punishment: A closer look 23 514 515 28 Cook, K. S. & Yamagishi, T. 2008 A defense of deception on scientific grounds. Soc Psychol Q 71, 215-221. 516 517 29 Kawakami, K., Dunn, E., Karmali, F. & Dovidio, J. 2009 Mispredicting affective and behavioral responses to racism. Science 323, 276-278. (DOI:10.1126/science.1164951) 518 519 30 Hareli, S. & Weiner, B. 2002 Dislike and envy as antecedents of pleasure at another’s misfortune. Motiv Emot 26, 257-277. 520 521 31 Reuben, E. & van Winden, F. 2008 Social ties and coordination on negative reciprocity: The role of affect. J Public Econ 92, 34-53. (DOI:10.1016/j.jpubeco.2007.04.012) 522 523 32 Aronson, E., Ellsworth, P., Carlsmith, J. & Gonzales, M. 1990 Methods of research in social psychology. New York: McGraw-Hill. 524 33 Camerer, C. 2003 Behavioral game theory. New York: Princeton University Press. 525 526 527 34 Ernest-Jones, M., Nettle, D. & Bateson, M. 2011 Effects of eye images on everyday cooperative behavior: A field experiment. Evol Hum Behav 32, 172-178. (DOI:10.1016/j.evolhumbehav.2010.10.006) 528 529 530 531 35 Petersen, M. B., Sell, A., Tooby, J. & Cosmides, L. 2010 Evolutionary psychology and criminal justice: A recalibrational theory of punishment and reconciliation. In Human Morality and Sociality: Evolutionary and comparative perspectives (ed H. Høgh-Oleson), pp. 72-131. New York: Palgrave MacMillan. 532 533 534 36 Batson, C. D., Kennedy, C. L., Nord, L., Stocks, E. L., Fleming, D. A., Marzette, C. M., Lishner, D. A., Hayes, R. E., Kolchinsky, L. M. & Zerger, T. 2007 Anger at unfairness: Is it moral outrage? Eur J Soc Psychol 1285, 1272-1285. (DOI:10.1002/ejsp) 535 536 37 Zizzo, D. & Oswald, A. 2001 Are people willing to pay to reduce others’ incomes? Annales d’Economie et de Statistique 63-64, 39-65. 537 538 38 Price, M. E., Cosmides, L. & Tooby, J. 2002. Punitive sentiment as an anti-free rider psychological device. Evol Hum Behav 23, 203-231. 539 540 39 Phillips, S., Cooney, M., Carr, T. & Frady, B. 2005 Aiding peace, abetting violence: Third parties and the management of conflict. Am Sociol Rev 70, 334-354. 541 542 40 Brandts, J. & Charness, G. 2011 The strategy versus the direct-response method: a first survey of experimental comparisons Exp Econ 14, 375-398. 543 544 41 Gintis, H. 2000 Strong reciprocity and human sociality. J Theor Biol 206, 169-179. (DOI:10.1006/jtbi.2000.2111) Altruistic punishment: A closer look 24 545 546 547 42 Delton, A. W., Krasnow, M. M., Cosmides, L. & Tooby, J. 2011 Evolution of direct reciprocity under uncertainty can explain human generosity in one-shot encounters. Proc Natl Acad Sci USA (DOI:10.1073/pnas.1102131108) 548 549 43 Raihani, N. J., Grutter, A. S. & Bshary, R. 2010 Punishers benefit from third-party punishment in fish. Science 327, 171. (DOI:10.1126/science.1183068) 550 551 552 44 Smith, J. E., Van Horn, R. C., Powning, K. S., Cole, A. R., Graham, K. E., Memenis, S. K. & Holekamp, K. E. 2010 Evolutionary forces favoring intragroup coalitions among spotted hyenas and other animals. Behav Ecol 21, 284-303. (DOI:10.1093/beheco/arp181) 553 Altruistic punishment: A closer look 25 554 Figure Captions 555 556 557 Figure 1. Game structure 558 (a.) Round 1. All players started with $5. Subject was either Recipient or 3rd Party. The Dictator 559 (a fictitious player whose “decisions” were determined by computer script) either took $0 (fair 560 conditions; unbolded) or $4 (unfair conditions; bolded) from Recipient. (b.) Round 2. Players 561 started with $5. Subject was Dictator; previous “Dictator” was Recipient; 3rd party was excluded. 562 Subject was allowed to give any portion of $5, do nothing, or pay a 1:4 cost to deduct money 563 from Recipient. Money deducted from Recipient in Round 2 was “burned”—it was not gained by 564 Dictators as income. 565 Altruistic punishment: A closer look 26 566 567 Figure 2. Punishment/reward distributions (Experiment 1) 568 Amount of money (in $) the subject punished (negative values) or rewarded (positive values) the 569 Round 1 Dictator (N = 270). Values are in terms of the effect on the Round 1 Dictator, not the 570 cost to the subject (cost-to-punish ratio = 1:4; cost-to-reward ratio = 1:1). 571 Altruistic punishment: A closer look 27 572 573 Figure 3. Self-reported anger (Experiments 1, 2a and 2b) 574 Self-reported anger (scale: 0-5) toward Dictator following Round 1, controlling for envy (N = 575 943). Error bars = +/- 1 SE. 576 Altruistic punishment: A closer look 28 577 578 Figure 4. Punishment ≥ $4 (Experiments 1, 2a and 2b) 579 Proportion of subjects in unfair conditions that punished (Experiment 1) or reported they would 580 punish (Experiments 2a and 2b) ≥ $4 (N = 597). Error bars = +/- 1 SE. 581 Altruistic punishment: A closer look 29 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 Electronic Supplementary Material S1. Supporting Methods and Results S1.1 Random assignment to experimental condition and cell sizes Experiment 1: The condition in which witnesses of unfairness started with $9 was significantly smaller than the other conditions because we stopped enrolling subjects in this condition to increase power for statistical analyses involving the four cells of the 2 (Target: Self, Other) x 2 (Treatment: Fair, Unfair) design. This decision was made prior to any analysis of the data from this condition and was based solely on an attempt to maximize the number of subjects in each of the other four cells, given the rate at which we managed to recruit subjects into the study. All other cell size differences are due to the random nature of the assignments. Experiments 2a and 2b: Cell size differences are due to random assignment. The discrepancy in sample sizes between punishment/rewarding analyses and emotion analyses is a result of our adding the emotion questions halfway through data collection. S1.2 Data excluded from Experiment 1 analyses based on debriefing responses Data from 26 subjects (12 female; Age: M = 18.85, SD = 1.62) were excluded from all analyses, figures, and tables (including ESM) because they expressed scepticism during debriefing (see ESM Appendix for debriefing script) that they had been interacting with other people. Decisions to exclude individual subjects were made without knowledge of their experimental data. The number of subjects excluded did not vary by condition, χ2 (4, N = 341) = .459, p = .977, suggesting that none of the conditions induced greater scepticism than any other. Additionally, we re-ran all of the analyses with flagged subjects included and found that the results did not change qualitatively in any way—all of the significant relationships presented in the main text remained significant when flagged subjects were added to the analyses. S1.3 Lexical decision task measure of implicit anger following round 1 (Experiment 1): Method and results In addition to the self-report anger data reported in the main paper for Experiment 1, we collected data from a lexical decision task (LDT) following Round 1: Subjects decided, as quickly as possible, whether a string of letters was a word or a non-word; reaction times to 15 hostility-related words (e.g., “angry,” “kill”) in the LDT serve as an implicit measure of anger (with faster reaction times indicating more anger; see refs 34,35). A 2 (Target: Self, Other) x 2 (Treatment: Fair, Unfair) ANOVA revealed a significant target*treatment interaction for LDT anger following Round 1, F (1, 261) = 4.66, p = .032, partial 2 = .02 (see Fig. S1). Main effects were not significant. The reaction times to hostilityrelated words for subjects who witnessed another person being treated unfairly (natural logtransformed ms, M = 6.70, SD = 0.28, N = 62) did not differ from those who witnessed another being treated fairly (M = 6.66, SD = 0.27, N = 80), p = .416, partial 2 = .00. However, subjects Altruistic punishment: A closer look 30 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 who were treated unfairly (M = 6.58, SD = 0.29, N = 59) had faster reaction times than did subjects who were treated fairly (M = 6.69, SD = 0.32, N = 64), p = .029, partial 2 = .02, suggesting greater anger among those who had been treated unfairly. Reaction times to neutral words are typically controlled for in analysing LDT data [34,35]. However, we did not control for reaction times to neutral words in our analysis because (a) reaction times to neutral words did not differ across conditions, F (3, 261) = 1.54, p = .205, partial 2 = .02, and (b) the inclusion of reaction times to neutral words as a covariate eliminated the significant interaction. Because statistical control of a non-significant covariate decreases power by subtracting degrees of freedom from the error term while not removing adequate sums of squares for error [36], we dropped it from our analysis. S1.4 Fairness/moral wrongness of the round 1 dictator’s behavior (Experiment 1): Witnesses and receivers view the transgression identically. A possible explanation for the differences between the emotional and behavioral responses of recipients and witnesses of unfairness is that recipients paid more attention to the Dictator’s unfair behavior. However, the target*treatment interaction was not significant for perceived fairness (F (1, 266) = 1.73, p = .190) or moral wrongness (F (1, 266) = 1.33, p = .249) of the Dictator’s decision (see Fig. S2). There were, however, significant effects of treatment: Unfair Dictators’ behavior was perceived as more unfair, F (1, 266) = 354.13, p < .001, and morally wrong, F (1, 266) = 182.03, p < .001, than fair Dictators’ behavior. Thus, subjects in both unfair Dictator conditions clearly judged that a transgression had occurred, but they were only angered and motivated to punish when they personally were targets of unfairness—consistent with previous findings [27]. S1.5 Effects of round 1 dictator behavior on self-reported anger toward the round 1 dictator, envy uncontrolled When envy was not controlled, witnesses of unfairness did appear to get angry at unfair Dictators, relative to witnesses’ anger in response to fair dictators: A 2 (Target: Self, Other) x 2 (Treatment: Fair, Unfair) ANOVA revealed significant main effects of both target, F (1, 266) = 10.81, p = .001), and treatment, F (1, 266) = 85.09, p < .001) for anger, and a significant target*treatment interaction for anger, F (1, 266) = 12.44, p < .001). Witnesses of unfairness (M = .813, SE = .110, N = 65) reported more anger than did witnesses of fairness (M = .198, SE = .098, N = 80), p < .001, partial 2 = .06, and recipients of unfairness (M = 1.55, SE = .115, N = 61) reported more anger than recipients of fairness (M = .172, SE = .108, N = 64), p < .001, partial 2 = .22. However, as discussed in the main text, witnesses of unfairness reported no more anger than witnesses of fairness when envy was statistically controlled. Thus, witnesses’ anger at unfair Dictators can be attributed to envy rather than to moralistic anger. Altruistic punishment: A closer look 31 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 S2. Supplementary Notes 687 688 Below we address some possible concerns with the use of deception in our method, and why these concerns do not affect the validity of our results. 689 690 691 692 693 694 695 696 697 698 Authenticity of debriefing responses. It could be argued that our debriefing process was inadequate to identify all cases of scepticism of our deception among our participants because participants were being compensated both with money and partial course credit; that is, perhaps they were coerced into responding that they believed the deception because they felt they would not receive compensation if they responded that they did not. This is highly unlikely. Participants read and sign consent forms prior to participation in any experiment at the University of Miami—as is the case with any IRB-approved study involving human subjects in the United States—that explicitly state that participants cannot be denied compensation based on their responses; actual amount of money earned may vary based on decisions in experimental tasks, but course credit is required to be granted once a consent form is signed. 699 700 701 702 703 704 705 706 707 708 Possible contamination of the subject pool. If participants are regularly subjected to experiments in which deception is used, it could be argued that perhaps they come to expect to be deceived in future experiments in which they participate [37], and consequently provide responses that do not accurately reflect natural behavior. This is unlikely to be a significant issue in the psychology department’s subject pool at the University of Miami for several reasons. First, our subject pool consists only of undergraduates currently enrolled in a specific introductory psychology course; once they have completed the course they are no longer eligible to participate in experiments. Second, members of the subject pool only participate in five total hours of experiments, which translates to participation in approximately two to four experiments. Third, many of the experiments that members of our subject pool participate in do not involve deception. Thus, it is S2.1 The use of deception Though experimental economists typically resist the use of deception in experiments, its use here is justified: there was no practical way we could have obtained a sufficiently large sample size without deception. We sought to gather data from at least 50 subjects per cell of our main 2x2 design in order to have adequate statistical power. Without the use of deception, we would have had to rely on a minimum of 100 subjects, in the role of the Round 1 Dictator, to take exactly $4.00 from the Recipient (unfair conditions) and at least 100 more subjects to take exactly $0.00 from the Recipient (fair conditions). Assume that Round 1 Dictators would be equally likely to make one of the 11 possible choices on the give-take continuum, from giving $5 to taking $5, in whole-dollar increments.* We would need N = 1,100 (100 subjects per choice*11 choices) to achieve the same statistical power without the use of deception as with 200 subjects in our actual paradigm. Considering that our interest lies entirely with how subjects responded to Dictator actions, and not the actions themselves, such a design would be wasteful of subjects’ time (thereby altering the ratio of benefits to risks of the experiment, and thus its ethicality) and resources. * It is probably unrealistic to assume that subjects would be equally likely to choose any one of the 11 options, but this oversimplification is more than adequate to demonstrate the present point. Also, in our actual design with predetermined Round 1 Dictator actions, subjects in Round 2 were allowed to reward or punish any amount in this range, down to the nearest cent: they made their decision by typing in a precise amount. Altruistic punishment: A closer look 32 709 710 unlikely that participants in our subject pool have become accustomed to participating in experiments involving deception to the point where their behavior is altered by its expectation. 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 Running subjects in individual sessions. The decision to run subjects either individually or in groups presented a trade-off between experimental control and a more realistic setting. We decided to run subjects individually as the benefits of doing so greatly outweighed the potential costs of the less realistic setting. First, though subjects may have been more sceptical that they were actually interacting with others since there were no other people in the room, the scenario we presented was very plausible: subjects were told that two other participants we located elsewhere in the psychology building (a five-story building with lots of foot traffic). Second, running a third-party punishment study with two other people in the room introduces variables relative to their physical appearances including: sex, clothing style, coalitional markings (e.g., fraternity symbols, sports team logos), perceived formidability, friendliness, ethnicity, etc. Even though subjects’ identities are anonymous in the game and subjects are separated with partitions during play, subjects will inevitably interact sometime during the experimental session. Thus, running subjects individually under the perfectly plausible guise of their interacting with other people in another room avoids the potential experimental noise that can be introduced by these other factors. Concern has previously been raised that running subjects in isolated rooms may potentially increase scepticism that interactions are legitimate, thereby influencing behavioural results. Frohlich, et al. [38], empirically tested whether running a dictator game with isolated subjects led to different results as compared with running the game with subjects in the same room. In their discussion, Frolich, et al. state, “Contrary to the hypothesis, most measures of subjects’ uncertainties were not significantly different as a result of the change in the number of rooms. Indeed, only doubt that the money left in the envelope would be given to the paired other was significantly reduced in the One Room experiments.” We note that this effect was only statistically significant by one-tailed test (p = .038), whereas the effects on other, similar, dependent variables were not statistically significant (even by one-tailed test: Did not view experiment as a game; Not sure that description was accurate; and, most importantly, Not sure that there were real people paired.) S2.2 Inconsistency with previous experimental results It could be argued that the reason we found no altruistic punishment in Experiment 1 is because we changed multiple aspects of the standard design of the third-party punishment game at once. Indeed, we made several changes in an attempt to minimize or eliminate (1) audience effects, (2) experimental demand, (3) affective forecasting, and (4) potential extraneous variables introduced by brief interactions with other participants. However, there are several reasons this argument is not compelling. First, we observed a significant amount of second-party punishment so, clearly, there was not a general suppression of punishment in our design. Second, the cost of punishment in our design (1:4) was less expensive than the 1:3 cost-to-punishment ratio typically used in the third-party punishment game [e.g., 6]. Indeed, our design should have encouraged punishment relative to previous designs. Third, our purpose here was to test for the presence of altruistic third-party punishment in a well-controlled experimental design, not to iteratively make changes to a design that contained several features that likely produced artefactual results—to achieve this well-controlled design, all of our changes needed to be made simultaneously. Altruistic punishment: A closer look 33 755 756 References 757 758 34 Ayduk, O., Mischel, W. & Downey, G. 2002 Attentional mechanisms linking rejection to hostile reactivity: The role of “hot” versus “cool” focus. Psychol Sci 13, 443-448. 759 760 761 35 Gollwitzer, M. & Denzler, M. 2009 What makes revenge sweet: Seeing the offender suffer or delivering a message? J Exp Soc Psychol 45, 840-844. (DOI:10.1016/j.jesp.2009.03.001) 762 763 36 Tabachnick, B. G. & Fidell, L. S. 1989 Using Multivariate Statistics. New York: Harper & Row. 764 765 37 Hertwig, R. & Ortmann, A. 2001 Experimental practices in economics: A methodological challenge for psychologists? Behav Brain Sci 24, 383-451. 766 767 768 38 Frohlich, N., Oppenheimer, J., & Bernard Moore, J. 2001. Some doubts about measuring self-interest using dictator experiments: the costs of anonymity. J Econ Behav Organ 46, 271-290. 769 770 Altruistic punishment: A closer look 34 771 S3. Supplementary Figures and Captions 772 773 774 775 Figure S1. Reaction times to hostility-related words in the lexical decision task. Altruistic punishment: A closer look 35 776 777 778 779 780 Figure S2. Ratings of the fairness and moral wrongness of the Round 1 Dictator’s decision. Scale from 1 (Not at all fair/morally wrong) to 9 (Totally fair/morally wrong). Altruistic punishment: A closer look 36 781 782 783 784 785 786 S4. Supplementary Tables Fair – Recipient Mean SD Unfair Recipient Mean SD Fair Witness Mean SD Unfair Witness ($5) Mean SD Unfair Witness ($9) Mean SD Variable Overall Mean SD $ Punished/Rewarded* -0.18 1.46 0.17 0.68 -1.12 2.08 0.34 1.05 -0.24 1.37 -0.22 1.42 LDT RT** 6.65 0.29 6.69 0.32 6.58 0.29 6.66 0.28 6.70 0.28 6.61 0.31 Moral wrongness 3.34 2.67 1.31 1.19 4.48 2.49 1.61 1.79 5.18 2.30 5.13 2.31 Fairness 6.07 2.95 8.61 1.08 3.64 2.40 8.48 1.24 4.26 2.43 4.07 2.00 Anger 0.64 1.01 0.14 0.44 1.54 1.29 0.17 0.46 0.84 1.07 0.67 0.88 Envy 0.85 1.17 0.27 0.61 1.47 1.40 0.36 0.76 1.48 1.27 0.80 1.02 * Negative values indicate punishment ** Response time to hostility-related words in lexical decision task, ln-transformed ms Table S1. Variable summary statistics – Experiment 1. Altruistic punishment: A closer look 37 787 788 789 790 Fair – Recipient Mean SD Unfair Recipient Mean SD Fair Witness Mean SD Unfair Witness Mean SD Variable Overall Mean SD $ Punished/Rewarded* -0.23 2.08 0.05 1.69 -0.59 2.48 0.17 1.51 -0.45 2.28 Moral wrongness 3.08 3.23 1.04 2.13 5.32 3.04 0.79 1.81 4.72 2.68 Fairness 5.28 3.38 7.85 2.21 2.64 2.57 7.74 2.01 3.37 2.58 Anger 1.12 1.32 0.16 0.52 2.25 1.20 0.28 0.64 1.63 1.13 Envy 0.98 1.25 0.46 0.92 1.85 1.39 0.34 0.69 1.12 1.16 * Negative values indicate punishment Table S2. Variable summary statistics – Experiment 2a. Altruistic punishment: A closer look 38 Variable 791 792 793 794 Overall Mean SD Fair – Recipient Mean SD Unfair Recipient Mean SD Fair Witness Mean SD Unfair Witness Mean SD $ Punished/Rewarded* 0.25 1.83 0.42 1.71 -0.04 1.99 0.58 1.74 0.06 1.79 Moral wrongness 2.87 2.88 0.75 1.79 4.56 2.41 1.00 1.92 4.97 2.32 Fairness 5.40 3.25 7.92 2.17 3.24 2.24 7.70 2.34 3.02 2.30 Anger 0.90 1.15 0.15 0.51 1.60 1.21 0.24 0.62 1.51 1.11 Envy 0.92 1.26 0.39 0.91 1.68 1.28 0.21 0.53 1.27 1.42 * Negative values indicate punishment Table S3. Variable summary statistics – Experiment 2b. Altruistic punishment: A closer look 39 795 796 797 798 799 800 801 Appendix Debriefing script followed by experimenter: (1) Tell participant that the study is over. Ask if he/she has any questions. If the questions are about hypotheses or the deceptive elements of the experiment, explain that you will address those specific questions in just a few moments. 802 803 804 (2) Ask whether entire the experiment was clear in its overall purpose, and whether all aspects of the procedure made sense. Was there anything that the participant found confusing unclear? “Were you, at any point, unsure about what we were asking you to do?” 805 806 (3) We would find it very helpful to hear about any of your personal feelings and reactions to the experiment. Probe about what made person feel the way they felt. 807 808 809 810 (4) Today’s experiment was designed to help us test some very specific hypotheses about human behavior. Do you have any idea what those hypotheses were? If you had to guess, what would you say were the hypotheses we were testing today? We would like to know as many of your guesses about our hypotheses as you can come up with. 811 (5) Ask whether participant found any aspect of the procedure odd, upsetting or disturbing. 812 813 814 815 816 (6) Did you wonder at any point whether there was more than meets the eye to any of the procedures that we had you complete today? That is, do you think that there might have been any information that I held back from explaining to you about the experiment until now? Ask participant to say more about their suspicions, and to elaborate on their questions about the procedure. 817 818 (7) Ask how participant thinks (the suspicions he/she mentioned) affected his or her behavior during the study. 819 820 821 822 823 824 The experimenter then fully explained the nature of the deception to the participant and why it was a necessary part of the experiment. The experimenter also discussed the aims of the research with the participant, and answered any questions the participant had about their experience. Lastly, the experimenter and participant discussed possible ways for the participant to talk about the experiment with his or her peers in a manner that, while honest, would not spoil the deception for others in the subject pool. 825 826 827 828 829 830 Our experimenters reported anecdotally that participants generally found the experiment fun and interesting, and were typically fascinated with the study design after the deception was revealed in debriefing. Furthermore, the experimenters’ discussion of how to talk to other about the study helped to, in a sense, bring participants in as collaborators on the research such that (a) they would not feel that they had been taken advantage of, and (b) they would not spoil the study for others in the subject pool.
© Copyright 2026 Paperzz