Introducing the Leroy Problem: Extending the Frequency vs. Probability Debate to Racial Stereotypes via an Evolutionary Perspective

Nazia S. Mirza and Herbert H. Blumberg

Corresponding author: Herbert H. Blumberg, Department of Psychology, Goldsmiths College, University of London, London SE14 6NW, England. Phone (from USA): 011-44-20-7919-7896; Fax: 011-44-20-7919-7873; E-mail: [email protected]

Introducing Leroy 2

Running head: RACE AND THE FREQUENCY VS. PROBABILITY DEBATE

Introducing the Leroy Problem: Extending the Frequency vs. Probability Debate to Racial Stereotypes via an Evolutionary Perspective

Nazia S. Mirza and Herbert H. Blumberg
Goldsmiths College, University of London

Abstract

The Bayesian vs. frequentist paradigms are here extended to the issue of racial stereotypes. It has been widely argued that human beings do not embody an innate calculus of probability and are not Bayesian thinkers. Bayesian probabilists argue that probability refers to subjective degrees of confidence, while frequentists hold that probability refers to frequencies of events in the real world. A growing body of research has shown that frequentist versions of Bayesian problems elicit Bayesian reasoning. This study (N = 118) replicated Fiedler's finding that a frequency version of the Linda problem elicits Bayesian reasoning in about 75% of participants, compared with 17% for the probability version in Tversky and Kahneman's studies. It also found, however, that the inductive-reasoning mechanism that operates on frequency input is not activated when a racial stereotype is generated.

Keywords: stereotypes, conjunction, Bayesian, frequentists, probabilists

Introducing the Leroy Problem: Extending the Frequency vs. Probability Debate to Racial Stereotypes via an Evolutionary Perspective
Life is filled with decisions that are based on the likelihood of uncertain events--for instance, guessing the outcome of a general election or deciding on the innocence or guilt of a defendant. The questions of how we make our decisions and what influences them are intriguing and important. It appears that human beings use heuristic principles to simplify the task of using probabilities to make predictions, but in the process, they often make systematic errors (Tversky & Kahneman, 1982). There are many biases found in the intuitive judgement of probability; these include representativeness, misconceptions of chance, and base rate neglect. For many years researchers have been fascinated with solving the "Linda problem," first used by Tversky and Kahneman (1982) in their original study looking at representativeness in forming bias. The Linda problem involves the conjunction rule, one of the simplest rules of probability (Tversky & Kahneman, 1982). The rule states that the joint occurrence of two events cannot be more likely than the occurrence of either single event. Despite its simplicity, the vast majority of people fail to apply the rule to the Linda problem and commit what Tversky and Kahneman call a conjunction fallacy. In the original presentation of the problem, participants were provided with a brief personality description of Linda (including elements that implied feminism) and were then asked to rank order the probabilities of various statements being true: "Linda is a bank teller" [Constituent B] and "Linda is a bank teller and is active in the feminist movement" [Conjunction of B + F], alongside 6 other outcomes. Tversky and Kahneman (1982) found that 83% of participants ranked the conjunction (B + F) as more likely than the constituent (B)--a mathematical impossibility, as the bank-teller category contains those who are and are not feminists.
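The conjunction rule itself can be illustrated with a short numerical sketch. The counts below are hypothetical, chosen only to mirror the bank-teller/feminist categories; they are not data from any study.

```python
# Illustration of the conjunction rule: P(A and B) can never exceed P(A).
# All counts here are hypothetical and purely for illustration.

population = 1000           # hypothetical population size
bank_tellers = 100          # people who are bank tellers
feminist_bank_tellers = 30  # the subset who are also feminists

p_b = bank_tellers / population                 # P(bank teller) = 0.10
p_b_and_f = feminist_bank_tellers / population  # P(bank teller AND feminist) = 0.03

# The conjunction describes a subset of the constituent category,
# so its probability can never be larger.
assert p_b_and_f <= p_b

# Ranking the conjunction above the constituent -- as 83% of Tversky and
# Kahneman's participants did -- therefore violates the rule.
print(p_b, p_b_and_f)
```

Whatever counts are plugged in, the subset relation guarantees the inequality, which is why ranking B + F above B is a fallacy regardless of how representative Linda seems.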
Surprisingly, they also found that statistical sophistication had little or no effect on the rate of conjunction fallacies. They suggested that this surprising finding reflected the representativeness heuristic in operation. However, this was not conclusive, and despite intensive research on the Linda problem, which has led to the identification of many potential variables that contribute to the conjunction fallacy, no one has come close to eliminating it (Epstein, Denes-Raj, & Pacini, 1995; cf. Hertwig & Chase, 1998; Hertwig & Gigerenzer, 1999). In a replication study where all personality descriptions were eliminated from the description of Linda, Tversky and Kahneman (1982) elicited a very low rate of conjunction fallacies. However, this does not necessarily mean that the conjunction rule is being correctly applied. Participants are probably judging the combined activities of "bank teller and feminist" as incompatible and therefore still engaging the representativeness heuristic (Epstein, Denes-Raj, & Pacini, 1995). So why is the Linda problem so difficult? Tversky and Kahneman (1982) conclude that the high rate of conjunction fallacies is due to use of a judgmental heuristic, which they describe as a strategy "that relies on a natural assessment to produce an estimate or a prediction" (Tversky & Kahneman, 1983, p. 294). In simple terms, people make judgements about the Linda problem based on perceiving Linda as being more similar to bank tellers who are feminists (thus rendering this combination very "available") than to bank tellers in general.

The Paradigm of Evolutionary Psychology

Evolutionary psychology has become increasingly relevant to modern psychology in general. Researchers are searching for possible underlying evolutionary psychological mechanisms to explain the existence of behaviors.
After all, the "human brain did not fall out of the sky, an inscrutable artifact of unknown origin, and there is no longer any sensible reason for studying it in ignorance of the causal processes that constructed it" (Cosmides & Tooby, 1994, p. 85; but see also Gigerenzer & Hoffrage, 1995). The evolutionary history that led to our present "form" consists of a "step-by-step succession of designs modified across millions of generations" (Cosmides & Tooby, 1994, p. 86). Modifications are the results of either chance or natural selection (if one ignores the creationist perspective), with the only plausible explanation for complex functional designs being natural selection (Dawkins, 1986). Natural selection works as follows: a long-enduring adaptive problem results in the formation of various competing designs to cope with it. The designs that better enhance their own propagation relative to alternative designs are selected for and eventually become the norm. It is important to understand that evolution is a historical and not a predictive or "foresightful" process. Hence our current design is geared towards adaptive problems of the past without regard to the problems of the present. For humans, our cognitive mechanisms are, as it were, designed to solve the adaptive problems created by the situations our Pleistocene hunter-gatherer ancestors faced. It is thus viewed as coincidental that a mechanism may solve present-day problems, and this event plays little or no role in explaining how the mechanism came, in the first place, to have the design it does (Cosmides & Tooby, 1994). Also, it is important to understand that natural selection does not always produce perfect or optimal designs (Darwin, 1859; Dawkins, 1976, 1982). Cognitive psychology has traditionally given emphasis to the acquisition of knowledge rather than to the regulation of action. Understanding evolutionary theory turns this emphasis on its head.
The brain evolved mechanisms to acquire knowledge because knowledge is important in the regulation of action. One should be asking what a mechanism was designed to do rather than what it can do. As described by Cosmides and Tooby (1994), "Because an adaptive problem and its cognitive solution ... need to fit together like a lock and a key, understanding adaptive problems tells one a great deal about the associated cognitive mechanisms" (p. 96).

Frequencies or Probabilities

Following the work of Tversky and Kahneman (1982, 1983; Kahneman & Tversky, 1972), conventional psychology now adheres to the idea that people's "untutored intuitions" do not follow a calculus of probability (Cosmides & Tooby, 1996). However, the situation is not as simple as it seems; even professional probability theorists are in disagreement as to what probability means. Two of the prominent schools of thought are the frequentists and the Bayesians. (But see also Fiedler, Brinkmann, Betsch, & Wild, 2000; Hoffrage, Gigerenzer, Krauss, & Martignon, 2002.) Bayesians argue that probability refers to a subjective degree of confidence and, because one can express one's confidence that a single event will occur, it is possible to refer to the probability of a single event. In contrast, frequentists argue that probability refers to the relative frequencies of events in the world, always defined over a specific reference class. Hence, a frequentist would argue that a single event (such as Pat being a bus driver) cannot have a probability (as it does not have a relative frequency), and, as such, accurate probabilities for single events cannot be computed by a calculus of probability within the mind. Gigerenzer (1991) argued that even people who are not aware of the finer points of probability theory may implicitly make the Bayesian vs. frequentist distinction and that, for most domains, the human mind represents probabilistic information as frequencies.
If this is correct, it raises the question: can humans make judgements under uncertainty that obey the rules of probability theory if the probabilistic information provided (and the answer required) is in terms of frequencies? Tversky and Kahneman (1982) argue that the laws of chance are neither intuitively obvious nor applied very easily, suggesting that the "human mind is not designed to spontaneously learn such rules" (Tversky & Kahneman, 1974, p. 1130). When one is considering what the brain is designed to do, the paradigm of evolutionary psychology is inescapable. Judgement under uncertainty is an adaptive problem that would have regularly been experienced by our Pleistocene hunter-gatherer ancestors, and statistical rules or judgmental heuristics could have been used to solve it. It appears from Tversky and Kahneman's finding that this is true. However, it is important to ask why one design was selected over another (Cosmides & Tooby, 1996). It seems odd that natural selection would favor a design that used error-prone heuristics rather than an accurate calculus of probability. There is evidence to suggest that some birds and insects, with nervous systems considerably simpler than those of humans, utilize very sophisticated statistical reasoning when foraging (Real, 1991). Staddon (1988) argued that many organisms, from sea snails to humans, have learning mechanisms responsible for a variety of tasks (e.g., habituation) that can be described as "Bayesian inference machines." When one considers this evidence, it seems unlikely that birds, insects, and other similarly less sophisticated organisms can carry out statistical functions that the human brain is considered incapable of.

A well-engineered reasoning mechanism. Cosmides and Tooby (1996, p. 14) identified that the Marrian question of "what should the design of a well engineered reasoning mechanism look like?" needs to be addressed, and experiments constructed that can detect these designs (Marr, 1982).
In ancestral times the only reliable database for information would be one's own observations and those shared by the small community within which one lived. The probabilities of single events would not have been available; instead, contemporary humans' ancestors would have thought in terms of "encountered frequencies" (Cosmides & Tooby, 1996). So if one considers Gigerenzer's hypothesis that the mind is a good intuitive statistician of the frequentist school and places it in an evolutionary framework, one is left with having evolved mechanisms that "took frequency information as input, maintained such information as frequentist representations and used these representations as a database for effective inductive reasoning" (Cosmides & Tooby, 1996, p. 17; but cf., for instance, Evans, Handley, Perham, Over, & Thompson, 2000). So experiments should find that performance on tasks that involve judgement under uncertainty will differ depending on whether participants are asked to judge the frequency or the probability of a single event. This difference will favor frequency versions of problems, which should elicit superior performances. This appears especially likely when one considers that there is evidence to suggest that people possess a mechanism that is designed to encode frequency information very accurately as well as automatically (Attig & Hasher, 1980; Hasher & Zacks, 1979; Zacks, Hasher, & Sanft, 1982). It is important to understand that performances are expected to improve with frequency versions but not to be perfect. This is true even when an "optimum algorithm" is selected for, as it is theoretically impossible to build an "omniscient algorithm" (Cosmides & Tooby, 1996).
Hence Tversky and Kahneman's (1982) original finding does not necessarily mean that human beings are not good intuitive statisticians but rather may indicate that the information provided to solve the problem was not in frequentist terms, and so the mind's calculus of probability was not designed to solve it. Cosmides and Tooby (1996) applied the frequency hypothesis to a problem famous in the heuristics and biases literature for eliciting base rate neglect. They found that correct Bayesian reasoning could be elicited in 76% (92% in the most ecologically valid condition) of participants. Fiedler (1988) tested the frequency hypothesis on the Linda problem. Indeed, one might wonder how there could be "more than one Linda." Frequency conditions typically state that "there are 100 people who fit the description above. How many of them are: ..." In any event, Fiedler found that the vast majority of participants who commit the fallacy (quoted by Fiedler as usually 70-80%) only do so on the probability version, where they are asked to rank statements with respect to their probability. The frequency version saw this number drop to less than 20% (here participants were asked "To how many out of 100 people do the following statements apply?" as per Fiedler, 1988). The findings suggest that statistical judgements may obey the conjunction rule provided that the task is formulated appropriately; that is, with frequencies. Fiedler (1988) also found the same drop in conjunction fallacies when the frequency judgement task was not presented as a "probability-like" judgement task (i.e., not "how many out of 100"). The results from these studies suggest that humans, like other animals, have inductive reasoning mechanisms that embody a calculus of probability but that these mechanisms may have been designed to operate when information is presented in a frequency format.
(The mechanisms' design appears consistent with--although it is not necessarily a result of--the adaptive problems that people's ancestors faced.) Their existence remained hidden in earlier studies because (at least under the circumstances studied) the mechanisms are unable to operate accurately when the information provided is in a non-frequency format (Brase, Cosmides, & Tooby, 1998).

Avoiding confounds and extending the debate to racial stereotypes. Epstein, Donovan, and Denes-Raj (1999) have criticized the relevance of Fiedler's findings because the response formats of the two versions of the Linda problem were confounded in important ways, one requiring ranking and the other frequency estimates. It was reported that when the confounds were eliminated, similar rates of conjunction fallacies were obtained for frequency and probability versions of the Linda problem (Epstein, Denes-Raj, & Pacini, 1995).

Racial Stereotype

The stereotypes that are invoked by the description of a fairly young black male, driving an expensive car and wearing designer-brand clothing, are numerous and often entail the object of the description being involved with crime or drugs. Although the fact that he is a university graduate confounds this, it was decided to include the information so that "Leroy" is more comparable to Linda. Intergroup discrimination is a feature of most modern societies. Racial tensions may be as prevalent today as they have ever been, despite the increased opportunities for positive contact between blacks and whites. A considerable body of research has searched for circumstances under which contact between whites and blacks results in positive intergroup relations (see, e.g., Henderson-King & Nisbett, 1996). However, there is little evidence to show that positive contact has prolonged effects or influences at the group level.
Disturbingly, research indicates that despite being given increased opportunities for interracial contact, white American attitudes towards blacks remain at best ambivalent (Dovidio, Evans, & Tyler, 1986; Gaertner & McLaughlin, 1983; McConahay, 1986; cf. D. Katz & Braly, 1933). Henderson-King and Nisbett (1996) showed that seeing a black person behave with hostility, or even simply overhearing a conversation in which a black individual is the perpetrator of a hostile event, resulted in participants perceiving blacks as more antagonistic than whites (there being no equivalent effect for whites). An explanation of this may be that a single black person's behavior may have an inordinately large influence on white people's attitudes towards blacks in general. These attitudes may be particularly disproportionate to reality if the observed behavior is negative. It appears to be a case of another heuristic, labelled by Tversky and Kahneman (1982) as the "law of small numbers." People rely too heavily on small, fortuitous samples to make judgements, remaining blind to the fact that their observations can be explained by sample variability. To put it another way, most people have substantial and varied ongoing social interaction with both men and women. By contrast, many people may be especially ready--with minimal cueing (such as being presented with a single vivid example)--to use the stereotype of a racial "outgroup."

Prediction. The above evidence suggests that it can be very difficult to remedy the problem of intergroup discrimination; owing to the effect of the law of small numbers, it seems that many people are too ready to associate certain stereotypes with black people (here referring to people of African origin).
It is predicted that, arguably because of the seeming strength of racial stereotypes, the frequency hypothesis will not be supported for the Leroy problem and hence that the probability and frequency versions will elicit approximately equivalent rates of conjunction fallacies.

The present study

The present study seeks to replicate Fiedler's finding that a frequency version of the Linda problem will elicit fewer conjunction fallacies (CF) than the probability version. Note that it avoids Epstein, Donovan, and Denes-Raj's (1999) criticism by requiring participants to give either probability or frequency estimates, hence avoiding ranking altogether. Thus, mode (frequency or probability) acts as the first within-subjects variable. It also introduces a question that would cause participants to engage in a racial bias if operating the representativeness heuristic; this is the Leroy problem. The Leroy vignette reads: "Leroy is 28 years old, black and single. He is intelligent and studied economics at university. He likes to wear designer labels and drives an expensive car." (Note: the vignette and all possible selections can be found in the Method section.) Another effect that is examined is that of practice. In everyday life, people participate in a variety of often inter-related situations. In the present study, there is a between-subjects variable of sequence designed to determine whether practice with one problem type--frequency--improves performance with the other (more difficult) problem type--probability. To ensure that participants had the intuitive knowledge to use the conjunction rule and to solve conjunction problems, a question was included that tested for this ability, which is most reliably demonstrated by pairing two palpably unlikely events.
The question (lottery vignette, as described below) was first used by Epstein, Denes-Raj, and Pacini (1995), where it was found that only 6.5% of participants committed the conjunction fallacy when answering it, compared to 67.5% for fallacies in the Linda problem. A prediction for the present study is that conjunction fallacies will be more common for the probability version of the Linda problem than for the frequency version and also more common than for either version of the Leroy problem. If, as we would hypothesize, the frequency-probability difference in conjunction fallacies is not wholly stable--and, for instance, varies as a function of sample, experimental treatments (such as racial vs. gender stereotypes), measuring operations, and/or time and place of setting (cf. "UTOs" in Cronbach, 1982)--then this would not, of course, "disprove" people's inherent ability to avoid conjunction fallacies when using frequencies. (Equally, it would neither strengthen nor weaken a social/cultural rationale for the conjunction fallacy in general nor for the frequency/probability difference in particular.) It would, however, indicate that such ability is not robust and is notably context-dependent--and it would moreover go some way toward elucidating this dependency.

Method

Participants

There was a total of 120 participants, 30 for each sequence. They were approached in the refectory of Goldsmiths College, University of London, and were all undergraduates of the College. There were equal numbers of each sex in each condition, i.e., 15 male and 15 female. For the initial analysis the data from two participants had to be discarded, as they failed to complete the questionnaire properly (by leaving target scores blank, e.g., "Linda is a bank teller").
After closer examination of the data and initial analysis, it was decided provisionally to remove the data for 11 additional participants, as they appeared to have filled in their questionnaires without adequate thought, placing the same number (e.g., 50%) for all, or nearly all, outcomes. Including respondents who show such a response set would (in inferential statistical tests) give an inflated estimate of sample size and a deflated estimate of differences among probabilities (or among frequencies); nevertheless, the data were analyzed both with and without these respondents and, as indicated below, there were no major differences in the results.

Materials

Five conjunction problems were presented in counterbalanced order in four different questionnaire booklets, each booklet corresponding to one of the four sequences. The five problems were as follows: probability and frequency versions of the Linda problem, probability and frequency versions of the Leroy problem, and the lottery problem. The lottery problem was always presented last (in all four sequences), as previous research has revealed that when the lottery problem precedes the Linda problem, participants' performance on the Linda problem improves (Epstein, Denes-Raj, & Pacini, 1995). The Linda vignette, reproduced from Tversky and Kahneman (1983), reads as follows: "Linda is 31 years old, single, outspoken and very bright. She majored in philosophy. As a student she was deeply concerned with social issues of discrimination and social justice and also participated in anti-nuclear demonstrations" (p. 297). Participants were then asked the following. Probability version: Approximately how likely is each of the following? Rate each one SEPARATELY on a scale from 0-100% (i.e., they do not all have to add up to 100%). Frequency version: To how many out of 100 people do the following statements apply?
For both modes participants had to estimate the likelihood of the following statements: "Linda is a bank teller" (B) and "Linda is a bank teller and is active in the feminist movement" (B + F)--alongside 6 other statements, which were ignored in the data analysis.

The Leroy problem. The Leroy vignette reads as follows: "Leroy is 28 years old, black and single. He is intelligent and studied economics at university. He likes to wear designer labels and drives an expensive car." Participants were asked to respond to the corresponding probability and frequency questions as with the Linda problem and had to estimate the likelihood of the following statements alongside 6 other statements: "Leroy is a voluntary aid worker" (V) and "Leroy is a voluntary aid worker and likes to listen to Rap music" (V + R).

Lottery vignette. The lottery vignette--reproduced from Epstein, Denes-Raj, and Pacini (1995)--reads as follows (but note that the vignette is slightly adapted to suit participants from a British population, the word "lotteries" being replaced with "lottery tickets"): "Tom buys two lottery tickets, one from the state lottery and one from the local fire department. The chances of winning in the state lottery are one in a million. The chances of winning in the fire department lottery are one in a thousand" (p. 1127). Participants were then asked to rank order the following likelihood statements on a scale from 1 (most likely) to 3 (least likely): "Tom wins the state lottery," "Tom wins the fire department lottery," and "Tom wins the state lottery and the fire department lottery." Each vignette (Linda, Leroy, and lottery), along with its questions, was placed on an individual page.

Design

The study was a 2 (Mode: frequency vs. probability) x 2 (Basis: sex vs. race) x 4 (Sequence) mixed factorial design.
Sequence was a between-subjects variable and had 4 levels: Sequence 1 = F-Linda, F-Leroy, P-Linda, P-Leroy; Sequence 2 = F-Leroy, F-Linda, P-Leroy, P-Linda; Sequence 3 = P-Linda, P-Leroy, F-Linda, F-Leroy; Sequence 4 = P-Leroy, P-Linda, F-Leroy, F-Linda (where F = frequency and P = probability). In order to account for order effects (if any), 4 sequences were used such that, for either mode (frequency or probability), some participants answered a race problem and then a sex problem and others did the opposite. Mode and basis were within-subjects variables with 2 levels each, as already stated (see above).

Procedure

The experimenter approached participants in the college refectory and asked if they were willing to take part in a ten-minute study exploring people's judgements of likelihoods. Those who agreed were given one of the four types (i.e., sequences) of questionnaire booklets. Participants were randomly allocated to a sequence type. This was done by the experimenter, who, prior to approaching students in the refectory, sorted the questionnaires into a pile with sequences rotating in the following fashion: Sequence 1, 2, 3, 4, 1, 2, 3, etc. Participants were handed the top questionnaire on the pile, so that neither they nor the experimenter knew what sequence they were completing. The questionnaires had clear instructions on the first page that participants should not flick through the booklet but should complete each question in the sequence provided; once a question had been completed they were to turn the page and then not turn back again. The experimenter read through the instructions with each participant and then asked if they had understood them. The participant was then left to complete the questionnaire while the experimenter remained close by.
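The rotating allocation just described can be sketched in a few lines; the function and variable names below are purely illustrative and not part of the study's materials.

```python
# Sketch of the rotating questionnaire pile described above: sequences cycle
# 1, 2, 3, 4, 1, 2, ... and each successive participant takes the top booklet.
# Names here are illustrative only.

def pile_order(n_participants, n_sequences=4):
    """Sequence number (1..n_sequences) handed to each successive participant."""
    return [(i % n_sequences) + 1 for i in range(n_participants)]

order = pile_order(120)
print(order[:8])                           # [1, 2, 3, 4, 1, 2, 3, 4]
counts = {s: order.count(s) for s in range(1, 5)}
print(counts)
```

With 120 participants, this rotation yields exactly 30 booklets per sequence, matching the 30-per-sequence allocation reported under Participants.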
Scoring

The bias was scored quantitatively--e.g., if a participant gave the likelihood that "Linda is a bank teller" a rating of 20% (or 20/100) and gave the likelihood "Linda is a bank teller and a feminist" a rating of 25% (or 25/100), their score would equal +5 (25 - 20). A positive score indicates the breaking of the conjunction rule and a negative score the opposite, i.e., a lack of bias. (The initial analyses are, however, concerned with the sheer rate of CF--the proportion of positive scores.)

Results

Conjunction fallacies as a function of problem type. Table 1 shows that nearly two fifths of participants (39.8%) committed a conjunction fallacy (CF) in response to the probability version of the Linda vignette, compared to only 24.6% in the frequency version of the same problem. The difference is substantial and highly significant. (See Table 2; using a McNemar test to compare those 23 participants in the first two columns of the "+" row--who show CF only for probabilities--with the 4 in the first two rows of the "+" column, who show CF only for frequencies, χ2 = 13.37, df = 1, p < .001.)

-----------------------
Tables 1 and 2 about here
-----------------------

Nevertheless, the rate of CF for the "Probability-Linda" (P-Linda) version of 39.8% was much lower than Tversky and Kahneman's (1983) rate of 83% in their original study. Moreover, many participants failed to answer the lottery problem correctly (30.5%), a surprisingly high number when compared to 6.5% in Epstein, Denes-Raj, and Pacini's (1995) study. It is possible that the participants have poor math ability in general or did not understand the questions. In fact only 19 out of 118 could identify the conjunction rule explicitly. Fiedler's (1988) finding of only 22% of participants committing a CF with a frequency version of the Linda problem was replicated, with a difference of only 2.6%. The Leroy problem (which invoked racial stereotypes) did not establish the same trend.
It elicited CF in 36.7% of participants in the probability version (P-Leroy) compared to 40.7% in the frequency version. (The relevant "turnover table" is shown as Table 3.) As predicted, the frequency version did not reduce bias and indeed shows a non-significant trend towards increasing it.

-----------------------
Table 3 about here
-----------------------

Actual CF scores (that is, conjunction minus constituent). Figure 1 provides a graphical representation of the marginal means across sequence for all question types. The graph indicates that F-Linda seems unusual in the response it elicits, suggesting an interaction of Mode x Basis, yielding a low CF rate for the frequency version of the Linda problem. None of the other means appear very different, suggesting that there are no pronounced main effects. (See also the ANOVA results, below.)

-----------------------
Figures 1 & 2 about here
-----------------------

Practice effects. To establish if there is a practice effect, one needs to compare mean scores across all four conditions, i.e., all sequences. Figure 2 shows that the F-Linda version elicited lower scores (i.e., fewer CFs) across all sequences--except Sequence 2--than did the P-Linda, P-Leroy, and F-Leroy versions, with Sequence 4 (P-Leroy, P-Linda, F-Leroy, F-Linda) achieving particularly low scores. The graph also indicates that Sequence 2 (F-Leroy, F-Linda, P-Leroy, P-Linda) seems to have relatively low scores across the range of problem types. It suggests that practice with frequency-type questions enhances performance on probability-type questions, provided the racial-stereotype-inducing question is asked first. In order to assess the significance of differences among the means, a 2 (Mode: frequency vs. probability) x 2 (Basis: Linda/sex vs. Leroy/race) x 4 (Sequence) mixed factorial analysis of variance was carried out, with Mode and Basis as within-subjects measures and Sequence (1/2/3/4) as a between-subjects measure.
The grand means for the two levels of each of the within-subjects measures, Mode and Basis, were as follows: Probability, 7.81; Frequency, 4.93; Sex (Linda), 5.47; and Race (Leroy), 7.27. The analysis of variance revealed that there were no significant main effects but that there was a significant Mode x Basis interaction, F(1, 114) = 6.82, p < .01. From Figure 1 it seems clear that the interaction is due to low CF scores on the frequency version of the Linda problem. As it was predicted that the frequency version of the Linda problem would elicit fewer CFs than the probability version and that Mode would have no effect on the Leroy problem, a single paired-samples t-test was carried out. This revealed that participants, irrespective of sequence, displayed smaller CFs on the frequency version of the Linda problem compared to the probability version, t(117) = 3.08, p < .003.

Scores of Zero

Many participants achieved a score of zero for some answers. A zero score signifies that the participant answered the questions in one of two possible ways. Taking the Linda problem as an example (the same thinking applies to the Leroy problem): (1) the participant answered 0% or 0/100 for both the constituent (B) and the conjunction (B + F); (2) the participant answered with the same non-zero figure, e.g., 25% or 25/100, for both the constituent (B) and the conjunction (B + F). Both answers are logically possible, if improbable, and hence do not inevitably break the conjunction rule. Closer examination of the raw data revealed that some participants who had zero (or near-zero) scores via the second method had filled in the same (or virtually the same) number for all categories, e.g., 25%. It was decided to repeat the main analyses without the scores of these participants (n = 11) to see what impact, if any, they had on the findings.
In the event there were no major differences when the data were analyzed without the "zero scorers" (see Tables 1, 2, and 3).
Conjunction fallacies as a function of problem type ("zero scorers" removed)
Table 2 shows that participants performed better on the frequency version than on the probability version of the Linda problem, with 26.2% and 43.0% committing conjunction fallacies (CFs) respectively. Again the rate of CF for the P-Linda version is much lower than Tversky and Kahneman's (1983) 83%. Possible reasons for this drop in the number of CFs are discussed below. Again Fiedler's (1988) finding that a frequency version elicits fewer CFs than the probability version of the Linda problem was replicated, with only 26.2% of participants committing the fallacy in the frequency version. The Leroy problem elicited approximately the same number of CFs in the probability version as the Linda problem did, with the frequency version producing no drop in the number of fallacies. This is as predicted. A graph of the marginal means for each question type (Figure 3) reveals that removing the 11 participants had little effect on the means for the frequency and probability versions of the Linda problem and for the P-Leroy problem, but substantially reduced the scores for the F-Leroy version, making the frequency and probability versions of the Leroy problem approximately equivalent in terms of mean scores.
-----------------------
Figures 3 & 4 about here
-----------------------
Figure 4 shows the mean scores for each question type in each sequence. The graph makes it easy to see that Sequence 4 is the only one to undergo a major change, with the mean F-Leroy score dropping substantially (a drop of 4.45).
This happened mainly because two "zero scorers," both in Sequence 4, showed 0.0 conjunction fallacy for three of the four conditions but very large conjunction fallacies (50 and 90, respectively) for F-Leroy; in other words, removing the "zero scorers" stripped F-Leroy of two conjunction-fallacy outliers. The "stripped" result would tentatively suggest that practice with probability-type questions enhances performance on frequency-type questions, provided the sex-stereotype-inducing question is asked first. Otherwise, the same trends seem apparent with or without the "zero scorers," with Sequence 2 (F-Leroy, F-Linda, P-Leroy, P-Linda) appearing to elicit lower scores across all problem types and F-Linda scores being the lowest in all sequences except Sequence 2. In order to determine whether there were any significant differences between the means, and to assess any other impact of removing the scores of the 11 participants who appeared not to have completed the questionnaire appropriately, a 2 (Mode) x 2 (Basis) x 4 (Sequence) mixed factorial analysis of variance was carried out, with Mode and Basis as within-subjects measures and Sequence as a between-subjects measure. The analysis revealed a significant main effect for Mode, F(1, 103) = 5.689, p < .019. It also revealed two interactions: Mode x Basis, F(1, 103) = 3.966, p < .049, and Mode x Sequence, F(3, 103) = 2.313, p < .063. (Note that although the Mode x Sequence interaction was not significant at the 5% level, it was arguably close enough to merit brief discussion.) The grand means across all four sequences (cases with zero scores removed) were as follows: Probability, 7.80; Frequency, 3.25; Sex (Linda), 5.70; and Race (Leroy), 5.36. The main effect for Mode confirms that participants committed fewer CFs on frequency problems. From Figure 3 it appears that the Mode x Basis interaction is rooted in the Linda problem, with participants committing fewer CFs in the frequency version.
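The follow-up comparisons reported next use the Bonferroni correction. The procedure is simple: with k comparisons, each raw p value is multiplied by k (capped at 1.0). A minimal sketch, using hypothetical p values apart from the .021 reported in the text:

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of raw p values: multiply each
    by the number of comparisons, capping at 1.0."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]

# Four follow-up tests; only .021 comes from the text, the rest
# are illustrative placeholders.
raw = [0.004, 0.021, 0.30, 0.60]
adjusted = bonferroni(raw)
print([round(p, 3) for p in adjusted])  # [0.016, 0.084, 1.0, 1.0]
```

Note that .021 x 4 = .084 > .05, which is how a difference that looks significant on its own can fall short once the family-wise correction is applied.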
Several t-tests were carried out (the Bonferroni correction was applied to prevent capitalizing on chance, and probabilities were adjusted accordingly). The further analysis revealed, as predicted, that the frequency version only reduced CFs significantly for the Linda problem, t(106) = 2.973, p < .004, and not for the others (p > .05). Figure 4 shows the scores for each question type by sequence. Sequence 4 has atypically low scores for the frequency versions of both the Linda and the Leroy problems. To establish the source of the Mode x Sequence interaction, further statistical analysis was carried out. A total of four independent-samples t-tests were carried out, which revealed only one significant difference between sequences. Participants in Sequence 4 (P-Leroy, P-Linda, F-Leroy, F-Linda) committed significantly fewer CFs, for both the Linda and the Leroy problems, than those in Sequence 2 (F-Leroy, F-Linda, P-Leroy, P-Linda), t(106) = 2.39, p < .021. Once the Bonferroni correction was applied, however, this difference fell short of significance (p > .05). As the correction is quite conservative and the shortfall small, the effect should perhaps be treated as suggestive rather than dismissed outright.
Discussion
The initial analysis (with no scores removed) showed no main effects for Mode (frequency or probability) or Basis (sex or race). There was, however, a Mode x Basis interaction. This revealed, as predicted, that participants committed fewer CFs on the frequency version of the Linda problem than on the probability version. This replicated Fiedler's (1988) findings, in which only 22% of participants committed the fallacy when completing the frequency version of the Linda problem (24.6% in this study). The present study did not, however, replicate Tversky and Kahneman's (1982) original finding of 83% of participants committing the fallacy on the probability version of the Linda problem.
Instead only 39.8% made conjunction errors. This could be because the present participants were not ranking probabilities but giving quantified estimates (essentially thinking in terms of frequency even for the probability questions) (see, for example, Fisk & Pidgeon, 1996; Hertwig & Chase, 1998), or because some (approximately one third) of the participants were psychology students who may have been aware of Tversky and Kahneman's work (though we have no particular reason to think this was the case) and hence avoided the fallacy. Also, many participants voiced confusion over the profession "bank teller," an American term whose English equivalent is "bank cashier" or "bank clerk"--though this did not stop the sample, overall, from demonstrating a typical Linda-problem effect, albeit in attenuated form. It would be interesting to see whether a replication using language more appropriate to British participants would change the findings. Despite the drop in numbers, close to 40% of participants broke the conjunction rule, which Tversky and Kahneman (1983) describe as one of the simplest rules in probability. This requires an explanation. Kahneman and Tversky (1972) drew the conclusion, which has been widely accepted, that "In his evaluation of evidence, man is apparently not a conservative Bayesian: he is not a Bayesian at all" (p. 450). This view may well be premature, however. Frequentist versions of Bayesian problems do appear to elicit Bayesian reasoning, and a growing body of research supports this (see Brase, Cosmides, & Tooby, 1998; Cosmides & Tooby, 1996; Fiedler, 1988; Gigerenzer, Hell, & Blank, 1988). The present study has found that a frequency version of the famous Linda problem reduced the number of CFs to only 24.6%--a drop of 58.4 percentage points from Tversky and Kahneman's (1982) original finding.
It was predicted that the frequency version would not produce the same effect (as in the Linda problem--i.e., a drop in the number of CFs) on a problem that generated racial stereotypes. This was borne out. As noted above, stereotypes of Blacks are very easily created and reinforced. Moreover, these attitudes have proven very difficult to eliminate (see Gaertner & Dovidio, 1986; I. Katz & Hass, 1988; McConahay, 1986), partly owing to the operation of another heuristic, the law of small numbers (Tversky & Kahneman, 1982), whereby people rely too heavily on small samples when making judgements. Surprisingly (given the history of research on the Linda problem), the frequency version of the Leroy problem actually led to a non-significant increase in the number of CFs (a rise of roughly 4 percentage points) compared with the probability version. Closer examination of the data revealed that several participants who had committed fallacies on the F-Leroy version had filled in the same (or nearly the same) number for all problems throughout the questionnaire. The validity of these participants' data was brought into question, and it was decided to redo the analysis with their data removed; a total of 11 participants were excluded. The re-analysis showed a main effect for Mode and interactions for Mode x Basis and Mode x Sequence. Removing the anomalous scorers did lead to a drop in the number of CFs in the frequency version of the Leroy problem. Comparison of Figures 2 and 4 (graphs of mean scores by sequence) showed that most scores remained similar, but Sequence 4 (P-Leroy, P-Linda, F-Leroy, F-Linda) displayed a large change in the score for the frequency version of the Leroy problem. Many of the participants whose data were removed fell in this group. Whether this was due to chance assignment to sequence groups or to the experimental effect of initially facing the P-Leroy task is not certain.
What is clear from the present results taken together, however, is that the "CF-freeing" effect of the frequency task varies with context. The Mode x Basis interaction indicated the same effect as in the initial analysis: t-tests revealed a significant drop in scores (hence fallacies) only for the Linda problem. As predicted, the Leroy problem, which generated racial stereotypes, did show substantial CFs but did not show the frequency version reducing their number significantly. Apparently, participants still engage the representativeness heuristic even when the problem is posed in frequentist terms--possibly because the bias is so strong that the mechanism for inductive reasoning is bypassed and the representativeness heuristic engaged instead. Apparently the Black male stereotype is powerful enough to "override sound reasoning" in both frequentist and probabilistic modes. (This is not, of course, to say that using sound base-rate probabilities as a primary basis for social-cognition decisions is routinely maladaptive.) As Sears (2001) has put it, though, Black-White differences tend to "trump" all other categorizations. An evolutionary framework does not suggest that there are only mechanisms that deal with frequency input, and it may be that in some instances the use of a heuristic provides a more favorable outcome and is hence the mechanism selected for. It is plausible (albeit uncertain) that, as our hunter-gatherer ancestors lived and functioned socially within small discrete groups (Tooby & Cosmides, 1996), members of outgroups were treated with suspicion and a quick judgement (possibly applying the law of small numbers) was the most adaptive response.
Heuristic reasoning, including representativeness, has its adaptive functions; indeed, as Epstein, Donovan, and Denes-Raj (1999) noted: "In the real world, no one would doubt that the absence of such [heuristic] thinking can be highly maladaptive" (p. 213). It is important to note that although the Leroy problem involved generating a racial stereotype--that Black men like to listen to rap music--the stereotype is not a particularly negative one. It would be interesting to see whether more negative stereotypes, such as drug use or criminality, would elicit a similar effect. The conjunction fallacy is subject to shifts due to subtle differences in stimuli, and the Linda and Leroy problems necessarily differ in ways other than those linked directly to the stereotypes concerned. It is also possible that (relatively small) effects were not found in the Leroy problem (including a frequency-related reduction in CFs) because of the limited number of participants. Further research using a larger sample and a different (i.e., non-student) population may produce different findings. Students are generally considered to be more aware of issues of gender and race discrimination, and their answers may reflect this; a different sample population might well yield even stronger effects.
Additional general issues
It is worth taking stock briefly of what the present study implies as regards racial and sexual stereotyping and related matters. The results do indicate that basic logical (conjunction) errors can follow not only from sex-based but also from race-based stereotypical expectations. Moreover, although the present experiment confirms that participants may be more error-prone in probabilistic than in frequency-framed reasoning, the Leroy problem makes it clear that the latter does not grant "immunity" to such biases.
This may well be because racial stereotypes are less yielding than gender ones, though it might also, for instance, simply be due to our having hit upon a vignette with somewhat different properties from the Linda problem. For example, albeit in a different research area, Krueger and Rothbart (1988) found that, among sex-based examples, stronger stereotypes may be less "malleable" in terms of the inferences made from them in different contexts. Ideally, one would follow Cronbach's (1982) dictum to sample Treatments in much the same way that one would sample Units (participants). This is not easily done, however, and a substantial literature attests to the difficulty of replicating the Linda problem with other vignettes, much less exceeding it with a problem, such as the Leroy one, which yields substantial conjunction errors with frequency as well as probability questions. Any conclusions with regard to gender vs. racial stereotyping must therefore be drawn with great caution. Nevertheless, the results are at least consistent with the view that (a) frequency-based reasoning is well represented in our evolutionary history and (b) minority-group stereotyping is especially resilient to change (though fortunately not impervious to it--see, e.g., Aboud & Levy, 1999). It has long been known that psychological factors can disrupt logical reasoning. In the 1950s, for example, Abelson and Rosenberg's (1958) demonstrations of "psycho-logic" showed the impact of expectations on syllogistic accuracy (see also, for example, Simon & Holyoak, 2002). Until recently, however, such work has not been directly concerned with conjunction errors in categorical reasoning. Further programmatic research would be useful for understanding more precisely the properties and ecological frequency of logical errors.
Perhaps this could be accomplished, in part, by creating abstract "stereotypes": exposing participants to training trials in which simple shapes and objects are presented with different frequencies of occurrence for different manifestations (e.g., various shapes and sizes), and then asking frequentist and probabilistic questions framed so as to be conceptually parallel to those in the Linda and Leroy problems. It may well be possible to elicit conjunction errors without the "baggage" of personality vignettes (see also Yates & Carlson, 1986). Although it is tempting to view a conjunction fallacy as a binary distinction--present when, and only when, a conjunction is seen as more frequent or probable than one of its constituents--the signed difference apparently forms a fairly smooth distribution that spans the zero point. As with many variables, its magnitude is associated with differences both within and between individuals. That is, even differences where the constituent is (correctly) seen as larger than the conjunction, with no "conjunction error" as such, are arguably part of a scaled "degree of conjunction error." Thus phenomena that may be associated with some gender and racial stereotyping may be basic not only in their automaticity (as in implicit association tests) but also in the generality of the situations to which they may apply. In some contexts (cf. Borgida, Locksley, & Brekke, 1981; Locksley & Stangor, 1984) a potential social problem with conclusions based on stereotypes linked to sex or race is not that they are illogical--indeed they may be "overly" logical in a Bayesian sense (e.g., based on perceived prior odds)--but that they may be used to justify wholly unwarranted and unfair extrapolation and discrimination.
By contrast, in the present context it would seem that people may need to be more mindful of bias (such as possible conjunction errors due to exaggerated expectations) reflecting logical violations across a broad spectrum of examples, including basic categories (e.g., shapes) as well as stereotypes such as those related to women and minority-group members.
Conclusions
Whatever the cause, the present results confirm that conjunction fallacies found with probabilistic judgements may well be reduced when equivalent frequency judgements are made in some circumstances (the Linda problem)--but that such reduction does not necessarily take place (the Leroy problem). It seems that frequentist problems are able to access a cognitive mechanism for inductive reasoning in certain circumstances only. Further research into different types of problem that cause participants to engage a variety of stereotypes is needed to understand the full effect of the frequency hypothesis. It cannot be ignored that human beings are constantly exposed to actual frequencies of real events and that we, like many non-human animals, appear to have unconscious mechanisms for keeping track of these frequencies (Staddon, 1988). The evidence suggests that we can use information in a frequency format to apply statistical rules correctly when making judgements under uncertainty. However, the notable conjunction fallacies of the probabilistic version of the Linda problem seem to be readily manifest in the racial arena of the Leroy problem as well--so readily, in fact, that they apparently can override any probabilist/frequentist disparity. As Cosmides and Tooby (1996) pointed out, human beings in certain situations may be "good intuitive statisticians after all!" (p. 1)--but, one may need to add, only under finely balanced circumstances.
References
Abelson, R. P., & Rosenberg, M. J. (1958). Symbolic psycho-logic: A model of attitudinal cognition.
Behavioral Science, 3, 1-13.
Aboud, F. E., & Levy, S. R. (1999). Reducing racial prejudice, discrimination, and stereotyping: Translating research into programs. Journal of Social Issues, 55, 621-625.
Attig, M., & Hasher, L. (1980). The processing of frequency occurrence information by adults. Journal of Gerontology, 35, 66-69.
Borgida, E., Locksley, A., & Brekke, N. (1981). Social stereotypes and social judgment. In N. Cantor & J. F. Kihlstrom (Eds.), Personality, cognition, and social interaction (pp. 153-169). Hillsdale, NJ: Erlbaum.
Brase, G. L., Cosmides, L., & Tooby, J. (1998). Individuation, counting, and statistical inference: The role of frequency and whole-object representations in judgement under uncertainty. Journal of Experimental Psychology: General, 127, 3-21.
Cosmides, L., & Tooby, J. (1994). Origins of domain specificity: The evolution of functional organisation. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 85-116). Cambridge, England: Cambridge University Press.
Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgement under uncertainty. Cognition, 58, 1-73.
Cronbach, L. J. (1982). Designing evaluations of educational and social programs. San Francisco: Jossey-Bass.
Darwin, C. (1859). On the origin of species by means of natural selection, or, the preservation of favoured races in the struggle for life. London: Murray.
Dawkins, R. (1976). The selfish gene. Oxford, England: Oxford University Press.
Dawkins, R. (1982). The extended phenotype: The gene as the unit of selection. Oxford, England: W. H. Freeman.
Dawkins, R. (1986). The blind watchmaker. Harlow, England: Longman.
Dovidio, J. F., Evans, N., & Tyler, R. B. (1986). Racial stereotypes: The contents of their cognitive representations. Journal of Experimental Social Psychology, 22, 22-37.
Epstein, S., Denes-Raj, V., & Pacini, R. (1995). The Linda problem revisited from the perspective of cognitive-experiential self-theory. Personality and Social Psychology Bulletin, 21, 1124-1138.
Epstein, S., Donovan, S., & Denes-Raj, V. (1999). The missing link in the paradox of the Linda conjunction problem: Beyond knowing and thinking of the conjunction rule, the intrinsic appeal of heuristic processing. Personality and Social Psychology Bulletin, 25, 204-214.
Evans, J. St. B. T., Handley, S. J., Perham, N., Over, D. E., & Thompson, V. A. (2000). Frequency versus probability formats in statistical word problems. Cognition, 77, 197-213.
Fiedler, K. (1988). The dependence of the conjunction fallacy on subtle linguistic factors. Psychological Research, 50, 123-129.
Fiedler, K., Brinkmann, B., Betsch, T., & Wild, B. (2000). A sampling approach to biases in conditional probability judgments: Beyond base rate neglect and statistical format. Journal of Experimental Psychology: General, 129, 399-418.
Fisk, J. E., & Pidgeon, N. (1996). Component probabilities and the conjunction fallacy: Resolving signed summation and the low component model in a contingent approach. Acta Psychologica, 94, 1-20.
Gaertner, S. L., & Dovidio, J. F. (1986). The aversive form of racism. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination, and racism (pp. 61-89). San Diego, CA: Academic Press.
Gaertner, S. L., & McLaughlin, J. P. (1983). Racial stereotypes: Associations and ascriptions of positive and negative characteristics. Social Psychology Quarterly, 46, 23-30.
Gigerenzer, G. (1991). How to make cognitive illusions disappear: Beyond "heuristics and biases." In W. Stroebe & M. Hewstone (Eds.), European Review of Social Psychology (Vol. 2, pp. 83-115). Chichester, England: Wiley.
Gigerenzer, G., Hell, W., & Blank, H. (1988). Presentation and content: The use of base rates as a continuous variable.
Journal of Experimental Psychology: Human Perception and Performance, 14, 513-525.
Gigerenzer, G., & Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102, 684-704.
Hasher, L., & Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108, 356-388.
Henderson-King, E. I., & Nisbett, R. E. (1996). Anti-Black prejudice as a function of exposure to the negative behavior of a single Black person. Journal of Personality and Social Psychology, 71, 654-664.
Hertwig, R., & Chase, V. M. (1998). Many reasons or just one: How response mode affects reasoning in the conjunction problem. Thinking and Reasoning, 4, 319-352.
Hertwig, R., & Gigerenzer, G. (1999). The "conjunction fallacy" revisited: How intelligent inferences look like reasoning errors. Journal of Behavioral Decision Making, 12, 275-305.
Hoffrage, U., Gigerenzer, G., Krauss, S., & Martignon, L. (2002). Representation facilitates reasoning: What natural frequencies are and what they are not. Cognition, 84, 343-352.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430-454.
Katz, D., & Braly, K. (1933). Racial stereotypes of one hundred college students. Journal of Abnormal and Social Psychology, 28, 280-290.
Katz, I., & Hass, R. G. (1988). Racial ambivalence and American value conflict: Correlational and priming studies of dual cognitive structures. Journal of Personality and Social Psychology, 55, 893-905.
Krueger, J., & Rothbart, M. (1988). Use of categorical and individuating information in making inferences about personality. Journal of Personality and Social Psychology, 55, 187-195.
Locksley, A., & Stangor, C. (1984). Why versus how often: Causal reasoning and the incidence of judgmental bias. Journal of Experimental Social Psychology, 20, 470-483.
Marr, D. (1982).
Vision: A computational investigation into the human representation and processing of visual information. San Francisco: Freeman.
McConahay, J. B. (1986). Modern racism, ambivalence, and the modern racism scale. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination, and racism (pp. 91-125). Orlando, FL: Academic Press.
Real, L. A. (1991). Animal choice behavior and the evolution of cognitive architecture. Science, 253, 980-986.
Sears, D. (2001, April). Continuities and contrasts in American racial politics. Paper presented at The Yin and Yang of social cognition: Perspectives on the social psychology of thought systems; A Festschrift honoring William J. McGuire. New Haven, CT.
Simon, D., & Holyoak, K. J. (2002). Structural dynamics of cognition: From consistency theories to constraint satisfaction. Personality and Social Psychology Review, 6, 283-294.
Staddon, J. E. R. (1988). Learning as inference. In R. C. Bolles & M. D. Beecher (Eds.), Evolution and learning (pp. 59-77). Hillsdale, NJ: Erlbaum.
Tversky, A., & Kahneman, D. (1974). Judgement under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
Tversky, A., & Kahneman, D. (1982). Judgments of and by representativeness. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgement under uncertainty: Heuristics and biases (pp. 84-98). Cambridge, UK: Cambridge University Press.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
Yates, J. F., & Carlson, B. W. (1986). Conjunction errors: Evidence for multiple judgment procedures including "signed summation." Organizational Behavior and Human Decision Processes, 37, 230-253.
Zacks, R. T., Hasher, L., & Sanft, H. (1982). Automatic encoding of event frequency: Further findings. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 106-116.
Author Note
Correspondence concerning this article should be addressed to Herbert H. Blumberg, Department of Psychology, Goldsmiths College, University of London, London SE14 6NW, England; e-mail: [email protected]. We would like to thank Jules Davidoff and others for helpful comments on a draft of this article.

Table 1
Percentage of Conjunction Fallacies (CF) Committed
---------------------------------------------------------------------
Parameter        P-Linda  F-Linda  P-Leroy  F-Leroy  Lottery
---------------------------------------------------------------------
All Respondents (N = 118)
  % of CF          40.7     24.6     36.4     40.7     30.5
  N of CF          48       29       43       48       36
All Respondents Except "Zero Scorers" (N = 107)
  % of CF          39.0     26.2     38.3     38.3     32.7
  N of CF          46       28       41       41       35
---------------------------------------------------------------------

Table 2
"Turnover Table" for Conjunction Fallacies: Linda Problem
_________________________________________________________
                        F-Linda
              ___________________________________
P-Linda         -          0          +       |   Sum
  -            24          5          0       |   29
  0            11 (10)    26 (18)     4       |   41 (32)
  +            11         12 (11)    25 (24)  |   48 (46)
  Sum          46 (45)    43 (34)    29 (28)  |  118 (107)
_________________________________________________________
Note. "-" indicates that the conjunction is smaller than the constituent (no conjunction fallacy), "0" that the conjunction equals the constituent, and "+" that the conjunction is larger than the constituent (conjunction fallacy). The table shows the number of respondents in each cell (where applicable, figures in parentheses show the same counts with "zero scorers" deleted).

Table 3
"Turnover Table" for Conjunction Fallacies: Leroy Problem
_________________________________________________________
                        F-Leroy
              ___________________________________
P-Leroy         -          0          +       |   Sum
  -            13          4         11       |   28
  0            11         18 (15)    18 (12)  |   47 (38)
  +             7         17 (16)    19 (18)  |   43 (41)
  Sum          31         39 (35)    48 (41)  |  118 (107)
_________________________________________________________
Note. "-" indicates that the conjunction is smaller than the constituent (no conjunction fallacy), "0" that the conjunction equals the constituent, and "+" that the conjunction is larger than the constituent (conjunction fallacy). The table shows the number of respondents in each cell (where applicable, figures in parentheses show the same counts with "zero scorers" deleted).

Figure Captions
Figure 1. Marginal means (for scores) across sequence.
Figure 2. Mean scores by sequence. Note. Sequence 1 is F-Linda (FLIN), F-Leroy (FLER), P-Linda (PLIN), P-Leroy (PLER); Sequence 2 is FLER, FLIN, PLER, PLIN; Sequence 3 is PLIN, PLER, FLIN, FLER; Sequence 4 is PLER, PLIN, FLER, FLIN.
Figure 3. Marginal means (for scores) across sequence (with "zero scorers" removed).
Figure 4. Mean scores by sequence (with "zero scorers" removed). Note. Sequence 1 is FLIN, FLER, PLIN, PLER; Sequence 2 is FLER, FLIN, PLER, PLIN; Sequence 3 is PLIN, PLER, FLIN, FLER; Sequence 4 is PLER, PLIN, FLER, FLIN.