Public Opinion Quarterly, Vol. 74, No. 2, Summer 2010, pp. 286–318

RACE AND TURNOUT IN U.S. ELECTIONS: EXPOSING HIDDEN EFFECTS

BENJAMIN J. DEUFEL
ORIT KEDAR*

BENJAMIN J. DEUFEL directs quantitative analysis for the Financial Services Practice at the Corporate Executive Board, Arlington, VA, USA. ORIT KEDAR is an Associate Professor in the Department of Political Science at the Massachusetts Institute of Technology, Cambridge, MA, USA, and the Hebrew University of Jerusalem, Jerusalem, Israel. Benjamin Deufel benefited from financial support by the Jacob K. Javits Fellowship Program of the U.S. Department of Education and the Multidisciplinary Program in Inequality and Social Policy at Harvard University, sponsored by the National Science Foundation. Orit Kedar benefited from the V.O.K. Fellowship and Dellon Fellowship, Harvard University. The authors would like to thank the Center for American Political Studies at Harvard University for a seed grant. For helpful comments and suggestions, they thank Chris Achen, Barry Burden, Don Green, Jonathan Katz, Gary King, Matthew Lebo, Skip Lupia, Jonathan Nagler, Ken Scheve, Nick Valentino, Lynn Vavreck, and Jonathan Wand. They also thank Greg Distelhorst and Mike Sances for superb research assistance. Accompanying materials can be found on the authors' Web site at http://web.mit.edu/okedar/www/.

*Address correspondence to Orit Kedar, Massachusetts Institute of Technology, Department of Political Science, 77 Massachusetts Ave., E53-429, Cambridge, MA 02139, USA; e-mail: [email protected].

doi: 10.1093/poq/nfq017. Advance Access publication April 22, 2010. © The Author 2010. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved. For permissions, please e-mail: [email protected].

Abstract

We demonstrate that the use of self-reported turnout data often results in misleading inferences about racial differences in turnout. We theorize about the mechanism driving report of turnout and, utilizing ANES turnout data in presidential elections from 1976 to 1988 (all years for which comparable validated data are available), we empirically model report of turnout as well as the relationship between reported and actual turnout. We apply the model to the two subsequent presidential elections in which validated data are not available, 1992 and 1996. Our findings suggest that African Americans turned out almost 20 percentage points less than did Whites in the 1992 and 1996 U.S. presidential elections—almost double the gap that the self-reported data indicates. In contrast with previous research, we show that racial differences in factors predicting turnout make African Americans less likely to vote compared to Whites and thus increase their probability of overreporting. At the same time, when controlling for this effect, other things equal, African Americans overreport electoral participation more than Whites.

Introduction

The question "Who votes?" has been the focus of numerous studies. Out of both positive and normative motivations, political scientists have examined voter turnout over time, across electoral institutions, and across segments of the population.
For the most part, however, the secret ballot prevents political scientists from observing what we seek to explain; we know who reports voting in surveys, but we do not know who actually votes. The vast majority of studies simply assume that the missing piece— individual turnout—is identical to the observed piece—self-reported turnout. In the absence of any further information, this is, of course, a reasonable assumption. However, auxiliary information reveals not only that a substantial proportion of American survey respondents report turning out when they do not, but also that their propensity to misreport turnout is related to their propensity to vote, which may lead to mistaken inferences. Two steps are thus required in order to explain who votes. First, we need to comprehend the mechanism that drives the observed piece—reported turnout. And second, the relationship between the observed and unobserved, reported and actual turnout, should be modeled. Although much energy has been focused on the former, the latter has been largely ignored. In this study, we take up this challenge, focusing on racial differences in turnout. Numerous studies provide evidence that African Americans overreport turnout at higher rates than Whites. Building on these studies, we develop a theoretical account that explicitly models both the reporting process and its relationship to actual turnout. We employ this theory to model the relationship between turnout and report of turnout in those years in which the two are available at the individual level. We contend that African Americans overreport their turnout more than Whites for two reasons. First, African Americans overreport more because they simply have more of an opportunity to do so. Group differences in other predictors of turnout, such as socioeconomic status, cause African Americans to be less likely to turn out compared to Whites, and a lower propensity to vote actually makes African Americans more likely to overreport. Furthermore, holding constant other predictors of turnout, African Americans still overreport more than Whites. We argue that, because of the importance of race in American politics and elections in particular, the social desirability of voting is higher for African Americans. After establishing our theory empirically, we employ it to deflate selfreported figures in the 1990s where validated data are not available. Our findings suggest that throughout the 1980s and 1990s, the overreporting bias in self-reports masks a large turnout gap between Whites and African Americans. On the other hand, this gap narrows considerably after accounting for other predictors of turnout. 288 Deufel and Kedar Finally, a side benefit of our venture to expose hidden effects is methodological. Modeling the relationship between validated and self-reported turnout in the 1970s and 1980s, we produce a function that probabilistically deflates reported turnout. We test the performance of our deflating function and show that our algorithm predicts turnout more accurately than self-reports. WHAT WE KNOW (AND DON’T KNOW) ABOUT OVERREPORTING Downloaded from poq.oxfordjournals.org by guest on February 7, 2011 Studies of voter turnout are in agreement that misreporting of turnout exists and is almost entirely in one direction. The fraction of respondents voting but reporting they did not (“underreporters”) is negligible and, importantly, these discrepancies are usually random (Silver, Anderson, and Abramson 1986; Presser and Traugott 1992). 
Overreporting of turnout is different: it is of substantial magnitude and, most students of voting behavior agree, it is systematically related to voters' characteristics. However, several issues make it difficult to develop theories about the relationship between self-reports (R) and verified turnout (V). The appropriate quantities of interest, the mechanism that generates overreporting, and the substantive effects of overreporting are all disputed in the literature. While these issues are mostly empirical, they muddle the theoretical waters. Before we outline our theory, we first need to wrestle with these issues.

The first hurdle is determining the population of interest, as choices on this front may change inferences about the nature of overreporting. Focusing on overreporting as their ultimate variable of interest, some (e.g., Silver et al., 1986) argue that overreporting should be calculated among nonvoters only, the actual group at risk of overreporting. Another advantage of this procedure is that it controls for turnout rate. Given our motivation, however, a different strategy is in order. Because our goal is to explain voter turnout rather than overreporting, we estimate the hypothetical probability of overreporting for all respondents. In line with standard treatment of binary variables (Greene 1993, pp. 636–43), we acknowledge that every observed binary variable (here, both validated turnout and reported turnout) is a realization of an unobserved continuous proclivity (a probability of actually turning out, and a probability of reporting turning out). Therefore, every individual has both an underlying unobserved probability, and a realization—the observed binary outcome. Whether they turned out or not, it is possible for an individual to have a probability of turning out of, say, 0.65, and a probability of reporting turnout of, say, 0.85. Thus, the relevant group by which we calculate overreporting of turnout is the general voting-eligible population, and in the sample, the entire group of respondents eligible to vote.

The second hurdle is understanding the sources of systematic components of overreporting. The standard argument, perhaps most clearly represented in Silver et al. (1986), employs the concept of social desirability.
The basic idea is that respondents report their behavior untruthfully because they wish to portray themselves as engaged in a socially desirable activity. This wish is unevenly felt across respondents; the very factors that make respondents likely to vote also reinforce their desire to portray themselves as voters, regardless of their actual behavior.1 Thus, those who are more likely to vote are more likely to misreport when they did not actually participate.2 This argument implies that the use of self-reported (R) data will lead to overestimation of partial correlations with regard to turnout (Cassel 2003). For example, if education makes people both more likely to vote and more likely to exaggerate the extent of their participation, use of R will result in overestimation of the effect of education on the propensity to turn out. Indeed, Presser and Traugott (1992) show that, using R, political scientists overestimate partial correlations with regard to turnout. Nonetheless, Sigelman (1982) argues that substantive conclusions about the predictors of voting, despite biased coefficients, are mostly unchanged by the use of validated as opposed to reported voting data.

1. In the only comparative study of overreporting we are aware of, Karp and Brockington (2005) make a distinction between social desirability and opportunity of overreporting. Examining overreporting in Britain, New Zealand, Norway, Sweden, and the United States, the authors show that where turnout is high, social desirability is higher as well, yet high turnout also leaves fewer nonvoters with the opportunity to overreport their participation.
2. Bernstein, Chadha, and Montjoy (2001) propose an alternate possibility. They argue that respondents misreport out of guilt from failure to fulfill a social obligation.

Racial effects are a seeming exception to this general pattern. Repeated studies have shown that African Americans are more likely to overreport than Whites, despite possibly being less likely to turn out (Sigelman 1982; Hill and Hurley 1984; Bernstein, Chadha, and Montjoy 2001). In a series of articles, Abramson and Claggett (1984, 1986, 1989, 1991) find that although self-reported data indicate no racial difference in turnout after accounting for education and region, use of validated data reveals otherwise. In other words, African Americans overreport at higher rates, yet are less likely to turn out compared to Whites. These two gaps in opposite directions mask each other, leading researchers using R to underestimate the effect of race on turnout and often conclude that there is no relationship or even that African Americans turn out at higher rates than Whites do.3

3. A review of the APSR, AJPS, JOP, and Political Behavior between 1993 and 2008 for studies estimating turnout in U.S. presidential elections, employing individual-level data, and having race on the right-hand side yielded 13 studies; among them, five found no effect of race on turnout, two found mixed effects, and among the six which found an effect, five found that African Americans were more likely to participate than Whites.

While much attention has been dedicated to empirical descriptions of this pattern, to our knowledge there has been no attempt to model how the relationship between turnout and reported turnout varies by race. Doing so allows us to understand how differential overreporting affects our inference of racial differences in turnout. It is to this task that we now turn.

Explaining the Reporting Gap: Two Mechanisms

We expect the impact of social desirability to be nonlinear, where overreporting first increases and then decreases with propensity to vote. Those with the highest likelihood of voting have low levels of overreporting because they often actually do turn out. Those respondents who are moderately likely to turn out are those who overreport at the highest rates. They turn out at the polls less often than do those most likely to vote, which gives them more of an opportunity to overreport (for similar intuition, see also McDonald 2003, p. 185). These mid-range potential voters feel a greater need to report socially desirable behavior than those with a low likelihood of voting, and thus their level of overreporting is also higher than that of those who vote at low rates. In sum, opportunity to overreport, along with social desirability, produces a nonlinear relationship between propensity to vote and overreporting. Figure 1 graphically illustrates this relationship.
Let us begin by examining curve I only. The top panel (panel A) presents the probability of turning out on the horizontal axis against the probability of overreporting on the vertical axis. This curve graphically presents the social desirability argument we discussed above. It asserts that as the probability of voting increases, the tendency to overreport, measured vertically, first increases and then decreases. (If one voted for sure, she has no chance of overreporting, and if we assume that people do not underreport, someone with zero probability of reporting having voted has an identical probability of actually voting.) If curve I holds for everyone in the population, the effect of race on reporting should diminish as more factors that account for the tendency to turn out are included in the analysis. In other words, any potential racial difference in overreporting is a product of opportunity and social desirability (on-the-curve effect).

Panel B (still curve I) presents the latent propensity of turning out on the horizontal axis and the respondent's latent propensity to report having voted on the vertical axis. The diagonal represents the relationship between the two had there been no overreporting. The vertical difference between the curve and the 45-degree line is the overreporting gap presented in the first panel on the vertical axis. Suppose that on average Whites score higher on predictors of turnout (e.g., homeownership) such that they are located around point A and that African Americans are located on average at a point like C or E. While an E-type voter is both less likely to vote and less likely to overreport than an A-type, C is less likely to vote yet more likely to overreport than A. This latter possibility is consistent with the empirical findings that African Americans may be more likely to overreport but less likely to vote. In sum, the account in curve I alone suggests that African Americans may overreport more than Whites because they are less likely to vote, not because of any special propensity to overreport relative to Whites—they are on a different point on the same overreporting function (curve I).

Figure 1. (a) Possible Relationships between Voting and Overreporting. (b) Possible Relationships between Voting and Report of Voting.
However, previous studies also note that overreporting by African Americans may have an additional source (Bernstein et al., 2001). Because most African Americans effectively gained the right to vote in the 1960s after a hard-fought struggle that drew heavily upon group resources, other things equal, voting may be more of a socially desirable act for African Americans than for Whites. The high rate of disenfranchisement among African Americans compared to Whites (Uggen and Manza 2002) potentially serves as an additional source of social desirability, reducing the likelihood that those who did not turn out will say so to the pollster. More generally, though, American electoral politics is often about race. Electoral districts are drawn explicitly with race in mind, and race is important for electoral mobilization and demobilization (see, for example, Rosenstone and Hansen 1993; Dawson 1994; Tate 1994; and Kinder and Sanders 1996). According to this argument, African Americans have a different tendency to overreport, controlling for predictors of turnout. Therefore, unlike the previous "on-the-curve" explanation, curve I represents overreporting for Whites and curve II represents overreporting for African Americans. For example, take two voters of different races but identical in their likelihood of voting—E is White and D is African American. Although they are equally likely to vote, D is more likely to overreport than E. Technically, African Americans are located on a different reporting curve.

We allow for both accounts and, therefore, for both on-the-curve and off-the-curve effects to affect racial differences in reporting. For example, perhaps the average White is at point A and the average African American is at point B. In this case, African Americans would be more likely to overreport both because of the relationship between race and overreporting and because of differences in the values of predictors of turnout between Whites and African Americans.

A side issue we need to address is the consistency of the relationship over time. The relationship may change systematically as we move away from the height of civil rights movement activity,4 or idiosyncratically, with events that alter the social desirability of voting by race.5 This highlights the need to establish whether any of our expectations are met in the data before we can make inferences about race and turnout. Analysis of survey and validation data will allow us to determine the extent of racial differences in reporting and voting, and the degree to which the two mechanisms account for gaps in reporting of turnout.

4. For discussion of a related issue, the relationship between group consciousness and participation, see Verba, Schlozman, and Brady (1995, pp. 355–56), and their discussion of the departure of their findings from those of Verba and Nie (1972).
5. See Dawson (1994, pp. 140–41) for discussion of the effect of Jackson's candidacy and his treatment by the party. See also Tate (1994) and Kinder, Mendelberg, Dawson, et al. (1989).
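To make the two mechanisms concrete, the stylized Python sketch below encodes a reporting curve whose overreporting gap vanishes at both extremes and peaks in the middle range, and lets the whole curve shift upward for one group. The functional form and the parameter values are illustrative assumptions of ours, not the authors' estimated curves; their actual estimates appear in figure 2 and table E1.

```python
# Stylized illustration of figure 1 (assumed functional form, for intuition only).
import numpy as np

def report_prob(p_vote, desirability):
    # Latent probability of *reporting* a vote given the latent probability of
    # voting: the gap p_report - p_vote is zero at 0 and 1 and largest mid-range.
    return p_vote + desirability * p_vote * (1.0 - p_vote)

p = np.linspace(0.0, 1.0, 11)
curve_I = report_prob(p, desirability=0.4)    # curve I (e.g., Whites)
curve_II = report_prob(p, desirability=0.7)   # curve II (off-the-curve shift)

# On-the-curve effect: a respondent at p = 0.5 overreports more than one at p = 0.9.
print(round(curve_I[5] - p[5], 3), round(curve_I[9] - p[9], 3))
# Off-the-curve effect: at the same p, curve II lies above curve I.
print(round(curve_II[5] - curve_I[5], 3))
```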
REPORTED AND OFFICIAL TURNOUT: HOW BIG IS THE PROBLEM?

The American National Election Studies (ANES) conducted vote validation studies in the 1964, 1972–80, and 1984–90 election studies. In this article, we utilize validated data from the four most recent presidential elections where these data are available: 1976, 1980, 1984, and 1988.6 In appendix A, we justify the quality of the validation procedure and offer evidence that it provides an accurate means to address our research questions, and in appendices B and C, we report the question wording and sampling procedures along with response rates, respectively.

6. We do not use the 1964 validation study since the validation procedure in that year was changed in the subsequent studies.

Table 1 shows the official turnout rate, the proportion of respondents reporting having voted (R), and the proportion validated as voting (V), with the latter two quantities broken down by race.7 Each row presents these quantities for one of the four presidential elections: 1976–88.8

7. The respondent was included in our validation sample if her validation record indicated that she voted (coded as 1), did not vote, or no record of voting or registering was found, or if the respondent reported that she had not registered or SDR (coded as 0), but not if one of the latter four was found and the status of the office voting records was such that some or none of the records were inaccessible. The respondent was included in our reported sample if she reported that she had voted or that she did not vote.
8. In 1976 and 1980, the ANES attempted to validate each self-report, but in 1984 and 1988, self-reports were validated only if the respondent indicated that she was registered or had voted in localities without registration requirements. In 1980–88 (and for some respondents in 1976), the ANES also validated those who completed the pre-election interview but not the post-election one (leaving such respondents without R data), leading to the possibility of a V sample size greater than that of R (as is indeed the case in 1980, 1984, and 1988).

Table 1. Self-Reported Turnout, Validated Turnout, and Official Turnout by Year (95-percent confidence intervals in parentheses)

Year | Official turnout* | Full sample** R | Full sample** V | Whites R | Whites V | African Americans R | African Americans V
1976 | 53.6 | 73.3 (71.3, 75.3), n = 1,872 | 64.7 (62.5, 66.9), n = 1,826 | 73.9 (72.8, 76.0), n = 1,684 | 66.1 (63.8, 68.4), n = 1,639 | 66.1 (58.9, 73.2), n = 171 | 52.1 (44.5, 59.7), n = 169
1980 | 52.6 | 71.7 (69.4, 74.1), n = 1,390 | 58.9 (56.4, 61.4), n = 1,453 | 72.3 (69.8, 74.9), n = 1,222 | 60.0 (57.3, 62.7), n = 1,288 | 66.7 (59.4, 73.9), n = 165 | 49.4 (41.6, 57.2), n = 162
1984 | 53.1 | 74.1 (72.1, 76.0), n = 1,945 | 63.4 (61.3, 65.5), n = 2,109 | 75.2 (73.1, 77.2), n = 1,719 | 65.6 (63.4, 67.7), n = 1,856 | 65.6 (59.2, 72.0), n = 215 | 47.3 (41.0, 53.6), n = 243
1988 | 50.2 | 70.4 (68.2, 72.6), n = 1,713 | 63.2 (60.9, 65.4), n = 1,756 | 71.9 (69.7, 74.2), n = 1,493 | 65.7 (63.3, 68.1), n = 1,527 | 59.7 (53.1, 66.3), n = 216 | 45.8 (39.2, 52.3), n = 225

NOTE.—Cell entries include rate, 95-percent confidence interval, and number of respondents, respectively. The R sample includes respondents who have reported data, while the V sample includes all those with validated data. This table does not use the post-stratification weights provided by the ANES.
*Official turnout figures are from the Federal Election Commission, http://www.fec.gov/pages/tonote.htm.
**The full sample includes respondents who indicated they were either African American or White.

The first obvious point is that in the full sample, V is closer to figures of official turnout than R is in all years.9 Although at first glance there is still roughly a 10-percentage-point difference between validated turnout in the full sample and official turnout figures, this is not surprising, both because the latter includes in its denominator many who are ineligible to vote (McDonald and Popkin 2001)10 and because of potential systematic nonresponse to the ANES.

9. On this point, see also Clausen (1968).
10. In most official measures, the denominator is the Voting Age Population (VAP) as reported by the Bureau of the Census in their Current Population Reports, Series P-25. VAP includes all persons over the age of 18, including those ineligible to vote in federal elections, such as legal and illegal aliens, convicted felons, and individuals legally declared non compos mentis. The VAP is therefore considerably larger than the pool of potential voters (McDonald and Popkin 2001, although see Burden 2000). For related articles, see Burden (2003), Martinez (2003), and McDonald (2003).

Examination of the differences between validated and reported turnout over time in the full sample (as well as the gap between the two measures and official turnout) suggests no apparent secular trend across the four elections. The difference between R and V is around 10 percentage points in each election. Among African Americans, however, the gap is considerably greater than among Whites, falling between roughly 14 and 18 percentage points across the four elections.11 Finally, note that at first glance there seems to be no trend in overreporting of African Americans over time.12

11. In a thorough analysis combining data from both presidential and congressional elections, Belli, Traugott, and Beckmann (2001) find that non-Whites overreport at higher rates than Whites do.
12. The inferences we make here are identical if one looks at only those respondents who have both reported and validated data.

While these figures expose a potential reporting gap, they do not provide a theoretical framework for understanding the sources of racial differences in reported or validated turnout. In particular, the differences do not tell us whether the gap is simply a reflection of African Americans' lower propensity to vote given other predictors of turnout (an on-the-curve effect) or whether African Americans are more likely to overreport, even accounting for predictors of turnout (an off-the-curve effect related to differing social desirability), or both. To get at this issue, we must model the relationship between reported and actual turnout.

UNMASKING RACIAL EFFECTS: REVEALING THE REPORTING GAP

We now turn to examining whether our theory holds empirically. What is the underlying relationship between reported turnout and actual turnout? To capture this relationship, in each of the four years we first estimate a logistic model of turnout and a model of reporting turnout using validated turnout (V) and self-reported turnout (R), respectively, as dependent variables. We rely on an established empirical and theoretical literature in choosing our explanatory variables for these models (Rosenstone and Hansen 1993; Verba et al., 1995). These variables capture resources, in addition to social and political experiences and attachments that raise the benefits and lower the costs of voting, as well as standard demographic controls. In particular, the model includes party contact, church attendance, party attachment, age, education, income, homeownership, race, and gender. The estimated coefficients of the eight regressions are too numerous to discuss in detail here; they can be found in table D2 (appendix D). Nonetheless, some effects hold throughout the eight models, reflecting the systemic effects discussed in the literature. Resources such as education and income make voting more likely, as does community embeddedness measured as homeownership. By the same token, young adults (usually more mobile and less immersed in a stable community) are less likely to vote. Attachment to a party, voter mobilization, and church attendance all have a positive effect on the likelihood of turning out. Finally, once controlling for these effects, living in the South has a partial negative effect on one's likelihood of turning out.
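A minimal sketch of this first estimation step is below, using synthetic data. The variable names (validated, reported, party_contact, and so on) and the data-generating process are placeholders we introduce for illustration; they are not the ANES coding or the authors' table D2 specification. Later sketches reuse the objects created here.

```python
# Sketch: separate logistic models for validated turnout (V) and reported
# turnout (R) on the same covariates, plus their predicted probabilities.
import numpy as np
import pandas as pd
import statsmodels.api as sm

covariates = ["party_contact", "church_attendance", "party_attachment", "age",
              "education", "income", "homeowner", "black", "female", "south"]

# Synthetic stand-in for one election year of ANES respondents.
rng = np.random.default_rng(0)
n = 1500
df_year = pd.DataFrame(rng.normal(size=(n, len(covariates))), columns=covariates)
df_year["black"] = rng.integers(0, 2, size=n)
latent = 0.7 * df_year["education"] + 0.4 * df_year["homeowner"] - 0.5 * df_year["black"]
df_year["validated"] = (latent + rng.logistic(size=n) > 0).astype(int)
df_year["reported"] = (latent + 0.8 + rng.logistic(size=n) > 0).astype(int)  # inflated reports

def fit_logit(df, outcome):
    X = sm.add_constant(df[covariates])
    return sm.Logit(df[outcome], X).fit(disp=0)

turnout_model = fit_logit(df_year, "validated")   # Pr(V = 1 | X)
report_model = fit_logit(df_year, "reported")     # Pr(R = 1 | X)

df_year["p_vote"] = turnout_model.predict(sm.add_constant(df_year[covariates]))
df_year["p_report"] = report_model.predict(sm.add_constant(df_year[covariates]))
```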
Next, we map the relationship between voting and reported voting. We pool the resulting predicted probabilities over the four years and model the relationship between voting and reporting as a quadratic function, allowing it to vary by race and year (including all relevant interaction effects for maximum flexibility).13 The quadratic specification and the interactions allow us to examine if the relationship expected between overreporting and turnout (as specified in figure 1) is found in the data and whether it varies by race and over time.

13. The full specification and coefficients are presented in table E1 of appendix E.

Figure 2 presents this relationship for all four years by race. On the horizontal axis are the probabilities of voting using validated data, and on the vertical axis are predicted probabilities of reporting having voted. As in figure 1b, the 45-degree line in figure 2 represents a hypothetical relationship with no reporting gap. The four dashed lines present the relationship for Whites, while the four solid lines present the relationship for African Americans in the four years. The vertical gaps between the diagonal line and each of the eight lines are the respective overreporting gaps.

Figure 2. Discounting Functions: 1976–1988.

The figure illuminates several aspects of the relationship between reported and actual turnout. Examine the relationship for Whites first. The pattern is consistent with the theory specified above: the line is, in general, above the diagonal, and the relationship is nonlinear; as the probability of voting increases, the probability of overreporting first increases and then declines. Individuals in the middle range have a higher likelihood of overreporting than either those who are unlikely to vote or those who almost surely vote. Consistent with our expectation, the relationship between turnout and report of turnout for Whites is constant over time. The functions for 1976, 1980, and 1984 are almost identical, and the data expose no trend in the relationship. However, the overreporting function for 1988 is a curious exception to this stable pattern.

The relationship for African Americans tells an even clearer story. Here, too, overreporting is in quadratic relation to voting, but the magnitude of overreporting is greater; the vertical gap between the diagonal and each of the functions reaches almost 20 percentage points. Overreporting is not orthogonal to race. Although the coefficients bear large standard errors (partly because of the many interaction terms included), the point estimates (presented in figure 2) consistently suggest that African Americans' reporting function is different from Whites' (off-the-curve effect). Furthermore, the nature of overreporting by African Americans appears roughly similar across all four elections. Based on this consistency, in the next stage we eliminate the year variables, resulting in reduced estimation uncertainty and similar results (column 2 in table E1).

In summary, we have presented evidence for two mechanisms at play. First is a quadratic relationship between voting and report of voting, supporting the importance of opportunity to overreport and social desirability. Second is a racial difference, which may be attributed to an additional social desirability variable related to racial history. Both effects are constant over time.
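Continuing the synthetic objects from the earlier sketch, the block below shows one way such a quadratic reporting curve with race interactions could be fit; year dummies and their interactions, which the article also includes, are omitted for brevity. This is a sketch under our assumptions, not the authors' exact specification (which is in table E1).

```python
# Sketch: pooled quadratic relationship between the predicted probability of
# voting and the predicted probability of reporting a vote, varying by race.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def curve_design(p_vote, black):
    return pd.DataFrame({
        "const": 1.0,
        "black": black,
        "p_vote": p_vote,
        "p_vote_sq": p_vote ** 2,
        "black_x_p_vote": black * p_vote,
        "black_x_p_vote_sq": black * p_vote ** 2,
    })

reporting_curve = sm.OLS(df_year["p_report"],
                         curve_design(df_year["p_vote"], df_year["black"])).fit()

# Fitted curves against the 45-degree line, as in figure 2: the printed values
# are the vertical overreporting gaps by race across the probability range.
grid = np.linspace(0.0, 1.0, 21)
for race in (0.0, 1.0):
    fitted = reporting_curve.predict(curve_design(grid, race))
    print(int(race), np.round(fitted - grid, 3))
```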
Implications of the Reporting Gap

So far we have established that there is a nonlinear tendency to overreport, and that this tendency differs for African Americans and for Whites. Our next step is to analyze the consequences of these patterns for inference about turnout. To accomplish this task, we rely on three models of voter turnout. The first two models come from Abramson and Claggett's studies of racial differences in overreporting and turnout (1984, 1986, 1989, 1991).14 Their first ("thin") model has race only on the right-hand side. Their second ("thick") model controls for education and region (South). Finally, based on recent advances in the study of turnout (Rosenstone and Hansen 1993; Verba et al., 1995), we extend our analysis to a third ("full") model that is identical to the model of voting turnout we used to construct figure 2 (for estimates, see tables D1 and D2 in appendix D). To reiterate, in addition to region and education, this model also includes variables that capture resources, social and political experience, and attachment that raise the benefits and lower the costs of voting, as well as standard demographic controls.15

14. Abramson and Claggett (1984, 1986, 1989, 1991) code the race variable as a dichotomous −1, 1 variable, while we employ the more common dummy approach (0, 1). Despite this difference, our results are very similar.
15. Likelihood Ratio Tests comparing the "full" model to the "thick" and the "thin" models are statistically significant in all years.

While the inclusion of these particular variables is not novel, in the context of our unpacking of overreporting, a comparison of the three models sheds light on our theoretical argument. Our theory implies that effects relating to opportunity and social desirability account for part of the racial reporting gap. According to this argument, as we move from the thin to the thick to the full model and control for more of the determinants of turnout, and hence opportunity to overreport, using R, the reporting gap between the two racial groups should decline. In other words, if the gap in overreporting diminishes as we account for more factors that disproportionately reduce African Americans' likelihood of voting, then we may infer that African Americans overreport more because they vote less (on-the-curve effect). However, if the gap between reported and actual data holds even as we control for such factors, then it is an indication that African Americans indeed have a different reporting function, likely because of additional social desirability, as we discuss above.

We focus below on 1984 and 1988, the two most recent presidential elections in which validated data are available.16 Table 2 presents the results for 1984. The first and second rows in each section present the mean predicted probabilities for African Americans and Whites, respectively, with the values of other explanatory variables held at their mean. The third entry is the mean difference in turnout between the two racial groups. A positive number indicates that Whites are estimated to turn out at higher rates than African Americans.

16. Results for 1976 and 1980 are similar to those reported below.
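The quantities reported in tables 2 and 3 can be formed as in the sketch below (continuing the toy objects above): predicted turnout for a hypothetical African American and a hypothetical White respondent, all other covariates held at their sample means, with the "thin," "thick," and "full" variants differing only in the covariate list. The helper name and profiles are ours, for illustration.

```python
# Sketch: predicted probability of turning out by race, other covariates at
# their means, and the White-minus-African American difference.
import pandas as pd
import statsmodels.api as sm

def racial_gap(model, df, covariates):
    profiles = pd.DataFrame([df[covariates].mean()] * 2)
    profiles["black"] = [1.0, 0.0]                     # African American, White
    X = sm.add_constant(profiles, has_constant="add")
    p_black, p_white = model.predict(X)
    return p_black, p_white, p_white - p_black         # positive: Whites higher

print(racial_gap(turnout_model, df_year, covariates))  # V-based comparison
print(racial_gap(report_model, df_year, covariates))   # R-based comparison
```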
As the table shows, in 1984, R and V (in the first and second columns, respectively) lead to different inferences about racial gaps in turnout. In model 1, both R and V suggest that African Americans are less likely to turn out, yet using V, the racial difference is substantially greater (10 percentage points using R as opposed to 19 using V). In other words, validated data reveal a large effect of race on turnout, an effect that is masked by self-reported data. The results of model 2 make an even stronger case: using R, there is no statistically significant racial gap in turnout, and the predicted probabilities for both races are inflated (73 percentage points for African Americans, and 76 for Whites). V, however, exposes a different picture. First note that the predicted probability of African Americans turning out is substantially lower (57 percentage points) and significantly different from R. The figure for Whites is lower as well (69 percentage points). Most importantly, while using R alone there is no statistically significant effect of race on turnout, V reveals an effect of 12 percentage points—reporting is correlated with race. Finally, the "full" model shows the same trend, although the results fall slightly short of standard levels of statistical significance. As in the previous two models, the point estimates indicate that the racial gap—hidden in the self-reported figures—is exposed by the use of validated figures, where it stands at six percentage points.

Table 2. Predicted Probabilities of Turning Out by Race: 1984 Presidential Elections (95-percent confidence intervals in parentheses)

 | Self-reported turnout (R) | Validated turnout (V) | Discounted self-reported turnout (1984–1984) | Deflated self-reported turnout (1976–1984, out of sample)
Model 1 ("thin")
African Americans | 0.663 (0.595, 0.730) | 0.503 (0.429, 0.573) | 0.495 (0.354, 0.622) | 0.466 (0.338, 0.578)
Whites | 0.757 (0.735, 0.777) | 0.686 (0.662, 0.709) | 0.666 (0.619, 0.708) | 0.668 (0.622, 0.713)
Difference b/w African Americans and Whites | 0.093 (0.019, 0.166) | 0.183 (0.110, 0.260) | 0.171 (0.040, 0.323) | 0.202 (0.081, 0.331)
Model 2 ("thick")
African Americans | 0.731 (0.665, 0.792) | 0.566 (0.484, 0.634) | 0.565 (0.431, 0.688) | 0.539 (0.414, 0.656)
Whites | 0.763 (0.740, 0.786) | 0.687 (0.661, 0.710) | 0.672 (0.631, 0.713) | 0.675 (0.633, 0.717)
Difference b/w African Americans and Whites | 0.032 (-0.035, 0.102) | 0.121 (0.046, 0.203) | 0.107 (-0.016, 0.243) | 0.136 (0.012, 0.272)
Model 3 ("full")
African Americans | 0.822 (0.760, 0.872) | 0.651 (0.564, 0.727) | 0.649 (0.533, 0.764) | 0.670 (0.551, 0.775)
Whites | 0.805 (0.781, 0.828) | 0.710 (0.682, 0.737) | 0.720 (0.673, 0.762) | 0.718 (0.676, 0.758)
Difference b/w African Americans and Whites | -0.017 (-0.072, 0.044) | 0.059 (-0.022, 0.148) | 0.071 (-0.051, 0.190) | 0.048 (-0.069, 0.168)

NOTE.—Entries in the table are predicted probabilities of turning out. All other variables in the model are held at their respective means. Model 1: Race. Model 2: Race, education, and region (South vs. other). Model 3: Full model; see specification in table D2. For all models, see N and coefficients in tables D1 and D2.
Table 3 repeats this exercise in 1988.

Table 3. Predicted Probabilities of Turning Out by Race: 1988 Presidential Elections (95-percent confidence intervals in parentheses)

 | Self-reported turnout (R) | Validated turnout (V) | Deflated self-reported turnout (1988–1988) | Deflated self-reported turnout (1976–1988, out of sample)
Model 1 ("thin")
African Americans | 0.596 (0.528, 0.664) | 0.465 (0.386, 0.541) | 0.405 (0.283, 0.523) | 0.431 (0.304, 0.562)
Whites | 0.715 (0.690, 0.739) | 0.672 (0.645, 0.697) | 0.654 (0.610, 0.697) | 0.620 (0.573, 0.666)
Difference b/w African Americans and Whites | 0.119 (0.046, 0.193) | 0.207 (0.125, 0.287) | 0.250 (0.112, 0.387) | 0.190 (0.048, 0.331)
Model 2 ("thick")
African Americans | 0.664 (0.587, 0.727) | 0.480 (0.389, 0.560) | 0.464 (0.339, 0.587) | 0.497 (0.363, 0.637)
Whites | 0.728 (0.703, 0.753) | 0.679 (0.650, 0.706) | 0.665 (0.622, 0.709) | 0.635 (0.584, 0.680)
Difference b/w African Americans and Whites | 0.064 (-0.010, 0.143) | 0.199 (0.114, 0.293) | 0.201 (0.072, 0.340) | 0.138 (-0.008, 0.273)
Model 3 ("full")
African Americans | 0.765 (0.670, 0.839) | 0.530 (0.437, 0.614) | 0.562 (0.422, 0.690) | 0.600 (0.454, 0.733)
Whites | 0.775 (0.746, 0.803) | 0.704 (0.674, 0.733) | 0.708 (0.663, 0.747) | 0.686 (0.638, 0.732)
Difference b/w African Americans and Whites | 0.010 (-0.070, 0.105) | 0.175 (0.082, 0.275) | 0.146 (0.006, 0.297) | 0.086 (-0.058, 0.237)

NOTE.—Entries in the table are predicted probabilities of turning out. All other variables in the model are held at their respective means. Model 1: Race. Model 2: Race, education, and region (South vs. other). Model 3: Full model; see specification in table D2. For all models, see N and coefficients in tables D1 and D2.
The results are consistent with those of 1984, and we describe them here only briefly. Within each model, the racial difference exposed by V is substantively greater than the difference suggested by R. Even in the "full" model, use of validated data suggests an 18-percent turnout gap while self-reported data suggest no difference. The findings using validated data are consistent with the findings of Dawson (1994) and Kinder and Sanders (1996) previously noted.

In sum, with regard to our substantive issue of interest—racial differences in turnout—a strong case can be made that the use of self-reported data can lead to mistaken inferences. Both opportunity and social desirability (on-the-curve effects) and an additional desirability related to race (off-the-curve effect) lead to biased inferences. As we control for opportunity to overreport, the gap declines, yet because of differing social desirability, some racial differences remain. Having revealed these effects, in the next section we devise a remedy for the biases produced by conventional investigation.

DEFLATING SELF-REPORTS

Table 4 summarizes the challenge we face. In some periods (t) both self-reported and verified data are observed, while in others (t + 1) only self-reported data are available. While the literature on overreporting bias is extensive, almost no attention has been given to potential solutions. Suggestions of measurement improvements ex ante (Belli et al. 1999; Duff, Hammer, Park, et al. 2007; Holbrook and Krosnick forthcoming), discussing issues such as poor memory and face-saving response options, produce more accurate reports, but the problem of bias with already collected data still holds.

Table 4. The Challenge of Partial Observation of Turnout

Data available/period | t | t + 1
Self-reported | + | +
Validated | + | ?

Our proposed algorithm is simple. Based on our theory and empirical evidence for the joint roles of opportunity and desirability, we model the relationship between voting and reporting of voting using all years in which reported and validated data from presidential elections are available (1976, 1980, 1984, and 1988). We then use the estimated relationship to deflate self-reported turnout data at time t + 1 (1992 and 1996), where validated data are not available. We now turn to presenting our algorithm in greater detail.

The Deflating Algorithm

STEP 1: MODEL THE RELATIONSHIP BETWEEN VOTING AND REPORTING OF VOTE AT TIME t

1.1. We estimate a model of reported turnout at time t:

R^t = f(X_1^t; \beta_1^t),    (1)

where X_1^t is a vector of voter characteristics, \beta_1^t is a vector of coefficients, and f(\cdot) is a logistic function.17 We then calculate the predicted probability of reported voting for each individual, \hat{\Pr}(R^t = 1).

17. In our case, based on the consistent relationships observed over time, we pool data from 1976, 1980, 1984, and 1988.

1.2. Similarly, we estimate a model of validated turnout at time t:

V^t = f(X_2^t; \beta_2^t),    (2)

and calculate the predicted probability of turning out for each individual, \hat{\Pr}(V^t = 1).18

18. In our case, we use the same functional form and vector of voter characteristics as in equation (1), reported in table D2.

1.3. The deflating function: We model the relationship between voting (V) and reporting of vote (R), where the dependent variable is the estimated probability of actually voting calculated from equation (2). Given the racial differences in voting and report of voting established above, we let voting be a quadratic function of self-reported voting and vary interactively by race. The right-hand side, then, is a vector of covariates such that

\hat{\Pr}(V^t = 1) = \delta_0^t + \delta_1^t race^t + \delta_2^t \hat{\Pr}(R^t = 1) + \delta_3^t \hat{\Pr}^2(R^t = 1) + \delta_4^t [race^t \times \hat{\Pr}(R^t = 1)] + \delta_5^t [race^t \times \hat{\Pr}^2(R^t = 1)].    (3)

Equation (3) reflects how social desirability and opportunity affect racial differences in overreporting. The coefficients \delta_0, \delta_2, and \delta_3 represent the quadratic relationship between voting and report of voting among Whites. The coefficients \delta_1, \delta_4, and \delta_5 represent the difference between the curve for Whites and the curve for African Americans. The additional effect of social desirability (off-the-curve) is captured by the latter set of coefficients. The estimated coefficients are reported in the second column of table E1. We will next use these coefficients to deflate reported turnout.
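A minimal sketch of step 1.3, under the assumptions of the earlier toy blocks, is below: the predicted probability of validated turnout is regressed on a quadratic in the predicted probability of reported turnout, interacted with race, mirroring equation (3). The article pools 1976–1988 for this step; its estimates are in table E1.

```python
# Sketch of equation (3): the deflating function estimated at time t.
import pandas as pd
import statsmodels.api as sm

def deflator_design(p_report, black):
    return pd.DataFrame({
        "const": 1.0,
        "black": black,
        "p_report": p_report,
        "p_report_sq": p_report ** 2,
        "black_x_p_report": black * p_report,
        "black_x_p_report_sq": black * p_report ** 2,
    })

deflator = sm.OLS(df_year["p_vote"],
                  deflator_design(df_year["p_report"], df_year["black"])).fit()
print(deflator.params)   # delta_0 ... delta_5 in equation (3)
```

The fitted coefficients play the role of \hat{\delta} in step 2 below.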
STEP 2: DEFLATE REPORTED DATA AT t + 1

In this stage, we first estimate the probability of turning out using reported data (the only data we have at t + 1). We then employ our deflator to deflate the estimated probabilities.

2.1. Similar to step 1.1, we estimate a model of reported turnout at time t + 1:

R^{t+1} = f(X_3^{t+1}; \beta_3^{t+1})    (4)

(in our case, 1992 or 1996). Here, too, we use a logistic function. We then compute predicted probabilities of reported turnout for two hypothetical individuals: White, \hat{\Pr}(R^{t+1} \mid W = 1), and African American, \hat{\Pr}(R^{t+1} \mid Af.A. = 1), with all other variables held at their mean (six hypothetical individuals altogether). We repeat this step for three different model specifications corresponding with the models above ("thin," "thick," and "full"). The estimates are reported in tables D1 and D2.

2.2. We apply the deflating function to the predicted probabilities generated in the previous step. For example, for a White individual:

D^{t+1} = \hat{\delta}_0^t + \hat{\delta}_1^t (race = W)^{t+1} + \hat{\delta}_2^t \hat{\Pr}(R^{t+1} \mid W) + \hat{\delta}_3^t \hat{\Pr}^2(R^{t+1} \mid W) + \hat{\delta}_4^t [(race = W)^{t+1} \times \hat{\Pr}(R^{t+1} \mid W)] + \hat{\delta}_5^t [(race = W)^{t+1} \times \hat{\Pr}^2(R^{t+1} \mid W)].    (5)

The outcome D^{t+1} is the estimated individual's deflated probability of turning out.

Multiple-Step Estimation

The logic of our algorithm is straightforward. It is important to keep in mind, however, that each step produces a layer of uncertainty that should be taken into account. The first source is the uncertainty around the Maximum Likelihood estimates at time t (\hat{\beta}_1^t and \hat{\beta}_2^t), which in turn produce the predicted probabilities for the two variables V and R (steps 1.1 and 1.2). We assume that \beta_1^t \sim \text{Multivariate Normal}(\hat{\beta}_1^t, \Sigma_{\hat{\beta}_1^t}), and drawing randomly from this distribution, we get a sample distribution of \hat{\beta}_1^t (in this step, as in all other steps described below, we draw 1,000 times). We repeat the same procedure for \hat{\beta}_2^t, producing 1,000 predicted probabilities of turning out and 1,000 predicted probabilities of reporting to have turned out for each individual. Therefore, in step 1.3, we estimate 1,000 deflating functions. Similarly, we draw on the respective distribution of the coefficients on R at time t + 1 (\hat{\beta}_3^{t+1}) and calculate predicted probabilities of two hypothetical respondents to be deflated. Finally, we apply the sets of coefficients of the deflating function \hat{\delta} to these 1,000 predicted probabilities and get deflated probabilities. The results we present are the mean of those deflated probabilities.19

19. We compute 95-percent confidence intervals by sorting the deflated probabilities and taking the 25th and 975th probabilities.
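The sketch below illustrates step 2 together with this simulation, continuing the toy objects above: coefficient vectors are drawn 1,000 times from a multivariate normal centered at the estimates, pushed through a reporting model for a hypothetical respondent, and then through the deflating function of equation (5); the mean and the 2.5th/97.5th percentiles of the deflated probabilities are reported. Here report_model and deflator merely stand in for the t + 1 self-report model and the pooled deflating function, and the whole block is our illustrative approximation of the procedure, not the authors' code.

```python
# Sketch: deflating a hypothetical respondent's reported turnout with
# simulation-based uncertainty (1,000 draws).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)

def simulate_deflated(report_model_t1, x_profile, black, deflator, n_draws=1000):
    # Draw reporting-model coefficients: beta ~ MVN(beta_hat, Sigma_hat).
    betas = rng.multivariate_normal(report_model_t1.params,
                                    report_model_t1.cov_params(), size=n_draws)
    p_report = 1.0 / (1.0 + np.exp(-(betas @ np.asarray(x_profile, dtype=float))))
    # Draw deflating-function coefficients and apply equation (5) draw by draw.
    deltas = rng.multivariate_normal(deflator.params, deflator.cov_params(),
                                     size=n_draws)
    design = np.column_stack([np.ones(n_draws), np.full(n_draws, black),
                              p_report, p_report ** 2,
                              black * p_report, black * p_report ** 2])
    deflated = (deltas * design).sum(axis=1)
    return deflated.mean(), np.percentile(deflated, [2.5, 97.5])

# Hypothetical White respondent at the covariate means.
profile = sm.add_constant(pd.DataFrame([df_year[covariates].mean()]),
                          has_constant="add").iloc[0]
print(simulate_deflated(report_model, profile, black=0.0, deflator=deflator))
```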
Evaluating Our Algorithm

To evaluate our algorithm, we pose two questions. First, we examine whether the inferences we make about racial differences in turnout using our method come closer to V than those we would make using R. Our measure of turnout ought to produce quantities of interest that are closer to those produced by the use of V than those produced by the use of R, conditioning on the same model. Focusing on racial differences in turnout, we turn back to tables 2 and 3. Recall that the use of R coefficients often led to underestimated effects of race on turnout. We perform both in-sample and out-of-sample tests to examine the performance of our measure. For the in-sample tests, we deflate self-reports using the deflating function produced by validated data from 1984. For our out-of-sample tests, we estimate our deflating function in 1976 and use it to deflate self-reported data in 1984. These tests are particularly demanding for two reasons. First, we use auxiliary information from 1976 to correct bias in data produced eight and 12 years later. Second, we know from figure 2 that Whites overreported in a unique manner in 1988 compared to other years. Results of these estimations are presented in the third (in-sample) and fourth (out-of-sample) columns of table 2. We then repeat this exercise for 1988 (table 3).

Tables 2 and 3 reveal that using information from 1976 to deflate self-reports in 1984 and 1988 produces substantially more accurate estimates of racial differences in turnout than the use of R does. As expected, the results for 1984 are somewhat more accurate than those of 1988, but in all cases the point estimates using the deflated data (D) are considerably closer to V than those produced by R (the multiple stages of estimation acknowledge the uncertainty in these point estimates, leading to a few cases where the D confidence intervals overlap with zero). Notice in particular the figures for African American turnout produced by our algorithm in the "full" model for 1988. While self-reports suggest no racial gap in turnout, our algorithm exposes the particularly low turnout of African Americans, consistent with previously cited accounts of alienation felt among the African American population in 1988. In sum, while keeping in mind the caveat that the same sampling frame was used both in 1976 and in 1984 and 1988, our method produces point estimates superior to those produced by R, although as expected, a fair amount of noise created in the process increases uncertainty.

Second, we examine how well our model fits the data. In particular, we evaluate how predicted probabilities of turning out based on D produced by the out-of-sample procedure match observed data of turnout based on validated figures. Figure 3 presents this evaluation for 1984 and 1988 in the top and bottom panels, respectively.
To evaluate our model fit, for each of three measures, V, R, and D, we sort our observations by predicted probability of turning out produced by the respective measure (based on estimates in table D2), group them into 20 equal-length intervals, and plot on the horizontal axis the average predicted probability for that group, and on the vertical axis the actual observed fraction of votes in that group as indicated by validated data. A perfect fit (on average) will result in all points aligned right on the 45-degree line. As can be seen for 1984, results produced by the self-reported model do not fit the data well and overestimate the observed vote by up to 20 percentage points. Consistent with figure 2, the inaccuracy is largest in the middle range. It is reassuring that our model for validated data produces a good fit—predicted probabilities are aligned close to the 45-degree line, and importantly, in cases where our predictions diverge from the line, they reveal no systematic bias but rather are scattered equally above and below the line. Finally, results produced by the deflated data are tightly clustered around the 45-degree line with no systematic deviation.

Results for 1988 are similar. Recall that this is our hardest case: the out-of-sample test is based on data 12 years prior to the election, and the reporting pattern in 1988 is different than in other years. And although our prediction does involve more noise than in 1984, here too, while the prediction based on self-reports differs systematically from the observed vote, prediction based on our algorithm is in line with observed patterns.

Figure 3. Model Fit Based on Self-Reported, Validated, and Deflated Data for 1984 (Panel a) and 1988 (Panel b).
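The binning exercise behind figure 3 can be written as in the sketch below (continuing the toy objects above): predicted probabilities are grouped into 20 equal-length intervals and each interval's mean prediction is compared with the validated turnout rate observed in it. The helper is ours, for illustration.

```python
# Sketch: calibration points for a vector of predicted turnout probabilities
# against validated turnout, as plotted against the 45-degree line in figure 3.
import numpy as np
import pandas as pd

def calibration_points(predicted, validated, n_bins=20):
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_id = np.clip(np.digitize(predicted, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bin_id == b
        if mask.any():   # skip empty intervals
            rows.append((predicted[mask].mean(), validated[mask].mean()))
    return pd.DataFrame(rows, columns=["mean_predicted", "observed_turnout"])

# Points above the 45-degree line indicate overestimated turnout.
print(calibration_points(df_year["p_report"].to_numpy(),
                         df_year["validated"].to_numpy()))
```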
Exposing Hidden Effects in the 1990s

Having established its validity, we reach the final stage of our analysis, applying our algorithm to deflate self-reports in the 1992 and 1996 presidential elections where validated data are unavailable. We do so exactly as we describe above: we estimate a turnout model based on self-reports for 1992 (and 1996), calculate the predicted probability of turning out for each individual, and apply our deflator to those predicted probabilities. Table 5 compares the results based on self-reports to those based on the deflated self-reports produced by our algorithm for each of the three models, as above.

Table 5. Predicted Probabilities of Turning Out by Race: 1990s Presidential Elections (95-percent confidence intervals in parentheses)

 | 1992: Self-reported turnout | 1992: Deflated self-reported turnout | 1996: Self-reported turnout | 1996: Deflated self-reported turnout
Model 1 (thin)
African Americans | 0.670 (0.607, 0.725) | 0.483 (0.399, 0.559) | 0.676 (0.594, 0.745) | 0.490 (0.396, 0.581)
Whites | 0.773 (0.753, 0.793) | 0.685 (0.656, 0.712) | 0.776 (0.753, 0.799) | 0.687 (0.658, 0.716)
Difference b/w African Americans and Whites | 0.103 (0.044, 0.165) | 0.202 (0.121, 0.287) | 0.100 (0.026, 0.182) | 0.197 (0.106, 0.294)
Model 2 (thick)
African Americans | 0.790 (0.734, 0.841) | 0.609 (0.513, 0.693) | 0.760 (0.687, 0.822) | 0.576 (0.488, 0.659)
Whites | 0.792 (0.769, 0.813) | 0.703 (0.673, 0.730) | 0.787 (0.763, 0.810) | 0.699 (0.667, 0.729)
Difference b/w African Americans and Whites | 0.001 (-0.052, 0.061) | 0.094 (0.001, 0.197) | 0.027 (-0.036, 0.101) | 0.122 (0.038, 0.216)
Model 3 (full)
African Americans | 0.854 (0.804, 0.900) | 0.680 (0.603, 0.752) | 0.845 (0.780, 0.897) | 0.671 (0.582, 0.746)
Whites | 0.836 (0.813, 0.857) | 0.750 (0.720, 0.777) | 0.837 (0.810, 0.861) | 0.751 (0.720, 0.780)
Difference b/w African Americans and Whites | -0.018 (-0.064, 0.032) | 0.069 (0.000, 0.146) | -0.008 (-0.066, 0.058) | 0.080 (0.003, 0.166)

NOTE.—Entries in the table are predicted probabilities of turning out. All other variables in the model are held at their respective means. Model 1: Race. Model 2: Race, education, and region (South vs. other). Model 3: Full model; see specification in table D2. For all models, see N and coefficients in tables D1 and D2.

The table presents two main findings. First, it shows that predicted probabilities produced by R are inflated compared to D across models in both elections. More importantly, our use of D exposes a larger racial gap in turnout compared to the use of R. In both 1992 and 1996, the use of R suggests that while Whites turn out significantly more than African Americans in the "thin" model (a difference of about 10 percentage points), the effect washes out when controls are introduced in the "thick" and in the "full" models. However, similar to our findings in the 1980s, in the case of D, the point estimates of the gap between African Americans and Whites are considerably higher in each model. In the thin model, the racial gap in turnout increases to about 20 percentage points in both elections. In the thick model, we expose a difference of 12 percentage points in 1996. Finally, results of model 2 in 1992 and the full model in both 1992 and 1996 expose differences that, although close to a traditional threshold of statistical significance, are all centered at a much greater gap than self-reported data suggest. That racial differences decline as we move from the "thin" to the "full" model is not surprising. While in the thin model the gap is a function of both opportunity and desirability effects (on-the-curve and off-the-curve effects), adding background variables reduces the role of opportunity effects relative to those caused by desirability.

How far in time can we extrapolate? Although we establish the constancy of the relationship between reported and validated turnout between the 1970s and the 1980s and test the performance of our algorithm between the earliest and latest available data points, we extrapolate only to the two elections immediately following the latest validated data. Our approach can be applied to elections after 1996, but absent further validation data, we take caution and stop in 1996. We do not argue that the same results we find in the 1980s and 1990s hold indefinitely. Rather, our exercise exposes the perils of using self-reported turnout data for inference about racial (and perhaps other) differences in turnout. We discuss this point further in the concluding section.

Conclusion

Sophisticated survey methodology can shed light on important puzzles in political behavior, yet in some circumstances survey responses should not be taken at face value. Electoral participation is a prime example. This study explicitly analyzes how reporting of turnout is related to actual turnout, and thus, who really votes. We demonstrated that racial differences in turnout are underestimated because of both differing opportunities to overreport and differences in social desirability between African Americans and Whites. With this insight we provided a means to assess the impact of race on turnout in years where validation data are unavailable. Finally, through the mechanism we develop, we inferred that the large racial gap in turnout existed in the 1970s and 1980s and was likely present in the 1990s.

With an African American leading a major party for the first time and African Americans turning out in record numbers (Ansolabehere and Stewart 2009, p. 6), the 2008 U.S. presidential elections presented a new reality to the American public. Yet, other things equal, social desirability may be an even more powerful source of overreporting among African Americans who did not turn out in elections in which a Black candidate competes. While we demonstrated that the underlying patterns of overreporting on which we base our method hold during the period in which we are able to test our assumptions, we are, of course, unable to argue that they hold forever after. The 2008 U.S.
presidential elections call us to collect new validated data that will allow us to continue to uncover the important and sometimes hidden relationship between race and turnout.

Appendix A: Validated or “Validated”?

The arguments presented in this article rest on the assumption that the validation data are an accurate representation of actual voting. In this section, we address possible concerns about this assumption. In a National Election Studies Technical Report, Presser, Traugott, and Traugott (1990) suggest that the racial gap in turnout revealed by the validated data is merely an artifact of poor recordkeeping in areas where African Americans reside. In other words, since African Americans' voting records are poorly kept, validated figures mistakenly suggest that they vote at substantially lower rates compared to Whites. While no data are free of measurement error, we doubt this conclusion for three reasons.

The authors argue that African Americans are more likely to live in areas where record quality and access to records were poorer (table 6, p. 28), and that across both races, the proportion validated as voting (given a positive self-report and validation as registered to vote) is lower in areas where record quality is lower. (The authors combine six variables of record management into an additive index of record quality (see footnote 6 in the report) and then break the index into three categories.) Focusing on those living in areas of high record quality alone, African Americans are only six percent less likely to be validated as voting than Whites—a third of the racial difference overall. While we do not contest this argument, it does not necessarily undermine our finding of a racial gap in overreporting. First, although of varying magnitudes, racial differences in overreporting are found within each level of recordkeeping quality (28 percent, 11 percent, and six percent, depending on record quality). In other words, since we are not aware of an argument suggesting that the records for African Americans are more difficult to find than those for Whites within districts where record quality as a whole is similar, racial differences in overreporting remain, holding constant record quality.

In addition, although we do not reanalyze the data here, the nature of the relationships the authors detect both across and within types of record quality actually complements our argument rather than undercutting it. Presser et al. limit their analysis to respondents who both registered and reported to have voted. In our framework, these are respondents who are relatively likely to vote, located at the middle and upper sections of figures 1 and 2. Since the authors reveal that districts with low-quality records are also areas where turnout is generally lower, such as the South and central cities, respondents in these districts are probably relatively less likely to vote among a sample of people with a generally high propensity of turning out. In our framework, these voters are located in the middle (as opposed to the uppermost section) of figures 1 and 2. Therefore, the racial gap the authors find within categories of record quality is analogous to our differential desirability (off-the-curve) effect: this is the racial gap controlling for propensity to vote. As we showed, this gap is large for voters in the middle section (poor record quality in Presser et al.'s sample) and smaller for voters who are highly likely to vote (high record quality). Conversely, the effect the authors find across record quality is analogous to our opportunity effect (on-the-curve): rates of overreporting are lower where voters are likely to vote (areas with high record quality).

Indeed, utilizing the same data but including nonregistered as well as registered voters, Abramson and Claggett (1992) conclude that racial differences in overreporting are not products of varying record quality. Utilizing the same objective measures of record quality as Presser et al., they find no evidence in 1986 or 1988 that African Americans are more likely to live in areas with poor recordkeeping. Finally, conducting a thorough analysis of record quality, Cassel (2004) concludes that “validation error from poor-quality or inaccessible voting records—located particularly in central city, African American, and Southern communities—does not bias the effects of related turnout predictors” (p. 107). In sum, available evidence supports our assumption regarding the quality of validated data.
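To make the within-stratum versus across-strata distinction concrete, here is a minimal, hypothetical sketch rather than a reanalysis: it assumes a respondent-level DataFrame with invented column names `black` (1 = African American, 0 = White), `validated_voted` (1 = a record of voting was found), and `record_quality` (a three-category index of the kind Presser et al. construct), and computes the racial difference in validation rates within each record-quality category and overall.

```python
import pandas as pd

def validation_gap_by_stratum(df, race_col="black", valid_col="validated_voted",
                              quality_col="record_quality"):
    """Racial gap in the share validated as voting, within record-quality strata."""
    by_stratum = (df.groupby([quality_col, race_col])[valid_col]
                    .mean()
                    .unstack(race_col))
    # Gap computed as White rate minus African American rate within each stratum
    by_stratum["gap"] = by_stratum[0] - by_stratum[1]
    overall_gap = (df.loc[df[race_col] == 0, valid_col].mean()
                   - df.loc[df[race_col] == 1, valid_col].mean())
    return by_stratum, overall_gap
```

A nonzero `gap` within every stratum corresponds to the within-quality differences discussed above, while the contrast between `overall_gap` and the stratum-specific gaps reflects the across-strata (opportunity-like) component.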
Appendix B: Question Wording and Variable Names

All variables and codes are from the American National Election Study 1948–2004 Cumulative file.

VCF9151 Vote Validation: Turnout (Self-Report)
1. Yes, voted
5. No, did not vote

VCF9152 Vote Validation: Attempt to Validate Registration
1. Yes, attempted
2. No: R says is not registered (1984–1990); R says not required/DK/NA if registered (1984–1990)
3. No: No respondent named and/or insufficient address (all years); registered out of area (except 1984, 1990); Washington, D.C. [Note: Code 2 has priority]
4. No: Same-day registration (1964, 1980)
5. No: Records not sent out due to: Office error (1980); No Post IW (1964, some cases in 1976)
7. No: Office refuses all access to registration records

VCF9155 Vote Validation: Vote Validated
1. Yes
3. Registration record was found; no record of R voting
5. No registration record found; no record of voting found

VCF9153 Vote Validation: Office Voting Records
1. Office records appear to be adequate; no information about deficiencies
2. Some office voting records not accessible
3. R's name unknown and/or insufficient address and/or registered out of area** (not 1984, 1986: see code 0)
5. No office voting records accessible

VCF0703 Did R Register and Vote?
“In talking to people about the election, we [1972 and later: often] find that a lot of people weren't able to vote because they weren't registered or they were sick or they just didn't have time.” (1978 and later: “How about you, did you vote in the elections this November?”)
(1) No, did not vote
(2) Yes, voted

VCF0106 Race
1948–1998: Interviewer Observation of Race
1972 and later: “In addition to being American, what do you consider your main ethnic group or nationality group?”
(1) White
(2) Black
(3) Other

VCF0108 Hispanic
“In addition to being American, what do you consider your main ethnic group or nationality group?” (If no Hispanic group mentioned:) “Are you of Spanish or Hispanic origin or descent?” (If yes:) “Please look at the booklet and tell me which category best describes your Hispanic origin.”
(1) Yes, R is Hispanic
(2) No, R is not Hispanic
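As an illustration only (not the authors' code), the variables listed above might be combined into analysis indicators as follows; the input is assumed to be a pandas DataFrame whose columns carry the VCF names, and the derived column names are invented.

```python
import numpy as np
import pandas as pd

def build_turnout_indicators(df):
    """Derive reported, validated, and overreport indicators from the codes above.

    Assumes a DataFrame with columns named after the cumulative-file variables
    'VCF9151' (validation-study self-report), 'VCF9155' (vote validated), and
    'VCF0703' (did R register and vote), coded as listed in Appendix B.
    """
    out = pd.DataFrame(index=df.index)
    # VCF9151: 1 = yes, voted; 5 = no, did not vote
    out["reported"] = df["VCF9151"].map({1: 1, 5: 0})
    # VCF0703: 2 = yes, voted; 1 = no, did not vote (usable in all years)
    out["reported_all_years"] = df["VCF0703"].map({2: 1, 1: 0})
    # VCF9155: 1 = validated as voting; 3 and 5 = no record of voting found
    out["validated"] = df["VCF9155"].map({1: 1, 3: 0, 5: 0})
    # Overreporters: report voting but have no validated record of voting
    out["overreport"] = np.where(
        out["reported"].notna() & out["validated"].notna(),
        ((out["reported"] == 1) & (out["validated"] == 0)).astype(int),
        np.nan,
    )
    return out
```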
Appendix C: Sampling Procedure and Response Rates

SAMPLING PROCEDURE

The ANES defines their sample universe as “all U.S. households (including civilian households on military bases) in the 48 coterminous states and the District of Columbia,” and employs a multi-stage area probability design, in four stages. In stage one, primary geographical areas including Standard Metropolitan Statistical Areas (SMSAs), counties, and county groups are selected, followed by area segments such as housing unit clusters, followed by housing units, and finally all eligible members of the housing unit. Selection is random at each stage. (This description is paraphrased from, and is described in greater detail on, the ANES Web site at http://www.electionstudies.org.)

RESPONSE RATE

The ANES calculates response rate as the number of interviews divided by the number of all eligible citizens from a random sample of households. Below, we reproduce part of a table of response rates from the ANES Web site for the survey years relevant to this study. Full details about the ANES methodology for constructing response rates are available at http://www.electionstudies.org.

Table C1. Response Rates and Refusal Rates for the 1976–1996 Pre- and Post-Election Studies (Presidential Election Years Only)

Year    Response rate  Refusal rate  Number of interviews  Sample N  Dates conducted (pre)  Dates conducted (post)
1996#   59.8           20.8          398                   666       9/3-11/4               11/6-12/31
1992#   74.0           22.2          1126                  1522      9/1-11/2               11/4-1/13
1988    70.5           20.7          2040                  2893      9/6-11/7               11/9-1/24
1984    72.1           20.8          2257                  3131      9/4-11/6               11/8-1/25
1980    71.8           -             1614                  2249      9/7-11/3               11/5-2/7
1976+   70.4           -             2248                  3191      9/7-11/1               11/3-1/30

+ Using unweighted Ns
- Unknown
# Only includes fresh cross-section sample
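As a quick arithmetic check, the response rates in Table C1 can be recovered as the ratio of interviews to sample N, matching the stated definition; the figures below are copied from the table above.

```python
# Response rate = number of interviews / eligible sample N (cf. Table C1)
interviews = {1996: 398, 1992: 1126, 1988: 2040, 1984: 2257, 1980: 1614, 1976: 2248}
sample_n   = {1996: 666, 1992: 1522, 1988: 2893, 1984: 3131, 1980: 2249, 1976: 3191}

for year in sorted(interviews, reverse=True):
    rate = 100 * interviews[year] / sample_n[year]
    print(f"{year}: {rate:.1f}%")   # 59.8, 74.0, 70.5, 72.1, 71.8, 70.4
```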
Appendix D: Model Specifications

Table D1. Estimated Turnout Based on Self-Reported and Discounted Data for Models 1 and 2 (Thin and Thick) (standard errors in parentheses)

                      1984                           1988                           1992             1996
                      Validated      Self-reported   Validated      Self-reported   Self-reported    Self-reported
                      turnout        turnout         turnout        turnout         turnout          turnout
Model 1 (thin)
  African American    -0.773         -0.458*         -0.852*        -0.532*         -0.518*          -0.498*
                      (0.158)        (0.166)         (0.168)        (0.159)         (0.151)          (0.189)
  Constant            0.784*         1.135*          0.717*         0.920*          1.230*           1.242*
                      (0.057)        (0.061)         (0.061)        (0.061)         (0.055)          (0.069)
  N                   1614           1663            1385           1530            1838             1349
  Log likelihood      1016.9         938.3           886.0          929.2           1005.9           732.2
Model 2 (thick)
  African American    -0.500*        -0.165          -0.834*        -0.293          -0.002           -0.148
                      (0.168)        (0.179)         (0.179)        (0.174)         (0.172)          (0.203)
  Education           0.338*         0.450*          0.424*         0.593*          0.686*           0.573*
                      (0.048)        (0.052)         (0.050)        (0.052)         (0.053)          (0.066)
  South               -0.440*        -0.315*         0.172          -0.297*         -0.673*          -0.319*
                      (0.130)        (0.136)         (0.154)        (0.137)         (0.131)          (0.144)
  Constant            -0.244         -0.253          -0.729*        -0.941*         -0.869*          -0.582*
                      (0.170)        (0.177)         (0.180)        (0.180)         (0.180)          (0.227)
  N                   1614           1663            1385           1530            1838             1349
  Log likelihood      982.9          892.9           848.5          850.8           895.2            686.9

p < .05 = *
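For concreteness, models of the kind summarized in Table D1 could be estimated as sketched below. This is not the authors' code: a logistic specification is assumed (the reported coefficients are consistent with one), and the DataFrame `anes` and its columns `voted`, `black`, `education`, and `south` are hypothetical stand-ins for the corresponding ANES measures.

```python
import statsmodels.formula.api as smf

def fit_thin_and_thick(anes):
    """Fit the 'thin' (race only) and 'thick' (race, education, South) models.

    `anes` is a hypothetical DataFrame with a 0/1 turnout column `voted`
    (either validated or self-reported), plus `black`, `education`, `south`.
    """
    thin = smf.logit("voted ~ black", data=anes).fit(disp=False)
    thick = smf.logit("voted ~ black + education + south", data=anes).fit(disp=False)
    return thin, thick
```

Fitting the same specifications separately to validated and to self-reported turnout, year by year, yields the two sets of coefficients juxtaposed in the table.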
Table D2. Estimated Turnout Based on Self-Reported and Validated Data, Model 3 (Full) (standard errors in parentheses)

Explanatory variables: Education, Family income, African American, Age 17–24, Age 25–34, Age 35–44, Age 55–64, Age 65 and over, South, Male, Homeowner, Church attendance, PID (strength), Contact, Constant; N; Log likelihood. Columns: validated turnout and self-reported turnout for 1976, 1980, 1984, and 1988; self-reported turnout only for 1992 and 1996.

Coefficient estimates:
0.542* (0.071) 0.111 (0.076) 0.301* (0.142) -0.396* (0.158) -0.119 (0.230) -1.025* (0.250) -0.665* (0.215)
0.690* (0.078) 0.157 (0.082) 0.307* (0.152) -0.211 (0.165) 0.141 (0.226) -1.296* (0.288) -0.831* (0.263)
0.403* (0.064) 0.377* (0.062) 0.003 (0.109) -0.373* (0.144) -0.267 (0.191) -1.019* (0.235) -0.405 (0.210)
0.530* (0.069) 0.254* (0.070) 0.034 (0.116) -0.193 (0.151) 0.133 (0.208) -1.090* (0.250) -0.512* (0.230)
0.403* (0.064) 0.138 (0.072) 0.293* (0.137) 0.022 (0.118) -0.754* (0.206) -1.180* (0.293) -0.634* (0.245)
0.607* (0.070) 0.311* (0.075) 0.190 (0.143) -0.513* (0.159) -0.058 (0.251) -1.185* (0.301) -0.752* (0.263)
0.524* (0.065) 0.260* (0.073) 0.634* (0.135) -0.436* (0.147) 0.248 (0.217) -1.130* (0.245) -0.795* (0.222)
0.786* (0.070) 0.325* (0.069) -0.069 (0.133) -0.554* (0.146) 0.146 (0.193) -0.766* (0.260) -0.439* (0.220)
0.576* (0.083) 0.388* (0.085) 0.002 (0.153) -0.249 (0.166) 0.081 (0.238) -0.606 (0.337) -0.544* (0.253)
0.331* (0.059) 0.253* (0.069) 0.279* (0.127) -0.675* (0.144) 0.114 (0.215) -1.039* (0.230) -0.692* (0.203)
1692 786.4 N Log likelihood 1590 839.7
-0.081 (0.244) 0.052 (0.258) 0.130 (0.261) 0.288* (0.143) 0.343* (0.046) 0.415* (0.066) 0.715* (0.159) -3.484* (0.405)
-0.185 (0.226) Age 55–64 0.169 (0.230) Age 65 and over 0.242 (0.233) Homeowner 0.718* (0.137) Church attendance 0.261* (0.043) PID (strength) 0.292* (0.063) Contact 0.760* (0.146) Constant -2.994* (0.373)
1109 622.8
-0.120 (0.235) 0.026 (0.151) 0.352 (0.247) 0.558* (0.165) 0.216* (0.048) 0.282* (0.073) 0.354* (0.173) -3.071* (0.391)
1223 573.2
-0.179 (0.287) 0.264 (0.309) 0.416 (0.306) 0.475* (0.170) 0.225* (0.052) 0.382* (0.077) 0.829* (0.205) -3.406* (0.450)
1614 845.1
-0.350 (0.219) 0.272 (0.259) 0.484 (0.246) 0.520* (0.138) 0.262* (0.043) 0.360* (0.062) 0.624* (0.155) -3.102* (0.365)
1663 746.9
-0.146 (0.248) 0.365 (0.286) 0.435 (0.270) 0.561* (0.147) 0.288* (0.047) 0.403* (0.066) 0.865* (0.182) -3.435* (0.392)
1385 724.0
-0.430 (0.239) -0.164 (0.272) 0.202 (0.277) 0.506* (0.152) 0.297* (0.047) 0.453* (0.068) 0.557* (0.164) -3.315* (0.392)
-0.468* (0.227) 0.916* (0.306) 0.741* (0.252) 0.335* (0.144) 0.133* (0.042) 0.439* (0.065) 0.992* (0.208) -3.836* (0.380)
1838 760.3
-0.405 (0.265) 0.063 (0.333) 0.274 (0.297) 0.326* (0.157) 0.343* (0.051) 0.592* (0.071) 0.910* (0.191) -4.426* (0.429)
1530 685.3 1349 563.9
-0.504* (0.248) 0.022 (0.299) 0.580* (0.285) 0.333 (0.171) 0.317* (0.053) 0.584* (0.080) 0.896* (0.201) -4.370* (0.471)

p < .05 = *

Appendix E: The Relationship Between Validated and Self-Reported Data

Table E1. Deflated Functions (standard errors in parentheses)

Variables: African American; Pr(R); Pr(R) squared; Year dummy: 1980, 1984, 1988; Interaction effects: Af. Am. * Pr(R), Af. Am. * Pr(R) squared, 1980 * Af. Am., 1984 * Af. Am., 1988 * Af. Am., 1980 * Pr(R), 1984 * Pr(R), 1988 * Pr(R), 1980 * Pr(R) squared, 1984 * Pr(R) squared, 1988 * Pr(R) squared, 1980 * Pr(R) * Af. Am., 1984 * Pr(R) * Af. Am., 1988 * Pr(R) * Af. Am., 1980 * Pr(R) squared * Af. Am., 1984 * Pr(R) squared * Af. Am., 1988 * Pr(R) squared * Af. Am.; Constant; N. Columns: varying over time and race; varying over race only.

Coefficient estimates:
0.022 0.385* 0.454* (0.053) (0.052) (0.040)
-0.011 0.420* 0.391* (0.019) (0.189) (0.020)
0.005 -0.016 0.010 (0.024) (0.023) (0.020)
– – – – – –
-0.273 0.200 -0.022 -0.031 -0.012 -0.006 0.056 0.212* -0.037 -0.041 -0.252* 0.042 -0.053 -0.249 -0.012 0.073 0.231 0.113*
(0.178) (0.141) (0.070) (0.066) (0.060) (0.078) (0.075) (0.067) (0.061) (0.057) (0.052) (0.238) (0.226) (0.206) (0.190) (0.181) (0.167) (0.016)
-0.280* 0.221* – – – – – – – – – – – – – – – 0.126*
(0.069) (0.057) – – – – – – – – – – – – – – – (0.008)
N: 6108 6108

p < .05 = *
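The deflating functions summarized in Table E1 amount to regressing validated turnout on the predicted probability of reporting turnout, its square, and race (and, in the richer specification, election-year interactions). The sketch below illustrates the simpler, race-only version; it is not the authors' estimator, the linear-probability form is an assumption, and `validated`, `pr_report`, and `black` are hypothetical column names.

```python
import statsmodels.formula.api as smf

def fit_deflating_function(pooled):
    """Relate validated turnout to the predicted probability of reporting turnout.

    `pooled` is a hypothetical DataFrame for the validation years with columns
    `validated` (0/1), `pr_report` (predicted probability of a positive
    self-report), and `black` (0/1). The curve is quadratic in pr_report and
    race is allowed to shift it, in the spirit of the race-only column of
    Table E1; a linear-probability specification is assumed here.
    """
    formula = ("validated ~ pr_report + I(pr_report ** 2) + black"
               " + black:pr_report + black:I(pr_report ** 2)")
    return smf.ols(formula, data=pooled).fit()

def deflate(deflator, respondents):
    """Apply the fitted curve to respondents from years without validation
    (e.g., 1992 and 1996) to obtain deflated turnout probabilities."""
    return deflator.predict(respondents).clip(0, 1)
```

The first step (estimating `pr_report` from a model of reported turnout) is not shown; any of the specifications in Tables D1 and D2 could serve that role.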
References

Abramson, Paul, and William Claggett. 1984. “Race-Related Differences in Self-Reported and Validated Turnout.” Journal of Politics 46(3):719–38.
———. 1986. “Race-Related Differences in Self-Reported and Validated Turnout in 1984.” Journal of Politics 48(2):412–22.
———. 1989. “Race-Related Differences in Self-Reported and Validated Turnout in 1986.” Journal of Politics 51(2):397–408.
———. 1991. “Racial Differences in Self-Reported and Validated Turnout in the 1988 Presidential Election.” Journal of Politics 53(1):186–97.
———. 1992. “The Quality of Record Keeping and Racial Differences in Validated Turnout.” Journal of Politics 54(3):871–80.
Ansolabehere, Stephen, and Charles Stewart III. 2009. “Amazing Race: How Post-Racial Was Obama's Victory?” Boston Review 34(1), January/February.
Belli, Robert F., Michael W. Traugott, and Matthew N. Beckmann. 2001. “What Leads to Voting Overreports? Contrasts of Overreporters to Validated Voters and Admitted Nonvoters in the American National Election Studies.” Journal of Official Statistics 17(4):479–98.
Belli, Robert F., Michael W. Traugott, Margaret Young, and Katherine A. McGonagle. 1999. “Reducing Vote Overreporting in Surveys.” Public Opinion Quarterly 63:90–108.
Bernstein, Robert, Anita Chadha, and Robert Montjoy. 2001. “Overreporting Voting: Why It Happens and Why It Matters.” Public Opinion Quarterly 65(1):22–44.
Burden, Barry C. 2000. “Voter Turnout and the National Election Studies.” Political Analysis 8(4):389–90.
———. 2003. “Internal and External Effects on the Accuracy of NES Turnout: Reply.” Political Analysis 11(2):193–95.
Cassel, Carol A. 2003. “Overreporting and Electoral Participation Research.” American Politics Research 31(3):81–92.
———. 2004. “Voting Records and Validated Voting Studies.” Public Opinion Quarterly 68(1):102–8.
Clausen, Aage R. 1968. “Response Validity: Vote Report.” Public Opinion Quarterly 32(4):588–606.
Dawson, Michael C. 1994. Behind the Mule. Princeton, NJ: Princeton University Press.
Duff, Brian, Michael J. Hanmer, Won-Ho Park, and Ismail K. White. 2007. “Good Excuses: Understanding Who Votes with an Improved Turnout Question.” Public Opinion Quarterly 71(1):67–90.
Greene, William H. 1993. Econometric Analysis. 2nd ed. Upper Saddle River, NJ: Prentice Hall International.
Hill, Kim Quaile, and Patricia A. Hurley. 1984. “Nonvoters in Voters' Clothing: The Impact of Voting Behavior Misreporting on Voting Behavior Research.” Social Science Quarterly 65(1):199–206.
Holbrook, Allyson L., and Jon A. Krosnick. Forthcoming. “Social Desirability Bias in Voter Turnout Reports: Tests Using the Item Count Technique.” Public Opinion Quarterly.
Karp, Jeffrey A., and David Brockington. 2005. “Social Desirability and Response Validity: A Comparative Analysis of Overreporting Voter Turnout in Five Countries.” Journal of Politics 67(3):825–40.
Kinder, Donald R., Tali Mendelberg, Michael C. Dawson, Lynn M. Sanders, Steven J. Rosenstone, Jocelyn Sargent, and Cathy Cohen. 1989. Race and the 1988 American Presidential Election. Paper presented at the Annual Meeting of the American Political Science Association, Atlanta, GA, USA.
Kinder, Donald R., and Lynn M. Sanders. 1996. Divided by Color: Racial Politics and Democratic Ideals. Chicago, IL: University of Chicago Press.
Martinez, Michael D. 2003. “Comment on ‘Voter Turnout and the National Election Studies’.” Political Analysis 11(2):187–92.
McDonald, Michael P. 2003. “On the Overreport Bias of the National Election Study Turnout Rate.” Political Analysis 11(2):180–86.
McDonald, Michael P., and Samuel L. Popkin. 2001. “The Myth of the Vanishing Voter.” American Political Science Review 95(4):963–74.
Miller, Warren E., and the National Election Studies. 1999.
American National Election Studies Cumulative Data File, 1948–1998 [Computer file]. 10th ICPSR version. Ann Arbor, MI: University of Michigan, Center for Political Studies [producer]; Ann Arbor, MI: Inter-University Consortium for Political and Social Research [distributor].
Presser, Stanley, and Michael Traugott. 1992. “Little White Lies and Social Science Models: Correlated Response Errors in a Panel Study of Voting.” Public Opinion Quarterly 56(1):77–86.
Presser, Stanley, Michael W. Traugott, and Santa Traugott. 1990. Vote ‘Over’ Reporting in Surveys: The Records or the Respondent. Paper prepared for the International Conference on Measurement Errors, Tucson, AZ, USA.
Rosenstone, Steven J., and John Mark Hansen. 1993. Mobilization, Participation, and Democracy in America. New York: Macmillan.
Sigelman, Lee. 1982. “The Nonvoting Voter in Voting Research.” American Journal of Political Science 26(1):47–56.
Silver, Brian D., Barbara A. Anderson, and Paul R. Abramson. 1986. “Who Overreports Voting?” American Political Science Review 80(2):613–24.
Tate, Katherine. 1994. From Protest to Politics: The New Black Voters in American Elections. Cambridge, MA: Harvard University Press.
Uggen, Christopher, and Jeff Manza. 2002. “Democratic Contraction? Political Consequences of Felon Disenfranchisement in the United States.” American Sociological Review 67(6):777–803.
Verba, Sidney, and Norman H. Nie. 1972. Participation in America: Political Democracy and Social Equality. New York: Harper and Row.
Verba, Sidney, Kay Lehman Schlozman, and Henry E. Brady. 1995. Voice and Equality. Cambridge, MA: Harvard University Press.