American Association for Public Opinion Research

Review: Was 1996 a Worse Year for Polls Than 1948?
Author(s): Warren J. Mitofsky
Source: The Public Opinion Quarterly, Vol. 62, No. 2 (Summer, 1998), pp. 230-249
Published by: Oxford University Press on behalf of the American Association for Public Opinion Research
Stable URL: http://www.jstor.org/stable/2749624
Accessed: 03/09/2013 19:44

This content downloaded from 141.213.236.110 on Tue, 3 Sep 2013 19:44:30 PM. All use subject to JSTOR Terms and Conditions.

THE POLLS-REVIEW
WAS 1996 A WORSE YEAR FOR POLLS THAN 1948?

WARREN J. MITOFSKY

After the ballots were counted in the 1996 presidential election, there were no pictures of a victorious Bob Dole, gleefully holding a newspaper with the erroneous headline, "Clinton Defeats Dole." We saw no postelection speeches in which President-elect Dole ripped into the liberal media polls that had declared him prematurely dead. House Republicans held no hearings to get to the bottom of the polls' failure. There were no press conferences in which pollsters expressed regret and confusion about what could have led them to think that Bill Clinton would win. And a humbled Clinton did not opine that the lead he had in preelection polls must have lulled his supporters to sleep on election day.
We saw none of these things because, in concert with the estimates from all of the media preelection polls, Clinton won the 1996 presidential election. Unlike the 1992 British preelection polls, almost all of which erroneously foretold a Labour victory, or the 1990 Nicaraguan polls, many of which picked Daniel Ortega to defeat Violeta Chamorro, or the 1980 U.S. polls that predicted an uncertain Reagan lead and not a landslide, or the infamous 1948 polls that predicted a Dewey victory over Truman, the 1996 polls were unanimously correct in predicting that Clinton would win by a safe margin. Unlike those storied exemplars of polling frailty, earlier occasions for embarrassment and consternation, polling in the 1996 election could be judged a clear-cut success.

It came as a considerable shock, then, when Everett Carll Ladd, Jr., director of the Roper Center at the University of Connecticut and a prominent figure in the field of public opinion research, declared in an article published in the Chronicle of Higher Education (1996a), and excerpted in the Wall Street Journal (1996b), that "election polling had a terrible year in 1996. Indeed, its overall performance was so flawed that the entire enterprise should be reviewed by a blue-ribbon panel of experts." A flurry of "copycat" negative media coverage followed Ladd's extraordinary statement, with articles appearing in U.S. News and World Report, the New York Times Sunday Magazine, and a dozen or more periodicals and radio and television broadcasts.

WARREN J. MITOFSKY is president of Mitofsky International.
Public Opinion Quarterly Volume 62:230-249 © 1998 by the American Association for Public Opinion Research. All rights reserved. 0033-362X/98/6202-0006$02.50

Ladd's attack on the polls was a miscellaneous collection of charges.
Most prominently, while acknowledging that Clinton did actually win the 1996 election, he said that the polls had overestimated the president's margin of victory. He remarked that, by comparison, the 1948 polls seemed "closer to a triumph than a disaster." He said that Clinton's large lead in the polls throughout the campaign was illusory, but that its publication had dampened interest in the election and turnout on election day. He decried the number of preelection polls, whose findings "bombarded" the electorate throughout the campaign. He assailed not only preelection but also exit polls, charging that they had overestimated the Democratic share of the vote "in recent years." He opined that Republicans and conservatives were less likely to agree to respond to polls (perceiving them to be tools of the liberal media), leading to nonresponse bias in preelection poll estimates.

Ladd's wide-ranging critique contained both testable charges and speculative complaints. Characteristic of the latter criticisms was his claim that the polls had overestimated Clinton's lead during the campaign and had thereby dampened interest in the election. His conjecture that the polls had suffered from nonresponse bias due to noncooperation by conservatives was similarly unverifiable. The impact, if any, of poll "bombardment" on prospective voters would have been difficult to assess during the campaign and was impossible to gauge retrospectively. To respond to any of these assertions, one would have to match speculation with surmise.

In addition, Ladd's testable claims themselves were poorly specified. His complaints about inaccuracy and Democratic bias in exit polls referred to problems in "recent years" and offered no data.
And the marquee attack in Ladd's offensive, that 1996 poll accuracy was worse than in 1948 or any other year, was simply unsupported. His Chronicle of Higher Education article contained no 1996 data whatsoever. The excerpt of that article in the Wall Street Journal featured a table, titled "Polls Away from Reality," listing final 1996 poll projections for eight polling organizations and the share of the vote for Clinton, Dole, and Perot. Ladd's "analysis" consisted solely of two statements, that "most of the leading national polling organizations made pre-election estimates that diverged sharply from the actual vote on November 5," and that "of late, both pre-election surveys and exit polls on Election Day frequently have missed the mark by margins well in excess of the Gallup results in 1948." He presented no criteria for assessing "divergence" nor any measures to support this claim.

Because even his highlighted charges were imprecise and undocumented, a reasoned assessment of Ladd's attack requires some reformulation and refinement to enable gathering appropriate evidence. The responsible analyst, in other words, has to do the difficult work that Ladd had spared himself in leveling his charges and then make the effort to judge the case once the terms are clear. The prospect of performing such double duty is never inviting. But because Ladd's unexpected broadside received much national media attention, the National Council on Public Polls (NCPP 1997) felt compelled to defend 1996 poll performance. Considering Ladd's leading charge, that final preelection polls had badly overestimated the extent of Clinton's victory and were farther off the mark than in 1948, the council had to decide on a way to measure poll performance and then do the computations to provide a historical comparison.
The council's work was made more difficult by the fact that, after more than 50 years of election polling, no standard metric for gauging poll accuracy had been adopted by the polling community. When challenged by Ladd, the NCPP had to agree anew on a method of assessment. The absence of a generally accepted standard provides an open field for charges and countercharges based on hyperbole and blurred distinctions. But it is clearly an undesirable state of affairs for scholars who wish to study the question of poll accuracy with fairness and precision. My analysis, harking back to the Social Science Research Council (SSRC) report on the polls of 1948, examines the question of how poll accuracy should be measured, weighing the pluses and minuses of various approaches. I then revisit the Ladd-NCPP exchange and offer an answer to the question, Was 1996 a worse year for the polls than 1948?

Measures of Poll Accuracy

The SSRC study (Mosteller et al. 1949) of the 1948 preelection polls considered eight methods for measuring polling error. It noted that each method has "advantages and disadvantages." The SSRC committee's definitions are as follows.

Methods for Defining Election Polling Error

1. The difference in percentage points between the leading candidate's share of the total vote from a poll and from the actual vote.
2. The difference in percentage points between the leading candidate's share of the major party vote from a poll and from the actual vote. (Major parties are Democratic and Republican and are assumed to be the top two vote getters.)
3. The average (without regard to sign) of the percentage point deviation for each candidate between his/her estimate and the actual vote.
4. The average difference (without regard to sign) between a ratio for each candidate and the number one, where the ratio is defined as a candidate's estimate from a poll divided by the candidate's actual vote.
5. The difference between two differences, where the first difference is the estimate of the vote for the two leading candidates from a poll and the second difference is the election result for the same two candidates.
6. The maximum difference in percentage points between a party and the actual vote.
7. The chi-square to test the congruence of the estimated and actual vote distributions.
8. The difference between the predicted and actual electoral vote.

A significant problem for today's evaluation of polling accuracy, and not addressed by the SSRC because the problem did not arise in 1948, is how to handle the "undecided" vote in the polls. There is no undecided number in an election; it only exists in some, but not all, polls. If the undecided respondents are not allocated to a candidate, then the average error computed using methods 1, 3, 4, 6, and 7 will be exaggerated. For example, suppose that an election that was 55 percent Democratic and 45 percent Republican had a poll that showed 50 percent for the Democrat, 40 percent for the Republican and 10 percent undecided. If there was no allocation of the undecided, methods 1 and 3 would have errors of 5 percentage points. If the undecided were allocated proportionally and the error computed, methods 1 and 3 would only show an error of 1 percentage point. Furthermore, the error computed using any of methods 1, 3, 4, 6, and 7 will not be comparable for polls that report "undecided" and those that do their own allocation.

The alternatives for evaluating polls that include an undecided category are as follows: (1) allocate the undecided in proportion to the votes for candidates in a poll, (2) allocate the undecided evenly between the two major parties, (3) allocate all the undecided to the challenger, if there is an incumbent, or (4) use one of the methods that does not require that the undecided be allocated. A more complex allocation cannot be accomplished by an evaluator. Crespi (1988, p.
22) claims that the pollsters he interviewed for his book thought that proportional allocation of the undecided was "closest to the experience of most pollsters." Actually, the arithmetic can be done for all methods without allocating the undecided, but, as shown above, it does not result in comparable measures. The overriding consideration concerns the comparability of the measure over several elections.

Comments on Methods for Measuring Accuracy

The SSRC chose the straightforward analysis and interpretation of method 1 for its evaluation of the 1948 preelection polls. But they also expressed concern about the use of method 1 in elections with significant minor party candidacies (in 1948, Strom Thurmond and Henry Wallace). Method 1, while simple and easily understood, is artificial unless the test in the election is whether the leading candidate gets 50 percent or more. If there are more than two candidates in an election or if a poll includes an undecided percentage, the number for the leading candidate alone is of little value in describing the status of an election. Witness a recent argument between the Eagleton and Zogby Polls following the 1997 New Jersey governor's election. The Eagleton Institute of Politics (1997) claimed that it was closest with a preelection estimate of 45 percent of the vote, but that figure is of little use unless we also know that Christie Todd Whitman's opponent, Jim McGreevey, had significantly less than 45 percent. The Eagleton claim of accuracy (based on method 1) does not address this question (although their press release did include the vote for other candidates).

The SSRC committee also liked method 2 and used it, too, in their analysis. Method 2 reduces the percentages so the top two candidates add to 100 percent. The unspoken advantage is that this method eliminates all other candidates and any undecided who may be in a poll.
The measure, unlike method 1, means the same thing regardless of how many candidates participate in an election: at least one candidate will have 50 percent or more. While method 2 may appear to be similar to method 1, there is an important difference. When poll numbers are repercentaged so the two major party candidates add to 100 percent something very important happens: the undecided percentage is eliminated. The effect is that method 2 becomes identical to method 5 with the undecided allocated. If the undecided are not allocated before applying method 5, then the two measures are not equivalent.

Method 3 averages the percentage point deviation for each candidate between its estimate and the actual vote, without regard to sign. This approach, the SSRC committee thought, had "inherent drawbacks. By including many small parties which scarcely contribute to the total vote ... the average deviation can be made very small even though major party predictions have large errors" (Mosteller et al. 1949, p. 56). They understated the problem. If all 22 parties on the ballot in 1996 had been included in a computation using this method, then the average error would have been close to zero. Even with a limited number of candidates, if the coefficient of variation¹ is roughly constant for each candidate included, then the overall error will decline as the number of candidates increases. Clearly, there needs to be a limitation on the number of candidates. This method, if used with "discretion" about which parties to include, the SSRC report said, could be useful (p. 56). It should be noted that neither the SSRC nor the NCPP, which also used this method in its evaluation of the 1996 preelection polls, defined what might be reasonable criteria for including third parties.

Method 3 has the virtue of evaluating candidates other than the top two.
If the intent of a preelection poll is to report the standing of each candidate, this measure evaluates the average accuracy of more than two candidates. The criterion for including more than two candidates is arbitrary. Crespi (1988) included third-party candidates only if they received at least 15 percent of the vote in an election. In the twentieth century, his third-party criterion would have included Theodore Roosevelt (1912), Robert La Follette (1924), and Ross Perot (1992). It would have eliminated Strom Thurmond and Henry Wallace in 1948, George Wallace in 1968, John Anderson in 1980, and Ross Perot in 1996, among others (Congressional Quarterly 1985). Method 3's other failings are (1) a lack of comparability between elections that have different numbers of meaningful candidates, and (2) like some other methods, it requires the analyst to do something the pollster who created the poll was unwilling to do: namely, allocate the undecided voters among the candidates. If the undecided were not allocated, the measurements of error would not be comparable from poll to poll.

Method 4 computes the ratio of each candidate's estimate divided by the actual vote; error is the average deviation from one for each candidate. This approach tends to exaggerate small percentage point differences in minor party candidates. For example, a one-point error in a party with 50 percent of the vote results in a 2 percent error. A one-point error in a party with 5 percent of the vote produces a 20 percent error. If all parties' errors are averaged, the overall result exaggerates the total error. This is just the opposite of method 3, which minimizes the total error. Method 4 could be modified to include major party candidates only, but then its result would be comparable to methods 2 and 5 (with allocation of the undecided).

The only problem with method 5, according to the SSRC report, was the "complexity" of explaining it. Method 5 first computes the difference between the two leading candidates in the poll and the actual vote; the error is the difference between these differences.

1. The coefficient of variation is the standard error of an estimate (percentage) divided by the estimate (percentage). This is a more useful measure of variability than the standard error alone because coefficients are comparable from one candidate to another.

In a race involving only Democrats and Republicans, after allocation of the undecided, method 5 yields results exactly two times the result of methods 1 and 2. An advantage of method 5 is that it evaluates the statistic most often reported by the media when reporting preelection polls, which is the margin between the top two candidates. Ladd, in his Wall Street Journal article, bases his discussion on the margin between the top two candidates. More recently, the Newark Star-Ledger, in its report of the Eagleton Poll for the 1997 New Jersey governor's race, stated in its lead paragraph that Whitman had "a 9-point lead" (Hassell 1997). In the fifth paragraph the story cited Whitman's 45 percent figure, which the Eagleton Institute of Politics (1997), in its postelection press release, said was the most important figure. The author's nonscientific review of final preelection poll stories in 1996 and 1997 showed that almost all poll stories made primary mention of the margin between the top two candidates. The fact that there were distant third-party candidates was mentioned in these stories, but without comment, presumably because those candidates appeared to have no chance of challenging the two leading candidates. The margin between the leading candidates also was the measure pollsters used when they wrote about polls (DiVall 1996; Newport 1997; Taylor 1997).

Method 5 rewards the effort of the pollsters who allocate undecided voters well and penalizes those who allocate poorly. If the pollster does not allocate at all then the margin reported by the pollster, presumably, is the best indication of the pollster's expectation about the election outcome.
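As a rough illustration of the definitions above, the following Python sketch applies methods 1, 2, 3, and 5 to the hypothetical two-party example given earlier (an election that was 55 percent Democratic and 45 percent Republican, and a poll showing 50 percent, 40 percent, and 10 percent undecided). The function names, the signed form of methods 1, 2, and 5, and the rounding of allocated shares to whole percentages are assumptions made for this sketch; they are not taken from the SSRC report.

```python
def allocate_proportionally(poll, undecided):
    """Spread the undecided over the candidates in proportion to their poll
    shares, then round to whole percentages (assumed convention)."""
    total = sum(poll.values())
    return {c: round(s + undecided * s / total) for c, s in poll.items()}


def method1(poll, actual, leader):
    """Leading candidate's share of the total vote: poll minus actual."""
    return poll[leader] - actual[leader]


def method2(poll, actual, leader, runner_up):
    """Leading candidate's share of the two-party vote: poll minus actual."""
    poll_share = 100 * poll[leader] / (poll[leader] + poll[runner_up])
    actual_share = 100 * actual[leader] / (actual[leader] + actual[runner_up])
    return poll_share - actual_share


def method3(poll, actual):
    """Average deviation per candidate, without regard to sign."""
    return sum(abs(poll[c] - actual[c]) for c in actual) / len(actual)


def method5(poll, actual, leader, runner_up):
    """Poll margin between the top two candidates minus the actual margin."""
    return (poll[leader] - poll[runner_up]) - (actual[leader] - actual[runner_up])


actual = {"D": 55, "R": 45}
raw = {"D": 50, "R": 40}                       # 10 percent undecided left out
allocated = allocate_proportionally(raw, 10)   # {'D': 56, 'R': 44}

print(method1(raw, actual, "D"), method3(raw, actual))              # -5 5.0
print(method1(allocated, actual, "D"), method3(allocated, actual))  # 1 1.0
print(method5(raw, actual, "D", "R"))                               # 0
print(method5(allocated, actual, "D", "R"))                         # 2
```

In this example the unallocated poll is off by 5 points under methods 1 and 3 but shows no error under method 5, because the poll's 10-point margin matches the election margin. After allocation, the method 5 error (2 points) is twice the error under methods 1 and 2 (1 point each).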
Method 5 does not force someone evaluating polls to make assumptions about the undecided that were not made by the pollster.² It should be noted that the results of methods 3 and 5 are identical for two-candidate races when the undecided are allocated. They differ when there is no allocation.

Methods 6, 7, and 8 were mentioned in the 1948 SSRC report but were not considered as viable options. Method 6 applies to the one party with the largest error, even if it is a minor party. Complexity was the reason for dropping method 7. The error in electoral votes projected and evaluated in method 8 is one step removed from evaluating national or state polls directly.

Crespi (1988), after he performed proportional allocation for the polls that included undecided voters in their base, evaluated methods 1, 3, and 6. He concluded that there was not much difference between them and used method 1 for his analysis, as did the SSRC committee for its report on the 1948 elections. Methods 3 and 6, according to Crespi's data, had the smaller coefficients of variation. Method 1, which he used throughout the book, had a slightly larger coefficient of variation than the methods he rejected for his analysis.

2. I would like to acknowledge that I made the mistake of not allocating the undecided during the 15 years I directed the CBS half of the CBS/New York Times Poll. I now believe that it is unreasonable of a pollster to ask a reader or viewer of a final preelection poll to make an interpretation about how the undecided will vote. A poll is being reported so the public knows what to expect when the election takes place. Leaving the undecided in the base of the percentages reported does not serve the public expectation or the pollsters' claims about accuracy.

An analysis of recent British elections yields a mixed bag. Crewe (1997) prefers method 3. He calls it "the true test of a poll's accuracy" (p. 580).
Nonetheless, the entire discussion in his article about the 1997 elections is about Labour's lead, the lead being the difference between the top two parties, which is evaluated directly by method 5. He offers computations of polling error for both method 3 and method 5.

Robert M. Worcester endorses Crewe's preference for method 3.³ He also added a few details: (1) polls are repercentaged so the major parties add to 100 percent;⁴ (2) there are no decimal places in the poll's recomputed estimate of a party and there is one decimal in the election result.⁵ He claims this method has been used for decades to evaluate British elections.⁶

3. Robert M. Worcester, personal communication to author (E-mail), December 13, 1997.
4. Unlike U.S. elections, there are more than two major parties in British elections. In 1997, there were three major parties plus "others" in the evaluation.
5. The British evaluation, unlike the NCPP evaluation of U.S. presidential election polls, maintains the same number of significant digits for a poll that the poll had when it was reported by the media. The use of a constant evaluation method also makes it possible to readily compare polling performance over more than one election.
6. The evaluation of the performance of the polls for the 1992 British general election took on the same sense of urgency as the SSRC evaluation of the 1948 U.S. polls. The four final British polls showed Labour with a small lead, which suggested a coalition government. The Conservatives actually won by a small but comfortable margin. Butler and Kavanagh (1992) discuss these polls in chap. 7, "The Waterloo of the Polls."

Comparison of the Methods

Nine final preelection presidential polls from 1996 were evaluated using four of the eight methods. They include the eight polls listed in the table in the Wall Street Journal excerpt of Ladd's critique, plus the Politics Now/ICR poll. The nine polls were conducted by ABC News, CBS News/New York Times, Gallup/CNN/USA Today, Harris Poll, Hotline/Battleground, NBC/Wall Street Journal, Politics Now/ICR, Princeton Survey Research/Pew Research Center, and Zogby Group/Reuters. Three of them (Harris, Princeton Survey Research, and Zogby) reported vote for "other" candidates in addition to Clinton, Dole, and Perot. Four polls (ABC, CBS, Hotline, and NBC) reported undecided voters in the base of their percentages. The final poll results, as reported by the polling organizations, are presented in table 1.

Table 1. Final 1996 Preelection Presidential Polls

                                                  Clinton     Dole          Perot
                                                  (Democrat)  (Republican)  (Reform)  Other  Undecided
Election(a)                                       49.3        40.7          8.4       1.6
ABC News                                          51          39            7                3
CBS News/New York Times                           53          35            9                3
Gallup                                            52          41            7
Harris                                            51          39            9         1
Hotline/Battleground                              45          36            8                11
International Communications Research/
  Politics Now                                    51          38            11
NBC News/Wall Street Journal                      49          37            9                5
Princeton Survey Research/Pew Research Center     52          38            9         1
Zogby/Reuters                                     49          41            8         2

NOTE: Data are percentages.
(a) Data are from Scammon and McGillivray 1997.

The accuracy of methods 1, 2, 3, and 5 was evaluated. Method 1 was used by the SSRC in 1948; method 2 is similar, but only deals with the major party candidates and also was used in the SSRC report; method 3 was used by NCPP and the British in their evaluation of election polls; method 5 deals with the statistic most often reported in the media, the margin between the leading candidates. The evaluation was done with and without allocation of the undecided voters.

A few rules were observed in the calculations: the result of the presidential election was stated to within one decimal place. All poll numbers were whole percentages with no decimals, thereby maintaining the same number of significant digits as were published by the pollsters.
Undecided voters were allocated in proportion to the vote for Clinton, Dole, and Perot, which was the least controversial allocation method and the one used by NCPP, the British evaluations, and Crespi (1988; vote for "others" in a poll was eliminated before allocation). Poll percentages after allocation were rounded to whole percentages before other calculations. A rank of 1 was assigned to the poll with the smallest error for a given method. If more than one poll had the same error they were each given the same rank.

A comparison of the effect on the rankings for a given method can be seen in table 2. Allocation of the undecided voters changed the method 1 ranking by more than one place for four of the nine polls. Method 2 always includes allocation, and therefore the results are identical, excluding it from this comparison. The resulting ranking using method 3 is changed slightly by allocation, except for one poll. Method 5 shows the least variability in the rankings when the undecided are allocated. Only one of the nine polls shifts its ranking by two or more places.⁷

A comparison of the resulting rankings for methods 1, 3, and 5 when there is no allocation shows considerable variability. Each method produces a different rank order. A comparison of all four methods when the undecided voters are allocated is much less variable, as one would expect. Method 1 varies slightly more than the other three methods, which is consistent with Crespi's finding.⁸ Methods 2 and 5, as noted in the discussion of methods above, produce identical rankings after allocation.

Discussion

An inspection of the rankings in table 2 produced by the different methods shows that they are more consistent when the undecided are allocated. However, it is still open to question whether the evaluator should allocate the undecided if the pollster did not do it.
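The calculation rules described in the comparison section (proportional allocation among Clinton, Dole, and Perot, rounding to whole percentages, and ranking with shared ranks for ties) can be sketched in Python as follows. The sketch covers only the four polls in table 1 that reported undecided voters, and it uses the error on the margin (method 5) with the election result stated to one decimal place. The helper names are hypothetical, and the handling of ties (tied polls share a rank and the next rank is skipped) is an assumption, since the text does not say which tie convention was used.

```python
# Election result to one decimal place, as in table 1.
ELECTION = {"Clinton": 49.3, "Dole": 40.7, "Perot": 8.4}

# Final figures from table 1 for the polls that kept undecided voters in the base.
POLLS = {
    "ABC News": {"Clinton": 51, "Dole": 39, "Perot": 7, "Undecided": 3},
    "CBS News/New York Times": {"Clinton": 53, "Dole": 35, "Perot": 9, "Undecided": 3},
    "Hotline/Battleground": {"Clinton": 45, "Dole": 36, "Perot": 8, "Undecided": 11},
    "NBC News/Wall Street Journal": {"Clinton": 49, "Dole": 37, "Perot": 9, "Undecided": 5},
}


def allocate(poll):
    """Repercentage the three candidates so they absorb the undecided
    proportionally, then round to whole percentages."""
    candidates = {k: v for k, v in poll.items() if k in ELECTION}
    decided = sum(candidates.values())
    return {k: round(100 * v / decided) for k, v in candidates.items()}


def margin_error(poll):
    """Method 5: poll margin (Clinton minus Dole) minus the election margin."""
    poll_margin = poll["Clinton"] - poll["Dole"]
    actual_margin = ELECTION["Clinton"] - ELECTION["Dole"]
    return round(poll_margin - actual_margin, 1)


def rank_with_ties(errors):
    """Rank 1 = smallest absolute error; tied polls share a rank (assumed rule)."""
    sizes = {name: abs(e) for name, e in errors.items()}
    return {
        name: 1 + sum(other < size for other in sizes.values())
        for name, size in sizes.items()
    }


errors_before = {name: margin_error(p) for name, p in POLLS.items()}
errors_after = {name: margin_error(allocate(p)) for name, p in POLLS.items()}
ranks_after = rank_with_ties(errors_after)

for name in POLLS:
    print(name, allocate(POLLS[name]), errors_before[name], errors_after[name], ranks_after[name])
```

Because only four of the nine polls are included, the ranks printed here are not the ranks reported in table 2; the sketch is meant to show the mechanics of the rules, not to reproduce the published rankings.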
If the goal of these preelection polls is to inform the public about the expected outcome of an election, then it seems that the responsibility for allocation should rest with the pollster. The public and the journalists have neither the information necessary for a sophisticated allocation nor the technical knowledge. If there is some goal for these polls other than forecasting the election, then the eight methods described above for evaluating the polls are not sufficient. A more appropriate approach would be to assess the conformity of the polling methods to good statistical and polling theory, which are the criteria to which all other polls are subjected. Assuming the goal is forecasting, it seems reasonable, when evaluating a poll, to take the numbers as reported by a pollster, without modifying them.

7. An examination of the correlations among the rankings shows high consistency among the methods when the undecided are allocated. These correlations are all high, ranging from .72 to .94, except for the correlation between methods 2 and 5, which is 1.0. When the undecided are not allocated, the correlations are more variable and they are lower. They range from .10 to .75, with one exception. Again, the correlation between methods 2 and 5 is high; it is .94. I thank Dan M. Merkle for these computations.
8. The coefficients of variation for methods 1, 2, 3, and 5, respectively, are .70, .58, .53, and .64 for the 1996 final presidential election polls, when the undecided are allocated. Crespi's (1988) study had coefficients for methods 1 and 3, respectively, of .83 and .76.

[Table 2: rankings of the nine 1996 polls under each method, with and without allocation of the undecided. Table body not recoverable.]

This conclusion was based on four points. First, five of the nine national polls did their own allocation of the undecided. Second, when evaluating the polls that did not allocate, there is no agreed upon method of allocation for an evaluator to use. Third, if there is a preferred method of allocation, the pollster did not choose to use it. Fourth, it seems reasonable to assume that pollsters report their "best" estimates of an election outcome to the public.

There is an effect on the measurement of accuracy of the polls when the undecided are allocated. The correlation of the errors for the nine polls using method 1 with and without allocation is only .35.⁹ For method 3 it is .60. Only method 5 maintains a high correlation of .94. Should the evaluator do something that will change the assessment of accuracy that the pollster was unwilling or unable to do for him- or herself?

As to which method should be used to evaluate polling accuracy, that remains the choice of the evaluator. The arguments for and against each method are listed above and summarized here. If the goal is to forecast which candidate will win and by how much, then method 1 does not adequately evaluate an election outcome unless there are only two candidates. Limiting the analysis to one candidate, as method 1 does, gives no idea about the accuracy of the forecast; methods 2 and 3 evaluate the forecast of the winner indirectly and method 5 does it directly. Method 2 implicitly introduces proportional allocation, and therefore it too seems less preferable. The best choice appears to be between methods 3 and 5.
The chief argument made by proponents of method 3 is that it represents all "significant" candidates but leaves open how to define "significant." Its opponents say method 3 artificially reduces the overall error when a third candidate is introduced, thereby making comparisons with two-candidate elections not meaningful. It should be noted, for example, that the introduction of Perot into the evaluation of the performance of the 1996 preelection polls reduced the measured polling error of method 3; the error on Perot's share of the vote is less than the overall per candidate error. Proponents of method 3 say it evaluates all candidates, which is true. It does not, however, provide a consistent method for evaluating a poll's forecast of the winning candidate in an election.

The choice then seems clearer. If one wants to compare elections over time it is necessary to use a method that is comparable for both two-candidate and multicandidate elections. Only method 5 meets that test.

9. These are Spearman correlations, computed on the rank order of the polling errors. The correlations are very similar to the Pearson correlations using the errors themselves.

Was 1996 a Worse Year than 1948?

Having reviewed the methods for judging poll accuracy suggested by the SSRC committee 50 years ago, I can now offer a reasoned judgment on the accuracy of the 1996 polls and assess the validity of Ladd's complaint, that the 1948 polls look better by comparison.

Table 3. Method 5 Errors in Presidential Polls, 1948 and 1996

                                                       Truman-Dewey   Method 5 Error
                                                       (%)            (%)
1948:
  Election                                              4
  Gallup                                               -5             -9
  Crossley                                             -5             -9
  Roper                                                -15            -19

                                                       Clinton-Dole   Method 5 Error
                                                       (%)            (%)
1996:
  Election                                              9
  Reuters/Zogby                                         8             -1
  Hotline/Battleground                                  9              0
  Gallup                                                11             2
  ABC News                                              12             3
  Harris                                                12             3
  NBC News/Wall Street Journal                          12             3
  International Communications Research/Politics Now   13             4
  Princeton Survey Research/Pew Center                  14             5
  CBS News/New York Times                               18             9
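The method 5 column of table 3 follows directly from the margins in the other column: each poll's margin minus the election margin. A minimal Python sketch of that arithmetic is given below, using the whole-point margins as they appear in table 3. The dictionary and function names are illustrative only.

```python
# Margins between the top two candidates, in whole percentage points (table 3).
# 1948 margins are Truman minus Dewey; 1996 margins are Clinton minus Dole.
ELECTION_MARGIN = {1948: 4, 1996: 9}

POLL_MARGINS = {
    1948: {"Gallup": -5, "Crossley": -5, "Roper": -15},
    1996: {
        "Reuters/Zogby": 8,
        "Hotline/Battleground": 9,
        "Gallup": 11,
        "ABC News": 12,
        "Harris": 12,
        "NBC News/Wall Street Journal": 12,
        "International Communications Research/Politics Now": 13,
        "Princeton Survey Research/Pew Center": 14,
        "CBS News/New York Times": 18,
    },
}


def method5_errors(year):
    """Error on the margin: poll margin minus election margin, for each poll."""
    return {
        poll: margin - ELECTION_MARGIN[year]
        for poll, margin in POLL_MARGINS[year].items()
    }


errors = {year: method5_errors(year) for year in POLL_MARGINS}

best_1948 = min(abs(e) for e in errors[1948].values())
worst_1996 = max(abs(e) for e in errors[1996].values())

print(errors[1948])            # {'Gallup': -9, 'Crossley': -9, 'Roper': -19}
print(best_1948, worst_1996)   # 9 9
```

On these figures the largest 1996 error on the margin equals the smallest 1948 error, at 9 points each.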
In 1948, George Gallup and Archibald Crossley had polls that were closer to the Truman-Dewey election outcome than the poll conducted by Elmo Roper. (See table 3.) Gallup and Crossley had Dewey winning by a 5 percentage point margin in a race Truman won by 4 points. This resulted in a 9-point error on the difference. Roper was farther off the mark. He had an error on the margin of 19 points. In 1996, eight of the polls had errors ranging between 0 and 5 points on the margin. The ninth, the CBS News/New York Times poll, had a 9-point error. The one poll with the largest error in 1996 was as far off as the best result in 1948. By this measure, the polls of 1996 were clearly better than the polls of 1948.

Polling closer to election day may have helped the polls of 1996. Gallup's 1948 national poll was concluded closer to the election than either Crossley's or Roper's. Gallup stopped interviewing October 28, Crossley finished October 18, and Roper's poll was finished early in September. The 1948 election was November 2. In 1996, the eight national polls cited by Ladd completed interviewing during the closing days of the campaign. The earliest to stop was Hotline/Battleground on October 31, the Thursday before the election. Gallup polled through the night before the November 5 election, while most polls stopped on November 3.

National Council on Public Polls

The NCPP (1997), in an attempt to counter Ladd's criticism of the 1996 polls, published its evaluation of 47 final preelection presidential polls conducted between 1936 and 1996. The early polls (1936-60) in the NCPP analysis were only from Gallup. Harris polls were included from 1964 on; starting in the 1970s all other major national polls were included. For 1996, NCPP included eight polls cited by Ladd plus the Politics Now/ICR poll.
In its press release, NCPP (1997) "refutes criticisms of the accuracy of 1996 national presidential polls" (p. 1). It claims the average error "was low relative to historical experience" (p. 1) and within expected sampling error margins. Sheldon Gawiser, president of NCPP, said, "1996 should be remembered as one of the better years for the national polls" (p. 2). The NCPP concluded, "The average error in 1996 was only 1.7 percentage points. This compares to 2.5% between 1936 and 1996 and 1.9% since 1956. Eight of the nine [1996] polls had errors within the ±3% margin of error expected for samples of their size" (p. 3).

In its analysis, NCPP calculated polling error as "the average [absolute] deviation between the final poll results and the election results for the top two or three candidates" (p. 2). The third candidate was included in five of the 11 elections since 1956. The vote shares of the third candidates included in this analysis ranged from a low of 0.9 percent (McCarthy in 1976) to a high of 18.9 percent (Perot in 1992). There was one other wrinkle. Some polls allocated the undecided vote and others did not. In an effort to make all polls equal, NCPP allocated the undecided vote among the top two or three candidates in proportion to their estimated vote in the poll.

There was a debate within NCPP over which error concept to use. The members accepted method 3 (the average deviation between the poll and the election for each candidate) and rejected method 5 (the error on the margin) because it resulted in an average candidate error that was more than twice as large as the one used in their analysis.

Table 4 shows the average errors for each presidential election year between 1956 and 1996. The errors were computed using method 3, as NCPP did in its analysis, and method 5.

Table 4. Average Errors in Presidential Polls, 1948 and 1956-96

                                               Method 3(a)          Method 5
                    Number     Number of     Average              Average
Year                of Polls   Candidates    Error (%)   Rank     Error (%)   Rank
1996                   9           3            1.7        5         3.6        8
1992                   6           3            2.2        8         2.7        5
1988                   5           2            1.5        3         2.8        6
1984                   6           2            2.4        9         4.4        9
1980                   4           3            3.0       11         6.1       11
1976                   3           3            1.5        3         2.0        2
1972                   3           2            2.0        7         2.6        4
1968                   2           3            1.3        2         2.5        3
1964                   2           2            2.7       10         5.3       10
1960                   1           2            1.0        1         1.9        1
1956                   1           2            1.8        6         3.5        7
Yearly average,
  1956-96                                       1.9                  3.4
1948                   3           3            4.9       12        12.9       12

SOURCE. 1996 from NCPP (1997) and publication; 1956-92 from NCPP; 1948 from Mosteller et al. 1949.
(a) Method 3 was used by NCPP in its analysis of the polls.

A few conclusions can be drawn from these data. The 1948 preelection polls stand out as the poorest performance of any preelection polls. Ladd's comparison of the 1996 polling performance to 1948 was without merit. The average error in the 1996 polls by either error measurement is much less than for 1948. Also, the error for each of the 1996 polls (except the CBS/New York Times poll) was less than their presumed sampling error.¹⁰ The error on each of the 1948 polls exceeds what might have been the sampling error if those polls had been probability-based polls. To say that the polls of 1996 had estimates that "diverged sharply," as Ladd said, is wrong. They diverged modestly, and all but two overstated Clinton's lead over Dole. Only the CBS/New York Times poll had an error approaching Gallup's 1948 error, and, unlike CBS and the New York Times, Gallup had the wrong winner.

However, the NCPP conclusion that 1996 was a banner year for the polls also does not stand up to scrutiny. By its own error measurement (method 3), the average of the polls' errors in 1996 was the fifth best of the 11 presidential elections since 1956. By the author's preference for a measure of polling error (method 5), 1996 is eighth, somewhat worse than the NCPP's result. In either case, the 1996 performance is not "one of the smaller errors recorded," as NCPP President Sheldon Gawiser said in his organization's (1997) press release on polling accuracy.

Ladd also claimed that "election polls have frequently over-estimated the Democrats' share of the vote." The NCPP disagreed. It said, "Since 1956 errors favored the Democratic candidate in six elections and the Republicans in five. The size of the errors was almost equal."

Rather than examine the average error in each election year, as NCPP did, I examined the direction of the error in each final preelection poll since 1956. Averages can mask a bias, whereas individual poll results give a more direct picture. For this analysis, if a poll's margin between the two leading candidates varied by less than one percentage point from the election result, I said the poll favored neither party. (See table 5.)

Table 5. Party "Bias" in Preelection Polls, 1956-96

          Number of Polls Favoring a Party
Year      Democrats   Republicans   Neither
1996          7            0           2
1992          5            0           1
1988          1            4           0
1984          2            2           2
1980          4            0           0
1976          1            2           0
1972          1            2           0
1968          1            0           1
1964          2            0           0
1960          1            0           0
1956          0            1           0

The evidence shows that Ladd was correct about the direction of the polls' errors. More than twice as many polls overstated the Democratic candidate's share of the vote as overstated the Republican's share. Furthermore, the 25 Democratic-leaning polls had an average error on the margin of victory (method 5) of 4.4 percentage points, while the 11 Republican-leaning polls' error was only 3.3 percentage points. The Democratic and Republican errors differ significantly and do not support the NCPP position that the parties' errors were about equal. Whether these differences are large enough to support Ladd's suggestion that the overreporting of Clinton's victory margin had a bearing on either participation or the outcome of the election is problematic. One would have to accept the bandwagon theory over the underdog theory in order to accept his notion, a discussion this article will not pursue.

10. The sampling error on the margin between the candidates is slightly less than twice the standard error on a single candidate. Polls that claim a "margin of error" of 3 percent (2 × standard error on one candidate) would likely have a margin of error on the difference of between 5 and 6 percent.

1996 State Polls¹¹

Ladd's (1996b) criticism of the 1996 polls included, by implication, the state polls as well as the national polls when he said, "Election polling had a terrible year in 1996." While he and his critics focused more attention on the national polls, the state polls are more numerous and appear more regularly in local news reports. The state polls, with very few exceptions, were done by different pollsters than the national polls. More than half of the state polls were done by Mason-Dixon, a firm that services news organizations nationwide. Mason-Dixon's performance was better, collectively, than that of the firms that did the other state polls. Of the 55 presidential state polls reported in the final Hotline (1996), 62 percent were within 3 points of the actual margin of victory. (See table 6.) The 49 state polls on senatorial races were not as good. Just under half were within 3 points of the final margin.

Table 6. State Polls, 1996

Error on Margin between     Number of             Number of
Leading Candidates (%)      Presidential Races    Senate Races
10%+                             4  (2)               12  (6)
7-9                             16  (9)               14  (7)
4-6                             18 (10)               27 (13)
1-3                             47 (26)               45 (22)
<1                              15  (8)                2  (1)
Total                          100 (55)              100 (49)

NOTE. Data are percentages; Ns are given in parentheses.

11. All poll results for this analysis come from Hotline (1996).

There were three instances where the presidential state polls had the wrong winner as well as three errors in Senate polls.
The incorrect presidential polls were all close races. Only one of the three Senate poll errors was close. The biggest error was by the Detroit News in the Michigan Senate race. They were off by 17 points on the margin. There were two others that were 14 points off, the Omaha World-Herald in the Nebraska Senate race and the Greensburg (Pa.) Tribune-Review in the Pennsylvania presidential race.

Ladd could hardly criticize the state polls for favoring the Democrats over the Republicans. The state polls were much more evenhanded than the national polls in the direction of their errors. Nineteen polls erred in favor of the Democrat, Clinton, while 18 favored the Republican, Dole. The other 18 presidential polls were within 1 percentage point of the election.

The Exit Polls

Ladd did not spare the exit polls from his criticism. He said the networks offered premature reports on election night when their exit polling consortium, Voter News Service (VNS), incorrectly projected a Democratic win in New Hampshire. An "especially egregious error," he called it. He also compared the performance of the exit polls in recent years to the 1948 preelection polls. He said they, too, had frequently missed the mark by larger margins than Gallup's error in 1948.

The networks used exit polls on election night for two purposes: projections and analysis. Exit-poll-based projections took place at poll closing time in contests that appeared to be clear-cut victories for a candidate. These projections have never cited exit poll estimates of the candidates' percentages. A VNS or a network analyst just named the winning candidate, which was then broadcast after the polls closed. The networks have never reported a margin of victory based on exit polls. Later on election night they did report estimates based on samples of actual vote returns, and these have almost always been within a few points of the final result.
The analytical information used in cross-tabulation was weighted to the estimates produced from samples that used actual vote returns. As someone who did have access to the exit poll estimates, which were not publicly available, I can report that the errors in the 1996 exit polls did not exceed Gallup's 9-point error in 1948.

Since the networks formed their exit poll pool in 1990 they have covered about 500 races. Over half were projected from exit poll results. The only incorrect projection by the pool was in the New Hampshire Senate race in 1996. This lone error was corrected by the networks on-air two and a half hours after the mistake was made.

Conclusion

Ladd's main point and most testable claim about the accuracy of the national polls holds no water: by any of the measures reviewed here, 1996 was not the best but was far from the worst year for the polls. The data also do not support a condemnation of 1996 state polls, nor do they support Ladd's claims about the accuracy of exit polls. The only charge that receives any support is Ladd's claim about overestimation of the Democratic vote share in polls. One measure, calculated for this article, bolsters this charge, while an NCPP analysis calls it into question. Overall, a modicum of scrutiny reveals that Ladd's impressionistic scorecard for the polls is seriously in error.

One can only speculate as to why Ladd chose to make demonstrably erroneous and unsupported claims. The professional polling community was understandably outraged in its response to Ladd's very public pronouncements. If he really meant to improve polling practice, one cannot imagine a less effective means of achieving the goal. Certainly, in view of his less-than-rigorous analysis, Ladd's call for a "blue-ribbon" commission to investigate poll performance cannot be taken seriously.

Issues facing the polling profession require dispassionate and careful study if appropriate improvements are to be found.
In the first instance, we should be clear about how the quality of our work is to be measured. This article has examined a variety of rules and discussed their merits and drawbacks. The analysis provides a foundation for judging the quality of poll results. In 1998 and beyond, this framework may help to assess polling progress and problems. In the meanwhile, we can answer the question, Was 1996 a worse year for polls than 1948? No, it was much better.

References

Butler, David, and Dennis Kavanagh. 1992. The British General Election of 1992. London: Macmillan.
Congressional Quarterly. 1985. Guide to U.S. Elections. Washington, DC: Congressional Quarterly.
Crespi, Irving. 1988. Pre-election Polling. New York: Russell Sage Foundation.
Crewe, Ivor. 1997. "The Opinion Polls: Confidence Restored?" Parliamentary Affairs 4:569-85.
DiVall, Linda A. 1996. "Keys to Expanding GOP Majorities." Polling Report, December 9, p. 1.
Eagleton Institute of Politics. 1997. "Summary of the 1997 Gubernatorial Election Results Comparison with Pre-election Polls." Press release. Eagleton Institute of Politics, Rutgers University, New Brunswick, NJ.
Hassell, James. 1997. "Latest Poll Shows Gains for Whitman." Newark Star-Ledger, November 2.
Hotline. 1996. "#1" and "#11." Hotline, November 4.
Ladd, Everett C. 1996a. "The Election Polls: An American Waterloo." Chronicle of Higher Education, November 22, p. A52.
Ladd, Everett C. 1996b. "The Pollsters' Waterloo." Wall Street Journal, November 19, p. A22.
Mosteller, Frederick, Herbert Hyman, Philip J. McCarthy, Eli S. Marks, and David B. Truman. 1949. The Pre-election Polls of 1948. New York: Social Science Research Council.
National Council on Public Polls (NCPP). 1997. "Polling Council Analysis Concludes Criticisms of 1996 Presidential Poll Accuracy Are Unfounded." Press release, February 13. National Council on Public Polls, Fairfield, CT.
Newport, Frank. 1997. "Controversies in Pre-election Polling." Paper presented at the annual meeting of the American Association for Public Opinion Research, Norfolk, VA.
Scammon, Richard M., and Alice V. McGillivray. 1997. America Votes 22. Washington, DC: Congressional Quarterly.
Taylor, Humphrey. 1997. "Why Most Polls Overestimated Clinton's Margin." Public Perspective 8 (February/March): 45-48.
Zogby International. 1997. "Zogby Polls NJ and VA Right Again!" Press release. Zogby International, Utica, NY.