Review: Was 1996 a Worse Year for Polls Than 1948?

American Association for Public Opinion Research
Review: Was 1996 a Worse Year for Polls Than 1948?
Author(s): Warren J. Mitofsky
Source: The Public Opinion Quarterly, Vol. 62, No. 2 (Summer, 1998), pp. 230-249
Published by: Oxford University Press on behalf of the American Association for Public Opinion Research
Stable URL: http://www.jstor.org/stable/2749624 .
THE POLLS-REVIEW
WAS 1996 A WORSE YEAR FOR POLLS
THAN 1948?
WARREN J. MITOFSKY
After the ballots were counted in the 1996 presidential election, there were
no pictures of a victorious Bob Dole, gleefully holding a newspaper with
the erroneous headline, "Clinton Defeats Dole." We saw no postelection
speeches in which President-elect Dole ripped into the liberal media polls
that had declared him prematurely dead. House Republicans held no hearings
to get to the bottom of the polls' failure. There were no press conferences
in which pollsters expressed regret and confusion about what could
have led them to think that Bill Clinton would win. And a humbled Clinton
did not opine that the lead he had in preelection polls must have lulled
his supporters to sleep on election day.
We saw none of these things because, in concert with the estimates
from all of the media preelection polls, Clinton won the 1996 presidential
election. Unlike the 1992 British preelection polls, almost all of which
erroneously foretold a Labour victory, or the 1990 Nicaraguan polls, many
of which picked Daniel Ortega to defeat Violeta Chamorro, or the 1980
U.S. polls that predicted an uncertain Reagan lead and not a landslide, or
the infamous 1948 polls that predicted a Dewey victory over Truman, the
1996 polls were unanimously correct in predicting that Clinton would win
by a safe margin. Unlike those storied exemplars of polling frailty, earlier
occasions for embarrassment and consternation, polling in the 1996 election
could be judged a clear-cut success.
It came as a considerable shock, then, when Everett Carll Ladd, Jr., director
of the Roper Center at the University of Connecticut and a prominent figure
in the field of public opinion research, declared in an article
published in the Chronicle of Higher Education (1996a), and excerpted
in the Wall Street Journal (1996b), that "election polling had a terrible
year in 1996. Indeed, its overall performance was so flawed that the entire
enterprise should be reviewed by a blue-ribbon panel of experts." A flurry
of "copycat" negative media coverage followed Ladd's extraordinary
statement, with articles appearing in U.S. News and World Report, the
New York Times Sunday Magazine, and a dozen or more periodicals and
radio and television broadcasts.
WARREN J. MITOFSKY is president of Mitofsky International.
Public Opinion Quarterly Volume 62:230-249 © 1998 by the American Association for Public Opinion Research
All rights reserved. 0033-362X/98/6202-0006$02.50
Ladd's attack on the polls was a miscellaneous collection of charges.
Most prominently, while acknowledging that Clinton did actually win the
1996 election, he said that the polls had overestimated the president's
margin of victory. He remarked that, by comparison, the 1948 polls
seemed "closer to a triumph than a disaster." He said that Clinton's large
lead in the polls throughout the campaign was illusory, but that its publication
had dampened interest in the election and turnout on election day.
He decried the number of preelection polls, whose findings "bombarded"
the electorate throughout the campaign. He assailed not only preelection
but also exit polls, charging that they had overestimated the Democratic
share of the vote "in recent years." He opined that Republicans and conservatives
were less likely to agree to respond to polls (perceiving them
to be tools of the liberal media), leading to nonresponse bias in preelection
poll estimates.
Ladd's wide-ranging critique contained both testable charges and speculative
complaints. Characteristic of the latter criticisms was his claim
that the polls had overestimated Clinton's lead during the campaign and
had thereby dampened interest in the election. His conjecture that the polls
had suffered from nonresponse bias due to noncooperation by conservatives
was similarly unverifiable. The impact, if any, of poll "bombardment"
on prospective voters would have been difficult to assess during
the campaign and was impossible to gauge retrospectively. To respond
to any of these assertions, one would have to match speculation with surmise.
In addition, Ladd's testable claims themselves were poorly specified.
His complaints about inaccuracy and Democratic bias in exit polls referred
to problems in "recent years" and offered no data. And the marquee
attack in Ladd's offensive, that 1996 poll accuracy was worse than in
1948 or any other year, was simply unsupported. His Chronicle of
Higher Education article contained no 1996 data whatsoever. The excerpt
of that article in the Wall Street Journal featured a table, titled "Polls
Away from Reality," listing final 1996 poll projections for eight polling
organizations and the share of the vote for Clinton, Dole, and Perot.
Ladd's "analysis" consisted solely of two statements, that "most of the
leading national polling organizations made pre-election estimates that
diverged sharply from the actual vote on November 5," and that "of late,
both pre-election surveys and exit polls on Election Day frequently have
missed the mark by margins well in excess of the Gallup results in 1948."
He presented no criteria for assessing "divergence" nor any measures to
support this claim.
Because even his highlighted charges were imprecise and undocumented,
a reasoned assessment of Ladd's attack requires some reformulation
and refinement to enable gathering appropriate evidence. The responsible
analyst, in other words, has to do the difficult work that Ladd had
spared himself in leveling his charges and then make the effort to judge
the case once the terms are clear. The prospect of performing such double
duty is never inviting. But because Ladd's unexpected broadside received
much national media attention, the National Council on Public Polls
(NCPP 1997) felt compelled to defend 1996 poll performance. Considering
Ladd's leading charge, that final preelection polls had badly overestimated
the extent of Clinton's victory and were farther off the mark than
in 1948, the council had to decide on a way to measure poll performance
and then do the computations to provide a historical comparison.
The council's work was made more difficult by the fact that, after
more than 50 years of election polling, no standard metric for gauging
poll accuracy had been adopted by the polling community. When challenged
by Ladd, the NCPP had to agree anew on a method of assessment.
The absence of a generally accepted standard provides an open field for
charges and countercharges based on hyperbole and blurred distinctions.
But it is clearly an undesirable state of affairs for scholars who wish to
study the question of poll accuracy with fairness and precision. My analysis,
harking back to the Social Science Research Council (SSRC) report
on the polls of 1948, examines the question of how poll accuracy should
be measured, weighing the pluses and minuses of various approaches. I
then revisit the Ladd-NCPP exchange and offer an answer to the question,
Was 1996 a worse year for the polls than 1948?
Measures of Poll Accuracy
The SSRC study (Mosteller et al. 1949) of the 1948 preelection polls
considered eight methods for measuring polling error. It noted that each
method has "advantages and disadvantages." The SSRC committee's
definitions are as follows.
Methods for Defining Election Polling Error
1. The difference in percentage points between the leading candidate's
   share of the total vote from a poll and from the actual vote.
2. The difference in percentage points between the leading candidate's
   share of the major party vote from a poll and from the actual vote.
   (Major parties are Democratic and Republican and are assumed to be
   the top two vote getters.)
3. The average (without regard to sign) of the percentage point deviation
   for each candidate between his/her estimate and the actual vote.
4. The average difference (without regard to sign) between a ratio
   for each candidate and the number one, where the ratio is defined
   as a candidate's estimate from a poll divided by the candidate's
   actual vote.
5. The difference between two differences, where the first difference
   is the estimate of the vote for the two leading candidates from a
   poll and the second difference is the election result for the same
   two candidates.
6. The maximum difference in percentage points between a party
   and the actual vote.
7. The chi-square to test the congruence of the estimated and actual
   vote distributions.
8. The difference between the predicted and actual electoral vote.
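The four definitions examined most closely in this review (methods 1, 2, 3, and 5) reduce to a few lines of arithmetic. The following is a minimal Python sketch rather than the SSRC committee's own formulation: the function names, the dictionary layout, and the signed convention (poll minus actual) are assumptions made for illustration. The sample figures are the Gallup and election rows of table 1.

```python
# Sketch of SSRC methods 1, 2, 3, and 5.
# Function names and the sign convention (poll minus actual) are
# illustrative assumptions, not taken from the SSRC report.

def method1(poll, actual, leader):
    """Error on the leading candidate's share of the total vote."""
    return poll[leader] - actual[leader]

def two_party_share(shares, leader, runner_up):
    """Leader's share after repercentaging the top two candidates to 100."""
    return 100.0 * shares[leader] / (shares[leader] + shares[runner_up])

def method2(poll, actual, leader, runner_up):
    """Error on the leading candidate's share of the major party vote."""
    return (two_party_share(poll, leader, runner_up)
            - two_party_share(actual, leader, runner_up))

def method3(poll, actual, candidates):
    """Average absolute percentage point deviation per candidate."""
    return sum(abs(poll[c] - actual[c]) for c in candidates) / len(candidates)

def method5(poll, actual, leader, runner_up):
    """Difference between the poll margin and the actual margin."""
    return (poll[leader] - poll[runner_up]) - (actual[leader] - actual[runner_up])

# Election result and final Gallup poll, as listed in table 1.
actual = {"Clinton": 49.3, "Dole": 40.7, "Perot": 8.4}
gallup = {"Clinton": 52, "Dole": 41, "Perot": 7}

errors = {
    "method1": method1(gallup, actual, "Clinton"),
    "method2": method2(gallup, actual, "Clinton", "Dole"),
    "method3": method3(gallup, actual, ["Clinton", "Dole", "Perot"]),
    "method5": method5(gallup, actual, "Clinton", "Dole"),
}
for name, value in errors.items():
    print(f"{name}: {value:+.2f}")
```

Because the election margin is carried here to one decimal place, the method 5 figure comes out slightly different from a calculation made on whole-number margins.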
A significant problem for today's evaluation of polling accuracy, and not
addressed by the SSRC because the problem did not arise in 1948, is how
to handle the "undecided" vote in the polls. There is no undecided number
in an election; it only exists in some, but not all, polls. If the undecided
respondents are not allocated to a candidate, then the average error computed
using methods 1, 3, 4, 6, and 7 will be exaggerated. For example,
suppose that an election that was 55 percent Democratic and 45 percent
Republican had a poll that showed 50 percent for the Democrat, 40 percent
for the Republican and 10 percent undecided. If there was no allocation
of the undecided, methods 1 and 3 would have errors of 5 percentage
points. If the undecided were allocated proportionally and the error computed,
methods 1 and 3 would only show an error of 1 percentage point
(the allocated shares round to 56 and 44 percent).
Furthermore, the error computed using any of methods 1, 3, 4, 6, and 7
will not be comparable for polls that report "undecided" and those that
do their own allocation.
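The arithmetic of this example can be checked with a short Python sketch. The helper name `allocate_proportionally` and the rounding of allocated shares to whole percentages are assumptions made here; the rounding is what turns a 0.6-point difference into the 1-point error quoted above.

```python
# Worked version of the 55/45 election with a 50/40/10 poll.
# Helper names and whole-number rounding are illustrative assumptions.

def allocate_proportionally(poll, undecided_key="Undecided"):
    """Drop the undecided and repercentage the candidates to 100."""
    decided = {k: v for k, v in poll.items() if k != undecided_key}
    base = sum(decided.values())
    return {k: round(100.0 * v / base) for k, v in decided.items()}

def method1(poll, actual, leader):
    """Absolute error on the leading candidate's share."""
    return abs(poll[leader] - actual[leader])

def method3(poll, actual):
    """Average absolute error over the candidates in the election."""
    return sum(abs(poll[c] - actual[c]) for c in actual) / len(actual)

actual = {"Democrat": 55, "Republican": 45}
poll = {"Democrat": 50, "Republican": 40, "Undecided": 10}

raw_errors = (method1(poll, actual, "Democrat"), method3(poll, actual))

allocated = allocate_proportionally(poll)
allocated_errors = (method1(allocated, actual, "Democrat"),
                    method3(allocated, actual))

print("unallocated:", raw_errors)
print("allocated:  ", allocated, allocated_errors)
```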
The alternatives for evaluating polls that include an undecided category
are as follows: (1) allocate the undecided in proportion to the votes for
candidates in a poll, (2) allocate the undecided evenly between the two
major parties, (3) allocate all the undecided to the challenger, if there is
an incumbent, or (4) use one of the methods that does not require that
the undecided be allocated. A more complex allocation cannot be accomplished
by an evaluator.
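To see how much the choice among the first three alternatives can matter, the sketch below applies each of them to the NBC News/Wall Street Journal row of table 1 (Clinton 49, Dole 37, Perot 9, undecided 5) and reports the resulting Clinton-Dole margin. The function names are hypothetical, and the shares are left unrounded so that the differences between the rules stay visible; Dole is treated as the challenger because Clinton was the incumbent.

```python
# Three ways an evaluator might allocate the undecided (illustrative sketch).

def proportional(poll, undecided):
    """Give each candidate a slice of the undecided equal to their poll share."""
    base = sum(poll.values())
    return {c: v + undecided * v / base for c, v in poll.items()}

def even_split(poll, undecided, major_parties):
    """Split the undecided equally between the two major party candidates."""
    out = dict(poll)
    for c in major_parties:
        out[c] += undecided / len(major_parties)
    return out

def all_to_challenger(poll, undecided, challenger):
    """Assign every undecided respondent to the challenger."""
    out = dict(poll)
    out[challenger] += undecided
    return out

def margin(shares):
    return shares["Clinton"] - shares["Dole"]

nbc = {"Clinton": 49, "Dole": 37, "Perot": 9}
undecided = 5

margins = {
    "none": margin(nbc),
    "proportional": margin(proportional(nbc, undecided)),
    "even": margin(even_split(nbc, undecided, ["Clinton", "Dole"])),
    "challenger": margin(all_to_challenger(nbc, undecided, "Dole")),
}
for rule, m in margins.items():
    print(f"{rule:>12}: {m:.2f}")
```

An even split leaves the margin untouched, proportional allocation widens it slightly, and giving everything to the challenger narrows it by the full undecided share.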
Crespi (1988, p. 22) claims that the pollsters he interviewed for his
book thought that proportional allocation of the undecided was "closest
to the experience of most pollsters." Actually, the arithmetic can be done
for all methods without allocating the undecided, but, as shown above,
it does not result in comparable measures. The overriding consideration
concerns the comparability of the measure over several elections.
Comments on Methods for Measuring Accuracy
The SSRC chose the straightforward analysis and interpretation of method
1 for its evaluation of the 1948 preelection polls. But they also expressed
concern about the use of method 1 in elections with significant minor party
candidacies (in 1948, Strom Thurmond and Henry Wallace). Method 1,
while simple and easily understood, is artificial unless the test in the election
is whether the leading candidate gets 50 percent or more. If there are
more than two candidates in an election or if a poll includes an undecided
percentage, the number for the leading candidate alone is of little value
in describing the status of an election. Witness a recent argument between
the Eagleton and Zogby Polls following the 1997 New Jersey governor's
election. The Eagleton Institute of Politics (1997) claimed that it was closest
with a preelection estimate of 45 percent of the vote, but that figure
is of little use unless we also know that Christie Todd Whitman's opponent,
Jim McGreevey, had significantly less than 45 percent. The Eagleton
claim of accuracy (based on method 1) does not address this question
(although their press release did include the vote for other candidates).
The SSRC committee also liked method 2 and used it, too, in their
analysis. Method 2 reduces the percentages so the top two candidates add
to 100 percent. The unspoken advantage is that this method eliminates
all other candidates and any undecided who may be in a poll. The measure,
unlike method 1, means the same thing regardless of how many candidates
participate in an election: at least one candidate will have 50 percent or
more. While method 2 may appear to be similar to method 1, there is an
important difference. When poll numbers are repercentaged so the two
major party candidates add to 100 percent something very important happens:
the undecided percentage is eliminated. The effect is that method
2 becomes identical to method 5 with the undecided allocated. If the undecided
are not allocated before applying method 5, then the two measures
are not equivalent.
Method 3 averages the percentage point deviation for each candidate
between its estimate and the actual vote, without regard to sign. This approach,
the SSRC committee thought, had "inherent drawbacks. By including
many small parties which scarcely contribute to the total vote ...
the average deviation can be made very small even though major party
predictions have large errors" (Mosteller et al. 1949, p. 56). They understated
the problem. If all 22 parties on the ballot in 1996 had been included
in a computation using this method, then the average error would have
been close to zero. Even with a limited number of candidates, if the coefficient
of variation¹ is roughly constant for each candidate included, then
the overall error will decline as the number of candidates increases.
Clearly, there needs to be a limitation on the number of candidates. This
method, if used with "discretion" about which parties to include, the
SSRC report said, could be useful (p. 56). It should be noted that neither
the SSRC nor the NCPP, which also used this method in its evaluation
of the 1996 preelection polls, defined what might be reasonable criteria
for including third parties.
Method 3 has the virtue of evaluating candidates other than the top
two. If the intent of a preelection poll is to report the standing of each
candidate, this measure evaluates the average accuracy of more than two
candidates. The criterion for including more than two candidates is arbitrary.
Crespi (1988) included third-party candidates only if they received
at least 15 percent of the vote in an election. In the twentieth century,
his third-party criterion would have included Theodore Roosevelt (1912),
Robert La Follette (1924), and Ross Perot (1992). It would have eliminated
Strom Thurmond and Henry Wallace in 1948, George Wallace in 1968,
John Anderson in 1980, and Ross Perot in 1996, among others (Congressional
Quarterly 1985). Method 3's other failings are (1) a lack of comparability
between elections that have different numbers of meaningful candidates,
and (2) like some other methods, it requires the analyst to do
something the pollster who created the poll was unwilling to do: namely,
allocate the undecided voters among the candidates. If the undecided were
not allocated, the measurements of error would not be comparable from
poll to poll.
Method 4 computes the ratio of each candidate's estimate divided by
the actual vote; error is the average deviation from one for each candidate.
This approach tends to exaggerate small percentage point differences in
minor party candidates. For example, a one-point error in a party with 50
percent of the vote results in a 2 percent error. A one-point error in a
party with 5 percent of the vote produces a 20 percent error. If all parties'
errors are averaged, the overall result exaggerates the total error. This is
just the opposite of method 3, which minimizes the total error. Method
4 could be modified to include major party candidates only, but then its
result would be comparable to methods 2 and 5 (with allocation of the
undecided).
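The contrast between methods 3 and 4 in this example can be reproduced in a few lines of Python. The two-party setup below (a major party at 50 percent and a minor party at 5 percent, each overstated by one point) is a hypothetical constructed only to mirror the figures in the paragraph above; the function names are likewise illustrative.

```python
# Method 4 (ratio to actual) versus method 3 (percentage point deviation).
# The parties and their shares are hypothetical illustration values.

def method3(poll, actual):
    """Average absolute deviation in percentage points."""
    return sum(abs(poll[c] - actual[c]) for c in actual) / len(actual)

def method4(poll, actual):
    """Average absolute deviation of the poll/actual ratio from one."""
    return sum(abs(poll[c] / actual[c] - 1.0) for c in actual) / len(actual)

actual = {"Major": 50.0, "Minor": 5.0}
poll = {"Major": 51.0, "Minor": 6.0}   # each party overstated by one point

ratio_errors = {c: abs(poll[c] / actual[c] - 1.0) for c in actual}

print("per-party ratio errors:", ratio_errors)   # about 0.02 and 0.20
print("method 3:", method3(poll, actual))        # one point on average
print("method 4:", method4(poll, actual))        # about 0.11, i.e. 11 percent
```

The same one-point miss counts ten times as heavily for the minor party under method 4, which is why averaging over small parties inflates the overall figure.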
The only problem with method 5, according to the SSRC report, was
the "complexity" of explaining it. Method 5 first computes the difference
between the two leading candidates in the poll and in the actual vote; the
error is the difference between these differences.
1. The coefficient of variation is the standard error of an estimate (percentage) divided by
the estimate (percentage). This is a more useful measure of variability than the standard
error alone because coefficients are comparable from one candidate to another.
In a race involving only Democrats and Republicans, after allocation
of the undecided, method 5 yields results exactly two times the result of
methods 1 and 2. An advantage of method 5 is that it evaluates the statistic
most often reported by the media when reporting preelection polls, which
is the margin between the top two candidates. Ladd, in his Wall Street
Journal article, bases his discussion on the margin between the top two
candidates. More recently, the Newark Star-Ledger, in its report of the
Eagleton Poll for the 1997 New Jersey governor's race, stated in its lead
paragraph that Whitman had "a 9-point lead" (Hassell 1997). In the fifth
paragraph the story cited Whitman's 45 percent figure, which the Eagleton
Institute of Politics (1997), in its postelection press release, said was the
most important figure. The author's nonscientific review of final preelection
poll stories in 1996 and 1997 showed that almost all poll stories made
primary mention of the margin between the top two candidates. The fact
that there were distant third-party candidates was mentioned in these stories,
but without comment, presumably because those candidates appeared
to have no chance of challenging the two leading candidates. The margin
between the leading candidates also was the measure pollsters used when
they wrote about polls (DiVall 1996; Newport 1997; Taylor 1997).
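The factor of two for a two-party race, stated at the start of this paragraph, follows directly from the definitions. As a brief derivation (the symbols are notation introduced here, not the SSRC report's): let $p_D$ and $p_R$ be the poll shares after allocation and $a_D$ and $a_R$ the actual shares, with $p_D + p_R = 100$ and $a_D + a_R = 100$, and write $e_1$ and $e_5$ for the method 1 and method 5 errors.

```latex
\begin{align*}
e_5 &= (p_D - p_R) - (a_D - a_R) \\
    &= \bigl(p_D - (100 - p_D)\bigr) - \bigl(a_D - (100 - a_D)\bigr) \\
    &= 2\,(p_D - a_D) \\
    &= 2\,e_1 .
\end{align*}
```

With only two candidates and no undecided left in the base, repercentaging to the two-party vote changes nothing, so the method 2 error equals $e_1$ as well. The two measures therefore always rank polls identically in this case, differing only in scale.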
Method 5 rewards the effort of the pollsters who allocate undecided
voters well and penalizes those who allocate poorly. If the pollster does
not allocate at all then the margin reported by the pollster, presumably,
is the best indication of the pollster's expectation about the election outcome.
Method 5 does not force someone evaluating polls to make assumptions
about the undecided that were not made by the pollster.² It should
be noted that the results of methods 3 and 5 are identical for two-candidate
races when the undecided are allocated. They differ when there is no allocation.
Methods 6, 7, and 8 were mentioned in the 1948 SSRC report but were
not considered as viable options. Method 6 applies to the one party with
the largest error, even if it is a minor party. Complexity was the reason for
dropping method 7. The error in electoral votes projected and evaluated
in method 8 is one step removed from evaluating national or state polls
directly.
Crespi (1988), after he performed proportional allocation for the polls
that included undecided voters in their base, evaluated methods 1, 3, and
6. He concluded that there was not much difference between them and
used method 1 for his analysis, as did the SSRC committee for its report
on the 1948 elections. Methods 3 and 6, according to Crespi's data, had
the smaller coefficients of variation. Method 1, which he used throughout
the book, had a slightly larger coefficient of variation than the methods
he rejected for his analysis.
2. I would like to acknowledge that I made the mistake of not allocating the undecided
during the 15 years I directed the CBS half of the CBS/New York Times Poll. I now believe
that it is unreasonable of a pollster to ask a reader or viewer of a final preelection poll to
make an interpretation about how the undecided will vote. A poll is being reported so the
public knows what to expect when the election takes place. Leaving the undecided in the
base of the percentages reported does not serve the public expectation or the pollsters'
claims about accuracy.
An analysis of recent British elections yields a mixed bag. Crewe
(1997) prefers method 3. He calls it "the true test of a poll's accuracy"
(p. 580). Nonetheless, the entire discussion in his article about the 1997
elections is about Labour's lead, the lead being the difference between
the top two parties, which is evaluated directly by method 5. He offers
computations of polling error for both method 3 and method 5. Robert
M. Worcester endorses Crewe's preference for method 3.³ He also added
a few details: (1) polls are repercentaged so the major parties add to 100
percent;⁴ (2) there are no decimal places in the poll's recomputed estimate
of a party and there is one decimal in the election result.⁵ He claims this
method has been used for decades to evaluate British elections.⁶
Comparison of the Methods
Nine final preelection presidential polls from 1996 were evaluated using
four of the eight methods. They include the eight polls listed in the table
in the Wall Street Journal excerpt of Ladd's critique, plus the Politics
Now/ICR poll. The nine polls were conducted by ABC News, CBS News/
New York Times, Gallup/CNN/USA Today, Harris Poll, Hotline/Battleground,
NBC/Wall Street Journal, Politics Now/ICR, Princeton Survey
Research/Pew Research Center, Zogby Group/Reuters. Three of them
(Harris, Princeton Survey Research, and Zogby) reported vote for
"other" candidates in addition to Clinton, Dole, and Perot. Four polls
(ABC, CBS, Hotline, and NBC) reported undecided voters in the base
of their percentages. The final poll results, as reported by the polling organizations,
are presented in table 1.
3. Robert M. Worcester, personal communication to author (E-mail), December 13, 1997.
4. Unlike U.S. elections, there are more than two major parties in British elections. In
1997, there were three major parties plus "others" in the evaluation.
5. The British evaluation, unlike the NCPP evaluation of U.S. presidential election polls,
maintains the same number of significant digits for a poll that the poll had when it was
reported by the media. The use of a constant evaluation method also makes it possible to
readily compare polling performance over more than one election.
6. The evaluation of the performance of the polls for the 1992 British general election
took on the same sense of urgency as the SSRC evaluation of the 1948 U.S. polls. The
four final British polls showed Labour with a small lead, which suggested a coalition
government. The Conservatives actually won by a small but comfortable margin. Butler
and Kavanagh (1992) discuss these polls in chap. 7, "The Waterloo of the Polls."

Table 1. Final 1996 Preelection Presidential Polls

                                         Clinton     Dole          Perot
                                         (Democrat)  (Republican)  (Reform)  Other  Undecided
Electionᵃ                                49.3        40.7          8.4       1.6
ABC News                                 51          39            7                3
CBS News/New York Times                  53          35            9                3
Gallup                                   52          41            7
Harris                                   51          39            9         1
Hotline/Battleground                     45          36            8                11
International Communication Research/
  Politics Now                           51          38            11
NBC News/Wall Street Journal             49          37            9                5
Princeton Survey Research/
  PEW Research Center                    52          38            9         1
Zogby/Reuters                            49          41            8         2

NOTE: Data are percentages.
ᵃ Data are from Scammon and McGillivray 1997.

The accuracy of methods 1, 2, 3, and 5 was evaluated. Method 1 was
used by the SSRC in 1948; method 2 is similar, but only deals with the
major party candidates and also was used in the SSRC report; method 3
was used by NCPP and the British in their evaluation of election polls;
method 5 deals with the statistic most often reported in the media, the
margin between the leading candidates. The evaluation was done with
and without allocation of the undecided voters. A few rules were observed
in the calculations: the result of the presidential election was stated to
within one decimal place. All poll numbers were whole percentages with
no decimals, thereby maintaining the same number of significant digits
as were published by the pollsters. Undecided voters were allocated in
proportion to the vote for Clinton, Dole, and Perot, which was the allocation
method least controversial and used by NCPP, the British evaluations,
and Crespi (1988; vote for "others" in a poll was eliminated before allocation).
Poll percentages after allocation were rounded to whole percentages
before other calculations. A rank of 1 was assigned to the poll with
the smallest error for a given method. If more than one poll had the same
error they were each given the same rank.
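These calculation rules can be put together as a small pipeline. The Python sketch below applies them to the table 1 figures under method 5, and it rests on several assumptions that the text does not settle: that "eliminating" the vote for others means repercentaging the three named candidates, that tied polls receive competition ranks (1, 2, 2, 4, ...), and that Python's built-in rounding is acceptable. All function names are hypothetical, and the output should not be read as a reproduction of table 2.

```python
# Sketch of the stated rules: allocate, round, compute error, rank with ties.
# Treatment of "other" votes and the tie-ranking scheme are assumptions.

ELECTION = {"Clinton": 49.3, "Dole": 40.7, "Perot": 8.4}  # one decimal place

# (Clinton, Dole, Perot) as listed in table 1; other and undecided omitted.
POLLS = {
    "ABC News": (51, 39, 7),
    "CBS News/New York Times": (53, 35, 9),
    "Gallup": (52, 41, 7),
    "Harris": (51, 39, 9),
    "Hotline/Battleground": (45, 36, 8),
    "ICR/Politics Now": (51, 38, 11),
    "NBC News/Wall Street Journal": (49, 37, 9),
    "Princeton Survey Research/Pew": (52, 38, 9),
    "Zogby/Reuters": (49, 41, 8),
}

def allocate(shares):
    """Repercentage the three candidates to 100, rounded to whole numbers.

    Dropping the undecided from the base is the same as allocating them
    in proportion to the candidates' poll shares.
    """
    base = sum(shares)
    return tuple(round(100.0 * s / base) for s in shares)

def method5_error(shares):
    """Poll margin (Clinton minus Dole) minus the actual margin."""
    actual_margin = ELECTION["Clinton"] - ELECTION["Dole"]
    return round((shares[0] - shares[1]) - actual_margin, 1)

def rank_with_ties(errors):
    """Rank 1 goes to the smallest absolute error; ties share a rank."""
    ordered = sorted(abs(e) for e in errors.values())
    return {name: ordered.index(abs(e)) + 1 for name, e in errors.items()}

errors = {name: method5_error(allocate(s)) for name, s in POLLS.items()}
ranks = rank_with_ties(errors)

for name in sorted(ranks, key=ranks.get):
    print(f"{ranks[name]}  {errors[name]:+.1f}  {name}")
```

Swapping `method5_error` for another error function would give the corresponding ranking for methods 1, 2, or 3 under the same rules.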
A comparison of the effect on the rankings for a given method can be
seen in table 2. Allocation of the undecided voters changed the method
1 ranking by more than one place for four of the nine
polls. Method 2 always includes allocation, and therefore the results are
identical, excluding it from this comparison. The resulting ranking using
method 3 is changed slightly by allocation, except for one poll. Method
5 shows the least variability in the rankings when the undecided are allocated.
Only one of the nine polls shifts its ranking by two or more places.⁷
A comparison of the resulting rankings for methods 1, 3, and 5 when
there is no allocation shows considerable variability. Each method produces
a different rank order. A comparison of all four methods when the
undecided voters are allocated is much less variable, as one would expect.
Method 1 varies slightly more than the other three methods, which is
consistent with Crespi's finding.⁸ Methods 2 and 5, as noted in the discussion
of methods above, produce identical rankings after allocation.
Discussion
An inspection of the rankings in table 2 produced by the different methods
shows that they are more consistent when the undecided are allocated.
However, it is still open to question whether the evaluator should allocate
the undecided if the pollster did not do it. If the goal of these preelection
polls is to inform the public about the expected outcome of an election,
then it seems that the responsibility for allocation should rest with the
pollster. The public and the journalists have neither the information necessary
for a sophisticated allocation nor the technical knowledge. If there
is some goal for these polls other than forecasting the election, then the
eight methods described above for evaluating the polls are not sufficient.
A more appropriate approach would be to assess the conformity of the
polling methods to good statistical and polling theory, which are the criteria
to which all other polls are subjected. Assuming the goal is forecasting,
it seems reasonable, when evaluating a poll, to take the numbers as reported
by a pollster, without modifying them.
This conclusion was based on four points. First, five of the nine national
polls did their own allocation of the undecided. Second, when evaluating
the polls that did not allocate, there is no agreed upon method of allocation
for an evaluator to use. Third, if there is a preferred method of allocation,
the pollster did not choose to use it. Fourth, it seems reasonable to assume
that pollsters report their "best" estimates of an election outcome to the
public.
7. An examination of the correlations among the rankings shows high consistency among
the methods when the undecided are allocated. These correlations are all high, ranging
from .72 to .94, except for the correlation between methods 2 and 5, which is 1.0. When
the undecided are not allocated, the correlations are more variable and they are lower.
They range from .10 to .75, with one exception. Again, the correlation between methods
2 and 5 is high; it is .94. I thank Dan M. Merkle for these computations.
8. The coefficients of variation for methods 1, 2, 3, and 5, respectively, are .70, .58, .53,
and .64 for the 1996 final presidential election polls, when the undecided are allocated.
Crespi's (1988) study had coefficients for methods 1 and 3, respectively, of .83 and .76.
[Table 2, which the text describes as giving the rankings of the nine polls under each method with and without allocation of the undecided, is not legible in this copy.]
There is an effect on the measurement of accuracy of the polls when
the undecided are allocated. The correlation of the errors for the nine polls
using method 1 with and without allocation is only .35.⁹ For method 3 it
is .60. Only method 5 maintains a high correlation of .94. Should the
evaluator do something that will change the assessment of accuracy that
the pollster was unwilling or unable to do for him- or herself?
As to which method should be used to evaluate polling accuracy, that
remains the choice of the evaluator. The arguments for and against each
method are listed above and summarized here. If the goal is to forecast
which candidate will win and by how much, then method 1 does not adequately
evaluate an election outcome unless there are only two candidates.
Limiting the analysis to one candidate, as method 1 does, gives no idea
about the accuracy of the forecast; methods 2 and 3 evaluate the forecast
of the winner indirectly and method 5 does it directly. Method 2 implicitly
introduces proportional allocation, and therefore it too seems less preferable.
The best choice appears to be between methods 3 and 5. The chief
argument made by proponents of method 3 is that it represents all "significant"
candidates but leaves open how to define "significant." Its opponents
say method 3 artificially reduces the overall error when a third
candidate is introduced, thereby making comparisons with two-candidate
elections not meaningful. It should be noted, for example, that the introduction
of Perot into the evaluation of the performance of the 1996 preelection
polls reduced the measured polling error of method 3; the error
on Perot's share of the vote is less than the overall per candidate error.
Proponents of method 3 say it evaluates all candidates, which is true. It
does not, however, provide a consistent method for evaluating a poll's
forecast of the winning candidate in an election.
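The effect of adding Perot can be illustrated with the table 1 figures. The Python sketch below computes the method 3 error for each poll twice, once over Clinton and Dole only and once with Perot included. It uses the poll numbers as reported, with no allocation of the undecided, so it is an approximation of the comparison described here rather than a reproduction of the author's calculation; the function and variable names are illustrative.

```python
# Method 3 with two candidates versus three, on the table 1 figures.
# Uses polls as reported (no allocation), which is an assumption.

ELECTION = (49.3, 40.7, 8.4)   # Clinton, Dole, Perot

POLLS = {
    "ABC News": (51, 39, 7),
    "CBS News/New York Times": (53, 35, 9),
    "Gallup": (52, 41, 7),
    "Harris": (51, 39, 9),
    "Hotline/Battleground": (45, 36, 8),
    "ICR/Politics Now": (51, 38, 11),
    "NBC News/Wall Street Journal": (49, 37, 9),
    "Princeton Survey Research/Pew": (52, 38, 9),
    "Zogby/Reuters": (49, 41, 8),
}

def method3(poll, actual, n):
    """Average absolute deviation over the first n candidates."""
    return sum(abs(p - a) for p, a in zip(poll[:n], actual[:n])) / n

two = {name: method3(s, ELECTION, 2) for name, s in POLLS.items()}
three = {name: method3(s, ELECTION, 3) for name, s in POLLS.items()}

mean_two = sum(two.values()) / len(two)
mean_three = sum(three.values()) / len(three)
lowered = [name for name in POLLS if three[name] < two[name]]

for name in POLLS:
    print(f"{name:32s} two: {two[name]:.2f}  three: {three[name]:.2f}")
print(f"mean over polls: two {mean_two:.2f}, three {mean_three:.2f}")
print(f"polls whose error falls when Perot is added: {len(lowered)} of {len(POLLS)}")
```

On these unallocated figures the average error falls when Perot is included, although not for every individual poll, since a poll that missed Perot's share badly sees its average rise.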
The choice then seems clearer. If one wants to compare elections over
time it is necessary to use a method that is comparable for both two-candidate
and multicandidate elections. Only method 5 meets that test.
9. These are Spearman correlations, computed on the rank order of the polling errors. The
correlations are very similar to the Pearson correlations using the errors themselves.
Was 1996 a Worse Year than 1948?
Having reviewed the methods for judging poll accuracy suggested by the
SSRC committee 50 years ago, I can now offer a reasoned judgment on
the accuracy of the 1996 polls and assess the validity of Ladd's complaint,
that the 1948 polls look better by comparison.

Table 3. Method 5 Errors in Presidential Polls, 1948 and 1996

                                                      Truman-Dewey  Method 5 Error
                                                      (%)           (%)
1948:
  Election                                            4
  Gallup                                              -5            -9
  Crossley                                            -5            -9
  Roper                                               -15           -19

                                                      Clinton-Dole  Method 5 Error
                                                      (%)           (%)
1996:
  Election                                            9
  Reuters/Zogby                                       8             -1
  Hotline/Battleground                                9             0
  Gallup                                              11            2
  ABC News                                            12            3
  Harris                                              12            3
  NBC News/Wall Street Journal                        12            3
  International Communications Research/Politics Now  13            4
  Princeton Survey Research/PEW Center                14            5
  CBS News/New York Times                             18            9

In 1948, George Gallup and Archibald Crossley had polls that were
closer to the Truman-Dewey election outcome than the poll conducted by
Elmo Roper. (See table 3.) Gallup and Crossley had Dewey winning by
a 5 percentage point margin in a race Truman won by 4 points. This
resulted in a 9-point error on the difference. Roper was farther off the mark.
He had an error on the margin of 19 points. In 1996, eight of the polls
had errors ranging between 0 and 5 points on the margin. The ninth, the
CBS News/New York Times poll, had a 9-point error. The one poll with
the largest error in 1996 was as far off as the best result in 1948. By this
measure, the polls of 1996 were clearly better than the polls of 1948.
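The method 5 figures in table 3 follow from simple subtraction. The following is a minimal sketch in Python, not the author's own computation: the names ELECTION_MARGIN, POLL_MARGINS, and method5_error are illustrative, and the margins are the rounded values printed in table 3.

```python
# Margins (leader minus runner-up, in percentage points) as printed in table 3.
ELECTION_MARGIN = {"1948": 4, "1996": 9}

POLL_MARGINS = {
    "1948": {"Gallup": -5, "Crossley": -5, "Roper": -15},
    "1996": {
        "Reuters/Zogby": 8,
        "Hotline/Battleground": 9,
        "Gallup": 11,
        "ABC News": 12,
        "Harris": 12,
        "NBC News/Wall Street Journal": 12,
        "International Communications Research/Politics Now": 13,
        "Princeton Survey Research/PEW Center": 14,
        "CBS News/New York Times": 18,
    },
}


def method5_error(poll_margin, election_margin):
    """Signed error on the margin: poll margin minus election margin."""
    return poll_margin - election_margin


# Signed method 5 error for every poll in both years.
errors = {
    year: {
        poll: method5_error(margin, ELECTION_MARGIN[year])
        for poll, margin in polls.items()
    }
    for year, polls in POLL_MARGINS.items()
}

# Smallest absolute error in 1948 and largest absolute error in 1996.
best_1948 = min(abs(e) for e in errors["1948"].values())
worst_1996 = max(abs(e) for e in errors["1996"].values())

print(errors["1948"])
print(best_1948, worst_1996)
```

Because the inputs are rounded to whole points, the sketch reproduces the table's rounded errors rather than the unrounded errors that underlie the averages reported elsewhere in the article.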
Polling closer to election day may have helped the polls of 1996. Gallup's 1948 national poll was concluded closer to the election than either
Crossley's or Roper's. Gallup stopped interviewing October 28, Crossley
finished October 18, and Roper's poll was finished early in September.
The 1948 election was November 2. In 1996, the eight national polls cited
by Ladd completed interviewing during the closing days of the campaign.
The earliest to stop was Hotline/Battleground on October 31, the Thursday
before the election. Gallup polled through the night before the
November 5 election, while most polls stopped on November 3.
National Council on Public Polls
The NCPP (1997), in an attempt to counter Ladd's criticism of the 1996
polls, published its evaluation of 47 final preelection presidential polls
conducted between 1936 and 1996. The early polls (1936-60) in the
NCPP analysis were only from Gallup. Harris polls were included from
1964 on; starting in the 1970s all other major national polls were included.
For 1996, NCPP included eight polls cited by Ladd plus the Politics Now/
ICR poll.
In its press release, NCPP (1997) "refutes criticisms of the accuracy
of 1996 national presidential polls" (p. 1). It claims the average error
"was low relative to historical experience" (p. 1) and within expected
sampling error margins. Sheldon Gawiser, president of NCPP, said, "1996
should be remembered as one of the better years for the national polls"
(p. 2). The NCPP concluded, "The average error in 1996 was only 1.7
percentage points. This compares to 2.5% between 1936 and 1996 and
1.9% since 1956. Eight of the nine [1996] polls had errors within the
±3% margin of error expected for samples of their size" (p. 3).
In its analysis, NCPP calculated polling error as "the average [absolute]
deviation between the final poll results and the election results for the top
two or three candidates" (p. 2). The third candidate was included in five
of the 11 elections since 1956. The third candidates included in this
analysis ranged from a low of 0.9 percent (McCarthy in 1976) to a high of
18.9 percent (Perot in 1992). There was one other wrinkle. Some polls
allocated the undecided vote and others did not. In an effort to make all
polls equal, NCPP allocated the undecided vote among the top two or
three candidates in proportion to their estimated vote in the poll.
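The NCPP procedure described above, proportional allocation of the undecided vote followed by an average absolute deviation, can be sketched as follows. This is an illustration under stated assumptions, not NCPP's actual code: the function names are invented here, and the poll and election figures are hypothetical round numbers chosen to keep the arithmetic transparent.

```python
def allocate_undecided(shares, undecided):
    """Spread the undecided share across candidates in proportion to their poll shares."""
    decided = sum(shares.values())
    return {name: s + undecided * s / decided for name, s in shares.items()}


def method3_error(poll, election):
    """Method 3: average absolute deviation between poll and election, per candidate."""
    return sum(abs(poll[c] - election[c]) for c in election) / len(election)


def method5_error(poll, election, leader, runner_up):
    """Method 5: signed error on the margin between the two leading candidates."""
    poll_margin = poll[leader] - poll[runner_up]
    election_margin = election[leader] - election[runner_up]
    return poll_margin - election_margin


# Hypothetical two-candidate poll with 10 percent undecided (not a real poll).
poll = allocate_undecided({"A": 45.0, "B": 45.0}, undecided=10.0)
election = {"A": 52.0, "B": 48.0}  # hypothetical election result

m3 = method3_error(poll, election)
m5 = method5_error(poll, election, "A", "B")
print(poll, m3, m5)
```

In this two-candidate illustration the absolute error on the margin is exactly twice the average per candidate error, which gives a sense of why the choice between the two error concepts changes the apparent size of polling error.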
There was a debate within NCPP over which error concept to use. The
members accepted method 3 (the average deviation between the poll and
the election for each candidate) and rejected method 5 (the error on the
margin) because it resulted in an average candidate error that was more
than twice as large as the one used in their analysis.
Table 4 shows the average errors for each presidential election year
between 1956 and 1996. The errors were computed using method 3, as
NCPP did in its analysis, and method 5.
Table 4. Average Errors in Presidential Polls, 1948 and 1956-96

                                                Method 3(a)       Method 5
                          Number   Number of    Average          Average
Year                    of Polls  Candidates  Error (%)  Rank  Error (%)  Rank
1996                           9           3        1.7     5        3.6     8
1992                           6           3        2.2     8        2.7     5
1988                           5           2        1.5     3        2.8     6
1984                           6           2        2.4     9        4.4     9
1980                           4           3        3.0    11        6.1    11
1976                           3           3        1.5     3        2.0     2
1972                           3           2        2.0     7        2.6     4
1968                           2           3        1.3     2        2.5     3
1964                           2           2        2.7    10        5.3    10
1960                           1           2        1.0     1        1.9     1
1956                           1           2        1.8     6        3.5     7
Yearly average, 1956-96                             1.9              3.4
1948                           3           3        4.9    12       12.9    12

SOURCE.-1996 from NCPP (1997) and publication; 1956-92 from NCPP; 1948
from Mosteller et al. 1949.
(a) Method 3 was used by NCPP in its analysis of the polls.
A few conclusions can be drawn from these data. The 1948 preelection
polls stand out as the poorest performance of any preelection polls. Ladd's
comparison of the 1996 polling performance to 1948 was without merit.
The average error in the 1996 polls by either error measurement is much
less than for 1948. Also, the error for each of the 1996 polls (except the
CBS/New York Times poll) was less than their presumed sampling error.[10]
The error on each of the 1948 polls exceeds what might have been the
sampling error if those polls had been probability-based polls. To say that
the polls of 1996 had estimates that "diverged sharply," as Ladd said,
is wrong. They diverged modestly, and all but two overstated Clinton's
lead over Dole. Only the CBS/New York Times poll had an error
approaching Gallup's 1948 error, and, unlike CBS and the New York Times,
Gallup had the wrong winner.
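The rule of thumb in note 10, that a poll with a 3 percent "margin of error" on one candidate has a margin of error of between 5 and 6 percent on the difference, can be checked with a short calculation. The sketch below assumes simple random sampling and uses illustrative vote shares of 49 and 41 percent; the function names and the choice of shares are assumptions made here, not figures taken from the article.

```python
import math


def moe_single(p, n, z=2.0):
    """Margin of error (z standard errors) on one candidate's share p."""
    return z * math.sqrt(p * (1.0 - p) / n)


def moe_margin(p1, p2, n, z=2.0):
    """Margin of error on the difference p1 - p2 from the same sample.

    Under simple random sampling (multinomial shares):
    Var(p1 - p2) = [p1 + p2 - (p1 - p2)**2] / n
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)


# Sample size at which 2 standard errors on a 50 percent share equals 3 points.
n = (2.0 * 0.5 / 0.03) ** 2

single = moe_single(0.5, n)
margin = moe_margin(0.49, 0.41, n)  # illustrative shares, assumed for this sketch

print(round(100 * single, 2), round(100 * margin, 2), round(margin / single, 2))
```

Under these assumptions the margin of error on the difference comes out a little below twice the single-candidate figure, in line with the note; real polls with weighting or clustered designs would have somewhat larger sampling errors.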
However, the NCPP conclusion that 1996 was a banner year for the
10. The sampling error on the margin between the candidates is slightly less than twice
the standard error on a single candidate. Polls that claim a "margin of error" of 3 percent
(2 × standard error on one candidate) would likely have a margin of error on the difference
of between 5 and 6 percent.
Table 5. Party "Bias" in Preelection Polls, 1956-96

         Number of Polls Favoring a Party
Year     Democrats   Republicans   Neither
1996         7            0           2
1992         5            0           1
1988         1            4           0
1984         2            2           2
1980         4            0           0
1976         1            2           0
1972         1            2           0
1968         1            0           1
1964         2            0           0
1960         1            0           0
1956         0            1           0
polls also does not stand up to scrutiny. By its own error measurement
(method 3), the average of the polls' errors in 1996 was the fifth best of
the 11 presidential elections since 1956. By the author's preference for a
measure of polling error (method 5), 1996 is eighth, somewhat worse than
the NCPP's result. In either case, the 1996 performance is not "one of
the smaller errors recorded," as NCPP President Sheldon Gawiser said
in his organization's (1997) press release on polling accuracy.
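The rankings cited here can be reproduced from the average errors in table 4. The sketch below is only an illustration: the list and function names are invented for it, and it assumes that tied years share the lowest applicable rank, which matches the table's treatment of 1988 and 1976 under method 3.

```python
# Average errors by election year, as printed in table 4.
YEARS = [1996, 1992, 1988, 1984, 1980, 1976, 1972, 1968, 1964, 1960, 1956]
METHOD3 = [1.7, 2.2, 1.5, 2.4, 3.0, 1.5, 2.0, 1.3, 2.7, 1.0, 1.8]
METHOD5 = [3.6, 2.7, 2.8, 4.4, 6.1, 2.0, 2.6, 2.5, 5.3, 1.9, 3.5]


def competition_ranks(values):
    """Rank 1 = smallest error; tied values share the lowest rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank3 = dict(zip(YEARS, competition_ranks(METHOD3)))
rank5 = dict(zip(YEARS, competition_ranks(METHOD5)))

# Yearly averages for 1956-96, rounded to one decimal as in the table.
avg3 = round(sum(METHOD3) / len(METHOD3), 1)
avg5 = round(sum(METHOD5) / len(METHOD5), 1)

print(rank3[1996], rank5[1996], avg3, avg5)
```

Ranking on the rounded averages is a simplification; the published ranks may have been computed from unrounded errors, although the two agree for every year in the table.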
Ladd also claimed that "election polls have frequently over-estimated
the Democrats' share of the vote." The NCPP disagreed. It said, "Since
1956 errors favored the Democratic candidate in six elections and the
Republicans in five. The size of the errors was almost equal."
Rather than examine the average error in each election year, as NCPP
did, I examined the direction of the error in each final preelection poll
since 1956. Averages can mask a potential bias, whereas individual poll
results give a more direct picture. For this analysis, if a poll's margin
between the two leading candidates varied by less than one percentage
point from the election result, I said the poll favored neither party.
(See table 5.)
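The classification rule and the resulting tally can be sketched in a few lines. The counts below are copied from table 5; the function direction, its sign convention (margins expressed as Democrat minus Republican), and the example margins passed to it are assumptions made for illustration.

```python
# (Democrats, Republicans, Neither) counts per year, as printed in table 5.
TABLE5 = {
    1996: (7, 0, 2), 1992: (5, 0, 1), 1988: (1, 4, 0), 1984: (2, 2, 2),
    1980: (4, 0, 0), 1976: (1, 2, 0), 1972: (1, 2, 0), 1968: (1, 0, 1),
    1964: (2, 0, 0), 1960: (1, 0, 0), 1956: (0, 1, 0),
}


def direction(poll_margin, election_margin, tolerance=1.0):
    """Classify a poll by the direction of its error on the margin.

    Margins are Democrat minus Republican, in percentage points. A poll
    within `tolerance` points of the election margin favors neither party.
    """
    diff = poll_margin - election_margin
    if abs(diff) < tolerance:
        return "Neither"
    return "Democrats" if diff > 0 else "Republicans"


# Column totals across all 11 elections.
dem_total = sum(row[0] for row in TABLE5.values())
rep_total = sum(row[1] for row in TABLE5.values())
neither_total = sum(row[2] for row in TABLE5.values())

print(dem_total, rep_total, neither_total)
print(direction(12.0, 9.0), direction(9.5, 9.0), direction(5.0, 9.0))  # hypothetical margins
```

Summing by poll rather than by election year is what separates this tally from the NCPP count of six elections against five.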
The evidence shows that Ladd was correct about the direction of the
polls' errors. More than twice as many polls overstated the Democratic
candidate's share of the vote as overstated the Republican's share.
Furthermore, the 25 Democratic-leaning polls had an average error on the
margin of victory (method 5) of 4.4 percentage points, while the 11
Republican-leaning polls' error was only 3.3 percentage points. The
Democratic and Republican errors differ significantly and do not support the
NCPP position that the parties' errors were about equal. Whether these
differences are large enough to support Ladd's suggestion that the
overreporting of Clinton's victory margin had a bearing on either participation
or the outcome of the election is problematic. One would have to accept
the bandwagon theory over the underdog theory in order to accept his
notion, a discussion this article will not pursue.

Table 6. State Polls, 1996

Error on Margin between    Number of             Number of
Leading Candidates (%)     Presidential Races    Senate Races
10%+                         4 (2)                12 (6)
7-9                         16 (9)                14 (7)
4-6                         18 (10)               27 (13)
1-3                         47 (26)               45 (22)
<1                          15 (8)                 2 (1)
Total                      100 (55)              100 (49)

NOTE.-Data are percentages, Ns are given in parentheses.
1996 State Polls [11]
Ladd's (1996b) criticism of the 1996 polls included, by implication, the
state polls as well as the national polls when he said, "Election polling had
a terrible year in 1996." While he and his critics focused more attention on
the national polls, the state polls are more numerous and appear more
regularly in local news reports. The state polls, with very few exceptions,
were done by different pollsters than the national polls. More than half
of the state polls were done by Mason-Dixon, a firm that services news
organizations nationwide. Mason-Dixon's performance was better,
collectively, than that of those who did the other state polls. Of the 55
presidential state polls reported in the final Hotline (1996), 62 percent
were within 3 points of the actual margin of victory. (See table 6.) The 49
state polls on senatorial races were not as good. Just under half were
within 3 points of the final margin.
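The two "within 3 points" figures can be recovered from the counts in table 6. The sketch below is illustrative only; the dictionary and function names are assumptions, and the counts are the Ns printed in parentheses in the table.

```python
# Number of state polls in each error band, from the Ns in table 6.
PRESIDENT = {"10%+": 2, "7-9": 9, "4-6": 10, "1-3": 26, "<1": 8}
SENATE = {"10%+": 6, "7-9": 7, "4-6": 13, "1-3": 22, "<1": 1}


def share_within_3(counts):
    """Percentage of polls whose error on the margin fell in the two lowest bands."""
    close = counts["1-3"] + counts["<1"]
    return 100.0 * close / sum(counts.values())


pres_share = share_within_3(PRESIDENT)
senate_share = share_within_3(SENATE)

print(sum(PRESIDENT.values()), round(pres_share))
print(sum(SENATE.values()), round(senate_share))
```

The presidential share rounds to the 62 percent given in the text, and the Senate share falls just below one half.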
There were three instances where the presidential state polls had the
11. All poll results for this analysis come from Hotline (1996).
wrong winner as well as three errors in Senate polls. The incorrect
presidential polls were all close races. Only one of the three Senate poll errors
was close. The biggest error was by the Detroit News in the Michigan
Senate race. They were off by 17 points on the margin. There were two
others that were 14 points off, the Omaha World-Herald in the Nebraska
Senate race and the Greensburg (Pa.) Tribune-Review in the Pennsylvania
presidential race.
Ladd could hardly criticize the state polls for favoring the Democrats
over the Republicans. The state polls were much more evenhanded than
the national polls in the direction of their errors. Nineteen polls erred in
favor of the Democrat, Clinton, while only 18 favored the Republican,
Dole. The other 18 presidential polls were within 1 percentage point of
the election.
The Exit Polls
Ladd did not spare the exit polls from his criticism. He said the networks
offered premature reports on election night when their exit polling
consortium, Voter News Service (VNS), incorrectly projected a Democratic win
in New Hampshire. An "especially egregious error," he called it. He also
compared the performance of the exit polls in recent years to the 1948
preelection polls. He said they, too, had frequently missed the mark by
larger margins than Gallup's error in 1948.
The networks used exit polls on election night for two purposes:
projections and analysis. Exit-poll-based projections took place at poll
closing time in contests that appeared to be clear-cut victories for a candidate.
These projections have never cited exit poll estimates of the candidates'
percentages. A VNS or a network analyst just named the winning
candidate, which was then broadcast after the polls closed. The networks have
never reported a margin of victory based on exit polls. Later on election
night they did report estimates based on samples of actual vote returns,
and these have almost always been within a few points of the final result.
The analytical information used in cross-tabulation was weighted to the
estimates produced from samples that used actual vote returns. As
someone who did have access to the exit poll estimates, which were not publicly
available, I can report that the 1996 exit polls were not in excess of
Gallup's 9-point error in 1948.
Since the networks formed their exit poll pool in 1990 they have
covered about 500 races. Over half were projected from exit poll results. The
only incorrect projection by the pool was in the New Hampshire Senate
race in 1996. This lone error was corrected by the networks on-air two
and a half hours after the mistake was made.
Conclusion
Ladd's main point and most testable claim about the accuracy of the
national polls holds no water: by any of the measures reviewed here, 1996
was not the best but was far from the worst year for the polls. The data
also do not support a condemnation of 1996 state polls, nor do they
support Ladd's claims about the accuracy of exit polls. The only charge that
receives any support is Ladd's claim about overestimation of the
Democratic vote share in polls. One measure, calculated for this article, bolsters
this charge, while an NCPP analysis calls it into question. Overall, a
modicum of scrutiny reveals that Ladd's impressionistic scorecard for the polls
is seriously in error.
One can only speculate as to why Ladd chose to make demonstrably
erroneous and unsupported claims. The professional polling community
was understandably outraged in its response to Ladd's very public
pronouncements. If he really meant to improve polling practice, one cannot
imagine a less effective means of achieving the goal. Certainly, in view
of his less-than-rigorous analysis, Ladd's call for a "blue-ribbon"
commission to investigate poll performance cannot be taken seriously.
Issues facing the polling profession require dispassionate and careful
study if appropriate improvements are to be found. In the first instance,
we should be clear about how the quality of our work is to be measured.
This article has examined a variety of rules and discussed their merits
and drawbacks. The analysis provides a foundation for judging the quality
of poll results. In 1998 and beyond, this framework may help to assess
polling progress and problems. In the meanwhile, we can answer the
question, Was 1996 a worse year for polls than 1948? No, it was much better.
References
Butler, David, and Dennis Kavanagh. 1992. The British General Election of 1992.
London: Macmillan.
Congressional Quarterly. 1985. Guide to U.S. Elections. Washington, DC:
Congressional Quarterly.
Crespi, Irving. 1988. Pre-election Polling. New York: Russell Sage Foundation.
Crewe, Ivor. 1997. "The Opinion Polls: Confidence Restored?" Parliamentary Affairs
4:569-85.
DiVall, Linda A. 1996. "Keys to Expanding GOP Majorities." Polling Report,
December 9, p. 1.
Eagleton Institute of Politics. 1997. "Summary of the 1997 Gubernatorial Election
Results Comparison with Pre-election Polls." Press release. Eagleton Institute of
Politics, Rutgers University, New Brunswick, NJ.
Hassell, James. 1997. "Latest Poll Shows Gains for Whitman." Newark Star-Ledger,
November 2.
Hotline. 1996. "#1" and "#1 1." Hotline, November 4.
Ladd, Everett C. 1996a. "The Election Polls: An American Waterloo." Chronicle of
Higher Education, November 22, p. A52.
Ladd, Everett C. 1996b. "The Pollsters' Waterloo." Wall Street Journal, November 19, p. A22.
Mosteller, Frederick, Herbert Hyman, Philip J. McCarthy, Eli S. Marks, and David B.
Truman. 1949. The Pre-election Polls of 1948. New York: Social Science Research
Council.
National Council on Public Polls (NCPP). 1997. "Polling Council Analysis Concludes
Criticisms of 1996 Presidential Poll Accuracy Are Unfounded." Press release,
February 13. National Council on Public Polls, Fairfield, CT.
Newport, Frank. 1997. "Controversies in Pre-election Polling." Paper presented at the
annual meeting of the American Association for Public Opinion Research, Norfolk,
VA.
Scammon, Richard M., and Alice V. McGillivray. 1997. America Votes 22.
Washington, DC: Congressional Quarterly.
Taylor, Humphrey. 1997. "Why Most Polls Overestimated Clinton's Margin." Public
Perspective 8 (February/March): 45-48.
Zogby International. 1997. "Zogby Polls NJ and VA Right Again!" Press release.
Zogby International, Utica, NY.