How Not to Lie with Statistics: Avoiding Common Mistakes in Quantitative Political Science
Author(s): Gary King
Source: American Journal of Political Science, Vol. 30, No. 3 (Aug., 1986), pp. 666-687
Published by: Midwest Political Science Association
Stable URL: http://www.jstor.org/stable/2111095 .
Accessed: 31/01/2013 04:25
This content downloaded on Thu, 31 Jan 2013 04:25:06 AM
All use subject to JSTOR Terms and Conditions
WORKSHOP
How Not to Lie with Statistics: Avoiding Common Mistakes in Quantitative Political Science*
Gary King, New York University
This article identifies a set of serious theoretical mistakes appearing with troublingly high frequency throughout the quantitative political science literature. These mistakes are all based on faulty statistical theory or on erroneous statistical analysis. Through algebraic and interpretive proofs, some of the most commonly made mistakes are explicated and illustrated. The theoretical problem underlying each is highlighted, and suggested solutions are provided throughout. It is argued that closer attention to these problems and solutions will result in more reliable quantitative analyses and more useful theoretical contributions.
One of the most glaring problems with much quantitative political science is its uneven sophistication and quality. Mistakes are often made but rarely noticed. In journal submissions, conference presentations, and student papers, problems occur with even more frequency. Having observed this situation for a few years, I noticed several patterns. First, the same mistakes are being made or "invented" over and over. Second, to refer a substantively orientated political scientist to an article in Econometrica, The Journal of the American Statistical Association, or even Political Methodology is to give advice that either is not helpful or is not followed. These problems are more than technical flaws; they often represent important theoretical and conceptual misunderstandings.¹ However, in most cases, there are relatively simple solutions that can reduce or eliminate bias and other statistical problems, improve conceptualization, make the analysis easier to interpret, and make the results more general.
In order to address these concerns, this paper presents proofs and illustrations of some of the most common statistical mistakes in the political science literature, along with theoretical arguments and suggested
* An earlier version of this article first appeared at the annual Political Science Methodology Society conference, Berkeley, California, July, 1985. I appreciate the comments from the participants at that meeting, particularly those of Christopher Achen and Nathaniel Beck. Thanks also to my colleagues at New York University, particularly Larry Mead, Bertell Ollman, and Paul Zarowin. Arthur Goldberger, Herbert M. Kritzer, Ann R. McCann, Charles M. Pearson, Lyn Ragsdale, the editors, and the anonymous reviewers were also very helpful.
¹ An example of a minor technical mistake is using ordinal level independent variables with statistics that assume interval level data. I refer to this as "minor" because it usually (although not always) has little substantive consequence and because it does not represent a conceptual misunderstanding.
corrections. It specifically omits problems with the newest and fanciest statistical techniques for two reasons. First, the problems considered below form the theoretical and statistical foundation to the more sophisticated methodologies; finding and filling cracks in the foundation should logically and chronologically precede the painting of shingles and shutters. Second, the great variety of newer techniques are being used by relatively few political scientists; thus, any criticism of the new techniques will apply only to a small audience. Although important, I will leave the newer techniques for a future paper.
For each quantitative problem, I describe (1) the mistake, (2) the proof, and (3) the interpretation. The proofs, appearing in footnotes or appendices when excessively technical, are formal versions of, as well as algebraic or numerical evidence for, the assertions made in my discussion of the mistake. Emphasis here is on the intuitive, so generality is often sacrificed in order to improve conceptual understanding. The final section includes a brief summary and gives implications of mistakes in the context of proposed solutions. Some sections are too brief to be divided into this triad and are therefore combined. This sort of methodological retrospective has been done in other disciplines, but although we can learn from some of these, most do not address problems specific enough to political science research. (See, for example, Leamer, 1983a; Smith, 1983; Friedman and Phillips, 1981; and Hendry, 1980; Gurel, 1968).²
Over three decades ago, Darrell Huff (1954) explained, in a book by the same name, How to Lie With Statistics. Because of the systematic precision required, we should realize by now that it is a lot harder (knowingly or not) to lie (and get away with it) with statistics than without them.
Regression on Residuals
The Mistake. Suppose that $y$ were regressed on two sets of independent variables $X_1$ and $X_2$.³ The coefficients to be estimated are in the parameter vectors $\beta_1$ and $\beta_2$ in model 1:
$$E(y \mid X_1, X_2) = X_1\beta_1 + X_2\beta_2 \tag{1}$$
The standard and appropriate way to estimate $\beta_1$ and $\beta_2$ in model 1 is by running a multiple regression of $y$ on $X_1$ and $X_2$. The result
$$y = X_1 b_1 + X_2 b_2 + e \tag{2}$$
² I do not cite every methodologically flawed political science work in this paper because the purpose here is to improve future research and to facilitate critical reading of all research. There is little gained by berating those on whose research we are trying to build.
³ The word "regressed" is sometimes misused. Reading a regression equation from left to right, we say, "the dependent variable is regressed on the independent variables." In the text, $y$ is the dependent variable; $X_1$ and $X_2$ each represent a set of several independent variables.
is the least squares (LS) estimator. The sample estimates in equation 2 are used to infer to the population parameters in equation 1.
Now consider an (incorrect) alternative procedure, called here the regression on residuals (ROR) estimator. This is a method of estimating $\beta_1$ and $\beta_2$ often "invented" by first regressing $y$ on $X_1$, resulting in this equation:
$$y = X_1 b_1^* + e_1 \tag{3}$$
where $b_1^*$ is the first ROR estimator. We then regress $e_1$, the residuals, on the second set of explanatory variables, $X_2$, yielding
$$e_1 = X_2 b_2^* + e_2 \tag{4}$$
where $b_2^*$ is the second ROR estimator.
The mistaken belief is that $b_2^*$ from the second regression in equation 4 is equal to $b_2$ from equation 2; that is, since we have "controlled" for $X_1$ in equation 3, the result is the same as if we had originally computed equation 2. As is demonstrated in the proof appearing in appendix A, this is not true.
The ROR estimator $b_1^*$ in equation 3 is a biased estimate of $\beta_1$, since the equation does not control for $X_2$. This is the well-known omitted variables bias.⁴ Since $e_1$ (the residuals from equation 3 and the dependent variable in equation 4) is calculated from the biased ROR estimator $b_1^*$, it too is biased. Thus, it follows that $b_2^*$ is also biased, since it is calculated from the regression of the biased $e_1$ on the second set of explanatory variables $X_2$.⁵
The Interpretation. Except for two very special cases, the ROR estimator is not the same as the ordinary least squares estimator and by itself has no useful interpretation. $b_2^*$ is also a biased estimate of $\beta_2$ in model 1. In order to estimate $\beta_1$ and $\beta_2$ correctly in model 1, both sets of variables $X_1$ and $X_2$ should be put in the regression simultaneously. This gives an estimate ($b_1$) of the influence of $X_1$ on $y$ (controlling for $X_2$), and an estimate ($b_2$) of $X_2$ on $y$ (controlling for $X_1$).
An implication of this result is that one should not make too much of any interpretation of the residuals from a regression analysis. If it appears from an analysis of the residuals that some variable $X_3$ is missing, then $X_3$ may be missing, but it is not possible to draw fair conclusions about the
⁴ The bias does not occur when either $\beta_2 = 0$ or $X_1$ and $X_2$ are uncorrelated or both.
⁵ Sometimes this process is continued: The second set of residuals $e_2$ is regressed on another set of explanatory variables, $X_3$, producing another ROR estimator and another set of residuals. This process has been extended to many stages, but I only consider the first two in the text. In the multi-stage ROR estimator, the bias is confounded even further.
influence of $X_3$ on $y$ unless $X_3$ were actually measured and the full equation were estimated.
An example is Achen's (1979) result that "Normal Vote" calculations are inconsistent: the Normal Vote was determined by a two-step process, roughly analogous to using the ROR estimator.
In the statistical literature, the ROR estimation procedure is called "stepwise least squares." However, "stepwise regression" is very different from this procedure, although it is no less problematic.⁶
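The gap between the ROR estimator and multiple regression can be seen in a small simulation. The sketch below is only an illustration under assumed values: the coefficients 1.0 and 2.0, the 0.8 dependence between the regressors, the sample size, and the helper name `ols` are all hypothetical choices, not taken from the article.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)             # x2 is correlated with x1
y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)   # assumed beta_1 = 1, beta_2 = 2

# Equation 2: multiple regression of y on x1 and x2 together.
b1, b2 = ols(np.column_stack([x1, x2]), y)

# Equation 3: regress y on x1 alone (omits x2).
b1_star = ols(x1[:, None], y)[0]
e1 = y - b1_star * x1

# Equation 4: regress the first-stage residuals on x2.
b2_star = ols(x2[:, None], e1)[0]

print(f"multiple regression: b1 = {b1:.2f}, b2 = {b2:.2f}")
print(f"ROR estimator:       b1* = {b1_star:.2f}, b2* = {b2_star:.2f}")
```

Under these assumptions the multiple regression recovers values near the assumed coefficients, while both ROR estimates drift away from them. Setting the 0.8 dependence to zero makes the regressors uncorrelated, which is one of the special cases noted in footnote 4.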
The Race of the Variables
In this section, the use of standardized coefficients ("beta weights"), correlation coefficients (Pearson's correlation), and $R^2$ ("the coefficient of determination") are challenged. In most practical political science situations, it makes little sense to use these statistics. They do not measure what they appear to; they substitute statistical jargon for political meaning; they can be highly misleading; and in nearly all situations, there are better ways to proceed.
The Race (1): Standardized Fruit
The Mistake. Apples, Oranges, and Perceptions. Imagine a situation where a researcher wanted to explain $y$, the number of visits to the doctor per year. The explanatory variables were $X_1$, the number of apples eaten per week, and $X_2$, the number of oranges eaten per week. The multiple regression equation was then estimated to be:
$$y = 10 - 1.5X_1 - 0.25X_2 \tag{5}$$
⁶ Stepwise regression (which has been called "unwise regression" [Leamer, 1985] or might be called a "Minimum Logic Estimator") allows computer algorithms to replace logical decision processes in selecting variables for a regression analysis. There is nothing wrong with fitting many versions of the same model to analyze for sensitivity. After all, the goal of learning from data is as noble as the goal of using data to confirm a priori hypotheses. However, some a priori knowledge, or at least some logic, always exists to make selections better than an atheoretical computer algorithm. Edward Leamer (1983b, p. 320) has noted, "Economists have avoided stepwise methods because they do not think nature is pleasant enough to guarantee orthogonal explanatory variables, and they realize that, if the true model does not have such a favorable design, then omitting correlated variables can have an obvious and disastrous effect on the estimates of the parameters." At the very least, stepwise regression, even if occasionally useful for special purposes, need not be presented in published work (see Lewis-Beck, 1978). The use of stepwise regression has caused an additional curious mistake. It is often said that the order in which variables are entered into a regression equation influences the values of the coefficients. A cursory look at the equations used in the estimation (or at a sample computer run) will show that this is wrong. What does change, depending upon the order in which variables are entered, is the marginal increase in the $R^2$ statistic.
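The last claim in footnote 6 can be checked numerically. The following is a minimal sketch on simulated data; the coefficients, the 0.7 dependence between regressors, the sample size, and the helper name `fit` are assumptions made only for illustration.

```python
import numpy as np

def fit(X, y):
    """Return slope coefficients and R^2 from a regression with a constant."""
    Xc = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(Xc, y, rcond=None)[0]
    resid = y - Xc @ b
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return b[1:], r2

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)             # correlated regressors
y = 1 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

# Same two regressors, entered in either order: identical coefficients.
b_12, r2_full = fit(np.column_stack([x1, x2]), y)
b_21, _ = fit(np.column_stack([x2, x1]), y)

# The marginal increase in R^2 credited to x2 depends on when it enters.
_, r2_x1_only = fit(x1[:, None], y)
_, r2_x2_only = fit(x2[:, None], y)
gain_x2_entered_first = r2_x2_only
gain_x2_entered_second = r2_full - r2_x1_only

print("coefficients (x1, x2):", b_12, "and reordered:", b_21[::-1])
print(f"R^2 gain for x2 entered first:  {gain_x2_entered_first:.3f}")
print(f"R^2 gain for x2 entered second: {gain_x2_entered_second:.3f}")
```

The coefficients agree regardless of column order, while the share of $R^2$ attributed to the second regressor changes with its position.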
For every additional apple one eats per week, the average number of visits to the doctor per year decreases by one and a half. For each additional orange one eats, they decrease by one quarter of a visit.
This hypothetical researcher now would like to make a statement about the comparative worth of apples and oranges in reducing doctor visits. He then asks the resident political science methodologist whether she can help him compare apples and oranges. The methodologist says that the answer depends upon the researcher stating his question more precisely. If the researcher means: "I have only enough money for one apple or one orange, and I want to know which will make me healthier," then the answer is probably the apple. But suppose an apple costs 50 cents, while an orange costs only five cents. In this case, the researcher might ask, "What is the best use of my last dollar?" Here the decision would have to be in favor of the orange: For one dollar spent on two apples, doctor visits would decrease by about three, whereas the same dollar spent on 20 oranges would decrease doctor visits by five on average.
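The two comparisons in this passage, per piece of fruit and per dollar, amount to a change of units. A small sketch of the arithmetic follows, using the coefficients of equation 5 and the prices given in the text; the function name is hypothetical.

```python
# Coefficients from equation 5: change in yearly doctor visits per unit eaten weekly.
effect = {"apple": -1.5, "orange": -0.25}
# Prices in dollars, as stated in the text.
price = {"apple": 0.50, "orange": 0.05}

def visits_change_per_budget(fruit, budget=1.0):
    """Change in doctor visits when the whole budget is spent on one fruit."""
    units_bought = budget / price[fruit]
    return units_bought * effect[fruit]

per_dollar = {fruit: visits_change_per_budget(fruit) for fruit in effect}

for fruit in effect:
    print(f"{fruit}: {effect[fruit]:+.2f} visits per piece, "
          f"{per_dollar[fruit]:+.2f} visits per dollar")
```

The ranking of the two fruits reverses once the unit of comparison changes from pieces to dollars, which is why the question must be stated precisely before any comparison is made.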
Assuming the question is stated precisely enough, these comparisons make some sense. But they make sense only because there is a common unit of measurement: a piece of fruit or an amount of money. Suppose then that the researcher told the methodologist that he had torn off the computer printout just prior to the last coefficient estimate. The real equation, he explained, includes $X_3$, the respondent's perception of doctors as beneficial, measured on a scale ranging from 1 (not beneficial) to 10 (very beneficial). The estimated equation should have appeared as this:
$$y = 10 - 1.5X_1 - 0.25X_2 + 2X_3 \tag{6}$$
The researcher now asks whether this means that perceptions are "more important" than apples. After all, he says, 2 is greater than 1.5. Any methodologist worth her 8087 chip would object to this, she asserts. In fact, were one to take this comparison to its logical extreme, one would conclude that perceiving doctors as more detrimental is more health producing than eating an apple. Although both regressors seek to explain the same dependent variable, they are neither measured on, nor can they be converted to, meaningfully common units of measurement.
This is precisely the point: Only when explanatory variables are on meaningfully common units of measurement is there a chance of comparison. If there is no common unit of measurement, there is no chance of meaningful comparison.
However, there is another sense in which even "common-unit" comparisons are unfair. The apple coefficient, for example, represents the effect of apples (holding constant the influence of oranges and perceptions). The estimated coefficient for oranges has a different set of control variables
(since it includes apples and not oranges). This may make a comparison between apples and oranges more difficult, if not logically impossible.
The Mistake Continued. Standardized Fruit. Convinced about not comparing unstandardized coefficients, our hypothetical researcher proposes using the standardized coefficients on his computer printout. The methodologist retorts that, if it is of little use to compare apples and perceptions, then it is of considerably less use to compare standardized apples and standardized perceptions. Standardization does not add information. If there were no basis for comparison prior to standardization, then there is no basis for comparison after standardization.
A relatively common rebuttal is that for explanatory variables with unclear or difficult-to-understand units of measurement, standardized coefficients should increase interpretability. The problem is that if the original data were meaningless, then the standardized regression coefficients are precisely as meaningless; if standardized coefficients do not add information, they certainly do not add meaning. "To replace the unmeasurable by the unmeaningful is not progress" (Achen, 1977, p. 806).
Using a superscript "s" to denote standardized variables, I present the results for our hypothetical case:⁷
$$y^s = -0.9X_1^s - 0.2X_2^s + 0.5X_3^s \tag{7}$$
We now must interpret equation 7 to mean, for example, that as we eat one additional standard deviation of apples, the number of visits to the doctor decreases by nine tenths of a standard deviation, not a very appealing conceptualization.
Three observations: First, standardizing makes the coefficients substantially more difficult to interpret. Second, standardization still does not enable us to compare this first effect to the one-half standard deviation increase in doctor visits resulting from a one standard deviation increase in perceptions of doctors. Third, and most serious, while the original coefficients are estimates of the relationships between the respective explanatory variables and the dependent variable (controlling for the other explanatory variables), the standardized variables are measures of this relationship as well as of the variance of the independent variable. Since researchers are typically interested in measuring only the relationship, or at least interested in the two separately,
⁷ There are two methods that can produce the same standardized coefficients: (1) standardize each of the original variables (subtract the sample mean and divide by the sample standard deviation) and run a regression on these standardized variables; or (2) run a regression and multiply each unstandardized coefficient by the ratio of the standard deviation of the respective independent variable to the standard deviation of the dependent variable.
it makes little sense to use standardized variables. A simple numerical proof will demonstrate this point.⁸
The Proof. Imagine a simple experiment where only three observations on one dependent and one independent variable are taken. The observations are $y' = (5, 5, 6)$ and $X' = (2, 4, 4)$. Calculated from these three observations with a constant term included, the regression is:
$$y = 4.5 + 0.25X \tag{8}$$
and the standardized coefficients are:
$$y^s = 0.50X^s \tag{9}$$
Suppose further that another year went by, and another data point was collected on $y$ (9.5) and on $X$ (20). Because this random draw worked out well, the unstandardized coefficients in equation 8 do not change at all with the introduction of this additional observation. However, the new observation increases the sample standard deviation of $X$ from 1.16 to 8.39 (which is what one would generally expect as $n$ increases). Although this did not change the original coefficients in equation 8, the standardized coefficient nearly doubles in the four observation case (compare equations 9 and 10):
$$y^s = 0.97X^s \tag{10}$$
Under situations with different variances of the independent variables but identical relationships, the standardized coefficient is constrained only to have the same sign as the unstandardized coefficient. Standardized coefficients may be either under- or over-estimates. This intuitive proof extends directly to situations with multiple independent variables.
The Interpretation. In summary, standardized coefficients are in general (1) more difficult to interpret, (2) do not add any information that may help to compare effects from different explanatory variables, and (3) may add seriously misleading information. The original, unstandardized coefficients are meaningful and are not subject to these problems, although they generally cannot be compared for importance.
There are two important qualifications to these points. First, if one must include a variable that is difficult to interpret as a control, then perhaps standardizing just this variable would capitalize on the standardized coefficient's simpler descriptive properties (Blalock, 1967a). This partial standardization procedure is certainly better than standardizing all
⁸ Kim and Mueller (1976) also show that changes in the covariances of the included variables and of the variances of the included and excluded variables in a system of equations also affect the standardized (but not the unstandardized) coefficients.
the variables.⁹ Second, some argue that standardized measures seem to be the more natural scale for variables like test scores.¹⁰ For example, Hargens (1976) argues that standardized coefficients can sometimes be structural parameters. Although Kim and Ferree (1981) successfully refute most of this argument on theoretical grounds, there is one sense in which it may be correct for some studies. To make this point, it is useful to consider a very different type of standardization commonly used and generally accepted in economic studies of time series data.
The raw consumer price index (CPI_t) is not usually included in regression models for two reasons. First, the series is nonstationary and may, therefore, lead to spurious findings. Second, for example, an increase in the price of a typical market basket of food from $10.00 to $11.00 is likely to have more of an influence on any dependent variable than if the increase were from $100.00 to $101.00. For both of these reasons, the proportional change in CPI_t is used; this "standardized" measure is commonly called the inflation rate.¹¹ In this case, the standardized variable is usually considered more natural and substantively meaningful than the "unstandardized" CPI_t.
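The market basket example can be worked through with the two inflation measures described in the accompanying footnote: the proportional change and the difference of logs. The sketch below is illustrative only; the function names are hypothetical and the two-point "series" are simply the dollar figures from the text.

```python
import math

def proportional_change(series):
    """(CPI_t - CPI_{t-1}) / CPI_{t-1} for each consecutive pair."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]

def log_difference(series):
    """log(CPI_t) - log(CPI_{t-1}) for each consecutive pair."""
    return [math.log(cur) - math.log(prev) for prev, cur in zip(series, series[1:])]

low_basket = [10.00, 11.00]     # a one dollar rise from $10.00
high_basket = [100.00, 101.00]  # the same one dollar rise from $100.00

for name, series in (("low", low_basket), ("high", high_basket)):
    print(f"{name} basket: raw change = {series[1] - series[0]:.2f}, "
          f"proportional = {proportional_change(series)[0]:.4f}, "
          f"log difference = {log_difference(series)[0]:.4f}")
```

The raw change is the same in both cases, but the proportional change differs by a factor of ten, and the log difference stays close to the proportional change when the change is small.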
In a similar manner, subtracting the sample mean from a variable under analysis and dividing it by the sample standard deviation may be the more natural measure for some concepts, particularly for some psychological scales and attitudinal measures. In part, it may even be a matter of personal taste and custom (Blalock, 1967b). However, decisions about whether each variable is to be standardized should be made and justified on an individual basis rather than "a habitual reliance on the standardized coefficients" (Kim and Ferree, 1981, p. 207). Just as we should not routinely calculate proportional changes for every variable in a time series analysis, variables in cross-sectional analyses should not be automatically standardized.
A more important and final point is that most times scholars are not interested in finding out which variable will win the race. Most often it is theoretically "good enough" to say that even after controlling for a set of
⁹ If the dependent variable is too difficult to understand, then I would give up on the regression, collect better data, or try to figure out a more meaningful interpretation.
¹⁰ As an example of the problem this sometimes causes, consider the Educational Testing Service's (ETS's) standardized Graduate Record Examination (GRE). University admission offices across the country make important decisions based in part on small differences in scores on this exam, whereas ETS reports that the GRE can only correctly distinguish students who are more than one hundred points apart (on a scale from 200 to 800) two out of three times (i.e., a 66% confidence interval!). Perhaps if this score were not standardized or if there were a more meaningful substantive interpretation, we would be better prepared to use GREs for admission decisions.
¹¹ The most intuitive way to calculate the inflation rate is as (CPI_t - CPI_{t-1})/CPI_{t-1}, but a nearly exact measure, which for technical reasons is actually better and is used most everywhere, is log(CPI_t) - log(CPI_{t-1}). See King and Benjamin (1985) for a political application.
variables (i.e., plausible rival hypotheses, possible confounding influences), the variable in which we are interested still seems to have an important influence on the dependent variable. This is precisely the empirical evidence for which we search to substantiate or refute our theoretical expectations. Usually, little political understanding is gained by hypothesizing a winner in a race of the variables.
The Race (2): The Correlation Problem
The Mistake. Many great things are attributed to the simple correlation coefficient. It purportedly needs to assume covariation, while regression must assume causation. The specific statistical assumptions are thought to be less severe than for regression. It is said to be a better guide when one's theory argues only that "the variables generally go together" rather than there being a "one-to-one, cause and effect relationship." It also supposedly makes results easier to interpret.
Each of these statements is false. There are several approaches to describing why these common arguments are invalid (Tufte, 1974). Two are most useful for present purposes.
The Proof. Consider, first, the case of a standardized coefficient on one independent and one dependent variable. Through some simple algebraic manipulation, it can be shown that this standardized coefficient is equal to the correlation coefficient.¹² Thus, every argument that applies to the standardized regression coefficient applies also to the correlation coefficient.
¹² With no loss of generality, assume each variable has a mean of zero. This proof demonstrates that the standardized coefficient ($b^s$) for one independent and one dependent variable is equal to the correlation coefficient ($r$):
$$b^s = b\,\frac{s_x}{s_y} = (x'x)^{-1}x'y\,\sqrt{\frac{\sum x_i^2}{\sum y_i^2}} = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2 \sum y_i^2}} = r_{xy}$$
For the case of multiple independent variables, standardized coefficients are not the same as correlation coefficients or partial correlation coefficients. The results presented here therefore are not completely general, but the substantive conclusions and recommendations still apply.
Next, consider the population parameters to which the sample correlation attempts to infer. The most likely relevant probability distribution is the bivariate normal, which has five parameters: the marginal mean and variance for each variable and $\rho$, the population correlation coefficient. The problem is that if $r$ were considered an estimator of $\rho$ we would need to assume that $x$ and $y$ were drawn from a bivariate normal distribution. Since the marginal distributions of a bivariate normal distribution are normally distributed, we would need to make all the assumptions of regression and, in addition, the assumptions that $X$ is normally distributed and that $x$ and $y$ are jointly normally distributed. In many political science examples, this is unreasonable. For example, any use of a dichotomous independent variable (male/female, agree/disagree, etc.) violates the assumption. Moreover, one can use regression, make fewer assumptions, and get more reasonable and interpretable results.
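The equality between the standardized coefficient and the correlation coefficient in the one-regressor case can be verified on any data set. The sketch below uses simulated data; the intercept, slope, noise scale, and sample size are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 3.0 - 2.0 * x + rng.normal(scale=2.0, size=200)   # assumed relationship

# Work with deviations from the means, as in the proof.
xd = x - x.mean()
yd = y - y.mean()

b = (xd @ yd) / (xd @ xd)                    # unstandardized slope
b_std = b * np.sqrt((xd @ xd) / (yd @ yd))   # slope times s_x / s_y
r = np.corrcoef(x, y)[0, 1]                  # Pearson correlation

print(f"slope = {b:.3f}, standardized slope = {b_std:.3f}, r = {r:.3f}")
```

The standardized slope and $r$ coincide up to floating point error, so any criticism of one carries over to the other in the bivariate case.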
The Interpretation. All of the problems attributed to standardized coefficients apply to correlation coefficients. Furthermore, there is nothing in statistical theory that attributes causal assumptions to regression coefficients; regression is simply a sample estimate of a (population) conditional expected value. The assumptions are about the conditional probability distribution, not about the influence of $x$ on $y$. Nothing can or should stop an applied researcher from stating that $x$ causes $y$, but it is crucial to understand that statistical analysis does not usually provide evidence with which to evaluate this assertion (see Granger (1969) and Sims (1980) for more direct attempts).
There is also nothing that attributes causal assumptions to the correlation coefficient. Correlations are sample estimates of the population parameter $\rho$ from the bivariate normal distribution. Thus, arguments about causality, association, and correlation are not required for either regression or correlation and do not form a basis for choosing between the two. Furthermore, as a result of the distributional requirements, the assumptions for correlation coefficients are far more demanding than for regression analysis. Unstandardized regression coefficients are almost always the best option.
The Race (3): Coefficient of Determination?
$R^2$ is often called the "coefficient of determination." The result (or cause) of this unfortunate terminology is that the $R^2$ statistic is sometimes interpreted as a measure of the influence of $X$ on $y$. Others consider it to be a measure of the fit between the statistical model and the true model. A high $R^2$ is considered to be proof that the correct model has been specified or that the theory being tested is correct. A higher $R^2$ in one model is taken to mean that that model is better.
All these interpretations are wrong. $R^2$ is a measure of the spread of points around a regression line, and it is a poor measure of even that (Achen, 1982). Taking all variables as deviations from their means, $R^2$ can be defined as the sum of all $\hat{y}^2$ (the sum of squares due to the regression) divided by the sum of all $y^2$ (the sum of squares total):
$$R^2 = \frac{\hat{y}'\hat{y}}{y'y} = \frac{b'X'Xb}{y'y} = \frac{\left(\sum x_i y_i\right)^2}{\sum x_i^2 \sum y_i^2}$$
where the last equation moves from general notation to that for one independent variable.
Note, however, that this is precisely the square of the correlation coefficient (or the square of the standardized regression coefficient given in footnote 12). Therefore all of the criticisms of the correlation and standardized regression coefficients apply equally to the $R^2$ statistic.
Worse, however, is that there is no statistical theory behind the $R^2$ statistic. Thus, $R^2$ is not an estimator because there exists no relevant population parameter. All calculated values of $R^2$ refer only to the particular sample from which they come. This is clear from the standardized coefficient example in preceding paragraphs, but it is more graphically demonstrated in two $(x, y)$ plots by Achen (1977, 808). In the first plot $R^2 = 0.2$. In the second plot, the fit around the regression line is the same, but the variance of $X$ is larger; here $R^2 = 0.5$.
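The dependence of $R^2$ on the variance of $X$ can be illustrated with simulated data. The sketch below does not use Achen's data: the slope of 1, the unit-variance noise, and the two spreads of $X$ are assumptions chosen so that the implied $R^2$ values fall near 0.2 and 0.5, and `fit_line` is a hypothetical helper.

```python
import numpy as np

def fit_line(x, y):
    """Return slope, residual standard deviation, and R^2 for y on x."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return b[1], resid.std(ddof=2), r2

rng = np.random.default_rng(3)
n = 2000
z = rng.normal(size=n)
noise = rng.normal(size=n)       # identical disturbances in both samples

x_narrow = 0.5 * z               # standard deviation of X about 0.5
x_wide = 1.0 * z                 # standard deviation of X about 1.0
y_narrow = 1 + x_narrow + noise  # same slope (1) in both samples
y_wide = 1 + x_wide + noise

slope_n, spread_n, r2_narrow = fit_line(x_narrow, y_narrow)
slope_w, spread_w, r2_wide = fit_line(x_wide, y_wide)

print(f"narrow X: slope = {slope_n:.2f}, residual sd = {spread_n:.2f}, R^2 = {r2_narrow:.2f}")
print(f"wide X:   slope = {slope_w:.2f}, residual sd = {spread_w:.2f}, R^2 = {r2_wide:.2f}")
```

The slope and the spread of points around the line match across the two samples; only the variance of $X$, and therefore $R^2$, differs.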
Ad hoc arguments for $R^2$ are often made in the form of the researcher's questions and the methodologist's answers:
Q: How can I tell how strongly my independent variables influence my dependent variable without $R^2$?
A: Interpret your unstandardized regression coefficients.
Q: But how can I tell how good these coefficients are?
A: The standard errors are estimates of the variability of your estimates across samples. If they are small relative to your coefficients, then you should be more confident that similar results would have emerged even if a sample of 1500 different people were interviewed.
Q: But how can I tell how good the regression is as a whole?
A: If you want to test the hypothesis that all your coefficients are zero, use the F-test. More complex hypotheses about different theoretically relevant linear combinations of coefficients (e.g., that the first three coefficients are jointly zero, or that the next two add to 1.0) can also be tested. $R^2$ is associated with, but is a poor substitute for, test statistics.
Q: O.K. I guess I really mean to ask: How can I assess the spread of the points around my regression line?
A: There is nothing intrinsically or politically interesting in the spread of points around a regression line. If you are interested in the precision with which you can confidently make inferences, then look at your standard errors. Alternatively, you might be interested in the precision of within-sample and out-of-sample forecasts. Forecasts correspond to the regression line (or to the extrapolated line for out-of-sample forecasts), given specified
This content downloaded on Thu, 31 Jan 2013 04:25:06 AM
All use subject to JSTOR Terms and Conditions
HOW NOT TO LIE WITH STATISTICS
Q:
A:
Q:
A:
677
valuesofyourexplanatory
variables.It is perfectly
reasonableto
estimateand thenmakeprobabilistic
statements
abouttheforecastsor evento calculateforecast
confidence
intervals.
Surelyif
theobservedpointspreadis large,theconfidenceintervalwill
also be large.However,R2 is also a poor substitute
forgoing
to confidence
directly
intervals.
Butdo youreallywantme to stopusingR 2? Afterall, myR 2 is
higher
thanthatofall myfriends
and higher
thanthosein all the
articlesin thelastissueoftheAPSR!
A: If your goal is to get a big $R^2$, then your goal is not the same as that for which regression analysis was designed. The purpose of regression analysis and all of parametric statistical analyses is to estimate interesting population parameters (regression coefficients in this case). The best regression model usually has an $R^2$ that is lower than could be obtained otherwise.

If the goal is just to get a big $R^2$, then even though that is unlikely to be relevant to any political science research question, here is some "advice": Include independent variables that are very similar to the dependent variable. The "best" choice is the dependent variable; your $R^2$ will be 1.0. Lagged values of y usually do quite well. In fact, the more right-hand-side variables included the bigger your $R^2$ will get.¹³ Another choice is to add variables or selectively add or delete observations in order to increase the variance of the independent variables.

These strategies will increase your $R^2$, but they will add nothing to your analysis, nothing to your understanding of political phenomena, and nothing useful in explaining your results to others. The general strategy of analysis will likely destroy most of the desirable properties of regression analysis.
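The "advice" above can be checked numerically. The following is a minimal sketch, not part of the original article: the data are simulated, and the names `r_squared` and `r2_path` are hypothetical helpers introduced only for illustration. It adds pure-noise regressors one at a time and records $R^2$ after each addition.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from a least squares fit of y on X (X already contains a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss

rng = np.random.default_rng(0)
n = 30                                   # hypothetical sample size
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)   # simulated data, one real regressor

X = np.column_stack([np.ones(n), x])
r2_path = [r_squared(X, y)]

# Append regressors that are pure noise until there are as many
# columns as observations; none of them has anything to do with y.
while X.shape[1] < n:
    X = np.column_stack([X, rng.normal(size=n)])
    r2_path.append(r_squared(X, y))

print("R^2 with 1 regressor:        ", round(r2_path[0], 3))
print("R^2 with n - 1 regressors:   ", round(r2_path[-1], 3))
```

In a run of this sketch, $R^2$ should never fall as columns are added and should reach essentially 1.0 once the number of columns equals the number of observations, even though every added variable is noise.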
Q: Is there anything useful about $R^2$?
A: Yes. There is at least one direct use and several indirect uses of $R^2$. You can directly apply and evaluate $R^2$ when comparing two equations with different explanatory variables and identical dependent variables. The measure is, in this case, a convenient goodness-of-fit statistic, providing a rough way to assess model specification and sensitivity. For any one equation, $R^2$ can be considered a measure of the proportional reduction in error from the null model (with no explanatory variables) to the current model. As such, it is a measure of the "proportion of variance explained," and, although this interpretation is commonly used, it is not clear how this interpretation adds meaning to political analyses.

¹³ It is possible, but unlikely, for the $R^2$ to stay the same; in any case, it will never decrease as more variables are added. More generally, as the number of variables approaches the number of observations, $R^2$ approaches 1.0.
There are also a variety of indirect "uses" for $R^2$. It is often true that a high $R^2$ is accompanied by small standard errors, large coefficients, and narrow confidence intervals. Thus, a higher $R^2$ is generally good news; this is the reason why, ceteris paribus, $R^2$ does not always mislead. However, most of the useful information in $R^2$ is already available in other commonly reported statistics. Furthermore, these other statistics are more accurate measures: They can directly answer theoretically interesting questions. $R^2$ cannot. Of course, when one reads someone else's work, $R^2$ may be a useful interpretive substitute if some of the more accurate measures were not calculated. Consequently, although the odds of being misled are substantially higher with $R^2$ than with these other statistics, it is just as well that $R^2$ is routinely reported. It is the use of this information that should be changed.
Confusion with Dichotomous Variables
In this section, I discuss common misuses of dichotomous variables. First, I consider the relationship between analysis of variance and regression in handling dichotomous independent variables. Then, I present common mistakes in using dichotomous dependent variables. Finally, I attempt to alleviate confusion about using dichotomous variables and mistaking dependent variables for independent variables in factor analysis.
(1) Dichotomous Independent Variables
The Mistake. Consider a case where there are two populations with means $\mu_1$ and $\mu_2$, from which random samples of sizes $n_1$ and $n_2$ are taken. The populations could be male and female, agree and disagree, Republican and Democrat or anything that could be represented by a meaningful dichotomous explanatory variable. A common problem is to test the hypothesis that the means are equal ($\mu_1 - \mu_2 = 0$). In this case, the first thing we do is calculate the means, $\bar{y}_1$ and $\bar{y}_2$, of the two samples.
There are three approaches to this problem: (1) a difference in means test, (2) an analysis of variance (ANOVA) model, and (3) a regression model. Justifications for choosing one of these models over the others are often given. The difference in means test is sometimes seen as a quick way to get a feel for the data. ANOVA and the difference in means have been credited with requiring less restrictive assumptions about the data. Some think ANOVA can be safely used with dichotomous dependent variables; others say that ANOVA and not regression allows dichotomous independent variables.

These assertions are false. In fact, the three techniques are intimately related: conceptually, statistically, and even algebraically.
The simplest but least general of the three is the difference in means test. Let $y$ be a vector of observations from both populations and $X$ be an indicator variable. Let the value for the first population be $-1$ and the value for the second be $1$. (These values are arbitrary choices that make later computation easier.) Then the model is

$$
E(y \mid X = -1) = \mu_1, \qquad E(y \mid X = 1) = \mu_2 \tag{11}
$$

The obvious sample statistic is the difference in the sample means, which, after dividing by the standard error of this difference, follows a t-distribution.¹⁴
Analysis of variance (ANOVA) is a somewhat more general way to deal with this problem. The theoretical model is $E(y) = \mu + \delta_i$, where $\mu$ is the grand mean of both populations, $i = 1, \ldots, G$, where $G$ is the number of populations, and $\delta_i$ is the deviation from the grand mean for population $i$. We impose the restriction that $\sum_{i=1}^{G} \delta_i = 0$. In the special case of $G = 2$, $\delta_1 = -\delta_2$. The model can be restated for each population as

$$
E(y \mid X = -1) = \mu + \delta_1, \qquad E(y \mid X = 1) = \mu + \delta_2 = \mu - \delta_1 \tag{12}
$$

The sample estimate of $\mu$ is $\bar{y}$ and of $\delta_i$ is $d_i$. By definition, $\bar{y} + d_1 = \bar{y}_1$ and $\bar{y} + d_2 = \bar{y}_2$. These means are, of course, identical to those that estimate model 11, but $d_1$ and $d_2$ represent deviations from the sample "grand" mean. The representation is slightly different, but the interpretation should be exactly the same. The test statistic for the hypothesis that $\delta_1 = \delta_2 = 0$ follows the F-distribution, which is a trivial generalization of the t-distribution used for the difference in means test.¹⁵
The final and most general approach to this problem is with regression analysis (the general linear model). The model of interest here is $E(y \mid X) = \beta_0 + \beta_1 X$, with $X$ taking on the value $-1$ for the first population and $1$ for the second. The model is defined, for each population, as:

$$
E(y \mid X = -1) = \beta_0 - \beta_1, \qquad E(y \mid X = 1) = \beta_0 + \beta_1 \tag{13}
$$

The sample estimates of $\beta_0$ and $\beta_1$ are $b_0$ and $b_1$, respectively. Appendix B proves that $b_0$ and $b_1$ are close algebraic relatives of the grand mean ($\bar{y}$) and the deviations from that mean ($d_i$) from the ANOVA model. Two points are demonstrated in Appendix B. First, $b_0$ is shown not to be the grand mean except when $n_1 = n_2$. Second, $b_1$ is proven not to be the deviation from the grand mean ($d_i$), except when $n_1 = n_2$.

¹⁴ The choice for an estimate of the standard error depends upon whether the two samples are independent. I consider here only the case of independence, although there are straightforward generalizations to the case of "nonspherical" disturbances.

¹⁵ Squaring a variable with a t-distribution (df = k) yields a variable with an F-distribution (numerator df = 1, denominator df = k).
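This inequality between ANOVA and regression only denotes different ways of representing the same underlying relationships. There are no differences in assumptions or empirical interpretation. Note that in the regression model, deviations from the grand mean can be represented in terms of the parameter estimates. With $n = n_1 + n_2$, and with $\bar{y}_1 = b_0 - b_1$ and $\bar{y}_2 = b_0 + b_1$ from model 13:

$$
\begin{aligned}
\bar{y} - \bar{y}_1 &= b_0 + b_1\left[\frac{n_2 - n_1}{n}\right] - (b_0 - b_1) = \left[\frac{n_2 - n_1 + n}{n}\right] b_1 \\
\bar{y} - \bar{y}_2 &= b_0 + b_1\left[\frac{n_2 - n_1}{n}\right] - (b_0 + b_1) = \left[\frac{n_2 - n_1 - n}{n}\right] b_1
\end{aligned}
$$

Thus, for the special case of $n_1 = n_2$, $\bar{y} - \bar{y}_1 = b_1$ and $\bar{y} - \bar{y}_2 = -b_1$, so that $b_1$ is equal in magnitude to the ANOVA deviations $d_i$. When $n_1 \neq n_2$, we interpret $2b_1$ to be the estimated difference between the two population means. In fact, $2b_1$ is exactly the parameter estimate for the difference in means test.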
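The algebraic equivalence of the three approaches can be verified on simulated data. The sketch below is illustrative only: the sample sizes, means, and variable names are hypothetical and are not taken from the article. It computes the regression with the $-1/1$ coding, the pooled-variance difference in means test, and the one-way ANOVA F statistic by hand.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 12, 20                                  # hypothetical, unequal sample sizes
y1 = rng.normal(loc=2.0, scale=1.0, size=n1)     # sample from population 1
y2 = rng.normal(loc=3.0, scale=1.0, size=n2)     # sample from population 2

y = np.concatenate([y1, y2])
x = np.concatenate([-np.ones(n1), np.ones(n2)])  # -1 for population 1, +1 for population 2
n = n1 + n2

# (3) Regression of y on a constant and x.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1 = coef
resid = y - X @ coef
s2 = resid @ resid / (n - 2)
se_b1 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
t_reg = b1 / se_b1

# (1) Difference in means test with a pooled variance (independent samples).
sp2 = ((n1 - 1) * y1.var(ddof=1) + (n2 - 1) * y2.var(ddof=1)) / (n - 2)
t_means = (y2.mean() - y1.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# (2) One-way ANOVA F statistic with G = 2 groups.
grand = y.mean()
ss_between = n1 * (y1.mean() - grand) ** 2 + n2 * (y2.mean() - grand) ** 2
ss_within = np.sum((y1 - y1.mean()) ** 2) + np.sum((y2 - y2.mean()) ** 2)
f_anova = (ss_between / 1) / (ss_within / (n - 2))

print("2*b1 vs. difference in means:", 2 * b1, y2.mean() - y1.mean())
print("t (regression), t (means), sqrt(F):", t_reg, t_means, np.sqrt(f_anova))
```

Under these assumptions the regression t statistic on $b_1$, the difference in means t statistic, and the square root of the ANOVA F statistic should coincide up to rounding, which is the sense in which the three techniques carry the same information.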
Note that in none of these models should dichotomous independent variables be standardized. The consequence of such a calculation is to make the standardized coefficient dependent not only upon the variance of the independent variable (as is always the case) but also upon its mean, since the variance of a dichotomy is a function of its mean.
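The dependence of the variance on the mean can be written out for the $-1/1$ coding used here. This is a short sketch; the symbol $p = n_2/n$, the share of observations in the second population, is introduced only for illustration and does not appear in the original.

$$
\bar{x} = \frac{n_2 - n_1}{n} = 2p - 1, \qquad
\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2 = 1 - \bar{x}^2 = 4p(1 - p)
$$

Because every $x_i^2 = 1$, the variance is determined entirely by the mean. A standardized coefficient, which rescales $b_1$ by the standard deviation of $X$, would therefore change with the relative sizes of the two samples even if the difference between the group means stayed the same.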
The Interpretation. ANOVA, regression, and the difference of means test are all special cases of the general linear model. The assumptions required of one are required of the others as well. If there are dichotomous dependent variables, none of the techniques are appropriate. If there is a dichotomous independent variable, any one of the three will do. If, as is usually the case with political data, there are both discrete and continuous explanatory variables, then only regression will accommodate the research problem.
There are generalizations of ANOVA that accommodate mixed models like "analysis of covariance" (which is not to be confused with "analysis of covariance structures"). Since, for experimental researchers, ANOVA often seems a more conceptually appropriate model, and since its data requirements and resulting information are essentially equivalent to those of regression analysis, the choice between the two is mostly a matter of personal taste.
My view is that for most political science research, regression is a substantially more general model: It incorporates many types of ANOVA in one statistical model (and algebraic formula). Although specification issues apply to all three methods, they are usually only considered in a regression context. In addition, regression is also substantially easier to generalize in order to correct for nonspherical disturbances and other common problems. By comparison, more general ANOVA models can get quite messy when they exist. For this reason, many ANOVA computer programs actually do regression analyses and then transform the results into the ANOVA parameters for presentation.
The point is that for the standard analysis, all three models come from the same general form. Each model provides a different representation of exactly the same information, and correct specification is required of all three. When the analysis is more complicated, the regression model may prove more tractable.
(2) Dichotomous Dependent Variables
The Mistake. The mistake here is using dichotomous dependent variables in regression, ANOVA, or any other linear model. Doing this can yield predicted probabilities greater than one or less than zero, heteroskedasticity, inefficient estimates, biased standard errors, and useless test statistics. Of more importance is that a linear model applied to these data is of the wrong functional form; in other words, it is conceptually incorrect.
Consider, for example, the influence of family income on the probability of a child attending college (measured as a dichotomous, college/no college realization). Hypothesizing a linear relationship in this situation implies that an additional thousand dollars of family income will increase the probability of going to college by the same amount regardless of the level of income. Surely this is not plausible. Imagine how little difference an additional $1000 would make for a family with $1,000,000, or for one with only $500, in annual family income. However, for a family at the threshold of having enough money to send a child to college, an additional thousand dollars would increase the probability of college attendance by a substantial amount. The relationship this implies is a steep regression line (representing a strong effect) at the middle range of income and a relatively flatter line (a weaker relationship) at the extremes. Extending this for all values of income produces the familiar logit or probit S-curve (for an application, see King, in press, 1986).
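The college example can be sketched numerically. Everything in the block below is a hypothetical illustration rather than data or estimates from the article: income is measured in thousands of dollars on an invented 0 to 100 scale, and the logit intercept `a` and slope `b` are assumed values chosen so that the threshold sits at the middle of that range. The sketch fits a straight line to the assumed S-curve and compares the two.

```python
import numpy as np

# Hypothetical income scale (in $1000s) and assumed logit parameters.
income = np.linspace(0.0, 100.0, 101)
a, b = -50.0 / 8.0, 1.0 / 8.0

# Assumed "true" probability of attending college at each income level.
p_true = 1.0 / (1.0 + np.exp(-(a + b * income)))

# Least squares straight line fitted to those probabilities.
X = np.column_stack([np.ones_like(income), income])
coef = np.linalg.lstsq(X, p_true, rcond=None)[0]
c0, c1 = coef
line = c0 + c1 * income

# Effect of one more unit of income under the S-curve, at two income levels.
p_mid = 1.0 / (1.0 + np.exp(-(a + b * 50.0)))
p_low = 1.0 / (1.0 + np.exp(-(a + b * 0.0)))
effect_mid = b * p_mid * (1.0 - p_mid)
effect_low = b * p_low * (1.0 - p_low)

print("linear fit ranges from", line.min(), "to", line.max())
print("linear effect (constant):", c1)
print("S-curve effect at the middle:", effect_mid, " at the bottom:", effect_low)
```

Under these assumed values, the straight line predicts "probabilities" below zero at the lowest incomes and above one at the highest, and it assigns the same effect to every income level, whereas the S-curve's effect is large near the threshold and close to zero at the extremes.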
The solution is to model this relationship with a logit or probit (or some other appropriate non-linear) model. Scholarly footnotes to the contrary, it is not possible to do logit and regression analyses and have them "come out the same." What exactly is meant by "come out the same"? It would be meaningless to compare logit and LS coefficients, standard errors, or test statistics. There is no such thing as $R^2$ in logit analysis, and although there are analogous statistics, comparisons make little sense; in any case, logit analysis will always have a fit to the data as good as or better than that of LS estimation. The interpretation cannot be the same, since the underlying theoretical models are very different.
There is, however, one proper comparison between LS and logit estimation: between the fitted values of the two models expressed as proportions.¹⁶ A short-hand way to accomplish this for the logit model is by observing the first derivatives of the logit function, $bp(1 - p)$, where $b$ is the logit coefficient and $p$ is the initial probability. The problem is that unlike LS, the effect on $y$ for an additional unit increase in $X$ is not constant over the range of $X$ values. This "variable effect" is represented as a nonlinear logit function.¹⁷
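The derivative $bp(1 - p)$ follows directly from the logit function. As a brief sketch, write the model with an intercept $a$ (a symbol assumed here for illustration; the text names only the coefficient $b$):

$$
p = \frac{1}{1 + e^{-(a + bX)}}
\quad\Longrightarrow\quad
\frac{\partial p}{\partial X} = \frac{b\, e^{-(a + bX)}}{\left(1 + e^{-(a + bX)}\right)^2} = b\,p\,(1 - p)
$$

The effect is largest at $p = 0.5$, where it equals $b/4$, and shrinks toward zero as $p$ approaches 0 or 1. At $p = 0.9$, for instance, it is $0.09b$, roughly a third of its maximum.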
(3) Confusing Dichotomous Independent with Dichotomous Dependent Variables
In the factor analysis model, there are many observed variables from which the goal is to derive underlying (unobserved) factors. A common mistake is to view the observed variables as causing the factor. This is incorrect. The correct model has observable dependent variables as functions of the underlying and unobservable factors. For example, if a set of opinion questions asked of the political elite is factor analyzed, underlying ideological dimensions are likely to result. It is the fundamental ideologies that cause the observed opinions, and it is precisely because these ideologies are unobservable that we measure only the consequences of these ideologies.

This has two practical consequences for the researcher. First, variables like race, gender, and age should never be observed variables in a political scientist's factor analysis. It is doubtful that a researcher will ever find that party identification or ideology influences a person's gender or race. Second, since most factor analysis models are linear, they can no more handle dichotomous dependent (observed) variables than can regression analysis models. However, there are nonlinear factor analysis models, which are generalizations of the binary logit model, that may be appropriate in this situation (Christoffersson, 1975).

¹⁶ When the underlying probability for each observation remains within the 0.25 to 0.75 probability interval, the logit and LS models produce very similar predicted values. However, standard errors and test statistics have little meaning, although they have somewhat more meaning when probabilities are within the 0.25 to 0.75 interval. Of course, projections of the underlying theoretical model are always implausible with LS.

¹⁷ For the special case of only nominal level independent variables, Kritzer (1978a; see also 1978b) shows that a minimum chi-square estimation procedure becomes a very intuitive weighted least squares on tabular data. This article is also a good example of the point made in the previous section: that many different statistical models can be organized under the regression framework.
Reporting Replicable Results
I focus in this section on reporting results of statistical analyses. An erroneous reporting method, if not the most grievous offense, is certainly the most frustrating. After all, if a mistake is made and reported, then it is sometimes possible to assess the damage. If minimum reporting standards are not followed, then the only conclusions that can be drawn are based on blind faith in or rejection of the author's interpretative conclusions and methodological skills. Tabular information conveys information that usually is not (and usually should not be) presented in the text. If the tables are not complete, then the report may be rendered useless.
I have concentrated in this paper primarily on regression analysis, the most frequently used explicit statistical model in political science research and the most frequently abused. As an example, therefore, consider reporting the results of a LS analysis. The required results should be (1) data descriptions (including the unit of measurement for each variable, the unit of analysis, and the number of observations and variables), (2) parameter estimates (regression coefficients and the estimated variance of the disturbances), and (3) the standard errors (measures of the precision of the coefficient estimates). For time-series analyses and certain types of cross-sectional analyses, tests of or searches for nonspherical disturbances (e.g., autocorrelation and heteroskedasticity) should also appear.¹⁸ If joint hypothesis tests are relevant, but not executed, relevant parts of the variance-covariance matrix of the regression coefficients (on the diagonal of which are the squares of the standard errors) should be included. Since they can be derived from the information presented, t-tests, F-tests, goodness-of-fit statistics, and marginal probability levels are optional.
One relatively common violation of these reporting rules is to replace any coefficient not meeting some significance level with "N.S." (Sometimes even the level at which these coefficients are not significant is not reported!) This procedure can be very misleading. In fact, I know of no political science research in which it makes sense to use a precise critical value. Any coefficient that is significant at the 0.05 level is as useful in this discipline as if it were 0.06 or 0.04. To delete and refuse to interpret a coefficient which is 0.01 or 0.001 above a significance level makes little sense. Even if the author has a reason for it, at least readers could be permitted to come to their own conclusions. My recommendation is to present the marginal probability level (the exact "level of significance") for each coefficient, regardless of what it is; the author can argue whatever he or she wants and readers would still be able to draw their own conclusions. Statistical significance and substantive importance have no necessary relationship.

¹⁸ Automatic use of the Durbin-Watson statistic in time-series data is better than nothing, but it is far from the best approach. A better procedure is to analyze the autocorrelation and partial autocorrelation functions of the residuals. Although full reporting of these would be excessive, a sentence or two summarizing any odd results would be very helpful.
There are many other examples of incomplete tables, misleading footnotes and useless appendices. The best general way to judge the adequacy of reporting is to determine if the analysis can be replicated. It, of course, need not be replicated, but in order to contribute methodological and theoretical information to its readers, a paper must report enough information so that the results it gives could be replicated if someone actually tried.
Remarks
This paper reviews some of the more common conceptual statistical mistakes in quantitative political science research. Although many mistakes are caught by perceptive colleagues, many more slip by. Those presented here are among the most systematically problematic. Too often, we learn each others' mistakes rather than learning from each others' mistakes. Fortunately, there are in each case plausible reasons for the initial mistaken "invention" or conceptual problem and a relatively painless solution to the problem.
In addition to the arguments given in the paper, there are two more general rules that should be applied to all political science data analyses. First, we should concentrate on interpretable statistics. If the statistics are complicated, that is fine, as long as they can be translated into information that is meaningful to, and interpretable by, nonstatisticians. Second, "getting a feel for data" is laudable, but presenting biased or incorrect results is not. Thus, we should try to use formal statistical models, about which much more is known. The problem with ad hoc solutions is that the same mistakes can occur in these as with formal statistics; however, we are much less likely to discover them. For example, political scientists are prone to doing a few cross-tabulations and arguing the point from there. Omitted variable bias, dichotomous dependent variables with linear models, and other specification issues are many times missed with this "method." What is often not realized is that these informal methods can usually be expressed in very simple formal statistical models. Their weaknesses then become immediately apparent.

If these two rules were followed, and adequate information were provided with which to assess the quantitative analyses, many future mistakes could be avoided.
Manuscript submitted 16 September 1985
Final manuscript received 18 November 1985
APPENDIX A
A PROOF THAT REGRESSION ON RESIDUAL (ROR) ESTIMATORS ARE BIASED

First, partition the coefficient vector as $b' = [b_1' \;\; b_2']$ and the independent variables as $X = [X_1 \;\; X_2]$. Also let $Q = X'X$, $A = Q^{-1}X'$, and $e = My$ be the vector of residuals (where $M = I - XQ^{-1}X'$). Then $b$ in the full regression, equation 2, is the least squares (LS) estimator, where $b = Ay$.

Now consider the regression on residual (ROR) estimator. First, let $Q_{ij} = X_i'X_j$ for $i = 1, 2$, $j = 1, 2$, $A_i = Q_{ii}^{-1}X_i'$ for $i = 1, 2$, and $M_1 = I - X_1 Q_{11}^{-1} X_1'$. Then, calculate the coefficients and residuals with $b_1^* = A_1 y$ and $e_1 = M_1 y$ from equation 3. $b_1^*$ is the first ROR estimator. Then regress $e_1$ on the second set of explanatory variables $X_2$ and get $b_2^* = A_2 e_1$ from equation 4, where $b_2^*$ is the second ROR estimator.

Now let $b^{*\prime} = [b_1^{*\prime} \;\; b_2^{*\prime}]$. I will first prove that $b^* \neq b$. Using the partitioned inverse of $Q$:

$$
b = (X'X)^{-1}X'y
= \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}^{-1}
\begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}
= \begin{bmatrix}
b_1^* - Q_{11}^{-1}Q_{12}\,(X_2'M_1X_2)^{-1}X_2'M_1 y \\
(X_2'M_1X_2)^{-1}X_2'M_1 y
\end{bmatrix}
$$

Substituting $y = X_1b_1 + X_2b_2 + e$ from equation 2 in the first row and $M_1 y = e_1 = X_2 b_2^* + e_2$ from equation 4 in the second (where $e_2$ is the residual from that regression, so that $X_2'e_2 = 0$), and noting that $M_1X_1 = 0$ and $X'e = 0$:

$$
b = \begin{bmatrix}
b_1^* - Q_{11}^{-1}Q_{12}\,(X_2'M_1X_2)^{-1}X_2'M_1 (X_1b_1 + X_2b_2 + e) \\
(X_2'M_1X_2)^{-1}X_2'(X_2 b_2^* + e_2)
\end{bmatrix}
= \begin{bmatrix}
b_1^* - A_1X_2 b_2 \\
(X_2'M_1X_2)^{-1}(X_2'X_2)\, b_2^*
\end{bmatrix}
$$

Then, rearranging terms and taking expected values, we have:

$$
b^* = \begin{bmatrix} b_1 + A_1X_2 b_2 \\ (X_2'X_2)^{-1}(X_2'M_1X_2)\, b_2 \end{bmatrix},
\qquad
E(b^*) = \begin{bmatrix} \beta_1 + A_1X_2 \beta_2 \\ (X_2'X_2)^{-1}(X_2'M_1X_2)\, \beta_2 \end{bmatrix}
$$

Thus, both $b_1^*$ and $b_2^*$ are biased. The former represents standard omitted variable bias.¹⁹ There are also two special cases. If $X_1$ and $X_2$ are orthogonal (i.e., $X_1'X_2 = 0$), then $b^* = b$. Also, when $b_2 = 0$, then $b_1^* = b_1$. Finally, when $\beta_2 = 0$, $E(b^*) = \beta$. A similar proof can be found in Goldberger (1961). Furthermore, Goldberger and Jochems (1961) have shown for the bivariate case, and Achen (1978) for the multivariate case, that $b_2^*$ is an underestimate of $b_2$.

¹⁹ It is easy to see from this formulation that an omitted variable bias exists only when both (1) the sample coefficients ($A_1X_2$) resulting from the regression of the omitted variable on the included variables are nonzero and (2) the parameter on the omitted variable ($\beta_2$) is nonzero (i.e., has some influence on $y$). There is no bias if either one, or their product, is zero.
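APPENDIX B
THE RELATIONSHIP BETWEEN REGRESSION AND ANALYSIS OF VARIANCE

Note first that for the estimates of the model in equation 13, the following equalities hold:

$$
\sum_{i=1}^{n} x_i = n_2 - n_1, \qquad
\sum_{i=1}^{n} x_i^2 = n_2 + n_1, \qquad
\sum_{i=1}^{n} y_i = n_2\bar{y}_2 + n_1\bar{y}_1, \qquad
\sum_{i=1}^{n} x_i y_i = n_2\bar{y}_2 - n_1\bar{y}_1
$$

Now, expressing $b_0$ and $b_1$ in terms of $\bar{y}_1$ and $\bar{y}_2$:

$$
b_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}
= \frac{(n_2 + n_1)(n_2\bar{y}_2 - n_1\bar{y}_1) - (n_2 - n_1)(n_2\bar{y}_2 + n_1\bar{y}_1)}{(n_2 + n_1)^2 - (n_2 - n_1)^2}
= \frac{\bar{y}_2 - \bar{y}_1}{2}
$$

Also,

$$
b_0 = \bar{y} - b_1\bar{x}
= \frac{n_2\bar{y}_2 + n_1\bar{y}_1}{n_2 + n_1} - \frac{(\bar{y}_2 - \bar{y}_1)(n_2 - n_1)}{2(n_2 + n_1)}
= \frac{\bar{y}_2 + \bar{y}_1}{2}
$$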
REFERENCES

Achen, Christopher H. 1977. Measuring representation: Perils of the correlation coefficient. American Journal of Political Science, 21 (November): 805-15.
Achen, Christopher H. 1978. On the bias in stepwise least squares. Unpublished manuscript.
Achen, Christopher H. 1979. The bias in normal vote estimates. Political Methodology, 6(3): 343-56.
Achen, Christopher H. 1982. Interpreting and using regression. Beverly Hills: Sage.
Blalock, Hubert M. 1967a. Causal inferences, closed populations, and measures of association. American Political Science Review, 61 (March): 130-36.
Blalock, Hubert M. 1967b. Path coefficients versus regression coefficients. American Journal of Sociology, 72 (May): 675-76.
Christoffersson, Anders. 1975. Factor analysis of dichotomized variables. Psychometrika, 40(1): 5-32.
Friedman, Stanford B., and Sheridan Phillips. 1981. What's the difference? Pediatric residents and their inaccurate concepts regarding statistics. Pediatrics, 68(5): 644-46.
Goldberger, Arthur S. 1961. Stepwise least squares: Residual analysis and specification error. Journal of the American Statistical Association, 56 (December): 998-1000.
Goldberger, Arthur S., and D. B. Jochems. 1961. Note on stepwise least squares. Journal of the American Statistical Association, 56 (March): 105-10.
Granger, C. W. J. 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37 (July): 424-38.
Gurel, Lee. 1968. Statistical sense and nonsense. International Journal of Psychiatry, 6(2): 127-31.
Hargens, Lowell L. 1976. A note on standardized coefficients. Sociological Methods and Research, 5 (November): 247-56.
Hendry, David. 1980. Econometrics-alchemy or science? Economica, 47: 387-406.
Huff, Darrell. 1954. How to lie with statistics. New York: Norton.
Kim, Jae-On, and G. Donald Ferree, Jr. 1981. Standardization in causal analysis. Sociological Methods and Research, 10 (November): 187-210.
Kim, Jae-On, and Charles W. Mueller. 1976. Standardized and unstandardized coefficients in causal analysis. Sociological Methods and Research, 4 (May): 423-38.
King, Gary. Forthcoming. 1986. Political parties and foreign policy: A structuralist approach. Political Psychology.
King, Gary, and Gerald Benjamin. 1985. The stability of party identification among U.S. senators and representatives. Paper presented at the annual meeting of the American Political Science Association, New Orleans.
Kritzer, Herbert M. 1978a. Analyzing contingency tables by weighted least squares: An alternative to the Goodman approach. Political Methodology, 5(4): 277-326.
Kritzer, Herbert M. 1978b. The workshop: An introduction to multivariate contingency table analysis. American Journal of Political Science, 22 (February): 187-226.
Leamer, Edward E. 1983a. Let's take the con out of econometrics. American Economic Review, 73 (March): 31-44.
Leamer, Edward E. 1983b. Model choice and specification analysis. In Z. Griliches and M. D. Intriligator, eds., Handbook of econometrics. Vol. I, New York: North-Holland.
Leamer, Edward E. 1985. Sensitivity analyses would help. American Economic Review, 75 (June): 308-13.
Lewis-Beck, Michael S. 1978. Stepwise regression: A caution. Political Methodology, 5(2): 213-40.
Sims, Christopher A. 1980. Macroeconomics and reality. Econometrica, 48 (January): 1-48.
Smith, Kim. 1983. Tests of significance: Some frequent misunderstandings. American Journal of Orthopsychiatry, 53(2): 315-21.
Tufte, Edward R. 1974. Data analysis for politics and policy. Englewood Cliffs: Prentice-Hall.