No Free Lunch: Natural Experiments and the
Construction of Instrumental Variables
Thad Dunning
Department of Political Science
Yale University
June 28, 2007
Abstract
Social scientists have increasingly exploited sources of random or quasi-random
variation, including natural experiments, to construct instrumental variables for use in
regression analysis. In many applications, researchers seek to defend the plausibility
of a key assumption: namely, while an instrument or set of instruments is empirically
correlated with an endogenous regressor in a linear regression model, it is independent
of the error term in that model. I argue here that while fulfilling this exogeneity criterion may be necessary for a valid application of the instrumental variables approach,
it is far from sufficient. In particular, in the regression context the identification of
causal effects depends not just on the exogeneity of the instrument(s) but also on the
validity of the underlying model. In this paper, I focus attention on the implications of
one feature of characteristic models: the assumption of common effects across exogenous and endogenous portions of the problematic regressor(s). In many applications,
this assumption may be quite strong, but relaxing it can limit our ability to estimate
parameters of greatest theoretical interest. After discussing two substantive examples,
I discuss analytic results (simulations are reported elsewhere). I also present a specification test that may be useful for determining the relevance of these issues in a given
application.
1 Introduction
Social scientists increasingly exploit natural experiments as sources of instrumental variables for
use in regression analysis. Unlike true experiments, in a natural experiment the manipulation of
a treatment variable is not under the control of an experimental researcher; instead, analysts take
advantage of interventions they observe in the social and political world. Akin to randomized
controlled experiments, however, and unlike other observational studies, a researcher exploiting a
natural experiment can make a credible claim that the observed assignment of non-experimental
subjects to treatment and control conditions is done at random or “as if” at random.
In any given study, it may happen that units of analysis are “as if” randomized to the
treatment of theoretical interest. In this case, a natural experiment may be very close to a true
experiment, in which the researchers have planned and introduced the randomized intervention.
Perhaps more often, however, nature randomizes units of analysis to levels of some variable Z
that is different from the treatment variable X. Under further assumptions to be discussed below,
randomization of subjects to such a variable Z may nonetheless allow identification of the causal
effect of the non-randomly assigned treatment X, so long as Z is correlated with treatment.
The well-known idea is as follows. Consider the regression equation,
$$Y = X\beta + \varepsilon, \tag{1}$$
where Y is a column vector of observations on a dependent variable, X is a matrix of observations
on treatment variables and covariates, β is a vector of parameters, and ε is a vector of unobserved,
mean-zero error terms. Unlike the classical regression model, here at least some columns of X
may be dependent on the error term, that is, endogenous. The Ordinary Least Squares (OLS)
estimator of β will therefore be biased by a quantity related to the expected value of the error term,
given X. However, under additional assumptions, Instrumental Variables Least Squares (IVLS)
regression provides a way to obtain consistent estimates of β. To use IVLS, we must find a matrix
of instrumental variables Z, with at least as many columns as X (exogenous columns of X may
be included in Z), for which the following conditions hold: Z’Z and Z’X have full rank, and Z is
independent of the unobserved error term (Greene 2003: 74-80; Freedman 2005: 175).
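As a minimal sketch of the contrast between OLS and IVLS in an equation like (1), consider a simulated scalar case with one regressor and one instrument. Everything in it is an assumption made for illustration: the data-generating process, the value beta = 2, and the function names simulate, ols, and ivls are hypothetical and are not taken from any of the studies cited here. In this just-identified scalar case the IVLS estimator reduces to a ratio of cross-products.

```python
import random


def simulate(n=50_000, beta=2.0, seed=0):
    """Draw (y, x, z) from y = beta * x + eps, with x endogenous and z exogenous."""
    rng = random.Random(seed)
    ys, xs, zs = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # instrument: independent of the error term
        u = rng.gauss(0.0, 1.0)  # unobserved confounder
        e = rng.gauss(0.0, 1.0)  # idiosyncratic noise
        x = z + u                # regressor depends on the instrument and on u
        eps = u + e              # error term shares u with x, so x is endogenous
        ys.append(beta * x + eps)
        xs.append(x)
        zs.append(z)
    return ys, xs, zs


def dot(a, b):
    """Sum of element-wise products (variables are mean-zero by construction)."""
    return sum(ai * bi for ai, bi in zip(a, b))


def ols(y, x):
    """OLS slope without intercept: (x'x)^{-1} x'y."""
    return dot(x, y) / dot(x, x)


def ivls(y, x, z):
    """IVLS slope in the just-identified scalar case: (z'x)^{-1} z'y."""
    return dot(z, y) / dot(z, x)


y, x, z = simulate()
beta_ols = ols(y, x)
beta_iv = ivls(y, x, z)
# In this design Cov(x, eps) / Var(x) = 1/2, so OLS converges to beta + 0.5,
# while IVLS converges to beta.
print(f"OLS:  {beta_ols:.3f}")
print(f"IVLS: {beta_iv:.3f}")
```

The design is deliberately simple: the confounder u enters both x and the error term, which is what makes OLS inconsistent, while z enters x only.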
The latter requirement is the hard one, and it is the one for which natural experiments are
often exploited. For instance, Miguel, Satyanath, and Sergenti (2004) take advantage of the “as
if” random assignment of African countries to inclement weather to instrument for GDP growth,
in a study of the influence of growth on civil conflict. Acemoglu, Johnson, and Robinson (2001)
use variation in historic settler mortality rates across former European colonies to instrument for
current institutional quality, in a regression of measures of economic development on institutions.
Angrist and Lavy (1999) exploit “as if” random variation in the size of Israeli classes to estimate
the effect of class size on educational attainment. In these and other applications, researchers tend
to devote substantial attention to defending the plausibility that a set of instrumental variables Z is
independent of the error term in an equation like (1), as required for consistent estimation of β.
In the context of an equation like (1), however, it is not merely the exogeneity of the instrument(s) that allows for estimation of the causal impact of X on Y. Instead, Z is tied to inferences
about the impact of X on Y through a particular causal model. This underlying causal model in turn
lends itself to a regression equation like (1), which is used to estimate the effect of treatment. Exogeneity is therefore necessary but not sufficient for valid application of the instrumental variables
approach: the validity of the model is always at issue as well.
Though this observation is in itself unremarkable, I argue here that an under-appreciated
aspect of a statistical model like equation (1) plays an important role in sustaining causal inferences
about the impact of X on Y. In brief, the statistical model in equation (1) assumes common effects
across exogenous and endogenous portions of the treatment variable. While this assumption may
be innocuous in some settings, it is far from clear that it holds in other contexts in which we would
commonly use IVLS. Below, I discuss at length two substantive examples in which a compelling
natural experiment provides plausible “as if” randomization and thus supplies an instrumental
variable that is credibly exogenous. Because of the exogeneity of the instruments, these examples
lend themselves to particularly credible applications of the IVLS approach.
However, as these examples will also suggest, the assumption that endogenous and exogenous portions of a problematic regressor have the same effects on the outcome of interest may be
quite strong. Suppose that X is a (mean-zero) scalar random variable, and we have X = X1 + X2 ,
with X1 endogenous and X2 exogenous; this partition of X into endogenous and exogenous portions emerges in a natural application-specific way in one of the examples discussed below. One
alternative is that the true data-generating process is such that we should estimate
$$Y = \beta (X_1 + X_2) + \varepsilon \tag{2}$$
Another alternative, however, is instead
$$Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon \tag{3}$$
with β1 ≠ β2 . We can think of the rows of equation (1) as i.i.d. realizations of the data-generating
process implied by equation (2) or (3). The simple point I make here is that in many applications, as
the substantive examples discussed below suggest, equation (3) may be more natural than equation
(2).
Since the point of using IVLS is often to recover estimates of the coefficient on an endogenous variable such as X = X1 + X2 , positing a model like equation (2) is an important part of the
technique. However, erroneously assuming constant coefficients can also produce IVLS estimates
that are misleading. The point is not that there is a general failure in IVLS applications. Rather,
the point is that in the regression context, the identification of causal effects depends not just on the
exogeneity of the instrument(s) but also on the validity of the underlying model. This is, of course,
a general point that goes beyond applications of IVLS, yet it is one we tend to forget in focusing
our attention only on the exogeneity of the instruments.
The spirit of the discussion might also be put as follows. In a typical randomized controlled
experiment, questions are sometimes raised about the extent to which the effect of one treatment
can be generalized to another (perhaps similar) treatment. The standard response to such questions
would be, “we need to conduct another experiment to find out.” Yet the natural-experiment-cum-instrumental-variables approach seems to offer another alternative: although natural experiments
often randomize nations or other units of analysis to treatments (weather patterns, settler mortality
rates, and so on) other than the treatment of theoretical interest (such as GDP growth or institutional
quality), it appears that we may use IVLS to recover the effect of the latter treatments. This approach
can substitute for conducting a new experiment – or for finding a different natural experiment in
which units are in fact randomized to the treatments of primary interest – only under assumptions
that may be quite strong. In these contexts, the IVLS approach may not provide a “free lunch,”
that is, a way to surmount the unfortunate fact that nature has randomized our units of analysis not
to the treatments about which we care the most but rather to some other, related treatment.
Whether these issues are germane in any given application is mostly a matter for a priori
reflection, though at the end of this article I sketch a statistical specification test that might be of
some use. The specification test requires an additional instrument, however, and therefore may be
of limited practical utility. The main goal of the paper is thus to underscore the general relevance
of the issues and to encourage their discussion in applications. Indeed, specification of the model,
especially the assumption of constant effects, should perhaps be defended with the same energy
with which we often defend exogeneity.
This paper relates to but is distinct from several strands of literature in econometrics, political science, statistics, and program evaluation. There is a large literature that discusses the relative
merits of IVLS (Kennedy 1985: 115; Hanushek and Jackson 1977: 238). Bartels (1991) uses simulations to study a bias-variance trade-off under the assumption that the instrument itself is (weakly)
endogenous. On a different topic, there is also a literature on instruments that may be exogenous
but that are only weakly correlated with an endogenous regressor or set of regressors (e.g., Bound
et al. 1995). The focus of the current paper differs from this previous work, in that it considers
instruments that are strictly exogenous and that are also well-correlated with endogenous regressors. In this case, IVLS estimates are consistent, and the efficiency loss will not be too great because
of the high correlation between the instrument and the endogenous regressors. Nonetheless, IVLS
may fail to produce accurate estimates, due to a particular form of model misspecification. The
paper thus focuses on inferential difficulties that arise even when the standard requirements for a
valid instrument are met.
More related to the present article is an important recent literature on understanding instrumental variables in the presence of “causal heterogeneity” or “essential heterogeneity.” Many
recent papers have clarified what instrumental variables can estimate in such settings. In some
settings, for example, the instrumental variables approach estimates treatment effects for individuals whose behavior is modified by instruments (Heckman and Robb 1985, 1986); instrumental variables can identify what Imbens and Angrist (1994) call “local average treatment effects”
(see also Angrist, Imbens, and Rubin 1996; for discussion, Heckman, Urzua and Vytlacil 2006).
Rosenzweig and Wolpin (2000) also show that what IVLS estimates depends on the underlying
behavioral models that are posited. In these papers, however, which are often formulated in the
context of the Neyman-Holland-Rubin potential outcomes model, the heterogeneity of interest
comes across units (individuals, countries, etc.) – that is, across i. In the present paper, we suppose
that coefficients are constant across i and instead investigate the consequences of heterogeneity
across variables – that is, heterogeneity across exogenous and endogenous portions of the
regressors in X.
2 An example on lottery winnings
In a recent paper, Doherty, Green and Gerber (2005) are interested in assessing the relationship
between income and political attitudes.1 They surveyed 342 people who had won a lottery in
an Eastern state between 1983 and 2000 and asked a variety of questions about attitudes towards
estate taxes, government redistribution, and social and economic policies more generally. Given the
number and kinds of lottery tickets that individuals buy, the levels of lottery winnings are randomly
assigned among lottery players.2 Abstracting from sample non-response and other issues that
might threaten the validity of the inferences,3 Doherty, Green, and Gerber can exploit the lottery
to make compelling claims about the causal impact of winnings on political beliefs. It turns out that
winning large amounts in a lottery has an effect on some relatively narrow political attitudes – e.g.,
those who win more in the lottery favor the estate tax less – but lottery winnings have relatively
little impact on broader political attitudes, for instance, towards the proper role of government in
the economy writ large.
However, the question of perhaps greater social-scientific interest concerns the political
effects of overall income: while relatively few people have lottery winnings, many have incomes.
Does the natural experiment also allow us to generalize from the impact of lottery winnings to
the effect of overall income on attitudes? It does not, without assumptions that may be quite
strong in this context. As Doherty et al. (2005) carefully point out, the effect on political attitudes
of “windfall” lottery winnings may be very different from other kinds of income – for example,
income earned through work, interest on wealth inherited from a rich parent, and so on.
1 Portions of the material in this section are based on Dunning (forthcoming).
2 Lottery winners are paid a large range of dollar amounts. In Doherty et al.’s sample, the minimum total prize was
$47,581, while the maximum was $15.1 million, both awarded in annual installments.
3 See Doherty et al. (2005) for further details.
These kinds of concerns may also limit our ability to use IVLS to estimate the causal effect
of overall income on political attitudes. Consider the regression equation
$$\text{ATTITUDES}_i = \beta \, \text{INCOME}_i + \varepsilon_i \tag{4}$$
where ATTITUDESi measures the political attitudes of respondent i, INCOMEi is the self-reported
income (from all sources) of respondent i, and β is a regression coefficient common to all respondents.4 For ease of exposition, the variables are mean-deviated, and covariates are not included.5
The error term εi is a random variable, independently and identically distributed across respondents
with E(εi ) = 0. The goal is to estimate the value of the parameter β, defined here as the impact of
overall income on political attitudes.
Equation (4) is the standard linear regression set-up, except for one catch: the error term
is not independent of income, because unobserved (unmeasured) variables may be associated with
both overall income and political attitudes. For instance, rich parents may teach their children
how to play the stock market and also influence their attitudes towards government intervention.
Peer-group networks may influence both economic success and political values. Ideology may
itself shape economic returns, perhaps through the channel of beliefs about the returns to hard
work. Even if some of these variables could be measured and controlled, clearly there are many
unobserved variables that could conceivably confound inferences about the causal impact of overall
income on political attitudes.
4 Note that according to equation (4), subject i’s response depends on the values of i’s right-hand side variables;
values for other subjects are irrelevant. The analog in Rubin’s formulation of the Neyman model is the stable unit
treatment value assumption (SUTVA) (Neyman 1923, Dabrowska and Speed 1990; Holland 1986).
5 In a similar regression model, Doherty et al. (2005) include various covariates, including a vector of variables to
control for the kind of lottery tickets bought, and another vector of demographic variables to boost statistical efficiency
by adjusting for slight imbalances due to the randomization. Doherty et al. (2005) also estimate a series of ordered
probit models to estimate the impact of lottery winnings per se on attitudes.
From the perspective of standard approaches to instrumental variables regression, however,
the innovative research design of Doherty et al. (2005) supplies the “perfect” instrument – namely,
a variable that is both correlated with overall income and is independent of the error term in equation (4). This variable is the level of lottery winnings of respondent i. An accounting identity
is
$$\text{INCOME}_i \equiv \text{EARNED INCOME}_i + \text{WINNINGS}_i \tag{5}$$
where WINNINGSi are the lottery winnings of survey respondent i and EARNED INCOMEi is
shorthand for all other income sources of respondent i, net of lottery winnings. (We call this
“earned income,” though it could of course include any additional income source beyond lottery
winnings).
Equation (5) implies that
$$\operatorname{Cov}(\text{INCOME}_i, \text{WINNINGS}_i) \neq 0 \tag{6}$$
since the variable WINNINGSi is a component of INCOMEi .6 Moreover, since levels of lottery
winnings are randomly assigned to the lottery-playing survey respondents, winnings should be
statistically independent of other characteristics of the respondents, including characteristics that
might influence political attitudes. Thus:
$$\text{WINNINGS}_i \perp \varepsilon_i \tag{7}$$
where A ⊥ B means “A is independent of B.”
6 This assumes (eminently plausibly) that Cov(EARNED INCOMEi , WINNINGSi ) ≠ −Var(WINNINGSi ).
Viewed in the context of equation (4), equation (7) is an exclusion restriction (Greene 2003:
74-80). Together with equation (6), it says that there exists a variable that is correlated with the
endogenous regressor INCOMEi in equation (4) but that is independent of the error term in that
equation. The Instrumental Variables Least Squares (IVLS) estimator is
$$\hat{\beta}_{\text{IVLS}} = \frac{\widehat{\operatorname{Cov}}(\text{WINNINGS}, \text{ATTITUDES})}{\widehat{\operatorname{Cov}}(\text{WINNINGS}, \text{INCOME})} \tag{8}$$
that is, the sample covariance of lottery winnings and attitudes divided by the sample covariance
of lottery winnings and overall income.7 Under the assumptions of the model, equation (8) will
provide a consistent estimator for β in equation (4).
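The estimator in equation (8) can be sketched in a few lines, together with a check that it coincides numerically with two-stage least squares when there is one instrument and one endogenous regressor. This is an illustrative sketch only: the helper names (sample_cov, ivls_ratio, two_stage) are hypothetical, and the data are simulated under an assumed data-generating process rather than drawn from the survey of Doherty et al. (2005).

```python
import random


def mean(values):
    return sum(values) / len(values)


def sample_cov(a, b):
    """Sample covariance with the usual n - 1 denominator."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ivls_ratio(y, x, z):
    """Equation (8): sample Cov(z, y) divided by sample Cov(z, x)."""
    return sample_cov(z, y) / sample_cov(z, x)


def two_stage(y, x, z):
    """Two-stage least squares with one instrument and an intercept in each stage."""
    mean_x, mean_z = mean(x), mean(z)
    first_stage_slope = sample_cov(z, x) / sample_cov(z, z)
    # Stage 1: fitted values of x from the regression of x on z.
    x_hat = [mean_x + first_stage_slope * (zi - mean_z) for zi in z]
    # Stage 2: slope from the regression of y on the fitted values.
    return sample_cov(x_hat, y) / sample_cov(x_hat, x_hat)


# Simulated, hypothetical data (assumed process; not real survey responses).
rng = random.Random(1)
n = 500
winnings = [rng.expovariate(1.0) for _ in range(n)]   # randomly assigned
confounder = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unobserved
income = [w + c + rng.gauss(0.0, 1.0) for w, c in zip(winnings, confounder)]
attitudes = [-0.3 * inc + c + rng.gauss(0.0, 1.0)
             for inc, c in zip(income, confounder)]

estimate_ratio = ivls_ratio(attitudes, income, winnings)
estimate_2sls = two_stage(attitudes, income, winnings)
print(estimate_ratio, estimate_2sls)  # the two agree up to floating-point error
```

The agreement is algebraic rather than a feature of the simulated data: the fitted values from the first stage are a linear function of the instrument, so the second-stage slope reduces to the ratio in equation (8).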
Note, however, that our ability to generalize from the effect of one treatment – lottery winnings – to the effect of another treatment – total income – is ensured only by the model in equation
(4). Given the model, we can use the instrumental variables technique to obtain a consistent estimator of β, since lottery winnings are correlated with income but independent of the error term in
equation (4). Yet the model itself does an important portion of the inferential work. To see this,
note that by equation (5), the total income of individual i will be the sum of income from lottery
winnings and income from all other sources (or “earned income”). It is useful to rewrite equation
(4) as
$$\text{ATTITUDES}_i = \beta (\text{EARNED INCOME}_i + \text{WINNINGS}_i) + \varepsilon_i \tag{9}$$
Writing the equation this way makes the assumptions of the model clearer. Among the
most important assumptions is that β is assumed to be constant for all i, and, especially, constant
for all forms of income. According to the model, it does not matter whether income comes from
lottery winnings or from other sources: a marginal increment in either lottery winnings or in earned
income will be associated with the same expected marginal increment in political attitudes. Put
differently, the slope coefficient is assumed to be the same across endogenous and exogenous
portions of the regressor INCOMEi . The model therefore assumes away the important inferential
issue that Doherty et al. (2005) point out.
7 Equivalently, we can use Two-Stage Least Squares (IISLS); see Freedman (2005: 170-175) on the equivalence of
these estimators.
Suppose instead that we allowed lottery winnings to have its own slope parameter in equation (9), thus assuming that political attitudes are a linear combination of earned income, lottery
winnings, and an error term. Then we can rewrite equation (9) as
$$\text{ATTITUDES}_i = \beta_1 \, \text{EARNED INCOME}_i + \beta_2 \, \text{WINNINGS}_i + \varepsilon_i \tag{10}$$
The variable WINNINGSi is independent of the error term among lottery winners, due to the
randomization provided by the natural experiment. However, EARNED INCOMEi remains endogenous, perhaps because factors such as education or parental attitudes influence both earned
income and political attitudes. We could again resort to the instrumental variables approach, but
since we need as many instruments as there are regressors in (10), we will need some new instrument in addition to WINNINGSi . Even if we could find one, we would need to assume a constant
coefficient β1 across the exogenous and endogenous portions of EARNED INCOMEi .
Suppose the data were generated according to equation (10), with β1 ≠ β2 , and we (erroneously) assume equation (9). Now we estimate the model with IVLS, using WINNINGS
as an instrument for INCOME. As I show analytically in Section 4, given the independence of
EARNED INCOME and WINNINGS, IVLS will asymptotically estimate the coefficient β2 on
WINNINGS. Yet the coefficient on the endogenous variable EARNED INCOME may be of theoretical interest. (After all, if we only cared about β2 , we could simply regress ATTITUDES on
the exogenous variable WINNINGS). In this context, IVLS provides “no free lunch,” despite the
fact that we have an “ideal” instrumental variable for the endogenous regressor in equation (4).
The identification provided by the natural experiment can help us recover the impact of lottery
winnings, but not the impact of overall income – unless the data were indeed generated according
to (4). Again, the point is not that there is a general flaw in the IVLS approach. The point is that the
model is misspecified; to use the IVLS approach effectively, the data should have been generated
according to (9), not (10).
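The claim that IVLS recovers β2 under this misspecification can be illustrated with a small simulation. The sketch below is hedged throughout: the data-generating process, the parameter values beta1 = 1.0 and beta2 = 0.2, and the names cov and simulate_lottery are assumptions introduced for illustration, not quantities from the study discussed above. The population logic is that, with winnings independent of earned income and of the error term, the covariance of winnings with attitudes is β2 times the variance of winnings, while the covariance of winnings with total income is just the variance of winnings.

```python
import random


def cov(a, b):
    """Sample covariance with an n - 1 denominator."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate_lottery(n=50_000, beta1=1.0, beta2=0.2, seed=0):
    """Generate data from an equation like (10) with beta1 != beta2 (assumed values)."""
    rng = random.Random(seed)
    earned, winnings, attitudes = [], [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)        # unobserved confounder
        e = u + rng.gauss(0.0, 1.0)    # earned income: endogenous through u
        w = rng.gauss(0.0, 1.0)        # winnings: independent of everything else
        eps = u + rng.gauss(0.0, 1.0)  # error term: shares u with earned income
        earned.append(e)
        winnings.append(w)
        attitudes.append(beta1 * e + beta2 * w + eps)
    return earned, winnings, attitudes


earned, winnings, attitudes = simulate_lottery()
income = [e + w for e, w in zip(earned, winnings)]  # accounting identity (5)

# IVLS under the (erroneous) common-coefficient model, with winnings as instrument.
beta_ivls = cov(winnings, attitudes) / cov(winnings, income)
# Plain regression of attitudes on the exogenous variable alone.
beta_direct = cov(winnings, attitudes) / cov(winnings, winnings)

print(f"IVLS estimate:             {beta_ivls:.3f}")
print(f"Regression on winnings:    {beta_direct:.3f}")
```

Both estimates settle near the assumed beta2, and neither carries information about beta1, which is the coefficient on the endogenous portion of income.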
3 An example on economic growth and civil conflict in Africa
Before turning to the analytic results, however, I explore a different example drawn from the growing literature that applies the IVLS approach in comparative political economy and comparative
politics. As we will see, there are issues that are parallel to those raised above, in the context of the
study of lottery winnings. Just as in that study, the point here is neither that IVLS is necessarily the
wrong approach nor that there are fundamental flaws in the study: indeed, the study reviewed here
is one of the most creative and compelling of recent applications of IVLS in the field. The point is
simply that “purging” estimates of their endogeneity through an application of IVLS depends on
an important assumption about homogeneous effects, and this is an assumption that it may be useful
to discuss.
Miguel, Satyanath, and Sergenti (2004) are interested in the effects of economic recession
on the likelihood of civil conflict in Africa. Although the idea that adverse economic conditions
may incite conflict is an old one, recent studies have posited specific mechanisms through which
economic recessions may increase the likelihood of civil war. For instance, according to the influential models advanced by Paul Collier and Anke Hoeffler of the World Bank (1998, 2001, 2002),
economic factors help to explain the incidence of civil war because of the important role they play
in rebel recruitment. Miguel et al. (2004: 727) summarize the approach as follows: “Collier and
Hoeffler stress the gap between the returns from taking up arms relative to those from conventional
economic activities, such as farming, as the causal mechanism linking low income to the incidence
of civil war.”8 Indeed, Collier and Hoeffler suggest that the economic motives of potential rebels
far outweigh other factors, such as indicators of social injustice, in explaining the incidence of
rebellion. In their well-known formulation, it is “greed,” not “grievance,” that mainly explains
variation in the occurrence of civil wars.
8 Miguel et al. (2004) also discuss an alternative, though possibly complementary, approach, that of Fearon and
Laitin (2003), who emphasize the importance of state capacity and road coverage in explaining the outbreak and
duration of civil war.
However, there is an important problem for purposes of testing such theories about the
influence of economic conditions on civil conflict: civil conflict may influence economic conditions, and there may be confounding, too. As Miguel et al. (2004) put it, “the existing literature
does not adequately address the endogeneity of economic variables to civil war and thus does
not convincingly establish a causal relationship. In addition to endogeneity, omitted variables –
for example, government institutional quality – may drive both economic outcomes and conflict,
producing misleading cross-country estimates” (2004: 726). In other words, in a regression of
civil conflict on economic growth, the latter may be dependent on the error term in the underlying
regression model.
The solution to this problem, Miguel et al. (2004) suggest, is instrumental variables regression. The instrument for economic growth they propose is weather shocks stemming from
variation in rainfall. In sub-Saharan Africa, as these authors demonstrate empirically, there is a
strong positive correlation between percentage change in rainfall over the previous year and economic growth (and the correlation holds up for both lagged and contemporaneous annual change
in rainfall). Drought hinders economic growth.
Rainfall thus passes one key requirement for a potential instrument, that it be correlated
with the endogenous regressor. The other key requirement, and always the harder one to fulfill,
is that rainfall is exogenous – that is, independent of the error term in the underlying regression
model. This assertion is, of course, essentially untestable; but Miguel et al. (convincingly) probe
its plausibility at length. Although establishing the plausibility of an instrument’s exogeneity is
always an important component of the instrumental variables approach, it is not the issue here.
For purposes of this discussion, we will therefore assume rainfall is exogenous, which seems very
sensible.
The IVLS estimates presented by Miguel et al. suggest a strong negative relationship between economic growth and civil conflict: “a five-percentage-point drop in annual economic
growth increases the likelihood of a civil conflict (at least 25 deaths per year) in the following
year by over 12 percentage points – which amounts to an increase of more than one-half in the
likelihood of civil war” (2004:727).9 Miguel et al. also “find – perhaps surprisingly – that the
impact of income shocks on civil conflict is not significantly different in richer, more democratic,
more ethnically diverse, or more mountainous African countries.”
This appears to be compelling evidence of a causal relationship, and Miguel et al. also have
a plausible mechanism to explain the effect – namely, the impact of drought on the recruitment of
rebel soldiers. But have Miguel et al. really estimated the effect of “economic growth” on conflict?
This is not so clear. Making this assertion, as we will see, depends on specific assumptions about
the way growth produces conflict. In particular, it depends on positing a model in which economic
growth has a constant effect on civil conflict – constant, that is, across the components of growth.
As with the example on lottery winnings, using the IVLS machinery to “identify causal effects”
depends not just on the validity of the exclusion restriction – it also depends on the validity of this
model.
The point might be elaborated as follows. Suppose for purposes of this argument that we
can model economic growth in country i in year t as a function of growth in two sectors, agriculture
and industry. We then want to consider two alternative ways in which civil conflict could be a function of
economic variables. On the one hand, it might be the case that the probability of civil conflict is
given by
$$\operatorname{Prob}\{C_{it} = 1\} = \gamma \, \Delta Y_{it} + \varepsilon_{it} \tag{11}$$
where Cit is a binary variable for conflict in country i in year t (with Cit = 1 indicating conflict),
∆Yit is the economic growth rate of country i in year t, and εit is a latent mean-zero random variable
meant to capture unmeasured characteristics that affect the probability of civil war.10 According to
the model, if we intervene to increase the economic growth rate in country i and year t by one unit,
the probability of conflict in that country-year is expected to increase by γ units (or to decrease,
if γ is negative). However, the model is agnostic about the source of this increase in economic
growth. Indeed, if we want to influence the probability of conflict we might consider different
interventions to boost growth: for example, we might target foreign aid with an eye to increasing
industrial productivity, or we might hope that more rainfall will boost agricultural productivity.
9 The dependent variable is dichotomous; it measures the incidence of civil conflict in which there are more than
25 (alternatively, more than 1,000) battle deaths in a given year. The main equation thus appears to be specified as a
linear probability model.
10 Equation (11) resembles the main equation found in Miguel et al. (2004: 737), though we abstract from control
variables as well as lagged growth values here for ease of presentation. Miguel et al. in fact specify Cit = γ(∆Yit ) +
X′it β + εit , so the dichotomous Cit is assumed to be a linear combination of continuous right-hand side covariates and a
continuous error term; yet the authors clearly have in mind a linear probability model, so in the text I write equation
(11) instead.
Suppose, on the other hand, that growth in agriculture and growth in industry – which both
influence overall economic growth – have different effects on conflict, as in the following model:
$$\operatorname{Prob}\{C_{it} = 1\} = \alpha \, \Delta I_t + \beta \, \Delta A_t + \varepsilon_{it} \tag{12}$$
where ∆It and ∆At are the annual growth rates in industry and agriculture, respectively. What might
motivate such an alternative model? As the causal mechanism posited by Collier and Hoeffler
suggests, decreases in agricultural productivity may increase the difference in returns to taking
up arms and farming, making it more likely that the rebel force will grow and civil conflict will
increase; yet in a context in which many rebels are recruited from the countryside, changes in
(urban) industrial productivity may have no or at least different effects on the probability of conflict.
In this context, heterogeneous effects on the probability of conflict across components of growth
may be the conservative assumption.
Moving from either equation (11) or equation (12) to data, and thus to estimation of γ or
α and β, requires further statistical assumptions. In a standard linear probability model, the ∆Yit
in equation (11) would be independent of the εit , and OLS would give unbiased estimates of γ.
The problem is that ∆Yit is dependent on the εit , that is, endogenous. The solution that Miguel
et al. propose is IVLS, with rainfall growth as an instrument for the ∆Yit . Given the response
schedule in equation (11), this seems like a very plausible solution. Rainfall growth is correlated
with economic growth in Africa. If rainfall growth is also exogenous, as Miguel et al. argue, IVLS
delivers the goods.
If the true data-generating process is equation (12), however, another approach is needed.
It seems reasonable that industrial growth and agricultural growth will both be dependent on the
error term in equation (12), for the same reasons as Miguel et al. (2004) suggest that overall
economic growth is endogenous. For instance, conflict may depress agricultural growth, and harm
urban productivity as well. If rainfall growth is correlated with agricultural growth but not with
industrial growth – as seems plausible intuitively – we have a good instrument for ∆At . But we
cannot estimate α without an additional instrument for industrial productivity.
The point here is not that equation (12) is the right response schedule; indeed, it is as stylized as equation (11), and there may well be contexts in which it is more appropriate to estimate
(12). There are important policy implications, of course: if growth reduces conflict no matter what
the source, we might counsel more foreign aid for the urban industrial sector, while if only agricultural productivity matters, the policy recommendations would be quite different. The objective
here, however, is merely to point out that what IVLS estimates in this context (or any context)
depends importantly on the assumed model, and not just on the plausible exogeneity of the instrument.
4
Analysis
If the data-generating process involves heterogenous effects across endogenous and exogenous
portions of X, and we instead assume homogenous effects, what does IVLS estimate? In this
section, I analyze a case akin to the example on lottery winnings, where an endogenous regressor
breaks down into the sum of exogenous and endogenous portions.
For each observation i, the true data-generating process is
$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i, \tag{13}$$
where β1 and β2 are parameters. When β1 ≠ β2 , equation (13) is identical to equation (10) in
the example on lottery winnings; x1 is analogous to earned income and x2 is analogous to lottery
winnings. The subjects are i.i.d., and E(εi ) = E(x1i ) = E(x2i ) = 0, with ε ⊥⊥ x2 but Cov(ε, x1 ) ≠ 0.
Also,
$$x_1 \perp\!\!\!\perp x_2, \tag{14}$$
as in the example on lottery winnings: subjects are randomized by the natural experiment to levels
of x2 .
Suppose we (erroneously) assume that data were generated according to
$$y_i = \beta (x_{1i} + x_{2i}) + \epsilon_i. \tag{15}$$
Equation (15) is the usual regression model, with one exception: the regressor xTi ≡ x1i + x2i (with
“T” for “total”) is endogenous, because x1 and ε are dependent. However, we have available to us
(by construction) a valid instrument, since x2 is correlated with the endogenous regressor but also
independent of the error term.
The instrumental variables estimator is:
$$\hat{\beta}_{IVLS} = \frac{\operatorname{Cov}(x_2, y)}{\operatorname{Cov}(x_2, x_T)} \tag{16}$$
where the covariances are taken over data.11 Now, substituting for xT and distributing covariances,
we have
$$\hat{\beta}_{IVLS} = \frac{\operatorname{Cov}(x_2, y)}{\operatorname{Cov}(x_2, x_1) + \operatorname{Var}(x_2)} \tag{17}$$
By assumption, x1 and x2 are independent, so Cov(x2 , x1 ) should be near zero, and
$$\lim_{n \to \infty} \frac{\operatorname{Cov}(x_2, y)}{\operatorname{Var}(x_2)} = \beta_2 \tag{18}$$
11 Equation (16) is valid because all of the x’s have expectation 0.
IVLS here delivers an estimate of the impact of the exogenous portion of treatment to
which subjects have been randomized but not the endogenous portion of the aggregate variable of
interest, xT . Yet we are ultimately interested in the effect of x1 , or at least xT ; otherwise, we could
simply regress the dependent variable on x2 . This is the sense in which there is no “free lunch”
provided by the exogeneity of the instrument, given that the data-generating process is equation
(13) and not (15).
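The result in equation (18) can be illustrated with a small simulation. The following is a minimal sketch, not the simulations reported on the author's website: the coefficient values, the normal distributions, and the confounder `u` that makes x1 endogenous are all assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta1, beta2 = 2.0, 0.5  # hypothetical true coefficients, with beta1 != beta2

# x2 is the exogenous portion (assigned as if at random).
x2 = rng.normal(size=n)
# u is an assumed unobserved confounder that makes x1 endogenous.
u = rng.normal(size=n)
x1 = u + rng.normal(size=n)   # independent of x2, as in equation (14)
eps = u + rng.normal(size=n)  # correlated with x1, independent of x2

# True data-generating process, equation (13).
y = beta1 * x1 + beta2 * x2 + eps
x_total = x1 + x2             # the pooled regressor in equation (15)


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# IVLS estimator of equation (16), with x2 as the instrument for x_total.
beta_ivls = cov(x2, y) / cov(x2, x_total)
# OLS on the pooled regressor, shown for comparison only.
beta_ols = cov(x_total, y) / cov(x_total, x_total)

print(f"IVLS estimate: {beta_ivls:.3f} (beta2 = {beta2}, beta1 = {beta1})")
print(f"OLS estimate:  {beta_ols:.3f}")
```

Under these assumptions the IVLS estimate should land close to β2 rather than β1, even though the instrument is exogenous by construction.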
In other cases, the situation may be somewhat more complicated. For instance, when
Cov(x1 , x2 ) ≠ 0, the IVLS estimate of β in equation (15) will converge to a mixture of β1 and
β2 , with weight w = Cov(x2 , x1 )/[Cov(x2 , x1 ) + Var(x2 )] on β1 and 1 − w on β2 . In simulations reported
on the author’s website, I investigate what IVLS estimates under a range of other assumptions
about the true data-generating process.12
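The mixture weights follow from substituting equation (13) into the numerator of equation (16). The derivation below is a sketch that treats Cov and Var as population quantities and assumes that x2 remains uncorrelated with the error term ε even though it is now correlated with x1.

```latex
\begin{align*}
\operatorname{Cov}(x_2, y)
  &= \beta_1 \operatorname{Cov}(x_2, x_1) + \beta_2 \operatorname{Var}(x_2)
     + \operatorname{Cov}(x_2, \epsilon) \\
  &= \beta_1 \operatorname{Cov}(x_2, x_1) + \beta_2 \operatorname{Var}(x_2), \\
\operatorname{Cov}(x_2, x_T)
  &= \operatorname{Cov}(x_2, x_1) + \operatorname{Var}(x_2), \\
\frac{\operatorname{Cov}(x_2, y)}{\operatorname{Cov}(x_2, x_T)}
  &= w \beta_1 + (1 - w) \beta_2,
  \qquad
  w = \frac{\operatorname{Cov}(x_2, x_1)}
           {\operatorname{Cov}(x_2, x_1) + \operatorname{Var}(x_2)}.
\end{align*}
```

Setting Cov(x2 , x1 ) = 0 gives w = 0 and recovers the limit in equation (18).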
5
A specification test
The discussion above suggests a natural specification test, which requires the availability of an
additional instrument, z1 , with the following properties:
$$z_1 \perp\!\!\!\perp \epsilon \tag{19}$$
and
$$\operatorname{Cov}(z_1, x_1) \neq 0 \tag{20}$$
where the notation follows the section above. We will then use IVLS to estimate the model in
equation (13) above, that is,
$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i \tag{21}$$
using z1 and x2 (which is exogenous) as the instruments.
12 In the formula for w, Var and Cov operate on random variables, and w could be negative.
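Let Σ̂ be the estimated variance-covariance matrix for the coefficient estimates:
$$\widehat{\operatorname{Cov}}(\hat{\beta}_1, \hat{\beta}_2 \mid x_1, z_1) = \hat{\Sigma} \tag{22}$$
Using the diagonal and off-diagonal elements of this 2 × 2 matrix, we can calculate
$$\widehat{\operatorname{Var}}(\hat{\beta}_1 - \hat{\beta}_2) = \widehat{\operatorname{Var}}(\hat{\beta}_1) + \widehat{\operatorname{Var}}(\hat{\beta}_2) - 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_1, \hat{\beta}_2) \tag{23}$$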
The coefficient estimates are asymptotically normal, and z-tests for the difference can be applied
(see Greene 2003: 77-78 for details). If pooling is appropriate, then the estimated coefficient on x1
should be the same as the estimated coefficient on x2 , up to random error. Statistical tests should
therefore fail to reject the null hypothesis that β1 and β2 are equal.
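One way the test could be computed is sketched below. This is an illustration under stated assumptions, not the author's implementation: the function names `ivls_difference_test` and `simulate` are hypothetical, the variance estimate assumes homoskedastic errors, no intercept is included because all variables have expectation 0 in the setup above, and the simulated data use arbitrary coefficient values.

```python
import math
import numpy as np


def ivls_difference_test(y, x1, x2, z1):
    """Just-identified IVLS of y on (x1, x2) with instruments (z1, x2),
    followed by a z-test of the null hypothesis beta1 == beta2."""
    X = np.column_stack([x1, x2])
    Z = np.column_stack([z1, x2])
    n, k = X.shape
    ZX = Z.T @ X
    beta = np.linalg.solve(ZX, Z.T @ y)  # (Z'X)^{-1} Z'y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)     # assumes homoskedastic errors
    ZX_inv = np.linalg.inv(ZX)
    # Estimated variance-covariance matrix of the coefficients, as in (22).
    Sigma = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T
    # Variance of the difference, as in equation (23).
    var_diff = Sigma[0, 0] + Sigma[1, 1] - 2.0 * Sigma[0, 1]
    z_stat = (beta[0] - beta[1]) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))  # two-sided normal
    return beta, z_stat, p_value


def simulate(beta1, beta2, n=50_000, seed=0):
    """Hypothetical data satisfying (13), (14), (19) and (20)."""
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=n)            # additional instrument
    x2 = rng.normal(size=n)            # exogenous portion
    u = rng.normal(size=n)             # assumed confounder
    x1 = z1 + u + rng.normal(size=n)   # endogenous, correlated with z1
    eps = u + rng.normal(size=n)       # independent of z1 and x2
    y = beta1 * x1 + beta2 * x2 + eps
    return y, x1, x2, z1


# Heterogeneous effects: the pooling restriction is false.
beta_het, z_het, p_het = ivls_difference_test(*simulate(2.0, 0.5))
# Homogeneous effects: the pooling restriction holds.
beta_hom, z_hom, p_hom = ivls_difference_test(*simulate(1.0, 1.0))

print("heterogeneous:", beta_het, z_het, p_het)
print("homogeneous:  ", beta_hom, z_hom, p_hom)
```

A small p-value in the heterogeneous case is evidence against pooling x1 and x2 into a single regressor; in the homogeneous case the z-statistic should behave like a draw from a standard normal distribution.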
This adaptation of a standard test compares a pooling estimator to a splitting estimator; it
could be viewed as a Hausman test, in which an additional instrument is needed to test the pooling
restriction because x1 is endogenous. In simulations, the specification test is able to detect model
specification failures with a high degree of accuracy. Of course, like most specification tests, this
one is robust only against a limited class of alternatives: we stipulate that the data are generated
according to equation (13), and the alternatives are that β1 = β2 or β1 ≠ β2 . Moreover, since the
test requires the availability of an additional instrument, it may only be useful in certain classes of
applications. For instance, we do not attempt to key the test to data from the examples discussed
in this paper because we do not see an available additional instrument.13
6
Conclusion
In a given natural experiment, nature may assign the units of analysis not to levels of the treatment
variable X that is of greatest interest but to levels of another variable Z. Nonetheless, the causal
effect of X may be recovered through instrumental variables regression analysis – provided that the
assumptions of the IVLS model are valid. In most applications, analysts tend to focus attention on
two canonical requirements for a valid instrumental variable: the variable Z must be correlated with
the endogenous regressor X, and it must itself be exogenous, that is, independent of the error term.
The first assumption can be checked from the data. The second, essentially untestable, assumption
is generally the more difficult one, and it is the one for which a good natural experiment can be
particularly useful.
However, satisfying these requirements is not enough for valid application of the instrumental variables approach. In particular, the regression model linking X to Y must also be valid.
While this may seem obvious, in this article I have drawn attention to a too-infrequently remarked
feature of the canonical IVLS regression model: the assumption of constant effects across exogenous and endogenous portions of the problematic regressor X.
Violations of this assumption can limit the ability of the natural-experiment-cum-instrumental-variables approach to recover causal parameters. For example, in order to use lottery income to
estimate the effect of overall income on political attitudes, we must assume that the effects of lottery income and earned income are the same. To use rainfall changes to estimate the effect of
economic growth on civil conflict, we must assume that growth in the agricultural sector has the
same effect as growth in the industrial sector. These are strong assumptions, and they should be
defended with the same kind of energy that is used to defend exogeneity.
13 For simulations, see the author’s website.
If the assumption of constant effects across endogenous and exogenous portions of X is
wrong, then IVLS estimates can be quite misleading. When heterogeneity takes the simple form
I have investigated here – that is, the outcome variable is a sum of independent exogenous and
endogenous portions – instrumental variables regression simply estimates the coefficients of the
exogenous portion. In more complicated settings, IVLS may estimate a mixture of the true coefficients of interest. Thus, if the model is incorrectly specified, exogeneity may not be much
help.
Ultimately, of course, the question of model specification is a theoretical and not a technical one. Whether it is proper to specify constant coefficients across exogenous and endogenous
portions of a treatment variable, in examples like those discussed in this paper, is a matter for a
priori reflection. This is not unique to applications of IVLS – indeed, similar issues may arise even
if there is no endogeneity – yet special issues are raised with IVLS because we often hope to use
the technique to recover the causal impact of endogenous treatments.
What about the potential problem of “infinite regress”? In the lottery example, for instance,
it might well be that different kinds of earned income have different impacts on political attitudes;
in the Africa example, different sorts of agricultural income could have different effects on conflict.
To test many permutations, given the endogeneity of the variables, we would need many instruments and these are not usually available. This is exactly the point. Deciding when it is appropriate
to assume constant effects is a crucial theoretical issue. That issue tends to be given short shrift
in current applications of the natural-experiment-cum-instrumental-variables approach, where the
focus is on exogeneity.
The basic point emphasized here is therefore not that data analysis or regression diagnostics are the key. Rather, in any particular application, a priori and theoretical reasoning should
be brought to bear to justify the crucial assumptions of constant effects across endogenous and
exogenous portions of the problematic regressor. In some settings, the constancy assumption may
be innocuous; the point here is not that the IVLS approach is necessarily flawed, but that the validity of the underlying regression model should be carefully considered. The “no free lunch” principle
suggests that randomization to Z instead of X may not be enough to recover the causal impact of X.
References
[1] Acemoglu, Daron, Simon Johnson, and James Robinson. 2001. “The Colonial Origins of
Comparative Development: An Empirical Investigation.” The American Economic Review Vol.
91 (5): 1369-1401.
[2] Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. 1996. “Identification of Causal
Effects Using Instrumental Variables.” Journal of the American Statistical Association 91
(434): 444-455.
[3] Angrist, Joshua D. and Victor Lavy. 1999. “Using Maimonides’ Rule to Estimate the Effect of
Class Size on Scholastic Achievement.” Quarterly Journal of Economics 114: 533-575.
[4] Bartels, Larry M. 1991. “Instrumental and ‘Quasi-Instrumental’ Variables.” American Journal
of Political Science Vol. 35 (3): 777-800.
Bound, John, David Jaeger, and Regina
Baker. 1995. “Problems with Instrumental Variables Estimation when the Correlation between
the Instruments and the Endogenous Explanatory Variables is Weak.” Journal of the American
Statistical Association Vol. 90 (430): 443-50.
[5] Doherty, Daniel, Donald Green, and Alan Gerber. 2005. “Personal Income and Attitudes toward Redistribution: A Study of Lottery Winners.” Political Psychology Vol. 27 (3): 2006.
[6] Dunning, Thad. Forthcoming. “Improving Causal Inference: Strengths and Limitations of Natural Experiments.” Political Research Quarterly. Previous version presented at the meetings of
the American Political Science Association, August 31-September 5, Washington, D.C., 2005.
[7] Freedman, David. 2005. Statistical Models: Theory and Practice. Cambridge: Cambridge
University Press.
[8] Freedman, David, Robert Pisani, and Roger Purves. 1997. Statistics. 3rd ed. New York: W.W.
Norton, Inc.
[9] Greene, William H. 2003. Econometric Analysis. Prentice Hall: Upper Saddle River, NJ, Fifth
Edition.
[10] Hanushek, Eric A. and John E. Jackson. 1977. Statistical Methods for Social Scientists. San
Diego, CA: Academic Press, Harcourt Brace Company.
[11] Heckman, James J. 1996. “Randomization as an Instrumental Variable.” The Review of Economics and Statistics Vol 78 (2): 336-341.
[12] Heckman, James J. and R. Robb. 1985. “Alternative Methods for Evaluating the Impact of
Interventions.” In James J. Heckman and Burton Singer, eds., Longitudinal Analysis of Labor
Market Data, Volume 10, pp. 156-245. New York: Cambridge University Press.
[13] Heckman, James J. and R. Robb. 1986. “Alternative Methods for solving the problem of
selection bias in evaluating the impact of treatments on outcomes.” In Howard Wainer, ed.,
Drawing Inferences from Self-Selected Samples, pp. 63-107. New York: Springer-Verlag.
[14] Heckman, James J., Sergio Urzua, Edward Vytlacil. 2006. “Understanding Instrumental Variables in Models with Essential Heterogeneity.” Paper given by Heckman as the Tjalling C.
Koopmans Lecture, Cowles Foundation, Yale University, September 26-27, 2006.
[15] Holland, Paul W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical
Association 81: 945–70 (with discussion).
[16] Kennedy, Peter. 1985. A Guide to Econometrics. 2d ed. Cambridge: MIT Press.
[17] Krasno, Jonathan S. and Donald P. Green. 2005. “Do Televised Presidential Ads Increase
Voter Turnout? Evidence from a Natural Experiment.” Manuscript, Department of Political
Science, Yale University.
[18] Neyman, Jerzy. 1923. Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Roczniki Nauk Rolniczych 10: 1–51, in Polish. English translation
by DM Dabrowska and TP Speed (1990), Statistical Science 5: 465–80 (with discussion).
[19] Rosenzweig, Mark R. and Kenneth I. Wolpin. 2000. “Natural ‘Natural Experiments’ in Economics.” Journal of Economic Literature Vol. 38 (4): 827-874.
[20] Rubin, Donald. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688–701.