Strategic Management Journal
Strat. Mgmt. J., 35: 1070–1079 (2014)
Published online EarlyView 25 June 2013 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/smj.2136
Received 5 September 2012 ; Final revision received 19 February 2013
RESEARCH NOTES AND COMMENTARIES
THE PERILS OF ENDOGENEITY AND INSTRUMENTAL
VARIABLES IN STRATEGY RESEARCH:
UNDERSTANDING THROUGH SIMULATIONS
MATTHEW SEMADENI,1* MICHAEL C. WITHERS,2 and S. TREVIS CERTO3
1 Kelley School of Business, Indiana University, Bloomington, Indiana, U.S.A.
2 Mays Business School, Texas A&M University, College Station, Texas, U.S.A.
3 W.P. Carey School of Business, Arizona State University, Tempe, Arizona, U.S.A.
In this paper we use simulations to examine how endogeneity biases the results reported by
ordinary least squares (OLS) regression. In addition, we examine how instrumental variable
techniques help to alleviate such bias. Our results demonstrate severe bias even at low levels of
endogeneity. Our results also illustrate how instrumental variables produce unbiased coefficient
estimates, but instrumental variables are associated with extremely low levels of statistical
power. Finally, our simulations highlight how stronger instruments improve statistical power
and that endogenous instruments can report results that are inferior to those reported by OLS
regression. Based on our results, we provide a series of recommendations for scholars dealing
with endogeneity. Copyright © 2013 John Wiley & Sons, Ltd.
Keywords: endogeneity; simulations; instrumental variables; ordinary least squares; strategy research
*Correspondence to: Matthew Semadeni, Kelley School of Business, Indiana University, 1309 East Tenth Street, Bloomington, IN 47405, USA. E-mail: [email protected]
INTRODUCTION
Shaver’s (1998) examination of market entry
modes introduced to strategy scholars the influence of endogeneity on statistical results. Since
that time, other scholars in strategic management have described the consequences of endogeneity as well as the techniques to circumvent it (e.g., Bascle, 2008; Hamilton and Nickerson, 2003). Endogeneity occurs when an independent variable is correlated with the error
term (also known as “disturbance” or “residual”)
in an ordinary least squares (OLS) regression
model (for a review, see Kennedy, 2008). Endogeneity may bias the assertions that researchers
make regarding hypothesized effects. To this end,
reviewers and editors in multiple disciplines have
increasingly identified endogeneity as an alternative explanation for results presented in papers
they evaluate, and endogeneity represents a more
and more frequent reason for manuscript rejection
(e.g., Larcker and Rusticus, 2010; Shugan, 2004).
The objective of this paper is to provide a series
of simulations to illustrate the consequences of
endogeneity and the robustness of the techniques
prescribed to circumvent these consequences.
Simulations allow us to investigate the stability of
analytical techniques while knowing what results
“should be,” and our techniques and outcome
variables allow us to provide a number of contributions to the literature in strategy on endogeneity.
First, the outcome measures we develop illustrate
that addressing endogeneity involves a balancing
act in which the disease (i.e., endogeneity)
introduces Type I error, whereas the cure (i.e.,
instrumental variables) introduces Type II error.
Endogeneity—even at low levels—increases the
likelihood of reporting statistically significant
(yet untrue) coefficient estimates. At the same
time, correcting for endogeneity with instrumental
variables increases the likelihood of reporting
coefficient estimates that are near their true values,
but these reported estimates are rarely statistically
significant.
Second, our simulations reveal and highlight
the individual roles of instrument strength and
instrument endogeneity. Weaker instruments (i.e.,
instruments weakly correlated with the endogenous variable) result in higher standard errors (i.e.,
lower efficiency), which decrease the likelihood
of reporting statistically significant relationships
when they exist. In contrast, instrument endogeneity results in statistically significant coefficient
estimates that differ from their true values (i.e.,
biased). In fact, the results of our simulations show
that endogenous instruments often produce coefficient estimates that are inferior to those reported
by OLS regression (i.e., the “cure” is worse than
the “disease”).
Third, our simulations demonstrate that standard tests detecting the presence of endogeneity
are highly dependent on the quality of the
instrumental variables used in the analyses. Our
results indicate that these tests will rarely detect
existing endogeneity when researchers use weak
instruments. These tests perform even more poorly
when the instruments are themselves endogenous.
Our simulations highlight the ease with which
researchers (and reviewers and editors) may
use test results mistakenly to dismiss concerns
regarding endogeneity.
Taken together, our results allow us to extend
previous recommendations for authors, reviewers,
and editors considering the potential impact of
endogeneity. Given our findings regarding the pernicious effects of endogeneity at even low levels, our recommendations center around testing
for endogeneity. Bascle (2008) presents a comprehensive approach for scholars to implement when
dealing with endogeneity. The first step in this
approach involves asking, “Is there an endogeneity
problem?” (Bascle, 2008: 287). Our simulations
yield the counterintuitive result that researchers
must first identify strong and exogenous instruments before they can know whether endogeneity
is problematic. Only with such instruments can
researchers effectively test for the presence of and
dismiss concerns regarding endogeneity.
The relevance of strategy research depends
on both theory generation and the empirical
work that tests proposed theoretical perspectives.
We contribute to strategy research by providing
recommendations to improve current practices
used by researchers, reviewers, and editors grappling with the effects of endogeneity. We believe
current practices must be improved to prevent both
Type I and Type II errors. As we describe later, our
review of articles published in SMJ between 2005
and 2012 reveals alarming inconsistencies regarding how strategy researchers approach and remedy
endogeneity. We are hopeful that our simulations
will help to improve the rigor of the empirical
studies designed to test our theories and enhance
the overall quality of knowledge in our field that
is based on the results of these empirical tests.
ENDOGENEITY
Endogeneity defined
While many scholars in strategic management
discuss endogeneity, it is important to clarify the
precise meaning of the term. Endogeneity is most
typically described in the context of ordinary least
squares (OLS) regression. Equation 1 represents a
basic OLS regression equation:
$$y_i = \alpha + \beta x_i + \varepsilon_i \qquad (1)$$
In this equation, $y_i$ represents the dependent
variable, α represents a constant, β represents
the coefficient, $x_i$ represents the independent
variable, and $\varepsilon_i$ represents the error term. The
error term in an OLS regression model captures
the variation in the dependent variable that the
independent variables do not predict, and it should vary
randomly. When the error term is correlated
with an independent variable, however, the errors
are not random; this leads to biased coefficient
estimates (for a review see Kennedy, 2008). Bias
occurs when the coefficient estimate based on a
sample does not on average equal the true value
of the coefficient in the population (Cohen et al.,
2003: 117). Therefore, a critical assumption of
OLS regression is that the independent variable
and the error term are uncorrelated.
According to Kennedy (2008), four different
issues may potentially introduce endogeneity in
OLS regression models: errors-in-variables (i.e.,
measurement error), autoregression, omitted variables, and simultaneous causality. In each of these
scenarios, OLS regression reports biased coefficients. Instead of estimating the “true” relationship
between the independent variable and the dependent variable, OLS regression mistakenly includes
the correlation between the independent variable
and the error term in the estimation of the independent variable’s coefficient.
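The mechanism just described can be written out with the standard probability limit of the OLS slope for Equation 1. This is a textbook result offered as an illustration, not a derivation taken from the paper; the symbols $\rho_{x\varepsilon}$, $\sigma_x$, and $\sigma_\varepsilon$ (the correlation between the regressor and the error, and their standard deviations) are notation introduced here.

```latex
\hat{\beta}_{OLS}
  = \frac{\widehat{\operatorname{cov}}(x_i, y_i)}{\widehat{\operatorname{var}}(x_i)}
  \;\xrightarrow{p}\;
  \beta + \frac{\operatorname{cov}(x_i, \varepsilon_i)}{\operatorname{var}(x_i)}
  = \beta + \rho_{x\varepsilon}\,\frac{\sigma_\varepsilon}{\sigma_x}
```

Under the added assumption that the regressor and the error have equal variances, the bias equals the correlation itself, so a true β of 0.1 combined with a correlation of 0.1 would produce an estimate near 0.2.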
Endogeneity in strategy research
Shaver’s (1998) work on international entry modes
represents perhaps the most influential study of
endogeneity in strategic management. Shaver utilizes a sample of firms engaging in international
expansion to study whether acquisitions or Greenfield ventures lead to higher levels of firm performance. According to Shaver, managers choose
entry modes based on their perceptions of anticipated performance. It is these unobservable perceptions that potentially lead to endogeneity in this
empirical context.
Shaver’s ideas have caused scholars to consider the potential endogeneity of a variety of
independent variables. Hamilton and Nickerson
(2003: 51) summarize this perspective by stating: “the field of strategic management is fundamentally predicated on the idea that management’s decisions are endogenous to their expected
performance outcomes—if not, managerial decision making is not strategic; it is superfluous.”
Accordingly, almost all firm-level variables (e.g.,
R&D spending, acquisitions, etc.) can be considered decisions made by managers to influence
firm outcomes. Shaver’s introduction to endogeneity involved a dichotomous independent variable:
the managerial decision to engage in an acquisition versus a Greenfield venture. Nevertheless,
scholars have also treated continuous variables
such as employee stock ownership (Wang, He,
and Mahoney, 2009), human capital investments
(Sirmon and Hitt, 2009), and CEO hubris (Li and
Tang, 2010) as endogenous variables.
OVERCOMING ENDOGENEITY WITH
INSTRUMENTAL VARIABLES
Instrumental variables
Strategy researchers rely on instruments to model
continuous endogenous independent variables. The
terms instrumental variable estimation and two-stage least squares are often used interchangeably, but instrumental variable techniques may
use estimators other than least squares, such
as generalized method of moments (GMM) or
limited-information maximum likelihood (LIML).
Researchers employ such techniques when the
endogenous variable does not represent a dichotomous decision.
Instrumental variables must fulfill two conditions: relevance and exogeneity (Kennedy, 2008).1
Relevance refers to the degree to which the instrument corresponds with the endogenous variable.
A literature on instrument strength (i.e., strong
vs. weak instruments) examines how relevance
influences model results, and scholars have created recommendations based on the F-statistics
of first-stage regressions to determine instrument relevance (Stock, Wright, and Yogo, 2002).
The general conclusion of this research indicates
that stronger (higher F-statistics)—as opposed to
weaker (lower F-statistics)—instruments are better
for two-stage approaches. Complementing instrument relevance, exogeneity refers to the degree
to which an instrument is uncorrelated with the
disturbance term in the second stage. Testing for
instrument exogeneity allows researchers to reduce
the chance that they replace one endogenous independent variable with another (see Bascle, 2008).
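As a rough illustration of the relevance check, the sketch below computes the first-stage F-statistic for the special case of a single instrument and no other covariates, where F reduces to r²(n − 2)/(1 − r²). The function names are hypothetical, and the single-instrument simplification is an assumption of this sketch rather than a procedure described in the paper.

```python
import math


def pearson_corr(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def f_from_corr(r, n):
    """First-stage F for one instrument: F = r^2 * (n - 2) / (1 - r^2)."""
    return r * r * (n - 2) / (1.0 - r * r)


def first_stage_f(x, z):
    """F-statistic from regressing the endogenous x on a single instrument z."""
    return f_from_corr(pearson_corr(x, z), len(x))
```

Plugging in population correlations of 0.1 and 0.33 with 500 observations, for example `f_from_corr(0.1, 500)`, gives F-values of roughly 5 and 61; F-statistics from individual samples will vary around these figures.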
Perhaps the most problematic aspect of instrumental variable estimation involves identifying
suitable instruments. In practice it remains difficult to find variables that correlate strongly with
the endogenous variable but not with the error
term in the second stage. Instrument relevance
and exogeneity often work against one another. As
instrument strength increases (i.e., the instrument
becomes more like the endogenous independent
variable), it is perhaps not surprising that it may
be related to the error term in the same way as the
endogenous variable.
Instrumental variables in strategy research
To understand better how strategy researchers use
instrumental variables, we reviewed all empirical
papers appearing in SMJ between 2005 and
2012 that incorporated instrumental variables to
analyze continuous endogenous variables. We
noted the degree to which researchers (1) tested for
endogeneity; (2) used more than one instrument;
(3) tested for instrument strength; and (4) tested
for instrument exogeneity. To code for tests
of endogeneity, we noted whether authors used
either a Hausman or Durbin-Wu-Hausman test. We
also noted whether authors examined instrument
strength using thresholds such as those developed
by Stock et al. (2002). Finally, we noted whether
a formal test was used to detect instrument
exogeneity.
1 Exogeneity and endogeneity are opposites (i.e., a variable that is not “exogenous” is “endogenous”).
The results of our review reveal an alarming lack of consistency in terms of how strategy
researchers report instrumental variables. We identified 24 articles that use two-stage least squares
for either the primary or supplementary analyses.
Of these 24 articles, 10 test for endogeneity, 9 use
more than one instrument, 3 test for instrument
strength, and 5 test for instrument endogeneity.
Finally, two-thirds of these articles do not report
the results corresponding to the first stage of the
model.
It is important to note procedures that we did
not include in our coding scheme. Many strategy
researchers focused primarily on statistical significance to test whether instruments are relevant
and/or exogenous. Some scholars, for example,
suggest that an instrument is relevant if it is statistically related to the endogenous variable and
exogenous if it is not related to the ultimate dependent variable. Larcker and Rusticus (2010: 192)
declare such approaches “completely inappropriate,” so we do not count studies adopting these
methods as effective tests for instrument relevance
or exogeneity.
During our review, Hoetker and Mellewigt
(2009) stood out as providing one of the most comprehensive descriptions of the procedures used to
implement instrumental variables.2 Nevertheless,
the authors did not note that the F-tests regarding instrument strength revealed weak instruments
(the corresponding F-test value of 3.29 was well
below Stock et al.’s (2002) recommended value
of 11.59 for two instruments). The fact that one
of the most comprehensive and transparent uses
of instrumental variables in strategy identified
weak instruments but still proceeded to report the
results should give scholars pause. In the following sections, we use simulations to highlight the
importance of understanding instrument strength
and exogeneity when using instrumental variables.
METHODOLOGY
To understand better the implications of endogeneity we provide a series of simulations. First, we
treat endogeneity as a continuous—as opposed to
dichotomous—condition and explore the implications of increasing endogeneity in OLS regression. Second, we examine how instrument strength
and exogeneity influence the outcomes of instrumental variables. Finally, we study how instrument strength and exogeneity influence the tests
to detect endogeneity.
Simulation design
Kennedy (2008) provides an excellent overview
of the intuition underlying simulation techniques.
We used Stata for our simulations, which involved
two broad steps. First, we generated a dataset with
500 observations of dependent and independent
variables with known properties. In this study, we
generated y with the following equation:
$$y = \alpha + \beta x + e \qquad (2)$$
In our simulations, we assigned a value of 1
to the intercept (α) and set the value of β to
0.1 to represent a small effect. We then generated
normally distributed values for e, with a mean of 0.
The primary issue in this simulation involves
examining various levels of endogeneity. To simulate this, we generated independent variables (x)
that varied in terms of the correlation with the error
term (e). We created three categories of endogeneity: no endogeneity (corr[x, e] = 0), low endogeneity (corr[x, e] = 0.1), and medium endogeneity
(corr[x, e] = 0.3).3 To do so, we followed Larcker and Rusticus (2010: 193): rather than directly
choosing the parameters, we set the population
correlations and then calculated the parameters necessary to obtain the desired correlations, allowing
for a more natural interpretation.
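The data-generating step can be sketched in Python as follows (the paper itself used Stata, and its code is not shown in the text). The function names, the unit variances for x and e, and the construction x = ρe + √(1 − ρ²)u are assumptions made for illustration; they give a population correlation of ρ between x and e.

```python
import math
import random


def simulate_xy(n=500, alpha=1.0, beta=0.1, rho=0.1, seed=0):
    """Draw (x, y, e) with population corr(x, e) = rho and unit variances (assumed)."""
    rng = random.Random(seed)
    xs, ys, es = [], [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)   # error term with mean 0
        u = rng.gauss(0.0, 1.0)   # noise independent of e
        # Mixing e and u this way gives var(x) = 1 and corr(x, e) = rho.
        x = rho * e + math.sqrt(1.0 - rho ** 2) * u
        xs.append(x)
        es.append(e)
        ys.append(alpha + beta * x + e)   # Equation 2
    return xs, ys, es


def sample_corr(a, b):
    """Sample Pearson correlation, used to check the realized endogeneity."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)
```

Calling `simulate_xy(rho=0.0)`, `simulate_xy(rho=0.1)`, and `simulate_xy(rho=0.3)` would correspond to the three endogeneity categories; `sample_corr(xs, es)` shows how close a given draw comes to the target.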
2 The authors used multiple instruments, tested for endogeneity (using the Durbin-Wu-Hausman test), and referenced both instrument strength (using an F-test) and exogeneity (using Hansen’s J test [1982]).
3 In supplementary analyses, higher levels of endogeneity resulted in even more dramatic results.
Analytical models
We compared the effectiveness of two main analytic strategies: OLS regression and instrumental
variables.4 We used Stata’s “regress” command to
invoke OLS regression, which we refer to as OLS
when discussing the results. We then used Stata’s
“ivreg” command to invoke a two-stage least
squares approach.5 With this approach, researchers
must specify two stages. The first stage involves
using an instrumental variable, z, to predict the
endogenous independent variable, x. The second
stage then uses the predicted value from the first
stage as an independent variable. The intuition behind this approach is that
the first stage retains only the variance in x that is
shared with z, so the predicted value does not
share any variance that is related to the error term
in the second stage, provided that z is itself exogenous. We vary the strength of the
instrument by modifying the correlation between
x and z. We report results for two alternative
instrumental variable approaches: (1) IVWeak refers
to a two-stage approach using weak instruments
(i.e., the correlation between x and z is 0.1), and
(2) IVMod refers to a two-stage approach using
moderate instruments (i.e., the correlation between
x and z is 0.33).6
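The two stages can be sketched for the single-instrument case as follows. This is a minimal illustration, not Stata's implementation: the function names are hypothetical, there are no additional covariates, and the standard error uses residuals computed from the actual x (not the first-stage prediction), which is the usual two-stage least squares convention.

```python
import math


def ols(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(y, x, z):
    """Return (beta, standard error) for y on x, instrumenting x with z."""
    n = len(y)
    a1, b1 = ols(z, x)                      # first stage: x on z
    x_hat = [a1 + b1 * zi for zi in z]      # part of x explained by z
    a2, b2 = ols(x_hat, y)                  # second stage: y on predicted x
    # Residuals are based on the actual x, not on x_hat.
    resid = [yi - a2 - b2 * xi for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    m_hat = sum(x_hat) / n
    se = math.sqrt(sigma2 / sum((xh - m_hat) ** 2 for xh in x_hat))
    return b2, se
```

With one instrument, the second-stage slope is algebraically the same as the ratio cov(z, y) / cov(z, x), which is why a weak correlation between x and z (a small denominator) inflates the standard error.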
Outcome measures
We examined how endogeneity influences the bias
and efficiency of OLS regression by creating four
different outcome measures. We simulated 1,000
iterations of each condition. For each condition, we
saved the estimated β and the estimated standard
error. BetaMed represents the median estimate of
β for the 1,000 iterations in our simulations.
SEMed represents the median standard error for
the 1,000 iterations in our simulations. Values of
these measures that exceed (are less than) their
true values suggest positive (negative) bias.
We also included two outcome measures that
incorporated both the reported betas and standard
errors to examine the significance of the results
reported by each estimator. The expression 95int
denotes the extent to which an estimator reports
a 95 percent confidence interval that includes the
coefficient’s “true” (i.e., the coefficient used to
generate the data for each simulation) value. An
unbiased estimator will report confidence intervals
that include the true value in approximately
95 percent of the cases. PercSig refers to the
percentage of simulations that report statistically
significant coefficients. This measure assesses an
estimator’s power (e.g., Cohen, 1992). Estimators
reporting PercSig levels that exceed their “true”
values (based on power calculations) indicate
biased coefficients and/or standard errors.7
Examining 95int and PercSig together allows
us to uncover interesting relationships. Although
it is impossible to know a “true” value, fundamentally researchers should be interested in approximating the coefficient’s true value (95int). Practically speaking, however, researchers are also interested in identifying statistically significant relationships (Bettis, 2012). PercSig reveals the conditions under which endogeneity increases the probability of finding significance while decreasing the
probability of finding the “true” relationship.
RESULTS
Endogeneity and OLS
Table 1 illustrates the results of our simulations
comparing OLS, IVWeak, and IVMod. This table
includes three different endogeneity conditions: no
endogeneity, low endogeneity, and medium endogeneity. For each endogeneity condition, Table
1 displays the four outcome measures associated
with each of the estimators. We use two panels
to contrast simulations with a true effect
(true β = 0.1 in the top panel) and without a
true effect (true β = 0 in the bottom panel). Column 1 in Table 1’s top panel illustrates the effects
of endogeneity on OLS when the true value of
β should equal 0.1. When there is no endogeneity, OLS reported an unbiased BetaMed of
0.10. This unbiased beta, coupled with the reported
SEMed of 0.045, resulted in a 95int value of 95
percent and PercSig of 60 percent.8
When endogeneity was low, the reported beta
for OLS was twice its true value and increased
to 0.43 for moderate levels of endogeneity.
4 In supplementary analyses, we generated panel data and examined the effects of endogeneity on fixed- and random-effects models. Our results were substantively similar to those reported by OLS regression.
5 In supplementary analyses, results for Stata’s GMM and LIML options for ivreg were virtually identical.
6 The effect sizes we examined are consistent with standards used in other disciplines (e.g., Stock et al. 2002).
7 The regression simulations with no endogeneity provide the “true” values that are used for comparison purposes.
8 The reported power of 60 percent is consistent with statistical power calculators using a sample size of 500 and an effect size of 0.10.
Table 1. Main findings OLS vs. single instrument

                             OLS      Weak     Moderate  Weak and     Moderate and
                                      inst.    inst.     endo instr.  endo instr.
PANEL A: Sample size: 500, True β: 0.1
No endogeneity
  Beta                       0.100    0.088    0.100     1.095        0.399
  SE                         0.045    0.469    0.138     0.633        0.141
  95% interval               95%      100%     96%       84%          43%
  % significant              60%      1%       9%        26%          84%
Low endogeneity (0.1)
  Beta                       0.200    0.076    0.107     1.073        0.406
  SE                         0.045    0.478    0.135     0.598        0.139
  95% interval               39%      100%     96%       78%          40%
  % significant              69%      1%       12%       34%          86%
Medium endogeneity (0.3)
  Beta                       0.430    0.097    0.092     1.142        0.404
  SE                         0.042    0.475    0.136     0.556        0.129
  95% interval               0%       100%     95%       66%          36%
  % significant              100%     4%       13%       45%          86%
PANEL B: Sample size: 500, True β: 0
No endogeneity
  Beta                       0.001    −0.009   −0.004    0.940        0.335
  SE                         0.045    0.474    0.137     0.610        0.157
  95% interval               95%      100%     95%       84%          44%
  % significant              5%       1%       5%        16%          56%
Low endogeneity (0.1)
  Beta                       0.100    0.051    −0.009    1.016        0.336
  SE                         0.044    0.489    0.136     0.608        0.154
  95% interval               39%      100%     95%       77%          40%
  % significant              61%      0%       5%        23%          60%
Medium endogeneity (0.3)
  Beta                       0.330    −0.001   −0.007    0.987        0.338
  SE                         0.042    0.466    0.135     0.536        0.143
  95% interval               0%       98%      95%       63%          35%
  % significant              100%     2%       5%        37%          65%

Complementing the biased betas, Table 1’s top
panel shows that endogeneity also biases the
standard errors reported by OLS regression,
but this bias is opposite to that associated with
coefficients. Although the standard errors remain
constant for low levels of endogeneity, the
reported standard errors are actually lower for
moderate levels of endogeneity. As shown in
Table 1, 95int decreases as endogeneity increases;
the positively biased coefficient coupled with the
increasingly narrow confidence interval suggests
the reported confidence interval for OLS will be
less likely to include the true coefficient value as
endogeneity increases. At the same time, Table 1
shows that OLS is more likely to report a
statistically significant coefficient as endogeneity
increases. Taken together, these results indicate
that endogeneity dramatically increases the extent
to which the reported coefficient is statistically
significant, but this value is much less likely to
be the true value.
Table 1’s bottom panel displays the same
information for simulations in which β should
equal 0 (i.e., there is no true effect). The results
illustrate that when endogeneity is low, OLS
will report a positive and statistically significant
relationship in 61 percent of the cases. When
endogeneity increases to 0.33, OLS reports a
positive and statistically significant relationship in
100 percent of the cases. These results illustrate
how endogeneity leads researchers to report results
that support relationships that do not exist.9
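The OLS results for the no-true-effect case can be approximated with a small Monte Carlo sketch. It is not the authors' Stata code: the function names are hypothetical, x and e are assumed to have unit variances, and significance uses the normal critical value 1.96 as an approximation to the t distribution with 498 degrees of freedom.

```python
import math
import random
import statistics


def ols_slope_se(x, y):
    """Slope and conventional standard error from a simple OLS of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, math.sqrt(rss / (n - 2) / sxx)


def monte_carlo_ols(true_beta=0.0, rho=0.1, n=500, iterations=1000, seed=1):
    """Return the four outcome measures for OLS under endogeneity rho."""
    rng = random.Random(seed)
    betas, ses = [], []
    covered = significant = 0
    for _ in range(iterations):
        x, y = [], []
        for _ in range(n):
            e = rng.gauss(0.0, 1.0)
            u = rng.gauss(0.0, 1.0)
            xi = rho * e + math.sqrt(1.0 - rho ** 2) * u   # corr(x, e) = rho
            x.append(xi)
            y.append(1.0 + true_beta * xi + e)
        b, se = ols_slope_se(x, y)
        betas.append(b)
        ses.append(se)
        lo, hi = b - 1.96 * se, b + 1.96 * se
        covered += int(lo <= true_beta <= hi)      # interval contains the true value
        significant += int(not (lo <= 0.0 <= hi))  # interval excludes zero
    return {
        "beta_med": statistics.median(betas),
        "se_med": statistics.median(ses),
        "int95": covered / iterations,
        "perc_sig": significant / iterations,
    }
```

Under these assumptions, `monte_carlo_ols(true_beta=0.0, rho=0.1)` should yield a median beta near 0.1 even though the true coefficient is zero, with a majority of iterations flagged as statistically significant.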
9 We also ran the simulations modeling a negative relationship (i.e., β = −0.1). In these simulations, OLS was able to detect a statistically significant relationship in only 5 (0) percent of the cases with low (moderate) levels of endogeneity. When the direction of endogeneity is opposite of the coefficient, the erratic results of OLS are much like those related to suppression.
Endogeneity and instrumental variables
Columns two and three of Table 1 display
the results of weak and moderate instrumental
variables, respectively. When endogeneity is zero,
IVWeak and IVMod both produce BetaMed values
that are close to the true value. In contrast, the
SEMed values for both instrumental variable techniques
are dramatically higher than the corresponding
SEMed values for OLS, but Table 1 also shows that
SEMed decreases as instrument strength increases.
This combination results in a PercSig value of
less than 10 percent, underscoring the efficiency
problems associated with instrumental variables.
When endogeneity is at low or medium levels,
similar patterns remain.
As with OLS, this combination of BetaMed and
SEMed influences the outcome measures associated
with confidence intervals, but this combination
presents the exact opposite effects. Columns 2 and
3 in Table 1 show that 95int is essentially greater
than or equal to 95 percent in all cases; these results are
rarely statistically significant. This pattern directly
contrasts with that of OLS. For instrumental variables, the
unbiased betas are associated with such large standard errors and confidence intervals that the true
value is almost always included, but these large
confidence intervals almost always include zero.
Complementing instrument strength, we also
created simulations to investigate how instrument
endogeneity influences the results of instrumental
variable analysis. These simulations examine the outcomes at low levels of instrumental variable endogeneity (i.e., the correlation between z
and e was 0.10).10 As illustrated in Columns 4 and
5 of Table 1, instrument endogeneity substantively
biases BetaMed for both weak and moderate instruments. As compared to exogenous instruments,
BetaMed is nearly 1,000 percent larger for weak
instruments and 300 percent larger for moderate
instruments for low levels of instrument endogeneity. Interestingly, the effects of these changes
are worse for moderate, as compared to weak,
instruments in terms of 95int. Nevertheless, both
types of endogenous instruments produce coefficient estimates that are far more biased than those
reported by OLS.
10 This approach was conservative, as higher levels of endogeneity produced dramatically more biased results.
Testing for endogeneity
Addressing the deleterious effects of endogeneity
begins with testing for its presence, since endogeneity remediation in its absence yields less efficient estimates. The Hausman and Durbin-Wu-Hausman (DWH) tests both examine whether the
independent variable of interest is in fact endogenous. The quality of these tests, however, depends
on the appropriateness of the instruments. As Larcker and Rusticus (2010: 191) suggest, such tests
are valid “[u]nder the assumption of the appropriateness of the instruments.”
Panel A of Table 2 illustrates the effectiveness of
the DWH test as instrument strength and number
of observations vary.11 With weak instruments, the
DWH test identifies endogeneity in less than 20
percent of cases. With moderate instruments the
DWH test is more effective, but it still remains difficult
to detect low levels of endogeneity. In contrast,
Panel B of Table 2 illustrates that instrument
endogeneity has the opposite effect. Endogenous
instruments cause the test to report endogeneity
even when it is not present. Taken together,
weak instruments provide results suggesting that
endogeneity is not present (even when it is), and
endogenous instruments provide results suggesting
endogeneity is present (even when it is not).
DISCUSSION
Contributions
Strategy scholars are concerned about the effects
of endogeneity, and our results suggest that such
concerns are warranted. In full disclosure, as
researchers we were hopeful to present research
suggesting that concerns about endogeneity were
perhaps overstated. Our simulations suggest, however, that even low levels of endogeneity can bias
reported coefficient estimates by as much as 100
percent. And when the hypothesized true relationship is negative, low levels of endogeneity can create positive, negative, or no relationships.12 Consequently, it is difficult for us to conclude that
concerns about endogeneity are excessive.
11 We also ran the same simulations for the Hausman test, and the results were substantively similar.
12 In other words, when empirical results are significant and in the opposite direction from that suggested by well-established theory, negative endogeneity may be the culprit.
Table 2. Durbin-Wu-Hausman endogeneity test results (a)

                                  Observations
                                  100 (%)   500 (%)   1000 (%)
Panel A: No instrument endogeneity
Instrument strength zero
  No endogeneity                  4.9       5.0       5.2
  Low endogeneity                 5.0       4.8       5.0
  Moderate endogeneity            5.4       5.4       5.5
Instrument strength weak
  No endogeneity                  5.3       4.7       4.6
  Low endogeneity                 5.6       5.6       5.9
  Moderate endogeneity            6.7       12.4      19.1
Instrument strength moderate
  No endogeneity                  4.6       5.5       4.5
  Low endogeneity                 7.5       13.1      20.0
  Moderate endogeneity            21.5      78.5      97.4
Panel B: Low instrument endogeneity
Instrument strength zero
  No endogeneity                  17.6      60.9      87.6
  Low endogeneity                 17.5      60.7      88.5
  Moderate endogeneity            18.2      66.7      92.8
Instrument strength weak
  No endogeneity                  16.1      61.6      89.7
  Low endogeneity                 14.1      53.3      81.4
  Moderate endogeneity            10.3      35.2      60.9
Instrument strength moderate
  No endogeneity                  19.6      66.3      92.1
  Low endogeneity                 10.2      36.9      61.7
  Moderate endogeneity            6.1       6.3       5.9

(a) Values in the cells denote the percentage of the simulated DWH tests that were significant (i.e., finding endogeneity present).
OLS and type I errors
The simulations reveal two important results associated with endogeneity in the context of OLS.
First, OLS coefficients become more biased (i.e.,
they become increasingly larger than their true values) as endogeneity increases. Second, the standard errors reported by OLS become smaller (i.e.,
they become increasingly smaller than their true
values) as endogeneity increases. It is not surprising, then, that as endogeneity increases the 95
percent confidence intervals reported by OLS are
less likely to include the true value of the coefficient. At the same time, as endogeneity increases,
OLS becomes more likely to report statistically
significant results. Endogeneity, then, may make
it easier for researchers to find statistically significant relationships, but such significance may
be driven by endogeneity as opposed to the theorized relationships. Our results also highlight how
even low levels of instrument endogeneity increase
reported betas by nearly 1,000 percent, a bias that
is worse than even that of OLS regression. While instrument endogeneity increases the likelihood of statistically significant coefficients, it also decreases the
likelihood that the confidence intervals surrounding these estimates include the true value.
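The pattern described above can be reproduced with a small Monte Carlo sketch. This is not the simulation design used in this paper. It is a minimal illustration under assumed settings: a single standard-normal regressor, a true beta of 0.1, and an error term built to have correlation rho with the regressor. All function names and parameter values are hypothetical.

```python
import random

def ols_fit(x, y):
    """One-regressor OLS with intercept; returns (slope, reported standard error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    ssr = sum(((b - my) - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    return slope, (ssr / (n - 2) / sxx) ** 0.5

def simulate_ols(beta, rho, n=500, reps=200, seed=1):
    """Return (mean slope, mean SE, share of 95% CIs that contain beta)."""
    rng = random.Random(seed)
    slopes, ses, covered = [], [], 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # Assumed design: the error shares a component with x, so corr(x, u) = rho.
        u = [rho * xi + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for xi in x]
        y = [beta * xi + ui for xi, ui in zip(x, u)]
        b, se = ols_fit(x, y)
        slopes.append(b)
        ses.append(se)
        covered += abs(b - beta) <= 1.96 * se
    return sum(slopes) / reps, sum(ses) / reps, covered / reps

TRUE_BETA = 0.1  # hypothetical true coefficient
results = {rho: simulate_ols(TRUE_BETA, rho) for rho in (0.0, 0.1, 0.3)}
for rho, (mean_b, mean_se, coverage) in results.items():
    print(f"rho={rho}: mean beta={mean_b:.3f}, mean SE={mean_se:.4f}, coverage={coverage:.2f}")
```

In this design the expected OLS slope is beta + rho, so the estimate drifts upward as rho grows while the reported standard error shrinks, and the confidence intervals cover the true value less and less often.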
Instrumental variables and type II errors
Our simulations also provide a number of results
regarding instrumental variables that require discussion. First, our simulations demonstrate that
both weak and moderate instruments provide coefficient estimates that closely approximate their true
values, but the associated standard errors greatly
exceed those of OLS. As a result, instrumental
variable techniques provide unbiased coefficient
estimates but are associated with extremely low
levels of statistical power. Directly contrasting our
results regarding OLS, instrumental variables are
likely to produce confidence intervals that contain
the true value of beta, but these estimates are rarely
statistically significant.
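The trade-off between bias and precision can be sketched in the same spirit. The example below is an illustration under assumed settings rather than a reproduction of our simulations: one exogenous instrument z, a just-identified IV estimator computed as cov(z, y) / cov(z, x), and hypothetical values for instrument strength and endogeneity. The spread of the estimates across replications stands in for the standard error.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance (the 1/n scaling cancels in the ratios below)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def simulate_iv(beta, strength, endog, n=500, reps=300, seed=2):
    """Return lists of OLS and IV slope estimates across replications."""
    rng = random.Random(seed)
    ols, iv = [], []
    for _ in range(reps):
        z = [rng.gauss(0, 1) for _ in range(n)]   # exogenous instrument
        v = [rng.gauss(0, 1) for _ in range(n)]   # part of x that also drives the error
        x = [strength * zi + (1 - strength ** 2) ** 0.5 * vi for zi, vi in zip(z, v)]
        u = [endog * vi + (1 - endog ** 2) ** 0.5 * rng.gauss(0, 1) for vi in v]
        y = [beta * xi + ui for xi, ui in zip(x, u)]
        ols.append(cov(x, y) / cov(x, x))
        iv.append(cov(z, y) / cov(z, x))          # just-identified IV estimator
    return ols, iv

# Hypothetical settings: true beta 0.1, instrument strength 0.3, endogeneity 0.3.
ols, iv = simulate_iv(beta=0.1, strength=0.3, endog=0.3)
print("OLS: mean", round(statistics.mean(ols), 3), "sd", round(statistics.stdev(ols), 3))
print("IV : median", round(statistics.median(iv), 3), "sd", round(statistics.stdev(iv), 3))
```

The median is reported for the IV estimates because the just-identified estimator can produce occasional extreme values. Under these assumptions the IV estimates center near the true beta but vary far more than the OLS estimates, which is the source of the low statistical power.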
Testing for endogeneity
Finally, our simulations suggest that the effectiveness of endogeneity tests depends on instrument quality. In other words, weak and/or endogenous instruments yield suspect results whereas
stronger, exogenous instruments reveal endogeneity. Accordingly, tests that rely on weak and/or
endogenous instruments may mislead authors,
reviewers, editors, and general readers.
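One common way to carry out a DWH-type test is the regression-based version: save the first-stage residuals, add them to the structural equation, and test their coefficient. The sketch below assumes that version, a single exogenous instrument, and hypothetical parameter values; it is not the code behind the table above. The helper names `fit`, `dwh_t`, and `rejection_rate` are illustrative.

```python
import random

def fit(x, y):
    """One-regressor OLS with intercept; returns (slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

def dwh_t(y, x, z):
    """t-statistic on the first-stage residual in the regression y ~ x + vhat."""
    _, vhat = fit(z, x)                 # first stage: x on the instrument
    _, y_res = fit(x, y)                # Frisch-Waugh: partial x out of y ...
    _, v_res = fit(x, vhat)             # ... and out of vhat
    coef, e = fit(v_res, y_res)         # coefficient on vhat, full-model residuals
    s2 = sum(r * r for r in e) / (len(y) - 3)
    return coef / (s2 / sum(r * r for r in v_res)) ** 0.5

def rejection_rate(strength, endog, n=500, reps=200, seed=3):
    """Share of replications in which the test rejects at the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        z = [rng.gauss(0, 1) for _ in range(n)]
        v = [rng.gauss(0, 1) for _ in range(n)]
        x = [strength * zi + (1 - strength ** 2) ** 0.5 * vi for zi, vi in zip(z, v)]
        u = [endog * vi + (1 - endog ** 2) ** 0.5 * rng.gauss(0, 1) for vi in v]
        y = [0.1 * xi + ui for xi, ui in zip(x, u)]
        hits += abs(dwh_t(y, x, z)) > 1.96
    return hits / reps

rate_null = rejection_rate(strength=0.5, endog=0.0)    # no endogeneity
rate_strong = rejection_rate(strength=0.5, endog=0.3)  # stronger instrument
rate_weak = rejection_rate(strength=0.1, endog=0.3)    # weak instrument
print(rate_null, rate_strong, rate_weak)
```

With the stronger instrument the test should flag endogeneity in most samples, whereas with the weak instrument it should rarely do so even though the same amount of endogeneity is present.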
Recommendations
The results of our simulations allow us to provide
a number of recommendations to researchers
confronting endogeneity. Our results indicate that
stronger instrumental variables produce more
accurate betas than OLS and that the benefit
of stronger instruments is smaller standard
errors. Because of the importance of instrument
relevance, strategy researchers should start by
identifying at least moderately strong instruments.
Authors should always report instrument strength
by noting the F-statistic in the first stage associated
with the addition of the instrumental variable(s)
(e.g., Larcker and Rusticus, 2010; Stock et al.,
2002). Without this detail, it is difficult for readers
to understand the influence of the instruments in
the models. Consistent with the recommendations
of econometricians (Angrist and Pischke, 2009;
Kennedy, 2008), researchers should report the full
results for the first-stage models, and these models
should include the controls used in the second stage (i.e., the structural model).
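As a minimal sketch of the first-stage F-statistic, consider the simplest case of one instrument and no controls, where the F-statistic equals the squared t-statistic on the instrument. With controls, the relevant quantity is the incremental F for adding the instrument(s) to a first stage that already contains those controls, which this sketch does not cover. The data-generating values are hypothetical.

```python
import random

def first_stage_f(x, z):
    """F-statistic for the instrument in a first-stage regression of x on z."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    szz = sum((a - mz) ** 2 for a in z)
    sxx = sum((b - mx) ** 2 for b in x)
    szx = sum((a - mz) * (b - mx) for a, b in zip(z, x))
    r2 = szx ** 2 / (szz * sxx)          # first-stage R-squared
    return r2 / (1 - r2) * (n - 2)       # equals the squared t-statistic on z

def draw_first_stage(strength, n=1000, seed=4):
    """Simulate an instrument z and a regressor x with corr(x, z) = strength."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [strength * zi + (1 - strength ** 2) ** 0.5 * rng.gauss(0, 1) for zi in z]
    return x, z

f_stats = {}
for label, strength in (("weak", 0.05), ("moderate", 0.5)):
    x, z = draw_first_stage(strength)
    f_stats[label] = first_stage_f(x, z)
    print(label, round(f_stats[label], 1))
```

A first-stage F well above the commonly cited rule-of-thumb value of 10 suggests a relevant instrument; values near or below it signal a weak one.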
After identifying moderately strong instruments,
researchers should also examine the potential
endogeneity of the instruments. Compared to
exogenous instruments, endogenous instruments
dramatically raise the likelihood of reporting a statistically significant result. Given such enormous
bias, we recommend testing for instrument endogeneity using the Sargan (1958) test or Hansen’s
(1982) J-statistic (for other tests of instrument
endogeneity, see Bascle, 2008).
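The logic of the Sargan test can be sketched as follows: estimate the model by two-stage least squares, regress the resulting residuals on all instruments, and compare n times the R-squared of that regression with a chi-square distribution whose degrees of freedom equal the number of instruments minus the number of endogenous regressors. The example assumes two instruments and one endogenous regressor (so one degree of freedom), with hypothetical parameter values and helper names.

```python
import random

def center(a):
    m = sum(a) / len(a)
    return [v - m for v in a]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def fitted2(y, a, b):
    """Fitted values from OLS of centered y on two centered regressors a and b."""
    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    say, sby = dot(a, y), dot(b, y)
    det = saa * sbb - sab ** 2
    ca = (sbb * say - sab * sby) / det
    cb = (saa * sby - sab * say) / det
    return [ca * p + cb * q for p, q in zip(a, b)]

def sargan(y, x, z1, z2):
    """Sargan statistic: n * R^2 from regressing 2SLS residuals on both instruments."""
    y, x, z1, z2 = center(y), center(x), center(z1), center(z2)
    xhat = fitted2(x, z1, z2)                      # first stage
    beta = dot(xhat, y) / dot(xhat, x)             # 2SLS slope
    u = [yi - beta * xi for yi, xi in zip(y, x)]   # residuals use the actual x
    uhat = fitted2(u, z1, z2)
    return len(y) * dot(uhat, uhat) / dot(u, u)

def rejection_rate(z2_endog, n=500, reps=200, seed=5):
    """Share of samples where the statistic exceeds 3.84 (chi-square, 1 df, 5%)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        z1 = [rng.gauss(0, 1) for _ in range(n)]
        z2 = [rng.gauss(0, 1) for _ in range(n)]
        v = [rng.gauss(0, 1) for _ in range(n)]
        x = [0.4 * a + 0.4 * b + 0.8 * c for a, b, c in zip(z1, z2, v)]
        # Assumed design: z2 enters the error directly when z2_endog is nonzero.
        u = [z2_endog * b + 0.3 * c + rng.gauss(0, 1) for b, c in zip(z2, v)]
        y = [0.1 * xi + ui for xi, ui in zip(x, u)]
        hits += sargan(y, x, z1, z2) > 3.84
    return hits / reps

rate_valid = rejection_rate(z2_endog=0.0)    # both instruments exogenous
rate_invalid = rejection_rate(z2_endog=0.3)  # second instrument endogenous
print(rate_valid, rate_invalid)
```

In this sketch the first instrument is always exogenous, so the test has a valid benchmark against which the second instrument can be checked.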
It should be noted that multiple instruments
are required to test for instrument endogeneity
(e.g., Kennedy, 2008). In supplementary analyses
not reported, our simulations also show that
multiple instruments help to decrease standard
errors. For these reasons, we cannot overemphasize the importance of identifying multiple (as
opposed to single) instruments.^13 Nevertheless,
our results should give strategy researchers
pause regarding the number of existing studies
in strategic management that have advanced
theory while using only a single—and potentially
endogenous—instrument. If authors can only find
one instrument of sufficient strength, they must
present compelling theoretical evidence that the
instrument is not itself endogenous. In this regard,
authors should keep in mind that the instrumental
variable should be uncorrelated with the residuals
associated with the dependent variable—and not
the dependent variable itself.
Finally, but perhaps most importantly, authors
should compare their results to OLS (or a related
procedure such as logistic, fixed-effects regression,
etc.) and test for endogeneity. We emphasize this is
the final step in the process, because quality (i.e.,
relevant and exogenous) instruments are required
for the Hausman or DWH tests. Without quality
instruments, these tests will almost surely fail to
report endogeneity—even when it exists. In our
view, approaches such as “We tested for endogeneity and found that it was not an issue” are not
acceptable. Instead, reviewers, editors, and ultimately readers need to understand clearly the properties of the instruments used in endogeneity tests.
Finding instrumental variables
Researchers interested in endogeneity understand
that the most difficult aspect of instrumental variable analysis involves identifying suitable instruments. Nevertheless, economists may have some
suggestions in this regard. Angrist and Pischke
(2009: 117) suggest that, “Good instruments come
from a combination of institutional knowledge and
ideas about the processes determining the variable of interest.” Because researchers in strategy
address a wide array of topics and contexts, it
is difficult to provide advice that applies equally
to all researchers. We propose that researchers in
strategy can learn from the insights and examples
offered by economists, who have been dealing with
endogeneity for far longer. Kennedy (2008) summarizes a number of creative ideas in this regard.
Distance from a college, for instance, can serve as
an instrument for years of education as a predictor
of wages. More generally, he also discusses how
researchers may employ lagged exogenous variables as instruments for endogenous predictors.
In our view, though, the key to remedying endogeneity is to consider its effects before submitting manuscripts for review. If reviewers highlight
endogeneity after the original submission, it will
likely prove difficult for authors to identify suitable
instruments that fit neatly within existing models.
For researchers using archival data, for instance,
it may prove difficult to identify a natural experiment after the fact that helps to reframe a study
examining the effects of a strategic decision (e.g.,
market entry) on firm performance. Correcting for
endogeneity in a post-hoc fashion is even more
problematic for researchers relying on survey data.
^13 Furthermore, it is important to note that multiple endogenous
instruments will not be able to detect instrument endogeneity.
Supplemental analyses found that detection requires at least one
moderately strong exogenous instrument among the multiple
instruments.
CONCLUSION
In sum, given noted publication biases that focus
on statistical significance (Bettis, 2012), we can
speculate that endogeneity has led to a myriad
of type I errors among published papers in
strategy. While it is not our purpose to highlight
specific cases, we can safely presume that even
small amounts of endogeneity have led to
a number of published papers reporting
statistically significant results driven not by
the purported independent variables but instead
by endogeneity. We are hopeful that the results
and recommendations we provide are helpful to
authors, reviewers and editors.
ACKNOWLEDGEMENTS
We thank Brian Connelly, Ryan Krause, Don
Lange and Phil Podsakoff for their constructive
comments on earlier versions of this manuscript.
We also thank Margarethe Wiersema and two
anonymous reviewers for their guidance in shaping
the final manuscript.
REFERENCES
Angrist JD, Pischke JS. 2009. Mostly Harmless Econometrics: an Empiricist’s Companion. Princeton University Press: Princeton, NJ.
Bascle G. 2008. Controlling for endogeneity with instrumental variables in strategic management research.
Strategic Organization 6(3): 285–327.
Bettis RA. 2012. The search for asterisks: compromised
statistical tests and flawed theories. Strategic Management Journal 33(1): 108–113.
Cohen J. 1992. A power primer. Psychological Bulletin
112(1): 155–159.
Cohen J, Cohen P, West SG, Aiken LS. 2003. Applied
Multiple Regression/Correlation Analysis for the
Behavioral Sciences. Erlbaum: Mahwah, NJ.
Hamilton BH, Nickerson JA. 2003. Correcting for endogeneity in strategic management research. Strategic
Organization 1(1): 51–78.
Hansen LP. 1982. Large sample properties of generalized
method of moments estimators. Econometrica 50(4):
1029–1054.
Hoetker G, Mellewigt T. 2009. Choice and performance
of governance mechanisms: matching alliance governance to asset type. Strategic Management Journal
30(10): 1025–1044.
Kennedy P. 2008. A Guide to Econometrics (2nd edn).
Blackwell: Oxford, UK.
Larcker DF, Rusticus TO. 2010. On the use of instrumental variables in accounting research. Journal of
Accounting and Economics 49(3): 186–205.
Li J, Tang Y. 2010. CEO hubris and firm risk taking in
China: the moderating role of managerial discretion.
Academy of Management Journal 53(1): 45–68.
Sargan JD. 1958. The estimation of economic relationships using instrumental variables. Econometrica
26(3): 393–415.
Shaver JM. 1998. Accounting for endogeneity when
assessing strategy performance: does entry mode
choice affect FDI survival? Management Science 44(4):
571–585.
Shugan SM. 2004. Editorial: endogeneity in marketing
decision models. Marketing Science 23(1): 1–3.
Sirmon DG, Hitt MA. 2009. Contingencies within
dynamic managerial capabilities: interdependent
effects of resource investment and deployment on
firm performance. Strategic Management Journal
30(13): 1375–1394.
Stock JH, Wright JH, Yogo M. 2002. A survey of weak
instruments and weak identification in generalized
method of moments. Journal of Business & Economic
Statistics 20(4): 518–529.
Wang HC, He J, Mahoney JT. 2009. Firm-specific knowledge resources and competitive advantage: the roles
of economic- and relationship-based employee governance mechanisms. Strategic Management Journal
30(12): 1265–1285.