Strategic Management Journal
Strat. Mgmt. J., 35: 1070–1079 (2014)
Published online EarlyView 25 June 2013 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/smj.2136
Received 5 September 2012; Final revision received 19 February 2013

RESEARCH NOTES AND COMMENTARIES

THE PERILS OF ENDOGENEITY AND INSTRUMENTAL VARIABLES IN STRATEGY RESEARCH: UNDERSTANDING THROUGH SIMULATIONS

MATTHEW SEMADENI,¹* MICHAEL C. WITHERS,² and S. TREVIS CERTO³
¹ Kelley School of Business, Indiana University, Bloomington, Indiana, U.S.A.
² Mays Business School, Texas A&M University, College Station, Texas, U.S.A.
³ W.P. Carey School of Business, Arizona State University, Tempe, Arizona, U.S.A.

In this paper we use simulations to examine how endogeneity biases the results reported by ordinary least squares (OLS) regression. In addition, we examine how instrumental variable techniques help to alleviate such bias. Our results demonstrate severe bias even at low levels of endogeneity. Our results also illustrate how instrumental variables produce unbiased coefficient estimates, but instrumental variables are associated with extremely low levels of statistical power. Finally, our simulations highlight how stronger instruments improve statistical power and that endogenous instruments can report results that are inferior to those reported by OLS regression. Based on our results, we provide a series of recommendations for scholars dealing with endogeneity. Copyright 2013 John Wiley & Sons, Ltd.

INTRODUCTION

Shaver’s (1998) examination of market entry modes introduced to strategy scholars the influence of endogeneity on statistical results. Since that time, other scholars in strategic management have described the consequences of endogeneity as well as the techniques to circumvent it (e.g., Bascle, 2008; Hamilton and Nickerson, 2003).
Endogeneity occurs when an independent variable is correlated with the error term (also known as the “disturbance” or “residual”) in an ordinary least squares (OLS) regression model (for a review, see Kennedy, 2008). Endogeneity may bias the assertions that researchers make regarding hypothesized effects. Accordingly, reviewers and editors in multiple disciplines have increasingly identified endogeneity as an alternative explanation for results presented in papers they evaluate, and endogeneity represents an increasingly frequent reason for manuscript rejection (e.g., Larcker and Rusticus, 2010; Shugan, 2004).

Keywords: endogeneity; simulations; instrumental variables; ordinary least squares; strategy research
*Correspondence to: Matthew Semadeni, Kelley School of Business, Indiana University, 1309 East Tenth Street, Bloomington, IN 47405, USA. E-mail: [email protected]

The objective of this paper is to provide a series of simulations to illustrate the consequences of endogeneity and the robustness of the techniques prescribed to circumvent these consequences. Simulations allow us to investigate the stability of analytical techniques while knowing what results “should be,” and our techniques and outcome variables allow us to provide a number of contributions to the literature in strategy on endogeneity.

First, the outcome measures we develop illustrate that addressing endogeneity involves a balancing act in which the disease (i.e., endogeneity) introduces Type I error, whereas the cure (i.e., instrumental variables) introduces Type II error. Endogeneity, even at low levels, increases the likelihood of reporting statistically significant (yet untrue) coefficient estimates. At the same time, correcting for endogeneity with instrumental variables increases the likelihood of reporting coefficient estimates that are near their true values, but these reported estimates are rarely statistically significant.

Second, our simulations reveal and highlight the individual roles of instrument strength and instrument endogeneity. Weaker instruments (i.e., instruments weakly correlated with the endogenous variable) result in higher standard errors (i.e., lower efficiency), which decrease the likelihood of reporting statistically significant relationships when they exist. In contrast, instrument endogeneity results in statistically significant coefficient estimates that differ from their true values (i.e., biased estimates). In fact, the results of our simulations show that endogenous instruments often produce coefficient estimates that are inferior to those reported by OLS regression (i.e., the “cure” is worse than the “disease”).

Third, our simulations demonstrate that standard tests detecting the presence of endogeneity are highly dependent on the quality of the instrumental variables used in the analyses. Our results indicate that these tests will rarely detect existing endogeneity when researchers use weak instruments. These tests perform even more poorly when the instruments are themselves endogenous. Our simulations highlight the ease with which researchers (and reviewers and editors) may mistakenly use test results to dismiss concerns regarding endogeneity.

Taken together, our results allow us to extend previous recommendations for authors, reviewers, and editors considering the potential impact of endogeneity. Given our findings regarding the pernicious effects of endogeneity at even low levels, our recommendations center around testing for endogeneity. Bascle (2008) presents a comprehensive approach for scholars to implement when dealing with endogeneity. The first step in this approach involves asking, “Is there an endogeneity problem?” (Bascle, 2008: 287). Our simulations yield the counterintuitive result that researchers must first identify strong and exogenous instruments before they can know whether endogeneity is problematic.
Only with such instruments can researchers effectively test for the presence of endogeneity and dismiss concerns about it.

The relevance of strategy research depends on both theory generation and the empirical work that tests proposed theoretical perspectives. We contribute to strategy research by providing recommendations to improve current practices used by researchers, reviewers, and editors grappling with the effects of endogeneity. We believe current practices must be improved to prevent both Type I and Type II errors. As we describe later, our review of articles published in SMJ between 2005 and 2012 reveals alarming inconsistencies regarding how strategy researchers approach and remedy endogeneity. We are hopeful that our simulations will help to improve the rigor of the empirical studies designed to test our theories and enhance the overall quality of knowledge in our field that is based on the results of these empirical tests.

ENDOGENEITY

Endogeneity defined

While many scholars in strategic management discuss endogeneity, it is important to clarify the precise meaning of the term. Endogeneity is most typically described in the context of ordinary least squares (OLS) regression. Equation 1 represents a basic OLS regression equation:

$y_i = \alpha + \beta x_i + \varepsilon_i$    (1)

In this equation, $y_i$ represents the dependent variable, $\alpha$ represents a constant, $\beta$ represents the coefficient, $x_i$ represents the independent variable, and $\varepsilon_i$ represents the error term. The error term in an OLS regression model captures the variation in the dependent variable that the independent variables do not predict, and it should vary randomly. When the error term is correlated with an independent variable, however, the errors are not random; this leads to biased coefficient estimates (for a review, see Kennedy, 2008).
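
To make the source of this bias concrete, a short derivation based on Equation 1 may help. It is a standard textbook result rather than part of the original analysis, and the symbols $\sigma_x$, $\sigma_\varepsilon$, and $\rho_{x\varepsilon}$ (the standard deviations of x and ε and the correlation between them) are notation introduced here purely for illustration.

```latex
\begin{aligned}
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}}
  &= \frac{\operatorname{Cov}(x_i, y_i)}{\operatorname{Var}(x_i)}
   = \frac{\operatorname{Cov}(x_i,\ \alpha + \beta x_i + \varepsilon_i)}{\operatorname{Var}(x_i)} \\
  &= \beta + \frac{\operatorname{Cov}(x_i, \varepsilon_i)}{\operatorname{Var}(x_i)}
   = \beta + \rho_{x\varepsilon}\,\frac{\sigma_\varepsilon}{\sigma_x}
\end{aligned}
```

If x and ε are uncorrelated, the second term vanishes and OLS recovers β. If they are correlated, the second term does not shrink as the sample grows: assuming equal standard deviations for the sake of the example, a correlation of 0.1 shifts the estimate by 0.1 at any sample size.
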
Bias occurs when the coefficient estimate based on a sample does not on average equal the true value of the coefficient in the population (Cohen et al., 2003: 117). Therefore, a critical assumption of OLS regression is that the independent variable and the error term are uncorrelated.

According to Kennedy (2008), four different issues may potentially introduce endogeneity in OLS regression models: errors-in-variables (i.e., measurement error), autoregression, omitted variables, and simultaneous causality. In each of these scenarios, OLS regression reports biased coefficients. Instead of estimating the “true” relationship between the independent variable and the dependent variable, OLS regression mistakenly includes the correlation between the independent variable and the error term in the estimation of the independent variable’s coefficient.

Endogeneity in strategy research

Shaver’s (1998) work on international entry modes represents perhaps the most influential study of endogeneity in strategic management. Shaver utilizes a sample of firms engaging in international expansion to study whether acquisitions or greenfield ventures lead to higher levels of firm performance. According to Shaver, managers choose entry modes based on their perceptions of anticipated performance. It is these unobservable perceptions that potentially lead to endogeneity in this empirical context.

Shaver’s ideas have caused scholars to consider the potential endogeneity of a variety of independent variables.
Hamilton and Nickerson (2003: 51) summarize this perspective by stating: “the field of strategic management is fundamentally predicated on the idea that management’s decisions are endogenous to their expected performance outcomes—if not, managerial decision making is not strategic; it is superfluous.” Accordingly, almost all firm-level variables (e.g., R&D spending, acquisitions) can be considered decisions made by managers to influence firm outcomes.

Shaver’s introduction to endogeneity involved a dichotomous independent variable: the managerial decision to engage in an acquisition versus a greenfield venture. Nevertheless, scholars have also treated continuous variables such as employee stock ownership (Wang, He, and Mahoney, 2009), human capital investments (Sirmon and Hitt, 2009), and CEO hubris (Li and Tang, 2010) as endogenous variables.

OVERCOMING ENDOGENEITY WITH INSTRUMENTAL VARIABLES

Instrumental variables

Strategy researchers rely on instruments to model continuous endogenous independent variables. The terms instrumental variable estimation and two-stage least squares are often used interchangeably, but instrumental variable techniques may use estimators other than least squares, such as generalized method of moments (GMM) or limited-information maximum likelihood (LIML). Researchers employ such techniques when the endogenous variable does not represent a dichotomous decision.

Instrumental variables must fulfill two conditions: relevance and exogeneity (Kennedy, 2008).¹ Relevance refers to the degree to which the instrument corresponds with the endogenous variable. A literature on instrument strength (i.e., strong vs. weak instruments) examines how relevance influences model results, and scholars have created recommendations based on the F-statistics of first-stage regressions to determine instrument relevance (Stock, Wright, and Yogo, 2002).
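
Both conditions can be read directly off the simple instrumental variable estimator with a single instrument z. The expression below is a standard result offered as a sketch under Equation 1; $\rho_{z\varepsilon}$, $\rho_{zx}$, $\sigma_\varepsilon$, and $\sigma_x$ are illustrative notation that does not appear in the original text.

```latex
\begin{aligned}
\operatorname{plim}\,\hat{\beta}_{\mathrm{IV}}
  &= \frac{\operatorname{Cov}(z_i, y_i)}{\operatorname{Cov}(z_i, x_i)}
   = \beta + \frac{\operatorname{Cov}(z_i, \varepsilon_i)}{\operatorname{Cov}(z_i, x_i)}
   = \beta + \frac{\rho_{z\varepsilon}}{\rho_{zx}}\cdot\frac{\sigma_\varepsilon}{\sigma_x}
\end{aligned}
```

Exogeneity sets the numerator of the bias term to zero, while relevance keeps the denominator away from zero. When both conditions fail together the ratio can become large: assuming equal standard deviations, $\rho_{z\varepsilon} = 0.1$ combined with $\rho_{zx} = 0.1$ implies an asymptotic bias of about 1.0, whereas the same instrument endogeneity with $\rho_{zx} = 0.33$ implies a bias of about 0.3.
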
The general conclusion of this research indicates that stronger instruments (higher F-statistics), as opposed to weaker instruments (lower F-statistics), are better for two-stage approaches. Complementing instrument relevance, exogeneity refers to the degree to which an instrument is uncorrelated with the disturbance term in the second stage. Testing for instrument exogeneity allows researchers to reduce the chance that they replace one endogenous independent variable with another (see Bascle, 2008).

¹ Exogeneity and endogeneity are opposites (i.e., a variable that is not “exogenous” is “endogenous”).

Perhaps the most problematic aspect of instrumental variable estimation involves identifying suitable instruments. In practice it remains difficult to find variables that correlate strongly with the endogenous variable but not with the error term in the second stage. Instrument relevance and exogeneity often work against one another. As instrument strength increases (i.e., the instrument becomes more like the endogenous independent variable), it is perhaps not surprising that it may be related to the error term in the same way as the endogenous variable.

Instrumental variables in strategy research

To understand better how strategy researchers use instrumental variables, we reviewed all empirical papers appearing in SMJ between 2005 and 2012 that incorporated instrumental variables to analyze continuous endogenous variables. We noted the degree to which researchers (1) tested for endogeneity; (2) used more than one instrument; (3) tested for instrument strength; and (4) tested for instrument exogeneity. To code for tests of endogeneity, we noted whether authors used either a Hausman or Durbin-Wu-Hausman test. We also noted whether authors examined instrument strength using thresholds such as those developed by Stock et al. (2002).
Finally, we noted whether a formal test was used to detect instrument exogeneity.

The results of our review reveal an alarming lack of consistency in terms of how strategy researchers report instrumental variables. We identified 24 articles that use two-stage least squares for either the primary or supplementary analyses. Of these 24 articles, 10 test for endogeneity, 9 use more than one instrument, 3 test for instrument strength, and 5 test for instrument endogeneity. Finally, two-thirds of these articles do not report the results corresponding to the first stage of the model.

It is important to note procedures that we did not include in our coding scheme. Many strategy researchers focused primarily on statistical significance to test whether instruments are relevant and/or exogenous. Some scholars, for example, suggest that an instrument is relevant if it is statistically related to the endogenous variable and exogenous if it is not related to the ultimate dependent variable. Larcker and Rusticus (2010: 192) declare such approaches “completely inappropriate,” so we do not count studies adopting these methods as effective tests for instrument relevance or exogeneity.

During our review, Hoetker and Mellewigt (2009) stood out as providing one of the most comprehensive descriptions of the procedures used to implement instrumental variables.² Nevertheless, the authors did not note that the F-tests regarding instrument strength revealed weak instruments (the corresponding F-test value of 3.29 was well below Stock et al.’s (2002) recommended value of 11.59 for two instruments). The fact that one of the most comprehensive and transparent uses of instrumental variables in strategy identified weak instruments but still proceeded to report the results should give scholars pause. In the following sections, we use simulations to highlight the importance of understanding instrument strength and exogeneity when using instrumental variables.

² The authors used multiple instruments, tested for endogeneity (using the Durbin-Wu-Hausman test), and referenced both instrument strength (using an F-test) and exogeneity (using Hansen’s J test [1982]).

METHODOLOGY

To understand better the implications of endogeneity, we provide a series of simulations. First, we treat endogeneity as a continuous (as opposed to dichotomous) condition and explore the implications of increasing endogeneity in OLS regression. Second, we examine how instrument strength and exogeneity influence the outcomes of instrumental variables. Finally, we study how instrument strength and exogeneity influence the tests to detect endogeneity.

Simulation design

Kennedy (2008) provides an excellent overview of the intuition underlying simulation techniques. We used Stata for our simulations, which involved two broad steps. First, we generated a dataset with 500 observations of dependent and independent variables with known properties. In this study, we generated y with the following equation:

$y = \alpha + \beta x + e$    (2)

In our simulations, we assigned a value of 1 to the intercept ($\alpha$) and set the value of $\beta$ to 0.1 to represent a small effect. We then generated normally distributed values for e, with a mean of 0. The primary issue in this simulation involves examining various levels of endogeneity. To simulate this, we generated independent variables (x) that varied in terms of the correlation with the error term (e). We created three categories of endogeneity: no endogeneity (corr[x, e] = 0), low endogeneity (corr[x, e] = 0.1), and medium endogeneity (corr[x, e] = 0.3).³ To do so, we followed Larcker and Rusticus (2010: 193): rather than directly choosing the parameters, we set the population correlations and then calculated the parameters necessary to obtain the desired correlations, allowing for a more natural interpretation.

³ In supplementary analyses, higher levels of endogeneity resulted in even more dramatic results.

Analytical models

We compared the effectiveness of two main analytic strategies: OLS regression and instrumental variables.⁴ We used Stata’s “regress” command to invoke OLS regression, which we refer to as OLS when discussing the results. We then used Stata’s “ivreg” command to invoke a two-stage least squares approach.⁵ With this approach, researchers must specify two stages. The first stage involves using an instrumental variable, z, to predict the endogenous independent variable, x. The second stage then uses the predicted value from the first stage as an independent variable. The intuition behind this approach is that the first stage “partials out” any common variance between x and z, so the predicted value does not share any variance that is related to the error term in the second stage. We vary the strength of the instrument by modifying the correlation between x and z. We report results for two alternative instrumental variable approaches: (1) IVWeak refers to a two-stage approach using weak instruments (i.e., the correlation between x and z is 0.1), and (2) IVMod refers to a two-stage approach using moderate instruments (i.e., the correlation between x and z is 0.33).⁶

⁴ In supplementary analyses, we generated panel data and examined the effects of endogeneity on fixed- and random-effects models. Our results were substantively similar to those reported by OLS regression.
⁵ In supplementary analyses, results for Stata’s GMM and LIML options for ivreg were virtually identical.
⁶ The effect sizes we examined are consistent with standards used in other disciplines (e.g., Stock et al., 2002).

Outcome measures

We examined how endogeneity influences the bias and efficiency of OLS regression by creating four different outcome measures. We simulated 1,000 iterations of each condition. For each condition, we saved the estimated β and the estimated standard error. BetaMed represents the median estimate of β for the 1,000 iterations in our simulations. SEMed represents the median standard error for the 1,000 iterations in our simulations. Values of these measures that exceed (are less than) their true values suggest positive (negative) bias.

We also included two outcome measures that incorporated both the reported betas and standard errors to examine the significance of the results reported by each estimator. The expression 95int denotes the extent to which an estimator reports a 95 percent confidence interval that includes the coefficient’s “true” value (i.e., the coefficient used to generate the data for each simulation). An unbiased estimator will report confidence intervals that include the true value in approximately 95 percent of the cases. PercSig refers to the percentage of simulations that report statistically significant coefficients. This measure assesses an estimator’s power (e.g., Cohen, 1992). Estimators reporting PercSig levels that exceed their “true” values (based on power calculations) indicate biased coefficients and/or standard errors.⁷

Examining 95int and PercSig together allows us to uncover interesting relationships. Although it is impossible to know a “true” value in empirical research, fundamentally researchers should be interested in approximating the coefficient’s true value (95int). Practically speaking, however, researchers are also interested in identifying statistically significant relationships (Bettis, 2012). PercSig reveals the conditions under which endogeneity increases the probability of finding significance while decreasing the probability of finding the “true” relationship.

⁷ The regression simulations with no endogeneity provide the “true” values that are used for comparison purposes.

RESULTS

Endogeneity and OLS

Table 1 illustrates the results of our simulations comparing OLS, IVWeak, and IVMod. This table includes three different endogeneity conditions: no endogeneity, low endogeneity, and medium endogeneity. For each endogeneity condition, Table 1 displays the four outcome measures associated with each of the estimators. We use two panels to contrast simulations with a true effect (true β = 0.1 in the top panel) and without a true effect (true β = 0 in the bottom panel).

Column 1 in Table 1’s top panel illustrates the effects of endogeneity on OLS when the true value of β should equal 0.1. When there exists no endogeneity, OLS reported an unbiased BetaMed of 0.10. This unbiased beta, coupled with the reported SEMed of 0.045, resulted in a 95int value of 95 percent and PercSig of 60 percent.⁸ When endogeneity was low, the reported beta for OLS was twice its true value and increased to 0.43 for moderate levels of endogeneity.

⁸ The reported power of 60 percent is consistent with statistical power calculators using a sample size of 500 and an effect size of 0.10.

Table 1. Main findings: OLS vs. single instrument

PANEL A: Sample size: 500, True β: 0.1

| Condition | Measure | OLS | Weak inst. | Moderate inst. | Weak and endo instr. | Moderate and endo instr. |
|---|---|---|---|---|---|---|
| No endogeneity | Beta | 0.100 | 0.088 | 0.100 | 1.095 | 0.399 |
| No endogeneity | SE | 0.045 | 0.469 | 0.138 | 0.633 | 0.141 |
| No endogeneity | 95% interval | 95% | 100% | 96% | 84% | 43% |
| No endogeneity | % significant | 60% | 1% | 9% | 26% | 84% |
| Low endogeneity (0.1) | Beta | 0.200 | 0.076 | 0.107 | 1.073 | 0.406 |
| Low endogeneity (0.1) | SE | 0.045 | 0.478 | 0.135 | 0.598 | 0.139 |
| Low endogeneity (0.1) | 95% interval | 39% | 100% | 96% | 78% | 40% |
| Low endogeneity (0.1) | % significant | 69% | 1% | 12% | 34% | 86% |
| Medium endogeneity (0.3) | Beta | 0.430 | 0.097 | 0.092 | 1.142 | 0.404 |
| Medium endogeneity (0.3) | SE | 0.042 | 0.475 | 0.136 | 0.556 | 0.129 |
| Medium endogeneity (0.3) | 95% interval | 0% | 100% | 95% | 66% | 36% |
| Medium endogeneity (0.3) | % significant | 100% | 4% | 13% | 45% | 86% |

PANEL B: Sample size: 500, True β: 0

| Condition | Measure | OLS | Weak inst. | Moderate inst. | Weak and endo instr. | Moderate and endo instr. |
|---|---|---|---|---|---|---|
| No endogeneity | Beta | 0.001 | −0.009 | −0.004 | 0.940 | 0.335 |
| No endogeneity | SE | 0.045 | 0.474 | 0.137 | 0.610 | 0.157 |
| No endogeneity | 95% interval | 95% | 100% | 95% | 84% | 44% |
| No endogeneity | % significant | 5% | 1% | 5% | 16% | 56% |
| Low endogeneity (0.1) | Beta | 0.100 | 0.051 | −0.009 | 1.016 | 0.336 |
| Low endogeneity (0.1) | SE | 0.044 | 0.489 | 0.136 | 0.608 | 0.154 |
| Low endogeneity (0.1) | 95% interval | 39% | 100% | 95% | 77% | 40% |
| Low endogeneity (0.1) | % significant | 61% | 0% | 5% | 23% | 60% |
| Medium endogeneity (0.3) | Beta | 0.330 | −0.001 | −0.007 | 0.987 | 0.338 |
| Medium endogeneity (0.3) | SE | 0.042 | 0.466 | 0.135 | 0.536 | 0.143 |
| Medium endogeneity (0.3) | 95% interval | 0% | 98% | 95% | 63% | 35% |
| Medium endogeneity (0.3) | % significant | 100% | 2% | 5% | 37% | 65% |

Complementing the biased betas, Table 1’s top panel shows that endogeneity also biases the standard errors reported by OLS regression, but this bias is opposite of that associated with coefficients. Although the standard errors remain constant for low levels of endogeneity, the reported standard errors are actually lower for moderate levels of endogeneity. As shown in Table 1, 95int decreases as endogeneity increases; the positively biased coefficient coupled with the increasingly narrower confidence interval suggests the reported confidence interval for OLS will be less likely to include the true coefficient value as endogeneity increases. At the same time, Table 1 shows that OLS is more likely to report a statistically significant coefficient as endogeneity increases. Taken together, these results indicate that endogeneity dramatically increases the extent to which the reported coefficient is statistically significant, but this value is much less likely to be the true value.

Table 1’s bottom panel displays the same information for simulations in which β should equal 0 (i.e., there is no true effect). The results illustrate that when endogeneity is low, OLS will report a positive and statistically significant relationship in 61 percent of the cases. When endogeneity increases to 0.33, OLS reports a positive and statistically significant relationship in 100 percent of the cases.
These results illustrate how endogeneity leads researchers to report results that support relationships that do not exist.⁹

⁹ We also ran the simulations modeling a negative relationship (i.e., β = −0.1). In these simulations, OLS was able to detect a statistically significant relationship in only 5 (0) percent of the cases with low (moderate) levels of endogeneity. When the direction of endogeneity is opposite of the coefficient, the erratic results of OLS are much like those related to suppression.

Endogeneity and instrumental variables

Columns 2 and 3 of Table 1 display the results of weak and moderate instrumental variables, respectively. When endogeneity is zero, IVWeak and IVMod both produce BetaMed values that are close to the true value. In contrast, the SEMed values for both instrumental variable techniques are dramatically higher than the corresponding SEMed values for OLS, but Table 1 also shows that SEMed decreases as instrument strength increases. This combination results in a PercSig value of less than 10 percent, underscoring the efficiency problems associated with instrumental variables. When endogeneity is at low or medium levels, similar patterns remain.

As with OLS, this combination of BetaMed and SEMed influences the outcome measures associated with confidence intervals, but it produces the exact opposite effects. Columns 2 and 3 in Table 1 show that 95int is essentially greater than or equal to 95 percent in all cases, yet these results are rarely statistically significant. This pattern is the direct opposite of that observed for OLS. For instrumental variables, the unbiased betas are associated with such large standard errors and confidence intervals that the true value is almost always included, but these large confidence intervals almost always include zero.

Complementing instrument strength, we also create simulations to investigate how instrument endogeneity influences the results of instrumental variable analysis. We create simulations to examine the outcomes at low levels of instrumental variable endogeneity (i.e., the correlation between z and e was 0.10).¹⁰ As illustrated in Columns 4 and 5 of Table 1, instrument endogeneity substantively biases BetaMed for both weak and moderate instruments. As compared to exogenous instruments, BetaMed is nearly 1,000 percent larger for weak instruments and 300 percent larger for moderate instruments for low levels of instrument endogeneity. Interestingly, the effects of these changes are worse for moderate, as compared to weak, instruments in terms of 95int. Nevertheless, both types of endogenous instruments produce coefficient estimates that are far more biased than those reported by OLS.

¹⁰ This approach was conservative, as higher levels of endogeneity produced dramatically more biased results.

Testing for endogeneity

Addressing the deleterious effects of endogeneity begins with testing for its presence, since endogeneity remediation in its absence yields less efficient estimates. The Hausman and Durbin-Wu-Hausman (DWH) tests both examine whether the independent variable of interest is in fact endogenous. The quality of these tests, however, depends on the appropriateness of the instruments. As Larcker and Rusticus (2010: 191) suggest, such tests are valid “[u]nder the assumption of the appropriateness of the instruments.”

Panel A of Table 2 illustrates the effectiveness of the DWH test as instrument strength and number of observations vary.¹¹ With weak instruments, the DWH test identifies endogeneity in less than 20 percent of cases. With moderate instruments the DWH test is more effective, but it still remains difficult to detect low levels of endogeneity. In contrast, Panel B of Table 2 illustrates that instrument endogeneity has the opposite effect. Endogenous instruments cause the test to report endogeneity even when it is not present. Taken together, weak instruments provide results suggesting that endogeneity is not present (even when it is), and endogenous instruments provide results suggesting endogeneity is present (even when it is not).

¹¹ We also ran the same simulations for the Hausman test, and the results were substantively similar.

Table 2. Durbin-Wu-Hausman endogeneity test results (a)

Panel A: No instrument endogeneity

| Instrument strength | Endogeneity | 100 observations (%) | 500 observations (%) | 1000 observations (%) |
|---|---|---|---|---|
| Zero | No endogeneity | 4.9 | 5.0 | 5.2 |
| Zero | Low endogeneity | 5.0 | 4.8 | 5.0 |
| Zero | Moderate endogeneity | 5.4 | 5.4 | 5.5 |
| Weak | No endogeneity | 5.3 | 4.7 | 4.6 |
| Weak | Low endogeneity | 5.6 | 5.6 | 5.9 |
| Weak | Moderate endogeneity | 6.7 | 12.4 | 19.1 |
| Moderate | No endogeneity | 4.6 | 5.5 | 4.5 |
| Moderate | Low endogeneity | 7.5 | 13.1 | 20.0 |
| Moderate | Moderate endogeneity | 21.5 | 78.5 | 97.4 |

Panel B: Low instrument endogeneity

| Instrument strength | Endogeneity | 100 observations (%) | 500 observations (%) | 1000 observations (%) |
|---|---|---|---|---|
| Zero | No endogeneity | 17.6 | 60.9 | 87.6 |
| Zero | Low endogeneity | 17.5 | 60.7 | 88.5 |
| Zero | Moderate endogeneity | 18.2 | 66.7 | 92.8 |
| Weak | No endogeneity | 16.1 | 61.6 | 89.7 |
| Weak | Low endogeneity | 14.1 | 53.3 | 81.4 |
| Weak | Moderate endogeneity | 10.3 | 35.2 | 60.9 |
| Moderate | No endogeneity | 19.6 | 66.3 | 92.1 |
| Moderate | Low endogeneity | 10.2 | 36.9 | 61.7 |
| Moderate | Moderate endogeneity | 6.1 | 6.3 | 5.9 |

(a) Values in the cells denote the percentage of the simulated DWH tests that were significant (i.e., finding endogeneity present).

DISCUSSION

Contributions

Strategy scholars are concerned about the effects of endogeneity, and our results suggest that such concerns are warranted. In full disclosure, as researchers we were hopeful to present research suggesting that concerns about endogeneity were perhaps overstated. Our simulations suggest, however, that even low levels of endogeneity can bias reported coefficient estimates by as much as 100 percent. And when the hypothesized true relationship is negative, low levels of endogeneity can create positive, negative, or no relationships.¹² Consequently, it is difficult for us to conclude that concerns about endogeneity are excessive.

¹² In other words, when empirical results are significant and in the opposite direction from that suggested by well-established theory, negative endogeneity may be the culprit.

OLS and type I errors

The simulations reveal two important results associated with endogeneity in the context of OLS. First, OLS coefficients become more biased (i.e., they become increasingly larger than their true values) as endogeneity increases. Second, the standard errors reported by OLS become smaller (i.e., they become increasingly smaller than their true values) as endogeneity increases. It is not surprising, then, that as endogeneity increases the 95 percent confidence intervals reported by OLS are less likely to include the true value of the coefficient. At the same time, as endogeneity increases, OLS becomes more likely to report statistically significant results. Endogeneity, then, may make it easier for researchers to find statistically significant relationships, but such significance may be driven by endogeneity as opposed to the theorized relationships.
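
The OLS patterns summarized above can be reproduced in miniature. The sketch below is written in Python with NumPy rather than the Stata used in the study, and it rests on assumptions that the text does not specify: x and e are drawn as bivariate normal with unit variances, and significance is judged with the normal critical value of 1.96. The function name `simulate_ols` and the dictionary keys are hypothetical labels for the four outcome measures, not names taken from the original analysis.

```python
import numpy as np

def simulate_ols(rho_xe, beta=0.1, alpha=1.0, n=500, reps=1000, seed=0):
    """Monte Carlo summary of OLS when corr(x, e) = rho_xe.

    Assumes unit-variance, jointly normal x and e (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho_xe], [rho_xe, 1.0]])
    betas = np.empty(reps)
    ses = np.empty(reps)
    for r in range(reps):
        # Draw x and e with the requested population correlation.
        x, e = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
        y = alpha + beta * x + e
        X = np.column_stack([np.ones(n), x])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sigma2 = resid @ resid / (n - 2)
        # Conventional OLS standard error of the slope.
        ses[r] = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        betas[r] = coef[1]
    lower, upper = betas - 1.96 * ses, betas + 1.96 * ses
    return {
        "beta_med": float(np.median(betas)),   # median slope estimate
        "se_med": float(np.median(ses)),       # median standard error
        "int95": float(np.mean((lower <= beta) & (beta <= upper))),
        "perc_sig": float(np.mean((lower > 0) | (upper < 0))),
    }

if __name__ == "__main__":
    for rho in (0.0, 0.1, 0.3):
        print(rho, simulate_ols(rho))
```

Because the data-generating details here are assumptions, the printed values should be expected to follow the qualitative pattern of Table 1 (rising estimates, falling interval coverage, rising significance rates) rather than to match it digit for digit.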

Our results also highlight how even low levels of instrument endogeneity increase reported betas by nearly 1,000 percent, a bias that is worse than even OLS regression. While instrument endogeneity increases the likelihood of statistically significant coefficients, it also decreases the likelihood that the confidence intervals surrounding these estimates include the true value.

Instrumental variables and type II errors

Our simulations also provide a number of results regarding instrumental variables that require discussion. First, our simulations demonstrate that both weak and moderate instruments provide coefficient estimates that closely approximate their true values, but the associated standard errors greatly exceed those of OLS. As a result, instrumental variable techniques provide unbiased coefficient estimates but are associated with extremely low levels of statistical power. Directly contrasting our results regarding OLS, instrumental variables are likely to produce confidence intervals that contain the true value of beta, but these estimates are rarely statistically significant.

Testing for endogeneity

Finally, our simulations suggest that the effectiveness of endogeneity tests depends on instrument quality. In other words, weak and/or endogenous instruments yield suspect results whereas stronger, exogenous instruments reveal endogeneity. Accordingly, tests that rely on weak and/or endogenous instruments may mislead authors, reviewers, editors, and general readers.

Recommendations

The results of our simulations allow us to provide a number of recommendations to researchers confronting endogeneity. Our results indicate that stronger instrumental variables result in more accurate betas as compared to OLS, and the benefit of stronger instruments involves reduced standard errors. Because of the importance of instrument relevance, strategy researchers should start by identifying at least moderately strong instruments.
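
To illustrate why instrument strength matters for standard errors, the following sketch computes a single-instrument two-stage least squares estimate together with the first-stage F-statistic. It is a Python/NumPy illustration under stated assumptions, not the Stata “ivreg” routine used in the study: x, z, and e are drawn as trivariate normal with unit variances, and the helper names `draw_data` and `two_stage_least_squares` are hypothetical.

```python
import numpy as np

def draw_data(n, rho_xz, rho_xe, rho_ze=0.0, beta=0.1, alpha=1.0, rng=None):
    """Draw (y, x, z) with unit-variance x, z, e and the requested correlations."""
    if rng is None:
        rng = np.random.default_rng()
    cov = np.array([[1.0, rho_xz, rho_xe],
                    [rho_xz, 1.0, rho_ze],
                    [rho_xe, rho_ze, 1.0]])
    x, z, e = rng.multivariate_normal(np.zeros(3), cov, size=n).T
    return alpha + beta * x + e, x, z

def two_stage_least_squares(y, x, z):
    """Single-instrument 2SLS; returns (beta_hat, std_error, first_stage_F)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    X = np.column_stack([np.ones(n), x])
    # First stage: regress x on the instrument and keep the fitted values.
    g, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ g
    rss_full = np.sum((x - x_hat) ** 2)
    rss_null = np.sum((x - x.mean()) ** 2)
    # F-statistic for adding the one instrument to an intercept-only model.
    f_stat = (rss_null - rss_full) / (rss_full / (n - 2))
    # Second stage: regress y on the fitted values from the first stage.
    X_hat = np.column_stack([np.ones(n), x_hat])
    b, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    # Residuals for the standard error use the observed x, not x_hat.
    resid = y - X @ b
    sigma2 = resid @ resid / (n - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X_hat.T @ X_hat)[1, 1])
    return b[1], se, f_stat

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    for label, rho_xz in (("weak", 0.10), ("moderate", 0.33)):
        y, x, z = draw_data(500, rho_xz, rho_xe=0.3, rng=rng)
        b, se, f = two_stage_least_squares(y, x, z)
        print(f"{label}: beta={b:.3f} se={se:.3f} first-stage F={f:.1f}")
```

In this setup the standard error scales roughly with the inverse of the correlation between x and z, which is one way to see why a weaker instrument widens the confidence interval so sharply.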
Authors should always report instrument strength by noting the F-statistic in the first stage associated with the addition of the instrumental variable(s) (e.g., Larcker and Rusticus, 2010; Stock et al., 2002). Without this detail, it is difficult for readers to understand the influence of the instruments in the models. Consistent with the recommendations of econometricians (Angrist and Pischke, 2009; Kennedy, 2008), researchers should report the full results for the first-stage models, and these models should include the controls used in the second stage (i.e., the structural model).

After identifying moderately strong instruments, researchers should also examine the potential endogeneity of the instruments. Compared to exogenous instruments, endogenous instruments dramatically raise the likelihood of reporting a statistically significant result. Given such enormous bias, we recommend testing for instrument endogeneity using the Sargan (1958) test or Hansen's (1982) J-statistic (for other tests of instrument endogeneity, see Bascle, 2008). It should be noted that multiple instruments are required to test for instrument endogeneity (e.g., Kennedy, 2008). In supplementary analyses not reported, our simulations also show that multiple instruments help to decrease standard errors. For these reasons, we cannot overemphasize the importance of identifying multiple, as opposed to single, instruments.¹³

Nevertheless, our results should give strategy researchers pause regarding the number of existing studies in strategic management that have advanced theory while using only a single, and potentially endogenous, instrument. If authors can only find one instrument of sufficient strength, they must present compelling theoretical evidence that the instrument is not itself endogenous.
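A minimal sketch of the Sargan overidentification test, assuming one endogenous predictor and two instruments, may help make the recommendation concrete. The test statistic used here is the textbook form: the sample size times the R-squared from regressing the two-stage least squares residuals on the instruments, compared against a chi-square distribution with one degree of freedom. The data-generating process, the parameter `gamma` (how strongly the second instrument enters the error term), and the function name `sargan_rejection_rate` are hypothetical choices, not the paper's design.

```python
import math
import random


def _center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def _fit2(a, b, t):
    """OLS coefficients from regressing t on two centered regressors a and b."""
    saa, sbb, sab = _dot(a, a), _dot(b, b), _dot(a, b)
    sat, sbt = _dot(a, t), _dot(b, t)
    det = saa * sbb - sab ** 2
    return (sat * sbb - sab * sbt) / det, (saa * sbt - sab * sat) / det


def sargan_rejection_rate(gamma, pi=0.3, rho=0.3, beta=0.1, n=500, reps=200, seed=2):
    """Share of replications in which the Sargan test rejects at the 5% level.

    Hypothetical setup: z1 is always exogenous; z2 enters the error with
    weight gamma, so gamma = 0 means both instruments are valid.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        z1 = [rng.gauss(0, 1) for _ in range(n)]
        z2 = [rng.gauss(0, 1) for _ in range(n)]
        v = [rng.gauss(0, 1) for _ in range(n)]
        x = [pi * a + pi * b + c for a, b, c in zip(z1, z2, v)]
        e = [rho * c + gamma * b + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
             for b, c in zip(z2, v)]
        y = [beta * xi + ei for xi, ei in zip(x, e)]
        z1, z2, x, y = _center(z1), _center(z2), _center(x), _center(y)

        # First stage: project x on both instruments.
        p1, p2 = _fit2(z1, z2, x)
        x_hat = [p1 * a + p2 * b for a, b in zip(z1, z2)]

        # Second stage: 2SLS slope, with residuals built from the original x.
        b_2sls = _dot(x_hat, y) / _dot(x_hat, x)
        u = [yi - b_2sls * xi for xi, yi in zip(x, y)]

        # Sargan statistic: n * R-squared from regressing residuals on instruments.
        c1, c2 = _fit2(z1, z2, u)
        fitted = [c1 * a + c2 * b for a, b in zip(z1, z2)]
        stat = n * _dot(fitted, fitted) / _dot(u, u)

        # Chi-square(1) upper-tail probability, via the complementary error function.
        p_value = math.erfc(math.sqrt(stat / 2))
        rejections += p_value < 0.05
    return rejections / reps


if __name__ == "__main__":
    for gamma in (0.0, 0.3):
        print(f"gamma={gamma:.1f}  Sargan rejection rate={sargan_rejection_rate(gamma):.2f}")
```

The test has one degree of freedom here because there are two instruments and one endogenous predictor; with a single instrument the model is just identified and the statistic cannot be computed. The sketch keeps one instrument exogenous and moderately strong throughout, so it says nothing about the case in which every instrument is endogenous.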
When arguing that an instrument is exogenous, authors should keep in mind that the instrumental variable should be uncorrelated with the residuals associated with the dependent variable, and not with the dependent variable itself.

Finally, but perhaps most importantly, authors should compare their results to OLS (or a related procedure such as logistic or fixed-effects regression) and test for endogeneity. We emphasize that this is the final step in the process, because quality (i.e., relevant and exogenous) instruments are required for the Hausman or DWH tests. Without quality instruments, these tests will almost surely fail to report endogeneity, even when it exists. In our view, approaches such as "We tested for endogeneity and found that it was not an issue" are not acceptable. Instead, reviewers, editors, and ultimately readers need to understand clearly the properties of the instruments used in endogeneity tests.

Finding instrumental variables

Researchers interested in endogeneity understand that the most difficult aspect of instrumental variable analysis involves identifying suitable instruments. Nevertheless, economists may have some suggestions in this regard. Angrist and Pischke (2009: 117) suggest that, "Good instruments come from a combination of institutional knowledge and ideas about the processes determining the variable of interest." Because researchers in strategy address a wide array of topics and contexts, it is difficult to provide advice that applies equally to all researchers. We propose that researchers in strategy can learn from the insights and examples offered by economists, who have been dealing with endogeneity for far longer. Kennedy (2008) summarizes a number of creative ideas in this regard. Distance from a college, for instance, can serve as an instrument for years of education as a predictor of wages. More generally, he also discusses how researchers may employ lagged exogenous variables as instruments for endogenous predictors.
In our view, though, the key to remedying endogeneity is to consider its effects before submitting manuscripts for review. If reviewers highlight endogeneity after the original submission, it will likely prove difficult for authors to identify suitable instruments that fit neatly within existing models. For researchers using archival data, for instance, it may prove difficult to identify a natural experiment after the fact that helps to reframe a study examining the effects of a strategic decision (e.g., market entry) on firm performance. Correcting for endogeneity in a post-hoc fashion is even more problematic for researchers relying on survey data.

¹³ Furthermore, it is important to note that multiple endogenous instruments will not be able to detect instrument endogeneity. Supplemental analyses found that detection requires at least one moderately strong exogenous instrument among the multiple instruments.

CONCLUSION

In sum, given noted publication biases that focus on statistical significance (Bettis, 2012), we can speculate that endogeneity has led to a myriad of type I errors among published papers in strategy. While it is not our purpose to highlight specific cases, we can safely presume that even small amounts of endogeneity have led to a number of published papers reporting statistically significant results driven not by the purported independent variables but instead by endogeneity. We are hopeful that the results and recommendations we provide are helpful to authors, reviewers, and editors.

ACKNOWLEDGEMENTS

We thank Brian Connelly, Ryan Krause, Don Lange and Phil Podsakoff for their constructive comments on earlier versions of this manuscript. We also thank Margarethe Wiersema and two anonymous reviewers for their guidance in shaping the final manuscript.

REFERENCES

Angrist JD, Pischke JS. 2009. Mostly Harmless Econometrics: an Empiricist’s Companion. Princeton University Press: Princeton, NJ.
Bascle G. 2008. Controlling for endogeneity with instrumental variables in strategic management research. Strategic Organization 6(3): 285–327.
Bettis RA. 2012. The search for asterisks: compromised statistical tests and flawed theories. Strategic Management Journal 33(1): 108–113.
Cohen J. 1992. A power primer. Psychological Bulletin 112(1): 155–159.
Cohen J, Cohen P, West SG, Aiken LS. 2003. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Erlbaum: Mahwah, NJ.
Hamilton BH, Nickerson JA. 2003. Correcting for endogeneity in strategic management research. Strategic Organization 1(1): 51–78.
Hansen LP. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50(4): 1029–1054.
Hoetker G, Mellewigt T. 2009. Choice and performance of governance mechanisms: matching alliance governance to asset type. Strategic Management Journal 30(10): 1025–1044.
Kennedy P. 2008. A Guide to Econometrics (2nd edn). Blackwell: Oxford, UK.
Larcker DF, Rusticus TO. 2010. On the use of instrumental variables in accounting research. Journal of Accounting and Economics 49(3): 186–205.
Li J, Tang Y. 2010. CEO hubris and firm risk taking in China: the moderating role of managerial discretion. Academy of Management Journal 53(1): 45–68.
Sargan JD. 1958. The estimation of economic relationships using instrumental variables. Econometrica 26(3): 393–415.
Shaver JM. 1998. Accounting for endogeneity when assessing strategy performance: does entry mode choice affect FDI survival? Management Science 44(4): 571–585.
Shugan SM. 2004. Editorial: endogeneity in marketing decision models. Marketing Science 23(1): 1–3.
Sirmon DG, Hitt MA. 2009. Contingencies within dynamic managerial capabilities: interdependent effects of resource investment and deployment on firm performance. Strategic Management Journal 30(13): 1375–1394.
Stock JH, Wright JH, Yogo M. 2002. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business & Economic Statistics 20(4): 518–529.
Wang HC, He J, Mahoney JT. 2009. Firm-specific knowledge resources and competitive advantage: the roles of economic- and relationship-based employee governance mechanisms. Strategic Management Journal 30(12): 1265–1285.