Rockefeller College
University at Albany
PAD 705 Handout: Granger Causality

Most elementary statistics texts will exhort you to remember that “correlation is not causation.” When two variables are correlated, they move together in magnitude. Why they move together is not clear. Is it because X causes Y, because Y causes X, or because they are simultaneously determined? Correlation alone gives us no way to know.

OLS regression takes a different starting point. It assumes a known direction of causality: from the independent variables to the dependent variable. As we saw with simultaneous equations and two-stage least squares, the entire edifice of OLS regression crashes into bias and inconsistency if we cannot assume that the Xs cause Y. However, it may not always be clear whether we can diagnose and correct for simultaneity. We should always run regressions after developing a coherent body of theory, but sometimes our theories are simply wrong. In other cases, theory may itself be indeterminate with respect to the direction of causality.

One test is available to help establish the direction of causality, called the Granger-Sims test. Unfortunately for Sims, most of the statistical world refers to it as a test for Granger Causality.

The Granger Causality test operates from a very simple premise: if X causes Y, then X must occur before Y. This premise has two implications. First, lagged values of X (X_{t-1}, X_{t-2}, X_{t-3}, etc.) should be significantly related to the Y we wish to predict. This follows directly from the premise. Second, lagged values of Y (Y_{t-1}, Y_{t-2}, Y_{t-3}, etc.) must not help determine the values of X. If they do, we may have a simultaneity problem.

In essence, the Granger-Sims causality test is composed of two F tests of joint significance:

1. Lagged values of X help determine Y.
Unrestricted: Y_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5} + β_6 Y_{t-1} + β_7 Y_{t-2} + β_8 Y_{t-3} + β_9 Y_{t-4} + β_{10} Y_{t-5} + ε_t

Restricted: Y_t = β_0 + β_1 Y_{t-1} + β_2 Y_{t-2} + β_3 Y_{t-3} + β_4 Y_{t-4} + β_5 Y_{t-5} + ε_t

Test: F test with 5 restrictions (numerator degrees of freedom) and N - 11 denominator degrees of freedom, where N is the number of observations and 11 is the number of coefficients in the unrestricted model. To find Granger Causality, this F test must be statistically significant.

2. Lagged values of Y do not help determine X.

Unrestricted: X_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5} + β_6 Y_{t-1} + β_7 Y_{t-2} + β_8 Y_{t-3} + β_9 Y_{t-4} + β_{10} Y_{t-5} + ε_t

Restricted: X_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5} + ε_t

Test: F test with 5 restrictions (numerator degrees of freedom) and N - 11 denominator degrees of freedom. To find Granger Causality, this test must be statistically insignificant.

Granger Causality is established if we reject the null hypothesis in part 1 of the test and fail to reject the null hypothesis in part 2 of the test.

The number of lags to include in the regressions is not dictated by the test itself. Here, science must be driven by judgment. To make sure your results are robust, you should try several tests, starting with a smaller number of lags and then increasing the number.

An interesting test case is the infra.dta data from Problem Set #3. Does employment “Granger Cause” gross state product (GSP)? Here X is employment (the lnemp variables) and Y is GSP (the lngsp variables). I ran two tests, the first on just two lags.

1. Lagged values of X help determine Y.

. reg lngsp1986 lngsp1985 lngsp1984 lnemp1985 lnemp1984

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  4,    43) =25415.06
       Model |  .436983003     4  .109245751           Prob > F      =  0.0000
    Residual |  .000184834    43  4.2985e-06           R-squared     =  0.9996
-------------+------------------------------           Adj R-squared =  0.9995
       Total |  .437167837    47  .009301443           Root MSE      =  .00207

------------------------------------------------------------------------------
   lngsp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lngsp1985 |   1.052274   .2212554     4.76   0.000     .6060697    1.498478
   lngsp1984 |  -.2500968   .2114807    -1.18   0.243    -.6765883    .1763948
   lnemp1985 |   .1258521   .3662377     0.34   0.733    -.6127366    .8644408
   lnemp1984 |   .1326279   .3666324     0.36   0.719    -.6067568    .8720125
       _cons |  -.2110498   .0291472    -7.24   0.000    -.2698308   -.1522688
------------------------------------------------------------------------------

. test lnemp1985 lnemp1984

 ( 1)  lnemp1985 = 0.0
 ( 2)  lnemp1984 = 0.0

       F(  2,    43) =   28.56
            Prob > F =    0.0000

The F test confirms that lagged employment helps to determine GSP.

2. Lagged values of Y do not help determine X.

. reg lnemp1986 lnemp1985 lnemp1984 lngsp1985 lngsp1984

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  4,    43) =99867.43
       Model |  .264920135     4  .066230034           Prob > F      =  0.0000
    Residual |  .000028517    43  6.6318e-07           R-squared     =  0.9999
-------------+------------------------------           Adj R-squared =  0.9999
       Total |  .264948652    47  .005637205           Root MSE      =  .00081

------------------------------------------------------------------------------
   lnemp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lnemp1985 |   1.672141    .143854    11.62   0.000     1.382032    1.962251
   lnemp1984 |   -.559751    .144009    -3.89   0.000    -.8501728   -.2693291
   lngsp1985 |  -.0637595   .0869066    -0.73   0.467    -.2390233    .1115044
   lngsp1984 |  -.0213051   .0830672    -0.26   0.799    -.1888261    .1462159
       _cons |  -.0949636   .0114487    -8.29   0.000    -.1180521    -.071875
------------------------------------------------------------------------------

. test lngsp1985 lngsp1984

 ( 1)  lngsp1985 = 0.0
 ( 2)  lngsp1984 = 0.0

       F(  2,    43) =   38.62
            Prob > F =    0.0000

The data fail the second test. We found that lagged values of GSP do help to determine employment. Maybe we have used too few lags. I ran a second specification with five lags.

1. Lagged values of X help determine Y.

. reg lngsp1986 lngsp1985 lngsp1984 lngsp1983 lngsp1982 lngsp1981 lnemp1985 lnemp1984 lnemp1983 lnemp1982 lnemp1981

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F( 10,    37) =10689.65
       Model |  .437016572    10  .043701657           Prob > F      =  0.0000
    Residual |  .000151264    37  4.0882e-06           R-squared     =  0.9997
-------------+------------------------------           Adj R-squared =  0.9996
       Total |  .437167837    47  .009301443           Root MSE      =  .00202

------------------------------------------------------------------------------
   lngsp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lngsp1985 |   1.051754   .2423634     4.34   0.000     .5606791    1.542829
   lngsp1984 |  -.1050723    .411316    -0.26   0.800    -.9384776     .728333
   lngsp1983 |   .0253586   .2576243     0.10   0.922    -.4966377    .5473549
   lngsp1982 |   .0151875   .3888624     0.04   0.969    -.7727226    .8030977
   lngsp1981 |  -.1137875   .2633196    -0.43   0.668    -.6473237    .4197487
   lnemp1985 |   .0421012   .5060015     0.08   0.934    -.9831552    1.067358
   lnemp1984 |  -.4483582   .9095711    -0.49   0.625    -2.291324    1.394608
   lnemp1983 |    1.27469   .7416251     1.72   0.094    -.2279856    2.777365
   lnemp1982 |  -.9941582   .6194091    -1.61   0.117      -2.2492    .2608838
   lnemp1981 |   .2903128   .4559594     0.64   0.528    -.6335486    1.214174
       _cons |  -.1310574    .046638    -2.81   0.008    -.2255549   -.0365598
------------------------------------------------------------------------------

. test lnemp1985 lnemp1984 lnemp1983 lnemp1982 lnemp1981

 ( 1)  lnemp1985 = 0.0
 ( 2)  lnemp1984 = 0.0
 ( 3)  lnemp1983 = 0.0
 ( 4)  lnemp1982 = 0.0
 ( 5)  lnemp1981 = 0.0

       F(  5,    37) =    5.13
            Prob > F =    0.0011

Once again, the F test confirms that lagged employment helps to determine GSP.

2. Lagged values of Y do not help determine X.

. reg lnemp1986 lnemp1985 lnemp1984 lnemp1983 lnemp1982 lnemp1981 lngsp1985 lngsp1984 lngsp1983 lngsp1982 lngsp1981

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F( 10,    37) =46527.28
       Model |  .264927584    10  .026492758           Prob > F      =  0.0000
    Residual |  .000021068    37  5.6940e-07           R-squared     =  0.9999
-------------+------------------------------           Adj R-squared =  0.9999
       Total |  .264948652    47  .005637205           Root MSE      =  .00075

------------------------------------------------------------------------------
   lnemp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lnemp1985 |   1.854204   .1888401     9.82   0.000     1.471578    2.236831
   lnemp1984 |  -1.222959   .3394526    -3.60   0.001    -1.910755   -.5351625
   lnemp1983 |    .753092    .276775     2.72   0.010     .1922925    1.313892
   lnemp1982 |  -.6139826   .2311639    -2.66   0.012    -1.082365      -.1456
   lnemp1981 |   .3144569   .1701644     1.85   0.073    -.0303289    .6592427
   lngsp1985 |   -.102607   .0904502    -1.13   0.264    -.2858765    .0806625
   lngsp1984 |   .1541779   .1535034     1.00   0.322    -.1568496    .4652054
   lngsp1983 |  -.1275578   .0961456    -1.33   0.193    -.3223673    .0672516
   lngsp1982 |   .1228319   .1451238     0.85   0.403    -.1712168    .4168805
   lngsp1981 |  -.1105443   .0982711    -1.12   0.268    -.3096604    .0885718
       _cons |  -.0724049   .0174053    -4.16   0.000    -.1076715   -.0371384
------------------------------------------------------------------------------

. test lngsp1985 lngsp1984 lngsp1983 lngsp1982 lngsp1981

 ( 1)  lngsp1985 = 0.0
 ( 2)  lngsp1984 = 0.0
 ( 3)  lngsp1983 = 0.0
 ( 4)  lngsp1982 = 0.0
 ( 5)  lngsp1981 = 0.0

       F(  5,    37) =    7.22
            Prob > F =    0.0001

As in the two-lag specification, lagged GSP helps to determine employment, so we cannot say that employment “Granger Causes” GSP. Unfortunately, we also cannot say conclusively that GSP and employment are simultaneous. This test is only conclusive about Granger Causality. However, the failure of the Granger Causality test does suggest that we should think carefully about both serial correlation and simultaneity as problems for our estimates.

Revised: April 17, 2005
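The two F tests can also be sketched outside Stata. The block below is a minimal illustration in Python, not the procedure used for the results in this handout. It assumes a single simulated time series in which X leads Y by one period (the handout's infra.dta regressions instead use one observation per state), and the helper names `ols_rss` and `granger_f` are made up for this example. Each test compares the residual sum of squares of the restricted and unrestricted regressions, with denominator degrees of freedom equal to the number of usable observations minus the number of coefficients in the unrestricted model.

```python
import random


def ols_rss(regressors, y):
    """Residual sum of squares from OLS of y on the regressors plus a constant."""
    X = [[1.0] + list(row) for row in regressors]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def granger_f(cause, effect, lags):
    """F statistic for H0: lagged values of `cause` do not help predict `effect`."""
    periods = range(lags, len(effect))
    y = [effect[t] for t in periods]
    own = [[effect[t - j] for j in range(1, lags + 1)] for t in periods]
    other = [[cause[t - j] for j in range(1, lags + 1)] for t in periods]
    rss_r = ols_rss(own, y)                                   # restricted model
    rss_u = ols_rss([a + b for a, b in zip(own, other)], y)   # unrestricted model
    df_num = lags                          # number of restrictions
    df_den = len(y) - (2 * lags + 1)       # obs minus unrestricted coefficients
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return f_stat, df_num, df_den


# Simulated data (an assumption for illustration): X drives Y with a one-period lag.
random.seed(705)
x, y = [0.0], [0.0]
for _ in range(300):
    x.append(0.5 * x[-1] + random.gauss(0, 1))
    y.append(0.3 * y[-1] + 0.8 * x[-2] + random.gauss(0, 1))  # x[-2] is last period's X

f_xy, df_num, df_den = granger_f(x, y, lags=2)   # part 1: should be significant
f_yx, _, _ = granger_f(y, x, lags=2)             # part 2: should be insignificant
print(f"X -> Y: F({df_num}, {df_den}) = {f_xy:.2f}")
print(f"Y -> X: F({df_num}, {df_den}) = {f_yx:.2f}")
```

Each F statistic would then be compared with the critical value from an F table at the reported degrees of freedom. Rerunning with a different `lags` value mirrors the robustness check described above.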