Granger Causality - University at Albany

Rockefeller College
University at Albany
PAD 705 Handout: Granger Causality
Most elementary statistics texts will exhort you to remember that “correlation is not causation.” When
two variables are correlated, they move together in magnitude. Why they move together is not clear. Is it
because X causes Y, because Y causes X, or because they are simultaneously determined? Correlation
alone gives us no way to know.
OLS regression takes a different starting point. It assumes that there is a known direction of causality:
from independents to dependents. As we saw with simultaneous equations and two-stage least squares,
the entire edifice of OLS regression crashes into bias and inconsistency if we cannot assume that the Xs
cause Y.
However, it may not always be clear whether we can diagnose and correct for simultaneity. We should
always run regressions after developing a coherent body of theory, but sometimes our theories are simply
wrong. In other cases, theory may itself be indeterminate with respect to the direction of causality.
There is one test available to see if we can establish causality, called the Granger-Sims Test.
Unfortunately for Sims, it seems that most of the statistical world refers to this as a test for Granger
Causality. The Granger Causality test operates from a very simple premise: if X causes Y, then X must
occur before Y. This simple premise has two implications. First, lagged values of X (X_{t-1}, X_{t-2},
X_{t-3}, etc.) should be significantly related to the Y we wish to predict, even after controlling for lagged
values of Y. This follows directly from the premise. Second, lagged values of Y (Y_{t-1}, Y_{t-2}, Y_{t-3},
etc.) must not help determine the values of X; if they do, we may have a simultaneity problem. In essence,
the Granger-Sims causality test is composed of two F tests of joint significance:
1. Lagged values of X help determine Y.
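Unrestricted: Y_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5}
                    + β_6 Y_{t-1} + β_7 Y_{t-2} + β_8 Y_{t-3} + β_9 Y_{t-4} + β_{10} Y_{t-5} + ε_t
Restricted:   Y_t = β_0 + β_6 Y_{t-1} + β_7 Y_{t-2} + β_8 Y_{t-3} + β_9 Y_{t-4} + β_{10} Y_{t-5} + ε_t
Test:         F test of H0: β_1 = β_2 = β_3 = β_4 = β_5 = 0, with 5 restrictions (numerator degrees of
              freedom) and N - 11 denominator degrees of freedom, where 11 is the number of coefficients
              in the unrestricted model. To find Granger Causality, this F test must be statistically
              significant.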
Revised: April 17, 2005
2. Lagged values of Y do not help determine X.
Unrestricted: X_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5}
                    + β_6 Y_{t-1} + β_7 Y_{t-2} + β_8 Y_{t-3} + β_9 Y_{t-4} + β_{10} Y_{t-5} + ε_t
Restricted:   X_t = β_0 + β_1 X_{t-1} + β_2 X_{t-2} + β_3 X_{t-3} + β_4 X_{t-4} + β_5 X_{t-5} + ε_t
Test:         F test of H0: β_6 = β_7 = β_8 = β_9 = β_{10} = 0, with 5 restrictions (numerator degrees of
              freedom) and N - 11 denominator degrees of freedom. To find Granger Causality, this test
              must be statistically insignificant.
Granger Causality is established if we reject the null hypothesis in part 1 of the test and fail to reject
the null hypothesis in part 2 of the test.
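The arithmetic behind each F test can be sketched in a few lines of Python. This is only an illustration: the function names `f_statistic` and `granger_df` are made up for this handout, and the sums of squares in the usage lines are copied from the two-lag GSP regression reported later in this handout. Because the handout does not report the restricted regressions, the sketch checks the formula against the overall F statistic instead, where the restricted model contains only a constant and its residual sum of squares equals the total sum of squares.

```python
def f_statistic(rss_restricted, rss_unrestricted, n_restrictions, df_denominator):
    """F statistic for a set of joint zero restrictions in a linear regression."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator


def granger_df(n_obs, lags):
    """(numerator, denominator) degrees of freedom for one Granger F test."""
    # The unrestricted model has `lags` lags of X, `lags` lags of Y, and a constant.
    n_params_unrestricted = 2 * lags + 1
    return lags, n_obs - n_params_unrestricted


# Degrees of freedom with 48 observations, as in the worked example.
print(granger_df(48, 2))  # (2, 43)
print(granger_df(48, 5))  # (5, 37)

# Overall F for the two-lag GSP regression: Total SS plays the role of the
# restricted RSS, Residual SS is the unrestricted RSS, and 4 slopes are tested.
overall_f = f_statistic(0.437167837, 0.000184834, 4, 43)
print(overall_f)
```

The last value should land very close to the 25415.06 that Stata reports for that regression; small differences come from rounding in the printed sums of squares. The Granger F tests use the same formula, with the restricted model dropping only the lags of the other variable.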
The number of lags one chooses to include in the regressions is not dictated by the test itself. Here,
science must be driven by judgment. To make sure your results are robust, you should try several tests,
starting with a smaller number of lags and then increasing the number.
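One way to picture this robustness check is to run the same F test for several lag lengths on a single pair of time series. The sketch below is an assumption-laden illustration, not the procedure used in this handout: the series are simulated so that x leads y by one period, the helper names (`solve`, `ols_rss`, `granger_f`) are invented here, and OLS is fit through the normal equations in pure Python only to keep the example self-contained.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(rows, y):
    """Residual sum of squares from regressing y on rows (each row includes the constant)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def granger_f(y, x, lags):
    """F statistic for H0: lagged x adds nothing once y's own lags are included."""
    periods = range(lags, len(y))
    target = [y[t] for t in periods]
    own = [[1.0] + [y[t - j] for j in range(1, lags + 1)] for t in periods]
    full = [row + [x[t - j] for j in range(1, lags + 1)] for row, t in zip(own, periods)]
    rss_restricted = ols_rss(own, target)
    rss_unrestricted = ols_rss(full, target)
    df_denominator = len(target) - (2 * lags + 1)
    return ((rss_restricted - rss_unrestricted) / lags) / (rss_unrestricted / df_denominator)


# Simulated data (an assumption for illustration): x leads y by one period.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 1) for t in range(1, 300)]

for lags in (1, 2, 3):
    # First F: do lags of x help predict y?  Second F: do lags of y help predict x?
    print(lags, granger_f(y, x, lags), granger_f(x, y, lags))
```

If the first F statistic stays large and the second stays small as the lag length grows, the conclusion does not hinge on one particular choice of lags.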
An interesting test case is the infra.dta data from Problem Set #3. Does employment “Granger
Cause” gross state product (GSP)? Here X is employment (the lnemp variables) and Y is GSP (the lngsp
variables). I ran two tests, the first on just two lags.
1. Lagged values of X help determine Y.
. reg lngsp1986 lngsp1985 lngsp1984 lnemp1985 lnemp1984

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  4,    43) =25415.06
       Model |  .436983003     4  .109245751           Prob > F      =  0.0000
    Residual |  .000184834    43  4.2985e-06           R-squared     =  0.9996
-------------+------------------------------           Adj R-squared =  0.9995
       Total |  .437167837    47  .009301443           Root MSE      =  .00207

------------------------------------------------------------------------------
   lngsp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lngsp1985 |   1.052274   .2212554     4.76   0.000     .6060697    1.498478
   lngsp1984 |  -.2500968   .2114807    -1.18   0.243    -.6765883    .1763948
   lnemp1985 |   .1258521   .3662377     0.34   0.733    -.6127366    .8644408
   lnemp1984 |   .1326279   .3666324     0.36   0.719    -.6067568    .8720125
       _cons |  -.2110498   .0291472    -7.24   0.000    -.2698308   -.1522688
------------------------------------------------------------------------------

. test lnemp1985 lnemp1984

 ( 1)  lnemp1985 = 0.0
 ( 2)  lnemp1984 = 0.0

       F(  2,    43) =   28.56
            Prob > F =    0.0000

The F test confirms that lagged employment helps to determine GSP.
2. Lagged values of Y do not help determine X.
. reg lnemp1986 lnemp1985 lnemp1984 lngsp1985 lngsp1984

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F(  4,    43) =99867.43
       Model |  .264920135     4  .066230034           Prob > F      =  0.0000
    Residual |  .000028517    43  6.6318e-07           R-squared     =  0.9999
-------------+------------------------------           Adj R-squared =  0.9999
       Total |  .264948652    47  .005637205           Root MSE      =  .00081

------------------------------------------------------------------------------
   lnemp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lnemp1985 |   1.672141    .143854    11.62   0.000     1.382032    1.962251
   lnemp1984 |   -.559751    .144009    -3.89   0.000    -.8501728   -.2693291
   lngsp1985 |  -.0637595   .0869066    -0.73   0.467    -.2390233    .1115044
   lngsp1984 |  -.0213051   .0830672    -0.26   0.799    -.1888261    .1462159
       _cons |  -.0949636   .0114487    -8.29   0.000    -.1180521    -.071875
------------------------------------------------------------------------------

. test lngsp1985 lngsp1984

 ( 1)  lngsp1985 = 0.0
 ( 2)  lngsp1984 = 0.0

       F(  2,    43) =   38.62
            Prob > F =    0.0000

The data fail the second test: lagged values of GSP do help to determine employment.
Maybe we have used too few lags. I ran a second specification with five lags.
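Changing the lag length means rewriting both variable lists by hand, which invites typos. As a hedged convenience sketch, the Python helper below builds the `reg` and `test` command lines as text for any number of lags. The function name `granger_commands` is hypothetical, and it assumes the wide-format naming used in this handout, where each variable is a stem followed by a year (for example lngsp1986).

```python
def granger_commands(dep, other, last_year, lags):
    """Build the Stata `reg` and `test` lines for one direction of the test.

    Assumes wide-format variables named <stem><year>, with `last_year`
    as the year of the dependent variable.
    """
    years = [last_year - j for j in range(1, lags + 1)]
    own_lags = [f"{dep}{yr}" for yr in years]
    other_lags = [f"{other}{yr}" for yr in years]
    reg = " ".join(["reg", f"{dep}{last_year}"] + own_lags + other_lags)
    test = " ".join(["test"] + other_lags)
    return reg, test


# Part 1 with five lags: do lags of employment help determine GSP?
reg_cmd, test_cmd = granger_commands("lngsp", "lnemp", 1986, 5)
print(reg_cmd)
print(test_cmd)

# Part 2 swaps the roles of the two variables.
reg_cmd_2, test_cmd_2 = granger_commands("lnemp", "lngsp", 1986, 5)
print(reg_cmd_2)
print(test_cmd_2)
```

The printed lines can be pasted into Stata; they match the commands used in the five-lag specification that follows.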
1. Lagged values of X help determine Y.
. reg lngsp1986 lngsp1985 lngsp1984 lngsp1983 lngsp1982 lngsp1981 lnemp1985 lnemp1984
lnemp1983 lnemp1982 lnemp1981

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F( 10,    37) =10689.65
       Model |  .437016572    10  .043701657           Prob > F      =  0.0000
    Residual |  .000151264    37  4.0882e-06           R-squared     =  0.9997
-------------+------------------------------           Adj R-squared =  0.9996
       Total |  .437167837    47  .009301443           Root MSE      =  .00202

------------------------------------------------------------------------------
   lngsp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lngsp1985 |   1.051754   .2423634     4.34   0.000     .5606791    1.542829
   lngsp1984 |  -.1050723    .411316    -0.26   0.800    -.9384776     .728333
   lngsp1983 |   .0253586   .2576243     0.10   0.922    -.4966377    .5473549
   lngsp1982 |   .0151875   .3888624     0.04   0.969    -.7727226    .8030977
   lngsp1981 |  -.1137875   .2633196    -0.43   0.668    -.6473237    .4197487
   lnemp1985 |   .0421012   .5060015     0.08   0.934    -.9831552    1.067358
   lnemp1984 |  -.4483582   .9095711    -0.49   0.625    -2.291324    1.394608
   lnemp1983 |    1.27469   .7416251     1.72   0.094    -.2279856    2.777365
   lnemp1982 |  -.9941582   .6194091    -1.61   0.117      -2.2492    .2608838
   lnemp1981 |   .2903128   .4559594     0.64   0.528    -.6335486    1.214174
       _cons |  -.1310574    .046638    -2.81   0.008    -.2255549   -.0365598
------------------------------------------------------------------------------

. test lnemp1985 lnemp1984 lnemp1983 lnemp1982 lnemp1981

 ( 1)  lnemp1985 = 0.0
 ( 2)  lnemp1984 = 0.0
 ( 3)  lnemp1983 = 0.0
 ( 4)  lnemp1982 = 0.0
 ( 5)  lnemp1981 = 0.0

       F(  5,    37) =    5.13
            Prob > F =    0.0011

Once again, the F test confirms that lagged employment helps to determine GSP.
2. Lagged values of Y do not help determine X.
. reg lnemp1986 lnemp1985 lnemp1984 lnemp1983 lnemp1982 lnemp1981 lngsp1985 lngsp1984
lngsp1983 lngsp1982 lngsp1981

      Source |       SS       df       MS              Number of obs =      48
-------------+------------------------------           F( 10,    37) =46527.28
       Model |  .264927584    10  .026492758           Prob > F      =  0.0000
    Residual |  .000021068    37  5.6940e-07           R-squared     =  0.9999
-------------+------------------------------           Adj R-squared =  0.9999
       Total |  .264948652    47  .005637205           Root MSE      =  .00075

------------------------------------------------------------------------------
   lnemp1986 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   lnemp1985 |   1.854204   .1888401     9.82   0.000     1.471578    2.236831
   lnemp1984 |  -1.222959   .3394526    -3.60   0.001    -1.910755   -.5351625
   lnemp1983 |    .753092    .276775     2.72   0.010     .1922925    1.313892
   lnemp1982 |  -.6139826   .2311639    -2.66   0.012    -1.082365      -.1456
   lnemp1981 |   .3144569   .1701644     1.85   0.073    -.0303289    .6592427
   lngsp1985 |   -.102607   .0904502    -1.13   0.264    -.2858765    .0806625
   lngsp1984 |   .1541779   .1535034     1.00   0.322    -.1568496    .4652054
   lngsp1983 |  -.1275578   .0961456    -1.33   0.193    -.3223673    .0672516
   lngsp1982 |   .1228319   .1451238     0.85   0.403    -.1712168    .4168805
   lngsp1981 |  -.1105443   .0982711    -1.12   0.268    -.3096604    .0885718
       _cons |  -.0724049   .0174053    -4.16   0.000    -.1076715   -.0371384
------------------------------------------------------------------------------

. test lngsp1985 lngsp1984 lngsp1983 lngsp1982 lngsp1981

 ( 1)  lngsp1985 = 0.0
 ( 2)  lngsp1984 = 0.0
 ( 3)  lngsp1983 = 0.0
 ( 4)  lngsp1982 = 0.0
 ( 5)  lngsp1981 = 0.0

       F(  5,    37) =    7.22
            Prob > F =    0.0001

As in the two-lag specification, GSP helps to determine employment, so we cannot say that employment
“Granger Causes” GSP. Unfortunately, we also cannot say conclusively that GSP and employment are
simultaneous; this test is only conclusive about Granger Causality. However, the failure of the Granger
Causality test does suggest that we should think carefully about both serial correlation and simultaneity as
problems for our estimations.
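The decision rule used throughout this handout can be written down compactly. The Python sketch below is illustrative only: the function name `granger_verdict`, the wording of the returned labels, and the 0.05 significance level are assumptions rather than part of the test itself. The p-values in the usage lines are the Prob > F values reported in the Stata output above.

```python
def granger_verdict(p_x_to_y, p_y_to_x, alpha=0.05):
    """Apply the two-part rule: reject H0 in part 1 and fail to reject H0 in part 2."""
    x_helps_y = p_x_to_y < alpha  # part 1: lags of X jointly significant for Y
    y_helps_x = p_y_to_x < alpha  # part 2: lags of Y jointly significant for X
    if x_helps_y and not y_helps_x:
        return "X Granger-causes Y"
    if x_helps_y and y_helps_x:
        return "not established: lags of Y also help determine X"
    return "not established: lags of X do not help determine Y"


# X = employment, Y = GSP; p-values as printed by Stata (0.0000 means < 0.00005).
print("two lags: ", granger_verdict(0.0000, 0.0000))
print("five lags:", granger_verdict(0.0011, 0.0001))
```

Both specifications return the same "not established" label, which mirrors the conclusion in the text: part 1 passes, but part 2 fails.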