Special Topics in Applied Econometrics
3) Relaxing the Exogeneity Assumption
Marcel Bluhm
Wang Yanan Institute for Studies in Economics
Xiamen University
Antwerp University, 13 - 17 February 2012
3) Relaxing the Exogeneity Assumption
→ Agenda
⇒ Panel GMM
The Pooled Model
Selection of Instruments
The Fixed Effects Model
The Random Effects Model
Dynamic Panel Data
M. Bluhm
Special Topics in Applied Econometrics: 3) RELAXING THE EXOGENEITY ASSUMPTION
2/38
3.1) Panel GMM:
→ Introduction
In the models seen so far, $y_{it}$ depends on the contemporaneous value
of the regressor, $x_{it}$, though the strong exogeneity assumption permits all
of the $x_{it}$, t = 1, 2, ..., T, to be included as regressors
Regressors in other periods might be valid instruments for current
period regressors that are endogenous or lags of the dependent
variable
The Generalized Method of Moments (GMM) is a useful framework
for panel instrumental variable (IV) estimation
→ Introduction: The IV Principle
Consider the following single-equation regression model:
\[ y = x\beta + \varepsilon \]
Endogeneity, possibly due to reverse causality or omitted variables, causes:
\[ \mathrm{cov}(x, \varepsilon) \neq 0 \]
OLS regression of y on x is then biased and inconsistent
A variable q must fulfil two properties to be a valid instrumental variable:
\[ \mathrm{cov}(q, \varepsilon) = 0 \quad \text{('Validity requirement')} \]
\[ \mathrm{cov}(q, x) \neq 0 \quad \text{('Relevancy requirement')} \]
The validity requirement states that the instrumental variable must
not be endogenous
The relevancy requirement states that the instrumental variable must
have explanatory power for the endogenous explanatory variable
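The two requirements can be illustrated with a small simulation. This is a minimal sketch under assumed values: the coefficient 2.0, the loadings 0.8 and 0.6, and the sample size are hypothetical choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = 2.0                      # true coefficient (hypothetical value)

q = rng.normal(size=n)          # instrument: independent of eps by construction
eps = rng.normal(size=n)        # structural error
# x loads on q (relevancy) and on eps (endogeneity)
x = 0.8 * q + 0.6 * eps + rng.normal(size=n)
y = beta * x + eps

def cov(a, b):
    """Sample covariance of two 1-d arrays."""
    return np.mean((a - a.mean()) * (b - b.mean()))

b_ols = cov(x, y) / cov(x, x)   # picks up cov(x, eps) / var(x)
b_iv = cov(q, y) / cov(q, x)    # simple IV estimator

print(f"OLS: {b_ols:.3f}, IV: {b_iv:.3f}")
```

In this design the OLS estimate should settle near 2.3 (the true value plus cov(x, ε)/var(x) = 0.6/2), while the IV estimate should be close to 2.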
→ Introduction: General Idea of GMM
GMM is a general estimation framework
To see the basic idea of GMM, note that all estimators used so far
can be described by some moment condition/restriction. This
moment condition can be used for parameter estimation
For example, in the linear regression model it has to hold that
\[ E[x\varepsilon] = E[x(y - x'\beta)] = 0 \]
The dimension of $E[x\varepsilon]$ is k × 1, that is, there are k moment
conditions/restrictions
An estimator of the regression parameter chooses β such that the
sample analog of the moment condition equals zero:
\[
\hat\beta = \operatorname{argzero}_{\beta} \sum_{i=1}^{N} x_i (y_i - x_i'\beta)
          = \Big( \sum_{i=1}^{N} x_i x_i' \Big)^{-1} \sum_{i=1}^{N} x_i y_i
          = (X'X)^{-1} X'y
\]
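The moment-condition view of OLS can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the coefficient vector and the helper name `sample_moments` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 500, 3
X = rng.normal(size=(N, k))
beta_true = np.array([1.0, -2.0, 0.5])   # hypothetical parameter values
y = X @ beta_true + rng.normal(size=N)

def sample_moments(beta):
    """Sample analog of E[x (y - x'beta)]: a k-vector."""
    return X.T @ (y - X @ beta) / N

# Setting the k sample moments to zero gives (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_hat)
print(sample_moments(beta_hat))   # zero up to floating-point error
```

With k moment conditions and k parameters the system is exactly identified, so the sample moments can be set to zero exactly.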
If more restrictions than parameters are available, the outlined
estimation approach does not work. The parameters are
overidentified, that is, β cannot be chosen to satisfy all moment
conditions simultaneously
An example of this situation is an IV regression model with more
instrumental variables than endogenous variables
In this situation GMM, which is outlined in the following, can be
used to estimate the parameters of interest
3.2) The Pooled Model
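Consider the linear panel model
\[ y_i = X_i \beta + u_i \tag{1} \]
where
\[
y_i = \begin{bmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{bmatrix}_{[T \times 1]}, \quad
X_i = \begin{bmatrix} x_{i1}' \\ \vdots \\ x_{iT}' \end{bmatrix}_{[T \times K]}, \quad
u_i = \begin{bmatrix} u_{i1} \\ \vdots \\ u_{iT} \end{bmatrix}_{[T \times 1]}
\]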
Assume the existence of a T × r matrix of instruments, $Z_i$, where
r ≥ K is the number of instruments, that satisfies
\[ E(Z_i' u_i) = 0 \tag{2} \]
Instruments are contemporaneously uncorrelated with the error term
Given this assumption the panel GMM estimator is given by
\[
\hat\beta_{PGMM} = \Big[ \Big( \sum_{i=1}^{N} X_i' Z_i \Big) W_N \Big( \sum_{i=1}^{N} Z_i' X_i \Big) \Big]^{-1}
\Big( \sum_{i=1}^{N} X_i' Z_i \Big) W_N \Big( \sum_{i=1}^{N} Z_i' y_i \Big) \tag{3}
\]
where $W_N$ is an r × r weighting matrix.
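Equation (3) translates almost directly into array operations. The sketch below assumes a balanced panel stored as arrays of shape (N, T, K), (N, T, r) and (N, T); the function name `pgmm` and the simulated design (true coefficient 1.5, two instruments) are hypothetical.

```python
import numpy as np

def pgmm(X, Z, y, W):
    """Panel GMM estimator of Equation (3).

    X: (N, T, K) regressors, Z: (N, T, r) instruments,
    y: (N, T) outcomes, W: (r, r) weighting matrix.
    """
    XZ = np.einsum("ntk,ntr->kr", X, Z)   # sum_i X_i' Z_i  (K x r)
    Zy = np.einsum("ntr,nt->r", Z, y)     # sum_i Z_i' y_i  (r,)
    A = XZ @ W @ XZ.T                     # (sum X_i'Z_i) W (sum Z_i'X_i)
    return np.linalg.solve(A, XZ @ W @ Zy)

# Simulated example with one endogenous regressor and two instruments
rng = np.random.default_rng(0)
N, T, r = 2000, 4, 2
Z = rng.normal(size=(N, T, r))
u = rng.normal(size=(N, T))
x = Z[..., 0] + 0.5 * Z[..., 1] + 0.7 * u + rng.normal(size=(N, T))
X = x[..., None]                          # K = 1
y = 1.5 * x + u

W = np.linalg.inv(np.einsum("ntr,nts->rs", Z, Z))   # W_N = [Z'Z]^{-1}
beta_hat = pgmm(X, Z, y, W)
print(beta_hat)
```

Using `np.linalg.solve` rather than an explicit inverse of the bracketed term is a numerical convenience only; the estimator is the same.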
To obtain the asymptotic variance of the PGMM estimator, rewrite
Equation (3) in more compact notation as
\[ \hat\beta_{PGMM} = [X'ZW_NZ'X]^{-1} X'ZW_NZ'y \]
where
\[
X' = [X_1' \cdots X_N']_{[K \times TN]}, \quad
Z' = [Z_1' \cdots Z_N']_{[r \times TN]}, \quad
y' = [y_1' \cdots y_N']_{[1 \times TN]}
\]
The asymptotic variance of $\hat\beta_{PGMM}$ is then given by
\[
\mathrm{Avar}(\hat\beta_{PGMM}) = [X'ZW_NZ'X]^{-1} X'ZW_N (N\hat{S}) W_N Z'X [X'ZW_NZ'X]^{-1} \tag{4}
\]
A consistent estimate for S is given by
\[ \hat{S} = \frac{1}{N} \sum_{i=1}^{N} Z_i' \hat{u}_i \hat{u}_i' Z_i \tag{5} \]
where the T × 1 estimated residual is given by $\hat{u}_i = y_i - X_i \hat\beta_{PGMM}$
Given Equation (5), the estimator for the asymptotic variance in
Equation (4) yields panel-robust standard errors allowing for both
heteroscedasticity and correlation over time
PGMM estimation can be done in one ('one-step GMM') or, more
efficiently, in four ('two-step GMM') operation(s):
1. Choose $W_N = [Z'Z]^{-1}$ and estimate Equation (3) ('one-step GMM')
2. Compute $\hat{u}_i = y_i - X_i \hat\beta_{PGMM}$
3. Compute $\hat{S}$ from Equation (5)
4. Choose $W_N = \hat{S}^{-1}$ and estimate Equation (3) ('two-step GMM')
5. Calculate $\mathrm{Avar}(\hat\beta_{PGMM})$ from Equation (4), which for the
two-step estimator simplifies to
\[ \mathrm{Avar}(\hat\beta_{PGMM2S}) = [X'Z (N\hat{S})^{-1} Z'X]^{-1} \]
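The five steps can be sketched in a few lines of NumPy. This is an illustrative implementation under assumptions: a balanced panel held in arrays of shape (N, T, K), (N, T, r) and (N, T), hypothetical function names, and a simulated heteroscedastic design with true coefficient 1.5.

```python
import numpy as np

def gmm_beta(XZ, Zy, W):
    """Equation (3) given the sums X'Z (K x r), Z'y (r,) and a weight W."""
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ Zy)

def two_step_pgmm(X, Z, y):
    N = X.shape[0]
    XZ = np.einsum("ntk,ntr->kr", X, Z)
    Zy = np.einsum("ntr,nt->r", Z, y)
    ZZ = np.einsum("ntr,nts->rs", Z, Z)
    b1 = gmm_beta(XZ, Zy, np.linalg.inv(ZZ))      # step 1: W_N = [Z'Z]^{-1}
    u = y - X @ b1                                # step 2: residuals, (N, T)
    Zu = np.einsum("ntr,nt->nr", Z, u)            # row i holds (Z_i' u_i)'
    S = Zu.T @ Zu / N                             # step 3: Equation (5)
    b2 = gmm_beta(XZ, Zy, np.linalg.inv(S))       # step 4: W_N = S^{-1}
    avar = np.linalg.inv(XZ @ np.linalg.inv(N * S) @ XZ.T)   # step 5
    return b1, b2, avar

# Simulated data: errors are heteroscedastic in the first instrument
rng = np.random.default_rng(2)
N, T, r = 2000, 4, 2
Z = rng.normal(size=(N, T, r))
e = rng.normal(size=(N, T))
u_true = e * (1.0 + np.abs(Z[..., 0]))
x = Z[..., 0] + 0.5 * Z[..., 1] + 0.7 * e + rng.normal(size=(N, T))
X, y = x[..., None], 1.5 * x + u_true

b1, b2, avar = two_step_pgmm(X, Z, y)
se = np.sqrt(np.diag(avar))
print(b1, b2, se)
```

Here $\hat{S}$ is built from the one-step residuals, as in steps 2 and 3; the standard errors are the square roots of the diagonal of the simplified two-step variance.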
3.3) Selection of Instruments
⇒ Assumptions on the Exogeneity of Instruments
In cross-section models, endogenous variables are instrumented by
variables that do not appear as regressors in the equation of interest
With panel models, additional periods of data provide additional
instruments
The number of instruments available expands as stronger assumptions
are made about the correlation between $u_{it}$ and $z_{is}$, s, t = 1, 2, ..., T
Assume the existence of r IVs for each t, t = 1, ..., T, where
$r \ge \dim(x_{it})$
Define for each t the r-dimensional IV column vector, $z_{it}$
Combine the IV vectors for all t in a matrix $Z_i$, and define also the
vector of error terms, $u_i$:
\[
Z_i' = (z_{i1}, z_{i2}, \ldots, z_{iT})_{[r \times T]}, \quad
u_i' = (u_{i1}, u_{i2}, \ldots, u_{iT})_{[1 \times T]}
\]
The least restrictive exogeneity assumption for IVs is the summation
assumption:
\[ E(Z_i' u_i) = E\Big( \sum_{t=1}^{T} z_{it} u_{it} \Big) = 0 \]
'Correlation between each instrument and the error term in the same
time period, summed up over the time dimension, equals zero in
expectation over all cross-section units'
Since the dimension of $Z_i' u_i$ is r × 1, this assumption leads to r
moment conditions in Equation (2)
The somewhat stronger contemporaneous exogeneity assumption is
given by:
\[ E(z_{it} u_{it}) = 0 \]
'Correlation between each instrument and the error term in the same
time period equals zero in expectation over all cross-section units'
This holds for each t separately. If it is fulfilled, the summation assumption is also fulfilled
In this case the IV matrix containing the IVs for all t is given by
\[
Z_i' = \begin{bmatrix}
z_{i1} & 0 & \cdots & 0 \\
0 & z_{i2} & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & z_{iT}
\end{bmatrix}_{[rT \times T]}
\]
This assumption leads to rT moment conditions
The weak exogeneity assumption states that IVs of current and
previous periods are uncorrelated with the current-period error term:
\[ E(z_{is} u_{it}) = 0 \quad \text{for } s \le t,\; t = 1, 2, \ldots, T \]
In this case $Z_i'$ is given by
\[
Z_i' = \begin{bmatrix}
z_{i1} & 0 & \cdots & 0 \\
0 & z_{i1} & & \vdots \\
0 & z_{i2} & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & z_{i1} \\
\vdots & & & \vdots \\
0 & \cdots & 0 & z_{iT}
\end{bmatrix}_{[rT(T+1)/2 \times T]}
\]
The weak exogeneity assumption leads to $rT(T+1)/2$ moment conditions
The strong exogeneity assumption states that the error term is
uncorrelated with the IVs of all periods:
\[ E(z_{is} u_{it}) = 0 \quad \forall\, s, t = 1, 2, \ldots, T \]
In this case $Z_i'$ is given by
\[
Z_i' = \begin{bmatrix}
z_i & 0 & \cdots & 0 \\
0 & z_i & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & z_i
\end{bmatrix}_{[rT^2 \times T]},
\quad \text{where } z_i = \begin{bmatrix} z_{i1} \\ z_{i2} \\ \vdots \\ z_{iT} \end{bmatrix}
\]
The strong exogeneity assumption leads to $rT^2$ moment conditions
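The instrument matrices under the three period-specific assumptions, and the resulting moment counts rT, rT(T+1)/2 and rT², can be built explicitly for one unit i. The function names below are hypothetical; `z` is assumed to be a (T, r) array whose row t holds $z_{it}'$.

```python
import numpy as np

def Zt_contemporaneous(z):
    """Z_i' under contemporaneous exogeneity: column t holds z_it."""
    T, r = z.shape
    Zt = np.zeros((r * T, T))
    for t in range(T):
        Zt[t * r:(t + 1) * r, t] = z[t]
    return Zt

def Zt_weak(z):
    """Z_i' under weak exogeneity: column t holds z_i1, ..., z_it."""
    T, r = z.shape
    Zt = np.zeros((r * T * (T + 1) // 2, T))
    row = 0
    for t in range(T):
        block = z[:t + 1].ravel()          # z_i1, ..., z_it stacked
        Zt[row:row + block.size, t] = block
        row += block.size
    return Zt

def Zt_strong(z):
    """Z_i' under strong exogeneity: every column holds z_i1, ..., z_iT."""
    T, r = z.shape
    Zt = np.zeros((r * T * T, T))
    for t in range(T):
        Zt[t * r * T:(t + 1) * r * T, t] = z.ravel()
    return Zt

z = np.arange(1.0, 7.0).reshape(3, 2)      # T = 3 periods, r = 2 instruments
shapes = [f(z).shape for f in (Zt_contemporaneous, Zt_weak, Zt_strong)]
print(shapes)
```

For T = 3 and r = 2 the row counts are 6, 12 and 18, which shows how quickly the number of moment conditions grows with the strength of the assumption.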
The distinction between the exogeneity assumptions is most relevant
for dynamic models
Given the exogeneity assumption on the instruments there can be too
many moment conditions. The marginal value of an IV may then be
very small
In this situation multicollinearity among instruments can lead to a
weak instruments problem
Instruments that vary little over time should be treated as
time-invariant
Other instruments might also be only used for a few periods
A test of overidentifying restrictions (OIR) can help assess the
validity of the instrument set
If there are r instrumental variables and only K parameters to
estimate, panel GMM estimation leaves r − K overidentifying
restrictions
A test statistic for overidentifying restrictions is given by
\[
OIR = \Big[ \sum_{i=1}^{N} \hat{u}_i' Z_i \Big] (N\hat{S})^{-1} \Big[ \sum_{i=1}^{N} Z_i' \hat{u}_i \Big] \tag{6}
\]
where
$\hat{u}_i = y_i - X_i \hat\beta_{PGMM2S}$
$\hat{S}$ is given in Equation (5)
OIR is distributed as $\chi^2(r - K)$ under the null H0: the overidentifying
restrictions are valid. Rejection indicates that at least some instrumental
variables are endogenous
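The statistic in Equation (6) can be computed as follows. This is a minimal sketch on simulated data with valid instruments; the function name `oir_statistic`, the design (K = 1, r = 3) and the choice to build $N\hat{S}$ from the same residuals that enter the moment sums are assumptions made for illustration.

```python
import numpy as np

def oir_statistic(Z, u_hat):
    """Equation (6). Z: (N, T, r) instruments, u_hat: (N, T) residuals."""
    Zu = np.einsum("ntr,nt->nr", Z, u_hat)   # row i holds (Z_i' u_i)'
    g = Zu.sum(axis=0)                       # sum_i Z_i' u_i
    NS = Zu.T @ Zu                           # N * S_hat, Equation (5)
    return float(g @ np.linalg.solve(NS, g))

rng = np.random.default_rng(3)
N, T, r, K = 2000, 4, 3, 1
Z = rng.normal(size=(N, T, r))
u = rng.normal(size=(N, T))
x = Z.sum(axis=2) + 0.7 * u + rng.normal(size=(N, T))
X, y = x[..., None], 1.5 * x + u

XZ = np.einsum("ntk,ntr->kr", X, Z)
Zy = np.einsum("ntr,nt->r", Z, y)

def gmm(W):
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ Zy)

b1 = gmm(np.linalg.inv(np.einsum("ntr,nts->rs", Z, Z)))   # one-step
Zu1 = np.einsum("ntr,nt->nr", Z, y - X @ b1)
b2 = gmm(np.linalg.inv(Zu1.T @ Zu1 / N))                  # two-step

stat = oir_statistic(Z, y - X @ b2)
df = r - K
print(stat, df)   # compare stat with a chi-square(df) critical value
```

With df = 2 the 5% critical value of the χ² distribution is about 5.99; since the simulated instruments are valid, the statistic should usually fall below it.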
⇒ HANDS-ON
Hands-On 8
3.4) The Fixed Effects Model
We now augment the model in Equation (1) by including an
individual-specific effect:
\[ y_{it} = c_i + x_{it}'\beta + \varepsilon_{it} \tag{7} \]
The error term in Equation (1) is thus now modeled as $u_{it} = c_i + \varepsilon_{it}$
Some regressors in $x_{it}$ are assumed to be endogenous, with
$E(x_{it}(c_i + \varepsilon_{it})) \neq 0$, making OLS inconsistent
In the FE model $E(Z_i' \varepsilon_i) = 0$ but $E(Z_i' c_i) \neq 0$
The weak exogeneity assumption that $E(z_{is}\varepsilon_{it}) = 0$ for s ≤ t implies
$E(z_{is}(\varepsilon_{it} - \varepsilon_{i,t-1})) = 0$ for s ≤ t − 1
First differencing removes $c_i$ but shortens the time series of available instruments
by one period, so that only $z_{i,t-1}, z_{i,t-2}, \ldots$ are available
Estimation and inference can then be done as outlined in Equations
(3) and (4)
Note: The within or mean-differenced model can only be estimated
following Equations (3) and (4) if the instruments are strongly
exogenous
3.5) The Random Effects Model
In the RE model $E(Z_i'(c_i + \varepsilon_i)) = 0$
If the RE assumption on the error terms is true, Equation (7) can be
directly estimated via the PGMM2S estimator (see the two-step procedure in Section 3.2)
Note: More efficient estimation is possible if the error structure
has the same form as in the standard RE model, that is, with
diagonal entries $\sigma_c^2 + \sigma_\varepsilon^2$ and off-diagonal entries $\sigma_c^2$ (see
Cameron/Trivedi (2005) p. 759 et seq.)
3.6) Dynamic Panel Data
Dynamic models use lags of the dependent variable as regressors:
\[ y_{it} = \gamma y_{i,t-1} + c_i + x_{it}'\beta + \varepsilon_{it} \tag{8} \]
By assumption |γ| < 1. Otherwise non-stationary panel models have
to be used
If $c_i$ is a random effect, the RE estimator is inconsistent for γ and β
because $y_{i,t-1}$ is correlated with $c_i$ and hence with the composite
error term $u_{it} = c_i + \varepsilon_{it}$
If $c_i$ is a fixed effect, estimation of the within-transformed model, that
is, regressing $(y_{it} - \bar{y}_i)$ on $(y_{i,t-1} - \bar{y}_{i,-1})$ and $(x_{it} - \bar{x}_i)$, also yields
inconsistent parameter estimates
The correlation arises because $y_{it}$ is correlated with $\varepsilon_{it}$, so $y_{i,t-1}$ is
correlated with $\varepsilon_{i,t-1}$ and hence with $\bar\varepsilon_i$, which is part of the error term of
the within estimation: $(\varepsilon_{it} - \bar\varepsilon_i)$
To get rid of the unobserved heterogeneity term, consider a
first-differenced version of Equation (8)
\[
(y_{it} - y_{i,t-1}) = \gamma (y_{i,t-1} - y_{i,t-2}) + (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}) \tag{9}
\]
In Equation (9) the explanatory variable, $(y_{i,t-1} - y_{i,t-2})$, is
correlated with the error term, $(\varepsilon_{it} - \varepsilon_{i,t-1})$, because $y_{i,t-1}$ depends on $\varepsilon_{i,t-1}$
The lag of the dependent variable is therefore an endogenous
regressor which would lead to biased OLS estimates of the model in
Equation (9). This bias is known as Nickell bias (Nickell, 1981)
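⇒ The Nickell Bias
To derive the Nickell bias in a dynamic panel setting when N → ∞
and T is fixed, consider the following model where $E(y_{i,t-1} c_i) \neq 0$
\[ y_{it} = \gamma y_{i,t-1} + c_i + u_{it} \tag{10} \]
The within estimator for this model is
\[
\hat\gamma = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - \bar{y}_i)(y_{i,t-1} - \bar{y}_{i,-1})}
                  {\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{i,t-1} - \bar{y}_{i,-1})^2} \tag{11}
\]
\[
\hat\gamma = \sum_{i=1}^{N} \sum_{t=1}^{T} w_{it} (y_{it} - \bar{y}_i) \tag{12}
\]
where
\[
w_{it} = \frac{(y_{i,t-1} - \bar{y}_{i,-1})}{\sum_{j=1}^{N} \sum_{s=1}^{T} (y_{j,s-1} - \bar{y}_{j,-1})^2}
\]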
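Replacing the expression for $(y_{it} - \bar{y}_i)$ in Equation (12) by the mean
deviation of Equation (10) yields
\[ \hat\gamma = \gamma + \sum_{i=1}^{N} \sum_{t=1}^{T} w_{it} (u_{it} - \bar{u}_i) \tag{13} \]
Using Equation (13) and letting N → ∞ we obtain that
\[
\operatorname{plim}_{N \to \infty} (\hat\gamma - \gamma)
= \operatorname{plim}_{N \to \infty} \sum_{i=1}^{N} \sum_{t=1}^{T} w_{it} (u_{it} - \bar{u}_i) \neq 0
\]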
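It can be shown that
\[
\operatorname{plim}_{N \to \infty} (\hat\gamma - \gamma) =
- \frac{1 + \gamma}{T - 1} \Big( 1 - \frac{1}{T} \frac{1 - \gamma^T}{1 - \gamma} \Big)
\times \Big( 1 - \frac{2\gamma}{(1 - \gamma)(T - 1)} \Big( 1 - \frac{1 - \gamma^T}{T(1 - \gamma)} \Big) \Big)^{-1} \tag{14}
\]
For example, in the simple model given in Equation (10) the bias
amounts to −3/4 if T = 2, to −0.54 if T = 3, and to −0.16 if T = 10
(these values correspond to γ = 0.5)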
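The bias expression in Equation (14) is easy to evaluate numerically. The sketch below codes the formula as published by Nickell (1981) for the within estimator; the function name `nickell_bias` and the choice γ = 0.5 are assumptions, the latter being the value that reproduces the quoted figures.

```python
def nickell_bias(gamma, T):
    """Asymptotic (N -> infinity) bias of the within estimator, Equation (14)."""
    a = (1.0 - gamma**T) / (T * (1.0 - gamma))
    numerator = (1.0 + gamma) / (T - 1) * (1.0 - a)
    denominator = 1.0 - 2.0 * gamma / ((1.0 - gamma) * (T - 1)) * (1.0 - a)
    return -numerator / denominator

for T in (2, 3, 10):
    print(T, round(nickell_bias(0.5, T), 4))
# prints: 2 -0.75, 3 -0.5357, 10 -0.1622
```

The bias shrinks roughly at rate 1/T, which is why it matters most for the short panels typical in applied work.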
The model in Equation (9) can, however, be consistently estimated by
using instrumental variables for the lag of the dependent variable
Possible IVs for the lags of the dependent variable are lags of
$(y_{i,t-1} - y_{i,t-2})$ from a sufficient number of periods back
To see how this works consider the endogenous regressor
$(y_{i,t-1} - y_{i,t-2})$
The IV $y_{i,t-2}$ is correlated with $(y_{i,t-1} - y_{i,t-2})$ ('Relevancy
requirement')
Under the weak exogeneity assumption it is however uncorrelated with
$(\varepsilon_{it} - \varepsilon_{i,t-1})$ ('Validity requirement')
Further lags ($y_{i,t-3}, y_{i,t-4}, \ldots$) are also valid IVs. As their inclusion leads to
overidentification, the model is formulated within the PGMM
framework outlined before
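The instrument argument can be checked with a simulation of Equation (8) without further regressors. This is a sketch under assumed values (γ = 0.5, unit variances, initial observation drawn from the stationary distribution); it uses only the single instrument $y_{i,t-2}$, so the estimator is just identified rather than a full PGMM estimator.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, gamma = 20_000, 6, 0.5

c = rng.normal(size=N)                       # individual effects
y = np.empty((N, T + 1))                     # columns are periods 0, ..., T
y[:, 0] = c / (1 - gamma) + rng.normal(size=N) / np.sqrt(1 - gamma**2)
for t in range(1, T + 1):
    y[:, t] = gamma * y[:, t - 1] + c + rng.normal(size=N)

dy = np.diff(y, axis=1)                      # dy[:, j] = y_{j+1} - y_j
dep = dy[:, 1:].ravel()                      # (y_t - y_{t-1}),     t = 2..T
lag = dy[:, :-1].ravel()                     # (y_{t-1} - y_{t-2}), t = 2..T
inst = y[:, :-2].ravel()                     # y_{t-2},             t = 2..T

gamma_ols = (lag @ dep) / (lag @ lag)        # OLS on the differenced model
gamma_iv = (inst @ dep) / (inst @ lag)       # IV with y_{t-2} as instrument

print(f"OLS: {gamma_ols:.3f}, IV: {gamma_iv:.3f}")
```

In this design OLS on the first-differenced equation is badly biased (the estimate is negative although γ = 0.5), whereas the IV estimate should be close to the true value.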
⇒ The Arellano-Bond Estimator
Arellano and Bond (1991) proposed PGMM estimators using
unbalanced instrument sets
Assuming weakly exogenous regressors, the number of instruments
available is highest for the endogenous variable observed at t closest
to T
→ In t = 3 only $y_{i1}$ is available as an instrument
→ In t = 4 $y_{i1}$ and $y_{i2}$ are available as instruments etc.
In the following, the Arellano-Bond estimator is outlined. Since it is a
PGMM estimator, the general procedure has already been shown in
the previous sub-chapter
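The Arellano-Bond estimator is given by
\[
\hat\beta_{AB} = \Big[ \Big( \sum_{i=1}^{N} \tilde{X}_i' Z_i \Big) W_N \Big( \sum_{i=1}^{N} Z_i' \tilde{X}_i \Big) \Big]^{-1}
\Big( \sum_{i=1}^{N} \tilde{X}_i' Z_i \Big) W_N \Big( \sum_{i=1}^{N} Z_i' \tilde{y}_i \Big) \tag{15}
\]
where the rows of $\tilde{X}_i$, $\tilde{y}_i$ and $Z_i$ correspond to the differenced equations for t = 3, ..., T:
\[
\tilde{X}_i = \begin{bmatrix}
\Delta y_{i,2} & \Delta x_{i3}' \\
\Delta y_{i,3} & \Delta x_{i4}' \\
\vdots & \vdots \\
\Delta y_{i,T-1} & \Delta x_{iT}'
\end{bmatrix}; \quad
\tilde{y}_i = \begin{bmatrix} \Delta y_{i,3} \\ \Delta y_{i,4} \\ \vdots \\ \Delta y_{i,T} \end{bmatrix}; \quad
Z_i = \begin{bmatrix}
z_{i3}' & 0 & \cdots & 0 \\
0 & z_{i4}' & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & z_{iT}'
\end{bmatrix}
\]
where
\[ z_{it}' = [y_{i,t-2}, y_{i,t-3}, \ldots, y_{i,1}, \Delta x_{it}', \ldots] \]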
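The block-diagonal structure of $Z_i$ can be made concrete for the lagged-level instruments alone. The sketch below leaves out the $\Delta x_{it}'$ part of $z_{it}'$ for brevity; the function name `ab_instruments` and the toy series are hypothetical.

```python
import numpy as np

def ab_instruments(y_i):
    """Lagged-level instrument block for one unit.

    y_i: array (y_i1, ..., y_iT). Returns a (T-2) x (T-2)(T-1)/2 matrix
    whose row for period t (t = 3..T) holds y_{i,t-2}, ..., y_{i,1}.
    """
    T = len(y_i)
    Z = np.zeros((T - 2, (T - 2) * (T - 1) // 2))
    col = 0
    for t in range(3, T + 1):
        lags = y_i[:t - 2][::-1]            # y_{i,t-2}, ..., y_{i,1}
        Z[t - 3, col:col + len(lags)] = lags
        col += len(lags)
    return Z

Z_i = ab_instruments(np.array([1.0, 2.0, 3.0, 4.0, 5.0]))   # T = 5
print(Z_i)
```

For T = 5 the first row contains only $y_{i1}$, the second $y_{i2}, y_{i1}$ and the third $y_{i3}, y_{i2}, y_{i1}$, which is the unbalanced instrument set described above.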
SUMMARY
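The strict exogeneity assumption of standard models such as the RE
and FE models is often violated
GMM allows for an IV approach in the panel data context (pooled,
RE, FE)
Different assumptions on exogeneity yield different numbers of
moment conditions
Dynamic panel data: the lagged dependent variable is endogenous and can be instrumented with its own earlier lags (Arellano-Bond)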
Digression: One-Track Bind
DISCUSSION