Dynamic Panel Data Ch 1. Reminder on Linear Non Dynamic Models
Pr. Philippe Polomé, Université Lumière Lyon 2
M2 EcoFi
2016 – 2017
Overview of Ch. 1
Data
Panel Data Models
Panel Data Estimators
Within Estimator
First-Differences Estimator
Random Effects Estimator
Fixed vs. Random Effects
Panel Data Inference
Panel-Robust Inference
Fixed Effects vs. Random Effects
Non-Test Elements of Choice
Hausman Test
Unbalanced Panel Data
Data
Panel Data
- $i = 1, \dots, N$: agent (individual, firm, country...)
- $t = 1, \dots, T$: time
- Generally $T_i$: the number of periods differs from agent to agent
  - Unbalanced panel (this is the norm)
  - Attrition: agents drop out of the sample
  - To simplify notation, theory uses $T$
  - But all computer packages manage $T_i$
  - So you should not balance your sample
- $y_{it}$: one obs. of the dependent variable $y$
- $x_{it}$: one obs. of the $K \times 1$ vector of independent variables
  - "regressors"
  - Possibly endogenous (Ch. 2)
Data management

obs    agent i   time t   y       x1       ...   xK
1      1         1        y_11    x_111    ...   x_K11
...
t      1         t        y_1t    x_11t    ...   x_K1t
...
T      1         T        y_1T    x_11T    ...   x_K1T
T+1    2         1        y_21    x_121    ...   x_K21
...
it     i         t        y_it    x_1it    ...   x_Kit
...
NT     N         T        y_NT    x_1NT    ...   x_KNT
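As a rough illustration of this stacked ("long") layout, the sketch below turns a small wide data set (one entry per agent) into one row per (agent, time) pair. The variable names and all numbers are made up for the example.

```python
# Hypothetical wide data: one entry per agent, one value per period.
wide = {
    1: {"y": [2.0, 2.5, 3.1], "x1": [1.0, 1.2, 1.5]},
    2: {"y": [0.7, 0.9, 1.4], "x1": [0.3, 0.4, 0.8]},
}

def to_long(wide):
    """Stack agents on top of each other: rows ordered by agent i, then time t."""
    rows = []
    for i in sorted(wide):
        T_i = len(wide[i]["y"])  # T_i may differ across agents (unbalanced panel)
        for t in range(1, T_i + 1):
            rows.append({
                "obs": len(rows) + 1,
                "i": i,
                "t": t,
                "y": wide[i]["y"][t - 1],
                "x1": wide[i]["x1"][t - 1],
            })
    return rows

long_rows = to_long(wide)
for row in long_rows:
    print(row)
```

Because the loop reads each agent's own number of periods, the same function handles an unbalanced panel without any padding.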
Panel Data Models
Typical Linear Panel Data Model
- The typical panel data model
  $$y_{it} = \alpha_i + \gamma_t + x_{it}'\beta + u_{it} \qquad (1)$$
  where
  - $u_{it}$ is a scalar disturbance term
  - Intercepts $\alpha_i$ vary across agents
  - Intercepts $\gamma_t$ vary over time
  - Slopes $\beta$ are constant
- A mathematically proper way to write this model is
  $$y_{it} = \sum_{j=1}^{N} \alpha_j d_{j,it} + \sum_{s=2}^{T} \gamma_s d_{s,it} + x_{it}'\beta + u_{it}$$
  - where the $N$ individual dummies $d_{j,it} = 1$ if $i = j$ and $= 0$ otherwise
  - the $T - 1$ time dummies $d_{s,it} = 1$ if $t = s$ and $= 0$ otherwise
  - $x_{it}$ does not include an intercept
- If an intercept is included
  - then one of the $N$ individual dummies must be dropped
  - Many packages do that automatically
Time dummies
- Focus on short panels where $N \to \infty$ but $T$ does not
  - Then the time intercepts $\gamma_t$ can be consistently estimated
  - In the sense that there is a finite number of them
  - The $T - 1$ time dummies are simply incorporated into the regressors $x_{it}$
  - We do not discuss them anymore
- "Long" panels are treated using time-series methods
  - The panel dimension is abandoned
Individual dummies
- If we inserted the full set of $N$ individual dummies $d_{j,it}$
  - It would cause problems as $N \to \infty$
  - We cannot consistently estimate an infinite number of parameters
    - Information on the $\alpha_i$ does not increase as $N$ increases
- Challenge: estimating the parameters $\beta$
  - consistently
  - controlling for the $N$ individual intercepts $\alpha_i$
- In this sense, the $\alpha_i$ are not the focus of the regression
  - They represent individual unobservables that do not have much interpretation
  - They are nuisance parameters
    - we are not interested in them
    - but we must find a way to deal with them
Individual-Specific Effects Model
- Individual-specific effects model
  $$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \qquad (2)$$
  where $\varepsilon_{it}$ is iid over $i$ and $t$
- = a more parsimonious way to express the previous model (1) with all the dummies
  - Time dummies may be included in regressors $x_{it}$
- "standard" linear non-dynamic panel data model
  - no $y_{i(t-s)}$ in $x_{it}$
- $\alpha_i$ random variables
  - Capture unobserved heterogeneity
  - = unobserved time-invariant individual characteristics
  - In effect: a random parameter model
Reminder: Unobserved Heterogeneity
- The correct model is $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
- But the estimated model is $Y = \beta_0 + \beta_1 x_1 + \nu$
- The effect of the missing regressor on $Y$ is absorbed in the error of the estimated model: $\nu = \beta_2 x_2 + \varepsilon$
  - = unobserved heterogeneity: unobserved (individual) factors influence the LHS variable
- If the missing regressor is correlated with an included regressor
  - Then $\nu$ is correlated with at least one included regressor
  - LS inconsistent
- Furthermore, possibly:
  - Heteroscedasticity if $\mathrm{var}(x_{2t}) \neq \mathrm{var}(x_{2s})$, $t \neq s$
  - Autocorrelation if $\mathrm{corr}(x_{2t}, x_{2s}) \neq 0$, $t \neq s$
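The size of the inconsistency can be made explicit with the usual omitted-variable result. This is a textbook derivation, not taken from the slides, and it assumes that $x_1$ is uncorrelated with $\varepsilon$.

```latex
\hat\beta_1
  = \frac{\widehat{\mathrm{Cov}}(x_1, Y)}{\widehat{\mathrm{Var}}(x_1)}
  \;\xrightarrow{p}\;
  \frac{\mathrm{Cov}(x_1,\ \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon)}{\mathrm{Var}(x_1)}
  = \beta_1 + \beta_2\,\frac{\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_1)}
```

The second term vanishes only if $\beta_2 = 0$ or if $x_1$ and $x_2$ are uncorrelated, which is the case where leaving $x_2$ in the error is harmless for consistency.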
[Figure: unobserved heterogeneity, same slopes]
Exogeneity
- Throughout this chapter: assume strong/strict exogeneity
  $$E[\varepsilon_{it} \mid \alpha_i, x_{i1}, \dots, x_{iT}] = 0, \quad t = 1, \dots, T \qquad (3)$$
  - So that $\varepsilon_{it}$ is assumed to have mean zero conditional on past, current, and future values of the regressors
    - Zero covariance
  - Nothing is said about the relation between the random term $\alpha_i$ and $x_i$
- Strong exogeneity rules out models with lagged dependent variables or with endogenous variables as regressors (Ch. 2)
  - Take $y_{it} = \alpha_i + x_{it}'\beta + \gamma y_{i,t-1} + \varepsilon_{it}$
  - Thus $y_{i,t-1} = \alpha_i + x_{i,t-1}'\beta + \gamma y_{i,t-2} + \varepsilon_{i,t-1}$
    - $y_{i,t-1}$ depends on $\varepsilon_{i,t-1}$, so a regressor of period $t$ is correlated with the error of period $t-1$
  - It is also often hard to maintain that $E(\varepsilon_{it}\varepsilon_{i,t-1}) = 0$
  - Strong exogeneity does not hold in dynamic models
Fixed Effects Model
- 2 variants of model (2), according to the hypotheses on $\alpha_i$
  - Both are models with "2" errors, $\alpha_i$ and $\varepsilon_{it}$
    - Error component models
  - Both variants treat $\alpha_i$ as an unobserved random variable
- Variant 1 of model (2): fixed effects (FE) model
  - $\alpha_i$ is potentially correlated with the (time-invariant part of the) observed regressors $x_{it}$
    - A form of unobserved heterogeneity
  - "fixed" because early treatments treated $\alpha_i$ as (non-random) parameters to be estimated
Random Effects Model
- Variant 2 of model (2): random effects (RE) model
  - $\alpha_i$ distributed independently of $x$
  - Usually makes the additional assumptions that both the random effects $\alpha_i$ and the error term $\varepsilon_{it}$ in (2) are iid:
    $$\alpha_i \sim [\alpha, \sigma_\alpha^2], \qquad \varepsilon_{it} \sim [0, \sigma_\varepsilon^2] \qquad (4)$$
  - No distribution has been specified in model (4)
- $\varepsilon_{it}$ may show autocorrelation
  - Often it is assumed $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{is}) \neq 0$
  - While both $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{jt}) = 0$ and $\mathrm{cov}(\alpha_i, \alpha_j) = 0$ are assumed
    - Except in spatial models
- $\alpha$ can be treated as the intercept of the model
Other names for the Random Effects Model
- One-way individual-specific effects model
  - Two-way = inclusion of time dummies or time-specific random effects
- Random intercept model
  - To distinguish the model from more general random effects models, e.g. random slopes
- Random components model
  - because the error term is $\alpha_i + \varepsilon_{it}$
Equicorrelated Random Effects Model
- RE model $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - can be viewed as regression of $y_{it}$ on $x_{it}$
  - with composite error term $u_{it} = \alpha_i + \varepsilon_{it}$
- The RE hypothesis (4) ($\alpha_i$ and $\varepsilon_{it}$ iid) implies that
  $$\mathrm{Cov}[(\alpha_i + \varepsilon_{it}), (\alpha_i + \varepsilon_{is})] = \begin{cases} \sigma_\alpha^2, & t \neq s \\ \sigma_\alpha^2 + \sigma_\varepsilon^2, & t = s \end{cases} \qquad (5)$$
- RE model thus imposes the constraint that the composite error $u_{it}$ is equicorrelated
  - Since $\mathrm{Cor}[u_{it}, u_{is}] = \sigma_\alpha^2 / [\sigma_\alpha^2 + \sigma_\varepsilon^2]$ for $t \neq s$ does not vary with the time difference $t - s$
  - RE model is also called the equicorrelated model or exchangeable errors model
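A minimal numerical sketch of (5): for assumed variance components, it builds the covariance matrix of the composite error for one agent and the implied correlation. The function names and the values 2 and 6 are illustrative only.

```python
def re_cov_block(sigma2_alpha, sigma2_eps, T):
    """T x T covariance matrix of u_it = alpha_i + eps_it for one agent, as in (5)."""
    return [
        [sigma2_alpha + (sigma2_eps if t == s else 0.0) for s in range(T)]
        for t in range(T)
    ]

def re_correlation(sigma2_alpha, sigma2_eps):
    """Cor[u_it, u_is] for t != s; it does not depend on the distance t - s."""
    return sigma2_alpha / (sigma2_alpha + sigma2_eps)

block = re_cov_block(sigma2_alpha=2.0, sigma2_eps=6.0, T=3)
rho = re_correlation(2.0, 6.0)

for row in block:
    print(row)
print("correlation for t != s:", rho)
```

Every off-diagonal entry of the block is the same, whatever the gap between the two periods, which is exactly the equicorrelation restriction.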
Synthesis of Panel Data Models

Both models:            $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$   (2)
Fixed-effects model:    $\mathrm{Cov}(\alpha_i, x_{it}) \neq 0$
Random-effects model:   $\alpha_i \sim [\alpha, \sigma_\alpha^2]$, $\varepsilon_{it} \sim [0, \sigma_\varepsilon^2]$   (4)
Panel Data Estimators
- 3 commonly used panel data estimators of $\beta$
  - In this non-dynamic, no-endogeneity context: LS variants
  - Differ in the extent to which cross-section and time-series variation in the data are used
    - their properties vary according to what model is appropriate
- A regressor $x_{it}$ may be time-invariant
  - $x_{it} = x_i$ for $t = 1, \dots, T$
    - so that $\bar x_i = \frac{1}{T}\sum_t x_{it} = x_i$
  - For some estimators only the coefficients of time-varying regressors are identified
Variance Matrix
- For a given $i$ we expect correlation in $y$ over time:
  - $\mathrm{Cor}[y_{it}, y_{is}]$ is high
  - Even after inclusion of regressors, $\mathrm{Cor}[u_{it}, u_{is}]$ may remain $\neq 0$
    - Call $\mathrm{Cov}[u_{it}, u_{is}] = \sigma_{its}$
    - When $t = s$, $\sigma_{its} = \sigma_{it}^2$
Panel Block-Diagonal Var-Cov Matrix of the Errors $\Sigma$
$$
\Sigma = \begin{pmatrix}
\Sigma_1 & 0 & \cdots & 0 \\
0 & \Sigma_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \Sigma_N
\end{pmatrix},
\qquad
\Sigma_i = \begin{pmatrix}
\sigma_{i1}^2 & \sigma_{i12} & \cdots & \sigma_{i1T} \\
 & \sigma_{i2}^2 & \ddots & \vdots \\
 & & \ddots & \sigma_{i(T-1)T} \\
\text{SYM} & & & \sigma_{iT}^2
\end{pmatrix}
$$
- The RE model accommodates (partly) this correlation
  - From (5):
    $$\mathrm{Cov}[(\alpha_i + \varepsilon_{it}), (\alpha_i + \varepsilon_{is})] = \begin{cases} \sigma_\alpha^2, & t \neq s \\ \sigma_\alpha^2 + \sigma_\varepsilon^2, & t = s \end{cases}$$
- OLS output treats each of the $T$ years as independent information, but
  - The information content is less than this
    - given the positive error correlation
  - Tends to overstate estimator precision
- Always use panel-corrected standard errors when OLS is applied in a panel
  - Many possible corrections, depending on assumed correlation and heteroskedasticity and whether short or long panel
  - The Stata default is not panel-corrected
Within Estimator
Within Model
- Principle: individual-specific deviations of the dependent variable from its time-averaged value
  - are explained by
  - individual-specific deviations of regressors from their time-averaged values
- Individual-specific effects model (2): $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - Average over time: $\bar y_i = \alpha_i + \bar x_i'\beta + \bar\varepsilon_i$
  - Subtract: the $\alpha_i$ terms cancel = the within model
    $$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i), \quad i = 1, \dots, N, \ t = 1, \dots, T \qquad (6)$$
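The within transformation can be sketched on a tiny made-up panel with a single regressor. The data below are generated without noise from $y_{it} = \alpha_i + 2x_{it}$ with $\alpha_1 = 10$ and $\alpha_2 = -5$, so the individual effect is negatively related to $x$; all names and numbers are assumptions for illustration.

```python
# Hypothetical balanced panel: N = 2 agents, T = 3 periods, one regressor.
panel = {
    1: {"x": [1.0, 2.0, 4.0], "y": [12.0, 14.0, 18.0]},  # alpha_1 = 10
    2: {"x": [3.0, 5.0, 6.0], "y": [1.0, 5.0, 7.0]},     # alpha_2 = -5
}

def mean(values):
    return sum(values) / len(values)

def demean(series):
    """Subtract the series' own average from each observation."""
    m = mean(series)
    return [v - m for v in series]

def within_slope(panel):
    """OLS slope of (y_it - ybar_i) on (x_it - xbar_i), pooled over agents."""
    sxy = sxx = 0.0
    for data in panel.values():
        xd, yd = demean(data["x"]), demean(data["y"])
        sxy += sum(a * b for a, b in zip(xd, yd))
        sxx += sum(a * a for a in xd)
    return sxy / sxx

def pooled_slope(panel):
    """OLS slope (with intercept) that ignores the individual effects."""
    x = [v for d in panel.values() for v in d["x"]]
    y = [v for d in panel.values() for v in d["y"]]
    xd, yd = demean(x), demean(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

beta_within = within_slope(panel)
beta_pooled = pooled_slope(panel)
print("within:", beta_within)
print("pooled:", beta_pooled)
```

On these numbers the within slope recovers 2 while pooled OLS even gets the sign wrong, because the demeaning removes $\alpha_i$ and pooled OLS does not.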
Within / Fixed Effects Estimator
- Within estimator = OLS estimator on the within model (6)
  $$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i)$$
  - Consistent for $\beta$ in the FE model
- Called the fixed effects estimator by analogy with the FE model
  - does not imply that $\alpha_i$ are fixed
- Each $i$ must be observed at least twice in the sample
  - Else $x_{it} - \bar x_i = 0$
Consistency of Fixed Effects Estimator
- FE treats $\alpha_i$ as nuisance parameters
  - can be ignored when interest lies in $\beta$
  - do not need to be consistently estimated to obtain consistent estimates of the slope parameters $\beta$
- Consistency further requires
  $$E(\varepsilon_{it} - \bar\varepsilon_i \mid x_{it} - \bar x_i) = 0$$
  in the within model
  $$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i)$$
  - Because of the averages, that requires more than $E(\varepsilon_{it} \mid x_{it}) = 0$
  - Requires the strict exogeneity assumption (3)
    $$E[\varepsilon_{it} \mid \alpha_i, x_{i1}, \dots, x_{iT}] = 0, \quad t = 1, \dots, T$$
Fixed Effects Estimates
- If the fixed effects $\alpha_i$ are of interest they can also be estimated
- If $N$ is not too large, an alternative way to compute Within is Least-Squares Dummy Variable estimation
  - Directly estimates $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$ by OLS of $y_{it}$ on $x_{it}$ and $N$ individual dummy variables
  - Yields the Within estimator for $\beta$,
  - along with estimates of the $N$ fixed effects: $\hat\alpha_i = \bar y_i - \bar x_i'\hat\beta$
    - unbiased estimator of $\alpha_i$
- But in short (small $T$) panels the $\hat\alpha_i$ are always inconsistent
  - because information never accumulates for them
  - Their distribution or their variation with a key variable may be informative
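The formula for the estimated fixed effects can be sketched as follows, taking a slope estimate as given. The panel is a made-up, noise-free example generated from $y_{it} = \alpha_i + 2x_{it}$, and `beta_hat = 2.0` is assumed to be the within estimate on these data.

```python
# Hypothetical panel, generated with alpha_1 = 10 and alpha_2 = -5.
panel = {
    1: {"x": [1.0, 2.0, 4.0], "y": [12.0, 14.0, 18.0]},
    2: {"x": [3.0, 5.0, 6.0], "y": [1.0, 5.0, 7.0]},
}

def mean(values):
    return sum(values) / len(values)

def fixed_effects(panel, beta_hat):
    """alpha_hat_i = ybar_i - xbar_i * beta_hat, one value per agent."""
    return {
        i: mean(data["y"]) - beta_hat * mean(data["x"])
        for i, data in panel.items()
    }

alpha_hat = fixed_effects(panel, beta_hat=2.0)
print(alpha_hat)
```

Each $\hat\alpha_i$ uses only the $T$ observations of agent $i$, which is why adding more agents does not make it more precise.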
Time-Invariant Regressors
- Major limitation of Within
  - the coefficients of time-invariant regressors are not identified
  - Since if $x_{it} = x_i$ then $\bar x_i = x_i$, so $(x_{it} - \bar x_i) = 0$
- Many studies seek to estimate the effect of time-invariant regressors
  - For example, in panel wage regressions: the effect of gender or race
  - For this reason many practitioners prefer not to use the within estimator
- RE estimator permits estimation of coefficients of time-invariant regressors
  - but it is inconsistent if the FE model is the correct model
First-Differences Estimator
First-Differences Model
- Principle: individual-specific one-period changes in the dependent variable
  - are explained by
  - individual-specific one-period changes in regressors
- Individual-specific effects model (2): $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - Lag one period: $y_{i,t-1} = \alpha_i + x_{i,t-1}'\beta + \varepsilon_{i,t-1}$
  - Subtract = the first-differences model
    $$y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}), \quad i = 1, \dots, N, \ t = 2, \dots, T \qquad (7)$$
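A small sketch of first differencing, on a made-up noise-free panel generated from $y_{it} = \alpha_i + 2x_{it}$ with one regressor; the data and function names are assumptions for illustration. The regression on differences has no intercept, as in (7).

```python
# Hypothetical panel, generated with alpha_1 = 10 and alpha_2 = -5.
panel = {
    1: {"x": [1.0, 2.0, 4.0], "y": [12.0, 14.0, 18.0]},
    2: {"x": [3.0, 5.0, 6.0], "y": [1.0, 5.0, 7.0]},
}

def first_difference(series):
    """One-period changes within an agent; the first period is lost."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def fd_slope(panel):
    """OLS slope (no intercept) of the differenced y on the differenced x."""
    sxy = sxx = 0.0
    for data in panel.values():
        dx = first_difference(data["x"])
        dy = first_difference(data["y"])
        sxy += sum(a * b for a, b in zip(dx, dy))
        sxx += sum(a * a for a in dx)
    return sxy / sxx

beta_fd = fd_slope(panel)
print(beta_fd)
```

Differencing is done agent by agent, so no difference ever mixes the last period of one agent with the first period of the next.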
First-Differences Estimator
- The first-differences estimator D1 is OLS in the first-differences model (7)
  - Consistent estimates of $\beta$ in the FE model
  - The coefficients of time-invariant regressors are not identified
- D1 is less efficient than within
  - if $\varepsilon_{it}$ is iid (for $T > 2$)
- However, it may safeguard against I(1) / unit root variables
  - That would otherwise lead to inconsistency
Random Effects Estimator
Random Effects Model
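- Individual-specific effects model (2): $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
- Assume RE model with iid $\alpha_i$ and $\varepsilon_{it}$ as in RE hyp. (4)
  $$\alpha_i \sim [\alpha, \sigma_\alpha^2], \qquad \varepsilon_{it} \sim [0, \sigma_\varepsilon^2]$$
- OLS would be consistent
- But GLS will be more efficient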
Reminder: GLS in a cross-section
- When all the hypotheses of the linear model are satisfied but the error covariance matrix $\Sigma$ is not the identity, then
  - OLS is consistent
  - but it is not efficient if we know $\Sigma$
- Let the classical linear (cross-section) model $y = x\beta + \varepsilon$ with $E(\varepsilon\varepsilon') = \Sigma \neq \sigma^2 I$
- Let $P'P = \Sigma^{-1}$
  - Unique Cholesky decomposition for the real positive definite matrix $\Sigma^{-1}$
- Premultiply the linear model by $P$: $Py = Px\beta + P\varepsilon$
  - $y^* = x^*\beta + \varepsilon^*$
  - Then
    $$\mathrm{Var}(\varepsilon^*) = E(P\varepsilon\varepsilon'P') = P\,E(\varepsilon\varepsilon')\,P' = P\Sigma P' = P(P'P)^{-1}P' = PP^{-1}(P')^{-1}P' = I$$
- So the transformed model has spherical disturbances
  - Applying OLS to the transformed data is an efficient estimator
  - That is GLS
- Since $\Sigma$ is unknown in practice, we need an estimate
  - Any consistent estimate of $\Sigma$, $\hat\Sigma$, yields a Feasible (consistent) GLS estimator
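Written out, OLS on the transformed data gives the familiar GLS formula. This is the standard textbook expression, obtained by substituting $x^* = Px$, $y^* = Py$ and $P'P = \Sigma^{-1}$; the matrix $X$ stacks the regressors of all observations.

```latex
\hat\beta_{GLS}
  = (X^{*\prime}X^{*})^{-1}X^{*\prime}y^{*}
  = (X'P'PX)^{-1}X'P'Py
  = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y ,
\qquad
\hat\beta_{FGLS} = (X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}y
```

The feasible version simply replaces $\Sigma$ by its consistent estimate $\hat\Sigma$, so $P$ itself never has to be computed explicitly.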
RE Panel Block-Diagonal Var-Cov Matrix of the Errors $\Sigma$
$$
\Sigma = \begin{pmatrix}
\Omega & 0 & \cdots & 0 \\
0 & \Omega & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \Omega
\end{pmatrix},
\qquad
\Omega = \begin{pmatrix}
\sigma_\alpha^2 + \sigma_\varepsilon^2 & \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_\varepsilon^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_\varepsilon^2
\end{pmatrix}
$$
with $N$ identical $T \times T$ blocks $\Omega$ on the diagonal
Random Effects Estimator
- The feasible GLS estimator of the RE model
  - can be calculated from OLS estimation of the transformed model:
    $$y_{it} - \hat\lambda\bar y_i = (1 - \hat\lambda)\mu + (x_{it} - \hat\lambda\bar x_i)'\beta + \nu_{it} \qquad (8)$$
    where $\nu_{it} = (1 - \hat\lambda)\alpha_i + (\varepsilon_{it} - \hat\lambda\bar\varepsilon_i)$ is asymptotically iid, and $\hat\lambda$ is consistent for
    $$\lambda = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_\alpha^2}} \qquad (9)$$
  - Called the RE estimator
- The nonrandom scalar intercept $\mu$ is added to normalize the random effects $\alpha_i$ to have zero mean
  - as in the RE hypothesis
- Cameron & Trivedi provide a derivation of (8) and ways to estimate $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$ and hence to estimate $\lambda$
  - Not detailed here
- Note
  - $\hat\lambda = 0$ corresponds to pooled OLS
  - $\hat\lambda = 1$ corresponds to within estimation
  - $\hat\lambda \to 1$ as $T \to \infty$ (look at the formula)
  - This is a two-step estimator of $\beta$
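The limiting cases listed above can be checked numerically with the weight of equation (9). In this sketch the variance components are treated as known, whereas in practice they are estimated in a first step; the function names are hypothetical.

```python
import math

def re_weight(sigma2_eps, sigma2_alpha, T):
    """Weight of equation (9): 1 - sigma_eps / sqrt(sigma_eps^2 + T * sigma_alpha^2)."""
    return 1.0 - math.sqrt(sigma2_eps) / math.sqrt(sigma2_eps + T * sigma2_alpha)

def quasi_demean(series, weight):
    """Subtract weight times the agent's time average, as in the transformed model (8)."""
    m = sum(series) / len(series)
    return [v - weight * m for v in series]

# No individual effect: the weight is 0 and the data are left untouched (pooled OLS).
w_pooled = re_weight(sigma2_eps=1.0, sigma2_alpha=0.0, T=5)

# With an individual effect, the weight grows towards 1 (within) as T increases.
weights_by_T = {T: re_weight(1.0, 1.0, T) for T in (3, 10, 100, 10000)}

print(w_pooled)
print(weights_by_T)
print(quasi_demean([1.0, 2.0, 3.0], 0.5))
```

For intermediate weights the transformation removes only part of the individual mean, so time-invariant regressors are not wiped out as they are in the within model.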
Random Effects Estimator Properties
- RE estimator is
  - Fully efficient under the RE model
    - The efficiency gain compared to pooled OLS (applied to the RE model) need not be great
  - Might still be inefficient if the equicorrelation hypothesis is not true
    - In particular, under AR(1) processes
  - Inconsistent if the FE model is correct
    - since then $\alpha_i$ is correlated with $x_{it}$
RE Discussion
- Most disciplines in applied statistics,
  - other than microeconometrics,
  - treat any unobserved individual heterogeneity as being distributed independently of the regressors
- Then the effects are random effects
  - rather: purely random effects
- Compared to FE models,
  - this stronger assumption has the advantage of permitting consistent estimation of all parameters
    - Including coefficients of time-invariant regressors
  - However, RE and pooled OLS are inconsistent if the true model is FE
  - Economists often view the assumptions for the RE model as being unsupported by the data
Fixed vs. Random Effects
Identification of the Individual-Specific Effects
- In $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - the individual effect is a random variable (random coefficient)
    - in both fixed and random effects models
    - Possibly with a non-zero mean
  - Both models assume that $E[y_{it} \mid \alpha_i, x_{it}] = \alpha_i + x_{it}'\beta$
- $\alpha_i$ is unknown and cannot be consistently estimated
  - Unless $T \to \infty$
  - So we cannot estimate $E[y_{it} \mid \alpha_i, x_{it}]$
  - Prediction is therefore not possible
    - Contrary to what we usually do with OLS
  - That is reasonable as $\alpha_i$ includes unobserved individual characteristics
- But, take the expectation wrt $x_{it}$: $E[y_{it} \mid x_{it}] = E[\alpha_i \mid x_{it}] + x_{it}'\beta$
  - That is, what is the (conditional) expected value of $\alpha_i$?
  - FE and RE have different takes on this expectation
Random Effects vs. Fixed Effects
- RE: it is assumed that $E[\alpha_i \mid x_{it}] = \alpha$, so $E[y_{it} \mid x_{it}] = \alpha + x_{it}'\beta$
  - Hence $E[y_{it} \mid x_{it}]$ is identified
    - Since we estimate consistently a single intercept as $NT \to \infty$
  - But the key RE assumption that $E[\alpha_i \mid x_{it}]$ is constant across $i$ might not hold in many microeconometrics applications
- FE: $E[\alpha_i \mid x_{it}]$ varies with $x_{it}$ and it is not known how it varies
  - So we cannot identify $E[y_{it} \mid x_{it}]$
  - Nonetheless Within & First-Diff estimators consistently estimate $\beta$ with short panels
  - Thus identify the marginal effect $\beta = \partial E[y_{it} \mid \alpha_i, x_{it}] / \partial x_{it}$
    - e.g. identify the effect on earnings of 1 additional year of schooling
    - But only for time-varying regressors
      - so the marginal effect of race or gender, for example, is not identified
    - And not the expected individual $y_{it}$, as we do not know the individual effect $\alpha_i$
- Both models have different focuses
- RE
  - Time-series structure
  - Efficiency
- FE
  - Endogeneity of unobserved heterogeneity
  - Consistency
Summary Models & Estimators
Table: Linear Panel Model: Common Estimators and Models

Estimator of $\beta$          Model: Rnd Effects (2) & (4)   Model: Fixed Effects (2)
Within (Fixed Effects) (6)    Consistent                     Consistent
First Differences (7)         Consistent                     Consistent
Random Effects (8)            Consistent & efficient         Inconsistent

This table considers only consistency of estimators of $\beta$.
For correct computation of standard errors see next Section.
The only fully efficient estimator is RE under the RE model
Example Arellano-Bond
- Unbalanced panel of 140 U.K. manufacturing companies over the period 1976-1984
  - Download in Stata: webuse abdata
  - year = $t$, n = log of employment, w = log of real wage, k = log of gross capital, ys = log of industry output, id = firm index ($i$)
  - Panel structure in Stata: xtset id year, yearly
- Arellano & Bond are interested in a dynamic employment equation (labour demand)
  $$n_{it} = \alpha_1 n_{i,t-1} + \alpha_2 n_{i,t-2} + \beta'(L)x_{it} + \lambda_t + \eta_i + \nu_{it}$$
  where $\beta(L)$ indicates a vector of polynomials in the lag operator so that various lags of $x$ might be used
  - AB use $w_t$, $w_{t-1}$, $k_t$, $k_{t-1}$, $ys_t$, $ys_{t-1}$, $ys_{t-2}$
  - And time dummies for all years
- AB model is dynamic
- In this chapter, we estimate
  - without the lags of n in the regressors
  - with them
  - by FE, D1 and RE
- All this is in principle known
  - → AB.do
First-difference in Stata
- The first-differences estimator is not readily available
  - Define the first differences first, then apply OLS
    - This is fairly unsatisfactory as there is no real account of the panel structure of the error term
- Lag 1 period: by id: gen xL1 = x[_n-1]
  - _n indexes observations
  - by id indicates to lag by group defined on the id variable (the data must be sorted by id and year)
- Then by id: gen xD1 = x-xL1 for the 1st diff
First-differencing time dummies
- Take $d_t$ a time dummy
  - Recall that a lag one period of x indicates at time t+1 the value that x had at t
  - with a missing value at t=1 (at the 1st obs period)
- By construction the lag of $d_t$ must be one at t+1 and zero elsewhere
  - Thus, e.g. yr1980L1=1 in 1981, 0 in other years
  - so yr1980D1=yr1980-yr1980L1=-1 in 1981, 1 in 1980, 0 in other years, missing in 1976
- Also yr1984L1 is zero everywhere since 1984 is the last obs. year (missing in 1976)
  - So yr1984D1 cannot be used as it is identical to yr1984
- Interpretation of the 1st diff. of a time dummy is hard
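The pattern described for yr1980 and yr1984 can be reproduced for a single firm with a short sketch. It assumes, for simplicity, a firm observed in every year from 1976 to 1984 (the actual panel is unbalanced), and uses `None` to stand for a missing value.

```python
years = list(range(1976, 1985))  # 1976, ..., 1984 for one hypothetical firm

def lag(series):
    """One-period lag within a firm; None marks the missing first value."""
    return [None] + series[:-1]

def diff(series):
    """First difference; missing wherever the lag is missing."""
    return [None if l is None else v - l for v, l in zip(series, lag(series))]

yr1980 = [1 if y == 1980 else 0 for y in years]
yr1984 = [1 if y == 1984 else 0 for y in years]

yr1980D1 = dict(zip(years, diff(yr1980)))
yr1984D1 = dict(zip(years, diff(yr1984)))

print(yr1980D1)
print(yr1984D1)
```

Apart from the missing value in 1976, the differenced yr1984 is the same series as yr1984 itself, which is the collinearity problem noted above.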
Example Arellano-Bond Results
Table: Coef. Estimates, no lags of n

Variable     OLS       FE        D1        RE
w            -0.229    -0.524    -0.543    -0.503
wL1          -0.289    -0.077    0.041     -0.052
k            0.320     0.493     0.399     0.553
kL1          0.493     0.142     0.166     0.196
ys           -1.801    0.344     0.532     0.263
ysL1         -0.468    -0.198    -0.268    -0.266
ysL2         2.136     -0.076    -0.001    -0.048
yr1979       -0.057    -0.016    0.006     -0.017
yr1980       -0.233    -0.017    0.022     -0.024
yr1981       -0.467    -0.048    0.004     -0.058
yr1982       -0.392    -0.065    -0.013    -0.069
yr1983       -0.235    -0.058    -0.013    -0.056
yr1984       -0.264    -0.022    omitted   -0.011
Intercept    3.748     2.907     -0.010    3.396
Table: Coef. Estimates, with lags of n; time dummies not presented

Variable     OLS       FE        D1        RE
nL1          1.096     0.736     0.130     1.096
nL2          -0.132    -0.154    -0.035    -0.132
w            -0.534    -0.560    -0.556    -0.534
wL1          0.486     0.316     0.124     0.486
k            0.355     0.393     0.392     0.355
kL1          -0.325    -0.098    0.127     -0.325
ys           0.465     0.475     0.560     0.465
ysL1         -0.787    -0.633    -0.368    -0.787
ysL2         0.314     0.056     0.034     0.314
Intercept    0.215     1.810     -0.009    0.215

It is interesting to compare parameter estimates, but we postpone this to the next chapter
Panel Data Inference
Panel-Robust Inference
Panel-Robust Statistical Inference
- The various panel models include error terms: $u_{it}$, $\varepsilon_{it}$, $\alpha_i$
- In many microeconometrics applications:
  - Reasonable to assume independence over $i$
  - The errors are potentially
    1. serially correlated (correlated over $t$ for given $i$)
    2. heteroskedastic (at least across $i$)
- Valid statistical inference requires controlling for both of these factors
Het. & Autoc. Block-Diagonal Errors Var-Cov Matrix $\Sigma$
$$
\Sigma = \begin{pmatrix}
\Sigma_1 & 0 & \cdots & 0 \\
0 & \Sigma_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \Sigma_N
\end{pmatrix},
\qquad
\Sigma_i = \begin{pmatrix}
\sigma_{i1}^2 & \sigma_{i12} & \cdots & \sigma_{i1T} \\
 & \sigma_{i2}^2 & \ddots & \vdots \\
 & & \ddots & \sigma_{i(T-1)T} \\
\text{SYM} & & & \sigma_{iT}^2
\end{pmatrix}
$$
- Not enough structure
RE Panel Block-Diagonal Var-Cov Matrix of the Errors $\Sigma$
$$
\Sigma = \begin{pmatrix}
\Omega & 0 & \cdots & 0 \\
0 & \Omega & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \Omega
\end{pmatrix},
\qquad
\Omega = \begin{pmatrix}
\sigma_\alpha^2 + \sigma_\varepsilon^2 & \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_\varepsilon^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_\varepsilon^2
\end{pmatrix}
$$
- Equicorrelation implies
  - Homoskedasticity
  - A limited form of autocorrelation
Heteroskedastic RE Block-Diagonal Errors Var-Cov Matrix $\Sigma$
$$
\Sigma = \begin{pmatrix}
\Omega_1 & 0 & \cdots & 0 \\
0 & \Omega_2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \Omega_N
\end{pmatrix},
\qquad
\Omega_i = \begin{pmatrix}
\sigma_\alpha^2 + \sigma_{\varepsilon i}^2 & \sigma_\alpha^2 & \cdots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_{\varepsilon i}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \sigma_\alpha^2 \\
\sigma_\alpha^2 & \cdots & \sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_{\varepsilon i}^2
\end{pmatrix}
$$
- Small generalisation of RE for heteroskedasticity
- The White heteroskedasticity-consistent estimator can be extended to short panels
  - since for the $i$-th individual the error variance matrix is of finite dimension $T \times T$ while $N \to \infty$
Reminder: The White heteroskedasticity-consistent estimator
- Classical linear model $y = x\beta + \varepsilon$ with $E(\varepsilon\varepsilon') = \Sigma \neq \sigma^2 I$
  - OLS unbiased and consistent
  - $\mathrm{Var}(\hat\beta_{OLS}) = (X'X)^{-1}X'\Sigma X(X'X)^{-1} \neq \sigma^2(X'X)^{-1}$
- For pure heteroskedasticity, White (1980) shows that
  $$S = \frac{1}{N}\sum_{i=1}^{N}\hat\varepsilon_i^2 X_i X_i'$$
  - where $\hat\varepsilon_i$ is the OLS residual
  - is a consistent estimate of $\frac{1}{N}X'\Sigma X$ under general conditions
- The formula can be extended for autocorrelation
  - But often autocorrelation reveals time-series properties
  - That need to be investigated in more detail
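For a single regressor and no intercept, the sandwich formula reduces to scalars and can be sketched in a few lines. The data are made up, no small-sample correction is applied, and the function names are hypothetical.

```python
# Hypothetical cross-section with one regressor and no intercept.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 6.0]

def ols_slope(x, y):
    """beta_hat = sum(x*y) / sum(x^2) for a regression through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def white_variance(x, y, beta_hat):
    """(X'X)^-1 * sum(e_i^2 * x_i^2) * (X'X)^-1 with scalar x_i."""
    sxx = sum(a * a for a in x)
    meat = sum((b - beta_hat * a) ** 2 * a * a for a, b in zip(x, y))
    return meat / (sxx * sxx)

def classical_variance(x, y, beta_hat):
    """s^2 * (X'X)^-1, valid only under homoskedastic, uncorrelated errors."""
    n = len(x)
    s2 = sum((b - beta_hat * a) ** 2 for a, b in zip(x, y)) / (n - 1)
    return s2 / sum(a * a for a in x)

beta_hat = ols_slope(x, y)
v_white = white_variance(x, y, beta_hat)
v_classical = classical_variance(x, y, beta_hat)
print(beta_hat, v_white, v_classical)
```

The only difference between the two variance functions is the middle term: the robust version weights each squared residual by its own $x_i^2$ instead of using a common $s^2$.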
- Panel-robust standard errors can thus be obtained
  - following White's principle
  - without assuming specific functional forms for within-individual error correlation or heteroskedasticity
    - However, we assume a constant covariance as in RE
  - Called "sandwich" or "robust" estimators
- So we use inefficient estimators
  - but at least we get their variance better than with OLS formulas
  - Only the RE estimator in the RE model is efficient
  - More efficient estimators using GMM: Chap. 2
- If there are AR(1) or I(1) errors, we might still be very wrong
  - FE or RE tend to reduce the serial correlation in errors
    - but not eliminate it
- The panel commands in many computer packages calculate default se assuming iid errors
  - erroneous inference
  - Ignoring the serial correlation can lead to underestimated se
Stata commands
- Robust estimator assumes independence over $i$ and $N \to \infty$
  - but permits $V[u_{it}]$ and $\mathrm{Cov}[u_{it}, u_{is}]$ to vary with $i$, $t$, and $s$
  - the case for short panels
- Panel-robust standard errors based on White can be computed by use of a regular panel command
  - if the command has a cluster-robust standard error option
    - in Stata, cluster on the individual $i$
- Common error: use the standard robust se option
  - Only adjusts for heteroskedasticity
  - In practice in a panel: more important to correct for serial correlation
  - In Stata, in a panel estimator, robust automatically accounts for cluster
- Bootstrap computes panel-robust standard errors based on bootstrap
  - Fewer hypotheses
  - Slower, depends on the number of replications
  - Do not specify a cluster variable when in a panel model
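The difference between the heteroskedasticity-only correction and the cluster-robust one can be sketched for a pooled regression with one regressor and no intercept. This is an illustration in Python rather than the Stata implementation: the panel is made up, two agents are far too few for real inference, and no small-sample correction is applied.

```python
# Hypothetical panel: 2 agents, 2 periods each.
panel = {
    1: {"x": [1.0, 3.0], "y": [1.0, 2.0]},
    2: {"x": [2.0, 4.0], "y": [3.0, 6.0]},
}

def pooled_slope(panel):
    """OLS through the origin on the stacked data."""
    sxy = sum(a * b for d in panel.values() for a, b in zip(d["x"], d["y"]))
    sxx = sum(a * a for d in panel.values() for a in d["x"])
    return sxy / sxx

def scores(data, beta_hat):
    """x_it * residual_it for one agent."""
    return [a * (b - beta_hat * a) for a, b in zip(data["x"], data["y"])]

def white_variance(panel, beta_hat):
    """Middle term sums squared scores observation by observation."""
    sxx = sum(a * a for d in panel.values() for a in d["x"])
    meat = sum(s * s for d in panel.values() for s in scores(d, beta_hat))
    return meat / (sxx * sxx)

def cluster_variance(panel, beta_hat):
    """Middle term sums the scores within each agent first, then squares."""
    sxx = sum(a * a for d in panel.values() for a in d["x"])
    meat = sum(sum(scores(d, beta_hat)) ** 2 for d in panel.values())
    return meat / (sxx * sxx)

beta_hat = pooled_slope(panel)
v_white = white_variance(panel, beta_hat)
v_cluster = cluster_variance(panel, beta_hat)
print(beta_hat, v_white, v_cluster)
```

Summing within the agent before squaring keeps the cross-period products of the residuals, which is how the cluster version picks up serial correlation that the observation-level version ignores.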
Example Arellano-Bond Results
Table: p-values, FE models w/ 2 lags of n; time dummies not presented

Variable     Standard   (Cluster-) Robust   Bootstrap (500 rep)
nL1          0.000      0.000               0.000
nL2          0.000      0.027               0.032
w            0.000      0.001               0.001
wL1          0.000      0.029               0.033
k            0.000      0.000               0.000
kL1          0.002      0.032               0.028
ys           0.000      0.006               0.005
ysL1         0.000      0.003               0.002
ysL2         0.677      0.672               0.693
Intercept    0.000      0.005               0.003

Robust is interpreted as cluster-robust; the clustering variable is id, the panel index i
Note: Variance Decomposition
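The total variation of a series $x_{it}$ around its grand mean $\bar{x}$ can be decomposed as
\[
\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it}-\bar{x})^2
= \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left[(x_{it}-\bar{x}_i)+(\bar{x}_i-\bar{x})\right]^2
= \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it}-\bar{x}_i)^2
+ \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(\bar{x}_i-\bar{x})^2
\]
as the cross-product term sums to zero.
The corresponding variances, with degrees-of-freedom corrections, are
\[
s_w^2 = \frac{1}{NT-N}\sum_{i=1}^{N}\sum_{t=1}^{T}(x_{it}-\bar{x}_i)^2,
\qquad
s_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar{x}_i-\bar{x})^2
\]
Total variance $s^2 \approx s_w^2 + s_b^2$ (exact for the sums of squares, approximate once the divisors differ)
- $s_w^2$: within variance [deviations of individual observations around the individual means, summed across individuals]
- $s_b^2$: between variance [deviations of individual means around the grand mean]
- The between and within $R^2$ are defined similarly
  - $R^2$ often small with panel data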
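As a numerical check of the decomposition, here is a minimal Python sketch for a balanced panel. The dictionary layout (individual identifier mapped to its list of T observations), the function name and the toy numbers are assumptions made for illustration only.

```python
def variance_decomposition(panel):
    """Return (total, within, between) sums of squares for a balanced panel.

    panel: dict mapping an individual identifier to its list of T observations.
    """
    n = len(panel)
    t = len(next(iter(panel.values())))
    all_obs = [v for series in panel.values() for v in series]
    grand_mean = sum(all_obs) / (n * t)
    means = {i: sum(s) / t for i, s in panel.items()}

    ss_total = sum((v - grand_mean) ** 2 for v in all_obs)
    # Deviations of each observation around its own individual mean
    ss_within = sum((v - means[i]) ** 2 for i, s in panel.items() for v in s)
    # Deviations of individual means around the grand mean, counted T times
    ss_between = sum(t * (m - grand_mean) ** 2 for m in means.values())
    return ss_total, ss_within, ss_between

# Toy balanced panel (hypothetical): N = 3 individuals, T = 3 periods
panel = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
ss_total, ss_within, ss_between = variance_decomposition(panel)
print(ss_total, ss_within, ss_between)
```

Here most of the variation is between individuals, the typical situation in which within (FE) estimates are imprecise.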
Fixed Effects vs. Random Effects
Non-Test Elements of Choice
Causation
- The FE model can establish causation under weaker assumptions than those needed with
  - cross-section data
  - panel data models without fixed effects: pooled & RE models
- In some studies causation is clear, so RE may be appropriate
  - For example, in a controlled experiment, causation is clear
    - crop yield from different amounts of fertilizer applied to different fields in a laboratory
    - $x_i$ is assigned randomly to cases, thus uncorrelated with $\alpha_i$
- In other cases it may be sufficient to use a RE analysis to measure the extent of correlation
  - determination of causation is left to other approaches
  - e.g. effect of smoking on lung cancer
Causation
- Economists are unusual in preferring a FE approach because of a desire to measure causation with observational instead of experimental data
  - There is the possibility that instead of measuring causation, we measure only a spurious correlation due to the effect of unobserved variables that are correlated with the variables included in the regression
  - FE eliminates those unobserved variables that are time-invariant by differencing, so that
    - The causal effect of x on y is measured by the association between individual changes in y and in x
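The differencing argument can be written out in three lines. The sketch below assumes the individual-effects model $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$; this notation is an assumption chosen to match the symbols used in this chapter.

```latex
\begin{align*}
y_{it}             &= \alpha_i + x_{it}'\beta + \varepsilon_{it} \\
y_{i,t-1}          &= \alpha_i + x_{i,t-1}'\beta + \varepsilon_{i,t-1} \\
y_{it} - y_{i,t-1} &= (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1})
\end{align*}
```

The effect $\alpha_i$ drops out of the last line, so $\beta$ is identified from changes within each individual, whatever the correlation between $\alpha_i$ and $x_{it}$. The within transformation achieves the same result with deviations from individual means instead of first differences.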
Fixed Effects Weaknesses in Practice
- Estimation of the coefficient of any time-invariant regressor is not possible with FE
- Coefficients of time-varying regressors are estimable, but may be imprecise if most of the variation in a regressor is cross-sectional rather than over time
  - As then the within transformation removes most of this variation
- Prediction of the conditional mean is not consistent since the individual effects are not consistently estimated
  - Only changes in the conditional mean caused by changes in time-varying regressors can be predicted
- Still requires the assumption that the unobservables $\alpha_i$ are time-invariant (no $\alpha_{it}$)
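The first two weaknesses can be seen directly by applying the within transformation to toy data. This is a minimal Python sketch; the two regressors, the dictionary layout and the function names are hypothetical.

```python
def within_transform(panel):
    """Subtract each individual's time mean from its own observations."""
    return {i: [v - sum(s) / len(s) for v in s] for i, s in panel.items()}

def sum_of_squares(panel):
    """Sum of squared deviations around the grand mean."""
    obs = [v for s in panel.values() for v in s]
    mean = sum(obs) / len(obs)
    return sum((v - mean) ** 2 for v in obs)

# Two individuals observed over three periods (made-up numbers)
time_invariant = {"a": [1, 1, 1], "b": [0, 0, 0]}              # e.g. a dummy fixed over time
mostly_cross_section = {"a": [100, 101, 99], "b": [200, 201, 199]}

demeaned_invariant = within_transform(time_invariant)
demeaned_mostly = within_transform(mostly_cross_section)

variation_before = sum_of_squares(mostly_cross_section)
variation_after = sum_of_squares(demeaned_mostly)
print(demeaned_invariant, variation_before, variation_after)
```

The time-invariant regressor becomes identically zero, so its coefficient cannot be estimated, and the second regressor keeps only a tiny fraction of its original variation, which is what makes its FE coefficient imprecise.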
Hausman Test
Reminder: Hausman Test
- Principle: if two estimators are consistent, then their difference should not be statistically different from zero, asymptotically
- Consider two estimators $\hat\theta$ and $\tilde\theta$ (in the same model)
  - We test $H_0: \operatorname{plim}(\hat\theta - \tilde\theta) = 0$ against $H_a: \operatorname{plim}(\hat\theta - \tilde\theta) \neq 0$
  - Under $H_0$, the difference between the 2 estimators converges to a normal with zero mean: $\sqrt{N}(\hat\theta - \tilde\theta) \xrightarrow{d} N[0, V_H]$
    - where $V_H$ is the variance matrix in the limiting distribution
- Hausman test statistic
  \[ H = (\hat\theta - \tilde\theta)' \left( \frac{1}{N}\hat V_H \right)^{-1} (\hat\theta - \tilde\theta) \]
  - asymptotically $\chi^2(q)$ under $H_0$, with $q$ the dimension of $\theta$
  - reject $H_0$ at level $\alpha$ if $H > \chi^2_\alpha(q)$
- The question in practice is to find an estimate of $V_H$: $\hat V_H$
Hausman Test for Panel Data
- If individual effects are fixed
  - within estimator $\hat\beta_W$ is consistent
  - RE estimator $\tilde\beta_{RE}$ is inconsistent
  - $\beta$: vector of coefficients of just the time-varying regressors
- Hausman test on presence of fixed effects
  - $H_0$: No systematic difference between the coefficient estimates
    - If it holds, prefer RE as it is more efficient
    - In principle, maybe not if errors are I(1)
- Works on any pair of estimators with similar properties
  - e.g. first differences versus pooled OLS
Hausman Test for Panel Data
- Large value of $H$ leads to rejection of the null hypothesis
  - We infer that since $\hat\beta_W$ is consistent, if $\tilde\beta_{RE}$ is much different, it must be inconsistent
  - So that the individual-specific effects are correlated with the regressors
- It may still be possible to avoid using a FE estimator
  - If regressors are correlated with individual-specific effects because of omitted variables
    - then maybe add further regressors
  - It may be possible to estimate a RE model using instrumental variables methods (Ch. 2)
Hausman Test Computation When RE IS Fully Efficient
- Assume the true model is the RE model with
  - $\alpha_i$ iid $[0, \sigma_\alpha^2]$, uncorrelated with the regressors
  - error $\varepsilon_{it}$ iid $[0, \sigma_\varepsilon^2]$
- Then $\tilde\beta_{RE}$ is fully efficient and the Hausman test statistic simplifies to
  \[ H = (\tilde\beta_{1,RE} - \hat\beta_{1,W})' \left[ \hat V[\hat\beta_{1,W}] - \hat V[\tilde\beta_{1,RE}] \right]^{-1} (\tilde\beta_{1,RE} - \hat\beta_{1,W}) \]
  - where $\beta_1$ denotes the subcomponent of $\beta$ corresponding to time-varying regressors
    - since only that component can be estimated by the within estimator
  - This test statistic is asymptotically $\chi^2(\dim[\beta_1])$ under $H_0$
  - Very easy since then the $\hat V$ matrices are regular outputs of the estimation
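Since the two coefficient vectors and their variance matrices are regular estimation outputs, the simplified statistic takes only a few lines to compute. The following Python sketch uses made-up estimates for two time-varying regressors; all numbers and function names are hypothetical, and the small linear solver is there only to keep the example free of external libraries.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z

def hausman(b_re, b_w, v_re, v_w):
    """H = d' [V_W - V_RE]^{-1} d with d = b_RE - b_W (fully efficient RE case)."""
    d = [r - w for r, w in zip(b_re, b_w)]
    v_diff = [[vw - vr for vw, vr in zip(row_w, row_re)]
              for row_w, row_re in zip(v_w, v_re)]
    return sum(di * zi for di, zi in zip(d, solve(v_diff, d)))

# Made-up estimates for two time-varying regressors
b_re = [0.50, 1.20]
b_w = [0.30, 1.00]
v_re = [[0.006, 0.0], [0.0, 0.010]]
v_w = [[0.010, 0.0], [0.0, 0.020]]

h = hausman(b_re, b_w, v_re, v_w)
critical_value = 5.991  # 5% critical value of chi2(2)
print(h, h > critical_value)
```

With these numbers H is far above the 5% critical value of $\chi^2(2)$, so the null of no systematic difference would be rejected and FE preferred.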
Hausman Test When RE IS NOT Fully Efficient
- The above simple form of the Hausman test is invalid if $\alpha_i$ or $\varepsilon_{it}$ are not iid
  - e.g. with the heteroskedasticity inherent in much microeconometric data
  - Then the RE estimator is not fully efficient under the null hypothesis
- The expression $\hat V[\hat\beta_{1,W}] - \hat V[\tilde\beta_{1,RE}]$ in the formula for $H$ needs to be replaced by the more general $\hat V[\tilde\beta_{1,RE} - \hat\beta_{1,W}]$
  - That is NOT implemented in [...]
  - For short panels this variance matrix can be consistently estimated by bootstrap resampling over $i$
Hausman Test When RE IS NOT Fully Efficient 2
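- A panel-robust Hausman test statistic is
  \[ H_{Robust} = (\tilde\beta_{1,RE} - \hat\beta_{1,W})' \left[ \hat V_{boot}[\tilde\beta_{1,RE} - \hat\beta_{1,W}] \right]^{-1} (\tilde\beta_{1,RE} - \hat\beta_{1,W}) \]
  - where
    \[ \hat V_{boot}[\tilde\beta_{1,RE} - \hat\beta_{1,W}] = \frac{1}{B-1} \sum_{b=1}^{B} (\hat\delta_b - \bar{\hat\delta})(\hat\delta_b - \bar{\hat\delta})' \]
  - $b$ is the $b$-th of $B$ bootstrap replications, $\hat\delta = \tilde\beta_{1,RE} - \hat\beta_{1,W}$, and $\bar{\hat\delta}$ is the average of the $\hat\delta_b$
- This test statistic can
  - be applied to subcomponents of $\beta_1$
  - use other estimators such as $\tilde\beta_{1,POLS}$ in place of $\tilde\beta_{1,RE}$ and $\hat\beta_{1,FD}$ in place of $\hat\beta_{1,W}$
- There are user implementations on the Internet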
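The bootstrap version can be sketched in plain Python for the simplest case: one regressor, with pooled OLS used in place of RE (one of the substitutions mentioned above) and the within estimator as the consistent benchmark. The simulated data, the seed, the number of replications and all function names are assumptions made for illustration; the key point is that whole individuals are resampled, keeping their T observations together.

```python
import random

def slope(xs, ys):
    """OLS slope of ys on xs, with intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx

def delta(sample):
    """Pooled OLS slope minus within slope; sample is a list of (x_i, y_i) series."""
    x_all = [v for x, _ in sample for v in x]
    y_all = [v for _, y in sample for v in y]
    x_dev = [v - sum(x) / len(x) for x, _ in sample for v in x]
    y_dev = [v - sum(y) / len(y) for _, y in sample for v in y]
    return slope(x_all, y_all) - slope(x_dev, y_dev)

def robust_hausman(sample, n_boot, rng):
    """Scalar panel-robust Hausman statistic with a bootstrap variance."""
    d_hat = delta(sample)
    n = len(sample)
    draws = []
    for _ in range(n_boot):
        # Resample individuals with replacement, keeping each one's T rows together
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(delta(resample))
    d_bar = sum(draws) / n_boot
    v_boot = sum((d - d_bar) ** 2 for d in draws) / (n_boot - 1)
    return d_hat ** 2 / v_boot, d_hat, v_boot

# Simulated panel (hypothetical): N = 30, T = 4, effect alpha_i correlated with x
rng = random.Random(0)
sample = []
for i in range(1, 31):
    alpha = float(i)
    x = [alpha + rng.gauss(0, 1) for _ in range(4)]
    y = [alpha + 1.0 * v + rng.gauss(0, 1) for v in x]
    sample.append((x, y))

h_robust, d_hat, v_boot = robust_hausman(sample, 200, rng)
print(h_robust > 3.84)  # 3.84 is the 5% critical value of chi2(1)
```

Because the individual effect is strongly correlated with the regressor in this simulation, the pooled slope is biased upward and the statistic lands well above the critical value.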
Example Arellano-Bond Results
- How it works in [...]
- e.g. to compare FE & RE
  - run xtreg ..., fe
  - estimates store EstimEF
  - run xtreg ..., re
  - hausman EstimEF .
    - Take care to insert the final dot ".", which means “last estimates computed”
  - Menu: Stat → Postestimation → Tests → Hausman
- If you try to use vce(robust) or any option other than the default
  - an error message results
  - That is fair as [...] only does the “fully efficient” version of Hausman
Example Arellano-Bond Results
- Output is fairly complete
  - Test: Ho: difference in coefficients not systematic
  - chi2(15) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 169.57
  - Prob>chi2 = 0.0000 (V_b-V_B is not positive definite)
  - The last remark is probably because the differences between some variances are machine-zero
  - So what conclusion?
- The 2 estimators must have the same number of coefficient estimates
  - It may be necessary to remove time-invariant regressors from FE
A Note on Model Selection
- Often students proceed as follows
  1. Estimate a pooled (OLS) model
  2. Estimate the RE model
  3. Test (there are several tests) the RE against the pooled
  4. If RE is not rejected, then estimate FE
  5. Test (Hausman) FE against RE
- Problem with this approach
  - If the correct model is FE then both RE & OLS (pooled) estimators are inconsistent
  - The test(s) in step 3 cannot be relied upon
- More generally, in applied work
  - Start from the more general model (here: FE)
  - And see whether it can be simplified
Unbalanced Panel Data
Attrition
- Balanced panel: data are available for every $i$ in every $t$
  - e.g. countries
- Panel surveys of individuals
  - attrition over time
- Different individuals appear in different years
  - unbalanced or incomplete panels
  - $T$ becomes $T_i$
  - Generally unavoidable
  - Sometimes purposeful
    - Rotating panel
Consistency
- The FE estimator is consistent only if
  - presence or absence in the sample is not correlated
    - with individual-specific effects $\alpha_i$
    - with regressors $x_{it}$
- RE is consistent if
  - additionally $\alpha_i$ is independent of the regressors $x_{it}$
- Non-randomly missing data: the reason for individuals dropping out of the sample is correlated with the error term
  - The panel becomes unrepresentative
  - The panel estimators that we have seen may be inconsistent
    - called attrition bias
  - e.g. individuals with unusually low wages may be more likely to drop out of the sample
    - attrition bias if wage is the dependent variable
- Consistent estimation requires sample selection methods extended to panel data
Balancing
- Convert an unbalanced panel into a balanced panel
  - By including in the sample only those individuals with data available in all years
  - Or by rejecting some years for some individuals
  - Can greatly reduce efficiency
- One reason for removing observations is that only one variable is not observed
  - Nonresponse rate to income questions can be high
  - Data imputation methods
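The first balancing rule (keep only individuals observed in every period) can be sketched as follows. The long-format layout of (individual, period, value) tuples, the function name and the toy rows are hypothetical.

```python
from collections import defaultdict

def balance(rows):
    """Keep only individuals observed in every period present in the data.

    rows: list of (individual, period, value) tuples in long format.
    """
    periods = {t for _, t, _ in rows}
    seen = defaultdict(set)
    for i, t, _ in rows:
        seen[i].add(t)
    complete = {i for i, ts in seen.items() if ts == periods}
    return [r for r in rows if r[0] in complete]

# Toy unbalanced panel: individual "b" is missing in period 2
rows = [
    ("a", 1, 10.0), ("a", 2, 11.0), ("a", 3, 12.0),
    ("b", 1, 20.0), ("b", 3, 22.0),
    ("c", 1, 30.0), ("c", 2, 31.0), ("c", 3, 32.0),
]
balanced = balance(rows)
dropped = len(rows) - len(balanced)
print(len(balanced), dropped)
```

One missing period costs individual "b" both of its remaining observations, which is the efficiency loss mentioned above.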