Specification of Conditional Expectation Functions

Econometrics II
Douglas G. Steigerwald
UC Santa Barbara
Overview
Reference: B. Hansen Econometrics Chapter 2.9-2.17, 2.31-2.32
Why focus on E(y|x)?
- conditional mean is the "best predictor"
- yields marginal effects
  - are the measured marginal effects causal?
- specification of the conditional mean
  - exact for discrete covariates, as E(y|x) is linear in x by definition
- conditional variance is also informative
Why Focus on the CEF?
- let g(x) be an arbitrary predictor of y
- suppose our goal is to minimize E(y − g(x))²
- if Ey² < ∞, then
    E(y − g(x))² ≥ E(y − E(y|x))²
- conditional mean is the "best predictor"
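A minimal simulation sketch of this claim (assuming Python with numpy; the quadratic data-generating process is hypothetical): the CEF attains a smaller mean squared prediction error than an arbitrary competing predictor g(x).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-2, 2, n)
e = rng.normal(0, 1, n)
y = x**2 + e                      # here E(y|x) = x**2 by construction

cef = x**2                        # the conditional mean m(x)
g = 1.0 + 0.5 * x                 # an arbitrary competing predictor g(x)

mse_cef = np.mean((y - cef)**2)   # ≈ Var(e) = 1
mse_g = np.mean((y - g)**2)       # strictly larger
print(mse_cef, mse_g)
```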
Proof
- let m(x) = E(y|x)
- E(y − g(x))² = E(e + m(x) − g(x))²
- because E(e|x) = 0, e is uncorrelated with any function of x:
    E(e + m(x) − g(x))² = Ee² + E(m(x) − g(x))²
- g(x) = m(x) is the minimizing value
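A short numerical check of the key step (same hypothetical quadratic setup, numpy assumed): e is uncorrelated with m(x) − g(x), so the mean squared error splits into Ee² plus E(m(x) − g(x))².

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.uniform(-2, 2, n)
e = rng.normal(0, 1, n)
y = x**2 + e                      # E(y|x) = x**2, so e = y - m(x)

m = x**2
g = 1.0 + 0.5 * x

cross = np.mean(e * (m - g))                  # ≈ 0: e uncorrelated with functions of x
lhs = np.mean((y - g)**2)
rhs = np.mean(e**2) + np.mean((m - g)**2)     # decomposition from the proof
print(cross, lhs, rhs)
```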
Marginal Effects
- E(y|x) can be interpreted in terms of marginal effects
- effect of a change in x1 holding other covariates constant
  - cannot hold "all else" constant
- causal effect: the marginal effect of (continuous) x1 on y
    ∂y/∂x1 = ∂E(y|x)/∂x1 + ∂e/∂x1
- for marginal effects to be causal, we must establish (assume)
    ∂e/∂x1 = 0
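A hedged illustration of why ∂e/∂x1 = 0 matters (numpy assumed; the data-generating process, with a causal effect of 1.0 and an omitted covariate x2 correlated with x1, is hypothetical): when the error varies with x1, the CEF derivative is not the causal effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x2 = rng.normal(0, 1, n)
x1 = 0.8 * x2 + rng.normal(0, 1, n)             # x1 correlated with the omitted x2
y = 1.0 * x1 + 2.0 * x2 + rng.normal(0, 1, n)   # causal effect of x1 is 1.0

# slope of E(y|x1): it absorbs part of x2's effect, because
# e = 2*x2 + noise varies with x1 (∂e/∂x1 ≠ 0 in the slide's notation)
slope, intercept = np.polyfit(x1, y, 1)
print(slope)   # ≈ 1.0 + 2.0*0.8/(0.8**2 + 1) ≈ 1.98, not the causal 1.0
```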
CEF Derivative
- we measure the marginal effect on the CEF
- the formula for this derivative differs for continuous and discrete covariates
    ∇_1 m(x) = ∂E(y|x1, x2, ..., xK)/∂x1                            if x1 is continuous
    ∇_1 m(x) = E(y|1, x2, ..., xK) − E(y|0, x2, ..., xK)            if x1 is discrete
- effect of a change in x1 on E(y|x) holding other covariates constant
- potentially varies as we change the set of covariates
Effect of Covariates
- changing covariates changes E(y|x) and e
    y = E(y|x1) + e1
    y = E(y|x1, x2) + e2
  - E(y|x1, x2) reveals greater detail about the behavior of y
- e is the unexplained portion of y
- Adding Covariates Theorem: If Ey² < ∞, then
    Var(y) ≥ Var(e1) ≥ Var(e2)
- How restrictive is the finite moment assumption?  (see the Finite Moment Assumption slide)
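A minimal Monte Carlo sketch of the Adding Covariates Theorem (numpy assumed; the linear DGP is hypothetical, so the least-squares fits coincide with the CEFs): residual variance weakly falls as covariates are added.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 1.5 * x2 + rng.normal(size=n)   # CEF is linear here

def resid_var(y, X):
    """Residual variance after projecting y on an intercept and the columns in X."""
    X = np.column_stack([np.ones(len(y)), *X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

print(np.var(y))                  # Var(y)
print(resid_var(y, [x1]))         # Var(e1): condition on x1 only
print(resid_var(y, [x1, x2]))     # Var(e2): condition on x1 and x2 (smallest)
```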
CEF Specification: No Covariates
- m(x) = µ, so
    y = m(x) + e    becomes    y = µ + e
- intercept-only model
Binary Covariate
- binary covariate
    x1 = 1 if gender = man,  0 if gender = woman
  - more commonly termed an indicator variable
- conditional expectation: µ1 for men and µ0 for women
- the conditional expectation function is linear in x1:
    E(y|x1) = β1 x1 + β2
  - 2 covariates (including the intercept)
  - β2 = µ0 and β1 = µ1 − µ0
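A small sketch (numpy assumed; the wage numbers are hypothetical) showing that regressing y on the indicator and an intercept recovers the two group means: β2 ≈ µ0 and β1 ≈ µ1 − µ0.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
x1 = rng.integers(0, 2, n)                                 # 1 = man, 0 = woman
y = np.where(x1 == 1, 20.0, 17.0) + rng.normal(0, 3, n)    # hypothetical wages

X = np.column_stack([x1, np.ones(n)])                      # covariates (x1, intercept)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

mu1, mu0 = y[x1 == 1].mean(), y[x1 == 0].mean()
print(beta)                # [β1, β2] ≈ [µ1 − µ0, µ0]
print(mu1 - mu0, mu0)
```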
Multiple Indicator Variables
- notation: x2 = 1(union)
- there are 4 conditional means
    µ11 for union men and µ10 for nonunion men (and similarly for women)
- conditional expectation is linear in (x1, x2, x1 x2):
    E(y|x1, x2) = β1 x1 + β2 x2 + β3 x1 x2 + β4
  - β4: mean for nonunion women
  - β1: male wage premium for nonunion workers
  - β2: union wage premium for women
  - β3: difference in the union wage premium between men and women
  - x1 x2 is the interaction term (4 covariates)
- p indicator variables yield 2^p conditional means
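A sketch of the saturated model (numpy assumed; the four cell means are hypothetical): with the interaction included, the coefficients recover all four conditional means.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 80_000
x1 = rng.integers(0, 2, n)                        # 1 = man
x2 = rng.integers(0, 2, n)                        # 1 = union
cell_means = {(0, 0): 15.0, (1, 0): 18.0, (0, 1): 17.0, (1, 1): 21.0}  # hypothetical
mu = np.array([cell_means[(a, b)] for a, b in zip(x1, x2)])
y = mu + rng.normal(0, 2, n)

X = np.column_stack([x1, x2, x1 * x2, np.ones(n)])
b1, b2, b3, b4 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b4)            # ≈ 15   (nonunion women)
print(b1)            # ≈ 3    (male premium, nonunion: 18 − 15)
print(b2)            # ≈ 2    (union premium, women: 17 − 15)
print(b3)            # ≈ 1    (difference in union premium: (21 − 18) − (17 − 15))
```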
Categorical Covariates
- categorical covariate
    x3 = 1 if race = white,  2 if race = black,  3 if race = other
- the values have no meaning in terms of magnitude; they only indicate category
- E(y|x3) is not linear in x3
- represent x3 with two indicator variables: x4 = 1(black) and x5 = 1(other)
- conditional expectation is linear in (x4, x5):
    E(y|x4, x5) = β1 x4 + β2 x5 + β3
  - β3: mean for white workers
  - β1: black-white mean difference
  - β2: other-white mean difference
  - no individual is both black and other, so there is no interaction term
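A sketch for the categorical case (numpy assumed; group means hypothetical): coding the three categories as two indicators gives coefficients equal to mean differences from the omitted (white) category.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 60_000
x3 = rng.integers(1, 4, n)                 # 1 = white, 2 = black, 3 = other
means = {1: 19.0, 2: 16.5, 3: 17.5}        # hypothetical group means
y = np.array([means[k] for k in x3]) + rng.normal(0, 3, n)

x4 = (x3 == 2).astype(float)               # 1(black)
x5 = (x3 == 3).astype(float)               # 1(other)
X = np.column_stack([x4, x5, np.ones(n)])
b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b3)        # ≈ 19.0  (white mean)
print(b1)        # ≈ −2.5  (black − white)
print(b2)        # ≈ −1.5  (other − white)
```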
Continuous Covariates: Linear CEF
- E(y|x) is linear in x: E(y|x) = x^T β
    y = x^T β + e
  - x = (x1, ..., x_{k−1}, 1)^T,  β = (β1, ..., βk)^T
- derivative of E(y|x)
  - ∇_x E(y|x) = β
  - coefficients are marginal effects
  - marginal effect: impact of a change in one covariate holding all other covariates fixed
  - the marginal effect is a constant
- general existence of the CEF  (see the Existence of the Conditional Mean slide)
Measures of Conditional Distributions
- common measure of location: conditional mean
- common measure of dispersion: conditional variance
- conditional variance
    σ²(x) := Var(y|x) = E((y − E(y|x))² | x)
    σ²(x) = E(e² | x)
  - often report σ(x), which has the same unit of measure as y
- alternative representation
    y = E(y|x) + σ(x) ε,   with E(ε|x) = 0 and E(ε²|x) = 1
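A simulation sketch of the scaled-error representation (numpy assumed; m(x) and σ(x) are hypothetical choices): within a narrow bin of x, the sample mean and standard deviation of y approximate E(y|x) and σ(x).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
x = rng.uniform(1, 3, n)
m = 2.0 * x                       # hypothetical conditional mean E(y|x)
sigma = 0.5 * x                   # hypothetical conditional std dev σ(x)
eps = rng.normal(0, 1, n)         # E(ε|x) = 0, E(ε²|x) = 1
y = m + sigma * eps

bin_ = (x > 1.95) & (x < 2.05)    # look at observations with x near 2
print(y[bin_].mean())             # ≈ m(2) = 4
print(y[bin_].std())              # ≈ σ(2) = 1
```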
Conditional Standard Deviation
- σ_men = 3.05,  σ_women = 2.81
- men have higher average wage and more dispersion
Heteroskedasticity
- homoskedasticity:  E(e²|x) = σ²   (does not depend on x)
- heteroskedasticity:  E(e²|x) = σ²(x)   (does depend on x)
- unconditional variance E(E(e²|x))
  - constant by construction
- formally, conditional heteroskedasticity
- heteroskedasticity is the leading case for empirical analysis
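A rough diagnostic sketch (numpy assumed; the heteroskedastic DGP is hypothetical, and binning squared residuals is only an informal check, not a formal test): under heteroskedasticity the average squared residual varies with x.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200_000
x = rng.uniform(0, 4, n)
e = rng.normal(0, 1 + 0.5 * x)      # conditional std dev grows with x
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# crude check: average squared residual by bins of x
for lo, hi in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    b = (x >= lo) & (x < hi)
    print(lo, hi, (resid[b] ** 2).mean())   # increases with x under heteroskedasticity
```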
Proofs
1. Proof of Adding Covariates Theorem
Review
- Why focus on E(y|x)?
  - it is the "best" predictor
- Suppose E(y|x) = x^T β. How do you interpret β?
  - as ∇_x E(y|x), the vector of marginal effects
- What is required for causality?
  - ∇_x e = 0
- When is the specification of E(y|x) exact?
  - when x is discrete
- What is the leading empirical approach for dispersion?
  - (conditional) heteroskedasticity: E(e²|x) = σ²(x)
Finite Moment Assumption
- Consider the family of Pareto densities
    f(y) = a y^(−a−1),   y > 1
  - a indexes the decay rate of the tail
    - larger a implies the tail declines to zero more quickly
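A Monte Carlo sketch (numpy assumed; inverse-CDF sampling of the Pareto density above): empirical r-th moments settle near their theoretical values when r < a and are unstable when r ≥ a.

```python
import numpy as np

rng = np.random.default_rng(9)
a = 2.5                                    # tail index of the Pareto density
n = 1_000_000
u = rng.uniform(size=n)
y = u ** (-1.0 / a)                        # inverse-CDF draw: P(y > t) = t**(-a), t > 1

for r in (1, 2, 3):                        # moments finite iff r < a
    theory = a / (a - r) if r < a else "infinite"
    print(r, np.mean(y ** r), "theory:", theory)
```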
Tail Behavior and Finite Moments
- for the Pareto densities
    E|y|^r = ∫_1^∞ y^r · a y^(−a−1) dy = a/(a − r)  if r < a,   and = ∞  if r ≥ a
  - the r-th moment is finite iff r < a
- extend beyond the Pareto distribution using tail bounds
  - suppose f(y) ≤ A |y|^(−a−1) for some A < ∞ and a > 0
    - f(y) is bounded above by a rescaled Pareto density
    - for r < a,
        E|y|^r = ∫_{−∞}^∞ |y|^r f(y) dy ≤ ∫_{−1}^1 f(y) dy + 2A ∫_1^∞ y^(r−a−1) dy ≤ 1 + 2A/(a − r) < ∞
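A quick numerical verification of the Pareto moment formula (assuming scipy is available; a and r are arbitrary illustrative values with r < a).

```python
import numpy as np
from scipy.integrate import quad

a, r = 2.5, 1.5                              # tail index a and moment order r < a
value, err = quad(lambda y: (y ** r) * a * y ** (-a - 1), 1, np.inf)
print(value, a / (a - r))                    # both ≈ 2.5
```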
Tail Behavior
- if the tail of a density declines at rate |y|^(−a−1) or faster, then y has finite moments up to (but not including) a
- intuitively, the restriction that y has a finite r-th moment means the tail of the density declines to zero faster than y^(−r−1)
  - finite mean (but not variance): density declines faster than 1/y²
  - finite variance (but not third moment): density declines faster than 1/y³
  - finite fourth moment: density declines faster than 1/y⁵
- (Return to Covariates)
Conditional Mean Existence
- If E|y| < ∞ then there exists a function m(x) such that for all measurable sets X,
    E(1(x ∈ X) y) = E(1(x ∈ X) m(x))     (1)
  - from probability theory, e.g. Ash (1972) Theorem 6.3.3
  - m(x) is almost everywhere unique
    - if h(x) satisfies (1) then there is a set S such that P(S) = 1 and m(x) = h(x) for x ∈ S
    - m(x) is called the conditional mean and is written E(y|x)
    - (1) establishes E(y) = E(E(y|x))
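A Monte Carlo sketch of the defining property (1) (numpy assumed; m(x) is a hypothetical conditional mean and X a hypothetical interval): both sides of (1) agree in simulation.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 500_000
x = rng.uniform(-1, 1, n)
m = np.sin(np.pi * x)                      # hypothetical conditional mean m(x)
y = m + rng.normal(0, 1, n)

indicator = (x > 0.2) & (x < 0.7)          # a measurable set X = (0.2, 0.7)
print(np.mean(indicator * y))              # E[1(x ∈ X) y]
print(np.mean(indicator * m))              # E[1(x ∈ X) m(x)]  — approximately equal
```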
General Nature of Conditional Mean
- E(y|x) exists for all finite-mean distributions
  - y can be discrete or continuous
  - x can be scalar or vector valued
  - components of x can be discrete or continuous
- if (y, x) have a joint continuous distribution with density f(y, x), then
  - the conditional density f_{y|x}(y|x) is well defined
  - E(y|x) = ∫_R y f_{y|x}(y|x) dy
- (Return to Conditional Mean)
Proof of Adding Covariates Theorem
- Theorem: If Ey² < ∞, then
    Var(y) ≥ Var(y − E(y|x1)) ≥ Var(y − E(y|x1, x2))
- Ey² < ∞ implies existence of all conditional moments used in the proof
- First, establish Var(y) ≥ Var(y − E(y|x1))
  - let z1 = E(y|x1)
  - y − µ = (y − z1) + (z1 − µ)
    - note E((z1 − µ)(y − z1) | x1) = 0
  - E(y − µ)² = E(y − z1)² + E(z1 − µ)²
    - y and z1 both have mean µ
  - Var(y) = Var(y − z1) + Var(z1)
  - hence Var(y) ≥ Var(y − E(y|x1))
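A numerical check of the first step (numpy assumed; the conditional mean z1 = E(y|x1) = x1² is a hypothetical choice): the cross term vanishes and the variance decomposition holds.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 500_000
x1 = rng.normal(size=n)
z1 = x1 ** 2                                   # hypothetical E(y|x1)
y = z1 + rng.normal(size=n)
mu = y.mean()

print(np.mean((z1 - mu) * (y - z1)))           # ≈ 0 (orthogonality)
print(np.var(y))                               # Var(y)
print(np.var(y - z1) + np.var(z1))             # Var(y − z1) + Var(z1): the same
```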
Completion of Proof
- Second, establish Var(y − E(y|x1)) ≥ Var(y − E(y|x1, x2))
  - let z2 = E(y|x1, x2), also with mean µ
    - note E((z2 − µ)(y − z2) | x1, x2) = 0
  - Var(y) = Var(y − z2) + Var(z2)
  - need to show Var(z2) ≥ Var(z1)
    - E(z2|x1) = z1  (LIE)
    - (E(z2|x1))² ≤ E(z2² | x1)  (conditional Jensen's inequality)
    - E(E(z2|x1))² ≤ E(E(z2² | x1)) = E(z2²)  (unconditional expectations)
    - as z1 and z2 have the same mean µ, Var(z1) ≤ Var(z2)
  - combining the two decompositions of Var(y),
    Var(y − E(y|x1)) ≥ Var(y − E(y|x1, x2)).
- (Return to Proofs)
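A final numerical check (numpy assumed; the hypothetical design makes E(y|x1, x2) and E(y|x1) available in closed form): Var(z1) ≤ Var(z2) and the residual-variance ordering follows.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 500_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
z2 = x1 + 0.5 * x2 ** 2                        # hypothetical E(y|x1, x2)
z1 = x1 + 0.5                                  # E(z2|x1) = x1 + 0.5·E(x2²) = x1 + 0.5 (LIE)
y = z2 + rng.normal(size=n)

print(np.var(z1), "<=", np.var(z2))            # Var(z1) ≤ Var(z2) (Jensen)
print(np.var(y - z1), ">=", np.var(y - z2))    # Var(y − E(y|x1)) ≥ Var(y − E(y|x1, x2))
```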