A Unified Asymptotic Distribution Theory for Parametric and Nonparametric Least Squares
and
Robust Inference

by
Bruce E. Hansen
Department of Economics
University of Wisconsin

March 2015
A Unified Asymptotic Distribution Theory for Parametric and Nonparametric Least Squares

- Standard nonparametric sieve regression theory imposes stronger assumptions than in parametric settings:
  - Bounded regressors
  - Bounded conditional variances
- Consequently there is a disconnect between the parametric and nonparametric theory
- This paper presents a unified set of conditions for asymptotic normality
- Does not impose bounded regressors nor bounded conditional moments
- Shows there is a trade-off between the number of finite moments and the allowed regressor expansion rate.
Nonparametric Sieve Distribution Theory

- Asymptotic normality established by: Andrews (1991), Newey (1997), Chen and Shen (1998), Huang (2003), Chen, Liao, and Sun (2014), Chen and Liao (2014), Belloni, Chernozhukov, Chetverikov, and Kato (2012), Chen and Christensen (2014)
- All assume conditional variances bounded above zero and below infinity, and a bounded conditional $2+\varepsilon$ moment (or higher)
- Most assume bounded regressors.
  - Chen-Shen (1998) do not impose boundedness, but only explore $\sqrt{n}$-consistent functionals
  - Chen-Christensen (2014) do not impose boundedness, but only examine a trimmed LS estimator and do not explore the impact of trimming on bias.
- Chen-Shen (1998) and Chen-Christensen (2014) allow time series
- Belloni, Chernozhukov, Chetverikov, and Kato (2012) extend to uniform inference
Series Regression

- iid $(y_i, z_i)$, $i = 1, \ldots, n$
  - $y_i = g(z_i) + e_i$
  - $E(e_i \mid z_i) = 0$
  - $z_i$ may have unbounded support
- Linear parameter of interest $\theta = a(g)$
  - includes regression function, derivatives, and integrals
- Sieve Regression Approximation
  - Let $x_K(z)$ be a sequence of $K \times 1$ basis functions
  - Regressor $x_{Ki} = x_K(z_i)$
  - Projection coefficient $\beta_K = \left( E(x_{Ki} x_{Ki}') \right)^{-1} E(x_{Ki} y_i)$
  - Projection equation $y_i = x_{Ki}' \beta_K + e_{Ki}$
  - $K$th series approximation $g_K(z) = x_K(z)' \beta_K$
  - Approximation error $r_{Ki} = g(z_i) - x_{Ki}' \beta_K$
Least Squares Estimation

- $\hat{\beta}_K = \left( \sum_{i=1}^n x_{Ki} x_{Ki}' \right)^{-1} \sum_{i=1}^n x_{Ki} y_i$
- $\hat{g}_K(z) = x_K(z)' \hat{\beta}_K$
- $\hat{\theta}_K = a(\hat{g}_K) = a_K' \hat{\beta}_K$
- $Q_K = E(x_{Ki} x_{Ki}')$
- $S_K = E(x_{Ki} x_{Ki}' e_{Ki}^2)$
- $V_K = Q_K^{-1} S_K Q_K^{-1}$, the conventional asymptotic variance
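To make these objects concrete, here is a minimal numerical sketch of series least squares with the sandwich variance. It is not code from the paper: the power-series basis, the example data-generating process, and the names (power_basis, series_ls) are illustrative choices.

```python
import numpy as np

def power_basis(z, K):
    # K x 1 power-series basis x_K(z) = (1, z, z^2, ..., z^(K-1))'
    return np.vander(z, K, increasing=True)

def series_ls(y, z, K):
    """Series LS: beta_hat_K, fitted function g_hat_K, and the
    conventional sandwich variance V_hat_K = Q^-1 S Q^-1."""
    X = power_basis(z, K)                       # n x K regressor matrix
    n = len(y)
    Q = X.T @ X / n                             # estimate of Q_K
    beta = np.linalg.solve(X.T @ X, X.T @ y)    # beta_hat_K
    e = y - X @ beta                            # LS residuals
    S = (X * e[:, None] ** 2).T @ X / n         # estimate of S_K
    Qinv = np.linalg.inv(Q)
    V = Qinv @ S @ Qinv                         # sandwich variance V_hat_K
    ghat = lambda z0: power_basis(np.atleast_1d(z0), K) @ beta
    return beta, ghat, V

# Example: theta = g(z0), so a_K = x_K(z0) and s(theta_hat) = sqrt(a'Va/n)
rng = np.random.default_rng(0)
z = rng.uniform(0, 10, 100)
y = np.sin(z) / z ** (2 / 3) + rng.normal(size=100)
beta, ghat, V = series_ls(y, z, K=3)
aK = power_basis(np.array([5.0]), 3)[0]
se = np.sqrt(aK @ V @ aK / len(y))
print(ghat(5.0), se)
```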
Goal: Asymptotic Normality

In the parametric context (fixed $K$), asymptotic normality holds under the following conditions:

1. $(E\|x_{Ki}\|^4)^{1/4} \le C < \infty$
2. $(E|e_{Ki}|^4)^{1/4} \le C < \infty$
3. $Q_K > 0$
4. $S_K > 0$
Nonparametric Context

Assumptions: For some $q > 4$,

1. $(E\|x_{Ki}\|^q)^{1/q} = O(K^{\phi})$
2. $\lambda_{\min}(Q_K) \ge 1$
3. $(E|r_{Ki}|^q)^{1/q} = O(K^{-\gamma})$ for some $\gamma \ge 0$
4. $(E|e_i|^q)^{1/q} \le C$
5. $\lambda_{\min}(Q_K^{-1} S_K) \ge \lambda K^{-\eta}$
6. $n^{-1} K^{(2\phi+\eta)q/(q-4)} (\log K)^{(q-8)_+/(q-4)} = o(1)$
7. $n^{-1} K^{2\phi q/(q-2)+\eta+2(\phi-\gamma)} (\log K)^{(q-4)/(q-2)} = o(1)$
8. If $q < 8$, $n^{-1} K^{(2\phi q+8-q)/(q-4)} = o(1)$
Assumption 1: For some $q > 4$, $(E\|x_{Ki}\|^q)^{1/q} = O(K^{\phi})$

- In the nonparametric sieve literature, the standard assumption is boundedness, $\|x_{Ki}\| \le \zeta_K$
  - When $z_i$ has bounded support, $\zeta_K = K^{\phi}$, with $\phi$ depending on the sieve ($\phi = 1$ for power series and $\phi = 1/2$ for splines)
  - Assumption 1 relaxes this requirement and does not require bounded $z_i$
- Assumption 1 is implied with $\phi = 1/2$ if the regressors $x_{ji}$ satisfy $(E|x_{ji}|^q)^{1/q} \le C$
- $q > 4$ is a strengthening of the parametric case $q = 4$
Assumption 2: $\lambda_{\min}(Q_K) \ge 1$

- A uniform version of $Q_K > 0$

Assumption 3: $(E|r_{Ki}|^q)^{1/q} = O(K^{-\gamma})$ for some $\gamma \ge 0$

- The $O(K^{-\gamma})$ bound for the approximation error means the approximation error decreases as the regressor length $K$ increases
- Holds if $g(z_i) = \sum_{j=1}^{\infty} \beta_j x_{ji}$ with orthogonal $x_{ji}$, $(E|x_{ji}|^q)^{1/q} \le C$, and $\beta_j = O(j^{-\gamma-1})$
- We can allow $\gamma = 0$ (no decay) if $K$ does not grow too fast
Assumption 4: For some $q > 4$, $(E|e_i|^q)^{1/q} \le C$

- $q > 4$ is a strengthening of the parametric case $q = 4$
- Can be replaced with a bounded conditional moment $E(|e_i|^s \mid z_i) \le C$ for some $s > 2$
- The unconditional moment bound is more primitive

Assumption 5: $\lambda_{\min}(Q_K^{-1} S_K) \ge \lambda K^{-\eta}$

- Typically holds with $\eta = 0$; for example, if $\sigma_i^2 \ge \sigma^2 > 0$
- Allows $\sigma_i^2 = 0$ (no regression error, only approximation error).
Assumptions 6, 7 & 8: Restrictions on the expansion rate for $K$

- Fastest expansion rate when $q = \infty$ (bounded regressors and errors), $\eta = 0$ and $\gamma \ge \phi$. Then $n^{-1} K^{2\phi} \log K = o(1)$. This is the rate in Chen-Christensen and Belloni et al.
- Slower expansion for finite $q$. As $q \to 4$, $K$ is slowed to boundedness.
- $\eta > 0$ (near-singular variance) slows the rate for $K$
- $\gamma = 0$ (no approximation decay) slows the rate for $K$.
- Rate kink at $q = 8$, where the rate is $n^{-1} K^{4\phi+2\eta} = o(1)$. For $\eta = 0$ this is the same as Newey (1997)
- The benefit of higher moments is allowing a larger $K$
- The extreme case of bounded regressors (as is common in sieve theory) allows the fastest growth in $K$
- The assumptions allow $K$ to be bounded (as in the parametric case) or increasing with $n$ (the nonparametric case)
Restatement of Assumptions

Assumptions: For some $q > 4$,

1. $(E\|x_{Ki}\|^q)^{1/q} = O(K^{\phi})$
2. $\lambda_{\min}(Q_K) \ge 1$
3. $(E|r_{Ki}|^q)^{1/q} = O(K^{-\gamma})$ for some $\gamma \ge 0$
4. $(E|e_i|^q)^{1/q} \le C$
5. $\lambda_{\min}(Q_K^{-1} S_K) \ge \lambda K^{-\eta}$
6. $n^{-1} K^{(2\phi+\eta)q/(q-4)} (\log K)^{(q-8)_+/(q-4)} = o(1)$
7. $n^{-1} K^{2\phi q/(q-2)+\eta+2(\phi-\gamma)} (\log K)^{(q-4)/(q-2)} = o(1)$
8. If $q < 8$, $n^{-1} K^{(2\phi q+8-q)/(q-4)} = o(1)$
Theorem 1

Under Assumption 1,
$$\frac{\sqrt{n}\, \alpha_K' \left( \hat{\beta}_K - \beta_K \right)}{\left( \alpha_K' V_K \alpha_K \right)^{1/2}} \to_d N(0, 1)$$

- Asymptotic distribution for the linear projection coefficient.
- It extends conventional parametric theory
- The estimate is centered at the pseudo-true projection coefficient $\beta_K$, as in the parametric case
- $V_K = Q_K^{-1} E(x_{Ki} x_{Ki}' e_{Ki}^2) Q_K^{-1}$ is written as a function of the projection errors $e_{Ki}$, thereby including parametric and nonparametric models as special cases.
Theorem 2

Under Assumption 1,
$$\frac{\sqrt{n} \left( \hat{\theta}_K - \theta + a(r_K) \right)}{\left( \alpha_K' V_K \alpha_K \right)^{1/2}} \to_d N(0, 1)$$

- Asymptotic distribution for linear functionals
- Explicit bias term $a(r_K)$
  - Similar to the bias term in nonparametric kernel regression theory
- The bias term can be omitted under an undersmoothing assumption ($K \to \infty$ sufficiently fast), but this is an inferior approximation and does not allow a unification of the parametric and nonparametric cases.
Summary

- The paper introduced regularity conditions for OLS asymptotic normality
- The conditions unify the parametric (fixed $K$) and nonparametric (increasing $K$) cases
- Does not require bounded regressors nor bounded conditional variances
- The minimal number of moments is $q > 4$
- Larger $q$ allows faster growth in $K$
Robust Inference
by
Bruce E. Hansen
Department of Economics
University of Wisconsin
Motivation

- In econometrics, a common definition of "robust inference" under possible mis-specification is to obtain valid confidence intervals for the pseudo-true value of a parameter.
- But in many contexts we do not care about the pseudo-true value; we care about the true value.
- My goal is to propose valid confidence intervals for the true value of a parameter, allowing for misspecification.
- I focus on sieve estimation of regression functions
  - In sieve regression, mis-specification = finite-sample bias
  - I believe the idea might have broader applicability.
Series Regression

- iid $(y_i, z_i)$, $i = 1, \ldots, n$
  - $y_i = g(z_i) + e_i$
  - $E(e_i \mid z_i) = 0$
- Linear parameter of interest $\theta = a(g)$
  - includes regression function, derivatives, and integrals
- Sieve Regression Approximation
  - Let $x_K(z)$ be a sequence of $K \times 1$ basis functions
  - Regressor $x_{Ki} = x_K(z_i)$
  - Projection coefficient $\beta_K = \left( E(x_{Ki} x_{Ki}') \right)^{-1} E(x_{Ki} y_i)$
  - Projection equation $y_i = x_{Ki}' \beta_K + e_{Ki}$
  - $K$th series approximation $g_K(z) = x_K(z)' \beta_K$
  - Approximation error $r_{Ki} = g(z_i) - x_{Ki}' \beta_K$
Least Squares Estimation

- $\hat{\beta}_K = \left( \sum_{i=1}^n x_{Ki} x_{Ki}' \right)^{-1} \sum_{i=1}^n x_{Ki} y_i$
- $\hat{g}_K(z) = x_K(z)' \hat{\beta}_K$
- $\hat{\theta}_K = a(\hat{g}_K) = a_K' \hat{\beta}_K$
- $Q_K = E(x_{Ki} x_{Ki}')$
- $S_K = E(x_{Ki} x_{Ki}' e_{Ki}^2)$
- $V_K = Q_K^{-1} S_K Q_K^{-1}$, the conventional asymptotic variance
- $\hat{V}_K = \hat{Q}_K^{-1} \hat{S}_K \hat{Q}_K^{-1}$, the conventional variance estimate
Inference

- Standard error $s(\hat{\theta}_K) = \sqrt{a_K' \hat{V}_K a_K / n}$
- t-ratio: $T_n(\theta) = (\hat{\theta}_K - \theta) / s(\hat{\theta}_K)$
- Tests of $H_0 : \theta = \theta_0$ reject for $T_n(\theta_0) \ge q_\alpha$
- Confidence region:
$$C_0 = \left\{ \theta : T_n(\theta) \le q_\alpha \right\} = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha \right\}$$
where $q_\alpha$ is the $\alpha$ quantile of the $|N(0,1)|$ distribution
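A short sketch of these formulas (helper names are mine; the $\alpha$ quantile of $|N(0,1)|$ is $\Phi^{-1}((1+\alpha)/2)$):

```python
from scipy.stats import norm

def classical_ci(theta_hat, se, alpha=0.95):
    # q_alpha = alpha quantile of |N(0,1)|, i.e. Phi^{-1}((1 + alpha)/2)
    q = norm.ppf((1 + alpha) / 2)
    return theta_hat - q * se, theta_hat + q * se

def t_ratio(theta_hat, theta0, se):
    # T_n(theta_0) = (theta_hat_K - theta_0) / s(theta_hat_K)
    return (theta_hat - theta0) / se
```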
Asymptotic Distribution

Under mild regularity conditions,
$$\frac{\sqrt{n} \left( \hat{\theta}_K - \theta + a(r_K) \right)}{\left( a_K' V_K a_K \right)^{1/2}} \to_d N(0, 1)$$

- Asymptotic distribution for linear functionals
- The same holds if $\hat{V}_K$ replaces $V_K$
- Explicit bias term $a(r_K)$
  - Similar to the bias term in nonparametric kernel regression theory
- The bias term can be omitted under an undersmoothing assumption ($K \to \infty$ sufficiently fast), but this is an inferior approximation and does not allow a unification of the parametric and nonparametric cases.
Asymptotic Distribution Refined

Recall
$$\frac{\sqrt{n} \left( \hat{\theta}_K - \theta + a(r_K) \right)}{\left( \alpha_K' V_K \alpha_K \right)^{1/2}} \to_d N(0, 1)$$

To characterize this distribution more precisely we add

Assumption: For some constants $\phi > 0$, $\gamma > 0$ and $\tau_1 > 0$,

1. $\lim_{K \to \infty} K^{-\phi}\, a_K' V_K a_K = D > 0$
2. $\lim_{K \to \infty} K^{\gamma}\, a(r_K) = A \ne 0$
3. $\theta = \theta_0 + \delta n^{-\gamma/(\phi+2\gamma)}$
4. $K = \tau_1 n^{1/(\phi+2\gamma)}$ (the MSE-minimizing optimal rate)

- $\phi > 0$ implies the convergence rate will be slower than $\sqrt{n}$
- $\delta$ is a localizing parameter
Asymptotic Distribution of the t statistic

Theorem 1: Assumption 1 implies
$$T_n(\theta) \to_d \chi(\lambda) \equiv |Z_1 + \lambda|$$
and
$$T_n(\theta_0) \to_d \chi(\lambda + \delta) \equiv |Z_1 + \lambda + \delta|$$
where $Z_1 \sim N(0, 1)$, $\lambda = A \big/ \sqrt{D \tau_1^{\phi+2\gamma}}$ and $\delta = \delta \big/ \sqrt{D \tau_1^{\phi}}$

- The asymptotic distribution is called folded normal or non-central chi
- Unlike the parametric case, $\chi(\lambda)$ has a noncentral distribution due to the bias parameter $\lambda$
- The distribution $\chi(\lambda + \delta)$ depends on the localizing parameter $\delta$ as well as the bias parameter $\lambda$
- Recall $K = \tau_1 n^{1/(\phi+2\gamma)}$. Increasing $\tau_1$ (and hence $K$) decreases the asymptotic bias $\lambda$ as well as the localizing parameter $\delta$.
  - Thus undersmoothing (large $K$) decreases both bias and power.
Folded Normal Distribution

- $\chi(\lambda) \equiv |Z_1 + \lambda|$ where $Z_1 \sim N(0, 1)$
- Depends on $|\lambda|$
- Distribution function $F(x, \lambda) = \Phi(x - |\lambda|) - \Phi(-x - |\lambda|)$
- Density function $f(x, \lambda) = \phi(x - |\lambda|) + \phi(x + |\lambda|)$
- Quantile function $q_\eta(\lambda)$ solves $F(q_\eta(\lambda), \lambda) = \eta$
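These three functions translate directly into code. A sketch (function names are mine; the quantile has no closed form, so it is found by root-finding):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def F(x, lam):
    # Folded normal CDF: F(x, lam) = Phi(x - |lam|) - Phi(-x - |lam|)
    return norm.cdf(x - abs(lam)) - norm.cdf(-x - abs(lam))

def f(x, lam):
    # Folded normal density: f(x, lam) = phi(x - |lam|) + phi(x + |lam|)
    return norm.pdf(x - abs(lam)) + norm.pdf(x + abs(lam))

def quantile(eta, lam):
    # q_eta(lam) solves F(q, lam) = eta; F is increasing in x, so bracket and root-find
    upper = abs(lam) + norm.ppf(1 - (1 - eta) / 4) + 1.0
    return brentq(lambda x: F(x, lam) - eta, 0.0, upper)

print(quantile(0.95, 0.0))   # 1.96: with no bias, the usual two-sided normal value
print(quantile(0.95, 2.0))   # larger: the critical value grows with the bias |lam|
```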
Classical Confidence Interval

$$C_0 = \left\{ \theta : T_n(\theta) \le q_\alpha \right\} = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha \right\}$$

Corollary 1: $\Pr(\theta \in C_0) \to F(q_\alpha, \lambda) \le \alpha$, with strict inequality when $\lambda \ne 0$

- The classical confidence interval undercovers the parameter $\theta$
- The classical t test overrejects under $H_0$
- Correct coverage/rejection only if $\lambda = 0$
Coverage probability can be arbitrarily small
Classical Test Power

Corollary 2: $\Pr(T_n(\theta_0) \ge q_\alpha) \to 1 - F(q_\alpha, \lambda + \delta)$

- Asymptotic power of a nominal 5% test with $\phi = 1$, $\gamma = 2$, $D = 1$, $A = 1$, and $\tau_1 = (4A^2)^{1/3}$ (MSE optimal)
- Size distortion and power vary with $\tau_1$ (the number of regressors)
Size Corrected Power
If λ were known, use critical value qα (λ)
Undersmoothing (large K ) decreases power, though not uniformly
How Large is the Undercoverage?

Example: One sample of size $n = 100$ with $y_i = g(z_i) + e_i$, $z_i \sim U[0, 10]$, $e_i \sim N(0, 1)$ and unknown $g$
Suppose a researcher fits a quadratic regression, and reports point estimates and 95% classical confidence intervals
But the true regression function $g(z) = \sin(z)/z^{2/3}$ (the solid line) is not in the confidence intervals, as the latter are designed to cover the projection approximation.
Now suppose the researcher fits a quadratic spline with one knot. Similar problem.
Now suppose the researcher fits a quadratic spline with two knots. In this case, the confidence intervals contain the true value.
Simulation Experiment

Same model:

- $y_i = g(z_i) + e_i$
- $z_i \sim U[0, 10]$
- $e_i \sim N(0, 1)$
- $g(z) = \dfrac{\sin(z)}{z^{2/3}}$
- $n = 100$
- Evaluation by simulation with 100,000 replications
- Estimation by quadratic splines with $N$ equally spaced knots
  - $N = 2$ minimizes finite-sample IMSE
  - Fix $N = 2$
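A sketch of one replication of this design, assuming knots equally spaced on the interior of [0, 10] and an evaluation point z0 = 5 (both are my assumptions, not stated on the slides):

```python
import numpy as np
from scipy.stats import norm

def quad_spline_basis(z, knots):
    # Quadratic spline basis: 1, z, z^2, plus (z - k)_+^2 for each knot k
    cols = [np.ones_like(z), z, z**2] + [np.clip(z - k, 0, None)**2 for k in knots]
    return np.column_stack(cols)

def one_replication(rng, n=100, knots=(10/3, 20/3), z0=5.0, alpha=0.95):
    z = rng.uniform(0, 10, n)
    y = np.sin(z) / z ** (2 / 3) + rng.normal(size=n)
    X = quad_spline_basis(z, knots)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ beta
    Qinv = np.linalg.inv(X.T @ X / n)
    V = Qinv @ ((X * e[:, None] ** 2).T @ X / n) @ Qinv    # sandwich variance
    a = quad_spline_basis(np.array([z0]), knots)[0]
    theta_hat, se = a @ beta, np.sqrt(a @ V @ a / n)
    q = norm.ppf((1 + alpha) / 2)                          # classical critical value
    return abs(theta_hat - np.sin(z0) / z0 ** (2 / 3)) <= q * se

rng = np.random.default_rng(0)
coverage = np.mean([one_replication(rng) for _ in range(2000)])
print(coverage)   # typically below the nominal 0.95: the bias is ignored
```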
The Optimal Spline Estimator has Finite-Sample Bias
The Bias and Standard Deviation are of Similar Magnitude
Confidence Intervals Under-Cover
Nominal 95% intervals
Summary so far

- Sieve estimators have meaningful finite-sample bias
- Conventional asymptotic approximations ignore this bias by assuming it away
- In consequence, inferences are distorted
- Classical confidence intervals exhibit systematic undercoverage.
- An asymptotic theory which assumes non-trivial asymptotic bias provides a much improved approximation.
  - But the distribution depends on the unknown noncentrality parameter $\lambda$, so it cannot be directly used for inference.
Local Robustness

- Hansen and Sargent (2007)
- Suppose we believe $\lambda$ is non-zero but small
- Consider rules which are robust to small $\lambda$:
  - $|\lambda| \le c$ for some $c > 0$.
- In our context, use the critical value $q_\alpha(c)$
- Locally robust confidence interval $C_c = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha(c) \right\}$
- Locally robust coverage: $\inf_{\lambda \le c} \lim_{n \to \infty} \Pr(\theta \in C_c) = \alpha$
Locally robust, and locally conservative. But not globally robust.
Estimation of the Non-Centrality Parameter

- For some $L > K$ (perhaps $L = 2K$), consider the estimate $\hat{\theta}_L = a(\hat{g}_L)$
- For some $\varepsilon > 0$,
$$\hat{\lambda}_n = \frac{\sqrt{n} \left| \hat{\theta}_K - \hat{\theta}_L \right|}{\sqrt{a_K' \hat{V}_K a_K}}\, (1 + \varepsilon).$$
- $\varepsilon > 0$ is necessary to account for under-estimation of the bias
- Assumption 2: $L = \tau_2 n^{1/(\phi+2\gamma)}$ where $\tau_2 > \tau_1$
  - Same rate as $K$
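As a sketch, the estimate is one line of code (the value ε = 0.2 is an arbitrary placeholder; the slides only require ε > 0):

```python
import numpy as np

def lambda_hat(theta_K, theta_L, aVa_K, n, eps=0.2):
    # (1 + eps) * sqrt(n) * |theta_hat_K - theta_hat_L| / sqrt(a_K' V_hat_K a_K);
    # eps = 0.2 is an arbitrary placeholder, the slides only require eps > 0
    return (1 + eps) * np.sqrt(n) * abs(theta_K - theta_L) / np.sqrt(aVa_K)
```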
Asymptotic Distribution of the Estimated Non-Centrality

Theorem 2:
$$\begin{pmatrix} T_n(\theta_0) \\ \hat{\lambda}_n \end{pmatrix} \to_d \begin{pmatrix} \chi(\lambda + \delta) \\ \xi \end{pmatrix} = \begin{pmatrix} |Z_1 + \lambda + \delta| \\ v \left| Z_2 + \lambda B / v \right| \end{pmatrix}$$
where
$$\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} \sim N\left( 0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$$

- The asymptotic distribution of $\hat{\lambda}_n$ is also noncentral chi, with noncentrality depending on $\lambda$.
- But the latter distribution does not depend on the localizing parameter $\delta$
- To simplify, we assume $\rho = 0$, which occurs when the regressors $x_{Ki}$ and $x_{Li}$ are nested and the errors homoskedastic (Hausman, 1978)
- To bound distributions which come later, we also assume $B \ge 1$, which can be guaranteed if $\tau_2/\tau_1$ and $\varepsilon$ are sufficiently large
Plug-In Confidence Interval

- Given $\hat{\lambda}_n$ we can use $q_\alpha(\hat{\lambda}_n)$ as the critical value:
$$C_{\mathrm{Plug\text{-}In}} = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha(\hat{\lambda}_n) \right\}$$
- More generally, for any function $q(\lambda)$ we could use the critical value $q(\hat{\lambda}_n)$. This leads to the class of confidence intervals
$$C = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q(\hat{\lambda}_n) \right\}$$
- By independence of $\chi(\lambda)$ and $\xi$, and the assumption $B \ge 1$,
$$\Pr(\theta \in C) = \Pr\left( T_n(\theta) \le q(\hat{\lambda}_n) \right) \to \Pr\left( \chi(\lambda) \le q(\xi) \right) = E\, F(q(\xi), \lambda) = \frac{1}{v} \int_0^\infty F(q(\xi), \lambda)\, f(\xi/v, \lambda B/v)\, d\xi \ge \frac{1}{v} \int_0^\infty F(q(\xi), \lambda)\, f(\xi/v, \lambda/v)\, d\xi$$
- This depends on $v$ and $\lambda$, and can be evaluated by numerical integration.
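The limiting coverage can indeed be evaluated by numerical integration. A sketch (the name coverage_limit is mine; it reuses the folded normal F, f, and quantile helpers from the earlier snippet):

```python
import numpy as np
from scipy.integrate import quad

def coverage_limit(q_func, lam, v, B=1.0):
    # (1/v) * int_0^inf F(q(xi), lam) f(xi/v, lam*B/v) d(xi),
    # with F and f the folded normal CDF and density defined earlier
    integrand = lambda xi: F(q_func(xi), lam) * f(xi / v, lam * B / v)
    value, _ = quad(integrand, 0.0, np.inf)
    return value / v

# Example: the plug-in rule q(lam) = q_alpha(lam) at alpha = 0.95
plug_in = lambda xi: quantile(0.95, xi)
print(coverage_limit(plug_in, lam=1.0, v=1.0))  # the slides show this dips below 0.95 as lam grows
```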
Plug-In Coverage Probabilities Improve over Classical Interval
But still undercover for large λ
Conservative Confidence Intervals

For any critical value function, define the asymptotic coverage probability
$$P(\lambda) = \frac{1}{v} \int_0^\infty F(q(\xi), \lambda)\, f(\xi/v, \lambda/v)\, d\xi$$

Theorem 3: If $q(\lambda) - \lambda \to \kappa$ as $\lambda \to \infty$, then $\liminf_{n \to \infty} \Pr(\theta \in C) \ge P(\lambda)$, and $P(\lambda) \to \Phi\left( \kappa / \sqrt{1 + v^2} \right)$ as $\lambda \to \infty$.

Theorem 4: If $q'(\lambda) \ge 1$ for all $\lambda$, and $v \le 1$, then $P'(\lambda) \ge 0$.

- We can use these results to craft $C$ to have uniform (in $\lambda$) coverage
Condition 1: $q(\lambda)$ satisfies

1. $q(\lambda) - \lambda \to \kappa$ as $\lambda \to \infty$
2. $\kappa = \sqrt{1 + v^2}\, \Phi^{-1}(\alpha)$
3. $q'(\lambda) \ge 1$.

Corollary 3: If $q(\lambda)$ satisfies Condition 1, and $v \le 1$, then
$$\liminf_{n \to \infty} \Pr(\theta \in C) \ge \alpha$$

- Within the class of critical values $q(\lambda)$ satisfying Condition 1, the one with the smallest distortion is $q(\lambda) = \kappa + \lambda$
- "Linear Rule": $C_\kappa = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)(\kappa + \hat{\lambda}_n) \right\}$
- Example: If $v = 1$ and $\alpha = 0.95$ then the critical value is $q(\hat{\lambda}_n) = 2.33 + \hat{\lambda}_n$
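A one-line check of the Linear Rule critical value (for v = 1 and α = 0.95 it reproduces the 2.33 in the example):

```python
import numpy as np
from scipy.stats import norm

def linear_rule_q(lam_hat, v, alpha=0.95):
    # q(lam) = kappa + lam with kappa = sqrt(1 + v^2) * Phi^{-1}(alpha)
    kappa = np.sqrt(1 + v**2) * norm.ppf(alpha)
    return kappa + lam_hat

print(linear_rule_q(0.0, v=1.0))   # 2.326..., the 2.33 quoted for v = 1, alpha = 0.95
```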
Asymptotic Coverage of the Linear Critical Value
Coverage uniformly above 0.95
But very conservative for small v .
Finite Sample Simulation Experiment Revisited
Confidence Interval for the Non-Centrality Parameter

$$\hat{\lambda}_n \to_d v \left| Z_2 + \lambda B / v \right|$$

- Invert the folded normal distribution to obtain a confidence interval for $\lambda$ given $\hat{\lambda}_n$ and $\hat{v}_n$:
$$\hat{v}_n = (1 + \varepsilon) \sqrt{\frac{a_L' \hat{V}_L a_L}{a_K' \hat{V}_K a_K} - 1}$$
- Define $\psi_\tau(x)$ as the inverse of the CDF $F(x, \lambda)$ in $\lambda$:
  - The solution to $F(x, \psi_\tau(x)) = \tau$
  - For $F(x, 0) < \tau$, set $\psi_\tau(x) = 0$
- $\lambda_\tau = \hat{v}_n\, \psi_{1-\tau}(\hat{\lambda}_n / \hat{v}_n)$, $\Lambda = [0, \lambda_\tau]$

Theorem 5: $\liminf_{n \to \infty} \Pr(\lambda \in \Lambda) \ge \tau$

- $\Lambda$ is a valid $\tau$ asymptotic confidence interval for $\lambda$
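A sketch of ψτ and the resulting upper endpoint λτ (names mine), reusing the folded normal CDF F from the earlier snippet; since F is decreasing in λ, the root is unique and can be bracketed:

```python
from scipy.optimize import brentq
from scipy.stats import norm

def psi(tau, x):
    # Inverse of F(x, lam) in lam: solves F(x, psi) = tau,
    # with psi = 0 when F(x, 0) < tau
    if F(x, 0.0) < tau:
        return 0.0
    hi = x - norm.ppf(tau) + 1.0   # then F(x, hi) <= Phi(x - hi) < tau
    return brentq(lambda lam: F(x, lam) - tau, 0.0, hi)

def lambda_upper(lam_hat, v_hat, tau):
    # lambda_tau = v_hat * psi_{1-tau}(lambda_hat / v_hat)
    return v_hat * psi(1 - tau, lam_hat / v_hat)
```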
Two-Stage Robust Confidence Interval

- Given the interval $\Lambda = [0, \lambda_\tau]$, create a Bonferroni interval for $\theta$:
  - Take the upper endpoint $\lambda_\tau = \hat{v}_n\, \psi_{1-\tau}(\hat{\lambda}_n / \hat{v}_n)$
  - Critical value $q_\alpha(\lambda_\tau)$
  - Confidence interval $C_B = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha(\lambda_\tau) \right\}$
- Asymptotic coverage:
$$\Pr(\theta \in C_B) \to P(\lambda, \tau) \equiv \frac{1}{v} \int_0^\infty F\left( q_\alpha\left( v\, \psi_{1-\tau}(\xi/v) \right), \lambda \right) f(\xi/v, \lambda/v)\, d\xi$$
- How to select $\tau$? First idea:
  - We could set $\tau$ so that $\lim_{\lambda \to \infty} P(\lambda, \tau) = \alpha$
  - The solution is $\tau = 1 - \Phi\left( \left( 1 - \sqrt{1 + v^2} \right) \Phi^{-1}(\alpha) / v \right)$
Coverage is locally robust, but not globally.
Uniform Coverage

- Set $\tau$ so that the asymptotic coverage uniformly (in $\lambda$) exceeds $\alpha$:
$$\inf_\lambda P(\lambda, \tau) = \alpha$$
- No explicit solution, only numerical
- Since $P(\lambda, \tau)$ is increasing in $\tau$, computation is simple
- The solution depends on $v$
- Rather than report a table, I numerically calculated the optimal $\tau$ for each $v$ on a grid, and then fit a low-order model
$$\tau(v) = 0.948 + 0.026\, v^{-1} - 0.393\, v^{-2} - 0.247\, v^{-3} - 0.047\, v^{-4}$$
which fits extremely tightly
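As a code sketch (caution: the slide's formula lost its minus signs in extraction, so the sign pattern below is my reconstruction and should be checked against the paper):

```python
def tau_of_v(v):
    # Fitted first-stage level tau(v); the minus signs are reconstructed
    # from the garbled slide text, so verify against the paper before use
    return 0.948 + 0.026 / v - 0.393 / v**2 - 0.247 / v**3 - 0.047 / v**4
```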
Comparison of uniformly robust methods
Power

For any critical value function, the asymptotic local power against the alternative $\delta$ is
$$P(\lambda) = \frac{1}{v} \int_0^\infty F(q(\xi), \lambda + \delta)\, f(\xi/v, \lambda B/v)\, d\xi$$

- Depends only on $\delta$, $v$, $\lambda$, and $B$
- Numerical comparison:
  - As before, asymptotic power of a nominal 5% test with $\phi = 1$, $\gamma = 2$, $D = 1$, $A = 1$, and $\tau_1 = (4A^2)^{1/3}$ (MSE optimal)
  - Compare the power of the Linear Rule with $K_{opt}$ regressors, the Classical test with $K_{opt}$ regressors, and the Classical test with $2K_{opt}$ regressors (undersmoothing)
The Robust Interval has correct size and similar power to the undersmoothed method
The Robust Interval has nearly identical power to the size-corrected interval
Finite Sample Simulation Experiment Revisited
Summary of Method

- For a model with $K$ parameters, estimate:
  - $\hat{\theta}_K = a(\hat{g}_K)$
  - the variance $\hat{V}_K$ and standard error $s(\hat{\theta}_K) = \sqrt{a_K' \hat{V}_K a_K / n}$
- For a larger model with $L$ parameters, estimate:
  - $\hat{\theta}_L = a(\hat{g}_L)$ and the variance $\hat{V}_L$
- $\hat{\lambda}_n = (1 + \varepsilon) \sqrt{n} \left| \hat{\theta}_K - \hat{\theta}_L \right| \Big/ \sqrt{a_K' \hat{V}_K a_K}$
- $\hat{v}_n = (1 + \varepsilon) \sqrt{a_L' \hat{V}_L a_L \big/ a_K' \hat{V}_K a_K - 1}$
- $\tau = \tau(\hat{v}_n)$ (the first-stage confidence level for $\lambda$, selected so that the second-stage confidence level for $\theta$ is uniformly above $\alpha$)
- $\lambda_\tau = \hat{v}_n\, \psi_{1-\tau}(\hat{\lambda}_n / \hat{v}_n)$ (the upper end of the level-$\tau$ confidence set for $\lambda$)
- critical value $q_\alpha(\lambda_\tau)$ (folded normal critical value)
- Robust confidence interval $C_B = \left\{ \hat{\theta}_K \pm s(\hat{\theta}_K)\, q_\alpha(\lambda_\tau) \right\}$
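Putting the steps together, a sketch of the full procedure, reusing psi, quantile, and tau_of_v from the earlier snippets (the assembly and the choice eps = 0.2 are mine, not code from the paper):

```python
import numpy as np

def robust_ci(theta_K, theta_L, aVa_K, aVa_L, n, alpha=0.95, eps=0.2):
    # s(theta_hat_K) = sqrt(a_K' V_hat_K a_K / n)
    se = np.sqrt(aVa_K / n)
    # lambda_hat_n and v_hat_n (assumes nested models, so aVa_L >= aVa_K)
    lam = (1 + eps) * np.sqrt(n) * abs(theta_K - theta_L) / np.sqrt(aVa_K)
    v = (1 + eps) * np.sqrt(aVa_L / aVa_K - 1)
    tau = tau_of_v(v)                    # first-stage confidence level
    lam_tau = v * psi(1 - tau, lam / v)  # upper end of the interval for lambda
    q = quantile(alpha, lam_tau)         # folded normal critical value q_alpha(lambda_tau)
    return theta_K - se * q, theta_K + se * q
```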
Let’s return to the example we started with.
A single sample with 100 observations, estimated by a quadratic regression.
Comments

- Alternative critical value functions $q(\lambda)$ will be explored
- Rather than attempting to uniformly bound the coverage above $\alpha$, we can attempt to bound the coverage above $\alpha - \gamma$, where $\gamma$ is a permitted distortion level. This is the approach of Stock and Yogo (2005) for weak instrument confidence sets
- Can heteroskedasticity be allowed? Perhaps using a wild bootstrap.
- Current regularity conditions are unreasonably restrictive; can they be relaxed?
- Can the idea be extended beyond series regression to general econometric settings?