Econometrica Supplementary Material
SUPPLEMENT TO “SPARSE MODELS AND METHODS FOR
OPTIMAL INSTRUMENTS WITH AN APPLICATION
TO EMINENT DOMAIN”
(Econometrica, Vol. 80, No. 6, November 2012, 2369–2429)
BY A. BELLONI, D. CHEN, V. CHERNOZHUKOV, AND C. HANSEN
THIS SUPPLEMENT PROVIDES technical results and proofs.
S1. TOOLS
S1.1. Lyapunov CLT, Rosenthal Inequality, and Von Bahr–Esseen Inequality
LEMMA S1—Lyapunov CLT: Let {Xin i = 1 n} be independent
n 2zero2
, n = 1 2 Define sn2 = i=1 sin
. If,
mean random variables with variance sin
for some μ > 0, Lyapunov’s condition holds:
lim
n→∞
n
1 2+μ
=0
E |Xin |
2+μ
sn
i=1
then as n goes to infinity,
n
1
Xin →d N (0 1)
sn i=1
LEMMA S2 —Rosenthal Inequality: Let X1 Xn be independent zeromean random variables; then, for r ≥ 2,
r/2 n
n
n
r
E Xi ≤ C(r) max
E |Xi |r E Xi2
i=1
i=1
i=1
This is due to Rosenthal (1970).
COROLLARY S1: Let r ≥ 2, and consider the case of independent zero-mean
variables Xi with EEn (Xi2 ) = 1 and EEn (|Xi |r ) bounded by C. Then, for any
n → ∞,
⎛ n
⎞
Xi ⎜ ⎟
⎜ i=1 ⎟ 2C(r)C
−1/2 ⎟
⎜
> n n ⎟ ≤
Pr ⎜
→ 0
rn
⎝ n
⎠
© 2012 The Econometric Society
DOI: 10.3982/ECTA9626
2
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
To verify the corollary, we use Rosenthal’s inequality E(|
and the result follows by Markov inequality,
⎛ n
⎞
Xi ⎜ ⎟
⎜ i=1 ⎟ C(r)Cnr/2 C(r)C
⎜
P⎜
> c⎟
≤ r r/2 ⎟≤
c r nr
cn
⎝ n
⎠
n
i=1
Xi |r ) ≤ Cnr/2 ,
LEMMA S3—von Bahr–Essen Inequality: Let X1 Xn be independent
zero-mean random variables. Then, for 1 ≤ r ≤ 2,
n
n
r
E Xi ≤ 2 − n−1 ·
E |Xk |r i=1
k=1
This result is due to von Bahr and Esseen (1965).
COROLLARY S2: Let r ∈ [1 2], and consider the case of independent zeromean variables Xi with EEn (|Xi |r ) bounded by C. Then, for any n → ∞,
⎫
⎧ n
⎪
⎪
⎪
⎪
⎪
⎪
Xi ⎪
⎪
⎬ 2C
⎨
i=1
−(1−1/r)
≤
> n n
P
→ 0
⎪ rn
⎪ n
⎪
⎪
⎪
⎪
⎪
⎪
⎭
⎩
The corollary follow by Markov and von Bahr–Esseen’s inequalities,
n
⎞
⎛ n
n
r
Xi Xi E(|Xi |r )
⎟ CE ⎜ ⎟
⎜ i=1 Ē(|Xi |r )
i=1
i=1
⎟≤
>
c
≤
≤
C
P⎜
⎟
⎜ n
c r nr
c r nr
c r nr−1
⎠
⎝
S1.2. A Symmetrization-Based Probability Inequality
Next we proceed to use
to bound the empirical
symmetrization arguments
√
process. Let f Pn 2 = En [f (Zi )2 ], Gn (f ) = nEn [f (Zi ) − E[Zi ]], and, for
a random variable Z, let q(Z 1 − τ) denote its (1 − τ)-quantile. The proof
follows standard symmetrization arguments.
LEMMA S4—Maximal Inequality via Symmetrization: Let Z1 Zn be arbitrary independent stochastic processes and F a finite set of measurable functions. For any τ ∈ (0 1/2), and δ ∈ (0 1) with probability at least 1 − 4τ − 4δ, we
METHODS FOR OPTIMAL INSTRUMENTS
have
3
supGn (f ) ≤ 4 2 log 2|F |/δ q sup f Pn 2 1 − τ
f ∈F
f ∈F
∨ 2 sup q Gn (f ) 1/2 f ∈F
PROOF: Let
e1n =
2 log 2|F |/δ q max En f (Zi )2 1 − τ f ∈F
!
1
e2n = max q Gn f (Zi ) f ∈F
2
and the event E = {maxf ∈F En [f 2 (Zi )] ≤ q(maxf ∈F En [f 2 (Zi )] 1 − τ)},
which satisfies P(E ) ≥ 1 − τ. By the symmetrization Lemma 2.3.7 of van der
Vaart and Wellner (1996) (by definition of e2n , we have βn (x) ≥ 1/2 in
Lemma 2.3.7), we obtain
"
#
P maxGn f (Zi ) > 4e1n ∨ 2e2n
f ∈F
#
"
≤ 4P maxGn εi f (Zi ) > e1n
f ∈F
"
#
≤ 4P maxGn εi f (Zi ) > e1n |E + 4τ
f ∈F
where εi are independent Rademacher random variables, P(εi = 1) = P(εi =
−1) = 1/2.
Thus, a union bound yields
#
"
(S.1)
P maxGn f (Zi ) > 4e1n ∨ 2e2n
f ∈F
%
$ ≤ 4τ + 4|F | max P Gn εi f (Zi ) > e1n |E f ∈F
We then condition on the values of Z1 Zn and E , denoting the conditional
probability measure as Pε . Conditional on Z1 Zn , by the Hoeffding inequality, the symmetrized process Gn (εi f (Zi )) is sub-Gaussian for the L2 (Pn )
norm, namely, for f ∈ F , Pε {|Gn (εi f (Zi ))| > x} ≤ 2 exp(−x2 /{2En [f 2 (Zi )]})
Hence, under the event E , we can bound
$ %
Pε Gn εi f (Zi ) > e1n |Z1 Zn E ≤ 2 exp −e21n /[2En f 2 (Zi )
≤ 2 exp − log(2|F |/δ) 4
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
Taking the expectation over Z1 Zn does not affect the right hand side
bound. Plugging in this bound yields the result.
Q.E.D.
S2. PROOF OF THEOREM 6
To show part (a), note that, by a standard argument,
√
n(α̃ − α) = M −1 Gn [Ai i ] + oP (1)
From the proof of Theorem 4, we have that
√
n(&
α − αa ) = Q−1 Gn D(xi )
i + oP (1)
& for Σ can be demonstrated simiThe conclusion follows. The consistency of Σ
& and Q
& in the proof of Theorem 4.
larly to the proof of consistency of Ω
To show part (b), let α denote the true value as before, which by assumption
coincides with the estimand of the baseline IV estimator by the standard argument, α̃ − α = oP (1) The baseline IV estimator is consistent for this quantity.
Under the alternative hypothesis, the estimand of &
α is
−1 αa = Ē D(xi )di
Ē D(xi )yi
−1 = α + Ē D(xi )D(xi )
Ē D(xi )
i Under the alternative hypothesis, Ē[D(xi )
i ]2 is bounded away from zero
uniformly in n. Hence, since the eigenvalues of Q are bounded away from zero
uniformly in n, α − αa 2 is also bounded away from zero uniformly in n. Thus,
it remains to show that &
α is consistent for αa . We have that
& i )d −1 En D(x
& i )
i − Ē D(xi )D(xi )
−1 Ē D(xi )
i &
α − αa = En D(x
i
so that
(S.2)
' '' '
& i )d −1 − Ē D(xi )D(xi )
−1 ''En D(x
& i )
i '
&
α − αa 2 ≤ 'En D(x
i
2
' −1 '' '
& i )
i − Ē D(xi )
i '
+ 'Ē D(xi )D(xi ) ''En D(x
2
= oP (1)
& i )d ]−1 − Ē[D(xi )D(xi )
]−1 = oP (1) which is shown
provided that (i) En [D(x
i
in the proof of Theorem 4; (ii) Ē[D(xi )D(xi )]−1 = Q−1 is bounded from
above uniformly in n, which follows from the assumption on Q in Theorem 4;
and (iii)
' ' '
'
'En D(x
& i )
i − Ē D(xi )
i ' = oP (1) 'Ē D(xi )
i ' = O(1)
2
2
METHODS FOR OPTIMAL INSTRUMENTS
5
where Ē[D(xi )
i ]2 = O(1) is assumed. To show the last claim, note that
' '
'En D(x
& i )
i − Ē D(xi )
i '
2
' ' '
'
& i )
i − En D(xi )
i ' + 'En D(xi )
i − Ē D(xi )
i '
≤ 'En D(x
2
2
'
'
&l (xi )' i 2n + oP (1) = oP (1)
≤ ke max 'Dl (xi ) − D
2n
1≤l≤ke
since ke is fixed, En [D(xi )
i ] − Ē[D(xi )
i ]2 = oP (1) by von Bahr–Esseen
inequality (von Bahr and Esseen (1965)) and SM, i 2n = OP (1) follows
by the Markov inequality and assumptions on the moments of i , and
&l (xi )2n = oP (1) follows from Theorems 1 and 2. Q.E.D.
max1≤l≤ke Dl (xi ) − D
S3. PROOF OF THEOREM 7
We introduce additional superscripts a and b on all variables; and n gets
replaced by either na or nb . The proof is divided in steps. Assuming
(
' k
'
s log(p ∨ n)
k
& −D '
= oP (1) k = a b
max 'D
(S.3)
il
il 2nk P
1≤l≤ke
n
Step 1 establishes bounds for the intermediary estimates on each subsample.
Step 2 establishes the result for the final estimator &
αab and consistency of the
matrices estimates. Finally, Step 3 establishes (S.3).
Step 1. We have that
k k −1 √
k k
√
&d
&
nEnk D
nk (&
αk − α0 ) = Enk D
i i
i i
$ k k %−1 k k
&d
= Enk D
Gnk Di i + oP (1)
i i
$ %−1 = Enk Di D
i + oP (1)
Gnk Dki ki + oP (1) since
(S.4)
(S.5)
k k
& d = En Di D
+ oP (1)
Enk D
i i
i
k
k k
√
& = Gn Dk k + oP (1)
nEnk D
i i
i i
k
Indeed, (S.4) follows similarly to Step 2 in the proof of Theorems 4 and 5 and
condition (S.3). The relation (S.5) follows from E[
ki |xki ] = 0 for both k = a
and k = b, Chebyshev inequality, and
'√
k
'
& − Dk k '2 |xk i = 1 n kc
E ' nEnk D
i
i
i
2 i
'
k
'
& − Dk '2 →P 0
ke max ' D
il
il
2n
1≤l≤ke
k
where E[·|xki i = 1 n kc ] denotes the estimate computed conditional xki ,
i = 1 n and on the sample kc , where kc = {a b} \ k. The bound follows
6
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
from the fact that (a)
kc
&k − Dk = f xk β
&l − βk0lc − al xii D
il
il
i
1 ≤ i ≤ nk &kl c − βk0lc ) are independent of {
ki 1 ≤ i ≤ nk }, by
by Condition AS, where (β
the independence of the two subsamples k and kc , (b) {
ki xi 1 ≤ i ≤ nk } are
independent across i and independent from the sample kc , (c) {
ki 1 ≤ i ≤ nk }
have conditional mean equal to zero, conditional on xki i = 1 n, and have
conditional variance bounded from above, uniformly in n, conditional on xki ,
&k − Dk 2n →P 0.
i = 1 n, by Condition SM, and that (d) max1≤l≤ke D
il
il
k
Using the same arguments as in Step 1 in the proof of Theorems 4 and 5,
√
−1
nk (&
αk − α0 ) = Enk Dki Dki
Gnk Dki ki + oP (1) = OP (1)
Step 2. Now, putting together terms, we get
a a
b b −1
√
&D
&D
& + (nb /n)En D
&
n(&
αab − α0 ) = (na /n)Ena D
i
i
i
i
b
a a √
&
&D
× (na /n)Ena D
n(&
αa − α0 )
i
i
b b √
&
&D
n(&
αb − α0 )
+ (nb /n)Enb D
i
i
−1
= (na /n)Ena Dai Dai + (nb /n)Enb Dbi Dbi √
× (na /n)Ena Dai Dai n(&
αa − α0 )
b b √
+ (nb /n)Enb Di Di
n(&
αb − α0 ) + oP (1)
$ %−1 $ √
× (1/ 2) × Gna Dai ai
= En Di D
i
√
%
+ (1/ 2)Gnb Dbi bi + oP (1)
$ %−1
= En Di D
i
× Gn [Di i ] + oP (1)
where we are also using the fact that
k k
& − En Dk Dk = oP (1)
&D
Enk D
i
i
i
i
k
which is shown similarly to the proofs given in Theorem 4 for showing that
En [Di D
i ] − En [Di D
i ] = oP (1) The conclusion of the theorem follows by an
application of Lyapunov CLT, similarly to the proof of Theorem 4 in the main
text.
Step 3. In this step, we establish (S.3). For every observation i and l =
1 ke , by Condition AS, we have
(S.6)
Dil = fi
βl0 + al (xi )
βl0 0 ≤ s
'
'
max 'al (xi )'2n ≤ cs P s/n
1≤l≤ke
METHODS FOR OPTIMAL INSTRUMENTS
7
Under our conditions, the sparsity bound for Lasso by Lemma 9 implies that,
&kl − βl0 , k = a b, and l = 1 ke ,
for all δkl = β
' k'
'δ ' P s
l 0
Therefore, by Condition SE, we have, for M = Enk [fik fik ], k = a b, that with
probability going to 1, for n large enough,
' ' ' ' 0 < κ
≤ φmin 'δkl '0 (M) ≤ φmax 'δkl '0 (M) ≤ κ
< ∞
where κ
and κ
are some constants that do not depend on n. Thus,
'
'
' k
' k
k
k &kc '
'
'D − D
&k '
(S.7)
il
il 2nk = fi βl0 + al xi − fi βl
2nk
' k
' k '
c '
k
& ' + 'al x '
= 'f βl0 − β
i
l
i
2nk
2nk
' c kc
'
&l − βl0 '
≤ κ
/κ
'fik β
+ cs n/nk 2n c
k
where the last inequality holds with probability going to 1 by Condition SE
imposed on matrices M = Enk [fik fik ], k = a b, and by
' k '
'
'
'al x '
'al (xi )' ≤ n/nk cs ≤
n/n
k
i
2n
2n
k
√
√
Then, in view of (S.6), n/nk = 2 + o(1), and condition s log p = o(n), the
result (S.3) holds by (S.7) combined with Theorem 1 for Lasso and Theorem 2
for post-Lasso, which imply
(
' k
k
'
s log(p ∨ n)
&l − βl0 '
= oP (1) k = a b
max 'fi β
P
2nk
1≤l≤ke
n
Q.E.D.
S4. PROOF OF LEMMA 3
Part 1. The first condition in Condition RF(iv) is assumed, and we omit the
proof for the third condition since it is analogous to the proof for the second
condition.
Note that max1≤j≤p En [fij4 vil4 ] ≤ (En [vil8 ])1/2 max1≤j≤p (En [fij8 ])1/2 P 1, since
max1≤j≤p En [fij8 ] P 1 by assumption and max1≤l≤ke En [vil8 ] P 1 by the
bounded ke , Markov inequality, and the assumption that Ē[vil8 ] are uniformly
bounded in n and l.
Thus, applying Lemma S4,
(
(
2 2
2 2 log(pk
)
log p
e
P
→P 0
max En fij vil − Ē fij vil P
1≤l≤ke 1≤j≤p
n
n
8
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
Part 2. To show (a), we note that, by simple union bounds and tail properties
of Gaussian variable, we have that maxij fij2 P log(p ∨ n), so we need log(p ∨
n) s log(p∨n)
→ 0.
n
Therefore, max1≤j≤p En [fij4 vil4 ] ≤ En [vil4 ] max1≤i≤n1≤j≤p fij4 P log2 (p ∨ n). Thus,
applying Lemma S4,
(
2 2
2 2 log(pke )
max En fij vil − Ē fij vil P log(p ∨ n)
1≤l≤ke 1≤j≤p
n
)
log3 (p ∨ n)
P
→P 0
n
since log p = o(n1/3 ). The remaining moment conditions of Condition RF(ii)
follow immediately from the definition of the conditionally bounded moments
since, for any m > 0, Ē[|fij |m ] is bounded, uniformly in 1 ≤ j ≤ p, uniformly
in n, for the i.i.d. Gaussian regressors of Lemma 1 of Belloni, Chen, Chernozhukov, and Hansen (2012). The proof of (b) for arbitrary bounded i.i.d.
regressors of Lemma 2 of Belloni et al. (2012) is similar.
Q.E.D.
S5. PROOF OF LEMMA 4
The first two conditions of Condition SM(iii) follow from the assumed rate
s2 log2 (p ∨ n) = o(n) since we have q
= 4. To show part (a), we note that, by
simple union bounds and tail properties of Gaussian variable, we have that
→ 0. Applymax1≤i≤n1≤j≤p fij2 P log(p ∨ n), so we need log(p ∨ n) s log(p∨n)
n
ing union bound, Gaussian concentration inequalities (Ledoux and Talagrand
(1991)), and that log2 p = o(n), we have max1≤j≤p En [fij4 ] P 1. Thus Condition SM(iii)(c) holds by maxj En [fij2 2i ] ≤ (En [
4i ])1/2 max1≤j≤p (En [fij4 ])1/2 P 1.
Part (b) follows because regressors are bounded and the moment assumption
on .
Q.E.D.
S6. PROOF OF LEMMA 11
Step 1. Here we consider the initial option, in which &
γjl2 = En [fij2 (dil − En dil )2 ].
Let us define d̃il = dil − Ē[dil ], γ̃jl2 = En [fij2 d̃il2 ], and γjl2 = Ē[fij2 d̃il2 ] We want to
show that
2
γjl − γ̃jl2 →P 0
Δ1 = max &
1≤l≤ke 1≤j≤p
Δ2 =
max
1≤l≤ke 1≤j≤p
2
γ̃ − γ 2 →P 0
jl
jl
γjl2 − γjl2 | →P 0, and then, since γjl2 ’s are
which would imply that max1≤j≤p1≤l≤ke |&
uniformly bounded from above by Condition RF and bounded below by γjl02 =
METHODS FOR OPTIMAL INSTRUMENTS
9
Ē[fij2 vil2 ], which are bounded away from zero. The asymptotic validity of the
initial option then follows.
We have that Δ2 →P 0 by Condition RF, and, since En [d̃il ] = En [dil ] − Ē[dil ],
we have
$
%
Δ1 = max En f 2 (d̃il − En d̃il )2 − d̃ 2 1≤l≤ke 1≤j≤p
≤
max
1≤l≤ke 1≤j≤p
+
max
ij
il
2En fij2 d̃il En [d̃il ]
1≤l≤ke 1≤j≤p
2
En f (En d̃il )2 →P 0
ij
Indeed, we have for the first term that
max
1≤l≤ke 1≤j≤p
2 √
En f d̃il |En [d̃il ] ≤ max |fij | En fij2 d̃ 2 OP (1/ n) →P 0
il
ij
i≤nj≤p
√ √
where we first used Holder inequality and max1≤l≤ke |En [d̃il ]| P ke / n by the
Chebyshev inequality and by Ē[d̃il2 ] being uniformly bounded by Condition RF;
then we invoked Condition RF to claim convergence to zero. Likewise, by Condition RF,
max En fij2 (En d̃il )2 ≤ max fij2 OP (1/n) →P 0
1≤l≤ke 1≤j≤p
1≤j≤p
Step 2. Here we consider the refined option, in which &
γjl2 = En [fij2&
vil2 ] The
&il , can be based on any estimator that obeys
residual here, &
vil = dil − D
(
&il − Dil 2n P s log(p ∨ n) (S.8)
max D
1≤l≤ke
n
Such estimators include the Lasso and Post-Lasso estimators based on the initial option. Below we establish that the penalty levels, based on the refined option using any estimator obeying (S.8), are asymptotically valid. Thus by Theorems 1 and 2, the Lasso and Post-Lasso estimators based on the refined option
also obey (S.8). This establishes that we can iterate on the refined option a
bounded number of times, without affecting the validity of the approach.
Recall that &
γjl02 = En [fij2 vil2 ] and define γjl02 := Ē[fij2 vil2 ], which is bounded away
from zero and from above by assumption. Hence, by Condition RF, it suffices to
γjl2 − γjl02 | →P 0, which implies the loadings are asympshow that max1≤j≤p1≤l≤ke |&
totic valid with u = 1. This, in turn, follows from
2
Δ1 = max &
γjl − &
γjl02 →P 0
1≤l≤ke 1≤j≤p
Δ2 =
max
1≤l≤ke 1≤j≤p
02
2
&
γjl − γjl02 →P 0
10
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
which we establish below.
Now note that we have proven Δ2 →P 0 in Step 3 of the proof of Theorem 1.
As for Δ1 , we note that
&il − Dil ) Δ1 ≤ 2 max En f 2 vjl (D
1≤l≤ke 1≤j≤p
+
max
1≤l≤ke 1≤j≤p
ij
&il − Dil )2 En fij2 (D
The first term is bounded by Holder and Lyapunov inequalities,
1/2
&il − Dil 2n
max D
max |fij | En fij2 vil2
i≤nj≤p
l≤ke
(
2 2 1/2 s log(p ∨ n)
→P 0
P max |fij | En fij vil
i≤nj≤p
n
where the conclusion is by Condition RF. The second term is of stochastic order
s log(p ∨ n)
max fij2 →P 0
i≤nj≤p
n
which converges to zero by Condition RF.
Q.E.D.
S7. ADDITIONAL SIMULATION RESULTS
In this section, we present simulation results to complement the results given
in the paper. The simulations use the same model as the simulations in the
paper:
yi = βxi + ei xi = zi
Π + vi σe2
(ei vi ) ∼ N 0
σev
σev
σv2
!!
i.i.d.
where β = 1 is the parameter of interest, and zi = (zi1 zi2 zi100 )
∼
2
] = σz2 and Corr(zih zij ) = 05|j−h| . In
N(0 ΣZ ) is a 100 × 1 vector with E[zih
2
2
all simulations, we set σe = 2 and σz = 03.
For the other parameters, we consider various settings. We provide results
for sample sizes, n, of 100, 250, and 500; and we consider three different values
for Corr(e v): 0, 0.3, and 0.6. We also consider four values of σv2 that are chosen to benchmark four different strengths of instruments. The four values of
Σ Π
Z
σv2 are found as σv2 = nΠ
for F ∗ : 2.5, 10, 40, and 160. We use two different
F ∗ Π
Π
designs for the first-stage coefficients, Π. The first sets the first S elements of
METHODS FOR OPTIMAL INSTRUMENTS
11
Π equal to 1 and the remaining elements equal to zero. We refer to this design
as the “cut-off” design. The second model sets the coefficient on zih = 07h−1
for h = 1 100. We refer to this design as the “exponential” design. In the
cut-off case, we consider S of 5, 25, 50, and 100 to cover different degrees of
sparsity.
For each setting of the simulation parameter values, we report results from
seven different estimation procedures. A simple possibility when presented
with many instrumental variables is to just estimate the model using 2SLS and
all of the available instruments. It is well known that this will result in poor
finite-sample properties unless there are many more observations than instruments; see, for example, Bekker (1994). The limited information maximum
likelihood estimator (LIML) and its modification by Fuller (1977) (FULL)1
are both robust to many instruments as long as the presence of many instruments is accounted for when constructing standard errors for the estimators;
see Bekker (1994) and Hansen, Hausman, and Newey (2008), for example. We
report results for these estimators in rows labeled 2SLS(100), LIML(100), and
FULL(100), respectively.2 For LASSO, we consider variable selection based on
two different sets of instruments. In the first scenario, we use LASSO to select
among the base 100 instruments and report results for the IV estimator based
on the LASSO (LASSO) and Post-LASSO (Post-LASSO) forecasts. In the
second, we use LASSO to select among 120 instruments formed by augmenting the base 100 instruments by the first 20 principal components constructed
from the sampled instruments in each replication. We then report results for
the IV estimator based on the LASSO (LASSO-F) and Post-LASSO (PostLASSO-F) forecasts. In all cases, we use the refined data-dependent penalty
loadings given in the paper.3 For each estimator, we report root-truncatedmean-squared-error (RMSE),4 median bias (Med. Bias), median absolute deviation (MAD), and rejection frequencies for 5% level tests (rp(0.05)).5 For
computing rejection frequencies, we estimate conventional 2SLS standard errors for 2SLS(100), LASSO, and Post-LASSO, and the many-instrument robust standard errors of Hansen, Hausman, and Newey (2008) for LIML(100)
and FULL(100).
1
Fuller (1977) required a user-specified parameter. We set this parameter equal to 1, which
produces a higher-order unbiased estimator.
2
With n = 100, we randomly select 99 instruments for use in FULL(100) and LIML(100).
3
Specifically, we start by finding a value of λ for which LASSO selected only one instrument.
We use this instrument to construct an initial set of residuals for use in defining the refined penalty
loadings. Using these loadings, we run another LASSO step and select a new set of instruments.
We then compute another set of residuals and use these residuals to recompute the loadings.
We then run a final LASSO step. We report the results for the IV estimator of β based on the
instruments chosen in this final LASSO step.
4
We truncate the squared error at 1e12.
5
In cases where LASSO selects no instruments, Med. Bias, and MAD use only the replications
where LASSO selects a non-empty set of instruments, and we set the confidence interval equal
to (−∞ ∞) and thus fail to reject.
12
TABLE S.I
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
499
499
238
238
0.037
6.836
2.753
0.059
0.056
0.145
0.136
0000
−0051
−0051
0059
0056
−0004
0003
0.025
0.345
0.345
0.059
0.056
0.084
0.081
0.056
0.024
0.024
0.000
0.000
0.022
0.018
79
79
9
9
0.065
6.638
3.657
0.137
0.132
0.152
0.142
−0003
−0039
−0039
−0008
−0008
−0015
−0010
0.043
0.479
0.479
0.087
0.082
0.094
0.080
0.058
0.038
0.038
0.034
0.030
0.050
0.048
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
231
231
F ∗ = 25
0108
8442
2792
*
*
0151
0144
0.103
0.135
0.135
*
*
0.085
0.081
0.103
0.378
0.377
*
*
0.105
0.101
0.852
0.121
0.121
0.000
0.000
0.058
0.060
77
77
13
13
F ∗ = 10
0176
228717
3889
0136
0133
0160
0150
0.168
0.113
0.113
0.047
0.042
0.069
0.068
0.168
0.527
0.526
0.081
0.085
0.097
0.094
0.758
0.122
0.122
0.056
0.056
0.078
0.092
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
261
261
0205
32289
1973
*
*
0184
0177
0.202
0.181
0.181
*
*
0.154
0.146
0.202
0.318
0.316
*
*
0.155
0.147
1.000
0.230
0.230
0.000
0.000
0.174
0.188
93
93
12
12
0335
14189
4552
0142
0140
0212
0198
0.335
0.298
0.298
0.051
0.060
0.136
0.129
0.335
0.588
0.588
0.094
0.094
0.152
0.144
0.998
0.210
0.210
0.104
0.114
0.242
0.244
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. EXPONENTIAL DESIGN. N = 100a
TABLE S.I—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0217
21404
6567
0135
0135
0134
0132
0193
0070
0070
0016
0025
0029
0038
0.193
0.718
0.718
0.091
0.088
0.093
0.090
0.530
0.082
0.082
0.042
0.066
0.056
0.056
0
0
0
0
F ∗ = 160
0178
0144
45371 −0016
17107 −0016
0129
0009
0128
0019
0128
0018
0126
0026
0.147
0.516
0.516
0.087
0.086
0.090
0.086
0.206
0.044
0.044
0.048
0.046
0.044
0.044
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0.391
6.796
5.444
0.137
0.136
0.148
0.147
0.381
0.158
0.158
0.042
0.052
0.073
0.082
0.381
0.607
0.607
0.094
0.094
0.100
0.101
0.988
0.140
0.140
0.088
0.096
0.114
0.136
0
0
0
0
0.307
8.024
7.637
0.136
0.136
0.139
0.139
0.285
0.012
0.012
0.023
0.031
0.038
0.047
0.285
0.408
0.408
0.087
0.089
0.089
0.093
0.710
0.116
0.116
0.086
0.090
0.094
0.100
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0096
19277
6273
0129
0126
0127
0122
−0004
−0037
−0037
−0001
−0002
0001
0000
0128
0003
79113
0014
10092
0014
0139
0001
0139 −0001
0138
0000
0137 −0008
0.068
0.599
0.598
0.086
0.084
0.087
0.081
0.086
0.540
0.540
0.093
0.094
0.092
0.092
0.044
0.022
0.020
0.042
0.038
0.050
0.042
0.064
0.038
0.038
0.054
0.058
0.036
0.054
a Results are based on 500 simulation replications and 100 instruments. The first-stage coefficients were set equal to 07j−1 for j = 1 100 in this design. Corr(e v) is the
correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100) are respectively
the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain
testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100
instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments
formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute
deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med.
Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
13
14
TABLE S.II
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
409
409
0023
28982
0310
*
*
0070
0068
0001
0009
0009
*
*
0015
0007
0.016
0.158
0.151
*
*
0.044
0.039
0.054
0.032
0.030
0.000
0.000
0.002
0.002
88
88
33
33
0042
0850
0338
0086
0083
0096
0091
0000
−0014
−0014
−0001
−0002
−0005
−0004
0.027
0.137
0.134
0.060
0.058
0.063
0.057
0.064
0.036
0.036
0.036
0.034
0.038
0.040
Corr(e v) = 06
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
400
400
F ∗ = 25
0068
5126
0272
*
*
0083
0082
0.064
0.034
0.034
*
*
0.041
0.045
0.064
0.152
0.144
*
*
0.051
0.057
0.858
0.050
0.048
0.000
0.000
0.012
0.012
90
90
34
34
F ∗ = 10
0112
0759
0255
0079
0077
0096
0092
0.104
0.020
0.021
0.026
0.024
0.042
0.043
0.104
0.126
0.123
0.054
0.054
0.060
0.055
0.774
0.034
0.034
0.040
0.036
0.072
0.058
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
399
399
0131
2201
0266
*
*
0115
0113
0.129
0.072
0.076
*
*
0.106
0.104
0.129
0.145
0.134
*
*
0.106
0.104
1.000
0.130
0.130
0.000
0.000
0.088
0.094
78
78
29
29
0211
1111
0271
0092
0092
0122
0115
0.208
0.015
0.019
0.046
0.052
0.092
0.086
0.208
0.119
0.114
0.067
0.069
0.096
0.090
1.000
0.076
0.076
0.128
0.144
0.252
0.282
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. EXPONENTIAL DESIGN. N = 250a
TABLE S.II—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
0
0
0
0
F = 40
0130
0115
0126 −0008
0125 −0007
0085
0007
0084
0012
0084
0015
0081
0021
0.115
0.082
0.081
0.056
0.054
0.054
0.053
0.476
0.046
0.044
0.056
0.060
0.058
0.064
0
0
0
0
F ∗ = 160
0113
0090
0090
0081
0080
0081
0081
0.092
0.062
0.061
0.056
0.053
0.051
0.051
0.220
0.044
0.044
0.050
0.052
0.050
0.056
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0249
0244
0115 −0003
0114 −0001
0086
0023
0085
0031
0091
0042
0092
0052
0.244
0.072
0.072
0.057
0.061
0.064
0.065
0.994
0.044
0.044
0.084
0.098
0.114
0.134
0
0
0
0
0187
0092
0091
0084
0085
0086
0087
0.171
0.058
0.057
0.057
0.058
0.058
0.060
0.680
0.060
0.060
0.068
0.076
0.074
0.084
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0.066
0.138
0.137
0.091
0.089
0.088
0.086
0.077
0.096
0.095
0.083
0.082
0.082
0.081
0001
0003
0003
0000
−0002
0001
−0001
0000
0000
0000
0005
0006
0002
0004
0.045
0.084
0.084
0.062
0.061
0.060
0.059
0.052
0.064
0.064
0.056
0.054
0.055
0.054
0.072
0.066
0.066
0.076
0.080
0.062
0.068
0.062
0.060
0.060
0.072
0.070
0.064
0.062
0089
0006
0006
0003
0004
0009
0013
0171
0002
0003
0014
0018
0024
0030
a Results are based on 500 simulation replications and 100 instruments. The first-stage coefficients were set equal to 07j−1 for j = 1 100 in this design. Corr(e v) is the
correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100)are respectively
the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain
testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100
instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments
formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute
deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med.
Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
15
16
TABLE S.III
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
461
461
0.016
1.085
0.188
*
*
0.046
0.045
0002
0020
0019
*
*
0001
0003
0.010
0.116
0.103
*
*
0.026
0.030
0.050
0.034
0.030
0.000
0.000
0.000
0.000
119
119
46
46
0.028
0.681
0.191
0.061
0.059
0.063
0.059
−0002
−0012
−0011
−0005
−0005
−0001
−0004
0.019
0.085
0.083
0.038
0.037
0.039
0.037
0.052
0.026
0.020
0.036
0.030
0.024
0.026
Corr(e v) = 06
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
463
463
F ∗ = 25
0.047
1.899
0.165
*
*
0.054
0.048
0.045
0.011
0.013
*
*
0.011
0.011
0.045
0.102
0.093
*
*
0.035
0.033
0.828
0.052
0.046
0.000
0.000
0.002
0.004
105
105
52
52
F ∗ = 10
0.079
2.662
0.182
0.059
0.057
0.063
0.059
0.074
0.003
0.005
0.016
0.015
0.030
0.027
0.074
0.082
0.079
0.041
0.040
0.040
0.039
0.758
0.040
0.036
0.040
0.038
0.048
0.060
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
467
467
0.093
2.729
0.166
*
*
0.080
0.074
0.093
0.038
0.042
*
*
0.065
0.067
0.093
0.103
0.092
*
*
0.065
0.067
1.000
0.122
0.122
0.000
0.000
0.028
0.030
106
106
46
46
0.152
7.405
0.154
0.066
0.065
0.084
0.078
0.150
0.004
0.008
0.037
0.038
0.063
0.061
0.150
0.077
0.073
0.048
0.048
0.066
0.064
1.000
0.068
0.070
0.124
0.132
0.222
0.236
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. EXPONENTIAL DESIGN. N = 500a
TABLE S.III—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0.095
0.084
0.083
0.058
0.057
0.059
0.057
0.085
0.003
0.004
0.006
0.007
0.010
0.013
0.085
0.049
0.049
0.036
0.037
0.040
0.038
0.532
0.044
0.044
0.048
0.050
0.048
0.070
0
0
0
0
F ∗ = 160
0.082
0.069
0.068
0.059
0.059
0.060
0.059
0.063
0.000
0.001
0.006
0.007
0.010
0.012
0.064
0.046
0.045
0.039
0.040
0.039
0.039
0.246
0.058
0.058
0.066
0.054
0.066
0.062
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0.176
0.072
0.072
0.060
0.061
0.064
0.066
0.172
0.008
0.010
0.019
0.020
0.031
0.033
0.172
0.044
0.044
0.041
0.042
0.045
0.044
0.988
0.064
0.064
0.078
0.078
0.112
0.156
0
0
0
0
0.134
0.063
0.063
0.059
0.059
0.061
0.061
0.124
0.002
0.003
0.010
0.010
0.017
0.021
0.124
0.045
0.045
0.040
0.041
0.041
0.041
0.684
0.048
0.048
0.070
0.068
0.082
0.090
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0.045
0.085
0.084
0.060
0.058
0.058
0.057
0.054
0.065
0.065
0.060
0.060
0.060
0.059
0.002
0.004
0.004
0.001
0.002
0.000
0.001
0.004
0.006
0.006
0.003
0.003
0.004
0.003
0.030
0.054
0.053
0.040
0.039
0.040
0.038
0.036
0.045
0.045
0.042
0.040
0.042
0.040
0.074
0.060
0.058
0.060
0.054
0.052
0.050
0.066
0.058
0.054
0.054
0.054
0.058
0.058
a Results are based on 500 simulation replications and 100 instruments. The first-stage coefficients were set equal to 07j−1 for j = 1 100 in this design. Corr(e v) is the
correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100) are respectively
the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain
testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100
instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments
formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute
deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med.
Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
17
18
TABLE S.IV
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
395
395
0026
0000
46872
0013
2298
0013
*
*
*
*
0088 −0002
0080 −0002
0.018
0.257
0.257
*
*
0.063
0.048
0.054
0.018
0.018
0.000
0.000
0.012
0.010
206
206
27
27
0044
57489
2170
0082
0081
0089
0081
−0001
0009
0009
−0008
−0007
−0001
−0001
0.032
0.385
0.385
0.055
0.052
0.051
0.047
0.040
0.022
0.022
0.028
0.036
0.040
0.050
Corr(e v) = 06
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
391
391
F ∗ = 25
0077
48070
1447
*
*
0090
0085
0.073
0.067
0.067
*
*
0.047
0.045
0.073
0.226
0.226
*
*
0.065
0.053
0.812
0.122
0.122
0.000
0.000
0.020
0.026
232
232
22
22
F ∗ = 10
0121
13839
5258
0086
0087
0105
0096
0.110
0.095
0.095
0.027
0.026
0.042
0.040
0.110
0.373
0.373
0.058
0.059
0.065
0.060
0.728
0.124
0.124
0.034
0.048
0.084
0.090
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
396
396
0146
1851
1307
*
*
0113
0110
0.144
0.141
0.141
*
*
0.090
0.090
0.144
0.235
0.235
*
*
0.090
0.092
1.000
0.235
0.235
0.000
0.000
0.058
0.072
215
215
34
34
0224
13556
2386
0084
0082
0115
0109
0.221
0.191
0.191
0.046
0.040
0.082
0.079
0.221
0.397
0.396
0.059
0.058
0.088
0.081
1.000
0.194
0.194
0.068
0.070
0.206
0.224
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 5. N = 100a
TABLE S.IV—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0.126
5.794
5.405
0.079
0.078
0.082
0.080
0.110
0.098
0.098
0.012
0.012
0.021
0.023
0.110
0.367
0.367
0.051
0.051
0.054
0.052
0.432
0.060
0.060
0.046
0.046
0.060
0.056
0
0
0
0
F ∗ = 160
0.106
9.206
3.339
0.079
0.079
0.080
0.079
0.073
0.010
0.010
0.005
0.005
0.010
0.010
0.078
0.268
0.268
0.053
0.054
0.054
0.054
0.190
0.048
0.048
0.060
0.058
0.062
0.060
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0229
2223
2129
0084
0081
0091
0087
0.221
0.054
0.054
0.019
0.014
0.041
0.041
0.221
0.295
0.295
0.056
0.055
0.062
0.060
0.948
0.144
0.144
0.080
0.074
0.112
0.116
0
0
0
0
0160
23035
7825
0074
0074
0075
0075
0.146
0.007
0.007
0.005
0.004
0.018
0.014
0.146
0.237
0.237
0.049
0.049
0.051
0.051
0.534
0.064
0.064
0.036
0.036
0.048
0.044
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0066
16626
4540
0079
0078
0080
0077
−0007
0007
0007
−0009
−0009
−0006
−0009
0073
0003
2958
0037
2885
0037
0077
0001
0076
0000
0077
0001
0076 −0001
0.046
0.358
0.358
0.053
0.054
0.054
0.054
0.049
0.246
0.245
0.050
0.051
0.050
0.049
0.064
0.030
0.030
0.040
0.044
0.048
0.046
0.054
0.034
0.034
0.046
0.044
0.046
0.046
a Results are based on 500 simulation replications and 100 instruments. The first 5 first-stage coefficients were set equal to 1 and the remaining 95 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a nonempty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
19
20
TABLE S.V
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
491
491
0.016
1.295
0.202
*
*
0.049
0.042
0001
−0001
−0001
*
*
−0029
−0010
0.010
0.088
0.085
*
*
0.033
0.027
0.040
0.028
0.026
0.000
0.000
0.002
0.002
239
239
83
83
0.029
0.555
0.157
0.052
0.049
0.059
0.053
−0002
0000
0000
−0002
−0003
−0003
−0005
0.019
0.074
0.073
0.037
0.037
0.036
0.036
0.056
0.034
0.034
0.024
0.024
0.036
0.032
Select 0
RMSE
Med. Bias
500
500
493
493
F ∗ = 25
0.049
1.612
0.215
*
*
0.057
0.056
253
253
88
88
F ∗ = 10
0.075
3.082
0.163
0.051
0.050
0.058
0.055
Corr(e v) = 06
MAD
rp(0.05)
0.047
0.016
0.017
*
*
0.047
0.045
0.047
0.100
0.095
*
*
0.047
0.045
0.844
0.044
0.040
0.000
0.000
0.000
0.004
0.071
0.006
0.007
0.020
0.019
0.021
0.022
0.071
0.071
0.070
0.036
0.036
0.036
0.036
0.718
0.038
0.038
0.038
0.046
0.074
0.076
Select 0
RMSE
500
500
491
491
0094
10351
0165
*
*
0073
0073
249
249
78
78
0141
0136
0121
0050
0048
0067
0061
Med. Bias
MAD
rp(0.05)
0093
0026
0029
*
*
0063
0056
0.093
0.082
0.079
*
*
0.063
0.056
1.000
0.128
0.126
0.000
0.000
0.008
0.008
0139
0001
0004
0029
0025
0050
0048
0.139
0.057
0.056
0.038
0.037
0.053
0.049
0.998
0.070
0.072
0.050
0.050
0.160
0.170
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 5. N = 250a
TABLE S.V—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0.079
0.069
0.069
0.049
0.049
0.050
0.049
0.070
0.000
0.001
0.005
0.004
0.013
0.011
0.070
0.046
0.046
0.034
0.033
0.033
0.033
0.416
0.044
0.044
0.042
0.052
0.050
0.056
0
0
0
0
F ∗ = 160
0.063
0.054
0.054
0.048
0.048
0.049
0.048
0.045
0.000
0.001
0.004
0.004
0.006
0.005
0.047
0.036
0.036
0.032
0.031
0.032
0.031
0.158
0.048
0.046
0.048
0.042
0.048
0.052
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0.142
0.058
0.057
0.048
0.048
0.052
0.051
0138
−0001
0001
0014
0012
0025
0026
0.138
0.036
0.036
0.035
0.033
0.038
0.038
0.964
0.024
0.022
0.058
0.056
0.076
0.082
0
0
0
0
0.104
0.051
0.051
0.049
0.049
0.049
0.049
0096
0001
0001
0007
0007
0011
0013
0.096
0.035
0.034
0.033
0.032
0.034
0.033
0.560
0.036
0.036
0.050
0.050
0.056
0.058
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0.040
0.068
0.067
0.048
0.048
0.048
0.047
0.045
0.054
0.053
0.047
0.047
0.047
0.047
0001
−0002
−0002
0002
0001
0001
0001
−0001
0000
0000
0001
0000
0000
−0001
0.027
0.043
0.043
0.032
0.032
0.033
0.032
0.030
0.036
0.036
0.030
0.031
0.030
0.030
0.054
0.048
0.048
0.042
0.036
0.040
0.042
0.042
0.048
0.048
0.048
0.046
0.056
0.050
a Results are based on 500 simulation replications and 100 instruments. The first 5 first-stage coefficients were set equal to 1 and the remaining 95 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a nonempty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
21
22
TABLE S.VI
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.012
2.472
0.122
*
*
*
*
0001
0006
0006
*
*
*
*
0.008
0.062
0.058
*
*
*
*
0.052
0.024
0.022
0.000
0.000
0.000
0.000
300
300
121
121
0.020
0.205
0.100
0.032
0.030
0.036
0.034
0000
0001
0001
0000
−0001
−0002
0002
0.014
0.047
0.047
0.021
0.021
0.023
0.019
0.050
0.044
0.040
0.008
0.014
0.020
0.016
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.034
0.032
3.056
0.004
0.127
0.005
*
*
*
*
*
*
*
*
0.032
0.067
0.059
*
*
*
*
0.838
0.060
0.056
0.000
0.000
0.000
0.000
306
306
136
136
F ∗ = 10
0.053
0.127
0.088
0.034
0.034
0.038
0.037
0.050
0.048
0.047
0.024
0.024
0.026
0.026
0.750
0.056
0.050
0.022
0.030
0.042
0.070
0.050
0.004
0.005
0.012
0.014
0.017
0.016
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
498
498
0.066
1.026
0.106
*
*
0.040
0.037
0065
0017
0021
*
*
0039
0037
0.065
0.059
0.055
*
*
0.039
0.037
1.000
0.120
0.122
0.000
0.000
0.000
0.000
324
324
118
118
0.097
0.072
0.067
0.036
0.034
0.043
0.041
0096
−0001
0002
0021
0019
0030
0029
0.096
0.037
0.038
0.025
0.023
0.033
0.031
0.996
0.060
0.060
0.042
0.044
0.140
0.154
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 5. N = 500a
TABLE S.VI—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0.056
0.043
0.043
0.033
0.032
0.034
0.033
0.048
0.001
0.002
0.009
0.006
0.011
0.010
0.048
0.029
0.029
0.023
0.023
0.025
0.023
0.426
0.034
0.036
0.040
0.036
0.040
0.046
0
0
0
0
F ∗ = 160
0.047
0.036
0.036
0.034
0.034
0.034
0.034
0.035
0.003
0.003
0.005
0.005
0.007
0.007
0.035
0.025
0.025
0.025
0.025
0.025
0.025
0.188
0.044
0.044
0.030
0.030
0.032
0.032
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0.099
0.041
0.041
0.035
0.034
0.036
0.036
0097
−0001
0000
0009
0008
0016
0016
0.097
0.026
0.026
0.023
0.023
0.024
0.025
0.950
0.040
0.040
0.066
0.064
0.086
0.084
0
0
0
0
0.071
0.036
0.036
0.034
0.034
0.035
0.035
0064
−0001
0000
0003
0003
0008
0009
0.064
0.024
0.024
0.024
0.024
0.024
0.023
0.536
0.040
0.040
0.048
0.046
0.050
0.052
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0.027
0.044
0.044
0.034
0.034
0.034
0.034
0.033
0.037
0.037
0.034
0.034
0.034
0.034
−0003
−0001
−0001
0000
−0002
0000
−0002
−0005
−0003
−0003
−0004
−0004
−0005
−0005
0.019
0.030
0.030
0.022
0.023
0.023
0.023
0.021
0.024
0.024
0.025
0.025
0.024
0.023
0.044
0.036
0.036
0.048
0.044
0.050
0.048
0.058
0.046
0.046
0.048
0.046
0.050
0.052
a Results are based on 500 simulation replications and 100 instruments. The first 5 first-stage coefficients were set equal to 1 and the remaining 95 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-Fand Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a nonempty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
23
24
TABLE S.VII
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
486
486
0.020
4.259
1.615
*
*
0.043
0.044
0000
0013
0013
*
*
0018
−0011
0.013
0.150
0.150
*
*
0.034
0.032
0.044
0.020
0.020
0.000
0.000
0.002
0.000
500
500
55
55
0.026
1.269
1.181
*
*
0.037
0.035
−0002
−0001
−0001
*
*
−0005
−0004
0.018
0.138
0.138
*
*
0.024
0.022
0.054
0.026
0.026
0.000
0.000
0.038
0.052
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
493
493
F ∗ = 25
0.048
0045
5.353
0027
1.394
0027
*
*
*
*
0.024
0009
0.022 −0018
0.045
0.141
0.141
*
*
0.025
0.018
0.678
0.102
0.102
0.000
0.000
0.000
0.000
500
500
46
46
F ∗ = 10
0.049
2.468
1.616
*
*
0.038
0.035
0.042
0.142
0.142
*
*
0.023
0.024
0.386
0.086
0.086
0.000
0.000
0.058
0.058
0042
0018
0018
*
*
0006
0008
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
486
486
0.090
1.394
1.268
*
*
0.032
0.027
0.088
0.063
0.063
*
*
0.013
0.016
0.088
0.137
0.137
*
*
0.028
0.022
1.000
0.205
0.205
0.000
0.000
0.000
0.000
500
500
46
46
0.087
4.565
2.000
*
*
0.041
0.040
0.083
0.038
0.039
*
*
0.011
0.015
0.083
0.139
0.139
*
*
0.027
0.028
0.906
0.136
0.136
0.000
0.000
0.074
0.078
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 25. N = 100a
TABLE S.VII—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
1
1
F = 40
0.040
2.640
2.333
*
*
0.036
0.034
0.027
0.021
0.021
*
*
0.007
0.008
0.029
0.092
0.092
*
*
0.024
0.023
0.170
0.040
0.040
0.000
0.000
0.054
0.056
500
500
0
0
F ∗ = 160
0.032
1.946
1.932
*
*
0.035
0.032
0.014
0.001
0.001
*
*
0.003
0.002
0.023
0.058
0.058
*
*
0.023
0.021
0.072
0.022
0.022
0.000
0.000
0.060
0.046
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
0
0
0.061
1.916
1.837
*
*
0.040
0.037
0053
0008
0008
*
*
0011
0011
0.053
0.078
0.078
*
*
0.028
0.025
0.448
0.088
0.088
0.000
0.000
0.086
0.082
499
499
0
0
0.041
1.295
1.289
0.055
0.049
0.035
0.034
0029
−0001
−0001
0055
0049
0006
0005
0.031
0.053
0.053
0.055
0.049
0.024
0.022
0.176
0.030
0.030
0.000
0.000
0.036
0.048
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
3
3
500
500
0
0
0.031
3.549
2.810
*
*
0.038
0.036
0.032
2.393
2.375
*
*
0.038
0.036
−0002
0007
0007
*
*
−0001
−0005
−0001
0005
0005
*
*
−0001
−0003
0.020
0.088
0.088
*
*
0.025
0.024
0.022
0.055
0.055
*
*
0.025
0.024
0.076
0.034
0.034
0.000
0.000
0.058
0.066
0.050
0.020
0.020
0.000
0.000
0.050
0.060
a Results are based on 500 simulation replications and 100 instruments. The first 25 first-stage coefficients were set equal to 1 and the remaining 75 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
25
26
TABLE S.VIII
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
499
499
0.012
0.355
0.052
*
*
0.000
0.003
0000
0000
0000
*
*
0000
0003
0.008
0.024
0.023
*
*
0.000
0.003
0.052
0.054
0.050
0.000
0.000
0.000
0.000
500
500
1
1
0.017
0.026
0.026
*
*
0.022
0.021
0001
0001
0001
*
*
−0001
0000
0.012
0.018
0.018
*
*
0.015
0.014
0.044
0.040
0.040
0.000
0.000
0.050
0.046
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
499
499
F ∗ = 25
0.031
0029
1.183
0002
0.047
0002
*
*
*
*
0.014 −0014
0.007
0007
0.029
0.025
0.025
*
*
0.014
0.007
0.712
0.034
0.034
0.000
0.000
0.000
0.000
500
500
0
0
F ∗ = 10
0.030
0.024
0.024
*
*
0.022
0.021
0.027
0.018
0.018
*
*
0.014
0.015
0.406
0.036
0.036
0.000
0.000
0.044
0.054
0027
0001
0001
*
*
0006
0008
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.058
0.045
0.040
*
*
*
*
0.057
0.002
0.003
*
*
*
*
0.057
0.020
0.020
*
*
*
*
1.000
0.058
0.058
0.000
0.000
0.000
0.000
500
500
1
1
0.054
0.025
0.024
*
*
0.024
0.023
0.052
0.000
0.000
*
*
0.009
0.012
0.052
0.015
0.015
*
*
0.015
0.016
0.908
0.054
0.052
0.000
0.000
0.100
0.118
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 25. N = 250a
TABLE S.VIII—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
348
348
0
0
F = 40
0.026
0.022
0.022
0.023
0.022
0.022
0.022
0.018
0.001
0.001
0.007
0.004
0.005
0.006
0.019
0.015
0.015
0.017
0.016
0.016
0.017
0.176
0.056
0.056
0.022
0.020
0.056
0.068
47
47
0
0
F ∗ = 160
0.022
0.020
0.020
0.021
0.020
0.022
0.021
0.010
0.001
0.001
0.004
0.004
0.002
0.003
0.016
0.014
0.014
0.014
0.014
0.015
0.014
0.088
0.058
0.058
0.044
0.048
0.058
0.050
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
356
356
0
0
0.038
0.022
0.022
0.023
0.022
0.022
0.022
0.034
0.000
0.000
0.007
0.006
0.007
0.009
0.034
0.014
0.014
0.016
0.016
0.015
0.016
0.468
0.054
0.054
0.016
0.016
0.060
0.090
37
37
0
0
0.027
0.020
0.020
0.022
0.021
0.021
0.021
0.018
0.000
0.000
0.007
0.006
0.003
0.006
0.019
0.013
0.013
0.014
0.014
0.014
0.014
0.188
0.074
0.074
0.066
0.060
0.060
0.070
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
359
359
0
0
51
51
0
0
0.018
0.021
0.021
0.020
0.019
0.021
0.020
0.019
0.019
0.019
0.020
0.020
0.020
0.020
0001
0001
0001
0006
0005
0001
0002
−0001
0000
0000
−0001
−0001
−0001
−0001
0.012
0.013
0.013
0.015
0.013
0.015
0.013
0.013
0.013
0.013
0.015
0.013
0.014
0.013
0.048
0.052
0.052
0.002
0.008
0.048
0.052
0.040
0.036
0.036
0.040
0.026
0.042
0.040
a Results are based on 500 simulation replications and 100 instruments. The first 25 first-stage coefficients were set equal to 1 and the remaining 75 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
27
28
TABLE S.IX
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.009
0.036
0.031
*
*
*
*
0000
0001
0001
*
*
*
*
0.006
0.016
0.016
*
*
*
*
0.052
0.044
0.042
0.000
0.000
0.000
0.000
500
500
0
0
0.012
0.017
0.017
*
*
0.014
0.014
0000
−0001
−0001
*
*
0000
−0001
0.008
0.011
0.011
*
*
0.009
0.008
0.046
0.044
0.044
0.000
0.000
0.048
0.060
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.021
0020
0.071 −0001
0.030 −0001
*
*
*
*
*
*
*
*
0.020
0.016
0.016
*
*
*
*
0.652
0.042
0.040
0.000
0.000
0.000
0.000
500
500
0
0
F ∗ = 10
0.022
0.017
0.017
*
*
0.015
0.015
0.019
0.012
0.012
*
*
0.011
0.010
0.406
0.044
0.044
0.000
0.000
0.066
0.078
0019
0000
0000
*
*
0004
0004
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.041
0.025
0.023
*
*
*
*
0040
0000
0001
*
*
*
*
0.040
0.013
0.013
*
*
*
*
0.998
0.052
0.054
0.000
0.000
0.000
0.000
500
500
0
0
0.037
0.017
0.017
*
*
0.016
0.017
0036
−0001
−0001
*
*
0006
0009
0.036
0.012
0.012
*
*
0.010
0.011
0.888
0.040
0.040
0.000
0.000
0.096
0.144
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 25. N = 500a
TABLE S.IX—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
0
0
0
0
F = 40
0.017
0.015
0.014
0.014
0.014
0.014
0.014
0.012
0.000
0.000
0.004
0.003
0.002
0.003
0.013
0.009
0.009
0.009
0.009
0.009
0.009
0.160
0.046
0.048
0.056
0.056
0.048
0.042
0
0
0
0
F ∗ = 160
0.015
0.014
0.014
0.014
0.014
0.014
0.014
0.007
0.001
0.001
0.002
0.002
0.001
0.002
0.011
0.010
0.010
0.009
0.009
0.010
0.010
0.080
0.048
0.048
0.056
0.054
0.048
0.054
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
0
0
0
0
0.026
0.014
0.014
0.016
0.015
0.015
0.015
0022
−0001
0000
0009
0007
0004
0007
0.022
0.009
0.009
0.012
0.010
0.010
0.010
0.426
0.056
0.056
0.108
0.090
0.064
0.092
0
0
0
0
0.018
0.013
0.013
0.014
0.014
0.014
0.014
0013
0000
0000
0005
0004
0003
0005
0.014
0.009
0.009
0.010
0.010
0.010
0.010
0.138
0.036
0.036
0.064
0.056
0.052
0.064
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
0
0
0
0
0
0
0
0
0.013
0.014
0.014
0.014
0.013
0.014
0.014
0.014
0.014
0.014
0.014
0.014
0.014
0.014
−0001
−0002
−0002
0000
0000
0000
−0001
0000
0000
0000
−0001
−0001
0000
−0001
0.009
0.010
0.010
0.010
0.009
0.010
0.010
0.010
0.010
0.010
0.010
0.010
0.010
0.010
0.044
0.034
0.034
0.038
0.040
0.040
0.044
0.062
0.062
0.062
0.060
0.068
0.056
0.056
a Results are based on 500 simulation replications and 100 instruments. The first 25 first-stage coefficients were set equal to 1 and the remaining 75 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
29
30
TABLE S.X
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.017
1.649
1.204
*
*
*
*
0000
0005
0005
*
*
*
*
0.012
0.108
0.108
*
*
*
*
0.058
0.016
0.016
0.000
0.000
0.000
0.000
500
500
472
472
0.019
4.640
2.054
*
*
0.030
0.030
−0002
−0003
−0003
*
*
−0001
−0007
0.013
0.082
0.082
*
*
0.025
0.023
0.050
0.016
0.016
0.000
0.000
0.002
0.004
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0036
0032
25295
0014
1580
0014
*
*
*
*
*
*
*
*
0.032
0.101
0.101
*
*
*
*
0.546
0.090
0.090
0.000
0.000
0.000
0.000
500
500
468
468
F ∗ = 10
0031
0024
1284
0004
1156
0004
*
*
*
*
0032 −0001
0031
0002
0.024
0.079
0.079
*
*
0.019
0.016
0.256
0.062
0.062
0.000
0.000
0.006
0.008
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0065
74420
0884
*
*
*
*
0063
0038
0038
*
*
*
*
0.063
0.096
0.096
*
*
*
*
0.988
0.165
0.165
0.000
0.000
0.000
0.000
0052
0049
5798
0016
1542
0016
*
*
*
*
0022 −0007
0024 −0005
0.049
0.080
0.080
*
*
0.017
0.014
0.702
0.098
0.098
0.000
0.000
0.000
0.002
500
500
477
477
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 50. N = 100a
TABLE S.X—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
400
400
F = 40
0.026
0.662
0.657
*
*
0.027
0.025
0.015
0.012
0.012
*
*
0.004
0.004
0.018
0.049
0.049
*
*
0.018
0.016
0.114
0.034
0.034
0.000
0.000
0.006
0.010
500
500
357
357
F ∗ = 160
0.022
1.272
1.224
*
*
0.026
0.023
0.008
0.000
0.000
*
*
0.002
0.002
0.015
0.033
0.033
*
*
0.017
0.017
0.068
0.018
0.018
0.000
0.000
0.014
0.012
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
391
391
0.034
0.495
0.495
*
*
0.031
0.028
0027
0002
0002
*
*
0001
0002
0.027
0.041
0.041
*
*
0.024
0.017
0.264
0.058
0.058
0.000
0.000
0.010
0.014
500
500
350
350
0.025
4.169
4.017
*
*
0.023
0.023
0014
−0001
−0001
*
*
−0001
−0001
0.017
0.032
0.032
*
*
0.015
0.016
0.110
0.038
0.038
0.000
0.000
0.006
0.006
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
400
400
500
500
356
356
0.023
0.374
0.374
*
*
0.028
0.027
0.022
0.419
0.419
*
*
0.025
0.025
−0001
0003
0003
*
*
−0005
−0006
0000
0002
0002
*
*
−0003
−0001
0.017
0.046
0.046
*
*
0.019
0.019
0.014
0.031
0.031
*
*
0.018
0.017
0.064
0.034
0.034
0.000
0.000
0.014
0.014
0.056
0.026
0.026
0.000
0.000
0.010
0.016
a Results are based on 500 simulation replications and 100 instruments. The first 50 first-stage coefficients were set equal to 1 and the remaining 50 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
31
32
TABLE S.XI
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.010
0.023
0.023
*
*
*
*
0.001
0.001
0.001
*
*
*
*
0.007
0.014
0.014
*
*
*
*
0.062
0.060
0.060
0.000
0.000
0.000
0.000
500
500
85
85
0.012
0.016
0.016
*
*
0.015
0.014
0.000
0.000
0.000
*
*
0.001
0.001
0.009
0.012
0.012
*
*
0.011
0.010
0.038
0.046
0.046
0.000
0.000
0.030
0.032
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.023
0.021
0.021
0.002
0.021
0.002
*
*
*
*
*
*
*
*
0.021
0.013
0.013
*
*
*
*
0.594
0.020
0.020
0.000
0.000
0.000
0.000
500
500
92
92
F ∗ = 10
0.019
0.015
0.015
*
*
0.014
0.013
0.015
0.011
0.011
*
*
0.009
0.009
0.248
0.032
0.032
0.000
0.000
0.028
0.030
0.015
0.000
0.000
*
*
0.001
0.002
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.042
0.020
0.020
*
*
*
*
0.041
0.002
0.002
*
*
*
*
0.041
0.013
0.012
*
*
*
*
0.990
0.036
0.036
0.000
0.000
0.000
0.000
500
500
94
94
0.032
0.016
0.016
*
*
0.015
0.015
0.030
0.000
0.000
*
*
0.003
0.004
0.030
0.010
0.010
*
*
0.010
0.011
0.712
0.058
0.060
0.000
0.000
0.042
0.046
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 50. N = 250a
TABLE S.XI—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
0
0
F = 40
0.016
0.014
0.014
*
*
0.015
0.015
0.010
0.001
0.001
*
*
0.001
0.002
0.012
0.010
0.010
*
*
0.010
0.010
0.106
0.048
0.048
0.000
0.000
0.050
0.056
500
500
0
0
F ∗ = 160
0.015
0.014
0.014
*
*
0.015
0.015
0.005
0.000
0.000
*
*
0.002
0.002
0.010
0.010
0.010
*
*
0.010
0.010
0.082
0.050
0.050
0.000
0.000
0.062
0.048
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
0
0
0.021
0.014
0.014
*
*
0.015
0.014
0017
−0001
−0001
*
*
0002
0003
0.017
0.010
0.010
*
*
0.009
0.009
0.278
0.048
0.048
0.000
0.000
0.064
0.048
500
500
0
0
0.017
0.014
0.014
*
*
0.015
0.014
0010
0001
0001
*
*
0001
0002
0.012
0.009
0.009
*
*
0.010
0.009
0.112
0.054
0.052
0.000
0.000
0.048
0.052
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
1
1
500
500
0
0
0.013
0.014
0.014
*
*
0.015
0.014
0.013
0.014
0.014
*
*
0.015
0.014
0.000
0.001
0.001
*
*
0.000
0.001
0.000
0.000
0.000
*
*
0.000
0.000
0.009
0.009
0.009
*
*
0.009
0.009
0.009
0.009
0.009
*
*
0.010
0.009
0.054
0.050
0.050
0.000
0.000
0.058
0.068
0.044
0.046
0.046
0.000
0.000
0.058
0.056
a Results are based on 500 simulation replications and 100 instruments. The first 50 first-stage coefficients were set equal to 1 and the remaining 50 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
33
34
TABLE S.XII
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.007
0.015
0.015
*
*
*
*
0.001
0.001
0.001
*
*
*
*
0.005
0.009
0.009
*
*
*
*
0.058
0.052
0.052
0.000
0.000
0.000
0.000
500
500
1
1
0.009
0.011
0.011
*
*
0.010
0.010
0.000
0.000
0.000
*
*
0.000
0.000
0.006
0.007
0.007
*
*
0.007
0.006
0.052
0.052
0.052
0.000
0.000
0.042
0.052
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.016
0.014
0.013
0.000
0.013
0.000
*
*
*
*
*
*
*
*
0.014
0.009
0.009
*
*
*
*
0.552
0.034
0.034
0.000
0.000
0.000
0.000
500
500
0
0
F ∗ = 10
0.015
0.011
0.011
*
*
0.011
0.010
0.012
0.007
0.007
*
*
0.007
0.007
0.262
0.066
0.066
0.000
0.000
0.054
0.066
0.011
0.000
0.000
*
*
0.002
0.003
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.029
0.012
0.012
*
*
*
*
0028
0000
0000
*
*
*
*
0.028
0.008
0.008
*
*
*
*
0.994
0.044
0.044
0.000
0.000
0.000
0.000
500
500
1
1
0.022
0.011
0.011
*
*
0.011
0.011
0020
−0001
−0001
*
*
0003
0003
0.020
0.007
0.007
*
*
0.007
0.007
0.656
0.058
0.054
0.000
0.000
0.070
0.088
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 50. N = 500a
TABLE S.XII—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
0
0
F = 40
0.011
0.010
0.010
*
*
0.010
0.010
0.006
0.000
0.000
*
*
0.001
0.002
0.008
0.007
0.007
*
*
0.007
0.007
0.108
0.066
0.064
0.000
0.000
0.052
0.058
358
358
0
0
F ∗ = 160
0.010
0.010
0.010
0.010
0.010
0.010
0.010
0.003
0.000
0.000
0.003
0.003
0.001
0.001
0.007
0.007
0.007
0.007
0.008
0.007
0.007
0.076
0.044
0.044
0.004
0.006
0.048
0.042
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
497
497
0
0
0.015
0.009
0.009
0.014
0.012
0.010
0.010
0012
0000
0000
−0006
−0005
0002
0002
0.012
0.006
0.006
0.009
0.008
0.007
0.007
0.248
0.046
0.042
0.000
0.000
0.046
0.056
351
351
0
0
0.012
0.010
0.010
0.012
0.011
0.010
0.010
0007
0000
0000
0004
0003
0002
0002
0.009
0.007
0.007
0.007
0.007
0.007
0.007
0.128
0.046
0.048
0.024
0.024
0.056
0.064
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
497
497
0
0
349
349
0
0
0.009
0.009
0.009
0.005
0.010
0.010
0.009
0.010
0.010
0.010
0.011
0.010
0.011
0.010
0000
0000
0000
−0001
−0001
0000
0000
0000
0000
0000
−0001
−0001
0000
0000
0.006
0.006
0.006
0.004
0.007
0.006
0.006
0.007
0.007
0.007
0.008
0.007
0.007
0.007
0.032
0.032
0.032
0.000
0.000
0.044
0.038
0.060
0.064
0.064
0.014
0.016
0.066
0.066
a Results are based on 500 simulation replications and 100 instruments. The first 50 first-stage coefficients were set equal to 1 and the remaining 50 to zero in this design. Corr(e v) is the correlation between first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and
FULL(100) are respectively the 2SLS, LIML, and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100)
and FULL(100) to obtain testing rejection frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty
to select among the 100 instruments. LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select
among the 120 instruments formed by augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias
(Med. Bias), mean absolute deviation (MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In
these cases, RMSE, Med. Bias, and MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞)
and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
35
36
TABLE S.XIII
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.013
6.463
0.950
*
*
*
*
0.000
0.006
0.006
*
*
*
*
0.009
0.065
0.065
*
*
*
*
0.068
0.012
0.012
0.000
0.000
0.000
0.000
500
500
500
500
0.014
1.387
1.171
*
*
*
*
0.000
0.002
0.002
*
*
*
*
0.008
0.044
0.044
*
*
*
*
0.054
0.026
0.026
0.000
0.000
0.000
0.000
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.024
0.021
2.172
0.008
0.701
0.008
*
*
*
*
*
*
*
*
0.021
0.061
0.061
*
*
*
*
0.378
0.076
0.076
0.000
0.000
0.000
0.000
500
500
500
500
F ∗ = 10
0.019
0.012
2.774
0.001
0.539
0.001
*
*
*
*
*
*
*
*
0.013
0.044
0.044
*
*
*
*
0.152
0.040
0.040
0.000
0.000
0.000
0.000
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.043
0.697
0.555
*
*
*
*
0.041
0.017
0.017
*
*
*
*
0.041
0.060
0.060
*
*
*
*
0.916
0.145
0.145
0.000
0.000
0.000
0.000
500
500
500
500
0.030
1.250
1.131
*
*
*
*
0.027
0.007
0.007
*
*
*
*
0.027
0.046
0.046
*
*
*
*
0.486
0.074
0.074
0.000
0.000
0.000
0.000
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 100. N = 100a
TABLE S.XIII—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F = 40
0.017
0.007
1.345
0.005
1.139
0.005
*
*
*
*
*
*
*
*
0.010
0.027
0.027
*
*
*
*
0.088
0.022
0.022
0.000
0.000
0.000
0.000
500
500
500
500
F ∗ = 160
0.015
0.004
0.221
0.000
0.221
0.000
*
*
*
*
*
*
*
*
0.011
0.019
0.019
*
*
*
*
0.054
0.010
0.010
0.000
0.000
0.000
0.000
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.021
0.803
0.782
*
*
*
*
0014
0002
0002
*
*
*
*
0.015
0.023
0.023
*
*
*
*
0.176
0.050
0.050
0.000
0.000
0.000
0.000
500
500
500
500
0.016
0.277
0.277
*
*
*
*
0007
−0001
−0001
*
*
*
*
0.011
0.020
0.020
*
*
*
*
0.086
0.030
0.030
0.000
0.000
0.000
0.000
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
500
500
0.016
0.203
0.202
*
*
*
*
0.000
0.002
0.002
*
*
*
*
0.010
0.026
0.026
*
*
*
*
0.074
0.034
0.034
0.000
0.000
0.000
0.000
500
500
500
500
0.015
0.376
0.376
*
*
*
*
0.000
0.001
0.001
*
*
*
*
0.010
0.019
0.019
*
*
*
*
0.054
0.028
0.028
0.000
0.000
0.000
0.000
a Results are based on 500 simulation replications and 100 instruments. All first-stage coefficients were set equal to 1 in this design. Corr(e v) is the correlation between
first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100) are respectively the 2SLS, LIML,
and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain testing rejection
frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100 instruments.
LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments formed by
augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute deviation
(MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med. Bias, and
MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
37
38
TABLE S.XIV
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.008
0.013
0.013
*
*
*
*
0.000
0.000
0.000
*
*
*
*
0.005
0.008
0.008
*
*
*
*
0.050
0.048
0.046
0.000
0.000
0.000
0.000
500
500
500
500
0.009
0.011
0.011
*
*
*
*
0.001
0.001
0.001
*
*
*
*
0.006
0.007
0.007
*
*
*
*
0.068
0.064
0.064
0.000
0.000
0.000
0.000
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.015
0.013
0.011
0.000
0.011
0.000
*
*
*
*
*
*
*
*
0.013
0.007
0.007
*
*
*
*
0.390
0.034
0.034
0.000
0.000
0.000
0.000
500
500
500
500
F ∗ = 10
0.012
0.009
0.010
0.001
0.010
0.001
*
*
*
*
*
*
*
*
0.009
0.007
0.007
*
*
*
*
0.158
0.048
0.046
0.000
0.000
0.000
0.000
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.027
0.012
0.012
*
*
*
*
0.026
0.001
0.001
*
*
*
*
0.026
0.008
0.008
*
*
*
*
0.926
0.042
0.044
0.000
0.000
0.000
0.000
500
500
500
500
0.019
0.010
0.010
*
*
*
*
0.016
0.000
0.001
*
*
*
*
0.016
0.007
0.007
*
*
*
*
0.456
0.056
0.054
0.000
0.000
0.000
0.000
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 100. N = 250a
TABLE S.XIV—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
496
496
F = 40
0.011
0005
0.010
0000
0.010
0000
*
*
*
*
0.010 −0003
0.007 −0004
0.007
0.007
0.007
*
*
0.009
0.008
0.090
0.050
0.050
0.000
0.000
0.000
0.000
500
500
478
478
F ∗ = 160
0.010
0003
0.010
0000
0.010
0000
*
*
*
*
0.016 −0001
0.016 −0002
0.006
0.006
0.006
*
*
0.009
0.007
0.080
0.064
0.062
0.000
0.000
0.006
0.008
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
494
494
0.013
0.010
0.010
*
*
0.008
0.009
0009
0000
0000
*
*
−0006
−0008
0.010
0.006
0.006
*
*
0.007
0.008
0.150
0.050
0.050
0.000
0.000
0.000
0.000
500
500
477
477
0.011
0.010
0.010
*
*
0.011
0.011
0005
0000
0000
*
*
0003
0003
0.007
0.006
0.006
*
*
0.009
0.007
0.088
0.058
0.058
0.000
0.000
0.002
0.002
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
487
487
0.009
0.010
0.010
*
*
0.011
0.010
0000
0000
0000
*
*
0003
0001
0.007
0.007
0.007
*
*
0.009
0.006
0.046
0.044
0.044
0.000
0.000
0.000
0.002
500
500
475
475
0.010
0.010
0.010
*
*
0.010
0.009
0000
0000
0000
*
*
−0001
−0005
0.007
0.007
0.007
*
*
0.006
0.007
0.054
0.052
0.052
0.000
0.000
0.000
0.000
a Results are based on 500 simulation replications and 100 instruments. All first-stage coefficients were set equal to 1 in this design. Corr(e v) is the correlation between
first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100) are respectively the 2SLS, LIML,
and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain testing rejection
frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100 instruments.
LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments formed by
augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute deviation
(MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med. Bias, and
MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
39
40
TABLE S.XV
Corr(e v) = 0
Estimator
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
500
500
500
500
0.006
0.008
0.008
*
*
*
*
0.000
0.001
0.001
*
*
*
*
0.004
0.005
0.005
*
*
*
*
0.046
0.048
0.044
0.000
0.000
0.000
0.000
500
500
340
340
0.006
0.007
0.007
*
*
0.006
0.006
0.000
0.000
0.000
*
*
0.000
0.000
0.004
0.004
0.004
*
*
0.004
0.004
0.048
0.048
0.046
0.000
0.000
0.004
0.008
Select 0
RMSE
Med. Bias
Corr(e v) = 06
MAD
rp(0.05)
500
500
500
500
F ∗ = 25
0.011
0.009
0.008
0.000
0.008
0.000
*
*
*
*
*
*
*
*
0.009
0.006
0.006
*
*
*
*
0.396
0.052
0.052
0.000
0.000
0.000
0.000
500
500
325
325
F ∗ = 10
0.009
0.007
0.007
*
*
0.007
0.007
0.006
0.005
0.005
*
*
0.004
0.004
0.180
0.062
0.062
0.000
0.000
0.026
0.022
0.006
0.000
0.000
*
*
0.000
0.000
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
500
500
0.019
0.007
0.007
*
*
*
*
0018
0000
0000
*
*
*
*
0.018
0.005
0.005
*
*
*
*
0.920
0.036
0.040
0.000
0.000
0.000
0.000
500
500
316
316
0.012
0.007
0.007
*
*
0.007
0.007
0011
−0001
−0001
*
*
0000
0001
0.011
0.005
0.005
*
*
0.004
0.004
0.412
0.050
0.052
0.000
0.000
0.014
0.014
(Continues)
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
2SLS SIMULATION RESULTS. CUT-OFF DESIGN, S = 100. N = 500a
TABLE S.XV—Continued
Corr(e v) = 0
Estimator
Select 0
RMSE
Med. Bias
Corr(e v) = 03
MAD
rp(0.05)
Select 0
RMSE
Corr(e v) = 06
Med. Bias
MAD
rp(0.05)
500
500
2
2
F = 40
0.008
0.007
0.007
*
*
0.007
0.007
0.003
0.000
0.000
*
*
0.000
0.001
0.006
0.005
0.005
*
*
0.005
0.005
0.092
0.078
0.078
0.000
0.000
0.070
0.064
500
500
0
0
F ∗ = 160
0.007
0.007
0.007
*
*
0.007
0.007
0.002
0.000
0.000
*
*
0.000
0.000
0.005
0.005
0.005
*
*
0.005
0.005
0.060
0.058
0.060
0.000
0.000
0.050
0.056
Select 0
RMSE
Med. Bias
MAD
rp(0.05)
500
500
0
0
0.009
0.007
0.007
*
*
0.007
0.007
0.006
0.000
0.000
*
*
0.001
0.001
0.007
0.005
0.005
*
*
0.005
0.005
0.164
0.048
0.048
0.000
0.000
0.060
0.066
500
500
0
0
0.008
0.007
0.007
*
*
0.007
0.007
0.003
0.000
0.000
*
*
0.001
0.001
0.005
0.005
0.005
*
*
0.005
0.005
0.090
0.044
0.044
0.000
0.000
0.056
0.048
∗
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
500
500
0
0
0.006
0.007
0.007
*
*
0.007
0.007
0.000
0.000
0.000
*
*
0.000
0.000
0.004
0.004
0.004
*
*
0.005
0.005
0.038
0.040
0.040
0.000
0.000
0.036
0.034
500
500
0
0
0.007
0.007
0.007
*
*
0.007
0.007
0.000
0.000
0.000
*
*
0.000
0.000
0.005
0.005
0.005
*
*
0.005
0.005
0.052
0.052
0.052
0.000
0.000
0.048
0.056
a Results are based on 500 simulation replications and 100 instruments. All first-stage coefficients were set equal to 1 in this design. Corr(e v) is the correlation between
first-stage and structural errors. F ∗ measures the strength of the instruments as outlined in the text. 2SLS(100), LIML(100), and FULL(100) are respectively the 2SLS, LIML,
and Fuller(1) estimator using all 100 potential instruments. Many-instrument robust standard errors are computed for LIML(100) and FULL(100) to obtain testing rejection
frequencies. LASSO and Post-LASSO respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 100 instruments.
LASSO-F and Post-LASSO-F respectively correspond to IV using LASSO or Post-LASSO with the refined data-driven penalty to select among the 120 instruments formed by
augmenting the original 100 instruments with the first 20 principal components. We report root-mean-squared-error (RMSE), median bias (Med. Bias), mean absolute deviation
(MAD), and rejection frequency for 5% level tests (rp(0.05)). “Select 0” is the number of cases in which LASSO chose no instruments. In these cases, RMSE, Med. Bias, and
MAD use only the replications where LASSO selects a non-empty set of instruments, and we set the confidence interval equal to (−∞ ∞) and thus fail to reject.
METHODS FOR OPTIMAL INSTRUMENTS
2SLS(100)
LIML(100)
FULL(100)
LASSO
Post-LASSO
LASSO-F
Post-LASSO-F
41
42
BELLONI, CHEN, CHERNOZHUKOV, AND HANSEN
REFERENCES
BEKKER, P. A. (1994): “Alternative Approximations to the Distributions of Instrumental Variables
Estimators,” Econometrica, 63, 657–681. [11]
BELLONI, A., D. CHEN, V. CHERNOZHUKOV, AND C. HANSEN (2012): “Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain,” Econometrica (forthcoming). [8]
FULLER, W. A. (1977): “Some Properties of a Modification of the Limited Information Estimator,” Econometrica, 45, 939–954. [11]
HANSEN, C., J. HAUSMAN, AND W. K. NEWEY (2008): “Estimation With Many Instrumental Variables,” Journal of Business & Economic Statistics, 26, 398–422. [11]
LEDOUX, M., AND M. TALAGRAND (1991): Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik undihrer Grenzgebiete. Berlin: Springer. [8]
ROSENTHAL, H. P. (1970): “On the Subspaces of Lp (p > 2) Spanned by Sequences of Independent Random Variables,” Israel Journal of Mathematics, 9, 273–303. [1]
VAN DER VAART, A. W., AND J. A. WELLNER (1996): Weak Convergence and Empirical Processes.
Springer Series in Statistics. Berlin: Springer. [3]
VON BAHR, B., AND C.-G. ESSEEN (1965): “Inequalities for the rth Absolute Moment of a Sum
of Random Variables, 1 ≤ r ≤ 2,” The Annals of Mathematical Statistics, 36, 299–303. [2,5]
Duke University Fuqua School of Business, Durham, NC 27708, U.S.A.;
[email protected],
D-GESS, ETH Zurich, IFW E 48.3, Haldeneggsteig 4, CH-8092, Zurich,
Switzerland; [email protected],
Dept. of Economics, Massachusetts Institute of Technology, Cambridge, MA
02139, U.S.A.; [email protected],
and
University of Chicago Booth School of Business, Chicago, IL 60637, U.S.A.;
[email protected].
Manuscript received October, 2010; final revision received June, 2012.
© Copyright 2026 Paperzz