
Supplemental Material for
ASYMPTOTIC SIZE OF KLEIBERGEN'S LM AND
CONDITIONAL LR TESTS FOR MOMENT CONDITION MODELS
By
Donald W. K. Andrews and Patrik Guggenberger
December 2014
COWLES FOUNDATION DISCUSSION PAPER NO. 1977S
COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
YALE UNIVERSITY
Box 208281
New Haven, Connecticut 06520-8281
http://cowles.econ.yale.edu/
Supplemental Material
for
Asymptotic Size of Kleibergen’s LM
and Conditional LR Tests
for Moment Condition Models
Donald W. K. Andrews
Cowles Foundation for Research in Economics
Yale University
Patrik Guggenberger
Department of Economics
Pennsylvania State University
First Version: March 25, 2011
Revised: December 19, 2014
Contents

13 Outline
14 Proof of Lemma 8.2
15 Proof of Lemma 8.3
16 Proof of Theorem 8.4
17 Proofs of Sufficiency of Several Conditions for the $\lambda_{p-j}(\cdot)$ Condition in $\mathcal{F}_0^j$
18 Asymptotic Size of Kleibergen's CLR Test with Jacobian-Variance Weighting and the Proof of Theorem 5.1
19 Proof of Theorem 7.1

13 Outline
We let AG1 abbreviate the main paper "Asymptotic Size of Kleibergen's LM and Conditional LR Tests for Moment Condition Models" and its Appendix. References to Sections with Section numbers less than 13 refer to Sections of AG1. Similarly, all theorems and lemmas with Section numbers less than 13 refer to results in AG1.
This Supplemental Material provides proofs of some of the results stated in AG1. It also provides
some complementary results to those in AG1.
Sections 14, 15, and 16 prove Lemma 8.2, Lemma 8.3, and Theorem 8.4, respectively, which
appear in Section 8 in the Appendix to AG1. Section 17 proves that the conditions in (3.9) and
(3.10) are sufficient for the second condition in $\mathcal{F}_0^j$.

Section 18 proves Theorem 5.1. Section 18 also determines the asymptotic size of Kleibergen's (2005) CLR test with Jacobian-variance weighting that employs the Robin and Smith (2000) rank statistic, defined in Section 5, for the general case of $p \ge 1$. When $p = 1$, the asymptotic size of this test is correct. But, when $p \ge 2$, we cannot show that its asymptotic size is necessarily correct (because the sample moments and the rank statistic can be asymptotically dependent under some sequences of distributions). Section 18 provides some simulation results for this test.
Section 19 proves Theorem 7.1, which provides results for time series observations.
For notational simplicity, throughout the Supplemental Material, we often suppress the argument $\theta_0$ for various quantities that depend on the null value $\theta_0$. Throughout the Supplemental Material, the quantities $B_F$, $C_F$, and $(\tau_{1F}, \ldots, \tau_{pF})$ are defined using the general definitions given in (8.6)-(8.8), rather than the definitions given in Section 3, which are a special case of the former definitions.

For notational simplicity, the proofs in Sections 14-16 are for the sequence $\{n\}$, rather than a subsequence $\{w_n : n \ge 1\}$. The same proofs hold for any subsequence $\{w_n : n \ge 1\}$. The proofs in these three sections use the following simplified notation. Define
$$D_n := E_{F_n} G_i,\;\; \Omega_n := \Omega_{F_n},\;\; B_n := B_{F_n},\;\; C_n := C_{F_n},\;\; B_n = (B_{n,q}, B_{n,p-q}),\;\; C_n = (C_{n,q}, C_{n,k-q}),$$
$$W_n := W_{F_n},\;\; W_{2n} := W_{2F_n},\;\; U_n := U_{F_n},\;\; \text{and } U_{2n} := U_{2F_n}, \quad (13.1)$$

where $q = q_h$ is defined in (8.16), $B_{n,q} \in R^{p \times q}$, $B_{n,p-q} \in R^{p \times (p-q)}$, $C_{n,q} \in R^{k \times q}$, and $C_{n,k-q} \in R^{k \times (k-q)}$.

Define

$$\tau_{n,q} := \mathrm{Diag}\{\tau_{1F_n}, \ldots, \tau_{qF_n}\} \in R^{q \times q},\;\; \tau_{n,p-q} := \mathrm{Diag}\{\tau_{(q+1)F_n}, \ldots, \tau_{pF_n}\} \in R^{(p-q) \times (p-q)},\;\;\text{and}$$
$$\tau_n := \begin{bmatrix} \tau_{n,q} & 0^{q \times (p-q)} \\ 0^{(p-q) \times q} & \tau_{n,p-q} \\ 0^{(k-p) \times q} & 0^{(k-p) \times (p-q)} \end{bmatrix} \in R^{k \times p}. \quad (13.2)$$

Note that $\tau_n$ is the diagonal matrix of singular values of $W_n D_n U_n$; see (8.8).

14 Proof of Lemma 8.2
Lemma 8.2 of AG1. Under all sequences $\{\lambda_{n,h} : n \ge 1\}$,

$$n^{1/2}\begin{pmatrix} \hat g_n \\ \mathrm{vec}(\hat D_n - E_{F_n} G_i) \end{pmatrix} \to_d \begin{pmatrix} \bar g_h \\ \mathrm{vec}(\bar D_h) \end{pmatrix} \sim N\left(0^{(p+1)k}, \begin{pmatrix} h_{5,g} & 0^{k \times pk} \\ 0^{pk \times k} & \Phi_h^{\mathrm{vec}(G_i)} \end{pmatrix}\right).$$

Under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} : n \ge 1\}$, the same result holds with $n$ replaced with $w_n$.
Proof of Lemma 8.2. We have

$$n^{1/2}\mathrm{vec}(\hat D_n - D_n) = n^{-1/2}\sum_{i=1}^n \mathrm{vec}(G_i - D_n) - \begin{pmatrix} \widehat\Gamma_{1n} \\ \vdots \\ \widehat\Gamma_{pn} \end{pmatrix} \widehat\Omega_n^{-1}\, n^{1/2}\hat g_n$$
$$= n^{-1/2}\sum_{i=1}^n \left[ \mathrm{vec}(G_i - D_n) - \begin{pmatrix} (E_{F_n} G_{\ell 1} g_\ell')\,\Omega_{F_n}^{-1} g_i \\ \vdots \\ (E_{F_n} G_{\ell p} g_\ell')\,\Omega_{F_n}^{-1} g_i \end{pmatrix} \right] + o_p(1), \quad (14.1)$$

where the second equality holds by (i) the weak law of large numbers (WLLN) applied to $n^{-1}\sum_{\ell=1}^n G_{\ell j} g_\ell'$ for $j = 1, \ldots, p$, $n^{-1}\sum_{\ell=1}^n \mathrm{vec}(G_\ell)$, and $n^{-1}\sum_{\ell=1}^n g_\ell g_\ell'$, (ii) $E_{F_n} g_i = 0^k$, (iii) $h_{5,g} = \lim \Omega_{F_n}$ is pd, and (iv) the CLT, which implies that $n^{1/2}\hat g_n = O_p(1)$.

Using (14.1), the convergence result of Lemma 8.2 holds (with $n$ in place of $w_n$) by the Lyapunov triangular-array multivariate CLT using the moment restrictions in $\mathcal{F}$. The limiting covariance matrix between $n^{1/2}\mathrm{vec}(\hat D_n - D_n)$ and $n^{1/2}\hat g_n$ in Lemma 8.2 is a zero matrix because

$$E_{F_n}\left[G_{ij} - D_{nj} - (E_{F_n} G_{\ell j} g_\ell')\,\Omega_{F_n}^{-1} g_i\right] g_i' = 0^{k \times k}, \quad (14.2)$$

where $D_{nj}$ denotes the $j$th column of $D_n$, using $E_{F_n} g_i = 0^k$ for $j = 1, \ldots, p$. By the CLT, the limiting variance matrix of $n^{1/2}\mathrm{vec}(\hat D_n - D_n)$ in Lemma 8.2 equals

$$\lim \mathrm{Var}_{F_n}\left(\mathrm{vec}(G_i) - (E_{F_n}\mathrm{vec}(G_\ell)\, g_\ell')\,\Omega_{F_n}^{-1} g_i\right) = \lim \Phi_{F_n}^{\mathrm{vec}(G_i)} = \Phi_h^{\mathrm{vec}(G_i)}, \quad (14.3)$$

see (8.15), and the limit exists because (i) the components of $\Phi_{F_n}^{\mathrm{vec}(G_i)}$ are comprised of submatrices of $\lambda_{4,F_n}$ and $\lambda_{5,F_n}$ and (ii) $\lambda_{s,F_n} \to h_s$ for $s = 4, 5$. By the CLT, the limiting variance matrix of $n^{1/2}\hat g_n$ equals $\lim E_{F_n} g_i g_i' = h_{5,g}$. $\square$

15 Proof of Lemma 8.3
Lemma 8.3 of AG1. Suppose Assumption WU holds for some non-empty parameter space $\Lambda_* \subset \Lambda_{WU}$. Under all sequences $\{\lambda_{n,h} : n \ge 1\}$ with $\lambda_{n,h} \in \Lambda_*$,

$$n^{1/2}\left(\hat g_n,\; \hat D_n - E_{F_n} G_i,\; W_{F_n}\hat D_n U_{F_n} T_n\right) \to_d \left(\bar g_h,\; \bar D_h,\; \bar\Delta_h\right),$$

where (a) $(\bar g_h, \bar D_h)$ are defined in Lemma 8.2, (b) $\bar\Delta_h$ is the nonrandom function of $h$ and $\bar D_h$ defined in (8.17), (c) $(\bar D_h, \bar\Delta_h)$ and $\bar g_h$ are independent, (d) if Assumption WU holds with $W_F = \Omega_F^{-1/2}$ and $U_F = I_p$, then $\bar\Delta_h$ has full column rank $p$ with probability one, and (e) under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} : n \ge 1\}$ with $\lambda_{w_n,h} \in \Lambda_*$, the convergence result above and the results of parts (a)-(d) hold with $n$ replaced with $w_n$.
The proof of part (d) of Lemma 8.3 uses the following two lemmas and corollary.
Lemma 15.1 Suppose $\xi \in R^{k \times p}$ has a multivariate normal distribution (with possibly singular variance matrix), $k \ge p$, and the variance matrix of $\xi\lambda \in R^k$ has rank at least $p$ for all nonrandom vectors $\lambda \in R^p$ with $||\lambda|| = 1$. Then, $P(\xi$ has full column rank $p) = 1$.

Comments: (i) Let Condition $\ast$ denote the condition of the lemma on the variance of $\xi\lambda$. A sufficient condition for Condition $\ast$ is that $\mathrm{vec}(\xi)$ has a pd variance matrix (because $\xi\lambda = (\lambda' \otimes I_k)\mathrm{vec}(\xi)$). The converse is not true. This is proved in Comment (iii) below.

(ii) A weaker sufficient condition for Condition $\ast$ is that the variance matrix of $\xi\lambda \in R^k$ has rank $k$ for all constant vectors $\lambda \in R^p$ with $||\lambda|| = 1$. The latter condition holds iff $\mathrm{Var}(b'\xi\lambda) > 0$ for all $b \in R^k$ and $\lambda \in R^p$ with $||b|| = 1$ and $||\lambda|| = 1$, i.e., iff $\mathrm{Var}(c'\mathrm{vec}(\xi)) > 0$ for all $c \in R^{pk}$ of the form $c = \lambda \otimes b$ for some $\lambda \in R^p$ and $b \in R^k$ with $||\lambda|| = 1$ and $||b|| = 1$ (because $(\lambda' \otimes b')\mathrm{vec}(\xi) = b'\xi\lambda$). In contrast, $\mathrm{vec}(\xi)$ has a pd variance matrix iff $\mathrm{Var}(c'\mathrm{vec}(\xi)) > 0$ for all $c \in R^{pk}$ with $||c|| = 1$.
(iii) For example, the following matrix $\xi$ satisfies the sufficient condition given in Comment (ii) for Condition $\ast$ (and hence Condition $\ast$ holds), but not the sufficient condition given in Comment (i). Let $Z_j$ for $j = 1, 2, 3$ be independent standard normal random variables. Define

$$\xi = \begin{pmatrix} Z_1 & Z_2 \\ Z_3 & Z_1 \end{pmatrix}. \quad (15.1)$$

Obviously, $\mathrm{Var}(\mathrm{vec}(\xi))$ is not pd. On the other hand, writing $b = (b_1, b_2)'$ and $\lambda = (\lambda_1, \lambda_2)'$, we have

$$b'\xi\lambda = b_1[Z_1\lambda_1 + Z_2\lambda_2] + b_2[Z_3\lambda_1 + Z_1\lambda_2] = (b_1\lambda_1 + b_2\lambda_2)Z_1 + b_1\lambda_2 Z_2 + b_2\lambda_1 Z_3 \;\text{ and}$$
$$\mathrm{Var}(b'\xi\lambda) = (b_1\lambda_1 + b_2\lambda_2)^2 + (b_1\lambda_2)^2 + (b_2\lambda_1)^2. \quad (15.2)$$

Now, $b_1\lambda_2 = 0$ implies $b_1 = 0$ or $\lambda_2 = 0$, $b_2\lambda_1 = 0$ implies $b_2 = 0$ or $\lambda_1 = 0$, etc. So, the two cases where $(b_1\lambda_2, b_2\lambda_1) = (0, 0)$ are: $(b_1, \lambda_1) = (0, 0)$ and $(b_2, \lambda_2) \ne (0, 0)$, and $(b_2, \lambda_2) = (0, 0)$ and $(b_1, \lambda_1) \ne (0, 0)$. But, $(b_1, \lambda_1) = (0, 0)$ implies $(b_1\lambda_1 + b_2\lambda_2)^2 = (b_2\lambda_2)^2 > 0$ and $(b_2, \lambda_2) = (0, 0)$ implies $(b_1\lambda_1 + b_2\lambda_2)^2 = (b_1\lambda_1)^2 > 0$. Hence, $\mathrm{Var}(b'\xi\lambda) > 0$ for all $b$ and $\lambda \in R^2$ with $||b||^2 = ||\lambda||^2 = 1$, and the sufficient condition given in Comment (ii) for Condition $\ast$ holds.

(iv) Condition $\ast$ allows for redundant rows in $\xi$, which corresponds to redundant moment conditions in the application of Lemma 15.1. Suppose a matrix $\xi$ satisfies Condition $\ast$. One adds one or more rows to $\xi$, which consist of one or more of the existing rows of $\xi$ or some linear combinations of them. (In fact, the added rows can be arbitrary provided the resulting matrix has a multivariate normal distribution.) Call the new matrix $\xi^+$. The matrix $\xi^+$ also satisfies Condition $\ast$ (because the rank of the variance of $\xi^+\lambda$ is at least as large as the rank of the variance of $\xi\lambda$, which is $p$).
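The example in Comment (iii) can be checked numerically. The following sketch (assuming only NumPy; the sample size and the grid of unit vectors are arbitrary choices) verifies that $\mathrm{Var}(\mathrm{vec}(\xi))$ for $\xi$ in (15.1) is singular, that $\mathrm{Var}(b'\xi\lambda)$ in (15.2) is positive on the unit circles, and that simulated draws of $\xi$ have full rank, as Lemma 15.1 predicts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw xi = [[Z1, Z2], [Z3, Z1]] as in (15.1): Z1, Z2, Z3 i.i.d. N(0,1).
z = rng.standard_normal((10_000, 3))
xi = np.zeros((10_000, 2, 2))
xi[:, 0, 0] = z[:, 0]; xi[:, 0, 1] = z[:, 1]
xi[:, 1, 0] = z[:, 2]; xi[:, 1, 1] = z[:, 0]

# Var(vec(xi)) is singular: vec(xi) has 4 entries but only 3 sources of noise.
cov = np.cov(xi.reshape(-1, 4).T)
assert np.linalg.matrix_rank(cov, tol=1e-8) == 3

# Var(b' xi lam) = (b1 l1 + b2 l2)^2 + (b1 l2)^2 + (b2 l1)^2, as in (15.2);
# check positivity on a grid of unit vectors b and lam.
theta = np.linspace(0, np.pi, 50)
min_var = np.inf
for tb in theta:
    b1, b2 = np.cos(tb), np.sin(tb)
    for tl in theta:
        l1, l2 = np.cos(tl), np.sin(tl)
        v = (b1 * l1 + b2 * l2) ** 2 + (b1 * l2) ** 2 + (b2 * l1) ** 2
        min_var = min(min_var, v)
assert min_var > 0  # Condition of Lemma 15.1 holds

# Conclusion of Lemma 15.1: xi has full rank a.s. (det = Z1^2 - Z2*Z3 != 0 a.s.).
dets = np.linalg.det(xi)
assert np.all(np.abs(dets) > 0)
```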
Corollary 15.2 Suppose $\Delta = (\Delta_q, \Delta_{p-q}) \in R^{k \times p}$, where $\Delta_q \in R^{k \times q}$ is a nonrandom matrix with full column rank $q$ and $\Delta_{p-q} \in R^{k \times (p-q)}$ has a multivariate normal distribution (with possibly singular variance matrix) and $k \ge p$. Let $M \in R^{k \times k}$ be a nonsingular matrix such that $M\Delta_q = (e_1, \ldots, e_q)$, where $e_\ell$ denotes the $\ell$-th coordinate vector in $R^k$. Decompose $M = (M_1', M_2')'$ with $M_1 \in R^{q \times k}$ and $M_2 \in R^{(k-q) \times k}$. Suppose the variance matrix of $M_2\Delta_{p-q}\lambda_2 \in R^{k-q}$ has rank at least $p - q$ for all nonrandom vectors $\lambda_2 \in R^{p-q}$ with $||\lambda_2|| = 1$. Then, $P(\Delta$ has full column rank $p) = 1$.
Comment: Corollary 15.2 follows from Lemma 15.1 by the following argument. We have

$$M\Delta = \begin{pmatrix} M_1\Delta_q & M_1\Delta_{p-q} \\ M_2\Delta_q & M_2\Delta_{p-q} \end{pmatrix} = \begin{pmatrix} I_q & M_1\Delta_{p-q} \\ 0^{(k-q) \times q} & M_2\Delta_{p-q} \end{pmatrix}. \quad (15.3)$$

The matrix $\Delta$ has full column rank $p$ iff $M\Delta$ has full column rank $p$ iff $M_2\Delta_{p-q}$ has full column rank $p - q$. The Corollary now follows from Lemma 15.1 applied with $M_2\Delta_{p-q}$, $k - q$, $p - q$, and $\lambda_2$ in place of $\xi$, $k$, $p$, and $\lambda$, respectively.
The following proposition is a special case of Cauchy's interlacing eigenvalues result, e.g., see Hwang (2004). As above, for a symmetric matrix $A$, let $\lambda_1(A) \ge \lambda_2(A) \ge \cdots$ denote the eigenvalues of $A$. Let $A_{-r}$ denote a principal submatrix of $A$ of order $k - r$ for $r \ge 1$. That is, $A_{-r}$ denotes $A$ with some choice of $r$ rows and the same $r$ columns deleted.

Proposition 15.3 Let $A$ be a symmetric $k \times k$ matrix. Then, $\lambda_k(A) \le \lambda_{k-1}(A_{-1}) \le \lambda_{k-1}(A) \le \cdots \le \lambda_2(A) \le \lambda_1(A_{-1}) \le \lambda_1(A)$.

The following is a straightforward corollary of Proposition 15.3.

Corollary 15.4 Let $A$ be a symmetric $k \times k$ matrix and let $r \in \{1, \ldots, k-1\}$. Then, (a) $\lambda_m(A_{-r}) \le \lambda_m(A)$ for $m = 1, \ldots, k - r$ and (b) $\lambda_m(A) \le \lambda_{m-r}(A_{-r})$ for $m = r + 1, \ldots, k$.
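Both parts of Corollary 15.4 are easy to sanity-check numerically. The sketch below (assuming NumPy; the dimensions $k = 6$, $r = 2$ and the choice of deleted rows are arbitrary) verifies the interlacing inequalities for a random symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
k, r = 6, 2

# Random symmetric k x k matrix A and the principal submatrix A_{-r}
# obtained by deleting the same r rows and columns (here: the first r).
M = rng.standard_normal((k, k))
A = (M + M.T) / 2
A_sub = A[r:, r:]

eig = np.sort(np.linalg.eigvalsh(A))[::-1]        # lambda_1(A) >= ... >= lambda_k(A)
eig_sub = np.sort(np.linalg.eigvalsh(A_sub))[::-1]

# Corollary 15.4(a): lambda_m(A_{-r}) <= lambda_m(A) for m = 1, ..., k - r.
assert all(eig_sub[m] <= eig[m] + 1e-12 for m in range(k - r))

# Corollary 15.4(b): lambda_m(A) <= lambda_{m-r}(A_{-r}) for m = r + 1, ..., k
# (0-based indices m-1 and m-r-1).
assert all(eig[m] <= eig_sub[m - r] + 1e-12 for m in range(r, k))
```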
Proof of Lemma 8.3. First, we prove the convergence result in Lemma 8.3. The singular value decomposition of $W_n D_n U_n$ is

$$W_n D_n U_n = C_n \tau_n B_n', \quad (15.4)$$

because $B_n$ is a matrix of eigenvectors of $U_n' D_n' W_n' W_n D_n U_n$, $C_n$ is a matrix of eigenvectors of $W_n D_n U_n U_n' D_n' W_n'$, and $\tau_n$ is the $k \times p$ matrix with the singular values $\{\tau_{jF_n} : j \le p\}$ of $W_n D_n U_n$ on the diagonal (ordered so that $\tau_{jF_n} \ge 0$ is nonincreasing in $j$).

Using (15.4), we get

$$W_n D_n U_n B_{n,q}\,\tau_{n,q}^{-1} = C_n\tau_n B_n' B_{n,q}\,\tau_{n,q}^{-1} = C_n\tau_n \begin{pmatrix} I_q \\ 0^{(p-q) \times q} \end{pmatrix}\tau_{n,q}^{-1} = C_n\begin{pmatrix} I_q \\ 0^{(k-q) \times q} \end{pmatrix} = C_{n,q}, \quad (15.5)$$

where the second equality uses $B_n'B_n = I_p$. Hence, we obtain

$$W_n\hat D_n U_n B_{n,q}\,\tau_{n,q}^{-1} = W_n D_n U_n B_{n,q}\,\tau_{n,q}^{-1} + W_n\, n^{1/2}(\hat D_n - D_n)\,U_n B_{n,q}(n^{1/2}\tau_{n,q})^{-1} = C_{n,q} + o_p(1) \to_p h_{3,q} = \bar\Delta_{h,q}, \quad (15.6)$$

where the second equality uses $n^{1/2}\tau_{jF_n} \to \infty$ for all $j \le q$ (by the definition of $q$ in (8.16)), $W_n = O(1)$ (by the condition $||W_F|| \le M_1 < \infty$ $\forall F \in \mathcal{F}_{WU}$, see (8.5)), $n^{1/2}(\hat D_n - D_n) = O_p(1)$ (by Lemma 8.2), $U_n = O(1)$ (by the condition $||U_F|| \le M_1 < \infty$ $\forall F \in \mathcal{F}_{WU}$, see (8.5)), and $B_{n,q} \to h_{2,q}$ with $||\mathrm{vec}(h_{2,q})|| < \infty$ (by (8.12) using the definitions in (8.17) and (13.1)). The convergence in (15.6) holds by (8.12), (8.17), and (13.1), and the last equality in (15.6) holds by the definition of $\bar\Delta_{h,q}$ in (8.17).
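The algebraic identities (15.4) and (15.5) can be illustrated with a small numerical example (a sketch assuming NumPy; $W$, $D$, $U$ and the dimensions below are arbitrary stand-ins for illustration, not the quantities of AG1):

```python
import numpy as np

rng = np.random.default_rng(2)
k, p, q = 5, 3, 2  # illustrative dimensions with q < p

W = rng.standard_normal((k, k))
D = rng.standard_normal((k, p))
U = rng.standard_normal((p, p))

# SVD  W D U = C tau B'  with C (k x k) and B (p x p) orthogonal, and tau (k x p)
# holding the nonincreasing singular values tau_1 >= ... >= tau_p on its diagonal.
C, sv, Bt = np.linalg.svd(W @ D @ U)
B = Bt.T
tau = np.zeros((k, p)); tau[:p, :p] = np.diag(sv)
assert np.allclose(W @ D @ U, C @ tau @ B.T)

# (15.5): W D U B_q tau_q^{-1} = C_q, the first q columns of C.
B_q, C_q = B[:, :q], C[:, :q]
tau_q_inv = np.diag(1.0 / sv[:q])
assert np.allclose(W @ D @ U @ B_q @ tau_q_inv, C_q)
```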
Using (15.4) again, we have

$$n^{1/2} W_n D_n U_n B_{n,p-q} = n^{1/2} C_n\tau_n B_n' B_{n,p-q} = n^{1/2} C_n\tau_n\begin{pmatrix} 0^{q \times (p-q)} \\ I_{p-q} \end{pmatrix} = C_n\begin{pmatrix} 0^{q \times (p-q)} \\ n^{1/2}\tau_{n,p-q} \\ 0^{(k-p) \times (p-q)} \end{pmatrix} \to h_3\begin{pmatrix} 0^{q \times (p-q)} \\ \mathrm{Diag}\{h_{1,q+1}, \ldots, h_{1,p}\} \\ 0^{(k-p) \times (p-q)} \end{pmatrix} = h_3 h_{1,p-q}, \quad (15.7)$$

where the second equality uses $B_n'B_n = I_p$, the convergence holds by (8.12) using the definitions in (8.17) and (13.2), and the last equality holds by the definition of $h_{1,p-q}$ in (8.17).

Using (15.7) and Lemma 8.2, we get

$$n^{1/2} W_n\hat D_n U_n B_{n,p-q} = n^{1/2} W_n D_n U_n B_{n,p-q} + W_n\, n^{1/2}(\hat D_n - D_n)\,U_n B_{n,p-q} \to_d h_3 h_{1,p-q} + h_{71}\bar D_h h_{81} h_{2,p-q} = \bar\Delta_{h,p-q}, \quad (15.8)$$

where $B_{n,p-q} \to h_{2,p-q}$, $W_n \to h_{71}$, and $U_n \to h_{81}$ by (8.3), (8.12), (8.17), and Assumption WU using the definitions in (13.1), and the last equality holds by the definition of $\bar\Delta_{h,p-q}$ in (8.17).
Equations (15.6) and (15.8) combine to prove

$$n^{1/2} W_n\hat D_n U_n T_n = n^{1/2} W_n\hat D_n U_n B_n S_n = \left(W_n\hat D_n U_n B_{n,q}\,\tau_{n,q}^{-1},\;\; n^{1/2} W_n\hat D_n U_n B_{n,p-q}\right) \to_d (\bar\Delta_{h,q}, \bar\Delta_{h,p-q}) = \bar\Delta_h \quad (15.9)$$

using the definition of $S_n$ in (8.19). The convergence is joint with that in Lemma 8.2 because it just relies on the convergence of $n^{1/2}(\hat D_n - D_n)$, which is part of the former. This establishes the convergence result of Lemma 8.3.

Properties (a) and (b) in Lemma 8.3 hold by definition. Property (c) in Lemma 8.3 holds by Lemma 8.2 and property (b) in Lemma 8.3.
Now, we prove property (d). We have

$$h_{2,p-q}'h_{2,p-q} = \lim B_{n,p-q}'B_{n,p-q} = I_{p-q} \quad\text{and}\quad h_{3,q}'h_{3,q} = \lim C_{n,q}'C_{n,q} = I_q \quad (15.10)$$

because $B_n$ and $C_n$ are orthogonal matrices by (8.6) and (8.7). Hence, if $q = p$, then $\bar\Delta_h = \bar\Delta_{h,q}$ $(= h_{3,q})$, $h_{3,q}'h_{3,q} = I_p$, and $\bar\Delta_h$ has full column rank $p$. Hence, it suffices to consider the case where $q < p$. We prove part (d) for this case by applying Corollary 15.2 with $q = q$, $\Delta_q = \bar\Delta_{h,q}$ $(= h_{3,q})$, $\Delta_{p-q} = \bar\Delta_{h,p-q}$, $\Delta = \bar\Delta_h$, $M = h_3'$, $M_1 = h_{3,q}'$, and $M_2 = h_{3,k-q}'$. Corollary 15.2 gives the desired result that $P(\bar\Delta_h$ has full column rank $p) = 1$. The condition in Corollary 15.2 that "$M\Delta_q = (e_1, \ldots, e_q)$" holds in this case because $h_3'h_{3,q} = (e_1, \ldots, e_q)$. The condition in Corollary 15.2 that "the variance matrix of $M_2\Delta_{p-q}\lambda_2 \in R^{k-q}$ has rank at least $p - q$ for all nonrandom vectors $\lambda_2 \in R^{p-q}$ with $||\lambda_2|| = 1$" in this case becomes "the variance matrix of $h_{3,k-q}'\bar\Delta_{h,p-q}\lambda_2$ has rank at least $p - q$ for all nonrandom vectors $\lambda_2 \in R^{p-q}$ with $||\lambda_2|| = 1$."

It remains to establish the latter property, which is equivalent to

$$\lambda_{p-q}\left(\mathrm{Var}(h_{3,k-q}'\bar\Delta_{h,p-q}\lambda_2)\right) > 0 \;\;\forall\lambda_2 \in R^{p-q} \text{ with } ||\lambda_2|| = 1. \quad (15.11)$$
We have

$$\mathrm{Var}(h_{3,k-q}'\bar\Delta_{h,p-q}\lambda_2) = \mathrm{Var}(h_{3,k-q}'h_{5,g}^{-1/2}\bar D_h h_{2,p-q}\lambda_2)$$
$$= ((h_{2,p-q}\lambda_2)' \otimes (h_{3,k-q}'h_{5,g}^{-1/2}))\,\mathrm{Var}(\mathrm{vec}(\bar D_h))\,((h_{2,p-q}\lambda_2) \otimes (h_{3,k-q}'h_{5,g}^{-1/2})')$$
$$= ((h_{2,p-q}\lambda_2)' \otimes (h_{3,k-q}'h_{5,g}^{-1/2}))\,\Phi_h^{\mathrm{vec}(G_i)}\,((h_{2,p-q}\lambda_2) \otimes (h_{3,k-q}'h_{5,g}^{-1/2})') = \Phi_h^{h_{3,k-q}'h_{5,g}^{-1/2}G_i h_{2,p-q}\lambda_2}, \quad (15.12)$$

where the first equality holds by the definition of $\bar\Delta_{h,p-q}$ in (8.17) and the fact that $h_{71} = h_{5,g}^{-1/2}$ and $h_{81} = I_p$ by the conditions in part (d) of Lemma 8.3, the second and fourth equalities use the general formula $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$, the third equality holds because $\mathrm{vec}(\bar D_h) \sim N(0^{pk}, \Phi_h^{\mathrm{vec}(G_i)})$ by Lemma 8.2, and the fourth equality uses the definition of the variance matrix $\Phi_h^{a_i}$ in (8.15) for an arbitrary random vector $a_i$.
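The Kronecker-product identity $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$ used in the second and fourth equalities of (15.12) can be checked directly (a sketch assuming NumPy and the column-stacking vec convention; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))
C = rng.standard_normal((5, 2))

vec = lambda X: X.flatten(order="F")  # column-stacking vec operator

# vec(ABC) = (C' kron A) vec(B)
lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)
assert np.allclose(lhs, rhs)
```

Sandwiching a variance matrix as in (15.12) follows immediately, since $\mathrm{Var}(Mx) = M\,\mathrm{Var}(x)\,M'$ with $M = (h_{2,p-q}\lambda_2)' \otimes (h_{3,k-q}'h_{5,g}^{-1/2})$.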
Next, we show that $\Phi_h^{h_{3,k-q}'h_{5,g}^{-1/2}G_i h_{2,p-q}\lambda_2}$ equals the expected outer-product matrix $\lim \Phi_{F_n}^{C_{n,k-q}'\Omega_{F_n}^{-1/2}G_i B_{n,p-q}\lambda_2}$. We have

$$\Phi_h^{h_{3,k-q}'h_{5,g}^{-1/2}G_i h_{2,p-q}\lambda_2} = ((h_{2,p-q}\lambda_2)' \otimes (h_{3,k-q}'h_{5,g}^{-1/2}))\,\Phi_h^{\mathrm{vec}(G_i)}\,((h_{2,p-q}\lambda_2) \otimes (h_{3,k-q}'h_{5,g}^{-1/2})')$$
$$= \lim\,((B_{n,p-q}\lambda_2)' \otimes (C_{n,k-q}'\Omega_n^{-1/2}))\,\Phi_{F_n}^{\mathrm{vec}(G_i)}\,((B_{n,p-q}\lambda_2) \otimes (C_{n,k-q}'\Omega_n^{-1/2})')$$
$$= \lim\,\Phi_{F_n}^{C_{n,k-q}'\Omega_n^{-1/2}G_i B_{n,p-q}\lambda_2}, \quad (15.13)$$

where the general formula $\mathrm{vec}(ABC) = (C' \otimes A)\mathrm{vec}(B)$ is used multiple times, the limits exist by the conditions imposed on the sequence $\{\lambda_{n,h} : n \ge 1\}$, the second equality uses $B_{n,p-q} \to h_{2,p-q}$, $C_{n,k-q} \to h_{3,k-q}$, and $\Omega_n^{-1/2} \to h_{5,g}^{-1/2}$, and the third equality uses the definitions of $\Phi_{F_n}^{\mathrm{vec}(G_i)}$ and $\Phi_{F_n}^{a_i}$ given in (3.2) and (8.15), respectively, together with $E_{F_n}\mathrm{vec}(C_{n,k-q}'\Omega_n^{-1/2}G_i B_{n,p-q}) = \mathrm{vec}(C_{n,k-q}'\Omega_n^{-1/2}D_n B_{n,p-q}) = O(n^{-1/2})$, which holds by (15.7) applied with $W_n = \Omega_n^{-1/2}$ and $U_n = I_p$.

We can write $\lim \Phi_{F_n}^{C_{n,k-q}'\Omega_n^{-1/2}G_i B_{n,p-q}\lambda_2}$ as the limit of a subsequence $\{n_m : m \ge 1\}$ of matrices $\Phi_{F_{n_m}}^{C_{n_m,k-j}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-j}\lambda}$ for which $F_{n_m} \in \mathcal{F}_0^j$ for all $m \ge 1$ for some $j = 0, \ldots, q$. It cannot be the case that $j > q$, because if $j > q$, then we obtain a contradiction: $n_m^{1/2}\tau_{jF_{n_m}} \to \infty$ as $m \to \infty$ by the first condition of $\mathcal{F}_0^j$, while $n_m^{1/2}\tau_{jF_{n_m}} \nrightarrow \infty$ as $m \to \infty$ by the definition of $q$ in (8.16).

Now, we fix an arbitrary $j \in \{0, \ldots, q\}$. The continuity of the $\lambda_{p-j}(\cdot)$ function and the $\lambda_{p-j}(\cdot)$ condition in $\mathcal{F}_0^j$ imply that, for all $\lambda \in R^{p-j}$ with $||\lambda|| = 1$,

$$\lambda_{p-j}\left(\lim \Phi_{F_{n_m}}^{C_{n_m,k-j}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-j}\lambda}\right) > 0. \quad (15.14)$$

For all $\lambda_2 \in R^{p-q}$ with $||\lambda_2|| = 1$, let $\lambda = (0^{q-j\prime}, \lambda_2')' \in R^{p-j}$. Then, $B_{n_m,p-j}\lambda = B_{n_m,p-q}\lambda_2$ and, by (15.14),

$$\lambda_{p-j}\left(\lim \Phi_{F_{n_m}}^{C_{n_m,k-j}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-q}\lambda_2}\right) > 0 \;\;\forall\lambda_2 \in R^{p-q} \text{ with } ||\lambda_2|| = 1. \quad (15.15)$$

Next, we apply Corollary 15.4(b) with $A = \lim \Phi_{F_{n_m}}^{C_{n_m,k-j}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-q}\lambda_2}$, $m = p - j$, and $r = q - j$, where $A_{-(q-j)} = \lim \Phi_{F_{n_m}}^{C_{n_m,k-q}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-q}\lambda_2}$ equals $A$ with its first $q - j$ rows and columns deleted in the present case, and $p > q$ implies that $m = p - j \ge r + 1$ for all $j = 0, \ldots, q$. Corollary 15.4 and (15.15) give

$$\lambda_{p-q}\left(\lim \Phi_{F_{n_m}}^{C_{n_m,k-q}'\Omega_{n_m}^{-1/2}G_i B_{n_m,p-q}\lambda_2}\right) > 0 \;\;\forall\lambda_2 \in R^{p-q} \text{ with } ||\lambda_2|| = 1. \quad (15.16)$$
Equations (15.12), (15.13), and (15.16) combine to establish (15.11) and the proof of part (d)
is complete.
Part (e) of the Lemma holds by replacing n by the subsequence value wn throughout the
arguments given above.
Proof of Lemma 15.1. It suffices to show that $P(\xi\lambda = 0^k$ for some $\lambda \in R^p$ with $||\lambda|| = 1) = 0$. For any constant $\delta > 0$, there exists a constant $K_\delta < \infty$ such that $P(||\mathrm{vec}(\xi)|| > K_\delta) \le \delta$.

Given $\varepsilon > 0$, let $\{B(\lambda_s, \varepsilon) : s = 1, \ldots, N_\varepsilon\}$ be a finite cover of $\{\lambda \in R^p : ||\lambda|| = 1\}$, where $||\lambda_s|| = 1$ and $B(\lambda_s, \varepsilon)$ is a ball in $R^p$ centered at $\lambda_s$ of radius $\varepsilon$. It is possible to choose $\{\lambda_s : s = 1, \ldots, N_\varepsilon\}$ such that the number, $N_\varepsilon$, of balls in the cover is of order $\varepsilon^{-p+1}$. That is, $N_\varepsilon \le C_1\varepsilon^{-p+1}$ for some constant $C_1 < \infty$.

Let $\xi_r$ denote the $r$th row of $\xi$ for $r = 1, \ldots, k$ written as a column vector. If $\lambda \in B(\lambda_s, \varepsilon)$, we have

$$||\xi\lambda - \xi\lambda_s|| = \left(\sum_{r=1}^k (\xi_r'(\lambda - \lambda_s))^2\right)^{1/2} \le \left(\sum_{r=1}^k ||\xi_r||^2\,||\lambda - \lambda_s||^2\right)^{1/2} \le \varepsilon\,||\mathrm{vec}(\xi)||, \quad (15.17)$$

where the first inequality holds by the Cauchy-Bunyakovsky-Schwarz inequality. If $\lambda \in B(\lambda_s, \varepsilon)$ and $\xi\lambda = 0^k$, this gives

$$||\xi\lambda_s|| \le \varepsilon\,||\mathrm{vec}(\xi)||. \quad (15.18)$$
Suppose $Z \in R^p$ has a multivariate normal distribution with pd variance matrix. Then, for any $\varepsilon > 0$,

$$P(||Z|| \le \varepsilon) = \int_{\{||z|| \le \varepsilon\}} f_Z(z)\,dz \le \sup_{z \in R^p} f_Z(z)\int_{\{||z|| \le \varepsilon\}} dz \le C_2\varepsilon^p \quad (15.19)$$

for some constant $C_2 < \infty$, where $f_Z(z)$ denotes the density of $Z$ with respect to Lebesgue measure, which exists because the variance matrix of $Z$ is pd, and the inequalities hold because the density of a multivariate normal is bounded and the volume of a sphere in $R^p$ of radius $\varepsilon$ is proportional to $\varepsilon^p$.

For any $\lambda \in R^p$ with $||\lambda|| = 1$, let $B_\lambda\Delta_\lambda B_\lambda'$ be a spectral decomposition of $\mathrm{Var}(\xi\lambda)$, where $\Delta_\lambda$ is the diagonal $k \times k$ matrix with the eigenvalues of $\mathrm{Var}(\xi\lambda)$ on its diagonal in nonincreasing order and $B_\lambda$ is an orthogonal $k \times k$ matrix whose columns are eigenvectors of $\mathrm{Var}(\xi\lambda)$ that correspond to the eigenvalues in $\Delta_\lambda$. By assumption, the rank of $\mathrm{Var}(\xi\lambda)$ is $p$ or larger. In consequence, the first $p$ diagonal elements of $\Delta_\lambda$ are positive. We have $||\xi\lambda|| = ||B_\lambda'\xi\lambda||$ and $\mathrm{Var}(B_\lambda'\xi\lambda) = B_\lambda'\mathrm{Var}(\xi\lambda)B_\lambda = \Delta_\lambda$. Let $(B_\lambda'\xi\lambda)_p$ denote the $p$ vector that contains the first $p$ elements of the $k$ vector $B_\lambda'\xi\lambda$ and let $\Delta_{\lambda,p}$ denote the upper left $p \times p$ submatrix of $\Delta_\lambda$. The matrix $\Delta_{\lambda,p}$ is pd (because the first $p$ diagonal elements of $\Delta_\lambda$ are positive). We have $\mathrm{Var}((B_\lambda'\xi\lambda)_p) = \Delta_{\lambda,p}$.

Now, given any $\delta > 0$ and $\varepsilon > 0$, we have

$$P(\xi\lambda = 0^k \text{ for some } \lambda \in R^p \text{ with } ||\lambda|| = 1) = P\left(\cup_{s=1}^{N_\varepsilon}\{\xi\lambda = 0^k \text{ for some } \lambda \in B(\lambda_s, \varepsilon) \text{ with } ||\lambda|| = 1\}\right)$$
$$\le P\left(\cup_{s=1}^{N_\varepsilon}\{||\xi\lambda_s|| \le \varepsilon||\mathrm{vec}(\xi)||\}\right) \le P\left(\cup_{s=1}^{N_\varepsilon}\{||\xi\lambda_s|| \le \varepsilon||\mathrm{vec}(\xi)||\} \cap \{||\mathrm{vec}(\xi)|| \le K_\delta\}\right) + P(||\mathrm{vec}(\xi)|| > K_\delta)$$
$$\le \sum_{s=1}^{N_\varepsilon} P(||\xi\lambda_s|| \le \varepsilon K_\delta) + \delta \le \sum_{s=1}^{N_\varepsilon} P(||(B_{\lambda_s}'\xi\lambda_s)_p|| \le \varepsilon K_\delta) + \delta$$
$$\le N_\varepsilon C_2 K_\delta^p\varepsilon^p + \delta \le C_1\varepsilon^{-p+1}C_2 K_\delta^p\varepsilon^p + \delta \to \delta \;\text{ as } \varepsilon \to 0, \quad (15.20)$$

where the first inequality holds by (15.18) using $\lambda \in B(\lambda_s, \varepsilon)$, the second inequality uses the definition of $K_\delta$, the fourth inequality holds because $||(B_{\lambda_s}'\xi\lambda_s)_p|| \le ||B_{\lambda_s}'\xi\lambda_s|| = ||\xi\lambda_s||$ using the definitions in the paragraph that follows the paragraph that contains (15.19), the fifth inequality holds by (15.19) with $Z = (B_{\lambda_s}'\xi\lambda_s)_p$ and the fact that the variance matrix of $(B_{\lambda_s}'\xi\lambda_s)_p$ is pd by the argument given in the paragraph following (15.19), and the last inequality holds by the bound given above on $N_\varepsilon$.

Because $\delta > 0$ is arbitrary, (15.20) implies that $P(\xi\lambda = 0^k$ for some $\lambda \in R^p$ with $||\lambda|| = 1) = 0$, which completes the proof. $\square$
16 Proof of Theorem 8.4

Theorem 8.4 of AG1. Suppose Assumption WU holds for some non-empty parameter space $\Lambda_* \subset \Lambda_{WU}$. Under all sequences $\{\lambda_{n,h} : n \ge 1\}$ with $\lambda_{n,h} \in \Lambda_*$,

(a) $\hat\kappa_{pn} \to_p \infty$ if $q = p$,
(b) $\hat\kappa_{pn} \to_d \lambda_{\min}(\bar\Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\bar\Delta_{h,p-q})$ if $q < p$,
(c) $\hat\kappa_{jn} \to_p \infty$ for all $j \le q$,
(d) the (ordered) vector of the smallest $p - q$ eigenvalues of $n\hat U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_n\hat U_n$, i.e., $(\hat\kappa_{(q+1)n}, \ldots, \hat\kappa_{pn})'$, converges in distribution to the (ordered) $p - q$ vector of the eigenvalues of $\bar\Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\bar\Delta_{h,p-q} \in R^{(p-q) \times (p-q)}$,
(e) the convergence in parts (a)-(d) holds jointly with the convergence in Lemma 8.3, and
(f) under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} : n \ge 1\}$ with $\lambda_{w_n,h} \in \Lambda_*$, the results in parts (a)-(e) hold with $n$ replaced with $w_n$.
The proof of Theorem 8.4 uses the following rate of convergence lemma. This lemma is a key
technical contribution of the paper.
Lemma 16.1 Suppose Assumption WU holds for some non-empty parameter space $\Lambda_* \subset \Lambda_{WU}$. Under all sequences $\{\lambda_{n,h} : n \ge 1\}$ with $\lambda_{n,h} \in \Lambda_*$ and for which $q$ defined in (8.16) satisfies $q \ge 1$, we have (a) $\hat\kappa_{jn} \to_p \infty$ for $j = 1, \ldots, q$ and (b) when $p > q$, $\hat\kappa_{jn} = o_p((n^{1/2}\tau_{\ell F_n})^2)$ for all $\ell \le q$ and $j = q + 1, \ldots, p$. Under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} : n \ge 1\}$ with $\lambda_{w_n,h} \in \Lambda_*$, the same result holds with $n$ replaced with $w_n$.
Proof of Lemma 16.1. By the definitions in (8.9) and (8.12), $h_{6,j} := \lim \tau_{(j+1)F_n}/\tau_{jF_n}$ for $j = 1, \ldots, p - 1$. By the definition of $q$ in (8.16), $h_{6,q} = 0$ if $q < p$. If $q = p$, $h_{6,q}$ is not defined by (8.9) and (8.12) and we define it here to equal zero. Because $\tau_{jF_n}$ is nonnegative and nonincreasing in $j$, $h_{6,j} \in [0, 1]$. If $h_{6,j} > 0$, then $\{\tau_{jF_n} : n \ge 1\}$ and $\{\tau_{(j+1)F_n} : n \ge 1\}$ are of the same order of magnitude, i.e., $0 < \lim \tau_{(j+1)F_n}/\tau_{jF_n} \le 1$.[50] We group the first $q$ singular values into groups that have the same order of magnitude within each group. Let $G_h$ $(\in \{1, \ldots, q\})$ denote the number of groups. (We have $G_h \ge 1$ because $q \ge 1$ is assumed in the statement of the lemma.) Note that $G_h$ equals the number of values in $\{h_{6,1}, \ldots, h_{6,q}\}$ that equal zero. Let $r_g$ and $\bar r_g$ denote the indices of the first and last singular values, respectively, in the $g$th group for $g = 1, \ldots, G_h$. Thus, $r_1 = 1$, $\bar r_g = r_{g+1} - 1$, where $r_{G_h+1}$ is defined to equal $q + 1$, and $\bar r_{G_h} = q$. Note that $r_g$ and $\bar r_g$ depend on $h$. By definition, the singular values in the $g$th group, which have the $g$th largest order of magnitude, are $\{\tau_{r_gF_n} : n \ge 1\}, \ldots, \{\tau_{\bar r_gF_n} : n \ge 1\}$. By construction, $h_{6,j} > 0$ for all $j \in \{r_g, \ldots, \bar r_g - 1\}$ for $g = 1, \ldots, G_h$. (The reason is: if $h_{6,j}$ is equal to zero for some $j \in \{r_g, \ldots, \bar r_g - 1\}$, then $\{\tau_{(j+1)F_n} : n \ge 1\}$ is of smaller order of magnitude than $\{\tau_{r_gF_n} : n \ge 1\}$, which contradicts the definition of $\bar r_g$.) Also by construction, $\lim \tau_{j'F_n}/\tau_{jF_n} = 0$ for any $(j, j')$ in groups $(g, g')$, respectively, with $g < g'$. Note that when $p = 1$ we have $G_h = 1$ and $r_1 = \bar r_1 = 1$.

[Footnote 50: Note that $\sup_{j \ge 1, F \in \mathcal{F}_{WU}} \tau_{jF} < \infty$ by the conditions $||W_F|| \le M_1$ and $||U_F|| \le M_1$ in $\mathcal{F}_{WU}$ and the moment conditions in $\mathcal{F}$. Thus, $\{\tau_{jF_n} : n \ge 1\}$ does not diverge to infinity, and the "order of magnitude" of $\{\tau_{jF_n} : n \ge 1\}$ refers to whether this sequence converges to zero, and how slowly or quickly it does, when it does converge to zero.]
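The grouping of the singular values by order of magnitude can be illustrated with a small numerical sketch (assuming NumPy; the decay rates and constants below are hypothetical choices, not quantities from AG1): consecutive $\tau$'s with the same decay rate give $h_{6,j} > 0$ and fall in one group, while a strictly faster decay makes $h_{6,j} = 0$ and starts a new group.

```python
import numpy as np

# Hypothetical singular-value sequences tau_{jFn} = c_j * n^{-a_j}:
# equal decay rates a_j put consecutive values in the same group (h_{6,j} > 0),
# a strictly faster decay starts a new group (h_{6,j} = 0).
c = np.array([2.0, 1.0, 3.0, 1.0])
a = np.array([0.1, 0.1, 0.6, 0.6])   # q = 4 singular values, two decay rates
n = 10.0 ** np.arange(2, 9)

tau = c[None, :] * n[:, None] ** (-a[None, :])
h6 = (tau[:, 1:] / tau[:, :-1])[-1]  # ratios at the largest n approximate the limits

# h_{6,1} = 1/2 > 0 (same group), h_{6,2} -> 0 (new group), h_{6,3} = 1/3 > 0.
groups_start = [0] + [j + 1 for j in range(3) if h6[j] < 1e-3]
assert groups_start == [0, 2]        # G_h = 2 groups: {tau_1, tau_2} and {tau_3, tau_4}
```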
The eigenvalues $\{\hat\kappa_{jn} : j \le p\}$ of $n\hat U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_n\hat U_n$ are solutions to the determinantal equation $|n\hat U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_n\hat U_n - \kappa I_p| = 0$. Equivalently, by multiplying this equation by $\tau_{r_1F_n}^{-2}n^{-1}|B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_n|$, they are solutions to

$$\left|\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n - \kappa(n^{1/2}\tau_{r_1F_n})^{-2}B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_n\right| = 0 \quad (16.1)$$

wp$\to$1, using $|A_1A_2| = |A_1|\,|A_2|$ for any conformable square matrices $A_1$ and $A_2$, $|B_n| > 0$, $|U_n| > 0$ (by the conditions in $\mathcal{F}_{WU}$ in (8.5) because $\lambda_{n,h} \in \Lambda_*$ and $\Lambda_*$ only contains distributions in $\mathcal{F}_{WU}$), $|\hat U_n^{-1}| > 0$ wp$\to$1 (because $\hat U_n \to_p h_{81}$ by (8.2), (8.12), (8.17), and Assumption WU(b) and (c) and $h_{81}$ is pd), and $\tau_{r_1F_n} > 0$ for $n$ large (because $n^{1/2}\tau_{r_1F_n} \to \infty$ for $r_1 \le q$). (For simplicity, we omit the qualifier wp$\to$1 from some statements below.) Thus, $\{(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} : j \le p\}$ solve $|\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n - \kappa(I_p + \hat A_n)| = 0$ or, equivalently,

$$\left|(I_p + \hat A_n)^{-1/2}\,\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n\,(I_p + \hat A_n)^{-1/2} - \kappa I_p\right| = 0, \text{ where}$$
$$\hat A_n = \begin{bmatrix} \hat A_{1n} & \hat A_{2n} \\ \hat A_{2n}' & \hat A_{3n} \end{bmatrix} := B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_n - I_p \quad (16.2)$$

for $\hat A_{1n} \in R^{\bar r_1 \times \bar r_1}$, $\hat A_{2n} \in R^{\bar r_1 \times (p-\bar r_1)}$, and $\hat A_{3n} \in R^{(p-\bar r_1) \times (p-\bar r_1)}$, and the second line is obtained by multiplying the first line by $|(I_p + \hat A_n)^{-1}|$.
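The equivalence between the eigenvalue problem for $n\hat U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_n\hat U_n$ and the rescaled determinantal equation (16.1) is a finite-sample algebraic fact and can be checked numerically. The sketch below assumes NumPy; all matrices are random stand-ins for the estimators, and $\tau$ is an arbitrary positive scalar playing the role of $\tau_{r_1F_n}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, p = 100, 5, 3

# Random stand-ins (for illustration only) for the estimators in (16.1).
Dhat = rng.standard_normal((k, p))
What = rng.standard_normal((k, k))
Uhat = rng.standard_normal((p, p))
B = np.linalg.qr(rng.standard_normal((p, p)))[0]  # orthogonal, the role of B_n
U = rng.standard_normal((p, p))                    # nonsingular, the role of U_n
tau = 0.1                                          # the role of tau_{r_1 F_n} > 0

# kappa_hat_jn: eigenvalues of n Uhat' Dhat' What' What Dhat Uhat.
M = n * Uhat.T @ Dhat.T @ What.T @ What @ Dhat @ Uhat
kappa = np.sort(np.linalg.eigvalsh(M))[::-1]

# (16.1): the same kappa solve |L - kappa R| = 0 for the rescaled pencil below,
# i.e., they are the eigenvalues of R^{-1} L.
L = tau**-2 * B.T @ U.T @ Dhat.T @ What.T @ What @ Dhat @ U @ B
Ui = np.linalg.inv(Uhat)
R = (n**0.5 * tau)**-2 * B.T @ U.T @ Ui.T @ Ui @ U @ B
kappa_pencil = np.sort(np.linalg.eigvals(np.linalg.inv(R) @ L).real)[::-1]
assert np.allclose(kappa, kappa_pencil, rtol=1e-6)
```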
We have

$$\tau_{r_1F_n}^{-1}\widehat W_n\hat D_nU_nB_n = \tau_{r_1F_n}^{-1}(\widehat W_nW_n^{-1})W_nD_nU_nB_n + (\widehat W_nW_n^{-1})W_n\,n^{1/2}(\hat D_n - D_n)U_nB_n\,(n^{1/2}\tau_{r_1F_n})^{-1}$$
$$= \tau_{r_1F_n}^{-1}(\widehat W_nW_n^{-1})C_n\tau_n + O_p((n^{1/2}\tau_{r_1F_n})^{-1})$$
$$= (I_k + o_p(1))\,C_n\begin{bmatrix} h_{6,\bar r_1} + o(1) & 0^{\bar r_1 \times (p-\bar r_1)} \\ 0^{(p-\bar r_1) \times \bar r_1} & O(\tau_{r_2F_n}/\tau_{r_1F_n})^{(p-\bar r_1) \times (p-\bar r_1)} \\ 0^{(k-p) \times \bar r_1} & 0^{(k-p) \times (p-\bar r_1)} \end{bmatrix} + O_p((n^{1/2}\tau_{r_1F_n})^{-1})$$
$$\to_p h_3\begin{bmatrix} h_{6,\bar r_1} & 0^{\bar r_1 \times (p-\bar r_1)} \\ 0^{(k-\bar r_1) \times \bar r_1} & 0^{(k-\bar r_1) \times (p-\bar r_1)} \end{bmatrix}, \;\text{ where } h_{6,\bar r_1} := \mathrm{Diag}\{1, h_{6,1}, h_{6,1}h_{6,2}, \ldots, \prod_{\ell=1}^{\bar r_1 - 1} h_{6,\ell}\} \in R^{\bar r_1 \times \bar r_1}, \quad (16.3)$$

$h_{6,\bar r_1} := 1$ when $\bar r_1 = 1$, and $O(\tau_{r_2F_n}/\tau_{r_1F_n})^{(p-\bar r_1) \times (p-\bar r_1)}$ denotes a diagonal $(p-\bar r_1) \times (p-\bar r_1)$ matrix whose diagonal elements are $O(\tau_{r_2F_n}/\tau_{r_1F_n})$.[51] The second equality uses (15.4), $\widehat W_n \to_p h_{71}$ (by Assumption WU(a) and (c)), $||h_{71}|| = ||\lim W_n|| < \infty$ (by the conditions in $\mathcal{F}_{WU}$ defined in (8.5)), $n^{1/2}(\hat D_n - D_n) = O_p(1)$ (by Lemma 8.2), $U_n = O(1)$ (by the conditions in $\mathcal{F}_{WU}$), and $B_n = O(1)$ (because $B_n$ is orthogonal). The third equality uses $\widehat W_nW_n^{-1} \to_p I_k$ (because $\widehat W_n \to_p h_{71}$, $h_{71} := \lim W_n$, and $h_{71}$ is pd by the conditions in $\mathcal{F}_{WU}$), $\tau_{jF_n}/\tau_{r_1F_n} = \prod_{\ell=1}^{j-1}(\tau_{(\ell+1)F_n}/\tau_{\ell F_n}) = \prod_{\ell=1}^{j-1} h_{6,\ell} + o(1)$ for $j = 2, \ldots, \bar r_1$, and $\tau_{jF_n}/\tau_{r_1F_n} = O(\tau_{r_2F_n}/\tau_{r_1F_n})$ for $j = r_2, \ldots, p$ (because $\{\tau_{jF_n} : j \le p\}$ are nonincreasing in $j$). The convergence uses $C_n \to h_3$, $\tau_{r_2F_n}/\tau_{r_1F_n} \to 0$ (by the definition of $r_2$), and $n^{1/2}\tau_{r_1F_n} \to \infty$ (by (8.16) because $r_1 \le q$).
Equation (16.3) yields

$$\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n \to_p \begin{bmatrix} h_{6,\bar r_1} & 0 \\ 0 & 0 \end{bmatrix}' h_3'h_3\begin{bmatrix} h_{6,\bar r_1} & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} h_{6,\bar r_1}^2 & 0^{\bar r_1 \times (p-\bar r_1)} \\ 0^{(p-\bar r_1) \times \bar r_1} & 0^{(p-\bar r_1) \times (p-\bar r_1)} \end{bmatrix}, \quad (16.4)$$

where the equality holds because $h_3'h_3 = \lim C_n'C_n = I_k$ using (8.7).

In addition, we have

$$\hat A_n := B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_n - I_p \to_p 0^{p \times p} \quad (16.5)$$

using $\hat U_n^{-1}U_n \to_p I_p$ (because $\hat U_n \to_p h_{81}$ by Assumption WU(b) and (c), $h_{81} := \lim U_n$, and $h_{81}$ is pd by the conditions in $\mathcal{F}_{WU}$), $B_n \to h_2$, and $h_2'h_2 = I_p$ (because $B_n$ is orthogonal for all $n \ge 1$).
The ordered vector of eigenvalues of a matrix is a continuous function of the matrix by Elsner's Theorem, see Stewart (2001, Thm. 3.1, pp. 37-38). Hence, by the second line of (16.2), (16.4), (16.5), and Slutsky's Theorem, the largest $\bar r_1$ eigenvalues of $\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n$ (i.e., $\{(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} : j \le \bar r_1\}$ by the definition of $\hat\kappa_{jn}$) satisfy

$$\left((n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{1n}, \ldots, (n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{\bar r_1n}\right) \to_p \left(1, h_{6,1}^2, h_{6,1}^2h_{6,2}^2, \ldots, \prod_{\ell=1}^{\bar r_1-1} h_{6,\ell}^2\right) \;\text{ and so }\; \hat\kappa_{jn} \to_p \infty \;\;\forall j = 1, \ldots, \bar r_1 \quad (16.6)$$

because $n^{1/2}\tau_{r_1F_n} \to \infty$ (by (8.16) since $r_1 \le q$) and $h_{6,\ell} > 0$ for all $\ell \in \{1, \ldots, \bar r_1 - 1\}$ (as noted above). By the same argument, the smallest $p - \bar r_1$ eigenvalues of $\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n$, i.e., $\{(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} : j = \bar r_1 + 1, \ldots, p\}$, satisfy

$$(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} \to_p 0 \;\;\forall j = \bar r_1 + 1, \ldots, p. \quad (16.7)$$

If $G_h = 1$, (16.6) proves part (a) of the lemma and (16.7) proves part (b) of the lemma (because in this case $\bar r_1 = q$ and $\tau_{r_1F_n}/\tau_{\ell F_n} = O(1)$ for all $\ell \le q$ by the definitions of $q$ and $G_h$). Hence, from here on, we assume that $G_h \ge 2$.

[Footnote 51: For matrices that are written as $O(\cdot)$, we sometimes provide the dimensions of the matrix as superscripts for clarity, and sometimes we do not provide the dimensions for simplicity.]
Next, define $B_{n,j_1,j_2}$ to be the $p \times (j_2 - j_1)$ matrix that consists of the $j_1 + 1, \ldots, j_2$ columns of $B_n$ for $0 \le j_1 < j_2 \le p$. Note that the difference between the two subscripts $j_1$ and $j_2$ equals the number of columns of $B_{n,j_1,j_2}$, which is useful for keeping track of the dimensions of the $B_{n,j_1,j_2}$ matrices that appear below. By definition, $B_n = (B_{n,0,\bar r_1}, B_{n,\bar r_1,p})$.
By (16.3) (excluding the convergence part) applied once with $B_{n,\bar r_1,p}$ in place of $B_n$ as the far-right multiplicand and applied a second time with $B_{n,0,\bar r_1}$ in place of $B_n$ as the far-right multiplicand, we have

$$\varrho_n := \tau_{r_1F_n}^{-2}B_{n,0,\bar r_1}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p}$$
$$= \left(\begin{bmatrix} h_{6,\bar r_1} + o(1) \\ 0^{(k-\bar r_1) \times \bar r_1} \end{bmatrix} + O_p((n^{1/2}\tau_{r_1F_n})^{-1})\right)' C_n'(I_k + o_p(1))C_n\left(\begin{bmatrix} 0^{\bar r_1 \times (p-\bar r_1)} \\ O(\tau_{r_2F_n}/\tau_{r_1F_n})^{(k-\bar r_1) \times (p-\bar r_1)} \end{bmatrix} + O_p((n^{1/2}\tau_{r_1F_n})^{-1})\right)$$
$$= o_p(\tau_{r_2F_n}/\tau_{r_1F_n}) + O_p((n^{1/2}\tau_{r_1F_n})^{-1}), \quad (16.8)$$

where the last equality holds because (i) $C_n'(I_k + o_p(1))C_n = I_k + o_p(1)$, (ii) when $I_k$ appears in place of $C_n'(I_k + o_p(1))C_n$, the first summand on the left-hand side (lhs) of the last equality equals $0^{\bar r_1 \times (p-\bar r_1)}$, and (iii) when $o_p(1)$ appears in place of $C_n'(I_k + o_p(1))C_n$, the first summand on the lhs of the last equality equals an $\bar r_1 \times (p-\bar r_1)$ matrix with elements that are $o_p(\tau_{r_2F_n}/\tau_{r_1F_n})$.
Define

$$\hat b_{1n}(\kappa) := \tau_{r_1F_n}^{-2}B_{n,0,\bar r_1}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,0,\bar r_1} - \kappa(I_{\bar r_1} + \hat A_{1n}) \in R^{\bar r_1 \times \bar r_1},$$
$$\hat b_{2n}(\kappa) := \varrho_n - \kappa\hat A_{2n} \in R^{\bar r_1 \times (p-\bar r_1)}, \;\text{ and}$$
$$\hat b_{3n}(\kappa) := \tau_{r_1F_n}^{-2}B_{n,\bar r_1,p}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p} - \kappa(I_{p-\bar r_1} + \hat A_{3n}) \in R^{(p-\bar r_1) \times (p-\bar r_1)}. \quad (16.9)$$
As in the first line of (16.2), $\{(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} : j \le p\}$ solve

$$0 = \left|\tau_{r_1F_n}^{-2}B_n'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_n - \kappa(I_p + \hat A_n)\right| = \begin{vmatrix} \hat b_{1n}(\kappa) & \hat b_{2n}(\kappa) \\ \hat b_{2n}(\kappa)' & \hat b_{3n}(\kappa) \end{vmatrix}$$
$$= |\hat b_{1n}(\kappa)|\cdot\left|\hat b_{3n}(\kappa) - \hat b_{2n}(\kappa)'\hat b_{1n}^{-1}(\kappa)\hat b_{2n}(\kappa)\right|$$
$$= |\hat b_{1n}(\kappa)|\cdot\Big|\tau_{r_1F_n}^{-2}B_{n,\bar r_1,p}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p} - \varrho_n'\hat b_{1n}^{-1}(\kappa)\varrho_n$$
$$\qquad - \kappa\left(I_{p-\bar r_1} + \hat A_{3n} - \hat A_{2n}'\hat b_{1n}^{-1}(\kappa)\varrho_n - \varrho_n'\hat b_{1n}^{-1}(\kappa)\hat A_{2n} + \kappa\hat A_{2n}'\hat b_{1n}^{-1}(\kappa)\hat A_{2n}\right)\Big|, \quad (16.10)$$

where the third equality uses the standard formula for the determinant of a partitioned matrix[52] and the result given in (16.11) below, which shows that $\hat b_{1n}(\kappa)$ is nonsingular wp$\to$1 for $\kappa$ equal to any solution $(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}$ to the first equality in (16.10) for $j \le p$, and the last equality holds by algebra.

Now we show that, for $j = \bar r_1 + 1, \ldots, p$, $(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}$ cannot solve the determinantal equation $|\hat b_{1n}(\kappa)| = 0$, wp$\to$1, where this determinant is the first multiplicand on the right-hand side (rhs) of (16.10). This implies that $\{(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} : j = \bar r_1 + 1, \ldots, p\}$ must solve the determinantal equation based on the second multiplicand on the rhs of (16.10) wp$\to$1. For $j = \bar r_1 + 1, \ldots, p$, we have

$$\tilde b_{j1n} := \hat b_{1n}((n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}) = \tau_{r_1F_n}^{-2}B_{n,0,\bar r_1}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,0,\bar r_1} - (n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}(I_{\bar r_1} + \hat A_{1n})$$
$$= h_{6,\bar r_1}^2 + o_p(1) - o_p(1)(I_{\bar r_1} + o_p(1)) = h_{6,\bar r_1}^2 + o_p(1), \quad (16.11)$$

where the second last equality holds by (16.4), (16.5), and (16.7). Equation (16.11) and $\lambda_{\min}(h_{6,\bar r_1}^2) > 0$ (which follows from the definition of $h_{6,\bar r_1}$ in (16.3) and the fact that $h_{6,\ell} > 0$ for all $\ell \in \{1, \ldots, \bar r_1 - 1\}$) establish the result stated in the first sentence of this paragraph.
For $j = \bar r_1 + 1, \ldots, p$, plugging $(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}$ into the second multiplicand on the rhs of (16.10) gives

$$0 = \left|\tau_{r_1F_n}^{-2}B_{n,\bar r_1,p}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p} - \varrho_n'\tilde b_{j1n}^{-1}\varrho_n - (n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}(I_{p-\bar r_1} + \hat A_{j2n})\right|, \text{ where}$$
$$\hat A_{j2n} := \hat A_{3n} - \hat A_{2n}'\tilde b_{j1n}^{-1}\varrho_n - \varrho_n'\tilde b_{j1n}^{-1}\hat A_{2n} + (n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn}\hat A_{2n}'\tilde b_{j1n}^{-1}\hat A_{2n}, \quad (16.12)$$

and $\varrho_n'\tilde b_{j1n}^{-1}\varrho_n = O_p((n^{1/2}\tau_{r_1F_n})^{-2}) + o_p((\tau_{r_2F_n}/\tau_{r_1F_n})^2)$ using (16.8) and (16.11). Multiplying (16.12) by $\tau_{r_1F_n}^2/\tau_{r_2F_n}^2$ gives

$$0 = \left|\tau_{r_2F_n}^{-2}B_{n,\bar r_1,p}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p} + o_p(1) - (n^{1/2}\tau_{r_2F_n})^{-2}\hat\kappa_{jn}(I_{p-\bar r_1} + \hat A_{j2n})\right| \quad (16.13)$$

using $O_p((n^{1/2}\tau_{r_2F_n})^{-2}) = o_p(1)$ (because $r_2 \le q$ by the definition of $r_2$ and $n^{1/2}\tau_{jF_n} \to \infty$ for all $j \le q$ by the definition of $q$ in (8.16)).

Thus, $\{(n^{1/2}\tau_{r_2F_n})^{-2}\hat\kappa_{jn} : j = \bar r_1 + 1, \ldots, p\}$ solve

$$0 = \left|\tau_{r_2F_n}^{-2}B_{n,\bar r_1,p}'U_n'\hat D_n'\widehat W_n'\widehat W_n\hat D_nU_nB_{n,\bar r_1,p} + o_p(1) - \kappa(I_{p-\bar r_1} + \hat A_{j2n})\right|. \quad (16.14)$$

For $j = \bar r_1 + 1, \ldots, p$, we have

$$\hat A_{j2n} = o_p(1) \quad (16.15)$$

because $\hat A_{2n} = o_p(1)$ and $\hat A_{3n} = o_p(1)$ (by (16.5)), $\tilde b_{j1n}^{-1} = O_p(1)$ (by (16.11)), $\varrho_n = o_p(1)$ (by (16.8) since $\tau_{r_2F_n} \le \tau_{r_1F_n}$ and $n^{1/2}\tau_{r_1F_n} \to \infty$), and $(n^{1/2}\tau_{r_1F_n})^{-2}\hat\kappa_{jn} = o_p(1)$ for $j = \bar r_1 + 1, \ldots, p$ (by (16.7)).

[Footnote 52: The determinant of the partitioned matrix $\Delta = \begin{bmatrix} \Delta_1 & \Delta_2 \\ \Delta_2' & \Delta_3 \end{bmatrix}$ equals $|\Delta| = |\Delta_1|\cdot|\Delta_3 - \Delta_2'\Delta_1^{-1}\Delta_2|$ provided $\Delta_1$ is nonsingular, e.g., see Rao (1973, p. 32).]
Now, we repeat the argument from (16.2) to (16.15) with the expression in (16.14) replacing that in the first line of (16.2), with (16.15) replacing (16.5), and with $j = r_2+1, \ldots, p$; $\hat A_{j2n}$; $B_{n,p-r_1}$; $\tau_{r_2F_n}$; $\tau_{r_3F_n}$; $r_2 - r_1$; $p - r_2$; and $h_{6,r_2} = \mathrm{Diag}\{1, h_{6,r_1+1}, h_{6,r_1+1}h_{6,r_1+2}, \ldots, \prod_{\ell=r_1+1}^{r_2-1}h_{6,\ell}\} \in R^{(r_2-r_1)\times(r_2-r_1)}$ in place of $j = r_1+1, \ldots, p$; $\hat A_n$; $B_n$; $\tau_{r_1F_n}$; $\tau_{r_2F_n}$; $r_1$; $p - r_1$; and $h_{6,r_1}$, respectively. (The fact that $\hat A_{j2n}$ depends on $j$, whereas $\hat A_n$ does not, does not affect the argument.) In addition, $B_{n,0,r_1}$ and $B_{n,r_1,p}$ in (16.8)-(16.10) are replaced by the matrices $B_{n,r_1,r_2}$ and $B_{n,r_2,p}$ (which consist of the $r_1+1, \ldots, r_2$ columns of $B_n$ and the last $p-r_2$ columns of $B_n$, respectively). This argument gives the analogues of (16.6) and (16.7), which are

$$\hat\kappa_{jn} \to_p \infty \ \forall j = 1, \ldots, r_2 \quad \text{and} \quad (n^{1/2}\tau_{r_2F_n})^{-2}\hat\kappa_{jn} = o_p(1) \ \forall j = r_2+1, \ldots, p. \quad (16.16)$$
In addition, the analogue of (16.14) is the same as (16.14) but with $\hat A_{j3n}$ in place of $\hat A_{j2n}$, where $\hat A_{j3n}$ is defined just as $\hat A_{j2n}$ is defined in (16.12) but with $\hat A_{2j2n}$ and $\hat A_{3j2n}$ in place of $\hat A_{2n}$ and $\hat A_{3n}$, respectively, where

$$\hat A_{j2n} = \begin{bmatrix} \hat A_{1j2n} & \hat A_{2j2n} \\ \hat A_{2j2n}' & \hat A_{3j2n} \end{bmatrix} \quad \text{for } \hat A_{1j2n} \in R^{(r_2-r_1)\times(r_2-r_1)},\ \hat A_{2j2n} \in R^{(r_2-r_1)\times(p-r_2)}, \text{ and } \hat A_{3j2n} \in R^{(p-r_2)\times(p-r_2)}. \quad (16.17)$$

Repeating the argument $G_h - 2$ more times yields

$$\hat\kappa_{jn} \to_p \infty \ \forall j = 1, \ldots, r_{G_h} \quad \text{and} \quad (n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} = o_p(1) \ \forall j = r_g+1, \ldots, p,\ \forall g = 1, \ldots, G_h. \quad (16.18)$$

A formal proof of this "repetition of the argument $G_h - 2$ more times" is given below using induction. Because $r_{G_h} = q$, the first result in (16.18) proves part (a) of the lemma.
The second result in (16.18) with $g = G_h$ implies: for all $j = q+1, \ldots, p$,

$$(n^{1/2}\tau_{r_{G_h}F_n})^{-2}\hat\kappa_{jn} = o_p(1) \quad (16.19)$$

because $r_{G_h} = q$. Either $r_{G_h-1} = r_{G_h} = q$ or $r_{G_h-1} < r_{G_h} = q$. In the former case, $(n^{1/2}\tau_{qF_n})^{-2}\hat\kappa_{jn} = o_p(1)$ for $j = q+1, \ldots, p$ by (16.19). In the latter case, we have

$$\lim \tau_{qF_n}/\tau_{r_{G_h-1}F_n} = \lim \tau_{r_{G_h}F_n}/\tau_{r_{G_h-1}F_n} = \prod_{j=r_{G_h-1}}^{q-1} h_{6,j} > 0, \quad (16.20)$$

where the inequality holds because $h_{6,\ell} > 0$ for all $\ell \in \{r_{G_h-1}, \ldots, r_{G_h} - 1\}$, as noted at the beginning of the proof. Hence, in this case too, $(n^{1/2}\tau_{qF_n})^{-2}\hat\kappa_{jn} = o_p(1)$ for $j = q+1, \ldots, p$ by (16.19) and (16.20). Because $\tau_{qF_n} \ge \tau_{\ell F_n}$ for all $\ell \ge q$, this establishes part (b) of the lemma.
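The grouping of the singular values by convergence rate that underlies $r_1 < r_2 < \cdots < r_{G_h}$ can be illustrated numerically. The decay rates below are hypothetical choices for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical singular values tau_{jFn} (p = 5) decaying at different rates:
# within a group, consecutive ratios tau_{(j+1)Fn}/tau_{jFn} stay bounded away
# from zero (h_{6,j} > 0); across group boundaries the ratio tends to zero,
# which is what defines the boundaries r_1 < r_2 < ... in the induction.
n = 10**12
tau = np.array([1.0, 0.5, n**-0.25, 0.3 * n**-0.25, n**-0.5])
ratios = tau[1:] / tau[:-1]
# a boundary occurs after (1-based) index j when the ratio is negligible
r = [j + 1 for j, ratio in enumerate(ratios) if ratio < 1e-2]
```

Here the first group is $\{1, 2\}$ and the second is $\{3, 4\}$, so $r_1 = 2$ and $r_2 = 4$ for this hypothetical sequence.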
Now we establish by induction the results given in (16.18) that are obtained heuristically by "repeating the argument $G_h - 2$ more times." The induction proof shows that subtleties arise when establishing the asymptotic negligibility of certain terms.

Let $o_{gp}$ denote a symmetric $(p-r_{g-1})\times(p-r_{g-1})$ matrix whose $(\ell, m)$ element for $\ell, m = 1, \ldots, p-r_{g-1}$ is $o_p(\tau_{(r_{g-1}+\ell)F_n}\tau_{(r_{g-1}+m)F_n}/\tau_{r_gF_n}^2) + O_p((n^{1/2}\tau_{r_gF_n})^{-2})$. Note that $o_{gp} = o_p(1)$ because $\tau_{(r_{g-1}+\ell)F_n}/\tau_{r_gF_n} = O(1)$ for $\ell \le r_g - r_{g-1}$ and $\tau_{(r_{g-1}+\ell)F_n}/\tau_{r_gF_n} \le 1$ for $\ell > r_g - r_{g-1}$ (since $\tau_{jF_n}$ is nonincreasing in $j$) and $n^{1/2}\tau_{r_gF_n} \to \infty$ for $g = 1, \ldots, G_h$.

We now show by induction over $g = 1, \ldots, G_h$ that wp→1 $\{(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} : j = r_{g-1}+1, \ldots, p\}$ solve

$$0 = \big|\tau_{r_gF_n}^{-2}B_{n,r_{g-1},p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_{g-1},p} + o_{gp} - \lambda(I_{p-r_{g-1}} + \hat A_{jgn})\big| \quad (16.21)$$

for some symmetric matrices $\hat A_{jgn} = o_p(1)$ and $o_{gp}$ of dimension $(p-r_{g-1})\times(p-r_{g-1})$ (where the matrices that are $o_{gp}$ may depend on $j$).

The initiation step of the induction proof holds because (16.21) holds with $g = 1$ by the first line of (16.2) with $\hat A_{jgn} := \hat A_n$ and $o_{gp} = 0$ for $g = 1$ (and using the fact that, for $g = 1$, $r_{g-1} = r_0 := 0$ and $B_{n,r_{g-1},p} = B_{n,0,p} = B_n$).
For the induction step of the proof, we assume that (16.21) holds for some $g \in \{1, \ldots, G_h - 1\}$ and show that it then also holds for $g+1$. By an argument analogous to that in (16.3), we have

$$\tau_{r_gF_n}^{-1}\hat W_n\hat D_nU_nB_{n,r_{g-1},p} = (I_k + o_p(1))C_n\begin{bmatrix} \mathrm{Diag}\{\tau_{(r_{g-1}+1)F_n}, \ldots, \tau_{pF_n}\}/\tau_{r_gF_n} \\ 0^{(k-p+r_{g-1})\times(p-r_{g-1})} \end{bmatrix} + O_p((n^{1/2}\tau_{r_gF_n})^{-1})$$
$$\to_p h_3\left(\begin{bmatrix} h_{6,r_g} \\ 0^{(k-r_g+r_{g-1})\times(r_g-r_{g-1})} \end{bmatrix},\ 0^{k\times(p-r_g)}\right), \quad (16.22)$$

where $h_{6,r_g} := \mathrm{Diag}\{1, h_{6,r_{g-1}+1}, h_{6,r_{g-1}+1}h_{6,r_{g-1}+2}, \ldots, \prod_{j=r_{g-1}+1}^{r_g-1}h_{6,j}\} \in R^{(r_g-r_{g-1})\times(r_g-r_{g-1})}$, and $h_{6,r_g} := 1$ when $r_g - r_{g-1} = 1$.
Equation (16.22) and $h_3'h_3 = \lim C_n'C_n = I_k$ yield

$$\tau_{r_gF_n}^{-2}B_{n,r_{g-1},p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_{g-1},p} \to_p \begin{bmatrix} h_{6,r_g}^2 & 0^{(r_g-r_{g-1})\times(p-r_g)} \\ 0^{(p-r_g)\times(r_g-r_{g-1})} & 0^{(p-r_g)\times(p-r_g)} \end{bmatrix}. \quad (16.23)$$

By (16.21) and $o_{gp} = o_p(1)$, we have wp→1 $\{(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} : j = r_{g-1}+1, \ldots, p\}$ solve $|\tau_{r_gF_n}^{-2}B_{n,r_{g-1},p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_{g-1},p} + o_p(1) - \lambda(I_{p-r_{g-1}} + \hat A_{jgn})| = 0$. Hence, by (16.23), $\hat A_{jgn} = o_p(1)$ (which holds by the induction assumption), and the same argument as used to establish (16.6) and (16.7), we obtain

$$\hat\kappa_{jn} \to_p \infty \ \forall j = r_{g-1}+1, \ldots, r_g \quad \text{and} \quad (n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} \to_p 0 \ \forall j = r_g+1, \ldots, p. \quad (16.24)$$

Let $\bar o_{gp}$ denote an $(r_g - r_{g-1})\times(p-r_g)$ matrix whose elements in column $j$ for $j = 1, \ldots, p-r_g$ are $o_p(\tau_{(r_g+j)F_n}/\tau_{r_gF_n}) + O_p((n^{1/2}\tau_{r_gF_n})^{-1})$. Note that $\bar o_{gp} = o_p(1)$.
By (16.22) applied once with $B_{n,r_g,p}$ in place of $B_{n,r_{g-1},p}$ as the far-right multiplicand and applied a second time with $B_{n,r_{g-1},r_g}$ in place of $B_{n,r_{g-1},p}$ as the far-right multiplicand, we have

$$\varrho_{gn} := \tau_{r_gF_n}^{-2}B_{n,r_{g-1},r_g}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p}$$
$$= \begin{bmatrix} 0^{r_{g-1}\times(r_g-r_{g-1})} \\ \mathrm{Diag}\{\tau_{(r_{g-1}+1)F_n}, \ldots, \tau_{r_gF_n}\}/\tau_{r_gF_n} \\ 0^{(k-r_g)\times(r_g-r_{g-1})} \end{bmatrix}' C_n'(I_k + o_p(1))C_n\begin{bmatrix} 0^{r_g\times(p-r_g)} \\ \mathrm{Diag}\{\tau_{(r_g+1)F_n}, \ldots, \tau_{pF_n}\}/\tau_{r_gF_n} \\ 0^{(k-p)\times(p-r_g)} \end{bmatrix} + O_p((n^{1/2}\tau_{r_gF_n})^{-1}) = \bar o_{gp}, \quad (16.25)$$

where $\varrho_{gn} \in R^{(r_g-r_{g-1})\times(p-r_g)}$, $\mathrm{Diag}\{\tau_{(r_{g-1}+1)F_n}, \ldots, \tau_{r_gF_n}\}/\tau_{r_gF_n} = h_{6,r_g} + o(1) = O(1)$, and the last equality holds because (i) $C_n'(I_k + o_p(1))C_n = I_k + o_p(1)$, (ii) when $I_k$ appears in place of $C_n'(I_k + o_p(1))C_n$, then the contribution from the first summand on the lhs of the last equality in (16.25) equals $0^{(r_g-r_{g-1})\times(p-r_g)}$, and (iii) when $o_p(1)$ appears in place of $C_n'(I_k + o_p(1))C_n$, the contribution from the first summand on the lhs of the last equality in (16.25) equals an $\bar o_{gp}$ matrix.
We partition the $(p-r_{g-1})\times(p-r_{g-1})$ matrices $o_{gp}$ and $\hat A_{jgn}$ as follows:

$$o_{gp} = \begin{pmatrix} o_{1gp} & o_{2gp} \\ o_{2gp}' & o_{3gp} \end{pmatrix} \quad \text{and} \quad \hat A_{jgn} = \begin{bmatrix} \hat A_{1jgn} & \hat A_{2jgn} \\ \hat A_{2jgn}' & \hat A_{3jgn} \end{bmatrix}, \quad (16.26)$$

where $o_{1gp}, \hat A_{1jgn} \in R^{(r_g-r_{g-1})\times(r_g-r_{g-1})}$; $o_{2gp}, \hat A_{2jgn} \in R^{(r_g-r_{g-1})\times(p-r_g)}$; and $o_{3gp}, \hat A_{3jgn} \in R^{(p-r_g)\times(p-r_g)}$, for $j = r_{g-1}+1, \ldots, p$ and $g = 1, \ldots, G_h$. Define

$$b_{1jgn}(\lambda) := \tau_{r_gF_n}^{-2}B_{n,r_{g-1},r_g}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_{g-1},r_g} + o_{1gp} - \lambda(I_{r_g-r_{g-1}} + \hat A_{1jgn}),$$
$$b_{2jgn}(\lambda) := \varrho_{gn} + o_{2gp} - \lambda\hat A_{2jgn}, \ \text{and}$$
$$b_{3jgn}(\lambda) := \tau_{r_gF_n}^{-2}B_{n,r_g,p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p} + o_{3gp} - \lambda(I_{p-r_g} + \hat A_{3jgn}), \quad (16.27)$$

where $b_{1jgn}(\lambda)$, $b_{2jgn}(\lambda)$, and $b_{3jgn}(\lambda)$ have the same dimensions as $o_{1gp}$, $o_{2gp}$, and $o_{3gp}$, respectively.
From (16.21), we have wp→1 $\{(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} : j = r_{g-1}+1, \ldots, p\}$ solve

$$0 = \big|\tau_{r_gF_n}^{-2}B_{n,r_{g-1},p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_{g-1},p} + o_{gp} - \lambda(I_{p-r_{g-1}} + \hat A_{jgn})\big|$$
$$= |b_{1jgn}(\lambda)|\cdot|b_{3jgn}(\lambda) - b_{2jgn}(\lambda)'b_{1jgn}^{-1}(\lambda)b_{2jgn}(\lambda)|$$
$$= |b_{1jgn}(\lambda)|\cdot\big|\tau_{r_gF_n}^{-2}B_{n,r_g,p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p} + o_{3gp} - (\varrho_{gn}+o_{2gp})'b_{1jgn}^{-1}(\lambda)(\varrho_{gn}+o_{2gp})$$
$$- \lambda[I_{p-r_g} + \hat A_{3jgn} - \hat A_{2jgn}'b_{1jgn}^{-1}(\lambda)(\varrho_{gn}+o_{2gp}) - (\varrho_{gn}+o_{2gp})'b_{1jgn}^{-1}(\lambda)\hat A_{2jgn} + \lambda\hat A_{2jgn}'b_{1jgn}^{-1}(\lambda)\hat A_{2jgn}]\big|, \quad (16.28)$$

where the second equality holds by the same argument as for (16.10) and uses the result given in (16.29) below, which shows that $b_{1jgn}(\lambda)$ is nonsingular wp→1 when $\lambda$ equals $(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}$ for $j = r_g+1, \ldots, p$.

Now we show that, for $j = r_g+1, \ldots, p$, $(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}$ cannot solve the determinantal equation $|b_{1jgn}(\lambda)| = 0$ for $n$ large, where this determinant is the first multiplicand on the rhs of (16.28) and, hence, it must solve the determinantal equation based on the second multiplicand on the rhs of (16.28). For $j = r_g+1, \ldots, p$, we have

$$\tilde e_{1jgn} := b_{1jgn}((n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}) = h_{6,r_g}^2 + o_p(1), \quad (16.29)$$

by the same argument as in (16.11), using $o_{1gp} = o_p(1)$ and $\hat A_{1jgn} = o_p(1)$ (which holds by the definition of $\hat A_{1jgn}$ following (16.21)). Equation (16.29) and $\lambda_{\min}(h_{6,r_g}^2) > 0$ establish the result stated in the first sentence of this paragraph.
For $j = r_g+1, \ldots, p$, plugging $(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}$ into the second multiplicand on the rhs of (16.28) gives

$$0 = \big|\tau_{r_gF_n}^{-2}B_{n,r_g,p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p} + o_{3gp} - (\varrho_{gn}+o_{2gp})'\tilde e_{1jgn}^{-1}(\varrho_{gn}+o_{2gp}) - (n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}(I_{p-r_g} + \hat A_{j(g+1)n})\big|, \text{ where}$$
$$\hat A_{j(g+1)n} := \hat A_{3jgn} - \hat A_{2jgn}'\tilde e_{1jgn}^{-1}(\varrho_{gn}+o_{2gp}) - (\varrho_{gn}+o_{2gp})'\tilde e_{1jgn}^{-1}\hat A_{2jgn} + (n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn}\hat A_{2jgn}'\tilde e_{1jgn}^{-1}\hat A_{2jgn} \quad (16.30)$$

and $\hat A_{j(g+1)n} \in R^{(p-r_g)\times(p-r_g)}$. The last two summands on the rhs of the first line of (16.30) satisfy

$$o_{3gp} - (\varrho_{gn}+o_{2gp})'\tilde e_{1jgn}^{-1}(\varrho_{gn}+o_{2gp}) = o_{3gp} - (\bar o_{gp}+o_{2gp})'(h_{6,r_g}^{-2}+o_p(1))(\bar o_{gp}+o_{2gp}) = o_{3gp} - \bar o_{gp}'\bar o_{gp}$$
$$= o_{3gp} + (\tau_{r_{g+1}F_n}^2/\tau_{r_gF_n}^2)o_{(g+1)p} = (\tau_{r_{g+1}F_n}^2/\tau_{r_gF_n}^2)o_{(g+1)p}, \quad (16.31)$$
where (i) the first equality uses (16.25) and (16.29), (ii) the second equality uses $o_{2gp} = \bar o_{gp}$ (which holds because the $(j,m)$ element of $o_{2gp}$ for $j = 1, \ldots, r_g-r_{g-1}$ and $m = 1, \ldots, p-r_g$ is $o_p(\tau_{(r_{g-1}+j)F_n}\tau_{(r_g+m)F_n}/\tau_{r_gF_n}^2) + O_p((n^{1/2}\tau_{r_gF_n})^{-2}) = o_p(\tau_{(r_g+m)F_n}/\tau_{r_gF_n}) + O_p((n^{1/2}\tau_{r_gF_n})^{-1})$, since $\tau_{(r_{g-1}+j)F_n}/\tau_{r_gF_n} = O(1)$ for $j \le r_g - r_{g-1}$) and $(h_{6,r_g}^{-2}+o_p(1))\bar o_{gp} = \bar o_{gp}$ (which holds because $h_{6,r_g}$ is diagonal and $\lambda_{\min}(h_{6,r_g}^2) > 0$), (iii) the third equality uses the fact that the $(j,m)$ element of $(\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2)\bar o_{gp}'\bar o_{gp}$ for $j, m = 1, \ldots, p-r_g$ is the sum of a term that is $o_p(\tau_{(r_g+j)F_n}\tau_{(r_g+m)F_n}/\tau_{r_gF_n}^2)(\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2) = o_p(\tau_{(r_g+j)F_n}\tau_{(r_g+m)F_n}/\tau_{r_{g+1}F_n}^2)$ and a term that is $O_p((n^{1/2}\tau_{r_gF_n})^{-2})(\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2) = O_p((n^{1/2}\tau_{r_{g+1}F_n})^{-2})$ and, hence, $(\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2)\bar o_{gp}'\bar o_{gp}$ is $o_{(g+1)p}$ (using the definition of $o_{(g+1)p}$), and (iv) the last equality uses the fact that the $(j,m)$ element of $(\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2)o_{3gp}$ for $j, m = 1, \ldots, p-r_g$ is $o_p(\tau_{(r_g+j)F_n}\tau_{(r_g+m)F_n}/\tau_{r_{g+1}F_n}^2) + O_p((n^{1/2}\tau_{r_{g+1}F_n})^{-2})$, which again is the same order as the $(j,m)$ element of $o_{(g+1)p}$ (using $\tau_{r_gF_n}/\tau_{r_{g+1}F_n} \ge 1$).

The calculations in (16.31) are a key part of the induction proof. The definitions of the terms $o_{gp}$ and $\bar o_{gp}$ (given preceding (16.21) and (16.25), respectively) are chosen so that the results in (16.31) hold.
For $j = r_g+1, \ldots, p$, we have

$$\hat A_{j(g+1)n} = o_p(1), \quad (16.32)$$

because $\hat A_{2jgn} = o_p(1)$ and $\hat A_{3jgn} = o_p(1)$ by (16.21), $\tilde e_{1jgn}^{-1} = O_p(1)$ (by (16.29)), $\varrho_{gn} + o_{2gp} = o_p(1)$ (by (16.25) since $\bar o_{gp} = o_p(1)$), and $(n^{1/2}\tau_{r_gF_n})^{-2}\hat\kappa_{jn} = o_p(1)$ (by (16.24)).

Inserting (16.31) and (16.32) into (16.30) and multiplying by $\tau_{r_gF_n}^2/\tau_{r_{g+1}F_n}^2$ gives

$$0 = \big|\tau_{r_{g+1}F_n}^{-2}B_{n,r_g,p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p} + o_{(g+1)p} - (n^{1/2}\tau_{r_{g+1}F_n})^{-2}\hat\kappa_{jn}(I_{p-r_g} + \hat A_{j(g+1)n})\big|. \quad (16.33)$$

Thus, wp→1, $\{(n^{1/2}\tau_{r_{g+1}F_n})^{-2}\hat\kappa_{jn} : j = r_g+1, \ldots, p\}$ solve

$$0 = \big|\tau_{r_{g+1}F_n}^{-2}B_{n,r_g,p}'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_{n,r_g,p} + o_{(g+1)p} - \lambda(I_{p-r_g} + \hat A_{j(g+1)n})\big|. \quad (16.34)$$

This establishes the induction step and concludes the proof that (16.21) holds for all $g = 1, \ldots, G_h$.

Finally, given that (16.21) holds for all $g = 1, \ldots, G_h$, (16.24) gives the results stated in (16.18), and (16.18) gives the results stated in the Lemma by the argument in (16.18)-(16.20).
Now we use the approach in Johansen (1991, pp. 1569-1571) and Robin and Smith (2000, pp. 172-173) to prove Theorem 8.4. In these papers, asymptotic results are established under a fixed true distribution under which certain population eigenvalues are either positive or zero. Here we need to deal with drifting sequences of distributions under which these population eigenvalues may be positive or zero for any given $n$, but the positive ones may drift to zero as $n \to \infty$, possibly at different rates. This complicates the proof. In particular, the rate of convergence result of Lemma 16.1(b) is needed in the present context, but not in the fixed distribution scenario considered in Johansen (1991) and Robin and Smith (2000).
Proof of Theorem 8.4. Theorem 8.4(a) and (c) follow immediately from Lemma 16.1(a).

Next, we assume $q < p$ and we prove part (b). The eigenvalues $\{\hat\kappa_{jn} : j \le p\}$ of $n\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_n$ are the ordered solutions to the determinantal equation $|n\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_n - \lambda I_p| = 0$. Equivalently, with probability that goes to one (wp→1), they are the solutions to

$$|Q_n(\lambda)| = 0, \text{ where } Q_n(\lambda) := nS_nB_n'U_n'\hat D_n'\hat W_n'\hat W_n\hat D_nU_nB_nS_n - \lambda S_n'B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_nS_n, \quad (16.35)$$

because $|S_n| > 0$, $|B_n| > 0$, $|U_n| > 0$, and $|\hat U_n| > 0$ wp→1. Thus, $\lambda_{\min}(n\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_n)$ equals the smallest solution, $\hat\kappa_{pn}$, to $|Q_n(\lambda)| = 0$ wp→1. (For simplicity, we omit the qualifier wp→1 that applies to several statements below.)

We write $Q_n(\lambda)$ in partitioned form using

$$B_nS_n = (B_{n,q}S_{n,q}, B_{n,p-q}), \text{ where } S_{n,q} := \mathrm{Diag}\{(n^{1/2}\tau_{1F_n})^{-1}, \ldots, (n^{1/2}\tau_{qF_n})^{-1}\} \in R^{q\times q}. \quad (16.36)$$
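The step from the original determinantal equation to (16.35), which multiplies the equation by fixed nonsingular matrices on each side, can be sanity-checked numerically. All matrices below are arbitrary stand-ins, not the estimators from the proof:

```python
import numpy as np

# The eigenvalues of n * Uhat'D'W'W D Uhat coincide with the solutions of
# |n S B'U'D'W'W D U B S - lam * S B'U'Uhat^{-1}'Uhat^{-1} U B S| = 0,
# since the pencil equals the original equation pre-multiplied by
# S B'U'Uhat^{-1}' and post-multiplied by its transpose, both nonsingular.
rng = np.random.default_rng(1)
k, p, n = 6, 3, 50
W, D = rng.standard_normal((k, k)), rng.standard_normal((k, p))
U, Uhat = rng.standard_normal((p, p)), rng.standard_normal((p, p))
B = np.linalg.qr(rng.standard_normal((p, p)))[0]   # orthogonal stand-in for B_n
S = np.diag(1.0 / np.sqrt([2.0, 3.0, 5.0]))        # nonsingular diagonal scaling
M = n * Uhat.T @ D.T @ W.T @ W @ D @ Uhat
A1 = n * S @ B.T @ U.T @ D.T @ W.T @ W @ D @ U @ B @ S
Ui = np.linalg.inv(Uhat)
A2 = S @ B.T @ U.T @ Ui.T @ Ui @ U @ B @ S
eig_direct = np.sort(np.linalg.eigvals(M).real)
eig_pencil = np.sort(np.linalg.eigvals(np.linalg.solve(A2, A1)).real)
```

The two eigenvalue sets agree to numerical precision, which is the content of the equivalence claimed at (16.35).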
The convergence result of Lemma 8.3 for $n^{1/2}W_n\hat D_nU_nT_n$ ($= n^{1/2}W_n\hat D_nU_nB_nS_n$) can be written as

$$n^{1/2}W_n\hat D_nU_nB_{n,q}S_{n,q} \to_p \Delta_{h,q} \quad \text{and} \quad n^{1/2}W_n\hat D_nU_nB_{n,p-q} \to_d \Delta_{h,p-q}, \quad (16.37)$$

where $\Delta_{h,q} := h_{3,q}$ and $\Delta_{h,p-q}$ are defined in (8.17). We have

$$\hat W_nW_n^{-1} \to_p I_k \quad \text{and} \quad \hat U_nU_n^{-1} \to_p I_p \quad (16.38)$$

because $\hat W_n \to_p h_{71} := \lim W_n$ (by Assumption WU(a) and (c)), $\hat U_n \to_p h_{81} := \lim U_n$ (by Assumption WU(b) and (c)), and $h_{71}$ and $h_{81}$ are pd (by the conditions in $\mathcal{F}_{WU}$).
By (16.35)-(16.38), we have

$$Q_n(\lambda) = \begin{bmatrix} I_q + o_p(1) & h_{3,q}'n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1) \\ n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'h_{3,q} + o_p(1) & n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'W_nn^{1/2}\hat D_nU_nB_{n,p-q} + o_p(1) \end{bmatrix}$$
$$- \lambda\begin{bmatrix} S_{n,q}^2 & 0^{q\times(p-q)} \\ 0^{(p-q)\times q} & I_{p-q} \end{bmatrix} - \lambda\begin{bmatrix} S_{n,q}A_{1n}S_{n,q} & S_{n,q}A_{2n} \\ A_{2n}'S_{n,q} & A_{3n} \end{bmatrix}, \quad \text{where} \quad (16.39)$$

$$\hat A_n = \begin{bmatrix} A_{1n} & A_{2n} \\ A_{2n}' & A_{3n} \end{bmatrix} := B_n'U_n'\hat U_n^{-1\prime}\hat U_n^{-1}U_nB_n - I_p = o_p(1) \quad \text{for } A_{1n} \in R^{q\times q},\ A_{2n} \in R^{q\times(p-q)},\ \text{and } A_{3n} \in R^{(p-q)\times(p-q)},$$

$\hat A_n$ is defined in (16.39) just as in (16.5), and the first equality uses $\Delta_{h,q} := h_{3,q}$ and $\Delta_{h,q}'\Delta_{h,q} = h_{3,q}'h_{3,q} = \lim C_{n,q}'C_{n,q} = I_q$ (by (8.7), (8.9), (8.12), and (8.17)). Note that $A_{jn}$ and $\hat A_{jn}$ (defined in (16.2)) are not the same in general for $j = 1, 2, 3$, because their dimensions differ. For example, $A_{1n} \in R^{q\times q}$, whereas $\hat A_{1n} \in R^{r_1\times r_1}$.
If $q = 0$ ($< p$), then $B_n = B_{n,p-q}$ and

$$nB_n'\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_nB_n = nB_n'(U_n^{-1}\hat U_n)'B_n^{-1\prime}(W_n\hat D_nU_nB_n)'(\hat W_nW_n^{-1})'(\hat W_nW_n^{-1})(W_n\hat D_nU_nB_n)B_n^{-1}(U_n^{-1}\hat U_n)B_n \to_d \Delta_{h,p-q}'\Delta_{h,p-q}, \quad (16.40)$$

where the convergence holds by (16.37) and (16.38) and $\Delta_{h,p-q}$ is defined as in (8.17) with $q = 0$. The smallest eigenvalue of a matrix is a continuous function of the matrix (by Elsner's Theorem, see Stewart (2001, Thm. 3.1, pp. 37-38)). Hence, the smallest eigenvalue of $nB_n'\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_nB_n$ converges in distribution to the smallest eigenvalue of $\Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\Delta_{h,p-q}$ (using $h_{3,k-q}h_{3,k-q}' = h_3h_3' = I_k$ when $q = 0$), which proves part (b) of Theorem 8.4 when $q = 0$.
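The continuity-of-eigenvalues fact invoked here (Elsner's theorem) is easy to illustrate numerically; for symmetric matrices the even sharper Weyl bound applies. The matrices below are arbitrary:

```python
import numpy as np

# Continuity of ordered eigenvalues in the matrix: an O(eps) perturbation
# moves each ordered eigenvalue by at most O(eps) (for symmetric matrices,
# |lam_i(A+E) - lam_i(A)| <= ||E||_2 by Weyl's inequality).
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)); A = A + A.T
E = rng.standard_normal((4, 4)); E = 1e-8 * (E + E.T)
shift = np.abs(np.sort(np.linalg.eigvalsh(A + E)) - np.sort(np.linalg.eigvalsh(A)))
max_shift = shift.max()
```

Combined with the continuous mapping theorem, this is what converts convergence in distribution of the matrix into convergence in distribution of its (ordered) eigenvalues.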
In the remainder of the proof of part (b), we assume $1 \le q < p$, which is the remaining case to be considered in the proof of part (b). The formula for the determinant of a partitioned matrix and (16.39) give

$$|Q_n(\lambda)| = |Q_{1n}(\lambda)|\cdot|Q_{2n}(\lambda)|, \text{ where}$$
$$Q_{1n}(\lambda) := I_q + o_p(1) - \lambda S_{n,q}^2 - \lambda S_{n,q}A_{1n}S_{n,q},$$
$$Q_{2n}(\lambda) := n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'W_nn^{1/2}\hat D_nU_nB_{n,p-q} + o_p(1) - \lambda[I_{p-q} + A_{3n}]$$
$$- [n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'h_{3,q} + o_p(1) - \lambda A_{2n}'S_{n,q}](I_q + o_p(1) - \lambda S_{n,q}^2 - \lambda S_{n,q}A_{1n}S_{n,q})^{-1}[h_{3,q}'n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1) - \lambda S_{n,q}A_{2n}], \quad (16.41)$$

none of the $o_p(1)$ terms depend on $\lambda$, and the equation in the first line holds provided $Q_{1n}(\lambda)$ is nonsingular.

By Lemma 16.1(b) (which applies for $1 \le q < p$), for $j = q+1, \ldots, p$, we have $\hat\kappa_{jn}S_{n,q}^2 = o_p(1)$ and $\hat\kappa_{jn}S_{n,q}A_{1n}S_{n,q} = o_p(1)$. Thus,

$$Q_{1n}(\hat\kappa_{jn}) = I_q + o_p(1) - \hat\kappa_{jn}S_{n,q}^2 - \hat\kappa_{jn}S_{n,q}A_{1n}S_{n,q} = I_q + o_p(1). \quad (16.42)$$
By (16.35) and (16.41), $|Q_n(\hat\kappa_{jn})| = |Q_{1n}(\hat\kappa_{jn})|\cdot|Q_{2n}(\hat\kappa_{jn})| = 0$ for $j = 1, \ldots, p$. By (16.42), $|Q_{1n}(\hat\kappa_{jn})| \ne 0$ for $j = q+1, \ldots, p$ wp→1. Hence, wp→1,

$$|Q_{2n}(\hat\kappa_{jn})| = 0 \text{ for } j = q+1, \ldots, p. \quad (16.43)$$
Now we plug in $\hat\kappa_{jn}$ for $j = q+1, \ldots, p$ into $Q_{2n}(\lambda)$ in (16.41) and use (16.42). We have

$$Q_{2n}(\hat\kappa_{jn}) = nB_{n,p-q}'U_n'\hat D_n'W_n'W_n\hat D_nU_nB_{n,p-q} + o_p(1)$$
$$- [n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'h_{3,q} + o_p(1)](I_q + o_p(1))[h_{3,q}'n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1)]$$
$$- \hat\kappa_{jn}[I_{p-q} + A_{3n} - (n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'h_{3,q} + o_p(1))(I_q + o_p(1))S_{n,q}A_{2n}$$
$$- A_{2n}'S_{n,q}(I_q + o_p(1))(h_{3,q}'n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1)) + \hat\kappa_{jn}A_{2n}'S_{n,q}(I_q + o_p(1))S_{n,q}A_{2n}]. \quad (16.44)$$

The term in square brackets on the last three lines of (16.44) that multiplies $\hat\kappa_{jn}$ equals

$$I_{p-q} + o_p(1), \quad (16.45)$$

because $A_{3n} = o_p(1)$ (by (16.39)), $n^{1/2}W_n\hat D_nU_nB_{n,p-q} = O_p(1)$ (by (16.37)), $S_{n,q} = o(1)$ (by the definitions of $q$ and $S_{n,q}$ in (8.16) and (16.36), respectively, and $h_{1,j} := \lim n^{1/2}\tau_{jF_n}$), $A_{2n} = o_p(1)$ (by (16.39)), and $\hat\kappa_{jn}A_{2n}'S_{n,q}(I_q + o_p(1))S_{n,q}A_{2n} = A_{2n}'\hat\kappa_{jn}S_{n,q}^2A_{2n} + A_{2n}'\hat\kappa_{jn}S_{n,q}o_p(1)S_{n,q}A_{2n} = o_p(1)$ (using $\hat\kappa_{jn}S_{n,q}^2 = o_p(1)$ and $A_{2n} = o_p(1)$).
Equations (16.44) and (16.45) give

$$Q_{2n}(\hat\kappa_{jn}) = n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'[I_k - h_{3,q}h_{3,q}']n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1) - \hat\kappa_{jn}[I_{p-q} + o_p(1)]$$
$$= n^{1/2}B_{n,p-q}'U_n'\hat D_n'W_n'h_{3,k-q}h_{3,k-q}'n^{1/2}W_n\hat D_nU_nB_{n,p-q} + o_p(1) - \hat\kappa_{jn}[I_{p-q} + o_p(1)]$$
$$:= M_{n,p-q} - \hat\kappa_{jn}[I_{p-q} + o_p(1)], \quad (16.46)$$

where the second equality uses $I_k = h_3h_3' = h_{3,q}h_{3,q}' + h_{3,k-q}h_{3,k-q}'$ (because $h_3 = \lim C_n$ is an orthogonal matrix) and the last line defines the $(p-q)\times(p-q)$ matrix $M_{n,p-q}$.
Equations (16.43) and (16.46) imply that $\{\hat\kappa_{jn} : j = q+1, \ldots, p\}$ are the $p-q$ eigenvalues of the matrix

$$\widetilde M_{n,p-q} := [I_{p-q} + o_p(1)]^{-1/2}M_{n,p-q}[I_{p-q} + o_p(1)]^{-1/2}, \quad (16.47)$$

obtained by pre- and post-multiplying the quantities in (16.46) by the rhs quantity $[I_{p-q} + o_p(1)]^{-1/2}$ in (16.46). By (16.37),

$$\widetilde M_{n,p-q} \to_d \Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\Delta_{h,p-q}. \quad (16.48)$$
The vector of (ordered) eigenvalues of a matrix is a continuous function of the matrix (by Elsner's Theorem, see Stewart (2001, Thm. 3.1, pp. 37-38)). By (16.48), the matrix $\widetilde M_{n,p-q}$ converges in distribution. In consequence, by the CMT, the vector of eigenvalues of $\widetilde M_{n,p-q}$, viz., $\{\hat\kappa_{jn} : j = q+1, \ldots, p\}$, converges in distribution to the vector of eigenvalues of the limit matrix $\Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\Delta_{h,p-q}$, which proves part (d) of Theorem 8.4. In addition, $\lambda_{\min}(n\hat U_n'\hat D_n'\hat W_n'\hat W_n\hat D_n\hat U_n)$, which equals the smallest eigenvalue, $\hat\kappa_{pn}$, converges in distribution to the smallest eigenvalue of $\Delta_{h,p-q}'h_{3,k-q}h_{3,k-q}'\Delta_{h,p-q}$, which completes the proof of part (b) of Theorem 8.4.

The convergence in parts (a)-(d) of Theorem 8.4 is joint with that in Lemma 8.3 because it just relies on the convergence in distribution of $n^{1/2}W_n\hat D_nU_nT_n$, which is part of the latter. This establishes part (e) of Theorem 8.4.

Part (f) of Theorem 8.4 holds by the same proof as used for parts (a)-(e) with $n$ replaced by $w_n$.
17 Proofs of Sufficiency of Several Conditions for the $\lambda_{p-j}(\cdot)$ Condition in $\mathcal{F}_{0j}$

In this section, we show that the conditions in (3.9) and (3.10) are sufficient for the second condition in $\mathcal{F}_{0j}$, which is

$$\lambda_{p-j}(\mathrm{Var}_F(C_{F,k-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)) \ge \delta_1 \quad \forall \xi \in R^{p-j} \text{ with } ||\xi|| = 1.$$
Condition (i) in (3.9) is sufficient by the following argument:

$$\lambda_{p-j}(\mathrm{Var}_F(C_{F,k-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)) \ge \lambda_{p-j}(\mathrm{Var}_F(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)) = \lambda_{\min}(\mathrm{Var}_F(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi))$$
$$= \min_{\psi\in R^{p-j}:||\psi||=1}(\xi'\otimes\psi')\mathrm{Var}_F(\mathrm{vec}(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}))(\xi\otimes\psi) \ge \lambda_{\min}(\mathrm{Var}_F(\mathrm{vec}(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}))), \quad (17.1)$$

where the first inequality holds by Corollary 15.4(a) with $m = p-j$ and $r = k-p$ (because $\mathrm{Var}_F(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)$ is a principal submatrix of $\mathrm{Var}_F(C_{F,k-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)$, since by definition the rows of $\bar C_{F,p-j}'$ are a collection of $p-j$ rows of $C_{F,k-j}'$), the first equality holds because the $(p-j)$-th largest eigenvalue of a $(p-j)\times(p-j)$ matrix equals its minimum eigenvalue, the second equality holds by the general formula $\mathrm{vec}(ABC) = (C'\otimes A)\mathrm{vec}(B)$ (applied to $\psi'\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi = (\xi'\otimes\psi')\mathrm{vec}(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j})$), and the last inequality holds because $||\xi\otimes\psi||^2 = 1$ using $||\xi|| = ||\psi|| = 1$.
Condition (ii) in (3.9) is sufficient by sufficient condition (i) in (3.9) and the following:

$$\lambda_{\min}(\mathrm{Var}_F(\mathrm{vec}(\bar C_{F,p-j}'\Omega_F^{-1/2}G_iB_{F,p-j}))) = \min_{\psi\in R^{(p-j)^2}:||\psi||=1}\psi'\mathrm{Var}_F((I_{p-j}\otimes\bar C_{F,p-j}')\mathrm{vec}(\Omega_F^{-1/2}G_iB_{F,p-j}))\psi$$
$$\ge \lambda_{\min}(\mathrm{Var}_F(\mathrm{vec}(\Omega_F^{-1/2}G_iB_{F,p-j})))\min_{\psi\in R^{(p-j)^2}:||\psi||=1}||(I_{p-j}\otimes\bar C_{F,p-j})\psi||^2 = \lambda_{\min}(\mathrm{Var}_F(\mathrm{vec}(\Omega_F^{-1/2}G_iB_{F,p-j}))), \quad (17.2)$$

where the last equality uses $||(I_{p-j}\otimes\bar C_{F,p-j})\psi||^2 = \psi'(I_{p-j}\otimes\bar C_{F,p-j}'\bar C_{F,p-j})\psi = 1$ because the rows of $\bar C_{F,p-j}'$ are orthonormal and $||\psi|| = 1$.

Condition (iii) in (3.9) is sufficient by sufficient condition (ii) in (3.9) and a similar argument to that given in (17.2), using the fact that $\min_{\psi\in R^{(p-j)k}:||\psi||=1}||(B_{F,p-j}\otimes I_k)\psi||^2 = 1$ because the columns of $B_{F,p-j}$ are orthonormal.
Condition (iv) in (3.9) is sufficient by sufficient condition (iii) in (3.9) and a similar argument to that given in (17.2), using $\min_{\psi\in R^{pk}:||\psi||=1}||(I_p\otimes\Omega_F^{-1/2})\psi||^2 \ge \lambda_{\max}(\Omega_F)^{-1} \ge M^{-2/(2+\gamma)}$, for $M$ as in the definition of $\mathcal{F}$, in place of the corresponding minimum that equals 1. The latter inequality holds by the following calculations:

$$\psi'(I_p\otimes\Omega_F^{-1})\psi = \sum_{j=1}^p\psi_j'\Omega_F^{-1}\psi_j = \sum_{j=1}^p(\psi_j/||\psi_j||)'\Omega_F^{-1}(\psi_j/||\psi_j||)\,||\psi_j||^2 \ge \lambda_{\min}(\Omega_F^{-1})\sum_{j=1}^p||\psi_j||^2 = \lambda_{\max}(\Omega_F)^{-1} \ge M^{-2/(2+\gamma)}, \quad (17.3)$$

where $\psi = (\psi_1', \ldots, \psi_p')'$ for $\psi_j \in R^k$ $\forall j \le p$, the sums are over $j$ for which $\psi_j \ne 0^k$, the second-last equality uses $\sum_{j=1}^p||\psi_j||^2 = ||\psi||^2 = 1$, and the last inequality holds because $\lambda_{\max}(\Omega_F) = \max_{\zeta\in R^k:||\zeta||=1}E_F(\zeta'g_i)^2 \le E_F||g_i||^2 = ((E_F||g_i||^2)^{1/2})^2 \le ((E_F||g_i||^{2+\gamma})^{1/(2+\gamma)})^2 \le M^{2/(2+\gamma)}$ by successively applying the Cauchy-Bunyakovsky-Schwarz inequality, Lyapunov's inequality, and the moment bound $E_F||g_i||^{2+\gamma} \le M$ in $\mathcal{F}$.
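The chain of bounds on $\lambda_{\max}(\Omega_F)$ in (17.3) can be checked on simulated data; the distribution below is an arbitrary illustrative choice with $\gamma = 1$:

```python
import numpy as np

# Empirical check of lmax(Omega) = max_{|zeta|=1} E(zeta'g)^2 <= E||g||^2
#                               <= (E||g||^{2+gamma})^{2/(2+gamma)}  (gamma = 1),
# i.e. the Cauchy-Bunyakovsky-Schwarz step followed by Lyapunov's inequality.
rng = np.random.default_rng(3)
g = rng.standard_normal((100_000, 4)) * np.array([1.0, 0.7, 0.5, 0.2])
Omega = g.T @ g / len(g)                   # sample second-moment matrix
lmax = np.linalg.eigvalsh(Omega)[-1]
m2 = np.mean(np.sum(g**2, axis=1))         # sample analogue of E||g||^2
m3 = np.mean(np.sum(g**2, axis=1) ** 1.5)  # sample analogue of E||g||^3
```

Both inequalities hold exactly under the empirical distribution (the first because $\lambda_{\max} \le \mathrm{trace}$ for psd matrices, the second by Jensen's inequality), so they hold for the sample analogues as well.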
Conditions (v) and (vi) in (3.9) are sufficient by the following argument. Write

$$\Sigma_F^{vec(G_i)} = (-M_F, I_{pk})\Sigma_F^{f_i}(-M_F, I_{pk})', \quad \text{where } M_F = (E_F\mathrm{vec}(G_i)g_i')(E_Fg_ig_i')^{-1} \in R^{pk\times k}. \quad (17.4)$$

We have

$$\lambda_{\min}(\Sigma_F^{vec(G_i)}) = \min_{\zeta\in R^{pk}:||\zeta||=1}\zeta'(-M_F, I_{pk})\Sigma_F^{f_i}(-M_F, I_{pk})'\zeta$$
$$= \min_{\zeta\in R^{pk}:||\zeta||=1}\frac{\zeta'(-M_F, I_{pk})}{||(-M_F, I_{pk})'\zeta||}\,\Sigma_F^{f_i}\,\frac{(-M_F, I_{pk})'\zeta}{||(-M_F, I_{pk})'\zeta||}\,||(-M_F, I_{pk})'\zeta||^2 \ge \min_{\psi\in R^{(p+1)k}:||\psi||=1}\psi'\Sigma_F^{f_i}\psi = \lambda_{\min}(\Sigma_F^{f_i}), \quad (17.5)$$

where the inequality uses $||(-M_F, I_{pk})'\zeta||^2 = \zeta'\zeta + \zeta'M_FM_F'\zeta \ge 1$ for $\zeta \in R^{pk}$ with $||\zeta|| = 1$. This shows that condition (v) is sufficient for sufficient condition (iv) in (3.9). Since $\Sigma_F^{f_i} = \mathrm{Var}_F(f_i) + E_Ff_iE_Ff_i'$, condition (vi) is sufficient for sufficient condition (v) in (3.9).
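The inequality in (17.5) can be verified numerically for an arbitrary $M_F$ and a positive definite stand-in for $\Sigma_F^{f_i}$ (both hypothetical here):

```python
import numpy as np

# lmin((-M, I) Sigma (-M, I)') >= lmin(Sigma): for any unit vector zeta,
# ||(-M, I)' zeta||^2 = ||M' zeta||^2 + ||zeta||^2 >= 1, so the quadratic
# form on the left is bounded below by lmin(Sigma).
rng = np.random.default_rng(4)
pk, k = 6, 3
M = rng.standard_normal((pk, k))
R = rng.standard_normal((pk + k, pk + k))
Sigma = R @ R.T + 0.1 * np.eye(pk + k)      # pd stand-in for Sigma_F^{f_i}
J = np.hstack([-M, np.eye(pk)])             # the matrix (-M_F, I_pk)
lmin_out = np.linalg.eigvalsh(J @ Sigma @ J.T)[0]
lmin_in = np.linalg.eigvalsh(Sigma)[0]
```

This is exactly the step that lets a lower bound on $\lambda_{\min}(\Sigma_F^{f_i})$ transfer to $\lambda_{\min}(\Sigma_F^{vec(G_i)})$.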
The condition in (3.10) is sufficient by the following argument:

$$\lambda_{p-j}(\mathrm{Var}_F(C_{F,k-j}'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)) \ge \lambda_p(\mathrm{Var}_F(C_F'\Omega_F^{-1/2}G_iB_{F,p-j}\xi)) = \lambda_p(\mathrm{Var}_F(\Omega_F^{-1/2}G_iB_{F,p-j}\xi)), \quad (17.6)$$

where the first inequality holds by Corollary 15.4(b) with $m = p$ and $r = j$ and the equality holds because $\mathrm{Var}_F(C_F'\Omega_F^{-1/2}G_iB_{F,p-j}\xi) = C_F'\mathrm{Var}_F(\Omega_F^{-1/2}G_iB_{F,p-j}\xi)C_F$ and $C_F$ is orthogonal.

18 Asymptotic Size of Kleibergen's CLR Test with Jacobian-Variance Weighting and the Proof of Theorem 5.1

In this section, we establish the asymptotic size of Kleibergen's CLR test with Jacobian-variance weighting when the Robin and Smith (2000) rank statistic (defined in (5.5)) is employed. This rank statistic depends on a variance matrix estimator $\widetilde V_{Dn}$; see Section 5 for the definition of the test. We provide a formula for the asymptotic size of the test that depends on the specifics of the moment conditions considered and does not necessarily equal its nominal size $\alpha$. First, in Section 18.1, we provide an example that illustrates the results in Theorem 5.1 and Comment (v) to Theorem 5.1. In Section 18.2, we establish the asymptotic size of the test based on $\widetilde V_{Dn}$ defined as in (5.3). In Section 18.3, we report some simulation results for a linear instrumental variable (IV) model with two rhs endogenous variables. In Section 18.4, we establish the asymptotic size of Kleibergen's CLR test with Jacobian-variance weighting under a general assumption that allows for other definitions of $\widetilde V_{Dn}$.

In Section 18.5, we show that equally-weighted versions of Kleibergen's CLR test have correct asymptotic size when the Robin and Smith (2000) rank statistic is employed and a general equal-weighting matrix $\widetilde W_n$ is employed. This result extends the result given in Theorem 6.1 in Section 6, which applies to the specific case where $\widetilde W_n = \hat\Omega_n^{-1/2}$, as in (6.2). The results of Section 18.5 are a relatively simple by-product of the results in Section 18.4.

Proofs of the results stated in this section are given in Section 18.6.

Theorem 5.1 follows from Lemma 18.2 and Theorem 18.3, which are stated in Section 18.4.
18.1 An Example

Here we provide a simple example that illustrates the result of Theorem 5.1. In this example, the true distribution $F$ does not depend on $n$. Suppose $p = 2$, $E_FG_i = (1^k, 0^k)$, where $c^k := (c, \ldots, c)' \in R^k$ for $c = 0, 1$, and $n^{1/2}(\hat D_n - E_FG_i) \to_d D_h$ under $F$ for some random matrix $D_h = (D_{1h}, D_{2h}) \in R^{k\times 2}$. Suppose for $\widetilde M_n = \widetilde V_{Dn}^{-1/2}$ and $M_F = I_{2k}$, we have $n^{1/2}(\widetilde M_n - M_F) \to_d \overline M_h$ under $F$ for some random matrix $\overline M_h \in R^{2k\times 2k}$.$^{53}$ We have

$$\hat D_n^\dagger = \mathrm{vec}_{k,p}^{-1}(\widetilde V_{Dn}^{-1/2}\mathrm{vec}(\hat D_n)) = (\widetilde M_{11n}\hat D_{1n} + \widetilde M_{12n}\hat D_{2n},\ \widetilde M_{21n}\hat D_{1n} + \widetilde M_{22n}\hat D_{2n}), \quad (18.1)$$

where $\hat D_n = (\hat D_{1n}, \hat D_{2n})$, $\widetilde M_{j\ell n}$ for $j, \ell = 1, 2$ are the four $k\times k$ submatrices of $\widetilde M_n$, and likewise for $M_{j\ell F}$ for $j, \ell = 1, 2$. Let $\overline M_{j\ell h}$ for $j, \ell = 1, 2$ denote the four $k\times k$ submatrices of $\overline M_h$. We let $T_n^\dagger = \mathrm{Diag}\{n^{-1/2}, 1\}$. Then, we have

$$n^{1/2}\hat D_n^\dagger T_n^\dagger = (\widetilde M_{11n}\hat D_{1n} + \widetilde M_{12n}\hat D_{2n},\ n^{1/2}\widetilde M_{21n}\hat D_{1n} + \widetilde M_{22n}n^{1/2}\hat D_{2n})$$
$$\to_d (I_k1^k + 0^{k\times k}0^k,\ \overline M_{21h}1^k + I_kD_{2h}) = (1^k,\ \overline M_{21h}1^k + D_{2h}), \quad (18.2)$$

where the convergence uses $n^{1/2}\widetilde M_{21n} \to_d \overline M_{21h}$ (because $M_{21F} = 0^{k\times k}$) and $n^{1/2}\hat D_{2n} \to_d D_{2h}$ (because $E_FG_{i2} = 0^k$). Equation (18.2) shows that the asymptotic distribution of $n^{1/2}\hat D_n^\dagger T_n^\dagger$ depends on the randomness of the variance estimator $\widetilde V_{Dn}$ through $\overline M_{21h}$.

$^{53}$ The convergence results $n^{1/2}(\hat D_n - E_FG_i) \to_d D_h$ and $n^{1/2}(\widetilde M_n - M_F) \to_d \overline M_h$ are established in Lemmas 8.2 and 18.2, respectively, in Section 8 of AG1 and Section 18 in this Supplemental Material, under general conditions.
It may appear that this example is quite special and the asymptotic behavior in (18.2) only arises in special circumstances, because $E_FG_i = (1^k, 0^k)$, $M_{21F} = 0^{k\times k}$, and $M_F = I_{2k}$ in this example. But this is not true. The asymptotic behavior in (18.2) arises quite generally, as shown in Theorem 5.1, whenever $p \ge 2$.$^{54}$

If one replaces $\widetilde V_{Dn}^{-1/2}$ by its probability limit, $M_F$, in the definition of $\hat D_n^\dagger$, then the calculations in (18.2) hold but with $n^{1/2}\widetilde M_{21n}$ replaced by $n^{1/2}M_{21F} = 0^{k\times k}$ in the first line and, hence, $\overline M_{21h}$ replaced by $0^{k\times k}$ in the second line. Hence, in this case, the asymptotic distribution only depends on $D_h$. Hence, Comment (iv) to Theorem 5.1 holds in this example.

Suppose one defines $\hat D_n^\dagger$ by $\widetilde W_n\hat D_n$ as in Comment (v) to Theorem 5.1. This yields equal weighting of each column of $\hat D_n$. This is equivalent to replacing $\widetilde V_{Dn}^{-1/2}$ by $I_2 \otimes \widetilde W_n$ in the definition of $\hat D_n^\dagger$ in (18.1). In this case, the off-diagonal $k\times k$ blocks of $I_2 \otimes \widetilde W_n$ are $0^{k\times k}$ and, hence, $\widetilde M_{21n}$ in the first line of (18.2) equals $0^{k\times k}$, which implies that $\overline M_{21h} = 0^{k\times k}$ in the second line of (18.2). Thus, the asymptotic distribution of $\hat D_n^\dagger$ does not depend on the asymptotic distribution of the (normalized) weight matrix estimator $\widetilde W_n$. It only depends on the probability limit of $\widetilde W_n$, as stated in Comment (v) to Theorem 5.1.
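The expansion behind (18.2) can be mimicked deterministically: fix draws of the limit variables and let $n$ grow. Everything below is a hypothetical numerical stand-in for the objects in the example:

```python
import numpy as np

# With Mtilde_n = I_{2k} + n^{-1/2} Mbar and Dhat_n = (1_k + n^{-1/2} Z1, n^{-1/2} Z2),
# the rescaled second column of Dhat_n^dagger,
#   sqrt(n) * (Mtilde_{21n} Dhat_{1n} + Mtilde_{22n} Dhat_{2n}),
# approaches Mbar_21 1_k + Z2: the randomness of the weight matrix (Mbar_21) survives.
k, n = 3, 10**10
rng = np.random.default_rng(5)
Mbar = rng.standard_normal((2 * k, 2 * k))
Z1, Z2 = rng.standard_normal(k), rng.standard_normal(k)
ones = np.ones(k)
Mt = np.eye(2 * k) + Mbar / np.sqrt(n)
D1, D2 = ones + Z1 / np.sqrt(n), Z2 / np.sqrt(n)
col2 = np.sqrt(n) * (Mt[k:, :k] @ D1 + Mt[k:, k:] @ D2)
limit = Mbar[k:, :k] @ ones + Z2
err = np.max(np.abs(col2 - limit))
```

The discrepancy is $O(n^{-1/2})$, so it vanishes as $n$ grows, mirroring the convergence claimed in (18.2).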
18.2 Asymptotic Size of Kleibergen's CLR Test with Jacobian-Variance Weighting

In this subsection, we determine the asymptotic size of Kleibergen's CLR test when $\hat D_n$ is weighted by $\widetilde V_{Dn}$, defined in (5.3), which yields what we call Jacobian-variance weighting, and the Robin and Smith (2000) rank statistic is employed. This rank statistic is defined in (5.5) with

$^{54}$ When the matrix $M_{21F} \ne 0^{k\times k}$, the argument in (18.2) does not go through because $n^{1/2}\widetilde M_{21n}$ does not converge in distribution (since $n^{1/2}(\widetilde M_{21n} - M_{21F}) \to_d \overline M_{21h}$ by assumption). In this case, one has to alter the definition of $T_n^\dagger$ so that it rotates the columns of $\hat D_n$ before rescaling them. The rotation required depends on both $M_F$ and $E_FG_i$.
$\theta = \theta_0$. For convenience, we restate the definition here:

$$rk_n = rk_n^\dagger := \lambda_{\min}(n(\hat D_n^\dagger)'\hat D_n^\dagger), \quad \text{where } \hat D_n^\dagger := \mathrm{vec}_{k,p}^{-1}(\widetilde V_{Dn}^{-1/2}\mathrm{vec}(\hat D_n)) \quad (18.3)$$

(so $\hat D_n^\dagger$ is as in (5.4) with $\theta = \theta_0$).$^{55}$ Let

$$\hat\kappa_{jn}^\dagger \text{ denote the } j\text{th eigenvalue of } n(\hat D_n^\dagger)'\hat D_n^\dagger, \text{ for } j = 1, \ldots, p, \quad (18.4)$$

ordered to be nonincreasing in $j$. By definition, $\lambda_{\min}(n(\hat D_n^\dagger)'\hat D_n^\dagger) = \hat\kappa_{pn}^\dagger$. Also, the $j$th singular value of $n^{1/2}\hat D_n^\dagger$ equals $(\hat\kappa_{jn}^\dagger)^{1/2}$.
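The relation between $rk_n^\dagger$, the eigenvalues $\hat\kappa_{jn}^\dagger$, and the singular values of $n^{1/2}\hat D_n^\dagger$ in (18.3)-(18.4) is a direct matrix fact; here is a quick check with an arbitrary stand-in matrix:

```python
import numpy as np

# rk_n = lmin(n * Ddag' Ddag) equals the squared smallest singular value of
# sqrt(n) * Ddag, and the j-th singular value equals the square root of the
# j-th (nonincreasing) eigenvalue, as stated around (18.4).
rng = np.random.default_rng(6)
n, k, p = 200, 5, 2
Ddag = rng.standard_normal((k, p))                            # stand-in for Dhat_n^dagger
eigs = np.sort(np.linalg.eigvalsh(n * Ddag.T @ Ddag))[::-1]   # nonincreasing
svals = np.linalg.svd(np.sqrt(n) * Ddag, compute_uv=False)    # nonincreasing
rk = eigs[-1]
```

In practice this means the rank statistic can be computed either from an eigendecomposition of $n(\hat D_n^\dagger)'\hat D_n^\dagger$ or from an SVD of $n^{1/2}\hat D_n^\dagger$.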
Define the parameter space $\mathcal{F}_{KCLR}$ for the distribution $F$ by

$$\mathcal{F}_{KCLR} := \{F \in \mathcal{F} : \lambda_{\min}(\mathrm{Var}_F((g_i', \mathrm{vec}(G_i)')')) \ge \delta_2,\ E_F||(g_i', \mathrm{vec}(G_i)')'||^{4+\gamma} \le M\}, \quad (18.5)$$

where $\delta_2 > 0$, and $\gamma > 0$ and $M < \infty$ are as in the definition of $\mathcal{F}$ in (3.1). Note that $\mathcal{F}_{KCLR} \subset \mathcal{F}_0$ when $\delta_1$ in $\mathcal{F}_0$ satisfies $\delta_1 \le M^{-2/(2+\gamma)}\delta_2$, by condition (vi) in (3.9). Let $\mathrm{vech}(\cdot)$ denote the half vectorization operator that vectorizes the nonredundant elements in the columns of a symmetric matrix (that is, the elements on or below the main diagonal). The moment condition in $\mathcal{F}_{KCLR}$ is imposed because the asymptotic distribution of the rank statistic $rk_n^\dagger$ depends on a triangular array CLT for $\mathrm{vech}(f_if_i')$, which employs $4+\gamma$ moments for $f_i$, where $f_i := (g_i', \mathrm{vec}(G_i - E_{F_n}G_i)')'$ as in (5.6). The $\lambda_{\min}(\cdot)$ condition in $\mathcal{F}_{KCLR}$ ensures that $\widetilde V_{Dn}$ is positive definite wp→1, which is needed because $\widetilde V_{Dn}$ enters the rank statistic $rk_n^\dagger$ via $\widetilde V_{Dn}^{-1/2}$; see (18.3).
Dn
For a …xed distribution F; VeDn estimates
de…nition in (8.15) and the
MF
2
6
6
=6
4
DFy :=
M11F
..
.
Mp1F
p
X
j=1
..
min (
.
vec(Gi )
F
de…ned in (8.15), where
vec(Gi )
F
is pd by its
) condition in FKCLR :56 Let
3
M1pF
7
7
..
7 := (
.
5
MppF
vec(Gi ) 1=2
)
F
(M1jF EF Gij ; :::; MpjF EF Gij ) 2 Rk
p
and
; where Gi = (Gi1 ; :::; Gip ) 2 Rk
(18.6)
p
:
As in Section 5, the function veck;p1 ( ) is the inverse of the vec( ) function for k p matrices. Thus, the domain
of veck;p1 ( ) consists of kp-vectors and its range consists of k p matrices.
vec(Gi )
vec(Gi )
56
More speci…cally,
is pd because by (8.15) F
:= V arF (vec(Gi )
(EF vec(G` )g`0 ) F 1 gi )
F
1
1
0
0
0 0
0
0
= ( (EF vec(G` )g` ) F ; Ipk )V arF ((gi ; vec(Gi ) ) )( (EF vec(G` )g` ) F ; Ipk ) ; where ( (EF vec(G` )g`0 ) F 1 ; Ipk ) 2
Rpk (p+1)k has full row rank pk and V arF ((gi0 ; vec(Gi )0 )0 ) is pd by the min ( ) condition in FKCLR :
55
31
Let $(\tau_{1F}^\dagger, \ldots, \tau_{pF}^\dagger)$ denote the singular values of $D_F^\dagger$. Define

$$B_F^\dagger \in R^{p\times p} \text{ to be an orthogonal matrix of eigenvectors of } D_F^{\dagger\prime}D_F^\dagger \text{ and}$$
$$C_F^\dagger \in R^{k\times k} \text{ to be an orthogonal matrix of eigenvectors of } D_F^\dagger D_F^{\dagger\prime}, \quad (18.7)$$

ordered so that the corresponding eigenvalues $((\tau_{1F}^\dagger)^2, \ldots, (\tau_{pF}^\dagger)^2)$ and $((\tau_{1F}^\dagger)^2, \ldots, (\tau_{pF}^\dagger)^2, 0, \ldots, 0) \in R^k$, respectively, are nonincreasing, where the eigenvalues equal $(\tau_{jF}^\dagger)^2$ for $j = 1, \ldots, p$. Note that (18.7) gives definitions of $B_F^\dagger$ and $C_F^\dagger$ that are similar to the definitions in (8.6) and (8.7), but differ because $D_F^\dagger$ replaces $W_F(E_FG_i)U_F$ in the definitions.

Define $(\lambda_{1,F}, \ldots, \lambda_{9,F})$ as in (8.9) with $\lambda_{7,F} = W_F = \Omega_F^{-1/2}$, $\lambda_{8,F} = I_p$, and $W_1(\cdot)$ and $U_1(\cdot)$ equal to identity functions. Define

$$\lambda_{10,F} = \mathrm{Var}_F\begin{pmatrix} f_i \\ \mathrm{vech}(f_if_i') \end{pmatrix} \in R^{d\times d}, \quad (18.8)$$

where $d := (p+1)k + (p+1)k((p+1)k+1)/2$. Define $(\lambda_{1,F}^\dagger, \lambda_{2,F}^\dagger, \lambda_{3,F}^\dagger, \lambda_{6,F}^\dagger)$ as $(\lambda_{1,F}, \lambda_{2,F}, \lambda_{3,F}, \lambda_{6,F})$ are defined in (8.9) but with $\{\tau_{jF}^\dagger : j \le p\}$, $B_F^\dagger$, and $C_F^\dagger$ in place of $\{\tau_{jF} : j \le p\}$, $B_F$, and $C_F$, respectively.
Define

$$\lambda_F^{KCLR} := (\lambda_{1,F}, \ldots, \lambda_{10,F}, \lambda_{1,F}^\dagger, \lambda_{2,F}^\dagger, \lambda_{3,F}^\dagger, \lambda_{6,F}^\dagger),$$
$$\Lambda^{KCLR} := \{\lambda_F^{KCLR} : \lambda_F^{KCLR} \text{ for some } F \in \mathcal{F}_{KCLR}\}, \text{ and}$$
$$h_n(\lambda) := (n^{1/2}\lambda_{1,F}, n^{1/2}\lambda_{1,F}^\dagger, \lambda_{2,F}, \lambda_{3,F}, \lambda_{4,F}, \lambda_{5,F}, \lambda_{6,F}, \lambda_{7,F}, \lambda_{8,F}, \lambda_{10,F}, \lambda_{2,F}^\dagger, \lambda_{3,F}^\dagger, \lambda_{6,F}^\dagger). \quad (18.9)$$

Let $\{\lambda_{n,h} \in \Lambda^{KCLR} : n \ge 1\}$ denote a sequence $\{\lambda_n \in \Lambda^{KCLR} : n \ge 1\}$ for which $h_n(\lambda_n) \to h \in H$, for $H$ as in (8.1). The asymptotic variance of $n^{1/2}\mathrm{vec}(\hat D_n - E_{F_n}G_i)$ is $\Sigma_h^{vec(G_i)}$ under $\{\lambda_{n,h} : n \ge 1\}$ by Lemma 8.2.

Define $h_{1,j}$ for $j \le p$ and $h_s$ for $s = 2, \ldots, 8$ as in (8.12), $q = q_h$ as in (8.16), $h_{2,q}$, $h_{2,p-q}$, $h_{3,q}$, $h_{3,k-q}$, and $h_{1,p-q}$ as in (8.17), and $\Delta_n$, $\Delta_{n,q}$, and $\Delta_{n,p-q}$ as in (13.2). Note that $h_7 = h_{5,g}^{-1/2}$ and $h_8 = I_p$ due to the definitions of $\lambda_{7,F}$ and $\lambda_{8,F}$ given above, where $h_{5,g}$ ($= \lim E_{F_n}g_ig_i'$) denotes the upper left $k\times k$ submatrix of $h_5$, as in Section 8.
For a sequence $\{\lambda_{n,h} \in \Lambda^{KCLR} : n \ge 1\}$, we have

$$h_{10} = \begin{bmatrix} h_{10,f} & h_{10,ff^2} \\ h_{10,f^2f} & h_{10,f^2f^2} \end{bmatrix} := \lim \mathrm{Var}_{F_n}\begin{pmatrix} f_i \\ \mathrm{vech}(f_if_i') \end{pmatrix} \in R^{d\times d}. \quad (18.10)$$

Note that $h_{10,f} \in R^{(p+1)k\times(p+1)k}$ is pd by the definition of $\mathcal{F}_{KCLR}$ in (18.5).

With $\tau_{jF}^\dagger$, $B_F^\dagger$, and $C_F^\dagger$ in place of $\tau_{jF}$, $B_F$, and $C_F$, respectively, define $h_{1,j}^\dagger$ for $j \le p$ and $h_s^\dagger$ for $s = 2, 3, 6$ as in (8.12) as analogues to the quantities without the $\dagger$ superscript, define $q^\dagger = q_{h^\dagger}$ as in (8.16), define $h_{2,q^\dagger}^\dagger$, $h_{2,p-q^\dagger}^\dagger$, $h_{3,q^\dagger}^\dagger$, $h_{3,k-q^\dagger}^\dagger$, and $h_{1,p-q^\dagger}^\dagger$ as in (8.17), and define $\Delta_n^\dagger$, $\Delta_{n,q^\dagger}^\dagger$, and $\Delta_{n,p-q^\dagger}^\dagger$ as in (13.2). The quantity $q^\dagger$ determines the asymptotic behavior of $rk_n^\dagger$. By definition, $q^\dagger$ is the largest value $j$ ($\le p$) for which $\lim n^{1/2}\tau_{jF_n}^\dagger = \infty$ under $\{\lambda_{n,h} \in \Lambda^{KCLR} : n \ge 1\}$. It is shown below that if $q^\dagger = p$, then $rk_n^\dagger \to_p \infty$, whereas if $q^\dagger < p$, then $rk_n^\dagger$ converges in distribution to a nondegenerate random variable, see Lemma 18.4.
By the CLT, for any sequence $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$,

$$n^{-1/2} \sum_{i=1}^n \begin{pmatrix} f_i \\ vech(f_i f_i' - E_{F_n} f_i f_i') \end{pmatrix} \to_d L_h \sim N(0^d, h_{10}), \text{ where}$$
$$L_h = (L_{h,1}, L_{h,2}, L_{h,3})' \text{ for } L_{h,1} \in R^k, \; L_{h,2} \in R^{kp}, \text{ and } L_{h,3} \in R^{(p+1)k((p+1)k+1)/2}, \quad (18.11)$$

and the CLT holds using the moment conditions in $\mathcal{F}_{KCLR}$. Note that by the definitions of $h_4 := \lim E_{F_n} G_i$ and $h_5 := \lim E_{F_n} (g_i', vec(G_i)')'(g_i', vec(G_i)')$, we have

$$h_{10,f} = \begin{bmatrix} h_{5,g} & h_{5,gG} \\ h_{5,Gg} & h_{5,G} - vec(h_4)vec(h_4)' \end{bmatrix}, \text{ where } h_5 = \begin{bmatrix} h_{5,g} & h_{5,gG} \\ h_{5,Gg} & h_{5,G} \end{bmatrix} \quad (18.12)$$

for $h_{5,g} \in R^{k \times k}$, $h_{5,Gg} \in R^{kp \times k}$, and $h_{5,G} \in R^{kp \times kp}$.
We now provide new, but distributionally equivalent, definitions of $\bar{g}_h$ and $\overline{D}_h$:

$$\bar{g}_h := L_{h,1} \text{ and } vec(\overline{D}_h) := L_{h,2} - h_{5,Gg} h_{5,g}^{-1} L_{h,1}. \quad (18.13)$$

These definitions are distributionally equivalent to the previous definitions of $\bar{g}_h$ and $\overline{D}_h$ given in Lemma 8.2, because by either set of definitions $\bar{g}_h$ and $vec(\overline{D}_h)$ are independent mean zero random vectors with variance matrices $h_{5,g}$ and $\Phi_h^{vec(G_i)}$ $(= h_{5,G} - vec(h_4)vec(h_4)' - h_{5,Gg} h_{5,g}^{-1} h_{5,Gg}')$, respectively, where $\Phi_h^{vec(G_i)}$ is defined in (8.15) and is pd (because $\lambda_{\min}(\Phi_h^{vec(G_i)}) = \lim \lambda_{\min}(\Phi_{F_n}^{vec(G_i)})$ and $\lambda_{\min}(\Phi_{F_n}^{vec(G_i)})$ is bounded away from zero by its definition in (8.15) and the $\lambda_{\min}(\Phi_F^{vec(G_i)})$ condition in $\mathcal{F}_{KCLR}$).
Define

$$\overline{D}_h^\dagger := \sum_{j=1}^p (M_{1jh}\overline{D}_{jh}, \ldots, M_{pjh}\overline{D}_{jh}) \in R^{k \times p}, \text{ where } \begin{bmatrix} M_{11h} & \cdots & M_{1ph} \\ \vdots & & \vdots \\ M_{p1h} & \cdots & M_{pph} \end{bmatrix} := (\Phi_h^{vec(G_i)})^{-1/2}, \quad (18.14)$$

$\overline{D}_h = (\overline{D}_{1h}, \ldots, \overline{D}_{ph})$, and $\overline{D}_h$ is defined in (18.13). Define

$$\overline{\Delta}_h^\dagger = (\overline{\Delta}_{h,q^\dagger}^\dagger, \overline{\Delta}_{h,p-q^\dagger}^\dagger) \in R^{k \times p}, \text{ where } \overline{\Delta}_{h,q^\dagger}^\dagger := h_{3,q^\dagger}^\dagger \in R^{k \times q^\dagger} \text{ and } \overline{\Delta}_{h,p-q^\dagger}^\dagger := h_3^\dagger h_{1,p-q^\dagger}^\dagger + \overline{D}_h^\dagger h_{2,p-q^\dagger}^\dagger \in R^{k \times (p-q^\dagger)}. \quad (18.15)$$
Let $a(\cdot)$ be the function from $R^d$ to $R^{kp(kp+1)/2}$ that maps

$$n^{-1} \sum_{i=1}^n \begin{pmatrix} f_i \\ vech(f_i f_i') \end{pmatrix} \text{ into } A_n := vech\left(\left(n^{-1}\sum_{i=1}^n vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)' - \widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1} \widetilde{\Gamma}_n'\right)^{-1/2}\right), \text{ where} \quad (18.16)$$
$$\widetilde{\Omega}_n := n^{-1}\sum_{i=1}^n g_i g_i' \in R^{k \times k} \text{ and } \widetilde{\Gamma}_n := n^{-1}\sum_{i=1}^n vec(G_i - E_{F_n}G_i) g_i' \in R^{pk \times k}.$$

Note that $a(\cdot)$ does not depend on the $n^{-1}\sum_{i=1}^n f_i$ part of its argument. Also, $a(\cdot)$ is well defined and continuously partially differentiable at any value of its argument for which $n^{-1}\sum_{i=1}^n f_i f_i'$ is pd.⁵⁷ We define $A_h$ as follows:

$$A_h \text{ denotes the } (kp)(kp+1)/2 \times d \text{ matrix of partial derivatives of } a(\cdot) \text{ evaluated at } (0^{(p+1)k\prime}, vech(h_{10,f})')', \quad (18.17)$$

where the latter vector is the limit of the mean vector of $(f_i', vech(f_i f_i')')'$ under $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$.
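The full-row-rank factorization invoked in footnote 57 below (which guarantees that the matrix inverted in (18.16) is pd) can be verified numerically. A minimal sketch with hypothetical data (numpy assumed; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, p = 400, 4, 2
g = rng.standard_normal((n, k))        # stand-ins for g_i
vG = rng.standard_normal((n, p * k))   # stand-ins for vec(G_i - E G_i)
f = np.hstack([g, vG])                 # f_i = (g_i', vec(G_i - E G_i)')'

Omega = g.T @ g / n                    # Omega_tilde_n in (18.16)
Gamma = vG.T @ g / n                   # Gamma_tilde_n in (18.16)
S_f = f.T @ f / n                      # n^-1 sum_i f_i f_i'

# lhs: the matrix inside the (-1/2) power in (18.16)
lhs = vG.T @ vG / n - Gamma @ np.linalg.inv(Omega) @ Gamma.T

# rhs: the quadratic form from footnote 57 with the full-row-rank pk x (p+1)k matrix
A = np.hstack([-Gamma @ np.linalg.inv(Omega), np.eye(p * k)])
rhs = A @ S_f @ A.T
assert np.allclose(lhs, rhs)
```

Since the sandwich matrix has full row rank $pk$ and the middle matrix is pd with probability approaching one, the left-hand side is pd, which is what makes $a(\cdot)$ well defined.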
Define

$$\overline{M}_h := vech_{kp,kp}^{-1}(A_h L_h) \in R^{kp \times kp}, \quad (18.18)$$

where $vech_{kp,kp}^{-1}(\cdot)$ denotes the inverse of the $vech(\cdot)$ operator applied to symmetric $kp \times kp$ matrices.

⁵⁷ The function $a(\cdot)$ is well defined in this case because $n^{-1}\sum_{i=1}^n vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)' - \widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1} \widetilde{\Gamma}_n' = (-\widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1}, I_{pk})\, n^{-1}\sum_{i=1}^n f_i f_i'\, (-\widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1}, I_{pk})'$ and $(-\widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1}, I_{pk}) \in R^{pk \times (p+1)k}$ has full row rank $pk$.
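The half-vectorization operator and its inverse used in (18.18) can be sketched as follows (numpy assumed; the helper names and the lower-triangle storage convention are ours, chosen only so that the round trip is exact for symmetric matrices):

```python
import numpy as np

def vech(S):
    """Half-vectorization: stack the lower triangle (incl. diagonal) of a symmetric matrix."""
    rows, cols = np.tril_indices(S.shape[0])
    return S[rows, cols]

def vech_inv(v, m):
    """Inverse of vech for symmetric m x m matrices: refill both triangles."""
    S = np.zeros((m, m))
    rows, cols = np.tril_indices(m)
    S[rows, cols] = v
    S[cols, rows] = v
    return S

m = 4
A = np.arange(m * m, dtype=float).reshape(m, m)
S = (A + A.T) / 2                         # a symmetric test matrix
assert np.allclose(vech_inv(vech(S), m), S)   # round trip recovers S
```

The key properties mirrored here are that $vech$ maps a symmetric $m \times m$ matrix to $R^{m(m+1)/2}$ and that $vech^{-1}_{m,m}$ restores symmetry, as required for $\overline{M}_h$ in (18.18).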
Define

$$\overline{M}_h^\dagger := (\overline{M}_{h,q^\dagger}^\dagger, \overline{M}_{h,p-q^\dagger}^\dagger) := (0^{k \times q^\dagger}, \overline{M}_{h,p-q^\dagger}^\dagger) \in R^{k \times p}, \text{ where}$$
$$\overline{M}_{h,p-q^\dagger}^\dagger := \sum_{j=1}^p (\overline{M}_{1jh} h_{4,j}, \ldots, \overline{M}_{pjh} h_{4,j}) h_{2,p-q^\dagger}^\dagger \in R^{k \times (p-q^\dagger)}, \quad \overline{M}_h = \begin{bmatrix} \overline{M}_{11h} & \cdots & \overline{M}_{1ph} \\ \vdots & & \vdots \\ \overline{M}_{p1h} & \cdots & \overline{M}_{pph} \end{bmatrix}, \quad (18.19)$$

and $h_4 = (h_{4,1}, \ldots, h_{4,p}) \in R^{k \times p}$.
Below (in Lemma 18.4), we show that the asymptotic distribution of $rk_n^\dagger$ under sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $q^\dagger < p$ is given by

$$r_h(\overline{D}_h, \overline{M}_h) := \lambda_{\min}\left((\overline{\Delta}_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger)' h_{3,k-q^\dagger}^\dagger h_{3,k-q^\dagger}^{\dagger\prime} (\overline{\Delta}_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger)\right), \quad (18.20)$$

where $\overline{\Delta}_{h,p-q^\dagger}^\dagger$ is a nonrandom function of $\overline{D}_h$ by (18.14) and (18.15) and $\overline{M}_{h,p-q^\dagger}^\dagger$ is a nonrandom function of $\overline{M}_h$ by (18.19). For sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $q^\dagger = p$, we show that $rk_n^\dagger \to_p r_h := \infty$.
We define $\overline{\Delta}_h = (\overline{\Delta}_{h,q}, \overline{\Delta}_{h,p-q}) \in R^{k \times p}$, as in (8.17), as follows:

$$\overline{\Delta}_{h,q} := h_{3,q}, \quad \overline{\Delta}_{h,p-q} := h_3 h_{1,p-q} + h_7 \overline{D}_h h_8 h_{2,p-q}, \text{ where}$$
$$h_2 = (h_{2,q}, h_{2,p-q}), \quad h_3 = (h_{3,q}, h_{3,k-q}), \quad h_{1,p-q} := \begin{bmatrix} 0^{q \times (p-q)} \\ Diag\{h_{1,q+1}, \ldots, h_{1,p}\} \\ 0^{(k-p) \times (p-q)} \end{bmatrix} \in R^{k \times (p-q)}. \quad (18.21)$$

In the present case, $h_7 = h_{5,g}^{-1/2}$ and $h_8 = I_p$ because the $CLR_n$ statistic depends on $\widehat{D}_n$ through $\widehat{\Omega}_n^{-1/2}\widehat{D}_n$, which appears in the $LM_n$ statistic.⁵⁸ This means that Assumption WU for the parameter space $\Lambda_{KCLR}$ (defined in Section 8.4) holds with $\widehat{W}_n = \widehat{\Omega}_n^{-1/2}$, $\widehat{U}_n = I_p$, $h_7 = h_{5,g}^{-1/2}$, and $h_8 = I_p$. Thus, the distribution of $\overline{\Delta}_h$ depends on $\overline{D}_h$, $q$, and $h_s$ for $s = 1, 2, 3, 5$.

Below (in Lemma 18.5), we show that the asymptotic distribution of the $CLR_n$ statistic under

⁵⁸ The $CLR_n$ statistic also depends on $\widehat{D}_n$ through the rank statistic.
sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $q^\dagger < p$ is given by⁵⁹

$$\overline{CLR}_h := \frac{1}{2}\left(\overline{LM}_h + \overline{J}_h - r_h + \sqrt{(\overline{LM}_h + \overline{J}_h - r_h)^2 + 4\,\overline{LM}_h\, r_h}\right), \text{ where}$$
$$\overline{LM}_h := \bar{v}_h' \bar{v}_h \sim \chi_p^2, \quad \bar{v}_h := P_{h_{5,g}^{-1/2}\overline{\Delta}_h} h_{5,g}^{-1/2} \bar{g}_h, \quad \overline{J}_h := \bar{g}_h' h_{5,g}^{-1/2} M_{h_{5,g}^{-1/2}\overline{\Delta}_h} h_{5,g}^{-1/2} \bar{g}_h \sim \chi_{k-p}^2, \text{ and } r_h := r_h(\overline{D}_h, \overline{M}_h). \quad (18.22)$$

The quantities $(\bar{g}_h, \overline{D}_h, \overline{M}_h)$ are specified in (18.13) and (18.18) (and $(\bar{g}_h, \overline{D}_h)$ are the same as in Lemma 8.2). Conditional on $\overline{D}_h$, $\overline{LM}_h$ and $\overline{J}_h$ are independent and distributed as $\chi_p^2$ and $\chi_{k-p}^2$, respectively (see the paragraph following (10.6)). For sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $q^\dagger = p$, we show that the asymptotic distribution of the $CLR_n$ statistic is $\overline{CLR}_h := \overline{LM}_h := \bar{v}_h' \bar{v}_h$, where $\bar{v}_h := P_{h_{5,g}^{-1/2}\overline{\Delta}_h} h_{5,g}^{-1/2} \bar{g}_h$.

The critical value function $c(1-\alpha, r)$ is defined in (5.2) for $0 \le r < \infty$. For $r = \infty$, we define $c(1-\alpha, \infty)$ to be the $1-\alpha$ quantile of the $\chi_p^2$ distribution.
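The closed form in (18.22) is easy to evaluate directly; the sketch below (numpy assumed, with hypothetical realized values of the LM and J parts) also checks the two limiting cases discussed in the text, $r = 0$ and $r \to \infty$:

```python
import numpy as np

def clr(lm, j, r):
    """CLR value as a function of the LM part, the J part, and the rank statistic r,
    following the closed form in (18.22)."""
    return 0.5 * (lm + j - r + np.sqrt((lm + j - r) ** 2 + 4.0 * lm * r))

lm, j = 3.2, 7.5                     # hypothetical LM and J realizations
assert np.isclose(clr(lm, j, 0.0), lm + j)           # r = 0: CLR reduces to LM + J
assert np.isclose(clr(lm, j, 1e12), lm, atol=1e-4)   # r -> infinity: CLR -> LM
```

These two endpoints match the statements above: with $r_h = \infty$ the statistic collapses to $\overline{LM}_h$, while with $r_h = 0$ it is the sum $\overline{LM}_h + \overline{J}_h$.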
Now we state the asymptotic size of Kleibergen's CLR test based on the Robin and Smith (2000) rank statistic with $\widetilde{V}_{Dn}$ defined in (5.3).

Theorem 18.1 Let the parameter space for $F$ be $\mathcal{F}_{KCLR}$. Suppose the variance matrix estimator $\widetilde{V}_{Dn}$ employed by the rank statistic $rk_n^\dagger$ (defined in (18.3)) is defined by (5.3). Then, the asymptotic size of Kleibergen's CLR test based on the rank statistic $rk_n^\dagger$ is

$$AsySz = \max\{\alpha, \sup_{h \in H} P(\overline{CLR}_h > c(1-\alpha, r_h))\}$$

provided $P(\overline{CLR}_h = c(1-\alpha, r_h)) = 0$ for all $h \in H$.

Comments: (i) The proviso in Theorem 18.1 is a continuity condition on the distribution function of $\overline{CLR}_h - c(1-\alpha, r_h)$ at zero. If the proviso in Theorem 18.1 does not hold, then the following weaker conclusion holds:

$$AsySz \in \left[\max\{\alpha, \sup_{h \in H} P(\overline{CLR}_h > c(1-\alpha, r_h))\},\; \max\{\alpha, \sup_{h \in H} \lim_{x \uparrow 0} P(\overline{CLR}_h > c(1-\alpha, r_h) + x)\}\right]. \quad (18.23)$$
(ii) Conditional on $(\overline{D}_h, \overline{M}_h)$, $\bar{g}_h$ has a multivariate normal distribution a.s. (because $(\bar{g}_h, \overline{D}_h, \overline{M}_h)$ has a multivariate normal distribution unconditionally).⁶⁰ The proviso in Theorem 18.1 holds whenever $\bar{g}_h$ has a non-zero variance matrix conditional on $(\overline{D}_h, \overline{M}_h)$ a.s. for all $h \in H$. This holds because (a) $P(\overline{CLR}_h = c(1-\alpha, r_h)) = E_{(\overline{D}_h, \overline{M}_h)} P(\overline{CLR}_h = c(1-\alpha, r_h)\,|\,\overline{D}_h, \overline{M}_h)$ by the law of iterated expectations, (b) some calculations show that $\overline{CLR}_h = c(1-\alpha, r_h)$ iff $(r_h + c)\overline{LM}_h = -c\overline{J}_h + c^2 + c r_h$ iff $\overline{X}_h'\overline{X}_h = c^2 + c r_h$, where $c := c(1-\alpha, r_h)$ and $\overline{X}_h := ((r_h + c)^{1/2}(P_{h_{5,g}^{-1/2}\overline{\Delta}_h} h_{5,g}^{-1/2}\bar{g}_h)', c^{1/2}(M_{h_{5,g}^{-1/2}\overline{\Delta}_h} h_{5,g}^{-1/2}\bar{g}_h)')'$ using (18.22), (c) $P_{h_{5,g}^{-1/2}\overline{\Delta}_h} + M_{h_{5,g}^{-1/2}\overline{\Delta}_h} = I_k$ and $P_{h_{5,g}^{-1/2}\overline{\Delta}_h} M_{h_{5,g}^{-1/2}\overline{\Delta}_h} = 0^{k \times k}$, and (d) conditional on $(\overline{D}_h, \overline{M}_h)$, $r_h$, $c$, and $\overline{\Delta}_h$ are constants.

(iii) When $p = 1$, the formula for $AsySz$ in Theorem 18.1 reduces to $\alpha$ and the proviso holds automatically. That is, Kleibergen's CLR test has correct asymptotic size when $p = 1$. This holds because when $p = 1$ the quantity $\overline{M}_h^\dagger$ in (18.19) equals $0^{k \times p}$ by Comment (ii) to Theorem 18.3 below. This implies that $r_h(\overline{D}_h, \overline{M}_h)$ in (18.20) does not depend on $\overline{M}_h$. Given this, the proof that $P(\overline{CLR}_h > c(1-\alpha, r_h)) = \alpha$ for all $h \in H$ and that the proviso holds is the same as in (10.9)-(10.10) in the proof of Theorem 10.1.

⁵⁹ The definitions of $\bar{v}_h$, $\overline{LM}_h$, $\overline{J}_h$, and $\overline{CLR}_h$ in (18.22) are the same as in (9.1), (9.2), (10.6), and (10.7), respectively.
⁶⁰ Note that $\bar{g}_h$ is independent of $\overline{D}_h$.
(iv) Theorem 18.1 is proved by showing that it is a special case of Theorem 18.6 below, which is similar but applies not to $\widetilde{V}_{Dn}$ defined in (5.3), but to an arbitrary estimator $\widetilde{V}_{Dn}$ (of the asymptotic variance $\Phi_h^{vec(G_i)}$ of $n^{1/2}vec(\widehat{D}_n - E_{F_n}G_i)$) that satisfies an Assumption VD (which is stated below). Lemma 18.2 below shows that the estimator $\widetilde{V}_{Dn}$ defined in (5.3) satisfies Assumption VD.

(v) A CS version of Theorem 18.1 holds with the parameter space $\mathcal{F}_{\Theta,KCLR}$ in place of $\mathcal{F}_{KCLR}$, where $\mathcal{F}_{\Theta,KCLR} := \{(F, \theta_0) : F \in \mathcal{F}_{KCLR}(\theta_0), \theta_0 \in \Theta\}$ and $\mathcal{F}_{KCLR}(\theta_0)$ is the set $\mathcal{F}_{KCLR}$ defined in (18.5) with its dependence on $\theta_0$ made explicit. The proof of this CS result is as outlined in the Comment to Proposition 8.1. For the CS result, the $h$ index and its parameter space $H$ are as defined above, but $h$ also includes $\theta_0$ as a subvector, and $H$ allows this subvector to range over $\Theta$.

18.3 Simulation Results
In this section, for a particular linear IV regression model, we simulate (i) the correlations between $\overline{M}_{h,p-q^\dagger}^\dagger$ (defined in (18.19)) and $\bar{g}_h$ and (ii) some asymptotic null rejection probabilities (NRP's) of Kleibergen's CLR test that uses Jacobian-variance weighting and employs the Robin and Smith (2000) rank statistic. The model has $p = 2$ rhs endogenous variables, $k = 5$ IV's, and an error structure that yields simplified asymptotic formulae for some key quantities. The model is

$$y_{1i} = Y_{2i}'\theta_0 + u_i \text{ and } Y_{2i} = \pi' Z_i + V_{2i}, \quad (18.24)$$

where $y_{1i}, u_i \in R$; $Y_{2i}, V_{2i} = (V_{21i}, V_{22i})' \in R^2$; $Z_i = (Z_{i1}, \ldots, Z_{i5})' \in R^5$; and $\pi \in R^{5 \times 2}$. We take $Z_{ij} \sim N(.05, (.05)^2)$ for $j = 1, \ldots, 5$; $u_i \sim N(0, 1)$; $V_{21i} \sim N(0, 1)$; and $V_{22i} = u_i V_{21i}$. The random variables $Z_{i1}, \ldots, Z_{i5}$, $u_i$, and $V_{21i}$ are taken to be mutually independent. We take $\pi = \pi_n = (e_1, e_2 c n^{-1/2})$, where $e_1 = (1, 0, \ldots, 0)' \in R^5$ and $e_2 = (0, 1, 0, \ldots, 0)' \in R^5$. We consider 26 values of the constant $c$ lying between 0 and 60.1 (viz., 0.0, 0.1, ..., 1.0, 1.1, ..., 10.1, 20.1, ..., 60.1), as well as 707.1, 1414.2, and 1,000,000. Given these definitions, $h_{1,1} = \infty$, $h_{1,2} = c$, and $\overline{M}_h^\dagger = (0^5, \overline{M}_{h,p-q^\dagger}^\dagger) \in R^{5 \times 2}$, see (18.19).
In this model, we have $g_i = -Z_i u_i$ and $G_i = -Z_i Y_{2i}'$. The specified error distribution leads to $E_F G_{ij} g_i' = 0^{k \times k}$ for $j = 1, 2$. In consequence, the matrix $\Phi_h^{vec(G_i)}$ (defined in (8.15)), which is the asymptotic variance of the Jacobian-variance matrix estimator $\widetilde{V}_{Dn}$ (defined in (5.3)), simplifies as follows:

$$\Phi_h^{vec(G_i)} = \lim E_{F_n}\, vec(D_i - E_{F_n}D_i)vec(D_i - E_{F_n}D_i)' = \lim E_{F_n}\, vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)', \text{ where}$$
$$D_i := (G_{i1} - \Gamma_{1F}\Omega_F^{-1}g_i,\; G_{i2} - \Gamma_{2F}\Omega_F^{-1}g_i), \quad \Gamma_{jF} := E_F G_{ij} g_i' \text{ for } j = 1, 2, \text{ and } \Omega_F := E_F g_i g_i'. \quad (18.25)$$

In addition, in the present model, $G_{i1}$ and $G_{i2}$ are uncorrelated, where $G_i = (G_{i1}, G_{i2})$. In consequence, $\Phi_h^{vec(G_i)}$ is block diagonal. In turn, this implies that $\lim M_{F_n} := \lim (\Phi_{F_n}^{vec(G_i)})^{-1/2}$ is block diagonal with off-diagonal block $\lim M_{12F_n} = 0^{5 \times 5}$.
The quantities $h_{1,j}^\dagger$ for $j = 1, \ldots, 5$ (defined just below (18.10)) are not available in closed form, so we simulate them using a very large value of $n$, viz., $n = 2{,}000{,}000$. We use 4,000,000 simulation repetitions to compute the correlations between the $j$th elements of $\overline{M}_{h,p-q^\dagger}^\dagger$ and $\bar{g}_h$ for $j = 1, \ldots, 5$ and the asymptotic NRP's of the CLR test.⁶¹ The data-dependent critical values for the test are computed using a look-up table that gives the critical values for each fixed value $r$ of the rank statistic in a grid from 0 to 100 with a step size of .005. These critical values are computed using 4,000,000 simulation repetitions.

Results are obtained for each of the 29 values of $c$ listed above. The simulated correlations between the $j$th elements of $\overline{M}_{h,p-q^\dagger}^\dagger$ and $\bar{g}_h$ for $j = 1, \ldots, 5$ take the following values for all values of $c \le 60.1$: .33, .38, .38, .38, and .38. (18.26) For $c = 707.1$, the correlations are .32, .36, .36, .36, and .36. For $c = 1414.2$, the correlations are .24, .27, .27, .27, and .27. For $c = 1{,}000{,}000$, the correlations are .01, .01, .01, .01, and .01. These results corroborate the findings given in Theorem 5.1 that $\overline{M}_{h,p-q^\dagger}^\dagger$ and $\bar{g}_h$ are correlated asymptotically in some models under some sequences of distributions. In consequence, it is not possible to show the Jacobian-variance weighted CLR test has correct asymptotic size via a conditioning argument that relies on the independence of $\overline{\Delta}_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger$ and $\bar{g}_h$.

⁶¹ The correlations between the $j$th and $k$th elements of these vectors for $j \ne k$ are zero by analytic calculation. Hence, they are not reported here.
Next, we report the asymptotic NRP results for Kleibergen's CLR test that uses Jacobian-variance weighting and the Robin and Smith (2000) rank statistic. The asymptotic NRP's are found to be between 4.95% and 5.01% for the 29 values of $c$ considered. These values are very close to the nominal size of 5.00%. Whether the difference is due to simulation noise or not is not clear. The simulation standard error based on the formula $100(\alpha(1-\alpha)/reps)^{1/2}$, where $reps = 4{,}000{,}000$ is the number of simulation repetitions, is .01. However, this formula does not take into account simulation error from the computation of the critical values.

We conclude that, for the model and error distribution considered, the asymptotic NRP's of Kleibergen's CLR test with Jacobian-variance weighting are equal to, or very close to, its nominal size. This occurs even though there are non-negligible correlations between $\overline{M}_{h,p-q^\dagger}^\dagger$ and $\bar{g}_h$. Whether this occurs for all parameters and distributions in the linear IV model, and whether it occurs in other moment condition models, is an open question. It appears to be a question that can only be answered on a case-by-case basis.
18.4 Asymptotic Size of Kleibergen's CLR Test for General $\widetilde{V}_{Dn}$ Estimators

In this section, we determine the asymptotic size of Kleibergen's CLR test (defined in Section 5) using the Robin and Smith (2000) rank statistic based on a general "Jacobian-variance" estimator $\widetilde{V}_{Dn}$ $(= \widetilde{V}_{Dn}(\theta_0))$ that satisfies the following Assumption VD.

The first two results of this section, viz., Lemma 18.2 and Theorem 18.3, combine to establish Theorem 5.1, see Comment (i) to Theorem 18.3. The first and last results of this section, viz., Lemma 18.2 and Theorem 18.6, combine to prove Theorem 18.1.

The proofs of the results in this section are given in Section 18.6.

Assumption VD: For any sequence $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$, the estimator $\widetilde{V}_{Dn}$ is such that $n^{1/2}(\widetilde{M}_n - M_{F_n}) \to_d \overline{M}_h$ for some random matrix $\overline{M}_h \in R^{kp \times kp}$ (where $\widetilde{M}_n = \widetilde{V}_{Dn}^{-1/2}$ and $M_{F_n}$ is defined in (18.6)), the convergence is joint with

$$n^{1/2}\begin{pmatrix} \hat{g}_n \\ vec(\widehat{D}_n - E_{F_n}G_i) \end{pmatrix} \to_d \begin{pmatrix} \bar{g}_h \\ vec(\overline{D}_h) \end{pmatrix} \sim N\left(0^{(p+1)k}, \begin{pmatrix} h_{5,g} & 0^{k \times pk} \\ 0^{pk \times k} & \Phi_h^{vec(G_i)} \end{pmatrix}\right), \quad (18.27)$$

and $(\bar{g}_h, \overline{D}_h, \overline{M}_h)$ has a mean zero multivariate normal distribution with pd variance matrix. The same condition holds for any subsequence $\{w_n\}$ and any sequence $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $w_n$ in place of $n$ throughout.

Note that the convergence in (18.27) holds by Lemma 8.2.
The following lemma verifies Assumption VD for the estimator $\widetilde{V}_{Dn}$ defined in (5.3).

Lemma 18.2 The estimator $\widetilde{V}_{Dn}$ defined in (5.3) satisfies Assumption VD. Specifically, $n^{1/2}(\hat{g}_n, \widehat{D}_n - E_{F_n}G_i, \widetilde{M}_n - M_{F_n}) \to_d (\bar{g}_h, \overline{D}_h, \overline{M}_h)$, where $\widetilde{M}_n := \widetilde{V}_{Dn}^{-1/2}$, $M_{F_n} := (\Phi_{F_n}^{vec(G_i)})^{-1/2}$, and $(\bar{g}_h, \overline{D}_h, \overline{M}_h)$ has a mean zero multivariate normal distribution defined by (18.11) and (18.13)-(18.18) with pd variance matrix.

Comment: As stated in the paragraph containing (18.21), $\widehat{D}_n$ is defined in Lemma 18.2 and Theorem 18.3 below with $\widehat{W}_n = \widehat{\Omega}_n^{-1/2}$ and $\widehat{U}_n = I_p$.
Define

$$S_n^\dagger := Diag\{(n^{1/2}\tau_{1F_n}^\dagger)^{-1}, \ldots, (n^{1/2}\tau_{q^\dagger F_n}^\dagger)^{-1}, 1, \ldots, 1\} \in R^{p \times p} \text{ and } T_n^\dagger := B_n^\dagger S_n^\dagger, \quad (18.28)$$

where $B_n^\dagger$ is defined in (18.7).

The asymptotic distribution of $n^{1/2}\widehat{D}_n^\dagger T_n^\dagger$ is given in the following theorem.
Theorem 18.3 Suppose Assumption VD holds. For all sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$, $n^{1/2}(\hat{g}_n, \widehat{D}_n - E_{F_n}G_i, \widehat{D}_n^\dagger T_n^\dagger) \to_d (\bar{g}_h, \overline{D}_h, \overline{\Delta}_h^\dagger + \overline{M}_h^\dagger)$, where $\overline{\Delta}_h^\dagger$ is a nonrandom affine function of $\overline{D}_h$ defined in (18.14) and (18.15), $\overline{M}_h^\dagger$ is a nonrandom linear (i.e., affine and homogeneous of degree one) function of $\overline{M}_h$ defined in (18.19), $(\bar{g}_h, \overline{D}_h, \overline{M}_h)$ has a mean zero multivariate normal distribution, and $\bar{g}_h$ and $\overline{D}_h$ are independent. Under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$, the same result holds with $n$ replaced with $w_n$.

Comments: (i) Note that the random variables $(\bar{g}_h, \overline{\Delta}_h^\dagger, \overline{M}_h^\dagger)$ in Theorem 5.1 have a multivariate normal distribution whose mean and variance matrix depend on $\lim Var_{F_n}((f_i', vech(f_i f_i')')')$ and on the limits of certain functions of $E_{F_n}G_i$ by (18.11)-(18.19). This, Lemma 18.2, and Theorem 18.3 combine to prove Theorem 5.1 of AG1.
(ii) From (18.19), $\overline{M}_h^\dagger = 0^{k \times p}$ if $p = 1$ (because $q^\dagger = 0$ implies $q = 0$ which, in turn, implies $h_4 = 0^k$, and $q^\dagger = 1$ implies $\overline{M}_{h,p-q^\dagger}^\dagger$ has no columns).⁶² For $p \ge 2$, $\overline{M}_h^\dagger = 0^{k \times p}$ if $p = q^\dagger$ (because $\overline{M}_{h,p-q^\dagger}^\dagger$ has no columns) or if $h_{4,j} = 0^k$ for all $j \le p$. The former holds if the singular values $(\tau_{1F_n}^\dagger, \ldots, \tau_{pF_n}^\dagger)$ of $D_{F_n}^\dagger$ satisfy $n^{1/2}\tau_{jF_n}^\dagger \to \infty$ for all $j \le p$ (i.e., all parameters are strongly or semi-strongly identified). The latter occurs if $E_{F_n}G_i \to 0^{k \times p}$ (i.e., all parameters are either weakly identified in the standard sense or semi-strongly identified). These two conditions fail to hold when one or more parameters are strongly identified and one or more parameters are weakly identified or jointly weakly identified.

(iii) For example, when $p = 2$ the conditions in Comment (ii) (under which $\overline{M}_h^\dagger = 0^{k \times p}$) fail to hold if $E_{F_n}G_{i1} \ne 0^k$ does not depend on $n$ and $n^{1/2}E_{F_n}G_{i2} \to c$ for some $c \in R^k$.

⁶² Note that $q^\dagger = 0$ implies $q = 0$ when $p = 1$ because $n^{1/2}D_{F_n}^\dagger = n^{1/2}M_{F_n}E_{F_n}G_i = O(1)$ when $q^\dagger = 0$ (by the definition of $q^\dagger$) and this implies that $n^{1/2}E_{F_n}G_i = O(1)$ using the first condition in $\mathcal{F}_{KCLR}$. In turn, the latter implies that $n^{1/2}\Omega_{F_n}^{-1/2}E_{F_n}G_i = O(1)$ using the last condition in $\mathcal{F}$. That is, $q = 0$ (since $W_F = \Omega_F^{-1/2}$ and $U_F = I_p$ because $\widehat{W}_n = \widehat{\Omega}_n^{-1/2}$ and $\widehat{U}_n = I_p$ in the present case, see the Comment to Lemma 18.2).
The following lemma establishes the asymptotic distribution of $rk_n^\dagger$.

Lemma 18.4 Let the parameter space for $F$ be $\mathcal{F}_{KCLR}$. Suppose the variance matrix estimator $\widetilde{V}_{Dn}$ employed by the rank statistic $rk_n^\dagger$ (defined in (18.3)) satisfies Assumption VD. Then, under all sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$,

(a) $rk_n^\dagger := \hat{\kappa}_{pn}^\dagger \to_p \infty$ if $q^\dagger = p$;

(b) $rk_n^\dagger := \hat{\kappa}_{pn}^\dagger \to_d r_h(\overline{D}_h, \overline{M}_h)$ if $q^\dagger < p$, where $r_h(\overline{D}_h, \overline{M}_h)$ is defined in (18.20) using (18.19) with $\overline{M}_h$ defined in Assumption VD (rather than in (18.18));

(c) $\hat{\kappa}_{jn}^\dagger \to_p \infty$ for all $j \le q^\dagger$;

(d) the (ordered) vector of the smallest $p - q^\dagger$ singular values of $n^{1/2}\widehat{D}_n^\dagger$, i.e., $((\hat{\kappa}_{(q^\dagger+1)n}^\dagger)^{1/2}, \ldots, (\hat{\kappa}_{pn}^\dagger)^{1/2})'$, converges in distribution to the (ordered) $p - q^\dagger$ vector of the singular values of $h_{3,k-q^\dagger}^{\dagger\prime}(\overline{\Delta}_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger) \in R^{(k-q^\dagger) \times (p-q^\dagger)}$, where $\overline{M}_{h,p-q^\dagger}^\dagger$ is defined in (18.19) with $\overline{M}_h$ defined in Assumption VD (rather than in (18.18));

(e) the convergence in parts (a)-(d) holds jointly with the convergence in Theorem 18.3; and

(f) under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$, parts (a)-(e) hold with $n$ replaced with $w_n$.
The following lemma gives the joint asymptotic distribution of $CLR_n$ and $rk_n^\dagger$ and the asymptotic null rejection probabilities of Kleibergen's CLR test.

Lemma 18.5 Let the parameter space for $F$ be $\mathcal{F}_{KCLR}$. Suppose the variance matrix estimator $\widetilde{V}_{Dn}$ employed by the rank statistic $rk_n^\dagger$ (defined in (18.3)) satisfies Assumption VD. Then, under all sequences $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$,

(a) $CLR_n = LM_n + o_p(1) \to_d \chi_p^2$ and $rk_n^\dagger \to_p \infty$ if $q^\dagger = p$;

(b) $\lim_{n \to \infty} P(CLR_n > c(1-\alpha, rk_n^\dagger)) = \alpha$ if $q^\dagger = p$;

(c) $(CLR_n, rk_n^\dagger) \to_d (\overline{CLR}_h, r_h)$ if $q^\dagger < p$; and

(d) $\lim_{n \to \infty} P(CLR_n > c(1-\alpha, rk_n^\dagger)) = P(\overline{CLR}_h > c(1-\alpha, r_h))$ if $q^\dagger < p$, provided $P(\overline{CLR}_h = c(1-\alpha, r_h)) = 0$.

Under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$, parts (a)-(d) hold with $n$ replaced with $w_n$.
Comments: (i) The CLR critical value function $c(1-\alpha, r)$ is the $1-\alpha$ quantile of $clr(r)$. By definition,

$$clr(r) := \frac{1}{2}\left(\chi_p^2 + \chi_{k-p}^2 - r + \sqrt{(\chi_p^2 + \chi_{k-p}^2 - r)^2 + 4\chi_p^2\, r}\right), \quad (18.29)$$

where the chi-square random variables $\chi_p^2$ and $\chi_{k-p}^2$ are independent. If $r_h := r_h(\overline{D}_h, \overline{M}_h)$ does not depend on $\overline{M}_h$, then, conditional on $\overline{D}_h$, $r_h$ is a constant and $\overline{LM}_h$ and $\overline{J}_h$ are independent and distributed as $\chi_p^2$ and $\chi_{k-p}^2$ (see the paragraph following (10.6)). In this case, even when $q^\dagger = p$,

$$P(\overline{CLR}_h > c(1-\alpha, r_h)) = E_{\overline{D}_h} P(\overline{CLR}_h > c(1-\alpha, r_h)\,|\,\overline{D}_h) = \alpha, \quad (18.30)$$

as desired, where the first equality holds by the law of iterated expectations and the second equality holds because $r_h$ is a constant conditional on $\overline{D}_h$ and $c(1-\alpha, r_h)$ is the $1-\alpha$ quantile of the conditional distribution of $clr(r_h)$ given $\overline{D}_h$, which equals that of $\overline{CLR}_h$ given $\overline{D}_h$.

(ii) However, when $r_h := r_h(\overline{D}_h, \overline{M}_h)$ depends on $\overline{M}_h$, the distribution of $r_h$ conditional on $\overline{D}_h$ is not a pointmass distribution. Rather, conditional on $\overline{D}_h$, $r_h$ is a random variable that is not independent of $\overline{LM}_h$, $\overline{J}_h$, and $\overline{CLR}_h$. In consequence, the second equality in (18.30) does not hold and the asymptotic null rejection probability of Kleibergen's CLR test may be larger or smaller than $\alpha$ depending upon the sequence $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$ (or $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$) when $q^\dagger < p$.
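The look-up-table construction of $c(1-\alpha, r)$ described in Section 18.3 amounts to simulating quantiles of $clr(r)$ in (18.29). A minimal sketch (numpy assumed; the function name is ours, and far fewer repetitions are used than the 4,000,000 in the paper):

```python
import numpy as np

def crit_value(r, p, k, alpha=0.05, reps=200_000, seed=0):
    """Simulate the 1-alpha quantile of clr(r) in (18.29) with independent
    chi2_p and chi2_(k-p) draws."""
    rng = np.random.default_rng(seed)
    chi_p = rng.chisquare(p, reps)
    chi_kp = rng.chisquare(k - p, reps)
    s = chi_p + chi_kp - r
    clr = 0.5 * (s + np.sqrt(s ** 2 + 4.0 * chi_p * r))
    return np.quantile(clr, 1.0 - alpha)

# At r = 0, clr(0) = chi2_p + chi2_(k-p) = chi2_k, so c(.95, 0) is near the
# chi2_5 95% quantile (about 11.07); as r grows, clr(r) collapses toward chi2_p.
c0 = crit_value(0.0, p=2, k=5)
assert abs(c0 - 11.07) < 0.2
assert crit_value(1e8, p=2, k=5) < c0
```

Because $clr(r)$ is pointwise nonincreasing in $r$, the simulated critical value decreases monotonically along the grid, consistent with the two boundary cases $c(1-\alpha, 0)$ (a $\chi_k^2$ quantile) and $c(1-\alpha, \infty)$ (a $\chi_p^2$ quantile).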
Next, we use Lemma 18.5 to provide an expression for the asymptotic size of Kleibergen's CLR test based on the Robin and Smith (2000) rank statistic with Jacobian-variance weighting.

Theorem 18.6 Let the parameter space for $F$ be $\mathcal{F}_{KCLR}$. Suppose the variance matrix estimator $\widetilde{V}_{Dn}$ employed by the rank statistic $rk_n^\dagger$ (defined in (18.3)) satisfies Assumption VD. Then, the asymptotic size of Kleibergen's CLR test based on $rk_n^\dagger$ is

$$AsySz = \max\{\alpha, \sup_{h \in H} P(\overline{CLR}_h > c(1-\alpha, r_h))\}$$

provided $P(\overline{CLR}_h = c(1-\alpha, r_h)) = 0$ for all $h \in H$.

Comments: (i) Comment (i) to Theorem 18.1 also applies to Theorem 18.6.

(ii) Theorem 18.6 and Lemma 18.2 combine to prove Theorem 18.1.

(iii) A CS version of Theorem 18.6 holds with the parameter space $\mathcal{F}_{\Theta,KCLR}$ in place of $\mathcal{F}_{KCLR}$, see Comment (v) to Theorem 18.1 and the Comment to Proposition 8.1.
18.5 Correct Asymptotic Size of Equally-Weighted CLR Tests Based on the Robin-Smith Rank Statistic

In this subsection, we consider equally-weighted CLR tests, a special case of which is considered in Section 6. By definition, an equally-weighted CLR test is a CLR test that is based on a $rk_n$ statistic that depends on $\widehat{D}_n$ only through $\widetilde{W}_n \widehat{D}_n$ for some general $k \times k$ weighting matrix $\widetilde{W}_n$. We show that such tests have correct asymptotic size when they are based on the rank statistic of Robin and Smith (2000) and employ a general weight matrix $\widetilde{W}_n \in R^{k \times k}$ that satisfies certain conditions. In contrast, the results in Section 6 consider the specific weight matrix $\widehat{\Omega}_n^{-1/2} \in R^{k \times k}$. The reason for considering these tests in this section is that the asymptotic results can be obtained as a relatively simple by-product of the results in Section 18.4. All that is required is a slight change in Assumption VD.

The rank statistic that we consider here is

$$rk_n^\dagger := \lambda_{\min}(n \widehat{D}_n' \widetilde{W}_n' \widetilde{W}_n \widehat{D}_n). \quad (18.31)$$

We replace Assumption VD in Section 18.4 by the following assumption.

Assumption W: For any sequence $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$, the random $k \times k$ weight matrix $\widetilde{W}_n$ is such that $n^{1/2}(\widetilde{W}_n - W_{F_n}^\dagger) \to_d \overline{W}_h$ for some non-random $k \times k$ matrices $\{W_{F_n}^\dagger : n \ge 1\}$ and some random $k \times k$ matrix $\overline{W}_h \in R^{k \times k}$, $W_{F_n}^\dagger \to W_h^\dagger$ for some nonrandom pd $k \times k$ matrix $W_h^\dagger$, the convergence is joint with the convergence in (18.27), and $(\bar{g}_h, \overline{D}_h, \overline{W}_h)$ has a mean zero multivariate normal distribution with pd variance matrix. The same condition holds for any subsequence $\{w_n\}$ and any sequence $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$ with $w_n$ in place of $n$ throughout.

If one takes $\widetilde{M}_n$ $(= \widetilde{V}_{Dn}^{-1/2})$ $= I_p \otimes \widetilde{W}_n$ in Assumption VD, then $\widehat{D}_n^\dagger = \widetilde{W}_n \widehat{D}_n$ and the rank statistics in (18.3) and (18.31) are the same. Thus, Assumption W is analogous to Assumption VD with $\widetilde{M}_n = I_p \otimes \widetilde{W}_n$ and $M_{F_n} = I_p \otimes W_{F_n}^\dagger$. Note, however, that the latter matrix does not typically satisfy the condition in Assumption VD that $M_{F_n}$ is defined in (18.6), i.e., the condition that $M_{F_n} = (\Phi_{F_n}^{vec(G_i)})^{-1/2}$. Nevertheless, the results in Section 18.4 hold with Assumption VD replaced by Assumption W and with $M_F = I_p \otimes W_F^\dagger$, $D_F^\dagger = W_F^\dagger E_F G_i$, and $\overline{M}_h = I_p \otimes \overline{W}_h$. With these changes, $\overline{\Delta}_h^\dagger$ is defined as in (18.15) with $\overline{D}_h^\dagger = W_h^\dagger \overline{D}_h$ in (18.14) (because $(\Phi_h^{vec(G_i)})^{-1/2}$ is replaced by $I_p \otimes W_h^\dagger$), $\overline{\Delta}_h^\dagger$ is as just given, and $\overline{M}_h^\dagger$ is defined as in (18.19) with $\overline{M}_{h,p-q^\dagger}^\dagger = \overline{W}_h h_4 h_{2,p-q^\dagger}^\dagger$. Below we show the key result that $\overline{M}_{h,p-q^\dagger}^\dagger = 0^{k \times (p-q^\dagger)}$ for $rk_n^\dagger$ defined in (18.31). By (18.20), this implies that

$$r_h(\overline{D}_h, \overline{M}_h) := \lambda_{\min}\left((\overline{\Delta}_{h,p-q^\dagger}^\dagger)' h_{3,k-q^\dagger}^\dagger h_{3,k-q^\dagger}^{\dagger\prime} \overline{\Delta}_{h,p-q^\dagger}^\dagger\right) \quad (18.32)$$

when $q^\dagger < p$. Note that the rhs in (18.32) does not depend on $\overline{M}_h$ and, hence, is a function only of $\overline{D}_h$. That is, $r_h(\overline{D}_h, \overline{M}_h) = r_h(\overline{D}_h)$. Given that $r_h(\overline{D}_h, \overline{M}_h)$ does not depend on $\overline{M}_h$, Comment (i) to Lemma 18.5 implies that $P(\overline{CLR}_h > c(1-\alpha, r_h)) = \alpha$ under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} \in \Lambda_{KCLR} : n \ge 1\}$. This and Theorem 18.6 give the following result.

Corollary 18.7 Let the parameter space for $F$ be $\mathcal{F}_{KCLR}$. Suppose the rank statistic $rk_n^\dagger$ (defined in (18.31)) is based on a weight matrix $\widetilde{W}_n$ that satisfies Assumption W. Then, the asymptotic size of the corresponding equally-weighted version of Kleibergen's CLR test (defined in Section 5 with $rk_n(\theta) = rk_n^\dagger$) equals $\alpha$.

Comment: A CS version of Corollary 18.7 holds with the parameter space $\mathcal{F}_{\Theta,KCLR}$ in place of $\mathcal{F}_{KCLR}$, see Comment (v) to Theorem 18.1 and the Comment to Proposition 8.1.
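The rank statistic in (18.31) is a smallest eigenvalue, equivalently the squared smallest singular value of $n^{1/2}\widetilde{W}_n \widehat{D}_n$. A minimal numerical sketch (numpy assumed; all inputs are hypothetical stand-ins, not estimators from data):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, p = 500, 5, 2
D_hat = rng.standard_normal((k, p))   # stand-in for D_hat_n (k x p)
# Stand-in weight matrix W_tilde_n: a Cholesky factor of a pd matrix.
W = np.linalg.cholesky(np.eye(k) + 0.1 * np.ones((k, k)))

# rk_n := lambda_min(n * D' W' W D), as in (18.31)
M = n * D_hat.T @ W.T @ W @ D_hat
rk = np.linalg.eigvalsh(M).min()

# Equivalent computation: squared smallest singular value of sqrt(n) * W * D_hat
sv = np.linalg.svd(np.sqrt(n) * W @ D_hat, compute_uv=False)
assert np.allclose(rk, sv.min() ** 2)
```

The singular-value form is the one that connects directly to Lemma 18.4(d), where the limit of the smallest $p - q^\dagger$ singular values characterizes the asymptotic distribution of $rk_n^\dagger$.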
Now, we establish that $\overline{M}_{h,p-q^\dagger}^\dagger$ $(= \overline{W}_h h_4 h_{2,p-q^\dagger}^\dagger)$ $= 0^{k \times (p-q^\dagger)}$. We have

$$W_h^\dagger h_4 := \lim W_{F_n}^\dagger E_{F_n} G_i = \lim C_{F_n}^\dagger \tau_{F_n}^\dagger (B_{F_n}^\dagger)' = h_3^\dagger \lim \tau_{F_n}^\dagger\, h_2^{\dagger\prime}, \quad (18.33)$$

where $C_{F_n}^\dagger \tau_{F_n}^\dagger B_{F_n}^{\dagger\prime}$ is the singular value decomposition of $W_{F_n}^\dagger E_{F_n} G_i$, $\tau_{F_n}^\dagger$ is the $k \times p$ matrix with the singular values of $W_{F_n}^\dagger E_{F_n} G_i$, denoted by $\{\tau_{jF_n}^\dagger : n \ge 1\}$ for $j \le p$, on the main diagonal and zeroes elsewhere, and $C_{F_n}^\dagger$ and $B_{F_n}^\dagger$ are the corresponding $k \times k$ and $p \times p$ orthogonal matrices of singular vectors, as defined in (18.7). Hence, $\lim \tau_{F_n}^\dagger$ exists, call it $\tau_h^\dagger$, and equals $h_3^{\dagger\prime} W_h^\dagger h_4 h_2^\dagger$. That is, the singular value decomposition of $W_h^\dagger h_4$ is

$$W_h^\dagger h_4 = h_3^\dagger \tau_h^\dagger h_2^{\dagger\prime}. \quad (18.34)$$

The $k \times p$ matrix $\tau_h^\dagger$ has the limits of the singular values of $W_{F_n}^\dagger E_{F_n} G_i$ on its main diagonal and zeroes elsewhere. Let $\tau_{h,j}^\dagger$ for $j \le p$ denote the limits of these singular values. By the definition of $q^\dagger$, $\tau_{h,j}^\dagger = 0$ for $j = q^\dagger + 1, \ldots, p$ (because $n^{1/2}\tau_{jF_n}^\dagger \to h_{1,j}^\dagger < \infty$). In consequence, $\tau_h^\dagger$ can be written as

$$\tau_h^\dagger = \begin{bmatrix} \tau_{h,q^\dagger}^\dagger & 0^{q^\dagger \times (p-q^\dagger)} \\ 0^{(k-q^\dagger) \times q^\dagger} & 0^{(k-q^\dagger) \times (p-q^\dagger)} \end{bmatrix}, \text{ where } \tau_{h,q^\dagger}^\dagger := Diag\{\tau_{h,1}^\dagger, \ldots, \tau_{h,q^\dagger}^\dagger\}. \quad (18.35)$$

In addition,

$$h_2^{\dagger\prime} h_{2,p-q^\dagger}^\dagger = \begin{pmatrix} 0^{q^\dagger \times (p-q^\dagger)} \\ I_{p-q^\dagger} \end{pmatrix}. \quad (18.36)$$

Thus, we have

$$\overline{M}_{h,p-q^\dagger}^\dagger := \overline{W}_h (W_h^\dagger)^{-1} W_h^\dagger h_4 h_{2,p-q^\dagger}^\dagger = \overline{W}_h (W_h^\dagger)^{-1} h_3^\dagger \tau_h^\dagger h_2^{\dagger\prime} h_{2,p-q^\dagger}^\dagger = \overline{W}_h (W_h^\dagger)^{-1} h_3^\dagger \begin{bmatrix} \tau_{h,q^\dagger}^\dagger & 0^{q^\dagger \times (p-q^\dagger)} \\ 0^{(k-q^\dagger) \times q^\dagger} & 0^{(k-q^\dagger) \times (p-q^\dagger)} \end{bmatrix} \begin{pmatrix} 0^{q^\dagger \times (p-q^\dagger)} \\ I_{p-q^\dagger} \end{pmatrix} = 0^{k \times (p-q^\dagger)}, \quad (18.37)$$

where the first equality holds by the paragraph following Assumption W and uses the condition in Assumption W that $W_h^\dagger$ is pd and the second equality holds by (18.35) and (18.36). This completes the proof of Corollary 18.7.
18.6 Proofs of Results Stated in Sections 18.2 and 18.4

For notational simplicity, the proofs in this section are for the sequence $\{n\}$, rather than a subsequence $\{w_n : n \ge 1\}$. The same proofs hold for any subsequence $\{w_n : n \ge 1\}$.

Proof of Theorem 18.1. Theorem 18.1 follows from Theorem 18.6, which imposes Assumption VD, and Lemma 18.2, which verifies Assumption VD when $\widetilde{V}_{Dn}$ is defined by (5.3).
Proof of Lemma 18.2. Consider any sequence $\{\lambda_{n,h} \in \Lambda_{KCLR} : n \ge 1\}$. By the CLT result in (18.11), the linear expansion of $n^{1/2}(\widehat{D}_n - E_{F_n}G_i)$ in (14.1), and the definitions of $\bar{g}_h$ and $\overline{D}_h$ in (18.13), we have

$$n^{1/2}(\hat{g}_n, \widehat{D}_n - E_{F_n}G_i) \to_d (\bar{g}_h, \overline{D}_h). \quad (18.38)$$

Next, we apply the delta method to the CLT result in (18.11) and the function $a(\cdot)$ defined in (18.16). The mean component in the lhs quantity in (18.11) is $(0^{(p+1)k\prime}, vech(E_{F_n} f_i f_i')')'$. We have

$$a\left(\begin{pmatrix} 0^{(p+1)k} \\ vech(E_{F_n} f_i f_i') \end{pmatrix}\right) = vech\left(\left(E_{F_n} vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)' - \Gamma_{F_n}^{vec(G_i)}\Omega_{F_n}^{-1}\Gamma_{F_n}^{vec(G_i)\prime}\right)^{-1/2}\right) = vech\left((\Phi_{F_n}^{vec(G_i)})^{-1/2}\right) = vech(M_{F_n}), \quad (18.39)$$

where $\Gamma_{F_n}^{vec(G_i)}$ and $\Omega_{F_n}$ are defined in (3.2), the first equality uses the definitions of $a(\cdot)$ and $f_i$ (given in (18.16) and (5.6), respectively), the second equality holds by the definition of $\Phi_{F_n}^{vec(G_i)}$ in (8.15), and the third equality holds by the definition of $M_{F_n}$ in (18.6). Also, $E_{F_n} f_i f_i' \to h_{10,f}$ and $h_{10,f}$ is pd. Hence, $a(\cdot)$ is well defined and continuously partially differentiable at $\lim(0^{(p+1)k\prime}, vech(E_{F_n} f_i f_i')')' = (0^{(p+1)k\prime}, vech(h_{10,f})')'$, as required for the application of the delta method.
The delta method gives

$$n^{1/2}(A_n - vech(M_{F_n})) = n^{1/2}\left(a\left(n^{-1}\sum_{i=1}^n \begin{pmatrix} f_i \\ vech(f_i f_i') \end{pmatrix}\right) - a\left(\begin{pmatrix} 0^{(p+1)k} \\ vech(E_{F_n} f_i f_i') \end{pmatrix}\right)\right) \to_d A_h L_h, \quad (18.40)$$

where the first equality holds by (18.39) and the definitions of $a(\cdot)$ and $A_n$ in (18.16), and the convergence holds by the delta method using the CLT result in (18.11) and the definition of $A_h$ following (18.16).

Applying the inverse $vech(\cdot)$ operator, namely, $vech_{kp,kp}^{-1}(\cdot)$, to both sides of (18.40) gives the reconfigured convergence result

$$n^{1/2}(vech_{kp,kp}^{-1}(A_n) - M_{F_n}) \to_d vech_{kp,kp}^{-1}(A_h L_h) = \overline{M}_h, \quad (18.41)$$

where the last equality holds by the definition of $\overline{M}_h$ in (18.18).

The convergence results in (18.38) and (18.41) hold jointly because both rely on the convergence result in (18.11).

We show below that

$$n^{1/2}(\widetilde{V}_{Dn} - (vech_{kp,kp}^{-1}(A_n))^{-2}) = o_p(1). \quad (18.42)$$

This and the delta method applied again (using the function $\ell(A) = A^{-1/2}$ for a pd $kp \times kp$ matrix $A$) give

$$n^{1/2}(\widetilde{V}_{Dn}^{-1/2} - vech_{kp,kp}^{-1}(A_n)) = o_p(1) \quad (18.43)$$

because $vech_{kp,kp}^{-1}(A_n) = (\Phi_h^{vec(G_i)})^{-1/2} + o_p(1)$ and $\Phi_h^{vec(G_i)}$ is pd (because $h_{10,f}$ is pd and $\Phi_h^{vec(G_i)} = Q h_{10,f} Q'$ for some full row rank matrix $Q$). Equations (18.38), (18.41), and (18.43) establish the result of the lemma.
Now we prove (18.42). We have

$$\widetilde{V}_{Dn} := n^{-1}\sum_{i=1}^n vec(G_i - \widehat{G}_n)vec(G_i - \widehat{G}_n)' - \widehat{\Gamma}_n \widehat{\Omega}_n^{-1} \widehat{\Gamma}_n'$$
$$= n^{-1}\sum_{i=1}^n vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)' - vec(\widehat{G}_n - E_{F_n}G_i)vec(\widehat{G}_n - E_{F_n}G_i)'$$
$$\quad - \left(\widetilde{\Gamma}_n - vec(\widehat{G}_n - E_{F_n}G_i)\hat{g}_n'\right)\left(\widetilde{\Omega}_n - \hat{g}_n\hat{g}_n'\right)^{-1}\left(\widetilde{\Gamma}_n - vec(\widehat{G}_n - E_{F_n}G_i)\hat{g}_n'\right)'$$
$$= n^{-1}\sum_{i=1}^n vec(G_i - E_{F_n}G_i)vec(G_i - E_{F_n}G_i)' - \widetilde{\Gamma}_n \widetilde{\Omega}_n^{-1} \widetilde{\Gamma}_n' + O_p(n^{-1}), \quad (18.44)$$

where the second equality holds by subtracting and adding $E_{F_n}G_i$ and some algebra, by the definitions of $\widehat{\Omega}_n$ and $\widehat{\Gamma}_n$ in (4.1), (4.3), and (5.3), and by the definitions of $\widetilde{\Omega}_n$ and $\widetilde{\Gamma}_n$ in (18.16), and the third equality holds because (i) the second summand on the lhs of the third equality is $O_p(n^{-1})$ because $n^{1/2}vec(\widehat{G}_n - E_{F_n}G_i) = O_p(1)$ (by the CLT using the moment conditions in $\mathcal{F}$, defined in (3.1)) and (ii) $n^{1/2}\hat{g}_n = O_p(1)$ (by Lemma 8.3), $n^{1/2}vec(\widehat{G}_n - E_{F_n}G_i) = O_p(1)$, and $\widehat{\Omega}_n = O_p(1)$, $\widehat{\Omega}_n^{-1} = O_p(1)$, $\widetilde{\Omega}_n = O_p(1)$, and $\widetilde{\Omega}_n^{-1} = O_p(1)$ (by the justification given for (14.1)).

Excluding the $O_p(n^{-1})$ term, the rhs in (18.44) equals $(vech_{kp,kp}^{-1}(A_n))^{-2}$. Hence, (18.42) holds and the proof is complete.
cn =
Proof of Theorem 18.3. The proof is similar to that of Lemma 8.3 in Section 8 with W
bn = Un = Ip ; and the following quantities q; D
b n ; Dn (= EFn Gi ); Bn;q ; n;q ; Cn ; and
Wn = Ik ; U
y
y
y
y
y
y
y by
n ; respectively. The proof employs the
n replaced by q ; Dn ; Dn (= D ); B y ;
y ; Cn ; and
Fn
n;q
n;q
notational simpli…cations in (13.1). We can write
b y By y (
D
n n;q
y
) 1
n;q y
y
= Dny Bn;q
y(
y
) 1
n;q y
by
+ n1=2 (D
n
By the singular value decomposition, Dny = Cny
y
Dny Bn;q
y(
y
) 1
n;q y
= Cny
y0
y
y
n Bn Bn;q y (
0
= Cny @
Iqy
y
y
0(k q ) q
y y0
n Bn :
y
) 1
n;q y
1
A = Cy
47
y
1=2
Dny )Bn;q
y (n
(18.45)
Thus, we obtain
= Cny
n;q y
y
) 1:
n;q y
:
0
y@
n
Iqy
y
y
0(p q ) q
1
A(
y
) 1
n;q y
(18.46)
b n = (D
b 1n ; :::; D
b pn ) 2 Rk
Let D
n
1=2
by
(D
n
Dny )
= n
1=2
p
p
X
j=1
p
X
=
j=1
and Dh = (D1h ; :::; Dph ) 2 Rk
f1jn D
b jn
(M
p:
We have
fpjn D
b jn
M1jFn EFn Gij ; :::; M
f1jn n1=2 (D
b jn
[M
f1jn
EFn Gij ) + n1=2 (M
MpjFn EFn Gij )
M1jFn )EFn Gij ; :::;
fpjn n1=2 (D
b jn EFn Gij ) + n1=2 (M
fpjn MpjFn )EFn Gij ]
M
p
X
(M1jh Djh + M 1jh h4;j ; :::; Mpjh Djh + M pjh h4;j );
!d
(18.47)
j=1
where the convergence holds by Lemma 8.2 in Section 8, Assumption VD, and EFn Gij ! h4;j (by
the de…nition of h4;j ):
Combining (18.45)-(18.47) gives
\[
\hat{D}_n^\dagger B_{n,q^\dagger}^\dagger (\Upsilon_{n,q^\dagger}^\dagger)^{-1}
= C_{n,q^\dagger}^\dagger + o_p(1) \to_p h_{3,q^\dagger}^\dagger = \Delta_{h,q^\dagger}^\dagger,
\tag{18.48}
\]
where the equality uses $n^{1/2} \tau_{jF_n}^\dagger \to \infty$ for all $j \le q^\dagger$ by the definition of $q^\dagger$ and $B_{n,q^\dagger}^{\dagger\prime} B_{n,q^\dagger}^\dagger = I_{q^\dagger}$; the convergence holds by the definition of $h_{3,q^\dagger}^\dagger$; and the last equality holds by the definition of $\Delta_{h,q^\dagger}^\dagger$ in (18.15).
Using the singular value decomposition $D_n^\dagger = C_n^\dagger \Upsilon_n^\dagger B_n^{\dagger\prime}$ again, we obtain
\[
n^{1/2} D_n^\dagger B_{n,p-q^\dagger}^\dagger
= n^{1/2} C_n^\dagger \Upsilon_n^\dagger B_n^{\dagger\prime} B_{n,p-q^\dagger}^\dagger
= n^{1/2} C_n^\dagger \Upsilon_n^\dagger
\begin{pmatrix} 0_{q^\dagger \times (p-q^\dagger)} \\ I_{p-q^\dagger} \end{pmatrix}
= C_n^\dagger
\begin{pmatrix}
0_{q^\dagger \times (p-q^\dagger)} \\
\mathrm{Diag}\{ n^{1/2} \tau^\dagger_{(q^\dagger+1)F_n}, \ldots, n^{1/2} \tau^\dagger_{pF_n} \} \\
0_{(k-p) \times (p-q^\dagger)}
\end{pmatrix}
\to h_3^\dagger
\begin{pmatrix}
0_{q^\dagger \times (p-q^\dagger)} \\
\mathrm{Diag}\{ h^\dagger_{1,q^\dagger+1}, \ldots, h^\dagger_{1,p} \} \\
0_{(k-p) \times (p-q^\dagger)}
\end{pmatrix}
= h_3^\dagger h^\dagger_{1,p-q^\dagger},
\tag{18.49}
\]
where the second equality uses $B_n^{\dagger\prime} B_n^\dagger = I_p$; the convergence holds by the definitions of $h_3^\dagger$ and $h^\dagger_{1,j}$ for $j = 1, \ldots, p$; and the last equality holds by the definition of $h^\dagger_{1,p-q^\dagger}$ in the paragraph following (18.10), which uses (8.17).
By (18.47) and $B_{n,p-q^\dagger}^\dagger \to h_{2,p-q^\dagger}^\dagger$, we have
\[
n^{1/2} (\hat{D}_n^\dagger - D_n^\dagger) B_{n,p-q^\dagger}^\dagger
\to_d \overline{D}_h^\dagger h_{2,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger,
\tag{18.50}
\]
using the definitions of $\overline{D}_h^\dagger$ and $\overline{M}_{h,p-q^\dagger}^\dagger$ in (18.14) and (18.19), respectively.
Using (18.49) and (18.50), we get
\[
n^{1/2} \hat{D}_n^\dagger B_{n,p-q^\dagger}^\dagger
= n^{1/2} D_n^\dagger B_{n,p-q^\dagger}^\dagger + n^{1/2} (\hat{D}_n^\dagger - D_n^\dagger) B_{n,p-q^\dagger}^\dagger
\to_d h_3^\dagger h_{1,p-q^\dagger}^\dagger + \overline{D}_h^\dagger h_{2,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger
= \Delta_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger,
\tag{18.51}
\]
where the last equality holds by the definition of $\Delta_{h,p-q^\dagger}^\dagger$ in (18.15).
Equations (18.48) and (18.51) combine to give
\[
n^{1/2} \hat{D}_n^\dagger T_n^\dagger = n^{1/2} \hat{D}_n^\dagger B_n^\dagger S_n^\dagger
= (\hat{D}_n^\dagger B_{n,q^\dagger}^\dagger (\Upsilon_{n,q^\dagger}^\dagger)^{-1},\; n^{1/2} \hat{D}_n^\dagger B_{n,p-q^\dagger}^\dagger)
\to_d (\Delta_{h,q^\dagger}^\dagger,\; \Delta_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger)
= \Delta_h^\dagger + \overline{M}_h^\dagger,
\tag{18.52}
\]
using the definitions of $S_n^\dagger$ and $T_n^\dagger$ in (18.28), $\Delta_h^\dagger$ in (18.15), and $\overline{M}_h^\dagger$ in (18.19).

By Lemma 8.2, $n^{1/2}(\hat{g}_n, \hat{D}_n - E_{F_n} G_i) \to_d (\overline{g}_h, \overline{D}_h)$. This convergence is joint with that in (18.52) because the latter just relies on the convergence of $n^{1/2}(\hat{D}_n - E_{F_n} G_i)$, which is part of the former, and of $n^{1/2}(\tilde{M}_n - M_{F_n}) \to_d \overline{M}_h$, which holds jointly with the former by Assumption VD. This establishes the convergence result of Theorem 18.3.

The independence of $\overline{g}_h$ and $(\overline{D}_h, \Delta_h^\dagger)$ follows from the independence of $\overline{g}_h$ and $\overline{D}_h$, which holds by Lemma 8.2, and the fact that $\Delta_h^\dagger$ is a nonrandom function of $\overline{D}_h$.
Proof of Lemma 18.4. The proof of Lemma 18.4 is analogous to the proof of Theorem 8.4 with $\hat{W}_n = W_n = I_k$, $\hat{U}_n = U_n = I_p$, and the quantities $q$, $\hat{D}_n$, $D_n\,(= E_{F_n} G_i)$, $\hat{\tau}_{jn}$, $B_n$, $B_{n,q}$, $S_n$, $S_{n,q}$, $\tau_{jF_n}$, and $h_{3,q}$ replaced by $q^\dagger$, $\hat{D}_n^\dagger$, $D_n^\dagger\,(= D_{F_n}^\dagger)$, $\hat{\tau}_{jn}^\dagger$, $B_n^\dagger$, $B_{n,q^\dagger}^\dagger$, $S_n^\dagger$, $S_{n,q^\dagger}^\dagger$, $\tau_{jF_n}^\dagger$, and $h_{3,q^\dagger}^\dagger$, respectively. Theorem 18.3, rather than Lemma 8.3, is employed to obtain the results in (16.37). In consequence, $\Delta_{h,q}$ and $\Delta_{h,p-q}$ are replaced by $\Delta_{h,q^\dagger}^\dagger + \overline{M}_{h,q^\dagger}^\dagger$ and $\Delta_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger$, respectively, where $\Delta_{h,q^\dagger}^\dagger + \overline{M}_{h,q^\dagger}^\dagger = \Delta_{h,q^\dagger}^\dagger$ (because $\overline{M}_{h,q^\dagger}^\dagger := 0_{k \times q^\dagger}$ by (18.19)). That is, the quantities $\Delta_{h,q}$ and $\Delta_{h,p-q}$ are replaced by $\Delta_{h,q^\dagger}^\dagger$ and $\Delta_{h,p-q^\dagger}^\dagger + \overline{M}_{h,p-q^\dagger}^\dagger$, respectively, in (16.37) and in the rest of the proof of Theorem 8.4. Note that (16.39) holds with $h_{3,q}$ replaced by $h_{3,q^\dagger}^\dagger$ because $\Delta_{h,q^\dagger}^\dagger = h_{3,q^\dagger}^\dagger$ by (18.15) (just as $\Delta_{h,q} = h_{3,q}$). Because $\hat{U}_n = U_n$, the matrices $A_n$ and $A_{jn}$ for $j = 1, 2, 3$ (defined in (16.39)) are all zero matrices, which simplifies the expressions in (16.41)-(16.44) considerably.
The proof of Theorem 8.4 uses Lemma 16.1 to obtain (16.42). Hence, an analogue of Lemma 16.1 is needed, where the changes listed in the first paragraph of this proof are made and $h_{6,j}$ and $C_n$ are replaced by $h_{6,j}^\dagger$ and $C_n^\dagger$, respectively. In addition, $\mathcal{F}_{WU}$ is replaced by $\mathcal{F}_{KCLR}$ (because $\mathcal{F}_{KCLR} \subset \mathcal{F}_{WU}$ for $\delta_{WU}$ sufficiently small and $M_{WU}$ sufficiently large, using the facts that $\mathcal{F}_0 \cap \mathcal{F}_{WU}$ equals $\mathcal{F}_0$ for $\delta_{WU}$ sufficiently small and $M_{WU}$ sufficiently large by the argument following (8.5) and $\mathcal{F}_{KCLR} \subset \mathcal{F}_0$ by the argument following (18.5)). Because $\hat{U}_n = U_n$, the matrices $\hat{A}_{jn}$ for $j = 1, 2, 3$ (defined in (16.2)) are all zero matrices, which simplifies the expressions in (16.9)-(16.12) considerably. For (16.3) to go through with the changes listed above (in particular, with $\hat{W}_n$, $\hat{D}_n$, $D_n$, and $U_n$ replaced by $I_k$, $\hat{D}_n^\dagger$, $D_n^\dagger$, and $I_p$, respectively), we need to show that
\[
n^{1/2} (\hat{D}_n^\dagger - D_n^\dagger) = O_p(1).
\tag{18.53}
\]
By (5.4) with $\theta = \theta_0$ (and with the dependence of various quantities on $\theta_0$ suppressed for notational simplicity), we have
\[
\hat{D}_n^\dagger = \sum_{j=1}^{p} (\tilde{M}_{1jn} \hat{D}_{jn}, \ldots, \tilde{M}_{pjn} \hat{D}_{jn}), \text{ where }
\tilde{M}_n =
\begin{bmatrix}
\tilde{M}_{11n} & \cdots & \tilde{M}_{1pn} \\
\vdots & \ddots & \vdots \\
\tilde{M}_{p1n} & \cdots & \tilde{M}_{ppn}
\end{bmatrix}
:= \tilde{V}_{Dn}^{-1/2} \in R^{kp \times kp}.
\tag{18.54}
\]
By (18.6), we have
\[
D_n^\dagger = \sum_{j=1}^{p} (M_{1jF_n} D_{jn}, \ldots, M_{pjF_n} D_{jn})
\tag{18.55}
\]
using $D_n = (D_{1n}, \ldots, D_{pn})$ and $D_{jn} := E_{F_n} G_{ij}$ for $j = 1, \ldots, p$.
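The column-by-column sum in (18.54) is equivalent to multiplying the column-major vec of the $k \times p$ matrix by the $kp \times kp$ weighting matrix; a small numerical sketch (random matrices standing in for the estimators, with arbitrary illustrative dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
k, p = 4, 3

M = rng.standard_normal((k * p, k * p))   # stands in for M_tilde_n = V_Dn^{-1/2}
D = rng.standard_normal((k, p))           # stands in for D_hat_n

# Column-blocked form in (18.54): the s-th column is sum_j M_{sj} D_j
D_dag = np.column_stack([
    sum(M[s*k:(s+1)*k, j*k:(j+1)*k] @ D[:, j] for j in range(p))
    for s in range(p)
])

# Equivalent "vec" form: vec(D_dagger) = M_tilde vec(D), with column-major vec
assert np.allclose(D_dag, (M @ D.flatten('F')).reshape((k, p), order='F'))
print("block-sum and vec forms agree")
```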
For $s = 1, \ldots, p$, we have
\[
n^{1/2} (\tilde{M}_{sjn} \hat{D}_{jn} - M_{sjF_n} D_{jn})
= \tilde{M}_{sjn} n^{1/2} (\hat{D}_{jn} - D_{jn}) + n^{1/2} (\tilde{M}_{sjn} - M_{sjF_n}) D_{jn} = O_p(1),
\tag{18.56}
\]
where $n^{1/2}(\hat{D}_{jn} - D_{jn}) = O_p(1)$ (by Lemma 8.2), $n^{1/2}(\tilde{M}_{sjn} - M_{sjF_n}) = O_p(1)$ (because $n^{1/2}(\tilde{M}_n - M_{F_n}) \to_d \overline{M}_h$ by Assumption VD), $M_{sjF_n} = O(1)$ (because $M_F = (\Sigma_F^{vec(G_i)})^{-1/2}$, defined in (8.15), satisfies
\[
\Sigma_F^{vec(G_i)} := Var_F\big(vec(G_i) - \Gamma_F^{vec(G_i)} \Omega_F^{-1} g_i\big)
= [-\Gamma_F^{vec(G_i)} \Omega_F^{-1} : I_{pk}]\, Var_F(f_i)\, [-\Gamma_F^{vec(G_i)} \Omega_F^{-1} : I_{pk}]'
\]
and $\lambda_{\min}(\Sigma_F^{vec(G_i)})$ is bounded away from zero by the definition of $\mathcal{F}_{KCLR}$ in (18.5)), and $D_{jn} = O(1)$ (by the moment conditions in $\mathcal{F}$, defined in (3.1)).
Hence,
\[
n^{1/2} (\hat{D}_n^\dagger - D_n^\dagger)
= \sum_{j=1}^{p} n^{1/2} \left[ (\tilde{M}_{1jn} \hat{D}_{jn}, \ldots, \tilde{M}_{pjn} \hat{D}_{jn}) - (M_{1jF_n} D_{jn}, \ldots, M_{pjF_n} D_{jn}) \right] = O_p(1).
\tag{18.57}
\]
This completes the proof of the analogue of Lemma 16.1, which completes the proof of parts (a)-(d) of Lemma 18.4.

For part (e) of Lemma 18.4, the results of parts (a)-(d) hold jointly with those in Theorem 18.3, rather than those in Lemma 8.3, because Theorem 18.3 is used to obtain the results in (16.37), rather than Lemma 8.3. This completes the proof.
Proof of Lemma 18.5. The proof of parts (a) and (b) is the same as the proof of Theorem 10.1 for the case where Assumption R(a) holds (which states that $rk_n \to_p \infty$) using Lemma 18.4(a), which shows that $rk_n^\dagger \to_d \infty$ if $q^\dagger = p$.

The proofs of parts (c) and (d) are the same as in (10.5)-(10.9) in the proof of Theorem 10.1 for the case where Assumption R(b) holds, using Theorem 18.3 and Lemma 18.4(b) in place of Lemma 8.3, with $r_h(\overline{D}_h, \overline{M}_h)$ (defined in (18.20)) in place of $r_h(\overline{D}_h)$, and for part (d), with the proviso that $P(CLR_h = c(1-\alpha, r_h)) = 0$. (The proof in Theorem 10.1 that $P(CLR_h = c(1-\alpha, r_h)) = 0$ does not go through in the present case because $r_h = r_h(\overline{D}_h, \overline{M}_h)$ is not necessarily a constant conditional on $\overline{D}_h$ and, alternatively, conditional on $(\overline{D}_h, \overline{M}_h)$, $LM_h$ and $J_h$ are not necessarily independent and distributed as $\chi_p^2$ and $\chi_{k-p}^2$.) Note that (10.10) does not necessarily hold in the present case, because $r_h = r_h(\overline{D}_h, \overline{M}_h)$ is not necessarily a constant conditional on $\overline{D}_h$.
The proof of Theorem 18.6 given below uses Corollary 2.1(a) of ACG, which is stated below as Proposition 18.8. It is a generic asymptotic size result. Unlike Proposition 8.1 above, Proposition 18.8 applies when the asymptotic size is not necessarily equal to the nominal size $\alpha$. Let $\{\phi_n : n \ge 1\}$ be a sequence of tests of some null hypothesis whose null distributions are indexed by a parameter $\lambda$ with parameter space $\Lambda$. Let $RP_n(\lambda)$ denote the null rejection probability of $\phi_n$ under $\lambda$. For a finite nonnegative integer $J$, let $\{h_n(\lambda) = (h_{1n}(\lambda), \ldots, h_{Jn}(\lambda))' \in R^J : n \ge 1\}$ be a sequence of functions on $\Lambda$. Define $H$ as in (8.1). For a sequence of scalar constants $\{C_n : n \ge 1\}$, let $C_n \to [C_{1,\infty}, C_{2,\infty}]$ denote that $C_{1,\infty} \le \liminf_{n\to\infty} C_n \le \limsup_{n\to\infty} C_n \le C_{2,\infty}$.

Assumption B: For any subsequence $\{w_n\}$ of $\{n\}$ and any sequence $\{\lambda_{w_n} \in \Lambda : n \ge 1\}$ for which $h_{w_n}(\lambda_{w_n}) \to h \in H$, $RP_{w_n}(\lambda_{w_n}) \to [RP^-(h), RP^+(h)]$ for some $RP^-(h), RP^+(h) \in [0, 1]$.

Proposition 18.8 (ACG, Corollary 2.1(a)) Under Assumption B, the tests $\{\phi_n : n \ge 1\}$ have $AsySz := \limsup_{n\to\infty} \sup_{\lambda \in \Lambda} RP_n(\lambda) \in [\sup_{h \in H} RP^-(h), \sup_{h \in H} RP^+(h)]$.

Comments: (i) Corollary 2.1(a) of ACG is stated for confidence sets, rather than tests. But, following Comment 4 to Theorem 2.1 of ACG, with suitable adjustments (as in Proposition 18.8 above) it applies to tests as well.

(ii) Under Assumption B, if $RP^-(h) = RP^+(h)$ for all $h \in H$, then $AsySz = \sup_{h \in H} RP^+(h)$. We use this to prove Theorem 18.6. The result of Proposition 18.8 for the case where $RP^-(h) \ne RP^+(h)$ for some $h \in H$ is used when proving Comment (i) to Theorem 18.1 and the Comment to Theorem 18.6.
Proof of Theorem 18.6. Theorem 18.6 follows from Lemma 18.5 and Proposition 18.8 because Lemma 18.5 verifies Assumption B with $RP^-(h) = RP^+(h) = \alpha$ when $q^\dagger = p$ and with $RP^-(h) = RP^+(h) = P(CLR_h > c(1-\alpha, r_h))$ when $q^\dagger < p$.

19 Proof of Theorem 7.1
Theorem 7.1 of AG1. Suppose the LM test, the CLR test with moment-variance weighting, and, when $p = 1$, the CLR test with Jacobian-variance weighting are defined as in this section, the parameter space for $F$ is $\mathcal{F}_{TS,0}$ for the first two tests and $\mathcal{F}_{TS,JVW,p=1}$ for the third test, and Assumption V holds. Then, these tests have asymptotic sizes equal to their nominal size $\alpha \in (0, 1)$ and are asymptotically similar (in a uniform sense). Analogous results hold for the corresponding CS's for the parameter spaces $\mathcal{F}_{\Theta,TS,0}$ and $\mathcal{F}_{\Theta,TS,JVW,p=1}$.
The proof of Theorem 7.1 is analogous to that of Theorems 4.1, 5.2, and 6.1. In the time series case, for tests, we define $\lambda = (\lambda_{1,F}, \ldots, \lambda_{9,F})$ and $\{\lambda_{n,h} : n \ge 1\}$ as in (8.9) and (8.11), respectively, but with $\lambda_{5,F}$ defined differently than in the i.i.d. case. (For CS's in the time series case, we make the adjustments outlined in the Comment to Proposition 8.1.) We define^{63}
\[
\lambda_{5,F} := V_F = \sum_{m=-\infty}^{\infty} E_F
\begin{pmatrix} g_i \\ vec(G_i - E_F G_i) \end{pmatrix}
\begin{pmatrix} g_{i-m} \\ vec(G_{i-m} - E_F G_{i-m}) \end{pmatrix}'.
\tag{19.1}
\]
In consequence, $\lambda_{5,F_n} \to h_5$ implies that $V_{F_n} \to h_5$ and the condition in Assumption V holds with $V = h_5$.
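The quantity in (19.1) is a long-run variance, i.e., a two-sided sum of autocovariances. As a scalar sanity check of this type of sum (the AR(1)-type autocovariance $\gamma(m) = \rho^{|m|}\sigma^2$ is chosen purely for illustration and is not from the paper):

```python
import numpy as np

rho, sigma2 = 0.5, 2.0  # illustrative parameters

# Truncated two-sided sum of autocovariances gamma(m) = rho^|m| * sigma2
lrv_sum = sigma2 * sum(rho ** abs(m) for m in range(-10_000, 10_001))

# Closed form: sum_m rho^|m| = (1 + rho) / (1 - rho)
lrv_closed = sigma2 * (1 + rho) / (1 - rho)

assert np.isclose(lrv_sum, lrv_closed)
print("long-run variance sum matches closed form")
```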
The proof of Theorem 7.1 uses the CLT given in the following lemma.

Lemma 19.1 Let $f_i := (g_i', vec(G_i)')'$. We have: $w_n^{-1/2} \sum_{i=1}^{w_n} (f_i - E_{F_{w_n}} f_i) \to_d N(0^{(p+1)k}, h_5)$ under all subsequences $\{w_n\}$ and all sequences $\{\lambda_{w_n,h} : n \ge 1\}$.
Proof of Theorem 7.1. The proof is the same as the proofs of Theorems 4.1, 5.2, and 6.1 (given in Sections 9, 10, and 11, respectively, in the Appendix to AG1) and the proofs of Lemmas 8.2 and 8.3 and Theorem 8.4 (given in Sections 14, 15, and 16 in this Supplemental Material), upon which the former proofs rely, for the i.i.d. case with some modifications. The modifications affect the proofs of Lemmas 8.2 and 8.3 and the proof of Theorem 5.2. No modifications are needed elsewhere.

The first modification is the change in the definition of $\lambda_{5,F}$ described in (19.1).

63 The difference in the definitions of $\lambda_{5,F}$ in the i.i.d. and time series cases reflects the difference in the definitions of $\Sigma_F^{vec(G_i)}$ in these two cases. See the footnote at (7.1) above regarding the latter.
The second modification is that $\hat{\Omega}_n = \hat{\Omega}_n(\theta_0) \to_p h_{5,g}$ not by the WLLN but by Assumption V and the definition of $\hat{\Omega}_n(\theta)$ in (7.4). In the time series case, by definition, $\lambda_{5,F} := V_F$, so $h_5 := \lim \lambda_{5,F_n} = \lim V_{F_n}$. By definition, $h_{5,g}$ is the upper left $k \times k$ submatrix of $h_5$ and $\Omega_F$ is the upper left $k \times k$ submatrix of $V_F$ by (7.1) and (19.1). Hence, $h_{5,g} = \lim \Omega_{F_n}$. By the definition of $\mathcal{F}_{TS}$, $\lambda_{\min}(\Omega_F) \ge \delta$ $\forall F \in \mathcal{F}_{TS}$. Hence, $h_{5,g}$ is pd.

Let $h_{5,G_j g}$ be the $k \times k$ submatrix of $h_5$ that corresponds to the submatrix $\hat{\Gamma}_{jn}(\theta)$ of $\hat{V}_n(\theta)$ in (7.4) for $j = 1, \ldots, p$. The third modification is that $\hat{\Gamma}_{jn} = \hat{\Gamma}_{jn}(\theta_0) = h_{5,G_j g} + o_p(1)$ in (14.1) in the proof of Lemma 8.2 (rather than $\hat{\Gamma}_{jn} = E_{F_n} G_{ij} g_i' + o_p(1)$) for $j = 1, \ldots, p$, and this holds by Assumption V and the definition of $\hat{\Gamma}_{jn}(\theta)$ in (7.4) (rather than by the WLLN).
We write
\[
h_5 = \begin{pmatrix} h_{5,g} & h_{5,Gg}' \\ h_{5,Gg} & h_{5,G} \end{pmatrix}
\text{ for } h_{5,g} \in R^{k \times k},\;
h_{5,Gg} = \begin{pmatrix} h_{5,G_1 g} \\ \vdots \\ h_{5,G_p g} \end{pmatrix} \in R^{pk \times k},
\text{ and } h_{5,G} \in R^{pk \times pk}.
\tag{19.2}
\]
The fourth modification is that $\tilde{V}_{Dn}$ in (11.1) in the proof of Theorem 5.2 is defined as described in Section 7, rather than as in (5.3). In addition, $\tilde{V}_{Dn} \to_p h_7$ in (11.1) holds with $h_7 = h_{5,G} - h_{5,Gg} (h_{5,g})^{-1} h_{5,Gg}'$ by Assumption V, rather than by the WLLN.
The fifth modification is the use of a WLLN and CLT for triangular arrays of strong mixing random vectors, rather than i.i.d. random vectors, for the quantities in the proof of Lemma 8.2 and elsewhere. For the WLLN, we use Example 4 of Andrews (1988), which shows that for a strong mixing row-wise-stationary triangular array $\{W_i : i \le n\}$ we have $n^{-1} \sum_{i=1}^{n} (\zeta(W_i) - E_{F_n} \zeta(W_i)) \to_p 0$ for any real-valued function $\zeta(\cdot)$ (that may depend on $n$) for which $\sup_{n \ge 1} E_{F_n} ||\zeta(W_i)||^{1+\delta} < \infty$ for some $\delta > 0$. For the CLT, we use Lemma 19.1 as follows. The joint convergence of $n^{1/2} \hat{g}_n$ and $n^{1/2} (\hat{D}_n - E_{F_n} G_i)$ in the proof of Lemma 8.2 is obtained from (14.1), modified by the second and third modifications above, and the following result:
\[
n^{-1/2} \sum_{i=1}^{n} (\phi(W_i) - E_{F_n} \phi(W_i))
= \begin{pmatrix} I_k & 0^{k \times pk} \\ -h_{5,Gg} h_{5,g}^{-1} & I_{pk} \end{pmatrix}
n^{-1/2} \sum_{i=1}^{n} (f_i - E_{F_n} f_i)
\to_d N(0^{(p+1)k}, L_{h_5}), \text{ where}
\]
\[
\phi(W_i) := \begin{pmatrix} g_i \\ vec(G_i) - h_{5,Gg} h_{5,g}^{-1} g_i \end{pmatrix}
= \begin{pmatrix} I_k & 0^{k \times pk} \\ -h_{5,Gg} h_{5,g}^{-1} & I_{pk} \end{pmatrix}
\begin{pmatrix} g_i \\ vec(G_i) \end{pmatrix},
\tag{19.3}
\]
$f_i = (g_i', vec(G_i)')'$, and the convergence holds by Lemma 19.1. Using (19.2), the variance matrix $L_{h_5}$ in (19.3) takes the form:
\[
L_{h_5} = \begin{pmatrix} I_k & 0^{k \times pk} \\ -h_{5,Gg} h_{5,g}^{-1} & I_{pk} \end{pmatrix}
\begin{pmatrix} h_{5,g} & h_{5,Gg}' \\ h_{5,Gg} & h_{5,G} \end{pmatrix}
\begin{pmatrix} I_k & -h_{5,g}^{-1} h_{5,Gg}' \\ 0^{pk \times k} & I_{pk} \end{pmatrix}
= \begin{pmatrix} h_{5,g} & 0^{k \times pk} \\ 0^{pk \times k} & \Sigma_h^{vec(G_i)} \end{pmatrix},
\text{ where } \Sigma_h^{vec(G_i)} = h_{5,G} - h_{5,Gg} h_{5,g}^{-1} h_{5,Gg}'.
\tag{19.4}
\]
Equations (14.1) (modified as described above), (19.3), and (19.4) combine to give the result of Lemma 8.2 for the time series case.
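The matrix algebra in (19.4) is the standard Schur-complement block-diagonalization. A small numerical sketch (a random positive definite matrix stands in for $h_5$, with arbitrary illustrative dimensions):

```python
import numpy as np

rng = np.random.default_rng(2)
k, p = 3, 2
n = (p + 1) * k

# A random symmetric pd matrix plays the role of h_5, partitioned as in (19.2)
A = rng.standard_normal((n, n))
h5 = A @ A.T + n * np.eye(n)
h5_g, h5_Gg, h5_G = h5[:k, :k], h5[k:, :k], h5[k:, k:]

# The linear transform in (19.3): project the g_i component out of vec(G_i)
L = np.block([
    [np.eye(k), np.zeros((k, p * k))],
    [-h5_Gg @ np.linalg.inv(h5_g), np.eye(p * k)],
])

Lh5 = L @ h5 @ L.T
# As in (19.4): block diagonal with blocks h5_g and the Schur complement
assert np.allclose(Lh5[:k, k:], 0)
assert np.allclose(Lh5[:k, :k], h5_g)
assert np.allclose(Lh5[k:, k:], h5_G - h5_Gg @ np.linalg.inv(h5_g) @ h5_Gg.T)
print("block-diagonal form (19.4) verified")
```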
The sixth modification occurs in the proof of Lemma 8.3(d) in Section 15 in this Supplemental Material. In the time series case, the proof goes through as is, except that the calculations in (15.13) are not needed because $\Phi_F^{a_i}$ (and, hence, $\Phi_F^{vec(G_i)}$ as well) is defined with its underlying components re-centered at their means (which is needed to ensure that $\Phi_F^{a_i}$ is a convergent sum). The latter implies that
\[
\lim \Phi_{F_n}^{vec(C_{F_n,k-q}' \Omega_{F_n}^{-1/2} G_i B_{F_n,p-q_2})}
= \Phi_h^{vec(h_{3,k-q}' h_{5,g}^{-1/2} G_i h_{2,p-q_2})}
\]
holds automatically (which, in the i.i.d. case, is proved in (15.13)).

This completes the proof of Theorem 7.1.
Proof of Lemma 19.1. For notational simplicity, we prove the result for the sequence $\{n\}$ rather than a subsequence $\{w_n : n \ge 1\}$. The same proof applies for any subsequence. By the Cramér-Wold device, it suffices to prove the result with $f_i - E_{F_n} f_i$ and $h_5$ replaced by $s(W_i) = b'(f_i - E_{F_n} f_i)$ and $b' h_5 b$, respectively, for arbitrary $b \in R^{(p+1)k}$. First, we show
\[
\lim Var_{F_n}\left( n^{-1/2} \sum_{i=1}^{n} s(W_i) \right) = b' h_5 b,
\tag{19.5}
\]
where by assumption $\lambda_{5,F_n} \to h_5$ and $b' \lambda_{5,F_n} b = \sum_{m=-\infty}^{\infty} E_{F_n} s(W_i) s(W_{i-m})$. By change of variables, we have
\[
Var_{F_n}\left( n^{-1/2} \sum_{i=1}^{n} s(W_i) \right)
= \sum_{m=-n+1}^{n-1} Cov_{F_n}(s(W_i), s(W_{i-m}))
- \sum_{m=-n+1}^{n-1} \frac{|m|}{n}\, Cov_{F_n}(s(W_i), s(W_{i-m})).
\tag{19.6}
\]
This gives
\[
\left| Var_{F_n}\left( n^{-1/2} \sum_{i=1}^{n} s(W_i) \right) - b' \lambda_{5,F_n} b \right|
\le 2 \sum_{m=n}^{\infty} || Cov_{F_n}(s(W_i), s(W_{i-m})) ||
+ \sum_{m=-n+1}^{n-1} \frac{|m|}{n}\, || Cov_{F_n}(s(W_i), s(W_{i-m})) ||.
\tag{19.7}
\]
By a standard strong mixing covariance inequality, e.g., see Davidson (1994, p. 212),
\[
\sup_{F \in \mathcal{F}_{TS}} || Cov_F(s(W_i), s(W_{i-m})) ||
\le C_1\, \alpha_F(m)^{\delta/(2+\delta)}
\le C_1 C^{\delta/(2+\delta)} m^{-d\delta/(2+\delta)}, \text{ where } d\delta/(2+\delta) > 1,
\tag{19.8}
\]
for some $C_1 < \infty$, where the second inequality uses the definition of $\mathcal{F}_{TS}$ in (7.2). In consequence, both terms on the rhs of (19.7) converge to zero. This and $b' \lambda_{5,F_n} b \to b' h_5 b$ establish (19.5).
When $b' h_5 b = 0$, we have $\lim_{n\to\infty} Var_{F_n}(n^{-1/2} \sum_{i=1}^{n} s(W_i)) = 0$, which implies that $n^{-1/2} \sum_{i=1}^{n} s(W_i) \to_d N(0, b' h_5 b) = 0$. When $b' h_5 b > 0$, we can assume $\sigma_n^2 = Var_{F_n}(n^{-1/2} \sum_{i=1}^{n} s(W_i)) \ge c$ for some $c > 0$ $\forall n \ge 1$ without loss of generality. We apply the triangular array CLT in Corollary 1 of de Jong (1997) with (using de Jong's notation) $c_{ni} := n^{-1/2} \sigma_n^{-1}$ and $X_{ni} := n^{-1/2} s(W_i) \sigma_n^{-1}$. Now we verify conditions (a)-(c) of Assumption 2 of de Jong (1997). Condition (a) holds automatically. Condition (b) holds because $c_{ni} > 0$ and $E_{F_n} |X_{ni}/c_{ni}|^{2+\delta} = E_{F_n} |s(W_i)|^{2+\delta} \le 2 ||b||^{2+\delta} M < \infty$ $\forall F_n \in \mathcal{F}_{TS}$. Condition (c) holds by taking $V_{ni} = X_{ni}$ (where $V_{ni}$ is the random variable that appears in the definition of near epoch dependence in Definition 2 of de Jong (1997)), $d_{ni} = 0$, and using $\alpha_{F_n}(m) \le C m^{-d}$ $\forall F_n \in \mathcal{F}_{TS}$ for $d > (2+\delta)/\delta$ and $C < \infty$. By Corollary 1 of de Jong (1997), we have $\sum_{i=1}^{n} X_{ni} \to_d N(0, 1)$. This and (19.5) give
\[
n^{-1/2} \sum_{i=1}^{n} s(W_i) \to_d N(0, b' h_5 b),
\tag{19.9}
\]
as desired.
References

Andrews, D. W. K. (1988): "Laws of Large Numbers for Dependent Non-identically Distributed Random Variables," Econometric Theory, 4, 458-467.

Davidson, J. (1994): Stochastic Limit Theory. Oxford: Oxford University Press.

de Jong, R. M. (1997): "Central Limit Theorems for Dependent Heterogeneous Random Variables," Econometric Theory, 13, 353-367.

Hwang, S.-G. (2004): "Cauchy's Interlace Theorem for Eigenvalues of Hermitian Matrices," American Mathematical Monthly, 111, 157-159.

Johansen, S. (1991): "Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models," Econometrica, 59, 1551-1580.

Kleibergen, F. (2005): "Testing Parameters in GMM Without Assuming That They Are Identified," Econometrica, 73, 1103-1123.

Rao, C. R. (1973): Linear Statistical Inference and Its Applications. 2nd edition. New York: Wiley.

Robin, J.-M., and R. J. Smith (2000): "Tests of Rank," Econometric Theory, 16, 151-175.

Stewart, G. W. (2001): Matrix Algorithms Volume II: Eigensystems. Philadelphia: SIAM.