Statistica Sinica: Supplement
SIMULTANEOUS CONFIDENCE BANDS AND
HYPOTHESIS TESTING FOR SINGLE-INDEX MODELS
Gaorong Li$^a$, Heng Peng$^{b,1}$, Kai Dong$^b$ and Tiejun Tong$^b$
$^a$ College of Applied Sciences, Beijing University of Technology, Beijing 100124, P. R. China
$^b$ Department of Mathematics, Hong Kong Baptist University, Hong Kong, P. R. China
Supplementary Material
This note contains two lemmas and the proofs of Theorems 1–5.
S1 Two lemmas
Lemma 1 Suppose that conditions C1–C3 hold. If $h \to 0$ and $nh^3 \to \infty$, we have
$$\sup_{u \in \mathcal{U}} \big| S_{n,l}^{\beta} - \mu_l f(u) - \mu_{l+1} f'(u) h \big| = O_P(h^2 + \delta_n), \quad l = 0, 1, 2, 3,$$
where $\delta_n = \sqrt{\log n/(nh)}$, $f'(u)$ is the derivative of $f(u)$, and
$$S_{n,l}^{\beta} = \frac{1}{n} \sum_{i=1}^{n} \Big( \frac{\beta^T X_i - u}{h} \Big)^{l} K_h(\beta^T X_i - u), \quad l = 0, 1, 2, 3.$$
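For readers who wish to check the kernel moments numerically, $S_{n,l}^{\beta}$ can be computed directly from data. The sketch below is illustrative only (it is not part of the formal argument); it uses the Epanechnikov kernel as one concrete choice satisfying the usual kernel conditions, and all names are ours.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 (1 - t^2) on [-1, 1]."""
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def s_nl(X, beta, u, h, l):
    """Kernel-weighted moment S_{n,l}^beta
    = (1/n) sum_i {(beta^T X_i - u)/h}^l K_h(beta^T X_i - u),
    with K_h(t) = K(t/h)/h."""
    t = (X @ beta - u) / h          # standardized index values
    return np.mean(t**l * epanechnikov(t) / h)
```

For $l = 0$ this is the usual kernel density estimate of $f(u)$ along the index direction, which is exactly the leading term Lemma 1 identifies.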
The details of the proof can be found in Martins-Filho and Yao (2007). The following lemma provides the uniform convergence rates for the estimators $\hat\eta$ and $\hat\eta'$, respectively.
Lemma 2 Let $\mathcal{B}_n = \{\beta : \|\beta - \beta_0\| \le c n^{-1/2}\}$ for some positive constant $c$. Suppose that conditions C1–C5 hold. Then
$$\sup_{u\in\mathcal{U},\,\beta\in\mathcal{B}_n} |\hat\eta(u;\beta) - \eta(u)| = O_P(\delta_n), \tag{S1.1}$$
and
$$\sup_{u\in\mathcal{U},\,\beta\in\mathcal{B}_n} |\hat\eta'(u;\beta) - \eta'(u)| = O_P(\delta_n/h). \tag{S1.2}$$
$^1$ Heng Peng is the corresponding author. E-mail: [email protected].
The proof of Lemma 2 can be found in the full version of Wang et al. (2010), available on arXiv at arXiv:0905.2042, hence we omit the details here.
S2 Proof of Theorem 1
The proof of Theorem 1 follows immediately from Carroll et al. (1997), Liang et al. (2010), or Chen, Gao and Li (2013), so we omit the details.
S3 Proof of Theorem 2
By (2.7) in Section 2 and some simple calculations, we have
$$\hat\eta(u;\hat\beta_0) = \sum_{i=1}^{n} W_{ni}(u;\hat\beta_0) Y_i, \tag{S3.1}$$
where
$$W_{ni}(u;\hat\beta_0) = \frac{n^{-1} K_h(\hat\beta_0^T X_i - u)\big[ S_{n,2}^{\hat\beta_0} - \{(\hat\beta_0^T X_i - u)/h\} S_{n,1}^{\hat\beta_0} \big]}{S_{n,0}^{\hat\beta_0} S_{n,2}^{\hat\beta_0} - \big(S_{n,1}^{\hat\beta_0}\big)^2}.$$
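Because (S3.1) is a local linear smoother, the weights $W_{ni}(u;\hat\beta_0)$ reproduce constant and linear functions exactly, so $\sum_i W_{ni} = 1$ and $\sum_i W_{ni}(\hat\beta_0^T X_i - u) = 0$ at every $u$. A small numerical sketch (a Gaussian kernel is assumed purely for concreteness, and the data are illustrative) verifies both identities.

```python
import numpy as np

def gauss_kh(t, h):
    """Scaled Gaussian kernel K_h(t) = K(t/h)/h (one concrete choice of K)."""
    return np.exp(-0.5 * (t / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)

def local_linear_weights(index, u, h):
    """Weights W_ni(u) from (S3.1); index_i plays the role of beta_0^T X_i."""
    n = len(index)
    t = index - u
    k = gauss_kh(t, h)
    # S_{n,l} = (1/n) sum_i (t_i/h)^l K_h(t_i), l = 0, 1, 2
    s0, s1, s2 = (np.mean((t / h) ** l * k) for l in range(3))
    num = k * (s2 - (t / h) * s1) / n
    return num / (s0 * s2 - s1 ** 2)
```

The two identities are what make the $\eta(u)$ and $\eta'(u)$ terms drop out of the bias expansion (S3.5) below.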
By Lemma 1, we have, uniformly for $u\in\mathcal{U}$ and $\beta\in\mathcal{B}_n$,
$$S_{n,0}^{\beta} = f(u) + O_P(h^2+\delta_n), \quad S_{n,1}^{\beta} = O_P(h+\delta_n), \quad S_{n,2}^{\beta} = \mu_2 f(u) + O_P(h^2+\delta_n), \quad S_{n,3}^{\beta} = O_P(h+\delta_n). \tag{S3.2}$$
Hence, we have
$$S_{n,0}^{\beta} S_{n,2}^{\beta} - \big(S_{n,1}^{\beta}\big)^2 = \mu_2 f^2(u) + O_P(h^2+\delta_n). \tag{S3.3}$$
By (S3.1), we have
$$\begin{aligned}
\hat\eta(u;\hat\beta_0) - \eta(u) &= \sum_{i=1}^{n} W_{ni}(u;\hat\beta_0) Y_i - \eta(u) \\
&= \sum_{i=1}^{n} W_{ni}(u;\hat\beta_0)\varepsilon_i + \Big[\sum_{i=1}^{n} W_{ni}(u;\hat\beta_0)\eta(\hat\beta_0^T X_i) - \eta(u)\Big] \\
&\quad - \sum_{i=1}^{n} W_{ni}(u;\hat\beta_0)\big[\eta(\hat\beta_0^T X_i) - \eta(\beta_0^T X_i)\big] \\
&=: I_1 + I_2 - I_3. \tag{S3.4}
\end{aligned}$$
For $I_2$, by a Taylor expansion and some calculations, and from (S3.2)–(S3.3) and condition C1, we have
$$\begin{aligned}
I_2 &= \big[S_{n,0}^{\hat\beta_0} S_{n,2}^{\hat\beta_0} - (S_{n,1}^{\hat\beta_0})^2\big]^{-1} \frac{1}{n}\sum_{i=1}^{n} K_h(\hat\beta_0^T X_i - u)\big[S_{n,2}^{\hat\beta_0} - \{(\hat\beta_0^T X_i - u)/h\} S_{n,1}^{\hat\beta_0}\big] \\
&\quad\times \Big\{\eta(u) + \eta'(u)(\hat\beta_0^T X_i - u) + \frac{1}{2}\eta''(u)(\hat\beta_0^T X_i - u)^2 + O(|\hat\beta_0^T X_i - u|^3)\Big\} - \eta(u) \\
&= \frac{1}{2}\eta''(u) h^2 \big[S_{n,0}^{\hat\beta_0} S_{n,2}^{\hat\beta_0} - (S_{n,1}^{\hat\beta_0})^2\big]^{-1}\big[(S_{n,2}^{\hat\beta_0})^2 - S_{n,1}^{\hat\beta_0} S_{n,3}^{\hat\beta_0}\big] + o_P(1) \\
&= \frac{1}{2}\eta''(u) h^2 \mu_2 + O_P(h^2 + \delta_n). \tag{S3.5}
\end{aligned}$$
Now we consider $I_3$. By Theorem 1, $\hat\beta_0$ is a $\sqrt{n}$-consistent estimator; that is, $\hat\beta_0 \in \mathcal{B}_n$. Note that $\sup_{x\in\mathcal{X},\,\beta\in\mathcal{B}_n} |\eta(\beta^T x) - \eta(\beta_0^T x)| = O(n^{-1/2})$. Invoking Lemma 2 in Zhu and Xue (2006), it is easy to show that $I_3 = o_P(n^{-1/2})$.
We approximate the process $I_1$ as follows. Following the steps of Lemma A.2 in Xia and Li (1999), we have
$$\frac{1}{n}\sum_{i=1}^{n} K_h(\beta_0^T X_i - u)\{(\beta_0^T X_i - u)/h\}\varepsilon_i = O_P(\delta_n). \tag{S3.6}$$
By the proof of Lemma A.1 in Xia et al. (2004), we can obtain
$$\frac{1}{nf(u)}\sum_{i=1}^{n} K_h(\hat\beta_0^T X_i - u)\varepsilon_i = \frac{1}{nf(u)}\sum_{i=1}^{n} K_h(\beta_0^T X_i - u)\varepsilon_i + O_P(h\delta_n). \tag{S3.7}$$
By (S3.2), (S3.6)–(S3.7) and the same argument as in (S3.5), it is easy to show that
$$\begin{aligned}
I_1 &= \big[S_{n,0}^{\hat\beta_0} S_{n,2}^{\hat\beta_0} - (S_{n,1}^{\hat\beta_0})^2\big]^{-1} S_{n,2}^{\hat\beta_0}\, \frac{1}{n}\sum_{i=1}^{n} K_h(\hat\beta_0^T X_i - u)\varepsilon_i \\
&\quad - \big[S_{n,0}^{\hat\beta_0} S_{n,2}^{\hat\beta_0} - (S_{n,1}^{\hat\beta_0})^2\big]^{-1} S_{n,1}^{\hat\beta_0}\, \frac{1}{n}\sum_{i=1}^{n} K_h(\hat\beta_0^T X_i - u)\{(\hat\beta_0^T X_i - u)/h\}\varepsilon_i \\
&= \frac{1}{nf(u)}\sum_{i=1}^{n} K_h(\beta_0^T X_i - u)\varepsilon_i + O_P(\delta_n(h+\delta_n)) \\
&=: \tilde I_1(u) + O_P(\delta_n(h+\delta_n))
\end{aligned}$$
uniformly for $u\in\mathcal{U}$. This also implies that $\|I_1 - \tilde I_1(u)\|_\infty = O_P(\delta_n(h+\delta_n))$.
Next, we derive the asymptotic distribution of $\tilde I_1(u)$. For convenience, let
$$\{nhf(u)\sigma^{-2}\nu_0^{-1}\}^{1/2}\tilde I_1(u) = \{nhf(u)\sigma^2\nu_0\}^{-1/2}\sum_{i=1}^{n} K\Big(\frac{\beta_0^T X_i - u}{h}\Big)\varepsilon_i =: \{nhf(u)\sigma^2\nu_0\}^{-1/2}\xi(u). \tag{S3.8}$$
Divide the interval $[b_1,b_2]$ into $N$ subintervals $J_r = [d_{r-1}, d_r)$, $r = 1, 2, \dots, N-1$, and $J_N = [d_{N-1}, b_2]$, where $d_r = b_1 + \frac{b_2-b_1}{N}\,r$. Define $U_i = \beta_0^T X_i$ and $\tilde U_i = d_r I(U_i\in J_r)$, $r = 1,\dots,N$; it is obvious that $U_i - \tilde U_i = O(N^{-1})$. Then, by the law of large numbers for the random sequence $\{\varepsilon_i\}$, $i = 1,\dots,n$, we have
$$\begin{aligned}
\xi(u) &= \sum_{i=1}^{n} \Big[K\Big(\frac{U_i-u}{h}\Big) - K\Big(\frac{\tilde U_i-u}{h}\Big)\Big]\varepsilon_i + \sum_{i=1}^{n} K\Big(\frac{\tilde U_i-u}{h}\Big)\varepsilon_i \\
&= O_P(h^{-1}) + \sum_{i=1}^{n} K\Big(\frac{\tilde U_i-u}{h}\Big)\varepsilon_i =: O_P(h^{-1}) + \tilde\xi(u) \tag{S3.9}
\end{aligned}$$
uniformly for $u\in[b_1,b_2]$. By the definition of $\tilde U_i$, we have
$$\tilde\xi(u) = \sum_{r=1}^{N} K\Big(\frac{d_r-u}{h}\Big)\sum_{i=1}^{n} I(\tilde U_i\in J_r)\varepsilon_i.$$
Let $\tilde\xi_t = \sum_{r=1}^{t}\sum_{i=1}^{n} I(U_i\in J_r)\varepsilon_i = \sum_{i=1}^{n} I(b_1\le U_i\le d_t)\varepsilon_i$ and $\tilde\xi_0 = 0$. Then, by Lemma 2 in Zhang, Fan and Sun (2009), for any $t = 1,\dots,N$ and $u\in[b_1,b_2]$, we have
$$\big|\tilde\xi_t - N^{1/2}W(G(d_t))\big| = O(N^{1/4}\log N) \quad \text{a.s.},$$
where $W(\cdot)$ is a Wiener process and $G(c)=\int_{b_1}^{c}\sigma^2 f(v)\,dv$. By Abel's transform, we have
$$\tilde\xi(u) = K\Big(\frac{b_2-u}{h}\Big)\tilde\xi_N - \sum_{r=1}^{N-1}\Big[K\Big(\frac{d_{r+1}-u}{h}\Big) - K\Big(\frac{d_r-u}{h}\Big)\Big]\tilde\xi_r$$
and
$$\begin{aligned}
\Big\|\sum_{r=1}^{N-1}\Big[K\Big(\frac{d_{r+1}-u}{h}\Big) - K\Big(\frac{d_r-u}{h}\Big)\Big]\big[\tilde\xi_r - N^{1/2}W(G(d_r))\big]\Big\|_\infty
&\le \max_{1\le r\le N}\big|\tilde\xi_r - N^{1/2}W(G(d_r))\big| \sum_{r=1}^{N-1}\Big|K\Big(\frac{d_{r+1}-u}{h}\Big) - K\Big(\frac{d_r-u}{h}\Big)\Big| \\
&= O_P(N^{1/4}\log N).
\end{aligned}$$
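The Abel transform used above is the finite summation-by-parts identity and can be checked numerically. In the sketch below (illustrative only), `K_vals` stands for the kernel values $K((d_r-u)/h)$ and `c` for the inner sums $\sum_i I(\tilde U_i\in J_r)\varepsilon_i$; both arrays are arbitrary.

```python
import numpy as np

def abel_identity(K_vals, c):
    """Return both sides of the summation-by-parts identity
    sum_r K_r c_r = K_N xi_N - sum_{r=1}^{N-1} (K_{r+1} - K_r) xi_r,
    where xi_t = c_1 + ... + c_t are the partial sums."""
    xi = np.cumsum(c)
    lhs = np.sum(K_vals * c)
    rhs = K_vals[-1] * xi[-1] - np.sum((K_vals[1:] - K_vals[:-1]) * xi[:-1])
    return lhs, rhs
```

The identity is exact for any inputs; it is what lets the partial sums $\tilde\xi_r$, rather than the raw increments, carry the strong approximation.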
Hence, we have
$$\begin{aligned}
\tilde\xi(u) ={}& N^{1/2} K\Big(\frac{b_2-u}{h}\Big) W(G(b_2)) \\
&- N^{1/2}\sum_{r=1}^{N-1}\Big[K\Big(\frac{d_{r+1}-u}{h}\Big) - K\Big(\frac{d_r-u}{h}\Big)\Big] W(G(d_r)) + O_P(N^{1/4}\log N) \tag{S3.10}
\end{aligned}$$
uniformly for $u\in[b_1,b_2]$.
For a Wiener process, it is known (Csörgő and Révész 1981, page 44) that
$$\sup_{t\in[b_1,b_2]}\big|W(G(t+\varsigma)) - W(G(t))\big| = O\big(\{\varsigma\log(1/\varsigma)\}^{1/2}\big) \quad \text{a.s.}$$
for any small $\varsigma > 0$. Using this property and the boundedness of $K(\cdot)$, we obtain that
$$\sum_{r=1}^{N-1}\Big[K\Big(\frac{d_{r+1}-u}{h}\Big) - K\Big(\frac{d_r-u}{h}\Big)\Big] W(G(d_r)) = \int_{b_1}^{b_2} W(G(v))\,dK\Big(\frac{v-u}{h}\Big) + O_P\big(\{N^{-1}\log N\}^{1/2}\big)$$
uniformly for $u\in[b_1,b_2]$. Together with (S3.9) and (S3.10), it is easy to show that
$$\Big\|(nh)^{-1/2}\xi(u) - h^{-1/2}\int_{b_1}^{b_2} K\Big(\frac{v-u}{h}\Big)\,dW(G(v))\Big\|_\infty = O_P\big\{(nh^3)^{-1/2} + (nh)^{-1/2}N^{1/4}\log N\big\}. \tag{S3.11}$$
Note that the order is $O_P\{(nh^3)^{-1/2} + (nh^2)^{-1/4}\log n\}$ if $N$ is taken as $N = O(n)$. Let
$$Z_{1n}(u) = h^{-1/2}\int_{b_1}^{b_2} K\Big(\frac{v-u}{h}\Big)\,dW(G(v)), \qquad
Z_{2n}(u) = h^{-1/2}\int_{b_1}^{b_2} K\Big(\frac{v-u}{h}\Big)\,[\sigma^2 f(v)]^{1/2}\,dW(v-b_1),$$
$$Z_{3n}(u) = h^{-1/2}\int_{b_1}^{b_2} K\Big(\frac{v-u}{h}\Big)\,dW(v-b_1).$$
For a Gaussian process, invoking Lemmas 2.1–2.5 in Claeskens and Van Keilegom (2003), we have
$$\|Z_{1n}(u) - Z_{2n}(u)\|_\infty = O_P(h^{1/2}), \qquad \big\|(\sigma^2 f(u))^{-1/2} Z_{2n}(u) - Z_{3n}(u)\big\|_\infty = O_P(h^{1/2}). \tag{S3.12}$$
By (S3.11) and (S3.12), we have
$$\big\|(nh\sigma^2 f(u))^{-1/2}\xi(u) - Z_{3n}(u)\big\|_\infty = O_P\big\{(nh^3)^{-1/2} + (nh^2)^{-1/4}\log n + h^{1/2}\big\}.$$
This, together with (S3.8), and invoking Theorem 1 and Theorem 3.1 in Bickel and Rosenblatt (1973), shows that when $h = O(n^{-\rho})$ with $1/5 < \rho < 1/3$,
$$P\Big\{\big(-2\log\{h/(b_2-b_1)\}\big)^{1/2}\Big(\nu_0^{-1/2}\big\|(nh\sigma^2 f(u))^{-1/2}\xi(u)\big\|_\infty - d_{n0}\Big) < x\Big\} \longrightarrow \exp\big(-2e^{-x}\big).$$
Summarizing the above results, we complete the proof of Theorem 2.
S4 Proof of Theorem 3
The proof of Theorem 3 follows along the same lines as that of Theorem 2, so we omit the details.
S5 Proof of Theorem 4
To prove the theorem, we need to derive the rates of convergence of the bias and variance estimators. We first consider the difference between $\widehat{\mathrm{bias}}\{\hat\eta(u;\hat\beta_0)\mid\mathcal{D}\}$ and $2^{-1}h^2\mu_2\eta''(u)$. Using the same arguments as in the proof of Lemma 2, we have
$$\big\|\widehat{\mathrm{bias}}\{\hat\eta(u;\hat\beta_0)\mid\mathcal{D}\} - 2^{-1}h^2\mu_2\eta''(u)\big\|_\infty = O_P\big(h^2\sqrt{\log n/(nh_*^5)}\big) = O_P\big(h^2 n^{-1/7}\log^{1/2} n\big), \tag{S5.1}$$
where $h_* = n^{-1/7}$ comes from the pilot estimation of $\eta''(\cdot)$.
Furthermore, by Lemma 1, we have
$$\Big\|\frac{h}{n}\,X^T W^2 X - f(u)\widetilde{S}(u)\Big\|_\infty = o_P(1),$$
where $\widetilde{S}(u) = \begin{pmatrix} \nu_0 & 0 \\ 0 & \nu_2 \end{pmatrix}$. For the estimator of the variance $\sigma^2$ defined in Section 2.3, Tong and Wang (2005) showed that $\hat\sigma^2$ is a consistent estimator of $\sigma^2$ and that $\mathrm{bias}(\hat\sigma^2) = O(n^{-3+3\tau})$ by taking $m = cn^{\tau}$ with constant $c > 0$ and $0 \le \tau \le \frac{1}{3}$.
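The Tong and Wang (2005) estimator is built from lag-$k$ difference statistics; for the full construction we refer to that paper. As a rough illustration of why differencing isolates $\sigma^2$ from a smooth trend, the sketch below implements only the classical first-order difference estimator, a simplified stand-in for (not a reproduction of) the estimator of Section 2.3.

```python
import numpy as np

def diff_variance(y):
    """First-order difference-based estimate of the noise variance:
    for a smooth trend, E(y_{i+1} - y_i)^2 ~ 2 sigma^2, since the
    trend increments are O(1/n) while the noise does not cancel."""
    d = np.diff(y)
    return 0.5 * np.mean(d ** 2)
```

Regressing such lag-$k$ statistics on their lags, as Tong and Wang do, removes the residual trend bias that the plain first-difference estimator retains.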
These results, together with Theorem 1 and some simple calculations, yield
$$\Big\| nh\,\widehat{\mathrm{Var}}\{\hat\eta(u;\hat\beta_0)\mid\mathcal{D}\} - \frac{\nu_0}{f(u)}\sigma^2 \Big\|_\infty = o_P(1). \tag{S5.2}$$
By (S5.1) and (S5.2), and invoking the result of Theorem 2, we finish the proof of
Theorem 4.
S6 Proof of Theorem 5
Under the null hypothesis, model (1.1) reduces to
$$Y_i = \gamma_0 + \gamma_1(\beta_0^T X_i) + \varepsilon_i, \quad i = 1,\dots,n.$$
For convenience, we use matrix and vector notation in what follows. Let $e_n$ be the $n\times 1$ vector with all elements equal to one, and let $X^* = \Gamma X$ be an $n\times p$ matrix. By Theorem 1, $\hat\beta_0$ is a $\sqrt{n}$-consistent estimator of $\beta_0$. By the definitions of $\hat\varepsilon^*$ and the least squares estimators of $\gamma_0$ and $\gamma_1$, we have
$$\begin{aligned}
\hat\varepsilon^* &= \Gamma\varepsilon - (\hat\gamma_0-\gamma_0)\Gamma e_n + \gamma_1(\Gamma X\beta_0) - \hat\gamma_1(\Gamma X\hat\beta_0) \\
&= \Gamma\varepsilon - (\hat\gamma_0-\gamma_0)\Gamma e_n - (\hat\gamma_1-\gamma_1)(X^*\beta_0) - (\hat\gamma_1-\gamma_1)X^*(\hat\beta_0-\beta_0) - \gamma_1 X^*(\hat\beta_0-\beta_0) \\
&=: J_1 - J_2 - J_3 - J_4 - J_5,
\end{aligned}$$
and $J_1 \sim N(0, \sigma^2 I_n)$. Let $J_{ki}$ denote the $i$th component of the vector $J_k$, $k = 1,\dots,5$. Then we have
$$\sum_{i=1}^{m}\hat\varepsilon_i^{*2} = \sum_{i=1}^{m}(J_{1i} - J_{2i} - J_{3i} - J_{4i} - J_{5i})^2. \tag{S6.1}$$
We first consider the orders of $J_{ki}^2$, $k = 1,\dots,5$. By the Cauchy–Schwarz inequality, we have
$$J_{3i}^2 = (\hat\gamma_1-\gamma_1)^2\Big[\sum_{j=1}^{p}\beta_{0j}X_{ij}^*\Big]^2 \le (\hat\gamma_1-\gamma_1)^2\|\beta_0\|^2\sum_{j=1}^{p}X_{ij}^{*2}, \tag{S6.2}$$
where $\beta_{0j}$ is the $j$th component of the vector $\beta_0$. Note that $\|\beta_0\| = 1$ and $|\hat\gamma_1-\gamma_1| = O_P(n^{-1/2})$; from condition (C6), we have
$$\sum_{i=n_0}^{m} J_{3i}^2 \le (\hat\gamma_1-\gamma_1)^2\sum_{i=n_0}^{n/(\log\log n)^4}\sum_{j=1}^{p}X_{ij}^{*2} = O_P\{(\log\log n)^{-4}\}. \tag{S6.3}$$
From $\|\hat\beta_0-\beta_0\| = O_P(n^{-1/2})$, condition (C6) and the same argument, it is easy to show that
$$\sum_{i=n_0}^{m} J_{4i}^2 = O_P\{n^{-1}(\log\log n)^{-4}\}, \qquad \sum_{i=n_0}^{m} J_{5i}^2 = O_P\{(\log\log n)^{-4}\}.$$
By $|\hat\gamma_0-\gamma_0| = O_P(n^{-1/2})$ and the definition of $\Gamma$, it is easy to check that $\sum_{i=1}^{m} J_{2i}^2 = O_P(n^{-1}) = o_P(1)$. Note that
$$\frac{1}{m}\sum_{i=1}^{m} J_{1i}^2 \le \sigma^2 + \max_{1\le m\le n}\frac{1}{m}\sum_{i=1}^{m}\big(J_{1i}^2 - \sigma^2\big) = o_P(\log\log n). \tag{S6.4}$$
Next we consider the order of the cross-terms in (S6.1). Since $J_{1i}$ controls the convergence rate of the other terms, we only need to consider the orders of $J_{1i}J_{ki}$, $k = 2, 3, 4, 5$. By the Cauchy–Schwarz inequality, (S6.3) and (S6.4), we have
$$\frac{1}{\sqrt{m}}\sum_{i=n_0}^{m} J_{1i}J_{3i} \le \Big\{\frac{1}{m}\sum_{i=n_0}^{m} J_{1i}^2\Big\}^{1/2}\Big\{\sum_{i=n_0}^{m} J_{3i}^2\Big\}^{1/2} = O_P\{(\log\log n)^{-3/2}\}. \tag{S6.5}$$
Similarly, we have
$$\frac{1}{\sqrt{m}}\sum_{i=n_0}^{m} J_{1i}J_{2i} \le \Big\{\frac{1}{m}\sum_{i=n_0}^{m} J_{1i}^2\Big\}^{1/2}\Big\{\sum_{i=n_0}^{m} J_{2i}^2\Big\}^{1/2} = O_P\{(\log\log n/n)^{1/2}\},$$
$$\frac{1}{\sqrt{m}}\sum_{i=n_0}^{m} J_{1i}J_{4i} \le \Big\{\frac{1}{m}\sum_{i=n_0}^{m} J_{1i}^2\Big\}^{1/2}\Big\{\sum_{i=n_0}^{m} J_{4i}^2\Big\}^{1/2} = O_P\{n^{-1/2}(\log\log n)^{-3/2}\}$$
and
$$\frac{1}{\sqrt{m}}\sum_{i=n_0}^{m} J_{1i}J_{5i} \le \Big\{\frac{1}{m}\sum_{i=n_0}^{m} J_{1i}^2\Big\}^{1/2}\Big\{\sum_{i=n_0}^{m} J_{5i}^2\Big\}^{1/2} = O_P\{(\log\log n)^{-3/2}\}.$$
Summarizing the above results, we have
$$\frac{1}{\sqrt{m}}\sum_{i=n_0}^{m}\hat\varepsilon_i^{*2} = \frac{1}{\sqrt{m}}\sum_{i=n_0}^{m} J_{1i}^2 + O_P\{(\log\log n)^{-3/2}\}. \tag{S6.6}$$
Let
$$T_n^* = \max_{1\le m\le n}\frac{1}{\sqrt{2m\sigma^4}}\sum_{i=1}^{m}\big(J_{1i}^2 - \sigma^2\big).$$
Invoking Theorem 1 in Darling and Erdős (1956), we obtain
$$P\Big\{\sqrt{2\log\log n}\;T_n^* - \big\{2\log\log n + 0.5\log\log\log n - 0.5\log(4\pi)\big\} \le x\Big\} \longrightarrow \exp(-\exp(-x)). \tag{S6.7}$$
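The statistic $T_n^*$ scans the standardized partial sums of centered squared residuals and can be computed in a single pass. A minimal sketch follows ($\sigma^2$ is treated as known here purely for illustration; the test of Section 3 plugs in an estimate).

```python
import numpy as np

def t_n_star(j1, sigma2):
    """T_n^* = max_{1<=m<=n} (2 m sigma^4)^{-1/2} sum_{i<=m} (J_{1i}^2 - sigma^2)."""
    s = np.cumsum(j1 ** 2 - sigma2)            # partial sums of centered squares
    m = np.arange(1, len(j1) + 1)              # m = 1, ..., n
    return np.max(s / np.sqrt(2.0 * m * sigma2 ** 2))
```

The $\sqrt{2\log\log n}$ scaling in (S6.7) is the Darling–Erdős normalization that makes this maximum converge to a Gumbel limit.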
By (S6.7), it is easy to check that
$$T_n^* = \{2\log\log n\}^{1/2}\{1+o_P(1)\} \quad\text{and}\quad T_{\log n}^* = \{2\log\log\log n\}^{1/2}\{1+o_P(1)\},$$
which implies that the maximum of $T_n^*$ cannot be achieved at $m < \log n$. By (3.5) in Section 3, we have
$$T_{AN}^* = \max_{1\le m\le n/(\log\log n)^4}\frac{1}{\sqrt{2m\sigma^4}}\sum_{i=1}^{m}\big(\hat\varepsilon_i^{*2} - \sigma^2\big) + o_P\{(\log\log n)^{-3/2}\}.$$
Again invoking the result $|\hat\sigma^2 - \sigma^2| = O(n^{-3+3\tau})$ of Tong and Wang (2005), and by (S6.6), it is easy to obtain
$$T_{AN}^* = T_n^* + O_P\{(\log\log n)^{-3/2}\}.$$
Summarizing the above results, we complete the proof of Theorem 5.
References
Bickel, P. J. and Rosenblatt, M. (1973). On some global measures of the deviations of density function estimates. Ann. Statist. 1, 1071–1095.
Carroll, R. J., Fan, J., Gijbels, I. and Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association 92, 477–489.
Chen, J., Gao, J. and Li, D. (2013). Estimation in a single-index panel data model
with heterogeneous link functions. Econometric Reviews 33, 928–955.
Claeskens, G. and Van Keilegom, I. (2003). Bootstrap confidence bands for regression
curves and their derivatives. Ann. Statist. 31, 1852–1884.
Csörgő, M. and Révész, P. (1981). Strong Approximations in Probability and Statistics. Academic Press, New York.
Darling, D. A. and Erdös, P. (1956). A limit theorem for the maximum of normalized
sums of independent random variables. Duke Mathematical Journal 23, 143–155.
Liang, H., Liu, X., Li, R. and Tsai, C. L. (2010). Estimation and testing for partially
linear single-index models. Ann. Statist. 38, 3811–3836.
Martins-Filho, C. and Yao, F. (2007). Nonparametric frontier estimation via local linear
regression. Journal of Econometrics 141, 283–319.
Wang, J. L., Xue, L. G., Zhu, L. X. and Chong, Y. (2010). Estimation for a partial-linear single-index model. Ann. Statist. 38, 246–274.
Xia, Y. and Li, W. K. (1999). On single-index coefficient regression models. Journal
of the American Statistical Association 94, 1275–1285.
Xia, Y., Li, W. K., Tong, H. and Zhang, D. (2004). A goodness-of-fit test for single-index models. Statistica Sinica 14, 1–39.
Zhang, W. Y., Fan, J. Q. and Sun, Y. (2009). A semiparametric model for cluster data.
Ann. Statist. 37, 2377–2408.
Zhu, L. X. and Xue, L. G. (2006). Empirical likelihood confidence regions in a partially
linear single-index model. J. R. Statist. Soc. B 68, 549–570.