Statistica Sinica: Supplement KERNEL ADDITIVE SLICED INVERSE REGRESSION Heng Lian and Qin Wang University of New South Wales and Virginia Commonwealth University Supplementary Material S1 Proof of the main theorem. In the proofs, C denotes a generic positive constant. We first note that EX ∗ [(fˆ(X ∗ )−f (X ∗ ))2 ] = kΣ1/2 (fˆ − f )k2H . From (2.3), Σ1/2 fˆ satisfies the eigenvalue equation Σ1/2 (Σn + sI)−1 Γn Σ−1/2 (Σ1/2 fˆ) = λΣ1/2 fˆ. Similarly, Σ1/2 f satisfies the eigenvalue equation Σ1/2 Σ−1 ΓΣ−1/2 (Σ1/2 f ) = λΣ1/2 f. Using the perturbation theory for operators, for example as in Chapter 4 of Kato (1995), we only need to show that kΣ1/2 (Σn + sI)−1 Γn Σ−1/2 − Σ1/2 Σ−1 ΓΣ−1/2 k2 = Op (cn n−d/(d+1) ), where kAk denotes the operator norm for an operator A defined on HK . We write kΣ1/2 (Σn + sI)−1 Γn Σ−1/2 − Σ1/2 Σ−1 ΓΣ−1/2 k = kΣ1/2 ((Σn + sI)−1 Γn − (Σ + sI)−1 Γ + (Σ + sI)−1 Γ − Σ−1 Γ)Σ−1/2 k = kΣ1/2 ((Σ + sI)−1 (Γn − Γ) + (Σ + sI)−1 (Σ − Σn )(Σn + sI)−1 Γn +(Σ + sI)−1 Γ − Σ−1 Γ)Σ−1/2 k ≤ kΣ1/2 (Σ + sI)−1 (Γn − Γ)Σ−1/2 k + kΣ1/2 (Σ + sI)−1 (Σ − Σn )(Σn + sI)−1 Γn Σ−1/2 k +kΣ1/2 ((Σ + sI)−1 Γ − Σ−1 Γ)Σ−1/2 k =: (I) + (II) + (III). (S1.1) To simplify the proofs and notations a little bit, we assume K(., X) = 0 in the following, since using kK(., X)kH = Op (n−1/2 ), such terms does not lead to extra difficulties in the proof. By the same reason, we also replace p̂h by ph = P (Y = yh ) in the following expression of Γn . Obviously we can rewrite Γn and Γ as Γn = H X 1 ph h=1 Γ = ! n 1X K(., Xi )I{Yi = yh } ⊗ n i=1 ! n 1X K(., Xi )I{Yi = yh } , n i=1 H X 1 E[(K(., X)I{Y = yh })] ⊗ E[K(., X)I{Y = yh }]. ph h=1 (S1.2) 2 HENG LIAN AND QIN WANG The term (III) is easy to deal with as follows. Using kΣ1/2 ((Σ + sI)−1 − Σ−1 )(E[K(., X)I{Y = yh }] ⊗ E[K(., X)I{Y = yh }])Σ−1/2 k2 = ksΣ1/2 (Σ + sI)−1 (Σ−1 E[K(., X)I{Y = yh }]) ⊗ (Σ−1/2 E[K(., X)I{Y = yh }])k2 = O(ksΣ1/2 (Σ + sI)−1 k2 ) = O(s), we have (III)2 = O(s). For the term (II), writing Σ(x) = K(., x) ⊗ K(., x) and using that EkΣ1/2 (Σ + sI)−1 Σ(x)k2HS = Etr(Σ(x)2 (Σ + sI)−1 Σ(Σ + sI)−1 ) ≤ CEtr(Σ(x)(Σ + sI)−1 Σ(Σ + sI)−1 ) = Ctr(Σ(Σ + sI)−1 Σ(Σ + sI)−1 ) X λ2j C , (λj + s)2 j = where k.kHS is the Hilbert-Schmidt norm. In the inequality above we used that kΣ(x)f kH = p kK(., x)f (x)kH = f 2 (x)K(x, x) ≤ Ckf k∞ ≤ Ckf kH and thus kΣ(x)k ≤ C, and the inequality tr(AB) ≤ kAktr(B). Thus using the Markov inequality, we have (II)2 ≤ = kΣ1/2 (Σ + sI)−1 (Σ − Σn )k2 · k(Σn + sI)−1 Γn Σ−1/2 k2 ! X λ2j Op k(Σn + sI)−1 Γn Σ−1/2 k2 . 2 n(λ j + s) j We will argue later that actually k(Σn + sI)−1 Γn Σ−1/2 k2 = Op (1). The term (I) is more complicated. Let Γnh and Γh be the terms on the right hand side P P −1 of the sums in (S1.2) such that Γn = h p−1 h Γnh , Γ = h ph Γh . To bridge Γnh and Γh , we further define Γ0nh = ! n 1X K(., Xi )I{Yi = yh } ⊗ E[K(., X)I{Y = yh }]. n i=1 KERNEL ADDITIVE SLICED INVERSE REGRESSION 3 Since EkΣ1/2 (Σ + sI)−1 (K(., X)I{Y = yh } − E[K(., X)I{Y = yh }]) ⊗(Σ−1/2 E[K(., X)I{Y = yh }])k2HS ≤ CEkΣ1/2 (Σ + sI)−1 (K(., X)I{Y = yh } − E[K(., X)I{Y = yh }])k2H ≤ CEkΣ1/2 (Σ + sI)−1 K(., X)I{Y = yh }k2H CE I{Y = yj }hΣ(Σ + sI)−2 K(., X), K(., X)i2H CE I{Y = yj }tr(Σ(Σ + sI)−2 Σ(X)) CE tr(Σ(Σ + sI)−2 Σ(X))E[I{Y = yj }|X] = = = ≤ = Ctr(Σ2 (Σ + sI)−2 ) X λ2j C , (λj + s)2 j by Markov inequality, kΣ 1/2 (Σ + sI) −1 (Γ0nh −1/2 2 − Γh )Σ k = Op X j λ2j n(λj + s)2 ! λ2j n(λj + s)2 ! . Similarly one can show kΣ1/2 (Σ + sI)−1 (Γn − Γ0nh )Σ−1/2 k2 = Op X j . These imply that (II)2 = Op X j λ2j n(λj + s)2 ! . Once we have shown k(Σn +sI)−1 Γn Σ−1/2 k = Op (1),the bounds given above will combine P λ2 j and direct calculations by plugging to obtain that (S1.1) is bounded by Op s + j n(λ +s) 2 j in λj j −d and s = cn n−d/(d+1) shows that this is Op (cn n−d/(d+1) ). What is left is to show k(Σn + sI)−1 Γn Σ−1/2 k = Op (1), which is equivalent to showing k(Σn + sI)−1 Γn Σ−1/2 − Σ−1 ΓΣ−1/2 k = Op (1). Note this equation is actually similar to (S1.1). Following similar lines that are used to upper bound the terms (I)-(III) in (S1.1), we will get = k(Σn + sI)−1 Γn Σ−1/2 − Σ−1 ΓΣ−1/2 k2 ! X X λj λj Op k(Σn + sI)−1 Γn Σ−1/2 k2 + Op (1 + ). 2 2 n(λ n(λ j + s) j + s) j j When s = cn n−d/(d+1) with cn → ∞, by direct calculations we have λj j n(λj +s)2 P thus the above displayed equation implies k(Σn + sI)−1 Γn Σ−1/2 k = Op (1). = o(1) and
© Copyright 2025 Paperzz