Sen, Pranab Kumar (1988). "Whither Delete-k Jackknifing for Smooth Statistical Functionals?"

WHITHER DELETE-K JACKKNIFING FOR SMOOTH
STATISTICAL FUNCTIONALS?
by
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1849
March 1988
The classical jackknifing, based on a resampling scheme with deletion of one observation at a time, serves the dual purpose of bias reduction and variance estimation. Delete-k jackknifing, for some $k \ge 1$, is a variant of this scheme. In the light of second order asymptotic distributional representations, it is shown that for second order (compact) differentiable statistical functionals, for any $k = o(n^{1-\eta})$, $\eta > 0$, delete-k jackknifing behaves very much like the classical one. This raises the question: to what degree is delete-k jackknifing preferable in practice?
1. Introduction. Let $X_1, \ldots, X_n$ be n independent and identically distributed random variables (i.i.d.r.v.) with a distribution function (d.f.) F, defined on the real line R. Let $F_n$ be the sample (empirical) d.f. For a general statistical functional $\theta = T(F)$, a natural estimator is $T_n = T(F_n)$. In general, $T_n$ may not be unbiased for $\theta$, and its mean square error may not be known either. The classical jackknifing serves the dual purpose of reducing the possible bias of $T_n$ and providing an estimator of its mean square error. It consists in identifying the n subsamples of size n-1 each (obtained by eliminating one observation at a time from the basic sample), and incorporating them in the formulation of the pseudovariables, which in turn provide the jackknifed estimator and its mean square error estimator. This scheme has been extended to the so-called delete-k jackknifing, where $\binom{n}{k}$ subsamples of size n-k each are obtained by deleting k observations at a time from the basic sample of size n, and these $\binom{n}{k}$ subsample estimators are incorporated in the formulation of the pseudovariables on which the jackknifed estimator of $\theta$ and its estimated mean square error rest.
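The delete-1 scheme just described is easy to transcribe. The following Python sketch (our illustration, not part of the paper: the functional `plug_in_var` and all names are ours) forms the n pseudovariables, the jackknifed estimator, and the Tukey mean square error estimator.

```python
def jackknife1(x, T):
    """Classical delete-1 jackknife for a plug-in estimator T(F_n).

    Returns the jackknifed (bias-reduced) estimator and the Tukey
    estimator of the mean square error of sqrt(n)*(T_n - theta).
    """
    n = len(x)
    Tn = T(x)
    # Pseudovariables: n*T_n - (n-1)*T(sample with x_i deleted)
    pseudo = [n * Tn - (n - 1) * T(x[:i] + x[i + 1:]) for i in range(n)]
    T_star = sum(pseudo) / n                             # jackknifed estimator
    V_star = sum((p - T_star) ** 2 for p in pseudo) / n  # Tukey variance estimator
    return T_star, V_star

def plug_in_var(x):
    """Plug-in variance T(F_n) = n^{-1} sum (x_i - xbar)^2, biased by -sigma^2/n."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)
```

For the plug-in variance the jackknife removes the $O(n^{-1})$ bias exactly: `jackknife1(x, plug_in_var)[0]` coincides with the unbiased sample variance $\sum_i (x_i - \bar{x})^2/(n-1)$.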
AMS 1980 Subject Classifications: 60F05, 62E20, 62F12.
Key words and phrases: Compact derivatives; differentiable statistical functionals; jackknifed estimator; jackknifed variance estimator; pseudovariables; second order distributional representation.
Under quite general regularity conditions [viz., Parr (1983, 1985) and Sen (1988)], for $T^*_{n,1}$, the delete-1 jackknifed version of $T_n$, as $n \to \infty$,

(1.1)  $(n-1)\{ T_n - T^*_{n,1} \} \to C(F)$  almost surely (a.s.),

where C(F) is a suitable functional of F to be defined more precisely later on. Also, let $V^*_{n,1}$ be the usual (Tukey) jackknifed estimator of $\sigma_1^2$, the (asymptotic) mean square error of $n^{1/2}(T_n - \theta)$. Then, under the same regularity conditions,

(1.2)  $V^*_{n,1} \to \sigma_1^2$  a.s., as $n \to \infty$.

Further, if $T_1(F;x)$ stands for the influence function corresponding to T(F) and if we let $\bar{T}_{1n} = n^{-1}\sum_{i=1}^n T_1(F;X_i)$ [so that $n^{1/2}\bar{T}_{1n}$ is asymptotically normal with mean 0 and variance $\sigma_1^2$], then the following second order asymptotic distributional representation (SOADR) result holds [viz., Sen (1988)]:

(1.3)  $n\{ T^*_{n,1} - T(F) - \bar{T}_{1n} \} \xrightarrow{\mathcal{D}} \sum_{k \ge 1} \lambda_k ( Z_k^2 - 1 )$,

where the $Z_k$ are i.i.d.r.v.'s having the standard normal d.f., and the eigenvalues $\lambda_k$, $k \ge 1$, depend on the underlying d.f. F and the functional T(.). Note that

(1.4)  $n^{1/2}\{ T^*_{n,1} - \theta \} = n^{1/2}( T_n - \theta ) + o_P(1) \xrightarrow{\mathcal{D}} N( 0, \sigma_1^2 )$,

even under weaker regularity conditions. Note further that (1.2) and (1.4) are first order properties, while (1.1) and (1.3) are second order ones.
The object of the present study is to focus on general delete-k jackknifed estimators $\{ T^*_{n,k},\ k \ge 1 \}$ and the related versions of the Tukey variance estimators (of $\sigma_1^2$), and to examine how far these first and second order properties continue to hold. In this context, we may note that for $k \ge 2$, one may have either a resampling scheme of $[n/k]$ deletions of distinct sets of k observations from the basic sample, or the more natural case of $\binom{n}{k}$ possible subsamples of size n-k from the basic sample of size n. The first scheme has some arbitrariness in the partitioning, and we shall mainly consider the second case. Along with the preliminary notions, the proposed estimators are considered in Section 2. The main results are then presented in Section 3 and their derivations are sketched in Section 4. The last section deals with some useful remarks and general conclusions.
2. Preliminary notions. Note that the sample d.f. $F_n$ is defined by

(2.1)  $F_n(x) = n^{-1}\sum_{i=1}^n I( X_i \le x )$,  $x \in R$,

where I(A) stands for the indicator function of the set A. Thus, we have

(2.2)  $\theta = T(F)$  and  $T_n = \theta_n = T(F_n)$,

for a suitable functional T(.). Since we are dealing with statistical functionals, the delete-k jackknifed estimators may all be defined solely in terms of the subsample empirical d.f.'s. Towards this, we let, for every $\mathbf{i} \in I$,

(2.3)  $F^{(\mathbf{i})}_{n-k}$ = the empirical d.f. of $\{ X_j : j \in I_n \setminus \mathbf{i} \}$,

where

(2.4)  $\mathbf{i} = ( i_1, \ldots, i_k )$,  $I_n = \{ 1, \ldots, n \}$,  and  $I$ = the set of all $\binom{n}{k}$ such subsets $\mathbf{i}$ of $I_n$.

Let then

(2.5)  $T^{(\mathbf{i})}_{n-k} = T( F^{(\mathbf{i})}_{n-k} )$,  $\mathbf{i} \in I$.

The pseudovariables generated by the delete-k jackknifing are defined as

(2.6)  $T^{(k)}_{n,\mathbf{i}} = k^{-1}\{ n T_n - (n-k)\, T^{(\mathbf{i})}_{n-k} \}$,  for $\mathbf{i} \in I$.

Then, the delete-k jackknifed estimator of $\theta$ is defined as

(2.7)  $T^*_{n,k} = \binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} T^{(k)}_{n,\mathbf{i}}$,  for $k \ge 1$.

Side by side, we also introduce the variance functions

(2.8)  $V^*_{n,k} = \binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} \{ T^{(k)}_{n,\mathbf{i}} - T^*_{n,k} \}^2$,  for $k \ge 1$.
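A direct transcription of (2.5)-(2.8) may help fix ideas: loop over all $\binom{n}{k}$ deletion sets, form the pseudovariables, and average. This is our own sketch (the functional `T` is whatever plug-in estimator is of interest), feasible only for small n and k.

```python
from itertools import combinations
from math import comb

def jackknife_k(x, T, k):
    """Delete-k jackknife of (2.6)-(2.8): returns (T*_{n,k}, V*_{n,k})."""
    n = len(x)
    Tn = T(x)
    pseudo = []
    for i in combinations(range(n), k):               # all (n choose k) deletion sets
        sub = [x[j] for j in range(n) if j not in i]  # subsample of size n - k
        pseudo.append((n * Tn - (n - k) * T(sub)) / k)      # pseudovariable (2.6)
    N = comb(n, k)
    T_star = sum(pseudo) / N                                # estimator (2.7)
    V_star = sum((p - T_star) ** 2 for p in pseudo) / N     # variance function (2.8)
    return T_star, V_star
```

With k = 1 this reduces to the classical jackknife; for the mean functional the pseudovariables are simply the means of the k deleted observations, so $T^*_{n,k}$ reproduces $\bar{X}_n$ exactly.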
Next, we introduce some regularity conditions on T(.). Let $\Lambda$ be a topological vector space and $\mathcal{K}$ be a class of compact subsets of $\Lambda$, such that every subset consisting of a single point belongs to $\mathcal{K}$. We say that a functional T(.) is Hadamard-continuous at $F \in \Lambda$, if

(2.9)  $| T(G) - T(F) | \to 0$  with  $\| G - F \| \to 0$,  on $G \in K \subset \mathcal{K}$,

where $\| G - F \|$ refers to the usual sup-norm [i.e., $\sup_x | G(x) - F(x) |$]. Next, we say that T(.) is first order compact (Hadamard-) differentiable at $F \in \Lambda$, if

(2.10)  $T(G) = T( F + (G - F) ) = T(F) + \int T_1(F;x)\, d[ G(x) - F(x) ] + R_1( F; G-F )$,

where

(2.11)  $| R_1( F; G-F ) | = o( \| G - F \| )$,  uniformly in $G \in K$,  $K \in \mathcal{K}$,

and $T_1(F;x)$, the first order compact derivative of T(.) at F, may be so normalized that

(2.12)  $\int T_1(F;x)\, dF(x) = 0$.

Similarly, T(.) is second order compact (Hadamard-) differentiable at $F \in \Lambda$, if

(2.13)  $T(G) = T(F) + \int T_1(F;x)\, d[ G(x) - F(x) ] + \tfrac{1}{2}\iint T_2(F;x,y)\, d[ G(x) - F(x) ]\, d[ G(y) - F(y) ] + R_2( F; G-F )$,

where

(2.14)  $| R_2( F; G-F ) | = o( \| G - F \|^2 )$,  uniformly in $G \in K$,  $K \in \mathcal{K}$,

and $T_2(F;\cdot,\cdot)$, the second order compact derivative of T(.) at F, may be so normalized that

(2.15)  $\int T_2(F;x,y)\, dF(y) = 0 = \int T_2(F;x,y)\, dF(x)$,  a.e.

Finally, we let

(2.16)  $T_2^*(G) = \int T_2(G;x,x)\, dG(x)$,  $G \in K$.
Other notations will be introduced as and when necessary.
3. The main results. For the delete-k jackknifing, we may either set k to be an arbitrary fixed positive integer or may even allow k ($= k_n$) to depend on the sample size n, in such a way that as n increases,

(3.1)  $k_n = o( n^{1-\eta} )$,  for some $\eta > 0$,

so that $k_n$ is nondecreasing with $(n-1)/(n-k_n)$ converging to 1, as $n \to \infty$.
First, we consider the following theorem relating to the first order properties.
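As a quick numerical illustration of (3.1) (our own, with the hypothetical choice $k_n = \lfloor n^{0.6} \rfloor$, i.e. $\eta = 0.4$), the ratio $(n-1)/(n-k_n)$ indeed drifts to 1:

```python
# k_n = floor(n^{0.6}) satisfies k_n = o(n^{1-eta}) with eta = 0.4
ratios = []
for n in (10**2, 10**4, 10**6):
    k = int(n ** 0.6)
    ratios.append((n - 1) / (n - k))
# the ratio decreases toward 1 along this sequence of sample sizes
assert ratios[0] > ratios[1] > ratios[2] > 1.0
```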
THEOREM 3.1. If T(F) is first order Hadamard-differentiable at F and $T_1^{**}(G) = \int T_1^2(G;x)\, dG(x)\ (< \infty)$ is Hadamard-continuous at F, then for every k satisfying (3.1), defining the $V^*_{n,k}$ by (2.8), we have, for $\sigma_1^2 = T_1^{**}(F)$,

(3.2)  $k(n-1)(n-k)^{-1} V^*_{n,k} = V^*_{n,1} + o(1)$  a.s.  $= \sigma_1^2 + o(1)$  a.s., as $n \to \infty$.

Thus, when adjusted by the scale factor $k(n-1)/(n-k)$, the Tukey estimator of the variance in the delete-k jackknifing is strongly consistent for $\sigma_1^2$.
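For the mean functional the adjustment in Theorem 3.1 is in fact exact for every n and k, not only asymptotically: each pseudovariable (2.6) reduces to the mean of the k deleted observations, and a finite-population variance computation gives $k(n-1)(n-k)^{-1} V^*_{n,k} = V^*_{n,1} = n^{-1}\sum_i (X_i - \bar{X}_n)^2$. A small check of this (our own illustration):

```python
from itertools import combinations
from math import comb

def v_star_mean(x, k):
    """V*_{n,k} of (2.8) for T(F) = mean: the pseudovariables are the
    means of the k deleted observations, and T*_{n,k} = xbar exactly."""
    n = len(x)
    xbar = sum(x) / n
    pseudo = (sum(x[j] for j in i) / k for i in combinations(range(n), k))
    return sum((p - xbar) ** 2 for p in pseudo) / comb(n, k)

x = [2.0, 3.0, 5.0, 8.0, 13.0, 21.0]
n = len(x)
v1 = v_star_mean(x, 1)                     # classical Tukey estimator
for k in (2, 3, 4):
    adjusted = k * (n - 1) / (n - k) * v_star_mean(x, k)
    assert abs(adjusted - v1) < 1e-9       # exact agreement for the mean
```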
THEOREM 3.2. If T(F) is second-order Hadamard-differentiable at F and $T_2^*(.)$ defined by (2.16) is Hadamard-continuous at F, then

(3.3)  $(n-1)\{ T_n - T^*_{n,k} \} \to \tfrac{1}{2} T_2^*(F)$  a.s., as $n \to \infty$.

Thus the second order bias property of the classical jackknifing is shared by delete-k jackknifing, for every $k \ge 1$. Actually, $(n-1)( T^*_{n,k} - T^*_{n,1} ) = o(1)$ a.s., as $n \to \infty$.
THEOREM 3.3. Assume that

(3.4)  $E_F T_2^2( F; X_1, X_2 ) = \iint T_2^2(F;x,y)\, dF(x)\, dF(y) < \infty$.

Then, under the hypothesis of Theorem 3.2, for every k satisfying (3.1),

(3.5)  $n\{ T^*_{n,k} - T(F) - \bar{T}_{1n} \} \xrightarrow{\mathcal{D}} \sum_{r \ge 1} \lambda_r ( Z_r^2 - 1 )$,

where the $Z_r$ are i.i.d.r.v.'s with the standard normal d.f., and the eigenvalues $\lambda_r$ correspond to the orthonormal functions $\tau_r(.)$, $r \ge 1$, defined by

(3.6)  $\int T_2(F;x,y)\, \tau_r(x)\, dF(x) = \lambda_r \tau_r(y)$  a.e. (F),  $r \ge 1$;  $\int \tau_r(y)\, \tau_s(y)\, dF(y) = 1$ or $0$  according as $r = s$ or not.

Thus, for any k satisfying (3.1), delete-k jackknifing leads to the same SOADR as in the case of $k = 1$.
4. Proofs of the theorems. Using (2.3), it readily follows that

(4.1)  $\| F^{(\mathbf{i})}_{n-k} - F_n \| \le (k/n)$,  with probability 1, for every $\mathbf{i} \in I$.

However, this result is not strong enough for our purpose (when we allow k to increase, subject to (3.1)). For this reason, we consider the following lemma.

Lemma 4.1. Whenever $k = k_n = o( n^{1-\eta} )$, for some $\eta > 0$,

(4.2)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} \| F^{(\mathbf{i})}_{n-k} - F_n \|^2 = O( k (n-k)^{-2} )$  a.s., as $n \to \infty$.

Proof. Parallel to (2.3), we define, for every $\mathbf{i} \in I$,

(4.3)  $F_{\mathbf{i},k}(x) = k^{-1} \sum_{j=1}^k I( X_{i_j} \le x )$,  $x \in R$.

Then, from (2.3) and (4.3), we obtain that

(4.4)  $F^{(\mathbf{i})}_{n-k} - F_n = k (n-k)^{-1} \{ F_n - F_{\mathbf{i},k} \}$,  $\mathbf{i} \in I$.

Thus, the left hand side of (4.2) can be written as

(4.5)  $k (n-k)^{-2} \{ \binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} k \| F_{\mathbf{i},k} - F_n \|^2 \} \le 2 k (n-k)^{-2} \{ \binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} k \| F_{\mathbf{i},k} - F \|^2 + (k/n)\, n \| F_n - F \|^2 \}$.
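The elementary bound (4.1) admits a direct check: since $F_n = (k/n) F_{\mathbf{i},k} + ((n-k)/n) F^{(\mathbf{i})}_{n-k}$, deleting k of the n points can move the empirical d.f. by at most $k/n$ at any x. A brute-force verification over all deletion sets (our own sketch; the sup is attained at the jump points):

```python
from itertools import combinations

def ecdf_gap(x, deleted):
    """sup_t |F^{(i)}_{n-k}(t) - F_n(t)|, evaluated at the jump points of F_n."""
    n = len(x)
    keep = [x[j] for j in range(n) if j not in deleted]
    m = len(keep)
    gap = 0.0
    for t in x:  # both step functions jump only at points of the full sample
        Fn = sum(v <= t for v in x) / n
        Fi = sum(v <= t for v in keep) / m
        gap = max(gap, abs(Fi - Fn))
    return gap

x = [0.3, 1.7, 2.2, 4.8, 5.1, 9.0]
n, k = len(x), 2
assert all(ecdf_gap(x, i) <= k / n + 1e-12 for i in combinations(range(n), k))
```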
Now, by the classical results on the Kolmogorov-Smirnov goodness of fit statistic,

(4.6)  $n \| F_n - F \|^2 = O( \log\log n )$  a.s., as $n \to \infty$,

while $k/n = o( n^{-\eta} )$, so that $k \| F_n - F \|^2 = o(1)$ a.s., as $n \to \infty$. Further, we may identify $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} k \| F_{\mathbf{i},k} - F \|^2$ as the U-statistic [Hoeffding (1948)] corresponding to the kernel $\phi_k( X_1, \ldots, X_k ) = k \| F_k - F \|^2$, $F_k$ being the empirical d.f. of $X_1, \ldots, X_k$.
Using the basic results of Kiefer (1961) [on the finite-sample behavior of the Kolmogorov-Smirnov type statistics], it follows that for every finite k,

(4.7)  $E_F \phi_k \le \int_0^\infty 4x \exp\{ -2x^2 \}\, dx = 1$,

and, for every finite integer r ($\ge 1$),

(4.8)  $E_F \phi_k^r < \infty$,  uniformly in $k \ge 1$.

As such, using the results in Sen and Ghosh (1981) [viz., Sen (1981, p. 56)], it follows that for every (fixed) positive integer r, the 2r-th central moment of $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} k \| F_{\mathbf{i},k} - F \|^2$ is $O( (k/n)^r ) = o( n^{-r\eta} )$, for every $k = o( n^{1-\eta} )$, $\eta > 0$. Combining this with the Markov inequality and the Borel-Cantelli lemma, we immediately obtain, by choosing $r:\ r\eta > 1$, that for $k = o( n^{1-\eta} )$, for some $\eta > 0$,

(4.9)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} k \| F_{\mathbf{i},k} - F \|^2 = O(1)$  a.s., as $n \to \infty$.

Hence, (4.2) follows from (4.5) and the above discussions. Q.E.D.
Lemma 4.2. For any $k:\ k = k_n = o( n^{1-\eta} )$, for some $\eta > 0$,

(4.10)  $\max_{\mathbf{i} \in I} \| F^{(\mathbf{i})}_{n-k} - F_n \| = O( k^{1/2} (n-k)^{-1} )$  a.s., as $n \to \infty$.

The proof follows directly from (4.5) and the elementary moment inequality, and hence, is omitted.
Let us now start with the proofs of the theorems. First, by an appeal to (2.10)-(2.12), for $G = F^{(\mathbf{i})}_{n-k}$ and $F = F_n$, we immediately obtain, for first order differentiable T(.),

(4.11)  $T^{(\mathbf{i})}_{n-k} = T( F^{(\mathbf{i})}_{n-k} ) = T( F_n + F^{(\mathbf{i})}_{n-k} - F_n ) = T(F_n) + \int T_1(F_n;x)\, dF^{(\mathbf{i})}_{n-k}(x) + R_1( F_n; F^{(\mathbf{i})}_{n-k} - F_n )$,  $\mathbf{i} \in I$,

where

(4.12)  $| R_1( F_n; F^{(\mathbf{i})}_{n-k} - F_n ) | = o( \| F^{(\mathbf{i})}_{n-k} - F_n \| )$.

In view of (2.12), the second term on the right hand side of (4.11) can be expressed as $-(n-k)^{-1} \sum_{j=1}^k T_1( F_n; X_{i_j} )$. Therefore, by (2.6) and (4.11), we have

(4.13)  $T^{(k)}_{n,\mathbf{i}} = T_n + k^{-1} \sum_{j=1}^k T_1( F_n; X_{i_j} ) + o( k^{-1}(n-k) \| F^{(\mathbf{i})}_{n-k} - F_n \| )$,  $\mathbf{i} \in I$.

From (2.7), (4.10) and (4.13), we obtain that

(4.14)  $T^*_{n,k} = T_n + 0 + o( k^{-1/2} )$  a.s., as $n \to \infty$.

In this context, it may be noted that, writing $\bar{T}^{(\mathbf{i})}_{1n} = k^{-1} \sum_{j=1}^k T_1( F_n; X_{i_j} )$, $\mathbf{i} \in I$,

(4.15)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} \bar{T}^{(\mathbf{i})}_{1n} = n^{-1} \sum_{i=1}^n T_1( F_n; X_i ) = \int T_1(F_n;x)\, dF_n(x) = 0$

[by (2.12)], which accounts for the second term on the right hand side of (4.14).
It follows similarly that

(4.16)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} ( \bar{T}^{(\mathbf{i})}_{1n} )^2 = \binom{n}{k}^{-1} k^{-2} \sum_{\mathbf{i} \in I} [ \sum_{j=1}^k T_1( F_n; X_{i_j} ) ]^2$
  $= \binom{n}{k}^{-1} k^{-2} \sum_{\mathbf{i} \in I} [ \sum_{j=1}^k T_1^2( F_n; X_{i_j} ) + \sum_{j \ne j'} T_1( F_n; X_{i_j} )\, T_1( F_n; X_{i_{j'}} ) ]$
  $= (kn)^{-1} \sum_{i=1}^n T_1^2( F_n; X_i ) + [ (k-1)/kn(n-1) ] \sum_{i \ne j} T_1( F_n; X_i )\, T_1( F_n; X_j )$
  $= (kn)^{-1} \sum_{i=1}^n T_1^2( F_n; X_i ) - [ (k-1)/kn(n-1) ] \sum_{i=1}^n T_1^2( F_n; X_i )$  [ by (4.15) ]
  $= [ (n-k)/k(n-1) ]\, n^{-1} \sum_{i=1}^n T_1^2( F_n; X_i ) = [ (n-k)/k(n-1) ] \{ \int T_1^2( F_n; x )\, dF_n(x) \} = [ (n-k)/k(n-1) ]\, T_1^{**}( F_n )$.

Further, $\| F_n - F \| \to 0$ a.s., as $n \to \infty$, and hence, by the assumed Hadamard-continuity of $T_1^{**}(.)$, we obtain that $T_1^{**}( F_n ) \to T_1^{**}(F) = \sigma_1^2$ a.s., as $n \to \infty$. Thus, from (4.16), it follows that

(4.17)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} ( \bar{T}^{(\mathbf{i})}_{1n} )^2 = [ (n-k)/k(n-1) ]\, \sigma_1^2 + o( k^{-1} )$  a.s., as $n \to \infty$.
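The combinatorial step in (4.16) is the identity: for any scores $a_1, \ldots, a_n$ with $\sum_i a_i = 0$, the average over all $\binom{n}{k}$ subsets of the squared subset mean equals $[ (n-k)/k(n-1) ]\, n^{-1}\sum_i a_i^2$. It can be verified numerically (hypothetical centered scores, our own illustration):

```python
from itertools import combinations
from math import comb

def avg_sq_subset_mean(a, k):
    """Average of (k^{-1} sum_{j in i} a_j)^2 over all k-subsets i."""
    n = len(a)
    return sum((sum(a[j] for j in i) / k) ** 2
               for i in combinations(range(n), k)) / comb(n, k)

a = [4.0, -1.0, 2.5, -3.5, 0.5, -2.5]          # centered: sum(a) == 0
n = len(a)
mean_sq = sum(v * v for v in a) / n            # n^{-1} sum a_i^2
for k in (1, 2, 3):
    lhs = avg_sq_subset_mean(a, k)
    rhs = (n - k) / (k * (n - 1)) * mean_sq    # coefficient from (4.16)
    assert abs(lhs - rhs) < 1e-12
```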
Next, from (4.13) and (4.14), we obtain

(4.18)  $\binom{n}{k}^{-1} \sum_{\mathbf{i} \in I} \{ T^{(k)}_{n,\mathbf{i}} - T^*_{n,k} - \bar{T}^{(\mathbf{i})}_{1n} \}^2 = o( k^{-1} )$  a.s., as $n \to \infty$  [ by Lemma 4.1 ],

as $k = o( n^{1-\eta} )$, for some $\eta > 0$.
At this stage, we may recall [viz., Lemma 1 of Sen (1960)] that if $\{ X_{Ni} \}$ and $\{ Y_{Ni} \}$ are two sequences of r.v.'s (not necessarily independent), such that (i) $N^{-1} \sum_{i=1}^N X_{Ni}^2 \to A\ (< \infty)$ a.s., and (ii) $N^{-1} \sum_{i=1}^N ( X_{Ni} - Y_{Ni} )^2 \to 0$ a.s., as $N \to \infty$, then $N^{-1} \sum_{i=1}^N Y_{Ni}^2 \to A$ a.s., as $N \to \infty$. Letting $N = \binom{n}{k}$, we obtain from (2.8), (4.17) and (4.18) that, under the hypothesis of Theorem 3.1, as $n \to \infty$,

(4.19)  $k(n-1)(n-k)^{-1} V^*_{n,k} \to \sigma_1^2$  a.s., for every $k:\ k = o( n^{1-\eta} )$, for some $\eta > 0$.

This completes the proof of Theorem 3.1.
Next, we consider the proof of (3.3). By using (2.13)-(2.15), we obtain that

(4.20)  $T^{(\mathbf{i})}_{n-k} = T_n - (n-k)^{-1} \sum_{j=1}^k T_1( F_n; X_{i_j} ) + [ 2(n-k)^2 ]^{-1} \sum_{j=1}^k \sum_{\ell=1}^k T_2( F_n; X_{i_j}, X_{i_\ell} ) + o( \| F^{(\mathbf{i})}_{n-k} - F_n \|^2 )$,  $\forall\, \mathbf{i} \in I$.

Further, note that, by virtue of (2.15), $\sum_{j=1}^n T_2( F_n; X_i, X_j ) = 0$, for every $i\ (= 1, \ldots, n)$, and hence, by (2.7), (4.20) and Lemma 4.1, we have

(4.21)  $T^*_{n,k} = T_n - 0 - [ 2(n-k) ]^{-1} \{ n^{-1} \sum_{i=1}^n T_2( F_n; X_i, X_i ) \} - (k-1)[ 2(n-k) ]^{-1} \{ n(n-1) \}^{-1} \{ \sum_{i \ne j} T_2( F_n; X_i, X_j ) \} + o( (n-k)^{-1} )$  a.s.
  $= T_n - [ 2(n-k) ]^{-1} \{ 1 - (k-1)/(n-1) \} \{ n^{-1} \sum_{i=1}^n T_2( F_n; X_i, X_i ) \} + o( (n-k)^{-1} )$  a.s.
  $= T_n - [ 2(n-1) ]^{-1} \int T_2( F_n; x, x )\, dF_n(x) + o( (n-k)^{-1} )$  a.s., as $n \to \infty$,

so that

$(n-1)\{ T_n - T^*_{n,k} \} = \tfrac{1}{2} \int T_2( F_n; x, x )\, dF_n(x) + o( n/(n-k) ) = \tfrac{1}{2} T_2^*( F_n ) + o(1)$  a.s., as $n \to \infty$.

Again, $\| F_n - F \| \to 0$ a.s., as $n \to \infty$, while, by the assumed Hadamard-continuity, $T_2^*( F_n ) \to T_2^*(F)$ a.s., as $n \to \infty$, $\forall\, k = o( n^{1-\eta} )$, $\eta > 0$. This completes the proof of (3.3).
Finally, to prove (3.5), we note that, by virtue of (3.3),

(4.22)  $(n-1)\{ T^*_{n,k} - T^*_{n,1} \} \to 0$  a.s., as $n \to \infty$,  $\forall\, k:\ k = o( n^{1-\eta} )$,  $\eta > 0$.

As such, it suffices to show that (3.5) holds for the particular case of $k = 1$, and this has already been proved in Theorem 2.2 of Sen (1988). Hence, we omit the details here.
5. Concluding remarks. From Theorems 3.1 and 3.2 it is quite clear that if, instead of the classical jackknifing, one uses the delete-k jackknifing procedure for some $k \ge 1$, then, as far as the two estimators $T^*_{n,1}$ and $T^*_{n,k}$ are concerned, by virtue of (4.22), they are a.s. equivalent up to the order $n^{-1}$. Thus, up to the order $n^{-1}$, there is no change in the bias reduction picture due to higher order delete-k jackknifing. In practice, biases of the order $n^{-2}$ (or higher) are of not much interest. As regards the Tukey form of the variance estimators, for delete-k jackknifing one needs the adjustment factor $k(n-1)/(n-k)$, and with that, (3.2) ensures their asymptotic (a.s.) equivalence up to the first order. On the other hand, from the computational aspect, delete-k jackknifing involves labor of the order $n^k$, and hence, with larger values of k, the computational complexities may increase rather abruptly. From all these considerations, it may be concluded that if jackknifing is to be considered for bias reduction and variance estimation for an estimator of a smooth functional T(F), then there is not much attraction in adopting delete-k jackknifing. The picture may, however, change considerably if we deal with so-called non-smooth functionals, where (2.11) or (2.14) may not hold. Then, of course, a different method of attack may be necessary to study this relative picture.
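The growth of the labor is easy to make concrete: the number of subsample evaluations is $\binom{n}{k}$, of order $n^k$ for fixed k, against the n evaluations of the classical scheme (our own numerical illustration):

```python
from math import comb

n = 100
refits = {k: comb(n, k) for k in (1, 2, 3, 5)}
assert refits[1] == 100            # classical delete-1: n evaluations of T
assert refits[2] == 4950           # already of order n^2/2
assert refits[5] == 75287520       # ~7.5e7 evaluations for k = 5
```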
REFERENCES

HOEFFDING, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19, 293-325.

KIEFER, J. (1961). On large deviations of the empiric d.f. of vector chance variables and a law of iterated logarithm. Pacific J. Math. 11, 649-660.

PARR, W.C. (1983). A note on the jackknife, the bootstrap and delta method estimators of bias and variance. Biometrika 70, 719-722.

PARR, W.C. (1985). Jackknifing differentiable statistical functionals. J. Roy. Statist. Soc., Ser. B 47, 56-66.

SEN, P.K. (1960). On some convergence properties of U-statistics. Calcutta Statist. Assoc. Bull. 10, 1-18.

SEN, P.K. (1981). Sequential Nonparametrics: Invariance Principles and Statistical Inference. John Wiley, New York.

SEN, P.K. (1988). Functional jackknifing: Rationality and general asymptotics. Ann. Statist. 16, 450-469.

SEN, P.K. and GHOSH, M. (1981). Sequential point estimation of estimable parameters based on U-statistics. Sankhyā, Ser. A 43, 331-344.