Sen, Pranab Kumar (1982). "Jackknife L-Estimators: Affine Structure and Asymptotics."
JACKKNIFE L-ESTIMATORS:
AFFINE STRUCTURE AND ASYMPTOTICS
by
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1415
September 1982
JACKKNIFE L-ESTIMATORS:
AFFINE STRUCTURE AND ASYMPTOTICS*

By PRANAB KUMAR SEN
University of North Carolina, Chapel Hill
For L-estimators, a representation of the jackknife statistic, based on an inherent reverse martingale structure of jackknifing, is incorporated in the study of the asymptotic properties of the estimator as well as the allied jackknife estimator of the standard error. Some applications to sequential analysis are also discussed.
1. Introduction. Let {X_i; i ≥ 1} be a sequence of independent and identically distributed random variables (i.i.d.r.v.) with a continuous distribution function (d.f.) F, defined on the real line R. For every n (≥ 1), let X_{n:1} < ... < X_{n:n} be the ordered r.v. corresponding to X_1, ..., X_n; by virtue of the assumed continuity of F, ties among the X_{n:α} are neglected, in probability. For suitable g: R → R and scores {a_n(i), 1 ≤ i ≤ n}, consider an L-estimator of the form

(1.1)  L_n = n^{-1} Σ_{i=1}^{n} a_n(i) g(X_{n:i}).

On the ground of robustness and efficiency, various forms of L_n are often advocated in problems of statistical inference [see Serfling (1980, Ch. 8) and Huber (1981)]; the asymptotic theory plays a vital role in this context.
AMS Subject Classification: 60F99, 62E20, 62F35.
Key words and phrases: Embedding of Wiener process, reverse martingale approach, score function, sequential analysis, standard error estimate.
*Work partially supported by the National Heart, Lung and Blood Institute, Contract NIH-NHLBI-71-2243-L from NIH.
Whenever, for every (fixed) u (0 < u < 1), a_n([nu] + 1) → φ(u) as n → ∞, where φ(F(x))g(x) is (at least) square integrable and some additional regularity conditions hold [see Serfling (1980, Ch. 8)], then, as n → ∞,

(1.2)  n^{1/2}(L_n − μ) → N(0, σ_L²), in distribution,

where

(1.3)  μ = ∫ φ(F(x)) g(x) dF(x),

and

(1.4)  σ_L² = ∫_{-∞}^{∞} ∫_{-∞}^{∞} {F(x∧y) − F(x)F(y)} φ(F(x)) φ(F(y)) dg(x) dg(y).
Stronger results in the form of weak as well as strong invariance principles are contained in Sen (1981, Ch. 7). In order to make full use of these results in problems of inference, one usually needs to estimate σ_L². In a non-sequential setup, usually, the weak consistency of this estimator suffices, while strong consistency is generally needed in sequential analysis. In this context, jackknifing is found to be very useful.

Let L^{(i)}_{n-1} be the L-estimator based on (X_1, ..., X_{i-1}, X_{i+1}, ..., X_n) (i.e., on a sample of size n−1 when X_i has been removed), and let

(1.5)  L_{n,i} = nL_n − (n−1)L^{(i)}_{n-1},  1 ≤ i ≤ n.

Then, the jackknife L-statistic is

(1.6)  L*_n = n^{-1} Σ_{i=1}^{n} L_{n,i}.

Also, the (Tukey) estimator of σ_L² is defined by

(1.7)  S*²_n = (n−1)^{-1} Σ_{i=1}^{n} (L_{n,i} − L*_n)².

We shall see later on that S*²_n is structurally very close to the alternative estimator considered in Sen (1978). Properties of L*_n and S*²_n have not been studied, so far, in full generality. For finite ∫g²dF and for φ of bounded variation (on (0,1)), some work in this area is due to Babu and Singh (1982), Efron (1982), and others. Our primary interest lies in the development of a general theory where φ need not be bounded (or of bounded variation) and/or Eg² may not be finite. In this context, a reverse martingale structure inherent in jackknifing [viz., Sen (1977)] has been exploited. Unlike other approaches, a traditional decomposition into i.i.d.r.v.'s and a residual term has not been attempted. Rather, linear and quadratic functions of order statistics are employed and the reverse martingale approach of Sen (1978) is incorporated to study asymptotic results on L*_n and S*²_n. Along with the basic regularity conditions, the representations are considered in Section 2. Section 3 is devoted to the derivation of the main results. The last section deals with some applications in sequential analysis.
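The definitions (1.1) and (1.5)-(1.7) translate directly into a few lines of code. The sketch below is only an illustration (the helper names `l_stat` and `jackknife` are ours, not the paper's); it uses the scores a_n(i) = φ(i/(n+1)) of (2.9) below and takes g as the identity.

```python
def l_stat(x, phi, g=lambda t: t):
    """L-estimator (1.1): L_n = n^{-1} sum_i a_n(i) g(X_{n:i}), a_n(i) = phi(i/(n+1))."""
    xs = sorted(x)
    n = len(xs)
    return sum(phi(i / (n + 1)) * g(xs[i - 1]) for i in range(1, n + 1)) / n

def jackknife(x, phi, g=lambda t: t):
    """Pseudo-values (1.5), jackknife estimator (1.6), Tukey variance estimator (1.7)."""
    x = list(x)
    n = len(x)
    ln = l_stat(x, phi, g)
    # L_{n,i} = n L_n - (n-1) L_{n-1}^{(i)}, with the i-th observation deleted
    pseudo = [n * ln - (n - 1) * l_stat(x[:i] + x[i + 1:], phi, g) for i in range(n)]
    l_star = sum(pseudo) / n                                     # (1.6)
    s_star2 = sum((p - l_star) ** 2 for p in pseudo) / (n - 1)   # (1.7), estimates sigma_L^2
    return l_star, s_star2
```

As a sanity check, with φ ≡ 1 the statistic L_n is the sample mean, the pseudo-values L_{n,i} reduce to the observations themselves, and S*²_n reduces to the sample variance; e.g., `jackknife([1, 2, 3, 4], lambda u: 1.0)` returns (2.5, 5/3).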
2. Representations for L*_n and S*²_n. Jackknifing rests on the construction of the n subsamples {(X_1, ..., X_{i-1}, X_{i+1}, ..., X_n), 1 ≤ i ≤ n} of size n−1 [from (X_1, ..., X_n)]. We characterize this, equivalently, in terms of the order statistics X_{n:α}, 1 ≤ α ≤ n. For every α (1 ≤ α ≤ n), let X^{(α)}_{n-1} = (X^{(α)}_{n-1:1}, ..., X^{(α)}_{n-1:n-1}) be the vector of order statistics corresponding to the sample of size n−1 formed by deleting X_{n:α} from the set (X_{n:1}, ..., X_{n:n}); the L-estimator corresponding to the sample of size n−1 relating to X^{(α)}_{n-1} is denoted by L^{(α)}_{n-1}, for α = 1, ..., n. Note that

(2.1)  X^{(α)}_{n-1:i} = X_{n:i} for 1 ≤ i ≤ α−1,  and  X^{(α)}_{n-1:i} = X_{n:i+1} for α ≤ i ≤ n−1,

so that

(2.2)  L^{(α)}_{n-1} = (n−1)^{-1} Σ_{i=1}^{n-1} a_{n-1}(i) g(X^{(α)}_{n-1:i})
              = (n−1)^{-1} {Σ_{i=1}^{α-1} a_{n-1}(i) g(X_{n:i}) + Σ_{i=α+1}^{n} a_{n-1}(i−1) g(X_{n:i})},

for α = 1, ..., n. With the same re-indexing (i ↔ α), we have, for every α = 1, ..., n,

(2.3)  L_{n,α} = nL_n − (n−1)L^{(α)}_{n-1}
              = a_n(α) g(X_{n:α}) + Σ_{i=1}^{α-1} [a_n(i) − a_{n-1}(i)] g(X_{n:i}) + Σ_{i=α+1}^{n} [a_n(i) − a_{n-1}(i−1)] g(X_{n:i}).

As a result, for the jackknife estimator L*_n, we have

(2.4)  L*_n = n^{-1} Σ_{α=1}^{n} L_{n,α} = Σ_{i=1}^{n} c_n(i) g(X_{n:i}),

where

(2.5)  c_n(i) = n^{-1} Σ_{α=1}^{n} b_{nα}(i),

b_{nα}(i) being the coefficient of g(X_{n:i}) in (2.3) (i.e., b_{nα}(i) = a_n(i) − a_{n-1}(i) for i < α, = a_n(α) for i = α, and = a_n(i) − a_{n-1}(i−1) for i > α). This representation of L*_n will be exploited in our subsequent manipulations. Similarly, from (1.7), (2.3) and (2.4), we have

(2.6)  S*²_n = Σ_{i=1}^{n} Σ_{j=1}^{n} d_n(i,j) g(X_{n:i}) g(X_{n:j}),

where

(2.7)  d_n(i,j) = (n−1)^{-1} Σ_{α=1}^{n} {b_{nα}(i) − c_n(i)}{b_{nα}(j) − c_n(j)},

for i, j = 1, ..., n. This quadratic representation of S*²_n will be utilized in the study of its asymptotic properties.
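The affine structure in (2.3) can be checked numerically: the pseudo-value L_{n,α} is a fixed linear combination of the order statistics, with coefficients that depend only on the scores. The sketch below (an illustration only; φ(u) = u, g the identity, and the helper names are our arbitrary choices) compares the coefficient form of (2.3) with the direct leave-one-out computation of (1.5).

```python
import random

def phi(u):                      # an arbitrary smooth score-generating function, cf. (2.9)
    return u

def a(i, n):                     # scores a_n(i) = phi(i/(n+1))
    return phi(i / (n + 1))

def l_stat(xs):                  # (1.1) on an already sorted sample, g = identity
    m = len(xs)
    return sum(a(i, m) * xs[i - 1] for i in range(1, m + 1)) / m

rng = random.Random(1)
n = 6
xs = sorted(rng.uniform(0.0, 1.0) for _ in range(n))
ln = l_stat(xs)

for alpha in range(1, n + 1):
    # direct pseudo-value (1.5): delete X_{n:alpha}, recompute on the n-1 remaining points
    direct = n * ln - (n - 1) * l_stat(xs[:alpha - 1] + xs[alpha:])
    # coefficients b_{n,alpha}(i) read off from (2.3)
    b = [a(i, n) - a(i, n - 1) if i < alpha else
         a(alpha, n) if i == alpha else
         a(i, n) - a(i - 1, n - 1)
         for i in range(1, n + 1)]
    assert abs(sum(bi * xi for bi, xi in zip(b, xs)) - direct) < 1e-10
```

The agreement for every α is exact (up to rounding), which is precisely what makes the averaged coefficients c_n(i) of (2.5) and the quadratic kernel d_n(i,j) of (2.7) well defined functions of the scores alone.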
In the rest of this section, we introduce the regularity conditions (on g and the scores) under which we shall pursue the study of the properties of L*_n and S*²_n. We denote by b(u) = g(F^{-1}(u)), 0 < u < 1, and assume that, for every s ∈ (0, ½), b(u) is of bounded variation on (s, 1−s), and, for some real a and finite, positive K,

(2.8)  |b(u)| ≤ K{u(1−u)}^{-a},  ∀ 0 < u < 1.

Also, we consider a score generating function φ = {φ(u): 0 < u < 1} and relate the scores a_n(i) to it by letting

(2.9)  a_n(i) = φ(i/(n+1)),  1 ≤ i ≤ n; n ≥ 1,

(2.10)  φ_n(u) = a_n(i)  for (i−1)/n < u ≤ i/n,  1 ≤ i ≤ n.

We could have defined the scores in some asymptotically equivalent alternative forms; this would not make any difference in the asymptotic results to follow. We assume that φ has a continuous first order derivative φ' (= {φ'(u): 0 < u < 1}) almost everywhere, where, for some finite K,

(2.11)  |φ(u)| ≤ K{u(1−u)}^{-b},  |φ'(u)| ≤ K{u(1−u)}^{-b-1},  ∀ 0 < u < 1,

where b is real and, in conjunction with (2.8), for some δ > 0,

(2.12)  a + b = ½ − δ.

Note that in this setup, we need not assume that Eg²(X) < ∞ and/or that φ is of bounded variation, though ∫_0^1 b²(u)φ²(u) du < ∞. However, in this setup, jump discontinuities of φ have been excluded; it is, of course, known that for such singular components, the L-statistics may not be readily amenable to jackknifing.
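Conditions (2.8)-(2.12) deliberately allow φ to blow up at the endpoints. For instance (our illustrative choice, not one from the paper), φ(u) = {u(1−u)}^{-1/8} is unbounded but satisfies (2.11) with b = 1/8 and K = 1, and with a = 0 (bounded g) condition (2.12) holds with δ = 3/8 > 0. A quick numerical look at the scores (2.9) under this φ:

```python
b = 1 / 8
phi = lambda u: (u * (1 - u)) ** (-b)    # unbounded at 0 and 1; meets (2.11) with K = 1

n = 200
scores = [phi(i / (n + 1)) for i in range(1, n + 1)]   # a_n(i) = phi(i/(n+1)), as in (2.9)

# the scores grow toward the extremes but stay inside the (2.11) envelope
for i, s in enumerate(scores, start=1):
    u = i / (n + 1)
    assert s <= (u * (1 - u)) ** (-b) + 1e-12
# by symmetry of u(1-u), the largest scores sit at i = 1 and i = n
assert max(scores) == max(scores[0], scores[-1])
```

The point of the exhibit is only that the theory covers such unbounded score generators, which the bounded-variation framework of the earlier literature does not.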
3. Asymptotics on jackknifing. First, we consider the first order asymptotics. Note that by (1.1), (1.6) and (2.5),

(3.1)  L*_n − L_n = n^{-1} Σ_{i=1}^{n} {(n−1)a_n(i) − (n−i)a_{n-1}(i) − (i−1)a_{n-1}(i−1)} g(X_{n:i}) = n^{-1} Σ_{i=1}^{n} a*_n(i) g(X_{n:i}),  say,

where, by the first order Taylor expansion and (2.9),

(3.2)  a*_n(i) = −{i(n−i)/(n(n+1))} φ'(ξ^{(n)}_{i1}) + {(i−1)(n−i+1)/(n(n+1))} φ'(ξ^{(n)}_{i2}),

where

(3.3)  i/(n+1) < ξ^{(n)}_{i1} < i/n,

(3.4)  (i−1)/n < ξ^{(n)}_{i2} < i/(n+1).

Note that for i = 1 (or n), the second (or first) term on the right hand side of (3.2) vanishes, while for 2 ≤ i ≤ n−1, we may rewrite (3.2) as

a*_n(i) = {(i−1)(n−i+1)/(n(n+1))}{φ'(ξ^{(n)}_{i2}) − φ'(ξ^{(n)}_{i1})} − {(n−2i+1)/(n(n+1))} φ'(ξ^{(n)}_{i1}).

Thus, if we write

(3.5)  ψ*_n(u) = a*_n([nu] + 1),

then, for every (fixed) u ∈ (0,1), by (2.11) and (3.4), ψ*_n(u) → ψ*(u) = 0 as n → ∞, while by (3.2), (3.3) and (2.11),

(3.6)  |a*_n(i)| ≤ C{i(n−i+1)n^{-2}}^{-b},  ∀ 1 ≤ i ≤ n,

where C (< ∞) is a finite positive constant. Hence, by an appeal to Theorem 7.6.2 of Sen (1981), we conclude that under (2.8) through (2.12), as n → ∞,

(3.7)  L*_n − L_n → 0  almost surely (a.s.).
Let us now define the L_{n,i} as in (2.3), and let

(3.8)  Ŝ²_n = (n−1)^{-1} Σ_{i=1}^{n} (L_{n,i} − L_n)²  (= S*²_n + n(n−1)^{-1}(L*_n − L_n)²).

Then, by (3.7) and (3.8),

(3.9)  Ŝ²_n − S*²_n → 0  a.s., as n → ∞.

Also, if F_n denotes the sigma-field generated by (X_{n:1}, ..., X_{n:n}) and X_{n+j}, j ≥ 1, for every n ≥ 1, then {F_n} is a non-increasing sequence of sigma-fields, and

(3.10)  Ŝ²_n = n(n−1) E{(L_{n-1} − L_n)² | F_n},  ∀ n > 1,

where the penultimate step follows by using (2.3). On the other hand, by Theorem 7.6.3 of Sen (1981), under (2.8) through (2.12), as n → ∞,

(3.11)  n(n−1) E{(L_{n-1} − L_n)² | F_n} → σ_L²  a.s.,

where σ_L² is defined in (1.4). Consequently, by (3.9), (3.10) and (3.11), under (2.8) through (2.12), as n → ∞,

(3.12)  S*²_n → σ_L²  a.s.
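The parenthetical identity in (3.8) is purely algebraic: Ŝ²_n recentres the pseudo-values at L_n instead of at their mean L*_n. It is easy to confirm numerically; the sketch below uses an arbitrary linear score generator and g the identity (both illustrative choices, as is the helper name `L`).

```python
xs = sorted([0.3, 1.1, 2.0, 2.4, 3.7, 5.2])     # a fixed, already ordered sample
n = len(xs)
phi = lambda u: 2 * u                            # illustrative score-generating function

def L(vals):                                     # (1.1) with g = identity, scores (2.9)
    m = len(vals)
    return sum(phi(i / (m + 1)) * vals[i - 1] for i in range(1, m + 1)) / m

ln = L(xs)
pseudo = [n * ln - (n - 1) * L(xs[:i] + xs[i + 1:]) for i in range(n)]   # (1.5)
l_star = sum(pseudo) / n                                                 # (1.6)
s_star2 = sum((p - l_star) ** 2 for p in pseudo) / (n - 1)               # (1.7)
s_hat2 = sum((p - ln) ** 2 for p in pseudo) / (n - 1)                    # hat S_n^2 of (3.8)

# (3.8): hat S_n^2 = S_n*^2 + n (n-1)^{-1} (L_n* - L_n)^2
assert abs(s_hat2 - (s_star2 + n / (n - 1) * (l_star - ln) ** 2)) < 1e-9
```

Since the correction term is nonnegative and, by (3.19) below, of order o(n^{-1}) a.s., the two variance estimators share the same limit behaviour, which is how (3.9) and (3.23) arise.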
To exploit the full utility of jackknifing, we need to consider the second order asymptotics. First, we notice that by virtue of (3.1)-(3.6) (where, by (3.5), the variance function σ²_{L*−L} = 0) and Theorem 4 of Wellner (1977),

(3.13)  n^{1/2}|L*_n − L_n| = o((log log n)^{1/2})  a.s., as n → ∞.

This result, though of some interest, is not good enough to provide the full utility of jackknifing. Towards this, in view of the fact that, in (3.2), the score function rests on φ', we make an additional assumption that φ' has a derivative φ'' (a.e.) on (0,1), where, defining b by (2.11)-(2.12),

(3.14)  |φ''(u)| ≤ K{u(1−u)}^{-b-2},  ∀ 0 < u < 1.

We then define δ as in (2.12) and consider

(3.15)  n^{(1+δ)/2}(L*_n − L_n) = n^{(1+δ)/2} n^{-1} Σ_{i=1}^{n} a*_n(i) g(X_{n:i}),

where, by (2.8), (2.11), (2.12) and (3.2), the contribution of the first (or the last) term in the sum in (3.15) is n^{-2+(1+δ)/2} φ'(ξ^{(n)}_{11}) g(X_{n:1}) (or n^{-2+(1+δ)/2} φ'(ξ^{(n)}_{n2}) g(X_{n:n})), which is o(1) a.s., as n → ∞, while for 2 ≤ i ≤ n−1, we write [by using (3.4) and (3.14)],

(3.16)  n^{(1+δ)/2} a*_n(i) = {i(n−i+1)/(n(n+1))} n^{(1+δ)/2}(ξ^{(n)}_{i2} − ξ^{(n)}_{i1}) φ''(ξ^{(n)}_{i}) + {i/(n(n+1))} n^{(1+δ)/2} φ'(ξ^{(n)}_{i1}) − {(n−i+1)/(n(n+1))} n^{(1+δ)/2} φ'(ξ^{(n)}_{i2}),

where ξ^{(n)}_{i} ∈ (ξ^{(n)}_{i2}, ξ^{(n)}_{i1}). Note that by (3.3) and (3.4),

n^{(1+δ)/2}|ξ^{(n)}_{i2} − ξ^{(n)}_{i1}| ≤ n^{-(1−δ)/2} ≤ {i(n−i+1)/n²}^{(1−δ)/2},  ∀ 2 ≤ i ≤ n−1.

Hence, using (2.11), (2.12), (3.14) and (3.16), we have

(3.17)  n^{(1+δ)/2}|a*_n(i)| ≤ C{i(n−i+1)n^{-2}}^{(1−δ)/2 − b − 1},  ∀ 2 ≤ i ≤ n−1,

(3.18)  n^{(1+δ)/2}{|a*_n(1)| |g(X_{n:1})| + |a*_n(n)| |g(X_{n:n})|} → 0  a.s., as n → ∞.

Hence, by (2.8), (2.12), (3.15), (3.17), (3.18) and Theorem 7.6.2 of Sen (1981), we conclude that, as n → ∞,

(3.19)  n^{(1+δ)/2}(L*_n − L_n) → 0  a.s.
Note that under (2.8) through (2.12) and (3.14), by virtue of Theorem 7.5.1 of Sen (1981),

(3.20)  L_n − μ = n^{-1} Σ_{i=1}^{n} Z_i + R_n,

so that, by (3.19),

(3.21)  L*_n − μ = n^{-1} Σ_{i=1}^{n} Z_i + R*_n,

where μ is defined by (1.3) and

(3.22)  max{|R_n|, |R*_n|} = o(n^{-1/2−η})  a.s., as n → ∞, for some η > 0.

Since the Z_i are i.i.d.r.v. with mean 0 and variance σ_L², defined by (1.4), the Skorokhod-Strassen embedding of a Wiener process holds for {σ_L^{-1} Σ_{i=1}^{n} Z_i, n ≥ 1}, and this, along with (3.21)-(3.22), provides us with all the desired asymptotic results on the jackknife statistic L*_n. Further, by (3.8) and (3.19), we have

(3.23)  n|Ŝ²_n − S*²_n| → 0  a.s., as n → ∞,

so that (3.12) remains intact. This leads us to

(3.24)  n^{1/2} σ_L^{-1}(L*_n − μ) = σ_L^{-1} n^{-1/2} Σ_{i=1}^{n} Z_i + o(1)  a.s., as n → ∞.

Actually, whenever, in (2.12), δ > ¼, then we may use appropriate rates of convergence of S*_n to σ_L and replace [in (3.24)] σ_L by S*_n, the o(1) term being restricted to be o(n^{-η}) a.s., for some η > 0. Thus, for jackknifing of L-statistics, not only is S*_n a consistent estimator of σ_L (it is strongly so), but also the asymptotic normality results of the jackknife statistics extend to a strong invariance principle, as in (3.24).
We conclude this section with the remark that under (2.8) through (2.12), with δ > ½, from Gardiner and Sen (1979), it follows that

(3.25)  n^{1/2}(Ŝ²_n − σ_L²) → N(0, γ²), in distribution, as n → ∞,

where

(3.26)  γ² = ∫_0^1 ∫_0^1 (s∧t − st) L^0(s) L^0(t) dg(F^{-1}(s)) dg(F^{-1}(t)),

(3.27)  L^0(t) = L^0_1(t) + L^0_2(t),  0 < t < 1,

(3.28)  L^0_1(t) = 2 ∫_t^1 (1−s) φ(s) dg(F^{-1}(s)),

(3.29)  L^0_2(t) = 2 ∫_0^t s φ(s) dg(F^{-1}(s)).

As a result, by (3.19), the same asymptotic normality result holds for the jackknife estimator S*²_n, when (3.14) is assumed to hold. In passing, it may be remarked that (3.14) (or the existence of φ'') may not be really needed. It is possible to replace this by a local Lipschitz condition: for every u ∈ (0,1) and 0 < α(u) < β(u) < α(u) + n^{-1} < 1,

|φ'(α(u)) − φ'(β(u))| ≤ |α(u) − β(u)|^γ · max{|φ'(α(u))|, |φ'(β(u))|},

for some γ > ½. This condition will ensure (3.17), and hence (3.19) will remain intact.
4. Some applications to sequential analysis. In the literature, the L-statistics have been advocated for efficient estimation of μ [in (1.3)], which may often be expressed as a parameter (location/scale etc.) of the underlying d.f. F. Also, in problems of testing hypotheses concerning μ, the L-statistics are often found to be good robust competitors of some classical tests. In both of these contexts, jackknifing is quite useful. When one wants to have a bounded width confidence interval for μ, based on jackknife L-statistics, then one encounters a sequential model, where (3.12) and (3.24) provide the necessary tools for studying the asymptotic consistency of the procedure. Since these are very analogous to those in Section 5 of Sen (1977), the details are omitted. We may, however, add that (3.25) [as extended to random sample sizes in Gardiner and Sen (1979)] provides the asymptotic normality of the stopping time for this sequential procedure. Similarly, for sequential tests based on L-statistics, the embedding of a Wiener process in (3.24) provides the basic tool for the study of the asymptotic OC function, while the Pitman-efficiency results are a by-product of this representation. Since the details are analogous to those in Section 6 of Sen (1977), they are not reproduced here.
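As an illustration of the bounded-width problem mentioned above, a Chow-Robbins-type stopping rule can be driven by the jackknife standard error: stop at the first n with z·S*_n/√n ≤ d and report L*_n ± d. The sketch below is our own schematic (not the paper's procedure, and the function name is hypothetical); to keep it short it takes L_n to be the sample mean, for which the pseudo-values (1.5) are the observations themselves and S*²_n is the sample variance. The a.s. convergence in (3.12) is what makes such a rule asymptotically consistent as d ↓ 0.

```python
import math
import random

def bounded_width_ci(stream, d, z=1.96, n0=10):
    """Stop at the first n >= n0 with z * S_n* / sqrt(n) <= d; return (n, interval)."""
    xs = []
    for x in stream:
        xs.append(x)
        n = len(xs)
        if n < n0:
            continue
        mean = sum(xs) / n
        # for the sample mean, the Tukey estimator (1.7) is the sample variance
        s_star2 = sum((v - mean) ** 2 for v in xs) / (n - 1)
        if z * math.sqrt(s_star2 / n) <= d:
            return n, (mean - d, mean + d)
    raise ValueError("stream exhausted before the stopping rule was met")

rng = random.Random(7)
n_stop, (lo, hi) = bounded_width_ci((rng.gauss(0.0, 1.0) for _ in range(10 ** 5)), d=0.5)
```

The interval always has width 2d by construction; what the asymptotics of Section 3 control is its coverage probability and, via (3.25), the distribution of the stopping time n_stop.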
REFERENCES

BABU, G.J. AND SINGH, K. (1982). Asymptotic representations for jackknife and bootstrap L-statistics (to be published).

EFRON, B. (1982). Jackknife and Bootstrap Methods in Statistics. SIAM Regional Conference Series in Applied Mathematics, Philadelphia.

GARDINER, J.C. AND SEN, P.K. (1979). Asymptotic normality of a variance estimator of a linear combination of a function of order statistics. Zeit. Wahrsch. Verw. Geb. 50, 205-221.

HUBER, P.J. (1981). Robust Statistics. New York: John Wiley.

SEN, P.K. (1977). Some invariance principles relating to jackknifing and their role in sequential analysis. Ann. Statist. 5, 315-329.

SEN, P.K. (1978). An invariance principle for linear combinations of order statistics. Zeit. Wahrsch. Verw. Geb. 42, 327-340.

SEN, P.K. (1981). Sequential Nonparametrics: Invariance Principles and Statistical Inference. New York: John Wiley.

SERFLING, R.J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley.

WELLNER, J.A. (1977). A law of the iterated logarithm for functions of order statistics. Ann. Statist. 5, 481-494.