Carroll, R.J. (1975). "On the asymptotic normality of stopping times based on robust estimation."

ON THE ASYMPTOTIC NORMALITY OF STOPPING TIMES BASED ON ROBUST ESTIMATORS*

by

Raymond J. Carroll
Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series #1027
September, 1975

* This work was partially supported by the Office of Naval Research, contract N00014-67-A-0226-00014 at Purdue University. Reproduction in whole or in part is permitted for any purpose of the United States Government.
SUMMARY

The asymptotic normality of certain stopping times in fixed-width interval analysis is discussed when the intervals are based on M-estimates of location. Using results of Ghosh and Mukhophadyay (1975), it suffices to consider the asymptotic normality of estimates $\sigma_n^2$ of the variance of M-estimators under random sample sizes. Two methods under differing sets of conditions are given; the first is based on finding almost sure representations for $\sigma_n^2$, while the second is based on the theory of weak convergence. The final results are also applied to one-step M-estimators (Bickel (1975)) to obtain almost sure representations and weak convergence results.
KEY WORDS AND PHRASES: Fixed-Width Confidence Intervals, Robust Estimation, M-Estimators, Sequential Analysis, Stopping Times, Almost Sure Invariance Principles.

AMS 1970 SUBJECT CLASSIFICATIONS: Primary 62L12; Secondary 62E20, 62G35.
1. Introduction

This paper is motivated by, but not restricted to, the problem of the asymptotic normality of stopping times in sequential analysis. The stopping times $N(d)$ are generally defined as follows: if $\{g(d)\}$ is a sequence of positive constants tending to zero as $d \to 0$, and if $\{Y_n\}$ is a sequence of statistics, then $N(d)$ is the first time $n$ that $n \ge Y_n / g(d)$. As long as $Y_n \to \sigma^2$ almost surely (a.s.) as $n \to \infty$, then for $n(d) = [\sigma^2 / g(d)]$, $N(d)/n(d) \to 1$ (a.s.), as shown by Chow and Robbins (1965). Ghosh and Mukhophadyay (1975) show under these conditions that if there are positive constants $a, b$ for which $n^{1/2}(Y_n - a)$ converges in law to $N(0, b^2)$, then $(a\,g(d)/b^2)^{1/2}\,(N(d) - a/g(d)) \to N(0,1)$ in law as $d \to 0$.
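As an informal numerical illustration of this stopping rule, the sketch below takes $Y_n$ to be the running sample variance and $g(d) = d^2/z^2$, the choice arising for a fixed-width interval of half-width $d$; the specific constants and function names are illustrative assumptions, not part of the development here.

    import numpy as np

    def stopping_time(xs, g_d, n_min=2):
        # First n >= n_min with n >= Y_n / g(d), Y_n = sample variance of xs[:n].
        for n in range(n_min, len(xs) + 1):
            y_n = np.var(xs[:n], ddof=1)
            if n >= y_n / g_d:
                return n
        return len(xs)                     # truncate if the rule never triggers

    rng = np.random.default_rng(0)
    d, z = 0.2, 1.96                       # half-width d, normal quantile z
    g_d = d ** 2 / z ** 2                  # so n(d) is about sigma^2 * z^2 / d^2
    xs = rng.normal(size=10_000)
    print(stopping_time(xs, g_d), z ** 2 / d ** 2)   # N(d) should be near n(d)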
It is important to realize that $\{Y_n\}$ and $\sigma^2$ do not exist in a vacuum and are typically connected to an underlying estimation problem. The prototype is the construction of fixed-width confidence sequences for a location parameter when the observations $X_1, X_2, \ldots$ come from a location-scale distribution $F(\varsigma^{-1}(x - \theta))$; one bases the interval on a sequence $\{T_n\}$ of location and scale equivariant estimators (see Bickel (1975) for definitions), and $\sigma^2$ is the asymptotic variance of the normed sequence $\{n^{1/2}(T_n - \theta)\}$. Thus, in this problem, $\{Y_n\}$ is simply an estimate (scale equivariant but location invariant) of $\sigma^2$. In this context, Ghosh and Mukhophadyay (1975) discussed in detail the case where $\{T_n\}$ is a sequence of U-statistics.
In light of the above discussion, it seems natural to investigate the problem of the asymptotic normality of stopping times when $T_n$ is a robust estimator other than a U-statistic. We consider specifically the two cases where $T_n$ is an M-estimator (Huber (1964), Andrews et al. (1972)) or a one-step M-estimator (Bickel (1975)). Here the asymptotic variance is

(1.1)    $\sigma^2 = \int \psi^2(x)\,dF(x) \Big/ \Big\{ \int \psi'(x)\,dF(x) \Big\}^2 .$

We show in Section 6 that it is only necessary to estimate the following functional of $F$:

(1.2)    $\int \rho(x)\,dF(x) ,$

where $\rho$ is a known function.
Thus, the main body of this paper discusses the following estimation problem, leaving until Section 6 the applications to the asymptotic normality of stopping times based on M-estimators. We are interested in estimating (1.2). There is assumed to be a sequence of constants $n(d)$ and integer-valued random variables $N(d)$ with $N(d)/n(d) \to 1$ in probability as $d \to 0$. We estimate (1.2) by

(1.3)    $N(d)^{-1} \sum_{i=1}^{N(d)} \rho\big( (X_i - T_{N(d)}) / S_{N(d)} \big) ,$

where $T_n$ is as above and $S_n$ is a robust location invariant, scale equivariant estimator of scale such as the interquartile range. The goal is to find reasonably weak conditions on $\rho$ which guarantee that, for some $A > 0$,

(1.4)    $N(d)^{1/2} \Big\{ N(d)^{-1} \sum_{i=1}^{N(d)} \rho\big( (X_i - T_{N(d)}) / S_{N(d)} \big) - \int \rho(x)\,dF(x) \Big\} \to N(0, A) ,$
where the convergence indicated is convergence in law. There will be two approaches to the estimation of (1.2), both of which yield (1.4) as a corollary. In Sections 2 and 3, almost sure representations are obtained for $\sigma_n^2$ by making assumptions about (i) the almost sure behavior of $T_n$ and $S_n$ and (ii) the asymptotic distributions of $T_{N(d)}$ and $S_{N(d)}$; the conditions on $\rho$ are mild and we do not require differentiability.

In Sections 4 and 5, we obtain (1.4) by means of the theory of weak convergence of stochastic processes with multidimensional time parameters (Billingsley (1968), Bickel and Wichura (1971)). In this approach we make no assumptions concerning the almost sure behavior of $T_n$ and $S_n$, but the conditions on $\rho$ are stronger (but still do not include differentiability).

In Section 6 we return to the original problem of proving (1.4) when $\sigma^2$ is given by (1.1). Mild conditions are given for (1.4) to hold, and these conditions are satisfied in most cases of interest. We are also able to obtain almost sure representations and weak convergence results for the one-step M-estimates studied by Bickel (1975).
2. Preliminary (a.s.) Results

One estimator of the functional $\int \rho(x)\,dF(x)$ which is location invariant but not scale invariant is

(2.1)    $n^{-1} \sum_{i=1}^{n} \rho(X_i - T_n) ,$

where $T_n$ is a sequence of location and scale equivariant statistics converging (in some sense) under $F_\theta$ to $\theta$; in the rest of this paper, we assume (unless indicated) that $\theta = 0$, so the distribution function is $F(x)$.
In order to find a representation for the estimator (2.1), we want to avoid global differentiability properties and instead make assumptions about the behavior of certain integrals. If $\rho$ has two continuous bounded derivatives and $n^{1/2}(\log n)^{-1} T_n$ is bounded (a.s.) as $n \to \infty$, a Taylor expansion shows

(2.2)    $n^{-1} \sum_{i=1}^{n} \rho(X_i - T_n) = n^{-1} \sum_{i=1}^{n} \rho(X_i) - T_n\, E\rho'(X) + O\big( n^{-1}(\log n)^2 \big) \quad \text{(a.s.)}.$

The purpose of this section is to derive a result similar to (2.2) with the order term $O(n^{-3/4}(\log n)^2)$; the proof is elementary and makes no differentiability assumptions concerning $\rho$.
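As a quick numerical illustration of (2.2) (not part of the argument), one can compare the two sides for a smooth bounded $\rho$; the choices of $\rho$, $T_n$, and the sampling distribution below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 2000
    x = rng.standard_normal(n)
    t_n = x.mean()                                 # a root-n consistent T_n
    rho = np.tanh                                  # smooth, bounded, increasing
    drho = lambda u: 1.0 / np.cosh(u) ** 2         # rho'

    lhs = rho(x - t_n).mean()                      # left side of (2.2)
    rhs = rho(x).mean() - t_n * drho(x).mean()     # first two terms on the right
    print(lhs, rhs, abs(lhs - rhs))                # gap is of smaller order than n**-0.5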
We first study a process $V_n(t)$ given below; the major result will be obtained by looking at $V_n(a_n^{-1} T_n)$. All proofs will be delayed until the end of the section.

Definition 2.1. For some sequence of constants $\{a_n\}$ decreasing to zero,

$V_n(t) = n^{-1} \sum_{i=1}^{n} \big\{ \rho(X_i - t a_n) - \rho(X_i) - E\big( \rho(X - t a_n) - \rho(X) \big) \big\} .$
The next Proposition and Lemma give the almost sure rates at which $V_n$ converges to zero.

Proposition 2.1. Suppose $\rho$ is increasing. Then there exists a constant $M > 0$ such that

(2.3)    $\sup_{0 \le t \le 1} |V_n(t)| \le M \{ A_{1n} + A_{2n} \} ,$

where

(2.4)    $A_{1n} = \sup_{0 \le k \le n-1} \big| E\big\{ \rho(X - a_n k/n) - \rho(X - a_n (k+1)/n) \big\} \big| , \qquad A_{2n} = \sup_{0 \le k \le n} |V_n(k/n)| .$
Lemma 2.1. Suppose $\rho$ is increasing and bounded and that for sequences $\{B_n\}$, $\{C_n\}$ converging to zero,

(2.5)    $E\big\{ \rho(X - B_n - C_n) - \rho(X - B_n) \big\}^{r} = O(|C_n|) \qquad (r = 1, 2) ;$

(2.6a)    for all $c > 0$, $\sum_{n=1}^{\infty} \exp\{ -cn/b_n \} < \infty ;$

(2.6b)    for all $c > 0$, $\sum_{n=1}^{\infty} \exp\{ -cn/(b_n^2 a_n) \} < \infty .$

Then $b_n \sup_{0 \le t \le 1} |V_n(t)| \to 0$ (a.s.).
The major result of this section, Theorem 2.1, finds a representation for the estimator (2.1) if $T_n$ converges to zero (a.s.); note that the conditions on $\rho$ are minimal. The proof is omitted since it is a consequence of Lemma 2.1 with $t = a_n^{-1} T_n$.
Theorem 2.1. Suppose $\rho = \rho^+ - \rho^-$, where both $\rho^+$ and $\rho^-$ are increasing and bounded. Suppose $\rho^+$, $\rho^-$ satisfy the conditions of Lemma 2.1. Then

$\sup_{|t| \le 1} b_n |V_n(t)| \to 0 \quad \text{(a.s.)}.$

Thus if $a_n^{-1} T_n \to 0$ (a.s.),

(2.7)    $n^{-1} \sum_{i=1}^{n} \{ \rho(X_i - T_n) - E\rho(X) \} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \} + \int \{ \rho(y - T_n) - \rho(y) \}\,dF(y) + O(b_n^{-1}) \quad \text{(a.s.)},$

where $a_n = o(b_n)$ means $a_n / b_n \to 0$. If $a_n = n^{-1/2} \log n$, then $b_n = n^{3/4} (\log n)^{-2}$.
It is of interest to remove the integral in (2.7) and obtain a result similar to (2.2). The proof of Corollary 2.1 follows immediately from (2.7) and (2.8).

Corollary 2.1. If, in addition to the conditions of Theorem 2.1, as $|h| \to 0$

(2.8)    $\int \{ \rho(x+h) - \rho(x) \}\,dF(x) = A(F)\,h + O(|h|^2) ,$

then

(2.9)    $n^{-1} \sum_{i=1}^{n} \{ \rho(X_i - T_n) - E\rho(X) \} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \} - A(F)\, T_n + O(b_n^{-1}) \quad \text{(a.s.)}.$

Further, if for some function $\psi$ (with $E\psi(X) = 0$)

(2.10)    $T_n = n^{-1} \sum_{i=1}^{n} \psi(X_i) + O(b_n^{-1}) \quad \text{(a.s.)},$

we have

$n^{-1} \sum_{i=1}^{n} \{ \rho(X_i - T_n) - E\rho(X) \} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - A(F)\,\psi(X_i) - E\rho(X) \} + O(b_n^{-1}) \quad \text{(a.s.)}.$
In Corollary 2.2, the proof of which is omitted, we show that a wide class of non-differentiable functions $\rho$ satisfy (2.5) and (2.8). This class includes the psi functions due to Huber and Hampel (see Andrews et al. (1972)).

Corollary 2.2. Suppose $\rho$ is twice boundedly differentiable except at a finite number of points, and $F$ is Lipschitz of order one in neighborhoods of these points. Then $\rho$ satisfies (2.5) and (2.8). In particular, if $\rho(x) = I\{|x| \le k\}$ or $\rho(x) = x\,I\{|x| \le k\} + k\,\mathrm{sign}(x)\,I\{|x| > k\}$, then $\rho = \rho^+ - \rho^-$, where $\rho^+$, $\rho^-$ are increasing and satisfy (2.5) and (2.8).
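For concreteness (an illustration only), the two functions named in Corollary 2.2 can be evaluated and split into bounded increasing pieces as follows; the cutoff k below is an arbitrary choice.

    import numpy as np

    K = 1.345   # illustrative cutoff; Corollary 2.2 only needs some k > 0

    def rho_indicator(x):
        # rho(x) = 1{|x| <= k} = rho_plus - rho_minus, both increasing and bounded
        rho_plus = (x >= -K).astype(float)    # 1{x >= -k}
        rho_minus = (x > K).astype(float)     # 1{x >  k}
        return rho_plus - rho_minus

    def rho_huber(x):
        # x 1{|x| <= k} + k sign(x) 1{|x| > k}: already increasing, so rho_minus = 0
        return np.clip(x, -K, K)

    x = np.linspace(-3.0, 3.0, 7)
    print(rho_indicator(x), rho_huber(x))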
The discussion of the asymptotic normality of the stopping rule based on (2.1) is postponed until the next section.
Proof of Proposition 2.1. Let $[\cdot]$ denote the greatest integer function. Then, by the monotonicity of $\rho$, we obtain

(2.11)    $\sup_{0 \le t \le 1} |V_n(t)| \le \sup_{0 \le k \le n-1} |V_n(k/n) - V_n((k+1)/n)| + 2 \sup_{0 \le k \le n-1} \big| E\big( \rho(X - a_n (k+1)/n) - \rho(X - a_n k/n) \big) \big| + \sup_{0 \le k \le n} |V_n(k/n)| .$
Proof of Lemma 2.1. Using the result of Proposition 2.1, we see that by (2.5) and (2.6a), $b_n A_{1n} \to 0$. To prove that $b_n A_{2n} \to 0$ (a.s.), we make use of the Borel-Cantelli Lemma and the exponential bounds (see Loeve (1968), page 254). Specifically, letting $s_{nk}^2 = n\,\mathrm{Var}\big( \rho(X - a_n k/n) - \rho(X) \big)$, we have for $\varepsilon_0 > 0$,

$\Pr\Big\{ \sup_{0 \le k \le n} |V_n(k/n)| > \varepsilon_0 / b_n \Big\} \le \sum_{k=0}^{n} \Pr\Big\{ \frac{|n\,V_n(k/n)|}{s_{nk}} > \frac{\varepsilon_0\, n}{b_n\, s_{nk}} \Big\} .$

Under (2.6a) and (2.6b), the two possible cases of the exponential bounds lead to the last sum being bounded above, for some $B > 0$, by one of

$\sum_{k=0}^{n} \exp\{ -Bn/(b_n^2 a_n) \} \qquad \text{or} \qquad \sum_{k=0}^{n} \exp\{ -Bn/b_n \} .$

This completes the proof.
3. An (a.s.) Representation

A second estimator of the functional $\int \rho(x)\,dF(x)$, which is both location and scale invariant, is

(3.1)    $n^{-1} \sum_{i=1}^{n} \rho\big( (X_i - T_n)/S_n \big) ,$

where $T_n$ is as in Section 2 and $S_n$ is a location invariant, scale equivariant estimator which converges to the scale parameter $\varsigma$; we assume throughout that $\varsigma = 1$. A version which is location invariant but scale equivariant (and which would be used in practice) is

(3.1)*    $n^{-1} \sum_{i=1}^{n} S_n\, \rho\big( (X_i - T_n)/S_n \big) .$

We find it more convenient to work with (3.1), returning to the study of (3.1)* after Corollary 3.1.
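As a minimal numerical sketch (assuming the plug-in forms displayed above for (3.1) and (3.1)*, and taking the sample median for $T_n$ and a normalized interquartile range for $S_n$ — illustrative choices only):

    import numpy as np

    def estimate_31(x, rho):
        # Sketch of (3.1) and (3.1)*; T_n = sample median, S_n = normalized IQR,
        # both chosen here only for illustration.
        t_n = np.median(x)
        q75, q25 = np.percentile(x, [75, 25])
        s_n = (q75 - q25) / 1.349          # approximately 1 for standard normal data
        r = rho((x - t_n) / s_n)
        return r.mean(), s_n * r.mean()    # (3.1) and (3.1)*

    rng = np.random.default_rng(2)
    x = rng.standard_normal(500)
    huber_sq = lambda u: np.clip(u, -1.345, 1.345) ** 2   # rho = psi^2 as in Section 6
    print(estimate_31(x, huber_sq))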
Again, from a Taylor expansion,

(3.2)    $n^{-1} \sum_{i=1}^{n} \rho\big( (X_i - T_n)/S_n \big) = n^{-1} \sum_{i=1}^{n} \rho(X_i) - \{ E X \rho'(X) \}(S_n - 1) - \{ E\rho'(X) \}\, T_n + \cdots \quad \text{(a.s.)}.$

Note the rather surprising fact, also seen by Carroll (1975) in his study of M-estimators, that unless $E X \rho'(X) = 0$ the estimator (3.1) has an asymptotic distribution different from the estimator (2.1). This section gives a result on the order of (3.2) without differentiability assumptions.
The process of interest is the following.

Definition 3.1. For a sequence of constants $\{a_n\}$ decreasing to zero,

$V_n(t, u) = n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (1 + u a_n)(X_i - t a_n) \big) - \rho(X_i - t a_n) - E\big( \rho\big( (1 + u a_n)(X - t a_n) \big) - \rho(X - t a_n) \big) \big\} .$

The outline of this section is similar to that of Section 2. Again, the proofs of all necessary results are delayed until the end of the section.
Proposition 3.1. Suppose $\rho$ is increasing. Then there exists a constant $M > 0$ such that

(3.3)    $\sup\{ |V_n(t, u)| : 0 \le t, u \le 1 \} \le M \{ A_{1n} + A_{2n} + A_{3n} + A_{4n} + A_{5n} + A_{6n} \} ,$

where, with $I_B$ denoting the indicator function of the set $B$,

(3.4)
$A_{1n} = \sup_{0 \le t, u \le 1} \big| E\big\{ \rho\big( (1 + u a_n)(X - a_n [nt]/n) \big) - \rho\big( (1 + a_n [nu]/n)(X - a_n [nt]/n) \big) \big\} \big| ,$

$A_{2n} = \sup_{0 \le t \le 1} \Big| n^{-1} \sum_{i=1}^{n} \big\{ \rho(X_i - a_n t) - \rho(X_i) - E\big( \rho(X - a_n t) - \rho(X) \big) \big\} \Big| ,$

$A_{3n} = \sup_{0 \le j, k \le n} |V_n(k/n, j/n)| ,$

$A_{4n} = \sup_{0 \le t, u \le 1} \Big| n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (1 + a_n [nu]/n)(X_i - a_n [nt]/n) \big) - \rho\big( (1 + a_n ([nu]+1)/n)(X_i - a_n [nt]/n) \big) \big\}\, I\{ X_i > a_n [nt]/n \} \Big| ,$

$A_{5n} = \sup_{0 \le t, u \le 1} \Big| n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (1 + a_n [nu]/n)(X_i - a_n [nt]/n) \big) - \rho\big( (1 + a_n ([nu]+1)/n)(X_i - a_n [nt]/n) \big) \big\}\, I\{ X_i \le a_n [nt]/n \} \Big| ,$

$A_{6n} = \sup_{0 \le t, u \le 1} \big| E\big\{ \rho\big( (1 + u a_n)(X - a_n t) \big) - \rho\big( (1 + u a_n)(X - a_n [nt]/n) \big) \big\} \big| .$

Lemma 3.1. Let $\rho$ be bounded and increasing and suppose that equations (2.6a) and (2.6b) hold, and that if $A_n$, $B_n$, $C_n$ converge to zero,

(3.5)    $\int \big\{ \rho\big( (1 + A_n)(X - B_n - C_n) \big) - \rho\big( (1 + A_n)(X - B_n) \big) \big\}^{r} \, dF(x) = O(|C_n|) \qquad \text{for } r = 1, 2 ,$

(3.6)    $\int \big\{ \rho\big( (1 + A_n + B_n)(X - C_n) \big) - \rho\big( (1 + A_n)(X - C_n) \big) \big\}^{r} \, dF(x) = O(|B_n|) \qquad \text{for } r = 1, 2 ,$

where the integrals in (3.5) and (3.6) are taken over the real line or over one of the sets $\{X < D_n\}$, $\{X > D_n\}$, with $D_n$ converging to zero. Then

$\sup\{ b_n |V_n(t, u)| : 0 \le t, u \le 1 \} \to 0 \quad \text{(a.s.)}.$
Theorem 3.1. Suppose $\rho = \rho^+ - \rho^-$, where $\rho^+$, $\rho^-$ satisfy the conditions of Lemma 3.1. Suppose further that $\{a_n\}$, $\{b_n\}$ satisfy (2.6a) and (2.6b). Then

$\sup_{|u|, |t| \le 1} | b_n V_n(t, u) | \to 0 \quad \text{(a.s.)}.$

Hence, if $a_n^{-1}(T_n - \theta) \to 0$ (a.s.) and $a_n^{-1}(S_n - \varsigma) \to 0$ (a.s.) under $F(\varsigma^{-1}(x - \theta))$, then

(3.7)    $n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (X_i - T_n)/S_n \big) - E_F \rho(X) \big\} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - E_F \rho(X) \} + \int \Big\{ \rho\Big( \frac{y - T_n}{S_n} \Big) - \rho(y) \Big\}\,dF(y) + O(b_n^{-1}) \quad \text{(a.s.)}.$
The following Corollary gives our most specific result for the estimator (3.1). It shows that (3.1) satisfies the Law of the Iterated Logarithm if $S_n$, $T_n$ do, and is asymptotically normally distributed if $\big( n^{-1}\sum_{i=1}^{n}\{\rho(X_i) - E\rho(X)\},\ S_n - 1,\ T_n \big)$ are jointly asymptotically normally distributed (when properly normed).

Corollary 3.1. Under the conditions of Theorem 3.1, if as $h, q \to 0$

(3.8)    $\int \big\{ \rho\big( (1+h)(x+q) \big) - \rho(x) \big\}\,dF(x) = A(F)\,h + B(F)\,q + O(|h|^2) + O(|q|^2) + O(|hq|) ,$

then equation (3.7) becomes

(3.9)    $n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (X_i - T_n)/S_n \big) - E\rho(X) \big\} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \} - A(F)(S_n - 1) - B(F)\, T_n + O(|a_n|^2) + O(b_n^{-1}) \quad \text{(a.s.)}.$

We note that if we had chosen the scale equivariant estimator (3.1)*, then (3.9) would have become

(3.9)*    $n^{-1} \sum_{i=1}^{n} \big\{ S_n\, \rho\big( (X_i - T_n)/S_n \big) - E\rho(X) \big\} = n^{-1} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \} + \{ E\rho(X) - A(F) \}(S_n - 1) - B(F)\, T_n + O(|a_n|^2) + O(b_n^{-1}) \quad \text{(a.s.)}.$
Finally, the results are again applicable for a wide variety of functions $\rho$.

Corollary 3.2. Suppose $\rho$ is twice continuously differentiable except at a finite number of points and that $F$ is Lipschitz of order one in neighborhoods of these points. Then $\rho$ satisfies (3.8) with $A(F) = E X \rho'(X)$ and $B(F) = E\rho'(X)$, so that if $F$ is symmetric and $\rho(x) = -\rho(-x)$, then $A(F) = 0$.

Corollary 3.3. Define $\rho$ by

$\rho(x) = 1 \ \text{ if } |x| \le k , \qquad = 0 \ \text{ otherwise.}$

Then if $F$ is Lipschitz of order one in neighborhoods of $\pm k$, $\rho$ satisfies the conditions of Theorem 3.1 and Corollary 3.1. If in addition $E|X| < \infty$, the same results hold for the Huber function

$\rho(x) = x \ \text{ if } |x| \le k , \qquad = k\,\mathrm{sign}(x) \ \text{ otherwise.}$
Corollary 3.1 yields in Corollary 3.4 simple conditions under which the stopping rules described in Section 1 (and based on either (3.1) or (3.1)*) are asymptotically normally distributed. The following is easily shown because of Anscombe's (1952) Theorems 1 and 4; the additional conditions on $T_n$, $S_n$ seem unavoidable.

Corollary 3.4. Let $T_n$, $S_n - 1$ be uniformly continuous in probability (Anscombe (1952)) and suppose

$\Big( n^{1/2} T_n ,\ n^{1/2}(S_n - 1) ,\ n^{-1/2} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \} \Big)$

are jointly asymptotically normally distributed. Consider a sequence of integer-valued random variables $N(d)$ and constants $n(d)$ for which $N(d)/n(d) \to 1$ in probability. If (3.9) and (3.9)* hold, both

(3.10a)    $N(d)^{-1/2} \sum_{i=1}^{N(d)} \big\{ \rho\big( (X_i - T_{N(d)})/S_{N(d)} \big) - E\rho(X) \big\}$

and

(3.10b)    $N(d)^{-1/2} \sum_{i=1}^{N(d)} \big\{ S_{N(d)}\, \rho\big( (X_i - T_{N(d)})/S_{N(d)} \big) - E\rho(X) \big\}$

are asymptotically normally distributed under the distribution function $F(\varsigma^{-1}(x - \theta))$.
Proof of Proposition 3.1. Since $\rho$ is increasing, simple manipulations show

$|V_n(t, u)| \le |V_n([nt]/n, u)| + \Big| n^{-1} \sum_{i=1}^{n} \big\{ \rho\big( (1 + u a_n)(X_i - a_n ([nt]+1)/n) \big) - \rho\big( (1 + u a_n)(X_i - a_n [nt]/n) \big) \big\} \Big| + \big| E\big\{ \rho\big( (1 + u a_n)(X - a_n t) \big) - \rho\big( (1 + u a_n)(X - a_n [nt]/n) \big) \big\} \big| + \big| E\big\{ \rho\big( (1 + u a_n)(X - a_n ([nt]+1)/n) \big) - \rho\big( (1 + u a_n)(X - a_n [nt]/n) \big) \big\} \big| ,$

and a similar bracketing of $u$ between $[nu]/n$ and $([nu]+1)/n$ replaces the factor $(1 + u a_n)$ by $(1 + a_n [nu]/n)$ at the cost of terms of the form appearing in $A_{4n}$ and $A_{5n}$. Collecting terms,

$\sup_{0 \le t, u \le 1} |V_n(t, u)| \le M \big( A_{1n} + A_{2n} + A_{3n} + A_{4n} + A_{5n} + A_{6n} \big) .$

Proof of Lemma 3.1. Equations (2.6a) and (2.6b) together with (3.5) and (3.6) imply that $b_n A_{1n} \to 0$ and $b_n A_{6n} \to 0$. The almost sure behavior of the other terms in Proposition 3.1 follows in a manner similar to the proof of Lemma 2.1 (although $A_{4n}$ and $A_{5n}$ are not mean zero random variables, one may normalize them easily).
4. First Weak Convergence Results

The asymptotic normality of the stopping rule discussed in Section 1 was shown in Corollary 3.4 under (essentially) the assumptions that $a_n^{-1} T_n \to 0$ (a.s.) and $a_n^{-1}(S_n - 1) \to 0$ (a.s.) (where $a_n^{-1}$ is of order $n^{1/2}$ up to a power of $\log n$), and that the statistics appearing in Corollary 3.4 are jointly asymptotically normally distributed. The conditions on $\rho$ were minimal and no differentiability properties were needed. In this and the next section, by means of the theory of weak convergence, the assumptions $a_n^{-1} T_n \to 0$ (a.s.) and $a_n^{-1}(S_n - 1) \to 0$ (a.s.) are removed; however, the only price paid is strengthened restrictions on $\rho$. Differentiability of $\rho$ is still unnecessary.

In this section we investigate the estimators (2.1). Theorem 4.1 discusses the asymptotic normality of these estimators under random sample sizes, the only assumption for $T_n$ being $T_n \to 0$ (a.s.). In Lemma 4.2, the asymptotic normality of the left hand side of (3.10) is discussed, the only assumptions for $T_n$ relating to the asymptotic behavior of $N(d)^{1/2} T_{N(d)}$.

Definition 4.1. Let

$V_n(s, t) = n^{-1/2} \sum_{i=1}^{[ns]} \{ \rho(X_i - t) - E\rho(X - t) \} , \qquad V_n^*(s, t) = n^{-1/2} \sum_{i=1}^{[ns]} \{ \rho(X_i - [nt]/n) - E\rho(X - [nt]/n) \} .$

That $V_n$ and $V_n^*$ are essentially the same follows from the next result.
Proposition 4.1. Suppose that $\rho$ is increasing and satisfies, as $h \to 0$,

(4.1)    $\int \{ \rho(x+q+h) - \rho(x+q) \}^2 \, dF(x) = O(|h|^2) ,$

uniformly in $|q| \le 1$. Then

$\sup\{ |V_n(s,t) - V_n^*(s,t)| : 0 \le s, t \le 1 \} \overset{P}{\to} 0 .$

Note that (4.1) is stronger than (2.5). It is clear from the proof that the supremum could be taken for values of $t$ ranging over any finite interval. Indeed, this will be true of all the results.
Definition 4.2. Define $\Gamma(t_1, t_2) = \mathrm{Cov}\{ \rho(X - t_1),\ \rho(X - t_2) \}$.

Lemma 4.1. Assume the conditions of Proposition 4.1 hold, that $\Gamma(t_1, t_2)$ is continuous, and that there exists a constant $M > 0$ with

(4.2)    $\int \big\{ \rho(y - t) - \rho(y - s) - E\big( \rho(X - t) - \rho(X - s) \big) \big\}^4 \, dF(y) \le M |t - s| ,$

uniformly in $|t|, |s| \le 1$. Then there is a $D_2$-valued process $W$ such that $V_n \overset{w}{\Longrightarrow} W$, where $\overset{w}{\Longrightarrow}$ denotes weak convergence.
Note that Theorem 4.1 below assumes nothing about the asymptotic distribution of $N(d)^{1/2} T_{N(d)}$, but rather it assumes that $T_n$ is strongly consistent.

Theorem 4.1. Suppose that the conditions of Lemma 4.1 hold and

(4.3a)    $T_n \to 0$ (a.s.) under $F(x)$;

(4.3b)    $N(d)$ is a sequence of integer valued random variables such that for some sequence of constants $\{n(d)\}$, $N(d)/n(d) \to 1$ as $d \to 0$.

Then

(4.3c)    $N(d)^{-1/2} \sum_{i=1}^{N(d)} \Big\{ \rho(X_i - T_{N(d)}) - \int \rho(y - T_{N(d)})\,dF(y) \Big\} \overset{L}{\to} N\big( 0, \mathrm{Var}\,\rho(X) \big) ,$

where $N(\mu, a^2)$ is the distribution of a normal random variable with mean $\mu$ and variance $a^2$.

Corollary 4.1. Suppose that, in addition to the conditions of Theorem 4.1,

(4.4)    $E_F\{ \rho(X + h) - \rho(X) \} = h\,B(F, \rho) + O(|h|^2) \quad \text{as } |h| \to 0 ,$

and that

(4.5)    $N(d)^{-1/2} \sum_{i=1}^{N(d)} \{ \rho(X_i) - E\rho(X) \} + B(F, \rho)\, N(d)^{1/2}\, T_{N(d)}$

has a limiting normal distribution with mean zero and variance $C(F, \rho)$. Then

(4.6)    $N(d)^{-1/2} \sum_{i=1}^{N(d)} \{ \rho(X_i - T_{N(d)}) - E_F \rho(X) \} \overset{L}{\to} N\big( 0, C(F, \rho) \big) .$
The main result of this section so far, namely Theorem 4.1, is based only on the assumption that $T_n$ is a strongly consistent estimate of the location parameter. However, to get a result like (4.6), one must assume essentially that $(N(d))^{1/2} T_{N(d)}$ has a limiting distribution. If only this assumption is made (rather than the strong consistency of $T_n$), the conditions on $\rho$ can be relaxed.
Define now, for $b_n$ monotonically nondecreasing,

$W_n(s, t) = n^{-1/2} \sum_{i=1}^{[ns]} \{ \rho(X_i - b_n t / n^{1/2}) - E\rho(X - b_n t / n^{1/2}) \} .$

Lemma 4.2. Suppose $\rho$ is increasing, that $N(d)/n(d) \to \theta$ in probability ($\theta$ a positive random variable), and that the following hold:

(4.7a)    there is a sequence $c_n$ with $b_n c_n / n^{1/2} \to 0$ and $b_n / c_n \to 0$;

(4.7b)    $\int \{ \rho(x+h) - \rho(x) \}^2 \, dF(x) = O(|h|)$ as $|h| \to 0$;

(4.7c)    $\big| \int \{ \rho(x+h) - \rho(x) \}\,dF(x) \big| = O(|h|)$ as $|h| \to 0$.

Then

(4.8)    $\sup\{ |W_n(s, t) - W_n(s, 0)| : 0 \le s \le 1,\ |t| \le 1 \} \overset{P}{\to} 0 ,$

so that if $(N(d))^{1/2} T_{N(d)}$ has a limiting distribution, (4.3c) holds. If (4.4) and (4.5) hold, then (4.6) is true.
18
Remark 4.1.
Two points are of interest here.
properties of
N(d)/n(d)
(4.7b) and (4.7c) if
have been relaxed.
First, note that the convergence
Secondly. it is easy to see that
has a bounded. first derivative or is twice boundedly
p
differentiable except at a finite number of points, and
F
is Lipschitz in
neighborhoods of these points.
Proof of Proposition 4.1. The method of proof here follows along the lines of Bickel (1975). First,

$E |V_n(s,t) - V_n^*(s,t)|^2 \le \int \{ \rho(x - t) - \rho(x - [nt]/n) \}^2 \, dF(x) = O(n^{-1}) .$

Now define $P_{jn} = [n^{1/4} j]/n$, $j = 1, \ldots, n^* = [n^{3/4}]$. Then, uniformly in $s$,

$E |V_n(s, P_{jn}) - V_n^*(s, P_{jn})|^2 = O(n^{-1}) .$

Since $\rho$ is increasing,

$\sup\{ |V_n(s,t) - V_n(s, P_{jn})| : P_{jn} \le t \le P_{j+1,n} \} \le |V_n(s, P_{jn}) - V_n(s, P_{j+1,n})| + n^{-1/2} \sum_{i=1}^{[ns]} \big| E\{ \rho(x - P_{jn}) - \rho(x - P_{j+1,n}) \} \big| \le |V_n(s, P_{jn}) - V_n(s, P_{j+1,n})| + O(n^{-1/4}) ,$

the last inequality following by (4.1). A similar computation may be made for $V_n^*$. Thus,

$|V_n(s,t) - V_n^*(s,t)| \le \max_{0 \le j \le n^*} |V_n(s, P_{jn}) - V_n^*(s, P_{jn})| + \max_{0 \le j \le n^*} |V_n(s, P_{jn}) - V_n(s, P_{j+1,n})| + \max_{0 \le j \le n^*} |V_n^*(s, P_{jn}) - V_n^*(s, P_{j+1,n})| + O(n^{-1/4}) .$

Since the variances of each of the terms in absolute values are $O(n^{-3/2})$, application of Kolmogorov's inequality completes the proof.
Proof of Lemma 4.1. Make the following definitions. Let $Z_n(s,t) = V_n(s,0)$ and $Z_n^*(s,t) = V_n^*(s,t) - V_n(s,0)$. $Z_n$ converges weakly to an element $W_1$ in $D_2$; if $Z_n^*$ also converges weakly to an element $W_2$ in $D_2$, the proof would be complete. Disjoint blocks $B$ and $C$ in $\mathbb{R}^2$ are neighbors if they abut and have one face in common. For any $D$-valued process $Z$ and block $B = (s_1, t_1] \times (s_2, t_2]$, we define $Z(B)$ to be the increment of $Z$ around $B$.

Because of Theorem 6 of Bickel and Wichura (1971), since the finite dimensional distributions converge and $Z_n^*$ vanishes along its lower boundary, it remains to show that $Z_n^*$ is tight. In order to prove tightness, it suffices to show that if $B$ and $C$ are neighbors, there exist $\gamma > 0$, $\beta > 1/2$ such that

(4.9)    $E |Z_n^*(B)|^{\gamma}\, |Z_n^*(C)|^{\gamma} \le \{ \mu(B)\, \mu(C) \}^{\beta} ,$

where $\mu$ is a finite non-negative measure on the unit cube. Letting $j, k, m, p, q, r$ be integers with $0 \le j \le k \le m \le n$ and $0 \le p \le q \le r \le n$, there are two cases to deal with:

(4.10)    $B = (j/n, k/n] \times (p/n, q/n] , \qquad C = (k/n, m/n] \times (p/n, q/n] ;$

(4.11)    $B = (j/n, k/n] \times (p/n, q/n] , \qquad C = (j/n, k/n] \times (q/n, r/n] .$

In the first case, under equation (4.10), $V_n^*(B)$ and $V_n^*(C)$ are independent, so that there is a constant $M > 0$ for which $\beta = 1$ suffices in (4.9), with $\mu$ being Lebesgue measure on the cube. Under equation (4.11), $V_n^*(B)$ and $V_n^*(C)$ are not independent, but the Schwartz inequality may be employed. Then, letting

$Z(p, q) = \rho(X - q/n) - \rho(X - p/n) - E\{ \rho(X - q/n) - \rho(X - p/n) \} ,$

$E |V_n^*(B)|^4 \le \frac{k - j}{n^2}\, E|Z(p,q)|^4 + \Big( \frac{k - j}{n} \Big)^2 \big( E|Z(p,q)|^2 \big)^2 ,$

with the last inequality following from (4.2) and the fact that $(k - j)/n \ge 1/n$, so that (4.9) holds.
Proof of Theorem 4.1. Because the result is true for $V_n$ (Billingsley (1968), Section 17), it will suffice to show that for all $\varepsilon, \beta > 0$ there exist $n_0$ and $\eta$ such that if $n \ge n_0$,

$\Pr\Big\{ \sup_{0 \le t \le \eta,\ 0 \le s \le 1} |Z_n^*(s,t)| > \beta \Big\} < \varepsilon .$

Now, since $Z_n^*(s, 0) = 0$,

(4.12)    $|Z_n^*(s,t)| \le \min\big\{ |Z_n^*(s,t) - Z_n^*(s,0)| ,\ |Z_n^*(s,\eta) - Z_n^*(s,t)| \big\} + |Z_n^*(s,\eta)| .$

Thus

(4.13)    $\sup_{0 \le t \le \eta,\ 0 \le s \le 1} |Z_n^*(s,t)| \le \min\Big\{ \sup_{t \le u \le v,\ v - t \le \eta}\ \sup_{0 \le s \le 1} |Z_n^*(s,u) - Z_n^*(s,t)| ,\ \sup_{0 \le s \le 1} |Z_n^*(s,v) - Z_n^*(s,u)| \Big\} + \sup_{0 \le s \le 1} |Z_n^*(s,\eta)| .$

By Kolmogorov's inequality, the second term on the right hand side of (4.13) converges in probability to zero as $n \to \infty$, $\eta \to 0$. The first term is bounded by the modulus $w''_{\eta}(Z_n^*)$ (see Bickel and Wichura (1971)), and we proved tightness in Lemma 4.1 by showing that

$\lim_{\eta \to 0} \lim_{n \to \infty} \Pr\{ w''_{\eta}(Z_n^*) > \beta \} = 0 .$

Thus, since $T_n \to 0$ (a.s.), the result follows as $d \to 0$.
Proof of Corollary 4.1. The term in (4.6) can be written as the term in (4.3c) plus a correction which, by (4.4), is $N(d)^{1/2} B(F,\rho)\, T_{N(d)} + O_P\big( N(d)^{-1/2} \big)$, where $O_P$ means bounded in probability. From the proof of Theorem 4.1, this term has the same limiting distribution as

$N(d)^{-1/2} \sum_{i=1}^{N(d)} \{ \rho(X_i) - E\rho(X) \} + N(d)^{1/2} B(F, \rho)\, T_{N(d)} ,$

which completes the proof.
Proof of Lemma 4.2. The proof of (4.8) follows closely that of Proposition 4.1, with $P_{jn} = j/c_n$, $j = 0, \ldots, n^* = c_n$. Then one sees that if $N(d)/m(d) \overset{P}{\to} c_0$ for some constant $c_0$, then

$\sup_{0 \le s \le 1} \big| W_{m(d)}\big( s,\ m(d)^{1/2} T_{N(d)} / b_{m(d)} \big) - W_{m(d)}(s, 0) \big| \overset{P}{\to} 0 .$

Since $W_n(s, 0)$ converges to a Wiener process, Billingsley's Theorem 17.2 applies.
5. Weak Convergence Results

In this section we discuss the asymptotic normality of (3.1) and (3.1)* under random sample sizes. As mentioned in Section 4, the assumptions about $T_n$ and $S_n$ will only include knowledge of the asymptotic behavior of $N(d)^{1/2} T_{N(d)}$ and $N(d)^{1/2}(S_{N(d)} - 1)$. First consider (3.1).

Definition 5.1. For a sequence of constants $\{a_n\}$ decreasing to zero,

$V_n(s, t, u) = n^{-1/2} \sum_{i=1}^{[ns]} \big\{ \rho\big( (1 + u a_n)(X_i - t/n^{1/2}) \big) - E\rho\big( (1 + u a_n)(X - t/n^{1/2}) \big) \big\} .$

Lemma 5.1. Let $\{A_n\}$, $\{B_n\}$, $\{C_n\}$ each converge to zero. Suppose $\rho$ is increasing and

(5.1)    uniformly in $|h| \le 1$, $\big| E\big\{ \rho\big( (1 + h A_n)(X - B_n - C_n) \big) - \rho\big( (1 + h A_n)(X - B_n) \big) \big\} \big| = O(|C_n|) ;$

(5.2)    if $H_n = E\big\{ \rho\big( (1 + h A_n)(X - B_n) \big) - \rho\big( (1 + q A_n)(X - B_n) \big) \big\}$, then for $r = 1, 2$, uniformly in $|h|, |q| \le 1$,

$E\big\{ \rho\big( (1 + h A_n)(X - B_n) \big) - \rho\big( (1 + q A_n)(X - B_n) \big) - H_n \big\}^{2r} \le M |h - q|^{r}$

for some $M \ge 0$. Then

$\sup\{ |V_n(s, t, u) - V_n(s, 0, 0)| : 0 \le s, u \le 1,\ 0 \le t \le 1 \} \overset{P}{\to} 0 .$
Theorem 5.1. Suppose $\rho = \rho^+ - \rho^-$, where $\rho^+$, $\rho^-$ satisfy the conditions of Lemma 5.1. If, in addition, $a_n^{-1}(S_n - \varsigma) \to 0$ (a.s.), $N(d)/n(d) \to \theta$ in probability ($\theta$ a positive random variable), and $n(d)^{1/2} T_{N(d)}$ has a limiting distribution, then

(5.3)    $(N(d))^{-1/2} \sum_{i=1}^{N(d)} \Big\{ \rho\big( (X_i - T_{N(d)})/S_{N(d)} \big) - \int \rho\big( (y - T_{N(d)})/S_{N(d)} \big)\,dF(y) \Big\} \overset{L}{\to} N\big( 0, \mathrm{Var}(\rho(X)) \big) .$

The same result is true if the almost sure behavior of $S_n$ is unknown but $(N(d))^{1/2}(S_{N(d)} - 1)$ has a limiting distribution.

Corollary 5.1. Let (3.8) and the conclusion to Theorem 5.1 hold. If, in addition, $T_n$ and $S_n - 1$ are uniformly continuous in probability and are jointly asymptotically normally distributed with $n^{-1/2} \sum_{i=1}^{n} \{ \rho(X_i) - E\rho(X) \}$, then both

$N(d)^{-1/2} \sum_{i=1}^{N(d)} \big\{ \rho\big( (X_i - T_{N(d)})/S_{N(d)} \big) - E\rho(X) \big\}$

and

$N(d)^{-1/2} \sum_{i=1}^{N(d)} \big\{ S_{N(d)}\, \rho\big( (X_i - T_{N(d)})/S_{N(d)} \big) - E\rho(X) \big\}$

are asymptotically normally distributed under $F(\varsigma^{-1}(x - \theta))$.

Proof of Lemma 5.1. We may assume that $0 \le t \le 1$. Then
(5.4)    $|V_n(s,t,u) - V_n(s,0,0)| \le |V_n(s,t,u) - V_n(s,t,0)| + |V_n(s,t,0) - V_n(s,0,0)| .$

The second term on the right hand side of (5.4) has been handled in Section 4 with $b_n = 1$. Let $\delta > 0$ be fixed but small and define $P_j = j/m$, $j = 0, 1, \ldots, m = [1/\delta]$. Then, if the following suprema are taken over the set $\{ 0 \le s, u \le 1,\ P_j \le t \le P_{j+1} \}$, we obtain

$\sup |V_n(s,t,u) - V_n(s,t,0)| \le \sup\big\{ |V_n(s,t,u) - V_n(s,P_j,u)| + |V_n(s,P_j,u) - V_n(s,P_j,0)| + |V_n(s,P_j,0) - V_n(s,t,0)| \big\} \le 2 \sup |V_n(s,t,u) - V_n(s,P_j,u)| + \sup |V_n(s,P_j,u) - V_n(s,P_j,0)| .$

By the monotonicity of $\rho$, this last bound is itself bounded above by

$2 \sup |V_n(s,P_{j+1},u) - V_n(s,P_j,u)| + \sup |V_n(s,P_j,u) - V_n(s,P_j,0)| + 2 n^{1/2} \big| E\big\{ \rho\big( (1 + u a_n)(X - t/n^{1/2}) \big) - \rho\big( (1 + u a_n)(X - P_j/n^{1/2}) \big) \big\} \big| .$

Hence, if the following suprema are taken over the set $\{ 0 \le s, t, u \le 1 \}$, then for some $M > 0$,

$\sup |V_n(s,t,u) - V_n(s,0,0)| \le M \Big\{ \sum_{j=0}^{m} \sup |V_n(s,P_j,u) - V_n(s,P_j,0)| + 2 \sum_{j=0}^{m} \sup |V_n(s,P_j,0) - V_n(s,0,0)| \Big\} + O(\delta) ,$

the $O(\delta)$ term following by (5.1). Hence it suffices to show that, for any fixed $t_0$,

(5.6)    $|V_n(s, t_0, u) - V_n(s, t_0, 0)| \overset{P}{\to} 0 .$

The proof of (5.6) parallels that of Lemma 4.1, with the condition (5.2) being used.

Proof of Theorem 5.1. The expression on the left hand side of (5.3) is, apart from factors tending to one,

(5.7)    $V_{n(d)}\big( N(d)/n(d),\ n(d)^{1/2} T_{N(d)},\ u_{n(d)} \big) , \qquad \text{where } 1 + u_{n(d)} a_{n(d)} = S_{N(d)}^{-1} .$

Since $n(d)^{1/2} T_{N(d)}$ is bounded by some $M_0$ with arbitrarily high probability, Lemma 5.1 shows that (5.7) is equivalent in probability to $V_{n(d)}(N(d)/n(d), 0, 0)$, completing the proof.
25
Remark 5.1.
process
Wnile it is possible to investigate the weak convergence of the
H (s,t,u)
n
= Vn (s,n]2t,u),
given by Wn (s,t,u)
to obtain results which use only global properties of
(5.2).
we have been unable
p
such as (5.1) and
Rather, our results require such local properties as differentiability.
6. Applications

We now present two applications of the results in the previous sections. It is first shown that stopping rules for fixed-width confidence intervals based on M-estimators (Huber (1964), Andrews et al. (1972)) are asymptotically normally distributed. The second application is to one-step M-estimators (Bickel (1975)); almost sure representations are given for these estimators, their asymptotic normality under random sample sizes is discussed, and estimates of their variance lead to stopping rules which are asymptotically normal.

M-estimators are defined as solutions $T_n$ to the equation

$0 = \sum_{i=1}^{n} \psi\big( (X_i - T_n)/S_n \big) ,$

where $S_n$ is a robust estimate of scale with the invariance properties discussed in Section 3. Assume $E\psi(X) = 0$ and that $\psi^2$, $\psi'$ are bounded. Then $n^{1/2} T_n$ is asymptotically normal with mean zero and variance

(6.1)    $\int \psi^2(x)\,dF(x) \Big/ \Big\{ \int \psi'(x)\,dF(x) \Big\}^2 .$

Hence, the natural estimator of (6.1) is

(6.2)    $\Big\{ n^{-1} \sum_{i=1}^{n} \psi^2\big( (X_i - T_n)/S_n \big) \Big\} \Big/ \Big\{ n^{-1} \sum_{i=1}^{n} \psi'\big( (X_i - T_n)/S_n \big) \Big\}^2 .$
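As a concrete, purely illustrative sketch of an M-estimate and the plug-in variance estimate just described — the Huber psi, the MAD scale, and the iteration scheme below are assumptions of the sketch, not prescriptions of this paper:

    import numpy as np

    K = 1.345
    psi = lambda u: np.clip(u, -K, K)                  # Huber psi
    dpsi = lambda u: (np.abs(u) <= K).astype(float)    # psi'

    def m_estimate(x, n_iter=50):
        # Solve sum psi((x - t)/s) = 0 by Newton-type iterations;
        # s is a fixed robust scale (here the normalized MAD).
        s = 1.4826 * np.median(np.abs(x - np.median(x)))
        t = np.median(x)
        for _ in range(n_iter):
            u = (x - t) / s
            t += s * psi(u).sum() / dpsi(u).sum()
        return t, s

    def variance_estimate(x):
        # Plug-in form of (6.2): mean psi^2 over (mean psi')^2 at the residuals.
        t, s = m_estimate(x)
        u = (x - t) / s
        return (psi(u) ** 2).mean() / dpsi(u).mean() ** 2

    rng = np.random.default_rng(3)
    x = rng.standard_normal(400)
    print(m_estimate(x)[0], variance_estimate(x))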
The following Lemmas are immediate consequences of the work in Sections 3 and 5.
Lemma 6.1. Suppose $\psi^2$, $\psi'$ satisfy the conclusion to Corollary 3.1 with $a_n = n^{-1/2} \log n$, that (3.8) holds, and that $E\psi'(X) \ne 0$. Define

$a_1 = \{ E\psi'(X) \}^{-2} , \qquad a_2 = -2\, E\psi^2(X) \big/ \{ E\psi'(X) \}^{3} .$

Then, writing $\hat\sigma_n^2$ for the estimator (6.2),

(6.3)    $\hat\sigma_n^2 - \frac{E\psi^2(X)}{\{ E\psi'(X) \}^2} = n^{-1} \sum_{i=1}^{n} \big\{ a_1\big( \psi^2(X_i) - E\psi^2(X) \big) + a_2\big( \psi'(X_i) - E\psi'(X) \big) \big\} + O\big( n^{-3/4} (\log n)^2 \big) \quad \text{(a.s.)}.$

Lemma 6.2. Suppose $\psi^2$, $\psi'$ satisfy the conclusion to either Lemma 6.1 or Lemma 5.1, that $N(d)^{1/2}(S_{N(d)} - 1)$ has a limiting distribution, that (3.8) holds, and that for some constant $D(F, \psi)$ and constants $a_3$, $a_4$,

(6.4)    $N(d)^{1/2} \Big\{ N(d)^{-1} \sum_{i=1}^{N(d)} \big\{ a_1\big( \psi^2(X_i) - E\psi^2(X) \big) + a_2\big( \psi'(X_i) - E\psi'(X) \big) \big\} + a_3\big( S_{N(d)} - 1 \big) + a_4\, T_{N(d)} \Big\} \overset{L}{\to} N\big( 0, D(F, \psi) \big) .$

Then

(6.5)    $N(d)^{1/2} \left\{ \frac{ N(d)^{-1} \sum_{i=1}^{N(d)} \psi^2\big( (X_i - T_{N(d)})/S_{N(d)} \big) }{ \Big\{ N(d)^{-1} \sum_{i=1}^{N(d)} \psi'\big( (X_i - T_{N(d)})/S_{N(d)} \big) \Big\}^2 } - \frac{ E\psi^2(X) }{ \{ E\psi'(X) \}^2 } \right\} \overset{L}{\to} N\big( 0, D(F, \psi) \big) .$
Remark 6.1. The results (6.3) and (6.5) hold under very general conditions. One important set obtains (6.5) from (6.3). Carroll (1975) has shown that, under the conditions of Corollary 3.2, the M-estimator $T_n$ admits an almost sure representation of the form (2.10), while Bahadur (1966) has shown that for some function $H$, if $S_n$ is the interquartile range (suitably normalized),

$S_n - 1 = n^{-1} \sum_{i=1}^{n} \{ H(X_i) - E H(X) \} + O\big( n^{-3/4} (\log n)^2 \big) \quad \text{(a.s.)}.$

Under these conditions, (6.4) is immediate from Anscombe (1952), so that (6.3) and (6.5) hold.
The second application has to do with one-step estimators. If $B_n$ is a preliminary estimate of location (such as the sample median) and $S_n$ the robust estimate of scale, we define

(6.6)    $T_n = B_n + S_n\, \frac{ n^{-1} \sum_{i=1}^{n} \psi\big( (X_i - B_n)/S_n \big) }{ n^{-1} \sum_{i=1}^{n} \psi'\big( (X_i - B_n)/S_n \big) } .$

Then the following Lemmas are also immediate from the results of Sections 3 and 5.
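Before stating them, a minimal sketch of the one-step construction (6.6), assuming the Huber psi, the sample median for $B_n$, and the normalized MAD for $S_n$ (all illustrative choices):

    import numpy as np

    K = 1.345
    psi = lambda u: np.clip(u, -K, K)
    dpsi = lambda u: (np.abs(u) <= K).astype(float)

    def one_step(x):
        # One Newton step from a preliminary location estimate B_n, as in (6.6).
        b_n = np.median(x)                              # preliminary estimate
        s_n = 1.4826 * np.median(np.abs(x - b_n))       # robust scale (MAD)
        u = (x - b_n) / s_n
        return b_n + s_n * psi(u).mean() / dpsi(u).mean()

    rng = np.random.default_rng(4)
    x = np.concatenate([rng.standard_normal(190), rng.standard_normal(10) * 10])
    print(np.mean(x), np.median(x), one_step(x))        # one-step resists the outliers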
Lemma 6.3. Suppose that $\psi$, $\psi'$ satisfy the conclusion to Corollary 3.1. Suppose also that $E_F\psi(X) = 0$ and

$\sup\{ |B_n| ,\ |S_n - 1| \} = O\big( n^{-1/2} \log n \big) \quad \text{(a.s.)}.$

Then

(6.7)    $T_n = \Big\{ n^{-1} \sum_{i=1}^{n} \psi(X_i) - A_{\psi}(F)(S_n - 1) + B_n\big( E\psi'(X) - B_{\psi}(F) \big) \Big\} \Big/ E\psi'(X) + O\big( n^{-3/4} (\log n)^2 \big) \quad \text{(a.s.)}.$

If, in addition, $\psi(x) = -\psi(-x)$ and the conclusion of Corollary 3.2 holds, then

(6.8)    $T_n = n^{-1} \sum_{i=1}^{n} \psi(X_i) \big/ E\psi'(X) + O\big( n^{-3/4} (\log n)^2 \big) \quad \text{(a.s.)}.$
E'4Ji (X)
Proof of Lemma 6.3.
We have
and
n
-1
n
n
-1
The result now follows since
I
tjJ'(X.) +
i=l
n
-1
=0
LeF.~na
6.4.
n
I
Suppose
tjJ, tjJ
i
J~A
1
}§
;~{
2
N(d)
satisfy the conclusion to LerWoa 5.1, that (3.8)
N(d)~BN(d)
N(d) TN(d)
Then
I·l(d)
(6.9)
~I(X.) = O(n-~(log n)) (a.s.) (remel!lber.
).
holds, and that both
butions.
(a. 5.)
1
i=l
El~ (X)
_y
O(n i?(log n))
lli~d
has the same
-1 N(d)
1,
N(d)~(SN(d)-l)
li~it
have limiting distri-
distribution as
.
A
i~l ~)(\)-AtjJ(F) (SJl(d) -l)-B1jJ(F)0N(d)
E1fJ' (X)
Since the median satisfies the Bahadur representation, (6.8) and (6.9)
hold under a set of conditions sinilar to those of neD.ark 6.1.
I-Tote that
the one-steps have the asymptotic variance eiven by (6.1), so that stopping
times based on (6.2) are also asymptotically normal when
Tn
is a one-step.
It should be noted that results similar to those given here can be obtained by embedding the empirical process in a Brownian motion in a manner similar to that of Bickel and Rosenblatt (1973). This approach requires from the outset that $F$ be continuous; in contrast, the results given here make virtually no assumptions about $F$ if $\rho$ is continuous, while discontinuities in $\rho$ are handled by making $F$ behave nicely in neighborhoods of the discontinuities. To be fair, the embedding approach obtains (3.7) under the assumption that $T_n$, $S_n$ are almost surely convergent, but in working with M-estimators one generally can find rate results once strong consistency is assured. Thus, the methods of this paper are not only of interest in themselves but also yield results which compare quite favorably with those obtainable from embedding.
ACKNOWLEDGEMENT: I wish to thank Malay Ghosh for sending a preprint of his paper.
REFERENCES

[1] ANDREWS, D.F., BICKEL, P.J., HAMPEL, F.R., HUBER, P.J., ROGERS, W.H., and TUKEY, J.W. (1972). Robust Estimates of Location: Survey and Advances. Princeton: Princeton University Press.

[2] ANSCOMBE, F.J. (1952). Large sample theory of sequential estimation. Proc. Camb. Phil. Soc. 48, 600-617.

[3] BAHADUR, R.R. (1966). A note on quantiles in large samples. Ann. Math. Statist. 37, 577-580.

[4] BICKEL, P.J. (1975). One step Huber estimates in the linear model. J. Amer. Statist. Assoc. 70, 428-434.

[5] BICKEL, P.J. and WICHURA, M.J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Ann. Math. Statist. 42, 1656-1670.

[6] BICKEL, P.J. and ROSENBLATT, M. (1973). On some global measures of the deviations of density function estimates. Ann. Statist. 1, 1071-1095.

[7] BILLINGSLEY, P. (1968). Convergence of Probability Measures. New York: John Wiley & Sons, Inc.

[8] CARROLL, R.J. (1975). An asymptotic representation for M-estimators and linear functions of order statistics. Institute of Statistics Mimeo Series #1006, University of North Carolina at Chapel Hill.

[9] CHOW, Y.S. and ROBBINS, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 463-467.

[10] GHOSH, M. and MUKHOPHADYAY, N. (1975). Asymptotic normality of stopping times in sequential analysis. Unpublished paper.

[11] HUBER, P.J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35, 73-101.

[12] LOEVE, M. (1968). Probability Theory, 3rd ed. Princeton: Van Nostrand.