ASYMPTOTICALLY NONPARAMETRIC SEQUENTIAL SELECTION
PROCEDURES II - ROBUST ESTIMATORS
by
Raymond J. Carroll
Department of Statistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series #953
October, 1974
SUMMARY

Let $\pi_1, \ldots, \pi_K$ be $K$ independent populations with distributions $F(x; \theta_i)$ $(i = 1, \ldots, K)$ which are stochastically ordered; the basic ranking goal is to select the stochastically largest population. Sequential selection rules which are nonparametric in an asymptotic sense are given which solve the problem by using M-estimators and L-estimators (Huber (1972)). The results are also applied to selection of the largest location parameter and to construction of fixed-width confidence intervals for location parameters using the above robust estimators.
AMS 1970 Subject Classifications: Primary 62F07, 62G35; Secondary 62L99, 62E20
Key Words and Phrases: Ranking and Selection, Robust, M-estimators, Sequential Ranking, Nonparametric Selection, Linear Functions of Order Statistics.
1. INTRODUCTION

Let $\pi_1, \ldots, \pi_K$ be $K$ independent populations with distributions $F(x; \theta_i)$ $(i = 1, \ldots, K)$, where $\theta_i$ is some unknown indexing parameter and $F(x; \theta_i)$ is unknown. The notation $\theta_i \le \theta_j$ will mean that $F(x; \theta_i)$ is stochastically smaller than $F(x; \theta_j)$, i.e., $F(x; \theta_i) \ge F(x; \theta_j)$ for all $x$. If $\theta_{[1]} \le \cdots \le \theta_{[K]}$ is the true (unknown) correct ordering, the ranking goal is to devise sequential procedures for selecting the stochastically largest population which are in some sense nonparametric and robust.

In Carroll (1974), sequential rules of the form of Chow and Robbins (1965) were investigated and a general theorem (Theorem 1.1 below) was proposed for solving the problem using the indifference zone formulation of Bechhofer (1954); the sample median (under fairly weak conditions) and the sample mean (under rather restrictive conditions) were shown to satisfy this Theorem. In Section 3, the conditions on the sample mean are greatly relaxed.

The sample mean, however, is notoriously non-robust, which in this context implies that entirely too many observations are being taken in most cases. In order to reduce the number of observations and to obtain desired robustness properties, this paper shows under fairly weak conditions that the robust M-estimators of Huber (1964) and Hampel (1974) and certain linear functions of order statistics such as the trimmed mean also satisfy the conditions of Theorem 1.1; the proofs are given in Sections 4 - 6. The results, when specialized to the case of location parameters $(F(x; \theta) = F(x - \theta))$, yield the first proofs that the above estimators may be used to select the largest location parameter and to construct fixed-width confidence intervals.
For future use, define

(1.1) $P^* = \int \Phi^{K-1}(x + b)\, d\Phi(x)$,

(1.2) $\Omega(\delta) = \{\theta = (\theta_1, \ldots, \theta_K) : \theta_{[K]} \ge \theta_{[K-1]} + \delta\}$,

where $\Phi$ is the distribution function of the standard normal. Thus, for ease of presentation, the parameters $\theta_1, \ldots, \theta_K$ will be assumed to be numbers on the real line; however, it is very easy to extend the notation to cope with more arbitrary indexing sets. If "CS" indicates a correct selection, the goal of the ranking procedure will be to guarantee

(1.3) $\liminf_{\delta \to 0}\, \inf_{\Omega(\delta)}\, P(\mathrm{CS}) = P^*$.

For $i = 1, \ldots, K$, independent observations $X_{i1}, \ldots, X_{in}$ are taken from $F(\cdot; \theta_i)$, and statistics $T_i(n)$ and $\sigma_i(n)$ are formed such that

(1.4) $\sigma^{-1}(\theta_i)\, n^{1/2}(T_i(n) - \theta_i) \xrightarrow{L} \Phi$,

(1.5) $\sigma_i(n) \to \sigma(\theta_i)$ a.s.

The ranking procedure itself is to take $N_i(\delta)$ observations from $\pi_i$ $(i = 1, \ldots, K)$, where

(1.6) $N_i(\delta) = \text{first integer } n \ge (b\, \sigma_i(n)/\delta)^2$,

form the statistics $T_i(N_i(\delta))$, and make the natural decision.
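As a numerical aside not in the original, the constant $b$ in (1.1) can be recovered from given $P^*$ and $K$. The sketch below (plain Python; the function names and the trapezoidal quadrature over $[-8, 8]$ are my own illustrative choices) evaluates the integral and inverts it by bisection, using the fact that $P^*$ is increasing in $b$.

```python
import math

def Phi(x):
    # standard normal distribution function, via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def p_star(b, K, lo=-8.0, hi=8.0, m=2000):
    # trapezoidal approximation to (1.1): P* = integral of Phi^{K-1}(x+b) dPhi(x)
    h = (hi - lo) / m
    total = 0.0
    for i in range(m + 1):
        x = lo + i * h
        w = 0.5 if i in (0, m) else 1.0
        total += w * Phi(x + b) ** (K - 1) * phi(x)
    return total * h

def solve_b(p_target, K):
    # p_star is increasing in b, so bisection recovers b from P* and K
    lo, hi = 0.0, 10.0
    while hi - lo > 1e-8:
        mid = 0.5 * (lo + hi)
        if p_star(mid, K) < p_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $K = 2$, (1.1) reduces to $P^* = \Phi(b/\sqrt{2})$, so the solver can be checked against $b = \sqrt{2}\, \Phi^{-1}(P^*)$.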
Remark 1.1. The stopping rules (1.6) are independent of one another, which is certainly a drawback. However, improvement here must await the discovery of a nonparametric elimination rule for the location case. In the location case, if one chooses

(1.7) $\sigma^2(n) = \sum_{i=1}^{K} \sigma_i^2(n)$,

the stopping rule would be to take $N(\delta)$ observations from each population, where

(1.8) $N(\delta) = \text{first integer } n \ge (b\, \sigma(n)/\delta)^2$.

Using the generic terms $T_n$, $\sigma_n$, $\theta$, and $\sigma(\theta)$, Carroll (1974) proved the following:
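The rule (1.8) can be sketched directly in code: keep sampling until $n \ge (b\,\sigma(n)/\delta)^2$. In the sketch below the sample mean and sample variance stand in for $T_n$ and $\sigma^2(n)$ purely as placeholders (my choice, for illustration); any statistic satisfying the conditions of Theorem 1.1 could be substituted.

```python
import math

def stopping_time(draw, b, delta, n0=2):
    # (1.8): N(delta) = first integer n >= (b * sigma_n / delta)^2.
    # sigma_n^2 here is the sample variance -- a placeholder; any strongly
    # consistent variance estimate may be used instead.
    xs = [draw() for _ in range(n0)]
    while True:
        n = len(xs)
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / (n - 1)
        if n >= (b * math.sqrt(var) / delta) ** 2:
            return n, mean  # sample size taken and the terminal statistic
        xs.append(draw())
```

For data with true standard deviation $\sigma$, the rule stops near $(b\sigma/\delta)^2$ observations as $\delta \to 0$, which is the Chow-Robbins behavior the paper relies on.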
THEOREM 1.1. If the parameter space $\Omega(0)$ is compact, then (1.3) holds if

(A1) $\sigma(\theta)$ is continuous in $\theta$,

and if for each $\theta_0$, $\varepsilon > 0$, $\beta > 0$, and $z \in R$, there exist $J$, $\eta > 0$, $c > 0$ such that $n \ge J$ and $|\theta - \theta_0| \le \eta$ imply

(A2) $\sigma_n \to \sigma(\theta_0)$ a.s. uniformly,

(A3) $P_\theta\{\sigma^{-1}(\theta)\, n^{1/2}(T_n - \theta) \le z\} \to \Phi(z)$ uniformly,

(A4) $P_\theta\{n^{1/2}|T_n - T_m| > \beta \text{ for some } m \text{ with } |m - n| < cn\} < \varepsilon$.

Note that (A2) and (A3) mean that (1.4) and (1.5) hold continuously. (A4) is an extension of Anscombe's (1952) condition.
COROLLARY 1.1. If $F(x; \theta) = F(x - \theta)$ and (1.4) and (1.5) hold, then, using $N(\delta)$ given in (1.8), (1.3) holds if (A4) holds at $\theta = 0$ and if (1.10) holds.
Remark 1.2. In Carroll (1974) there was also a condition that for all $\varepsilon > 0$ there exist $\eta$ and $d$ such that $|\theta - \theta_0| < \eta$ implies

(1.11) $P_\theta\{\sigma_n \le d\} \le \varepsilon$.

This condition may be removed in Theorem 1.1 simply by defining $\sigma_n$ to be larger than some constant, say $100^{-1000}$. The condition is unnecessary in Corollary 1.1.

The following convention will be used:

(1.12) $X_n(\theta) \to X(\theta_0)$ a.s. uniformly

if for all $\varepsilon > 0$ there exist $N$, $\eta$ such that $|\theta - \theta_0| < \eta$ implies

(1.13) $P_\theta\{|X_n(\theta) - X(\theta_0)| > \varepsilon \text{ for some } n \ge N\} \le \varepsilon$.

A similar meaning is attached to other types of convergence.
2. PRELIMINARY RESULTS

The Propositions of this section will be used repeatedly, and while fairly simple, should be of some interest in themselves.

Proposition 2.1. Let $A$ be an indexing set, and let $Y_n(a)$ $(a \in A)$ be random variables with distributions $F(\cdot; a)$, with zero mean and uniformly bounded variances (say by $M$). Then if $0 \le \delta < 1/2$ and $S_n(a) = \sum_{j=1}^{n} Y_j(a)$,

(2.1) $n^{\delta - 1} S_n(a) \to 0$ a.s. uniformly.

Proof: The proof follows the "method of sequences" of Chung (1968). Let $\alpha > (1 - 2\delta)^{-1}$, where $\alpha$ is chosen to be an integer. Then, by Chebyshev's inequality,

(2.2) $n^{\alpha(\delta - 1)} S_{n^\alpha}(a) \to 0$ a.s. uniformly.

Let $m(n) = n^\alpha$ and

(2.3) $D_n(a) = \max \Big| \sum_{j=m(n)+1}^{m(n)+k} Y_j(a) \Big|$,

where the $\max$ is taken over $1 \le k < (n+1)^\alpha - n^\alpha$. Kolmogorov's Inequality yields

(2.4) $n^{\alpha(\delta - 1)} D_n(a) \to 0$ a.s. uniformly,

and the proof is completed by noting that for $n^\alpha \le k < (n+1)^\alpha$, $|S_k(a)| \le |S_{n^\alpha}(a)| + D_n(a)$.
Proposition 2.2. Let $X_i(\theta)$ have mean $0$ and variance $\sigma^2(\theta)$ under $F(\cdot; \theta)$. Then

(2.6) $\sigma^{-1}(\theta_n)\, n^{-1/2} \sum_{i=1}^{n} X_i(\theta_n) \xrightarrow{L} N(0, 1)$

if for any sequence $\theta_n$ converging to $\theta_0$, and for all $\varepsilon > 0$,

(2.7) $\sigma^{-2}(\theta_n) \int_{A_n(\varepsilon)} x^2\, dF(x; \theta_n) \to 0$,

where $A_n(\varepsilon) = \{x : |x| > \sigma(\theta_n)\, \varepsilon\, n^{1/2}\}$.

Proof: This follows immediately from the Normal Convergence Criterion given in Loève (1963), page 295.

Corollary 2.1. (2.6) holds if the $X_i(\theta)$ are uniformly bounded random variables and if $\sigma^2(\theta)$ is continuous in $\theta$.
Proposition 2.3. (Schuster (1969)) Let $F(x)$ be a distribution and let $F_n(x)$ be the empirical distribution. Then, there is a universal constant $C$ such that

(2.8) $P_F\{\sup_x |F_n(x) - F(x)| > \varepsilon\} \le C \exp\{-2n\varepsilon^2\}$.

Corollary 2.2. For $0 \le \delta < 1/2$,

(2.9) $n^\delta \sup_x |F_n(x) - F(x)| \to 0$ a.s. uniformly.
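Proposition 2.3 and Corollary 2.2 lend themselves to a quick numerical illustration. The Monte Carlo sketch below (Python, standard library only; the function names and the choice of the uniform distribution for $F$ are mine) computes $\sup_x |F_n(x) - F(x)|$ from the order statistics and estimates the exceedance probability, which stays far below the exponential bound.

```python
import random

def sup_dev(sample):
    # sup_x |F_n(x) - F(x)| for F the uniform distribution on (0, 1),
    # computed exactly from the order statistics
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        d = max(d, abs((i + 1) / n - x), abs(x - i / n))
    return d

def exceed_prob(n, eps, reps=2000, seed=1):
    # Monte Carlo estimate of P{ sup_x |F_n(x) - F(x)| > eps }
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(reps)
        if sup_dev([rng.random() for _ in range(n)]) > eps
    )
    return hits / reps
```

With $n = 50$, the estimated exceedance probability at $\varepsilon = 0.3$ is essentially zero, while at $\varepsilon = 0.05$ the supremum exceeds the threshold nearly always, consistent with the $n^{-1/2}$ rate implicit in (2.9).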
3. SAMPLE MEANS

In this section, Theorem 1.1 is proved for sample means and variances under much less restrictive conditions than those given in Carroll (1974). No symmetry assumptions are made, nor are there any conditions on the inverse of $F(x; \theta)$.

THEOREM 3.1. Suppose (2.7) holds, and that

(3.1) $\mathrm{Var}_\theta(X)$ is continuous in $\theta$,

(3.2) $E_\theta(X - E_\theta(X))^4$ is bounded in some neighborhood of every $\theta_0$.

Then (A1) - (A4) hold.

Proof: (A2) holds because of (3.2) and Proposition 2.1. Then (3.1) will imply (A1). (A3) holds because of Proposition 2.2, and (A4) follows by extending Anscombe's (1952) proof by means of Kolmogorov's Inequality and (3.1).
4. HUBER'S M-ESTIMATORS

Let $X_1, X_2, \ldots$ have a distribution $F(x; \theta)$, and let $\psi$ be an increasing skew-symmetric function for which $E_\theta \psi(X - \theta) = 0$. It will be assumed throughout that the $\theta$ satisfying this equation is unique. Huber (1964) defines an M-estimator $T_n$ as the solution to the equation

(4.1) $n^{-1} \sum_{j=1}^{n} \psi(X_j - T_n) = 0$.

Note that $T_n$, while location invariant, is not scale invariant. To get this, one needs some kind of scale invariant measure of dispersion (such as the interquartile range) $S_n$, and then defines $T_{n2}$ by

(4.2) $n^{-1} \sum_{j=1}^{n} \psi\Big(\frac{X_j - T_{n2}}{S_n}\Big) = 0$.

In this section, conditions are given for which the solutions to (4.1) and (4.2) satisfy (A1) - (A4). Note that one immediately learns from this that some Huber M-estimators satisfy the conditions of Geertsema (1972), and thus may be used in nonparametric robust selection and confidence intervals.
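Solving (4.1) numerically is straightforward because $\psi$ is increasing, so $\sum_i \psi(X_i - t)$ is nonincreasing in $t$ and the root is bracketed by the sample extremes. The sketch below (Python; the bend constant $k = 1.345$ is a conventional choice, not taken from this paper, and the names are illustrative) finds $T_n$ by bisection.

```python
def huber_psi(x, k=1.345):
    # Huber's psi_0: linear for |x| < k, constant k*sign(x) in the tails
    if x > k:
        return k
    if x < -k:
        return -k
    return x

def m_estimate(xs, k=1.345, tol=1e-10):
    # Solve (4.1): sum_i psi(x_i - t) = 0 by bisection.  The sum is
    # nonincreasing in t, so the root lies between min(xs) and max(xs).
    lo, hi = min(xs), max(xs)
    def g(t):
        return sum(huber_psi(x - t, k) for x in xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For data with one gross outlier, say [0.0, 0.0, 0.0, 0.0, 100.0], the estimate stays near 0 because the outlier's influence is capped at $k$, while the sample mean is 20.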
The differentiability conditions on the $\psi$ functions are needed only to invoke Taylor's Theorem but do not seem to be crucial. Although Huber's favorite $\psi$ function, $\psi_0(x) = x$ for $|x| < K$, $\psi_0(x) = K\, \mathrm{sign}\, x$ for $|x| \ge K$, does not satisfy the differentiability conditions, it can be uniformly approximated by "nice" functions. It will be assumed throughout that $\psi$, $\psi'$, and $\psi''$ exist and are bounded respectively by $M(\psi)$, $M(\psi')$, and $M(\psi'')$. It will further be assumed that

(4.3) $E_\theta \psi'(X - \theta) > 0$ for all $\theta$.
For ease of exposition, the estimators formed from (4.1) will first be studied. The asymptotic variance of $T_n$ under $F(x; \theta)$ is

(4.4) $\sigma^2(\theta) = \dfrac{\int \psi^2(t - \theta)\, dF(t; \theta)}{\big\{\int \psi'(t - \theta)\, dF(t; \theta)\big\}^2}$

and will be estimated by

(4.5) $\sigma_n^2 = \dfrac{\int \psi^2(t - T_n)\, dF_n(t; \theta)}{\big\{\int \psi'(t - T_n)\, dF_n(t; \theta)\big\}^2}$,

where $F_n(t; \theta)$ is the empirical distribution based on a sample of size $n$ from $F(x; \theta)$. The proof of the following Lemma follows easily from the Mean Value Theorem and boundedness of $\psi'$ and $\psi''$, together with application of the Helly-Bray Theorem and Proposition 2.3.

LEMMA 4.1. Assume (4.3) holds. Then (A1) and (A2) hold, if $T_n - \theta \to 0$ a.s. uniformly.
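The plug-in estimate (4.5) is just a pair of sample averages; a minimal sketch (Python, reusing a Huber-type $\psi$ with the illustrative bend $k = 1.345$; for $\psi_0$ the derivative is the indicator of $|x| < k$):

```python
def huber_psi(x, k=1.345):
    # Huber's psi_0, written as a clamp
    return max(-k, min(k, x))

def var_estimate(xs, t, k=1.345):
    # (4.5): empirical analogue of (4.4), evaluated at the estimate t.
    # The denominator averages psi'(x - t) = 1{|x - t| < k}.
    n = len(xs)
    num = sum(huber_psi(x - t, k) ** 2 for x in xs) / n
    den = sum(1.0 if abs(x - t) < k else 0.0 for x in xs) / n
    return num / den ** 2
```

Dividing this into $n$ and comparing with $\delta^2$ is exactly what the stopping rules (1.6) and (1.8) require.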
The following Lemma becomes useful after it is shown that $T_n - \theta$ is almost the sum of i.i.d. random variables. Until Proposition 4.1 is established, assume $T_n - \theta \to 0$ a.s. uniformly, and let $\tilde\sigma^2(\theta) = E_\theta \psi^2(X - \theta)$.

LEMMA 4.2.

(4.6) $\tilde\sigma^{-1}(\theta)\, n^{1/2} \Big( n^{-1} \sum_{i=1}^{n} \psi(X_i - \theta) \Big) \xrightarrow{L} \Phi$ uniformly.

Proof: Since $\psi$ is bounded, it is sufficient to show by Corollary 2.1 that $\tilde\sigma^2(\theta)$ is continuous in $\theta$. This follows in a manner similar to Lemma 4.1.
It will now be shown that (A3) holds. By a Taylor's expansion,

(4.7) $n^{1/2}(T_n - \theta) = \dfrac{n^{-1/2} \sum_{i=1}^{n} \psi(X_i - \theta)}{n^{-1} \sum_{i=1}^{n} \psi'(X_i - \theta)} + \{n^{1/4}(T_n - \theta)\}^2\, \dfrac{n^{-1} \sum_{i=1}^{n} \psi''(Z_i(\theta))}{2\, n^{-1} \sum_{i=1}^{n} \psi'(X_i - \theta)}$,

where $Z_i(\theta)$ is between $X_i - \theta$ and $X_i - T_n$.
Since

(4.8) $n^{-1} \sum_{i=1}^{n} \psi'(X_i - \theta) \to E_\theta \psi'(X - \theta)$ a.s. uniformly

by Proposition 2.1, and since $\psi''$ is bounded, by Lemma 4.2, (A3) follows if, for $0 \le \delta < 1/2$,

(4.9) $n^\delta (T_n - \theta) \to 0$ a.s. uniformly.

This requires two steps.
Proposition 4.1. $T_n - \theta \to 0$ a.s. uniformly.

Proof: Since $\psi$ is increasing, if $|\theta - \theta_0| < \eta$,

(4.10) $n^{-1} \sum_{i=1}^{n} \psi(X_i - \theta_0 - \eta) \le n^{-1} \sum_{i=1}^{n} \psi(X_i - \theta) \le n^{-1} \sum_{i=1}^{n} \psi(X_i - \theta_0 + \eta)$.

By invoking Proposition 2.1, the proof is complete.
LEMMA 4.3. For $0 \le \delta < 1/2$, if $\psi'(0) > 0$ and $F(\theta_0 + \varepsilon; \theta_0) - F(\theta_0 - \varepsilon; \theta_0) \ge b(\varepsilon) > 0$ for all $\varepsilon > 0$, then

(4.11) $n^\delta (T_n - \theta) \to 0$ a.s. uniformly,

and (A3) holds.
Proof: By invoking Taylor's Theorem,

(4.12) $n^\delta (T_n - \theta) = \dfrac{n^{\delta - 1} \sum_{i=1}^{n} \psi(X_i - \theta)}{n^{-1} \sum_{i=1}^{n} \psi'(Z_i(\theta))}$,

where $Z_i(\theta)$ is between $X_i - \theta$ and $X_i - T_n$. By Proposition 2.1, it suffices to show that

(4.13) $n^{-1} \sum_{i=1}^{n} \psi'(Z_i(\theta))$ is bounded away from $0$ a.s. uniformly as $n \to \infty$.

But this is true since $\psi'(x) \ge 0$, with $\psi'(0) > 0$ and $P_\theta(\theta_0 - \varepsilon < X \le \theta_0 + \varepsilon) \ge b(\varepsilon) > 0$ for $\varepsilon > 0$, by using Proposition 4.1.
THEOREM 4.1. If $\psi'(0) > 0$, if $F(\theta_0 + \varepsilon; \theta_0) - F(\theta_0 - \varepsilon; \theta_0) \ge b(\varepsilon) > 0$ for all $\varepsilon > 0$, and if (4.3) holds, then (A1) - (A4) hold for $T_n$ defined by (4.1).
Proof: It is sufficient to check (A4), the extended version of uniform continuity in probability (Anscombe (1952)). From (4.7) and Lemma 4.3, (A4) need only be checked for the random variables

$H_n = \dfrac{n^{-1/2} \sum_{i=1}^{n} \psi(X_i - \theta)}{n^{-1} \sum_{i=1}^{n} \psi'(X_i - \theta)}$.

Since, for $0 \le \delta < 1/2$,

(4.14) $n^\delta \Big( n^{-1} \sum_{i=1}^{n} \{\psi'(X_i - \theta) - E_\theta \psi'(X - \theta)\} \Big) \to 0$ a.s. uniformly,

this follows from Anscombe's proof and Kolmogorov's Inequality.
Note that the above results do not require symmetry of $F(x; \theta)$ about $\theta$, although this seems to be a common condition (compare Huber (1964), Geertsema (1972), Sen and Ghosh (1971)). However, for investigating the estimators $T_{n2}$ formed from (4.2), symmetry will have to be imposed (because of Lemma 4.5). It will also be assumed throughout the rest of this section that

(4.15) $\psi'(x) = \psi'(-x)$.

The estimators $S_n$ will be assumed to satisfy, for some real $\xi(\theta)$,

(4.16) $S_n \to \xi(\theta)$ a.s. uniformly,

and, for all $0 \le \delta < 1/2$,

(4.17) $n^\delta (S_n - \xi(\theta)) \to 0$ a.s. uniformly.

Using results of Bahadur (1966), one may find relatively mild conditions for which the interquartile range satisfies (4.17). Use of the interquartile range for $S_n$ is studied in Andrews, et al (1972). The asymptotic variance of $T_{n2}$ under $F(x; \theta)$ is now

(4.18) $\sigma^2(\theta) = \xi^2(\theta)\, \dfrac{\int \psi^2\big(\frac{t - \theta}{\xi(\theta)}\big)\, dF(t; \theta)}{\big\{\int \psi'\big(\frac{t - \theta}{\xi(\theta)}\big)\, dF(t; \theta)\big\}^2}$

and is estimated by

(4.19) $\sigma_n^2 = S_n^2\, \dfrac{\int \psi^2\big(\frac{t - T_{n2}}{S_n}\big)\, dF_n(t; \theta)}{\big\{\int \psi'\big(\frac{t - T_{n2}}{S_n}\big)\, dF_n(t; \theta)\big\}^2}$.
Much of the work below follows in a manner similar to the proofs for the estimates defined by (4.1); only those proofs which require extra effort are presented.

Proposition 4.2. $T_{n2} - \theta \to 0$ a.s. uniformly.

LEMMA 4.4. Assume that (4.3) holds. Then (A1) and (A2) hold.

LEMMA 4.5. Let $\tilde\sigma^2(\theta) = E_\theta \psi^2\big(\frac{X - \theta}{\xi(\theta)}\big)$. Then, if $E_\theta |X - \theta|^2$ is bounded in some neighborhood of each $\theta_0$,

(4.20) $\tilde\sigma^{-1}(\theta)\, n^{1/2} \Big[ n^{-1} \sum_{i=1}^{n} \psi\Big(\frac{X_i - \theta}{S_n}\Big) \Big] \xrightarrow{L} \Phi$ uniformly.

Proof: Using Taylor's Theorem twice, it suffices to show

(4.21) $n^{1/4} (S_n - \xi(\theta)) \to 0$ a.s. uniformly

and

(4.22) $\{S_n - \xi(\theta)\}\, n^{-1/2} \sum_{i=1}^{n} \psi'\Big(\frac{X_i - \theta}{\xi(\theta)}\Big) \Big(\frac{X_i - \theta}{\xi(\theta)}\Big) \to 0$ a.s. uniformly.

(4.21) follows from (4.17) and Chebyshev's Inequality. Since $\psi'$ is symmetric about $0$, $E_\theta \psi'\big(\frac{X - \theta}{\xi(\theta)}\big)\big(\frac{X - \theta}{\xi(\theta)}\big) = 0$, so that (4.22) follows from (4.21) and Proposition 2.1.
LEMMA 4.6. For $0 \le \delta < 1/2$, under the conditions of Lemma 4.3,

$n^\delta (T_{n2} - \theta) \to 0$ a.s. uniformly.

Now for the main theorem of this section, stating explicitly all the necessary conditions.
THEOREM 4.2. Suppose the conditions of Theorem 4.1 hold, that $F(x; \theta)$ is symmetric about $\theta$, that $\psi'(x) = \psi'(-x)$, that $E_\theta |X - \theta|^2$ is bounded in some neighborhood of each $\theta_0$, and that $S_n$ satisfies (A4). Then (A1) - (A4) hold for $T_{n2}$ defined by (4.2).
Proof: By Taylor's Theorem,

(4.23) $T_{n2} - \theta = S_n\, \dfrac{n^{-1} \sum_{i=1}^{n} \psi\big(\frac{X_i - \theta}{S_n}\big)}{n^{-1} \sum_{i=1}^{n} \psi'\big(\frac{X_i - \theta}{S_n}\big)} + (T_{n2} - \theta)^2\, \dfrac{n^{-1} \sum_{i=1}^{n} \psi''(Z_{in})}{2 S_n\, n^{-1} \sum_{i=1}^{n} \psi'\big(\frac{X_i - \theta}{S_n}\big)}$.

By Proposition 2.1, and Lemmas 4.5 and 4.6, (A3) is completed. Also, it suffices to prove (A4) for

(4.24) $H_n = \Big\{ n^{-1} \sum_{i=1}^{n} \psi\Big(\frac{X_i - \theta}{S_n}\Big) \Big\} \Big\{ n^{-1} \sum_{i=1}^{n} \psi'\Big(\frac{X_i - \theta}{S_n}\Big) \Big\}^{-1} = \dfrac{n^{-1} A_n + A_n^* (S_n - \xi(\theta)) (S_n \xi(\theta))^{-1} + o(n^{-1/2})}{E_\theta \psi'\big(\frac{X - \theta}{\xi(\theta)}\big) + o(1)}$,

by expanding each term in the numerator about $(X_i - \theta)/\xi(\theta)$, where

(4.25) $A_n = \sum_{i=1}^{n} \psi\Big(\frac{X_i - \theta}{\xi(\theta)}\Big)$,

(4.26) $A_n^* = n^{-1} \sum_{i=1}^{n} \psi'\Big(\frac{X_i - \theta}{\xi(\theta)}\Big) \Big(\frac{X_i - \theta}{\xi(\theta)}\Big)$,

and $h_n = o(g_n)$ means $h_n / g_n \to 0$ a.s. uniformly. Then, assuming (without loss of generality) that $m > n$, and letting $c(\theta) = E_\theta \psi'\big(\frac{X - \theta}{\xi(\theta)}\big)$,

(4.27) $H_n - H_m = c^{-1}(\theta)\Big(\frac{A_n}{n} - \frac{A_m}{m}\Big) + c^{-1}(\theta)\Big\{A_n^*\, \frac{S_n - \xi(\theta)}{\xi(\theta) S_n} - A_m^*\, \frac{S_m - \xi(\theta)}{\xi(\theta) S_m}\Big\} + o(n^{-1/2})$.

Since $S_n$ satisfies (A4), this completes the proof.
Remark 4.1. Carroll (1974) has shown that the interquartile range $S_n$ satisfies (A4) under mild conditions requiring, roughly, that $F(x; \theta)$ behave continuously (as $\theta \to \theta_0$) in some neighborhood of $\xi_{1/4}$ and $\xi_{3/4}$, the relevant quartiles of $F(x; \theta_0)$.
5. M-ESTIMATORS OF HAMPEL AND OTHERS

Hampel (1974) (see also Andrews, et al (1972)) proposed estimates defined as the solution of (4.1) or (4.2) closest to the median, but generated by three-part $\psi$ functions, the prototype of which is

(5.1) $\psi(x) = -\psi(-x) = \begin{cases} x & 0 \le x < a \\ a & a \le x < b \\ a\, \dfrac{c - x}{c - b} & b \le x < c \\ 0 & x \ge c. \end{cases}$

These estimators seem to have very good robustness properties. Again, as in Section 4, the conditions which guarantee (A1) - (A4) will not be strictly satisfied by (5.1), but, as before, this will not present much of a problem in the applications.
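The prototype (5.1) translates directly into code. A minimal sketch (Python; the bend points $a = 2$, $b = 4$, $c = 8$ are arbitrary illustrative choices, not values from the paper):

```python
def hampel_psi(x, a=2.0, b=4.0, c=8.0):
    # three-part psi of (5.1); psi(-x) = -psi(x) handles negative arguments
    s = -1.0 if x < 0 else 1.0
    x = abs(x)
    if x < a:
        return s * x                      # linear piece
    if x < b:
        return s * a                      # flat piece
    if x < c:
        return s * a * (c - x) / (c - b)  # redescending piece
    return 0.0                            # zero beyond c: no influence at all
```

Because $\psi$ vanishes beyond $c$, the corresponding estimating equation can have several roots; this is exactly the non-uniqueness issue the section goes on to treat, and Hampel's estimator takes the root closest to the median.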
There are a number of problems in showing that estimators defined by general $\psi$ functions (e.g., by (5.1)) satisfy (A1) - (A4). First of all, the solution to

(5.2) $\lambda(\beta; \theta) = \int \psi(x - \beta)\, dF(x; \theta) = 0$

may not be unique. It will be assumed that on any closed interval contained in the support of $F(\cdot; \theta)$, the number of solutions to (5.2) is finite for each $\theta$. A quick check will show that the increasing nature of $\psi$ was used in Proposition 4.1, Lemma 4.3, and in the fact that

(5.3) $E_\theta \psi'(X - \theta) > 0$.

From now on, (5.3) will also be assumed.

To get results analogous to Theorems 4.1 and 4.2 for general $\psi$ functions, one merely needs the conditions of those theorems, the two conditions mentioned above, and the conditions in the two lemmas below. For simplicity, only the estimators defined by (4.1) will be used. Lemma 5.1 is based on a proof of Huber (1967).
LEMMA 5.1. $T_n - \theta \to 0$ a.s. uniformly if there exists $A > 0$ such that for all $\varepsilon > 0$ there are an $N$ and an $\eta$ for which $|\theta - \theta_0| < \eta$ implies

(5.4) $P_\theta\{\text{fewer than } n/100 \text{ of } X_1, \ldots, X_n \text{ satisfy } |X_i - \theta| \le A \text{ for some } n \ge N\} \le \varepsilon$.

PROOF: Note that this is obvious if $\psi$ is like (5.1), since if at least $n/100$ of the $X_i$ satisfy $|X_i - \theta| \le A$, then $|T_n - \theta| < 2A$ for $A$ large, and one can invoke Proposition 2.1 and (4.3).
Now, if $T_n - \theta \to 0$ a.s. uniformly does not hold, there are a sequence $(N_j, \theta_j)$ and $\varepsilon_0 > 0$ for which

(5.5) $P_{\theta_j}\{|T_n - \theta_j| \ge \varepsilon_0 \text{ for some } n \ge N_j\} \ge \varepsilon_0$.

Since the number of zeros of $\lambda(\beta; \theta_0)$ is finite and $\lambda$ is continuous in its arguments, for all open subsets $U$ of $K = [-A, A]$ containing those zeros, there is a small $\varepsilon$ such that, for $j$ large enough,

(5.6) $|\lambda(\beta; \theta_j)| \ge \varepsilon$ for $\beta \in K - U$,

and

(5.7) $U$ contains all zeros of $\lambda(\beta; \theta_j)$.

Since

(5.8) $\lambda(\beta; \theta_j)$ is continuous in $\beta$,

for all $\beta \in K - U$ one can choose a neighborhood $U_\beta$ for which $\beta' \in U_\beta$ implies

(5.9) $|\lambda(\beta'; \theta_j)| \ge \varepsilon/2$.

By using the proof of Huber (1967), one finds

(5.10) $P_{\theta_j}\{T_n \in U \text{ for all } n \ge N_j\} \ge 1 - \varepsilon$ for $j$ large.

Since $T_n$ is the solution of (4.1) closest to the median, this completes the proof.
LEMMA 5.2. Assume that for each $\theta_0$ there is $A > 0$ for which

(5.11) $0 < b \le \psi'(y)$, $y \in [-A, A]$,

and

(5.12) there exist $\eta_0, \beta_0 > 0$ such that $|\theta - \theta_0| < \eta_0$ implies $P_\theta\{|X - \theta_0| \le A\} \ge \beta_0$.

Then there is $d > 0$ such that for all $\varepsilon > 0$ there exist $N$, $\eta$ such that $|\theta - \theta_0| < \eta$ implies

(5.13) $P_\theta\Big\{n^{-1} \sum_{i=1}^{n} \psi'(Z_{in}(\theta)) \ge d \text{ for all } n \ge N\Big\} \ge 1 - \varepsilon$,

where $Z_{in}(\theta)$ is derived from the Taylor's expansion of $\sum_{i=1}^{n} \psi(X_i - T_n)$ about $\theta$.

Proof: The proof follows easily from the conditions and Lemma 5.1.

Remark 5.1. The conditions of Lemma 5.2 will be satisfied by (5.1) if $F(x; \theta)$ puts enough probability near $\theta$. Note that (5.13) is exactly what is needed in Lemma 4.3.
6. L-ESTIMATORS

Linear functions of order statistics, especially the trimmed mean, are quite popular robust estimators. If $X_1, X_2, \ldots, X_n$ are independent with continuous distribution $F(x; \theta)$, and if $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ denote the order statistics, then the L-estimators are

(6.1) $T_n = n^{-1} \sum_{i=1}^{n} J(i/n)\, X_{(i)}$,

where $J$ is some function. The $\alpha$-trimmed mean is defined by

(6.2) $J(t) = (1 - 2\alpha)^{-1}$ if $t \in [\alpha, 1 - \alpha]$; $J(t) = 0$ if $t \notin [\alpha, 1 - \alpha]$.
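In code, (6.1) with the weights (6.2) reads as follows (a sketch; note that with the discrete weights $J(i/n)$ the estimate can differ slightly, at small $n$, from the usual "drop the extremes and average" trimmed mean):

```python
def l_estimate(xs, J):
    # the L-estimator (6.1): n^{-1} * sum_i J(i/n) X_(i)
    xs = sorted(xs)  # order statistics
    n = len(xs)
    return sum(J((i + 1) / n) * x for i, x in enumerate(xs)) / n

def trimmed_J(alpha):
    # the weight function (6.2) of the alpha-trimmed mean
    def J(t):
        return 1.0 / (1.0 - 2.0 * alpha) if alpha <= t <= 1.0 - alpha else 0.0
    return J
```

With alpha = 0 this is the sample mean; increasing alpha discards a growing fraction of each tail while rescaling the remaining weights by $(1 - 2\alpha)^{-1}$.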
In this section, the $\alpha$-trimmed mean is shown to satisfy (A1) - (A4) under fairly minimal conditions, and then general conditions on $J$ are given. The method of proof here closely follows Moore (1968), who proved the following:
THEOREM 6.1. Let

(6.3) $\sigma^2(\theta) = 2 \iint_{x < y} J(F(x; \theta))\, J(F(y; \theta))\, F(x; \theta)\, \{1 - F(y; \theta)\}\, dx\, dy < \infty$,

(6.4) $\mu(\theta) = \int_0^1 F^{-1}(x; \theta)\, J(x)\, dx$,

and suppose

(6.5) $J$ is continuous on $[0, 1]$ except for jump discontinuities at $a_1, \ldots, a_M$; $J'$ is continuous and of bounded variation on $[0, 1] - \{a_1, \ldots, a_M\}$; and $F^{-1}$ is continuous at $a_1, \ldots, a_M$.

Then

(6.6) $n^{1/2}(T_n - \mu(\theta)) \xrightarrow{L} N(0, \sigma^2(\theta))$.
LEMMA 6.1. Suppose $J$ satisfies (6.2) and that the distribution of

(6.7) $Y_i = \int_0^1 J(x)\, [u_i(x) - x]\, dF^{-1}(x; \theta)$

satisfies (2.7), where

(6.8) $u_i(x) = 1$ if $F(X_i; \theta) \le x$, $= 0$ otherwise.

Then (A1) - (A4) hold, with $\sigma_n^2$ being the obvious estimate of $\sigma^2(\theta)$ based on $F_n(x; \theta)$.
Proof: (A1) and (A2) obviously hold from (6.3). Let

(6.9) $W_n(x) = n^{1/2}(U_n(x) - x)$,

where $U_n$ is the empirical distribution of the uniform random variables $F(X_i; \theta)$. Then, since $F^{-1}(x; \theta) \to F^{-1}(x; \theta_0)$, using Proposition 2.3 yields, with probability 1,

(6.10) $T_n - \mu(\theta) = o(n^{-1/2}) + n^{-1/2} \int_0^1 F^{-1}(x; \theta)\, J(x)\, dW_n(x)$.

Moore's (1968) proof for his $I_{1n}$ now suffices, together with the proof of Theorem 3.1, since Moore arranges the right-hand side of (6.10) as a sum of i.i.d. random variables.
THEOREM 6.2. Suppose that (6.5) holds and that (A1) and (A2) are true (which will be the case in translation parameter families or if $J$ vanishes outside a compact set). Then if $E_\theta |X|$ is bounded in some neighborhood of each $\theta_0$, and if the random variables in (6.7) satisfy (2.7), then (A3) and (A4) hold.

Proof: The proof will be given here only in the case where $J'$ is continuous. Following Moore (1968),

(6.11) $n^{1/2}(T_n - \mu(\theta)) = I_{n1} + I_{n2} + I_{n3}$,
where

(6.12) $I_{n1} = \int_0^1 F^{-1}(x; \theta)\, J'(x)\, W_n(x)\, dx + \int_0^1 F^{-1}(x; \theta)\, J(x)\, dW_n(x)$,

$I_{n2} = \int_0^1 F^{-1}(x; \theta)\, [J'(V_n(x)) - J'(x)]\, W_n(x)\, dU_n(x)$,

$I_{n3} = n^{-1/2} \int_0^1 F^{-1}(x; \theta)\, J'(x)\, W_n(x)\, dW_n(x)$,

and $V_n(x)$ is between $U_n(x)$ and $x$. By Proposition 2.3, $I_{n2} \to 0$ uniformly, and, by Theorem 3.1, since

(6.13) $I_{n1} = -\int_0^1 J(x)\, W_n(x)\, dF^{-1}(x; \theta)$,

(6.14) $I_{n1} \xrightarrow{L} N(0, \sigma^2(\theta))$ uniformly.

Thus, to show (A3) and (A4), it suffices to show that $K_n(\theta)$, defined by

(6.15) $K_n(\theta) = \int_0^1 \{n^{1/4}(U_n(x) - x)\}^2\, d\{J'(x)\, F^{-1}(x; \theta)\}$,

satisfies

(6.16) $K_n(\theta) \to 0$ in probability uniformly.

This will be true if for $\{\theta : |\theta - \theta_0| < \eta\}$ there exists $K_n^*$ for which

(6.17) $K_n(\theta) \le K_n^*$ a.s.
By using Hölder's inequality and bringing the expectation inside, one gets

(6.18) $n E_\theta K_n^*(\theta) \le n^{-2} \int_0^1 \{3(nx(1 - x))^2 + nx(1 - x)(1 - 6x(1 - x))\}\, d\{J'(x)\, F^{-1}(x; \theta)\}$.

Thus, it will be sufficient to prove that, if $V(x)$ is the variation of $J'(y)\, F^{-1}(y; \theta)$ on $[1/2, x]$,

(6.19) $\int_{1/2}^1 x(1 - x)\, dV(x) < K_0 < \infty$.

Since there is a uniform constant $C$ such that $V(x) \le C F^{-1}(x; \theta)$, integration by parts and the bound on $E_\theta |X|$ establishes (6.19) and completes the proof.
7. FIXED WIDTH CONFIDENCE INTERVALS AND SELECTION

In this section, the results in Sections 4 - 6 are discussed in relationship to the important and often discussed problems of selecting the largest location parameter (see Robbins, Sobel and Starr (1968), Geertsema (1972)) and constructing a fixed-width confidence interval for a parameter (see Chow and Robbins (1965), Sen and Ghosh (1971)).
For the selection of the largest location parameter, suppose one has $K$ populations $\pi_1, \ldots, \pi_K$ with distributions $F(x - \theta_i)$ $(i = 1, \ldots, K)$, where $\theta_1, \ldots, \theta_K$ are unknown, and $F$ is unknown. If $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[K]}$ denotes the correct (unknown) ordering, then one wishes to prove that for suitably defined statistics $T_n$,

(7.1) $\liminf_{\delta \to 0}\, \inf_{\Omega(\delta)}\, P(\mathrm{CS}) = P^*$

independently of $F$, where the stopping rule (following Geertsema (1972)) is to take $N(\delta)$ observations from each population, where, for $P^* = \int \Phi^{K-1}(x + b)\, d\Phi(x)$,

(7.2) $N(\delta) = \text{first integer } n \ge (b\, \sigma_n/\delta)^2$

and $\sigma_n^2$ is the estimate of the variance. Since the asymptotic normality has been established for all the statistics $T_n$, one merely needs a strongly consistent estimate $\sigma_n^2$ of the variance and (A4) holding at $\theta = 0$. The conditions thus implied are summed up in the following theorem.
THEOREM 7.1. (7.1) holds

a) in Theorem 4.1 if $\psi'(0) > 0$ and $F(\varepsilon) - F(-\varepsilon) > 0$ for all $\varepsilon > 0$;

b) in Theorem 4.2 if a) holds, $F$ is symmetric about $0$ and possesses a second moment, and if $S_n$ satisfies (A4) (which it will if it is the interquartile range);

c) for Hampel estimators defined by (4.1) (or (4.2)) if a) (or b)) holds, $E\psi'(X) > 0$ under $F$, (5.2) has only a finite number of solutions at $\theta = 0$, and (5.12) holds at $\theta = 0$;

d) in Lemma 6.1 and Theorem 6.2 if $E|X|$ exists under $F$ and $F$ is continuous and strictly increasing.
Remark 7.1. (7.1) holds for each $F$; one can use the results of this paper and Carroll (1974) to find conditions under which (7.1) holds in the location problem uniformly over some compact subsets of the space of distributions.
The results of this paper may also be easily applied to finding fixed-width confidence intervals for a parameter. Suppose the distribution of $X_1, X_2, \ldots$ is $F(x - \theta)$, and that $T_n$ is a statistic in this paper or in Carroll (1974) for which

(7.3) $n^{1/2}(T_n - \theta) \xrightarrow{L} \mathrm{Normal}(0, \sigma^2(\theta))$.

If $\alpha$ is the desired confidence level and

(7.4) $\alpha = \Phi(b) - \Phi(-b)$,

one takes $N(\delta)$ observations, where

(7.5) $N(\delta) = \text{first integer } n \ge (b\, \sigma_n/\delta)^2$.

The confidence interval formed is $I_{N(\delta)}$, where

(7.6) $I_{N(\delta)} = [T_{N(\delta)} - \delta,\; T_{N(\delta)} + \delta]$.

Then, under the conditions of Theorem 7.1,

(7.7) $\lim_{\delta \to 0} P\{\theta \in I_{N(\delta)}\} = \alpha$.
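The pair (7.4) and (7.6) is easy to implement. The sketch below (Python; function names are mine, and the normal quantile is found by bisection through an erf-based distribution function) recovers $b$ from the confidence level $\alpha$ and builds the fixed-width interval around the terminal statistic.

```python
import math

def half_width_b(alpha):
    # invert (7.4): alpha = Phi(b) - Phi(-b) = 2*Phi(b) - 1, by bisection
    def Phi(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    lo, hi = 0.0, 10.0
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        if 2.0 * Phi(mid) - 1.0 < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fixed_width_interval(t_N, delta):
    # (7.6): the interval of half-width delta centered at T_{N(delta)}
    return (t_N - delta, t_N + delta)
```

For $\alpha = 0.95$ this gives $b \approx 1.96$, the familiar two-sided normal quantile; $b$ then drives the stopping rule (7.5) exactly as in Section 1.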
Remark 7.2. The problem of finding general conditions under which (A3) holds (say for sums of dependent or independent r.v.'s) is of interest in itself and will be the subject of a later paper.
REFERENCES

[1] ANDREWS, D. F., BICKEL, P. J., HAMPEL, F. R., HUBER, P. J., ROGERS, W. H., and TUKEY, J. W., (1972). Robust Estimates of Location: Survey and Advances. Princeton University Press.

[2] ANSCOMBE, F. J., (1952). Large sample theory of sequential estimation. Proc. Camb. Phil. Soc. (48) 600-617.

[3] BAHADUR, R. R., (1966). A note on quantiles in large samples. Ann. Math. Statist. (37) 577-580.

[4] BECHHOFER, R. E., (1954). A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. (25) 16-39.

[5] CARROLL, R. J., (1974). Asymptotically nonparametric sequential selection procedures. Inst. of Stat. Mimeo Series #944, Univ. of North Carolina.

[6] CHOW, Y. S., and ROBBINS, H., (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. (36) 463-467.

[7] CHUNG, K. L., (1968). A Course in Probability Theory. Harcourt, Brace, and World, Inc.

[8] GEERTSEMA, J. C., (1972). Nonparametric sequential procedures for selecting the best of k populations. J. Amer. Statist. Assoc. (67) 614-616.

[9] HAMPEL, F. R., (1974). The influence curve and its role in robust estimation. J. Amer. Statist. Assoc. (69) 383-393.

[10] HUBER, P. J., (1964). Robust estimation of a location parameter. Ann. Math. Statist. (35) 73-101.

[11] HUBER, P. J., (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proc. Fifth Berkeley Symp. Math. Statist. Prob. (1) 221-233.

[12] HUBER, P. J., (1972). Robust statistics: a review. Ann. Math. Statist. (43) 1041-1067.

[13] MOORE, D. S., (1968). An elementary proof of asymptotic normality of linear functions of order statistics. Ann. Math. Statist. (39) 263-265.

[14] ROBBINS, H., SOBEL, M., and STARR, N., (1968). A sequential procedure for selecting the largest of k means. Ann. Math. Statist. (39) 88-92.

[15] SEN, P. K., and GHOSH, M., (1971). On bounded length confidence intervals based on one-sample rank order statistics. Ann. Math. Statist. (42) 189-203.

[16] SCHUSTER, E. F., (1969). Estimation of a probability density function and its derivatives. Ann. Math. Statist. (40) 1187-1195.