Biomathematics Training Program
Convergence Results for Sequential Estimation
of the Largest Mean
by
Raymond J. Carroll *
University of North Carolina at Chapel Hill
Abstract
We consider the sequential estimation of the largest mean of k populations
when the observations are normally distributed with a common unknown variance
and the goal is to control the mean square error (MSE) at a prespecified level;
this is a generalization of problems considered by Blumenthal (1976) and Carroll
(1978a).
By eliminating from the experiment populations which the data indicate
are not associated with the largest mean, it is shown that, compared to existing
procedures, significant savings in sample size can be obtained.
Weak convergence
results are obtained for the stopping times and the estimate of the largest mean
as consequences of more general results; these are used to compute the asymptotic
MSE.
Key Words and Phrases:
Sequential Estimation, Elimination, Largest Normal Mean,
Weak Convergence, Random Time Change, Ranking and
Selection
*This research was supported by the Air Force Office of Scientific Research under
Grant AFOSR-75-2796.
1. Introduction

Let θ₁,...,θ_k be the unknown means of k normal populations with common unknown variance σ², and let X̄_{1n},...,X̄_{kn} be the sample means for n observations taken from the k populations. Define the ordered population and sample means by θ_[1] ≤ ... ≤ θ_[k] and X̄_[1]n ≤ ... ≤ X̄_[k]n.
Blumenthal (1973, 1976) constructed sequential procedures for estimating the largest mean θ_[k] with a prespecified bound r on the mean square error (MSE). His procedures were mildly data sensitive in that they depend on estimates Δ̂_{in} = X̄_[k]n − X̄_[i]n of Δ_i = θ_[k] − θ_[i], and he obtained some partial asymptotic results. Carroll (1978a) extended the asymptotic results when σ² is known.

The purpose of this paper is twofold. In Section 2 we consider weak convergence results which greatly extend the asymptotic theory for Blumenthal's stopping time N_B and generalize the results of Carroll (1978a).
We define and study the weak convergence of a stopping time process {N_r} that includes N_B as a special finite-dimensional case. We then define a new random change of time process for sample means that is based on {N_r} and consider its weak convergence. We finally study the limit distribution and MSE for the maximum sample mean upon stopping.

The second purpose of this paper is based upon our belief that N_B can be made more data sensitive and efficient by the simple expedient of grafting onto it the ability to eliminate populations (early in the experiment) which are obviously not associated with, and hence give no information about, θ_[k]. In Sections 3 and 4 we define the elimination procedure, study its asymptotic behavior and give some Monte-Carlo results which show that the savings (over N_B) in sample size can be considerable.
To fix notation, define H₁ ≡ 1 and let (σ²/n)H_k(n^½Δ₁/σ,...,n^½Δ_{k−1}/σ) be the MSE due to estimating θ_[k] by θ̂*_n = X̄_[k]n. In order to control MSE at a level r, when σ² is known the following procedures have been proposed: take N_B observations from each population, where either (Blumenthal (1973))

(1.1)

or (Blumenthal (1976))

(1.2)  N_B = N_B(k) = inf{m: n(m) ≥ m}.

Although an analogue of (1.2) (for the case σ² unknown) is possible and requires the same basic techniques as used here, for notational reasons we prefer to consider (1.1) and take N_B observations from each population, where
(1.3)  N_B = N_B(k) = inf{n ≥ ar⁻¹: nr ≥ σ̂²_{nk} H_k(n^½Δ̂_{1n}/σ̂_{nk},...,n^½Δ̂_{k−1,n}/σ̂_{nk})}.

Here a = a(r) > 0 are a set of small bounded constants with finite positive limit a₀ and with ar⁻¹ an integer, while σ̂²_{nk} is the usual pooled sample variance with k(n−1) degrees of freedom.
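As a concrete illustration for k = 2, rule (1.3) can be simulated directly. In the sketch below the function H2 is a hypothetical stand-in (continuous and bounded by H_max = 1, as the assumptions below require), not the paper's actual MSE function, and the values of r, a and the means are arbitrary.

```python
import math
import random

def pooled_variance(x1, x2):
    """Pooled sample variance with k(n-1) = 2(n-1) degrees of freedom."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    ss = sum((v - m1) ** 2 for v in x1) + sum((v - m2) ** 2 for v in x2)
    return ss / (2 * (n - 1))

def H2(x):
    """Hypothetical stand-in for the MSE function H_2: continuous and
    bounded between H_min and H_max = 1 (an assumption, not the paper's H_2)."""
    return 0.5 + 0.5 * math.exp(-0.5 * x * x)

def blumenthal_N(r, a, theta, rng):
    """Sample sequentially from two normal populations and return the stopping
    time of rule (1.3): the first n >= a/r with
    n*r >= sigma_hat^2 * H2(n^(1/2) * Delta_hat / sigma_hat)."""
    n = max(2, math.ceil(a / r))        # a r^{-1} initial observations
    x1 = [rng.gauss(theta[0], 1.0) for _ in range(n)]
    x2 = [rng.gauss(theta[1], 1.0) for _ in range(n)]
    while True:
        s2 = pooled_variance(x1, x2)
        dhat = abs(sum(x2) / n - sum(x1) / n)
        if n * r >= s2 * H2(math.sqrt(n) * dhat / math.sqrt(s2)) or n >= 10000:
            return n                    # n >= 10000 is a safety cap for the sketch
        x1.append(rng.gauss(theta[0], 1.0))
        x2.append(rng.gauss(theta[1], 1.0))
        n += 1

rng = random.Random(1)
N = blumenthal_N(r=0.05, a=0.4, theta=(0.0, 1.0), rng=rng)
print(N)   # of the order 1/r here, since H_max = 1
```

Because H_k is bounded by H_max = 1, the rule always stops by roughly n ≈ σ̂²/r, which is what makes rN_B amenable to the limit theory of Section 2.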
We make the following

Assumptions. The i.i.d. observations X_{j1}, X_{j2},... from the jth population have finite fourth moment. The functions H_k are continuous and satisfy

0 < H_min ≤ H_k(x₁,...,x_{k−1}) ≤ H_max < ∞.

Further, for every k and p, lim_{x₁,...,x_p→∞} H_k(x₁,...,x_{k−1}) exists. Finally, for every k and u, the Lebesgue measure of the set {(x₁,...,x_{k−1}): H_k(x₁,...,x_{k−1}) = u} is zero.
If the observations are normally distributed, the assumptions hold for the MSE function H_k. We take 0 < 2a₀ < H_min ≤ H_max, and it will simplify computations without affecting results if we take H_max = 1. With k = 2, define Δ̂_n = X̄_[2]n − X̄_[1]n, Δ = θ_[2] − θ_[1], and without loss of generality take θ₁ ≤ θ₂ and σ² = 1.
2. Weak Convergence Results for N_B

In this section we prove a number of weak convergence results for a stopping time process that includes N_B as a special (finite dimensional) case. All results are given for k = 2 but are easy to generalize to k ≥ 3. To outline important special cases of the results, in Lemma 1 we give the limit distribution of rN_B, in Lemma 2 we discuss a random change of time for sample means, and in Lemma 3 we establish the limit distribution of the maximum sample mean upon stopping. We assume throughout this section that Δ ~ r^β for some 0 < β < ∞ and Δ²/r → η₀² (0 ≤ η₀ ≤ ∞). Let W be Brownian motion with mean zero and variance 2t at time t, and define

W*(s,η₀) = s^{−½} |W(s) + sη₀|.
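With the normalization Δ = r^β, the value of η₀ is determined by β, since Δ²/r = r^{2β−1}; written out,

```latex
\eta_0^{2} \;=\; \lim_{r \to 0} \frac{\Delta^{2}}{r}
          \;=\; \lim_{r \to 0} r^{\,2\beta - 1}
          \;=\;
\begin{cases}
\infty, & \beta < \tfrac12,\\
1,      & \beta = \tfrac12,\\
0,      & \beta > \tfrac12,
\end{cases}
```

so the drift term sη₀ in W*(s,η₀) disappears exactly when the two largest means approach each other faster than r^½.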
Letting [·] denote the greatest integer function, define a stochastic process G_r by

G_r(s) = [s/r]^½ Δ̂_{[s/r]} / σ̂_{[s/r]}.

The proofs of all results are delayed to the end of the section.

Proposition 1. Let 0 < b₁ < b₂ < ∞ and "⇒" denote weak convergence. On the space D[b₁, b₂] (Billingsley (1968)),

G_r ⇒ W*(·,η₀)  (β ≥ ½).
In studying the stopping time N_B, we have found a more general approach to be as convenient and to yield much stronger results. Consider processes for 0 ≤ t ≤ 1 given by

N_r(t) = inf{m ≥ ar⁻¹: mr(t+1) ≥ σ̂²_m H₂(m^½Δ̂_m/σ̂_m)}

Q(t) = inf{s ≥ a: s(t+1) ≥ H₂(W*(s,η₀))}.

Both N_r and Q are monotone non-increasing in t and are easily verified to be members of D[0,1]. Lemma 1 is comparable in spirit to work of Gut (1975).
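The law of Q can be approximated by direct simulation: discretize W (mean zero, variance 2t at time t) on a fine grid and record the first boundary crossing. In the sketch below H2 is a hypothetical stand-in satisfying the boundedness assumptions of Section 1 (the paper gives no closed form for H₂ here), and the values of a and η₀ are illustrative.

```python
import math
import random

def H2(x):
    """Hypothetical stand-in for H_2: continuous, bounded between
    H_min = 1/2 and H_max = 1 (an assumption for illustration)."""
    return 0.5 + 0.5 * math.exp(-0.5 * x * x)

def sample_Q(t, a, eta0, ds=0.001, s_max=50.0, rng=random):
    """One draw of Q(t) = inf{s >= a: s(t+1) >= H2(W*(s, eta0))},
    where W*(s, eta0) = s^(-1/2)|W(s) + s*eta0| and Var W(s) = 2s."""
    w, s = 0.0, 0.0
    while s < s_max:
        w += rng.gauss(0.0, math.sqrt(2.0 * ds))  # Brownian increment, variance 2*ds
        s += ds
        if s >= a:
            w_star = abs(w + s * eta0) / math.sqrt(s)
            if s * (t + 1.0) >= H2(w_star):       # boundary crossed: stop
                return s
    return s_max                                  # unreachable in practice: H2 <= 1

rng = random.Random(2)
draws = [sample_Q(t=0.0, a=0.3, eta0=1.0, rng=rng) for _ in range(200)]
# Monte-Carlo estimate of Pr{Q(0) > u} at u = 0.7
g_07 = sum(d > 0.7 for d in draws) / len(draws)
print(g_07)
```

Since rN_B converges to the law of Q(0) (Lemma 1 and Corollary 1 below), such draws give a rough picture of the limiting stopping-time distribution.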
Lemma 1. (Weak convergence of the stopping time process). For β ≥ ½,

(2.1)  rN_r ⇒ Q on D[0,1].

For β < ½, rN_B = rN_r(0) →_p 1. Define G*_t by

(2.2)  G*_t(u) = Pr{Q(t) > u} = Pr{(H₂(W*(s,η₀)) − s(t+1)) > 0 for all a ≤ s ≤ u}.

Since 2a < H_min and H is bounded, it is easy to show that 1 − G*_t is a proper distribution function. Of more interest is the following result.

Corollary 1. (Distribution of Blumenthal's stopping time). In Lemma 1, for β ≥ ½,

Pr{rN_B > u} → G*₀(u) = Pr{Q(0) > u}.
The next result will be useful in discussing the limit distribution of the larger sample mean when sampling is stopped, but it is quite general and may be of some interest in its own right for the following reasons. Typical results in the theory of weak convergence with random indices (Durrett and Resnick (1976) have a nice review) start with processes {Y_r} in D[0,∞) and a sequence of integer-valued random variables M_r and consider the process Y_r(trM_r) on D[0,∞), where rM_r has a limit distribution. In other words, Y_r is perturbed by a "random time change" proportional to rM_r. In the next result, we allow the random time change rM_r to be itself a stochastic process. Define m = [t/r] and for j = 1,2 let

Y_r^{(j)}(t) = r^½ {Σ_{i=1}^{m} (X_{ji} − θ_j) + (t/r − m)(X_{j,m+1} − θ_j)}.

The processes Y_r^{(j)} are elements of C[0,∞) with weak limits Y^{(j)}.
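To make the construction concrete, here is a small sketch of the interpolated partial-sum process Y_r^{(j)} and its evaluation at a rescaled index, as in W_r^{(j)}(s,t) = Y_r^{(j)}(s rN_r(t)); the simulated data, the value of r, and the fixed number standing in for the random quantity rN_r(t) are all illustrative assumptions.

```python
import math
import random

def make_Y(obs, theta, r):
    """Build Y_r(t) = r^(1/2) * {sum_{i<=m}(X_i - theta) + (t/r - m)(X_{m+1} - theta)},
    m = [t/r]: the linearly interpolated, normalized partial-sum process."""
    centered = [x - theta for x in obs]
    partial = [0.0]
    for c in centered:
        partial.append(partial[-1] + c)

    def Y(t):
        m = int(t / r)                 # greatest-integer part [t/r]
        frac = t / r - m
        return math.sqrt(r) * (partial[m] + frac * centered[m])

    return Y

r = 0.01
rng = random.Random(3)
obs = [rng.gauss(0.0, 1.0) for _ in range(200)]   # supports arguments t/r < 200
Y = make_Y(obs, theta=0.0, r=r)

# evaluate Y at a rescaled index s * rN, as in W_r(s, t) = Y_r(s * rN_r(t));
# rN below is a fixed illustrative value standing in for the random rN_r(t)
rN = 0.8
vals = [Y(s * rN) for s in (0.25, 0.5, 1.0)]
print([round(v, 4) for v in vals])
```

The point of Lemma 2 is that this evaluation remains weakly convergent even when the fixed rN above is replaced by the whole stochastic process rN_r(·).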
Lemma 2. (Weak convergence for random change of time processes). Define processes W_r^{(j)}(s,t) = Y_r^{(j)}(s rN_r(t)). Then for β ≥ ½,

(2.3)  W_r^{(j)} ⇒ W^{(j)} on D₂{[0,∞) × [0,1]} (Bickel and Wichura (1971)), where W^{(j)}(s,t) = Y^{(j)}(sQ(t)).
.
The last Lemma will be shown useful when we discuss the specific proposal
for eliminating the inferior population early in the experiment.
It is a simple
Corollary of Lemma 2 which delineates the asymptotic behavior of the larger sample
mean when the number of observations are approximately N .
B
Lemma 3.
(Limit distribution of the larger sample mean).
Let M;j) and ~j) be
the number of observations and the sample mean after M~j) observations on the jth
population, where
1 >
-
Let
e;
M(j)/N
= max{~l), ~2)}.
r
B
= M(j)/N
(0) ~
r
r
Then, for S >
)2'
1
U
= 1,2) •
Proof of Proposition 1: The process G_r(s) can be written as

(2.4)  G_r(s) = ([s/r]^½ Δ̂_{[s/r]}) / σ̂_{[s/r]}.

The denominator of (2.4) converges almost surely to σ = 1 while the numerator converges weakly to W*(·,η₀), completing the proof. □
Proof of Lemma 1: We first prove (2.1) by verifying tightness and the convergence of the finite dimensional distributions. Note first that

(2.5)  Pr{rN_r(t_i) > u_i (i = 1,...,p)}
= Pr{mr(t_i+1) < σ̂²_m H₂(m^½Δ̂_m/σ̂_m) for all a ≤ mr ≤ u_i (i = 1,...,p)}
= Pr{[s/r]r(t_i+1) < σ̂²_{[s/r]} H₂(G_r(s)) for all a ≤ s ≤ u_i (i = 1,...,p)},

the last equation using the facts that ar⁻¹ is an integer and 2a < H_min. Rewrite (2.5) as

(2.6)  Pr{rN_r(t_i) > u_i (i = 1,...,p)} = Pr{inf_{a≤s≤u_i} (σ̂²_{[s/r]} H₂(G_r(s)) − [s/r]r(t_i+1)) > 0 (i = 1,...,p)}.

From Proposition 1, the continuous mapping theorem (since inf is continuous in this context) and Theorem 2.1 of Billingsley (1968), (2.6) shows that as r → 0,

lim inf Pr{rN_r(t_i) > u_i (i = 1,...,p)} ≥ Pr{Q(t_i) > u_i (i = 1,...,p)},

thus verifying the convergence of the finite dimensional distributions. To prove tightness, we appeal to Theorem 15.2 of Billingsley (1968). The first condition of his Theorem is satisfied in our case because the process rN_r is non-increasing and rN_r(0) has been shown to have a limit distribution.
To check the second condition of Billingsley's Theorem 15.2, we must show (in his notation) that for all ε > 0,

(2.7)  lim_{δ→0} lim_{r→0} Pr{w'_{rN_r}(δ/2) ≥ ε} = 0.

Now, since rN_r and Q are non-increasing,

(2.8)  lim_{r→0} Pr{w'_{rN_r}(δ/2) ≥ ε}
≤ lim_{r→0} Pr{rN_r(iδ) − rN_r((i+1)δ) ≥ ε for some i = 0,1,...,[1/δ]}
≤ Pr{Q(iδ) − Q((i+1)δ) ≥ ε for some i = 0,1,...,[1/δ]}
≤ Pr{w_Q(4δ) ≥ ε/4},

the next to last inequality following by the weak convergence of the finite dimensional distributions, while the last follows because Q is non-increasing. Then (2.7) follows from (2.8) because Q is an element of D[0,1]. The rest of Lemma 1 (β < ½) follows in a similar but easier fashion. □

Proof of Corollary 1:
By Lemma 1, we need only show that G*₀ is continuous. Letting 1(A) be the indicator of the event A,

lim_{ε↓0} 1{Q(0) > u + ε} = 1{Q(0) > u}

lim_{ε↓0} 1{Q(0) > u − ε} = 1{Q(0) > u} + 1{H₂(W*(u,η₀)) − u = 0}.

Now, by assumption, Pr{H₂(W*(u,η₀)) = u} = 0, so that G*₀ is continuous. □
In order to prove Lemma 2, we need the following supplementary results. For intervals T₁, T₂ of the real line, we define D₂{T₁×T₂} to be the space of functions x(s,t) (s∈T₁, t∈T₂) which are continuous from above with limits from below (Bickel and Wichura (1971)).
Proposition 2. Let a, b be arbitrary positive numbers and define D₀[0,a] to be the set of functions φ in D[0,a] which are nonincreasing and satisfy 0 ≤ φ(t) ≤ b for 0 ≤ t ≤ a. Let {Y_n} be elements of C[0,∞), let {φ_n} be elements of D₀[0,a], and suppose there exists an element (Y,φ₀) in C[0,∞) × D₀[0,a] satisfying (Y_n,φ_n) ⇒ (Y,φ₀). Define random elements Y*_n, Y* of D₂{[0,∞) × [0,a]} by

Y*_n(s,t) = Y_n(s φ_n(t)),  Y*(s,t) = Y(s φ₀(t)).

Then on D₂{[0,∞) × [0,a]}, Y*_n ⇒ Y*.
Proof: Define Y_n^{(1)}(s,t) = Y_n(st), Y^{(1)}(s,t) = Y(st), so that {Y_n^{(1)}}, Y^{(1)} are elements of C₂{[0,∞) × [0,a]}. Denote by A the space C₂{[0,∞) × [0,a]} × D₀[0,a] and define a function h: A → D₂{[0,∞) × [0,a]} by

h(x,v)(s,t) = x(s,v(t)).

This is shown to be a measurable mapping following Billingsley (1968, page 232). Since Y*_n = h(Y_n^{(1)}, φ_n), the continuous mapping theorem completes the proof once we show that h is continuous at elements of A. Let (x_n,φ_n) → (x,φ) ∈ A. Then there exist functions λ_n mapping [0,a] into [0,a] such that for every c > 0,

sup{|x_n(s,t) − x(s,t)|: 0 ≤ s ≤ c, 0 ≤ t ≤ a} → 0

sup{max[|φ_n(λ_n(t)) − φ(t)|, |λ_n(t) − t|]: 0 ≤ t ≤ a} → 0.

These two facts imply that for every c₀ > 0,

sup{|h(x_n,φ_n)(s,λ_n(t)) − h(x,φ)(s,t)|: 0 ≤ s ≤ c₀, 0 ≤ t ≤ a} → 0.

Since c₀ > 0 is arbitrary, following Lindvall (1973) and Whitt (1970), this shows that h is continuous at (x,φ) and completes the proof. □
Proposition 3. On C[0,∞) × C[0,∞) × D[0,1], if β ≥ ½,

(2.9)  (Y_r^{(1)}, Y_r^{(2)}, rN_r) ⇒ (Y^{(1)}, Y^{(2)}, Q).

Proof: By Lemma 1, each of the elements of (2.9) is individually and hence jointly tight, so it suffices to prove convergence of the finite dimensional distributions. We show this only for a special case, noting that
(2.10)  Pr{Y_r^{(1)}(t₁) > u₁, Y_r^{(2)}(t₂) > u₂, rN_r(t₃) > u₃}
= Pr{Y_r^{(1)}(t₁) > u₁, Y_r^{(2)}(t₂) > u₂, inf_{a≤s≤u₃} (σ̂²_{[s/r]} H₂(G_r(s)) − [s/r]r(t₃+1)) > 0}.

We assume with no loss of generality that 0 ≤ t₁, t₂, u₃ ≤ 1. Since fourth moments are finite, for any u,

(2.11)  sup{r^½ |X̄_{jm} − θ_j|: 1 ≤ m ≤ ur⁻¹} →_p 0.

This shows that the second term in the definition of Y_r^{(j)} is negligible, so that (since G_r is a continuous function of the first terms in Y_r^{(j)}), on C[0,1] × C[0,1] × D[a,1],

(Y_r^{(1)}, Y_r^{(2)}, G_r) ⇒ (Y^{(1)}, Y^{(2)}, W*(·,η₀)).

Thus, as r → 0, the continuous mapping theorem and Theorem 2.1 of Billingsley (1968) show that

lim inf Pr{Y_r^{(1)}(t₁) > u₁, Y_r^{(2)}(t₂) > u₂, rN_r(t₃) > u₃} ≥ Pr{Y^{(1)}(t₁) > u₁, Y^{(2)}(t₂) > u₂, Q(t₃) > u₃},

which proves convergence of the finite dimensional distributions and completes the proof. □
Proof of Lemma 2: The boundedness of H₂ means that with probability one there exist positive numbers a₁, a₂ such that

a₁ ≤ inf{Q(t): 0 ≤ t ≤ 1} ≤ sup{Q(t): 0 ≤ t ≤ 1} ≤ a₂.

Define a process M_r(t) by

rM_r(t) = rN_r(t) 1{a₁ ≤ rN_r(t) ≤ a₂} + a₁ 1{rN_r(t) < a₁} + a₂ 1{rN_r(t) > a₂}

and define Z_r^{(j)}(s,t) = Y_r^{(j)}(s rM_r(t)). By an extension of Proposition 2 and by Proposition 3, Z_r^{(j)} ⇒ W^{(j)}. Now, since N_r is non-increasing,

Pr{M_r(t) ≠ N_r(t) for some t} ≤ Pr{rN_r(0) > a₂} + Pr{rN_r(1) < a₁} → 0,

so that Z_r^{(j)} − W_r^{(j)} →_p 0. An application of the continuous mapping theorem and Theorem 4.4 of Billingsley completes the proof. □
Proof of Lemma 3. Calculations and (2.11) show that

(2.12)  r^{−½}(θ*_r − θ₂)
= max{(rM_r^{(1)})^{−1} Y_r^{(1)}(rM_r^{(1)}) + r^{−½}(θ₁ − θ₂), (rM_r^{(2)})^{−1} Y_r^{(2)}(rM_r^{(2)})}
= max{(M_r^{(1)}/N_B)^{−1} (rN_B)^{−1} Y_r^{(1)}((M_r^{(1)}/N_B) rN_B) − r^{−½}Δ, (M_r^{(2)}/N_B)^{−1} (rN_B)^{−1} Y_r^{(2)}((M_r^{(2)}/N_B) rN_B)} + o_p(1).

By Lemma 2, the processes Y_r^{(j)}(s rN_B) are elements of C[0,1] which are tight with weak limits, so that since M_r^{(j)}/N_B →_p 1,

Y_r^{(j)}((M_r^{(j)}/N_B) rN_B) − Y_r^{(j)}(rN_B) →_p 0.

This means by Lemma 2 that the elements of the last equation in (2.12) are jointly weakly convergent, so that an application of the continuous mapping theorem completes the proof. □
3. Eliminating Populations

The difficulty with the stopping time N_B is that it is only mildly data sensitive in that it estimates Δ₁,...,Δ_{k−1} but continues to sample from populations which the data indicate are not associated with the largest population mean, i.e., it fails to eliminate inferior populations. A basic method for correcting this deficiency is to use the technology due to Robbins (1970) and Swanepoel and Geertsema (1976). Suppose then an initial sample of size m is taken. Define t(a) = m⁻¹(1 + b²/(m−1))^m, and let b = b(a) satisfy the equation

1 − F_{m−1}(b) + b f_{m−1}(b) = a/(k−1),

where F_{m−1} (f_{m−1}) is the distribution (density) function of a t-distribution with m−1 degrees of freedom. Define

h(t(a),n) = {(t(a)n)^{1/n} − 1}^½  and  s²(i,j,n) = (n−1)⁻¹ Σ_{p=1}^{n} (X_{jp} − X_{ip} − X̄_{jn} + X̄_{in})².

We say that the ith population is eliminated at stage M_i if it has not been eliminated at any stage n < M_i and if, when populations j₁,...,j_p also have not been eliminated before stage M_i, we have for some j ∈ {j₁,...,j_p} that

(3.1)  X̄_{jM_i} − X̄_{iM_i} > h(t(a),M_i) s(i,j,M_i).

Assuming Δ_{k−1} > 0, θ_j = θ_[k] and an initial sample size m, the previously cited works show that

(3.2)  Pr{M_j > M_i for all i ≠ j} ≥ 1 − a.

In other words, the probability is at most a of eliminating the population with the largest mean. We believe the choice m = 5 initial observations will work quite well.
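The constants of the elimination rule are computable with nothing beyond the standard library; the sketch below finds b(a) from 1 − F_{m−1}(b) + b f_{m−1}(b) = a/(k−1) by bisection. The t distribution function here is obtained by Simpson integration of its density, which is an implementation choice for the sketch, not part of the paper.

```python
import math

def t_pdf(x, df):
    """Density f_df of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_sf(x, df, steps=4000):
    """Upper tail 1 - F_df(x), by Simpson integration of the density on (x, x+40)."""
    lo, hi = x, x + 40.0
    h = (hi - lo) / steps
    total = t_pdf(lo, df) + t_pdf(hi, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lo + i * h, df)
    return total * h / 3.0

def solve_b(a, m, k):
    """Find b = b(a) with 1 - F_{m-1}(b) + b f_{m-1}(b) = a/(k-1), by bisection;
    the left side is decreasing in b > 0 since its derivative is b f'_{m-1}(b) < 0."""
    df, target = m - 1, a / (k - 1)
    g = lambda b: t_sf(b, df) + b * t_pdf(b, df) - target
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

def h_boundary(a, m, k, n):
    """Boundary h(t(a), n) = {(t(a) n)^(1/n) - 1}^(1/2), t(a) = (1 + b^2/(m-1))^m / m."""
    b = solve_b(a, m, k)
    t_a = (1.0 + b * b / (m - 1)) ** m / m
    return math.sqrt((t_a * n) ** (1.0 / n) - 1.0)

b = solve_b(a=0.01, m=5, k=2)
print(round(b, 2))                     # the critical value b(.01) for m = 5, k = 2
print(round(h_boundary(0.01, 5, 2, 50), 3))
```

With m = 5 and a = .01, h(t(a),n)·s(i,j,n) then gives the boundary used in (3.1) at each stage n.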
The stopping times we consider are then defined formally as follows: choose a (see below) and take an initial sample of size max(5, ar⁻¹) from each population.

Definition. Reorder the populations so that M₁ ≤ M₂ ≤ ... ≤ M_k, the ordering in case of ties being by sample means. If N_B(k) ≤ M₁, take N_B(k) observations from each population. Otherwise, completely eliminate the first population from further study and continue as if there were k−1 populations in the experiment (this includes changing the values of H_k to H_{k−1} and σ̂²_{nk} to σ̂²_{n,k−1}, but the value of t(a) in (3.1) remains unchanged). Then, if N_B(k−1) ≤ M₂, take N_B(k−1) observations from each population; otherwise eliminate the second population. Continue in this manner until stopping, denoting the number of observations on each population by (N₁ ≤ N₂ ≤ ... ≤ N_k) = N, with total sample size T = N₁ + ... + N_k.

Note that Blumenthal's N_B = N_B(k) is obtained as a special case by choosing a = 0. We again consider only the case k = 2 and define M = min(M₁, M₂). Recall that Δ ~ r^β. The next result shows how letting a → 0 as r → 0 influences M. The proof is at the end of the section.
Lemma 4. (Size of M). Choose a → 0 as r → 0 so that b² = 2 log t(a) ~ r^{2β₀−1} (0 < β₀ < ½). Then, as r → 0,

rM →_p  a₀ if β < β₀,  1 if β = β₀,  +∞ if β > β₀.

Lemma 4 is rather confusing at first sight. Note that Δ = |θ₂ − θ₁|; the smaller the value of β the farther apart the means are and the quicker one should eliminate. This means that for smaller β, rM should be small, as Lemma 4 shows. The constant β₀ (a monotone increasing function of a) merely serves as a cut-off point; for small a and hence small β₀, it becomes harder to eliminate because we are insisting on more protection (see (3.2)).
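A heuristic for the three regimes in Lemma 4: by (3.1), with s²(1,2,n) → 2 and h²(t(a),n) ≈ (log t(a) + log n)/n, elimination occurs near the first M with Δ²M ≈ 2 log t(a) = r^{2β₀−1}, so that, apart from the floor M ≥ ar⁻¹,

```latex
M \approx \frac{2\log t(a)}{\Delta^{2}}
  \approx \frac{r^{\,2\beta_0-1}}{r^{\,2\beta}},
\qquad
rM \approx r^{\,2(\beta_0-\beta)} \longrightarrow
\begin{cases}
0 \ (\text{floored at } a_0), & \beta < \beta_0,\\
1, & \beta = \beta_0,\\
\infty, & \beta > \beta_0.
\end{cases}
```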
Lemma 5. (Comparison of sample sizes). For general k, let T_B = kN_B be the total sample size of the Blumenthal procedure. Let Δ_i ~ r^{β_i} (i = 1,...,k−1) and let p be the number of β_i < β₀, i.e., p is the number of populations whose means are far from θ_[k] relative to a. Then, if T_E is the total sample size taken by the elimination procedure,

rT_E − {p a₀ + (k−p) rN_B(k−p)} →_p 0,

where a₀ is defined immediately following (1.3).

Lemma 5 shows that considerable savings in sample size are possible. In the next section we show that this is accomplished without a corresponding increase in MSE.
Proof of Lemma 4. Since

rM = (Δ²M/2 log t(a)) (2r log t(a))/Δ² = (Δ²M/2 log t(a)) r^{2(β₀−β)},

it suffices to study Δ²M/2 log t(a). First consider β < ½. Recalling that θ₁ ≤ θ₂ and defining T_n = X̄_{2n} − X̄_{1n} − Δ, equation (3.2) shows that with probability approaching one M = M₁, so that with probability approaching one,

(3.3)  T_M + Δ > h(t(a),M) s(1,2,M)  and  T_{M−1} + Δ ≤ h(t(a),M−1) s(1,2,M−1).

Since M ≥ ar⁻¹, the law of the iterated logarithm shows that T_M/Δ →_p 0. Dividing through by Δ in (3.3) and noting that s²(1,2,n) → 2 almost surely, a few manipulations show that

(3.4)  2(log t(a) + log M)/Δ²M →_p 1.

Now, since M ≥ ar⁻¹, Δ²M ≥ M^γ for some small positive γ, so that (3.4) becomes 2 log t(a)/Δ²M →_p 1. For β < β₀ this is incompatible with M exceeding the initial sample size, so that rM →_p a₀, while for β₀ ≤ β < ½ it gives rM = (Δ²M/2 log t(a)) r^{2(β₀−β)} with the stated limits.

Next we consider the case β ≥ ½. In (3.3), divide through by r^{½−ε}, where ε > 0 is sufficiently small. Since Δ/r^{½−ε} → 0, some manipulations yield

(log t(a) + log M)/Mr^{1−2ε} →_p 0.

Since Mr^{1−2ε} ≥ M^γ for some γ > 0, this gives

(3.5)  Mr^{1−2ε}/log t(a) →_p ∞.

Since rM = r^{2ε} log t(a) (Mr^{1−2ε}/log t(a)), we can choose ε > 0 sufficiently small so that r^{2ε} log t(a) → ∞ and hence by (3.5), rM →_p ∞, which completes the proof. □
4. Mean Square Error

In this section we consider the MSE r⁻¹E(θ̂_N − θ₂)² both asymptotically and in a Monte-Carlo study for small sample sizes. Suppose that upon stopping, N_i observations have been taken from the ith population (i = 1,2). Recall that from previous considerations, we are taking θ₁ ≤ θ₂, N_i ≥ ar⁻¹, and σ = 1. Blumenthal and Cohen (1968) indicate that even for N_B, there are many ways of estimating θ_[2], but that θ*_n = max(X̄_{1n}, X̄_{2n}) is a reasonably effective choice. Our stopping time employs an elimination feature, so we must take into account the possibility that an eliminated population has a sample mean (upon stopping) larger than any other sample mean (upon stopping). The estimate we choose is then given by θ̂_N = max(X̄_{1N₁}, X̄_{2N₂}). An alternative estimator is the maximum sample mean over all populations which have not been eliminated, but we have been unable to verify the uniform integrability needed in the proofs to follow. The cases Δ ~ r^β (β < ½) and Δ ~ r^β (β ≥ ½) are different and are treated separately.
Lemma 6. (Asymptotics of θ̂_N when elimination may occur). Consider the conditions of Lemma 4 with 0 < β₀ < ½, 0 ≤ β < ½. Then the limit distribution of r^{−½}(θ̂_N − θ₂) is the standard normal and r⁻¹E(θ̂_N − θ₂)² → 1.

Lemma 6 says that for k = 2, if the population means are sufficiently separated, then even if elimination does not occur, our general procedure has precisely the same asymptotic behavior (in θ̂_N) as does N_B. The next result shows this to be true when the means are not separated, i.e., θ₂ − θ₁ = Δ ~ r^β, β ≥ ½. Note in this case that Lemma 4 says that elimination will probably not happen.
Lemma 7. (Asymptotics of θ̂_N when elimination is unlikely to occur). Consider the conditions of Lemma 4 with β ≥ ½. Let ψ be the limit distribution in the conclusion to Lemma 3. Then r^{−½}(θ̂_N − θ₂) ⇒ ψ, Eψ² exists, and

r⁻¹E(θ̂_N − θ₂)² → Eψ².

The same results hold if N_B is used without elimination.
The results of Lemma 6 and Lemma 7 are rather unusual, in that they say that for k = 2 the simple elimination idea employed here can save the user in terms of sample size with no (asymptotic) change in MSE. In order to see how this works with small samples, we ran a Monte-Carlo experiment with 500 simulations. The complete results are reported in Carroll (1978b), but here we consider a = .01, r = .10, .01 and Δ = 2.00, 1.00 and .20. An initial sample of size m = 5 was chosen as suggested in Section 3. The results are given below, with T_E/T_B being the ratio of sample size needed for the elimination procedure relative to Blumenthal's procedure.
                                   Δ = 2.00   Δ = 1.00   Δ = .20

T_E/T_B                 r = .10      1.00       1.00       1.00
                        r = .01       .82        .64      ≈1.00

r⁻¹ MSE for elimination r = .10       .86        .89        .72
                        r = .01       .93        .90        .63

r⁻¹ MSE for Blumenthal  r = .10       .86        .89        .73
                        r = .01       .93        .77        .63
Apparently, both procedures achieve their goal of controlling MSE.
The elimina-
tion stopping time can lead to substantial savings in sample size while achieving
its bound on MSE.
The Blumenthal procedure appears to have slightly lower MSE
overall, but this is achieved at the cost of increased sample size.
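A Monte-Carlo comparison of this kind can be sketched along the following lines for k = 2. The H2 stand-in, the fixed value used for b(a), and the simulation sizes are illustrative assumptions, so the numbers this sketch produces are not those of the table above; it is a simplified rendering of the Definition in Section 3, not the exact procedure.

```python
import math
import random

def H2(x):
    """Hypothetical stand-in for H_2 (continuous, bounded, H_max = 1)."""
    return 0.5 + 0.5 * math.exp(-0.5 * x * x)

def run_once(r, delta, a, t_a, m0, rng):
    """One replication of a simplified k = 2 elimination procedure:
    returns (N1, N2, estimate). Population 1 has mean 0, population 2 mean delta."""
    n = max(m0, math.ceil(a / r))
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(delta, 1.0) for _ in range(n)]
    elim_at = None                      # stage at which population 1 was eliminated
    while n < 100000:                   # safety cap for the sketch
        mean1, mean2 = sum(x1) / len(x1), sum(x2) / n
        if elim_at is None:
            # rule (1.3) with the pooled variance, 2(n-1) degrees of freedom
            ss = sum((v - mean1) ** 2 for v in x1) + sum((v - mean2) ** 2 for v in x2)
            s2 = ss / (2 * (n - 1))
            dhat = abs(mean2 - mean1)
            stop = n * r >= s2 * H2(math.sqrt(n) * dhat / math.sqrt(s2))
            # elimination check (3.1) via paired differences
            d = [x2[i] - x1[i] for i in range(n)]
            dbar = sum(d) / n
            sd2 = sum((v - dbar) ** 2 for v in d) / (n - 1)
            h = math.sqrt((t_a * n) ** (1.0 / n) - 1.0)
            if not stop and dbar > h * math.sqrt(sd2):
                elim_at = n             # drop population 1, keep sampling population 2
        else:
            s2 = sum((v - mean2) ** 2 for v in x2) / (n - 1)
            stop = n * r >= s2          # one population left: H_1 = 1
        if stop:
            break
        x2.append(rng.gauss(delta, 1.0))
        if elim_at is None:
            x1.append(rng.gauss(0.0, 1.0))
        n += 1
    n1 = elim_at if elim_at is not None else n
    return n1, n, max(sum(x1) / len(x1), sum(x2) / n)

rng = random.Random(4)
r, delta, a, m0 = 0.05, 2.0, 0.4, 5
b = 4.5                                 # illustrative stand-in for b(a)
t_a = (1.0 + b * b / (m0 - 1)) ** m0 / m0
reps = 200
tot, sq = 0, 0.0
for _ in range(reps):
    n1, n2, est = run_once(r, delta, a, t_a, m0, rng)
    tot += n1 + n2
    sq += (est - delta) ** 2
print("average total sample size T_E:", tot / reps)
print("r^-1 MSE estimate:", round(sq / (reps * r), 3))
```

The estimator returned is θ̂_N = max of the two sample means upon stopping, including the eliminated population's frozen mean, as in the definition above.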
Proof of Lemma 6. By Lemmas 1 and 4, rN_i converges in probability to a constant (either a₀ or 1, depending on β₀, β). Thus, by Anscombe (1952, Theorem 1) the vector

(4.1)  (r^{−½}(X̄_{1N₁} − θ₁), r^{−½}(X̄_{2N₂} − θ₂))

converges in distribution to a normal random vector. This gives Pr{X̄_{2N₂} ≥ X̄_{1N₁}} → 1 since r^{−½}(θ₂ − θ₁) → ∞. Hence r^{−½}(θ̂_N − θ₂) has the required limit distribution. By Bickel and Yahav (1968), to complete the proof it suffices to show that for some r₀ > 0,

(4.2)  Σ_{m=1}^{∞} sup_{0<r<r₀} Pr{r⁻¹(θ̂_N − θ₂)² > m} < ∞.

Since N_i ≥ ar⁻¹ (i = 1,2), our definition of θ̂_N shows that

Pr{r⁻¹(θ̂_N − θ₂)² > m} ≤ Pr{|X̄_{1n} − θ₁| > (mr)^½ for some n ≥ ar⁻¹}
+ Pr{|X̄_{2n} − θ₂| > (mr)^½ for some n ≥ ar⁻¹}
≤ C₀ m⁻²  (for some C₀ > 0),

this last following by the maximal inequality for reverse martingales (Doob (1953)). This verifies (4.2). □
Proof of Lemma 7. Under our conditions, N_i/N_B →_p 1 (i = 1,2). Thus, by Lemma 3, r^{−½}(θ̂_N − θ₂) ⇒ ψ. The rest of the proof now follows from Bickel and Yahav (1968) and (4.2). □

These results do not change if p = k−1, so that all populations but one may be eliminated. Lemmas 1-3 and 7 can handle the mixture situation p < k−1 with some simple changes in definition. If one makes the further reasonable assumption that H_k(x₁,...,x_{k−1}) = H_{k−p}(x_{p+1},...,x_{k−1}) when x₁ = ... = x_p = ∞, then, as in Section 4, it is possible to show that N_B and the elimination procedure lead to the same asymptotic MSE.
References

Anscombe, F.J. (1952). Large sample theory of sequential estimation. Proc. Camb. Phil. Soc. 48 600-617.

Bickel, P.J. and Wichura, M.J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Ann. Math. Statist. 42 1656-1670.

Bickel, P.J. and Yahav, J.A. (1968). Asymptotically optimal Bayes and minimax procedures in sequential estimation. Ann. Math. Statist. 39 442-456.

Billingsley, P. (1968). Convergence of Probability Measures. New York: John Wiley & Sons, Inc.

Blumenthal, S. (1973). Sequential estimation of the largest normal mean when the variance is unknown. Department of Operations Research Technical Report No. 164, Cornell University.

Blumenthal, S. (1976). Sequential estimation of the largest normal mean when the variance is known. Ann. Statist. 4 1077-1087.

Blumenthal, S. and Cohen, A. (1968). Estimation of the larger of two normal means. J. Amer. Statist. Assoc. 63 861-876.

Carroll, R.J. (1978a). On sequential estimation of the largest normal mean when the variance is known. To appear in Sankhya, Series A.

Carroll, R.J. (1978b). A study of sequential procedures for estimating the larger mean. University of North Carolina at Chapel Hill Mimeo Series No. 1189.

Doob, J.L. (1953). Stochastic Processes. Wiley, New York.

Durrett, R.J. and Resnick, S.I. (1976). Weak convergence with random indices. Technical Report No. 83, Department of Statistics, Stanford University.

Gut, A. (1975). Weak convergence and first passage times. J. Appl. Prob. 12 324-332.

Lindvall, T. (1973). Weak convergence of probability measures and random functions in the function space D[0,∞). J. Appl. Prob. 10 109-121.

Robbins, H. (1970). Statistical methods related to the law of the iterated logarithm. Ann. Math. Statist. 41 1397-1409.

Swanepoel, J.W.H. and Geertsema, J.C. (1976). Sequential procedures with elimination for selecting the best of k normal populations. S. Afr. Statist. J. 10 9-36.

Whitt, W. (1970). Weak convergence of probability measures on the function space C[0,∞). Ann. Math. Statist. 41 939-944.
UNCLASSIFIED

REPORT DOCUMENTATION PAGE

Title: Convergence Results for Sequential Estimation of the Largest Mean
Type of Report: Technical
Performing Org. Report Number: Mimeo Series No. 1192
Author: Raymond J. Carroll
Contract or Grant Number: AFOSR-75-2796
Performing Organization: Department of Statistics, University of North Carolina, Chapel Hill, North Carolina 27514
Report Date: August, 1978
Number of Pages: 19
Controlling Office: Air Force Office of Scientific Research, Bolling AFB, Washington, D.C.
Security Class: UNCLASSIFIED
Distribution Statement: Approved for Public Release; Distribution Unlimited
Key Words: Sequential Estimation, Elimination, Largest Normal Mean, Weak Convergence, Random Time Change, Ranking and Selection