Roberts, C. D. (1962). "An asymptotically optimal sequential design for comparing several experimental categories with a standard or control."

UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.

AN ASYMPTOTICALLY OPTIMAL SEQUENTIAL DESIGN FOR COMPARING
SEVERAL EXPERIMENTAL CATEGORIES WITH A STANDARD OR CONTROL

by
Charles DeWitt Roberts
December 1962

Contract No. AF 49(638)-261
Three sequential procedures are considered in order to decide if any of k experimental categories is better than the standard (or control) and, if so, to determine one such category. With a specific loss function and a cost c > 0 per observation, the three sequential procedures and fixed sample size procedures are compared in a certain asymptotic sense as c → 0. In particular, one of the procedures is shown to be optimal in this specific asymptotic sense.
This research was supported by the Mathematics Division of the
Air Force Office of Scientific Research.
Institute of Statistics
Mimeo Series No. 344
CONTENTS

Chapter III. ASYMPTOTIC RESULTS FOR THE CASE OF k EXPERIMENTAL CATEGORIES
  3.1 The problem
  3.2 Assumptions
  3.3 Definitions and notation
  3.4 Theorem 6 (The Bayes terminal decision rule)
  3.5 Theorem 7 (Optimal sample size and risk for fixed sample size procedure)
  3.6 Theorem 8 (Asymptotic lower bound for risk of any procedure)
  3.7 Theorem 9 (Asymptotic risks of Procedures I, II, and III)
  3.8 Table of prices of the procedures
  3.9 Remarks concerning the relative merits of the procedures considered
APPENDIX
BIBLIOGRAPHY
ACKNOWLEDGEMENTS

I am deeply indebted to Professor Wassily Hoeffding for proposing the problem, for his numerous suggestions and comments, and for his encouragement and kindness.

I am grateful for my National Defense Education Act fellowship, which has supported me during my period of graduate study and research.

I thank Bell Telephone Laboratories for the computer time used to run the programs for the calculations of Chapter I.

I thank Mrs. Doris Gardner for her prompt and accurate typing of the manuscript.
INTRODUCTION

Basic problem. The basic problem here is one of practical interest. This problem is that of comparing k experimental categories with a standard (or control) in order to determine if any of the experimental categories is better than the standard (or control) and, if so, to determine one such category.

Summary and statement of results. For the above problem three sequential procedures are given, with complete specification as to how the procedures are actually carried out in practice. Justification for using these procedures is given by considering an analogous problem with three analogous procedures when there are specified a loss function and a cost c > 0 per observation. For the analogous problem it is shown, in a certain asymptotic sense as c tends to zero, that all three procedures are strictly better than the best fixed sample size procedure and that one of the procedures is the best procedure possible. By appealing to these asymptotic results, a discussion of the relative merits of the three sequential procedures as considered in practice is given.
CHAPTER I

THE PRACTICAL SOLUTION OF THE PROBLEM

1.1 The setup and the terminology. We assume there are k experimental categories and X^(j) is the random variable resulting from a measurement with the j-th category. We denote the probability density of X^(j) by g(x, τ_j). For simplicity it is supposed here that the larger the value of τ, the more desirable the category is. We say Q = 0 when τ_1 = τ_2 = ... = τ_k = τ_0, and say Q = j when τ_1 = ... = τ_{j−1} = τ_{j+1} = ... = τ_k = τ_0 and τ_j = τ_0 + Δ, where Δ > 0, as described in the following table (where τ = τ_0 + Δ):

  Q     X^(1)      X^(2)      X^(3)     ...   X^(k)
  0     g(x,τ_0)   g(x,τ_0)   g(x,τ_0)  ...   g(x,τ_0)
  1     g(x,τ)     g(x,τ_0)   g(x,τ_0)  ...   g(x,τ_0)
  2     g(x,τ_0)   g(x,τ)     g(x,τ_0)  ...   g(x,τ_0)
  ...
  k     g(x,τ_0)   g(x,τ_0)   g(x,τ_0)  ...   g(x,τ)

                                                        (1.1.1)
The decision D_0 is preferred if Q = 0, or if none of the experimental categories is better than the standard (or control) [that is, τ_s ≤ τ_0 for s = 1, 2, ..., k] in the model (1.1.1). The decision D_j is preferred if Q = j, or if the j-th experimental category is better than the standard [that is, τ_j > τ_0] in the model (1.1.1). (We exclude the possibility that more than one category is better than the standard.) We will develop three sequential procedures for choosing one of the k+1 decisions (D_0, D_1, ..., D_k) so that the probability of selecting D_0 when Q = 0 is at least 1 − α, and the probability of selecting D_j when Q = j is at least 1 − β for each j, j = 1, 2, ..., k. This formulation is similar to that of Paulson [5].²
In all cases we will consider sequential procedures for choosing one of D_0, D_1, ..., D_k which satisfy the following general assumptions:

(i) The decision to stop sampling or to continue sampling, as well as the choice of the next experiment (category), may depend on the available observations.

(ii) Once the (n+1)-th experiment has been chosen, its outcome is assumed to be independent of the n previous outcomes.

For more discussion about the sequential design of experiments and general asymptotic results see Chernoff [2]. The asymptotic results obtained here are similar to those found in Chernoff [2]; in particular, asymptotic optimality is in the same sense.

² The numbers in square brackets refer to the bibliography.

We will now define three sequential procedures.
1.2 Brief descriptions of Procedures I, II, and III.

Procedure I. Assign a random order for observing the k categories. If, say, (1, 2, ..., k) is chosen, sample on X^(1) until a decision is made about the distribution associated with X^(1); then, if necessary, start sampling on X^(2). If sampling were begun on X^(2), sample on X^(2) until a decision is made about the distribution associated with X^(2); then, if necessary, start sampling on X^(3), and so on.

Procedure II. Sample in k (or fewer)-tuples of one observation on each category, beginning with a k-tuple. After observing each tuple, decide whether or not to make a terminal decision and, if sampling is continued, decide which, if any, of the categories may be eliminated from further sampling. This is the procedure suggested by Paulson [5].

Procedure III. Take one observation on each category and then, on the basis of what is observed, choose a category on which to sample next. After each observation, select a category on which to sample next. Stop sampling from a category entirely when a decision is made about the distribution associated with that category and, if necessary, continue sampling only on the remaining categories.
1.3 More detailed descriptions of Procedures I, II, and III. Let b < 0 < a, and

    z^(j) = log[ g(x^(j), τ_0 + Δ) / g(x^(j), τ_0) ]   for j = 1, 2, ..., k,

and let z_i^(j) be the i-th observation on z^(j).
Procedure I. Select at random one of the k! permutations of 1, 2, ..., k, say (i_1, i_2, ..., i_k). We would then sample first on X^(i_1), then on X^(i_2), ..., then on X^(i_k) in the following way:

(i) Stop sampling on X^(j) when b < Σ_{i=1}^{n_j} z_i^(j) < a is violated for some n_j.

(ii) If for that n_j we have Σ_{i=1}^{n_j} z_i^(j) ≥ a, do no further sampling on any category and make decision D_j.

(iii) If for that n_j we have Σ_{i=1}^{n_j} z_i^(j) ≤ b, begin sampling on the next category.

(iv) If for each j, j = 1, 2, ..., k, there is an n_j for which Σ_{i=1}^{n_j} z_i^(j) ≤ b, do no further sampling and make decision D_0.
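To make the rule concrete, here is a minimal simulation sketch of Procedure I in Python for the normal case treated in Section 1.6 below (τ_0 = 0, unit variance, so z_i^(j) = Δx_i^(j) − Δ²/2). The function names and the simulation setup are our own illustration, not part of the original.

    import numpy as np

    def z_stream(rng, delta, better):
        """Yield log-likelihood-ratio increments z = delta*x - delta**2/2
        for N(delta, 1) versus N(0, 1) observations (tau_0 = 0)."""
        while True:
            x = rng.normal(delta if better else 0.0, 1.0)
            yield delta * x - delta ** 2 / 2

    def procedure_I(rng, k, a, b, delta, best=0):
        """best = 0 means Q = 0; best = j means category j is the better one."""
        order = rng.permutation(k) + 1           # random permutation (i_1, ..., i_k)
        n_total = 0
        for j in order:
            s = 0.0
            stream = z_stream(rng, delta, better=(j == best))
            while b < s < a:                     # (i) sample X^(j) until a boundary is crossed
                s += next(stream)
                n_total += 1
            if s >= a:                           # (ii) decide D_j and stop all sampling
                return int(j), n_total
            # (iii) s <= b: go on to the next category in the permutation
        return 0, n_total                        # (iv) every sum fell below b: decide D_0

    rng = np.random.default_rng(0)
    a, b = np.log(2 / 0.05), -np.log(2 / 0.05)   # k = 2, alpha = beta = .05 (see example (4))
    print(procedure_I(rng, k=2, a=a, b=b, delta=1.0, best=1))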
Procedure II. Sample in k (or fewer)-tuples, with one observation on each of a subset of X^(1), X^(2), ..., X^(k), beginning with a k-tuple, in the following way. Let m be an n for which b < Σ_{i=1}^{n} z_i^(j) < a is violated for some j.

(i) If Σ_{i=1}^{m} z_i^(s) = max_j { Σ_{i=1}^{m} z_i^(j) } ≥ a, stop sampling and make decision D_s, where here the max is over those j for which m observations have been taken on X^(j).

(ii) Stop sampling on X^(j) if for some n_j one of the inequalities b < Σ_{i=1}^{n_j} z_i^(j) < a is violated.

(iii) If Σ_{i=1}^{n_j} z_i^(j) < b for some n_j and each j = 1, 2, ..., k, stop sampling and make decision D_0.
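A corresponding sketch of Procedure II (the Paulson-type elimination rule), under the same illustrative normal setup as the previous sketch; the vectorized bookkeeping is our own.

    import numpy as np

    def procedure_II(rng, k, a, b, delta, best=0):
        """Sample in tuples of one observation on each still-active category."""
        sums = np.zeros(k)                       # running sums of z per category
        active = np.ones(k, dtype=bool)
        n_total = 0
        while active.any():
            idx = np.flatnonzero(active)
            means = np.where(idx + 1 == best, delta, 0.0)
            x = rng.normal(means, 1.0)           # one k-(or fewer)-tuple of observations
            sums[idx] += delta * x - delta ** 2 / 2
            n_total += idx.size
            live = np.where(active, sums, -np.inf)
            if live.max() >= a:                  # (i) the largest sum crossed a: decide D_s
                return int(np.argmax(live)) + 1, n_total
            active &= sums > b                   # (ii)-(iii) eliminate categories below b
        return 0, n_total                        # every category eliminated: decide D_0

    rng = np.random.default_rng(1)
    a, b = np.log(2 / 0.05), -np.log(2 / 0.05)
    print(procedure_II(rng, k=2, a=a, b=b, delta=1.0, best=2))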
Procedure III. Take one observation on each of X^(1), X^(2), ..., X^(k) and then proceed as follows:

(i) Sample next on X^(s) if z_1^(s) = max_j z_1^(j). In general, after n_1 observations on X^(1), n_2 on X^(2), ..., n_k on X^(k), sample next on X^(s) if Σ_{i=1}^{n_s} z_i^(s) = max_j { Σ_{i=1}^{n_j} z_i^(j) }.

(ii) Stop sampling on X^(j) when b < Σ_{i=1}^{n_j} z_i^(j) < a is violated for some n_j.

(iii) If for some j and some n_j we have Σ_{i=1}^{n_j} z_i^(j) ≥ a, stop sampling and make decision D_j.

(iv) If for each j there is an n_j for which Σ_{i=1}^{n_j} z_i^(j) ≤ b, stop sampling and make decision D_0.
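Finally, a sketch of Procedure III in the same illustrative setup. At each step the next observation goes to the not-yet-resolved category with the largest running sum, as in (i); the tie-breaking and bookkeeping details are our own choices.

    import numpy as np

    def procedure_III(rng, k, a, b, delta, best=0):
        sums = np.zeros(k)
        undecided = np.ones(k, dtype=bool)
        n_total = 0

        def observe(j):
            nonlocal n_total
            x = rng.normal(delta if j + 1 == best else 0.0, 1.0)
            sums[j] += delta * x - delta ** 2 / 2
            n_total += 1

        for j in range(k):                       # start with one observation on each category
            observe(j)
        while True:
            for j in np.flatnonzero(undecided):
                if sums[j] >= a:                 # (iii) upper crossing: decide D_{j+1}
                    return int(j) + 1, n_total
                if sums[j] <= b:                 # (ii) lower crossing: category resolved
                    undecided[j] = False
            if not undecided.any():
                return 0, n_total                # (iv) all sums fell below b: decide D_0
            j = int(np.argmax(np.where(undecided, sums, -np.inf)))
            observe(j)                           # (i) sample the largest undecided sum next

    rng = np.random.default_rng(2)
    a, b = np.log(2 / 0.05), -np.log(2 / 0.05)
    print(procedure_III(rng, k=2, a=a, b=b, delta=1.0, best=0))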
1.4 Determination of a and b. Suppose α, β, and Δ are specified and τ_0 is assumed known, and denote

    I_0 = E_0(−z_1^(1))   and   I_1 = E_1(z_1^(1)),

where E_Q indicates expectation for Q the state of nature; also let

    λ = min{ 1, kβI_0 / [ α(k−1)( I_0 + (k−1)I_1 ) ] }.                (1.4.1)

For any of the three procedures, we suggest choosing

    a = log[ k/(λα) ]   and   b = log[ β − (k−1)λα/k ],               (1.4.2)

which is what is suggested by Paulson [5] for Procedure II.
For the three procedures we will have (where P_Q indicates probability when Q is the state of nature)

    1 − P_0(D_0) = Σ_{j=1}^{k} P_0(D_j) ≤ Σ_{j=1}^{k} P_0( Σ_{i=1}^{r} z_i^(j) ≥ a for some r < ∞ )

and

    1 − P_1(D_1) = P_1(D_0) + Σ_{s=2}^{k} P_1(D_s)
                 ≤ P_1( Σ_{i=1}^{r} z_i^(1) ≤ b for some r < ∞ ) + Σ_{s=2}^{k} P_1( Σ_{i=1}^{r} z_i^(s) ≥ a for some r < ∞ ).

Now it is well known (see Wald [6]) that

    P_0( Σ_{i=1}^{r} z_i^(1) ≥ a for some r < ∞ ) ≤ e^{−a},

    P_1( Σ_{i=1}^{r} z_i^(1) ≤ b for some r < ∞ ) ≤ e^{b},

and

    P_1( Σ_{i=1}^{r} z_i^(s) ≥ a for some r < ∞ ) ≤ e^{−a}   for s = 2, 3, ..., k.
Thus in order to satisfy the requirements that P_0(D_0) ≥ 1 − α and P_j(D_j) ≥ 1 − β, we therefore determine a and b so that

    k e^{−a} ≤ α   and   e^{b} + (k−1) e^{−a} ≤ β.

Let N^(δ) be the sample size required for procedure δ, δ = I, II, III, and let N_j^(δ) be the number of observations taken on category j, so that N^(δ) = N_1^(δ) + N_2^(δ) + ... + N_k^(δ); N_j^(δ) is at most the sample size N_j required for a Wald boundary to be crossed on category j. But an approximate upper bound for E_1 N_1 is a/I_1, and an approximate upper bound for E_1 N_s is −b/I_0 for s = 2, 3, ..., k, so that we may write (with ≲ for approximately less than or equal to)

    E_1 N^(δ) ≲ a/I_1 − (k−1) b/I_0.

Now if we minimize this approximate upper bound (that is, a/I_1 − (k−1)b/I_0) with respect to a and b subject to k e^{−a} ≤ α and e^{b} + (k−1)e^{−a} ≤ β, we have the values for a and b given by (1.4.1) and (1.4.2).
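The computation is easy to carry out numerically. The following sketch (our own illustration) evaluates λ, a, and b from (1.4.1)-(1.4.2) and checks the two error-probability constraints; with k = 2 and I_0 = I_1 it reproduces the boundary values tabulated in example (4) of Section 1.6.

    import math

    def design_constants(k, alpha, beta, I0, I1):
        # (1.4.1): lam = min{1, k*beta*I0 / [alpha*(k-1)*(I0 + (k-1)*I1)]}
        lam = min(1.0, k * beta * I0 / (alpha * (k - 1) * (I0 + (k - 1) * I1)))
        a = math.log(k / (lam * alpha))                 # (1.4.2)
        b = math.log(beta - (k - 1) * lam * alpha / k)
        # the constraints under which (1.4.2) was derived:
        assert k * math.exp(-a) <= alpha + 1e-12
        assert math.exp(b) + (k - 1) * math.exp(-a) <= beta + 1e-12
        return lam, a, b

    # k = 2 and I0 = I1 = Delta**2/2 with Delta = 1 (the normal case of example (4))
    for alpha, beta in [(.05, .05), (.05, .01), (.01, .05), (.01, .01)]:
        lam, a, b = design_constants(2, alpha, beta, 0.5, 0.5)
        print(f"alpha={alpha:.2f} beta={beta:.2f}  a={a:.4f}  b={b:.4f}")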
1.5 Discussion of the merits of the three procedures in practice. We will assign a cost of c > 0 per observation, a loss for making a decision, and a prior distribution that assigns probability ξ_j > 0 to Q = j, j = 0, 1, ..., k. For our (k+1)-decision simple-hypothesis problem let

    r(Q, δ) = expected loss + expected cost of observations

with procedure δ if Q is the state of nature. Let also

    r(δ) = Σ_{j=0}^{k} ξ_j r(j, δ),

the Bayes risk associated with procedure δ, and define P_δ, the price of procedure δ, by

    P_δ = lim sup_{c→0} r(δ) / (−c log c).

Thus the price of a procedure is a type of measure of the desirability of the procedure, where the more desirable procedures have small prices. In Chapter III it is shown that, for our (k+1)-decision simple-hypothesis problem with a certain loss for making a decision and a certain choice of a and b, and with

    I_0 = E_0(−z_1^(1)),  I_1 = E_1(z_1^(1)),  α_1 = inf_t E_0 e^{t z_1^(1)},  β_1 = inf_t E_1 e^{−t z_1^(1)},

we have the following table:
Table (1.5.1)

  Procedure δ                      Price P_δ

  Best fixed sample size           P_F = k / [ −log max(α_1, β_1) ] > k / min(I_0, I_1)
  I                                P_I = kξ_0/I_0 + (1 − ξ_0)[ 1/I_1 + (k−1)/(2I_0) ]
  II                               P_II ≤ kξ_0/I_0 + (1 − ξ_0)[ 1/I_1 + (k−1)/max(I_0, I_1) ]
  III (asymptotically optimal)     P_III = kξ_0/I_0 + (1 − ξ_0)/I_1 = P_opt
By the table of prices it is seen that all three procedures are asymptotically strictly better than the fixed sample size procedure, and that it is not possible to find a procedure asymptotically better than Procedure III. Now if 2I_0 ≤ I_1, Procedure II is suggested over Procedure I. If I_1/I_0 or I_0/I_1 is small, then Procedure II is approximately as good as (optimal) Procedure III, although the possibility is left open that Procedure II is asymptotically optimal. If I_1/I_0 is small, then Procedure I is approximately as good as Procedure III. Now in practice Procedure III may be difficult or troublesome to perform, since one may have to change his physical apparatus after each observation, and so he may perhaps prefer to choose Procedure I or II. If it is expected that a certain category will be better than the others, or it is tentatively possible to rank the categories, then it would appear that Procedure I could be desirable, with experimentation performed by this ranking. If a ranking is not possible, or it is highly suspected that more than one of the categories could be better than the standard (or control), then Procedure II could be the best to choose, since it has more of a tendency to select the best of the categories rather than simply one better than the standard, which is what Procedures I and III tend to do.
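For a concrete comparison (a worked illustration of ours, using the table above): in the normal case of Section 1.6 with k = 2 and Δ = 1, so that I_0 = I_1 = 1/2, and with ξ_0 = ξ_1 = ξ_2 = 1/3, the prices are P_I = 4/3 + (2/3)(2 + 1) = 10/3, P_II ≤ 4/3 + (2/3)(2 + 2) = 4, P_III = P_opt = 4/3 + (2/3)·2 = 8/3, and P_F > 2/min(I_0, I_1) = 4. Here 2I_0 ≤ I_1 fails, and accordingly Procedure I's price is smaller than the bound for Procedure II.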
Monte Carlo trials were run on an IBM 7090 computer with two experimental categories for the three procedures described, using 25,000 normal random deviates, so that in the table (1.1.1) the distribution g(x, τ_0) was normal with mean 0 and variance 1 and g(x, τ) was normal with mean 1 and variance 1. The runs yielded the following table:
Table (1.5.2)

  Error levels        Expected sample sizes as observed
                    Procedure I        Procedure II       Procedure III
   α      β        Q=0     Q=1,2      Q=0     Q=1,2      Q=0     Q=1,2

  .05    .05       12.9    18.7       10.7    12.2       12.6    17.7
  .05    .01       15.3    18.5       10.1     7.8       17.6    12.7
  .01    .05       15.3    17.7       17.3    10.5       11.3    12.8
  .01    .01        9.1    11.1       10.9    10.3       13.2     9.8
For further numerical computations, in particular for a comparison of the sample size required by a fixed sample size procedure with an approximate upper bound to the expected sample size when normality is assumed, see Paulson [5]. In particular, from Paulson [5], for the case of the table (1.5.2) we have the following table:
Table (1.5.3)

  Error levels      Approximate upper bound of expected      Sample size required for
                    sample sizes for Procedures I, II, III   fixed sample size procedure
   α      β         Q = 0      Q = 1,2

  .05    .05          15         15                            26
  .05    .01          21         21                            31
  .01    .05          12         11                            36
  .01    .01          21         21                            48

1.6 Examples.

(1) Normal with infinite parameter space. Here we assume 0 < α < 1, 0 < β < 1, and Δ > 0. Here Δ, α, and β are specified, and we assume

    g(x, τ) = (2π)^{−1/2} exp[ −(x − τ)²/2 ].
Suppose the problem is specified by the table

  Q     X^(1)       X^(2)       X^(3)      ...   X^(k)
  0     g(x,ω_01)   g(x,ω_02)   g(x,ω_03)  ...   g(x,ω_0k)
  1     g(x,ω_11)   g(x,ω_12)   g(x,ω_13)  ...   g(x,ω_1k)
  2     g(x,ω_21)   g(x,ω_22)   g(x,ω_23)  ...   g(x,ω_2k)
  ...
  k     g(x,ω_k1)   g(x,ω_k2)   g(x,ω_k3)  ...   g(x,ω_kk)

where

    ω_ij − ω_0j ≤ 0 if i ≠ j,   ω_ij − ω_0j ≥ Δ if i = j.
Then with the three sequential procedures, values of a and b given by

    λ = min{ 1, β/[α(k−1)] },   a = log[ k/(λα) ],   b = log[ β − (k−1)λα/k ],

and

    z^(j) = Δ( x^(j) − ω_0j ) − Δ²/2   for j = 1, 2, ..., k,

we will have P_0(D_0) ≥ 1 − α and P_j(D_j) ≥ 1 − β, j = 1, 2, ..., k, for all permissible values of the ω_ij. This is seen since, under Q = 0, each x_i^(1) has mean ω_01, so that by Wald's inequality

    P_0( Δ( Σ_{i=1}^{r} x_i^(1) − rω_01 ) − rΔ²/2 ≥ a for some r < ∞ ) ≤ e^{−a},

and likewise for each x^(j); under Q = 1, each x_i^(1) has mean ω_11 ≥ ω_01 + Δ, so that

    P_1( Δ( Σ_{i=1}^{r} x_i^(1) − rω_01 ) − rΔ²/2 ≤ b for some r < ∞ )
      ≤ P_1( Δ( Σ_{i=1}^{r} x_i^(1) − r(ω_11 − Δ) ) − rΔ²/2 ≤ b for some r < ∞ ) ≤ e^{b};

and under Q = 1, each x_i^(j), j = 2, 3, ..., k, has mean ω_1j ≤ ω_0j, so that

    P_1( Δ( Σ_{i=1}^{r} x_i^(j) − rω_0j ) − rΔ²/2 ≥ a for some r < ∞ ) ≤ e^{−a}.

Similarly for P_j, j = 2, 3, ..., k. Therefore

    1 − P_0(D_0) ≤ k e^{−a} ≤ α   and   1 − P_j(D_j) ≤ e^{b} + (k−1) e^{−a} ≤ β   for j = 1, 2, ..., k.

(2) Exponential class. Suppose that Δ, α, β are specified, τ_0 is known, and we have the problem of table (1.1.1) where

    g(x, τ) = A(x) exp[ τ s(x) − h(τ) ].

Suppose

    z^(j) = Δ s(x^(j)) − h(τ_0 + Δ) + h(τ_0)   for j = 1, 2, ..., k.

Let a and b be chosen by (1.4.1) and (1.4.2), where I_0 = E_0(−z_1^(1)) and I_1 = E_1(z_1^(1)). We will have that any of the three procedures based on this z^(j) will have the property

    P_0(D_0) ≥ 1 − α   and   P_j(D_j) ≥ 1 − β for j = 1, 2, ..., k,

uniformly over parameter values with τ ≤ τ_0 for the categories not better than the standard and τ ≥ τ_0 + Δ for the better category. This follows by an argument similar to that in example (1) by making use of the convexity of h.
(3) Normal with k experimental categories. Suppose that α, β, and Δ are specified, the problem is that of table (1.1.1) with τ_0 assumed equal to 0, and

    g(x, τ) = (2π)^{−1/2} exp[ −(x − τ)²/2 ].

Suppose a and b are chosen by (1.4.1) and (1.4.2) with λ = min{ 1, β/[α(k−1)] }, and let z^(j) = Δ x^(j) − Δ²/2 for j = 1, 2, ..., k. By example (1), any of the three procedures based on this z^(j) will have the property P_0(D_0) ≥ 1 − α and P_j(D_j) ≥ 1 − β for j = 1, 2, ..., k, uniformly for τ ≥ Δ.
(4) Normal with two experimental categories. Suppose we have the table (1.1.1) and for simplicity assume τ_0 = 0, τ = Δ, and g(x, τ) = (2π)^{−1/2} exp[ −(x − τ)²/2 ]. Then

    z_i^(1) = Δ x_i^(1) − Δ²/2   and   z_i^(2) = Δ x_i^(2) − Δ²/2,

so that Procedure III would say: after n_1 observations on X^(1) and n_2 observations on X^(2), sample next on X^(1) if

    Σ_{i=1}^{n_1} ( Δ x_i^(1) − Δ²/2 ) ≥ Σ_{i=1}^{n_2} ( Δ x_i^(2) − Δ²/2 ),

and otherwise sample next on X^(2). Since I_0 = I_1 = Δ²/2, we would have by (1.4.1) and (1.4.2) that

    a = −b = log(2/β)  if β ≤ α,   and   a = log(2/α), b = log(β − α/2)  if β > α.

In particular we have the following table:

  Error probabilities      Computed bounds
   α       β                a          b

  .05     .05             3.6889    −3.6889
  .05     .01             5.2983    −5.2983
  .01     .05             5.2983    −3.1011
  .01     .01             5.2983    −5.2983
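The IBM 7090 experiment of Section 1.5 is easy to approximate today. The following self-contained sketch (our own reconstruction, not the original program) estimates the error level and expected sample size of Procedure III for k = 2, Δ = 1, α = β = .05; the estimates should be broadly comparable to the corresponding entries of table (1.5.2).

    import numpy as np

    def run_procedure_III(rng, a, b, delta, q):
        """Two categories; q = 0, 1, or 2 is the true state in table (1.1.1)."""
        sums, undecided, n = np.zeros(2), [True, True], 0
        def observe(j):
            nonlocal n
            x = rng.normal(delta if j + 1 == q else 0.0, 1.0)
            sums[j] += delta * x - delta ** 2 / 2
            n += 1
        observe(0); observe(1)                 # one observation on each to start
        while True:
            for j in (0, 1):
                if undecided[j] and sums[j] >= a:
                    return j + 1, n            # decision D_{j+1}
                if undecided[j] and sums[j] <= b:
                    undecided[j] = False
            if not any(undecided):
                return 0, n                    # decision D_0
            j = max((j for j in (0, 1) if undecided[j]), key=lambda j: sums[j])
            observe(j)

    rng = np.random.default_rng(0)
    a, b = np.log(2 / 0.05), -np.log(2 / 0.05)  # alpha = beta = .05, so a = -b = log 40
    for q in (0, 1):
        results = [run_procedure_III(rng, a, b, 1.0, q) for _ in range(10000)]
        err = np.mean([d != q for d, _ in results])
        size = np.mean([n for _, n in results])
        print(f"Q={q}: error rate ~{err:.3f}, E[N] ~{size:.1f}")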
CHAPTER II
ASYMPTOTIC RESULTS FOR THE CASE OF TWO
EXPERIMENTAL CATEGORIES
2.1 The problem. We wish to decide the true value of Q in

  Q     X^(1)   X^(2)
  0     g_0     g_0
  1     g_1     g_0
  2     g_0     g_1
We will consider asymptotic results for fixed sample size procedures and Procedures I, II, and III defined in Chapter I with −a = b = log c as the cost per observation c → 0+.

The choice of a = −b = −log c is natural: one would prefer that as the cost tended to zero the error probabilities also tend to zero. Suppose that α, the probability level of making an incorrect decision when Q = 0, and β, the probability level of making an incorrect decision when Q ≠ 0, satisfy α = M_α c and β = M_β c for some M_α > 0, M_β > 0, and c small. Then the suggested values of a and b given by (1.4.1) and (1.4.2) would have a ≈ −log c and b ≈ log c. For further discussion about the choice of −a = b = log c see Chernoff [3].
2.2 Assumptions.

(1) Suppose we have the loss function L(Q, D) = 0 if D = Q, and L(Q, D) = 1 if D ≠ Q, when Q is the state of nature and decision D is made.

(2) Suppose a cost c > 0 per observation is incurred.

(3) Assume (i) g_0 and g_1 are probability densities of different distributions with respect to a σ-finite measure μ; (ii) g_0(x) = 0 if and only if g_1(x) = 0, almost everywhere (μ); (iii) the integrals

    I_0 = ∫ log[ g_0(x)/g_1(x) ] g_0(x) dμ(x)   and   I_1 = ∫ log[ g_1(x)/g_0(x) ] g_1(x) dμ(x)

exist (finite) and are positive.

(4) The prior distribution ξ assigns probability ξ_j > 0 to Q = j for j = 0, 1, 2, where ξ_0 + ξ_1 + ξ_2 = 1.
2.3 Definitions and notation.

(5) For x_1^(1), x_2^(1), ... observations on X^(1) and x_1^(2), x_2^(2), ... observations on X^(2):

(a) r_{n_j}^(j) = Π_{i=1}^{n_j} [ g_1(x_i^(j)) / g_0(x_i^(j)) ],  j = 1, 2;

(b) z_i^(j) = log[ g_1(x_i^(j)) / g_0(x_i^(j)) ].

(6) By a procedure δ we mean a stopping rule, an experiment selection rule, and a terminal decision rule D. Define r(Q, δ) = expected loss + expected cost of observations, the risk associated with procedure δ when Q is the state of nature.

(7) For −∞ < t < ∞ let

(a) A(t) = E_0 e^{t z_1^(1)},   B(t) = E_1 e^{−t z_1^(1)},

where E_Q, Q = 0, 1, 2, denotes expectation when Q is the state of nature. (A or B may be infinite for some t-values.) Also define

(b) α_1 = inf_t A(t),   β_1 = inf_t B(t).
(8) Define for the procedure δ fixed:

D = the terminal decision rule, where D = j when decision Q = j is made; that is, D = j if and only if the terminal decision made is that Q, the true state of nature, is j.

D_j = the event that decision Q = j is made, D_j = { D = j }.

N_1 = the number of observations taken on X^(1); N_2 = the number of observations taken on X^(2); N = N_1 + N_2.

M_1 = the least n_1 such that Σ_{i=1}^{n_1} z_i^(1) ≥ −log c or Σ_{i=1}^{n_1} z_i^(1) ≤ log c;
M_2 = the least n_2 such that Σ_{i=1}^{n_2} z_i^(2) ≥ −log c or Σ_{i=1}^{n_2} z_i^(2) ≤ log c;
M = M_1 + M_2.

U_1 = the least n_1 such that Σ_{i=1}^{n_1} z_i^(1) > −log c, and V_2 = the least n_2 such that Σ_{i=1}^{n_2} z_i^(2) > −log c (one-sided crossing times; the analogous lower crossing time, the least n_j with Σ_{i=1}^{n_j} z_i^(j) ≤ log c, is denoted V_j below).

δ(n̄) will be used to denote a fixed sample size procedure based on n̄ = (n_1, n_2) observations with terminal decision rule D(n̄), and we write D_j(n̄) for the event { D(n̄) = j }. The expressions δ⁰(n̄) and D⁰(n̄) will be used to denote a Bayes fixed sample size procedure and a Bayes terminal decision rule.
2.4 Theorem 1 (The Bayes terminal decision rule). Given x_1^(1), ..., x_{n_1}^(1), x_1^(2), ..., x_{n_2}^(2) mutually independent, then

    D⁰(n̄) = 0  if r_{n_1}^(1) ≤ ξ_0/ξ_1 and r_{n_2}^(2) ≤ ξ_0/ξ_2,
           = 1  if r_{n_1}^(1) > ξ_0/ξ_1 and r_{n_1}^(1) ≥ (ξ_2/ξ_1) r_{n_2}^(2),
           = 2  if r_{n_2}^(2) > ξ_0/ξ_2 and r_{n_2}^(2) > (ξ_1/ξ_2) r_{n_1}^(1)

is a Bayes terminal decision rule.

Proof. Given x̄ = (x_1^(1), ..., x_{n_1}^(1), x_1^(2), ..., x_{n_2}^(2)), the posterior probability of Q = j is

    ξ_{j,n̄} = ξ_j g_{j,n̄} / Σ_{q=0}^{2} ξ_q g_{q,n̄},

where g_{j,n̄} denotes the joint density of the observations when Q = j. Therefore

    ξ_{1,n̄} / ξ_{0,n̄} = (ξ_1/ξ_0) r_{n_1}^(1).

Similarly ξ_{2,n̄}/ξ_{0,n̄} = (ξ_2/ξ_0) r_{n_2}^(2) and ξ_{2,n̄}/ξ_{1,n̄} = ξ_2 r_{n_2}^(2) / ( ξ_1 r_{n_1}^(1) ). Since the Bayes terminal decision rule makes decision 0 when L(ξ_n̄, 0) ≤ L(ξ_n̄, 1) and L(ξ_n̄, 0) ≤ L(ξ_n̄, 2), decision 1 when L(ξ_n̄, 1) < L(ξ_n̄, 0) and L(ξ_n̄, 1) ≤ L(ξ_n̄, 2), and decision 2 when L(ξ_n̄, 2) < L(ξ_n̄, 0) and L(ξ_n̄, 2) < L(ξ_n̄, 1) — that is, it maximizes the posterior probability under the 0-1 loss — the proof now follows easily.
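In computational form (an illustration of ours, not part of the original), the rule simply picks the largest of the posterior weights ξ_0, ξ_1 r_{n_1}^(1), ξ_2 r_{n_2}^(2), with ties here broken toward the smaller index:

    import numpy as np

    def bayes_decision(xi, r):
        """Bayes terminal decision under 0-1 loss: xi = (xi_0, ..., xi_k) priors,
        r = (r^(1), ..., r^(k)) likelihood ratios; decide the largest posterior weight."""
        weights = np.concatenate(([xi[0]], xi[1:] * np.asarray(r)))
        return int(np.argmax(weights))       # 0 = accept the control, j = category j

    # equal priors; category 2's ratio exceeds xi_0/xi_2 and dominates category 1
    print(bayes_decision(np.array([1/3, 1/3, 1/3]), [0.8, 2.5]))   # -> 2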
2.5 Theorem 2 (Optimal asymptotic sample size and risk for fixed sample size procedure).

(a) If n̄⁰(c) = (n_01(c), n_02(c)) where

    n_01(c) = n_02(c) = log c / log{ max(α_1, β_1) },

then

    r(δ⁰(n̄⁰)) = [1 + o(1)] · 2c log c / log{ max(α_1, β_1) }.

(b) If n̄ = n̄(c) is any sample size for which r(δ⁰(n̄)) ≤ O(c log c), then

    r(δ⁰(n̄)) ≥ [1 + o(1)] · 2c log c / log{ max(α_1, β_1) }.

(Therefore n̄⁰ is an optimal asymptotic (fixed) sample size.)

We first introduce some lemmas. Here we say h(x) is convex if for any two real numbers x, y we have h(γx + (1−γ)y) ≤ γh(x) + (1−γ)h(y) for 0 < γ < 1. For the definitions of A, B, α_1, and β_1 see (7) above. The proofs of the following three lemmas follow easily.

Lemma 1. The functions A(t) and B(t) are convex.

Lemma 2. For t ∈ [0, 1], A(t) < ∞ and B(t) < ∞, so that the intervals of (finite) convergence of A(t) and B(t) are each non-degenerate.

Lemma 3. (a) 0 < α_1 < 1 and 0 < β_1 < 1, and (b) there exist a ∈ (0, 1) and b ∈ (0, 1) such that α_1 = A(a) and β_1 = B(b).

Lemma 4. If we denote by P_i the probability measure associated with Q = i, then

(a) there exists a number σ > 0 so that

    P_0( r_{n_1}^(1) > ξ_0/ξ_1 ) ≤ σ α_1^{n_1},   P_1( r_{n_1}^(1) ≤ ξ_0/ξ_1 ) ≤ σ β_1^{n_1},   P_2( r_{n_1}^(1) > ξ_0/ξ_1 ) ≤ σ α_1^{n_1},

and correspondingly

    P_0( r_{n_2}^(2) > ξ_0/ξ_2 ) ≤ σ α_1^{n_2},   P_1( r_{n_2}^(2) > ξ_0/ξ_2 ) ≤ σ α_1^{n_2},   P_2( r_{n_2}^(2) ≤ ξ_0/ξ_2 ) ≤ σ β_1^{n_2};

(b) if 0 < ε < min(α_1, β_1), then

    P_0( r_{n_1}^(1) > ξ_0/ξ_1 ) ≥ (α_1 − ε)^{n_1}   for n_1 sufficiently large,
    P_1( r_{n_1}^(1) ≤ ξ_0/ξ_1 ) ≥ (β_1 − ε)^{n_1}   for n_1 sufficiently large.

Proof. Noting that P_0( r_{n_1}^(1) > ξ_0/ξ_1 ) = P_0( Σ_{i=1}^{n_1} z_i^(1) > log(ξ_0/ξ_1) ), an application of Theorem A.1 of the Appendix (with the constant threshold contributing the factor σ) gives the proof of (a). Since

    P_0( Σ_{i=1}^{n_1} z_i^(1) > log(ξ_0/ξ_1) ) ≥ P_0( Σ_{i=1}^{n_1} z_i^(1) > ν n_1 )

for any ν > 0 and n_1 sufficiently large, and we may choose ν so close to 0 that α* = inf_t e^{−νt} A(t) ≥ α_1 − ε/2, so that α* − ε/2 ≥ α_1 − ε, by applying Theorem A.1 again the proof of (b) also follows.

Now the expected loss satisfies (with r_1 = r_{n_1}^(1) and r_2 = r_{n_2}^(2))

    E L(0, δ⁰(n̄)) = P_0( r_1 > ξ_0/ξ_1 or r_2 > ξ_0/ξ_2 )
                  ≥ P_0( r_1 > ξ_0/ξ_1, r_2 ≤ ξ_0/ξ_2 ) + P_0( r_2 > ξ_0/ξ_2, r_1 ≤ ξ_0/ξ_1 )
                  = P_0( r_1 > ξ_0/ξ_1 ) P_0( r_2 ≤ ξ_0/ξ_2 ) + P_0( r_2 > ξ_0/ξ_2 ) P_0( r_1 ≤ ξ_0/ξ_1 )
                  ≥ ½ P_0( r_1 > ξ_0/ξ_1 ) + ½ P_0( r_2 > ξ_0/ξ_2 )

for n_1, n_2 sufficiently large, by Lemma 4(a). Therefore we have

Lemma 5. (a) For all n_1 and n_2,

    E L(0, δ⁰(n̄)) ≤ P_0( r_1 > ξ_0/ξ_1 ) + P_0( r_2 > ξ_0/ξ_2 ),
    E L(1, δ⁰(n̄)) ≤ P_1( r_1 ≤ ξ_0/ξ_1 ) + P_1( r_2 > ξ_0/ξ_2 ),
    E L(2, δ⁰(n̄)) ≤ P_2( r_2 ≤ ξ_0/ξ_2 ) + P_2( r_1 > ξ_0/ξ_1 );

(b) for all sufficiently large n_1 and n_2,

    E L(0, δ⁰(n̄)) ≥ ½ P_0( r_1 > ξ_0/ξ_1 ) + ½ P_0( r_2 > ξ_0/ξ_2 ).

Proof of Theorem 2. Now we see by Lemma 5(a) and Lemma 4(a) that if

    n_01 = n_02 = max{ log c/log α_1, log c/log β_1 } = log c / log{ max(α_1, β_1) },

then E L(j, δ⁰(n̄⁰)) ≤ 2σ [max(α_1, β_1)]^{n_01} = O(c) for j = 0, 1, 2, so that

    r(δ⁰(n̄⁰)) = O(c) + c(n_01 + n_02) = [1 + o(1)] · 2c log c / log{ max(α_1, β_1) },

which proves Theorem 2(a).

If r(δ⁰(n̄)) ≤ O(c log c), then it follows that lim inf_{c→0} n_1 = lim inf_{c→0} n_2 = +∞. Let 0 < ε < min(α_1, β_1); for c sufficiently small we have, for some M > 0, that r(δ⁰(n̄)) ≤ −Mc log c, so that

    P_0( r_1 > ξ_0/ξ_1 ) ≤ −(2/ξ_0) Mc log c,   P_1( r_1 ≤ ξ_0/ξ_1 ) ≤ −(1/ξ_1) Mc log c,

and also by Lemma 4(b), P_0( r_1 > ξ_0/ξ_1 ) ≥ (α_1 − ε)^{n_1} and P_1( r_1 ≤ ξ_0/ξ_1 ) ≥ (β_1 − ε)^{n_1}. Hence it follows that

    n_1 log(α_1 − ε) ≤ log c + log( −(2/ξ_0) M log c ),
    n_1 log(β_1 − ε) ≤ log c + log( −(1/ξ_1) M log c ).

Thus we have, for each 0 < ε < min(α_1, β_1),

    n_1 ≥ log c / log(α_1 − ε) + o(log c)   and   n_1 ≥ log c / log(β_1 − ε) + o(log c),

which implies

    n_1 ≥ log c / log α_1 + o(log c)   and   n_1 ≥ log c / log β_1 + o(log c),

and so

    n_1 ≥ log c / log{ max(α_1, β_1) } + o(log c).

Similarly n_2 ≥ log c / log{ max(α_1, β_1) } + o(log c). Therefore

    r(δ⁰(n̄)) ≥ c(n_1 + n_2) ≥ [1 + o(1)] · 2c log c / log{ max(α_1, β_1) },

which proves Theorem 2(b).
2.6 Theorem 3 (Bounds for α_1 and β_1).

(a) α_1 > e^{−I_0} and β_1 > e^{−I_1}, and

(b) −log{ max(α_1, β_1) } < min(I_0, I_1).                           (2.6.1)

Proof. Now E_0 e^{t z_1^(1)} ≥ e^{t E_0 z_1^(1)} = e^{−t I_0} for each t, with strict inequality holding for t ≠ 0, so we have, with a ∈ (0, 1) as in Lemma 3,

    α_1 = E_0 e^{a z_1^(1)} > e^{a E_0 z_1^(1)} = e^{−a I_0} ≥ e^{−I_0}.

Also E_1 e^{−t z_1^(1)} ≥ e^{−t E_1 z_1^(1)} = e^{−t I_1} for each t, with strict inequality holding for t ≠ 0, so we have, with b ∈ (0, 1) as in Lemma 3,

    β_1 = E_1 e^{−b z_1^(1)} > e^{−b E_1 z_1^(1)} = e^{−b I_1} ≥ e^{−I_1},

which proves (a). By (a), log α_1 > −I_0 and log β_1 > −I_1, and so

    log{ max(α_1, β_1) } = max( log α_1, log β_1 ) > max( −I_0, −I_1 ) = −min(I_0, I_1),

which implies −log{ max(α_1, β_1) } < min(I_0, I_1), which proves (b).
Lemma 6. If S is an event such that P_0(S) > 0, P_1(S) > 0, and P_2(S) > 0, and if the procedure terminates with probability 1, then

(a) P_1(S) = P_0(S) E_0( Π_{i=1}^{N_1} [ g_1(x_i^(1)) / g_0(x_i^(1)) ] | S ),
(b) P_0(S) = P_1(S) E_1( Π_{i=1}^{N_1} [ g_0(x_i^(1)) / g_1(x_i^(1)) ] | S ),
(c) P_2(S) = P_0(S) E_0( Π_{i=1}^{N_2} [ g_1(x_i^(2)) / g_0(x_i^(2)) ] | S ),
(d) P_0(S) = P_2(S) E_2( Π_{i=1}^{N_2} [ g_0(x_i^(2)) / g_1(x_i^(2)) ] | S ).

Proof. Our sample space Ω consists of points x̄. Define

    S_jk = { x̄ : N_1 = j and N_2 = k, x̄ ∈ S },   φ_jk(x̄) = 1 if x̄ ∈ S_jk, 0 otherwise.

By assumption (3)(ii) the likelihood ratios below are well defined with probability 1 under Q = 0, 1, 2, and since the procedure terminates with probability 1 it follows that Σ_{j=1}^{∞} Σ_{k=1}^{∞} φ_jk = 1 with probability 1 under Q = 0, 1, 2. Let us now prove (a). Now

    P_1(S) = Σ_{j=1}^{∞} Σ_{k=1}^{∞} P_1(S_jk)
           = Σ_{j=1}^{∞} Σ_{k=1}^{∞} E_0( φ_jk Π_{u=1}^{j} [ g_1(x_u^(1)) / g_0(x_u^(1)) ] ),

since on S_jk the joint density of the observations when Q = 1 differs from that when Q = 0 only by the factor Π_{u=1}^{j} [ g_1(x_u^(1)) / g_0(x_u^(1)) ]. Then we have

    P_1(S) = E_0( φ_{N_1 N_2} Π_{u=1}^{N_1} [ g_1(x_u^(1)) / g_0(x_u^(1)) ] )
           = P_0(S) E_0( Π_{u=1}^{N_1} [ g_1(x_u^(1)) / g_0(x_u^(1)) ] | S ),

which proves (a). The proofs of (b), (c), and (d) are similar.
Lemma 7. We have that

(a)(i) P_0( Σ_{i=1}^{M_1} z_i^(1) ≥ −log c ) = P_2( Σ_{i=1}^{M_1} z_i^(1) ≥ −log c ) ≤ c,
   (ii) P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) ≤ c;

(b)(i) P_0( Σ_{i=1}^{M_2} z_i^(2) ≥ −log c ) = P_1( Σ_{i=1}^{M_2} z_i^(2) ≥ −log c ) ≤ c,
   (ii) P_2( Σ_{i=1}^{M_2} z_i^(2) ≤ log c ) ≤ c;

(c)(i) E_0 M_1 = E_2 M_1 = −[1 + o(1)] log c / I_0,
   (ii) E_1 M_1 = −[1 + o(1)] log c / I_1;

(d)(i) E_0 M_2 = E_1 M_2 = −[1 + o(1)] log c / I_0,
   (ii) E_2 M_2 = −[1 + o(1)] log c / I_1.

Proof. Let A_n be the set in the sample space on which we have n = M_1 and Σ_{i=1}^{M_1} z_i^(1) ≥ −log c, and let B_n be the set in the sample space on which we have n = M_1 and Σ_{i=1}^{M_1} z_i^(1) ≤ log c. Then on A_n, Σ_{i=1}^{n} z_i^(1) ≥ −log c; that is,

    Π_{i=1}^{n} g_0(x_i^(1)) ≤ c Π_{i=1}^{n} g_1(x_i^(1)).

Therefore P_0(A_n) ≤ c P_1(A_n), so that

    P_0( Σ_{i=1}^{M_1} z_i^(1) ≥ −log c ) = Σ_{n=1}^{∞} P_0(A_n) ≤ c Σ_{n=1}^{∞} P_1(A_n) ≤ c.

Similarly

    P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) = Σ_{n=1}^{∞} P_1(B_n) ≤ c Σ_{n=1}^{∞} P_0(B_n) ≤ c,

which completes the proof of (a), the equality in (a)(i) holding since the x_i^(1) have the same distribution under Q = 0 and Q = 2. Similarly (b) follows.

Let S_0 be the set in the sample space on which Σ_{i=1}^{M_1} z_i^(1) ≤ log c, and S_1 the set on which Σ_{i=1}^{M_1} z_i^(1) ≥ −log c. We will have P_0(S_0) + P_0(S_1) = P_1(S_0) + P_1(S_1) = 1. Now by Theorem A.2 of the Appendix and Jensen's inequality,

    (E_0 M_1)(−I_0) = E_0( Σ_{i=1}^{M_1} z_i^(1) )
                    = P_0(S_0) E_0( Σ_{i=1}^{M_1} z_i^(1) | S_0 ) + P_0(S_1) E_0( Σ_{i=1}^{M_1} z_i^(1) | S_1 )
                    ≤ P_0(S_0) log E_0( Π_{i=1}^{M_1} [ g_1/g_0 ](x_i^(1)) | S_0 ) + P_0(S_1) log E_0( Π_{i=1}^{M_1} [ g_1/g_0 ](x_i^(1)) | S_1 ),

and by applying Lemma 6 we have

    −I_0 E_0 M_1 ≤ P_0(S_0) log[ P_1(S_0)/P_0(S_0) ] + P_0(S_1) log[ P_1(S_1)/P_0(S_1) ].

By (a), P_0(S_1) ≤ c and P_1(S_0) ≤ c, so that P_0(S_0) → 1; moreover log P_1(S_0) ≤ log c while the remaining terms are o(log c). Hence −I_0 E_0 M_1 ≤ [1 + o(1)] log c; that is,

    E_0 M_1 ≥ −[1 + o(1)] log c / I_0.

Now let V_1 be the least n_1 such that Σ_{i=1}^{n_1} z_i^(1) ≤ log c, so that M_1 ≤ V_1. Let us show that E_0 V_1 ≤ −[1 + o(1)] log c / I_0. We shall use a type of proof which can be found in Chernoff [2]. Let ε > 0 be given. If n_1 ≥ −(1 + ε) log c / I_0, then log c ≥ −n_1 I_0/(1 + ε). In such case, for t > 0,

    P_0( Σ_{i=1}^{n_1} z_i^(1) > log c ) ≤ P_0( Σ_{i=1}^{n_1} ( z_i^(1) + I_0/(1+ε) ) > 0 ) ≤ [ E_0 e^{ t ( z_1^(1) + I_0/(1+ε) ) } ]^{n_1}.

But z_1^(1) + I_0/(1+ε) has negative mean and finite moment generating function for 0 < t < 1. Hence the right-hand derivative of the moment generating function is negative at t = 0. Thus there is a t* = t_0*(ε) > 0 so that

    E_0 e^{ t* ( z_1^(1) + I_0/(1+ε) ) } ≤ a_0   for some 0 < a_0 < 1.

Thus it follows that

    P_0( V_1 > n_1 ) ≤ P_0( Σ_{i=1}^{n_1} z_i^(1) > log c ) ≤ a_0^{n_1}   for n_1 ≥ −(1 + ε) log c / I_0,

which is sufficient to show that E_0 V_1 ≤ −[1 + o(1)] log c / I_0. Therefore we have that

    E_0 M_1 = E_2 M_1 = −[1 + o(1)] log c / I_0,

which proves (c)(i).

By Theorem A.2 and Jensen's inequality again,

    −(E_1 M_1) I_1 = E_1( Σ_{i=1}^{M_1} (−z_i^(1)) )
                   ≤ P_1(S_0) log E_1( Π_{i=1}^{M_1} [ g_0/g_1 ](x_i^(1)) | S_0 ) + P_1(S_1) log E_1( Π_{i=1}^{M_1} [ g_0/g_1 ](x_i^(1)) | S_1 ),

and by Lemma 6 we have

    −I_1 E_1 M_1 ≤ P_1(S_0) log[ P_0(S_0)/P_1(S_0) ] + P_1(S_1) log[ P_0(S_1)/P_1(S_1) ].

By applying (a) again, P_0(S_1) ≤ c and P_1(S_0) ≤ c, so that P_1(S_1) → 1, log P_0(S_1) ≤ log c, and the remaining terms are o(log c); hence

    E_1 M_1 ≥ −[1 + o(1)] log c / I_1.

Now E_1 M_1 ≤ E_1 U_1. But, for n_1 ≥ −(1 + ε) log c / I_1, we have −log c ≤ n_1 I_1/(1+ε), and

    P_1( U_1 > n_1 ) ≤ P_1( Σ_{i=1}^{n_1} z_i^(1) ≤ −log c ) ≤ P_1( Σ_{i=1}^{n_1} ( z_i^(1) − I_1/(1+ε) ) ≤ 0 ) ≤ a_1^{n_1}

for some 0 < a_1 < 1, since z_1^(1) − I_1/(1+ε) has positive mean and finite moment generating function for −1 < t < 0, so that the left-hand derivative of its moment generating function is positive at t = 0 and there is a t_1* < 0 with E_1 e^{ t_1* ( z_1^(1) − I_1/(1+ε) ) } = a_1 < 1. Therefore E_1 U_1 ≤ −[1 + o(1)] log c / I_1, and we have

    E_1 M_1 = −[1 + o(1)] log c / I_1,

which proves (c)(ii). The proof of (d) is similar to that of (c). This completes the proof of Lemma 7.
2.7 Theorem 4 (Asymptotic lower bound for risk of any procedure). Any procedure δ for which r(δ) ≤ O(c log c) has

    r(δ) ≥ −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)/I_1 } c log c.

Proof. If r(δ) ≤ O(c log c), then it follows that P_i(D_j) ≤ −Mc log c for some M > 0 and all i ≠ j, for c sufficiently small. Now by Lemma 6, Theorem A.2, and Jensen's inequality,

    −I_0 E_0 N_1 = E_0( Σ_{i=1}^{N_1} z_i^(1) )
                 ≤ P_0(D_1 or D_2) log E_0( Π_{i=1}^{N_1} [ g_1/g_0 ](x_i^(1)) | D_1 or D_2 ) + P_0(D_0) log E_0( Π_{i=1}^{N_1} [ g_1/g_0 ](x_i^(1)) | D_0 )
                 = P_0(D_1 or D_2) log[ P_1(D_1 or D_2)/P_0(D_1 or D_2) ] + P_0(D_0) log[ P_1(D_0)/P_0(D_0) ].

But

    lim inf_{c→0} [ P_0(D_1 or D_2) log P_1(D_1 or D_2) ] / log c ≥ lim_{c→0} [ P_0(D_1 or D_2) log(−Mc log c) ] / log c = 0

and

    lim inf_{c→0} [ P_0(D_0) log P_1(D_0) ] / log c ≥ lim_{c→0} [ log(−Mc log c) ] / log c = 1,

so that

    E_0 N_1 ≥ −[1 + o(1)] log c / I_0.

Similarly E_0 N_2 ≥ −[1 + o(1)] log c / I_0, E_1 N_1 ≥ −[1 + o(1)] log c / I_1, and E_2 N_2 ≥ −[1 + o(1)] log c / I_1. Therefore it follows that

    r(δ) ≥ c [ ξ_0 E_0 N + ξ_1 E_1 N + ξ_2 E_2 N ] ≥ −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)/I_1 } c log c,

which completes the proof.
2.8 Theorem 5 (Asymptotic risks of Procedures I, II, and III). We have that

(a) r(I) = −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)( 1/I_1 + 1/(2I_0) ) } c log c;

(b) r(II) ≤ −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)( 1/I_1 + 1/max(I_0, I_1) ) } c log c;

(c) r(III) = −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)/I_1 } c log c,

where r(δ) denotes the Bayes risk with Procedure δ, δ = I, II, III.

Proof. Let us first consider Procedure I. Now

    E L(0, δ) = P_0(D_1 or D_2)
              = ½ P_0( D_1 or D_2 | started with X^(1) ) + ½ P_0( D_1 or D_2 | started with X^(2) )
              = ½ P_0( Σ_{i=1}^{M_1} z_i^(1) ≥ −log c, or Σ_{i=1}^{M_1} z_i^(1) ≤ log c and Σ_{i=1}^{M_2} z_i^(2) ≥ −log c )
                + ½ P_0( Σ_{i=1}^{M_2} z_i^(2) ≥ −log c, or Σ_{i=1}^{M_2} z_i^(2) ≤ log c and Σ_{i=1}^{M_1} z_i^(1) ≥ −log c )
              ≤ ½ (c + c) + ½ (c + c) = 2c

by Lemma 7. Similarly, since the event D_0 or D_2 requires Σ_{i=1}^{M_1} z_i^(1) ≤ log c or Σ_{i=1}^{M_2} z_i^(2) ≥ −log c,

    E L(1, δ) ≤ ½ (c + c) + ½ (c + c) = 2c

by Lemma 7, and E L(2, δ) ≤ 2c. Now

    E_0 N = ½ [ E_0 M_1 + P_0( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) E_0 M_2 ] + ½ [ E_0 M_2 + P_0( Σ_{i=1}^{M_2} z_i^(2) ≤ log c ) E_0 M_1 ].

By Lemma 7, P_0( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) = 1 + O(c) and E_0 M_j = −[1 + o(1)] log c / I_0, and therefore

    E_0 N = −[1 + o(1)] · 2 log c / I_0.

Also

    E_1 N = ½ [ E_1 M_1 + P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) E_1 M_2 ] + ½ [ E_1 M_2 + P_1( Σ_{i=1}^{M_2} z_i^(2) ≤ log c ) E_1 M_1 ],

and by applying Lemma 7 again (E_1 M_1 = −[1+o(1)] log c/I_1, E_1 M_2 = −[1+o(1)] log c/I_0, P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) ≤ c, P_1( Σ_{i=1}^{M_2} z_i^(2) ≤ log c ) = 1 + O(c)),

    E_1 N = −[1 + o(1)] ( 1/I_1 + 1/(2I_0) ) log c.

Similarly E_2 N = −[1 + o(1)] ( 1/I_1 + 1/(2I_0) ) log c. Then we have

    r(I) = Σ_{j=0}^{2} ξ_j [ E L(j, δ) + c E_j N ] = −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)( 1/I_1 + 1/(2I_0) ) } c log c,

which proves (a).

Now let us consider Procedure II.

    E L(0, δ) = P_0(D_1) + P_0(D_2) ≤ P_0( Σ_{i=1}^{M_1} z_i^(1) ≥ −log c ) + P_0( Σ_{i=1}^{M_2} z_i^(2) ≥ −log c ) ≤ 2c

by Lemma 7; similarly E L(1, δ) ≤ 2c and E L(2, δ) ≤ 2c. Now

    E_0 N = E_0 N_1 + E_0 N_2 ≤ E_0 M_1 + E_0 M_2 = −[1 + o(1)] · 2 log c / I_0

by Lemma 7. Under Q = 1 we have N_1 ≤ M_1 and, since sampling proceeds in tuples until a boundary is crossed, N_2 ≤ min(M_2, U_1 + 1). We have seen in the proof of Lemma 7 that E_1 U_1 ≤ −[1 + o(1)] log c / I_1, and also E_1 M_2 = −[1 + o(1)] log c / I_0; thus it follows that

    E_1 N ≤ −[1 + o(1)] ( 1/I_1 + 1/max(I_0, I_1) ) log c.

Similarly E_2 N ≤ −[1 + o(1)] ( 1/I_1 + 1/max(I_0, I_1) ) log c. Thus we have that

    r(II) ≤ −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)( 1/I_1 + 1/max(I_0, I_1) ) } c log c,

and in view of Theorem 4 this proves (b).

Now let us consider Procedure III.

    E L(1, δ) = P_1(D_0 or D_2) ≤ P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) + P_1( Σ_{i=1}^{M_2} z_i^(2) ≥ −log c ) ≤ c + c = 2c

by Lemma 7; similarly E L(0, δ) ≤ 2c and E L(2, δ) ≤ 2c. Also

    E_0 N ≤ E_0 M_1 + E_0 M_2 = −[1 + o(1)] · 2 log c / I_0.

Let us now show that E_1 N ≤ −[1 + o(1)] log c / I_1. We have that E_1 N_1 ≤ E_1 M_1 ≤ −[1 + o(1)] log c / I_1. Let us show E_1 N_2 = o(1) log c. It is sufficient to show that there is a τ_1, 0 < τ_1 < 1, so that after n_1 observations on X^(1) and n_2 observations on X^(2) we have

    P_1( Σ_{i=1}^{n_2} z_i^(2) ≥ Σ_{i=1}^{n_1} z_i^(1) ) ≤ τ_1^{n_2},

for then the (n_2 + 1)-th observation is taken on X^(2) (while X^(1) is unresolved) with probability at most τ_1^{n_2}, whence the expected number of such observations is O(1); and the case where X^(1) is resolved at its lower boundary first contributes at most P_1( Σ_{i=1}^{M_1} z_i^(1) ≤ log c ) E_1 M_2 = o(log c) by Lemma 7. Let

    Ψ_1(t) = E_1 e^{ t ( z_1^(1) − z_1^(2) ) } = B(−t) A(−t).

But z_1^(1) − z_1^(2) has a positive mean (namely I_1 + I_0) and finite moment generating function for −1 ≤ t ≤ 0. Hence the left-hand derivative of Ψ_1 is positive at t = 0. Now since A and B are convex with A(0) = A(1) = B(0) = B(1) = 1, there is a t_1*, −1 < t_1* < 0, so that τ_1 = A(−t_1*) < 1 and B(−t_1*) ≤ 1, and hence

    P_1( Σ_{i=1}^{n_2} z_i^(2) − Σ_{i=1}^{n_1} z_i^(1) ≥ 0 ) ≤ [ B(−t_1*) ]^{n_1} [ A(−t_1*) ]^{n_2} ≤ τ_1^{n_2},

which completes the proof that E_1 N ≤ −[1 + o(1)] log c / I_1. Similarly E_2 N ≤ −[1 + o(1)] log c / I_1. Therefore we have

    r(III) ≤ −[1 + o(1)] { 2ξ_0/I_0 + (ξ_1 + ξ_2)/I_1 } c log c,

which in view of Theorem 4 completes the proof of (c). This completes the proof of the theorem.
CHAPTER III

ASYMPTOTIC RESULTS FOR THE CASE OF k EXPERIMENTAL CATEGORIES

We are now able to make straightforward generalizations of the results of Chapter II.

3.1 The problem. We wish to decide the true value of Q in

  Q     X^(1)   X^(2)   X^(3)   ...   X^(k)
  0     g_0     g_0     g_0     ...   g_0
  1     g_1     g_0     g_0     ...   g_0
  2     g_0     g_1     g_0     ...   g_0
  ...
  k     g_0     g_0     g_0     ...   g_1

where k ≥ 2. We will consider asymptotic results for fixed sample size procedures and Procedures I, II, and III defined in Chapter I with −a = b = log c as the cost per observation c → 0+.

3.2 Assumptions. The assumptions (1), (2), and (3) of Chapter II apply, but now we have

(4′) The prior distribution ξ = (ξ_0, ξ_1, ..., ξ_k) assigns probability ξ_j > 0 to Q = j for j = 0, 1, ..., k.

3.3 Definitions and notation.

(5′) For x_1^(j), x_2^(j), ... observations on X^(j), j = 1, 2, ..., k:

(a) r_{n_j}^(j) = Π_{i=1}^{n_j} [ g_1(x_i^(j)) / g_0(x_i^(j)) ];
(b) z_i^(j) = log[ g_1(x_i^(j)) / g_0(x_i^(j)) ];
(c) n̄ = (n_1, n_2, ..., n_k).

(6′), (7′) The same as in Chapter II, with

    r(ξ, δ) = ξ_0 r(0, δ) + ξ_1 r(1, δ) + ... + ξ_k r(k, δ).

(8′) Define for the Procedure δ fixed:

D = the terminal decision rule, where D = j when decision Q = j is made; that is, D = j if and only if the terminal decision made is that Q, the true state of nature, is j.
D_j = the event that decision Q = j is made, D_j = { D = j }, j = 0, 1, ..., k.
N_j = the number of observations on X^(j); N = N_1 + N_2 + ... + N_k.
M_j = the least n_j such that Σ_{i=1}^{n_j} z_i^(j) ≥ −log c or Σ_{i=1}^{n_j} z_i^(j) ≤ log c; M = M_1 + M_2 + ... + M_k.
U_j = the least n_j such that Σ_{i=1}^{n_j} z_i^(j) > −log c, and V_j = the least n_j such that Σ_{i=1}^{n_j} z_i^(j) ≤ log c.

δ(n̄) will be used to denote a fixed sample size procedure based on n̄ = (n_1, n_2, ..., n_k) observations with terminal decision rule D(n̄), and we write D_j(n̄) for the event { D(n̄) = j }. The expressions δ⁰(n̄) and D⁰(n̄) will be used to denote a Bayes fixed sample size procedure and a Bayes terminal decision rule.
3.4 Theorem 6 (The Bayes terminal decision rule). Given x_i^(j), j = 1, 2, ..., k and i = 1, 2, ..., n_j, mutually independent, then D⁰(n̄) such that

    D⁰(n̄) = 0  if r_{n_1}^(1) ≤ ξ_0/ξ_1, r_{n_2}^(2) ≤ ξ_0/ξ_2, ..., r_{n_k}^(k) ≤ ξ_0/ξ_k,
    ...
    D⁰(n̄) = k  if r_{n_k}^(k) > ξ_0/ξ_k, ξ_1 r_{n_1}^(1) ≤ ξ_k r_{n_k}^(k), ..., ξ_{k−1} r_{n_{k−1}}^(k−1) ≤ ξ_k r_{n_k}^(k)

(that is, D⁰(n̄) = j whenever ξ_j r_{n_j}^(j) exceeds ξ_0 and is the largest of ξ_1 r_{n_1}^(1), ..., ξ_k r_{n_k}^(k), and D⁰(n̄) = 0 when no ξ_j r_{n_j}^(j) exceeds ξ_0) is a Bayes terminal decision rule.

Proof. Similar to Theorem 1.
3.5 Theorem 7 (Optimal sample size and risk for fixed sample size procedure).

(a) If n̄⁰(c) = (n_01(c), ..., n_0k(c)) where n_01(c) = n_02(c) = ... = n_0k(c) = log c / log{ max(α_1, β_1) }, then

    r(δ⁰(n̄⁰)) = [1 + o(1)] k c log c / log{ max(α_1, β_1) }.

(b) If n̄ = n̄(c) is any sample size for which r(δ⁰(n̄)) ≤ O(c log c), then

    r(δ⁰(n̄)) ≥ [1 + o(1)] k c log c / log{ max(α_1, β_1) }.

(Therefore n̄⁰ is an optimal asymptotic (fixed) sample size.)

Proof. Similar to Theorem 2.

3.6 Theorem 8 (Asymptotic lower bound for risk of any procedure). Any procedure δ for which r(δ) ≤ O(c log c) has

    r(δ) ≥ −[1 + o(1)] { kξ_0/I_0 + (ξ_1 + ξ_2 + ... + ξ_k)/I_1 } c log c.

Proof. Similar to Theorem 4.
3.7 Theorem 9 (Asymptotic risks of Procedures I, II, and III). We will have that

(a) r(I) = −[1 + o(1)] { kξ_0/I_0 + (ξ_1 + ξ_2 + ... + ξ_k)( 1/I_1 + (k−1)/(2I_0) ) } c log c;

(b) r(II) ≤ −[1 + o(1)] { kξ_0/I_0 + (ξ_1 + ξ_2 + ... + ξ_k)( 1/I_1 + (k−1)/max(I_0, I_1) ) } c log c;

(c) r(III) = −[1 + o(1)] { kξ_0/I_0 + (ξ_1 + ξ_2 + ... + ξ_k)/I_1 } c log c,

where r(δ) denotes the Bayes risk with Procedure δ, δ = I, II, III.

Proof. Similar to Theorem 5.

Note. In view of Theorems 8 and 9 we see that Procedure III is asymptotically optimal.
3.8 Table of prices of the procedures. We define P_δ, the price of Procedure δ, by

    P_δ = lim sup_{c→0} r(δ) / (−c log c),

and obtain (by use of Theorems 3, 7, and 9) the table (1.5.1) of prices.

3.9 Remarks concerning the relative merits of the procedures considered. It is easily verified that the sequential Procedures I, II, and III all have (strictly) smaller prices than the best fixed sample size procedure. If 2I_0 ≤ I_1, then P_II ≤ P_I. (That is, if 2I_0 ≤ I_1, Procedure I is not better than Procedure II.) Also for I_1/I_0 small, P_II ≈ P_opt and P_I ≈ P_opt, and for I_0/I_1 small, P_II ≈ P_opt. Thus Procedure II is approximately asymptotically optimal for either I_0/I_1 small or I_1/I_0 small, although the possibility is left open that Procedure II is asymptotically optimal.
APPENDIX

Theorem A.1 (Bounds for the sample mean). Let X_1, X_2, ... be independent and identically distributed random variables. Define for a fixed real a:

(1) φ(t) = E e^{t X_1} for −∞ < t < ∞ (possibly infinite);
(2) p_n = P( X_1 + X_2 + ... + X_n ≥ na );
(3) w(t) = e^{−at} φ(t);
(4) T = { t : φ(t) < ∞ }.

If (i) P(X_1 = a) < 1, (ii) T is a non-degenerate interval, and (iii) there exists a positive t̃ in the interior of T such that w(t̃) = inf_{t ∈ T} w(t) = ρ (say), then

(a) p_n ≤ ρ^n, and
(b) for every real number ε such that 0 < ε < ρ, p_n ≥ (ρ − ε)^n for n sufficiently large.

Proof. (a) This is essentially due to Chernoff [3] but is stated and proved in this form by Bahadur and Rao [1]. (b) By an argument similar to that used by Bahadur and Rao in the proof of (a), this result follows.
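For instance (a numerical illustration of ours, not part of the original), for X_i ~ N(0, 1) one has φ(t) = e^{t²/2}, w(t) = e^{t²/2 − at}, and ρ = inf_t w(t) = e^{−a²/2}, attained at t̃ = a; the bound of part (a) can be checked by simulation:

    import numpy as np

    # Theorem A.1(a) for X_i ~ N(0,1) and a = 0.5:
    # phi(t) = exp(t**2/2), w(t) = exp(t**2/2 - a*t), rho = exp(-a**2/2).
    rng = np.random.default_rng(0)
    a = 0.5
    rho = np.exp(-a ** 2 / 2)
    for n in (4, 16, 36):
        means = rng.normal(size=(200_000, n)).mean(axis=1)
        p_hat = (means >= a).mean()
        print(f"n={n:2d}: P(mean >= a) ~ {p_hat:.4f} <= rho^n = {rho ** n:.4f}")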
Theorem A.2 (Wald's equation). Suppose that

(a) z_1, z_2, ... are identically distributed;
(b) E|z_1| < ∞;
(c) N is a random variable whose values are the positive integers;
(d) the event { N = j } and the random variable z_k are independent for j < k;
(e) E N < ∞;

then

    E( Σ_{i=1}^{N} z_i ) = (E N)(E z_1).

Proof. See Johnson [4].
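A quick simulation check of the equation (ours; here N is a stopping time depending only on past z's, so (d) holds):

    import numpy as np

    rng = np.random.default_rng(0)
    tot_sum = tot_n = 0.0
    reps = 100_000
    for _ in range(reps):
        s = n = 0.0
        while s < 5.0:                 # N = first n with z_1 + ... + z_n >= 5
            s += rng.exponential(2.0)  # E z_1 = 2
            n += 1
        tot_sum += s
        tot_n += n
    # E(sum) versus (E N)(E z_1): the two printed values are nearly equal
    print(tot_sum / reps, 2.0 * tot_n / reps)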
BIBLIOGRAPHY

[1] Bahadur, R. R. and Rao, R. R., "On deviations of the sample mean," Annals of Mathematical Statistics, Volume 31 (1960), pp. 1015-1027.

[2] Chernoff, H., "Sequential design of experiments," Annals of Mathematical Statistics, Volume 30 (1959), pp. 755-770.

[3] Chernoff, H., "A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations," Annals of Mathematical Statistics, Volume 23 (1952), pp. 493-507.

[4] Johnson, N. L., "A proof of Wald's theorem on cumulative sums," Annals of Mathematical Statistics, Volume 30 (1959), pp. 1245-1247.

[5] Paulson, E., "A sequential procedure for comparing several experimental categories with a standard or control," Annals of Mathematical Statistics, Volume 33 (1962), pp. 438-443.

[6] Wald, A., Sequential Analysis, John Wiley and Sons, New York, 1947.