UNIVERSITY Ol" NOffl'H CAROLINA
Department of Statistics
Chapel Hill, N. C.
A HODIFIED BAYES STOPPING RULE
by
'e
Sigmund J.
Ar~ter
Narch 1962
ThJs research 'Has supported by the Office of Naval
Research under contract No. Nonr-055 (09), for research
in probability and statistics at the University of
North Carolina, Chapel mIl, N. C. Reproduction in
whole or in part is permitted for any purpose of the
United States Government.
()
Institute of Statistics
Mimeo Series No. 318
ii
ACKNOWLEDGH1ENTS
I am deeply indebted to many individuals for their inspiration
and encouragement during my school years.
Ho,vever, it ..lOuld be an
injustice to the important contributions made by those not named) i f
any except my adviser, Dr. Hilliam J. Hall, \(Tere to be inc}j.vidually
mentioned.
'tlords can hardly convey my most sincere appreciation for
his guidance and assistance throughout the course of this invest:Lgation.
The Office of Naval Research is to be thalliced for their financial
support of a portion of this research.
Finally) I wish to thalli,- Mrs.
C~rdner
for her quick and
accurate typing of this manuscript and Miss Hartha Jordan for her
help through the w..aze of administrative detail.
I also wish to thanl:
Mrs, Betty Donaghy for final prepax'ation of the !(]anuscr:Lpt.
iii
TABLE O:L" CONTENTS
Chapter
Page
il.CKNO\!LEDGlll~TS
I
ii
1
INTRODUC'I'ION
1.1
Introduction of the modified Bayes rule
1. 2
i\.sS"\J.mptions , ,
1.]
Notation and definition of MBR
(~mR)
1
4
Calculation . . . . . .
1.5 Subjective probability justifj,cation
5
1.6 Brief literature review
7
1.7
II
Chapter summaries
7
GENERAL RESULTS
10
2.1
Notation
10
2.2
Discussion
11
2.3
Preliminary lelTImaS
13
:5
2. 4 Theorem 1.:
0'13:5
2.5
Convexity of
'I'heorern 2:
2·7 Theorem 3:
N
0:
00
<0:
-
0
-= (d I)
-n
Equivalent SPRT
2.6 Corollary 2.1:
III
1
< NB
00 -
.....•
16
17
18
18
SPECIAL CABES
),1
Estimation
Theorem 4:
20
Sufficient condition to look only at
R
n( 1) . . . . . .
20
Asymptotic Miniwax comparison
22
iv
Chapter
Page
Binomial estilI'.ation computations
0;
00
o;j
3.2
IV
.-
24
26
27
Testing hypotheses
29
Calculation of equivalent SPRT
29
Two composite hypotheses with linear loss
31
AREAS FOR FUTURE RESEARCH
32
BIBLIOGRAPHY
35
CHAPTER
I
INTRODUCTION
1
Introduction of the modified Bayes rule (MBR)
1.1
This thesis describes a stopping
l~le
for sequential sampling
,vhich vleighs the cost of additional observations against the expected gain to be derived from additional sampling.
The modified Bayes
rule (MBR) requires one more observation to be taken as long as the
posterior risk is larger than the expected posterior risk for any
additional fixed-size sample.
For the present investigation, risk
is defined within the 13ld framework of statistical decision theory
2
.
~1~7, uSlng losses and costs. It is recognized however, that in
some circumstances other formulations (such as Heiss ~1§J or
Lindley ~27) ~ay be more appropriate.
1.2 Assumptions
To simplify the
eJ~osition
the following assumptions will be
~.ade:
AI:
The experiment consists of observing, possibly sequentially, the random variables
Xl' X , ..• (real or vector2
valued) which are independent with a common probability
lThis resea~ch v~s supported by the Office of Naval Research under
contract No. Nonr-855(09) for research in probability and statistics
at the University of North Carolina, Chapel Hill, N. C. Reproduction
in whole or in part is permitted for any purpose of the United states
Government.
2The numbers in square brackets refer to the bibliography.
2
density
IJ..
f(./Q) with respect to a given
~-finite
measure
f( .jQ), i.s used to denote the
The same notation,
(joint) density of any number of random variables having
the (possibly
A2:
For any G
vector~valued)
€ ~
parameter
Q €.Jl.
and any terminal decision
d,
the loss
L(G;d) is non-negative.
A3:
c~Clk)' the "marginal" cost of
It is assumed that
is increasing in
k (i.e., for all
--. (y
)
Y
-k+l
-k' xn+k+l'
x, and all
-n
c*(Yk
).): where
n - +1) > c*(Y
n k '
denotes the cost of observing
and
lk
= (xn +l '
lk'
xn +k )·
c r (x
-r )
x, c*(y )=c
(x
)-c (x )
n -k - n+k -n+k
n-n
-r
Also, for any sequence x l 'x2 ,.··;
lim c*(Yk' ) = 00
k
nAny measurability assumptions needed to assure the existence
it is assumed that
A4:
(finitely) of the integrals used for the procedure are made.
A5:
The existence of a Bayes terminal decision rule is assumed.
The Bayes terminal decision is used exclusively here and
is denoted by
d.
The identical distribution assumption is made solely to avoid
notational complexity.
A2
is used primarily to justify the use of
the Fubini theorem in proving properties of the MER.
that
i~f Rn(k) (see Section 1.3)
A3 implies
is actually attained.
A5 is
used to avoid the detailed analysis (e.g., ~L7 page 297) or restrictive assumptions (e.g., ~127 page 89) otherwise necessary to verify
the existence of a Bayes terminal decision rule in each particular
case.
3
1.3
Notation and formal definition of the
I~R
The following notation will be used:
for any fixed
x .
n :::: 0, 1,
-n'
... ,
and
k :::: 0, 1, ••. ,
=jJ
Vlhere
-IL.
1".
X""
=J r~r
t
...J\..
k
X ::::
X X X x •.• x X (k times), (X
the spectrum of a
single observation),
.-
= the prior distribution of
g (g)
o
d~
n
(g)
f (x )
o -n
f(x /9) d~
-
:r
JL
Note:
Rn(o) :=Rn
0
f (x /g) dg
-n
'-
d F(;lk/Q)
-n
=
is the
Q,
(Q)/t 0 (x-n ),
0
(Q) ,
and
k
Trr(Y./Q)
1
J.
Un(x)
d IJ. (Y )
i
of
[27,
page 24.2;
is the
Rn(k)
may be interpreted as our present "best guess"
terior risk,
observed.
(k)
of the pos-
Rn + , if an additional sample of fixed size
k
R(k)
procedure.
k were
is the average risk of a Bayes fixed-sample size
The formal definition of the lIBR:
at the start of sampling (if
(if
n
n
=
is observed
0), or after -n
x
> 0),
(1)
if
(11) if
R ~
n
inf Rn(k)' observe
k
x
n+l
.
(In either case, 1.2 implies that the infimum is actually achieved.)
Note:
an equivalent formulation in terms of sets of distribu.-
tions may sometimes be more useful (see sections 2.5 and 2.6).
each
n
e, set of distributions ( ~n ) is defined such that sampling
stops if and only if
1. 4
'e
For
is
in
-
-n
Calculation
The calculation of
Rn(k)' for each
k, is feasible whenever the
Bayes fixed sample procedure can be explicitly obtained.
~127
(Part III of
can offer assistance in the evaluation of the integrals involved.)
On the other hand, an explicit evaluation of the stopping regions for
the Bayes sequential rule (BSR) has not been generally possible.
Ex-
cept for certain special cases -- as testing two simple hypotheses,
,.,hen Ii/aid's sequential probability ratio test (SPRT) is such a rule,
or i f the BSR is truncated or fixed sample-size .... it is not usually
possible to carry out the Bayes procedure.
determination of the appropriate
SPRT
Even in these cases, the
is not simple and in the trun-
cated cases the necessary computati::ms are exceedingly tedious.
rem 9.3.3
in ~L7 gives sufficient conditions for the
BSR
Theoto be
truncated or non-sequential.
The
MBR
calculations are readily adaptable to a high-speed
5
computer since the analyses before and after a sam::;>le has been observed only differ in the change from a prior to a posterior distribution of
9 (see section 2.2).
The problem 1s further simpli.-
fied i f the prior and posterior distributions are of trle same functional. form, vI1th only a change in parameters; since the same kind
of calculation is then required at each stage (see section 3.1).
A distribution satisfying this latter property has been called a
and a
con,jUgate :prior
f(x/g)
binomial
beta
Poisson
gamma
normal
distributio~
(Q, 1)
For the special case of
I
normal
ck(~k)
= kc the stopping rule can be
conveniently expressed in terms of
)'
n(J{)
=
That is, stop sampling as soon as
c
> 7n
case it is only necessary to have bounds on
1. 5
::=;
max
I'n(k).
In this
k.
c
to carry out the
MER.
Subjective probability <lustification
A subjective probability justification for the MBR rray be found in
interpreting the value of
~ 0
at any sct of G values as repl'esenting the
original relative conviction that the true value of
set.
Once
x
-n
G lies in this
has been observed the belief bas been cbanged as re-
6
fleeted. in the values of
~
n
.
In either case the distribution S
determines which average of the risk one would like to minimize,
The f.IBR
v7ill accept this pr~sent average r:Lsk (L e oJ stop samp-
ling) only if the cost of increasing one's convictions, through
lmowledge of a sample of any fixed size, is more than the expected amount to be gained.
If not, one more sample will be tall:en
and the same problem posed ,vith the (hoped-.for) better Imov7ledge
of the true state cf nature.
In the context treated above, the defining property of the
BSR, that of minimizing the original average rish:, does not seem
particularly relevant.
rule, by comparing
R
n
However, the method used to determine this
with the average risk "resulting from a con-
tinuation if at each future stage we did the best we could with the
resulting observations" ([)} , page 243), is really the optiwal property.
The MBR tries to approxiwate
this by only considering the
average risk if any future fixed-size sample were taken.
The averaging over projected observations way b3 alternatively'
fey
Ix ), the marginal
-II: -n
performed by averaging first with respect to
distribution obtained by integrating out
~.
is identical with the proposed computation of
is Lebesgue measure, and
d
~
o
However, this method
Rn,'1')'
;.
For examyle
(G) '-'" g (0) d. 0, the follov7:Lng
0
argument, in abbreviated notation, sketches the equivalence:
7
;(r
•
J
II g 1 d
n+~
:0;
1.6
Q
d Fey/x)
j'{ H f(y/Q,x)
:0;
rr
J
I
H g(Q,y/x)
fey/x)
fey/x) d Q d Y ,
g(Q/x) dy dQ,
Brief literature review
During the investigation of properties of the MBR, the bod: ~127
by Raiffa and Schlaiffer was publishecl.
Although the authors specifi-
cally refrain from discussing the sequential decision problem, it was
found that the preposterior analysis presented is exactly the same as
'e
Previous to this, Hala. ~1'2.7 page
the evaluation of an
151, as
an example of his generaltheor,y, briefly discussed the computation of
l~n
Rn(k)
to obtain the optirr.al second stage sample size.
In
1'2.7,
Chernoff introduced a sequential procedure based upon an asymptotic
approach where the cost per observation goes to zero.
In
i-17,
Hoeffding
obtained bounds for the average risl: of an arbitrary sequential test.
Many other vrr1.ters such as Anscombe .[)}, Barnard
LindleY.['2.7, and lJetherj.ll
in this area.
fl17
iy,
Bross ~!!..7,
have contributed to the literature
An excellent summary of the current state of this re-
search is given in
f~7
by Johnson.
1.7 Chapter sumw.aries
In Chapter II by defining a sequence of stopping rules having the
~illR
as a limit, the average risk for the MBR is found to be the limit
of a non-increasing sequence whose initial value is that for the fixed-
8
sample size Bayes procedure.
Since the average
risl~
of the MBR is,
of course, not less than that of the BSR, the two rules coincide if
the BSR is actually a fixed-sample size rule.
It is shown that the
tlBR is a SPRT for the problem of testing two simple hypotheses.
is also shovffi that the
act~al
larger than that f.or the BSR.
It
sample size required by the MBR is never
Therefore, if one were to use a BSR,
the computations to perform it would need to be started only after the
termination of the MBR.
At this point of termination, the improvement
possible over the MBR is identical to the difference betvleen the average
risk of a fully sequential Bayes procedure and that of a fixed-sample
size Bayes procedure in which the sample size is zero, the expectations
tah:en with respect to the posterior distribution of
Q.
In eha,pter III an cc;:;ily oo,ticf:i.ed condition is Given for
m~n
Rn(k)
error,
==
c I: (x..
)
1 --1"
Rn(l)
==
l;;:c,
in an estimation problem, where loss equals squared
f(·IG) is a member of the exponential class (see
section 3.1) and a conjugate prior distribution is used.
condition holds, only a single
Rn(k) needs to be
Hhen this
cor~pared
'I'Ti th
the
poster:i.or risk to determine whether additional sampling should be
formed.
In addition, the asxmptotic !I1ini1TIa~ rule
compared for several estiwatj.on examples.
£1!J:.7
per·~
and MBR are
To demonstrate the ce.lcula-
tions needed to perform a MBR and to determine its average risk, a
binomial estimation problem is presented in some detail.
Although the
BSR requires a fairly complicated. "vlOrldng-bacl;;:ward" method., which is
apparently not feasible by hand computation, the MBR requires only the
solution of several second degree equations.
An eJcample is also pre-
9
sentea. in which the particular
SPRT
equivalent to the MBR
for a two simple hypotheses testing problem with
is found
ck(~~) ~ kc
and sim-
ple loss function.
Chapter IV contains several areas where future research
profitable.
r~y
be
CHAPTER II
GENERAL RESULTS
2.1
Notation
The following additional notation will be used in this chapter:
0B:
Bayes sequential rule (BSR).
0,:
stopping rule which follows the iffiR up to and including a
J
sample of size
j,; if
min
for
~I:(t)
t
k
then a sample of fixed size m is taken J where
smallest
t
such that
Rj(t)
=
min
Rj(i)'
i
the fixed sample-size Bayes rule and
stopping rule of type
IJ "'Jj
m is the
is thus
oco the MBR .
started evfter '::t
"V"
(i.e' J if sampling is continued beyond
00
= OJ
x
is observed
.J it is with
- t +J
a single fixed-size sample).
an:
~N):
average risk of
0B'
average risk of Bayes rule truncated after
average risk of
o..
J
posterior average risk of
N. :
J
.
EJ.. .
o.(
Jet ) J
J -
us ing
sample size using
0B J a random variable.
sample size using
O. J for
for
j
J
j
= OJ IJ .•. J CO J a random variable
> O.
operator taking expectation vath respect to
~i
N observations.
fixed; specifically for any h(~i+l)J
X.
2+
lJ holding
11
f.(x.
1)
l
l+
where
f.J. (;~.J.+1)
:=
cf
a.
d~
f(Xi+l/Q)
f.t
(x.J.+ 1)
.(Q).
J.
..J1_
At:
2.2
[,r
N
0
:=t·
00
--
t}
,
stopping region for
HBR.
Discussion
Several properties follovT immediately from the definitions.
any
n,
will differ from
5 1
n+
observations
x
-n
is such that
5
n
For
(if and) only if the sequence of
R > min R (1")' A necessary condition
n ."n
It
for the two rules to differ is therefore that
N > n.
n
Also, as soon
as a particular
-e
x
has been observed, the rule 51 (x) will act
-n
t-n
. For example, using the rule
exactly the same as the r~le 5
n+k
5 (n > 1), the observing of x, changes ~
into
~~
and hence
n
=
..L
0
.1
prior distribution for a new decision problem, where the truly sequential portion has been reduced by one.
The additional properties shoiVll in this chapter are based priJnarily
on the recognition that:
ao(x l )
:=
r
<
(Xl
:=
min [R , E R , E E R ,
l
l 2
l 2 3
R0
if a "" R
o
0
~O(xO(Xl)
if a <R
o
0
}
,
,
ana.
e
12
q~
:::
~O~_l(Xl)
if
Ct
0
::: R
if
Ct
0
<
0
R
0
k
,
>
1,
These expressions follow almost directly from the definitions, but see
L-27.
o , assume
the Iffinmas, except for the first which is proved in
To make clear the distinction between
the use of.
0B
00
had resulted in the sequence
Then
~n'
that
n
N :::
B
if
and only if
Rn·-< En _..;-min (Rn+ l' En+ 1 -~min [Rn+ 2' En+c:~ -;-min( .. ,} ...- 7·
(1)
vmereas if
°
00
had led to the same sequence of observations,
N
00
::: n
i f and only if
'e
(2)
(The underscores are used for emphasis.)
E
i
min (.)
If an
< min
This, together with the fact
E ( • ), is the basis for the proof of Theorem
i
R
(novr Imovm)
n+l
(n+l)st observation is taken, both rules put
on the left-hand side and delete
side, etc.
E
n
and
R
n+l
3.
from the right-hand
However, to actually obtain the right-hand side of
(1)
requires a backward induction, vhile in (2) a forvrard induction will
suffice.
To indicate in more detail the computation of
take only a finite number of values, the case
(see Chapter III for a numerical
e~~ajJ1Jle.)
q~,
k::: 3
when
xi
can
ivill be examined.
13
(X < R :
0
o
If
1°.
For each xl' find
(Xo(Xl ):::: min fR l , Rl(l)' Rl (2)""_7.;
then
(Xl ::: Eo (Xo(Xl ) == Z
xl
2°.
(Xo(XI ) fo(X I )·
For each~2) such that
xl €
Al .- [xl: (Xl < Rl )
(Xo( 2E2)::: min fR 2 , R2 (l)' R2 (2)' .•.._].; then
3°,
For each ~3' such that
0_
(Xl (~2)
:::
i
2
R
E2(XO(~3)
Then
{
1
R
::: E a (x )
1 1-2
and
· {EO~~Xl)
xl
€
Al )
if
<X
if
(X2 < R2
:::: R
2
2
if (Xl :::: R
l
if
(Xl < Rl
if (X :::: R
°
°
if (X <R
0
The repetitive nature of the above formulas
ability to programming for high-speed computers.
computation of Q'k
a
2.3
Lemrr.a 1:
imply a ready aClaptIt is noted that the
is !lot necessary to perform (but only to evaluate)
ok'
Preliminary
0
len~as
14
Proof:
'e
:=:,!
J
~c.Jl.
Analogously, by applying
Lemma 2:
Proof:
If
H
f(~/~) d s k-l (~)
~~-2) ~~-3'
ex < R ; then for all
o
0
l~
~l (~).
d
... , Eo successively
> 1, 0:1I:
-
:=:
E
0:
1
o~-
l(x )
l
and
•
15
~
are exactly the same (k
tribution of
1)
since both use
gl
as the dis-
G to determine whether or not
Therefore the prior average risk, averaging with respect to
Xl' is the same.
requires the taking of a single observation
01
(Xl) and then either maldng a te:mdnal decis ion
(if
,_.7 ) or
R :::: min lR , E R , ..
1 2
l
l
fixed size
m "There
E .•. E
l
R
mm+
else taking a sample of
2
ditional on
l
Ie
min 1Rl' E1R , .•.~7
either case,
R, l'
1:= min E · .. E,
K"H
In
is the average risle con-
The prior average risk is then obtained by
taking the (prior) expected value of this average risk.
3°.
The same argluuent, used in the above proof, also shows that
al (Xl) ;:: El (min 1 R ) E2 R ) E E R ) ..._.7 J if Xl is not in
2
3 2 3 4
Al , since the only change required is that
~o becomes
~l'
,
For any fixed
define
used.
let
Til,
~::: /I, ~
II
+ (1 - /1,)
Sm .'
if ,.,
a' (x ), a"(x ), ... , as the a, ... ,
° -m
Then,
0
--m
a (x ) >
° -m
-
fO, l:7, and
~ m'
0
/I,
/I, €
s"m
is
a' (x,,) + (1 - /1,) all (x ).
0 -ill
0 -,,1
Proof:
It is sufficient to prove the case of
Then
ill::::
O.
Define the integer
16
A a~ + (I-A)
a~
. Rtl11 \
: : A m~n R(k) + (I-A) mJ.n
1
\ \.)
5.
< A R(i) + (I-A) R(i)
2.4
<
Theorem 1:
0:1
.:5
0:
0
if a :::: R
0
0
if a < R
o
0
if a :::: R0
0
<
if a <R (using
0
o
Lemma 1)
::::
min
k
o
2 .
a2 .:5 al
If
:
performed.
ao --
a0 --
N
~l
-- a2
since no samnlJ.'ng
J.'r,~
~
If a0<0
R , al(x ) '<- 0
a .(x,)
since (1
1.
0
l
proved for any
tations,
R0'
go
Eoa (Xl)
l
and is thus true for
:5
£1'
EoO:o(X1 ), and therefore
)
was
Taking expeca
2
:5
a l by
Lermr.a 2.
g and is
o
thus trlJ.e for ~l'
~
:5 a2
follows by the method used in the second part of
(2°), using (3 0 ) of Lellllna :2.
0:
< am- 1 follows by a repetition of (3 0 ) and (4 0 ), i.e.,
m-
17
6°.
2.5
Q':s
~
ex 00 is t h e fundamental property of
--I"L
If
-: (d l
-n
)
of f~7)
(see Theorem 9.4.3
Theorem 2:
is finite, then -: (d I)
-n
::;:
(I;
~
n
is in A
x
:
-n
n
DB'
is convex.
and
d:=
ell
l!here
)
;
is
such timt e
~n
i.s in
"" (d f ) if and only if the NBR requires sampling to stop at stage
-n
and a particular terminal decision (d l ) to be wade, and
:.
::;:U~.:: (d')
d f --n
-n
n
(see section 1.3).
Proof:
Let.JL:=: ( Q.)m
J.
1
i
'e
:=:
1, ... ,
ill,
for
° < I-. < 1.
-: (d l
).
If
-n
,
H(i, d)
and :>n
e (i) ==
Fix
L
/-g.,
J.
d(x,)
-
sn (0.)
:=
J.
···n -
n::;: 0, interpret
7+
e(x) for
n --n
I-. g '(g.) + (1-1-.)
n ].
such that both
x
-n
ill
I-.
:=:
s'n
J.
and S II
n are in
ex (x ) ::;: ex ,etc.
° -n
sn"(G.)
NoW
°
ill
L:s.'(i) H(i; d l
1 n
:::
,
II,
)
+ (1-1-.) L:g "(i) H(i, d l
1 n
since both
1""1. 1 (,x ) + (1-') AI"! v
v'o""n
II,
v' o \.::n )
< exo (x
-on )
)
are
j
g'n ond
a -;: (d I
-on
e"n
)
LeY,ID1a 3
since ex (x )
=:
o -n
min
Js.
ID
L; {; (i) H( 1" d*) < r, Sn ( i) H( i} d! )
In
-1
ill
.-
min
o3{-
m
I-. r, ~ '(i) H(i,d') + (I-A.) L: s~(i) R(i,d')
ill
:=:
1
n
Therefore, ex (x ) ::: R
.-n.
n
°
m
== >-: ~ (i) Ht i, d I
1
n
1
);
i. e., g
n
is in -: (d I
·-n
) •
18
2.6
Coro1.1a~y'
If
m
:=:
2.1:
2, c (x )
n -n
:=:
nc
and
Lei,d)
:=:
f,J. for
0. (; i
and
zero
othe.Y.'Vlise, then there exists an GPRT equivalent to the 1mR.
Proof:
12.7,
It ls shmm in
page
267,
that the convexity of ~n (0. r) j.s
sufficient for this conclusion,
Note:
~n' the rr..arginal cost
y , the sets
_ok
~ (d l
--n
the finiteness of
)
e
Theorem 3:
l)age 259, that j,f.:> for any
c~(Y.k) delJends only on the values in the vector
do not depend on
j'L. For
n.
This result doea not require
k
example,
the cost of the observation
2.7
["j}
It way be show'll, foll.owing
c. (xl )
I:
"-=
K-\:
].
c( Xl' L where
c (x,) il3
1.
Xi' is suf.fic:i.ent,
N00 -< NB,
Proof:
For any
N, at the n-th stage,
o~N)
stope sampling
(L-2..7,
page
242) if and only if
r···
Rn -< Ermin
[Rn+ l' En+ 1 rr-
min (Rt\f
ET'i~-D-L
T
0/',) B-].
' ...lR., :;7,
. •.l"·rJ~'._
"l~-n- ''N-n-
< min [En -rRn+'
l' En +1 Rn -1-2'",."
E 1 .•• E,N··D- 1 RN··n-·
7J
n+
:::;
Hence) if
°B
stops sampling at the
n,~th
:::
st'1ge, trlen
min
It
i.e.,
000
stops sampling.
.. J- 7
19
Corollary 3,1:
(i)
(ii)
Suppose
c1c(~
)
-~
= ltc
If Ln ->0 uniformly in x , then 500 is truncated,
-n
If -1n is a funct:Lon of n only, then 000 is the Bayes
fixed-sample size procedure.
Proof:
(i)
(ii)
By [')}, Theorem 9.3.3,5
is truncated. But Noo ~ N ,
B
B
By the same theorem, 5 is a fixed-sample size procedure.
B
But, using Theorem 1, CL.1:)--< ex00 -< ex0 = CY .
B
CHAPTER III
SPECIAL
3.1
CASES
Estimation
For the exponential class of densities,
f(x/g):::: a(g)~(x)expLGY(~27,
the parameters in the prior distribution combine 'Vrith the sample clata in
E~~tu~~
a particularly simple manner when a
(see section l.L~) is used.
In
.::'?Ejugate prior distribution
£12,7, page 45, the binary operation
*
is defined; for the case described above, the definition rec1uces to
xn + m'e
(Q , mY) ,x- (x) n') ::::((n'
o
n
where
is the prior estiITate of
Q
o
observations:
0
J/.[n'+ ro'},n ' + 11'.')
Q associated with
m' pseudo-
(Q, mil are prior c1.istribution parameters, (x , n') are
o
n
sample sufficient statistics, and the terms on the right are posterior
distribution parameters, e.g., if
d
~o ()
G
oc Ga-l,tl - G)b-J.
0. Q and
f(x/G) :.;; QX(l_G)l-x, then
(a/(a....bL a+b)
* (.:{n ,n)
n
::: ([ L: x. + a}/f n+a+bL n+a+b).
1.
l
The followj.ng theorem gives a sufficient condition for "looldng
ahead" at only one potential future observation.
Theorem q.:
If:
f(x/Q) is a member of the exponential class of densities,
L(Q;d) ::: (0. - G)2,
f(x/G), where
Then: %
0.
c, (JS )
~ -~
So (G)
:::: kc, and g (G) is a natural conjugate of
0
:::: g (G) 0. Q,
0
stops sampling as soon as
if and only if
c? Yn(l)
(see section l.L~),
21
J
r3
1
< -2(n' -I- m') + 1 7 L
n
n-
(x-G) 2 rex/G)
'Ylhere
0. x .
Proof:
For any fixed vector
x I ' " (x , Yl ), it is easily sho'Yffi that
-n+ e le-n'- e
r
+.l: Y., 7/
n' + ill' + le.}, where
1
~the mean of the posterior distribution of Q
-I-
ill ' ) Q
n
Consequently, by definition
.~.
,
.
.;.
Ie
TT
1
Assuming that
le
:::: ( 2 J
f(y./G) 0. y. g (G) 0. G
l
Rn·< Rn(I)
t
n
- {)(n'+m' )-I-k
-~.
-
j.f and only if'
C ~~nt
> Y (I )' and (3) implies
that
6
00
By per-
is a continuO'l..1.s variable,
7 Ln } /f
n'-I-rn'+k}
is less than zero, for all Ie, if and only if
Since
n
l
Ln-<
Ln(It )
Yn
(1)
(0) >
_ -> Yn.::.-
stops sampling if and only if
;)
c
3, which
is satisfied.
+ ltC, or
....; (3) implies
> Yn (1) •
-
In each of the examples considered, (normal; binowial and Poisson);
the ratio
J
n /1n
i-laS
founc1 to be ince-oendent
of -n
x, but no proof
Vias ob-"
-
tained th."!,t this is a general property.
with Theorem
9.3.3 of
~~7.
There may be some connection
For example in the binomial case,
22
2
n
n
2
n
x.)(n-L: x.+b)/A.(Ml) 7/r(a+'i2 x.)(n-L: x.+b)/(A.+l)A."7
1
1
1
1
1
1
1
= A./(A.+1)) 1There
A.;:: a+b+n .
vfuenever this is true) before sampling begins you know at what stage
only
needs to be examined.
Y n(l)
Asymptotic Ninimax Comparison
Hald
mates for
["1!:.7
introduced the Asymptotic f.linirr.ax (At.i) sequential esti-
c (x. ) ;;:; h:c
1l:.
and
-11:
the limit as c tends'to zero.
Ll§.7)
Cramer-Rao 10v1er bound
L(9:a'*)::: (a''1·_g)2.
"Asymptotic" refers to
By means of lJolfowitz's e:J>':tension of the
he shovled that
T~ an,}. T~ are both AU
solutions) where
N observations) N > (c
Take
c
by
Q
c -
G .
Nc
and estimate
A
G
~n)
n
G by
Cl*=Sn) d(G) =
9
n
>
-
IV
yf ;::
n
1
7
n-
t\
r-n(n+l)d(G )
L
.
G) given
E[ E 10~G~(y/Q) 2/ G J) and
N
d(G)
o
e
In a sense explained by Hald £"VJ.7)
Since the A11
1\
c
is the w.axirrn..un 1ilcelihood estima.tor of
('>J
d.
and estiw.ate
A
stop sampling as soon as
Hhere
d0 )-1/2)
;::
inf
T~ is better (for small c) than Tco •
procedures are among the very few general sequential
estirr.a.ting procedures where explicit solutions a.re available) a compari··
son 'Ivi th the MBR may be of interest.
The similarJ.ty of the stopping
criteria and estirr.ators) as shovffi in Table 1, is quite striking in view
of the apparent disparity between the two methods.
n) the procedures effectively coincide.
In fact, for
large
e
--
It
T.PJ3LE 1
I
r(x/e)
I
goee)
N(e,l)
I
2
N(eo,T- )
B(e)
pee)
where:
I
I
I
] B(a, b)
I r(9 o)
I
yl
Yn(l)
n
( n(n+1)
Xn (I-xn )
n(n+1)
-x
n
n(n+1)
r1
-
(n+T2 ) (n+T2+1)
(~r..J,.a)(l:;;'
.i.....
l
-
,b,
n
n
-.Llo.
-r - )
nn
a+b 2
2
(1+ - ) (n+1+a+b)
n
Xn +(e 0 1In)
(n+1)2 [ 1+(2/n)}
B(e) is the binoInQa1 distribution, pee) is the
e 0 -1
-t
re)cee
(
o
r1
I
I
/\
e
n
x-n
d
I
I
[ Xn+(T2e 0 /n)}/(1+(T2 /n)}
I x- I [ xn+(a/n)}/[l+«a+b)!n)}
11
I
I
poisson,~(a,
xn
I [ xn+
(8 0 /n)}/[l+(l!n)}
b) is the beta, and
t
iV
'-'1
24
Binomial estimation crunputations
To indicate two possible forms that the stopping regions for the
NBR may talte) and to shmT the numerical computations required to obtain
than, ti'TO binomial examples will be considered.
The second. i,dll be
q~'s.
given in considerable detail to include the computation of several
The form of the BSR for these examples is not knOi'll) and hand computation does not seem feasible.
It is only ImoiVD that they are truncated
[)], at a value not less than
27 and 12
and that the stopping boundaries are
respectively (Oorollary 3.1),
nowhere i'lithin those of the
HBR
(Theorem 3).
Example 1:
L(Q;d) :;:: (d_G)2, c'l\: (x,)
:;:: k/l~OOO
-K
n
Let:
s :;:: L:
1
00
In the
go(G) :;::~(2,2) :;:: G(l-Q).
x., r' :;:: (n+4) 2 (n+5) 2 .
J.
From Table 1, N :;:: n
i.e., s
and
~ (~)
{n
if and only if 1/4000 > (s+2)(n-s+2)/r')
-
2 (n+4)~1
-«n+5)2/100027 1/2 ):;::
(~~,~~).
(n,s) plane, the stopping regions are indicate 1 by the cross-
hatching in Figure
18."
1.
~~~\ \ '0-ll-;~
\\\\\\\\\llill\~\ \\
14.
10.
6.
2.
~'---'.--+---~-~----'---~----'---'----------
20
22
Pigure 1
24
26
11
27
25
Example 2:
N =n
Using Table 1,
For this
if and only if
00
e}~mp'le,
the stopping region
~~
shown in Figure 2.
~
s
5
CJ
4
n
3
2
1
'e
4
2
6
8
10
12
Figure 2
In this example the prior belief that
Q is large is sufficiently strong
(relative to the cost per observation) that saMpling cannot stop for
"too fe1-1" successes.
Computation of
Let
CX
oo
(= CX12 )
(n,');-) represent the coordinates in the (n, s )-plane of the
accessible stopping points v1here
Tl1ese a11pear belov' in Table
Let
t(y,z)
=
k(=1,2, ... ,12) indexes these points.
2.
number of distinct paths in the (n,s) plane, from
(0,0) to (y,z) which do not pass through a
(n,CJ ).
n
Several methods are
available in the literature, e.g., see ~127, for finding
t(y,z).
If
26
there vrere no boundary;
t(y,z) :::: t(y-l,z) + t(y-l, z-l) for
y
t(y,z) " { :
could be ·used.
By'
y >
Z ::
1
< z
y :::: Z
removing the contributions from each
t(u,CJ), this
u
recursive method is easily adapted to the present situation.
\::::: t(n,A!\:), lThere
Let
Table
1.:
e
1),
n:=:n(l\:), the values in Table 2 vrere obtained.
:=: the Bayes terminal decision, if sampling stops at the
d
point (n,
,
A.. ).
·l~.
ci ::::
-1,,-
Since this
(9 +
A.. ) /
'1e
'1s
,.
' t,r1.,)U'vlCr),~
.
1....
the mean of. th e pOS'CCI'1.or.-:
'('1 lEY
'L
(n + 10).
TABLE
n
2
~.
r,.\:
IF,
2
Letting
2
5
7
8
9
10
11
12
12
12
12
12
2
3
4
5
6
7
8
9
10
11
12
4
5
5
5
5
4
4
3
2
1
0
53
103
292
156
210
12
.1
-,~-.-_----
\;:
2
1
°
1,,-
7
_._--
r';-
11
13
15
1'2
14
-17
By definition,
12
a :: 9
00
vrhere
I;
1
23
65
_-_._-..
14
11~
lL~
18
19
20
13
'21
13
2'2
12
22
1J.
2~~
10
Ii2
9
2'2
I
\; Bee
27
After performing, "lith the aid of Table 2, the operattons indicated,
a00
=
.00623 + .00138
a
Computation of
For any
::::
.00761
•
j
x,
-n
n
where
8 '"
L;
x
1
Also,
1
Rn(k)
1
Jk
::;:
[
i~O
°
where
i
dg n+<;:
1 (G)
~
inf!(d*-Q)2 d
OJ}6
=
i
1 (G)} (1:) G (1_Q)k-i dg
~
n+<;:
G8+8 - 1 (1_G)n-8+1;:-i
6Xs+9-i, n-8+1;:":'i+1)
n
(G)", c1<;:J
d 0,
and therefore,
'e
1
:: r
J
°
k
I:
(s+9+i)(n-8+1+k-i)
i=o
(n+k+10)2 (n+k+11)
(s+9)(n-s+1)
-(n+10 )(n+11)(n+10+k)
Assuming that
k
+
c1<;:.
is a continuous variable,
d Rn(k)
d k
:::: 0
if and only
if
1~
where
1°.
f x7
= 1~ I : ;:
[[2000(s+9)(n-s+1)3 1/2
. Tn+IO)(n+l:ry-- (0+10)
I
integer closest to
==
a :::: min R(le)
°
It
J
1/2
.00629 + .00150 ::;: .00779
.
Since
tit
a :::: R(3) :::
°
y.
J'
- [{\IOJ(IT}
2000(9-0)
t -
010J
-
3,
l
J
::: 1, and hence
1
Also, fo(O) ::: 1 - fO(l) ::::
9f
(l-G) 6
8
d G
1/10 .
==
o
-e
For ~2 ::: (0,0), k' ::: 7, for ~2::: (0,1) or (1,0), k' :::
for ~2::: (1,1), k' ::: O.
f
o
(0,1) :::
o
f
0
(1,0) ::: 9
t
;
o
Also
of' (1 _l '
-O'-~-,
-,.., a
~
r
l
I
0
~
10
riO _
~=
-
0
h
4 and
1
~/~~'
0';
09(1 - 0) dG ::: 9/110 and therefore
a,
4.
For ~ ::: (1,0,1) or (0,1,1), k' ::: 3, for ~3 ::: (1,0,0)
(0,1,0) or (0,0,1), k' ::: 5 and
x2 ::: (1,1)
-
f
a
€
A2 ,
f
0
1);' :::
(1,0,1) ::: 3/44,
or
7 for .?E3 ::: (0,0,0). Note that
f
0
(0,0,1) ::: 3/220,
(0,0,0) ::: 1/220 etc.
j
=
.00626 + .00141 = .00767 -
By carrying more decimal places, it is found that a2 ::: .007675
and
a,:::
.007671 .
29
Since the stopping point (2,2) contributes .00563 to each
a
j
,
the reduction possible as compared with
this example.
a
It is noted that although the
is rather small in
o
a.'s are monotonic,
J
the contributions from e:xpected cost and loss are not.
3.2 Testing hypotheses
The testing of two simple hypotheses is another problem to which
the ivLBR may be applied.
1,( i, (1.) '" 0 if
lent
SPRT.
c1;;),:;:::
By Corollary 2.1, for
i
and
Xi
if
c.n (x
)
-n
= nc
and
d. 1= i, there exists an equiva-
From Theorem 3, the particular SPRT must have boundaries
at, or within, the SPRT corresponding to the Bayes solution.
Since the
BSR for this problem is available (by an iterative process ~27), an
explicit comparison of the tvlO procedures 111.ay be possible.
analytic comparison of the SPRT's has proven feasible.
No other
By the use of
lJald's v1ell-Imovm formulas ~27, approximate expressions for the 0"
function and ASN
c.
are obtainable for the MBR in this case.
The actual use of the MBR is not dependent on establishing the equivalent
SPRT.
If the sample number is expected to be s111.all (for example)
if the cost per observation is high), it may be simpler to determine
whether or not sampling should continue directly from the MBR definition.
To illustrate the computations needed to determine the equivalent
SPRT, the following numerical example is presented,
example is found in ~27, page 280.
This same bin01ual
The considerable decrease in the
computational complexity, v1hen compared v1ith the BSR, is evidence for
the remarks 111.ade in section 1.4.
Let:
ak:::: posterior probability that
G:::: G , if
1
~~
is observed,
11:
s
a'
S
:=
:=
1<
x.
L;
::::
1
1
1 - b ' :::: probability that
g::::
9
1
and
h =10G(a ' /b').
~
Then, for any a',
< 1/2 if and only if
< b'
s > s*
'e
or
_lL-
=
k
:2
+ -
21og:2
Then, if at = a , using the notation of section 1.4,
o
= L;
Ie) (~)s (~)l<-S
( s.3
.3
that
a o > 1/2.
k
where
P
1
it
is
aS$umec1
S7~
k
k
2 s
and P2 :::: ~ -y.. ( s )("3 )
1 l~-s
("3 )
,
To establish the critical value of
aI, that is, such that a:: a' > 1/2
is accepted,
where
a'
is compared vnth
c:::: 1/38.25 :::: .0262.
If
~4)::::
c
implies sampling stops and H
l
Y {k) for various values of a o,
.021
~ .65,
h' ;", 1, then
a I ~ ·73 J and 1(2):::: •020 ~ ~.3):::: .018 ana-
indicating that
12)::::.042
hI:::: 1
showing
h
is too large.
If
hI:::: .6, then
::::.6 to be too small.
this process) it is quickly found that
al
= .72
By contj_nuing
is approxiwBtely the
)1
critical value.
To find the SPRT boundaries, the notation of ~~7
page 279 is used and
a~*
]
log
=
[
log
=
2
i.e., the SPRT has boundaries (-2,2). Confirming Theorem ), these
boundaries are not outside any of the Bayes solutions; in fact, they
coincide with the innermost
P~yes
solution.
For the composite hypotheses testing problem when
and
if
L(Q,d)
d
£19.7.
is a linear function of
Q when
d
c (x. )
1~ -l~
= l~c
is v~ong and zero
is right, a detai led discussion l11.ay be found i.n Chapter 5 of
This chapter contains a presentation of the computations in-
volved in a preposterior analysis.
Although much of the complexity in
such an analysis lies in the difficulty of finding
min
l~
R
which is
(1 )"
n\_,>:
not needed for the NBR, their treatment vlill materially assist the determination of the stopping regions.
For example, their Chart I
for
the normal distribution, variance knovm problem, can be used to obtain
the optil11.al non-sequential size
an eqUivalent treatment).
ra for
in their notation; see _
( ",0
~
~I
For the MBR, at each successive stage it is
only necessary to determine whether
p
o
> O.
CHAPTER
4
AREAS FOR FUTUHE RESEARCH
Many questions regarding the r1BR
remain to be investigated.
In
addition to those peculiar to this procedure) there are those applicable
to any Bayesian-type rule.
The basic problem remains of establishing a
common "nmneraire" so that losses and costs can be quantitatively compared.
Also) rules for establishing the specific
t
!:>o
prior information or opinion regarding the parameter
to reflect the
Q are needed.
A brief discussion of several open problems is presented below:
1,
Investigation of
am
An indication of the gain achieved by using a sequential
procedure would be given by a knowledge of
w'ere available for
am - a00
If good bounds
ace - CtB ) a limit to the possible improvement 1'lith
more sophisticated procedures (see 2.) would be knolvu.
At present only
ao . In fact) except
101'7er bounds for
for the case of testing two simple hypotheses) little is lcno1'ffi about
the advantage of using a
2.
BSR.
Other modifications
In the same
1'~y
as the MBR discussed here has been defined
in terms of "1001dng ahead" at fixed samples of any size) a double sequence of such rules can be defined in terms of a truncated Bayes look
•
into the future.
That is) let
0ij
be such that as long as
N < jJ
33
another sample is taken and otherwise sampling is stopped.
right-hand side of the above expression is
min
~(k)'
a
ali~Ys
Since the
not greater than
general theorem of sample size inequality (like
k
Theorem 3) can be shOim.
Hhether the additional computational cOTIrplexity
implied by such rules is worthwhile remains to be investigated.
The
possible similarity "rith the sequence developed in 1-1'5.7 also is open
to question.
3· A single critical
Rn();:
, )
A general criterion for determining vhich
Rn(k) is the
minimum does not seem feasible in the light of the complexity shoim in
Chapter
5 of
Theorem
*way be possible although the minimizing
~1~7.
be dependent upon
However, some relaxation of the conditions of
x.
-n
Bounds on
R
n
or
k
will presumably
rn (see section 1.4) would
provide sufficient conditions for either stopping or continuing.
4.
Choice of alternative
eJ~eriments
For problems such as the Ilk-armed bandit ll , where the
distribution of the random variable depends upon the population sampled, an adaptation of the NBR ltey provide a fruitful approach.
By
separating the decision rule into one portion which decides whether to
continue sampling at all, and another which evaluates the expected resuIt of sampling from each specific population, a sequential procedure
could be defined.
Although not optimal in general, it is conjectured
that the lost efficiency rl'l1y be Il s1ight" for at least a restricted class
of such problems.
5 . Average sample number
Theorem 1
can be used to give a J presuwnbly poor, upper
bound (in terms of a ) for the expected value of
N averaged over
o
the prior distribution of
Q.
Some knowledge of
EG N00
may' be poss i-
ble by an adaptation of the technique used in ~17.
6. Properties of the Bayes procedure
The existence of a general sequential procedure whose
average risl';: is not greater than the fixed sample-size Bayes risk may
be useful in evaluating properties of the
a00
is less than
BSR.
For example, since
for the estimation problem found in section ).1,
the BSR cannot be a fixed sample-size procedure.
BIBLIOGRAPHY
Ll_7
Anscombe, F. J., "Testing to establish a high degree of
safety or reliability," unpublished paper presented at the
Eastern Regional Meeting of the Institute of Mathematical
Statistics, Cornell University, April 20-22, 1961.
Barnard, G. A., "Sampling inspection and statistical
decisions," Journal of the Royal Stati.?tical
So~iety,
Series (B), Volume 16 (1954), pp. 151-174.
Blackwell, D. and Girshick, M. A., Theory of Games and
Statistical Decisions, John Wiley and Sons, New York, 1954.
Bross, Ie J., "Maximum utility statistics," Ph. D, thesis,
Department of Experimental Statistics, North Carolina
State College, 1949.
Chernoff, H., "Sequential design of experiments," Annals of
Mathematical Statistics, Volume 30 (1959), pp. 755-770.
Grundy, P. M., Healy, M. J. R., and Rees, D. H., "Economic
choice of the amount of experimentation," JO·.J.rnal of the
~oyal Statistical Society, Series (B), VolWlle
18 (1956),
pp. 32-55.
Hoeffding, 1-1.; "Lower bound for the expected sample size
and the average risk of a sequential procedure," Annals of
Mathematical Statistics, Volume 28 (1957), pp. 57-74.
,
Johnson, N. L., "Sequential analysis:
a survey," Journal of
the Royal Statistical Societl' Series (A), Volume 124 (1961),
pp. 372-411.
36
["9_7
Lindley, D. V., IIBinomial sampli.ng schemes and the concept of
information, II Biometrika, Volume ~.4 (1957), pp. 179-186.
["10_7 Raiffa, H. and Schlaiffer, R., Applied Statistica;.l Decision
Theory, Graduate School of Business Administration, Harvard
University; Boston, 1961.
1-11_7
Savage, L. J"
TheJr0unda~io~s of Statistics, John Wiley and
Sons, New York, 1954.
£12_7
Schwartz, G., "Asymptotic shapes of Bayes sequential testing
regions," ArLl1als of Mathematical Statistic~, Vol1..une 33 (1962),
pp. 224-236.
["13_7
Stockman, C. M., and Armitage, P., "Some properties of closed
sequential schemes,1I Journal of the Royal Statistical Society,
Series (B), Volume 8 (1946), pp. 104-112.
£14_7 Wald, A., "Asymptotic minimax solutions of sequential point
estimation problems," Proceedings of the Second Berkeley
Berk~ley
Symposium on Mathematical Statistics and
Probabilit~,
(1950), pp. 1-11, University of California Press, Berkeley
and Los Angeles, 1951.
~15_7
Wald, A., Statistical Decision Functions, John Wiley and Sons,
New York, 1950.
["16_7 Weiss, L., Statistical Decision Theory, McGraw-Hill Book Co.,
Inc., Nevl York, 1961.
£17_7 Wetherill, G., "Bayesian sequential analysis," Biometrika,
Volume 48 (1961), pp. 281-292.
37
L18_7wOlfovlitz, J., "The efficiency of se<luentia.l estimates and
....
Wald's e<luation for se<luential processes," Annals of Mathematical Statistics, Volume 18 (1947), pp. 215-230.
It·
".
.,.
© Copyright 2026 Paperzz