The Minimax Risk for Clinical Trials
by
John Bather and Gordon Simons†
University of Sussex, U.K.
Summary
The risk involved in a trial to compare two medical treatments
is shared by patients who receive the inferior treatment during the
experimental phase and those remaining after the experiment who might all
receive the inferior treatment if the results are misleading.
We consider
the maximum of this risk with respect to the unknown probabilities of success
and seek allocation rules that minimise this quantity, for a given total of
patients.
Extensive computations are needed to find such minimax procedures, but there are simple and almost equally effective allocation rules based on a truncated sequential probability ratio test.
Key words: CLINICAL TRIALS; BAYES PROCEDURES; SEQUENTIAL PROBABILITY RATIO TEST; MINIMAX ALLOCATION RULES.
Present address: Mathematics Division, University of Sussex, Falmer, Brighton BN1 9QH, U.K.
† This work was carried out while G. Simons was visiting from the Department of Statistics, University of North Carolina, Chapel Hill, NC 27514, U.S.A.
1. Introduction
Consider the task of allocating a total of $t$ patients who suffer from the same illness to two available treatments which successfully treat the illness with unknown probabilities $p_1$ and $p_2$, respectively. Initially, $2N$ patients are assigned sequentially in pairs to the two treatments and then the remaining $t - 2N$ patients are assigned to the apparently superior treatment. The task is to choose $N$ so as to maximise the expectation $E(r)$, where $r$ is the number of successfully treated patients.
The expected successes lost (ESL), due to ignorance of the values of $p_1$ and $p_2$, is given by the difference $t \max(p_1,p_2) - E(r)$. It is mathematically equal to the expected number of patients assigned to the inferior treatment multiplied by the factor $\delta = |p_1 - p_2|$. Thus, we can imagine a fixed loss $\delta$ for each assignment to the inferior treatment. If $\Delta$ denotes a sequential rule for choosing $N$, upon which the evaluation of $E(r)$ is based, then the ESL becomes the risk function, denoted $R_\Delta(p_1,p_2,t)$, associated with the allocation rule $\Delta$. The problems of maximising $E(r)$ and minimising the risk with respect to $\Delta$ are equivalent.
We are not concerned with all possible allocation rules, but only with the class $V$ of two-stage procedures, where each rule $\Delta$ has a stopping time $N$ which determines the length of the sequence of paired observations in the first stage. The aim here is to find, for each $t$, a rule which is minimax within the class $V$. It is not obvious a priori but, for many values of $t$, the minimax allocation rules belong to a certain subclass of $V$ and our search can be reduced by considering only stopping times of the corresponding special form.
For the first $n$ pairs of patients assigned to the two treatments, let $S_n$ denote the number of successes produced by the first treatment minus the number produced by the second. The stochastic sequence $S_1, S_2, \ldots, S_{[t/2]}$ is a random walk having steps which equal $-1, 0, 1$ with probabilities $(1-p_1)p_2$, $p_1p_2 + (1-p_1)(1-p_2)$ and $p_1(1-p_2)$, respectively.
In order to reduce our search, we shall restrict attention to allocation rules corresponding to stopping rules of the form:
$$N = \min\{n,\ 1 \le n \le t/2 : S_n \in A_n\}, \tag{1}$$
where $A_n$ are subsets of the real line. It will be argued in Section 4 that this restriction on $N$ does not seriously affect matters. An allocation rule $\Delta$ will be called symmetric if the subsets defining $N$ are symmetrical about zero.
The use of a symmetric allocation rule seems
appropriate when one has little or no reason to believe that one particular
treatment is better than the other.
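For symmetric rules, there is a very useful computational formula for the risk function:
$$R_\Delta(p_1,p_2,t) = \delta\left\{\frac{t}{2} - E\left[\left(\frac{t}{2} - N\right)\frac{\gamma^{|S_N|}-1}{\gamma^{|S_N|}+1}\right]\right\}, \tag{2}$$
where
$$\delta = |p_1 - p_2|, \qquad \gamma = \frac{p_1(1-p_2)}{p_2(1-p_1)} \quad \text{for } p_1 \ge p_2, \tag{3}$$
with the roles of $p_1$ and $p_2$ interchanged when $p_1 < p_2$. Suppose, for the moment, that $t$ is an even integer. Then $\delta t/2$ is the risk associated with the allocation rule $\Delta_0$ which assigns half of the $t$ patients to each of the treatments and the quantity
$$\delta\, E\left[\left(\frac{t}{2} - N\right)\frac{\gamma^{|S_N|}-1}{\gamma^{|S_N|}+1}\right] \tag{4}$$
can be interpreted as the expected successes saved (ESS) by using the symmetric rule $\Delta$ instead of $\Delta_0$. The ESS is always non-negative, and it is strictly positive whenever $p_1 \ne p_2$ and $P(N < t/2,\ S_N \ne 0) > 0$.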
By using (2), the risk function can be evaluated easily, rapidly and accurately on a mainframe computer, even for fairly large values of $t$. Costly and less accurate simulation studies can be avoided.
A simple and, we believe, a revealing indicator of the overall effectiveness of an allocation rule $\Delta$ is given by the maximum risk
$$M_\Delta(t) = \max_{p_1,p_2} R_\Delta(p_1,p_2,t). \tag{5}$$
Any rule which minimises this indicator within the class $V$ is, of course, minimax. Of some significance is the actual size of the minimax value:
$$M(t) = \min_{\Delta \in V} M_\Delta(t). \tag{6}$$
For given $t$ and any allocation rule $\Delta$, there must be a pair of values $(p_1,p_2)$ such that $R_\Delta(p_1,p_2,t) \ge M(t)$. In this sense, $M(t)$ indicates how well one can do. This investigation will be mainly concerned with $M(t)$ and the minimax rules $\Delta = \Delta(t)$. Our results also include simpler allocation rules with risk functions which come very close to attaining the minimax value.
It is easily seen that $M(t) \to \infty$ as $t \to \infty$. Computer calculations, using formula (2), indicate that, for large $t$, $M(t) \approx 0.3714\, t^{1/2}$. For comparison, we observe that $M_\Delta(t) = 0.5\,t$ for the simplest allocation rule $\Delta = \Delta_0$.
We have been able, for many values of $t$, to determine a (symmetric) minimax rule $\Delta$ and thereby to evaluate the ratio $M(t)/t^{1/2}$. These rules are Bayes procedures for a two-point prior distribution which assigns equal probabilities to the points $(p_1,p_2) = (\tfrac12(1 \pm \delta), \tfrac12(1 \mp \delta))$. For fixed $\delta$, the minimal Bayes risk can be rapidly computed using backwards induction: see Section 4 for the details. Our approach is to choose the least favourable value of $\delta$ so as to maximise the minimal Bayes risk and it turns out that this produces the minimax allocation rule for most of the values of $t \le 201$. This assertion is based on extensive calculations we have made, using formula (2), to verify that the maximum of the risk function coincides with the minimal Bayes risk, for an appropriate choice of $\delta$. The computations are described in Section 5, including some exceptional cases, and the results are summarised in Section 6.
In fact, there are quite simple symmetric allocation rules $\Delta$ for which the ratio $M_\Delta(t)/t^{1/2}$ is very nearly as small as $M(t)/t^{1/2}$. One of these uses a sequential probability ratio test (SPRT), truncated after $[t/2]$ pairs. This type of rule was first investigated by Vogel (1960) who established an inequality for the risk function and deduced an asymptotic upper bound on the ratio $M(t)/t^{1/2}$, as $t \to \infty$. As we shall see in Section 3, his inequality can be extended to provide a close upper bound for all values of $t$. It follows that a truncated SPRT can be specified so that it behaves, for all practical purposes, like the corresponding minimax rule.
Another allocation procedure is a simple adaptation of a rule suggested by Anscombe (1963) for a related setting with normal data. Such procedures also behave like minimax rules; they have comparable risk functions which are quite flat in the variables $p_1, p_2$. Anscombe's rule has been recommended by others: see Lai et al. (1980) and Chernoff and Petkau (1981). However, the truncated SPRT seems preferable because of its simplicity and also because of its lack of sensitivity to changes in $t$: see Section 6.
The comparison in Table 1 also includes an allocation rule adapted from one suggested by Lai et al. This does fairly well, with $M_\Delta(t)/t^{1/2}$ approximately equal to 0.41 for large values of $t$. It suffers from the fact that no more than about a third of the patients can be assigned in pairs, which causes difficulties when $p_1$ and $p_2$ are close together.
The basic loss structure assumed here was developed for normal data
in the papers by Anscombe (1963) and Colton (1963).
Since then, several other
two-stage procedures have been suggested in the literature.
For example,
Begg and Mehta (1979) considered approximate Bayes procedures, using normal
prior distributions on the mean difference between two treatments.
Bayes
allocation rules and the corresponding minimal Bayes risks have been accurately
computed, for a similar model in continuous time, in the paper by Chernoff and
Petkau.
In particular, they found that Anscombe's rule is remarkably efficient
from a Bayesian point of view, for a wide range of normal prior distributions.
There are strong practical reasons for restricting attention to two-stage procedures in medical trials: the allocation of treatments can be randomised for each pair of patients during the experimental phase, giving protection against the possibility of bias in the results. The truncated SPRT also has the practical advantage of simplicity. Table 2 gives an illustration of what can be achieved by comparing its risk function, for the case $t = 100$, with that of the corresponding minimax rule. However, our results and those mentioned above for normal models suggest that similar risk functions can be produced by other allocation rules.
In this paper, we focus attention on the risk function and
its maximum, but it must be recognised that other criteria, such as final error
probabilities, are also relevant to the design of clinical trials.
Our investigation is concerned with a restricted version of the two-armed bandit problem and it is worth considering what reductions in risk might be achieved by extending the class $V$ of two-stage procedures to permit arbitrary switching from one treatment to the other. The more general problem has been widely studied. For example, it is known that the two-point prior distributions used here lead to reasonably simple Bayes procedures for the two-armed bandit problem but, unfortunately, such allocation rules perform badly at other points $(p_1,p_2)$, where the risk is of order $t$: see Bather (1981), Section 3.1.
The paper by Berry (1978) includes computations of the minimal Bayes risk when $p_1$ and $p_2$ are assumed to have independent beta densities, but these average risks give very little indication of the corresponding maxima and the underlying procedures are complicated. For our purposes, a more useful result is an asymptotic lower bound established in Bather (1982). The minimax risk $M^*(t)$ for the extended class of allocation rules has the property that
$$\liminf_{t \to \infty} M^*(t)/t^{1/2} \ge 0.306 .$$
This gives some idea of the possible reductions in risk, but no indication of the allocation rules involved.
2. Risk and error formulae for symmetric allocation rules
Let $\Delta$ be an allocation rule and let $N$ be the stopping time, defined as in (1), which represents the number of pairs of patients assigned by $\Delta$ during its testing phase. The remaining $t - 2N$ patients are assigned to the first treatment if $S_N > 0$; to the second if $S_N < 0$ and, by tossing a fair coin, to either of the two treatments if $S_N = 0$.
Theorem 1. If $\Delta$ is a symmetric allocation rule, then the risk function $R_\Delta(p_1,p_2,t)$ can be computed by means of formula (2).
Proof. For definiteness, let $p_1 > p_2$ so that the second is the inferior treatment. The expected number of patients assigned to the inferior treatment can be expressed, for any $\Delta$, in terms of indicator random variables such as $1(S_N < 0)$, and then re-written as follows:
$$E\left[N + (t-2N)\left\{1(S_N<0) + \tfrac12\,1(S_N=0)\right\}\right] = \frac{t}{2} - E\left[\left(\frac{t}{2} - N\right)\left\{1(S_N>0) - 1(S_N<0)\right\}\right]. \tag{7}$$
If $\Delta$ is symmetric, then, given $N = n$ and $|S_n| = i > 0$, the ratio of the probabilities for a path leading to $S_n = i$ and for its reflection leading to $S_n = -i$ is $\gamma^i$. Consequently, the corresponding conditional expectation of the difference $1(S_n > 0) - 1(S_n < 0)$ assumes the form $(\gamma^i - 1)/(\gamma^i + 1)$; so the desired conclusion follows from (7). □
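The identity behind Theorem 1 can be checked by brute force for small $t$. The sketch below is not part of the original paper. It assumes that formula (2) reads $R_\Delta = \delta\{t/2 - E[(t/2 - N)(\gamma^{|S_N|}-1)/(\gamma^{|S_N|}+1)]\}$ with $\gamma = p_1(1-p_2)/\{p_2(1-p_1)\}$ for $p_1 > p_2$, enumerates every path of the random walk, and compares that expression with a direct count of allocations to the inferior treatment. The function name and the list `c` of critical values (stop as soon as $|S_n| \ge c_n$) are illustrative choices.

```python
from itertools import product

def risk_two_ways(p1, p2, t, c):
    """Return (direct, via_formula_2) for the symmetric rule that stops at the
    first n with |S_n| >= c[n-1], or at n = t // 2 (hypothetical helper).
    Requires 0 < p1, p2 < 1."""
    m = t // 2
    step_prob = {1: p1 * (1 - p2), -1: p2 * (1 - p1)}
    step_prob[0] = 1 - step_prob[1] - step_prob[-1]
    delta = abs(p1 - p2)
    hi, lo = max(p1, p2), min(p1, p2)
    gamma = hi * (1 - lo) / (lo * (1 - hi))
    bad_sign = -1 if p1 > p2 else 1  # sign of S_N that favours the inferior arm
    direct = formula = 0.0
    for path in product((1, 0, -1), repeat=m):
        prob = 1.0
        for step in path:
            prob *= step_prob[step]
        s, n_stop = 0, m
        for n, step in enumerate(path, start=1):
            s += step
            if n == m or abs(s) >= c[n - 1]:
                n_stop = n
                break
        rest = t - 2 * n_stop          # patients left after the paired phase
        if s == 0:
            wrong = 0.5 * rest         # fair coin decides
        elif (s > 0) == (bad_sign > 0):
            wrong = rest               # all go to the inferior treatment
        else:
            wrong = 0.0
        direct += prob * (n_stop + wrong)
        g = gamma ** abs(s)
        formula += prob * (t / 2 - (t / 2 - n_stop) * (g - 1) / (g + 1))
    return delta * direct, delta * formula
```

The two numbers agree only in expectation, not path by path, which is exactly where the symmetry of the stopping sets enters the proof.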
When $(p_1,p_2)$ is a boundary point of the unit square, the steps of the random walk $S_1, S_2, \ldots, S_{[t/2]}$ all have the same sign, non-negative if $p_1 > p_2$. Formula (2) remains valid if the ratio $(\gamma^{|S_N|}-1)/(\gamma^{|S_N|}+1)$, with $\gamma = \infty$, is interpreted as $1(S_N \ne 0)$. It follows from the expression on the left of (7) that, at such boundary points, if $P(S_N = 0,\ N < t/2) = 0$, then $R_\Delta(p_1,p_2,t) = E|S_N| = \delta E(N)$. The latter equality is a consequence of Wald's equation: $E(S_N) = E(S_1)E(N)$.
More generally, for points on the boundary of the unit square, formula (2) must be replaced by
(8)
where $m$ is the least index $n$ for which $S_n = 0$ is a stopping point for $\Delta$. The random walk is monotone in such cases, so we must have either $S_1 = S_2 = \cdots = S_m = 0$ or $S_N \ne 0$.
The
error probability
probability that the t - 2N
a
for an allocation rule
patients assigned according to
A is defined as the
5
N
at the end of
the testing phase are actually assigned to the inferior treatment.
this probability depends on
when
N.
and
Clearly,
and, by convention, we set
a =
~
The proof of the following result is similar to that for
Theorem 1.
Theorem 2.
If
A is a symmetric allocation rule, then
(9)
a
When
(P1,P2)
is a boundary point of the unit square,
a
where
m is defined as in formula (8).
0)
3. Vogel's inequality for the truncated SPRT
Let $\Delta = \Delta_D$ denote the symmetric allocation rule defined by stopping as soon as the process reaches the level $\pm D$, or when $n = [t/2]$. The stopping time is
$$N = \min\{n \ge 1 : |S_n| = D,\ [t/2]\}.$$
Let $R_D(p_1,p_2,t)$ denote the corresponding risk function, $D = 1,2,\ldots$. It is convenient to refer to these allocation rules as truncated SPRT's, since the unbounded version $(t = \infty)$ is a sequential probability ratio test for the two simple hypotheses represented by the points $(p_1,p_2)$ and $(p_2,p_1)$. As we shall see, the truncated SPRT with an appropriate choice of $D$, depending on $t$, is very nearly minimax. Its maximum risk $M_D(t)$ coincides with $M(t)$ for small values of $t$ and, in general, it is only slightly larger.
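Vogel (1960) discovered an upper bound for $R_D$ which can be derived easily from formula (2):
Theorem 3. For the truncated SPRT described above and even values of $t \ge 2$,
$$R_D(p_1,p_2,t) \le \delta t \varepsilon + D(1-2\varepsilon)^2, \tag{10}$$
where
$$\varepsilon = 1/(\gamma^D + 1) \tag{11}$$
is the error probability of the unbounded SPRT that stops at time $N' = \min\{n \ge 1 : |S_n| = D\}$. In general, the true error probability $\alpha$ exceeds $\varepsilon$.
Proof of Theorem 3. Observe that $|S_N| = D$, so that $(\gamma^{|S_N|}-1)/(\gamma^{|S_N|}+1) = 1 - 2\varepsilon$, whenever $t/2 - N > 0$, and one obtains from formula (2)
$$R_D(p_1,p_2,t) = \delta t \varepsilon + (1-2\varepsilon)\,\delta E(N).$$
Since $N' \ge N$ always holds and since $E(S_1)E(N') = E(S_{N'})$, by Wald's equation, we have
$$\delta E(N) \le \delta E(N') = E(S_{N'}) = D(1-\varepsilon) - D\varepsilon = D(1-2\varepsilon),$$
which establishes the required inequality (10). □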
Vogel's inequality provides an upper bound for the maximum risk $M_D(t)$, given by
$$U_D(t) = \max_{p_1,p_2} \{\delta t \varepsilon + D(1-2\varepsilon)^2\}. \tag{12}$$
This must be evaluated numerically, but a two-dimensional search for the maximum can be avoided by using (13) below. Notice that, when $(p_1,p_2)$ is on the boundary of the unit square, $\varepsilon = 0$ and the expression on the right of (12) reduces to $D$.
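Theorem 4. The expression $\delta t \varepsilon + D(1-2\varepsilon)^2$, appearing in (12), always attains its maximum at a point $(p_1,p_2)$ on the diagonal line $p_1 + p_2 = 1$. Hence,
$$U_D(t) = \max_{0<\delta\le 1} \{\delta t \varepsilon_0 + D(1-2\varepsilon_0)^2\}, \tag{13}$$
where
$$\varepsilon_0 = \frac{(1-\delta)^{2D}}{(1+\delta)^{2D} + (1-\delta)^{2D}} .$$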
Proof. If the expression $\delta t \varepsilon + D(1-2\varepsilon)^2$ attains its maximum on the boundary of the unit square, then, as noted in the previous paragraph, the same maximal value $D$ is attained at all boundary points and, in particular, at the point $(1,0)$ where $p_1 + p_2 = 1$. Thus it is sufficient to show, for each fixed $\delta$, that the expression $\delta t \varepsilon + D(1-2\varepsilon)^2$ is maximised at a point $(p_1,p_2)$ which is either on the diagonal line $p_1 + p_2 = 1$ or on the boundary of the unit square.
For definiteness, let $p_1 > p_2$ and write $\delta = p_1 - p_2 > 0$ and $\rho = p_1 + p_2 - 1$, where $0 < \delta \le 1$ and $|\rho| \le 1 - \delta$. Then $p_1 = \tfrac12(1+\delta+\rho)$, $p_2 = \tfrac12(1-\delta+\rho)$ and
$$\gamma = \frac{(1+\delta)^2 - \rho^2}{(1-\delta)^2 - \rho^2} .$$
The diagonal line $p_1 + p_2 = 1$ corresponds to $\rho = 0$; the boundary of the unit square corresponds to $|\rho| = 1 - \delta$. For fixed $\delta > 0$, as $|\rho|$ increases from 0 to $1-\delta$, $\gamma$ increases from $(1+\delta)^2/(1-\delta)^2$ to infinity and hence $\varepsilon$, given by (11), decreases from $\varepsilon_0$ to zero. Since the expression $\delta t \varepsilon + D(1-2\varepsilon)^2$ is convex in $\varepsilon$, it must attain its maximum at one of the end points $\varepsilon = 0$ and $\varepsilon = \varepsilon_0$. In the first case, $(p_1,p_2)$ is a boundary point, while in the second, $\rho = 0$ and $(p_1,p_2)$ is on the diagonal. □
The value
$$V(t) = \min_{D=1,2,\ldots} U_D(t)$$
is an upper bound on the minimax risk $M(t)$. The minimising choice of $D$ here suggests a value of $D$ for the truncated SPRT and this was the approach used in our preliminary investigations, for even values of $t$. Subsequently, we developed techniques for evaluating $M_D(t)$ itself and we were able to minimise this directly and accurately with respect to $D$, for odd as well as even $t$. However, it is worth noting that $V(t)$ is a good approximation to $\min_D M_D(t)$: our computations indicated that the excess involved in using $V(t)$ is less than 1% for $t = 2,4,\ldots,200$.
Vogel (1960) used the inequality (10) to obtain the asymptotic result
$$V(t) \sim 0.375\, t^{1/2} .$$
His argument is not very clear, but the same result can be justified simply by treating $D$ as a continuous variable in the definition of $V(t)$, which also shows that $D \sim 0.292\, t^{1/2}$. The coefficient in the first result is recorded as .376 and also as .367 in Vogel's paper: in fact, the value .375 given here is correct to six decimal places, but it is not exactly 3/8.
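The bound $V(t)$ is straightforward to evaluate numerically. The following is a minimal sketch, not the authors' programme: it assumes that (13) reads $U_D(t) = \max_{0<\delta\le1}\{\delta t \varepsilon_0 + D(1-2\varepsilon_0)^2\}$ with $\varepsilon_0 = (1-\delta)^{2D}/\{(1+\delta)^{2D} + (1-\delta)^{2D}\}$, and it replaces the one-dimensional maximisation by a plain grid search. The function names and the grid size are illustrative.

```python
def vogel_bound(t, D, grid=2000):
    """Approximate U_D(t): the maximum over delta in (0, 1] of
    delta*t*eps0 + D*(1 - 2*eps0)**2, on an evenly spaced grid (assumed size)."""
    best = 0.0
    for k in range(1, grid + 1):
        delta = k / grid
        r = ((1 - delta) / (1 + delta)) ** (2 * D)   # avoids large powers
        eps0 = r / (1 + r)
        best = max(best, delta * t * eps0 + D * (1 - 2 * eps0) ** 2)
    return best

def best_sprt_level(t):
    """Return (D, V(t)) where D minimises U_D(t) over D = 1, 2, ..."""
    # U_D(t) >= D (its value at delta = 1), so levels above U_1(t) cannot win.
    d_max = max(1, int(vogel_bound(t, 1)) + 1)
    candidates = ((D, vogel_bound(t, D)) for D in range(1, d_max + 1))
    return min(candidates, key=lambda pair: pair[1])
```

For example, `best_sprt_level(400)` can be compared with the asymptotic statements above by dividing the returned bound by $400^{1/2} = 20$ and the returned level by the same factor.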
4. Bayes procedures
Consider the symmetric prior distribution defined by assigning probability $\tfrac12$ to each of the hypotheses
$$H_1 : (p_1,p_2) = (\tfrac12(1+\delta), \tfrac12(1-\delta)), \qquad H_2 : (p_1,p_2) = (\tfrac12(1-\delta), \tfrac12(1+\delta)).$$
Given the number of patients $t$ and any fixed $\delta$, $0 < \delta \le 1$, let $B(\delta,t)$ denote the minimal Bayes risk within $V$. The aim, in this section, is to obtain an algorithm (20) from which $B(\delta,t)$ can be computed. Theorem 5 establishes the general form of the corresponding Bayes procedure $\Delta = \Delta(\delta,t)$ within the class $V$.
Later, we shall be concerned with
$$\mu(t) = \max_{0<\delta\le 1} B(\delta,t) \tag{14}$$
and the least favourable choice $\delta = \delta_\mu$ which attains this maximum. We know that, for every allocation rule $\Delta \in V$, the maximum risk $M_\Delta(t)$ cannot be less than the Bayes risk with respect to any prior distribution. This statement is not restricted to stopping times of the form (1): any well-defined allocation rule $\Delta$ for paired observations would lead, if $M_\Delta(t) < B(\delta,t)$, to a contradiction of the fact that $B(\delta,t)$ is the smallest possible average risk with respect to the given prior distribution. It follows that
$$\mu(t) \le M(t) \le M_{\Delta_\mu}(t), \tag{15}$$
where $\Delta_\mu = \Delta(\delta_\mu,t)$ is the Bayes procedure associated with $\delta_\mu$. The computations described in the next section will be used to verify, for most values of $t$, that $\mu(t) = M_{\Delta_\mu}(t)$, which is enough to confirm that $\Delta_\mu$ is the minimax allocation rule.
Suppose that, after $n$ pairs of patients have been assigned to the two treatments, we have obtained $r_i$ successes with treatment $i$, $i = 1,2$. The above prior distribution leads to posterior probabilities $\pi_1$ and $\pi_2$ for the two hypotheses with
$$\pi_1 = \frac{\gamma^j}{\gamma^j+1}, \qquad \pi_2 = \frac{1}{\gamma^j+1}, \tag{16}$$
where $j = r_1 - r_2$ and $\gamma = (1+\delta)^2/(1-\delta)^2$. The posterior error probability, if we stop and decide optimally in favour of $H_1$ or $H_2$, is
$$\alpha_j = \min(\pi_1,\pi_2) = \frac{1}{\gamma^{|j|}+1} . \tag{17}$$
It is convenient to work by backwards induction using the variable $s = t - 2n$ which represents the number of patients left after sampling $n$ pairs. The stopping cost at the point $(j,s)$, excluding previous sampling, is
$$K_j(s) = s\,\alpha_j . \tag{18}$$
For example, if $j > 0$ so that $\alpha_j = \pi_2 < \pi_1$, a decision to stop would allocate all the remaining patients to the first treatment. On the other hand, if we observe another pair of patients, this produces a transition from $(j,s)$ to $(j+1, s-2)$, $(j, s-2)$ or $(j-1, s-2)$ with posterior probabilities $u_j$, $v_j$ and $w_j$, respectively. It is easily verified that
$$u_j = \tfrac14(1+\delta^2) + \tfrac12\delta\,\frac{\gamma^j-1}{\gamma^j+1}, \qquad v_j = \tfrac12(1-\delta^2), \qquad w_j = \tfrac14(1+\delta^2) - \tfrac12\delta\,\frac{\gamma^j-1}{\gamma^j+1}, \tag{19}$$
so that $u_j + v_j + w_j = 1$. Let us define $Q_j(s)$ as the minimum expected number of allocations of the inferior treatment among the $s$ remaining patients, starting in state $j$.
Of course, we are mainly interested in the minimal Bayes risk $B(\delta,t) = \delta\, Q_0(t)$ which corresponds to the initial point $(0,t)$. This can be computed from the relations: $Q_j(0) = 0$, $Q_j(1) = \alpha_j$ and, for $s \ge 2$,
$$Q_j(s) = \min\{K_j(s),\ 1 + u_j Q_{j+1}(s-2) + v_j Q_j(s-2) + w_j Q_{j-1}(s-2)\}. \tag{20}$$
For example, if $t$ is odd, we must determine $Q_j(3), Q_j(5), \ldots$ in succession, each for an appropriate set of states $j$, until we can evaluate $Q_0(t)$.
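The backward induction can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' programme. It takes the recursion to be $Q_j(s) = \min\{s\alpha_j,\ 1 + u_jQ_{j+1}(s-2) + v_jQ_j(s-2) + w_jQ_{j-1}(s-2)\}$ with $\alpha_j = 1/(\gamma^{|j|}+1)$, $\gamma = (1+\delta)^2/(1-\delta)^2$, $u_j, w_j = \tfrac14(1+\delta^2) \pm \tfrac12\delta(\gamma^j-1)/(\gamma^j+1)$ and $v_j = \tfrac12(1-\delta^2)$. The function names are hypothetical. The ratio $(\gamma^j-1)/(\gamma^j+1)$ is written as a hyperbolic tangent to avoid forming large powers of $\gamma$.

```python
import math

def bayes_risk(delta, t):
    """Minimal Bayes risk B(delta, t) = delta * Q_0(t) for 0 < delta < 1,
    by backward induction on s, the number of patients left (a sketch)."""
    log_gamma = 2 * math.log((1 + delta) / (1 - delta))

    def ratio(j):   # (gamma**j - 1) / (gamma**j + 1), computed without overflow
        return math.tanh(0.5 * j * log_gamma)

    def alpha(j):   # posterior error probability 1 / (gamma**|j| + 1)
        return 0.5 * (1 - ratio(abs(j)))

    v = 0.5 * (1 - delta ** 2)
    s = t % 2
    # Q[j] holds Q_j(s) for j = 0, ..., (t - s) // 2; Q_{-j} = Q_j by symmetry.
    Q = [alpha(j) if s == 1 else 0.0 for j in range(t // 2 + 1)]
    while s < t:
        s += 2
        n = (t - s) // 2          # pairs already sampled, so |j| <= n
        new_Q = []
        for j in range(n + 1):
            u = 0.25 * (1 + delta ** 2) + 0.5 * delta * ratio(j)
            w = 0.25 * (1 + delta ** 2) - 0.5 * delta * ratio(j)
            cont = 1 + u * Q[j + 1] + v * Q[j] + w * Q[abs(j - 1)]
            new_Q.append(min(s * alpha(j), cont))
        Q = new_Q
    return delta * Q[0]

def least_favourable(t, grid=200):
    """Grid search for the delta maximising B(delta, t); grid size is arbitrary."""
    return max((k / grid for k in range(1, grid)),
               key=lambda d: bayes_risk(d, t))
```

For $t = 4$ the recursion can be followed by hand and gives $B(\delta,4) = \delta(2-\delta)$, which is a convenient check on any implementation.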
The relevant states are those accessible from the optimal continuation region which consists of all points $(j,s)$ at which $Q_j(s) < K_j(s)$. In order to see the form of this continuation region, let
$$R_j(s) = K_j(s) - Q_j(s). \tag{21}$$
Thus, $R_j(s) \ge 0$ is the advantage of continuation over stopping. A Bayes procedure is determined by applying, at each point $(j,s)$ encountered during the sampling process, the rule: take another pair of observations if and only if $R_j(s) > 0$. It is not difficult to obtain the relations satisfied by $R_j(s)$ from (20). Clearly $R_j(0) = R_j(1) = 0$ and, for $s \ge 2$, $j \ne 0$, we have
$$R_j(s) = \max\{0,\ 2\alpha_j - 1 + u_j R_{j+1}(s-2) + v_j R_j(s-2) + w_j R_{j-1}(s-2)\}. \tag{22}$$
However, for $j = 0$, this is replaced by
$$R_0(s) = \max\{0,\ \tfrac12\delta(s-2) + u_0 R_1(s-2) + v_0 R_0(s-2) + w_0 R_{-1}(s-2)\}. \tag{23}$$
We remark that $K_j(s) = K_{-j}(s)$ and, according to (19), $u_j = w_{-j}$, $u_{-j} = w_j$. It follows that $Q_j(s)$ and $R_j(s)$ are symmetric in $j$ and, in particular, $R_1(s-2) = R_{-1}(s-2)$ in relation (23). Notice also that $2\alpha_j - 1 < 0$ in (22), but $R_0(s) \ge \tfrac12\delta(s-2)$ in (23), which guarantees continuation whenever $s > 2$.
The optimal continuation region is determined by the rule:
Theorem 5.
continue if and only if
d
and, in general,
Ijl < d
' where
s
is non-decreasing in
s
d
dO
s
1
= d2 = 0
with either
d
d
s
s-,~
or
•
For a given point
Proof.
if starting in state
j
(j,s)
with
patient according to whether
, consider using the Bayes procedure as
s-l
j
patients left and then allocating the final
is positive or negative.
sub-optimal at
(j , s)
and hence,
K, (s)
+
it follows from (21) that, in general,
K. (s- 1)
J
J
ct.
J
Q.(s)
J
~
R, (s)
J
•
+
This policy is
o
By direct calculation,
R, (2)
s
2cx -1 < 0
~
3
and any
whenever
when
j
R.
J+
~
j
0
~
R,
J
~
d
s
~
d
s-
1
ct,
Since
•
J
R. (s-l)
(24)
J
for all
R,
J-
j
For
1(s-2) = 0
R (3) > 0 , so
O
based on (22) shows that, in general,
from (24) that
J
o
R. (s)
J
(s-2)
R (4)
O
:5 Q.(s-1)
and equation (22) shows that
j
1(s-2)
0 , but
J
.
d
d
1
3
:5 d _
s 2
s
+ 1
=R
R (3)
j
Hence,
j
=
(4)
0
The same argument
Finally, it follows
o
and the proof is complete.
Before describing the computations, it will be helpful to modify the notation. Allocation rules of the type described in Theorem 5 can be converted to the standard form (1) by defining
$$A_n = \{j : |j| \ge c_n\}, \qquad c_n = d_{t-2n}, \qquad n = 1,2,\ldots,[t/2].$$
Notice that this specification depends on the given value of $t$: $c_n$ is the critical value of $|S_n|$ after observing $n$ pairs of patients. We shall be mainly concerned with the behaviour of $B(\delta,t)$ and changes in the corresponding Bayes procedure $\Delta(\delta,t)$ as $\delta$ varies, for fixed $t$, so the notation will be simplified by writing $B(\delta) = B(\delta,t)$ and $\Delta(\delta) = \Delta(\delta,t)$.
5. Computations
Two computer programmes were essential for our work. One evaluates the minimal Bayes risk $B(\delta)$ and determines the corresponding allocation rule $\Delta(\delta)$ for the symmetric prior distribution on the two points $(\tfrac12(1 \pm \delta), \tfrac12(1 \mp \delta))$. The other evaluates the risk $R_\Delta(p_1,p_2,t)$ arising from a stopping time of the form:
$$N = \min\{n \ge 1 : |S_n| \ge c_n,\ [t/2]\}. \tag{25}$$
The first programme is based on the algorithm (20) described in the previous section. It can be accelerated by making use of Theorem 5 and by performing only those calculations needed in the algorithm to obtain $B(\delta) = \delta\, Q_0(t)$ and $\Delta(\delta)$. This requires some care. Serious problems with "overflow", the occurrence of numbers too large for the machine, are also avoided by the elimination of inessential calculations. We found it possible to evaluate $B(\delta)$ and $\Delta(\delta)$ accurately for values of $t$ as large as 1200. Contrary to a widely held belief, there are non-trivial settings for which backwards induction algorithms are quite practical for large "horizons".
The programme used to evaluate the risk $R_\Delta(p_1,p_2,t)$ is based on formula (2). The relevant algorithm is best described for a general expectation of the form $E f(N,S_N)$ with the stopping time appearing in (1). We have
$$E f(N,S_N) = \sum_{n=1}^{[t/2]} \sum_{i \in A_n} p_{ni}\, f(n,i),$$
where $p_{ni} = P(N \ge n,\ S_n = i)$. The probabilities $p_{ni}$ can be evaluated recursively in terms of $q_i = P(S_1 = i)$, $i = -1, 0, 1$, with all other $q_i = 0$. We have $p_{1j} = q_j$ and
$$p_{n+1,j} = \sum_{i \notin A_n} p_{ni}\, q_{j-i}, \qquad n = 1,2,\ldots,[t/2]-1 .$$
We are concerned with subsets of the form $A_n = \{j : |j| \ge c_n\}$ with critical values $c_n$ specified in various ways.
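The forward recursion for the probabilities $p_{ni}$ translates directly into code. The sketch below is illustrative only. It assumes that formula (2) reads $R_\Delta = \delta\{t/2 - E[(t/2 - N)(\gamma^{|S_N|}-1)/(\gamma^{|S_N|}+1)]\}$ with $\gamma = p_1(1-p_2)/\{p_2(1-p_1)\}$ for $p_1 > p_2$, that the stopping time has the form $N = \min\{n \ge 1 : |S_n| \ge c_n, [t/2]\}$, and that $(p_1,p_2)$ lies in the interior of the unit square. The function name `risk` and the list argument `c` are hypothetical.

```python
def risk(p1, p2, t, c):
    """Risk of the symmetric rule with critical values c[0..m-1], m = t // 2,
    evaluated by propagating P(N >= n, S_n = i) forward in n (a sketch).
    Requires 0 < p1, p2 < 1."""
    m = t // 2
    delta = abs(p1 - p2)
    if delta == 0:
        return 0.0
    hi, lo = max(p1, p2), min(p1, p2)
    gamma = hi * (1 - lo) / (lo * (1 - hi))
    up, down = hi * (1 - lo), lo * (1 - hi)   # steps towards / away from the better arm
    stay = 1 - up - down
    alive = {0: 1.0}      # walk positions that have not yet stopped
    saved = 0.0           # accumulates E[(t/2 - N) * (g - 1) / (g + 1)]
    for n in range(1, m + 1):
        arrived = {}
        for i, pr in alive.items():
            for step, q in ((1, up), (0, stay), (-1, down)):
                arrived[i + step] = arrived.get(i + step, 0.0) + pr * q
        alive = {}
        for i, pr in arrived.items():
            if n == m or abs(i) >= c[n - 1]:          # stopping point
                g = gamma ** abs(i)
                saved += pr * (t / 2 - n) * (g - 1) / (g + 1)
            else:
                alive[i] = pr
    return delta * (t / 2 - saved)
```

A truncated SPRT with level $D$ corresponds to `c = [D] * (t // 2)`, and the rule $\Delta_0$ that never stops early can be represented by critical values larger than $t/2$.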
For most values of $t$, the minimax allocation rule is given by the Bayes procedure $\Delta_\mu = \Delta(\delta_\mu)$ where $\delta_\mu$ is the least favourable $\delta$, i.e., the value of $\delta$ which maximises $B(\delta)$. The programme which evaluates $B(\delta)$ and determines $\Delta(\delta)$ can be used to find this "candidate" $\Delta_\mu$. The risk function can then be computed, using the other programme, to discover whether the candidate is minimax. According to the inequalities (15), it is enough to show that
$$R_{\Delta_\mu}(p_1,p_2,t) \le B(\delta_\mu) \tag{26}$$
for all $(p_1,p_2)$ in the unit square. Although we were not able to establish this analytically, as a result of extensive calculations we are convinced that $\Delta_\mu$ is minimax for all but 23 of the first 201 values of $t$.
There are three different cases revealed by the computations and the most common of these will be described first.
Case A.
(i) As $\delta$ increases from 0 to 1, the allocation rules $\Delta(\delta)$ make a number of transitions. For example, when $t = 20$, the sequence $c_1, c_2, \ldots, c_{10}$ which gives the critical values of $|S_n|$ and specifies $\Delta(\delta)$ begins as 2 2 2 1 1 1 1 1 0 0. It switches to 2 2 1 1 1 1 1 1 0 0 at about $\delta = .06$, to 2 1 1 1 1 1 1 1 0 0 at about $\delta = .29$ and to 1 1 1 1 1 1 1 1 0 0 at about $\delta = .38$.
(ii) The function $B(\delta)$ is continuous and unimodal. It is differentiable except at transition points, where it has "corners".
(iii) The mode $\delta_\mu$ does not occur at a transition point, so that $B(\delta)$ has zero slope at $\delta = \delta_\mu$. For example, when $t = 20$, $\delta_\mu$ is about .39 and, to four significant figures, $B(\delta) = 1.717$ from $\delta = .385$ to $\delta = .395$.
(iv) For each fixed $\delta > 0$, and for $|\rho| \le 1 - \delta$, $R_{\Delta_\mu}(p_1,p_2,t)$ is symmetric and unimodal in $\rho = p_1 + p_2 - 1$ with its maximum at $\rho = 0$. Thus the maximum of $R_{\Delta_\mu}(p_1,p_2,t)$ over the unit square must occur somewhere on the diagonal line $p_1 + p_2 = 1$.
(v) The risk $R_{\Delta_\mu}(p_1,p_2,t)$ along the diagonal, where $(p_1,p_2) = (\tfrac12(1+\delta), \tfrac12(1-\delta))$, has a local maximum at $\delta = \delta_\mu$. This follows from (ii) and (iii), because the risk agrees with $B(\delta)$ wherever $\Delta(\delta) = \Delta_\mu$.
(vi) The risk $R_{\Delta_\mu}(p_1,p_2,t)$, where $(p_1,p_2) = (\tfrac12(1+\delta), \tfrac12(1-\delta))$, is symmetric in $\delta$ and non-decreasing for $0 \le \delta \le \delta_\mu$. It is either non-increasing for $\delta_\mu \le \delta \le 1$ or it decreases to some minimal value and increases thereafter.
(vii) The risk $R_{\Delta_\mu}(p_1,p_2,t)$ at $(p_1,p_2) = (1,0)$ does not exceed $B(\delta_\mu)$.
In case A, it is clear from these conditions that $\Delta_\mu$ is the minimax allocation rule.
There are two ways in which the search for a minimax rule can fail, by a
violation of (iii) or by a violation of (vii).
In practice, the latter is more
important, so it is convenient to classify the exceptional situations as follows.
Case B. Condition (vii) does not hold and, perhaps, (iii) also fails.
Case C. Condition (iii) does not hold, but all the other conditions are satisfied.
In case B, the risk function referred to in (vi) does begin to increase for large values of $\delta$ and the inequality (26) is violated at $\delta = 1$. This case occurs for the first time when $t = 21$.
When condition (iii) does not hold, the notation $\Delta_\mu = \Delta(\delta_\mu)$ is misleading since there are two allocation rules which attain the minimal Bayes risk $B(\delta_\mu)$. The first, $\Delta_\ell$, is Bayes for $\delta$ in some interval $[a, \delta_\mu]$ to the left of $\delta_\mu$ and the second, $\Delta_r$, is Bayes for $\delta$ in an interval $[\delta_\mu, b]$ on the right, $a < \delta_\mu < b$. The function $B(\delta)$ has a corner at its maximal point $\delta_\mu$, a positive slope on the left and a negative slope on the right. Consider $R_{\Delta_\ell}(p_1,p_2,t)$, with $(p_1,p_2) = (\tfrac12(1+\delta), \tfrac12(1-\delta))$. It agrees with $B(\delta)$ on $[a, \delta_\mu]$ and, therefore, has a strictly positive slope at $\delta = \delta_\mu$. It must increase further to the right of $\delta_\mu$, so (26) cannot hold. Thus, we are unable to deduce that $\Delta_\ell$ is minimax and, likewise, $\Delta_r$ is effectively eliminated as a candidate. There are several examples of this type to be mentioned and, to avoid confusion, the allocation rules $\Delta_\ell$ and $\Delta_r$ will be treated separately, as examples of case B or case C.
Conditions (iii) and (vii) are violated when
and
!:J.
r
t = 57 , for
with
are examples of case B.
!:J.
r
3 3 2 2
sequence for
r
The sequence
begins with
For all other values of
t, 1
and both rules
which describes
2 2 2 2
~
= 23
The first occurrence of case C is when
and condition (vii) does not hold.
!:J.
t
t
~
begins
However, the corresponding
and (vii) is valid for this rule.
201 , only one of the cases A, B, or C occurs.
In particular, the next examples of case C are when
t = 65 , for both
!:J.~
and
!:J.
r
We conjecture that, when there are two candidates $\Delta_\ell$ and $\Delta_r$ and both satisfy condition (vii), there is a randomised mixture of $\Delta_\ell$ and $\Delta_r$ which is minimax. We shall say that an allocation rule $\Delta$ is essentially minimax if
(27)
for all $(p_1,p_2)$ in the unit square. An examination of the computations shows, in every instance of case C, $1 \le t \le 201$, that the allocation rule concerned is essentially minimax. In other words, the distinction between cases A and C has no practical significance: if $R_{\Delta_\mu}(1,0,t) \le B(\delta_\mu)$, then $\Delta_\mu$ is, for all practical purposes, a minimax allocation rule.
The programme for computing $R_\Delta(p_1,p_2,t)$ can be used to analyse other allocation rules. We have investigated the truncated SPRT, an adaptation of a rule recommended by Anscombe (1963) and one adapted from a rule suggested by Lai, Levin, Robbins and Siegmund (1980). The results of these investigations are included in the next section. The stopping time for Anscombe's allocation rule takes the form:
$$N = \min\{n \ge 1 : 1 - \Phi((2/n)^{1/2}|S_n|) \le n/t,\ [t/2]\},$$
where $\Phi$ denotes the standard normal distribution function. For the Lai-Levin-Robbins-Siegmund (LLRS) rule, the stopping time is given by
$$N = \min\{n \ge 1 : g((2/n)^{1/2}|S_n|) \ge t/2n\},$$
where
$$g(x) = \frac{2\Phi(x) - 1}{x\,\phi(x)} + 1, \quad x > 0, \qquad g(0) = 3,$$
and $\phi$ is the standard normal density. Observe that $g$ is increasing and, hence, $N < 1 + t/6$, so $N \le t/6$ when the latter is an integer. Thus, the LLRS rule samples no more than about a third of the patients before a decision is reached in favour of one or other of the treatments.
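Both stopping rules can be reduced to sequences of critical values for $|S_n|$. The sketch below is an illustration, not the authors' code. It assumes the stopping conditions $1 - \Phi((2/n)^{1/2}|S_n|) \le n/t$ for the Anscombe adaptation and $g((2/n)^{1/2}|S_n|) \ge t/2n$ with $g(x) = \{2\Phi(x)-1\}/\{x\phi(x)\} + 1$, $g(0) = 3$, for the LLRS rule. The function names are hypothetical, and the truncation at $[t/2]$ is left to whatever routine consumes the critical values.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()   # standard normal: cdf is Phi, pdf is phi

def g(x):
    """LLRS function g(x) = (2*Phi(x) - 1) / (x*phi(x)) + 1, with g(0) = 3."""
    if x == 0:
        return 3.0
    return (2 * _STD.cdf(x) - 1) / (x * _STD.pdf(x)) + 1

def anscombe_critical_values(t):
    """c_n = least integer i >= 0 with 1 - Phi(sqrt(2/n)*i) <= n/t, n = 1..t//2."""
    values = []
    for n in range(1, t // 2 + 1):
        i = 0
        while 1 - _STD.cdf(sqrt(2 / n) * i) > n / t:
            i += 1
        values.append(i)
    return values

def llrs_critical_values(t):
    """c_n = least integer i >= 0 with g(sqrt(2/n)*i) >= t/(2n), n = 1..t//2."""
    values = []
    for n in range(1, t // 2 + 1):
        i = 0
        while g(sqrt(2 / n) * i) < t / (2 * n):
            i += 1
        values.append(i)
    return values
```

Because $g(0) = 3$, the LLRS critical value is zero as soon as $n \ge t/6$, which is the "about a third of the patients" limit noted above. A critical value larger than $n$ simply means that stopping is impossible at stage $n$.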
6. Numerical results
Table 1 lists the normalised maximum risks $M_\Delta(t)/t^{1/2}$ for the four allocation rules described in the previous section: the minimax rule; the truncated SPRT with $D$ chosen optimally; the adaptation of Anscombe's rule, and the LLRS rule. For convenience, only even values of $t$ are included. The table includes one example of case B, described earlier. The figure in the first column for $t = 60$ is $B(\delta_\mu)/(60)^{1/2} = .3742$, but $M_{\Delta_\mu}(60)/(60)^{1/2} = .3873$, so $\Delta = \Delta(\delta_\mu)$ is obviously not minimax, by comparison with the SPRT. However, the figures also make clear that the SPRT is very close to being minimax, since the minimax risk is at least $B(\delta_\mu)$.
Observe that some entries occur in more than one column, not always for
the same reason.
When
t
is small, there are only a few reasonable symmetric
rules from which to choose and, sometimes, the same rule is selected by more
than one method.
In other cases, a repeated entry only reflects a near equality,
within four decimal places, such as in the first two columns when
t = 20 .
In still further cases, two rules have rather different risk functions, but the same maxima $M_\Delta(t)$, occurring at the exceptional point $(p_1,p_2) = (1,0)$. When $p_1 = 1$, $p_2 = 0$, the process $S_1, S_2, \ldots$ is deterministic and two different rules can easily have the same stopping time and, hence, the same risk: see formula (8). This situation arises in the last two columns when $t = 20$.
A "discreteness effect" is clearly visible in the first two columns of
Table 1.
The effectiveness of the best available rule varies with
For example, for the SPRT, the optimal
and then it switches to
first range, at
relatively small.
than
t
D
12
At
=2
D = 1
t
D = 2 , and the ratio
for
t
D value
i~
1
26,28, ... ,72 •
when
t = 2,4, ••• ,24
Near the middle of the
gives a good" fit" and the ratio
20 , the fit of
.3841
D
is larger.
1
t.
.3579
is
is not so good, but better
Similarly, the middle of the
25.
range for
small.
0 = 2
is near
t = 50 , where the ratio
.3702
is again relatively
The discreteness effect is also apparent in the first column, but the
pattern is harder to analyse because there are many more allocation rules
available.

The ratios M_δ(t)/t^(1/2) are all fairly stable for t ≥ 30 and it can be
seen that there is little difference between the minimax, SPRT and Anscombe rules.
The LLRS procedure does less well. This seems to be because paired sampling
must stop with about a third or less of the total number of patients sampled,
which causes difficulties when |P1 - P2| reaches its critical levels, at values
proportional to t^(-1/2).
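The comparison can be made concrete from the rows of Table 1 with t ≥ 30. The following sketch measures how far each rule's normalised maximum risk exceeds the first-column figure. The names `TABLE1`, `COLUMNS` and `worst_excess` are hypothetical, and the first-column entry for t = 60 is the lower bound marked with an asterisk rather than an attained maximum.

```python
# Table 1 rows for t >= 30: t -> (first column, SPRT, Anscombe, LLRS)
TABLE1 = {
    30: (0.3811, 0.3886, 0.3842, 0.4222),
    40: (0.3697, 0.3714, 0.3701, 0.4721),
    50: (0.3697, 0.3702, 0.3759, 0.4246),
    60: (0.3742, 0.3760, 0.3865, 0.4196),
    70: (0.3757, 0.3852, 0.3829, 0.4106),
    80: (0.3730, 0.3786, 0.3801, 0.4026),
    90: (0.3710, 0.3738, 0.3716, 0.4130),
    100: (0.3705, 0.3721, 0.3707, 0.4129),
    120: (0.3730, 0.3741, 0.3779, 0.4124),
    140: (0.3727, 0.3803, 0.3780, 0.4043),
    160: (0.3713, 0.3747, 0.3767, 0.4095),
    180: (0.3712, 0.3729, 0.3754, 0.4110),
    200: (0.3722, 0.3736, 0.3792, 0.4063),
}
COLUMNS = {"SPRT": 1, "Anscombe": 2, "LLRS": 3}

def worst_excess(rule: str) -> float:
    """Largest relative excess of a rule's ratio over the first-column ratio."""
    j = COLUMNS[rule]
    return max(row[j] / row[0] - 1.0 for row in TABLE1.values())

for rule in COLUMNS:
    print(f"{rule}: worst relative excess {100 * worst_excess(rule):.1f}%")
```

On these rows the SPRT and Anscombe columns stay within a few per cent of the first column, while the LLRS column is always noticeably larger.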
Table 2 provides a detailed comparison of the risk functions for the minimax
rule and the truncated SPRT when t = 100. In this case, the least favourable
prior distribution corresponds to σ = .188. The sequence c1,c2,...,c50
which describes the stopping time N for the minimax rule, according to (25),
consists of 26 threes, 16 twos, 6 ones and 2 zeros, in that order. The optimal
choice of critical level for the truncated SPRT is D = 3. For every pair of
entries in the table, the minimax rule has a smaller, or the same, risk. This
suggests that the truncated SPRT is inadmissible, but the differences are small
except where P1 and P2 are both near 0 or both near 1. The risk functions
are quite flat, apart from the steep descent to zero near the line P1 = P2.
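As a small consistency check, the sequence of critical values quoted for t = 100 can be written out explicitly. This sketch only builds the list from the counts given in the text. The variable names are hypothetical, and nothing here reproduces formula (25) itself.

```python
t = 100

# 26 threes, 16 twos, 6 ones and 2 zeros, in that order (counts quoted in the text)
c = [3] * 26 + [2] * 16 + [1] * 6 + [0] * 2

# One critical value per pair of patients, so the sequence has t/2 terms.
print(len(c), t // 2)

# The critical values never increase as sampling proceeds.
print(all(a >= b for a, b in zip(c, c[1:])))
```

The list has 50 entries, matching the index range c1,...,c50, and it is non-increasing.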
It would be interesting to compare these with the risk functions of fully
adaptive Bayes procedures, such as those considered by Berry (1978).
However,
the allocation rules involved are much more complicated and difficult to
evaluate without extensive simulations.
It appears that the minimax risk M(t) has a growth rate close to
.3714 t^(1/2). The evidence for this is provided by our numerical evaluations
of M_δ(t)/t^(1/2) for large values of t when δ is the minimax rule:
the ratio is .3705 for t = 100, .3722 for t = 200, .3714 for t = 800 and
.3714 for t = 1200. A limiting ratio between .371 and .372 seems
very likely.
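To give a feel for the scale involved, the ratios quoted above can be converted back into expected successes lost. The sketch below uses the empirical constant .3714 as an approximation. The function name `approx_minimax_risk` is hypothetical, and the approximation is only suggested by the numerical evidence, not proved.

```python
import math

# Ratios M(t) / t^(1/2) quoted in the text for the minimax rule
RATIOS = {100: 0.3705, 200: 0.3722, 800: 0.3714, 1200: 0.3714}

def approx_minimax_risk(t: int, c: float = 0.3714) -> float:
    """Approximate minimax risk c * t^(1/2), with c an empirical constant."""
    return c * math.sqrt(t)

for t, ratio in RATIOS.items():
    exact = ratio * math.sqrt(t)  # risk implied by the quoted ratio
    print(t, f"{exact:.2f}", f"{approx_minimax_risk(t):.2f}")
```

For t = 100, for instance, both figures are close to 3.7 expected successes lost in the worst case.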
Consider the sequence c1,c2,... which characterises the allocation rule δ.
For t = 1,2,...,20, the sequence takes the form 1 1 ... 1 0 or
1 1 ... 1 0 0, according as t is odd or even. Each of these is an example
of case A, in the terminology of Section 5, and δ is minimax.
The critical value 2 first appears in the sequence when t = 21 and this
leads to complications for t = 21,22,...,26: at (P1,P2) = (1,0), the
stopping time shifts from 1 to 2, so the normalised risk
R_δ(P1,P2,t)/t^(1/2) there changes from 1/t^(1/2) to 2/t^(1/2), which causes
the location of its maximum to move from an interior point of the unit square
to the point (1,0) in the corner. These values of t are covered by case B
described in the previous section and we cannot claim that δ is minimax.
Since the ratio 2/t^(1/2) is decreasing in t, one might expect the problem
to disappear, and it does when t = 27. However, the same phenomenon reappears
when the critical value 3 first has an effect, at t = 57, and it persists for
t = 58,59,...,63. The introduction of 4's at the start of the sequence
c1,c2,... causes the same problem when t = 113, 114 and 115. We have not
tried to locate the first occurrence of 5's in the sequence,
but it seems doubtful whether this phenomenon can recur indefinitely.
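The way the corner value decays can be tabulated directly. The sketch below assumes that, when the critical value k first enters the sequence, the normalised risk at (1,0) is k/t^(1/2). The text states this only for k = 2, so the cases k = 3 and k = 4 are an extrapolation. `CASE_B` and `corner_ratio` are hypothetical names.

```python
import math

# Ranges of t quoted in the text as case B, keyed by the new critical value k
CASE_B = {2: range(21, 27), 3: range(57, 64), 4: range(113, 116)}

def corner_ratio(k: int, t: int) -> float:
    """Assumed normalised corner risk k / t^(1/2) at (P1, P2) = (1, 0)."""
    return k / math.sqrt(t)

for k, ts in CASE_B.items():
    first, last = ts[0], ts[-1]
    print(
        f"k = {k}: t = {first}..{last}, "
        f"ratio {corner_ratio(k, first):.4f} -> {corner_ratio(k, last):.4f}, "
        f"then {corner_ratio(k, last + 1):.4f} at t = {last + 1}"
    )
```

Under this assumption each range ends once the corner ratio has fallen back towards the level of the interior maximum, and the ranges quoted cover 16 values of t in total.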
The reader is reminded that, when t = 57, two allocation rules illustrate
case B and case C, respectively. The second rule is essentially minimax, as in
all the other examples of case C. These occur for seven other values of t in
the range 1 ≤ t ≤ 201: t = 65, 87, 123, 125, 132, 189 and 198.
We can confidently assert that δ is a minimax rule for at least 178
of the first 201 values of t.

It is worth commenting on the choice of the parameter D in the truncated
SPRT. As we have shown, this type of allocation rule can reduce the maximum
risk to a level very close to the minimax value M(t). The exact specification
of D to attain min M_D(t) involves extensive computations, but almost the same
results can be obtained by using simpler approximations. Table 3 demonstrates
this by comparing three methods. Column (a) shows the even values of t ≤ 300
for which D = 1, 2, 3, 4 or 5 is optimal. The other two columns give similar
ranges obtained in the following way. Method (b) is based on the upper bound:
M_D(t) ≤ U_D(t), established in Theorem 4 and, for each t, the corresponding
D is determined by minimising U_D(t). This is a straightforward computation
using formula (13). Method (c) is a direct application of Vogel's asymptotic
result, quoted at the end of Section 3. For column (c), D is defined simply
as the nearest positive integer to 0.292 t^(1/2). The table shows few differences
between the three methods and these occur only when t is near a value at which
there is a change in the optimal value of D. In such cases, because of the
minimisation, there is very little difference between the two competing values
of M_D(t). For practical purposes, the simplest method (c) is quite
effective and it can also be used for t > 300. Another obvious
feature of the table is that D changes slowly with t. It
may not be easy to decide what is a realistic value of t, in designing a
medical trial: see the discussion in Anscombe's paper (1963). Hence, the lack
of sensitivity of the truncated SPRT with respect to t may be another practical
advantage.
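Method (c) is simple enough to reproduce in a few lines. The sketch below rounds 0.292 t^(1/2) to the nearest integer, with a floor of 1 so that the result is always a positive integer, and groups the even values of t by the resulting D. The function names `vogel_d` and `d_ranges` are hypothetical.

```python
import math

def vogel_d(t: int, coeff: float = 0.292) -> int:
    """Nearest positive integer to coeff * t^(1/2), as in method (c)."""
    return max(1, math.floor(coeff * math.sqrt(t) + 0.5))

def d_ranges(t_max: int) -> dict:
    """Map each D to the (first, last) even t <= t_max for which it is chosen."""
    ranges = {}
    for t in range(2, t_max + 1, 2):
        d = vogel_d(t)
        first, _ = ranges.get(d, (t, t))
        ranges[d] = (first, t)
    return ranges

for d, (first, last) in d_ranges(354).items():
    print(f"D = {d}: t = {first} - {last}")
```

Run up to t = 354, this gives the ranges 2 - 26, 28 - 72, 74 - 142, 144 - 236 and 238 - 354, in agreement with column (c) of Table 3.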
The discrete process S1,S2,...,S[t/2] can be replaced by a Wiener
process with an unknown drift parameter and one obtains a model for which it can
be shown analytically, rather than numerically, that the minimax allocation rule
is Bayes for an appropriate symmetric two-point prior distribution.
The Bayes
rule can be obtained, in principle, as the solution of a free boundary problem
involving the heat equation.
In recent work, so far unpublished, Simons has
obtained close inner and outer approximations to the entire stopping boundary,
using techniques developed earlier by Bather:
a description of these techniques
and other applications is given in the recent paper by Bather (1983).
The same
continuous time model has been studied in some detail by Chernoff and Petkau (1981),
but for a normal prior distribution on the unknown drift.
They conclude that the
sub-optimal rule suggested by Anscombe (1963) performs very well. Since Anscombe's
rule, in the present setting, is nearly minimax, as is the truncated SPRT, we are
further encouraged to recommend the latter.
Acknowledgements
We are grateful to the referees for many constructive comments, including
an improvement in the proof of Theorem 5. This research was supported by a
grant from the Science and Engineering Research Council and also by the
National Science Foundation Grant MCS81-00748.
References
Anscombe, F.J. (1963). Sequential medical trials. J. Amer. Statist. Ass.,
58, 365-383.
Bather, J.A. (1981). Randomized allocation of treatments in sequential
experiments. J. R. Statist. Soc. B, 43, 265-292.
Bather, J.A. (1983). The minimax risk for the two-armed bandit problem.
Proceedings of a Conference on Learning Models: Lecture Notes in
Statistics, Vol. 20. Springer-Verlag.
Bather, J.A. (1983). Optimal stopping of Brownian motion: a comparison
technique. In Recent Advances in Statistics (M.H. Rizvi, J.S. Rustagi
and D. Siegmund, eds.), Academic Press.
Begg, C.B. and Mehta, C.R. (1979). Sequential analysis of comparative clinical
trials. Biometrika, 66, 97-103.
Berry, D.A. (1978). Modified two-armed bandit strategies for certain clinical
trials. J. Amer. Statist. Ass., 73, 339-345.
Chernoff, H. and Petkau, A.J. (1981). Sequential medical trials involving
paired data. Biometrika, 68, 119-132.
Colton, T. (1963). A model for selecting one of two medical treatments.
J. Amer. Statist. Ass., 58, 388-400.
Lai, T.L., Levin, B., Robbins, H. and Siegmund, D. (1980). Sequential medical
trials. Proc. Natl. Acad. Sci. USA, 77, 3135-3138.
Vogel, W. (1960a). A sequential design for the two-armed bandit. (1960b). An
asymptotic minimax theorem for the two-armed bandit problem.
Ann. Math. Statist., 31, 430-443, 444-451.
TABLE 1
Normalised maximum risk M_δ(t)/t^(1/2) for four allocation rules

    t     Minimax    SPRT     Anscombe    LLRS
    2     .7071      .7071    .7071       .7071
    4     .5000      .5000    .5000       .5000
    6     .4103      .4103    .4103       .4593
    8     .3726      .3726    .3726       .3854
   10     .3598      .3598    .3598       .3870
   12     .3579      .3579    .3579       .5774
   14     .3613      .3613    .5345       .5345
   16     .3676      .3676    .5000       .5000
   18     .3754      .3754    .4714       .4714
   20     .3841      .3841    .4472       .4472
   30     .3811      .3886    .3842       .4222
   40     .3697      .3714    .3701       .4721
   50     .3697      .3702    .3759       .4246
   60     .3742*     .3760    .3865       .4196
   70     .3757      .3852    .3829       .4106
   80     .3730      .3786    .3801       .4026
   90     .3710      .3738    .3716       .4130
  100     .3705      .3721    .3707       .4129
  120     .3730      .3741    .3779       .4124
  140     .3727      .3803    .3780       .4043
  160     .3713      .3747    .3767       .4095
  180     .3712      .3729    .3754       .4110
  200     .3722      .3736    .3792       .4063

* δ is not minimax in this case: see text.
TABLE 2
Risk function R_δ(P1,P2,t) for t = 100; minimax rule (lower values) and
SPRT (upper values)
TABLE 3
Range of even t ≤ 300 with parameter D in SPRT; methods (a), (b), (c)

    D        (a)           (b)           (c)
    1       2 -  24       2 -  24       2 -  26
    2      26 -  72      26 -  72      28 -  72
    3      74 - 138      74 - 142      74 - 142
    4     140 - 230     144 - 236     144 - 236
    5     232 -         238 -         238 - 354