AN OPTIMUM PROPERTY OF BECHHOFER'S SINGLE-SAMPLE MULTIPLE-DECISION
PROCEDURE FOR RANKING MEANS AND SOME EXTENSIONS 1

by

Wm. Jackson Hall
Institute of Statistics
University of North Carolina

Institute of Statistics
Mimeograph Series No. 118
September, 1954

1. This research was supported by the United States Air Force, through the Office of Scientific Research of the Air Research and Development Command.
UNCLASSIFIED
Security Information

Bibliographical Control Sheet

1. O.A.: Institute of Statistics, North Carolina State College of the University of North Carolina
   M.A.: Office of Scientific Research of the Air Research and Development Command
2. O.A.: CIT Report No. 11
   M.A.: OSR-TN-54-279
3. AN OPTIMUM PROPERTY OF BECHHOFER'S SINGLE-SAMPLE MULTIPLE-DECISION PROCEDURE FOR RANKING MEANS AND SOME EXTENSIONS
4. William Jackson Hall
5. September, 1954
6. 13
7. None
8. AF 18(600)-458
9. RDO No. R-354-20-8
10. UNCLASSIFIED
11. None
12.
Bechhofer [1] has considered the problem of ranking means of k normal populations with known variances, or, more generally, of grouping the populations according to ranks. He suggests, with only intuitive justification, grouping the population means according to the ranked sample means, and gives tables for finding the minimum sample size which will guarantee a specified probability of a correct grouping when the population means satisfy lower bounds on the "distances" between groups. This paper gives justification for the use of his procedures when the population variances and the sample sizes are specified as equal among the populations, proving that if the bounds on the "distances" are to be satisfied, the sample size cannot be reduced by using any other type of procedure--that is, that Bechhofer's procedure is a most economical multiple-decision rule, as defined in [3]. Similar results, with some limitation, are obtained for problems of ranking other population parameters and for a distribution-free ranking problem as well; difficulties in computing the most economical sample size for these non-normal problems are discussed and approximations using Bechhofer's tables are indicated.
AN OPTIMUM PROPERTY OF BECHHOFER'S SINGLE-SAMPLE MULTIPLE-DECISION
PROCEDURE FOR RANKING MEANS AND SOME EXTENSIONS

Wm. Jackson Hall, University of North Carolina
Summary. Bechhofer [1] has considered the problem of ranking means of k normal populations with known variances, or, more generally, of grouping the populations according to ranks. He suggests, with only intuitive justification, grouping the population means according to the ranked sample means, and gives tables for finding the minimum sample size which will guarantee a specified probability of a correct grouping when the population means satisfy lower bounds on the "distances" between groups. This paper gives justification for the use of his procedures when the population variances and the sample sizes are specified as equal among the populations, proving that if the bounds on the "distances" are to be satisfied, the sample size cannot be reduced by using any other type of procedure--that is, that Bechhofer's procedure is a most economical multiple-decision rule, as defined in [3]. Similar results, with some limitation, are obtained for problems of ranking other population parameters and for a distribution-free ranking problem as well; difficulties in computing the most economical sample size for these non-normal problems are discussed and approximations using Bechhofer's tables are indicated.
1. Choosing the "Best" Normal Population. We first suppose our goal is:

Goal I: to choose the population with the largest mean after taking a sample of size n, as yet unspecified, from each of k normal populations π_1, π_2, ..., π_k with unknown means μ_1, μ_2, ..., μ_k but equal and known variances σ².

Thus, we have a k-decision problem; we denote by A_i the decision to choose the population π_i as the "best" population (i = 1, 2, ..., k), that is, the decision that μ_i = μ_[k], where μ_[1], ..., μ_[k] are the ranked means (in increasing order).
We shall freely use the notation introduced by Bechhofer.² We require, given γ (0 < γ < 1) and δ (δ > 0), that the probability of a correct decision be at least γ if μ_[k] ≥ μ_[k-1] + δ, and subject to this restriction we wish to derive a decision procedure with a minimum sample size.

2. All references to Bechhofer refer to [1] unless otherwise indicated.
Let x_j = (x_1j, x_2j, ..., x_kj), j = 1, ..., n, denote the sample values, where x_ij is the jth sample value from π_i. We consider x_j as one observation from a k-variate population and denote the (k-variate) density function of x_j by f(x, μ), where μ = (μ_1, ..., μ_k). Denote

    ω_i = {μ : μ_i ≥ μ_j + δ for all j ≠ i}     (i = 1, ..., k).
Clearly, using the terminology of Chapter II of [3], our problem is to find a M.E. k-d.r. (most economical k-decision rule) relative to the vector γ = (γ, ..., γ) for discriminating among ω = (ω_1, ..., ω_k). We shall use the first minimax method of [3] to obtain such a d.r.

Let n be fixed. Consider the conditional a priori distributions λ_1, ..., λ_k over ω_1, ..., ω_k respectively, where λ_i assigns probability 1 to the parameter point μ with the ith coordinate μ_0 + δ and all other coordinates μ_0, μ_0 arbitrary but fixed, and consider the simple discrimination problem of finding a minimax d.r. D⁰ w.r.t. the weight function W(f_i, A_j) = (-1/γ if i = j and 0 otherwise) for discriminating among f = (f_1, ..., f_k), where f_i = ∫_{ω_i} f(x, μ) dλ_i(μ) (i = 1, ..., k); that is, f_i is a k-variate normal density function with mean vector μ = (μ_1, ..., μ_k), where μ_i = μ_0 + δ and μ_j = μ_0 for j ≠ i, and covariance matrix P = (σ_ij) = (σ² δ_ij).
Using the results of Chapter I in [3], a non-randomized likelihood ratio d.r. D⁰ satisfying (1.11) in [3], i.e., satisfying

(1)    P_1(D) = P_2(D) = ... = P_k(D),

where P_i(D) = Pr(D chooses A_i | f_i), is a minimax d.r. Now a likelihood ratio d.r. is determined by the ratios a_i f_i^n / a_j f_j^n for some positive constants a_1, ..., a_k, where f_i^n is the corresponding likelihood for samples of size n. It is easily verified that f_i^n / f_j^n = exp[δn(x̄_i - x̄_j)/σ²], where x̄_i denotes the sample mean of π_i. Hence, a minimax d.r. D⁰ chooses A_i if a_i f_i^n > a_j f_j^n for all j ≠ i, or, equivalently, if y_ij > c_ij for all j ≠ i, where y_ij = x̄_i - x̄_j and c_ij = (σ²/δn) log(a_j/a_i), and the c_ij's are determined so that (1) holds. (We ignore the possibility of equality of a_i f_i^n and a_j f_j^n throughout since this is an event of probability zero.)

Under f_i, the vector (y_i1, ..., y_i,i-1, y_i,i+1, ..., y_ik) has a (k-1)-variate normal distribution with means (δ, ..., δ) and covariance matrix T = (τ_ij), where τ_ij = 2σ²/n if i = j and σ²/n if i ≠ j; denote such a distribution by G (which is independent of μ_0). Then

    P_i(D⁰) = ∫_{c_i1}^{∞} ... ∫_{c_i,i-1}^{∞} ∫_{c_i,i+1}^{∞} ... ∫_{c_ik}^{∞} dG = H(c_i1, ..., c_i,i-1, c_i,i+1, ..., c_ik), say.
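As a purely numerical check of the likelihood-ratio identity f_i^n / f_j^n = exp[δn(x̄_i - x̄_j)/σ²] used above, the following short sketch (in Python with NumPy and SciPy; the values of k, n, δ, σ, and μ_0 are arbitrary illustrative choices, not taken from the paper) evaluates both sides on a simulated sample.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
k, n = 4, 6                                # number of populations and common sample size
delta, sigma, mu0 = 0.5, 1.0, 2.0
x = rng.normal(mu0, sigma, size=(k, n))    # x[i, j] = jth observation from population i

def likelihood(i):
    """Joint density of the full sample under f_i: population i has mean
    mu0 + delta and every other population has mean mu0."""
    means = np.full(k, mu0)
    means[i] += delta
    return np.prod(norm.pdf(x, loc=means[:, None], scale=sigma))

i, j = 0, 1
ratio_direct = likelihood(i) / likelihood(j)
ratio_formula = np.exp(delta * n * (x[i].mean() - x[j].mean()) / sigma**2)
print(ratio_direct, ratio_formula)         # the two values agree up to rounding
```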
Rearrange the subscripts so that a_1 and a_k are the smallest and largest, respectively, of the a_i's; then c_1j ≥ 0 ≥ c_kj for all j, with equalities if and only if all a_i's are equal. Now H is a decreasing function in each of its arguments, so that P_k(D⁰) ≥ H(0, ..., 0) ≥ P_1(D⁰) with both equalities if and only if all c_ij's are zero. Hence, the requirement (1) implies that all c_ij's are zero and hence that all a_i's are equal. Therefore, D⁰ chooses A_i if y_ij > 0 for all j ≠ i, that is, if x̄_i is the largest sample mean (i = 1, ..., k). Note that D⁰ is independent of μ_0, which remains arbitrary.
Bechhofer proves that λ_1, ..., λ_k are least favorable in the sense of Theorem 2.5 (ii) of [3], so it follows from that theorem that D⁰ is a minimax d.r. w.r.t. the simple weight function

    W(μ, A_j) = -1/γ if μ ∈ ω_j, and 0 otherwise     (j = 1, ..., k),

for discriminating among ω.
By Theorem 2.7 of [3], to find a M.E. d.r. we need consider only such minimax d.r.'s D⁰_n for various values of the sample size n. Bechhofer has given tables for finding the minimum n, say N, for which D⁰_N meets our requirements. Hence D⁰_N is a M.E. d.r. for discriminating among ω; thus Bechhofer's procedure is most economical for Goal I.
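The minimum sample size N can be made concrete as follows. Under the configuration assigned probability 1 by λ_i, it is easily checked that P_i(D⁰) reduces to the single integral ∫ Φ(z + δ√n/σ)^(k-1) dΦ(z), where Φ is the standard normal distribution function. The sketch below (a rough stand-in for a table look-up, not a reproduction of Bechhofer's tables; the values k = 4, δ = 0.5, σ = 1, γ = 0.95 are arbitrary) evaluates this integral and searches for the smallest n for which it reaches γ.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def prob_correct_selection(n, k, delta, sigma):
    """P(the best population yields the largest sample mean) when its mean
    exceeds the k-1 equal remaining means by exactly delta."""
    lam = delta * np.sqrt(n) / sigma
    integrand = lambda z: norm.cdf(z + lam) ** (k - 1) * norm.pdf(z)
    value, _ = quad(integrand, -np.inf, np.inf)
    return value

def smallest_n(k, delta, sigma, gamma):
    """Smallest common sample size n with P(correct selection) >= gamma."""
    n = 1
    while prob_correct_selection(n, k, delta, sigma) < gamma:
        n += 1
    return n

if __name__ == "__main__":
    k, delta, sigma, gamma = 4, 0.5, 1.0, 0.95    # illustrative values only
    N = smallest_n(k, delta, sigma, gamma)
    print(N, prob_correct_selection(N, k, delta, sigma))
```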
2. Grouping Normal Means. We now consider the more general goal treated by Bechhofer, of which Goal I is a special case:

Goal II: to find the k_s "best" populations, the k_{s-1} "second best" populations, the k_{s-2} "third best" populations, etc., and finally the k_1 "worst" populations (s ≤ k, Σ_{i=1}^{s} k_i = k), "best" being interpreted in the sense of largest population means.

Denote K_0 = 0 and K_i = Σ_{j=1}^{i} k_j (i = 1, ..., s). We suppose, as before, that the populations are normal with a common known variance and that the same number of samples are to be taken from each population.
Let A_{i_1 i_2 ... i_k} be the decision:

    π_{i_1}, ..., π_{i_{K_1}} are the "worst" populations,
    π_{i_{K_1+1}}, ..., π_{i_{K_2}} are the "next worst",
    ...........................................
    π_{i_{K_{s-1}+1}}, ..., π_{i_k} are the "best" populations,

where (i_1, ..., i_k) is a permutation of (1, ..., k) such that i_1 < i_2 < ... < i_{K_1}, i_{K_1+1} < i_{K_1+2} < ... < i_{K_2}, ..., i_{K_{s-1}+1} < i_{K_{s-1}+2} < ... < i_k. Thus, there are m = k!/(k_1! k_2! ... k_s!) alternative decisions. Correspondingly, given positive δ*_1, ..., δ*_{s-1}, denote

(2)    ω_{i_1...i_k} = {μ : μ_j + δ*_t ≤ μ_{j'} whenever j belongs to the tth group (i_{K_{t-1}+1}, ..., i_{K_t}) and j' belongs to the (t+1)th group (i_{K_t+1}, ..., i_{K_{t+1}}), t = 1, ..., s-1}.
Given γ (0 < γ < 1), we wish to find a M.E. m-d.r. relative to the vector γ = (γ, ..., γ) for discriminating among ω, defined by (2); that is, a d.r. D based on a minimum sample size subject to:

    Pr(D chooses A_{i_1 i_2 ... i_k} | μ) ≥ γ   if μ ∈ ω_{i_1 i_2 ... i_k},

for all m values of the subscripts.
To obtain such a d.r. we use a method analogous to that in Section 1; the argument is brief because of this analogy and the notational complexities. Let n be fixed. Let λ_{i_1...i_k} be a conditional a priori distribution over ω_{i_1...i_k} assigning probability 1 to the parameter point μ with coordinates

    μ_{i_1} = ... = μ_{i_{K_1}} = μ_0,
    μ_{i_{K_1+1}} = ... = μ_{i_{K_2}} = μ_0 + δ*_1,
    .......................................
    μ_{i_{K_{s-1}+1}} = ... = μ_{i_k} = μ_0 + δ*_1 + ... + δ*_{s-1},

μ_0 arbitrary, and denote the corresponding "average" density (w.r.t. λ_{i_1...i_k}) by f_{i_1...i_k}, and correspondingly for the other values of the subscripts. Consider the simple discrimination problem of discriminating among the f_{i_1...i_k}'s. A likelihood ratio d.r. where the a_{i_1...i_k}'s are chosen so that the P_{i_1...i_k}(D)'s are all equal is a minimax d.r. for this problem.
Now it is easily verified that, for any two sets of subscripts, a_{i_1...i_k} f^n_{i_1...i_k} > a_{j_1...j_k} f^n_{j_1...j_k} if and only if y_{i_1...i_k, j_1...j_k} > c(i_1...i_k, j_1...j_k), where

    y_{i_1...i_k, j_1...j_k} = Σ_{t=1}^{s-1} δ*_t Z_t(i_1...i_k, j_1...j_k)

and the Z_t's are various contrasts of the sample means with all coefficients restricted to the values -1, 0, +1, the coefficients depending on (i_1...i_k, j_1...j_k). (It may be helpful to consider a special case such as k = 4, s = 3, k_1 = 1, k_2 = 1, k_3 = 2.) For a fixed (i_1...i_k), the y's for the various sets of subscripts (j_1...j_k) unequal to (i_1...i_k) are random variables having a joint (m-1)-variate normal distribution (independent of μ_0). Because of the symmetry of the problem (we assume no a priori knowledge about the relative magnitude of the population means), the same normal distribution, say G', will occur for any set of (i_1...i_k)'s, as long as the order of the variables is properly arranged. Hence, we may write

    P_{i_1...i_k}(D⁰) = ∫ ... ∫ dG',

where the upper limits of the (m-1)-fold integral are all +∞ and the lower limits are the m-1 c(i_1...i_k, j_1...j_k)'s for the various values of the (j_1...j_k)'s unequal to (i_1...i_k). Now the c's are to be determined so that all P_{i_1...i_k}(D⁰)'s are equal. Let a'_{i_1...i_k} and a''_{i_1...i_k} be the smallest and largest of the a's, respectively; then an argument analogous to that in Section 1 proves that all c's must be zero and hence all a's must be equal.
Consider two sets of subscripts i_1...i_k and j_1...j_k, denoted simply I and J, which differ only in that, for some t (1 ≤ t ≤ s-1), one subscript, say i', in the tth group and one subscript, say i'', in the (t+1)th group of I are interchanged in J (and then the subscripts rearranged within groups so that they are in increasing order). Then, clearly,

    y_{I,J} = δ*_t (Σ x_{i''} - Σ x_{i'})/n = δ*_t (x̄_{i''} - x̄_{i'}),

where the summations are over the sample values, the sample subscripts having been omitted. Hence, the corresponding contrasts Z_t (t = 1, ..., s-1) have all coefficients zero except the tth contrast, which is x̄_{i''} - x̄_{i'}. Thus (the a's being equal) a_I f^n_I > a_J f^n_J is equivalent to x̄_{i''} > x̄_{i'}. Such a statement holds for every two sets of subscripts corresponding to two consecutive groups. Hence, the decision to choose A_{i_1...i_k} by D⁰ implies x̄_{j_1} > x̄_{j_2} for every pair of subscripts (j_1, j_2), j_1 belonging to the (t+1)th group (i_{K_t+1}, ..., i_{K_{t+1}}) and j_2 belonging to the tth group (i_{K_{t-1}+1}, ..., i_{K_t}), t = 1, ..., s-1.
Clearly, then, a minimax d.r. D⁰ (for a fixed sample size) groups the populations according to the sample means; i.e., denoting the ranked sample means (in increasing order) by x̄_[1], x̄_[2], ..., x̄_[k] and the corresponding populations by π_(1), π_(2), ..., π_(k), D⁰ chooses the alternative A corresponding to:

    π_(1), ..., π_(K_1) are the worst populations,
    π_(K_1+1), ..., π_(K_2) are the next worse,
    .....................................
    π_(K_{s-1}+1), ..., π_(k) are the best populations.

Note that D⁰ is independent of μ_0, which remains arbitrary.
Bechhofer proves that the least favorable configuration, which is assigned probability 1 by the λ_{i_1...i_k}'s, is least favorable in the sense of Theorem 2.5 (ii) of [3]. Hence, D⁰ is a minimax d.r. for discriminating among ω, defined by (2), and D⁰_N, where N is chosen according to Bechhofer's rules, is a M.E. d.r. for this problem. Thus, we have proved Bechhofer's procedure to be most economical for Goal II.
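As a small illustration of the rule just derived (a sketch in Python added here; the sample means and group sizes are hypothetical), the function below groups the populations according to the ranked sample means into groups of sizes k_1, ..., k_s, and a companion function counts the m = k!/(k_1! ... k_s!) alternative decisions.

```python
import math
import numpy as np

def group_by_sample_means(xbar, group_sizes):
    """The rule D0 for Goal II: rank the sample means in increasing order and
    assign the populations with the k_1 smallest means to the "worst" group,
    the next k_2 to the following group, and so on (group_sizes = [k_1, ..., k_s])."""
    order = np.argsort(xbar)             # population indices, worst to best
    groups, start = [], 0
    for size in group_sizes:
        groups.append(sorted(order[start:start + size].tolist()))
        start += size
    return groups                        # s groups of population labels

def number_of_decisions(group_sizes):
    """m = k!/(k_1! k_2! ... k_s!), the number of alternative decisions."""
    k = sum(group_sizes)
    m = math.factorial(k)
    for size in group_sizes:
        m //= math.factorial(size)
    return m

if __name__ == "__main__":
    xbar = np.array([1.7, 0.4, 2.3, 0.9])            # hypothetical sample means
    print(group_by_sample_means(xbar, [1, 1, 2]))    # [[1], [3], [0, 2]]
    print(number_of_decisions([1, 1, 2]))            # 12
```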
3. Ranking the Parameters of Distributions of the Laplacian Class. We shall show that similar results, with some limitation, hold for problems of ranking (or grouping according to ranks) the parameters θ_1, ..., θ_k of k populations whose distributions, differing only in a parameter θ, belong to the Laplacian class of distributions, defined as all distributions having a density function or probability function of the form

(3)    g(x, θ) = β(θ) r(x) e^{p(θ) t(x)},

where p is an increasing function of θ. This class includes the normal distributions with known variance, the normal distribution with known mean, and the binomial, negative binomial, and Poisson distributions. We require, as before, that the same number of samples be taken from each population.
We consider only extensions to Section 1; extensions to Section 2 are completely analogous. Denote by f(x, θ) the k-variate density (or probability) function of k independent variables, the ith variable having the density (or probability) function g(x_i, θ_i) (i = 1, ..., k). Except for one point, it is easily verified that the derivation of Section 1 is valid for this case concerning any distribution of the Laplacian class, replacing population means by θ's, the sample means (x̄_1, ..., x̄_k) by (t̄_1, ..., t̄_k), where t̄_i is the average over the sample values of t(x) for the ith population, and the (k-1)-variate normal distribution, G, of the y_i vectors by a (k-1)-variate joint distribution of t̄_i - t̄_j (j ≠ i, j = 1, ..., k), where the t̄_j's are no longer necessarily normal.³ But none of the arguments of Section 1 depend on the nature of the G distribution except the fact that G is independent of μ_0 (here denoted θ_0); this fact, the exceptional point referred to above, will not necessarily hold for other distributions of the Laplacian class--in fact, it does not hold for any of the other distributions listed above. This is intuitively not surprising since none of the parameters considered other than the normal mean have an unlimited range from plus to minus infinity, and hence we might expect a minimax d.r. to depend on the "location" fixed by θ_0.

3. It should also be noted that events a_i f_i^n = a_j f_j^n may now have positive probability and must be accounted for by allowing randomized d.r.'s--that is, "ties" must be "broken" by randomization.
Let D⁰_n denote a decision rule based on a sample of size n which chooses as the best population the one with the largest sample t̄_i, and suppose N is the sample size taken from (hypothetical) tables similar to Bechhofer's but constructed from the appropriate G distribution for a specified θ_0. Then, if θ_0 is specified in the definition of the parameter spaces ω_i, thusly

(4)    ω_i⁰ = {θ : θ_i ≥ θ_j + δ for all j ≠ i, and θ_j = θ_0 for some specified value of j}     (i = 1, ..., k),

we have as in Section 1 that D⁰_N is a M.E. d.r. relative to γ = (γ, ..., γ) for discriminating among ω⁰. To find a M.E. d.r. for discriminating among ω (with θ_0 unspecified), it would be necessary to find a "least favorable" value of θ_0 for the particular problem at hand. This may make the "least favorable configuration" a rather unrealistic one; this might be overcome, however, by taking a "middle course" of limiting θ_0 to some interval.
Tables for finding the M.E. sample size for such non-normal problems analogous to Bechhofer's tables for the normal case have not been constructed. The compilation of such tables would involve some unsolved distribution problems; for example, if the θ's are parameters of binomial distributions, we require the joint distribution of m-1 dependent y_i's, each of which is the difference between two binomial variables with the same parameter n but different θ's. However, by transforming t̄ = t̄(x) into an approximately normally distributed variable, Bechhofer's tables might be used to advantage to approximate the M.E. sample size, as suggested by Bechhofer in Section 7 of [1]; for example, see Bechhofer's Example 4. Some indication of the accuracy of such approximations may be given by the comparison of a similar approximation with the exact computation made by Bechhofer and Sobel in [2].
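One way such an approximation might be carried out for the binomial case is sketched below (this is an illustration added here, not Bechhofer's own computation; the use of the arcsine variance-stabilizing transformation and the numerical values k = 3, θ_0 = 0.4, δ = 0.2, γ = 0.90 are assumptions of the sketch). The transformed sample proportion is treated as approximately normal with per-observation standard deviation 1/2, the normal-theory integral of Section 1 supplies an approximate sample size, and a Monte Carlo run of the exact binomial selection rule indicates the quality of the approximation.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def pcs_normal(n, k, delta, sigma):
    """Normal-theory P(correct selection), as in Section 1."""
    lam = delta * np.sqrt(n) / sigma
    f = lambda z: norm.cdf(z + lam) ** (k - 1) * norm.pdf(z)
    return quad(f, -np.inf, np.inf)[0]

def approx_n_binomial(k, theta0, delta, gamma):
    """Approximate sample size for choosing the largest binomial parameter:
    arcsin(sqrt(p_hat)) is treated as normal with per-observation s.d. 1/2,
    and the "distance" is measured on the transformed scale."""
    d = np.arcsin(np.sqrt(theta0 + delta)) - np.arcsin(np.sqrt(theta0))
    n = 1
    while pcs_normal(n, k, d, 0.5) < gamma:
        n += 1
    return n

def pcs_binomial_mc(n, k, theta0, delta, reps=20000, seed=0):
    """Monte Carlo P(correct selection) for the rule "largest success count",
    at the configuration with one parameter theta0 + delta and the rest theta0;
    ties are broken by randomization."""
    rng = np.random.default_rng(seed)
    thetas = np.full(k, theta0)
    thetas[-1] += delta
    correct = 0
    for _ in range(reps):
        counts = rng.binomial(n, thetas)
        winners = np.flatnonzero(counts == counts.max())
        correct += rng.choice(winners) == k - 1
    return correct / reps

if __name__ == "__main__":
    k, theta0, delta, gamma = 3, 0.4, 0.2, 0.90      # illustrative values only
    n = approx_n_binomial(k, theta0, delta, gamma)
    print(n, pcs_binomial_mc(n, k, theta0, delta))
```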
4. A Distribution-Free Ranking Problem. Suppose F_1, ..., F_k are unknown continuous distribution functions corresponding to the populations π_1, ..., π_k, and denote

    θ(F_i) = θ_i = ∫_0^∞ dF_i     (i = 1, ..., k).

(Alternatively, some constant other than zero may be specified as the lower limit of integration in the definition of θ_i.) Consider the goal analogous to Goal I: to choose the population with the largest value of θ_i after taking an equal number of samples from each population. If the F_i's differ only in location parameter, then the ranked θ_i's correspond to ranked means or medians. A similar development may be given for the analogous Goal II.

We shall show that, for a fixed sample size, a d.r. D⁰ which chooses θ_i as the largest θ if the sample from π_i has the largest number of positive values (i = 1, ..., k) is a minimax d.r. for a simple weight function for discriminating among ω⁰, defined by

    ω_i⁰ = {(F_1, ..., F_k) : θ_i ≥ θ_j + δ for all j ≠ i, and θ_j = θ_0 for some specified value of j}     (i = 1, ..., k);

and hence, by choosing the sample size as indicated in Section 3 for the proper distribution G, D⁰ is a M.E. d.r.
Let

    h(x, θ) = θ^{t(x)} (1 - θ)^{1-t(x)} / C   if |x| ≤ C,
            = 0                               otherwise,

where t(x) = 1 if x ≥ 0 and 0 otherwise and C is an arbitrary positive constant, and denote by f(x, θ) the corresponding joint distribution of (x_1, ..., x_k), where x_i has the density h(x_i, θ_i) (i = 1, ..., k). Let λ_i assign probability 1 to {(F_1, ..., F_k) : dF_l/dx = h(x, θ_l), l = 1, ..., k, with θ_i = θ_0 + δ and θ_j = θ_0 for all j ≠ i} (i = 1, ..., k), and consider the simple discrimination problem of discriminating among f_1, ..., f_k, where f_i is the "average" (w.r.t. λ_i) density function of (x_1, ..., x_k).
Since h(x, θ_i) is of the form (3), Section 3 implies that D⁰ defined above is a minimax d.r. for this simple discrimination problem. That the λ_i's are least favorable follows as for the previous cases, noting first that the probability that D⁰ will choose π_1 when F_1, ..., F_k are in ω_1⁰ depends on F_1, ..., F_k only through θ_1, ..., θ_k. The distribution G required to compute the M.E. sample size is the same distribution required for the case of choosing the largest of k binomial parameters. (See the last paragraph of Section 3.)
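The distribution-free rule of this section is simple enough to state in a few lines of code. The sketch below (added for illustration; the shifted logistic distributions, the shift values, and the sample size are arbitrary choices, since only θ_i = P(X_i > 0) matters for the rule) counts the observations with x ≥ 0 in each sample (the t(x) of the text), chooses the population with the largest count, breaks ties by randomization (cf. footnote 3), and estimates the probability of a correct selection by Monte Carlo.

```python
import numpy as np

def select_by_positive_counts(samples, rng):
    """The rule D0 of Section 4: choose the population whose sample contains
    the most observations with x >= 0, breaking ties by randomization."""
    counts = np.array([(s >= 0).sum() for s in samples])
    winners = np.flatnonzero(counts == counts.max())
    return rng.choice(winners)

if __name__ == "__main__":
    # Monte Carlo illustration with shifted logistic populations; only the
    # values theta_i = P(X_i > 0) matter for the behaviour of the rule.
    rng = np.random.default_rng(1)
    k, n, reps = 3, 25, 20000
    shifts = [0.0, 0.0, 0.8]                 # population 3 (index 2) is "best"
    correct = 0
    for _ in range(reps):
        samples = [rng.logistic(loc=sh, size=n) for sh in shifts]
        correct += select_by_positive_counts(samples, rng) == k - 1
    print(correct / reps)                    # estimated P(correct selection)
```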
5. Directions for Further Research. The author is at present considering the following problems: (1) to prove that Bechhofer and Sobel's procedure for ranking variances of normal populations with unknown means [2] is M.E.; (2) to find M.E. solutions when different values of γ = γ_i are specified corresponding to each ω_i; (3) to apply the theory of Section 4.3 of [3] to find most economical solutions when the restriction of equal sample sizes (and of equal variances for the normal case) is dropped; (4) to find a least favorable distribution of θ_0 for various distributions of the Laplacian class; and (5) to find methods for obtaining the M.E. sample size more accurately for the non-normal cases--in particular, for the binomial and distribution-free cases.
REFERENCES

[1] Bechhofer, Robert E., "A Single-Sample Multiple Decision Procedure for Ranking Means of Normal Populations with Known Variances", Annals of Mathematical Statistics, 25 (1954), 16-39.

[2] Bechhofer, Robert E., and Sobel, Milton, "A Single-Sample Multiple Decision Procedure for Ranking Variances of Normal Populations", Annals of Mathematical Statistics, 25 (1954), 273-289.

[3] Hall, Wm. Jackson, "Most Economical Multiple-Decision Rules", Institute of Statistics Mimeograph Series No. 115, University of North Carolina (1954).