Sen, P.K.; (1966)On non-parametric simultaneous confidence regions and test s for the one-criterion analysis of variance problem."

-J
I
I
1-
1
1
1
ON NONPARAMETRIC SIMULTANEDUS CONFIDENCE REGIONS AND
TESTS FOR THE
CJl~E
CRITERION ANALYSIS OF VARIANCE PROBLEM*
PRANAB KUMAR SEN
University of North Carolina, Chapel Hill
]
Institute of Statistics
1
~limeo
Series No. 463
March 1966
1
1_
1
-I
1
*Work
supported with the Army Research Office, Durham, Grant DA-31-124-G432.
1
1
DEPARTMENT OF BIOSTATISTICS
J
1
1I
UNIVERSITY OF NORTH CAROLINA
Chapel Hill, N. C.
I~
I
~
Je
ON NONPARAMETRIC SIMULTANEOUS CONFIDENCE REGIONS AND TESTS FOR THE
ONE CRITERION ANALYSIS OF VARIANCE PBOBLEM*
PRANAB KUMAR SEN
J
University of North Carolina, Chapel Hill
and University of Caloutta
1
J
Summary.
0' .......
...y~."
-,
For the one oriterion analysis of varianoe problem, some nonparametrio
-.
1
generalizations of the two well-known methods of Multiple oomparisons by Tukey
J
oharaoteristios of the proposed methods are oompared with those of the others,
[26] and Soheffe [19], are proposed and studied here.
The performanoe
available in the literature.
J
1.
]e
1
1
Introduotion.
•
_,'
"
r • .•, .• ".'
I
,-
J
J
, •
Let there be
o(~
2) independent samples, the ith sample
funotion (odf) Gi (x), for i = 1, ... , o.
it is assumed that
~
Gi(x)
= (~l'
=G(x -
~i)' i
=1,
••• ,
••• , eo) being a real o-veotor.
In one way analysis of varianoe problem,
0,
The null hypothesis to be tested
relates to
]
J
•
oomprising of ni independent and identioally distributed random variables
(i·i.d·r·v.) distributed aooording to aoontinuous oumulative distribution
(1.1)
J
~.
Under the assumption of G being a normal odf, the olassioal varianoe ratio
(F-) test is known to possess some optimum properties as a oomprehensive test
for Ho in (1.2). However, in many situations, we may not be merely satisfied
with the rejeotion of (1.2), but may also desire to test more detailed hypotheses
*Work
supported with the
Army
Researoh Offioe, Durham, Grant DA-3l-l24-G432.
I
I
.e
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
I_
I
-2-
concerning two or more of the components in
.~.
An important class of parametric
functions used for this purpose is the set of all,contrasts among Ql' ••• , Qcl
a contrast ~ being defined as
~
(1.3)
The class
=.!.",~, .~ = (ll'
q of
•.• ,
I. c )'! _1.:' c
= (1,
... , 1).
all possible contrasts is generated by the space ~ (of rank
N
c-l) of all possible
I.
vectors, satisfying (1.3).
Obviously, we can have always
,oV
a set of 0-1 linearly independent contrasts, say ~l' "0' ~c-l' which spans
the class ~
,..,..
={~:
~
=I. old,
"',
~·v
~
FJ
E
L,
~.~
I. J .. 1 C ).
_.~
It is also interesting to note
i"J
that ~ is translation invariate in the sense that for any real scaler I,
The problem of making further inferences about these contrasts, arising
when the F-test rejects H in (2.2), has been considered by Nandi [15], Duncan
o
[4], Tukey [26], Scheffe [19], Bose and Roy [2], Roy [18], Ghosh (8), Dwass [6],
among many others.
[19]
0
The two mostly used methods are due to Tukey [26] and Scheff~
Essentially these methods are concerned with providing simultaneous
confidence regions to all possible contrasts with a view to carry out any
multiple comparison test having a preassigned level of significance.
These
procedures are all valid for normal cdf's only.
Now with the steady advancement of nonparametric techniques in this field
of researoh, it has come to be reoognized that the nonparametric competitors
not only oompare quite favourably with their parametric rivals but also possess
some properties whioh mq not be shared by the others (cf. Lehmann [12], [14]).
Further, an interesting method of estimating shift parameters by rank order tests
oonsidered by Hodges and Lehmann [11] has made the opening of a new line of
approach to the analysis of variance problems, on which further works are due
to Lehmann ([13], [14]), Bhuohongkul and Puri [1] and Sen ([22], [23]).
However,
I
I
.e
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
I_
I
-.3for the problem of simultaneous confidence regions or multiple comparison tests,
these results
m~
not be directly applicable.
The objeot of the present investigation is to provide some nonparametric
generalizations of the well-known T and S methods of multiple comparisons
(cf. [20, pp. 66-8.3]) through the use of the same class of rank order statistics
\
as has been used in ([11], [1.3], [14], [21], [22], [2.3]).
Some of the ideas of
the present paper are scattered in rudimentary forms in the works of Dwass [7]
and Steel ([24], [25]).
However, no systematic approach to this problem
(particularly the asymptotic theory) has yet been made elsewhere in the literature.
The asymptotio properties of the proposed methods are studied and oompared with
those of the parametric S and T methods as well as the other ones by Steel ([24),
[25]) and Dunn [5].
2.
Nonparametrio generalizations of T-method of multiple comparisons.
I'"
fV'/,J~'JJ'VI'~.IW""'''''N/Y'''.· A~f""""' ~/'#N'-I-V _/V/UI/'tIN,...·',;
tvN
"'.,~}'JfU/~·'{h.-l"1
N'_
N~_"'J ftt;1"~.,..•.;NAJ
N"'VNro.t"~j'V_,... ,tVnNN
By
analogy with the homoscedasticity condition implicit in the use of the T-method
(cf. [28, p. 294]), we require here
(2.1)
Our proposed method is essentially based on Tukey1s [26] principle as adapted
for the one way classification.
We shall oonsider first the method of paired
comparisons, where we want to test for the difference of
(1.1),) for all i
*
j =
~i
-
~j
(referred to
1, ••• , o. SUbsequently, we will consider the method of
multiple comparisons, where we want to test for the significanoe of all possible
contrasts.
2.1.
The paired comparisons.
Let us formulate first the class of rank order
statistics which will be used throughout the paper.
x.1. = (X.1.l , ... ,
J'J
X.l.n ), i
= 1,
.... , c; ,.JI n
Let
= (1,
••• , 1);
]
-4-
1
E = (En, l' ••• , En, 2)'
n En,a = J n (~2n)' 1 ~ a ~ 2n,
Je
]
]
J
1
J
J
Je
J
J
J
J
1
J
~n
where J n is defined on the same line as in Chernoff and Savage [3, p. 972] and
it satisfies all the four regularity conditions of theorem 1 of [3]. Throughout
this paper, these regularity conditions will be implicit in the use of E.
""n
We
write then
2n
E =L ~ E
n
2n a=l n,a
Also, we denote by J(u) = lim J (u) for 0
n::
00
n
<u <1
and write
1
IJ. =
S J(u)
o
du,
Then, we assume further that
(2.6)
J(u) is
t
in u:
0
< u < 1;
Further, let Z(ni,j) = 1, if the a-th smallest observation in the combined
,a
(i,j)-th sample is from the ith sample, and let Z(i,j) r- 0, otherwise, for
n,a
a = 1, ••• , 2n. Then, we shall be oonoerned with the following class of ChernoffSavage [3] type of rank statistics
(2.8)
h (X., Xj )
n .. J.,..,
=1n
2n
~ E
-1
a-
Z(i,j) , for
n,a n,a
J.' 4~
- , ... , .
j .• 1
c
In oonjunction with (2.6), we shall assume that
(2.9)
hn(~i + a,!n' ~j) is
t
in a for all ~i' ~j' i :;j: j = 1, ... , c.
It may be noted that statistics of the type (2.8) include as particular cases,
1
Je
J
the Wilcoxon's [27] statistic, Normal. score statistic, emong others. Let us
then consider the following statistic
I
I
.e
I
I
I
]
]
-5-
wn --
(2.10)
which
pl~s
[2nt A- l I h eX. ,X.) - E
n . n,..,~ JVJ
n
Max.
l~i, j~c
the basic role in our proposed method.
THEOREM 2.1
Under H in (1.2),
o
where Xc (t) is the cdf of the sample renge in a sample of size c drawn from a
standardized normal distribution.
PROOF.
Let us define
X
(2.11)
Y = B(X) =
S
x
]
Then, under (1.2), Y.
0'
~J
]e
J
J
J
]
J
]
]
Je
J
I ),
JI[G(X)] dG(l), G(x ) =
o
t.
o
j = 1, ... , A, i = 1, ... , c are N(=l'lc) i·i·d·r·v.
By
theorem 1 of [3], it is known that Y has a finite absolute m('ment of order
2
+1,
for some ~
its mean by ~.
(2.12)
> 0,
Thus, on defining
z . =
n,~
2
the variance of Y is A , defined in (2.5), and we denote
1
n -"2A
it follows that Zn,l, ""
1
n
~ [ Y•.- ~ ], .~
~J
j=l
Zn,o are i'i'd"r"v'
normally with zero mean and unit variance.
for any given
= 1, ••• , c,
distributed asymptotically
Consequently, it is easily seen that
0,
lim P fMax.
/' Z
n= 00 (l~i, j~c
n,i
-z.
I -< t .J;>
n, J
()
= X0 t ,
where Xc(t) is defined in the statement of the theorem.
Now, proceeding precisely
on the same line as in the proof of theorem 1 of [3], and then using Poincares
theorem on total probability, it is easily seen that
I
I.
I
I
I
I
I
I
I
Ie
I
I
I
I
I
•
I
I·
I
•
-6[nt A- l
(2.14)
Simultaneously for all i
(2.10)
and
(2.15)
(2.13),
-t(zn,J.. - Zn,J.)]
f h (X., X.) _ P. 1.
1.nJ.
J
)
f
=0
P
(2.5), (2.7),
Thus, from (2.Li-),
j = 1, ••• , c.
(1),
we get that
I)J
(W - Max.
Z
_ Z
t n l~i,j~c I n,i
n,j
i
Hence, the theorem follows from
= 0p
(1)
•
(2.13).
In small samples, the exact distribution of Wn(under
Ho
in
(1.2),)
may
be quite involved and no suitable algebraic expression may be attached to it.
However, there appears to be a permutation procedure of eval,uating the exact
null distribution of Wn •
Let us write ~
is composed on N i·i·d·r·v·
= CJl ,
"., ~c).
and hence conditioned on the
tInder (1. 2),
giv'en~,
!-N
all possible
(Nl) permutations of the variates among themselves are equall,- likely.
This,
conditioned on the given~, all possible (Nl/(nl)c,) partiti~nings of these N
variables into c subsets of equal sizes are equally likely, each having the
(conditional) probability [N!/(n!)Cr l •
Thus, if we conside:c ....·ihe set of all these
partitionings and for each one of them, we compute the value o~<Wn (with the aid
of formulation
W.
n
(2.10),)
we will arrive at the permutation distr~bution function of
Since, G is assumed to be continuous (so that the possibili~y of ties may be
ignored, in probability,) and as hn (X.,X.)(i
=!= j = 1, ••• , c) are all rank order
,..,J. . . . . J
.
statistics, it follows that the permutation distribution of W
n
der~ved
in this
manner will agree with the exact null distribution of W. This pro~tedure may be
n
quite useful for small or moderately large values of n (particularly, if some
modern computing facilities are available), while for large samples,
theorem
2.1
we may
use
to apprOXimate the true null distribution of Wn by x' c (t), t.ables for
whioh are abailable in Biometrika tables
for Wilcoxonls
[27]
statistic, theorem
[16,
2.1
pp.
165-171].
It may be n~ted that
(in a slightly incorrect versitn) has
I
I.
I
I
I
I
I
I
I
-7-
been considered by Dwass [7].
Let now a:
our preassigned level of significance.
We denote
by W
n ,a and Rc ,a ' the upper lOOa·/ • point of the exact null distribution of Wn
and of Xc (t) respectively, so that
(2.16)
and by theorem 2.1, W
n ,a '-7 Rc,a •
The simplest type of paired comparison test may now be formulated as follows:
1.
For all 1 ~ i
<j
~ c, compute the values of h (Xi' X ); the values
n"" ~j
of the remaining set will be obtained from the relation that
2.
Ie
I
I
I
I
I
I
I
I·
I
e < a < 1 be
3.
Compute the value of W
corresponding to the preassigned a.
n,a
Referred to (1.1), regard those (Qi - Qj ) to be significantly
different from zero for which
(2.18)
It is easily seen that the test is an exact size a(e
< a < 1)
multiple
comparison (similar) test •
Steel
([24], [25])
has used Wilcoxon1s
[27]
statistic to derive some
multiple comparison tests analogous to Duncan's [4] and Tukeyts [26] methods.
Here, we won1t consider Duncants method, while his generalization of Tukey1s
method may be regarded as a particular case of our paired comparison test in
(2.18).
Moreover, his study remains appreciably incomplete in the sense that
he has not supplied any (asymptotic or exact) expression for W (even for his
n,a
simplest case,) not to speak of any property of his proposed procedure. For
c
=3,
he has provided a table for the probability law of the minimum Wilcoxon-
I
I.
I
I
I
I
I
I
I
I_
I
-8-
statistio (among the c(c-l) paired of samples) for sample sizes up to 6, and
also some approximate values for c
•
10, n
~
10. Our results not only generalize
his procedure to a wider class of rank order statistics but also suggests some
asymptotically simplified form of Wn •
Now, often we are not merely satisfied with the detection of those pairs
of (Qi' Qj) for which Qi 4: Qj' but also we want to attach a simultaneous
i . j = 1, ••• , c. For this, we shall
confidence region to all possible Q.~ - Q.,
J
*
use (2.18) and a method of deriving confidence intervals for shift parameters,
considered by Lehmann [13] and Sen ([21], [22]).
Let us write.
6ij = Qi - Qj for i, j = 1, •• " c.
From (Ll), we will hav'e then
(2.20)
G. (x)
~
= G(x-Q.) = G.(x-6
.. ),
J
~
~J
for i, j
=1,
... , c.
Let us then define
(2.20)
I
•-
~
r 1J.(1) = E Inn
In-t An Wn,a
1r
I; 1J.(2)
= En + tn-t AnWn,a •
n
Now, by (2.9) hn (X.
+ aI·n , AJ
X.) is monotonic in a.
,~
principle, we arrive at the following two values:
I 6ij •L = Inf
.( a:
i
(2.21)
\ 6 .. U = Sup { a:
~J'
Hence, by the sliding
hn (.N~
x. + aI n , NJ
X.)
(1) t ,
> 1J.n
·
(X. + aI , X. )
hn",~
. . n "'J
< 1J. n(2)-J
rJ
;
which defines an interval
(2.22)
_e
1.~ J' = rL 6.~ j: 6.~J'
. L ~ 6.~J. ~ 6.~.
j U} , i :t= j = 1, ••• , c.
Then, it follows from (2.18) though (2.22) that the ~robabil!ty is 1 - a that
I
I.
I
I
I
I
I
I
I
I_
II
-9-
the inequalities ~ij.L ~ ~ij ~ ~ij.U hold simultaneously for all i ~ j = 1, ••• ,
c.
We shall now consider certain asymptotic properties of the propose paired
comparison procedure.
To justify the approach theoretically and to avoid the
limiting degeneracy (2.22), we shall now conceive of a sequence of c-tuplets
of cdf's {' Gn, i(x), i = 1, ... , c i, for which (1.1) holds and
J
.l.
n 2 Q -'? A.
~n
~
= (A.1 ,
••• , A. ) as n ._:;. 00 ,
where 9 is defined as in (1.1)
'n
c
and A..
J.,
i
=1,
••• , c are real and finite.
We
also define
A.
ij
= A.i
- A. j for i, j
=1,
••• , c.
We will be then interested in paried oomparisons in A..J. IS, instead of 9.J. l s. (It
may be noted that as for the simultaneous confidence region £I ij : 1 ~ i, j ~ c :- ,
we may consider a somewhat more general formulation where as n
-700
~~j being some (fixed) real quantity, not necessarily equal to zero. Since, the
confidence intervals of the type (2.22) are all translation invariant (cf. [13],
-
•-=e
[22]), for the study of the asymptotio properties, it is immaterial whether we
take l1~j
=0
or not, for all i, j = 1, ••• , c.). The above formulation is
analogous to Pitmanls type of translation alternatives usually adopted to study
the efficiency aspects of the nonparametric analysis of variance tests.
Let us
now consider the Hodges-Lehmann [11] point estimate ~ij of ~ij' in (2.19).
this, we define
r ~ij
"'(1)
(2.25)
<.
= Inf. i" a: hn (X.
. . . J. + aI",n , X.)
""J
!'_ ~~2)
= Sup.
J.j
r
/.. a: hn (!i + B.!n' ~j)
> "" n)~
< ""n j
•
For
I
I.
I
I
I
I
I
I
I
-10-
Then, conventionally
/\
~ij
(2.26)
_
1
("
(1)
~ij
- 2
+
,. (2) )
~ij
,i, j
=1,
••• , c.
We also define
00
S (d/dx)J[G(x)]
B(J,G) =
(2.27)
dG(x).
-eo
Then by a simple and straight forward extension of theorem 1 of Sen
[2~],
it
follows that asymptotically I ij in (2.22) reduces to
(2.28)
I ij =
"
where A. ••
~J
= n2
1 A.ij : / A.ij
"
- A.ij I ~ A Rc,a/B(J,G)} ,
0
1"
(ti.. - ~ .. ) is the derived form of Hodges-Lehmann [11] estimate
~J
~J
I_
in (2.26), and A and RC,a are defined in (2.5) and (2.16) respectively.
Let us now compare (2.38) with the confidence interval obtained by T-method,
I
when G is assumed to be normal with a variance 0 2 • If we define Eij = Xi - Xj ,
as the difference of the ith and jth sample means, and if Rc,e (n-1) ,a is the
"""'"
I
upper lOOa
point of the studentized range RC,C (n- l)(Cf. Wilks [28, p. 294]),
then we have the probability 1 - a that the inequalities
I
i
...
I
is.~ j
--=
I
I
- n-i sR
c,c (n- 1) ,a -< l::..~J• -< E.~ j +
iii
sRc,c(n-l),a
=
holds simultaneously for all i, j 1, ••• , c, where s2 is the unbiased estimate
of 0 2 carrying c(n-l) degrees of freedom (d.f.). As it is well-known that
s2
I
•-e
·1·
P
~.
----r
0
2
d R
_= RC ,a as n
, an
c, c (1)
n- )a ----c,
-~
00',
we get from (2.29) that asymptotically the confidence interval in (2.29) reduces
to
f '\
'- "'i"J
i
•
< oRc,a'~,
I
Ie
I
I
I
I
I
I
I
I_
I
I
I
I
I
I
I
.e
I
-11-
0 )
2 liij - liij
where -Aij = n.1.(_
Thus, if we take the ratio of the square of the width
of the oonfidenoe intervals as a measure of the asymptotio effioienoy, the
0
asymptotio relative effioienoy (A'RoE') of our proposed method with respeot to
the T-method reduoes to
(2 •.32)
Thus, we arrive at the following theorem.
The A'R'E' of the proposed nonparametrio generalization of the
THEOREM 2.2.
T-method of paired oomparisons with respeot to the T-method itself, is equal to
the AoR·E·
of the two sample rank order test (on whioh the proposed method is
based,) with respeot to the Student's t-test.
We shall disouss more about (2 •.32) later on.
It
m~
be noted that the present author, in an earlier paper [22], has
oonsidered a method of estimating B(J,G) for any absolutely oontinuous G,
satisfying the oonditions of lemma 7.2 of Puri [17].
separately from each (i,j)th samples (for 1
~
i
<j
~
If we estimate B(J,G)
c), we may oombine these
together by an unweighted average, as the sample sizes are all equal.
If we
A.
denote this estimate by B (J,G), then it follows from (2.28) that asymptotioal1y,
the oonfidenoe interVal for Ai - Aj m~ be written as
,.
r..
,.
+
AR
/
B(J.G).
~ij - ARo,a/ B(J,G) -< X.-A.<A'
~
J- ~ j
o,a
'
2.2.
The multiple oomparisons.
We start with a remark that the estimators liij
in (2.26) are inoompatib1e in the sense that they
m~
not satisfy the transitive
relations viz., liij + li jk = liik' whioh must be true for the corresponding
parametric quantities. To remove this drawback, Lehmann ([12], [14]) has
oonsidered the following adjusted estimates. Let
A.
6.~.
1
=-o
c ,.
li for i
j=l ij
~
=1,
••• , 0;
I
I····e
I
I
I
I
I
I
I
-12-
,.
,.
Zij = ~i. - 6 j • for i
*
j = 1, ••• , c.
It is easy to see that Z.. satisfies the aforesaid transitive relation's.
J.J
It is
also well known (cf. [1, theorem 3.1]) that
1
n2
,.
(Zij - l\j) =op(l), for all i, j = 1, ••• , c.
Our proposed method is based on certain properties of Zij's.
PROOF.
Suppose for any fixed i, the range of (Zij - ~ij) is attained by the
pair of paired suffixes (i,k) and (i,[).
Then, using (2.35) we get that
I_
Since, the right hand side of (2.37) is independent of i, it holds for all
I
i = 1, ••• , c, and hence, is also the unrestricted maximum.
Hence,
I
I
I
I
I
I
,e
I
The other relation also holds similarly.
Hence the lemma.
c
If ~ = ~ l.Gi be any contrast in G and if for some [(=1, ••• , c),
1 J.
~
LEMMA 2.4.
c
we define ~u = ~ [.Z./., then
{I.
1
1 J.
,.
Sup \~. _
U
{I.
PROOF.
~,
A
c
t\\ \
~
<
1. "I'
- 2 ~
1
f.
I
AJ.' •
Max.
Z
,,1
1<·
_J, k<c
_ \ J·k - ~jk!
.
c
We can rewrite ~ as ~f. = ~ I.i 6i /.., and hence,
•
I
-13-
I.
I
I
I
I
I
I
I
I_
I
I
Since, the right hand side of (2.38) is independent of
for all
I
=1,
l,
and the inequality holds
••• , c, the lemma follows directly from (2.38).
If we now let
t ij
=01 1i for j =1,
then the contrast
&may
••• , c, i
=1,
••• , c,
c
also be expressed as Z
I
i~
Consequently,
from lemma 2.4 we get that
Now corresponding to the c(c-l) estimates 6 ij in (2.26), we complete the
values of Zij for i ~ j = 1, ••• , c. Further, from the c(c-l) simultaneous
I
I
confidence intervals I ij in (2.22), we compute the value of
I
We denote this maximum by Hn,a • So that from (2.22) and the probability statement
made just after it, we get that
I
I
Consequently, from (2.40) and (2.42), we conclude that the probability is 1- a
,.
I
Max.
\
l~j,k~c \ Zkj - 6 kj \ subjeot to 6 ij eIij for all i ~ j
that the inequalities
=1,
••• , c.
I
-14-
I.
I
I
I
I
I
I
I
Ie
c
1
c
~ ~tijZij--2Hn,a 1~1/.·/~rl\=~I.·Q·~
~ I:/.. Z··+- H ~11.·1
i=l j=l
1
i
1 1 1
i=l j=l 1 j 1J 2 n,a l 1
I
I
I
1
hold simultaneously for all ~\
c
&
.
c
c
c
¢.... •
"-'
This may be regarded as a nonparametric generalization of the well known
T-method of multiple comparisons.
(2.43) may be used to attach a simultaneous
Now using (2.22), (2.28), (2.41) and (2.42), it readily follows that
asymptotically
.1.
n2 Hn,a ~ ARc ,a/ B(J,G),
where A, R
C,a
and B(J,G) are defined in (2.5), (2.16) and (2.27) respectively.
Thus, if we define the derived estimates
"0
..I~ij
I
I
I
c
1
= n~
0
(zij - ~ij
A)"
, 1, J -- 1 , ••• , c
(b.~j being defined in (2.25),) then from (2.43), (2.44) and (2.45), we get that
(2.43) asymptotically reduces to
1 c
c c
"0 1 c .
~II.. tAR
/B(J,G) ~ ~ ~ /'i·(iI.·j-il.· ')~-2 ~1~iIAR /B(J,G).
1 1
c,a
i=l j=l J 1
1J
1
c,a
--2
If we now compare (2.46) with Tukeyts [26] results, as adopted in the case of
one way classified data with equal number of observations (cf. [28, p. 296]) we
again get the same A. R. E. as obtained in (2.32).
THEOREM 2.5.
Hence, the following.
The conclusions of theorem 2.2 also hold for the multiple comparison
•-
tests considered here •
I
For example, if we use Wilcoxon's statistic, then for normal cdf, (2.32) is equal
-e
I
Now regarding (2.32), various known bounds are available in the literature.
I
I.
I
I
I
I
I
I
I
Ie
to 0.95, it has the minimum value of 0.864 for any oontinuous G (of. [9]),
while it oan be arbitarily large for some speoific G.
Similarly, it is kno'Wn
that the use of normal soores results in a value of (2.32) whioh is always at
least as large as one, while it may also be indefinitely large for some specific
G.
The relative value of the effioiencies of the Wilooxon's test and normal
soore test depends on the partioular G, and for various known G, a nice aooount
of this is available with Hodges and Lehmann [10].
3. ~~~~~~e.:t!}9 ~~~~~~~~~~ll~ ~! ~:~~~~o.? ~~ ~~~~;: c;.~~1?.~i",s,..~. The
results of the preoeding seotion have a somewhat limited scope of applioability
in the sense that they deem the sample sizes to be all equal.
The method to be
considered now overcomes this drawbaok, and may be regarded as a nonparametrio
generalization of the well kno'Wn 8-method of multiple oomparisons (of. Sohefee
[19]).
This method is essentially a confidenoe region prooedure whioh is based
primarily on the oonstruotion of a suitable simultaneous oonfidenoe region for
I
I
I
I
I
I
I
the set of all possible oontrasts among 9.
'"
In an earlier paper ([22]), the present author has oonsidered suoh a
simultaneous oonfidenoe region to the (0-1) parameters (~i - 91 for i = 2, ... ,0)
whioh may be used here. However, from computational standpoint this prooedure
appears to be a little involved in the sense that here we are faoed with a set
of (0-1) simultaneous equations in (0-1) unknowns, where eaoh single equation
is an involved funotion of all these (0-1) unknowns and is only asymptotioally
linear in them.
Thus, the usual method of iteration, whioh has to be mostly
adopted in suoh a oase, beoomes very tideous.
,.
t'.
,"
'I
-.
-15-
,
To remove this diffioulty, we
shall oonsider the following approaoh whioh is oomputationally muoh more simple,
and at the same time, asymptotioally equivalent to the preoeding one.
method will be very appropriate if one of the
0
This
populations may be regarded as
I
-16-
I.
I
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
••I
control and the rest as treatment group.
On the otherhand, if there is no such
natural control group, our procedure is based on selecting one of the c groups
as the standard or control group.
However, as we shall see later on that
asymptotically the procedure remains insensitive to the choice of any arbitary
control group.
In this situation, the sample sizes nl , ... , nc are not necessarily equal.
Let us denote by
Nij = ni + n j for i
(3.1)
=j =1,
••• , c and N
=nl
+ ••• + nc •
Then, tor the (i, j)th semples, we define h (X., Xj ) precisely in the same
n ""'). ""
replaces l1n in (2.3)
ij
The mean of the entries in !Nij is
manner as in (2.8), with the only change that
here.~
and have Nij instead of 2n) elements.
denoted by ~ ,and as in (2.7), we assume that
ij
\~
(3.2)
ij
-lJ.I=o(N-i),
where we define lJ. (and A2 ) as in (2.5). Let us also define
for i
~ j
=1,
••• , c.
Conventionally, we let VN•ii = 0 for i = 1, ... , c and we regard the first
sample to constitute the control group. Then, we define
where b"ij is the usual Kronecker delta.
1
c
\TN.l. - N i~lni VN•li
If we write
c
N i:2ni VN•li
1
=
'
I
I.
I
I
I
I
I
I
I
-17-
then (.3.4) may also be written as
(.3.6)
We may adopt a similar permutation approaoh (as in seotion 2) to find out the
exact null distribution
of~.
On the otherhand, if N is large subjeot to
then by an adaptation of the same proof as in lemma 1 of [1], we readily arrive
at the oonolusion that under Ho in (J..2), SN' in (3.4), has asymptotioally a
ohi-square distribution with (0-1) d.f. Further, from the same lemma, it
follows that under the sequence of alternatives in (2.23)(with n replaoed by
N,) SN has asymptotioally a nonoentral t distribution with (0-1) degrees of
)e
freedom and the nonoentrality parameter
I
I
(3.8)
o
where ~ = 1::
1
•
ei
Ai' and B(J ,G) is defined in (2.27).
Now from (2.20), we have
~
I
I
--~
~e
I
••
So, if we define
(3.10)
Nij
VN •• (a) =
.~J
nj
[hN(Xi
N
) + aI~ni ,X
~j
'w.
~ij
] for i, j
= 1, ... ,
then it follows from (1.1), (3.5), (3.9) and (3.10) that for all g
,',;
0;
I
I.
I
I
I
I
I
I
I
-e
I
I
I
I
I
.-
•
,e
I
I
-18-
(3.11)
has the same distribution as of SN in (3.6) under Ho in (1.2). Further from
(2.9), (3.3) and (3.10) we m~ conclude that VN.ij(a) is t in a for all
i
=j = 1,
••• , c.
So if we denote by SN
,a
the upper 100a ./.
point of the
null distribution of SN (so that
p
SN ,a --,
~ Xl0-1 ,a where P ~XI 2 Xlc-1 ,a 1~ = a,),
then from (3.11) we get that
p. ~
l.
SN (9)
rv
< -N,a"J)
8.._
t 9 ?= 1
- a.
Now, equating hN(X1 + a IN ' X.) to K_ in the same way as in (2.25) and
1 /.,].
-N
1i
(2.26), we arrive at the Hodges-Lehmann [11] point estimate ~ij of 61i for
'"
i
tv
A
=2,
••• , c j the joint asymptotic normality of the standardized form of these
estimates follow precisely as in the proof of theorem 3.1 of [1]. Let us now
denote by
v
"..,
= (V2 ,
••• , V )
c
on the boundary of the ellipsoid in (3.13).
positive, we find out a value of 61i ,
For any particular
*
s~ ~li'
!'
if Vi is
such that
(3.15)
on the otherhand, if Vi is negative, we define
(3.16)
for each i
=2,
••• , c.
In this manner, any point V on the interior neighbourhood
I'V
I
-19-
II
I
I
I
I
I
I
-e
I
I
I
of the boundary of the allipsoid in (3.13) is mapped into a point
(3.17)
Further, (3.13) is the equation of a olosed oonvex set of points ~ SN~) J ' and
~
as eaoh VN• li (a) is
(3.18)
in a, i = 2, ••• , 0, it follows that the set of points
SN(Q) ~ SN,a:,
0(61 ) = ',.(612 , ... ,610 ):
will also be a olosed oonvex set in (612 , ••• , 610 ), having the property that
(3.19)
P ; (612 , ... , 610 )
€
\~ ~ = 1 -
0 (~.>
a.
For small samples, SN in (3.4) will have essentially a finite number of disorete
mass points and henoe on the interior boundary of the ellipsoid SN
will be only a finite number of points ;~'i'
< SN,a' there
So, if for eaoh of these points,
we find out (by the prooess in (3.15) and (3.16),) the ooreesponding (finite
number of) points ; ~t < , in (3.17), then the oonvex hull of these set of points
will be our desired simultaneous oonfidenoe region for (612 , "" 610 ), For
large samples, we get by a straight forward generalization of theorem 4 of
Hodges and Lehmann [11] along with an extension of theorem 1 of Sen [22] (to
more than one parameter) that the set 0~1.) in (3.18) reduoes to
/
B2 (J , G)
I
(3.20)
I
"
"
"
where ,~l. = (612 , ••• , 610 ) and 61 • = (612 ,.", 610 ) is the Hodges-Lehmann [11]
I
I
,I
= ,~61:
N'
estimate of !'l.'
(3.21)
2
A
0
0
L:
L:
Thus, if as in (2.25) we write (rep1aoing n by N)
..1.
0
N2 (6i J. - 6..
) 7> "'.~ j as N ~oo , for i, j
~J
"
"
.($'j -(.)N(61·-61·)(61·-61·)~t 1 '1,
i=2 j=2 ~ ~
J
~
~
J
J
0- ,aT
1
"
= 1,
••• , 0,
"'ij = N2(6ij - 6~j) fori :i: j = 1, ... , 0, ~1 = ( "'12''''''''10).
I
I.
I
I
I
I
I
I
I
"Ie
I
I
I
I
I
I
I
••
I
-20-
where A.. f S are all real and finite, then (3.20) reduces to
~J
c(~l.) = {~l:
,~
Y"
c
c _
-
2a
A Xc-l,
,.,
,.
C1 ;
.)(Al·-Al , )(Alj-Alj)~ 2
~
J
~
~
.
B (J ,G) ,
e. ( ~~J
..- e
Z
Z
i=2 j=2 ~
Consequently, if we attempt to estimate B(J,G) as in [22], separately for each
combination (1, i)th (i = 2, ••• , c) samples, and pool these together into a
single estimate:a(J,G), then an asymptotic simultaneous confidence region to
Al
IV
•
will be
We shall now use (3.18) or (3.23)(or (3.24),) to derive a simultaneous
confidence region to any number of contrasts in
c
Any contrast ~ =
~.
c
=
1..9
.~
__.
may
....J
Since c(~l.) in (3.18) is a closed
also be written as -Z I.·~l·
-z2 I.'~l"
1 ~ ~
~
~
convex set, its convex hull is contained within the intersection of all the
c
supporting hyperplanes of c(~l)' Thus, giv-en any Z I.'~l"
;u
•
2
~
~
(which represents
the equation of a (c-2)-dimensional hyperplane,) we can always find two parallel
c
(c-2)-dimensional hyperplanes haiing the equations Z I.'~l' = c., j = 1, 2; such
2
~
~
J
that the oonvex hull of c(~l.) is contained within the (c-l)-dimensional steip
defined by these two hyperplanes.
Then, if we let cl
generality), we get the condidence interval
< c2
(without any loss of
(3.25)
whose confidence coefficient will be at least as large as 1 - a, whatever be the
number of contrasts, we work with.
In general, c1' c 2 will depend not only on
Up ... , I.c ) but also the convex hull of the set c(~l.)' For large samples,
(3.25) simplifies to a great extent. For this, let us define ,
I
I.
I
I,
I
I
-21-
$
(3.26)
2
2
2
"'2
2
"'2
= A Xl c-1 ,aIB (J ,G) and ~ = A Xl c-1 ,aIB (J ,G).
Then, it follows from (3.23) and a few simple adjust-ments that the equations of
c
the two parallel (c-2)-dimensional hyperplanes ~ ,(i6li = c j ' j=1,2, reduce to
;------_..
_'----_._~_.
--'-.
c
'"
\icc
/'
= j~2,(j(~lj-~lj) = ±$ V ~ ~ ,(i,(j(l/fl+
?t/ (?i)
(3.27)
=
I
I
I
~e
=
- ,Ic ,2·
± d, 1::1 "i I ([: i
<)
•
Consequently, asymptotically the probability is 1 - a that the inequalities
(3.28)
c
hold simultaneously for all ~ = 1::
,
'"
Since;;
1
, 6e
/..-9
,1 i
I
;h
I
,where
, in (3.26), oonverges stochastically to
get that asymptotically the probability is 1 - a
I
I
I
I
hold simultaneously for allOt
S
£" is
defined in (3.26).
independently of ~
~ ~
I
£
i'
inequalities
0 •
(3.29) may also be regarded as an extension of a similar result by Lehmann [14,
p. 1500] to a much wider class of rank order statistics.
Again, comparing (3.29) with the simultaneous confidenoe region provided
by the S-method of multiple comparison (of. Scheff' [20, pp. 68-71]), we
conclude that the asymptotic relative efficiency (A.R.E.) of the nonparametrio
I
we
I
I.
I
I
I
I
I-
I
I
-e
-
I
-22-
generalization with respeot to the parametrio 3-method is equal to
(3.30)
whioh is the same as in (2.32).
THEOREM 3.1.
Henoe, the theorem.
The oonolusions of theorem 2.2 also hold for the ncnparametrio
generalizations of S-method of multiple oomparison.
Thus, the disoussion made at the end of seotion 2 also apply to this oase.
Finally, regarding the oomparison of this method with the one proposed
by Dunn [5], we would like to point out the following.
(i)
Our method is valid even when 6~jIS, defined in (2.25), are not
necessarily zero, while as the proo9dure by Dunn assumes that Q is a null vector
(under Ho ) against the set of alternatives that at least one of the Qi in eaoh
group is different from at least one from the other. Our method is naturally
applioable in a more wider olass of situations.
(ii)
This is essentially a simultaneous oonfidence region based on the
principle of 3-method of multiple comparisons, while Dunn1s teohnique requiry,a
seleotion of a fixed number of oontrasts on whioh the level of signifioanoe to
,
eaoh oontrast depends.
j
•
~
•
...
-
This is not really justifiable in many oases, where we
may desire to test for any arbitrary number of oontrasts and in that oase, her
method will have some diffioulty to apply.
I
I.
I
I
I
I
I
I
I
Ie
REFERENCES
[1]
in linear models.
[2]
[3]
I
-
Math.
Statist~,
Simultaneous confidence interval
513-536.
CHERNOFF, H., and SAVAGE, 1. R. (1958).
Asymptotic normality and efficiency
of certain nonparametric test statistics.
[4]
DUNCAN, D. B. (1952).
Ann. Math. Statist.
~,
972-994.
~est •
On the properties of the multiple comparison
.V irginia Jour. of Sci • ..2, 49-67.
[5]
DUNN, o. J. (1964). Multiple comparisons using rank sums. Technometrics
g,
[6]
241-252.
DWASS, M. (1959).
Multiple confidence procedures. Ann. Inst. Statist
Math. lQ, 277-282.
[7)
DWASS, M. (1960).
Some k sample rank order tests.
Harold Hote11ing
volume. Stanford Univ. Press. 198-202.
[8]
GHOSH, M. N. (1955).
~,
-"'"
Ann.
On the estimation of contrasts
Ann. Math. Statist..2!a, 198-202.
BOSE, R. C., and Roy, S. N. (1953).
estimation.
I
......
BHUCHONGKUL, S., and PURl, M. L. (1965).
[9]
Simultaneous tests of linear Qypotheses. Biometrike
441-449.
HODGES, J. L. Jr., and LEHMANN, E. L. (1956).
nonparametric competitors of the t-test.
(10)
Ann. Math. Statist 27, 324-335.
HODGES, J. L. Jr., and LEHMANN, E. L. (1960).
score and Wilcoxon tests.
The efficiency of some
Comparison of the normal
Proc. 4th Berkeley Symp. Math. Statist. Probe
1,
307-317.
[11]
HODGES, J. L. Jr., and LEHMANN, E. L. (1963).
on rank tests.
Ann.
Math. Statist.
[12) LEHMANN, E. L. (1963).
I·
I
Math. Statist.
~,
~,
Estimates of location based
598-611.
Robust estimation in analysis of variance.
957-966.
Arm.:.
I
I.
I
'I
I
I
I
I
I
Ie
I
I
[1.3]
LEHMANN, E. L. (196.3).
Nonparametric confidence interval for a shift
Ann. Math. Statist. Jil:, 1507-1512.
parameter.
[14] LEHMANN, E. L. (196.3).
Asymptotically nonparametric inference:
alternative approach to linear models.
[15]
NANDI, H. K. (1951).
Ann. Math. Statist.
On the analysis of variance tests.
[16]
PEARSON, E. S., and HARTLEY, H.
statisticians.
[17]
[18]
(Edited).
PURl, M. L. (1964).
Caloutta Statist,
[19]
o. (1954). Biometrika tables for
Univ. Press London. vol. 1:
Asymptotio efficienoy of a olass of c sample tests.
Ann. Math, Statist.
J2, 102-121.
ROY, S. N. (1954).
Some further results in simultaneous confidenoe
interVal estimation.
SCHEFFE, H. (195.3).
variance.
Ann. Math. Statist. 25, 752-761.
A method for judging all contrasts in analysis of
Biometrika!J:Q, 87-104.
I
[20]
SCHEFFE, H. (1959).
The Analysis of Variance.
John Wiley and Sons Ino.
. New York
[21]
[22]
•
SEN, P. K. (196.3).
On the estimation of relative potency in dilution
SEN, P. K. (1966).
Biometrigs
two factor case.
[2.3]
12,
5.32-552.
On pooled estimation and testing of heterogeneity
of shift parameters by distribution-free methods.
=
I
Jil:, 1494-1505.
Asso. Bull. J, 10.3-114.
(-direct) assays by distribution free methods.
!
an
Part one: Univariate
Calcutta Statist. Asso. Bull. (Submitted).
SEN, P. K. (1966). On a distribution free method of estimating asymptotio
efficiency of a class of nonparametric tests. Ann. Math. Statist. (Submitted)
I
[24]
STEEL, R. G. D. (1960).
treatments.
[25]
Technometrics~,
STEEL, R. G. D. (1961).
Biometrics
A rank sum tests for comparing all pairs of
12,
5.39-552.
197-208.
Some rank sum multiple comparison tests.
]
~.
[26]
1
~
1
-1
I
t
I
1-
J
-I
J
J
J
I
1
,-
-I
TUKEY, J.
w.
(1953).
The problem of multiple comparisons.
Unpublished
manuscript. Princeton Univ.
[27]
WILCOXON, F. (1945).
Biometrics
[28]
Indiv'idual comparioon by ranking methods.
1, 80-83.
WILKS, S. S. (1962). Mathematical Statistics.
New York.
John Wiley and Sons, Inc.