Puri, Madan Lal and Pranab K. Sen (1967). "A class of rank-order tests for a general linear hypothesis."

A CLASS OF RANK-ORDER TESTS FOR A
GENERAL LINEAR HYPOTHESIS
by
Madan Lal Puri and Pranab Kumar Sen
Courant Institute of Mathematical Sciences,
New York University and University of North Carolina
Institute of Statistics Mimeo Series No. 551
September 1967
Research sponsored by the Office of Naval Research,
Contract Nonr-285(38), and by the National Institute
of Health, Public Health Service, Grant GM-12868-03.
Reproduction in whole or in part is permitted for
any purpose of the United States Government.
DEPARTMENT OF BIOSTATISTICS
UNIVERSITY OF NORTH CAROLINA
Chapel Hill, N. C.
A CLASS OF RANK-ORDER TESTS FOR A
GENERAL LINEAR HYPOTHESIS*
By Madan Lal Puri and Pranab Kumar Sen
Courant Institute of Mathematical Sciences, New York University
and University of North Carolina at Chapel Hill
For a general multivariate linear hypothesis testing problem, a
class of permutationally distribution-free rank-order tests is
proposed and studied. Along with a multivariate cum multistatistics
generalization of Hájek's (1966) results on the asymptotic normality
of rank-order statistics (for regression alternatives), the
asymptotic distribution theory of the proposed tests and their
asymptotic power properties are studied. Asymptotic optimality of
the proposed tests for specific alternatives is established, and a
characterization of the multivariate multisample location problem
[cf. Puri and Sen (1966)] in terms of the proposed linear hypothesis
is also considered.
* Research sponsored by the Office of Naval Research, Contract
Nonr-285(38), and by the National Institute of Health,
Public Health Service, Grant GM-12868-03. Reproduction
in whole or in part is permitted for any purpose of the
United States Government.
1. Introduction.

Let us consider a sequence of stochastic matrices
$E_\nu = (X_{\nu 1}, \ldots, X_{\nu N_\nu})$, $1 \le \nu < \infty$, where
$X_{\nu i} = (X_{\nu i}^{(1)}, \ldots, X_{\nu i}^{(p)})'$, $i = 1, \ldots, N_\nu$,
are independent stochastic vectors having continuous cumulative
distribution functions (cdf) $F_{\nu i}(x)$, $x \in R^p$,
$i = 1, \ldots, N_\nu$, respectively. It is assumed that

(1.1)  $F_{\nu i}(x) = F(x - \alpha - \beta c_{\nu i})$,  $1 \le i \le N_\nu$,  $1 \le \nu < \infty$,

where $\alpha' = (\alpha_1, \ldots, \alpha_p)$ and
$\beta = ((\beta_{jk}))_{j=1,\ldots,p;\ k=1,\ldots,q}$ are unknown parameters and
$c_{\nu i} = (c_{\nu i}^{(1)}, \ldots, c_{\nu i}^{(q)})'$, $i = 1, \ldots, N_\nu$,
are known non-stochastic vectors. Having observed $E_\nu$ and
assuming some conditions on the $c_{\nu i}$'s (to be stated in Section 2),
we want to test the null hypothesis

(1.2)  $H_0: \beta = 0_{p \times q}$,

against the set of alternatives that $\beta$ is non-null.
A variety of tests for $H_0$ in (1.2), based on the assumed
normality of $F(x)$, is available in the literature [cf. Anderson
(1958, Chapter 8) and Rao (1965, Chapter 8)]; the likelihood
ratio test is one of the most widely adopted. In this paper,
the assumption of multinormality of $F(x)$ is relaxed and a
class of Chernoff-Savage-Hájek type rank-order tests is
proposed and studied. These tests are valid for all continuous
$F(x)$. In Section 2, along with the preliminary notions, these
rank-order statistics are defined. Section 3 is concerned
with a certain permutational invariance structure of the joint
distribution of $E_\nu$ (when $H_0$ in (1.2) holds), and this is then
utilized for the construction of permutationally distribution-free
tests for $H_0$ in (1.2). In Section 4, by a generalization
of Hájek's (1966) results, the asymptotic joint normality of
the proposed rank-order statistics (for nearby alternatives)
is established. Section 5 deals with the asymptotic power
properties of the proposed tests for nearby alternatives,
and Section 6 is concerned with the asymptotic optimality of
the proposed tests, granted certain conditions on $F(x)$. In
the last section, the relationship of the multivariate multisample
location problem with the linear hypothesis in (1.1) and (1.2)
is studied.
2. Preliminary notions and fundamental assumptions.

Let $R_{\nu i}^{(j)}$ be the rank of $X_{\nu i}^{(j)}$ among the $N_\nu$ observations
$X_{\nu 1}^{(j)}, \ldots, X_{\nu N_\nu}^{(j)}$, for $i = 1, \ldots, N_\nu$; by virtue of
the assumed continuity of $F(x)$, ties may be ignored, in
probability, for all $j = 1, \ldots, p$.
Let $\phi_{\nu,j}(t)$ be a function defined on $(0,1)$ and taking constant
values over the intervals $[i/N_\nu, (i+1)/N_\nu)$, $0 \le i \le N_\nu$, i.e.,
there is some function $\phi_j(t)$ defined for $0 < t < 1$, for which

(2.1)  $\phi_{\nu,j}(t)$  for  $(i-1)/N_\nu < t \le i/N_\nu$,  $i = 1, \ldots, N_\nu$;  $j = 1, \ldots, p$.
Define

(2.2)  $j = 1, \ldots, p$,

where $U_{\nu(1)} \le \cdots \le U_{\nu(N_\nu)}$ denote an ordered sample of $N_\nu$
observations from the rectangular distribution over $(0,1)$.
(We may also define $a_\nu^{(j)}(\alpha) = \phi_j(\alpha/(N_\nu+1))$ for
$\alpha = 1, \ldots, N_\nu$ and $j = 1, \ldots, p$. However, as it is known that
the two definitions lead to asymptotically equivalent results and
the theory follows on almost parallel lines, for simplicity of
notation and for brevity of discussion, we will consider only the
scores defined by (2.2).)
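As an illustrative sketch only (it is not part of the original report), the coordinate-wise ranks and the alternative scores $\phi_j(\alpha/(N_\nu+1))$ mentioned in the parenthetical remark above might be computed as follows. The function names, the toy sample, and the choice of the Wilcoxon score function $\phi(u) = u$ and the normal score function (inverse standard normal cdf) are assumptions made for this example.

```python
from statistics import NormalDist


def coordinate_ranks(sample):
    """Return the p x N matrix of coordinate-wise ranks (1-based).

    `sample` is a list of N observations, each a sequence of p numbers.
    Ties are assumed absent, as in the continuous case.
    """
    n = len(sample)
    p = len(sample[0])
    ranks = [[0] * n for _ in range(p)]
    for j in range(p):
        # Indices of the observations sorted by their j-th coordinate.
        order = sorted(range(n), key=lambda i: sample[i][j])
        for rank, i in enumerate(order, start=1):
            ranks[j][i] = rank
    return ranks


def score_matrix(sample, phis):
    """p x N matrix whose (j, i) entry is phi_j(R_i^(j) / (N + 1))."""
    n = len(sample)
    ranks = coordinate_ranks(sample)
    return [[phis[j](r / (n + 1)) for r in ranks[j]] for j in range(len(phis))]


# Hypothetical bivariate sample of size N = 4 (illustrative values only).
sample = [(0.3, 2.1), (1.7, 0.4), (-0.5, 1.2), (0.9, 3.3)]
ranks = coordinate_ranks(sample)
# Wilcoxon scores for coordinate 1, normal scores for coordinate 2.
scores = score_matrix(sample, [lambda u: u, NormalDist().inv_cdf])
```

Using a different score function per coordinate mirrors the fact that the $\phi_j$ are indexed by $j$; nothing in this sketch depends on that particular choice.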
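Concerning the $\phi_{\nu,j}$'s, we assume that

(2.3) (I)  $\lim_{\nu \to \infty} \phi_{\nu,j}(t) = \phi_j(t)$ exists for all $0 < t < 1$ and
is not constant, for $j = 1, \ldots, p$ (where of course $N_\nu \to \infty$
as $\nu \to \infty$), and

(2.4) (II)  $\lim_{\nu \to \infty} \int_0^1 [\phi_{\nu,j}(t) - \phi_j(t)]^2\,dt = 0$  for all $j = 1, \ldots, p$.

Consider now

(2.5)  $S_{\nu,jk} = \sum_{i=1}^{N_\nu} c_{\nu i}^{(k)}\, a_\nu^{(j)}(R_{\nu i}^{(j)})$,  for $j = 1, \ldots, p$; $k = 1, \ldots, q$.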
Our proposed test for $H_0$ in (1.2) is based on the stochastic
matrix

(2.6)  $S_\nu = ((S_{\nu,jk}))_{j=1,\ldots,p;\ k=1,\ldots,q}$.

Concerning the $c_{\nu i}$'s, the following assumptions are made:

(2.7) (III)  $\sum_{i=1}^{N_\nu} c_{\nu i}^{(k)} = 0$  for $k = 1, \ldots, q$

(this can always be done by suitably adjusting $\alpha$ in (1.1)),

(2.8) (IV)  $\lim_{\nu \to \infty} \Big\{ \max_{1 \le i \le N_\nu} [c_{\nu i}^{(k)}]^2 \Big/ \sum_{i=1}^{N_\nu} [c_{\nu i}^{(k)}]^2 \Big\} = 0$  for all $k$,

(2.9) (V)  $\sup_{1 \le \nu < \infty} \Big\{ \sum_{i=1}^{N_\nu} [c_{\nu i}^{(k)}]^2 \Big\} < \infty$  for all $k$.
Let then $C_\nu = ((c_{\nu,kk'}))_{k,k'=1,\ldots,q}$, where

(2.10)  $c_{\nu,kk'} = \sum_{i=1}^{N_\nu} c_{\nu i}^{(k)} c_{\nu i}^{(k')}$,  $k, k' = 1, \ldots, q$,

and

(2.11) (VI)  Rank of $C_\nu = q$.

(In fact, by reparameterization in (1.1), $C_\nu$ can always be
made of full rank. Hence, (2.11) is no restriction on our
model.)
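Let us also define

(2.12)  $m_{\nu,jj'} = \frac{1}{N_\nu - 1} \Big\{ \sum_{i=1}^{N_\nu} a_\nu^{(j)}(R_{\nu i}^{(j)})\, a_\nu^{(j')}(R_{\nu i}^{(j')}) - \frac{1}{N_\nu} \Big[ \sum_{i=1}^{N_\nu} a_\nu^{(j)}(i) \Big] \Big[ \sum_{i=1}^{N_\nu} a_\nu^{(j')}(i) \Big] \Big\}$,  $j, j' = 1, \ldots, p$;

(2.13)  $M_\nu = ((m_{\nu,jj'}))_{j,j'=1,\ldots,p}$.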
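(2.14) (VII)  $P\{\text{Rank of } M_\nu = p\} \to 1$.

(Later on, we shall consider a theorem which establishes that
under certain conditions on $F(x)$, (2.14) holds.) Finally, let

(2.15)  $D_\nu = M_\nu \otimes C_\nu = ((d_{\nu,jk;j'k'}))$,  $d_{\nu,jk;j'k'} = m_{\nu,jj'}\, c_{\nu,kk'}$

(where $\otimes$ refers to the Kronecker product of two matrices).
By (2.9) and (2.14), we obtain that

(2.16)  Rank of $D_\nu = pq$.

We denote by $D_\nu^{-1} = ((d_{\nu,jk;j'k'}))^{-1} = ((d_\nu^{jk;j'k'}))$ the
reciprocal matrix of $D_\nu$. We may formally define our proposed
test statistic as

(2.17)  $L_\nu = \sum_{j=1}^{p} \sum_{j'=1}^{p} \sum_{k=1}^{q} \sum_{k'=1}^{q} d_\nu^{jk;j'k'}\, S_{\nu,jk}\, S_{\nu,j'k'}$,

and its rationale will be made clear in the next section.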
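The following is a minimal sketch, not the authors' own computation, of a quadratic form of the kind described above: the $pq$ linear rank statistics are arranged in a vector and weighted by the inverse of a Kronecker product of a $p \times p$ score covariance matrix and the $q \times q$ matrix of sums of products of the regression constants. The divisor $N - 1$ in the score covariance, the function names, and the toy input are assumptions of this example; exact rational arithmetic is used so that the small case can be checked by hand.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def rank_order_statistic(scores, design):
    """Quadratic form vec(S)' (M kron C)^{-1} vec(S).

    scores[j][i] : score attached to the rank of observation i in coordinate j
    design[i][k] : k-th regression constant of observation i (columns sum to 0)
    """
    p, n, q = len(scores), len(design), len(design[0])
    # Linear rank statistics S_{jk}, stacked with index (j, k) -> j * q + k.
    s = [sum(design[i][k] * scores[j][i] for i in range(n))
         for j in range(p) for k in range(q)]
    totals = [sum(row) for row in scores]
    # Score covariance matrix with divisor N - 1 (an assumption of this sketch).
    m = [[(sum(scores[j][i] * scores[h][i] for i in range(n))
           - totals[j] * totals[h] / n) / (n - 1)
          for h in range(p)] for j in range(p)]
    c = [[sum(design[i][k] * design[i][l] for i in range(n))
          for l in range(q)] for k in range(q)]
    # Kronecker product: entry ((j,k),(h,l)) equals m[j][h] * c[k][l].
    d = [[m[j][h] * c[k][l] for h in range(p) for l in range(q)]
         for j in range(p) for k in range(q)]
    x = solve(d, s)
    return sum(si * xi for si, xi in zip(s, x))


# Toy case p = q = 1, N = 4: ranks as scores, centred regression constants.
scores = [[Fraction(r) for r in (1, 2, 3, 4)]]
design = [[Fraction(-3, 2)], [Fraction(-1, 2)], [Fraction(1, 2)], [Fraction(3, 2)]]
L = rank_order_statistic(scores, design)
```

In this toy case the linear statistic equals 5, the score covariance is 5/3 and the sum of squared constants is 5, so the quadratic form is 25 divided by 25/3.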
Before that, we introduce the following notation.
Let $H(x)$ be a $p$-variate cdf whose univariate marginals are
denoted by $H_{[j]}(x)$ $(j = 1, \ldots, p)$ and whose bivariate (joint)
marginals are denoted by $H_{[j,j']}(x,y)$ for $j \ne j' = 1, \ldots, p$.
Further, let

(2.18)  $\mu_j = \int_0^1 \phi_j(t)\,dt$,  $j = 1, \ldots, p$,

(2.19)  $\eta_{jj} = \int_0^1 \phi_j^2(t)\,dt - \mu_j^2$,  $j = 1, \ldots, p$;
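(2.20)  $\eta_{jj'}(H) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi_j(H_{[j]}(x))\, \phi_{j'}(H_{[j']}(y))\, dH_{[j,j']}(x,y) - \mu_j \mu_{j'}$,  $j \ne j' = 1, \ldots, p$;

(2.21)  $\Lambda(H) = ((\eta_{jj'}(H)))_{j,j'=1,\ldots,p}$.

Finally, let

(2.22)  $\Gamma_\nu(H) = \Lambda(H) \otimes C_\nu$  and  $\Gamma(H) = \lim_{\nu \to \infty} \Gamma_\nu(H)$.

It follows from (2.9) and (2.11) that under the assumption

(2.23) (VIII)  $\Lambda(H)$ is of rank $p$ and is finite,

$\Gamma_\nu(H)$ will also be of rank $pq$, and $\Gamma(H)$ is finite, provided
$\lim_{\nu \to \infty} C_\nu$ exists. However, $\sup_\nu \|\Gamma_\nu(H)\|$ will be
finite, by (2.9).

3. Permutation distribution of $S_\nu$ and the rationality of $L_\nu$.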
Under $H_0$ in (1.2), $E_\nu$ is composed of $N_\nu$ independent and
identically distributed random vectors. Hence, the joint distribution
of $E_\nu$ remains invariant under the $N_\nu!$ permutations of the $N_\nu$
vectors $X_{\nu 1}, \ldots, X_{\nu N_\nu}$ among themselves when (1.2) holds. We
consider now the basic rank-permutation principle, which is
essentially due to Chatterjee and Sen (1966) and discussed also
by Puri and Sen (1966). We define a $p \times N_\nu$ rank matrix $R_\nu$

(3.1)  $R_\nu = \begin{pmatrix} R_{\nu 1}^{(1)} & \cdots & R_{\nu N_\nu}^{(1)} \\ \vdots & & \vdots \\ R_{\nu 1}^{(p)} & \cdots & R_{\nu N_\nu}^{(p)} \end{pmatrix} = (R_{\nu 1}, \ldots, R_{\nu N_\nu})$,

where $R_{\nu i} = (R_{\nu i}^{(1)}, \ldots, R_{\nu i}^{(p)})'$, for $i = 1, \ldots, N_\nu$. Each row
of $R_\nu$ is a permutation of the numbers $1, \ldots, N_\nu$, there being
thus in all $(N_\nu!)^p$ possible realizations of $R_\nu$. Let us
rearrange the columns of $R_\nu$ in such a way that the first row
has elements $1, \ldots, N_\nu$ in the natural order, and denote the
corresponding matrix by $R_\nu^*$. $R_\nu$ is said to be permutationally
equivalent to a matrix $R_\nu^{(1)}$ if it is possible to obtain
$R_\nu^{(1)}$ from $R_\nu$ only by permutations of the columns of the latter.
Thus, corresponding to each $R_\nu^*$, there will be a set $\mathcal{S}(R_\nu^*)$
of $N_\nu!$ possible realizations of $R_\nu$, such that any member of this
set will be permutationally equivalent to $R_\nu^*$. Now, the probability
distribution of $R_\nu$ over the $(N_\nu!)^p$ possible realizations will
depend on $F(x)$, even when $H_0$ in (1.2) holds (unless $F(x)$ has
coordinate-wise independent marginals). Thus, rank statistics,
like $S_\nu$ in (2.6), are, in general, not distribution-free
under $H_0$ in (1.2). However, given a particular set $\mathcal{S}(R_\nu^*)$
(of $N_\nu!$ realizations), the conditional distribution of $R_\nu$
over the $N_\nu!$ permutations of the columns of $R_\nu^*$
would be uniform under $H_0$ in (1.2), i.e.,
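(3.2)  $P\{R_\nu = r \mid \mathcal{S}(R_\nu^*)\} = 1/N_\nu!$  for all $r \in \mathcal{S}(R_\nu^*)$,

whatever be $F(x)$. Let us denote by $\mathcal{P}_\nu$ the permutational
(conditional) probability measure generated by the conditional
law in (3.2). Then, by routine computations (along the lines
of Puri and Sen (1966)), we obtain that

(3.3)  $E\{S_\nu \mid \mathcal{P}_\nu\} = 0_{p \times q}$,

(3.4)  $E\{S_{\nu,jk}\, S_{\nu,j'k'} \mid \mathcal{P}_\nu\} = d_{\nu,jk;j'k'}$,

for $j, j' = 1, \ldots, p$, $k, k' = 1, \ldots, q$,
where the $d_{\nu,jk;j'k'}$'s are defined by (2.15).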
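The permutational moments can be checked by brute force for a very small sample. The sketch below is illustrative only: it enumerates all $N!$ joint permutations of the columns of a score matrix (with $q = 1$), computes the exact mean and covariance of the resulting linear rank statistics, and compares the covariance with the product of an $(N-1)$-divisor score covariance and the sum of squared regression constants. The function names, the toy ranks and the $(N-1)$ divisor are assumptions of this example.

```python
from fractions import Fraction
from itertools import permutations


def permutation_moments(scores, constants):
    """Exact mean vector and covariance matrix of S_j = sum_i c_i * a_j(i)
    when the columns of `scores` are permuted jointly and uniformly at random.

    scores[j][i] : score of observation i in coordinate j
    constants[i] : regression constant c_i (assumed to sum to zero)
    """
    p, n = len(scores), len(constants)
    count = 0
    mean = [Fraction(0)] * p
    second = [[Fraction(0)] * p for _ in range(p)]
    for perm in permutations(range(n)):
        # The same permutation is applied to every coordinate (whole columns move).
        s = [sum(constants[i] * scores[j][perm[i]] for i in range(n))
             for j in range(p)]
        count += 1
        for j in range(p):
            mean[j] += s[j]
            for h in range(p):
                second[j][h] += s[j] * s[h]
    mean = [v / count for v in mean]
    cov = [[second[j][h] / count - mean[j] * mean[h] for h in range(p)]
           for j in range(p)]
    return mean, cov


def closed_form_covariance(scores, constants):
    """m_{jh} * sum_i c_i^2, with m_{jh} the (N - 1)-divisor score covariance."""
    p, n = len(scores), len(constants)
    c2 = sum(c * c for c in constants)
    totals = [sum(row) for row in scores]
    return [[c2 * (sum(scores[j][i] * scores[h][i] for i in range(n))
                   - totals[j] * totals[h] / n) / (n - 1)
             for h in range(p)] for j in range(p)]


# Ranks of N = 4 bivariate observations and centred constants (q = 1).
scores = [[Fraction(v) for v in (2, 4, 1, 3)],
          [Fraction(v) for v in (3, 1, 2, 4)]]
constants = [Fraction(-3, 2), Fraction(-1, 2), Fraction(1, 2), Fraction(3, 2)]
mean, cov = permutation_moments(scores, constants)
```

Full enumeration is only feasible for very small $N$, which is the practical motivation for a large-sample approximation.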
Since $S_\nu$ is a stochastic matrix, for actual test construction
it is more convenient to use a real-valued function of $S_\nu$ as
a test statistic. By an adaptation of the same arguments
as in Chatterjee and Sen (1966) and Puri and Sen (1966), we
may work with a positive semidefinite quadratic form in the $pq$
elements of $S_\nu$, where the discriminant of the quadratic form
is the inverse of the matrix $D_\nu$, which has the elements
$d_{\nu,jk;j'k'}$ in (3.4). This leads to the test statistic $L_\nu$,
defined by (2.17). $L_\nu$ will be stochastically large if $S_\nu$ is
stochastically different from $0$. For small values of $\nu$
(i.e., $N_\nu$), the conditional distribution of $L_\nu$, given $\mathcal{S}(R_\nu^*)$,
can be obtained with the aid of (3.2), and a conditionally
distribution-free test for $H_0$ in (1.2) based on $L_\nu$ can be
constructed. This, however, requires the evaluation of the
$N_\nu!$ realizations of $S_\nu$ (under $\mathcal{P}_\nu$), while $D_\nu$ is $\mathcal{P}_\nu$-invariant.
The task becomes prohibitively laborious for large values
of $\nu$, and we are forced to consider the following large sample
approach, in which we approximate the true permutation
distribution of $L_\nu$ (under $\mathcal{P}_\nu$) by a chi-square distribution
with $pq$ degrees of freedom (d.f.). As a basis of this study,
we consider the following.
THEOREM 3.1. Let $H_\nu(x) = (1/N_\nu) \sum_{i=1}^{N_\nu} F_{\nu i}(x)$, and let
$\Lambda_\nu = \Lambda(H)|_{H = H_\nu}$ be defined as in (2.21). Then, under the
assumptions (I) to (V) of Section 2, $[M_\nu - \Lambda_\nu] \to 0_{p \times p}$, in
probability, as $\nu \to \infty$. Thus, under the assumptions (VI) and (VII),
$D_\nu$ is positive definite, in probability, and
$D_\nu - \Gamma_\nu(H_\nu) \to 0_{pq \times pq}$, in probability, as $\nu \to \infty$.

Proof: The proof that $M_\nu - \Lambda_\nu \to 0_{p \times p}$, in probability, as
$\nu \to \infty$ follows more or less along the same lines as the proof of
Theorem 4.2 of Puri and Sen (1966), and hence, for brevity, the
details are omitted. The second part of the theorem is a
consequence of the first part and the results in
(2.9) through (2.23). Q.E.D.

THEOREM 3.2. Under $\mathcal{P}_\nu$, the joint distribution of the
elements of $S_\nu$ converges, in probability, to a multivariate
normal distribution with null means and covariances given by
(3.4).

Proof: The computation of the means and covariances of
the elements of $S_\nu$ follows directly from (3.3) and (3.4).
Since $D_\nu$ is positive definite, in probability (by the preceding
theorem), the asymptotic multinormality of the permutation
distribution of $S_\nu$ follows readily by an appeal to the
multivariate extension of the well-known
Wald-Wolfowitz-Noether-Hoeffding-Motoo-Hájek permutational central
limit theorems [cf. Hájek (1961, p. 522)]. Hence, the theorem.

By virtue of the preceding theorem, we arrive at the
following theorem through a few simple steps.

THEOREM 3.3. Under the conditions (I) to (VIII) of
Section 2, $L_\nu$, defined by (2.17), has (under $\mathcal{P}_\nu$)
asymptotically a permutation distribution converging, in
probability, to a chi-square distribution with $pq$ degrees of
freedom.

Hence we have the following large sample test procedure:

(3.5)  if $L_\nu \ge \chi^2_{pq,\alpha}$, reject $H_0$ in (1.2);  if $L_\nu < \chi^2_{pq,\alpha}$, accept $H_0$,

where $P\{\chi^2_t \ge \chi^2_{t,\alpha}\} = \alpha$ $(0 < \alpha < 1)$, the level of significance.
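A minimal sketch of the large sample decision rule follows. It is an illustration, not the report's procedure in code: to stay within the Python standard library it uses the closed-form upper tail of the chi-square distribution, which is available only when the number of degrees of freedom is even, and it phrases the rule through the asymptotic p-value (rejecting when the p-value does not exceed $\alpha$ is equivalent to the statistic reaching the upper $\alpha$ point). The function names and the numerical inputs are hypothetical.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with `df` d.f. >= x), closed form for even `df` only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def large_sample_test(statistic, p, q, alpha=0.05):
    """Reject H0 when the asymptotic p-value does not exceed alpha."""
    p_value = chi_square_upper_tail(statistic, p * q)
    return p_value <= alpha, p_value


# Hypothetical observed value of the statistic with p = 2, q = 1.
reject, p_value = large_sample_test(7.5, p=2, q=1)
```

For odd $pq$ a general chi-square routine would be needed; that case is deliberately left out of this sketch.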
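4. Asymptotic normality of $S_\nu$.

We introduce the following notations. Let

(4.1)  $H_\nu(x) = \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} F_{\nu i}(x)$,

and let $H_{\nu[j]}(x)$ be the marginal of the $j$th variate of $H_\nu(x)$,
for $j = 1, \ldots, p$. Also, let $F_{\nu i[j]}(x)$ and $F_{\nu i[j,j']}(x,y)$
be the univariate ($j$th) and bivariate ($(j,j')$th) marginals
corresponding to $F_{\nu i}(x)$, for $j \ne j' = 1, \ldots, p$. We define

(4.2)  $A_{jj}(i;rs) = \iint_{-\infty<x<y<\infty} F_{\nu i[j]}(x)[1 - F_{\nu i[j]}(y)]\, \phi'_j(H_{\nu[j]}(x))\, \phi'_j(H_{\nu[j]}(y))\, dF_{\nu r[j]}(x)\, dF_{\nu s[j]}(y)$
       $\quad + \iint_{-\infty<x<y<\infty} F_{\nu i[j]}(x)[1 - F_{\nu i[j]}(y)]\, \phi'_j(H_{\nu[j]}(x))\, \phi'_j(H_{\nu[j]}(y))\, dF_{\nu s[j]}(x)\, dF_{\nu r[j]}(y)$,

for $i, r, s = 1, \ldots, N_\nu$ and $j = 1, \ldots, p$;

(4.3)  $A_{jj'}(i;rs) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [F_{\nu i[j,j']}(x,y) - F_{\nu i[j]}(x) F_{\nu i[j']}(y)]\, \phi'_j(H_{\nu[j]}(x))\, \phi'_{j'}(H_{\nu[j']}(y))\, dF_{\nu r[j]}(x)\, dF_{\nu s[j']}(y)$,

for $j \ne j' = 1, \ldots, p$ and $i, r, s = 1, \ldots, N_\nu$; the subscript $\nu$
in (4.2) and (4.3) is understood. Let then

(4.4)  $\sigma_{\nu,jk;j'k'} = \frac{1}{N_\nu^2} \sum_{i=1}^{N_\nu} \sum_{r=1}^{N_\nu} \sum_{s=1}^{N_\nu} (c_{\nu r}^{(k)} - c_{\nu i}^{(k)})(c_{\nu s}^{(k')} - c_{\nu i}^{(k')})\, A_{jj'}(i;rs)$,

for $j, j' = 1, \ldots, p$ and $k, k' = 1, \ldots, q$; the corresponding
$pq \times pq$ matrix is denoted by $\Sigma_\nu$. Throughout this section it will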
be assumed that (i) $F(x)$ in (1.1) is absolutely continuous,
having a continuous density function $f(x)$, $x \in R^p$, and (ii)

(4.5)  $\Sigma_\nu$ is positive definite and finite.

Before presenting the main theorem of this section, we consider
the following lemma.
LEMMA 4.1. Under (1.1), (2.8) and (2.9),
$\Sigma_\nu - \Lambda(F) \otimes C_\nu \to 0_{pq \times pq}$ as $\nu \to \infty$. Hence, under (2.11)
and (2.23), $\Sigma_\nu$ is positive definite in the limit as $\nu \to \infty$.

Proof: By virtue of (2.8) and (2.9),

(4.6)  $\lim_{\nu \to \infty} \max_{1 \le i \le N_\nu} [c_{\nu i}^{(k)}]^2 = 0$  for all $k = 1, \ldots, q$.

From (4.2), (4.3), (4.6) and some routine computations, it
follows that

(4.7)  $\lim_{\nu \to \infty} \Big\{ \max_{1 \le i,r,s \le N_\nu}\ \max_{1 \le j,j' \le p} |A_{jj'}(i;rs) - \eta_{jj'}(F)| \Big\} = 0$,

where $\eta_{jj'}(F)$ is defined by (2.20). From (4.4), (4.7), (2.8)
and (2.9), we obtain after some simplifications that

(4.8)  $\sigma_{\nu,jk;j'k'} - c_{\nu,kk'}\, \eta_{jj'}(F) \to 0$  as $\nu \to \infty$,

for all $j, j' = 1, \ldots, p$; $k, k' = 1, \ldots, q$. This completes
the first part of the proof. The second part follows from
(2.11), (2.23) (with $H = F$), (4.8) and some well-known
results on limits of sequences. Q.E.D.
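Let us now introduce the following notations. Let

(4.9)  $B_{j(i';i)}(x) = \int_{-\infty}^{\infty} [u(y - x) - F_{\nu i[j]}(y)]\, \phi'_j(H_{\nu[j]}(y))\, dF_{\nu i'[j]}(y)$

(where $u(x)$ is equal to 1 or 0 according as $x$ is positive or not),
for $j = 1, \ldots, p$, $i, i' = 1, \ldots, N_\nu$. Also let

(4.10)  $Z_{\nu i}(jk) = \frac{1}{N_\nu} \sum_{i'=1}^{N_\nu} (c_{\nu i'}^{(k)} - c_{\nu i}^{(k)})\, B_{j(i';i)}(X_{\nu i}^{(j)})$,

for $j = 1, \ldots, p$; $k = 1, \ldots, q$ and $i = 1, \ldots, N_\nu$. Finally
let

(4.11)  $Z_{\nu,jk} = \sum_{i=1}^{N_\nu} Z_{\nu i}(jk)$,

for $j = 1, \ldots, p$; $k = 1, \ldots, q$.
Straightforward but somewhat lengthy computations yield that

(4.12)  $\mathrm{Cov}\{Z_{\nu,jk}, Z_{\nu,j'k'}\} = \sigma_{\nu,jk;j'k'}$,

for all $j, j' = 1, \ldots, p$; $k, k' = 1, \ldots, q$,
where the $\sigma_{\nu,jk;j'k'}$'s are defined by (4.4).

In this context, we now bring in the findings of Hájek
(1966), who considered the special case of $p = q = 1$ and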
discussed the asymptotic normality of the corresponding rank-order
statistic (which is just any one of the $S_{\nu,jk}$'s). Thus,
referring to his elegant paper and thereby avoiding the
details, we obtain that

(4.13)  $\lim_{\nu \to \infty} E\{[S_{\nu,jk} - E(S_{\nu,jk})] - Z_{\nu,jk}\}^2 / \mathrm{Var}(Z_{\nu,jk}) = 0$,

for all $j = 1, \ldots, p$; $k = 1, \ldots, q$. We shall utilize (4.13)
to establish a comparatively stronger result. With this end
in view, we first consider the following lemma.
LEMMA 4.2. Let $a_\nu = (a_{\nu 1}, \ldots, a_{\nu t})$ and $b_\nu = (b_{\nu 1}, \ldots, b_{\nu t})$,
$t \ge 1$, be two sequences of stochastic vectors, such that
$V(a_{\nu j} - b_{\nu j})/V(a_{\nu j}) \to 0$ as $\nu \to \infty$ for all $j = 1, \ldots, t$.
Then

as $\nu \to \infty$ for all $j, l = 1, \ldots, t$.

The proof follows by expressing $\mathrm{cov}(a_{\nu j}, a_{\nu l})$ as
$\mathrm{cov}(b_{\nu j}, b_{\nu l}) + \mathrm{cov}(b_{\nu j}, a_{\nu l} - b_{\nu l}) + \mathrm{cov}(a_{\nu j} - b_{\nu j}, b_{\nu l})$
$+ \mathrm{cov}(a_{\nu j} - b_{\nu j}, a_{\nu l} - b_{\nu l})$, applying the Cauchy-Schwarz
inequality to the second, third and fourth terms on the right-hand
side, and making use of the conditions stated in the lemma.

By virtue of (4.12), (4.13) and Lemma 4.2, it follows
that

for all $j, j' = 1, \ldots, p$; $k, k' = 1, \ldots, q$.

An immediate consequence of Lemma 4.2 is the following.

LEMMA 4.3. If $a_\nu$ has asymptotically a multinormal
distribution with mean vector $\mu_\nu$ and dispersion matrix $\Lambda_\nu$, and if
for each $j$ $(= 1, \ldots, t)$ $V(a_{\nu j} - b_{\nu j})/V(a_{\nu j}) \to 0$ as $\nu \to \infty$,
then $b_\nu$ has also asymptotically the same multinormal law.
Thus it follows from (4.13) and Lemmas 4.2 and 4.3 that
for proving the asymptotic normality of $S_\nu$, it is sufficient
to show that $(Z_{\nu,jk},\ j = 1, \ldots, p;\ k = 1, \ldots, q)$ has
asymptotically a multinormal distribution. This will be accomplished
by showing that any arbitrary linear combination of the $Z_{\nu,jk}$'s
has asymptotically a normal distribution. With this end in
view, we define

(4.15)  $Z_\nu = \sum_{j=1}^{p} \sum_{k=1}^{q} \ell_{jk}\, Z_{\nu,jk}$,

where the $\ell_{jk}$'s are real and finite and not all of them are zero.
From (4.10), (4.11) and (4.15), we obtain that

(4.16)  $Z_\nu = \sum_{i=1}^{N_\nu} g_i(X_{\nu i})$,

where

(4.17)  $g_i(X_{\nu i}) = \frac{1}{N_\nu} \sum_{r=1}^{N_\nu} \sum_{j=1}^{p} \sum_{k=1}^{q} \ell_{jk}\, (c_{\nu r}^{(k)} - c_{\nu i}^{(k)})\, B_{j(r;i)}(X_{\nu i}^{(j)})$,

for $i = 1, \ldots, N_\nu$. By definition, the $g_i(X_{\nu i})$ are independent
random variables. Further,

(4.18)  $\mathrm{Var}(Z_\nu) = \sum_{j=1}^{p} \sum_{j'=1}^{p} \sum_{k=1}^{q} \sum_{k'=1}^{q} \ell_{jk}\, \ell_{j'k'}\, \sigma_{\nu,jk;j'k'}$

is finite and positive by (4.5).
Proceeding then precisely along the same lines as in Hájek
(1966, Section 5 specifically, using a linear combination such as
(4.15) with his $Y_\nu$'s), it can be shown that under (4.5), the
random variables $\{g_i(X_{\nu i})\}$ satisfy the Lindeberg condition for
the applicability of the central limit theorem. Hence, we arrive
at the following.

THEOREM 4.1. Under the assumptions (I) - (V) of Section 2
and (4.5), $S_\nu - E S_\nu$ has asymptotically a multivariate normal
distribution with zero means and dispersion matrix $\Sigma_\nu$.

Now, under the model (1.1) and granted the assumptions
made in Section 2, it follows from routine computations that

for all $j = 1, \ldots, p$; $k = 1, \ldots, q$; and by Lemma 4.1, the
covariance matrix $\Sigma_\nu$ is asymptotically equivalent to
$\Lambda(F) \otimes C_\nu$. In the next section, we shall make use of this
result for studying the asymptotic properties of the test based
on $L_\nu$ in (2.17).
5. Asymptotic properties of the test.

for all $j, j' = 1, \ldots, p$, where the $\eta_{jj'}$'s are defined by (2.19),
(2.20) and (2.21). We denote by

Then, using Theorem 4.1, Lemma 4.1,
(4.19) and some well-known results on the limiting distributions
of quadratic forms in asymptotically normally distributed
random variables, it follows that

$L_\nu^* = \sum_{j=1}^{p} \sum_{j'=1}^{p} \sum_{k=1}^{q} \sum_{k'=1}^{q} S_{\nu,jk}\, S_{\nu,j'k'}\, \eta^{jj'}(F)\, c_\nu^{kk'}$

has asymptotically (under (1.1) and the assumptions of Section 2)
a non-central chi-square distribution with $pq$ degrees of
freedom and non-centrality parameter

where

(5.4)

Further, using Theorems 3.1 and 4.1, it is straightforward
to show that under the assumptions made in Section 2,
Thus, we arrive at the following theorem.

THEOREM 5.1. Under (1.1) and the assumptions made in
Section 2, the statistic $L_\nu$ has asymptotically a non-central
chi-square distribution with $pq$ degrees of freedom and
noncentrality parameter $\Delta_L$, defined by (5.3).

Theorem 5.1 may be used to study the asymptotic power
properties of the test in (3.5) and to compare its efficiency
relative to standard parametric tests. Since, in general
linear hypothesis testing problems, the different test criteria
do not have uniformly good or bad performance [cf. Anderson
(1958, Chapter 8)], it may be quite involved to give a neat
picture of the relative efficiency of the proposed tests with
respect to all the parametric ones. Among the notable
parametric tests, Roy's largest characteristic root criterion
is admissible and powerful against certain alternatives,
while the likelihood ratio test is so for certain other
alternatives. However, borrowing the ideas of Wald (1943),
it can be shown that, in the sense of Wald, the likelihood
ratio test has asymptotically the best average power (on
suitable ellipsoids in the parameters $\beta_{jk}$'s). In this sense,
the likelihood ratio test may be regarded as optimum. As such,
we will consider the asymptotic relative efficiency of our
proposed test with respect to the likelihood ratio test.
Following the notations of Anderson (1958, Chapter 8),
we define the log-likelihood criterion (adjusted by a suitable
multiplying factor) by $T_\nu$ (say), and conclude from his Theorem
8.6.2 that $T_\nu$ has asymptotically (under $H_0$ in (1.2)) a
chi-square distribution with $pq$ degrees of freedom. Hence, the
same test procedure as in (3.5) applies to the likelihood ratio
test (with $L_\nu$ replaced by $T_\nu$). We denote the covariance
matrix (assumed to be finite) of $F(x)$ by
$\Sigma(F) = ((\sigma_{jk}(F)))_{j,k=1,\ldots,p}$
and assume it to be positive definite. The corresponding
reciprocal matrix is denoted by $\Sigma^{-1}(F) = ((\sigma^{jk}(F)))$. Since the
likelihood ratio test criterion is exclusively based on linear
estimators of $\beta$, generalizing the method of approach of
Eicker (1963) and following essentially some routine steps, it
can be shown that under the assumptions made in Section 2,
$T_\nu$ has asymptotically a non-central chi-square distribution
with $pq$ degrees of freedom and noncentrality parameter

It is seen that in multivariate situations, the asymptotic
relative efficiency (A.R.E.) of the test based on $L_\nu$ with
respect to the one based on $T_\nu$, as measured by $\Delta_L/\Delta_T$, depends
not only on $((\nu_{jj'}(F)))$ and $((\sigma_{jj'}(F)))$ but also on $\beta$ and
$C_\nu$. The following points are worth noting in this context.

(i) If $p = 1$, no matter whatever be $\beta$ and $C_\nu$,

(5.7)

which happens to coincide with the usual efficiency expressions
for the two-sample location problem. A few particular cases
may be worth mentioning here. First, if $\phi(u) = u$ (i.e., the
Wilcoxon scores case), (5.7) reduces to
$12\,\sigma_{11}(F) \big( \int_{-\infty}^{\infty} f^2(x)\,dx \big)^2$,
which has known values (as well as lower bounds) for various
$F(x)$. Secondly, if $\phi(u)$ is the inverse of the standard
normal cumulative distribution function (i.e., the normal
scores case), (5.7) is always greater than or equal to unity,
the equality sign holding only when $F$ is normal. This clearly
indicates the asymptotic efficiency of the proposed tests
for $p = 1$.
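As a numerical illustration of the Wilcoxon scores expression $12\,\sigma_{11}(F)(\int f^2)^2$ quoted above, the sketch below evaluates it by the midpoint rule for two standard densities. The function names, integration limits and step counts are choices made for this example; the well-known closed-form values $3/\pi$ (normal) and $\pi^2/9$ (logistic) serve as reference points.

```python
import math


def wilcoxon_are(pdf, variance, lower, upper, steps=100_000):
    """12 * variance * (integral of pdf^2)^2, integral by the midpoint rule."""
    width = (upper - lower) / steps
    integral = sum(pdf(lower + (i + 0.5) * width) ** 2 for i in range(steps)) * width
    return 12.0 * variance * integral ** 2


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def logistic_pdf(x):
    # Symmetric density; abs() avoids overflow of exp for large negative x.
    e = math.exp(-abs(x))
    return e / (1.0 + e) ** 2


# Standard normal has variance 1; standard logistic has variance pi^2 / 3.
are_normal = wilcoxon_are(normal_pdf, 1.0, -12.0, 12.0)
are_logistic = wilcoxon_are(logistic_pdf, math.pi ** 2 / 3.0, -40.0, 40.0)
```

The integration range is truncated where the squared density is negligible, which is adequate for these two light-tailed examples but would need care for heavier tails.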
(ii) If

(5.8)  $F(x) = \prod_{j=1}^{p} F_{[j]}(x_j)$,

it readily follows that both $\Sigma(F)$ and $\nu(F)$ are diagonal
matrices and as such

(5.9)

Hence, if we use the normal scores for the $\phi_{\nu j}$'s, again (5.9)
becomes greater than or equal to unity, where the equality
sign holds only when $F(x)$ is multinormal.

(iii) If we use the normal scores for the $\phi_{\nu j}$'s, and if the
parent cdf $F(x)$ is a multinormal cdf, it readily follows from
(2.19), (2.20) and (5.1) that $\nu(F) = \Sigma(F)$, and hence, from
(5.3), (5.6) that $\Delta_L/\Delta_T = 1$. In other words, for parent
normal distributions, the use of normal scores leads to
asymptotically most efficient tests.

(iv) In general, for arbitrary $F(x)$, $\Delta_L/\Delta_T$ is bounded
below and above by the minimum and maximum characteristic
roots of $\Sigma(F)\nu^{-1}(F)$ (the proof follows by a straightforward
application of a theorem by Courant (cf. [11]) on the bounds
of the ratio of two quadratic forms). Now, the bounds of
$\Sigma(F)\nu^{-1}(F)$ have been studied in detail by the authors (Sen
and Puri (1967)) in connection with the multivariate one-sample
location problem. As such, to avoid repetition, the details
are omitted.
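The general fact invoked in (iv), that a ratio of two quadratic forms $x'Ax / x'Bx$ (with $B$ positive definite) lies between the smallest and largest characteristic roots of $B^{-1}A$, can be illustrated numerically in the $2 \times 2$ case. The matrices below are arbitrary illustrative values, not quantities taken from the paper, and the function names are hypothetical.

```python
import math


def generalized_roots_2x2(a, b):
    """Roots of det(a - lam * b) = 0 for symmetric 2x2 a and positive definite b."""
    det_a = a[0][0] * a[1][1] - a[0][1] ** 2
    det_b = b[0][0] * b[1][1] - b[0][1] ** 2
    # det(a - lam*b) = det_b * lam^2 - mid * lam + det_a
    mid = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2.0 * a[0][1] * b[0][1]
    disc = math.sqrt(max(mid * mid - 4.0 * det_a * det_b, 0.0))
    return (mid - disc) / (2.0 * det_b), (mid + disc) / (2.0 * det_b)


def quadratic_form(m, x):
    return sum(x[i] * m[i][j] * x[j] for i in range(2) for j in range(2))


# Arbitrary symmetric positive definite matrices (illustrative only).
a = [[2.0, 0.5], [0.5, 1.0]]
b = [[1.0, 0.3], [0.3, 1.5]]
low, high = generalized_roots_2x2(a, b)

# Evaluate the ratio over a grid of directions on the half circle.
ratios = []
for k in range(360):
    angle = math.pi * k / 360.0
    x = (math.cos(angle), math.sin(angle))
    ratios.append(quadratic_form(a, x) / quadratic_form(b, x))
```

Every sampled ratio falls inside the interval spanned by the two roots, which is the content of the bound; the ratio is scale-invariant, so directions on the half circle suffice.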
6. Asymptotic optimality of $L_\nu$ for certain $F(x)$.

We shall study the asymptotic optimality of $L_\nu$ for a class of
parent distributions. We assume that $F(x)$ has the absolutely
continuous density function $f(x)$, and define,
(6.1)  $f_i(x) = \frac{\partial}{\partial x_i} f(x)$,  for $i = 1, \ldots, p$.

(6.2)

Let then

(6.4)  $f'_{[i]}(x) = \frac{d}{dx} f_{[i]}(x)$,  for $i = 1, \ldots, p$.

We define the statistics

(6.5)  $U_{\nu,jk} = \sum_{i=1}^{N_\nu} c_{\nu i}^{(k)}\, [f_j(X_{\nu i}^{(1)}, \ldots, X_{\nu i}^{(p)}) / f(X_{\nu i})]$,

for $j = 1, \ldots, p$ and $k = 1, \ldots, q$. We also define

(6.6)

for all $j, j' = 1, \ldots, p$, and let $T_\nu = ((\tau_{\nu,jk;j'k'}))$ be the
$pq \times pq$ matrix whose elements are $c_{\nu,kk'}$ times the quantities
defined in (6.6), where $C_\nu$ is defined by (2.10). Finally let
$T_\nu^{-1} = ((\tau_\nu^{jk;j'k'}))$ be the reciprocal matrix of $T_\nu$, and

(6.7)  $U_\nu^* = \sum_{j=1}^{p} \sum_{j'=1}^{p} \sum_{k=1}^{q} \sum_{k'=1}^{q} \tau_\nu^{jk;j'k'}\, U_{\nu,jk}\, U_{\nu,j'k'}$.

Then by an adaptation of the line of approach of Wald (1943),
we can show that (i) under $H_0$ in (1.2), $U_\nu^*$ has asymptotically
a chi-square distribution with $pq$ degrees of freedom (provided
the matrix of the quantities in (6.6) is finite and positive
definite) and (ii) the test based on $U_\nu^*$ has asymptotically the
best average power (on the family of ellipsoids
(6.8)

We shall show that under certain conditions on $F(x)$, the test
based on $L_\nu$ has also asymptotically the best average power
on the same family of ellipsoids, provided the $\phi_j$'s are chosen
suitably. Our treatment deals with a set of sufficient
conditions for this asymptotic optimality, and the authors are
not aware of necessary conditions for the same.

(6.9)  $f_j(x)/f(x) = \sum_{j'=1}^{p} h_{jj'}\, f'_{[j']}(x_{j'}) / f_{[j']}(x_{j'})$,

where the $h_{jj'}$'s are real constants, not all zero, for $j = 1, \ldots, p$.
(6.9) holds for (i) the multivariate normal distribution,
(ii) any coordinate-wise independent distribution, and it may
also hold for many other distributions. Let us now define

(6.10)  $V_{\nu,jk} = \sum_{i=1}^{N_\nu} c_{\nu i}^{(k)}\, f'_{[j]}(X_{\nu i}^{(j)}) / f_{[j]}(X_{\nu i}^{(j)})$,

for $j = 1, \ldots, p$, $k = 1, \ldots, q$. If (6.9) holds, it follows
that the $U_{\nu,jk}$'s are linear functions of the $V_{\nu,jk}$'s, and
consequently, the quadratic form based on the $V_{\nu,jk}$'s (analogous to
$U_\nu^*$) will also have the same properties as that of $U_\nu^*$.
We now define

(6.11)  $\phi_j(u) = f'_{[j]}(F_{[j]}^{-1}(u)) / f_{[j]}(F_{[j]}^{-1}(u))$,  $0 < u < 1$,  $j = 1, \ldots, p$,
and define the $S_{\nu,jk}$'s as in (2.5). Following then Hájek's (1962)
elegant approach, it is seen that $S_{\nu,jk} - V_{\nu,jk}$ converges in
quadratic mean to zero for all $j = 1, \ldots, p$, $k = 1, \ldots, q$.
Consequently, on using Lemma 4.2 and some routine computations,
we obtain that for the $\phi_j$'s given by (6.11), $L_\nu$ in (2.17) is
stochastically equivalent to the quadratic form in the $V_{\nu,jk}$'s,
under (1.1) and the assumptions of Section 2. Since, under
(6.9), this quadratic form in the $V_{\nu,jk}$'s is also equal to $U_\nu^*$ in
(6.7), it follows that under (1.1), the assumptions made in
Section 2 and (6.9), (6.11),

(6.12)

Consequently, we arrive at the following.

THEOREM 6.1. Under (i) the model (1.1), (ii) the
assumptions made in Section 2, and (iii) (6.9) and (6.11),
$L_\nu$ has asymptotically the best average power on the family
of ellipsoids in the $\beta_{jk}$'s, defined by (6.8).

In particular, if $F(x)$ is normal, the condition (6.9)
is satisfied and (6.11) leads to the coordinate-wise normal
scores. Hence, $L_\nu$ based on normal scores statistics leads to
an asymptotically best test for the family of ellipsoids in
(6.8). A special case of this theorem (that is, for $p = 1$)
is dealt with in an interesting paper of Matthes and Truax (1965).
7. A characterization of the multivariate multisample location
problem.

Let $X_{k1}, \ldots, X_{kn_k}$ be $n_k$ independent and identically
distributed $p$-variate random variables having a continuous
$p$-variate cumulative distribution function $F_k(x)$ for
$k = 1, \ldots, c\ (\ge 2)$. In the multivariate multisample location
problem [cf. Puri and Sen (1966) and the other references cited
therein], we may let

(7.1)  $F_k(x) = F_1(x - \theta_k)$,  $k = 2, \ldots, c$,  $\theta_1 = 0$.

We define $N = n_1 + \cdots + n_c$, and consider a sequence of $N$
$p$-vectors $X_1, \ldots, X_N$, of which the first $n_1$ observations are
from the first sample, the next $n_2$ from the second sample, and
so on. Thus, (7.1) will correspond to (1.1) if we let
$q = c - 1$, $\beta_{jk} = \theta_{j,k+1}$ for $k = 1, \ldots, c-1$, and where the
$c_{\nu i}$'s are null vectors for all $1 \le i \le n_1$,
$c_{\nu i} = (0, 1, 0, \ldots, 0)'$ for $n_1 + 1 \le i \le n_1 + n_2$, and so on.
Thus, the results derived in this paper also generalize the
results of Puri and Sen (1966). The statistic $L_\nu$ in (2.17) can
be shown to be identical with the corresponding statistic in Puri
and Sen (1966), when (7.1) holds. Consequently, the efficiency
considerations made in Sections 5 and 6 also apply to the
multisample multivariate location problem.
REFERENCES

[1] Anderson, T. W. (1958). An Introduction to Multivariate
Statistical Analysis. New York: John Wiley.

[2] Chatterjee, S. K., and Sen, P. K. (1966). Nonparametric
tests for the multivariate multisample location problem.
S. N. Roy Memorial Volume (edited by R. C. Bose et al.).
University of North Carolina (in press).

[3] Chernoff, H., and Savage, I. R. (1958). Asymptotic
normality and efficiency of certain nonparametric test
statistics. Ann. Math. Statist. 29, 972-994.

[4] Eicker, F. (1963). Asymptotic normality and consistency
of the least squares estimators for families of linear
regressions. Ann. Math. Statist. 34, 447-456.

[5] Hájek, J. (1961). Some extensions of the
Wald-Wolfowitz-Noether theorem. Ann. Math. Statist. 32, 506-523.

[6] Hájek, J. (1962). Asymptotically most powerful rank
order tests. Ann. Math. Statist. 33, 1124-1147.

[7] Hájek, J. (1966). Asymptotic normality of simple linear
rank statistics under alternatives, I. (to be published).

[8] Matthes, T. K., and Truax, D. R. (1965). Optimal
invariant rank tests for the k-sample problem.
Ann. Math. Statist. 36, 1207-1222.

[9] Puri, M. L., and Sen, P. K. (1966). On a class of
multivariate multisample rank-order tests. Sankhya,
Ser. A 28, 353-376.

[10] Rao, C. R. (1965). Linear Statistical Inference and Its
Applications. New York: John Wiley.

[11] Sen, P. K., and Puri, M. L. (1967). On the theory of
rank order tests for location in the multivariate one
sample problem. Ann. Math. Statist. 38, 1216-1228.

[12] Wald, A. (1943). Tests of statistical hypotheses
concerning several parameters when the number of
observations is large. Trans. Amer. Math. Soc. 54, 426-482.