ON PERMUTATIONAL CENTRAL LIMIT THEOREMS FOR
GENERAL MULTIVARIATE LINEAR RANK STATISTICS
by
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1358
September 1981
ON PERMUTATIONAL CENTRAL LIMIT THEOREMS FOR GENERAL MULTIVARIATE
LINEAR RANK STATISTICS
By PRANAB KUMAR SEN*
University of North Carolina, Chapel Hill
SUMMARY. For multivariate linear rank statistics, permutational central limit
theorems have been proved, mostly, either incorrectly or under unnecessarily
stringent regularity conditions. These theorems are revisited here with some
special emphasis on a novel martingale approach.
1. INTRODUCTION
In nonparametric multivariate analysis, permutational central limit theorems
(PCLT) play a vital role. In the context of multivariate multisample rank order
tests, Puri and Sen (1966, Theorem 4.1) considered a PCLT, and this was later [Puri
and Sen (1969, Theorem 3.2)] extended to general linear models. In either case, the
proof provided by these authors is not correct (as will be explained in Section 2);
nevertheless, the theorems remain valid under the stringent regularity conditions
stated there. These theorems are multivariate generalizations of the classical
(univariate) PCLT, which in its most general form is due to Hájek (1961). He was
able to incorporate a powerful (quadratic mean) equivalence result for linear
rank statistics and linear combinations of independent random variables (r.v.),
which provides the desired result through the classical CLT. Puri and Sen (1971,
Theorem 5.4.1) adapted this technique in the multivariate case; but their result
on the PCLT does not properly follow from the Hájek (1961) result, though the
conclusion on the unconditional null distribution remains true. A conclusion on
the PCLT in the multivariate case based on the moment-convergence property [cf.
Wald and Wolfowitz (1944)] is, of course, valid, but demands comparatively stringent
regularity conditions. All these call for a re-examination of multivariate PCLT.

* Work partially supported by the National Heart, Blood and Lung Institute,
Contract NIH-NHLBI-71-2243-L from the National Institutes of Health.
The object of the present investigation is to provide a systematic account of
multivariate PCLT's along with a novel martingale approach. The main results along
with the preliminary notions are presented in Section 2 and their proofs are
considered in Section 3. In this context, a martingale approach, developed in the
context of progressive censoring by Chatterjee and Sen(1973) and extended further
by Sen(1979), is incorporated in a convenient proof of a multivariate PCLT under
less stringent regularity conditions.
2. THE MAIN THEOREMS
Let $\mathbf{X}_i = (X_{i1}, \ldots, X_{ip})'$, $i = 1, \ldots, n$, be $n$ independent and identically
distributed (i.i.d.) r.v.'s with a continuous distribution function (d.f.) $F$,
defined on the Euclidean space $R^p$, where $p$ is a positive integer. Let $\mathbf{c}_i =
(c_{i1}, \ldots, c_{iq})'$, $i = 1, \ldots, n$, be a set of known (regression) vectors, where $q \ge 1$.
For each $j$ $(= 1, \ldots, p)$, let $R_{ij}$ be the rank of $X_{ij}$ among $X_{1j}, \ldots, X_{nj}$, for
$i = 1, \ldots, n$ (ties among the observations are neglected, in probability, as $F$ is
assumed to be continuous), and let $a_{nj}(1), \ldots, a_{nj}(n)$ be a set of scores. Then, a set
of multivariate linear rank statistics (LRS) $\mathbf{L}_n = ((L_{njk}))$ may be defined by
\[ L_{njk} = \sum_{i=1}^{n} (c_{ik} - \bar{c}_{nk})\, a_{nj}(R_{ij}), \qquad \bar{c}_{nk} = n^{-1} \sum_{i=1}^{n} c_{ik}, \tag{2.1} \]
for $j = 1, \ldots, p$ and $k = 1, \ldots, q$; we also write $\bar{\mathbf{c}}_n = (\bar{c}_{n1}, \ldots, \bar{c}_{nq})'$. In general,
because of the inter-dependence of the $p$ variates, $\mathbf{L}_n$ is not genuinely
distribution-free. However, it is permutationally (conditionally) distribution-free
under the following rank permutation model, due to Chatterjee and Sen (1964).
Consider the rank-collection matrix $\mathbf{R}_n$ (of order $p \times n$) specified by
\[ \mathbf{R}_n = (\mathbf{R}_1, \ldots, \mathbf{R}_n) = \begin{pmatrix} R_{11} & \cdots & R_{n1} \\ \vdots & & \vdots \\ R_{1p} & \cdots & R_{np} \end{pmatrix}, \tag{2.2} \]
where $\mathbf{R}_i = (R_{i1}, \ldots, R_{ip})'$ for $i = 1, \ldots, n$. Consider now a permutation of the
columns of $\mathbf{R}_n$ so that the top row is in the natural order (viz., $1, \ldots, n$), and
denote the resulting matrix (termed the reduced rank-collection matrix) by $\mathbf{R}_n^*$.
Note that the totality of $(n!)^p$ rank-collection matrices may thus be partitioned
into $(n!)^{p-1}$ subsets, where each subset corresponds to a particular reduced
rank-collection matrix $\mathbf{R}_n^*$ and the subset $S(\mathbf{R}_n^*)$ has cardinality $n!$. The conditional
distribution of $\mathbf{R}_n$ over the appropriate $S(\mathbf{R}_n^*)$ is uniform, irrespective of the
(continuous) d.f. $F$. We denote this conditional (permutational) probability
measure by $\mathcal{P}_n$. Then, we have [see Puri and Sen (1969)]
\[ E_{\mathcal{P}_n} \mathbf{L}_n = \mathbf{0} \quad \text{and} \quad V_{\mathcal{P}_n} \mathbf{L}_n = \mathbf{V}_n \otimes \mathbf{C}_n, \tag{2.3} \]
where $\mathbf{V}_n = ((V_{njj'}))_{j,j'=1,\ldots,p}$ has the elements
\[ V_{njj'} = (n-1)^{-1} \sum_{i=1}^{n} \{a_{nj}(R_{ij}) - \bar{a}_{nj}\}\{a_{nj'}(R_{ij'}) - \bar{a}_{nj'}\}, \tag{2.4} \]
\[ \bar{a}_{nj} = n^{-1} \sum_{i=1}^{n} a_{nj}(i), \quad j = 1, \ldots, p, \tag{2.5} \]
and $\mathbf{C}_n = ((C_{nkk'}))_{k,k'=1,\ldots,q}$ has the elements
\[ C_{nkk'} = \sum_{i=1}^{n} (c_{ik} - \bar{c}_{nk})(c_{ik'} - \bar{c}_{nk'}). \tag{2.6} \]
Our primary concern is to study the asymptotic multi-normality of $\mathbf{L}_n$ under
the permutation model $\mathcal{P}_n$; this is termed the multivariate PCLT.
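The moment identities in (2.3) can be checked exactly for a small case. The sketch below is an illustration only: it enumerates all $n!$ column permutations of one hypothetical reduced rank-collection matrix with $n = 4$, $p = 2$, $q = 2$, and compares the permutational mean and covariances of the statistics in (2.1) with the product of (2.4) and the centered cross-products of the $\mathbf{c}_i$. The data, the Wilcoxon-type scores $i/(n+1)$, the function names, and the reading of $\mathbf{C}_n$ as the matrix of centered cross-products are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import permutations


def centered(table):
    """Subtract the column means from an n x m table (exact arithmetic)."""
    n, m = len(table), len(table[0])
    means = [sum(Fraction(row[k]) for row in table) / n for k in range(m)]
    return [[Fraction(row[k]) - means[k] for k in range(m)] for row in table]


def score_table(ranks, scores):
    """Row i holds (a_n1(R_i1), ..., a_np(R_ip))."""
    return [[scores[j][r - 1] for j, r in enumerate(row)] for row in ranks]


def cross_products(x, y):
    """Matrix with entries sum_i x[i][j] * y[i][k]."""
    return [[sum(xi[j] * yi[k] for xi, yi in zip(x, y))
             for k in range(len(y[0]))] for j in range(len(x[0]))]


def lrs(ranks, scores, c):
    """The p x q matrix L_n of (2.1)."""
    return cross_products(score_table(ranks, scores), centered(c))


def permutation_moments_match(rstar, scores, c):
    """True if E(L_n) = 0 and Cov(L_njk, L_nj'k') = V_njj' * C_nkk' over all n! permutations."""
    n, p, q = len(rstar), len(scores), len(c[0])
    a = centered(score_table(rstar, scores))
    v = [[s / (n - 1) for s in row] for row in cross_products(a, a)]  # (2.4)
    cc = centered(c)
    c_n = cross_products(cc, cc)  # centered cross-products of the c_i
    first = [[Fraction(0)] * q for _ in range(p)]
    second, count = {}, 0
    for perm in permutations(range(n)):
        l = lrs([rstar[s] for s in perm], scores, c)
        count += 1
        for j in range(p):
            for k in range(q):
                first[j][k] += l[j][k]
                for j2 in range(p):
                    for k2 in range(q):
                        key = (j, k, j2, k2)
                        second[key] = second.get(key, 0) + l[j][k] * l[j2][k2]
    mean_ok = all(x == 0 for row in first for x in row)
    cov_ok = all(total / count == v[j][j2] * c_n[k][k2]
                 for (j, k, j2, k2), total in second.items())
    return mean_ok and cov_ok


# Hypothetical example: columns of R*_n (top rank in natural order), scores, regressors.
rstar = [(1, 3), (2, 1), (3, 4), (4, 2)]
wilcoxon = [Fraction(i, 5) for i in range(1, 5)]
scores = [wilcoxon, wilcoxon]
c = [(0, 1), (0, 2), (1, 4), (1, 8)]
print(permutation_moments_match(rstar, scores, c))
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation; the enumeration is only feasible for very small $n$.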
Since the $\mathbf{c}_i$ are specified vectors, without any loss of generality, we may
assume that the rank of $\mathbf{C}_n = ((C_{nkk'}))$ is $q$, for $n$ adequately large. More
specifically, we let $\mathbf{C}_n = \mathbf{D}_n \mathbf{Q}_n \mathbf{D}_n$, where $\mathbf{D}_n = \mathrm{Diag}(C_{n11}^{1/2}, \ldots, C_{nqq}^{1/2})$, and assume
that $\eta_n$, the smallest characteristic root of $\mathbf{Q}_n$, satisfies the following:
\[ \eta_n \ge \eta_0 > 0, \quad \text{for every } n \ge n_0 \ (\ge q). \tag{2.7} \]
Further, as in Majumdar and Sen (1978), we let
\[ \Delta_n = \max_{1 \le i \le n} \{ (\mathbf{c}_i - \bar{\mathbf{c}}_n)' \mathbf{C}_n^{-1} (\mathbf{c}_i - \bar{\mathbf{c}}_n) \} \tag{2.8} \]
and define the (extended) Noether condition as
\[ \Delta_n \to 0 \quad \text{as } n \to \infty; \tag{2.9} \]
also, the extended Hájek (1968) condition is defined by
\[ \sup_{n \ge n_0} \{ n \Delta_n \} \le \Delta^* < \infty. \tag{2.10} \]
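To see how (2.9) and (2.10) differ, the following sketch evaluates the maximum in (2.8) for $q = 1$, where it reduces to $\max_i (c_i - \bar{c}_n)^2 / \sum_i (c_i - \bar{c}_n)^2$. The two designs are hypothetical illustrations, not taken from the papers cited: a balanced two-sample design, for which $n$ times the maximum stays bounded, and the regressors $c_i = i^{-1/2}$, for which the maximum decays roughly like $1/\log n$, so that (2.9) appears to hold while (2.10) fails.

```python
def noether_ratio(c):
    """The maximum in (2.8) for q = 1: max_i (c_i - cbar)^2 / sum_i (c_i - cbar)^2."""
    n = len(c)
    cbar = sum(c) / n
    dev2 = [(x - cbar) ** 2 for x in c]
    return max(dev2) / sum(dev2)


def two_sample(n):
    """Hypothetical balanced two-sample design: c_i is a group indicator."""
    return [0.0] * (n // 2) + [1.0] * (n - n // 2)


def slow_decay(n):
    """Hypothetical regressors c_i = i^(-1/2), dominated by the first few points."""
    return [i ** -0.5 for i in range(1, n + 1)]


for n in (100, 10_000):
    for name, design in (("two-sample", two_sample), ("slow decay", slow_decay)):
        ratio = noether_ratio(design(n))
        print(f"{name:>10}  n={n:>6}  max ratio={ratio:.4f}  n * max ratio={n * ratio:.1f}")
```

For the balanced two-sample design the ratio is exactly $1/n$, so the supremum in (2.10) is finite; for the second design the product $n$ times the ratio grows with $n$.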
In the multi-sample case, treated by Puri and Sen (1966), (2.10) holds, while in
the linear model case, Puri and Sen (1969) assumed that (2.10) holds; this may
not be really needed. In both these papers, sufficiently stringent conditions
on the scores were imposed which ensure that $\mathbf{V}_n$ converges in probability to
a positive definite (p.d.) matrix $\boldsymbol{\nu}$. Basically, for the multivariate PCLT,
it suffices to show that for every non-null $\mathbf{A}$ (of order $p \times q$), $\mathrm{Trace}(\mathbf{A}'\mathbf{L}_n)$ is
asymptotically normal (under $\mathcal{P}_n$). If we let $\mathbf{L}_n = (\mathbf{L}_n^{(1)}, \ldots, \mathbf{L}_n^{(q)})$, then Puri
and Sen (1966, 1969) considered arbitrary linear compounds of the form
$\lambda_1 \mathbf{L}_n^{(1)} + \cdots + \lambda_q \mathbf{L}_n^{(q)}$, expressed the same in the form of
\[ \sum_{i=1}^{n} g_{ni}\, [a_{n1}(R_{i1}), \ldots, a_{np}(R_{ip})]', \]
and then appealed to Theorem 7.1 of Hájek (1961) to show that under $\mathcal{P}_n$, the
latter (with suitable $g_{ni}$) is asymptotically multinormal. There appear to be
some flaws in these steps. First, one needs to consider an arbitrary linear
compound of all the $pq$ elements of $\mathbf{L}_n$ (not simply a linear combination of its
$q$ columns) in order that the Cramér-Wold characterization theorem applies. For
such a general $\mathrm{Trace}(\mathbf{A}'\mathbf{L}_n)$, their simplified form may not be obtainable.
Second, even otherwise, Theorem 7.1 of Hájek (1961) does not apply here. In
this case, we have vectors $\mathbf{R}_1, \ldots, \mathbf{R}_n$ whose joint distribution under $\mathcal{P}_n$ (being
different from their unconditional d.f.) does not conform to the model of Hájek
(1961), where $(R_1, \ldots, R_n)$ assumes all possible permutations of $(1, \ldots, n)$ with
the equal probability $(n!)^{-1}$ and the $\mathbf{a}_n(i)$, $i = 1, \ldots, n$, were $p$-vectors. This explains
the inadequacy of the proofs of the multivariate PCLT's in Puri and Sen (1966, 1969).
Puri and Sen (1971, Theorem 5.4.1) have sketched a different approach. Let $F_{[j]}$
be the $j$th marginal d.f. for $F$, for $j = 1, \ldots, p$, and let $\mathbf{U}_i = (U_{i1}, \ldots, U_{ip})'$ with
$U_{ij} = F_{[j]}(X_{ij})$, $j = 1, \ldots, p$, $i = 1, \ldots, n$. Also, for each $j$ $(= 1, \ldots, p)$, let
\[ a_{nj}^{0}(u) = a_{nj}(i) \quad \text{for } (i-1)/n < u \le i/n, \ i = 1, \ldots, n. \tag{2.11} \]
Finally, let
\[ \mathbf{L}_n^{0} = \sum_{i=1}^{n} [a_{n1}^{0}(U_{i1}), \ldots, a_{np}^{0}(U_{ip})]' (\mathbf{c}_i - \bar{\mathbf{c}}_n)'. \tag{2.12} \]
Then, by using the coordinatewise proof of Hájek (1961), Puri and Sen (1971) showed that
\[ \|(\mathbf{V}_n \otimes \mathbf{C}_n)^{-1/2}\| \, \|\mathbf{L}_n - \mathbf{L}_n^{0}\| \to 0, \ \text{in probability, as } n \to \infty, \tag{2.13} \]
so that the asymptotic normality (under $\mathcal{P}_n$) of $\mathbf{L}_n^{0}$ would ensure the same for $\mathbf{L}_n$.
However, the classical CLT may not apply to the conditional (permutational)
distribution of $\mathbf{L}_n^{0}$, and hence the proof (under $\mathcal{P}_n$) remains incomplete; though, the
asymptotic unconditional multinormality of $\mathbf{L}_n^{0}$, and hence of $\mathbf{L}_n$, would follow.
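From the above discussion, it seems desirable to formulate multivariate
PCLT's for linear rank statistics in an unambiguous manner and to provide valid
proofs for them. Towards this, we define
\[ \mathbf{a}_n^{(i)} = (a_{n1}(R_{i1}), \ldots, a_{np}(R_{ip}))', \quad i = 1, \ldots, n, \tag{2.14} \]
\[ \bar{\mathbf{a}}_n = (\bar{a}_{n1}, \ldots, \bar{a}_{np})', \tag{2.15} \]
\[ \gamma_n = \max_{1 \le i \le n} \{ (\mathbf{a}_n^{(i)} - \bar{\mathbf{a}}_n)' \mathbf{V}_n^{-1} (\mathbf{a}_n^{(i)} - \bar{\mathbf{a}}_n) \}, \tag{2.16} \]
where, for the time being, we assume that $\mathbf{V}_n$ is of full rank (otherwise, under
$\mathcal{P}_n$, $\mathbf{L}_n$ will have a degenerate d.f.). Note that the $\mathbf{a}_n^{(i)}$ and $\mathbf{V}_n$ are stochastic
in nature, and hence, unlike the case of $p = 1$, $\gamma_n$ is, in general, a r.v. Then,
we have the following.
Theorem 1. If $\gamma_n \Delta_n \to 0$ (in probability), as $n \to \infty$, with $\Delta_n$ defined by (2.8)
and $\gamma_n$ by (2.16), then, under $\mathcal{P}_n$, $\mathbf{L}_n$ is asymptotically (in probability) normal
with mean $\mathbf{0}$ and dispersion matrix $\mathbf{V}_n \otimes \mathbf{C}_n$.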
It may be noted that under (2.10), all we need for the above theorem to hold is
that $n^{-1} \gamma_n \to 0$ (in probability), as $n \to \infty$, with $\gamma_n$ as in (2.16), and this can be
established under conditions much weaker than the ones in Puri and Sen (1966, 1969,
1971). In the next theorem, we do not want to impose (2.10) and desire to
incorporate (2.9).
For this, we suppose that there exist score functions $\phi_j(u)$, $0 < u < 1$, $j = 1, \ldots, p$,
such that for the $a_{nj}^{0}(u)$, defined by (2.11),
\[ \max_{1 \le j \le p} \Big\{ \int_0^1 \{ a_{nj}^{0}(u) - \phi_j(u) \}^2 \, du \Big\} \to 0, \ \text{as } n \to \infty, \tag{2.17} \]
where, for each $j$ $(= 1, \ldots, p)$,
\[ \phi_j(u) = \phi_{j,1}(u) - \phi_{j,2}(u), \ 0 < u < 1, \ \text{where } \phi_{j,k}(u) \text{ is nondecreasing,} \]
\[ \text{absolutely continuous and square integrable inside } (0,1), \text{ for } k = 1, 2. \tag{2.18} \]
These two conditions are less stringent than the ones in Puri and Sen (1966, 1969).
Theorem 2. If (2.9), (2.17) and (2.18) hold, then, under $\mathcal{P}_n$, $\mathbf{L}_n$ is asymptotically
(in probability) normal with mean $\mathbf{0}$ and dispersion matrix $\mathbf{V}_n \otimes \mathbf{C}_n$.
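As an illustration of condition (2.17), the sketch below takes Wilcoxon-type scores $a_n(i) = i/(n+1)$ with $\phi(u) = u$ (a choice made here for illustration, not one singled out in the text) and evaluates $\int_0^1 \{a_n^0(u) - \phi(u)\}^2 du$ exactly, using the step-function definition (2.11). Since $\phi(u) = u$ is nondecreasing, absolutely continuous and square integrable, (2.18) holds with $\phi_{j,2} = 0$.

```python
from fractions import Fraction


def l2_gap_wilcoxon(n):
    """Exact value of the integral in (2.17) for a_n(i) = i/(n+1) and phi(u) = u.

    On ((i-1)/n, i/n] the step function of (2.11) equals t = i/(n+1), and
    the integral of (t - u)^2 over (lo, hi) is ((t - lo)^3 - (t - hi)^3) / 3.
    """
    total = Fraction(0)
    for i in range(1, n + 1):
        t = Fraction(i, n + 1)
        lo, hi = Fraction(i - 1, n), Fraction(i, n)
        total += ((t - lo) ** 3 - (t - hi) ** 3) / 3
    return total


for n in (10, 100, 1000):
    print(n, l2_gap_wilcoxon(n), float(l2_gap_wilcoxon(n)))
```

For these scores the sum collapses to $1/\{6n(n+1)\}$, which tends to 0 at the rate $n^{-2}$, so (2.17) is satisfied with room to spare.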
Proofs of these theorems along with other comments are presented in the
next section.
3. PROOFS OF THE THEOREMS
Let us first consider the proof of Theorem 1. For an arbitrary non-null matrix
$\mathbf{A}$ (of order $p \times q$), we like to show that under $\mathcal{P}_n$, $\mathrm{Tr}(\mathbf{A}'\mathbf{L}_n)$ is asymptotically
normal. If we define the $\mathbf{a}_n^{(i)}$ and $\bar{\mathbf{a}}_n$ as in (2.14)-(2.15), we have then
\[ \mathrm{Tr}(\mathbf{A}'\mathbf{L}_n) = \sum_{i=1}^{n} \sum_{j=1}^{p} \sum_{k=1}^{q} (c_{ik} - \bar{c}_{nk}) A_{jk} [a_{nj}(R_{ij}) - \bar{a}_{nj}] = \sum_{i=1}^{n} \mathbf{d}_i' [\mathbf{a}_n^{(i)} - \bar{\mathbf{a}}_n], \tag{3.1} \]
where
\[ d_{ij} = \sum_{k=1}^{q} A_{jk} (c_{ik} - \bar{c}_{nk}), \quad 1 \le j \le p, \ 1 \le i \le n, \]
and $\mathbf{d}_i = (d_{i1}, \ldots, d_{ip})'$ for $i = 1, \ldots, n$. Note that by (2.3), $E_{\mathcal{P}_n} \mathrm{Tr}(\mathbf{A}'\mathbf{L}_n) = 0$ and
\[ E_{\mathcal{P}_n} (\mathrm{Tr}(\mathbf{A}'\mathbf{L}_n))^2 = \sum_{j=1}^{p} \sum_{j'=1}^{p} \sum_{k=1}^{q} \sum_{k'=1}^{q} A_{jk} A_{j'k'} V_{njj'} C_{nkk'} = \tau_n^2, \ \text{say.} \tag{3.2} \]
Express the reduced rank-collection matrix $\mathbf{R}_n^*$ as $(\mathbf{R}_1^*, \ldots, \mathbf{R}_n^*)$, so that the first
element of $\mathbf{R}_i^*$ is equal to $i$, for $i = 1, \ldots, n$. Define then $S_1, \ldots, S_n$ by letting
\[ R_{i1}^* = R_{S_i 1} = i, \quad \text{for } i = 1, \ldots, n. \tag{3.3} \]
Further, let
\[ \mathbf{b}_n(i; \mathbf{R}_n^*) = \mathbf{a}_n^{(S_i)} - \bar{\mathbf{a}}_n, \quad \text{for } i = 1, \ldots, n, \tag{3.4} \]
and let $\mathbf{Y}_n = (Y_{n1}, \ldots, Y_{nn})$ be a random vector which takes on each permutation
of $(1, \ldots, n)$ with the equal probability $(n!)^{-1}$. Then, the permutation distribution
of $\mathrm{Tr}(\mathbf{A}'\mathbf{L}_n)$ agrees with the distribution of
\[ Z_n = \sum_{i=1}^{n} \mathbf{d}_i' \mathbf{b}_n(Y_{ni}; \mathbf{R}_n^*) \quad [\text{given } \mathbf{R}_n^*]. \tag{3.5} \]
Since under $\mathcal{P}_n$, $\mathbf{R}_n^*$ is held fixed, while the vector $\mathbf{Y}_n$ has the discrete uniform
distribution [over the set of permutations of $(1, \ldots, n)$], we may now virtually
repeat the proof of Theorem 3 of Hoeffding (1951) and obtain the asymptotic
normality of $Z_n$ (given $\mathbf{R}_n^*$) under the sole condition that $\gamma_n \Delta_n \to 0$ as $n \to \infty$.
Since $\gamma_n$ is a r.v., whenever $\gamma_n \Delta_n \to 0$, in probability, as $n \to \infty$, the method of
moment proof of Hoeffding holds for all $\mathbf{R}_n^*$, excepting a subset with probability
tending to 0 as $n \to \infty$, and hence, the aforesaid normality holds, in probability.
This completes the proof of Theorem 1.
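The representation (3.5) lends itself to a quick Monte Carlo check. The following sketch is a rough illustration under stated assumptions, not part of the proof: it fixes one hypothetical reduced rank-collection matrix with $p = 2$ and $n = 60$, uses Wilcoxon-type scores $r/(n+1)$, a balanced two-sample design ($q = 1$) and the arbitrary coefficients $(1, -2)$, draws uniform permutations, and standardizes $Z_n$ by $\tau_n$ from (3.2). All names, sizes and the seed are choices made for the sketch.

```python
import random


def simulate_standardized(n=60, reps=4000, seed=1):
    """Draw Z_n / tau_n under the permutation model for one fixed R*_n (p = 2, q = 1)."""
    rng = random.Random(seed)
    # Hypothetical reduced rank-collection matrix: top row 1..n, second row a fixed shuffle.
    second_row = list(range(1, n + 1))
    rng.shuffle(second_row)
    # Centered Wilcoxon-type scores r / (n + 1); their average is exactly 1/2.
    b = [(i / (n + 1) - 0.5, second_row[i - 1] / (n + 1) - 0.5) for i in range(1, n + 1)]
    # Two-sample design (q = 1) and an arbitrary non-null coefficient matrix A (p x 1).
    c = [0.0] * (n // 2) + [1.0] * (n - n // 2)
    cbar = sum(c) / n
    coef = (1.0, -2.0)
    d = [tuple(a_j * (ci - cbar) for a_j in coef) for ci in c]
    # tau_n^2 = sum_{j,j'} A_j A_j' V_njj' C_n, the q = 1 case of (3.2).
    v = [[sum(row[j] * row[k] for row in b) / (n - 1) for k in range(2)] for j in range(2)]
    c_n = sum((ci - cbar) ** 2 for ci in c)
    tau = (c_n * sum(coef[j] * coef[k] * v[j][k] for j in range(2) for k in range(2))) ** 0.5
    order = list(range(n))
    draws = []
    for _ in range(reps):
        rng.shuffle(order)  # a uniform permutation of the column labels
        z = sum(d[i][j] * b[order[i]][j] for i in range(n) for j in range(2))
        draws.append(z / tau)
    return draws


draws = simulate_standardized()
coverage = sum(abs(z) < 1.96 for z in draws) / len(draws)
second_moment = sum(z * z for z in draws) / len(draws)
print(f"P(|Z_n/tau_n| < 1.96) ~ {coverage:.3f}, E(Z_n/tau_n)^2 ~ {second_moment:.3f}")
```

If the normal approximation is adequate for this configuration, the printed coverage should be near 0.95 and the second moment near 1; the exact figures depend on the seed and the chosen matrix.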
To prove Theorem 2, let $F_{[jj']}$ be the bivariate marginal d.f. for the $(j,j')$th
variates, for the d.f. $F$, for $j \ne j' = 1, \ldots, p$. Let then $\boldsymbol{\nu} = ((\nu_{jj'}))$ be defined by
\[ \nu_{jj'} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi_j(F_{[j]}(x))\, \phi_{j'}(F_{[j']}(y))\, dF_{[jj']}(x, y) - \bar{\phi}_j \bar{\phi}_{j'}, \tag{3.6} \]
for $j, j' = 1, \ldots, p$, where
\[ \bar{\phi}_j = \int_0^1 \phi_j(u)\, du, \quad \text{for } j = 1, \ldots, p. \tag{3.7} \]
Note that expressing (2.4) and (2.5) in the integral form involving the empirical
d.f.'s and then using (2.17)-(2.18) along with the usual Glivenko-Cantelli lemma
type result, we obtain that under (2.17) and (2.18),
\[ \mathbf{V}_n \to \boldsymbol{\nu}, \ \text{in probability, as } n \to \infty. \tag{3.8} \]
The proof of (3.8) is essentially similar to that of Theorem 3.1 of Puri and
Sen (1969), and hence, the details are omitted. Now, without any essential loss of
generality, we assume that $\boldsymbol{\nu}$ is positive definite (otherwise, the limiting
permutation distribution of $\mathbf{L}_n$ will be singular, in probability). Let us also
define the $S_i$ as in (3.3), and for every $k$: $1 \le k \le n$, let $\mathbf{S}_{n,k} = (S_1, \ldots, S_k)'$ and
let $\mathbf{S}_{n,0} = 0$. Let now $\mathcal{B}_{n,k}$ be the sigma-field generated by $\mathbf{S}_{n,k}$ (under $\mathcal{P}_n$),
for $k = 1, \ldots, n$, and let $\mathcal{B}_{n,0}$ be the trivial sigma-field. Let then
\[ \mathbf{L}_{n,k} = E_{\mathcal{P}_n}(\mathbf{L}_n \mid \mathcal{B}_{n,k}), \quad \text{for } k = 0, 1, \ldots, n. \tag{3.9} \]
At this stage, we appeal to Chatterjee and Sen (1973, Section 4), for the case
of $p = 1$, and Sen (1979, Section 2), for general $p \ge 1$, and obtain that
\[ \mathbf{L}_{n,k} = \sum_{i=1}^{k} [a_{n1}(R_{S_i 1}) - a_{n1}^*(k), \ldots, a_{np}(R_{S_i p}) - a_{np}^*(k)]' (\mathbf{c}_{S_i} - \bar{\mathbf{c}}_n)', \tag{3.10} \]
where
\[ a_{nj}^*(k) = (n-k)^{-1} \sum_{i=k+1}^{n} a_{nj}(R_{S_i j}), \quad 0 \le k \le n-1, \ j = 1, \ldots, p, \tag{3.11} \]
and, conventionally, we let $a_{nj}^*(n) = 0$, for $j = 1, \ldots, p$. By (3.9), we conclude
that under $\mathcal{P}_n$, for every $n$,
\[ \{ \mathbf{L}_{n,k}, \mathcal{B}_{n,k};\ 0 \le k \le n \} \ \text{is a zero mean martingale.} \tag{3.12} \]
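The conditional expectation in (3.9) can be computed by brute force for a small case and compared with a closed form of the type (3.10). In the sketch below, the centering term is taken to be the average of the scores attached to the columns of $\mathbf{R}_n^*$ not yet observed, that is, $(n-k)^{-1} \sum_{i>k} a_{nj}(R_{S_i j})$; this reading is an assumption of the sketch, obtained by averaging $\mathbf{c}_{S_i} - \bar{\mathbf{c}}_n$ over the unobserved labels. The data ($n = 5$, $p = 2$, $q = 1$), the scores and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def closed_form(prefix, a, c):
    """Candidate L_{n,k} for q = 1; prefix = (S_1, ..., S_k) with 0-based labels."""
    n, p, k = len(a), len(a[0]), len(prefix)
    cbar = Fraction(sum(c), n)
    if k < n:
        # Average score over the columns of R*_n that are still unobserved.
        astar = [sum(a[l][j] for l in range(k, n)) / (n - k) for j in range(p)]
    else:
        astar = [Fraction(0)] * p  # convention for k = n
    return [sum((a[i][j] - astar[j]) * (c[s] - cbar) for i, s in enumerate(prefix))
            for j in range(p)]


def brute_force(prefix, a, c):
    """E(L_n | S_1, ..., S_k): average of L_n over all completions of the prefix."""
    n, p = len(a), len(a[0])
    cbar = Fraction(sum(c), n)
    rest = [s for s in range(n) if s not in prefix]
    totals, count = [Fraction(0)] * p, 0
    for tail in permutations(rest):
        s_full = list(prefix) + list(tail)
        for j in range(p):
            totals[j] += sum(a[i][j] * (c[s_full[i]] - cbar) for i in range(n))
        count += 1
    return [t / count for t in totals]


def formulas_agree(a, c):
    """Compare both expressions for every k = 0, ..., n and every possible prefix."""
    n = len(a)
    return all(closed_form(prefix, a, c) == brute_force(prefix, a, c)
               for k in range(n + 1)
               for prefix in permutations(range(n), k))


# Row i of `a` holds the scores of column i+1 of a hypothetical R*_n (scores r/6).
a = [[Fraction(i + 1, 6), Fraction(r2, 6)] for i, r2 in enumerate([3, 1, 5, 2, 4])]
c = [1, 2, 4, 8, 16]
print(formulas_agree(a, c))
```

Because each $\mathbf{L}_{n,k}$ is a conditional expectation of the same $\mathbf{L}_n$ given a growing family of sigma-fields, agreement of the two expressions also confirms the martingale property for this example through the tower property.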
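Thus, to prove Theorem 2, it suffices to consider, for an arbitrary non-null $\mathbf{A}$
(of order $p \times q$), the partial sequence $\{ \mathrm{Tr}(\mathbf{A}'\mathbf{L}_{n,k});\ 0 \le k \le n \}$ and verify the conditions
for the martingale central limit theorem [viz., Dvoretzky (1972)]. Note that
by (3.12), under $\mathcal{P}_n$, $\{ \mathrm{Tr}(\mathbf{A}'\mathbf{L}_{n,k}),\ 0 \le k \le n \}$ also forms a martingale sequence. Hence,
if we define
\[ Y_{n,k} = \mathrm{Tr}(\mathbf{A}'\mathbf{L}_{n,k}) - \mathrm{Tr}(\mathbf{A}'\mathbf{L}_{n,k-1}), \quad k = 1, \ldots, n, \tag{3.13} \]
then, it suffices to show that for $\tau_n^2$, defined by (3.2), as $n \to \infty$,
\[ \Big\{ \sum_{i=1}^{n} E_{\mathcal{P}_n} [\, Y_{n,i}^2 \mid \mathcal{B}_{n,i-1} ] \Big\} \Big/ \tau_n^2 \ \overset{p}{\to} \ 1 \tag{3.14} \]
and
\[ \Big\{ \sum_{i=1}^{n} E_{\mathcal{P}_n} [\, Y_{n,i}^2\, I(|Y_{n,i}| > \varepsilon \tau_n) ] \Big\} \Big/ \tau_n^2 \ \overset{p}{\to} \ 0, \quad \forall\, \varepsilon > 0. \tag{3.15} \]
Now, (3.14) and (3.15) follow (as a direct vector-extension) precisely on the same
line as in (4.32) through (4.40) of Sen (1979), and hence, the proof is complete.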
We conclude this section with the remark that this martingale approach adapted
from Chatterjee and Sen(1973) and Sen(1979), besides providing a valid proof of the
PCLT in the multivariate case, avoids the computational complications and the
extra regularity condition (2.10) of the moment-convergence approach in Theorem 1.
REFERENCES
CHATTERJEE, S.K. and SEN, P.K. (1964). Nonparametric tests for the bivariate
two-sample location problem. Calcutta Statist. Assoc. Bull. 13, 18-58.
--- and --- (1973). Nonparametric testing under progressive censoring. Calcutta
Statist. Assoc. Bull. 22, 13-50.
DVORETZKY, A. (1972). Central limit theorems for dependent random variables. Proc.
6th Berkeley Symp. Math. Statist. Prob., Univ. California Press, 2, 515-535.
HAJEK, J. (1961). Some extensions of the Wald-Wolfowitz-Noether theorem. Ann.
Math. Statist. 32, 506-523.
--- (1968). Asymptotic normality of simple linear rank statistics under
alternatives. Ann. Math. Statist. 39, 325-346.
HOEFFDING, W. (1951). A combinatorial central limit theorem. Ann. Math. Statist.
22, 558-566.
MAJUMDAR, H. and SEN, P.K. (1978). Nonparametric tests for multiple regression
under progressive censoring. Jour. Multivar. Anal. 8, 73-95.
PURI, M.L. and SEN, P.K. (1966). On a class of multivariate multisample rank order
tests. Sankhya, Ser. A 28, 353-376.
--- and --- (1969). A class of rank order tests for a general linear hypothesis.
Ann. Math. Statist. 40, 1325-1343.
--- and --- (1971). Nonparametric Methods in Multivariate Analysis. John Wiley,
New York.
SEN, P.K. (1979). Rank analysis of covariance under progressive censoring.
Sankhya, Ser. A 41, 147-169.
WALD, A. and WOLFOWITZ, J. (1944). Statistical tests based on permutations of the
observations. Ann. Math. Statist. 15, 358-372.