ON A CLASS OF NONPARAMETRIC TESTS FOR MANOVA IN TWO WAY LAYOUTS*

by Pranab Kumar Sen
University of North Carolina, Chapel Hill, and University of Calcutta

Institute of Statistics Mimeo Series No. 482, June 1966
Department of Biostatistics, University of North Carolina, Chapel Hill, N. C.

*Work supported by the Army Research Office, Durham, Grant DA-31-124-G432.

SUMMARY. The object of the present investigation is to propose and study a class of nonparametric tests for the multivariate analysis of variance (MANOVA) problem relating to complete two way layouts. In this context, the concept of rank-permutations for multidimensional interchangeability is developed, and the same is incorporated in the formulation of a class of genuinely distribution-free rank order tests. Asymptotic properties of the class of proposed tests are studied and compared with those of the standard parametric ones.

1. INTRODUCTION

Let us consider a complete two way layout comprising $n$ complete blocks (replicates), each block containing $r\ (\geq 2)$ plots where $r$ different treatments are applied. The yield (response) is a $p$-variate quantitative (stochastic) vector, and we denote by $X_{ij}^{(k)}$ the $k$th response for the $j$th treatment placed in the $i$th block, for $i = 1, \ldots, n$, $j = 1, \ldots, r$, $k = 1, \ldots, p$. In the sequel, it will be assumed that $p \geq 2$. Let then

$$\mathbf{X}_{ij}' = (X_{ij}^{(1)}, \ldots, X_{ij}^{(p)}), \quad i = 1, \ldots, n,\ j = 1, \ldots, r; \tag{1.1}$$

$$\boldsymbol{\mu}' = (\mu^{(1)}, \ldots, \mu^{(p)}); \tag{1.2}$$

$$\boldsymbol{\alpha}_i' = (\alpha_i^{(1)}, \ldots, \alpha_i^{(p)}), \quad i = 1, \ldots, n; \tag{1.3}$$

$$\boldsymbol{\tau}_j' = (\tau_j^{(1)}, \ldots, \tau_j^{(p)}), \quad j = 1, \ldots, r; \tag{1.4}$$

$$\mathbf{e}_{ij}' = (e_{ij}^{(1)}, \ldots, e_{ij}^{(p)}), \quad j = 1, \ldots, r,\ i = 1, \ldots, n. \tag{1.5}$$

We adopt the usual linear model as

$$\mathbf{X}_{ij} = \boldsymbol{\mu} + \boldsymbol{\alpha}_i + \boldsymbol{\tau}_j + \mathbf{e}_{ij}, \quad i = 1, \ldots, n,\ j = 1, \ldots, r, \tag{1.6}$$

where $\boldsymbol{\mu}$ is the vector of mean effects, $\boldsymbol{\alpha}_i$ the block effects ($i = 1, \ldots, n$), $\boldsymbol{\tau}_j$ the treatment effects ($j = 1, \ldots, r$), and $\mathbf{e}_{ij}$ the residual error vectors ($i = 1, \ldots, n$, $j = 1, \ldots, r$). These component vectors are assumed to be mutually independent. Our problem is to have a comprehensive test for the hypothesis of no treatment effects, i.e.,

$$H_0: \boldsymbol{\tau}_1 = \cdots = \boldsymbol{\tau}_r = \mathbf{0}. \tag{1.7}$$

In the parametric case, it is usually assumed that $\mathbf{e}_{ij}$ ($i = 1, \ldots, n$, $j = 1, \ldots, r$) are $N\ (= nr)$ independent and identically distributed stochastic vectors, distributed according to a multinormal distribution with a null mean vector and a (positive definite) dispersion matrix $\boldsymbol{\Sigma} = ((\sigma_{kq}))$, where $\sigma_{kq}$ is the covariance of $(e_{ij}^{(k)}, e_{ij}^{(q)})$, for $k, q = 1, \ldots, p$. The parametric MANOVA tests are either based on the likelihood ratio criterion or on the characteristic roots of some determinantal equations. The likelihood ratio criterion reduces to the ratio of two generalized variances and can be expressed as the product of several ($p$) independent beta variables (cf. Anderson (1958, Chapter 8)). Alternatively, one may work with the smallest characteristic root of the determinantal equation involving the same generalized variances. Occasionally, some symmetric function of the roots is also used. For details, the reader is referred to Rao (1965, Chapter 8). The parametric tests thus appear to be deterministic, but they are not very simple, especially for $p > 2$. Further, in this procedure the assumptions of independence and multinormality of the error vectors play an indispensable role. Unlike the univariate case, very little has been investigated about the effects of departure from these two basic assumptions on the performance characteristics of the parametric MANOVA tests.
On the other hand, the assumption of multinormality of the error vectors is often found to be dubious, especially in many biometric problems. Further, in many problems, there appears to be sufficient evidence of stochastic dependence of the error vectors within the same block. For example, in agricultural experiments, the presence of spatial correlation may distort the stochastic independence of the error vectors within the same block. Similar dependence may be due to genetic effects in many animal feeding experiments. The object of the present investigation is to relax both the assumptions of multinormality and of independence of the error components. In fact, for the tests proposed here, we require only that

(i) the joint distribution function $F(\mathbf{e}_{i1}, \ldots, \mathbf{e}_{ir})$ of $\mathbf{e}_{i1}, \ldots, \mathbf{e}_{ir}$ is continuous and independent of $i = 1, \ldots, n$, and

(ii) $F(\mathbf{e}_{i1}, \ldots, \mathbf{e}_{ir})$ is a symmetric function of its $r$ arguments (vectors) $\mathbf{e}_{i1}, \ldots, \mathbf{e}_{ir}$, i.e., $F$ remains invariant under any permutation of the $r$ vectors among themselves, or in other words, $\mathbf{e}_{i1}, \ldots, \mathbf{e}_{ir}$ are symmetric dependent stochastic vectors.

Evidently, both the assumptions (i) and (ii) are much less restrictive than the usual assumptions of independence and multinormality. Thus, the proposed method appears to have a comparatively wider scope of applicability.

In the nonparametric case, practically no work has been done on this line. For completely randomized layouts, very recently some nonparametric MANOVA tests have been offered by Chatterjee and Sen (1964, 1966), Sen (1965, 1966a), Puri and Sen (1966), and Anderson (1965), among a few others. Bhapkar (1965) has also presented some asymptotically distribution-free tests for the same problem. The present author (1966b) has considered some rank methods for combination of independent experiments in MANOVA.
The same procedure is applicable in our situation here, but it fails to be suitable in some respects. This problem may also be regarded as the multivariate generalization of the nonparametric ANOVA tests relating to two way layouts. Such ANOVA tests have been considered by Friedman (1937), Durbin (1951), Brown and Mood (1951), Benard and Elteren (1953), and others. These are all based on intra-block rankings, and the same method can be generalized to the MANOVA problem. The present author (1966c) has considered a modified approach to nonparametric ANOVA tests for two way layouts. Extending an idea of Hodges and Lehmann (1962), he has considered the rankings after alignment, and under a suitable permutation model, has obtained a class of genuinely distribution-free tests based on these modified rankings. This results, in most of the cases, in an increased (at least asymptotically) efficiency of the proposed test. The object of this paper is to generalize the method of rankings after alignment to the MANOVA problem and to offer some suitable nonparametric tests for the same. For this purpose, the concept of multidimensional interchangeability is developed and certain rank permutational ideas are formulated. With the aid of this, a class of properly distribution-free rank order tests for the hypothesis in (1.7) is developed. Further, the celebrated Chernoff-Savage (1958) theorem on the asymptotic normality and power-efficiency of a class of univariate nonparametric test-statistics, as extended to the multivariate case by Puri and Sen (1966) and to the problem of compound symmetry of multivariate distributions by Sen (1966c), is extended further to take care of the problem of multidimensional interchangeability, to be considered here. With the aid of this, the asymptotic power and power-efficiency of the proposed class of tests are studied.

2. SOME PRELIMINARY NOTIONS.
Let us define a set of $r^2$ real quantities by

$$c_{\ell j} = \delta_{\ell j} - \frac{1}{r} \quad \text{for all } \ell, j = 1, \ldots, r, \tag{2.1}$$

where $\delta_{\ell j}$ is the usual Kronecker delta. Let us then consider the $r$ intra-block contrasts

$$\mathbf{Y}_{i\ell} = \sum_{j=1}^{r} c_{\ell j} \mathbf{X}_{ij}, \quad \ell = 1, \ldots, r, \tag{2.2}$$

for each $i = 1, \ldots, n$. From (1.6) and (2.2), we have

$$\mathbf{Y}_{i\ell} = \Big(\boldsymbol{\tau}_\ell - \frac{1}{r}\sum_{j=1}^{r} \boldsymbol{\tau}_j\Big) + \Big(\mathbf{e}_{i\ell} - \frac{1}{r}\sum_{j=1}^{r} \mathbf{e}_{ij}\Big), \tag{2.3}$$

where the first term on the right hand side of (2.3) vanishes when $H_0$ in (1.7) holds. Further, by assumption (ii) of section 1, we get with some simple reasoning that the joint distribution of $[(\mathbf{e}_{i\ell} - \frac{1}{r}\sum_{j=1}^{r}\mathbf{e}_{ij}),\ \ell = 1, \ldots, r]$ is a symmetric function of the $r$ (vector) arguments. Consequently, from (2.3), we get that under $H_0$ in (1.7), the joint distribution of $(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ will be a symmetric function of the $r$ vectors $\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir}$. On the other hand, if $H_0$ in (1.7) does not hold, the joint distribution of $(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ will be a symmetric function of its (vector) arguments only when each one of them is adjusted by appropriate location vectors. Thus, if instead of the observed responses $\mathbf{X}_{ij}$'s, we work with the block-adjusted yields $\mathbf{Y}_{ij}$'s, our problem of testing $H_0$ in (1.7) reduces to that of testing the hypothesis of interchangeability of the vectors $\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir}$ (for all $i = 1, \ldots, n$), against translation type of alternatives. This is termed the problem of multidimensional interchangeability, and a formulation of an appropriate rank permutation model for the same will be considered in the next section.

The necessary rank order statistics will be defined now. Let us pool the $N\ (= nr)$ observations $\{Y_{ij}^{(k)},\ j = 1, \ldots, r,\ i = 1, \ldots, n\}$ into a combined set and denote the ordered observations by

$$Y_{(1)}^{(k)} < \cdots < Y_{(N)}^{(k)}, \tag{2.4}$$

where by virtue of the assumed continuity of the distribution of the error vectors, the possibility of ties in (2.4) may be neglected, in probability.
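The alignment step (2.1)-(2.2) can be illustrated by a short sketch. It is not part of the original paper: the helper name `align` and the toy data are hypothetical, and the code only uses the fact that the coefficients $c_{\ell j} = \delta_{\ell j} - 1/r$ amount to subtracting the block mean vector from every observation of the block.

```python
def align(x):
    """Return the block-adjusted yields Y[i][l][k] of (2.2).

    x[i][j][k] is the k-th response of treatment j in block i.
    With c_{lj} = delta_{lj} - 1/r, as in (2.1), the contrast is the
    observation minus the mean vector of its block.
    """
    y = []
    for block in x:
        r, p = len(block), len(block[0])
        mean = [sum(obs[k] for obs in block) / r for k in range(p)]
        y.append([[obs[k] - mean[k] for k in range(p)] for obs in block])
    return y


# Hypothetical data: n = 2 blocks, r = 3 treatments, p = 2 variates.
x = [[[1.0, 10.0], [2.0, 14.0], [6.0, 12.0]],
     [[5.0, 3.0], [9.0, 1.0], [7.0, 8.0]]]
y = align(x)

# Adding an arbitrary block effect alpha_i to every plot of block i
# should leave the aligned observations unchanged.
alpha = [[100.0, -50.0], [-3.0, 7.0]]
x_shifted = [[[obs[k] + alpha[i][k] for k in range(2)] for obs in block]
             for i, block in enumerate(x)]
y_shifted = align(x_shifted)
```

In this sketch the aligned vectors of each block sum to the null vector, which is the source of the linear dependence among the statistics built from them later.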
Let then $c(u)$ be the usual sign-function, viz.,

$$c(u) = \begin{cases} 1, & \text{if } u > 0, \\ 0, & \text{if } u \leq 0, \end{cases} \tag{2.5}$$

and let

$$R_{ij}^{(k)} = 1 + \sum_{\alpha=1}^{N} c\big(Y_{ij}^{(k)} - Y_{(\alpha)}^{(k)}\big), \tag{2.6}$$

for $i = 1, \ldots, n$, $j = 1, \ldots, r$. Thus $R_{ij}^{(k)}$ stands for the rank of $Y_{ij}^{(k)}$ within the set (2.4). This ranking procedure is employed separately for each $k = 1, \ldots, p$. Consequently, any vector $\mathbf{Y}_{ij}$ having $p$ elements is made to correspond to a rank $p$-vector

$$\mathbf{R}_{ij}' = (R_{ij}^{(1)}, \ldots, R_{ij}^{(p)}), \quad i = 1, \ldots, n,\ j = 1, \ldots, r. \tag{2.7}$$

The composite collection

$$\mathbf{R}_N^{p \times N} = (\mathbf{R}_{11}, \ldots, \mathbf{R}_{1r}, \ldots, \mathbf{R}_{n1}, \ldots, \mathbf{R}_{nr}) \tag{2.8}$$

is a $p \times N$ matrix and will be termed a collection (rank) matrix. Each row of $\mathbf{R}_N$ is a permutation of the numbers $1, \ldots, N$. For any positive integer $N\ (= nr,\ n = 1, 2, \ldots)$ we define $p$ sequences of real numbers by

$$\mathbf{E}_N^{(k)} = (E_{N,1}^{(k)}, \ldots, E_{N,N}^{(k)}), \quad k = 1, \ldots, p, \tag{2.9}$$

where the $E_{N,\alpha}^{(k)}$'s are all real quantities and are explicit functions of $\alpha/(N+1)$. We adopt the conventional Chernoff-Savage (1958) form and write

$$E_{N,\alpha}^{(k)} = J_N^{(k)}\Big(\frac{\alpha}{N+1}\Big), \quad \alpha = 1, \ldots, N,\ k = 1, \ldots, p, \tag{2.10}$$

where the function $J_N^{(k)}$ need be defined only at $\alpha/(N+1)$, $\alpha = 1, \ldots, N$. However, we shall find it more convenient to extend its domain of definition to $(0, 1)$ according to the Chernoff-Savage convention. Also, we define $rp$ sequences of indicator functions $\{Z_{N,\alpha}^{(j,k)},\ \alpha = 1, \ldots, N\}$, $j = 1, \ldots, r$, $k = 1, \ldots, p$, by

$$Z_{N,\alpha}^{(j,k)} = \begin{cases} 1, & \text{if } Y_{(\alpha)}^{(k)} \text{ is some } Y_{ij}^{(k)}\ (i = 1, \ldots, n), \\ 0, & \text{otherwise}, \end{cases} \tag{2.11}$$

for $\alpha = 1, \ldots, N$. Then we define $rp$ rank order statistics

$$T_{N,j}^{(k)} = \frac{1}{n} \sum_{\alpha=1}^{N} E_{N,\alpha}^{(k)} Z_{N,\alpha}^{(j,k)}, \quad j = 1, \ldots, r,\ k = 1, \ldots, p. \tag{2.12}$$

It may be noted that

$$\frac{1}{r} \sum_{j=1}^{r} T_{N,j}^{(k)} = \frac{1}{N} \sum_{\alpha=1}^{N} E_{N,\alpha}^{(k)} = \bar{E}_N^{(k)} \text{ (say)}, \quad k = 1, \ldots, p; \tag{2.13}$$

where $\bar{E}_N^{(1)}, \ldots, \bar{E}_N^{(p)}$ are all known constants (depending on $N$). Thus, at most $(r-1)p$ of the $rp$ variables in (2.12) are linearly independent. Our proposed test is based on the set of random variables in (2.12).
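The ranking (2.6), the scores (2.10) and the statistics (2.12) can be sketched as follows. This is an illustrative sketch only: the function names and the aligned toy data are hypothetical, ties are assumed absent, and the two score functions shown (the identity and the inverse normal cdf) are merely common choices of $J$ and are not prescribed by the text.

```python
from statistics import NormalDist


def pooled_ranks(y, k):
    """Ranks R_ij^(k) of the k-th variate among all N = n*r values, as in (2.6)."""
    n, r = len(y), len(y[0])
    order = sorted((y[i][j][k], i, j) for i in range(n) for j in range(r))
    ranks = [[0] * r for _ in range(n)]
    for position, (_, i, j) in enumerate(order, start=1):
        ranks[i][j] = position
    return ranks


def rank_order_statistics(y, score):
    """T[j][k] of (2.12), using scores E_{N,a} = score(a / (N + 1)) as in (2.10)."""
    n, r, p = len(y), len(y[0]), len(y[0][0])
    big_n = n * r
    t = [[0.0] * p for _ in range(r)]
    for k in range(p):
        ranks = pooled_ranks(y, k)
        for j in range(r):
            t[j][k] = sum(score(ranks[i][j] / (big_n + 1)) for i in range(n)) / n
    return t


# Hypothetical aligned data: n = 2, r = 3, p = 2, no ties within a variate.
y = [[[-2.0, 1.5], [-1.0, -2.5], [3.0, 1.0]],
     [[0.5, -1.0], [-2.5, 3.0], [2.0, -2.0]]]

t_rank_sum = rank_order_statistics(y, lambda u: u)           # J(u) = u
t_normal = rank_order_statistics(y, NormalDist().inv_cdf)    # J = inverse normal cdf
```

Averaging the rows of either matrix over $j$ reproduces the constant $\bar{E}_N^{(k)}$ of (2.13), which is $1/2$ for the identity scores and $0$ for the normal scores.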
To develop strictly distribution-free tests for the hypothesis (1.7), we shall consider in the next section some permutation model. But, before that, a point of clarification may be worth making. The class of statistics in (2.12) has some similarity with a class of statistics considered by Puri and Sen (1966). However, in the latter case, we have a one way classification with $N$ independent $p$-variate observations, while in this case, we have a two way classification with $n$ independent $pr$-variate observations. This makes the situation somewhat more complicated, and requires more specialized attention for both the permutation and the asymptotic test theory.

3. RANK PERMUTATIONS FOR MULTIDIMENSIONAL INTERCHANGEABILITY.

The collection matrix $\mathbf{R}_N^{p \times N}$, given by (2.8), is now expressed in terms of $n$ submatrices $\mathbf{R}_1^{p \times r}, \ldots, \mathbf{R}_n^{p \times r}$, where $\mathbf{R}_i^{p \times r}$ is the matrix of the $r$ rank $p$-tuplets corresponding to $(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, for $i = 1, \ldots, n$. Thus, we have

$$\mathbf{R}_N^{p \times N} = (\mathbf{R}_1^{p \times r}, \ldots, \mathbf{R}_n^{p \times r}). \tag{3.1}$$

Now under the null hypothesis (1.7), the joint distribution function $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ is a symmetric function of $\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir}$, and hence, the same remains invariant under any permutation of the $r$ vectors in the $r$ positions of $G$. Since there are $r!$ possible permutations of the $r$ vectors among themselves, the permutational probability (i.e., conditional probability) mass associated with each of the $r!$ possible permutations is equal to $(r!)^{-1}$ (under $H_0$ in (1.7)), for all $i = 1, \ldots, n$. Since $(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ is distributed (jointly) independently of $(\mathbf{Y}_{i'1}, \ldots, \mathbf{Y}_{i'r})$ for all $i \neq i' = 1, \ldots, n$, the joint distribution of

$$\mathbf{Y}_N = (\mathbf{Y}_{11}, \ldots, \mathbf{Y}_{1r}, \ldots, \mathbf{Y}_{n1}, \ldots, \mathbf{Y}_{nr}) \tag{3.2}$$

remains invariant under the following finite group $\mathcal{G}_n$ of transformations $\{g_n\}$, which maps the sample space of $\mathbf{Y}_N$ onto itself.
The number of elements of $\mathcal{G}_n$ is equal to $(r!)^n$, and typically a transformation $g_n$ is such that

$$g_n \mathbf{Y}_N = \mathbf{Y}_N^* = (\mathbf{Y}_{11}^*, \ldots, \mathbf{Y}_{1r}^*, \ldots, \mathbf{Y}_{n1}^*, \ldots, \mathbf{Y}_{nr}^*), \tag{3.3}$$

where $(\mathbf{Y}_{i1}^*, \ldots, \mathbf{Y}_{ir}^*)$ is any permutation of $(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, $i = 1, \ldots, n$. Let $\mathcal{Y}_N$ be the $Np$-dimensional sample space of $\mathbf{Y}_N$ (and we take it to be the $Np$-dimensional Euclidean space). Evidently, the sample space of $\mathbf{Y}_N^*$ is the same as that of $\mathbf{Y}_N$, and moreover, under $H_0$ in (1.7), the joint distribution of $\mathbf{Y}_N$ remains invariant under the group of transformations $\mathcal{G}_n$. Let now $S(\mathbf{Y}_N)$ be a (real or vector valued) function on $\mathcal{Y}_N$. Then, for any $\mathbf{Y}_N \in \mathcal{Y}_N$, we will have a set of $(r!)^n$ values of $S(\mathbf{Y}_N)$, obtained under the group of transformations $\mathcal{G}_n$, and this set is denoted by $\mathcal{E}(\mathbf{Y}_N)$. Then, under the null hypothesis (1.7), the conditional distribution of $S(\mathbf{Y}_N)$ over the set $\mathcal{E}(\mathbf{Y}_N)$ will be uniform. Let us define $T_{N,j}^{(k)}$ as in (2.12), and let

$$\mathbf{T}_N^{r \times p} = ((T_{N,j}^{(k)}))_{j = 1, \ldots, r;\ k = 1, \ldots, p}. \tag{3.4}$$

Then, it follows that $\mathbf{T}_N$ is a stochastic matrix, which under the group of transformations $\mathcal{G}_n$ can have only $(r!)^n$ possible realizations. Since $\mathbf{T}_N$ is an explicit function of the $N$ rank $p$-tuplets $\mathbf{R}_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, r$, it will be more convenient for us to review the above invariance argument in terms of the following rank-invariance argument.

From the way in which we have defined $\mathbf{R}_N$ in (2.8) and (3.1), it follows that for any $\mathbf{Y}_N \in \mathcal{Y}_N$ there will be a corresponding collection matrix $\mathbf{R}_N$. On examining the group of transformations $\mathcal{G}_n$, it will be clear that the transformation $g_n$ on $\mathbf{Y}_N$, given by (3.3), gives rise to another collection matrix $\mathbf{R}_N^*$, which is obtained by applying the same transformation $g_n \in \mathcal{G}_n$ on the original collection matrix $\mathbf{R}_N$. Thus, under the group of transformations $\{g_n\}$, the rank collection matrix $\mathbf{R}_N$ (corresponding to $\mathbf{Y}_N$) gives rise to a set of $(r!)^n$ rank collection matrices (obtained by applying the same transformations $\{g_n\}$), and this set is denoted by $\mathcal{S}(\mathbf{R}_N)$. If $\mathbf{R}_N^*$ is any member of $\mathcal{S}(\mathbf{R}_N)$, we note that $\mathbf{R}_N^*$ is really derived from $\mathbf{R}_N$ by a finite number of inversions of the columns of the latter. Thus we may write

[relation (3.5) is illegible in the source]

Hence, the set $\mathcal{S}(\mathbf{R}_N)$ contains $(r!)^n$ rank matrices which are permutationally (under inversions of intra-block columns) equivalent (under $\mathcal{G}_n$) to $\mathbf{R}_N$. Thus, we term $\mathcal{S}(\mathbf{R}_N)$ the permutation set (mod $\mathcal{G}_n$) of $\mathbf{R}_N$. $\mathbf{R}_N$, like $\mathbf{Y}_N$, is a stochastic variable, and each row of $\mathbf{R}_N$ is a permutation of $1, \ldots, N$. Thus, $\mathbf{R}_N$ can have $(N!)^p$ possible realizations, and this set of all possible realizations of $\mathbf{R}_N$ is denoted by $\mathcal{R}_N$, so that

[relation (3.6) is illegible in the source]

The probability distribution of $\mathbf{R}_N$ on $\mathcal{R}_N$ (defined on an additive class of subsets of $\mathcal{R}_N$) will depend on the unknown joint distributions $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, $i = 1, \ldots, n$, even under $H_0$ in (1.7). Thus, unlike the case of univariate one way classified data, the use of the unconditional distribution of $\mathbf{R}_N$ will fail to provide a distribution-free test. However, from what has been discussed before, it follows that

$$P\{\mathbf{R}_N = \mathbf{R}_N^* \mid \mathcal{S}(\mathbf{R}_N), H_0\} = (r!)^{-n} \quad \text{for all } \mathbf{R}_N^* \in \mathcal{S}(\mathbf{R}_N), \tag{3.7}$$

independently of $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, $i = 1, \ldots, n$. Now, from the way in which $\mathbf{E}_N^{(k)}$, $k = 1, \ldots, p$, are defined by (2.9), (2.10), it follows that $\mathbf{T}_N$ in [(2.12), (3.4)] is an explicit function of $\mathbf{R}_N$. Thus, the set $\mathcal{S}(\mathbf{R}_N)$ will give rise to a set of $(r!)^n$ realizations of $\mathbf{T}_N$, and this set is denoted by $\mathcal{S}(\mathbf{T}_N)$. Hence, under the permutational probability measure (3.7), we will have a completely specified permutational distribution of $\mathbf{T}_N$, and the corresponding permutational probability measure is denoted by $\mathcal{P}_n$. Let us then consider a test function $\phi(\mathbf{Y}_N)$ $(0 \leq \phi \leq 1)$, which to each $\mathbf{Y}_N \in \mathcal{Y}_N$ associates a probability of rejecting $H_0$ in (1.7), with the aid of $\mathcal{P}_n$. It follows that we can always select $\phi(\mathbf{Y}_N)$ in such a manner that

$$(r!)^{-n} \sum_{g_n \in \mathcal{G}_n} \phi(g_n \mathbf{Y}_N) = \epsilon, \tag{3.8}$$

where $\epsilon$ $(0 < \epsilon < 1)$ is the preassigned level of significance of the test. Consequently, $\phi(\mathbf{Y}_N)$ has the $S(\epsilon)$-structure of tests [cf. Lehmann and Stein (1949)], and is a similar size $\epsilon$ test for the null hypothesis (1.7).

Now, in actual practice, we prefer to use some single-valued function of $\mathbf{T}_N$ as a test-statistic. There seems to be no definite suggestion regarding the structure of this test-statistic, and an optimum choice naturally may depend appreciably on the particular class of alternatives we have in mind. However, it may be suitable (though not necessarily optimum) to consider the following test-statistic, which is the quadratic form associated with the asymptotic permutation distribution of $\mathbf{T}_N$. For this, let us consider first the permutational moments of $\mathbf{T}_N$. It readily follows that

$$E\{T_{N,j}^{(k)} \mid \mathcal{P}_n\} = \bar{E}_N^{(k)}, \quad k = 1, \ldots, p,\ j = 1, \ldots, r. \tag{3.9}$$

Let us define

$$\bar{E}^{(k)}_{N,R_{i\cdot}^{(k)}} = \frac{1}{r} \sum_{j=1}^{r} E^{(k)}_{N,R_{ij}^{(k)}}, \quad i = 1, \ldots, n,\ k = 1, \ldots, p, \tag{3.10}$$

as the intra-block averages. Also let

$$v_{N,kq}(\mathbf{R}_N) = \frac{1}{n(r-1)} \sum_{i=1}^{n} \sum_{j=1}^{r} \Big\{E^{(k)}_{N,R_{ij}^{(k)}} - \bar{E}^{(k)}_{N,R_{i\cdot}^{(k)}}\Big\} \Big\{E^{(q)}_{N,R_{ij}^{(q)}} - \bar{E}^{(q)}_{N,R_{i\cdot}^{(q)}}\Big\} \tag{3.11}$$

for $k, q = 1, \ldots, p$;

$$\mathbf{V}_N(\mathbf{R}_N) = ((v_{N,kq}(\mathbf{R}_N)))_{k, q = 1, \ldots, p}. \tag{3.12}$$

It is then easy to verify that

$$\mathrm{Cov}\{T_{N,j}^{(k)}, T_{N,j'}^{(q)} \mid \mathcal{P}_n\} = \frac{\delta_{jj'} r - 1}{nr}\, v_{N,kq}(\mathbf{R}_N) \tag{3.13}$$

for $k, q = 1, \ldots, p$, $j, j' = 1, \ldots, r$, where $\delta_{jj'}$ is the usual Kronecker delta. For the time being, let us assume that $\mathbf{V}_N(\mathbf{R}_N)$, given by (3.12), is positive definite, and denote its reciprocal matrix by

$$\mathbf{V}_N^{-1}(\mathbf{R}_N) = ((v_N^{kq}(\mathbf{R}_N)))_{k, q = 1, \ldots, p}. \tag{3.14}$$

Our proposed test-statistic $S_N$ can then be expressed as

$$S_N = n \sum_{k=1}^{p} \sum_{q=1}^{p} v_N^{kq}(\mathbf{R}_N) \sum_{j=1}^{r} \big(T_{N,j}^{(k)} - \bar{E}_N^{(k)}\big)\big(T_{N,j}^{(q)} - \bar{E}_N^{(q)}\big), \tag{3.15}$$

and it may be noted that $S_N$ is essentially a non-negative stochastic variable. We shall see later on that under certain regularity conditions on $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, $\mathbf{V}_N(\mathbf{R}_N)$ is positive definite with a very high probability (precisely, in probability).
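The quantities (3.11), (3.12) and (3.15), together with the permutation set of size $(r!)^n$, can be sketched numerically. This is a rough sketch under stated assumptions, not the author's procedure: all function names are hypothetical, the input `e[i][j][k]` is assumed to already hold the score $E^{(k)}_{N,R_{ij}^{(k)}}$ attached to each observation (here simply the ranks of a toy data set), $\mathbf{V}_N$ is assumed non-singular, and full enumeration is only feasible for very small $n$ and $r$.

```python
from itertools import permutations, product


def t_matrix(e):
    """T[j][k] = (1/n) * sum_i e[i][j][k], i.e. the statistics (2.12)."""
    n, r, p = len(e), len(e[0]), len(e[0][0])
    return [[sum(e[i][j][k] for i in range(n)) / n for k in range(p)] for j in range(r)]


def v_matrix(e):
    """Permutation covariance matrix V_N of (3.11)-(3.12)."""
    n, r, p = len(e), len(e[0]), len(e[0][0])
    v = [[0.0] * p for _ in range(p)]
    for block in e:
        mean = [sum(obs[k] for obs in block) / r for k in range(p)]  # (3.10)
        for obs in block:
            for k in range(p):
                for q in range(p):
                    v[k][q] += (obs[k] - mean[k]) * (obs[q] - mean[q])
    return [[v[k][q] / (n * (r - 1)) for q in range(p)] for k in range(p)]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting; fails if m is singular."""
    p = len(m)
    a = [[float(x) for x in row] + [float(i == j) for j in range(p)]
         for i, row in enumerate(m)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda i: abs(a[i][c]))
        a[c], a[pivot] = a[pivot], a[c]
        d = a[c][c]
        a[c] = [x / d for x in a[c]]
        for i in range(p):
            if i != c:
                f = a[i][c]
                a[i] = [x - f * z for x, z in zip(a[i], a[c])]
    return [row[p:] for row in a]


def s_statistic(e):
    """Quadratic form S_N of (3.15)."""
    n, r, p = len(e), len(e[0]), len(e[0][0])
    t = t_matrix(e)
    ebar = [sum(t[j][k] for j in range(r)) / r for k in range(p)]  # (2.13)
    vinv = invert(v_matrix(e))
    return n * sum(vinv[k][q] * (t[j][k] - ebar[k]) * (t[j][q] - ebar[q])
                   for j in range(r) for k in range(p) for q in range(p))


def permutation_values(e):
    """S_N over all (r!)^n within-block permutations of the score p-tuples."""
    perms = list(permutations(range(len(e[0]))))
    return [s_statistic([[block[j] for j in g] for block, g in zip(e, choice)])
            for choice in product(perms, repeat=len(e))]


# Hypothetical scores (ranks themselves): n = 2, r = 3, p = 2.
e = [[[2, 5], [3, 1], [6, 4]],
     [[4, 3], [1, 6], [5, 2]]]
s_obs = s_statistic(e)
values = permutation_values(e)
p_value = sum(v >= s_obs - 1e-12 for v in values) / len(values)
```

Because $\mathbf{V}_N$ depends only on within-block deviations, it is the same for every member of the permutation set, so only $\mathbf{T}_N$ changes across the enumeration. A short calculation from (3.13) and (3.15) suggests that the permutation mean of $S_N$ equals $p(r-1)$, which the enumeration can be used to check.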
However, if $\mathbf{V}_N(\mathbf{R}_N)$ fails to be non-singular, we may work with the highest order principal minor of $\mathbf{V}_N(\mathbf{R}_N)$ which is positive definite, and proceed similarly only with the responses pertaining to this minor. Thus, for convenience, we may assume $\mathbf{V}_N(\mathbf{R}_N)$ to be positive definite. Now,

[display (3.16) is illegible in the source]

and $S_N$ measures the distance of $\mathbf{T}_N$, in (3.4), from the permutational centre of gravity of the same. If $H_0$ in (1.7) does not hold, it can be shown that for at least one $k = 1, \ldots, p$ and one $j = 1, \ldots, r$, $T_{N,j}^{(k)}$ will converge (stochastically) to a point other than $\bar{E}_N^{(k)}$, and hence, by (3.15), $S_N$ will be stochastically larger. Thus, we may propose the following test function:

$$\phi(\mathbf{Y}_N) = \begin{cases} 1, & \text{if } S_N > S_{N,\epsilon}(\mathbf{R}_N), \\ \gamma(\mathbf{R}_N), & \text{if } S_N = S_{N,\epsilon}(\mathbf{R}_N), \\ 0, & \text{if } S_N < S_{N,\epsilon}(\mathbf{R}_N), \end{cases} \tag{3.17}$$

where the constants $S_{N,\epsilon}(\mathbf{R}_N)$ and $\gamma(\mathbf{R}_N)$ may usually depend on $\mathbf{R}_N$ and are so chosen that

$$E\{\phi(\mathbf{Y}_N) \mid \mathcal{P}_n\} = \epsilon. \tag{3.18}$$

(3.18) implies that $E\{\phi(\mathbf{Y}_N) \mid H_0\} = \epsilon$. For small values of $n$ (and $r$), one may venture to evaluate the exact values of $S_{N,\epsilon}(\mathbf{R}_N)$ and $\gamma(\mathbf{R}_N)$ with the aid of (3.7). However, the labor of this process of evaluation increases considerably with the increase in $n$ (or $r$), and hence, as in other permutation tests, we are faced with the problem of finding out the asymptotic form of the permutation distribution of $S_N$. This is done in the next section.

4. ASYMPTOTIC PERMUTATION DISTRIBUTION OF $S_N$.

We shall impose certain regularity conditions on the $p$ sequences $\{\mathbf{E}_N^{(k)}\}$, $k = 1, \ldots, p$, defined by (2.9) and (2.10), as well as on the joint distribution function $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$. Let us define

$$F_{N[j]}^{(k)}(x) = \frac{1}{n}\big[\text{Number of } Y_{ij}^{(k)} \leq x\big], \quad k = 1, \ldots, p,\ j = 1, \ldots, r; \tag{4.1}$$

$$H_N^{(k)}(x) = \frac{1}{r} \sum_{j=1}^{r} F_{N[j]}^{(k)}(x), \quad k = 1, \ldots, p; \tag{4.2}$$

$$F_{N[j,\ell]}^{(k,q)}(x, y) = \frac{1}{n}\big[\text{Number of } (Y_{ij}^{(k)}, Y_{i\ell}^{(q)}) \leq (x, y)\big], \tag{4.3}$$

for $k, q = 1, \ldots, p$, $j, \ell = 1, \ldots, r$, with either $j \neq \ell$ or $k \neq q$ or both.
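The empirical distribution functions (4.1)-(4.3) are simple counting operations, as the following sketch shows. The function names and the aligned toy data are hypothetical, and the inequality is taken as "less than or equal", which is an assumption about the partly legible source. The last lines illustrate the relation $R_{ij}^{(k)} = N H_N^{(k)}(Y_{ij}^{(k)})$ between the pooled ranks and the combined empirical cdf, which holds when there are no ties.

```python
def f_marginal(y, j, k, x):
    """F_{N[j]}^{(k)}(x) of (4.1): share of the n blocks with Y_ij^(k) <= x."""
    return sum(block[j][k] <= x for block in y) / len(y)


def h_pooled(y, k, x):
    """H_N^{(k)}(x) of (4.2): average of the r marginal empirical cdfs."""
    r = len(y[0])
    return sum(f_marginal(y, j, k, x) for j in range(r)) / r


def f_joint(y, j, l, k, q, x, z):
    """F_{N[j,l]}^{(k,q)}(x, z) of (4.3): share of blocks with both inequalities."""
    return sum(block[j][k] <= x and block[l][q] <= z for block in y) / len(y)


# Hypothetical aligned data: n = 2, r = 3, p = 2, no ties within a variate.
y = [[[-2.0, 1.5], [-1.0, -2.5], [3.0, 1.0]],
     [[0.5, -1.0], [-2.5, 3.0], [2.0, -2.0]]]
big_n = len(y) * len(y[0])

# Ranks of the first variate recovered from the pooled empirical cdf.
ranks_from_cdf = [[round(big_n * h_pooled(y, 0, obs[0])) for obs in block]
                  for block in y]
```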
Now, corresponding to the joint cdf $G$, let us denote the marginal cdf of $Y_{ij}^{(k)}$ and of $(Y_{ij}^{(k)}, Y_{i\ell}^{(q)})$ by $F_{[j]}^{(k)}(x)$ and $F_{[j,\ell]}^{(k,q)}(x, y)$, respectively, for $j, \ell = 1, \ldots, r$, $k, q = 1, \ldots, p$, with at least one of $j \neq \ell$, $k \neq q$ being true, and let

$$H^{(k)}(x) = \frac{1}{r} \sum_{j=1}^{r} F_{[j]}^{(k)}(x), \quad k = 1, \ldots, p. \tag{4.4}$$

With the definition of the $E_{N,\alpha}^{(k)}$'s as in (2.10), we make the following assumptions concerning the $J_N^{(k)}$'s.

ASSUMPTION 1. $\lim_{N \to \infty} J_N^{(k)}(H) = J^{(k)}(H)$ exists for all $0 < H < 1$ and is not a constant.

Since we shall be interested here in translation type of alternatives, we shall further assume that

$$J^{(k)}(H) \text{ is } \uparrow \text{ in } H:\ 0 < H < 1, \quad \text{for all } k = 1, \ldots, p. \tag{4.5}$$

ASSUMPTION 2.

$$\frac{1}{N} \sum_{\alpha=1}^{N} \Big| J_N^{(k)}\Big(\frac{\alpha}{N+1}\Big) - J^{(k)}\Big(\frac{\alpha}{N+1}\Big) \Big| = o(N^{-1/2}), \tag{4.6}$$

for $k = 1, \ldots, p$, and

$$\int_{-\infty}^{\infty} \Big[ J_N^{(k)}\Big(\frac{N}{N+1} H_N^{(k)}(x)\Big) - J^{(k)}\Big(\frac{N}{N+1} H_N^{(k)}(x)\Big) \Big]\, dF_{N[j]}^{(k)}(x) = o_p(N^{-1/2}), \tag{4.7}$$

for all $k = 1, \ldots, p$, $j = 1, \ldots, r$.

ASSUMPTION 3. $J^{(k)}(H)$ is absolutely continuous in $H$: $0 < H < 1$, and

$$\Big| \frac{d^s}{dH^s} J^{(k)}(H) \Big| \leq K [H(1-H)]^{-s - \frac{1}{2} + \delta}, \tag{4.8}$$

for $s = 0, 1$, and some $\delta > 0$, where $K$ is a finite positive constant.

Also, for the positive definiteness and asymptotic convergence of the covariance matrix $\mathbf{V}_N(\mathbf{R}_N)$, given by (3.12), we require two more mild regularity conditions.

ASSUMPTION 4.

[left-hand side illegible in the source] $= o(1)$, $\tag{4.9}$

for $k = 1, \ldots, p$, and

[left-hand side illegible in the source] $= o_p(1)$ $\tag{4.10}$

for all $j, \ell = 1, \ldots, r$, $k, q = 1, \ldots, p$, where either $k \neq q$ or $j \neq \ell$ or both.

Let us also define the variables $Z_{ij}^{(k)}$ [display (4.11) is illegible in the source], the vectors

$$\mathbf{Z}_{ij}' = (Z_{ij}^{(1)}, \ldots, Z_{ij}^{(p)}), \quad j = 1, \ldots, r; \tag{4.12}$$

the moments [displays (4.13) and (4.14) are only partly legible in the source]

$$\sigma_{kq \cdot jj'} = E\{Z_{ij}^{(k)} Z_{ij'}^{(q)}\}, \quad k, q = 1, \ldots, p,\ j, j' = 1, \ldots, r;$$

and

$$\nu_{kq} = \frac{1}{r} \sum_{j=1}^{r} \sigma_{kq \cdot jj} - \frac{1}{r^2} \sum_{j=1}^{r} \sum_{\ell=1}^{r} \sigma_{kq \cdot j\ell}, \quad k, q = 1, \ldots, p; \tag{4.15}$$

$$\boldsymbol{\nu} = ((\nu_{kq}))_{k, q = 1, \ldots, p}. \tag{4.16}$$

ASSUMPTION 5.

$$\boldsymbol{\nu} = ((\nu_{kq}))_{k, q = 1, \ldots, p} \text{ is positive definite.} \tag{4.17}$$
Before we present the main theorems of this section, let us consider the conditions under which assumption 5 holds. Using (4.14), let us define

$$\boldsymbol{\Lambda}_{(j,\ell)} = ((\sigma_{kq \cdot jj} - \sigma_{kq \cdot j\ell} - \sigma_{kq \cdot \ell j} + \sigma_{kq \cdot \ell\ell}))_{k, q = 1, \ldots, p}, \quad j \neq \ell = 1, \ldots, r. \tag{4.18}$$

THEOREM 4.1. Assumption 5 holds if

$$\max_{j \neq \ell = 1, \ldots, r} \big[\text{Rank of } \boldsymbol{\Lambda}_{(j,\ell)}\big] = p. \tag{4.19}$$

PROOF. Let $\boldsymbol{\ell} = (\ell_1, \ldots, \ell_p)'$ be any real and non-null $p$-vector, and let

$$t_j = \boldsymbol{\ell}' \mathbf{Z}_{ij}, \quad j = 1, \ldots, r, \tag{4.20}$$

where the $\mathbf{Z}_{ij}$'s are defined by (4.12). It is then easily seen that

$$\boldsymbol{\ell}' \boldsymbol{\nu} \boldsymbol{\ell} = \frac{1}{r} \sum_{j=1}^{r} E(t_j^2) - E(\bar{t}^2), \quad \bar{t} = \frac{1}{r} \sum_{j=1}^{r} t_j. \tag{4.21}$$

Thus, we require only to show that for any non-null $\boldsymbol{\ell}$, (4.21) is strictly positive. Using essentially the proof of lemma 4.1 of Sen (1966), it can be shown that $\frac{1}{r} \sum_{j=1}^{r} E(t_j^2) - E(\bar{t}^2)$ will be strictly positive unless

$$t_j - t_\ell = \text{constant, for all } j, \ell = 1, \ldots, r. \tag{4.22}$$

Now, using (4.18) and (4.19), we get that

$$E(t_j - t_\ell)^2 = \boldsymbol{\ell}' \boldsymbol{\Lambda}_{(j,\ell)} \boldsymbol{\ell} > 0, \tag{4.23}$$

for at least one pair $(j, \ell)$, $j \neq \ell = 1, \ldots, r$. As $E(t_j - t_\ell)^2 \leq 2[E(t_j^2) + E(t_\ell^2)]$, (4.23) implies that $E(t_j^2) > 0$ for at least one $j = 1, \ldots, r$. Again, for the specific $(j, \ell)$ for which (4.23) holds, we may assume without any loss of generality that $E(t_\ell^2) \leq E(t_j^2)$, $E(t_j^2) > 0$, and thus, we require only to show that $E(t_j t_\ell) < E(t_j^2)$. If $E(t_\ell^2) = 0$, the proof is evident, while, if $E(t_\ell^2) > 0$, we have from (4.23)

$$2E(t_j t_\ell) < E(t_j^2) + E(t_\ell^2) \leq 2E(t_j^2),$$

if (4.19) holds. Hence, (4.22) cannot hold for all $j, \ell = 1, \ldots, r$. Consequently, (4.21) is strictly positive. Hence, the theorem.

It may be noted that (4.19) really implies that the vector $(\mathbf{Z}_{ij} - \mathbf{Z}_{i\ell})$ is of full rank for at least one $j \neq \ell = 1, \ldots, r$.

THEOREM 4.2. Under the assumptions 1 to 5, $\mathbf{V}_N(\mathbf{R}_N)$, defined by (3.12), converges in probability to $\boldsymbol{\nu}$, defined by (4.16), and hence, is positive definite, in probability.

PROOF. The proof of this theorem follows as a more or less straightforward generalization of theorem 4.2 of Puri and Sen (1966) and of theorem 4.2 of Sen (1966c). Hence, for the intended brevity of the paper, it is not considered in detail.

THEOREM 4.3.
Under the assumptions 1 to 5, the permutation distribution of the statistic $S_N$, defined by (3.15), converges asymptotically, in probability, to a chi square distribution with $p(r-1)$ degrees of freedom (d.f.).

PROOF. We shall first prove that under the permutation model considered in section 3, $[n^{1/2}(T_{N,j}^{(k)} - \bar{E}_N^{(k)}),\ j = 1, \ldots, r-1,\ k = 1, \ldots, p]$ has asymptotically a $p(r-1)$-variate multinormal distribution. This will be done by proving that any arbitrary linear function of these $p(r-1)$ statistics has asymptotically a normal distribution under the permutation model of section 3. Such a linear compound can be equivalently written as (by virtue of (2.13))

$$W_n = n^{1/2} \sum_{j=1}^{r} \sum_{k=1}^{p} d_{jk} T_{N,j}^{(k)}, \quad \text{where } \sum_{j=1}^{r} d_{jk} = 0,\ k = 1, \ldots, p. \tag{4.24}$$

Under assumption 2, (4.24) can be rewritten as

$$n^{-1/2} \sum_{i=1}^{n} \Big\{ \sum_{j=1}^{r} \sum_{k=1}^{p} d_{jk} J^{(k)}\Big(\frac{R_{ij}^{(k)}}{N+1}\Big) \Big\} + o_p(1). \tag{4.25}$$

Let us then write

$$U_{N,i}(\mathbf{R}_N) = \sum_{j=1}^{r} \sum_{k=1}^{p} d_{jk} J^{(k)}\Big(\frac{R_{ij}^{(k)}}{N+1}\Big), \quad i = 1, 2, \ldots, n. \tag{4.26}$$

The random variable $U_{N,i}(\mathbf{R}_N)$ can have only $r!$ possible equally likely values under our permutation model. These values are obtained by permuting the $r$ vectors $\mathbf{R}_{ij}$, $j = 1, \ldots, r$ (defined by (2.7)), among themselves. Thus,

$$E\{U_{N,i}(\mathbf{R}_N) \mid \mathcal{P}_n\} = 0 \tag{4.27}$$

for $i = 1, \ldots, n$. Similarly, writing $\bar{J}_i^{(k)} = \frac{1}{r} \sum_{\ell=1}^{r} J^{(k)}\big(R_{i\ell}^{(k)}/(N+1)\big)$,

$$E\{U_{N,i}^2(\mathbf{R}_N) \mid \mathcal{P}_n\} = \sum_{k=1}^{p} \sum_{q=1}^{p} \Big( \sum_{j=1}^{r} d_{jk} d_{jq} \Big) \frac{1}{r-1} \sum_{\ell=1}^{r} \Big\{ J^{(k)}\Big(\frac{R_{i\ell}^{(k)}}{N+1}\Big) - \bar{J}_i^{(k)} \Big\} \Big\{ J^{(q)}\Big(\frac{R_{i\ell}^{(q)}}{N+1}\Big) - \bar{J}_i^{(q)} \Big\} \tag{4.28}$$

for $i = 1, \ldots, n$. Since the permutations of the rank-vectors within the $i$th block are independent of the permutations within the $i'$th block for $i \neq i' = 1, \ldots, n$, under our permutation model, $\{U_{N,i}(\mathbf{R}_N),\ i = 1, \ldots, n\}$ are mutually independent. Hence, to prove the desired result, we may use the Berry-Esseen theorem [cf. Loève (1962, p. 288)], according to which it is sufficient to show that

[display (4.29), a limit as $n \to \infty$, is illegible in the source]

From (3.11), (3.12) and (4.28), we get that

[display (4.30) is illegible in the source]

where by theorem 4.1 and assumption 5, the right hand side of (4.30) is a (nonzero) positive constant, for any given $(d_{jk},\ j = 1, \ldots, r,\ k = 1, \ldots, p)$.
Thus, it is sufficient to show that the numerator of the left hand side of (4.29) is $o_p(n^{3/2})$, and this readily follows from assumption 3 and (4.26). Hence, under our permutation model, the first term of (4.25) has asymptotically, in probability, a normal distribution. Once this is established, we consider the quadratic form associated with the asymptotic multinormal distribution of $\{n^{1/2}(T_{N,j}^{(k)} - \bar{E}_N^{(k)}),\ j = 1, \ldots, r-1,\ k = 1, \ldots, p\}$, and using some well-known results on the limiting distribution of continuous functions of random variables [cf. Sverdrup (1952)], it is easily seen that under our permutation model, the statistic $S_N$, given by (3.15), has asymptotically, in probability, a chi square distribution with $p(r-1)$ d.f. Hence, the theorem.

It may be noted that the permutation distribution of $S_N$ being essentially a conditional distribution, the convergence in theorem 4.3 holds in probability, i.e., for almost all $\mathbf{Y}_N$. If we now denote by $\chi^2_{t,\epsilon}$ the upper $100\epsilon\%$ point of the chi square distribution with $t$ d.f., then from (3.17) and theorem 4.3, we arrive at the following.

THEOREM 4.4. $S_{N,\epsilon}(\mathbf{R}_N)$ and $\gamma(\mathbf{R}_N)$, defined by (3.17), converge, in probability, to $\chi^2_{p(r-1),\epsilon}$ and $0$, respectively.

By virtue of theorem 4.4, the exact permutation test, considered in (3.17), reduces asymptotically to

$$\phi(\mathbf{Y}_N) = \begin{cases} 1, & \text{if } S_N \geq \chi^2_{p(r-1),\epsilon}, \\ 0, & \text{otherwise}; \end{cases} \tag{4.31}$$

and (4.31) will be termed henceforth the asymptotic permutation test.

5. ASYMPTOTIC POWER OF THE PROPOSED TESTS.

In this section we shall study the asymptotic power and power-efficiency of our proposed class of tests. This requires first of all the study of the asymptotic (unconditional) distribution of $S_N$, when the null hypothesis (1.7) is not necessarily true. For this study, we also adopt the same notations as in section 4, and write

$$T_{N,j}^{(k)} = \int_{-\infty}^{\infty} J_N^{(k)}\Big(\frac{N}{N+1} H_N^{(k)}(x)\Big)\, dF_{N[j]}^{(k)}(x), \tag{5.1}$$

for $j = 1, \ldots, r$, $k = 1, \ldots, p$.
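The equivalence between the indicator form (2.12) and the integral form (5.1) can be checked numerically, since a Stieltjes integral with respect to the step function $F_{N[j]}^{(k)}$ is just an average over its $n$ jump points. The sketch below is illustrative only: the function names and data are hypothetical, ties are assumed absent, and the inverse normal cdf is used as one possible score function.

```python
from statistics import NormalDist


def t_via_indicators(y, j, k, score):
    """(2.12): (1/n) * sum over a of E_{N,a} * Z_{N,a}^{(j,k)}."""
    n, r = len(y), len(y[0])
    big_n = n * r
    # Ordered pooled sample (2.4), remembering the treatment of each value.
    pooled = sorted((block[l][k], l) for block in y for l in range(r))
    return sum(score(a / (big_n + 1))
               for a, (_, l) in enumerate(pooled, start=1) if l == j) / n


def t_via_integral(y, j, k, score):
    """(5.1): integral of J_N(N/(N+1) * H_N(x)) with respect to F_{N[j]}."""
    n, r = len(y), len(y[0])
    big_n = n * r

    def h(x):
        # H_N^{(k)}(x) of (4.2), i.e. the pooled empirical cdf.
        return sum(block[l][k] <= x for block in y for l in range(r)) / big_n

    # F_{N[j]} jumps by 1/n at each Y_ij^(k), so the integral is an average.
    return sum(score(big_n / (big_n + 1) * h(block[j][k])) for block in y) / n


# Hypothetical aligned data: n = 2, r = 3, p = 2, no ties within a variate.
y = [[[-2.0, 1.5], [-1.0, -2.5], [3.0, 1.0]],
     [[0.5, -1.0], [-2.5, 3.0], [2.0, -2.0]]]
score = NormalDist().inv_cdf

differences = [abs(t_via_indicators(y, j, k, score) - t_via_integral(y, j, k, score))
               for j in range(3) for k in range(2)]
```

The two computations should agree up to floating-point rounding, which is why the differences are compared against a small tolerance rather than tested for exact equality.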
The statistics in (5.1) have some analogy with a class of similar statistics considered by Puri and Sen (1966). However, in this case of a two way layout we are faced with $n$ independent $pr$-variate observations, while in the earlier case, Puri and Sen were faced with the one way layout involving $N\ (= nr)$ $p$-variate observations. This makes the situation somewhat more complicated in our case, and the necessary modifications will be studied here. Let us define

$$\mu_j^{(k)} = \int_{-\infty}^{\infty} J^{(k)}(H^{(k)}(x))\, dF_{[j]}^{(k)}(x), \tag{5.2}$$

for $j = 1, \ldots, r$, $k = 1, \ldots, p$. Also let

$$\beta_{jj' \cdot \ell\ell'}^{(k,q)} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \big[F_{[j,j']}^{(k,q)}(x, y) - F_{[j]}^{(k)}(x) F_{[j']}^{(q)}(y)\big]\, J^{(k)\prime}(H^{(k)}(x))\, J^{(q)\prime}(H^{(q)}(y))\, dF_{[\ell]}^{(k)}(x)\, dF_{[\ell']}^{(q)}(y), \tag{5.3}$$

for $j, j', \ell, \ell' = 1, \ldots, r$, $k, q = 1, \ldots, p$, with either $j \neq j'$ or $k \neq q$ or both, while for $j = 1, \ldots, r$, $k = 1, \ldots, p$, $\ell, \ell' = 1, \ldots, r$, the corresponding quantity is defined by

[display (5.4) is illegible in the source]

Finally, let

[display (5.5), which defines $\beta_{jj'}^{(k,q)}$ as a combination of the quantities in (5.3) and (5.4) summed over $\ell, \ell' = 1, \ldots, r$, is only partly legible in the source]

for $k, q = 1, \ldots, p$; $j, j' = 1, \ldots, r$.

THEOREM 5.1. If the assumptions 1, 2 and 3 of section 4 hold, then for arbitrarily continuous $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, the random variables $[N^{1/2}(T_{N,j}^{(k)} - \mu_j^{(k)}),\ j = 1, \ldots, r,\ k = 1, \ldots, p]$ have asymptotically a multinormal distribution with a null mean vector and a dispersion matrix with elements $\beta_{jj'}^{(k,q)}$, defined by (5.5).

(It may be noted that by virtue of (2.13), (4.4) and (5.2), the above multinormal distribution will be essentially singular, with a rank less than or equal to $p(r-1)$.)

PROOF. We shall present only a brief sketch of the proof, as the same follows precisely on similar lines as in theorem 5.1 of Puri and Sen (1966) and theorem 5.1 of Sen (1966c). Proceeding precisely on the same line as in the proofs of these two theorems, it can be easily shown that

$$T_{N,j}^{(k)} - \mu_j^{(k)} = B_{j,1N}^{(k)} + B_{j,2N}^{(k)} + o_p(N^{-1/2}) \tag{5.6}$$

for all $j = 1, \ldots, r$, $k = 1, \ldots, p$, where $B_{j,1N}^{(k)}$ and $B_{j,2N}^{(k)}$ are defined by

[displays (5.7), (5.8) and (5.9) are largely illegible in the source; the surviving fragments indicate that $B_{j,2N}^{(k)}$ is an average over $j' = 1, \ldots, r$ and $i = 1, \ldots, n$ of terms $B_{j,j'}^{(k)}(Y_{ij'}^{(k)})$, and that (5.9) involves a function defined separately for $x < Y_{ij}^{(k)}$ and $x \geq Y_{ij}^{(k)}$]

for $i = 1, \ldots, n$, $j, \ell = 1, \ldots, r$, $k = 1, \ldots, p$. It is therefore sufficient to show that for any arbitrary non-null $\boldsymbol{\delta} = (\delta_{11}, \ldots, \delta_{pr})$, $N^{1/2} \sum_{j=1}^{r} \sum_{k=1}^{p} \delta_{jk} (B_{j,1N}^{(k)} + B_{j,2N}^{(k)})$ has asymptotically a normal distribution. By virtue of (5.7), the same can be written as $n^{-1/2} \sum_{i=1}^{n} B(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$, where

[display (5.10), which expresses $B(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ as a sum over $j, \ell = 1, \ldots, r$ and $k = 1, \ldots, p$, is illegible in the source]

Since the random variables in (5.10) are independent and identically distributed, in order to make use of the central limit theorem under the Lindeberg condition, it is sufficient to show that these have finite second order moments. Using (5.8), it is easily seen that $E\{B(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})\} = 0$ for all $i = 1, \ldots, n$, and by virtue of (5.10), it appears to be sufficient to show that $E\{|B_{j,\ell}^{(k)}(Y_{i\ell}^{(k)})|^2\} < \infty$ for all $j, \ell = 1, \ldots, r$, $k = 1, \ldots, p$, $i = 1, \ldots, n$. Now, under assumption 3 of section 4, it is easily seen that for any $\delta'$: $0 < \delta' < \delta$ ($\delta$ being defined by (4.8)),

[display (5.11) is illegible in the source]

uniformly in $j, \ell = 1, \ldots, r$, $k = 1, \ldots, p$. Hence, the desired asymptotic normality follows readily. Again, by (5.7), (5.8) and (5.9), we have

[display (5.12) is illegible in the source]

where $\delta_{jj'}$ is the usual Kronecker delta and the $\beta_{jj' \cdot \ell\ell'}^{(k,q)}$'s are defined by (5.3) and (5.4), for $j, j', \ell, \ell' = 1, \ldots, r$, $k, q = 1, \ldots, p$. Hence, it is easily seen that

$$N\, E\big\{(B_{j,1N}^{(k)} + B_{j,2N}^{(k)})(B_{\ell,1N}^{(q)} + B_{\ell,2N}^{(q)})\big\} = \beta_{j\ell}^{(k,q)}, \tag{5.13}$$

which is defined by (5.5), for $k, q = 1, \ldots, p$, $j, \ell = 1, \ldots, r$. Consequently, by (5.6), we may conclude that the dispersion matrix of the asymptotic normal distribution has elements $\beta_{jj'}^{(k,q)}$, defined by (5.5). Hence, the theorem.

We have already noted that the asymptotic normal distribution of theorem 5.1 is singular and of rank at most equal to $p(r-1)$. If the null hypothesis in (1.7) is true, $G(\mathbf{Y}_{i1}, \ldots, \mathbf{Y}_{ir})$ will be a symmetric function of the $r$ vectors, and hence it is easily seen that (i) the marginal cdf of $Y_{ij}^{(k)}$ will be the same for all $j = 1, \ldots, r$, $i = 1, \ldots, n$, and is denoted by $H^{(k)}(x)$ for $k = 1, \ldots, p$; (ii) the marginal cdf of $(Y_{ij}^{(k)}, Y_{ij}^{(q)})$ will not depend on $j$, and is denoted by $H_1^{(k,q)}(x, y)$ for $k \neq q = 1, \ldots, p$; and (iii) the marginal cdf of $(Y_{ij}^{(k)}, Y_{i\ell}^{(q)})$ $(j \neq \ell)$ will not depend on $(j \neq \ell)$, and is denoted by $H_2^{(k,q)}(x, y)$ for $j \neq \ell = 1, \ldots, r$, $k, q = 1, \ldots, p$. Thus, it follows from (5.3), (5.4), and (4.11) through (4.14) that in this case

$$\beta_{jj' \cdot \ell\ell'}^{(k,q)} = \sigma_{kq \cdot jj'} = \begin{cases} \sigma_{kq}^{(1)}, & \text{if } j = j' = 1, \ldots, r, \\ \sigma_{kq}^{(2)}, & \text{if } j \neq j' = 1, \ldots, r, \end{cases} \tag{5.14}$$

where $\sigma_{kq}^{(1)}$ and $\sigma_{kq}^{(2)}$ depend only on $H_1^{(k,q)}(x, y)$ and $H_2^{(k,q)}(x, y)$, respectively. Thus, from (4.15) and (5.14), we get that in this case $\nu_{kq}$, defined by (4.15), reduces to

[display (5.15) is illegible in the source]

and

[display (5.16), which expresses $\beta_{j\ell}^{(k,q)}$ in terms of $\delta_{j\ell}$, $r$ and $\nu_{kq}$ for $j, \ell = 1, \ldots, r$, $k, q = 1, \ldots, p$, is illegible in the source]

where $\delta_{j\ell}$ is the usual Kronecker delta. Consequently, it is easily seen that under $H_0$ in (1.7), the quadratic form

[display (5.17) is illegible in the source]

(where $((\nu^{kq}))$ is the reciprocal of $((\nu_{kq}))$, and $\mu^{(k)} = \int_0^1 J^{(k)}(u)\, du$, $k = 1, \ldots, p$) has asymptotically a chi square distribution with $p(r-1)$ d.f. Now, under assumption 2 of section 4,

[display (5.18) is illegible in the source]

and by theorem 4.2, we have under assumption 5 that

[display (5.19) is illegible in the source]

Hence, from (3.15), (5.17), (5.18) and (5.19), we get that under $H_0$ in (1.7)

[display (5.20) is illegible in the source]

Hence, we arrive at the following.

THEOREM 5.2. Under $H_0$ in (1.7) and assumptions 1 to 5 of section 4, the statistic $S_N$ in (3.15) has asymptotically a chi square distribution with $p(r-1)$ d.f.

Let now $\hat{\mathbf{V}}_N$ be any consistent estimator of $\boldsymbol{\nu}$, defined by (4.15) and (5.15). If $\hat{\mathbf{V}}_N$
is positive definite and we denote its reciprocal by H v-I = «vkq », then N we can have an aSymptotically distribution-free test based on S N = n ik=l q=l i vkq ~ (T(k~ j=l (5.21) N, J A Since, SN can be shown to have the chi square distribution with p(r-l) d.f., when H in (1.7) holds, the test function may be proposed as o I, A i f SN 2 > X p ( r- 1) ,€ (5.22) { 0, otherwise. I I I_ I I I I I I I .e I I I I I I Ie I -26- We shall now consider the power properties of the permutation test in (3.17) and (4.31) and the large sample test in (5.22). equivalence relations among .these tests~ We shall obtain certain power- and compare them with the parametric tests referred to in Section one. By virtue of theorem 5.1, it can be shown that if the linear model (1.5) holds but the null hypothesis (1.7) is not true~ then k=l~ N~,. and hence, SN' defined by ... ~p~ can not all converge to zero as will be stochastically indefinitely 1arge~ considered will be all consistent. Thus~ (p~k) as N increases. for any given - E~k»~ j=l, ... ~r~ Consequent1y~, (~1"""~ """'r "ttl ) in (3.15)~ the tests (1.6)~ (not all null), the power of the test (3.17) or (4.31) or (5.22) will be asymptotically equal to unity. of the tests~ Hence~ forthe study of the asymptotic power properties we shall consider a sequence of alternative hypotheses for. which the power asymptotically lies in the open interval -.1. ~: ;:j = N 2 1y (E.~1). This we specify as j=l, ... ~r, (5.23) where A., j=l, ... ,r are all real p-vectors, not all equal (or null). J Further~ for simplification of the asymptotic power function, we shall assume that the cdf (k) (k, q) (k.1 q) ) F[ j] (x), F[ j, j] (x~ y) and F[ j~.e] (x, yare all absolutely continuous and have continuous density functions. Under {~} in (5.23)~ we will thus have sequences of (k) cdfts {F[j],N(x)} etc, defined for each N, and it is easy to verify that lim F(k) N=oo [ j], N (x) = H(k)(x) for all J'-l - , ••• , r , H~k, q) (x, y) (5.24) for all j=l, ... 
, r, lo/q=l, ... , P (5.25) = H~k, q) (x, y) for jr.e= 1, ... , r,. k, q= 1, ... , p. (5.26) I I I_ -27- Hence, in this case also (5.16) holds in the limit as ~ I I I I I I I •I I I I I I Ie I k dx = OOf JL J(k)(H(x)(x» N~. Also, if we define dF(k)() x, k=1 - , ... , p, (5.27) -00 then,. it is easy to show that (5.28) for all j=l, ... ,r, k=1, ... ,p. lows that under {~},. S~ Hence, from the results of theorem 5.1 it fo1- has asymptotically a noncentra1 chi square distribu- tion with p(r-1) d.f. and the noncentra1ity parameter (5.29 ) where -(q) _ A (q)/ r -Loj=lAj _ r, forq'-l, ... ,p . Now, from theorem 4.2, (5.24), (5.25),. (5.26) and the discussion following it, it follows that under {~} also SN! S~, and hence, we have the followin~. THEOREM 5.3. Under the seguence of alternatives {~} in (5.23), SN' defined by (3.15), has asymptotically a non-central chi square distribution with p(r-1) d. f. and the non-centrality parameter 6 ' defined by (5.29.), provided the conS ditions of theorem 5.1 hold, and in addition, the marginal cdf's corresponding to the joint edf G(Y. , ... ,Y. ) are all absolutely continuous and have continuous . "'1 1 "'1r density functions. If we consider the large sample test, defined by (5.22),. then it can be shown similarly that SN ! S~, under {~}, and hence, the conclusions of theorem A 5.3 also applies to SN. Thus, the permutation test considered in sections 3 and 4 and the large sample test considered in (5.29), are asymptotically power ] -28- ] (~}r ]e equivalent for the sequence of alternatives J J J to recommend the use of the same, for all sample sizes. ] in (5.23). the permutation tests are easy to define for small samples, we are now in a position In the parametric case, the limiting distributions of various test-statistics for this problem have been studied by various workers r and the reader may be referred to Anderson (1958, Ch. 8'10), Rao [(1952, Ch. 7), (1965, Ch. 8)], and James (1960), among others. 
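As a numerical aside, the limiting power implied by a noncentral chi square limit with p(r-1) d.f., as in theorem 5.3, can be evaluated once a value of the noncentrality parameter is given. The following is a minimal sketch only, not part of the original development. The function names (`chi2_cdf`, `chi2_critical_value`, `asymptotic_power`) are hypothetical, the noncentrality value is supplied by the user rather than computed from (5.29), and the sketch assumes the convention in which a noncentral chi square variable with noncentrality D is a Poisson(D/2) mixture of central chi squares.

```python
import math


def chi2_cdf(x, df):
    """Central chi-square cdf, via the power series of the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a            # n = 0 term of sum z^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    log_front = a * math.log(z) - z - math.lgamma(a)
    return min(1.0, total * math.exp(log_front))


def chi2_critical_value(df, eps):
    """Upper eps-point of the central chi-square distribution, by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > eps:   # expand until the tail is below eps
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf(mid, df) > eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_power(p, r, eps, noncentrality):
    """Limiting power of a level-eps test whose statistic is asymptotically
    noncentral chi-square with p(r-1) d.f. (assumed convention: Poisson
    mixture with mean noncentrality/2)."""
    df = p * (r - 1)
    crit = chi2_critical_value(df, eps)
    half = noncentrality / 2.0
    weight = math.exp(-half)              # Poisson(half) probability of 0
    power, cum, j = 0.0, 0.0, 0
    while cum < 1.0 - 1e-12 and j < 1000:
        power += weight * (1.0 - chi2_cdf(crit, df + 2 * j))
        cum += weight
        j += 1
        weight *= half / j                # Poisson probability of j
    return power


if __name__ == "__main__":
    # p = 2 responses, r = 3 treatments, level 0.05, illustrative noncentrality
    print(asymptotic_power(p=2, r=3, eps=0.05, noncentrality=10.0))
```

With the noncentrality set to zero the function returns the level of the test, and the returned power increases with the noncentrality. Two tests whose limiting noncentral chi square distributions share the same d.f. can therefore be compared through their noncentrality parameters alone.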
Most of the results relate to the null case, while it may be considerably difficult to formulate a general theory for the non-null cases, though some work has also been done on this line. For the likelihood ratio test, however, the asymptotic non-null distribution may be found without much difficulty, and for the sequence of alternatives in (5.23), this statistic can be shown to have asymptotically a noncentral chi square distribution with $p(r-1)$ d.f. and the noncentrality parameter

[display (5.30), a double sum over $k, q = 1, \ldots, p$, illegible in source]

where $\lambda_j^{(k)}$ and $\bar{\lambda}^{(k)}$ are defined by (5.23) and (5.29), respectively, and $\boldsymbol{\Sigma}^{-1} = ((\sigma^{kq})) = ((\sigma_{kq}))^{-1}$ is the reciprocal of the common dispersion matrix $\boldsymbol{\Sigma}$.

The comparison of the noncentrality parameters in (5.29) and (5.30) (for the purpose of studying asymptotic relative efficiency) poses the same problem as has been studied in some detail by Puri and Sen (1966). For brevity, this is therefore not reproduced again. The only remark that may be made here is that if we take the $E_{N,\alpha}^{(k)}$'s (defined by (2.9), (2.10)) to be the expected values of the order statistics in a sample of size $N$ drawn from a standardized normal distribution, and term the resulting test the normal score MANOVA test for two way layouts, then it is easily seen that for normal alternatives this test is asymptotically power equivalent to the likelihood ratio test. In actual practice, the use of rank sums (i.e., $E_{N,\alpha}^{(k)} = \alpha/(N+1)$, $\alpha = 1, \ldots, N$, $k = 1, \ldots, p$) often results in a considerably simplified procedure and at the same time does not involve any serious loss of efficiency. For details of these points, the reader may be referred to Puri and Sen (1966), the same argument being true in the two way layout case.

REFERENCES

ANDERSON, T. W. (1958): An introduction to multivariate statistical analysis. John Wiley and Sons Inc. New York.

ANDERSON, T. W. (1965): Some multivariate nonparametric procedures based on statistically equivalent blocks. Proc. Inter. Symp. Multivariate Analysis. (in press).

BENARD, A. and VAN ELTEREN, P. H. (1953): A generalization of the method of M-rankings. Proc. Kon. Ned. Ak. Van Wet. Ser A. 56, 358-369.

BHAPKAR, V. P. (1965): Some nonparametric tests for the multivariate several sample location problem. Proc. Inter. Symp. Multivariate Analysis. (in press).

BROWN, G. W. and MOOD, A. M. (1951): On median tests for linear hypotheses. Proc. Second Berkeley Symp. Math. Stat. Prob. II, 159-166.

CHATTERJEE, S. K. and SEN, P. K. (1964): Nonparametric tests for the bivariate two sample location problem. Calcutta Stat. Asso. Bull. 13, 18-58.

CHATTERJEE, S. K. and SEN, P. K. (1966): Nonparametric tests for the multivariate multisample location problem. Ann. Math. Stat. (Submitted).

CHERNOFF, H. and SAVAGE, I. R. (1958): Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Stat. 29, 972-996.

DURBIN, J. (1951): Incomplete blocks in ranking experiments. Brit. J. Psychol. 4, 85-90.

FRIEDMAN, M. (1937): The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Amer. Stat. Assoc. 32, 675-99.

HODGES, J. L. Jr. and LEHMANN, E. L. (1962): Rank methods for combination of independent experiments in analysis of variance. Ann. Math. Stat. 33, 482-497.

LEHMANN, E. L. and STEIN, C. (1949): On the theory of some nonparametric hypotheses. Ann. Math. Stat. 20, 28-45.

LOEVE, M. (1962): Probability Theory. D. Van Nostrand Co. Princeton, New Jersey.

PURI, M. L. and SEN, P. K. (1966): On a class of multivariate multisample rank order tests. Sankhya (Submitted).

RAO, C. R. (1952): Advanced statistical methods in biometric research. John Wiley and Sons Inc. New York.

RAO, C. R. (1965): Linear statistical inference and its applications. John Wiley and Sons Inc. New York.

SEN, P. K. (1965): On a class of bivariate two sample nonparametric tests. Proc. 5th Berkeley Symp. Math. Stat. Prob. (in press).

SEN, P. K. (1966 a): On a class of multisample multivariate non-parametric tests. Ann. Inst. Stat. Math. (Submitted).

SEN, P. K. (1966 b): Rank methods for combination of independent experiments in MANOVA. Part one: two treatment multi-response case. S. N. Roy memorial volume. (in press).

SEN, P. K. (1966 c): On some non-parametric generalizations of Wilks' tests for $H_M$, $H_{VC}$ and $H_{MVC}$, I. Ann. Math. Stat. (Submitted).

SVERDRUP, E. (1952): On the limit distribution of a continuous function of random variables. Skand. Aktuarietidskr. 35, 1-10.