I I .- I I I I I I Ie I I I I I I I .e I ON THE THEORY OF RANK-ORDER TESTS AND ESTIMATES FOR LOCATION IN THE MULTIVARIATE ONE SAMPLE PROBLEM by Pranab K. Sen and Madan L. Puri University of North Carolina Institute of Statistics Mimeo Series No. 488 August 1966 This research sponsored by Sloan Foundation Grant for Statistics, U.S. Navy Contract Nonr-285(38) and by Research Grant, GM-12868, from the National Institutesof Health, Public Health Service. DEPARTMENT OF BIOSTATISTICS UNIVERS ITY OF NORTH CAROLINA Chapel Hill, N. C. I I .- I I I I I I Ie I I I I I I I •I 1 ON THE THEORY OF RANK-ORDER- TESTS AND ESTIMATES FOR LOCATION IN THE MULTIVARIATE ONE SAMPLE PROBLEM* by Pranab K. Sen l and Madan L. Puri University of North Carolina, Chapel Hill and Courant Institute of Mathematical Sciences, New York University O. Summary. In the multivariate one-sample location problem, the theory of permutation distribution under sign-invariant transformations is extended to a class of rank-order statistics, and is utilized in the formulation of a genUinely distribution free class of rank-order tests for location. ASYmptotic properties of these permutation rank order tests are studied and certain stochastic equivalence relations with a similar class of multivariate extensions of one-sample Chernoff-Savage- , Hajek type tests are derived. tests are studied. The power properties of these Finally, Hodges-Lehmann (1963) technique of estimating shift parameters through rank-order tests is extended to the multivariate case which includes among other results, the results obtained by Bickel (1964 and 1965). *Research sponsored by SlQan Foundation Grant for Statistics, U. S. Navy Contract Nonr-285(38) and by Research Grant, GM-12868, from the National Institute of Health, Public Health Service. IOn leave of absence from Calcutta University . 2 Let ~_ = (x{l), ... ,x{p)) be a p-variate random 1. Introduction. a "'U a variable having a continuous p-variate cdf (cumulative distribution function) F{x,9) where x = (x(l), •.. ,x(p)), "" "" "" 9 = (91' ... ,9 ), and where p is a positive integer which "" p in the sequel will be assumed to be greater than unity. Let then ~l' ... '~N be N i.i.d.r.v. 's (independent and identically distributed random variables) distributed according to the common cdf F(x,Q). Let 0 be a set of all """" . continuous p-variate cdf's, and it is assumed that F Let now ill € o. denote the set of all continuous p-variate cdf's which are symmetric about some known origin which we may without any loss of generality take to be x "" = 0, "" the symmetry being defined in the following manner. Let L, = P(x1 , ..• ,~) be any set of points P in the p-dimensional Euclidean space Ep . Any point P belonging to is denoted by P € and the image of the point P is z: ' L denoted by P* = P(x *l , .•. ,xp* ) where xi* = -xi' i = 1, ... ,p. The image set * of is denoted by t- L* (1.1) Then, if for any (1.2) L. t P(~ = (~ c. Ep € t:IF) = P(~ € ~IF) we say that F is symmetric about O. "" If, in addition, F is absolutely continuous having a density function f, the symmetry of F may also be defined I I -. I I I I I I _I I I I I I I I -. I I I 3 .- I I I I I I Ie I I I I I I I .I by the invariance of the density function under simultaneous changes of signs of all the coordinate variates. For the convenience of terminology, (1.2) may also be termed as the sign-invariance. Excepting with the later section dealing with the aSYmptotic normality and other properties of a proposed class Of estimates and tests, we will drop the assumption of absolute continuity of F. Thus, for the time being let us frame the null hypothesis HQ of sign-invariance as The two particular classes of alternative in which we may usually be interested are (1.4) F(x) is symmetric about some 6 '" '" ~ 0 . '" This will be the multivariate extension of the well known one sample location problem. H2 : F(x) is asymmetric about x = 0 though '" '" '" it may have the location vector O. '" This is the multivariate extension of the univariate symmetry problem. In the univariate case, there is a third problem whose multivariate extension would be the estimation of location vector 6, assuming F(x) to be symmetric about x '" '" '" = 6. '" Tests of the type mentioned above in the univariate case are due to Wilcoxon (1949), Govindarajulu (1960) and the 4 classical sign test. Estimates in the univariate case are due to Hodges and Lehmann (1963) and Sen (1963). However, in the multivariate case there are only a few suitable tests and estimates. In the bivariate case, Hodges (1955) has considered an association invariant sign test on which some further work has been done by Klotz (1964), and Joffe and Klotz (1962). The test is strictly distribution-free but neither the null nor the non-null distribution (for any suitable sequence of alternative hypotheses) of the test-statistic appears ,to be simple, and so there is little scope for studying the Pitman-efficiency of such a test. The only study of the efficiency of this test is made by Klotz (1962) and that is the Bahadur Efficiency. Further, being a sign test it is anticipated that as in the univariate case, for nearly normal cdf's, the efficiency of this test may be appreciably low. Some sign-tests are due to Blumen (1958) and Bennet (1962). Bluman's test is also strictly distribution-free but it is subject to the same drawback as Hodges' test. test is distribution-free only asymptotically. Bennet's Some further asymptotically distribution-free tests are due to Bickel. His tests are based on generalizations of the univariate sign-statistic and Wilcoxon's signed rank statistic and are distribution free only asymptotically. However, his study of the asymptotic power properties of these two tests provides some useful information about the performance characteristics of them. Very recently, Chatterjee (1966) has considered a I I -. I I I I I I el I I I I I I I -. I I I 5 .e I I I I I I Ie I I I I I I I .e I very interesting permutationally distribution-free bivariate sign test which asymptotically agrees with Bickel's and Bennet's sign-tests. In this investigation this permutation argument will be extended not only to the p-variate case for any p ~ 2 but also to a much wider class of rank-order tests which includes Bickel's tests as special cases, and which are strictly distribution free under permutation arguments. This extension also covers the normal scores and the other type of multivariate score tests on which practically there is no work in the literature. In the problem of estimation too, Bickel (1964) has extended Hodges and Lehmann (1963) type of estimates to the multivariate case using only the median and rank sum procedures. In this investigation a much wider class of rank order statistics will be used, and this will be a further generalization of Bickel's work. 2. The Basic Permutational Argument. Let us denote the sample point by (2.1) and the sample space by ~. Then under the null hypothesis to be tested the joint distribution of under the following finite group ~ remains invariant ~N of transformations gn given by (2.2) ji = 0,1; i= 1, .•• ,N, 6 where (-I)X "tt = (-X(l), •.. ,_x(p)). a u Thus there are 2N such possible (gn) under which the distribution of ~N remains invariant. Hence, given ~N' we can generate a set SN of 2N points ~N in ~N' such that all these 2N points will have the same probability. Hence, conditioned on the generated set SN' there are 2N = M points and the permutational probability measure associated with each point is the same, viz., 2 -N , when He given by (1.3) is true. Let us now denote by ~(~N) a test function l'Jhich with each -tN associates a probability of rejecting HO • NOH if w'e use the known permutational probability measure of ~N then it readily follows from the well known result on SN' of Lehmann and Stein (1949) that a test function ~(-tN) based on this permutational probability measure will possess the structure SeE) (cr. Lehmann and Stein (1949)), and hence will be a strictly distribution-free test. This characterizes the existence of strictly distribution free test o < E < 1, for the hypothesis He of size E, given by (1·3)· Now in actual practice ~(-tN) has to be constructed with special attention to the class of alternatives in mind, and in most of the multivariate problems, ~(~N) depends on -tN through a single-valued test statistic TN to have a comprehensive test. = T(~N) in order Due to these reasons we shall consider the following procedure. Let us rank the N values of IX(i)l, a a = 1, ... ,N, in order of magnitude, and let R(i) denote the rank of Ix(i)1 in this set. a a I I -. I I I I I I el I I I I I I I -. I I I 7 (i) - (i) Then (R l ' ... ,R N ) is a permutation of the numbers 1,2, ... ,N . This procedure is adopted separately for each i = 1, ... ,p. •e Thus, conditioned on a given ZN' an observation X having p '" I I I I I I Ie I I I I I I I .e I I'\.Q. values is converted into It,., ''''''''' Let nOl'J = (R(l), ... ,R(P)) , a a a=l, ..• ,N. (E ( i); N,a a=l, ... ,N) be a sequence of N real numbers; the sequence may of course vary over i=l, ... ,p. The value of ~i) corresponding to the value a = R(i) is ,a . a denoted by E(i) (i); a= 1, ... ,N; i= 1, ... ,p. i~,Ra(EN(i); a ,a Thus, = 1, •.. ,N); i = l, ... ,p, are selected corresponding to R in (2.3) we have a p-tuplet of I'\.Q. real values (1) (2.4) ~,a = (E and, we denote by ~N (1), .•. ,E N,Ra ~ = (1) E N,Ri l ) , = ~N (1) .. . , E .. . E . (p) ~,N a=l, ... ,N, (p)); N,Ra the following p by N matrix !N,l (2·5) (p) E N,Ri p )' , N R(l) , -N (p) N,RAp) is really a stochastic matrix, its i-th row being a random permutation of (E~ii, , . ... ,~i~) , for i = l, ... ,p. Thus there are (N~)P possible values of ~N' ~N and , ~N ~'Jo realizations are said to be equivalent when it is possible to arrive 8 , at ~ of ~N' , from ~N by only a finite number of inversions of columns Thus, we may always consider the first rO\J of natural order. ~N in There will be thus (N!)p-l non-equivalent realizations of ~N' We denote this set of (N!)p-l realizations ~N' ~N will be termed a collection rank order matrix and ~N will be termed as the s ace of collection matrice~. by Let us now consider a sequence of N diagonal matrices (C , a=l, ... ,N) each of order p by p, where I\(l , C = diag (C (l) , . . . , C(p ) ) (2.6) a '\.Q. C~i) = (_l)a i ; if X(i) (2.8) a ai=O,l; > 0, a i=l, ... ,p; and a i = 1 a=l, ... ,N, if x(i) a < 0 fa Thus, each is a stochastic diagonal matrix, which can have 2P realizations. Let us now define a p by pN matrix which we term the collection sign matrix. This is again a stochastic matrix which can have 2PN realizations. The set of 2PN possible realizations of £(N) is denoted by tN' and is termed the space of sign-matrices, while £(N) will be termed as the collection sign-matrix. This-is again a stochastic matrix which can have ~N possible realizations. The set of ~N possible realizations of £(N) is denoted by (N and will be called the space of sign-matrices, while £(N) will be termed the collection sign-matrix. I I -. I I I I I I el I I I I I I I -. I I I .e I I I I I I 9 Let us now consider the pair of stochastic variables (~N'£(N» which lie in the product space (~'£(N» probability distribution of (~N;(lJ) ~ The on this product space (defined on an additive class of (product) cr-fields of subsets of (~N'£N» depends on the unknown F{~), even when F€oo, that is, even when HQ is true. It follows from (2.2) that if we consider the finite group ~N of transformations gn (containing 2N elements), then the probability distribution of £(N) or (N remains invariant under the transformations in ~~, that is, lf (2.10) gN£(N) = «-1) - j Jl jN £1'"'' (-1) ~N)' gN GN ' € J j ... ,(_l) ac(p», a=l, .•. ,N C = diag.«-l) ac(l), "\.Q. a a Ie where (-1) I I I I I I I distribution of C(N) remains invariant under any such transformations gN' Again by definition of R (i) ,i=l, ... ,p, .e I Q (with Ja = 0,1 for a = 1, ..• ,N), then the probability Q. a • 1, ... ,N, and of that !N ~N (in (2.3), (2.4) and (2.5» we see remains invariant under any transformation gN or the original vector ~ € ~N' € ~N Hence if we consider the stochastic matrix (2.11) * (1) gN~ = (~Nl (gN (N) £1), ... , ~NN(gN £N» g~i) = (_l)J ij J =O,lj i=l, ..• ,Nj then it readily i follows that under HQ in (1.4), the joint distribution of where * conditioned on the given ~N {which implies that given . (~'£N» remains invariant under gN € ~N' Thus, if we gN~ consider the set of 2N values 10 (2.12) we get a set of 2N permutationally equ1valent (probab111s1tc) points, and the permutational probab111ty measure on this set is denoted by 09N, ~N' that i~ given (!N, Wh~Ch is a conditional measure given ~N)· We shall now develop a strictly distribution free test procedure under this permutational probability measure. Let us define j = l , •.. ,p; (2.14) Thus, T~j) is the difference of the sum of the rank fUnctions (~») for which x~j) > 0 and the sum over the remaining set. It follows eas11y that (2.15) E(:eN' f N } = ° = (0, ... ,0) ~ for any given (~I£N) • combinat10n j ~ Let us now define for each (k) . (4) k (= 1, ..• ,p) four subsets Sjk , ... ,S jk ' where (2.16) c(k) S(l) jk = (~ C(j) < 0, s(2) Jk = (~ C(j) < 0, C(k) a S(3) jk = (~ c(j) > 0, S(4) jk = (~lb C(j) > 0, a a a a < 0) )0 0) C(k) < 0) C(k) > O} a a a I I -. I I I I I I _I I I I I I I I -. I I I 11 .e I I I I I I Of course, if j null sets. .e I Since under the transformation gN on ~, both c~j) and c~lC) change signs simultaneously, the product E(jlj) E(klk) will have a positive sign for s~~) a::aa neg::tve sign for remain invariant under gn € and s~~), G N. S~~) and S~~) and the same will Hence it can easily be shown that if we define 1 (2.17) (j) (k) (j) _ (k) VN,jk = N (y- E (j)E- (k) +~ E (j)E- (k) - SfI) NR NR gTIf) NRa NRa. a a jk jk - (k) (j) - ? E (j}E Sf2) jk Ie I I I I I I I then S3~) and s~~) will be the = k, NRa (k) -~ E S ( 3) NRa. N (j}E NRa jk 1 (k) (j) (j) 2 vN, jj = N ~ [11c] (2.18) (k», j#k=l, ... ,p; NRa , j=l, .•. ,p; then (2.19) j,k = 1,. H'P. Let us now denote by v = (" ",N _ N,jk ) j,k=l, .•• ,p (2.20) and assume (for the time being) that XN is positive definite. (!N being a covariance matrix will be positive semidefinite at least. If XN is singular, then we can work with the highest order non-singular minor of XN and work with the 12 corresponding variates). conditions ~N Later we will show that under certain will be positive definite. Thus, we consider the following positive semi-definite quadratic form (2.21) where -1 ~N is the inverse matrix of ~N. If the null hypothesis is true, the center of gravity of will be zero, and conditionally on given ~N' that is, N on given(~'£(N»' there will be 2 permutationa11y equally ~N likely (not necessarily all distinct) values of SN. other hand, if HQ On the is not true, it can be shown, as in the univariate case that ~ T~j) converges in probabi~ity to some non-zero quantity for at least one j = l, ... ,p and hence SN/N will converge to a non-null quantity. will be stochastically large. right hand side tail of the permutation distribution of SN (2.22) ~(~N) = Thus the critical function SN,E(~N) 1 if SN aN,E if SN = SN, E (~N) 0 if SN > < SN,E (~N) where SN,E(~N) and a N,£ are chosen such that O<E<l, provides a strictly size E similar region. For small samples, SN,E(~N) -. I I I I I I _I Thus, SN This suggests the use of the as the suitable critical region. I I and aN,E are to be determined from the actual permutational cdf of SN' while for large I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I .e I 13 samples, we have proved in the next section that 2 SN , E(zN) and aN E converge in probability toJ(,p,E 2 and zero respectively as N -> 00, where}' E is 2 r"'P , the 100 (l-E )<to point of the distribution (2.23) ~, 'I- with p degrees of freedom. 3· Properties of ~N and Asymptotic Permutation Distribution of SN. and of (x(j),x(k)) Let us denote the marginal cdf of x(j) a a a and Fj,k(x,YiQj''k) respectively, and let x > 0 o -< x,y -< Set FN, j (x) = (number of ~ j) ~ x)/N (3.5) (3.6) 00 I I 14 -. Finally, as in Chernoff and Savage, we write a = 1, •.• ,N, = j I I I I I l, ... ,p, where IN,j though defined only at l/(N+l), •.. ,N/(N+l) may have its domain of definition extended to (0,1) by letting it have constant value over (a/(N+l),(a+l)/(N+l)). Further- more, we make the following assumptions: Assumption 1. ~oo IN,j(U) = Jj(U) oxists for ° < U < 1 and is not a constant. Assumption 2. j[JN'/N~l l1f,jl - Jj{N~l Hjl] 1 2 dFN,j(xl=op{N- / -00 Assumption 3. J j (u) is absolutely continuous, and IJj(i) (u) I = I d(i)J (u) du (r) I _ -5-i- 1. ~ K[u(l-u)] 2 , i = 0,1 for some K and some 5 00 > 0. 00 Let us now define 0000 v jk = JJ Jj[Hj(X;Qj)]Jk[Hk(y;Qk)]dHj,k(x,Yigj,Qk), 00 j,k= l, ... ,p ). • el I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I .e I 15 and v = (v . jk ) j,k=l, .•. ,p • We may remark that the assumptions 1 to 3 are needed to prove the joint asymptotic normality ot the permutation distribution or t N given by (~.14). Assumption 4 is required to establish the stochastic convergence ot the permutation covariance matrix ~N to Xwhich we shall require inthe sequel. Under the assumptions 1 to 4, Thebrem 3.1. probab11ityas N -> ~, where Z= (v ZN -+ :t in jk ) is given by +co +cd (3.10) vjk Proof: = Jo J.tj[Hj{XjQj)] Jk[Hk(YjQk)] dHj,k(x,YjQj,Qk) 0 . From (2.17), using (3.6) and (3.7) and the assumption 2, we rind that ~~ VN,j,k JJ IN,j[N~l ~,j(X)] IN,k[N~l ~,k(Y)] d~,j,k(x,y) =JJ J j[N~l ~, j(X)] Jk[N~l ~,k(Y)] d~, j,k(x,y)+ = 00 ~ 00 Now, writing Jj[N~l ~,j(x)] = Jj(H j ) + (~,j-Hj)J;(Hj) ~ ~il J;(H j ) 16 I I -. I (3.l4) and proceeding exactly as in [14] and [26], we find, omitting the details of computations that, 0000 JJ J j[ H (x;Q vN, j, k = 00 j j)] J k [ Hk(YiQk) ]dH j , k(X, Yi 9 j ,9 k )+op (1) I I I I I el Corollary 3. 1 ~ .i ;1:1' I where F j and Fj,k are symmetric about zero. (ii) The assumptions of Theorem 3.1 are satisfied, then ~N * in probab iIity as N....;.. -'> ~ 00 , where ¥, *= (v *jk) is given by +00 (3.16) v;k = +00 J,! J [2F (x}-l]Jk[2Fk (Y)-l] dFj,k(X'Y) -00 Theorem 3.2: j j . - 00 Under the assumptions 1 to 4 N- 1/ 2TN -has ...;.--.;.~;.........;~~...;...;;;...~---.;;;~.;......;;;......;~-' aSymptotically a p-variate normal distribution with mean vector zero and covariance matrix is given by (3~10). ~ = (V jk ) where (V jk ) I I I I I I -. I I I 17 .e I I I I I I Ie I I I I I I I .e I Proof: To prove this theorem, it suffices to ShO\'1 that for any arbitrary constant vector ~ = (6 1 , ... 6p )' the distribution of N- l / 2 5 TN' is asymptotically normal under '" '" the permutational probability measure ~ .. From (2.13) we notice that N P 6 ·'T =~y- 5. '" ",N a= .1=.L J I (3.17) r where by definition ~a(~'£(N),R.) ~.p£N depends upon and ~. Let us nO~1 consider the permutation distribution Generated by 2N transformations gN in(;N defined in (2.2). It then follows from earlier discussion that ~ and 2 remain 11 invariant under (;N or .tN' while £(N) may have 2 permutationally equally probable values: (3.18) which can only assume two values i (3.19) with equal probability ~ for a on the given ( 5 E (j)(j) , .j j ~N,that = 1, ... ,N. is, on the given = 0,1, Thus, conditionally (~'£N)' ) remain fixed, NR while a (C(i) a ' i = 1 , ••• , p ) can on1 y change s i gn s 1mu It aneous 1 y (with in each a) for a =l,.... ,N. Thus, for the 2N, gN € G , = 1, .. . ,Pi a = 1,.... ,N a. N we have the corresponding 2N values of \'Jritten as 2'·TN which may be 18 where (d N ) are mutually independent (under the permutational a probahility measure) and dN = +1 or -1 with equal probability 0: 1/2; and where conditionally on the given ~N' that is, on the given q~N'£(N)) and~, IaN (~N'£(N) ,~) I, 0: = 1, ... ,N 0: are all fixed. It is also easily seen that the permutational average of (3. 20 ) is zero, and the variance N~ VN So if we define WN,o:= laNa(~N'£(N),~)ldN,o:, then ~ I • unde~ ~, (WN,a. ) are independent random variables with means zero and liN ,a. can assume only two values -+1 WN,0: I each i'li th probability 1/2. We will now apply the central limit theorem to the sequence OlN,o.: 0:= 1, ... ,N) under the Liapounoff's condition, viz., r·7 \ ;J • 01' c. • Nhich reduces to showing that for some r 1 lim N-(=~-2)/2. N-> 00 > N 2, r N ~ laN,o: q~N'£(N)'~) I J:. ( -N ~ L- 0.=1 := 2 (E C 0 5) )r/2 aN ,a. ""N' ""! (N)' "" The denominator of the second factor of (3.22) is 2 ~N ~ , which by Theorem 3·1 converges to some positive quantity for any non-null 5 "" positive definite. as ..' is assumed to be "" So we require only I I -. I I I I I I _I I I I I I I I -. I I I 19 .e to show that for some r Since aN,o.(~'£(N)'R.) is a linear function of ~N and R. (with a the sign of the coefficients being a minus or plus depending on Q{N))' on applying the well known inequality that I LC= i=l (Cf. Loeve ailr ~ pr-l ~ lai,r i=l [23J, pp. 155), we get l IaN (R_,C(N)' 5) Ir N~ 0.=1 ,a ~N '" '" ~ pr-l ~ i=l Since, by using Chernoff-Savage conditions, we have for 2< Ie r < 2+5, 5 > 0, 00 I I I I I I I I 2, the numerator of the second factor of (3.22) is bounded in probability. I I I I I I .e > for i = l, •.. ,p , , we see that (3.22) converges to zero in probability as N -~ The proof follows. Theorem 3.3. Under the assumptions 1 to 4, the limiting permutational distribution of SN probability measure = 1 -1' N XN XN !N under the ~, is central chi square with p degrees of freedom. The proof of this theorem is an immediate consequence of Theorem 3.2 and is therefore omitted. Let us now define -:z. ( "J • ?'() '"I ~i +00 = J -00 Ji[Fi(X)- Fi~-X)] dF[i](X); i=l, ... ,p) 00. 20 and ~ = (; l' ... , ~ ). '" p Theorem 3.4. The test based on SN is consistent against the set of alternatives that; '" I o. In particular, if '" F[1](X) is symmetric about some 5 i , 2 = (5 l ,···,5 p )' then; I 0 for any 5 I 0, and hence the test is consistent - '" '" '" '" against any shift in location of a symmetric distribution. Proof: It is easy to show that ~ T~i) as N -+ 00 J.""'or all i probability as N assumption. -> ;i in probability 1 SN..... s I: • Thus, N · -11:s I ln '" '" '" where v is positive definite by = 1, ... ,p. -> 00, '" Consequently SN will be stochastically 2 indefini tely large as N as N ~ 00. Again, SN ,e +'xp ,ex -> 00. the test is consistent against ; ~ O. Furthermore, if '" Fi(X) is symmetric about some 5 i F 0, then it is easy to show that ~i I o. Hence the theorem. Asymptotic Normality of XN for Arbitrary F. In this section we shall prove that under a certain set of conditions the random vector XN (p)) = ( TN(1) , ... ,TN has a joint normal distribution in the limit. For the sake of convenience we shall consider the statistic (4.1) where -. I I I I I I _I Hence and since the right hand inequality tends to one as N --> co, 4. I I I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I .e I 21 N TN,j = 1 L: R(j) Z{j) N i=l ~,i N,i (4.2) z~;I = 1 if ~;i; i = 1, ... ,N; where, xi j ) > = 1, ... ,p, j z~;i 0, and = 0 otherwise. are the constants satisfying the assumptions (1) to (3) of the previous section. * is related to the N ~N* - N -EN where reader may notice that the vector ~N by the relation = ~N _ 2 1 The N ~N (1) N vecto~ (p) EN=N(LENi, .. ·,~ENi) , ~1' ~1' ~'. and so it suffices to consider the equivalent statistic :t~. The main theorem of this section is the following: Theorem 4.1. Under the assumptions (1) to (3) of Sectlon :2.: 1/2 the random vector N -1(. (~N-~N(Q)) has asymptotically a p-varj~t2 normal distribution with mean vector zero, and covariance matrix (aN,jk)' where 00 IJ.N , j = Jo J j [ Hj (x; 9 j )] dF j (x; Gj) , J=l, ... ,p, and where (aN, jk) is given by (4.14) and (4.15) Proof: and as N \vriting I N, j (N+l ~,j) as respectivel~. 22 I I -. and, I I I I I I and, simplifying, we can express TN,j as N TN, j = Il N , j + B1N , j + B2N , j + ~ CiN, j (4.4) where = J ( 4 . 5) Il N , j (4.6) B1N ,j = J Jj(Hj(X,Qj))d(FN,j(x)-Fj(X,Qj) (4 . 7 ) B2N , j = J' (I1J, j - Hj) J ~ (Hj) dF j (x ) (4.8) C1N ,j = - N;l ( 4 •9) C2N , j = J (4.10) C3N ,j = J [Jj(N~l HN,j)- Jj(H j ) J j ( Hj (x, Q j )) dF j ( x, Q j ) J I1J,j J;(H) dFN,j(X) (I1J, j - el Hj) J; (H j (x )) d (FN, j (x ) - F j ( x) ) - (N~l 11-J,j-H j ) J~(Hj))dJN,j (4.11) Proceeding as in Govindraju1u et a1. (1965) it can be shown that the C-terms are all Op(1/N1/2)~ Hence the difference 1IN(TN,j-IlN,j)-~(B1N,j+B2N ,j) tends to zero in probability. Thus to prove this theorem, it suffices to show that for any real 5 i , i = l, ..• ,p, not all zero, N1/ 2 ~ 5.(B1N .+B 2N j) has normal distribution in the limit. j-1 J ,J , Now pr~ceeding as in [4), we can express N1/~5.(B1N j+B 2N .) J~l J , ,J I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I e I I 23 as a sum of independent and identically distributed random variables having finite first two moments. The proof follows. To compute the variance-covariance matrix of B1N ,j+B2N ,j' we note from (4.9) and (4.10) that (4.12) B1N ,j+B 2N ,j= ~(FN,j~X)-Fj(XjQj))J;(Hj(XjQj))dFj(-XjQj) - ~ (FN,j(-X)-Fj(-X;Qj))J;(Hj~X;Qj)) dFj(X;G j ) This has mean zero, and variance = E ( - = ~~FN,j(X)-Fj(X;Gj»J;(Hj(X;Gj» ~ (FN,j(-X)-Fj(-XjQj»J;(Hj(X;Qj» dF j (XjQj»)2 ~ ~~ Fj(X;Qj)[l-Fj(YjQj)]J~(Hj(XjQj»J;(Hj(Y;~j»dFj(-XjQj) o<x<y<oo (4.13) + ~ ~ dF (-yj9 ) -- j ~~ Fj(-y;Qj)[l-Fj(-XjQj)]J;(Hj(X;Qj»J;~Hj(YjQj» 00 dFj(XjG j ) dFj(Y;G j ) ~ ~'Fj(-YjGj)[l-Fj(Xj9j)]J;(Hj(X;Gj»J;(Hj(Y;Gj» x=O Y=O dF j (-X j 9 j ) dFj(YjG j ) [Note that the application of FUblni's theorem permits the interchange of integral and expectation.] Hence j o<x<y<oo 00 - dFj(-XjG j ) I I 24 -. Simj.larly; 00 (4.15) O'N,jk= 00 J J [Fj,k(x,YiGj,Gk)-Fj(X;Qj)Fk(YiGk)]J;[Hj(X;Gj)J x=o Y=O 00 00 -J J [Fj,k(x,-Y;Gj,Gk)-Fj(XiGj)Fk(~YiQk)]J;[Hj(XiGj)] x=o y=o , Jk[Hk(Yi Gk )] dFj(-X;G j ) dFk(YiQk) 00 00 -J J [Fj,k(-X,Y;Qj,Qk)-Fj(-X;gj)Fk~y;gk)]J~[Hj(X;Qj)] x=o y=O 00 + 00 J fiFj,k(-X,-y;Qj,Qk)-Fj(-X;Gj)Fk(-Y; Gk)]J;[Hj(X;Gj)~ x=o Y=O I I I I I I _I I 5. Asymptotic Distribution Under Translation Alternatives. From this section onward, we shall concern ourselves p with a sequence of admissible alternative hypothesis HN, which specifies that for each j,k = l, ... ,p, g F.(Xi Q .) = Fj(X- .=.J.); J_ J yIN (5.1) w~ere Fj~~) g. Fj,k(x,y;Qj;Qk) = F. J, k Q k (x- --.J..,y_ - ) 1"N 11N and Fj,k(X'Y) aresymmetric about zero, and G = (G l , ... ,Gp ) is unknown. We shall also assume that the constant ~ji, , i=l, ... ,N; I"IJ j=l, ... ,p, is the expected value of the i-th order statistic I I I I I I -. I I I 25 .e of a sample of size N from a distribution function ?fJj(x) given by x I about zero or uniform over (0,1). Jj(x) I -1 = ¥j It may be noted that implies that the function *-1 l+x (x) =?/J j (~) • We first prove the following lemma. If J j Lemma 5.1. ( 5 • 1) 4 II +1I( x=o = ?/J -1 , then j F j (x )[ 1-F j (y ) ] J ; ( 2F j (x ) -1 ) J ; ( 2F j ( Y) -1) dF j (x) dF j ( Y) - O<x<Y<~ ~ .e E~;l the above definition of I I I I I I I I I 0 , * where ?fJj~X) is a distribution function either symmetrIc I I I Ie > - ~ 1-F j (x) )( l-F j ( y) }J; ( 2F j (x) -1 ) J ; ( 2F j ( Y) -1) dF j ( x ) dF j(Y ) y~o· == ~ II II j (x) [ 1-Hj ( Y) ] J . j (2F j (x) -1 ) J ; ( 2F j ( y) -1 ) d H . (x ) dII. ( y) J .J O<x<y<oo 1 + %( - = i!J~ o JI ~ (5·2) J j (x) dx) 2 0 (" ) - ax · ~ x=O y=O 00 I [F j, k(X' y) -F j (x)Fk(y)] J; (2F j (x) -l}J~( 2F k (y) -1 )dF j (-x )dFk(-y) - ~ - 26 00 00 -J J [F j , k ( - x, y) - F j ( -x) Fk ( Y) ] J ~ ( 2F j (x) -1 ) J ~ ( 2Fk ( Y) -1 ) x=O y=O dF (x) dF ( -y) j 1c 00 00 JJ [F j , k ( -x, - y ) - F j ( - x) Fk ( - y )] J ~ ( 2F j ( x) -1) J ~ ( 2Fk ( y ) -1 ) x=o y=O dF j (x) dF k ( y ) + 00 = % 00 JJ -00 J j ~ 2F j ( x ) -1) J k ( 2Fk ( Y) -1) dF j , k ( x, y) , -00 Proof: To establish the first equality of (5.1), we note that Hj~X) = 2F j (x)-1, and so the right hand side of the first equality can be written as 4 fJ"' (2Fj(X)-1)(1-Fj(Y))J~(2Fj(X)-1)J;(2Fj(Y)-1) O<x<y<oo - + ~ (J dFj(X) dFj(Y) 1 · J j (x) dX) 2 , o which equals 4 Jf Jf F j (X)[1-F j (Y)]J;(2F j (X)-1)J;(2F j (Y)-1) dFj(X) dFj(Y) O<x<y<oo - - 4 (l-Fj(x)) (1-F j (Y))J;(2F j (X)-1)J;(2F j (Y)-1) dFj(X)dFj(Y) O<x<y<oo +~ ( J o that is, 1 J j (x) dx) 2 I I -. I I I I I I el I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I .e I 27 JJ o 4 Fj (x)[ l-F j (Y)] J ; (2Fj (x) -1 )J; (2F j (Y)-1 ) dF j (x) dF j ( y) <x<y<oo . 00 00 +2JJ (l-Fj(X)) (1-F j (Y))J;(2F j (X)-1)J;(2F j (y)-1)dF j (X)dF,1(y) x=o yeO' J1 00 -4 00 x=O yeO I! (1-Fj(X))(1-Fj(Y))J;(2Fj(X)-1)J;(2Fj(Y)-1)dFixldFj(Y) . . . 2 + 1j: ( J j (X ) dx ) • - 0 The last two integrals cancel out, and this proves the first equality. To prove the second equality, note that we can rewrite the second equation as 1 JJ x(l-y)J' (x)J' (y) dx dy + ~( J Jj(X) dx)2, o<x<y<oo . 0 ~ and the result follows as in Chernoff-Savage (1958). To prove (5.2), we note that when t is uniform over (0,1), the left hand side reduces to 00 00 IIFj,k(X'y) dFj(-X) dFk(-y) x=o yeO 00 - -oi 00 11 F j,k(X'y) dFj(-X) dFk(y) 7; -b!f x=o yeO 00 00 J' J Fj , k (-x, y) dF x ) dFk(y) -~ x=O yeO + J J Fj,k(-X'-y) dFj(X) dFk(y) - -d: . x=o yeO - j ( 00 00 Noting that Fj(X) = 1 - Fj(-x), making simple transformations, and integrating by parts, the above expressions equal 22 -. ~~ JJFj (x) Fk (Y) dFj , k (x, y ) - i -00 -00 The proof of the case when Jj(X) = ~jl(x) where ?j(X) is given by (5.2) follows by using the relations j=l, ... ,p, and proceeding as above. Theorem 5.1. (i) If, for ever~ j = l, ... ,p, the conditions of Theorem 4.1 are satisfied 9 (1.i) Fj(.X;QJ o. ) = Q r 1"N 1/N k Fj(X- .:..J.); Fj(X,YjQj,Qk) = F.(x-.=..1 ,y- ._. ) IN J * then N"1./2 (!N-~N) has aSYmptotically a p-variate normal distribution with mean vector zero, and covariance matrix ('t"jk) where 't"jk = The proof of this theorem is a direct consequence of Theorem )~.1 and Lemma 5.1 and is therefore omitted. It may be noted that the limiting distribution of N1/2(T*_~ ) is nonsingu1ar if and only if the f~Llnctions JJ. ""N I'CN and F j are such that the moment matrix of is non-singular. I I (Jj(2Fj~X)-1~j=lJ.. _P} I I I I I I _I I I I I I I I -. I I I 29 .e The following theorem gives the conditions under which the limiting distribution of I I I I I I I .e I is singular. Let Xj , j = 1, ... ,N, be random vectors satisfying the assumptions of, Theorem 5.1. Then the Theorem 5.2. I I I I I I Ie Nl/2(~;_~N) asymptotic distribution of k < P if and only if Nl/2(~;-~N) is k-variate normal, a.s. F. = ~ a kJ k [2Fk (y)-1] Jj~2Fj~X)-1) [a + constant, k are some constants]. The proof of this theorem is analogous to that of Theorem 3.3 of Bickel (1964) and is therefore omitted. 6. The Proposed Class of Asymptotic Tests and its Limiting Distribution. Ive no\'l assume that the covariance matrix ~* defined by (5.3) is non-singular. hypothesis He = ('1" jk) Then for testing the given by (1.3), we propose to consider the * defined - as test statistics SN . A * * (0) --*SN = N(~N-~N ) ~ (6.1) 1 * (~N-~N(O)) , '" where TN* 1\ is defined by (4.1) and (4.2); z:..* - '" *-1 consistent estimator of L. ' and '" (6.2) where 00 = J x=O J j ( 2F j (x) -1) dF j (x) , -1 is a 30 Lemma 6.1. (i) If the conditions of Theorem 5.1 are satisfied, (ii) the assumptions of Lemma 7.2 of Puri (1964) hold for each function F j and J j ; J = l, ... ,p then, Nl/2(!;-~N(O)) has aSymptotically a p-variate normal distribution with mean vector, (6.4) where 00 cJ = - 2 Jo J;(- and covariance matrix [Here fj~X) 2Fj (x )-1 )f J (x) dFj (x) 2:'* = jk ) given by (5.3). '" of Fj(x).] is the density (T I I -. I I I I I I _I The proof of the lemma follows by noting (that by I assumption (ii)) (6.6) and then applying Slutski's theorem ([12], p. 254). A consequence of this lemma is the followinG Theorem 6.1. If the assumptions of then for N -+ 00, * 6.1 are satisfied, the limiting distribution of *-1 * N{lN-~N(O))2: Lennna (TN-~N{O)) , is non-central chi square with p degrees of'" freedom andnoncentrality parameter I I I I I I -. I I I 31 - .e I I I I I I Ie I I I I I I I .e I where a is Given by (6.4) and (6.5), and ~** = ~T;k) j.s gj.'y en bX (6.8) /\".-1 No\'l since N~ is a consistent estimator of ~ * "" then for 11 ~s ~ -1 "" have the following If the assumptions of Lemma 6.1 are satisfied, Theorem 6.2. * SN 2:" 00 , _the limiting distribution of the statistic non-central chi square wit~ p degr~~s of freedom, * siven b~ (6.7). and non-centralit0~~a~et~~5(SN) /\ -1 * From Theorem 6.2, it is clear that the choice of> '-'" is of no i.mportance in the limit. k17 consistent estin:ator *-1 of L \·Jill pre8erve the asymptotic distribution of the "" i{.. statistic S11. One such con~istent estimator is provided by the permutational covariance elements in Section 3. He m3.Y c.lso use a theorem by Bhuchongkul (1964) to propose a si.milar class of esti.ma tes. HO'ilever here condltions are rela tively mo:"e restrictive (since they relate to 2.S~·:lptotic normality) than the ones in Theorem 3.1, and for our purpose, \'le need not bother about the conditions as we simply require here the asymptotic convergence (not normality) of the estimates. .!~~~em_~:2' and the Th9_.Ecrmutation test based on SN .fjivenJ>Y (2.21) asymp~oticallY * non-pal"ametric test based on SN ,g~vel1 32 33 .Q:l. (6.1) are asymptotically power equivalent for the sequence of translation type of alternatives defined by (S.l). Proof: From Theorem 5.1, it follows that for any such translation type of alternatives, the covariance matrix of the proposed vector of rank order statistics converGes asymptotically to _* L- , of which '" .1\* 2: is a consistent estimator. '" Again by Corollary 3.1, it readily follows that under any such sequence of translation type of alternatives, the permutation covariance matrix in Theorem 3.2 also converges * to ~. Thus by looking at (2.21) and Theorem 6.2, lIe obs;ve that under (5.1), SN cally equivalent. ~ S; where ~ means stochasti- The proof follows. I I -• I I I I I I _I By virtue of the stochastic equivalence of the tests * in the next section. of the unconditional test based on SN I I Special Cases: I * SN and SN' A. Vole Let Jj(u) * shall only consider the asymptotic properties = u, * test reduces to the rank-sum then the SN - SN(R) test (which may be regarded as a p-variate version of the univariate one sample test) considered in detail by Bickel (196S). In this case the non-centrality parameter (6.7) reduces to 5 (R) = where y = (1 jk ) is given by Q 1- 1 9 I I I I I -. I I I .- I I I I I I Ie I I I, I I I I .e I (6.10) J f~(X)dX)2 1/12( ,j =k +00 +00 JJ -~ F j (x) Fk ( Y) dF j I k ( x, y ) - 1/4 -~ ~ k --------------- , j r +00 +00 (J r~(X)dX) if r~(Y)dy) -co -co B. Let J j be the inverse of a chi-distribution with one * test reduces to the normal degree of freedom. Then the 3 N * (lN) test which may be regarded as a p-variate scores 3 N version of the univariate one-sample normal scores test. In this case the non-centrality parameter (6.7) reduces to 5(}) = 9 ~-l g' (6.11) (6.12) ~jk where ~ = (~jk) is given by j = ; = k j ., k . 35 7· Asymptotic Relative Efficiency. The asymptotic efficiency of one test relative to another in the present context can easily be obtained by a method given explicitly by Hanan (1956). as follows. It may be stated roughly If the two test statistics have, under the alternative hypothesis, non-central chi square distribution with the same number of degrees of freedom, the asymptotic relative efficiency of one test with respect to the other test is equal to the ratio of their non-centrality parameters after the alternatives have been set equal. For detaiis, the reader is also referred to Bickel (1965), Puri (1964) and Sen (1965) among others. Our main interest will be to study the relative performance of (i) the S~(~') test, (ii) the S;(R) test and (lii) the Hotelling's T~ test, which too has asymptotically a non-central chi square distribution with p degrees of freedom, and non-centrality parameter , Q where 11 = n", (a jk ) is the covariance matrix of the underlying - distribution function F. Thus, applying the definition of Hanan, and denoting eT,T* as the asymptotic efficiency of a test T relative to T*, we have (cf. (6.7)) e where ~** * 2 SN,TN = (9 L **-1 , = (T * ) is given by jk covariance matrix of F. 9) / (q -1 , 11 (6.8) and 9 ) TT = (cr jk ) is the I I -. I I I I I I el I I I I I I I -. I I I ,e Special Cases: (a) Normal Scores and Rank Sum Tests. From (7.2), we find that the efficiencies of the I I I I I I Ie I I I I I I I .e I normal scores * sNell * test and the rank sum SN(R) test relative to T~-test are (7.3) e where A = (7.4) * ~Ajk) e 2 SN(lJ ,TN * (Q"A- = l' Q )/(Q is given by (6.12) and 2 = -(Q r -1' Q )/(Q SN(R),T -N where r = (r jk ) is given by (6.11). , 1T- 1Q) TT = (cr jk ). -1, 1T Q) We may remark that the expression (7.4) is the same as the one obtained by Bickel (1965) for the p-variate one-sample problem, and by Chatterjee and Sen (1964) for the bivariate two-sample problem. For the study of the various aspects of the efficiency (7.4) in special situations the reader is referred to the interesting papers of Bickel (1965) and Chatterjee and Sen (1964). (b) Totally SymmetriC Case. A bivariate random vector (X,Y) is said to be totally symmetric if (X,Y), (X,-y), (-X,Y) and (-X,-Y) have the same distribution function. It can be shown following the lines of the argument of Bickel (1965), that a sufficient condition for the asymptotic independence of the components of total symmetry of (Xij),xi k » for every pair * is the ~N (j,k)~ 37 Thus in the event of totally symmetric case A, all diagonal matrices, and hence we have "" 11 , y are (7.6) Applying a theorem of Courant (cf. footnote) Bickel I I -. I I I I I I (1965) proved that ttl ....1 where~+ is the family of all totally symmetric p-variate distributions whose marginal densities exist. While the authors were working on the study of bounds of the efficiencies (7.S) and (7.7), the results of Bhattacharyya T~e~~em (Courant) x.Ax '/xB:c I , \'1h~:;,e (1966) who was considering a two-sample The maximal and minimal values of A and B are non-negative definite, §nd B Js non-s!ngular, are given by the maximal and minimal eigenvalu~s of AB -1 . I I I I I I I -. I I I .- multivariate problem were made public. I I I I I I variate two sample problem. Ie I I I I I I I •- I It turns out that the efficiency expressions (7.5) and (7.7) are the same as in the case of the corresponding tests for the multiThus for the ease of reference we give below some of the results for the proofs of which the reader is referred to [2]. (7.9) (7.10) In the passing we may remark that when the components of F are totally symmetric as well as identically distributed, then the expressions (7.5), (7.6) and (7.7) become independent of 9's, and the results are the same as in the case of the corresponding univariate one-sample problems. (c) Normal Case. Let us now assume that the underlying distribution function F is a non-singular p-variate normal with mean vector zero and covariance matrix 1T = (Ojk). Then it can easily be checked that (7.11) e * 2 = sNell ,TN 1 . This means that in the case of normal distributions, the property of the univariate normal scores test relative to the student's t test is preserved in the multivariate case . 39 This is interesting in the sense that the same is not the case with the multivariate rank sum test as Bickel (1965) has shm'm that (7.12) and (7.13) inf inf e * 2 = 0 FE~ Q SN(R)"TN for p ~ 3 .2 for p = 2 • < Tr- e * 2 ~ 0.965 SN(R),TN From (7.11) and (7.12), it follows that for p ~ 3 inf inf e 9 * > 1 for p ~ 2. SN(l),S;(R)- [~ is the family of all nonsingular p-variate normal distributiorn.] The above results indicate that in case the underlying distribution is normal, the normal scores test is always preferable to the rank-sum test. 8. Estimation Problem. One Rample Case: (1) (p)_ As before, let X = (X , •.. ,Xa. , a. - 1, ... ,N be "-Q a. a sample from a p-variate cdf F(X I - Q1 , •.. ,X p -Qp) where Q = (9 , ... ,9 ) is unknown, F is symmetric about zero, p 1 and F is continuous. Our aim to find suitable estimates for the parameter Q. I -. I I I I I I and Bhattacharyya (1966) has proved that FE~ I Suppose that the Hodges-Lehmann statistic h [(5.1) of [18]] is calculated for every univariate _I I I I I I I I -. I I I 40 ,e I I I I I (j).sample Xl(j) , ... ,XN ' j - l, ... ,p. hi ,e I hJ(xiJ),···,x~J» for the value obtained from sample Xlj), ... ,x~j) on the j-th variate. 1 (8.1) N h j =- W ~=1 (j) ~,1 Thus we have (j) ZN .. 1 where E(j) N,i is the expected value of the i-th order statist1c of a sample of size N from the distribution function given by (8.2) for x -0 > . Let sup (Qj ( (j) -Qj, ... ,x ( j ) -Qj) hjX N l > ~) Q** j = inf (Qj ( (j) (j)) h j xl -Qj'···'XN -Qj < ~) Q* j (8.3) I' Ie I I I I I I I = We shall write = and where ~ is a point of symmetry of the distribution of h. when Q. J J = o. It was shown in [ 18] that the estimate A (Qj* + Q** j )/2 = QN,j of Qj hasNmore robust efficiency than the classical estimate j = ~ Xj /N. a=l a In this section we shall consider the properties X of the estimate (8.4) of Q, and compare it with the normal theory estimate -X = (Xl,.·.,X - p ") where -.JL Xj = ~=1 considered by Bickel ~964). (j )/N, ~ and the estimates I I 41 Regularity Properties. 8(a) . The following theorems are immediate consequences of the theorems proved by Hodges and Lehmann (1963) for the case p = 1, and are therefore stated without proofs. A A "- The distribution of the estimate QN= (QN,1, ... ,9n,p) is (absolutely) continuous if F is (absolutely) continuous. Theorem 8.1. .A. A N(xl+a, ... , xN+a) = QN(2£1' .•. , xN)+a (a l ,··. ,ap ) is any 1 )( p vector of constants. Theorem 8.2. ..!:!here il:. = Theorem 8.3. (1) Q The distribution of A N is symmetric about Q, if Q F is symmetric about zero, -. I I I I I I _I Theorem 8.4. For every vector a = (al, ... ,ap ) I I 8(b) • Asymptotic Normality. Consider a sequence of sample sizes N = 1,2, ... , and let Q be a sequence of values of Q. We shall indicate N the dependence of h j and ~ on N by writing h jN and ~N. Theorem 8.5. Let of constants. ~ = (aI' ... ,ap ) be a one by p vector Let c l ' ~N = -~cN = -(al/c N, ~et ~be ,c N' ... be constants, and let , ap/c N)· a continuous p-variate distribution function I I I I I -. I I I 42 ,e whose marcinals have means zero and variances one, and lim PN(CN(hjN-~) ~ u j ; j = 1, ••. ,p) N->oo up +apBp u1 +a l Bl G = ( A'···' A ) 1 P I I where PN indicates that the probability is computed for the (j parameter values QN and where h jN stands for h jN ( xl( j ) ' ... ,xN .J I I I, Then for any fixed Q , ... , ~ A) (8.6) p By Theorems (8.2) and (8.4) Proof: Ie = lim P N->oo I I I I suppo~ .Q "" ~ ~) (C~N = lim P (h (x(j)- N->oo 0 = lim 1 ~ (j):J)) C , .•. ,xN - C ~ ~N N N (j)) - ~N ~ 0) Po{( h j .xl(j) , ••• ,xN N->oo = j . alB l G(-A- , ... , 1 ~. A p ). Now combining Theorems (8.5) and (4.1) we have Theorem 8.6. Under the assumptions of Theorem 5.1, I Nl/2(~_Q) has asymptotically a p-variate normal distribu- I I tion with mean vector zero and covariance matrix ,e I· 2:** '" * = (T~k) . given by (6.8). J Let J j = tjl where t j is R(O,l), then the A A estimate QN will be denoted by QN,(R) and in this case the Remark 1. )) . covariance matrix 2:** * reduces to = (Tjk) Z = (Yjk) defined by (6.11). '" [See also Bickel (1964).] -1 Let J j = W j where W j is given by (5.2); A A then the estimate QN will be denoted by QN,l and in this case, the covariance matrix ~** = (T *jk ) reduces Remark 2. to ~ = 8(c). ~Ajk) defined by (6.12). I I -. I I I, Asymptotic Efficiency. To obtain an idea of the relative performance of one estimator with respect to another, we employ the notion of "the generalized variances" a concept introduced by Wilks (see1'1i1ks (1962), p. 547). The "generalized variance" of a p-variate random vector (X 1 , ... ,X p ) with non-singular covariance matrix 2: = (P.kcr.crk) is defined to be 2 2 '" J J cr l ... crp det II Pjkll where "det" denotes the determinant. var X = T and T' of ~ with asymptotically non-singular matrices ~T = (T T T) and L~T ' = (T' T' T ' ) require Nand N' LPjkcrjcr Pjkcrjcr k k observations to achieve equal asymptotic generalized Then the asymptotic relative efficiency of T/ with respect to T eTI T , is defined as ~_ U~criT ' ... cr~);LdetIlPIkIlJP T' T' _ - lim N N->oo cr1 ... crp I fl _I I Suppose that the two asymptotically unbiased estimates variances. I detll Pjkll Now from (6.8), the generalized variance of "GN is 'I I I I I I e-I ,I I I ,I I, 44 (8.8) var where I I I I Ie I I I I ,e I JJ P*** jk J j ~ 2Fj (x) -1 ) J k ( 2Fk (Y)-1) dF j , k (x, y ) -co -co = j=k, 1 and, the generalized variance of the mean estimator XN (8.10) = (XN, l' • • • , ~ , p) is var (~) = fr j = 1 C1~ J . det I!Pjk ll where ~ C1 jkC1jC1k ) is the covariance matrix of X j' Hence, we have I I I +00 +00 t ***II p• detllPjk (8.11) detl\ Pjkll_ Our main interest is the study of the relative performance A '" _ of the estimators QN A, QN R and~. [For the efficiency A ,~ , comparisons of QN,R' ~ and the estimates based on the medians of the observations, the reader is referred to Bickel (196lj)]. From (8.11) it is easy to deduce p iT (8.12) ~1 J j=l 12(j~( f~(X)dX)2 * @et~Pjk~ I I -. p ~et p jk~ -co where ~ ~ [J * Pjk JFj(X) Fk(y) dFj,k(X'y)- ~] f. k if j -co -co = (8.14) where ~ +00 1 J Jj- [ 1 F j ~ x) ]}- [Fk (y)] dF j , k (X, Y) j f. k -co -co if j 1 = k. (8.16) [When p = 1, the efficiencies (8.12), (8.14) and (8.16) reduce to the familiar expressions in the one-sample univariate case.] We shall study the above efficiencies in different situations. Case 1. A '" Total Symmetry. QN,l' QN,R and XN _ Let us assume that the estimators have pairwise independent coordinates. In this situation the covariance matrices reduce to diagonal matrices, and we have I I I I I I _I I I I I I I I -, I I 46 I .e I I I I 1\ I (8.17) (8.18) (8.19) Hence from the results of Hodges-Lehmann (1961), ChernoffSavage (1958) and Mikulski (1963), we have (8.20) Ie I I, ,e I F€J- (8.21) (8.22) I I t I I inf where inf F€J- 2t is 7r = b' the set of all totally symmetric absolutely continuom p-variate distributions. Now suppose that the components of F are totally syrrooetric as well as identically distributed, then the efficiencies are independent of p, and hence the same as in the case of one-sample univariate situations. Case 2. I I Equally Correlated Distributions. Let us now assume that the distribution of (X~j),X~k)) is independent of j and k when Q = O. Then, it turns out -. I ·1 I I ,I that (8.23) (8.24) I _I (8.25) I Ca3e 3. Normal Case. Let the underlying distribution function F be a nonsingular p-variate normal distribution function with mean vector zero, and covariance matrix:: = (Pjkcrjcrk ) then, from (8.12), (8.14) and (8.16) we obtain (8.26) I I where 00 (8.27) P;k = 00 J Jlj[~] lk[~] -00 -00 j II I I t dlj,k(x,y) . -, I I I Ie I I I I I I I'e I 48 (8.28) * is defined by (8.27). where Pjk Now Biclcel (1964) has proved that for p (8·30) I I I I ,. I inf F€~ e~ > 2, =0 GN,R;XN where ~ is the set of all nonsingular p-variate normal distributions and, since it follo\'1s that (8.32) e I detllp *jkll ]l/p , detll Pjkll (8.29) Thus, we find that when the underlying distribution function is normal, the multivariate normal scroes estimator A. QN,! (which is as good as the means estimator) can be infinitely better than the multivariate Rank-sum estimator "- N,R for p > 2. This leads to the question of examining the relative performa.nce of these two estimatora vj.z. Q A "- GN,l and GN,R for the case when the underlying distribution 2 2 function F is bivariate normal N(O,Ol,02'P)' The efficiency behavior of '" GN,l and ~ GN,R is given by the following Theorem 8.7. " The efficiency of QN,R with respect to -'\QN,I is independent of cr1 ~ cr 2 and is given by 1 - P 2 ] 1/2 I I -. I I = .2 [ _ (1+2 cos ~ (v+u)) 2 u(v-u) where u is determined by eg . Q' N,R' N,l p ] = 2 cos (u+rr)/3. The function is monotone increas ins for 0 symmetric about p lim Ip 1->1 1/2 < p < - 1 and (2 ) = 0 and hence unimodal. Finally, e"" g ~ .g N,~' =( N,l 3 v sin 1I )1/2 = O~91. 3 The proof is the same as in Bickel (see Theorem5·2 [ 5 ]) since when the underlying distribution is normal, I I I I _I I , I I I I I -I I I I ,e I I I I I I Ie I I I I I I I .e I 50 References 1. Bennet, B. M. (1962). ~ts.J. On Multivariate Sign Roy. Statist. Soc. Sere B 24 195-161. ~ 2. Bhattacharyya G. K. (1966). Multivariate Two Sample Normal Scores Test for Shift. Doctoral dissertation, University of California, Berkeley. 3. Bhuchungkul, S. (1964). A Class of Nonparametric Tests for Independence in Bivariate Populations. Statist. ~ Ann. r~th. 138-149. 4. Bhuchongku1, S. and Puri, Madan L. (1965). On the Estilnation of Contrasts in Linear Models. Ann. Math. Statist. 5. Bickel, Peter J. ~ 198-202. (1964). On Some Alternative Estimates for Shift in the P-variate One Sample Problem. Ann. Math. Statist. ~ 1079-1090. 6. Bickel, Peter J. (1965). On Some Asymptotically Non-parametric Competitors of Hote1ling's T2 . Ann. Math. Statist. ~ 160-173. 7. Blumen, I. (1958). A New Bivariate Sign Test. J. Amer. Statist. Assoc. ~ 8. Chatterjee, S. K. (1966). Location. 448-456. A Bivariate Sign Test for Ann. Math. Statist. To appear. 9. Chatterjee, S. K. and Sen, P. K. (1964). Nonparametric Tests for the Bivariate Two Sample Location Problem. Calcutta Stat. Assoc. Bu1l.!Z 10. Chetterjee, S. K. and Sen, P. K. (1965). 49-50. Non-parametric Tests for the Multi-sample Multivariate Location Problem. Ann. Math. Statist. Submitted. 51 11. Chernoff, H. and Savage, I. R. -, (1958). Asymptotic Normality of a Certain Non-parametric Test Statistics. Ann. Math. Statist. 12. Cramlr, Harold (1946). ~ 972-994. Mathematical Models of Statistics. Princeton University Press. 13. Govindarajulu, Z. (1960). Central Limit Theorem and Asymptotic Efficiency for One-Sample Non-parametric Procedures. Mimeographed doctoral dissertation. University of Minnesota, Dept. of Statistics. 14. Govindarajulu, Z., Lecam L. and Raghavachari, M. (1965). Generalization of Chernoff-Savage Theorems on Asymptotic Normality of Non-Parametric Test Statistics. Proceedines Fifth Berkeley Symposium 15. Hannan, E. J. (1956). Math. Statist. Probe The Asymptotic Power of Tests Based J. Roy. Stat. Soc. Sere B. 18 Upon Multiple Correlations. ~ 227-233· 16. Hodges, J. C., Jr. (1955). Math. Statist. 26 ~ A Bivariate Sign Test. Ann. 523-527. 17. Hodges, J. L., Jr. and Lehmann, E. L. (1961). Comparison of Normal Scores and Wilcoxon Tests. Proc. Fourth Berkeley Symp. 1 ~ 307-318. 18. Hodges, J. L., Jr. and Lehmann, E. L. (1963). Estimates of Location Based on Rank Tests. Ann. Math. Statist. ~ 598-611. 19. Joffe, A. and Klotz, J. (1962). Null Distribution and Bahadur Efficiency of the Hodges Bivariate Sign Test. Ann. Math. Statist. ~ 803-807. I I I I I I I I el I I I I I I I -. I I I .e I I I I I I Ie I I I I I I I .e I 52 20. Klotz, J. (1964). Sign Tests of Small Sample Power of the Bivariate ~lumen and Hodges. Ann. Math. Statist. ~ 1576..1582. 21. Lehmann, E. L. (1959). Testing Statistical Hypotheses. Wiley, Ne\'l York. 22. Lehmann E. L., and Stein, C. (1949). On the Theory of Some Non-parametric Hypotheses. Ann. Math. Statist. 20 23· Lo'eve, M. (1955). """" Probability Theory. 28-45. D. Vn Nostrand Co., New York. 2!!·. Mikulski, P. W. (1963). On theEf'ficiency of Optimal Non-parametric Procedures in the Two Sample Case. A~~. Math. Statist. 22-32. ~ 25. Puri, Madan L. (1964). of c-Sample Tests. Asymptotic Efficiency of a Class Ann. ~~th. Statist. ~ 102-121. 26. Puri, Madan L. and Sen, Pranab K. (1965). On a Class of Multivariate Multi-Sample Rank Order Tests. 27. Sen, P. K. (1965) . On a Class of' Multi-Sample Multivariate Non-parametric Tests. Jour. Amer. Statist. Assoc. Submitted. 28. Sen, Pranab K. (1965) . Non-parametric Tests. On a Class of Two-Sample Bivariate Proc. Fifth Berkeley Symp. f.1ath. Statist. Probe (in press). 29. Sen, Pranab K. (1963). On the Estimation of Relative Potency in Dilution (-direct) Assays by Distribution-free Methods. Biometrics.~, 532-552. 53 30. \'la1d, A..and Wo1fowitz, J. (1944). Statistical Tests Based on Permutations of the Observations. Ann. Math. Statist. 31. Wilcoxon, F. (1949). Procedures. ~ 358-372. Some Rapid Approximation American Cyanamid Company, Stamford, Conn. 32. Wilks, Samuel S. (1962). John Wiley, New York. Mathematical Statistics. I I -. I I I I I I eI I I I I I I I -. I
© Copyright 2025 Paperzz