I I. I I I I I I I I Ie I I I I I ROBUSTNESS OF SOME NONPARAMETRIC PROCEDURES IN LINEAR MODELS by Pranab Kumar Sen University of North Carolina Institute of Statistics Mimeo Series No. 547 August 1967 Work supported by the Army Research Office, Durham, Grant DA-ARO-D-3l-l24-G746. DEPARTMENT OF BIOSTATISTICS I. I I UNIVERSITY OF NORTH CAROLINA Chapel Hill, N. C. I I. I I I I I I I I Ie I I I I I I. I I ROBUSTNESS OF SOME NONPARAMETRIC PROCEDURES IN LINEAR MODELS* by Pranab Kumar Sen University of North Carolina, Chapel Hill 1. Introduction and summary. For the random variables X.. (i=l, ..• ,N; 1.J j=l, ••. ,r) consider the linear model X.. = J1 + S. + (1.1) 1.J 1. + Y.. Cl::S.1. = 0, 1.J T. J L:T j = 0), where T'S are treatment effects, S's are nuisance parameters (block effects) and Y.. 's are error components. Nonparametric procedures for estimating and 1.J testing contrasts in T'S, based on the Wilcoxon signed rank statistics, are due to Lehmann (1964) and Doksum (1967), among others, and they rest on the assumption that Y.. 's are independent with a common continuous distribution. 1.J The object of the present paper is to show that these procedures are valid even when Y.l, ••• ,Y.1.r are interchangeable variables, for each i(=l, ••• ,N), 1. and moreover, they are robust against possible heterogeneity of the distributions of the error vectors Y. = (Y.l, ••• ,Y. ), i=l, ••• ,N.· ~1. 2. Some fundamental lemmas. 1. Define ~jk 1.r = Tj - T , k j~k=l, ••• ,r, and let X~jk = Xij - Xik ' Uijk = Y ij - Yik for j~k=l, .•• ,r and i=l, ... ,N. (2.1) Assume that Y. has a continuous r-variate cumulative distribution function ~1. (cdf) F.(x) which is symmetric in its r arguments, for all i=l, ••• ,N. 1. This ~ interchangeability of Y.l, ••• ,Y. 1. independent of j~k(=l, ••• l.r implies that (i) the cdf G.(x) of U" 1. 1.J k is ,r) and is symmetric about 0, and (ii) the bivariate * Work supported by the Army Research Office, Durham, Grant DA-ARO-D-3l-l24-G746. I I. I I I I I I I I Ie I I I I I -2- cdf G~(x,y) 1 i=l, ..• ,N. 1J 1J j~k~k'(=l, ... ,r), for all Let h(x) be a real valued skew-symmetric function, i.e., h(x) + h(-x) = 0 for all x. (2.2) Define (2.3) 1;.2= E{h 2 (U·. )}, j~k~k'. 1, 1J k (2.4) Then, we have the following. LEMMA 2.1. If (i) Y.l, ••• ,Y. - 1 1r are interchangeable random variables and -------=------------- (ii) h(x) satisfies (2.2), then (i) E{h(U .. )} = 0 and (ii) 1;. = O. 1J k -1,0 PROOF. (i) follows trivially from the fact that G. is symmetric about 0 and 1 h(x) satisfies (2.2). To prove (ii), denote by ~ = {t l order statistics corresponding to Yij'Yij"Yik and Yik ,. ~ t2 ~ t ~ 3 t } the 4 Since Y.. 's are 1J interchangeable variables, the conditional distribution of Y.. ,Y .. "Y.k,Y. k " 1J given ~, and t . 4 1J 1 1 will be uniform on the 24 equally likely permutations of t ,t ,t l 2 3 Further, h(x) satisfies (2.2). Hence, it is easy to show that for j~k=j '~k' , (2.5) E{h(U.jk)h(U .. 'k,)lt} = E{h(Y.j-Y·k)h(Yi.,-y· k ,) It} = 1 1J ~ 1 1 J 1 ~ Thus, writing 1;. 1,0 o. equivalently as Et{E[h(Uijk)h(Uij'k,)I~]} , ~ the proof follows from (2.5). Q.E.D. LEMMA 2.2. I. I I of U.. k,U .. k , is independent of If Y.l, ••• ,Y. - 1 (2.2), then 1;. 1 1, 1r ~ ~I;. 2' 1, are interchangeable variables and h(x) satisfies where the equality sign holds only when h(x) = bx, I I. I I I I I I I I -3- for all x. PROOF. If, in addition, h(x) is monotonic, 0 < S. 1 < - Define Zi = h(U ) + h(U i3l ). ~s. 1, Z. Then, by (Z.3), (Z.4) and V(Z.) = 3 si Z(l-Zs. lis. Z) ~ 0, 1 , 1, 1, (Z.6) where the equality sign holds only when Z. 1 si,l ~ (~)si,Z' =0 a.e •. Now (Z.6) implies that Also, by definition UilZ + UiZ3 + Ui3l = O. Hence, Z.1 =0 a.e., along with (2.2) implies that for all U , U • i12 i23 (2.7) in turn implies that h(x) = bx for all x, and this completes the first part of the proof. Let now order statistics corresponding to Y"'Y' 1J 1k ~ = and Y. ,. 1k {t l ~ t 2 ~ t } be the 3 Using then (2.2) and pro- ceeding as in the proof of lemma 2.1, one obtains that E{h(U··k)h(U· jk ,) It} = E{h(Y.j-Y·k)h(Yi·-Y. k ,) It} 1J 1 ~ 1 1 J 1 ~ (2.8) Assume that h(x) is t in x (as otherwise, work with -h(x». Then, (2.9) By (2.9), the left hand side of (2.8) is essentially non-negative, and integrating over the distribution of t, it follows that S. 1 ~ 1, Let now 00 = (2.10) f f -00 I. I I iZ3 - lemma Z.l, we obtain that Ie I I I I I ) + h(U ilZ 1, LEMMA 2.3. If Y.l, ••• ,Y. - 1 1r ~ O. Q.E.D. 00 -00 G.(x)G.(y)dG~(x,y) for i=l, ••• ,N. 111 are interchangeable random variables, - ~ _< A(F.) ~ 7/24, 1 I I. I I I I I I I I Ie I I I I I I. I I -4- where the upper bound 7/24 is attained only when G. is a uniform cdf over 1 (-a,a),a > 0. PROOF. We let h(x) = G.(x) 1 fied. -~. As G.1 is symmetric about 0, (2.2) is satis- Straightforward computations yield that Also G.(x) is t in x. ~. 1 1, = 1/12 and = A(F.) - \. 1 Hence, the lemma directly follows from lemma 2.2. Q.E.D. 1 REMARK. ~. 2 1, Lemma 2.3 generalizes theorem 2 of Lehmann (1964) to exchangeable random variables and also supplies condition under which the upper bound 7/24 for A(F.) may be attained. l 1 3. It may be noted that if V(Y .. ) = a 2 Robustness for 1J 2 and Cov(Y .. ,Y. ) = pa ,jfk, the classical ANOVA-test based on the variance1J 1k ratio criterion is a valid test for the null hypothesis H: o for the sequence of alternative hypotheses T =••• =T =0 1 r' and {~}: (3.1) (where al, •.• ,a r are all real and finite), (r-l) times the variance-ratio criterion has asymptotically (under Fl= •.• =FN=F) a non-central chi-square distribution with r-l degrees of freedom and non-centrality parameter (3.2) It is also known that the method of ranking after alignment [cf. Hodges and Lehmann 1962)] allows for the interchangeability of the error components, and it has been shown by Sen (1967a) that the efficiency of the non-parametric procedures based on aligned observations is not affected by the interchangeability 1) The author is grateful to Professor Wassily Hoeffding for pointing out that the existence of F. for which the corresponding G. is uniform on (-a,a),a 1 dubious. 1 > 0, is Our conjecture is that there exists no such cdf F. for which the corresponding Gi is uniform on (-a,a), a > 0. 1 I I. I I I I I I I I Ie I I I I I -5- of the errors. It will be shown here that the same is true for the procedures considered by Lehmann (1964) and Doksum (1967). For this, define ¢(u) as 1 or 0 according as u is > 0 or not, and let (~)-l (3.3) ¢(X~'k + X~f'k)' 1 < j < k < r. L l<i<i'<N 1J 1 J - Assume that Fl= .•• =FN=F (=+ Gl= ••• =GN=G and A(Fl)= ••. =A(F ) = A(F),) and that N G(x) has a continuous density function g(x) satisfying the conditions of theorem 1 of Lehmann (1964). Then, it is easy to show that 00 2a (3.4) jk _L g2(x)dx, for all 1 ~ j < k < r. Using then theorem 7.1 of Hoeffding (1948), our lemmas 2.1 and 2.3 and following some routine steps, we arrive at the following. THEOREM 3.1. 1f F(~) (i) Fl= •.• =FN=F, (ii) is symmetric in its r arguments, 1 and (iii) {~} in (3.4) holds, then {N~(WN,jk-~) - 2a jk has asymptotically a ~r(r-l)-variate Yjk,j 'k' = 1 ~ j < k ~ r} normal distribution with null mean vector and dispersion matrix ~ = «Yjk,j'k'» (3.5) -L g2(x)dx, 00 given by 1/3, j=j' , k=k' , j;&k, 4A(F)-1, j=j , , k;&k' , j;&k;&k' , 1 - 4A(F), j;&j , , j=k' , j';&k, 0 j;&j';&k;&k' Theorem 3.1 and lemma 2.3 show that the results derived by Doksum (1967) in his lemmas 2.1 through 2.4 remain true even when the errors are not all I. I I independent, but are within block symmetric dependent. Further, the use of theorem 3.1 as in (2.6) of [4] generalizes theorem 1 of Lehmann (1964) to I I. I I I I I I I I Ie I I I I I -6- exchangeable error components. Since the main results of Lehmann (1964) are based on his theorems 1 and 2, and that of Doksum (1967) on his lemmas 2.3 and 2.4, it follows from our lemma 2.3 and theorem 3.1 that the Lehmann-Doksum procedures remain valid for within block exchangeable error components. Now, we note that the variance of the cdf G is 2 a 2 (1_p) = a 2 (G), say. As such, using (3.2) and generalizing theorem 3.1 in the same manner as in lemma 2.3 of Doksum (1967), the asymptotic relative efficiency (A.R.E.) of the Doksum-test with respect to the classical ANOVA test, can again be shown to be equal to e', defined by (2.11) and (2.12) of Doksum (1967). clear that the A.R.E. is unaffected by the within block symmetric dependence of the error components. The same conclusion also applies to Lehmann's pro- cedure, as the variance of (l/N)Ii~l X!jk is also equal to a 2 (G). This shows that theorem 4 of Lehmann (1964) is also true for exchangeable errors. 4. Robustness for heteroscedastic errors. Let us define (4.1) Assume that the density functions gi(x) = (d/dx)Gi(x), i=l, ••• ,N satisfy sup (4.2) i 00 J g~(x)dx < 00 • -00 Then, it follows from the results of Sen (1967b) that under (3.1) and (4.2) 00 J g/ (x)dx} (4.3) = 0, _00 for all 1 ~ j < k < r. Further, by a direct generalization of theorem 2.1 of Sen (1967b), it follows that under I. I I It is thus {~} in (3.1) and (4.2), I I. I I I I I I I I Ie I I I I I I. I I -700 J g~(X)dx], 1 ~ j < k ~ r} has asymptotically a ~r(r-l) _00 variate normal distribution with null mean vector and dispersion matrix j=j' , k=k' , j;&k, j=j , , k;&k' , j ;&k;&k' , 1/3, (4.4) YN,J'k ,J. 'k' 4 >/F ) - 1, N 1 - 4;\(F ), N 0, j=k' , j ';&k, j;&j ';&k, j;&j' , ;&k;&k' , and 00 A(F N) = (4.5) LEMMA 4.1. ~ ~ A(FN) 00 J J ~(x)GN(Y)dG=(x,y). -00 -00 ~ 7/24, uniformly in Fl, ••• ,F N (which are symmetric in their arguments.) PROOF. We define h(x) = GN(x) -~. and symmetric about 0, so is G • N Since Gl, ••• ,G are all non-decreasing N Thus, h(x) satisfies (2.2). Using then lemma 2.2, we obtain that (4.6) ~~ 00 00 J J GN(x)GN(Y)dG~(x,y) ~ ~ -00 _00 +~ 00 J [GN(x)-~]2dGi(x), for all i. -00 The proof of the lemma is then completed from (4.1), (4.5) and (4.6), after 00 noting that J [GN(x)-~]2dGN(x) = 1/12, Q.E.D. -00 Let us now denote by cr~ = V(Y ij ), Picr~ = cov(Y .. ,Y. k ), j;&k, for 1J 1 i=l, ••• ,N, and let Then, proceeding on the same line as in lemmas 2.3 and 2.4 of Doksum (1967) and theorems 2, 3 and 4 of Lehmann (1964), and noting that (1/N)\.N1X~'k has L1= 1J I I. I I I I I I I I Ie I I I I I I. I I -8- the variance 26~, the A.R.E. of the Lehmann-Doksum procedures with respect to the corresponding parametric procedures is obtained as (4.8) where 00 (4.9) By virtue of lemma 4.1, e~ ~ eN. Now, eN is the efficiency (A.R.E.) of the Wilcoxon signed rank test with respect to Student's t-test when FI, ••. ,F not necessarily identical [cf. Sen (1967b)]. N are It follows from the results of Sen (1967b) that (i) eN is uniformly (in Gl, ••• ,G ) bounded below by 0.864 and N (ii) if Gi(x) = G(x/o ) for all i=l, ••• ,N, then i 00 eN ~ 120 2 (G) [ (4.10) J g2(x)dx]2 , _00 where the equality sign holds only when 0l= •.• =oN. also true for e~. Consequently, the same is This clearly indicates the robustness of the Lehmann-Doksum procedures for heteroscedastic errors (i.e., for F.(xl, ••• ,x ) = F(xl/O., ••• ,x /0.), 1 r 1 r i=!, ••• ,N, when 0l, ••• ,oN are not all equal). REMARK. The estimator of A(F) considered by Lehmann (1964) (and cited by Doksum (1967), (2.7» changeable errors. of [5]. will be a consistent estimator of A(F ), even for exN The proof follows by the same technique as in theorem 4.2 Consequently, when Fl= ••• =FN=F, the same will estimate A(F), even for exchangeable errors. This shows that the estimate of A(F), proposed by Lehmann (1964) and required with the Lehmann-Doksum procedures, can also be used for interchangeable errors. 1 I I. I I I I I I I I Ie I I I I I I. I I -9- REFERENCES [1] DOKSUM, K. (1967). Robust procedures for some linear models with one observation per cell. Ann. Math. Statist. 38, 878-883. [2] HODGES, J. L., JR. and LEHMANN, E.L. (1962). Rank methods for combination of independent experiments in analysis of variance. Ann. Math. Statist. 33, 482-497. [3] HOEFFDlNG, W. (1948). On a class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19, 293-325. [4] LEHMANN, E. L. (1964). Asymptotically nonparametric inference with one observation per cell. Ann. Math. Statist. 35, 726-734. [5] PURl, M. L. and SEN, P. K. (1966). On a class of multivariate multisample rank order tests. Sankhya. Sere A. 28, 353-376. [6] SEN, P. K. (1967a). On a class of aligned rank order tests in two way layouts. Ann. Math. Statist. (Under consideration of publication). [7] SEN, P. K. (1967b). On a further robustness property of the test and estimator based on Wilcoxon's signed rank statistic. Ann. Math. Statist. (Under consideration of publication).
© Copyright 2025 Paperzz