Baldessari, F. G. (1966). "The covariance analysis for dependent data."

THE COVARIANCE ANALYSIS FOR DEPENDENT DATA

by

Francesca Baldessari Gallo
Department of Statistics
University of North Carolina
and
Università di Roma
(Istituto di Calcolo delle Probabilità)

Institute of Statistics Mimeo Series No. 478

May 1966
Contents:
1. Introduction and notation
2. The structure of the matrix T
3. Analysis of covariance
4. Acknowledgment
This research was supported by the Consiglio Nazionale delle
Ricerche - Roma, Italy, Grant No. 10-220-5-3539.

Department of Statistics
University of North Carolina
Chapel Hill, N. C.
1. Introduction and notation.
In this note we consider the covariance analysis model. In the classical
theory this model is studied under the hypothesis that the data are independent
and normally distributed, say $N(\mu, \alpha I)$, where $I$ is the $n \times n$ identity matrix,
and in order to test the hypothesis of equal influence of the treatments, two
independent statistics are considered, say $(t_1 - t_2)$ and $t_2$ (functions of the
data), whose ratio is distributed as the random variable $F'(n_1, n_2, \lambda)$
(Snedecor's $F'$ random variable with $n_1$ and $n_2$ degrees of freedom and
non-centrality parameter $\lambda$).

In this note we want to remove the hypothesis of independence of the data,
and find the set of all the matrices $V$ such that, the data being normally but
not independently distributed, say $N(\mu, V)$ with $V$ p.d. (positive definite),
the two statistics $(t_1 - t_2)$ and $t_2$ are independent and their ratio has the
same distribution $F'(n_1, n_2, \lambda)$ as in the classical case, i.e. when $V = \alpha I$.
In order to do this, we suppose that we have a set of data, say $Y$, such
that it is possible to express them in a linear model. In particular we suppose
that we have one factor with $t$ levels or treatments and that each observed
quantity can be written as

$$y_{ij} = \mu + \tau_i + \beta x_{ij} + e_{ij}, \qquad i = 1, \ldots, t; \quad j = 1, \ldots, r_i,$$

where $\mu$ is a constant; $\tau_i$ is the $i$-th treatment constant; $\beta$ is the regression
coefficient (unknown); the $x_{ij}$ are observed fixed quantities; and the vector
$e = [e_{11}, \ldots, e_{t,r_t}]'$ is a normal r.v. (random variable) with mean vector $0$ and
p.d. variance-covariance matrix, say $V$, so that the data $Y$ are normally
distributed with mean vector $\mu$ and variance-covariance matrix $V$. Now, in the
classical analysis of covariance $V = \alpha I$ because the data are supposed to be
independent, and to test the hypothesis $H_0 : \tau_1 = \tau_2 = \cdots = \tau_t$ the statistics
used are:
$$t_1 = \Big[\sum_{ij} y_{ij}^2 - \frac{Y_{\cdot\cdot}^2}{n}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \frac{1}{n}\, X_{\cdot\cdot}\, Y_{\cdot\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \frac{X_{\cdot\cdot}^2}{n}},$$

$$t_2 = \Big[\sum_{ij} y_{ij}^2 - \sum_i \frac{Y_{i\cdot}^2}{r_i}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \sum_i \frac{1}{r_i}\, X_{i\cdot}\, Y_{i\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \sum_i \frac{X_{i\cdot}^2}{r_i}},$$

where $Y_{\cdot\cdot} = \sum_{ij} y_{ij}$; $X_{\cdot\cdot} = \sum_{ij} x_{ij}$;
$Y_{i\cdot} = \sum_j y_{ij}$; $X_{i\cdot} = \sum_j x_{ij}$; $n = \sum_i r_i$.
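In the classical case the hypothesis $H_0$ is then tested, as recalled in the
introduction, by the ratio of $(t_1 - t_2)$ and $t_2$ normalized by their degrees of
freedom:

$$\frac{t_1 - t_2}{t_2} \cdot \frac{n-t-1}{t-1} \;\sim\; F'(t-1,\; n-t-1,\; \lambda),$$

with $t_1 - t_2$ carrying $t-1$ and $t_2$ carrying $n-t-1$ degrees of freedom.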
We denote by $U$ the $n \times n$ matrix which has all elements equal to unity; by
$U_i$ the $n \times n$ matrix which has elements $u^{(i)}_{pq}$, $p = 1, \ldots, n$,
$q = 1, \ldots, n$, all equal to zero except for the $r_i^2$ elements $u^{(i)}_{pq}$ with
$p = n_{i-1}+1, \ldots, n_i$, $q = n_{i-1}+1, \ldots, n_i$, which are equal to unity
(here $n_i = r_1 + \cdots + r_i$, $n_0 = 0$); by
$Y' = [y_{11}, y_{12}, \ldots, y_{t,r_t}] = [y_1, \ldots, y_n]$ the $n \times 1$ vector of the
data; and by $X' = [x_{11}, x_{12}, \ldots, x_{t,r_t}] = [x_1, \ldots, x_n]$ the $n \times 1$
vector of the observed fixed quantities $x_{ij}$. We suppose also that $x_i \neq 0$,
$i = 1, \ldots, n$, and we use the following notations:

$$Z' = [x_1^{-1}, \ldots, x_n^{-1}]; \quad S = I - U n^{-1}; \quad
D = S X X' S\, (X' S X)^{-1}; \quad A = S - D;$$

$$C = \Big(I - \sum_i U_i r_i^{-1}\Big) X X' \Big(I - \sum_i U_i r_i^{-1}\Big)
\Big[X' \Big(I - \sum_i U_i r_i^{-1}\Big) X\Big]^{-1}; \quad
B = I - \sum_i U_i r_i^{-1} - C;$$

$Q$ the $n \times n$ matrix

$$Q = \begin{bmatrix} q_1 & q_1 & \cdots & q_1 \\ q_2 & q_2 & \cdots & q_2 \\
\vdots & & & \vdots \\ q_n & q_n & \cdots & q_n \end{bmatrix},$$

with $q_i$ arbitrary for all $i$; and $\Delta$ the $n \times n$ diagonal matrix

$$\Delta = \begin{bmatrix} \delta_1 & 0 & \cdots & 0 \\ 0 & \delta_2 & \cdots & 0 \\
\vdots & & & \vdots \\ 0 & 0 & \cdots & \delta_n \end{bmatrix},$$

with $\delta_i$ arbitrary for all $i$.
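For instance, with $t = 2$ treatments and $r_1 = r_2 = 2$ observations each
(so $n = 4$), these matrices take the form

$$U = \begin{bmatrix} 1&1&1&1\\1&1&1&1\\1&1&1&1\\1&1&1&1\end{bmatrix},\quad
U_1 = \begin{bmatrix} 1&1&0&0\\1&1&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix},\quad
U_2 = \begin{bmatrix} 0&0&0&0\\0&0&0&0\\0&0&1&1\\0&0&1&1\end{bmatrix},\quad
S = I - \tfrac{1}{4} U,$$

and $I - \sum_i U_i r_i^{-1} = I - \tfrac{1}{2}(U_1 + U_2)$ centers each observation at
its treatment mean, just as $S$ centers it at the overall mean.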
We also use the further notations: $\chi^2(a, \lambda)$ for the chi-square r.v.
with $a$ degrees of freedom and non-centrality parameter $\lambda$; $N(m, P)$ for the
normal distribution with mean vector $m$ and variance-covariance matrix $P$ (p.d.).
We will also write $<a> \Leftrightarrow <k>$ meaning that the statement $a$ is equivalent
to the statement $k$, and $D(P) = D(R)$ meaning that the r.v. $P$ is distributed like
the r.v. $R$.
Of course $t_1(y_1, \ldots, y_n)$ and $t_2(y_1, \ldots, y_n)$ can now be written in the
matricial form:

$$t_1(y_1, \ldots, y_n) = Y' A Y, \qquad t_2(y_1, \ldots, y_n) = Y' B Y.$$
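Indeed, expanding $Y'AY$ with $A = S - D$ recovers $t_1$ directly:

$$Y'AY = Y'SY - \frac{(Y'SX)^2}{X'SX}
= \Big[\sum_{ij} y_{ij}^2 - \frac{Y_{\cdot\cdot}^2}{n}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \frac{1}{n} X_{\cdot\cdot} Y_{\cdot\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \frac{X_{\cdot\cdot}^2}{n}} = t_1,$$

since $Y'SY$, $Y'SX$ and $X'SX$ are the corrected sums of squares and products; the
identity $t_2 = Y'BY$ follows in the same way with $I - \sum_i U_i r_i^{-1}$ in place
of $S$.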
Now we suppose that the data are dependent, i.e. that $V$ is a general
symmetric p.d. matrix, and we want to find all the matrices $V$ such that, the
data being dependent, we have $Y'(A-B)Y$ and $Y'BY$ independent and

$$\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}$$

distributed like $F'\big(t-1,\, n-t-1,\, \frac{\mu' A \mu}{2\alpha}\big)$ as in the
classical case. We will write $T$ for the matrices of this set, and we show that
$T$ is a symmetric p.d. matrix of the form:

$$T = \alpha I + Q + Q' + \tfrac{1}{2}(X Z' \Delta + \Delta Z X'). \qquad (1)$$
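Since $\Delta$ is diagonal with arbitrary diagonal elements and every $x_i \neq 0$,
the product $\Delta Z$ is an arbitrary $n \times 1$ vector; writing $w = \Delta Z$ and
$q = [q_1, \ldots, q_n]'$ (so that $Q = q 1'$, with $1$ the $n \times 1$ unity vector),
the structure (1) can equivalently be written

$$T = \alpha I + q 1' + 1 q' + \tfrac{1}{2}\,(X w' + w X'), \qquad w = \Delta Z,$$

so every admissible $V$ differs from $\alpha I$ only by terms of the form $u v' + v u'$
in which one factor is the unity vector $1$ or the covariate vector $X$.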
2. The structure of the matrix T.

Theorem 1. If $Y$ is $N(\mu, V)$ (i.e. if $Y$ is normally distributed with mean vector
$\mu$ and variance-covariance matrix $V$), where $V$ is p.d. and $\alpha$ is a positive
constant, then

$$< D(Y' A Y) = D\big(\alpha \chi^2(n-2,\; \mu' A \mu / 2\alpha)\big) >
\;\Leftrightarrow\; < V = T >,$$

where $T$ has the structure (1).

First we prove that if $K$ is an $n \times n$ matrix, then

$$< K = K',\; S K S = 0 > \;\Leftrightarrow\; < K = Q + Q' >.$$
Proof of the implication $\Rightarrow$: $(I - U n^{-1}) K (I - U n^{-1}) = 0$ implies
$K - U K n^{-1} - K U n^{-1} + U K U n^{-2} = 0$. Let $k_{ij}$ be the term of the $i$-th
row and $j$-th column of the matrix $K$. The condition

$$k_{ij} - \frac{k_{i\cdot}}{n} - \frac{k_{\cdot j}}{n} + \frac{k_{\cdot\cdot}}{n^2} = 0,$$

where $k_{i\cdot} = \sum_j k_{ij}$, $k_{\cdot j} = \sum_i k_{ij}$, $i = 1, \ldots, n$,
$k_{\cdot\cdot} = \sum_{ij} k_{ij}$, is satisfied. In particular we have
$k_{ii} = \frac{2 k_{i\cdot}}{n} - \frac{k_{\cdot\cdot}}{n^2}$, so that
$k_{i\cdot} = \frac{n}{2}\big(k_{ii} + \frac{k_{\cdot\cdot}}{n^2}\big)$,
$i = 1, \ldots, n$, and similarly
$k_{\cdot j} = \frac{n}{2}\big(k_{jj} + \frac{k_{\cdot\cdot}}{n^2}\big)$, so that
$k_{ij} = \frac{k_{ii}}{2} + \frac{k_{jj}}{2}$. Now if we put
$q_i = \frac{k_{ii}}{2}$, $q_j = \frac{k_{jj}}{2}$, we have $K = Q + Q'$.

Proof of the implication $\Leftarrow$: It is evident that $K = K'$; also
$S K S = S(Q + Q')S$ may be written as
$(I - U n^{-1}) Q (I - U n^{-1}) + (I - U n^{-1}) Q' (I - U n^{-1})$, and we have
$(I - U n^{-1}) Q (I - U n^{-1}) = 0$ because $n^{-1} Q U = Q$,
$n^{-1} U Q = n^{-1} \sum_i q_i\, U$, $n^{-2} U Q U = n^{-1} \sum_i q_i\, U$. But
$(I - U n^{-1}) Q' (I - U n^{-1}) = [(I - U n^{-1}) Q (I - U n^{-1})]'$ and the proof
is completed.
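The computation in the second half can also be seen at a glance: with $1$ the
$n \times 1$ unity vector we may write $Q = q 1'$, $q = [q_1, \ldots, q_n]'$, and since
$S 1 = (I - U n^{-1}) 1 = 0$,

$$S Q S = S q\,(1' S) = S q\,(S 1)' = 0, \qquad S Q' S = (S Q S)' = 0,$$

so that $S K S = S(Q + Q')S = 0$ for every choice of the $q_i$.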
We will use, also, the following well-known theorem (see [1]): if the random
vector $Y$ is $N(\mu, V)$ with $V$ p.d., and if the $n \times n$ matrix $W$ is real and
symmetric (rank $W = w$), then

$$< D(Y' W Y) = D\big(\chi^2(w,\; \mu' W \mu / 2)\big) >
\;\Leftrightarrow\; < W V W = W >.$$

So to prove Theorem 1 we have to prove

$$< A V A = \alpha A > \;\Leftrightarrow\; < V = T >$$

(for, put $W = A \alpha^{-1}$).
Proof of the implication $\Leftarrow$: For $A T A = (S - D)\, T\, (S - D)$ we have
$A T A = S T S - S T D - D T S + D T D$; substituting $T$ and recalling the previous
results, we have:

$$S T S = \alpha S + \tfrac{1}{2}(S X Z' \Delta S + S \Delta Z X' S); \qquad
S T D = \alpha D + \tfrac{1}{2}(S X Z' \Delta D + S \Delta Z X' D);$$

$$D T S = \alpha D + \tfrac{1}{2}(D X Z' \Delta S + D \Delta Z X' S); \qquad
D T D = \alpha D + \tfrac{1}{2}(D X Z' \Delta D + D \Delta Z X' D);$$

and thus $A T A = \alpha A$.
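The cancellations used here rest on two facts that are immediate from the
definitions: $S(Q + Q')S = 0$ (the lemma above, together with $S D = D S = D$)
and $A X = 0$. For the latter,

$$A X = (S - D) X = S X - S X X' S X\,(X' S X)^{-1} = S X - S X = 0,$$

so that $A(Q + Q')A = A S (Q + Q') S A = 0$ and
$A(X Z' \Delta + \Delta Z X')A = 0$, which is what makes the four expansions collapse
to $\alpha(S - D - D + D) = \alpha A$.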
Proof of the implication $\Rightarrow$:
Let us suppose that we may have a matrix $V^* = T + H$ for which
$A V^* A = \alpha A$, and we show that $H$ must have the same structure as $T$. In fact
$A V^* A = \alpha A$ is $A T A + A H A = \alpha A$, and because $A T A = \alpha A$ we have
to study the equation $A H A = 0$. By developing the left-hand side of $A H A = 0$
we have

$$S\big(H - H S X X'(X'SX)^{-1} - X X'(X'SX)^{-1} S H
+ X X'(X'SX)^{-1} S H S X X'(X'SX)^{-1}\big) S = 0,$$

and by our previous result

$$H - H S X X'(X'SX)^{-1} - X X'(X'SX)^{-1} S H
+ X X'(X'SX)^{-1} S H S X X'(X'SX)^{-1} = R + R',$$

where $R$ is a matrix of the structure of $Q$ and the $r_i$, $i = 1, \ldots, n$, are
arbitrary.
Now, let $h_{ij}$ be an element of the matrix $H$; the $(i,j)$ element of this
relation expresses $h_{ij}$ through $x_i$, $x_j$ and the quantities
$\sum_r h_{ir}(x_r - \bar{x})$ and $\sum_k h_{kj}(x_k - \bar{x})$. Putting
$\sum_r h_{ir}(x_r - \bar{x}) = t_i$ and $\sum_k h_{kj}(x_k - \bar{x}) = t_j$ and
solving the system of equations so obtained, we find $h_{ij}$ expressed through
$h_{ii}$, $h_{jj}$, the ratios $x_i/x_j$, $x_j/x_i$ and the arbitrary $r_i$, $r_j$;
in matrix form this gives

$$H = F + F' + \tfrac{1}{2}(X Z' \Gamma + \Gamma Z X'),$$

where $\Gamma$ is a diagonal matrix and $F$ is a matrix of the same structure as $Q$.
Hence $H$ has the same structure as $T - \alpha I$, and thus the structure of
$V^* = T + H$ is the same as that of $T$, and the Theorem is proved.
3. Analysis of covariance

Theorem 2. If the random vector $Y$ is $N(\mu, T)$, then

$$< D(Y' A Y) = D\big(\alpha \chi^2(n-2,\; \mu' A \mu / 2\alpha)\big) >
\;\Leftrightarrow\;$$

$$< D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1,\; \mu'(A-B)\mu / 2\alpha)\big),\quad
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),$$
$$Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >.$$
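One way to read the central chi-square attached to $Y' B Y$: writing $u_i$ for the
indicator vector of the $i$-th treatment (a notation used only in this remark),
the mean vector $\mu$ of $Y$ under the model of Section 1 lies in the space spanned
by $1$, the $u_i$ and $X$, and $B$ annihilates that space, so that

$$B \mu = 0, \qquad \mu'(A - B)\mu = \mu' A \mu.$$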
Proof. Theorem 5 of Graybill-Marsaglia [1] says, among other things, that if $Y$
is $N(\mu, V)$, with $V$ an $n \times n$ symmetric p.d. matrix,
$Y' W Y = \sum_{i=1}^k Y' W_i Y$, rank $W_i = p_i$, rank $W = p$, then:

$$< (W V)^2 = W V,\; \textstyle\sum_i p_i = p > \;\Leftrightarrow\;
< D(Y' W_i Y) = D\big(\chi^2(p_i, \lambda_i)\big),\;
\text{the } Y' W_i Y \text{ mutually independent} >.$$

But from Theorem 1 we have $A T A = \alpha A$, or
$\big(\tfrac{A}{\alpha} T\big)^2 = \tfrac{A}{\alpha} T$. Then if we put

$$W_1 = \frac{A - B}{\alpha}, \qquad W_2 = \frac{B}{\alpha}, \qquad W = \frac{A}{\alpha},$$

we have rank $W_1 = t-1$, rank $W_2 = n-t-1$, rank $W = n-2$, and thus we may write:

$$D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1,\; \mu'(A-B)\mu / 2\alpha)\big), \qquad
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),$$

and $Y'(A-B)Y$ and $Y' B Y$ independent.
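The ranks quoted here can be checked by a trace computation, since $A$, $B$ and
$A - B$ are idempotent:

$$\operatorname{rank} A = \operatorname{tr} S - \operatorname{tr} D = (n-1) - 1 = n-2,
\qquad
\operatorname{rank} B = n - \operatorname{tr}\Big(\sum_i U_i r_i^{-1}\Big)
- \operatorname{tr} C = n - t - 1,$$

$$\operatorname{rank}(A - B) = \operatorname{tr} A - \operatorname{tr} B
= (n-2) - (n-t-1) = t - 1.$$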
Theorem 3. If the random vector $Y$ is $N(\mu, V)$, $V$ p.d., and $Y'(A-B)Y$,
$Y' B Y$ are independent, then

$$< V = T \text{ for all } \mu > \;\Leftrightarrow\;
< D\Big(\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}\Big)
= D\Big(F'\big(t-1,\, n-t-1,\, \tfrac{\mu' A \mu}{2\alpha}\big)\Big) >.$$
Proof of the implication $\Rightarrow$: This implication follows immediately from
Theorems 1 and 2.

Proof of the implication $\Leftarrow$: Taking $\mu = \mu^* = \mu 1$, where $1$ is the
$n \times 1$ unity vector, we have
$A \mu^* = (S - D)\mu^* = S \mu^* - S X X' S \mu^*\,(X' S X)^{-1}$. From simple
calculation we verify that $S \mu^* = 0$, so that $A \mu^* = 0$. In the same way we
have $\mu^{*\prime} A = 0$. But because $A = (A - B) + B$ we have that
$$< A \mu^* = 0 > \;\Leftrightarrow\; < (A - B)\mu^* = 0,\; B \mu^* = 0 >;$$

indeed $< B A \mu^* = 0 > \Leftrightarrow < B \mu^* = 0 >$ and also
$< (A - B) A \mu^* = 0 > \Leftrightarrow < (A - B) \mu^* = 0 >$.
But it is known that, for every p.d. $V$ and $\mu = \mu^*$, any quadratic form is
decomposable into a linear combination of independent central r.v., i.e.

$$Y'(A-B)Y = \sum_{i=1}^{t-1} \lambda_i\, \chi_i^2(1), \qquad
Y' B Y = \sum_{j=1}^{n-t-1} \alpha_j\, \chi_j^2(1),$$

where the $\lambda_i$ and $\alpha_j$ are the parameters of the linear combination.
Then it follows, from our hypothesis, that

$$< D\Big(\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}\Big)
= D\big(F'(t-1,\, n-t-1)\big),\;
Y'(A-B)Y \text{ independent of } Y' B Y > \;\Rightarrow$$

$$< D\Big(\frac{\sum_i \lambda_i \chi_i^2(1)}{\sum_j \alpha_j \chi_j^2(1)}
\cdot \frac{n-t-1}{t-1}\Big)
= D\Big(\frac{\chi^2(t-1)}{\chi^2(n-t-1)} \cdot \frac{n-t-1}{t-1}\Big),\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >.$$

But from Baldessari (see [2]) this last statement is equivalent to

$$< \lambda_1 = \cdots = \lambda_{t-1} = \alpha_1 = \alpha_2 = \cdots
= \alpha_{n-t-1} = \alpha > 0,\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >$$

$$\Leftrightarrow\;
< D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1)\big),\;
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >$$

$$\Rightarrow\; < D(Y' A Y) = D\big(\alpha \chi^2(n-2)\big) >
\;\Leftrightarrow\; < V = T >,$$

and the proof of our Theorem is completed.
4. Acknowledgment
The author expresses her gratitude to Professor N. L. Johnson for helpful suggestions and for revising the article.
References
[1] Graybill, F. A. and Marsaglia, G., "Idempotent matrices and quadratic forms
    in the general linear hypothesis", Ann. Math. Statist., 28 (1957), 678.

[2] Baldessari, B., "Analysis of variance of dependent data", Institute of
    Statistics Mimeo Series No. 467, March 1966.