Baldessari, F. G. (1966). "The covariance analysis for dependent data."

THE COVARIANCE ANALYSIS FOR DEPENDENT DATA

by

Francesca Baldessari Gallo
Department of Statistics
University of North Carolina
and
Università di Roma
(Istituto di Calcolo delle Probabilità)

Institute of Statistics Mimeo Series No. 478

May 1966
Contents:
1. Introduction and notation
2. The structure of the matrix T
3. Analysis of covariance
4. Acknowledgment
This research was supported by the Consiglio Nazionale delle
Ricerche - Roma, Italy, Grant No. 10-220-5-3539.

Department of Statistics
University of North Carolina
Chapel Hill, N. C.
1. Introduction and notation.
In this note we consider the covariance analysis model. In the classical
theory this model is studied under the hypothesis that the data are independent
and normally distributed, say $N(\mu, \alpha I)$, where $I$ is the $n \times n$ identity matrix,
and in order to test the hypothesis of equal influence of the treatments, two
independent statistics are considered, say $(t_1 - t_2)$ and $t_2$ (functions of the
data), whose ratio is distributed as the random variable $F'(n_1, n_2, \lambda)$
(Snedecor's $F'$ random variable with $n_1$ and $n_2$ degrees of freedom and
non-centrality parameter $\lambda$).

In this note we want to remove the hypothesis of independence of the data,
and find the set of all the matrices $V$ such that, the data being normally but
not independently distributed, say $N(\mu, V)$ with $V$ p.d. (positive definite),
the two statistics $(t_1 - t_2)$ and $t_2$ are independent and their ratio has the
same distribution $F'(n_1, n_2, \lambda)$ as in the classical case, i.e. when $V = \alpha I$.
In order to do this, we suppose that we have a set of data, say $Y$, such
that it is possible to express them in a linear model. In particular we suppose
that we have one factor with $t$ levels or treatments and that each observed
quantity can be written as

$$y_{ij} = \mu + \tau_i + \beta x_{ij} + e_{ij}, \qquad i = 1, \ldots, t; \quad j = 1, \ldots, r_i,$$

where $\mu$ is a constant; $\tau_i$ is the $i$-th treatment constant; $\beta$ is the regression
coefficient (unknown); the $x_{ij}$ are observed fixed quantities; and the vector
$e = [e_{11}, \ldots, e_{t,r_t}]'$ is a normal r.v. (random variable) with mean vector $0$ and
p.d. variance-covariance matrix, say $V$, so that the data $Y$ are normally
distributed with mean vector $\mu$ and variance-covariance matrix $V$. Now, in the
classical analysis of covariance $V = \alpha I$ because the data are supposed to be
independent, and to test the hypothesis $H_0 : \tau_1 = \tau_2 = \cdots = \tau_t$ the statistics
used are:
$$t_1 = \Big[\sum_{ij} y_{ij}^2 - \frac{Y_{\cdot\cdot}^2}{n}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \frac{1}{n}\, X_{\cdot\cdot}\, Y_{\cdot\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \frac{X_{\cdot\cdot}^2}{n}},$$

$$t_2 = \Big[\sum_{ij} y_{ij}^2 - \sum_i \frac{Y_{i\cdot}^2}{r_i}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \sum_i \frac{1}{r_i}\, X_{i\cdot}\, Y_{i\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \sum_i \frac{X_{i\cdot}^2}{r_i}},$$

where $Y_{\cdot\cdot} = \sum_{ij} y_{ij}$; $X_{\cdot\cdot} = \sum_{ij} x_{ij}$;
$Y_{i\cdot} = \sum_j y_{ij}$; $X_{i\cdot} = \sum_j x_{ij}$; $n = \sum_i r_i$.
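In the classical case the hypothesis $H_0$ is then tested, as recalled in the
introduction, by the ratio of $(t_1 - t_2)$ and $t_2$ normalized by their degrees of
freedom:

$$\frac{t_1 - t_2}{t_2} \cdot \frac{n-t-1}{t-1} \;\sim\; F'(t-1,\; n-t-1,\; \lambda),$$

with $t_1 - t_2$ carrying $t-1$ and $t_2$ carrying $n-t-1$ degrees of freedom.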
We denote by $U$ the $n \times n$ matrix which has all elements equal to unity; by
$U_i$ the $n \times n$ matrix which has elements $u^{(i)}_{pq}$, $p = 1, \ldots, n$,
$q = 1, \ldots, n$, all equal to zero except for the $r_i^2$ elements $u^{(i)}_{pq}$ with
$p = n_{i-1}+1, \ldots, n_i$, $q = n_{i-1}+1, \ldots, n_i$, which are equal to unity
(here $n_i = r_1 + \cdots + r_i$, $n_0 = 0$); by
$Y' = [y_{11}, y_{12}, \ldots, y_{t,r_t}] = [y_1, \ldots, y_n]$ the $n \times 1$ vector of the
data; and by $X' = [x_{11}, x_{12}, \ldots, x_{t,r_t}] = [x_1, \ldots, x_n]$ the $n \times 1$
vector of the observed fixed quantities $x_{ij}$. We suppose also that $x_i \neq 0$,
$i = 1, \ldots, n$, and we use the following notations:

$$Z' = [x_1^{-1}, \ldots, x_n^{-1}]; \quad S = I - U n^{-1}; \quad
D = S X X' S\, (X' S X)^{-1}; \quad A = S - D;$$

$$C = \Big(I - \sum_i U_i r_i^{-1}\Big) X X' \Big(I - \sum_i U_i r_i^{-1}\Big)
\Big[X' \Big(I - \sum_i U_i r_i^{-1}\Big) X\Big]^{-1}; \quad
B = I - \sum_i U_i r_i^{-1} - C;$$

$Q$ the $n \times n$ matrix

$$Q = \begin{bmatrix} q_1 & q_1 & \cdots & q_1 \\ q_2 & q_2 & \cdots & q_2 \\
\vdots & & & \vdots \\ q_n & q_n & \cdots & q_n \end{bmatrix},$$

with $q_i$ arbitrary for all $i$; and $\Delta$ the $n \times n$ diagonal matrix

$$\Delta = \begin{bmatrix} \delta_1 & 0 & \cdots & 0 \\ 0 & \delta_2 & \cdots & 0 \\
\vdots & & & \vdots \\ 0 & 0 & \cdots & \delta_n \end{bmatrix},$$

with $\delta_i$ arbitrary for all $i$.
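For instance, with $t = 2$ treatments and $r_1 = r_2 = 2$ observations each
(so $n = 4$), these matrices take the form

$$U = \begin{bmatrix} 1&1&1&1\\1&1&1&1\\1&1&1&1\\1&1&1&1\end{bmatrix},\quad
U_1 = \begin{bmatrix} 1&1&0&0\\1&1&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix},\quad
U_2 = \begin{bmatrix} 0&0&0&0\\0&0&0&0\\0&0&1&1\\0&0&1&1\end{bmatrix},\quad
S = I - \tfrac{1}{4} U,$$

and $I - \sum_i U_i r_i^{-1} = I - \tfrac{1}{2}(U_1 + U_2)$ centers each observation at
its treatment mean, just as $S$ centers it at the overall mean.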
We also use the further notations: $\chi^2(a, \lambda)$ for the chi-square r.v.
with $a$ degrees of freedom and non-centrality parameter $\lambda$; $N(m, P)$ for the
normal distribution with mean vector $m$ and variance-covariance matrix $P$ (p.d.).
We will also write $<a> \Leftrightarrow <k>$ meaning that the statement $a$ is equivalent
to the statement $k$, and $D(P) = D(R)$ meaning that the r.v. $P$ is distributed like
the r.v. $R$.
Of course $t_1(y_1, \ldots, y_n)$ and $t_2(y_1, \ldots, y_n)$ can now be written in the
matricial form:

$$t_1(y_1, \ldots, y_n) = Y' A Y, \qquad t_2(y_1, \ldots, y_n) = Y' B Y.$$
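Indeed, expanding $Y'AY$ with $A = S - D$ recovers $t_1$ directly:

$$Y'AY = Y'SY - \frac{(Y'SX)^2}{X'SX}
= \Big[\sum_{ij} y_{ij}^2 - \frac{Y_{\cdot\cdot}^2}{n}\Big]
- \frac{\Big[\sum_{ij} x_{ij} y_{ij} - \frac{1}{n} X_{\cdot\cdot} Y_{\cdot\cdot}\Big]^2}
{\sum_{ij} x_{ij}^2 - \frac{X_{\cdot\cdot}^2}{n}} = t_1,$$

since $Y'SY$, $Y'SX$ and $X'SX$ are the corrected sums of squares and products; the
identity $t_2 = Y'BY$ follows in the same way with $I - \sum_i U_i r_i^{-1}$ in place
of $S$.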
Now we suppose that the data are dependent, i.e. that $V$ is a general
symmetric p.d. matrix, and we want to find all the matrices $V$ such that, the
data being dependent, we have $Y'(A-B)Y$ and $Y'BY$ independent and

$$\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}$$

distributed like $F'\big(t-1,\, n-t-1,\, \frac{\mu' A \mu}{2\alpha}\big)$ as in the
classical case. We will write $T$ for the matrices of this set, and we show that
$T$ is a symmetric p.d. matrix of the form:

$$T = \alpha I + Q + Q' + \tfrac{1}{2}(X Z' \Delta + \Delta Z X'). \qquad (1)$$
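Since $\Delta$ is diagonal with arbitrary diagonal elements and every $x_i \neq 0$,
the product $\Delta Z$ is an arbitrary $n \times 1$ vector; writing $w = \Delta Z$ and
$q = [q_1, \ldots, q_n]'$ (so that $Q = q 1'$, with $1$ the $n \times 1$ unity vector),
the structure (1) can equivalently be written

$$T = \alpha I + q 1' + 1 q' + \tfrac{1}{2}\,(X w' + w X'), \qquad w = \Delta Z,$$

so every admissible $V$ differs from $\alpha I$ only by terms of the form $u v' + v u'$
in which one factor is the unity vector $1$ or the covariate vector $X$.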
2. The structure of the matrix T.

Theorem 1. If $Y$ is $N(\mu, V)$ (i.e. if $Y$ is normally distributed with mean vector
$\mu$ and variance-covariance matrix $V$), where $V$ is p.d. and $\alpha$ is a positive
constant, then

$$< D(Y' A Y) = D\big(\alpha \chi^2(n-2,\; \mu' A \mu / 2\alpha)\big) >
\;\Leftrightarrow\; < V = T >,$$

where $T$ has the structure (1).

First we prove that if $K$ is an $n \times n$ matrix, then

$$< K = K',\; S K S = 0 > \;\Leftrightarrow\; < K = Q + Q' >.$$
Proof of the implication $\Rightarrow$: $(I - U n^{-1}) K (I - U n^{-1}) = 0$ implies
$K - U K n^{-1} - K U n^{-1} + U K U n^{-2} = 0$. Let $k_{ij}$ be the term of the $i$-th
row and $j$-th column of the matrix $K$. The condition

$$k_{ij} - \frac{k_{i\cdot}}{n} - \frac{k_{\cdot j}}{n} + \frac{k_{\cdot\cdot}}{n^2} = 0,$$

where $k_{i\cdot} = \sum_j k_{ij}$, $k_{\cdot j} = \sum_i k_{ij}$, $i = 1, \ldots, n$,
$k_{\cdot\cdot} = \sum_{ij} k_{ij}$, is satisfied. In particular we have
$k_{ii} = \frac{2 k_{i\cdot}}{n} - \frac{k_{\cdot\cdot}}{n^2}$, so that
$k_{i\cdot} = \frac{n}{2}\big(k_{ii} + \frac{k_{\cdot\cdot}}{n^2}\big)$,
$i = 1, \ldots, n$, and similarly
$k_{\cdot j} = \frac{n}{2}\big(k_{jj} + \frac{k_{\cdot\cdot}}{n^2}\big)$, so that
$k_{ij} = \frac{k_{ii}}{2} + \frac{k_{jj}}{2}$. Now if we put
$q_i = \frac{k_{ii}}{2}$, $q_j = \frac{k_{jj}}{2}$, we have $K = Q + Q'$.

Proof of the implication $\Leftarrow$: It is evident that $K = K'$; also
$S K S = S(Q + Q')S$ may be written as
$(I - U n^{-1}) Q (I - U n^{-1}) + (I - U n^{-1}) Q' (I - U n^{-1})$, and we have
$(I - U n^{-1}) Q (I - U n^{-1}) = 0$ because $n^{-1} Q U = Q$,
$n^{-1} U Q = n^{-1} \sum_i q_i\, U$, $n^{-2} U Q U = n^{-1} \sum_i q_i\, U$. But
$(I - U n^{-1}) Q' (I - U n^{-1}) = [(I - U n^{-1}) Q (I - U n^{-1})]'$ and the proof
is completed.
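The computation in the second half can also be seen at a glance: with $1$ the
$n \times 1$ unity vector we may write $Q = q 1'$, $q = [q_1, \ldots, q_n]'$, and since
$S 1 = (I - U n^{-1}) 1 = 0$,

$$S Q S = S q\,(1' S) = S q\,(S 1)' = 0, \qquad S Q' S = (S Q S)' = 0,$$

so that $S K S = S(Q + Q')S = 0$ for every choice of the $q_i$.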
We will use, also, the following well-known theorem (see [1]): if the random
vector $Y$ is $N(\mu, V)$ with $V$ p.d., and if the $n \times n$ matrix $W$ is real and
symmetric (rank $W = w$), then

$$< D(Y' W Y) = D\big(\chi^2(w,\; \mu' W \mu / 2)\big) >
\;\Leftrightarrow\; < W V W = W >.$$

So to prove Theorem 1 we have to prove

$$< A V A = \alpha A > \;\Leftrightarrow\; < V = T >$$

(for, put $W = A \alpha^{-1}$).
Proof of the implication $\Leftarrow$: For $A T A = (S - D)\, T\, (S - D)$ we have
$A T A = S T S - S T D - D T S + D T D$; substituting $T$ and recalling the previous
results, we have:

$$S T S = \alpha S + \tfrac{1}{2}(S X Z' \Delta S + S \Delta Z X' S); \qquad
S T D = \alpha D + \tfrac{1}{2}(S X Z' \Delta D + S \Delta Z X' D);$$

$$D T S = \alpha D + \tfrac{1}{2}(D X Z' \Delta S + D \Delta Z X' S); \qquad
D T D = \alpha D + \tfrac{1}{2}(D X Z' \Delta D + D \Delta Z X' D);$$

and thus $A T A = \alpha A$.
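The cancellations used here rest on two facts that are immediate from the
definitions: $S(Q + Q')S = 0$ (the lemma above, together with $S D = D S = D$)
and $A X = 0$. For the latter,

$$A X = (S - D) X = S X - S X X' S X\,(X' S X)^{-1} = S X - S X = 0,$$

so that $A(Q + Q')A = A S (Q + Q') S A = 0$ and
$A(X Z' \Delta + \Delta Z X')A = 0$, which is what makes the four expansions collapse
to $\alpha(S - D - D + D) = \alpha A$.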
Proof of the implication $\Rightarrow$:
Let us suppose that we may have a matrix $V^* = T + H$ for which
$A V^* A = \alpha A$, and we show that $H$ must have the same structure as $T$. In fact
$A V^* A = \alpha A$ is $A T A + A H A = \alpha A$, and because $A T A = \alpha A$ we have
to study the equation $A H A = 0$. By developing the left-hand side of $A H A = 0$
we have

$$S\big(H - H S X X'(X'SX)^{-1} - X X'(X'SX)^{-1} S H
+ X X'(X'SX)^{-1} S H S X X'(X'SX)^{-1}\big) S = 0,$$

and by our previous result

$$H - H S X X'(X'SX)^{-1} - X X'(X'SX)^{-1} S H
+ X X'(X'SX)^{-1} S H S X X'(X'SX)^{-1} = R + R',$$

where $R$ is a matrix of the structure of $Q$ and the $r_i$, $i = 1, \ldots, n$, are
arbitrary.
Now, let $h_{ij}$ be an element of the matrix $H$; the $(i,j)$ element of this
relation expresses $h_{ij}$ through $x_i$, $x_j$ and the quantities
$\sum_r h_{ir}(x_r - \bar{x})$ and $\sum_k h_{kj}(x_k - \bar{x})$. Putting
$\sum_r h_{ir}(x_r - \bar{x}) = t_i$ and $\sum_k h_{kj}(x_k - \bar{x}) = t_j$ and
solving the system of equations so obtained, we find $h_{ij}$ expressed through
$h_{ii}$, $h_{jj}$, the ratios $x_i/x_j$, $x_j/x_i$ and the arbitrary $r_i$, $r_j$;
in matrix form this gives

$$H = F + F' + \tfrac{1}{2}(X Z' \Gamma + \Gamma Z X'),$$

where $\Gamma$ is a diagonal matrix and $F$ is a matrix of the same structure as $Q$.
Hence $H$ has the same structure as $T - \alpha I$, and thus the structure of
$V^* = T + H$ is the same as that of $T$, and the Theorem is proved.
3. Analysis of covariance

Theorem 2. If the random vector $Y$ is $N(\mu, T)$, then

$$< D(Y' A Y) = D\big(\alpha \chi^2(n-2,\; \mu' A \mu / 2\alpha)\big) >
\;\Leftrightarrow\;$$

$$< D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1,\; \mu'(A-B)\mu / 2\alpha)\big),\quad
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),$$
$$Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >.$$
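One way to read the central chi-square attached to $Y' B Y$: writing $u_i$ for the
indicator vector of the $i$-th treatment (a notation used only in this remark),
the mean vector $\mu$ of $Y$ under the model of Section 1 lies in the space spanned
by $1$, the $u_i$ and $X$, and $B$ annihilates that space, so that

$$B \mu = 0, \qquad \mu'(A - B)\mu = \mu' A \mu.$$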
Proof. Theorem 5 of Graybill-Marsaglia [1] says, among other things, that if $Y$
is $N(\mu, V)$, with $V$ an $n \times n$ symmetric p.d. matrix,
$Y' W Y = \sum_{i=1}^k Y' W_i Y$, rank $W_i = p_i$, rank $W = p$, then:

$$< (W V)^2 = W V,\; \textstyle\sum_i p_i = p > \;\Leftrightarrow\;
< D(Y' W_i Y) = D\big(\chi^2(p_i, \lambda_i)\big),\;
\text{the } Y' W_i Y \text{ mutually independent} >.$$

But from Theorem 1 we have $A T A = \alpha A$, or
$\big(\tfrac{A}{\alpha} T\big)^2 = \tfrac{A}{\alpha} T$. Then if we put

$$W_1 = \frac{A - B}{\alpha}, \qquad W_2 = \frac{B}{\alpha}, \qquad W = \frac{A}{\alpha},$$

we have rank $W_1 = t-1$, rank $W_2 = n-t-1$, rank $W = n-2$, and thus we may write:

$$D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1,\; \mu'(A-B)\mu / 2\alpha)\big), \qquad
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),$$

and $Y'(A-B)Y$ and $Y' B Y$ independent.
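The ranks quoted here can be checked by a trace computation, since $A$, $B$ and
$A - B$ are idempotent:

$$\operatorname{rank} A = \operatorname{tr} S - \operatorname{tr} D = (n-1) - 1 = n-2,
\qquad
\operatorname{rank} B = n - \operatorname{tr}\Big(\sum_i U_i r_i^{-1}\Big)
- \operatorname{tr} C = n - t - 1,$$

$$\operatorname{rank}(A - B) = \operatorname{tr} A - \operatorname{tr} B
= (n-2) - (n-t-1) = t - 1.$$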
Theorem 3. If the random vector $Y$ is $N(\mu, V)$, $V$ p.d., and $Y'(A-B)Y$,
$Y' B Y$ are independent, then

$$< V = T \text{ for all } \mu > \;\Leftrightarrow\;
< D\Big(\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}\Big)
= D\Big(F'\big(t-1,\, n-t-1,\, \tfrac{\mu' A \mu}{2\alpha}\big)\Big) >.$$
Proof of the implication $\Rightarrow$: This implication follows immediately from
Theorems 1 and 2.

Proof of the implication $\Leftarrow$: Taking $\mu = \mu^* = \mu 1$, where $1$ is the
$n \times 1$ unity vector, we have
$A \mu^* = (S - D)\mu^* = S \mu^* - S X X' S \mu^*\,(X' S X)^{-1}$. From simple
calculation we verify that $S \mu^* = 0$, so that $A \mu^* = 0$. In the same way we
have $\mu^{*\prime} A = 0$. But because $A = (A - B) + B$ we have that
$$< A \mu^* = 0 > \;\Leftrightarrow\; < (A - B)\mu^* = 0,\; B \mu^* = 0 >;$$

indeed $< B A \mu^* = 0 > \Leftrightarrow < B \mu^* = 0 >$ and also
$< (A - B) A \mu^* = 0 > \Leftrightarrow < (A - B) \mu^* = 0 >$.
But it is known that, for every p.d. $V$ and $\mu = \mu^*$, any quadratic form is
decomposable into a linear combination of independent central r.v., i.e.

$$Y'(A-B)Y = \sum_{i=1}^{t-1} \lambda_i\, \chi_i^2(1), \qquad
Y' B Y = \sum_{j=1}^{n-t-1} \alpha_j\, \chi_j^2(1),$$

where the $\lambda_i$ and $\alpha_j$ are the parameters of the linear combination.
Then it follows, from our hypothesis, that

$$< D\Big(\frac{Y'(A-B)Y}{Y' B Y} \cdot \frac{n-t-1}{t-1}\Big)
= D\big(F'(t-1,\, n-t-1)\big),\;
Y'(A-B)Y \text{ independent of } Y' B Y > \;\Rightarrow$$

$$< D\Big(\frac{\sum_i \lambda_i \chi_i^2(1)}{\sum_j \alpha_j \chi_j^2(1)}
\cdot \frac{n-t-1}{t-1}\Big)
= D\Big(\frac{\chi^2(t-1)}{\chi^2(n-t-1)} \cdot \frac{n-t-1}{t-1}\Big),\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >.$$

But from Baldessari (see [2]) this last statement is equivalent to

$$< \lambda_1 = \cdots = \lambda_{t-1} = \alpha_1 = \alpha_2 = \cdots
= \alpha_{n-t-1} = \alpha > 0,\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >$$

$$\Leftrightarrow\;
< D(Y'(A-B)Y) = D\big(\alpha \chi^2(t-1)\big),\;
D(Y' B Y) = D\big(\alpha \chi^2(n-t-1)\big),\;
Y'(A-B)Y \text{ and } Y' B Y \text{ independent} >$$

$$\Rightarrow\; < D(Y' A Y) = D\big(\alpha \chi^2(n-2)\big) >
\;\Leftrightarrow\; < V = T >,$$

and the proof of our Theorem is completed.
4. Acknowledgment
The author expresses her gratitude to Professor N. L. Johnson for helpful suggestions and for revising the article.
References
[1] Graybill, F. A. and Marsaglia, G., "Idempotent matrices and quadratic forms
    in the general linear hypothesis", Ann. Math. Statist., 28 (1957), 678.

[2] Baldessari, B., "Analysis of variance of dependent data", Institute of
    Statistics Mimeo Series No. 467, March 1966.