Roy, S. N. and R. E. Bargmann (1957). "Tests of multiple independence and the associated confidence bounds." (Air Research and Dev. Command)

TESTS OF MULTIPLE INDEPENDENCE AND THE ASSOCIATED CONFIDENCE BOUNDS

by

S. N. Roy and R. E. Bargmann

This research was supported by the United States Air Force through the Air Force Office of Scientific Research of the Air Research and Development Command, under contract No. AF 18(600)-83. Reproduction in whole or in part is permitted for any purpose of the United States Government.

Institute of Statistics
Mimeograph Series No. 175
June, 1957
TESTS OF MULTIPLE INDEPENDENCE AND THE ASSOCIATED CONFIDENCE BOUNDS*

by S. N. Roy and R. E. Bargmann

* This research was supported by the United States Air Force through the Air Force Office of Scientific Research of the Air Research and Development Command.

1. Summary.
In this paper a test based on the union-intersection principle is proposed for overall independence between p variates distributed according to the multivariate normal law, and this is extended to the hypothesis of independence between several groups of variates which have a joint multivariate normal distribution. Methods used in earlier papers [3, 4] have been applied in order to invert these tests for each situation, and to obtain, with a joint confidence coefficient greater than or equal to a preassigned level, simultaneous confidence bounds on certain parametric functions.
These parametric functions are, in case I, the moduli of the regression vectors of the variate p on the variates $(p-1), (p-2), \ldots, 2, 1$, or on any subset of the latter; of the variate $(p-1)$ on the variates $(p-2), (p-3), \ldots, 2, 1$, or any subset of the latter, etc.; and finally, of the variate 2 on the variate 1.
For case II, parallel to each case considered above, there is an analogous statement in which the regression vector is replaced by a regression matrix, $\beta$, say, and the "modulus" of the regression vector is replaced by the (positive) square root of the largest characteristic root of $(\beta\beta')$. Simultaneous confidence bounds on these sets of parameters are given. As far as the proposed tests of hypotheses of multiple independence are concerned, they are offered as an alternative to another class of tests based on the likelihood-ratio criterion [5, 6] which has been known for a long time. So far as the confidence bounds are concerned it is believed, however, that no other easily obtainable confidence bounds are available in this area.
One of the objects of these confidence bounds is the detection of the "culprit variates" in the case of rejection of the hypothesis of multiple independence, for the "complex" hypothesis is, in this case, the intersection of several more "elementary" hypotheses of two-by-two independence.
2. Introduction, Notation, and Preliminaries.

Case I, which deals with the question of independence among p normally distributed variates, represents a well-known situation which has occurred repeatedly in applications. For case II, which deals with the question of independence between k sets of normally distributed variates (where each set contains one or more variates), a number of potential applications has been described by Wilks [6].
In addition to the situations mentioned by Wilks, an interesting application concerns the problem of "unreliable measurement". If we consider the $p_i$ variates $\underline{x}_i$ ($i = 1, 2, \ldots, k$) as different measurements of a physically identical quantity which, because of the inaccuracy of the measuring instrument, are not in perfect correlation, the procedures outlined in sections 5 and 6 will study the independence or dependence of the k underlying "true" physical quantities. The "correction-for-attenuation" technique which is widely used in this situation has two serious drawbacks which the present method tries to overcome: (1) it assumes equal error variance for each fallible measurement, and (2) it makes use of a statistic, the "correlation corrected for attenuation", which is not a correlation coefficient, for it may attain values greater than unity. The present method is free from these shortcomings. The confidence bounds discussed in section 6 will then give an indication of the maximum attainable degree of prediction if infallible measurements could be made.
Suppose we have a random sample of size $n+1$ from an $N[\underline{\xi}(p \times 1),\ \Sigma(p \times p)]$, with $p \le n$. Then, denoting by $S(p \times p)$ the sample dispersion matrix, we know that S is symmetric and everywhere at least p.s.d. (and also p.d., a.e.). It is also well known that, a.e., there exists a one-to-one transformation from $S(p \times p)$ to $T(p \times p)$ given by $nS = TT'$, where T is a lower triangular matrix with positive diagonal elements. Let $t_{ij}$ ($i \ge j = 1, 2, \ldots, p$) denote the elements of T, let $s_{ij}$ and $\sigma_{ij}$ ($s_{ij} = s_{ji}$, $\sigma_{ij} = \sigma_{ji}$; $i, j = 1, 2, \ldots, p$) denote the elements of S and $\Sigma$, and let $s^{ij}$ and $\sigma^{ij}$ denote the elements of $S^{-1}$ and $\Sigma^{-1}$. Furthermore, let $r_{p\cdot1,2,\ldots,(p-1)}$, $r_{(p-1)\cdot1,2,\ldots,(p-2)}$, $\ldots$, $r_{3\cdot1,2}$, and $r_{2\cdot1}$ denote, respectively, the multiple correlation coefficient of (p) with $(1, 2, \ldots, p-1)$, of $(p-1)$ with $(1, 2, \ldots, p-2)$, and so on, and finally the simple correlation coefficient of (2) with (1). It may be noted that all except the last are non-negative and a.e. positive.
These multiple correlation coefficients will be called the step-down correlations. Likewise, let

(2.1)  $\beta_{i\cdot1,2,\ldots,i-1}\,(1 \times (i-1)) = [\sigma_{1i}\ \sigma_{2i}\ \cdots\ \sigma_{i-1,i}]\begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1,i-1} \\ \sigma_{12} & \sigma_{22} & \cdots & \sigma_{2,i-1} \\ \vdots & \vdots & & \vdots \\ \sigma_{1,i-1} & \sigma_{2,i-1} & \cdots & \sigma_{i-1,i-1} \end{bmatrix}^{-1}$

(for $i = p, p-1, \ldots, 2$) denote the population regression vector of (i) on $(1, 2, \ldots, i-1)$, and

(2.2)  $b_{i\cdot1,2,\ldots,i-1}\,(1 \times (i-1)) = [s_{1i}\ s_{2i}\ \cdots\ s_{i-1,i}]\begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1,i-1} \\ s_{12} & s_{22} & \cdots & s_{2,i-1} \\ \vdots & \vdots & & \vdots \\ s_{1,i-1} & s_{2,i-1} & \cdots & s_{i-1,i-1} \end{bmatrix}^{-1}$

(for $i = p, p-1, \ldots, 2$) denote the corresponding sample regression vector. These regression vectors will be called the step-down regression vectors.
Next, we will present the expression for the multiple correlation coefficient by treating it as a special case of a canonical correlation (which will be convenient for later purposes). We have

(2.3)  $r^2_{i\cdot1,2,\ldots,i-1} = \dfrac{1}{s_{ii}}\,[s_{1i}\ \cdots\ s_{i-1,i}]\begin{bmatrix} s_{11} & \cdots & s_{1,i-1} \\ \vdots & & \vdots \\ s_{1,i-1} & \cdots & s_{i-1,i-1} \end{bmatrix}^{-1}\begin{bmatrix} s_{1i} \\ \vdots \\ s_{i-1,i} \end{bmatrix}$

for $i = p, p-1, \ldots, 2$. Next we have, by using $nS = TT'$,

(2.3.1)  $n[s_{1i}\ \cdots\ s_{i-1,i}] = [t_{i1}\ \cdots\ t_{i,i-1}]\begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ t_{21} & t_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ t_{i-1,1} & t_{i-1,2} & \cdots & t_{i-1,i-1} \end{bmatrix}'$, $\quad n s_{ii} = \sum_{j=1}^{i} t^2_{ij}$, and $\quad n\begin{bmatrix} s_{11} & \cdots & s_{1,i-1} \\ \vdots & & \vdots \\ s_{1,i-1} & \cdots & s_{i-1,i-1} \end{bmatrix} = \begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ t_{i-1,1} & t_{i-1,2} & \cdots & t_{i-1,i-1} \end{bmatrix}\begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ t_{i-1,1} & t_{i-1,2} & \cdots & t_{i-1,i-1} \end{bmatrix}'$.

It is then easy to check, by substituting the expressions (2.3.1) into (2.3), that

(2.4)  $r^2_{i\cdot1,2,\ldots,i-1} = \dfrac{t^2_{i1} + t^2_{i2} + \cdots + t^2_{i,i-1}}{t^2_{i1} + t^2_{i2} + \cdots + t^2_{ii}}$

for $i = p, p-1, \ldots, 2$.
Now, let us turn to the case of a $(p_1 + p_2 + \cdots + p_k)$-variate ($= p$-variate, say) normal distribution and partition the population dispersion matrix, $\Sigma$, into

(2.5)  $\Sigma(p \times p) = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1k} \\ \Sigma'_{12} & \Sigma_{22} & \cdots & \Sigma_{2k} \\ \vdots & \vdots & & \vdots \\ \Sigma'_{1k} & \Sigma'_{2k} & \cdots & \Sigma_{kk} \end{bmatrix}\begin{matrix} (p_1) \\ (p_2) \\ \\ (p_k) \end{matrix}$

and the sample dispersion matrix, S, into

(2.6)  $S(p \times p) = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1k} \\ S'_{12} & S_{22} & \cdots & S_{2k} \\ \vdots & \vdots & & \vdots \\ S'_{1k} & S'_{2k} & \cdots & S_{kk} \end{bmatrix}\begin{matrix} (p_1) \\ (p_2) \\ \\ (p_k) \end{matrix}$

Regarding each submatrix as an element, let us say that there are k "pseudo-rows" and k "pseudo-columns" in the matrices on the right sides of (2.5) and (2.6).
Let $\beta_{i\cdot1,2,\ldots,i-1}$ and $B_{i\cdot1,2,\ldots,i-1}$ (for $i = k, k-1, \ldots, 2$) denote the population and the sample regression matrix of the $(p_i)$-set on the $(p_{i-1} + p_{i-2} + \cdots + p_1)$-set. These matrices are given by the expressions:

(2.7)  $\beta_{i\cdot1,2,\ldots,i-1}\,(p_i \times (p_1 + p_2 + \cdots + p_{i-1})) = [\Sigma'_{1i}\ \Sigma'_{2i}\ \cdots\ \Sigma'_{i-1,i}]\begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1,i-1} \\ \Sigma'_{12} & \Sigma_{22} & \cdots & \Sigma_{2,i-1} \\ \vdots & \vdots & & \vdots \\ \Sigma'_{1,i-1} & \Sigma'_{2,i-1} & \cdots & \Sigma_{i-1,i-1} \end{bmatrix}^{-1}$

and

(2.8)  $B_{i\cdot1,2,\ldots,i-1}\,(p_i \times (p_1 + p_2 + \cdots + p_{i-1})) = [S'_{1i}\ S'_{2i}\ \cdots\ S'_{i-1,i}]\begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1,i-1} \\ S'_{12} & S_{22} & \cdots & S_{2,i-1} \\ \vdots & \vdots & & \vdots \\ S'_{1,i-1} & S'_{2,i-1} & \cdots & S_{i-1,i-1} \end{bmatrix}^{-1}$

for $i = k, k-1, \ldots, 2$. The $\beta$'s and B's will be called the step-down regression matrices.
Also let us denote by

(2.8.1)  $0 < c^{(1)}_i \le c^{(2)}_i \le \cdots \le c^{(p_i)}_i < 1$

the $p_i$ characteristic roots of the matrix

(2.8.2)  $S^{-1}_{ii}\,[S'_{1i}\ S'_{2i}\ \cdots\ S'_{i-1,i}]\begin{bmatrix} S_{11} & \cdots & S_{1,i-1} \\ \vdots & & \vdots \\ S'_{1,i-1} & \cdots & S_{i-1,i-1} \end{bmatrix}^{-1}\begin{bmatrix} S_{1i} \\ S_{2i} \\ \vdots \\ S_{i-1,i} \end{bmatrix}$

for $i = k, k-1, \ldots, 2$. It will be noticed that these c's are the squares of the canonical correlation coefficients of the $(p_i)$-set with the $(p_1 + p_2 + \cdots + p_{i-1})$-set and that, a.e., the inequalities in (2.8.1) will be strict.
For any $(p_i)$-set, all the $p_i$ canonical correlation coefficients (or rather, as will be seen later, the largest of them) will play the same role as $r_{i\cdot1,2,\ldots,i-1}$ in the previous case. These will be called the step-down sets of canonical correlations. We are assuming here, for simplicity of discussion but without loss of generality, that the sets are so numbered as to make $\sum_{j=1}^{i-1} p_j \ge p_i$ (for $i = 2, 3, \ldots, k$). The matrix corresponding to $\beta$ in the previous case will be introduced in a later section.
Sections 3 and 4 will be concerned with the first case, i.e., the case of a p-variate normal distribution; section 3 will describe a test of the hypothesis $H_0$: $\sigma_{ij} = 0$ ($i \ne j = 1, 2, \ldots, p$), and section 4 will present simultaneous confidence bounds, for $i = p, p-1, \ldots, 2$, on $(\beta_{i\cdot1,2,\ldots,i-1}\,\beta'_{i\cdot1,2,\ldots,i-1})^{1/2}$ [and on truncations obtained by deleting any $1, 2, \ldots, (i-2)$ variates of the $(i-1)$-set].
Sections 5 and 6 will be concerned with the second case, i.e., that of a $(p_1 + p_2 + \cdots + p_k)$-variate normal distribution; section 5 will describe a test of the hypothesis $H_0$: $\Sigma_{ij} = 0$ ($i \ne j = 1, 2, \ldots, k$), and section 6 will present simultaneous confidence bounds, for $i = k, k-1, \ldots, 2$, on the largest characteristic root of $(\beta_{i\cdot1,2,\ldots,i-1}\,\beta'_{i\cdot1,2,\ldots,i-1})$ [and on truncations obtained by deleting any $1, 2, \ldots, (i-2)$ of the sets $(p_1), (p_2), \ldots, (p_{i-1})$]. It will be noted that, in case I, the variate (i) is independent of the variates $1, 2, \ldots, i-1$ if and only if $\beta_{i\cdot1,2,\ldots,i-1}\,\beta'_{i\cdot1,2,\ldots,i-1} = 0$, and, in case II, independence of the $(p_i)$-set of the $[(p_1), (p_2), \ldots, (p_{i-1})]$-set implies, and is implied by, the vanishing of the largest characteristic root of $(\beta_{i\cdot1,2,\ldots,i-1}\,\beta'_{i\cdot1,2,\ldots,i-1})$.
3. The Test of Independence for the p-variate Problem.

3.1 Independence (in distribution) of the step-down correlations, under the null hypothesis.
The joint distribution of the $t_{ij}$'s, for general $\Sigma$, is well known and given by

(3.1.1)  $\text{const} \cdot \exp\big[-\tfrac{1}{2}\operatorname{tr} \Sigma^{-1} TT'\big] \prod_{i=1}^{p} t^{n-i}_{ii} \prod_{i \ge j = 1}^{p} dt_{ij}$.

Among various proofs, a recent one is given in [1, 2]. Under the null hypothesis we have $\Sigma = D_{\sigma_{ii}}$, where the right side denotes a diagonal matrix with elements $\sigma_{11}, \sigma_{22}, \ldots, \sigma_{pp}$. In this situation, (3.1.1) reduces to
(3.1.2)  $\text{const} \cdot \exp\Big[-\tfrac{1}{2} \sum_{i=1}^{p} \frac{1}{\sigma_{ii}} \sum_{j=1}^{i} t^2_{ij}\Big] \prod_{i=1}^{p} t^{n-i}_{ii} \prod_{i \ge j = 1}^{p} dt_{ij}$.

It should be noticed that the $t_{ii}$'s vary from 0 to $\infty$, and the $t_{ij}$'s ($i > j$) from $-\infty$ to $+\infty$. Now a comparison with the expression (2.4) shows immediately that the $r_{i\cdot1,2,\ldots,i-1}$'s ($i = p, \ldots, 2$) are independently distributed, and their joint distribution is given by

(3.1.3)  $\text{const} \cdot \prod_{i=2}^{p} \big(r^2_{i\cdot1,2,\ldots,i-1}\big)^{(i-3)/2} \big(1 - r^2_{i\cdot1,2,\ldots,i-1}\big)^{(n-i-1)/2}\, d\big(r^2_{i\cdot1,2,\ldots,i-1}\big)$.
3.2 The proposed test, and a reason for the allocation of component probabilities.
The proposed test is as follows:

(3.2.1)  Accept $H_0$ over $\Big[\bigcap_{i=2}^{p}\, r^2_{i\cdot1,2,\ldots,i-1} \le \mu\Big]$, and reject $H_0$ over $\Big[\bigcup_{i=2}^{p}\, r^2_{i\cdot1,2,\ldots,i-1} > \mu\Big]$, where $\mu$ is given by

(3.2.2)  $P\Big[\bigcap_{i=2}^{p}\, r^2_{i\cdot1,2,\ldots,i-1} \le \mu \,\Big|\, H_0\Big] = \prod_{i=2}^{p} P\big[r^2_{i\cdot1,2,\ldots,i-1} \le \mu \,\big|\, H_0\big] = 1 - \alpha$.

To obtain $\mu$, proceed as follows: Take a trial value, $\mu_1$, say, between zero and one. Using this value for a given n and for $i = 2, 3, \ldots, p$, obtain, from the Tables of the Incomplete Beta Function, the probabilities corresponding to the individual factors of the left side of (3.2.2); call these probabilities $\gamma_2, \gamma_3, \ldots, \gamma_p$, say; form the product of the $\gamma_i$'s and denote it by $\gamma$. Proceed in the same manner for other trial values, $\mu_2, \mu_3$, etc., and plot the $\mu_j$'s against the resulting $\gamma$'s. Then, on this plot, $\mu$ in (3.2.2) is that value of the $\mu_j$'s which corresponds to $\gamma = 1 - \alpha$, the preassigned confidence level.
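In present-day computation the plotted interpolation can be replaced by direct root-finding. The sketch below assumes the Beta$((i-1)/2,\,(n-i+1)/2)$ form of the individual factors implied by (3.1.3), with SciPy routines standing in for the printed tables:

```python
# Solve the product equation (3.2.2) for mu by root-finding.
from scipy.optimize import brentq
from scipy.special import betainc  # regularized incomplete beta I_x(a, b)

def joint_prob(mu, p, n):
    """gamma(mu) = prod_{i=2}^{p} P[r^2_{i.1,...,i-1} <= mu] under H0."""
    gamma = 1.0
    for i in range(2, p + 1):
        gamma *= betainc((i - 1) / 2.0, (n - i + 1) / 2.0, mu)
    return gamma

def find_mu(p, n, alpha):
    """mu such that the joint acceptance probability equals 1 - alpha."""
    return brentq(lambda m: joint_prob(m, p, n) - (1.0 - alpha),
                  1e-10, 1.0 - 1e-10)

print(find_mu(p=5, n=50, alpha=0.05))
```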
It is obvious that, if the same $\mu$ goes with the different component regions $[r^2_{i\cdot1,2,\ldots,i-1} \le \mu]$, the probability measures that go with these regions are all different.
One reason why we make this kind of allocation of the different $\gamma_i$'s is the following: Notice that the acceptance region for $H_0$ is the intersection (over $i = p, p-1, \ldots, 2$) of regions $[r^2_{i\cdot1,2,\ldots,i-1} \le \mu]$; but, for a given i, $[r^2_{i\cdot1,2,\ldots,i-1} \le \mu]$ is itself the intersection of regions of the type $[r^2_{i\cdot L(1,2,\ldots,i-1)} \le \mu]$, where $r_{i\cdot L(1,2,\ldots,i-1)}$ denotes the (simple) correlation of the i-th variate with any linear combination of the variates $(1, 2, \ldots, i-1)$, which includes, as a special case, the variates $(1, 2, \ldots, i-1)$ individually. Thus, if we allocate the $\gamma_i$'s in such a way as to make $\mu$ the same for each component region, we attach the same weight not only to the correlations between any pair of the observed variates but also to the correlations between each variate and linear combinations of some others. The reader will perceive that this allocation is not completely symmetric. While symmetry is preserved with respect to all correlations by pairs, the step-down procedure is asymmetric as regards the correlation of any variate with any linear combination of all the other variates. However, this is perhaps the best that could be done under this particular approach.
It should be noted that, if the square of any simple correlation in the correlation matrix exceeds the value of $\mu$, we will have to reject the hypothesis of independence. If, however, the square of the largest correlation coefficient in the correlation matrix stays below $\mu$, we will have to perform the step-down process in order to decide on acceptance or rejection of the hypothesis of independence.
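A minimal sketch of this decision rule, computing the squared step-down correlations through the triangular factorization of (2.4) (the NumPy phrasing and function names are illustrative assumptions):

```python
# Step-down test (3.2.1): accept H0 iff every r^2_{i.1,...,i-1} <= mu.
import numpy as np

def step_down_r2(S):
    """Squared step-down correlations, i = 2, ..., p, via (2.4)."""
    T = np.linalg.cholesky(S)    # the scale of S cancels in the ratio (2.4)
    r2 = []
    for i in range(1, S.shape[0]):
        row2 = T[i, : i + 1] ** 2
        r2.append(row2[:-1].sum() / row2.sum())
    return np.array(r2)

def accept_independence(S, mu):
    """The union-intersection decision of (3.2.1)."""
    return bool((step_down_r2(S) <= mu).all())
```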
3.3 Relation to the Likelihood-Ratio Test.

Since the determinant of R, the correlation matrix, equals $(1 - r^2_{p\cdot1,2,\ldots,p-1})(1 - r^2_{(p-1)\cdot1,2,\ldots,p-2}) \cdots (1 - r^2_{2\cdot1})$, a test based on the product of the complements of the squares of the step-down correlations is equivalent to the likelihood-ratio test. While the distribution of the determinant of R is fairly complicated, even under the null hypothesis of independence [7], its moments [6] are well known and easily obtained from the joint distribution of the correlation coefficients under the hypothesis of independence. It can be easily verified that they satisfy the recurrence relations:
(3.3.1)  $\mu'_1 = \prod_{i=1}^{p} \Big(1 - \dfrac{i-1}{n}\Big)$, and

(3.3.2)  $\mu'_{a+1} = \mu'_a \prod_{i=1}^{p} \Big(1 - \dfrac{i-1}{n+2a}\Big)$.
From these relations it is quite simple to obtain the moments, hence the coefficients of skewness and kurtosis, and, from Table 42 of the Biometrika Tables for Statisticians [8], we can obtain, at least for moderately large n, very good approximations to the desired percentage points of the cdf. Thus, for testing the hypothesis of independence, the determinant test is quite useful and closely related to the step-down procedure presented in this paper. At the moment, however, we do not know of any method to use this determinant test for the construction of confidence bounds on parametric functions without running into complicated non-central distributions, whereas the step-down procedure, as described above, can be immediately inverted for the purpose of constructing simultaneous confidence bounds.
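A short sketch of the recurrences (3.3.1)-(3.3.2), from which the first few moments of $|R|$ under $H_0$ follow directly (the Python phrasing is an assumption of ours, not part of the original):

```python
# Moments mu'_1, mu'_2, ... of |R| under H0 via (3.3.1)-(3.3.2).
from math import prod

def det_R_moments(p, n, num_moments=4):
    moments, mu = [], 1.0
    for a in range(num_moments):
        mu *= prod(1.0 - (i - 1) / (n + 2 * a) for i in range(1, p + 1))
        moments.append(mu)
    return moments

print(det_R_moments(p=4, n=30))
```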
4. Confidence Bounds Associated with the Test of Independence for the p-variate Problem.
For shortness, let us now denote by just $r_i$ the $r_{i\cdot1,2,\ldots,i-1}$ defined by (2.3) and (2.4); by just $\beta_i$ (with components $\beta_{i1}, \beta_{i2}, \ldots, \beta_{i,i-1}$, say) the $\beta_{i\cdot1,2,\ldots,i-1}$ defined by (2.1); and by just $b_i$ (with components $b_{i1}, b_{i2}, \ldots, b_{i,i-1}$, say) the $b_{i\cdot1,2,\ldots,i-1}$ defined by (2.2). Assuming a general (symmetric, p.d.) $\Sigma$, let us now transform the original variates $x_1, x_2, \ldots, x_p$ to a new set $x^*_1, x^*_2, \ldots, x^*_p$ defined by

(4.1)  $\begin{bmatrix} x^*_1 \\ x^*_2 \\ \vdots \\ x^*_p \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\beta_{21} & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ -\beta_{p1} & -\beta_{p2} & -\beta_{p3} & \cdots & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix}$
Then it can be easily verified that the new variates are uncorrelated, and hence that the step-down correlations of the new variates, $r^*_{i\cdot1,2,\ldots,i-1}$, say ($i = p, p-1, \ldots, 2$), are independently distributed. With a joint probability, $1 - \alpha$, say, let us now make the simultaneous statements:

(4.2)  $r^{*2}_{i\cdot1,2,\ldots,i-1} \le \mu$  (for $i = p, p-1, \ldots, 2$).

We have already seen that, given $\alpha$, we can find $\mu$, and also, given $\mu$, we can easily find $\alpha$. In order to invert the typical component statement (4.2), and thus make a confidence statement on $\beta_{i\cdot1,2,\ldots,i-1}$, i.e., on $[\beta_{i1} \cdots \beta_{i,i-1}]$, we observe that the multiple correlation coefficient between $x^*_i$ and $[x^*_1, x^*_2, \ldots, x^*_{i-1}]$ is the same as that between $x^*_i$ and $[x_1, x_2, \ldots, x_{i-1}]$, since the starred variates in the first square brackets are linear combinations of just the non-starred variates in the second square brackets; this fact simplifies our calculation of the desired expressions in terms of the S matrix, i.e., the sample dispersion matrix of the original variates. We may now use the results obtained in reference [4] in connection with the confidence bounds on a (pseudo-) regression matrix of a p-set on a q-set ($p \le q$) for a $(p + q)$-variate normal distribution.
Let us take expression (3.2) from reference [4] and renumber it as

(4.3)  $c^{1/2}_{\max}(BB') - \lambda\, c^{1/2}_{\max}(S_{1.2})\, c^{1/2}_{\max}(S^{-1}_{22}) \le c^{1/2}_{\max}(\beta\beta') \le c^{1/2}_{\max}(BB') + \lambda\, c^{1/2}_{\max}(S_{1.2})\, c^{1/2}_{\max}(S^{-1}_{22})$,

where B and $\beta$ are the sample and population regression matrices of the p-set on the q-set given, respectively, by $B = S_{12}S^{-1}_{22}$ and $\beta = \Sigma_{12}\Sigma^{-1}_{22}$; $S_{1.2}$ is the sample "residual" matrix of the p-set on the q-set given by $S_{1.2} = S_{11} - S_{12}S^{-1}_{22}S'_{12}$; $S_{22}$ is, of course, the sample dispersion matrix of the q-set. Given a preassigned confidence coefficient, $1 - \alpha$, the value of $\lambda$ in (4.3) can be obtained from the central distribution of the square of the largest canonical correlation coefficient, for which recursion formulas are available in reference [2].
For $p = 1$ and $q = i-1$, (4.3) reduces to

(4.4)  $(b_i b'_i)^{1/2} - \lambda\,\big(s_{ii} - \underline{s}'_i S^{-1}_{i-1} \underline{s}_i\big)^{1/2}\, c^{1/2}_{\max}(S^{-1}_{i-1}) \le (\beta_i \beta'_i)^{1/2} \le (b_i b'_i)^{1/2} + \lambda\,\big(s_{ii} - \underline{s}'_i S^{-1}_{i-1} \underline{s}_i\big)^{1/2}\, c^{1/2}_{\max}(S^{-1}_{i-1})$,

where $\beta_i$ and $b_i$ are defined in the opening paragraph of section 4, and

$S_{i-1} = \begin{bmatrix} s_{11} & \cdots & s_{1,i-1} \\ \vdots & & \vdots \\ s_{1,i-1} & \cdots & s_{i-1,i-1} \end{bmatrix}$, $\quad \underline{s}_i = \begin{bmatrix} s_{1i} \\ \vdots \\ s_{i-1,i} \end{bmatrix}$,

and, finally, $\lambda = +[\mu/(1-\mu)]^{1/2}$, where $\mu$ is obtained by the procedure outlined in the sequel of (3.2.2). It then follows that the typical statement (4.2), for $i = p, p-1, \ldots, 2$, will imply, with a joint probability $\ge 1 - \alpha$, simultaneous confidence bounds (4.4) on $(\beta_i\beta'_i)^{1/2}$ for $i = p, p-1, \ldots, 2$.
In equation (3.1) of reference [4], we may put $p = 1$ and $q = i-1$, and choose the vector $\underline{a}_2$ given there in such a way as to make any one, any two, etc., and finally any $(i-2)$ components of $\underline{a}_2$ equal to zero; if then we make the corresponding transition from (3.1) to (3.2) given in reference [4], we will have, along with each typical statement (4.4) above, truncated statements where any one, two, etc., and finally any $(i-2)$ components of $\beta_i$ and $b_i$ have been deleted without, however, disturbing the expressions that occur with $\lambda$.
Thus, statement (4.4) and the truncations mentioned above will result in $2^{i-1} - 1$ joint confidence statements for given i. Since i can take the values $p, p-1, \ldots, 2$, we will have, altogether, $\sum_{i=2}^{p} (2^{i-1} - 1) = 2^p - p - 1$ confidence statements with a joint confidence coefficient $\ge 1 - \alpha$.
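A sketch of the untruncated bounds (4.4), assuming a value of $\mu$ obtained as in section 3.2 (the NumPy calls and the function name are illustrative):

```python
# Simultaneous bounds (4.4) on (beta_i beta_i')^{1/2}, i = p, ..., 2.
import numpy as np

def step_down_bounds(S, mu):
    lam = np.sqrt(mu / (1.0 - mu))               # lambda = [mu/(1-mu)]^{1/2}
    bounds = {}
    for i in range(S.shape[0] - 1, 0, -1):       # 0-based row i is variate i+1
        S_prev, s_i = S[:i, :i], S[:i, i]
        b_i = np.linalg.solve(S_prev, s_i)       # b_i' = S_{i-1}^{-1} s_i
        centre = np.sqrt(b_i @ b_i)              # (b_i b_i')^{1/2}
        resid = S[i, i] - s_i @ b_i              # s_ii - s_i' S_{i-1}^{-1} s_i
        c_max = 1.0 / np.linalg.eigvalsh(S_prev)[0]  # largest root of S_{i-1}^{-1}
        half = lam * np.sqrt(resid * c_max)
        bounds[i + 1] = (max(centre - half, 0.0), centre + half)
    return bounds
```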
5. The Test of Independence for the $(p_1 + p_2 + \cdots + p_k)$-variate Problem.

5.1 Independence (in distribution) of the step-down sets of canonical correlation coefficients, under the null hypothesis.
Starting from (2.6) in section 2, we shall make a transformation from S to a partitioned triangular matrix, $\tilde{T}$, given by

$nS(p \times p) = \begin{bmatrix} T_{11} & 0 & \cdots & 0 \\ T_{21} & T_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ T_{k1} & T_{k2} & \cdots & T_{kk} \end{bmatrix}\begin{bmatrix} T_{11} & 0 & \cdots & 0 \\ T_{21} & T_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ T_{k1} & T_{k2} & \cdots & T_{kk} \end{bmatrix}'$,

with pseudo-rows and pseudo-columns of $p_1, p_2, \ldots, p_k$ elements, as in (2.6).
The distribution of the elements of $\tilde{T}$, under a general $\Sigma$, will be given by

(5.1.1)  $\text{const} \cdot \exp\big[-\tfrac{1}{2}\operatorname{tr} \Sigma^{-1} \tilde{T}\tilde{T}'\big] \prod_{i=1}^{p} t^{n-i}_{ii} \prod_{i \ge j} dT_{ij}$,

where $p = p_1 + p_2 + \cdots + p_k$, and the $T_{ii}$ (i.e., the diagonal blocks) are lower triangular. Under $H_0$: $\Sigma_{ij} = 0$ ($i \ne j = 1, 2, \ldots, k$), (5.1.1) reduces to
(5.1.2)  $\text{const} \cdot \exp\Big[-\tfrac{1}{2} \sum_{i=1}^{k} \operatorname{tr} \Sigma^{-1}_{ii}\,(T_{i1}, \ldots, T_{ii})(T_{i1}, \ldots, T_{ii})'\Big] \prod_{i=1}^{p} t^{n-i}_{ii} \prod_{i \ge j} dT_{ij}$.
For shortness, let us write the matrix (2.8.2) in the form

(5.1.3)  $S^{-1}_{ii}\, S^{(i)}\, S^{-1}_{i-1}\, S^{(i)\prime}$,

where

(5.1.4)  $S^{(i)}\,(p_i \times (p_1 + p_2 + \cdots + p_{i-1})) = [S'_{1i}\ S'_{2i}\ \cdots\ S'_{i-1,i}]$  and  $S_{i-1} = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1,i-1} \\ S'_{12} & S_{22} & \cdots & S_{2,i-1} \\ \vdots & \vdots & & \vdots \\ S'_{1,i-1} & S'_{2,i-1} & \cdots & S_{i-1,i-1} \end{bmatrix}$.

Also, let us denote the $p_i$ characteristic roots of (5.1.3), ordered from the smallest to the largest, by $[c^{(1)}_i, c^{(2)}_i, \ldots, c^{(p_i)}_i]$. It then follows directly that
(5.1.5)  $nS_{ii} = [T_{i1}\ \cdots\ T_{ii}][T_{i1}\ \cdots\ T_{ii}]'$, $\quad nS^{(i)} = [T_{i1}\ \cdots\ T_{i,i-1}]\begin{bmatrix} T_{11} & 0 & \cdots & 0 \\ T_{21} & T_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ T_{i-1,1} & T_{i-1,2} & \cdots & T_{i-1,i-1} \end{bmatrix}'$, and $\quad nS_{i-1} = \begin{bmatrix} T_{11} & 0 & \cdots & 0 \\ T_{21} & T_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ T_{i-1,1} & T_{i-1,2} & \cdots & T_{i-1,i-1} \end{bmatrix}\begin{bmatrix} T_{11} & 0 & \cdots & 0 \\ T_{21} & T_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ T_{i-1,1} & T_{i-1,2} & \cdots & T_{i-1,i-1} \end{bmatrix}'$.

Substituting the expressions (5.1.5) into (5.1.3) [or (2.8.2)], we see that (5.1.3) reduces to

(5.1.6)  $\Big[\sum_{j=1}^{i} T_{ij} T'_{ij}\Big]^{-1} \Big[\sum_{j=1}^{i-1} T_{ij} T'_{ij}\Big]$.

Now, since by (5.1.2) the sets $[T_{i1}\ \cdots\ T_{ii}]$ are independently distributed under the null hypothesis, for $i = k, k-1, \ldots, 1$, it follows that the sets of characteristic roots of (5.1.3), viz., $[c^{(1)}_i, \ldots, c^{(p_i)}_i]$ ($i = k, \ldots, 2$), are independently distributed.
5.2 The proposed test of independence.

We propose the following test:

(5.2.1)  Accept $H_0$ over $\Big[\bigcap_{i=2}^{k}\, c^{(p_i)}_i \le \lambda\Big]$, and reject $H_0$ over $\Big[\bigcup_{i=2}^{k}\, c^{(p_i)}_i > \lambda\Big]$, where $\lambda$ is given by

(5.2.2)  $P\Big[\bigcap_{i=2}^{k}\, c^{(p_i)}_i \le \lambda \,\Big|\, \gamma^{(p_i)}_i = 0\Big] = 1 - \alpha$,

where the $\Sigma_{ij}$'s are obtained from $\Sigma$ in exactly the same way as the $S_{ij}$'s from S in equations (5.1.4), and $\gamma^{(p_i)}_i$ denotes the largest characteristic root of $\Sigma^{-1}_{ii}\,\Sigma^{(i)}\,\Sigma^{-1}_{i-1}\,\Sigma^{(i)\prime}$. It will be noted that $\gamma^{(p_i)}_i$ is zero if and only if, for given i, $\Sigma_{ij} = 0$ for $j = 1, 2, \ldots, i-1$.
Analogous to section 3.2, we take the same value of $\lambda$ for each factor on the left-hand side of (5.2.2); the reason is the same as that given in section 3.2. The procedure for obtaining $\lambda$ is analogous to that given for $\mu$ in section 3.2, except that the incomplete Beta function needs to be replaced by the central distribution function of the (square of the) largest canonical correlation coefficient. The distribution and recursion relations for particular values are discussed explicitly in reference [2].
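A sketch of the quantities entering (5.2.1), namely the largest root of (5.1.3) for each set, computed directly from the partitioned S (the `sizes` argument listing $p_1, \ldots, p_k$ and the NumPy calls are illustrative assumptions):

```python
# Largest characteristic roots c_i^{(p_i)} of (5.1.3), i = 2, ..., k.
import numpy as np

def step_down_largest_roots(S, sizes):
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    roots = []
    for i in range(1, len(sizes)):
        a, b = offsets[i], offsets[i + 1]
        S_ii = S[a:b, a:b]                  # p_i x p_i diagonal block
        S_i = S[a:b, :a]                    # S^(i) = [S'_1i ... S'_{i-1,i}]
        S_prev = S[:a, :a]                  # S_{i-1}
        M = np.linalg.solve(S_ii, S_i) @ np.linalg.solve(S_prev, S_i.T)
        roots.append(float(np.linalg.eigvals(M).real.max()))
    return roots   # accept H0 in (5.2.1) iff max(roots) <= lambda
```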
5.3 Relation to the Likelihood-Ratio Test.

Denoting the j-th canonical correlation coefficient of the $(p_i)$-set on the $(p_1 + p_2 + \cdots + p_{i-1})$-set by $r^{(j)}_{i\cdot1,2,\ldots,i-1}$, we see that

(5.3.1)  $r^{(j)2}_{i\cdot1,2,\ldots,i-1} = c^{(j)}\big(R^{-1}_{ii}\, R^{(i)}\, R^{-1}_{i-1}\, R^{(i)\prime}\big) = c^{(j)}_i$,

where $c^{(j)}$ denotes the j-th characteristic root, and the R's are the sample correlation matrices corresponding to the covariance matrices, S.
Thus,

(5.3.2)  $\prod_{j=1}^{p_i} \big(1 - r^{(j)2}_{i\cdot1,2,\ldots,i-1}\big) = \dfrac{|R_{(i)}|}{|R_{(i-1)}|\,|R_{ii}|}$,

where $R_{(i)}$ denotes the sample correlation matrix of the first $(p_1 + p_2 + \cdots + p_i)$ variates, and the product of the products of all step-down canonical correlation coefficients (or rather, of the complements of their squares) becomes

(5.3.3)  $\prod_{i=2}^{k} \prod_{j=1}^{p_i} \big(1 - r^{(j)2}_{i\cdot1,2,\ldots,i-1}\big) = \dfrac{|R|}{|R_{11}|\,|R_{22}| \cdots |R_{kk}|}$,

because $R_{(k)} = R$ and $R_{(1)} = R_{11}$.
Thus, a comparison with reference [6] shows that a test based on the product of products of all step-down correlations is closely related to the likelihood-ratio test. The distribution of this statistic, under $H_0$, is discussed in [7], and the moments are given in [6].
They can be readily obtained if, in the joint distribution of all $r_{ij}$'s for a general population correlation matrix R, say,

$P(\hat{R} \mid R) = \text{const} \cdot |R|^{-n/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \big[\operatorname{tr} R^{-1} D_\beta \hat{R} D_\beta\big]^{-np/2}\, d\beta_1 \cdots d\beta_{p-1} \prod_{i>j=1}^{p} dr_{ij}$

(where $D_\beta$ is a diagonal matrix with elements $e^{\beta_1/2}, e^{(\beta_1 - \beta_2)/2}, \ldots$), we set

$\hat{R} = \begin{bmatrix} \hat{R}_{11} & 0 & \cdots & 0 \\ 0 & \hat{R}_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \hat{R}_{kk} \end{bmatrix}$.
The moments satisfy the convenient recurrence relations:

(5.3.4)  $\mu'_1 = \dfrac{\prod_{i=1}^{p} (n - i + 1)}{\prod_{j=1}^{k} \prod_{i=1}^{p_j} (n - i + 1)}$, and

(5.3.5)  $\mu'_{a+1} = \mu'_a\, \dfrac{\prod_{i=1}^{p} (n + 2a - i + 1)}{\prod_{j=1}^{k} \prod_{i=1}^{p_j} (n + 2a - i + 1)}$.
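A sketch of these recurrences (the Python phrasing is ours; `sizes` lists $p_1, \ldots, p_k$):

```python
# Moments mu'_1, mu'_2, ... of |R| / (|R_11| ... |R_kk|) under H0,
# via the recurrences (5.3.4)-(5.3.5).
from math import prod

def wilks_moments(sizes, n, num_moments=4):
    p = sum(sizes)
    moments, mu = [], 1.0
    for a in range(num_moments):
        num = prod(n + 2 * a - i + 1 for i in range(1, p + 1))
        den = prod(n + 2 * a - i + 1
                   for p_j in sizes for i in range(1, p_j + 1))
        mu *= num / den
        moments.append(mu)
    return moments

print(wilks_moments([2, 3], n=40))
```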
Thus, for testing the hypothesis of independence between k sets of normally distributed variates, the determinant test is quite useful and closely related to the step-down procedure. At the moment, however, we cannot easily construct simultaneous confidence bounds on parametric functions on the basis of the determinant test, whereas the step-down procedure can be inverted into a simultaneous confidence statement.
6. Confidence Bounds Associated with the Test of Independence for the $(p_1 + p_2 + \cdots + p_k)$-variate Problem.
Using (5.1.4), let us rewrite $\beta_{i\cdot1,2,\ldots,i-1}$ in (2.7) and $B_{i\cdot1,2,\ldots,i-1}$ in (2.8) as

(6.1)  $\beta_i\,(p_i \times (p_1 + p_2 + \cdots + p_{i-1})) = \Sigma^{(i)}\,\Sigma^{-1}_{i-1}$  and  $B_i\,(p_i \times (p_1 + p_2 + \cdots + p_{i-1})) = S^{(i)}\,S^{-1}_{i-1}$.

Next we partition $\beta_i$ into

(6.2)  $\beta_i = [\beta_{i1}\ \beta_{i2}\ \cdots\ \beta_{i,i-1}]$  and  $B_i$ into  $B_i = [B_{i1}\ B_{i2}\ \cdots\ B_{i,i-1}]$,

with pseudo-columns of $p_1, p_2, \ldots, p_{i-1}$ elements, respectively.
Assuming now a general (symmetric, p.d.) $\Sigma$, let us transform the original variates $\underline{x}_1(p_1 \times 1), \underline{x}_2(p_2 \times 1), \ldots, \underline{x}_k(p_k \times 1)$ into a new set of variates $\underline{x}^*_1, \underline{x}^*_2, \ldots, \underline{x}^*_k$ defined by

(6.3)  $\begin{bmatrix} \underline{x}^*_1 \\ \underline{x}^*_2 \\ \vdots \\ \underline{x}^*_k \end{bmatrix} = \begin{bmatrix} I(p_1) & 0 & 0 & \cdots & 0 \\ -\beta_{21} & I(p_2) & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ -\beta_{k1} & -\beta_{k2} & \cdots & -\beta_{k,k-1} & I(p_k) \end{bmatrix}\begin{bmatrix} \underline{x}_1 \\ \underline{x}_2 \\ \vdots \\ \underline{x}_k \end{bmatrix}$
Then it can be verified that the k sets of new (starred) variates are uncorrelated, and hence the step-down sets of (squares of) canonical correlations, $[c^{*(1)}_2, \ldots, c^{*(p_2)}_2]$, $[c^{*(1)}_3, \ldots, c^{*(p_3)}_3]$, $\ldots$, $[c^{*(1)}_k, \ldots, c^{*(p_k)}_k]$, are independently distributed. With a joint probability, $1 - \alpha$, say, we may thus make the simultaneous statement

(6.4)  $c^{*(p_i)}_i \le \lambda$  (for $i = k, k-1, \ldots, 2$).

Analogous to section 4, with the modifications given in 5.2, we can find $\lambda$ if $\alpha$ is preassigned. By the same argument as in section 4, and by using (4.3), we can obtain, with a joint confidence coefficient $\ge 1 - \alpha$, the following sets of simultaneous confidence bounds:
(6.5)  $c^{1/2}_{\max}(B_iB'_i) - \lambda\, c^{1/2}_{\max}\big[S_{ii} - S^{(i)} S^{-1}_{i-1} S^{(i)\prime}\big]\, c^{1/2}_{\max}(S^{-1}_{i-1}) \le c^{1/2}_{\max}(\beta_i\beta'_i) \le c^{1/2}_{\max}(B_iB'_i) + \lambda\, c^{1/2}_{\max}\big[S_{ii} - S^{(i)} S^{-1}_{i-1} S^{(i)\prime}\big]\, c^{1/2}_{\max}(S^{-1}_{i-1})$,

where $\beta_i$ and $B_i$ are defined by (6.1) and (6.2), $S^{(i)}$ and $S_{i-1}$ by (5.1.4), and $\Sigma^{(i)}$ and $\Sigma_{i-1}$ analogously.
Following the argument presented in section 3 of reference [4] and in section 4 of this paper, we see that, with a joint confidence coefficient $\ge 1 - \alpha$, not only can we make the $(k-1)$ statements (6.5) but, for each typical statement under (6.5), we can also make a number of truncated confidence statements by deleting any number of variates of the $(p_i)$-set and any number of variates of the $(p_1 + p_2 + \cdots + p_{i-1})$-set, taking care only that the number of variates left in the $(p_1 + \cdots + p_{i-1})$-set is not less than that left in the $(p_i)$-set.
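A sketch of the untruncated bounds (6.5), reusing the step-down blocks of (5.1.4), with $\lambda$ obtained as in section 5.2 (NumPy phrasing ours):

```python
# Simultaneous bounds (6.5) on c_max^{1/2}(beta_i beta_i'), i = k, ..., 2.
import numpy as np

def matrix_step_down_bounds(S, sizes, lam):
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    bounds = {}
    for i in range(1, len(sizes)):
        a, b = offsets[i], offsets[i + 1]
        S_ii, S_i, S_prev = S[a:b, a:b], S[a:b, :a], S[:a, :a]
        B_i = np.linalg.solve(S_prev, S_i.T).T       # B_i = S^(i) S_{i-1}^{-1}
        centre = np.sqrt(np.linalg.eigvalsh(B_i @ B_i.T).max())
        S_res = S_ii - B_i @ S_i.T                   # S_ii - S^(i) S_{i-1}^{-1} S^(i)'
        half = lam * np.sqrt(np.linalg.eigvalsh(S_res).max()
                             / np.linalg.eigvalsh(S_prev).min())
        bounds[i + 1] = (max(centre - half, 0.0), centre + half)
    return bounds
```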
References

[1] I. Olkin and S. N. Roy, "On multivariate distribution theory," Ann. Math. Stat., vol. 25 (1954).

[2] S. N. Roy, "A report on some aspects of multivariate analysis," Institute of Statistics, University of North Carolina, Mimeograph Series No. 121.

[3] S. N. Roy, "Some further results in simultaneous confidence interval estimation," Ann. Math. Stat., vol. 25 (1954).

[4] S. N. Roy and R. Gnanadesikan, "Further contributions to multivariate confidence bounds," Institute of Statistics, University of North Carolina, Mimeograph Series No. 155.

[5] S. S. Wilks, "Certain generalizations in the analysis of variance," Biometrika, vol. 24 (1932).

[6] S. S. Wilks, "On the independence of k sets of normally distributed statistical variables," Econometrica, vol. 3 (1935).

[7] A. Wald and R. J. Brookner, "On the distribution of Wilks' statistic for testing the independence of several groups of variates," Ann. Math. Stat., vol. 12 (1941).

[8] E. S. Pearson and H. O. Hartley, "Biometrika Tables for Statisticians," vol. I, Cambridge University Press (1954).