ON THE HYPOTHESIS OF "NO INTERACTION" IN
THREE-DIMENSIONAL CONTINGENCY TABLES
by
V. P. BHAPKAR
University of North Carolina
University of Poona
and
GARY G. KOCH
University of North Carolina
Institute of Statistics Mimeo Series No. 440
August 1965
The hypothesis of "no interaction" in three-dimensional
contingency tables is considered from the point of view of
the associated underlying model. A general and computationally
simple class of test procedures based on Wald's criterion (1943)
is discussed and illustrated.
The paper is expository in the sense that many of the results given appear elsewhere in the literature. New results
are included, wherever necessary, for the sake of completeness.
This research was supported by the National Institutes of Health,
Institute of General Medical Sciences, Grant No. GM-12868-01.
DEPARTMENT OF STATISTICS
UNIVERSITY OF NORTH CAROLINA
Chapel Hill, N. C.
ON THE HYPOTHESIS OF "NO INTERACTION" IN THREE-DIMENSIONAL
CONTINGENCY TABLES
1. Introduction
In recent years, much attention has been given to the hypothesis of
"no interaction" in three-dimensional contingency tables. Often, however,
there has been ambiguity in
(i) the specification of the underlying model,
(ii) the statement (in terms of this model) of the hypothesis to be tested,
(iii) the interpretation (with respect to the model) of the results of the test.
In this expository paper, the authors want to clarify the distinction between
the different experimental situations which give rise to data in the form of
a three-dimensional contingency table and to discuss the "no interaction"
hypothesis appropriate to each.
In any experiment, we collect data of two particular types from each
experimental unit (or subject). These are
(i) a description of the sub-population of units to which he belongs
(or of the experimental conditions which he undergoes),
(ii) a description of what happens to him during and/or after the experiment.
In the case of contingency tables, we shall use the term "factor"
to denote a "way of classification" describing some aspect of the sub-population
of units to which a subject belongs (e.g., the treatment combination
assigned to him; the block to which he belongs); and the term "response", to
denote a "way of classification" describing some aspect of what happens to
him during and/or after the experiment (e.g., lives or dies; passes or fails;
is of low grade, average grade, high grade).
Hence, the data obtained from
each experimental unit are simply a description of
(i) the sub-population to which he belongs in terms of a combination
of factor categories,
(ii) what happens to him during and/or after the experiment in terms
of a combination of response categories.
The data in any multi-dimensional contingency table then are simply the
frequencies with which subjects belonging to the same combination of factor
categories (i.e., the same sub-population) gave the same combination of response
categories (i.e., did the same thing).
Thus, one sees that the dimensions of
such a table are precisely the factors and responses according to which each
unit is classified (e.g., treatment, block, lives or dies) with the levels
being the associated sets of categories.
There are three principal types of
three-dimensional contingency tables of experimental interest:
(i) the "three response, no factor" tables,
(ii) the "two response, one factor" tables,
(iii) the "one response, two factor" tables,
where the above phrasing specifies the nature of the dimensions of the table.
At this point, we note that marginal frequencies depending on factor combinations only (i.e., all response categories have been summed over) are fixed
numbers known prior to the actual performance of the experiment (e.g., the
number of subjects in each block; the number of subjects receiving any particular treatment combination); all other frequencies are random variables (except,
of course, the total sample size).
Let us now consider the "no interaction"
hypothesis associated with each of the above three types of experimental
situations.
With the "three response, no factor" tables, the experimenter is
interested in the relationships among the different responses. The questions
which concern him are analogous to the problems of independence and correlation
in normal multivariate analysis. For example, are the three responses independent?
Are two of the responses taken together independent of the third
one? Is there any partial association between two of the responses within
given levels of the third? Are the three responses pairwise independent? With
this in mind, we see that the hypothesis of "no interaction" associated with
this kind of experiment has nothing to do with the way in which factors
combine to determine a response (because we are in a no factor situation) as
is the case in analysis of variance; in actuality, it states that some particular
measure of association (for which there may be some choice) between any
two of the responses is constant over the levels (categories) of the third
(see Simpson (1951)). This statement about the association among the three
responses is called the hypothesis of "no interaction" mainly because it is
a "bridge" between certain weak hypotheses and certain strong hypotheses; for
example, the conditions of "no interaction" among responses and of pairwise
independence of responses are together equivalent to the condition of complete
independence of the three responses; also the conditions of "no interaction"
among responses and of marginal independence of two of the responses with the
third are together equivalent to the condition of joint independence of those
two responses with the third. For a discussion of the above, the reader is
referred to Roy and Kastenbaum (1956) and Lewis (1962).
If the experiment is of the "two response, one factor" type, then
we are interested not only in the association between the responses but also
in the effect of the factor on them. The questions we pose here are analogous
to the problems arising in normal multivariate analysis. For example,
does the factor affect the joint distribution of the responses? Does it
affect the marginal distributions of the responses? Or, are the two responses
independent at each level of the factor? Again the hypothesis of "no interaction"
is expressed in terms of responses; it specifies that some particular
measure of association (for which there may be some choice) is constant over
the levels (categories) of the factor. One particular approach to this type
of formulation is given by Goodman (1964).
The "one response, two factor" tables are of the greatest interest to
us in this paper. Here, the experimenter is interested in the way in which
the factors combine to determine the response. The problems of the statistician
here are analogous to those arising in univariate normal analysis of variance.
For example, does one of the factors affect (the distribution of) the response?
Does either factor affect (the distribution of) the response? Is there any
interaction between the factors in the way they affect (the distribution of)
the response? The "no interaction" hypothesis for this experimental situation
has a number of forms depending on what function(s) of the distribution of
the response is of interest to us (for example, cell probabilities, logits,
probits, relative probabilities, or essentially any function formed only from
the response data within the same factor combination and for each factor
combination). With any of these functions, two hypotheses of interest would be
(i) the factors affect (the distribution of) the function of the
response additively,
(ii) the factors affect (the distribution of) the function of the response
multiplicatively.
These formulations are analogous to the hypothesis of no interaction in
analysis of variance as they
(i) are concerned with the interaction of factors (as opposed to the
association between responses),
(ii) specify simple descriptions of the way in which the factors combine
to determine (the distribution of) some function of the response.
The above considerations will be discussed in greater detail in the next
section of the paper.
Finally, one should note that in many experimental situations, it is
more or less obvious whether a particular dimension is a "response" or a
"factor", so that the problem belongs to one of the three situations discussed
in a unique manner. On the other hand, in some instances a particular dimension
may be viewed either as a "response" or a "factor", in which case the problem
can be approached from different points of view.
2. Formulations of the Hypothesis of "No Interaction"
2.1 The "Three Response, No Factor" Model
Let $p_{ijk}$ (where we assume $p_{ijk} > 0$) denote the probability that an
experimental unit belongs to the i-th category of the first response, the j-th
category of the second response, and the k-th category of the third response
in an r x s x t contingency table; thus, the p's are subject to the constraint
$\sum_{i,j,k} p_{ijk} = 1$ only.
Recall now that the term "no interaction" here means that
a particular measure of association between any two of the responses is the
same over all levels (categories) of the third. Bartlett (1935) suggested
$$\psi_k = \frac{p_{11k}\,p_{22k}}{p_{12k}\,p_{21k}}, \qquad k = 1, 2,$$
as a measure of association in a 2 x 2 table with the third response at
level k. The hypothesis of "no interaction" for this case is then
$$\frac{p_{111}\,p_{221}}{p_{121}\,p_{211}} = \frac{p_{112}\,p_{222}}{p_{122}\,p_{212}}.$$
The above formulation was generalized to r x s x t tables by Roy and
Kastenbaum (1956), who considered the hypothesis specified by the constraints
$$\frac{p_{ijk}\,p_{rsk}}{p_{rjk}\,p_{isk}} = \frac{p_{ijt}\,p_{rst}}{p_{rjt}\,p_{ist}}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.1}$$
which may be equivalently re-written in terms of the constraints
$$\frac{p_{ijk}\,p_{rsk}}{p_{rjk}\,p_{isk}} - \frac{p_{ijt}\,p_{rst}}{p_{rjt}\,p_{ist}} = 0, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.1.a}$$
or the "freedom equations"
$$\frac{p_{ijk}\,p_{rsk}}{p_{rjk}\,p_{isk}} = \psi_{ij}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1, \quad \text{for } k = 1, 2, \dots, t. \tag{2.1.b}$$
They offered a test of (2.1) which was in terms of restricted maximum likelihood estimators of the parameters and which was identical with Bartlett's test
when r = s = t = 2. However, the application of this test requires the
solution of (r-1)(s-1)(t-1) simultaneous fourth degree (or third degree when
min(r, s, t) = 2) equations in as many unknowns; methods of obtaining the solution to this system of equations have been given by Kastenbaum and Lamphiear
(1959) and Darroch (1962).
A simpler class of test procedures is based upon Wald's criterion (1943).
These involve the unrestricted maximum likelihood estimators of the $p_{ijk}$
(i.e., the $q_{ijk} = n_{ijk}/N$, where $n_{ijk}$ is the frequency of the response
combination (i, j, k) and $N = \sum_{i,j,k} n_{ijk}$ is the total sample size). One example of
a Wald test is the test due to Plackett (1962), following the approach of
Woolf (1955). He considered the natural logarithms of the "freedom equations"
in (2.1.b) and constructed a test by the method discussed in Section 3. A
modification of Plackett's method which is simpler to apply is given by
Goodman (1963b). For a more complete discussion of the above, the reader is
referred to Section 3 and the Appendix; numerical illustrations are given in
Section 4.
Finally, tests of the hypothesis of "no interaction" for the "three
response, no factor" situation can be obtained by adopting the method of
conditional distributions (in which one or more of the responses are viewed
as factors) and using tests appropriate to the "two response, one factor"
situation (see Goodman (1964)) or the "one response, two factor" situation
(see Goodman (1963a)).
The latter of these is subject to some questioning on
philosophical grounds in view of the interpretation of the term "interaction"
given in the Introduction.
2.2 The "Two Response, One Factor" Model
In this situation, let $p_{ijk}$ (where we assume $p_{ijk} > 0$) denote the
probability that an experimental unit from the k-th category of the factor
belongs to the i-th category of the first response and the j-th category of
the second response in an r x s x t contingency table; thus, the p's are
subject to the t constraints
$$\sum_{i,j} p_{ijk} = 1, \qquad k = 1, 2, \dots, t.$$
One possible formulation of the "no interaction" hypothesis is given by (2.1)
and its equivalent forms where now the $p_{ijk}$ appearing there also satisfy the
above t constraints. Goodman (1964) has offered a number of methods based
upon Wald's criterion to test this hypothesis.
An alternative formulation of the "no interaction" hypothesis appropriate
to this model is in terms of another measure of association between the
responses. Consider a 2 x 2 x 2 table; if we let
$$\phi_k = \frac{p_{11k}}{p_{10k}\,p_{01k}}, \qquad k = 1, 2,$$
where $p_{10k} = \sum_j p_{1jk}$ and $p_{01k} = \sum_i p_{i1k}$, (or the natural logarithm of $\phi_k$) be
a measure of association in a 2 x 2 table with the factor at level k, then
the hypothesis of "no interaction" with respect to this measure is
$$\frac{p_{111}}{p_{101}\,p_{011}} = \frac{p_{112}}{p_{102}\,p_{012}}.$$
Generalizing the above formulation to an r x s x t table, we are led to the
hypothesis specified by the constraints
$$\frac{p_{ijk}}{p_{i0k}\,p_{0jk}} = \frac{p_{ijt}}{p_{i0t}\,p_{0jt}}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1; \tag{2.2}$$
this may be equivalently re-written in terms of the constraints
$$\frac{p_{ijk}}{p_{i0k}\,p_{0jk}} - \frac{p_{ijt}}{p_{i0t}\,p_{0jt}} = 0, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.2.a}$$
or the "freedom equations"
$$\frac{p_{ijk}}{p_{i0k}\,p_{0jk}} = \phi_{ij}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1, \quad \text{for } k = 1, 2, \dots, t. \tag{2.2.b}$$
Here, one can note that if the logarithm of this ratio is taken as the measure of association,
the hypothesis of "no interaction" would be re-formulated in terms of the
natural logarithms of the constraints in (2.2). Statistical tests of (2.2) and
its equivalent forms (or the formulation in terms of logarithms) may be obtained
by applying Wald's procedure which is described in Section 3 and the Appendix.
Numerical illustrations are given in Section 4.
2.3 The "One Response, Two Factor" Model
We now let $p_{ijk}$ (where we assume $p_{ijk} > 0$) denote the probability that
an experimental unit from the j-th category of the first factor and the k-th
category of the second factor belongs to the i-th category of the response
in an r x s x t contingency table; thus, the p's are subject to the st
constraints
$$\sum_{i} p_{ijk} = 1, \qquad j = 1, 2, \dots, s;\ k = 1, 2, \dots, t.$$
In univariate normal analysis of variance, we are interested in whether the
mean response of subjects from the (j,k)-th factor combination can be written
as the sum of an overall mean effect, an effect due to the first factor, and
an effect due to the second factor. If this simple description of the way
in which the factors determine (the distribution of) the response is valid,
we say that there is no interaction between the two factors. With this in mind,
we would say that there is "no interaction" between the two factors of our
contingency table model if (the distribution of) the response (namely,
$p_{1jk}, p_{2jk}, \dots, p_{rjk}$ or some function thereof) for subjects from the (j,k)-th
factor combination can be described solely in terms of an "effect" depending
on neither factor, an "effect" depending on the first factor only, and an
"effect" depending on the second factor only. For example, an "additive"
formulation of this type of hypothesis would be
$$p_{ijk} = t_{i**} + t_{ij*} + t_{i*k}, \qquad i = 1, \dots, r-1 \quad \text{for } j = 1, \dots, s;\ k = 1, \dots, t, \tag{2.3}$$
where $\sum_j t_{ij*} = \sum_k t_{i*k} = 0$. In the above, $t_{ij*}$ is a quantity independent
of the second factor while $t_{i*k}$ is a quantity independent of the first
factor. The hypothesis (2.3) may be equivalently re-written in terms of the
constraints
$$p_{ijk} - p_{isk} - p_{ijt} + p_{ist} = 0, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.3.a}$$
or the "freedom equations"
$$p_{ijk} - p_{isk} = \delta_{ij}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1, \quad \text{for } k = 1, 2, \dots, t. \tag{2.3.b}$$
Statistical tests of (2.3) and its equivalent forms may be obtained by applying
Wald's procedure; for example, the test offered by Goodman (1963a) for
2 x 2 x t tables is a Wald test based upon the "freedom equation" formulation.
A "multiplicative" formulation of the hypothesis of "no interaction" for
this model would be
$$p_{ijk} = \tau_{i**}\,\tau_{ij*}\,\tau_{i*k}, \qquad i = 1, \dots, r-1 \quad \text{for } j = 1, \dots, s;\ k = 1, \dots, t, \tag{2.4}$$
which may be re-written in terms of the natural logarithms of the p's as
$$\log p_{ijk} = t_{i**} + t_{ij*} + t_{i*k}, \qquad i = 1, \dots, r-1 \quad \text{for } j = 1, \dots, s;\ k = 1, \dots, t, \tag{2.5}$$
where $\sum_j t_{ij*} = \sum_k t_{i*k} = 0$; now (2.4) is equivalent to the constraints
$$\frac{p_{ijk}}{p_{isk}} - \frac{p_{ijt}}{p_{ist}} = 0, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.4.a}$$
or the "freedom equations"
$$\frac{p_{ijk}}{p_{isk}} = \lambda_{ij}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1, \quad \text{for } k = 1, 2, \dots, t, \tag{2.4.b}$$
while (2.5) is equivalent to the constraints
$$\log p_{ijk} - \log p_{isk} - \log p_{ijt} + \log p_{ist} = 0, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1;\ k = 1, \dots, t-1, \tag{2.5.a}$$
or the "freedom equations"
$$\log p_{ijk} - \log p_{isk} = \Gamma_{ij}, \qquad i = 1, \dots, r-1;\ j = 1, \dots, s-1, \quad \text{for } k = 1, 2, \dots, t. \tag{2.5.b}$$
Statistical tests of (2.4) and (2.5) and their equivalent forms may be obtained
by applying Wald's procedure.
The formulation (2.4) is due to Roy and Bhapkar (1960); other possible
formulations of the "no interaction" hypothesis appropriate to this type of
experimental situation are considered by Goodman (1963a).
3. Test Criteria
3.1 Computational Procedures for Obtaining Wald's Criterion
There are a number of ways of obtaining Wald's test statistic for the
hypotheses of "no interaction" discussed in the previous section. Some of
these ways are equivalent in the sense that they lead to the same statistic;
some others, though not equivalent in the above sense, do lead to asymptotically
equivalent statistics.
There are essentially two approaches which lead to Wald's statistic.
They differ in the sense that
(a) one involves constraints;
(b) the other involves "freedom equations".
With either approach, the computational procedure involves forming
(i) the unrestricted maximum likelihood estimators of the left-hand side of
    (a) the constraints in terms of which the hypothesis is specified;
    for example, (2.1.a), (2.2.a), (2.3.a), (2.4.a), (2.5.a);
    (b) the "freedom equations" in terms of which the hypothesis is
    specified; for example, (2.1.b), (2.2.b), (2.3.b), (2.4.b), (2.5.b);
i.e., the corresponding expressions in terms of the $q_{ijk}$ instead of the $p_{ijk}$,
where $q_{ijk}$ is given by
    1. $n_{ijk}/N$ in the "three response, no factor" model,
    2. $n_{ijk}/n_{00k}$ in the "two response, one factor" model,
    3. $n_{ijk}/n_{0jk}$ in the "one response, two factor" model,
with $n_{ijk}$ being the number of experimental units in cell (i, j, k) and
$n_{0jk} = \sum_i n_{ijk}$, $n_{00k} = \sum_{i,j} n_{ijk}$, $N = \sum_{i,j,k} n_{ijk}$;
(ii) the consistent estimate of the asymptotic variance-covariance
matrix of the appropriate set of estimators (i.e., those of (a) or those of (b))
obtained by replacing p's by q's in the variance-covariance matrix of the linear
parts of their corresponding Taylor expansions.
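The three estimators in step (i) differ only in their denominators. The following is a minimal Python sketch of that step, assuming the counts are held in a nested list `n[i][j][k]`; the function name `estimators` and the model labels are illustrative and do not come from the paper. The counts used in the example are those of Table 1.

```python
def estimators(n, model):
    """Return q[i][j][k], the unrestricted ML estimates of p[i][j][k].

    n is an r x s x t nested list of counts; model names the sampling scheme:
      "three_response"          -> q = n_ijk / N
      "two_response_one_factor" -> q = n_ijk / n_00k  (k indexes the factor)
      "one_response_two_factor" -> q = n_ijk / n_0jk  (j, k index the factors)
    """
    r, s, t = len(n), len(n[0]), len(n[0][0])
    total = sum(n[i][j][k] for i in range(r) for j in range(s) for k in range(t))
    layer = [sum(n[i][j][k] for i in range(r) for j in range(s)) for k in range(t)]
    cell = [[sum(n[i][j][k] for i in range(r)) for k in range(t)] for j in range(s)]

    def denom(j, k):
        if model == "three_response":
            return total
        if model == "two_response_one_factor":
            return layer[k]
        if model == "one_response_two_factor":
            return cell[j][k]
        raise ValueError("unknown model: " + model)

    return [[[n[i][j][k] / denom(j, k) for k in range(t)]
             for j in range(s)] for i in range(r)]


# Counts of Table 1, indexed as n[i][j][k]:
# i = alive/dead, j = at once/in spring, k = long/short.
bartlett = [[[156, 107], [84, 31]],
            [[84, 133], [156, 209]]]
q = estimators(bartlett, "one_response_two_factor")
print(round(q[0][0][0], 3))  # q_111 = 156/240
```

Only the choice of denominator encodes which margins are fixed by the design; the rest of the procedure is the same for all three models.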
Now the estimates of the left-hand sides of
(a) the constraints in terms of which the hypothesis is specified,
(b) the "freedom equations" in terms of which the hypothesis is specified
have asymptotically a multivariate normal distribution; hence the familiar chi-square statistic with (r-1)(s-1)(t-1) degrees of freedom may be used to test that
(a) the vector of (r-1)(s-1)(t-1) components which are the estimators
of the left-hand sides of the constraints in terms of which the
hypothesis is specified has zero mean vector;
(b) the t vectors of (r-1)(s-1) components which are the estimators
of the left-hand sides of the "freedom equations" in terms of
which the hypothesis is specified all have the same mean vector, namely, the corresponding right-hand sides of the "freedom equations".
Computationally, as has been observed by Plackett (1962) and Goodman (1963b),
(a) the constraint approach involves the inversion of one matrix of
side (r-1)(s-1)(t-1);
(b) the "freedom equation" approach involves the inversion of (t+1)
matrices of side (r-1)(s-1).
The constraint approach and the "freedom equation" approach lead to the same
test statistic if the constraints in terms of which the hypothesis is
specified are contrasts; for example, (2.1.a), (2.2.a), (2.3.a), (2.4.a),
(2.5.a) correspond in the above sense to (2.1.b), (2.2.b), (2.3.b), (2.4.b),
(2.5.b) respectively. Before concluding this sub-section, we want to call
to the reader's attention that if the hypothesis is specified in terms of
natural logarithms (for example, Plackett's formulation of (2.1)), then Goodman
(1963b) has offered a simpler computational procedure for the "freedom
equation" approach.
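3.2 Test Criteria for Some Special Cases
I. (2 x 2 x t) Tables from the "Three Response, No Factor" Situation:
1. Hypothesis:
$$\frac{p_{11k}\,p_{22k}}{p_{12k}\,p_{21k}} = \psi_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$h_k = \frac{q_{11k}\,q_{22k}}{q_{12k}\,q_{21k}} = \frac{n_{11k}\,n_{22k}}{n_{12k}\,n_{21k}}, \qquad v_{h_k} = h_k^2 \left( \frac{1}{n_{11k}} + \frac{1}{n_{12k}} + \frac{1}{n_{21k}} + \frac{1}{n_{22k}} \right), \qquad k = 1, 2, \dots, t.$$
The test statistic $X_h^2$ is given by
$$X_h^2 = \sum_{k=1}^{t} (h_k - h)^2\, v_{h_k}^{-1} = \sum_{k=1}^{t} h_k^2\, v_{h_k}^{-1} - \Big( \sum_{k=1}^{t} h_k\, v_{h_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{h_k}^{-1} \Big),$$
where $h = \big( \sum_{k=1}^{t} h_k\, v_{h_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{h_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_h^2 = (h_1 - h_2)^2 / (v_{h_1} + v_{h_2})$ and d.f. = 1.
2. Hypothesis:
$$\log p_{11k} - \log p_{12k} - \log p_{21k} + \log p_{22k} = \log \psi_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$g_k = \log q_{11k} - \log q_{12k} - \log q_{21k} + \log q_{22k} = \log n_{11k} - \log n_{12k} - \log n_{21k} + \log n_{22k},$$
$$v_{g_k} = \frac{1}{n_{11k}} + \frac{1}{n_{12k}} + \frac{1}{n_{21k}} + \frac{1}{n_{22k}}, \qquad k = 1, 2, \dots, t.$$
The test statistic $X_g^2$ is given by
$$X_g^2 = \sum_{k=1}^{t} (g_k - g)^2\, v_{g_k}^{-1} = \sum_{k=1}^{t} g_k^2\, v_{g_k}^{-1} - \Big( \sum_{k=1}^{t} g_k\, v_{g_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{g_k}^{-1} \Big),$$
where $g = \big( \sum_{k=1}^{t} g_k\, v_{g_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{g_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_g^2 = (g_1 - g_2)^2 / (v_{g_1} + v_{g_2})$ and d.f. = 1.
In the results given above, $X_h^2$ is a statistic due to Goodman (1963a),
(1963b), (1964); and $X_g^2$ is the statistic of Woolf (1955) and Plackett (1962).
3. Assume t = 2.
Hypothesis:
$$\frac{p_{111}\,p_{221}\,p_{122}\,p_{212}}{p_{121}\,p_{211}\,p_{112}\,p_{222}} = 1.$$
Let
$$b = \frac{n_{111}\,n_{221}\,n_{122}\,n_{212}}{n_{121}\,n_{211}\,n_{112}\,n_{222}}, \qquad v_b = b^2 \sum_{i,j,k} \frac{1}{n_{ijk}}.$$
The test statistic $X_b^2$ is given by
$$X_b^2 = (b - 1)^2 / v_b = (b - 1)^2 \Big/ \Big( b^2 \sum_{i,j,k} \frac{1}{n_{ijk}} \Big) \quad \text{and d.f.} = 1.$$
Note that although the hypothesis formulated here is equivalent to (2.1),
the test statistic $X_b^2$ associated with it is different from the test
statistic $X_h^2$ associated with (2.1.a), (2.1.b).
II. (2 x 2 x t) Tables from the "Two Response, One Factor" Situation:
1. Hypothesis:
$$\frac{p_{11k}\,p_{22k}}{p_{12k}\,p_{21k}} = \psi_{11}, \qquad k = 1, 2, \dots, t.$$
Test statistic is $X_h^2$ as given in I.1.
2. Hypothesis:
$$\log \left( \frac{p_{11k}\,p_{22k}}{p_{12k}\,p_{21k}} \right) = \log \psi_{11}, \qquad k = 1, 2, \dots, t.$$
Test statistic is $X_g^2$ as given in I.2.
3. Hypothesis:
$$\frac{p_{11k}}{p_{10k}\,p_{01k}} = \phi_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$f_k = \frac{q_{11k}}{q_{10k}\,q_{01k}} = \frac{n_{11k}\,n_{00k}}{n_{10k}\,n_{01k}}, \qquad v_{f_k} = f_k^2 \left( \frac{1}{n_{11k}} - \frac{1}{n_{10k}} - \frac{1}{n_{01k}} - \frac{1}{n_{00k}} + \frac{2\,n_{11k}}{n_{01k}\,n_{10k}} \right), \qquad k = 1, 2, \dots, t.$$
The test statistic $X_f^2$ is given by
$$X_f^2 = \sum_{k=1}^{t} (f_k - f)^2\, v_{f_k}^{-1} = \sum_{k=1}^{t} f_k^2\, v_{f_k}^{-1} - \Big( \sum_{k=1}^{t} f_k\, v_{f_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{f_k}^{-1} \Big),$$
where $f = \big( \sum_{k=1}^{t} f_k\, v_{f_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{f_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_f^2 = (f_1 - f_2)^2 / (v_{f_1} + v_{f_2})$ and d.f. = 1.
4. Hypothesis:
$$\log p_{11k} - \log p_{10k} - \log p_{01k} = \log \phi_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$\epsilon_k = \log q_{11k} - \log q_{10k} - \log q_{01k}, \qquad v_{\epsilon_k} = \frac{1}{n_{11k}} - \frac{1}{n_{10k}} - \frac{1}{n_{01k}} - \frac{1}{n_{00k}} + \frac{2\,n_{11k}}{n_{01k}\,n_{10k}}, \qquad k = 1, 2, \dots, t.$$
The test statistic $X_\epsilon^2$ is given by
$$X_\epsilon^2 = \sum_{k=1}^{t} (\epsilon_k - \epsilon)^2\, v_{\epsilon_k}^{-1} = \sum_{k=1}^{t} \epsilon_k^2\, v_{\epsilon_k}^{-1} - \Big( \sum_{k=1}^{t} \epsilon_k\, v_{\epsilon_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{\epsilon_k}^{-1} \Big),$$
where $\epsilon = \big( \sum_{k=1}^{t} \epsilon_k\, v_{\epsilon_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{\epsilon_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_\epsilon^2 = (\epsilon_1 - \epsilon_2)^2 / (v_{\epsilon_1} + v_{\epsilon_2})$ and d.f. = 1.
III. (2 x 2 x t) Tables from the "One Response, Two Factor" Situation:
1. Hypothesis:
$$p_{11k} - p_{12k} = \delta_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$d_k = q_{11k} - q_{12k}, \qquad v_{d_k} = \frac{q_{11k}\,q_{21k}}{n_{01k}} + \frac{q_{12k}\,q_{22k}}{n_{02k}}, \qquad k = 1, 2, \dots, t.$$
The test statistic $X_d^2$ is given by
$$X_d^2 = \sum_{k=1}^{t} (d_k - d)^2\, v_{d_k}^{-1} = \sum_{k=1}^{t} d_k^2\, v_{d_k}^{-1} - \Big( \sum_{k=1}^{t} d_k\, v_{d_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{d_k}^{-1} \Big),$$
where $d = \big( \sum_{k=1}^{t} d_k\, v_{d_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{d_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_d^2 = (d_1 - d_2)^2 / (v_{d_1} + v_{d_2})$ and d.f. = 1.
The statistic $X_d^2$ given here is due to Goodman (1963a).
2. Hypothesis:
$$\frac{p_{11k}}{p_{12k}} = \lambda_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$\Delta_k = \frac{q_{11k}}{q_{12k}}, \qquad v_{\Delta_k} = \Delta_k^2 \left[ \left( \frac{1}{n_{11k}} + \frac{1}{n_{12k}} \right) - \left( \frac{1}{n_{01k}} + \frac{1}{n_{02k}} \right) \right], \qquad k = 1, 2, \dots, t.$$
The test statistic $X_\Delta^2$ is given by
$$X_\Delta^2 = \sum_{k=1}^{t} (\Delta_k - \Delta)^2\, v_{\Delta_k}^{-1} = \sum_{k=1}^{t} \Delta_k^2\, v_{\Delta_k}^{-1} - \Big( \sum_{k=1}^{t} \Delta_k\, v_{\Delta_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{\Delta_k}^{-1} \Big),$$
where $\Delta = \big( \sum_{k=1}^{t} \Delta_k\, v_{\Delta_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{\Delta_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_\Delta^2 = (\Delta_1 - \Delta_2)^2 / (v_{\Delta_1} + v_{\Delta_2})$ and d.f. = 1.
3. Hypothesis:
$$\log p_{11k} - \log p_{12k} = \Gamma_{11}, \qquad k = 1, 2, \dots, t.$$
Let
$$\gamma_k = \log q_{11k} - \log q_{12k} = \log \left( \frac{n_{11k}\,n_{02k}}{n_{01k}\,n_{12k}} \right), \qquad v_{\gamma_k} = \left( \frac{1}{n_{11k}} + \frac{1}{n_{12k}} \right) - \left( \frac{1}{n_{01k}} + \frac{1}{n_{02k}} \right), \qquad k = 1, 2, \dots, t.$$
The test statistic $X_\gamma^2$ is given by
$$X_\gamma^2 = \sum_{k=1}^{t} (\gamma_k - \gamma)^2\, v_{\gamma_k}^{-1} = \sum_{k=1}^{t} \gamma_k^2\, v_{\gamma_k}^{-1} - \Big( \sum_{k=1}^{t} \gamma_k\, v_{\gamma_k}^{-1} \Big)^2 \Big/ \Big( \sum_{k=1}^{t} v_{\gamma_k}^{-1} \Big),$$
where $\gamma = \big( \sum_{k=1}^{t} \gamma_k\, v_{\gamma_k}^{-1} \big) \big/ \big( \sum_{k=1}^{t} v_{\gamma_k}^{-1} \big)$; d.f. = t-1.
If t = 2, $X_\gamma^2 = (\gamma_1 - \gamma_2)^2 / (v_{\gamma_1} + v_{\gamma_2})$ and d.f. = 1.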
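Every statistic in this sub-section that compares t layers has the same shape: a weighted sum of squared deviations of the layer estimates from their inverse-variance weighted mean. A minimal Python sketch of that common computation follows; the function names are illustrative, and the three estimates and variances in the example are hypothetical numbers chosen only to exercise the code.

```python
def homogeneity_chi2(estimates, variances):
    """Wald statistic for 'all t estimates share one mean', with d.f. = t - 1.

    Computes sum_k (e_k - e_bar)^2 / v_k, where e_bar is the
    inverse-variance weighted mean of the e_k.
    """
    weights = [1.0 / v for v in variances]
    e_bar = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    chi2 = sum(w * (e - e_bar) ** 2 for w, e in zip(weights, estimates))
    return chi2, len(estimates) - 1


def homogeneity_chi2_shortcut(estimates, variances):
    """Same statistic via sum e^2/v - (sum e/v)^2 / (sum 1/v)."""
    weights = [1.0 / v for v in variances]
    s0 = sum(weights)
    s1 = sum(w * e for w, e in zip(weights, estimates))
    s2 = sum(w * e * e for w, e in zip(weights, estimates))
    return s2 - s1 * s1 / s0


# Hypothetical estimates and variances, for illustration only.
e = [1.2, 0.7, 1.9]
v = [0.30, 0.25, 0.40]
chi2, df = homogeneity_chi2(e, v)
print(round(chi2, 4), df)
```

The shortcut form avoids computing the weighted mean first, but it subtracts two quantities of similar size, so the deviation form is the safer choice when the estimates are large relative to their spread. For t = 2 both reduce to the squared difference divided by the sum of the two variances.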
4. Numerical Illustrations
4.1 Bartlett's Data
The following data (see Table 1) of Hoblyn and Palmer are the results
of an experiment designed to investigate the propagation of plum root stocks
from root cuttings.

Table 1: Bartlett's Data

                            Time of Planting
                  At Once                    In Spring
             Length of Cutting           Length of Cutting
           Long    Short    Total      Long    Short    Total
Alive       156      107      263        84       31      115
Dead         84      133      217       156      209      365
Total       240      240      480       240      240      480

These data are of interest because they were used by Bartlett (1935) and
Goodman (1964) to demonstrate their respective tests of a formulation of the
hypothesis of "no interaction" which is appropriate to either the "three
response, no factor" model or the "two response, one factor" model
(namely, (2.1)). However, in actuality, "time of planting" and "length of cutting" are
factors; and thus the experiment is really of the "one response, two factor"
type. We feel that the data should be analyzed by methods appropriate to this
model. In addition, we will demonstrate that the linear model (2.3) provides
an excellent "fit" for the data with the chi-square value (d.f. = 1) to test
(2.3) being 0.08 (as opposed to values of 1.85, 2.27 of Bartlett and 1.93,
2.26, 2.27 of Goodman to test (2.1)); and hence we conclude that the factors
affect (the distribution of) the response additively in the sense of (2.3).
Analysis of the data:
(i) Notation:
Let i denote the response: i = 1 if alive, i = 2 if dead.
Let j denote time of planting: j = 1 if at once, j = 2 if in spring.
Let k denote length of cutting: k = 1 if long, k = 2 if short.
(ii) Test of Hypothesis:
$$H: \quad p_{ijk} = t_{i**} + t_{ij*} + t_{i*k}, \qquad i = 1 \quad \text{for } j = 1, 2;\ k = 1, 2,$$
where $\sum_{j=1}^{2} t_{ij*} = \sum_{k=1}^{2} t_{i*k} = 0$.
The hypothesis H may be expressed in terms of the constraint
$$p_{111} - p_{121} - p_{112} + p_{122} = 0.$$
We have
$q_{111} = n_{111}/n_{011} = 156/240 = .650$
$q_{121} = n_{121}/n_{021} = 84/240 = .350$
$q_{112} = n_{112}/n_{012} = 107/240 = .446$
$q_{122} = n_{122}/n_{022} = 31/240 = .129$
$$v = \frac{q_{111}\,q_{211}}{n_{011}} + \frac{q_{121}\,q_{221}}{n_{021}} + \frac{q_{112}\,q_{212}}{n_{012}} + \frac{q_{122}\,q_{222}}{n_{022}} = 0.0034$$
$$X^2 = \frac{(q_{111} - q_{121} - q_{112} + q_{122})^2}{v} = \frac{0.0002890}{0.0033935} = 0.085, \qquad \text{d.f.} = 1.$$
Since $X^2$ is small, we conclude that the data are consistent with the
hypothesis H. In fact, we have the following estimates of the parameters
appearing in H:
$\hat{t}_{1**} = 0.393$, $\hat{t}_{11*} = 0.155$, $\hat{t}_{12*} = -0.155$, $\hat{t}_{1*1} = 0.107$, $\hat{t}_{1*2} = -0.107$,
and the following "fit" of the data to the hypothesis:
$q_{111} = 0.650 = (0.393 + 0.155 + 0.107) - 0.005$,
$q_{121} = 0.350 = (0.393 - 0.155 + 0.107) + 0.005$,
$q_{112} = 0.446 = (0.393 + 0.155 - 0.107) + 0.005$,
$q_{122} = 0.129 = (0.393 - 0.155 - 0.107) - 0.002$.
Hence, we conclude that the factors affect (the distribution of) the response
additively in the sense of H.
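The arithmetic of this analysis can be retraced from the counts of Table 1. The Python sketch below is an illustration only: the dictionary layout and variable names are assumptions, and the parameter estimates are formed by simple averaging of the four proportions, which reproduces the values quoted above to within rounding but is not stated in the text to be the authors' method. Carried at full precision, the statistic comes out near 0.082 rather than 0.085, the difference arising from the three-decimal rounding of the q's in the hand computation; either way it agrees with the value of 0.08 quoted earlier.

```python
# Counts from Table 1 for each factor combination (j, k):
# j = time of planting (1 at once, 2 in spring), k = length (1 long, 2 short).
alive = {(1, 1): 156, (2, 1): 84, (1, 2): 107, (2, 2): 31}
total = {(1, 1): 240, (2, 1): 240, (1, 2): 240, (2, 2): 240}

q = {jk: alive[jk] / total[jk] for jk in alive}  # q_1jk

# Estimated left-hand side of p111 - p121 - p112 + p122 = 0.
contrast = q[(1, 1)] - q[(2, 1)] - q[(1, 2)] + q[(2, 2)]

# The four proportions come from independent binomials, so variances add.
v = sum(q[jk] * (1 - q[jk]) / total[jk] for jk in q)

chi2 = contrast ** 2 / v  # d.f. = 1

# Additive-model parameter estimates by simple averaging of the q's
# (an assumed, natural choice).
t_mean = sum(q.values()) / 4
t_plant_1 = (q[(1, 1)] + q[(1, 2)]) / 2 - t_mean   # at once
t_length_1 = (q[(1, 1)] + q[(2, 1)]) / 2 - t_mean  # long

print(round(chi2, 3), round(t_plant_1, 3), round(t_length_1, 3))
```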
4.2 Lessler's Data
In this sub-section, we consider three contingency tables from the
unpublished data of Lessler. The experiment was conducted to study the nature
of sexual symbolism with the basic observation being a subject's classification
of an object as being either masculine or feminine when that object is shown
to him at the following exposure rates:
1/1000 second, 1/500 second, 1/100 second, 1/50 second, 1/5 second.
The subjects involved in the study were classified according to
(a) sex (males and females),
(b) those who were not told the purpose of the experiment and those who
were told the purpose of the experiment (Group A and Group C respectively),
while the objects involved in it had been (by a previous study, Lessler (1962))
assigned
(i) a cultural meaning (M or F) related to which sex used it,
(ii) an anatomical meaning (M or F) related to which sex it innately evoked,
(iii) an intensity (W or S) related to the strength of the degree to which
the object was representative of its classification according to
(i) and (ii).
Thus, as can be seen, the experiment is of the "five response, five factor"
type with the factors being (a), (b), (i), (ii), (iii) above and the responses
being the classifications at 1/1000, 1/500, 1/100, 1/50, 1/5 second.
The tables below are presented only to illustrate tests of the hypothesis
of "no interaction" in various situations; no detailed analysis is demonstrated
here.
I. A "Three Response, No Factor" Situation:

Table 2:
Subject Type - Group C Males
Object Type  - Culturally Masculine, Anatomically Feminine, Weak Intensity

                              Response at 1/5 sec.
                         M                           F
               Response at 1/100 sec.      Response at 1/100 sec.
Response at
1/1000 sec.      M       F    Total          M       F    Total
M              184      10      194          7      20       27
F               38      14       52          7     114      121
Total          222      24      246         14     134      148

$h_1 = 6.78$, $v_{h_1} = 9.34$; $h_2 = 5.70$, $v_{h_2} = 11.19$; $X_h^2 = 0.057$, d.f. = 1
$b = 1.19$, $v_b = 0.775$; $X_b^2 = 0.047$, d.f. = 1
$g_1 = 1.914$, $v_{g_1} = 0.2032$; $g_2 = 1.740$, $v_{g_2} = 0.3445$; $X_g^2 = 0.055$, d.f. = 1

Hence, we accept the hypothesis of "no interaction" and conclude that the
association between (say) the response at 1/1000 second and the response at
1/100 second is the same within each of the categories of the response at
1/5 second.
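The three statistics reported for Table 2 follow directly from the eight cell counts by the t = 2 formulas of Section 3.2, I. A short Python sketch, with hypothetical helper names, retraces them. At full precision the statistic based on b comes out near 0.046; the printed 0.047 corresponds to using the rounded values b = 1.19 and v_b = 0.775.

```python
import math

# Table 2 counts: one 2 x 2 layer (1/1000 sec. by 1/100 sec.) for each
# category of the response at 1/5 sec.; entries are (n11, n12, n21, n22).
layers = [(184, 10, 38, 14),   # response at 1/5 sec. = M
          (7, 20, 7, 114)]     # response at 1/5 sec. = F

def cross_product_ratio(n11, n12, n21, n22):
    return n11 * n22 / (n12 * n21)

def sum_reciprocals(cells):
    return sum(1.0 / n for n in cells)

h = [cross_product_ratio(*c) for c in layers]
v_g = [sum_reciprocals(c) for c in layers]        # variance of log h_k
v_h = [hk ** 2 * vg for hk, vg in zip(h, v_g)]    # variance of h_k
g = [math.log(hk) for hk in h]

chi2_h = (h[0] - h[1]) ** 2 / (v_h[0] + v_h[1])
chi2_g = (g[0] - g[1]) ** 2 / (v_g[0] + v_g[1])

b = h[0] / h[1]
v_b = b ** 2 * (v_g[0] + v_g[1])
chi2_b = (b - 1) ** 2 / v_b

print(round(chi2_h, 3), round(chi2_g, 3), round(chi2_b, 3))
```

All three values are far below 3.84, the upper 5% point of chi-square with one degree of freedom, which matches the acceptance of the hypothesis stated above.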
II. A "Two Response, One Factor" Situation:

Table 3:
Subject Type - Males
Object Type  - Culturally Masculine, Anatomically Feminine, Weak Intensity

                                  Subject Group
                          C                           A
                Response at 1/5 second      Response at 1/5 second
Response at
1/1000 second     M       F    Total          M       F    Total
M               194      27      221        177      14      191
F                52     121      173         30      63       93
Total           246     148      394        207      77      284

In the statistics below, k = 1 refers to Group C and k = 2 to Group A.
$f_1 = 1.41$, $v_{f_1} = 0.00230$; $f_2 = 1.27$, $v_{f_2} = 0.00164$; d.f. = 1
$h_1 = 16.72$, $v_{h_1} = 19.48$; $h_2 = 26.55$, $v_{h_2} = 89.02$; d.f. = 1

Hence, we reject the hypothesis that the measure of association in (2.2) is
the same for both groups. However, we accept the hypothesis that the measure
of association in (2.1) is the same for both groups.
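The two comparisons for Table 3 can be retraced from the cell counts. The Python sketch below is a recomputation under the t = 2 formulas of Section 3.2, II, with hypothetical function names; it is not a transcription of printed values. The statistic for the measure of (2.2) comes out near 4.6 and that for the measure of (2.1) near 0.9, which, against the 5% point 3.84 of chi-square with one degree of freedom, agrees with the rejection and acceptance stated above.

```python
# Table 3 counts for each subject group: (n11, n12, n21, n22), with rows the
# response at 1/1000 second and columns the response at 1/5 second.
groups = {"C": (194, 27, 52, 121), "A": (177, 14, 30, 63)}

def marginal_ratio(n11, n12, n21, n22):
    """f = q11 / (q10 * q01) and its estimated asymptotic variance."""
    n10, n01, n00 = n11 + n12, n11 + n21, n11 + n12 + n21 + n22
    f = n11 * n00 / (n10 * n01)
    v_log = 1 / n11 - 1 / n10 - 1 / n01 - 1 / n00 + 2 * n11 / (n10 * n01)
    return f, f ** 2 * v_log

def cross_product_ratio(n11, n12, n21, n22):
    """h = q11 q22 / (q12 q21) and its estimated asymptotic variance."""
    h = n11 * n22 / (n12 * n21)
    return h, h ** 2 * (1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

def two_sample_chi2(stat):
    (e1, v1), (e2, v2) = stat(*groups["C"]), stat(*groups["A"])
    return (e1 - e2) ** 2 / (v1 + v2)  # d.f. = 1

chi2_f = two_sample_chi2(marginal_ratio)
chi2_h = two_sample_chi2(cross_product_ratio)
print(round(chi2_f, 2), round(chi2_h, 2))
```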
III. A "One Response, Two Factor" Situation:

Table 4:
Subject Type - Males
Object Type  - Culturally Masculine, Weak Intensity

                                  Subject Group
                          A                           C
                  Anatomical Meaning          Anatomical Meaning
Response at
1/1000 second     M       F    Total          M       F    Total
M               202     191      393        298     221      519
F                82      93      175         96     173      269
Total           284     284      568        394     394      788

$d_1 = 0.0387$, $v_{d_1} = 0.001499$; $d_2 = 0.1954$, $v_{d_2} = 0.001093$; d.f. = 1
$\Delta_1 = 1.0576$, $v_{\Delta_1} = .003516$; $\Delta_2 = 1.3484$, $v_{\Delta_2} = .005099$; d.f. = 1
$\gamma_1 = .0583$, $v_{\gamma_1} = .003144$; $\gamma_2 = .3001$, $v_{\gamma_2} = .002804$; d.f. = 1

Hence, we reject both the "additive" hypothesis (2.3) and the "multiplicative"
hypothesis (2.4), (2.5) of no interaction between the effects of the two factors
"Group" and "Anatomical Meaning" on the probabilities $p_{ijk}$.
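The three comparisons for Table 4 can be recomputed from the counts with the t = 2 formulas of Section 3.2, III. The Python sketch below uses hypothetical function names and is a recomputation, not a transcription of printed values. The statistics come out near 9.5 (differences), 9.8 (ratios) and 9.9 (log ratios), all well above 3.84, in line with the rejections stated above. Exact logarithms give values near 0.056 and 0.299 for the two log ratios; the printed .0583 and .3001 equal the logarithms of the ratios rounded to 1.06 and 1.35.

```python
import math

# Table 4 counts: for each subject group k, (n_11k, n_01k, n_12k, n_02k) =
# (M responses, sample size) for anatomical meaning M, then for F.
groups = {"A": (202, 284, 191, 284), "C": (298, 394, 221, 394)}

def difference(n1, m1, n2, m2):
    """d = q1 - q2 and its estimated variance."""
    q1, q2 = n1 / m1, n2 / m2
    return q1 - q2, q1 * (1 - q1) / m1 + q2 * (1 - q2) / m2

def log_ratio(n1, m1, n2, m2):
    """gamma = log q1 - log q2 and its estimated asymptotic variance."""
    gamma = math.log(n1 / m1) - math.log(n2 / m2)
    return gamma, (1 / n1 + 1 / n2) - (1 / m1 + 1 / m2)

def ratio(n1, m1, n2, m2):
    """Delta = q1 / q2 and its estimated asymptotic variance."""
    gamma, v_gamma = log_ratio(n1, m1, n2, m2)
    delta = math.exp(gamma)
    return delta, delta ** 2 * v_gamma

def two_sample_chi2(stat):
    (e1, v1), (e2, v2) = stat(*groups["A"]), stat(*groups["C"])
    return (e1 - e2) ** 2 / (v1 + v2)  # d.f. = 1

for name, stat in [("difference", difference), ("ratio", ratio),
                   ("log ratio", log_ratio)]:
    print(name, round(two_sample_chi2(stat), 2))
```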
Appendix
Let us consider a situation in which N experimental units ($n_{0j}$ of which
are from the j-th sub-population or factor category) are classified into r
categories according to some response. Let $p_{ij}$ (where we assume all $p_{ij} > 0$)
denote the probability of the i-th response category by a subject from the
j-th sub-population; and let $n_{ij}$ denote the corresponding frequency (where,
of course, $n_{0j} = \sum_i n_{ij}$). We shall assume the product multinomial model
$$\phi = \prod_{j=1}^{s} \left( \frac{n_{0j}!}{\prod_{i=1}^{r} n_{ij}!} \prod_{i=1}^{r} p_{ij}^{\,n_{ij}} \right) \tag{A.1}$$
where
$$\sum_{i=1}^{r} p_{ij} = 1, \qquad \sum_{i=1}^{r} n_{ij} = n_{0j} \ \text{(fixed)}, \qquad j = 1, 2, \dots, s.$$
In (A.1), we allow i to be a multiple subscript $i = (i_1, i_2, \dots, i_k)$ where
$i_\alpha = 1, 2, \dots, r_\alpha$ and where $r = \prod_{\alpha=1}^{k} r_\alpha$; and we allow j to be a multiple subscript
$j = (j_1, j_2, \dots, j_l)$ where $j_\beta = 1, 2, \dots, s_\beta$.
This is a "k-response, l-factor" situation. In certain incomplete designs,
all sub-populations (or factor combinations) may not be included in the experiment;
here, however, we shall assume that the design is complete, i.e.,
$s = \prod_{\beta=1}^{l} s_\beta$.
One way of formulating hypotheses concerning the parameters $p_{ij}$ in the model
(A.1) is in terms of constraints
$$F_m(\mathbf{p}) = 0, \qquad m = 1, 2, \dots, u < (r-1)s, \tag{A.2}$$
where ~' = (pu, ... , P~"'l,l' ... , PlS' ... Pr-l,s).
Let us denote the vector (F1 <.i), F2 (;e) , ... ,Fu CR»
are given known functions.
by ~'(~.
(i)
In (A.2), the Fm'S
Let us assume that the Fm(;e)
possess continuous partial derivatives up to the second order with
respect to the elements of £ ,
(ii)
are independent in the sensethat the u x (r-l)s
CF
Q»
matrix
J!<.i)
IlL
]
is of rank u W'hel"e u < (r-l)s. Let
·apij
~' = (qu' ••• , ~-l.l' ••• , qls' ••• , q(r-l)s) , where ~j = nij/nOj ' denote
defined by Jj(;e ) =
[
the vector of unrestricted maxinn.un likelihood estimators of the corresponding
elements of £' ; and let ~(~) d~note their variance-covariance matrix; i.e.,
n~i( k -,il£i)
Q,
=
• • •
~
where
Hence,
,ij
= (Pij' ••• , Pr-l,j)
!!(W
£(~
!!' (R)
S
• ••
Q.
n~<.E2-1>aE2) '. ••
'q,
• •• •••
-le
2. • • • nOs ~ -;es£s' )
• • •
and Ej = diagonal (Plj ,P2j , ••• , Pr-l,j)·
is the asymptotic variance-covariance matrix of
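$\mathbf{F}'(\mathbf{q}) = (F_1(\mathbf{q}), F_2(\mathbf{q}), \ldots, F_u(\mathbf{q}))$; i.e., of the unrestricted maximum likelihood estimators of the left-hand sides of the constraints in (A.2). By condition (ii) above, this matrix is non-singular and positive definite. If (A.2) is true and if $N \to \infty$ (in such a way that $Q_j = n_{0j}/N$ remains fixed), then the statistic

$$X^2 = \mathbf{F}'(\mathbf{q}) \left[ \mathbf{H}(\mathbf{q}) \boldsymbol{\Sigma}(\mathbf{q}) \mathbf{H}'(\mathbf{q}) \right]^{-1} \mathbf{F}(\mathbf{q}) \tag{A.3}$$

has a limiting $\chi^2$-distribution with $u$ degrees of freedom. We call (A.3) Wald's statistic.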
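As a rough illustration of (A.3), the sketch below builds $\mathbf{q}$ and the block-diagonal covariance estimate from an $r \times s$ table of frequencies and evaluates the quadratic form. The names `wald_statistic`, `F` and `H` are hypothetical, the Jacobian is supplied analytically by the caller, and the example hypothesis (equality of two differences of proportions across four binomial sub-populations, with the frequencies of the numerical example) is chosen only for illustration.

```python
import numpy as np

def wald_statistic(counts, F, H):
    """Evaluate the quadratic form (A.3) for an r x s frequency table.

    Columns of `counts` are sub-populations. F(q) returns the u constraint
    values and H(q) the u x (r-1)s Jacobian (both supplied by the caller).
    q stacks the first r-1 proportions of each sub-population in turn.
    """
    counts = np.asarray(counts, dtype=float)
    r, s = counts.shape
    n0 = counts.sum(axis=0)                      # fixed sub-population totals
    props = counts[: r - 1, :] / n0              # (r-1) x s observed proportions
    q = props.T.reshape(-1)
    sigma = np.zeros((s * (r - 1), s * (r - 1)))
    for j in range(s):
        qj = props[:, j]
        start = j * (r - 1)
        sigma[start:start + r - 1, start:start + r - 1] = (
            np.diag(qj) - np.outer(qj, qj)
        ) / n0[j]
    f = np.atleast_1d(F(q))
    h = np.atleast_2d(H(q))
    v = h @ sigma @ h.T                          # estimated covariance of F(q)
    return float(f @ np.linalg.solve(v, f))

# Illustrative data: rows are responses "M", "F"; columns are four sub-populations.
counts = [[202, 191, 298, 221],
          [82, 93, 96, 173]]
F = lambda q: np.array([(q[0] - q[1]) - (q[2] - q[3])])   # assumed example constraint
H = lambda q: np.array([[1.0, -1.0, -1.0, 1.0]])          # its (constant) Jacobian
x2 = wald_statistic(counts, F, H)
print(round(x2, 3))
```

For a non-linear constraint, `H` would depend on `q`; only the two callables change, the quadratic form itself stays the same.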
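We may also test (A.2) by applying Neyman's linearization technique and forming the minimum $\chi_1^2$-statistic to test the linearized hypothesis (see Neyman 1949). The linearized hypothesis is

$$F_m^*(\mathbf{p}) = F_m(\mathbf{q}) + \sum_{j=1}^{s} \sum_{i=1}^{r-1} \left( \frac{\partial F_m(\mathbf{p})}{\partial p_{ij}} \right)_{\mathbf{p}=\mathbf{q}} (p_{ij} - q_{ij}) = 0, \qquad m = 1, 2, \ldots, u \tag{A.4}$$

where $\mathbf{q}$ here is regarded as fixed. The minimum $\chi_1^2$-test of (A.4) (and hence of (A.2), as proved by Neyman) is the minimum value of

$$\sum_{j=1}^{s} \sum_{i=1}^{r} \frac{(n_{ij} - n_{0j} p_{ij})^2}{n_{ij}}$$

subject to (A.4). Bhapkar (1961) showed that the minimum $\chi_1^2$-test of (A.4) is given by

$$X^2 = \mathbf{F}'(\mathbf{q}) \mathbf{V}^{-1} \mathbf{F}(\mathbf{q}) \tag{A.5}$$

where $\mathbf{V}$ is the estimate of the covariance matrix of $\mathbf{F}(\mathbf{q})$ obtained by replacing $p$'s by $q$'s; i.e., $\mathbf{V} = \mathbf{H}(\mathbf{q})\boldsymbol{\Sigma}(\mathbf{q})\mathbf{H}'(\mathbf{q})$. Since $\mathbf{F}^*(\mathbf{q}) = \mathbf{F}(\mathbf{q})$, we see that (A.3) and (A.5) are identical, as was demonstrated by Bhapkar (1965).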
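The equivalence can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes four binomial sub-populations ($r = 2$), a linear constraint $\mathbf{a}'\mathbf{p} = 0$ (so that linearization leaves the hypothesis unchanged), minimizes Neyman's $\chi_1^2$ in closed form through the Lagrange conditions, and compares the minimum with the Wald quadratic form. The vector `a` and all variable names are hypothetical.

```python
import numpy as np

# Four binomial sub-populations (r = 2): counts of the first response and fixed totals.
n1 = np.array([202.0, 191.0, 298.0, 221.0])
n0 = np.array([284.0, 284.0, 394.0, 394.0])
n2 = n0 - n1
q = n1 / n0

# Assumed linear hypothesis a'p = 0 (already linear, so (A.4) coincides with (A.2)).
a = np.array([1.0, -1.0, -1.0, 1.0])
var_q = q * (1 - q) / n0              # diagonal of the estimated covariance of q

def neyman_chi1(p):
    """Neyman's modified chi-square: squared deviations over observed frequencies."""
    return float(np.sum((n1 - n0 * p) ** 2 / n1 + (n2 - n0 * (1 - p)) ** 2 / n2))

# Constrained minimizer obtained from the Lagrange conditions of the quadratic problem.
p_hat = q - var_q * a * (a @ q) / (a @ (var_q * a))
chi1_min = neyman_chi1(p_hat)

# Wald quadratic form for the same constraint.
wald = float((a @ q) ** 2 / (a @ (var_q * a)))
print(chi1_min, wald)
```

The two printed values agree up to floating-point error, which is the content of the identity between (A.3) and (A.5) in this special case.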
Theorem: Let a hypothesis concerning the $p_{ij}$ in (A.1) be formulated as follows:

$$g_\nu(\mathbf{p}) = \sum_{\alpha=1}^{t} d_{\nu\alpha} \beta_\alpha, \qquad \nu = 1, 2, \ldots, p \tag{A.6}$$

where $t < p < (r-1)s$, where the $g_\nu$ are known functions satisfying the conditions (i) and (ii), and where the $d_{\nu\alpha}$'s are known constants such that the rank of the $(p \times t)$ matrix $\mathbf{D}$ is $v$. Then the Wald statistic to test (A.6) (i.e., the minimum $\chi_1^2$-statistic for (A.6) linearized) is the same as the minimum sum of squares obtained by the generalized (weighted) least squares technique applied to the $g_\nu(\mathbf{q})$ with the asymptotic variance-covariance matrix estimated by the "sample" variance-covariance matrix. This statistic has asymptotically a chi-square distribution with $(p - v)$ degrees of freedom under the hypothesis.
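Proof: For this paper, we need only the proof for the case where Rank $\mathbf{D} = t$. Linearizing the left-hand side of (A.6), we obtain

$$g_\nu(\mathbf{q}) + \sum_{j=1}^{s} \sum_{i=1}^{r-1} \left( \frac{\partial g_\nu(\mathbf{p})}{\partial p_{ij}} \right)_{\mathbf{p}=\mathbf{q}} (p_{ij} - q_{ij}) = \sum_{\alpha=1}^{t} d_{\nu\alpha} \beta_\alpha, \qquad \nu = 1, 2, \ldots, p. \tag{A.7}$$

If we let $\mathbf{g} = \mathbf{g}(\mathbf{q})$, $\mathbf{G} = \mathbf{G}(\mathbf{q}) = \left[ \partial g_\nu(\mathbf{p}) / \partial p_{ij} \right]_{\mathbf{p}=\mathbf{q}}$, which is a $p \times (r-1)s$ matrix of rank $p$, and $\mathbf{f} = \mathbf{g} - \mathbf{G}\mathbf{q}$, then we may re-write (A.7) in the form

$$\mathbf{G}\mathbf{p} + \mathbf{f} = \mathbf{D}\boldsymbol{\beta} \tag{A.8}$$

where Rank $\mathbf{G} = p$, Rank $\mathbf{D} = t$ and $t < p < (r-1)s$. Hence there exists a $(p-t) \times p$ matrix $\mathbf{L}$ of rank $p - t$ such that $\mathbf{L}\mathbf{D} = \mathbf{0}$.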
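Since $\mathbf{L}\mathbf{D} = \mathbf{0}$, (A.8) implies

$$\mathbf{L}\mathbf{G}\mathbf{p} + \mathbf{L}\mathbf{f} = \mathbf{0}. \tag{A.9}$$

On the other hand, (A.9) implies that $(\mathbf{G}\mathbf{p} + \mathbf{f})$ belongs to the vector space generated by the columns of $\mathbf{D}$, so there exists $\boldsymbol{\beta}$ such that $(\mathbf{G}\mathbf{p} + \mathbf{f}) = \mathbf{D}\boldsymbol{\beta}$; i.e., (A.9) implies (A.8). Hence (A.8) and (A.9) represent the same hypothesis. Let $\mathbf{F} = \mathbf{L}\mathbf{G}$ and write (A.9) as

$$\mathbf{F}\mathbf{p} + \mathbf{L}\mathbf{f} = \mathbf{0}. \tag{A.10}$$

The minimum $\chi_1^2$-statistic to test (A.10) (and thus to test (A.8)) is

$$\begin{aligned} X^2 &= (\mathbf{q}'\mathbf{F}' + \mathbf{f}'\mathbf{L}') \left( \mathbf{F}\boldsymbol{\Sigma}(\mathbf{q})\mathbf{F}' \right)^{-1} (\mathbf{F}\mathbf{q} + \mathbf{L}\mathbf{f}) \\ &= (\mathbf{q}'\mathbf{G}' + \mathbf{f}')\mathbf{L}' \left( \mathbf{L}\mathbf{G}\boldsymbol{\Sigma}(\mathbf{q})\mathbf{G}'\mathbf{L}' \right)^{-1} \mathbf{L}(\mathbf{G}\mathbf{q} + \mathbf{f}) \\ &= \mathbf{g}'\mathbf{L}' (\mathbf{L}\mathbf{S}\mathbf{L}')^{-1} \mathbf{L}\mathbf{g} \end{aligned} \tag{A.11}$$

where $\mathbf{S} = \mathbf{G}\boldsymbol{\Sigma}(\mathbf{q})\mathbf{G}'$.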
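Consider now minimization of

$$S_e^2 = (\mathbf{g} - \mathbf{D}\boldsymbol{\beta})' \mathbf{S}^{-1} (\mathbf{g} - \mathbf{D}\boldsymbol{\beta})$$

with respect to $\boldsymbol{\beta}$. We have then

$$\hat{\boldsymbol{\beta}} = (\mathbf{D}'\mathbf{S}^{-1}\mathbf{D})^{-1} \mathbf{D}'\mathbf{S}^{-1}\mathbf{g}$$

and

$$\min S_e^2 = \mathbf{g}' \left[ \mathbf{S}^{-1} - \mathbf{S}^{-1}\mathbf{D}(\mathbf{D}'\mathbf{S}^{-1}\mathbf{D})^{-1}\mathbf{D}'\mathbf{S}^{-1} \right] \mathbf{g}.$$

If we let $\mathbf{A} = \mathbf{L}'(\mathbf{L}\mathbf{S}\mathbf{L}')^{-1}\mathbf{L}$ and $\mathbf{B} = \mathbf{S}^{-1} - \mathbf{S}^{-1}\mathbf{D}(\mathbf{D}'\mathbf{S}^{-1}\mathbf{D})^{-1}\mathbf{D}'\mathbf{S}^{-1}$, then $X^2 = \min S_e^2$ if $\mathbf{A} = \mathbf{B}$. Observe that

(i) Rank $\mathbf{A}$ = Rank $\mathbf{B}$ = $p - t$,

(ii) $\mathbf{A}\mathbf{D} = \mathbf{B}\mathbf{D} = \mathbf{0}$,

(iii) $\mathbf{D}'\mathbf{A} = \mathbf{D}'\mathbf{B} = \mathbf{0}$.

From (i), (ii), (iii) we have that there exist nonsingular $(p \times p)$ matrices $\mathbf{C}$ and $\mathbf{Q}$ such that $\mathbf{A} = \mathbf{C}\mathbf{B}$ and $\mathbf{A}\mathbf{Q} = \mathbf{B}$. Hence, using $\mathbf{A}\mathbf{S}\mathbf{A} = \mathbf{A}$ and $\mathbf{B}\mathbf{S}\mathbf{B} = \mathbf{B}$, which follow directly from the definitions,

$$\mathbf{A}\mathbf{S}\mathbf{B} = \mathbf{A}\mathbf{S}\mathbf{A}\mathbf{Q} = \mathbf{A}\mathbf{Q} = \mathbf{B}$$

and also

$$\mathbf{A}\mathbf{S}\mathbf{B} = \mathbf{C}\mathbf{B}\mathbf{S}\mathbf{B} = \mathbf{C}\mathbf{B} = \mathbf{A}.$$

Thus $\mathbf{A} = \mathbf{B}$ and $X^2 = \min S_e^2$.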
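The matrix identity at the heart of the proof is easy to check numerically. The following sketch uses arbitrary randomly generated inputs (an assumption made purely for illustration): a positive definite $\mathbf{S}$, a full-column-rank $\mathbf{D}$, and an $\mathbf{L}$ whose rows span the orthogonal complement of the columns of $\mathbf{D}$, obtained here from a singular value decomposition. It then compares $\mathbf{A}$ with $\mathbf{B}$ and the weighted least squares minimum with the quadratic form in $\mathbf{A}$.

```python
import numpy as np

rng = np.random.default_rng(0)
p, t = 5, 2                                   # illustrative dimensions with t < p

# Arbitrary positive definite S (p x p) and full-column-rank D (p x t).
M = rng.standard_normal((p, p))
S = M @ M.T + p * np.eye(p)
D = rng.standard_normal((p, t))

# Rows of L span the orthogonal complement of the columns of D, so L D = 0.
U, _, _ = np.linalg.svd(D, full_matrices=True)
L = U[:, t:].T                                # (p - t) x p, rank p - t

S_inv = np.linalg.inv(S)
A = L.T @ np.linalg.inv(L @ S @ L.T) @ L
B = S_inv - S_inv @ D @ np.linalg.inv(D.T @ S_inv @ D) @ D.T @ S_inv

# Weighted least squares fit of an arbitrary g on D with weight matrix S^{-1}.
g = rng.standard_normal(p)
beta_hat = np.linalg.solve(D.T @ S_inv @ D, D.T @ S_inv @ g)
residual = g - D @ beta_hat
min_ss = float(residual @ S_inv @ residual)   # minimum weighted sum of squares
quad_form = float(g @ A @ g)                  # g' L'(L S L')^{-1} L g

print(np.allclose(A, B), min_ss, quad_form)
```

Any other choice of $\mathbf{L}$ with the same row space gives the same $\mathbf{A}$, since $\mathbf{A}$ is unchanged when $\mathbf{L}$ is premultiplied by a nonsingular matrix.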
Summary Table

Model                           Hypothesis
"Three Response, No Factor"     (2.1)
"Two Response, One Factor"      (2.2), (2.3)
"One Response, Two Factor"      (2.4), (2.5)

The test statistics for the models in the above table are the generalizations, for the r x s x t tables, of the statistics given in Sub-section 3.2 for the 2 x 2 x t tables.
BIBLIOGRAPHY
Bartlett, M. S. [1935], "Contingency table interactions", Journal of the Royal Statistical Society, Supplement, 2, 248-52.

Bhapkar, V. P. [1961], "Some tests for categorical data", Annals of Mathematical Statistics, 32, 72-83.

Bhapkar, V. P. [1965], "A note on the equivalence of some test criteria", Institute of Statistics, University of North Carolina, Mimeo Series No. 421.

Darroch, J. N. [1962], "Interactions in multi-factor contingency tables", Journal of the Royal Statistical Society, Series B, 24, 251-63.

Goodman, L. A. [1963a], "On methods for comparing contingency tables", Journal of the Royal Statistical Society, Series A, 126, 94-108.

Goodman, L. A. [1963b], "On Plackett's test for contingency table interactions", Journal of the Royal Statistical Society, Series B, 25, 179-88.

Goodman, L. A. [1964], "Simple methods for analyzing three-factor interaction in contingency tables", Journal of the American Statistical Association, 59, 319-52.

Kastenbaum, M. A., and Lamphiear, D. E. [1959], "Calculation of chi-square to test the no three-factor interaction hypothesis", Biometrics, 15, 107-15.

Lessler, K. J. [1962], "The anatomical and cultural dimensions of sexual symbols", Unpublished Ph.D. Thesis, Michigan State University.

Lewis, B. N. [1962], "On the analysis of interaction in multi-dimensional contingency tables", Journal of the Royal Statistical Society, Series A, 125, 88-117.

Neyman, J. [1949], "Contribution to the theory of the $\chi^2$-test", Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 239-73.

Plackett, R. L. [1962], "A note on interactions in contingency tables", Journal of the Royal Statistical Society, Series B, 24, 162-6.

Roy, S. N. and Bhapkar, V. P. [1960], "Some nonparametric analogs of normal ANOVA, MANOVA, and of studies in normal association", Contributions to Probability and Statistics, Essays in Honor of Harold Hotelling, Stanford University Press, Stanford, California, 371-87.

Roy, S. N. and Kastenbaum, M. A. [1956], "On the hypothesis of no "interaction" in a multi-way contingency table", Annals of Mathematical Statistics, 27, 749-57.

Simpson, E. H. [1951], "The interpretation of interaction in contingency tables", Journal of the Royal Statistical Society, Series B, 13, 238-41.

Wald, A. [1943], "Tests of statistical hypotheses concerning several parameters when the number of observations is large", Transactions of the American Mathematical Society, 54, 426-82.

Woolf, B. [1955], "On estimating the relation between blood group and disease", Annals of Human Genetics, 19, 251-53.