Reinfurt, D.W.; (1970)The analysis of categorical data with supplemented margins including applications to mixed models."

•
This research was supported by National Institutes of Health,
Institute of General Medical Sciences, in part by Career Development
Grant No. GM 0038-16 and in part by Grant No. GM 12868-06.
•
,
'."" , ,~.
,-
THE ANALYSIS OF CATEGORICAL DATA
WITH SUPPLEMENTED MARGINS INCLUDING
APPLICATIONS TO MIXED MODELS
by
Donald William Reinfurt
j
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 697
July 1970
•
~
ABSTRACT
REINFURT, DONALD WILLIAM.
The Analysis of Categorical Data with
Supplemented Margins Including Applications to Mixed Models.
the direction of GARY GROVE
(Under
KOC~.)
The general linear model approach for the analysis of categorical
,.
data is extended to the case of supplemental data.
each individual
i~
In the usual case,
classified according to each of the d dimensions of
the corresponding contingency table.
With supplementation, there is
additional information on d' < d dimensions for a random sample
selected from the same homogeneous population.
This situation might
arise if there was special interest in only certain of the responses
and/or possibly for reasons of economy and is distinctly different
from the empty cell problem.
Starting with the simplest case, maximum likelihood and iterated
weighted least squares estimates are obtained for the basic parameters
for 2 x 2 tables with one margin supplemented.
The maximum likelihood
estimates are unbiased and have the same asymptotic covariance matrix
as do the iterated weighted least squares estimates.
Using conditional
expectations and variances) improved estimates of the precision of the
maximum likelihood estimates are obtained.
For the case with both
margins supplemented, iterative solutions are indicated.
Extensions to
r x c tables and higher_dimensional tables are outlined.
A general two_stage test procedure is presented which requires
certain modifications of the existing computer program.
.'
Several
detailed examples are given including mixed categorical data models
with supplemented margins.
~
The situations in which these special "no
factor, mUlti_response'l models arise are identified as are the
•
appropriate tests of hypotheses, namely, tests of marginal symmetry
and, for quantitative variables, tests of equality of mean scores.
is noted that supplementation should be of most use in exactly these
mixed categorical data model situations.
/
•.
.•.
•
It
•
ii
BIOGRAPHY
The author was born August 3D, 1938, in Wilkes_Barre, Pennsylvania,
and J being the son of a Methodist
part of the United States.
minister~
was raised in the eastern
He graduated from Unadilla Central High
School in Unadilla, New York, in 1956.
He received the Bachelor of Science degree with a major in
mathematics from the State University of New York at Albany in 1960,
and the Master of Arts degree in mathematics from the State University
of New York at Buffalo in 1963 where he was a teaching fellow.
Since
then, the author has combined his studies in the Department of
Experimental Statistics at North Carolina State University at Raleigh
~
•
with work at the North Carolina Highway Safety Research Center in
Chapel Hill.
The author is married to the former Mary Karen Hillix of
St. Joseph, Missouri, and has a daughter, Kristin Elizabeth, 18 months
old .
•
•
•
iii
ACKNOWLEDGMENTS
The author extends his sincere appreciation to the many persons
who have helped in so many ways in the preparation of this study.
He
is especially indebted to his advisor, Dr. Gary G. Koch, for his
invaluable advice, guidance, and constant encouragement.
In addition!
he would like to thank the other members-of his advisory committee,
Drs. John T. Gentry, Daniel G. Horvitz, Henry L. Lucas, and Harry
Smith, Jr.,
for their constructive criticism and suggestions.
Special gratitude is expressed to his employer, Dr. B. J.
Campbell, Director of the North Carolina Highway-Safety Research
Center, for making it possible to complete this research at an earlier
•
date than would otherwise have been possible.
Generous financial support for this study has been provided by
the National Institutes of Health, Institute of General Medical
Services.
Finally, the author expresses his thanks to his wife, Karen, and
his'daughter, Kristin, for their continued patience and sacrifice
during the course of this research.
•
•
iv
TABLE OF CONTENTS
Page
vi
LIST OF TABLES
1.
1.1
1.2
1.3
.
General Categorical Data Models
11
13
Form~lation
Estimation Procedures
Test Statistics
17
20
Notation and Ass~mptions
Methods of Inference
3.3
3.4
3.5
3.6
20
22
28
Introd~ction
28
Mixed
Mixed
Mixed
Split
30
.
Models in General
Categorical Data Models of Order 2
Categorical Data Models of Higher Orders
Plot Contingency Tables
MORE GENERAL CATEGORICAL DATA MODELS FOR THE CASE OF
SUPPLEMENTED MARGINS
3.1
3.2
•
Hypothesis
4
6
9
11
MIXED CATEGORICAL DATA MODELS
2.1
2.2
2.3
2.4
2.5
3.
2
Form~lation.
The General Linear Model
1. 4.1
1. 4. 2
2.
Historical Remarks
Some Recent Developments in Model
The Categorical Data Framework
Hypotheses of Interest
Methods of Inference
1. 3.1
1. 3. 2
1. 3. 3
1.4
1
2
Introd~ction
1. 2.1
1. 2.2
1. 2. 3
1. 2.4
•
1
INTRODUCTION AND REVIEW OF LITERATURE
Introd~ction
Perinatal Health
Hand Example
Dr~g Example
47
47
More General Linear Models in the Case of Normal
Random Variables
Notation for the More General Categorical Data
Model
•
Estimation.
.
Tests of Hypotheses
Examples Using the More General Model
3.6.1
3.6.2
3.6.3
32
37
41
S~rvey
Example
50
53
56
59
60
60
65
73
•
v
TABLE OF CONTENTS (continued)
Page
4.
ESTIMATION THEORY AND HYPOTHESIS TESTING FOR TWO-WAY
CONTINGENCY TABLES WITH SUPPLEMENTED MARGINS
4.1
4.2
Introduction
. . ..
2 x 2 Case with One Margin Supplemented
•
4.2.1
4.2.2
4.2.3
4.3
2 x 2 Case with Both Margins Supplemented
4.3.1
4.3.2
4.4
•
•
5.
Methods of Estimation
Properties of the Estimators
Extension of the Theory of Supplemented Margins
4.4.1
4.4.2
4.5
Notation
Methods of Estimation
Properties of the Estimators
The Case of One Margin Supplemented
The Case of Several Margins Supplemented
Tests of Hypotheses
•
SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH
5.1
5.2
Sunnnary
.
.
Suggestions for Further Research
81
81
81
81
82
88
94
94
100
103
103
107
108
113
113
117
6.
LIST OF REFERENCES
119
7.
APPENDIX
123
•
•
vi
LIST OF TABLES
Page
1.4.1.1
Expected cell probabilities (cell frequencies) for
the standard contingency table
.
20
~
2.2.1
Summary of assumptions and appropriate test
32
procedures for various types of mixed models
2.3.1
Vision records for women employees of the Royal
Ordnance factories
2.3.2
33
. . . .
Frequency distribution of the difference between
right and left eye grades .
... ..
36
2.4.1
Responses of patients to drugs A, B, and C
37
2.5.1
Responses of males to culturally masculine and
anatomically feminine objects with weak intensity
3.3.1
•
3.3.2
Frequency distribution for experimental units with
complete information . . . . . . . .
55
Frequency distribution for experimental units with
information on A only .
3.6.1.1
55
Frequency distribution for experimental units with
information on A and B only . . . . .
3.3.3
42
.
.
55
. ...
Serum bilirubin readings and 5_minute Apgar scores
for a sample of premature infants.
. ...
62
3.6.2.1· Finger data indicating classification with respect
to the presence or absence of sweat or sensation
3.6.2.2
Frequency distribution of complete and supplemental
data by agreement (A) and disagreement (D) for
each finger . . . . . . • . . . . . . . . . . . .
67
3.6.3.1
Frequency distribution of responses to drugs A and B
74
4.2.1.1
Expected cell probabilities (cell frequencies) for
the 2 x 2 case with one margin supplemented. . .
82
Observed frequencies for a 2 x 2 contingency table
with both margins supplemented . . . . . . . .
110
4.5.1
7.1
•
66
Iterative solution for the MLE's (p .. ) and WLSE's
(n..)
1J
1J
for Table 4.5.1
.
125
•
1.
INTRODUCTION AND REVIEW OF LITERATURE
1.1
Introduction
In the usual categorical data situation, there is complete
:
information available for each entry in the corresponding contingency
~.~.J
table,
the table.
each unit is classified according to each dimension of
However, in many situations, the experimenter is not
equally interested in all of the response variates or perhaps he is
able to obtain additional information on a subset of the dimensions
of classification relatively easily and economically.
The question
then arises as to how this additional or supplemental information might
be used in improving the estimates of the basic probabilities of the
•
table, thereby improving the corresponding tests of hypotheses .
The primary goal of this research is to extend the general linear
model approach for the analysis of" categorical data
(=!.
Grizzle
~~.,
1969) to the above_mentioned case of contingency tables with supplemented margins.
gation:
The following questions are relevant to this investi_
What are the properties of maximum likelihood estimates
(MLE's) and iterated weighted least squares estimates (WLSE's) with and
without supplementation?
given situation?
What method of estimation should be used in a
How can the results on estimation be incorporated into
the general Grizzle, Starmer, Koch (GSK) framework for hypothesis test_
ing?
What are the properties of these tests and for which tests is
supplementation most useful?
There are two additional goals of this research.
•
The first is
the investigation of a particular "no factor, mUlti_response l1 situation
in which supplementation would appear to be most useful since the main
•
2
hypotheses of interest are ones of marginal symmetry and, if the
response variables are quantitative, equality of mean scores.
How can
the GSK approach be applied to analyze the most general of these "mixed
categorical data model" situations with supplementation?
The second is a comprehensive review of the vast and scattered
literature on categorical data methods.
There is special emphasis on
the linear model approach since it is the method employed in the
remainder of this research.
1.2
1.2.1
General Categorical Data Models
Historical Remarks
Categorical data and its analysis have been of interest to
•
statisticians ever since the earliest origins of the subject.
For
example, in the area of vital statistics, Graunt, in the early 1600's,
presented data on causes of death
(~.!.J
consumption, diseases of
infancy, tooth diseases) which appeared in the form of frequency
tabulations according to the sex and marital status of the deceased.
Since 1790, census studies have collected and analyzed numerous crosstabulations on a wide variety of demographic variables.
In the area of
genetics, Mendel, in the mid_1800's, collected data on garden pea
varieties (one_way tables) leading to the now_famous Mendelian
heritability ratios
(~.~.,
with complete dominance for each of two
characteristics, genotype ratios of offspring of 9:3:3:1).
In the
field of epidemiology, Greenwood and Yule, in the early 1900's, had
considerable data on typhoid attacks on inoculated and non_inoculated
•
individuals.
The list of other examples is virtually unlimited .
•
3
In the analysis of categorical data, the binomial and Poisson
probability distributions and later the multinomial distributioll have
played an essential role.
One of the most important discoveries
(Laplace, de Moivre) in early statistical theory was that, as the
sample size becomes "1arge
ll)
data generated by any of these distribu_
tions is approximately normally or multinormally distributed.
property has been used repeatedly in
dev~loping
This
appropriate methods fnr
2
analyzing categorical data from the original Pearson chi_square (X )
statistic to the more complex Wald statistics for testing the hypo the_
sis of "no. interactibn l1 in multi-d"imensional contingency tables.
Aside from this historical background, categorical data appearing
in the form of frequency tabulations in one_way, two_way} and mUlti_way
••
tables is a familiar data array to virtually all statisticians in
today's society.
In many fields, much of the data arises from
questionnaires and thus is often nominal or, at best} ordinal.
Such
fields include the social sciences, the medical and health sciences,
and occasionally the physical sciences.
Unfortunately, one consequence of categorical data being of
interest to persons in widely divergent areas of interest has been the
development of a number of very different approaches to analyzing such
data.
In the familiar situations involving continuous random variables}
the general linear model includes such apparent special models as
multiple regression models, fixed effects} random effects, and mixed
effects analysis of variance models, models with covariables, etc.
uses a general least squares approach as the method of inference.
•
It
On
the other hand, with contingency tables, the underlying models have not
•
4
always been obvious and hence hypotheses have often been formulated in
a fragmented way.
As a result, it has appeared that many types of
complex categorical data situations required their own special methods
of inference.
The abundant and colorful literature for contingency
tables unfortunately documents this in considerable detail.
Thus, the
practitioner has often had to search the literature for a procedure
appropriate to his problem; when he has failed to find such a technique,
he has either been required to use an inappropriate analysis for his
data or leave his data unanalyzed.
Another difficulty in the analysis of categorical data has been
the existence of numerous methods of estimation and corresponding test
criteria upon which the statistical inferences are based.
••
As a result,
even when the researcher has decided upon the hypotheses of interest,
he still is confronted with a wide choice of particular methods to
apply.
In most cases,
this choice is not crucial since the various
methods have been shown to be asymptotically equivalent.
since various authors use different methods,
However,
to the uninitiated the
variety of choices is often confusing.
1.2.2
Some Recent Developments in Model Formulation
Pearson (1947) was among the first to note that, even in the
simple 2x2 contingency table, the same configuration could have arisen
from different sampling schemes.
Thus} it appeared necessary to
carefully specify the underlying probability model since different
models could lead to different statistical procedures with obViously
•
different interpretations .
•
5
Underscoring the work of Professor S. Ne Roy and his students,
Kastenbaum, Mitra, Diamond, Bhapkar and Sathe, has been the careful
consideration of the sampling scheme from which the data arose.
particular factor_response structure
(2:...:..,
The
sampling scheme) of the
model has dictated the hypotheses of relevance just as in univariate
and multivariate analysis of variance for continuous variables.
The combined contributions of these researchers form the basis of
the general approach found in Bhapkar and Koch (1968a, 1968b) and
especially. in Grizzle et al. (1969).
They recommend
multi~factor,
multi-response models as the key to formulating relevant hypotheses.
Their test statistics are derived through weighted least squares
••
procedures and correspond identically to the minimum modified chi_
square statistics due to Neyman (1949) or equivalently the generalized
quadratic form criteria due to Wald (1943).
The main alternative school of thought has been developed by
Goodman, Good, Bishop, Fienberg, Kullback, Kupperman and Ku, among
others.
The major papers from this group include Goodman (1968, 1970),
Bishop (1969), Bishop and Fienberg (1969), Kullback
Ku and Kullback (1968).
et~.
(1962) and
Goodman, Good, Bishop and Fienberg have been
motivated by the resemblance of multi_dimensional contingency tables to
certain complex factorial designs.
They then apply maximum likelihood
methods to the assumed multiplicative underlying models although their
methods can be modified to apply to other models as well.
Much of
their effort is. directed toward special problems such as incomplete
•
tables.
On the other hand, Kullback, Kupperman, and Ku use the method
of minimum discrimination information estimation (which closely
•
6
resembles maximum likelihood estimation) and the minimum discrimination
infonna tion
5
ta tis tic as the tes t cri terion
!l
for measuring the
divergence of the alternative hypothesis) as evidenced by the sample
values, from the null hypothesis."
The major portion of this research will utilize the multi_factor,
multi_response framework with statistical inference based on the
minimum modified chi_square method of Neyman.
However, in certain
cases, maximum likelihood results will be given for the sake of
completeness.
1.2.3
The Categorical Data Framework
In the analysis of any multi_dimensional contingency table, there
••
are three fundamental aspects to be considered.
These include
(i) the specification of the underlying model
two response model);
(~.~.,
no factor,
(ii) the formulation (in terms of this model) of the hypotheses
to be tested (e.g., independence of responses) and the
calculation of-the corresponding test statistics;
(iii) the interpretation (with respect to the model) of the
results (e.gO) there is an association or relationship
between the-two responses).
In order to specify the underlying model, it must be realized that each
subject (or experimental unit) may give rise to two types of data.
These are
(i) a descripti.on of the experimental conditions which the
subject undergoes (or of the sub_population of units to
which the experimental unit belongs)
to as "fae tors.1I or "popula tions:
•
1
__ henceforth referred
j
(ii) a description of what subsequently happens to each subject __
henceforth referred to as Ilresponses': .
In other words, the data in any multi.dimensional contingency table are
•
7
the frequencies with which subjects belonging to the same sub_population
or combination of £aerer categories yielded the same combination of
responses.
The dimension! d
j
of a contingency table refers to the
(~.~.j
total number of factors and responses
d = 2 in the previous
example since there were no factors and two responses) while the levels
represent the sub_categories within the factors and responses
handedness has three levels:
(~.~.,
left, right, and ambidextrous).
More specifically) consider r independent random samples taken
from r multinomial populations whe.re n.
represents the size of the
~o
sample from the i_th 'population and n .. the observed number falling in
~J
the j_th category of the i_th sample where i = I, 2, ... , rand
j
= I,
2, ... , s.
It should be noted that i and/or j may be multiple
subscripts as in the case of multi_factor and/or mu~ti_response models.
For example,
i
:;:;; if
... , i
~
d
) with iO'
l,2, .•. ,rO"
0'
f
1,2, ...
•• J
,s~,
= 1,2, ... ,d f
~ =
1,2, ... ,d .
r
and
In
this example, the model is that of a dcfactor, dr-response contingency
table with the factors at levels iO"
responses at levels
j~,
~
1, 2.• " ' ,
0'
= 1.1 2., " ' ) d
r
de
and the
Note also that all
response_level combinations are assumed to occur with positive prob..
ability; however, situations where this assumption does not hold may
be handled by methods analogous to those given by Goodman.
other hand) not all
•
factor~level
On the
ccmbinations are required to appear.
In particular, if the sample design is an incomplete block design, not
all factor_level combinations can appear.
•
8
Now, if n 1)
.. represents
the pmb"bilits
of an observation from the
,
.
fal1in~
i_th population (or factor level)
in r.he j_th response cat.egory
(i.e., if the n .. represent individual cell probabilities), and the
-
-
1)
samp~ing
is either from a very large. pcpula.tion or with replacement,
then the probability distribut'ion of the observed frequencies, n .. } is
1)
given by the product_multinomial. distribution
n
r
$
=
n
i=l
s
n
io
n
j=l
(n .. )
ij
1)
(1.2.3.1)
n .. ~
1J
s
where
••
E n .. = 1 and
1)
j=l
assumed that n
ij
- n.
> 0 for all i,
1,0
j.
(fixed)
fer all i and where i t is
This basic model (1.2.3.1) will be
assumed throughout and the hypotheses of interest will be expressible
as functions of the unknown probabilities in the model, 'say ,~(!y.
It
should be noted that, if sampling is without replacement from a rela_
tively small population and hence the sampling fraction is large (say,
exceeding 10 percent), methods of analysis based on the theory of
sampling from finite popula tions wcnld be appropria teo
The details of
the analysis for this situat.ion in terms of finite population correc_
tion factors is discussed in Johnson and Koc.h (1970).
•
1
1Johnson, W. D. and KC'ch, G. G.
Sume aspects of the statistical.
analysis of qualitative data from sample sllrveyso
Part 1:
Propor_
tions, mean scores, and other linear flmctions~ Submitted to Health
Services Research, 1970~
Department of Biostatistics, University of
North Carolina at Chapel Hill.
•
9
1.2.4
Hypotheses of Interest
In order to formulate appropriate hypotheses to be tested, the
underlying factor_response structure of the table should be identified.
Following Bhapkar and Koch (1968a), the four types of multi_dimensional
contingency tables of interest (including examples from the literature)
are the following:
Model
I.
Model
II.
No factor, multi_response tables
Bhapkar (1966): no factor, two responses (left and
right eye grades); d = 2
Uni_factor, multi_response tables
Mosteller (1968): one factor (country), two
responses (status of father's occupation, status
of sonls occupation); d = 3
Model III.
••
.~
Multi.factor, uni_response tables
Bhapkar and Koch (1.968a): two factors (driver
group, accident type), one response (accident
severity); d = 3
Model
IV.
Multi-factor, multi_response tables
Bhapkar and Koch (1968b): two factors (subject
group, anatomical meaning),
two responses
(classifications at exposure rates of 1/5 second and
1/1000 second); d = 4
Obviously, for one_way tables, only Model I can occur; for two_way
tables, only Models I and II can occur and Model II actually represents
uni-factor, uni_response tables; for three_way tables, only Models I,
II, and III can occ'lr.
Otherwise (d > 3), all. four models can arise
in various situations.
To identify the hypotheses appropriate to the different models,
consider, for
simplicity~
the case of three_dLmensional tables.
In the
case of "no factor) three response" tables (Model I), the primary
•
in~erest
lies in the relationships among the three responses which are
analogous to problems of independence and correlation in normal
•
10
multivariate analysis of variance.
Questions of interest include total
independence of the three responses) independence of anyone response
from the other two, pairwise independence, and partial association
between two responses within given ,levels of the third.
In addition,
as will be explained later, hypotheses of marginal symmetry and equal_
ity of mean scores for structured responses (where scores,
':'0.2:)
ridits or percentile scores, are assigned, to the response levels) are
of considerable interest.
If the experiment is of the "one factor, two response" type (Model
II), interest lies
in
the association between the responses as well as
the effect of the factor on the responses.
Here the questions posed
are similar to those encountered in one_way normal multivariate analy_
••
sis of variance.
Appropriate hypotheses for this case include
independence of the responses within each factor level, homogeneity for
each marginal response
(~.~.,
does the factor level affect the marginal
distribution of the response?), and joint homogeneity
(~.~.,
does the
factor level affect the joint distribution of the responses?).
Finally, if the experiment is of the
fl
two factor, one response ll
type (Model III), the questions of interest are similar to those
arising in univariate no~al analysis of variance, ~'~'J how do the
factors (cf. treatments or independent variables) combine to determine
the response
(=.E..
"yield" or dependent variable)?
Here the hypotheses
of interest include total homogeneity (~'~'J do both factors affect the
distribution of the response?), partial homogeneity
(~.~.,
factor affect the distribution of the response?)) and
•
between factors in the way they affect
~he
respoase
l1
does one
ll
no interaction
(~.~.,
do :he
•
11
factors determine the response in a purely additive or multiplicative
way or are the relationships between the factors and the response more
comp lica ted?) .
For completeness, note that, for Model IV situations ("multifactor" multi_response" experiments and hence d > 3») interest centers
in both the relationships among the responses and in the way the
factors combine co affect the responses.
Thus, the structure of the table (which is not always obvious or
unambiguous) dictates the types of hypotheses that are appropriate to
the given
~xperimental
situation.
The test statistics used and the
interpretation of the results then follow from the specified model and
the corresponding hypotheses of interest .
••
1.3
1.3.1
Methods of Inference
Hypothesis Formulation
A general formulation for many hypotheses of interest in the
analysis of categorical data
(~.~.,
in two_way tables, independence in
the Model I situation, homogeneity in the case of Model II) is given by
the following set of constraints on the cell probabilities:
f
(1. 3. 1. 1)
(n) = 0
txl
where
t < r(s-l)
•
with
•
12
...
~
TTr l,TT r 2"" ,TT rs )
"
= ( TTl,TT"))
••• ,TT ')
-""""£
-r
and the functions,
fk(~'
are functionally independent and independent
s
of the constraint,
fk(~
E TT . = 1, i
j=l i J
In addition, the
must have continuous first and second partial derivatives with
respect to the TT.. 's and
~J
o
(TT) =
txrs H
o
(~.~.,
1, 2, ... , r.
of k (~)
a
(TTij~!!
0
must be of rank t
full rank) for any TT in the neighborhood of TT.
For the case of
independence in the two-way case mentioned previously, the constraints)
f
k
(~
= 0,
take the form
••
where
~n
is the natural or Naperian logarithm, and where, for con_
venience; in the case of the single multinomial
(~.~.,
the situation
where two or more responses are given for each member of a given popula-
=1
since r
= 1),
the i-subscript is suppressed.
the case of homogeneity, the
fk(~
take the form
tion and hence i
For
for all i f i' and each j.
= 0
Alternatively; functions of the cell probabilities can be expressed in
the following linear model:
•
x
f (TT) =
13
txl txu uKl
(1.3.1.2)
•
13
!(!9
where
is as before, X is a known design matrix of rank u < t and
is a vector of unknown parameters.
~
It should be noted that X differs
from the usual design matrix when more than one function is constructed
within each population.
under (1.3.1.2).
Tests for fit of the model are appropriate
Given that the model fits the data, hypotheses con_
cerning various constraints on the model parameters can be formulated
as follows:
C
~
cxu ux1
=
(1. 3.1. 3)
0
cx1
where C is a matrix of known constants of rank c < u.
For example, in
the case of split_plot categorical models as described in Koch and
••
Reinfurt (1970),2 the significance of whole_plot effects, split_plot
effects, and interaction between whole_plot and split_plot effects may
be tested using appropriate C matrices.
1.3.2
Estimation Procedures
If the hypothesis of interest is formulated in terms of constraints
on the cell probsbilities as in (1.3.1.1), then it is necessary to
obtain estimates of the cell probabilities subject to those constraints.
These constrained estimates are then incorporated into one of several
asymptotically equivalent test criteria and the test is performed.
In the framework of (1.3.1.1), there are essentially two different classes of estimation procedures.
2
•
These are
Koch, G. G. and Reinfurt, D. W. The analysis of categorical data
from mixed models. Submitted to Biometrics, 1970. Department of
Biostatistics, University of North Carolina at Chapel Hill .
•
14
(1) Maximum likelihood and minimum discrimination information;
2
(2) Minimum chi_square (minimum X ) and modified minimum chi_
square (minimum
Xi)
in the sense of Neyman.
Both classes of estimation procedures satisfy a certain optimality
property, namely, they all yield BAN (best asymptotically normal)
A
estimates.
Thus, these estimates, n. OJ are
1J
(i) Consis tent and hence asymptotically unbiased (~.
converge in probability to the TT
ij
n.
E
i=l
'where
~.
1
10
-
to
n
io
with T = ~i
is a constantj
(iii) Efficient
••
they
);
r
(ii) Asymptotically normal as N =
=.. ,
(~.
=..'
the variance of any other consistent,
asymptotically normal estimate is at least as large as the
variance of ~ij);
(iv) Sufficiently regular
(~.
=..,
on..1J
OTT
exist and are continuous
ij
in TT
for all i,j).
ij
Historically, maximum likelihood estimates (MLE's) were probably
the first estimates derived for (1.3.1.1) and have certainly been the
most popular to date.
maximized
~ith
To obtain MLE's of the TT , (1.2.3.1) must be
ij
respect to the TT.. 's and subject to (1.3.1.1) leading to
1J
a system of simultaneous equations which are often non_linear in the
TT
ij
and consequently difficult to solve.
schemes have been proposed
(=..~.,
Although various iterative
Roy and Kastenbaum, 1956; Kastenbaum
and Lamphiear, 1959), the variety of possible hypotheses has precluded
•
the formulation of a general solution of the resulting systems of
•
15
equations.
However, due to the results of Birch (1963), fairly general
maximum likelihood procedures have recently appeared.
These include
the various iterative proportional fitting schemes presented in
Mosteller (1968), Goodman (1968), Bishop and Fienberg (1969), and
Bishop (1969), and are based on the realization that "the MLE's of the
expected cell frequencies are uniquely determined by the marginal
totals being equal to the MLE's of their .expectations."
Alternatively, Kullback and his associates have recommended the
use of minimum discrimination information estimates.
These BAN
estimates are obtained by minimizing, with respect to the TIij'S,
••
r
S
E
E
(1. 3. 2.1)
i=l j=l
subject to (1.3.1.1) where the Pij are fixed by hypothesis, observed,
or estimated.
Similar to maximum likelihood estimation, this procedure
also usually requires iterative procedures.
2
Minimum X estimates are obtained by minimizing
r
s
E
E
i=l j=l
(1.3.2.2)
n. TI ,
10 i J
with respect to the TI ' S and ,subject to (1. 3.1.1).
ij
Again, this
technique usually involves the solution of a system of complicated, nonlinear simultaneous equations, and, since minimum
x2
estimates are not
known to possess any desirable properties that MLE's lack, these
•
estimates have seldomly been used .
•
16
If the denominator of (1.3.2.2) is replaced by the observed cell
frequency, n
ij
, and the resulting sum
(1.3.2.3)
minimized with respect to the nij's and subject to (1.3.1.1)) the
resulting minimum
xi
estimates are obtained by solving only linear
equations provided the
fk(~
fk(~
proved that, i f the
are linear in the nij's.
Neyman (1949)
are not linear in the nij's, the
fk(~
be replaced. by their Taylor series expansion about the point, n
n..
/n.10
1J
where Pij =
••
H.,
i :::; 1,
J
r
E
+
fk(J:}
s
E
i=l j =1
r
j
j
:;:; 1, ...
J
5,
may
= ~,
namely
ofk(~
( an..
1J
(1.3.2.4)
k = I, 2, ... ,
thereby reducing the problem to the linear case.
t
In either case, the
·
.
·
resu 1 tlng
minlmum
Xl2 estimates are BAN estlmates
an dd 0 not requ i re
iterative procedures for their solution.
In the framework of (1.3.1.2) where a linear model is fitted to
the data, Grizzle
~~:
unrestricted MLE,
g, and then estimate the parameters
(1969) estimate
!.(~
by replacing :!2: by its
(~
of the model
by ordinary weighted least squares procedures applied to
f
(p)
txl -
•
x
txu
~
uxt
(1. 3. 2.5)
•
17
This procedure yields the familiar weighted least squares estimate
(WLSE),
~,
~
of
namely
= (X'S-lX)-l x'S-l !.(E)
b
(1.3.2.6)
uXl
where
S
= HV(E)H' = sample estimate 'of var(!.(E»
txt
with
V(E)
sample estimate of var(E)
rsxrs
••
Of
H
= (
txrs
1.3.3
k
(!V
)
OTT..
~£.
1J
Test Statistics
A number of asymptotically equivalent test statistics have been
proposed for tes ting the hypotheses specified by (1. 3. 1. 1) and
(1.3.1.2).
Asymptotic equivalence requires that, regardless of whether
the null hypothesis holds, the probability of any two tests being
r
contradictory tends to zero as N
~
r.
n.
is a constant.
_
co
with
10
i=l
These test statistics for (1.3.1.1) fall into three
essentially different classes.
They are
(1) Pearson's chi_square statistic
•
A
r
s
X2 = r.
r.
P
i=l j=l
(n
- n.10 TTij )
ij
n.
10
TT ..
1J
2
(1.3.3.1)
•
18
(2) (a) Neyman-Pearson's likelihood ratio chi_square statistic
x~
=
r
L
s
L[_2tn(
(1. 3.3.2)
i=l j =1
(b) Minimum discrimination information statistic
r
. 2
~
s
n ..
~
c
1J
L [2N n .. t n
)]
1J
i=l j=l
P ij
XI = L
(1.3.3.3)
(3) (a) Neyman's chi_square s tatis tic
r
s
L
L
n.
10
i=l j=l
•
(1.3.3.4)
n ..
1J
(b) Wald's statistic
2
XW
=
(!.(E) , (HV(E)H')
-1
(!.(E)
(1.3.3.5)
where £.' !.(E), V(E), H, £.' and N are defined in Section 1. 3.2, the n ij
are any BAN estimates of the
TI.
oJ
1J
and the
are the minimum discrim_
TI . .
1J
ination information estimates obtained by minimizing (1.3.2.1).
For testing the fit of the linear model (1.3.1.2), the usual
analysis of variance error sum of squares is used,
X; = SS(f(n)
=
X~
= (!.(E)
~.~.)
'S-l !.(E)
(1.3.3.6)
where Sand b are defined in Section 1.3.2.
•
Then, given that the model
adequately fits the data, tests involving contrasts of the model
•
19
~
parameters (C
cxu
sums of squares,
X~
=
uXl
0) are produced by usual analysis of variance
cxl
~'~'}
= SS(Cl!. = 0) = (C!:>' [C(X' S
_1
X)
-1
C']
_1
(1. 3.3.7)
(Cb)
where C is a matrix of constants of rank c < u.
2
2
2
Under H ' Neyman (1949) has shown that X ' ~R and X , using any
p
N
ol
2
BAN estimates of the n .. , are asymptotically x variates with t degrees
1J
of freedom (D.F.) and hence are asymptotically equivalent provided that
n.
.~o ) i ;:; 1, 2, ... ) r, remain constant as N -
(i)
(ii) f
•
00
J
k (!9 = 0, k = 1, 2, ... , t, has at least one solution such
that n
ij
> 0 for all i, j.
Bhapkar (1966) has shown that
XW2
(Wald's statistic) for testing linear
2
hypotheses in categorical data is algebraically identical to X when_
N
ever
X~ is defined, .!:..=.., whenever all the n .. are positive.
1J
for testing non_linear hypotheses,
linearization
~
is identical to
~ using Neyman's
technique on the hypothesis cons traints,
Kullback (1959) has shown that
xi
Similarly,
!.(~ ;:;
O.
(the minimum discrimination informa_
tion statistic), under H ' is also asymptotically
ol
x2
with D.F.
=t
so
that all of these tests used for H
are asymptotically equivalent when
ol
using the appropriate estimates of the individual cell probabilities.
In the context of the linear model as in (1.3.1.2),
the tests are
derived by conventional methods of weighted least squares and hence are
•
the same as Neyman's chi_square tests when translated into constraints.
2
Under H ' X: for testing the fit of the model is asymptotically X
02
•
ZO
with D.F.
= t-u,
and, given the model,
model parameters is asymptotically
hypothesis, C.!!. =
xZ with
for testing contrasts of the
D.F.
=c
under the null
~.
1.4
1.4.1
XcZ
The General Linear Model
Notation and Assumptions
For'a large portion of this research, unless stated otherwise, the
underlying probability model is that defined by (1.Z.3.1) with the
notation (see Grizzle
=!
al., 1969) referring to the expected cell
probabilities and hypothetical data shown in Table 1.4.1.1.
Table 1.4.1.1
Expected cell probabilities (cell frequencies) for the
standard contingency table
••
Populations
(fac tors)
1
Response Categories
1
Z
Till
Tl
(n
Z
Tl
(n
r
ll
)
Tl
Zl
Zl
(n
)
(n
Totals
s
lZ
Is
)
(n ls )
Zs
)
(n
)
rS
rs
lo
)
(n
Zo
)
1
TI
(n
•
Zs
(n
1
Tl
ZZ
ZZ
1
Tl
lZ
)
(n
ro
)
•
21
Let
... ,
TTr l' TTr 2"'" TTrs )
(1.4.1.1)
TT' ~ .. JTT')
= (TT'
-1' -2'
_
r
_, where Pij
p' = TT'
= unres tric ted MLE of TT'
lxrs
!!:"E.
... , r',
i = 1, 2,
0
=
11
o.
(1.4.1.2)
10
j = 1, 2,
... ,
s
(1.4.1. 3)
where
any function of
••
~
having first and second partial
derivatives with respect to the
k = 1, 2, ... ,
IT
ij
where
t < r(s-l)
f'
lxt
-TT·1TT.
1
1S
V(TT.) = var{p.)
-1
,~
1
= --'
O.
10
(symmetric)
TJ. (l-TT
1S
is
)
(1.4.1.4)
,
= V(p.) = V(TT.)
-1
•
= sample estimate of V(TT.)
-1
-1
!!:i =£.i
(1.4.1.5)
•
22
.
= V(p) = block diagonal matrix with diagonal blocks
V
rsxrs
(1.4.1.6)
1
of the form
n.
10
where D = diagonal (Pil' Pi2'
p.
...
)
Pis)
-1
Ofk(~
H
= H(P) = (
Cln ..
txrs
s
(1. 4.1. 7)
)
1J
~E.
= HVH' = sample estimate of var(f)
(1.4.1.8)
txt
It is 'assumed that the f
k
(~
are functionally independent and
s
_.
independent of the constraint,
L n,.
j=l
HVH' are of full rank.
= 1,
for all i, and hence Hand
1J
If some of the n
ij
0, they will be replaced
by lis so that S will be of full rank.
1.4.2
Methods of Inference
Assume that
!(2i>
the rank of X is u.
=
X£. where the fk's are possibly non_linear and
For the no factor, multi_response case
(~.~.,
one population problems), the relevant hypotheses are those of various
types of independence and are tested using
Xi
=
SS(!(~
which} under H :
o
f(n) is linear,
:
•
(1.4.2.1)
= 0)
!(~
~.~.,
= ~ is asymptotically X2 with D.F. = t.
f(n)
An where
If
•
23
a
(1) rs
a(2)rl ... a(2)rs
A =
txrs
a
(t) rs
,
~(2)
=
--
a'
-(t)
is a matrix of arbitrary constants of rank t, then (1.4.2.1) is more
explicitly given by
(1.4.2.2)
A second class of functions often arising in categorical data problems
is the class consisting of the logarithmic functions which can be
expressed by
!.(~
= K .(,n(A~ where .(,n(A~ denotes the vector of natural
(or Naperian) logarithms of the elements of An, and
-
matrix of arbitrary constants of rank v < t.
statistic for this case is given by
•
K
vxt
=
(k
) is a
Oly
The appropriate test
•
24
x~ = SS(f(n) = K ~(A::9 = 0)
(1. 4.2. 3)
2
Under H0' -1,
x is
where
asymptotically X2 with D.F.
= v.
It should be noted that caution must
be taken in constructing A so that no element of the vector, AE; is
zero.
Most frequently, A
=I =
(identity matrix) corresponding to
tests for various types of interaction.
replaced by lis, each element of I~ (=
Since empty cells are to be
£?
will be positive so that no
difficulty is encountered with this common application of the loga_
rithmic functions.
••
For simplicity, consider, as an example of the preceding, the test
of independence for the two_way case cited previously and suppose
jl = 1, 2; j2 = 1, 2. .Then
is given by
fen) = tn (
so that
A
•
= 14 = identity
K = (1, _1, _1, 1)
matrix
•
25
Pn (l-P n
V
~
V(E)
)
I
-P n P l2
-P n P2l
-P n P22
P l2 (l-P I2 )
-P l2 P2I
-P l2 P22
P2 1(l- P2l)
-P 2l P22
~N
(synnne tric)
P22 (1-P 22 )
where
D ~ diagonal (PII' P12' PZI' PZZ)
••
and hence
~
1
__1_, __
1_, _1_)
(PU'
PIZ
P21 · PZZ
1 1
1
1
1
+ + + --)
N P11
P1Z
PZ1
PZZ
~ ~--
and the test statistic with D.F.
~ ~
•
SSC!.(n)
~
K .f.n(A!!l
~
0)
~
1 is given by the familiar
•
26
For the uni_ or multi-factor, uni_ or multi-response case (i.eo
several population problems), X is known and of rank u < t.
J
A test
of fit of the model !(.!!.l = Xl!; is given by the residual sum of squares
(1.4.2.4)
where b is the vector that minimizes (! _ X~ 'S-l(! _ Xb) and is given
by
(1.4.2.5)
If the model fits the data,
x:
2
is asymptotically X with O.F.
Recalling that for the linear case, ! =
logarithmic case, !
••
= K ~n(A£?,
S
A~,
S
.
= AVA',
= KO-lAVA'O-lK',
=t
- u.
and for the
explicit expressions
for the test statistic given in (1.4.2.4) can easily be obtained for
these two classes of functions.
Given that the model fits, various hypotheses concerning
constraints on the model parameters,
!'
may be of interest as}
for
example, .significance of differential hospital effects and differential
surgical procedure effects relating to severity of the dumping syndrome
for the example in Grizzle· et al. (1969).
H :
o
C~ = ~
X~
A test of the hypothesis,
is produced by
=
SS(C~
(1.4.2.6)
= 0)
where C is a (cxu) matrix of arbitrary constants of rank c
2
2
H , X is asymptotically X with D.F.
o
c
•
= c.
~
u.
Under
Again, explicit expressions
for (1.4.2.6) can easily be obtained for the linear and logarithmic
classes of functions.
•
27
The two factor, one response example in Grizzle
~~.
(1969)
involving severity of the dumping syndrome provides a good illustration
of the linear model treatment of the several population problems and
thus will not be repeated here .
••
•
•
28
2.
MIXED CATEGORICAL DATA MODELS
2.1
Introduction
In considering two_way contingency tables, two types of models
have been identified (see Bhapkar and Koch. 1968a):
the "no factor,
two response" situation in which neither margin is fixed and the
"one factor, one response ll situation in whicJ;1 exactly one margin
is fixed.
For the first case, the hypothesis of primary interest is
one of independence or no association between the responsesj in the
second case, the hypothesis of primary interest is one of homogeneity
over the factor levels.
However, there exists a third type of model which will be called
••
a "mixed categorical data model of order 2" for the. case of two_way
tables.
The experimental situation corresponding to such a model
involves exposing each of n randomly chosen subjects from some
homogeneous population to both levels of a binary factor
(~:~:'
control vs treated) and classifying each of the two responses into one
of r categories.
The resulting data, which will be assumed to follow a
multinomial distribution, can then be represented by an r x r con_
tingency table.
Thus, this is a categorical data version of the
classical matched pairs design.
As such, it has been studied in a
somewhat different context recently by Miettinen (1968, 1969, 1970),
among others.
In addition, the methods of Mantel and Haenszel (1959)
and Mantel (1963) are also of interest for this experimental situation.
Al though these authors deal wi th appropriate s ta tis tical procedures for
•
the analysis of certain types of mixed categorical data models of
order 2, they do not indicate the way in which such contingency table
data can be analyzed as a special case of a unified general approach.
•
29
One such methodology which will be the basis of inference in this
research is that of Grizzle
et~.
(1969).
As a result, test statis-
tics are derived through weighted least squares analysis of certain
appropriately formulated linear models·as indicated in Section 1.4.2.
~.~.,
Alternative methods are also appropriate in certain contexts,
those of Lewis (1968) and Goodman (1970) which are based on maximum
likelihood and that of Ku and Kullback (1968) based on minimum
discrimination information.
To apply the methods of Grizzle, Starmer, Koch (GSK), it is first
necessary to recognize the underlying factor_response structure for the
data.
Since only the sample size n is fixed, this is a special case of
the "no factor, two response" model.
••
However, the hypothesis of
interest is one of homogeneity over the factor levels;
~.!.,
no difference between the effects of control and treated.
there is
This
hypothesis can be formulated specifically as an hypothesis of marginal
symmetry in the two_way table.
One of the principal purposes of this research is to demonstrate
that tests of marginal symmetry are not only applicable to mixed
categorical data models, but are the only tests directed at the ques_
tion of primary interest associated with such experimental situations.
Responses from the same subject (or units in the same matched pair) to
different factor levels are obviously associated since they have the
same initial source in common.
Thus)
the traditional hypothesis of
independence which is usually applied to "no factor)
tables is not of much interest here .
•
two response"
•
30
Finally, it is important to recognize that if the r categories
.can be described in
te~s
of some quantitative scale, the question
of principal interest is one of equality of mean scores over the two
marginals.
This hypothesis bears a definite resemblance to the
relationship tested by the classical matched pairs t_test used with
continuous data.
2.2
Mixed Models in General
For mixed model experiments in the continuous case (cf. Scheff~,
1959), each of n randomly selected subjects responds to each of d
treatments 'exactly once.
The outcome of the experiment constitutes an
observation matrix composed of n response vectors) each of length d.
••
The hypothesis of interest in mixed models is the equality of treatment
effects (or homogeneity over the factor levels).
The following basic assumptions are made (see Koch and Sen, 1968):
A.I
The n response vectors are mutually independent.
A.2
Each response vector has a continuous d_variate c.d.f.
(cumulative distribution function).
A.3
For any subject, the joint distribution of any linearly
independent set of contrasts among the d observations is
diagonally symmetric.
For the normal (or parametric) case} an additional assumption is
reqUired, namely,
A.4
The n error vectors have a common d_variate nonmal
distribution, N (~, E).
d
If it can be assumed} in addition) that
A.5
•
The subject effects are additive, i.e., the joint distribu_
tion of the i_th subject effect and i_th error vector is the
same for all subjects)
•
31
2
the appropriate test procedure is based on Hotelling's T (cf. Case III
of Koch and Sen).
A.6
If furthermore: it can be assumed that
There is compound symmetry of the error vectors (~.~.,
E
=
pu2J + (l-p)u2 I
where J is a (d x d) matrix
of l's and I is the (d x d)
identity matrix),
the usual analysis of variance F is appropriate (Case IV of Koch and
Sen).
Case I (which requires A.l _ A.4 only) and Case II (which
requires A.l _ A.4 and A.6) are not of interest in the parametric case.
Non_parametric analogues of these cases are given in Koch and Sen
(1968) .
The mixed models primarily considered in this research may be
viewed as belonging to the categorical data version of Case III for
••
2
which the Hotelling T _statistic is applicable in the continuous case
with multivariate normal responses..
As it is assumed in Case III that
there are no subject effects, it is appropriate to assume that a single
multinomial distribution characterizes the entire contingency table.
If assumptions A.5 and A.6 are not tenable in the categorical data
framework (Case II), it is useful to adopt a univariate point of view
as in the matched case_control situations considered by Mantel and
Haenszel (1959) and Mantel (1963).
The principal difference with Case
III is that here the data are represented by n two._way contingency
tables with d rows and r columns where the response of each subject
to each treatment is classified making the total frequency nd rather
than n.
•
Cases I and IV are of rather limited interest in the categorical
data framework.
Table 2.2.1 summarizes the three most important cases
along with the appropriate test procedures.
•
32
Table 2.2.1
Summary of assumptions and appropriate test procedures for
various types of mixed models
Case Assumptions
(A.4 for
Continuous Data
Non_Parametric
Tests
Parametric
Tests
Ca tegorical
Data Tests
Parametric
Tes ts)
II
a
A.l _ A. 3
Generalized
Friedman ~ s X2
A.6
test
III
A.l-A.3,
A.5
IV
A.l - A. 3,
A. 5, A.6
Test based
on Rotelling
T2'
d_variate signed
Analysis of
variance
F_test
Sen's aligned
rank Wilcoxon
test
block rank
Mantel-Raenszel
test (case_control)
Tests of Mantel
(matched sets)
Linear model tests
of mixed categor_
ical data models
Classical X2 test of
homogeneity
test
••
a Case I is omitted since it is of little interest here.
•
The extension to generalized mixed models will be considered at a
later time.
(~.~.,
With this extension, analyses of llmatched see' experiments
the study reported in Miettinen (1969) where each propositus was
matched with four controls) will readily be performed using the basic
GSK approach.
The present discussion is limited to more standard
situations.
2.3
Mixed Categorical Data Models of Order 2
First let us consider the following data (see Table 2.3.1) from
case records of the eye-testing of 7477 employees in Royal Ordnance
•
factories which have been used as an illustrative example for tests of
marginal symmetry by Stuart (1955), Bhapkar (1966), Grizzle et a1 .
•
33
Table 2.3.1
Vision records for women employees of the Royal Ordnance
factories
Left Eye Grade
Highest
(1)
Second
(2)
Third
(3)
Lowest
(4)
Total
Highest (1)
1520
266
124
66
1976
Second (2)
234
1512
432
78
2256
Third (3)
117
362
1772
205
2456
Lowest (4)
36
82
179
492
789
1907
2222
2507
841
7477
Right Eye Grade
Total
(1969), and Ireland
••
~~.
(1969).
These data may be viewed as arising
from a mixed model in which each of n = 7477 subjects is classified
according to both levels of a factor (eye position) with the corre_
sponding cell frequencies being assumed to follow a multinomial
distribution.
As such, the hypothesis of principal interest is that
the distributions of grades of vision are the same for both eyes,
marginal symmetry.
~.~.,
The values of the test statistics (each approxi-
mately distributed as
x2
with D.F. = 3 under the null hypothesis
obtained by the preceding authors are approximately equal (namely,
11. 96 (Stuart), 11. 97 (Bhapkar), 11..98 (Grizzle
(Ireland
~ ~.».
~ ~.),
and 12.00
Thus, the marginal distributions of eye grades are
significantly different.
Looking at the problem of marginal symmetry in a more formal way,
•
let n .. , denote the probability that a subject is classified according
JJ
to the j-th grade category on the right eye and the j'_th grade
•
34
category on the left eye.
Then the hypothesis of marginal symmetry
may be written as
(2.3.1)
j - j' = 1,2,3,4
4
where
TT
=
jO
4
E
and
E n. _I
j '=1 JJ
Following the GSK
j=l
approach, form the unrestricted maximum likelihood estimates
" where n .. , represents the number of subjects
Jj
JJ
j
j
'_th
cell.
in the
From these, the desired estimates of TT
and TT '
jO
oj
(n
jj
./n) of TT.
are obtained by taking the appropriate marginal sums of Pjj"
4
4
and
E
j' =1
••
E
j=l
p ..
JJ
I'
linear model is fitted to the PJ'o and p
respec tively.
1
0
0
~l
P20
0
1
0
~2
P30
0
0
1
~3
Pol
1
0
0
P02
0
1
0
P03
0
0
1
(2.3.2)
=
by weighted least squares, the residual sum of squares
D.F.
(~S)
with
= 3 is the GSK test statistic for (2.3.1) which is the same as
that of Bhapkar (1966) (except for round._off error) .
•
If the following
oj'
Plo
E
namely,
•
35
Alternatively, if scores aI' a , a , a are assigned to the
2
4
3
response categories, the hypothesis of equality of mean scores over the
two marginals may be written as
4
Ha :
-
4
E a.T!
=
j=l J jo
E a ,n .
j'=lj
OJ
I
(2.3.3)
=" Cl'LE •
One test statistic for this hypothesis is the residual sum of squares,
~,
with D.F. = 1, which is obtained by forming
Cl'RE =
.i a j P jO
J=l
4
and
••
E a ,p .,
j' =1 j
oJ
and fitting the linear model
by weighted least squares.
a 2 = 2, a
3
both eyes.
= 3, a
4
For the data in Table 1, the scores a
= 1,
= 4 have been assigned to the grade categories for
The residual sum of squares is then given by
=
(2.3.4)
4
E
j=l
\
j
7477(17012 _ 17236)2
7477
7477
=
=
•
l
11.97 .
•
36
The similarity of the test given in (2.3.4) to the matched pairs
t_test may be seen clearly by writing down the frequency distribution
of the difference (d .. ,
JJ
= a.
J
- a.,) between the right eye grade and the
J
left eye grade (see Table 2.3.2).
(_244/7477)
~
The average difference,
d,
is
-.03 and its estimated standard error, s_, is 0.0087.
d
Hence
-
(d / s_)
d
2
= (_3.46)
2
= 11. 97
which is identical to the value obtained for
Table 2.3.2
••
•
(2.3.5)
2
\i.
Frequency distribution of the difference between right and
left eye grades
Difference
(d ,)
Frequency
(f ,)
-3
66
_2
202
_1
903
0
5296
1
775
2
199
3
36
jj
jj
•
37
2.4
Mixed Categorical Data Models of Higher Orders
In general a "mixed categorical data model of order d ll involves
exposing each of n randomly chosen subjects from some homogeneous
population to each level of a factor with d levels and classifying each
response into one of r categories.
The resulting data, which will be
assumed to follow a multinomial distribution, are then represented by
an r x r x .•. x r contingency table of d dimensions.
This, then, is
the categorical data version of the classical mixed model with n
subjects and d treatments (see Section 2.2).
For the case of higher order mixed categorical data models with
binary response levels (r;2), Bhapkar (1965) has formulated a test in
terms of d_th order marginal symmetry.
••
in Grizzle
=!~.
This procedure is illustrated
(1969) for data indicating the responses of 46
subjects to each of the drugs A, B, and C as either being favorable
or not favorable (see Table 2.4.1).
Thus, this example is a mixed
categorical data model of order d ; 3 with drug type being the factor
with levels A, B, and C.
Table 2.4.1
Responses of patients to drugs A, B, and C
Response to A
Response to B
Response
to C
•
The hypothesis of interest is equality of the
Favorable
Fav.
Not Favorable
Not Fav.
•
Total
Fav.
Total
Not Fav.
Total
Fav.
6
2
8
2
6
8
16
Not
Fav.
16
4
20
4
6
10
30
Total
22
6
28
6
12
18
46
•
38
probability of a favorable response for each of the three drugs; i.e.,
equality of the corresponding marginal distributions or marginal
symmetry of the three one_way margins.
the GSK approach yields
~S = 6.58
significant difference (a
= .05)
For the data in Table 2.4.1,
with D.F.
=2
indicating a
between drugs.
In order to extend the theory given in the previous section to the
general case, let
(j1' j 2"
TT
.
. denote the probability of cell
j lJ 2 ···Jd
j~ =
.. ,jd) where
d_dimensional table.
1, 2, ... , r with
~ =
1, 2, ... , d
in a
Now, the hypothesis of total symmetry is only of
limited practical interest since it is the first order marginal distributions (and the corresponding tests of marginal symmetry) that provide
the clearest indication of the relative effects of the various treat_
••
ments or factor levels .
Thus, if
ability of level j for the
d
E
V= 1
~_th
$~j
represents the marginal prob_
dimension and is defined as
r
E
J· v
=1
TT..
(2.4.1)
•
J 1 J 2 ·• ·Jd
vF~ j~=j
the hypothesis of d_th order marginal symmetry for the one_way margins
may be written
H :
o
=
=
j
•
(2.4.2)
=
1,2,. H,r .
An appropriate A_matrix is then constructed which generates the
•
39
1
i:-
d
E n
E
n \1=1 j =1 j1 j 2··· j d
estimates P~j of ~~j' namely, P~j = -
\I
\lf~ j~=j
where n is the total sample size.
For example, if d = 3 and r = 3, we
have
1 111 111 1 1 0 0 000 0 0 0 0 0 0 0 0 0 0 000
o0
o0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 110 0 0 0 0 0 000
0 0 0 0 0 0 0 0 0 0 0 0 0 000 111 111 111
1 110 0 0 0 0 0 1 110 0 0 000 1 1 1 0 0 0 0 0 0
A
=
000111000000111000000111000
9x27
000000111000000111000000111
100100100100100100100100100
••
010010010010010010010010010
001001001001001001001001001
r
Since the ~~j necessarily satisfy restrictions of the type E ~
= 1,
"
j=l ~j
the
A~atrix
~ =
1, 2, ... , d; j = 1, 2, ... , (r-l).
is modified to genera te p ~j for only the combinationa
3, 6, and 9 would be deleted from A.
Thus, for this example, rows
To the resulting estimates
the following linear model ia fitted by weighted 1eaat squarea
•
P~j'
•
40
Pu
1
0
0
P12
0
1
0
P 1, r-l
0
0
1
P2l
1
0
0
0
1
0
0
0
1
1
0
o
o
1
o
o
0
1
P22
"'2
'" r_l
(2.4.3)
=
E
P 2,r_l
••
"'1
p
d,r_l
The residual sum of squares,
2
~S)
provides the test statistic for the
hypothesis of marginal symmetry in (2.4.2).
2
XMs
When (2.4.2) holds,
2
is approximately distributed as X with D.F. = (d_l)(r_l).
Alternatively, if scores at'
8
2
,
"'J
a
r
are assigned to the
response categories, the hypothesis of equality of mean scores over the
d marginals may be written as
H :
where
r
~
j=l
aJo$"Jo, Ii;
(2.4.4)
=
"'2
o
The hypothesis (2.4.4) may be
= 1, 2, ... , d.
':>
r
tested by obtaining the estimates
•
and fitting the linear model
~ aJop"Jo, ~ =
j=l
':>
•
41
1
0'1
1
0'2
E
(2.4.5)
~
1
O'd
by weighted least squares.
residual sum of squares,
As before the test statistic is the
~, but with D.F. ~ (d-l).
The tests given here for (2.4.2) and (2.4.4) appear similar to
the classical ones of homogeneity in
tables.
P~j
1I
0ne factor} One response"
However, here the underlying table is d_dimensional and the
are correlated with each other while in the "one factor, one
response" tables the underlying table is two_dimensional and the
••
are independent of P~'j for all ~' f~.
P~j
Hence, the two procedures are
quite different with respect to the underlying estimates of variance
and covariance.
Finally, it should be mentioned that other types of hypotheses of
marginal symmetry
(~.~.,
tested by this approach.
equality of all two_way margins) may be easily
All that is required is choosing an appropriate
A_matrix, and then fitting a linear model specific to the hypochesis of
interest.
However} such tests are not as readily interpreted experi-
mentally in the context of mixed models as are the tests involving one_
way margins.
2.5
Split Plot Contingency Tables
A somewhat more complex example than those considered previously
•
is provided by the following data (see Table 2.5.1) of Lessler (1962)
which have been described in Bhapkar and Koch (1968b).
In this case,
•
42
Table 2.5.1
Responses of males to culturally masculine and
anatomically feminine objects with weak intensity
Response at 1/5 Sec.
Response at 1/100 Sec.
A
Response at
1/1000 Sec.
M
F
Total
Response at
1/1000 Sec.
M
F
Total
F
M
Total
M
F
Total
171
18
6
12
177
30
7
7
7
56
14
63
191
93
189
18
207
14
63
77
284
184
38
10
14
194
52
7
7
20
114
27
121
221
173
222
24
246
14
134
148
394
M
F
C
Total
there are
••
0.0
groups of male subjects involved in a study of the nature
of sexual symbolism in objects:
Group A members were not told the
purpose of the experiment while Group C subjects were.
Each subject
was asked to classify as masculine or feminine (M or F) certain objects
which had been previously characterized as being culturally masculine
and anatomically feminine but with weak intensity at three different
exposure" rates:
this is a
lI
1/5 second, 1/100 second, and 1/1000 second.
Hence,
one factor, three response" situation with Group being the
factor and the M or F classification at different exposure rates being
the responses.
However,
since it is of interest to compare the
responses at different exposure rates, it is apparent that the data
from either one of the Groups A or C by themselves is a mixed cate_
gorical data model of order 3.
Thus,
the data for the combined groups
may be referred to as a "split plot categorical data" model.
•
Hence,
the hypotheses of interest are equality of group (whole plot) effects,
•
43
equality of exposure rate (split plot) effects, and "no interaction"
between exposure rates and groups.
These hypotheses may be readily
formulated using the GSK approach.
First, define the relative
frequency vectors for the two groups as follows:
p' = (171, 18, 6, 12, 7, 7, 7, 56) (1/284)
1x8
!:..A
p' = (184, 38, 10, 14, 7, 7, 20, 114) (1/394)
(2.5.1)
.!:.C
1x8
p' = (!:AJ
1x16
P1;)
In order to obtain estimates of the probability of masculine classifi-
cation at the exposure rates 1/5 second, 1/100 second, .1/1000 second,
respectively, in Group A and in Group C, define
••
A
=
6x16
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
0
1
0
1
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
0
1
0
1
0
1
0
(2.5.2)
Then fit the following linear model by weighted least squares
•
,
0.729
0.715
••
0.673
where
0.624
0.599
0.561
and
~A
is the linear effect of
is a mean effect in Group A, AA
1
is the non_linear effect of exposure in
exposure in Group A, AA
2
Group A, and ~C' AC ' AC are similarly defined quantities in Group C.
1
2
The hypothesis of no (group x exposure rate) interaction may be
formulated as
Ho :
AC '
1
which may be tested by using the following hypothesis matrix
•
(2.5.4)
•
45
roo
c ;
L
2x6
1
0
o
1
o
o
_1 OJ
(2.5.5)
0-1
in accordance with the principles of weighted regression analysis.
The resulting
xi ; 0.20
with D.F. ; 2 indicates that the data are
consistent with this hypothesis of no interaction.
Hence, the model
may be revised as follows:
E(.0
••
1
1
1 _1
~
1
1
0
2
Y
1
1 _1 _1
Al
;
1 _1
1 _1
1 _1
0
(2.5.6)
A
2
2
1 _1 _1 _1
where
~
is an overall mean, y is the differential group effect, 1.
1
is
the linear effect of exposure, and A is the non_linear effect of
2
exposure.
The hypothesis of no group main effects may be formulated
as
(2.5.7)
and tested by using the hypothesis matrix
C
[0
1
0
0]
(2.5.8)
lx4
while the hypothesis of no exposure rate main effects may be formulated
•
as
(2.5.9)
•
46
and tested by using the hypothesis matrix
C
=
2x4
[:
0
1
0
0
0
1
]
The resulting test statistics are
.,
14.55 with D.F.
= 2,
respectively.
(2.5.10)
x~ = 11.57
with D.F.
=1
and
xi =
Hence, it is apparent that there
are both significant differences between ·the two groups and among the
three exposure rates.
In fact, the differences among the three expo_
sure rates are linear as can be seen by testing separately for the
linear and for the non_linear effect of exposure.
Thus, we see that split plot contingency tables can be analyzed
by the linear model approach as readily as the more standard types of
••
•
multi-factor, mUlti_response experiments treated in Grizzle et a1 .
(1969) .
•
47
3.
MORE GENERAL CATEGORICAL DATA MODELS FOR
THE CASE OF SUPPLEMENTED MARGINS
3.1
Introduction
As has been indicated previously, the approach of Grizzle
~ ~.
(1969) is based upon applying weighted least squares analysis to
certain appropriately formulated linear models similar to those arising
in univariate and/or multivariate analysis.
As such, the following
assumptions are made:
(i) These linear models are completely specified by a single
design matrix.
(ii) The data is complete in the sense that every experimental
unit is classified according to each of the d dimensions
of the table.
••
(iii) All cells are assumed to have positive probability of
occurrence .
In many practical problems, some of these assumptions must be relaxed.
Goodman (1968) and Bishop and Fienberg (1969), among others, have
considered the implications of haVing
~
priori certain empty cells.
Their approach utilizes multiplicative models with maximum likelihood
estimation.
The relaxation of (i) is partially accounted for in the
GSK approach by allowing various functions,
abilities.
!(~,
of the cell prob-
The relaxation of (ii) is the primary cOncern of this
research.
In the usual categorical data situation, there is complete
information available for each entry in the corresponding contingency
table.
For instance, in the two factor (subject group, anatomical
meaning), two response (classification at exposure rates of 1/5 second
•
and 1/1000 second) example in Bhapkar and Koch (1969b), each entry has
•
48
been classified according to all four dimensions.
However, in many
situations, the experimenter is not equally interested in the response
variates or perhaps addicional information on one particular dimension
could be obtained relatively easily and economically.
Alternatively,
perhaps, as described in Kleinbaum (1970) for the continuous case,
there is missing data for some individuals in the sense that classifications on only d' < d dimensions have been made.
Retaining and
utilizing this partial information is distinctly different from the
"apparently" similar problem of empty cells (see Assumption (iii».
In this research, the additional information is obtained by design
rather than by chance and as such is referred to as llsupplemental
information" and the corresponding margins of the contingency table as
••
"supplemented margins I I .
This supplemental information is used to
improve the precision of the estimates of the marginal probabilities
over the. supplemented dimension(s») and it is also incorporated into
the estimation of the individual cell probabilities in order to improve
certain tests of hypotheses.
Examples where such supplemental information might arise include
questionnaire surveys similar to the recently_completed national
census.
In such
surveys~
a basic set of questions.
every respondent would be expected to answer
In addition, a certain proportion would be
asked to respond to a second set of questions judged to be less important or more difficult to answer than the questions in the basic set.
Supplemental information would derive from the answers to questions
in the basic set from those respondents not requested to consider the
•
second set of questions.
Similarly, in automobile crash investigations,
•
49
the investigating officer might routinely obtain certain biographical
information
(~:~'.'
race, sex, age) for all drivers but, due to the
difficult conditions of the interview, obtain, only for a specified
subset, additional information such as sobriety of the driver, purpose
of the trip, etc.
In this case, the supplemental information would
consist of the answerS to those biographical questions for those
drivers who were not examined in
detail~
A number of additional
examples are given later in this chapter.
It should be noted here that the use of supplemented margins in
the analysis of contingency tables resembles the technique of "double
sampling" as described in Cochran (1963).
In "double sampling," a
preliminary sample (cf. supplemental information) is used to estimate
••
the distribution of an auxiliary variable (cf. marginal distribution
of particular response or combination of responses) and a second
stratified sample is used to estimate the characteristic of interest
(cf. cell probability).
Previous work in this area appears to be limited to but one
author.
Blumenthal (1968) treats essentially the same problem as
considered in this research although he restricts attention to one_way
tables where each of the I main categories has
sub_categories of classification.
J.~
1
i
; 1, 2, ' .. J I,
The major difference in the two
approaches lies in Blumenthal's assumption that a sample belonging in
cell (i,j) has probability a .. of being only "partially classified" as
1J
opposed to having a secondary sample for which supplementary informa-
tion on a subset of the study variables is ·obtained.
•
Imposing certain
simplifying assumptions for this one_way situation, Blumenthal obtains
•
50
MLE's of the individual cell probabilities as well as the bias and
variance of these estimates.
He concludes by suggesting the extension
to double dichotomies which is treated in the present research (see
Chapter 4) from the point of view of "supplemented margins" rather than
"partially categorized ll data&
It should be noted that supplementation is considered only for
margins that are not fixed
margins).
(~.~.,
response margins rather than factor
Supplementing a factor margin is of no interest since that
margin being considered fixed a priori yields complete information
about its composition.
In order to see more clearly the implications of using supple_
mental data in the GSK approach, the recent work of Kleinbaum (1970)
••
for the case of normal random variables will be considered .
3.2
More General Linear Models in the Case of Normal Random Variables
In the case of normal multivariate theory where the standard
multivariate
(SM)
linear model is given by
E(y)
Nxp
x
~
(3.2.1)
Nxu uxp
with corresponding variance matrix} E} two basic assumptions are made.
These assumptions are that
(i) Each of the p response variates is measured for each of the
N experimental units (~.~., each of the N response vectors
is complete).
(ii) The design matrix, X, is the same for each of the p responses
(i.e., the same blocking system is used on each response
variable) .
•
•
51
Examples where certain of these assumptions do not hold are
numerous.
For example, in psychological testing with batteries of
(say) 20 different tests, each subject might take a combination of only
5 tests since taking many tests might tend to bias the later results.
Hence condition (i) is violated.
categorical data framework.)
(Analogous situations occur in the
Assumption (ii) is violated in many
econometric problems involving sets of different but correlated
regressions as in certain time series analyses.
Likewise, in certain
agricultural experiments, efficiency might dictate the use of various
blocking systems on different response variates leading to a number of
design matrices for the p responses.
Various authors, including most recently Afifi and E1ashoff (1966,
••
1967, 1969a, 1969b), have treated the case where one
the above assumptions is violated.
~
the other of
The General Incomplete Multivariate
(GIM) model pertains to the situation where (i) does not hold.
A
special case of the GIM mode1.is the Hierarchical Incomplete Multi_
variate (HIM) model where the missing data has a simple monotonic
structure.
The Multi_Design Multivariate (MOM) model allows the
various response variates to have different design matrices and hence
allows for the relaxation of (ii).
K1einbaum (1970) has allowed for the simultaneous relaxation of
these basic assumptions in his development of the More General Linear
Multivariate (MGLM) model.
The SM linear model as well as the GIM,
HIM, and MOM models arise as special cases of the MGLM model.
•
However,
since the Growth Curve Multivariate (GCM) model is not a special case
of the MGLM model, the adjective "More " rather than "Most" has been
used.
•
52
The most useful form of the MGLM model is the "vector version"
given by
~l
~l
0
WI
-
W
2
~
E(z) = E
b
(3.2.2)
= W~
=
~
W
p
0
z
-i'
NTxMT MTxl
NT"l
with var(::) =
where.=..(/.(. = 1,2, •.. , p, is a (N.(. x 1) vector
(I
comprising all observations on the t-th response variate in the entire
experiment where 0 < N.(.
••
matrix for
to
~
~,
where NT
~
and
=
~
N, W.(. the corresponding (N.(. x M,J design
the (M.(. x 1) vector of parameters corresponding
p
p
E Nk
and
k:l
k=l
full rank,
~ ~
~.
It is assumed that W is of
NT' and hence W.(. is of rank M.(..
For this assumption to
be met} it may be necessary to reparameterize the model.
Note that,
if (i) h~lds, N.(. = N; if (ii) holds, W.(. = WI for all .(..
The test statistic proposed by Kleinbaum for testing linear
hypotheses for the MGLM model is a Wald statistic with BAN estimators
of linear functions of the "treatment" parameters and consistent
estimators of the variance parameters.
More specifically, the
hypotheses to be tested are specified by the following set of constraints on the model parameters:
H :
•
o
o
cxl
(3.2.3)
•
53
where C is a matrix of arbitrary constants of full rank, c < M •
- T
number of applications, C has the form, C
where C is a (c
t
MLE's of
C~
= diagonal
In a
(C ' C ' ..• , C )
l
2
p
t x Mt ) matrix of arbitrary constants of rank c t
~
M
t
.
(say.• CQ are obtained using the likelihood function for
,,,,-1 _1
The consistent estimators of the variances, (W.. W) ,
the MGLM model.
of the estimated parameters are theoretically obtained by inverting the
derived Fisher's information matrix.
In actuality, Kleinbaum obtains
the information matrices for the GIM and MOM models which are con_
siderably less complicated, and, from them, infers the solution for the
general case.
Inserting these estimates into the appropriate quadratic
form. gives the Wald statistic for testing (3.2.3), namely J
••
2
~(K)
=
(CP '[C(W'O-lW)-lCTl(Cp.
Under the null hypothesis,
2
(3.2.4)
2
\'(K)
is asymptotically X with D.F.
= c.
Kleinbaum has shown, in addition, that the Wald statistic obtained for
the SM linear model as a special case of the MGLM model is equivalent
to Hotelling's trace criterion
(=!.
the previous discussion of mixed
categorical data models).
(Several detailed examples of the application of the MGLM model
can be found in Kleinbaum (1970) and thus will not be repeated here.)
3.3
Notation for the More General Categorical Data Model
In order to extend the GSK approach to the analysis of contingency
tables with supplemented margins} several assumptions must be made.
First, it is assumed for this generalization that there is no inter_
•
action between subjects and the presence of supplemental data
(~.~.,
•
54
for the responses classified, the joint marginal distributions for
subjects with complete data and for those with supplemental data are
the same).
Also, it is assumed that the supplemental data
arises by design
(~.!.,
fewer than d measurements are made on certain
of the experimental units due to cost considerations).
Conditional on the number of individuals with various types of
complete or supplemental data being fixed by design
~
priori, the
basic probability model is the product of several multinomials with
marginal and/or cell probabilities as fundamental parameters.
The details involved in modifying the GSK computer program to
handle the general case of'Lncompletel data will be carried out at a
later time.
••
However) the "vector version ll of the extended model will be
illustrated by considering a 2 x 2 x 2 contingency table with all pos_
sible degrees of supplementation and therefore a sizable total number
of experimental units of varying degrees of categorization.
There are
three basic types of experimental units:
( 1) The n
000
units with complete information on the three
responses A, B, C (see Table 3.3.1).
(2) The n
00
*
units with information on A and B but not on C
(see Table 3.3.2); similar tables apply to situations
with "missing" data on A and correspondingly on Bo
(3) The
00** units wi th informa tion on A only
(~.:'='o) "missing"
on B, C, BC, AB, and AC) (see Table 3.3.3); similar tables
apply to situations with information on B only and on conly .
•
•
ss
Table 3.3.1
Frequency distribution for experimental units with
complete information
B
1
2
C
c
1
1
n
2
n
1
2
n
lll
n
1l2
2
n
12l
122
A
n
211
I
I
I
n
2l2
'"". .. /
nOlO
'"
".
, .-
...
n
#
.. .. ..
OOl
Table 3.3.2
n
22l
n
222
I
020 I
I
"
n
+
OO2
Frequency distribution for experimental units with
information on A and B only
_.
B
1
2
1
n ll *
n 12 *
n lO *
2
n2l *
n 22 *
n20 *
nOl *
n02 *
nOO *
A
Table 3.3.3
Frequency distribution for experimental units with
information on A only
1
A
2
•
•
56
Also, let
(3.3.1)
,
= (n'
n
n
n'
nr
n
n'
)
- '-( C) , -( B) , -(A) , -(BC)' -(AC) , -(AB)
l
.:!.G
lxZ6
r
l
(3.3.2)
with
n'
lX-8
,
~(C)
lx4
and similarly for .:!.(A)' .:!.( B)
••
~(BC)= (n l **,n 2**)
lx2
and similarly for ':!'(AB)·' ~(AC) .
3.4
Estimation
As was pointed out in Chapter I, maximum likelihood estimation for
complete tables usually involves the solution of a system of simul_
taneous equations which are non_linear and consequently difficult to
solve.
The situation is considerably more complicated in the more
general case with supplemental data.
As a result, maximum likelihood
estimation of the cell probabilities will be used only in the simpler
cases
(~'~')
two_way tables as in Chapter 4) where certain explicit
solutions can be obtained.
•
For most complex tables, iterated weighted
least squares estimation will be used to obtain estimates of the cell
•
57
probabilities (or functions thereof).
Here, if the iterated weighted
least squares estimates (WLSE's) are unique, then these WLSE's will be
MLE's.
The theory for maximum likelihood estimation and weighted least
squares estimation for supplemented 2 x 2 tables is presented in detail
in Chapter 4.
For the 2 x 2 x 2 case cited in the previous section,
the table is sufficiently complex that weighted least squares estimation
would be used to derive the required estimates of the cell probabili_
ties, "ijk' i,j,k = 1,2, and their asymptotic covariance matrix, V,
using the model
(3.4.1)
••
•
where
~
snd " are as defined previously and
•
58
rn
000
0
-
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
n
0
0
n
0
0
0
n
0
0
0
0
noaa
0
0
0
0
0
0
n
0
0
0
0
0
0
n ooo
0
0
0
0
0
0
0
0
n ooo
0
0
0
0
0
0
0
0
0
0
000
000
000
•
n
••
XG
26x8
=
00*
00*
0
0
noo*
°00*
0
0
0
0
0
0
noo*
n
0
0
0
0
0
0
noo*
°00*
°0*0
0
°0*0
0
0
0
0
0
0
0
0
°0*0
0
°0*0
0
0
0
0
0
0
°0*0
0
0
0
0
0
0
0
0
0
0
°·*00
0
0
0
0
°*00
0
0
0
°0**
00*
°0*0
0
°0*9
0
°0*0
0
0
0
0
°*00 0
0
0
0
0
°*00
0
°*00
0
0
0
°*00
°0** °0**
°0** 0
0
0
0
0
0
0
0
°0** °0** °0** °0**
°*0*
°*0* 0
0
°'*0*
°*0*
0
0
0
0
°*0* °*0*
0
0
°*0*
°*0*
°**0 0
°**0 0
°'**0
0
"**0
0
0
0
°*00
•
n
000
'-
°**0
°*00
°·**0 0
°**0 0
°**0.
(3.4.2)
•
59
3.5
Tests cf Hypotheses
Testing various hypotheses for the more general case of
supplemental data using a modification of the Grizzle, Starmer, Koch
(GSK) program is essentially a two_stage procedure.
The first stage
replaces the first portion of the GSK program while the second stages
are, for the most part, the same.
More specifically, the first stage consists of the following:
(i) Input the observed frequencies
(.=..~., ~
for the 2 x 2 x 2
case) .
(ii) Obtain estimates of the cell probabilities corresponding
to the unrestricted MLE's for the case with complete data:
(1) If the table is fairly simple, obcain the MLE's (as in
••
(4.2.2.7».
This is the closest analogue to the un_
restricted MLE's for the complete data version and is
most consistent with tests using Wa1d statistics.
(2) For most ·cases, compute the WLSE's (iterated if
required) using a formulation similar to (3.4.1) for
the particular table of interest.
(3) The most general (and heuristic) approach is to use
WLSE's of
A~
(thereby estimating functions of the cell
probabilities).
This is particularly appropriate in
the mixed model situation.
'Note that, since in the complete case,
tests (~e£o) those involVing
Wa1d statistics, X2, etc.) using MLE's or WLSE's are asymptotically
N
eqUivalent, it seems reasonable that the same should apply to the
•
present case of supplemental data •
•
60
A
(iii) Obtain an estimate, V (or AVA' for (3», of the asymptotic
covariance matrix using the estimates derived in (ii) (see
Section 4.5 for certain details).
The second stage consists of the following:
(i) As in the existing computer program, input the following
matrices as required:
A
= contrast
x = design
C
K
matrix
matrix
= hypothesis
matrix
(for logarithmic functions of the cell probabili_
ties)
(ii) Use the adjusted (for supplemental data) estimates of the
••
cell probabilities as obtained in (ii) of the first stage
in place of the estimates given in (1.4.1.2) along with
the estimate of the asymptotic covariance matrix in (iii)
of the first stage in place of the usual estimate given in
(1.4.1.6).
Proceed with the testing of various hypotheses
exactly as in the existing GSK framework.
3.6
3.6.1
EXamples Using the More General Model
Perinatal Health Survey EXample
To illustrate the more general categorical data model and its
applicability, consider the following medical example involving 456
premature live births.
To be considered premature in this situation,
the birthweight could not exceed 2000 grams (or 4.4 Ibs.) nor could the
•
period of gestation exceed 34 weeks.
One of the main questions of
•
61
interest was that of an association between the 5.. minute Apgar inde.x
and the serum bilirubin level for premature infants..
The 5_minute
Apgar index is a composite of heart rate, respiratory effort, reflex
irritability, muscle tone and color observed five minutes after birth.
Since each component receives a Score of O} 1) or 2-, the index range
is 0_10 with the lower values representing the healthier infants.
A serum bilirubin count exceeding 1.0 mg per 100 ml indicates'a malfunctioning kidney in the premature infant which is usually accompanied
by considerable jaundice.
If allowed to persist, this condition can
lead to damage of the central nervous system.
Thus, the recognition of
an association between the Apgar index and the serum bilirubin count in
premature infants could help in the detection of potentially serious
••
conditions by, for example, the immediate recognition of a jaundiced
infant}
in turn} suggesting heart or respiratory troubleo
As indicated by Table 3.6.1.1, this is a "no factor, twO response"
situation with supplementation on both margins.
The large number (153)
of infants with no serum bilirubin reading might be accounted for by
advanced jaundice for these premature infants making the reading unnecessary.
If this is indeed the case, the large proportion with low
Apgar scores provides evidence contrary to the hypothesis of a positive association between the two responses (which is only slightly
supported by the 279 individuals with both measurements available).
Alternatively} perhaps those infants with low Apgar scores and
complexions were not examined for serum bilirubin levels.
no~al
Quite
possibly (from external eVidence), they would have had low bilirubin
•
levels and thus supported the association hypothesis.
At any rate, it
•
62
would seem that methods which utilize the considerable information in
the one_way margins are most appropriate for this example.
Table 3.6.1.1
1
Serum bilirubin readings and 5_minute Apgar scores for a
sample of premature infants
;
Serum Bilirubin
Reading
5_Minute
Apgar Score
0_1.0
I
I
I
Sub_
total
1.1 and
above
I
I
,I
Supplementation
on Apgar
Score
Total
,
0-6
35
57
92
7_10
75
112
187
110
169
279
Sub-total
------------------- -----------------
••
Supplementation on
Bilirubin Reading
Total
bee~
As has
__ 'On
I
I
I
117
209
36
223
I
I
I
I
I
__
153
10 .. _.•
·_n
_'.
432
___ n _ n
I
I
11
13
24
121
182
303
---"---,.'
I
I
I
I
:
456
stated, the test of interest is that of independence of
the 5-minute Apgar score and serum bilirubin reading for premature
infants, ~'':':}
H :
o
f (TT)
In(
o
(3.6.1.1)
where
TT
ij
•
= Pr[Apgar
score i, serum bilirubin reading j; i,j
= 1,2]
.
In the two_stage test procedure, the appropriate method of estimation
•
63
in the first stage is either (1) or (2) as given in Section 3.5.
= 1,2,
former, iterated MLE's, Pij' i,j
In the
are obtained using the rela-
tionship (4.3.1.3) with the computer program given in Chapter 7.
The
resulting MLE's are given by
= 0.1877
= 0.2960
P21 = 0.2092
= 0.3071
(3.6.1.2)
But these are exactly the iterated WLSE's from (2).
mate of the asymptotic covariance matrix for
~
The required esti-
is obtained by substi_
tuting the MLE's in (3.6.1.2) into (4.3.2.1) which is the inverse of
the information matrix.
••
0.483
v
Thus, the required estimate is given by
-0.259
_0.108
_0.117
0.612
-0.126
-0.227
o .522
_0.288
=
(symmetric)
For the second stage
(~.
(3.6.1.3)
0.632
Section 1.4.2 for the corresponding test
of independence for ordinary 2 x 2 tables using the GSK approach),
the following matrices are
A= I
4
required~
= identity matrix
K = (1, _1, _1, 1)
(3.6.1.4)
V as given in (3.6.1.3)
•
o = diagonal
(Pll' P12' P21' P22) .
••
64
The program then provides a test of independence which is the test of
fit of the model, f(.!9 =
o.
The test statistic is given by
(3.6.1.5)
(f(~n2
v
= -(K-O--"l""'o--lrK')-
where
f(V = In Pn - In P12 - In P2l + In P22
••
with V = (vkt)
k,t = 1,2,3,4.
x2 _variable with O.F. = 1.
Under H ,
o
xi
is asymptotically a
For this example,
x2 =
I
2
(-1.661 +1.204+1.561_1.171)
(.0191) - (-.0121) _ (_.0171) + (.Oll?)
(3.6.1.6)
= 0.0748 .
Thus, using the simplified double dichotomy to illustrate the procedure, it would appear that the serum bilirubin reading and the Apgar
score are independent .
•
•
65
3.6.2
Hand Example
In physical therapy, it has been hypothesized that sweat and
sensation in the hand are linked
(~.~.,
if, sensation is also present).
In order to investigate this conjec_
sweat is present if, and only
ture, 35 patients at the Hand Rehabilitation Center at the University
of North Carolina at Chapel Hill were studied.
These patients all had
severed nerves in the hand which was examined.
Because the ring finger
is influenced by both the median and ulnar nerves while the other
fingers by only one, data was obtained for this finger for only the 12
patients where both nerves were affected by the particular injury (with
the exception of patient 35 whose ring finger had been amputated; see
Table 3.6.2.1 presenting the results reported in Lachenbruch and Perry
••
Thus, this is a mixed categorical data model of order d ; 5
(~.
eye grade example in Chapter 2) with binary (r ; 2) response categories
(agree or disagree) for each finger.
In addition, there is supplemen_
tary (or incomplete) information for 23 patients representing, in this
particular example, the bulk of the available information.
Clearly, outcomes "1" and "4" agree with the hypothesis of asso_
ciation of sweat and sensation while "2" and "3" disagree with this
hypothesis.
Thus, the results of the investigation can be summarized
in terms of "agree with the hypothesis ll or "disagree with the
hypothesis" as shown in Table 3.6.2.2.
•
3
Lachenbruch, P. A. and Perry, J. Testing equality of proportion
of success of several correlated binomial variates. Submitted to
Biometrics, 1970. Department of Biostatistics, University of North
Carolina at Chapel Hill.
•
66
Finger data indicating classification with respect to
Table 3.6.2.1
the presence or absence of sweat or sensation
Patient
Nerve
Affected
1
2
3
4
5
6
7
8
9
U
C
C
C
C
M
U
M
C
U
U
10
.e
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
C
C
U
M
C
M
U
M
M
M
M
M
M
C
C
C
U
M
M
U
M
C
U
C
where
e
1
2
3
4
Thumb
Index
1
2
1
3
1
4
1
4
1
1
1
2
2
1
3
4
3
1
1
4
1
2
1
1
1
4
2
1
4
2
1
4
2
1
2
1
4
3
1
1
4
1
4
1
1
1
2
2
1
4
4
4
1
2
4
1
4
2
2
2
4
4
1
4
1
1
4
2
1
2
Finger
Middle
Fifth
1
4
4
2
1
4
1
4
1
1
1
2
2
1
4
4
4
1
2
4
4
1
2
1
2
4
4
1
4
2
1
2
2
1
2
= sweat and sensation present
= sweat present, sensation absent
= sweat absent) sensation present
= SWeat and sensation absent
U=
M =
C =
ulnar (thumb, index, middle, ring)
median (fifth, ring)
combination of both U and M (ring finger)
3
4
4
2
1
1
4
1
4
4
4
2
2
4
1
4
1
4
1
1
1
1
1
1
2
4
4
4
1
1
4
1
2
2
2
Ring
4
4
3
1
1
2
2
4
2
4
4
2
•
67
Table 3.6.2.2
Frequency distribution of complete and supplemental data
by agreement (A) and disagreement (0) for each finger
complete
(12 patients)
••
Supplemental
(23 patients)
Total
(35 patients)
Finger
A
0
A
0
A
0
Thumb
6
6
18
5
24
11
Index
7
5
19
4
26
9
Middle
7
5
18
5
25
10
Fifth
7
5
20
3
27
8
Ring
7
5
-
-
7
5
Total
34
26
75
17
109
43
The main questions of interest are (a) whether the marginal
probabilities of disagreement are the same for the five fingers apd, if
so, (b) whether this common value is near zero
an association of sweat and sensation).
(~.~.,
whether there is
Therefore, the hypotheses of
interest are .
H
oa
where
TT20000
TT
20000
=
=TT
02000
=TT
00200
=TT
00020
=TT
00002
(3.6.2.1)
Pr [disagreement for the thumb]
(similarly for the index, middle, fifth, and
ring fingers, respectively)
= 0
•
given that H holds
oa
(where
=
TT
20000
=
(3.6.2.2)
=
TT
2).
0000
•
68
Since the results for one finger cannot be assumed independent of
those for the other fingers, the classical Pearson
x2 _test
of
homogeneity for the proportions is not appropriate for (3.6.2.1).
How-
ever, the extended linear model for supplemental categorical data is
appropriate for this situation and uses all of the available data.
For the estimation involved at the first stage, method (3) of
Section 3.5 is most appropriate since interest lies in the marginal
probabilities of disagreement rather than individual cell probabilities.
Thus, WLSE's of the five marginal probabilities of disagreement (which
are functions of the individual cell probabilities) are obtained using
the model
••
E(A 1 !!d
9x48 48xl
=
(3.6.2.3)
1!.
5xl
where
fu
=
(~', ~(R»
(3.6.2.4)
1x48
with
"
"
"
n' = (nll111,n11112,nl1121,···,n12222,n21111,n21112,···,n22222)
1;("32
= Pr
where
!: !.,
•
"
n'
[level i of thumb, j of index, k of middle,
~ of fifth, and m of ring finger,
i,j,k,~,m = 1,2]
= (4,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,2,0,0,0,0,0,0,1,
0,0,0,0,0,0,0,3) (1/12)
•
69
.........
...
!!(R) = ("11110'"11120'"11210""'"12220'"21110'"21120""'"22220)
1x16
...
,
"",,.
where
=
2
I:
i,j,k,.t
=
1,2
m=l
,
!:.!., !(R)
= (12,2,1,0,1,0,2,0,3,0,1,0,0,0,0,1)(1/23)
from the supplemental information;
"
"
"
("20000' 02000' 00200' 00020' "
.)
00002
(3.6.2.5)
where
••
"20000 = Pr ["disagreement with the hypothesis" for the thumb]
=
2
2
I:
I:
2
I:
2
I:
j=l k=l .{.=1 m=l
and similarly for the remaining fingers;
AU
A1 =
·9x48
0
~
whero
•
°
A(i>
(3.6.2.6)
•
70
o0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 111 111 1 1 1 1 1 1 1 1 1 1
o0
0 0 0 0 0 0 1 111 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
A( ) =
0 0 0 0 1 1 1 1 0 0
5x32
o0
o
•
o0
1 1 1 1000011 1 1 0
o0
0 1 1 1 1
1 1 0 0 1 1 0 0 1 100 1 1 0 0 1 1 0 0 1 1001100 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 10101010 1
o0
0 0 0 0 0 0 1 1 1 1 1 1 1 1
0000111100001111
A(R) =
4xl6
0011001100110011
0101010101010101
1 0 0 0 0
••
.'
Xl .9x5
o
1 0 0 0
0
o
0
o0
1 0 0
1 0
0 0 001
(3.6.2.7)
1 0 0 0 0
0 1
o0
o0
1 0 0
o0
0 1 0
0
The weight matrix for the least squares procedure is given by the
.
inverse of (AI v
.
n
.:.:<;
!!d)
ma trix for (A 1
•
Ai)
(~.~.,
where
the inverse of the estimated covariance
•
71
o
v( )
TT
=
v
~
(3.6.2.8)
o
48x48
with
1
=
v( )
nOOOOO
TT
'
(D, TT
'
TT TT')
32x32
where nOOOOO = 12;
••
v (R)
TT
is given in (3.6.2.4);
1
= - - - (D,
TT
nOOOO *
TT
-(R)
16x16
where n OOOO *
= 23;
!i:(R) is given in (3.6.2.4).
A test with D.F. = 4 of (3.6.2.1) is obtained (as in GSK) using
the WLSE's,
C
a
4x5
~
of
~
0
0
0
_I
0
1
0
0
_1
0
0
1
0
_1
0
0
0
1
_1
from the first stage with the contrast matrix
=
since H
can equivalently be represented by C P
oa
a-
'.f
= O.
-
More
specifically, the test statistic is given by
•
(3.6.2.9)
•
72
where
2
and, under H ,x is asymptotically a
oa
a
x2_variab1e
with D.F. = 4.
Provided that H is not rejected, a test for Hob (as given in
oa
(3.6.2.2»
is obtained by using the principle of conditional error.
Thus, an appropriate test of Hob with D.F.
=1
is given by
(3.6.2.10)
2
where X , is analogous to (3.6.2.9) with C , replacing C where
b
b
a
~,
••
= IS
(the identity matrix).
Under Hob
(!.. =..'
the common probability of disagreement is zero),
x2_variab1e
the hypothesis that
Xb2
is asymptotically a
with D.F. = 1.
The CATLIN version of the GSK program can accommodate this example
by considering the data as arising from two populations, namely the 12
individuals with complete information and the 23 individuals with
information on only four fingers.
The following matrices comprise the
basic input for the program:
~
2x32
=[
4 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 3]
12 2 1 0 1 0 2 0 3 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
!
•
A =
2
9x64
A( )
0
0
0
A(R)
0
•
73
where A(,) and A(R) are given in (3.6.2.6).
However, for these data,
A
the estimate of the covariance matrix for
singular.
A2~'
namely
A2V~A2
' is
This is due to the middle, ring, and fifth fingers being
perfectly correlated for those 12 individuals with complete information.
Although the data could be manipulated to avoid this difficulty,
the preceding should adequately describe a test procedure which
appropriately utilizes all of the available information.
It should be noted in addition that having both nerves (median
and ulnar) severed appears to have a quite different effect on sweat
and sensation than having only one nerve severed (see Table 3.6.2.2).
For the 12 patients with both nerves severed, there is strong evidence
••
against the hypothesis that sweat and sensation in the hand are linked .
A notably different distribution of proportion of "disagreements with
the hypothesis" occurs for those 23 patients with only one severed
nerve.
This raises the question about the appropriateness of the
uniform hypothesis for all five fingers, especially in light of the
different physiology of the ring finger.
Nevertheless, the example is
still useful in showing the potential of the more general categorical
data model.
3.6.3
Drug Example
This is an example of a mixed categorical data model of order 2
with supplemental information on both margins.
Suppose a certain
pharmaceutical company wished to compare the effects of drugs A and B.
Of the 80 patients available for study, 50 patients (Group I) received
•
both drugs under essentially the same conditions including a suitable
•
74
time delay between applications (with the order of presentation of A
and B random); 16 patients (Group II) received drug A only; and 14
patients (Group III) received drug B only.
In each case, the patient's
response to the particular drug was observed and recorded as
"slight", or "strong".
II
none " ,
The results are given in Table 3.6.3.1.
The
main question of interest is whether the distribution of strength of
response is· the same for both drugs.
Table 3.6.3.1
Frequency distribution of responses to drugs A and B
I
[Group I]
••
Response to B
Sub_
total
None
(1)
Slight
(2)
(3)
(1)
4
6
0
10
Slight (2)
4
8
9
21
Strong (3)
0
5
14
19
8
19
23
Response to A
Response
to A only
[Group
II]
I
Strong
I
I
I
I
!
Total
t
None
Sub_total
------------------Response to B only
[Group III]
I
4
14
I
I
7
28
I
I
5
24
I
I
I
I
----------------------
50
16
I
_______
1___________
I
66
--------
,
I
Total
3
4
7
14
11
23
30
64
,
I
I
:
80
For the estimation involved at the first stage, methods (2) and
(3) of Section 3.5 are appropriate.
TT'
= (TT
•
lx15
TT
TT
TT
TT
TT
TT
TT
11' 12' 13' 21' 22' 23' 31' 32' 33
lx9
~
TT
For the former,
=
(~' '~(B)'~(A»
)
(3.6.3.1)
•
75
where
n'
lx9
= (4,6,0,4,8,9,0,5,14)
~B)
= (4,7,5)
Ix3
= (3,4,7)
!.'.(A)
Ix3
0
0
0
0
0
0
0
0
0 50
0
0
0
0
0
0
0
0
0 50
0
0
0
0
0
0
0
0
0 50
0
0
0
0
0
0
0
0
0 50
0
0
0
0
0
0
0
0
o 50
0
0
0
0
0
0
0
0
0 50
0
0
0
0
0
0
0
0
o 50
0
0
0
0
0
0
0
0
0 50
16 16 16
0
0
0
0
0
0
0
0
0
50
••
X
=
15x9
0
0
0 16 16 16
0
0
0
0
0
0 16 16 16
14
0
0 14
0
0 14
0 14
0
0
0 14
0 14
0
0
0 14
with estimated covariance matrix for
•
0
0
0 14
0
0
0 14
~
(3.6.3.2)
•
76
v
v( )
Q.
Q.
Q.
V(B)
q
0
Q.
=
15x15
(3.6.3.3)
.
V(A)
0
where
..
V( ) = n 00 (D._ TT TT')
TT
9x9
with
n
00
D.
= 50;
.
(TT
= diagonal
4
6
(50' 50'
TT
••
.
V(B)
= n
0*
3x3
with
.
= diagonal
(D
•
-
TT
.
ll , TT12 ,···, TT33 )
TT
~o
... ,
~I)
-.0
~o
no *
= 16;
D.
= diagonal (';10' TT ' TT )
20 30
.
.
TT
-.0
V(A)
•
TT
TT' )
5
= n
-
*0
(D
•
TT
-0.
:
n*o
7
(IT ' IT' TO)
3x3
with
4
= diagonal
= 14;
-0.
.
-0.
14)
50'
•
77
3
;:: diagonal (14
J
4
7
14 ' 14)
Thus, the first stage estimates of
~
could be obtained as in Example 1.
However, two adjustments must be made, namely,
(i) Reparameterization;
(ii) Modification for zero cells.
The first can be accomplished by deleting the nintn, twelfth, and
fifteenth rows and last column of X with the corresponding adjustments
made on
~
~
.
and V,
For the second adjustment, replace each zero
cell frequency by a small positive quantity (!.~., lis
••
s
= number
of responses for Group I).
= 1/9
where
Note, that, if there were a
zero marginal frequency, it could be replaced by the corresponding 1/3
in this case.
With these adjustments made, replace the usual first
stage estimates from the GSK program by. the resulting .first stage.
WLSE's (~and
s~etry
V),
formulate the hypotheses of interest, !.~., marginal
as specified by
H :
o
i
=
1,2
(3.6.3.4)
and proceed as in the second stage of GSK (as indicated. for the
previous examples).
Alternatively, for this example, it appears reasonable to assign
scores to the three levels of the two responses and to test for
equality of mean scores over the two marginals, namely
•
H :
o
(3.6.3.5)
•
78
where a and a are the mean scores for the responses to drugs A and B,
B
A
respectively.
Therefore, assigning scores a
1
= 1, a
2
= 2, and a
3
= 3
to the response categories for both drugs A and B and using initial
relative frequencies
(!!d
in place of the observation vector
(~,
the
first stage estimation involves the process referred to in (3) of
Section 3.5.
Specifically, WLSE's, a
A
and
aB,
are obtained using the
model
E( A
~ = Xl a
1
4x15 15xl
4x2 2xl
(3.6.3.6)
where
as in (3.6.3.1)
.e
(3.6.3.7)
with
=
,
1
50 (4,6,0,4,8,9,0,5,14)
1
n' ;:: IT (4,7,5)
~o
n'
-<l.
1
;:: 14 (3,4,7)
a'
1 1 1 2 2 2 3 3 3
1 2 312 3 1 2 3
A =
°°°°°°
°0 0 0 0 0
1
4x15
•
°°°°°°°°° 1 2 3 °°°
° ° ° ° ° ° ° ° ° ° ° ° 123
(3.6.3.8)
•
79
=
1
0
o
1
1
o
o
1
(3.6.3.9)
.
The inverse (V) of the weight matrix is given by AIV
Ai where
!!c
V(
·
V
!!c
0
-
)n
=
V(B)
(3.6.3.10)
n
15x15
Q
-e
V(A)
n
with
V( )
9x9
1
= -2- V( )
n
n00
.
•
V(B)
3x3
·
V(A)
3x3
as given in (3.6.3.3)
1
=--Z V(B)
TT
no*
1
= -2- V(A)
n
n*o
Finally, a test with D.F.
=1
for (3.6.3.5) is obtained (as in GSK)
using &A and &B from the first stage with the contrast matrix,
C = (1, _1).
•
E(ii)
Alternatively, if the linear model
where
(3.6.3.11)
•
80
is fitted using weighted least squares, the residual sum of squares
provides the required test statistic with D.F.
= 1.
It should be noted that, for the test of equality of mean scores
over the marginals, empty cells are no problem since, as long as the
marginal totals are non-zero 7 the required estimate of the covariance
matrix is non-singular.
Again, the CATLIN version of the GSK program can perform the
analysis.
Here the data is considered as having arisen from three
populations (Groups I, II, and Ill).
Thus, the following matrices
comprise the basic input for the program:
460 4 8 9 0 5
••
n
3;9
=
4 750 0 0 0 0
[
30040070
1 1 1 2 2 2 3 3 3 0 0 0 0 0 0 0 0·0 0 0 0 0 0 0 0 0 0
1 2 312 3 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 000
A2 ·=
4x27
o0
o
0 0 0 0 0 0 0 1 2 3 0 0 0 0 0 0 0 0 000 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100 200 300
The model as given by (3.6.3.6) fits the data
fit gives
x: = 0.30 with
namely, Ho :
~A = ~B'
D.F.
= 2).
(~.~.,
the test of
To obtain the test in (3.6.3.5),
use Xl as given by (3.6.3.9) and C = (1, -1).
The corresponding test statistic with D.F.
=1
is found to be
XM2 = 2.1
indicating that the mean scores for drugs A and B are the same .
•
•
81
4.
ESTIMATION THEORY AND HYPOTHESIS TESTING FOR TWO_WAY
CONTINGENCY TABLES WITH SUPPLEMENTED MARGINS
4.1
Introduction
In this chapter, the estimation theory for two-way tables with
supplemented margins is investigated in detail since it is possible to
obtain certain explicit results for maximum likelihood estimation in
this relatively simple case and to compare these results with other
competing methods of estimation.
Various properties of the resulting
estimates are presented and the more general categorical data model
approach to hypothesis testing is illustrated for a 2 x 2 table with
both margins supplemented.
••
4.2
4.2.1
2 x 2 Case with One Margin Supplemented
Notation
Consider the following extension of the usual 2 x 2 contingency
table (see Table 4.2.l.ll, where TI
n
00
, i,j
.
= 1,2,
ij
, TI
io
' TI
are as defined previously.
Oj
' n
Then, n
observations are available for response A with n
i
J
i = l}2.
TI *
i
= Pr
ij
i
*
, n
o
*
io
' n
oj
' and
additional
responses at level
For simplicity it is assumed throughout this research that
[level i of response A for the supplemented dimension]
:;;
n.
i
LO
= 1,2
although a number of alternative assumptions on the supplemental prob_
•
abilities are possible
(=.. [.,
2
..
n i * = E a LJ
j=l
Tr ••
LJ
f
rr.
LO
where
•
82
Table 4.2.1.1
Expected cell probabilities (cell frequeocies) for the
2 x 2 case with one margin supplemented
Respoose A
Levels
Respoose B Levels
1
2
TT
TT ll
I
I
Sub_total
TT
12
I
I
Supplemeoted
Margio
Total
TT
1*
10
1
(0
11
)
(0
TT
12
)
(0
)
(N lo )
(0 1*)
TT
20
TT
22
21
10
TT
2*
2
(021)
(0
22
)
(0
TT
02
TTo 1
20
)
(0
i
I
I
I
I
I
I
1
Total
••
(0
01
2
1:
= 1).
)
(0
02
)
(0
00
)
:
(N
2*)
2ci
1
(N)
(00*)
It will also be assumed for simplicity that
j=l
o<
TTij < 1
i,j = 1,2
o
so that Pr [no
10
= 0] = (1 _ TT. ) 00 _ 0
10
as
0
00
_~.
Thus, a total
sample of N individuals is taken with both responses observed for
subjects aod with information on respoose A only for
n
o
* =N-
0
0
00
00
individuals.
4.2.2
Methods of Estimation
Intuitive estimates of the individual cell probabilities using the
•
supplemeotal ioformation are giveo by the followiog:
)
•
83
A
A
n ij = Pr [level i of response A, level j of response B)
= Pr [row i) Pr [column'j!row i)
•
N.10
i
~
J
1,2
j
n ..
( -2J.. )
n
n
00
1 +
n
0*
00
ratio of supplemented to
(
••
~
where n
ij
unsupplemented) x
unrestricted MLE
[
1_1_+~uc:.:n.::s.;:u",p",p.::l.;:e;::m:.:e:.:n:.:t:..:e.;:d;.....:i:.:n~1:,'t...h _r:.o:.w;;.......
ratio of total supplemented
+ to total unsupplemented
= 0 with probability approaching zero as n
But ';;ij is exactly the unrestricted MLE of n
before).
of the
This can be seen as follows:
two
ij
oo
.(say ';;ij = Pij' as
By virtue of the independence
00
0
hood function is given by
.(.
= n
00
2
= C1
•
where
2
n n
i=l j=l
2
n n (n .. )
i=l j =1
oj
becomes large.
samples (with sample sizes nand n *), the joint likeli_
2
1J
(n .. )
n ..
1J
.1J
n. .:
1J
n..
1J
2
n
i=l
J
, 2
• n0*' n
i=l
(n
io
ni
)
*
ni *
•
84
=
2
n
2
n
i=l j=l
2
n .~
iJ
n
i=l
2
2
Utilizing a Lagrangian multiplier for the restriction
E n
= 1,
i=l j=l ij
E
the log likelihood function (with the restriction) is given by
f =
2
E
2
E
i=l j=l
2
n.*.f.n n.
n.j.f.n
1
1
10
2
_ A( E E n - 1)
ij
i=l j=l
(4.2.2.1)
where
••
with corresponding normal equations
set
=
0
(4.2.2.2)
From (4.2.2.2), we have that
and, upon summing over j and then i, it follows that A = N.
i,j = 1,2
implies that
•
Now
(4.2.2.3)
•
85
(4.2.2.4)
Hence
2
E
n..
(..2:1.
N
j=l
01.*
+
tr
from (4.2.2.4)
(4.2.2.5)
But (4.2.2.3) also implies that
(4.2.2.6)
=
••
so that an explicit solution of the normal equations is given by
n ..
Pij
=
TT ••
1)
1)
=
N
n
=
N
=
-
-
ni *
Pio
ij
N
io
ni)(tr)
n
N.
10
ij
tr· n
from. (4.2.2.6)
from (4.2.2.5)
i,j
1,2
10
which is exactly the intuitive estimate given previously .
•
(4.2.2.7)
•
86
Al terna tively, let
n' = (nll,n12,n21,nl*)
lx4
=G
=
1T'
-s
(1T
11
, 1T , 1T )
12 21
lx3
n
00
0
0
n
0
0
n00
n o*
no*
0
=
X
4x3
0
00
0
Weighted least squares estimates,
••
=
V =
4x4
var(~
are obtained using the model
X!!s
(4.2.2.8)
-
Since
-ij ,
1T
1T
1T')
-s -s
o
=
0'
where
=
o
is a (3 x 1) vector of O's,
!!s
diagonal
, 1T , 1T )
12 21
D
(1T
11
the weighted least squares estimates,
!!S'
are given by
(4.2.2.9)
•
•
87
where
-
V
= V
-s
" =n
-s
1
n
--1
V
00
(D:. l + - - . J)
-s
"22
1
0
"
=
1
0'
no*i'ilo (l-Tilo)
with J a (3 x 3) matrix of l's.
Performing the indicated matrix
multiplication and simplifying yields
-
n ij "ij' : n ij ,"ij)
n
j
F j'
io
. •
(4.2.2.10)
which is in a format convenient for iterating.
!.~.,
in the right hand
side of (4.2.2.10), let
j
= 1,2
unrestricted MLE without supplemented margins
-(1)
obtaining "lj , j = 1,2.
Replace n(O) by n(l) in the right hand side
Ij
Ij
of (4.2.2.10) and continue the iteration until
-(k-l) I '
max (n
I -(k)
11 - n 11
I-(k)
I) <
n 12 - n-(k_l)
12
£
for some prescribed
li:.
From numerical work on the computer, this process appears to converge
•
quite rapid ly •
•
88
.
It should be noted that both ITij and Pij(~ "ij ~ ~ij) reduce to
the ordinary unrestricted MLE when n *
o
supplemental information).
(4.2.2.10)
(~.~.,
~
0
(~.~.,
when there is no
Also, Pij is a stationary solution of
the MLE's in (4.2.2.7) satisfy the relations
(4.2.2.10) for the weighted least squares estimates).
iterated WLSE's are unique, P
4.2.3
ij
~
"ij' i,j
~
Thus, if the
1,2.
Properties of the Estimators
In order to find approximations to the means, variances and
covariances of the MLE's, P
ij
~
i,j
1,2, it is convenient
to use the Taylor series expansion of Pij about the point
~
E (
Now
Pij
~
f (51?
n
~
io + n i *
n
N
n
ij
io
n00
+ no* qi*)
"N qij (1
nOD qio
l'
•
where
qio
~
qn + qi2
i,j
~
1,2
(4.2.3.1)
•
89
so that the Taylor series expansion (through the linear terms) is given
by
=
TI ..
lJ
(4.2.3.2)
= f(3J
°0* TTij
+ (1 _ -
N
- ) (q. ,-TT, .)
TT,
lJ lJ
10
i,j = 1,2
j
f
j' .
The MLE expressed by its Taylor series expansion (even including
quadratic terms) is unbiased.
••
Using (4.2.3.2), it can be shown that
n * 2 TT."
n
2 TT,'J' (l-TT,. ,)
-TT., 2 TTij , (l-TTij ,)
J + ( ..2:1.)
var(p. ,) = (3-) [( --=..L + 00) --,,,,,--_-,,-_
n
lJ
N
n io
no *
nOD
TTio
oo
TT. (l-TT. )
10
10
no*
=
TT, j (l-TT. ,)
1
lJ
+
N
i,j = 1,2
j
f
j'
(4.2.3.3)
This is exactly the asymptotic variance obtained by inverting Fisher's
information matrix) I, where
02
== ( _E(
I
3x3
where L(n; TT) denotes f as defined in (4.2.2.1) but without the
•
-
--<l
restriction.
It is also the asymptotic variance of the WLSE,
nij ,
•
90
namely, (X'V-lX)-l.
In addition, the MLE's satisfy the relations given
in (4.2.2.10) for the WLSE's.
Thus, it would appear that the MLE's and
iterated WLSE's are identical for the case of 2 x 2 tables with one
margin supplemented.
This is indeed the case if the iterated WLSE's
are unique.
The asymptotic covariances of the MLE's are also elements of the
inverse of the information matrix which is given by the following:
_1
"22
l+--+Ci
"11
1 +
1
Ci
r
r
"22
"12
1 + _.- +
••
Ci
(symmetric)
"11 (1_"11) +Yr
;
1
l
"12(1-"12) +Yr
if
•
r
°0*
'"20
10"20
n 00 "
no* "il"i2
noo . 'TTiO
l
-"12"2l
"21 (1-"21) +Yr
where
01
-"U"2l
-"11"12 - \1
(symmetric)
.'
(4.2.3.4)
1
r
i
;
1,2
2
•
91
which reduces to the usual covariance matrix when n
there is no supplemental information).
o
*
=
0
(~,!.,
when
Also depending on the sample
sizes selected (~.~., n oo and n o*) and the values of n , n , (j
ij
ij
!
j'),
Pij may be more efficient than the corresponding MLE, Pij(u)' for the
usual situation since
n
var(p .. )
1J
00
; tr
var(Pij(U»
(j!j').
+
Note also that the multiparameter extension for the Cram~r_Rao lower
bound could be used to determine if the MLE's are jointly optimal.
In
addition, since the MLE's are BAN estimators, they are consistent
(~.~.,
Pr [
I
p .. - n ..
1J
••
> s for i,j ; 1,2]
1
1J
- 0) and asymptotically
N-
efficient .
Alternatively, exact means as well as improved estimates of the
variances of the p .. , i,j ; 1,2, can be derived using conditional means
1J
and variances.
More explicitly, the following relationships will be
used:
(4.2.3.5)
var(p .. ) ; E[var(p .. I n. )] + var[E(p. j 1 n. )]
1J
1J
10
1
10
i,j;I,2.
Now, since (nil' n i2 ) is Multinomial (noo ; nil' n i2 ) and n io is
Binomial (n
I'
ni/niO ) .
00
; n. ), i t follows that (n ..
Thus
~o
1J
1
n. ) is Binomial (n. ;
10
10
•
92
E(p ij) = E[E(Pij
I n iO )]
n ..
n
+n.*
1
1.
= E[E( 0
n~J
N
n
=
E[ io
+n
io
n
••
P
ij
TTi/TTio
1.0
N
TT.
00 10
+n
(4.2.3.6)
]
°io
N
!:~.,
n.
TT
0*
n. +n TT.
0* 1.0
= E[ 1.0
=
I n io )]
1.0
0*
TT..
.22.]
TT.1.0
TT
ij
TT.
TT
io
N
1.0
is an unbiased estimator for
TT
ij
,
i,j = 1,2.
One component of the variance is readily obtained, namely
"
from (4.2.3.6)
ij
= ( TT )
NTTio
n
2
n
TT (1 - TT )
io
oo io
2 "i l o
N2 TTij TT
00
-io
(4,2.3.7)
i,j = 1,2
i
f
i'
The other component of var(p,.) is considerably more complex and
1.J
involves a minor approximation, namely the expectation of the recipro_
••
cal of the binomial random variable, n
io
'
This approximation involves
taking the expectation of the Taylor series expansion (through the
•
93
quadratic term) of f(n. ) =l/n.
10
10
about the point E(n. ) = nOOn1'0'
10
The resulting approximation is given by the following:
n.
E(..2:....) = E[
n.10
n
1
n.
00 10
10
_n
TT.
00 10
(nooniO )
2
i
= 1,2 .
(4.2.3.8)
Thus, omitting the considerable algebra involved,
(4.2.3.9)
••
+ (n *n.j,n. (n *n. +n
o 1.
1.0
0
1.0
i
,»]
0
i,j = 1,2
ifi', jfj'
so that, using the approximation mentioned previously,
(4.2.3.10)
•
Finally, combining (4.2.3.7) and (4.2.3.10) and simplifying, the
•
94
improved estimate of the variance is given by
var(p ..)';
~J
(4.2.3.11)
.
i, j = 1,2
if i'
j
f
j'
where the first two terms of (4.2.3.11) give the variance approximation
previously derived (see (4.2.3.3)).
(The covariance terms, although
more complicated, could be derived in an analogous manner.)
4.3
4.3.1
••
2 x 2 Case with Both Margins Supplemented
Methods of Estimation
The notation used for this case will be the obvious extension of
the notation used previously (see Table 4.2.1.1).
represent
the probabilities over the second
Thus,
~j'
~upplemented
j = 1,2,
margin where
it is assumed that n*. = n . + n
= n .; a sample of n*. responses are
J
2j
oJ
0
lJ
observed. for the second dimension of the table wi th n*. responses at
J
2
n
level j, j = 1,2; N
j = 1,2; and Nr = E Nio '
oj
oj + n*j'
i=l
2
so that
N = E N
oj
c
j=l
= Nr
+ n
*0
A number of intuitive estimates for the individual cell prob_
1..'
abilities were examined.
•
These included the following:
•
95
(i)
•
(r)
nij = WPij
+
where
(r) = Nio n ij
P1'J'
-N - n
r
io
with the following natural choices for w:
••
(1)
w
=
(2)
w
=
n o*
n o* + n*o
n
(iii)
·
ni {
+n
0*
(n oo + no) + (n oo + n*o)
=
N
r
N +N
r
c
~p(r)
=
(ii)
00
ij
ni *
n
n
ij
00
1+-n.
10
no*
1 +-n00
n*.
J
l+_
n
oj
n*
1+~
noo
i,j = 1,2
.
However, none of these estimates provides an explicit solution for the
MLE although numerical investigation suggests that the estimate in
gives a good approximation to the MLE .
•
•
96
Differentiating the log likelihood function (including the
restriction,
2
E
2
E
1) with respect to
i=l j=l
TI
ij
, i,j = 1,2, yields
the following normal equations:
i,j=l,2.
(4.3.1.1)
i, j ;;;; 1,2 .
(4.3.1.2)
But (4.3.1.1) implies that
Summing (4.3.1.2) over i and j (and changing the order of summation
••
where appropriate), it follows that A = N.
or,
Thus,
i, j = 1,2
(4.3.1.3)
i, j ;;;; 1,2
(4.3.1.4)
equ~valentlYJ
n ..
1)
N _
with either formulation requiring an iterated solution (see Chapter 7),
The method of iterative proportional fitting (IPF) would appear to
be a reasonable alternative to iterating either (4.3.1.3) or (4.3.1.4).
This intuitively_appealing method has been discussed and utilized by,
•
among others, Mosteller (1968), Goodman (1968) and Bishop (1969) and
•
97
is based on certain results on
by Birch (1963).
max~um
likelihood estimation derived
Basically, the IPF scheme manipulates the rows and
columns, respectively, of· the given table in order to satisfy condi_
.,
tions on the margins while maintaining the original association within
the table (as, for example, retaining the same cross_product ratio,
6
= In[(~11"22)/("12"2l)]'
in a 2 x 2 table).
That this method yields
MLE's of the cell probabilities is based.on the realization by Birch
that "the MLE's of the cell probabilities are determined uniquely by
the marginal totals being equal to the MLE's of their expectations'.'.
To apply IPF to the present problem, the question arises concern_
ing the correct marginals to fit.
It would appear that n * is relevant
l
only to estimation of "10 and correspondingly for n*l and that neither
••
is relevant to the association since, for each, only one response was
measured.
But this implies that the MLE for "10 should be the estimate
obtained for the case with one margin supplemented, namely,
Plo
= "10 =
(Nlo/N)
=
(n lo + nl*)/(noo + no*) and similarly for "01'
However, we have from numerical investigation
(~.~.J
the exact itera_
tion for the MLE's of the cell probabilities and, through the invari_
ance of MLE's, the MLE's of the marginal probabilities) that the Plo
and Pol (and resulting IS) selected for the IPF are not the MLE's of the
marginal probabili ties (and IS)!
Thus, since the proposed IPF scheme is
not based on MLE's, it will not give MLE's of the cell probabilities
(although this procedure does work fairly well for 6
~
0).
It appears
that when both margins are supplemented, the estimation of the margins
becomes complexly entangled with the association, and the individual
•
effects are no longer separable •
•
98
Alternatively, let
~
1x5
= (n11,n12,n21,n1*,n*l)
~ =
(TT ,TT12 ,TT21 )
U
1x3
n
X
=
00
0
0
0
n
0
0
n 00
n o*
n o*
0
n*o
0
n*o
00
5x3
••
0
Then, weighted least squares estimates, TT , obtained using the model
ij
E(~ =
are given by
-
(4.3.1.5)
XTT
where
1
n
00
v-I
.j
=
-
(D~l +~
TT
O'
0
0
TT
22
1
0
n o* Ti10 ( 1 - Tilo )
5x5
•
J)
0'
0
1
n*o Tio 1 (1 - Tiol )
(4.3.1.6)
•
99
Performing the indicated matrix multiplication and simplifying yields
n
00
(-- n*. - n i ,.)
n*o
J
J
TTi'j
n
00
(-- n.* _ n .. ,)
00*
+
~
n
00 n*.
(__
1J
+
Tr ••
1J
~o
-0--- °i ' * + ni'j')
0*
I
"iljl
i,j = 1,2
••
J
n
·00
i
F i'
j
}J
(4.3.1.7)
F j'
where
2
2
N_ = N + n
y
E <-1 )
00 Y E
i=l j=l TT
ij
with
y =
2
2
IT
IT
i=l j=l
(TT. j )
1
(4.3.1.7) also requires iterative procedures (see Chapter 7).
oj
As before, Pij and
nij
reduce to th~ ordinary MLE when n * = n*o
o
= 0 and they reduce to the previous estimates when n*
•
when one margin is supplemented).
o
=
0
F n0 *
(i.e.,
--
Also, Pij = MLE is a stationary
•
100
(!. =..'
solution of (4.3.1. 7)
the Pij satisfying (4.3.1.4) also satisfy
the relations (4.3.1.7) for weighted least squares estimates).
To show
this, (4.3.1.3) must be used along with the relationships
n ..
n .• ,
T\j
-ijl
.2:1. _ .2:L
TT
n ..
.2:1.
_ ni , j
-
TT
TTi'j
ij
=
n*j
I
TT
ajl
_
n*.
n
TToj
TT
_J =
-
i'j
TTi,o
D
TriO
TT.
i1i
n
1
,
'* -i-*= ij
--- ,
TT
= ni
-
ni I j
-
ij
(4.3.1.8)
• I
J
n i , j'
TTilj l
i,j = 1,2
4.3.2
I
I
i
f
i'
j
f
j'
.
Properties of the Estimators
Since the KLE's, Pij' i,j
= 1,2,
have not been expressed in closed
form, only asymptotic properties of these estimators will be investigated.
Belonging to the class of BAN estimators, the MLE's are
asymptotically unbiased as well as consistent and asymptotically
efficient.
The asymptotic covariance matrix is given by the inverse of the
information matrix, namely,
•
•
101
_1
"22
"11
1 + - - + 0 ' +0'
r
1 + 0'
1 + 0'
r
c
"22
"22
1
1 + -+0'
"12
r
'O-
n
c
00
(4.3.2.1)
(symmetric)
-"11"21 -.\1 - Y
1
- Ny
-"12"21 + Y
( symmetric)
••
where
n o*
"22
=
r·
noo "10"20
0'
n*o
0'
c
Yr
•
"22
n00 "01"02
no * "il"12
n
"io
00
i
= 1,2
n*o "lj "2j
noo
"oj
j
= 1,2
'O-
i
Yc
j
'O-
•
102
N
Y
=N+n
00
2
I:
Y
i=l
with
2
2
n
n
("i')
no*n*o i=l j=l
J
Y =
2 -;;-2--''----.2.---n oo
n ("i) n (" .)
i=l
0 j=l
OJ
A general form expressing the elements of both (4.2.3.4) and
(4.3.2.1) as well as the covariances in the usual situation
(~.!.,
without supplemented margins) is given by
••
(4.3.2.2)
- °ii,(1_2o .,)
j J
.
n o* "ij "i(3-j)
nOD
"io
n*o "ij "(3-i) j
noo
"oj
i,i',j,j' = 1,2
where
.f
•
(o
if
k
.(.
if
Kronecker delta
•
103
and with Ny' y as defined in (4.3.2.1); n o* = 0 = n*o for the
unsupplemented case while n*o = 0
f
n * for supplementation on the
o
row response only.
4.4
4.4.1
Extension of the Theory of Supplemented Margins
The Case of One Margin Supplemented
The major difficulty in extending the results from the 2 x 2 case
to the r x c case wi"th one margin supplemented is one of notation.
In
general, if certain correspondences between the two cases are made, the
extension is rather straightforward although most algebraically and
notationally tedious.
One such set of correspondences arises from the
investigation of certain properties of the estimates of n
••
ij
.
This set
includes the following:
r x c case
(i) n
2 x 2 case
rc
(it) (nrjl,nrj , ... , n rj
)
2
c-2
where j f c
Of
(iit) (n
,n . , ... ,n
)
ij 1 i J2
ij c_l
where
lj'
i=1,2, ... ,r_l
j
(iv)
n
Of
f
where j'fj
j
1 _
where i'fi
•
(v) 1 _ n .
oJ
n .,
oJ
where j 'fj .
(4.4.1.1)
•
104
:
The previous intuitive estimate, n
ij
, i
= 1,2, ... ,r;
extended to the r x c case is again the MLE., p ij'
j
= l,2, ... ,c,
Also.• using condi_
tional means and variances, the p .. are unbiased with approximate
1J
variances given by the following:
(4.4.1.2)
n
+
0
*
n .. (n. -no .)(l-n. )[(N-l)n. +1]
1)
10
1J
:T2
N n
00
10
10
3
n.10
i
;::: 1,2, ... ,r;
j;::: 1,2, .... ,c ..
Alternatively, let
0
1
~
;:::
1
n' .... , -r'
nl
-1' .:.:,z'
(0
0 1)
~
lx[r(c+l) -2]
where
n~
i
-1
lxc
n'
-r
lx(c_l)
n'
~
lx(r_l)
n'
~*
lxc
with
n'
-s
lx(rc-l)
•
;::: (0 r l' n r 2' ...... , n r,c_ 1)
= 1,2, ... ,r_l
•
105
where
n~
= (n. , n'2'
-1.
1.
1
i = 1,2, ... ,r_l
... , n,I.e )
1.
lxc
n' = (nr l' n r 2' ... , n r,c_ 1)
-r
lx(c_l)
~
= (n
lo , n 20 '
... , nr_l,o)
lx(r_l)
n
Q.
I
00 c
n
••
I
00 c
n
I
00 c
r
n
x
=
[r(c+l) _2]
x(rc_l)
I
00 c_l
o
n'
-0*
n'
-0*
n'
-0*
o
where
III
•
~,
is a (11 x
1l> identity matrix
Q. are vectors and matrices, respectively, of O's .
•
106
Then the weighted least squares estimates, n
ij
, obtained from the model
E(~ = X.~
are given by
~
=
--1
(X'V
X)
-1
-_1
X'V
(4.4.1.3)
~
where
o
V-I
=
[r(c+l) _2]x[r(c+l) -2]
1
Q.'
(0- 1 +_1_ J )
2
TT
TT
--0
ro
~.
with 0_ , 0_ , and J
!!s
(4.2.2.9).
~
i
defined correspondingly as in (4.2.2.8) and
The considerable algebra could be carried out at a later
time if.warranted.
It might be noted that the information matrix is
again identical to the asymptotic covariance matrix of the weighted
least squares estimates for several special r x c cases that were
considered.
It is of interest to note that all of the preceding theory for
the case of supplementation on one margin (including
two~way
margins,
etc.) can be extended rather readily to higher dimensional tables by
.j
associating all of the unsupplemented dimensions with the B response
•
considered previously.
For example, for a 2 x 2 x 2 table with the
first response supplemented by n
o
**
additional observations, the
•
ill 7
estimation problem is essentially that for a 2 x 4 table with n **
o
observations on the first dimension.
4.4.2
The Case of Several Margins Supplemented
As was the case with supplementation for one margin, the major
difficulty in extending the theory for the 2 x 2 case with both
margins supplemented to the r x c case is notation.
With the proper
correspondences, it is largely an exercise in algebra and will not be
carried out at this point.
The extension to higher dimensional tables is not nearly as
obvious as it was for the case with supplementation over a single
margin.
••
Examples of the types of designs for supplementation in
higher dimensional tables can be illustrated by the 2 x 2 x 2 case
with responses A, B, and C.
The margins available for supplementation
include the one_way margins for A, B, and C as well as the two_way
margins AB, AC, and BC.
Examples of the possible supplementation
designs include the following:
(i) A
(~.!.,
(ii) A, B
(iii) AB
(~.!.,
(~.!.,
(iv) A, AB
•
two one_way margins supplemented)
one two_way margin supplemented)
(~.!.,
(v) A, B, AB
(vi) AB, AC
one margin supplemented)
a one_way margin and an overlapping two_way
margin supplemented)
(~.!.,
(~.!.,
two one_way margins and a completely over_
lapping two_way margin supplemented)
two two-way margins supplemented) .
•
108
4.5
Tests of Hypotheses
As noted in Chapter 3, testing various hypotheses for the more
general case of supplemental data using a modification of the GSK
program is essentially a -two_stage procedure.
This two_stage procedure
will be illustrated here for 2 x 2 tables with supplementation using
the results derived in this chapter.
However, as stated previously,
the general formulation of the test procedure applies directly to r x c
tables and also to supplementation in higher dimensional tables.
The first stage in the two_stage test procedure consists of the
following:
(i) Input the observed frequencies such as are given in
Table 4.2.1.1 for one margin supplemented (Case I) or in
Table 4.5.1 for the case of both margins supplemented
(Case II).
...
(ii) Obtain estimates, TT
ij
, i,j ; 1,2, where
,....
TT
22
;
I-TT
n
-TT
A
12
-TT
2
1'
corresponding to the unrestricted MLE for the unsupplemented
case:
:'-"( 1) Case I:
Use the MLE's,
ij ; Pij
TT
as given in (4.2.2.7).
(2) Case II:
Use either the iterated MLE's obtained from
(4.3.1.3) or the iterated WLSE's obtained from
(4.3.1.7).
If a modification of the iteration
program given in Chapter 7 is used, the latter
estimate would be recommended since its itera_
•
tion converges much more rapidly than that for
the MLE.
•
109
(iii) Obtain an estimate, V, of the asymptotic covariance matrix
using the estimates derived in (ii):
(1) Case I:
Use the obvious extension (which includes
yariances and covariances related to P22)
of the inverse of the information matrix
given in (4.2.3.4) with Pij replacing
"ij' i,j = 1,2;
(2) Case II:
Proceed as in Case I but with the extension
of (4.3.2.1) with "ij replacing "ij' i,j = 1,2.
(See (4.3.2.2) for the details.)
The second stage consists of the following:
(i) Input the A, K, X and C matrices required for the analysis
as in the existing GSK program (see Section 3.5).
(ii) Use the adjusted estimates of the cell probabilities as
obtained in (ii) of the first stage in place of the usual
estimates given in (1.4.1.2) along with the estimate of the
asymptotic covariance matrix obtained in (iii) of the first
stage in place of the usual estimate given in (1.4.1.6).
With these adjusted estimates incorporating the information
provided by the supplemented margins, proceed using the
GSK program to obtain tests of marginal symmetry, uniformity,
etc.
Note that the program must be modified not only to
incorporate the first stage but also to delete the usual
.'
steps for obtaining p .. , i,j = 1,2, as in (1.4.1.2) and
1J
Vas in (1.4.1.6) .
A
•
•
no
It should be noted that, since all rc parameters are used, it may
be necessary, in order to avoid singularities, to effect a reparam-
eterization by the proper selection of the A matrix.
As an example of the. steps required for the two_stage testing
procedure, consider the following "no factor,
two response" situation
with supplementation on both margins (see Table 4.5.1).
Table 4.5.1
Observed frequencies for a 2 x 2 contingency table with
both margins supplemented
I
Response B
Levels
1
Sub_total
1
2
(n
1
2
3
(nn)
2
I
I
Levels
Response A
••
The hypothesis
4
(n
(n
12
)
)
(n
)
I
:
)
(n
Total
(n i *)
(N
(n l *)
9
22
on A
5
(n lo )
5
21
io
I
I
Supplementation
9
20
)
,
io
)
8
(N
lo
)
18
(n 2*)
(N
14
26
(no*)
(N )
r
20
)
I
Sub_total
(n
oj
)
- - - - - - - Supp lementa tion
on B
(n*j)
5
(no 1)
12
7
(n
02
)
- - - - - - 4
(n*l)
6
(n*2)
(n
-- -
I
oo
)
- -
10
(n*o)
I
I
I
I
I
- - - - - - - -
- - - -
I
I
I
I
I
I
•
Total
(N
9
oj
)
(No 1)
13
22
I
I
36
(No 2)
(N )
c
I
(N)
•
111
~.!.,
of interest is one of marginal symmetry,
H :
o
The test proceeds as follows:
Stage 1:
(i) Input Table 4.5.1 in the proper format.
(ii) Iterate for the weighted least squares estimates (see
<7.3»
yielding
T\l
;
"12 ; 0.2069
0.1010
(4.5.1)
"21 ; 0.3046
"22
; 0.3875
(iii) Insert the estimates from (4.5.1) into the extended version
of (4.3.2.1) yielding
~.0059
V
;
_0.0031
-0.0030
0.0000
0.0085
-0.0003
-0.0033
0.0109
_0.0076
4x4
( symmetric)
(4.5.2)
0.0125
Stage 2:
(i) Input the appropriate A matrix, namely
A; (0, 1, _1, 0)
(cf. Grizzle et a1. (1969), p. 493).
(ii) Using (4.5.1) and (4.5.2), the revised GSK program would
j
then compute the appropriate test statistic, ~S' which,
•
under H , is asymptotically
o
x2
with D.F. ; 1 where
•
112
~s
=
(A~' (AVA ') -1
(Ail)
(4.5.3)
= 0.48
indicating marginal symmetry.
Note that, due to the
•
simplicity of (4.5.3) for this 2 x 2 case, AVA' is
explicitly given by
..."
var(TT
12
I'll
it.
,..
~
..
) +var(TT ) - 2 COV(TT12J TT ) =
Z1
Zl
-.
n
(4.5.4)
*0
+n- 00
=
0.0201
providing a check for (4.5.3).
Tests for independence, total symmetry, equality of scored means,
and uniformity (~.!., "10
= 1/2)
are also appropriate for this example.
When the theory of estimation for supplemented margins is developed for
more general situations
(~.!.,
the "multi_factor, mUlti_response" case
with supplementation), the full generality of the modified (or twostage) GSK program can be utilized .
•
•
113
5.
SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH
5.1
Summary
This research has utilized primarily the weighted least squares
method of estimating functions of individual cell probabilities for
contingency table representations of categorical data situations along
2
with minimum Neyman X tests or equivalently tests using the quadratic
forms arising in the familiar linear regression analyses.
~
Grizzle
al. (1969) provide the theoretical framework for this approach.
Maximum likelihood estimation is used where feasible for purposes of
comparison.
2
The minLmum discrimination information and minimum X
methods of estimation along with the Pearson
-.
x2 ,
Neyman-Pearson like-
lihood ratio, and minimum discrimination information test statistics
are only briefly mentioned since a primary goal of this research is to
investigate a most general method of analyzing categorical data with
the linear model approach being the method selected.
In order to formulate appropriate hypotheses to be tested using
the Grizzle, Starmer, Koch (GSK) approach, the underlying factor_
response structure of the table should be identified.
As noted in
Bhapkar and Koch (1968a), the types of multi_dimensional contingency
tables (and appropriate hypotheses) include "no factor, mUlti_response"
tables (problems of independence); "uni-factor, mUlti_response" tables
(effect of the factor on the responses, association between the
_!
responses); "multi_factor, uni-response" tables (homogeneity, "no
interaction"
•
between factors and the way they affect the response);
"multi_factor, multi_response" tables (relationships among the
responses and in the way the factors combine to affect the responses).
•
114
An additional model is recognized in this research and is referred
to as a "mixed categorical data model" due to its analogy with the
well_known mixed model in the analysis of variance ..
The experimental
situation involves exposing each of n subjects to each of d levels of
a given factor and classifying each of the d responses into one of r
categories.
The resulting data are represented in an r x r x ... x r
contingency table of d dimensions.
The hypothesis of principal interest is equality of one_dimensional
marginal distributions.
If the r categories can be quantitatively
scaled, attention is focused on the hypothesis of equality of the mean
scores over the d first order marginals.
-.
Test statistics are developed
in terms of minimum Neyman X2 or equivalently weighted least squares
analysis of underlying linear models.
The analogy of the test for equality of mean scores in mixed
categorical data models of order 2 to the matched pairs t-test is
illustrated for a well_used example on eye_testing for employees in
Royal Ordnance factories.
It is indicated that tests for mixed
categorical data models of higher order bear a strong resemblance to
2
the Hotelling T
procedures used with continuous data in mixed models.
In addition, the application of the GSK approach to split plot
contingency tables is illustrated with an example from Lessler (1962),
and the extension of this research to complex mixed categorical data
models is outlined.
In mixed categorical data models as well as the models given in
•
Bhapkar and Koch (1968a), the assumption that the data is complete in
the sense that every experimental unit is classified according to each
•
115
of the d dimensions of the table is often violated.
The remainder of
this research deals with methods of utilizing this "incomplete" (or
"supplemental") data by extending the GSK approach as given in
Grizzle et al. (1969).
It is assumed throughout that there is no interaction between
subjects and the presence of supplemental data
(~.~.,
for the responses
classified, the joint marginal distributions for subjects with complete
data and for those with supplemental data are the same).
Conditional
on the number of individuals with various types of complete or supple_
mental data being fixed by design
~
priori, the basic probability model
is the product of several multinomials with marginal and/or cell prob_
abilities as fundamental parameters.
••
The most detailed investigation is carried out for 2 x 2 tables .
For the case with supplementation on one margin, the logical intuitive
estimates of the cell probabilities are shown to be the MLE's as well
as the iterated weighted least squares estimates (WLSE's).
Using
conditional expectations, the MLE's are shown to be unbiased.
MLE's, they are consistent and asymptotically efficient.
Being
The
asymptotic covariance matrix is then obtained by inverting the
infonnation matrix.
Since this is exactly the asymptotic covariance
matrix of the WLSE's, if the iterated WLSE's are unique, they must be
identical to the MLE's.
When both margins are supplemented, the MLEls and WLSE 1 s must be
obtained by iteration.
•
For the computer program that was written, the
WLSE's iteration converged considerably faster than did the one for the
MLE's.
It is noted that the iterative proportional fitting technique
•
116
of Mosteller (1968), Goodman (1968) and Bishop (1969) does not readily
give MLE's for this case.
Again, the asymptotic covariance matrix is
identical for the MLE's and WLSE's.
A general representation for the
elements of the covariance matrix is given which yields, as special
cases, the covariances of the estimates for the cases with supplementa_
tion on one margin and the usual case of complete data
(~.~.,
no
supplementation).
The extension of the theory to general r x c tables as well as to
higher dimensional tables is outlined.
The incorporation of these results into the general GSK framework
for hypothesis testing is indicated as a two_stage test procedure.
The
first stage consists of the following:
••
(i) Input the observed cell and marginal frequencies .
(ii) Obtain estimates of the cell probabilities corresponding to
the unrestricted MLE's for the case with complete data:
(1) If the table is fairly simple, obtain the MLE's.
(This
is the closest analogue to the unrestricted MLE's for
the complete data version and is most consistent with
tests using Wald statistics.)
(2) For most cases, compute the WLSE's by iterative
techniques.
(3) If interested in estimating functions of the cell
probabilities (say,
A~,
obtain WLSE's of these
J
expressions.
•
(This is particularly appropriate in the
mixed model situation.)
•
117
(iii) Obtain an estimate, V (or AVA' for (3», of the asymptotic
covariance matrix using the estimates derived in (ii).
The second stage consists of the following:
(i) Input the A, K, X, and C matrices required for the analysis
as in the existing program.
(ii) Use the adjusted (for supplemental data) estimates of the
cell probabilities as obtained in (ii) of the first stage
in place of the usual estimates along with the adjusted
estimate of the asymptotic covariance matrix in (iii) of
the first stage in place of the usual GSK estimate.
Proceed
with the testing of various hypotheses exactly as in the
-.
existing GSK framework.
A number of examples are presented to illustrate the more general
categorical data models utilizing supplemental data.
5.2
Suggestions for Further Research
The present research uncovered a number of areas which might
warrant further investigation.
These include the following:
(i) Generalization of the GSK program to efficiently handle the
case of supplemental data.
(ii) Development of the theory for supplemented margins in
general multi_factor, multi_response situations.
(iii) Consideration of iterative proportional fitting techniques
J
for maximum likelihood estimation when more than one margin
is supplemented.
•
(iv) Investigation of various supplementation designs with a
consideration of the cost of measurement.
•
118
(v) Comparison of small sample properties of tests for
contingency tables with and without supplementation.
(vi) Power comparisons of tests for contingency tables with and
without supplementation.
(vii) Consideration of the effect of supplementation on various
measures of association
(~'f"
Kendall's T, Goodman_Kruskal
G for two_way tables).
(viii) Investigation of the analysis of generalized mixed models
using the GSK approach .
••
•
•
119
6.
LIST OF REFERENCES
Afifi, A. A. and Elashoff, R. M. 1966. Missing observations in
multivariate statistics I. Review of the literature. Journal
of the American Statistical Association 61: 595_605.
Afifi, A. A. and Elashoff, R. M. 1967. Missing observations in
multivariate statistics II. Point estimation in simple linear
regression. Journal of the American Statistical Association 62:
10_29.
Afifi, A. A. and Elashoff, R. M. 1969a. Missing observations in
multivariate statistics III. Large sample analysis of simple
linear regression. Journal of the American Statistical Associa_
tion 64: 337-58.
Afifi, A. A. and Elashoff, R. M. 1969b. Missing observations in
multivariate statistics IV. A note on simple linear regression.
Journal of the American Statistical Association 64: 359-65.
Bartlett, M. S. 1935. Contingency table interactions. Journal of
the Royal Statistical Society (Supplement 2): 248_52.
••
.'
Berkson, J. 1954. Estimation by least squares and by maximum likeli_
hood. Proceedings of the Third Berkeley Symposium on Mathematical
Statistics and Probability 1: 1_11. University of California
Press, Berkeley and Los Angeles.
2
Berkson, J. 1968. Application of minimum logit X estimate to a
problem of Grizzle with a notation on the problem of no interac_
tion. Biometrics 24: 75-95.
Bhapkar; V. P. 1961. Some tests for categorical data.
Mathematical Statistics 32: 72-83.
Annals of
Bhapkar, V. P. 1965. categorical data analogs of some multivariate
tests. S. N. Roy Memorial Volume, University of North Carolina
Press) Chapel Hill, N. C.
Bhapkar, V. P. 1966. A note on the equivalence of two test criteria
for hypotheses in categorical data. Journal of the American
Statistical Association 61: 228_35.
Bhapkar, V. P. 1968. On the analysis of contingency tables with a
quantitative response. Biometrics 24: 329-38 .
.'
Bhapkar, V. P. and Koch, G. G. 1968a. Hypotheses of 'no interaction'
in multi_dimension~l contingency tables. Technometrics 10: 107_23 .
•
•
120
Bhapkar, V. P. and Koch, G. G. 1968b. On the hypotheses of 'no
interaction' in contingency tables. Biometrics 24: 567-94.
Birch, M. W.
tables.
220_33.
1963. Maximum likelihood in three_way contingency
Journal of the Royal Statistical Society (B) 25:
Bishop, Y. M. M. 1969. Full contingency tables, logits, and split
contingency tables. Biometrics 25: 383-99 •
•
Bishop, Y. M. M. and Fienberg, S. E.
dimensional contingency tables.
1969. Incomplete two_
Biometrics 25: 119-28.
Blumenthal, S. 1968. Multinomial sampling with partially categorized
data. Journal of the American Statistical Association 63: 542_51.
Cochran, W. G. 1950. The comparison of percentages in matched
samples. Biometrika 37: 256-66.
2
Cochran, W. G. 1952. The X test of goodness of fit.
Mathematical Statistics 23: 315_45.
Annals of
2
Cochran, W. G. 1954. Some methods of strengthening the common X
test. Biometrics 10: 417_51.
Cochran, W. G.
York.
1963.
Sampling Techniques.
John Wiley and Sons, New
Friedman, M. 1937. The use of ranks to avoid the assumption of
normality implicit in the analysis of variance. Journal of the
American Statistical Association 32: 675-99.
Good, I. J. 1963. Maximum entropy for hypothesis formulation,
esp'ecially for multidimensional contingency tables. Annals of
Mathematical Statistics 34: 911_34.
Goodman, L. A. 1968. The analysis of cross-classified data:
Independence, quasi_independence, and interactions in contingency
tables with or without missing entries. Journal of the American
Statistical Association 63: 1091_131.
Goodman, L. A. 1970. The multivariate analysis of qualitative data:
Interactions among multiple classifications. Journal of the
American Statistical Association 65: 226-56.
•
Grizzle, J. E., Starmer, C. F. and Koch, G. G. 1969. Analysis of
categorical data by linear models. Biometrics 25: 489_504.
"
•
Ireland, C. T. and Kullback, S. 1968a.
marginals. Biometrika 55: 179_88 .
Contingency tables with given
•
121
Ireland, C. T. and Kullback, S. 1968b. Minimum discrimination
information estimation. Biometrics 24: 707_13.
Ireland, C. T., Ku, H. H. and Kullback, S. 1969. Symmetry and
marginal homogeneity of an rxr contingency table. Journal of
the American Statistical Association 64: 1323-41.
Kastenbaum, M. A. and Lamphiear, D. E. 1959. Calculation of chi_
square to test the no three-factor interaction hypothesis.
Biometrics 15: 107_15.
Kleinbaum, D. G. 1970. A general method for obtaining test criteria
for multivariate linear models with. more than one design matrix
and/or incomplete in response variates. Unpublished Ph.D. thesis,
Department of Statistics, University of North Carolina at Chapel
Hill.
Koch, G. G. and Sen, P. K. 1968. Some aspects of the statistical
analysis of the 'mixed model'. Biometrics 24: 27_48.
KU, H. H. and Kullback, S. 1968. Interaction in multidimensional
contingency tables: An information theoretic approach. Journal
of Research of the National Bureau of Standards _ Mathematical
Sciences 72B: 159_99.
Kullback, S. 1959. Information Theory and Statistics.
Sons, New York.
John Wiley and
Kullback, S., Kupperman, M. and Ku, H. H. 1962. Tests for contingency
tables and Markov chains. Technometrics 4: 573-608.
Lessler, K. J. 1962. The anatomical and cultural dimensions of
sexual symbols. Unpublished Ph.D. theSiS, Michigan State
University, East Lansing.
Lewis, J. A. 1968. A program to fit constants to multiway tables
of quantitative and quantal data. Applied Statistics 17: 33-41.
Mantel, N. 1963. Chi_square tests with one degree of freedom;
extensions of the Mantel_Haenszel procedure. Journal of the
American Statistical Association 58: 690_700.
•1
Mantel, N. and Haenszel, W. 1959. Statistical aspects of the
analysis of data from restrospective studies of disease. Journal
of the National Cancer Institute 22: 719-48 •
Miettinen, O. S. 1968. The matched pairs design in the case of
all_or_none responses. Biometrics 24: 339_52.
•
Miettinen, O. S. 1969. Individual matching with multiple controls
in the case of all_or_none responses. Biometrics 25: 339_55.
•
122
Miettinen, O. S. 1970. Estimation of relative risk from individually
matched series. Biometrics 26: 75_87.
Mosteller, F. 1968. Association and estimation in contingency tables.
Journal of the American Statistical Association 63: 1-28.
2
Neyman, J. 1949. Contribution to the theory of the X test.
Proceedings of the Berkeley Symposium on Mathematical Statistics
and Probability. University of California Press, Berkeley and
Los Angeles. Pp. 239-73.
Pearson, E. S. 1947. The choice of statistical tests illustrated on
the interpretation of data classed in a 2x2 table. Biometrika 34:
139-67.
Rao, C. R. 1963. Criteria of estimation in large samples, pp. 345-62.
In C. R. Rao (ed.), Contributions to Statistics. Pergamon Press,
Oxford, New York.
Roy, S. N. and Kastenbaum, M. A. 1956. On the hypothesis of no
interaction in a multiway contingency table. Annals of Mathematical Statistics 27: 749_57.
Scheffe, H. 1959.
New York.
The Analysis of Variance.
John Wiley and Sons,
Stuart, A. 1955. A test for homogeneity of the marginal distribu_
tions in a two-way classification. Biometrika 42: 412_16.
Wald, A. 1943. Tests of statistical hypotheses concerning several
parameters when the number of observations is large. Transactions
of the American Mathematical Society 54: 426_82.
Wilks, S~ S. 1946. Sample criteria for testing equality of means,
equality of variances, and equality of covariances in a normal
multivariate distribution. Annals of Mathematical Statistics 17:
257_81 •
•
•
123
7.
APpmDIX
This chapter contains the iteration scheme for the HLE's (Pij) and
(n..)
for
1J
WLSE's
the 2 x 2 case with both margins supplemented.
The
expressions which must be iterated are as follows:
i,j ;: 1,2
(7.1)
(7.2)
+
'.
n*
~
n
n .. TT"j-n"jTT..
1J 1
1
1J
oo
n
+
00
(n * n i * - n ij ,)
o
TT
ij
+
n
n
00
00
)
( - - n*. _--n.,*+ni'j'
~o
J 00* 1
TTi
I
1
j
I
i,j
where
2
= N + n
N
y
•
2
E
i=l j=l
Y E
00
2
•
-y
n
=
2
n
IT
IT (TTij )
0* *0 i=l j=l
2
-"'2~'---""2--
n oo
IT
i=l
(n. ) IT (n .)
10 j=l
oJ
1,2
J]
i f i'
j
f
j'
••
124
The Call_A_Computer program written for simultaneously iterating
(7.1) and (7.2) is as follows:
OlO
.020
030
040
050
060
070
080
090
100
no
120
130
140
150
160
INPUT,«PA(I,J),J~l,2),I~l,2),«N(I,J),J~l,2),I~l,2),
(MR(I),I~l,2),(MC(J),J~l,2),NT,IR,IC,NTT,«PX(I,J),J~l,2),
+
+
I~l,2)
WIR~IR;
WIC~IC;
WR~WIR/WNT;
6 DO 2
WNT~NT
WS~WIC/WNT
DO 2 J~l,2
2 X(I,J)~PX(I,J)
DO 3 I~l, 2; B(I) ~A(I, 1) +A(I, 2)
3 Y(I) ~X(I, 1) +X(I, 2)
DO 4 J~l,2
C(J) ~(l,J) +A(2,J)
4 Z(J)~X(l,J)+X(2,J)
I~l,2;
A(I,J)~PA(I,J);
E~(WR*WS*X(l,l)*X(l,2)*W(2,l)*X(2,2»/(Y(1)*Y(2)*Z(1)*Z(2»
180
190
200
(X(l, 1) *x (1,2) *X(2, 1) +X(l, 1) *X( 1,2) *X(2, 2) +
+ X(l,l)*X(2,l)*X(2,2)+X(l,2)*X(2,l)*X(2,2»/(X(l,l)*X(1,2)*
+ X(2,l)*X(2,2»)
DO 5 I~l, 2; DO 5 J~l, 2
PA (I, J) ~(N( I, J) -IMR( I) *A(I, J) /B( I) -tMC(J) *A (I, J) / C(J» /NTT
2lO
K~3_I;
220
230
PX( I, J) ~(N(I, J) -IMR( I) *X( I, J) /Y (I) -tMC(J) *X (I, J) /Z (J) +
+ WR*(N(I,J)*X(I,L)_N(I,L)*X(I,J»/Y(I)+WS*(N(I,J)*X(K,J)_
+ N(K,J)*X(I,J»jZ(J)+E"(N(I,J)/X(I,J)+(MC(J)/WS-N(K,J»/
+ X(K, J) +(MR(I) /WR-N(I, L» /X(I, L) +(MC(J) /WS_MR(K) /WR+
+ N(K,L»/X(K,L»)/F
.
5 0(1, J) ~PA(I, J) -PX(I, J)
PRINT, t "ML~", «PA(I,J),J~l,2),I~l,2),"
LS~", «PX(I,J),
+ J~l,2),I~l,2),"
0~",«0(I,J),J~l,2),I~l,2)
GO TO 6; STOP; END
170
".
DIMENSION PA (2,2) ,N(2, 2) ,0(2,2) ,MR(2) ,MC (2) ,A (2,2) , B(2) ,
+ C(2) ,X(2,2), Y(2) ,Z(2) ,PX(2,2)
240
250
260
270
280
290
300
F~NTT+(NT*E"
L~3_J
Using Table 4.5.1 as an example and starting with the unrestricted
MLE's without supplementation as initial estimates of the Pij and ;ij'
i,j
~
1,2
<.!:=..,
(0)
Pn
~
/
1 12
~
0.0833
~
-(0)
(0)
!Tn ; P12
~
0.1667
~
-(0)
!T12 ;
p(O) ~ 0 3333 ~ n(O); p(o) ~ 0.4167 ~ nCo»~ the iteration proceeds as
21'
21
22
22 '
'.
shown in Table 7.1 .
•
•
~
,
•••
125
Table 7.1
Iterative sol~tion for the MLE's (Pij) and WLSE's
for Table 4.5.1
Stage in
Iteration
P11
1
(ITij )
a
22
D
M
.3968
.3881
.01050
.3046
.3906
.3875
.00310
.3047
.3046
.3886
.3875
.00llO
.2069
.3045
.3046
.3880
.3875
.00044
.2067
.2069
.3045
.3046
.3877
.3875
.00020
.1010
.2068
.2069
.3045
.3046
.3876
.3875
.00010
.1011
.1010
.2068
.2069
.3045
.3046
.3876
.3875
.00006
8
. lOll
.1010
.2068
.2069
.3045
.3046
.3876
.3875
.00004
9
.1011
.1010
.2069
.2069
.3045
.3046
.3876
.3875
.00003
10
.1010
.1010
.2169
.2069
.3045
.3046
.3876
.3875
.00002
11
.1010
.1010
.2069
.2069
.3045
.3046
.3875
.3875
.00002
12
.1010
.1010
.2069
.2069
.3046
.3046
.3875
.3875
.00001
TT
12
P 21
.1958
.2063
.1011
.2037
.1008
.1010
4
.1011
5
ll
P 12
.0963
.1016
2
.0998
3
TT
TT
2l
P 22
.3111
.3041
.2068
.3058
.2059
.2069
.1010
.2065
.1011
.1010
6
.1011
7
a ~ = max Ip .. _
1)
TT
-
TT •• /.
1)
For this example, the iterated cell probabilities are given by the
following:
Pll = 0.1010 =
.'
•
P 21 = 0.3046
-ll
P12 = 0.2069 =
TT
P 22 = 0.3875 =
TT
21
In this case (and in each of a
n~ber
TT
12
TT
(7.3)
22
of other examples), the iteration
for the WLSE's converged considerably faster than did the iteration for
the MLE's.
,
<