TESTING HYPOTHESES WITH CATEGORICAL DATA
SUBJECT TO MISCLASSIFICATION
by
Kwanchai Assakul
and
C. H. Proctor
Institute of Statistics
Mimeograph Series No. 448
September, 1965
ERRATA
Should Read
e
55
Paragraph 2, Line 2
••• defined by (2.70» •••
70
Line 6
Let r=c=2,
71
Table 2.7.2, Line
...
3
Paragraph 2, Line 3
[l-(t.-l)ehe
Paragraph 2, Line 5
re2
104
Last Line
Remove let at end of line
108
Equation (3.35)
Pr { · · · },
Equation (3.36)
... P { • • • } ,
r
Last Line
~~(lxret)
112
-
= (Plll, ••• ,Pllt;
Prel'" .,Pret )
...,.
TESTING HYPOTHESES WITH CATEGORICAL DATA
SUBJECT TO
MISCLASSIFICATION
by
KWANCHAI ASSAKUL
A thesis submitted to the Graduate Faculty
of North Carolina State University at Raleigh
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
DEPARTMENT OF EXPERIMENTAL STATISTICS
Raleigh
1965
APPROVED BY:
Chairman of Advisory Committee
R. J. Hader
W. J. Hall
O. Wesler
ABSTRACT
ASSAKUL, KWANCHAI. Testing Hypotheses with Categorical Data Subject to
Misclassification. (Under the direction of CHARLES HARRY PROCTOR.)

A random process of misclassification in contingency tables is
represented as a non-singular matrix $\theta$. The entries give the chances
that an individual, belonging in fact to the (i,j,k,...)-th cell of a
contingency table, is counted as in the (i',j',k',...)-th cell. This
would be written $\theta_{ii'jj'kk'\ldots}$, and the sums over all different
(i',j',k',...) for each fixed (i,j,k,...) are ones. Hypothesized cell
probabilities of the observed data under misclassification are related to
the cell probabilities under no misclassification by the basic equation

$$p'_{ijk\ldots} = \sum_{\alpha,\beta,\tau,\ldots} p_{\alpha\beta\tau\ldots}\,\theta_{\alpha i \beta j \tau k \ldots}.$$

Deviations from the hypothesized cell probabilities under
misclassification are similarly derived from the deviations under no
misclassification. These relations allow one to derive the probability
density of the observations under misclassification.
For testing the hypothesis of independence in a r×c two-way
contingency table, it is shown that the usual test procedure has the
required size if misclassification in the i-th direction is independent
of that in the j-th direction. However, misclassification is shown to
reduce the asymptotic power of the test for the same departure from the
null hypothesis. In exceptional cases, when the matrix $\theta$(rc×rc) is
nearly an identity and the values of the deviations are most
accommodating, the power may be equal. Numerical calculations are made to
illustrate how asymptotic power is reduced in this case. The ratio of the
sample size for data which are free of errors to the sample size for
data which are subject to misclassification, needed to produce the same
asymptotic power, is also given.
In case errors are not independent, the usual test does not have
the required size, and it is shown how BAN estimates and use of a
$\chi^2$ statistic can be applied to obtain a modified test procedure. The
method of minimum $\chi_1^2$ by linearization is suggested as a convenient
way to obtain BAN estimates of the parameters. Examples are given to
demonstrate this. The asymptotic power function of the modified test
procedure is found, and it is shown that misclassification reduces
(subject to a similar proviso as that above) the asymptotic power.
A suggested practical application of the results obtained to some
fields of research, e.g., sample survey data and medical research
studies, is given. A procedure for testing the independence of errors
is derived. As an illustration of the derived procedure, a numerical
example using sample survey data is calculated. The misclassification
errors are found to be not independent. The matrix $\theta$ is estimated and
the modified test procedure is worked out.
Similar studies are made for the problem of testing the hypothesis
that the populations are identical in a t×c two-way contingency table,
and results similar to those for the previous case are obtained.
An extension to the problem of testing the hypothesis of complete
independence in a three-way contingency table is given.
BIOGRAPHY
Born:
August 17, 1939, Bangkok, Thailand
Parents:
Chin and Nantana Assakul
Education:
Mater Dei School, Bangkok, Thailand, 1948-1956
Faculty of Commerce and Accountancy, Chulalongkorn
University, Bangkok, Thailand, 1956-1960
North Carolina State University at Raleigh,
Raleigh, N. C., Graduate Student, 1961-1965
Employment:
Faculty of Commerce and Accountancy, Chulalongkorn
University, Bangkok, Thailand, Assistant Instructor
in Accounting, 1959-Present
ACKNOWLEDGEMENTS

The author's foremost thanks go to Dr. Bundkit Kantabutra, Head of
the Department of Statistics of Chulalongkorn University, who encouraged
her to come to the United States for further studies after completing
her undergraduate work in Accounting at Chulalongkorn University.

The author wishes to express her sincere appreciation to her major
advisor Dr. C. H. Proctor, who has given untiring counsel and shown
continued interest in the research and preparation of this dissertation.
To the other members of her committee, Dr. R. J. Hader, Dr. W. J. Hall
and Dr. O. Wesler, go many thanks for their helpful suggestions.

Acknowledgement is due the Governments of the United States of
America and Thailand for making this study possible, the Agency of
International Development for sponsoring this project and the
International Statistical Program Office of the Bureau of the Census for
arranging this training program.

To Mrs. Selma McEntire the author extends her appreciation for
her excellent typing of this dissertation.

To the memory of Chin Assakul, the author's father, this dissertation is
gratefully dedicated.
TABLE OF CONTENTS

LIST OF TABLES

1.  INTRODUCTION AND REVIEW OF LITERATURE
    1.1.  The Analysis of Categorical Data
    1.2.  Statement of Problem
    1.3.  Review of Literature on the Study of Misclassification
    1.4.  Outline of the Following Sections

2.  CONTINGENCY TABLE WITH BOTH MARGINALS RANDOM
    2.1.  Case I: Independent Errors
    2.2.  The Asymptotic Power Function for the Case of
          Independent Errors
    2.3.  Illustration for the Case of Independent Errors
    2.4.  Case II: Non-Independent Errors
    2.5.  Finding a Test Procedure for the Case of Non-Independent
          Errors When $\theta$(rc×rc) Is Known
    2.6.  Illustration of the Method of Minimum $\chi_1^2$ by
          Linearization
    2.7.  The Asymptotic Power Function for ...
    2.8.  The Size of the Test If Misclassification Is Ignored
    2.9.  Non-Independent Errors When $\theta$(rc×rc) Is Unknown
    2.10. Application: Test of Independence of Errors

3.  CONTINGENCY TABLE WITH ONE MARGINAL FIXED
    3.1.  The Problem
    3.2.  Independent Errors
    3.3.  The Asymptotic Power Function for the Case of
          Independent Errors
    3.4.  Illustration
    3.5.  Non-Independent Errors

4.  GENERALIZATION

5.  SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH
          Summary
          Suggestions for Future Research

6.  LIST OF REFERENCES

7.  APPENDICES
    7.1.  Cramér's Theorem
    7.2.  Asymptotic Power Function: Test of Independence in a
          r×c Contingency Table
    7.3.  Some Results in Matrix Theory
LIST OF TABLES

2.4.1.  Effect of misclassification (independent errors with the
        same constant rates both on the i-th and j-th directions)
        on the asymptotic power when testing the hypothesis of
        independence in a r×c contingency table

2.4.2.  Effect of misclassification (independent errors with
        different rates for the i-th and j-th directions) on
        the asymptotic power when testing the hypothesis of
        independence in a r×c contingency table

2.6.1.  Presence and absence of second brood eggs in Boone County
        sample subdivided according to application of fertilizer

2.6.2.  Effect of misclassification (non-independent errors with
        the same constant rates) on the value of the test
        statistics for a given set of data

2.7.1.  Effect of misclassification (non-independent errors with the
        same constant rates) on the asymptotic power; for r=c=2,
        $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and
        $d_{11} = -d_{12} = -d_{21} = d_{22}$

2.7.2.  Effect of misclassification (correlated errors) on the
        asymptotic power; for r=c=2,
        $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and
        $d_{11} = -d_{12} = -d_{21} = d_{22}$

2.7.3.  Effect of misclassification (misclassification only in the
        neighboring classes) on the asymptotic power; for r=2, c=3,
        $\{p^0_{1\cdot}, p^0_{\cdot 1}, p^0_{\cdot 2}\} = \{.6, .2, .3\}$,
        $d_{11} = -d_{12} = -d_{21} = d_{22}$ and $d_{13} = d_{23} = 0$

2.8.1.  Effect of misclassification (non-independent errors with
        constant error rates) on the size of the usual test when
        misclassification is ignored

2.10.1. Condition and plumbing facilities for occupied rural
        nonfarm and farm housing units: 1960

2.10.2. Full cross-classification of the CES and the Census Data on
        the conditions of housing unit and plumbing facilities
        for all occupied housing units
1.  INTRODUCTION AND REVIEW OF LITERATURE

1.1.  The Analysis of Categorical Data

Before proceeding to study certain features of the effects of
misclassification in the analysis of categorical data, a review of
the general results of that analysis is necessary for the discussion
that follows.

Of the two major areas of statistical inference, the estimation of
parameters and the testing of hypotheses, we shall be mainly concerned
with the second, testing. The theory of hypothesis testing in
the analysis of categorical data (data which are presented in the form
of frequencies falling into certain categories or classes) goes back
to the pioneering paper of Karl Pearson [21], modified in important
directions at subsequent stages by a whole series of investigators,
including, among others, Yule [30], Fisher [10], Barnard [1],
E. S. Pearson [20], Cramér [6], Neyman [17], Mitra [13], Roy and Mitra
[25], Bhapkar [2], and Diamond [7].
As an introduction to the general setup and the approach to the
analysis of categorical data to be followed in this thesis, we quote
Mitra [13], p. v.
    . . . consider the 2x2 table as discussed by Barnard and
    Pearson. It was pointed out by them that the same 2x2
    table, for example,

                                      Attacked       Not attacked
                                      with cholera   with cholera
      Inoculated against cholera                                    $N_1$
      Not inoculated against cholera                                $N_2$
                                                                    $N$

    could be the outcome of three different conceivable
    sampling schemes:

    (i)   A random sample of size $N$ from the whole
          population and classification of every individual
          of the sample into one of the four classes of the
          2x2 table.

    (ii)  A random sample of size $N_1 < N$ from the
          inoculated population and classification of
          every member of the sample into one of the
          two categories "attacked" or "not attacked",
          and similarly an independent random sample
          of size $N_2 = N - N_1$ from the not-inoculated
          population and classification of members of
          the sample into the same two categories.

    (iii) The third situation, where the marginals in
          both directions are fixed, does not seem to
          be experimentally realizable in this case without
          a great deal of effort, but is certainly a
          lot more meaningful in some other types of
          problems, for example, in Fisher's tea-tasting
          experiment or similar experiments on
          extra sensory perception.

    The probability model, upon which the whole structure
    of the statistical analysis is to be built, necessarily
    differs in the first two cases and more so in the third.
    Assuming that the sampling is with replacement or is one
    from a large population, case (i) represents sampling
    from a single multinomial population while in case (ii)
    we have two independent samples of sizes $N_1$ and $N_2$ from
    two binomial populations.
In this thesis, we shall not discuss the third situation at all.
This approach, introduced for the simple 2x2 table by Barnard [1] and
Pearson [20], and used extensively in the work done by Mitra [13], will
be followed here. It has two primary features. The first is that one
is careful to distinguish between the case in which the sample is assumed
to be drawn from a single multinomial population and the case in which
the sample is assumed to be drawn from several independent multinomial
populations. In the first case all marginal totals of the resulting
table of data are stochastic variates. In the second case some
marginals are fixed by the manner in which the sample is drawn while
the rest are stochastic variates.
The second important feature is the careful definition of the
hypothesis to be tested and its alternative. This is essential to the
mathematical derivation of the test statistic and its null and non-null
limiting distributions and is certainly necessary for any meaningful
discussion of the results of a specific analysis of experimental
data.
In a more general formulation of the problem, we have t independent
samples from t different multinomial populations $P_1, P_2, \ldots, P_t$,
the i-th sample consisting of $N_i$ independent observations from $P_i$
(i=1,...,t). We shall assume that these populations are comparable in the
sense that every individual from each population could be classified into
one of the same set of s mutually exclusive and exhaustive categories
$C_j$ (j=1,...,s). Thus every individual in the total sample of size
$N = \sum_{i=1}^{t} N_i$ may be characterized by two indices "i" and "j"
of two different types, "i" representing the population $P_i$ from which
it is drawn, while "j" represents the particular category $C_j$ to which
it belongs. The classification of an individual according to the first
index "i" will be called a population-wise classification while the
classification according to the second index "j" will be called a
variate-wise classification. The same characteristic may sometimes behave
as a variate-wise and sometimes as a population-wise classification,
depending on the sampling scheme used. In case (i) of Mitra's statement
quoted earlier, the characteristic "inoculation" was a variate while in
case (ii) it was a population-wise classification.
It should be noted that "i" itself may be a multiple subscript
$i = (i_1, i_2, \ldots, i_u)$ and "j" itself may be a multiple subscript
$j = (j_1, j_2, \ldots, j_v)$, where $i_k = 1, 2, \ldots, b_k$
($k = 1, 2, \ldots, u$) and each $j_h$ ($h = 1, 2, \ldots, v$) likewise
runs over a finite set of values. The subscripts $i_1, i_2, \ldots, i_u$
might correspond to u different characteristics distinguishing one
population from another, while the subscripts $j_1, j_2, \ldots, j_v$ might
correspond to v distinguishable and variable characteristics of
individuals belonging to any population.
In this thesis, the following two situations are considered:

(i)   when t=1 and v=2 ($j_1 = 1, 2, \ldots, r$ and $j_2 = 1, 2, \ldots, c$,
      say), which is sometimes called the simple random sampling
      situation (i.e., the r×c two-way contingency table with both
      marginals random), and

(ii)  when t > 1, u=1 ($i_1 = 1, 2, \ldots, t$, say) and v=1
      ($j_1 = 1, \ldots, c$, say), which is sometimes called the stratified
      sampling situation (i.e., the t×c two-way contingency table with
      one marginal fixed).
For situation (i) the usual hypothesis of interest is the
hypothesis that the two variable characteristics are independent and
for (ii) it is the hypothesis that the t populations are the same.
The tests to be considered throughout this thesis, for each
hypothesis of interest in all the cases, are the large sample
chi-square tests. An excellent and extremely helpful review of the work
done by various authors on the application of the $\chi^2$-tests has been
given by Cochran [5]. A heuristic justification of the $\chi^2$-criterion
appears in Mitra [13].
In most of the usual applications of the frequency chi-square test,
the n observations in a random sample are classified into r mutually
exclusive classes. There is some theory which gives the probability $p_i$
of an observation falling in the i-th class (i=1,...,r). Sometimes
this theory or null hypothesis completely specifies the numerical value
of the $p_i$'s, and in this case we use the simple goodness-of-fit
statistic

$$\sum_{i=1}^{r} (O_i - E_i)^2 / E_i$$

to verify whether the theory is correct, where $O_i$ and $E_i$ are
respectively the observed and the expected frequencies (on the basis of
the null hypothesis) in the i-th class (i=1,...,r).
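As a small numerical illustration of the simple goodness-of-fit statistic just described, the following sketch computes $\sum (O_i - E_i)^2 / E_i$ with $E_i = n p_i$. The function name `pearson_gof` and the counts are hypothetical and chosen only for illustration; they do not come from the thesis.

```python
def pearson_gof(observed, probs):
    """Return sum((O_i - E_i)^2 / E_i), where E_i = n * p_i under the null hypothesis."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to one")
    n = sum(observed)
    expected = [n * p for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: r = 3 classes, null hypothesis p = (0.5, 0.3, 0.2).
stat = pearson_gof([48, 35, 17], [0.5, 0.3, 0.2])
print(round(stat, 4))
```

With a completely specified null hypothesis the value would be referred to a chi-square distribution with r-1 degrees of freedom, here 2.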
Other applications can be broadly stated as follows: The theory
specifies the unknown $p_i$'s as certain known functions
$p_i(\alpha_1, \ldots, \alpha_s)$ of a lesser number, s, of functionally
independent¹ unknown parameters, and our object is to verify whether the
theory is correct. For this, we use the statistic

$$\chi_0^2 = \sum_{i=1}^{r} \frac{[n_i - n p_i(\hat\alpha_1, \ldots, \hat\alpha_s)]^2}{n p_i(\hat\alpha_1, \ldots, \hat\alpha_s)}$$

as defined by (7.4), which is, in the limit as $n \to \infty$, distributed
as a $\chi^2$ with r-s-1 degrees of freedom. (See Section 7.1 for detailed
statement.)

¹ The functional independence of parameters will be understood to
mean that for every system of their values there exists at least one
determinant of s-th order,

$$\begin{vmatrix}
\partial p_{k_1}/\partial\alpha_1 & \cdots & \partial p_{k_1}/\partial\alpha_s \\
\vdots & & \vdots \\
\partial p_{k_s}/\partial\alpha_1 & \cdots & \partial p_{k_s}/\partial\alpha_s
\end{vmatrix} \neq 0.$$
One of the difficulties often encountered in the practical
application of the theory of frequency chi-square tests is the problem
of obtaining a system of solutions for the minimizing equations, such
as (7.2). In many practical situations, the minimization results in
sets of equations which are so complex that it is difficult, if not
impossible, to obtain explicit solutions for the estimates of the
parameters.

Under certain conditions, it is possible to adopt a different
method for estimating the parameters and to overcome this difficulty.
The main theorem on the limiting distribution of $\chi_0^2$ has been proved
under the hypothesis that the method of estimation is the modified
minimum $\chi^2$. However, it is well known that there is another class of
methods of estimation leading to the same limiting chi-square
distribution (see Cramér [6], p. 506).
One such class of estimates is the BAN estimates introduced by
Neyman. The definition of BAN estimates and the conditions under which
they exist are discussed in detail by Neyman [17]. That is, in our
notation, we have that under the conditions (7.1) of Cramér's theorem,
the statistic

$$\sum_{i=1}^{r} \frac{[n_i - n p_i(\hat\alpha_1, \ldots, \hat\alpha_s)]^2}{n p_i(\hat\alpha_1, \ldots, \hat\alpha_s)}, \qquad (1.1)$$

where $(\hat\alpha_1, \ldots, \hat\alpha_s)$ form any set of BAN estimates,
is in the limit as $n \to \infty$ distributed as a $\chi^2$ with r-s-1
degrees of freedom.
Furthermore, consider the expression

$$\sum_{i=1}^{r} \frac{[n_i - n p_i(\alpha_1, \ldots, \alpha_s)]^2}{n_i}. \qquad (1.2)$$

In writing this formula it is assumed that none of the $n_i$ is equal to
zero. Under the regularity conditions (7.1), $p_i > 0$ for all i. Since
the event $\{n_i > 0, \text{ all } i\}$ has probability approaching one
under this condition, we may for asymptotic purposes assume that all the
$n_i$'s are non-zero.

Neyman [17] has shown that the statistic

$$\sum_{i=1}^{r} \frac{[n_i - n p_i(\hat\alpha_1, \ldots, \hat\alpha_s)]^2}{n_i},$$

where $(\hat\alpha_1, \ldots, \hat\alpha_s)$ form any set of BAN estimates,
is in the limit as $n \to \infty$ distributed as a $\chi^2$
with r-s-1 degrees of freedom.
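The two forms of the statistic discussed above differ only in the denominator: the fitted expected frequency $n p_i$ in (1.1), the observed frequency $n_i$ in the form built on (1.2). The sketch below evaluates both for the same data. The function names, the counts, and the fitted probabilities are hypothetical; in practice the fitted probabilities would come from BAN estimates of the parameters, which are simply supplied here.

```python
def chi_square_expected_denominator(counts, fitted_probs):
    """Form (1.1): squared deviations divided by the fitted expected counts n * p_i."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, fitted_probs))


def chi_square_observed_denominator(counts, fitted_probs):
    """Form based on (1.2): squared deviations divided by the observed counts n_i."""
    if any(c == 0 for c in counts):
        raise ValueError("every observed count must be non-zero")
    n = sum(counts)
    return sum((c - n * p) ** 2 / c for c, p in zip(counts, fitted_probs))


# Hypothetical counts and fitted cell probabilities (assumed given).
counts = [48, 35, 17]
fitted = [0.5, 0.3, 0.2]
stat_expected = chi_square_expected_denominator(counts, fitted)
stat_observed = chi_square_observed_denominator(counts, fitted)
print(round(stat_expected, 4), round(stat_observed, 4))
```

The two values differ in finite samples; the text's claim is only that they share the same limiting distribution.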
All of the tests considered throughout this paper are chi-square
tests. The consistency of the frequency $\chi^2$-tests (see Neyman [17])
makes the conventional power studies in the limiting sense utterly
meaningless in this case. In the absence of any systematic study of
the power function of all these tests, a useful criterion seems to be
the asymptotic power of consistent tests as defined by Pitman [22].
This definition requires that the alternative depends upon the sample
size in the sense that as the sample size increases the alternative
gets closer and closer to the hypothesis. A study of the asymptotic
power function and the associated study of asymptotic relative
efficiency, also suggested by Pitman [22], is particularly useful when one
intends to compare the performance of several consistent tests of the
same hypothesis H in the immediate neighborhood of H.
Under the general conditions stated in Cramér's theorem (Section
7.1), Mitra [14] and Diamond [7] prove a result, to be of considerable
use to us later, on the asymptotic power function of the frequency
$\chi^2$ tests which in our notation can be stated as follows. Consider
the hypothesis

$$H_0:\; p_i = p_i(\alpha_1, \ldots, \alpha_s), \qquad i = 1, \ldots, r,$$

and the alternative

$$H_{1n}:\; p_{in} = p_i(\alpha_1, \ldots, \alpha_s) + d_i/\sqrt{n}, \qquad (1.4)$$

where $\sum_{i=1}^{r} d_i = 0$, but not all $d_i = 0$, and where
$p_{in} \neq p_i(\alpha_1, \ldots, \alpha_s)$ for any $\boldsymbol{\alpha}$
in A.

In order to obtain the asymptotic power, we consider the simple
alternative

$$p_{in} = p_i(\boldsymbol{\alpha}_0) + d_i/\sqrt{n},$$

where $\boldsymbol{\alpha}_0$ is supposed to be the unknown true parameter
point in A under $H_0$. (Notice however that $H_0$ itself does not specify
$\boldsymbol{\alpha}_0$.) Then the statistic $\chi_0^2$, as in (7.4), is, in
the limit as $n \to \infty$, distributed under $H_{1n}$ as a non-central
chi-square variate with r-s-1 degrees of freedom and a non-centrality
parameter given by (1.6).
1.2.  Statement of Problem

As is usually the case in the practical application of a statistical
procedure designed to give good results under some conditions,
one is faced with the problem that the conditions are not completely
met in the situation under consideration. The theory of the
analysis of categorical data is no exception to this. Although in the
derivation of the chi-square test the classifications of observations
into respective categories in the contingency table are assumed to be
correct, there are many practical problems where mistakes in
classification are going to be made (see Bross [4], Rubin, Rosenbaum and
Cobb [27], Diamond and Lilienfeld [8], and Newell [16]). For example, in
Mitra's illustration of a 2x2 table with two categories, attacked (A)
and not attacked ($\bar{A}$), the chance of wrong diagnosis is appreciable
and thus the possibility of an individual belonging to category A being
classified into $\bar{A}$ and vice versa cannot be ignored. The important
question then arises: What effects will misclassification have on the
performance of the usual significance tests? The principal purpose of
this thesis is to answer this question.
Actually, we shall attempt to answer the following more detailed
questions:

i)    If the errors of classification are ignored and the usual
      test applied, when and by how much will the level of
      significance be affected?

ii)   How might we design tests of the usual hypotheses that take
      into account errors of classification and preserve the level
      of significance?

iii)  What effect will increasing misclassification have on the
      asymptotic power of such tests?
1.3.  Review of Literature on the Study of Misclassification

Problems of the above nature have been investigated by Bross [4],
Rubin, Rosenbaum and Cobb [27], Mote [15], Diamond and Lilienfeld [8]
and Newell [16]. However, the more general consideration of the problem
has not been fully explored.
Bross considers the following problems:

i)    Suppose we are sampling from a binomial population of
      "problem individuals" (denoted as I) and "non-problem
      individuals" (denoted as C(I)). Let the probability of an
      observation belonging to I being wrongly classified into C(I)
      be $\theta$ and that of an observation belonging to C(I) being
      wrongly classified into I be $\phi$. Let p be the probability of
      an observation falling into I, if there were no
      misclassification. The problem is this: if x individuals are
      observed to belong to I in a sample of size n, is x/n still an
      unbiased estimate of p? Bross proves that it is not and that the
      bias is (K-1)p, where

      $$K = 1 - \theta + (q/p)\phi \quad \text{and} \quad q = 1 - p.$$

      Further he remarks ([4], p. 481),

          If estimates of $\theta$ and $\phi$ are available from the
          data used to construct the classification system,
          the estimates might be adjusted by dividing by K.
      However, a little thought will reveal that K involves
      the parameter which we are trying to estimate. Hence even
      if $\theta$ and $\phi$ are known, K cannot be used to remove the
      bias. It is easy to see that if $\theta + \phi \neq 1$, then

      $$E\left[\frac{x/n - \phi}{1 - \theta - \phi}\right] = p.$$

      Thus if we have a priori knowledge of $\theta$ and $\phi$ and if
      $\theta + \phi \neq 1$,

      $$\frac{x/n - \phi}{1 - \theta - \phi}$$

      is an unbiased estimate of p.
ii)   The other problem Bross considers is the comparison of
      probabilities in two binomial populations. He shows that if
      the same classification system is used in the two populations
      then misclassification will not affect the "validity" of the
      significance test. Further he remarks ([4], p. 484),
      "We do not get off scot-free, however. Although the tests
      are valid, power may be drastically reduced." Bross presents
      the values of 1/K' for various values of $\theta$, $\phi$ and p,
      where 1/K' is used as the measure of the efficiency of the test
      under misclassification.

      He also indicates some possible extensions to small
      samples.
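The bias result and the correction in Bross's first problem can be checked numerically. The sketch below works with expected values only (no sampling), using $E[x/n] = p(1-\theta) + q\phi$, which is what the factor K summarizes. The function names and the numerical values of p, $\theta$ and $\phi$ are hypothetical and chosen only for illustration.

```python
def expected_observed_rate(p, theta, phi):
    """E[x/n] when true I's are missed with chance theta and
    true non-I's are wrongly counted as I with chance phi."""
    return p * (1.0 - theta) + (1.0 - p) * phi


def bross_k(p, theta, phi):
    """K = 1 - theta + (q/p) * phi, with q = 1 - p."""
    return 1.0 - theta + ((1.0 - p) / p) * phi


def corrected_estimate(observed_rate, theta, phi):
    """(x/n - phi) / (1 - theta - phi); requires theta + phi != 1."""
    if abs(1.0 - theta - phi) < 1e-12:
        raise ValueError("theta + phi must differ from one")
    return (observed_rate - phi) / (1.0 - theta - phi)


# Hypothetical values, assumed known a priori.
p, theta, phi = 0.2, 0.1, 0.05
rate = expected_observed_rate(p, theta, phi)
bias = rate - p
print(bias, (bross_k(p, theta, phi) - 1.0) * p)   # the two should agree
print(corrected_estimate(rate, theta, phi))        # should recover p
```

Note that `bross_k` needs p itself, which is the point made in the text: K cannot be used for correction, whereas `corrected_estimate` uses only $\theta$ and $\phi$.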
Rubin, Rosenbaum and Cobb [27], in the attempt to make a rational
choice between the use of "clinical examination data" and "interview
data" for a given study, examine the effects of misclassification
(present when interview data are used) on the detection of association
between a factor (Smoker and Non-smoker) and a disease (Rheumatoid
Arthritis and no Rheumatoid Arthritis) in a 2x2 contingency table.
They arrive at conclusions similar to those obtained by Bross.
That is, under certain assumptions (e.g., errors occur in one direction
only) the test procedure remains valid although the power of the test
is reduced due to misclassification.

Furthermore, they have demonstrated that compensation can be made
for the loss in power of the test, due to misclassification, by
increasing the sample size; and a method of estimating the ratio of the
sample size needed for "interview data" to the sample size for "clinical
examination data" has been presented.

Diamond and Lilienfeld [8] and Newell [16] focus their attention
on the effects of misclassification upon inferences made from
epidemiological studies, and obtain results which are not materially
different from those obtained by Bross [4] and Rubin, Rosenbaum and
Cobb [27].
Mote [15] attempts a more general treatment of the problem than
those discussed earlier. The investigations of the effects of
misclassification on chi-square tests are made for the goodness-of-fit
test and for the following two cases of t×r contingency table tests:

a)    stratified sampling situations (i.e., contingency tables with
      one marginal fixed), and

b)    random sampling situations (i.e., contingency tables with both
      marginals random).

For the goodness-of-fit tests, the usual test requires modification
when there are errors of classification. If errors are ignored, the
size of the test will increase. As a result of misclassification, the
asymptotic power is reduced.

For contingency tables, the investigation is made only under the
assumption that misclassification can take place only in one direction.
Under this assumption, the usual tests do not need any modification;
however, misclassification does reduce the asymptotic power. If
misclassification is ignored, the size of the test will increase.
1.4.  Outline of the Following Sections

Treatment of particular cases may serve the interests of some, but
it will not necessarily be informative for other cases of interest to
other people. For this reason, we shall make an attempt to deal with
the general formulation of the problems first before taking up particular
cases and applications.

Most of the results on the investigation of the effects of
misclassification so far have been obtained under some restricted
conditions. The main assumption that has to be made is that errors of
classification can arise only in one direction. This takes care of a
certain set of problems. Problems do arise when errors arise in both
directions. To quote Rubin, Rosenbaum and Cobb ([27], p. 263),

    . . . we have made no attempt to study the problem of what
    happens when there is misclassification on both axes. We
    anticipate that this will be a difficult theoretical
    problem, but is worthy of serious attention.
We shall consider the general problem when errors can occur in
both directions (of which the case of errors occurring only in one
direction is just a special case), for

(i)   Two-way contingency tables with both marginals random in
      Section 2, and

(ii)  Two-way contingency tables with one marginal fixed in
      Section 3.

An extension to more than two-way contingency tables is discussed
in Section 4. As an illustration of the generalization, the extension
from the two- to the three-dimensional case is given for the problem of
testing the hypothesis of complete independence. Some comments on the
treatment of the problem of estimation (estimating the underlying
probabilities when misclassification is present) will be given in
Section 5 as a suggestion for future research.
2.  CONTINGENCY TABLE WITH BOTH MARGINALS RANDOM

Before we proceed to discuss the problems in a r×c contingency
table with both marginals random, some preliminary definitions should
be given. The term "individual" refers to things which are to be
classified. A class of characteristics in terms of which individuals
may be classified will be referred to as an "attribute" (i.e., there
are two attributes in a two-way contingency table, one attribute on the
i-th direction and the other attribute on the j-th direction).
"Misclassification" is commonly used to imply that, while on the occasion
of error an individual has been classified to one subclass (i.e., a
cell in the contingency table), it properly² belongs to some other
subclass. The cell to which it belongs is traditionally called its
"true cell". That is, an error of classification occurs when in practice
an observation is classified to a cell other than its true one.

Suppose that the n individuals are randomly taken from some
population and then classified according to two variable arguments into a
two-way table consisting of r rows and c columns. It is often desired
to test that the two variable arguments are independent.

It should be noted that the above sampling scheme represents
sampling from a single multinomial population.

Let $p_{ij}$ denote the probability that a randomly chosen individual
belongs to the i-th row and the j-th column of the contingency table
(i=1,...,r; j=1,...,c), where the $p_{ij}$'s obey
$\sum_{i=1}^{r} \sum_{j=1}^{c} p_{ij} = 1$.

² "Properly" is defined in terms of a result found upon applying
a preferred method of measurement. The term is normative, not empirical.
If $p_{i\cdot} = \sum_{j=1}^{c} p_{ij}$ (i=1,...,r) denote the row marginal
probabilities and $p_{\cdot j} = \sum_{i=1}^{r} p_{ij}$ (j=1,...,c) denote
the column marginal probabilities, then the hypothesis of independence of
the two variable arguments is equivalent to

$$H_0:\; p_{ij} = p_{i\cdot}\, p_{\cdot j}, \qquad (2.1)$$

where $\sum_i p_{i\cdot} = \sum_j p_{\cdot j} = 1$ (i=1,...,r; j=1,...,c).

Let $n_{ij}$ be the observed number of individuals in the (i,j)-th cell
of the contingency table, and write

$$n_{i\cdot} = \sum_{j=1}^{c} n_{ij}, \qquad n_{\cdot j} = \sum_{i=1}^{r} n_{ij}$$

for i=1,...,r and j=1,...,c. The joint density of the observations is

$$\phi = \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i,j} p_{ij}^{\,n_{ij}}, \qquad (2.2)$$

where $\sum_{i}\sum_{j} n_{ij} = n$.

The traditionally used large sample $\chi^2$-test procedure for testing
$H_0$ is to reject $H_0$ if

$$\sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(n_{ij} - n_{i\cdot} n_{\cdot j}/n)^2}{n_{i\cdot} n_{\cdot j}/n} > C,$$

and not to reject otherwise. C is defined so that
$\Pr\{\chi^2_{(r-1)(c-1)} > C\}$ equals the desired level of
significance. (2.3)
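The usual large-sample test of independence described above can be sketched as follows. The function name `independence_statistic`, the 2x2 counts, and the hard-coded critical value are assumptions made for illustration: 3.841 is the familiar upper 5 percent point of chi-square with one degree of freedom, which applies only to a 2x2 table at that level.

```python
def independence_statistic(table):
    """Sum over cells of (n_ij - n_i. n_.j / n)^2 / (n_i. n_.j / n)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (n_ij - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 table of counts n_ij.
table = [[30, 20], [20, 30]]
stat = independence_statistic(table)

# Assumed critical value C: upper 5% point of chi-square, (2-1)(2-1) = 1 d.f.
C = 3.841
reject = stat > C
print(stat, reject)
```

For larger tables C would be taken from the chi-square distribution with (r-1)(c-1) degrees of freedom at the chosen level.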
Let us see how' the errors of classification can be taken into
account while testing the hypothesis of independence"
For the case of two-way contingency tables with both marginals
random, errors of classification can arise in both directions, the i-tb
and the j-th directions,
(!.~.,
errors associated 'with the 1st attribute
and the 2nd attribute).
Let $\theta_{ii'jj'}$ denote the chance of an individual who properly belongs to the (i,j)-th cell being wrongly classified into the (i',j')-th cell ($i \ne i'$ and/or $j \ne j'$), while $\theta_{iijj}$ denotes the probability of correctly assigning the individual of the (i,j)-th cell to the (i,j)-th cell. Clearly,
$$\sum_{i'=1}^{r}\sum_{j'=1}^{c} \theta_{ii'jj'} = 1 \quad (i=1,\dots,r;\ j=1,\dots,c).$$
It can be expected that the following holds:
$$0 \le \theta_{ii'jj'} < \theta_{iijj} \le 1 \quad \text{for } (i',j') \ne (i,j).$$
Let $p_{ij}$ denote the probability that an individual belongs to the (i,j)-th cell when there are no errors of classification, but let $p'_{ij}$ be the probability of an individual being assigned to the (i,j)-th cell when there are errors of classification. It can be seen that
$$p'_{ij} = \sum_{s=1}^{r}\sum_{k=1}^{c} p_{sk}\,\theta_{sikj}, \tag{2.6}$$
where $\sum_{i,j} p'_{ij} = 1$ (i=1,...,r; j=1,...,c).
Equation (2.6) is the basic equation representing the effect of misclassification on the probability model. When there are errors of classification, the joint density of the observations is
$$\phi' = \frac{n!}{\prod_{i,j} n_{ij}!}\,\prod_{i,j} (p'_{ij})^{n_{ij}},$$
where $\sum_{i,j} n_{ij} = n$ and the $p'_{ij}$ are given by (2.6).
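Equation (2.6) can be given a convenient matrix form. Let
$$\mathbf{p}(1\times rc) = (p_{11},\dots,p_{1c},\dots,p_{r1},\dots,p_{rc}), \qquad \mathbf{p}'(1\times rc) = (p'_{11},\dots,p'_{1c},\dots,p'_{r1},\dots,p'_{rc}),$$
and let $\theta(rc\times rc) = (\theta_{ii'jj'})$ denote the stochastic³ matrix whose rows are indexed by the true cell (i,j) and whose columns are indexed by the assigned cell (i',j'); then
$$\mathbf{p}'(1\times rc) = \mathbf{p}(1\times rc)\,\theta(rc\times rc). \tag{2.8}$$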
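As a small numerical illustration of the basic equation (2.6), the sketch below stores the misclassification chances in a nested list `theta[s][k][i][j]` (true cell (s,k), assigned cell (i,j)). The function name, the storage layout and the numbers (each cell kept with chance 0.88 and sent to each other cell with chance 0.04) are assumptions chosen for the example only.

```python
def misclassified_probs(p, theta):
    """p'[i][j] = sum over (s, k) of p[s][k] * theta[s][k][i][j]."""
    r, c = len(p), len(p[0])
    return [[sum(p[s][k] * theta[s][k][i][j]
                 for s in range(r) for k in range(c))
             for j in range(c)]
            for i in range(r)]


r, c = 2, 2
# Hypothetical error pattern: rows of theta (fixed true cell) sum to one.
theta = [[[[0.88 if (i, j) == (s, k) else 0.04 for j in range(c)]
           for i in range(r)]
          for k in range(c)]
         for s in range(r)]
p = [[0.1, 0.2], [0.3, 0.4]]        # hypothetical true cell probabilities
p_prime = misclassified_probs(p, theta)
total = sum(sum(row) for row in p_prime)
```

Because every true cell distributes its whole probability over the assigned cells, the $p'_{ij}$ again sum to one.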
Throughout this thesis, only cases where the misclassification matrix $\theta(rc\times rc)$ is non-singular are considered. The case of singular $\theta(rc\times rc)$ is cumbersome to handle and is of limited practical importance.
3 A square matrix $A = (a_{jk})$ of order r is called stochastic if all the elements are non-negative and if $\sum_{k=1}^{r} a_{jk} = 1$ (j=1,...,r).
Two cases will be considered separately:
(I) The case of independent errors (i.e., the error in the i-th direction is independent of the error in the j-th direction).
(II) The case of non-independent errors.
2.1. Case I: Independent Errors
For the i-th direction: let $\rho_{kk'}$ denote the chance of an individual belonging to the k-th category being wrongly classified into the k'-th category ($k \ne k' = 1,\dots,r$), and $\rho_{kk}$ the probability of correctly assigning the individual of the k-th category to the k-th category (k=1,...,r). Clearly,
$$\sum_{k'=1}^{r} \rho_{kk'} = 1 \quad \text{for all } k=1,\dots,r.$$
Similarly, for the j-th direction: let $\gamma_{hh'}$ denote the chance of an individual belonging to the h-th category being wrongly classified into the h'-th category ($h \ne h' = 1,\dots,c$), and $\gamma_{hh}$ the probability of correctly assigning the individual of the h-th category to the h-th category (h=1,...,c). Clearly,
$$\sum_{h'=1}^{c} \gamma_{hh'} = 1 \quad \text{for all } h=1,\dots,c. \tag{2.10}$$
It can be expected that the following hold:
$$0 \le \rho_{kk'} < \rho_{kk} \le 1 \quad (k \ne k') \qquad \text{and} \qquad 0 \le \gamma_{hh'} < \gamma_{hh} \le 1 \quad (h \ne h'). \tag{2.11}$$
That the errors of classification are independent is taken to mean that
$$\theta_{ii'jj'} = \rho_{ii'}\,\gamma_{jj'}, \tag{2.12}$$
where
$$\sum_{i',j'} \theta_{ii'jj'} = \sum_{i'} \rho_{ii'} = \sum_{j'} \gamma_{jj'} = 1 \quad (i=1,\dots,r;\ j=1,\dots,c).$$
In matrix notation, we let $\rho(r\times r)$ denote the stochastic matrix $(\rho_{ii'})$, i=1,...,r; i'=1,...,r, and $\gamma(c\times c)$ the stochastic matrix $(\gamma_{jj'})$, j=1,...,c; j'=1,...,c.
For independent errors, it can be seen that
$$\theta(rc\times rc) = \rho(r\times r) * \gamma(c\times c), \tag{2.14}$$
where the symbol "*" denotes the "Kronecker product"⁴ (or sometimes called the "Direct product") of two matrices.
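4 The Kronecker product of two matrices A and B is defined to be the matrix C which has the partitioned form
$$C = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1w}B \\ a_{21}B & a_{22}B & \cdots & a_{2w}B \\ \vdots & \vdots & & \vdots \\ a_{v1}B & a_{v2}B & \cdots & a_{vw}B \end{pmatrix}$$
and is written as C = A * B, where $A = (a_{ij})$ has v rows and w columns.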
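The Kronecker product form of the misclassification matrix for independent errors can be checked numerically. The sketch below is illustrative only: the helper `kronecker` and the two small stochastic matrices `rho` and `gamma` are hypothetical, and cells are ordered (1,1), (1,2), ..., (r,c) as in the row vectors used for the matrix form of the basic equation.

```python
def kronecker(a, b):
    """Kronecker (direct) product of two matrices given as lists of rows."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


rho = [[0.9, 0.1], [0.2, 0.8]]        # hypothetical errors in the i-th direction
gamma = [[0.95, 0.05], [0.1, 0.9]]    # hypothetical errors in the j-th direction
theta = kronecker(rho, gamma)          # 4 x 4 matrix with entries rho_ii' * gamma_jj'
row_sums = [sum(row) for row in theta]

# Errors in the i-th direction only: gamma is replaced by the identity.
identity = [[1.0, 0.0], [0.0, 1.0]]
theta_i_only = kronecker(rho, identity)
```

Each row of the product sums to one, so the product of two stochastic matrices in this sense is again stochastic.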
Before proceeding in the treatment of the case where errors can occur in both directions and they are independent, it may be of interest to show that the case when errors occur in only one direction is a special case of the one we are considering. This can be seen as follows:
For the case of errors occurring only in the i-th direction, we have
$$\theta(rc\times rc) = (\theta_{ii'jj'}),$$
where
$$\theta_{ii'jj'} = \begin{cases} \rho_{ii'} & \text{if } j=j' \\ 0 & \text{otherwise,} \end{cases}$$
which can be written as
$$\theta(rc\times rc) = \rho(r\times r) * I(c\times c).$$
That is, when errors occur only in the i-th direction, we have
$$\theta(rc\times rc) = \rho(r\times r) * \gamma(c\times c), \quad \text{where } \gamma(c\times c) = I(c\times c).$$
The case of errors occurring only in the j-th direction can be represented by making appropriate interchanges of "i" and "j" in the above i-th direction case.
Therefore, the results we will obtain for the case of independent errors are applicable to either of the above cases.
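For the case of independent errors, we have
$$H_0:\ p_{ij} = p_{i.}p_{.j} \iff H'_0:\ p'_{ij} = p'_{i.}p'_{.j}. \tag{2.16}$$
To see this we may proceed as follows: under $H_0$,
$$p'_{ij} = \sum_{s,k} p_{s.}p_{.k}\,\rho_{si}\gamma_{kj} = \Big(\sum_s p_{s.}\rho_{si}\Big)\Big(\sum_k p_{.k}\gamma_{kj}\Big) = p'_{i.}\,p'_{.j}, \text{ say,}$$
where
$$p'_{i.} = \sum_s p_{s.}\rho_{si} \quad \text{and} \quad p'_{.j} = \sum_k p_{.k}\gamma_{kj}.$$
Similarly, since $\theta(rc\times rc)$ is assumed to be non-singular, it can be shown that if
$$p'_{ij} = p'_{i.}p'_{.j},$$
then
$$p_{ij} = p_{i.}p_{.j} \quad (i=1,\dots,r;\ j=1,\dots,c).$$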
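The forward direction of this equivalence is easy to verify numerically. The following sketch assumes hypothetical marginal probabilities and error matrices for a 2 x 3 table; the function name `apply_independent_errors` is an illustrative choice, and the code simply applies the basic equation with $\theta_{sikj} = \rho_{si}\gamma_{kj}$.

```python
def apply_independent_errors(p, rho, gamma):
    """p'[i][j] = sum over (s, k) of p[s][k] * rho[s][i] * gamma[k][j]."""
    r, c = len(p), len(p[0])
    return [[sum(p[s][k] * rho[s][i] * gamma[k][j]
                 for s in range(r) for k in range(c))
             for j in range(c)]
            for i in range(r)]


row = [0.3, 0.7]                       # hypothetical p_i.
col = [0.2, 0.5, 0.3]                  # hypothetical p_.j
p = [[pi * pj for pj in col] for pi in row]   # independence holds for p

rho = [[0.9, 0.1], [0.2, 0.8]]
gamma = [[0.8, 0.1, 0.1], [0.05, 0.9, 0.05], [0.1, 0.2, 0.7]]

p_prime = apply_independent_errors(p, rho, gamma)
row_prime = [sum(r_) for r_ in p_prime]
col_prime = [sum(c_) for c_ in zip(*p_prime)]
# Largest departure of p' from the product of its own margins.
max_gap = max(abs(p_prime[i][j] - row_prime[i] * col_prime[j])
              for i in range(2) for j in range(3))
```

Up to rounding error `max_gap` is zero, i.e. the misclassified table is again a product of its margins.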
Hence, rejection or acceptance of $H'_0$ implies rejection or acceptance of $H_0$. Consequently the size of the rejection region under $H_0$ is the same as before, when there is no misclassification (i.e., when $\theta(rc\times rc) = I(rc\times rc)$, the identity matrix). Thus, the problem reduces to that of testing $H'_0$. The test procedure is to reject $H'_0$ (and hence $H_0$) if the statistic $X'^2$ (see footnote 5) satisfies
$$X'^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(n_{ij} - n_{i.}n_{.j}/n)^2}{n_{i.}n_{.j}/n} > C, \tag{2.18}$$
and not to reject otherwise; the quantities C, $n_{i.}$, $n_{.j}$ and n are being defined exactly the same way as in Section 2.1.
In this case it is seen that we get the same test procedure as we would have obtained if there had been no errors of classification.
It should be noted here that in the case of independent errors, we obtain the same result regardless of whether $\theta(rc\times rc)$ is known or not. Since $\theta(rc\times rc)$ does not enter into the expression of the $\chi^2$-test statistic, no estimate of $\theta(rc\times rc)$ is needed when $\theta(rc\times rc)$ is unknown.
2.2. The Asymptotic Power Function for the Case of Independent Errors
Before any further discussion on the asymptotic power function of the test procedure defined by (2.18), we shall first evaluate the asymptotic power function for the case of no misclassification. Our ultimate object is to compare the asymptotic powers of the two cases, with and without errors.
When there are no errors of classification, the test procedure for the hypothesis of independence is defined by (2.3). We will now apply the result on the asymptotic power function of the frequency $\chi^2$-tests stated in Section 1.1 to the case of no misclassification.
5 The prime is used here as a reminder that the data used in computing the statistic $X'^2$ is subject to errors of classification. That is, the probability distribution function of the observations $\{n_{ij}\}$ depends on $\{p'_{ij}\}$.
When testing $H_0$: $p^0_{ij} = p^0_{i.}p^0_{.j}$, where $\sum_i p^0_{i.} = \sum_j p^0_{.j} = 1$ (i=1,...,r; j=1,...,c), one considers the simple alternative
$$H_{1n}:\ p_{ij} = p^0_{i.}p^0_{.j} + d_{ij}/\sqrt{n}, \tag{2.19}$$
where $\{d_{ij}\}$ is a set of deviation parameters, not all zero, such that $\sum_{i,j} d_{ij} = 0$, and where $\{p^0_{i.}, p^0_{.j}\}$ is supposed to be the true unknown parameter point in A under $H_0$.
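Using the result on the asymptotic power function of the frequency $\chi^2$-tests, we have that the statistic $X^2$, as defined by (2.3), is in the limit as $n \to \infty$ distributed under $H_{1n}$ as a non-central chi-square variate with (r-1)(c-1) degrees of freedom and a non-centrality parameter
$$\Delta = \boldsymbol{\delta}^{T}\,[\,I - B(B^{T}B)^{-1}B^{T}\,]\,\boldsymbol{\delta}, \tag{2.20}$$
where
$$\boldsymbol{\delta}^{T}(1\times rc) = \Big\{\frac{d_{ij}}{\sqrt{p^0_{ij}}}\Big\}, \qquad B(rc\times r+c-2) = \Big\{\frac{1}{\sqrt{p^0_{ij}}}\Big(\frac{\partial p_{ij}}{\partial \alpha_k}\Big)_0\Big\},$$
with $\alpha_k = p_{k.}$ for k=1,...,r-1, and $\alpha_k = p_{.k'}$ for k=r,r+1,...,r+c-2 and k'=k-r+1.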
The asymptotic power of the test is defined as
$$\beta = \lim_{n\to\infty} \Pr\{X^2 \ge C \mid H_{1n}\},$$
which, from the above result, can be seen to be equal to
$$1 - F(C,(r-1)(c-1),\Delta) \quad \text{(see footnote 6)},$$
where C, $H_{1n}$ and $\Delta$ are as defined in (2.3), (2.19) and (2.20) respectively.
It is shown (see Section 7.2) that the non-centrality parameter $\Delta$, as defined by (2.20), can be expressed in an explicit form as follows:
$$\Delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{i.}p^0_{.j}} - \sum_i \frac{d_{i.}^2}{p^0_{i.}} - \sum_j \frac{d_{.j}^2}{p^0_{.j}}, \tag{2.24}$$
where $d_{i.} = \sum_j d_{ij}$ and $d_{.j} = \sum_i d_{ij}$. If the set of deviation parameters considered is such that
$$d_{i.} = d_{.j} = 0 \quad (i=1,\dots,r;\ j=1,\dots,c),$$
then
$$\Delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{i.}p^0_{.j}}. \tag{2.26}$$
6 Let $X^2 = \sum_{i=1}^{n} z_i^2$, where the $z_i$ are independent $N(a_i, 1)$. The distribution of $X^2$ is called the non-central chi-square. Define $\Delta = \sum_{i=1}^{n} a_i^2$. If $f(X^2)$ denotes the density function of $X^2$, then
$$f(X^2) = \frac{e^{-(X^2+\Delta)/2}\,(X^2)^{n/2-1}}{2^{n/2}\,\Gamma(n/2)} \Big[\,1 + \frac{\Delta X^2}{2n} + \frac{(\Delta X^2)^2}{2\cdot 4\cdot n(n+2)} + \cdots \Big].$$
Define
$$F(x,n,\Delta) = \int_0^x f(X^2)\,dX^2.$$
The expression (2.26) is more commonly used than (2.24) (see Mitra [13] and Mote [15]), and it, with the restrictions $d_{i.} = d_{.j} = 0$, will be used from Section 2.4 on.
The function $1 - F(C,(r-1)(c-1),\Delta)$ can be evaluated by Patnaik's procedure [19] or calculated from Fix's table [11].
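Where neither Patnaik's approximation nor Fix's tables are at hand, the function $1 - F(C,(r-1)(c-1),\Delta)$ can also be evaluated directly. The sketch below is not the procedure cited in the text; it assumes the standard representation of the non-central chi-square distribution as a Poisson mixture of central chi-squares, and all function names and the critical value 3.841459 (upper 5% point for one degree of freedom) are assumptions made for the illustration.

```python
import math


def reg_lower_gamma(a, x, max_terms=1000):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < 1e-17 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def noncentral_chi2_cdf(x, df, delta, max_terms=500):
    """F(x, df, delta) as a Poisson(delta/2) mixture of central chi-square CDFs."""
    half = delta / 2.0
    weight = math.exp(-half)              # Poisson weight for j = 0
    total = 0.0
    for j in range(max_terms):
        # Central chi-square CDF with df + 2j degrees of freedom.
        total += weight * reg_lower_gamma(df / 2.0 + j, x / 2.0)
        weight *= half / (j + 1)          # Poisson weight for j + 1
    return total


def asymptotic_power(c_crit, df, delta):
    """1 - F(C, df, delta)."""
    return 1.0 - noncentral_chi2_cdf(c_crit, df, delta)


C_05_1DF = 3.841459                       # assumed critical value, alpha = 0.05, 1 d.f.
power_a = asymptotic_power(C_05_1DF, 1, 10.509)
power_b = asymptotic_power(C_05_1DF, 1, 7.849)
```

With $\Delta = 0$ the function returns the size of the test, which gives a quick sanity check of the critical value used.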
We now wish to study the effect of misclassification on this asymptotic power. In other words we wish to find the asymptotic power of the same alternative $H_{1n}$ for the case when misclassification is present, namely,
$$\beta' = \lim_{n\to\infty} \Pr\{X'^2 \ge C \mid H_{1n}\},$$
where C and $H_{1n}$ are as defined in (2.18) and (2.19) respectively.
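To find $\beta'$, first observe that by using the basic equation (2.6) and the assumptions that errors are independent and the misclassification matrix is non-singular, we can show that
$$H_{1n}:\ p_{ij} = p^0_{i.}p^0_{.j} + d_{ij}/\sqrt{n} \iff H'_{1n}:\ p'_{ij} = p'^0_{i.}p'^0_{.j} + d'_{ij}/\sqrt{n},$$
where
$$d'_{ij} = \sum_{s=1}^{r}\sum_{k=1}^{c} d_{sk}\,\rho_{si}\gamma_{kj} \quad \text{and} \quad \sum_{i,j} d'_{ij} = 0,$$
and where
$$p'^0_{i.} = \sum_s p^0_{s.}\rho_{si}, \qquad p'^0_{.j} = \sum_k p^0_{.k}\gamma_{kj}.$$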
Therefore,
$$\beta' = \lim_{n\to\infty} \Pr\{X'^2 \ge C \mid H_{1n}\} = \lim_{n\to\infty} \Pr\{X'^2 \ge C \mid H'_{1n}\}.$$
Using the result on the asymptotic power function of the frequency $\chi^2$-tests as discussed earlier, it can be seen that
$$\lim_{n\to\infty} \Pr\{X'^2 \ge C \mid H'_{1n}\} = 1 - F(C,(r-1)(c-1),\Delta'),$$
where
$$\Delta' = \sum_{i,j} \frac{d'^2_{ij}}{p'^0_{i.}p'^0_{.j}} - \sum_i \frac{d'^2_{i.}}{p'^0_{i.}} - \sum_j \frac{d'^2_{.j}}{p'^0_{.j}}$$
and
$$d'_{i.} = \sum_j d'_{ij} = \sum_{s=1}^{r} d_{s.}\rho_{si}, \qquad d'_{.j} = \sum_i d'_{ij} = \sum_{k=1}^{c} d_{.k}\gamma_{kj}.$$
The function $1 - F(C,(r-1)(c-1),\Delta)$ is a strictly monotonic increasing function of $\Delta$ (see Roy [23]). Consequently, to compare the two functions, $1 - F(C,(r-1)(c-1),\Delta)$ and $1 - F(C,(r-1)(c-1),\Delta')$, we need only to compare $\Delta$ and $\Delta'$, i.e., $\beta' \le \beta$ if and only if $\Delta' \le \Delta$.
We shall now show that $\Delta' \le \Delta$.
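Since $\sum_{i,j} d'_{ij} = 0$, $\Delta'$ can be written as
$$\Delta' = \sum_{i,j} \frac{[\,d'_{ij} - p'^0_{.j}d'_{i.} - p'^0_{i.}d'_{.j}\,]^2}{p'^0_{i.}p'^0_{.j}} = \sum_{i,j} \frac{1}{p'^0_{i.}p'^0_{.j}} \Big\{\sum_s \rho_{si}\Big[\sum_k d_{sk}\gamma_{kj} - \Big(\sum_k p^0_{.k}\gamma_{kj}\Big)d_{s.} - \Big(\sum_k d_{.k}\gamma_{kj}\Big)p^0_{s.}\Big]\Big\}^2.$$
By Cauchy's inequality,
$$\Big(\sum_m a_m b_m\Big)^2 \le \Big(\sum_m a_m^2\Big)\Big(\sum_m b_m^2\Big), \tag{2.32}$$
applied to the sum over s, we see that
$$\Big\{\sum_s \rho_{si}\Big[\sum_k d_{sk}\gamma_{kj} - \Big(\sum_k p^0_{.k}\gamma_{kj}\Big)d_{s.} - \Big(\sum_k d_{.k}\gamma_{kj}\Big)p^0_{s.}\Big]\Big\}^2 \le p'^0_{i.} \sum_s \frac{\rho_{si}}{p^0_{s.}}\Big[\sum_k d_{sk}\gamma_{kj} - \Big(\sum_k p^0_{.k}\gamma_{kj}\Big)d_{s.} - \Big(\sum_k d_{.k}\gamma_{kj}\Big)p^0_{s.}\Big]^2. \tag{2.33}$$
Thus
$$\Delta' \le \sum_j \sum_s \frac{1}{p^0_{s.}p'^0_{.j}} \Big[\sum_k d_{sk}\gamma_{kj} - \Big(\sum_k p^0_{.k}\gamma_{kj}\Big)d_{s.} - \Big(\sum_k d_{.k}\gamma_{kj}\Big)p^0_{s.}\Big]^2 = \sum_j \sum_s \frac{1}{p^0_{s.}p'^0_{.j}} \Big\{\sum_k \gamma_{kj}\,[\,d_{sk} - p^0_{.k}d_{s.} - d_{.k}p^0_{s.}\,]\Big\}^2.$$
Applying Cauchy's inequality, the above is seen to be
$$\le \sum_j \sum_s \sum_k \frac{\gamma_{kj}\,[\,d_{sk} - p^0_{.k}d_{s.} - d_{.k}p^0_{s.}\,]^2}{p^0_{s.}p^0_{.k}} \tag{2.34}$$
$$= \sum_{s,k} \frac{[\,d_{sk} - p^0_{.k}d_{s.} - d_{.k}p^0_{s.}\,]^2}{p^0_{s.}p^0_{.k}} = \Delta.$$
Therefore, $\Delta' \le \Delta$, i.e.,
$$1 - F(C,(r-1)(c-1),\Delta') \le 1 - F(C,(r-1)(c-1),\Delta).$$
From (2.32), Cauchy's inequality, equality holds if and only if (a) and (b) are proportional (see Hardy, Littlewood and Polya, Theorem 7).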
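Consider the inequality (2.33); equality holds if and only if
$$\Big(\frac{\rho_{si}}{p^0_{s.}}\Big)^{1/2}\Big[\sum_k d_{sk}\gamma_{kj} - \Big(\sum_k p^0_{.k}\gamma_{kj}\Big)d_{s.} - p^0_{s.}\Big(\sum_k d_{.k}\gamma_{kj}\Big)\Big] = \tau_{ij}\,(p^0_{s.}\rho_{si})^{1/2} \quad \text{for all } i \text{ and } s,$$
where the $\tau_{ij}$'s are some constants not equal to zero. For some i, if none of the $\rho_{si}$'s (s=1,...,r) are zero, the above becomes
$$\sum_k d_{sk}\gamma_{kj} - d_{s.}\Big(\sum_k p^0_{.k}\gamma_{kj}\Big) - p^0_{s.}\Big(\sum_k d_{.k}\gamma_{kj}\Big) = \tau_{ij}\,p^0_{s.} \quad \text{for } s=1,\dots,r. \tag{2.36}$$
This cannot be true, since $\sum_s p^0_{s.} = 1$ and $\sum_s d_{s.} = 0$.
Therefore, if for some i, none of the $\rho_{si}$'s (s=1,...,r) are zero, (2.33) is a strict inequality.
Similarly for (2.34), equality holds if and only if
$$\Big(\frac{\gamma_{kj}}{p^0_{.k}}\Big)^{1/2}[\,d_{sk} - d_{s.}p^0_{.k} - p^0_{s.}d_{.k}\,] = \tau_{sj}\,(p^0_{.k}\gamma_{kj})^{1/2} \quad \text{for all } s \text{ and } j.$$
If none of the $\gamma_{kj}$'s are zero for some j, then
$$d_{sk} - d_{s.}p^0_{.k} - p^0_{s.}d_{.k} = \tau_{sj}\,p^0_{.k} \quad \text{for all } k=1,\dots,c.$$
Again, this cannot be true, since $\sum_k p^0_{.k} = 1$.
Hence, if none of the $\gamma_{kj}$'s are zero for some j, (2.34) is a strict inequality.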
That is, $\Delta' \le \Delta$ in general and $\Delta' < \Delta$ if, for some i, none of the $\rho_{si}$'s (s=1,...,r) are zero or, for some j, none of the $\gamma_{kj}$'s (k=1,...,c) are zero. (For detailed statement on the conditions for which $\Delta' = \Delta$ for some sets of $\{d_{ij}\}$ and $\{p^0_{ij}\}$, see Section 2.7.) We then have the result for the case of independent errors that although misclassification does not affect the size of the test procedure, it tends to reduce the asymptotic power.
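The reduction of the non-centrality parameter can be seen in a small numerical case. The sketch below assumes deviations $d_{ij}$ with zero row and column sums, so that $\Delta$ takes the simple form $\sum d_{ij}^2/(p^0_{i.}p^0_{.j})$ and the transformed deviations $d'_{ij} = \sum_{s,k} d_{sk}\rho_{si}\gamma_{kj}$ again have zero margins. All numbers and function names are hypothetical.

```python
def delta_zero_margins(d, p_row, p_col):
    """sum_ij d_ij^2 / (p_i. p_.j); valid when d has zero row and column sums."""
    return sum(d[i][j] ** 2 / (p_row[i] * p_col[j])
               for i in range(len(p_row)) for j in range(len(p_col)))


def push_matrix(m, rho, gamma):
    """m'[i][j] = sum over (s, k) of m[s][k] * rho[s][i] * gamma[k][j]."""
    r, c = len(rho), len(gamma)
    return [[sum(m[s][k] * rho[s][i] * gamma[k][j]
                 for s in range(r) for k in range(c))
             for j in range(c)]
            for i in range(r)]


def push_vector(v, mat):
    """v'[i] = sum over s of v[s] * mat[s][i]."""
    return [sum(v[s] * mat[s][i] for s in range(len(v)))
            for i in range(len(mat[0]))]


p_row, p_col = [0.4, 0.6], [0.3, 0.7]          # hypothetical null margins
d = [[1.0, -1.0], [-1.0, 1.0]]                 # deviations with zero margins
rho = [[0.9, 0.1], [0.2, 0.8]]
gamma = [[0.85, 0.15], [0.1, 0.9]]

delta = delta_zero_margins(d, p_row, p_col)
delta_prime = delta_zero_margins(push_matrix(d, rho, gamma),
                                 push_vector(p_row, rho),
                                 push_vector(p_col, gamma))
```

In this case the non-centrality parameter falls from about 19.8 to about 5.0, a strict reduction, as expected since no element of `rho` or `gamma` is zero.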
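2.3. Illustration for the Case of Independent Errors
To illustrate the above point, let us consider the following example:
Let
$$\rho_{ii'} = \begin{cases} \theta & \text{if } i \ne i' \\ 1-(r-1)\theta & \text{otherwise} \end{cases} \qquad \text{and} \qquad \gamma_{jj'} = \begin{cases} \gamma\theta & \text{if } j \ne j' \\ 1-\gamma(c-1)\theta & \text{otherwise,} \end{cases} \tag{2.37}$$
where $0 < \theta < \min(1/r, 1/\gamma c)$; (i=1,...,r; j=1,...,c).
Suppose the errors are independent, then
$$\theta_{ii'jj'} = \begin{cases} [1-(r-1)\theta][1-\gamma(c-1)\theta] & \text{if } i=i',\ j=j' \\ [1-(r-1)\theta]\,\gamma\theta & \text{if } i=i',\ j \ne j' \\ \theta\,[1-\gamma(c-1)\theta] & \text{if } i \ne i',\ j=j' \\ \gamma\theta^2 & \text{if } i \ne i',\ j \ne j'. \end{cases} \tag{2.38}$$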
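Using the basic equation (2.6), it can be seen that
$$p'_{ij} = (1-r\theta)(1-\gamma c\theta)\,p_{ij} + \gamma\theta(1-r\theta)\,p_{i.} + \theta(1-\gamma c\theta)\,p_{.j} + \gamma\theta^2 \quad (i=1,\dots,r;\ j=1,\dots,c).$$
The hypothesis we are testing is
$$H_0:\ p_{ij} = p_{i.}p_{.j}, \quad \text{where } \sum_i p_{i.} = \sum_j p_{.j} = 1,$$
and, in this case, $H'_0$ is
$$H'_0:\ p'_{ij} = p'_{i.}p'_{.j}, \tag{2.40}$$
where $\sum_i p'_{i.} = \sum_j p'_{.j} = 1$ and
$$p'_{i.} = (1-r\theta)\,p_{i.} + \theta, \qquad p'_{.j} = (1-\gamma c\theta)\,p_{.j} + \gamma\theta \quad (i=1,\dots,r;\ j=1,\dots,c).$$
That is, the test procedure is that defined by (2.18).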
For the simple alternative $H_{1n}$, where $\sum_i d_{ij} = \sum_j d_{ij} = 0$, we have from (2.26) that
$$\Delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{i.}p^0_{.j}},$$
and also that
$$\Delta' = \sum_{i,j} \frac{d'^2_{ij}}{p'^0_{ij}}, \tag{2.41}$$
where
$$d'_{ij} = (1-r\theta)(1-\gamma c\theta)\,d_{ij} \quad \Big(\sum_i d'_{ij} = \sum_j d'_{ij} = 0\Big)$$
and
$$p'^0_{ij} = [(1-r\theta)p^0_{i.} + \theta]\,[(1-\gamma c\theta)p^0_{.j} + \gamma\theta] \quad (i=1,\dots,r;\ j=1,\dots,c).$$
It is clear that $\Delta' < \Delta$ for $0 < \theta < \min(1/r, 1/\gamma c)$, i.e., the asymptotic power is reduced due to misclassification.
For a given set of $\{p^0_{i.}, p^0_{.j}\}$, if we fix $\Delta$ and $\alpha$ and calculate $\Delta'$ for various values of $\theta$ and $\gamma$, we will be able to compare the asymptotic power functions for the two situations.
For example: Let $p^0_{i.} = 1/r$ and $p^0_{.j} = 1/c$, then
$$\Delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{i.}p^0_{.j}} = rc \sum_{i,j} d_{ij}^2$$
and
$$\Delta' = rc \sum_{i,j} (1-r\theta)^2(1-\gamma c\theta)^2\,d_{ij}^2 = (1-r\theta)^2(1-\gamma c\theta)^2\,\Delta.$$
For r=c=2, $\Delta$ = 10.509 and $\alpha$ = 0.05; if $\theta$ = .03518 and $\gamma$ = 1 then $\Delta'$ = 7.849. $1-F(C,1,10.509)$ and $1-F(C,1,7.849)$ can be read from Fix's [11] tables of non-central chi-square. These values are seen to be 0.9 and 0.8 respectively. That is, if $\alpha$ = 0.05 and $\theta$ = .03518, the asymptotic power⁷ would be reduced from 0.9 to 0.8. These reductions in power could be obtained for other values of $\theta$ and $\gamma$. Tables 2.4.1 and 2.4.2 show the effect of misclassification on the asymptotic power for the situation mentioned above.
The numerical calculations show that misclassification (of the given pattern) becomes more serious as the significance probability $\alpha$ decreases; e.g., from Table 2.4.1 (a), we observe that if $\alpha$ = 0.05, then $\theta$ = .03518 reduces the asymptotic power from 0.9 to 0.8, but if $\alpha$ = 0.01 then $\theta$ = .02937 reduces the asymptotic power from 0.9 to 0.8.
Table 2.4.1 also gives the ratio $\Delta'/\Delta$, which is defined to be the measure of the efficiency of the test under misclassification. In fact, the ratio $\Delta'/\Delta$ can be taken to be the ratio of sample size for data which is free of errors to the sample size for data which is subject to misclassification to produce the same asymptotic power. Hence, when there is a loss in power due to errors of classification and it is desirable to restore this power to the proposed level, this can be accomplished by enlarging the sample size.
7 The problem of "how good" is the asymptotic power, when the sample size is moderate, has not been established. However, we have made some comparisons between the exact power and the asymptotic power of the large sample $\chi^2$-test for the case of a 2x2 contingency table, which show that, even for a sample size of 30, the asymptotic power behaves quite remarkably well. This can be seen as follows:
Comparison of exact power and asymptotic power in the 2x2 contingency table (with one marginal fixed):
p11/p1.    p21/p2.    Exact power    Asymptotic power (Approx.)
  .3         .1          .254           .278
  .6         .1          .852           .811
  .7         .2          .802           .784
  .8         .2          .932           .911
  .7         .3          .587           .591
  .6         .4          .186           .195
The values of the exact power are taken from W. L. Harkness and L. Katz's "Comparison of the Power Functions for the Test of Independence in 2x2 Contingency Tables" (AMS, 35 pp. 1115-1127, September, 1964).
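For the error pattern of this illustration with $p^0_{i.} = 1/r$ and $p^0_{.j} = 1/c$, the efficiency ratio reduces to $(1-r\theta)^2(1-\gamma c\theta)^2$, and its reciprocal gives the factor by which the sample size would have to be enlarged. The following sketch evaluates both; the function name and the use of the reciprocal as an "inflation factor" are illustrative conveniences rather than notation from the thesis.

```python
def efficiency_ratio(r, c, theta, gamma=1.0):
    """Delta'/Delta for uniform margins p_i. = 1/r, p_.j = 1/c."""
    if not 0.0 < theta < min(1.0 / r, 1.0 / (gamma * c)):
        raise ValueError("theta must lie in (0, min(1/r, 1/(gamma*c)))")
    return (1.0 - r * theta) ** 2 * (1.0 - gamma * c * theta) ** 2


ratio = efficiency_ratio(2, 2, 0.03518)     # about 0.7469
delta_prime = ratio * 10.509                # about 7.849
inflation = 1.0 / ratio                     # sample size multiplier, about 1.34
```

Here a misclassification chance of about 3.5 percent in each direction of a 2 x 2 table calls for roughly one third more observations to keep the same asymptotic power.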
2.4. Case II: Non-Independent Errors
When errors are not independent, the relationship (2.16) does not hold. Consequently the problem of testing $H_0$ cannot be reduced to that of testing $H'_0$, as was the case of independent errors. A procedure for testing $H_0$ can however be obtained and so let us proceed to see how.
Recall that under the null hypothesis, we have
$$p'_{ij} = \sum_{s=1}^{r}\sum_{k=1}^{c} p_{s.}p_{.k}\,\theta_{sikj},$$
where $\sum_s p_{s.} = \sum_k p_{.k} = 1$ (i=1,...,r; j=1,...,c). These rc functions $\{p'_{11},\dots,p'_{rc}\}$ are expressed in terms of a number, $\mu$ say, of unknown parameters.⁸ Consequently, to obtain a procedure for testing $H_0$, we may use the results from Cramer's theory (stated in Section 7.1) and some results involving BAN estimates given in Neyman
8 The number $\mu$ of independent parameters that enter into the expression of $p'_{11},\dots,p'_{rc}$ depends upon the number of independent unknown elements in the misclassification matrix $\theta(rc\times rc)$. If $\theta(rc\times rc)$ is known, $\mu$ is easily seen to be (r+c-2); the independent unknown parameters are $p_{s.}$ (s=1,...,r-1) and $p_{.k}$ (k=1,...,c-1). If $\theta(rc\times rc)$ is unknown, let $u \le rc(rc-1)$ be the number of independent unknown elements in $\theta(rc\times rc)$, then $\mu = u+(r+c-2)$.
Table 2.4.1. Effect of misclassification (independent errors with the same constant rates both on the i-th and j-th directions) on the asymptotic power when testing the hypothesis of independence in a rxc contingency table

(a) $p^0_{i.} = 1/r$; $p^0_{.j} = 1/c$; $\gamma$ = 1, r=c=2; $\alpha$ = 0.05

   θ         Δ         Δ'       Δ'/Δ      Power
0.00000    10.509    10.509    1.00000     0.9
 .03518               7.849     .74688      .8
 .06229               6.172     .58731      .7
 .08685               4.899     .46617      .6
 .11123               3.841     .36550      .5
 .13727               2.911     .27700      .4
 .16759               2.058     .19583      .3
 .20684               1.242     .11818      .2
 .27564               0.426     .04054      .1

0.00000     7.849     7.849    1.00000     0.8
 .02916               6.172     .78634      .7
 .05558               4.899     .62416      .6
 .08181               3.841     .48936      .5
 .10981               2.911     .37088      .4
 .14221               2.058     .26220      .3
 .18465               1.242     .15824      .2
 .25867               0.426     .05427      .1

0.00000     6.172     6.172    1.00000     0.7
 .02806               4.899     .79375      .6
 .05591               3.841     .62233      .5
 .08564               2.911     .47165      .4
 .12005               2.058     .33344      .3
 .16512               1.242     .20123      .2
 .24372               0.426     .06902      .1

0.00000     4.899     4.899    1.00000     0.6
 .02951               3.841     .78403      .5
 .06101               2.911     .59420      .4
 .09747               2.058     .42009      .3
 .14521               1.242     .25352      .2
 .22848               0.426     .08696      .1

0.00000     3.841     3.841    1.00000     0.5
 .03348               2.911     .75788      .4
 .07222               2.058     .53580      .3
 .12296               1.242     .32335      .2
 .21146               0.426     .11091      .1

0.00000     2.911     2.911    1.00000     0.4
 .04152               2.058     .70697      .3
 .09590               1.242     .42666      .2
 .19075               0.426     .14634      .1

0.00000     2.058     2.058    1.00000     0.3
 .05931               1.242     .60350      .2
 .16274               0.426     .20700      .1

0.00000     1.242     1.242    1.00000     0.2
 .11736               0.426     .34300      .1
Table 2. I/o, 1 .
(1) )
( continued)
p.o
_____q
o, 00000
J. ,
~ I
0
1 /
== -i/,r , p', -- , Ie
- LI
====_~
r=c==2
I
__.. .·__. K_=
4-'7'-~D..
1. 1: , U , Jj '79
L 00000
.0::?937
11,6bo
.78500
,C5175
.0{1 7 9
<) .511
.64594
.53794
8.004
')·391./
1/,20[:
.l~),:))r~
· .lGlj·~T6
).007
1 67 21-
•210!:-2
11.6(30
11 ~6C)o
9,611
(J.~)5'92
6.635
,68527
·S6eo,?
.2013
.11-6132
.36027
'5.39 4
,Ci.J732
.1]263
1f
1 1 -. ,:>"
,-,-+)U)
3·007
,257L~5
.19326
1.674
0.00000
.02236
· 01:.!~2 )1,
.0<;724
9·611
.1 1+332
1.00000
·(\32<30
.69035
.56123
,113783
·31287
U.ooh
C,(;35
:'5.39):·
.09328
.12605
.17699
4.208
3·007
1.674
£l.004
0.00000
6~635
.02291
5.39 1j.
.OLI.6SC)
.071~09
11- .20[3
3,007
,101-3:55
.06137
L671~
6.635
.37569
.20915
.05380
.08976
).007
.812Q6
.63)/,21
.45320
.25230
1.67h
.06796
.12681
,3 ~007
0,00000
5.394.
iJ..203
1.6'(L,
4.208
,
7I
·3
.2
.1
5.394
·11-1-564
0.00000
.03010
.1
O
0.6
.02523
1+.208
,1
" .()
q
U
.17413
1.00000
.10291
"
~U
1.00000
.62896
.67391
.'52574
6.635
0.00000
.011-029
'7
, t
,,20210
.11251
1.00000
.82286
3 . 00'1-
O.OCOOO
.n
·362:52
. 232[3?
.0237S1
.OL1508
o
0,<)
.44593
• c)~:LL+l
..:J20)
0,00000
Pc~Ter
1.00000
. r(8013
.'55711-7
.')
.4
·3
t)
"
'-
.1
0·5
,4
,3
.2
.1
0.1+
·3
.2
·3to34
.1
11.208
3,007
1.00000
1. 6'(~·
3.0CJ7
1.674
.39781
1.0CCOO
0.)
.2
,1
Oc2
.55:~70
.1
.71}:·59
_-
38
Table
2.4.l.
(continued)
(e ) p.0l . = l/r
e
0.00000
.02060
.03653
.05104
.06551
.08111
,099 1j-O
.12389
.16900
0.00000
.01698
.03244
.04787
0
) p .
.J
6.
..-
15 )~O:5
11.935
9.683
7,924
.0355i.
.05710
.08603
,13929
0.00000
.01942
6.420
.01~218
.07266
.12880
0.00000
.02417
.05654
,11615
0.00000
.03490
,09918
O.OOOCO
.07179
; 'Y
=1 ,
6.'
15.405
11.935
9,683
7,924
6,420
5·050
.3,737
2.401
0,910
11,935
9.683
7.924
6,1j-20
.06h49
,08399
,11009
..151:337
0.00000
.01629
.03255
.05006
.07061
.0981l
.14877
0.00000
.01709
:
= l/e
5·050
3·737
2)j-01
5·050
3,737
2.401
0·910
9.683
7.924
6.420
5.050
3.737
2.401
0,910
7·924
6,420
5.050
3·737
2.401
0·910
6.420
5.050
3·737
2.401
0·910
5·050
3·737
2.401
0,910
)·737
2.401
0·910
2.401
0,910
r=e=3
6.'
/6.
1.00000
.77475
.621356
.51438
,41675
·32782
.24258
.15586
.05907
1.00000
.81131
.66393
·.53791
.42313
·31311
.20117
.07625
1.00000
.81834
.66302
·52153
.38593
.24796
.09398
1.00000
,81020
.63730
.47161
·30300
,11484
1,00000
.78660
.58209
·37399
,14175
1.00000
.71j-000
.47545
.18020
1.00000
.64249
.24351
1.00000
·37901
a = 0.05
Power
0.9
,8
.7
~
.b
,5
.4
,3
.2
,1
0.8
.7
.6
,5
.4
,3
,2
.1
0.7
.6
·5
.4
·3
.2
.1
0.6
,5
.4
,3
.2
,1
0,5
.4
,3
.2
,1
0.4
·3
.2
.1
0,3
.2
,1
0.2
.1
39
Table 2.4.10
( conti,nued)
0
(d)
Pi. = 1/1' ,
e
t::"
0.00000
.01733
.03053
.04237
.05397
.06617
.08004
.09759
.12525
0.00000
.01392
.02641
.03865
.05152
.06615
.08466
.11384
0.00000
.01303
.02580
.03923
.05450
.07382
.10427
0.00000
.01329
,02727
.04316
.06326
.09495
0.00000
.01456
.03111
.05205
,08505
0.00000
.01730
.03920
.07371
0.00000
.02309
.05950
0.00000
,03911
0
P,j = l/e ; 'Y ;:
20·737
16.749
14.121
12 ,039
10.231
8.557
6.914
5.188
~.'
20·737
16,749
14.121
12.0,39
10,231
8.557
6.914
5.188
3.149
16.749
14.121
12.039
10.231
8.557
6.914
5.188
3.149
14.121
12.039
10.231
8·557
6.914
5·188
3.149
12 .039
10,231
8.557
6,914
5.188
3.149
10,231
8.557
6.914
5.188
3.149
8.557
6.914
5·188
3.149
6814
5.188
3.149
5,188
3.149
1 , r=c=3
t::,,'ffi:...
1,00000
.80769
.68096
,58056
.49337
.41264
.3.3341
.25018
.15185
1.00000
.84310
.71879
.61084
·51090
.41280
,30975
.18801
1.00000
.85256
.72452
.60598
.48963
.,36740
.22300
1.00000
,84982
,71077
,57430
.43093
.26157
1.00000
.836,38
.67,579
·50709
,,307'79
1.00000
.80799
.60629
.)6800
1.00000
,75036
,45545
1.00000
.60698
ex = 0.01
P~r:"-
0·9
.8
·7
.6
·5
.4
·3
.2
.1
0.8
·7
.6
·5
.4
·3
.2
.1
0·7
.6
·5
.4
.3
.2
.1
0.6
·5j
.4-
·3
.2
.1
0·5
.4
·3
.2
.1
0.4
.)
.2
.1
0·3
.2
......,
0.2
.1
40
Table 2.4,1,
(e)
(continued)
0
p.1,
::
1/r
e
,
p
0
.j
-,~
r~2; c=3 \
(r=~ c=2 J,
1
/'
12,655
9,635
'7,,702
12.655
.0649:;~
6,,213
.08)12
.10257
.12511
.15478
.20701
0.00000
.02lJ.90
.05380
,09196
.15961
(f)
:;:
=:=::: 6!
6
0.00000
.02632
,04658
lie ; I'
0
:p.
1.
e
0.00000
.02209
.03887
.05387
.06853
.08.390
.10126
.12307
.15690
0,00000
.01875
.03994
,06658
,10804
4·.957
3.832
2 ~ 77-6
0.624
4,957
3.832
2,7 7 6
1.'731
0.624
:::
llr
,
p
0
,j
£::,
17.427
8.190
=:
-
llc ;
r--
f::,.
17.427
1.3.881
11,567
9·752
8.190
6·758
5·372
3·91.+1
2.299
8.190
6.75(3
5 ·372
.3.941
2.299
~~!/~
Power
1.00000
.76136
.60861
,49095
.391'(0
0)0'281
.21936
0·9
.8
,04931
1000000
·77304
·56001
,3)...920
.1
'T
,
(r::2, c='3 )
'r::3;, ~:o:;t2 '
!
o
.6
·5j,
0"""
·3
02.
0·5
.4
·3
.2
oJ..2588
1.
0,05
::
,1~3678
1,,731
4,957
ex
.1
ex
::
0,01
61,6 -
Power
1.00000
.79652
.66374
.55959
,46996
.38779
,30826
,22614
.13192
1,,00000
,,82515
.65592
.48120
.28071
0·9
.8
,7
,6
v:'5
j
0'+
,3
,2
.1
0.5
oLe
,3
.2
,1
Table 2.4.2. Effect of misclassification (independent errors with different rates for the i-th and j-th directions) on the asymptotic power when testing the hypothesis of independence in a rxc contingency table
0.00000
.01743
.03061
.04234
.05377
.06568
.07906
.09579
. ]2225
0000000
.014h7
.02737
.03995
.05311
.06791
.08650
.11610
0.00000
.01393
.02753
.04177
10 ,.509
4.899
7 .81~9
':i o8hI
2·911
2 . O~jC\
1.,242
0,426
"( .849
3.81+1
2 .058
1,242
.4
·3r,
.c.
,I
<)
6 .17i:~
2·911
2.058
1024:2
0.426
4,899
3.841
2·911
2.058
1.242
0.426
3.8)+1
20911
2·911
h
.)
11" .Ei99
ho899
.? ,8 1n
4.899
~
.0
).841
6,172
2~9J..l
6.172
0·9
.8
·7
0.8
·7
.6
2.058
L242
0.426
.O5781~
.07807
.110·46
0.00000
.01464
.02999
.01+734
.06926
.10453
0.00000
.01659
.03538
.05917
.09768
0.00000
.02053
.04661
.08909
0.00000
.0291'7
.07703
0.00000
.056_60
10·509
7.849
6.172
2.058
1.:2 it2
0.426
2·911
2.058
1,2 1+2
0,.426
2,058
10242
0.426
10242
,) .426
• _ _'-,_ _..--......-._<0*0..._ _
<;
."l
~ ·4
,,3
")
of....
01
0,,7
.6
·5
I
.Lt
0.3
.2
.1
0.6
·5
04
•
,)
.2
,1
0·5
.4
·3
.2
.1
0.4
.)
r)
"L...
.1
0.)
.2
,1
0.2
.1
---~
42
Table 2.4.2.
(b)
(continued)
p~
1.
"" l/r ,
e
0.00000
.01457
.02551
.03518
.04450
.05413
.0648.3
.07791
.09726
0.00000
.01182
.02227
.03236
.04280
.05441
.06864
.08977
0.00000
.01111
.02186
.03299
.04538
.06059
.08322
0.00000
.01139
.02319
.03635
.05252
.07665
0.00000
.01253
.02651
.04372
.0691+5
0.00000
.01493
.03334
.06093
0.00000
.01993
.04989
0.00000
.033)+1
'Y ;: 3
.I
r=c::2
a ::: 0.01
Power
14.879
11,680
9·611
8.004
6.635
5.394
4.208
3·007
14.879
11.680
9.611
8.004
6.635
5.394
4.208
3·007
1.674
11.680
9.611
8.004
6.63,5
5.394
4.208
3·007
1.674
9·611
8.004
6.635
5.394
4.208
3·007
1.674
8.004
6.635
5·394
4.208
3·007
1.674
6.635
5.39'\';.
4.208
3·007
1.674
5.394
)+,208
3·007
1.674
4.208
3·007
1.674
3·007
1.674
.1
0.8
·7
,6
,5
.4
·3
.2
,1
0·7
.6
·5
.4
·3
.2
,
• .l.
0.6
·5
.4
·3
.2
.1
0·5
.4
Our problem is then reduced to that of obtaining appropriate estimates, either the maximum likelihood estimates or any set of BAN estimates, of the $\mu$ unknown parameters in the expression of $\{p'_{11},\dots,p'_{rc}\}$. However, if $\mu \ge rc$, the number of the cells in the contingency table, no "appropriate" estimates of the $\mu$ parameters can be obtained, i.e., the theorem does not hold. It can easily be seen that if u, the number of independent unknown elements in $\theta(rc\times rc)$, exceeds (r-1)(c-1), then $\mu \ge rc$ and the problem cannot be solved.
The following two cases will be considered separately:
(i) $\theta(rc\times rc)$ is known.
(ii) $\theta(rc\times rc)$ is unknown.
2.5. Finding a Test Procedure for the Case of Non-Independent Errors When $\theta(rc\times rc)$ Is Known
We shall now attempt to find a test procedure for testing the hypothesis of independence, when errors are not independent and $\theta(rc\times rc)$ is known, using the theory of frequency $\chi^2$-test (i.e., obtaining "appropriate" estimates of the unknown parameters).
Under the hypothesis of independence, the joint density of the observations is given by
$$\phi'_0 = \frac{n!}{\prod_{i,j} n_{ij}!}\,\prod_{i,j} \Big[\sum_s \sum_k p_{s.}p_{.k}\,\theta_{sikj}\Big]^{n_{ij}}. \tag{2.44}$$
In order to obtain the maximum likelihood estimates of the parameters $p_{s.}$ (s=1,...,r-1) and $p_{.k}$ (k=1,...,c-1), we have to maximize $\phi'_0$ subject to the constraints
$$\sum_s p_{s.} = \sum_k p_{.k} = 1.$$
Let
$$\psi = \log \phi'_0 + \lambda_1\Big(\sum_s p_{s.} - 1\Big) + \lambda_2\Big(\sum_k p_{.k} - 1\Big).$$
Differentiating $\psi$ with respect to $p_{v.}$, we get
$$\sum_{i,j} n_{ij}\,\frac{\sum_{k=1}^{c} p_{.k}\,\theta_{vikj}}{\sum_s \sum_k p_{s.}p_{.k}\,\theta_{sikj}} + \lambda_1 = 0. \tag{2.46}$$
Similarly, differentiating $\psi$ with respect to $p_{.w}$, we obtain
$$\sum_{i,j} n_{ij}\,\frac{\sum_{s=1}^{r} p_{s.}\,\theta_{siwj}}{\sum_{s=1}^{r}\sum_{k=1}^{c} p_{s.}p_{.k}\,\theta_{sikj}} + \lambda_2 = 0. \tag{2.47}$$
The equations (2.46) and (2.47) are generally difficult to solve. An iterative procedure would, no doubt, be needed to obtain explicit solutions for the estimates.
To overcome this difficulty, we shall adopt a different method for estimating the parameters, that is, we shall propose to use the Best Asymptotic Normal (BAN) method of estimation.
It can be shown that all the conditions for the existence of BAN estimates (see Neyman [17]) are satisfied in this case. That is, we have
(i) $p'_{ij} = \sum_s \sum_k p_{s.}p_{.k}\,\theta_{sikj} > 0$ and $p'_{ij} = f_{ij}(\alpha_1,\dots,\alpha_{r+c-2})$ (i=1,...,r; j=1,...,c).
(ii) The functions $f_{ij}$ are continuous with respect to $(\alpha_1,\dots,\alpha_{r+c-2})$ and possess continuous partial derivatives up to the second order. (2.48)
(iii) The parameters $(\alpha_1,\dots,\alpha_{r+c-2})$ are functionally independent, i.e., the matrix $(\partial f_{ij}/\partial \alpha_k)$ is of rank r+c-2.
There are several alternative methods of generating BAN estimates. We shall discuss here the one which will be helpful for our problem at hand. Because of the non-linearity of the functions $f_{ij}$'s, the method of minimum $\chi_1^2$ by linearization will be proposed. This method, due to Neyman [17], was introduced with the specific intention of finding a BAN estimate which could be computed by solving linear equations.
Consider the expression
$$\chi_1^2 = \sum_{i,j} \frac{[\,n_{ij} - n\,p_{ij}(\alpha_1,\dots,\alpha_s)\,]^2}{n_{ij}}.$$
In minimizing $\chi_1^2$, one may consider the vector $\mathbf{p} = \{p_{11},\dots,p_{rc}\}$ as the vector of parameters which are subject to certain restrictions, called side conditions. Due to the dependence of $\mathbf{p}$ on s independent parameters, there will be (rc-s) side conditions on the p's which can be written in the form:
$$F_t(p_{11},\dots,p_{rc}) = 0, \quad t=1,\dots,rc-s. \tag{2.50}$$
One may then minimize $\chi_1^2$ subject to these side conditions by the method of Lagrange multipliers. However, a simpler procedure would be to minimize $\chi_1^2$ subject to the linearized counterpart of the above equations, that is, the first two terms of the Taylor series expansion about the point $q_{ij} = n_{ij}/n$. The solution for the estimate then only requires solution of linear equations.
For our problem at hand, consider the expression
$$\chi_1^2 = n \sum_{i,j} \frac{(q_{ij} - p'_{ij})^2}{q_{ij}},$$
where $p'_{ij} = f_{ij}(\alpha_1,\dots,\alpha_{r+c-2})$ (i=1,...,r; j=1,...,c). In writing this formula it is assumed that none of the $q_{ij}$ is equal to zero.
It is clear that the problem of estimating the parameters $(\alpha_1,\dots,\alpha_{r+c-2})$ is equivalent to that of estimating all the probabilities $p'_{ij}$. For our present problem, it is more convenient to consider the problem of minimizing $\chi_1^2$ with respect to $p'_{ij}$ than that of minimizing $\chi_1^2$ with respect to $\alpha_k$'s.
The information that each $p'_{ij}$ is a known function $f_{ij}$ of (r+c-2) independent parameters $(\alpha_1,\dots,\alpha_{r+c-2})$ is equivalent to the restriction on $p'_{ij}$ imposed by means of b = rc-(r+c-2) equations of the form
$$F_t(\mathbf{p}') = 0 \quad (t=1,\dots,b), \tag{2.52}$$
which are obtainable by eliminating the parameters $\alpha$'s from the equations $p'_{ij} = f_{ij}$. We shall assume from here on that the restrictions (2.52) can be explicitly obtained, i.e., the elimination of the parameters $\alpha$'s from the set $p'_{ij} = f_{ij}(\alpha)$ can be actually carried out. This is not necessarily the case in general, even though the conditions guarantee the existence and regularity of the solutions.
Note that one of the above equations is clearly
$$\sum_{i,j} p'_{ij} - 1 = 0.$$
The assumptions (2.48) on the functions $f_{ij}$ imply that the functions $F_t$ possess continuous partial derivatives at least up to the second order. Also, the independence of the parameters $(\alpha_1,\dots,\alpha_{r+c-2})$ implies that, for each system of values of the $p'_{ij}$'s satisfying (2.52), there exists at least one system of the $p'_{ij}$'s such that the Jacobian of the $F_t$ with respect to them is different from zero.
In these circumstances, Taylor's formula may be applied to each function $F_t(\mathbf{p}')$ to give its expansion about any point satisfying the conditions $p'_{ij} > 0$ and $\sum_{i,j} p'_{ij} = 1$ (i=1,...,r; j=1,...,c). Taylor's formula will be applied to obtain the expansion of $F_t(\mathbf{p}')$ about the point $p'_{ij} = q_{ij}$ (i=1,...,r; j=1,...,c).
Thus,
$$F_t(\mathbf{p}') = F_t^*(\mathbf{p}') + R_t,$$
where
$$F_t^*(\mathbf{p}') = F_t(\mathbf{q}) + \sum_{i,j} b_{t,ij}\,(p'_{ij} - q_{ij}), \qquad R_t = \tfrac{1}{2} \sum_{i,j}\sum_{i',j'} c_{t,ij,i'j'}\,(p'_{ij} - q_{ij})(p'_{i'j'} - q_{i'j'}),$$
and $\mathbf{q}(1\times rc) = (q_{11},\dots,q_{rc})$.
Here $b_{t,ij}$ represents the partial derivatives of $F_t(\mathbf{p}')$ with respect to $p'_{ij}$ taken at $p'_{ij} = q_{ij}$. Thus, $b_{t,ij}$ does not depend upon the $p'_{ij}$'s so that $F_t^*$ is a linear function of the $p'_{ij}$'s. On the other hand the coefficients $c_{t,ij,i'j'}$ (the 2nd order partial derivatives) are functions of both the $p'_{ij}$'s and the $q_{ij}$'s. Hence, $F_t^*$ is the linear part of the Taylor expansion of $F_t$ around the point $\mathbf{p}' = \mathbf{q}$.
The method of minimum $\chi_1^2$ by linearization is that of minimizing $\chi_1^2$ with respect to $p'_{ij}$'s subject to the reduced side conditions
$$F_t^*(\mathbf{p}') = 0 \quad (t=1,\dots,b), \tag{2.56}$$
which will lead to the BAN estimates of $p'_{ij}$'s.
It is clear how this technique of 'linearization' permits a reduction of the problem to the solution of a system of linear equations and hence is more convenient.
Let
$$\alpha_k = \alpha_k(\mathbf{p}') \tag{2.57}$$
represent the expression⁹ of the parameter $\alpha_k$ in terms of the probabilities $p'_{ij}$'s; then a BAN estimate $\hat\alpha_k$ of $\alpha_k$ is obtained by substituting in the above equation, instead of $p'_{ij}$, the corresponding BAN estimate.
To minimize $\chi_1^2$ subject to the reduced side conditions (2.56), we use the method of Lagrange multipliers. Let
$$Q = \sum_{i,j} \frac{(q_{ij} - p'_{ij})^2}{q_{ij}}.$$
9 To obtain equations such as (2.57) is not generally easy. For our problem, however, these expressions are not needed in obtaining the test procedure.
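In what follows $\mathbf{q}$ and $\mathbf{p}'$ denote the column vectors $(q_{11},q_{12},\dots,q_{rc})^{T}$ and $(p'_{11},p'_{12},\dots,p'_{rc})^{T}$, and a superscript T denotes transposition. It can be easily seen that we can write
$$Q = (\mathbf{q} - \mathbf{p}')^{T} S^{-1} (\mathbf{q} - \mathbf{p}'),$$
where $S(rc\times rc) = D_q$ and $D_q$ stands for a diagonal matrix whose diagonal elements are $(q_{11},\dots,q_{rc})$.
Let
$$\mathbf{F}^{T}(1\times b) = (F_1(\mathbf{q}),\dots,F_b(\mathbf{q}))$$
and let
$$T(b\times rc) = (b_{t,ij})$$
be a matrix of rank b. To minimize Q subject to $F_t^* = 0$ (t=1,...,b), i.e.,
$$\mathbf{F} + T(\mathbf{p}' - \mathbf{q}) = 0, \tag{2.60}$$
differentiate
$$Q - 2\boldsymbol{\lambda}^{T}[\,\mathbf{F} + T(\mathbf{p}' - \mathbf{q})\,]$$
with respect to $\mathbf{p}'$; set the derivative equal to zero to get the equation
$$S^{-1}(\mathbf{p}' - \mathbf{q}) - T^{T}\boldsymbol{\lambda} = 0 \tag{2.61}$$
and solve for
$$\mathbf{p}' = \mathbf{q} + S\,T^{T}\boldsymbol{\lambda}. \tag{2.62}$$
Substituting (2.62) in (2.60) gives
$$\mathbf{F} + T S T^{T}\boldsymbol{\lambda} = 0. \tag{2.63}$$
Let $P = T S T^{T}$. Note that $S(rc\times rc)$ is symmetric positive definite and $T(b\times rc)$ is of rank b; by using the results in matrix theory given in Section 7.3, we have that $P(b\times b)$ is non-singular. From (2.63), we then have
$$\boldsymbol{\lambda} = -P^{-1}\mathbf{F}. \tag{2.64}$$
Substitute (2.64) in (2.62), it follows that
$$\hat{\mathbf{p}}' = \mathbf{q} - S\,T^{T}P^{-1}\mathbf{F}. \tag{2.65}$$
The corresponding quadratic form is given by
$$\hat{Q} = \mathbf{F}^{T}P^{-1}\mathbf{F}. \tag{2.66}$$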
A $X^2$-test statistic given by

    $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - n\hat{p}'_{ij})^2}{n\hat{p}'_{ij}}$ ,              (2.67)

or a $X_1^2$-statistic given by

    $X_1^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - n\hat{p}'_{ij})^2}{n_{ij}}$ ,              (2.68)

where the $\hat{p}'_{ij}$'s are as defined in (2.65), is each distributed as a
$\chi^2$-variate with $(r-1)(c-1)$ degrees of freedom as $n \to \infty$, under $H_0$
(see Neyman [17]).
The test procedure is either:
(I)   Reject $H_0$ if $X^2 > C$, and do not reject otherwise, or
(II)  Reject $H_0$ if $X_1^2 > C$, and do not reject otherwise;
$C$ being so chosen that

    $\Pr\{\chi^2_{(r-1)(c-1)} \ge C\} = \alpha$   (the desired level of significance).

Neyman [17] shows that the above two tests are both consistent and
equivalent in the limit$^{10}$ to the $\lambda$-test (i.e., the Neyman-Pearson
Likelihood Ratio Test).
Although they are equivalent in the limit, the question remains
open concerning which is better when the number of observations is only
moderate. It may be interesting to point out also that the
$X_1^2$-statistic, using the estimates obtained by the method of minimum
$\chi_1^2$ by linearization, is equivalent to the statistic proposed by Wald [29]
for a much more general problem (i.e., testing the general composite
hypothesis) but as adapted to the categorical setup. (For the proof of the
equivalence mentioned above, see Bhapkar [3].)
10 If, whatever be the simple hypothesis H, the probability of the
two tests $T_1$ and $T_2$, say, contradicting each other tends to zero as n
is indefinitely increased, then the tests $T_1$ and $T_2$ are called
equivalent in the limit.
In obtaining BAN estimates by the method of minimum $\chi_1^2$ by
linearization, the essential part of the work is deducing the side conditions

    $F_t(\mathbf{p}') = 0$,    (t=1,...,b).

Once the side conditions are deduced, the estimates can be obtained by
a straightforward computation as discussed.

2.6.  Illustration of the Method of Minimum $\chi_1^2$ by Linearization
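As an illustration of the application of the above procedure, we
shall consider the following example:
Let

    $\theta_{ii'jj'} = \begin{cases} 1-(rc-1)\theta & \text{if } i=i' \text{ and } j=j', \\ \theta & \text{otherwise,} \end{cases}$          (2.70)

where $0 < \theta < 1/rc$ and $\theta$ is known, (i,i'=1,...,r; j,j'=1,...,c).
Using the basic equation (2.6), it can easily be seen that here we
have

    $p'_{ij} = (1 - rc\theta)\,p_{ij} + \theta$,                                (2.71)

(i=1,...,r; j=1,...,c). Under $H_0$,

    $p'_{ij} = (1 - rc\theta)\,p_{i\cdot}\,p_{\cdot j} + \theta$,                         (2.72)

where $\sum_i p_{i\cdot} = \sum_j p_{\cdot j} = 1$, (i=1,...,r; j=1,...,c).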
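The step from the misclassification pattern (2.70) to the closed form (2.71) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the marginal probabilities are illustrative assumptions, and the basic equation is taken to be $p'_{i'j'} = \sum_{i,j} p_{ij}\,\theta_{ii'jj'}$.

```python
from itertools import product

def constant_rate_theta(r, c, theta):
    """Misclassification matrix with pattern (2.70), keyed by (true cell, observed cell)."""
    cells = list(product(range(r), range(c)))
    return {
        (true, seen): (1 - (r * c - 1) * theta) if true == seen else theta
        for true in cells
        for seen in cells
    }

def observed_probs(p, theta_matrix):
    """Basic equation: p'_seen = sum over true cells of p_true * theta[true, seen]."""
    cells = list(p)
    return {
        seen: sum(p[true] * theta_matrix[(true, seen)] for true in cells)
        for seen in cells
    }

r, c, theta = 2, 2, 0.075
row = [0.6, 0.4]  # hypothetical marginals p_i.
col = [0.8, 0.2]  # hypothetical marginals p_.j
p = {(i, j): row[i] * col[j] for i in range(r) for j in range(c)}  # independence holds

p_obs = observed_probs(p, constant_rate_theta(r, c, theta))
closed_form = {cell: (1 - r * c * theta) * p[cell] + theta for cell in p}
```

Under these assumptions `p_obs` and `closed_form` agree cell by cell, and the observed probabilities still sum to one.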
The rc functions $\{p'_{11}, \ldots, p'_{rc}\}$ are expressed in terms of r+c-2
independent unknown parameters $p_{i\cdot}$ (i=1,...,r-1) and $p_{\cdot j}$
(j=1,...,c-1).
Therefore, there are rc-(r+c-2) conditions on $p'_{ij}$'s, one of them being

    $\sum_{i,j} p'_{ij} - 1 = 0$.                                        (2.73)
We shall proceed to obtain the other (r-1)(c-1) side conditions by
eliminating the parameters $p_{i\cdot}$ and $p_{\cdot j}$ from the equations (2.72).
Sum over "i" on both sides of (2.72) and obtain

    $\sum_i (p'_{ij} - \theta) = p_{\cdot j}\,(1 - rc\theta)$,    (j=1,...,c).           (2.74)

Similarly, by summing over "j" on both sides of (2.72), we get

    $p_{i\cdot} = \frac{\sum_j (p'_{ij} - \theta)}{1 - rc\theta}$,    (i=1,...,r).              (2.75)

Substituting the expressions (2.74) and (2.75) in (2.72), we have

    $p'_{ij} - \theta = \frac{\left[\sum_{l=1}^{c} (p'_{il} - \theta)\right]\left[\sum_{h=1}^{r} (p'_{hj} - \theta)\right]}{1 - rc\theta}$,

(i=1,...,r; j=1,...,c).
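Now

    $\frac{p'_{ij} - \theta}{p'_{ik} - \theta} = \frac{\sum_{h=1}^{r} (p'_{hj} - \theta)}{\sum_{h=1}^{r} (p'_{hk} - \theta)}$

is seen to be independent of the index "i" for all j,k=1,...,c.
Thus, for a fixed set of indices (j,k)'s ($j \ne k = 1,\ldots,c$), we have
(r-1) conditions on $p'_{ij}$'s, namely

    $\frac{p'_{1j} - \theta}{p'_{1k} - \theta} = \frac{p'_{2j} - \theta}{p'_{2k} - \theta} = \cdots = \frac{p'_{rj} - \theta}{p'_{rk} - \theta}$.              (2.78)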
Because of the nature of the above equations, not all combinations
of (j,k)'s ($j \ne k = 1,\ldots,c$) are allowed in obtaining the independent
restrictions on $p'_{ij}$'s; for example, the set of indices (j,k) and (j,k')
implies (k,k'), where $k \ne k'$. It is observed that, actually, there are
(c-1) sets of (j,k)'s that can be used. Each set gives us (r-1)
independent conditions, namely (2.78). Therefore, the (r-1)(c-1)
independent conditions on $p'_{ij}$'s are obtained.
To get the (c-1) sets of (j,k)'s, we simply fix an index j,
(j=1,...,c) and vary $k \ne j = 1,\ldots,c$. For example, take j=1, then the (c-1)
sets of (j,k)'s are (1,2), (1,3), ..., (1,c); and the (r-1)(c-1)
conditions on $p'_{ij}$'s can be seen to be:

    $\frac{p'_{11} - \theta}{p'_{12} - \theta} = \frac{p'_{21} - \theta}{p'_{22} - \theta} = \cdots = \frac{p'_{r1} - \theta}{p'_{r2} - \theta}$ ,

    $\frac{p'_{11} - \theta}{p'_{13} - \theta} = \frac{p'_{21} - \theta}{p'_{23} - \theta} = \cdots = \frac{p'_{r1} - \theta}{p'_{r3} - \theta}$ ,

        . . .

    $\frac{p'_{11} - \theta}{p'_{1c} - \theta} = \frac{p'_{21} - \theta}{p'_{2c} - \theta} = \cdots = \frac{p'_{r1} - \theta}{p'_{rc} - \theta}$ .              (2.79)
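The conditions (2.79) say that, under $H_0$, the ratio $(p'_{i1} - \theta)/(p'_{ik} - \theta)$ takes the same value in every row i. A small sketch of that check follows; the 3x3 marginals and the function name are hypothetical choices for illustration, and the observed probabilities are generated from (2.72).

```python
def side_condition_ratios(p_obs, theta, r, c):
    """For each column index k = 1..c-1 (0-based), list over rows i of
    (p'_{i,0} - theta) / (p'_{i,k} - theta)."""
    return {
        k: [(p_obs[i][0] - theta) / (p_obs[i][k] - theta) for i in range(r)]
        for k in range(1, c)
    }

r, c, theta = 3, 3, 0.02          # theta must satisfy 0 < theta < 1/(r*c)
row = [0.5, 0.3, 0.2]             # hypothetical p_i.
col = [0.2, 0.3, 0.5]             # hypothetical p_.j

# Observed-cell probabilities under H0, from (2.72).
p_obs = [[(1 - r * c * theta) * row[i] * col[j] + theta for j in range(c)]
         for i in range(r)]

ratios = side_condition_ratios(p_obs, theta, r, c)
```

In this sketch each list in `ratios` is constant across rows and equals `col[0] / col[k]`, which is why only (r-1) equalities per column pair carry information.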
After the (r-1)(c-1) + 1 side conditions have been deduced, namely,
$\sum_{i,j} p'_{ij} - 1 = 0$, and (2.79), the BAN estimates $\hat{p}'_{ij}$'s can be obtained by
a straightforward application of the formula (2.65).
The test procedure is then: reject $H_0$ if

    $X_1^2 > C$,

and do not reject otherwise, where $C$ is chosen so that

    $\Pr\{\chi^2_{(r-1)(c-1)} \ge C\} = \alpha$.
The above method for constructing a test procedure (with the
misclassification pattern as defined by (2.70)) will now be applied to
data taken from the survey cited in Snedecor's Statistical Methods,
Section 9.6. The 2x2 table shows the subdivision of the sample
according to two attributes. One of them is the presence of second brood eggs
of the corn borer, while the other is the use of fertilizer or manure.
(See Table 2.6.1.)

Table 2.6.1.  Presence and absence of second brood eggs in Boone County
              sample subdivided according to application of fertilizer

                             Fertilizer or Manure
    Second Brood Eggs        Some        None        Total
    Present                   31          94          125
    Absent                     9          42           51
    Total                     40         136          176

The value of the usual $\chi^2$-test statistic is 1.0553, with 1 d.f.,
i.e., the hypothesis of independence is not rejected.
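The quoted value of the usual statistic can be reproduced from Table 2.6.1. The sketch below assumes "usual" means the Pearson statistic without continuity correction, with expected counts formed from the row and column totals; the function name is illustrative.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for independence in a two-way table of counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: second brood eggs present / absent; columns: fertilizer some / none.
table = [[31, 94], [9, 42]]
chi2 = pearson_chi2(table)
```

With these counts `chi2` comes out close to the 1.0553 reported in the text, well below the 5 per cent critical value 3.841 for 1 d.f.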
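Suppose the above data are subject to errors of classification of
the type mentioned above, with a known value of $\theta$, .075, say. To
compute the statistic $X_1^2$, we proceed as follows:
From the above data, compute

    $\mathbf{q}'(1 \times 4)$ = (.1761364  .5340909  .0511363  .2386364),

    $S(4 \times 4)$ = diag(.1761364, .5340909, .0511363, .2386364).

Since

    $\mathbf{F}(\mathbf{p}') = \begin{bmatrix} p'_{11}p'_{22} - p'_{12}p'_{21} - \theta p'_{11} - \theta p'_{22} + \theta p'_{21} + \theta p'_{12} \\ p'_{11} + p'_{12} + p'_{21} + p'_{22} - 1 \end{bmatrix}$ ,

we have that

    F(2x1) = [ .0275052 ]          T'(4x2) = [  .1636364    1 ]
             [ 0        ]   and              [  .0238637    1 ]
                                             [ -.4590909    1 ]
                                             [  .1011364    1 ] .

Now from (2.65), namely,

    $\hat{\mathbf{p}}' = \mathbf{q} - S T' P^{-1} \mathbf{F}$ ,

we have

    [ p'11 ]     [ .1761364 ]     [  .0357429 ]     [ .1403935 ]
    [ p'12 ]  =  [ .5340909 ]  -  [ -.0163919 ]  =  [ .5504828 ]
    [ p'21 ]     [ .0511363 ]     [ -.0428480 ]     [ .0939843 ]
    [ p'22 ]     [ .2386364 ]     [  .0234970 ]     [ .2151394 ] ,

where the entries on the left are the estimates $\hat{p}'_{ij}$.
It follows that

    $X^2 = \sum_{i,j} \frac{(n_{ij} - n\hat{p}'_{ij})^2}{n\hat{p}'_{ij}} = 5.577$ ,

and

    $X_1^2 = \sum_{i,j} \frac{(n_{ij} - n\hat{p}'_{ij})^2}{n_{ij}} = 8.091$ .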
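The worked example can be retraced in a few lines. This is a sketch under stated assumptions rather than the author's own computation: it hard-codes the 2x2 case, writes the single ratio condition of (2.79) as the product difference $(p'_{11}-\theta)(p'_{22}-\theta) - (p'_{12}-\theta)(p'_{21}-\theta)$, and inverts the 2x2 matrix $P = TST'$ explicitly. The function name is hypothetical.

```python
def linearized_min_chi2_2x2(counts, theta):
    """Estimates p'-hat = q - S T' P^{-1} F for a 2x2 table, counts = (n11, n12, n21, n22)."""
    n = sum(counts)
    q = [x / n for x in counts]
    q11, q12, q21, q22 = q
    # Side conditions evaluated at q: the ratio condition and sum(p') - 1 = 0.
    f = [(q11 - theta) * (q22 - theta) - (q12 - theta) * (q21 - theta),
         sum(q) - 1.0]
    # Rows of T: partial derivatives of each side condition, taken at q.
    t = [[q22 - theta, -(q21 - theta), -(q12 - theta), q11 - theta],
         [1.0, 1.0, 1.0, 1.0]]
    # P = T S T' with S = diag(q).
    P = [[sum(qi * a * b for qi, a, b in zip(q, t[u], t[v])) for v in range(2)]
         for u in range(2)]
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    # g = P^{-1} F using the explicit inverse of a 2x2 matrix.
    g = [(P[1][1] * f[0] - P[0][1] * f[1]) / det,
         (-P[1][0] * f[0] + P[0][0] * f[1]) / det]
    return [qi - qi * (t[0][m] * g[0] + t[1][m] * g[1]) for m, qi in enumerate(q)]

counts = (31, 94, 9, 42)
n = sum(counts)
p_hat = linearized_min_chi2_2x2(counts, theta=0.075)
x2 = sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, p_hat))   # form (2.67)
x1_2 = sum((x - n * p) ** 2 / x for x, p in zip(counts, p_hat))       # form (2.68)
```

Run on the Boone County counts with $\theta = .075$, the sketch gives estimates and statistics that match the figures quoted in the text (about 5.58 and 8.09) to the precision shown there.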
Similar computations can be made for other values of $\theta$. The
results for some selected values of $\theta$ are shown in Table 2.6.2.

Table 2.6.2.  Effect of misclassification (non-independent errors with
              the same constant rates) on the value of the test
              statistics for a given set of data

              Estimates of Parameters             Test Statistics
     θ        p1.     p2.     p.1     p.2         X^2       X_1^2
    .005      .72     .28     .22     .78         1.26      1.37
    .025      .73     .27     .19     .81         2.14      2.44
    .050      .76     .24     .16     .84         3.68      4.61
    .075      .77     .23     .12     .88         5.58 a    8.09 b

a The test is significant (i.e., $\alpha$ = .05).
b The test is highly significant (i.e., $\alpha$ = .01).

2.7.  The Asymptotic Power Function for $X_1^2$
To study the effect of misclassification on the asymptotic power,
we wish to find the asymptotic power function when misclassification is
present, namely
and compare it to the corresponding asymptotic power when there is no
misclassification, namely
""
n
lim
P
OJ r
._>
1~. ~
>c
I
H,ln
1r .
r)
( ,c:.
p,", \
oJuc J
By a method similar to that in Section 2.2, it can be shown that
where
,0
P1j -and
Z d~Jo
1,j
f3i
=
= 0,
lim
n ->
CD
(i=l" , , ,r; j =1, ' . , , c) .
P
r
{X*2 >
C
I
(
-
Hm P ~ x~'2 > C
n--> co r l
I
H1n }
I~n1
(2.84)
Before proceeding to find

    $\lim_{n \to \infty} \Pr\{X^{*2} \ge C \mid H'_{1n}\}$ ,

it is important to note that since the result on the asymptotic power
function of the frequency $\chi^2$-tests (stated in Section 7.1) has been
obtained under the assumption that the method of estimation is that of
the modified minimum $\chi^2$, it cannot be immediately applied to the
evaluation of $\beta'$. The method of estimation used for our problem is that of
the minimum $\chi_1^2$ by linearization.
If it can be shown that the result also holds for the method of
minimum $\chi_1^2$ by linearization, then it follows that

    $\beta' = \lim_{n \to \infty} \Pr\{X^{*2} \ge C \mid H'_{1n}\} = 1 - F(C,\,(r-1)(c-1),\,\Delta')$ ,       (2.85)
where
§. i (lxrc)
= (dill
B(rcxr+c-2) - )
and
-y;J. .'
1
-l-fJ
We shall now show that the result on the asymptotic power function
of the frequency $\chi^2$-tests holds for the method of minimum $\chi_1^2$ by
linearization.
Mote [15] has shown that the result on the asymptotic power function
remains valid for any estimating procedure that yields estimates
$\hat{\alpha}_h$ (h=1,...,s) possessing the following three properties:
(i)   Functions $\hat{\alpha}_h$ (h=1,...,s) have continuous partial derivatives
      with respect to all the independent variables $q_{ij}$.
(ii)  The result of substituting
          $q_{ij} = f_{ij}(\alpha_1, \ldots, \alpha_s)$;  (i=1,...,r; j=1,...,c)
      in $\hat{\alpha}_h(\mathbf{q})$ leads to the identity:
          $\hat{\alpha}_h = \alpha_h$,   h=1,...,s.
(iii)
If we let
Ai _
h _.
cbh
[ dqll
,
"to
II ,
,
"0" ,
= [a..n, 1 , l' •.. , a h , 1 ,c '
.•. , a h ,r, 1'· .•• , a h ,r,c ] ,
o
By definition (see Theorem 1 in Neyman [17]) a BAN estimate
generated by any estimating procedure must possess properties (i) and
(ii). It will now be shown that property (iii) holds for a BAN estimate
ex*) .
(a)
obt,ainei by
:,>1;;08 t::i -cuting
Jng BAN est:1.mate:1 of the p
(b )
,.,(
'-
"
9:. q)
Let
::len')"",E:.
}3
equa;I~J..(j':.:
in the abc'\te
uv
the
0
BAN estimate of ex
0
..
land, '[:,
,
"i
g! a ·,'orresr·oo.d-
le:.g EA;\j 0EtJ!t1ate of P~" .] obta.ined
of
lJ
!. ~
(c)
0
y
ex
p02:~e:3Se5 ~oroperty
X~)
(iii).
corre:=ponding BAN esti.mate of P
o
ij
j
. denotes a
while :p* (q)
1.J
00
.; obtai,"'.ed 'by the method of
x~1 bv Iinearization then by 'l'heonsm 6 of NeymE1.n [17] J
0
we have thflt
(d)
E1.i Hl:<TUUl
J..
Let ~~:t,* (qJ denot8 a BAN estimate of ~0
minimum
CC>lTes pond,~
By "c::hain
we can write
(e)
From the definition of a BAN estimate, we have that
and
(f)
Being the same function, a partial derivative of ~h' evaluated
at the same point P~j;
and using the result in (e), we have
d~h(:P, l' . ",prc )
..L.
=
dPuv
(g)
Finally~ applying (c) and (f) to (d), we have that
,
and then using (b) we have
cn*h
~~j
0
~j=Pij
=
ah,i,j'
as required for property (iii).
Thus, the results on the asymptotic power of the frequency $\chi^2$-tests
hold also for tests using estimates generated by the method of minimum
$\chi_1^2$ by linearization. Hence, $\beta'$ is as defined by (2.85).
The problem is now to compare $\beta'$ and $\beta$.
Consider the matrix
Since the matrix $B(rc \times (r+c-2))$ is of rank r+c-2, using the results
on matrix theory (7.3.1) and (7.3.2), we have that $B(B'B)^{-1}B'$ is
symmetric and at least positive semi-definite of rank (r+c-2).
In order to show that $\beta' \le \beta$, we need only to show that
$\Delta' \le \Delta$. That is,

    $\sum_{i,j} \frac{d'^{\,2}_{ij}}{p'^{\,0}_{ij}} \le \sum_{i,j} \frac{d_{ij}^{2}}{p^{0}_{ij}}$ .
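From (2.83), write

    $\sum_{i,j} \frac{d'^{\,2}_{ij}}{p'^{\,0}_{ij}} = \sum_{i,j} \frac{\left(\sum_{s,k} d_{sk}\,\theta_{sikj}\right)^2}{\sum_{s,k} p^0_{s\cdot}\,p^0_{\cdot k}\,\theta_{sikj}}$ .

Using Cauchy's inequality, we have that

    $\left(\sum_{s,k} d_{sk}\,\theta_{sikj}\right)^2 \le \left[\sum_{s,k} \frac{d_{sk}^2\,\theta_{sikj}}{p^0_{s\cdot}\,p^0_{\cdot k}}\right]\left[\sum_{s,k} p^0_{s\cdot}\,p^0_{\cdot k}\,\theta_{sikj}\right]$ ,        (2.88)

for all i and j.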
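Inequality (2.88) holds for any misclassification matrix whose rows sum to one, which can be seen numerically. The sketch below uses a randomly generated matrix for a 2x3 table; the seed, the marginals and the deviation pattern are arbitrary illustrative assumptions, not values from the text.

```python
import random
from itertools import product

random.seed(0)
r, c = 2, 3
cells = list(product(range(r), range(c)))
row, col = [0.6, 0.4], [0.2, 0.3, 0.5]            # hypothetical p0_s. and p0_.k
p0 = {(s, k): row[s] * col[k] for s, k in cells}
# Hypothetical deviations with zero row and column sums.
d = {(0, 0): 1.0, (0, 1): -1.0, (0, 2): 0.0,
     (1, 0): -1.0, (1, 1): 1.0, (1, 2): 0.0}

# Random misclassification matrix: each true cell's row is normalised to sum to one.
theta = {}
for true in cells:
    weights = [random.random() for _ in cells]
    total = sum(weights)
    for seen, w in zip(cells, weights):
        theta[(true, seen)] = w / total

def cauchy_sides(seen):
    """Left and right sides of (2.88) for one observed cell."""
    lhs = sum(d[t] * theta[(t, seen)] for t in cells) ** 2
    rhs = (sum(d[t] ** 2 * theta[(t, seen)] / p0[t] for t in cells)
           * sum(p0[t] * theta[(t, seen)] for t in cells))
    return lhs, rhs

sides = {seen: cauchy_sides(seen) for seen in cells}
# Summed form: sum d'^2 / p'0 under misclassification versus sum d^2 / p0 without it.
sum_obs = sum(lhs / sum(p0[t] * theta[(t, seen)] for t in cells)
              for seen, (lhs, _) in sides.items())
sum_true = sum(d[t] ** 2 / p0[t] for t in cells)
```

Dividing (2.88) by $\sum_{s,k} p^0_{s\cdot} p^0_{\cdot k} \theta_{sikj}$ and summing over the observed cells uses only the fact that each row of the matrix sums to one, so `sum_obs` cannot exceed `sum_true`.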
Therefore,

    $\Delta' \le \sum_{i,j} \sum_{s,k} \frac{d_{sk}^2\,\theta_{sikj}}{p^0_{s\cdot}\,p^0_{\cdot k}} = \sum_{s,k} \frac{d_{sk}^2}{p^0_{s\cdot}\,p^0_{\cdot k}} = \Delta$ .
We again have the result that the asymptotic power of the test procedure
may be reduced due to misclassification.
From (2.88), equality holds if and only if

    $d_{sk}\,\theta_{sikj} = \tau_{ij}\,p^0_{s\cdot}\,p^0_{\cdot k}\,\theta_{sikj}$ ,   for all i and j,

where $\tau_{ij}$ are some constants not equal to zero.
For some (i,j), if none of the $\theta_{sikj}$'s are zero for all s=1,...,r
or if none of the $\theta_{sikj}$'s are zero for all k=1,...,c, the above equality
implies

    $d_{sk} = \tau_{ij}\,p^0_{s\cdot}\,p^0_{\cdot k}$   for all s=1,...,r or for all k=1,...,c.

These cannot be true, since

    $\sum_s d_{sk} = 0$,  $\sum_s p^0_{s\cdot} = 1$   and   $\sum_k d_{sk} = 0$,  $\sum_k p^0_{\cdot k} = 1$.

Hence, for some (i,j), (i=1,...,r; j=1,...,c), if
(i)   $\theta_{sikj}$'s are non-zero for all s=1,...,r, or
(ii)  $\theta_{sikj}$'s are non-zero for all k=1,...,c,
then $\Delta' < \Delta$.
If, for any $\theta(rc \times rc) \ne I$ (the identity matrix), neither of the above
conditions holds, it can be shown that there are some sets of $\{d_{ij}\}$ for
which $\Delta' = \Delta$.
We shall now give an illustration for the case where
there are two non-zero off-diagonal elements, namely,
(s
f s'
or k
f
k')
and
(i ~ i
Let
~
d
ij
~
i
S
or j
:f
j'
f
k).
} be any set of deviation parameters such that
,
then it can be shown that
d i2
:!J.
L:
i.,j
E
i,j
2
d ij _
0
p.
0
~J
p ..
~J
and
d,2
.=Y
E
L:
i,j
:=
,0
= €l
Since esS'kki
d
,0
i,J P
ij
:=
2
d :3 k
2
ij
0
Pij
ei i
i
jj i
2
d S i .k '
:=
E:
2
d
2
€2 d ij
P sk
P ij
= -0- +
0
ij
it can be seen that
2
d , ..
i J
_. +
-- +
+
0
0
0
0
Psk P s i k'
PiJ
Pi' j '
(€~d
k + ds ok')
L s
o
0
ElP sk + Psi k '
2
€ld sk
,
2
2
2
(l-€l) d sk
2
(1-E )d
2 ij
0
P ij
2
2
(€ld sk + d S 'k,)2
d ik i
+~+
'
0
o
0
0
€lP sk+ Ps'k'
Psik'
Pi'j'
XL
0
P sk
-
( €2 d ij + di'j')
0
2
0
€2 P ij + Pi'j'
2
( €2 d ij + d i , j' )
0
0
€2 P ij + Pi'j'
for
d
+
a
sk?s'k'
a
2
)
Psk
.2
sk
Cl
P
o
2
sk
In the same way, a set of deviation parameters $\{d_{ij}\}$ for which
$\Delta' = \Delta$ may be found for a misclassification matrix $\theta(rc \times rc)$ with any
number of non-zero off-diagonal elements. That is, for any
misclassification matrix $\theta(rc \times rc)$, if (2.89) does not hold, then, for some
sets of $\{d_{ij}\}$, $\Delta' = \Delta$.
We shall now give a numerical example to illustrate that $\Delta' = \Delta$
for some $\theta(rc \times rc) \ne I(rc \times rc)$. Let us consider a case of a 2x3
contingency table with the following misclassification matrix:
,e(6x6)
L t
e .
{1'1.'
a
pO
.-
    [ .8    0    0    0   .2    0 ]
    [  0    1    0    0    0    0 ]
    [  0    0    1    0    0    0 ]
    [  0    0    0    1    0    0 ]
    [  0    0    0    0    1    0 ]
    [  0    0    0    0    0    1 ]
)
(
f :;: 103,
'Po,
01' -.2
1
deviation parameters such that
number» 0, say., then d 22 =70) d
:::: -0, d
"" ",70, d
:;:; 60 and
21
l2
13
d
23
= ~6o.
=
Then, it
~an
1269084127 0
2
be seen that
0
Using the basic equation p i j ~ ~ P €I , . and d1.~j:;:; ~ d ke ik"
s, k_ ak s1.kJ
s, k s s J
follows that
U
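To illustrate that the asymptotic power is reduced, let us take
first the example considered before in Section 2.6, namely

    $\theta_{ii'jj'} = \begin{cases} 1-(rc-1)\theta & \text{if } i=i' \text{ and } j=j', \\ \theta & \text{otherwise,} \end{cases}$

where $0 < \theta < 1/rc$, (i=1,...,r; j=1,...,c).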
it
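As before, we have under $H_0$

    $p'_{ij} = (1 - rc\theta)\,p_{i\cdot}\,p_{\cdot j} + \theta$,    (i=1,...,r; j=1,...,c).

Let r=c=2, $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and let us choose the
deviation parameters such that

    $d_{11} = -d_{12} = -d_{21} = d_{22} = d$.

Thus, it is easy to see that

    $\sum_{i,j} \frac{d_{ij}^2}{p^0_{i\cdot}\,p^0_{\cdot j}} = 26.04167\,d^2$.

To compute $\Delta'$, first note that since

    $p'^{\,0}_{ij} = (1 - rc\theta)\,p^0_{i\cdot}\,p^0_{\cdot j} + \theta$   and   $d'_{ij} = (1 - rc\theta)\,d_{ij}$,

we have for our case that

    $d'_{11} = -d'_{12} = -d'_{21} = d'_{22} = (1 - 4\theta)\,d$

and

    $p'^{\,0}_{11} = .48(1-4\theta) + \theta$,
    $p'^{\,0}_{12} = .12(1-4\theta) + \theta$,
    $p'^{\,0}_{21} = .32(1-4\theta) + \theta$,
    $p'^{\,0}_{22} = .08(1-4\theta) + \theta$.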
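The constant 26.04167 and the shrunken quantities under misclassification follow directly from the chosen marginals. A short sketch, with an illustrative function name and the deviations expressed in units of d:

```python
row = [0.6, 0.4]                    # p0_1., p0_2.
col = [0.8, 0.2]                    # p0_.1, p0_.2
sign = [[1, -1], [-1, 1]]           # d_ij / d for d11 = -d12 = -d21 = d22 = d

p0 = [[row[i] * col[j] for j in range(2)] for i in range(2)]
# Coefficient of d^2 in sum_ij d_ij^2 / (p0_i. p0_.j).
coef = sum(sign[i][j] ** 2 / p0[i][j] for i in range(2) for j in range(2))

def under_misclassification(theta):
    """Observed-cell probabilities and deviations (in units of d) for r = c = 2."""
    shrink = 1 - 4 * theta
    p_obs0 = [[shrink * p0[i][j] + theta for j in range(2)] for i in range(2)]
    d_obs = [[shrink * sign[i][j] for j in range(2)] for i in range(2)]
    return p_obs0, d_obs

p_obs0, d_obs = under_misclassification(0.02)
```

Here `coef` reproduces the factor multiplying $d^2$ in the text, and at $\theta = .02$ every deviation is shrunk by the factor .92 while the cell probabilities are pulled toward .25.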
Ueing (2,85) and evaluating the matrix B and the vector
1)
for our
case to be
B
2(1-48 2,
10
-
,,'
4
./
1.
=3
,"4
2
-2
~1
~~
and
v
,-
\--~
.. ~.
1
1
1
(1··48) d
- _ . ~ _ . _ .
.~~--
pi2 ~
P2l
y;J ~
I:l l
we can compute $\Delta'$ for various values of $\theta$. For example, if
$\theta = .02$ and $d^2 = .40355$, then $\Delta = 10.509$ and $\Delta' = 6.139$.
Values of 1-F(C,1,10.509) and 1-F(C,1,6.139) from Fix's tables are seen to
be 0.9 and approximately 0.7 respectively. That is, $\theta = .02$ reduces the
asymptotic power from 0.9 to approximately 0.7.
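The two tabled powers can be approximated without Fix's tables when there is a single degree of freedom. The sketch below assumes that F(C, 1, Δ) denotes the distribution function of a non-central chi-square variate with 1 d.f. and non-centrality Δ, so that the variate is the square of a unit-variance normal with mean $\sqrt{\Delta}$; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def asymptotic_power_1df(noncentrality, alpha):
    """1 - F(C, 1, noncentrality), where C is the upper-alpha point of central chi-square(1)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)      # z**2 equals C, e.g. 1.96**2 = 3.841
    root = sqrt(noncentrality)
    # P{(Z + root)^2 > z^2} = P{Z > z - root} + P{Z < -z - root}
    return (1 - std_normal.cdf(z - root)) + std_normal.cdf(-z - root)

power_without = asymptotic_power_1df(10.509, alpha=0.05)
power_with = asymptotic_power_1df(6.139, alpha=0.05)
```

Under that assumption the two values land near 0.9 and 0.7, in line with the figures read from the tables.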
Table 2.7.1 shows the effect of misclassification on the asymptotic
power for the case of 2x2 contingency tables and the set of deviation
parameters as mentioned above. The ratio $\Delta'/\Delta$ is also given for various
values of $\theta$.
For the second example, we shall consider the case of "correlated"
errors (i.e., the errors in the i-th direction occur if and only if
there are errors in the j-th direction and vice versa). That is,

    $\theta_{ii'jj'} = \begin{cases} \theta & \text{if } i \ne i' \text{ and } j \ne j', \\ 1-(r-1)(c-1)\theta & \text{if } i=i' \text{ and } j=j', \\ 0 & \text{otherwise,} \end{cases}$

where $0 < \theta < 1/(r-1)(c-1)$.
Under $H_0$, it can be seen that

    $p'_{11} = (1-\theta)\,p_{1\cdot}\,p_{\cdot 1} + \theta\,p_{2\cdot}\,p_{\cdot 2}$ ,
    $p'_{12} = (1-\theta)\,p_{1\cdot}\,p_{\cdot 2} + \theta\,p_{2\cdot}\,p_{\cdot 1}$ ,
    $p'_{21} = (1-\theta)\,p_{2\cdot}\,p_{\cdot 1} + \theta\,p_{1\cdot}\,p_{\cdot 2}$ ,
    $p'_{22} = (1-\theta)\,p_{2\cdot}\,p_{\cdot 2} + \theta\,p_{1\cdot}\,p_{\cdot 1}$ .

Let r=c=2, $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and choose the deviation parameters
$\{d_{ij}\}$ such that

    $d_{11} = -d_{12} = -d_{21} = d_{22} = d$.

It can be shown, then, that

    $\sum_{i,j} \frac{d_{ij}^2}{p^0_{i\cdot}\,p^0_{\cdot j}} = 26.04167\,d^2$

and

    $p'^{\,0}_{11} = .48 - .4\theta$ ,
    $p'^{\,0}_{12} = .12 + .2\theta$ ,
    $p'^{\,0}_{21} = .32 - .2\theta$ ,
    $p'^{\,0}_{22} = .08 + .4\theta$ .

Using (2.85), the matrix B(4x2) and the vector $\boldsymbol{\delta}$(4x1), and hence the
value of $\Delta'$, can be evaluated for various values of $\theta$. For example, if
$\alpha = 0.01$, then $\theta = .1$ reduces the asymptotic power from 0.9 to
approximately 0.7. Table 2.7.2 shows the effect of misclassification on the
asymptotic power for the case mentioned above for various values of $\theta$.
The ratio $\Delta'/\Delta$ is also given.
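The four expressions for the observed-cell probabilities in the "correlated" errors case can be checked by building the misclassification matrix and applying the basic equation. A minimal sketch follows, with hypothetical function names and $\theta = .1$ chosen to match the example in the text.

```python
from itertools import product

def correlated_theta(r, c, theta):
    """'Correlated' errors: an error occurs in both directions at once or not at all."""
    cells = list(product(range(r), range(c)))
    matrix = {}
    for (i, j), (i2, j2) in product(cells, cells):
        if i == i2 and j == j2:
            matrix[((i, j), (i2, j2))] = 1 - (r - 1) * (c - 1) * theta
        elif i != i2 and j != j2:
            matrix[((i, j), (i2, j2))] = theta
        else:
            matrix[((i, j), (i2, j2))] = 0.0
    return matrix

def apply_theta(values, matrix):
    """Basic equation applied to any cell-indexed quantity (probabilities or deviations)."""
    cells = list(values)
    return {seen: sum(values[true] * matrix[(true, seen)] for true in cells)
            for seen in cells}

theta = 0.1
row, col = [0.6, 0.4], [0.8, 0.2]
p0 = {(i, j): row[i] * col[j] for i in range(2) for j in range(2)}
d = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}   # deviations in units of d

matrix = correlated_theta(2, 2, theta)
p_obs0 = apply_theta(p0, matrix)
d_obs = apply_theta(d, matrix)
```

In this 2x2 case `p_obs0` agrees with .48 - .4θ, .12 + .2θ, .32 - .2θ and .08 + .4θ, and the deviations come through unchanged, so any loss of power arises from the altered cell probabilities alone.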
Table 2.7.1.  Effect of misclassification (non-independent errors with
              the same constant rates) on the asymptotic power, for
              r=c=2, $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and
              $d_{11} = -d_{12} = -d_{21} = d_{22}$

                           α = 0.05                    α = 0.01
                      Δ=10.509    Δ=3.841        Δ=14.879    Δ=6.635
                      (β=0.9)     (β=0.5)        (β=0.9)     (β=0.5)
     θ      Δ'/Δ        Δ'          Δ'             Δ'          Δ'
    0.00   1.00000    10.509      3.841          14.879      6.635
     .01    .67590     7.103      2.596          10.057      4.485
     .02    .58414     6.139      2.244           8.691      3.876
     .03    .50491     5.306      1.939           7.513      3.350
     .04    .43611     4.583      1.675           6.489      2.894
     .05    .37613     3.953      1.445           5.596      2.496
     .06    .32361     3.401      1.243           4.815      2.147
     .07    .27754     2.917      1.066           4.130      1.842
     .08    .23705     2.491       .911           3.527      1.573
     .09    .20146     2.117       .774           2.998      1.337
     .10    .17017     1.788       .654           2.532      1.129
Table 2.7.2.  Effect of misclassification (correlated errors) on the
              asymptotic power, for r=c=2, $\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and
              $d_{11} = -d_{12} = -d_{21} = d_{22}$

                           α = 0.05                    α = 0.01
                      Δ=10.509    Δ=3.841        Δ=14.879    Δ=6.635
                      (β=0.9)     (β=0.5)        (β=0.9)     (β=0.5)
     θ      Δ'/Δ        Δ'          Δ'             Δ'          Δ'
    0.00   1.00000    10.509      3.841          14.879      6.635
     .01    .93680     9.845      3.598          13.939      6.216
     .02    .88293     9.279      3.391          13.137      5.856
     .10    .67166     7.059      2.580           9.994      4.457
     .20    .46502     4.887      1.786           6.919      3.085
Another type of misclassification which is of some practical
interest, i.e., the case of misclassification only in the neighboring
classes, will now be considered. Let us take the case of r=2, c=3 and
the following pattern of misclassification:
the following pattern of misclassificati.on.:
8
f
1=28
i
1'''.38
I
8,:ll i j 'J
c
:=
i
l°
for 1=1,2 and j=1,2 j
It
C&.n
1
::
P12
I
3.
be seen t,hat, under H , we have
o
:P
$PLP .1 + (1~38 )p~,J..lol J? 02 + 8PLP ,3 + ep_;
c::...
= 8P L
])1'2;
.j
otherwIse
-;,
i'l_
P 02 + (1=:28 )PL P. 3 + 8P2 ,p.3
i
P21 - 8P1 , P 0.1- + (1-28)P2,P.l + ep2,P,2
Let
{p~, P~l' P~2
j
{,6
} ::.;
J
parameters {dij ~ such that
d
13
= d23 =
°
0
,2.1 ·3
1
and let us choose the deviation
:ill:;:; -d12 :""~d21 :::; d 2 :2 "'" d, say., and
Thus> it can be seent.hat,
2
do
0
,"\
,_2:.L...
o 0
po:l P j
0
=:
,
To compute ,6,', first note that
43 005555 de
0
73
-
-d:,)
c! _
d"
m.
-d~2
d~J;
" i
(10 .,
.l...k
12
.1>
-
{1~·1~ )0.
;;::
- (1..
~./
- ~.d';3 - -ed
,
he'd
)
.,0
~'
pil - ,02(6-+8 ),
,0
Pr,'
C.l
-
.08(1-t-e
0
P12
L
For s given valJe of
and.
,0
P'~'::'
- ,i8,
::::
.
I'"
Pi;
1
.1(:L2 +
~--
e )j
~.'-
e,
....
'./
,0
P23
.,228
,
.. ,02 (l)+t/ )
,
the rnatr:ix B( 6:Kj) aDd. tb.E: vc.~;~ tOT' D(
ar.t::!>
Table 2.7.3 shows the effect of misclassification on the asymptotic
power for the case mentioned above, for different values of $\theta$.

Table 2.7.3.  Effect of misclassification (misclassification only in the
              neighboring classes) on the asymptotic power, for r=2, c=3,
              $\{p^0_{1\cdot}, p^0_{\cdot 1}, p^0_{\cdot 2}\} = \{.6, .2, .3\}$ and
              $d_{11} = -d_{12} = -d_{21} = d_{22}$, $d_{13} = d_{23} = 0$

                           α = 0.05                    α = 0.01
                      Δ=12.655    Δ=4.957        Δ=17.427    Δ=8.190
                      (β=0.9)     (β=0.5)        (β=0.9)     (β=0.5)
     θ      Δ'/Δ        Δ'          Δ'             Δ'          Δ'
    0.00   1.00000    12.655      4.957          17.427      8.190
     .01    .72473     9.172      3.593          12.630      5.936
     .02    .72042     9.117      3.571          12.555      5.900
     .10    .69263     8.765      3.433          12.071      5.673
     .20    .65966     8.348      3.270          11.496      5.403

2.8.  The Size of the Test If Misclassification Is Ignored
Suppose that although there are errors of classification, we
ignore them and use the usual test procedure, i.e., reject $H_0$ if

    $X^2 = \sum_{i,j} \frac{(n_{ij} - n_{i\cdot}n_{\cdot j}/n)^2}{n_{i\cdot}n_{\cdot j}/n} > C$

and not otherwise; $C$ being so chosen that $\alpha$ is the level of significance.
In this case, what is the "actual" size of the test? How much
does it deviate from $\alpha$?
Under the null hypothesis, the probabilities governing the observed
data (when misclassification is present) are

    $p'_{ij} = \sum_s \sum_k p_{s\cdot}\,p_{\cdot k}\,\theta_{sikj}$ ,

where

    $\sum_s p_{s\cdot} = \sum_k p_{\cdot k} = 1$.
Thus, the size of the usual test when there is misclassification is
n
lim P
_>
00 r
L(I- _.> C I .P~J:.
::::
2:; 2:;
s k
P
:P k
e .k'.J
SO". B.l
l
/
j
which can be written as
lim
n
->
00
Pr {,
x?-
>
-
C
where
2:; p
P _13 .\'kO
So ok S,l. J
Sj k
and
~
0) ,
,",
If we keep $\gamma_{ij}$ fixed, the above limit is equal to 1 (since $\chi^2$-tests
are consistent). In order to arrive at non-trivial results, suppose we
investigate the above limit when $\gamma_{ij}$, the deviation from the null
hypothesis, is arbitrarily close to zero.
In the attempt to use the limiting procedure, applied earlier in
evaluating asymptotic power of the tests, we let $\gamma_{ij} \to \gamma_{ij}/\sqrt{n}$ and
define
where
10
:p '
oJ
and
f . 1~ OJ'
""
-y-;; /?;
1....
-
(1=1, ... ,1'; j=l, ... ,c).
It can now be seen that the above limit, (2.91), is equal to
where
r...
°
l:
1,J
o
.9
with equality holds if and only if
Hi and ·the size of the usual test proc.edure is 0:.
o
Therefore y u.nle.ss
A.
c
>
0 ar:d
ttHl,':;;
ex
J
> a.
To get an. idea as t;) bm-r a "Wl.1i. change it' m.i.sc:lassif'ication i.s
let us cone.ideY tb.E now t'am.il1ar case of 9.11
It i8 cleac
that~
Let
close to zero,
Let
(J
i
,'1
here) t.he '2.rTors a.re net irdeI,enjent and. hence
1,(r~en
inve,s tigate the above l:imi t
'J8
eu.
~ , ), . .-, e
JJ
'" (J
i
e
is
a:r·b.i~rarily
l-yn ,
Then
a~
'.-
Ilm P
n-~>
-,
co
r
H.m P
.!"J:~->
'Xl
r
\
.-'
x' > c
11
>
'1"
J::'ij
i
rt
''-'
i
Pij
.- F.01., P0 1 +
U:t.J
C
0
.-. PiJ:,j
+
e
i
.....,,"""'.-,
-yD
'Y
(
lj. /vr· tJ"
;'1.
wt.ere
.'
1-', .J'
...l,.,'
It can be easily seen that the above limit is equal to
I-F(Cj(r-l)(c-l),
A. )
o~
)~
0
..i-rep,0 p,
,J )
'10
(0
\ c:
94)
"_
where
A.
o
z
(}<::)'
l'r'eOJj
L,J
For a given set of $\{p^0_{i\cdot}, p^0_{\cdot j}\}$, if we fix $\alpha$, and calculate $\lambda_0$ for
various values of $\theta$, we will be able to see how $\alpha$ changes when
misclassification is ignored. For example, let r=c=2,
$\{p^0_{1\cdot}, p^0_{\cdot 1}\} = \{.6, .8\}$ and $\alpha = 0.05$. If $\theta = .10658$, then
$\lambda_0 = .426$ and 1-F(C,1,.426) = 0.10. Thus, in this case $\alpha$ is doubled if
misclassification is ignored. Table 2.8.1 shows how $\alpha$ changes when
misclassification is ignored, for various values of $\theta$, and different
sets of $\{p^0_{i\cdot}, p^0_{\cdot j}\}$.
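The statement that $\lambda_0 = .426$ doubles a nominal 5 per cent size can be checked by inverting the size function. The sketch below is an illustration under the same assumption as before about F (non-central chi-square with 1 d.f. and non-centrality $\lambda_0$); the function names and the bisection bounds are arbitrary choices.

```python
from math import sqrt
from statistics import NormalDist

def actual_size_1df(lam, alpha):
    """1 - F(C, 1, lam): rejection probability of the nominal-alpha test with 1 d.f."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    root = sqrt(lam)
    return (1 - std_normal.cdf(z - root)) + std_normal.cdf(-z - root)

def noncentrality_for_size(target, alpha, lo=0.0, hi=50.0, tol=1e-10):
    """Bisection for the lam at which the actual size equals target (size increases with lam)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if actual_size_1df(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam_doubling = noncentrality_for_size(0.10, alpha=0.05)
size_at_1242 = actual_size_1df(1.242, alpha=0.05)
```

The first value comes out near .426, and a non-centrality of 1.242 (also listed in Table 2.8.1) corresponds to an actual size of about .20.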
2.9.  Non-Independent Errors When $\theta(rc \times rc)$ Is Unknown

For most practical purposes, it is unrealistic to assume that the
misclassification matrix $\theta(rc \times rc)$ is known, as has been done in Sections
2.5 and 2.6. In some cases, however, experimenters may have some ideas
of the types of misclassification they are faced with in their
experiments; in other words, for a certain set of data, the knowledge that
$\theta(rc \times rc)$ takes on a particular form may be available. For example,
$\theta_{ii'jj'}$ may equal $\theta$ when $i \ne i'$ or $j \ne j'$ (i.e., the case of constant error
rates discussed previously).
Table 208010
e
-----"..
_._._.
~.
Effect of mi3classif'icatio!l (non~,
independent erY'c'!bt,d th con2t.ant. error
rates) on 't.be :O!ize 01' the uBualtest
when m:Lsclassific.ation lE- :ignored
f..
o
_--_._~. ----~--_ ..._~------"
r)(JOO(]
00(001))
(b)
010658
0426
,18:L99
1,2·:;:2
\
0
l P-"
e
.J. ,
. 0
o
I()
o
c:::. '.....J
,":,1'
)
.':P "J' )r
0.00000
021128
00000
106'['4
0001
o
,'.to
(c)
e
c
S:i.ze
ex'
0.00000
00000
O,Ci5
.08567
.14627
.426
,10
1..242
.~'?O
--
~-~~._~-~.
-~_._-~--~_._~_._-..
_
e
A,
o
0,00000
0,000
015982
022760
10674
3,007
0,01
o
'1 r.
,~.~.J
Although some pattern for $\theta(rc \times rc)$ may be known, the values of the
elements in $\theta(rc \times rc)$ may still be unknown. Hence, when errors are not
independent and an "appropriate" test procedure for testing the
hypothesis of independence must be obtained, we are confronted with the problem
of estimating $\theta(rc \times rc)$ in order to construct the test procedure.
Let us proceed to see how to test $H_0$ when $\theta(rc \times rc)$ is unknown.
Here, under $H_0$:

    $p'_{ij} = \sum_{s=1}^{r} \sum_{k=1}^{c} p_{s\cdot}\,p_{\cdot k}\,\theta_{sikj}$ .

The rc functions $\{p'_{11}, \ldots, p'_{rc}\}$ are expressed in terms of certain
unknown parameters. When $\theta(rc \times rc)$ is known, we have already seen, in
Section 2.5, that the number of independent unknown parameters that enter
into the expression of $\{p'_{ij}\}$ is r+c-2. When $\theta(rc \times rc)$ is unknown, the
number of unknown parameters is [rc(rc-1)+(r+c-2)], $\mu$ say. If $\mu < rc$,
then the theorems, discussed earlier when $\theta(rc \times rc)$ is assumed to be
known, are applicable with $\mu$ independent parameters to be estimated,
instead of r+c-2. That is, in the case when $\theta(rc \times rc)$ is unknown,
besides having to find the BAN estimates for the parameters $p_{i\cdot}$ (i=1,...,
r-1) and $p_{\cdot j}$ (j=1,...,c-1), we also find the BAN estimates for
$\theta$'s, the unknown independent elements in $\theta(rc \times rc)$.
However, it can be easily seen that $\mu > rc$ in general, since
rc(rc-1) + (r+c-2) > rc for rc > 1. It is then clear that the general
problem when the $\{p'_{ij}\}$ are functions of $\mu$ = rc(rc-1)+(r+c-2) unknown
parameters cannot be solved. However, we can obtain a solution to the
problem for particular cases, i.e., for cases where the number of
unknown independent elements in $\theta(rc \times rc)$ is less than or equal to
(r-1)(c-1), so that $\mu < rc$; the problem can be solved using the
methods discussed earlier. For example, let

    $\theta_{ii'jj'} = \begin{cases} 1-(rc-1)\theta & \text{if } i=i' \text{ and } j=j', \\ \theta & \text{otherwise,} \end{cases}$

where $0 < \theta < 1/rc$ and $\theta$ is unknown. (Note that this same example has
already been considered for the case when $\theta$ is known, in Section 2.6.)
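The counting argument above can be written as a small helper. This is only an illustrative sketch: the function name and its `free_theta` argument (the number of unknown independent elements of the misclassification matrix) are assumptions introduced here.

```python
def unknown_parameter_count(r, c, free_theta=None):
    """mu = (unknown independent elements of the matrix) + (r + c - 2) marginal parameters.

    With free_theta=None the matrix is fully unknown: each of its rc rows
    has rc - 1 free elements because the row sums to one.
    """
    cells = r * c
    if free_theta is None:
        free_theta = cells * (cells - 1)
    return free_theta + (r + c - 2)

r, c = 3, 4
mu_general = unknown_parameter_count(r, c)                 # 132 + 5 = 137, far above rc = 12
mu_constant = unknown_parameter_count(r, c, free_theta=1)  # one unknown rate: r + c - 1 = 6
side_conditions = r * c - mu_constant                      # number of side conditions left
```

With a single unknown rate the number of remaining side conditions equals (r-1)(c-1), while the fully unknown matrix leaves more parameters than cells.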
As before, we have under the null hypothesis

    $p'_{ij} = p_{i\cdot}\,p_{\cdot j}\,(1 - rc\theta) + \theta$ ,

where

    $\sum_i p_{i\cdot} = \sum_j p_{\cdot j} = 1$.

The rc functions $\{p'_{11}, p'_{12}, \ldots, p'_{rc}\}$ are expressed in terms of
(r+c-2)+1 = r+c-1 unknown parameters, namely $p_{i\cdot}$ (i=1,...,r-1),
$p_{\cdot j}$ (j=1,...,c-1) and $\theta$. Since r+c-1 < rc for r > 1 and c > 1, the BAN
estimates of $\{p_{i\cdot}, p_{\cdot j}, \theta\}$ can be obtained.
The method of minimum $\chi_1^2$ by linearization discussed in Section
2.5 can be applied to obtain BAN estimates $\hat{p}_{i\cdot}$ (i=1,...,r-1),
$\hat{p}_{\cdot j}$ (j=1,...,c-1), $\hat{\theta}$.
Since the essential part of the work in applying the method of
minimum $\chi_1^2$ by linearization is deducing the side conditions, we shall
now show how the side conditions can be deduced for this special case.
There are r+c-1 independent parameters to be estimated, therefore
the number of the side conditions is

    rc - (r+c-1) = (r-1)(c-1).

As usual, one of these conditions can be taken to be
$\sum_{i,j} p'_{ij} - 1 = 0$. Hence, we need only to deduce (r-1)(c-1)-1 side
conditions. When $\theta(rc \times rc)$ is known, we have already seen how a
set of (r-1)(c-1) side conditions can be obtained, for example (2.79).
Now, take any one of these (r-1)(c-1) equations and solve for $\theta$ in
terms of $p'_{ij}$'s. Substitute $\theta(\mathbf{p}')$ into the other (r-1)(c-1)-1
equations. These (r-1)(c-1)-1 equations, after $\theta(\mathbf{p}')$ has been
substituted, together with $\sum_{i,j} p'_{ij} - 1 = 0$, can be taken as a
set of side conditions used in obtaining BAN estimates.
After the side conditions have been deduced, the BAN estimates
$\hat{p}'_{ij}$'s can be obtained by a straightforward application of the formula
(2.65).
The test procedure is then: reject $H_0$ if
reject H i f
o
L:
X,,2 -
:1, j
n
A
{
A
A
p. p " (I-ree) +
~
v
0
J
e
"\
}
> c'
and do not reject otherwise, $C'$ being so chosen that

    $\Pr\{\chi^2_{(r-1)(c-1)-1} \ge C'\}^{11} = \alpha$ .

2.10.  Application: Test of Independence of Errors
Before concluding Section 2, it is convenient here to give a few
suggestions on the possible "practical" application of the results so
far obtained.

11 Note that the number of degrees of freedom here is reduced by 1,
which is the number of unknown $\theta$'s to be estimated.

For a set of categorical data which is subject to errors of
classification (for example, the presence of observation errors when
interview data is used is often experienced), the need for a criterion
to help the experimenter to decide whether the "usual" statistical
analysis of the data could be used or whether an appropriate procedure
(adjusted for errors) must be obtained, is warranted.
For the problem of testing the hypothesis of independence of the
two attributes in a rxc contingency table, we have the results that
(i)   If errors are independent, then they do not affect the test
      procedure, i.e., the usual test procedure can be used,
      although the errors of classification may reduce the
      asymptotic power of the test.
(ii)  If errors are not independent then an "appropriate" test
      procedure must be obtained; and in order to derive a test
      procedure it is necessary for the experimenter to possess
      either a prior knowledge of the values of $\theta(rc \times rc)$ or the
      knowledge that errors follow some particular pattern so as
      to reduce the number of unknown $\theta$'s into a "workable" size.
In practice, one only rarely possesses such knowledge. Thus, it may be
of great interest to be able to determine whether or not the errors for
a certain set of data are independent. This can be accomplished by
using statistical inference, finding a test procedure for testing the
hypothesis of independence of errors.
In order to make this test, a special kind of data is required.
The data needed is a full two-way cross-classification of the "true"
data and of the data obtained in the experiment (subject to errors);
that is, we must have a rcxrc contingency table of the following
form:
Sample
11
12
r:"
Total
11
12
ill 2
-.
Q
...
ij
Total
n
It is true that data of this nature is rarely available;
nevertheless there exist several situations where data of this kind can be
obtained. For example: in medical research studies, there exists
a problem where the investigators are supposed to make a decision
concerning the use of clinical examinations and the use of interviews for
a given study (see Rubin, Rosenbaum and Cobb [27]). The investigators
realize that in using the interview data, there may be individuals who
would report that they are diseased but who would be classified as free
of disease by a physician after a complete medical examination or vice
versa; in other words, misclassification is present in the interview
data. Only when the investigator can measure how this error will affect
his results can he make a rational choice between the use of medical
examinations and interviews. For a problem such as this, a full
cross-classification of the medical examination data and the interview
data can be obtained. Although the medical examination data do not
necessarily represent the "true" data, that is doctors make mistakes, since
the complete physical examination by a physician is the best or preferred
source for a diagnosis available to us, the physician's report can be
accepted as "true."
The problem of the above nature can be seen to have implications to
many other fields of studies as well, i.e., to situations where even
when more precise classification is possible, there may be time or cost
factors that necessitate the use of cruder classification methods. In
fact, our results in Section 2.3 may help decide if a more precise
classification is really worth using in that they show how rapidly
asymptotic power is lost in terms of the loss in sample size upon
increases in misclassification rates.
Furthermore, in many of the Sample Survey problems, the Bureau of
the Census has in its "Evaluation and Research Program Series" a special
reinterview procedure known as the CES (the Content Evaluation Study)
which utilizes improved methods,$^{12}$ in order to obtain as accurate data
as possible. Tables containing full cross-classification of the CES and
the Census data, using the categories for which Census data have been
published, are available (see [28]).
Let us proceed to find a test procedure for testing the hypothesis
of independence of errors given that the data in the form (2.97) are
available.

12 The principal differences between CES and Census procedures are
described in the section "CES Survey Methods and Design" [28].
Let $n_{ii'jj'}$ be the observed number of individuals in the $[(i,j),(i',j')]$-th cell of the $rc \times rc$ contingency table, i.e., the observed number of individuals belonging to the $(i,j)$-th cell being assigned to the $(i',j')$-th cell in the experiment. From the definition of $\theta_{ii'jj'}$ given in Section 2.1, it is clear that $\theta_{ii'jj'}$ is the probability that an individual belonging to the $(i,j)$-th cell is assigned to the $(i',j')$-th cell. That is, the joint density of the observations is
$$\phi = \prod_{i,j}\left[\frac{n_{i\cdot j\cdot}!}{\prod_{i',j'} n_{ii'jj'}!}\ \prod_{i',j'}\theta_{ii'jj'}^{\,n_{ii'jj'}}\right],$$
where $n_{i\cdot j\cdot} = \sum_{i',j'} n_{ii'jj'}$ is fixed and $\sum_{i',j'}\theta_{ii'jj'} = 1$ for all $i=1,\ldots,r$ and $j=1,\ldots,c$.
We wish to test the hypothesis of independence of errors, or equivalently
$$H_{01}:\ \theta_{ii'jj'} = \rho_{ii'}\,\gamma_{jj'},\qquad (i,i'=1,\ldots,r;\ j,j'=1,\ldots,c),$$
where the $\rho$'s and $\gamma$'s are defined as in Section 2.1, and where $\sum_{i'}\rho_{ii'} = \sum_{j'}\gamma_{jj'} = 1$. The hypothesis expresses the $\theta_{ii'jj'}$ in terms of a lesser number of parameters, namely the $\rho$'s and the $\gamma$'s. The number of independent unknown parameters that enter into the expression of the $\theta$'s is $s = r(r-1) + c(c-1)$.
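Under the null hypothesis, the joint density of the observations is given by
$$\phi = \prod_{i,j}\left[\frac{n_{i\cdot j\cdot}!}{\prod_{i',j'} n_{ii'jj'}!}\ \prod_{i',j'}\left(\rho_{ii'}\gamma_{jj'}\right)^{n_{ii'jj'}}\right]. \tag{2.100}$$
In order to obtain the maximum likelihood estimates of the parameters $\rho_{ii'}$ and $\gamma_{jj'}$, let
$$F = \log\phi + \sum_i \lambda_i\Big(\sum_{i'}\rho_{ii'} - 1\Big) + \sum_j \mu_j\Big(\sum_{j'}\gamma_{jj'} - 1\Big), \tag{2.101}$$
where the $\lambda_i$ and $\mu_j$ are Lagrange multipliers. Differentiating $F$ with respect to $\rho_{ii'}$ and equating to zero, we get
$$\frac{n_{ii'\cdot\cdot}}{\rho_{ii'}} + \lambda_i = 0,$$
which can be written as
$$\rho_{ii'} = -\,n_{ii'\cdot\cdot}/\lambda_i. \tag{2.102}$$
Similarly, differentiating $F$ with respect to $\gamma_{jj'}$, we get
$$\gamma_{jj'} = -\,n_{\cdot\cdot jj'}/\mu_j. \tag{2.103}$$
From (2.102), using $\sum_{i'}\rho_{ii'} = 1$, we obtain $\lambda_i = -\,n_{i\cdot\cdot\cdot}$. Similarly, from (2.103) and using $\sum_{j'}\gamma_{jj'} = 1$, it follows that $\mu_j = -\,n_{\cdot\cdot j\cdot}$. Substituting $\lambda_i$ and $\mu_j$ in (2.102) and (2.103), we obtain
$$\hat\rho_{ii'} = n_{ii'\cdot\cdot}/n_{i\cdot\cdot\cdot} \tag{2.104}$$
and
$$\hat\gamma_{jj'} = n_{\cdot\cdot jj'}/n_{\cdot\cdot j\cdot}.$$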
Therefore, by the theory of the frequency $\chi^2$-test, the statistic
$$X_1^2 = \sum_{i,i'}\sum_{j,j'}\frac{\left[n_{ii'jj'} - n_{i\cdot j\cdot}\,(n_{ii'\cdot\cdot}/n_{i\cdot\cdot\cdot})\,(n_{\cdot\cdot jj'}/n_{\cdot\cdot j\cdot})\right]^2}{n_{i\cdot j\cdot}\,(n_{ii'\cdot\cdot}/n_{i\cdot\cdot\cdot})\,(n_{\cdot\cdot jj'}/n_{\cdot\cdot j\cdot})}$$
is distributed as $\chi^2$ with $rc(rc-1) - [r(r-1)+c(c-1)] = (r^2-1)(c^2-1)-(r-1)(c-1)$ degrees of freedom as $n \to \infty$, subject to the ratios $n_{i\cdot j\cdot}/n$ being held fixed, under $H_{01}$.
The test procedure for testing $H_{01}$ is then to reject $H_{01}$ if $X_1^2 > C_1$ and not to reject otherwise; $C_1$ being so chosen that
$$\Pr\{\chi^2_{(r^2-1)(c^2-1)-(r-1)(c-1)} > C_1\} = \alpha$$
(the desired level of significance).
For a particular set of data, if the hypothesis $H_{01}$ is not rejected, then the investigator may feel justified in applying the usual test procedure for testing the hypothesis of independence. If the hypothesis $H_{01}$ is rejected, then an "appropriate" test procedure for testing $H_0$ must be adopted. The method discussed in Section 2.5 may be applied, using as $\theta(rc \times rc)$ its estimate13 $\hat\theta(rc \times rc)$, where $\hat\theta(rc \times rc)$ denotes the matrix
$$\left(\frac{n_{ii'jj'}}{n_{i\cdot j\cdot}}\right),\qquad i,i'=1,\ldots,r;\ j,j'=1,\ldots,c.$$
A set of data is taken from the 1960 U. S. Censuses of Population and Housing to illustrate the above method. Table 2.10.1 gives the 2x2 contingency table with two attributes, "Housing Condition" and "Plumbing Facilities," taken from the U. S. Census of Housing 1960, Volume VI, Rural Housing. The problem is to see whether the two attributes are independent, using the given set of data. It is anticipated that the data is subject to observation errors. A study on the accuracy of the Census data has been made using a reinterview procedure known as the CES (see [28]). Table 2.10.2, taken from the "Evaluation and Research Program," [28], shows a full cross-classification of the CES and the Census data on the condition of housing unit and plumbing facilities.
A test for the hypothesis of independence of errors will now be performed, for the following set of data.
To compute the statistic $X_1^2$, we first compute the $\hat\rho_{ii'}$'s and $\hat\gamma_{jj'}$'s as follows:
$$\hat\rho_{11} = \frac{n_{1111}+n_{1121}+n_{1112}+n_{1122}}{n_{11\cdot\cdot}+n_{12\cdot\cdot}} = .9303,\qquad \hat\rho_{12} = 1 - \hat\rho_{11} = .0697,$$
13 No justification for using such an estimate is being claimed. We give it here merely as a suggestion. Work needs to be done in this area (see "Suggestions for Future Research," Section 5.2).
Table 2.10.1. Condition and plumbing facilities, for occupied rural nonfarm and farm housing units: 1960

                              Plumbing Facilities
Housing Condition          With All      Lacking        Total
Sound                        7815          1379          9194
Deteriorating                 746           955          1701
Total                        8561          2334         10895

Table 2.10.2. Full cross-classification of the CES and the Census data on the conditions of housing unit and plumbing facilities for all occupied housing units

                                              Reported in Census
                                        Sound              Deteriorating
CES Classification                With all   Lacking     With all   Lacking      Total
Sound
  With all plumbing facilities     33572        359        1813        123       35867
  Lacking                            466       1295         132        606        2499
Deteriorating
  With all plumbing facilities      2784        136        1312        129        4361
  Lacking                            201        792          79        755        1827
Total                              37023       2582        3336       1613       44554
and similarly, we have
and
F'inally
i.e., the hypothesis that the errors are independent is rejected.
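The computation of $\hat\rho_{ii'}$, $\hat\gamma_{jj'}$ and $X_1^2$ can be retraced with a short program. The following is only a sketch under stated assumptions: the counts are those of Table 2.10.2, with the CES classification taken as the "true" cell $(i,j)$ and the Census classification as the reported cell $(i',j')$; the function name `independence_of_errors_test` and the dictionary layout are ours and do not appear in the thesis.

```python
# Table 2.10.2 counts: counts[(i, j)][(i2, j2)] is the number of units whose CES
# ("true") cell is (i, j) and whose Census (reported) cell is (i2, j2).
# i: condition (1 = sound, 2 = deteriorating); j: plumbing (1 = with all, 2 = lacking).
counts = {
    (1, 1): {(1, 1): 33572, (1, 2): 359, (2, 1): 1813, (2, 2): 123},
    (1, 2): {(1, 1): 466, (1, 2): 1295, (2, 1): 132, (2, 2): 606},
    (2, 1): {(1, 1): 2784, (1, 2): 136, (2, 1): 1312, (2, 2): 129},
    (2, 2): {(1, 1): 201, (1, 2): 792, (2, 1): 79, (2, 2): 755},
}


def independence_of_errors_test(counts, r, c):
    """Return (rho_hat, gamma_hat, X1^2, degrees of freedom) for H01 (hypothetical helper)."""
    rows, cols = range(1, r + 1), range(1, c + 1)
    # n_{i.j.}: number of units truly in cell (i, j)
    cell_total = {cell: sum(reported.values()) for cell, reported in counts.items()}

    # rho_hat[i][i2] = n_{i i2 . .} / n_{i . . .}
    rho_hat = {}
    for i in rows:
        n_i = sum(cell_total[(i, j)] for j in cols)
        rho_hat[i] = {
            i2: sum(counts[(i, j)][(i2, j2)] for j in cols for j2 in cols) / n_i
            for i2 in rows
        }

    # gamma_hat[j][j2] = n_{. . j j2} / n_{. . j .}
    gamma_hat = {}
    for j in cols:
        n_j = sum(cell_total[(i, j)] for i in rows)
        gamma_hat[j] = {
            j2: sum(counts[(i, j)][(i2, j2)] for i in rows for i2 in rows) / n_j
            for j2 in cols
        }

    # Pearson-type statistic comparing observed counts with n_{i.j.} * rho_hat * gamma_hat
    statistic = 0.0
    for (i, j), reported in counts.items():
        for (i2, j2), observed in reported.items():
            expected = cell_total[(i, j)] * rho_hat[i][i2] * gamma_hat[j][j2]
            statistic += (observed - expected) ** 2 / expected

    df = (r * r - 1) * (c * c - 1) - (r - 1) * (c - 1)
    return rho_hat, gamma_hat, statistic, df


rho_hat, gamma_hat, x1_squared, df = independence_of_errors_test(counts, r=2, c=2)
print(round(rho_hat[1][1], 4), round(rho_hat[1][2], 4), df, round(x1_squared, 2))
```

The first two printed values can be compared with $\hat\rho_{11}$ and $\hat\rho_{12}$ given above; the statistic is to be referred to the $\chi^2$ distribution with the printed degrees of freedom.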
Hence, to test the hypothesis of independence of the two attributes, "Housing Condition" and "Plumbing Facilities," the usual test procedure cannot be used.
We shall now find an appropriate test procedure by applying the method discussed in Section 2.5, using as an estimate of $\theta(rc \times rc)$ the matrix
$$\hat\theta(rc \times rc) = \left(\frac{n_{ii'jj'}}{n_{i\cdot j\cdot}}\right).$$
From the data in Table 2.10.2, we compute $\hat\theta(rc \times rc)$ to be
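and from the basic relation between the $p'_{ij}$ and the $p_{ij}$ we have under the null hypothesis that
$$p'_{12} = .01001\,p_{1\cdot}p_{\cdot 1} + .51821\,p_{1\cdot}p_{\cdot 2} + .03119\,p_{2\cdot}p_{\cdot 1} + .43350\,p_{2\cdot}p_{\cdot 2},$$
$$p'_{21} = .05055\,p_{1\cdot}p_{\cdot 1} + .05282\,p_{1\cdot}p_{\cdot 2} + .30085\,p_{2\cdot}p_{\cdot 1} + .04324\,p_{2\cdot}p_{\cdot 2},$$
with analogous expressions for the remaining cells. By eliminating the two parameters $p_{1\cdot}$ and $p_{\cdot 1}$ from the above equations, we find the side conditions on the $p'_{ij}$, each expressed as a function of the $p'_{ij}$ set equal to zero.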
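The numerical coefficients in the expressions for $p'_{12}$ and $p'_{21}$ are entries of $\hat\theta(rc \times rc)$, each being a cell count of Table 2.10.2 divided by its CES row total. The sketch below recomputes them; the row and column ordering (sound/with all, sound/lacking, deteriorating/with all, deteriorating/lacking) and the helper name `estimate_theta` are our assumptions for illustration.

```python
# Rows: CES ("true") cell; columns: Census (reported) cell; both ordered
# (sound, with all), (sound, lacking), (deteriorating, with all), (deteriorating, lacking).
ces_by_census = [
    [33572, 359, 1813, 123],
    [466, 1295, 132, 606],
    [2784, 136, 1312, 129],
    [201, 792, 79, 755],
]


def estimate_theta(table):
    """theta_hat: each count divided by its row total n_{i.j.} (hypothetical helper)."""
    return [[count / sum(row) for count in row] for row in table]


theta_hat = estimate_theta(ces_by_census)

# Column 2 gives the coefficients of p'_12, column 3 those of p'_21.
coef_p12 = [row[1] for row in theta_hat]
coef_p21 = [row[2] for row in theta_hat]
print([round(x, 5) for x in coef_p12])
print([round(x, 5) for x in coef_p21])
```

Each row of `theta_hat` sums to one, in agreement with the requirement that the $\theta_{ii'jj'}$ sum to one over $(i',j')$.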
Now, from the data, we compute the observed proportions $n_{ij}/n$ of Table 2.10.1 and the estimates $\hat p'_{ij}$, and finally

Cell (i,j)      n_ij/n      Estimate of p'_ij      Difference
(1,1)          .717302          .715902              .00140
(1,2)          .126572          .139392             -.01282
(2,1)          .068472          .072532             -.00406
(2,2)          .087654          .072174              .01548
Therefore)
_.
with
)~1...... 5·.<:::. . 98~**
j
~
8 degrees of freedom,
If misclassification is ignored and the usual test procedure is used, the value of the $X^2$-statistic can be seen to be 1443.5598, which is also highly significant.
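The value of the usual statistic can be checked directly from Table 2.10.1. The following is a minimal sketch of the ordinary Pearson computation for a two-way table; the function name `pearson_chi_square` is ours, and small differences in the later decimals from the figure quoted above are to be expected from rounding.

```python
def pearson_chi_square(table):
    """Usual X^2 statistic and degrees of freedom for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Table 2.10.1: rows = housing condition (sound, deteriorating),
# columns = plumbing facilities (with all, lacking).
housing = [[7815, 1379], [746, 955]]
x_squared, df = pearson_chi_square(housing)
print(round(x_squared, 2), df)
```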
3. CONTINGENCY TABLE WITH ONE MARGINAL FIXED

3.1. The Problem

Consider drawing $t$ independent samples from $t$ different populations, the $i$-th sample consisting of $n_{i\cdot}$ independent individuals from the $i$-th population $(i=1,\ldots,t)$. It is assumed that these populations are comparable in the sense that every individual from each population can be classified into one of the same set of $c$ mutually exclusive and exhaustive categories. Thus, every individual in the total sample of size
$$N = \sum_{i=1}^{t} n_{i\cdot}$$
is characterized by the two indices "i" and "j" of two different characters, "i" representing the population from which it is drawn, while "j" represents the particular category to which it belongs. Let $n_{ij}$ be the observed number of individuals in the $(i,j)$-th cell. Note here that for a fixed "i" the marginal frequency $\sum_{j=1}^{c} n_{ij} = n_{i\cdot}$ is fixed and does not depend on the individual cell frequencies $n_{ij}$'s, while for a fixed "j" the marginal frequency $\sum_{i=1}^{t} n_{ij} = n_{\cdot j}$ is essentially a chance variable.
Whereas in the case of a contingency table with both marginals random we have what is usually called a single-multinomial sample, we have here a product-multinomial sample. The probability model is
$$\phi = \prod_{i=1}^{t}\left[\frac{n_{i\cdot}!}{\prod_{j=1}^{c} n_{ij}!}\ \prod_{j=1}^{c}\pi_{ij}^{\,n_{ij}}\right], \tag{3.1}$$
where $\pi_{ij}$ denotes the conditional probability of an individual belonging to the $(i,j)$-th cell, given that it is in the $i$-th population. Thus we require that $\sum_{j=1}^{c}\pi_{ij} = 1$, while $\sum_{j=1}^{c} n_{ij} = n_{i\cdot}$ is held fixed. Thus the index "i" refers to classification of the population while the index "j" refers to classification of the variate; the quantity $n_{i\cdot}$ denotes the preassigned sample size for the $i$-th population classification, out of which $n_{ij}$ happen to lie in the $j$-th variate-category.
The hypothesis we are interested in testing is that the $t$ populations are the same, i.e., $\pi_{ij}$ is independent of the index 'i'. The null hypothesis can be expressed as
$$H_0:\ \pi_{ij} = \pi_j \quad \text{for all } i=1,\ldots,t \text{ and } j=1,\ldots,c.$$
The traditional large-sample $\chi^2$-test procedure is to reject $H_0$ if
$$X^2 = \sum_{i,j}\frac{(n_{ij} - n_{i\cdot}n_{\cdot j}/n)^2}{n_{i\cdot}n_{\cdot j}/n} \ge C \tag{3.2}$$
and not to reject otherwise; $C$, $n_{i\cdot}$, $n_{\cdot j}$, etc. being defined as follows:
$$\Pr\{\chi^2_{(t-1)(c-1)} \ge C\} = \alpha,\qquad n_{\cdot j} = \sum_i n_{ij},\qquad n_{i\cdot} = \sum_j n_{ij}\ \text{(fixed)},\qquad \text{and}\qquad n = \sum_i n_{i\cdot} = \sum_j n_{\cdot j},$$
$(i=1,\ldots,t;\ j=1,\ldots,c)$.
Similar to the situation with both marginals random, here errors of classification can also arise in two directions:
(i) In the "i" direction, e.g., an individual is classified in the first population when in fact it belongs to the second.
(ii) In the "j" direction, e.g., an individual may be classified in the first category when in fact it comes from the second.
Let
$$\theta(tc \times tc) = (\theta_{ii'jj'}),$$
where the $\theta_{ii'jj'}$'s are defined as in Section 2 and $\theta(tc \times tc)$ is assumed to be non-singular as before.
Since we have let $\pi_{ij}$ $(\sum_{j=1}^{c}\pi_{ij} = 1)$ denote the conditional probability of an individual belonging to the $j$-th category given that it comes from the $i$-th population when there are no errors of classification, we will correspondingly let $\pi'_{ij}$ $(\sum_{j=1}^{c}\pi'_{ij} = 1)$ denote the conditional probability of an individual being assigned to the $(i,j)$-th cell when there are errors of classification.
For data subjected to misclassification, the joint density of the observations is
$$\phi' = \prod_{i=1}^{t}\left[\frac{n_{i\cdot}!}{\prod_{j=1}^{c} n_{ij}!}\ \prod_{j=1}^{c}\pi_{ij}'^{\,n_{ij}}\right], \tag{3.3}$$
where $\sum_{j=1}^{c}\pi'_{ij} = 1$ and $\sum_{j=1}^{c} n_{ij} = n_{i\cdot}$ is fixed, $(i=1,\ldots,t;\ j=1,\ldots,c)$.
Since we are considering the general case where errors can arise in both the population-wise and variate-wise classifications, it is convenient to note the following relation
$$\pi_{ij} = p_{ij}/p_{i\cdot},\qquad i=1,\ldots,t;\ j=1,\ldots,c, \tag{3.4}$$
where $p_{ij}$ denotes the 'unconditional' probability of an individual belonging to the $(i,j)$-th cell (i.e., following the definition given in Section 2), where $p_{i\cdot} = \sum_{j=1}^{c} p_{ij}$ and $\sum_{i,j} p_{ij} = 1$. Similarly, we have
$$\pi'_{ij} = p'_{ij}/p'_{i\cdot},\qquad i=1,\ldots,t;\ j=1,\ldots,c, \tag{3.5}$$
where $p'_{ij}$ denotes the 'unconditional' probability of an individual being assigned to the $(i,j)$-th cell when there are errors of classification, and $p'_{i\cdot} = \sum_{j=1}^{c} p'_{ij}$, $\sum_{i,j} p'_{ij} = 1$, as defined in Section 2.
From the definition of the $\theta_{ii'jj'}$'s, it is clear that
$$p'_{ij} = \sum_{s,k} p_{sk}\,\theta_{sikj}, \tag{3.6}$$
where $\sum_{i,j} p'_{ij} = 1$, $(i=1,\ldots,t;\ j=1,\ldots,c)$. Hence, it follows from (3.4) and (3.5) that
$$\pi'_{ij} = \frac{\sum_{s,k} p_{s\cdot}\,\pi_{sk}\,\theta_{sikj}}{\sum_{j}\big(\sum_{s,k} p_{s\cdot}\,\pi_{sk}\,\theta_{sikj}\big)},\qquad (i=1,\ldots,t;\ j=1,\ldots,c), \tag{3.7}$$
and $\sum_j \pi'_{ij} = 1$ for all $i=1,\ldots,t$.
Let us see how to test the hypothesis $H_0: \pi_{ij} = \pi_j$, where $\sum_j \pi_j = 1$, when there are errors of classification. Again, the two cases of independent and non-independent errors will be considered separately.14
14 The discussions to follow are quite analogous to those in Section 2. Therefore, we shall not go into detail here and shall, instead, refer to the results obtained earlier, when needed.
3.2. Independent Errors

Let $\rho_{ii'}$ $(i,i'=1,\ldots,t)$ and $\gamma_{jj'}$ $(j,j'=1,\ldots,c)$ be the error probabilities defined as in Section 2. When errors are independent, we have
$$\theta_{ii'jj'} = \rho_{ii'}\,\gamma_{jj'}, \tag{3.8}$$
where
$$\sum_{i',j'}\theta_{ii'jj'} = \sum_{i'}\rho_{ii'} = \sum_{j'}\gamma_{jj'} = 1,\qquad (i,i'=1,\ldots,t;\ j,j'=1,\ldots,c).$$
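For the case of independent errors, from (3.7) and (3.8) we can write
$$\pi'_{ij} = \frac{\sum_{s,k} p_{s\cdot}\,\pi_{sk}\,\rho_{si}\,\gamma_{kj}}{\sum_{s} p_{s\cdot}\,\rho_{si}},\qquad (i=1,\ldots,t;\ j=1,\ldots,c). \tag{3.9}$$
Under $H_0$, we have
$$\pi'_{ij} = \frac{\sum_{s,k} p_{s\cdot}\,\pi_{k}\,\rho_{si}\,\gamma_{kj}}{\sum_{s} p_{s\cdot}\,\rho_{si}} = \sum_k \pi_k\,\gamma_{kj} = \pi'_j,\ \text{say}, \tag{3.10}$$
for all $i=1,\ldots,t$ and $j=1,\ldots,c$. Hence,
$$H_0:\ \pi_{ij} = \pi_j \iff H_0':\ \pi'_{ij} = \pi'_j. \tag{3.11}$$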
That is, rejection or acceptance of $H_0'$ implies rejection or acceptance of $H_0$. Therefore, the problem of testing $H_0$ reduces to that of testing $H_0'$. The test procedure for testing $H_0'$ can be easily seen to be: Reject $H_0'$ (and hence $H_0$) if
$$X'^2 = \sum_{i,j}\frac{(n_{ij} - n_{i\cdot}n_{\cdot j}/n)^2}{n_{i\cdot}n_{\cdot j}/n} \ge C \tag{3.12}$$
and do not reject otherwise. The quantities $C$, $n_{i\cdot}$, $n_{\cdot j}$, etc. are defined as in (3.2).
Hence, for the case of independent errors it is seen that we get the same test procedure as we would have obtained if there had been no errors of classification.
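The invariance just stated can be checked numerically. The sketch below evaluates the conditional probabilities of the reported cells from the expression for independent errors and confirms that, under $H_0$, they do not depend on $i$. All numerical values (the population proportions, the common category probabilities and the two error matrices) are invented for illustration, and the function name is ours.

```python
def reported_conditionals(p_pop, pi, rho, gamma):
    """pi'_{ij} = sum_{s,k} p_s pi_{sk} rho_{si} gamma_{kj} / sum_s p_s rho_{si}."""
    t, c = len(p_pop), len(pi[0])
    result = []
    for i in range(t):
        denominator = sum(p_pop[s] * rho[s][i] for s in range(t))
        row = [
            sum(p_pop[s] * pi[s][k] * rho[s][i] * gamma[k][j]
                for s in range(t) for k in range(c)) / denominator
            for j in range(c)
        ]
        result.append(row)
    return result


# Illustrative values only (not taken from the thesis).
p_pop = [0.5, 0.3, 0.2]                      # p_{s.}
pi_common = [0.6, 0.3, 0.1]                  # pi_k, the same for every population under H0
pi = [pi_common[:] for _ in p_pop]
rho = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.02, 0.08, 0.90]]     # population-wise errors
gamma = [[0.85, 0.10, 0.05], [0.20, 0.70, 0.10], [0.05, 0.15, 0.80]]   # variate-wise errors

pi_reported = reported_conditionals(p_pop, pi, rho, gamma)
pi_expected = [sum(pi_common[k] * gamma[k][j] for k in range(3)) for j in range(3)]

for row in pi_reported:
    print([round(x, 6) for x in row])
print([round(x, 6) for x in pi_expected])
```

Every printed row coincides with the last line, $\sum_k \pi_k\gamma_{kj}$, so the homogeneity hypothesis carries over to the reported probabilities whatever the population-wise error matrix is.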
3.3. The Asymptotic Power Function for the Case of Independent Errors

Before discussing the asymptotic power, we first state a result, proved by Mitra [14], concerning the asymptotic power function for the case of one marginal fixed.
Under certain regularity conditions on the $\pi_{ij}$'s (see Mitra [14], p. 1222), let $H_0$ denote the hypothesis
$$\pi_{ij} = f_{ij}(\alpha_1,\ldots,\alpha_m),\qquad (i=1,\ldots,t;\ j=1,\ldots,c),$$
and $H_{1n}$ denote the simple alternative
$$\pi_{ij} = f_{ij}(\alpha_1^0,\ldots,\alpha_m^0) + \delta_{ij}/\sqrt{n},$$
where $\alpha_0' = (\alpha_1^0,\ldots,\alpha_m^0)$ is an inner point of $A$ (a non-degenerate interval in the $m$-dimensional space of the $\alpha_k$'s), and $\{\delta_{ij}\}$ a set of deviation parameters, not all zero, such that $\sum_j \delta_{ij} = 0$ for all $i=1,\ldots,t$.
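Let
$$X^2 = \sum_{i,j}\frac{\left[n_{ij} - n_{i\cdot}\,f_{ij}(\hat\alpha_1,\ldots,\hat\alpha_m)\right]^2}{n_{i\cdot}\,f_{ij}(\hat\alpha_1,\ldots,\hat\alpha_m)},$$
where $(\hat\alpha_1,\ldots,\hat\alpha_m)$ are consistent solutions to the likelihood equations. Then, the asymptotic power of the test, using the statistic $X^2$, defined to be
$$\beta = \lim_{n\to\infty}\Pr\{X^2 \ge C \mid H_{1n}\}, \tag{3.16}$$
is equal to
$$1 - F(C,\ t(c-1)-m,\ \Delta),$$
where $\Delta$ is the non-centrality parameter.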
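In our case, the hypothesis we are testing is
$$H_0:\ \pi_{ij} = \pi_j = f_{ij}(\alpha_1,\ldots,\alpha_{c-1}),\qquad (i=1,\ldots,t;\ j=1,\ldots,c),$$
where
$$\sum_{j=1}^{c}\pi_j = \sum_{j=1}^{c} f_{ij}(\alpha_1,\ldots,\alpha_{c-1}) = 1$$
and $\alpha_k = \pi_k$, $(k=1,\ldots,c-1)$.
We shall apply the above theorem to find the asymptotic power of the $\chi^2$-tests for the problem at hand, i.e., testing $H_0: \pi_{ij} = \pi_j$ where $\sum_j \pi_j = 1$. When there are no errors of classification, it can be seen from the theorem that the asymptotic power of $X^2$, where
$$X^2 = \sum_{i,j}\frac{(n_{ij} - n_{i\cdot}n_{\cdot j}/n)^2}{n_{i\cdot}n_{\cdot j}/n},$$
is
$$\beta = \lim_{n\to\infty}\Pr\{X^2 \ge C \mid H_{1n}\} = 1 - F(C,\ (t-1)(c-1),\ \Delta),$$
where $\Delta$ is defined through
$$\delta'(1 \times tc) = \Big(\delta_{ij}\sqrt{Q_i/\pi_j^0}\Big)\qquad\text{and}\qquad B(tc \times (c-1)) = \Big\{\sqrt{Q_i/\pi_j^0}\ (\partial f_{ij}/\partial\alpha_k)_0\Big\}.$$
Mote [15] has evaluated the value of the non-centrality parameter $\Delta$, as defined above, to be
$$\Delta = \sum_{\substack{i,h=1\\ i<h}}^{t}\ \sum_{j=1}^{c}\frac{(\delta_{ij}-\delta_{hj})^2\,Q_i Q_h}{\pi_j^0}. \tag{3.20}$$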
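Now, let us see what happens to the asymptotic power when there are errors of classification. That is, we wish to find
$$\beta' = \lim_{n\to\infty}\Pr\{X'^2 \ge C \mid H_{1n}\},$$
where $X'^2$ and $H_{1n}$ are as defined in (3.12) and (3.19) respectively. As in Section 2.3, it can be shown that
$$H_{1n}:\ \pi_{ij} = \pi_j^0 + \delta_{ij}/\sqrt{n} \iff H_{1n}':\ \pi'_{ij} = \pi_j'^{\,0} + \delta'_{ij}/\sqrt{n}, \tag{3.22}$$
where
$$\delta'_{ij} = \frac{\sum_{s,k} p_{s\cdot}\,\delta_{sk}\,\rho_{si}\,\gamma_{kj}}{\sum_s p_{s\cdot}\,\rho_{si}},\qquad \pi_j'^{\,0} = \sum_k \pi_k^0\,\gamma_{kj},$$
and $\sum_j \delta'_{ij} = 0$ for all $i=1,\ldots,t$. Therefore,
$$\beta' = \lim_{n\to\infty}\Pr\{X'^2 \ge C \mid H_{1n}'\} = 1 - F(C,\ (t-1)(c-1),\ \Delta'), \tag{3.23}$$
where
$$\Delta' = \sum_{\substack{i,h=1\\ i<h}}^{t}\ \sum_{j=1}^{c}\frac{(\delta'_{ij}-\delta'_{hj})^2\,Q_i Q_h}{\pi_j'^{\,0}},$$
and $\{\delta'_{ij}\}$ and $\{\pi_j'^{\,0}\}$ are as defined in (3.22).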
We have not yet been successful in attempting to show that $\Delta' \le \Delta$ in general, as we have done for the case of both marginals random. One of the difficulties encountered here is that the marginal probabilities $p_{i\cdot}$ $(i=1,\ldots,t)$ enter into the expression of $\Delta'$. However, by using Cauchy's inequality, we can show that $\Delta' \le \Delta$ for the special cases where the marginal probabilities are equal and the $\rho$'s are of the following form:
$$\rho_{ii'} = \begin{cases}\rho, & \text{if } i \ne i'\\ 1-(t-1)\rho, & \text{otherwise,}\end{cases}$$
where $0 < \rho < 1/t$, $(i,i'=1,\ldots,t)$.
We have already shown in Section 2.1 that cases of errors occurring in one direction are just special cases of the case of independent errors. However, let us examine here the case of errors occurring only in the variate-wise classification, i.e., the matrix $\rho(t \times t) = (\rho_{ii'})$ is an identity matrix. From (3.22) it can be easily seen, for the above case, that
$$\delta'_{ij} = \frac{\sum_k p_{i\cdot}\,\delta_{ik}\,\gamma_{kj}}{p_{i\cdot}} = \sum_k \delta_{ik}\,\gamma_{kj};$$
hence, the expression $\Delta'$ of (3.23) no longer involves the marginal probabilities $p_{i\cdot}$. It can also be shown, using Cauchy's inequality, that for this special case $\Delta' \le \Delta$ in general, and $\Delta' < \Delta$ if, for some $j$, none of the $\gamma_{kj}$'s $(k=1,\ldots,c)$ are zero. Hence, we have the result for the case of errors occurring only in the $j$-th direction that although misclassification does not affect the test procedure, it may reduce the asymptotic power (which agrees with the result obtained by Mote [15]).
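The reduction of the non-centrality parameter for variate-wise errors can be illustrated numerically. The sketch below evaluates $\Delta$ in the form (3.20) and $\Delta'$ with $\delta'_{ij} = \sum_k \delta_{ik}\gamma_{kj}$ and $\pi_j'^{\,0} = \sum_k \pi_k^0\gamma_{kj}$. The ratios $Q_i$, the deviations, the null probabilities and the error matrix are invented values chosen only for illustration, and the function name is ours.

```python
def noncentrality(q, delta, pi0):
    """Sum over i < h and j of Q_i Q_h (delta_ij - delta_hj)^2 / pi0_j, as in (3.20)."""
    t, c = len(q), len(pi0)
    return sum(
        q[i] * q[h] * (delta[i][j] - delta[h][j]) ** 2 / pi0[j]
        for i in range(t) for h in range(i + 1, t) for j in range(c)
    )


# Illustrative values only (not taken from the thesis).
q = [0.5, 0.5]                                   # Q_i, fixed sample-size ratios
pi0 = [0.5, 0.3, 0.2]                            # pi_j^0 under the null hypothesis
delta = [[0.3, -0.1, -0.2], [-0.2, 0.1, 0.1]]    # each row sums to zero
gamma = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.10, 0.10, 0.80]]

c = len(pi0)
# Errors in the variate-wise direction only: rho is the identity matrix.
delta_reported = [[sum(row[k] * gamma[k][j] for k in range(c)) for j in range(c)]
                  for row in delta]
pi0_reported = [sum(pi0[k] * gamma[k][j] for k in range(c)) for j in range(c)]

delta_clean = noncentrality(q, delta, pi0)
delta_misclassified = noncentrality(q, delta_reported, pi0_reported)
print(round(delta_clean, 6), round(delta_misclassified, 6))
```

For these values the second figure is smaller than the first, which is the behaviour that Cauchy's inequality guarantees in general for this special case.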
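3.4. Illustration

As an illustration, let us now consider the following example:
$$\rho_{ii'} = \begin{cases}\theta, & \text{if } i \ne i'\\ 1-(t-1)\theta, & \text{otherwise}\end{cases} \tag{3.26}$$
and
$$\gamma_{jj'} = \begin{cases}\gamma\theta, & \text{if } j \ne j'\\ 1-\gamma(c-1)\theta, & \text{otherwise,}\end{cases}$$
where $0 < \theta < \min(1/t,\ 1/\gamma c)$. Assume that errors are independent; then
$$\theta_{ii'jj'} = \begin{cases}[1-(t-1)\theta]\,[1-\gamma(c-1)\theta], & \text{if } i=i',\ j=j'\\ [1-(t-1)\theta]\,\gamma\theta, & \text{if } i=i',\ j\ne j'\\ [1-\gamma(c-1)\theta]\,\theta, & \text{if } i\ne i',\ j=j'\\ \gamma\theta^2, & \text{if } i\ne i',\ j\ne j',\end{cases}$$
and
$$p'_{ij} = (1-t\theta)(1-\gamma c\theta)\,p_{ij} + (1-t\theta)\gamma\theta\,p_{i\cdot} + (1-\gamma c\theta)\theta\,p_{\cdot j} + \gamma\theta^2,$$
where $\sum_{i,j} p'_{ij} = 1$, $(i=1,\ldots,t;\ j=1,\ldots,c)$. It is easy to see that
$$p'_{i\cdot} = \sum_j p'_{ij} = (1-t\theta)\,p_{i\cdot} + \theta$$
and hence
$$\pi'_{ij} = \frac{p'_{ij}}{p'_{i\cdot}} = \frac{(1-t\theta)(1-\gamma c\theta)\,p_{i\cdot}\pi_{ij} + (1-t\theta)\gamma\theta\,p_{i\cdot} + (1-\gamma c\theta)\theta\,p_{\cdot j} + \gamma\theta^2}{(1-t\theta)\,p_{i\cdot} + \theta}, \tag{3.28}$$
$(i=1,\ldots,t;\ j=1,\ldots,c)$. Under $H_0: \pi_{ij} = \pi_j$, where $\sum_j \pi_j = 1$, we have
$$\pi'_{ij} = \pi'_j,\qquad\text{where}\quad \pi'_j = (1-\gamma c\theta)\,\pi_j + \gamma\theta,\quad \sum_j \pi'_j = 1,$$
for $i=1,\ldots,t$, $j=1,\ldots,c$.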
Hence, $H_0 \iff H_0'$, and misclassification does not affect the test procedure.
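The closed forms of this illustration, $p'_{i\cdot} = (1-t\theta)p_{i\cdot} + \theta$ and $\pi'_j = (1-\gamma c\theta)\pi_j + \gamma\theta$, can be checked against direct multiplication by the error matrices. The following is a sketch only; the table size, the values of $\theta$ and $\gamma$ and the probabilities are invented for illustration, and the helper name `equal_error_matrix` is ours.

```python
def equal_error_matrix(size, off_diagonal):
    """Matrix with `off_diagonal` off the diagonal and 1 - (size - 1) * off_diagonal on it."""
    return [
        [off_diagonal if a != b else 1 - (size - 1) * off_diagonal for b in range(size)]
        for a in range(size)
    ]


# Illustrative values only; theta must satisfy 0 < theta < min(1/t, 1/(gamma*c)).
t, c = 3, 4
theta, gamma_mult = 0.02, 2.0
rho = equal_error_matrix(t, theta)                  # rho_{ii'}
gamma = equal_error_matrix(c, gamma_mult * theta)   # gamma_{jj'}

p_pop = [0.5, 0.3, 0.2]        # p_{i.}
pi = [0.4, 0.3, 0.2, 0.1]      # pi_j under H0

# Direct computation through the error matrices.
p_pop_reported = [sum(p_pop[s] * rho[s][i] for s in range(t)) for i in range(t)]
pi_reported = [sum(pi[k] * gamma[k][j] for k in range(c)) for j in range(c)]

# Closed forms of the illustration.
p_pop_closed = [(1 - t * theta) * p + theta for p in p_pop]
pi_closed = [(1 - gamma_mult * c * theta) * x + gamma_mult * theta for x in pi]

print([round(x, 4) for x in p_pop_reported], [round(x, 4) for x in p_pop_closed])
print([round(x, 4) for x in pi_reported], [round(x, 4) for x in pi_closed])
```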
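Also, it can be seen that for a simple alternative
$$H_{1n}:\ \pi_{ij} = \pi_j^0 + \delta_{ij}/\sqrt{n},$$
where $\sum_j \delta_{ij} = 0$, $i=1,\ldots,t$, we have
$$\Delta = \sum_{\substack{i,h=1\\ i<h}}^{t}\ \sum_{j=1}^{c}\frac{(\delta_{ij}-\delta_{hj})^2\,Q_i Q_h}{\pi_j^0} \tag{3.30}$$
and
$$\Delta' = \sum_{\substack{i,h=1\\ i<h}}^{t}\ \sum_{j=1}^{c}\frac{(\delta'_{ij}-\delta'_{hj})^2\,Q_i Q_h}{\pi_j'^{\,0}},$$
where
$$\delta'_{ij} = \frac{(1-\gamma c\theta)\big[(1-t\theta)\,p_{i\cdot}\,\delta_{ij} + \theta\sum_s p_{s\cdot}\,\delta_{sj}\big]}{(1-t\theta)\,p_{i\cdot} + \theta}\qquad\text{and}\qquad \pi_j'^{\,0} = (1-\gamma c\theta)\,\pi_j^0 + \gamma\theta,$$
$(i=1,\ldots,t;\ j=1,\ldots,c)$.
For a given set of $\{p_{i\cdot}\}$ and $\{\pi_j^0\}$, if we fix $\Delta$ and $\alpha$ and calculate $\Delta'$ for various values of $\theta$ and $\gamma$, we will be able to compare the asymptotic powers for the two situations.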
For example, let $p_{i\cdot} = 1/t$ and $\pi_j^0 = 1/c$ for all $i$ and $j$; then
$$\Delta' = (1-t\theta)^2(1-\gamma c\theta)^2\,\Delta < \Delta,$$
i.e., the asymptotic power is reduced. For $t=c=2$, $\Delta = 10.509$ and $\alpha = 0.05$, if $\theta = .01743$, $\gamma = 3$, then $\Delta' = 7.849$, and $\beta = 1-F(C,1,10.509)$ and $\beta' = 1-F(C,1,7.849)$ can be read from Fix's tables of non-central chi-square. These values are seen to be 0.9 and 0.8 respectively. That is, if $\alpha = 0.05$ and $\theta = .01743$, the asymptotic power would be reduced from 0.9 to 0.8.
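The figures of this example can be retraced without tables. The sketch below rests on two assumptions of ours: that with equal marginal probabilities and equal null probabilities the ratio $\Delta'/\Delta$ is $(1-t\theta)^2(1-\gamma c\theta)^2$ (this reproduces the value 7.849 quoted above), and that for one degree of freedom the non-central $\chi^2$ probability can be written through the standard normal distribution. The function name `power_one_df` is ours.

```python
from math import sqrt
from statistics import NormalDist


def power_one_df(noncentrality, alpha):
    """Pr{non-central chi-square (1 d.f.) > C}, with C the upper-alpha point of chi-square(1).

    Uses the fact that a 1-d.f. non-central chi-square variable is (Z + sqrt(noncentrality))^2.
    """
    z = NormalDist()
    critical_root = z.inv_cdf(1 - alpha / 2)   # square root of C
    root = sqrt(noncentrality)
    return (1 - z.cdf(critical_root - root)) + z.cdf(-critical_root - root)


t = c = 2
theta, gamma_mult = 0.01743, 3.0
delta = 10.509

# Assumed reduction factor for equal p_i. and equal pi_j^0 (see the lead-in).
ratio = (1 - t * theta) ** 2 * (1 - gamma_mult * c * theta) ** 2
delta_misclassified = ratio * delta

power_clean = power_one_df(delta, alpha=0.05)
power_misclassified = power_one_df(delta_misclassified, alpha=0.05)
print(round(delta_misclassified, 3), round(power_clean, 3), round(power_misclassified, 3))
```

The ratio $\Delta'/\Delta$ computed here is about 0.75, so in this example roughly a quarter of the effective sample size is lost to misclassification.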
The effect of misclassification of the example above on the asymptotic power of the test, for different values of $\gamma$ and $\theta$ and different sizes of contingency tables, can be seen from Tables 2.4.1 and 2.4.2. These numerical calculations show that misclassification is more serious for smaller levels of significance.

3.5. Non-independent Errors

Under $H_0: \pi_{ij} = \pi_j$, where $\sum_j \pi_j = 1$, we have
$$\pi'_{ij} = \frac{\sum_{s,k} p_{s\cdot}\,\pi_k\,\theta_{sikj}}{\sum_j \sum_{s,k} p_{s\cdot}\,\pi_k\,\theta_{sikj}}, \tag{3.31}$$
where $\sum_j \pi'_{ij} = 1$, $(i=1,\ldots,t;\ j=1,\ldots,c)$.
When errors are not independent, the relationship (3.11) does not hold. Consequently a procedure for testing $H_0$ must be obtained. The theory of frequency $\chi^2$-tests can be utilized to obtain a test procedure for testing $H_0$, or equivalently (3.31). That is, the problem of finding a test procedure is reduced to finding BAN estimates for the independent unknown parameters in the expressions of the $\pi'_{ij}$'s, namely (3.31). The method of minimum $\chi_1^2$ by linearization can be used here to obtain the required BAN estimates.
When $\theta(tc \times tc)$ is known, the number of independent unknown parameters to be estimated is $(t-1)+(c-1)$. The parameters are $p_{s\cdot}$ $(s=1,\ldots,t-1)$ and $\pi_k$ $(k=1,\ldots,c-1)$. Let $\hat p_{s\cdot}$ and $\hat\pi_k$ denote the BAN estimates of $p_{s\cdot}$ and $\pi_k$, respectively. Then, the test procedure is: Reject $H_0$ if
$$\sum_{i,j}\frac{\left[n_{ij} - n_{i\cdot}\,\hat\pi'_{ij}\right]^2}{n_{i\cdot}\,\hat\pi'_{ij}} > C^{(1)},\qquad \hat\pi'_{ij} = \frac{\sum_{s,k}\hat p_{s\cdot}\,\hat\pi_k\,\theta_{sikj}}{\sum_j\sum_{s,k}\hat p_{s\cdot}\,\hat\pi_k\,\theta_{sikj}},$$
and do not reject otherwise; where $C^{(1)}$ is so chosen that
$$\Pr\{\chi^2_{(t-1)(c-2)} > C^{(1)}\} = \alpha.$$
It should be pointed out here that for the case of non-independent errors as discussed in this section, the problems where $c \le 2$ cannot be solved (i.e., the problems involving the commonly used 2x2 contingency tables are not being considered). This is due to the fact that for $c \le 2$ the number of independent $\pi'_{ij}$'s is less than the number of independent unknown parameters in the expression (3.31).
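When $\theta(tc \times tc)$ is unknown, the number of independent parameters is $(t-1)+(c-1)+u$, where $u$ is the number of unknown independent elements in $\theta(tc \times tc)$. The parameters to be estimated in this case are $p_{s\cdot}$ $(s=1,\ldots,t-1)$, $\pi_k$ $(k=1,\ldots,c-1)$ and $\theta_g$ $(g=1,\ldots,u)$. Let $\hat p_{s\cdot}$, $\hat\pi_k$ and $\hat\theta_g$ be the BAN estimates of $p_{s\cdot}$, $\pi_k$ and $\theta_g$, respectively. Then, the test procedure is to reject $H_0$ if
$$\sum_{i,j}\frac{\left[n_{ij} - n_{i\cdot}\,\hat\pi'_{ij}\right]^2}{n_{i\cdot}\,\hat\pi'_{ij}} > C^{(2)},$$
where $\hat\pi'_{ij}$ is obtained from (3.31) on substituting $\hat p_{s\cdot}$, $\hat\pi_k$ and $\hat\theta_g$, and not to reject otherwise; where $C^{(2)}$ is so chosen that
$$\Pr\{\chi^2_{(t-1)(c-2)-u} > C^{(2)}\} = \alpha.$$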
3.6. The Size of the Test If Misclassification Is Ignored

To evaluate the actual size of the usual test when misclassification is ignored, the limiting procedure used in Section 2.8 will be followed.
L
;( ==
i,j
and do not reject otherwise., where C is chosen such that
The size
o1~
being defined
bJ;;
the dbove rest, \{hen Y'1.L'cla83::ification is pre2ent.1 L,
n c'
I:p
~ reke,'
'l'
- ~"
BL<:J
0
a'
==
lim
n-'> co
P
r
x?
>
_
C
S.lK
L
I;
P
J s"k
o
'"
0
It {)
z,,;.;,o
,
,
;' slk'l
_....
"';"
which can be written as
0;'
=
lim P
n-> co r
:ij
where
I-l •.
l.J
and
rt~ J
-
I: I-l,. = 0, for all :1,:=1,.,
j
lJ
r-5.;:',?)
\.
./,
0' to
By the similar method as in Section 208, we can show that
0;'
0=:
l-F(C,(t-l)(C-l),A. ) ,
o
where
c:
L:
i,1.1::1 j:;;;;l
i<11
t
A.
o
r:
Q..:'-l.", •
.J..
We shall now show that, in general, $\lambda_0 \ge 0$ (i.e., $\alpha' \ge \alpha$), and $\lambda_0 = 0$ if and only if misclassification is such that (3.11) holds.
It is clear from the above expression of $\lambda_0$ that $\lambda_0 \ge 0$, and also $\lambda_0 = 0$ if and only if $\pi'_{ij} - \pi'_{hj} = 0$ for all $i$, $h$ and $j$, i.e.,
$$\pi'_{ij} = \pi'_j,\ \text{say, i.e., it is independent of } i. \tag{3.40}$$
Thus, unless $\theta(tc \times tc)$ is such that (3.40) holds (i.e., (3.11) holds), $\lambda_0 > 0$. That is, $\alpha' > \alpha$.
In other words, if misclassification (such that it does affect the test procedure) is ignored and the usual test procedure is used, the actual size of the test increases.
4. GENERALIZATION

The investigations of the effect of misclassification in Sections 2 and 3 are essentially obtained for two-way contingency tables, either for the case with both marginals random or that with one marginal fixed. The extension of the work from tables of two to more than two dimensions does not bring about any real problems so far as the "workable" notations and the main concepts of the investigation of the effects of misclassification are concerned.
Although it does introduce a more cumbersome set of notations, the extension to the general $d_1 \times d_2 \times \cdots \times d_m$ contingency table does not require more than a straightforward generalization of those we introduce for the two-dimensional case.
To illustrate the above point, we shall give a brief extension of the results so far obtained to $r \times c \times t$ three-way cross-classification tables with all marginals random. It should be noted that for a three-way contingency table, there are several hypotheses to be tested: for example, the hypothesis of complete independence among the three attributes, the hypothesis of conditional independence between the first and the second attributes given the third attribute (say), the hypothesis of multiple independence, etc. (see Roy and Kastenbaum [24]). We shall, however, consider only the problem of testing the hypothesis of complete independence for a three-way contingency table.
The problems concerning more than three-dimensional tables are not frequently met with in practice and will not be discussed here. However, it is hoped that the readers who are confronted with these problems will not find too much difficulty in generalizing the results obtained here to the problems at hand.
As a natural extension of a two-way contingency table, suppose that the $n$ observations of a random sample are arranged in cells at the points of a three-dimensional rectangular lattice in rows (according to some observable characteristic $C_1$ of individuals belonging to the sample), columns (according to some other characteristic $C_2$) and layers (according to a third characteristic $C_3$).
Let $p_{ijk}$ be the probability of an individual falling in the $(i,j,k)$-th cell corresponding to the $i$-th row, $j$-th column and $k$-th layer; and let $n_{ijk}$ denote the observed number of individuals in the sample belonging to the $(i,j,k)$-th cell, $(i=1,\ldots,r;\ j=1,\ldots,c$ and $k=1,\ldots,t)$. We have
$$\sum_{i,j,k} p_{ijk} = 1\qquad\text{and}\qquad \sum_{i,j,k} n_{ijk} = n. \tag{4.1}$$
Let us write
$$p_{ij\cdot} = \sum_k p_{ijk},\quad p_{i\cdot k} = \sum_j p_{ijk}\quad\text{and}\quad p_{\cdot jk} = \sum_i p_{ijk};$$
$$p_{i\cdot\cdot} = \sum_j p_{ij\cdot} = \sum_k p_{i\cdot k};\quad p_{\cdot j\cdot} = \sum_i p_{ij\cdot} = \sum_k p_{\cdot jk}\quad\text{and}\quad p_{\cdot\cdot k} = \sum_i p_{i\cdot k} = \sum_j p_{\cdot jk}; \tag{4.2}$$
and similarly, $n_{ij\cdot} = \sum_k n_{ijk}$, etc.
Let $p'_{ijk}$ denote the probability of an individual being assigned to the $(i,j,k)$-th cell, when there are errors of classification. Let the corresponding summations over $p'_{ijk}$ be denoted by $p'_{ij\cdot}$, $p'_{i\cdot k}$, $p'_{\cdot jk}$ and $p'_{i\cdot\cdot}$, $p'_{\cdot j\cdot}$, $p'_{\cdot\cdot k}$.
For the case of a three-way contingency table with all marginals random, errors of classification can arise in three directions: the $i$-th direction, the $j$-th direction and the $k$-th direction.
Let $\theta_{ii'jj'kk'}$ denote the probability of an individual belonging to the $(i,j,k)$-th cell being wrongly classified into the $(i',j',k')$-th cell, and $\theta_{iijjkk}$ the probability of correctly assigning an individual belonging to the $(i,j,k)$-th cell to the $(i,j,k)$-th cell. Clearly,
$$\sum_{i',j',k'}\theta_{ii'jj'kk'} = 1, \tag{4.4}$$
for all $i=1,\ldots,r$, $j=1,\ldots,c$ and $k=1,\ldots,t$. It can be seen that
$$p'_{ijk} = \sum_{\alpha=1}^{r}\sum_{\beta=1}^{c}\sum_{\tau=1}^{t} p_{\alpha\beta\tau}\,\theta_{\alpha i\beta j\tau k}, \tag{4.5}$$
$(i=1,\ldots,r;\ j=1,\ldots,c;\ k=1,\ldots,t)$; or in matrix notation,
$$\mathbf{p}_1' = \mathbf{p}'\,\theta, \tag{4.6}$$
where
$$\mathbf{p}_1'(1 \times rct) = (p'_{111},\ldots,p'_{11t};\ \ldots;\ p'_{rc1},\ldots,p'_{rct}),$$
$$\mathbf{p}'(1 \times rct) = (p_{111},\ldots,p_{11t};\ \ldots;\ p_{rc1},\ldots,p_{rct})$$
and
$$\theta(rct \times rct) = (\theta_{ii'jj'kk'}).$$
We shall now consider testing the hypothesis of complete independence, which can be expressed as follows:
$$H_0:\ p_{ijk} = p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k}, \tag{4.7}$$
where $\sum_i p_{i\cdot\cdot} = \sum_j p_{\cdot j\cdot} = \sum_k p_{\cdot\cdot k} = 1$.
When there are no errors of classification, the large-sample $\chi^2$-test commonly used (e.g., see [24]) is: Reject $H_0$ if
$$X^2 = \sum_{i,j,k}\frac{\left[n_{ijk} - n_{i\cdot\cdot}\,n_{\cdot j\cdot}\,n_{\cdot\cdot k}/n^2\right]^2}{n_{i\cdot\cdot}\,n_{\cdot j\cdot}\,n_{\cdot\cdot k}/n^2} > C_0, \tag{4.8}$$
and do not reject otherwise, where $n_{i\cdot\cdot}$, $n_{\cdot j\cdot}$, $n_{\cdot\cdot k}$ and $n$ are defined as in (4.1) and (4.2), $C_0$ being so chosen that
$$\Pr\{\chi^2_{rct-r-c-t+2} > C_0\} = \alpha.$$
Let us see how misclassification can be taken into account while testing $H_0$. Again, we shall consider separately the case of independent errors and that of non-independent errors.
If errors are completely independent, then we have that
$$\theta_{ii'jj'kk'} = \rho_{ii'}\,\gamma_{jj'}\,\eta_{kk'}, \tag{4.9}$$
where the $\rho$'s and the $\gamma$'s are as defined in Section 2, $\eta_{kk'}$ denotes the chance of an individual belonging to the $k$-th category of the third attribute being wrongly classified into the $k'$-th category of the same attribute $(k \ne k' = 1,\ldots,t)$, and $\eta_{kk}$ the probability of correctly assigning the individual of the $k$-th category of the third attribute to the $k$-th category of the same attribute $(k=1,\ldots,t)$; and where
$$\sum_{i'}\rho_{ii'} = \sum_{j'}\gamma_{jj'} = \sum_{k'}\eta_{kk'} = \sum_{i',j',k'}\theta_{ii'jj'kk'} = 1,$$
$(i,i'=1,\ldots,r;\ j,j'=1,\ldots,c;\ k,k'=1,\ldots,t)$.
From (4.5), we can then write
$$p'_{ijk} = \sum_{\alpha,\beta,\tau} p_{\alpha\beta\tau}\,\rho_{\alpha i}\,\gamma_{\beta j}\,\eta_{\tau k}, \tag{4.10}$$
$(i=1,\ldots,r;\ j=1,\ldots,c;\ k=1,\ldots,t)$.
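Under $H_0$, we have
$$p'_{ijk} = \sum_{\alpha,\beta,\tau} p_{\alpha\cdot\cdot}\,p_{\cdot\beta\cdot}\,p_{\cdot\cdot\tau}\,\rho_{\alpha i}\,\gamma_{\beta j}\,\eta_{\tau k} = p'_{i\cdot\cdot}\,p'_{\cdot j\cdot}\,p'_{\cdot\cdot k},\ \text{say}, \tag{4.11}$$
where
$$p'_{i\cdot\cdot} = \sum_{\alpha=1}^{r} p_{\alpha\cdot\cdot}\,\rho_{\alpha i},\ (i=1,\ldots,r),\qquad p'_{\cdot j\cdot} = \sum_{\beta=1}^{c} p_{\cdot\beta\cdot}\,\gamma_{\beta j},\ (j=1,\ldots,c),\qquad p'_{\cdot\cdot k} = \sum_{\tau=1}^{t} p_{\cdot\cdot\tau}\,\eta_{\tau k},\ (k=1,\ldots,t),$$
and
$$\sum_i p'_{i\cdot\cdot} = \sum_j p'_{\cdot j\cdot} = \sum_k p'_{\cdot\cdot k} = 1.$$
That is,
$$H_0:\ p_{ijk} = p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k} \iff H_0':\ p'_{ijk} = p'_{i\cdot\cdot}\,p'_{\cdot j\cdot}\,p'_{\cdot\cdot k}. \tag{4.12}$$
Hence, rejection or acceptance of $H_0'$ implies rejection or acceptance of $H_0$. Consequently, the problem of testing $H_0$ reduces to that of testing $H_0'$. A test procedure is to reject $H_0'$ (hence $H_0$) if
$$X'^2 = \sum_{i,j,k}\frac{\left[n_{ijk} - n_{i\cdot\cdot}\,n_{\cdot j\cdot}\,n_{\cdot\cdot k}/n^2\right]^2}{n_{i\cdot\cdot}\,n_{\cdot j\cdot}\,n_{\cdot\cdot k}/n^2} > C_0 \tag{4.13}$$
and not to reject otherwise; $C_0$, $n_{i\cdot\cdot}$, $n_{\cdot j\cdot}$, $n_{\cdot\cdot k}$ and $n$ are defined as in (4.8).
Therefore, we have the result that misclassification (when errors are independent) does not affect the test procedure.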
To study the effect of misclassification on the asymptotic power, we compare
$$\beta = \lim_{n\to\infty}\Pr\{X^2 \ge C_0 \mid H_{1n}\}$$
and
$$\beta' = \lim_{n\to\infty}\Pr\{X'^2 \ge C_0 \mid H_{1n}\},$$
where $X^2$ and $X'^2$ are as defined in (4.8) and (4.13) respectively; $H_{1n}$ denotes the simple alternative
$$p_{ijk} = p^0_{i\cdot\cdot}\,p^0_{\cdot j\cdot}\,p^0_{\cdot\cdot k} + \delta_{ijk}/\sqrt{n}, \tag{4.14}$$
where $\sum_{i,j,k}\delta_{ijk} = 0$, $(i=1,\ldots,r;\ j=1,\ldots,c;\ k=1,\ldots,t)$.
Diamond [7] evaluates the asymptotic power function for the frequency $\chi^2$-test of the hypothesis of complete independence in a three-way contingency table to be
$$\beta = 1 - F(C_0,\ rct-r-c-t+2,\ \Delta), \tag{4.15}$$
where
$$\Delta = \sum_{i,j,k}\delta_{ijk}^2\big/\,p^0_{i\cdot\cdot}\,p^0_{\cdot j\cdot}\,p^0_{\cdot\cdot k},$$
and the $\delta_{ijk}$'s are as defined in (4.14).
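It can be shown, as in Section 2, that
$$\beta' = 1 - F(C_0,\ rct-r-c-t+2,\ \Delta'), \tag{4.16}$$
where
$$\Delta' = \sum_{i,j,k}\delta_{ijk}'^{\,2}\big/\,p'^{\,0}_{i\cdot\cdot}\,p'^{\,0}_{\cdot j\cdot}\,p'^{\,0}_{\cdot\cdot k},$$
and where
$$\delta'_{ijk} = \sum_{\alpha,\beta,\tau}\delta_{\alpha\beta\tau}\,\rho_{\alpha i}\,\gamma_{\beta j}\,\eta_{\tau k},\qquad p'^{\,0}_{i\cdot\cdot} = \sum_\alpha p^0_{\alpha\cdot\cdot}\,\rho_{\alpha i},\qquad p'^{\,0}_{\cdot j\cdot} = \sum_\beta p^0_{\cdot\beta\cdot}\,\gamma_{\beta j}\qquad\text{and}\qquad p'^{\,0}_{\cdot\cdot k} = \sum_\tau p^0_{\cdot\cdot\tau}\,\eta_{\tau k},$$
$(i=1,\ldots,r;\ j=1,\ldots,c;\ k=1,\ldots,t)$.
By applying Cauchy's inequality successively, we can show that
$$\Delta' = \sum_{i,j,k}\delta_{ijk}'^{\,2}\big/\,p'^{\,0}_{i\cdot\cdot}\,p'^{\,0}_{\cdot j\cdot}\,p'^{\,0}_{\cdot\cdot k} \ \le\ \sum_{i,j,k}\delta_{ijk}^2\big/\,p^0_{i\cdot\cdot}\,p^0_{\cdot j\cdot}\,p^0_{\cdot\cdot k} = \Delta.$$
That is, misclassification (when errors are independent) may reduce the asymptotic power of the test.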
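The inequality $\Delta' \le \Delta$ for the three-way case can be illustrated numerically. The sketch below evaluates both non-centrality parameters for a $2 \times 2 \times 2$ table with completely independent errors. The marginal probabilities, the deviations and the three error matrices are invented values for illustration only, and the function names are ours.

```python
from itertools import product


def mix(vector, errors):
    """Reported marginal probabilities: sum_a vector[a] * errors[a][b]."""
    return [sum(vector[a] * errors[a][b] for a in range(len(vector)))
            for b in range(len(errors[0]))]


def noncentrality_3way(delta, p_row, p_col, p_layer):
    """Sum of delta_ijk^2 / (p_i.. * p_.j. * p_..k) over all cells."""
    cells = product(range(len(p_row)), range(len(p_col)), range(len(p_layer)))
    return sum(delta[i][j][k] ** 2 / (p_row[i] * p_col[j] * p_layer[k]) for i, j, k in cells)


def reported_deviations(delta, rho, gamma, eta):
    """delta'_ijk = sum over (a, b, u) of delta_abu * rho_ai * gamma_bj * eta_uk."""
    r, c, t = len(rho), len(gamma), len(eta)
    return [[[sum(delta[a][b][u] * rho[a][i] * gamma[b][j] * eta[u][k]
                  for a, b, u in product(range(r), range(c), range(t)))
              for k in range(t)] for j in range(c)] for i in range(r)]


# Illustrative values only (not taken from the thesis); the deviations sum to zero.
p_row, p_col, p_layer = [0.6, 0.4], [0.5, 0.5], [0.7, 0.3]
delta = [[[0.2, -0.1], [-0.1, 0.0]], [[-0.1, 0.0], [0.0, 0.1]]]
rho = [[0.90, 0.10], [0.20, 0.80]]
gamma = [[0.95, 0.05], [0.10, 0.90]]
eta = [[0.85, 0.15], [0.05, 0.95]]

delta_clean = noncentrality_3way(delta, p_row, p_col, p_layer)
delta_misclassified = noncentrality_3way(
    reported_deviations(delta, rho, gamma, eta),
    mix(p_row, rho), mix(p_col, gamma), mix(p_layer, eta),
)
print(round(delta_clean, 6), round(delta_misclassified, 6))
```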
When errors are not independent and we do not have the relation (4.12), it can be seen that an appropriate test procedure for testing $H_0$ must be derived.
By using the theory of frequency $\chi^2$-tests, our problem reduces to that of finding appropriate BAN estimates for the unknown parameters in the following expressions:
$$p'_{ijk} = \sum_{\alpha,\beta,\tau} p_{\alpha\cdot\cdot}\,p_{\cdot\beta\cdot}\,p_{\cdot\cdot\tau}\,\theta_{\alpha i\beta j\tau k},$$
$(i=1,\ldots,r;\ j=1,\ldots,c;\ k=1,\ldots,t)$; i.e., finding BAN estimates $\hat p'_{ijk}$. The method of minimum $\chi_1^2$ by linearization can be adopted here to obtain $\hat p'_{ijk}$.
Then, the test procedure for testing $H_0: p_{ijk} = p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k}$ is to reject $H_0$ if
$$\sum_{i,j,k}\frac{\left[n_{ijk} - n\,\hat p'_{ijk}\right]^2}{n\,\hat p'_{ijk}} > C_0$$
and not to reject otherwise; $C_0$ is chosen so that
$$\Pr\{\chi^2_{rct-r-c-t+2} > C_0\} = \alpha.$$
5. SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH

5.1. Summary

One of the difficulties often encountered in the analysis of categorical data, either in the area of estimation or that of testing hypotheses, is the possibility of making mistakes in classifying individuals into respective categories. This thesis presents a general framework for dealing with errors of classification. Its main aim is an investigation of the effect of misclassification on testing hypotheses. All the test procedures considered are the large-sample $\chi^2$-tests.
The basis of the development is a definition of error probabilities $\theta_{ii'jj'kk'\ldots}$'s and a relation
$$p'_{ijk\ldots} = \sum_{\alpha,\beta,\tau,\ldots} p_{\alpha\beta\tau\ldots}\,\theta_{\alpha i\beta j\tau k\ldots},$$
where the $p'_{ijk\ldots}$'s refer to the probability of an individual being classified into the $(i,j,k,\ldots)$-th cell when there are errors of classification and $p_{ijk\ldots}$, the probability of an individual belonging to the $(i,j,k,\ldots)$-th cell.
An investigation is made of the effect of misclassification on chi-square tests for the following cases:
a) Two-way contingency tables with both marginals random,
b) Two-way contingency tables with one marginal fixed, and
c) Three-way contingency tables with all marginals random.
Throughout this thesis, $\theta(rc \times rc)$, which has been used to denote the matrix
$$(\theta_{ii'jj'}),\qquad i,i'=1,\ldots,r;\ j,j'=1,\ldots,c,$$
for the case of two-way contingency tables, and
$$\theta(rct \times rct) = (\theta_{ii'jj'kk'}),\qquad i,i'=1,\ldots,r;\ j,j'=1,\ldots,c;\ k,k'=1,\ldots,t,$$
for the three-dimensional case, are assumed to be non-singular.
For testing the hypothesis of independence in an rxc two-way contingency table (with both marginals random), two cases have been considered:
(i) Independent errors (the errors in the i-th direction are independent of the errors in the j-th direction) and
(ii) Non-independent errors.
That is, the investigation is made for the case where errors can arise in both directions. The case of errors occurring only in one direction is shown to be a special case of the case of independent errors.
When errors are independent, i.e.,

$$\theta(rc \times rc) = \rho(r \times r) * \gamma(c \times c),$$

where

$$\rho(r \times r) = (\rho_{ii'}) \quad (i,i'=1,\ldots,r), \qquad \gamma(c \times c) = (\gamma_{jj'}) \quad (j,j'=1,\ldots,c);$$

the $\rho$'s denote the error probabilities for the i-th direction and the $\gamma$'s, those for the j-th direction, and the symbol $*$ denotes the direct product of the two matrices; it is shown that the test procedure for testing the hypothesis of independence is unaffected by misclassification. However, misclassification may reduce the asymptotic power of the test.
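One way to see why the test procedure is unaffected is that a direct-product error matrix maps an independent table to another independent table. The following is a small numerical sketch of that fact; the matrices `rho` and `gamma`, the marginals, and the helper names `kron` and `vec_mat` are illustrative assumptions, not values or notation from the thesis.

```python
def kron(a, b):
    """Direct (Kronecker) product of two square matrices given as nested lists."""
    return [
        [a[i][k] * b[j][l] for k in range(len(a)) for l in range(len(b))]
        for i in range(len(a))
        for j in range(len(b))
    ]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[a] * m[a][b] for a in range(len(v))) for b in range(len(m[0]))]


# Assumed error probabilities: rho for the row direction (r = 2),
# gamma for the column direction (c = 3). Rows sum to one.
rho = [[0.9, 0.1], [0.2, 0.8]]
gamma = [[0.85, 0.10, 0.05], [0.05, 0.90, 0.05], [0.10, 0.10, 0.80]]
theta = kron(rho, gamma)

# A true table satisfying independence: p_ij = row_i * col_j.
row = [0.3, 0.7]
col = [0.2, 0.5, 0.3]
p = [row[i] * col[j] for i in range(2) for j in range(3)]

# Observed cell probabilities, and the product of the misclassified marginals.
p_obs = vec_mat(p, theta)
row_obs = vec_mat(row, rho)
col_obs = vec_mat(col, gamma)
product = [row_obs[i] * col_obs[j] for i in range(2) for j in range(3)]

max_gap = max(abs(x - y) for x, y in zip(p_obs, product))
print(max_gap)
```

The gap is zero up to rounding, so the observed table still satisfies independence, with marginals `row_obs` and `col_obs` in place of the true ones.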
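To demonstrate the effect of misclassification on the asymptotic power for the case of independent errors, the following type of misclassification is considered in the computations:

I.
$$\rho_{ii'} = \begin{cases} \theta, & \text{if } i \neq i' \\ 1-(r-1)\theta, & \text{otherwise} \end{cases}
\qquad \text{and} \qquad
\gamma_{jj'} = \begin{cases} \gamma\theta, & \text{if } j \neq j' \\ 1-\gamma(c-1)\theta, & \text{otherwise} \end{cases}$$

where $0 < \theta < \min(1/r,\ 1/\gamma c)$ and errors are independent.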
Tables 2.4.1 and 2.4.2 illustrate how the asymptotic power is reduced as $\theta$ increases. These numerical calculations show that misclassification becomes more serious as the significance probability decreases. The ratio $\Delta'/\Delta$, which is the ratio of the sample size for data free of errors to the sample size for data subject to misclassification needed to produce the same asymptotic power, is also given for each case.
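The size of the ratio can be sketched for a 2x2 table under misclassification of type I. This sketch assumes that the non-centrality parameter under misclassification is obtained by applying the formula of Appendix 7.2 to the transformed probabilities and deviations; the table, the deviations, the value of the error parameter, and the names `type_one`, `transform`, `noncentrality` and `mult` (the scalar multiplier in the column direction) are illustrative assumptions.

```python
def type_one(size, off):
    """Error matrix with `off` off the diagonal and 1 - (size - 1) * off on it."""
    return [[off if i != j else 1 - (size - 1) * off for j in range(size)]
            for i in range(size)]


def transform(table, rho, gamma):
    """Apply independent errors: t'[k][l] = sum_ij rho[i][k] * t[i][j] * gamma[j][l]."""
    r, c = len(rho), len(gamma)
    return [[sum(rho[i][k] * table[i][j] * gamma[j][l]
                 for i in range(r) for j in range(c))
             for l in range(c)] for k in range(r)]


def noncentrality(d, p):
    """Sum d_ij^2/p_ij minus the row and column marginal terms (Appendix 7.2)."""
    r, c = len(p), len(p[0])
    p_row = [sum(p[i]) for i in range(r)]
    p_col = [sum(p[i][j] for i in range(r)) for j in range(c)]
    d_row = [sum(d[i]) for i in range(r)]
    d_col = [sum(d[i][j] for i in range(r)) for j in range(c)]
    total = sum(d[i][j] ** 2 / p[i][j] for i in range(r) for j in range(c))
    total -= sum(d_row[i] ** 2 / p_row[i] for i in range(r))
    total -= sum(d_col[j] ** 2 / p_col[j] for j in range(c))
    return total


theta_val, mult = 0.1, 1.0            # assumed error parameter and multiplier
rho = type_one(2, theta_val)
gamma = type_one(2, mult * theta_val)

p = [[0.25, 0.25], [0.25, 0.25]]      # independent table with equal marginals
d = [[0.1, -0.1], [-0.1, 0.1]]        # deviations summing to zero

delta = noncentrality(d, p)
delta_obs = noncentrality(transform(d, rho, gamma), transform(p, rho, gamma))
ratio = delta_obs / delta
print(ratio)
```

For this symmetric example each direction shrinks the deviations by the factor 1 - 2(0.1) = 0.8, so the ratio equals 0.8 to the fourth power: roughly 41 error-free observations carry the information of 100 misclassified ones.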
For non-independent errors, two cases have been examined:

1) $\theta(rc \times rc)$ is known.
2) $\theta(rc \times rc)$ is unknown.

It is shown that, in general, the test procedure is affected by misclassification, i.e., an appropriate test procedure for testing the hypothesis must be obtained. In the process of deriving the test procedure using Cramér's theory of frequency $\chi^2$-tests, we are confronted with the problem that the likelihood equations are difficult to solve. To overcome this difficulty, a BAN method of estimation known as the method of minimum $\chi_1^2$ by linearization is proposed to estimate the unknown parameters. This method permits a reduction of the problem to the solution of a system of linear equations and hence is more convenient.

The asymptotic power function of the "modified" test procedure has been found; and it is shown that misclassification tends to reduce the asymptotic power.
As an illustration of the application of the method of minimum $\chi_1^2$ by linearization, the following type of misclassification is considered:

II.
$$\theta_{ii'jj'} = \begin{cases} \theta, & \text{if } i \neq i' \text{ or } j \neq j' \\ 1-(rc-1)\theta, & \text{otherwise} \end{cases}$$

where $0 < \theta < 1/rc$.
The deduction of the side conditions, which is an essential part in applying the method of minimum $\chi_1^2$ by linearization, is shown for the above type of misclassification. The method of constructing an appropriate test, using as the method of estimation that of the minimum $\chi_1^2$ by linearization, has been applied to a set of data in a 2x2 contingency table. The values of the test statistics have been computed for some selected values of $\theta$, and are shown in Table 2.6.2.
If misclassification is ignored and the usual test procedure is used, it is shown that the actual size of the test will increase. The effect of neglecting misclassification for case II is shown in Table 2.8.1.
When $\theta(rc \times rc)$ is unknown, in general, the number of independent parameters to be estimated exceeds that of the cells. Only cases having $u$ unknown independent elements in the matrix $\theta(rc \times rc)$, where $u < (r-1)(c-1)$, can be considered.
A suggested practical application of the results obtained to some fields of research, e.g., sample survey data, medical research studies, etc., is given. A test procedure for testing the independence of errors has been derived. As an illustration of the derived procedure, a numerical example using sample survey data has been calculated.
For testing the hypothesis that "the populations are identical" in a txc two-way contingency table (with one marginal fixed), the general case of errors arising in both directions is considered. Again, the two cases of independent and non-independent errors are considered. For the case of one marginal fixed, distinct definitions of the "conditional" probabilities and the "unconditional" probabilities are helpful in the development. Results similar to those for the previous case of both marginals random have been obtained. Some numerical results are also computed.
For the case of a three-way contingency table with all marginals random, given as an illustration of the straightforward generalization from the two to more than two-dimensional case, only the hypothesis of complete independence is considered. It is shown that when errors are completely independent, misclassification does not affect the test procedure but it does reduce the asymptotic power.
5.2. Suggestions for Future Research

There are many possibilities for future research in this new area of errors of classification. This thesis has dealt mainly with the investigation of the effect of misclassification on one of the two major areas of statistical inference, namely, the area of testing hypotheses. It has been established (see Bross [4]) that misclassification does affect the estimates, and may even present a more serious problem in the case of estimates than in the case of testing. Work needs to be done to investigate the effects of misclassification on the problem of estimation.
It is hoped that the basis of the development laid down in this thesis will be a help for further studies. For example, from our basic relationship

$$p' = p\,\theta,$$

it is clear that the traditionally used unbiased estimate $q$ of $p$ will no longer be unbiased, unless the misclassification matrix $\theta$ is an identity matrix. If $\theta$ is non-singular, then we have the relation

$$p = p'\,\theta^{-1},$$

where the matrix $\theta^{-1}$ is the inverse of $\theta$. It is clear that an unbiased estimate of $p$ is

$$q\,\theta^{-1}.$$
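The correction can be sketched in a few lines. Since the observed proportions have expectation equal to the true probabilities multiplied by the error matrix, applying the inverse of a known error matrix to that expectation returns the true probabilities. The numbers and the helper `solve` below are illustrative assumptions; the linear system is solved directly instead of forming the inverse.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [x - factor * y for x, y in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Assumed known error matrix of type II for a 2x2 table (4 cells).
err = 0.02
theta = [[err if a != b else 1 - 3 * err for b in range(4)] for a in range(4)]

p = [0.40, 0.10, 0.20, 0.30]  # true cell probabilities (illustrative)

# Expected value of the observed proportions q: E(q) = p theta.
q_mean = [sum(p[a] * theta[a][b] for a in range(4)) for b in range(4)]

# Recover p from x theta = E(q), i.e. theta' x' = E(q)'.
theta_t = [[theta[a][b] for a in range(4)] for b in range(4)]
p_back = solve(theta_t, q_mean)
print(p_back)
```

In practice the same step would be applied to the observed proportions themselves; the sketch uses their expectation only to show that the correction is exact on average.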
If the matrix $\theta$ is known, our problem of finding an unbiased estimate of the vector of the underlying probabilities, $p$, is solved. However, the matrix $\theta$ is usually unknown and, hence, an estimate of $\theta$ must be obtained. So there has to be developed a "good" procedure for estimating the misclassification matrix $\theta$. A study on the properties of the estimate $q\,\hat{\theta}^{-1}$, where $\hat{\theta}$ is an estimate of $\theta$, is also needed.
6. LIST OF REFERENCES

1. Barnard, G. A. [Year, title, and pages illegible in the source.] Biometrika.

2. Bhapkar, V. P. 1961. Some tests for categorical data. Annals of Math. Stat. 32:72-83.

3. Bhapkar, V. P. 1965. A note on the equivalence of some test criteria. N. C. Inst. of Statistics, Mimeograph Series No. 421, Raleigh, N. C.

4. Bross, I. 1954. Misclassification in 2x2 tables. Biometrics 10:478-486.

5. Cochran, W. G. 1952. The χ² test of goodness of fit. Annals of Math. Stat. 23:315-345.

6. Cramér, H. 1946. Mathematical Methods of Statistics. Princeton Univ. Press, Princeton.

7. Diamond, E. L. 1963. The limiting power of categorical data chi-square tests analogous to normal analysis of variance. Annals of Math. Stat. 34:1432-1441.

8. Diamond, E. L. and Lilienfeld, A. M. 1962a. Effects of errors in classification and diagnosis in various types of epidemiological studies. Amer. J. of Public Health 52:1137-1144.

9. Diamond, E. L. and Lilienfeld, A. M. 1962b. Misclassification errors in 2x2 tables with one margin fixed: some further comments. Amer. J. of Public Health 52:2106-2110.

10. Fisher, R. A. 1922. On the interpretation of chi-square from contingency tables and the calculation of P. J. of the Royal Stat. Soc. 85:87-94.

11. Fix, E. 1949. Tables of non-central χ². Univ. of California Publications in Statistics 1(2):15-19.

12. Hardy, G. H., Littlewood, J. E., and Polya, G. 1934. Inequalities. Cambridge Univ. Press, Cambridge.

13. Mitra, S. K. 1955. Contributions to the statistical analysis of categorical data. N. C. Inst. of Statistics, Mimeograph Series No. 142, Raleigh, N. C.

14. Mitra, S. K. 1958. On the limiting power function of the frequency chi-square test. Annals of Math. Stat. 29:1221-1233.

15. Mote, V. L. 1957. An investigation of the effect of misclassification on the χ²-tests in the analysis of categorical data. N. C. Inst. of Statistics, Mimeograph Series No. 182, Raleigh, N. C.

16. Newell, D. J. 1962. Errors in the interpretation of errors in epidemiology. Amer. J. of Public Health 52:1925-1928.

17. Neyman, J. 1949. Contribution to the theory of the χ²-test. Proc. of the Berkeley Symposium on Math. Stat. and Probability, Univ. of California Press, 239-273.

18. Ogawa, J. 1957. On the mathematical principles underlying the theory of χ²-test. N. C. Inst. of Statistics, Mimeograph Series No. 162, Raleigh, N. C.

19. Patnaik, P. B. 1949. The non-central χ² and F distributions and their applications. Biometrika 36:202-232.

20. Pearson, E. S. 1947. The choice of statistical tests illustrated on the interpretation of data classed in a 2x2 table. Biometrika 34:139-167.

21. Pearson, K. 1900. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, Series 5, 50:157-172.

22. Pitman, E. J. G. 1948. Notes on non-parametric statistical inference. N. C. Inst. of Statistics, Mimeograph Series No. 27, Raleigh, N. C.

23. Roy, S. N. 1957. Some Aspects of Multivariate Analysis. John Wiley and Sons, Inc., New York.

24. Roy, S. N. and Kastenbaum, M. A. 1955. A generalization of analysis of variance and multivariate analysis to data based on frequencies in qualitative categories or class intervals. N. C. Inst. of Statistics, Mimeograph Series No. 131, Raleigh, N. C.

25. Roy, S. N. and Mitra, S. K. 1955. An introduction to some non-parametric generalizations of analysis of variance and multivariate analysis. N. C. Inst. of Statistics, Mimeograph Series No. 139, Raleigh, N. C.

26. Roy, S. N. and Sarhan, A. E. [Entry illegible in the source.]

27. Rubin, T., Rosenbaum, J., and Cobb, S. 1956. The use of interview data for the detection of association in field studies. J. of Chronic Diseases 4:253-266.

28. U. S. Bureau of the Census. 1964. Evaluation and Research Program of the U. S. Censuses of Population and Housing, 1960: Accuracy of Data on Housing Characteristics. Series ER 60, No. 3. Washington, D. C.

29. Wald, A. 1943. Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. of Amer. Math. Soc. 54:426-482.

30. Yule, G. U. 1911. An Introduction to the Theory of Statistics. Charles Griffin and Co., Ltd., London.
7. APPENDICES

7.1. Cramér's Theorem

Suppose that we are given $r$ functions $p_1(\alpha_1,\ldots,\alpha_s)$, $p_2(\alpha_1,\ldots,\alpha_s)$, ..., $p_r(\alpha_1,\ldots,\alpha_s)$ of $s < r$ variables $\alpha_1,\ldots,\alpha_s$, such that, for all points of a non-degenerate open interval $A$ in the $s$-dimensional space of the $\alpha_j$, the $p_i$ satisfy the following conditions:

(a) $\sum_{i=1}^{r} p_i(\alpha_1,\ldots,\alpha_s) = 1$.   (7.1)
(b) $p_i(\alpha_1,\ldots,\alpha_s) > c^2 > 0$ for all $i$.
(c) Every $p_i$ has continuous derivatives $\partial p_i/\partial\alpha_j$ and $\partial^2 p_i/\partial\alpha_j\,\partial\alpha_k$.
(d) The matrix $(\partial p_i/\partial\alpha_j)$, where $i=1,\ldots,r$ and $j=1,\ldots,s$, is of rank $s$.

Let the possible results of a certain random experiment $E$ be divided into $r$ mutually exclusive groups and suppose that the probability of obtaining a result belonging to the i-th group is $p_i^0 = p_i(\alpha^0)$, where $\alpha^0 = (\alpha_1^0,\ldots,\alpha_s^0)$ is an inner point of the interval $A$. Let $n_i$ denote the number of results belonging to the i-th group, which occur in a sequence of $n$ repetitions of $E$, so that

$$\sum_{i=1}^{r} n_i = n.$$

The equations

$$\sum_{i=1}^{r} \frac{n_i - n p_i}{p_i}\,\frac{\partial p_i}{\partial\alpha_j} = 0, \qquad j=1,\ldots,s; \qquad (7.2)$$

of the modified $\chi^2$-minimum method¹⁵ then have exactly one solution $\hat{\alpha} = (\hat{\alpha}_1,\ldots,\hat{\alpha}_s)$ such that $\hat{\alpha}$ converges in probability to $\alpha^0$ as $n \to \infty$. The value of $\chi^2$ obtained by inserting these values of $\hat{\alpha}$ into

$$\chi^2 = \sum_{i=1}^{r} \frac{[\,n_i - n p_i(\alpha)\,]^2}{n p_i(\alpha)},$$

namely

$$\chi_0^2 = \sum_{i=1}^{r} \frac{[\,n_i - n p_i(\hat{\alpha})\,]^2}{n p_i(\hat{\alpha})}, \qquad (7.4)$$

is in the limit as $n \to \infty$ distributed in a $\chi^2$ distribution with $r-s-1$ degrees of freedom.
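As a concrete instance of the theorem, the error-free test of independence can be sketched as follows. Here the groups are the cells of the table, the parameters are the free row and column marginal probabilities, and the estimates are the observed marginal proportions. The counts and the function name `independence_chi_square` are illustrative assumptions.

```python
def independence_chi_square(counts):
    """Chi-square statistic for independence with estimated marginals.

    Returns the statistic and the degrees of freedom (groups - s - 1),
    where groups = r*c cells and s = (r - 1) + (c - 1) free parameters.
    """
    r, c = len(counts), len(counts[0])
    n = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(counts[i][j] for i in range(r)) for j in range(c)]
    stat = 0.0
    for i in range(r):
        for j in range(c):
            # n * p_i.(hat) * p_.j(hat) with p(hat) the marginal proportions.
            expected = row_tot[i] * col_tot[j] / n
            stat += (counts[i][j] - expected) ** 2 / expected
    groups, s = r * c, (r - 1) + (c - 1)
    return stat, groups - s - 1


stat, df = independence_chi_square([[30, 20], [20, 30]])
print(stat, df)
```

For an rxc table the degrees of freedom reduce to (r-1)(c-1), which is the value that appears in the asymptotic power function of Appendix 7.2.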
15 For minimizing

$$\chi^2 = \sum_{i=1}^{r} \frac{[\,n_i - n p_i(\alpha_1,\ldots,\alpha_s)\,]^2}{n p_i(\alpha_1,\ldots,\alpha_s)}$$

with respect to $\alpha_1,\ldots,\alpha_s$ we have the equations

$$-\frac{1}{2}\,\frac{\partial\chi^2}{\partial\alpha_j} = \sum_{i=1}^{r}\left[\frac{n_i - n p_i}{p_i} + \frac{(n_i - n p_i)^2}{2 n p_i^2}\right]\frac{\partial p_i}{\partial\alpha_j} = 0, \qquad j=1,\ldots,s. \qquad (7.3)$$

As Cramér has indicated, for large $n$ the influence of the second term within the bracket in (7.3) becomes negligible. If we neglect the second term completely we get back to (7.2), which Cramér calls the equations for the modified $\chi^2$ minimum method. In this case, however, these equations (7.2) agree with those obtained by the method of maximum likelihood.

7.2. Asymptotic Power Function: Test of Independence in a rxc Contingency Table

To test the hypothesis of independence in a rxc contingency table, or equivalently to test

$$H_0:\ p_{ij} = p_{i\cdot}\,p_{\cdot j},$$

where $\sum_i p_{i\cdot} = \sum_j p_{\cdot j} = 1$ ($i=1,\ldots,r$; $j=1,\ldots,c$), the usual test procedure, when there are no errors of classification, is defined as in (2.3).
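Consider the simple alternative

$$K_n:\ p_{ij} = p^0_{i\cdot}\,p^0_{\cdot j} + d_{ij}/\sqrt{n},$$

where $\{d_{ij}\}$ is a set of deviation parameters, not all zero, such that $\sum_{i,j} d_{ij} = 0$, and $\{p^0_{i\cdot},\ p^0_{\cdot j}\}$ is supposed to be the true unknown parameter point in $A$ under $H_0$. The asymptotic power function of the test is defined by

$$\beta = \lim_{n\to\infty} \Pr\{\Gamma \ge C \mid K_n\},$$

where $\Gamma$ and $C$ are as defined in (2.3).

From the results on the asymptotic power function of the frequency $\chi^2$-tests, stated in Section 1.1, we have that

$$\beta = 1 - F(C,\ (r-1)(c-1),\ \Delta),$$

where

$$\Delta = \delta'\,[\,I - B(B'B)^{-1}B'\,]\,\delta,$$

$$\delta'(1 \times rc) = \left(d_{11}/\sqrt{p^0_{11}},\ \ldots,\ d_{rc}/\sqrt{p^0_{rc}}\right), \qquad p^0_{ij} = p^0_{i\cdot}\,p^0_{\cdot j},$$

$$B(rc \times r+c-2) = \left\{\frac{1}{\sqrt{p^0_{ij}}}\left(\frac{\partial p_{ij}}{\partial\alpha_k}\right)_0\right\},$$

and

$$\alpha_k = \begin{cases} p_{k\cdot}, & \text{for } k=1,\ldots,r-1 \\ p_{\cdot k'}, & \text{for } k=r,r+1,\ldots,r+c-2 \text{ and } k'=k-r+1. \end{cases}$$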
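Note that

$$p_{r\cdot} = 1 - \sum_{i=1}^{r-1} p_{i\cdot} \qquad \text{and} \qquad p_{\cdot c} = 1 - \sum_{j=1}^{c-1} p_{\cdot j}.$$

Hence

$$\frac{\partial p_{ij}}{\partial p_{k\cdot}} = \begin{cases} 0, & \text{if } i \neq k, r \\ p_{\cdot j}, & \text{if } i = k \\ -p_{\cdot j}, & \text{if } i = r \end{cases}$$

and

$$\frac{\partial p_{ij}}{\partial p_{\cdot k'}} = \begin{cases} 0, & \text{if } j \neq k', c \\ p_{i\cdot}, & \text{if } j = k' \\ -p_{i\cdot}, & \text{if } j = c. \end{cases}$$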
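Now, we can write

$$B(rc \times r+c-2) = D\,W,$$

where

$$D(rc \times rc) = \operatorname{diag}\left(1/\sqrt{p^0_{11}},\ 1/\sqrt{p^0_{12}},\ \ldots,\ 1/\sqrt{p^0_{rc}}\right)$$

and

$$W(rc \times r+c-2) = \begin{bmatrix} w & 0 & \cdots & 0 & V_1 \\ 0 & w & \cdots & 0 & V_2 \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & w & V_{r-1} \\ -w & -w & \cdots & -w & V_r \end{bmatrix},$$

where

$$w(c \times 1) = (p^0_{\cdot 1},\ \ldots,\ p^0_{\cdot c})'$$

and

$$V_i(c \times c-1) = \begin{bmatrix} p^0_{i\cdot} & 0 & \cdots & 0 \\ 0 & p^0_{i\cdot} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & p^0_{i\cdot} \\ -p^0_{i\cdot} & -p^0_{i\cdot} & \cdots & -p^0_{i\cdot} \end{bmatrix}, \qquad (i=1,\ldots,r).$$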
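By some algebraic manipulation, it can be seen that

$$B'B(r+c-2 \times r+c-2) = \begin{bmatrix} A & 0 \\ 0 & K \end{bmatrix},$$

where

$$A(r-1 \times r-1) = (a_{\alpha\beta}), \qquad a_{\alpha\beta} = \begin{cases} \dfrac{1}{p^0_{r\cdot}}, & \text{if } \alpha \neq \beta \\[2mm] \dfrac{1}{p^0_{\alpha\cdot}} + \dfrac{1}{p^0_{r\cdot}}, & \text{if } \alpha = \beta = 1,\ldots,r-1 \end{cases}$$

and

$$K(c-1 \times c-1) = (k_{\tau\sigma}), \qquad k_{\tau\sigma} = \begin{cases} \dfrac{1}{p^0_{\cdot c}}, & \text{if } \tau \neq \sigma \\[2mm] \dfrac{1}{p^0_{\cdot\tau}} + \dfrac{1}{p^0_{\cdot c}}, & \text{if } \tau = \sigma = 1,\ldots,c-1. \end{cases}$$

It follows that

$$(B'B)^{-1} = \begin{bmatrix} A^{-1} & 0 \\ 0 & K^{-1} \end{bmatrix}.$$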
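It can be seen that $A(r-1 \times r-1)$ and $K(c-1 \times c-1)$ can be written as follows:

$$A(r-1 \times r-1) = D\!\left(\frac{1}{p^0_{i\cdot}}\right) + \left(\frac{1}{p^0_{r\cdot}}\right) \mathbf{1}\,\mathbf{1}'$$

and

$$K(c-1 \times c-1) = D\!\left(\frac{1}{p^0_{\cdot j}}\right) + \left(\frac{1}{p^0_{\cdot c}}\right) \mathbf{1}\,\mathbf{1}',$$

where $D(a_i)(k \times k)$ stands for a diagonal matrix whose diagonal elements are $(a_1,\ldots,a_k)$ and $\mathbf{1}$ stands for a column vector whose elements are all 1.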
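From a theorem in Roy and Sarhan [26], it follows that

$$A^{-1} = (a^{\alpha\beta}),$$

where

$$a^{\alpha\beta} = \begin{cases} -p^0_{\alpha\cdot}\,p^0_{\beta\cdot}, & \text{if } \alpha \neq \beta \\ p^0_{\alpha\cdot}\,(1-p^0_{\alpha\cdot}), & \text{if } \alpha = \beta = 1,\ldots,r-1 \end{cases}$$

and

$$K^{-1} = (k^{\tau\sigma}),$$

where

$$k^{\tau\sigma} = \begin{cases} -p^0_{\cdot\tau}\,p^0_{\cdot\sigma}, & \text{if } \tau \neq \sigma \\ p^0_{\cdot\tau}\,(1-p^0_{\cdot\tau}), & \text{if } \tau = \sigma = 1,\ldots,c-1. \end{cases}$$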
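Noting that $\sum_{i,j} d_{ij} = 0$ and using $\sum_{i=1}^{r} p^0_{i\cdot} = \sum_{j=1}^{c} p^0_{\cdot j} = 1$, it is observed that

$$\delta' B (B'B)^{-1} = (d_{1\cdot},\ d_{2\cdot},\ \ldots,\ d_{r-1\,\cdot},\ d_{\cdot 1},\ \ldots,\ d_{\cdot\,c-1}),$$

where

$$\sum_{j=1}^{c} d_{ij} = d_{i\cdot} \qquad \text{and} \qquad \sum_{i=1}^{r} d_{ij} = d_{\cdot j}$$

for $i=1,\ldots,r$ and $j=1,\ldots,c$. Also

$$B'\delta\,(r+c-2 \times 1) = \begin{bmatrix} (d_{1\cdot}/p^0_{1\cdot}) - (d_{r\cdot}/p^0_{r\cdot}) \\ \vdots \\ (d_{r-1\,\cdot}/p^0_{r-1\,\cdot}) - (d_{r\cdot}/p^0_{r\cdot}) \\ (d_{\cdot 1}/p^0_{\cdot 1}) - (d_{\cdot c}/p^0_{\cdot c}) \\ \vdots \\ (d_{\cdot\,c-1}/p^0_{\cdot\,c-1}) - (d_{\cdot c}/p^0_{\cdot c}) \end{bmatrix}.$$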
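Now, it can be easily seen that

$$\delta' B (B'B)^{-1} B' \delta = \sum_{i=1}^{r-1} d_{i\cdot}\left[\frac{d_{i\cdot}}{p^0_{i\cdot}} - \frac{d_{r\cdot}}{p^0_{r\cdot}}\right] + \sum_{j=1}^{c-1} d_{\cdot j}\left[\frac{d_{\cdot j}}{p^0_{\cdot j}} - \frac{d_{\cdot c}}{p^0_{\cdot c}}\right]$$

$$= \sum_{i=1}^{r-1} \frac{d_{i\cdot}^2}{p^0_{i\cdot}} - \frac{d_{r\cdot}}{p^0_{r\cdot}} \sum_{i=1}^{r-1} d_{i\cdot} + \sum_{j=1}^{c-1} \frac{d_{\cdot j}^2}{p^0_{\cdot j}} - \frac{d_{\cdot c}}{p^0_{\cdot c}} \sum_{j=1}^{c-1} d_{\cdot j}$$

$$= \sum_{i=1}^{r} \frac{d_{i\cdot}^2}{p^0_{i\cdot}} + \sum_{j=1}^{c} \frac{d_{\cdot j}^2}{p^0_{\cdot j}}.$$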
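The final step is to evaluate

$$\Delta = \delta'\delta - \delta' B (B'B)^{-1} B' \delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{ij}} - \left[\sum_{i=1}^{r} \frac{d_{i\cdot}^2}{p^0_{i\cdot}} + \sum_{j=1}^{c} \frac{d_{\cdot j}^2}{p^0_{\cdot j}}\right].$$

Now, it is easy to check that the above can be written as

$$\Delta = \sum_{i,j} \frac{(d_{ij} - p^0_{i\cdot}\,d_{\cdot j} - p^0_{\cdot j}\,d_{i\cdot})^2}{p^0_{ij}}.$$

If $\{d_{ij}\}$ is such that

$$\sum_{j} d_{ij} = \sum_{i} d_{ij} = 0,$$

then

$$\Delta = \sum_{i,j} \frac{d_{ij}^2}{p^0_{ij}}.$$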
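The agreement between the two expressions for the non-centrality parameter can be checked numerically. The sketch below uses made-up marginals and deviations (summing to zero over the table, with non-zero marginal deviations); the function names `delta_difference_form` and `delta_square_form` are hypothetical.

```python
def margins(d):
    """Row sums and column sums of a table given as nested lists."""
    rows = [sum(r) for r in d]
    cols = [sum(d[i][j] for i in range(len(d))) for j in range(len(d[0]))]
    return rows, cols


def delta_difference_form(d, row, col):
    """Sum d_ij^2 / p_ij minus the two marginal sums, with p_ij = row_i * col_j."""
    d_row, d_col = margins(d)
    total = sum(d[i][j] ** 2 / (row[i] * col[j])
                for i in range(len(row)) for j in range(len(col)))
    total -= sum(d_row[i] ** 2 / row[i] for i in range(len(row)))
    total -= sum(d_col[j] ** 2 / col[j] for j in range(len(col)))
    return total


def delta_square_form(d, row, col):
    """Sum of (d_ij - row_i * d_.j - col_j * d_i.)^2 / p_ij."""
    d_row, d_col = margins(d)
    return sum((d[i][j] - row[i] * d_col[j] - col[j] * d_row[i]) ** 2
               / (row[i] * col[j])
               for i in range(len(row)) for j in range(len(col)))


row = [0.4, 0.6]                 # assumed row marginals under the hypothesis
col = [0.2, 0.3, 0.5]            # assumed column marginals under the hypothesis
d = [[0.05, -0.02, 0.01],        # deviations; the six entries sum to zero
     [-0.03, 0.02, -0.03]]

first = delta_difference_form(d, row, col)
second = delta_square_form(d, row, col)
print(first, second)
```

The two values agree up to rounding. The agreement relies on the deviations summing to zero over the whole table and on each set of marginal probabilities summing to one.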
7.3. Some Results in Matrix Theory¹⁶

(7.3.1) A(mxn)A'(nxm) will be symmetric and at least positive semi-definite of the same rank as A or A', the common rank being ≤ min(m,n), where the symbol min(m,n) denotes the lesser of m and n. It is easy to see that if m < n and A is of rank m, then AA' is positive definite.

(7.3.2) If B(qxq) is symmetric positive definite, C(pxq)B(qxq)C'(qxp) is symmetric and at least positive semi-definite of the same rank as C.

(7.3.3) If B(pxp) is symmetric and at least positive semi-definite of rank r < p and C(pxp) is non-singular, CBC' is symmetric and at least positive semi-definite of rank r.
(7.3.4) If M(pxp) is symmetric and positive definite, then there exists a non-singular T(pxp) such that M = TT', where T stands for the triangular matrix.
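Result (7.3.4) is the factorization commonly computed by the Cholesky algorithm. The following is a minimal sketch for a small matrix; the function name `cholesky` and the example matrix are illustrative assumptions, and the code presumes its input is symmetric and positive definite.

```python
import math


def cholesky(m):
    """Return a lower triangular t with m = t t' for symmetric positive definite m."""
    n = len(m)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(t[i][k] * t[j][k] for k in range(j))
            if i == j:
                t[i][j] = math.sqrt(m[i][i] - partial)
            else:
                t[i][j] = (m[i][j] - partial) / t[j][j]
    return t


m = [[4.0, 2.0], [2.0, 3.0]]
t = cholesky(m)

# Rebuild m from t t' to confirm the factorization.
rebuilt = [[sum(t[i][k] * t[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(t, rebuilt)
```

The diagonal of `t` is strictly positive, so `t` is non-singular, as the result requires.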
16 Proof for each of the above theorems can be found in Roy [23].