Mote, V. L. (1957). "An investigation of the effect of misclassification on the χ² tests in the analysis of categorical data."

An Investigation of the Effect of Misclassification on
the χ² Tests in the Analysis of Categorical Data

by

Vasant Lakshman Mote

Institute of Statistics
Mimeo Series No. 182
May, 1957
ACKNOWLEDGMENTS
I would like to thank R. L. Anderson for suggesting this problem to me and for his guidance in the preparation of this thesis. I consider myself honored by having had the privilege of working under his direction. I would also like to take this opportunity to thank the other members of my committee, E. J. Williams, S. N. Roy, H. V. Park, and Oleon Harrell, for their guidance and suggestions.

I have only human words to express my gratitude to my parents, without whose constant encouragement and financial aid this work would have been impossible.

Last, but not least, I would like to thank my colleague, M. M. Goff, for his assistance during the preparation of the manuscript, and Mrs. Covington for the careful typing of it.
TABLE OF CONTENTS

List of Tables

Notation

CHAPTER

I    INTRODUCTION AND REVIEW OF LITERATURE

II   THE GOODNESS OF FIT TEST

III  STRATIFIED AND RANDOM SAMPLING SITUATIONS

IV   SUMMARY AND CONCLUSIONS

LITERATURE CITED
LIST OF TABLES

Table I.   Effect of misclassification on limiting power when the null hypothesis Ho is: p_j = 1/r; α = 0.05, r = 5.

Table II.  Effect of misclassification on limiting power when the null hypothesis Ho is: p_j = 1/r; α = 0.01, r = 5.

Table III. Effect of misclassification of the type θ_jk = θ if |j−k| = 1; θ_jj = 1−2θ if 1 < j < r; θ_jj = 1−θ if j = 1 or j = r; θ_jk = 0 otherwise; on limiting power when Ho is: p_j = 1/r.
NOTATION

Pr{χ²_(f) ≥ C}, C ≥ 0, is used to denote the probability that a chi-square variable with f degrees of freedom takes a value greater than or equal to C.

θ(r × r) is used to denote a matrix of dimensions r × r.

A matrix B whose typical element in the jth row and the kth column is b_jk is sometimes denoted as (b_jk).
CHAPTER I

INTRODUCTION AND REVIEW OF LITERATURE

The theory of chi-square tests in the analysis of categorical data has been extensively developed. The origin of this subject can be traced back to the pioneering paper of Karl Pearson [21]. Since then it has been modified in important directions at subsequent stages by many investigators including, among others, Yule [28], Fisher [8], Barnard [2], Cochran [4,6], E. S. Pearson [20], Cramer [7], Neyman [17], Roy [24], and Mitra [16]. Before proceeding to give an introduction to our problem, we would like to give a brief outline of the main features in the development of chi-square tests.
In the standard application of the test the N observations in a random sample from a population are classified into r mutually exclusive classes. There is some theory or null hypothesis which gives the probability

    p_j    [p_j > 0,  Σ_{j=1}^{r} p_j = 1]

that an observation falls into the jth class (j = 1,2,···,r). Sometimes the p_j are completely specified by the theory as known numbers and sometimes they are less completely specified as known functions of one or more parameters (θ_1, θ_2,···,θ_m), whose actual values are unknown.
The starting point of the theory is the joint frequency distribution of the observed numbers {n_j} falling in the respective classes. If the theory is correct, the observed frequencies follow a multinomial distribution with the {p_j} as probabilities. The joint distribution of the n_j is therefore specified by the probabilities

    [N! / (n_1! n_2! ··· n_r!)]  ∏_{j=1}^{r} p_j^{n_j} .
As a test criterion for the null hypothesis that the theory is correct, Karl Pearson [21] proposed the quantity

    χ² = Σ_{j=1}^{r} (n_j − N p_j)² / (N p_j) .

Pearson did not mention any alternative hypothesis. The test has usually been regarded as applicable to situations in which the alternative hypothesis is described only in rather vague and general terms. A rigorous proof of the theorem that the limiting distribution of χ² is a chi-square distribution with r−1 degrees of freedom has been given by Cramer [7].
When the probabilities p_j are specified as known functions of m (< r) unknown parameters, we cannot use the quantity

    χ² = Σ_{j=1}^{r} [n_j − N p_j(θ_1,···,θ_m)]² / [N p_j(θ_1,···,θ_m)]

as a test criterion for the null hypothesis, since the values of the unknown constants {θ_k} must be estimated from the sample. Now, if the unknown {θ_k} are replaced by estimates calculated from the sample values, the {p_j} no longer will be constants but functions of observations; in this case the limiting distribution of χ² is not a chi-square with r−1 degrees of freedom. The problem of finding the distribution of χ² under these circumstances was first considered by R. A. Fisher [9], who showed that it is necessary to modify the limiting distribution for χ². For an important class of methods of estimation, the modification indicated by Fisher is of a very simple kind, namely, to reduce the number of degrees of freedom of the limiting distribution by one unit for each parameter estimated from the sample.
Choosing the modified minimum chi-square as the method of estimation,
Cramer has given a rigorous proof of the above theorem.
He has also
proved that there is a whole class of methods of estimation leading to
the same limiting distribution.
Some of these concepts will be used in
the developments in this dissertation.
Cramer's proof is too concise;
the writer would like to recommend a more recent paper by Ogawa [18] to
supplement Cramer's original.
In the application of chi-square tests to contingency tables there are two currents of thought in the literature. To bring out these two views we shall take the following example given by Cochran [5]. The data are classified according to two different axes, A and B.

                B_1      B_2      Totals
    A_1         n_11     n_12     N_1.
    A_2         n_21     n_22     N_2.
    Totals      N_.1     N_.2     N
Data of this kind occur in at least three distinct experimental situations:

i)   A random sample of N is selected from some population and every observation is classified into one of the four cells. The symbol n_ij denotes the observed number falling in class A_i B_j, while p_ij denotes the corresponding probability of falling in the class, where

        Σ_i Σ_j p_ij = 1 .

     The joint probability of this group of observations is the usual multinomial

        [N! / (n_11! n_12! n_21! n_22!)]  ∏_i ∏_j p_ij^{n_ij} .

     The null hypothesis that the two classifications are independent amounts to the relation

        p_ij = p_i. p_.j    (i,j = 1,2).

     Hence two parameters must be estimated from the data.
ii)  A random sample of size N_1. is taken from a population denoted by A_1 and an independent sample of size N_2. from another population denoted by A_2. The null hypothesis states that the probability p of an observation falling in B_1 is the same in both populations, A_1 and A_2. Given the null hypothesis, the probability of the sample is the product of the two binomials,

        [N_1.! / (n_11! n_12!)] p^{n_11} (1−p)^{n_12} · [N_2.! / (n_21! n_22!)] p^{n_21} (1−p)^{n_22} .

     In this situation one parameter must be estimated from the data.
iii) A third case is obtained if both sets of marginal totals are fixed in repeated sampling. Fisher's tea-tasting experiment is an example. In this case the joint probability of the group of observations under the null hypothesis that the two classifications are independent is:

        N_1.! N_2.! N_.1! N_.2! / (N! n_11! n_12! n_21! n_22!) .

     No parameters need be estimated from the data.
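The case (iii) probability is easy to check numerically. The following sketch (the counts and the function name are ours, chosen only for illustration) evaluates the fixed-margins probability above and confirms that it agrees with the equivalent hypergeometric form:

```python
from math import comb, factorial

def fixed_margins_prob(n11, n12, n21, n22):
    """Probability of a 2x2 table with both sets of margins fixed (case iii)."""
    N1, N2 = n11 + n12, n21 + n22      # row totals N1., N2.
    C1, C2 = n11 + n21, n12 + n22      # column totals N.1, N.2
    N = N1 + N2
    num = factorial(N1) * factorial(N2) * factorial(C1) * factorial(C2)
    den = (factorial(N) * factorial(n11) * factorial(n12)
           * factorial(n21) * factorial(n22))
    return num / den

# Agrees with the hypergeometric form C(N1., n11) C(N2., n21) / C(N, N.1):
p = fixed_margins_prob(3, 1, 1, 3)
assert abs(p - comb(4, 3) * comb(4, 1) / comb(8, 4)) < 1e-12
```

This is the probability underlying Fisher's exact treatment of the tea-tasting experiment mentioned above.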
One approach for justification of χ² tests in each of the above situations is the "conditional probability" approach adopted by R. A. Fisher [8]. In this approach an arbitrary restriction is imposed that, in repeated sampling, only those tables will be considered which have the same marginal totals as in the sample data. This "conditional probability approach," for cases (i) and (ii), in the words of Mitra [16], "is for one thing not necessary, for another thing might be misleading."

The other approach for (i) and (ii) is to start from a single multinomial (i) or a product of an appropriate number of multinomial distributions (ii). The hypotheses that are posed are different in the different situations. The mathematical theorems to which appeal is made are presented by Cramer [7]. An excellent exposition of this approach has been given by Roy and Mitra [24].
Again in the third case, Roy and Mitra [24] point out that the joint probability can be obtained in the following two ways:

a) We start from an urn model in which there is an urn containing N_1. and N_2. balls of two different colors from which we draw successively without replacement N_.1 and N_.2 balls (with N_.1 + N_.2 = N_1. + N_2. = N). The joint probability that the jth bunch N_.j (j = 1,2) will contain n_1j, n_2j will be given by

        N_1.! N_2.! N_.1! N_.2! / (N! n_11! n_12! n_21! n_22!) .

   The great advantage of this scheme is that the different observations need not be assumed to be independent, and the great disadvantage is that we would not know how to write down the joint probability of the observations under a general hypothesis as distinct from the hypothesis of independence of the classifications.

b) This joint probability (iii) can also be obtained by starting from the multinomial of case (i), then putting H_o: p_ij = p_i. p_.j, and then finding under H_o the conditional probability subject to N_1., N_2., N_.1 and N_.2 being fixed. The disadvantage of this scheme is that one has to assume that the successive observations are independent.
A thorough review of the literature on these and other problems, such as correction for continuity and small expectations, has been done by Cochran [5].

As is usually the case in the practical application of a theory, one is faced with the problem that the conditions laid down by the theory are not completely met in the situation under consideration. The theory of the analysis of categorical data is no exception to this. One of the difficulties often encountered is that the experimenter is liable to make mistakes in classifying individuals in respective categories.

When the categories are of the type "lived" and "died," there is almost no risk of error. However, for example, in the medical field there is a considerable chance of error in complex diagnoses. To quote Bross [3],
    In more complex diagnosis the clinician realizes that there
    is a considerable risk of error, a risk that may vary a
    great deal depending on the disease under study, the
    existence and availability of diagnostic tests, and other
    factors.

Thus if we had two categories, attacked (A) and not attacked (I), because the chance of wrong diagnosis is appreciable, the possibility of an individual belonging to category A being classified into I and vice versa cannot be ignored. The following questions then arise:

i)   How do we take into account these errors while testing the usual hypotheses?¹

ii)  If these errors are ignored, will the level of significance be affected?

iii) What will be the effect of these errors on the limiting power of the test?
Problems of this nature have been investigated by Bross [3] and Rubin, Rosenbaum and Cobb [25]. Bross considers the following problems:

i)  Suppose we are sampling from a binomial population of "problem individuals" (henceforth denoted as I) and "non-problem individuals" [denoted as C(I)]. Let the probability of an observation belonging to I being wrongly classified into C(I) be θ, and that of an observation belonging to C(I) being wrongly classified into I be φ. Let p be the probability of an observation falling into I, if there were no misclassification. The problem is this: if x individuals are observed to belong to I in a sample of size n, is x/n still an unbiased estimate of p? Bross proves that it is not and that the bias is (K−1)p, where

        K = 1 − θ + φq/p    and    q = 1 − p.

    Further he remarks, "If estimates of θ and φ are available from the data used to construct the classification system, the estimates might be adjusted by dividing by K." However, a little thought will reveal that K involves the parameter which we are trying to estimate. Hence even if θ and φ are known, K cannot be used to remove the bias. It is very easy to see that if θ + φ ≠ 1 then

        (x/n − φ) / (1 − θ − φ)

    is an unbiased estimate of p.

    ¹For a discussion of the "usual hypotheses" see Roy and Mitra [24].
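The bias and its removal are easy to verify by simulation. In the sketch below the values θ = 0.10, φ = 0.05 and p = 0.30 are ours, chosen only for illustration; the naive proportion x/n concentrates near Kp, while the corrected estimate concentrates near p:

```python
import random

theta, phi, p = 0.10, 0.05, 0.30   # hypothetical error rates and true p
q = 1 - p
K = 1 - theta + phi * q / p        # E(x/n) = K p, so the bias is (K - 1) p

random.seed(0)
n, x = 200_000, 0
for _ in range(n):
    truly_I = random.random() < p
    # an I is retained with probability 1 - theta; a C(I) leaks in with probability phi
    observed_I = (random.random() >= theta) if truly_I else (random.random() < phi)
    x += observed_I

naive = x / n                                   # estimates K p, not p
corrected = (naive - phi) / (1 - theta - phi)   # unbiased when theta + phi != 1
```

Note that the correction uses only θ and φ, not K, in agreement with the remark above that K itself involves the unknown p.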
ii) The other problem Bross considers is the comparison of probabilities in two binomial populations. He shows that the "validity" of the chi-square test is not affected by ignoring misclassification. Further he remarks, "We do not get off scot-free, however. Although the tests are valid, power may be drastically reduced." Bross presents the values of 1/K′ for various values of θ, φ and p, where 1/K′ is used as the measure of the efficiency of the test under misclassification. He also indicates some possible extensions to small samples.

Rubin, Rosenbaum and Cobb [25] focus their attention on the second problem mentioned above. They arrive at the same conclusions.
We shall consider the following problems of categorical data:

i)  The goodness of fit test.

ii) Contingency tables:
    a) Stratified sampling situations.
    b) Random sampling situations.
CHAPTER II

THE GOODNESS OF FIT TEST

2.1 The Problem

A frequency table consists of r classes; p_j^0 is the value specified by the null hypothesis (H_o) for the probability of an observation falling in the jth class, such that p_j^0 > 0 and

    Σ_{j=1}^{r} p_j^0 = 1.

Let n_j be the observed frequency in the jth class, in a sample of size N, such that

    Σ_{j=1}^{r} n_j = N.

Let

    x_j = (n_j − N p_j^0) / √(N p_j^0) .

If there were no errors in the observations, the test criterion would be: reject H_o if

    χ² = Σ_{j=1}^{r} x_j² ≥ C,

C being so chosen that

    Pr{χ²_(r−1) ≥ C} = α    (desired level of significance).

Let us see how this test criterion must be modified when there are errors in classification.
11
Let ejk(jfk) denote the probability of an observation belonging
to the jth class being wrongly put in the kth class; and 6jj the
probability of an observation belonging to the jth class being correctly
olassified in the jth class.
It is obvious that
r
Le
jk ... 1
(j '" 1,2 J
• •
·r )
0
k-l
Let pA denote the probability of an observation being assigned to the
jth class when there are errors in classification; and denote by Pj
the true probability of an observation belonging to the jth class.
is easy to see that
r
Pj - LPtetj
(j - 1,2, ••• r).
t-l
Let e(r x r) denote the stochasticll matrix
k - 1,2,···r.
Two cases will be considered:
(a) e(r x r) is non-s:ingular and known.
(b) &(r x r) is non-singular but unknown.
2.2 Case a:
e Is ..Non-singular and
Known:
Clearly:
[pi p& ••• P~J - [Pl Pa ••• PrJ
Y A square matrix A •
e(r x r).
(a jk ) of order!: is called stochastic if all
the elements are non-negative and if
r
). a . _ 1 (j'" 1 2 • or)
'tc' jk
..
, .,
•
0
It
",'
Now

    H_o: p_j = p_j^0  (j = 1,2,···,r)    <===>    H_o′: p_j′ = p_j^{0′}  (j = 1,2,···,r),

where

    p_j^{0′} = Σ_{t=1}^{r} p_t^0 θ_tj    (j = 1,2,···,r).

Hence rejection or acceptance of H_o′ <===> rejection or acceptance of H_o. Consequently the size of the rejection region under H_o is the same as before (when θ = I). Thus the problem reduces to that of testing H_o′. It is easy to see that the test procedure would be: reject H_o′ (and hence H_o) if

    χ′² = Σ_{j=1}^{r} (n_j − N p_j^{0′})² / (N p_j^{0′}) ≥ C,

C being so chosen that

    Pr{χ²_(r−1) ≥ C} = α    (desired level of significance).
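As a numerical sketch (the r = 3 null probabilities and the matrix θ below are hypothetical, chosen only for illustration), the procedure is simply the ordinary goodness of fit test run against the transformed null probabilities p_j^{0′}:

```python
import numpy as np

p0 = np.array([0.2, 0.3, 0.5])                 # null probabilities p_j^0
theta = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.00, 0.10, 0.90]])         # known theta_jk; each row sums to 1
p0_err = p0 @ theta                            # p_j^{0'} = sum_t p_t^0 theta_tj

n = np.array([250, 300, 450])                  # observed frequencies
N = n.sum()
chi2_stat = ((n - N * p0_err) ** 2 / (N * p0_err)).sum()
C = 5.991                                      # 0.95 quantile of chi-square, r - 1 = 2 d.f.
reject_H0 = bool(chi2_stat >= C)
```

Here the data happen to match p^0 rather than p^{0′}, so the misclassification-aware test rejects even though the raw proportions look close to the stated null.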
2.2.1 Limiting Power:

Let {γ_j} be a set of deviation parameters, not all zero, such that

    Σ_{j=1}^{r} γ_j = 0.

Denote by H_1N the hypothesis

    p_jN = p_j^0 + γ_j/√N    (j = 1,2,···,r).

The limiting power of the goodness of fit test, following Pitman [22] and Cochran [5], may be defined as

    Lt_{N→∞} Pr{χ² ≥ C | H_1N},

where p_j^0 (j = 1,2,···,r), χ² and C have the same meaning as defined earlier.

Let χ²_n = Σ_{i=1}^{n} y_i², where the y_i are independent N(a_i, 1). The distribution of χ²_n is called the non-central chi-square. Define

    λ = Σ_{i=1}^{n} a_i².

If f(T) denotes the density function of χ²_n, then

    f(T) = Σ_{k=0}^{∞} [e^{−λ/2} (λ/2)^k / k!] f_{n+2k}(T),

where f_m denotes the density of the central chi-square distribution with m degrees of freedom. Let

    F(C, n, λ) = Pr{χ²_n ≤ C}.

It has been proved by Cochran [5] and Mitra [16] that

    Lt_{N→∞} Pr{χ² ≥ C | H_1N} = 1 − F(C, r−1, λ),

where

    λ = Σ_{j=1}^{r} γ_j² / p_j^0 .

We wish to study the effect of misclassification on this limiting power. In other words we wish to find

    Lt_{N→∞} Pr{χ′² ≥ C | H_1N},

where

    γ_j′ = Σ_{t=1}^{r} γ_t θ_tj .

Using the result proved by Mitra [16], the above is easily seen to be

    1 − F(C, r−1, λ′),

where

    λ′ = Σ_{j=1}^{r} γ_j′² / p_j^{0′} .

The function 1 − F(C, n, λ) is a strictly monotonic increasing function of λ (see Roy [23]). Consequently, to compare the two functions 1 − F(C, r−1, λ) and 1 − F(C, r−1, λ′), we need to compare λ and λ′. It will be shown that λ′ ≤ λ. λ′ can be written as follows:

    λ′ = Σ_{j=1}^{r} (Σ_{t=1}^{r} γ_t θ_tj)² / (Σ_{t=1}^{r} p_t^0 θ_tj) .

Using Schwarz's inequality,

    (Σ_t γ_t θ_tj)² = [Σ_t (γ_t √(θ_tj)/√(p_t^0)) · √(p_t^0 θ_tj)]² ≤ (Σ_t γ_t² θ_tj / p_t^0)(Σ_t p_t^0 θ_tj),

we see that

    λ′ ≤ Σ_{j=1}^{r} Σ_{t=1}^{r} γ_t² θ_tj / p_t^0 = Σ_{t=1}^{r} (γ_t² / p_t^0) Σ_{j=1}^{r} θ_tj = λ ,

so that

    1 − F(C, r−1, λ′) ≤ 1 − F(C, r−1, λ).

Hence we have proved what might have been expected, namely that misclassification reduces the limiting power.
To illustrate the above point let us consider the following example. Let

    θ_jk = θ    (j ≠ k),
    θ_jj = 1 − (r−1)θ,

where 0 < θ < 1/r. Suppose the hypothesis we are testing is H_o: p_j = 1/r. In this case H_o′ is p_j′ = 1/r. Also it is easy to see that

    λ = r Σ_j γ_j²

and

    λ′ = r(1−rθ)² Σ_j γ_j² = (1−rθ)² λ .

Now if we fix λ and α and calculate λ′ for various values of θ, we will be able to compare the power function for the two situations. For example, let λ = 15.405, r = 5 and α = 0.05. For θ = 0.05656,

    λ′ = (1−rθ)² λ = 7.924.

1 − F(C,4,15.405) and 1 − F(C,4,7.924) can be read from Fix's [10] tables of non-central chi-square. These values are seen to be .9 and .6 respectively.
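With modern software for the non-central chi-square distribution the same comparison can be reproduced without tables. The sketch below (using scipy, which is of course not part of the thesis) recovers the two power values quoted above:

```python
from scipy.stats import chi2, ncx2

r, alpha, lam = 5, 0.05, 15.405     # lam chosen so that the power is 0.9 at theta = 0
C = chi2.ppf(1 - alpha, df=r - 1)   # critical value for r - 1 = 4 d.f.

def limiting_power(theta):
    """Limiting power under uniform misclassification theta_jk = theta (j != k)."""
    lam_prime = (1 - r * theta) ** 2 * lam
    return ncx2.sf(C, df=r - 1, nc=lam_prime)

power_0 = limiting_power(0.0)       # about 0.9
power_1 = limiting_power(0.05656)   # lam' = 7.924, about 0.6
```

Sweeping theta over a grid in (0, 1/r) reproduces the columns of Tables I and II.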
Tables I and II show the effect of misclassification on limiting power for the situation mentioned above.

Table I. Effect of misclassification on limiting power when the null
         hypothesis Ho is: p_j = 1/r; α = 0.05, r = 5.

       θ        λ′ = (1−5θ)² λ      Power

              λ = 15.405
    0.00000        15.405            0.9
     .02396        11.935            .8
     .04144         9.683            .7
     .05656         7.924            .6
     .07089         6.420            .5
     .08549         5.050            .4
     .10150         3.737            .3
     .12104         2.401            .2
     .15139          .910            .1

              λ = 6.420
     .00000         6.420            .5
     .02262         5.050            .4
     .04741         3.737            .3
     .07769         2.401            .2
     .12470          .910            .1

    When θ = 0.2000, any departure from the null hypothesis is unobservable and the power drops below the level of significance.
Table II. Effect of misclassification on limiting power when Ho is:
          p_j = 1/r; α = 0.01, r = 5.

       θ        λ′ = (1−5θ)² λ      Power

              λ = 20.737
    0.00000        20.737            0.9
     .02026        16.749            .8
     .03496        14.121            .7
     .04761        12.039            .6
     .05952        10.231            .5
     .07153         8.557            .4
     .08452         6.914            .3
     .09997         5.188            .2
     .12206         3.149            .1

              λ = 10.231
     .00000        10.231            .5
     .01709         8.557            .4
     .03559         6.914            .3
     .05758         5.188            .2
     .08904         3.149            .1

2.3 Case b: θ(r × r) Is Non-singular but Unknown

It has been proved earlier that the hypothesis

    H_o: p_j = p_j^0  (j = 1,···,r)    <===>    H_o′: p_j′ = p_j^{0′}  (j = 1,···,r),

where

    p_j^{0′} = Σ_{t=1}^{r} p_t^0 θ_tj .

Consequently the problem of testing H_o reduces to that of testing H_o′. Let us proceed to see how to test H_o′ when θ(r × r) is unknown.
Under the null hypothesis H_o′: p_j′ = p_j^{0′}, the r functions {p_1′, p_2′,···,p_r′} are expressed in terms of certain unknown parameters. The number of independent parameters that enter into the expression of {p_1′,···,p_r′} is easily seen to be r(r−1). We would like to use Cramer's theorem [7] to say, "Under H_o′ the statistic

    χ′² = Σ_{j=1}^{r} (n_j − N p̂_j′)² / (N p̂_j′),

where p̂_j′ is obtained by substituting for the parameters in the expression of p_j′ the consistent solution of the likelihood equations, is distributed as χ² with appropriate degrees of freedom as N → ∞." However, Cramer's theorem requires that the number of independent parameters be less than r.

2.4 Special Case of All θ_jk = θ (j ≠ k)

From the above consideration it is clear that the problem cannot be solved when the {p_j′} are functions of r(r−1) unknown parameters. However, we can obtain a solution to this problem for particular cases. One particular case is

    θ_jk = θ  for all j ≠ k;    θ_jj = 1 − (r−1)θ  for all j,

where θ is some unknown parameter, such that 0 < θ < 1/r. Let θ = θ_o be the true value of θ, where 0 < θ_o < 1/r.
In this situation we have, under H_o:

    p_j′ = p_j^0 (1−rθ) + θ    (j = 1,2,···,r).

In what follows we will assume that not all p_j^0 = 1/r. Suppose that the classes are renumbered (if necessary) so that for the first r_o (1 < r_o ≤ r) classes p_j^0 ≠ 1/r. Without limiting generality, we can assume that all these r_o values of p_j^0 are different.
19
The modified minimum chi-square equatioJ! then can be written as:
:ro
\~n..lij~ = 0
Le -
qj
j=l
where
q.
J
If
A
e is
=
p.o
J
rp.0 - 1
J
•
I
A.
a solution of (2.1) such that 6 - > 6 in probability, then
0
the statistic
is distributea as
't..a with
r-2
degrees of freedom as N ->
00.
The equation
I1For minimizing
r
XI • I
(nj -
NPj' (9»I/Npj' (e)
j-l
with respect to
e we
have the equation:
-~ ~ ·L~~ -e;j- +~r:
r
o
j=l
0'
Pj
. ~
~:j ~ = 0
0
-
0'
.2NPj
As Cramer has indicated, for large N, the infiuence of the second term
within the bracket becomes negligible. If we neglect the second term
completely we get equation (2.1) which Cramer calls the modified minimum chi-square. In this case, however, this equation agrees with the
likelihood equation.
20
has (as it will be proved) r -1 real roots.
o
which root should we take as
'91
The question then arises
We shall proceed to answer this question.
First we note that there are r_o values of p_j^0 different from 1/r. Since

    Σ_{j=1}^{r} (p_j^0 − 1/r) = 0,

it follows that neither can all the p_j^0 be greater than 1/r nor can all the p_j^0 be less than 1/r. Suppose s (< r_o) of the p_j^0 are less than 1/r and r_o − s are greater than 1/r. If p_j^0 < 1/r then q_j < 0, while if p_j^0 > 1/r then q_j > 1/r; hence s of the q's are negative and r_o − s are positive. Suppose that the q's are arranged in ascending order of magnitude:

    q_1 < q_2 < ··· < q_s < q_{s+1} < ··· < q_{r_o},

so that q_s is the largest negative q and q_{s+1} the smallest positive q. We shall prove that one and only one root of the equation

    Σ_{j=1}^{r_o} n_j / (θ − q_j) = 0

lies in the open interval (q_s, q_{s+1}) and that this is the one that is consistent. We shall first prove that between two consecutive q's, say (q_j, q_{j+1}), there lies exactly one root of this equation. Before we prove this we would like to state a theorem proved by Hobson [13]:

    If f(x) be a continuous function in the open interval (a,b),
    and if any of the functional limits at a is of opposite
    sign to any of those at b, there is at least one point of
    the open interval at which f(x) has the value zero.
Let

    f(θ) = Σ_{j=1}^{r_o} n_j / (θ − q_j).

Clearly f(θ) is a continuous function of θ in the open interval (q_j, q_{j+1}). Further f(q_j + 0) = +∞ and f(q_{j+1} − 0) = −∞. Hence by the above theorem there exists at least one value θ_1 in the open interval (q_j, q_{j+1}) such that f(θ_1) = 0. Therefore there must exist at least one real root of f(θ) = 0 between two consecutive q's. But there are r_o − 1 such pairs and the equation f(θ) = 0 has r_o − 1 roots. Hence the result. Consequently there exists one and only one root of the equation

    Σ_{j=1}^{r_o} n_j / (θ − q_j) = 0

between q_s and q_{s+1}.
Let

    g(θ) = ∏_{j=1}^{r_o} (1 − θ/q_j)^{n_j} .

We shall first prove that g(θ̂)/g(θ_o) ≥ 1 for all {n_1, n_2,···,n_{r_o}} and for all N, where θ̂ is the root of the equation

    Σ_{j=1}^{r_o} n_j / (θ − q_j) = 0

that lies in the open interval (q_s, q_{s+1}). Note that θ_o is also contained in the open interval (q_s, q_{s+1}), since q_s = p_s^0/(r p_s^0 − 1) < 0 and q_{s+1} = p_{s+1}^0/(r p_{s+1}^0 − 1) > 1/r, while 0 < θ_o < 1/r.
Clearly the first and second partial derivatives of log g(θ) are continuous in the closed interval [θ̂, θ_o]. Hence by Taylor's theorem (see Whittaker and Watson [27]), one obtains:

    log g(θ_o) = log g(θ̂) + (θ_o − θ̂) [∂ log g(θ)/∂θ]_{θ=θ̂} + R,

where

    R = [(θ_o − θ̂)²/2!] [∂² log g(θ)/∂θ²]_{θ=θ_1},

θ_1 being some intermediate point between θ_o and θ̂. Clearly

    [∂ log g(θ)/∂θ]_{θ=θ̂} = Σ_{j=1}^{r_o} n_j/(θ̂ − q_j) = 0

and

    [∂² log g(θ)/∂θ²]_{θ=θ_1} = −Σ_{j=1}^{r_o} n_j/(θ_1 − q_j)² < 0,

no matter what θ_1 is chosen. Hence log g(θ̂) ≥ log g(θ_o), or

    g(θ̂)/g(θ_o) ≥ 1.
A. Wald [26], under eight assumptions, has proved the following theorem: Let θ_n(x_1,···,x_n) be a function of the observations {x_1,···,x_n} such that

    L(θ_n)/L(θ_o) ≥ c > 0    for all n;

then

    Prob{ lim_{n→∞} θ_n = θ_o } = 1.

Ogawa [18] has proved that all the conditions laid down by Wald are satisfied by the multinomial. Hence Wald's theorem is applicable in our case. Let L(θ) denote the likelihood

    L(θ) = [N! / (n_1! n_2! ··· n_r!)] ∏_{j=1}^{r} p_j′(θ)^{n_j},

which, apart from a factor not involving θ, is equal to g(θ). Clearly, from what we have proved previously, it follows that

    L(θ̂)/L(θ_o) ≥ 1 > 0.

Hence this proves the consistency of θ̂. This root can now be obtained by Newton's iterative procedure.
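A minimal sketch of this computation follows. The frequencies below are hypothetical and were constructed to be exactly consistent with θ_o = 0.1, so that the located root can be checked; a production version might add a bisection safeguard to keep the iterates inside (q_s, q_{s+1}):

```python
p0 = [0.2, 0.3, 0.5]                         # p_j^0, all different from 1/r
n = [240, 310, 450]                          # hypothetical observed frequencies
r = len(p0)
q = [pj / (r * pj - 1) for pj in p0]         # q_j = p_j^0 / (r p_j^0 - 1)

q_s = max(x for x in q if x < 0)             # largest negative q
q_s1 = min(x for x in q if x > 0)            # smallest positive q

def f(t):                                    # left side of equation (2.1)
    return sum(nj / (t - qj) for nj, qj in zip(n, q))

def fprime(t):
    return -sum(nj / (t - qj) ** 2 for nj, qj in zip(n, q))

theta = (q_s + q_s1) / 2                     # start inside (q_s, q_{s+1})
for _ in range(100):                         # Newton's iterative procedure
    step = f(theta) / fprime(theta)
    theta -= step
    if abs(step) < 1e-12:
        break
```

Since f is strictly decreasing on (q_s, q_{s+1}), the iteration settles on the unique (consistent) root in that interval, here θ̂ = 0.1.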
2.4.1 Limiting Power:

Let {γ_j} be a set of deviation parameters, not all zero, such that

    Σ_{j=1}^{r} γ_j = 0.

We wish to study

    Lt_{N→∞} Pr{χ′² ≥ C | H_1N},

where H_1N denotes the hypothesis:

    p_jN = p_j^0 + γ_j/√N    (j = 1,2,···,r).

Clearly H_1N <===> H_1N′, where H_1N′ denotes the hypothesis

    p_jN′ = p_j^{0′} + γ_j′/√N    (j = 1,2,···,r),

with

    γ_j′ = (1−rθ_o) γ_j .

Hence

    Lt_{N→∞} Pr{χ′² ≥ C | H_1N} = Lt_{N→∞} Pr{χ′² ≥ C | H_1N′}.

Now

    Lt_{N→∞} Pr{χ′² ≥ C | H_1N′} = 1 − F(C, r−2, λ′),

where λ′ is the corresponding non-centrality parameter, which satisfies

    λ′ ≤ (1−rθ_o)² Σ_{j=1}^{r} γ_j² / p_j^{0′} .

If there were no misclassification the limiting power would be

    1 − F(C, r−1, λ),

where

    λ = Σ_{j=1}^{r} γ_j² / p_j^0 .

It is easy to show that λ′ < λ. We need only prove that

    (1−rθ_o)² Σ_{j=1}^{r} γ_j² / p_j^{0′} < Σ_{j=1}^{r} γ_j² / p_j^0 .

We note that

    p_j^{0′} = p_j^0 (1−rθ_o) + θ_o .

Since 0 < 1−rθ_o < 1,

    (1−rθ_o)² p_j^0 < (1−rθ_o) p_j^0 < p_j^0 (1−rθ_o) + θ_o = p_j^{0′},

so that, term by term,

    (1−rθ_o)² γ_j² / p_j^{0′} < γ_j² / p_j^0 .

Hence λ′ < λ. Also the number of degrees of freedom has been reduced by one. Consequently the limiting power will be reduced due to misclassification.
2.4.2 Use of Best Asymptotically Normal (BAN) Method of Estimation:

The equation

    Σ_{j=1}^{r_o} n_j / (θ − q_j) = 0

usually will be difficult to solve, even by Newton's iterative procedure. Under certain conditions it is possible to adopt a different method for estimating θ and to overcome this difficulty. The main theorem on the limiting distribution of χ² has been proved under the hypothesis that the method of estimation is the modified minimum χ². However, there is another class of methods of estimation leading to the same limiting distribution, as indicated by Cramer [7, p. 506]:

    The theorem on the limiting distribution of χ² holds for any
    set of asymptotically normal and asymptotically efficient
    estimates of the parameters.
One such class of estimates is the BAN (Best Asymptotically Normal) estimates introduced by Neyman [17]. Before we proceed to obtain the BAN estimates we would like to state the definition of BAN estimates and the conditions under which they exist.

Consider t sequences (t = 1 in our case) of independent trials and let N_i. denote the number of trials in the ith sequence. Each trial of the ith sequence is capable of producing one of the r_i mutually exclusive results, say R_{i1}, R_{i2},···,R_{ir_i}, with respective probabilities p_{i1}, p_{i2},···,p_{ir_i}, where

    Σ_{j=1}^{r_i} p_{ij} = 1.

Denote by n_{ij} the number of occurrences of R_{ij} in the course of the N_i. trials forming the ith sequence. Also let q_{ij} = n_{ij}/N_i.¹ Finally let N_1. + N_2. + ··· + N_t. = N and Q_i = N_i./N. The symbols n_{ij} and q_{ij} will be treated as random variables. The Q_i's will be considered as constants. N, the total number of observations, will be assumed to increase without limit.

The probabilities p_{ij} are unknown, but it is given that each p_{ij} (i = 1,2,···,t; j = 1,2,···,r_i) is a specified function of several parameters {θ_1, θ_2,···,θ_m}. Thus we shall consider the situation where it is known that

    p_{ij} = f_{ij}(θ_1, θ_2,···,θ_m),

with the functions satisfying the t identities

    Σ_{j=1}^{r_i} f_{ij} = 1,    i = 1,2,···,t.

It will be further assumed that the above inequalities and the identities hold for the whole range of variation of {θ_1, θ_2,···,θ_m}. The functions f_{ij} will be assumed to be continuous with respect to {θ_1, θ_2,···,θ_m} and to possess continuous partial derivatives up to the second order. We shall use the notation:

    f_{ij,k} = ∂f_{ij}/∂θ_k .

Finally we shall assume that the matrix (∂f_{ij}/∂θ_k) is of rank m.

    ¹Neyman uses this notation, which we have retained for ease of comparison. It is understood that q_{ij} is not related to the parameter q_j we have used earlier in this chapter.
Definition: A function θ̂_j of the random variables q_{ij} which does not depend directly on N is called a BAN estimate of the parameter θ_j if it satisfies the following conditions:

(i)   θ̂_j is a consistent estimate of θ_j^0. That is to say, as N → ∞, the estimate θ̂_j tends in probability to θ_j^0, or in symbols

          P lim_{N→∞} θ̂_j = θ_j^0,    j = 1,2,···,m.

(ii)  As N → ∞ the distribution of θ̂_j tends to be normal, N(θ_j^0, σ_j/√N). More specifically, whatever real number w,

          Lim_{N→∞} P{ (θ̂_j − θ_j^0)√N/σ_j < w } = (1/√(2π)) ∫_{−∞}^{w} e^{−x²/2} dx = Φ(w),

      where σ_j is a sure number, independent of N.

(iii) If θ̃_j is any other function satisfying (i) and (ii), but with σ_j replaced by σ̃_j, then σ_j ≤ σ̃_j.

(iv)  Estimates {θ̂_j} considered as functions of q_{ij} possess continuous partial derivatives with respect to each q_{ij}.
There are three alternative methods of obtaining BAN estimates. We will state here the one which will be helpful in our problem. Consider the expression:

    χ̃²(q,f) = N Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i (q_{ij} − f_{ij})² / q_{ij} .

In writing this formula it is assumed that none of the q_{ij} is equal to zero. The problem of minimizing χ̃² with respect to the unrestricted variation of (θ_1, θ_2,···,θ_m), while the values of q_{ij} are held constant, leads to the solution of the following equations:

    w_k = Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i (f_{ij}/q_{ij}) f_{ij,k} = 0,    k = 1,2,···,m.

The above system of equations possesses a system of solutions θ̂_j(q) which are the BAN estimates of θ_j (j = 1,2,···,m).
We had remarked earlier that by adopting a different method for the estimation of θ, it would be possible to eliminate the labor of solving the equation

    Σ_{j=1}^{r_o} n_j / (θ − q_j) = 0.

The technique we wish to adopt is that of minimum χ̃². Before obtaining this estimate, let us check that all the conditions are satisfied. It is noted that

    p_j′ = p_j^0 + θ(1 − r p_j^0) = p_j^0 (1−rθ) + θ.

Since 0 < θ < 1/r and p_j^0 > 0, we have p_j′ > 0, j = 1,2,···,r. Also ∂p_j′/∂θ and ∂²p_j′/∂θ² exist and are continuous. Since there exists at least one p_j^0 ≠ 1/r, the vector [∂p_1′/∂θ ··· ∂p_r′/∂θ] is of rank one. Hence all the conditions for the existence of BAN estimates are satisfied. If none of the quantities {n_1, n_2,···,n_{r_o}} is zero, the following equation will need to be solved to obtain the BAN estimate:

    Σ_{j=1}^{r_o} [p_j′(θ)/n_j] (1 − r p_j^0) = 0.

The solution of this equation is

    θ̂ = 1/r − [ Σ_{j=1}^{r_o} (1 − r p_j^0)/n_j ] / [ r Σ_{j=1}^{r_o} (1 − r p_j^0)²/n_j ] .

Since this equation has a unique solution θ̂, it follows that θ̂ is the BAN estimate of θ. Let

    p̂_j′ = p_j^0 + θ̂(1 − r p_j^0),    j = 1,2,···,r_o;
    p̂_j′ = p_j^0 = 1/r,               j = r_o+1,···,r.
30
In this case we have the statistic comparable to (2.2),
The limiting distribution of this statistic, under the null hypothesis,
is also a 't.. distribution with 1'-2 degrees of freedom.
The test procedure is'.
rejeot Ho if
x'·
-
> C
and accept otherwise J C being so chosen that
Pr
"(%'_2)
~ C • .(
(desired level of significance).
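Since the BAN estimate is available in closed form, the whole test takes only a few lines. The data below are hypothetical (the same frequencies used in the Newton sketch, exactly consistent with θ = 0.1, so the fitted statistic is essentially zero):

```python
p0 = [0.2, 0.3, 0.5]                 # p_j^0, here all different from 1/r (so r_o = r)
n = [240, 310, 450]                  # observed frequencies, none zero
r, N = len(p0), sum(n)

a = [1 - r * pj for pj in p0]        # a_j = 1 - r p_j^0
s1 = sum(aj / nj for aj, nj in zip(a, n))
s2 = sum(aj ** 2 / nj for aj, nj in zip(a, n))
theta_hat = 1 / r - s1 / (r * s2)    # closed-form BAN estimate of theta

p_hat = [pj + theta_hat * aj for pj, aj in zip(p0, a)]   # fitted p_j^{0'}
chi2_stat = sum((nj - N * pj) ** 2 / (N * pj)
                for nj, pj in zip(n, p_hat))
# compare chi2_stat with the (1 - alpha) quantile of chi-square on r - 2 d.f.
```

With these frequencies the closed form returns θ̂ = 0.1 exactly, matching the root that Newton's iteration locates for equation (2.1).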
Consider the situation described in Section 2.4.2 where we have t sequences of independent trials, the ith sequence being capable of producing one of the r_i mutually exclusive results with respective probabilities p_{i1}, p_{i2},···,p_{ir_i}. The hypothesis (H_o) to be tested is that the p_{ij} are known functions of m (< Σ_{i=1}^{t} r_i − t) unknown parameters θ_1, θ_2,···,θ_m. In other words,

    p_{ij} = f_{ij}(θ_1, θ_2,···,θ_m),

where the f_{ij} satisfy the conditions stated in 2.4.2.
The usual test procedure is: reject H_o if

    χ² = N Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i [q_{ij} − f_{ij}(θ̂_1, θ̂_2,···,θ̂_m)]² / f_{ij}(θ̂_1, θ̂_2,···,θ̂_m) ≥ C;

otherwise do not reject. The quantities C, θ̂_1, θ̂_2,···,θ̂_m are defined as follows:

i)  Pr{χ²_(r−m−t) ≥ C} = α, where r = Σ_{i=1}^{t} r_i.

ii) {θ̂_1, θ̂_2,···,θ̂_m} are the consistent solutions of the modified minimum chi-square equations.
We have seen that the above test procedure is valid for any {θ̂_1, θ̂_2,···,θ̂_m} provided that they are BAN estimates of {θ_1, θ_2,···,θ_m}. We have also seen that the equations

    Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i (f_{ij}/q_{ij}) f_{ij,k} = 0,    k = 1,2,···,m,

possess a system of solutions which are BAN estimates of {θ_1, θ_2,···,θ_m}. Consequently we can also use the test procedure: reject H_o if

    χ′² = N Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i [q_{ij} − f_{ij}(θ̂_1,···,θ̂_m)]² / f_{ij}(θ̂_1,···,θ̂_m) ≥ C,

and do not reject otherwise; C is defined as earlier, but {θ̂_1, θ̂_2,···,θ̂_m} is a certain system of solutions of the equations

    Σ_{i=1}^{t} Σ_{j=1}^{r_i} Q_i (f_{ij}/q_{ij}) f_{ij,k} = 0,    k = 1,2,···,m.

Let θ^0 = {θ_1^0, θ_2^0,···,θ_m^0} be a specific set of values of the θ's, and let {γ_{ij}} be a set of deviation parameters, not all zero, such that

    Σ_{j=1}^{r_i} γ_{ij} = 0,    i = 1,2,···,t.
Denote by H_1N the hypothesis:

    p_{ijN} = p_{ij}(θ_1^0, θ_2^0,···,θ_m^0) + γ_{ij}/√N .

Mitra [16] has evaluated

    Lt_{N→∞} Pr{χ² ≥ C | H_1N}.

We will now show that

    Lt_{N→∞} Pr{χ′² ≥ C | H_1N}

is the same as

    Lt_{N→∞} Pr{χ² ≥ C | H_1N}.
Further let us adopt the following notation. Let B(r × m) denote the matrix whose element in the row corresponding to the pair (i,j) (j = 1,2,···,r_i; i = 1,2,···,t) and in the kth column (k = 1,2,···,m) is

    f_{ij,k} √(Q_i / f_{ij}) .

Let

    D(r × r) = diag(D_1, D_2,···,D_t),

where

    D_i(r_i × r_i) = diag(f_{i1}/Q_i, f_{i2}/Q_i,···, f_{ir_i}/Q_i).

Finally let B^0(r × m)¹ and D^0(r × r) be the matrices whose elements are the elements of B and D respectively evaluated at

    (θ_1, θ_2,···,θ_m) = (θ_1^0, θ_2^0,···,θ_m^0).
Under the general conditions stated in Section 2.4.2, Neyman has proved the theorem (Theorem 5) which in our notation can be stated as follows:

The system of equations

    sum_{i=1}^t sum_{j=1}^{r_i} (q_ij / f_ij) f_{i,j,k} = 0,   k = 1, 2, ..., m,

possesses a system of solutions theta-hat_1, theta-hat_2, ..., theta-hat_m which has the following properties:

i)  The functions theta-hat_h (h = 1, 2, ..., m) have continuous partial derivatives with respect to all the independent variables q_ij.

ii) The result of substituting

        q_ij = f_ij(theta_1, theta_2, ..., theta_m),   j = 1, 2, ..., r_i;  i = 1, 2, ..., t,

    in theta-hat_h(q) leads to the identity theta-hat_h = theta_h, h = 1, 2, ..., m.

iii) Let

        A_h' = [d theta-hat_h / d q_11   d theta-hat_h / d q_12  ...  d theta-hat_h / d q_{1r_1}  ...  d theta-hat_h / d q_{tr_t}].

Then the m x r matrix whose h-th row is A_h' satisfies

    [A_1'; A_2'; ...; A_m'] = (B'B)^(-1) B' (D^(1/2))^(-1),

where

    D^(1/2)(r x r) = diag(D_1^(1/2), D_2^(1/2), ..., D_t^(1/2)).
Let x'(1 x r) = [x_11  x_12 ... x_{1r_1} ... x_{t1}  x_{t2} ... x_{tr_t}], and let {gamma_ij} be a set of deviation parameters, not all zero, such that

    sum_{j=1}^{r_i} gamma_ij = 0,   i = 1, 2, ..., t;

let the vector (gamma_11  gamma_12 ... gamma_{tr_t}) be denoted by gamma. Let H_1N denote the hypothesis:

    p_ijN = p_ij(theta_1^0, theta_2^0, ..., theta_m^0) + gamma_ij / sqrt(N),   j = 1, 2, ..., r_i;  i = 1, 2, ..., t.
For lambda > d, where d = max_{i,j} |gamma_ij|, we have the following. When H_1N is true, E(q_ij) = p_ijN, and the variance of q_ij is of order p_ijN(1 - p_ijN)/N. Assuming H_1N to be true and using the Bienayme-Tchebycheff inequality, we obtain:

    Pr{sqrt(N) |q_ij - p_ijN| > lambda - |gamma_ij|} < p_ijN (1 - p_ijN) / (lambda - |gamma_ij|)^2.

Since |q_ij - p_ij^0| <= |q_ij - p_ijN| + |gamma_ij|/sqrt(N), it follows that

    Pr{sqrt(N) |q_ij - p_ij^0| > lambda} < p_ijN / (lambda - d)^2.

Further, by the assumptions of Section 2.4.2 there exists a quantity delta such that p_ij^0 > delta > 0 for all i and j.

Let M_ij denote the event:

    sqrt(N) |q_ij - p_ij^0| <= lambda.   (2.6)

Let M = intersection over i and j of the M_ij, where M-bar_ij is the complement of M_ij. Then

    P(M) >= 1 - sum_i sum_j P(M-bar_ij) >= 1 - sum_i sum_j p_ijN / (lambda - d)^2.
Expanding theta-hat_h(q) about the point p^0 we obtain

    theta-hat_h(q) = theta-hat_h(p^0) + sum_i sum_j a^0_{h,i,j}(q_ij - p^0_ij) + R_h,

where a^0_{h,i,j} denotes d theta-hat_h / d q_ij evaluated at p^0, and where the remainder R_h involves the second derivatives d^2 theta-hat_h / d q_ij d q_{i'j'} evaluated at some intermediate point on the segment joining the points (q_11 q_12 ... q_{1r_1} ... q_{t1} q_{t2} ... q_{tr_t}) and (p^0_11 p^0_12 ... p^0_{1r_1} ... p^0_{t1} ... p^0_{tr_t}). Since d theta-hat_h / d q_ij and d^2 theta-hat_h / d q_ij d q_{i'j'} are continuous in the closed region joining these two points, it follows that they are bounded for all h, i, and j. By the Cauchy-Schwarz inequality,

    |sum_i sum_j a^0_{h,i,j}(q_ij - p^0_ij)| <= [sum_i sum_j (a^0_{h,i,j})^2]^(1/2) [sum_i sum_j (q_ij - p^0_ij)^2]^(1/2),

and making use of relation (2.6) we see that, on the event M,

    |sum_i sum_j a^0_{h,i,j}(q_ij - p^0_ij)| <= K_1 lambda / sqrt(N)   and   |R_h| <= K_2 lambda^2 / N,

where K_1 and K_2 are independent of N. Hence theta-hat_h - theta_h^0 is O(lambda/sqrt(N)) and R_h is O(lambda^2/N); we may write R_h = K alpha_h lambda^2 / N with 0 <= |alpha_h| <= 1. Writing x_ij = sqrt(N)(q_ij - p^0_ij) sqrt(Q_i / p^0_ij) for the standardized deviations, so that |x_ij| <= lambda/sqrt(delta) on M, the expansion can be expressed in matrix notation as

    theta-hat_h - theta_h^0 = N^(-1/2) [a^0_{h,1,1}  a^0_{h,1,2} ... a^0_{h,1,r_1} ... a^0_{h,t,1} ... a^0_{h,t,r_t}] D^0{}^(1/2) x + R_h.

From part (iii) of the theorem stated earlier we have

    [a^0_{h,1,1} ... a^0_{h,t,r_t}] D^0{}^(1/2) = (0 0 ... 1 ... 0)(B^0' B^0)^(-1) B^0',

where (0 0 ... 1 ... 0) is a vector with 1 in the h-th place and zeros elsewhere. Hence

    sqrt(N)(theta-hat - theta^0) = (B^0' B^0)^(-1) B^0' x + (K lambda^2 / sqrt(N)) alpha,   (2.7)

where theta-hat' = (theta-hat_1 ... theta-hat_m), theta^0' = (theta_1^0 ... theta_m^0), and alpha' = (alpha_1 alpha_2 ... alpha_m) with 0 <= |alpha_h| <= 1, h = 1, 2, ..., m.

Making use of the result that theta-hat_h - theta_h^0 is O(lambda/sqrt(N)) and |x_ij| <= lambda/sqrt(delta), Ogawa [18] has shown that the vector v of standardized residuals, whose squared length is the test statistic X^2, satisfies

    v = x - sqrt(N) B^0 (theta-hat - theta^0) + (K' lambda^2 / sqrt(N)) xi,   (2.8)

where xi' = (xi_1 xi_2 ... xi_m) is a random vector such that 0 <= |xi_h| <= 1, h = 1, 2, ..., m, and K' is a constant independent of N. Making use of (2.7) and (2.8) we see that

    v = [I - B^0 (B^0' B^0)^(-1) B^0'] x + z,   (2.9)

where z collects the remainder terms. From (2.9) it follows that, with a probability greater than 1 - sum_i sum_j p_ijN/(lambda - d)^2, every component of z satisfies |z_h| <= K'' lambda^2 / sqrt(N); choosing lambda so that lambda -> oo while lambda^2/sqrt(N) -> 0, both the exceptional probability and this bound tend to zero. Hence z converges to zero in probability.
The limiting distribution of v is multivariate normal with mean [I - B^0(B^0'B^0)^(-1)B^0'] b, where b is the vector with components gamma_ij sqrt(Q_i / p^0_ij), and with a covariance matrix I - PP', where P'(t x r) is the matrix whose i-th row contains sqrt(p^0_i1), sqrt(p^0_i2), ..., sqrt(p^0_ir_i) in the columns corresponding to the i-th sequence and zeros elsewhere:

    P'(t x r) =
    [ sqrt(p^0_11) ... sqrt(p^0_{1r_1})      0        ...      0           ...      0        ...      0           ]
    [      0       ...      0           sqrt(p^0_21) ... sqrt(p^0_{2r_2})  ...      0        ...      0           ]
    [     ...                                                                                                     ]
    [      0       ...      0                0        ...      0           ... sqrt(p^0_t1) ... sqrt(p^0_{tr_t})  ]

Now following Mitra's reasoning, it can be shown that

    Lt_{N->oo} Pr{X^2 >= C | H_1N} = 1 - F(C, r-m-t, Delta)

where

    Delta = b'b - b'B^0(B^0'B^0)^(-1)B^0' b,

and where F(C, n, Delta) is as defined in the Notation.
It should be noted that in the above proof the only place where the estimation procedure enters into the picture is where we require the estimates theta-hat_1, theta-hat_2, ..., theta-hat_m to possess these three properties:

i)  The functions theta-hat_h (h = 1, 2, ..., m) have continuous partial derivatives with respect to all the independent variables q_ij.

ii) The result of substituting

        q_ij = f_ij(theta_1, theta_2, ..., theta_m),   j = 1, 2, ..., r_i;  i = 1, 2, ..., t,

    in theta-hat_h(q) leads to the identity theta-hat_h = theta_h, h = 1, 2, ..., m.

iii) Let A_h' = [d theta-hat_h / d q_11 ... d theta-hat_h / d q_{tr_t}]. Then

        [A_1'; A_2'; ...; A_m'] = (B'B)^(-1) B' (D^(1/2))^(-1).

Consequently the result proved above remains valid for any estimating procedure that yields estimates possessing the above three properties.

Consider the statistic

    X^2 = sum_{i=1}^t N_i sum_{j=1}^{r_i} [q_ij - f_ij(theta-hat_1, ..., theta-hat_m)]^2 / f_ij(theta-hat_1, ..., theta-hat_m),

where {theta-hat_1, theta-hat_2, ..., theta-hat_m} are the estimates satisfying the three properties mentioned earlier. Neyman [17] has shown that the limiting distribution of this statistic under the null hypothesis is a chi-square distribution with r-m-t degrees of freedom. It can also be shown that

    Lt_{N->oo} Pr{X^2 >= C | H_1N} = 1 - F(C, r-m-t, Delta),

where Delta is the same non-centrality parameter as before.
2.4.4  Size of Type I error if misclassification is neglected

Suppose that there are errors in observation; however, we ignore them and use the usual test procedure, i.e., reject H0: p_j = p_j^0 (j = 1, 2, ..., r) if

    X^2 = sum_{j=1}^r (n_j - N p_j^0)^2 / (N p_j^0) > C

and do not reject otherwise; C being so chosen that Pr{chi^2(r-1) >= C} = a. In this case, what is the actual size of the Type I error? How much does it deviate from a; i.e., what is

    Lt_{N->oo} Pr{X^2 >= C | H0}?

If we keep theta fixed, the limit is equal to 1. Suppose we investigate the above limit when theta is arbitrarily close to zero. Let theta = e/sqrt(N). Now we wish to determine the same limit under this sequence. The above limit is equal to 1 - F(C, r-1, Delta_0) where

    Delta_0 = e^2 sum_{j=1}^r (1 - r p_j^0)^2 / p_j^0.

Since Delta_0 > 0, the actual size of the Type I error is greater than a. The function 1 - F(C, r-1, Delta_0) can be evaluated by Patnaik's [19] procedure or calculated from Fix's [10] tables.
44
To get an idea as to how
~will
change if misclassification is
ignored, let us consider the following examples:
Let
~
=0.05,
r =
4
and
{pjJ = 5/20,4/20, 8/20 and 3/20,
ar = 0.6000
J
. 1 - F(C,3,4o ) = 0.10.
Thus
.~
is doubled if misclassifioation is ignored.
Consider
~
= 0.05,
r
=4
and
{Pj}
>
= 1/20, 8/20, 8/20, 3/20
for
Sf
• 0.2230..
.0.0 •
0.779, 1 - F(C .. 3,60 )
= 0.10
6'
= 0.3658 .. .ao •
2.096, 1 - F(C,3..~0 )
= 0.20
3.302, 1 - F(C,3,o60 )
= 0.30.
sr • 0.4591,
40.0 •
45
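These figures can be reproduced with the exact noncentral chi-square distribution. The sketch below (Python with SciPy, assumed; the thesis itself used Patnaik's approximation and Fix's tables, so small rounding differences are expected) evaluates Delta_0 = e^2 sum (1 - r p_j^0)^2 / p_j^0 and the resulting actual size:

```python
from scipy.stats import chi2, ncx2

def actual_size(p0, eps, alpha=0.05):
    """Asymptotic Type I error when misclassification theta = eps/sqrt(N) is ignored."""
    r = len(p0)
    delta0 = eps**2 * sum((1 - r*pj)**2 / pj for pj in p0)   # noncentrality
    C = chi2.ppf(1 - alpha, df=r - 1)                        # nominal critical value
    return delta0, ncx2.sf(C, df=r - 1, nc=delta0)           # 1 - F(C, r-1, delta0)

p0 = [1/20, 8/20, 8/20, 3/20]
for eps in (0.2230, 0.3658, 0.4591):
    d0, size = actual_size(p0, eps)
    print(f"eps = {eps:.4f}:  Delta0 = {d0:.3f},  actual size = {size:.3f}")
```

For e = 0.2230 this gives Delta_0 close to 0.78, roughly doubling the nominal 5% level, in agreement with the text.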
2.5  Misclassification Only in the Neighboring Classes

In many practical situations it may be unnatural to assume that theta_jk = theta for all j != k. An illustration of this point is provided in "A Study of Smoking Habits of Individuals" by D. G. Horvitz and George Fordori [14]. The answers to a questionnaire were compared against a standard measure. This study revealed that for the particular class intervals chosen there was misclassification only in the neighboring classes; in more precise terms, theta_jk = 0 for |j-k| > 1. This leads us to believe that in some situations theta_jk should be a function of the "distance" between the j-th and the k-th class. One way of taking into account this distance would be: theta_jk = theta^{|j-k|} for j != k, 0 < theta < 1, and theta_jj = 1 - sum_{k != j} theta^{|j-k|}. However, we plan to consider the following case:

    theta_jk = theta       if |j-k| = 1,
    theta_jk = 1 - 2 theta if 1 < j = k < r,
    theta_jk = 1 - theta   if j = k = r or j = k = 1,
    theta_jk = 0           otherwise,

where

    0 < theta < 1 / [2(1 + cos pi/r)].
For this case the matrix theta(r x r) reduces to

    [ 1-theta    theta      0        0     ...    0        0        0     ]
    [  theta   1-2theta   theta      0     ...    0        0        0     ]
    [    0       theta   1-2theta  theta   ...    0        0        0     ]
    [   ...                                                    ...        ]
    [    0         0        0        0     ...  theta   1-2theta  theta   ]
    [    0         0        0        0     ...    0       theta   1-theta ]

We see that theta_jk >= 0 (j = 1, 2, ..., r; k = 1, 2, ..., r) for all theta in the interval

    [0, 1/(2(1 + cos pi/r))].
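A quick numerical check of this matrix (Python with NumPy, assumed; not part of the thesis) confirms that it is row-stochastic and non-negative throughout the stated interval:

```python
import numpy as np

def theta_matrix(r, theta):
    """Tridiagonal misclassification matrix for neighboring-class errors."""
    M = np.diag(np.full(r, 1 - 2*theta))
    M[0, 0] = M[-1, -1] = 1 - theta            # end classes lose mass on one side only
    idx = np.arange(r - 1)
    M[idx, idx + 1] = theta
    M[idx + 1, idx] = theta
    return M

r = 6
upper = 1 / (2 * (1 + np.cos(np.pi / r)))       # upper end of the admissible interval
for theta in np.linspace(0.0, upper, 7):
    M = theta_matrix(r, theta)
    assert np.allclose(M.sum(axis=1), 1.0)      # each row sums to one
    assert (M >= -1e-12).all()                  # all entries non-negative
```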
Let us now note that this is the largest interval which does not contain any values of theta for which ||theta|| = 0. To prove this we need to solve the determinantal equation ||theta|| = 0; that is, to evaluate the r-th order determinant

    D_r =
    | 1-theta    theta      0       ...    0        0        0     |
    |  theta   1-2theta   theta     ...    0        0        0     |
    |    0       theta   1-2theta   ...    0        0        0     |
    |   ...                                             ...        |
    |    0         0        0       ...  theta   1-2theta  theta   |
    |    0         0        0       ...    0       theta   1-theta |
This is known as a "continuant." Aitken [1] has given a method to evaluate determinants of this type. However, we would evaluate this by a slightly different method which would, after a few more calculations, give us the inverse of the matrix associated with D_r.

Let T(r-1 x r-1) denote the following triangular matrix:

    T =
    [  theta      0        0     ...     0       0   ]
    [ 1-2theta  theta      0     ...     0       0   ]
    [  theta   1-2theta  theta   ...     0       0   ]
    [   ...                                  ...     ]
    [    0        0        0     ...  1-2theta theta ]

Let

    delta_1(r-1 x 1) = [1-theta  theta  0 ... 0]'   and   delta_2'(1 x r-1) = [0  0 ... theta  1-theta].

Then

    D_r = (-1)^r |T| (delta_2' T^(-1) delta_1).*

The following proof is similar to that given by Roy [23]. Moving the first column of the continuant to the last position, which multiplies the determinant by (-1)^{r-1}, produces the bordered determinant

    | T        delta_1 |
    | delta_2'    0    |,

and this bordered determinant equals -|T| (delta_2' T^(-1) delta_1); combining the two facts gives the stated formula.

* T^(-1) exists for all theta != 0.
Let us proceed to evaluate T^(-1). We will adopt the following notation: a = theta and x = 1-2theta. It is well known that the inverse of a triangular matrix is a triangular matrix. Let T^(-1) = T_1 with

    T_1 =
    [ t_11        0         0      ...       0          0        ]
    [ t_21       t_22       0      ...       0          0        ]
    [  ...                                       ...             ]
    [ t_{r-2,1}  t_{r-2,2}  ...  t_{r-2,r-2}            0        ]
    [ t_{r-1,1}  t_{r-1,2}  ...  t_{r-1,r-2}       t_{r-1,r-1}   ]

The elements t_{i,j} of the above matrix are given by the following equations:

    a t_{i,i} = 1,                                        i >= 1,
    a t_{i,i-1} + x t_{i,i} = 0,                          i >= 2,
    a t_{i,j} + x t_{i-1,j} + a t_{i-2,j} = 0,            i >= 3,  j = 1, 2, ..., i-2.   (2.10)

From the last two equations we see that t_{i,i} = 1/a for all i and t_{i,i-1} = -x/a^2, i >= 2. In order to determine t_{i,j} for i >= 3, 1 <= j <= i-2, we must solve difference equation (2.10), which expresses t_{i,j} in terms of t_{i-1,j} and t_{i-2,j}. From equation (2.10), noting that a = theta != 0, we have

    t_{i,j} = -(x/a) t_{i-1,j} - t_{i-2,j}.

Let

    q_{i,j}** = [ t_{i,j}   ]        and        A = [ -x/a  -1 ]
                [ t_{i-1,j} ]                       [   1    0 ].

The difference equation then can be written as

    q_{i,j} = A q_{i-1,j},   (2.11)

from which

    q_{i,j} = A q_{i-1,j} = A^2 q_{i-2,j} = ... = A^{i-j-1} q_{j+1,j}.

We note that the solution depends only on (i-j), the difference between the row and column numbers; i.e., elements at the same distance from the main diagonal are equal to each other. In other words, t_{i,j} = t_{i-k,j-k}.

** Note that q_{i,j} defined here is not the same as q_ij defined in Section 2.4.2, where q_ij was defined as n_ij/N_i and was treated as a random variable.
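The recursion (2.10) is easy to check numerically. The sketch below (Python with NumPy, assumed; not part of the thesis) builds T and its inverse from the recursion and compares the result with a direct matrix inversion:

```python
import numpy as np

def T_and_inverse(r, theta):
    """Lower-triangular band matrix T (diagonal a, subdiagonal x, sub-subdiagonal a)
    and its inverse computed from recursion (2.10)."""
    a, x = theta, 1 - 2*theta
    m = r - 1
    T = np.zeros((m, m))
    t = np.zeros((m, m))
    for i in range(m):
        T[i, i] = a
        if i >= 1: T[i, i-1] = x
        if i >= 2: T[i, i-2] = a
        t[i, i] = 1/a                      # a t_ii = 1
        if i >= 1:
            t[i, i-1] = -x/a**2            # a t_{i,i-1} + x t_ii = 0
        for j in range(i-2, -1, -1):       # a t_ij + x t_{i-1,j} + a t_{i-2,j} = 0
            t[i, j] = -(x/a)*t[i-1, j] - t[i-2, j]
    return T, t

T, t = T_and_inverse(8, 0.1)
assert np.allclose(T @ t, np.eye(7))       # recursion reproduces the true inverse
assert np.isclose(t[4, 1], t[3, 0])        # entries depend only on i - j
```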
In what follows we shall consider two cases: (i) a = theta != 1/4; (ii) a = theta = 1/4.

Case (i):  a = theta != 1/4, i.e., x != 2a.

To express A^{i-j-1} in terms of A we shall make use of the following theorem given by Friedman [11]:

"If A is a matrix whose eigenvalues (characteristic roots), arranged in order of increasing absolute value, are lambda_1, lambda_2, ..., lambda_n, and if f(lambda) is any analytic function of lambda in a circle around the origin with radius greater than |lambda_n|, then f(A) equals r(A), the polynomial of degree n-1 for which f(lambda_k) = r(lambda_k), k = 1, 2, ..., n."

The characteristic equation of A is

    lambda(-x/a - lambda) + 1 = 0,   i.e.,   lambda^2 + (x/a) lambda + 1 = 0.

Since x != 2a, the above equation has two distinct roots. Let lambda_1 and lambda_2 be these roots; note that lambda_1 + lambda_2 = -x/a and lambda_1 lambda_2 = 1. Let A^{i-j-1} = alpha I + beta A. After substitution we find:

    lambda_1^{i-j-1} = alpha + beta lambda_1,
    lambda_2^{i-j-1} = alpha + beta lambda_2.

Now write u_k = t_{i,i-k}, which depends only on k; then u_0 = 1/a, u_1 = -x/a^2, and by (2.10)

    u_k = -(x/a) u_{k-1} - u_{k-2}.

We claim that u_k = (lambda_1^{k+1} - lambda_2^{k+1}) / [a(lambda_1 - lambda_2)]. The base cases hold, since (lambda_1 - lambda_2)/[a(lambda_1 - lambda_2)] = 1/a and (lambda_1^2 - lambda_2^2)/[a(lambda_1 - lambda_2)] = (lambda_1 + lambda_2)/a = -x/a^2; and for the induction step,

    -(x/a)(lambda_1^k - lambda_2^k) - (lambda_1^{k-1} - lambda_2^{k-1})
      = (lambda_1 + lambda_2)(lambda_1^k - lambda_2^k) - (lambda_1^{k-1} - lambda_2^{k-1})
      = (lambda_1^{k+1} - lambda_2^{k+1}) + lambda_1 lambda_2 (lambda_1^{k-1} - lambda_2^{k-1}) - (lambda_1^{k-1} - lambda_2^{k-1})
      = lambda_1^{k+1} - lambda_2^{k+1},

since lambda_1 lambda_2 = 1. Hence

    t_{i,j} = (lambda_1^{i-j+1} - lambda_2^{i-j+1}) / [a(lambda_1 - lambda_2)]   for i >= 3, 1 <= j <= i-2.
It can be seen that

    delta_2' T^(-1) delta_1 = (x+a)^2 t_{r-1,1} + a(x+a)[t_{r-1,2} + t_{r-2,1}] + a^2 t_{r-2,2}
                            = (x+a)^2 t_{r-1,1} + 2a(x+a) t_{r-1,2} + a^2 t_{r-2,2},

since t_{r-1,2} = t_{r-2,1}. Since

    t_{r-1,1} = (lambda_1^{r-1} - lambda_2^{r-1}) / [a(lambda_1 - lambda_2)],
    t_{r-1,2} = (lambda_1^{r-2} - lambda_2^{r-2}) / [a(lambda_1 - lambda_2)],
    t_{r-2,2} = (lambda_1^{r-3} - lambda_2^{r-3}) / [a(lambda_1 - lambda_2)],

the above expression reduces to

    [(x+a)^2(lambda_1^{r-1} - lambda_2^{r-1}) + 2a(x+a)(lambda_1^{r-2} - lambda_2^{r-2}) + a^2(lambda_1^{r-3} - lambda_2^{r-3})] / [a(lambda_1 - lambda_2)]
      = [a^2 lambda_1^{r-3}(((x+a)/a) lambda_1 + 1)^2 - a^2 lambda_2^{r-3}(((x+a)/a) lambda_2 + 1)^2] / [a(lambda_1 - lambda_2)].

Since a(lambda_1 + lambda_2) = -x and x = 1-2a, we have 1-a = a(1 - lambda_1 - lambda_2), whence

    (x+a) lambda_1 + a = (1-a) lambda_1 + a = a lambda_1 (1 - lambda_1),

and similarly for lambda_2. Therefore

    delta_2' T^(-1) delta_1 = [a / (lambda_1 - lambda_2)] [(1-lambda_1)^2 lambda_1^{r-1} - (1-lambda_2)^2 lambda_2^{r-1}].

Writing lambda for lambda_1, so that lambda_2 = 1/lambda, and using |T| = a^{r-1}, we obtain

    D_r = (-1)^r |T| (delta_2' T^(-1) delta_1) = (-a)^r (1-lambda)^2 (lambda^{2r} - 1) / [(lambda^2 - 1) lambda^r].
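The closed form can be checked against a direct evaluation of the determinant (Python with NumPy, assumed; not part of the thesis):

```python
import numpy as np

def D_direct(r, theta):
    """||theta||: determinant of the r x r tridiagonal misclassification matrix."""
    M = np.diag(np.full(r, 1 - 2*theta))
    M[0, 0] = M[-1, -1] = 1 - theta
    i = np.arange(r - 1)
    M[i, i+1] = M[i+1, i] = theta
    return np.linalg.det(M)

def D_closed(r, theta):
    """D_r = (-a)^r (1-lam)^2 (lam^(2r) - 1) / ((lam^2 - 1) lam^r), theta != 1/4."""
    a, x = theta, 1 - 2*theta
    lam = np.roots([1, x/a, 1]).astype(complex)[0]   # a root of lam^2 + (x/a) lam + 1 = 0
    val = (-a)**r * (1 - lam)**2 * (lam**(2*r) - 1) / ((lam**2 - 1) * lam**r)
    return val.real

for r, theta in [(3, 0.10), (5, 0.20), (6, 0.05)]:
    assert abs(D_direct(r, theta) - D_closed(r, theta)) < 1e-9

# D_r vanishes exactly at the endpoint a = 1/(2(1 + cos(pi/r))):
r = 5
assert abs(D_direct(r, 1 / (2 * (1 + np.cos(np.pi / r))))) < 1e-12
```

(Either root of the characteristic equation may be used, since the closed form is unchanged under lambda -> 1/lambda.)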
Case (ii):  a = theta = 1/4, and hence x = 1-2theta = 2theta = 2a.

We have seen that the characteristic equation of A is lambda^2 + (x/a) lambda + 1 = 0. When x = 2a it can be easily seen that lambda_1 = lambda_2 = -1. To express A^{i-j-1} in terms of A we make use of the following result by Friedman:

"When lambda_1 = lambda_2 = ... = lambda_nu, so that lambda_nu is an eigenvalue of multiplicity nu for A, the result used in case (i) should be modified as follows: r(lambda) is that polynomial of degree n-1 for which f(lambda_k) = r(lambda_k), k = nu+1, nu+2, ..., n, and also f^(j)(lambda_nu) = r^(j)(lambda_nu), j = 0, 1, 2, ..., nu-1, where f^(j)(lambda) = d^j f / d lambda^j."  [Friedman: Ex. 2.44, page 121]

Let A^{i-j-1} = alpha I + beta A. Then, with f(lambda) = lambda^{i-j-1},

    alpha + beta lambda = lambda^{i-j-1}   and   beta = (i-j-1) lambda^{i-j-2},   evaluated at lambda = -1.

Hence

    beta = (i-j-1)(-1)^{i-j-2} = (-1)^{i-j}(i-j-1)

and

    alpha = (-1)^{i-j-1} + beta = (-1)^{i-j}[(i-j-1) - 1] = (-1)^{i-j}(i-j-2).

Therefore, proceeding as in case (i),

    t_{i,j} = 4(-1)^{i-j}(i-j+1)   (for a = 1/4).

From which it follows that:

    t_{r-1,1} = 4(-1)^r (r-1),
    t_{r-1,2} = 4(-1)^{r-1}(r-2),
    t_{r-2,2} = 4(-1)^r (r-3),

and

    D_r = (-1)^r (1/4)^{r-1} [(3/4)^2 t_{r-1,1} + 2(1/4)(3/4) t_{r-1,2} + (1/4)^2 t_{r-2,2}]
        = r / 4^{r-1}.
Let us now proceed to determine the values of theta for which D_r = 0. Clearly D_r != 0 when theta = a = 1/4. Hence we need only consider the case theta != 1/4. When theta != 1/4 we have seen that

    D_r = (-a)^r (1-lambda)^2 (lambda^{2r} - 1) / [(lambda^2 - 1) lambda^r],

where lambda is a root of lambda^2 + (x/a) lambda + 1 = 0, so that lambda + 1/lambda = -x/a = (2a-1)/a. Hence the roots of D_r = 0 are given by the 2r-th roots of unity, excepting lambda = -1 and lambda = +1. These roots of unity are given by:

    lambda = cos[pi(1+t)/r] + i sin[pi(1+t)/r],   t = 0, 1, 2, ..., r-2, r, r+1, ..., 2r-2.

For such a root, lambda + 1/lambda = 2 cos[pi(1+t)/r], whence

    1/a = 2 - (lambda + 1/lambda) = 2 - 2 cos[pi(1+t)/r],

i.e.,

    a = 1 / {2[1 - cos pi(1+t)/r]}.

The smallest such value of a corresponds to t = r-2, for which

    a = 1 / [2(1 + cos pi/r)].

This proves the result.

However, if we only want theta to lie in the interval [0, 1/4], then we do not have to evaluate D_r; for when theta < 1/4 it can be easily seen that ||theta|| is dominated* by its diagonal elements. It has been proved [see Mirsky (15)] that a determinant dominated by its diagonal elements does not vanish.
It is easy to show that:

    p'_j = p_{j-1} theta + p_j (1-2theta) + p_{j+1} theta,   j = 2, 3, ..., r-1,
    p'_1 = p_1 (1-theta) + p_2 theta,
    p'_r = p_r (1-theta) + p_{r-1} theta.

Thus H0: p_j = p_j^0 (j = 1, 2, ..., r) is equivalent to H0': p'_j = p_j^0', where

    p_j^0' = p_j^0 + theta(p_{j-1}^0 - 2p_j^0 + p_{j+1}^0),   j = 2, 3, ..., r-1,
    p_1^0' = p_1^0 + theta(p_2^0 - p_1^0),
    p_r^0' = p_r^0 + theta(p_{r-1}^0 - p_r^0).

If p_j^0 = 1/r for all j = 1, 2, ..., r, then p_j^0' = p_j^0 and misclassification does not affect the usual test procedure.

From Section 2.2 it follows that the test procedure in this case will be: reject H0 if

    X'^2 = sum_{j=1}^r (n_j - N p_j^0')^2 / (N p_j^0') > C,

* A determinant |a_ij| of order n is dominated by its diagonal elements if |a_rr| > sum_{s != r} |a_rs| (r = 1, 2, ..., n).
C being so chosen that

    Pr{chi^2(r-1) >= C} = a  (desired level of significance).
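In code, the modified test simply replaces the null cell probabilities p_j^0 by the misclassified ones p_j^0'. A sketch (Python with SciPy, assumed; the counts are illustrative, not from the thesis):

```python
import numpy as np
from scipy.stats import chisquare

def misclassified_null(p0, theta):
    """p0' under neighboring-class misclassification with known theta."""
    p0 = np.asarray(p0, float)
    p = p0 * (1 - 2*theta)
    p[0], p[-1] = p0[0]*(1 - theta), p0[-1]*(1 - theta)
    p[:-1] += theta * p0[1:]      # mass received from the right neighbor
    p[1:]  += theta * p0[:-1]     # mass received from the left neighbor
    return p

p0, theta = np.array([0.1, 0.4, 0.2, 0.3]), 0.1
p0_mis = misclassified_null(p0, theta)
n = np.array([120, 360, 240, 280])                 # observed counts (illustrative)
stat, pval = chisquare(n, f_exp=n.sum()*p0_mis)    # chi-square test with r-1 df
```

Note that `misclassified_null` leaves the uniform distribution 1/r unchanged, in agreement with the remark above.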
Let {gamma_j} be a set of deviation parameters, not all zero, such that

    sum_{j=1}^r gamma_j = 0.

Denote by H_1N the hypothesis

    p_j = p_j^0 + gamma_j / sqrt(N) = p_jN.

The limiting power:

    Lt_{N->oo} Pr{X'^2 >= C | H_1N} = 1 - F(C, r-1, Delta')

where

    Delta' = sum_{j=1}^r (gamma'_j)^2 / p_j^0'

and

    gamma'_1 = gamma_1 (1-theta) + gamma_2 theta,
    gamma'_j = gamma_{j-1} theta + gamma_j (1-2theta) + gamma_{j+1} theta,   j = 2, ..., r-1,
    gamma'_r = gamma_r (1-theta) + gamma_{r-1} theta.

From Section 2.2.1 it follows that limiting power is reduced.
To illustrate that limiting power is reduced let us consider the following example. Let r = 5, let p_j^0 = 1/5, j = 1, 2, ..., r, and let us choose the deviation parameters such that

    gamma_1 = gamma_2 = gamma_3 = 0   and   gamma_4 = -gamma_5.

Then it is easy to see that

    gamma'_1 = gamma'_2 = 0,
    gamma'_3 = gamma_4 theta,
    gamma'_4 = gamma_4 (1 - 3theta),
    gamma'_5 = -gamma_4 (1 - 2theta),

so that

    Delta = sum_j gamma_j^2 / p_j^0 = 10 gamma_4^2

and

    Delta' = sum_j (gamma'_j)^2 / p_j^0' = 5 gamma_4^2 (14 theta^2 - 10 theta + 2),   0 < theta < 0.27639.

Now if we fix r, Delta and a and calculate Delta' for various values of theta, we will be able to compare the power function for the two situations. Thus if a = 0.05 and theta = 0.04832, then gamma_4 = 1.2412 gives Delta = 15.405 and Delta' = 11.935. Values of 1 - F(C, 4, 15.405) and 1 - F(C, 4, 11.935) from Fix's tables are seen to be 0.9 and 0.8 respectively. Table 3 shows the effect of misclassification on limiting power for the set of deviation parameters as mentioned above.
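This example can be reproduced with the exact noncentral chi-square distribution (Python with SciPy, assumed; the thesis used Fix's tables, so the values 0.9 and 0.8 are rounded):

```python
from scipy.stats import chi2, ncx2

def limiting_powers(gamma4, theta, r=5, alpha=0.05):
    """Limiting power without (Delta) and with (Delta') neighboring-class
    misclassification, for p_j = 1/r and gamma = (0, 0, 0, g, -g)."""
    delta   = 10 * gamma4**2
    delta_p = 5 * gamma4**2 * (14*theta**2 - 10*theta + 2)
    C = chi2.ppf(1 - alpha, df=r - 1)
    return ncx2.sf(C, r - 1, delta), ncx2.sf(C, r - 1, delta_p)

pow0, pow1 = limiting_powers(1.2412, 0.04832)
print(f"power without misclassification: {pow0:.3f}")   # close to 0.9
print(f"power with    misclassification: {pow1:.3f}")   # close to 0.8
```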
2.5.2  theta unknown.  Method of estimation: modified minimum chi-square

We have shown that

    p_j^0' = p_j^0 + theta(p_{j-1}^0 - 2p_j^0 + p_{j+1}^0),   j = 2, 3, ..., r-1,
    p_1^0' = p_1^0 + theta(p_2^0 - p_1^0),
    p_r^0' = p_r^0 + theta(p_{r-1}^0 - p_r^0),

and that if p_j^0 = 1/r then misclassification does not affect the usual test procedure. In what follows we will assume that there exists at least one j such that p_j^0' != p_j^0. It can be easily seen that existence of one
Table 3. Effect of misclassification of the type theta_jk = theta if |j-k| = 1; theta_jk = 1-2theta if 1 < j = k < r; theta_jk = 1-theta if j = k = r, 1; theta_jk = 0 otherwise; on limiting power when H0 is p_j = 1/r.

(i)  H0: p_j = 1/5;  a = 0.05;  r = 5;  gamma_1 = gamma_2 = gamma_3 = 0;  gamma_4 = -gamma_5 = 1.2412

        theta       Delta'    Power
        0           15.405    0.9
        0.04832     11.935    0.8
        0.08422      9.683    0.7
        0.11594      7.924    0.6
        0.14664      6.420    0.5
        0.17960      5.050    0.4
        0.21604      3.737    0.3
        0.27372      2.401    0.2

(ii)  H0: p_j = 1/5;  a = 0.05;  r = 5;  gamma_1 = gamma_2 = gamma_3 = 0;  gamma_4 = -gamma_5 (with Delta' = 6.420 at theta = 0)

        theta       Delta'    Power
        0            6.420    0.5
        0.04559      5.050    0.4
        0.09666      3.737    0.3
        0.16190      2.401    0.2

(iii)  H0: p_j = 1/5;  a = 0.01;  r = 5;  gamma_1 = gamma_2 = gamma_3 = 0;  gamma_4 = -gamma_5 = 1.4400

        theta       Delta'    Power
        0           20.737    0.9
        0.04079     16.749    0.8
        0.07084     14.121    0.7
        0.09709     12.039    0.6
        0.12225     10.231    0.5
        0.14823      8.557    0.4
        0.17736      6.914    0.3
        0.21419      5.188    0.2

(iv)  H0: p_j = 1/5;  a = 0.01;  r = 5;  gamma_1 = gamma_2 = gamma_3 = 0;  gamma_4 = -gamma_5 (with Delta' = 10.231 at theta = 0)

        theta       Delta'    Power
        0           10.231    0.5
        0.03438      8.557    0.4
        0.07212      6.914    0.3
        0.11811      5.188    0.2
        0.18784      3.149    0.1
j such that p_j^0' != p_j^0 implies the existence of at least one more k (!= j) such that p_k^0' != p_k^0. This is true since sum_j p_j^0' = sum_j p_j^0 = 1. Without limiting generality we can assume that p_j^0' != p_j^0 for all j = 1, 2, ..., r; the equations for estimating theta (both modified minimum chi-square and minimum chi-square) when some of the p_j^0' = p_j^0 are exactly of the same form as when none of the p_j^0' = p_j^0.
The modified minimum chi-square equation then is:

    sum_{j=1}^r n_j / (q_j - theta-hat) = 0,

where

    q_j = p_j^0 / (2p_j^0 - p_{j-1}^0 - p_{j+1}^0),   j = 2, 3, ..., r-1,
    q_1 = p_1^0 / (p_1^0 - p_2^0),
    q_r = p_r^0 / (p_r^0 - p_{r-1}^0),

so that p_j^0' = p_j^0 (1 - theta/q_j) for every j.
We note that neither can all q_j's be positive nor can all q_j's be negative. To see this consider:

    sum_{j=2}^{r-1} (2p_j^0 - p_{j-1}^0 - p_{j+1}^0) = 2 sum_{j=2}^{r-1} p_j^0 - sum_{j=1}^{r-2} p_j^0 - sum_{j=3}^r p_j^0
        = 2(1 - p_1^0 - p_r^0) - (1 - p_{r-1}^0 - p_r^0) - (1 - p_1^0 - p_2^0)
        = -(p_1^0 - p_2^0) - (p_r^0 - p_{r-1}^0).

Thus the denominators of q_1, q_2, ..., q_r add to zero. Since p_j^0 > 0 for all j = 1, 2, ..., r and no denominator vanishes, it follows that some q_j's are positive and some are negative. It is easy to see that the smallest of the positive q_j is greater than 1/2.
Let theta-hat denote the root of the equation

    sum_{j=1}^r n_j / (q_j - theta) = 0

such that theta-hat -> theta in probability. As in the previous case it can be shown that theta-hat lies between the largest negative q_j and the smallest positive q_j. This root can be obtained by Newton's iterative procedure.

The test procedure is: reject H0', and hence H0, if

    X'^2 = sum_{j=1}^r (n_j - N p-hat_j^0')^2 / (N p-hat_j^0') > C,

where

    p-hat_j^0' = p_j^0 (1 - theta-hat / q_j),   j = 1, 2, ..., r,

C being so chosen that

    Pr{chi^2(r-2) >= C} = a  (desired level of significance).
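A sketch of this estimation step (Python with NumPy, assumed; not part of the thesis). It solves sum_j n_j/(q_j - theta) = 0 by a bisection-safeguarded Newton iteration on the interval between the largest negative and the smallest positive q_j, and checks that expected counts n_j = N p_j^0'(theta_0) return theta_0 exactly:

```python
import numpy as np

def q_values(p0):
    """q_j such that p0'_j = p0_j (1 - theta/q_j)."""
    p = np.asarray(p0, float)
    q = np.empty_like(p)
    q[0]    = p[0]  / (p[0]  - p[1])
    q[-1]   = p[-1] / (p[-1] - p[-2])
    q[1:-1] = p[1:-1] / (2*p[1:-1] - p[:-2] - p[2:])
    return q

def estimate_theta(n, q, tol=1e-12):
    """Newton's method, safeguarded by bisection, for sum_j n_j/(q_j - theta) = 0
    on (largest negative q_j, smallest positive q_j)."""
    n, q = np.asarray(n, float), np.asarray(q, float)
    lo, hi = q[q < 0].max(), q[q > 0].min()
    g = lambda th: np.sum(n / (q - th))              # increasing between the two poles
    theta = 0.5 * (lo + hi)
    for _ in range(200):
        gt = g(theta)
        lo, hi = (theta, hi) if gt < 0 else (lo, theta)
        new = theta - gt / np.sum(n / (q - theta)**2)   # Newton step (g' > 0)
        if not lo < new < hi:
            new = 0.5 * (lo + hi)                        # bisection fallback
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta

p0, theta0 = np.array([0.1, 0.4, 0.2, 0.3]), 0.1
q = q_values(p0)
n = 1000 * p0 * (1 - theta0 / q)                 # expected counts under theta0
assert abs(estimate_theta(n, q) - theta0) < 1e-9
```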
In order to study the limiting power in this case let {gamma_j} be a set of deviation parameters, not all zero, such that

    sum_{j=1}^r gamma_j = 0.

We wish to study

    Lt_{N->oo} Pr{X'^2 >= C | H_1N},

where H_1N is as defined in Section 2.5.1. Now

    Lt_{N->oo} Pr{X'^2 >= C | H_1N} = 1 - F(C, r-2, Delta'),

where Delta' is the corresponding non-centrality parameter, the gamma'_j (j = 1, 2, ..., r) being as defined in Section 2.5.1; by a method similar to that in Section 2.2.2 its value can be obtained. If there were no misclassification the limiting power would be

    1 - F(C, r-1, Delta),   where   Delta = sum_{j=1}^r gamma_j^2 / p_j^0.

Thus it follows that Delta' < Delta.
CHAPTER III

STRATIFIED AND RANDOM SAMPLING SITUATIONS

3.1  Stratified Sampling Situations

Consider the problem of drawing t independent samples from t different populations, the i-th sample consisting of N_i independent observations from the i-th population (i = 1, 2, ..., t). It is assumed that these populations are comparable in the sense that every individual from each population can be classified into one of the same set of r mutually exclusive and exhaustive categories C_j (j = 1, 2, ..., r). Thus every individual in the total sample of size

    N = sum_{i=1}^t N_i

is characterized by the two indices "i" and "j" of two different types, "i" representing the population from which it is drawn, while "j" represents the particular category to which it belongs. This is usually summarized by saying that the individual belongs to the (i,j)-th cell of a table of cross-classification. The hypothesis we are interested in testing is that the t populations are the same.
In this case errors of observation can arise in two directions:

(i)  In the "i" direction, e.g., an observation is classified in the first population when in fact it comes from the second.

(ii) In the "j" direction, e.g., an observation may be classified in the first category when in fact it belongs to the second.

We will assume that there are no errors in the "i" direction. Let theta_ijj' denote the chance of an observation belonging to the (i,j) cell being wrongly classified into the (i,j') cell, j != j', and theta_ijj represent the probability of correctly assigning the observation of the (i,j) cell to the (i,j) cell. Clearly

    sum_{j'} theta_ijj' = 1.

We shall assume:

(i)  theta_ijj' = theta*_jj' for all i. In other words, theta_ijj' does not depend on "i."

(ii) Let theta(r x r) denote the following matrix:

        theta(r x r) = (theta*_jj'),   j, j' = 1, 2, ..., r;

     theta(r x r) is assumed to be non-singular.

Let p_ij (sum_j p_ij = 1) denote the probability of an observation falling in the (i,j) cell when there are no errors in observation. Let p'_ij (sum_j p'_ij = 1) denote the probability of an observation being assigned to the (i,j) cell when there are errors in observation.
to the (ij) cell when there are errors in observations.
ph
P12
...
pir
•
••
p'tl
p·b
...
P:L1
=
Ptr
Pu
•
•
·
Ptl Pt2
...
Pl.r
8 itU
8"'12
...
elifl'lr
...
Ptr
8"rl
8*r2
...
e~rr
Now we wish to test the hypothesis that
Ho'•
p ..
lJ
= Pj
j = 1,2,···r
i
Clearly
= 1,2,···t.
Clearly, if the matrix theta(r x r) is non-singular, then H0 holds if and only if

    H0': p'_ij = p'_j   (i = 1, 2, ..., t),

where

    p'_j = sum_{l=1}^r p_l theta*_lj.

Hence rejection or acceptance of H0' implies rejection or acceptance of H0. Therefore the test procedure is: reject H0' and hence H0 if

    X^2 = sum_{i=1}^t sum_{j=1}^r (n_ij - N_i. N_.j / N)^2 / (N_i. N_.j / N) > C

and accept otherwise; C, N_i., N_.j, etc. being defined as follows:

    Pr{chi^2((r-1)(t-1)) >= C} = a,
    N_.j = sum_{i=1}^t n_ij,   N_i. = sum_{j=1}^r n_ij,   sum_{j=1}^r N_.j = N.

In this case it is seen that we get the same test procedure as we would have obtained if there had been no errors in observations.
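The invariance is easy to see numerically: if the t populations share one category distribution, then after any common column misclassification they still share one, so H0' holds whenever H0 does. A Python/NumPy sketch (assumed; not in the original):

```python
import numpy as np

rng = np.random.default_rng(0)
t, r, eps = 3, 4, 0.05
p = rng.dirichlet(np.ones(r))                        # common row distribution under H0
P = np.tile(p, (t, 1))                               # t identical populations
Theta = eps*np.ones((r, r)) + (1 - r*eps)*np.eye(r)  # a non-singular stochastic matrix
P_mis = P @ Theta                                    # cell probabilities with errors

assert np.allclose(Theta.sum(axis=1), 1.0)           # rows of Theta sum to one
assert np.allclose(P_mis, np.tile(P_mis[0], (t, 1))) # all rows of P_mis still identical
```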
3.2  Limiting Power for Stratified Sampling Situations

Before discussing the limiting power we would like to state a result proved by Mitra [16] regarding the asymptotic power function for the stratified sampling situations when the probabilities p_ij under H0 are functions of m independent parameters. We will evaluate the non-centrality parameter involved in the power function for our particular case; i.e., when the p_ij under H0 are functions of m = r-1 independent parameters.

Let H0 denote the hypothesis

    p_ij = p_ij(theta_1^0, theta_2^0, ..., theta_m^0)

and H_1N denote the hypothesis:

    p_ij = p_ij(theta_1^0, theta_2^0, ..., theta_m^0) + gamma_ij / sqrt(N),   sum_{j=1}^r gamma_ij = 0   (i = 1, 2, ..., t).

Let

    X^2 = sum_i sum_j (n_ij - N_i p-hat_ij)^2 / (N_i p-hat_ij),

where p-hat_ij denotes p_ij evaluated at a BAN estimate of (theta_1, ..., theta_m). The limiting power of X^2 is defined as Lt_{N->oo} Pr{X^2 >= C | H_1N}. The above limit is 1 - F(C, rt-t-m, Delta_1) where

    Delta_1 = b'b - b'B(B'B)^(-1)B'b,

with

    B(rt x m) = ( sqrt(Q_i / p_ij^0) d p_ij / d theta_k, evaluated at theta = theta^0 )

and

    b'(1 x rt) = ( gamma_ij sqrt(Q_i / p_ij^0) ).

Let us determine the value of this non-centrality parameter in our case if there were no errors in observation. The hypothesis we are testing is

    H0: p_ij = p_j,   where sum_j p_ij = sum_j p_j = 1.

We have r-1 independent parameters {p_1, p_2, ..., p_{r-1}}, with p_r = 1 - p_1 - p_2 - ... - p_{r-1}. Hence

    d p_ij / d p_k =  0   if j != k, r;
                      1   if j = k;
                     -1   if j = r;

and

    B(rt x r-1) = [ sqrt(Q_1) W ]
                  [ sqrt(Q_2) W ]
                  [     ...     ]
                  [ sqrt(Q_t) W ]
where

    W(r x r-1) = [ D(r-1 x r-1) ]
                 [ d'(1 x r-1)  ]

with

    D(r-1 x r-1) = diag( 1/sqrt(p_1^0), 1/sqrt(p_2^0), ..., 1/sqrt(p_{r-1}^0) )

and

    d'(1 x r-1) = [ -1/sqrt(p_r^0)  -1/sqrt(p_r^0)  ...  -1/sqrt(p_r^0) ].

Then

    B'B = (Q_1 + Q_2 + ... + Q_t) W'W = W'W,

and

    W'W =
    [ 1/p_1^0 + 1/p_r^0       1/p_r^0            ...       1/p_r^0           ]
    [ 1/p_r^0             1/p_2^0 + 1/p_r^0      ...       1/p_r^0           ]
    [   ...                                                  ...             ]
    [ 1/p_r^0                 1/p_r^0            ...  1/p_{r-1}^0 + 1/p_r^0  ],

    (W'W)^(-1) =
    [ p_1^0(1-p_1^0)      -p_1^0 p_2^0        ...    -p_1^0 p_{r-1}^0       ]
    [ -p_2^0 p_1^0        p_2^0(1-p_2^0)      ...    -p_2^0 p_{r-1}^0       ]
    [   ...                                            ...                  ]
    [ -p_{r-1}^0 p_1^0    -p_{r-1}^0 p_2^0    ...   p_{r-1}^0(1-p_{r-1}^0)  ].
Now let us calculate W(W'W)^(-1)W'. Carrying out the multiplication, W(W'W)^(-1)W' is the r x r matrix with diagonal elements 1 - p_j^0 and off-diagonal elements -sqrt(p_j^0 p_k^0); it will be denoted by S(r x r). Then

    B(B'B)^(-1)B' = [ sqrt(Q_1) W ] (W'W)^(-1) [ sqrt(Q_1) W'   sqrt(Q_2) W'  ...  sqrt(Q_t) W' ]
                    [ sqrt(Q_2) W ]
                    [     ...     ]
                    [ sqrt(Q_t) W ]

                  = [    Q_1 S         sqrt(Q_1 Q_2) S   ...   sqrt(Q_1 Q_t) S ]
                    [ sqrt(Q_2 Q_1) S     Q_2 S          ...   sqrt(Q_2 Q_t) S ]
                    [     ...                                       ...        ]
                    [ sqrt(Q_t Q_1) S  sqrt(Q_t Q_2) S   ...      Q_t S        ].
Let

    b'(1 x rt) = [ sqrt(Q_1) g_1'   sqrt(Q_2) g_2'  ...  sqrt(Q_t) g_t' ],

where

    g_i'(1 x r) = [ gamma_i1/sqrt(p_1^0)   gamma_i2/sqrt(p_2^0)  ...  gamma_ir/sqrt(p_r^0) ].

Now

    g_i'(1 x r) S(r x r) = g_i' - (sum_j gamma_ij) [ sqrt(p_1^0)  sqrt(p_2^0)  ...  sqrt(p_r^0) ],

and noting that sum_j gamma_ij = 0, it is observed that

    g_i' S = g_i'.

Now it is easy to see that

    b'B(B'B)^(-1)B' = [ sqrt(Q_1) Y'   sqrt(Q_2) Y'  ...  sqrt(Q_t) Y' ],

where

    Y'(1 x r) = Q_1 g_1' + Q_2 g_2' + ... + Q_t g_t'.
The final step is to evaluate

    Delta_1 = b'b - b'B(B'B)^(-1)B'b.

To this effect let us first calculate b'B(B'B)^(-1)B'b. Noting that Q_1 + Q_2 + ... + Q_t = 1, it is easy to see that

    b'B(B'B)^(-1)B'b = Y'Y,   where Y = sum_{i=1}^t Q_i g_i,

while b'b = sum_{i=1}^t Q_i g_i' g_i. Now it is easy to check that

    Delta_1 = b'[I - B(B'B)^(-1)B']b = sum_{i,h=1; i<h}^t Q_i Q_h sum_{j=1}^r (gamma_ij - gamma_hj)^2 / p_j^0.
Now let us see what happens to the limiting power when there are errors in observation. We want

    Lt_{N->oo} Pr{X^2 >= C | H_1N},

where H_1N is the hypothesis

    p_ij = p_j + gamma_ij / sqrt(N).

Note that H_1N is equivalent to H_1N':

    p'_ij = p'_j + gamma'_ij / sqrt(N),   where   gamma'_ij = sum_{l=1}^r gamma_il theta*_lj.

Hence the limit is

    1 - F[C, (r-1)(t-1), Delta_H'],

where

    Delta_H' = sum_{i,h=1; i<h}^t Q_i Q_h sum_{j=1}^r (gamma'_ij - gamma'_hj)^2 / p'_j.

We have already seen that if there were no errors in observations this limit would be

    1 - F[C, (r-1)(t-1), Delta_H],   where   Delta_H = sum_{i,h=1; i<h}^t Q_i Q_h sum_{j=1}^r (gamma_ij - gamma_hj)^2 / p_j.

We shall now show that in general Delta_H' <= Delta_H, and Delta_H' < Delta_H if none of the theta*_jj' are zero.
75
Equality holds if and only if [e.g., see Hardy, Littlewood and
Polya Theorem 7 (12)]
where)J. is some constant not equal to zero.
If none of
e...e.j
the above equality implies
<t= 1,2, •••r).
But this cannot be true because
r
1<
(u-'fht.)
J=l
and
r
LIi
.t-l
Hence if none of the
Hence
e.,lj
= 1•
are zero
=0
-
for all i and h
-
are zero
76
But
..
Since this is ture for all iand
h it
follows that in general
if none of Qttj are zero.
We again have the result that limiting power is reduced due to
errors in observation.
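The inequality is easy to probe numerically. The Python/NumPy sketch below (assumed; not in the thesis) draws random strata weights, deviation parameters and a strictly positive stochastic matrix, and compares the two non-centrality parameters:

```python
import numpy as np

def noncentrality(gam, Q, p):
    """Delta_H = sum_{i<h} Q_i Q_h sum_j (gam_ij - gam_hj)^2 / p_j."""
    t = len(Q)
    return sum(Q[i]*Q[h]*np.sum((gam[i]-gam[h])**2 / p)
               for i in range(t) for h in range(i+1, t))

rng = np.random.default_rng(42)
t, r = 3, 5
p = rng.dirichlet(np.ones(r))                  # common category probabilities
Q = rng.dirichlet(np.ones(t))                  # strata proportions
gam = rng.normal(size=(t, r))
gam -= gam.mean(axis=1, keepdims=True)         # each row of deviations sums to zero
Theta = rng.dirichlet(np.ones(r), size=r)      # rows strictly positive, summing to 1

delta   = noncentrality(gam, Q, p)
delta_p = noncentrality(gam @ Theta, Q, p @ Theta)   # gamma' = gamma Theta, p' = p Theta
assert delta_p < delta                                # misclassification reduces Delta_H
```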
3.3  Problems in a t x r Contingency Table (Random Sampling)

Suppose that N individuals are randomly taken from some population and then classified according to two variable arguments in a two-way table. It is often required to test that the two variable arguments are independent.

In this case also errors of observation can arise in both directions, viz., the "i" and the "j" direction. However, we will consider the case when there are errors in observation only in one direction. We will make the same assumptions regarding the probabilities theta_ijj' and the stochastic matrix theta as were made in Section 3.1.

Let p_ij denote the probability that a randomly chosen individual belongs to the i-th row and the j-th column of the table when there are no errors in observation. Let p'_ij be the probability of an individual being assigned to the (i,j) cell when there are errors in observation. Let

    P'(t x r) = ( p'_ij )   and   P(t x r) = ( p_ij ).

As in Section 3.1 we have

    P'(t x r) = P(t x r) theta(r x r).

Let

    sum_{j=1}^r p_ij = p_i.   (i = 1, 2, ..., t)   and   sum_{i=1}^t p_ij = p_.j   (j = 1, 2, ..., r).

Clearly sum_i p_i. = sum_j p_.j = 1. The hypothesis of independence is then equivalent to
    H0: p_ij = p_i. p_.j.

Let

    p'_i. = sum_{j=1}^r p'_ij   and   p'_.j = sum_{i=1}^t p'_ij.

Again sum_i p'_i. = sum_j p'_.j = 1. To see how the marginals transform we may proceed as follows. Let J'(1 x r) = (1 1 ... 1) and J_1'(1 x t) = (1 1 ... 1). Multiplying both sides of the equation P' = P theta on the right by J(r x 1), and noting that theta J = J, we see that

    [ p'_1.  p'_2. ... p'_t. ]' = [ p_1.  p_2. ... p_t. ]'.

Similarly, multiplying both sides of the equation on the left by J_1'(1 x t), we get

    [ p'_.1  p'_.2  ...  p'_.r ] = [ p_.1  p_.2  ...  p_.r ] theta(r x r).

Similarly it can be shown that if

    H0: p_ij = p_i. p_.j,

then

    H0': p'_ij = p'_i. p'_.j.

The test procedure is: reject H0' and hence H0 if

    X^2 = sum_{i=1}^t sum_{j=1}^r (n_ij - N_i. N_.j / N)^2 / (N_i. N_.j / N) > C

and do not reject otherwise, the quantities n_ij, N_i., N_.j, N, and C being defined exactly the same way as in Section 3.1.
3.4  Limiting Power for Random Sampling

Let {gamma_ij} be a set of deviation parameters, not all zero, such that

    sum_i gamma_ij = 0 = sum_j gamma_ij.

Limiting power is then defined as

    Lt_{N->oo} Pr{X^2 >= C | H_1N},

where H_1N is the hypothesis

    p_ij = p_i. p_.j + gamma_ij / sqrt(N).

It can be shown that this limit is

    1 - F(C, (r-1)(t-1), Delta'),

where Delta' is the non-centrality parameter computed from the p'_ij and the

    gamma'_ij = sum_{l=1}^r gamma_il theta*_lj.

If there were no misclassification the above limit would be

    1 - F(C, (r-1)(t-1), Delta),

with Delta computed from the p_ij and the gamma_ij. By using the Cauchy-Schwarz inequality it can be shown that in general Delta' <= Delta, and Delta' < Delta if none of the theta*_lj are zero.
81
CHAPTER IV
SUMMARY AND CONCLUSIONS
This thesis is devoted to the investigation of the effect of misclassification on the
12
tests in the analysis of categorical data.
The
problems of categorical data that have been considered are the goodness
of fit test and t x r contingency tables for
a) stratified sampling situations and
b) random sampling situations.
For the goodness of fit test the frequency table consists of r
classes and the null hypothesis specifies the probabilities of an obser-
ejk
vation falling in the different classes.
has been denoted as the
probability of wrongly classifying an individual belonging to the j th
class into the k
th
class and 6
jj
as the probability of correctly classi-
fying an individual belonging to the jth class in the jth class.
Finally
e(r x r) has been used to denote the matrix
j = 1,2,· .. ,r
k
= 1,2, ••• ,r.
Two cases have been considered:
i)
e(r x r) is non-singular and known.
ii) e( r x r) is non-singular but unknown.
When e(r x r) is non-singular and known, it is shown that the usual
test procedure has to be modified (except in the case when the null hypothesis is Pj
= l/r
j
= 1,2, ••• ,r].
The limiting power function has been
found; and it is shown that misclassification reduces the limiting power.
For the null hypothesis:
Pj
= l/r
j
= 1,2,···,r
and for a few selected sets of deviation parameters, the limiting power has been computed both when misclassification is present and when it is absent. The following types of misclassification are considered in the computations:

a)  theta_jk = theta if j != k;  theta_jj = 1 - (r-1)theta;  where 0 < theta < 1/r.

b)  theta_jk = theta if |j-k| = 1;  theta_jk = 1-2theta if 1 < j = k < r;  theta_jk = 1-theta if j = k = r or 1;  theta_jk = 0 otherwise;  where 0 < theta < 1/[2(1 + cos pi/r)].

Tables 2 and 3 illustrate how the limiting power is reduced as theta increases, for cases a) and b) respectively. These numerical calculations show that misclassification becomes more serious as the significance probability decreases.
To demonstrate the effect of neglecting misclassification, the type of misclassification stated in a) has been considered. It is shown that for a fixed value of theta the level of significance of the usual test tends to 1 as the sample size increases indefinitely; while for values of theta arbitrarily close to zero (theta being of the order 1/sqrt(N)) the level of significance of the said test is greater than a.

When theta(r x r) is non-singular but unknown it is shown that in general the number of independent parameters to be estimated exceeds that of the classes. In order to reduce the number of unknown independent elements in the matrix theta(r x r), theta_jj and theta_jk are expressed in terms of a single parameter theta, as defined in a) and b) above. Here no knowledge of theta is assumed other than that it lies in the range [0, 1/r] in case a) and [0, 1/(2(1 + cos pi/r))] in case b). It is proved that the usual test procedure has to be modified except when the null hypothesis is

    p_j = 1/r,   j = 1, 2, ..., r.

When the null hypothesis is other than this, two estimation procedures for estimating theta have been considered: modified minimum chi-square and minimum chi-square. It is shown that misclassification reduces the limiting power.
For contingency tables, the following assumptions have been made
regarding the nature of misclassification:
i)
ii)
Errors of observations arise only in one direction, say in
the
tI
Let
eijj ,
j
II
direction.
denote the probability of an observation belonging
to the (i, j) cell being wrongly' classified into (i, j') cell.
It is assumed that
words
eijj ,
II
eijj ,
does not depend upon i.
In other
6*jjl.
iii) It is assumed that the stochastic matrix:
j .. 1,2,··.,r
j'
II
1,2, ••• ,r
is non-singular.
Under these assumptions, it is shown that the test procedure for testing the "usual" hypotheses in the two situations (stratified sampling and random sampling) is unaffected by misclassification, the "usual" hypotheses being respectively: "The populations are identical" and "The two variable arguments are independent." Although misclassification does not affect the test procedures, it does reduce the limiting power.
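The mechanism behind this can be seen numerically: if p_ij = p_i. p_.j and errors act only on the column index through a stochastic matrix that does not depend on the row, the observed table still factors into its margins, so the independence hypothesis is undisturbed. The following sketch (my illustration; the margins and the matrix T are hypothetical numbers) verifies this:

```python
# One-directional misclassification preserves independence (illustration).
p_row = [0.5, 0.3, 0.2]                  # p_i.
p_col = [0.4, 0.3, 0.2, 0.1]             # p_.j
T = [[0.85, 0.05, 0.05, 0.05],           # theta*_jj', rows sum to 1
     [0.05, 0.85, 0.05, 0.05],
     [0.05, 0.05, 0.85, 0.05],
     [0.05, 0.05, 0.05, 0.85]]

# true independent table, pushed through the column misclassification
p = [[a * b for b in p_col] for a in p_row]
p_obs = [[sum(p[i][j] * T[j][jp] for j in range(4)) for jp in range(4)]
         for i in range(3)]

# observed margins
row_m = [sum(row) for row in p_obs]
col_m = [sum(p_obs[i][j] for i in range(3)) for j in range(4)]

# the observed table still factors into its margins
ok = all(abs(p_obs[i][j] - row_m[i] * col_m[j]) < 1e-12
         for i in range(3) for j in range(4))
print(ok)                                 # True: independence preserved
```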
The following problem should be mentioned for further research. In Chapter III, where the problems of contingency tables were considered, it was assumed that errors of observation could arise only in one direction. This takes care of a certain set of problems. Problems do arise, however, where errors can arise in both directions. Let us consider the following example: N students are taken, and to each one of them two tests are given, one in English and one in mathematics. On the basis of these tests the students are categorized in the two-way table given below.
                                  Mathematics
  English          Good      Fair      Average   Unsatisfactory

  Good             n11       n12       n13       n14
  Satisfactory     n21       n22       n23       n24
  Unsatisfactory   n31       n32       n33       n34
                                                              N
On the basis of these data we wish to test the hypothesis that the two variable arguments are independent. In this case errors of observation can arise in both directions. Let θ_ii'jj' denote the chance of wrongly classifying an individual belonging to the (i, j) cell into the (i', j') cell. Clearly

    Σ_{i'} Σ_{j'} θ_ii'jj' = 1.

Let p_ij denote the probability of an observation falling in the (i, j) cell when there are no errors of observation, and let p*_ij denote the probability of an observation being assigned to the (i, j) cell when there are errors of observation. It is easy to see that

    p*_i'j' = Σ_i Σ_j p_ij θ_ii'jj',    i, i' = 1, 2, 3;  j, j' = 1, 2, 3, 4.
To illustrate the complexity of the problem let us take a rather simple case. Suppose

    θ_ii'jj' = θ        if (i, j) ≠ (i', j'),
    θ_iijj   = 1 − 11θ,

where θ is known and 0 < θ < 1/12. Let

    p(1 x 12)  = (p_11, p_12, ..., p_34),
    p*(1 x 12) = (p*_11, p*_12, ..., p*_34),

and let

    Θ(12 x 12) =  | 1 − 11θ      θ      ...      θ     |
                  |    θ      1 − 11θ   ...      θ     |
                  |    .         .                .     |
                  |    θ         θ      ...   1 − 11θ  |

In this case it is easy to see that

    p*(1 x 12) = p(1 x 12) Θ(12 x 12).
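This matrix relation collapses cell by cell to p*_ij = p_ij(1 − 12θ) + θ, which can be confirmed directly. In the sketch below (an illustration of mine, not part of the thesis; the cell probabilities p and the value of θ are hypothetical), the full 12 x 12 product is compared against the closed form:

```python
# Check: p*(1 x 12) = p(1 x 12) Theta(12 x 12) reduces to
# p*_ij = p_ij * (1 - 12*theta) + theta for every cell.
theta = 0.04                              # any value in (0, 1/12)
n = 12
Theta = [[1 - 11 * theta if a == b else theta for b in range(n)]
         for a in range(n)]

# some true cell probabilities p_11, p_12, ..., p_34 (sum to 1)
p = [0.10, 0.05, 0.15, 0.05, 0.10, 0.10,
     0.05, 0.10, 0.05, 0.10, 0.05, 0.10]

p_star = [sum(p[a] * Theta[a][b] for a in range(n)) for b in range(n)]
closed = [pb * (1 - 12 * theta) + theta for pb in p]

print(all(abs(x - y) < 1e-12 for x, y in zip(p_star, closed)))  # True
print(abs(sum(p_star) - 1) < 1e-12)                             # True
```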
Let

    Σ_{j=1}^{4} p_ij = p_i. ,    i = 1, 2, 3,
    Σ_{i=1}^{3} p_ij = p_.j ,    j = 1, 2, 3, 4.

The hypothesis of independence is:

    p_ij = p_i. p_.j ,    i = 1, 2, 3;  j = 1, 2, 3, 4.

Under the null hypothesis the joint density of the observations is given by

    const. Π_i Π_j [p_i. p_.j + θ(1 − 12 p_i. p_.j)]^(n_ij).
In order to obtain the maximum likelihood estimates of the parameters p_i., p_.j we have to maximize this density subject to the constraints Σ_i p_i. = 1 and Σ_j p_.j = 1. Let

    φ = Σ_i Σ_j n_ij log[p_i. p_.j (1 − 12θ) + θ] + λ[Σ_i p_i. − 1] + μ[Σ_j p_.j − 1].

Differentiating φ with respect to p_i. we get

    Σ_{j=1}^{4} n_ij p_.j (1 − 12θ) / [p_i. p_.j (1 − 12θ) + θ] + λ = 0 ,    i = 1, 2, 3,
which can be written as

    Σ_{j=1}^{4} n_ij p_.j / [p_i. p_.j + θ'] + λ = 0 ,    i = 1, 2, 3,

where θ' = θ/(1 − 12θ). Similarly, differentiating φ with respect to p_.j we obtain

    Σ_{i=1}^{3} n_ij p_i. / [p_i. p_.j + θ'] + μ = 0 ,    j = 1, 2, 3, 4.
These equations are difficult to solve. They will be much more complex when θ is unknown and has to be estimated from the data. The equations that are obtained by the method of minimum χ² are also difficult to solve. So there has to be developed either a different method of generating BAN estimates or an iterative procedure which would enable us to solve equations of the type mentioned above.
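One possible iterative scheme of the kind called for here is an EM-style iteration (a sketch of my own construction; the thesis does not develop it, and the count table below is hypothetical). The true cell (k, l) behind each observed cell (i, j) is treated as a latent variable; the E-step computes expected true-cell counts under the current margins, and the M-step re-fits the margins from those counts:

```python
# EM-style iteration for the likelihood equations above (illustration).
theta = 0.03
R, C = 3, 4

# hypothetical observed counts n_ij (any 3 x 4 table would do)
n = [[30, 20, 15, 10],
     [25, 35, 20, 15],
     [10, 15, 25, 30]]
N = sum(map(sum, n))

p_row = [1.0 / R] * R           # starting values for p_i.
p_col = [1.0 / C] * C           # starting values for p_.j

for _ in range(200):
    # current true-cell probabilities under independence
    p = [[p_row[k] * p_col[l] for l in range(C)] for k in range(R)]
    m = [[0.0] * C for _ in range(R)]       # expected true counts
    for i in range(R):
        for j in range(C):
            # posterior weight of true cell (k, l) given observed (i, j)
            w = [[p[k][l] * (1 - 11 * theta if (k, l) == (i, j) else theta)
                  for l in range(C)] for k in range(R)]
            tot = sum(map(sum, w))
            for k in range(R):
                for l in range(C):
                    m[k][l] += n[i][j] * w[k][l] / tot
    # M-step: re-fit the margins from the expected true counts
    p_row = [sum(m[k]) / N for k in range(R)]
    p_col = [sum(m[k][l] for k in range(R)) / N for l in range(C)]

print([round(x, 3) for x in p_row])
print([round(x, 3) for x in p_col])
```

Each pass cannot decrease the observed likelihood, so the iteration supplies estimates without solving the stationarity equations directly.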
The model that we have indicated, θ_ii'jj' = θ if (i, j) ≠ (i', j') and θ_iijj = 1 − 11θ, also may be an over-simplification. Work needs to be done to build a model which would be realistic in many situations and yet not make the estimation procedures too complicated.
LITERATURE CITED

1. Aitken, A. C. 1948. Determinants and matrices. Oliver and Boyd, Edinburgh.

2. Barnard, G. A. 1947. Significance tests for 2 x 2 tables. Biometrika 34: 248-252.

3. Bross, I. 1954. Misclassification in 2 x 2 tables. Biometrics 10: 478-486.

8. Fisher, R. A. 1922. On the interpretation of chi-square from contingency tables and the calculation of P. Journal of the Royal Statistical Society 85: 87-94.

9. Fisher, R. A. 1924. The conditions under which χ² measures the discrepancy between observation and hypothesis. Journal of the Royal Statistical Society 87: 442-450.

10. Fix, E. 1949. Tables of non-central χ². University of California Publications in Statistics 1(2): 15-19.

11. Friedman, B. 1956. Principles and techniques of applied mathematics. John Wiley and Sons, Inc., New York.

12. Hardy, G. H., Littlewood, J. E., and Polya, G. 1934. Inequalities. Cambridge University Press, Cambridge.

13. Hobson, E. W. 1927. The theory of functions of a real variable and the theory of Fourier's series. Vol. 1. Cambridge University Press, Cambridge.

14. Horvitz, D. G., and Foradori, G. 1956. A study of smoking habits of individuals. North Carolina Institute of Statistics Mimeograph Series No. 160.

15. Mirsky, L. 1955. An introduction to linear algebra. Clarendon Press, Oxford.

21. Pearson, K. 1900. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series 5, 50: 157-172.

22. Pitman, E. J. G. 1948. Unpublished lecture notes on non-parametric inference.

23. Roy, S. N. 1954. A report on some aspects of multivariate analysis. North Carolina Institute of Statistics Mimeograph Series No. 121.

24. Roy, S. N., and Mitra, S. K. 1955. An introduction to some non-parametric generalizations of analysis of variance and multivariate analysis. North Carolina Institute of Statistics Mimeograph Series No. 134.

25. Rubin, T., Rosenbaum, J., and Cobb, S. 1956. The use of interview data for the detection of association in field studies. Journal of Chronic Diseases 4: 253-266.

26. Wald, A. 1949. Note on the consistency of the maximum likelihood estimate. Annals of Mathematical Statistics 20: 595-601.

27. Whittaker, E. T., and Watson, G. N. 1952. A course of modern analysis. Cambridge University Press, Cambridge.

28. Yule, G. U. 1911. An introduction to the theory of statistics. Charles Griffin and Co., Ltd., London.
For case 2), in general, the test procedure will be affected by misclassification.