Van Dat, Nguyen (1982). "Tests for Time-Space Clustering of Disease."

TESTS FOR TIME-SPACE CLUSTERING OF DISEASE
by
Nguyen Van Dat
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1409
July 1982
TESTS FOR TIME-SPACE CLUSTERING OF DISEASE
by
Nguyen Van Dat
A dissertation submitted to the faculty of
the University of North Carolina at Chapel
Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Biostatistics
Chapel Hill
1982
Reader
NGUYEN VAN DAT. Tests for Time-Space Clustering of Disease. (Under the direction of ROGER C. GRIMSON.)
Several statistical tests for time-space clustering of disease have been suggested by different authors. Some commonly used tests are reviewed and their differences and similarities are discussed. Two new alternative tests that do not have some of the drawbacks of the others are suggested: the number-of-empty-cells test and the zero-one matrix test.

The exact distribution of the number-of-empty-cells test statistic is derived, together with the exact moments of the zero-one matrix test statistic. Since these moments are too complicated for most practical testing situations, a way to approximate the first and second moments is suggested, based on the exact moments calculated for a wide range of the parameters. Using these approximations and the asymptotic normal property of the test statistic, the approximate test can be carried out with relative ease.

The performance of the zero-one matrix test is compared with its counterparts in the literature: the EMM test and the scan test. Examples drawn from North Carolina mortality data are given to illustrate situations in which one test is better than the others.

Generalizations of the zero-one matrix test are also suggested to adjust for extraneous factors and for multivariate applications, using maximum likelihood procedures and Euclidean distances respectively.

Finally, a summarized step-by-step practical guide for the test of time or space clustering is presented, together with some suggestions for further research.
ACKNOWLEDGEMENTS

I would like to express my appreciation to my advisor, Dr. Roger C. Grimson, for his initial suggestion of the topic and, more importantly, for his enthusiastic guidance and continuous support in the course of this research. Gratitude is also expressed to Drs. Michael D. Hogan, Norman L. Johnson, Carl M. Shy and Michael J. Symons, who served on my advisory committee.

I would also like to thank Dr. Lawrence L. Kupper for his guidance and support during my first academic experience at the University of North Carolina, and Dr. William Mendenhall III for his advice during the first years of my statistics career at the University of Florida and his suggestion that I pursue a biostatistics training program at the University of North Carolina.

For the skillful typing of the manuscript, I must thank Ms. Ruth Bahr, whose work can only be appreciated by those who have seen my handwriting. The valuable education which I obtained from the faculty of the Department of Biostatistics is gratefully acknowledged.

Last but not least, I would like to thank my friends and my family halfway around the earth for their encouragement and moral support, which helped me maintain a semblance of sanity and made this work possible.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS ............................................. ii
LIST OF TABLES ................................................ v
LIST OF FIGURES AND PLOTS .................................... vi
LIST OF APPENDICES .......................................... vii

CHAPTER

I    INTRODUCTION AND REVIEW OF THE LITERATURE ................ 1
     1.0  Introduction ......................................... 1
     1.1  Literature review .................................... 2
          1.1.1  Pinkel and Nefzger's test ..................... 2
          1.1.2  Knox's test ................................... 3
          1.1.3  Barton and David's test ....................... 5
          1.1.4  The EMM procedure ............................. 5
          1.1.5  Grimson's cluster model ....................... 6
          1.1.6  Mantel's test ................................. 7
          1.1.7  The scan test ................................. 8
          1.1.8  Bailar, Eisenberg and Mantel's test ........... 9

II   INTRODUCTION OF SOME NEW TESTS .......................... 11
     2.0  Introduction ........................................ 11
     2.1  Number of empty cells test .......................... 11
     2.2  Zero-one matrix test ................................ 15
          2.2.1  Description of test .......................... 15
                 (i)   Test for time clustering ............... 16
                 (ii)  Test for space clustering .............. 17
                 (iii) Test for time-space interaction ........ 18
                 (iv)  General properties of the zero-one
                       matrix test ............................ 18
          2.2.2  Exact distribution of the test statistic
                 for time clustering .......................... 20
          2.2.3  Adjustment for better approximation
                 (continuity correction) ...................... 24
          2.2.4  Simulation results ........................... 24

III  REGRESSION ESTIMATION FOR THE ZERO-ONE MATRIX TEST
     FOR TIME CLUSTERING ...................................... 26
     3.0  Introduction ........................................ 26
     3.1  Binomial estimates versus exact values .............. 26
     3.2  Estimate the variance using the expected value ...... 28
     3.3  Estimate the expected value ......................... 29
     3.4  Appropriateness of the regression estimation ........ 31

IV   COMPARING THE ZERO-ONE MATRIX TEST FOR TIME CLUSTERING
     WITH THE EMM TEST AND THE SCAN TEST ...................... 43
     4.0  Introduction ........................................ 43
     4.1  Situations in which the zero-one matrix test
          is more powerful than the EMM test .................. 43
     4.2  Situations in which the EMM test is more
          powerful than the zero-one matrix test .............. 46
     4.3  Zero-one matrix test for time clustering
          versus the scan test ................................ 48

V    GENERALIZED ZERO-ONE MATRIX TEST FOR TIME CLUSTERING
     WITH ADJUSTMENT FOR EXTRANEOUS FACTORS; MULTIVARIATE
     APPLICATIONS ............................................. 50
     5.0  Introduction ........................................ 50
     5.1  Zero-one matrix test for time clustering with
          adjustment for extraneous factors ................... 50
     5.2  Zero-one matrix test in multivariate cases .......... 56
          5.2.1  Classifying data points into two groups ...... 56
          5.2.2  Assessing the significance level of
                 the test ..................................... 58

VI   SUMMARY, PRACTICAL GUIDE, AND SUGGESTIONS FOR FURTHER
     RESEARCH ................................................. 68
     6.0  Summary ............................................. 68
     6.1  Practical guide ..................................... 69
     6.2  Suggestions for further research .................... 75

APPENDICES .................................................... 77

REFERENCES .................................................... 93
LIST OF TABLES

Table                                                           Page

3.1  Approximate and exact values of E(A) and Var(A)
     for n = 5(1)10, 12, 15, 18, 20(5)50 and m = 5(1)10 ......... 33

3.2  Means of the ratio and of the difference between
     the approximate and the exact moments by number
     of cells, based on results presented in Table 3.1 ......... 37

3.3  Regressing the exact variance on the exact expected
     value of the test statistic A - Test for time
     clustering ................................................ 38

3.4  Regressing the exact expected value of A on the
     number of time units m and the decimal part of (n/m)
     minus 0.5 ................................................. 39

3.5  Regressing α (probability that n_i ≥ [n/m - 0.5])
     on the decimal part of (n/m) - 0.5 ........................ 40

5.1  Mortality statistics for 1980 - North Carolina
     residents - Diseases of the heart ......................... 63

5.2  Mortality statistics for 1980 - North Carolina
     residents - Cancer ........................................ 64

5.3  1980 heart disease and cancer death rate in
     North Carolina by Health Service Region ................... 65

5.4  Euclidean distance between 2 regions in North
     Carolina based on heart disease and cancer
     death rates ............................................... 66
LIST OF FIGURES AND PLOTS

                                                                Page

Plot 3.1  RE1 = Residual of the expected value in % ............ 41

Plot 3.2  RE2 = Residual of the variance in % .................. 42

Fig. 5.1  North Carolina counties and eight health
          service regions ...................................... 62

Fig. 5.2  Dendrogram from single linkage clustering
          on Euclidean distance of heart disease and
          cancer death rates in the eight health
          service regions in North Carolina .................... 67
LIST OF APPENDICES

Appendix                                                        Page

1.1  Program to use the IMSL subroutine for the
     Incomplete Beta function .................................. 77

1.2  FORTRAN program to calculate Sum-2, which is
     used for I_p^2(r,n) ....................................... 80

1.3  SAS program to calculate I_p^2(r,n) from Sum-2 ............ 83

2.1  Standardized residuals from the regression
     estimate of the expected value and variance
     of A ...................................................... 86
CHAPTER I
INTRODUCTION AND REVIEW OF LITERATURE
1.0  Introduction
In the recent literature, many epidemiologic studies are found attempting to attach significance to what seems to be a large number of cases of a given disease in some particular place and/or at a given time. Recent examples of reported "clusters" of disease include the case of the town of Rutherford, New Jersey, where public health authorities became alarmed when it was found that the community of 20,000 had had 32 cases of leukemia, lymphoma and Hodgkin's disease in a five-year period. Residents of Utah are claiming an unusually high cancer rate in that state related to nuclear testing in the area approximately twenty years ago. The Center for Disease Control has published numerous reports on "clusters" of disease. Residents of the area surrounding the Three Mile Island nuclear facility in Pennsylvania are being followed for evidence of unusual "clusters" of health problems.

While evidence that a particular disease does cluster in time and/or space would not necessarily prove anything about its etiology, it could provide clues in the development of theories concerning the cause of that disease or support existing theories. Recently many important studies have sought to link disease clusters to environmental factors.

Unlike many infectious diseases, where time-space clusters are easily recognized due to their relatively confined and isolated patterns, chronic diseases with lower incidence rates and longer latent periods present special difficulties in ascertaining patterns.
Several different statistical tests have been suggested by different
authors in an attempt to deal with this problem.
As part of this study
these tests will be reviewed and their similarities and differences will
be discussed.
Unlike the branch of multivariate statistical methods generally known
as cluster analysis, time-space clustering methods do not seek to define
clusters.
Rather they test for the presence of a time cluster, a space
cluster or a time-space interaction effect.
The analysis is an attempt to
determine whether disease cases tend to be closer together in time and/or
."
~-
•
Q-
closer together in space than one would expect if the cases occurred randomly.
While the methods have typically been applied to disease data, many
other applications have been conceived using these techniques.
1.1  Literature review
As a review of the literature, the following tests have been recommended by different authors:
1.1.1  Pinkel and Nefzger's test:  Pinkel and Nefzger (1959) recommended a test based on a classical combinatorial problem: if r cases are classified into n cells, P(k) is the probability of placing k or more cases in cells that contain at least one other case.

To perform the test the authors recommend dividing the study area and time period into time-space units (subjectively) and observing the maximum number of cases in a time-space unit, k_0 + 1, then calculating the P-value of the test by

    P = Σ_{k=k_0}^{r-1} P(k).

One concludes that there is a time-space cluster if P < a, where a is the level of the test.

This test has been criticized as being more sensitive to clustering in time or space alone and as requiring a uniform population distribution (see Mantel (1967)). It is not clear how the authors derived the combinatorial formula for P(k).
1.1.2  Knox's test:  In his 1963 paper Knox suggested a test using a contingency table approach. In this paper he divided the space distances under study into 5 categories and the range of 800 days under study into 9 intervals and formed a 5x9 contingency table, the cell entries being the number of pairs falling within the appropriate time-space unit. Although Knox did analyze the table using the usual chi-square test, he indicated that this is not really appropriate since the cell entries are not independent.
Abe (1973) has derived the appropriate way to analyze such a table. Instead of using the Knox statistic

    K = Σ_{i=1}^{r} Σ_{j=1}^{c} (O_ij - E_ij)² / E_ij

as a chi-square statistic with (r-1)(c-1) degrees of freedom, where

    O_ij = observed entries of the contingency table,
    E_ij = expected value of O_ij,

Abe argued that an appropriate chi-square statistic is d'V^{-1}d, where the elements of d' (a row vector) are the deviations (O_ij - E_ij) of the elements of Knox's table from their expectation, and V^{-1} is the inverse of their variance-covariance matrix. This statistic has been shown to have the correct expectation of a chi-square distribution.
In a paper of the next year (1964-b) Knox analyzed the problem in a 2x2 contingency table by classifying each of the n(n-1)/2 possible pairs of cases as close or far apart in space and close or far apart in time, as the following 2x2 table shows.

                          Space
                     close      far       Total
    Time   close       a         b         a+b
           far         c         d         c+d
           Total      a+c       b+d      n(n-1)/2
The statistic a is the number of pairs which are close in both time and space. Statistical significance is calculated by treating the null distribution of this statistic as Poisson, with expectation calculated from the marginal totals in the usual manner. The Poisson assumption is only appropriate when such pairs can be considered rare events. In this case, the test is considered a good approximation test, more sensitive to time-space clustering (interaction) than to clustering in time or space alone. Alternatives to the Poisson approximation are the normal approximation for large numbers and the permutation distribution for small numbers. Knox's method was extended by Pike and Smith (1968) to take into account assumed periods of infectivity and susceptibility of cases. This allows for investigations of diseases with long latent periods.

Roberson (1979) has shown that the choice of boundaries for the "far" and "close" criteria has some effect on the test outcome.
1.1.3  Barton and David's test:  Barton et al. (1965) and David and Barton (1966) introduced a procedure analogous to ANOVA methods. They divided the time span of study into subintervals and calculated the statistic as the average squared spatial distance between all points within subintervals divided by the overall average squared spatial distance between all points. Under the null hypothesis of no clustering, this statistic has an expected value of 1, while if clustering is present the expected value is less than 1. The significance level of this statistic is determined by comparing the observed value with the permutation distribution derived by pairing each location in time with all possible locations in space.
The authors also discussed an approximation to the permutation distribution, using a beta function which in turn can be approximated by the normal distribution when the number of time subintervals is large (k-1 ≥ 30σ_Q n) or by the F distribution when the number of time subintervals is small (k-1 < 30σ_Q n), where

    k = number of time subintervals,
    n = number of observations,
    Q = test statistic, and
    σ_Q² = variance of Q.
The major drawbacks to using this test are the lack of criteria for
determining the subintervals, the difficulty in computing the significance
level
of the test statistic, and the fact that the smaller distances which
are of greater interest have less influence on the value of the statistic
than the larger distances.
1.1.4  The EMM procedure:  Ederer, Myers and Mantel (1964) recommended a procedure for time clustering. The time period under study is divided into k subintervals over which n cases of the disease are observed. The test statistic is a, the maximum number of cases in a subinterval. The null distribution of a is derived by a combinatorial procedure; its mean and variance, along with the p-value, can be calculated based on this distribution.

The procedure was also extended to test for time-space clustering by dividing both the time of study and the geographic region into subintervals called time-space subunits. Each time unit has a fixed number of time-space subunits, m. Let a_i be the maximum number of cases in a subunit of the i-th unit, i = 1, 2, ..., k; the distribution of a_i is that of the maximum of a multinomial distribution. A chi-square statistic with 1 degree of freedom is then formed from the a_i.

Mantel, Kryscio and Myers (1976) tabulated the expected value and variance of a_i for some selected values of n cases and m subunits in each unit.

The disadvantage of this test is its sensitivity to time or space clustering alone as well as to time-space interaction.
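The EMM statistic rests on the null distribution of the maximum cell count of a multinomial with equal cell probabilities. The sketch below is an illustration added here (it is not the combinatorial derivation used in the thesis): it approximates that null distribution by Monte Carlo and returns an upper-tail p-value for an observed maximum.

```python
import numpy as np

def emm_max_null(n_cases, k_subintervals, a_observed, n_sim=100_000, seed=0):
    """Monte Carlo approximation of P(max cell count >= a_observed) when
    n_cases fall at random into k_subintervals of equal probability."""
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n_cases, [1.0 / k_subintervals] * k_subintervals,
                             size=n_sim)
    max_counts = counts.max(axis=1)
    p_value = np.mean(max_counts >= a_observed)
    return p_value, max_counts.mean(), max_counts.var()

# Example: 10 cases in 5 equal subintervals, observed maximum of 6
# p, mean_a, var_a = emm_max_null(10, 5, 6)
```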
1.1.5  Grimson's cluster model:  Grimson (1979) suggested a model using a procedure similar to the EMM test. This model considers only the number of cases; it does not take into account the fact that cases are distinct. The model uses the number of cases in a time-space subunit rather than the arrangement of cases in a unit.

Let a_j be the number of cases in subunit j (j = 1, 2, ..., k); then, based on the composition of the number of cases r (r = Σ_{j=1}^{k} a_j), the probability P{max(a_j) ≥ m | r} is evaluated. Based on this composition model the limit distribution of the test statistic, max(a_j), was proved to be geometric rather than Poisson, as in the multinomial model of the EMM test. The expected value and variance of the test statistic were computed for the case k = 5 and r = 2 through 400. This approach was used in modeling hepatitis data.
1.1.6  Mantel's test:  Mantel (1967) gave a generalized regression approach which contains both the Knox and the Barton-David procedures as special cases. He suggested a statistic of the form

    Σ_{i<j} h(x_ij) g(y_ij),

where

    x_ij = spatial distance between cases i and j,
    y_ij = temporal distance between cases i and j,

and h(·) and g(·) are transformation functions for spatial distance and temporal distance respectively.
The exact p-value for an observed statistic is obtained from the permutation distribution derived by evaluating the statistic for every possible pairing of time and space locations. For even a moderately small sample size this is not simple to compute. Mantel gave some general formulae for the permutation mean and variance of the statistic. In most cases, however, it may be more practical to use Monte Carlo methods to estimate p-values, as suggested by Besag and Diggle (1977), or to use the asymptotic distribution as an approximation.

The choice of the functions h(·) and g(·) was also discussed. Mantel favored transformations which emphasize closeness rather than great distances, as this should increase the power of the test to detect clustering if it is present. He suggested taking the temporal and spatial transformation functions to be the reciprocals of the absolute time and space distances respectively. To avoid problems with zero distances he suggested adding a small constant to each distance. No criteria were set by the author for selecting the values of these constants, and their values do affect the results of the analysis, as discussed by Roberson (1979). Mantel's procedure uses, where c and c' are the constants,

    h(x_ij) = 1 / (x_ij + c),    g(y_ij) = 1 / (y_ij + c').

Note that if we let

    h(x_ij) = 1 if the i-th and j-th cases are close in space, 0 otherwise,
    g(y_ij) = 1 if the i-th and j-th cases are close in time, 0 otherwise,

then a = Σ_{i<j} h(x_ij) g(y_ij) is the Knox statistic. Mantel also pointed out that Knox's test with the significance level evaluated using its permutation distribution or normal approximation is probably more applicable in many situations than the one evaluated using the Poisson distribution as suggested by Knox.

Asymptotic properties as well as Monte Carlo results on the robustness of this test and the Knox test were also investigated by Roberson (1979).
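A minimal Monte Carlo version of Mantel's permutation procedure is sketched below (added for illustration; the reciprocal transforms and the constants c and c' follow the description above, but the particular values and function names are arbitrary). The time labels are permuted relative to the spatial locations and the statistic is recomputed for each permutation.

```python
import numpy as np

def mantel_statistic(space_dist, time_dist, c=1.0, c_prime=1.0):
    """Sum over pairs i<j of h(x_ij)*g(y_ij) with reciprocal transforms."""
    iu = np.triu_indices(space_dist.shape[0], k=1)
    h = 1.0 / (space_dist[iu] + c)
    g = 1.0 / (time_dist[iu] + c_prime)
    return float(np.sum(h * g))

def mantel_permutation_test(space_dist, time_dist, n_perm=999, seed=0):
    """Monte Carlo p-value: permute the time coordinate against space."""
    rng = np.random.default_rng(seed)
    n = space_dist.shape[0]
    observed = mantel_statistic(space_dist, time_dist)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        permuted_time = time_dist[np.ix_(perm, perm)]
        if mantel_statistic(space_dist, permuted_time) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```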
1.1.7  The scan test:  In this test, the maximum number of cases in a fixed interval of time within a fixed spatial unit is used as the test statistic. This statistic was first investigated in detail by Naus (1965, 1966), who derived its probability distribution, expectation and variance. Later, Wallenstein (1980) investigated this statistic and tabulated the lower tail of its distribution for some selected values of n (scan statistic), N (total number of disease cases), and L (time period under study divided by the length of the time interval of the test statistic). Naus (1982) suggested some approximations for the distribution of the scan statistic.

This test is probably more powerful than the EMM test and does not need a criterion for dividing the time under study into fixed intervals. However, the test statistic is not easily assessed and, due to the complexity of its distribution, its significance level (p-value) is much more difficult to evaluate than that of the EMM statistic in most practical situations.
1.1.8  Bailar, Eisenberg and Mantel's test:  Bailar, Eisenberg and Mantel (1970) introduced the following test. Let

    n   be the total number of disease cases,
    N   be the number of years (time units) over which the n disease cases occur,
    p_i be the probability that a case occurs in the i-th year (time unit),
    n_i be the number of cases observed in the i-th year (time unit),
        i = 1, 2, ..., N  (Σ_{i=1}^{N} n_i = n).

Then n_i · n_{i+d} is the number of pairs formed where the first case occurs in the i-th year and the second case occurs in the (i+d)-th year. The statistic is t, the total number of pairs occurring in years exactly d years apart,

    t = Σ_{i=1}^{N-d} n_i n_{i+d}

(d = 1, or d = period of infectivity and susceptibility.)
Writing P_i = p_{i-d} + p_{i+d}, with p_t = 0 for t ≤ 0 or t > N, it was proved that

    E(t) = [n(n-1)/2] Σ_{i=1}^{N} p_i P_i,

    Var(t) = [n(n-1)/2] Σ_{i=1}^{N} p_i P_i + n(n-1)(n-2) Σ_{i=1}^{N} p_i P_i²
             - [n(n-1)/2] (2n-3) ( Σ_{i=1}^{N} p_i P_i )².
The p-value of this test statistic is evaluated by using the normal distribution as an approximation, with the above expected value and variance. This test has not been used widely and the appropriateness of the approximation has not been investigated. It seems to be appropriate for testing for clustering in time or for testing for a latent period of length d.
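The moment formulas above translate directly into a small routine; the sketch below (illustrative only, with hypothetical function and variable names) computes t, E(t) and Var(t) from the yearly counts n_i and assumed probabilities p_i, and returns the normal-approximation p-value.

```python
import numpy as np
from math import erfc, sqrt

def bem_test(counts, probs, d=1):
    """Bailar-Eisenberg-Mantel statistic with normal approximation.

    counts : observed cases per year, n_1..n_N
    probs  : probability a case falls in each year, p_1..p_N (sums to 1)
    d      : lag in years defining a "pair"
    """
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n, N = counts.sum(), len(counts)

    t = float(np.sum(counts[:-d] * counts[d:]))       # sum of n_i * n_{i+d}

    # P_i = p_{i-d} + p_{i+d}, with out-of-range probabilities taken as 0
    padded = np.concatenate([np.zeros(d), probs, np.zeros(d)])
    P = padded[:N] + padded[2 * d:2 * d + N]

    s1 = float(np.sum(probs * P))                     # sum p_i P_i
    s2 = float(np.sum(probs * P ** 2))                # sum p_i P_i^2
    pairs = n * (n - 1) / 2.0
    mean_t = pairs * s1
    var_t = pairs * s1 + n * (n - 1) * (n - 2) * s2 - pairs * (2 * n - 3) * s1 ** 2

    z = (t - mean_t) / sqrt(var_t)
    p_upper = 0.5 * erfc(z / sqrt(2.0))               # one-sided P(T >= t)
    return t, mean_t, var_t, p_upper
```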
CHAPTER II

INTRODUCTION OF SOME POSSIBLE NEW TESTS

2.0  Introduction
Although many different tests have been suggested, these tests have the following drawbacks:

- They involve some subjective steps to determine criteria which affect the outcome of the test.
- While they are mainly sensitive to time-space clustering, they tend to be somewhat sensitive to clustering in time or space alone.
- The distributions of the test statistics are complicated and the significance levels cannot be assessed easily in most practical situations, even with approximations.
- They are only one-tailed tests of clustering.

In an attempt to overcome these drawbacks, the following tests are suggested:

- the number-of-empty-cells test, for rare events;
- the zero-one matrix tests, for events with larger frequencies.
2.1  Number-of-empty-cells test:

Most of the derivations of the statistical formulas of this section are known (see Feller, Johnson and Kotz, and Riordan). This section will show how an occupancy distribution of the number of empty cells may be used as a cluster test.

After dividing the time and space under study into time-space units, similarly to what is done in the EMM test, the statistic of interest is X, the number of units without disease cases. Example: dividing 10 years of North Carolina mortality data into 1-year time units and 100 counties (space units) will provide 1000 time-space subunits.

Let n be the total number of cases occurring in m time-space units. Each unit has a_j cases, j = 1, 2, ..., m;  0 ≤ a_j ≤ n;  Σ_{j=1}^{m} a_j = n.
If X is the number of units with no cases, then under the null hypothesis of no clustering we have, for the special case X = 0,

    P(X=0) = Σ_{i=0}^{m-1} (-1)^i C(m,i) (1 - i/m)^n      (for n ≥ m)        (2.1)

(when n < m, P(X=0) = 0).

To see this, assume equal probability of having cases in any unit. If E_j is the event that unit j has no cases, then

    P(E_j) = (1 - 1/m)^n,
    P(E_j ∩ E_j') = (1 - 2/m)^n,

and therefore

    P(X=0) = 1 - P(∪_{j=1}^{m} E_j)
           = 1 - [ Σ_{j=1}^{m} P(E_j) - ΣΣ_{j<j'} P(E_j ∩ E_j') + ··· ].

Note that the last term, (1 - m/m)^n, is zero. Therefore

    P(X=0) = Σ_{i=0}^{m-1} (-1)^i C(m,i) (1 - i/m)^n,

which establishes equation (2.1).

Next, let m ≥ k ≥ max(0, m-n). Then

    P(X=k) = C(m,k) Σ_{j=0}^{m-k} (-1)^j C(m-k,j) (1 - (k+j)/m)^n.           (2.2)
Proof of (2.2) is as follows:

    P(X=k) = C(m,k) P(first k cells are empty ∩ last m-k cells are occupied)
           = C(m,k) P(all cases are in the last m-k cells and all of these m-k cells are occupied).

Applying the inclusion-exclusion principle to the last m-k cells gives equation (2.2).

The expected value of X is given by

    E(X) = m (1 - 1/m)^n.                                                     (2.4)

To see this, let X_j = 1 if the j-th cell is empty and X_j = 0 if the j-th cell is occupied. Then E(X_j) = (1 - 1/m)^n and

    E(X) = Σ_{j=1}^{m} E(X_j) = m (1 - 1/m)^n.

Similarly,

    Var(X) = m(m-1)(1 - 2/m)^n + m(1 - 1/m)^n - m²(1 - 1/m)^{2n}.            (2.5)
The descending factorial moments of X are given by

    μ_(R)(X) = E{X(X-1)···(X-R+1)} = E(X^(R)) = m^(R) (1 - R/m)^n,           (2.6)

where m^(R) = m(m-1)···(m-R+1). Equation (2.6) is derived from equation (2.2): if n ≥ m then P(X=0) > 0, while if n < m then P(X=0) = 0; terms with 0 ≤ k < R contribute nothing to E(X^(R)) since k^(R) = 0 there, and after substituting k' = k - R and m' = m - R the remaining double sum simplifies to unity (see (2.2)), leaving m^(R)(1 - R/m)^n.

Based on (2.2), the exact p-value for this test statistic can be computed readily.
This test is only useful in situations where the disease incidence rate is low enough that we would expect some time-space units to have no disease cases (E(X) > 0). In order for E(X) = m/2 (to have approximately half of the cells empty) we should have 0.65m < n < 0.7m for m between 10 and 1000; since E(X) = m(1 - 1/m)^n, in order for E(X) = m/2 we must have (1 - 1/m)^n = 1/2, or

    n = log 2 / [log m - log(m-1)];

    for m = 10,    n ≈ 6.5788,
    for m = 100,   n ≈ 68.9675,
    for m = 1000,  n ≈ 692.8006.

The test can be generalized by using as a test statistic the number of cells containing less than a certain number of cases, say a.

This test will not be very powerful in detecting clustering if n is too large (or too small) relative to m, since in these situations the test statistic will most likely assume a value of zero (or m-n) under a random allocation of cases.
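Equation (2.2) can be evaluated directly; the sketch below is illustrative (it is not the author's FORTRAN/SAS programs listed in the appendices) and computes the exact null distribution of X and both tail probabilities for an observed count of empty cells.

```python
from math import comb

def empty_cells_pmf(m, n):
    """Exact P(X = k), equation (2.2): n cases dropped at random into m cells."""
    pmf = {}
    for k in range(max(0, m - n), m):            # X = m is impossible when n >= 1
        total = 0.0
        for j in range(m - k + 1):
            total += (-1) ** j * comb(m - k, j) * (1.0 - (k + j) / m) ** n
        pmf[k] = comb(m, k) * total
    return pmf

def empty_cells_test(m, n, x_observed):
    """Return P(X >= x_obs) (clustering alternative: many empty cells)
    and P(X <= x_obs) (cluster-avoidance alternative)."""
    pmf = empty_cells_pmf(m, n)
    upper = sum(p for k, p in pmf.items() if k >= x_observed)
    lower = sum(p for k, p in pmf.items() if k <= x_observed)
    return upper, lower

# Example: 20 cases in 12 cells, 4 empty cells observed
# p_upper, p_lower = empty_cells_test(12, 20, 4)
```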
2.2  Zero-one matrix test:

2.2.1  Description of test:  Divide the time and space under study into r space units and c time units, and arrange the observed frequencies in the following r x c table of time-space subunits:

                 1       2      ...      c    |  Total
        1       n_11    n_12    ...     n_1c  |  n_1.
        2       n_21    n_22    ...     n_2c  |  n_2.
        .        .       .               .    |   .
        r       n_r1    n_r2    ...     n_rc  |  n_r.
        ---------------------------------------------
        Total   n_.1    n_.2    ...     n_.c  |  n_..
Let n_ij be the observed entry of the (i,j)-th time-space unit, where i = 1, 2, ..., r and j = 1, 2, ..., c, and let

    n_.j = Σ_{i=1}^{r} n_ij,     n_i. = Σ_{j=1}^{c} n_ij.

(i)  Test for time clustering:  Let

    a_ij = 0  if n_ij < n_i./c,
    a_ij = 1  if n_ij ≥ n_i./c.

Each a_ij is distributed approximately like a Bernoulli variable with p = 1/2. The test statistic, denoted by A, is defined by

    A = Σ_{i=1}^{r} Σ_{j=1}^{c} a_ij.

Note that a_ij and a_ij' are not independent; however, under the null hypothesis H_0 of no time clustering, a plausible approximate distribution of A is a binomial distribution with parameters p = 1/2 and n = rc. From this binomial distribution the significance level of the test statistic can be assessed. When n is large, an approximate distribution of A is a normal distribution with mean equal to rc/2 and variance equal to rc/4.
Like the EMM test, this test may be referred to as a test of time-space clustering, but it is sensitive to time clustering within spatial units. However, this test, by its nature, is more powerful than the EMM test in situations where within each time unit there is more than one large cluster in different time-space subunits, since the EMM test uses only the maximum frequency in a single time-space subunit as the test statistic. When the observed statistic is large, the test indicates cluster avoidance in some time-space subunits (at certain times, some places have unusually few cases). When the observed statistic is small, it indicates clustering in some time-space subunits (at certain times, some places have unusually many cases).
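A direct implementation of the time-clustering statistic A is sketched below (added for illustration; the function name and example counts are hypothetical). It uses the continuity-corrected cut-off [n_i./c - 0.5] introduced in Section 2.2.3 and the Bin(rc, 1/2) approximation; as noted later, this approximation is conservative because the a_ij are negatively correlated.

```python
import math

def zero_one_time_clustering(table):
    """Zero-one matrix test for time clustering.

    table : list of rows (space units) of counts over c time units.
    Returns A and a two-sided binomial p-value under A ~ Bin(rc, 1/2).
    """
    r, c = len(table), len(table[0])
    A = 0
    for row in table:
        threshold = math.floor(sum(row) / c - 0.5) + 1   # [n_i./c - 0.5]
        A += sum(1 for x in row if x >= threshold)

    n_cells = r * c
    pmf = [math.comb(n_cells, k) * 0.5 ** n_cells for k in range(n_cells + 1)]
    # two-sided: outcomes at least as far from the mean rc/2 as the observed A
    dev = abs(A - n_cells / 2)
    p_two_sided = sum(p for k, p in enumerate(pmf) if abs(k - n_cells / 2) >= dev)
    return A, min(1.0, p_two_sided)

# Example: 3 space units observed over 6 time units
# A, p = zero_one_time_clustering([[4, 5, 6, 7, 2, 6],
#                                  [10, 3, 8, 3, 0, 6],
#                                  [0, 10, 0, 0, 10, 10]])
```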
(ii)  Test for space clustering:  Let

    b_ij = 0  if n_ij < n_.j/r,
    b_ij = 1  if n_ij ≥ n_.j/r.

The test statistic, denoted by B, is defined by

    B = Σ_{i=1}^{r} Σ_{j=1}^{c} b_ij.

Under the null hypothesis H_0 of no space clustering, the approximate distribution of B can be evaluated in the same manner as that of A.

As in test (i), this test is a type of time-space clustering test; however, it is sensitive to space clustering within temporal units.

The test requires that the populations under study in the time-space subunits be equal in size; this can be a severe restriction. When this requirement is not met, the frequencies n_ij need to be adjusted, or rates should be used instead of the frequencies. For example,

    b_ij = 0  if r_ij < (Σ_{i=1}^{r} r_ij)/r,
    b_ij = 1  if r_ij ≥ (Σ_{i=1}^{r} r_ij)/r,

where r_ij is the observed disease rate in time-space subunit (i,j).
(iii)  Test for time-space 'interaction':  Let

    c_ij = 0  if n_ij < n_i. n_.j / n_..,
    c_ij = 1  if n_ij ≥ n_i. n_.j / n_.. .

The test statistic, denoted by C, is defined by

    C = Σ_{i=1}^{r} Σ_{j=1}^{c} c_ij.

Under the null hypothesis of no time-space interaction, the approximate distribution of C can be evaluated in the same manner as that of A.

The test for time-space "interaction" is not sensitive to time or space clustering alone. This test, unlike the test for space clustering, does not require that the populations under study within time-space units be equal in size.
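The sketch below (illustrative only) extends the previous routine to the interaction statistic C, comparing each cell with the usual expected count n_i. n_.j / n_.. and again using the Bin(rc, 1/2) approximation, here through its normal limit.

```python
import math

def zero_one_interaction(table):
    """Zero-one matrix test for time-space interaction (statistic C)."""
    r, c = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(c)]
    grand = sum(row_tot)

    C = sum(1
            for i in range(r) for j in range(c)
            if table[i][j] >= row_tot[i] * col_tot[j] / grand)

    n_cells = r * c
    mean, sd = n_cells / 2.0, math.sqrt(n_cells / 4.0)
    z = (C - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))   # normal approximation
    return C, z, p_two_sided
```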
(iv)  General properties of the zero-one matrix test:  The zero-one matrix test has the following advantages:

- The test can be used to test either time or space clustering alone or time-space interaction.
- The test statistic and its approximate distribution (to be derived) are simple; therefore the p-value of the test can be evaluated without complicated computation.
- The test is sensitive to cases with more than one cluster, which gives it an advantage over the EMM test, which depends only on the maximum.
- Rates (adjusted or unadjusted) can be used for this test instead of the numerators themselves, as required by other tests.
- The test can be used as a one- or two-tail test; for instance, when A is small it shows time clustering, and when A is large it shows cluster avoidance (vacuity) in some time-space subunit(s).
Example:  If we have one space unit and 6 time units with a total of 30 disease cases, then we may have one of the following situations (r = 1, c = 6, n_.. = 30):

    Time              1    2    3    4    5    6   Total
    Expectation       5    5    5    5    5    5    30
    Observation 1     4    5    6    7    2    6    30
    Observation 2    10    3    8    3    0    6    30
    Observation 3     0   10    0    0   10   10    30
    Observation 4    10    4    4    4    4    4    30
    Observation 5     6    6    6    6    6    0    30

    A ~ Bin(6, 1/2),   E(A) = 3,   Var(A) = 1.5
In observations 1, 2 and 3, since A = 3 = E(A), we do not reject H_0 in favor of either clustering or cluster avoidance, while in observation 4 we would reject H_0 in favor of clustering (A=1), and in observation 5 we would reject H_0 in favor of avoidance (A=5). Depending on how one defines clustering, one may argue that observations 2 and 3 should be considered as clustering; this test is not intended for detecting clustering of that type.

This test is not designed for testing clustering of extremely rare diseases, since in these cases most time-space units have no cases (n_ij = 0 for most i,j), the test statistic will be small (A is small), and hence the probability of committing a type II error is large. For applications to rare diseases, the number-of-empty-cells test is recommended. However, as discussed in Section 2.1, the number-of-empty-cells test is also not very powerful in situations with extremely rare diseases.

In the above example, binomial and normal distributions were considered as approximate distributions of the test statistic. Since the a_ij's are not independent of each other, the appropriateness of these approximations needs to be investigated further. As part of this study, this issue will be investigated further, along with the exact distribution of the test statistic.
2.2.2  Exact distribution of the test statistic for time clustering:

The test statistic is

    A = Σ_{i=1}^{r} Σ_{j=1}^{c} a_ij = Σ_{i=1}^{r} a_i. ,    where  a_i. = Σ_{j=1}^{c} a_ij.

Each a_i. is distributed independently of the others (since a_ij depends only on the marginal n_i.). Thus, for simplicity, we shall investigate the distribution of a_i.; this is equivalent to the special case r = 1, A = Σ_{j=1}^{c} a_j. Then P(A=a) is the probability that, among m cells, a of them have frequencies of n_i./c or larger.

Let ρ denote this threshold frequency, let p = 1/m, and let

    I_p^a(ρ,ν) = [Γ(ν+1) / (Γ^a(ρ) Γ(ν+1-aρ))] ∫_0^p ··· ∫_0^p (1 - Σ_{j=1}^{a} x_j)^{ν-aρ} Π_{j=1}^{a} x_j^{ρ-1} dx_j

               = Incomplete (type I) Dirichlet integral.

It has been shown (see Harter et al. (1975)) that I_p^a(ρ,ν) is the probability that the minimum frequency of the first a cells of a multinomial with a+1 cells is at least ρ, provided that the first a cells have a common cell probability of p = 1/m (m > a) and the total of the cell frequencies is ν.
Note the special case I_p^1(ρ,ν) = I_p(ρ, ν-ρ+1), the incomplete beta function. Applying the inclusion and exclusion principles, it can be shown for the case of one row (r = 1, m = r×c = c) that

    P(A=a) = C(m,a) Σ_{j=0}^{m-a} (-1)^j C(m-a,j) I_p^{a+j}(ρ,ν),            (2.7)

where p = 1/m, ν = n and ρ is the threshold frequency. From this exact distribution we get the descending factorial moments of A:

    E(A^(k)) = m^(k) I_p^{k}(ρ,ν),                                            (2.8)

    E(A) = m I_p^{1}(ρ,ν) = m I_p(ρ, ν-ρ+1) = mα.                             (2.9)

Note that the theory derived in this section does not apply if the population sizes are unequal and have been adjusted for (e.g., if rates were used instead of the frequencies).
The higher moments of A can be derived from (2.8) as follows:

    E{A(A-1)} = E(A²) - E(A) = m(m-1) I_p^{2}(ρ,ν) = m(m-1)β,

    E(A²) = m(m-1)β + mα = m(α + (m-1)β).                                     (2.10)

Similarly,

    E{A(A-1)(A-2)} = E(A³) - 3E(A²) + 2E(A) = m(m-1)(m-2)γ,

    E(A³) = m(m-1)(m-2)γ + 3(m(m-1)β + mα) - 2mα
          = mα + 3m(m-1)β + m(m-1)(m-2)γ.                                     (2.11)

Similarly,

    E{A(A-1)(A-2)(A-3)} = E(A⁴) - 6E(A³) + 11E(A²) - 6E(A) = m(m-1)(m-2)(m-3)δ,

    E(A⁴) = m(m-1)(m-2)(m-3)δ + 6(m(m-1)(m-2)γ + 3m(m-1)β + mα)
            - 11(m(m-1)β + mα) + 6mα
          = mα + 7m(m-1)β + 6m(m-1)(m-2)γ + m(m-1)(m-2)(m-3)δ,                (2.12)

where

    α = I_p^{1}(ρ,ν),   β = I_p^{2}(ρ,ν),   γ = I_p^{3}(ρ,ν),   δ = I_p^{4}(ρ,ν).
In an attempt to find a good approximate distribution for the statistic A we need to compute the first 4 moments of A; to do this we need I_p^{b}(ρ,ν) for b = 1, 2, 3, 4. Tables are available only for some selected values of b, p, ρ and ν (Harter and Owen (1975)). To compute I_p^{b}(ρ,ν), note that for any ν ≥ bρ the factor (1 - Σ_{i=1}^{b} x_i)^{ν-bρ} in the integrand may be expanded binomially, and each resulting multinomial term integrates in closed form. Collecting terms gives

    I_p^{b}(ρ,ν) = [ν! p^{bρ} / ((ρ-1)!)^{b}] Σ_{x=0}^{ν-bρ} [(-p)^x / (ν-bρ-x)!]
                   Σ_{i_1+···+i_b = x} 1 / [ (ρ+i_1)(ρ+i_2)···(ρ+i_b) i_1! i_2! ··· i_b! ],    (2.13)

where the inner sum is over all nonnegative integers i_1, ..., i_b with i_1 + ··· + i_b = x.

For the first 4 moments of A, (2.13) gives

    I_p^{1}(ρ,ν) = [ν! p^{ρ} / (ρ-1)!] Σ_{x=0}^{ν-ρ} (-p)^x / [(ρ+x)(ν-ρ-x)! x!]
                 = incomplete beta = α,

and the corresponding double, triple and quadruple sums for b = 2, 3, 4 give β, γ and δ.

If m is the number of cells (m = 1/p), then

    μ'_1 = E(A)  = mα,
    μ'_2 = E(A²) = mα + m(m-1)β,
    μ'_3 = E(A³) = mα + 3m(m-1)β + m(m-1)(m-2)γ,
    μ'_4 = E(A⁴) = mα + 7m(m-1)β + 6m(m-1)(m-2)γ + m(m-1)(m-2)(m-3)δ.

Using these 4 moments, a Pearson-type curve could be selected as a good approximate distribution of A (see Pearson et al. (1962)).
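Equation (2.13) is easy to evaluate numerically for small b. The sketch below is an illustration added here (the thesis used the FORTRAN and SAS programs listed in the appendices); it implements the sum for general b and uses it to obtain α, β and hence the exact mean and variance of A for one row of m cells.

```python
from math import factorial

def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def dirichlet_I(b, rho, nu, p):
    """Incomplete type-I Dirichlet integral I_p^b(rho, nu), equation (2.13)."""
    if nu < b * rho:
        return 0.0
    const = factorial(nu) * p ** (b * rho) / factorial(rho - 1) ** b
    total = 0.0
    for x in range(nu - b * rho + 1):
        inner = 0.0
        for idx in compositions(x, b):
            denom = 1.0
            for i in idx:
                denom *= (rho + i) * factorial(i)
            inner += 1.0 / denom
        total += (-p) ** x / factorial(nu - b * rho - x) * inner
    return const * total

def exact_moments_of_A(m, n, rho):
    """Exact E(A) and Var(A) for one row: n cases in m cells, cut-off rho."""
    p = 1.0 / m
    alpha = dirichlet_I(1, rho, n, p)                  # = incomplete beta
    beta = dirichlet_I(2, rho, n, p)
    mean = m * alpha                                   # equation (2.9)
    var = m * alpha + m * (m - 1) * beta - mean ** 2   # from (2.10)
    return mean, var

# Example: 12 cases in 6 cells with threshold rho = 2
# print(exact_moments_of_A(6, 12, 2))
```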
2.2.3  Adjustment for better approximation (continuity correction):

As (2.7) shows, the exact distribution is not in a simple form and it cannot be used conveniently in most practical situations. Usually we will have to rely on some form of approximation.

It was found from the following simulation results that the distribution of A = Σ_{i=1}^{r} Σ_{j=1}^{c} a_ij is closer to the binomial distribution when a_ij is defined as follows:

    a_ij = 0  if n_ij < [n_i./c - 0.5],
    a_ij = 1  if n_ij ≥ [n_i./c - 0.5],

where [x] denotes the smallest integer larger than x. (Note that [a - 0.5] is the nearest integer to a, with 0.5's rounded up.)

This adjustment will be explored more thoroughly in Chapter III, but first the simulation results will be described.
2.2.4  Simulation results:

Results of 100 tests using random numbers of one, two and three digits to form tables of 10 columns and 50 rows gave means very similar to the binomial mean, 250. However, the standard deviations obtained were smaller than the expected binomial standard deviation, 11.18. This is due to the fact that the a_ij's are negatively correlated with each other. Therefore, the binomial approximation should be considered a conservative first step of testing; exact tests and/or other approximations should be performed to confirm a non-significant result from the binomial test.
The following proof shows why the variance obtained through the simulation is much smaller than that of the binomial approximation. Again, since the rows are independent of each other, for simplicity we prove it for the case r = 1 without loss of generality:

    Var(A) = Σ_{i=1}^{m} Var(a_i) + Σ_{i=1}^{m} Σ_{j≠i} Cov(a_i, a_j).

While the binomial approximation assumes that Cov(a_i, a_j) = 0, the true Cov(a_i, a_j) is negative, as shown below. We have

    E(a_i) = E(a_j) = P(n_i ≥ ρ) = α,

and for i ≠ j,

    Cov(a_i, a_j) = E(a_i a_j) - E(a_i) E(a_j) = β - α².

Note that

    β = P(n_i ≥ ρ, n_j ≥ ρ) = P(n_i ≥ ρ | n_j ≥ ρ) P(n_j ≥ ρ).

Since P(n_i ≥ ρ | n_j ≥ ρ) < P(n_i ≥ ρ) = α, we have β < α², and hence Cov(a_i, a_j) < 0.

If we let P(n_i ≥ ρ | n_j ≥ ρ) = α', then

    Var(A) = m Var(a_i) + m(m-1)(β - α²) = m Var(a_i) + m(m-1) α(α' - α),

and (α' - α) < 0. Therefore Var(A) < m × Var(a_i).
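The negative covariance is easy to confirm by simulation; the sketch below (illustrative, with hypothetical function and parameter names) generates multinomial rows under the null, computes A with the continuity-corrected cut-off, and compares the empirical mean and variance with the binomial values rc/2 and rc/4.

```python
import math
import numpy as np

def simulate_A_null(n_cases, r, c, n_sim=20_000, seed=0):
    """Empirical mean and variance of A when each row's cases fall at random
    into c equally likely time units; compare with Bin(rc, 1/2)."""
    rng = np.random.default_rng(seed)
    threshold = math.floor(n_cases / c - 0.5) + 1      # [n_i./c - 0.5]
    A_values = np.empty(n_sim)
    for s in range(n_sim):
        rows = rng.multinomial(n_cases, [1.0 / c] * c, size=r)
        A_values[s] = np.sum(rows >= threshold)
    return A_values.mean(), A_values.var(), r * c / 2.0, r * c / 4.0

# Example: 50 rows of 10 columns, 30 cases per row
# mean_A, var_A, binom_mean, binom_var = simulate_A_null(30, 50, 10)
```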
Hypothetical clustering data were also generated using random numbers from the tables referred to above, with the first column changed to zero, then changed to the sum of the first and second columns, and finally changed to the sum of the first, second and third columns. The results, presented in Table 2.1 below, indicate that the test performs well on either side of clustering (cluster avoidance when column 1 = 0, clustering when column 1 is larger than expected by the random process). We obtain larger means with the cluster-avoidance data and smaller means with the clustering data, with the stronger clustering giving the smaller means. The standard deviations were not significantly affected by the size of the clustering.
Table 2.1.  Simulation results of 100 tests using random numbers of 1, 2 and 3 digits
(10 columns, 50 rows => m = 500, E(A) = 250, Var(A) = 125, SD(A) = 11.18)

    Random numbers                          1-Digit          2-Digit          3-Digit
                                          Mean    SD       Mean    SD       Mean    SD
    Completely random                    249.75   6.29    250.13   6.44    250.66   5.88
    Column 1 = 0                         252.09   5.58    251.49   6.17    251.57   6.89
    Column 1 = Col 1 + Col 2             244.74   5.71    246.63   6.89    243.29   6.30
    Column 1 = Col 1 + Col 2 + Col 3     238.17   6.62    233.21   5.89    235.54   7.96
CHAPTER III

REGRESSION ESTIMATION FOR THE ZERO-ONE MATRIX TEST FOR TIME CLUSTERING
3.0  Introduction

As shown by the simulation results in Section 2.2.4, the binomial approximation tends to give a conservative estimate of the null distribution of the statistic A in testing for time clustering: the binomial estimate of the variance is much larger than the variance obtained by the simulation procedure.
In this chapter we will attempt to compare the binomial estimates
with the exact mean and variance for a range of situations with different
values of n, the number of cases to be distributed into m time units. Based
on the results of this comparison we will use a linear regression technique
to estimate the mean and the variance of the statistic A.
3.1  Binomial estimates versus the exact values

For the statistic A, since the rows (space units) are treated as independent of each other, we shall, without loss of generality, consider the distribution of cases in only one row: n disease cases are distributed into m time units, and unit j contains n_j cases (Σ_{j=1}^{m} n_j = n). Let p = 1/m and let ρ = [n/m - 0.5], the smallest integer larger than (n/m - 0.5). To test the hypothesis that some units have unusually more (or fewer) cases than others, the test statistic is

    A = Σ_{j=1}^{m} a_j,    where  a_j = 0 if n_j < ρ and a_j = 1 if n_j ≥ ρ.
As shown in the derivation of equations (2.9) and (2.10), the exact expected values of A and A² are

    E(A) = mα,
    E(A²) = mα + m(m-1)β,

where

    α = I_p^{1}(ρ,n) = I_p(ρ, n-ρ+1) = incomplete beta function,
    β = I_p^{2}(ρ,n) = incomplete Dirichlet function.

From these results we have the exact variance of A:

    Var(A) = mα + m(m-1)β - m²α² = m(m-1)β - mα(mα - 1).                      (3.1)

For m from 5 to 10 and selected values of n from 5 to 50, α was computed using the Incomplete Beta routine of the International Mathematical and Statistical Libraries (IMSL), and β was computed using a computer program written by the author especially for this purpose. Details of these programs can be found in Appendices 1.1 to 1.3.

From these values of α and β, the exact expected value and variance of A were calculated using equations (2.9) and (3.1) respectively.
Under the binomial approximate distribution of A, A ~ Bin(m, 1/2), the (approximate) expected value and variance of A are

    Bin E(A) = m/2,    Bin Var(A) = m/4.

The exact and binomial approximate mean and variance of A for each combination of m = 5(1)10 and n = 5(1)10, 12, 15, 18, 20(5)50 are presented in Table 3.1.

Consistent with the simulation results, the expected values of A under the binomial approximation are very similar to those under the exact distribution, while the exact variances are much smaller than those obtained from the binomial approximation. In fact, the approximate variances are about 3 times larger than the exact ones.

The ratios and the differences between the approximate and the exact variances, as well as those of the expected values, are also presented in Table 3.1. Overall, the ratios between the variances are about 3, while the ratios between the expected values are close to unity. Means of the ratios and means of the differences between the approximate and the exact moments are presented in Table 3.2 by number of cells m. The difference between the variances becomes larger as m increases, while the ratio between these values remains about the same for all values of m from 5 to 10.
3.2  Estimate the variance using the expected value

Since the approximate variance of A is a linear function of m (Bin Var(A) = m/4), the exact variance of the statistic A should be highly correlated with m, which is in turn highly correlated with the exact expected value of this statistic. Therefore it is believed that the expected value and the variance of the statistic A are highly correlated.

A general least squares model fitting the exact variance on the exact expected value showed that the variance can be closely estimated as a function of the exact expected value as follows:

    Var(A) = 0.155 × E(A).                                                    (3.2)

This model gives an R-square of 99.4%; the details of the regression results are presented in Table 3.3.

Some other models were also tried but none gave a better fit: the same model as in (3.2) with an intercept term gives an R-square of 90.4%; the model fitting the variance on the square of the expected value gives an R-square of 87.8% with an intercept term and 93.8% without; the model as in (3.2) with the square of the expected value added only improves the R-square from 99.4% to 99.5%; and the model fitting the standard deviation (instead of the variance) on the expected value gave an R-square of 90.1% with an intercept term and 98.1% without.

From these results, it is concluded that the model of equation (3.2) gives the "best" estimate of the variance using the expected value; the word "best" here is used to indicate the relative simplicity and lack of large errors of the estimate.
3.3  Estimate the expected value

It was observed that, even though the ratios between the binomial approximate and the exact expected values of A remain about the same for different values of m (Table 3.2), within each value of m these values depend on the decimal part of n/m. This is intuitively logical, since the expected value of A depends on the probability that each a_i = 1, i.e. the probability that n_i ≥ [n/m - 0.5]; this probability depends on the value of n/m, yet n_i can only be an integer; the decimal part of n/m was therefore neglected in the binomial approximation. For a better estimate of the expected value of the statistic A, this decimal part should be taken into account.

A general least squares model fitting the exact expected value of A on the number of time units m and on n/m minus the smallest integer larger than (n/m - 0.5) (Decimal) showed that the expected value of the statistic A can be estimated as follows:

    E(A) = 0.6 × m + 2.2 × Decimal.                                           (3.3)

This model gives an R-square value of 99.5%. The details of the regression results are presented in Table 3.4. The same model as in (3.3) with an intercept term gives an R-square value of 92.0%.

Due to the simplicity of the model and its good fit, equation (3.3) can be chosen to estimate the expected value of A. However, since E(A) = mα, where α is the probability that n_i ≥ [n/m - 0.5] and α is independent of i, a more general estimate of E(A) can be obtained from an estimate of α multiplied by m. A general least squares model fitting the exact value of α on the difference between n/m and [n/m - 0.5] (Decimal) showed that the value of α can be estimated as follows:

    α = 0.6 + 0.3 × Decimal.                                                  (3.4)

This model gives an R-square value of 80.8%. The details of the regression results are presented in Table 3.5.

From equation (3.4) the expected value of A can be estimated by

    E(A) = mα = m(0.6 + 0.3 × Decimal).                                       (3.5)

3.4  Appropriateness of the regression estimation
To check the appropriateness of the regression estimation, equation (3.5)
was used to estimate the expected value of A and this estimate was then
used to estimate the variance of A, using equation (3.2).
The residuals
were calculated as the difference between the estimates and the exact values.
These residuals were standardized in percents as follows:
    Standardized residual = (Residual × 100) / (exact value).

These standardized residuals were plotted against n in Plots 3.1 and 3.2, with Plot 3.1 presenting the residuals of the expected values and Plot 3.2 presenting the residuals of the variances. A univariate procedure applied to the residuals showed that, in estimating the expected value, the residual is never more than 15% of the value being estimated; 90% of the time the residual is not more than 10%, and 50% of the time it is not more than 6%. In estimating the variance, the residual is never more than 1/3 of the estimated value (33%); 90% of the time the residual is not more than 16%, and 50% of the time it is not more than 7%.
More details of these results can be found in Appendix 2.
In summary, the results of the study presented in this chapter lead to the conclusion that a good estimate of the expected value of the test statistic A, for the test of time clustering, is

    E(A) = m(0.6 + 0.3 × Decimal),

where m is the number of time units, Decimal is n/m minus the smallest integer larger than (n/m - 0.5), and n is the number of disease cases to be distributed into the m time units. A good estimate of the variance of A is

    V(A) = 0.155 × E(A).

These estimates are simple and relatively accurate, with the error of the estimate usually less than 10% for E(A) and less than 15% for V(A).
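Equations (3.2) and (3.5) make the approximate test essentially a one-line normal calculation; a sketch is given below (illustrative only — the regression fits in this chapter were based on m = 5(1)10 and a single row, so the routine simply mirrors those formulas rather than extending their range).

```python
import math

def approximate_time_clustering_test(A, n, m):
    """Approximate zero-one matrix test for time clustering (one row).

    A : observed test statistic; n : number of cases; m : number of time units.
    Uses E(A) ~ m(0.6 + 0.3*Decimal) and Var(A) ~ 0.155*E(A), then a
    normal approximation for a two-sided p-value.
    """
    ratio = n / m
    nearest = math.floor(ratio - 0.5) + 1          # smallest integer > (n/m - 0.5)
    decimal = ratio - nearest                      # lies in (-0.5, 0.5]
    expected = m * (0.6 + 0.3 * decimal)           # equation (3.5)
    variance = 0.155 * expected                    # equation (3.2)
    z = (A - expected) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return expected, variance, z, p_two_sided

# Example: A = 3 observed for n = 30 cases in m = 6 time units
# print(approximate_time_clustering_test(3, 30, 6))
```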
Table 3.1.  Approximate and exact values of E(A) and Var(A) for n = 5(1)10, 12, 15, 18, 20(5)50 and m = 5(1)10 (table continues over the following pages; see page 36 for the definitions of the column labels).
..
o
oro",
"'0
,",0
.0
,.......
~ l.I1
11\ ....
V'l0
..,0
:7'0
1""-0
:"'If"<-N
o
o
"'0
.... 0
N,.,
,.... :::tN
f""'lO
tnU'"t
teO\
;:"r'-o
,...0
.0
•0
N.
.0
o
-.
"'"''''
"''''
"' .... :> •
0
."
':)0
00
'-'0
.
-- "'0"'-"'0., .
'" .
. ..
'". 0
o •
N
.0
o
"'0
-...
..
00
00
eo
lI'l0
. .-
o
>,.)
N
...
o
~
'"..,
N
'"
:1\0
.... 0
"'"'.... 0
'''0
". 0 . 0
00
.,'" ....
.
'"
· -·
··
"0
o
.0
.:>
N'"
"'0
...'"
.0
... .... .......
"""
.0
.....,'" '".,
"..
.... ...'"
~o
"'0
1>'<0
I
o
.7
./J-
.... ...
_N
o
o
z
.0
o
.. . .... .
o
::>
00
00
0
a
.0
.0
o
oro'"
"....,
=
"".
-D
00
(TIlO
"",0
CO,
I
:::>
.'"
..,· · ·
:::>
o
c;r.O
N'"
o
o
.0
.0
o
- . -.
"-
.
'"
o
.
....
"',.. "'-N....
.. 0
.......
" ...
..o"'0
,
-IJ ....
"'0
"'0
"'0
::>0
o •
.0
NO
"'". 0
00
.... 0
o
o
0\ ""\
N.
,0
,~ti'
..... t
.0
0
.'
e
.0
I
"'..,
..-"",...
.....
.0
,
o
"'0
-. .
';DO
.... 0
.0
I
...
.n
a-O
o
."
36
Labels of the variable names in Table 3.1:

M       = number of time units
N       = number of disease cases to be distributed into M units
R       = smallest integer larger than (N/M - 0.5)
I1      = I_{1/M}(R, N), the incomplete Dirichlet integral of the first degree,
          equal to the incomplete beta function I_{1/M}(R, N - R + 1)
I2      = I^{(2)}_{1/M}(R, N), the incomplete Dirichlet integral of the second degree
EEA     = exact expected value of A = M x I1
EVA     = exact variance of A = M(M-1) I2 - M(I1)(M x I1 - 1)
BEA     = binomial approximate expected value of A
BVA     = binomial approximate variance of A = M/4
RVA     = BVA/EVA
REA     = BEA/EEA
DEA     = BEA - EEA
DVA     = BVA - EVA
EV      = EEA/EVA
Decimal = N/M - R
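As a concrete illustration of these definitions, the following short sketch (not part of the thesis; it assumes Python with SciPy, whose scipy.special.betainc is the regularized incomplete beta I_x(a, b)) computes R, I1 and EEA for a given M and N.

```python
import math
from scipy.special import betainc   # regularized incomplete beta I_x(a, b)

def exact_expected_A(M, N):
    """EEA = M * I1, with I1 = I_{1/M}(R, N - R + 1) and
    R the smallest integer larger than N/M - 0.5 (see the labels above)."""
    R = math.floor(N / M - 0.5) + 1          # smallest integer larger than N/M - 0.5
    I1 = betainc(R, N - R + 1, 1.0 / M)      # incomplete beta I_{1/M}(R, N - R + 1)
    return M * I1

# e.g. M = 5 time units and N = 20 cases
print(exact_expected_A(5, 20))
```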
37
Table 3.2

Means of the ratio and of the difference between the approximate and the
exact moments by number of cells, based on results presented in Table 3.1.

Number of obs.     m     BinVar(A)/Var(A)   BinVar(A)-Var(A)   BinE(A)/E(A)   BinE(A)-E(A)
      16           5           2.68               0.78             0.84          -0.53
      16           6           2.72               0.94             0.87          -0.53
      16           7           2.69               1.09             0.84          -0.75
      16           8           2.76               1.28             0.87          -0.69
      16           9           2.79               1.43             0.86          -0.84
      16          10           2.98               1.64             0.94          -0.44
      96          5-10         2.77               1.19             0.87          -0.63
Table 3.3

Test for time clustering: regressing the exact variance (EVA) on the exact
expected value (EEA) of the test statistic.

[SAS General Linear Models output, not reproduced in full.  R-square = 0.9954;
the estimated slope for EEA is approximately 0.155 (Pr > |T| = 0.0001), the
value used for the approximation Var(A) = 0.155 x E(A).]
Table 3.4

Regressing the exact expected value of A on the number of time units m and
the difference between (n/m) and [n/m - 0.5].

[SAS General Linear Models output, not reproduced in full.  R-square = 0.9952;
the estimated coefficients are approximately 0.60 for M and 2.19 for DECIMAL,
both with Pr > |T| = 0.0001.]
Table 3.5

Regressing α (the probability that n_j ≥ [n/m - 0.5]) on the difference
between (n/m) and [n/m - 0.5].

[SAS General Linear Models output, not reproduced in full.  R-square = 0.808;
the estimated intercept is 0.6074 and the estimated coefficient for DECIMAL
is 0.2866, both with Pr > |T| = 0.0001; these estimates are the source of the
approximation E(A) = m(0.6 + 0.3 x Decimal).]
Plot 3.1

Plot of RE1 against N, with the plotting symbol equal to the value of M
(RE1 = residual of the expected value, in %).

[Scatter plot not reproduced.  The vertical axis runs from -15% to +15% and
the horizontal axis (N) from 5 to about 50; one observation is hidden.]
Plot 3.2

Plot of RE2 against N, with the plotting symbol equal to the value of M
(RE2 = residual of the variance, in %).

[Scatter plot not reproduced.  The vertical axis runs from about -20% to +40%
and the horizontal axis (N) from 5 to about 50; nine observations are hidden.]
CHAPTER IV
COMPARING THE ZERO-ONE MATRIX TEST FOR TIME CLUSTERING
WITH THE EMM TEST AND THE SCAN TEST
4.0
Introduction
The zero-one matrix test is based on the number of disease cases observed
in each time-space unit, without regard to the arrangement of cases within
each unit.  In this respect, more than any other test, it is similar to the
EMM test and the scan test, both of which are based on the maximum number of
disease cases in an appropriately defined time-space unit.
In this chapter we will compare the zero-one matrix test with these
other two tests, in an attempt to define situations in which each test is
more powerful than the others and to determine the alternative hypotheses for
which one test is more appropriate than the others.
4.1  Situations in which the zero-one matrix test is more powerful than the
EMM test

For a fixed number of time units and conditional on the total frequency
in all units, the EMM test depends only on the statistic a, the maximum fre-
quency in a time unit, regardless of the distribution of the other frequencies
in the individual units, other than the maximum.  On the other hand, the zero-
one matrix test statistic depends on the frequencies of all the individual
units.  Therefore, the difference in power of the two tests depends on the
frequencies of the individual units, other than the maximum.  To test for
time clustering, the zero-one matrix test is more powerful in situations in
which there is less variation between the frequencies of the individual
time units, other than the maximum.  The following examples, (4.1) and
(4.2), illustrate this point:
Example 4.1:  Annual numbers of cervix cancer deaths in Nash County of
North Carolina reported from 1976 to 1980 are as follows:

Year                          1976   1977   1978   1979   1980   Total
Number of cervix
cancer deaths                   2      2      2      2      6      14
With a total frequency of 14, for 5 time units (average 2.8 deaths per year),
the zero-one matrix test statistic is A=1, which shows significant clustering.
(Under the null hypothesis of no clustering, the approximate expected value
and variance of A are, by equations (3.5) and (3.2), 2.70 and 0.42 respectively.)
The EMM test statistic in this case is a=6; under the null hypothesis, con-
ditional on the total frequency of 14 for 5 time units, the expected value and
variance of a are 4.86 and 0.93 respectively.  Therefore, the EMM test shows
that the clustering of cervix cancer deaths during 1980 in Nash County of
North Carolina is not significant (Z = (6 - 4.86)/√0.93 = 1.18), while the
zero-one matrix test shows that it is significant (Z = (1 - 2.7)/√0.42 = -2.62).
The significance declared by the zero-one matrix test is based largely on the
fact that there is little variation in the annual frequencies of cervix cancer
deaths in Nash County from 1976 to 1979 (2 deaths each year), whereas in 1980
the number of deaths was 6.  If in 1976 and 1977 the number of deaths had been
1 and 3 respectively (instead of 2 each year), as in the following array:
Year                          1976   1977   1978   1979   1980   Total
Number of cervix
cancer deaths                   1      3      2      2      6      14
then the zero-one matrix test statistic would be A=2, which shows insignifi-
cant clustering for the same maximum and total frequencies of 6 and 14, re-
spectively, for 5 time units.  The EMM test would remain unchanged in this case.

It can be concluded from this example that while the EMM test is independent
of the variation of the frequencies, the zero-one matrix test takes into ac-
count this variation, and clusters are more likely to be declared significant
by this test when there is a sudden change among otherwise stable frequencies.
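The arithmetic behind these Z values can be reproduced with the short sketch below.  It is not part of the original thesis; Python is used only for illustration, and the moments are the approximations of equations (3.5) and (3.2), E(A) = m(0.6 + 0.3 x Decimal) and Var(A) = 0.155 x E(A).

```python
import math

def zero_one_matrix_test(counts):
    """Zero-one matrix test for time clustering on one series of counts.

    counts: number of disease cases in each of the m time units.
    Returns (A, E_A, Var_A, Z) using the approximate moments
    E(A) = m(0.6 + 0.3*Decimal) and Var(A) = 0.155*E(A).
    """
    m, n = len(counts), sum(counts)
    R = math.floor(n / m - 0.5) + 1        # smallest integer larger than n/m - 0.5
    decimal = n / m - R                    # the "Decimal" of equation (3.5)
    A = sum(1 for c in counts if c >= R)   # number of "one" (high) time units
    E_A = m * (0.6 + 0.3 * decimal)
    var_A = 0.155 * E_A
    return A, E_A, var_A, (A - E_A) / math.sqrt(var_A)

# Example 4.1, Nash County: counts 2, 2, 2, 2, 6 give A = 1 and Z close to -2.6
print(zero_one_matrix_test([2, 2, 2, 2, 6]))
```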
The following example restates the same point:
Example 4.2:  Annual numbers of arteriosclerosis deaths in Catawba County
in North Carolina reported from 1976 to 1980 are as follows:

Year                                1976   1977   1978   1979   1980   Total
Number of arteriosclerosis deaths     15     11     11     10     11     58
With a total frequency of 58 for 5 time units (average 11.6 deaths per year),
the zero-one matrix test statistic is A=1, which shows significant clustering.
(Under the null hypothesis of no clustering, the approximate expected value
and variance of A are, by equations (3.5) and (3.2), 2.40 and 0.37 respective-
ly, which give a Z value of Z = (1 - 2.40)/√0.37 = -2.30.)  However, the EMM
test statistic a=15 is not significant, as its expected value and variance are
15.7 and 3.38 respectively, which give a Z value of Z = (15 - 15.7)/√3.38 = -0.38.
Therefore, the EMM test shows that the clustering of arteriosclerosis deaths
during 1976 in Catawba County in North Carolina is not significant, while the
zero-one matrix test shows that it is significant, due to the fact that there
is little variation in the annual frequencies of arteriosclerosis deaths in
Catawba County from 1977 to 1980.  If in 1979 and 1980 the number of deaths
had been 9 and 12 respectively, instead of 10 and 11, the zero-one matrix
test statistic would be A=2, which shows insignificant clustering for the
same maximum and total frequencies of 15 and 58, respectively, for 5 time
units.  The EMM test would remain the same with this frequency change.
The above two examples show that for a given number of time units and
conditional on total frequency, the EMM test will not detect a cluster un-
less the maximum frequency in a time unit is quite large, while the zero-one
matrix test does not require that large a maximum; rather, it depends on
the variation between units.

Note that examples (4.1) and (4.2) are used here for illustration pur-
poses.  It is assumed that the population of the two counties (Nash and Cataw-
ba of North Carolina) did not change substantially from 1976 to 1980, and
that the frequencies of deaths before 1976 and after 1980 were ignored.  The
reasons for the clustering detected are not in the scope of this research
and hence are not discussed here.
4.2  Situations in which the EMM test is more powerful than the zero-one
matrix test
For comparison purposes, suppose that the annual numbers of cervix cancer
deaths in Nash County in North Carolina reported from 1976 to 1980, which
were used in example (4.1), were modified as follows:

Year                          1976   1977   1978   1979   1980   Total
Number of cervix
cancer deaths                   2      0      3      2      7      14
For this data set, the EMM test will show significant clustering (test
statistic a=7, E(a) = 4.86, Var(a) = 0.93, which gives a Z value of
Z = (7 - 4.86)/√0.93 = 2.22), while the zero-one matrix test will not declare
the clustering significant (test statistic A=2, E(A) = 2.7, Var(A) = 0.42,
which gives a Z value of Z = (2 - 2.7)/√0.42 = -1.08).
The significance declared by the EMM test is based on the large maximum
frequency in a time unit (7 for 1980), given the fact that there are only 14
cases in 5 time units (from 1976 to 1980).  The clustering in 1980 is not
significant according to the zero-one matrix test, due to the fact that there
is a large variation between time units; it can be argued that, for example,
if the frequency can vary by chance from 0 to 3 as in 1977 and 1978, so also
it could be by chance that we observed 7 cases in 1980.  However, in example
(4.1), the frequency observed in 1980 is only 6, but since in the previous
4 years there were only 2 each, this is declared as a significant cluster.

A similar example can be obtained by assuming that the arteriosclerosis
data set of Catawba County used in example (4.2) is modified so that the max-
imum frequency in a single year becomes 20 while the total remains 58.  In
that case the EMM test declares the clustering significant (test statistic
a=20, E(a) = 15.70, Var(a) = 3.38, which gives a Z value of
Z = (20 - 15.7)/√3.38 = 2.34), while the zero-one matrix test does not declare
the clustering significant (test statistic A=2, E(A) = 2.4, Var(A) = 0.37,
which gives a Z value of Z = (2 - 2.4)/√0.37 = -0.66).
Similar to the above case, the EMM test declares the clustering signi-
ficant, based on the large maximum frequency in a time unit (20 for 1976),
given the fact that there are a total of 58 cases in 5 time units (from 1976
to 1980).  This clustering is not significant according to the zero-one ma-
trix test, due to the fact that there is a larger frequency variation between
time units compared to the original data in example 4.2.

It can be concluded from these illustrations that the EMM test is more
powerful than the zero-one matrix test in cases in which one large maximum
frequency is observed in one unit, together with a large frequency variation
between the other units.  However, it may be undesirable to call this type of
clustering "significant", since a large frequency variation between units
leads to the logical expectation of a large maximum in one unit.
As in the EMM test, the scan test statistic is the maximum frequency
of disease cases in a fixed interval of time; therefore, similar to the case
of the EMM test, the scan test is more powerful than the zero-one matrix
test in situations in which one large maximum is observed in a fixed inter-
val of time and the rest of the cases are distributed rather unevenly dur-
ing the rest of the time intervals under consideration.  The zero-one matrix
test is more powerful than the scan test in situations in which a small clus-
ter of events is observed in a fixed time unit and the rest of the events
are distributed uniformly (evenly) during the rest of the time under consid-
eration.

Due to the nature of the scan test, a criterion for dividing the time
under study into fixed intervals (time units) is not needed and the test sta-
tistic does not depend on this subjective step.  In this respect the zero-one
matrix test has a drawback compared to the scan test; however, the scan test
does not get this advantage without paying a high price, as neither the exact
nor the approximate distribution of the scan test can be assessed easily, and
extensive tables of the significance points of this statistic are not available.

Furthermore, the scan test statistic requires more detailed data than
those required by the zero-one matrix test, when the length of the time unit
is the same.  For example, if the length of the time unit is one year, then
the zero-one matrix test requires annual data, i.e., the number of cases occur-
ring in each year with an appropriate starting date, while the scan test re-
quires data with a more exact time at which each case occurred, or at least
semi-annual data, i.e., the number of cases occurring in each six months with
an appropriate starting date, so that the scan statistic can be computed as
the maximum of all the sums of two consecutive semi-annual frequencies.

Moreover, the scan statistic depends on how detailed the available data
are; for example, if the length of the "window" for the scan test is one
year, the scan statistic can be the maximum of all the sums of two consecu-
tive semi-annual frequencies or the maximum of all the sums of 12 consecutive
monthly frequencies.  The latter statistic can be either equal to or greater
than the former, and the two statistics are not necessarily the same.  There-
fore, in assessing the significance level of the scan test statistic, this
fact should be taken into account, and this may not be feasible in most prac-
tical situations.
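To make the windowing point concrete, a scan statistic of this kind can be computed as below.  The sketch is not from the thesis; Python is used only for illustration, and the window is expressed as a number of consecutive sub-intervals (e.g. two semi-annual periods for a one-year window).

```python
def scan_statistic(counts, window):
    """Maximum of the sums of `window` consecutive frequencies."""
    return max(sum(counts[i:i + window]) for i in range(len(counts) - window + 1))

# hypothetical semi-annual counts scanned with a one-year (two-period) window
print(scan_statistic([1, 0, 2, 4, 3, 1, 0, 2], window=2))   # prints 7
```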
Note that when "time" defines a "unit", "clustering" between neighbOrin g
units can occur without being detected by the zero-one matrix test.
Example:
5, 7, 8, 8. S. 7 might be considered less "clusterful" than 5, 5, 7, 8, 8, 7;
the zero-one matrix test does not allow for this consideration.
e
CHAPTER V
GENERALIZED ZERO-ONE MATRIX TEST FOR TIME CLUSTERING
WITH ADJUSTMENT FOR EXTRANEOUS FACTORS; MULTIVARIATE APPLICATIONS
5.0  Introduction

A major weakness of most of the tests for time-space clustering is
that they do not take into account the population changes due to time and
space, nor the difference in distribution of extraneous factors among
study groups.  The above factors may affect the outcome of the study.

None of the tests found in the literature can be used in a multivar-
iate problem, such as cases in which one needs to test the hypothesis of
an unusual pattern occurring when a group of different diseases are con-
sidered simultaneously.

In this chapter we discuss first the possibility of using the zero-
one matrix test for time clustering adjusting for extraneous factors, and
then we discuss the way to use this test in multivariate situations.
5.1  Zero-one matrix test for time clustering with adjustment for extraneous
factors

One of the most important factors that should be taken into account
when investigating disease patterns over a period of time is the change in
population size.  As discussed earlier in Chapter II, to adjust the zero-
one matrix test for this change, the rates of disease in each unit should
be used, rather than the frequencies themselves.  However, unless the rates
are known to have the same precision, an apparent gap within them may be
misleading as a basis for classification as zero or one in the zero-one
matrix test.  The larger the population base for the rate, the more stable
the rate will be, and also the more reliable will be the classification of
the corresponding unit.
Symons, Grimson and Yuan (1982) recommended a way to classify units
into high or normal risk of sudden infant death syndrome, adjusting for
population sizes, which can be used in the zero-one matrix test.  Instead
of classifying units into "high" or "normal" risk, we classify them into one
or zero; an unusual "cluster" is found when one observes a few "high" risk
units among many "normal" risk units, and "avoidance" (or "vacuity") is found
when one observes a few "normal" risk units among many "high" risk units.
The following discussion summarizes their method of classification:
The number of events, n_i, in the i-th unit, having a population size of
N_i, is assumed to have a Poisson distribution.  The distribution of the num-
ber of events is

        P(n_i | N_i, λ_g) = exp(-N_i λ_g)(N_i λ_g)^{n_i} / n_i! ,                 (5.1)
where i = 1,...,m indexes the units and λ_g is the population rate in the
g-th group, g = 0 or 1 for the zero or one value in the test matrix, which
correspond to "normal" or "high" risk respectively.

If a randomly selected unit is "normal" risk with probability π_0 and
"high" risk with probability π_1 = 1 - π_0, the unconditional distribution
of n_i is then

        f(n_i | π, λ_g, N_i) = Σ_{g=0}^{1} π_g P(n_i | N_i, λ_g) .                 (5.2)
The cluster test is formulated as a problem of estimating the unknown
mixture component origin of each n_i, denoted by a_i = g, or equivalently
that n_i comes from the g-th component of the mixture.  The a priori probabil-
ity of a_i equaling g is π_g.  A likelihood of the data is required to estimate
the m components of the vector a = (a_1,...,a_m) by a maximum likelihood or
a Bayesian approach.  Let the vector n denote the m observations n_i, let N
be the vector of corresponding population sizes, and let θ be the vector of
parameters (π_0, π_1, λ_0, λ_1).  The likelihood of the data for the m units,
a specified allocation a, parameters θ, and population sizes N is given by

        L(n | a, θ, N) = [ Π_{g=0}^{1} π_g^{m_g} ]
                         x exp{ Σ_{g=0}^{1} [ - Σ_{C_g} N_i λ_g + Σ_{C_g} n_i ln(N_i λ_g) ] } ,      (5.3)

where C_g is the collection of observations allocated to the g-th component
and m_g denotes the number of observations allocated to that component.  The
first factor of (5.3) represents the likelihood of the number of observations
assigned to each group, and the remainder represents the likelihood of which
of these numbers of observations are assigned to each group.
A maximum likelihood approach determines the maximum likelihood esti-
mate of a, â, as the allocation that maximizes (5.3).  This is accomplished
by replacing the parameters θ by their maximum likelihood estimators, giving
an allocation of the m n_i to the 2 components, and then seeking the maximum
of (5.3) over the 2^m possible allocations.  Given an allocation a, the max-
imum likelihood estimates of the parameters are

        π̂_g = m_g / m                                                              (5.4)
and
        λ̂_g = ( Σ_{C_g} n_i ) / ( Σ_{C_g} N_i ) .                                  (5.5)

The allocation â that maximizes (5.3) is equivalent to the partition
of the m observations into 2 groups which minimizes the negative of the
natural logarithm of (5.3), specifically,

        Σ_{i=1}^{m} n_i  -  Σ_{g=0}^{1} Σ_{C_g} n_i ln(N_i λ̂_g)  -  Σ_{g=0}^{1} m_g ln(m_g) .       (5.6)
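A brute-force sketch of this maximum likelihood allocation is given below.  It is not the Symons, Grimson and Yuan program; it simply minimizes criterion (5.6), as reconstructed above, over all 2^m two-group allocations, and Python is used only for illustration (practical only for small m).

```python
from itertools import product
from math import log

def ml_allocation(counts, pops):
    """Minimize the (reconstructed) criterion (5.6) over all 0/1 allocations.

    counts[i] = n_i, pops[i] = N_i; each candidate group is assumed to receive
    at least one unit and at least one case.  Returns the best allocation.
    """
    m = len(counts)
    best_value, best_alloc = float("inf"), None
    for alloc in product((0, 1), repeat=m):
        if len(set(alloc)) < 2:                    # skip the two one-group allocations
            continue
        value = sum(counts)                        # the constant term of (5.6)
        for g in (0, 1):
            idx = [i for i in range(m) if alloc[i] == g]
            lam = sum(counts[i] for i in idx) / sum(pops[i] for i in idx)
            value -= sum(counts[i] * log(pops[i] * lam) for i in idx)
            value -= len(idx) * log(len(idx))
        if value < best_value:
            best_value, best_alloc = value, alloc
    return best_alloc

# Large intestinal cancer example of Section 5.1 (the thesis reports that
# 1971 and 1972 form one group and the remaining seven years the other):
deaths = [54, 39, 47, 63, 66, 81, 64, 64, 67]
pops   = [37286, 38053, 38815, 39582, 40431, 41188, 42077, 43029, 44370]
print(ml_allocation(deaths, pops))
```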
A Bayesian approach requires the specification of a prior distribution,
p(θ), for θ and the averaging over the uncertainty in the unknown parameters,
since the allocation vector a is of primary interest.  Generally,

        P(a | n, N)  ∝  ∫ L(n | a, θ, N) p(θ) dθ ,                                 (5.7)

where the integration is over the parameter space of θ.  The mode of (5.7)
is taken as the Bayes estimate of the optimal allocation ã.  If the prior is
chosen to delineate the parameter space as follows:

        p(θ) = p(π_0, π_1, λ_0, λ_1)  ∝  [ Π_{g=0}^{1} π_g^{-1} ][ Π_{g=0}^{1} λ_g^{-1} ] ,         (5.8)

then the resulting Bayesian optimal allocation, ã, is equivalent to the
partition of the data into 2 groups that minimizes the criterion

        Σ_{g=0}^{1} ( Σ_{C_g} n_i ) ln( Σ_{C_g} N_i )  -  Σ_{g=0}^{1} ln[ Γ( Σ_{C_g} n_i ) ]  -  Σ_{g=0}^{1} ln[ Γ(m_g) ] .      (5.9)
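For comparison, criterion (5.9) as reconstructed above can be evaluated for a given allocation with the sketch below (again not from the thesis; Python and its log-gamma function are used only for illustration, and each group is assumed to contain at least one unit and at least one case).

```python
from math import lgamma, log

def bayes_criterion(counts, pops, alloc):
    """Value of the (reconstructed) Bayesian criterion (5.9) for a 0/1 allocation;
    the allocation minimizing it is the Bayesian optimal allocation."""
    value = 0.0
    for g in (0, 1):
        idx = [i for i, a in enumerate(alloc) if a == g]
        n_g = sum(counts[i] for i in idx)          # total cases in group g
        N_g = sum(pops[i] for i in idx)            # total population in group g
        value += n_g * log(N_g) - lgamma(n_g) - lgamma(len(idx))
    return value
```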
A "marginal" or unconditional likelihood can be written as
L(niQ, ~)
=
m{ I
n.
}
II I TI exp(-N.A )(N.A ) l/ n .! .
i=1 g=O g
1 g
1 g
1
(5.10)
An alternative large sample likelihood ratio procedure would be to
assign n. to the mixture component g when
1
A
TI
A
A
Pen.1 IN.,A
) = max in. P (n. IN. ,A. )}
g
1
g
J
1
1
J
O::;j::;l
(5. 11)
where A. and n. are the large sample maximum likelihood estimates or known
J
J
parameter values.
The required maximum likelihood parameter estimates for (5.10) with
the likelihood ratio procedure (5.11) can be obtained by an iterative scheme
with initial estimates determined by preliminary descriptive analysis of
the data, or with more precise initial estimates by the method of moments.  Al-
ternatively, initial estimates based upon â or ã from criteria (5.6) or (5.9)
respectively may be employed.  These are biased parameter estimates; how-
ever, they were found by Symons (1973) to work quite well as initial esti-
mates for an iterative maximum likelihood scheme for a mixture of multi-
variate normals, probably because the bias in the estimates of the means
tends to exaggerate slightly the separation of the mixture components.

It has been shown by Lindley (1965) and Cox and Hinkley (1975) that
the Bayes estimates of the parameters in likelihood (5.10) with vague prior
information will converge to the maximum likelihood estimates with large
samples.  Due to the computational difficulties in optimizing criteria (5.6)
or (5.9) over 2^m allocations, and given the desirable large sample properties
of maximum likelihood estimates and the general optimality of the likelihood
ratio criterion, the likelihood ratio procedure in (5.11) is preferable
with large samples.  With generally smaller sample sizes, the criteria
(5.6) or (5.9) may be more reliable, as has been shown by Symons (1973) for
a mixture of multivariate normals.

A computer program that performs the classification is available from
Symons, Grimson and Yuan (1982); this program makes the classification using
one of three optimal criteria: maximum likelihood (5.6), Bayesian optimal
allocation (5.9), and large sample likelihood ratio (5.11).  The results
of the following example are obtained using this computer program.
Example:  Annual numbers of deaths due to large intestinal cancer in
North Carolina from 1970 to 1978 for white males between the ages of 75
and 84, with the corresponding population sizes, are as follows:

Year     Number of deaths     Population size
1970            54                 37286
1971            39                 38053
1972            47                 38815
1973            63                 39582
1974            66                 40431
1975            81                 41188
1976            64                 42077
1977            64                 43029
1978            67                 44370

Source:  EPA data; combined population data from the Census Bureau and
         mortality data from the National Center for Health Statistics.
When classifying into two groups, all 3 options of the classification
program applied to these 9 time units (years) give the same results, as
follows:

        Group 1:  2 units   1971 and 1972
        Group 2:  7 units   1970, 1973, 1974, 1975, 1976, 1977 and 1978

If one assigns a zero to each of the units in Group 1 and a one to each
of the units in Group 2, the test statistic is A=7.  The approximate expected
value and variance of this statistic are (applying equations (3.5) and (3.2)
and assuming the Decimal to be equal to zero):

        E(A) = 0.6 x m = 0.6 x 9 = 5.4 ,
        Var(A) = 0.155 x E(A) = 0.155 x 5.4 = 0.837 .

Under the normal approximation, the Z value for the test statistic A is

        Z = (7 - 5.4)/√0.837 = 1.75 ,

which corresponds to a p-value of 0.08 (only significant at the level α = .10).

Note that the regular zero-one matrix test applied to this mortality
data without adjusting for the population size would have given the statistic
A=6 (zeros are assigned to years 1970, 1971 and 1972, and ones are assigned
to years 1973, 1974, 1975, 1976, 1977 and 1978), which corresponds to a Z value
of Z = (6 - 5.4)/√0.837 = 0.66 and a p-value of 0.51.

For the data of this example, adjusting for population size using the
regular zero-one matrix test on the death rates (instead of the death fre-
quencies) would have given the same results as those obtained through the
maximum likelihood methods.  However, the maximum likelihood procedures would
be more reliable in situations where population sizes change more drastically
from one unit to another, so that the rates do not have the same precision.
5.2  Zero-one matrix test in multivariate cases

There are situations in which the hypothesis is tested through several
variables simultaneously.  Examples are situations in which unusual patterns
for a group of diseases are tested, but these diseases cannot be combined
because of their differences in certain characteristics.

In this section we suggest a procedure for doing the test in these situ-
ations, using the zero-one matrix test.  Since the zero-one matrix test in-
volves classifying data points into two groups, zero and one, in order to
apply this test to a multivariate data set we first must classify the data
into two groups using a multivariate technique; then the significance level
of the test is assessed, based on the number of data points in each group,
conditional on the total number of data points used.
5.2.1  Classifying data points (time-space units) into two groups:  There are
several methods to classify multivariate data into two groups using cluster
analysis concepts.  Classifying involves choosing a similarity or resemblance
measure.  Two units (data points) are similar if their profiles across var-
iables are "close" or if they show "many" aspects in common, relative to
those which other pairs of units share.

Generally, the most common type of measure for non-dichotomous data is
the distance type of measure.  The Euclidean distance between two points
i and j in a space of r dimensions is

        d_ij = [ Σ_{t=1}^{r} (X_it - X_jt)² ]^{1/2} ,

where X_it and X_jt are the projection points of i and j on variable t
(t = 1,2,...,r).  The variables can be standardized to zero mean and unit stan-
dard deviation before applying the formula, since some researchers feel that
the use of the Euclidean distance measure should be restricted to orthogonal,
standardized variables.  This has the effect of assigning equal weights to
all variables.
variables. This has the effect of assigning equal weights to all variables.
The ordinary squared Euclidean distances are similar to the Mahalanobis
0
2
in the context of discriminant analysis.
Mahalanobis 0
2
takes both differences in axis lengths and the correlatedness
of axes into account.
d
2
The difference is that the
The Mahalanobis 0 2 is the same as ordinary Euclidean
if the latter is computed in a space in which the configuration has been
transformed into a hypersphere, as it has been shown by Green (1978).
Three different amalgamation rules for building up two clusters are
single linkage, complete linkage and average linkage.  Each rule can be de-
scribed briefly as follows (see Green (1978)):

Single linkage:  The single linkage or minimum distance rule starts out
by finding the two points with the shortest distance.  At the next stage, a
third point joins the already formed cluster of two if its shortest distance
to the members of the cluster is smaller than the distance between the two
closest unlinked points.  Otherwise, the two closest unclustered points are
placed into a cluster.  The process continues until all points end up in two
clusters.  The distance between two clusters is defined as the shortest dis-
tance from a point in the first cluster to a point in the second.

Complete linkage:  The complete linkage rule uses a similar way of first
clustering the two closest points.  However, the criterion for joining points
to clusters or clusters to clusters involves maximum (rather than minimum)
distance.  In other words, the distance between two clusters is the longest
distance from a point of the first cluster to a point in the second cluster.

Average linkage:  The average linkage rule starts out the same way as
the other two.  However, in this case the distance between two clusters is
the average distance from points in the first cluster to points in the sec-
ond cluster.
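A compact sketch of the single linkage rule, run until exactly two clusters remain, is given below (not from the thesis; Python is used only for illustration, with plain Euclidean distance on the unit profiles).

```python
def two_group_single_linkage(points):
    """Agglomerate the units into exactly two clusters by single linkage.

    points: one tuple of variable values per time-space unit.
    Returns the two clusters as lists of unit indices (0-based).
    """
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 2:
        best = None                                 # (distance, cluster a, cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)              # merge the two closest clusters
    return clusters
```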
5.2.2  Assessing the significance level of the test:  The test statistic A
is the size of the smaller cluster; conditional on the total number of data
points m, the significance level (or P-value) of the test statistic A is

        P = Σ_{k=0}^{A-1} C(m-2, k) (1/2)^{m-2} .                                  (5.12)

Proof of equation (5.12) is as follows:

Since we have m data points and we force them into two clusters, and
since each cluster contains at least one point, there are only (m-2) data
points to be classified into the two clusters.  Under the null hypothesis
of no special pattern in the data, each of these (m-2) points has equal prob-
ability (one half) of being classified into either of the two clusters.  There-
fore, the problem becomes one of assigning (m-2) "balls" randomly into two
"urns".  When the smaller cluster contains A of the m points, it is equivalent
to the fact that one of the two "urns" contains (A-1) "balls" of the total
(m-2) "balls".  The probability that one of the "urns" contains exactly
(A-1) "balls" is a binomial probability,

        P(U = A-1) = C(m-2, A-1) (1/2)^{m-2} ,

and the P-value of the statistic A is the probability that one of the "urns"
contains (A-1) balls or less:

        P = P(U ≤ A-1) = Σ_{k=0}^{A-1} C(m-2, k) (1/2)^{m-2} ,

which is equation (5.12).
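Equation (5.12), in the form reconstructed above from this proof, can be evaluated directly; the sketch below is not from the thesis and uses Python only for illustration.

```python
from math import comb

def smaller_cluster_pvalue(A, m):
    """P-value of the statistic A (size of the smaller cluster) given m units,
    following the reconstructed equation (5.12):
    P = sum_{k=0}^{A-1} C(m-2, k) (1/2)^(m-2)."""
    return sum(comb(m - 2, k) for k in range(A)) * 0.5 ** (m - 2)

# eight health service regions with a smaller cluster of one region
print(smaller_cluster_pvalue(1, 8))   # 1/64, about 0.016
```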
Example:  One hundred counties in the state of North Carolina are divid-
ed into 8 health service regions, as in Figure 5.1.  In this example each
region is considered a space unit, and we would like to test for space clus-
tering of heart disease and cancer, based on the 1980 death rates reported
by the North Carolina State Center for Health Statistics.  The 1980 death
rates by county in North Carolina for heart disease and cancer are presented
in Tables 5.1 and 5.2 respectively.  From these rates, the rate for each
region is calculated as the average rate of all counties in that region.  The
results are presented in Table 5.3.  The Euclidean distance between any two
regions i and j, d_ij, is calculated as

        d_ij = √[ (X_i - X_j)² + (Y_i - Y_j)² ] ,     i, j = 1,2,...,8 ,  i ≠ j ,

where X_i is the heart disease death rate in region i, and Y_i is the cancer
death rate in region i.

The matrix of distances between two regions is presented in Table 5.4.
Based on these distances, and applying the single linkage procedure as
described earlier in this section, we have the dendrogram as in Figure 5.2.
From this dendrogram, when forced into two groups, we have Group 1 consisting
of Regions 1, 2, 3, 4, 5, 6 and 8, and Group 2 consisting of only Region 7
(the statistic A equals 1).  The P-value for this observation under the
null hypothesis of no clustering of heart disease and cancer is

        P = (1/2)^6 = 1/64 ≈ 0.016 ,

which is significant at α = .05.
These results indicate that there is significant clustering of heart
disease and cancer in region 7, which consists of 16 counties of the north-
east area of North Carolina.  These results justify the concerns of the state
health officials about the inadequate health care of residents of these low
income counties, which consist mostly of small farms.  It is also noted that
if one considers only the cancer death rates in the above 8 service regions
and applies the regular univariate zero-one matrix test, using equations
(3.5) and (3.2) and assuming that the Decimal equals zero, we have the ob-
served statistic A=2 (regions 1 and 7 have high rates of cancer), and under
the null hypothesis the approximate expected value and variance of A are

        E(A) = m x 0.6 = 8 x 0.6 = 4.8 ,
        Var(A) = 0.155 x E(A) = 0.155 x 4.8 = 0.744 ,

which gives a Z value of

        Z = (2 - 4.8)/√0.744 = -3.25 .

These results show a significant clustering of cancer deaths in regions
1 and 7, while heart disease combined with cancer shows a significant cluster
only in region 7.  Therefore it can be concluded that while region 7 is
having a general health problem (which is represented by both heart disease
and cancer), region 1 may be having a cancer problem, which justifies the
concern of Schneider (1982).  However, since there are many other possible
reasons for the cancer problem in this region, it is presumptuous to conclude,
like Schneider, that it is caused by the extensive use of agent white on
the forest areas of western North Carolina.
Figure 5.1

NORTH CAROLINA COUNTIES AND EIGHT HEALTH SERVICE REGIONS

[Map not reproduced.  Region labels legible in the source include Far West,
West, Piedmont, Southern Piedmont, Capital, Cardinal, Northeast, and Southeast;
Region 7 is the northeast region referred to in the text.]
TABLE 5.1

Mortality statistics for North Carolina, 1980: number of deaths and death
rate (per 100,000 residents) from heart disease, for the state, the health
regions and health service areas, and each of the 100 counties.

[County-level figures not reproduced legibly from the source.  The state
total is 17,579 deaths, a rate of 299.24 per 100,000 residents.]
TABLE 5.2

Mortality statistics for North Carolina, 1980: number of deaths and death
rate (per 100,000 residents) from cancer, for the state, the health regions
and health service areas, and each of the 100 counties.

[County-level figures not reproduced legibly from the source.  The state
total is 9,690 deaths, a rate of 165.08 per 100,000 residents.]
Table 5.3

1980 Heart Disease and Cancer Death Rates* in North Carolina
by Health Service Region

Region     1980 Heart Disease     1980 Cancer
              Death Rate           Death Rate
  1              313                  193
  2              332                  163
  3              306                  164
  4              354                  163
  5              334                  169
  6              348                  173
  7              371                  221
  8              330                  163

*Per 100,000
66
Table 5.4

Euclidean distance between 2 regions in North Carolina
based on heart disease and cancer death rates

Region     1      2      3      4      5      6      7      8
  1        0    1261    890   2581   1017   1625   4148   1189
  2      1261      0    677    484     40    356   4885      4
  3       890    677      0   2305    809   1845   7474    577
  4      2581    484   2305      0    436    136   3653    576
  5      1017     40    809    436      0    212   4073     52
  6      1625    356   1845    136    212      0   2833    424
  7      4148   4885   7474   3653   4073   2833      0   5045
  8      1189      4    577    576     52    424   5045      0
67
Figure 5.2
Dendrogram from single linkage clustering on Euclidean distance of
heart disease and cancer death rates in the 8 Health Services Regions
of the State of North Carolina
[Dendrogram not reproduced.  Reading from the top, the regions appear in the
order 3, 5, 2, 8, 4, 6, 1, 7; Region 7 is the last to join, at the largest
relative distance.  The horizontal axis is relative distance, in rank order.]
CHAPTER VI

SUMMARY, PRACTICAL GUIDE, AND SUGGESTIONS
FOR FURTHER RESEARCH

Several tests have been suggested by different authors for testing
or modeling the temporal and/or spatial clustering of disease.  These in-
cluded the Pinkel-Nefzger test, the Knox test, the Barton-David test,
the Ederer-Myers-Mantel (EMM) test, the Grimson model modification of the
EMM test, the Mantel test, the scan test and the Bailar-Eisenberg-Mantel
test.  Since these tests are sensitive to different types of clustering,
it is important to have a clear understanding of the various concepts of
temporal and spatial clusters and of "time-space interaction".  In this
research these concepts are deliberated when the tests are reviewed with
the discussion of their differences and similarities.

Some weaknesses of the conventional tests are:
- They involve some subjective steps to determine the criteria;
- They are sensitive to time or space clustering as well as to time-space
  interaction;
- The distributions of the test statistics are so complicated that the
  significance levels cannot be assessed in most practical situations,
  even with approximation;
- They are a one-tail test of clustering, not sensitive to "vacuity".

Due to these weaknesses we suggest two new tests:
69
- Number-of-empty-cells test for rare events, and
- Zero-one matrix test for events with larger frequencies.

The number-of-empty-cells test is recommended for testing clusters of
relatively rare events, i.e., the total number of disease events is between
one half and twice the total number of time-space units (m/2 < n < 2m).  The
zero-one matrix test is recommended for testing clusters of events with
larger frequencies, i.e., when the number of events is greater than twice
the number of time-space units (n > 2m).

In the next section of this chapter a step-by-step guide for testing
is presented which summarizes the work done in this research on the above
two tests.
6.1  Practical guide

Suppose we have n disease cases in m time-space subunits, and that each
subunit has n_ij cases, where j indexes the time unit, j = 1,2,...,c, and i
indexes the space unit, i = 1,2,...,r;  Σ_j n_ij = n_i.  and  Σ_i Σ_j n_ij = n.
The following steps are recommended for a test of clustering:
6.1.1  Step 1, specifying the alternative hypothesis:  Depending on the cir-
cumstances, one may want to test for time clustering, for space clustering, or
for time-space "interaction".  When time (or space) clustering is being tested,
it is also necessary to specify whether the time (or space) clustering is
being tested across all space (or time) units simultaneously or whether the
clustering is being tested through individual space (or time) units.  The
difference between these two hypotheses is that the former seeks to answer
the question "is there an unusual pattern over time where the pattern is the
same for all space units?", while the latter seeks to answer the question
"is there some unusual pattern over time which may not necessarily be the
same across all space units?"

To answer the first question we can ignore the individual time-space
frequencies n_ij and do the test on the marginal frequency on time, n_.j, con-
ditional on the total n and the number of time units c, as if we had only
one space unit.  To answer the second question the test should be done on
each space unit separately, i.e., n_ij should be considered for each value
of i separately (i = 1,2,...,r), conditional on the number of time units c and
n_i.; then the results are combined as in Step 5.  This test is sensitive to
both time clustering and time-space "interaction".
To test for time--spac:e "interaction" n .. should be considered, cancEIJ
ti.onal on both marginal frequencies n.
1°
and n , and total number of time"
, OJ
space subunits m = nxc,
Note that since the procedure to test for space patterns is similar
to that for time patterns with the word "space 1t and "time'" interchanged,
and the subscripts i and j interchanged, in the following steps we discuss
only the tests for time clustering and for time-space interaction.
6.1.2  Step 2, choosing a test:  After deciding on the alternative hypothesis,
a test must be chosen based on the total frequency "n" and the total number
of units "m" that the test uses as conditional factors.  For example, if
one tests for time clustering across all space units, then "n" is the above
n and "m" is the number of time units c; the n_.j are used as our observations.
If the test for time clustering is done for each space unit i separately,
then "n" is n_i. and "m" is the number of time units c, and the n_ij are used
as our observations.  If one tests for time-space "interaction", then "n" is
the above total n and "m" is the above m, m = r x c, and the n_ij are again
used as our observations.

Following are the recommended tests under different conditions:
The number-of-empty-cells test is used when (1/2)"m" < "n" < 2"m"    (Step 3).
The zero-one matrix test is used when "n" > 2"m"                     (Step 4).
For "n" < (1/2)"m" the Knox test is suggested (Chapter 2, Section 2).

6.1.3  Step 3, the number-of-empty-cells test:
In this step the notations n and m should be understood as the total
frequency and the total number of units (cells) respectively, which are the
conditional factors on which the test is based.  Depending on the hypothesis
tested, n and m in this step correspond to the values specified in Step 2 for
"n" and "m".

a)  Test statistic:  x = number of cells that contain no cases.

b)  Exact distribution:  For k ≥ max(0, m-n),

        P(X = k) = C(m, k) Σ_{j=0}^{m-k} (-1)^j C(m-k, j) ((m-k-j)/m)^n .

c)  Exact moments:  The descending factorial moments of X are

        E(X^(k)) = m^(k) ((m-k)/m)^n ,     where m^(k) = m(m-1)···(m-k+1),

which give

        E(X) = m((m-1)/m)^n ,
        Var(X) = m(m-1)((m-2)/m)^n + m((m-1)/m)^n - m²((m-1)/m)^{2n} .

d)  P-value of the test statistic x:

        P = P(X ≥ x) = Σ_{k=x}^{m} P(X = k) .
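The following sketch evaluates the number-of-empty-cells P-value using the classical occupancy-problem form of the distribution, which is consistent with the exact moments given above (the sketch is not from the thesis; Python is used only for illustration).

```python
from math import comb

def empty_cells_pvalue(x, m, n):
    """P = P(X >= x) for X = number of empty cells when n cases fall
    independently and uniformly into m cells (classical occupancy form)."""
    def prob(k):
        return comb(m, k) * sum((-1) ** j * comb(m - k, j) * ((m - k - j) / m) ** n
                                for j in range(m - k + 1))
    return sum(prob(k) for k in range(x, m + 1))

# sanity check against the exact mean E(X) = m((m-1)/m)^n
m, n = 6, 10
mean = sum(k * (comb(m, k) * sum((-1) ** j * comb(m - k, j) * ((m - k - j) / m) ** n
                                 for j in range(m - k + 1))) for k in range(m + 1))
print(mean, m * ((m - 1) / m) ** n)   # the two numbers agree
```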
6.1.4  Step 4, the zero-one matrix test:

a)  Test statistic

(i)  For time clustering across all space units:

        A = Σ_{j=1}^{c} a_j ,
        where  a_j = 0  if  n_.j < [n/c - 0.5] ,
               a_j = 1  if  n_.j ≥ [n/c - 0.5] .

(ii)  For time clustering on an individual space unit:

        A_i = Σ_{j=1}^{c} a_ij ,
        where  a_ij = 0  if  n_ij < [n_i./c - 0.5] ,
               a_ij = 1  if  n_ij ≥ [n_i./c - 0.5] .

(iii)  For time-space "interaction":

        C = Σ_{i=1}^{r} Σ_{j=1}^{c} a_ij ,
        where  a_ij = 0  if  n_ij < [n_i. n_.j / n - 0.5] ,
               a_ij = 1  if  n_ij ≥ [n_i. n_.j / n - 0.5] ,

and [x] = smallest integer larger than x.
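A direct implementation of the interaction statistic C defined in (iii) is sketched below for an r x c table of counts (not from the thesis; Python is used only for illustration).

```python
import math

def interaction_statistic(table):
    """C of Step 4 (iii) for a table of counts n_ij
    (rows = space units i = 1..r, columns = time units j = 1..c)."""
    r, c = len(table), len(table[0])
    row = [sum(table[i]) for i in range(r)]                        # n_i.
    col = [sum(table[i][j] for i in range(r)) for j in range(c)]   # n_.j
    n = sum(row)
    C = 0
    for i in range(r):
        for j in range(c):
            # [x] = smallest integer larger than x, applied to n_i. n_.j / n - 0.5
            threshold = math.floor(row[i] * col[j] / n - 0.5) + 1
            C += 1 if table[i][j] >= threshold else 0
    return C
```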
b)  Exact distribution:  Since the distribution of the test statistic C
has not been investigated, and since the statistics A and A_i are similar ex-
cept for the differences in notation, we only discuss the distribution of
the statistic A.  Note that A is equivalent to a special case of the statistic
C with r = 1 and m = c.  In order to make the notation consistent with that
of the previous chapters we use r as 1 and m as c.  The exact distribution of
A is given by equation (2.7), with p = 1/m and v = n.

c)  Exact moments:  The descending factorial moment of A is

        E(A^(k)) = m^(k) I_p^{(k)}(R, v) ,

which gives

        E(A) = m I_p(R, v)    and
        Var(A) = m(m-1) I_p^{(2)}(R, v) - m I_p(R, v)[ m I_p(R, v) - 1 ] ,

where I_p(R, v) and I_p^{(2)}(R, v) are the incomplete Dirichlet integrals of
the first and second degree.

d)  Approximate moments:  Due to the complexity of the incomplete Dirich-
let integral involved in the exact distribution and moments of A, approxima-
tions are needed in their evaluation.  As discussed in Chapter 3, the expected
value and variance of A can be approximated as follows:

        E(A) = m(0.6 + 0.3 x Decimal) ,

where Decimal is the difference between n/m and [n/m - 0.5], and

        Var(A) = 0.155 x E(A) .

e)  P-value:  The exact distribution of A can be assessed based on its
exact probability distribution function (2.7).  However, due to its complex-
ity we suggest approximating the P-value by first calculating the Z value
from the expectation and the variance of the test statistic,

        Z = (A - E(A)) / √Var(A) ,

and then using the standard normal probability distribution to get the P-value
corresponding to this Z.
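The normal-approximation step can be carried out as below.  The two-sided tail is used because it reproduces the worked p-values of Chapter V (p ≈ 0.08 for Z = 1.75 and p ≈ 0.51 for Z = 0.66); the sketch is not from the thesis and uses Python only for illustration.

```python
from math import erf, sqrt

def approximate_pvalue(z):
    """Two-sided P-value from the standard normal approximation, P = 2(1 - Phi(|Z|))."""
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))

print(approximate_pvalue(1.75))   # about 0.08
```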
74
6.1.5  Step 5, combining the results of tests done on individual space
units:  When the time clustering test is done separately on each individual
space unit, as with the test statistic A_i in Step 4, the combined P-value of
the test can be assessed using a procedure similar to that used for the
EMM test: the combined statistic is a chi-square statistic with one degree
of freedom,

        χ²_1 = [ | Σ_{i=1}^{r} A_i - Σ_{i=1}^{r} E(A_i) | - 0.5 ]²  /  Σ_{i=1}^{r} Var(A_i) .

Note that since this test is sensitive to both time clustering and time-
space "interaction", it is not recommended; instead, the statistic A or C
may be more appropriate, depending on the hypothesis being tested.
6.1.6  Step 6, adjusting for extraneous factors:  In this research we dis-
cuss only the adjustment for a single, non-dichotomous variable such as
population size.  This adjustment is done by classifying the time-space sub-
units as zero or one, using the classification procedure recommended by
Symons, Grimson and Yuan (1982).  This procedure uses one of the following
three criteria: maximum likelihood, large sample likelihood ratio, and Bayes
optimal allocation, based on the Poisson assumption regarding the distribu-
tion of the number of disease cases in a time-space subunit.  Alternatively,
in adjusting for the population sizes, the regular zero-one matrix test sum-
marized in Section 6.1.4 (Step 4) can be used with the rates of events re-
placing the frequencies of events; in this case the approximate first moment
can be calculated using equation (3.5), assuming the Decimal to be equal to
zero.  This procedure is only appropriate for situations in which the varia-
tion in the population size is not too large, i.e., the rates are calculated
with comparable precision.
6.1.7
Step 7, multivariate procedure:
To test for unusual patterns on
several variables (diseases) simultaneously, the time-space subunits are
classified into two groups based on their relative Euclidean distance from
one to another.
The subunits are linked into groups by using one of three
linkage procedures: single linkage, complete linkage, and average linkage,
until we have only two groups.
The statistic is the size of the smaller
group, namely A, and the P-value conditional on the total m subunits is
p
6.2
Suggestions for further research
There are several issues related to the zero-one matrix test for time-
space clustering that need investigation before the test can be useful in most practical situations.
Some of these issues are:
- Extensive tables are needed on the exact distributions and/or on the
exact moments of the test statistic A, the test for time clustering.
- Exact distribution and moments of the test statistic C, the test for
  time-space "interaction", are still not known; this problem needs to be
  examined further so that an approximation procedure can be investigated
  based on the information from the exact distribution and/or moments.
- The computer program that classifies data into two groups, adjusting for
  population sizes, based on the maximum likelihood criterion, is now only
  available for use by its authors.  This program needs to be documented and
  modified in such a way that it can be used as a subroutine by others who
  need to do the test.
- A test for time-space clustering adjusting for several extraneous factors
  simultaneously, using an approach similar to that of Symons, Grimson and
  Yuan, can be very useful in many practical situations.  This problem needs
  to be investigated further, especially the problem of adjusting for
  categorical variables such as race and sex.
Appendix 1.1   Program to use the IMSL subroutine for the Incomplete Beta function
      INTEGER I, K, J, IER
      REAL XX, AA, BB, P
      DIMENSION X(6), B(16)
C     B = n = number of cases
      READ (1,100) (B(I), I=1,16)
C     X = m = number of time units
      READ (1,100) (X(K), K=1,6)
  100 FORMAT (20F3.0)
      DO 10 I = 1,16
      DO 8  K = 1,6
C        p = 1/m
         XX = 1./X(K)
C        r = n/m rounded to the nearest integer
         J  = (B(I)*XX) + 0.50
         AA = J
C        n - r + 1
         BB = B(I) - AA + 1.
C        IMSL routine MDBETA returns  P = I_p(r, n-r+1) = alpha
         CALL MDBETA (XX, AA, BB, P, IER)
         PRINT 200, XX, AA, BB, P, IER
    8 CONTINUE
   10 CONTINUE
  200 FORMAT (F15.10, 2F10.0, F15.8, I5)
      STOP
      END
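The tabulated values can also be reproduced with a modern library in place of
the IMSL routine MDBETA.  The following Python lines are a sketch using scipy
(an assumption, not part of the original program) and follow the same rounding
rule for r as the listing above.

    from scipy.special import betainc

    def alpha(n_cases, m_units):
        # p = 1/m ;  r = n/m rounded to the nearest integer (r >= 1 is assumed).
        p = 1.0 / m_units
        r = int(n_cases * p + 0.5)
        return betainc(r, n_cases - r + 1, p)   # regularized incomplete beta I_p(r, n-r+1)

    print(alpha(n_cases=20, m_units=6))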
[Output of the Appendix 1.1 program (pages 77-79): for each n (number of cases)
and p = 1/m, the computed r, n-r+1, and the incomplete beta probability
I_p(r, n-r+1) = alpha.]
Appendix 1.2   FORTRAN program to calculate Sum-2, which is used for I_p(r,n)
      INTEGER R, M, N, K, MK, I, J, F1, F3, F5, F6
      REAL P, SUM1, SUM2, MFAC, MKFAC, IFAC, JFAC, A, B, AAA
      DIMENSION XX(6), BB(16)
C     Read in n (number of cases) and m (number of time units)
      READ (1,100) (BB(I), I=1,16)
      READ (1,100) (XX(K), K=1,6)
  100 FORMAT (20F3.0)
      DO 10 II = 1,16
      DO 8  KK = 1,6
C        p = 1/m ,  n = BB(II)
         P = 1./XX(KK)
         N = BB(II)
C        r = n p rounded to the nearest integer
         AAA = (N*P) + 0.501
         R   = AAA
C        M = n - 2r
         M = N - (R*2)
         SUM2 = 0.
C        MFAC = (n-2r)!
         MFAC = 1.
         DO 1 F1 = 1,M,1
            MFAC = MFAC*F1
    1    CONTINUE
         DO 2 K = 0,M,1
C           MKFAC = (n-2r-k)!
            MK = M - K
            MKFAC = 1.
            DO 3 F3 = 1,MK,1
               MKFAC = MKFAC*F3
    3       CONTINUE
            SUM1 = 0.
            DO 4 I = 0,K,1
               J = K - I
C              IFAC = i! ,  JFAC = (k-i)!
               IFAC = 1.
               DO 5 F5 = 1,I,1
                  IFAC = IFAC*F5
    5          CONTINUE
               JFAC = 1.
               DO 6 F6 = 1,J,1
                  JFAC = JFAC*F6
    6          CONTINUE
               A = MFAC/(IFAC*JFAC*MKFAC*(I+R)*(J+R))
               SUM1 = SUM1 + A
    4       CONTINUE
            B = ((-1.)**K)*(P**K)*SUM1
            SUM2 = SUM2 + B
    2    CONTINUE
    8 PRINT 200, P, R, M, SUM2
   10 CONTINUE
  200 FORMAT (F15.4, 2(I10,1X), F15.10)
      STOP
      END

The quantity Sum-2 computed by this program is used in

     1 - C_p(r,n)  =  [ n! p^(2r) / ( (n-2r)! [(r-1)!]^2 ) ] x (Sum-2) ,

where

     Sum-2  =  SUM_{k=0}^{n-2r} (-1)^k p^k
               SUM_{i=0}^{k} (n-2r)! / [ i! (k-i)! (n-2r-k)! (r+i) (r+k-i) ] .
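As a check on the FORTRAN listing, the same double sum and the quantity it
feeds can be written compactly in Python; this is a sketch, with exact
factorials in place of the floating-point products used above, and is suitable
only for the small values of n used in the appendix tables.

    from math import factorial

    def sum2(n, r, p):
        # Double sum quoted above.
        total = 0.0
        for k in range(n - 2 * r + 1):
            inner = sum(factorial(n - 2 * r)
                        / (factorial(i) * factorial(k - i) * factorial(n - 2 * r - k)
                           * (r + i) * (r + k - i))
                        for i in range(k + 1))
            total += ((-1.0) ** k) * (p ** k) * inner
        return total

    def one_minus_c(n, r, p):
        # 1 - C_p(r,n) = n! p^(2r) (Sum-2) / ( (n-2r)! [(r-1)!]^2 )
        return (factorial(n) * p ** (2 * r) * sum2(n, r, p)
                / (factorial(n - 2 * r) * factorial(r - 1) ** 2))

    print(sum2(n=10, r=2, p=0.2), one_minus_c(n=10, r=2, p=0.2))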
[Output of the Appendix 1.2 program (pages 81-82): for each p (= 1/m) and n,
the computed r, n-2r, and Sum-2.]
Appendix 1.3   SAS program to calculate Π(r,n)

[SAS program listing and execution log; the program evaluates Π(r,n) from the
quantity Sum-2 of Appendix 1.2.]
Appendix 1.3 (continued)

[Output of the SAS program (pages 84-85): 96 observations listing p, n, r, m,
Sum-2 and the computed probabilities.]
Appendix 2

[Table (96 observations): the exact and approximated first two moments of A,
with variables N, R, M, EEA, EVA, Decimal, PEA, PVA, RE1 and RE2 as labelled
at the end of this appendix.]
Appendix 2 (continued)

[SAS PROC UNIVARIATE output for RE1, the residual of the expected value of A
(96 observations): moments, quantiles, extremes, stem-and-leaf display, box
plot and normal probability plot.]
Appendix 2 (continued)

[SAS PROC UNIVARIATE frequency table for RE1: the 96 observed values with
their cell and cumulative percentages.]
Appendix 2 (continued)

[SAS PROC UNIVARIATE output for RE2, the residual of the variance of A
(96 observations): moments, quantiles, extremes, stem-and-leaf display, box
plot and normal probability plot.]
Appendix 2 (continued)

[SAS PROC UNIVARIATE frequency table for RE2: the 96 observed values with
their cell and cumulative percentages.]
Labels of the variable names in Appendix 2

N, R, M, EEA, EVA, Decimal:  same as the labels in Table 3.1 (page 36).

PEA = predicted value of the expected value of A, using equation 18:
      PEA = M(0.6 + 0.3 x Decimal).

PVA = predicted value of the variance of A, using PEA and equation 15:
      PVA = 0.155 PEA.

RE1 = (PEA - EEA) x 100/EEA.

RE2 = (PVA - EVA) x 100/EVA.
REFERENCES
Abe, O. (1973) "A note on the methodology of Knox's tests of 'time and space
interaction'", Biometrics, 29, 67-77.

Bailar, J. C., Eisenberg, H. and Mantel, N. (1970) "Time between pairs of
leukemia cases", Cancer, 25(6), 1301-1303.

Barton, D. E., David, F. N. and Merrington, M. (1965) "A criterion for testing
contagion in time and space", Annals of Human Genetics, 29, 97-102.

Barton, D. E., David, F. N., Fix, E. and Merrington, M. (1967) "Tests for
space-time interaction and power functions", Proceedings of the Fifth Berkeley
Symposium, Ed. Lucien M. LeCam and Jerzy Neyman, Vol. IV, 217-227, University
of California Press, Berkeley.

Bennett, B. M. and Nakamura, E. (1968) "Percentage points of the range from a
symmetric multinomial distribution", Biometrika, 55, 377-379.

Besag, J. and Diggle, P. J. (1977) "Simple Monte Carlo tests for spatial
pattern", Applied Statistics, 26, 327-333.

Cox, D. R. and Hinkley, D. V. (1975) Theoretical Statistics, Chapman and Hall,
London.

David, F. N. and Barton, D. E. (1966) "Two space-time interaction tests for
epidemicity", British Journal of Preventive and Social Medicine, 20, 44-48.

David, H. A. (1970) Order Statistics, John Wiley & Sons, Inc., New York.

Darwin, J. H. (1957) "The difference between consecutive members of a series
of random variables arranged in order of size", Biometrika, 44, 211-218.

Ederer, F., Myers, M. H. and Mantel, N. (1964) "A statistical problem in space
and time: Do leukemia cases come in clusters?", Biometrics, 20, 626-638.

Feller, W. (1968) An Introduction to Probability Theory and Its Applications,
John Wiley & Sons, Inc., New York.

Godwin, H. J. (1949) "Some low moments of order statistics", Annals of
Mathematical Statistics, 20, 279-285.

Green, P. E. (1978) Analyzing Multivariate Data, Dryden Press, Hinsdale,
Illinois.

Greenwood, R. E. and Glasgow, M. O. (1950) "Distribution of maximum and
minimum frequencies in a sample drawn from a multinomial distribution", Annals
of Mathematical Statistics, 21, 416-424.

Grimson, R. C. (1979) "The clustering of diseases", Mathematical Biosciences,
46, 257-278.

Gumbel, E. J. (1958) Statistics of Extremes, Columbia University Press, New
York.

Gumbel, E. J. and Herbach, J. H. (1951) "Exact distribution of the extremal
quotient", Annals of Mathematical Statistics, 22, 418-426.

Gumbel, E. J. and Keeney, R. D. (1950) "The extremal quotient", Annals of
Mathematical Statistics, 21, 523-538.

Gumbel, E. J. and Pickands III, J. (1967) "Probability tables for the extremal
quotient", Annals of Mathematical Statistics, 38, 1541-1551.

Harter, H. L. (1969) Order Statistics and Their Use in Testing and Estimation,
Vol. 1, Aerospace Research Laboratories, USAF.

Harter, H. L. and Owen, D. B. (1975), Editors: Selected Tables in Mathematical
Statistics, Vol. 4, Institute of Mathematical Statistics; Dirichlet
Distribution, Type 1, tables by Milton Sobel, V. R. R. Uppuluri and
K. Frankowski.

International Mathematical and Statistical Libraries, Inc. (IMSL) (1979)
Reference Manual, Edition 7, Vol. 2, "MDBETA (MDBA): beta probability
distribution function".

Johnson, N. L. (1960) "An approximation to the multinomial distribution: some
properties and applications", Biometrika, 47, 93-102.

Johnson, N. L. and Kotz, S. (1977) Urn Models and Their Application, John
Wiley & Sons, Inc., New York.

Johnson, N. L. and Young, D. H. (1960) "Some applications of two
approximations to the multinomial distribution", Biometrika, 47, 463-469.

Klauber, M. R. (1971) "Two-sample randomization tests for space-time
clustering", Biometrics, 27, 129-142.

Knox, G. (1963) "Detection of low intensity epidemicity, application to cleft
lip and palate", British Journal of Preventive and Social Medicine, 17,
121-127.

Knox, G. (1964a) "The detection of space-time interaction", Applied
Statistics, 13, 25-30.

Knox, G. (1964b) "Epidemiology of childhood leukemia in Northumberland and
Durham", British Journal of Preventive and Social Medicine, 18, 17-24.

Lindley, D. V. (1965) Introduction to Probability and Statistics from a
Bayesian Viewpoint, Part 2, Inference, Cambridge University Press, London.

Mantel, N. (1967) "The detection of disease clustering and a generalized
regression approach", Cancer Research, 27, 209-220.

Mantel, N., Kryscio, R. J. and Myers, M. H. (1976) "Tables and formulas for
extended use of the Ederer-Myers-Mantel disease-clustering procedure",
American Journal of Epidemiology, 104(5), 576-584.

Naus, J. I. (1965) "The distribution of the size of the maximum cluster of
points on a line", Journal of the American Statistical Association, 60,
532-538.

Naus, J. I. (1966a) "A power comparison of two tests for non-random
clustering", Technometrics, 8, 493-517.

Naus, J. I. (1966b) "Some probabilities, expectations and variances for the
size of the smallest intervals and largest clusters", Journal of the American
Statistical Association, 61, 1191-1199.

Naus, J. I. (1982) "Approximations for distributions of scan statistics",
Journal of the American Statistical Association, 77, 177-183.

North Carolina Vital Statistics (1980) Leading Causes of Mortality, 2, North
Carolina Center for Health Statistics, Department of Human Resources, Public
Health Statistics Branch.

Pearson, E. S. and Hartley, H. O. (1962) Biometrika Tables for Statisticians,
Vol. 1, Cambridge University Press, London.

Pike, M. C. and Smith, P. G. (1968) "Disease clustering: a generalization of
Knox's approach to the detection of space-time interaction", Biometrics, 24,
541-546.

Pinkel, D. and Nefzger, K. (1959) "Some epidemiological features of childhood
leukemia in Buffalo, N.Y., area", Cancer, 12, 351-358.

Pyke, R. (1965) "Spacings", Journal of the Royal Statistical Society, B, 27,
395-436. Discussion, 437-449.

Riordan, J. (1958) An Introduction to Combinatorial Analysis, John Wiley &
Sons, Inc., New York.

Roberson, P. K. (1979) "Distributional and robustness problems in time-space
disease clustering", Ph.D. dissertation, University of Washington, Seattle.

Schneider, K. (1982) "Agent white", Inquiry, March 15, 1982, 14-18.

Sobel, M. and Uppuluri, V. R. R. (1974) "Sparse and crowded cells and
Dirichlet distributions", Annals of Statistics, 2, 977-987.

Stark, C. R. and Mantel, N. (1967a) "Lack of seasonal or temporal-spatial
clustering of Down's syndrome births in Michigan", American Journal of
Epidemiology, 86, 199-213.

Stark, C. R. and Mantel, N. (1967b) "Temporal-spatial distribution of birth
dates for Michigan children with leukemia", Cancer Research, 27, 1749-1755.

Symons, M. J. (1973) "Bayes modification of some clustering criteria",
Institute of Statistics Mimeo Series No. 880, Department of Biostatistics,
University of North Carolina at Chapel Hill.

Symons, M. J., Grimson, R. C. and Yuan, Y. C. (1982) "Clustering of rare
events", in press.

Wallenstein, S. (1980) "A test for detection of clustering over time",
American Journal of Epidemiology, 111(3), 367-372.

Wallenstein, S. and Naus, J. I. (1974) "Probabilities for the size of largest
clusters and smallest intervals", Journal of the American Statistical
Association, 69, 690-697.

Young, D. H. (1962) "Two alternatives to the standard chi-square test of the
hypothesis of equal cell frequencies", Biometrika, 49, 107-116.