~f'
COLU:GE
STAT;C~
i
;'j'I..i]C:;:, NCWTHr:AROL);,';f.:J
UNIVERSITY OF NORTH CAROLINA
Department of Sta~istics
Chapel Hill, N. C.
STUDIES IN SAMPLING WITH PROBABILITIES
DEPENDING ON SIZE
by
Judith Rich O'Fallon
May
1963
This research was supported primarily by the
National Institutes of Health under Grant No.
GM-10397-01.
Institute of Statistics
Mimeo Series No. 362
. ""<1("
" "
>
•
In the copies of this paper for the Mimeo Series, pages 38,
39, and 70 - 75 are omitted •.
When the paper was submitted
to the Graduate School of the University of North Carolina
as a master's thesis, these pages contained the flow
diagram for the computer program presented in
Appendi~
C
and graphs based on the data given in Tables IX-A through
XVII-Ao
Because it would be both difficult and expensive
to reproduce these drawings for mass production, we must,
regretfully, leave them out of the present copies of the
paper.
•
1i
ACKNOWLEDGEm:NTS
Without. assistance from many people, this thesis could
not have been completed.
I am especially gra.teful to Dr.
N. L. Jolmson, who gave so generously of his valuable time,
ideas, suggestions, and encouragement.
Thanks are due also
to Dr. W~ J. Hall for his personal interest in the final stages
of the thesis; to Mrs. Doria Gardner, who so patiently and ably
typed the te.."Ct
of the thesis; and to Miss Martha Jordan, who
guided me safely through the proper procedures and went out of
her way to help me to meet each deadline.
Many people in the Industry Division of t1:e Census Bureau
deserve a special vote of thanks.
Among them are Maxwell R.
Conklin" Chief of the Industry Division" without whose approval
I could not have worked on this problem; Jack L. Ogus, who
first kindled my interest in the problem, then anal-Tered
~.n
numerable questions to help me understand it; and Don Clark,
Ed Ricketts" and Judy Levine, who all helped in many ways.
IAst, but not least, I thar..k my husband Mike for listening
to me countless times when I needed to "talk out" an idea and
for being always kind and understanding.
iii
TABLE OF CONrENrS
Chapter
page
.. ....
THE ENTIRE PROBLEM • . .- . . . . . . . . . . . . . . .
1.1 The Development of P.D.S. sampling · . . . . . .
ACKNOWLEDGEMENTS
I
....
ii
1
1
1.2 The Ogus Solution for the Special Case When OnlY
One Class is Sampled
...
·..·.
• • • • • • • •
3
... • • •
4
1.2.1 Definitions • • • • •
..
1.2.2 Derivation of the Optimal Probability.
1.2.:3 A Method of Solution
• • • • • •
·......
2
Dhi
1.:3.1 Characteristics of the Population •
1.:3 Estimation of the Measures of Size ,
1.:3.2
1.:3.:3
II.
• • • •
......
Estimators Being Studied• • • • • · . . · . • •
Discussion of These Estimators . . . . . . . .
6
8
9
9
10
13
OPTIMAL ASSI<rnMENT OF PROBABILITIES FOR MORE THAN
ONE CLASS
....................
17
2.1 Derivation of the LaGrange Equations for a
General Number of Classes • • •
. .... ·.
...·..·...
The Equations for This Special Case • . . .
2.2 Specialization to Two Classes
2.2.2 The Procedure for Obtaining the Probabilities.
17
19
19
20
2.2.:3 The Use of a Single-Value Approximation
to Simplify Equation (2.8) • • • • • • •
·...
21
;Lv
Chapter
Page
2.2.4 Results of the Application of the SingleValue Approximation to an Artificial
Population • • • • • • • • • • • • •
23
2.2.5 The Use of the "Mixed" Approximation to
SimplifY Equation (2.8)
••••••••
26
2.2.6 Results of the Application of the Mixed
Approximation to an Artificial Population
• • 28
2.2.7 Comparison of the Methods of Approximation. • 29
III
A COMPUTER PROGRAM FOR COMPUTING THE TRUE SOWTION
WHEN TWO CLASSES ARE SAMPLED. • • • • • • • • • • • • 31
IV
PROPERTIES OF THE SOWTION FOR TWO CLASSES
•••••
43
4.1 The Optimal Probabilities for Two Classes
Sampled Jointly Compared with "the Optimal
Probabilities for Two Classes Sampled
Separately
.
Simplification of the Solution for Independent
Classes
..... . . . . . . • . . . . . . . .
46
Sensitivity of the Variances and Expected
Sample Size to Changes in a.
and b •
...
46
..
50
4.4 The Effect upon the Equations of Using
Different Variances Specifica.tions • • • • •
A Method of Selecting the Sample • •
• •
50
v
Chapter
v
Page
4.6
Variance of the sample Size • • • • • • • • ••
51
4.7
Existence and Uniqueness of the Solution
5;
A COMMENT ON THE SOWTION OF THE PROBLEM FOR A
GENERAL NUMBER OF CLASSES
•••• • • • • • • • ••
• • •
60
6;
APPENDIX A:
Tables Discussed in the Text
APPENDIX B:
Tables Mentioned Briefly in the Text. •
85
APPEEDIX C:
Computer Program in the GAT IAnguage. . •
9'+
BIBLIOGRAPHY • • • • • • • • • • • • • • • • • • • • 101
CHAPTER I
THE ENTIRE PROBLEM
1.1 The Development of 'lprobabilities Depending on Size lt (P.D.S.)
sampling
The problem with which this thesis deals is only a small facet
of a large problem confronting the Industry Division of the United
states Census Bureau.
During each year the Industry DiVision takes
scores of surveys, the most comprehensive of which is the Annual
Survey of Manufactures
f§ll,
which is used to estimate industrial
data such as total employment, average wages and man-hours, total
value of shipments for each of 1,100 product classes, and cost of
materials for industrial populations consisting of manufacturing establishments.
An establishment is defined to be a plant or location
at which a product is manufactured; thus, a large company or corporation may have many establishments, often of differing size and importance.
In a ty.pical industrial POPulation there are many small es-
tablishments, fewer medium-sized establishments, and a very few
giants.
Clearly, any method of sampling (such as simple random
sampling) in which there is positive probability that a giant will
be excluded from the sample is not a desirable sampling plan.
Therefore, the Industry Division wants to use some form of sampling with probabilities depending on size (p.d.s.); in partiCUlar,
~he numbers in square brackets refer to the bibliography.
•
2
the probability that an establishment will be selected for a sample
should be a function of the contribution which the establishment is
expected to make to the estimate.
Furthermore" in order to ensure
that the giants will alwys be included in the sample, the sampling
plan should be a form of stratified sampling haVing two strata:
a
certainty stratum" in which all establishments are selected with
probability one" and a non-certainty stratum" in which the probability
of selection for each establishment is less than one.
Finally, the
sampling plan should have the property that the selection of each establishment is independent of the selection of every other establishment.
.As Ogus explained (in a letter to the author concerning this
point)" the reason why the independence condition is desirable is because it will help the Industry Division "to avoid selecting the same
establishments for different surveys in order to spread the reporting
burden... •
The fact that our universes are relatively small is im-
portant here." Precisely how this is to be accomplished is a question
which the author has considered to be beyond the scope of this thesis.
Because the questionnaire connected with any survey is designed
to elicit information on many economic factors (e.g." value of shipments in each of several product classes, wges" cost of materials)"
one factor must be chosen so that the probabilities of selection are
functions of the contributions to the estimate of this factor.
Now,
every establishment will be able to supply information on items such
as 'Wages and man-hours" but not every establishment in the population
manufactures products which are classified in a particuJ./:l.r ;product c1.ass.
3
This situation can cause real difficulties --- and actually has done
so for the Census Bureau in the past.
To illustrate" suppose that
the probability of selection depends upon the contribution of the establishment to the estimate of the total number of employees; and
suppose that we also wish to estimate the total value of shipments
(which is a very important value) for each of" say" ten classes.
If
one of the classes contains products which can be manufactured by a
relatively small number of workers, and if the establishments which
manufacture these products do not also manufacture products from another
class" then the probabilities assigned to these establishments will be
relatively small.
Thus" it can easily happen that the sample dravm on
the basis of these probabilities will contain none of these establishments; consequently" on the basis of the sample there will be no information with which to estimate the total value of ship:rnents for that
class.
Because of this difficulty" the Industry Division has decided that
the probability of selection should depend on the contribution which the
establishment makes to the estimate of the total value of shipments in
those product classes which are involved in the survey.
Then" with
enough members dravm into the sample to ensure satisfactory estimation
of the product class totals" the estimation of those factors which are
reported by all establishments should be excellent.
1.2 Ogus Solution for the Special Case When Only One Class is Sampled
In August of 1959 Jack L. Ogus" chief mathematical statistician
for the Industry Division" distributed to his staff a memorandum
f1±.7
4
in which he derived the optiJDum values of the probabilities of selec..
tion for the case when only one product class is being sampled.
His
results are summarized as follows:
1.2.1 Definitions.
Let
denote the total number of units in the POPulation
N
h
reporting class h.
"
value in product class
ment
"
"
h reported by establish-
i 'in the Census year.
value in product class
h reported by establish-
ment i in the current year.
Nh
.1: Y = total value in product class h in Census
J.=l hi
.
year.
" i~ ~i = total value
iu product class h iu current
year.
Dhi
"
Db
"
a
bi
"
Phi
"
~i- Yhi •
Nh
~ - Yh = i:l Dhi •
if establishment i
=
if establishment i
{~
is not selected.
is selected.
the probability that establishment i
will. be
selected.
Define an estimatorl
Xh
for all i
=
Xh,
Nh
1:
i=l
such that Phi
as follows:
a hi (l/Phi )~
is positive.
Ogus shows that this estimator
is an unbiased estimator of the product class total:
5
and its variance, derived using the assumption of independence of
selection, is:
Var
1{
~:
He next considers the difference estimator,
This estimator is also an unbiased estimator of the class total:
and its variance, derived using the assumption of independence, is:
Var
Xh =
Nh
Var (Yh +!: ahi (l/Phi ) Dhi )
i=1
Nh
2 2
= . Z (l/Phi) Dhi Var
a hi
J.=1
Nh
2
= . Z (l/Phi - 1) Dhi
J.=1
•
6
D~ < ~, Var
Because in most cases
smaller than
preferable to
Var
Xh.
Xh
'Will almost surely be
Therefore, the difference estima.tor is
~.
Note that if establishment
i
is a certainty case, i.e.
then it contributes nothing to the variance of the estimate.
Phi
= 1,
Also, as
happens in most sampling methods, the variance is a function of the
2
D , which are unknown.
hi
quantities
to do the survey at all!
If we knew them, we wouldn't need
2
The problem of estimating D
hi
is discussed
later in this chapter.
Ogus defines a sample procedure to be optimum if it minimizes the
expected sample size subject to a variance specification.
Of particu-
lar intterf=lst to the Census Bureau is the specification of the form,
Var X"
2
K Y
h ' where K
,:.
h
Xh;'
On the assumption that
is determined by pragmatic considerations.
Yh , this variance specification is eqUi-
valent to the specification that the coefficient of variation,
Jv~r ~ , be inversely proportional to J~
for all classes
h•
. ·ll
1.2.2 Derivation of the Optimal Probability
...... ''C!'-........,.
~-
.......",.:::rr=-=...~'"
The expected sample size is:
•
In the population of N
h
establishments suppose there are
C
h
establish-
ments that should be included with certainty for an optimum design.
Mh
= Nh
and let
- C
h
m.
n
Let
be the total number of non-certainty cases in the popUlation;
= I:~
i=l
Ph.
J.
be the expected number of non-certainty cases
selected for the sample.
value of
Ph.J.
Then
which min1mizes
.
7
Nh
E I: a
= C + Ilb. To find the
hi
h
i=l
llb' Ogus uses the Cauchy Inequality
in the following fashion:
Var
(l/P
Ilb:::
Phi:
-~
Multiply both sides by
~
~
x.." =
i=l
- 1) D
hi
~
2
Since
Ilb
2
Y
h
•
2
is the only variable on the right...han~side (BHS), the condi-
tion which minimizes the LHS also minimizes
(1.2)
K
= ~ K Yh •
Phi I: (l/Phi - 1) Dhi
i=l
i=l
I:
~
2
hi
Ilb.
By cauchy's Inequality,
l 2
2
~
P
~
(l/P ) D . > { ~ p / (l/P . )1/21 D I} 2
i=l hii=l
hi
hi - i=l hi
hi
. hi.
=
{1:1~ ID
(
hi
} 2
I
,
with equality (and thus a minimum) obtained if and only if
From (1.1) and (1.2) we obtain the following relationships:
8
From (1.;), together 'With the definition of
Ilb'
we have:
•
Therefore,
•
Thus, the value of Phi
Phi
(N. B.
which minimizes the expected sample size is:
•
=
This solution can also be obtained by using the IaGrange tech-
nique for minimizing a function subject to restrictions.
i.t is used to find the optimal value of Pi
In Chapter II
in the general case.)
1.2.; A Method of Solution.
•
=0
In practice, the precise values of
~
and C
h
are not known be-
forehand, but they may be determined easily by arranging the
non-decreasing order and computing the follO'Wing partial sums:
and
IDhil
in
9
D~i •
1.3 Estimation of the Measures of Size
The problem of finding a satisfactory way to estimate the values
of D~
time.
is being studied by the Industry Division at the present
During the summers of 1961 and 1962 the author worked on this
problem" doing much of the preliminary work on which the current
studies are based.
Although there is no lack of data relating
'Dhi f
with the value Y " the laws protecting the confidentiality of the
hi
economic data for individual establishments prevent the use of actual
Census Bureau data in this thesis.
Therefore" the following discussion
must deal with generalities" since the
studies are being conducted
with the Census data.
1.3.1 Characteristics of the
In general" when
class
I~i
populati~.
I is
plotted against
Yhi
for any product
h, the scatter diagram resembles Figure 1.
Usually the linear correlation coefficient is
..
about 0.6" suggesting that there is same degree
of linear correlation.
But Dhi
can be --- and
not infrequently is
zero for large values of
..
.:.(~',.~ ."
I,
Y " and there are nearly always a few units"
hi
usually smaller ones" for which
IDhi I ~ Yhi • Clearly" it is not possi-
ble to predict the exact values of
ID
to estimate the expected value of the
Yhi
Figure 1.
hi
I
(D
hi
for each establishment; but
Jt S
for a given value of
is difficult too" pr1ma~ily because the variance of the
increases as
Y
hi
increases.
IDhi 11 s
10
1.3.2 Estimators Bein§ Studied.
In the SUIDmer of 1961 the author applied a logarithmic transformation to the data from three product classes and found that it
did not compensate adequately for the increasing variances.
Further..
more, even with computers available it was not practical to use such
In the SUIDmer of 1962
a transformation on large quantities of data.
the author wrote a series of FORTRAN programs for an IBM 1401 computer
to calculate ten different estimators
E ID
hi
I,
d.. (j),
OJ.
.
using the data from ten product classes.
j
= 1,
2, ... , 10, of
The first four esti-
mators were:
d.bi
where
= a + b Y , i.e. least squares linear re-
hi
•
(e~}1
=A +
BY , i.e. least squares linear re..
hi
2
gression of e
on Yhi •
hi
where
dbi
and e
(e~}rr
2
hi
=A +
are defined as above, and
BY
+
hi
CY~, i.e. least squares
quadratic regression of
e~ on Yhi
•
11
(:3)
= /<~)'
~(3)
(~)' =
where
(e~)'
+
2
A + BYhi + CYhi ' i.e. least squares
2
quadratic regression of D
hi
on Y •
hi
1------···---
(4)
CL
.(4)
f.lJ.
=
i (~.)I
-11~
I
~
2
are defined as in (3), and
{~i)t and ehi
where
2 )1'
( ehi
=
2 •
A + BYhi + CYhi
The remaining six methods all involve weighted regressions, in
which the weights used were intended to be inversely proportional to
2
I in one case and to the variance of D
in
hi
hi
Actually1 the weights were estimated only roughly, in the
the variance of
the other.
ID
following fashion:
seven numbers,
Bhj' j
the values of Yhi
B
hl
< Yhi
~ ~
for each of the ten product classes, three to
= 1,
i~to
, etc.
groups, the mean of the
2, ••• ,
7,
were chosen and used to divide
four to eight groups:
0
< Yhi
~
B
'
hl
Then, for the establishments in each of these
ID
hi
Its
about this mean were ca.lculated.
and the variance of the
ID
hi
Its
The weight assigned to every establish-
ment in this group 'Was the reciprocal of the group variance.
In similar
fashion, a second set of weights 'Was calculated using the D~i values
instead of the
IDhilts.
The weighted linear regression coefficients
were calculated according to tne
fo~lowing fo~ulaS;
12
b
=
a
=
(1.4)
ID.I - b(l/N ~
liN ~ w.
~
~
w.~ Yi )
,
' si2 is an estimate of Var IDhi I, and the
sums are taken over the values, i = 1, 2, ••• , N. Also, to simplify
where Wi
si -2
J
= NS i -2 I~
these expressions, the subscript h, denoting the product class, has
been omitted.
Similarly, the weighted quadratic coefficients were cal-
culated from the formulas:
22
2_
2
2
2
(~ WiDi Yi - liN ~ WiD~ Wi Yi)(.E Wi Yi - l/N(.E WiYi ) )
B
(~ WiD~i
-
c =
=
twiD~i
.. liN
~
yf - liN
=
where Wi
liN
WiDi
~
~
WiYi _ C
(.E 1-Ti y i )2
.E Wi
A
- liN I: WiDi
"'iYi HE \'1i Y~ - liN E WiY1E WiY~)
~Wi~
If
.. liNE Wi E Wi
2
2
.E WiYi - liN (~ WiYi )
2
2
E W.~ D.~ - B( liN ~ W.~Y.)
- C(liN ~ W.~Yi )
~
J
= NS i -2 IE
Estimators
Si
-2
<\i (5)
form to estimators
,
2
2
' and Si is an estimate of Var Dhi •
to <\u (8) were defined to be identical in
c1m. (1)
to ~i (4), except that in ca1~u1ating
and (~)' the weighted regression coefficients were used.
~(9)
= a + bYhi " and
~i(10)
=
fA + BYhi + CY~,
and C were the weighted regression coefficients.
a.w.
Finally,
where a, b" A, B"
13
1.3.3
~cuesion
of These Estimators.
There is much to criticize in the choice of estimators and in
the calculations of the weights.
Howeverl we should consider the fact
that the Industry Division statisticians are trying to find a simple
estimator, or at least one which is easy for a computer to applYI
which "Till "satisfactorily" predict E{D
I
for all values of Y.
Consequently, this study was designed primarily to cOlll:Pare the measures
of size assigned by each estimator and to see what effect each estimator
had upon the number of establishments put into the certainty stratum.
Because of the increasing variances of the
[D
hi
J' Sl
ordinary least squares methods, as in the computation of
the use of
a.:u
and
(~i)' for estimators (1) through (4), was not strictly justified.
However, it was tried because to calculate reasonable estimates of the
increasing variances with which to weight the data amounted to much
the same problem as to compute the E [D
mators
~(j), j
= 1,
2,
5, 6,
hi
I
in the first place.
Esti-
were roughly of the form:
That is, the real quantity being estimated was
2
D •
hi
ThiS was reason-
able because the contribution which an establishment makes to the
2
variance of the estimate is a function of D ; and the sample is suphi
posed to be taken large enough to meet certain variance specifications.
Estimators
~i (j ), j
= :5"
4, 7, 8, were intended to be upper bounds
2
on the values of D • This is a conservative approach to the probleml
hi
14
for it tends to over-estimate the
D~iiS and therefore should result
in larger samples.
Estimators (9) and (10) are theoretically sound estimates of the
E IDhil
and
Suppose that
E(D~i) respectively. This can be sho'Wll as follows:
= 1,
Xi' i
E(Xi ) = a + ~ Yi
2, ••• , N, are random variables such that
and Var Xi
= 0'i2
• We would like to find the least
squares estimates, a, b, of the regression coefficients, but because of
the unequal variances, we cannot apply least squares techniques directly
to the
X variables.
However, the variables, Zi = Xi/O'i '
i = 1, 2, ••• , N, do have equal variances:
and therefore, since E(Zf) is also a function of
ex and ~, least
squares techniques may be applied to them to determine a
and b.
We wish to minimize the sum,
where u
i
= O'i.2
•
After differentiation, we obtain the normal equations:
Borrowing a device found on p. 244 of reference
LJ:.7,
let us set
wi = U i / ii = N ui/i:. u i ; then the normal equations may be written as
15
follows:
From these equations" computing forms (1.4) are obtained.
In similar
fashion" the fonns tor the quadratic regression may be derived.
the case of
while in
In
~i(9)" the linear regression is computed with Xi = IDhil;
<\u.(10), the quadratic regression is computed with Xi
= ~i
Unfortunately, there are serious problems associated with the
estillation of the variances,
CT~
,
Var IDI is expected to increase as
i
= 1"
2, ... , N.
In general,
Y increases because" although
In I is expected to get J..a,rger as Y increases, the lower bound on
InI remains zero for all values of Y. Moreover" because for each
Y-value there is usually only one
the variance of
ID
I
IDhil" it is impossible to estimate
for each Y in the usual way.
In the opinion of
the author and Industry Division statisticians" for the purposes of
this study a reasonable way to estimat2 the expected variances 'Was to
calculate the group variances" as previously described.
But such esti-
mates are very rough" and the properties of such a method of estimation
are unkno'Wll.
Therefore, the problems of obtaining reasonably accurate
weights may well invalidate the use of the weighted regression as a
method of estimating EfDhi'.
The results of this study are not all available yet" but most of
them are disheartening.
In general, estimators (9) and (10) produced
some negative values of
dbi" often for small values of Yhi
2
but
•
16
occasionally .... and more <'tlBastrously - - f'or medium..sized yvalues.
Due to a programming error, the results from estimators
(3), (4), (7),
~i(2)
and
and (~) are not yet available.
So far,
seem to give the best results consistently.
dw,(l)
Interestingly
enough" in almost all ten product classes the linear correlation
coefficient between
IDhi I
and
<\u. (1)
'WaS
higher than that between
IDhi I
and <\1(2), but the mean squared error f'or
20
less than that for
0/0
<\i (2)
was about
<\i(l).
Because of the skewess of the population and the increasing
variances, the estimation of these crucial measures of size will proba..
bly
al'~ys
be the weakest aspect of' p.d.s. sampling.
CHAPTER II
OPTIMAL ASSIGm.mNT OF
Pi I s
FOR MORE THAN ONE CLASS
Now let us assume that there is a satisfactory method available
for estimating
and proceed to consider the problem of assign-
ing probabilities of selection to the N establishments in the population in an optimal way when two or more product classes are to be
sampled.
2.1 Derivation of the ItiGrange Equations for a General Number of
Classes.
Suppose that we 'Wish to gather information about n product
classes,
n~
2, that we will use the estimator
Xh
defined in 1.2.1,
and that the variance specifications are of the form, Var
h
= 1,
2, ••• , n, where
Xh = K2yh'
K is the same constant for all classes.
Let a.
J.
one or zero according as establishment
i
Let
be
is or is not selected for
the sample; and let Pi be the probability that establishment i
will
be selected.
As in the case when only one class is being sampled, the expected
N
N
sample size is: E .E a
=
.E P. • If we define a procedure to be
i=l 1
i=l J.
optimum if it minimizes the expected sample size subj ect to the n
18
variance restrictions,
desired form of Pi.
'We
ma;y use the IaGrange technique to obtain the
That is, we must minimize the following function
'With respect to Pi
and
~,
1, 2, ••• , N; h = 1, 2, ••• , n:
i =
. .
N
~1'~2P"'~n) =i~lPi +~l~h
g(P l , p 2 ,,,·,PN'
~Pi ~ ~ {i~=l(l/Pi
=
i=l
n
+
h=l
- 1)
D~i
g(P l , P 2 , ••• , PH' ~l' ~2' ... ,
To minimize
(Var
Xh - K2yh)
- K2yh}
"-n,)'
we differentiate
N + n variables and set each derivative
'With respect to each of the
equal to zero:
o
g
o~ =
h
N
2
From. these equations
(2.1)
(2.2)
Pi
N
E
i=l
N
2-_
2
E (l/P i ) Dhi - (K-Yh + E Dbi ),
i=l
i=l
=
j
'We
the form:
2, ••• , n •
i = 1, 2, ... , No
h:l
2
=
2
K Yh
+
N
2
E Dhi ,
h
=
1, 2, ••• , n.
i=l
ThUS, to find the optimum value of Pi' we must solve
for the n
= 1,
obtain the following equations:
n ~ D~'
(lip·) Dhi
J.
h
n
equations
unknown laGrange multipliers, where each equation is of
19
N
2.-_
2
K-Yh + .E Dhi , h
~
1, 2, ••• , n.
].;::1
2.2.1 The Equations for the Special Case of Two
Classes.
.......=
=
~ . , ~.... ~ , I C
_
='H
_=_w;~,=,..,~_~
~=--
In this thesis we shall devote our attention almost exclusively
to the case n
= 2.
In this special ease, equations (2.1) and (2.3)
take the following form (for clarity, let us set a = A
l
and b = A. ):
2
[~--;~-=_.-;-
(2.4)
i = 1, 2, ••• ,
Pi = Ja Dli + b D2i '
N
E
=K
i=l
2
N
2
Y + E D
,
h
hi
1=1
h
N.
= 1,
2.
Now, this population can be divided into three mutually exclusive and
exhaustive groups:
all establishments for which D2 > 0, D2 ;:: 0 ,
li
2i
2
2
Group II: all establishments for which D = 0, D > 0 ,
li
2i
2
Group III: all establishments for which D > 0, D2 > 0 •
li
2i
Group I:
Using these groups, we may rewrite the left..hand-sides of equations
(2.5) as follows:
N
E
2
D .
l J.
i=ll D2
b D2
;J a
li +
2i
2
Dl ·
J.
= E
i
ieI a D2
..
= lira
and
11
+
2
Dl •
J.
E
ieIII Ja D2
b D2
J li + 2i
,.
£0
2
D
N
I:
2i
i=lJ
2 + a 1i
D
I:)
2
D
2i
= I/F
L:
1€II
ID
01 +
2:1.
L:
•
ieIII
Therefore, equations (2.5) may be 'Writ"ten in the form:
(2.6)
1::
ID201+
ieII.
J.
L:
ie:III
If we divide equation (2.6) by equation (2.7), we obtain one equation
in the one unknovm,
Q
=
.;e:ro- :
(2.8)
N
2
K-Yf"\ + L: DI"\o
C.
° 1
c.J.
J.=
li •
1::
Q
~_
2
D
ie:III/~~2 + ~2
.,
1i
2i
- - - - - - - _ : . - . _ - '.............-.._- ::: Q •
1::
ie:III
2.2.2 The Procedure for Obtaining the Probabilities
The general procedure for obtaining the optimal values of Po,
J.
i = 1, 2, ••• , N, is to solve equation (2.8) for
of accuracy as desired.
Q
Then, using this value of
to as many digits
Q,
evaluate either
equation (2.6) or equation (2.7) together with the equation, a
to obtain the values of a
every i, calculate Pi
and b
=r;;-Dii
~
+ b
to the desired accuracy.
~~i·.
If for some
= Q~,
For
i, say i
= j,
21
''-2-2'
a Dli + b D2i
v
"
~
1, set Pj
=1
and remove establishment
j
fram the
.
non-certainty stratum.
Then, after all such establishments have been
removed fram the population (of non-certainty cases), recompute
I l'
E 'Dl.i, E
D2i
etc., for the remaining establishments and solve
ieI
J.' ielI
.
equation (2.8) again. Repeat the process until values of a and b
are obtained for which
population.
la
D~
+b
D~i
< 1 for i
in the reduced
In this fashion, all establishments haVing such large
2
hi
values of D
that the formal solution ot the equations required
that they have a negative coefficient (N.B. that in the variance formuN
2
E (l/P i - 1) Dhi ' the quantity, (llP i - 1), is negative for
i=l
Pi > 1), have been designated certainty cases and so contribute nothing
la,
to the Var
Pi
Xh ,;
while to all the remaining establishments values of
have been assigned so that the variance specifications are met.
2.2.3
The Use of a Single-Value Approximation to Simplify Equation (2.8).
r(Q}
The form of equation (2.8) is
tion could be obtained by iteration
vergence of this equation).
== Q , suggesting that a solu-
(cf. 4.7 faJ;' dis.cuooion Q!f' tbe eon-
However, for every new value of
different radicals must be evaluated, where
N
3
is the number of esQ,
N
3
tablishments in Group III; and the numerator and denominator ot the'
second factor must be recomputed.
f(Q) varies slowly with respect to
Thus, if
Q,
N is very large, or it
3
requiring many iterations, the
calculations would require too much time, even on a computer, to be
practi.cal.
Let us consider, then, the effect of substituting approximations
tor the values of D~ and D~i in Group III. Suppose that
~
D~
is
22
2
an approximation to be substi.tuted for every Db!
in Group III in
the second factor of f(Q},
h = 1." 2,.
/
,
Then equation (2.8) becomes:
+
g
(
(2.9)\
2
\
N
2
,
K Yl + oZ Dli / '
J.=1
Z
!
D
i
ieII j 211
+
and equations (2.6) and (2.7) become:
N
(2.10)
ra
D1
~2
f 2"2
JQ D1 + D2
=
K2 Y1
3
Q
.Z I Dlil +
J.eI!
,
V2
N
+
Z
i=l
,
and
2
Dli
Z jD • +
ieII' 2J.
(2.11)
=
where N; is the number of establishments in Group III.
Since the
first factor and the quantities, Z !D1ojI and Z IID2 " I , are free of
,
ieI
J.,
ieII' J.I
Q and are known for any given pair of product classes, they are coni
stants in equation (2.9) (and, of course, in equation (2.8) also).
Thus, at each stage of the iterative process only the value of the
square root and the values of the numerator and denominator of the
second factor must be recomputed.
Clearly, when a single approxima-
tion is substituted for every value of
equation is comparatively easy to solve.
produce nearly the same values of a
D~ in this way, the iteration
But will equations (2.9)-(2.11)
and b
as equations
(2.6) - (2.8)~
2.2.4 Results of the Application of the Single-Value Approximation to
an Artificial Population.
Because the probability distributions of the
X's
and the Dt s
are unknown, and because we are un'Willing to make any assumptions
about their form, it will not be possible to derive general results
and draw general conclusions based on the properties of these distributions.
The best we can do is to construct a population which has
the same characteristics as the populations 'With which the Census Bureau
deals and study the effects of different apprOXimations upon it.
Then
any conclusions that we draw in general will be based on the assumption
that the effects upon the real populations 'Will be similar to those observed in the case of the artificial population.
In Table I of Appendix A (hereafter we shall refer to this as
Table I-A), is the population constructed by the author, based on the
rea.l populations with which she worked at the Census Bureau.
To con-
struct it, she decided in advance how many one-digit (i.e. 1-9), twodigit (i.e. 10-99), etc. numbers there were to be in each product class,
then used the Rand tables of random numbers to obtain the precise value
for each number.
Hopefully,. t.h:i.s ar-ttf':i.cial
Po:Du.1o. t10n
has the charac-
500/0 of the
teristics of most industrial populations, e.g. more than
establishments are small (in this case,
are large (here,
Y> 1,000).
Y
< 100) and less than 200/0
Of course, since some classes are more
variable than others or include products which are more expensive or
more popular than those of other classes, the tems,
II
small" and
"large", are relative 'Within each product class.
Now, using this pO}?ulation, let us attempt to answer the quest:ton
already posed:
are the solutions of equations (2.9) - (2.11) good
approximations to the solutions of equations (2~6)
- (2.8)1 The answer,
when the arithmetic mean, the median, and the geometric mean are used
as the single approximation D~
,
is, unfortunately, "noll.
As Table
II-A shows, when the arithmetic means were substituted into (2.9) (2.11), the resulting value of
than the O}?tinJal Q;
and b
0/0
'WaS 140
a
Q
'WaS more than
'Was nearly 20
0/0
40 0/0 smaller
smaller than the optimal a;
larger than the opti.n:Jal b.
On the other hand, when
either the medians or the geometric means were used,
too large,
a
'WaS about
Q
viaS about
40 0/0
340/0 too small, and b 'Was about 660/0
too small.
As a result of these errors in
a
and b, the arithmetic mean
approximation designated three establishments as certainty cases instead of the opti.ma.l two, while the median and geometric mean approximations designated no certainty cases at all.
Then, as is sho'W-l in
Table III-A, when equations (2.9) - (2.11) were solved again on the
basis of the population from Which the certainty cases bad been removed,
25
a
Q
and b were
itself
'WaS
640/0 and 78 0/0 too large, respectively, although
only 40 /0 too small.
equivalence of the
However, as a measure of the
f(Q) in (2.9) to the
f(Q) in (2.8) this is poor,
since there was one less establishment in the population for equation
(2.8), making the parameters unequal.
At any rate, the aberrations in the values of a
and b
resulting
from the use of equations (2.9) - (2.11) predictably affected the values
of P.J. assigned to the establishmEints
(cf. Table V-A):
the arith-
metic mean approximation enlarged the P. I s, while the median approxiJ.
mation reduced them.
Because the results of the geometric mean approxi-
mation were so similar to those of the median apprOXimation, the probabilities derived from the geometric mean apprOXimation were not calculated.
Since the expected sample size and the variances are functions
of the pIS, they too deviated from the optimal values
IV-A).
was 20
(cf. Table
For the arithmetic mean approximation the expected sample size
0/0
larger than the optimal size, and the variances were
and 40 0/0 below the specifications.
290/0
However desirable this state of
affairs might seem, it violates the assumption underlying this method
of sampling, Le. that the important consideration is to minimize the
expected sample size, not the variances.
On the other hand, the
median approximation produced an expected sample size that was 380/0
less than the optimal value, together with variances that were 1850/0
and 190
0/0
over the specifications.
would be totally useless.
Needless to say, these results
26
Obviously, the method of a:pproximating all values D~i by a
single value is not satisfactory --- at least, not when that single
value is the arithmetic mean, the median, or the geometric mean.
Since the distribution of' the
is quite skewed, it seems clear
DiS
that the arithmetic mean is being dominated by the few large values,
whereas the median and geometric mean are sensitive to the numerous
small values
( cf. table below).
.I Arithmetic
Mean
.
'-"2
...
D
l
!
7',272.5
~_.-
I
Median
Mean
I
I
-
-
I
712.0
6".90
!
582.5
694.17
~I-
""2
D2
70,781.0
11
Is there any other single value which would be a better ap:proximation1
If there is, it is not an obvious or common one.
The two
means and the median were chosen in the first :place because they
are the ccmmon measures of central tendency.
that the variance of the D2 , s
is so huge "that a single value must
2
either awroXimate most of the D ·s
the very large D2 •s;
The real :problem is
well and, in effect, ignore
or it must a:pproximate the large D2 •s rela-
"tively well and thus vastly overestimate "the majority of the D2 ,s.
2.2.5 The Use of' the ''Mixed'' Approximation to Simplify Equation (2.8).
To avoid approxinBting the large D2 values, let us consider the
:JIlerits of a more complicated type of approximation:
bers,
D and D
1
2
Choose two num-
such that D < D , which will be used to divide
l
2
27
:2
D
li
the numbers
in Group III into three divisions according to size;
and choose D < D4 to divide the number D~i in Group III into
3
three similar size groups. In particular, select these f'our numbers
so as to make the values in the small- and middle-sized classes as
nearly homogeneous as possible.
Then split Group III into the f'ollowing
subdivisions:
(1)
all establishments f'or which 0 < Di < D , 0 <
i
l
(2)
all establishments f'or which 0 < Dit < Dl , D:;
(3) all establishments f'or which
(4)
D
l
all establishments f'or which Dl
:5 Dii < D2 ,
:5 Dii < D:2'
D~i < D ;
:5 D~i
0
D:;
~
2
be the arithmetic mean of' the D
hi
(j),
where
every D~
h = 1, 2; j = 1: 2, 3, 4.
< D4 ;
< D~i < D:;;
:5 D~i < D4 ;
/2
( 5) all establishments f'or which D2
li ~ D2 and or D2i
Let Dbj
3
~
D4 •
values in subdivision
If' we
SUbstitute~~j
f'or
in subdivision (j), j = 1,2,3,4, and leave the values in
subdivision (5) undisturbed, equation (2.8) may be written as f'ollows:
!
(Q
f
4
n. D2
D2
,
N
\ L: iD
L:
J lj
+ L:
Ii
K2y2 + E D~i) iel 1
j=119~2 +])2 i£(.5) jg2y;2 +D2
J
i=l
I
" l j 2j
Ii 2i
!
N'
-2
2
\ 2__
2 )
4 n.D
D~i
I\K~l + . ~ Dli E D} L:
J 2j
+ L:
.:;
J.-l
ieII:2
j=lJ,,~2 +"n'2
ie(5) ,,~'2 n2
~;olj 2j
~;Oli+ 2i
I
( 2.12)
I
J
.
= roo~ ,
I
where
n
j
is the number of' establishments in subdivision (j).
The
f'orms of' equations (2.6) and (2.7) corresponding to this approximation,
which we shall call the ''mixed'' approximation, are:
2J
(4
_ - i:
- 1eI
t·' J?1:U* Q[j=l
t
=
(2.1,»)B.
2
K Y
+
1
~
ieII
ID 1+
2i!
N
2
D
i=l li
~
4
E
j=l
2.2.6 Results of the Application of the Mixed Approximation to the
Artificial Population.
As one can readily see from Tables II-A and
III-A, the mixed
approximation gave results that were very close to the true optimal
values, especially when compared with the results of the approximations
already discussed.
The value of
Q
which solved equation (2.12) for
the entire population (Round I) was less than
the solution of equation (2.8), 'While a
and 5-1/2
0/0
1_1/2
and b
larger than the optimal values.
0/0
smaller than
were about
2_1/2
0/0
Thus the mixed approxi..
mation designated the correct two establishments as certainty cases.
Furthermore, the solution of (2.12) for Round II, i.e. after these
certainty cases had been removed from the population, was only 1 0/0
larger than the solution of (2.8), while
and 2_1/2
0/0
a
and b
larger than the optimal values.
vTere
4_1/2 0/0
The probabilities com-
29
puted from the solutions to equations (2.12) - (2.14) usually agreed
with the opti.Ir8l values to two, and sometimes to three, decimal places
(cf. Table V-A).
1
Finally" the expected sample size was less than
0/0
0/0
too large, and the variances were about 3
than the specifications
and
2
0/0
smaller
(cf. Table IV-A).
2.2.7 Comparison of the Different Methods of Approximation
~
=m~
Judging from the results of using the V6r:f.oUS apprex:imBtiona with
this art1fiCfAl population, it would seem that to use a single value
to approximate all values of D~
in Group III
leads to the assign-
ment of probabilities that are far from being optimal.
In fact, the
solution to equation (2.9) may be so different from the solution to
equation (2.8) that it couldn't even be called a good guess at the
value of the true
Q.
The mixed approxination" on the other hand,
gives promise of assigning probabilities that are fairly close to the
true optimal probabilities.
Perhaps in some cases, 'When the statisti-
cian is willing to accept answers that are only approximately correct"
the probabilities assigned by the mixed approxination will be aufficient.
However, the fact that the deviations encountered for this
artificial population were all less than
60/0 of the optimal values
does not necessarily mean that the mixed approximation will give results
as good as these for every population.
In particular, since the arti-
ficial population contains a total of only eighteen establishments in
Group III, we cannot be sure how accurate will be the solutions to
equations (2.12) - (2.14) when there are hundreds of establishments in
that group.
30
However, the results of the various approxiIIa tions when applied
to the artificial "Population do suggest that the solution of equation
(2.12) will be a better es"bimate of the solution of equation (2.8)
than will the solution of equation (2.9) using any of the cammon
measures of central tendency for the single apprOXimation.
Thus, the
importance of the mixed apprOXimation may well lie in the fact that
it can lead us to a relatively small neighborhood of the true
Q-value.
Then, using the solution of equation (2.12) as the initial Q, we may
solve equation (2.8).
Having used the mixed approximation to find a
Q-value close to the true one, we shall have narrowed down the popula-c5.on of non-certainty cases, eliminated many steps in the iterative
process for solving equation (2.8), which bas
N different square
3
roots to evaluate at each step, and therefore shall have saved ourselves
time and effort.
CHAPTER III
A COMPUTER PROGRAM FOR CCMPUTING THE TRUE SOWTION
WHEN TWO CLASSES ARE SAMPLED
If there are very many establishments in Group III, the whole
procedure for calculating the probabilities should be done on a camputer.
The author has written a program in the GAT language to accom-
plish this on a UNIVAC 1105 computer (cf. Appendix C).
A brief dis-
cussion follows, of the 'Way in which the program solves equations
(2.6) - (2.8) and computes the probabilities and variances.
plify notation, let
fl{Q)
denote the fo~ of the function
as found in equation (2.8) and let
f (Q)
2
To simf(Q)
denote the form as found
in equation (2.12).
Part 0:
The data are read into the computer.
Of primary interest
are the previously defined variables,
Y , Y , N, K, D ,
l
l
2
D2 , D , D4, and the values of the
3
h = 1,2.
E,
2
D matrix, (D~i)' i
= 1,
2, ••• , N;
In addition to these, the data set contains the variables
Qo' and MJ' , which will be explained when the use of each occurs.
Part I:
The computer divides the establishments into seven sets.
Sets 1 - 5
are the five subdivisions defined in 2.2.5,
and the sixth and seventh sets are Groups I and II as defined in 2.2.1.
To avoid using the values of the
D2 matrix directly in the iterative
process and to simplify calculations involving Group III, the program
32
defines two
each. consisting of five rows , one for each of
matrices~
the first five sets and fifteen columns (the number of columns may be
increased for larger populations).
The X-matrix contains the Di
values for the establishments in sets
1 -
5,
while the Y-matrix con-
tains the D~ values for the same establishments.
All positions in
the matrices which do not contain D2 values are set equal to zero.
For each establishment there is defined a pair of variables which tell
what set the establishment has been put into and, if that set is one of
the first five, into what position in the
values have been placed.
2
X and Y matrices the D
This knOWledge is necessary if the establish-
ment is later designated a certainty case and must be removed from the
popUlation of hon-certainty cases.
part II:
This part governs the calculation of population parameters~
N
2
2
A. The computer calculates E Dh . and K Yh , h
. 1 . J.
J.c6 == EI Dlil and C7
ieI
well as
B. The computer calculates R
N
(K2y1
+ . E Dii )
=
2 , as
1 I·
E D2i
ieII
2__
= (K-Y2
..
N 2
+ .E D )/
2i
~
and. the values
J.=l
j = 1, 2,
= 1,
J.~
, h
= 1,
2;
3, 4.
Part III: This part governs the iterative process.
A.
The computer begins to calculate a sequence of
approximations to the true value of
Q
according to
The value
tended to be a good guess at the true value of
Q,
Q
o
, in-
is read into the
computer ~ith the data.
Fbr ~eh value of
:8.
ienm
(1)
°o
of another sequence~
defined by the equa:Hon,
lat'! vI11ues of
L, the computefl l3.1so calculates a
° = °1
0
•
is
The computer continues to calcu-
{OL} and { 0L} until one of thre0 events occurs:
L = 51; (2)1 0Ll < e/1000, where
e
Q; or
the precision of the estimate of
is -the variable which determines
(~)
01-1.
0L < O.
If (1)
occurs, the computer prints out the values of both sequences and proceeds to the next population.
G
= 9L
If (2) occurs, the computer sets
and proceeds to Part IV of the program.
If
cating that the solution lies somewhere between
computer checks to see whether
sets
9
= QL
lOLl < e.
(~)
91-1
(2), or (~) occurs.
(1),
L
, the
< e, the computer
equal to the midpoint of the interval between
and computes the next values, Q1+1 and
If
lOLl
sets
by
9
and computes the next values of the sequences, con-
until
01+1
and
If it is not, the computer
tinuing
9
occurs, indi-
9
L
and
9 _
L l
Then, by multiplying
&1+1.
0L and 01-1 to determine which product is negative, the
computer identifies the half of the interval which must contain the
true
Q
and sets
9
equal to the midpoint of that half of the inter-
val.
Three more times the computer goes through the routine of iden-
tifying the interval which contains the real
Q and setting
Q
equal
to the midpoint of it, finally accepting an estimate which is within
e/16 of the true value.
The computer then goes to part IV.
A regression routine has been written into the program to speed
up the iterative process.
When
L
= ~m,
m
= 1,
2, ... , 16, the com-
34
= 0,
pute3:' uses the JlB.irs of values... (~L-j' 0L-j)' j
Eo
late the coefficients of the regression line
B
1 0,
it re-defines
-AlB,
Q to be
L
pected to be zero, and redefines
QL-l and the new QL'
the
Q and 8
part IV:
If B
5
= 0,
L
1, 2, to calcu-
= A + BQ.
the value for which
Then, if
5
is ex-
to be the difference between
the computer prints the values of
sequences and goes on to the next population.
This part computes the values of a
and b.
The com-
puter first checks to see whether the value of MJ,
which is read in with the set of data, is one or two.
computer calculates the value of a
equation
(2.6),
from either equation
according as equation
being solved; it then sets
b
If MJ
= a Ig 2 •
(2.12)
or equation
= 1,
(2.13)
(2.8)
These values of a
or
is
and b
2.
computer calculates
and sets
a
= Q~.
b
2 will be
specification on Var X
part
v:
= 2,
the
from either equation (2.14) or equation (2.7)
These values of a
specification on Var
If MJ
and b
will assure that the
satisfied at least as well as is the
Xi'
This sequence is used only once, the very first time the
pIS
are calculated.
The computer sim,ply evaluates each
Pi' using equation (2.4), and counts the number of certainty cases
created.
will
Xi
give probabilities which will satisfy the specification on Var
at least as well as the specification on Var X
the
If )-'a
nii
+b -~~~.
~
1, the computer goes to Part VI;
otherwise, it continues with the next
1.
certainty cases created, the computer sets
back to Part IIB.
If there have been any
L
=1
again and goes
If not, the computer goes to part VIII.
;5
part VI;
2
subtracts D
hi
set
6
or set
This. seqtk;nce removes certainty cases from the population. First, the computer sets Pi = 1. Then it
N 2
from E D , h = 1, 2. If establishment i was in
hi
i=l
7, the computer subtracts the non-zero JDf-value from
the appropriate EIDhif •
Otherwise, the computer locates the position
in the X and Y matrices where
2
D
li
them equal to zero, and subtracts
1
2
and D
are located, sets
2i
from the appropriate N , the
j
counter which tells how many establishments belong to set
j =
1, 2, ••• , 5.
j,
Then the computer returns to the part of the pro-
gram from which it came, e.g. parts V, VII, or IX.
Part VII:
This part governs the calculation of the P 1 s
ever a
of certainty cases.
Pi = 1.
Pi
and b
For every
have been recomputed after the removal
i, the computer determines whether
If it is, the computer goes on to the next establishment.
< 1, the computer sets Pi = Ia Dii
fa ni
i
when-
+
b-;~~- ~
+ b
~~i-'
1, the computer goes to part VI.
continues with the next establishment.
If
If
Otherwise, it
If no new certainty cases are
created, the computer goes to Part VIII.
Otherwise, it sets
L = 1
and goes back to Part lIB.
part VIII:
This sequence begins the calculations to solve
equations (2.6) - (2.8).
The computer goes through
the procedures outlined in parts III and IV, except that in this part, the
= f 1 (GL-l)'
L
Q in the computer when
the terms of the sequence of approximations are defined by 0
L = 1, 2, ••• , 50, where
G
o
is the value of
Part VIII of the program. is begun.
Instead of performing the operations
36
in either of' Parts V or VII, oo-wever, the computer goes to part IX
after obtaining the values of a
Part IX:
and b.
This sequence is used only once, the first time that
a solution is obtained for equation (2.8).
puter defines
= 1,
i
2, ••• , N.
p" = 1
~
lation.
= 1,
the computer accepts
and proceeds to the next establishment.
tablishment.
=1
-
~+
<1
~+
<1
and
and proceeds to the next es-
1
but
<
and goes to Part VI
If Pi+N
p" N
= Pi+N
-
~+
If
Pi
>
If P. N
D~i'
a Dii + b
> 1 and P.~
If P. N
Pi < 1, the computer sets
Pi
J
N new variables, P i +N =
The com-
Pi
1 , the computer defines
to remove establishment
but Pi
in order to put establishment
= I,
i
the computer sets
i
from the popuPi
= Pi+N
and,
back into the non-certainty stratum,
from which it had been removed when the mixed approxiInation was being
used, goes into the following routine, in which the operations of part
VI
are reversed:
longed to set
2
D
hi
is
counter N
j
N
2
E Dh " , h
~
i=l
7, the non-zero
6 or set
appropriate sum; and if
added to
i
= 1,
2;
if
i
be-
ID I-value is added to the
belonged to set
is increased by one, and Di
i
j, j
= 1,
2, ••• ,
5, the
is put into the proper
column in the j-th row of the X matrix, while
the corresponding position in the Y matrix.
D~i is re-inserted into
If there have been any new
certainty cases created or any old ones returned to the non-certainty
stratum, equation (2.8) must be solved again.
sets
L
=1
and goes back to Part VIII.
Therefore, the computer
At the end of part VIII,
however, it goes through the operations in part VII instead of those
described in Part IX;
if no new certainty cases are then created, the
37
computer goes on part
x;
otherwise, it returns to the operations de-
scribed in Part "VIn and VII until.1 finally, values of a
and b·
are obtained for whJ.ch no more new certainty cases are created.
Part X:
This sequence computes the variances and the expected
sample size from. the equations,
N
2
N
2
Vh = Var
Xh
=
N
Z (lIP i ) Dhi
Z D , h = 1, 2, and EBB = Z P.. It also comi=l
i=l hi
i=l ~
putes four quantities which measure how well the specifications have
been met:
Vh - K2yh and (Vh - K2yh)/(K2yh)' h = 1, 2. Then these
measures are printed out, together with the quantities, Dh2! , Pi'
IIP i , Vh ' and EBB,
h = 1, 2;
i = 1, 2, ••• , N..
From the flow diagram included in this chapter it is possible to
get a better idea of the details of the program.
follows:
o
D
<>
The notation is as
input or output operation
ordinary calculations
question
beginning of an iterative sequence: the
computer performs the operations between
the beginning and the end statements once
for each value in the domain of the
variable i; (3) indicates that the last
statement of the iterative sequence is
numbered ;.
·_-----.
40
f
end of an iterative sequence
I
\
I
,
. --_
_.- .J
numbered statemerrt
In addition to the numerous variables already defined, there are
several variables not previously mentioned, most of which are used
as counters and indicators to steer the computer to the proper section
of the program at the proper time.
xx
The more important ones. are:
indicator which designates the set into which
variable which tells in which set
i
i
is put.
will be found.
variable which tells in which column of the
matrices i mail be found.
X and Y
indicator which directs the computer to the proper places.
indicator which is
M
indicator which is
r0
when equation (2.12) is being solved,
II ."
f1
!
!.
0
It
(2.8)
II
II
II.
llhen computer is going through the
"halving routine II •
otherwise.
counter which counts the number of times the computer goes
through the "halving routine."
nc
counter which counts the number of new certainty cases
created.
nn
counter which counts the number of certainty cases demoted to non-certainties.
There are several other variables to be found in the flow chart, but
they are fairly clearly defined by the operations associated with them.
The steps needed to execute the operations performed in the parts
described in the previous pages can be roughly located as follows:
4l
· . . . . . .. . . . .
Parts J: and ITA
··
· · ··
part IlIA
·• • ··• . . . · •
part IIIB
•
· • · • • .. · . · .
Regression routine .. ..
··• • • • . .
part lIB • •
o.
o.
0
•
0
•
0
0
0
0
part IV
·•
part V
.
..
0
• ..
• ..
0
0
0
Part VI
•
•
0
• • •
part VII
·• • ·
0
0
0
• • • • • .. •
Halving routine
0
• • • ..
. .. ..
0
0
0
•
0
.
.
0
· ·.
0
•
.
• •
0
0
0
•
•
0
•
0
0
•
0
0
0
·
• ·
•
· · • · .. • •
part IX . ..
· • • .. • · .. . .. .. • • • •
Return routine
·• • • ·• • • ·. •
Part X •
· . . · . · · .. · • • • ·
Part VIII
• • •
0
..
0
0
0
12 - 15
II
16 - 21
II
21 - 29
fI
22 - 24
II
26 - 28
II
29 - 31
tr
31 - 33
If
33 - 37
tr
39 - 40
tr
42
II
47 - 53
48 50
0
•
2 - 12
tr
~
0
. . •
.
S+.a~ts
tr
If
45
-
53
- 55
The steps not included in this brief directory deal with details of
directing the computer to the proper portion of the program.
In Table I-B
is an artificial population, constructed by Census
Bureau statisticians, of fifty-nine establishments which manufacture
in ten product classes.
There are forty-five possible different pairs
of classes; some pairs are composed of classes that have no establishments in common, and one consists of two classes 'With the property that
all the establishments manufacturing in the smaller class also manufacture in the larger one.
extremes.
But most pairs lie somewhere between these
All forty-five pa:trs have been fed into the computer as
data for this program.
The results for these populations may be found
42
in Tables II-B through VI-B.
They show,1 euflong other things, that
the variances achieved d:i.:f'fered. from the specifications only in the
sixth or seventh digit.
The time demanded by the program doesn't seem
to be very great, at least not for these smaller populations:
on one
occasion the computer produced the results for thirty-five pairs of
classes in seven minutes, including the time needed to prepare the
computer for operation.
As the program is now written, it cannot accomodate large populations.
However, it can be easily expanded, although to what extent the
author does not know.
On the basis of its performance, the program
does seem to show promise of being a practical means of obtaining the
optimal values of the
only for two classes.
pIS
in the special case when we are sampling
CHAPTER IV
PROPERTIES OF THE SOWTION FOR TWO CIASSES
So far, we have derived equations which maY' give us the optimal
probabilities to assign to the establishments in our population when
we wish to sample
n
product classes; and we have developed a method
of solving these equations when
n
= 2.
Since the solution is diffi-
cult to obtain, we might ask ourselves a few leading questions:
HoW
do these optimal probabilities compare with the probabilities that
would be assigned to each establishment if each class were being
sampled separately?
Since the individual class probabilities are
much easier to obtain, is there a possibility that some combination
of them would closely approximate the true optimal solution for two
classes?
How sensitive are the variances and expected sample size
to deviations of
a
and b
from the optimal values?
important of all, does a solution al'W8.ys exist?
And, most
In this chapter we
shall explore these and other questions.
4.1 The Optimal Probabilities for Two Classes sampled Jointly, Compared
with the Optimal Probabilities for the Two Classes
sample~
S6IJa rate ly.
Let Phi
denote the optimal probability assigned when class
is being sampled individually,
h = 1, 2;
optimal probability assigned when classes
and let Pi
1
and
2
h
denote the
are sampled to-
44
gether;
i
and P.,
J.
= 1,
2, ••• , N.
In Table VI-A are the values,
li
, P2i ,
for fNery establishment in the author's artificial popula-
F'or 35 of the 42 establishments,
tion.
P
Pi
< max
in two cases, one involving a fairly large Dii
(P li , P2i ); only
and medium
the other involving very large values of both Dii
Pi> max (pli' P2i ) •
and
D~i'
D~i' is
Because our purpose is to minimize the expected
sample size, 'We might expect an establishment which makes sizeable
and XU to be more "valuable" to us than
1
2
an establishment which makes the same contribution to only one esticontributions to both
mate.
Since P.
J.
XU
is a measure of the importance of the establishment
2, while
to both Xi ~nd X
Xh alone,
h
= 1,
Phi
is a measure of its importance to
2, we should not be surprised to find that
greater than either P
fairly large, but that
li
or P
2i
Pi < max (P
2
when both D
li
li
, P
2i
2
and D
2i
Pi
is
are
) when min(Pli , P 2i ) is
small.
Table VI-A also suggests that, for the two classes considered
here, if Dii = D~j , then P li < P 2j •
2
2
D2i = Dlj = 0, then Pi < Pjj in fact,
case.
= D~j
Similarly, if Dii
Pi < P li < P j < P 2j
and
in this
These results are explained in part by the fact that
K2y2 < K2yl; since the variance specification is smaller for the second
class, the same D2
value must receive a larger probability in order
to have a smaller weight and thus contribute less to the variance of
the estimate.
Furthermore, Table VI-A suggests that
in value to max (p l i ' P21) if the maximum is
P2i
Pi
than if it is
ThiS can be seen most easily in the cases when min (P
small or zero.
is closer
li
, P
2i
)
Pli •
is
45
After examining this table and similar tables based on the
45
pairs of Census Bureau classes, which, due to considerations of
space, cannot be included in this thesis, the author believes there
is no obvious, simple combination of
approximate
Pi.
P
li
and P
2i
that will closely
It seems not unreasonable to expect that the apprOXi-
mation which sets
P.
~
= max
(Pl"
~
P ") will be conservative; for some
2~
evidence to that effect, see Table VII-A, which contains the results
for the author 's population.
P li
Possibly, for smaller values of both
ancl P2:5.' a weighted average might be a satisfactory approximation;
but because one class will usually have stricter variance specifications
than the other, such an average would probably have to be of the form:
pI
i
=
if P l " < P 2 " ,
~
~
-
where ~
< f l < f 2 < 1, and class 2 is taken to be the one with the
stricter variance specification.
duce values of Pi
Since no weighted ave!age will pro-
> max (P li , P2i ), this approximation is unsatis-
factory for larger values of P
li
and P2i.
Just how large they must
be before the approximation fails would appear to be a matter of sheer
intuition.
choice of
Also, some rule would have to be developed to guide in the
f
l
and
f "
2
The author has considered such explorations
to be outsid.e the scope of this thesis, but perhpas the problem is
worth studying in the future.
46
4.2 Simplification of the Solution for Independent Classes.
Usually, then, complicated iterative procedures will be necessary
to produce the optimal values of a
and
b.
However, when there is
no establishment in the population for which both
Dii
and
positive, the optimal values may be obtained more easily.
2
when D
hi
D~i are
In practice,
must be estimated, this situation w:Ul arise only when the
two classes are totally "independent" of each other, Le. when there
is no establishment in the population which manufactures products in
2
Because either D
li
both classes.
2
or D
2i
is zero, Group III, as
defined in 2.2.1, is empty, and all establishments belong to either
Group II or Group I.
From equations (2.6) and (2.7) we obtain the
relationships:
ra
N
1::
=
ID f
i=l. li .
and
jb=
2__
2
N
K-Y2 + 1:: D2i
i=l
Therefore, as we would expect, the optimal P.
J.
the optimal Phi
in this case is simply
for the class in which establishment
i
manufactures.
4.3 Sensitivity of the Variances and Expected Sample Size to Changes in
a
and b.
If we accept values of a and b
that differ from the optimal
values, A and B, what are the effects likely to be upon the variances
and the expected sample size?
let Vh = Var
Xh '
To facilitate the following discussion,
h = 1, 2, and ESS = eA'",Pected sample size.
In
Table VIII-A are twelve small graphs which show the effect of de-
viations in a
and b' for three populations; the numerical values
on which these graphs are based may be found in Tables
XVII-A.
IX-A to
As the graphs suggest (and the figures show even more clearly),
the cr..anges in the variances and the ESS are not symmetrical about the
origin, (a = A, b = B).
a o and b
o
and if a
=A +
Of particular interest is the fact that if
are non-negative quantities not simultaneously zero,
a o and b = B + b o ' then K2.yh" Vh = €h > 0; but
if a = A - a o and b = B ., b o ' then Vh .. K2yh> €h.
(starting from a
decreasing a
= A,
or
b
That is
b = B) the variances increase more rapidly with
than they decrease with increasing
Furthermore, if both a
and b
a
or
b.
are too small, the error is greater
than the sum of the errors which occur when one is accurate and the
other is too small.
If both are too large, the error is
~
than
the sums of the errors which occur when one is accurate and the other
too large.
Since these results were obtained in the cases of the
V , and V
2
1
ESS,
for all three populations, they seem to indicate that if
an error is to be made, it is better that it be one which increases
a
or b
than one which decreases it.
Also of interest is the fact that in all three populations the
ESS varies much less, proportionately, than do the variances.
a small deviation in a
but change a variance by
or b
may change the ESS by less than
Thus,
1
0/0
50/0 or more.
Possibly, some of the characteristics of the graphs may be exp1ained by the characteristics of the classes in each popUlation and
their relationships to each other.
In population 1, there are 25 es-
48
tablishments producing in class 1
and 14 producing in class 2; both
classes are only moderately variable:with one exception, 'Dhi ' < 200
for all i, h
= 1,2.
In population 2, there are 30 establishments
producing in class 1 and 32 producing in class 2; both classes are
highly variable:
in each class
D~i > 120,000 for more than 100/0
of the establishments, while D~i < 400 for about half of them.
population 3, there are 30 establishments in class 1
and
13
In
in
class 2; the first class is highly variable, while the second is only
slightly variable:
the largest
fD2i I
is 74.
For the first two populations, changes in a produce smaller changes
in the ESS than do corresponding changes in b;
since changes in a
and we expect this,
are in percentages of A, changes in b
percentages of B, and A < B.
are in
But in the third population, changes in
a produce much larger changes in the ESS than do corresponding changes
in b, despite the fact that
B'; 9A.
A possible explanation for
this surprising result is that there are more than twice as many establishments in class 1 as there are in class 2.
But the fact that
a similar result did not occur in Population 1, in which class 1
has
nearly twice as many establishments as class 2, suggests that the relative sizes of the classes may not, of itself, be very iJnI:ortant.
A
second, more probable, explanation is that class 1 is much more variable
than class 2; small changes in a are multiplied by some very large
values of Di ,
i
whereas corresponding changes in
2
b are multiplied
only by very small values of D • Thus, changes in a
2i
have a much
more drastic effect upon some of the probabilities and hence upon the
ESB than do corresponding changes in
b.
A qUick glance at the graphs is sufficient to show that, in
general,
V
l
is hardly affected by changes in
affected by changes in a.
b, and V
2
is hardly
But a closer scrutiny of the numbers in
the variance tables (e.g. Tables X-A, XI-A, XIII-A) reveals that the
and V of changes in b and a,respective~, were
2
smallest in magnitude for population 1 and largest in magnitude for
effects on V
l
Population 2.
These results are intuitively reasonable, for the
variances are functions of the probabilities, which, in turn, are
functions of both a
and b
in the case of the establislmlents in
Group III (as defined in 2.2.1).
Consequently, we would expect that
the larger the number of establislmlents in Group III, the greater the
effect upon V1
of changes in b
and upon V2
of changes in a.
The numbers in the variance tables appear to support this hypothesis,
= 4;
for in Population 1, N:;
lation :;, N:;
= 8,
where
N:;
in . Population 2, N:;
= 18;
and in Popu-
is the number of establishments in
Group III.
To summarize" we may say that, in general, negative deviations
in
a
and b
produce deviations of greater magnitude in the expected
sample size and the variances than do positive deviations of the same
amount in
a
and b.
variations in a and b
The expected sample size is less sensitive to
than are the variances.
And, finally" there
are indications that the effects of the deviations in a
and b
depend to some extent upon the relative variability of the two classes
and upon the number of establishments common to both classes.
50
4.4 The Effect Upon the Equations of Using Different Variance
Specifications.
It is perhaps worthwhile to note that although, in the development of p.d.s. sampling, variance specifications of the form
1-1ere used, there is no reason why other forms of specification might
not be used.
Suppose the specification is:
Val'
Xh = s;.
Then, the
equations derived in Chapter II are changed only to the extent that
S~ is substituted for K2.yh in them because the pro:perties of the
special form of the variance specification were never used in the
derivation of these equations.
4.5 A Method of Selecting the Sample.
Having obtained the optimal values of the probabilities, how do
we use them properly to draw a sample? Because the development of
p.d.s. sampling assumes 'that the selection of each establishment is
independent of the selection of every other one, the method of
drawing the sample must guarantee independence of selection in order
for the variances to be of the form
method is the following:
N
~
2
(l/P. - 1) Dhi • One such
. I
J.
J.=
first, draw all certainty cases into the
sample.
Suppose the values of Pi have a 11 been rounded to k
digits.
For each of the remaining establishments select a k-digit
number,
r , by some random process, e.g. a table of random numbers
i
or a computer routine to generate random numbers. Considering r i
to be a k-place decimal number, compare it with Pi:
establishment
1
if 1'1 < Pi'
is selected for the sample; but if 1'1
~
Pi'
51
establishment
i
~eJecte4.
is
4.6 Variance of the Sample Size.
At this point is becomes plain that under certain circumstances
there can be positive probability that the sample could contain all the
establishments in the population or none of them.
To be sure, the
probability of either extreme is very smallj the probability of drawing
N
Jrr
P., which is positive if Pi > 0 for all
i=l 1
N
ij and the probability of drawing none is
(1 - P.), which is
1
i=l
2
positive only if Pi < 1 for all 1. Since the Dhi values must,
in reality, be estimated, Pi Will nearly always be positive; and
all establishments is
n
populations in which there are no certainty cases do exist.
Thus it
can happen that an extreme sample is drawn.
This unfortunate possibility points up the fact that the sample
size itself can vary from sample to sample in which the same set of
Pi values is used.
If one has sufficient interest, patience, and
a computer available, one can compute the probabilities of drawing
a sample of size k, k
= 0,
1, ••• , N, using a given set of probabili-
ties for a given population of N establishments.
Since the proba-
bilities presuppose a sample of about the size of the ESS, to draw
an extreme sample will given extreme estimates,
reason for this is that in the estimate
by
Xh'
Xh'
h
every Dhi
= 1, 2. The
is weighted
l/Pi' which is a number ~ 1. Thus, if the whole population is
drawn,
Xh
will be much larger than
~.
ThiS is one form of sampling
in which the estimate is not improved by taking a larger and
sample.
]a. rger
In a practical situation, if a.n extreme sample were drawn, e.g.
52
too many small establishments in it, a statistician might be tempted to
throwaway some of them in order to obtain a moxe reasonable estimate
(in this field of industrial
s~tiatic5 he
other source concerning the true value of
often bas knowledge fram same
~) •
If he yielded to this
temptation, he would be accepting the liability of a bias of unknown
magnitude in his estimate in return for the benefits of a smaller, but
unknown, variance of the estimate and better control of the sample size.
There is a type of p.d.s.
sampling which controls the sample size
much better than the type developed in this thesis.
With this type of
sampling, probabilities are computed as functions of measures of size;
then the establishments are arranged in random order, a number
r
be-
tween zero and one is selected at random, and the first probability is
added to it.
selected.
If the sum is at least
1, the first establishment is
Then 1 is subtracted from the sum, and the second proba-
bility is added to the remainder.
If this new sum is at least
second establishment is also selected,
1
1, the
is subtracted from the sum,
and the third probability is added to the remainder, etc.
If the sum
is less than one, the establishment is rejected, and the next probability is added to the sum.
In this way only those establishments whose
probabilities cause the sum to exceed 0.999 ••. are selected.
With this
method of selection, the sample size is equal to the integral part of
N
N
the number, r + .E Pi. Now, .E P. may be written in the form,
i=l
i=l ~
n + d, where n is an integer, and 0 < d < 1. Since r was taken
to be a number such that
0
~
r < 1, the sample size will be:
53
N
1:
i=l
r
d + r< 1
,
i n + 1 if d + r> 1
.
I
a.
~
!
=
if
n
fj
\"
The price to be paid for better sample size control is loss of independence of selection, for the probability of selection is clearly
affected by the outcome of the preceding draw.
Var
Xh
has not yet been derived for this type of sampling, the op-
tima1 assignment of the P. 's
~
However, since
to be n
Since the form of
in about
about 100d percent.
r
has not been developed.
is drawn at random, we expect the sample size
100(1 - d) percent of the samples and n + 1
in
Consequently, we have a nearly fixed sample size
procedure and may be able to use results pertaining to fixed sample
size procedures with unequal probabilities and without replacement,
which are presently appearing in the various statistical journals.
particular, Hartley and Rao (reference
I"d.7)
In
have derived an asymptotic
variance for this case, but some study would be necessary to determine
whether our probabilities would satisfy their initial assumptions.
4.7 Existence and Uniqueness of a Solution for Equation (2.8).
Finally we come to the important questions:
always exist for equation (2.8)?
does a solution
If so, is it unique? To simplify
the following discussion, let us write equation (2.8) in the form:
t
1
1:
C+ g
i€III ( g2 + u.
Jti
~
(4.1)
f(g) = R
= Q,
ui
D
+
r:
ieIII
i
It i
'i
g2 + u
i
where
L: /D2il, and t i = D~i and ui = D~1 for 1 e III. Thus C
ieII
and D are non-negative numbers, and R, t., and u. are positive
D
=
~
~
numbers, i e III •
Clearly,
f(Q) is a continuous function of
Q
for all real
Q.
The first derivative is:
t
RL:
fl (Q)
=
u
i i
2
372
(t Q + u )
i
i
(D + C
2
u.~
(D + .E
)
Jti Q2 + u
f'(Q) > 0
Since
Q ~ 0,
for
f(Q)
.2
.E Jt i Q + ui )
Q +
i
is an increasing function of Q.
Furthermore,
f(O)
RC
=--D +
f(Q)
lim
Q
If C
I
_>
0,
co
.E
=
,
=
Ju;:
and
n(c + .E.ft;:)
=
D
f(O) > OJ and if D ~ 0,
.E
ieIl
lim f(Q) <
Q
_>
ID2 ./
co.
J.
In the most
co
common case, then, when each class has at least one establishment
f(O) > 0 and lim f(Q) < co j
Q_> co
that is, f(Q) > Q at Q = 0, but f(Q) < Q for large Q. Since, by
which produces only in that class,
the Bolzano Theorem, there must be at least one positive Q for
55
f(~)
which
= ~, in the general case "at least one positive solution
to equation (2.8) does exist.
D
I
0 ,
= 0,
f'CO) > 1, and
there must be at least one positive solution to equation (2.8),
for in this case
f(~)
Similarly, if C
f{~)
>
~
in some neighborhood of the origin, and
< Q for very large Q.
But if D
= 0 or if
C
=0
and f'CO) < 1, we do not know
whether a solution, other than the values,
Q
=0
and Q
= 00,
exists.
Moreover, in the cases where positive solutions do exist, we do not
know whether there are several solutions or just one.
for the uncertainty is the fact that
fll(Q) contains both positive
and negative terms, indicating that, while
it may not be monotonic; and therefore
fl(Q) is always positive,
f(Q) may increase more rapidly
for some values of Q than for others.
Conceivably, then, there
= ~,
could be one or more values of Q for which f(g)
or C
=0
= 0, where
g(~)
= f(g)
f(~)
= g,
=0
is equivalent to the equation,
- g, it is theoretically possible to
determine the number of roots of equation (2.8).
of terms and simplification, the equation,
(4.2)
even if D
and fl{O) < 1.
Because the equation,
g{~)
The reason
RC+g(
L:
g(Q)
- D)
After combination
=0,
=
is of the form,
0
i€III
By a process of squaring and combining terms, this equation can be
put into the form of a polynomial equation, the degree of which is
a function of N • Thus, if N
3
3
is large, the polynomial has a great
56
many roots.
However, of these roots there may be many negative
and imaginary ones; and, furthermore, in the process of squaring
the LHS of equation (4.2), we may have introduced negative roots.
Therefore, it is probably true that the number of positive roots
for equation (4.2) is considerably smaller than the degree of the
polynomial.
In a given case, when we know the values of the popu-
lation parameters, the polYnomial may be of help to us to find the
positive roots of (4.2).
But in general, we do not knoW' exactly
how many positive roots there are for equation (2.8).
Aware that no finite, positive solution may exist or that more
than one may eXist, what can we say about the iterative method of
solving equation (2.8)1
First of all, we should beware when either
C or D is zero; that is, when all of the establishments manufacturing
in one class also manufacture in the other.
The computer could easily
be programmed to deal with these cases separately.
Secondly, to help
us to determine whether additional roots exist, we could add a short
subroutine to the computer program to evaluate
twenty points between
or some arbitrary
f(Q)
at, say,
Q = 0 and Q = lim f(Q), provided D ~ 0 ,
Q-value if D
= O.
Q_>
<Xl
While such a set of
f(Q)-
values is not guaranteed to detect every fluctua. tian of f'( ~} , it
might indicate certain Q-intervals which might be worth further investigation.
Thirdly, it is important to realize that the iterative
process does not converge to every value of
If
f(gl)
= Ql'
f(Q2)
= Q2'
and f(Q) > ~
9
for
iterative process will converge to Q2; but if
for which
Ql
f(9)
= g.
< Q < Q2' then the
reg) < Q for
57
QI <
~
f(Qi)
< Q2' then the iteration Will converge to
= Qi'
1
= 1,
~l'
Therefore, if
I < Q < G2 and
then the iterative process Will ignore
2, 3, and f(Q) < Q for
Q
9 < Q< 9 ,
2
3
the root, Q = 9 , for all values of 9 except in the Virtually im2
possible event that Q is chosen as a starting point of the process,
2
A further modification of the program may help us to detect
f(Q) > Q for
multiple roots and missed roots,
In principle, the computer would
begin the iteration at the value,
Q = f(O), and continue until it
obtained a solution,
again at the value,
Q,
I
Then it would begin the iterative process
~ =
lim f(Q), and continue until it obtained
Q_>
a second solution, Q , If Q = Q2' there is exactly one distinct
I
2
positive root. If Q -:J Q2' the computer could begin the iterative
I
00
process again near the midpoint of the interval
(QI' Q2)'
third iteration yielded still another distinct root,
If this
Q ,then the
3
iterative process would have to be started a fourth time, near the
midpoint of the larger of the intervals, (Ql' Q,), (Q3' Q2)'
This
series of iterations would have to be continued until the process
converged to one of the previous roots,
Then, because there will be
a missed root between every root already obtained (except in the rare
case when y
= f(Q)
is tangent to the line,
y
= Q,
at Q), the com-
puter would have to go into an iterative regression routine, regressing 5i = (f(Qi) - Qi) on Qi in a neighborhood of the midpoint
between two roots, in order to estimate the value of Q for which
5 = O.
Finally, for every root, the expected sample size would have
to be calculated:
the solution to our problem would be the set of
58
probabilities which yielded the smallest
ESS.
Clearly, if the iterative process detects more than two distinct
roots, the method of obtaining all the roots and finally obtaining
the solution is so complicated and time-consuming that it will be
prohibitive to all but those having an extraordinary need for accuracy_
Undoubtedly, most people will simply
already obtained.
settle for one of the roots
In that case, quick inspection of the two or three
sets of probabilities obtained should determine which of them, if any,
Wov1.&!e a maXimum ESSj a nonsensical, however mathematically valid
(in that it gives the maximum ESS), solution would tend to assign high
probabilities to the small establishments and relatively small proba.
bilities to the large ones.
To summarize our knOWledge concerning the existence and uniqueness of solutions for equation (2.8), we can say that in general, a
positive solution does exist; only when C = 0
or D
III
possibility that an acceptable solution will oot exist.
0
is there a
Furthermore,
the solution may not be unique, but if it is, this fact may be determined by going through the iterative process twice, once using the
initial value,
Q
= f(O),
the other using the initial value,
=
lim f(Q).
If the solution is not unique, we must find the
Q_> co
roots which the iterative process ignores. Finally, the number of
Q
roots is finite, and almost assuredly many of then are nega.tive
complex.
or
A simple and quick subroutine can help us to graph the function
for positive
Q and thus determine fairly well whether to expect a
multitude of roots or not.
59
In brief 1 the iterative process and the computer program. developed in this thesis are useful in many cases 1 but they do not constitute a perfect solution for the problem.
CHAPTER V
A COMMENT ON THE SOWTION TO THE PROBLEM
FOR A GENERAL NUMBER OF CLASSES
Because the techniques utilized in the case of two classes cannot be extended in a straight-forward fashion to three or more
classes, the author can only make a suggestion as to a possible
way to approach the general case.
Since for many of the popula-
tions studied in the previous chapters the optimal values of a
and b differed only slightly from
ci
and c~, where
cl
and
c 2 are the coefficients in the optimal class probabilities,
Phi
= ch
fDhif, h
= 1,
2, perhaps these coefficients could serve as
a relatively good first approximation to the solution of the equations, (2.:;):
N
!:
i=l
j~
h=l
An
~D~
=
2
S
h
N
2
D ,
i=l hi
+ !:
h
= 1,
2, ••• , n •
approach that might be worth investigating involves solving equa-
tion (5.h) for
N
!:
i=l
~,
holding the other n-l Ats
fixed:
61
. h = 1, 2, ••• , n.
In particular, suppose we set
Then, holding A.
(5.1)
>--1
for
A.
j
fixed,
j
= 2,
~
= c~
, h
3, ... , n,
l
• This might be done by an iterative process in which
l
= a,
A.
l
= b ,
B".f a process of halving this interval, a
could be obtained which would satisfy (5.1) to Within
l
a specified tolerance.
(5.1) and keeping
(5.2) for
(a,b) is found such
the LHS of (5.1) > RHS (5.1), and when A.
LHS (5.1) < RHS (5.1).
value of
2, ••• , n.
let us solve equation
is reduced systematically until an interval
that when A.
= 1,
A.
2
A.
j
Next, setting
= c~
A.
equal to the solution of
l
= 3, 4, ••• ,
, j
n, let us solve equation
by the procedure outlined above.
In general, solve equation (5.h) for
the values which solved equations
~,keeping A.
j
fixed at
(5.j), j = 1, 2, ••• , h-l, and
2
' k = h+l, ••• , n. When all n A. I S have been so
k
modified, return to equation (5.1) and recompute the solution for
keeping
~
= c
A. , using the new values of
1
A.
j
, j = 2,
process of solving equations (5.h), h
3, ••• , n. Continue the
= 1,
2, ••• , n, successively
until the solution to each equation changes by less than arbitrary
amount from one time to the next.
This brief outline is just a suggestion, one which may prove to
be full of flaws.
Among other things, the author does not know
whether this technique will converge to a solution at all, and if
it does, whether the solution will be unique.
62
Another method of solution which might be explored is
H. O.
Hartley's technique of nonlinear programming by the simplex: method
(reference f~7).
It may be that the forms of equations (2.1) and
(2.3) Will have to be modified in order to satisfy Hartley's requirements, but it is not immediately obvious that such modification
is impossible.
In reference f~7 Hartley d.iscusses the problem of
convergence of the simplex method, giving a proof that, if a unique
solution eXists, the simplex method converges to it.
63
Tt.3LE
I-:'~
Author's Artificial ?o;ulation
Establishment
Yl1
i
1
1
1
2
3
4
5
2
4
7
9
11
18
58
59
66
75
78
82
82
92
256
261
316
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
598
_
605
768
957
987
2,868
4,188
5,613
7,525
16,513
36,508
Inul
Y2i
I 2il
100
912
199
14
3
38
2,961
91
1
18
424
49
0
82
5
7
5
19
79
2
7
34
4
264
9,299
862
9
64
499
168
3
30
578
51
15,760
9
137
29
944
2
1
2
0
4
22
54
64
43
59
115
447
1,589
1
0
5
3
1
3
6
0
12
5
60
22
30
32
15
14
20
111
59
195
136
50
11
281
352
920
56
123
695
5,573
7
8
28
91
95
135
222
396
701
969
6,780
Y1
= 78,608
Y2
= 40,593
D
9
e
e
e
TABLE II- A,_
Results of the ilethods of !\o'0roxhnation,
!"-~7)roximation I
e
a
b
I
~ound
I
Error
in e
Error
in a
Error
in b
,
Arithmetic
'iean
I
'.'!eJian
0.168
3.£08 (10- 8 )
127.835 (l0-8)
-41. 87%
-18.94%
+139.86%
0.409
8
2.943 (10- )
17.625 (10- 8 )
+41.52%
-33.77%
-66.93%
0.399
2.911 (l0-8)
18.283 (10- 8 )
+38.06%
-34.60%
-65.70%
-1.38%
+2.54%
+5.43~;
Error
in e
Error
in a
Error
in b
-/... 02%
+63.51%
+77.51%
+2.51%
~eometric
'Tean
J'1txed
I
0.285
4.564 (l0-8)
56.192 (lO-8)
True lJutimal
I
0.289
4.451 (10- 8 )
53.296 (10- 8 )
TABLE III-,"..
T.(esu1ts of the 'Iethods of AOl)roximation, 1'>.ound I I
Approximation
I
Q,
::l
b
I
,
Arithmetic
Hean
I
0.477
24.217 (l0-3)
106.434 (10- 8 )
Ifixed
I
0.502
15.489 (10- 8 )
61.463 (l0-8)
+1.01%
+4.58%
True Optimal
I
0.497
14.811 (10- 8 )
59.960 (lO-8)
----
----
I
---- '"
.po.
e
e
T~BLE
e
IV-A
Variances and EXDected Sample Size Achieved by the
Approximation
VI
V2
~etbods
of A07)roxirnation
ZSS
* Error
in VI
* Error
in V2
* Error ,
in ESS
P~rit~metic
3,563,269
1,574,505
6.616
-29.17%
-39.39%
+19.66%
14,321,564
7,506,951
3.444
+184.67%
+188.96%
-37.71%
'fixed
4,889,248
2,543,663
5.582
-2.82%
~2.09%
+0.96%
True ':lptimal
5,027,902
2,597,735
5.529
-0.06%
-0.01%
Sryecifications
R = 8
5,080, 912
2,597,952
Tean
~1edian
*Error in the variance is ~iven as ')ercenta~e of the variance suecification, Yl1hile error in expected
sample size is fiven as nerc~nta~~ of the o~tima1 exnected sample size.
Cl'
VI
66
TABLE V-A
Probabilities Assigned by Methods of Approximation
D 2
1i
i
1
2
1
0
21:)
9
1
0
36
0
~
4
~
7
8
9
10
11
14!~
13
14
25
3,600
484
900
1,024
1"i
PPI:)
16
17
196
400
1?
]8
l?~Pl
19
20
3,481
38,025
, A' I. at::
2,500
121
78.q61
123,904
846,400
Pl
22
23
24
25
26
~ ,~~
P7
28
29
~o
31
32
~~
34
35
36
37
38
~1
15,129
483,025
01:)8.,20
D
2i
2
Arith. Mean Median
Pi
Pi
Mixed
Pi
True Optimal
Pi
0.0002
0.0038
0.0835
0.0005
0.0013
0.000"5
0.0076
0.1780
0.0207
0.0009
0.0103
0.0038
OA0052
0·.0059
0.0004
0.0071
0.11')60
0.0012
0.0024
0.0012
0.0143
0.3324
0.0004
0.0070
0.1541
0.0012
0.0024
0.0012
0.0141
0.3283
0.0382
0.0019
0.0231
0.0085
0.0115
0.0129
0.0070
0.0054
0.0094
0.0'102
0.0229
0.0750
0.0721
0.3869
0.1302
0.1082
0.1355
0.3541
0.l083
0.0524
0.7784
1.0000
0.0008
0.0015
C
81
~q.601
9
324
179,776
2.401
0
25
21')
49
1 .1'16
16
4oo~
249,001
26,224
9
81
lA,76q
841
891,136
1
4
0
16
484
2 916
4,096
1,849
'5Q
~.481
40
41
42
13,225
199,809
2.'124 921
0.0005
0.0093
0.2053
0.0015
0.0031
0.0011')
0.0188
0.4375
0.0'509
0.0025
0.0295
0.0108
0.0148
0.0166
0.0000
0.0069
0.0122
0.064q
0.0293
0.0960
() noun
0.5154
0.1734
0.1383
0.1732
0.4529
0.1440
0.0675
1.0000
1.0000
0.0010
0.0021
0.0000
0.0041
0.0227
0.0'1'57
0.0660
0.0444
0.0609
0.1187
0.4612
1.0000
0.0~87
0.0020
0.0236
0.0087
0.0118
0.0132
O.OO~~ 0.0071
0.0024 0.0055
0.0045 0.0096
0.02~8 0.0'112
0.0103 0.0234
0.0335 0.0767
O.o~~l:;
0.07~4
0.3917
0.1318
0.04B~ 0.1106
0.0604 0.1385
0.1580 0.3621
0.2097
0.0706
O.o~A~
o ,nQ~
0.0244
0.4139
0.9"569
0.0004
0.0008
0.0000
0.0017
0.0092
0.0227
0.02b9
0.0181
0.0248
0.0483
0.1877
0.6671
0.0535
0.7890
1.0000
0.0008
0.0016
0.0000
0.0031
0.0172
0.0423
0.0502
0.0337
0.0463
0.0902
0.3504
1.0000
() r"'lr"'lr"'lr"'l
0.0031
0.0170
0.0418
0.0496
0.0333
0.04'57
0.0890
0.3461
1.0000
67
TABLE VI-A
Optimal
pI
i
i
1
2
3
4
5
6
7
8
1
0
25
9
1
81
39 601
9
q
36
0
324
179,776
0.0005
0.0000
0.0024
0.0014
0.0005
0.0014
0.0029
0.0000
0.0073
0.1606
0.0024
0.0145
0.3421
Pi
0.0005
0.0073
0.1606
0.0014
0.0024
0.om4
0.0145
0.3h 21
0.0004
0.0070
0.1541
0.0012
0.0024
oom~
0.0141
0.3283
t-:-9~_ _---,1=-:44~t--_-=2..L:
.4~0:.=.l+1-~0~.
0~0;..,:,;5l8:::-+....:0l'.:.~0",3)95,-,-~0:..:..:.0~)'3u)'9.5.,,- 0.0':582
10
25
0
0.0024
0.0000 0.0024
0.0019
11
3,600
0.0290
0.0290 0.0231
12
484
O.. OlOh
0 .. 0106
O.ooA")
13
900
0.0145
0.0145
0.0115
14
1,024
25
0.0155
0.0040 0.0155
0.0129
15
225
21'5
0,0072
0.0040 0.0072 0.0070
16
196
0.0068
0.0068 0.0054
17
400
49
0.0097
0.0056 0.0097
0.0094
18
12.321
1.156 0.057S6
0.0274
0.01'5':56 0.01'502
19
3,481
16 0.0285
0.0032 0.0285
0.0229
20
38,025
0.0942
0.094~
0.0750
21
18 496
4.096 0.0567
0.0516 0.0567
0.0721
22
2,500
249,001 0.0241
0.4027 0.4027
0.3869
23 .
121
28,224
0.0053
0.1356 0.1356
0.1302
24
78.961
9 0.Pi1'57
0.oo?4 o. 1~1:)7
O. loB?
25
123,904
0.1700
0.1700! 0.1355
26
846,400
81 0.4443
0.0073 0.4443
0.3541
27
':5 1':56
18.769 0.0270
0.1106 0.1106' 0 108"2)
28
15,129
841
0.0594
0.0234
0.0594
0.0524
29
483,025
891,136
0.3356
0.7618 0.7618 0.7784
30
31.058.329
1.0000
1.0000 1.0000
31
1
0.0008 0.0008 . 0.0008'
I 0.0015
32
4
0.0016 0.0016 R
1':5
0
00 OQO:=O-+-.:.<.JO"woO~OOO i 0.0000
34
16
0.0032
0.0032 I 0.0031
35
484
0.0178 0.0178 0.0170
36
2 . 916
0.04':56 o. 04 ~6
0 O!J. 1 A
37
4,096
0.0516 0.0516 0.0496
I
S~
40
41
42
~'~~i
13,225
199,809
2 ~ 524 •921
g:g~i~ g:g,i~ I g:g,~~
0.0~)28
0.3607
1. 0000
0.0928'
0.3607
1. 0000
0.0890
0.3461
1. 0000
68
TABLE VII-A
Variances and Expected Sample Size Resulting from
Use of Pli , P2i , Pi' anq Pi' = max(Pli , P2i )
P
li
Var Xl"
Var X2"
Expected
Sample Size
P
2i
5,030,912
P'
i
Optimal
Pi
Specifications
3,976,063 5,027,902 5,030,912
2,597,969 2,508,729 2,597,135 2,591,952
2.554
3.~60
5.782
5.529
TABLE VIII..A
69
This table is a collection of twel~e charts which show the
effects which deviations in a and b from their qptimal values,
A and B, respectively, have upon the values of the expected
sample size (ESS) and the variances, VI = Var Xl" and
V = Var X ", in three populations. In the first population,
2
2
A : B ;,.. 4 : 5, and there are four establishments in Group III.
In the second population, A : B ;,. 20 : 81 ~ 1 : 4, and there
are eighteen establishments in Group III. In the third population, A : B ::: 46 : 415 ::: 1 : 9, and there are eight establishments in Group III.
In all charts the range of a is 80?o A
the range of b is 8O?o B -;; b -;; 120?o B.
sample size is defined to be
~ a -;; 120?o
A, and
Change in expected
100 (ESS - Optimal ESS) / (Optimal ESS);
that is, the proportional change expressed in percent.
change in Var Xh" is defined to be·
Similarly,
100 (Vh - ~Yh) / (~Yh)'
where ~Yh is the specification; h = 1,2.
Chart I
Chart II
Chart III:
Chart IV :
the plane, b
"
" , a
II
"
"
"
= B.
= A.
, biB = 200?o - a/A.
, biB = a/A.
LEGEND
change in expected sample size.
change in Var Xl".
change in Var X ".
2
e
e
e
'"
TABLE IX-A
Percentage Change in Expocted Sample Size (population 1)
~
0.80 B
0.90 B
0.94 B
0.96 B
0.98 B
B
1.02 B
1.04 B
1.06 B
1.10 B
1.20 B
B - .20A
B - .:lOA
B - .OGA
B - .04A
B - .02A
B + .02A
B + .04A
B + .06A
B + .10A
B + .20A
0.30 A 0.90 A 0.94 A 0.96 A 0.93 A
-5.435
-2.642
-1.568
-1.040
-1.954 1-0.950
-0.564
-0.374
-0 517
-0.186
+0.142
+0.279
+0.411
+0.659
+1.193
A
1.02 A
-3.480
-1.692
-1.004
-0.666
-0 331 1-0 147
,0.000 +0.184
+0.328 +0.512
+0.653
+0.975
+1.609
+3.147
1.04 A 1.06 A
1.10 A
1.20 A
-1.712
-0.788
-0.457
-0.299
+0.367
+0.547
+0.904
+1.767
+1.020
+1.522
+2.513
+4.914
-1.015
-0.457
-0.262
-0.171
-0 083
+0.079
+0.155
+0.226
+0.357
+0.612
~
0'\
e
e
e
TABLZ X-P_
Percentage Chanoe in Var Xl" (PoT)u1ation 1)
~I
0.80
0.90
0.94
0.96
0.98
B
0.90 A
0.94
A..
0.96 A
+3.421
+2.245
B
+1.105
f). 000
-1.073
+1.102
-0.003
-0.007
-0.010
-0.016
-0.031
-1.076
+2.238
13
B - .20'\
B - .10A
B - .06A
13 - .04A
B - .02A
B + .02A
3 + .O4'\.
B + .06A
B + .1OA
B + .20A
-1.070
+2.252
+5.839
1.02 A
+1.109
+3.432
+12.848
A
+0.036
+0.017
+0.010
+0.007
+0.003
+5.908
B
B
B
3
B
B
0.98 A
+12.891
:3
13
1.02
1.04
1.06
1.10
1.20
0.80 A
+3.411
+5.871
+12.810
1.04 A
1.06 A
1.10 A
-s.n"?
1.20 A
-9.457
-3.117
-2.108
-2.115
-3.127
-5.068
-9.489
-2.121
-3.130
-5.083
-9.516
-9.464
-5.055
-3.119
-2.109
-1.070
+1.103
-1-2.240
+3.413
+5.875
+12.817
1
.....
......
e
e
e
TABLE XI-A
Percentage Change in Var X2" (Population 1)
~
0.00
0.90
0.94
0.96
0.93
B
1.02
1.04
1.06
B
B
B
B
B
B
B
B
I.le B
1.20 B
B- .20A
B - .lOA
B - .06A
B - .04A
n - .02A
B + .02A
B + ,04A
B + .OGA
B + .1GA
B + .20A
0,80 A
0.90 A
0.94 A
0.96 A
0.93 A
A
+1.235
+0,002
-1.195
+14.333
+6.569
+3.316
+2.504
+1 233
0,000
-1.197
-2.359
-3.382
-5.652
-10.583
+14.360
+6.581
+3.316
+2.509
+0.023
+0.011
+0 ,006
+0 ,004
-2.354
-3.4&1
-5,642
-10,563
1.02A
1.04 A
1.06 A
1.l0 A
1.20 A
+14.310
+6.550
+3.810
+2.500
+1.231
-0,002
-1.199
-0,004
-0,006
-0.010
-0.019
-2.363
-3,493
-5,662
-10.600
+11.174
+5.218
+3.051
+2.009
+0.992
-0,968
-1.913
-2.837
-4.621
-8.749
71>
e
TABLS
Percenta~e
b~IO.80 A
0.80
0.90
0.94
0.96
:3
1.02
1.04
1.06
1.10
1.20
13 -
BBBBB+
B+
B+
B+
B+
.20'\
.101\
.06A
.04A
.02A
.02A
.04A
.06A
.lOA
.20A.
0.94 A
0.96 A
0.98 /\.
-1. 944
-1.289
-1.870
-0.912
-:].542 .
~O.360
y::n- ',.
in Exnected Sample Size (Population 2)
-3.275
B
B
B
B
:3
0.90 A.
C)3n~e
' -6.738
B
B
B
0.98 B
B
e
e
-0.641
-0.179
+0.279
+0.552
+0.820
+1. 341
+2.551
!\
-4.841
-2.357
-1.400
-0.929
-0.462
O.OGO
+0.458
+0.911
+1.360
+2.247
+4.399
1.02 A.
1.04 A
1.06 A
1.10 A
1.20 A
-3.103
-1.477
-0.869
-0.574
-0.284
+0.178
+0.635
+0.354
+0.528
+0.873
+1. 713
+1.264
+1.887
+3.115
+6.091
+0.570
+0.304
+0.187
+0.126
+0.064
-0.066
-0.133
-0.203
-0.347
-0.742
I
-..J
\0
e
e
e
TABLE XIII-A
Percentage Change in Var Xl" (Population 2)
~
0.80 B
0.90 B
0.94 B
0.96 B
o 98 B
B
1.02 B
1.04 B
1.06 B
1.10 B
1.20 B
B - .20A
B - .10A
B - .OGA
B - .04A
B - .02A
B + .02A
B + .04A
B + .OGA
B + .10A
B + .20A
0.80 A 0.90 A 0.94 A 0.96 A 0.98 A
+15.634
+7.165
+4.162
+2.731
+13.502
+6.211
+3.613
+2.373
+1 345
+1.169
+0.997
+2.029
+3.099
+5.355
+11.786
A
+1.937
+0.911
+0.534
+0.352
+0-174
0.000
-0.170
-0.337
-0.501
-0.818
-1.561
1.02 A 1.04 A 1.06 A 1.10 A 1.20 A
-8.311
-4.505
-2.794
-1.894
-0 963
-1.136
-1.305
-2.240
-3.315
-5.378
-10.098
-2.572
-3.803
-6.164
-11.540
-9.694
-5.172
-3.189
-2.156
-1.094
+1.126
+2.287
+3.483
+5.992
+13.046
g'
e
e
e
TABLE XIV-A
Percentage Change in Var X2" (Population 2)
~
0.30 B
0.90 B
0.94 B
0.96 B
o 98 B
B
1.02 B
1.04 B
1.06 B
1.10 B
1.20 B
B - .20A
B - .10A
B - .OGA
B - .04A
B - A02A
B + .02A
B + .04A
B + .06A
B + .10A
B + .20A
0.80 A 0.90 A 0,94 A 0.96 A 0.98 A
+19.261
+8.827
+5.127
+3.365
+0.862
+0.421
+0.250
+0.166
+1.657
+0.083
-1.447
-2.853
-4.221
-6.850
-12.859
A
+18.167
+8.356
+4.860
+3.191
+1.572
0.000
-1 .. 527
-3.012
. -~.457
-7.230
-13.567
1.02 A
1.04 A
1.06 A 1.10 A
1.20 A
+17.165
+7.905
+4.600
+3.021
+1.488
-0.082
-1.608
-0.163
-0.244
-0.403
-0.791
-3.169
-4.686
-7.594
-14.218
+3.131
+1.533
+0.912
+0.605
+0.301
-0.299
-0.595
-0 ..889
-1.468
-2.874
0:>
I-'
e
e
e
TABLE XV-A
Percentage Change in Expected Sample Size (Population 3)
~
0.80
0.90
0.94
0.96
0.98
B·
1.02
1.04
B
B
B
B
B
B
B
1~06 B
1.10 B
1~20 B
B - .20A
B - .10A
B - .OSA
B - .04A
B - .02A
B + .02A
B + .04A
B + .06A
B + .10A
B + .2OA
0.80 A 0.90 A 0.94 A
0.96A
0.98 A
A
1 .. 02 A
1.04 A
1.06 A
1.10 A
1.20 A
.-
-6.160
-0.491
-0.239
-0.142
-0.094
-2.994
-1.778
-1.179
-5.667
-2.755
~1,635
-1.085
-0.586
-0.540
-0.493
-0.992
-1.,497
-2.526
-5.218
-o~057
0.000
+0.046
+0.902
+0.138
+0.228
+0.446
+4.637
+2.333
+1.446
+0.969
+0 487
+0.543
+0.581
+1.063
+1.587
+2.621
+5.125
+1.156
+1.725
+2.840
+5.569
-.,---.
+5.074
+2.595
+1.572
+1.053
+0.529
-0.534
-1.074
-1.620
-2.729
-5.615
g>
e
e
e
TABLE XVI-A
Percentage Change in Var Xl" (Population 3)
~
0.80 B
0.90 B
0.94 B
0.96 B
0.98 B
B
1.02 B
1.04 B
1.06 B
1.10 B
1.20 B
B - .20A
B - .1OA
B - .OGA
B - .04A
B - .02A
B + .02A
B + .04A
B + .OGA
B + .10A
B + .20A
0.80 A
0.90 A
0.94 A 0.96 A 0.98 A
+27.683
+12.687
+7.369
+4.836
+27.547
+12.627
+7.335
+2 381
+4.814 . +2.370
+2.360
+4.792
+7.303
+12.572
+27.432
A
+0.117
+0.056
+0.033
+0.022
+0 011
0.000
-0.011
-0.021
-0.031
-0.051
-0.099
1.02 A
1 ..04 A 1.06 A
1.10 A
1.20 A
-20.244
-10.814
-6.673
-4.513
-2 2QO
-2.300
-2.311
-4.534
I
-6.704
-10.867
-20.347
-4.555
-6.734
-10.915
-20.435
-20.337
-10.861
-6.701
-4.532
-2,299
+2.369
+4.812
+7.332
+12.621
+27.534
&
e
e
e
TABLE XVI I-A
Percentage Change in Var X2" (Population 3)
~
0.80 B
0.90 B
0.94 B
0.96 B
0.98 B
B
1.02 B
1.04 B
1.06 B
1.10 B
1.20 B
B - .20A
B - .10A
B - .OGA
B - .04A
B - .02A
B + .02A
B + .04A
B + .OGA
B + .10A
B + .20A
0.80 A
0.90 A 0.94 A 0.96 A 0.98 A
+13.217
+6.057
+3.518
+2.309
+0.631
+0.303
+0.179
+0.119
+1.137
+0.059
-0.988
-1.948
-2.880
-4.668
-8.. 738
A
+12.458
+5.726
+3.330
+2.186
+1.077
0.000
-1.046
-2.063
-3.051
-4.949
-9.283
1.02 A 1.04 A 1.06 A 1.10 A 11.20 A
+11.792
+5.416
+3.148
+2.066
+' OlA
-0.058
-1.103
-0.116
-0-172
-0 284
-0 551
-2.175
-3.215
-5.211
-9.756
+0.634
+0.307
+0.181
+0.120
+0.059
-0.059
-0.116
-0.173
-0.283
-0.538
co
-I='"
e
e
e
TABLE I-B
Census Bureau Artificial Population
Establishment
1
2
3
4
5
6
7
8
YAi
IDAil
41
253
7,785
1.310
3,980
59
5,910
7.022
41
170
7,785
389
72
0
1,019
1 353
159
2,339
24
819
1,550
1
83
9
10
11
12
13
14
15
16
17
18
19
20
21
2,2
23
24
25
26
27
28
11,557
12
438
°0
65
1,021
12
YBi
IDsil
YCi
tDCil
Inni I
YDi
!
! I~il
' YEi
I
I
20
287
20
12
146
7
11
0
8
2
471
8
52
44
417
2
°9
4
1
364
17
164
0
23
1
2
0
31
1
1
0
40
105
0
11
1
2
0
7
7
°
13
171
13
54
20
28
1
11
2
438
I
r
1
CP
\Jl
e
e
e
TABL2 I-B (continued)
EstablishmentlYPi
1
2
3
4
IDcil
YCi
tDPi!
-
-
-
-
-
7
9
-
-
-
-
192
11
-
13
14
15
16
17
18
19
20
21
22
_.
28
10
12
-
-
8
20
84
-
0
_.
-
-
0
-
5
74
-
YJi
IDJil1
1
10
-
-
-
6
8
'Dul
YIi
!
I
5
11)Hil
YTH
_.
-
2
0
5
225
96
312
32
105
1
-
-
27
91
-
-
-
-
67
76
434
-
51
5
0
4
0
-
-
-
4
-
-'
-
-
-
79
1
75
2
2
24
3
43
27
1
235
235
0
-
8'
0
0
-
0
0
2
0
140
160
160
-
-
-
Q
-
-
74
.-
7
40
248
192
312
52
225
3
_.
74
15
2
230
0
52
1
1
5
5
22
46
1
-
10
5
0
23
24
29
26
27
28
1,169
_.. _-----
191
l
ex>
0\
e
e
e
TABLE I-B (continued)
Establishment
YAi
29
30
31
32
33
34
35
36
37
t Dei I
YBi
YCi
IDcil
\
YDi
9R
IDDi
I
745
172
36
290
471
430
170
141
535
111
8
3
7
52
13
39
1
96
310
1
163
3
17
1
19
1
0
43
219
3
2
7
3
5,696
8
384
17
352
1
5
17
1
35
0
2,985
2,798
4,884
411
910
13
55
55
78
33
2.577
2.577
-:;;0
0:>0
39
40
41
42
43
44
45
46
,47
148
IDAil
4
2
P
YEi
I~il
6
1
23
6
39
2
22
0
1
0
55
21
124
0
0
11
73
188
11
4
1,831
38
19
149
50
51
52
53
--"54
55
56
57
58
59
I
.-.-..:..
CP
--J
e
e
e
TABLE I-B (continued)
I
!Establ ishment
i
29
30
31
32
33
34
35
YFi
25
36
37
38
39
40
41
42
43
ID Fi
I
3
10
28
473
38
2
324
41
3
1
48
7
0
37
1
80
3
!
YGi
I
I
IDGi
I
.YHi
i
I
!
137
i
I I
YIi
!
14
5
I
i
i
6
2
i
1
2
i
I
I
1
I
IDJi~ 'J
YJi
I
\
1
\
I
!
919
23
826
137
192
61
36
6
39
3
II
I
I
!
I
I
II
1,872
I
!
I
144
45
46
47
48
49
50
51
52
53
54
55
I I
54 I
IDIi
I
I
1
r
lDai
I
•
25
I
3
1
2
0
35
295
12
239
201
27
)
35
I
i
I
I
I
I
,
i
I
56
1
I
604
21 .
II 1,212
192
21
1,212
33
0
10
0
323
515
4
86
49
III
1
2
I
1
!
5,217
57
58
59
I
I
----------,
..
I
I
I
1___ . __.1
-._ ..... _-_._._. __ .. __ .L.._.__... __
748
I
I
I
I!
8
117
9
1
2
45
2
0
I__
j
,
CD
I»
89
TA.3IJ~
II-:i
Pairs of Classes from the Census Bureau
POP.
1
3
A
A
A
A
2
D
3 - _. - - - E
F
4
0\
G
5
H
6 - - - .tI.
!
7
A
J
A
8
9 - - - B ---~--I
10
11
B
B
12 -
B
13
14
15
16
17
18
19
20
21
22
- -
_. - - - - - -
23
I
24 25
26
27
28
29
30 31
32
33 34
35
36 37
38
39
40
41
42
43
!
44
45
- -
B
B
B
C
C
C
C
C
C
C
C
C
D
-
- -
-
_..-
-
-~
- -
_.-
-
_.
-
- -
1;'
_.
-
- - - - -
E
E
F
F
F
- -
-
- - .. -
H
H
I
-
-
-
- - -
I
J
.,
G
1;'
-I
I
- -I
-
J
H
I
J
I
J
J
-
I,
-I
.-
19
15
22
19
19
26 16
21
5 7
8
9 8
3
8
1
1
2
4
-
I
-
-
I
-j
I
-I
J
J
I
- -j
I
I
8
9
8
-
-
-
-
-
-
10
10
9
10
-' -
8
8
1
11
-
6
4
-
0
2
1
9
-
-
-'
- -
-
_.
.-
-
-
-
12 2
15
15
23
5
10 11
10
- 12 3
16
- 14 7
3
-
-
1
6
-
-
-
1
4
-
3
1
-
0
0
0
4
-
-
-
-
5
2
- - -
5
7
-
-
-
-
- 12 -
-
-
-
-
-
6
-
3
16
- - -
10
12 10
7
2 10
7
7 12
9
8 3
3
17
12
- 5
2
6
7 13
8
1
- - --
0
8
1
_.
5
5 5
in Grou" III
7
2
4
- - -
of units
~': o.
-
-
-
- - -
3 -
H
I
J
G
H
!
- -
- -
G
H
- -
-
F G
G
G
E
F
G
H
E
- -
B
D
- - -
OJ
E
J
A
I
J
E
F
1)
D
D
D
D
F
G
H
I
No. of units
in r;rou') II
rJo. of units
in Grou"-' I
I
Class 2
Class 1
Poryu1~tion
6
10
-
-
- -
-
-
3
19
11
-
4 3
-
6
0
5
4
4
2
- 0
2
-
5
8
0
18
3
13
3
3
18
12
21
16
12
- - -- - -' -- -
0
3
4
0
0
4
90
TABLE III-B
~~~.
Pop.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
1'5
16
17
18
19
20
21
22
23
2 1+
25
26
27
2CS
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
.
Values of
Q,
a, and b Obtained from Mixed Approximation
Mixed Approximation
a
b
Q
.35745+1
.42885
.34990
.33247
.83809
.70857
.38359
.63297
.24899
.28147
.25365
.39554
.57138
.13569
.42687
.10293
.12523+1
.'5'5280-1
.42548
.40491
.6'3129
.91117
.77118-6
.18503
.10998+1
•J'8l91
.15616+1
.22322+1
.55310
.22606+1
.63165
.11440+1
.20444+1
.61479
.11473+1
.10360+1
.22503+1
.66787
.16341+1
.14433+1
.42151
.10801+1
.30794
.75139
.35538+1
.44057-6
.42115-6
.4'3323-6
.43055-6
.43132-6
.43802-6
.41919-6
.43326-6
.28038-6
.2e421-6
.38472-6
.28494-6
.28491-6
.14776-6
.28191-6
.46414-8
.44748-6
.12365-7
.65947-6
.72583-6
.72583-6
.72583-6
.5668-17
.'52952-7
.41979-~
.39364-5
.38921-5
.43562-5
.26360-5
.4271'5-5
.17235-5
.23166-5
.36540-5
.34837-5
.23561-5
.18423-5
.44270-5
.41421-5
.1-1-2854-5
.18213-5
.15728-5
.17782-5
.87426-6
.87426-6
.9:t.871-5
.34482-7
.22899-5
.3'5386-'5
.38951-5
.61408-6
.87241-6
.2e4CS9-5
.10814-5
.45225-5
•35e74-5
.44255-5
.18213-'5
.87266-6
.80258-~
.1'5471-'5
.43808-6
.28532-6
.43559-5
.36428-5
.44270-5
.18213-5
.87426-6
.95312-5
.1'5467-'5
.34708-5
.40828-5
.15960-5
.87426-6
•861tB-5
.83'585-6
.43197-5
.17702-5
.87426-6
.92170-5
.17899-~
.17165-5
.87426-6
.92861-5
.16049-'5
.87426-6
.88523-5
.15241-5
.92196-5
.15485-5
.72745-6
Q
I
.36892+1
.43188
.36910
.33293
.82239
.70857
.3CS276
.63094
.25381
.28147
.25365
.39'5'54
.57138
.13586
.42687
.10296
.12912+1
.'53280-1
.42548
.40491
.6'5129
.91117
.77118-6
.18'503
.11811+1
.97190
.15205+1
.22322+1
.54698
.22407+1
.64336
.11440+1
.20444+1
.61479
.14092+1
.11'544+1
.22503+1
.67578
.16787+1
.14433+1
.42352
.11121+1
.30794
.75139
.35299+1
True Optimal
a
.43794-6
.41984-6
.43347-6
•43013-€i
.42899-6
.43802-6
.41852-6
.43261-6
.38046-6
.28421-6
.28472-6
.28494-6
.28491-6
.14774-6
b
.32177-7
.22509-5
.31817-'5
.38806-5
.63429-6
.87241-6
.28567-5
.10867-5
.43536-5
.35874-5
.44255-5
.18213-'5
.87266-6
.80043-5
.28191~6 .15471-'5
.46429-8 .43802-6
•47446-€i .28460-6
.12365-7 .43559-5
.65947-6 .36428-5
.72583-6 .44270-5
.72'58'5-6 .1821'5-'5
.72583-6 .87426-6
.5483-17 .92196-5
.52952-7 .15467-5
.40844-5 .29279-5
.38110-5 .40345-5
.37596-5 .16262-5
.43562-5 .87426-6
.28767-5 .86121-5
.42139-5 .83929-6
.17680-5 .42716-5
.23166-5 .17702-5
.'56'540-1) .87426-6
.34837-5 .92170-5
.29250-5 .14729-5
.20816-5 .1'5619-5
.44270-5 .87426-6
.42007-5 .91982-5
.43089-'5 .15290-5 i
.18213-5 ·.87426-6 I
.15815-5 .88167-5 I
.18018-5 .14567-5 '
.87426-6 .92196-5
.87426-6 .15485-5
.91386-5 .73340-6
91
TABLE IV-B
Error in Q, a; and b from Mixed Approximation
I
Pop. I
1
2
,
~
4
I
~ I
7
8
9
10
11
I
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
,
I
I
~o
31
32
33
34
35
36
37
38
39
40
41
42
43
44
4'5
II
I
70inError
Q
-3.109
-0.702
-5.202
-0.138
+1.909
0.000
+0.217
+0.322
-1.899
0.000
' 0.000
0.000
0.000
-0.125
0.000
-0.029
-3.013
0.000
0.000
0.000
0.000
0.000
0.000
0.000
-6.883
+1.030
+2.70':5
0.000
+1.119
+0.888
-1.820
0.000
0.000
0.000
-18.585
-10.256
0.000
-1.170
-2.657
0.000
-0.475
-2.877
0.000
0.000
+0.677
70 Error
in a
-+0.601
-+0.312
-0.055
+0.098
+0.543
0.000
+0.160
+O~150
-0.029
0.000
0.000
0.000
0.000
-+0.014
0.000
-0.032
-5.686
0.000
0.000
0.000
0.000
0.000
0.000
0.000
+2779
+3.290
+':5.'524
0.000
+2.301
+1. ~67
-2.517
0.000
0.000
0.000
-19.450
-11.496
0.000
-1·395
-0.545
0.000
-0.550
-1. ~10
0.000
0.000
+0.531
70 Error
in b
+7.164
+1. 733
+11.217
+0.374
-3.186
o 000
-0.273
-0.488
+3.880
0.000
0.000
0.000
0.000
+0.269
0.000
-+0.014
-+0.253
0.000
0.000
0.000
0.000
0.000
0.000
0.000
+18.542
+1.197
-1.857
0.000
-+0.055
-0.410
+1.126
0.000
0.000
0.000
+21.522
+9.898
0.000
-+0.956
+4.964
0.000
+0.404
+4.627
0.000
0.000
-0.811
92
TABLE V-B
Optimal Values of the Variances and Expected Sample Size
for Pairs of Census Bureau Classes
Pop.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
V1
I
.
6
i 3;833,21 .1
! 3,833,216.0
I 3,833,216.1
I 3,833,216.2
3,833,216.2
3.833.216.0
3,833,216.2
3,833,216.2
264,064.00
264,064.00
I 264,061.28
264.064.01
264,063.87
I
264,064.03
264,064.03
30,144.263
30,144,020
30.143.871
30,143.978
30,144.000
30.144.000
30,144.000
5,705.110 .
30,143.948
121,856.00
121,856.00
121 855.99
121,856.00
121,855.82
121,855.96
52,928.015
52,928.043
52,928.001
52,927.999
I
52,928.015
77.248.071
77,248.000
77,248.006
77,248.002
!
310,016.01
310,016.00
310,016.00
409,344.00
I 409,344.00
I 166,719.97
I
I
I
I
!
V2
i
64 Yl
264,063.97 3,833,216
:1
121,855.91
11
52.927.986
I
11
77,247.979
310,016.24
409,~44.00
166,720.14
239,872.36
121,855.99
52,928.053
77,247.999
310.016.00
409,344.00
166,718.15
239.872.19
3,833,216.2 I
264,063.97
121.856.00
52,928.000
77,248.001
310.016.00
409,344.00
166,720.00
239.872.00
52,927.955·
77,248.052
,10.016.00
409,344.00
166,720.00
239.872.00
77,248.001
310,016.00
409,344.00
166,720.00
239,872.01
310.016.00
409,344.00
166,720.00
239.872.01
409,344.00
166,719.68
239.871.83
166,720.00
239,872.00
239,872.00
I
I
I
64 Y2
264,064
121,856
52.928
77,248
3J.0,016
EPi
6.9457817
7.1206693
7.0700847
7.2026494
7.0959422
"
409,~44 I 7.8149409
"
166,720 7.3619005
"11
239,872 18.1739817
264.064
121.856 I 1.6846590
52,928 1.2238279
11
77,248 11. 4133090
"
310 016 1.7182228
409,344 1.9084878
11
166,720 3.4502560
11
239.872 2.4?c;6281
30,144 3,833,216 6.9380231
264,064 1.0445214
"
121,856 0.6703054
"
52,928 0.2183175
II
77,248 0.4050873
"
310.016 0.7090677
II
409,344 0.9010662
II
166,720 3.3582320
11
,
239.872 1.4068194
121,856
52,928 0.7874769
77,248 0.9351167
11
~10 016 1.1915595
"
409,344 1.5488872
"
166,720 3.5332655
II
239.872 1.8679661
"
77,248 0.4656822
52,928
310,016 0.7937072
"II
409.344 1.0777170
166,720 3.5468236
II
II
239,872 1.5450134
77 248
Ai10 OJ~6 O. -~ ~II
409,344 1.2618518
166,720 3.7137998
II
239.872 1.7732813
"
310,016
409, 34 .j. 1.5658322
166,720' 3.8079726
"II
239.872 2.0611893
409,344
166,720 ' 4.2371474
239,872 ' 2.2845830
II
166,720
4.5208524
tI
It
I
"
I
9'
TABLE VI-B
Error in the Variances Expresseo as Percentage of the Specirications
i~
~o Error
Error
Pop.
in V2
in VI
1
.326097-05
-.118343-04
2
.000000
-.729280-04
•3-26097...05
3
-.258311-04
4
-.265480-04
.652194-05
.768610-04
.652194-05
5
6
.000000
.000000
.843480-04
7
.652194-05
8
•• 652194-05
.149819-03
.000000
-.4808!J.4-05
9
10
.000000
.996342-04
11
-.102958-02
-.126419-05
12
•126002-G5
.295856-05
.000000
13
-.473370-04
14
-.111175-02
.118343-04
15
.118343-04
.773525-04
16
.652194-05
.872278-03
17
.672229-04
-.118343-04
18
.160281-05
-.428445-03
19
-.737022-04
.922539-06
20
.126419-05
.161983-05
21
.126002-05
-.809914-06
22
.000000
.809914-06
.000000
-.810738-1{)2
23
24
-.J.70892-03
-.162847-05
.801405-06
25
-.848736-04
26
.000000
.670021-04
27
.126002-05
- .104187)-04
28
.000000
.000000
.117150-05
29
- .150665-03
.000000
30
-.312549-04
.126419-05
31
.276762-04
.000000
.821059-04
32
33
.184508-01:)
.000000
34
-.276762-05
.117150-05
.244271-05
.285987-04
.922860-04
-.126002-05
.000000
.000000
37
38
-.117150-05
.758515-05
.244271-05
.252838-05
39
40
.000000
.252003-05
41
-.126002-05
- .193298-03
42
-.708386-04
- -0.126002- 05
.000000
.000000
43
44
.000000
-.814236-06
-.164010-04
-.162847-05
45
70
~~
APPENDIX C
Computer Program in the GAT Language
(0001)
55 IS THE HIGHEST· STATEMmNT NUMBER
DnmJSION C(ll5) D(100,2,10) I(45)
(0002)
X(90, 15, 12) Y(B5, 15, 7) Z(140)
(0003)
(0003) 1 READ
2, IO, 1,1,7,
(0004)
(0005) 2 KIO = 0
38, IO, 1,1,75,
(0006)
X(IO + 11) = 0.0
(0007)
(0008) 38 Y(IO + 6) = 0.0
(0009)
c6 = O.
(0010)
C7 = O.
(0011)
c4 =KlO * KlO * C2
(0012)
C5 = KlO * K10 * C3
11, IO, 1, 1, KO,
(0013)
c4 = c4 + D(IO, 1)
(0014)
C5 = C5 + D(IO, 2)
(0015)
GO 'IQ 3 IF D(IO, 1) V O.
(0016)
(0017)
Kl1 = 7
(OOlB)
GO TO 10
(0019) 3 GO TO 4 IF D(IO, 2) V O.
Kl1 = 6
(0020)
GO TO 10
(0021)
(0022) 4 GO TO 8 IF D(IO, 1) WD2
GO TO 8 IF D(IO, 2) WD4
(0023)
GO TO 6 IF D(IO, 1) WD1
(0024)
GO TO 5 IF D(IO, 2) WD3
(0025)
(0026)
Kll = 1
(0027)
GO TO 9
(0028) 5 Kl1 = 2
J(45) K(20) N
(0029)
(0030)
(0031)
(0032)
(0033)
(0034)
(0035)
(0036)
(0037)
(0038)
(0039)
(0040)
(0041)
(0042)
(0043)
(0044)
(0045)
(0046)
(0047)
(0048)
(0049)
. (0050)
(0051)
(0052)
(0053)
(0054)
(0055)
6
7
8
9
10
11
12
13
14
(0056\
.
.' I
(0057)
(0058) 15
(0059) 16
(0060)
(0061)
(0062) 17
(0063)
GO TO 9
GO TO 7 IF' D(IO, 2) WD3
Kll = 3
GO TO 9
Kll = 4
GO TO 9
Kll = 5
KKll = KKll + 1
X(Kll, KKl1) = D(IO, 1)
Y(Kl1, KK1J.) = D(IO, 2)
110 = KKl1
GO TO 11
CKl1 = CK11 + D(IO, K11 - 5)P.5
KKl1 = KKll + 1
JIO = Kl1
TCO
C1 := co
Kl2 :: 0
Kl3 = 0
D9 = C5 / c4
Kl4 = 13
K15 = 0
13, 10, 1, 1, 4,
XIO = o.
YIO = o.
10 = 1
GO TO 16 IF KIO U 0
15, JO, 1, 1, 15,
XIO = XIO + X(10, JO)
YIO = YIO + Y(IO, JO)
10 = 10 + 1
GO TO 11,. IF 10 H 4
Tel
10 = 1
x6 = o.
95
X7 = C7
(0064)
(0065)
(0066)
18 GO TO 19 IF KIO U 0
X5 = (XIO / KIO * 01
= x6
x6
(0068)
X7 = x7 + YIO / Xl
19
= IO
IO
01 + YIO / KIO)P.5
+ XIO / X,
(0067)
(0069)
*
+ 1
(0070)
GO TO 18 IF IO R 4
(0071)
GO TO 21 IF K5 U 0
(0072)
(0073)
20, IO, 1, 1, 15,
GO TO 20 IF X(5, IO) U 0.0
(0074)
X5
(0075)
(0076)
x6 = x6 + X(5, IO) / X5
= (X(5,
X7
= X7
(0077)
20 KO = KO
(0078)
21 x6 = C1
IO)
* C1 * C1
+ Y(5, IO»P.5
+ Y(5, IO) / X5
* x6 + c6
= D9 * x6 / X7
(0079)
(0080)
(0081)
CK14
c(K14 + 51) = C1 - CK14
GO TO 29 IF AC(K14 + 51) Q 0.001
(0082)
GO TO 26 IF K15 U 1
(0083)
(0084)
c63
(0085)
(0086)
K16 = K14 / 3
(0087)
(0088)
22, IO, 8, 1, 11,
* DO
= c64
GO TO 25 IF C(K14 + 51)
*
C(K14 + 50) Q o.
ZJ, 824, L+1 M
(0090)
XIO = O.
23, IO, 0, 1, 2,
x8 = x8 + c(K14 ·;IO)
(0091)
X9 = X9 + C(K14 - IO)
IO)
(0092)
no = XlO
- IO + 51)
22
(0089)
(0093)
(0094)
23
XII
= XII
Y6
= (XlO
+ C(K14 •
* c(K14 •
IO) * c(K14
+ c(K14 - IO + 51)
- x8
*
XII / 3.) / (X9 - x8
(0095)
GO TO 55 IF Y6 U 0.0
(0096)
/ 3. - Y6 * x8 / 3·
CK14 = -Y5 / Y6
c(K14 + 51) = C1 - CK14
(0097)
(0098)
Y5
= XII
*
x8 / 3.)
(0099)
(0100)
(0101)
(0102)
(010;)
(0104)
(0105)
(0106)
(0107)
(0108)
(0109)
(0110)
(0111)
(0112)
(0113)
(0114)
(0115)
(0116)
(0117)
(0118)
(0119)
(0120)
(0121)
(0122)
(012;)
(0124)
(0125)
(0126)
(0127)
(0128)
(0129)
(0130)
(0131)
(0132)
(01;;)
24 Cl = CK1J.I.
Kl4 = IQ.4 + 1
GO TO 55 IF Kl4 v 62
GO TO 4; IF Kl; U 1
GO TO 17
25 GO TO 24 IF A(Cl - C(K14 - 2» V DO
Kl5 = 1
CK14 = (Cl + C(Kl4 - 2» * 0.5
KJ.6 = 0
GO TO 24
26 Kl6 = Kl6 + 1
GO TO 29 IF K16 V 3
27, IO, 1, 1, 5,
27 GO TO 28 IF C(K14 + 51) * C(Kl4 - IO + 51) Q o.
28 CKl4 = 0.5 * (Cl + C(K14 - IO - 1»
GO TO 24
29 GO TO ;0 IF Kl9 U 2
c8 = x6 / c4
c8 = c8 * c8
C9 = C8 / (C1 * Cl)
GO TO 31
30 C9 = x7 / C5
C9 = C9 * C9
c8 = Cl * C1 * C9
;1 K17 = 0
IO = 1
GO TO 39 IF K12 U 1
GO TO 47 IF K12 U 2
;2 ZIO = (C8 * D(IO, 1) + C9 * D(IO, 2»P.5
GO TO ;'7 IF ZIO Q 1.0
3; ZIO = 1.0
;4 Kl7 = K17 + 1
c4 = C4 - D(IO, 1)
C5 = C5 - D(IO, 2)
GO TO 35 IF JIO V 5
9r (
X(JIO, IIO)
Y(JIO, IIO)
(0134)
(0135)
(0136)
(0137)
(0138)
(0139)
(0140)
(0141)
(0142)
(0143)
(0144)
(0145)
(0146)
(0147)
(0148)
(0149)
(0150)
GO TO 36
CJro = CJIO - D(IO, JIO - 5)P.5
35
36 KJIO = KJIO - 1
GO TO 40 IF 1Cl2 U 1
GO TO 51 IF 1Cl2 U 2
37
Kl2 =
GO TO
39 GO TO
ZIO =
(0163)
(0164)
1
12
40 IF ZIO U 1.0
(C8 * D(IO, 1) + C9
* D(IO,
GO TO 33 IF ZIO W 1.0
40
IO=IO+l
GO TO 39 IF IO R KO
GO TO 41 IF Kl7 U 0
GO TO 12 IF Kl3 U 0
D9 = C5 / c4
(0153)
(0154)
(0157)
(0158)
(0159)
(0160)
(0161)
(0162)
IO=IO+l
GO TO 32 IF IO R KO
GO TO 42 IF Kl7 U 0
(0151)
(0152)
(0155)
(0156)
98
= o.
= o.
Kl4
= 13
=0
Kl5
GO TO 43
41 GO TO 53 IF Kl3 U 1
42 TCl Tc8 TC9 TKl4
Kl3 = 1
=0
= 13
Kl2 = 2
x6 = o.
X7 = C7
Kl5
Kl4
43
(0165)
(0166)
45, IO, 1, 1, 5,
GO TO 45 IF KIO U 0
(0167)
(0168)
45,
JO, 1, 1, 15,
GO TO 45 IF X(IO, JO) U 0.0
2»P.5
99
e
X5 = (C1 *Cl * X(IO, JO) + Y(IO, JO»P.5
C0169)
x6 = x6 + X(IO, JO) / X5
(0170)
X7 = X7 + Y(IO, JO) / X5
(0171)
(0172) 45 KO = KO
GO TO 21
(0173)
(0174) 47 1a.8 = 0
5*, IO, 1, 1, KO,
(0175)
Z(IO + KO) = (C8 * D(IO, 1) + C9 * D(IO, 2»P.5
(0176)
GO TO 48 IF Z(IO + KO) Q 1.0
(0177)
Z(IO + KO) ,= 1.0
(0178)
GO TO 51 IF ZIO U 1.0
(0179)
GO TO 34
(0180)
(0181) 48 GO TO 51 IF ZIO Q 1.0
Kl8 = Kl8 + 1
(0182)
c4 = c4 + D(IO, 1)
(0183)
C5 = C5 + D(IO, 2)
(01~)
GO TO 49 IF JIO V 5
(0185)
X(JIO, IIO) = D(IO, 1)
(0186)
Y(JIO, IIO) = D(IO, 2)
(0187)
GO TO 50
(0188)
(0189) 49 CJIO = CJIO + D(IO, JIO - 5)P.5
(0190) 50 KJIO = KJIO + 1
(0191) 51 ZIO = Z(IO + KO)
GO TO 52 IF Kl7 W1
(0192)
GO TO 52 IF Kl8 W1
(0193)
GO TO 53
(0194)
(0195) 52 Kl2 = 1
Kl5 = 0
(0196)
Kl4 = 13
(0197)
D9 = C5 / c4
(0198)
GO TO 43
(0199)
(0200) 53 XO = o.
YO = o.
(0201)
Cl0 = o.
(0202)
54, IO, 1, 1, KO,
(0203)
100
(0204)
GO TO 54 IF ZIO U O.
XO = XO + n(IO, 1) / ZIO - n(IO, 1)
(0205)
YO = YO + n(IO, 2) / ZIO - D(IO, 2)
(0206)
C10 = C10 + ZIO
(0207)
(0208)
Z(IO + 2 * KO) = 1. / ZIO
(0209) 54 TIO TZIO TD(IO, 1) TD(IO, 2) TZ(IO + 2
TXO TYO TC1 Tc8 TC9 TKl4- TC10
(0210)
(0211)
D5 = xo - KlO * K10 * C2
(0212)
D6 = YO - KlO * KlO * C3
(0213)
D7 = D5 / (KlO * KlO * C2)
(0214)
D8 = D6 / (Kl0 * KlO * C3)
(0215)
TD5 ... D8
GO TO 1
(0216)
(0217) 55 TK14 TC13... C113
(0218)
GO TO 1
EIID
(0219)
*
KO)
TJIO
101
BIBLIOGRAPHY
[ ) ] Bennett, Carl Allen, and Franklin, Norman L., statistical
Analysis in Chemistry and the Chemical Industry, John
Wiley and Sons, Inc.,· Wew York
£3.7
(1954) •
Hartley, H. 0., "Nonlinear programming by the simplex method,"
Econometrica, 29 (1961), 223-237.
12.7
Hartley, H. 0., and Rao, J. N. K., "Sampling with unequal probabilities and without replacement,"
Annals of Mathematical
Statistics, 33 (1962), 350-374.
L!!7
Ogus, Jack L., "1959 Annual Survey of Manufactures - sample
Design:
L27
Theory, Part I,"
unpublished.
Rand Corporation, A Million Random Digits with 100,000 Normal
Deviates, The Free Press, Glencoe, Illinois (1955).
L§l
u. S. Bureau of the Census, Annual Survey of Manufactures:
1960,
u.
(1962).
S. Government Printing Office, Washington 25, D. C.
© Copyright 2025 Paperzz