THE H1 DISTRIBUTION AS A MODEL FOR THE NUMBER OF
CHILDREN EVER BORN
by
Lester Randolph Curtin
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1188
JULY 1978
THE H1 DISTRIBUTION AS A MODEL FOR THE NUMBER OF
CHILDREN EVER BORN
by
Lester Randolph Curtin
A Dissertation submitted to the faculty of
the University of North Carolina in partial
fulfillment of the requirements for the
degree of Doctor of Philosophy in the
Department of Biostatistics
Chapel Hill
1978
Approved by:
ABSTRACT
LESTER RANDOLPH CURTIN. The H1 Distribution as a Model for the
Number of Children Ever Born. (Under the direction of
CHIRAYATH M. SUCHINDRAN and H. BRADLEY WELLS.)

The H1 distribution is examined as a possible model to
describe the distribution of the number of children ever born, or the
parity distribution, at any age for a birth cohort of women.

Initially, the H1 distribution is developed as a discrete
time contagion model and the probability function and moments are
derived. A continuous time model leads to a compound Poisson process
formulation of the H1 distribution. In both the discrete time and
continuous time formulations, the concept of parity-specific forces of
fertility is retained through the contagion function and the
conditional intensity function.

Maximum likelihood estimates and moment estimates of the
parameters of the H1 distribution are derived. Maximum likelihood
estimates from truncated data are considered as well as maximum
likelihood estimates for the parameters of a modified H1 distribution.

The modified H1 distribution is applied to the age-specific
parity distributions of two relatively high fertility countries,
Costa Rica and Guatemala, by urban or rural classification of the
populations. The results indicate that the modified H1 distribution
provides a good approximation to the parity distribution at the ages
20 to 30 and to the truncated parity distributions at ages 30 to 45.

As an example of a relatively low fertility population, United
States cohort fertility data by race is examined. The results indicate
a failure of the H1 or the modified H1 distribution to adequately
describe the parity distribution. For this data, a mixture of a
binomial distribution and an H1 distribution was found to be a good
approximation to the parity distribution.
ACKNOWLEDGMENTS
I would like to express my gratitude to my advisors,
Dr. C. M. Suchindran and Dr. H. B. Wells, for their guidance and
patience in the course of my research.
Their advice, both academic
and personal, contributed immeasurably to the completion of this
dissertation.
I would also like to thank Dr. Gary Koch, Dr. Dana Quade, and
Dr. Boone Turchi for their many contributions as members of my doctoral
committee.
For the skillful typing of the manuscript, I must thank
Joyce Hill and Janet Cochran who completed a rather difficult task in
a professional and cheerful manner.
To my wife Kathleen and my daughter Jennifer I express sincere
gratitude and affection for their encouragement and moral support.
They were, and will always be, my most important motivation.
Financial support was provided by a training grant from NICHD
and administered through the Department of Biostatistics.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES

Chapter
I. INTRODUCTION AND REVIEW OF THE LITERATURE
    1.1 Introduction
    1.2 Parity Specific Fertility
    1.3 Probability Models for the Parity Distribution
    1.4 Discrete Contagion Models
    1.5 Hypergeometric Functions
    1.6 Families of Discrete Distributions
    1.7 The H1 Distribution
    1.8 Aims of Present Research
II. THE CONTAGION MODEL
    2.1 Introduction
    2.2 The Development of the Contagion Model
    2.3 Probability Distribution Functions
    2.4 Moments
    2.5 Limiting Cases of the H1 Distribution
    2.6 Descriptive Comparisons
III. A POISSON PROCESS FORMULATION
    3.1 Introduction
    3.2 The Time-homogeneous Process
    3.3 The Conditional Intensity Function
    3.4 Mortality
IV. DATA CONSIDERATIONS AND ESTIMATION
    4.1 Introduction
    4.2 The Parity Distribution
    4.3 Estimation of Parameters
        4.3.1 Estimation from complete data
        4.3.2 Maximum likelihood estimators
        4.3.3 Method of moments
        4.3.4 Asymptotic relative efficiency
        4.3.5 Truncated data
    4.4 Modified H1 Distribution
    4.5 An Example
V. APPLICATION OF THE H1 AND MODIFIED H1 DISTRIBUTIONS
    5.1 Introduction
    5.2 The Expected Trend with Age
    5.3 High Fertility Data
        5.3.1 The maximum likelihood estimates
        5.3.2 Goodness of fit
    5.4 Low Fertility Data
        5.4.1 A binomial mixture distribution
        5.4.2 Maximum likelihood estimators
        5.4.3 Goodness of fit
VI. SUMMARY AND SUGGESTIONS FOR FUTURE WORK
    6.1 Summary
    6.2 Suggestions for Future Research
REFERENCES
LIST OF TABLES

Table
2.1  Descriptive Statistics for the H1 Distribution
2.2  The H1 Distribution
2.3  Descriptive Statistics for the H2 Distribution
2.4  The H2 Distribution (μ = 2.0)
2.5  The H2 Distribution (μ = 3.0)
2.6  Descriptive Statistics for the H3 Distribution
2.7  The H3 Distribution (μ = 2.0)
2.8  The H3 Distribution (μ = 3.0)
4.1  Asymptotic Relative Efficiency (in percent) of the Method of Moments
     Relative to Maximum Likelihood Method for the H1 Distribution
4.2  Estimates of the Parameters of the H1 and Modified H1 Distributions
4.3  Observed and Expected Truncated Populations for Women Aged 45-49,
     U.S., 1960
4.4  Observed and Expected Number (in thousands) at a given Parity for
     Women Aged 45-49, U.S., 1960
5.1  The Function Ux for the Costa Rica and Guatemala Data
5.2  Maximum Likelihood Estimates for the Parameters of the Modified H1
     Distribution; Costa Rica
5.3  Maximum Likelihood Estimates for the Parameters of the Modified H1
     Distribution; Guatemala
5.4  Comparison of the Fitted Mean and Variance of the Modified H1
     Distribution; Costa Rica and Guatemala
5.5  Observed and Expected Parity Distributions; Costa Rica
5.6  Observed and Expected Parity Distributions; Guatemala
5.7  Indices for the Goodness of Fit of the Modified H1 Distribution;
     Costa Rica and Guatemala
5.8  Observed and Expected Truncated Proportions (per 1,000) for Selected
     Ages; High Fertility Data
5.9  The Function Ux for Selected Ages, United States Cohort Data
5.10 Comparison of Various Binomial Mixtures for the Parity Distribution
     at Age 36, U.S. (white) Data
5.11 Maximum Likelihood Estimates for the Parameters of the H1
     Distribution, U.S. 1920 Birth Cohort Data
5.12 Maximum Likelihood Estimates for the Parameters of the Modified H1
     Distribution, U.S. 1920 Birth Cohort Data
5.13 Maximum Likelihood Estimates for the Parameters of the Mixture,
     U.S. 1920 Birth Cohort Data, with n = 5
5.14 Comparison of the Fitted Mean and Variance for U.S. Cohort Data
5.15 The Observed and Expected Parity Distribution for Selected Ages,
     U.S. Cohort Data
5.16 Indices for the Goodness of Fit of the Modified H1 Distribution and
     the Mixture (5.2), U.S. Cohort Data
5.17 Observed and Expected Truncated Proportions (per 1,000) for U.S.
     Data, Age 47
CHAPTER I
INTRODUCTION AND REVIEW OF THE LITERATURE
1.1 Introduction
The development of demographic techniques for the study of
fertility has been hindered by the almost overwhelming complexity of
the subject. Fertility consists of the multiple occurrence of an
event, a live birth, which depends on the characteristics of the
couple rather than a single individual. Each occurrence of the event
is subject to the interaction of biological, social, economic, and
psychological factors.
Another concern is the difficulty of direct observation on
the distributions of the underlying variables of the reproductive
process. Important underlying variables of fertility which can be
observed are the age, parity, and marital status of a woman. This
study will be concerned with the investigation of a probability
distribution that can be related to the parity distribution, or the
distribution of the number of children ever born, conditional on the
age of the woman. The probability distribution will then be a model,
or representation, of cumulative cohort fertility and can be used to
analyze trends in cohort data. A good approximation of the parity
distribution would have applications to projecting populations, to
simulating fertility histories, and, possibly, to adjusting defective
data on fertility.
Simple distributions, such as the Poisson or the binomial, do
not seem to provide a satisfactory fit to available data on the number
of children ever born in a specified time interval. Allowance
for sterility, nonsusceptible periods, and heterogeneity in the
population has led to modified probability distributions which are
too cumbersome to be of much practical use. We will consider one
particular compound distribution as a description of the parity
distribution conditional on age. Details of the research will follow
after a brief review of the literature.
1.2 Parity Specific Fertility
Parity is defined as the number of live births that a woman
has borne. A birth to a woman of parity m is said to be an (m + 1)-th
order birth. That these definitions are consistent with the occurrence
of multiple births (e.g. twins, triplets) can be seen by the
following example. Suppose a woman of parity zero has twins. The
first twin born is a first order birth and the second twin born is a
second order birth. The woman's parity is then two.
The importance of parity as an underlying variable of fertility
has been emphasized recently by Ryder (1965). Earlier, Quensel
(1939) noted that a considerable proportion of the variation in
fertility that is attributed to age is apparently due to other factors
such as parity and the duration of marriage. Attempts to adjust
for marriage and parity in fertility analysis include Wicksell (1931),
who introduced a mathematical formulation of marriage, as well as
Quensel (1939) and Hajnal (1950), who include an adjustment for parity.
Murphy (1965) develops a population model incorporating marriage and
parity that is based on population projection expressed as a matrix
operation. This leads to a stable population differentiated by age,
parity, and marital status.
Just as the crude birth rate ignores the age distribution,
fertility measures such as the total fertility rate ignore the parity
distribution. Lotka and Spiegelman (1940), Whelpton (1948), and
Heuser et al. (1970) have found that the changing composition of
births by order explains a large portion of changes in fertility among
American women. However, these studies are based on the analysis
of fertility by birth order and, in general, disregard the parity
distribution of the population.
Some parity adjusted measures have been computed. Whelpton
(1946) computes the net reproduction rate adjusted for age, parity,
fecundity, and marriage, although Karmel (1950) indicates that some of
these adjustments, the adjustment for marriage in particular, are
somewhat arbitrary. Whelpton (1954) computes parity adjusted cumulative
birth rates and Park (1967, 1976) computes a period fertility
index that is adjusted for age and parity. Oechsli (1975) computes
intrinsic rates adjusted for age, age and parity, age and marriage,
and age, parity, and marriage. He concludes, as does Karmel (1950),
that adjustment for parity without adjustment for marriage may be
misleading.
1.3 Probability Models for the Parity Distribution
Obviously, it is desirable to have a parametric representation
of the parity distribution. A first step towards this goal is the
development of a probability model for parity specific fertility.
Reviews of probability models of reproduction include Joshi (1965),
Sheps (1965, 1971) and Sheps, Menken, and Radick (1969). Probability
models of reproduction have been concerned with the number, sequence,
and timing of births to couples. These models are either deterministic
or stochastic in nature. Deterministic models describe average
relationships and results. Stochastic models can be used to gain
insight into the natural variations of a biological system. The uses
of such models include:
(i) the interpretation of empirical data;
(ii) the development of appropriate methods of measurement,
that is, the choice, interpretation, and interrelationships of
fertility indices; and
(iii) the identification of the nature of the interactions
of various factors.
A classification system for reproductive models can be developed,
as in Sheps et al. (1969), by considering the treatment of
time as continuous or discrete, by the population of interest being a
cohort or cross-sectional (period) population, and by the nature of
assumptions on the biological variables. The basic biological variables
of interest are as follows.
1. Fecundability. This is generally defined (e.g. Gini,
   1924) as the probability of conception per unit time for
   a woman who is susceptible to the risk of conception. It
   depends on the occurrence of ovulation during a particular
   cycle, on the frequency and timing of intercourse as
   related to ovulation (Glasser and Lachenbruch, 1968 and
   Lachenbruch, 1967), the use of contraceptives, characteristics
   of the semen, and other characteristics of the
   woman and her spouse including age and health status,
   which may act on fecundability through their effects on
   the preceding factors (Hartman et al., 1962). Fecundability
   can be considered homogeneous both over time (age)
   and among women. Heterogeneity can be introduced by
   considering the variation between women and the variation
   within each woman.
2. Outcomes of Pregnancy. A conception can end in a live
   birth or a fetal loss (spontaneous or induced abortion).
   The probability that a conception ends in a fetal loss
   may depend on maternal age, rank order of pregnancy, the
   interval between conceptions, and the health of the
   mother (Nesbitt, 1957). An individual's genetic
   characteristics may also affect this probability (James, 1961).
3. Infecundable or Non-susceptible Periods. The duration of
   nonsusceptible periods, consisting of the length of the
   pregnancy period and the length of the infecundable
   period following the pregnancy outcome, can be considered
   fixed or random and may depend on the outcome of the
   pregnancy and other considerations such as the effect of
   lactation (Tietze, 1961).
Probability models related to the parity distribution include
Dandekar (1955), who suggests a modified binomial for the number of
children born to a specified age cohort of women in a fixed time period.
In deriving the distribution, Dandekar makes the following assumptions:
(1) all women are susceptible to the risk of pregnancy and
all conceptions end in a live birth, (2) fecundability is constant
over time and does not vary among women, and (3) the length of the
nonsusceptible period following a live birth is constant. The modified
binomial that Dandekar derives is an interrupted binomial
distribution (Johnson and Kotz, 1969) and no simple form is known for
the mean, variance, or probability generating function of this
distribution (Patil and Joshi, 1968). Dandekar also obtains an
interrupted Poisson distribution as a limiting form of his modified
binomial distribution. Again, the mean, variance, and probability
generating function of this interrupted Poisson distribution are not
known (Patil and Joshi, 1968). Dandekar generates his model by
considering an abrupt sequence of trials, that is, he considers some
women to be in a non-susceptible state (following a previous live
birth) at the beginning of the observation period. Estimation of
parameters in these models is by the method of maximum likelihood
using successive approximations (Rao, 1952). Dandekar's models do
not provide an adequate fit to his data.
Basu (1955) formulates Dandekar's model as a stochastic process
and Dharmadhikari (1964) provides a generalized stochastic model
with Dandekar's model being a special case. Dharmadhikari points out
that Dandekar implicitly assumes that the successive waiting times
between live births, except for the first birth, are independent and
identically distributed random variables which have an exponential
distribution with the origin shifted by the (constant) length of the
nonsusceptible period. This is then generalized by considering an
arbitrary distribution of waiting times.
Singh (1963, 1964) derives a modified binomial distribution
using an analogy between Neyman's (1949a) fishing model and the
reproductive process. Singh adds one assumption to the set of
Dandekar's, that is, (4) a woman may not be exposed to risk of
conception at any time during the observation period. Singh (1963)
allows for heterogeneity in the model by assuming that the monthly
probability of conception varies according to a beta distribution
among women. Parameter estimates are obtained by the minimum chi-square
procedure of Neyman (1949b) and the model provides a good fit
to the given data. Pathak (1966) extends Singh's model by allowing
for an abrupt sequence of trials. Singh and Bhattacharya (1970)
generalize Singh's models by allowing more than one kind of pregnancy
outcome. Singh (1968) has developed a model similar to Singh (1963)
by assuming that the number of conceptions follows a Poisson
distribution.
As an example of these types of modified distributions, let
X be a random variable corresponding to the number of live births in
a fixed time interval (0, T), let $\lambda$ be the monthly probability of
a live birth conception, let h be the constant length of the
nonsusceptible period, and let $1 - \alpha$ $(0 < \alpha < 1)$ be the
probability of never being fecund. Then Singh (1968) obtains

\[ P[X = 0] = 1 - \alpha + \alpha e^{-\lambda T}, \]

\[ P[X = i] = \alpha \left[ \sum_{m=0}^{i} \frac{e^{-\lambda (T - ih)} [\lambda (T - ih)]^{m}}{m!} - \sum_{m=0}^{i-1} \frac{e^{-\lambda (T - ih + h)} [\lambda (T - ih + h)]^{m}}{m!} \right] \quad \text{for } i = 1, 2, \ldots, n-1, \tag{1.1} \]

and

\[ P[X = n] = 1 - P[X \le n - 1]. \]

Here n is equal to the maximum number of live births observed in
the interval (0, T).
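As a numerical illustration, equation (1.1) can be evaluated directly. The sketch below is not taken from Singh (1968). It assumes that the two sums in (1.1) are Poisson cumulative probabilities with means λ(T − ih) and λ(T − ih + h), and that T − (n − 1)h is positive; the function names and the parameter values in the usage lines are hypothetical.

```python
import math


def poisson_cdf(k, mean):
    """P[N <= k] for N ~ Poisson(mean); returns 0.0 when k < 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mean) * mean**m / math.factorial(m) for m in range(k + 1))


def singh_pmf(lam, T, h, alpha, n):
    """Return [P[X = 0], ..., P[X = n]] under the reading of (1.1) described above.

    lam   : conception rate per month
    T     : length of the observation interval (months)
    h     : constant length of the nonsusceptible period (months)
    alpha : probability of being fecund
    n     : maximum number of live births (n >= 1)
    """
    if n < 1 or T - (n - 1) * h <= 0:
        raise ValueError("need n >= 1 and T - (n - 1) * h > 0")
    probs = [1.0 - alpha + alpha * math.exp(-lam * T)]
    for i in range(1, n):
        first = poisson_cdf(i, lam * (T - i * h))
        second = poisson_cdf(i - 1, lam * (T - i * h + h))
        probs.append(alpha * (first - second))
    probs.append(1.0 - sum(probs))  # P[X = n] = 1 - P[X <= n - 1]
    return probs


# Hypothetical values: 60 months of observation, 12-month nonsusceptible period.
probs = singh_pmf(lam=0.05, T=60.0, h=12.0, alpha=0.9, n=4)
print([round(p, 4) for p in probs])
```

Writing the sums as Poisson cumulative probabilities keeps the code close to the printed formula and makes the final term the complement of the earlier ones, as in the last line of (1.1).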
Other models for the parity distribution include James (1963),
who derives a model similar to Singh's (1963) model but assumes
that a conception is not recorded until the end of the nonsusceptible
period. Brass (1958) assumes that the number of children born in a
fixed time period follows a Poisson distribution. He then assumes
that the parameter in the Poisson distribution varies according to a
gamma distribution and obtains a negative binomial distribution.
Brass then considers only fecund women and obtains a negative binomial
(left-) truncated at zero. By introducing heterogeneity into
Dandekar's (1955) modified Poisson distribution, Brass obtains a
further modification of the negative binomial distribution. Biswas
(1973) extends Brass' results and derives what appears to be a mixture
of negative binomial distributions. Shah (1970) considers the
conditional distribution of being at parity m at age x as a
mixture of a Poisson distribution and a gamma distribution. This can
be represented as

\[ f(m; \lambda, \alpha, p \mid x) = p \left[ \frac{e^{-\lambda} \lambda^{m}}{m!} \right] + (1 - p)\, \alpha (1 - \alpha)^{m}. \tag{1.2} \]
Chandrasekaran and George (1962) fit a logistic curve to data on the
number of children ever born.
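The step attributed to Brass (1958) above, a Poisson count whose mean follows a gamma distribution giving a negative binomial, can be checked numerically. The sketch below is an illustration only: the shape and scale parameterization of the gamma distribution, the midpoint-rule integration, and all function names are assumptions made for this example and are not taken from Brass.

```python
import math


def neg_binomial_pmf(x, k, theta):
    """Negative binomial obtained from Poisson(lam) with lam ~ Gamma(shape=k, scale=theta)."""
    p = 1.0 / (1.0 + theta)
    log_coef = math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
    return math.exp(log_coef + k * math.log(p) + x * math.log(1.0 - p))


def mixed_poisson_pmf(x, k, theta, steps=20000, upper=60.0):
    """Midpoint-rule integral of Poisson(x; lam) * Gamma(lam; k, theta) over 0 < lam < upper."""
    width = upper / steps
    total = 0.0
    for j in range(steps):
        lam = (j + 0.5) * width
        log_pois = -lam + x * math.log(lam) - math.lgamma(x + 1)
        log_gamma = ((k - 1) * math.log(lam) - lam / theta
                     - math.lgamma(k) - k * math.log(theta))
        total += math.exp(log_pois + log_gamma) * width
    return total


def zero_truncated_pmf(x, k, theta):
    """Negative binomial truncated at zero (fecund women only), defined for x >= 1."""
    return neg_binomial_pmf(x, k, theta) / (1.0 - neg_binomial_pmf(0, k, theta))


# Hypothetical parameter values: shape 2, scale 1.5 (mean of 3 births).
for x in range(4):
    print(x, neg_binomial_pmf(x, 2.0, 1.5), mixed_poisson_pmf(x, 2.0, 1.5))
```

The two columns printed should agree to several decimal places, which is the equivalence used in the text; the truncated version simply rescales the probabilities for one or more births.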
Sheps et al. (1969) show that the models of Dandekar (1955),
Basu (1955), and Singh (1963, 1964) can be developed under the
assumptions of a renewal process, as can the model of Henry (1953)
who, under a set of assumptions, derives the expected number of
births of a specified order in a time interval $(t, t + \delta t)$. In
general, renewal theory models require a sufficiently long reproductive
period and a homogeneous population, that is, the model
parameters do not vary over time or among women. Other analytical
models by Henry (1957, 1961a, b) for the number of births and
conceptions, in discrete and continuous time, allow for both homogeneous
and heterogeneous populations. Perrin and Sheps (1964) and Sheps
(1967) develop analytical fertility models based on the theory of
semi-Markov processes and Markov renewal theory (Pyke 1961a, b).
These models can be used to study the effects of underlying factors
on the reproductive process (e.g. Sheps 1964, Sheps and Perrin 1963,
Potter 1969, 1970, and Potter et al., 1970). Sheps and Menken (1971)
apply computer simulation to their time dependent model to study
short term effects of changes in reproductive behavior.
Other stochastic models that lead to theoretical representations
of the parity distribution have been developed by Chiang (1971),
Hoem (1968, 1970), and Nour (1972). For Hoem (1968), an individual's
stay in parity m has an exponential distribution and occurrence/exposure
rates are used as estimates of the forces of transition. A
problem with occurrence/exposure rates is in defining the population
at risk. This causes the rates to be somewhat insensitive to short
term changes in fertility. Chiang (1968) assumes an underlying
multinomial distribution which yields estimates of forces of transition
in the same form as occurrence/exposure rates. Chiang's (1971)
model is based on an age-dependent process. Nour (1972) presents a
model similar to Hoem's and concludes that (1) the models based on
the renewal process are inadequate and (2) the two most important
underlying variables of fertility are the age at first marriage, which
determines the population at risk, and the conditional probability
that a woman who is fecund at age x will conceive in the age interval
$(x, x + \Delta x)$, which determines the level and timing of births.
1.4 Discrete Contagion Models
One objective of this study is to select a parametric
representation of the parity distribution conditional on age that
can be used in the analysis of data, population projection, and
simulation. Some of the probability models discussed previously yield
representations of the parity distribution that are too theoretical
to be of much practical use in the analysis of data or for population
projection. Other fertility models rely on computer simulation
for results and again they are of little use in analyzing data.
The models that do yield parametric representations of the parity
distribution tend to be either too simplistic or too cumbersome and
difficult to use.
The selection of a model to describe the parity distribution
requires some sort of systematic procedure. Ord and Patil (1972)
suggest a scheme for model selection as:
(i) formulate an initial model, say, a class or family of
distributions,
(ii) collect the data,
(iii) select a particular distribution,
(iv) test the goodness of fit of that distribution, and
(v) formulate hypotheses to ascertain the underlying chance
mechanisms which generate the distribution.
A reasonable choice of an initial model for biological
situations is some contagious distribution (Neyman, 1939). A
contagious or compound distribution is derived from a probability
distribution function dependent on a set of given parameters by
regarding those parameters as random variables having specified
probability distributions. For example, a negative binomial
distribution is a contagious distribution derived from a Poisson
distribution whose parameter has a gamma distribution.
The contagion function can be defined in the following manner
(Kemp and Kemp, 1974). Suppose that there are a finite number n of
opportunities for an event to occur; then the probability that at
least one more event will occur given that k events have already
occurred is called the contagion function and it is equal to (n - k)
times the probability of the event occurring at the (k + 1)-st
opportunity given that the event has already occurred at each of the
first k opportunities. Frechet (1939, 1943) has shown that the
contagion function is equal to the ratio of the (k + 1)-th factorial
moment to the k-th factorial moment.
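A small check of the factorial moment ratio may help here. For n independent opportunities with a common probability p (a binomial count, which is an assumption chosen for this example and is the no-contagion case), the probability of the event at the (k + 1)-st opportunity given the first k is simply p, so the contagion function should equal (n − k)p. The sketch below computes the factorial moments directly from the binomial probabilities; the function names are hypothetical.

```python
import math


def binomial_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)


def factorial_moment(r, n, p):
    """E[X(X-1)...(X-r+1)] for X ~ Binomial(n, p), summed directly from the pmf."""
    total = 0.0
    for x in range(n + 1):
        falling = math.prod(x - j for j in range(r))  # equals 1 when r = 0
        total += falling * binomial_pmf(x, n, p)
    return total


def contagion_function(k, n, p):
    """Ratio of the (k+1)-th to the k-th factorial moment, for 0 <= k < n."""
    return factorial_moment(k + 1, n, p) / factorial_moment(k, n, p)


# Hypothetical values: 10 opportunities, probability 0.3 at each.
for k in range(4):
    print(k, contagion_function(k, 10, 0.3), (10 - k) * 0.3)
```

The two printed columns should coincide. A distribution with true or apparent contagion would show a ratio that departs from this linear decline in k.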
Feller (1943) states that a process may be affected by two
types of contagion. True contagion is where each "favorable" event
increases (or decreases) the probability of future favorable events.
Apparent contagion occurs when the population is not homogeneous. It
is interesting to note that the same distribution can be derived from
both types of contagion. In the development of a distribution for
rare events, a form of the negative binomial distribution was derived
by Greenwood and Yule (1920), who assumed apparent contagion. The
same distribution was obtained with an assumption of true contagion
by Eggenberger and Polya (1923).
The variety and number of possible contagious distributions
leads to the consideration of families of discrete distributions.
Many families of discrete distributions can be defined by the form of
their probability generating functions (p.g.f's) which can often be
expressed as hypergeometric functions.
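1.5 Hypergeometric Functions
The hypergeometric function, denoted ${}_2F_1[\alpha, \beta; \gamma; x]$, is a
solution of the differential equation

\[ x(1 - x) \frac{d^{2}F}{dx^{2}} + \{\gamma - (\alpha + \beta + 1) x\} \frac{dF}{dx} - \alpha \beta F = 0. \tag{1.3} \]

Here

\[ {}_2F_1[\alpha, \beta; \gamma; x] = \sum_{n=0}^{\infty} \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)} \frac{\Gamma(\beta + n)}{\Gamma(\beta)} \frac{\Gamma(\gamma)}{\Gamma(\gamma + n)} \frac{x^{n}}{n!} \tag{1.4} \]

which converges for $|x| < 1$. The integral representation of the
hypergeometric function is, for $\gamma - \beta > 0$,

\[ {}_2F_1[\alpha, \beta; \gamma; x] = \frac{\Gamma(\gamma)}{\Gamma(\beta) \Gamma(\gamma - \beta)} \int_{0}^{1} t^{\beta - 1} (1 - t)^{\gamma - \beta - 1} (1 - tx)^{-\alpha} \, dt. \tag{1.5} \]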
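The differential equation

\[ x \frac{d^{2}F}{dx^{2}} + (\gamma - x) \frac{dF}{dx} - \alpha F = 0 \tag{1.6} \]

has as one solution the function

\[ {}_1F_1[\alpha; \gamma; x] = \sum_{n=0}^{\infty} \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)} \frac{\Gamma(\gamma)}{\Gamma(\gamma + n)} \frac{x^{n}}{n!} \tag{1.7} \]

which also converges for $|x| < 1$. The function ${}_1F_1[\alpha; \gamma; x]$ is
called the confluent hypergeometric function and has the integral
representation

\[ {}_1F_1[\alpha; \gamma; x] = \frac{\Gamma(\gamma)}{\Gamma(\alpha) \Gamma(\gamma - \alpha)} \int_{0}^{1} e^{tx} (1 - t)^{\gamma - \alpha - 1} t^{\alpha - 1} \, dt. \tag{1.8} \]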
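The derivatives of the hypergeometric and confluent hypergeometric
functions are readily available as

\[ \frac{d}{dx} \, {}_2F_1[\alpha, \beta; \gamma; x] = \frac{\alpha \beta}{\gamma} \, {}_2F_1[\alpha + 1, \beta + 1; \gamma + 1; x] \tag{1.9} \]

\[ \frac{d^{n}}{dx^{n}} \, {}_2F_1[\alpha, \beta; \gamma; x] = \frac{(\alpha)_n (\beta)_n}{(\gamma)_n} \, {}_2F_1[\alpha + n, \beta + n; \gamma + n; x] \tag{1.10} \]

\[ \frac{d}{dx} \, {}_1F_1[\alpha; \gamma; x] = \frac{\alpha}{\gamma} \, {}_1F_1[\alpha + 1; \gamma + 1; x] \tag{1.11} \]

\[ \frac{d^{n}}{dx^{n}} \, {}_1F_1[\alpha; \gamma; x] = \frac{(\alpha)_n}{(\gamma)_n} \, {}_1F_1[\alpha + n; \gamma + n; x] \tag{1.12} \]

where $(\alpha)_n = \alpha (\alpha + 1) \cdots (\alpha + n - 1) = \Gamma(\alpha + n) / \Gamma(\alpha)$.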
Another useful relationship involving the confluent hypergeometric
function is known as Kummer's formula and is stated as

\[ {}_1F_1[\alpha; \gamma; -x] = e^{-x} \, {}_1F_1[\gamma - \alpha; \gamma; x]. \tag{1.13} \]
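Kummer's formula and the first derivative of the confluent hypergeometric function can be verified numerically from the series (1.7). The sketch below is only an illustration: the fixed number of terms in the partial sum, the central-difference check, and the argument values are assumptions chosen for this example.

```python
import math


def hyp1f1(a, c, x, terms=200):
    """Partial sum of the series (1.7) for 1F1[a; c; x]."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive series terms: (a + n) / (c + n) * x / (n + 1).
        term *= (a + n) / (c + n) * x / (n + 1)
    return total


# Hypothetical argument values.
a, c, x = 1.5, 4.0, 3.0

# Kummer's formula (1.13): both sides should agree.
print(hyp1f1(a, c, -x), math.exp(-x) * hyp1f1(c - a, c, x))

# Derivative formula (1.11), compared with a central difference.
h = 1e-5
numeric = (hyp1f1(a, c, x + h) - hyp1f1(a, c, x - h)) / (2 * h)
print(numeric, (a / c) * hyp1f1(a + 1, c + 1, x))
```

Kummer's formula is also of practical use: it replaces an alternating series at a negative argument with a series of positive terms, which is better behaved in floating point arithmetic.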
The hypergeometric and confluent hypergeometric functions are special
cases of the general hypergeometric function

\[ {}_pF_q[a_1, \ldots, a_p; b_1, \ldots, b_q; x] = \sum_{n=0}^{\infty} \frac{(a_1)_n \cdots (a_p)_n}{(b_1)_n \cdots (b_q)_n} \frac{x^{n}}{n!}. \tag{1.14} \]

A more concise account of hypergeometric functions can be found in
either Erdelyi (1953) or Slater (1965). Philipson (1960) relates the
confluent hypergeometric function to compound Poisson processes.
Kemp (1968) looks at the function

\[ G(s) = \frac{{}_pF_q[(a); (b); \lambda s]}{{}_pF_q[(a); (b); \lambda]} \tag{1.15} \]

and establishes exhaustive conditions for G(s) to be a valid pgf.
A generalized family of discrete distributions having the pgf
G(s) is then established by Kemp.
1.6 Families of Discrete Distributions
Recall that the Pearson system (Elderton and Johnson, 1969)
for continuous distributions is based on the differential equation

\[ \frac{1}{f} \frac{df}{dx} = \frac{a - x}{b_0 + b_1 x + b_2 x^{2}}. \tag{1.16} \]

Ord (1967a) uses a similar difference equation to define families of
discrete distributions. His equation is

\[ f(x) - f(x - 1) = \frac{(a - x) f(x - 1)}{(a + b_0) + (b_1 - 1) x + b_2 x (x - 1)}. \tag{1.17} \]

Using this difference equation, Ord (1967b) suggests a rough graphical
procedure for examining data. He defines the ratio

\[ u(x) = x f(x) / f(x - 1) \tag{1.18} \]

and plots u(x) versus x for the Poisson, binomial, negative binomial,
beta-Pascal, and beta-binomial. At times smoothing is required and the
function v(x) = (1/2)(u(x) + u(x + 1)) is graphed. A summary of the use
of the difference equation and the graphical procedure can be found in
Ord (1972).
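The behavior of the ratio u(x) for the simplest cases can be computed directly. For the Poisson distribution u(x) is constant in x, for the binomial it declines linearly, and for the negative binomial it rises linearly, which is what makes a plot of u(x) against x a useful rough diagnostic. The sketch below illustrates this with exact probabilities rather than observed frequencies; the function names and parameter values are hypothetical.

```python
import math


def ord_ratio(pmf, x):
    """u(x) = x f(x) / f(x - 1), as in (1.18), for x >= 1."""
    return x * pmf(x) / pmf(x - 1)


def smoothed_ratio(pmf, x):
    """v(x) = (1/2)(u(x) + u(x + 1))."""
    return 0.5 * (ord_ratio(pmf, x) + ord_ratio(pmf, x + 1))


def poisson_pmf(lam):
    return lambda x: math.exp(-lam) * lam**x / math.factorial(x)


def binomial_pmf(n, p):
    return lambda x: math.comb(n, x) * p**x * (1 - p)**(n - x)


def neg_binomial_pmf(k, q):
    # Integer k assumed here so that math.comb applies.
    return lambda x: math.comb(k + x - 1, x) * (1 - q)**k * q**x


# Hypothetical parameter values.
for x in range(1, 6):
    print(x,
          ord_ratio(poisson_pmf(2.5), x),         # constant: lam
          ord_ratio(binomial_pmf(8, 0.3), x),     # (n - x + 1) p / (1 - p)
          ord_ratio(neg_binomial_pmf(3, 0.4), x)) # (k + x - 1) q
```

With observed data, f(x) would be replaced by the relative frequency at each parity, and the smoothed version v(x) would reduce the effect of sampling fluctuation.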
The ratio f(x + 1)/f(x) is quite similar to the difference
equation and can be used to define families of discrete distributions.
Gurland and Tripathi (1975) discuss several forms of this ratio and
the resulting probability generating functions. For example, the
Katz (1963) family is defined by

\[ f(x + 1) / f(x) = (\alpha + \beta x) / (x + 1) \tag{1.19} \]

which has the pgf

\[ G(s) = {}_1F_0[\alpha / \beta; \beta s] \, / \, {}_1F_0[\alpha / \beta; \beta]. \tag{1.20} \]

An extension of this is the ratio

\[ f(x + 1) / f(x) = (\alpha + \beta x) / (x + \lambda) \tag{1.21} \]

which has the pgf

\[ G(s) = {}_2F_1[\alpha / \beta, 1; \lambda; \beta s] \, / \, {}_2F_1[\alpha / \beta, 1; \lambda; \beta]. \tag{1.22} \]

Taking the limit as $\beta$ goes to zero we obtain the CB family (Crow
and Bardwell, 1963) which is defined by the ratio

\[ f(x + 1) / f(x) = \alpha / (\lambda + x) \tag{1.23} \]

and has the pgf

\[ G(s) = {}_1F_1[1; \lambda; \alpha s] \, / \, {}_1F_1[1; \lambda; \alpha]. \tag{1.24} \]

Gurland and Tripathi consider two further extensions which they call
the E1CB family and the E2CB family. The E1CB family is defined by

\[ \frac{f(x + 1)}{f(x)} = \frac{\alpha (\gamma + x)}{(x + \lambda)(x + 1)} \tag{1.25} \]

and has the pgf

\[ G(s) = {}_1F_1[\gamma; \lambda; \alpha s] \, / \, {}_1F_1[\gamma; \lambda; \alpha]. \tag{1.26} \]
This reduces to the CB family for $\gamma = 1$. The E2CB family is
defined by the second order difference equation
llo (x + 2) (x + 1) f(x + 2) +Cl l
(x + A - ClO)f(X + 1) +Cl
2
l
(x + y)f(x) =
o.
(1. 27)
This has the pgf
(1.28)
The Katz family, the CB family, and the E1CB family all
have pgf's of the form

\[ G(s) = \frac{{}_pF_q[(a); (b); \lambda s]}{{}_pF_q[(a); (b); \lambda]}. \tag{1.29} \]

Kemp (1968) calls these the generalized hypergeometric probability
(ghp) distributions. Dacey (1972) has investigated this general family
of ghp distributions in considerable detail and has constructed a
method for identifying various members of this family. Kemp and Kemp
(1974) consider particular cases of the family with pgf

\[ G(s) = {}_pF_q[(a); (b); \lambda (s - 1)]. \tag{1.30} \]

They call these the generalized hypergeometric factorial moment (ghf)
distributions.
1.7 The H1 Distribution
One particular ghf distribution is called the H1 distribution
after Katti (1966). This is a compound distribution formed by
considering a binomial distribution with parameters n and p where n
has a Poisson distribution with parameter $\lambda$ and p has a beta
distribution with parameters $\alpha$ and $\beta$. This is equivalent to a
compound distribution formed by a Poisson distribution with parameter
$\lambda p$ where p has a beta distribution. The pgf of the H1 distribution
is $G(s) = {}_1F_1[\alpha; \alpha + \beta; \lambda (s - 1)]$. Kemp and Kemp (1974) give the
following results for the factorial moments and probabilities.

\[ f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)} \frac{\Gamma(\alpha + x)}{\Gamma(\alpha + \beta + x)} \frac{\lambda^{x}}{x!} \, {}_1F_1[\alpha + x; \alpha + \beta + x; -\lambda] \tag{1.31} \]

\[ \mu'_{(r)} = \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta + r)} \lambda^{r} = \frac{\lambda (\alpha + r - 1)}{\alpha + \beta + r - 1} \, \mu'_{(r-1)} \tag{1.32} \]

A second order difference equation, similar to that for the E2CB
family, that will be of some use is

\[ (x + 2)(x + 1) f(x + 2) = (x + 1)(\lambda + \alpha + \beta + x) f(x + 1) - \lambda (\alpha + x) f(x). \tag{1.33} \]

The negative binomial distribution is a limiting form of the H1
distribution and the case $\alpha = \beta = 1$ reduces to Feller's
Poisson-rectangular distribution. Gurland (1958) uses the H1 distribution
to generalize a Poisson distribution, thus extending the work of
Neyman (1939) and Beall and Rescia (1953). The density of the
H1 distribution involves the confluent hypergeometric function.
Therefore, maximum likelihood estimation of the parameters $\lambda$,
$\alpha$, and $\beta$ is extremely difficult.
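Although the probabilities involve the confluent hypergeometric function, they are straightforward to evaluate numerically. The sketch below computes the H1 probabilities from (1.31), applying Kummer's formula (1.13) so that the series has only positive terms, and then checks the recurrence (1.33) and the mean implied by (1.32). It is an illustration under stated assumptions, not the estimation procedure of this study: the series is truncated at a fixed number of terms, and the function names and parameter values are hypothetical.

```python
import math


def hyp1f1(a, c, x, terms=300):
    """Partial sum of the series (1.7) for 1F1[a; c; x]."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (c + n) * x / (n + 1)
    return total


def h1_pmf(x, lam, alpha, beta):
    """Equation (1.31), using 1F1[a+x; a+b+x; -lam] = exp(-lam) 1F1[b; a+b+x; lam]."""
    log_coef = (math.lgamma(alpha + beta) - math.lgamma(alpha)
                + math.lgamma(alpha + x) - math.lgamma(alpha + beta + x)
                + x * math.log(lam) - math.lgamma(x + 1))
    return math.exp(log_coef - lam) * hyp1f1(beta, alpha + beta + x, lam)


def h1_factorial_moment(r, lam, alpha, beta):
    """Equation (1.32): the r-th factorial moment."""
    log_ratio = (math.lgamma(alpha + r) - math.lgamma(alpha)
                 + math.lgamma(alpha + beta) - math.lgamma(alpha + beta + r))
    return math.exp(log_ratio) * lam**r


# Hypothetical parameter values.
lam, alpha, beta = 4.0, 2.0, 3.0
probs = [h1_pmf(x, lam, alpha, beta) for x in range(80)]
print(sum(probs))                                   # should be close to 1
print(sum(x * p for x, p in enumerate(probs)),      # mean from the probabilities
      h1_factorial_moment(1, lam, alpha, beta))     # mean from (1.32)

# Recurrence (1.33) at x = 0: the two sides should agree.
x = 0
print((x + 2) * (x + 1) * probs[x + 2],
      (x + 1) * (lam + alpha + beta + x) * probs[x + 1] - lam * (alpha + x) * probs[x])
```

Once f(0) and f(1) are available, the recurrence (1.33) gives the remaining probabilities without further evaluations of the hypergeometric series, which is one reason it is described in the text as being of some use.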
1.8 Aims of Present Research
Analytical techniques for dealing with fertility largely ignore
the parity distribution. For example, most fertility indices
are functions of the age-specific fertility rates, but the age-specific
fertility rates are just weighted averages of the corresponding
age-parity-specific fertility rates where the weights are the
parity distribution for that age (or age group). Being able to predict
(or simulate) changes in the parity distribution will enable one
to predict (or simulate) changes in the age-specific fertility rates
and in various indices of fertility. This motivates the desire for a
good probability model of the parity distribution at all ages.
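The weighted-average relationship described above can be written out in a few lines. The sketch below is illustrative only: the parity distribution and the age-parity-specific rates are invented numbers for a single hypothetical age group, and the function name is not from the source.

```python
def age_specific_rate(parity_dist, parity_rates):
    """Age-specific fertility rate as a weighted average of age-parity-specific
    rates, with the parity distribution at that age as the weights."""
    if len(parity_dist) != len(parity_rates):
        raise ValueError("one rate is needed for each parity")
    if abs(sum(parity_dist) - 1.0) > 1e-9:
        raise ValueError("parity distribution must sum to 1")
    return sum(w * r for w, r in zip(parity_dist, parity_rates))


# Invented values for parities 0, 1, 2, and 3 or more.
parity_dist = [0.2, 0.3, 0.3, 0.2]
parity_rates = [0.10, 0.15, 0.12, 0.08]
print(age_specific_rate(parity_dist, parity_rates))  # about 0.117
```

Holding the parity-specific rates fixed and changing only the weights changes the age-specific rate, which is why a model of the parity distribution carries over to the usual fertility indices.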
We desire a relatively simple and workable model (probability
distribution) to describe the parity distribution at any time in the
reproductive age span. This study is concerned with the H1
distribution as such a model. We are interested in the development, in
a deterministic fashion, of the H1 distribution as a contagion
model to describe the parity distribution conditional on age. Various
aspects of the H1 and related (the H2 and H3) distributions will
be explored in Chapter II.
In Chapter III, a continuous time, discrete space random process, e.g. the compound Poisson process, also gives rise to the $H_1$ distribution as a representation of the parity distribution. From this formulation, the distribution of waiting times and inter-arrival times can be determined. It is important to note that in this formulation the intensity function, $\phi(t)$, or force of fertility, is not parity-specific. However, the conditional forces of fertility, $\phi_n(t)$, corresponding to the age-parity-specific fertility rates, are dependent on parity. Thus this model incorporates the parity-specific concept in a fashion analogous to models of contagion.
One problem that arises is that fertility data collected are not often given in parity-specific detail. In Chapter IV, we devote some time to the question of how to derive the parity distribution from different types of fertility data. We also have the problem that parity-specific information is usually right truncated at some maximum reported parity. This necessitates the consideration of estimates of the $H_1$ distribution based on truncated data. In Chapter V, the model is applied to U.S. cohort data by single years of age and to data in five year age groups from selected countries.
CHAPTER II
THE CONTAGION MODEL
2.1  Introduction
In this chapter the $H_1$ distribution is developed as a contagion model to represent the parity distribution, conditional on age, under a restrictive set of assumptions. The development of the model suggests two additional compound distributions as alternatives to the $H_1$ distribution, namely, the $H_2$ and $H_3$ distributions. Little work has been done with these distributions. The probability density functions and probability generating functions are examined and the moments are derived. Limiting cases of the density functions are briefly discussed and some simple descriptive comparisons are made.
2.2  The Development of the Contagion Model
Suppose we observe a cohort of women aged $t$. Assume that each woman has been susceptible to the risk of a live birth conception for a fixed number, say $n_t$, of months and that the monthly probability of a live birth conception is a constant, denoted $p$, with $0 < p < 1$. Under these conditions, the number of births to any woman is a random variable having a binomial distribution with parameters $n_t$ and $p$ (Sheps and Menken, 1973).
A problem which arises is that the $n_t$ will vary among women due to the influence of such variables as age at marriage, non-susceptible periods following a live birth conception (the nine months of gestation plus a period of post-partum amenorrhea), and non-susceptible periods resulting from pregnancy wastage. Because of this we can consider the $n_t$ to be a discrete random variable. Dropping the subscript for age, we can initially consider $n$, the number of months or time units susceptible to the risk of a live birth conception, as a random variable having an (equally dispersed) Poisson distribution.
Using Gurland's (1958) notation for compound distributions, we write the compound binomial as
$$\text{Poisson}(\lambda p) \sim \text{Bin}(n, p) \underset{n}{\wedge} \text{Poisson}(\lambda), \tag{2.1}$$
which is read as a Poisson distribution with parameter $\lambda p$ and is equivalent to a binomial distribution with parameters $n$ and $p$, where $n$ is considered to be a random variable having a Poisson distribution with parameter $\lambda$.
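The equivalence stated in (2.1) can be checked numerically. The following is a minimal sketch, not part of the original development: the function names and the values $\lambda = 3.2$ and $p = 0.6$ are assumptions chosen only for illustration. It sums the binomial probabilities over a Poisson-distributed $n$ and compares the result with the Poisson($\lambda p$) probabilities.

```python
import math

def poisson_pmf(x, mean):
    """Poisson probability of x events for the given mean."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def compound_pmf(x, lam, p, n_max=100):
    """Pr[X = x] when X | n ~ Bin(n, p) and n ~ Poisson(lam).

    The sum over n is truncated at n_max, which is ample for small lam.
    """
    return sum(binomial_pmf(x, n, p) * poisson_pmf(n, lam)
               for n in range(x, n_max + 1))

lam, p = 3.2, 0.6  # illustrative values only (assumed, not from the text)
for x in range(6):
    # The two columns should agree up to rounding error.
    print(x, round(compound_pmf(x, lam, p), 6), round(poisson_pmf(x, lam * p), 6))
```

The truncation point `n_max` is a convenience of the sketch; for larger values of $\lambda$ it would have to be increased.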
Further heterogeneity among women can be introduced by considering the parameter $p$ of the compound binomial distribution as a random variable with probability density function $f(p)$. Since we have $0 < p < 1$, an obvious choice for the distribution of $p$ is the Pearson Type I, or Beta, distribution (e.g. Johnson and Kotz, 1970) with parameters $a$ and $b$. That is,
$$f(p) = \frac{\Gamma(b)}{\Gamma(a)\,\Gamma(b-a)}\; p^{a-1}(1-p)^{b-a-1} \tag{2.2}$$
with $0 < p < 1$ and $0 < a < b < \infty$. The resulting distribution is the $H_1$-distribution (Gurland, 1958; Katti, 1966) and can be expressed as a compound Beta-Binomial distribution, e.g.
$$H_1(\lambda, a, b) \sim \text{Bin}(n, p) \underset{p}{\wedge} \text{Beta}(a, b) \underset{n}{\wedge} \text{Poisson}(\lambda),$$
or
$$H_1(\lambda, a, b) \sim \text{Poisson}(\lambda p) \underset{p}{\wedge} \text{Beta}(a, b). \tag{2.3}$$
The preceding discussion implies that the $H_1$-distribution is derived from assumptions concerning non-homogeneity in the population. The contagion model (2.3) is then a result of "apparent" contagion (Feller, 1943). This does not preclude the development, in a manner similar to Eggenberger and Polya (1923), of the $H_1$-distribution by assuming "true" contagion in parity-specific fertility.
In some high fertility populations, the assumption that the number of time units susceptible to conception follows an equally dispersed distribution (Poisson) may be replaced by an assumption of an over-dispersed distribution, such as the negative binomial distribution, for the number of time units susceptible. This leads to a contagion model where the compounded probability density function is known as the $H_2$-distribution (Gurland, 1958; Katti, 1966). This distribution is also a compound Beta-Binomial distribution as well as a compound negative binomial distribution. We have
$$\text{Neg Bin}(k, pp') \sim \text{Bin}(n, p) \underset{n}{\wedge} \text{Neg Bin}(k, p'), \tag{2.4}$$
and
$$H_2(k, p', a, b) \sim \text{Bin}(n, p) \underset{p}{\wedge} \text{Beta}(a, b) \underset{n}{\wedge} \text{Neg Bin}(k, p'), \tag{2.5}$$
or
$$H_2(k, p', a, b) \sim \text{Neg Bin}(k, pp') \underset{p}{\wedge} \text{Beta}(a, b),$$
with $k > 0$, $0 < p' < 1$, and $b > a > 0$.
In a population with low fertility or a large amount of contraceptive usage, the number of time units susceptible to the risk of a live birth conception may be under-dispersed and follow a binomial distribution. This compound Beta-Binomial model will be called the $H_3$ distribution, to be consistent with the previous two compound Beta-Binomial distributions, and can be denoted
$$H_3(N, p', a, b) \sim \text{Bin}(n, p) \underset{p}{\wedge} \text{Beta}(a, b) \underset{n}{\wedge} \text{Bin}(N, p'), \tag{2.6}$$
or
$$H_3(N, p', a, b) \sim \text{Bin}(N, pp') \underset{p}{\wedge} \text{Beta}(a, b), \tag{2.7}$$
with $N$ an integer and $N > 0$, $0 < p' < 1$, and $b > a > 0$.
The use of the simple discrete distributions as compounding distributions may be an over-simplification. The random variable $n$ is defined in terms of discrete time units (e.g. months). This is justified by the biological nature of the human reproductive system. It should be noted, however, that the random variable $n$ could be interpreted as the amount of (continuous) time that a woman is susceptible. In this case, the underlying distribution for the number of births is no longer a binomial distribution. We will see later (in Chapter III) that the distributions defined by (2.3) and (2.5) can arise in a continuous time setting.
2.3  Probability Distribution Functions
The probability distribution function (p.d.f.) can be obtained from the probability generating function (p.g.f.) for each of the $H_1$, $H_2$ and $H_3$ distributions. The p.g.f. is defined for a random variable $X$ as
$$g(s) = \sum_{i=0}^{\infty} \Pr[X = i]\, s^{i}. \tag{2.8}$$
For the compound binomial distributions the p.g.f.'s are readily available as, for the Poisson($\lambda p$) distribution,
$$g_1(s) = e^{\lambda p (s-1)}, \tag{2.9}$$
for the negative binomial $(k, pp')$,
$$g_2(s) = [1 - pp'(s-1)]^{-k}, \tag{2.10}$$
and for the binomial $(N, pp')$,
$$g_3(s) = [1 + pp'(s-1)]^{N}. \tag{2.11}$$
We can consider the above p.g.f.'s as conditional on a fixed value of $p$. The p.g.f.'s of the $H$-distributions are then obtained by assuming $f(p)$ as a Beta distribution and integrating over the range of $p$. For example, the p.g.f. of the $H_1$ distribution is derived from
$$g_{H_1}(s \mid p) = g_1(s) = \exp[\lambda p (s-1)]. \tag{2.12}$$
Then
$$g_{H_1}(s) = \int_0^1 g_{H_1}(s \mid p)\, f(p)\, dp,$$
which is
$$g_{H_1}(s) = \frac{\Gamma(b)}{\Gamma(a)\,\Gamma(b-a)} \int_0^1 e^{\lambda p (s-1)}\, p^{a-1} (1-p)^{b-a-1}\, dp,$$
or
$$g_{H_1}(s) = {}_1F_1[a;\ b;\ \lambda(s-1)], \tag{2.13}$$
where ${}_1F_1[a;\ b;\ \lambda(s-1)]$ is the confluent hypergeometric function.
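The step from the Beta integral to (2.13) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the method of the text: it evaluates ${}_1F_1$ by its power series, integrates the conditional p.g.f. against the Beta density (2.2) by Simpson's rule, and compares the two. The function names, the step count, and the parameter values ($\lambda = 5$, $a = 3$, $b = 5$) are hypothetical choices; the endpoint evaluation assumes $a > 1$ and $b - a > 1$ so that the integrand is finite at 0 and 1.

```python
import math

def hyp1f1(a, b, z, terms=200):
    """Confluent hypergeometric function 1F1[a; b; z] by its power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (b + n) * z / (n + 1)
    return total

def beta_density(p, a, b):
    """Beta density (2.2) with parameters a and b, 0 < a < b."""
    const = math.gamma(b) / (math.gamma(a) * math.gamma(b - a))
    return const * p ** (a - 1) * (1.0 - p) ** (b - a - 1)

def mixed_pgf(s, lam, a, b, steps=2000):
    """Integrate exp[lam*p*(s-1)] * f(p) over (0, 1) with Simpson's rule."""
    h = 1.0 / steps

    def integrand(p):
        return math.exp(lam * p * (s - 1.0)) * beta_density(p, a, b)

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

lam, a, b = 5.0, 3.0, 5.0  # illustrative parameter values (assumed)
for s in (0.0, 0.5, 1.0):
    # Numerical integral versus the closed form (2.13).
    print(s, mixed_pgf(s, lam, a, b), hyp1f1(a, b, lam * (s - 1.0)))
```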
For the p.g.f. of the $H_2$ distribution, defined in (1.8), we consider
$$g_{H_2}(s \mid p) = g_2(s) = [1 - pp'(s-1)]^{-k},$$
and
$$g_{H_2}(s) = \frac{\Gamma(b)}{\Gamma(a)\,\Gamma(b-a)} \int_0^1 [1 - pp'(s-1)]^{-k}\, p^{a-1} (1-p)^{b-a-1}\, dp,$$
or
$$g_{H_2}(s) = {}_2F_1[k,\ a;\ b;\ p'(s-1)], \tag{2.14}$$
where ${}_2F_1[k,\ a;\ b;\ p'(s-1)]$ is the hypergeometric function defined in (1.4). Similarly, for the $H_3$ distribution we have
$$g_{H_3}(s \mid p) = g_3(s) = [1 + pp'(s-1)]^{N},$$
and
$$g_{H_3}(s) = \frac{\Gamma(b)}{\Gamma(a)\,\Gamma(b-a)} \int_0^1 [1 + pp'(s-1)]^{N}\, p^{a-1} (1-p)^{b-a-1}\, dp,$$
or
$$g_{H_3}(s) = {}_2F_1[-N,\ a;\ b;\ -p'(s-1)]. \tag{2.15}$$
The p.d.f.'s are then obtained from the derivatives of the p.g.f.'s as
$$\Pr[X = x] = P_x = \left.\frac{g^{(x)}(s)}{x!}\right|_{s=0}. \tag{2.16}$$
Using (2.16) and the equations for the derivatives of hypergeometric functions (§ 1.6.1) we then obtain, for the $H_1$-distribution,
$$P_x = \frac{\lambda^{x}}{x!}\,\frac{(a)_x}{(b)_x}\;{}_1F_1[a+x;\ b+x;\ -\lambda], \tag{2.17}$$
while for the $H_2$-distribution
$$P_x = \frac{(p')^{x}}{x!}\,\frac{(k)_x (a)_x}{(b)_x}\;{}_2F_1[k+x,\ a+x;\ b+x;\ -p'], \tag{2.18}$$
and for the $H_3$-distribution
$$P_x = \frac{(-p')^{x}}{x!}\,\frac{(-N)_x (a)_x}{(b)_x}\;{}_2F_1[-N+x,\ a+x;\ b+x;\ p']. \tag{2.19}$$
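As an illustration of (2.17), the following sketch evaluates the $H_1$ probabilities directly, using the power series for ${}_1F_1$. It is only a sketch under assumptions: the function names are hypothetical, the series is summed naively (adequate for the moderate values of $\lambda$ used in this chapter), and the parameter set $\lambda = 5$, $a = 3$, $b = 5$ is taken from the tables at the end of the chapter.

```python
def hyp1f1(a, b, z, terms=200):
    """Confluent hypergeometric function 1F1[a; b; z] by its power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (b + n) * z / (n + 1)
    return total

def h1_pmf(x, lam, a, b):
    """Pr[X = x] for the H1 distribution, following (2.17)."""
    coef = 1.0
    for j in range(x):
        # Builds lam^x / x! * (a)_x / (b)_x one factor at a time.
        coef *= lam * (a + j) / ((b + j) * (j + 1))
    return coef * hyp1f1(a + x, b + x, -lam)

lam, a, b = 5.0, 3.0, 5.0
probs = [h1_pmf(x, lam, a, b) for x in range(60)]
print("P_0 to P_4:", [round(p, 4) for p in probs[:5]])
print("total probability:", sum(probs))
print("mean:", sum(x * p for x, p in enumerate(probs)))
```

The first probability can be compared with the entry .0833 for $x = 0$ in the corresponding column of Table 2.2, and the mean with $\lambda a / b = 3$.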
We know that the hypergeometric functions are solutions of certain second-order differential equations (given in § 1.6.1). These differential equations can be used to derive second order difference equations among the probabilities of the $H$-family. For example, consider the function
$$F = {}_2F_1[-N+x,\ a+x;\ b+x;\ p']. \tag{2.20}$$
The function $F$ is associated with the differential equation
$$p'(1-p')\,\frac{d^2F}{dp'^2} + [\,b+x-(a+2x-N+1)p'\,]\,\frac{dF}{dp'} - (-N+x)(a+x)F = 0. \tag{2.21}$$
Now
$$\frac{dF}{dp'} = \frac{(-N+x)(a+x)}{(b+x)}\;{}_2F_1[-N+x+1,\ a+x+1;\ b+x+1;\ p'], \tag{2.22}$$
and
$$\frac{d^2F}{dp'^2} = \frac{(-N+x)(-N+x+1)(a+x)(a+x+1)}{(b+x)(b+x+1)}\;{}_2F_1[-N+x+2,\ a+x+2;\ b+x+2;\ p']. \tag{2.23}$$
For the $H_3$ distribution we can write
$$P_x = A_x\;{}_2F_1[-N+x,\ a+x;\ b+x;\ p'], \tag{2.24}$$
where
$$A_x = \frac{(-p')^{x}\,(-N)_x\,(a)_x}{(b)_x\; x!}; \tag{2.25}$$
then
$$P_{x+1} = A_x\,\frac{(-p')}{(x+1)}\,\frac{(-N+x)(a+x)}{(b+x)}\;{}_2F_1[-N+x+1,\ a+x+1;\ b+x+1;\ p'], \tag{2.26}$$
and
$$P_{x+2} = A_x\,\frac{p'^2}{(x+1)(x+2)}\,\frac{(-N+x)(-N+x+1)(a+x)(a+x+1)}{(b+x)(b+x+1)}\;{}_2F_1[-N+x+2,\ a+x+2;\ b+x+2;\ p']. \tag{2.27}$$
We now have
$$F = (A_x)^{-1} P_x, \tag{2.28}$$
$$\frac{dF}{dp'} = (A_x)^{-1}\,\frac{(x+1)}{-p'}\,P_{x+1}, \tag{2.29}$$
and
$$\frac{d^2F}{dp'^2} = (A_x)^{-1}\,\frac{(x+1)(x+2)}{p'^2}\,P_{x+2}. \tag{2.30}$$
Substituting (2.28), (2.29), and (2.30) into equation (2.21) and multiplying each side of the differential equation by $A_x$ we get, after simplifying,
$$(x+2)(x+1)(1-p')P_{x+2} - (x+1)[\,b+x-p'\{a-N+2x+1\}\,]P_{x+1} + p'(N-x)(a+x)P_x = 0, \tag{2.31}$$
as the difference equation for the $H_3$ distribution. By letting $F = {}_2F_1[k+x,\ a+x;\ b+x;\ -p']$ we can get the difference equation for the $H_2$ distribution as
$$(x+2)(x+1)(1+p')P_{x+2} - (x+1)[\,b+x+p'\{k+a+2x+1\}\,]P_{x+1} + p'(k+x)(a+x)P_x = 0. \tag{2.32}$$
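The difference equation (2.31) can be checked numerically against the $H_3$ probabilities (2.19). In the sketch below the hypergeometric function in (2.19) is a terminating series, since its first argument is a non-positive integer. The function names and the symbol `pp` (standing for $p'$) are assumptions of the sketch; the parameter values are taken from Table 2.7.

```python
def hyp2f1_poly(m, a, c, z):
    """2F1[-m, a; c; z] for an integer m >= 0 (a terminating series)."""
    total, term = 0.0, 1.0
    for n in range(m + 1):
        total += term
        term *= (-m + n) * (a + n) / ((c + n) * (n + 1)) * z
    return total

def h3_pmf(x, N, pp, a, b):
    """Pr[X = x] for the H3 distribution, following (2.19); pp stands for p'."""
    if x > N:
        return 0.0
    coef = 1.0
    for j in range(x):
        # Builds (-p')^x (-N)_x (a)_x / ((b)_x x!) one factor at a time.
        coef *= (-pp) * (-N + j) * (a + j) / ((b + j) * (j + 1))
    return coef * hyp2f1_poly(N - x, a + x, b + x, pp)

def residual_2_31(x, N, pp, a, b):
    """Left-hand side of (2.31); it should vanish up to rounding error."""
    p0, p1, p2 = (h3_pmf(x + j, N, pp, a, b) for j in range(3))
    return ((x + 2) * (x + 1) * (1 - pp) * p2
            - (x + 1) * (b + x - pp * (a - N + 2 * x + 1)) * p1
            + pp * (N - x) * (a + x) * p0)

N, pp, a, b = 4, 0.8, 0.5, 0.8
print([round(h3_pmf(x, N, pp, a, b), 4) for x in range(N + 1)])
print([residual_2_31(x, N, pp, a, b) for x in range(N - 1)])
```

The printed probabilities can be compared with the first column of Table 2.7.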
The differential equation associated with the confluent hypergeometric function can be written as
$$x\,\frac{d^2F}{dx^2} + (\beta - x)\,\frac{dF}{dx} - \alpha F = 0, \tag{2.33}$$
which has $F = {}_1F_1[\alpha;\ \beta;\ x]$ as one solution. By letting $F = {}_1F_1[a+x;\ b+x;\ -\lambda]$ we can obtain the difference equation for the $H_1$ distribution as
$$(x+2)(x+1)P_{x+2} - (x+1)[\lambda+b+x]P_{x+1} + \lambda(a+x)P_x = 0. \tag{2.34}$$
The above difference equations can be used to obtain estimates of the parameters of the distributions. They can also be used as recurrence relations among the probabilities of each distribution. For example, for the $H_1$ distribution, the probabilities $P_0$ and $P_1$ can be calculated from the values of $a$, $b$, and $\lambda$ and from tables of the confluent hypergeometric function (e.g. Slater, 1960). The equation (2.34) can then be applied to determine the remainder of the probabilities.
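The use of (2.34) as a recurrence relation can be sketched as follows. Here $P_0$ and $P_1$ are computed from the series for ${}_1F_1$ instead of from tables, and the remaining probabilities are generated from (2.34) and compared with the direct formula (2.17). The function names and parameter values are assumptions of the sketch. Forward use of a three-term recurrence can accumulate rounding error, so the comparison is limited to a modest range of $x$.

```python
def hyp1f1(a, b, z, terms=200):
    """Confluent hypergeometric function 1F1[a; b; z] by its power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (b + n) * z / (n + 1)
    return total

def h1_pmf(x, lam, a, b):
    """Direct evaluation of (2.17)."""
    coef = 1.0
    for j in range(x):
        coef *= lam * (a + j) / ((b + j) * (j + 1))
    return coef * hyp1f1(a + x, b + x, -lam)

def h1_by_recurrence(lam, a, b, x_max):
    """P_0, ..., P_{x_max} from P_0, P_1 and the recurrence (2.34)."""
    probs = [hyp1f1(a, b, -lam),
             lam * a / b * hyp1f1(a + 1, b + 1, -lam)]
    for x in range(x_max - 1):
        nxt = ((x + 1) * (lam + b + x) * probs[x + 1]
               - lam * (a + x) * probs[x]) / ((x + 2) * (x + 1))
        probs.append(nxt)
    return probs

lam, a, b = 5.0, 3.0, 5.0
rec = h1_by_recurrence(lam, a, b, 12)
for x, p in enumerate(rec):
    # Recurrence value next to the directly computed value.
    print(x, round(p, 6), round(h1_pmf(x, lam, a, b), 6))
```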
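We note that for the $H_1$ and the $H_2$ distributions, most of the above results have been obtained previously (e.g. Kemp and Kemp, 1974). They are presented here for complete coverage of the subject matter. Also, for completeness, it can be easily shown that the probabilities given by (2.17) sum to one. We have
$$\sum_{x=0}^{\infty} P_x = \sum_{x=0}^{\infty} \frac{\lambda^{x}}{x!}\,\frac{(a)_x}{(b)_x}\;{}_1F_1[a+x;\ b+x;\ -\lambda].$$
By Taylor's theorem (Whittaker and Watson, 1927) we see
$${}_1F_1[a;\ b;\ y+z] = \sum_{x=0}^{\infty} \frac{y^{x}}{x!}\,\frac{(a)_x}{(b)_x}\;{}_1F_1[a+x;\ b+x;\ z].$$
Letting $z = -\lambda$ and $y = \lambda$, so that $z + y = 0$, we obtain
$$\sum_{x=0}^{\infty} P_x = {}_1F_1[a;\ b;\ 0] = 1.$$
Similar results hold for both the $H_2$ and $H_3$ distributions.
2.4  Moments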
The $r$-th factorial moment of a (discrete) distribution is defined as
$$\mu'_{(r)} = \sum_{x=0}^{\infty} x(x-1)\cdots(x-r+1)\,P_x \tag{2.35}$$
for $r = 1, 2, 3, \ldots$, and $\mu'_{(0)} = 1$. The factorial moments of the $H_1$, $H_2$, and $H_3$ distributions can be computed either from an expression due to Kemp and Kemp (1974) or from the p.g.f. as
$$\mu'_{(r)} = \left.\frac{d^{r} g(s)}{ds^{r}}\right|_{s=1} \quad \text{for } r = 0, 1, 2, 3, \ldots \tag{2.36}$$
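For example, for the $H_1$ distribution we have
$$\mu'_{(r)} = \left.\frac{d^{r}}{ds^{r}}\;{}_1F_1[a;\ b;\ \lambda(s-1)]\right|_{s=1}, \tag{2.37}$$
so that
$$\mu'_{(r)} = \frac{\lambda^{r}(a)_r}{(b)_r}\;{}_1F_1[a+r;\ b+r;\ 0] = \frac{\lambda^{r}(a)_r}{(b)_r}. \tag{2.38}$$
This gives a simple recurrence relation for the factorial moments of the $H_1$ distribution as
$$\mu'_{(r+1)} = \frac{\lambda(a+r)}{(b+r)}\,\mu'_{(r)} \quad \text{for } r = 0, 1, 2, \ldots \tag{2.39}$$
Similar recurrence relations hold for the $H_2$ and $H_3$ distributions. For the $H_2$ distribution
$$\mu'_{(r+1)} = \frac{p'(k+r)(a+r)}{(b+r)}\,\mu'_{(r)} \quad \text{for } r = 0, 1, 2, \ldots \tag{2.40}$$
and for the $H_3$ distribution
$$\mu'_{(r+1)} = \frac{p'(N-r)(a+r)}{(b+r)}\,\mu'_{(r)} \quad \text{for } r = 0, 1, 2, \ldots, N-1. \tag{2.41}$$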
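The mean and variance of each distribution can now be expressed as
$$\mu_{H_1} = \frac{\lambda a}{b}, \qquad \sigma^2_{H_1} = \frac{\lambda a}{b}\left[1 + \frac{\lambda(b-a)}{b(b+1)}\right], \tag{2.42}$$
$$\mu_{H_2} = \frac{kp'a}{b}, \qquad \sigma^2_{H_2} = \frac{kp'a}{b}\left[1 + \frac{kp'(b-a)}{b(b+1)} + \frac{p'(a+1)}{(b+1)}\right], \tag{2.43}$$
and
$$\mu_{H_3} = \frac{Np'a}{b}, \qquad \sigma^2_{H_3} = \frac{Np'a}{b}\left[1 + \frac{Np'(b-a)}{b(b+1)} - \frac{p'(a+1)}{(b+1)}\right]. \tag{2.44}$$
Setting $\lambda = kp' = Np'$, so that the three means are equal, we can establish the relationship
$$\sigma^2_{H_3} < \sigma^2_{H_1} < \sigma^2_{H_2}. \tag{2.45}$$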
It may be noted that $\mu_{H_1} < \sigma^2_{H_1}$ and $\mu_{H_2} < \sigma^2_{H_2}$, so that both the $H_1$ and $H_2$ distributions are over-dispersed. The $H_3$ distribution may be over-dispersed or under-dispersed. From (2.44) we see $\mu_{H_3} < \sigma^2_{H_3}$ only when $N(1 - a/b) > a + 1$. Since we have $0 < a/b < 1$, the inequality $N > a + 1$ is necessary (though not sufficient) for $\mu_{H_3} < \sigma^2_{H_3}$.
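The expressions (2.42) to (2.44) are simple enough to evaluate directly. The following sketch does so for a few parameter sets that also appear in Tables 2.1, 2.3 and 2.6; the function names and the symbol `pp` (standing for $p'$) are hypothetical labels introduced only for this illustration.

```python
def h1_mean_var(lam, a, b):
    """Mean and variance of the H1 distribution, from (2.42)."""
    mean = lam * a / b
    return mean, mean * (1 + lam * (b - a) / (b * (b + 1)))

def h2_mean_var(k, pp, a, b):
    """Mean and variance of the H2 distribution, from (2.43)."""
    mean = k * pp * a / b
    return mean, mean * (1 + k * pp * (b - a) / (b * (b + 1))
                         + pp * (a + 1) / (b + 1))

def h3_mean_var(N, pp, a, b):
    """Mean and variance of the H3 distribution, from (2.44)."""
    mean = N * pp * a / b
    return mean, mean * (1 + N * pp * (b - a) / (b * (b + 1))
                         - pp * (a + 1) / (b + 1))

# Common mean: lam = k p' = N p' = 3.2 with a = .5, b = .8.
print("H1:", h1_mean_var(3.2, 0.5, 0.8))
print("H2:", h2_mean_var(4, 0.8, 0.5, 0.8))
print("H3:", h3_mean_var(4, 0.8, 0.5, 0.8))
# An H3 case in which the variance falls below the mean.
print("H3:", h3_mean_var(8, 0.4, 5.0, 8.0))
```

With a common mean the three variances are ordered as in (2.45), and the last line gives an under-dispersed $H_3$ case.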
Ottestad (1943) has looked at the first differences in the series of ratios of the factorial moments of certain discrete distributions. These ratios are also used in estimation procedures (e.g. Gurland and Tripathi, 1974). The ratios are defined by
$$\eta_0 = \mu'_{(1)} \quad \text{and} \quad \eta_r = \frac{\mu'_{(r+1)}}{\mu'_{(r)}} \quad \text{for } r = 1, 2, \ldots \tag{2.46}$$
The first differences are then given by
$$\delta_r = \eta_r - \eta_{r-1} \quad \text{for } r = 1, 2, \ldots \tag{2.47}$$
The ratios of factorial moments can be easily determined from the simple recurrence relations previously mentioned. Hence, for the $H_1$ distribution, equation (2.38) gives
$$\eta_r = \frac{\lambda(a+r)}{(b+r)} \quad \text{for } r = 0, 1, 2, \ldots \tag{2.48}$$
It can be shown that
$$\frac{a+r}{b+r} < \frac{a+r+1}{b+r+1} \quad \text{for } r = 0, 1, 2, \ldots \tag{2.49}$$
which gives the relation
$$\eta_0 < \eta_1 < \eta_2 < \cdots \tag{2.50}$$
for the $H_1$ distribution. This also means that the first differences (2.47) are strictly positive. In fact,
$$\delta_1 = \frac{\lambda(b-a)}{b(b+1)} \tag{2.51}$$
and
$$\delta_2 = \frac{\lambda}{b+2}\left[1 - \frac{a+1}{b+1}\right] = \frac{\lambda(b-a)}{(b+1)(b+2)}, \tag{2.52}$$
so that
$$\delta_1 = \frac{b+2}{b}\,\delta_2$$
and thus $\delta_1 > \delta_2$. Similarly
$$\delta_2 = \frac{b+3}{b+1}\,\delta_3$$
and, in general,
$$\delta_r = \frac{b+r+1}{b+r-1}\,\delta_{r+1}. \tag{2.53}$$
We now can say that, for the $H_1$ distribution,
$$\delta_1 > \delta_2 > \delta_3 > \cdots > 0. \tag{2.54}$$
For the $H_2$ distribution, (2.40) gives
$$\eta_r = \frac{p'(k+r)(a+r)}{(b+r)}, \quad r = 0, 1, 2, \ldots \tag{2.55}$$
It can be shown that
$$\frac{(k+r)(a+r)}{(b+r)} < \frac{(k+r+1)(a+r+1)}{(b+r+1)}, \tag{2.56}$$
which implies
$$\eta_0 < \eta_1 < \eta_2 < \cdots$$
for the $H_2$ distribution. However, no statement such as (2.54) can be made for the $H_2$ distribution.
Even less can be said for the $H_3$ distribution. Equation (2.41) gives
$$\eta_r = p'(N-r)\,\frac{(a+r)}{(b+r)} \quad \text{for } r = 0, 1, \ldots, N. \tag{2.57}$$
We can see, after some algebra, that
$$\frac{(N-1)(a+1)}{(b+1)} = \frac{Na}{b} + \frac{1}{(b+1)}\left\{N - \left(a + 1 + \frac{Na}{b}\right)\right\}.$$
This means that $\eta_0 < \eta_1$ whenever
$$\frac{Na\,p'}{b} < \frac{(N-1)(a+1)\,p'}{(b+1)},$$
that is, when
$$N > \frac{b(a+1)}{(b-a)}. \tag{2.58}$$
In a similar fashion it can be established that $\eta_1 < \eta_2$ when
$$N > \frac{b(a+3)+2}{(b-a)} \tag{2.59}$$
and that $\eta_2 < \eta_3$ when
$$N > \frac{b(a+5)+6}{(b-a)}. \tag{2.60}$$
For the $H_3$ distribution, the first differences (2.47) depend entirely on the parameters and no simple inequality exists.
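The ratios $\eta_r$ and first differences $\delta_r$ follow immediately from the recurrence relations for the factorial moments. The sketch below computes them for one parameter set from each of Tables 2.1, 2.3 and 2.6. The function names and the symbol `pp` (for $p'$) are assumptions of the sketch.

```python
def eta_h1(r, lam, a, b):
    """Ratio of factorial moments for H1, as in (2.48)."""
    return lam * (a + r) / (b + r)

def eta_h2(r, k, pp, a, b):
    """Ratio of factorial moments for H2, as in (2.55)."""
    return pp * (k + r) * (a + r) / (b + r)

def eta_h3(r, N, pp, a, b):
    """Ratio of factorial moments for H3, as in (2.57)."""
    return pp * (N - r) * (a + r) / (b + r)

def first_differences(etas):
    """delta_r = eta_r - eta_{r-1} for r = 1, 2, ..."""
    return [etas[i] - etas[i - 1] for i in range(1, len(etas))]

e1 = [eta_h1(r, 3.2, 0.5, 0.8) for r in range(3)]
e2 = [eta_h2(r, 4, 0.8, 5.0, 8.0) for r in range(3)]
e3 = [eta_h3(r, 4, 0.8, 0.5, 0.8) for r in range(3)]
for name, etas in (("H1", e1), ("H2", e2), ("H3", e3)):
    print(name, [round(e, 3) for e in etas],
          [round(d, 3) for d in first_differences(etas)])
```

For the $H_3$ case shown, $N = 4$ equals the bound $b(a+1)/(b-a)$ in (2.58), so that $\eta_0 = \eta_1$ and $\delta_1 = 0$.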
2.5  Limiting cases of the $H_1$ distribution
Kemp (1968) provides some results on the limits of general hypergeometric functions that are useful in determining the limiting cases of the $H_1$ distribution. The necessary results are
(2.61)
and
(2.63)
which can be derived from
$$\lim_{d \to \infty} \frac{\Gamma(c+d+r)}{\Gamma(c+d)\, d^{r}} = 1. \tag{2.64}$$
Some results on the limiting cases of the $H_1$ and related distributions are known. For example, Kemp and Kemp (1974) state that the negative binomial distribution is a limiting form of the $H_2$ distribution. Gurland (1958) looks at some limiting forms of the $H_1$ distribution. One result due to Gurland is
$$\lim_{\substack{b \to \infty \\ a \text{ fixed}}} {}_1F_1[a;\ b;\ \lambda(s-1)] = 1, \tag{2.65}$$
which is the p.g.f. of a degenerate distribution corresponding to a constant zero. Gurland also notes that for $b$ large and $a$ fixed the p.g.f. of the $H_1$ distribution tends towards the p.g.f. of a generalized Polya-Aeppli distribution with
(2.66)
Another result due to Gurland is
$$\lim_{a \to \infty}\ \lim_{\substack{b \to \infty \\ (b-a)/a \to \gamma}} {}_1F_1[a;\ b;\ \lambda(s-1)] = \exp\left[\frac{\lambda(s-1)}{1+\gamma}\right]. \tag{2.67}$$
We note that this can be written as
$$\lim_{a \to \infty}\ \lim_{\substack{b \to \infty \\ a/b \to p}} {}_1F_1[a;\ b;\ \lambda(s-1)] = \exp[\lambda p (s-1)]. \tag{2.68}$$
This is the p.g.f. of a Poisson distribution with parameter $\lambda p$.
We can also take a similar limit for the p.g.f.'s of the $H_2$ and $H_3$ distributions. For the $H_2$ distribution, (2.63) yields
$$\lim_{a \to \infty}\ \lim_{\substack{b \to \infty \\ a/b \to p}} {}_2F_1[k,\ a;\ b;\ p'(s-1)] = {}_1F_0[k;\ ;\ pp'(s-1)/(1-pp')], \tag{2.69}$$
which is the p.g.f. of a negative binomial with parameters $k$ and $pp'$. For the $H_3$ distribution we get
$$\lim_{a \to \infty}\ \lim_{\substack{b \to \infty \\ a/b \to p}} {}_2F_1[-N,\ a;\ b;\ p'(1-s)] = {}_1F_0[-N;\ ;\ pp'(1-s)/(1-pp')], \tag{2.70}$$
which is the p.g.f. of a binomial distribution with parameters $N$ and $pp'$.
The $H_1$ distribution is a limiting form of both the $H_2$ and $H_3$ distributions. Using (2.61) we have
$$\lim_{k \to \infty}\ \lim_{\substack{p' \to 0 \\ kp' \to \lambda}} {}_2F_1[k,\ a;\ b;\ p'(s-1)] = {}_1F_1[a;\ b;\ \lambda(s-1)] \tag{2.71}$$
and
$$\lim_{N \to \infty}\ \lim_{\substack{p' \to 0 \\ Np' \to \lambda}} {}_2F_1[-N,\ a;\ b;\ p'(1-s)] = {}_1F_1[a;\ b;\ \lambda(s-1)]. \tag{2.72}$$
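The limit (2.71) can be illustrated numerically by holding $kp' = \lambda$ fixed and letting $k$ grow. The sketch below sums the two hypergeometric series directly and reports the absolute difference between the $H_2$ and $H_1$ p.g.f.'s at a single point $s$. The function names, the value $s = 0.5$, and the parameter values are illustrative assumptions; the Gauss series used here requires $|p'(s-1)| < 1$.

```python
def hyp1f1(a, b, z, terms=200):
    """Confluent hypergeometric function 1F1[a; b; z] by its power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (b + n) * z / (n + 1)
    return total

def hyp2f1(a1, a2, c, z, terms=400):
    """Gauss hypergeometric series 2F1[a1, a2; c; z]; converges for |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a1 + n) * (a2 + n) / ((c + n) * (n + 1)) * z
    return total

def limit_gap(k, lam, a, b, s):
    """|g_H2(s) - g_H1(s)| when p' = lam / k, as in the limit (2.71)."""
    pp = lam / k
    return abs(hyp2f1(k, a, b, pp * (s - 1.0)) - hyp1f1(a, b, lam * (s - 1.0)))

lam, a, b, s = 5.0, 3.0, 5.0, 0.5
for k in (10, 100, 1000):
    # The gap should shrink as k grows with k * p' held at lam.
    print(k, limit_gap(k, lam, a, b, s))
```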
2.6  Descriptive Comparisons
The tables that follow give substance to the analytical results of the previous sections. It is instructive to see in a numerical fashion the changes in the probability functions and moments that occur when the values of the parameters of the distribution are changed.
We expect that $\eta_0 < \eta_1 < \eta_2$ and $\delta_2 < \delta_1$ for the $H_1$ distribution, and this is in evidence in Table 2.1. We also see that for $\lambda$ fixed and the ratio $a/b$ fixed, so that the mean of the $H_1$ distribution is fixed, the variance decreases as the parameters $a$ and $b$ increase. The ratios of the factorial moments $\eta_1$ and $\eta_2$ also decrease, as well as the differences $\delta_1 = \eta_1 - \eta_0$ and $\delta_2 = \eta_2 - \eta_1$.
The case where $\lambda$ and the ratio $a/b$ are both fixed and the parameters $a$ and $b$ are decreasing yields an interesting result. In this instance the variance increases and there is more probability in the upper tail of the distribution. Also, the $H_1$ distribution becomes bimodal with one mode at the zero class. The result is a relatively large probability of being in the zero class.
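The bimodality described above can be seen by locating the local maxima of the $H_1$ probabilities for small and for large values of $a$ and $b$. The following sketch uses the parameter sets of Table 2.2 with $\lambda = 5$; the function names are hypothetical and the probabilities are computed from (2.17) with a directly summed series for ${}_1F_1$.

```python
def hyp1f1(a, b, z, terms=200):
    """Confluent hypergeometric function 1F1[a; b; z] by its power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) / (b + n) * z / (n + 1)
    return total

def h1_pmf(x, lam, a, b):
    """Pr[X = x] for the H1 distribution, following (2.17)."""
    coef = 1.0
    for j in range(x):
        coef *= lam * (a + j) / ((b + j) * (j + 1))
    return coef * hyp1f1(a + x, b + x, -lam)

def local_modes(probs):
    """Indices x at which P_x exceeds both of its neighbours."""
    modes = []
    for x, p in enumerate(probs[:-1]):
        left = probs[x - 1] if x > 0 else 0.0
        if p > left and p > probs[x + 1]:
            modes.append(x)
    return modes

for a, b in ((0.3, 0.5), (30.0, 50.0)):
    probs = [h1_pmf(x, 5.0, a, b) for x in range(16)]
    # Small a and b give a mode at zero plus an interior mode.
    print(a, b, local_modes(probs))
```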
This has implications in the problem of estimating parameters of the $H_1$ distribution from observed data on the parity distribution by age. We find, especially at younger ages, that a proportion of the population must be considered as never being susceptible to the risk of conception, either due to natural sterility or simply not being married. This tends to increase the observed proportion of the population of parity zero. If an allowance is not made for this increased proportion of parity zero, then the estimates of $a$ and $b$ may be smaller than the true parameters of the underlying $H_1$ distribution. This possibility will be explored later in the context of a modified $H_1$ distribution.
Table 2.2 gives some idea of how large the parameters $a$ and $b$ must be in order for the $H_1$ distribution to approach its limiting form, the Poisson distribution. This table also gives some indication of possible problems in using maximum likelihood estimation. Relatively small changes in the observed population (due to sampling variation) may produce very different estimates of the parameters for a large proportion of the parameter space of $a$ and $b$.
For the $H_2$ distribution we have $\eta_0 < \eta_1 < \eta_2$. This is reflected in Table 2.3, which also shows $\delta_2 < \delta_1$ except for large values of $a$ and $b$ or for small values of $k$. The variance, as well as $\eta_1$, $\eta_2$, $\delta_1$ and $\delta_2$, decreases as $k$ increases with $kp'$, $a$ and $b$ remaining constant. The descriptive statistics also decrease when the parameters $a$ and $b$ increase with $k$, $p'$, and the ratio $a/b$ remaining constant.
The probabilities in Tables 2.4 and 2.5 indicate the change in the $H_2$ distribution with changes in the parameters. It is possible to see that the limiting form of the $H_2$ distribution, as $a$ and $b$ increase, is closely approximated for relatively small values of $a$ and $b$. This again implies a problem in the estimation of the parameters $a$ and $b$, in that large variation in the estimates may yield relatively small variation in the predicted probability density function. We also note the tendency of small values of $a$ and $b$ to give a large probability for the zero class and thus a tendency for the probability density function to be bimodal. More specifically, as the mean increases but $a$ and $b$ decrease, the variance increases and the p.d.f. becomes bimodal.
This situation is more defined in the $H_3$ distribution. Here it seems as though $P[X = 0] > P[X = 1]$ whenever $a < b < 1$. This condition appears to be sufficient to assure that the $H_3$ distribution is bimodal. As with the $H_2$ distribution, the descriptive statistics (in Table 2.6) for the $H_3$ distribution increase as $N$ increases with $Np'$, $a$, and $b$ remaining constant, while they decrease for increasing $a$ and $b$ with $N$, $p'$, and the ratio $a/b$ remaining constant. Finally, it is seen that $\delta_2 < \delta_1$ but either $\delta_1$ or $\delta_2$ or both may be negative for the $H_3$ distribution.
TABLE 2.1
Descriptive Statistics for the $H_1$-distribution

       Parameters                    Descriptive Statistics
 $\lambda$    $a$     $b$      $\mu$   $\sigma^2$  $\eta_1$  $\eta_2$  $\delta_1$  $\delta_2$
   3.2     .5     .8       2.     3.333    2.667    2.857    0.667     0.190
   3.2     5.     8.       2.     2.267    2.133    2.240    0.133     0.107
   3.2    50.    80.       2.     2.030    2.015    2.029    0.015     0.014
   5.      .3     .5       3.     7.0      4.333    4.6      1.333     2.667
   5.      3.     5.       3.     4.0      3.333    3.571    0.333     0.238
   5.     30.    50.       3.     3.118    3.039    3.077    0.390     0.038
TABLE 2.2
The $H_1$-distribution

                 $\mu = 2.0$                   $\mu = 3.0$
 $\lambda$    3.2     3.2     3.2         5.0     5.0     5.0
 $a$           .5      5      50           .3      3      30
 $b$           .8      8      80           .5      5      50

  x          $P_x$   $P_x$   $P_x$       $P_x$   $P_x$   $P_x$
  0         .2658   .1555   .1374       .2568   .0833   .0528
  1         .1986   .2672   .2706       .1099   .1655   .1522
  2         .1845   .2521   .2687       .1097   .2021   .2224
  3         .1486   .1702   .1791       .1185   .1894   .2197
  4         .0999   .0910   .0902       .1171   .1472   .1649
  5         .0567   .0407   .0366       .1014   .0985   .1002
  6         .0276   .0157   .0125       .0765   .0581   .0514
  7         .0118   .0054   .0037       .0508   .0307   .0228
  8         .0044   .0016   .0009       .0300   .0147   .0090
  9         .0015   .0004   .0002       .0159   .0064   .0032
 10         .0004   .0001     *         .0077   .0026   .0010
 11         .0001     *       *         .0034   .0010   .0003
 12           *       *       *         .0014   .0003   .0001
 13           *       *       *         .0005   .0001     *
 14           *       *       *         .0002     *       *
 15           *       *       *         .0001     *       *

 * < .00005
TABLE 2.3
Descriptive Statistics for the $H_2$-distribution

        Parameters                       Descriptive Statistics
  $k$    $p'$    $a$    $b$      $\mu$   $\sigma^2$  $\eta_1$  $\eta_2$  $\delta_1$  $\delta_2$
   4    .8      .5     .8      2.0     4.667    3.333    4.286    1.333     0.952
   8    .4      .5     .8      2.0     4.0      3.000    3.571    1.000     0.571
  12    .267    .5     .8      2.0     3.778    2.889    3.333    0.889     0.444
   4    .8      5      8       2.0     3.333    2.667    3.360    0.667     0.693
   8    .4      5      8       2.0     2.8      2.4      2.8      0.4       0.4
  12    .267    5      8       2.0     2.622    2.311    2.613    0.311     0.302
   4    .8     50     80       2.0     3.037    2.518    3.044    0.518     0.525
   8    .4     50     80       2.0     2.533    2.267    2.536    0.267     0.270
  12    .267   50     80       2.0     2.365    2.183    2.367    0.183     0.185
   6    .833    .3     .5      3.0     9.167    5.056    6.133    2.056     1.078
  12    .417    .3     .5      3.0     8.083    4.694    5.367    1.694     0.672
  18    .278    .3     .5      3.0     7.722    4.574    5.111    1.574     0.537
   6    .833    3      5       3.0     5.667    3.889    4.762    0.889     0.873
  12    .417    3      5       3.0     4.833    3.611    4.167    0.611     0.556
  18    .278    3      5       3.0     4.556    3.518    3.968    0.518     0.450
   6    .833   30     50       3.0     4.637    3.546    4.102    0.546     0.557
  12    .417   30     50       3.0     3.877    3.292    3.590    0.292     0.297
  18    .278   30     50       3.0     3.624    3.208    3.419    0.208     0.211
TABLE 2.4
The $H_2$-distribution ($\mu = 2.0$)

          a = .5, b = .8            a = 5, b = 8            a = 50, b = 80
 $p'$    .8      .4     .267      .8      .4     .267      .8      .4     .267
 $k$      4       8      12        4       8      12        4       8      12

  x
  0    .3108   .2887   .2811    .2141   .1859   .1760    .1992   .1696   .1591
  1    .2065   .2048   .2034    .2610   .2656   .2666    .2633   .2684   .2696
  2    .1601   .1715   .1757    .2085   .2277   .2351    .2184   .2401   .2487
  3    .1166   .1297   .1352    .1382   .1518   .1572    .1454   .1599   .1657
  4    .0794   .0878   .0913    .0824   .0867   .0882    .0849   .0883   .0892
  5    .0512   .0541   .0551    .0460   .0446   .0437    .0455   .0427   .0412
  6    .0317   .0309   .0302    .0245   .0212   .0197    .0230   .0188   .0169
  7    .0189   .0165   .0153    .0127   .0095   .0082    .0110   .0076   .0063
  8    .0110   .0084   .0072    .0064   .0041   .0033    .0051   .0029   .0022
  9    .0062   .0041   .0032    .0032   .0017   .0012    .0023   .0011   .0007
 10    .0034   .0019   .0014    .0015   .0007   .0004    .0010   .0004   .0002
 11    .0019   .0008   .0006    .0007   .0003   .0002    .0004   .0001   .0001
 12    .0010   .0004   .0002    .0004   .0001   .0001    .0002   .0000   .0000
 13    .0006   .0002   .0001    .0002   .0000   .0000    .0001   .0000   .0000
 14    .0004   .0001   .0000    .0001   .0000   .0000    .0000   .0000   .0000
TABLE 2.5
The $H_2$-distribution ($\mu = 3.0$)

          a = .3, b = .5            a = 3, b = 5            a = 30, b = 50
 $p'$   .833    .417    .278     .833    .417    .278     .833    .417    .278
 $k$      6      12      18        6      12      18        6      12      18

  x
  0    .2775   .2667   .2632    .1180   .1006   .0948    .0906   .0716   .0653
  1    .1288   .1204   .1171    .1851   .1771   .1737    .1773   .1671   .1627
  2    .1177   .1155   .1141    .1902   .1967   .1987    .2038   .2131   .2163
  3    .1091   .1143   .1159    .1607   .1735   .1784    .1797   .1969   .2038
  4    .0953   .1046   .1084    .1209   .1321   .1365    .1344   .1475   .1527
  5    .0780   .0873   .0913    .0842   .0904   .0928    .0899   .0950   .0968
  6    .0602   .0668   .0696    .0554   .0570   .0574    .0554   .0546   .0539
  7    .0442   .0474   .0485    .0349   .0337   .0329    .0321   .0287   .0271
  8    .0311   .0314   .0313    .0213   .0188   .0177    .0177   .0140   .0125
  9    .0212   .0197   .0188    .0126   .0101   .0090    .0094   .0065   .0054
 10    .0139   .0117   .0106    .0073   .0052   .0044    .0048   .0028   .0022
 11    .0090   .0067   .0057    .0042   .0026   .0020    .0024   .0012   .0008
 12    .0056   .0037   .0029    .0020   .0012   .0009    .0012   .0005   .0003
 13    .0034   .0019   .0014    .0013   .0006   .0004    .0006   .0002   .0001
 14    .0021   .0010   .0007    .0007   .0003   .0002    .0003   .0001   .0000
TABLE 2.6
Descriptive Statistics for the $H_3$-distribution

        Parameters                       Descriptive Statistics
  $N$    $p'$    $a$    $b$      $\mu$   $\sigma^2$  $\eta_1$  $\eta_2$  $\delta_1$  $\delta_2$
   4    .8      .5     .8      2.0     2.0      2.0      1.428    0.0      -0.571
   8    .4      .5     .8      2.0     2.667    2.333    2.143    0.333    -0.190
  12    .267    .5     .8      2.0     2.889    2.444    2.381    0.444    -0.063
   4    .8      5      8       2.0     1.2      1.6      1.12    -0.4      -0.48
   8    .4      5      8       2.0     1.733    1.867    1.68    -0.133    -0.187
  12    .267    5      8       2.0     1.911    1.956    1.867   -0.44     -0.089
   4    .8     50     80       2.0     1.022    1.511    1.015   -0.489    -0.496
   8    .4     50     80       2.0     1.526    1.763    1.522   -0.237    -0.241
  12    .267   50     80       2.0     1.694    1.847    1.691   -0.153    -0.156
   6    .833    .3     .5      3.0     4.833    3.611    3.067   -0.611    -0.544
  12    .417    .3     .5      3.0     5.917    3.972    3.833   -0.972    -0.139
  18    .278    .3     .5      3.0     6.278    4.092    4.089    1.092    -0.004
   6    .833    3      5       3.0     2.333    2.778    2.381   -0.222    -0.397
  12    .417    3      5       3.0     3.167    3.056    2.976    0.056    -0.079
  18    .278    3      5       3.0     3.444    3.148    3.175    0.148     0.026
   6    .833   30     50       3.0     1.598    2.533    2.051   -0.467    -0.481
  12    .417   30     50       3.0     2.358    2.786    2.564   -0.214    -0.222
  18    .278   30     50       3.0     2.611    2.870    2.735   -0.130    -0.135
TABLE 2.7
The $H_3$-distribution ($\mu = 2.0$)

          a = .5, b = .8            a = 5, b = 8            a = 50, b = 80
 $p'$    .8      .4     .267      .8      .4     .267      .8      .4     .267
 $N$      4       8      12        4       8      12        4       8      12

  x
  0    .2231   .2431   .2505    .0896   .1231   .1341    .0653   .1024   .1144
  1    .1554   .1840   .1901    .2445   .2625   .2651    .2499   .2669   .2691
  2    .1955   .1973   .1933    .3293   .2846   .2726    .3695   .3084   .2934
  3    .2506   .1798   .1672    .2498   .1977   .1870    .2500   .2063   .1960
  4    .1754   .1204   .1120    .0869   .0939   .0933    .0653   .0874   .0893
  5            .0556   .0570            .0307   .0351            .0240   .0292
  6            .0167   .0219            .0066   .0101            .0042   .0070
  7            .0029   .0063            .0009   .0022            .0004   .0012
  8            .0002   .0013            .0001   .0004            .0000   .0002
  9                    .0002                    .0000                    .0000
 10                    .0000                    .0000                    .0000
 11                    .0000                    .0000                    .0000
 12                    .0000                    .0000                    .0000
TABLE 2.8
The $H_3$-distribution ($\mu = 3.0$)

          a = .3, b = .5            a = 3, b = 5            a = 30, b = 50
 $p'$   .833    .417    .267     .833    .417    .267     .833    .417    .267
 $N$      6      12      18        6      12      18        6      12      18

  x
  0    .2423   .2486   .2511    .0527   .0671   .0724    .0188   .0348   .0407
  1    .0881   .0981   .1021    .1284   .1494   .1553    .0996   .1306   .1387
  2    .0764   .0974   .1024    .1976   .2042   .2041    .2311   .2300   .2278
  3    .0964   .1182   .1193    .2322   .2093   .2022    .3007   .2515   .2395
  4    .1529   .1348   .1282    .2105   .1697   .1610    .2313   .1898   .1803
  5    .2047   .1268   .1162    .1340   .1102   .1057    .0996   .1041   .1032
  6    .1391   .0934   .0864    .0446   .0572   .0579    .0183   .0425   .0465
  7            .0525   .0526            .0234   .0266            .0130   .0168
  8            .0220   .0262            .0074   .0103            .0030   .0050
  9            .0066   .0106            .0017   .0033            .0005   .0012
 10            .0014   .0035            .0003   .0010            .0001   .0002
 11            .0002   .0010            .0000   .0002            .0000   .0000
 12            .0000   .0002            .0000   .0000            .0000   .0000
 13                    .0000                    .0000                    .0000
 14                    .0000                    .0000                    .0000
CHAPTER III
A POISSON PROCESS FORMULATION
3.1  Introduction
In the previous chapter a discrete time model led to the development of the $H_1$ distribution as a representation of the parity distribution conditional on age. In this chapter we shall show how the $H_1$ distribution results from a particular continuous time process, namely, the homogeneous compound Poisson process. This random process implies certain probability distributions for the time between births and for the waiting times to the $n$-th live birth. In particular, the inter-arrival times are independent and identically distributed.
Lundberg (1940) defines in general terms the time-homogeneous compound Poisson process and derives several interesting results which can be applied to the particular process that we shall consider. One implication of these results is that a probability model allowing for parity-specific forces of fertility may not be necessary. For the random process considered, the force of fertility, or intensity function, at any time $t$ does not depend upon the number of events (births) that occurred in the interval $(0, t)$. However, the conditional intensity of the $n$-th occurrence, given that $n - 1$ events have already occurred, is shown to be a function of the number of occurrences. This is analogous to the contagion effect discussed earlier. Thus if the time-homogeneous compound Poisson process adequately describes the parity distribution, models using parity-specific forces of fertility, such as Hoem (1970), need not be considered.
3.2  The Time-Homogeneous Process
A compound Poisson process is defined by Parzen (1962) in the following manner. Consider the stochastic process $\{X(t),\ t > 0\}$. Let
$$X(t) = \sum_{n=1}^{N(t)} Y_n \tag{3.1}$$
such that $\{Y_n,\ n = 1, 2, \ldots\}$ are independent identically distributed random variables and $\{N(t),\ t > 0\}$ is a Poisson process with intensity function $\gamma(t)$. Then $X(t)$ is said to be a compound Poisson process. The function
$$m(t) = \int_0^t \gamma(\tau)\, d\tau \tag{3.2}$$
can be defined as the mean value function of the underlying Poisson process $N(t)$.
We now define $X(t)$ to be the number of live births in the time interval $(0, t)$ and $N(t)$ to be the number of time units (e.g. months) in the interval $(0, t)$ that a woman is susceptible to the risk of a live birth conception. We assume that there is no mortality in $(0, t)$. This assumption will be relaxed later. As before, we can define $p$ as the probability of a live birth conception in a time unit given that the woman is susceptible to risk of a live birth conception. This corresponds to Nour's definition of conditional fecundability (Nour, 1972) and, in the context of the compound Poisson, implies that
$$Y_n = \begin{cases} 0 & \text{with probability } (1-p) \\ 1 & \text{with probability } p \end{cases}$$
for $n = 1, 2, \ldots$
The probability of a woman being susceptible to the risk of a live birth conception in the interval $(t,\ t + \Delta t)$ is given by
$$\gamma(t)\,\Delta t + o(\Delta t)$$
and the probability of a live birth conception, or the unconditional fecundability, in the interval $(t,\ t + \Delta t)$ can now be defined as
$$p\,\gamma(t)\,\Delta t + o(\Delta t).$$
We see that, at this stage, $\{X(t),\ t > 0\}$ is simply a Poisson process with intensity or "force of fertility" defined by
$$\phi(t) = p\,\gamma(t)$$
and the probability of $n$ live births in the interval $(0, t)$ is determined by the Poisson distribution with parameter equal to $\int_0^t \phi(\tau)\, d\tau$. Hence,
$$P_n(t) = \frac{[p\,m(t)]^{n}\, e^{-p\,m(t)}}{n!} \tag{3.3}$$
since
$$\int_0^t \phi(\tau)\, d\tau = p \int_0^t \gamma(\tau)\, d\tau = p\,m(t).$$
Hoem [1969] considers the general case of a non-homogeneous Poisson process with the force of fertility, $\phi(t)$, not specified. Our interest lies in the special case where the Poisson process $\{N(t),\ t > 0\}$ is assumed to be homogeneous so that
$$m(t) = \lambda t \tag{3.4}$$
and $p$, the conditional fecundability, is assumed to be a random variable having a Beta distribution. This last assumption allows for heterogeneity among women but does not allow for fecundability changing over time.
The conditional distribution of $X(t)$ given the value of $p$ is given by (3.3) with $m(t) = \lambda t$. That is,
$$P_n(t \mid p) = \frac{[p\lambda t]^{n}\, e^{-p\lambda t}}{n!}. \tag{3.5}$$
The unconditional probability of $n$ births in the interval $(0, t)$ is
$$P_n(t) = \int_0^1 P_n(t \mid p)\, f(p)\, dp. \tag{3.6}$$
Our assumption is that f(p) is the Beta distribution.
=fo
[Att f(b - a) f(a)
n!
feb)
e
Hence
-PAt n+a-l (1 _ p)b-a-l dp",
p
(3.7)
or,
(a)
P (t)
n
for
n = 0, I, 2, ...
parameters
a, b, and At
n
(3.8)
(b) /1 [a+n; b+n; - At]
n
This is simply the
HI
distribution with
Note that the distribution of
HI - distribution for any value of
t.
X(t)
is the
This means that the parity dis-
tribution at any arbitrary age can be represented by the
HI - distri-
bution so that, for a cohort of women, the parity distribution
•
conditional on the age of the cohort is given by the
HI - distribution
for every age.
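As a numerical illustration of (3.8), the following sketch evaluates the $H_1$ probabilities in plain Python. The helper names `hyp1f1` and `h1_pmf` and the parameter values are illustrative assumptions, not part of the original text. The confluent hypergeometric function is summed from its power series, with Kummer's transformation applied for negative arguments. The sketch checks that the probabilities sum to one and that the mean equals $\lambda t\, a/b$, as expected for a Poisson mixture in which $E(p) = a/b$.

```python
import math


def hyp1f1(a, b, z, tol=1e-15, max_terms=5000):
    """Confluent hypergeometric function 1F1(a; b; z), summed as a power series."""
    if z < 0:
        # Kummer's transformation: 1F1(a; b; z) = exp(z) * 1F1(b - a; b; -z).
        # It replaces an alternating series by one with a non-negative argument.
        return math.exp(z) * hyp1f1(b - a, b, -z, tol, max_terms)
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def h1_pmf(n, a, b, lam_t):
    """P_n(t) of (3.8): (a)_n/(b)_n * (lam_t)^n / n! * 1F1(a+n; b+n; -lam_t)."""
    log_coef = n * math.log(lam_t) - math.lgamma(n + 1)
    for j in range(n):
        log_coef += math.log(a + j) - math.log(b + j)
    return math.exp(log_coef) * hyp1f1(a + n, b + n, -lam_t)


# Hypothetical parameter values, chosen only for illustration (0 < a < b).
a, b, lam_t = 2.0, 5.0, 5.0
probs = [h1_pmf(n, a, b, lam_t) for n in range(80)]
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
print(round(total, 10), round(mean, 10))
```

When $a = b$ the mixing distribution degenerates at $p = 1$ and the same function returns ordinary Poisson probabilities, which gives a convenient sanity check.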
The general form of $P_n(t)$ given by (3.6) has been examined by Lundberg (1940) who defines a compound Poisson process as a random process that generates probabilities of the form

$$P_n(t) = \int_0^\infty \frac{(\lambda t)^n}{n!}\, e^{-\lambda t}\, dF(\lambda). \tag{3.9}$$

Lundberg's primary interest is when $F(\lambda)$ is the Gamma distribution. In this case (3.9) determines the Polya process. The case of $F(\lambda)$ as a Beta distribution is considered by Philipson (1960) and in this case (3.9) yields an $H_1$ distribution with parameters $a$, $b$, and $t$. Thus (3.8) can be considered to be an extension of Philipson's result and a special case of Lundberg's general (time-homogeneous) compound Poisson process. Further consideration of Lundberg's results will follow in section 3.3.
Returning to the compound Poisson process that yields (3.8), we note that the probability generating function of $X(t)$ is

$$g(s) = {}_1F_1[a;\, b;\, \lambda t(s-1)]. \tag{3.10}$$

Further heterogeneity among women can now be introduced by considering the probability of a woman being susceptible to the risk of a live birth conception to vary among women. That is, the parameter $\lambda$ can be considered to be a random variable with density function $f(\lambda)$. The probability generating function of $X(t)$ becomes

$$g(s) = \int_0^\infty {}_1F_1[a;\, b;\, \lambda t(s-1)]\, f(\lambda)\, d\lambda. \tag{3.11}$$

If we let

$$f(\lambda) = \frac{1}{\beta^k\, \Gamma(k)}\, \lambda^{k-1} e^{-\lambda/\beta}, \tag{3.12}$$
i.e. the Gamma distribution with parameters
p' = (6+ 0-
k
and
S,
and define
1
then the probability generating function (3.11) becomes

$$g(s) = {}_2F_1[k,\, a;\, b;\, p' t(s-1)] \tag{3.13}$$

which is the $H_2$ distribution.

3.3 The Conditional Intensity Function
This section involves the direct application of several results due to Lundberg (1940). Lundberg defines a compound Poisson process in terms of the probability density $P_n(t) = \mathrm{Prob}[n \text{ events in } (0, t)]$ as

$$P_n(t) = \int_0^\infty \frac{(\lambda t)^n}{n!}\, e^{-\lambda t}\, dU(\lambda) \tag{3.14}$$

where the parameter $\lambda$ is defined as the risk and $U(\lambda)$ is defined as the unconditional risk distribution such that

$$\int_0^\infty dU(\lambda) = 1. \tag{3.15}$$

In particular, (3.14) with $n = 0$ gives $P_0(t) = \int_0^\infty e^{-\lambda t}\, dU(\lambda)$, which has derivatives, for $n = 1, 2, \ldots$,

$$P_0^{(n)}(t) = \int_0^\infty (-\lambda)^n e^{-\lambda t}\, dU(\lambda). \tag{3.16}$$

Lundberg shows that the probability density function can be given by

$$P_n(t) = \frac{(-t)^n}{n!}\, P_0^{(n)}(t). \tag{3.17}$$
The other result of interest involves the conditional intensity function $\phi_n(t)$ which can be defined as

$$\phi_n(t) = \lim_{\Delta t \to 0} \frac{P[X(t + \Delta t) = n + 1 \mid X(t) = n]}{\Delta t} \tag{3.18}$$

where $X(t)$ is the number of events in $(0, t)$.
Lundberg also derives
(3.19)
and a recurrence relation
(3.20)
The stochastic process $\{X(t), t > 0\}$ that we are considering can be defined as

$$P_n(t) = \int_0^\infty \frac{(\lambda p t)^n}{n!}\, e^{-\lambda p t}\, dF(p) \tag{3.21}$$

where the risk is the product of a constant risk $\lambda$ and a variable risk $p$ with the unconditional risk distribution $F(p)$ being the Beta distribution with parameters $a$ and $b$. In particular

$$P_0(t) = \int_0^\infty e^{-\lambda p t}\, dF(p) \tag{3.22}$$

which has derivatives

$$P_0^{(n)}(t) = \int_0^\infty (-\lambda p)^n e^{-\lambda p t}\, dF(p). \tag{3.23}$$

Lundberg's result (3.17) immediately follows for any choice of $F(p)$.
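The conditional intensity function is defined (cf. Lundberg) as

$$\phi_n(t) = \int_0^\infty p\, dU^*(p) \tag{3.24}$$

where $U^*(p)$ is the conditional risk distribution function and

$$U^*(p) = \frac{\int_0^p \frac{(\lambda y t)^n}{n!}\, e^{-\lambda y t}\, dF(y)}{\int_0^\infty \frac{(\lambda y t)^n}{n!}\, e^{-\lambda y t}\, dF(y)}$$

or

$$U^*(p) = \frac{\int_0^p y^n e^{-\lambda y t}\, dF(y)}{\int_0^\infty y^n e^{-\lambda y t}\, dF(y)}. \tag{3.25}$$

Substituting (3.25) into (3.24) we get

$$\phi_n(t) = \frac{\int_0^\infty y^{n+1} e^{-\lambda y t}\, dF(y)}{\int_0^\infty y^n e^{-\lambda y t}\, dF(y)}. \tag{3.26}$$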
Examination of the derivatives of $P_0(t)$ in (3.23) leads to

$$\phi_n(t) = -\frac{1}{\lambda}\, \frac{P_0^{(n+1)}(t)}{P_0^{(n)}(t)}. \tag{3.27}$$

This is slightly different from (3.19) and leads to a modified recurrence relation, namely,

$$\phi_{n+1}(t) = \phi_n(t) - \frac{\phi_n'(t)}{\lambda\, \phi_n(t)}. \tag{3.28}$$

The above results hold for any distribution $F(p)$. For the compound Poisson process that yields the $H_1$ distribution for $X(t)$, $F(p)$ is assumed to be a Beta distribution with parameters $a$ and $b$. Integration of $dF(p)$ over the range $0 < p < 1$ leads to the following:

$$P_0(t) = {}_1F_1[a;\, b;\, -\lambda t], \tag{3.29}$$

$$P_0^{(n)}(t) = (-\lambda)^n\, \frac{(a)_n}{(b)_n}\; {}_1F_1[a+n;\, b+n;\, -\lambda t], \tag{3.30}$$

and

$$\phi_n(t) = \lambda\, \frac{(a+n)}{(b+n)}\; \frac{{}_1F_1[a+n+1;\, b+n+1;\, -\lambda t]}{{}_1F_1[a+n;\, b+n;\, -\lambda t]}. \tag{3.31}$$

In terms of the probabilities $P_n(t)$ given by (3.8), (3.31) can be written as

$$\phi_n(t) = \frac{n+1}{t}\, \frac{P_{n+1}(t)}{P_n(t)} \tag{3.32}$$

which also holds for Lundberg's compound Poisson process. We see that

$$\lim_{t \to \infty} \phi_n(t) = 0$$

which assures the process of a stationary distribution. However, in practice we are concerned with a finite age span and the stationary distribution is determined by the end of the reproductive period.
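The agreement between the closed form (3.31) and the probability ratio (3.32) can be checked numerically. The sketch below is only an illustration: the function names and the parameter values are assumptions, and the hypergeometric function is summed from its series rather than taken from a library. It also shows the intensity falling as $t$ grows for fixed $n$.

```python
import math


def hyp1f1(a, b, z, tol=1e-15, max_terms=5000):
    """Confluent hypergeometric function 1F1(a; b; z), summed as a power series."""
    if z < 0:
        # Kummer's transformation keeps every series term non-negative here.
        return math.exp(z) * hyp1f1(b - a, b, -z, tol, max_terms)
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def h1_pmf(n, a, b, lam, t):
    """P_n(t) of (3.8) with parameters a, b and lam * t."""
    log_coef = n * math.log(lam * t) - math.lgamma(n + 1)
    for j in range(n):
        log_coef += math.log(a + j) - math.log(b + j)
    return math.exp(log_coef) * hyp1f1(a + n, b + n, -lam * t)


def intensity_closed_form(n, t, a, b, lam):
    """phi_n(t) as written in (3.31)."""
    ratio = hyp1f1(a + n + 1, b + n + 1, -lam * t) / hyp1f1(a + n, b + n, -lam * t)
    return lam * (a + n) / (b + n) * ratio


def intensity_from_probabilities(n, t, a, b, lam):
    """phi_n(t) as written in (3.32): (n + 1)/t * P_{n+1}(t) / P_n(t)."""
    return (n + 1) / t * h1_pmf(n + 1, a, b, lam, t) / h1_pmf(n, a, b, lam, t)


# Hypothetical parameter values for illustration only.
a, b, lam = 2.0, 5.0, 0.4
for t in (1.0, 10.0, 30.0):
    row = [round(intensity_closed_form(n, t, a, b, lam), 4) for n in range(4)]
    print(t, row)
```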
3.4 Mortality

Under the assumption that mortality is independent of fertility, the parity distribution at any age is determined only by the force of fertility. This is readily seen in Hoem's (1969) general non-homogeneous Poisson process. Hoem defines the following:

$P_{mn}(x, y)$ = probability that a person will be alive at age $y$ and of parity $n$ given that the person was alive at age $x$ and of parity $m$,

$Q_{mn}(x, y)$ = probability that a person will not be alive at age $y$ but attained parity $n$ given that the person was alive at age $x$ and of parity $m$, and

$\Pi_{mn}(x, y)$ = probability that a person alive at age $x$ and of parity $m$ will have $(n - m)$ further births in the interval $(x, y)$, and

$$\Pi_{mn}(x, y) = P_{mn}(x, y) + Q_{mn}(x, y).$$

Letting $\phi(s)$ be the force of fertility and $\mu(s)$ be the force of mortality with $\phi(s)$ and $\mu(s)$ independent, the assumption of a non-homogeneous Poisson process gives

$P_{m,\,m+k}(x, y) =$
(3.33)

Note that this can be expressed as

(3.34)

which is the product of the parity distribution given survival to age $y$ and the probability of surviving the age interval $(x, y)$.
If we let age $x$ be the age of onset of the reproductive age span, and age $y$ be any age before the end of the reproductive age span with $x < y$, then the change of variables $t = y - x$ will make Hoem's model comparable to the compound Poisson process of section 3.1. In fact, (3.38) with $m = 0$ gives the probability of $n$ births in the interval as

(3.35)

This is the parity distribution of those surviving the interval $(0, t)$. For the time homogeneous compound Poisson process, we defined

$$\phi(t) = p\lambda \tag{3.36}$$

and then assumed $p$ was a random variable having a Beta distribution. Substituting (3.36) into (3.35) and integrating over the range of $p$ yields the parity distribution $P_n(t)$ given by (3.8) which was derived under the assumption of no mortality in the interval $(0, t)$. Thus the assumption of no mortality in $(0, t)$ can be replaced by the assumption that mortality and fertility are independent. $P_n(t)$ is then interpreted as the representation of the parity distribution at any age given survival to that age.
CHAPTER IV

DATA CONSIDERATIONS AND ESTIMATION

4.1 Introduction

There are two problems which must be considered in applying the models described in Chapter II to fertility analysis. The first problem involves the collection of data on fertility. Since we are concerned about models of the parity distribution at every age, cohort data are usually required to examine the appropriateness of the $H_1$ distribution. Under an assumption that fertility has not been changing over time, period or cross-sectional data may be used. It will also be assumed that age and parity are reported correctly.

The second problem is the estimation of the parameters of the model. A complication which arises is that fertility data are often given by five year age groups rather than by single years of age. Also, parity specific data are often right truncated so that the tail of the observed distribution is considered to be unknown. This chapter will be addressed to these problems.
4.2 The Parity Distribution

The parity distribution is defined as the distribution of the number of children ever born. It can be determined directly from survey or census data or it can be estimated from data on age-parity specific fertility rates. In the latter case, several assumptions must be made. We must assume only one birth per calendar year, no multiple births, and no mortality for women during the reproductive age span.

The assumptions are not as restrictive as they may first appear. It is biologically possible to have two births within a one year time period but this is not a likely event. It is even more unlikely for the two births to occur in the same calendar year. From available U.S. data (Heuser, 1967) we see that only ten births per one thousand births are multiple (e.g., twin, triplet) births. In addition, the chance of a multiple birth increases with age and parity. Since most of (U.S.) fertility occurs at the younger ages and smaller parities, the effect of multiple births is inconsequential. The assumption of no mortality for women can be replaced by a less restrictive assumption of the independence of mortality and fertility.
In order to see this we let $x$ be the exact age of women and $m$ be the birth order or parity. The following functions will be considered:

$B(x)$: the number of births to women aged $x$.
$B_m(x)$: the number of births of order $m$ to women aged $x$ and of parity $m - 1$.
$K(x)$: the number of women aged $x$.
$K_m(x)$: the number of women of parity $m$ at exact age $x$.
$F(x)$: the age-specific fertility rate (ASFR).
$F_m(x)$: the age-parity specific fertility rate (APSFR), conditional on $K_{m-1}(x)$.
$F^*_m(x)$: the unconditional age-parity specific fertility rate.
$q(x)$: the probability of a woman dying in $(x, x + 1)$.
$q_m(x)$: the probability of a woman of parity $m$ dying in $(x, x + 1)$.
$P_m(x)$: the parity distribution, or the proportion of women that are of parity $m$ at exact age $x$.
According to the above definitions, we have

$$P_m(x) = K_m(x)/K(x), \tag{4.1}$$

$$B(x) = \sum_m B_m(x), \tag{4.2}$$

and

$$K(x) = \sum_m K_m(x). \tag{4.3}$$

The fertility rates are then

$$F_m(x) = B_m(x)/K_{m-1}(x), \tag{4.4}$$

$$F^*_m(x) = B_m(x)/K(x), \tag{4.5}$$

and

$$F(x) = B(x)/K(x). \tag{4.6}$$

The notion of $F_m(x)$ as the conditional APSFR and $F^*_m(x)$ as the unconditional APSFR can be established from the relationship

$$F^*_m(x) = F_m(x)\, P_{m-1}(x). \tag{4.7}$$

It should be noted that

$$F(x) = \sum_m F^*_m(x) = \sum_m F_m(x)\, P_{m-1}(x). \tag{4.8}$$
Hence the ASFR is the sum of the corresponding unconditional APSFRs and a weighted sum of the corresponding conditional APSFRs where the weights are the parity distribution.

Parity-specific data are usually given as $F(x)$ and $F^*_m(x)$ [for example, see Whelpton and Campbell, 1960]. The $F^*_m(x)$ are defined as the APSFRs and the $F_m(x)$, when they are considered, are defined as "birth probabilities" (Shryock and Siegel, 1973). We feel it is more proper to interpret the $F_m(x)$ as conditional APSFRs rather than as probabilities.

We stated earlier that the assumption of no mortality in the reproductive age span could be relaxed. This can be seen in the development of the parity distribution as a function of the unconditional APSFR.
CASE I

Assume no mortality. Here we have

$$K(x) = K(x - 1) = K(x + 1)$$

and

$$K_{m-1}(x) = K_{m-1}(x - 1) - B_m(x - 1) + B_{m-1}(x - 1). \tag{4.9}$$

That is, the number of women aged $x$ of parity $m - 1$ is equal to the number of women aged $x - 1$ of parity $m - 1$ minus the number of women who had an $m$-th order birth in $(x - 1, x)$ plus the number of women who had an $(m - 1)$-th order birth in $(x - 1, x)$. Then, from (4.1),

$$P_{m-1}(x) = K_{m-1}(x)/K(x)$$

and, under the assumption of no mortality,

$$P_{m-1}(x) = K_{m-1}(x)/K(x - 1). \tag{4.10}$$

We then have

$$P_{m-1}(x) = \frac{K_{m-1}(x - 1) - B_m(x - 1) + B_{m-1}(x - 1)}{K(x - 1)} \tag{4.11}$$

or

$$P_{m-1}(x) = P_{m-1}(x - 1) - F^*_m(x - 1) + F^*_{m-1}(x - 1). \tag{4.12}$$

CASE II

Assume that mortality is independent of parity so that $q(x) = q_m(x)$ for all $m$. Then

$$K(x) = [1 - q(x)]\,K(x - 1) \tag{4.13}$$

and

$$K_{m-1}(x) = [1 - q(x)]\,K_{m-1}(x - 1) - [1 - q(x)]\,B_m(x - 1) + [1 - q(x)]\,B_{m-1}(x - 1). \tag{4.14}$$

Again we have that

$$P_{m-1}(x) = P_{m-1}(x - 1) - F^*_m(x - 1) + F^*_{m-1}(x - 1). \tag{4.12}$$
Thus the parity distribution at age $x$ can be obtained from the parity distribution at age $x - 1$ and the unconditional APSFR. This also leads to the calculation of the parity distribution as the differences in cumulative unconditional APSFR, which is the method found in standard texts (e.g., Shryock and Siegel, 1973). Given the parity distribution and unconditional APSFR, the conditional APSFR can be obtained from (4.7) as

$$F_m(x) = F^*_m(x)/P_{m-1}(x). \tag{4.15}$$
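The recursion (4.12) and its equivalence to differencing the cumulative unconditional APSFR can be sketched in a few lines of Python. Everything numerical here is hypothetical: the conditional rates are invented constants, the cohort is assumed to start entirely at parity 0, and the highest parity class is treated as closed (no births beyond it) to keep the example small.

```python
def step_parity_distribution(prev_dist, uncond_rates):
    """Apply (4.12) once: P_m(x) = P_m(x-1) - F*_{m+1}(x-1) + F*_m(x-1).

    prev_dist[m]    proportion of women at parity m at age x-1, m = 0..M
    uncond_rates[m] unconditional APSFR F*_m(x-1) for birth order m,
                    with uncond_rates[0] = 0.0 (there is no birth of order 0)
    """
    top = len(prev_dist) - 1
    new_dist = []
    for m in range(top + 1):
        births_in = uncond_rates[m]                            # women arriving at parity m
        births_out = uncond_rates[m + 1] if m < top else 0.0   # women leaving parity m
        new_dist.append(prev_dist[m] - births_out + births_in)
    return new_dist


# Hypothetical conditional APSFRs F_m for birth orders 1..3, constant over age.
cond_rates = [0.0, 0.20, 0.15, 0.10]
dist = [1.0, 0.0, 0.0, 0.0]         # all women at parity 0 at the onset age
cumulative = [0.0] * len(dist)      # cumulative unconditional APSFR by birth order

for age in range(15, 25):
    # (4.7): F*_m(x) = F_m(x) * P_{m-1}(x)
    uncond = [0.0] + [cond_rates[m] * dist[m - 1] for m in range(1, len(dist))]
    cumulative = [c + u for c, u in zip(cumulative, uncond)]
    dist = step_parity_distribution(dist, uncond)

# The same distribution from differences in the cumulative unconditional APSFR.
by_differences = [1.0 - cumulative[1]]
for m in range(1, len(dist)):
    next_cum = cumulative[m + 1] if m + 1 < len(dist) else 0.0
    by_differences.append(cumulative[m] - next_cum)

print([round(p, 4) for p in dist])
print([round(p, 4) for p in by_differences])
```

Both printed lists coincide, which is the property that lets five year data be used at the upper age limit of each interval.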
For data in five year age groups, the parity distribution must be given. Estimation of the parity distribution from APSFR is very difficult. The problem here is that there can be several births in a five year interval. Thus the population at risk varies in an unknown fashion during the five year period. Since the parity distribution for a five year interval is the weighted average of the parity distributions for the single year intervals, this unknown variation precludes the estimation of the parity distribution for the five year intervals.

The APSFR can be used to obtain the parity distribution at the upper age limit of the five year age interval. The cumulative unconditional APSFR is the same for five year or single year data at the upper age limit of an interval. This means that the differences in the cumulative APSFR can be used to estimate the parity distributions at the upper age limit. For example, the cumulative unconditional APSFR for age groups 15-20 and 20-25 can be used to estimate the parity distribution at age 20 and at age 25.
4.3 Estimation of Parameters

Once the parity distribution has been determined, the parameters of the $H_1$ distribution must be estimated. One procedure would be to use the method of maximum likelihood to derive estimators. However, we shall see that the maximum likelihood equations do not yield explicit solutions for the estimators and an iterative procedure must be used to determine the maximum likelihood estimates (m.l.e.). The method of moments can also be used. A look at the asymptotic relative efficiency (a.r.e.) of these two methods will indicate for what region of the parameter space the method of moments can be expected to give good results.
4.3.1 Estimation of parameters from complete data

At this point we will assume that the data are available for the complete distribution. In particular, this assumption implies that the data are not truncated. The case where the data are truncated will be considered later (§ 4.3.5).

When the parity distribution is given by single years of age, we can consider the observed data as a probability distribution conditional on age and then obtain estimates of the parameters of the $H_1$ distribution for each conditional distribution, i.e., at every age. When the parity distribution is given for five (or ten) year age groups, the same procedure can be followed. In this instance, the data are, in a sense, aggregate data since the parity distribution for a five (or ten) year age group is simply a weighted average of the parity distribution by single years where the weights are the relative age distribution within the five (or ten) year age group.
4.3.2 Maximum likelihood estimates

Consider a probability function for a discrete random variable $X$ as

$$f(x) = \Pr[X = x] \quad \text{for } x = 0, 1, 2, \ldots, N. \tag{4.16}$$

The likelihood function for a sample of size $n$ is then defined as

$$L = \prod_{i=1}^{n} f(x_i) \tag{4.17}$$

where $x_i$ is the observed value of the random variable for the $i$-th individual in the sample. If we let $n_x$ be the number in the sample such that $x_i = x$ [for $x = 0, 1, \ldots, N$], then the likelihood can be written

$$L = \prod_{x=0}^{N} \{f(x)\}^{n_x}. \tag{4.18}$$

The maximum likelihood estimators are obtained by maximizing with respect to the parameters of the distribution. They can also be obtained by maximizing the log-likelihood given by

$$\log L = \sum_{x=0}^{N} n_x \log f(x). \tag{4.19}$$
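The p.d.f. of the $H_1$ distribution is given by

$$f(x) = \frac{(a)_x}{(b)_x}\, \frac{\lambda^x}{x!}\; {}_1F_1[a + x;\, b + x;\, -\lambda] \tag{2.17}$$

for $x = 0, 1, \ldots, N$ where $N$ is the maximum reported value of the random variable in the sample. The parameters of the distribution are now denoted $\theta = (\theta_1, \theta_2, \theta_3)'$. The m.l.e. solve the equations

$$\frac{\partial \log L}{\partial \theta_i} = 0 \quad \text{for } i = 1, 2, 3. \tag{4.20}$$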
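To obtain the partial derivatives of the log likelihood function we need the following result:

$$\frac{\partial (a+x)_k}{\partial a} = \frac{\partial}{\partial a} \prod_{j=0}^{k-1} (a + x + j)$$

or

$$\frac{\partial (a+x)_k}{\partial a} = \sum_{j=0}^{k-1} \frac{(a+x)_k}{a + x + j} \tag{4.21}$$

for $k = 1, 2, \ldots$ This gives the likelihood equations as

$$\frac{\partial \log L}{\partial b} = \sum_{x=0}^{N} \frac{n_x}{{}_1F_1[a+x;\, b+x;\, -\lambda]} \sum_{k=1}^{\infty} \sum_{j=0}^{k-1} \frac{(a+x)_k}{(b+x)_k}\, \frac{(-1)^{k+1} \lambda^k}{k!\,(b + x + j)} - \sum_{x=1}^{N} \sum_{j=0}^{x-1} \frac{n_x}{b + j} = 0, \tag{4.22}$$

$$\frac{\partial \log L}{\partial a} = \sum_{x=0}^{N} \frac{n_x}{{}_1F_1[a+x;\, b+x;\, -\lambda]} \sum_{k=1}^{\infty} \sum_{j=0}^{k-1} \frac{(a+x)_k}{(b+x)_k}\, \frac{(-\lambda)^k}{k!\,(a + x + j)} + \sum_{x=1}^{N} \sum_{j=0}^{x-1} \frac{n_x}{a + j} = 0, \tag{4.23}$$

and

$$\frac{\partial \log L}{\partial \lambda} = \sum_{x=0}^{N} n_x \left\{ \frac{x}{\lambda} + \frac{1}{{}_1F_1[a+x;\, b+x;\, -\lambda]} \sum_{k=1}^{\infty} \frac{(a+x)_k}{(b+x)_k}\, \frac{(-1)^k \lambda^{k-1}}{(k-1)!} \right\} = 0. \tag{4.24}$$

These equations do not yield explicit solutions for the m.l.e. and an iterative procedure, such as that outlined in Elston and Kaplan (1972), must be used to obtain the estimates.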
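Since the likelihood equations must be solved iteratively, the quantity a numerical routine actually needs is the log-likelihood (4.19) itself. The sketch below evaluates it for the $H_1$ p.d.f. (2.17). It is not the Elston and Kaplan procedure; the function names and the sample counts are assumptions made for illustration, and any general-purpose optimizer could be applied to `log_likelihood`.

```python
import math


def hyp1f1(a, b, z, tol=1e-15, max_terms=5000):
    """Confluent hypergeometric function 1F1(a; b; z), summed as a power series."""
    if z < 0:
        # Kummer's transformation: 1F1(a; b; z) = exp(z) * 1F1(b - a; b; -z).
        return math.exp(z) * hyp1f1(b - a, b, -z, tol, max_terms)
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def h1_log_pmf(x, a, b, lam):
    """log f(x) for the H1 p.d.f. (2.17)."""
    log_coef = x * math.log(lam) - math.lgamma(x + 1)
    for j in range(x):
        log_coef += math.log(a + j) - math.log(b + j)
    return log_coef + math.log(hyp1f1(a + x, b + x, -lam))


def log_likelihood(counts, a, b, lam):
    """(4.19): sum over x of n_x * log f(x), where counts[x] is n_x."""
    return sum(n_x * h1_log_pmf(x, a, b, lam) for x, n_x in enumerate(counts))


# Hypothetical sample: number of women observed at parities 0, 1, 2, 3.
counts = [30, 40, 20, 10]
print(log_likelihood(counts, 2.0, 5.0, 3.0))
```

Setting $a = b$ reduces (2.17) to the Poisson distribution with mean $\lambda$, so the function can be checked against a Poisson log-likelihood computed directly.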
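The covariance matrix of the m.l.e. is given by the inverse of the Fisher information matrix. The $(i, j)$-th element of the Fisher information matrix is given by

$$I_{ij}(\theta) = -E\left[\frac{\partial^2 \log f(X)}{\partial \theta_i\, \partial \theta_j}\right] = E\left[\frac{\partial \log f(X)}{\partial \theta_i}\, \frac{\partial \log f(X)}{\partial \theta_j}\right]. \tag{4.25}$$

Now we have

$$\frac{\partial \log f(x)}{\partial \theta_i} = \frac{1}{f(x)}\, \frac{\partial f(x)}{\partial \theta_i} \tag{4.26}$$

so that we can write

$$I_{ij}(\theta) = \sum_{x=0}^{\infty} \frac{1}{f(x)}\, \frac{\partial f(x)}{\partial \theta_i}\, \frac{\partial f(x)}{\partial \theta_j}. \tag{4.27}$$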
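The calculation of the partial derivatives can be eased by defining

$$B_n(x) = \frac{-\lambda\,(a + x + n - 1)}{n\,(b + x + n - 1)}\, B_{n-1}(x) \tag{4.28}$$

for $n = 1, 2, \ldots$, and $B_0(x) = A_x$, where

$$A_x = \frac{\lambda\,(a + x - 1)}{x\,(b + x - 1)}\, A_{x-1} \tag{4.29}$$

for $x = 1, 2, \ldots$, and $A_0 = 1$. Then the p.d.f. can be expressed as

$$f(x) = \sum_{n=0}^{\infty} B_n(x) \tag{4.30}$$

and the partial derivatives can be written, for $x = 0, 1, 2, \ldots$, as

$$\frac{\partial f(x)}{\partial a} = \sum_{n=I_x}^{\infty} \left[\sum_{j=0}^{n+x-1} (a + j)^{-1}\right] B_n(x), \tag{4.31}$$

$$\frac{\partial f(x)}{\partial b} = -\sum_{n=I_x}^{\infty} \left[\sum_{j=0}^{n+x-1} (b + j)^{-1}\right] B_n(x), \tag{4.32}$$

and

$$\frac{\partial f(x)}{\partial \lambda} = \sum_{n=I_x}^{\infty} \frac{n + x}{\lambda}\, B_n(x). \tag{4.33}$$

Here

$$I_x = \begin{cases} 0 & \text{if } x \geq 1 \\ 1 & \text{if } x = 0. \end{cases} \tag{4.34}$$

4.3.3 Method of Moments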
Moment-type estimators for the parameters of the $H_1$ distribution can be obtained using the recurrence relation (1.32) for the ratios of factorial moments. Defining $T = (T_1, T_2, T_3)' = (b, a, \lambda)'$, some simple algebra yields

$$T = \begin{pmatrix} b \\ a \\ \lambda \end{pmatrix} = \begin{pmatrix} 2(\eta_1 - \eta_2)/(\eta_2 - 2\eta_1 + \eta_0) \\ 2\eta_0(\eta_1 - \eta_2)/(2\eta_0\eta_2 - \eta_1\eta_2 - \eta_1\eta_0) \\ (2\eta_0\eta_2 - \eta_1\eta_2 - \eta_1\eta_0)/(\eta_2 - 2\eta_1 + \eta_0) \end{pmatrix}. \tag{4.35}$$

The estimates $\hat{T} = (\hat{b}, \hat{a}, \hat{\lambda})'$ are obtained by substituting sample factorial moment ratios for the $\eta_i$ in (4.35).

The covariance matrix of $\hat{T}$, denoted $\Sigma_T$, is computed in the following manner. Let $\Sigma_1$ be the covariance matrix of the first three sample moments about the origin. Then, for a sample of size $n$,

$$\Sigma_1 = \frac{1}{n} \begin{pmatrix} \mu'_2 - \mu'_1\mu'_1 & \mu'_3 - \mu'_2\mu'_1 & \mu'_4 - \mu'_3\mu'_1 \\ \mu'_3 - \mu'_2\mu'_1 & \mu'_4 - \mu'_2\mu'_2 & \mu'_5 - \mu'_3\mu'_2 \\ \mu'_4 - \mu'_3\mu'_1 & \mu'_5 - \mu'_3\mu'_2 & \mu'_6 - \mu'_3\mu'_3 \end{pmatrix}.$$

We require the Jacobian matrices for three transformations:

$$(\mu'_1, \mu'_2, \mu'_3) \to (\mu_{(1)}, \mu_{(2)}, \mu_{(3)}), \quad (\mu_{(1)}, \mu_{(2)}, \mu_{(3)}) \to (\eta_0, \eta_1, \eta_2), \quad \text{and} \quad (\eta_0, \eta_1, \eta_2) \to (T_1, T_2, T_3). \tag{4.36}$$
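The Jacobian matrices are defined in terms of partial derivatives. For example,

$$\frac{\partial(T_1, T_2, T_3)}{\partial(\eta_0, \eta_1, \eta_2)} = \begin{pmatrix} \partial T_1/\partial\eta_0 & \partial T_1/\partial\eta_1 & \partial T_1/\partial\eta_2 \\ \partial T_2/\partial\eta_0 & \partial T_2/\partial\eta_1 & \partial T_2/\partial\eta_2 \\ \partial T_3/\partial\eta_0 & \partial T_3/\partial\eta_1 & \partial T_3/\partial\eta_2 \end{pmatrix}. \tag{4.37}$$

For the moment estimators (4.35) we have

$$J_1 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -3 & 1 \end{pmatrix}, \tag{4.38}$$

$$J_2 = \begin{pmatrix} 1 & 0 & 0 \\ -\eta_1 \mu_{(1)}^{-1} & \mu_{(1)}^{-1} & 0 \\ 0 & -\eta_2 \mu_{(2)}^{-1} & \mu_{(2)}^{-1} \end{pmatrix}, \tag{4.39}$$

and

$$J_3 = \begin{pmatrix} \dfrac{-T_1}{g} & \dfrac{2(1 + T_1)}{g} & \dfrac{-2 - T_1}{g} \\[2ex] \dfrac{2(\eta_1 - \eta_2) - (2\eta_2 - \eta_1)T_2}{2\eta_0\eta_2 - \eta_1\eta_2 - \eta_1\eta_0} & \dfrac{2\eta_0 + (\eta_0 + \eta_2)T_2}{2\eta_0\eta_2 - \eta_1\eta_2 - \eta_1\eta_0} & \dfrac{-2\eta_0 - (2\eta_0 - \eta_1)T_2}{2\eta_0\eta_2 - \eta_1\eta_2 - \eta_1\eta_0} \\[2ex] \dfrac{2\eta_2 - \eta_1 - T_3}{g} & \dfrac{2T_3 - \eta_0 - \eta_2}{g} & \dfrac{2\eta_0 - \eta_1 - T_3}{g} \end{pmatrix} \tag{4.40}$$

where

$$g = \eta_2 - 2\eta_1 + \eta_0. \tag{4.41}$$

The covariance matrix $\Sigma_T$ is given by

$$\Sigma_T = (J_3 J_2 J_1)\, \Sigma_1\, (J_3 J_2 J_1)'. \tag{4.42}$$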
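The moment estimators (4.35) are simple enough to sketch directly. The code below assumes that the ratios are $\eta_r = \mu_{(r+1)}/\mu_{(r)}$ with $\mu_{(0)} = 1$, which is the definition implied by the Jacobian (4.39) and which gives $\eta_r = \lambda(a + r)/(b + r)$ for the $H_1$ distribution. The function names and the numerical values are illustrative assumptions.

```python
def factorial_moment_ratios(counts):
    """Sample ratios eta_0, eta_1, eta_2 of successive factorial moments.

    counts[x] is the number of observations equal to x.
    """
    n = sum(counts)
    m1 = sum(x * c for x, c in enumerate(counts)) / n
    m2 = sum(x * (x - 1) * c for x, c in enumerate(counts)) / n
    m3 = sum(x * (x - 1) * (x - 2) * c for x, c in enumerate(counts)) / n
    return m1, m2 / m1, m3 / m2


def h1_moment_estimates(eta0, eta1, eta2):
    """Solve (4.35) for (b, a, lambda) given the three moment ratios."""
    g = eta2 - 2.0 * eta1 + eta0
    h = 2.0 * eta0 * eta2 - eta1 * eta2 - eta1 * eta0
    b = 2.0 * (eta1 - eta2) / g
    a = 2.0 * eta0 * (eta1 - eta2) / h
    lam = h / g
    return b, a, lam


# Round trip with hypothetical true parameters: the exact ratios
# eta_r = lam * (a + r) / (b + r) should return the parameters unchanged.
true_b, true_a, true_lam = 5.0, 2.0, 6.0
etas = [true_lam * (true_a + r) / (true_b + r) for r in range(3)]
print(h1_moment_estimates(*etas))
```

In practice the estimates can fall outside the admissible region when the sample ratios are noisy, because the denominator `g` is a second difference of the $\eta_r$ and can be close to zero.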
4.3.4 Asymptotic Relative Efficiency

For two estimation procedures for the parameters $\theta = (\theta_1, \theta_2, \theta_3)'$, the asymptotic relative efficiency (a.r.e.) is defined (Puri and Sen, 1971) as

(4.43)

where $V_1$ is the covariance matrix for the first method and $V_2$ is the covariance matrix of the estimates for the second method.

We will consider the a.r.e. of the method of moment estimators (4.35) relative to the m.l.e. of the parameters of the $H_1$ distribution. The covariance matrix of the m.l.e. is given by the inverse of the Fisher information matrix. Letting $I(\theta)$ be the Fisher information matrix, we have

(4.44)

so that

(4.45)

is the a.r.e. of the method of moments relative to m.l.e. for the $H_1$ distribution.

Table 4.1 shows the a.r.e. for several sets of the parameters $a$, $b$, and $\lambda$. With $a$ and $b$ fixed, the a.r.e. decreases as $\lambda$ increases. For $a$ and $\lambda$ fixed, the a.r.e. increases as $b$ increases. The a.r.e. also increases as $a$ increases with $\lambda$ and the ratio $a/b$ fixed. Since the mean of the $H_1$ distribution is given by

$$\mu_{H_1} = \frac{\lambda a}{b} \tag{2.42}$$

we see that the method of moments will compare favorably to the method of maximum likelihood when the mean $\mu_{H_1}$ is relatively small. The lowest a.r.e. is when $\lambda$ is large and $a$ and $b$ are small but with the ratio $a/b$ close to one so that $\mu_{H_1}$ is relatively large.
4.3.5 Truncated data

We previously considered the probability function for a discrete random variable as

$$f(x) = \Pr[X = x] \quad \text{for } x = 0, 1, 2, \ldots$$

The right-truncated distribution can be expressed as

$$f^*(x) = \Pr[X = x \mid X \leq m] \quad \text{for } x = 0, 1, \ldots, m, \tag{4.46}$$

where

$$f^*(x) = f(x) \Big/ \sum_{y=0}^{m} f(y). \tag{4.47}$$

The likelihood function for the truncated distribution is given as

$$L = \prod_{x=0}^{m} \{f^*(x)\}^{n_x}. \tag{4.48}$$

For the $H_1$ distribution, the expression

$$\sum_{y=0}^{m} f(y) = \sum_{y=0}^{m} \frac{(a)_y}{(b)_y}\, \frac{\lambda^y}{y!}\; {}_1F_1[a + y;\, b + y;\, -\lambda] \tag{4.49}$$

cannot be simplified and, again, the likelihood equations will not have an explicit solution.

Moment-type estimators can still be used for truncated data on the parity distribution. In this instance, the complete distribution is truncated by considering some maximum recorded parity, say $M$, and the frequency at $M$ becomes the number of women reporting a parity greater than or equal to $M$. The moments of the observed distribution are then computed using an arbitrary value $M^* > M$ for the mean of the last parity class. That is,

$$\mu'_r = \sum_{x=0}^{M-1} x^r f(x) + (M^*)^r f(M). \tag{4.50}$$

For example, if the maximum reported parity is $M = 6$ then $M^* = 7.5$ might be used in computing the moments. This is done in the example in a following section (§ 4.5).
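The truncation in (4.47) and the open-class moment in (4.50) can be illustrated with the observed counts reported in Table 4.4. The sketch assumes that the last class of that table is the open class "parity 6 or more" and uses $M^* = 7.5$ as in the text; the function names are illustrative.

```python
# Observed counts (in thousands) at parities 0..5 and the open last class, Table 4.4.
observed = [937, 1010, 1283, 819, 463, 404, 274]


def truncated_proportions(counts, lower, upper):
    """Proportions over parities lower..upper only, as in (4.47) and (4.51)."""
    kept = counts[lower:upper + 1]
    total = sum(kept)
    return [c / total for c in kept]


def raw_moment(counts, r, last_class_value):
    """r-th moment about the origin as in (4.50).

    The last entry of counts is an open class whose mean is taken to be
    last_class_value (the arbitrary M* of the text).
    """
    n = sum(counts)
    top = len(counts) - 1
    closed = sum((x ** r) * c for x, c in enumerate(counts[:top]))
    return (closed + (last_class_value ** r) * counts[top]) / n


right_truncated = truncated_proportions(observed, 0, 5)    # parities 0..5
doubly_truncated = truncated_proportions(observed, 1, 5)   # parities 1..5
mean = raw_moment(observed, 1, 7.5)

print([round(p, 3) for p in right_truncated])
print([round(p, 3) for p in doubly_truncated])
print(round(mean, 4))
```

The two lists of proportions agree with the "Observed" rows of Table 4.3 to within rounding of the published figures.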
Another type of truncated data which can be considered is doubly truncated data. For our purposes we need to consider

$$f^{**}(x) = f(x) \Big/ \sum_{y=1}^{m} f(y), \quad x = 1, 2, \ldots, m. \tag{4.51}$$

Here the zero-class is considered unknown as well as the upper tail of the distribution. The likelihood is given by

$$L = \prod_{x=1}^{m} \{f^{**}(x)\}^{n_x}. \tag{4.52}$$

An application of this type of likelihood is seen in the next section.
4.4 The modified $H_1$ distribution

The zero-parity class of the parity distribution may be too large due to natural sterility or due to a large proportion of unmarried women (especially at the younger ages of the reproductive age span). This problem has been observed by Singh (1963) who suggests a simple procedure for dealing with it.

Since we are concerned with the $H_1$ distribution as a model for parity specific fertility, we can define $\pi_0, \pi_1, \pi_2, \ldots$ as the probabilities of a woman having parity $0, 1, 2, \ldots$ under the $H_1$ distribution and $\pi^*_0, \pi^*_1, \pi^*_2, \ldots$ as the corresponding probabilities under the modified $H_1$ distribution. Then the modified $H_1$ distribution is defined as

$$\pi^*_0 = (1 - \alpha) + \alpha\pi_0 \quad \text{and} \quad \pi^*_i = \alpha\pi_i, \quad i = 1, 2, \ldots \tag{4.53}$$

where $(1 - \alpha)$ is interpreted as the proportion of women never susceptible to the risk of conception. The parameters of the modified $H_1$ distribution can be estimated by the method of maximum likelihood using the equation

$$L = \prod_{x=0}^{\infty} (\pi^*_x)^{n_x} \tag{4.54}$$

or, if the data are right truncated,

$$L = \prod_{x=0}^{m} \left[\pi^*_x \Big/ \sum_{y=0}^{m} \pi^*_y\right]^{n_x}. \tag{4.55}$$
Rather than estimate the parameters of both the $H_1$ and modified $H_1$ distributions, a doubly truncated distribution of the form (4.51) can be considered. Now suppose that $P_0, P_1, P_2, \ldots, P_{M-1}$ are the observed proportions at parity $0, 1, 2, \ldots, M - 1$ and $P_M$ is the proportion of parity greater than or equal to $M$. The likelihood (4.52) can be used to estimate the parameters of the $H_1$ distribution. Given these estimates, an expected distribution $\hat{P}_0, \hat{P}_1, \ldots, \hat{P}_M$ can be calculated. If the $P_i$ are from an $H_1$ distribution, we should have

$$P_i/\hat{P}_i \cong 1 \quad \text{for } i = 0, 1, 2, \ldots$$

If the $P_i$ are from the modified $H_1$ distribution, then $P_i < \hat{P}_i$ and an estimate of $\alpha$ is given by

$$\hat{\alpha}_i = P_i/\hat{P}_i, \quad \text{for } i = 1, 2, 3, \ldots, M. \tag{4.56}$$
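A minimal sketch of the modification (4.53) and of the ratio estimate (4.56) follows. The base probabilities and the value of $\alpha$ are hypothetical and stand in for fitted $H_1$ probabilities; the function names are likewise illustrative.

```python
def modify_distribution(base_probs, alpha):
    """Apply (4.53): pi*_0 = (1 - alpha) + alpha * pi_0 and pi*_i = alpha * pi_i."""
    modified = [alpha * p for p in base_probs]
    modified[0] += 1.0 - alpha
    return modified


def alpha_estimates(observed, expected):
    """Ratios of (4.56) for parities i = 1, 2, ...; parity 0 is excluded."""
    return [observed[i] / expected[i] for i in range(1, len(observed))]


# Hypothetical base distribution over parities 0..4 and a hypothetical alpha.
base = [0.10, 0.25, 0.30, 0.20, 0.15]
alpha = 0.94
modified = modify_distribution(base, alpha)

print([round(p, 4) for p in modified])
print([round(r, 4) for r in alpha_estimates(modified, base)])
```

With exact probabilities every ratio returns the same $\alpha$; with observed proportions the ratios differ by parity and some summary of them has to be chosen.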
4.5 An example

As an example of estimating the parameters of the $H_1$ distribution, we can use data presented in Keyfitz (1968, p. 389). The data are the number of children ever born to women aged 45-49, in the United States, for the year 1960 as reported in the U.N. Demographic Yearbook (1963, p. 454). Keyfitz fits a negative binomial distribution to the actual data by the method of moments.

For the $H_1$ distribution, we consider three estimation procedures: (i) maximum likelihood estimation assuming that the data are right truncated, (ii) method of moments estimation, and (iii) maximum likelihood estimation assuming the data are doubly truncated. For the modified $H_1$ distribution, we consider: (iv) maximum likelihood estimation assuming the data are right truncated and (v) maximum likelihood estimation assuming the data are doubly truncated and using (4.56) to estimate $\alpha$. The estimates are presented in Table 4.2.

The observed and fitted truncated proportions are presented in Table 4.3. Table 4.4 shows the observed number at each parity and the fitted number for the negative binomial, $H_1$, and modified $H_1$ distributions. A chi-square statistic for the goodness of fit of the various distributions can be computed but due to the large sample size the chi-square statistic would be quite large. To indicate the relative goodness of fit, the chi-square statistic can be computed from the observed and expected proportions. Let the observed proportion of parity $i$ be denoted $O_i$ and the expected proportion of parity $i$ be $E_i$; then

$$\chi^2 = 1{,}000 \times \sum_{i=0}^{n} \frac{(O_i - E_i)^2}{E_i} \tag{4.57}$$

where $n$ is the maximum reported parity. This is used as a goodness of fit index that is standardized for a hypothetical sample size of one thousand.
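The index (4.57) is easy to reproduce from the published proportions. The sketch below uses the right truncated "Observed", "Neg Bin" and "(i) $H_1$" rows of Table 4.3; the function name is an assumption. Because the inputs are proportions rounded to three decimals, the results match the tabulated indices (21.95 and 18.28) only to within rounding.

```python
def chi_square_index(observed, expected, scale=1000.0):
    """Goodness of fit index (4.57): scale * sum of (O_i - E_i)^2 / E_i."""
    return scale * sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Right truncated proportions for parities 0..5 from Table 4.3.
observed = [0.191, 0.206, 0.261, 0.167, 0.094, 0.082]
neg_bin = [0.174, 0.254, 0.232, 0.170, 0.108, 0.063]
h1_mle = [0.184, 0.237, 0.220, 0.171, 0.117, 0.071]

index_neg_bin = chi_square_index(observed, neg_bin)
index_h1 = chi_square_index(observed, h1_mle)
print(round(index_neg_bin, 2), round(index_h1, 2))
```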
The chi-square index (4.57) is presented in Tables 4.3 and 4.4 for the various distributions. If the chi-square values were interpreted as test statistics, they would show that the $H_1$ and modified $H_1$ distributions do not fit the data. This lack of fit is not necessarily due to the failure of the distributions as models for the parity distribution. The lack of fit is due in part to the data being from a population which is non-homogeneous with respect to the demographic factors which affect fertility.

The $\chi^2$ index does seem to indicate that, as an approximation to the parity distribution, the modified $H_1$ distribution is superior to either the negative binomial distribution or the $H_1$ distribution. However, it must be noted that in fitting the modified $H_1$ distribution, four parameters are estimated from six frequencies while the $H_1$ distribution has only three parameters and the negative binomial distribution has only two parameters. Thus the relative goodness of fit would be expected to be better for the modified $H_1$ distribution.
TABLE 4.1

Asymptotic Relative Efficiency (in percent) of the method of moments relative to the maximum likelihood method for the $H_1$ distribution

                                      λ
   a       b      a/b      1.5      3.0      4.5      6.0

   .5     .56     .9      50.09    29.14    19.98    16.98
          .67     .75     53.35    34.95    28.28    27.43
         1        .5      60.88    46.92    43.87    43.09
         2        .25     73.17    63.58    60.81    61.14
         5        .1      85.41    78.14    74.36    72.40
        10        .05     91.49    85.84    81.90    79.06

  2.5    2.78     .9      73.05    56.02    45.79    41.47
         3.33     .75     78.12    65.63    59.14    57.11
         5        .5      85.96    78.42    74.71    73.44
        10        .25     93.13    88.72    85.89    84.08
        25        .1      97.31    95.08    93.21    91.62
        50        .05     98.69    97.49    96.39    95.36

  4.5    5        .9      83.87    71.74    62.96    59.23
         6        .75     87.60    79.15    73.67    71.23
         9        .5      92.72    87.98    85.00    83.33
        18        .25     96.80    94.37    92.52    91.12
        45        .1      98.85    97.82    96.88    96.04
        90        .05     99.46    98.94    98.45    97.97

  6.5    7.22     .9      89.41    80.75    73.96    72.81
         8.67     .75     92.06    86.13    81.85    79.90
        13        .5      85.54    92.33    90.05    88.56
        26        .25     98.15    96.63    95.39    94.37
        65        .1      99.36    98.77    98.22    97.70
       130        .05     99.70    99.41    99.13    98.86
TABLE 4.2

Estimates of the parameters of the $H_1$ and modified $H_1$ distributions

                                Estimates
  Distribution          b        a        λ        α

  (i)    H1           4.208    1.862    5.312
  (ii)   H1           5.225    2.038    5.906
  (iii)  H1           5.913    3.114    4.589
  (iv)   mod H1       7.421    3.575    4.966    .940
  (v)    mod H1       5.913    3.114    4.589    .942

Distributions:
  (i)    $H_1$, right truncated (MLE)
  (ii)   $H_1$, right truncated (moment estimates)
  (iii)  $H_1$, doubly truncated (MLE)
  (iv)   modified $H_1$, right truncated (MLE)
  (v)    modified $H_1$, doubly truncated (MLE)
TABLE 4.3

Observed and expected truncated proportions for women aged 45-49, U.S., 1960

Right Truncated:
                                   PARITY                       Index
                     0      1      2      3      4      5        χ²

  Observed         .191   .206   .261   .167   .094   .082
  Neg Bin          .174   .254   .232   .170   .108   .063     21.95
  (i)   H1         .184   .237   .220   .171   .117   .071     18.28
  (ii)  H1         .188   .243   .221   .168   .113   .068     19.00
  (iv)  mod H1     .192   .221   .228   .178   .116   .065     15.10
  (v)   mod H1     .191   .219   .226   .179   .118   .067     15.24

Doubly Truncated:
                                   PARITY                       Index
                            1      2      3      4      5        χ²

  Observed                .254   .322   .206   .116   .102
  (iii) H1                .271   .280   .221   .146   .083     18.90
TABLE 4.4

Observed and expected number (in thousands) at a given parity for women aged 45-49, U.S., 1960

                                      PARITY                           Index
  Distribution        0      1      2      3      4      5      6       χ²

  Observed           937   1010   1283    819    463    404    274      -
  Neg Bin            842   1229   1122    821    525    306    345    24.31
  (i)   H1           886   1144   1062    826    564    344    363    22.10
  (ii)  H1           909   1175   1071    814    545    329    347    21.33
  (iii) H1           674   1140   1178    932    613    348    305    36.49
  (iv)  mod H1       944   1087   1119    876    568    319    277    14.52
  (v)   mod H1       937   1074   1110    878    577    328    287    14.69
CHAPTER V

APPLICATION TO OBSERVED PARITY DISTRIBUTIONS

5.1 Introduction

In this chapter, the $H_1$ distribution will be examined as a model for the age-specific parity distributions from various countries. As a first step, a modified $H_1$ distribution of the form (4.53) will be investigated. One parameter of this modified distribution is interpreted as the proportion of women in a population who are never susceptible to the risk of conception. This parameter converges to zero if the underlying population can be described by the $H_1$ distribution without modification.

The initial investigation indicates that the $H_1$, or modified $H_1$, distribution provides a reasonable model for high fertility populations such as Costa Rica or Guatemala. For populations with lower fertility, such as the United States, the $H_1$ distribution may not provide an adequate description of the parity distribution. For the U.S. data, a new model will be proposed.
5.2 The Expected Trend with Age

In the Poisson process formulation of the $H_1$ distribution (Chapter III), the parameter $\lambda$ is actually $\lambda' t$ where $\lambda'$ is a constant and $t$ is the time or age interval. Therefore, we expect that the estimates of $\lambda$, or $\lambda' t$, should increase monotonically for each successive age or age interval. Since the underlying Poisson process is time homogeneous, the remaining parameters should remain constant. If the parameters vary with age, we can conclude that the true underlying stochastic process may be non-homogeneous.

In the development of the $H_1$ distribution as a contagion model (Chapter II), the preliminary distribution is binomial with parameters $n$ and $p$. These parameters are then assumed to be random variables. The variable $n$, defined as the number of time units susceptible to the risk of a live birth conception, is assumed to follow a Poisson distribution with parameter $\lambda$. This parameter $\lambda$ can now be interpreted as the mean number of time units susceptible to the risk of a live birth conception. The number of time units susceptible should increase with age and so, once again, the estimates of $\lambda$ should increase with age.
The binomial paramete r
p
is de fined as the condi tional fe-
cundability or the probability of a live birth conception in a time
unit given that the woman is susceptible to the risk of conception.
Then
p
meters
is
a
assumed to follow a beta distribution (2.2) with paraand
b.
In this fonnulation of the
are not restricted to having the estimates of
stant for each age.
variable
p
111
a
distribution we
and
b
remain con-
At any given age we can interpret the random
as a type of truncated mean, that is, the average
•
84
•
fecundability over the interval from the beginning of the reproductive
age span up to the given age.
If the fecundability follows a unimodal
distribution with age, then the truncated mean will also follow a
unimodal distribution with age.
E(p) = alb.
In our model we have
Thus
the estimate of the average fecundability is given by the ratio of the
estimates of
a
and
b.
This ratio should increase with age until a
maximum value is reached and then the ratio should decrease with age.
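
The expected trends above can be traced to the moments of the compound structure. The following is a sketch only, assuming that the beta distribution (2.2) is parametrized so that $E(p) = a/b$, that is, with shape parameters $a$ and $b - a$ (an assumption that agrees with (2.42) but is not restated in this section). Conditioning on $p$ turns the Poisson number of binomial trials into a Poisson count with mean $\lambda p$:

```latex
\begin{align*}
X \mid n, p \sim \text{Binomial}(n, p), \quad n \sim \text{Poisson}(\lambda)
  \;&\Longrightarrow\; X \mid p \sim \text{Poisson}(\lambda p), \\
E(X) &= \lambda\, E(p) = \frac{\lambda a}{b}, \\
\operatorname{Var}(X) &= E(\lambda p) + \operatorname{Var}(\lambda p)
  = \frac{\lambda a}{b} + \lambda^{2}\,\frac{a(b-a)}{b^{2}(b+1)}
  = \frac{\lambda a}{b}\left\{1 + \frac{\lambda (b-a)}{b(b+1)}\right\}.
\end{align*}
```

Under this reading, growth in $\lambda$ with age raises the mean directly, while the ratio $a/b$ scales it by the average fecundability.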
5.3  High Fertility Data

The data in this section are for the countries of Costa Rica and
Guatemala (U.N. Demographic Yearbook, 1975). The parity distributions
are given for five year age intervals for the calendar year 1973. The
data are not cohort data so we need to assume that the pattern of
fertility has been fairly constant within each five year age interval.
The observed parity distributions are right truncated and are available
for rural or urban areas as well as for the entire (total) population.

The procedure of Ord (1967) can be used as a preliminary step in
examining the data. The function Ord defines is

$$U(x) = \frac{x\, f(x)}{f(x - 1)} \tag{1.18}$$

For our purposes, $x$ ($x = 1, 2, 3, \ldots$) is the observed parity and
$f(x)$ is the observed proportion of the women in the five year age
interval with parity $x$. Table 5.1 shows the function $U(x)$ for each
age group of the total population of Costa Rica and of the total
population of Guatemala.
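
Ord's function in (1.18) is simple to compute from an observed parity distribution. The Python sketch below is illustrative only: the function name `ord_u` is hypothetical, and the input is the observed distribution for Costa Rica at ages 15-19 as listed in Table 5.5.

```python
def ord_u(proportions):
    """Return Ord's U(x) = x * f(x) / f(x - 1) for x = 1, 2, ...

    `proportions[x]` is the observed proportion of women with parity x.
    Entries whose denominator f(x - 1) is zero are returned as None.
    """
    values = []
    for x in range(1, len(proportions)):
        previous = proportions[x - 1]
        values.append(x * proportions[x] / previous if previous > 0 else None)
    return values


# Observed proportions at parities 0..5 for Costa Rica (total), ages 15-19,
# as listed in Table 5.5.
costa_rica_15_19 = [0.8817, 0.0846, 0.0262, 0.0061, 0.0011, 0.0002]

for x, u in enumerate(ord_u(costa_rica_15_19), start=1):
    print(f"U({x}) = {u:.4f}")
```

Rounded to four decimals, these values agree with the first row of Table 5.1 for parities 1 to 5.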
At every age, $U(x)$ increases as $x$ increases. This indicates the
possibility of the $H_1$ distribution as a model. The relatively small
values of $U(x)$ for $x = 1$, that is, $U(1) = f(1)/f(0)$, imply that
$f(0)$ is larger than might be expected. In order to account for this,
the modified $H_1$ distribution was examined. The estimation of the
parameters of the modified $H_1$ distribution is described in section
5.3.1.

5.3.1  The Maximum Likelihood Estimates (m.l.e.)

The probabilities under the modified $H_1$ distribution can be written
as

$$\pi_0^{*} = (1 - \alpha) + \alpha\,\pi_0 \quad\text{and}\quad \pi_i^{*} = \alpha\,\pi_i, \qquad i = 1, 2, 3, \ldots \tag{4.53}$$

where the $\pi_i$ are the probabilities under the $H_1$ distribution and
$0 < \alpha < 1$. The parameters of the distribution defined by (4.53)
are $\theta' = (a, b, \lambda, \alpha)$. These parameters are estimated
for each age group. The maximum likelihood estimates (m.l.e.'s) are
computed using the direct search method outlined in Elston and Kaplan
(1972). In applying this method for several sets of data, it is apparent
that for the modified $H_1$ distribution, the likelihood surface is
sometimes very flat in the region of the m.l.e.'s. Thus the m.l.e.'s
presented here may not be the true m.l.e.'s but they are the best
estimates currently available.
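
The modified probabilities in (4.53) can be evaluated numerically. The Python example below is a minimal sketch, assuming the $H_1$ probabilities arise as described in section 5.2: binomial counts in which $n$ is Poisson with parameter $\lambda$ and $p$ is beta distributed with $E(p) = a/b$, read here as a beta distribution with shape parameters $a$ and $b - a$. The function names `h1_pmf` and `modified_h1_pmf` and the cut-off `n_max` are illustrative and do not appear in the original text.

```python
import math


def log_beta(x, y):
    """Logarithm of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def h1_pmf(x, a, b, lam, n_max=400):
    """P(X = x) for n ~ Poisson(lam), p ~ Beta(a, b - a), X | n, p ~ Binomial(n, p).

    The sum over n is truncated at n_max (an illustrative cut-off).
    """
    total = 0.0
    for n in range(x, n_max + 1):
        log_poisson = -lam + n * math.log(lam) - math.lgamma(n + 1)
        log_choose = (math.lgamma(n + 1) - math.lgamma(x + 1)
                      - math.lgamma(n - x + 1))
        # Beta-binomial probability of x successes in n trials.
        log_beta_binomial = (log_choose + log_beta(x + a, n - x + b - a)
                             - log_beta(a, b - a))
        total += math.exp(log_poisson + log_beta_binomial)
    return total


def modified_h1_pmf(x, a, b, lam, alpha):
    """Modified probabilities (4.53): extra mass (1 - alpha) at parity zero."""
    base = alpha * h1_pmf(x, a, b, lam)
    return (1.0 - alpha) + base if x == 0 else base


# Estimates for Costa Rica (total), ages 15-19, as read from Table 5.2.
a, b, lam, alpha = 1.431, 3.071, 1.115, 0.308
probabilities = [modified_h1_pmf(x, a, b, lam, alpha) for x in range(9)]
print([round(p, 4) for p in probabilities])
```

With these estimates the first two probabilities come out close to the fitted values .8817 and .0845 listed in Table 5.5; small differences are to be expected because the tabulated estimates are rounded.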
The m.l.e.'s of the parameters of the modified $H_1$ distribution are
given in Table 5.2 for Costa Rica and in Table 5.3 for Guatemala. For
Guatemala, the estimates of $\lambda$ increase for each age interval.
The same trend holds for the data from Costa Rica with the exception of
the 45-49 year age interval. This may indicate recall error in reporting
the number of children ever born in this age interval, or it may be due
to using cross-sectional rather than cohort data. The estimates of $a$
and $b$ do vary with age, indicating the possibility of an underlying
stochastic process that is non-homogeneous.

The ratio $a/b$ does appear to follow a unimodal pattern with age for
both urban populations. For the remaining populations, the ratio $a/b$
follows a unimodal pattern with age when the following age intervals are
ignored: age 30-34 for the total population of Costa Rica, age 40-44 for
rural Costa Rica, and both age 15-19 and age 40-44 for rural Guatemala.
The deviation from the unimodal pattern for these age intervals may be
due to sampling variation, to poor estimates of the parameters, or to
using cross-sectional data.

The estimates of $\alpha$ converge rapidly to 1.0 for Guatemala. In
fact, for age greater than age 25 every estimate of $\alpha$ is within
sampling variation of 1.0. The estimates of $\alpha$ for Costa Rica are,
for the most part, increasing with age. For both countries the
relatively low estimates at the younger ages may indicate that the
modified $H_1$ distribution effectively eliminates the need for dealing
with a separate function for the proportion ever married. This will have
to be investigated further.

We know that for the $H_1$ distribution we have

$$\mu_{H_1} = \frac{\lambda a}{b} \quad\text{and}\quad \sigma^2_{H_1} = \frac{\lambda a}{b}\left\{1 + \frac{\lambda (b - a)}{b (b + 1)}\right\}. \tag{2.42}$$

For the modified $H_1$ distribution it is easily established that

$$\mu_{mH_1} = \alpha\,\mu_{H_1} \quad\text{and}\quad \sigma^2_{mH_1} = \alpha\,\sigma^2_{H_1} + \alpha (1 - \alpha)\,\mu^2_{H_1}. \tag{5.1}$$

The fitted mean and variance under the modified $H_1$ distribution for
Costa Rica and Guatemala are given in Table 5.4. As expected, the mean
and variance (of the number of children ever born) increase with age
except for the age interval 45-49. This may indicate recall error in
reporting the number of children ever born or it may be due to using
cross-sectional data. It can also be seen that the mean number of
children ever born is greater for Guatemala than for Costa Rica, which
means that Guatemala has a higher level of fertility.
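
The fitted moments follow directly from (2.42) and (5.1). The Python sketch below is illustrative (the function names are hypothetical) and uses the Guatemala estimates for ages 15-19 as read from Table 5.3.

```python
def h1_moments(a, b, lam):
    """Mean and variance of the H_1 distribution, equation (2.42)."""
    mean = lam * a / b
    variance = mean * (1.0 + lam * (b - a) / (b * (b + 1.0)))
    return mean, variance


def modified_h1_moments(a, b, lam, alpha):
    """Mean and variance of the modified H_1 distribution, equation (5.1)."""
    mean, variance = h1_moments(a, b, lam)
    return alpha * mean, alpha * variance + alpha * (1.0 - alpha) * mean ** 2


# Estimates for Guatemala (total), ages 15-19, as read from Table 5.3.
mean, variance = modified_h1_moments(a=2.001, b=4.145, lam=1.135, alpha=0.556)
print(f"fitted mean = {mean:.4f}, fitted variance = {variance:.4f}")
```

The results are close to the corresponding entries of Table 5.4; the tabulated values were presumably computed from unrounded estimates, so differences in the last digit can occur.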
5.3.2  Goodness of Fit

The observed and expected parity distributions are given in Table 5.5
for Costa Rica and in Table 5.6 for Guatemala. Due to the large sample
sizes involved, no significance test for goodness of fit will be
attempted. However, we can compute a chi-square statistic for use as a
relative index of the goodness of fit, that is, the $X^2$ index (4.57).

In this section we are dealing with data in five year age intervals. We
expect that the goodness of fit should appear to be worse in the middle
ages of the reproductive age span where fertility is not constant within
the five year age interval. This is evident in Table 5.7 for the data
from Guatemala, especially for rural Guatemala. For Costa Rica, the lack
of fit tends to increase with age. This indicates either failure of the
model or failure of the assumption that fertility has not been changing
with time.

In examining the observed and expected parity distributions, we can see
that for the age intervals 40-44 and 45-49 for Costa Rica the expected
proportions exceed the observed proportions at every parity in the range
0 to 7. This pattern is repeated for the age interval 40-44 for
Guatemala. For the age interval 45-49 for Guatemala, the expected
proportions are less than the observed proportions at every parity in
the range 0 to 7. This problem is due to the data being considered right
truncated. Table 5.8 shows that in each of the above four age intervals,
the truncated probabilities are closely approximated by the truncated
modified $H_1$ probabilities, but the expected proportions truncated
(Table 5.6) are quite different from the observed proportions truncated.
It is obvious that a more refined estimation procedure for dealing with
this type of parity-specific data must be developed.
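
The comparison of truncated distributions can be sketched in a few lines of Python. This is an illustration under a stated assumption: the $X^2$ index (4.57) is defined in Chapter IV and not restated here, so the code takes it to be the Pearson-type sum of (observed - expected)^2 / expected over proportions per 1,000. That reading reproduces the value 1.81 given in Table 5.8 for Costa Rica at ages 35-39. The function names are hypothetical.

```python
def truncated_per_thousand(proportions, max_parity):
    """Rescale the proportions for parities 0..max_parity to sum to 1,000."""
    kept = proportions[: max_parity + 1]
    total = sum(kept)
    return [1000.0 * p / total for p in kept]


def chi_square_index(observed, expected):
    """Pearson-type sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Truncated proportions (per 1,000) for Costa Rica, ages 35-39, from Table 5.8.
observed = [142, 83, 115, 144, 143, 134, 124, 115]
expected = [140, 82, 113, 134, 144, 143, 131, 113]
print(round(chi_square_index(observed, expected), 2))

# The observed row recovered from the untruncated proportions in Table 5.5
# (parities 0..7 followed by the open class of 8 or more).
costa_rica_35_39 = [0.1181, 0.0693, 0.0956, 0.1204, 0.1195,
                    0.1116, 0.1038, 0.0958, 0.1655]
print([round(v) for v in truncated_per_thousand(costa_rica_35_39, 7)])
```

Dropping the open-ended class and rescaling is what separates Table 5.8 from Tables 5.5 and 5.6: the fit within parities 0 to 7 can be close even when the fitted share of the truncated class differs noticeably from the observed one.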
5.4  Low Fertility Data

The data used in this section are cohort fertility data by single years
of age for the white, the non-white, and the total U.S. population for
the birth cohort of year 1920 (Heuser et al., 1975). For the youngest
ages of the reproductive age span the parity distribution is
concentrated at the smallest parities and the estimates of the
parameters of the expected distributions cannot be computed. The
estimation procedures will begin with the parity distribution for age
20. The results for age 20 and for every third year following age 20
will be presented to give an indication of how the estimates and the
expected distributions may change with age.

Ord's function $U(x)$ is presented in Table 5.9 for ages 26 and 47. The
pattern for age 26 is similar to the pattern of $U(x)$ for Costa Rica
and Guatemala. Thus an $H_1$ or modified $H_1$ distribution may be
reasonable for the U.S. data at this age. However, for age 47 the value
of $U(x)$ for $x = 2$, that is, $U(2) = 2f(2)/f(1)$, is too large for
the white and for the total populations. Trial and error has shown that
the pattern of $U(x)$ for this age can be duplicated when a mixture of a
binomial and an $H_1$ distribution is considered.

5.4.1  A Binomial Mixture Distribution

In low fertility countries, women terminate their reproductive
experience before the end of their reproductive age span. We can think
of the population of women as comprising two groups. One group consists
of women who are biologically capable of reproduction but are
voluntarily non-fecund. The second group of women remain susceptible to
the risk of conception. It is assumed that the parity distribution of
this second group follows an $H_1$ distribution while the parity
distribution of the first group may follow a binomial distribution with
parameters $n$ and $p$. The model for the entire population is then the
mixture

$${}_{m}\pi_i = (1 - \alpha)\,{}_{B}\pi_i + \alpha\,{}_{H_1}\pi_i, \qquad i = 0, 1, 2, \ldots \tag{5.2}$$

where the ${}_{B}\pi_i$ are the probabilities under the binomial
distribution, the ${}_{H_1}\pi_i$ are the probabilities under the $H_1$
distribution, and $(1 - \alpha)$ is the proportion of women who have
effectively completed their families.

The mean and variance of the $H_1$ distribution are given by (2.42). The
mean and variance of the binomial distribution are

$$\mu_B = np \quad\text{and}\quad \sigma^2_B = np(1 - p). \tag{5.3}$$

It is not difficult to show that the mean and variance of the mixture
can be given in terms of the mean and variance of the binomial and $H_1$
distributions as

$$\mu_m = (1 - \alpha)\,\mu_B + \alpha\,\mu_{H_1}$$

and

$$\sigma^2_m = (1 - \alpha)\,\sigma^2_B + \alpha\,\sigma^2_{H_1} + \alpha (1 - \alpha)\left(\mu^2_{H_1} + \mu^2_B - 2\,\mu_{H_1}\mu_B\right). \tag{5.4}$$
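
The moments of a two-component mixture can be checked numerically. The Python sketch below is illustrative only: it uses the standard mixture identity, in which the cross term is $\alpha(1 - \alpha)(\mu_{H_1} - \mu_B)^2$, and verifies it by brute force. A second binomial stands in for the $H_1$ component purely to keep the example short; that substitution is an assumption of the sketch and not part of the model. The function names are hypothetical.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n successes under Binomial(n, p)."""
    return [comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(n + 1)]


def moments(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ... given as a list."""
    mean = sum(k * q for k, q in enumerate(pmf))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, variance


def mixture_moments(alpha, mean_b, var_b, mean_h, var_h):
    """Mean and variance of (1 - alpha) * binomial + alpha * second component."""
    mean = (1.0 - alpha) * mean_b + alpha * mean_h
    variance = ((1.0 - alpha) * var_b + alpha * var_h
                + alpha * (1.0 - alpha) * (mean_h - mean_b) ** 2)
    return mean, variance


# Brute-force check: mix the two probability functions term by term and
# compare the moments of the result with the closed-form identity.
alpha = 0.6
first = binomial_pmf(5, 0.4)     # binomial component with n = 5
second = binomial_pmf(12, 0.3)   # stand-in for the H_1 component
mixed = [(1.0 - alpha) * (first[k] if k < len(first) else 0.0) + alpha * second[k]
         for k in range(len(second))]

print(moments(mixed))
print(mixture_moments(alpha, *moments(first), *moments(second)))
```

Applied by hand to the rounded estimates in Table 5.13 for the total population at age 20 (with $n = 5$), this identity gives a mixture mean of about 0.266 and a variance of about 0.35. The mean agrees with Table 5.14, while the tabulated variance (.364) is somewhat larger, which may reflect rounding or a different form of the cross term in the original computation.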
The choice of which binomial mixture to use as a model simplifies to the
choice of a value of $n$. In Table 5.10 we see that as $n$ increases,
the mean and variance of each component increase. The $X^2$ index (4.57)
is minimized for $n = 5$. For $n = 5$ the mean of the binomial component
is closest to 2. For U.S. data, one might expect this. Therefore, for
the remainder of this section, the mixture distribution to be considered
is (5.2) with $n = 5$.

5.4.2  Maximum Likelihood Estimates

For the U.S. data, the m.l.e.'s of the parameters of the $H_1$, modified
$H_1$, and binomial mixture are presented in Tables 5.11, 5.12, and 5.13
respectively. For the $H_1$ distribution, the estimates of $\lambda$ are
increasing with age for the non-white data and for the white data, with
the exception of age 41 for the white data. If ages 35 and 41 are
ignored for the total population, then the estimates of $\lambda$
increase with age for the total population as well. The estimates of $a$
and $b$ vary considerably with age for each of the three populations.
The ratios $a/b$ are unimodal with age for the white population and, if
the ages 35 and 41 are ignored, for the total population. For the
non-white population the ratios $a/b$ tend to decrease as age increases.

The estimates for the parameters of the modified $H_1$ distribution
follow the expected patterns with few exceptions. Note that for the
white population the estimates of $a$ and $b$ get quite large for some
ages. This indicates that the $H_1$ distribution is converging to the
Poisson distribution for these ages. Again the problem of estimating the
parameters is apparent. We can be fairly confident that the estimates
for the non-white population and the total population are good estimates
but the estimates for the white population are questionable. The
problems in estimation may be due to the lack of fit of the model. This
will be discussed later.

The estimates for the binomial mixture appear to be good estimates. For
the $H_1$ component, the estimates of $\lambda$ are increasing with age
for all populations. The only exception is for the white population for
age 35 and older. Here the estimates of $\lambda$ vary somewhat about a
value of $\lambda = 4.1$, which may be the upper limit for this
particular population.

The estimates of $a$ and $b$ are certainly not equal for each age, which
indicates again the possibility of an underlying non-homogeneous
stochastic process. The ratios $a/b$ follow a unimodal pattern for each
population without exception. For the non-white population, the
relatively higher fertility population, the maximum value of $a/b$ is
reached about age 23 while for the white population, the relatively
lower fertility population, the maximum value of $a/b$ is not reached
until about age 41. The ratio $a/b$ for the total population, which is
the mixture of the white and non-white populations, reaches a maximum
about age 35. Further investigation is needed to determine if the level
of fertility can be related to the age at which the maximum value of
$a/b$ is reached.

For the binomial component of the mixture, the estimate of the parameter
$p$ is increasing with age for the white and for the total populations
while for the non-white population the pattern of $p$ with age is more
varied. The mixing parameter $\alpha$ tends to be U-shaped with age. For
the non-white population, the minimum value of $\alpha$ is reached about
age 26 while for the white population and the total population the
minimum is reached about age 32. Further interpretation of the
parameters must wait until the binomial mixture can be applied to the
parity distribution of other birth cohorts from the U.S. and other low
fertility countries.
5.4.3  Goodness of Fit

Again, no probability statements will be made to assess the goodness of
fit, or lack of fit, of the $H_1$, modified $H_1$, or binomial mixture
due to the large sample sizes involved. The $X^2$ index (4.57) will be
used as a measure of relative goodness of fit. In Table 5.14 we compare
the fitted mean and variance for the three possible models for each of
the three populations. For the total and for the white populations, the
fitted mean and variance for the mixture exceed the fitted mean and
variance for the $H_1$ and modified $H_1$ distributions at every age.
For the non-white population, the fitted mean and variance are fairly
equal for each of the three distributions examined. This helps to
explain the goodness of fit indices presented in Table 5.16.

In Table 5.16 we see that the binomial mixture is clearly superior to
the modified $H_1$ distribution. This is not due simply to the fact that
the mixture has five parameters while the modified $H_1$ distribution
has only four parameters.

The examination of moments in Table 5.14 indicates that the $H_1$ and
modified $H_1$ distributions are not flexible enough to describe a
parity distribution with a relatively low mean number of children ever
born. The mean number of children ever born for the non-white population
is only slightly larger than that of the white population but the
variance of the number of children ever born is much larger for the
non-white population. The goodness of fit of the modified $H_1$
distribution seems to imply that the modified $H_1$ distribution is a
proper model for the parity distribution when the variance of the number
of children ever born is comparatively larger than the mean. The
truncation of a woman's reproductive age span decreases both the mean
and the variance of the number of children ever born. In this case, the
binomial mixture model appears to be adequate to describe the parity
distribution. Some examples of the observed and expected parity
distributions for U.S. data are presented in Table 5.15 while the
observed and expected truncated proportions are presented in Table 5.17.
TABLE 5.1
The function $U(x)$ for the Costa Rica and Guatemala data

                                            x
Population   Age       1        2        3        4        5        6        7
COSTA RICA   15-19   .0960    .6194    .6985    .7213    .9091      *        *
(Total)      20-24   .4372   1.4582   1.6388   2.0621   2.0255   2.4000   2.1000
             25-29   .6659   2.3419   2.4424   2.9793   3.6343   4.0204   3.9525
             30-34   .6366   2.9964   3.2833   3.4159   4.4689   5.2580   5.6183
             35-39   .5868   2.7590   3.7782   3.9701   4.6695   5.5806   6.4605
             40-44   .5868   2.4875   3.5014   4.0000   4.8951   5.7654   6.8071
             45-49   .5392   2.2694   3.4024   4.0376   4.7870   5.7496   6.6444
GUATEMALA    15-19   .2055    .6524    .6519    .4602    .7692
(Total)      20-24   .7612   1.2293   1.4503   1.5388   1.6024   1.4444   3.7692
             25-29   .9975   2.6762   3.5393   3.6526   3.4568   3.7114   4.9595
             30-34   .9922   3.1509   3.6437   4.8973   4.9664   4.9541   5.7570
             35-39  1.1093   2.7670   3.4935   4.4761   5.7555   6.2176   6.4633
             40-44   .9514   2.8889   3.6046   4.7170   5.2117   6.1625   7.1349
             45-49  1.2415   2.3418   3.3634   4.4543   5.6530   6.1188   7.2869
TABLE 5.2
Maximum likelihood estimates for the parameters
of the modified $H_1$ distribution; Costa Rica

                          Parameters (M.L.E.)
Population   Age        a        b       a/b    $\lambda$  $\alpha$
All          15-19    1.431    3.071    .466     1.115     .308
             20-24    0.600    1.074    .559     2.382     .820
             25-29    1.487    2.648    .562     5.044     .889
             30-34    1.950    3.488    .559     7.754     .919
             35-39    2.093    3.679    .569     9.386     .924
             40-44    1.502    2.797    .537    10.887     .937
             45-49    1.355    2.507    .540    10.210     .920
Urban        15-19    1.010    4.786    .211     1.402     .349
             20-24    1.523    3.906    .390     2.640     .736
             25-29    1.814    2.744    .661     3.239     .856
             30-34    2.260    3.078    .734     4.498     .889
             35-39    1.338    1.793    .746     5.261     .923
             40-44    0.943    1.344    .702     5.925     .950
             45-49    0.749    1.099    .682     5.826     .947
Rural        15-19    0.238    0.632    .377     1.010     .553
             20-24    1.884    2.740    .688     2.746     .747
             25-29    0.423    0.600    .705     4.440    1.000
             30-34    1.170    1.606    .728     7.262     .947
             35-39    1.733    2.581    .671    10.893     .946
             40-44    1.199    1.703    .704    13.311     .961
             45-49    1.022    1.484    .688    11.743     .946
TABLE 5.3
Maximum likelihood estimates for the parameters
of the modified $H_1$ distribution; Guatemala

                          Parameters (M.L.E.)
Population   Age        a        b       a/b    $\lambda$  $\alpha$
All          15-19    2.001    4.145    .483     1.135     .556
             20-24    1.764    3.069    .575     2.157     .926
             25-29    0.301    0.351    .858     3.673    1.000
             30-34    0.683    0.821    .823     5.575    1.000
             35-39    1.167    1.616    .722     8.124     .996
             40-44    1.471    2.165    .680     9.245     .985
             45-49    1.186    1.815    .653    10.345    1.000
Urban        15-19    1.284    4.208    .305     1.244     .493
             20-24    0.801    1.083    .740     1.937     .830
             25-29    0.382    0.485    .787     3.176    1.000
             30-34    1.879    2.496    .753     5.265     .954
             35-39    1.655    3.038    .545     9.141     .993
             40-44    1.903    3.669    .519    10.826     .976
             45-49    1.699    3.441    .494    11.032     .987
Rural        15-19   12.082   13.040    .926      .732     .556
             20-24    0.026    0.030    .874     2.075    1.000
             25-29    0.132    0.142    .931     3.819     .991
             30-34    0.621    0.708    .877     5.921    1.000
             35-39    0.846    1.032    .819     7.630    1.000
             40-44    1.981    2.845    .696    10.286     .976
             45-49    1.071    1.416    .756    10.348    1.000
TABLE 5.4
Comparison of the fitted mean and variance of the
modified $H_1$ distribution; Costa Rica and Guatemala

                          Costa Rica              Guatemala
                       Fitted    Fitted        Fitted    Fitted
Population   Age        Mean    Variance        Mean    Variance
All          15-19     .1608     .2423         .3046     .4135
             20-24    1.0910    1.9053        1.1477    1.5303
             25-29    2.5183    4.8369        3.1514    4.3688
             30-34    3.9854    8.4128        4.6360    7.0254
             35-39    4.9320   11.2017        5.8444   11.0248
             40-44    5.4772   14.7675        6.1889   12.5640
             45-49    5.0766   14.1119        6.7603   15.3710
Urban        15-19     .1034     .1430         .1871     .2542
             20-24     .7575    1.2123        1.1893    1.7665
             25-29    1.8314    2.9349        2.5004    3.6379
             30-34    2.9354    4.8734        3.7807    5.8804
             35-39    3.6235    6.4492        4.9445   10.2165
             40-44    3.9496    7.7495        5.4818   12.3377
             45-49    3.7616    7.8743        5.3748   12.5241
Rural        15-19     .2106     .3275         .3772     .4920
             20-24    1.4109    2.4082        1.8140    2.2736
             25-29    3.1295    5.6930        3.5203    4.4564
             30-34    5.0107   10.2073        5.1947    7.4036
             35-39    6.9191   16.5657        6.2511   10.4914
             40-44    9.0010   25.4457        6.9919   13.8687
             45-49    7.6474   22.2538        7.8245   15.9985
TABLE 5.5
Observed and expected parity distributions; Costa Rica

                                               Parity
Age     Distribution     0      1      2      3      4      5      6      7     ≥8
15-19   observed       .8817  .0846  .0262  .0061  .0011  .0002  .0001  .0000  .0000
        mod $H_1$      .8817  .0845  .0258  .0061  .0012  .0002  .0000  .0000  .0000
20-24   observed       .4812  .2104  .1534  .0838  .0432  .0175  .0070  .0021  .0015
        mod $H_1$      .4816  .2113  .1497  .0881  .0429  .0176  .0062  .0019  .0007
25-29   observed       .2284  .1521  .1781  .1450  .1080  .0785  .0526  .0297  .0272
        mod $H_1$      .2292  .1581  .1650  .1476  .1158  .0805  .0501  .0281  .0257
30-34   observed       .1299  .0827  .1239  .1356  .1158  .1035  .0907  .0728  .1452
        mod $H_1$      .1336  .0923  .1179  .1292  .1269  .1135  .0931  .0704  .1231
35-39   observed       .1181  .0693  .0956  .1204  .1195  .1116  .1038  .0958  .1655
        mod $H_1$      .1095  .0636  .0879  .1046  .1125  .1114  .1025  .0879  .2201
40-44   observed       .0951  .0558  .0694  .0810  .0810  .0793  .0762  .0741  .3881
        mod $H_1$      .1092  .0669  .0802  .0888  .0934  .0941  .0908  .0837  .2930
45-49   observed       .1072  .0578  .0656  .0744  .0751  .0719  .0689  .0658  .4137
        mod $H_1$      .1354  .0734  .0842  .0909  .0940  .0934  .0888  .0807  .2594
TABLE 5.6
The observed and expected parity distribution; Guatemala

                                               Parity
Age     Distribution     0      1      2      3      4      5      6      7     ≥8
15-19   observed       .7756  .1594  .0520  .0113  .0013  .0002  .0000  .0001  .0001
        mod $H_1$      .7756  .1612  .0492  .0114  .0022  .0004  .0000  .0000  .0000
20-24   observed       .3873  .2948  .1812  .0876  .0337  .0108  .0026  .0014  .0005
        mod $H_1$      .3832  .2953  .1837  .0878  .0343  .0114  .0033  .0008  .0002
25-29   observed       .1192  .1189  .1591  .1877  .1714  .1185  .0733  .0321  .0199
        mod $H_1$      .1185  .1151  .1660  .1863  .1637  .1170  .0703  .0363  .0267
30-34   observed       .0641  .0636  .1002  .1217  .1490  .1480  .1222  .1005  .1307
        mod $H_1$      .0629  .0663  .0925  .1242  .1455  .1459  .1262  .0954  .1410
35-39   observed       .0503  .0558  .0772  .0899  .1006  .1158  .1200  .1108  .2796
        mod $H_1$      .0475  .0567  .0704  .0837  .1006  .1113  .1141  .1074  .3062
40-44   observed       .0473  .0450  .0650  .0781  .0921  .0960  .0986  .1005  .3775
        mod $H_1$      .0478  .0507  .0661  .0806  .0935  .1028  .1063  .1028  .3494
45-49   observed       .0443  .0550  .0644  .0722  .0804  .0909  .0927  .0965  .4035
        mod $H_1$      .0432  .0538  .0623  .0707  .0791  .0866  .0920  .0935  .4188
TABLE 5.7
Indices for the goodness of fit of the modified
$H_1$ distribution, Costa Rica and Guatemala
COSTA RICA
Population
All
•
Urban
Rural
•
Age
~
X· Index
GUATEMALA
2
X Index
15-19
.039
20-24
1. 332
1. 019
25-29
2.180
3.195
30-34
7.722
2.118
35-39
19.020
3.936
40-44
44.072
4.087
45-49
124.100
1.052
15-19
.302
.234
20-24
1.383
4.287
25-29
7.649
2.229
30-34
30.474
1.489
35-39
89.451
2.471
40-44
135.649
2.618
45-49
268.059
6.002
15-19
.221
20-24
.752
1.878
25-29
1.977
6.781
30-34
1. 359
17.114
35-39
1. 345
2.523
40-44
31. 638
.746
45-49
6.537
15.732
•
•
•
TABLE 5.8
Observed and expected truncated proportions (per 1,000) for selected ages, high fertility data

                                                 Parity
Country      Age     Distribution    0     1     2     3     4     5     6     7    $X^2$
Costa Rica   30-34   observed       152    97   155   159   135   121   106    85
                     expected       152   105   134   147   148   129   106    80    6.83
             35-39   observed       142    83   115   144   143   134   124   115
                     expected       140    82   113   134   144   143   131   113    1.81
             40-44   observed       155    91   113   132   132   130   125   121
                     expected       154    95   113   126   132   133   128   118    0.67
             45-49   observed       183    99   112   127   128   123   118   112
                     expected       183    99   114   123   127   126   120   108    0.43
Guatemala    30-34   observed        74    73   115   140   171   170   141   116
                     expected        73    77   108   146   169   170   147   111    1.42
             35-39   observed        70    77   107   125   140   161   167   154
                     expected        68    82   101   124   145   160   164   155    0.97
             40-44   observed        76    72   104   125   148   154   158   161
                     expected        73    78   102   124   144   158   164   158    1.12
             45-49   observed        74    92   108   121   135   152   155   162
                     expected        74    93   107   122   136   149   158   161    0.16
TABLE 5.9
The function $U(x)$ for selected ages, United States cohort data

                                        x
Population   Age      1        2        3        4        5        6
All          26     .7308   1.2711   1.2829   1.7369   2.0441   2.4173
             47    1.3916   2.9686   2.1704   2.4061   2.7093   3.4884
White        26     .7372   1.2842   1.2131   1.5453   1.8197   2.0187
             47    1.5318   3.2285   2.1662   2.3387   2.6132   3.2800
Non-white    26     .6687   1.1579   1.9861   2.8050   2.8998   3.1701
             47     .8671   1.2741   2.2735   3.1851   4.0454   4.8925
TABLE 5.10
Comparison of various binomial mixtures for
the parity distribution of age 36, U.S. (white) data

     Estimated Parameters    Binomial Component     $H_1$ Component
 n    $\alpha$     p           Mean    Variance      Mean    Variance   $X^2$ Index
 4      .481     .489         2.356     1.000       2.697     4.254       4.756
 5      .330     .410         2.050     1.210       3.007     6.039       3.141
 6      .237     .350         2.100     1.365       3.424     8.379       5.822
 7      .198     .309         2.149     1.495       3.971    11.868      15.830
 8      .195     .276         2.208     1.599       5.029    16.832      42.868
TABLE 5.11
Maximum likelihood estimates for the parameters of the $H_1$ distribution, U.S. 1920 birth cohort data

ALL
Age       a        b       a/b    $\lambda$
20      .235     .715     .329      .811
23      .791    2.227     .355     1.936
26     1.081    2.339     .462     2.471
29     1.991    3.135     .635     2.523
32     2.834    4.108     .690     2.810
35     4.276    7.085     .604     3.631
38     2.784    3.953     .704     3.328
41     0.061     .065     .940     2.534
44     2.599    3.681     .706     3.446
47     2.860    4.261     .671     3.635

WHITE
Age       a        b       a/b    $\lambda$
20      .427    2.177     .196     1.164
23      .962    2.898     .332     1.920
26     1.281    2.686     .477     2.307
29     3.043    4.472     .681     2.323
32     5.022    6.428     .781     2.474
35     5.835    7.448     .783     2.800
38     3.536    4.331     .817     2.889
41      .030     .031     .961     2.519
44     4.058    5.365     .756     3.249
47     4.904    6.879     .713     3.451

NON-WHITE
Age       a        b       a/b    $\lambda$
20      .870    2.089     .416     1.309
23      .792    2.159     .367     2.851
26      .940    6.428     .146    11.095
29      .903    5.430     .166    12.478
32      .919    5.467     .168    14.242
35      .968    7.886     .123    22.445
38      .970   11.592     .084    35.322
41      .960   12.550     .076    39.804
44      .987   14.352     .069    43.454
47      .989   15.456     .064    46.799
TABLE 5.12
Maximum likelihood estimates for the parameters of the modified $H_1$ distribution, U.S. 1920 birth cohort data

ALL
Age       a        b       a/b    $\lambda$  $\alpha$
20     2.011    4.354     .462     1.032     .559
23     1.514    3.995     .379     2.121     .855
26     1.254    2.521     .497     2.374     .964
29     3.129    4.573     .684     2.440     .958
32    11.709   15.173     .772     2.648     .945
35    28.065   34.278     .819     2.804     .944
38    15.422   21.449     .719     3.391     .953
41    11.619   16.731     .694     3.623     .953
44    12.006   20.802     .577     4.391     .955
47    11.752   21.468     .547     4.627     .957

WHITE
Age       a        b       a/b    $\lambda$  $\alpha$
20      .309     .342     .903      .520     .487
23     2.059    2.535     .582     1.403     .782
26     1.890    3.418     .553     2.161     .920
29     4.383    5.458     .818     2.033     .955
32    43.809   45.508     .963     2.091     .962
35    49.175   54.750     .898     2.528     .965
38    50.000   51.301     .975     2.502     .964
41    24.232   29.617     .818     3.070     .065
44    47.737   49.096     .972     2.603     .963
47    49.500   50.071     .988     2.562     .963

NON-WHITE
Age       a        b       a/b    $\lambda$  $\alpha$
20     1.285    2.413     .533     1.188     .862
23      .927    2.205     .420     2.668     .933
26      .912    5.598     .163     9.834    1.000
29      .936    7.793     .121    17.576    1.000
32      .978    8.510     .115    21.736    1.000
35      .981    9.771     .100    27.763    1.000
38      .933    7.232     .129    22.273    1.000
41      .946    9.904     .096    31.420    1.000
44      .974   11.166     .087    35.253    1.000
47      .963   10.724     .090    33.955    1.000
TABLE 5.13
Maximum likelihood estimates for the parameters of the mixture (5.2), U.S. 1920 birth cohort data, with n = 5

ALL
Age       a        b       a/b    $\lambda$     p     $\alpha$
20      .152    1.000     .152     1.058     .132     .790
23      .687    2.671     .257     2.408     .228     .871
26      .141     .292     .483     2.571     .213     .483
29      .092     .159     .579     3.534     .293     .318
32      .073     .113     .648     3.961     .353     .312
35      .089     .126     .705     4.139     .390     .351
38      .267     .381     .700     4.302     .409     .427
41      .409     .609     .672     4.560     .420     .468
44      .683    1.200     .569     5.336     .434     .511
47      .801    1.571     .510     6.057     .440     .523

WHITE
Age       a        b       a/b    $\lambda$     p     $\alpha$
20      .094     .177     .531      .070     .137     .703
23      .116     .377     .308     1.600     .167     .576
26      .140     .303     .462     2.374     .222     .472
29      .069     .121     .571     3.370     .299     .277
32      .030     .046     .654     3.879     .362     .261
35      .028     .038     .734     4.063     .398     .298
38      .088     .114     .772     4.142     .413     .353
41      .024     .029     .824     4.065     .411     .376
44      .086     .105     .817     4.074     .412     .397
47      .123     .152     .808     4.131     .413     .400

NON-WHITE
Age       a        b       a/b    $\lambda$     p     $\alpha$
20     1.288    2.734     .471      .786     .293     .840
23      .174     .233     .746     1.934     .084     .612
26      .181     .253     .714     3.135     .129     .544
29      .378     .652     .579     4.389     .172     .640
32      .886    2.544     .348     8.031     .144     .791
35      .935    3.011     .310    10.076     .145     .812
38     1.053    3.741     .282    12.212     .133     .799
41     1.232    4.949     .249    14.679     .123     .772
44     1.196    5.019     .238    15.605     .126     .778
47     1.074    4.266     .252    14.831     .135     .793
TABLE 5.14
Comparison of the fitted mean and variance for U.S. cohort data

Predicted mean
                 ALL                        WHITE                      NON-WHITE
Age    $H_1$  mod $H_1$ mixture    $H_1$  mod $H_1$ mixture    $H_1$  mod $H_1$ mixture
20      .267    .266    .266        .228    .229    .229        .545    .546    .545
23      .687    .687    .687        .637    .639    .638       1.046   1.046   1.046
26     1.141   1.139   1.149       1.100   1.099   1.104       1.622   1.602   1.511
29     1.602   1.598   1.649       1.582   1.588   1.615       2.075   2.127   1.937
32     1.938   1.930   2.016       1.933   1.936   2.000       2.394   2.498   2.362
35     2.191   2.168   2.290       2.194   2.191   2.286       2.755   2.787   2.677
38     2.344   2.326   2.493       2.359   2.349   2.467       2.955   2.874   2.881
41     2.382   2.397   2.550       2.420   2.423   2.543       3.046   3.002   2.961
44     2.433   2.422   2.614       2.457   2.437   2.566       2.989   3.025   3.033
47     2.440   2.425   2.665       2.461   2.439   2.573       2.993   3.048   3.101

Predicted variance
                 ALL                        WHITE                      NON-WHITE
Age    $H_1$  mod $H_1$ mixture    $H_1$  mod $H_1$ mixture    $H_1$  mod $H_1$ mixture
20      .351    .350    .364        .296    .292    .294        .680    .682    .744
23      .953    .949   1.025        .847    .835    .937       1.643   1.630   1.777
26     1.596   1.573   1.984       1.460   1.458   1.784       3.691   3.601   3.332
29     1.959   1.933   2.916       1.796   1.800   2.595       5.432   5.889   4.857
32     2.269   2.221   3.695       2.074   2.089   3.354       6.779   7.551   6.653
35     2.581   2.443   4.633       2.351   2.376   4.105       8.860   9.250   8.286
38     2.810   2.691   4.913       2.593   2.562   4.675      10.550   9.646   9.549
41     2.723   2.831   5.248       2.653   2.683   4.899      11.309  10.824  10.193
44     2.960   2.901   5.610       2.763   2.669   4.971      10.867  11.208  10.783
47     2.994   2.912   5.961       2.770   2.669   5.022      10.962  11.084  11.272
TABLE 5.15
The observed and expected parity distribution for selected ages, U.S. cohort data

                                                  Parity
Population   Age   Distribution     0      1      2      3      4      5      6     ≥7
All          26    observed       .3942  .2881  .1831  .0783  .0340  .0139  .0056  .0028
                   mod $H_1$      .3948  .2900  .1755  .0862  .0356  .0126  .0039  .0014
                   mixture        .3942  .2886  .1817  .0800  .0329  .0141  .0057  .0029
             47    observed       .1236  .1720  .2553  .1847  .1111  .0602  .0350  .0581
                   mod $H_1$      .1282  .1965  .2357  .1980  .1265  .0674  .0308  .0189
                   mixture        .1215  .1834  .2368  .1962  .1104  .0570  .0366  .0580
White        26    observed       .3976  .2931  .1882  .0761  .0294  .0107  .0036  .0013
                   mod $H_1$      .3976  .2966  .1783  .0824  .0313  .0101  .0028  .0009
                   mixture        .3974  .2934  .1866  .0781  .0283  .0107  .0039  .0018
             47    observed       .1100  .1685  .2720  .1964  .1148  .0600  .0328  .0455
                   mod $H_1$      .1135  .1938  .2453  .2071  .1312  .0665  .0281  .0146
                   mixture        .1105  .1785  .2567  .2088  .1142  .0582  .0346  .0384
Non-white    26    observed       .3722  .2489  .1441  .0954  .0669  .0388  .0205  .0132
                   mod $H_1$      .3700  .2348  .1521  .0972  .0607  .0369  .0218  .0265
                   mixture        .3722  .2486  .1439  .0956  .0660  .0396  .0202  .0139
             47    observed       .2310  .2003  .1276  .0967  .0770  .0623  .0508  .1543
                   mod $H_1$      .2399  .1818  .1396  .1074  .0823  .0629  .0478  .1383
                   mixture        .2320  .1990  .1305  .0948  .0768  .0631  .0511  .1527
TABLE 5.17
Observed and expected truncated proportions (per 1,000) for U.S. data, age 47

                                          Parity
Population   Distribution    0     1     2     3     4     5     6    $X^2$
All          observed       131   183   271   196   118    64    37
             mod $H_1$      131   200   240   202   129    69    31    8.09
             mixture        129   195   251   208   117    61    39    3.31
White        observed       115   177   285   206   120    63    34
             mod $H_1$      115   197   249   210   133    67    29    9.68
             mixture        115   186   267   217   119    61    36    2.39
Non-white    observed       273   237   151   114    91    74    60
             mod $H_1$      278   211   162   125    96    73    55    5.74
             mixture        274   235   154   112    91    74    60    0.11
CHAPTER VI
SUMMARY AND SUGGESTIONS FOR FUTURE WORK
6.1  Summary

This dissertation is concerned with the development of a probability
distribution which will describe the distribution of the number of
children ever born (the parity distribution) at any given age for a
birth cohort of women.

Chapters II and III are concerned with the development of the $H_1$
distribution (2.16) as a model to describe the parity distribution at
any age, that is, the parity distribution conditional on the age of the
women. The $H_1$ distribution can arise as either a discrete time model
or a continuous time model.

In Chapter II, the $H_1$ distribution is derived as a (discrete time)
contagion model. The development of the contagion model suggests other
distributions as possible models of the parity distribution, namely, the
$H_2$ and $H_3$ distributions. For each of the $H_1$, $H_2$, and $H_3$
distributions the probability function, the probability generating
function, and the moments of the distribution are derived. Also, for
each distribution a second order difference equation for the
probabilities is derived.

These results allow simple descriptive comparisons among the three
distributions.
In particular, it is shown, by assuming that the
means of the three distributions are equal, that the
varianc~
of the
lIZ
H distribution is less than the variance of the
3
and the variance of the
the
HZ
HI
HI
distribution
distribution is less than the variance of
distribution.
In Chapter III, the
HI
•
distribution is developed as a con-
tinuous time model. ··In the development of a continuous time fertility
model, one must consider either age-parity specific forces of fertility
or a force of fertility that is not dependent on parity.
the latter, we show how the
HI
By choosing
distribution can arise as a time
homogeneous compound Poisson process.
In this process, the force of
fertility conditional on parity is a function of parity.
Thus the
concept of parity-specific forces of fertility is retained in the
compound Poisson process without the need for a model with explicit
parity specific forces of fertility.
Chapters IV and V deal with the application of the
bution to data on the number of children ever born.
HI
distri-
•
Chapter IV shows
how age-parity specific fertility rates can be used to determine the
parity distribution at a given age.
Also considered in Chapter IV is the problem of estimating the
parameters of the H_1 distribution. The maximum likelihood equations
for the H_1 distribution do not yield an explicit solution and an
iterative procedure must be used to obtain the maximum likelihood
estimates. Method of moments estimators for the H_1 distribution are
also defined, and the asymptotic relative efficiency of the moment
estimates relative to the maximum likelihood estimates is computed.
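
As an illustration of why an iterative procedure is needed, the sketch
below solves a likelihood equation that has no explicit solution. It
does not use the H_1 probability function (2.16), which is not
reproduced in this chapter; a right truncated Poisson distribution is
substituted as a stand-in, and all function names are hypothetical. For
that stand-in the likelihood equation reduces to equating the sample
mean with the truncated mean, which is solved here by bisection.

```python
import math


def truncated_poisson_mean(lam, max_parity):
    """Mean of a Poisson(lam) distribution truncated to 0..max_parity."""
    weights = [lam ** k / math.factorial(k) for k in range(max_parity + 1)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def fit_truncated_poisson(counts, tol=1e-10):
    """Maximum likelihood estimate of lam from counts[k] = frequency of parity k.

    The truncated mean is increasing in lam, so the likelihood equation
    (sample mean = truncated mean) has one root, found by bisection.
    """
    max_parity = len(counts) - 1
    sample_mean = sum(k * n for k, n in enumerate(counts)) / sum(counts)
    if not 0.0 < sample_mean < max_parity:
        raise ValueError("sample mean must lie strictly between 0 and max parity")
    lo, hi = 1e-9, 1.0
    while truncated_poisson_mean(hi, max_parity) < sample_mean:
        hi *= 2.0  # expand the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_poisson_mean(mid, max_parity) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is used only because it is simple and cannot diverge; a
Newton-type iteration would serve equally well for this stand-in.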
Data on the parity distribution are generally censored at some
maximum reported parity. To simplify estimation procedures, the data
are considered to be right truncated. A problem arises in the zero
parity class due to the possibility of a proportion of women never
being susceptible to the risk of a live birth conception. This leads
to a modification of the zero parity class and to the corresponding
modified H_1 distribution. An example of the estimation procedures
is presented using data found in Keyfitz (1968).
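
The two adjustments described above, right truncation and a modified
zero class, can be sketched generically. The code assumes that the
modification takes the common mixture form, with a proportion omega of
women never at risk added to the zero class; the exact form used in
Chapter IV is not restated here. A Poisson probability function stands
in for the H_1 probability function, and every name is hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Stand-in base distribution; NOT the H_1 probability function."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zero_modified(base_pmf, omega):
    """Mix a point mass at zero (weight omega) with the base distribution."""
    def pmf(k):
        p = (1.0 - omega) * base_pmf(k)
        return omega + p if k == 0 else p
    return pmf


def right_truncated(pmf, max_parity):
    """Condition a distribution on parity <= max_parity."""
    total = sum(pmf(k) for k in range(max_parity + 1))

    def truncated(k):
        return pmf(k) / total if 0 <= k <= max_parity else 0.0
    return truncated


# Assumed values: 20 percent never susceptible, data reported up to parity 8.
modified = zero_modified(lambda k: poisson_pmf(k, 2.0), omega=0.2)
model = right_truncated(modified, max_parity=8)
```

Truncation renormalizes the probabilities over the reported parities, so
the fitted model is compared only with classes that are actually observed.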
In Chapter V the modified H_1 distribution is applied to the
parity distribution from the relatively high fertility countries of
Costa Rica and Guatemala. The data on the parity distributions are
available by urban or rural classification. It is found that the
modified H_1 distribution is a good approximation to the truncated
distribution for both countries and for both urban and rural
classification.
U.S. cohort data are used as an example of a relatively low
fertility country. The modified H_1 distribution does not fit the
data as well as might be expected. This is possibly due to a
proportion of women voluntarily terminating their reproductive experience
after a certain desired family size is reached. In this case, a
mixture of a binomial distribution and an H_1 distribution seems to be an
appropriate model for the parity distribution.
6.2 Suggestions for Future Research
There are three basic areas for future research.
1. Better estimation procedures. There is a need to obtain better
estimates of the parameters of the H_1 and modified H_1 distributions.
Further work must be done in regard to censored data
and to data in five year age intervals.
2. Alternative models for the parity distribution. The probability
distributions which can be examined include
(i) The H_2 and H_3 distributions,
(ii) Mixtures of the H_1 with the H_2 and H_3 distributions,
(iii) A distribution similar to the H_1 distribution but developed
from a compound Poisson process where fecundability
is non-homogeneous in time (age), and
(iv) A Poisson-Inverse Gaussian distribution. This is included
because a form of the inverse-Gaussian distribution, known
as the Hadwiger function, is useful in graduating
age-specific fertility rates.
3. Applications of the appropriate model for the parity distribution.
Once a workable model for the parity distribution is obtained,
possible applications include:
(i) analysis of cohort trends in fertility,
(ii) projection of fertility and thus of population,
(iii) simulation of fertility histories,
(iv) adjustment of defective fertility data,
(v) graduation of single year fertility data from fertility data
given in five year age intervals, and
(vi) analysis of socio-economic differences in a given population.
REFERENCES
Basu, D. (1955). "A note on the structure of a stochastic model
considered by V.M. Dandekar," Sankhya A, 15, 251-252.
Beall, G. and Rescia, R.R. (1953). "A generalization of Neyman's
contagious distributions," Biometrics 9, 354-386.
Biswas, S. (1973). "A note on the generalization of Brass' model,"
Demography 10, 459-467.
Brass, W. (1958). "The distribution of births in human populations,"
Population Studies 12, 51-72.
Chandrasekaran, G. and George, M.V. (1962). "Mechanisms underlying
the differences in fertility patterns of Bengalee women from
three socio-economic groups," Milbank Memorial Fund Quarterly 40,
59-89.
Chiang, C.L. (1968). Introduction to Stochastic Processes in
Biostatistics, John Wiley and Sons, New York.
Chiang, C.L. (1971). "A stochastic model of human fertility,"
Biometrics 27, 345-356.
Cox, P.R. (1972). Demography. Cambridge University Press, London.
Crow, E.L. and Bardwell, G.E. (1963). "Estimation of the parameters
of the hyper-Poisson distribution," in Classical and Contagious
Distributions, G.P. Patil (ed.) Pergamon Press, New York, 127-140.
Dacey, M.F. (1972). "A family of discrete probability distributions
defined by the generalized hypergeometric series," Sankhya B, 34,
243-250.
Dandekar, V.M. (1955). "Certain modified forms of Binomial and
Poisson distributions," Sankhya A, 15, 237-250.
Dharmadhikari, S.W. (1964). "A generalization of a stochastic model
considered by V.M. Dandekar," Sankhya A, 26, 31-38.
Eggenberger, F. and Polya, G. (1923). "Uber die Statistik verketteter
Vorgange," Zeitschrift fur angewandte Mathematik und Mechanik 3,
279-289.
Elderton, W.P. and Johnson, N.L. (1969). Systems of Frequency
Curves. London, Cambridge University Press.
Erdelyi, et al. (1953). Higher Transcendental Functions.
McGraw-Hill Book Company, New York.
Feller, W. (1943). "On a general class of contagious distributions,"
Annals of Mathematical Statistics 14, 389-400.
Frechet, M. (1939). "Les probabilites associees a un systeme
d'evenements compatibles et dependants 1. Evenements en nombre
fini fixe," Actualites Sci. Indust. No. 942, Hermann, Paris.
Frechet, M. (1943). "Les probabilites associees a un systeme
d'evenements compatibles et dependants 2. Evenements en nombre
fini fixe," Actualites Sci. Indust. No. 942, Hermann, Paris.
Gini, C. (1924). "Premieres recherches sur la fecundabilite de la
femme," Proceedings of the International Mathematics Conference,
Toronto 889-892.
Glasser, J. and Lachenbruch, P.A. (1968). "Observations on the
relation between the frequency and timing of intercourse and
the probability of conception," Population Studies 22, 399-407.
Greenwood, M. and Yule, G.U. (1920). "An enquiry into the nature of
a frequency distribution representation of multiple happenings
with particular reference to the occurrence of multiple attacks
of disease or of repeated accidents," Journal of Royal
Statistical Society A, 83, 255-274.
Gurland, J. (1958). "A generalized class of contagious
distributions," Biometrics 14, 229-249.
Gurland, J. and Tripathi, R. (1975). "Estimation on some extensions
of the Katz family of discrete distributions involving
hypergeometric functions," in Statistical Distributions in Scientific
Work, Vol. 1 (edited by Patil, G.P. et al.), 59-82.
Hajnal, J. (1950). "Births, marriages, and reproductivity, England
and Wales, 1938-47," in Papers of Royal Commission on Population,
Vol. II, 303-442. Reports and Selected Papers of the Statistics
Committee, His Majesty's Stationery Office, London.
Hartman, C.G. (1962). Science and the Safe Period. Williams and
Wilkins, Baltimore.
Henry, L. (1953). "Fondements theoriques des mesures de la
fecondite naturelle," Revue de l'Institut International de
Statistique 21, 133-151.
Henry, L. (1957). "Fecondite et famille-modeles mathematiques,"
Population 12, 413-444.
Henry, L. (1961a). "Fecondite et famille-modeles mathematiques,"
Population 16, 27-48.
Henry, L. (1961b). "Fecondite et famille-modeles mathematiques:
Applications numeriques," Population 16, 261-282.
Heuser, R.L. (1967). Multiple births, United States, 1964. Vital
and Health Statistics, Series 21, No. 14. National Center for
Health Statistics, Washington, D.C., U.S. Government Printing
Office.
Heuser, R.L. (1976). Cohort Fertility Tables, United States,
1917-1970, NCHS, U.S. Government Printing Office.
Heuser, R.L., Ventura, S.J., and Godley, F.H. (1970). Natality
Statistics Analysis, United States, 1965-1967. Vital and
Health Statistics Series 21, No. 19. National Center for Health
Statistics, Washington, D.C., U.S. Government Printing Office.
Hoem, J.M. (1968). "Fertility rates and reproduction rates in a
probabilistic setting," Biometrie-Praximetrie ~, 38-66.
Hoem, J.M. (1970). "Probabilistic fertility models of the life table
type," Theoretical Population Biology 1, 12-38.
Jacobson, P.H. (1959). American Marriage and Divorce, Rinehart:
New York.
James, W.H. (1961). "On the possibility of segregation in the
propensity to spontaneous abortion in the human female,"
Annals of Human Genetics 25, 207-213.
James, W.H. (1963). "Estimates of fecundability," Population Studies
17, 51-65.
Johnson, N.L. and Kotz, S. (1969). Distributions in Statistics:
Discrete Distributions. Houghton-Mifflin, Boston.
Johnson, N.L. and Kotz, S. (1970). Distributions in Statistics:
Continuous Distributions. Houghton-Mifflin, Boston.
Joshi, D.O. (1965). "Stochastic models utilized in Demography,"
World Population Conference, Vol III, United Nations: New York.
Karmel, D.H. (1950). "A note on P.K. Whelpton's calculation of
parity adjusted reproduction rates," Journal American Statistical
Association 45, 119-124.
Katti, S.K. (1966). "Interrelations among generalized distributions
and their components," Biometrics 22, 44-52.
Katz, L. (1963). "Unified treatment of a broad class of discrete
probability distributions," in Classical and Contagious Discrete
Distributions. G.P. Patil (ed.), Pergamon Press: New York
175-182.
Kemp, A.W. (1968). "A wide class of discrete distributions and the
associated differential equations," Sankhya A, 30, 401-410.
Kemp, A.W. and Kemp, C.D. (1974). "A family of discrete distributions
via their factorial moments," Communications in Statistics 3,
1187-1196.
Keyfitz, N. (1968). Introduction to the Mathematics of Population.
Addison Wesley, New York.
Lachenbruch, P.A. (1967). "Frequency and timing of intercourse and
the probability of conception," Population Studies 21, 23-31.
Lotka, A.J. and Spiegelman, M. (1940). "The trend of birth rate by
age of mother and order of birth," Journal of American
Statistical Association 35, 595-601.
Lundberg, O. (1940). On Random Processes and their Application to
Sickness and Accident Statistics. Uppsala, Sweden.
Murphy, E.M. (1965). "A generalization of stable population
techniques," unpublished PhD dissertation, Department of
Sociology, University of Chicago.
Nesbitt, R.E. (1957). Perinatal Loss in Modern Obstetrics. F.A.
Davis Company, Philadelphia.
Neyman, J. (1939). "On a new class of contagious distributions,"
Annals of Mathematical Statistics 10, 35-57.
Neyman, J. (1949a). "Contributions to the theory of the chi-square
test," Proceedings of the First Berkeley Symposium on Mathematical
Statistics and Probability, 239-273.
Neyman, J. (1949b). "On the problem of estimating the number of
schools of fish," University of California Publications in
Statistics 1, 21-36.
Nour, El-Sayed (1973). "A stochastic model for the study of human
fertility," Institute of Statistics Mimeo Series No. 879,
University of North Carolina at Chapel Hill.
Oechsli, F.W. (1975). "A population model based on a life table
that includes marriage and parity," Theoretical Population
Biology ~, 229-245.
Ord, J.K. (1967a). "On a system of discrete distributions,"
Biometrika 54, 649-656.
Ord, J.K. (1967b). "Graphical methods for a class of discrete
distributions," Journal of Royal Statistical Society, Series A,
130, 232-238.
Ord, J.K. (1972). Families of Frequency Distributions. Griffin, London.
Ord, J.K. and Patil, G.P. (1972). "Statistical modeling: an
alternative view," in Statistical Distributions in Scientific
Work, Vol. ~ (edited by Patil, et al.) 1-10.
Park, C.B. (1967). "Measuring the probability of eventually bearing
n live births: an extension of fertility tables," Proceedings
of the American Statistical Association, Social Science Section
374-382.
Park, C.B. (1976). "Lifetime probabilities of additional births by
age and parity for American women, 1935-1968, a new measure of
period fertility," Demography 13, 1-18.
Parzen, E. (1962). Stochastic Processes, Holden-Day, San Francisco.
Pathak, K.B. (1966). "A probability distribution for a number of
conceptions," Sankhya B, 28, 213-218.
Patil, G.P. and Joshi, S.W. (1968). A Dictionary and Bibliography
of Discrete Distributions. Oliver and Boyd, Edinburgh.
Pearl, R. (1933). "Factors in human fertility and their statistical
evaluation," Lancet 225, 607-611.
Perrin, E.B. and Sheps, M.C. (1964). "Human reproduction, a
stochastic process," Biometrics 20, 28-45.
Philipson, C. (1960). "The theory of confluent hypergeometric
functions and its application to compound Poisson processes,"
Skandinavisk Aktuarietidskrift 43, 136-162.
Potter, R.G. (1969). "Renewal theory and births averted," Invited
papers, London Conference, International Union for the Scientific
Study of Population.
Potter, R.G. (1970). "Births averted by contraception: an approach
through renewal theory," Theoretical Population Biology 1,
251-272.
Potter, R.G., McCann, B., and Sakoda, J.M. (1970). "Selective
fecundability and contraceptive effectiveness," Milbank
Memorial Fund Quarterly 48, 91-102.
Pyke, R. (1961a). "Markov Renewal Process: definitions and
preliminary properties," Annals of Mathematical Statistics 32,
1231-1242.
Pyke, R. (1961b). "Markov Renewal Process with finitely many states,"
Annals of Mathematical Statistics 32, 1243-1259.
Quensel, C. (1939). "Changes in fertility following birth
restriction," Skandinavisk Aktuarietidskrift 22, 177-199.
Rao, C.R. (1952). Advanced Statistical Methods in Biometric
Research. Wiley Publications, New York.
Ryder, N.B. (1965). "The measurement of fertility patterns,"
pp. 287-306 in M.C. Sheps and J.C. Ridley (eds.) Public Health
and Population Change: Current Research Issues. University of
Pittsburgh Press, Pittsburgh.
Seal, H.L. (1948). "The probability of decrements from a population:
a study in discrete random processes," Skandinavisk
Aktuarietidskrift 31, 14-45.
Shah, B.V. (1970). User's Manual for POPSIM. Research Triangle
Institute (Preliminary copy).
Sheps, M.C. (1964). "Pregnancy wastage as a factor in the analysis
of fertility," Demography 1, 111-118.
Sheps, M.C. (1965a). "The application of probability models to the
study of the pattern of human reproduction," in Sheps and
Ridley (eds.) Public Health and Population Change: Current
Research Issues, Pittsburgh.
Sheps, M.C. (1967). "The uses of stochastic models in the evolution
of population policies I: theory and approaches to data
analysis," Proceedings of Fifth Berkeley Symposium on Mathematical
Statistics and Probability.
Sheps, M.C. (1971). "A review of models for population change,"
Review of the International Statistics Institute 39, 185-196.
Sheps, M.C. and Menken, J.A. (1971). "A model for studying birth
rates given time dependent changes in the reproductive
parameters," Biometrics 27, 325-343.
Sheps, M.C., Menken, J.A., and Radich, A.P. (1969). "Probability
models for family building: an analytical review,"
Demography 6, 161-183.
Shryock, H.S., Siegel, J.S., and Associates (1973). The Methods and
Materials of Demography (rev. ed.) Washington, D.C., U.S.
Government Printing Office.
Singh, S.N. (1963). "Probability models for the variation in the
number of births per couple," Journal of the American
Statistical Association 58, 721-727.
Singh, S.N. (1964). "On the time until the first birth," Sankhya B,
26, 95-102.
Singh, S.N. (1968). "Chance mechanisms of the variation in the
number of births per couple," Journal of the American Statistical
Association 63, 209-213.
Singh, S.N. and Bhattacharya, B.N. (1970). "A generalized
probability distribution for couple fertility," Biometrics 26, 33-40.
Singh, S.N., Bhattacharya, B.N., and Yadava, R.C. (1974). "A parity
dependent model for the number of births and its applications,"
Sankhya B, 92-102.
Slater, L.J. (1966). Generalized Hypergeometric Functions.
Cambridge University Press, London.
Spiegelman, M. (1968). Introduction to Demography. Harvard
University Press, Cambridge.
Srinivasan, K. (1966). "An application of a probability model to
the study of interlive birth intervals," Sankhya B, 28, 1-8.
Tietze, C. (1961). "The effects of breast feeding on the rate of
conception," Proceedings of the International Population
Conference, New York.
United Nations (1963). Demographic Yearbook, Vol. 15, New York:
United Nations International Publications Service.
United Nations (1975). Demographic Yearbook, Vol. 27, New York:
United Nations International Publications Service.
Whelpton, P.K. (1946). "Reproduction rates adjusted for age, parity,
fecundity, and marriage," Journal of the American Statistical
Association 41, 501-516.
Whelpton, P.K. (1948). "The meaning of the 1947 baby boom," Vital
Statistics Special Reports, Selected Studies, Vol. 33, No. 1,
Washington, D.C.
Whelpton, P.K. (1957). Cohort Fertility. Princeton University
Press, Princeton.
Whelpton, P.K. and Campbell, A.A. (1960). "Fertility tables for
birth cohorts of American Women, Part I," National Office of
Vital Statistics, Special Reports, Vol. ~, No. 1.
Wicksell, S.D. (1931). "Nuptiality, fertility, and reproduction,"
Skandinavisk Aktuarietidskrift.
Whittaker, E.T. and Watson, G.N. (1927). A Course of Modern Analysis,
Cambridge University Press, London.