ESTIMATORS OF PARAMETERS IN BIOLOGICAL
MODELS OF HUMAN FERTILITY
By
Chirayath M. Suchindran
Department of Biostatistics
University of North Carolina, Chapel Hill, N. C.
Institute of Statistics Mimeo Series No. 849
OCTOBER 1972
ESTIMATORS OF PARAMETERS IN
BIOLOGICAL MODELS OF
HUMAN FERTILITY
by
Chirayath M. Suchindran
A dissertation submitted to the faculty of
the University of North Carolina at Chapel
Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Biostatistics.
Chapel Hill
1972
Approved by:
Adviser
Reader
Reader
ABSTRACT
CHIRAYATH M. SUCHINDRAN. Estimation of Parameters in Biological
Models of Human Fertility. (Under the direction of PETER A.
LACHENBRUCH.)
The biological approach to the study of human reproduction involves
construction of probability models for the number, sequence and timing
of births to couples. The reproductive process can be viewed as a
result of the interaction of the following factors: (1) fecundability,
the probability of conception in unit time, (2) fetal losses and
(3) the non-susceptibility periods due to conceptions. The proposed
study examines in detail the estimation problems in certain models
for conception times and for time to first livebirth.
As a model for the waiting time to conception, Potter and Parker
suggested a compound distribution of geometric with a single beta
distribution (with two parameters a and b) under the assumption
of heterogeneous fecundability among women. For this model we have
presented:
(1) estimates for the parameters from complete, censored, grouped
and truncated samples. Algorithms are given to obtain, where
available, the moment or moment type estimates, maximum likelihood
estimates and minimum chisquare estimates,
(2) the asymptotic variance-covariance matrices of the estimates
obtained, and
(3) the asymptotic relative efficiencies of the moment estimators
compared to the maximum likelihood estimators.
Examples are provided using the Princeton Fertility Survey data
and the Hutterite data.
Sampling experiments conducted using simulated data provide
information about the sampling distribution of the estimates.
Since the model does not fit a set of data, a modified
model with the assumption that fecundability among women is distributed
as a mixture of two beta distributions has been tried.
A model for waiting time to first livebirth is derived along the
lines proposed by George.
Simultaneous estimates of the parameters
and their asymptotic variance-covariance matrix are obtained from
complete, censored, and grouped samples using the method of maximum
likelihood.
These estimates provide information about average time
to conception, average length of non-susceptible periods due to a
conception that ends in a fetal loss, and the probability of a fetal
loss given conception.
ACKNOWLEDGMENTS
I wish to express my sincere gratitude to Drs. P. A. Lachenbruch
and Mindel C. Sheps for their immense help in the preparation of this
study. The persistent, but patient guidance and encouragement of
Dr. P. A. Lachenbruch during the research were invaluable. Dr. Mindel
C. Sheps, who first aroused my interest in Mathematical Demography
through several stimulating and thoughtful discussions, suggested the
problems studied in this research and guided me during the initial
stages of this work. I am also grateful to her for providing me the
data used in this study.
I specially thank Dr. H. Bradley Wells for his academic, professional,
and personal advice during my graduate training in the Department of
Biostatistics.
I would also like to thank Drs. R. Shachtman, B. V. Shah, and
M. J. Symons, the other members of my doctoral committee, for their
review of my work and their suggestions for improving it. I would
further like to extend my thanks to Dr. D. G. Horvitz for agreeing
to participate at my preliminary oral examination.
I am indebted to Mrs. Katie Yelverton for her assistance in
writing the computer programs for this study.
This research is made possible by the financial support of the
Carolina Population Center and the Department of Biostatistics. To
these institutions I gratefully acknowledge my appreciation.
The detailed data from the Princeton Fertility Survey were
provided by Dr. R. G. Potter. The data for the Hutterite women were
obtained by Dr. A. G. Steinberg, with the support of Grant H-03703
from the National Heart Institute.
I wish to express my thanks to Mr. N. M. Lalu and Mr. Kelvin K.
Lee for their support and encouragement.
Appreciation is extended to Mrs. Pamela Smith for her excellent
job of typing this paper.
Finally, I owe a very special debt of gratitude to Dr. Aleyamma
George, Professor of Statistics, University of Kerala, India for
stimulating my interest in statistics and for constantly encouraging
me to pursue my studies.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
Chapter
I. INTRODUCTION AND REVIEW OF LITERATURE
   1.1 Introduction
   1.2 The Reproductive Process
   1.3 General Classification of Models
   1.4 Field Observations on Some Components of the Reproductive Process
   1.5 A Review of Suggested Models
   1.6 Aims of the Present Research
   1.7 Censoring and Truncation
   1.8 The Estimation Procedures
       1.8.1 The Method of Maximum Likelihood
       1.8.2 The Method of Minimum Chisquare
II. CONCEPTION TIME MODELS-ESTIMATION FROM COMPLETE SAMPLES
   2.1 Introduction
   2.2 Model with Homogeneous Fecundability
   2.3 Model with Heterogeneous Fecundability
       2.3.1 Beta Distribution of Fecundability
             2.3.1.1 The Moment Estimators
             2.3.1.2 The Maximum Likelihood Estimators
       2.3.2 A Mixture Distribution of Fecundability
III. CONCEPTION TIME MODELS-ESTIMATION FROM CENSORED SAMPLES
   3.1 Introduction
   3.2 The Estimates from Singly Censored Data
       3.2.1 Moment Type Estimators
       3.2.2 Asymptotic Variance-Covariance Matrix of Moment Estimators
       3.2.3 Maximum Likelihood Estimators
       3.2.4 Minimum Chisquare Estimators
   3.3 Effect of Artificial Grouping
   3.4 Estimation Under Multiple Censoring
       3.4.1 Homogeneous Fecundability and Fixed Withdrawals
             3.4.1.1 The Maximum Likelihood Estimate of p
       3.4.2 Homogeneous Fecundability and Random Withdrawals
             3.4.2.1 The Maximum Likelihood Estimates
       3.4.3 Heterogeneous Fecundability and Fixed Withdrawals
       3.4.4 Heterogeneous Fecundability and Random Withdrawals
       3.4.5 Summary of Estimation Under Progressive Censoring
   3.5 Asymptotic Relative Efficiencies
   3.6 Examples
   3.7 Results from Sampling Experiments
IV. CONCEPTION TIME MODEL-SAMPLES FROM A TRUNCATED DISTRIBUTION
   4.1 Introduction
       4.1.1 A Truncated Discrete Time Model for Conception
   4.2 Estimates from Proportion Conceived
   4.3 Moment Type Estimators
       4.3.1 Large Sample Properties of Moment Estimators
   4.4 Maximum Likelihood Estimators
   4.5 Minimum Chisquare Estimators
   4.6 Asymptotic Relative Efficiencies
   4.7 Examples
   4.8 Results from Sampling
V. ESTIMATION PROBLEMS IN A MODEL FOR FIRST LIVEBIRTH INTERVAL
   5.1 Introduction
   5.2 Derivation of the Model
   5.3 The Problem of Estimation from Complete Samples
   5.4 Estimation from Censored Samples
       5.4.1 Iterative Scheme
   5.5 Estimation from Grouped Data
   5.6 Examples
VI. SUMMARY AND SUGGESTION FOR FUTURE WORK
   6.1 Summary
   6.2 Suggestions for Future Research
LIST OF REFERENCES
LIST OF TABLES
TABLE
2.1 Observed and Expected Proportion of Conceptions, by Month Under Various Models
2.2 Estimates and Their Standard Errors for Various Models
2.3 The Correlation Matrix (R) of the Estimates of the Parameters of Mixture of Two Beta Distributions
3.1 Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b and Censoring Time t = 12
3.2 Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b and Censoring Time t = 48
3.3 Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b (Complete Samples)
3.4 Asymptotic Relative Efficiency of MLE Estimates from Grouped and Ungrouped Samples (Censored Sample t = 48)
3.5 Asymptotic Relative Efficiency of MLE Estimates from Grouped and Ungrouped Samples (Censored Sample t = 48)
3.6 Estimates from Censored Samples and Their Standard Errors (Where Available) - Hutterite Data
3.7 Estimates from Censored Samples and Their Standard Errors (Where Available) - PFS Group I
3.8 Estimates from Censored Samples and Their Standard Errors (Where Available) - PFS Group II
3.9 Estimates from Grouped Data (Censored at t = 48) for PFS Group I and Group II
3.10 Observed and Expected Frequencies of Conception by Month, Hutterite Data (Censored Sample t = 6)
3.11 Observed and Expected Frequencies of Conception by Month, Hutterite Data (Censored Sample t = 12)
3.12 Observed and Expected Frequencies of Conception by Month, Hutterite Data (Censored Sample t = 24)
3.13 Observed and Expected Frequencies of Conception by Month, PFS Group II (Censored Sample t = 12)
3.14 Observed and Expected Frequencies of Conception by Month, PFS Group II (Censored Sample t = 24)
3.15 Observed and Expected Frequencies of Conception by Month, PFS Group II (Censored Sample t = 36)
3.16 Observed and Expected Frequencies of Conception by Month, PFS Group II Using Estimates from Grouped and Ungrouped Data (Censored Sample t = 48)
3.17 Observed and Expected Frequencies of Conceptions by Month, PFS Group I Using Estimates from Grouped and Ungrouped Data (Censored Sample t = 48)
3.18 Values of a, b and Mean Fecundability Used for Simulation
3.19 Summary Statistics of Simulated Samples by Censoring Time, a = 3.5, b = 10.0, p̄ = 0.2592
3.20 Summary Statistics of Simulated Samples by Censoring Time, a = 4.5, b = 12.86, p̄ = 0.2592
3.21 Summary Statistics of Simulated Samples by Censoring Time, a = 4.5, b = 11.57, p̄ = 0.28
3.22 Summary Statistics of Simulated Samples by Censoring Time, a = 5.0, b = 12.86, p̄ = 0.28
4.1 Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b, and Truncation Time, t = 12
4.2 Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b, and Truncation Time, t = 48
4.3 Estimates from Truncated Samples and Their Standard Errors (Where Available), Hutterite Data
4.4 Estimates from Truncated Samples and Their Standard Errors (Where Available) PFS Group I
4.5 Estimates from Truncated Samples and Their Standard Errors (Where Available) PFS Group II
4.6 Observed and Expected Frequencies of Conception by Month, Hutterite Data (Truncation Time = 6)
4.7 Observed and Expected Frequencies of Conception by Month, Hutterite Data (Truncation Time = 12)
4.8 Observed and Expected Frequencies of Conception by Month, PFS Data Group I (Truncation Time = 12)
4.9 Observed and Expected Frequencies of Conception for Grouped Data, PFS Group I (Truncation Time = 12)
4.10 Observed and Expected Frequencies of Conception by Month, PFS Data Group II (Truncation Time = 24)
4.11 Observed and Expected Frequencies of Conception by Month, PFS Data Group II (Truncation Time = 36)
4.12 Observed and Expected Frequencies of Conception by Month, PFS Data Group II (Truncation Time = 48)
4.13 Summary Statistics of Simulated Samples, by Truncation Time, a = 3.5, b = 10.0, p̄ = 0.2592
4.14 Summary Statistics of Simulated Samples, by Truncation Time, a = 4.5, b = 12.86, p̄ = 0.2592
4.15 Summary Statistics of Simulated Samples, by Truncation Time, a = 4.5, b = 11.57, p̄ = 0.28
4.16 Summary Statistics of Simulated Samples, by Truncation Time, a = 5.0, b = 12.86, p̄ = 0.28
5.1 Estimates and Their Standard Errors from Complete Samples
5.2 Estimates and Their Standard Errors from Censored Samples (Censoring Time = 24)
5.3 Estimates and Their Standard Errors From Grouped Samples
5.4 Observed and Expected Frequencies of Livebirths by Month, Hutterite Data
LIST OF FIGURES
FIGURE
2.1 Fitted Distributions of Fecundability
CHAPTER I
INTRODUCTION AND REVIEW OF LITERATURE
1.1 Introduction
Among approaches to the study of human fertility, the construction
of probability models for the number, sequence, and timing of births to
couples has come into prominence in the last two decades. This can be
attributed primarily to the inadequacy of conventional demographic
analysis for studying the interaction of the factors underlying the
reproductive process. Also, direct observation of the process by means
of detailed clinical and laboratory investigations on a large
population is difficult and this method does not provide data on the
distribution of the underlying variables in a population. Various kinds
of models have been suggested for the human reproductive process. This
study will examine, in detail, the estimates of the parameters involved
in two such models, i.e., a model for the waiting times to conception
suggested by Potter and Parker (1964) and a model for the waiting times
to first livebirth proposed by George (1967). For the model for
conception times, we shall consider the estimation for complete samples
as well as censored, truncated and grouped samples. A slight
modification to this model is also attempted. For the model for the
waiting times to first livebirth, we shall consider only estimates from
complete, censored and grouped samples. We will present the details of
the study plan in the latter part of this chapter after a brief review
of the suggested models.
1.2 The Reproductive Process
Human reproduction can be considered as the result of the
interaction of the following three basic factors.
1. Fecundability: - Conception in a woman depends on several
   factors such as frequency and timing of intercourse relative
   to ovulation, the characteristics of semen and ovum, and the
   use of contraceptives. These factors may depend upon the
   characteristics of the individual. Thus conception can be
   considered as a chance event and fecundability is defined as
   the probability of conception in a unit time for a fecund
   woman susceptible to conception. The unit is taken as a
   menstrual cycle or as a month considering the average length
   of a menstrual cycle to be a month.
2. Fetal loss: - All conceptions do not end in livebirths. A
   proportion ends in the death of the fetus, and the probability
   that a conception will lead to fetal loss may depend on the
   characteristics of the woman, such as age or order of
   pregnancy.
3. Non-susceptible period (or Infecundable period): - Once a
   conception occurs a woman cannot conceive again until the end
   of the pregnancy and a post partum non-susceptible period,
   the length of which depends on the outcome of the pregnancy.
   The length of this non-susceptible period may also depend on
   maternal characteristics.
1.3 General Classification of Models
Models have been developed for two different purposes.
Firstly,
models have been used to assist in interpreting observed data using
the parameters or functions of parameters of the model fitted to the
data.
Secondly, models have been proposed for the reproductive process
to study the interaction of underlying variables. In the absence of
direct observations, the latter models are especially useful for
studying the effects of the changes in the basic factors.
The types of data used for the dependent variables examined
include the following: (1) the number of births or conceptions during
a period and (2) the time to first conception or the first livebirth,
or to conceptions or livebirths of any order. The components
of fecundability are also studied using data on time of ovulation and
coital patterns.
In building models, the variables have been treated in the following
manner.
1. Time: - Time is treated as either continuous or discrete. In
   the case of the discrete treatment, the time unit is taken as
   a month. Continuous time treatment is often used as a
   mathematical convenience. Mixed models treat conception as
   occurring in discrete time and other events in continuous time.
2. Fecundability: - As a first step towards the construction of
   the models, fecundability may be assumed to be identical for
   all women in the group. Also it can be assumed to be constant
   over time. This assumption of homogeneity can be relaxed by
   introducing variability within a woman and variability
   between women.
3. Duration of the non-susceptible period: - The period of
   gestation as well as the post partum non-susceptible periods
   have been treated as fixed or as variable within a woman or
   between women according to specified distributions.
1.4 Field Observations on Some Components of the Reproductive Process
Field observations on the components of the reproductive process
are very few in number.
Such information would certainly help in
building models by providing ideas about the distribution of these
components in the human population.
Potter et al (1965) report detailed estimates of post partum
amenorrhea, abbreviated as PPA, collected from eleven Punjab villages
in India by a prospective study during a period of J21 to 5 years.
The results show that PPA increases with age. Also, a positive
correlation is observed between PPA and lactation. Jain et al (1970)
analyse data on amenorrhea and lactation collected through
retrospective surveys and report that the period of amenorrhea increases
monotonically with age and parity. S. P. Jain (1967) in a prospective
study of 1,480 urban women observes a weak correlation between
amenorrhea and lactation, but a strong positive correlation between age
and amenorrhea. Tietze (1961) and Saxena (1969) report positive
correlation between amenorrhea and lactation.
Not many reliable statistics are available on the incidence of
fetal loss.
The best available data come from the study of French and
Bierman (1962) conducted in Hawaii.
Using life table techniques, they
have derived the distribution of fetal loss by the length of gestation.
It is also believed that the incidence of fetal loss increases with age
and parity. The mean length of gestation leading to livebirths has
been observed by Gibson and McKeown (1950) and is found to be around
nine months with a standard deviation of about one month.
A few observations are available on the length of the menstrual
cycle and its variation within a woman and among women. Gunn et al
(1937), Haman (1942), Vollman (1956), Treloar et al (1967) and Brayer
et al (1969) report the range of menstrual cycles between women and the
changes in length with respect to age.
1.5 A Review of Suggested Models
The proposed probability models and the application of these
models to the study of the pattern of human reproduction have been
reviewed by Sheps (1965), Joshi (1965), and Sheps et al (1969).
As pointed out earlier, the suggested models belong to two broad
categories depending upon the dependent variables. One category treats
point events like conceptions or livebirths as the dependent variable,
whereas the other uses intervals such as time to first conception or
first livebirth or interlivebirth interval as the dependent variable.
Some of the models have been constructed for the interpretation of data,
whereas other analytic models have been constructed to study the
interrelationship between the variables.
The earliest attempt to use probability models in studying the
reproductive process seems to have been made by the Italian statistician
Gini (1924). After introducing the concept of fecundability Gini
proposes an estimate from data on the first livebirths based on certain
assumptions. Henry (1964) uses this estimate of fecundability to make
inferences about the presence of fetal loss. Pearl (1933, 1939)
introduces the pregnancy rates to study the reproductive performance.
Potter (1960), Henry (1964b) and Sheps (1965, 1966) have studied the
biases in the estimates of Gini and Pearl.
Dandekar (1955) suggests a modified binomial and a modified
Poisson distribution for the number of children born to a specified age
cohort of women in a fixed time period. In deriving these
distributions Dandekar makes the following assumptions: (1) All women are
susceptible to the risk of pregnancy in the beginning and all
conceptions lead to a livebirth. (2) Fecundability does not change with
time and is identical for all women. (3) Length of the non-susceptible
period after a conception is (m-1) units of time, m being a
constant integer. Dandekar also generalizes his model by implicitly
assuming that the process is in operation for an indefinitely long time,
and that all women are not susceptible at the beginning of the
observation period. Dandekar's models do not provide a satisfactory fit
to his data. Singh (1963, 1964) adds one more assumption to the set of
Dandekar (1955), i.e., a woman may not be exposed to the risk of
conception at any time during the observation period. Singh's models
are derived using the analogy between Neyman's (1949b) fishing model
and the reproductive process. The model gives a good fit to Dandekar's
data. Pathak (1966) modifies Singh's model by relaxing the assumption
that all fecund women are susceptible at the beginning of the
observation period. Singh and Bhattacharya (1970) also generalize Singh's
models by allowing more than one kind of pregnancy outcome. Sheps
et al (1969) have also suggested modifications to Singh's models.
James (1963) also has derived a model similar to Singh's model, but
assumes that a conception is not recorded until the end of the
non-susceptible period. Brass (1958), assuming a constant risk of
childbearing for a woman over a time t and further assuming that
the risk varies among women according to a gamma distribution, derives
a negative binomial distribution for the number of children born
during a specified time period. Later, considering only fecund women,
Brass uses a negative binomial truncated at zero to describe the
reproductive performances of selected countries. Introducing
heterogeneity into Dandekar's modified Poisson distribution, Brass also
derives a modified negative binomial distribution for the number of
children born during a time period.
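The step from a gamma-distributed risk to a negative binomial count can be checked numerically. The sketch below is illustrative only: it assumes that, given a woman's risk lam, the number of children in time t is Poisson with mean lam * t, and that lam follows a gamma distribution with hypothetical parameters `shape` and `rate`. These names, and the numerical values used, are assumptions made for the example and are not taken from Brass (1958).

```python
import math

def neg_binomial_pmf(n, shape, rate, t):
    """P(N = n) when N | lam ~ Poisson(lam * t) and lam ~ Gamma(shape, rate)."""
    q = rate / (rate + t)  # negative binomial "success" probability
    log_coef = math.lgamma(n + shape) - math.lgamma(shape) - math.lgamma(n + 1)
    return math.exp(log_coef + shape * math.log(q) + n * math.log(1.0 - q))

def mixed_poisson_pmf(n, shape, rate, t, steps=20000, upper=20.0):
    """Same probability by midpoint-rule integration of Poisson x gamma density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h  # midpoint of the i-th subinterval, always > 0
        log_pois = -lam * t + n * math.log(lam * t) - math.lgamma(n + 1)
        log_gam = (shape * math.log(rate) + (shape - 1.0) * math.log(lam)
                   - rate * lam - math.lgamma(shape))
        total += math.exp(log_pois + log_gam) * h
    return total

# Hypothetical values: shape 2, rate 4 (mean risk 0.5 per unit time), t = 5.
for n in range(4):
    closed = neg_binomial_pmf(n, 2.0, 4.0, 5.0)
    numeric = mixed_poisson_pmf(n, 2.0, 4.0, 5.0)
    print(n, round(closed, 6), round(numeric, 6))
```

The closed form and the numerical mixture should agree to several decimal places, which is the sense in which heterogeneity of risk among women turns a Poisson count into a negative binomial one.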
L. Henry, in a series of papers (Henry 1953, 1957, 1961a, b)
provides various analytical models for births and conceptions. Henry
(1953), under a set of assumptions, obtains the expected number of
births of a specified order in a time interval (t, t+δt). Sheps et al
(1969) show that Henry's expression is a particular result from a
modified renewal process in which the first interval has a distribution
different from the rest of the intervals. Sheps and Perrin (1966)
have developed analytic models for the rth conception that leads to
a livebirth, by allowing more than one pregnancy outcome. The
discussions are mostly restricted to the homogeneous case, where the
underlying parameters are held constant and identical for all women. This
enables use of the theories of renewal process and Markov chains.
Sheps and Menken (1971) propose a time dependent model on the lines of
Markov chains to study the short term effects of changes in
reproductive behaviour. The conclusions from this model are derived by way
of computer simulation.
Models have been suggested to study fertility determinants using
the following duration variables: (1) time to first conception or
first livebirth, (2)
interlivebirth interval (sometimes referred to as
closed interval), (3) time to a birth of a specified order, and (4)
open intervals defined as the time elapsed since a livebirth to a
specified point in time.
Using Gini's concept of fecundability, Henry (1953, 1957) suggests
some models to interpret interval data. Potter and Parker (1964) use
the same concept to build a model for time to first conception. We
will describe this model in detail in Chapter II. Sheps (1964a) also
obtains a model for time to first conception under the assumption of
heterogeneous fecundability. Using this model it has been shown that
the conditional probability of conceiving at time t, given a woman
has not conceived until that time, decreases with time. This result
follows from the fact that in the absence of contraception, the
fecundability composition of the cohort changes rapidly by virtue of the
tendency of pregnancy to select out the most fecund and to leave behind
an increasingly subfecund group.
Potter, McCann and Sakoda (1970)
using the beta distribution for fecundability have studied the nature
of selective fecundity in the presence of contraception of various
effectiveness.
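The selection effect described above can be illustrated with a small calculation. The sketch below assumes, for illustration, that each woman's monthly fecundability p is constant over time and that p varies among women as a Beta(a, b) variable with density proportional to p^(a-1) (1-p)^(b-1), with no contraception. Under that assumption the conditional probability of conceiving in month t, given no conception earlier, works out to a / (a + b + t - 1), which falls as t grows. The function names and the parameter values are hypothetical choices for the example.

```python
import math

def log_beta(x, y):
    """Logarithm of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def conception_pmf(t, a, b):
    """P(T = t), t = 1, 2, ..., for geometric waiting times mixed over Beta(a, b)."""
    return math.exp(log_beta(a + 1.0, b + t - 1.0) - log_beta(a, b))

def survival(t, a, b):
    """P(T > t): probability of no conception in the first t months."""
    return math.exp(log_beta(a, b + t) - log_beta(a, b))

def conditional_conception_prob(t, a, b):
    """P(T = t | T > t - 1); algebraically equal to a / (a + b + t - 1)."""
    return conception_pmf(t, a, b) / survival(t - 1, a, b)

a, b = 3.5, 10.0  # illustrative values only
hazards = [conditional_conception_prob(t, a, b) for t in range(1, 13)]
for month, h in enumerate(hazards, start=1):
    print(month, round(h, 4))
```

In the first month the conditional probability equals the mean fecundability a / (a + b); in later months it is lower because the women still waiting are, on average, the less fecund ones.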
Srinivasan (1966), George (1967), George and Pillai (1969) have
formulated models for the time interval between any two successive
livebirths.
By a slightly different treatment, the interval between
the date of marriage and the first livebirth can also be included in
this category.
The derivation of these models will be described in
detail in Chapter V.
The models have been fitted to a set of data.
The treatments of Srinivasan and George differ in specifying the
distributions of the components of the interval. Henry (1961a) has derived
a relationship between fertility and interval between births during a
defined period under very restricted assumptions. Wolfers (1968) argues
that mean birth intervals using the number of births occurring in a
specified time cannot be used in investigating the characteristics of
women, as these means do not represent all women equally.
An open interval for a specific parity is defined as the time
elapsed since the birth to a fixed point in time. Srinivasan (1968)
derives the distribution of open intervals under a set of assumptions.
By definition the open interval is analogous to the backward recurrence
time in renewal theory, and the distribution and moments in the
stationary case are available in standard text books (Cox 1962). Sheps
and Menken (1970, 1972a) have studied the distribution of birth
intervals in a stable population. Their study is motivated by the fact
that observed distributions of the lengths of intervals between
successive births are considerably affected by the method of ascertainment
defining the group investigated, the kind of data obtained, the size and
composition of the population, and the effect of competing risks such
as death, marriage and dissolution of marriage. Venkatacharya (1969a, b)
and Sheps et al (1967, 1970) have studied the biases in interval data.
They have noted that the bias is greater when the effect of the basic
factors is to increase the interval between births. It has also
been pointed out that life table techniques of analysis of interval
data do not always remove the biases.
Perrin and Sheps (1964) and Sheps (1967) present a powerful
analytical model based on the theory of semi Markov process and Markov
renewal theory (Pyke 1961a, b). The model has been used further to
study the effects of changes in the underlying factors of the process
(Sheps 1964b; Sheps and Perrin 1963; Potter 1969, 1970; Potter et al
1970). The application of this model is limited by the restrictive
assumptions involved. For example, in the model it is necessary to
have all parameters fixed in time, the same probabilities apply to all
women and the reproductive period should be sufficiently long. As
stated earlier, the chance of conception is dependent upon the
frequency and timing of intercourse with respect to ovulation. Several
models are available in the literature studying this component of
fecundability (Glass and Grebenik (1954), Potter (1961)). Lachenbruch
(1967) uses Monte Carlo methods to study the effect of coital patterns
on fecundability. Glasser and Lachenbruch (1968) present a model to
estimate fecundability under two different situations of timing of
intercourse. Barret and Marshal (1969) attempt to determine the
relationship between fecundability and coital frequency using a set of
data on 1898 menstrual cycles for 241 married women.
Due to the restrictive assumptions involved in the analytic models,
computer simulations are also often used to study the reproductive
process.
A complete review of simulation models used to study the
process is given in Sheps (1971).
Potter and Sakoda (1966, 1967)
describe a macrosimulation model, deterministic in nature, developed
primarily to study problems concerning the dependence of family
planning success upon fecundity. Ridley and Sheps (1966) present a
microsimulation model which simulates the reproductive history of women
under various assumptions. Horvitz et al (1969) present a
microsimulation model, POPSIM, which may be used for simulating either
cohorts or period populations in one or two sex populations.
1.6 Aims of the Present Research
Our review of literature shows the following problems with some of
the models.
The presence of many unknown parameters makes the problem
of estimation very difficult for the models used to fit a set of data.
Even with very simplified assumptions the expressions for the required
probabilities become very complicated. The common practice of
specifying certain unknown parameters may result in misleading estimates.
Simultaneous estimates of the parameters are seldom tried. The
properties of the estimators are not investigated often. As noted earlier,
the method of ascertainment is an important factor in using interval
data.
Because of its simplicity, the time from marriage to first conception
or first livebirth in a population with no premarital conceptions
provides good data for the study of fecundability.
One advantage of
these data is the absence of a post partum non-susceptible period after
the termination of pregnancy.
The types of ascertainment in this
case may be as follows.
1.
A marriage cohort is followed (prospectively) until all conceive or
a fixed length of time has passed.
2.
Women of a marriage cohort who have conceived at least once at a
particular point of time are asked to give the time from marriage
till the first conception.
Although these are the ideal data for the suggested model, their collection sometimes poses difficulty due to the problem of determining the exact time of conception. One way to overcome this problem is to obtain the waiting time to conception by subtracting the estimated length of pregnancy from the time to the termination of first pregnancy.
The first type of
observations will face the risks of sterility, death, widowhood or
dropping out of the observations due to other reasons.
The second
group consists of only survivors to the time of observation.
Potter and Parker (1964) present a model for the time to first
conception assuming a beta distribution of fecundability among women. This model will be described in detail in Chapter II. Moment estimators are used to estimate the unknown parameters. Later, Majumdar and Sheps (1970) used maximum likelihood estimates to fit the same data. Even though considerable improvement in the fit is obtained, the fit is not adequate. Several reasons are put forth for the inadequate fit of the model to this particular data. The data were collected retrospectively from women who had their second child in September 1956, during the Princeton Fertility Survey (Westoff et al. 1961). Others have found this model adequate for their data (Berquó et al. (1968); Jain (1969)).
The inadequacy of the model to the Princeton data may be due to
several reasons.
Firstly, the underlying model may be different from
the proposed model.
Secondly, the method of ascertainment of the data
may be an important factor for the
inadequate fit of the model.
Response errors are likely to have been present and a definite tendency
toward heaping at 6, 12, 18, and 24 months is seen in this data.
In
order to help smooth this tendency, sometimes grouping of the data is
done.
Also, Majumdar and Sheps' (1970) work shows that the type of estimator used is also an important factor. Menken and Sheps (1972b) extend the maximum likelihood method to estimate fecundability from a particular type of censored sample.
George (1967) derives models for time to first livebirth and interlivebirth intervals.
These models have been subsequently fitted
to a set of data assuming values for certain parameters and using a
moment estimator of the remaining one unknown parameter (George and
Pillai 1969).
In some cases, the estimates of unknown parameters are
found to be heavily dependent upon the assumed values.
The focus of the present research will be on obtaining certain
estimators and studying their properties for models suggested by
Potter and Parker (1964) and George (1967).
The moment estimators, maximum likelihood estimators and minimum chisquare estimators will be considered whenever possible. Specifically, the study will consist of the following. Firstly, for the model of Potter and Parker, we will examine:
1.
Estimates from complete samples, censored samples, grouped samples
and truncated samples.
2.
The asymptotic variance-covariance matrix of the estimates obtained.
3.
The asymptotic relative efficiencies of various estimates.
4.
A slight modification of the model by changing the distribution of
fecundability.
Secondly, for the model for time to first livebirth presented by George (1967), we will study the problem of simultaneous estimates of unknown parameters from complete, censored and grouped samples.
In Chapter II we will describe the estimates from complete samples. Chapter III will examine the estimates from censored samples and grouped samples. The estimates from truncated samples will be studied in Chapter IV. In Chapter V the estimates from models for time to first livebirth will be considered. The remainder of this chapter will give certain definitions and certain general procedures to obtain estimates, which will be helpful in the later chapters.
1.7 Censoring and Truncation
If a marriage cohort of $N$ women is observed from the time of marriage for a time period of $t$ months, the exact time to conception or a livebirth is known only for the $n$ out of $N$ women who experience such an event during the observation period. For the remaining $(N-n)$ women it is only known that their conception time or time to first livebirth exceeds $t$. Data of this kind are generally known as censored data.
Various forms of censoring are used in the type of problems we are considering. We shall confine our studies to the following situations.
1. Observations are censored at a single point $t$. The exact time to the event is known for the $n$ women, and for the remaining $(N-n)$ women the time to the occurrence of the event is known to exceed $t$.
2. A marriage cohort of $N$ women is observed for a maximum duration of $t$ months. Every month a number of women are withdrawn from those who have not conceived till that period. The number withdrawn is treated as fixed or random. The number is considered to be random if the women are withdrawn from observation due to death, dissolution of marriage, or dropping out of observation. The number is fixed only if a pre-assigned number of women are withdrawn from the study to reduce the work load.
3. In a third kind of censoring each individual will have a distinct censoring point, i.e., for each individual we either have an exact time to the event under consideration or know that the time to the specified event exceeds a time, where the time can differ from individual to individual. This type of situation occurs when a study is conducted for $t$ months and all those who get married during the period are observed. Thus our sample will have women of different marital durations, which essentially are the various censoring times.
A truncated distribution is one formed from another "complete" distribution by cutting off and ignoring the part lying in some finite or infinite intervals. A truncated sample is obtained by sampling from a truncated population, i.e., a population having an underlying time distribution which is a truncated distribution. In a censored sample, although the exact conception times of censored individuals are not known, the number of individuals whose observations are censored is known. But in a truncated sample, the number of women whose conception times exceed the truncation point is not known. This type of sample is obtained during a retrospective survey in which women married for $t$ years and conceiving at least once are asked to give their time to first conception.
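The distinction between a censored and a truncated sample can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: conception months are drawn from a geometric distribution whose fecundability follows a beta distribution, and the function names, the parameter values and the dictionary layout are hypothetical choices, not part of the text.

```python
import random

def draw_conception_month(a, b, rng, cap):
    """Draw p ~ Beta(a, b), then the first month of success, stopping at cap.

    Any returned value greater than cap means "not conceived by month cap".
    """
    p = rng.betavariate(a, b)
    month = 1
    while month <= cap and rng.random() >= p:
        month += 1
    return month

def censored_and_truncated(a, b, n_women, t, seed=0):
    """Build a singly censored sample and a truncated sample from one cohort."""
    rng = random.Random(seed)
    months = [draw_conception_month(a, b, rng, t) for _ in range(n_women)]
    observed = [x for x in months if x <= t]
    # Censored sample: exact times for women conceiving by month t, plus the
    # known NUMBER of women whose conception time exceeds t.
    censored = {"times": observed, "n_exceeding_t": n_women - len(observed)}
    # Truncated sample: only the observed times; the number of women whose
    # conception time exceeds t is not available to the analyst.
    truncated = {"times": list(observed)}
    return censored, truncated

censored, truncated = censored_and_truncated(1.29, 2.17, n_women=500, t=12, seed=1)
print(len(censored["times"]), censored["n_exceeding_t"])
```

Both samples contain the same exact times; they differ only in whether the count beyond $t$ is recorded, which is what changes the likelihood in the later chapters.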
1.8 The Estimation Procedures
In this section we will describe the method of maximum likelihood and the method of minimum chisquare applied to obtain estimates from multinomial distributions, where the probability of belonging to any category depends on a set of parameters. These procedures are used to obtain estimates from censored and truncated samples.
1.8.1 The Method of Maximum Likelihood
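Let $\pi_i(\theta)$ $(i = 1, 2, \ldots, t)$ denote the probability of belonging to the $i$th class in a multinomial population, where $\theta$ is a set of unknown parameters. If in a sample of size $N$, $n_i$ $(i = 1, 2, \ldots, t)$ belong to the $i$th class, then the likelihood function is given by

$$L = \text{Constant} \times \prod_{i=1}^{t} [\pi_i(\theta)]^{n_i}. \tag{1.1}$$

Then we have

$$\log L = \text{Constant} + \sum_{i=1}^{t} n_i \log \pi_i(\theta). \tag{1.2}$$

Let $p_i$ $(i = 1, 2, \ldots, t;\ \sum p_i = 1)$ be the proportion observed in the $i$th class, i.e., $p_i = n_i/N$, $i = 1, 2, \ldots, t$. The maximum likelihood estimating equations are obtained by differentiating (1.2) with respect to the unknown parameters:

$$\frac{\partial \log L}{\partial \theta_r} = \sum_{i=1}^{t} \frac{n_i}{\pi_i}\,\frac{\partial \pi_i}{\partial \theta_r} = 0, \quad (r = 1, 2, \ldots, s). \tag{1.3}$$

Using the relationships $\sum_{i=1}^{t} \pi_i = 1$ and $\sum_{i=1}^{t} \partial \pi_i/\partial \theta_r = 0$, (1.3) can be expressed as

$$\frac{\partial \log L}{\partial \theta_r} = N \sum_{i=1}^{t-1} \left( \frac{p_i}{\pi_i} - \frac{p_t}{\pi_t} \right) \frac{\partial \pi_i}{\partial \theta_r} = 0. \tag{1.4}$$

Using a theorem in matrix algebra (Theorem 8.3.3, Graybill (1969)), (1.4) can be rewritten in matrix form as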
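$$\frac{\partial \log L}{\partial \theta_r} = \frac{\partial \pi'}{\partial \theta_r}\, V_\pi^{-1} (P - \pi) = 0, \tag{1.5}$$

where

$$V_\pi^{-1} = N \begin{pmatrix} \frac{1}{\pi_1} + \frac{1}{\pi_t} & \frac{1}{\pi_t} & \cdots & \frac{1}{\pi_t} \\ \frac{1}{\pi_t} & \frac{1}{\pi_2} + \frac{1}{\pi_t} & \cdots & \frac{1}{\pi_t} \\ \vdots & \vdots & & \vdots \\ \frac{1}{\pi_t} & \frac{1}{\pi_t} & \cdots & \frac{1}{\pi_{t-1}} + \frac{1}{\pi_t} \end{pmatrix},$$

$\pi' = (\pi_1, \pi_2, \ldots, \pi_{t-1})$ is $1 \times (t-1)$, and $P' = (p_1, p_2, \ldots, p_{t-1})$ is $1 \times (t-1)$.

We use $V_\pi^{-1}$ to denote the inverse of $V_\pi$, which is given by

$$V_\pi = \frac{1}{N} \begin{pmatrix} \pi_1(1-\pi_1) & -\pi_1\pi_2 & \cdots & -\pi_1\pi_{t-1} \\ -\pi_1\pi_2 & \pi_2(1-\pi_2) & \cdots & -\pi_2\pi_{t-1} \\ \vdots & \vdots & & \vdots \\ -\pi_1\pi_{t-1} & -\pi_2\pi_{t-1} & \cdots & \pi_{t-1}(1-\pi_{t-1}) \end{pmatrix}$$

and which is the covariance matrix of $P$.

Let $G$ $(s \times 1)$ denote $(g_i)$, where $g_i = \partial \log L/\partial \theta_i$, $i = 1, 2, \ldots, s$. Using (1.5), the maximum likelihood estimating equations can be written in matrix form as

$$G = H\, V_\pi^{-1} (P - \pi) = 0, \tag{1.6}$$

where $H$ is the $s \times (t-1)$ matrix whose $(r, i)$th element is $\partial \pi_i/\partial \theta_r$:

$$H = \begin{pmatrix} \frac{\partial \pi_1}{\partial \theta_1} & \frac{\partial \pi_2}{\partial \theta_1} & \cdots & \frac{\partial \pi_{t-1}}{\partial \theta_1} \\ \frac{\partial \pi_1}{\partial \theta_2} & \frac{\partial \pi_2}{\partial \theta_2} & \cdots & \frac{\partial \pi_{t-1}}{\partial \theta_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial \pi_1}{\partial \theta_s} & \frac{\partial \pi_2}{\partial \theta_s} & \cdots & \frac{\partial \pi_{t-1}}{\partial \theta_s} \end{pmatrix}.$$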
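Since equation (1.6) does not give an explicit solution, we will use iterative procedures to derive the estimates. The method adopted here is due to Rao (1952). Let

$$M = (m_{ij}), \qquad m_{ij} = -E\left[\frac{\partial^2 \log L}{\partial \theta_i\, \partial \theta_j}\right].$$

Using the relation

$$E\left[\frac{\partial^2 \log L}{\partial \theta_i\, \partial \theta_j}\right] = -E\left[\frac{\partial \log L}{\partial \theta_i}\,\frac{\partial \log L}{\partial \theta_j}\right]$$

we can write

$$M = E(G\,G'). \tag{1.7}$$

Substituting for $G$ from (1.6),

$$M = H\, V_\pi^{-1} H'. \tag{1.8}$$

Let $\theta_0$ be an estimate of $\theta$. Expanding (1.6) by Taylor's expansion, and taking expectations, the iterative equations for maximum likelihood estimates can be written as

$$\theta_{i+1} = \theta_i + M^{-1} G, \tag{1.9}$$

where $M$ and $G$ are evaluated at $\theta_i$. The iteration may be continued until the norm $\|\theta_i - \theta_{i+1}\|$, where $\theta_i$ and $\theta_{i+1}$ denote two successive estimates, reaches a desired small level. The asymptotic variance-covariance matrix of the estimates is given by

$$V_\theta = E\big((\hat\theta - \theta)(\hat\theta - \theta)'\big) = M^{-1} E(G\,G')\, M^{-1} = M^{-1}. \tag{1.10}$$

An estimate of (1.10) is obtained by substituting the obtained estimates in (1.10).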
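As a hedged illustration of the scoring iteration (1.9), the sketch below applies it to a one-parameter case, so that $M$ and $G$ reduce to scalars. The model is an assumption chosen for simplicity: a geometric waiting time censored at $t$ months, with classes $1, \ldots, t$ and a final class for women who have not conceived by month $t$. The function names and the counts are hypothetical.

```python
def class_probs(p, t):
    """Class probabilities: pi_x = p q^(x-1) for x = 1..t, then q^t."""
    q = 1.0 - p
    probs = [p * q ** (x - 1) for x in range(1, t + 1)]
    probs.append(q ** t)
    return probs

def class_derivs(p, t):
    """Derivatives of the class probabilities with respect to p."""
    q = 1.0 - p
    derivs = [1.0 if x == 1 else q ** (x - 1) - (x - 1) * p * q ** (x - 2)
              for x in range(1, t + 1)]
    derivs.append(-t * q ** (t - 1))
    return derivs

def scoring(counts, p0, tol=1e-10, max_iter=100):
    """Iterate p <- p + G / M; returns the estimate and 1/M as its variance."""
    t = len(counts) - 1          # the last count is the censored class
    n_total = sum(counts)
    p = p0
    for _ in range(max_iter):
        pi = class_probs(p, t)
        dpi = class_derivs(p, t)
        # Score G and expected information M for a multinomial likelihood.
        G = sum(n / pr * d for n, pr, d in zip(counts, pi, dpi))
        M = n_total * sum(d * d / pr for pr, d in zip(pi, dpi))
        p_new = p + G / M
        if abs(p_new - p) < tol:
            return p_new, 1.0 / M
        p = p_new
    return p, 1.0 / M

counts = [40, 24, 14, 22]        # months 1-3, then 22 women censored at t = 3
p_hat, var_hat = scoring(counts, p0=0.3)
print(p_hat, var_hat)
```

For this particular model the maximum likelihood estimate also has a closed form, the number of conceptions divided by the total months of exposure, which gives a convenient check on the iteration.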
1.8.2 The Method of Minimum Chisquare
The method of minimum chisquare is known to yield asymptotically efficient estimators under quite general conditions (Cramér (1946), Neyman (1949a)). In general, the estimators obtained in this way will be efficient for grouped data, but not necessarily for the original data. In the case of discrete distributions, however, there is a natural grouping based on the lattice points of the distribution, which results in no loss of information (the set of group frequencies being a sufficient statistic). If this grouping is used the minimum chisquare method is asymptotically efficient for the original data.
Let $p_i$ $(i = 1, 2, \ldots, t)$ be the observed proportion in the $i$th class, and $\pi_i(\theta)$ be the corresponding probability. Then the minimum chisquare estimates are obtained by minimizing the function

$$\chi^2(\theta) = N \sum_{i=1}^{t} \frac{(p_i - \pi_i)^2}{\pi_i} \tag{1.11}$$

subject to the condition $\sum_{i=1}^{t} \pi_i(\theta) = 1$. Sometimes it is very difficult to solve the equations obtained by equating to zero the derivatives of $\chi^2$ with respect to $\theta$. Several iterative procedures have been suggested to solve this problem.
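Ferguson (1958) makes the following suggestions. Expand (1.11) in a Taylor's series and truncate it after the second order term. Let $\theta_0$ be an estimate of $\theta$. Then,

$$\chi^2(\theta) \approx \chi^2(\theta_0) + (\theta - \theta_0)'\, \dot\chi^2(\theta_0) + \tfrac{1}{2} (\theta - \theta_0)'\, \ddot\chi^2(\theta_0)\, (\theta - \theta_0), \tag{1.12}$$

where

$$\dot\chi^2(\theta) = \left[ \frac{\partial \chi^2}{\partial \theta_1}, \frac{\partial \chi^2}{\partial \theta_2}, \ldots, \frac{\partial \chi^2}{\partial \theta_s} \right]' \tag{1.13}$$

and

$$\ddot\chi^2(\theta) = \begin{pmatrix} \frac{\partial^2 \chi^2}{\partial \theta_1^2} & \cdots & \frac{\partial^2 \chi^2}{\partial \theta_1 \partial \theta_s} \\ \vdots & & \vdots \\ \frac{\partial^2 \chi^2}{\partial \theta_1 \partial \theta_s} & \cdots & \frac{\partial^2 \chi^2}{\partial \theta_s^2} \end{pmatrix}. \tag{1.14}$$

Differentiating (1.12) with respect to $\theta$, and equating to zero, the estimating equation reduces to the form

$$\theta = \theta_0 - \left[\ddot\chi^2(\theta_0)\right]^{-1} \dot\chi^2(\theta_0). \tag{1.15}$$

Ferguson (1958) shows that the estimates obtained through (1.15) are best asymptotically normal. The asymptotic variance-covariance matrix of the estimates is given by (1.10).
Minimization of (1.11) is also done by linearizing $\pi(\theta)$ with respect to its parameters. In this case the estimating equations reduce to the maximum likelihood estimating equations given in (1.9).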
Definition 1.8.1
Ferguson (1958) defines a $O(\sqrt{n})$ consistent estimate as follows. An estimate $\hat\theta_n$ of $\theta$ will be called $O(\sqrt{n})$ consistent if $\sqrt{n}(\hat\theta_n - \theta)$ is bounded in probability uniformly in $n$ when $\theta$ is the true value of the parameter. That is, for every $\varepsilon > 0$ and $\theta \in \Theta$ (the parameter space) there exists a number $B$ so large that for every $n = 1, 2, \ldots$

$$P\left[\sqrt{n}\,|\hat\theta_n - \theta| > B \mid \theta\right] < \varepsilon.$$
Definition (1.8.1) helps in describing the following observations made by Ferguson (1958) and LeCam (1956). Ferguson (1958) states:
(1) Estimates based on certain functions of moments which are asymptotically normal are also $O(\sqrt{n})$ consistent estimates.
(2) If a $O(\sqrt{n})$ consistent estimate is used to obtain a solution to equation (1.15), then the solution obtained after a single iteration is asymptotically efficient.
(3) LeCam (1956) states that a solution obtained for (1.9) by a simple substitution of a $O(\sqrt{n})$ consistent estimate is asymptotically efficient. (Usually the method of scoring does not require a consistent estimator as initial guess.)
Definition 1.8.2
Wilks (1962, p. 547) defines the generalized variance of an estimator $\hat\theta$ as the determinant of the corresponding variance-covariance matrix. Let $\hat\theta_1$ $(s \times 1)$ and $\hat\theta_2$ $(s \times 1)$ denote two sets of estimates of $\theta$ $(s \times 1)$ with the asymptotic variance-covariance matrices $V_1$ and $V_2$ respectively. Then the asymptotic relative efficiency (ARE) of $\hat\theta_2$ compared to $\hat\theta_1$ is defined as (Sen 1971, p. 230)

$$\text{ARE} = \left[ \frac{|V_1|}{|V_2|} \right]^{1/s},$$

where $s$ is the number of unknown parameters estimated.
CHAPTER II
CONCEPTION TIME MODELS - ESTIMATION FROM COMPLETE SAMPLES
2.1 Introduction
In this chapter, we will describe in detail the estimation of the risk of conception using certain conception time models from complete samples. The models and the estimation procedures are described by various authors (Potter and Parker (1964), Sheps (1964a), Majumdar and Sheps (1970), Sheps and Menken (1972b)). Certain modifications are attempted here. The detailed description of the estimation from complete samples is helpful for comparison with the estimates from the censored and truncated samples presented in the subsequent chapters. The sample is considered to be complete in the sense that the data on conception times are collected prospectively by observing a marriage cohort until everybody conceives.
Throughout this chapter we assume that the fecundability of each woman remains constant from month to month until pregnancy, and conception is a random event conditioned on her fecundability. The various models presented in this chapter vary in their assumption regarding the distribution of fecundability among women.
2.2 Model with Homogeneous Fecundability
Consider a homogeneous population in which all women have identical constant fecundability $p$. Then $X$, the month of conception, can be considered to be a random variable following a geometric distribution; i.e.,

$$P[X = x] = p\,q^{x-1}, \quad x = 1, 2, \ldots \tag{2.1}$$

where $q = 1 - p$.
The probability that conception does not occur in the first $x$ months is given by

$$P[X > x] = q^{x}. \tag{2.2}$$

The conditional probability that a woman conceives in month $x$ given that she has not conceived in the first $(x-1)$ months is

$$P[X = x \mid X > (x-1)] = p. \tag{2.3}$$

The expression (2.3) shows that the conditional conception rate at the beginning of a month among those who have not conceived till that time is constant and equal to $p$.
The distribution (2.1) has the moment generating function

$$M(s) = \frac{p e^{s}}{1 - q e^{s}}.$$

The mean and variance of $X$ are

$$E(X) = \frac{1}{p}, \tag{2.4}$$

$$\mathrm{Var}(X) = \frac{q}{p^{2}}. \tag{2.5}$$

Set $\lambda = 1/p$.
Sheps and Menken (1972b) have shown that, in general, the moments of $X$ can be expressed as

$$E(X^{r}) = \sum_{k=1}^{r} C(k, r)\, \lambda^{k}, \tag{2.6}$$

where $\lambda = 1/p$ and $C(k, r)$ satisfies the recurrence relation

$$C(k, r) = \begin{cases} 0, & k < 1,\ r < 1,\ \text{or } k > r \\ 1, & k = r = 1 \\ k\,[C(k-1, r-1) - C(k, r-1)], & \text{otherwise.} \end{cases} \tag{2.7}$$

The result (2.7) is proved by the method of induction.
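The recurrence (2.7) is easy to check numerically. The sketch below, with hypothetical function names, builds $C(k, r)$ from the recurrence and evaluates (2.6) in exact rational arithmetic, taking $\lambda = 1/p$ as in the text. It is an illustration, not part of the original derivation.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def C(k, r):
    """Coefficient C(k, r) defined by the recurrence (2.7)."""
    if k < 1 or r < 1 or k > r:
        return 0
    if k == 1 and r == 1:
        return 1
    return k * (C(k - 1, r - 1) - C(k, r - 1))

def raw_moment(r, p):
    """E(X^r) for the geometric distribution (2.1), using (2.6)."""
    lam = 1 / Fraction(p)
    return sum(C(k, r) * lam ** k for k in range(1, r + 1))

p = Fraction(1, 4)
print(raw_moment(1, p), raw_moment(2, p), raw_moment(3, p))
```

With $p = 1/4$ the first two values agree with $1/p = 4$ and $q/p^2 + 1/p^2 = 28$ obtained from (2.4) and (2.5).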
We shall derive an alternate relationship between the moments as follows. Setting $\lambda = 1/p$, the moment generating function of $X$ can be rewritten as

$$M(s, \lambda) = \frac{e^{s}}{\lambda - (\lambda - 1) e^{s}}.$$

By differentiating $M(s, \lambda)$ partially with respect to $\lambda$ and $s$, it can be shown that $M(s, \lambda)$ satisfies the differential equation

$$\lambda(\lambda - 1)\, \frac{\partial M(s, \lambda)}{\partial \lambda} + \lambda\, M(s, \lambda) = \frac{\partial M(s, \lambda)}{\partial s}. \tag{2.8}$$

By definition

$$M(s, \lambda) = \sum_{r=0}^{\infty} \mu_r\, \frac{s^{r}}{r!}, \tag{2.9}$$

where $\mu_r$ denotes the $r$th moment of $X$ about zero. Substituting (2.9) in (2.8) and comparing the coefficients of $s^{r}$ we get the relation

$$\mu_{r+1} = \lambda(\lambda - 1)\, \frac{d\mu_r}{d\lambda} + \lambda\, \mu_r. \tag{2.10}$$

Substituting the relation (2.6), i.e. $E(X^{r}) = \sum_{k} C(k, r)\, \lambda^{k}$, in (2.10) and comparing the coefficients of $\lambda^{k}$, relation (2.7) is obtained.
The estimate of $p$ by the method of moments and by the method of maximum likelihood are the same:

$$\hat p = \frac{1}{\bar X}, \tag{2.11}$$

where $\bar X$ is the sample mean. It can be shown that for large samples

$$\mathrm{Var}(\hat p) \approx \frac{p^{2}(1 - p)}{N}, \tag{2.12}$$

where $N$ is the sample size.
This model is too simple with its assumption of homogeneous fecundability. Modification of this model is done by introducing heterogeneous fecundability.
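A small worked example of (2.11) and (2.12) may help. The sketch below uses a hypothetical helper name. It takes the sample size of 958 women used later in Section 2.3.3, and because the sample mean is not quoted in the text, it is backed out here from the reported estimate of 0.1829. That input is therefore an assumption.

```python
import math

def geometric_fit(mean_wait, n):
    """Estimate p by 1 / (sample mean) and its large-sample standard error."""
    p_hat = 1.0 / mean_wait
    # Large-sample variance from (2.12), evaluated at the estimate.
    se = math.sqrt(p_hat ** 2 * (1.0 - p_hat) / n)
    return p_hat, se

# Assumed input: a mean waiting time consistent with p_hat = 0.1829.
p_hat, se = geometric_fit(mean_wait=1.0 / 0.1829, n=958)
print(round(p_hat, 4), round(se, 4))
```

The resulting standard error is close to the value of 0.0053 listed for the homogeneous model in Table 2.2.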
2.3 Models with Heterogeneous Fecundability
The model described in Section 2.2 is modified by assuming that $p$ is distributed among women with a general distribution $h(p)$. Henry (1953), Sheps (1964a) and Sheps and Menken (1972b) describe the consequences of heterogeneous fecundability on conception rates. Potter and Parker (1964) choose $h(p)$ to be a beta distribution with parameters $a$ and $b$. Section 2.3.1 will describe in detail the model suggested by Potter and Parker. Section 2.3.2 will consider a model with the assumption that fecundability is distributed as a mixture of beta distributions.
2.3.1 Beta Distribution of Fecundability
Assume that among couples, fecundability is distributed as

$$h(p) = \frac{1}{B(a, b)}\, p^{a-1} (1 - p)^{b-1}, \tag{2.13}$$

where $a, b > 0$ and $0 < p < 1$. From (2.1),

$$P[X = x \mid p] = p\, q^{x-1}, \quad x = 1, 2, \ldots$$

where $X$ is the conception month. Then, the unconditional distribution of the conception month is given by

$$P[X = x] = \begin{cases} \dfrac{a}{a+b}, & x = 1 \\[2ex] \dfrac{a\, b(b+1) \cdots (b+x-2)}{(a+b)(a+b+1) \cdots (a+b+x-1)}, & x = 2, 3, \ldots \end{cases} \tag{2.14}$$

The probability that conception does not occur in the first $x$ months is given by

$$P[X > x] = \frac{b(b+1) \cdots (b+x-1)}{(a+b)(a+b+1) \cdots (a+b+x-1)}, \quad x \ge 1. \tag{2.15}$$

The conditional probability of conceiving at month $x$ given that she has not conceived for the first $(x-1)$ months is

$$u(x) = \frac{a}{a+b+x-1}, \quad x \ge 1. \tag{2.16}$$

The function $u(x)$ is a decreasing function of $x$. This shows that the conceptions are occurring at a decreasing rate. The $r$th factorial moment of $X$ is

$$\mu_{[r]} = \begin{cases} \dfrac{a+b-1}{a-1}, & r = 1 \\[2ex] \dfrac{r!\,(a+b-1)\, b(b+1) \cdots (b+r-2)}{(a-1)(a-2) \cdots (a-r)}, & r \ge 2, \end{cases} \tag{2.17}$$

and exists only when $a > r$. We also note that

$$\mathrm{Var}(X) = \frac{ab(a+b-1)}{(a-1)^{2}(a-2)}. \tag{2.18}$$
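The quantities of the beta model can be computed with running products, without evaluating beta functions. The sketch below is a minimal illustration with hypothetical function names, covering the conception-month distribution (2.14), the probability of not conceiving in the first $x$ months (2.15) and the conditional conception rate (2.16). The parameter values $a = 1.29$, $b = 2.17$ are the rounded single beta estimates quoted later in Table 2.2, so the output only approximately matches the fitted proportions in Table 2.1.

```python
def pmf(x, a, b):
    """P[X = x] for the geometric-beta compound distribution, as in (2.14)."""
    prob = a / (a + b)
    for j in range(x - 1):
        prob *= (b + j) / (a + b + j + 1)
    return prob

def survival(x, a, b):
    """P[X > x], the probability of no conception in the first x months."""
    prob = 1.0
    for j in range(x):
        prob *= (b + j) / (a + b + j)
    return prob

def hazard(x, a, b):
    """Conditional probability u(x) of conceiving in month x, as in (2.16)."""
    return a / (a + b + x - 1)

a, b = 1.29, 2.17
for x in (1, 2, 3, 4):
    print(x, round(pmf(x, a, b), 4), round(hazard(x, a, b), 4))
```

The printed conditional rates fall with $x$, which is the decreasing conception rate noted in the text.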
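2.3.1.1 The Moment Estimators
Potter and Parker (1964) obtain the moment estimators of $a$ and $b$. Majumdar and Sheps (1970) give the asymptotic variance-covariance matrix of the moment estimators.
By equating the sample raw moments of $X$ to the corresponding population moments, we obtain the moment estimators of $a$ and $b$ as

$$\hat a = \frac{2(m_2 - m_1^{2})}{m_2 - 2m_1^{2} + m_1}, \qquad \hat b = (m_1 - 1)(\hat a - 1), \tag{2.19}$$

where $m_r$ denotes the $r$th sample moment about zero. The estimates exist only when $\mu_2$ exists, i.e., $a > 2$.
The asymptotic variance-covariance matrix is given by

$$W = J'\, V\, J, \tag{2.20}$$

where $W$ denotes the asymptotic variance-covariance matrix of $\hat a$ and $\hat b$,

$$J' = \begin{pmatrix} \dfrac{\partial \hat a}{\partial m_1} & \dfrac{\partial \hat a}{\partial m_2} \\[2ex] \dfrac{\partial \hat b}{\partial m_1} & \dfrac{\partial \hat b}{\partial m_2} \end{pmatrix}$$

evaluated at the expected values of $m_1$ and $m_2$, and

$$V = \begin{pmatrix} \mathrm{Var}(m_1) & \mathrm{Cov}(m_1, m_2) \\ \mathrm{Cov}(m_1, m_2) & \mathrm{Var}(m_2) \end{pmatrix},$$

where $\mathrm{Var}(m_1) = (\mu_2 - \mu_1^{2})/N$, $\mathrm{Cov}(m_1, m_2) = (\mu_3 - \mu_1\mu_2)/N$ and $\mathrm{Var}(m_2) = (\mu_4 - \mu_2^{2})/N$. Also, $\mu_r$ denotes the $r$th population moment about zero. The $V$ matrix exists only if each of its elements exists, and all elements exist only if $a > 4$.
Elements of the $J$ matrix are obtained as follows:

$$\frac{\partial \hat a}{\partial m_1} = \frac{4 m_1 (\hat a - 1) - \hat a}{m_2 - 2m_1^{2} + m_1}, \qquad \frac{\partial \hat a}{\partial m_2} = \frac{2 - \hat a}{m_2 - 2m_1^{2} + m_1},$$

$$\frac{\partial \hat b}{\partial m_1} = (\hat a - 1) + (m_1 - 1)\, \frac{\partial \hat a}{\partial m_1}, \qquad \frac{\partial \hat b}{\partial m_2} = (m_1 - 1)\, \frac{\partial \hat a}{\partial m_2}.$$

It follows from a theorem (Cramér (1946), Section 28.4) that, for large values of $N$, if the $V$ matrix in (2.20) exists, i.e. if $a > 4$, then the moment estimators of $a$ and $b$ given in (2.19) are approximately normally distributed about the corresponding population characteristics with the variance given by $W$ in (2.20).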
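The moment estimators can be sketched in a few lines. The closed forms used below come from equating the first two raw moments to the population values implied by the factorial moments of the beta model, i.e. $\mu_1 = (a+b-1)/(a-1)$ and $\mu_2 = \mu_{[2]} + \mu_1$. The function names are hypothetical, and the round-trip check uses population moments rather than data.

```python
def population_moments(a, b):
    """First two raw moments of X under the beta model (requires a > 2)."""
    mu1 = (a + b - 1.0) / (a - 1.0)
    mu2 = 2.0 * b * (a + b - 1.0) / ((a - 1.0) * (a - 2.0)) + mu1
    return mu1, mu2

def moment_estimates(m1, m2):
    """Solve the two moment equations for a and b."""
    a_hat = 2.0 * (m2 - m1 ** 2) / (m2 - 2.0 * m1 ** 2 + m1)
    b_hat = (m1 - 1.0) * (a_hat - 1.0)
    return a_hat, b_hat

m1, m2 = population_moments(5.0, 3.0)
print(moment_estimates(m1, m2))
```

Feeding the population moments for $a = 5$, $b = 3$ back into the estimator returns the same pair, which confirms that the two functions are mutually consistent.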
2.3.1.2 The Maximum Likelihood Estimators
Majumdar and Sheps (1970) derive the maximum likelihood estimators of $a$ and $b$ and compare the efficiency of the moment estimators to the maximum likelihood estimators.
Let $\pi_i$, $i = 1, 2, \ldots$ be the probability of a conception in month $i$. Then $\pi_i$ is given by (2.14). Let $n_i$ be the observed frequency of conceptions in month $i$. The likelihood function for the sample of $N$ values is proportional to

$$L \propto \prod_{i} \pi_i^{\,n_i} \tag{2.21}$$

and

$$\frac{\partial \log L}{\partial a} = \sum_{i} n_i\, \frac{\partial \log \pi_i}{\partial a} = 0, \qquad \frac{\partial \log L}{\partial b} = \sum_{i} n_i\, \frac{\partial \log \pi_i}{\partial b} = 0, \tag{2.22}$$

where

$$\frac{\partial \log \pi_x}{\partial a} = \frac{1}{a} - \sum_{j=0}^{x-1} \frac{1}{a+b+j}, \quad x \ge 1, \tag{2.23}$$

$$\frac{\partial \log \pi_x}{\partial b} = \begin{cases} -\dfrac{1}{a+b}, & x = 1 \\[2ex] \displaystyle\sum_{j=0}^{x-2} \frac{1}{b+j} - \sum_{j=0}^{x-1} \frac{1}{a+b+j}, & x \ge 2. \end{cases} \tag{2.24}$$

The maximum likelihood estimates are obtained by solving the simultaneous equations given by (2.22). Majumdar and Sheps (1970) present an algorithm to solve these equations by iterative procedures.
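The derivatives (2.23) and (2.24) can be coded directly and checked against a numerical derivative of $\log \pi_x$. The sketch below is illustrative only. The function names and the small set of counts are hypothetical, and it evaluates the score functions in (2.22) rather than reproducing the algorithm of Majumdar and Sheps (1970).

```python
import math

def log_pmf(x, a, b):
    """log of pi_x in (2.14), written as a sum of logs."""
    total = math.log(a)
    for j in range(x - 1):
        total += math.log(b + j)
    for j in range(x):
        total -= math.log(a + b + j)
    return total

def dlog_da(x, a, b):
    """Derivative of log pi_x with respect to a, as in (2.23)."""
    return 1.0 / a - sum(1.0 / (a + b + j) for j in range(x))

def dlog_db(x, a, b):
    """Derivative of log pi_x with respect to b, as in (2.24).

    For x = 1 the first sum is empty, leaving -1 / (a + b).
    """
    return (sum(1.0 / (b + j) for j in range(x - 1))
            - sum(1.0 / (a + b + j) for j in range(x)))

def score(counts, a, b):
    """Left-hand sides of the two equations in (2.22); counts maps month -> n."""
    ga = sum(n * dlog_da(x, a, b) for x, n in counts.items())
    gb = sum(n * dlog_db(x, a, b) for x, n in counts.items())
    return ga, gb

counts = {1: 40, 2: 16, 3: 10, 4: 5, 6: 3}   # hypothetical frequencies
print(score(counts, 1.29, 2.17))
```

A root of both score components, found for instance by the scoring iteration of Section 1.8.1, gives the maximum likelihood estimates.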
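The asymptotic variance-covariance matrix of the maximum likelihood estimates is obtained as follows. Let

$$M = (m_{ij}), \tag{2.25}$$

where

$$m_{ij} = -E\left[\frac{\partial^{2} \log L}{\partial \theta_i\, \partial \theta_j}\right], \quad i, j = 1, 2, \quad \theta_1 = a \text{ and } \theta_2 = b.$$

Then, the asymptotic variance-covariance matrix of the maximum likelihood estimates is the inverse of $M$; $M$ is commonly known as the information matrix. The elements of the information matrix are

$$m_{11} = -N \sum_{i} \pi_i\, \frac{\partial^{2} \log \pi_i}{\partial a^{2}}, \tag{2.26}$$

$$m_{12} = -N \sum_{i} \pi_i\, \frac{\partial^{2} \log \pi_i}{\partial a\, \partial b}, \tag{2.27}$$

and

$$m_{22} = -N \sum_{i} \pi_i\, \frac{\partial^{2} \log \pi_i}{\partial b^{2}}, \tag{2.28}$$

where

$$\frac{\partial^{2} \log \pi_x}{\partial a^{2}} = -\frac{1}{a^{2}} + \sum_{j=0}^{x-1} \frac{1}{(a+b+j)^{2}}, \quad x \ge 1, \tag{2.29}$$

$$\frac{\partial^{2} \log \pi_x}{\partial a\, \partial b} = \sum_{j=0}^{x-1} \frac{1}{(a+b+j)^{2}}, \quad x \ge 1, \tag{2.30}$$

$$\frac{\partial^{2} \log \pi_x}{\partial b^{2}} = \begin{cases} \dfrac{1}{(a+b)^{2}}, & x = 1 \\[2ex] -\displaystyle\sum_{j=0}^{x-2} \frac{1}{(b+j)^{2}} + \sum_{j=0}^{x-1} \frac{1}{(a+b+j)^{2}}, & x \ge 2. \end{cases} \tag{2.31}$$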
Majumdar and Sheps (1970) use the first two hundred terms in the infinite series (2.26) - (2.28) to estimate the elements of $M$. Katti and Gurland (1961) provide certain guidelines as to the number of terms to be used in evaluating infinite series in an information matrix. Let $T_{ij}(n)$ denote the $(n+1)$th term in the series involved in the $(i, j)$th term of the information matrix, and let $S_{ij}(n)$ denote the sum of the first $n$ terms of the same series. Compute the ratio

$$\frac{\sum_{i,j} T_{ij}^{2}(n)}{\sum_{i,j} S_{ij}^{2}(n)}$$

for $n = 1, 2, \ldots$, successively until a value of $n$ is reached for which the ratio falls below a preassigned small value. This ensures that $|T_{ij}(n)|$ is small relative to $|S_{ij}(n)|$ for each $i$ and $j$.
The results of Majumdar and Sheps (1970) show that the efficiency of moment estimates compared to the maximum likelihood estimates is extremely low for values of $a$ close to 4, and increases as $a$ increases. Changes in the values of $b$ have little effect on the efficiency.
Majumdar and Sheps (1970) use moment estimators as the initial estimates for finding the maximum likelihood estimates. The moment estimators do not exist when $a \le 2$.
A set of estimates can also be derived by equating observed proportions to the theoretical proportions conceiving every month. Since the observed proportions are consistent estimators of class probabilities, the estimates based on them are also consistent. A set of estimates can be obtained from any two consecutive observed class proportions. For example, equating the first two observed proportions to their expected values we have the equations

$$p_1 = \frac{a}{a+b} \tag{2.32}$$

and

$$p_2 = \frac{ab}{(a+b)(a+b+1)}, \tag{2.33}$$

where $p_1$ and $p_2$ are the two observed proportions of month 1 and 2 respectively. Solving (2.32) and (2.33) we get the solutions

$$\hat a = \frac{p_1 p_2}{p_1(1 - p_1) - p_2} \tag{2.34}$$

and

$$\hat b = \frac{(1 - p_1)\, \hat a}{p_1}. \tag{2.35}$$

If $\hat a$ or $\hat b$ is negative, attempts may be made using any other pairs of consecutive proportions to get another set of estimates.
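The proportion-based starting values can be sketched as follows. The closed form for $a$ used here is obtained by solving (2.32) together with the second-month probability $ab/[(a+b)(a+b+1)]$ from (2.14). The function name is hypothetical, and the two proportions passed in the example are the first two observed proportions listed in Table 2.1.

```python
def proportion_estimates(p1, p2):
    """Solve p1 = a/(a+b) and p2 = ab/((a+b)(a+b+1)) for a and b."""
    a_hat = p1 * p2 / (p1 * (1.0 - p1) - p2)
    b_hat = (1.0 - p1) * a_hat / p1
    return a_hat, b_hat

a_hat, b_hat = proportion_estimates(0.3967, 0.1597)
print(round(a_hat, 3), round(b_hat, 3))
```

For these two proportions both values come out positive (roughly 0.80 and 1.21), so they could serve as initial values for an iterative procedure; a negative value would signal that another pair of consecutive proportions should be tried.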
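2.3.2 A Mixture Distribution of Fecundability
Earlier in Chapter I, we noted that the model described in Section 2.3.1 failed to give an adequate fit to the data of Potter and Parker (1964). But such a model was found to be adequate for sets of data from Hutterites (Majumdar and Sheps 1970) and from Taiwan (Jain 1969). Several reasons were hypothesized for the inadequate fit of the model to the Potter and Parker data. One of the hypotheses is that the proposed distribution of fecundability may not be appropriate. It appears that the women under study consist of two distinct groups with respect to fecundability, one with a high average risk of conception and the other with a low risk of conception. This may be a natural phenomenon or may be due to the presence of contraceptors in this population. Hence a new model is proposed with the assumption that among couples, fecundability is distributed as a mixture of two beta distributions, i.e.,

$$h(p) = \alpha\, \frac{p^{a_1 - 1}(1-p)^{b_1 - 1}}{B(a_1, b_1)} + (1 - \alpha)\, \frac{p^{a_2 - 1}(1-p)^{b_2 - 1}}{B(a_2, b_2)}, \tag{2.36}$$

where $a_1, b_1, a_2, b_2 > 0$, $0 < p < 1$, and $0 < \alpha < 1$.
Then, the probability distribution of $X$, the month of conception, is given by

$$\pi_x = P(X = x) = \begin{cases} \alpha\, \dfrac{a_1}{a_1 + b_1} + (1 - \alpha)\, \dfrac{a_2}{a_2 + b_2}, & x = 1 \\[2ex] \alpha\, \dfrac{a_1 b_1 (b_1 + 1) \cdots (b_1 + x - 2)}{(a_1 + b_1) \cdots (a_1 + b_1 + x - 1)} + (1 - \alpha)\, \dfrac{a_2 b_2 (b_2 + 1) \cdots (b_2 + x - 2)}{(a_2 + b_2) \cdots (a_2 + b_2 + x - 1)}, & x = 2, 3, \ldots \end{cases} \tag{2.37}$$

The probability that conception does not occur in the first $x$ months is given by

$$P[X > x] = \alpha\, \frac{b_1 (b_1 + 1) \cdots (b_1 + x - 1)}{(a_1 + b_1) \cdots (a_1 + b_1 + x - 1)} + (1 - \alpha)\, \frac{b_2 (b_2 + 1) \cdots (b_2 + x - 1)}{(a_2 + b_2) \cdots (a_2 + b_2 + x - 1)}. \tag{2.38}$$

The $r$th factorial moment of the distribution is

$$\mu_{[r]} = \begin{cases} \alpha\, \dfrac{a_1 + b_1 - 1}{a_1 - 1} + (1 - \alpha)\, \dfrac{a_2 + b_2 - 1}{a_2 - 1}, & r = 1 \\[2ex] r!\left[ \alpha\, \dfrac{(a_1 + b_1 - 1)\, b_1 \cdots (b_1 + r - 2)}{(a_1 - 1) \cdots (a_1 - r)} + (1 - \alpha)\, \dfrac{(a_2 + b_2 - 1)\, b_2 \cdots (b_2 + r - 2)}{(a_2 - 1) \cdots (a_2 - r)} \right], & r = 2, 3, \ldots \end{cases} \tag{2.39}$$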
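Since the mixture probabilities are weighted sums of the single beta probabilities, they are straightforward to compute. The sketch below uses hypothetical function names and evaluates the mixture probability of conception in month $x$ with the parameter estimates reported later in Table 2.2 ($a_1 = 2.3206$, $b_1 = 1.8397$, $a_2 = 4.3589$, $b_2 = 32.9862$, $\alpha = 0.6331$). It is an illustrative check, not the fitting procedure itself.

```python
def beta_geometric_pmf(x, a, b):
    """P[X = x] for a single beta distribution of fecundability."""
    prob = a / (a + b)
    for j in range(x - 1):
        prob *= (b + j) / (a + b + j + 1)
    return prob

def mixture_pmf(x, alpha, a1, b1, a2, b2):
    """P[X = x] when fecundability is a two-component beta mixture."""
    return (alpha * beta_geometric_pmf(x, a1, b1)
            + (1.0 - alpha) * beta_geometric_pmf(x, a2, b2))

params = (0.6331, 2.3206, 1.8397, 4.3589, 32.9862)
for x in (1, 2, 3, 4):
    print(x, round(mixture_pmf(x, *params), 4))
```

The first two values are close to the expected proportions 0.3960 and 0.1627 listed for the mixture model in Table 2.1.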
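Since the moment estimators of the five unknown parameters are difficult to solve, we will find only the maximum likelihood estimates.
Let $n_i$, $i = 1, 2, \ldots$ be the observed frequency of conceptions in month $i$. The likelihood function for the sample of $\sum_{i} n_i = N$ values is proportional to

$$L \propto \prod_{i} \pi_i^{\,n_i}, \tag{2.40}$$

where $\pi_i$ is given by (2.37).
The maximum likelihood estimates are the solutions of the five simultaneous equations

$$\frac{\partial \log L}{\partial \theta_r} = \sum_{i} n_i\, \frac{\partial \log \pi_i}{\partial \theta_r} = \sum_{i} \frac{n_i}{\pi_i}\, \frac{\partial \pi_i}{\partial \theta_r} = 0, \quad r = 1, 2, \ldots, 5, \tag{2.41}$$

where $\theta_1, \ldots, \theta_5$ denote the five unknown parameters $a_1, b_1, a_2, b_2$ and $\alpha$. Set

$$\pi_{ij} = \begin{cases} \dfrac{a_j}{a_j + b_j}, & i = 1 \\[2ex] \dfrac{a_j\, b_j (b_j + 1) \cdots (b_j + i - 2)}{(a_j + b_j) \cdots (a_j + b_j + i - 1)}, & i = 2, 3, \ldots; \quad j = 1, 2, \end{cases} \tag{2.42}$$

so that $\pi_i = \alpha\, \pi_{i1} + (1 - \alpha)\, \pi_{i2}$. Then,

$$\frac{\partial \pi_i}{\partial \alpha} = \pi_{i1} - \pi_{i2}, \tag{2.43}$$

$$\frac{\partial \pi_i}{\partial a_1} = \alpha\, \pi_{i1}\, \frac{\partial \log \pi_{i1}}{\partial a_1}, \tag{2.44}$$

$$\frac{\partial \pi_i}{\partial b_1} = \alpha\, \pi_{i1}\, \frac{\partial \log \pi_{i1}}{\partial b_1}, \tag{2.45}$$

$$\frac{\partial \pi_i}{\partial a_2} = (1 - \alpha)\, \pi_{i2}\, \frac{\partial \log \pi_{i2}}{\partial a_2}, \tag{2.46}$$

$$\frac{\partial \pi_i}{\partial b_2} = (1 - \alpha)\, \pi_{i2}\, \frac{\partial \log \pi_{i2}}{\partial b_2}. \tag{2.47}$$

Equations (2.43)-(2.47) can be evaluated using the following relationships:

$$\frac{\partial \log \pi_{ij}}{\partial a_j} = \frac{1}{a_j} - \sum_{l=0}^{i-1} \frac{1}{a_j + b_j + l}, \quad i \ge 1;\ j = 1, 2, \tag{2.48}$$

$$\frac{\partial \log \pi_{ij}}{\partial b_j} = -\frac{1}{a_j + b_j}, \quad i = 1;\ j = 1, 2, \tag{2.49}$$

$$\frac{\partial \log \pi_{ij}}{\partial b_j} = \sum_{l=0}^{i-2} \frac{1}{b_j + l} - \sum_{l=0}^{i-1} \frac{1}{a_j + b_j + l}, \quad i \ge 2;\ j = 1, 2. \tag{2.50}$$

The equations for obtaining the maximum likelihood estimates are solved using the method of scoring (Rao, 1952).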
The information matrix is given by $M = (m_{rs})$, where

$$m_{rs} = N \sum_{i=1}^{\infty} \frac{1}{\pi_i}\, \frac{\partial \pi_i}{\partial \theta_r}\, \frac{\partial \pi_i}{\partial \theta_s}, \quad r, s = 1, 2, \ldots, 5. \tag{2.51}$$

The number of terms used in the infinite series (2.51) is determined by the procedure of Katti and Gurland (1961) described in Section 2.3.1.2. The variance-covariance matrix of the estimators is given by the inverse of $M$.
For the purpose of solving equations (2.41), we need to have a set of initial estimates. Moment estimators of this distribution are very difficult to obtain, since they involve a fifth degree equation. Estimators based on observed proportions as derived in the single beta distribution model are also hard to find. The following guidelines are used to obtain initial estimates. An initial guess is made on the means of the two mixing distributions. This can be done graphically by plotting the empirical cumulative distribution function. An approximate value of the mixing proportion also can be read from the graph. Using these guesses and two observed proportions all parameters can be approximated. Instead of using proportions, equations obtained by equating second and third factorial moments to the corresponding population factorial moments can be used.
2.3.3 Examples
For the purpose of illustration we use the data of 958 women from the Princeton Fertility Study given in Potter and Parker (1964). Potter and Parker fit the single beta distribution model to this data by the method of moments. Majumdar and Sheps (1970) show the fit using the maximum likelihood estimates. Since this model does not fit the data, we will try to fit it with a mixture of beta distributions model. A comparison of the fit for this data under the three different assumptions of fecundability discussed in this chapter is given in Table 2.1. The table shows that the model based on a mixture of distributions gives a better fit for this data.
The chisquare values computed on the basis of the original grouping given in Majumdar and Sheps (1970) show significance under all the models. The recomputed value of chisquare obtained by combining the months 5-6 and 7-9 groups is not significant for the mixture distribution. Because of clustering seen at month six in the original data, the chisquare statistics were recomputed with this grouping. The estimates of various parameters obtained for different models are given in Tables 2.2 and 2.3. The estimates from the mixture distribution show that the population has a high risk group consisting of 63.31% of the population with a mean fecundability of 0.5578 and a less fecund group consisting of 36.69% of the population with a mean fecundability of 0.1167. The data show that nearly 40% of the women conceived during the first month. The two fitted distributions of fecundability are shown in Figure 2.1. The mixture distribution shows two distinct distributions.
Although the iterative equations provided a solution and the likelihood function increased at each iteration, the rate of convergence was very slow. It took slightly over one hundred iterations before the algorithm converged. The correlation matrix of the estimates showed high correlation among the estimates, and the estimates had large variances. This might be a reason for the slow process of convergence. Hasselblad (1969) and Oppenheimer (1971) observed similar problems while studying mixtures of two Poisson distributions and mixtures of two exponential distributions, respectively. They noticed that, in such instances, a single distribution might be an adequate representation of the data. The test based on a chisquare statistic showed that the model with a single distribution was not adequate for our data. In subsequent chapters we have considered the estimation problems in a single beta model from censored and truncated samples. Chapters III and IV contain discussions of these problems.
TABLE 2.1
Observed and Expected Proportion of Conceptions, by Month Under Various Models

                                        Expected Proportions
              Observed       Homogeneous    Single Beta        Mixture of Two
Month         Proportions    Model          Distribution 1/    Beta Distributions
1             0.3967         0.1829         0.3726             0.3960
2             0.1597         0.1494         0.1814             0.1627
3             0.0981         0.1221         0.1053             0.0899
4             0.0470         0.0997         0.0681             0.0587
5-6           0.0971         0.1482         0.0815             0.0752
7-9           0.0532         0.1352         0.06??             0.0663
10-12         0.0480         0.0738         0.0336             0.0409
13-24         0.0710         0.0808         0.0516             0.0709
25-48         0.0209         0.0078         0.0245             0.0306
49+           0.0084         0.0001         0.0186             0.0088

Sample Size   958            958            958                958
Chisquare                    410.17**       34** 1/            15.652** 2/
Value                        (7 df)         (7 df)             (4 df)

1/ From Majumdar and Sheps (1970)
2/ Chisquare values re-computed, combining groups 5-6 and 7-9 as a single group 5-9, are
     Single Beta Distribution Model            29.73** (6 df)
     Mixture of Two Beta Distribution Model    7.64 (3 df)
** Significant at 1% level of significance.
TABLE 2.2
Estimates and Their Standard Errors for Various Models

Models and Parameters Estimated                       Estimate     Standard Error

I.   Homogeneous Model
       p                                              0.1829       0.0053

II.  Single Beta Distribution 1/
       a                                              1.29         0.11
       b                                              2.17         0.27
       Mean Fecundability a/(a+b)                     0.373        0.015

III. Mixture of Two Beta Distributions
       a_1                                            2.3206       4.9593
       b_1                                            1.8397       2.2611
       a_2                                            4.3589       4.4858
       b_2                                            32.9862      53.5661
       alpha                                          0.6331       0.3884
       Mean Fecundability of the
         Mixture Distribution                         0.3960       0.3864
       Mean Fecundability of First
         Beta Distribution                            0.5578       0.2345
       Mean Fecundability of Second
         Beta Distribution                            0.1167       0.0624

1/ From Majumdar and Sheps (1970)
TABLE 2.3
The Correlation Matrix (R) of the Estimates of the Parameters of the Mixture of Two Beta Distributions

       1
       0.9849     1
R =   -0.8938    -0.8368     1
      -0.9062    -0.8446     0.9958     1
      -0.9778    -0.9309     0.9394     0.9567     1
Figure 2.1
FITTED DISTRIBUTIONS OF FECUNDABILITY
[Plot of density (vertical axis, 0 to 3.14) against fecundability (horizontal axis, 0 to 1.0).]
I: Single beta distribution with a = 1.29, b = 2.17
II: Mixture of two beta distributions with a_1 = 2.3206, b_1 = 1.8397, a_2 = 4.3589, b_2 = 32.9862, alpha = 0.6331
CHAPTER III
CONCEPTION TIME MODELS - ESTIMATION FROM CENSORED SAMPLES
3.1 Introduction
In the previous chapter we assumed that observations of the conception times of women continued until all women conceived. In actual situations this is not always feasible. Instead, all women in a study are observed prospectively for at most $t$ months. Let us further assume that all women are observed for exactly $t$ months and that there are no losses from the sample before time $t$ due to any cause other than conception. Then the data is said to be singly censored at $t$. If out of an initial cohort of $N$ women, only $n$ women conceive before $t$, the exact time of conception is known only for these $n$ women. The remaining $(N - n)$ women are known to have a conception time greater than $t$. Sometimes women drop out of the study for reasons other than conception. If a woman drops out in month $i$ $(i < t)$, it is only known that she has a conception time greater than $i$. In that case we consider the data to be multiply censored.
Multiple censoring can also occur in the following survey situations. A study is conducted for $t$ months, and all those who get married during this period are observed until the end of the $t$th month. This study will consist of women of different marital durations. Since marital duration varies, a woman with marital duration, say $t_i$, is observed only for a period of $t_i \le t$ months. If she has not conceived by then, it is only known that her conception time exceeds $t_i$ months.
In this chapter we will consider the problem of estimation from censored data. For singly censored data, algorithms are presented to obtain moment type estimates, maximum likelihood estimates, and minimum chisquare estimates, and their asymptotic relative efficiencies are obtained. We also present an algorithm to obtain maximum likelihood estimates from multiply censored data. For singly censored data, the underlying distribution is assumed to be geometric compounded with a single beta distribution.
3.2 The Estimators from Singly Censored Data
Let $\pi_x$ denote the probability of conceiving in month $x$. Then for the model under consideration, $\pi_x$ is given by (2.14), i.e.,

$$
\pi_x = \begin{cases}
\dfrac{a}{a+b}, & x = 1, \\[2ex]
\dfrac{a\,b(b+1)\cdots(b+x-2)}{(a+b)(a+b+1)\cdots(a+b+x-1)}, & x \ge 2.
\end{cases}
$$

Let $R$ denote the probability that the conception time exceeds $t$ months. Then $R$ is given by (2.15), i.e.,

$$
R = P(X > t) = \frac{b(b+1)\cdots(b+t-1)}{(a+b)(a+b+1)\cdots(a+b+t-1)} .
$$

A sample of $N$ women is observed for $t$ months. Of all women observed, only $n$ women conceive by month $t$. We can consider the $N$ women to fall into $(t+1)$ categories according to the month of conception, and the $(t+1)$th category is formed by all $(N-n)$ women who do not conceive by month $t$. The probability of belonging to any of the first $t$ categories is given by $\pi_x$, $x = 1, 2, \ldots, t$, and for the $(t+1)$th category the corresponding probability is given by $R$.
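As a numerical illustration of the category probabilities in (2.14) and (2.15), the following minimal Python sketch evaluates $\pi_x$ and $R$ through log-gamma functions, using the equivalent forms $\pi_x = B(a+1, b+x-1)/B(a,b)$ and $R = B(a, b+t)/B(a,b)$. The function names and the parameter values $a = 3$, $b = 5$, $t = 12$ are illustrative assumptions, not taken from the text.

```python
from math import lgamma, exp

def log_beta(x, y):
    """Logarithm of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def conception_prob(x, a, b):
    """pi_x of (2.14), written as B(a+1, b+x-1) / B(a, b)."""
    return exp(log_beta(a + 1, b + x - 1) - log_beta(a, b))

def exceed_prob(t, a, b):
    """R of (2.15), written as B(a, b+t) / B(a, b)."""
    return exp(log_beta(a, b + t) - log_beta(a, b))

# Illustrative (assumed) parameter values.
a, b, t = 3.0, 5.0, 12
pi = [conception_prob(x, a, b) for x in range(1, t + 1)]
R = exceed_prob(t, a, b)

# The (t+1) category probabilities should add to one.
print(sum(pi) + R)
```

The log-gamma form avoids overflow in the long products of (2.14) when $x$ or $t$ is large.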
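3.2.1 Moment Type Estimators
When the samples are censored the conventional moment estimators cannot be obtained. We present below a procedure to derive certain moment type estimators from singly censored data.
Let the samples be censored at time $t$. Of $N$ women observed, $n_1, n_2, \ldots, n_t$ women conceived during months $1, 2, \ldots, t$, respectively, and $\sum_{i=1}^{t} n_i = n$. For the remaining $(N-n)$ women, it is known that their conception time exceeds $t$. Let the ratios

$$
p_i = \frac{n_i}{N}, \quad i = 1, 2, \ldots, t, \qquad \text{and} \qquad \hat R = \frac{N-n}{N} .
$$

Then $p_1, p_2, \ldots, p_t$ and $\hat R$ are unbiased estimates of $\pi_1, \pi_2, \ldots, \pi_t$ and $R$, respectively.
From the general expression for the $r$th factorial moment of the distribution given in (2.17), we have

$$
E(X) = \sum_{x=1}^{\infty} x\,\pi_x = \frac{a+b-1}{a-1} \tag{3.1}
$$

and

$$
E(X(X-1)) = \sum_{x=1}^{\infty} x(x-1)\,\pi_x = \frac{2(a+b-1)\,b}{(a-1)(a-2)} . \tag{3.2}
$$

The sum in (3.1) can be split as

$$
\sum_{x=1}^{\infty} x\,\pi_x = \sum_{x=1}^{t} x\,\pi_x + \sum_{x=t+1}^{\infty} x\,\pi_x . \tag{3.3}
$$

Substituting for $\pi_x$ from (2.14) and simplifying, the second sum in (3.3) reduces to

$$
\sum_{x=t+1}^{\infty} x\,\pi_x = (t+1)R + \frac{(b+t)}{(a-1)}\,R . \tag{3.4}
$$

Similarly it can be shown that

$$
\sum_{x=t+1}^{\infty} x(x-1)\,\pi_x = t(t+1)R + \frac{2(t+1)(b+t)}{(a-1)}\,R + \frac{2(b+t)(b+t+1)}{(a-1)(a-2)}\,R . \tag{3.5}
$$

Also, the sum in (3.2) can be split as

$$
\sum_{x=1}^{\infty} x(x-1)\,\pi_x = \sum_{x=1}^{t} x(x-1)\,\pi_x + \sum_{x=t+1}^{\infty} x(x-1)\,\pi_x . \tag{3.6}
$$

Let

$$
T_1 = \sum_{x=1}^{t} x\,p_x \tag{3.7}
$$

and let

$$
T_2 = \sum_{x=1}^{t} x(x-1)\,p_x . \tag{3.8}
$$

Then,

$$
E(T_1) = \sum_{x=1}^{t} x\,\pi_x \qquad \text{and} \qquad E(T_2) = \sum_{x=1}^{t} x(x-1)\,\pi_x .
$$

Hence $T_1$ and $T_2$ can be considered as estimates of $\sum_{x=1}^{t} x\,\pi_x$ and $\sum_{x=1}^{t} x(x-1)\,\pi_x$, respectively. Substituting for the relevant sums in (3.3) and (3.6), we get the following equations to estimate $a$ and $b$:

$$
T_1 + (t+1)\hat R + \frac{(\hat b+t)}{(\hat a-1)}\,\hat R = \frac{\hat a+\hat b-1}{\hat a-1} , \tag{3.9}
$$

$$
T_2 + t(t+1)\hat R + \frac{2(t+1)(\hat b+t)}{(\hat a-1)}\,\hat R + \frac{2(\hat b+t)(\hat b+t+1)}{(\hat a-1)(\hat a-2)}\,\hat R = \frac{2(\hat a+\hat b-1)\,\hat b}{(\hat a-1)(\hat a-2)} , \tag{3.10}
$$

where $\hat a$ and $\hat b$ are the estimates of $a$ and $b$, respectively.
From equation (3.9) we obtain

$$
\hat b\,(1-\hat R) = (\hat a-1)\{T_1 + (t+1)\hat R - 1\} + t\hat R . \tag{3.11}
$$

Using (3.10) and (3.11), we get

$$
\hat a = \frac{2\{(1-\hat R)T_2 + (1-\hat R)T_1 - T_1^2\}}{T_2(1-\hat R) + t(t+1)\hat R(1-\hat R) - 2T_1\{T_1 + (t+1)\hat R - 1\}} \tag{3.12}
$$

and

$$
\hat b = \frac{(\hat a-1)[T_1 + (t+1)\hat R - 1] + t\hat R}{(1-\hat R)} . \tag{3.13}
$$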
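The estimators (3.12) and (3.13) are closed-form functions of $T_1$, $T_2$ and $\hat R$, so they are straightforward to compute. The Python sketch below is one possible implementation; the function names, the argument layout (`counts[i-1]` is $n_i$) and the parameter values in the check are illustrative assumptions. As a check, it feeds in the expected counts $N\pi_x$ of a model with $a > 2$, in which case the estimating equations hold exactly and the true parameters should be returned.

```python
from math import lgamma, exp

def beta_geometric_pmf(x, a, b):
    """pi_x of (2.14), via log-gamma functions (helper for the check below)."""
    return exp(lgamma(a + 1) + lgamma(b + x - 1) - lgamma(a + b + x)
               - lgamma(a) - lgamma(b) + lgamma(a + b))

def moment_type_estimates(counts, N, t):
    """Moment type estimates (3.12)-(3.13) from singly censored data.

    counts[i-1] is the number of women conceiving in month i (i = 1..t);
    N is the total number of women observed.
    """
    p = [n / N for n in counts]
    R = 1.0 - sum(p)                                   # (N - n) / N
    T1 = sum(x * px for x, px in enumerate(p, start=1))            # (3.7)
    T2 = sum(x * (x - 1) * px for x, px in enumerate(p, start=1))  # (3.8)
    u = T1 + (t + 1) * R - 1.0
    denom = T2 * (1 - R) + t * (t + 1) * R * (1 - R) - 2 * T1 * u
    a_hat = 2 * ((1 - R) * T2 + (1 - R) * T1 - T1 ** 2) / denom    # (3.12)
    b_hat = ((a_hat - 1) * u + t * R) / (1 - R)                    # (3.13)
    return a_hat, b_hat

# Check with exact expected counts (assumed values a = 5, b = 3, t = 12).
a_true, b_true, t, N = 5.0, 3.0, 12, 1000.0
counts = [N * beta_geometric_pmf(x, a_true, b_true) for x in range(1, t + 1)]
a_hat, b_hat = moment_type_estimates(counts, N, t)
print(a_hat, b_hat)
```

With real data the denominator of (3.12) can be close to zero or negative when the sample is small or $a$ is near 2, so the output should be inspected before it is used as a starting value for an iterative method.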
Let us examine the behaviour of these estimates as $t \to \infty$. From the definition of $R$, we have

$$
\lim_{t\to\infty} R = \lim_{t\to\infty} \int_0^1 (1-p)^t\, \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\, dp . \tag{3.14}
$$

For $0 < p < 1$,

$$
(1-p)^t\, \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)} \le \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)} ,
$$

and the right hand side of the inequality, being a beta density, is integrable for $a > 0$, $b > 0$ and $0 < p < 1$. Also, $\lim_{t\to\infty}(1-p)^t$ exists and is equal to zero. Hence by the dominated convergence theorem (Loève 1963, page 125),

$$
\lim_{t\to\infty} R = \int_0^1 \lim_{t\to\infty}(1-p)^t\, \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\, dp = 0 . \tag{3.15}
$$
We will follow similar arguments to that used to obtain the limit in
(3.15) to get
lim
t-+co
t
R
=
l
a-l
b-l
lim .. [t(l-p)t] P
(l-p)
t-+co
Jo
It(1_p)t-1, ~ --~1~~2' for
[l-(l-p)]
Since
t(l_p)t p
dp •
B(a,b)
a-I
(I-p)
B(a,b)
b-1
<
...!..
-
2
p
0
<
p
<
1 ,
(3.16)
we have
pa(l_p)b-l
B(a,b)
and the right hand side of the inequality is integrable for
,
a > 1
b > O.
and
L Hospital's rule
Applying
t
lim t(l-p)t = lim -(l-p)
t-+co
t-+co log(l-p)
for
0 < p < 1.
lim tR =
t-+co
0 < P < 1,
=0
t
Hence by the dominated convergence thoerem
f
1~lim (l_)t~ p a-I (l-p) b-l
t
0 t-+co
dp
B(a, b)
P
=0 .
(3.22)
By following similar arguments, one can also show that

$$
\lim_{t\to\infty} t(t+1)R = 0, \quad \text{for } a > 2,\ b > 0 . \tag{3.23}
$$

Also when $t$ is infinite, $T_1$ and $T_2$ are the first two sample factorial moments of the complete distribution. Using (3.15), (3.22) and (3.23), we obtain

$$
\lim_{t\to\infty} \hat a = \frac{2(T_2 + T_1 - T_1^2)}{T_2 - 2T_1(T_1 - 1)} = \frac{2(m_2 - m_1^2)}{m_2 - 2m_1^2 + m_1} , \tag{3.24}
$$

where $m_1$ and $m_2$ are the first two sample moments about zero. Also

$$
\lim_{t\to\infty} \hat b = (\hat a - 1)(m_1 - 1) . \tag{3.25}
$$

The relations (3.24) and (3.25) agree with the moment estimators from complete samples given in (2.19).
The moment type estimators derived exist only when the second factorial moment of the distribution exists, which holds only if $a > 2$. Therefore $\hat a$ and $\hat b$ do not exist for $a \le 2$.
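3.2.2 Asymptotic Variance-Covariance Matrix of Moment Estimators
We notice in (3.12) and (3.13) that the estimators $\hat a$ and $\hat b$ are functions of the random variables $T_1$, $T_2$ and $\hat R$. Let

$$
\hat a = \phi_1(T_1, T_2, \hat R) \tag{3.26}
$$

and

$$
\hat b = \phi_2(T_1, T_2, \hat R) , \tag{3.27}
$$

where $\phi_1$ and $\phi_2$ denote two functions. Then, for large samples, the variance-covariance matrix of $\hat a$ and $\hat b$ is given by (Cramér (1946))

$$
W = J\,V\,J' , \tag{3.28}
$$

where $W$ is the asymptotic variance-covariance matrix of $\hat a$ and $\hat b$,

$$
J = \begin{pmatrix}
\dfrac{\partial\phi_1}{\partial T_1} & \dfrac{\partial\phi_1}{\partial T_2} & \dfrac{\partial\phi_1}{\partial \hat R} \\[2ex]
\dfrac{\partial\phi_2}{\partial T_1} & \dfrac{\partial\phi_2}{\partial T_2} & \dfrac{\partial\phi_2}{\partial \hat R}
\end{pmatrix}
$$

evaluated at the expected values of $T_1$, $T_2$ and $\hat R$, and $V$ is the variance-covariance matrix of $T_1$, $T_2$ and $\hat R$, written as

$$
V = \begin{pmatrix}
\mathrm{Var}(T_1) & \mathrm{Cov}(T_1, T_2) & \mathrm{Cov}(T_1, \hat R) \\
\mathrm{Cov}(T_1, T_2) & \mathrm{Var}(T_2) & \mathrm{Cov}(T_2, \hat R) \\
\mathrm{Cov}(T_1, \hat R) & \mathrm{Cov}(T_2, \hat R) & \mathrm{Var}(\hat R)
\end{pmatrix} .
$$

From the multinomial law we have

$$
\mathrm{Var}(p_i) = \frac{\pi_i(1-\pi_i)}{N}, \quad i = 1, 2, \ldots, t , \tag{3.29}
$$

$$
\mathrm{Cov}(p_i, p_j) = -\frac{\pi_i \pi_j}{N}, \quad i, j = 1, 2, \ldots, t \ (i \ne j) , \tag{3.30}
$$

and

$$
\mathrm{Var}(\hat R) = \frac{R(1-R)}{N} . \tag{3.31}
$$

We will now find the elements of the $V$ matrix using the results (3.29)-(3.31).

$$
\mathrm{Var}(T_1) = \mathrm{Var}\Big[\sum_{i=1}^{t} i\,p_i\Big] = \sum_{i=1}^{t} i^2\, \mathrm{Var}(p_i) + \sum_{i \ne j} i\,j\ \mathrm{Cov}(p_i, p_j) .
$$

Substituting for $\mathrm{Var}(p_i)$ and $\mathrm{Cov}(p_i, p_j)$ using (3.29)-(3.31) and simplifying we obtain

$$
\mathrm{Var}(T_1) = \frac{1}{N}\Big[\sum_{i=1}^{t} i^2 \pi_i - \Big\{\sum_{i=1}^{t} i\,\pi_i\Big\}^2\Big] . \tag{3.32}
$$

Similarly, it can be shown that

$$
\mathrm{Var}(T_2) = \frac{1}{N}\Big[\sum_{i=1}^{t} i(i-1)(i-2)(i-3)\pi_i + 4\sum_{i=1}^{t} i(i-1)(i-2)\pi_i + 2\sum_{i=1}^{t} i(i-1)\pi_i - \Big\{\sum_{i=1}^{t} i(i-1)\pi_i\Big\}^2\Big] , \tag{3.33}
$$

$$
\mathrm{Cov}(T_1, T_2) = \frac{1}{N}\Big[\sum_{i=1}^{t} i(i-1)(i-2)\pi_i + 2\sum_{i=1}^{t} i(i-1)\pi_i - \Big\{\sum_{i=1}^{t} i\,\pi_i\Big\}\Big\{\sum_{i=1}^{t} i(i-1)\pi_i\Big\}\Big] , \tag{3.34}
$$

$$
\mathrm{Cov}(T_1, \hat R) = -\frac{R}{N} \sum_{i=1}^{t} i\,\pi_i , \tag{3.35}
$$

$$
\mathrm{Cov}(T_2, \hat R) = -\frac{R}{N} \sum_{i=1}^{t} i(i-1)\,\pi_i . \tag{3.36}
$$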
The sums involved in expressions (3.32)-(3.36) are obtained as follows. Using (3.1) and (3.4) we have

$$
\sum_{i=1}^{t} i\,\pi_i = \frac{(a+b-1)}{(a-1)} - \Big\{(t+1)R + \frac{(b+t)R}{(a-1)}\Big\} . \tag{3.37}
$$

From (3.2) and (3.5) we obtain

$$
\sum_{i=1}^{t} i(i-1)\,\pi_i = \frac{2(a+b-1)b}{(a-1)(a-2)} - \Big\{(t+1)tR + \frac{2(t+1)(b+t)R}{(a-1)} + \frac{2(b+t)(b+t+1)R}{(a-1)(a-2)}\Big\} . \tag{3.38}
$$

By similar manipulations we can show that

$$
\sum_{i=1}^{t} i(i-1)(i-2)\,\pi_i = \frac{6(a+b-1)b(b+1)}{(a-1)(a-2)(a-3)} - \sum_{i=t+1}^{\infty} i(i-1)(i-2)\,\pi_i \tag{3.39}
$$

and

$$
\sum_{i=1}^{t} i(i-1)(i-2)(i-3)\,\pi_i = \frac{24(a+b-1)b(b+1)(b+2)}{(a-1)(a-2)(a-3)(a-4)} - \sum_{i=t+1}^{\infty} i(i-1)(i-2)(i-3)\,\pi_i , \tag{3.40}
$$

where

$$
\sum_{i=t+1}^{\infty} i(i-1)(i-2)\,\pi_i = \Big[(t+1)t(t-1) + \frac{3t(t+1)(b+t)}{(a-1)} + \frac{6(t+1)(b+t)(b+t+1)}{(a-1)(a-2)} + \frac{6(b+t)(b+t+1)(b+t+2)}{(a-1)(a-2)(a-3)}\Big] R
$$

and

$$
\sum_{i=t+1}^{\infty} i(i-1)(i-2)(i-3)\,\pi_i = \Big[(t-2)(t-1)t(t+1) + \frac{4(t-1)t(t+1)(b+t)}{(a-1)} + \frac{12\,t(t+1)(b+t)(b+t+1)}{(a-1)(a-2)} + \frac{24(t+1)(b+t)(b+t+1)(b+t+2)}{(a-1)(a-2)(a-3)} + \frac{24(b+t)(b+t+1)(b+t+2)(b+t+3)}{(a-1)(a-2)(a-3)(a-4)}\Big] R .
$$
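The elements of the $J$ matrix can be obtained by replacing $T_1$, $T_2$ and $\hat R$ by their expected values in the following expressions.

$$
\frac{\partial \hat a}{\partial T_1} = \frac{2\{(1-\hat R) - 2T_1\} + 2\hat a\{2T_1 + (t+1)\hat R - 1\}}{T_2(1-\hat R) + t(t+1)\hat R(1-\hat R) - 2T_1\{T_1 + (t+1)\hat R - 1\}} , \tag{3.41}
$$

$$
\frac{\partial \hat a}{\partial T_2} = \frac{(1-\hat R)(2-\hat a)}{T_2(1-\hat R) + t(t+1)\hat R(1-\hat R) - 2T_1\{T_1 + (t+1)\hat R - 1\}} , \tag{3.42}
$$

$$
\frac{\partial \hat a}{\partial \hat R} = \frac{-2(T_1 + T_2) - \hat a\{-T_2 + t(t+1)(1-2\hat R) - 2T_1(t+1)\}}{T_2(1-\hat R) + t(t+1)\hat R(1-\hat R) - 2T_1\{T_1 + (t+1)\hat R - 1\}} , \tag{3.43}
$$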
ab"
aT
(3.44 )
~
-=
ab"
aR"
1
"
(l-R)
2
=
A
A
~;2 {Tl+(t+1)R-l~
A
,
(3.45)
e
Expressions (3.31), (3.32) and (3.33) show that $\mathrm{Var}(\hat R)$, $\mathrm{Var}(T_1)$ and $\mathrm{Var}(T_2)$ are of $O(N^{-1})$. Hence by applying Chebychev's inequality (Cramér (1946), Section 20.3), $\hat R$, $T_1$ and $T_2$ converge respectively to their corresponding population values $R$, $\sum_{i=1}^{t} i\,\pi_i$ and $\sum_{i=1}^{t} i(i-1)\,\pi_i$ as $N \to \infty$. Also $\hat a$ and $\hat b$ are functions of $\hat R$, $T_1$ and $T_2$. Thus applying Slutsky's theorem (Cramér (1946), Section 20.6), $\hat a$ and $\hat b$ are consistent estimates of $a$ and $b$, respectively, provided $R$, $\sum_{i=1}^{t} i\,\pi_i$ and $\sum_{i=1}^{t} i(i-1)\,\pi_i$ exist. Applying a theorem (Cramér (1946), Section 28.4), $\hat a$ and $\hat b$ are asymptotically normally distributed with means at their corresponding population values and their variances given in (3.28).
3.2.3 Maximum Likelihood Estimators
Sheps and Menken (1972b) have described an algorithm to get the maximum likelihood estimates for the present model from multiply censored data. Even though singly censored data is a special case of multiple censoring, we present here a different algorithm, which can also be used for grouped and truncated samples with necessary modifications.
As pointed out in Section (3.2), under the singly censored case, women can be classified into $(t+1)$ categories. Then the maximum likelihood estimates are given by the iterative equations given in (1.9), i.e.,
e=
-(n) ,
§
(3.46)
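where $\theta^{(n)}$ is the maximum likelihood estimate of $\theta = (a, b)$ at the $n$th iteration,

$$
H = \begin{pmatrix}
\dfrac{\partial \pi_1}{\partial a} & \dfrac{\partial \pi_1}{\partial b} \\[1.5ex]
\dfrac{\partial \pi_2}{\partial a} & \dfrac{\partial \pi_2}{\partial b} \\
\vdots & \vdots \\
\dfrac{\partial \pi_t}{\partial a} & \dfrac{\partial \pi_t}{\partial b}
\end{pmatrix}
$$

and

$$
V^{-1} = \begin{pmatrix}
\dfrac{1}{\pi_1} + \dfrac{1}{R} & \dfrac{1}{R} & \cdots & \dfrac{1}{R} \\[1.5ex]
\dfrac{1}{R} & \dfrac{1}{\pi_2} + \dfrac{1}{R} & \cdots & \dfrac{1}{R} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{1}{R} & \dfrac{1}{R} & \cdots & \dfrac{1}{\pi_t} + \dfrac{1}{R}
\end{pmatrix} .
$$

The elements of the $H$ matrix can be obtained as follows:

$$
\frac{\partial \pi_i}{\partial a} = \pi_i\, \frac{\partial \log \pi_i}{\partial a}, \quad i = 1, 2, \ldots, t , \tag{3.47}
$$

$$
\frac{\partial \pi_i}{\partial b} = \pi_i\, \frac{\partial \log \pi_i}{\partial b}, \quad i = 1, 2, \ldots, t , \tag{3.48}
$$

where $\dfrac{\partial \log \pi_i}{\partial a}$ and $\dfrac{\partial \log \pi_i}{\partial b}$, $i = 1, 2, \ldots, t$, are given in (2.23) and (2.24).
The estimates using two proportions given by (2.34) and (2.35) or the moment estimates obtained in Section 3.2.1 can be used as the initial estimates for the iterative equation (3.46). The asymptotic variance-covariance matrix of the maximum likelihood estimators is given by

$$
M = (H' V^{-1} H)^{-1} , \tag{3.49}
$$

where $M$ denotes the asymptotic variance-covariance matrix.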
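To make the iterative idea concrete, here is a minimal Python sketch of a Fisher scoring iteration for the singly censored multinomial likelihood with $(t+1)$ categories. It is not claimed to reproduce the iteration (1.9) of the text: the step-halving safeguard, the function names, the convergence tolerance and the starting values are all assumptions made for illustration. The derivatives of $\log \pi_x$ and $\log R$ are obtained by differentiating the products in (2.14) and (2.15).

```python
from math import lgamma, exp

def log_probs(a, b, t):
    """Logs of pi_1..pi_t from (2.14), followed by log R from (2.15)."""
    log_b = lgamma(a) + lgamma(b) - lgamma(a + b)
    out = [lgamma(a + 1) + lgamma(b + x - 1) - lgamma(a + b + x) - log_b
           for x in range(1, t + 1)]
    out.append(lgamma(a) + lgamma(b + t) - lgamma(a + b + t) - log_b)
    return out

def log_prob_gradients(a, b, t):
    """(d/da, d/db) of every log-probability returned by log_probs."""
    grads, s_ab, s_b = [], 0.0, 0.0
    for x in range(1, t + 1):
        s_ab += 1.0 / (a + b + x - 1)               # sum_{j=0}^{x-1} 1/(a+b+j)
        grads.append((1.0 / a - s_ab, s_b - s_ab))  # s_b = sum_{j=0}^{x-2} 1/(b+j)
        s_b += 1.0 / (b + x - 1)
    grads.append((-s_ab, s_b - s_ab))               # censored category
    return grads

def fit_singly_censored(counts, n_censored, a, b, max_iter=200, tol=1e-10):
    """Fisher scoring with step halving (an assumed safeguard)."""
    t = len(counts)
    obs = list(counts) + [n_censored]
    N = sum(obs)

    def loglik(a_, b_):
        return sum(n * lp for n, lp in zip(obs, log_probs(a_, b_, t)))

    for _ in range(max_iter):
        probs = [exp(lp) for lp in log_probs(a, b, t)]
        grads = log_prob_gradients(a, b, t)
        # Score vector and expected information matrix.
        sa = sum(n * g[0] for n, g in zip(obs, grads))
        sb = sum(n * g[1] for n, g in zip(obs, grads))
        iaa = N * sum(p * g[0] * g[0] for p, g in zip(probs, grads))
        iab = N * sum(p * g[0] * g[1] for p, g in zip(probs, grads))
        ibb = N * sum(p * g[1] * g[1] for p, g in zip(probs, grads))
        det = iaa * ibb - iab * iab
        da = (ibb * sa - iab * sb) / det
        db = (iaa * sb - iab * sa) / det
        # Halve the step until parameters stay positive and loglik does not drop.
        step, base = 1.0, loglik(a, b)
        while step > 1e-8 and (a + step * da <= 0 or b + step * db <= 0
                               or loglik(a + step * da, b + step * db) < base):
            step /= 2
        a, b = a + step * da, b + step * db
        if abs(step * da) + abs(step * db) < tol:
            break
    return a, b

# Check on exact expected counts (assumed a = 3, b = 5, t = 12, N = 1000).
t, N = 12, 1000.0
expected = [N * exp(lp) for lp in log_probs(3.0, 5.0, t)]
a_hat, b_hat = fit_singly_censored(expected[:t], expected[t], 2.5, 4.0)
print(a_hat, b_hat)
```

The inverse of the information matrix at convergence, with elements `iaa`, `iab`, `ibb`, plays the role of the asymptotic variance-covariance matrix of the estimates.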
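3.2.4 Minimum Chisquare Estimators
The minimum chisquare estimates are obtained using the iterative equation given in (1.15), i.e.,
(3.50)
where $\theta^{(n)}$ is the minimum chisquare estimate of $\theta = (a, b)$ at the $n$th iteration,

$$
\dot\chi^2(\theta) = \begin{pmatrix} \dfrac{\partial \chi^2}{\partial a} \\[1.5ex] \dfrac{\partial \chi^2}{\partial b} \end{pmatrix}
\qquad \text{and} \qquad
\ddot\chi^2(\theta) = \begin{pmatrix}
\dfrac{\partial^2 \chi^2}{\partial a^2} & \dfrac{\partial^2 \chi^2}{\partial a\,\partial b} \\[1.5ex]
\dfrac{\partial^2 \chi^2}{\partial a\,\partial b} & \dfrac{\partial^2 \chi^2}{\partial b^2}
\end{pmatrix} .
$$

Set $\pi_{t+1} = R$ and $p_{t+1} = \hat R$. Then, the elements of $\dot\chi^2(\theta)$ can be expressed as follows:

$$
\frac{\partial \chi^2}{\partial a} = -N \sum_{i=1}^{t+1} \frac{p_i^2}{\pi_i^2}\, \frac{\partial \pi_i}{\partial a} , \tag{3.51}
$$

$$
\frac{\partial \chi^2}{\partial b} = -N \sum_{i=1}^{t+1} \frac{p_i^2}{\pi_i^2}\, \frac{\partial \pi_i}{\partial b} . \tag{3.52}
$$

The elements of $\ddot\chi^2(\theta)$ are given by

$$
\frac{\partial^2 \chi^2}{\partial \theta_j\, \partial \theta_k} = 2N \sum_{i=1}^{t+1} \frac{p_i^2}{\pi_i^3}\, \frac{\partial \pi_i}{\partial \theta_j}\, \frac{\partial \pi_i}{\partial \theta_k} - N \sum_{i=1}^{t+1} \frac{p_i^2}{\pi_i^2}\, \frac{\partial^2 \pi_i}{\partial \theta_j\, \partial \theta_k} , \quad j, k = 1, 2 , \tag{3.53}
$$

where $\theta_1 = a$ and $\theta_2 = b$.
For computational purposes the second order derivatives in (3.53) can be expressed as

$$
\frac{\partial^2 \pi_i}{\partial \theta_j\, \partial \theta_k} = \pi_i \Big[\frac{\partial^2 \log \pi_i}{\partial \theta_j\, \partial \theta_k} + \frac{\partial \log \pi_i}{\partial \theta_j}\, \frac{\partial \log \pi_i}{\partial \theta_k}\Big] , \quad j, k = 1, 2 , \tag{3.54}
$$

where $\dfrac{\partial^2 \log \pi_i}{\partial \theta_j\, \partial \theta_k}$, $j, k = 1, 2$, are given in (2.26)-(2.28).
As in the case of the maximum likelihood estimates, the estimates based on the proportions given by (2.34) and (2.35) or the moment type estimates obtained in Section 3.2.1 can be used as the initial estimates in order to obtain the minimum chisquare estimates.
Since the minimum chisquare estimates and the maximum likelihood estimates have the same asymptotic properties, the asymptotic variance-covariance matrix of the estimates is also obtained using the relation (3.49).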
3.3 Effect of Artificial Grouping
Suppose instead of the natural grouping we get data by combining certain groups together. For example, models for first conception give the probability distribution of waiting time in months, whereas data are obtained in three or six month intervals. This kind of grouping is done either because the records are collected in this form or because grouping has been used to reduce the effects of age errors due to digit preference, etc., and chance fluctuations of small numbers.
Let us consider samples censored at time $t$. The data in its natural form consist of $(t+1)$ categories with probabilities of belonging to each category given by $\pi_1, \pi_2, \ldots, \pi_t$ and $R$ defined by (2.14) and (2.15). The asymptotic variances of the moment and maximum likelihood estimates are given in Section (3.2). When the data are grouped artificially, exact sample moments cannot be estimated. The alternative is to substitute for sample moments approximate moments from grouped data.
In the following paragraphs we will derive analytical expressions for the variance-covariance matrix of the maximum likelihood or minimum $\chi^2$ estimates and compare them with the variance-covariance matrix of estimates from natural grouping.
Let us assume that there are $(c+1)$ categories in the new grouping. They are made by grouping the first $r_1$ categories to form the first group, the next $r_2$ categories to form the second group and so on, and the final $r_{c+1}$ categories form the $(c+1)$th class; so that we have
60
r
Let
l
+ r
Z
.. . +
+
r c+ l = (t+l)
.
(3.55)
•
" •
TI c+ l be the corresponding probabilities.
,
• "2 ,···, *c l, and TI •c+l 1 - Lc TI."
Let TI • = [TIl'
TIl' TI z"'"
1T
Tf
i=l
Let
cxt
D be a
1
matrix defined as follows
!? = ~"""
11. •• 1
l
00 ••• a
cxt
00 ••• 0
~ :2~~
00 ••• a
11. .• 1
00 ... a
00 .•• 0
00 ••• a
r
TI
Then
(3.56)
00 ... 0
c
II:'::r
• =DTI
(3.57)
Under the new multinomial law, the estimates have an asymptotic
variance given by (1.10)
~.'
V· =
(3.58)
[
where
•
H ,
V •
are obtained from (1. 6) replacing
TI
by
TI
*
TI
•=
H
cxs
where
s
li"
~e J
,
is the number of unknown parameters estimated, and
vector of unknown parameters.
(3.59)
e
is the
e
61
From (3.57)
a1T *
--=
ae
'Ir
i. e.,
H
From the definition of V!
(3.60)
• D H •
matrix, we can write
(3.61)
where
[Diag(1T)]
is a
(txt)
diagonal matrix
with
n , i
i
= 1,
2, ••• ,t
as diagonal elements.
Similarly
t
Y 'Ir = N[Diag(1T
'Ir
)
lit
-
1T
1T
lit
]
(3.62)
1T
and
(3.63)
lit
Also,
lit'
W 1T= D 1T 1T
,
D
(3.64)
Hence,
V
1T
*= N[~{Diag(1T)}~ ,
=~
"
D]
, ,
N[Diag(1T) - 1T 1T
=DV
~
- D 1T 1T
~1T
]~
t
D
~
The asymptotic variance-covariance matrix of the new estimates is
given as
(3.65)
62
V*
= [H*'V-:01a.*]-1
_ _ * _
1T
,
= [J:!
(3.66)
The estimates from natural data have the asymptotic variance-covariance
matrix
(3.67)
Using the definition given in (1.8.2), the asymptotic relative
efficiencies of the new estimates to the estimates from natural
grouping is given as
(3.68)
3.4 Estimation Under Multiple Censoring
Earlier in Section 1.7 we have described two specific instances where multiply censored data can occur. They are:
1. A marriage cohort of $N$ women is observed for a maximum duration of $t$ months. Every month a number of women are withdrawn from those who have not conceived till that period. The numbers withdrawn are either fixed or random.
2. A study is conducted for $t$ months and all those who got married during the period are observed. Thus the study will have women of different marital durations, which are the various censoring times. In this case the number of women observed for a period of $t_i$ months is known.
Sheps and Menken (1972b) consider the estimation problem in the latter case, the details of which can be found in their paper. We shall restrict our discussion only to the first situation, which is sometimes known as progressive censoring. We will consider the problem of estimation under two assumptions of fecundability: (1) All women have identical constant fecundability. (2) Fecundability is constant over time for a given woman, but varies among women according to a beta distribution. We have also noted in Section 1.7 how the number withdrawn could be treated as fixed or random.
Let $n_i$, $i = 1, 2, \ldots, t$, be the number of women conceiving in the $i$th month and $c_i$, $i = 1, 2, \ldots, (t-1)$, the number of women withdrawn each month. At the end of $t$ months $c_t$ women, who have not conceived or have not been withdrawn till that time, remain. They may be considered as the withdrawals at the $t$th month. Thus we have,

$$
\sum_{i=1}^{t} (n_i + c_i) = N . \tag{3.69}
$$
3.4.1 Homogeneous Fecundability and Fixed Withdrawals
Let $p$ denote the fecundability of a woman. At the end of the $i$th month $c_i$ women ($c_i$ fixed and known) are withdrawn. Set

$$
M_i = \sum_{j=1}^{i} (n_j + c_j), \quad i = 1, 2, \ldots, (t-1) ,
$$

and set $M_0 = 0$. Then, the conditional probability of $n_i$ conceptions in month $i$ given $n_1, n_2, \ldots, n_{i-1}$ is

$$
P(n_i \mid n_1, \ldots, n_{i-1}) = \binom{N - M_{i-1}}{n_i}\, p^{n_i} (1-p)^{N - M_{i-1} - n_i}, \quad i = 1, 2, \ldots, t . \tag{3.70}
$$

This result is true because in the homogeneous case the conditional probability of conceiving in month $i$, given that she has not conceived before, remains constant and is equal to $p$. The joint probability is

$$
P(n_1, n_2, \ldots, n_t) = \prod_{i=1}^{t} \binom{N - M_{i-1}}{n_i}\, p^{n_i} (1-p)^{N - M_{i-1} - n_i} . \tag{3.71}
$$

By re-arranging the terms the relation (3.71) can also be written as

$$
P(n_1, n_2, \ldots, n_t) = \prod_{i=1}^{t} \binom{N - M_{i-1}}{n_i}\ \prod_{i=1}^{t} \big[p(1-p)^{i-1}\big]^{n_i} \big[(1-p)^{i}\big]^{c_i} , \tag{3.72}
$$

where $p(1-p)^{i-1}$ and $(1-p)^i$ are respectively the probability of conceiving in the $i$th month and the probability of not conceiving in the first $i$ months.
Also,
(3.73)
The expected values of
n
i
are obtained as follows.
65
(3.74)
(3.75)
Writing
(N-M1 )I
as
(N-n1-c 1 )(N-M1-l)1
and splitting the sum, the
right hand side of (3.75) can be written as,
- c1
I
NI
nl (N-n -1) 1
1
(3.76)
,
If we change
N
,
=N-
1, n
2
= n2
- 1,
the first sum of (3.76) can
be written as
Np(l-p)
,
,I
(nl'n2~· .n t )
P(n1 ,n 2 ,···,n t )
= Np(l-p) •
,
Simi1ar1y,if we set
C
2
= C2 +
of (3.76) reduces to the form
,
1,
and
n
2
= n2
- 1,
the second sum
66
Hence,
By a similar argument of writing $(N - M_{i-1})!$ as $(N - M_{i-1})(N - M_{i-1} - 1)!$ and also separating the term with $(N - M_{i-1} - 1)!$ and so on, we can show that

$$
E(n_i) = Np(1-p)^{i-1} - c_1\, p(1-p)^{i-2} - c_2\, p(1-p)^{i-3} - \cdots - c_{i-1}\, p \tag{3.77}
$$

for $i = 2, 3, \ldots, t$.
The relation (3.77) can be interpreted as follows. $Np(1-p)^{i-1}$ women would have conceived had there been no withdrawals and $c_{i-1}\,p$ is the number that would have conceived in the $i$th month had $c_{i-1}$ been retained.
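The closed form (3.77) can be checked against a direct month-by-month bookkeeping of the expected number of women still at risk. The short Python sketch below does this; the function names and the numerical values ($N = 100$, $p = 0.2$, five withdrawals in each of the first three months, $t = 4$) are illustrative assumptions.

```python
def expected_conceptions(N, p, withdrawals, t):
    """E(n_i), i = 1..t, from (3.77); withdrawals[j-1] is c_j for j = 1..t-1."""
    out = []
    for i in range(1, t + 1):
        e = N * p * (1 - p) ** (i - 1)
        for j in range(1, i):
            # Women withdrawn at the end of month j would otherwise have
            # contributed c_j * p * (1-p)^(i-1-j) expected conceptions in month i.
            e -= withdrawals[j - 1] * p * (1 - p) ** (i - 1 - j)
        out.append(e)
    return out

def expected_by_recursion(N, p, withdrawals, t):
    """Same quantity from the expected number at risk at the start of each month."""
    at_risk, out = float(N), []
    for i in range(1, t + 1):
        out.append(p * at_risk)
        if i < t:
            at_risk = at_risk * (1 - p) - withdrawals[i - 1]
    return out

closed = expected_conceptions(100, 0.2, [5, 5, 5], 4)
recursive = expected_by_recursion(100, 0.2, [5, 5, 5], 4)
print(closed)
print(recursive)
```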
3.4.1.1 The Maximum Likelihood Estimate of $p$
The likelihood function of the sample is given by (3.72). Then the log likelihood function is

$$
\log L = \text{constant} + \sum_{i=1}^{t} n_i \log p + \sum_{i=1}^{t} [(i-1)n_i + i\,c_i] \log(1-p) . \tag{3.78}
$$

The estimating equation is obtained by setting the derivative of the log-likelihood with respect to $p$ to zero, i.e.,

$$
\frac{\sum_{i=1}^{t} n_i}{p} - \frac{\sum_{i=1}^{t} [(i-1)n_i + i\,c_i]}{(1-p)} = 0 . \tag{3.79}
$$

Solving (3.79), we have

$$
\hat p = \frac{\sum_{i=1}^{t} n_i}{\sum_{i=1}^{t} i\,(n_i + c_i)} . \tag{3.80}
$$

The numerator of (3.80) is the total number of conceptions during the $t$ months of observation and the denominator is the total woman-months of observation. The estimate is similar to Pearl's pregnancy rate (Pearl 1933).
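As a small worked illustration of (3.80), the Python sketch below computes conceptions per woman-month of observation. The function name and the toy counts are assumptions chosen for illustration; `withdrawals` includes $c_t$, the women remaining at the end of month $t$.

```python
def conception_rate(conceptions, withdrawals):
    """Estimate (3.80): total conceptions divided by total woman-months."""
    total_conceptions = sum(conceptions)
    woman_months = sum(i * (n + c)
                       for i, (n, c) in enumerate(zip(conceptions, withdrawals),
                                                  start=1))
    return total_conceptions / woman_months

# Toy data for N = 100 women and t = 3 (assumed numbers).
n = [20, 15, 10]      # conceptions in months 1, 2, 3
c = [5, 5, 45]        # withdrawals in months 1, 2 and women remaining at month 3
p_hat = conception_rate(n, c)
print(p_hat)          # 45 conceptions over 230 woman-months
```

Here the woman-months are $1 \times 25 + 2 \times 20 + 3 \times 55 = 230$, so the estimate is $45/230$.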
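The asymptotic variance of $\hat p$ is

$$
\mathrm{Var}(\hat p) = -\Big\{E\Big[\frac{\partial^2 \log L}{\partial p^2}\Big]\Big\}^{-1} . \tag{3.81}
$$

Differentiating (3.78) twice and taking expectations,

$$
-E\Big[\frac{\partial^2 \log L}{\partial p^2}\Big] = \frac{\sum_{i=1}^{t} E(n_i)}{p^2} + \frac{\sum_{i=1}^{t} [(i-1)E(n_i) + i\,c_i]}{(1-p)^2} , \tag{3.82}
$$

where $E(n_i)$ is given by (3.77). The asymptotic variance of $\hat p$ is obtained by taking the reciprocal of (3.82).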
3.4.2 Homogeneous Fecundability and Random Withdrawals
Let $\rho$ be the constant probability of losing a woman in a particular month and let $p$ and $\rho$ be independent. We further assume that the loss occurs on the last day of the month. Let $\pi_i$, $i = 1, 2, \ldots, t$, denote the probability of conceiving in the $i$th month and $\pi_{t+i}$, $i = 1, 2, \ldots, t$, denote the probability of withdrawal in the $i$th month. Then,

$$
\begin{aligned}
\pi_i &= p(1-p)^{i-1}(1-\rho)^{i-1}, & i &= 1, 2, \ldots, t , \\
\pi_{t+i} &= \rho\,(1-p)^{i}(1-\rho)^{i-1}, & i &= 1, 2, \ldots, (t-1) , \\
\pi_{2t} &= (1-p)^{t}(1-\rho)^{t-1} . &&
\end{aligned} \tag{3.83}
$$

Let $n_i$ $(i = 1, 2, \ldots, t)$ be the number conceiving and $c_i$ $(i = 1, 2, \ldots, t)$ be the number withdrawn in month $i$. Then, $\sum_{i=1}^{t} (n_i + c_i) = N$. The joint probability of the sample is given by the multinomial law as,

$$
P(n_1, \ldots, n_t, c_1, \ldots, c_t) = \frac{N!}{\prod_{i=1}^{t} n_i!\ \prod_{i=1}^{t} c_i!}\ \prod_{i=1}^{t} \pi_i^{\,n_i}\, \pi_{t+i}^{\,c_i} . \tag{3.84}
$$

Also from the multinomial law we have,

$$
E(n_i) = Np(1-p)^{i-1}(1-\rho)^{i-1}, \quad i = 1, 2, \ldots, t , \tag{3.85}
$$

$$
E(c_i) = \begin{cases}
N\rho\,(1-p)^{i}(1-\rho)^{i-1}, & i = 1, 2, \ldots, (t-1) , \\
N(1-p)^{t}(1-\rho)^{t-1}, & i = t .
\end{cases} \tag{3.86}
$$
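3.4.2.1 The Maximum Likelihood Estimates
The likelihood function for the sample of $N$ women is given by (3.84). The log-likelihood is

$$
\log L = \text{const} + \sum_{i=1}^{t} n_i \log \pi_i + \sum_{i=1}^{t} c_i \log \pi_{t+i} , \tag{3.87}
$$

and

$$
\frac{\partial \log L}{\partial p} = \frac{\sum_{i=1}^{t} n_i}{p} - \frac{\sum_{i=1}^{t} [(i-1)n_i + i\,c_i]}{(1-p)} , \tag{3.88}
$$

$$
\frac{\partial \log L}{\partial \rho} = \frac{\sum_{i=1}^{t-1} c_i}{\rho} - \frac{\sum_{i=1}^{t} (i-1)(n_i + c_i)}{(1-\rho)} . \tag{3.89}
$$

The maximum likelihood estimating equations are obtained by equating (3.88) and (3.89) to zero. Solving the likelihood equations we get

$$
\hat p = \frac{\sum_{i=1}^{t} n_i}{\sum_{i=1}^{t} i\,(n_i + c_i)} \tag{3.90}
$$

and

$$
\hat\rho = \frac{\sum_{i=1}^{t-1} c_i}{\sum_{i=1}^{t-1} c_i + \sum_{i=1}^{t} (i-1)(n_i + c_i)} . \tag{3.91}
$$

The variance-covariance matrix is given by the inverse of the matrix $M = (m_{ij})$, where

$$
m_{ij} = -E\Big[\frac{\partial^2 \log L}{\partial \theta_i\, \partial \theta_j}\Big], \quad i, j = 1, 2;\ \theta_1 = p,\ \theta_2 = \rho .
$$

Since $\hat p$ and $\hat\rho$ are independent, we have the asymptotic variances of $\hat p$ and $\hat\rho$ given by

$$
\mathrm{Var}(\hat p) = -\Big\{E\Big[\frac{\partial^2 \log L}{\partial p^2}\Big]\Big\}^{-1} , \qquad
\mathrm{Var}(\hat\rho) = -\Big\{E\Big[\frac{\partial^2 \log L}{\partial \rho^2}\Big]\Big\}^{-1} .
$$

Differentiating (3.88) and (3.89) and taking expectations,

$$
-E\Big[\frac{\partial^2 \log L}{\partial p^2}\Big] = \sum_{i=1}^{t} \frac{E(n_i)}{p^2} + \sum_{i=1}^{t} \frac{(i-1)E(n_i) + i\,E(c_i)}{(1-p)^2} , \tag{3.92}
$$

$$
-E\Big[\frac{\partial^2 \log L}{\partial \rho^2}\Big] = \sum_{i=1}^{t-1} \frac{E(c_i)}{\rho^2} + \sum_{i=1}^{t} \frac{(i-1)E(n_i + c_i)}{(1-\rho)^2} . \tag{3.93}
$$

From (3.85) and (3.86) we have

$$
\sum_{i=1}^{t} E(n_i) = N \sum_{i=1}^{t} p\,[(1-p)(1-\rho)]^{i-1} = \frac{Np\,[1 - \{(1-p)(1-\rho)\}^{t}]}{1 - (1-p)(1-\rho)} \tag{3.94}
$$

and

$$
\sum_{i=1}^{t} E(c_i) = \sum_{i=1}^{t-1} N\rho\,(1-p)^{i}(1-\rho)^{i-1} + N(1-p)^{t}(1-\rho)^{t-1} . \tag{3.95}
$$

Also

$$
\sum_{i=1}^{t} i\,E(n_i) = Np \sum_{i=1}^{t} i\,[(1-p)(1-\rho)]^{i-1} .
$$

Simplifying the sum on the right hand side we can write

$$
\sum_{i=1}^{t} i\,E(n_i) = \frac{Np\,[1 - (t+1)(1-p)^{t}(1-\rho)^{t} + t(1-p)^{t+1}(1-\rho)^{t+1}]}{[1 - (1-p)(1-\rho)]^2} . \tag{3.96}
$$

Similarly we can show that

$$
\sum_{i=1}^{t} i\,E(c_i) = \frac{N\rho\,(1-p)\,[1 - t(1-p)^{t-1}(1-\rho)^{t-1} + (t-1)(1-p)^{t}(1-\rho)^{t}]}{[1 - (1-p)(1-\rho)]^2} + Nt(1-p)^{t}(1-\rho)^{t-1} . \tag{3.97}
$$

Adding (3.96) and (3.97),

$$
\sum_{i=1}^{t} i\,E(n_i + c_i) = \frac{N[1 - (1-p)^{t}(1-\rho)^{t}]}{1 - (1-p)(1-\rho)} . \tag{3.98}
$$

Substituting for $E(n_i)$ and $E(c_i)$ in (3.92),

$$
-E\Big[\frac{\partial^2 \log L}{\partial p^2}\Big] = \frac{N}{p(1-p)}\ \frac{[1 - (1-p)^{t}(1-\rho)^{t}]}{[1 - (1-p)(1-\rho)]} . \tag{3.99}
$$

When $t \to \infty$ and $\rho = 0$, the model reduces to the one described in Section (2.2), and

$$
\hat p = \frac{1}{\bar x} \qquad \text{and} \qquad \mathrm{Var}(\hat p) = \frac{p^2(1-p)}{N} ,
$$

which agree with (2.11) and (2.12), respectively.
Similarly, using (3.93),

$$
-E\Big[\frac{\partial^2 \log L}{\partial \rho^2}\Big] = \frac{N(1-p)}{\rho(1-\rho)}\ \frac{[1 - (1-p)^{t-1}(1-\rho)^{t-1}]}{[1 - (1-p)(1-\rho)]} .
$$

Hence,

$$
\mathrm{Var}(\hat\rho) = \frac{\rho(1-\rho)\,[1 - (1-p)(1-\rho)]}{N(1-p)\,[1 - (1-p)^{t-1}(1-\rho)^{t-1}]} . \tag{3.100}
$$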
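The estimates and variances of this subsection are closed-form, and a quick numerical check is possible by plugging the expected counts (3.85)-(3.86) into the estimators: the estimating equations are then satisfied exactly at the true parameter values. The Python sketch below does this. The closed form used for the withdrawal probability is the solution of setting (3.89) to zero; the function names and the values $N = 1000$, $p = 0.2$, $\rho = 0.05$, $t = 12$ are illustrative assumptions.

```python
def expected_counts(N, p, rho, t):
    """E(n_i) and E(c_i), i = 1..t, from (3.85) and (3.86)."""
    q = (1 - p) * (1 - rho)
    n = [N * p * q ** (i - 1) for i in range(1, t + 1)]
    c = [N * rho * (1 - p) * q ** (i - 1) for i in range(1, t)]
    c.append(N * (1 - p) ** t * (1 - rho) ** (t - 1))   # still present at month t
    return n, c

def ml_estimates(n, c):
    """p-hat from (3.90) and rho-hat from solving (3.89) = 0."""
    t = len(n)
    months = range(1, t + 1)
    p_hat = sum(n) / sum(i * (ni + ci) for i, ni, ci in zip(months, n, c))
    lost = sum(c[:t - 1])                                # withdrawals before month t
    retained = sum((i - 1) * (ni + ci) for i, ni, ci in zip(months, n, c))
    rho_hat = lost / (lost + retained)
    return p_hat, rho_hat

def asymptotic_variances(N, p, rho, t):
    """Reciprocal of (3.99) and expression (3.100)."""
    q = (1 - p) * (1 - rho)
    var_p = p * (1 - p) * (1 - q) / (N * (1 - q ** t))
    var_rho = rho * (1 - rho) * (1 - q) / (N * (1 - p) * (1 - q ** (t - 1)))
    return var_p, var_rho

n, c = expected_counts(1000, 0.2, 0.05, 12)
p_hat, rho_hat = ml_estimates(n, c)
print(p_hat, rho_hat)
print(asymptotic_variances(1000, 0.2, 0.05, 12))
```

For large $t$ and $\rho = 0$ the first variance tends to $p^2(1-p)/N$, in line with the limiting case noted in the text.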
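3.4.3 Heterogeneous Fecundability and Fixed Withdrawals
Let $p$ be distributed as a beta distribution with parameters $(a, b)$. Let $\pi_x$ be the probability of conception in month $x$. Then, $\pi_x$ is given by (2.14), i.e.,

$$
\pi_x = P[X = x] = \begin{cases}
\dfrac{a}{a+b}, & x = 1 , \\[2ex]
\dfrac{a\,b(b+1)\cdots(b+x-2)}{(a+b)(a+b+1)\cdots(a+b+x-1)}, & x \ge 2 .
\end{cases}
$$

Let $Q_x$ be the probability that a woman's conception time exceeds $x$. Then $Q_x$ is given by (2.15), i.e.,

$$
Q_x = \frac{b(b+1)\cdots(b+x-1)}{(a+b)(a+b+1)\cdots(a+b+x-1)}, \quad x \ge 1 .
$$

The conditional probability of conceiving in month $x$, given that the conception time exceeds $(x-1)$ months, is

$$
u(x) = \frac{\pi_x}{Q_{x-1}} . \tag{3.101}
$$

As before, let $n_i$, $i = 1, 2, \ldots, t$, denote the number of conceptions in month $i$ and $c_i$, $i = 1, 2, \ldots, t$, denote the number of withdrawals in month $i$; $\sum_{i=1}^{t} (n_i + c_i) = N$. Set $M_i = \sum_{j=1}^{i} (n_j + c_j)$ and $M_0 = 0$. Then the conditional probability of $n_i$ conceptions given $n_j$, $j = 1, 2, \ldots, (i-1)$, is

$$
P(n_i \mid n_1, n_2, \ldots, n_{i-1}) = \frac{P[n_i \text{ women conceive in month } i,\ (N - M_{i-1} - n_i) \text{ women's conception time exceeds } i]}{P[(N - M_{i-1}) \text{ women's conception time exceeds } (i-1)]}
= \binom{N - M_{i-1}}{n_i}\, \frac{\pi_i^{\,n_i}\, Q_i^{\,N - M_{i-1} - n_i}}{Q_{i-1}^{\,N - M_{i-1}}} . \tag{3.102}
$$

Then the joint probability of $n_1, n_2, \ldots, n_t$ is

$$
P(n_1, n_2, \ldots, n_t) = \prod_{i=1}^{t} \binom{N - M_{i-1}}{n_i}\, \frac{\pi_i^{\,n_i}\, Q_i^{\,N - M_{i-1} - n_i}}{Q_{i-1}^{\,N - M_{i-1}}} . \tag{3.103}
$$

By an argument similar to that used in Section (3.4.1) we can show that $E(n_1) = N\pi_1$, $E(n_2) = N\pi_2 - c_1\,\pi_2 / Q_1$, and in general

$$
E(n_i) = N\pi_i - c_1 \frac{\pi_i}{Q_1} - c_2 \frac{\pi_i}{Q_2} - \cdots - c_{i-1} \frac{\pi_i}{Q_{i-1}}, \quad i = 1, 2, \ldots, t . \tag{3.104}
$$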
3.4.3.1 The Maximum Likelihood Estimates
The likelihood function for the sample of
L·
t
n
N
is given by (3.103)
[N-Mi _ l ]
i=l
(3.105)
ni
The log-likelihood is
t
log L = const +
L
i=l
n i log
+
1T.
1.
I
c
i
log Q .
i
(3.106)
The maximum likelihood estimating equations are
nog L ...
(3.107)
aa
Clog L...
ab
t
L
i=l
n
(3.108)
i
Differentiating the logarithm of
Clog Q
i
aa
i-I
... I
j=O
Qi'
we have
1
(a+b+j)
(3.109)
75
a10g Qi
ab
i-I
=.
1
r
j=O
i-I
-. j:!:O
I
(b+j)
1
(3.110)
(a+b+j-1)·
Also from (2.23) and (2.24) we have
1
i-I
- -I
a
j-O
1
a+b+j
,i~l,
1
=-,i-1
a+b
'
-
i-2
1
i-I
I (b+j) - j=O
L
j-O
Sheps and Menken (1972b) give
derivatives of
Set
n
Y
log L.
= c = 0
the
~
2 •
expressions for the first two
They are given below.
y s 0, y
Y
1
(a+b+j)' i
~
t + 1.
t
Wj = L (ny+1+c )
y=j+1
y
A
t
1
n
L
2
a y=l y
=-
t-1
B
=L
j=O
t-1
1
(a+b+j)2
=L
1
(b+j )2
a10g L
ca
=! L n
C
j=O
Wj
w.J
Then,
t
a i=l i
t-l
-
L
j=O
(3.111)
76
t-1
t-1
alog L
ab
= L -!- W -. L
(3.112)
2
=_A+ B
(3.113)
=B
(3.114)
a log
aa 2
2
a
L
log L
aaab
j=O b+j
j
j=O
and
a2 log
ab
2
L=_ C+ B •
(3.115)
The asymptotic variance-covariance matrix of the estimates are
giv~n
by the inverse of the matrix
i, j
= 1,
2; 8
1
= a,
8
2
=b
..
From (3.113)-(3.115) we have
- E[a2~:~ LJ . E(A) - E(B) ,
2
_ E[a 10 g L] = _ E(B) ,
aaab
2
_ E[a 10 g2 L] = E(C) - E(B) ,
(3.116)
(3.117)
(3.118)
ab
where
t-1
E(A)
= ~. L
y=l
a
E(B)
t
=.
2
(3.119)
E(n ) ,
y
1
j=O (b+j)
2 E(W.) •
J
(3.120)
77
t-l
1
= L
E(C)
(3.121)
j-O. (b+j)2
and
t
is given in (3.104)
L
y=j+l
and
E(Dy+l+C ) .
Y
3.4.4 Heterogeneous Fecundability and Random Withdrawals
As before, we assume that fecundability is distributed among women as a beta distribution. Then the probability distribution of the conception months, denoted by $\pi_i$, when there is no withdrawal is given by (2.14). Also $Q_i$, the probability that a woman's conception time exceeds month $i$, is given by (2.15). Further, we assume that the probability of losing a woman, denoted by $\rho$, is identical for all women and does not change with respect to time. The probability of an event (either conception or withdrawal) occurring in a particular month conditional on $p$ is given in (3.83).
Let $\pi_i^*$ and $\pi_{t+i}^*$ denote respectively the probability of conception and the probability of withdrawal in month $i$, $i = 1, 2, \ldots, t$. Also let $\pi_{i|p}^*$ and $\pi_{t+i|p}^*$ denote the respective probabilities conditional on $p$. From (3.83) we have

$$
\pi_{i|p}^* = p(1-p)^{i-1}(1-\rho)^{i-1}, \quad i = 1, 2, \ldots, t ,
$$

$$
\pi_{t+i|p}^* = \begin{cases}
\rho\,(1-p)^{i}(1-\rho)^{i-1}, & i = 1, 2, \ldots, (t-1) , \\
(1-p)^{t}(1-\rho)^{t-1}, & i = t .
\end{cases}
$$

Hence the unconditional probabilities are

$$
\pi_i^* = \int_0^1 p(1-p)^{i-1}(1-\rho)^{i-1}\, \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\, dp = \pi_i\,(1-\rho)^{i-1}, \quad i = 1, 2, \ldots, t , \tag{3.122}
$$

where $\pi_i$ is given by (2.14). Similarly we can show that

$$
\pi_{t+i}^* = Q_i\,\rho\,(1-\rho)^{i-1}, \quad i = 1, 2, \ldots, (t-1) , \qquad \pi_{2t}^* = Q_t\,(1-\rho)^{t-1} , \tag{3.123}
$$

where $Q_i$ is given in (2.15). Using the multinomial law we have

$$
E(n_i) = N\pi_i\,(1-\rho)^{i-1}, \quad i = 1, 2, \ldots, t , \tag{3.124}
$$

and

$$
E(c_i) = \begin{cases}
NQ_i\,(1-\rho)^{i-1}\rho, & i = 1, 2, \ldots, (t-1) , \\
NQ_t\,(1-\rho)^{t-1}, & i = t .
\end{cases} \tag{3.125}
$$
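A simple consistency check on (3.122) and (3.123) is that the $2t$ unconditional probabilities add to one. The Python sketch below evaluates them through log-gamma functions; the function names and the values $a = 3$, $b = 5$, $\rho = 0.05$, $t = 12$ are illustrative assumptions.

```python
from math import lgamma, exp

def log_beta(x, y):
    """Logarithm of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def conception_prob(x, a, b):
    """pi_x of (2.14)."""
    return exp(log_beta(a + 1, b + x - 1) - log_beta(a, b))

def exceed_prob(x, a, b):
    """Q_x of (2.15)."""
    return exp(log_beta(a, b + x) - log_beta(a, b))

def unconditional_probs(a, b, rho, t):
    """Conception and withdrawal probabilities of (3.122) and (3.123)."""
    conceive = [conception_prob(i, a, b) * (1 - rho) ** (i - 1)
                for i in range(1, t + 1)]
    withdraw = [exceed_prob(i, a, b) * rho * (1 - rho) ** (i - 1)
                for i in range(1, t)]
    withdraw.append(exceed_prob(t, a, b) * (1 - rho) ** (t - 1))  # month t
    return conceive, withdraw

conceive, withdraw = unconditional_probs(3.0, 5.0, 0.05, 12)
total = sum(conceive) + sum(withdraw)
print(total)
```

Multiplying each probability by $N$ gives the expected counts (3.124) and (3.125).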
3.4.4.1 The Maximum Likelihood Estimates
Let
n , i = 1, 2, .•• , t
i
and
c i ' i = 1, 2, •.• , t,
respectively
be the number of conceptions and the number of withdrawals in month
t
L
(ni+c ) = N. The likelihood function for the sample of
i
i=l
values follows a multinomial law
i;
L=
t
*c
*n
IT n i n +i
i
t i
i=1
Nl
t
IT nil
i=l
t
IT c
~1
il
N
(3.126)
79
The log-likelihood is
log L = const +
t
L
i=l
t
= const
+
r
i=1
n i log
*
t
'~i +. L c i log ~:+i
i=l
n i [log ~i + (i-I) log (l-p»)
t-1
+
L
~1
c i [log Qi + (i-I) log (l-p) + log p)
+ c t [log Q + (t-l) log
t
(l-p)]~
(3.127)
The maximum likelihood estimating equations are
a10g L •
aa
where
p
o
t
a10g L
ab
=o
t
a10g L
ap
=
o
t
is assumed to be independent of
a
and
b.
Differentiating (3.127) we have
a10g L
oa
~1
a
a10g n
t
i
n
a
+ L cJ.o
i=l i
a
i=l
= tL
og L _
ab
-
t
~
l
i=l
ni
a10g
ab
~i
t
+
~
l
i=l
Ci
(3.128)
a10g QJ.o
ab
(3.129)
(3.130)
Expression (3.130) is the same as (3.89).
the estimate of
p
as
Hence from (3.91) we have
80
~
p =
(3.131)
Also the expressions (3.128) and (3.129) are the same as (3.107) and
(3.108).
Hence the estimating equations of
aa
t
1
~
n
a i-I
t-1
l
alog L...
j=O
ab
and
b
are given
i.e.~
in (3.111) and (3.112),
a10S L ...
a
_
i
t-1
~
1
(a+b+j)
j-O
t-1
1
(b+j)
Wj -
L
j-O
where
t
Wj =
Since
a
and
estimates of
L
y=j+1
b
a
(ny+1+c )
y
and
are independent of
and
b
n t +1
p,
=0
.
the maximum likelihood
are obtained by solving the two simultaneous
equations (3.111) and (3.112) by the scoring method (Rao 1952). Also, the
covariance of
p with the estimates of
a
and
b
is zero.
Hence
A
the asymptotic variance of
p
is obtained as the inverse of
-Ef32~:~ LJ .
Differentiating (3.130) and taking expectation
(3.132)
where
E(n )
i
and
E(c i )
are given by (3.124) and (3.125) •
81
given by the inverse of the matrix
= 1,
i, j
= (m
~
ij
2, 8 = a
1
),
and
A
A
The asymptotic variance-covariance matrix of
a,
and
b
is
where
8
2
= b.
The
elements of the matrix are given by (3.116)-(3.117), i.e.,
_
E(a2lo~ L]
=
oa
~
Iy=l
a
_ E 0 2 log L] • - t-l
L
[ oaob
[
ob
2
1
(a+b+j)2
j-O
_ E 02 log L]
E(Wj ) ,
t~l
1
= t-l
L
j=O
1
E(n) _ tIl
y
j=O (a+b+j) 2
(b+j)
2 E(Wj ) -
I..
E(Wj )
j =0 (a+b+j)
2
t
where
is given in (3.124) and
Using the relation
tuting for
E(n i )
Qj-1
and
= Qj +
E(c )
i
n
j
,
E(Wj ) = . I E(ni+l+c ) •
i
1=j+l
we simplify
E(W. )
J
by
substi-
and the result is
E(W ) = NQj+l (l-p)j, j = 1, 2, ... , (t-l) .
j
(3.133)
3.4.5 Summary of Estimation Under Progressive Censoring
We have presented in Section 3.4 the maximum likelihood estimates of various parameters of interest in the models presented under progressive censoring. The models are constructed by making the assumptions that the fecundability is either homogeneous or heterogeneous and that the withdrawals are either fixed or random. Due to nonavailability of real data collected under progressive censoring, the methods are not illustrated here. Creation of artificial data through simulations requires special computer programs and no such attempt is made here. Certain results presented here are similar to the work of Sheps and Menken (1972b) under a different type of multiple censoring, except that under the present setup the expected values of the numbers of conceptions and withdrawals in a month are different.
Several modifications can be made to the model when data are progressively censored with random withdrawals. The withdrawals can be assumed to be distributed uniformly throughout the month rather than assuming them to occur on the last day of the month. Also the constant probability $\rho$ of withdrawal can be assumed to depend on a woman's fecundability or can be assumed to vary among women according to a specific probability distribution.
3.5 Asymptotic Relative Efficiencies
The asymptotic relative efficiency of the moment type estimators derived in Section (3.2) compared to the maximum likelihood estimators is computed according to the definition 1.8.2. The asymptotic variance-covariance matrix of the moment estimates exists only when $a > 4$. Since the maximum likelihood and minimum chisquare estimators have the same asymptotic variance-covariance matrix, the moment estimators have the same asymptotic efficiency relative to each. The asymptotic relative efficiencies computed for various combinations of $a$ and $b$ and censoring times $t = 12$ and $48$ are given in Tables (3.1) and (3.2). For the purpose of comparison, the efficiencies computed from complete samples, given in Majumdar and Sheps (1970), are also presented in Table (3.3).
The tables show that the efficiency of moment estimators increases as $a$ and $b$ increase. When $a$ or $b$ increases the variance of the mixing beta distribution decreases, or the women under consideration are less heterogeneous with respect to fecundability. Also, the efficiency decreases as $t$ increases. In total, the moment type estimates are at their best for small values of $t$ and large values of $a$ and $b$.
Tables (3.4) and (3.5) compare the relative efficiencies of maximum likelihood estimates from grouped and ungrouped samples, when the groups are formed with equal size of 3 and 6, respectively. The efficiency goes down considerably as $a$ increases and as the class interval increases. Also the efficiency increases as $b$ increases.
Although the groupings considered here show considerable effect on the efficiencies, one should also remember the advantages of grouping in reducing errors due to digit preference and chance fluctuations of small numbers. Indeed, clumping at various points seems to imply a different probability model. However, grouping the data eliminates the effect of clumping. Also, certain other forms of grouping along with other combinations of $a$ and $b$ which we have not considered may have less adverse effect.
TABLE 3.1
Asymptotic Relative Efficiency of Moment Estimators to MLE for Selected Values of a and b and Censoring Time, t = 12 (in Percentage)

                          b
      a          2        5       10       15
    4.001      74.53    86.34    93.44    96.14
    4.1        74.80    86.43    93.47    96.15
    4.5        75.93    86.78    93.57    96.20
    5.0        77.36    87.24    93.70    96.26
    6.0        80.20    88.19    93.96    96.37
    7.0        82.86    89.19    94.24    96.48
   10.0        88.95    92.04    95.13    96.84
   20.0        96.01    96.97    97.64    98.10
   50.0        98.78    99.19    99.41    99.51
  100.0        99.45    99.66    99.77    99.82
TABLE 3.2
Asymptotic Relative Efficiency of Moment Estimators to MLE, for Selected Values of a and b and Censoring Time, t = 48 (in Percentage)

                          b
      a          2        5       10       15
    4.001      47.35    58.46    70.43    77.97
    4.1        48.40    59.05    70.67    78.07
    4.5        52.84    61.62    71.78    78.56
    5.0        58.51    65.13    73.37    79.33
    6.0        68.75    72.32    77.20    81.37
    7.0        76.40    78.54    81.19    83.81
   10.0        87.85    89.22    89.94    90.45
   20.0        96.00    96.91    97.38    97.56
   50.0        98.78    99.19    99.41    99.51
  100.0        99.45    99.66    99.77    99.82
TABLE 3.3*
Asymptotic Relative Efficiency of Moment Estimators to MLE for
Selected Values of a and b (Complete Samples)

                          b
     a          2         5        10        15
   4.001       1.5       1.5       1.5       1.5
   4.1        15.3      15.5      15.5      15.5
   4.5        35.6      36.0      36.1      36.1
   5.0        50.2      50.9      51.1      51.2
   6.0        66.9      68.0      68.4      68.5
   7.0        76.0      77.3      77.8      77.9
  10.0        87.8      89.2      89.8      90.0
  20.0        96.0      96.9      97.4      97.5
  50.0        98.8      99.2      99.4      99.5
 100.0        99.4      99.7      99.8      99.8

* From Majumdar and Sheps (1970)
TABLE 3.4
Asymptotic Relative Efficiency of MLE Estimators from Grouped*
Vs Ungrouped Samples (Censored Sample t = 48)

                          b
     a          2         5        10        15
   4.001      44.21     72.89     87.98     95.21
   4.1        43.34     72.19     87.73     91.27
   4.5        40.01     69.39     86.54     91.23
   5.0        36.26     65.94     84.81     90.76
   7.0        25.16     53.46     76.90     86.55
  10.0        15.26     39.17     65.06     78.23
  20.0         4.36     15.80     36.61     52.81
  50.0         0.51      2.60      9.13     17.59
 100.0         .0008      .48      2.06      4.73

* Groups of class interval 3
TABLE 3.5
Asymptotic Relative Efficiency of MLE Estimates from Grouped*
Vs Ungrouped Samples (Censored at t = 48)

                          b
     a          2         5        10        15
   4.001      13.98     38.88     65.20     76.69
   4.1        13.34     37.80     64.36     76.26
   4.5        11.02     33.72     60.91     74.32
   5.0         8.71     29.18     56.57     71.50
   6.0         5.53     21.78     48.24     65.18
   7.0         3.58     16.27     40.74     58.67
  10.0         1.10      6.96     23.97     41.11
  20.0          .05      0.62      4.26     11.36
  50.0          .00       .00       .09       .43
 100.0          .00       .00       .002      .01

* Groups of class interval 6
3.6 Examples
This section gives the estimates of a and b, treating the Hutterite data and the data from the Princeton Fertility Survey, given in Majumdar and Sheps (1970), as censored at various points.
The Hutterite data relate to a sample of 342 women, who were
married before the age of 25 years, who had a livebirth at least nine
months after marriage and who denied having an intervening pregnancy.
The data are presented in lunar months.
The data from the Princeton Fertility Study are the samples studied by Potter and Parker (1964). Group I consisted of 958 women who reported that they conceived following interruption of contraception, and Group II of 458 women who reported that they conceived before starting contraception. The Princeton data show a definite tendency for heaping at months 6, 12, 24, ..., and a few extreme observations.
Estimates are obtained from the Hutterite data treating the data as singly censored at t = 6, 12 and 24 months. For the Princeton data, estimates are obtained treating the data as singly censored at t = 12, 24, 36 and 48 months. Estimates and their standard errors, where available, are given in Tables (3.6)-(3.9). For the Princeton data, moment type estimates do not exist since the estimates of a are less than 2. The tables show that, for the minimum chisquare and maximum likelihood estimates, the coefficient of variation of the estimates of a and b tends to go down with t. The estimates of a and b also show a tendency to increase with t, and the estimates of mean fecundability and the correlation between the estimates of a and b decrease with t.
The chisquare goodness of fit using the estimates obtained from ungrouped data for the three sets of data, for selected values of t, is given in Tables (3.10)-(3.17). Because of heaping at certain ages, the data are grouped as was done by Majumdar and Sheps (1970), for the purpose of calculating chisquare. The chisquare value for this grouped data, calculated on the basis of minimum chisquare estimates from ungrouped data, sometimes exceeds the chisquare value calculated using maximum likelihood estimates from similar data. The phenomenon is particularly present in Princeton data I, where heavy heaping and a few extreme observations are present. Hence the maximum likelihood and minimum chisquare estimates are re-computed using the grouped data and are presented in Tables (3.16) and (3.17).
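The goodness of fit calculation can be illustrated with a short sketch. It assumes the beta-geometric law of (2.14), evaluated by the recursion pi_{x+1} = pi_x (b+x-1)/(a+b+x), and the usual Pearson chisquare; the function names are hypothetical. The observed frequencies are those of the Hutterite data grouped as in Table 3.10 (months 1 to 6 and "7+"), and the parameter values are the maximum likelihood estimates for t = 6 from Table 3.6.

```python
def beta_geometric_probs(a, b, t):
    """Return ([pi_1, ..., pi_t], P(X > t)) for the beta-geometric law."""
    probs = []
    pi = a / (a + b)                      # pi_1
    for x in range(1, t + 1):
        probs.append(pi)
        pi *= (b + x - 1) / (a + b + x)   # pi_{x+1} from pi_x
    return probs, 1.0 - sum(probs)


def pearson_chisquare(observed, expected):
    """Sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hutterite data, censored at t = 6: months 1..6 and the class "7+".
observed = [103, 53, 43, 27, 30, 9, 77]
a_hat, b_hat = 1.9189, 4.6836             # MLE for t = 6 (Table 3.6)

n = sum(observed)
probs, tail = beta_geometric_probs(a_hat, b_hat, 6)
expected = [n * p for p in probs] + [n * tail]
chi2 = pearson_chisquare(observed, expected)

print([round(e, 1) for e in expected])
print(round(chi2, 4))
```

Because the published estimates are rounded to four decimals, the expected frequencies and the chisquare value obtained this way should agree with the tabulated ones only up to rounding.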
TABLE 3.6
Estimates from Censored Samples and Their Standard Errors (Where Available)
A. Hutterite Data (N = 342)

                                         Censoring Time
Estimates*                  6                   12                   24

Moments        a            -                 2.9768            4.3659 (1.5724)
               b            -                 8.0191           12.7520 (5.5165)
               p            -                 0.2707            0.2550 (0.0189)
               r            -                   -               0.9853
               C_a          -                   -               36.01
               C_b          -                   -               43.26

MLE            a      1.9189 (0.6204)    2.4470 (0.6320)    3.1620 (0.789?)
               b      4.6836 (1.8538)    6.2250 (2.0271)    8.4360 (2.6307)
               p      0.2906 (0.0221)    0.2821 (0.0201)    0.2726 (0.0185)
               r      0.9758             0.9683             0.9686
               C_a    32.33              25.83              24.97
               C_b    39.58              32.56              31.18

Minimum        a      2.1219 (0.7450)    2.6389 (0.7257)    2.8839 (0.6868)
Chisquare      b      5.3197 (2.2562)    6.9871 (2.3936)    8.3665 (2.5221)
               p      0.2851 (0.0218)    0.2741 (0.0197)    0.2563 (0.0180)
               r      0.9796             0.9722             0.9660
               C_a    35.11              27.50              23.81
               C_b    42.41              34.25              30.14

* Standard errors are given in parentheses.
p = Estimate of mean fecundability = a/(a+b)
r = Correlation coefficient between the estimates of a and b
C_a, C_b = Coefficient of variation of the estimates of a and b, respectively
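The derived quantities in the notes to Table 3.6 follow directly from the estimates and their standard errors. The sketch below is only a consistency check, under the assumption that the coefficient of variation is reported as a percentage (standard error divided by estimate, times 100); the function names are hypothetical, and the numbers are the Hutterite maximum likelihood estimates for t = 6.

```python
def coefficient_of_variation(estimate, std_error):
    """Coefficient of variation in percent: 100 * s.e. / estimate."""
    return 100.0 * std_error / estimate


def mean_fecundability(a, b):
    """Mean of the Beta(a, b) fecundability distribution, a / (a + b)."""
    return a / (a + b)


# Hutterite MLE, censoring time t = 6 (Table 3.6).
a_hat, se_a = 1.9189, 0.6204
b_hat, se_b = 4.6836, 1.8538

c_a = coefficient_of_variation(a_hat, se_a)
c_b = coefficient_of_variation(b_hat, se_b)
p_hat = mean_fecundability(a_hat, b_hat)

print(round(c_a, 2), round(c_b, 2), round(p_hat, 4))
```

The same two functions can be applied to any column of Tables 3.6 to 3.9 to check a doubtful entry against the others in that column.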
TABLE 3.7
Estimates from Censored Samples and Their Standard Errors (Where Available)
B. PFS Group I (N = 958)

                                              Censoring Time
Estimates*              12                  24                  36                  48

MLE          a    1.0708 (0.0972)    1.2116 (0.1011)    1.2710 (0.1036)    1.2599 (0.1001)
             b    1.7076 (0.2196)    2.0009 (0.2421)    2.1298 (0.2529)    2.1055 (0.2454)
             p    0.3854 (0.0148)    0.3772 (0.0143)    0.3737 (0.0141)    0.3744 (0.0141)
             r    0.8951             0.8858             0.8832             0.8781
             C_a  9.07               8.34               8.15               7.95
             C_b  12.86              12.10              11.88              11.66

Minimum      a    1.1422 (0.1078)    0.9700 (0.0767)    0.9097 (0.0664)    0.9068 (0.0639)
Chisquare    b    1.9906 (0.2647)    2.1970 (0.2620)    2.1468 (0.2432)    2.1413 (0.2370)
             p    0.3646 (0.0144)    0.3062 (0.0133)    0.2976 (0.0131)    0.2975 (0.0131)
             r    0.9052             0.8771             0.8599             0.8524
             C_a  9.43               7.91               7.30               7.05
             C_b  13.30              11.92              11.33              11.07

* Moment estimates do not exist
Symbols are as in Table 3.6
TABLE 3.8
Estimates from Censored Samples and Their Standard Errors (Where Available)
C. PFS Group II (N = 458)

                                              Censoring Time
Estimates*              12                  24                  36                  48

MLE          a    0.8672 (0.1198)    0.6219 (0.0651)    0.9562 (0.1045)    0.9623 (0.1014)
             b    2.3910 (0.4800)    1.8710 (0.3164)    2.7190 (0.4608)    2.7430 (0.4530)
             p    0.2662 (0.0186)    0.2495 (0.0182)    0.2602 (0.0176)    0.2597 (0.0174)
             r    0.9072             0.8474             0.8731             0.8662
             C_a  13.81              10.57              10.92              10.54
             C_b  20.08              16.91              16.95              16.51

Minimum      a    0.9314 (0.1337)    0.6591 (0.0712)    0.9971 (0.1117)    0.9674 (0.1031)
Chisquare    b    2.6756 (0.5531)    2.0943 (0.3607)    3.0883 (0.5340)    3.0688 (0.5138)
             p    0.2582 (0.0182)    0.2393 (0.0177)    0.2441 (0.0168)    0.2397 (0.0166)
             r    0.9152             0.8562             0.8810             0.8709
             C_a  14.35              10.81              11.20              10.66
             C_b  20.67              17.23              17.29              16.74

* Moment estimates do not exist
Symbols are as in Table 3.6
TABLE 3.9
Estimates from Grouped Data (Censored at t = 48)
for PFS Group I and Group II

Estimates*             PFS Group I          PFS Group II

MLE          a       1.3538 (.1122)        .9635 (.1020)
             b       2.2770 (.2726)       2.7294 (.4517)
             p        .3728 (.0141)        .2609 (.0175)
             r        .8860                .8660
             C_a      8.28                10.58
             C_b     11.97                16.54

Minimum      a       1.3126 (.1428)       0.9877 (.4174)
Chisquare    b       2.2576 (.1925)       2.8497 (.2474)
             p        .3676 (.0119)        .2573 (.0819)
             r       0.1925                .3225
             C_a     10.88                42.25
             C_b      5.64                 8.68

* Symbols are as in Table 3.6.
TABLE 3.10
Observed and Expected Frequencies of Conception by
Month, Hutterite Data (Censored Sample t = 6)

              Observed        Expected Frequency
   Month      Frequency       MLE       MIN CHISQ.
     1           103          99.4         97.5
     2            53          61.3         61.5
     3            43          40.5         41.1
     4            27          28.2         28.8
     5            30          20.4         21.0
     6             9          15.3         15.7
     7+           77          77.1         76.4
   Total         342         342.0        342.0

   Chisquare                 8.5317       8.4320
   (4 d.f.)                  p>.05        p>.05
TABLE 3.11
Observed and Expected Frequencies of Conception by
Month, Hutterite Data (Censored Sample t = 12)

              Observed            Expected Frequency
   Month      Frequency     Moments      MLE      MIN CHISQ.
     1           103          92.6       96.5        93.8
     2            53          61.9       62.1        61.6
     3            43          43.0       42.0        42.4
     4            27          30.7       29.6        30.1
     5            30          22.6       21.6        22.1
     6             9          17.0       16.2        16.6
     7            12          13.0       12.3        12.7
     8             9          10.1        9.6         9.9
     9             6           8.0        7.6         7.9
    10             8           6.4        6.2         6.4
    11            10           5.2        5.0         5.2
    12             5           4.3        4.2         4.3
    13+           27          27.2       29.1        29.0
   Total       342.0         342.0      342.0       342.0

   Chisquare               14.7370    14.7310     14.5584
   (10 d.f.)                p>.05      p>.05       p>.05
TABLE 3.12
Observed and Expected Frequencies of Conception by
Month, Hutterite Data (Censored Sample, t = 24)

              Observed            Expected Frequency**
   Month      Frequency     Moments      MLE      MIN CHISQ.
     1           103          87.2       93.2        87.7
     2            53          61.4       62.4        59.9
     3            43          44.2       43.3        42.3
     4            27          32.4       31.0        30.8
     5            30          24.2       22.7        22.9
     6             9          18.3       17.0        17.5
     7            12          14.0       13.0        13.5
     8             9          10.9       10.1        10.7
     9             6           8.6        7.9         8.5
    10             8           6.8        6.4         6.9
    11            10           5.5        5.1         5.6
    12             5           4.4        4.2         4.6
   13-15           9           9.0        8.7         9.8
   16-18           7           5.2        5.2         6.1
   19-24           7           5.2        5.5         6.6
    25+            4           4.7        6.3         8.6
   Total       342.0         342.0      342.0       342.0

   Chisquare               17.8721    16.9179     17.6938
   (13 d.f.)                p>.05      p>.05       p>.05

** Estimates from ungrouped data
TABLE 3.13
Observed and Expected Frequencies of Conception by
Month, PFS, Group II (Censored Sample, t = 12)

              Observed        Expected Frequency**
   Month      Frequency       MLE       MIN CHISQ.
     1           129         121.9        118.3
     2            57          68.5         68.7
     3            53          44.1         45.0
     4            26          31.0         31.9
    5-6           27          40.8         42.2
    7-9           46          35.5         36.8
   10-12          29          21.4         22.1
    13+           91          94.8         93.0
   Total         458         458.0        458.0

   Chisquare                15.4896      15.3917
   (5 d.f.)                  p<.01        p<.01

** Estimates from ungrouped data
TABLE 3.14
Observed and Expected Frequencies of Conception by
Month, PFS, Group II (Censored Sample, t = 24)

              Observed        Expected Frequency**
   Month      Frequency       MLE       MIN CHISQ.
     1           129         114.2        109.6
     2            57          61.2         61.2
     3            53          39.1         39.8
     4            26          27.6         28.4
    5-6           27          36.9         38.2
    7-9           46          33.2         34.5
   10-12          29          20.8         21.6
   13-24          41          40.5         41.9
    25+           50          84.5         82.8
   Total         458         458.0        458.0

   Chisquare                32.1683      30.9243
   (6 d.f.)                  p<.01        p<.01

** Estimates from ungrouped data
TABLE 3.15
Observed and Expected Frequencies of Conception by
Month, PFS Group II (Censored Sample t = 36)

              Observed        Expected Frequency**
   Month      Frequency       MLE       MIN CHISQ.
     1           129         111.8        119.2
     2            57          67.9         69.3
     3            53          45.6         45.4
     4            26          32.8         32.1
    5-6           27          43.9         42.4
    7-9           46          38.6         36.9
   10-12          29          23.3         22.1
   13-24          41          41.6         39.4
   25-36          18          16.1         15.3
    37+           32          36.4         35.9
   Total         458         458.0        458.0

   Chisquare                17.0728      16.4133
   (7 d.f.)               .01<p<.05    .01<p<.05

** Estimates from ungrouped data
TABLE 3.16
Observed and Expected Frequencies of Conception, by Month, PFS, Group II
Using Estimates from Grouped and Ungrouped Data (Censored Sample, t = 48)

              Observed      Expected Frequency*        Expected Frequency**
   Month      Frequency      MLE      MIN CHISQ.        MLE      MIN CHISQ.
     1           129        118.9       109.8          119.4       117.8
     2            57         69.4        66.9           69.4        69.5
     3            53         45.5        45.1           45.5        45.8
     4            26         32.2        32.5           32.2        32.4
    5-6           27         42.6        43.7           42.5        43.1
    7-9           46         37.0        38.7           36.9        37.4
   10-12          29         22.1        23.5           22.1        22.4
   13-24          41         39.4        42.3           39.3        39.7
   25-48          26         23.4        25.5           23.3        23.4
    49+           24         27.5        30.0           27.3        26.6
   Total         458        458.0       458.0          458.0       458.0

   Chisquare               16.2563     17.8413        16.2600     16.2002
   (7 d.f.)              .01<p<.05   .01<p<.05      .01<p<.05   .01<p<.05

*  Estimates from ungrouped data
** Estimates from grouped data
TABLE 3.17
Observed and Expected Frequencies of Conception, by Month, PFS Group I
Using Estimates from Grouped and Ungrouped Data (Censored Sample, t = 48)

              Observed      Expected Frequency*        Expected Frequency**
   Month      Frequency      MLE      MIN CHISQ.        MLE      MIN CHISQ.
     1           380        358.7       285.0          357.2       352.2
     2           153        173.0       150.8          175.6       174.0
     3            94        100.1        93.8          102.2       101.8
     4            45         64.6        64.2           65.9        65.9
    5-6           93         77.4        82.6           78.8        79.2
    7-9           51         59.7        70.0           60.2        61.0
   10-12          46         32.1        41.3           31.9        32.7
   13-24          68         49.6        72.9           48.2        50.0
   25-48          20         23.9        43.5           22.1        23.6
    49+            8         18.9        53.9           15.9        17.6
   Total         958        958.0       958.0          958.0       958.0

   Chisquare               33.9653     96.6009        34.1140     33.6784
   (7 d.f.)                 p<.01       p<.01          p<.01       p<.01

*  Estimates from ungrouped data
** Estimates from grouped data
3.7 Results from Sampling Experiments
This section gives results from sampling experiments designed to study the sampling distribution of various estimators under single censoring. Data on conception times from the distribution given in (2.14) are generated for various parameter combinations. The various estimates of a and b described in Section 3.2 are obtained for the samples generated. A complete study should look at the distribution of the estimators under different combinations of sample sizes, values of parameters, and various censoring times. Also, such a study should look into the special problems, if any, observed during the study, such as the problem of convergence. It is plausible to further extend the study to include multiple censoring as well as different distributions of fecundability. Because of the large scale nature of the study, we have not attempted here a complete investigation with the sampling experiments. Instead, we provide here some preliminary results with limited parameter combinations.
The method of generating conception times from the distribution given in (2.14) is described in detail in Sheps and Menken (1972b). We generated 100 samples of 500 conception times each, for the parameter combinations of a and b given in Table (3.18), with single censoring times t = 12 and 24.
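A sampling experiment of this kind can be sketched as follows. This is not the generation method of Sheps and Menken (1972b), which is not reproduced here; it is a minimal sketch that assumes the mixture representation of the model (a fecundability p drawn from a Beta(a, b) distribution, then a geometric waiting time given p). All function names are hypothetical.

```python
import math
import random


def draw_conception_time(a, b, rng):
    """One waiting time: p ~ Beta(a, b), then X | p geometric on 1, 2, ..."""
    p = rng.betavariate(a, b)
    while not 0.0 < p < 1.0:              # guard against degenerate draws
        p = rng.betavariate(a, b)
    u = 1.0 - rng.random()                # uniform on (0, 1]
    # Inverse-CDF draw: smallest x with (1 - p)^x <= u.
    return max(1, math.ceil(math.log(u) / math.log1p(-p)))


def simulate_sample(a, b, n, rng):
    """A sample of n conception times."""
    return [draw_conception_time(a, b, rng) for _ in range(n)]


def censor_sample(times, t):
    """Frequencies for months 1..t and the number censored beyond t."""
    freq = [0] * t
    censored = 0
    for x in times:
        if x <= t:
            freq[x - 1] += 1
        else:
            censored += 1
    return freq, censored


rng = random.Random(1972)                 # arbitrary seed, for repeatability
samples = [simulate_sample(3.5, 10.0, 500, rng) for _ in range(100)]
freq, censored = censor_sample(samples[0], 12)
print(freq, censored)
```

Each censored sample (the monthly frequencies plus the count beyond t) is then the input to the moment type, maximum likelihood and minimum chisquare procedures.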
TABLE 3.18
Values of a, b and Mean Fecundability
Used for Simulation

      a          b          p
     3.5       10.0       0.2592
     4.5       12.86      0.2592
     4.5       11.57      0.28
     5.0       12.86      0.28
The parameter combinations chosen show a change in fecundability by changing either a or b or both. Also, two sets have the same mean fecundability but different values of a and b.
Since moment type estimates exist for all the parameter combinations chosen, the estimates are considered as not available when either estimate, of a or of b, becomes negative. For the maximum likelihood estimates and minimum chisquare estimates, a maximum of 50 iterations is allowed for convergence. Otherwise, the estimates are considered to be nonconvergent. The program also examines for negative values of the parameters, as well as the values of the log-likelihood function and the chisquare statistic.
The summary results based on the experiments are given in Tables (3.19)-(3.22). Since the estimates are not available in some cases for the reasons stated before, we have calculated the summary statistics based only on the available number of estimates out of the 100 samples generated. In order to have a rough idea about the symmetry of the distribution of the estimates, apart from the mean and standard deviation, the median, quartiles and trimeans are also given. Trimeans are calculated using the formula
    Trimean = \frac{Q_1 + 2M + Q_3}{4} ,
where Q_1 is the first quartile, M is the median, and Q_3 is the third quartile of the distribution. When the distribution is symmetric, the population mean, median, and trimean will be the same.
In a few instances, estimates are not available. The non-availability is more frequent for minimum chisquare estimates. This seems to be happening as t increases and the variance of the conception time distribution decreases (i.e., as a increases). It seems that, for the selected values of a and b, there is a positive bias for the estimates, and the bias decreases as t increases for fixed a and b. Similar results were observed for the maximum likelihood estimates in the asymptotic case by Sheps and Mustafa (1972). Since the estimates of a and b are highly correlated, the bias seems to have little effect on the estimates of mean fecundability. Similar results from maximum likelihood estimates are also obtained by Sheps and Menken (1972b). The calculated values of the mean, median, and trimean are fairly close in every case.
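The trimean, (Q1 + 2M + Q3)/4, is simple to compute. The sketch below assumes Python's `statistics.quantiles` with its default (exclusive) method for the quartiles, which need not be the quartile convention used in the original computations; the function names are hypothetical. The numerical check uses the quartiles and median of the moment estimates of a for t = 12 in Table 3.19.

```python
import statistics


def trimean(q1, median, q3):
    """Trimean = (Q1 + 2 * M + Q3) / 4."""
    return (q1 + 2.0 * median + q3) / 4.0


def trimean_of_sample(values):
    """Trimean of a sample, with quartiles from statistics.quantiles."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return trimean(q1, median, q3)


# Moment estimates of a, t = 12 (Table 3.19): Q1, median, Q3.
print(round(trimean(2.8585, 3.4579, 4.3464), 4))

# For a symmetric sample the trimean coincides with the mean and median.
print(trimean_of_sample([1, 2, 3, 4, 5]))
```

Comparing the trimean with the mean and the median of the 100 estimates gives the rough check on symmetry described above.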
TABLE 3.19
Summary Statistics of Simulated Samples, by Censoring Time
(Number of Samples Generated = 100)
a = 3.5,  b = 10.0,  p = .2592

t = 12
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                         99
Parameter              a        b        p         a        b        p         a        b        p
Mean                3.7253  10.8853   .2596     3.7209  10.8549   .2590     3.9230  11.6505   .2557
s.d.                1.3552   4.7916   .0167     1.2507   4.3682   .0152     1.3492   4.7535   .0153
Median              3.4579   9.8086   .2606     3.4497  10.1429   .2591     3.6285  10.5298   .2563
Quartiles   Q1      2.8585   7.7559   .2474     2.8518   7.8616   .2487     2.9374   8.1216   .2435
            Q3      4.3464  13.0185   .2703     4.1575  12.4107   .2692     4.5892  13.7895   .2657
Trimean             3.5302  10.0979   .2598     3.4772  10.1429   .2590     3.6959  10.7427   .2555

t = 24
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                         96
Parameter              a        b        p         a        b        p         a        b        p
Mean                3.7039  10.7728   .2597     3.6771  10.6810   .2585     3.5897  10.8487   .2510
s.d.                1.0680   3.6831   .0172      .9334   3.2098   .0142      .8373   3.0438   .0140
Median              3.6143  10.8788   .2588     3.4602  10.1806   .2586     3.4896  10.5650   .2504
Quartiles   Q1      2.8574   7.7134   .2472     3.0144   8.2564   .2487     2.9530   8.3886   .2407
            Q3      4.2979  12.6533   .2684     4.1688  11.8644   .2678     4.0149  12.2207   .2605
Trimean             3.5960  10.5311   .2583     3.5259  10.1806   .2586     3.4868  10.4348   .2504
TABLE 3.20
Summary Statistics of Simulated Samples, by Censoring Time (t)
(Number of Samples Generated = 100)
a = 4.5,  b = 12.86,  p = .2592

t = 12
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                      99                          99                         99
Parameter              a        b        p         a        b        p         a        b        p
Mean                5.3185  15.5832   .2595     5.2750  15.4440   .2596     5.6593  16.8936   .2565
s.d.                2.6058   8.7796   .0145     2.5523   8.5950   .0146     2.9540  10.1734   .0147
Median              4.6052  13.0009   .2588     4.6069  13.1033   .2588     4.8558  14.2504   .2574
Quartiles   Q1      3.7534  10.2432   .2515     3.7323  10.3918   .2518     3.9331  11.0820   .2482
            Q3      5.9591  18.3291   .2675     5.8114  17.2868   .2668     6.2135  18.7581   .2652
Trimean             4.7308  13.6436   .2592     4.6894  13.4713   .2591     4.9645  14.5852   .2570

t = 24
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                      97                          97                         96
Parameter              a        b        p         a        b        p         a        b        p
Mean                5.1331  14.7674   .2628     5.1064  14.6769   .2619     4.6708  13.8661   .2552
s.d.                2.0968   7.0129   .0169     1.9804   6.5987   .0149     1.5007   5.2124   .0148
Median              4.6590  13.1704   .2612     4.7196  13.2274   .2606     4.3744  13.0226   .2554
Quartiles   Q1      3.9294  10.5646   .2540     4.0354  11.1479   .2534     3.8081  10.8628   .2468
            Q3      5.8161  16.5835   .2728     5.6200  16.1511   .2710     5.1981  15.6498   .2643
Trimean             4.7659  13.3722   .2623     4.7737  13.4385   .2614     4.4388  13.1395   .2554
TABLE 3.21
Summary Statistics of Simulated Samples, by Censoring Time
(Number of Samples Generated = 100)
a = 4.5,  b = 11.57,  p = .28

t = 12
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                        100
Parameter              a        b        p         a        b        p         a        b        p
Mean                5.2371  13.8848   .2793     5.1645  13.6538   .2798     5.4868  14.7689   .2759
s.d.                2.9948   8.9720   .0167     2.8074   8.3990   .0156     3.5106  10.5709   .0155
Median              4.5976  12.0775   .2803     4.5501  11.9292   .2787     4.6701  12.6303   .2746
Quartiles   Q1      3.6954   9.1023   .2674     3.6203   8.9981   .2678     3.7558   9.5803   .2645
            Q3      5.9769  15.5207   .2892     5.8695  15.5655   .2895     6.0157  16.4577   .2852
Trimean             4.7169  12.1945   .2793     4.6475  12.1055   .2787     4.7780  12.8247   .2748

t = 24
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                         93
Parameter              a        b        p         a        b        p         a        b        p
Mean                4.8529  12.7278   .2803     4.8379  12.6733   .2793     4.3880  11.8461   .2721
s.d.                1.7731   5.3715   .0174     1.6399   4.9376   .0148     1.2572   3.9356   .0146
Median              4.4931  11.6502   .2796     4.5009  11.7289   .2789     4.0832  11.1229   .2724
Quartiles   Q1      3.9177   9.6952   .2705     3.8030   9.5388   .2709     3.5834   9.2454   .2646
            Q3      5.3452  14.8466   .2889     5.3925  14.3986   .2872     4.7719  13.1115   .2807
Trimean             4.5622  11.9606   .2797     4.5493  11.8475   .2790     4.1304  11.1507   .2725
TABLE 3.22
Summary Statistics of Simulated Samples, by Censoring Time
(Number of Samples Generated = 100)
a = 5.0,  b = 12.86,  p = .28

t = 12
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                        100
Parameter              a        b        p         a        b        p         a        b        p
Mean                5.7474  14.9948   .2814     5.7411  14.9770   .2815     6.1120  16.2484   .2780
s.d.                2.4780   7.1744   .0161     2.5122   7.2884   .0162     2.9258   8.6102   .0161
Median              5.1458  13.4441   .2799     5.0894  13.1164   .2777     5.3106  13.9313   .2748
Quartiles   Q1      4.0986  10.1235   .2684     4.1525  10.1246   .2695     4.2893  10.5792   .2673
            Q3      6.5250  17.2511   .2930     6.6524  17.5583   .2928     6.9413  19.0914   .2901
Trimean             5.2289  13.5657   .2803     5.2459  13.4790   .2794     5.4629  14.3833   .2768

t = 24
Estimators                  Moment                       MLE                     MIN CHISQ.
No. of Estimates
Obtained                     100                         100                        100
Parameter              a        b        p         a        b        p         a        b        p
Mean                5.3862  13.9110   .2840     5.4368  14.0546   .2825     4.8126  12.8656   .2756
s.d.                1.9438   5.8388   .0176     1.7948   5.3987   .0152     1.3586   4.4002   .0152
Median              4.9181  12.4064   .2827     4.9504  12.4626   .2827     4.5743  11.9509   .2762
Quartiles   Q1      3.9009   9.7881   .2725     4.0726   9.9539   .2729     3.7097   9.5260   .2654
            Q3      6.2922  16.6312   .2961     6.6535  17.7706   .2904     5.6529  15.5980   .2849
Trimean             5.0073  12.8080   .2835     5.1568  13.1624   .2821     4.6211  12.2564   .2757
CHAPTER IV
CONCEPTION TIME MODEL - SAMPLES FROM A
TRUNCATED DISTRIBUTION
4.1 Introduction
Data on conception times are often not collected through prospective studies. Instead, retrospective studies are employed to gather such data. In retrospective studies, women with specified characteristics are selected at a point of time and are asked to state their past history. Let us think of a similar survey in which sampling is done from a group of women who were married at the same time and have conceived at least once at the time of survey. They are then asked to state the time from marriage to first conception. Further, we assume that there are no premarital conceptions and that the population sampled contains all women of the marriage cohort who have conceived at least once. If the samples are selected from a marriage cohort with duration of marriage t years, this method of survey ignores all women from the original cohort who have not conceived by that time. Thus the population surveyed is the one formed from the "complete" population described earlier by ignoring all women whose conception time exceeds time t. The data obtained can be considered as a sample from a population having an underlying conception time distribution which is a truncated distribution. In this chapter we shall state a model for such data and shall study the properties of various estimators of the parameters of such a model.
4.1.1 A Truncated Discrete Time Model for Conception
Let the "complete" discrete time distribution be given as follows:
    P[X = x] = \pi_x , \quad x = 1, 2, \ldots ,                                   (4.1)
where \pi_x is the probability of a conception in month x, and \sum_{x} \pi_x = 1. If the population is truncated at the month t, the truncated population is governed by the probability law
    P[X = x] = \pi_x^* , \quad x = 1, 2, \ldots, t ,                              (4.2)
where
    \pi_x^* = \frac{\pi_x}{\sum_{i=1}^{t} \pi_i} .
In the specific model we have considered before, \pi_x is given by (2.14):
    \pi_x = \frac{a}{a+b} , \quad x = 1 ,
    \pi_x = \frac{a\, b(b+1) \cdots (b+x-2)}{(a+b)(a+b+1) \cdots (a+b+x-1)} , \quad x = 2, 3, \ldots ,
where a and b > 0. Also,
    \sum_{x=1}^{t} \pi_x = 1 - \sum_{x=t+1}^{\infty} \pi_x = 1 - P(X > t) ,
where P(X > t) is given by (2.15).
In the following sections we will obtain various estimators of the parameters a and b, when the samples are from this specific truncated population.
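The truncated law (4.2) can be evaluated with a few lines of code. The sketch below assumes the beta-geometric probabilities of (2.14), computed through the recursion pi_{x+1} = pi_x (b+x-1)/(a+b+x), which follows from the product form of pi_x; the function names and the illustrative values a = 4.5, b = 11.57, t = 12 are choices made here, not results from the text.

```python
def complete_pmf(a, b, t):
    """[pi_1, ..., pi_t] of the complete beta-geometric distribution."""
    probs = []
    pi = a / (a + b)                      # pi_1
    for x in range(1, t + 1):
        probs.append(pi)
        pi *= (b + x - 1) / (a + b + x)   # pi_{x+1} from pi_x
    return probs


def truncated_pmf(a, b, t):
    """[pi*_1, ..., pi*_t]: the complete probabilities rescaled to sum to 1."""
    probs = complete_pmf(a, b, t)
    total = sum(probs)                    # equals 1 - P(X > t)
    return [p / total for p in probs]


a, b, t = 4.5, 11.57, 12
pi = complete_pmf(a, b, t)
pi_star = truncated_pmf(a, b, t)
print(round(1.0 - sum(pi), 4))            # P(X > t), the mass removed by truncation
print([round(p, 4) for p in pi_star[:3]])
```

Truncation rescales every probability by the same factor, so ratios such as pi*_2 / pi*_1 are the same as in the complete distribution.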
4.2 Estimates from Proportions Conceived
Let p_i, i = 1, 2, \ldots, t, denote the proportion of women in the sample who conceived in the i-th month. If t \geq 3, we can derive estimates of a and b by equating the p_i's to the actual probabilities. For example, by equating the proportions conceiving during the first three months we have the following set of equations:
    p_i = \pi_i^* , \quad i = 1, 2, 3 ,                                           (4.3)
where \pi_x^* is defined in (4.2). From the probability law given in (2.14), we have
    \pi_2 = \frac{ab}{(a+b)(a+b+1)} ,
    \pi_3 = \frac{ab(b+1)}{(a+b)(a+b+1)(a+b+2)} .
Thus, from (4.3) we get
    \frac{p_2}{p_1} = \frac{\pi_2^*}{\pi_1^*} = \frac{\pi_2}{\pi_1} = \frac{b}{a+b+1} ,          (4.4)
    \frac{p_3}{p_2} = \frac{\pi_3^*}{\pi_2^*} = \frac{\pi_3}{\pi_2} = \frac{b+1}{a+b+2} .        (4.5)
Solving (4.4) and (4.5) we obtain
    \hat{b} = \frac{p_2 (p_2 - p_3)}{p_1 p_3 - p_2^2}                             (4.6)
and
    \hat{a} = \frac{(p_1 - p_2)(p_2 - p_3)}{p_1 p_3 - p_2^2} - 1 .                (4.7)
If \hat{a} and \hat{b} are positive, they can be considered as estimators of a and b. It is to be noted that any three consecutive class proportions can be used in the above manner to derive the estimators of a and b. Since the p_i are consistent estimators of the \pi_i^*, and \hat{a} and \hat{b} are functions of the p_i, they are also consistent estimators of a and b.
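The estimators based on the first three proportions can be sketched in code. The closed forms used below, b = p2(p2 - p3)/(p1 p3 - p2^2) and a = (p1 - p2)(p2 - p3)/(p1 p3 - p2^2) - 1, are obtained here by solving the two ratio equations p2/p1 = b/(a+b+1) and p3/p2 = (b+1)/(a+b+2); they are algebraically equivalent to, but not necessarily written in the same form as, the expressions in the original. Function names are hypothetical.

```python
def truncated_pmf(a, b, t):
    """Truncated beta-geometric probabilities pi*_1, ..., pi*_t."""
    probs = []
    pi = a / (a + b)
    for x in range(1, t + 1):
        probs.append(pi)
        pi *= (b + x - 1) / (a + b + x)
    total = sum(probs)
    return [p / total for p in probs]


def estimates_from_proportions(p1, p2, p3):
    """Solve p2/p1 = b/(a+b+1) and p3/p2 = (b+1)/(a+b+2) for (a, b)."""
    denom = p1 * p3 - p2 ** 2
    b_hat = p2 * (p2 - p3) / denom
    a_hat = (p1 - p2) * (p2 - p3) / denom - 1.0
    return a_hat, b_hat


# With exact truncated probabilities the parameters are recovered.
p = truncated_pmf(3.0, 5.0, 6)
a_hat, b_hat = estimates_from_proportions(p[0], p[1], p[2])
print(round(a_hat, 6), round(b_hat, 6))
```

With observed proportions the denominator p1 p3 - p2^2 can be close to zero or negative, which is one way the estimates can fail to be positive.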
4.3 Moment Type Estimators
The r-th moment about zero of the distribution truncated at t is given by
    \mu_r'^{*} = \sum_{x=1}^{t} x^r \pi_x^* .                                     (4.8)
The usual moment estimators are obtained by solving for a and b the equations formed by setting the first two sample moments equal to the corresponding population moments. For the distribution under consideration such a procedure fails to give an explicit solution for a and b. Hence we use a technique similar to the one described in Section 3.2 to derive moment type estimators for a truncated distribution.
Let us consider the sample of size n from a truncated distribution as a subset of a larger sample of N from the complete population.
N is considered to be unknown. Let f_i, i = 1, 2, \ldots, t, denote the frequency of women who conceive in month i. Let
    T_0 = \sum_{i=1}^{t} f_i , \quad
    T_1 = \sum_{i=1}^{t} i f_i , \quad
    T_2 = \sum_{i=1}^{t} i(i-1) f_i , \quad
    T_3 = \sum_{i=1}^{t} i(i-1)(i-2) f_i ,                                        (4.9)
and let
    T_0^* = T_0 + N \sum_{i=t+1}^{\infty} \pi_i ,
    T_1^* = T_1 + N \sum_{i=t+1}^{\infty} i \pi_i ,
    T_2^* = T_2 + N \sum_{i=t+1}^{\infty} i(i-1) \pi_i ,
    T_3^* = T_3 + N \sum_{i=t+1}^{\infty} i(i-1)(i-2) \pi_i .                     (4.10)
Then, we can consider T_1^*/T_0^*, T_2^*/T_0^* and T_3^*/T_0^* as estimates of the first three factorial moments of the complete distribution. Equating these estimates to their actual values given in (2.1), we get the following set of equations:
    \frac{T_1^*}{T_0^*} = \frac{a+b-1}{a-1} ,                                     (4.11)
    \frac{T_2^*}{T_0^*} = \frac{2(a+b-1)b}{(a-1)(a-2)} ,                          (4.12)
    \frac{T_3^*}{T_0^*} = \frac{6(a+b-1)b(b+1)}{(a-1)(a-2)(a-3)} .                (4.13)
Equations (4.11), (4.12) and (4.13) contain three unknowns a, b and N. From (4.11) we have
    (a-1) T_1^* = (a+b-1) T_0^* ,
i.e.,
    (a-1) \Big[ T_1 + N \sum_{i=t+1}^{\infty} i \pi_i \Big] = (a+b-1) \Big[ T_0 + N \sum_{i=t+1}^{\infty} \pi_i \Big] .      (4.14)
Set \sum_{i=t+1}^{\infty} \pi_i = R. Substituting for \sum_{i=t+1}^{\infty} i \pi_i from (3.4), the equation (4.14) can be written as
    a t N R = (a+b-1) T_0 - (a-1) T_1 .                                           (4.15)
From (4.11) and (4.12) we have
    (a-2) T_2^* = 2b T_1^* ,
i.e.,
    (a-2) \Big[ T_2 + N \sum_{i=t+1}^{\infty} i(i-1) \pi_i \Big] = 2b \Big[ T_1 + N \sum_{i=t+1}^{\infty} i \pi_i \Big] .   (4.16)
Substituting for the relevant sums from (3.4) and (3.5), (4.16) can be simplified as
    a t(t+1) N R = 2b T_1 - (a-2) T_2 .                                           (4.17)
Using (4.15) and (4.17) we get
    b [2T_1 - (t+1) T_0] = (a-2) T_2 + (a-1)(t+1) T_0 - (a-1)(t+1) T_1 .          (4.18)
Equations (4.12) and (4.13) give
    (a-3) T_3^* = 3(b+1) T_2^* ,
i.e.,
    (a-3) \Big[ T_3 + N \sum_{i=t+1}^{\infty} i(i-1)(i-2) \pi_i \Big] = 3(b+1) \Big[ T_2 + N \sum_{i=t+1}^{\infty} i(i-1) \pi_i \Big] .   (4.19)
Substituting for the sums from (3.5) and (3.39), the equation (4.19) can be simplified as
    a t(t+1)(t-1) N R = 3(b+1) T_2 - (a-3) T_3 .                                  (4.20)
Equations (4.17) and (4.20) give
    (t-1) [2b T_1 - (a-2) T_2] = 3(b+1) T_2 - (a-3) T_3 .                         (4.21)
Equation (4.18) gives
    b = \frac{a [-(t+1) T_1 + (t+1) T_0 + T_2] + (t+1) T_1 - (t+1) T_0 - 2 T_2}{2 T_1 - (t+1) T_0} .     (4.22)
Substituting for b in (4.21) we get an estimate of a as
    \hat{a} = \frac{6T_2^2 + 2(t-1)(t+1)T_1^2 - 6T_1T_3 - 3(t+3)T_1T_2 - 2(t+1)(t-4)T_0T_2 - 2(t+1)(t-1)T_1T_0 + 3(t+1)T_0T_3}
                   {3T_2^2 + 2(t-1)(t+1)T_1^2 - 2T_1T_3 - 3(t+1)T_1T_2 - (t+1)(t-4)T_0T_2 - 2(t+1)(t-1)T_1T_0 + (t+1)T_0T_3} .     (4.23)
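An estimate of b can be obtained by putting a = \hat{a} in (4.22).
Let m_{[r]}^* denote the r-th sample factorial moment of the truncated distribution. Then
    m_{[1]}^* = \frac{T_1}{T_0} , \quad m_{[2]}^* = \frac{T_2}{T_0} , \quad m_{[3]}^* = \frac{T_3}{T_0} .       (4.24)
Using (4.24), \hat{a} and \hat{b} can be rewritten as
    \hat{a} = \frac{6m_{[2]}^{*2} + 2(t-1)(t+1)m_{[1]}^{*2} - 6m_{[1]}^*m_{[3]}^* - 3(t+3)m_{[1]}^*m_{[2]}^* - 2(t+1)(t-4)m_{[2]}^* - 2(t+1)(t-1)m_{[1]}^* + 3(t+1)m_{[3]}^*}
                   {3m_{[2]}^{*2} + 2(t-1)(t+1)m_{[1]}^{*2} - 2m_{[1]}^*m_{[3]}^* - 3(t+1)m_{[1]}^*m_{[2]}^* - (t+1)(t-4)m_{[2]}^* - 2(t+1)(t-1)m_{[1]}^* + (t+1)m_{[3]}^*} ,     (4.25)
    \hat{b} = \frac{\hat{a} [m_{[2]}^* - (t+1) m_{[1]}^* + (t+1)] + (t+1) m_{[1]}^* - (t+1) - 2 m_{[2]}^*}{2 m_{[1]}^* - (t+1)} .     (4.26)
The estimates \hat{a} and \hat{b} exist only when the third moment exists, i.e., when a > 3. When t \to \infty, the factorial moments of the truncated distribution coincide with the factorial moments of the complete distribution. The limit of \hat{a} as t \to \infty is
    \lim_{t \to \infty} \hat{a} = \frac{2 (m_{[2]} + m_{[1]} - m_{[1]}^2)}{m_{[2]} + 2 m_{[1]} - 2 m_{[1]}^2} ,     (4.27)
where m_{[r]} denotes the r-th sample factorial moment of the complete distribution. If the r-th sample moment about zero is denoted by m_r, then
    \lim_{t \to \infty} \hat{a} = \frac{2 (m_2 - m_1^2)}{m_2 + m_1 - 2 m_1^2}
and
    \lim_{t \to \infty} \hat{b} = (\hat{a} - 1)(m_1 - 1) .                        (4.28)
The results in (4.28) agree with the moment estimators from complete samples given in (2.19).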
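The moment type estimators of (4.22) and (4.23) can be sketched as follows. The function names are hypothetical, and the check at the end is an assumption-laden illustration rather than a result from the text: it feeds in the exact truncated beta-geometric probabilities in place of observed frequencies, in which case the estimating equations hold exactly and the parameters should be recovered up to rounding error.

```python
def truncated_frequencies(a, b, t):
    """Exact probabilities pi_1..pi_t, used here in place of observed f_i."""
    freq = []
    pi = a / (a + b)
    for x in range(1, t + 1):
        freq.append(pi)
        pi *= (b + x - 1) / (a + b + x)
    return freq


def moment_type_estimates(freq):
    """Estimates of (a, b) from frequencies f_1..f_t truncated at t."""
    t = len(freq)
    T0 = sum(freq)
    T1 = sum(i * f for i, f in enumerate(freq, start=1))
    T2 = sum(i * (i - 1) * f for i, f in enumerate(freq, start=1))
    T3 = sum(i * (i - 1) * (i - 2) * f for i, f in enumerate(freq, start=1))
    # Numerator and denominator of (4.23).
    num = (6 * T2 ** 2 + 2 * (t - 1) * (t + 1) * T1 ** 2 - 6 * T1 * T3
           - 3 * (t + 3) * T1 * T2 - 2 * (t + 1) * (t - 4) * T0 * T2
           - 2 * (t + 1) * (t - 1) * T1 * T0 + 3 * (t + 1) * T0 * T3)
    den = (3 * T2 ** 2 + 2 * (t - 1) * (t + 1) * T1 ** 2 - 2 * T1 * T3
           - 3 * (t + 1) * T1 * T2 - (t + 1) * (t - 4) * T0 * T2
           - 2 * (t + 1) * (t - 1) * T1 * T0 + (t + 1) * T0 * T3)
    a_hat = num / den
    # Equation (4.22) with a replaced by its estimate.
    b_hat = ((a_hat * (T2 + (t + 1) * (T0 - T1)) + (t + 1) * (T1 - T0) - 2 * T2)
             / (2 * T1 - (t + 1) * T0))
    return a_hat, b_hat


a_hat, b_hat = moment_type_estimates(truncated_frequencies(7.0, 10.0, 12))
print(round(a_hat, 6), round(b_hat, 6))
```

With real frequencies the function is called in the same way, with `freq[i-1]` holding the number of women conceiving in month i; negative results would be treated as "not available".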
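4.3.1 Large Sample Properties of Moment Estimators
The estimates obtained in (4.25) and (4.26) are functions of sample factorial moments. Hence, they can also be considered as functions of sample moments. By application of Slutsky's theorem [Cramer (1946), Section 20.6], the estimates are consistent, and they are asymptotically normally distributed as T_0 \to \infty. The asymptotic variance-covariance matrix is obtained below.
Since factorial moments are functions of moments about zero, we can write
    m_{[1]}^* = m_1^* , \quad m_{[2]}^* = m_2^* - m_1^* , \quad m_{[3]}^* = m_3^* - 3 m_2^* + 2 m_1^* ,     (4.29)
where m_i^*, i = 1, 2, 3, stand for the sample moments about zero for the truncated distribution. Let V denote a (3 x 3) matrix with (r,s)-th element \mu_{r+s}'^{*} - \mu_r'^{*} \mu_s'^{*}, where \mu_r'^{*} represents the r-th population moment of the truncated distribution. Let f_j, j = 1, 2, stand for the two functions giving \hat{a} and \hat{b}. Let J represent the (2 x 3) matrix
    J = \Big[ \frac{\partial f_j}{\partial m_i^*} \Big] , \quad j = 1, 2; \; i = 1, 2, 3 .
But
    \frac{\partial f_j}{\partial m_i^*} = \sum_{k=1}^{3} \frac{\partial f_j}{\partial m_{[k]}^*} \frac{\partial m_{[k]}^*}{\partial m_i^*} , \quad j = 1, 2; \; i = 1, 2, 3 .
Hence, J can be written as J = K_1 K_2, where
    K_1 = \Big[ \frac{\partial f_j}{\partial m_{[k]}^*} \Big] \; (2 \times 3)
and
    K_2 = \Big[ \frac{\partial m_{[k]}^*}{\partial m_i^*} \Big] \; (3 \times 3) .
Then the asymptotic variance-covariance matrix of \hat{a} and \hat{b} is given by (Cramer (1946), Section 27.7)
    W = \frac{1}{n} K_1 K_2 V K_2' K_1' ,                                         (4.30)
where W is the asymptotic variance-covariance matrix of \hat{a} and \hat{b}. Using the relation between moments about zero and factorial moments, it can be shown that
    K_2 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -3 & 1 \end{pmatrix} .   (4.31)
The elements of the K_1 matrix are obtained below, where y and x respectively stand for the expressions in the numerator and denominator in (4.25). Writing \hat{a} x = y and differentiating, we get the following equations:
    [-2 m_{[1]}^* + (t+1)] \hat{a} + x \frac{\partial \hat{a}}{\partial m_{[3]}^*} = -6 m_{[1]}^* + 3(t+1) ,     (4.32)
    [6 m_{[2]}^* - 3(t+1) m_{[1]}^* - (t+1)(t-4)] \hat{a} + x \frac{\partial \hat{a}}{\partial m_{[2]}^*} = 12 m_{[2]}^* - 3(t+3) m_{[1]}^* - 2(t+1)(t-4) ,     (4.33)
    [4(t-1)(t+1) m_{[1]}^* - 2 m_{[3]}^* - 3(t+1) m_{[2]}^* - 2(t+1)(t-1)] \hat{a} + x \frac{\partial \hat{a}}{\partial m_{[1]}^*}
        = 4(t-1)(t+1) m_{[1]}^* - 6 m_{[3]}^* - 3(t+3) m_{[2]}^* - 2(t+1)(t-1) .     (4.34)
The derivatives of \hat{a} with respect to the first three factorial moments can be easily obtained from the equations (4.32)-(4.34). Using the relation between \hat{a} and \hat{b} given in (4.26) we can obtain the following equations:
    [2 m_{[1]}^* - (t+1)] \frac{\partial \hat{b}}{\partial m_{[3]}^*} = [m_{[2]}^* - (t+1) m_{[1]}^* + (t+1)] \frac{\partial \hat{a}}{\partial m_{[3]}^*} ,     (4.35)
    [2 m_{[1]}^* - (t+1)] \frac{\partial \hat{b}}{\partial m_{[2]}^*} = [m_{[2]}^* - (t+1) m_{[1]}^* + (t+1)] \frac{\partial \hat{a}}{\partial m_{[2]}^*} + \hat{a} - 2 ,     (4.36)
    2 \hat{b} + [2 m_{[1]}^* - (t+1)] \frac{\partial \hat{b}}{\partial m_{[1]}^*} = [m_{[2]}^* - (t+1) m_{[1]}^* + (t+1)] \frac{\partial \hat{a}}{\partial m_{[1]}^*} - (t+1) \hat{a} + (t+1) .     (4.37)
The equations (4.32)-(4.37) provide all the elements of K_1. The matrix V involves the sixth moment of the distribution, which exists only when a > 6. Hence, the asymptotic variance-covariance matrix of \hat{a} and \hat{b} can be computed only when a > 6.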
4.4 Maximum Likelihood Estimators
If n_x denotes the number of women conceiving in the month x (x \leq t), and \sum_{x=1}^{t} n_x = n, the sample size, then the likelihood function can be written as
    L(\theta) = \text{constant} \prod_{x=1}^{t} (\pi_x^*)^{n_x} ,                 (4.38)
where \theta = (a, b).
The iterative equations to obtain the maximum likelihood estimates of a and b are given by equation (1.9), where the matrices involved are defined appropriately, and \hat{\theta}(n) stands for the maximum likelihood estimate of \theta = (a, b) at the n-th iteration. The H matrix has as its elements the derivatives \partial \pi_i^* / \partial a and \partial \pi_i^* / \partial b, i = 1, 2, \ldots, t. Using the relation between \pi_i^* and \pi_i given in (4.2), it can be shown that
    \frac{\partial \pi_i^*}{\partial a} = \pi_i^* \Big[ \frac{\partial \log \pi_i}{\partial a} - \sum_{l=1}^{t} \pi_l^* \frac{\partial \log \pi_l}{\partial a} \Big] , \quad i = 1, 2, \ldots, t ,     (4.39)
and
    \frac{\partial \pi_i^*}{\partial b} = \pi_i^* \Big[ \frac{\partial \log \pi_i}{\partial b} - \sum_{l=1}^{t} \pi_l^* \frac{\partial \log \pi_l}{\partial b} \Big] , \quad i = 1, 2, \ldots, t .     (4.40)
The computational formulas given for the derivatives of \log \pi_i in (2.23) and (2.24), along with the relations (4.39) and (4.40), enable us to compute the H matrix. The V_{\pi^*}^{-1} matrix is easily computed by replacing \pi_i by \pi_i^*, i = 1, 2, \ldots, t, in the matrix V_{\pi}^{-1} given in (1.5).
The asymptotic variance-covariance matrix of \hat{a} and \hat{b}, denoted by D^*, is obtained from (1.10), with H and V_{\pi^*}^{-1} computed as above.     (4.41)
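The likelihood (4.38) is easy to evaluate directly, which is useful for checking the output of an iterative routine. The sketch below is not the iterative scheme of (1.9); it only computes the log-likelihood of a truncated sample (up to the constant), assuming the beta-geometric probabilities of (2.14). The function names and the parameter values compared are illustrative. The frequencies are the Hutterite observations for months 1 to 12 as listed in Table 3.11, treated here as a sample truncated at t = 12.

```python
import math


def truncated_log_pmf(a, b, t):
    """log pi*_x for x = 1..t under the truncated beta-geometric law."""
    probs = []
    pi = a / (a + b)
    for x in range(1, t + 1):
        probs.append(pi)
        pi *= (b + x - 1) / (a + b + x)
    log_total = math.log(sum(probs))
    return [math.log(p) - log_total for p in probs]


def log_likelihood(a, b, counts):
    """Sum of n_x log pi*_x; the multinomial constant is omitted."""
    log_p = truncated_log_pmf(a, b, len(counts))
    return sum(n * lp for n, lp in zip(counts, log_p))


# Hutterite frequencies for months 1..12 (Table 3.11), truncated at t = 12.
counts = [103, 53, 43, 27, 30, 9, 12, 9, 6, 8, 10, 5]
for a, b in [(2.0, 5.0), (3.0, 8.0), (5.0, 14.0)]:   # illustrative values only
    print(a, b, round(log_likelihood(a, b, counts), 3))
```

A grid of such evaluations gives a rough picture of the likelihood surface and a way to choose starting values for the iteration.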
4.5 Minimum Chisquare Estimators
The iterative equations to obtain the minimum chisquare estimates are of the form given in (1.15), where \dot{\chi}^2(\theta) and \ddot{\chi}^2(\theta) are defined in (1.13) and (1.14) respectively, and \tilde{\theta}(n) is the minimum chisquare estimate of \theta = (a, b) at the n-th iteration.
Set \theta_1 = a and \theta_2 = b. Then the elements of the \dot{\chi}^2(\theta) matrix are given by (4.42), for j = 1, 2. The derivatives \partial \pi_i^* / \partial a and \partial \pi_i^* / \partial b are given by (4.39) and (4.40), respectively. The elements of \ddot{\chi}^2(\theta) are given by (4.43), for j, m = 1, 2; they involve the sums over i = 1, 2, \ldots, t of terms in p_i^2 / \pi_i^{*2}. The second partial derivatives involved in (4.43) are computed using the relations (4.44), for j, m = 1, 2, obtained by partially differentiating \pi_i^* twice, where \pi = \sum_{i=1}^{t} \pi_i. The first and second derivatives of \pi_i in terms of a and b are given in section 2.3.1.2.
4.6 Asymptotic Relative Efficiencies
The asymptotic relative efficiency of the moment estimators compared to the maximum likelihood estimators is computed according to definition 1.8.2. The variance-covariance matrix of the moment estimators exists only when a > 6. The asymptotic relative efficiencies computed for selected combinations of a and b, and truncation times 12 and 48 months, are given in Tables 4.1 and 4.2. The asymptotic relative efficiency for complete samples (or infinite truncation time) is given in Table 3.3. The efficiency of the moment estimates increases as a increases. Also, it decreases as b increases for small values of t. But as t increases, the efficiency also increases, and the rate of increase is higher for higher values of b. As a result, the difference in efficiency for different values of b narrows down as t increases, and for the complete samples a reverse trend is noticed.
TABLE 4.1
Asymptotic Relative Efficiency of Moment Estimators
to MLE for Selected Values of a and b
and Truncation Time, t = 12

                          b
   a          2.0      5.0      10.0     15.0
   6.001     50.72    41.79    37.08    36.57
   6.1       51.09    42.04    37.15    36.59
   6.5       52.51    43.07    37.45    36.74
   7.0       54.11    44.32    37.90    36.79
  10.0       60.70    50.84    41.52    37.58
  20.0       68.42    62.30    53.09    46.39
  50.0       72.69    70.20    65.86    61.63
 100.0       74.03    72.78    70.57    68.35
TABLE 4.2
Asymptotic Relative Efficiency of Moment Estimators
to MLE for Selected Values of a and b
and Truncation Time, t = 48

                          b
   a          2.0      5.0      10.0     15.0
   6.001     67.53    64.59    57.20    50.87
   6.1       68.04    65.11    57.75    51.37
   6.5       69.92    67.05    59.85    53.35
   7.0       71.97    69.16    62.21    55.65
  10.0       79.70    77.27    71.79    66.14
  20.0       87.22    85.87    82.78    79.57
  50.0       90.56    90.04    88.76    87.39
 100.0       91.48    91.22    90.58    89.89
4.7 Examples
This section gives the estimates of a and b treating the data given in Majumdar and Sheps (1970) as truncated at various points. The data are described in Section (3.6). Estimates are obtained for the Hutterite data treating the data as truncated at months 6, 12, and 24. For the Princeton data, estimates are obtained with truncation times 12, 24, 36, and 48 months.
The moment estimates exist only when a > 3, and their variances exist only when a > 6. For the Princeton data, the moment estimates are not available for the cases considered. Estimates and their standard errors, where available, are given in Tables (4.3)-(4.5). The moment estimates could be calculated in only one instance. The coefficient of variation of the estimates goes down as time increases. The increasing number of observations is also a factor in this phenomenon. Table (4.4) shows that the estimate of a and mean fecundability for PFS group I is very low compared to other truncation times. A possible explanation for this is the clustering of observations at months 18 and 24 and the presence of only a few observations in between. It is evident from the chisquare function minimized that the estimates depend on the observed class frequencies. We have not examined here in detail the biases in the estimates due to extreme observations or due to misclassification of the data.
The goodness of fit of the data using selected estimates is given in Tables (4.6)-(4.12). As we noticed in censored samples, chisquare values computed on the basis of minimum chisquare estimates from ungrouped data exceed those of maximum likelihood estimates when the data are grouped for the purpose of computing chisquare values. This is clearly seen by comparing Tables (4.8) and (4.9). As before, the Princeton data do not show clear evidence of the fit of the model. The Hutterite data show evidence for the fit of the model.
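The goodness-of-fit values quoted in Tables (4.6)-(4.12) can be checked by hand. The following is a minimal sketch, assuming the statistic is the usual Pearson sum of (observed - expected)^2 / expected over the monthly classes; the function name `pearson_chisq` is illustrative and not from the original program. It uses the Hutterite frequencies for truncation time 6 with the expected frequencies under the maximum likelihood estimates.

```python
def pearson_chisq(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hutterite data, truncation time 6 (Table 4.6):
# observed monthly frequencies and expected frequencies under the MLE.
observed = [103, 53, 43, 27, 30, 9]
expected_mle = [100.0, 60.6, 40.1, 28.1, 20.6, 15.6]

stat = pearson_chisq(observed, expected_mle)
# Degrees of freedom: classes minus one, minus the two fitted parameters (a, b).
df = len(observed) - 1 - 2

# Upper 5% and 1% points of chisquare with 3 degrees of freedom.
crit_05, crit_01 = 7.815, 11.345
print(stat, df, crit_05 < stat < crit_01)
```

The recomputed value is close to the tabulated 8.4225; the small gap is presumably due to the expected frequencies being printed to one decimal place.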
TABLE 4.3
Estimates* from Truncated Samples and
Their Standard Errors (Where Available), Hutterite Data

                                 Truncation Time
Estimates             6                  12                 24
n                     265                315                338

Moments     a         3.5458             -                  -
            b         8.0226             -                  -
            p         0.3065             -                  -
            r         -                  -                  -
            c_a       -                  -                  -
            c_b       -                  -                  -

MLE         a         1.4844 (2.4391)    0.9178 (0.6341)    1.7257 (0.6281)
            b         3.8332 (4.8047)    2.6815 (1.4603)    4.4791 (1.7758)
            p         0.2791 (0.0864)    0.2550 (0.0387)    0.2781 (0.0196)
            r         0.9921             0.9732             0.9707
            c_a       164.32             69.09              36.40
            c_b       125.34             54.46              39.65

MINCHISQ.   a         0.6167 (1.1784)    0.8821 (0.6334)    1.2398 (0.4654)
            b         2.3017 (2.3114)    2.7137 (1.5050)    3.5245 (1.3555)
            p         0.2113 (0.1563)    0.2453 (0.0405)    0.2602 (0.0198)
            r         0.9850             0.9735             0.9637
            c_a       191.08             71.81              37.55
            c_b       100.42             55.46              38.46

* Standard errors are given in parentheses
p        = Estimate of mean fecundability
r        = Correlation coefficient between the estimates of a and b
c_a, c_b = Coefficient of variation of the estimates of a and b respectively
n        = Sample size
TABLE 4.4
Estimates* from Truncated Samples and Their Standard Errors
(Where Available), PFS Data Group I
0.6439 (0.1056)
0.8251 (0.1099)
1.1380 (0.1768)
1.3775 (0.2027)
0.3613 (0.0159)
0.3746 (0.0145)
O. Qca4
0.9069
16.40
13.32
I
II
15.53
"' 7"
I
I
I
17.25
.'
1.L+..L
,
I
0.0651 (0.0780)
..~
r
!
c
I'
,
a
c,
I
D
I
!
0.1636 (0,0700)
i
0.2674 (0.0683)
I
i.
0.8222 (0.1251)
,
I
0.8049 (0.1829)
0.6020 (0.1070)
0.6986 (0.1126)
0.2421 (0.0686)
0.0976 (0.0917)
0.1897 (0.0451)
:
0.2454 (0.0260)
0.8907
I
0.8350
0.9374
57.81
i
I
!
0.9026
119.87
I
22.72
I-1oment estimates do not exist
Symbols are as in Table 4.3
I
:
--+
17.78
!
II
I
I
I
I
I
42.78
I
I
.
25.57
16.12
15.22
1
--+
!
i
I
,.....
w
,.....
TABLE 4.5
Estimates* from Truncated Samples and their Standard Errors (Where Available)
I
Truncation
Time
12
367
n
I
I
PFS
data Group II
24
36
48
408
426
434
"-
a
0.0734 (0.2210)
0.4877 (0.2063)
0.9304 (0.3707)
1. 5791 (0.4844)
0.5748 (0.1754)
1. 7473 (0.4670)
0.6676 (0.1650)
1.9463 (0.4803)
0.2360 (0.0315)
0.2475 (0.0337)
0.2554 (0.0118)
0.9347
0.9256
0.9201
"-
II
b
A
MLE
301.07
I
I
42.30
30.52
24.72
39.84
1I
30.67
26.73
24.68
I
PI
0.0731 (0.1789)
r
0.9407
I
ca
cb
I
!
I
I
0.4134 (0.1966)
1. 5260 (0.4758)
.
IMINCHISQ.
I
I
,
~!
:J
chi
0.0820 (0.1656)
0.9429
258.09
40.83
I
~-,
!
*
I
II
i
I
I
j
I
I
I
!
t
I
0.4750 (0.1634)
I
1.6488 (0.4487)
I
I
I
0.4942 (0.1433)
1. 6488 (0.4237)
0.2132 (0.0362)
0.2237 (0.0244)
0.2266 (0.0208)
0.9331
0.9222
0.9133
I
I
47.55
31.18
I
I
34.40
28.99
27.21
25.11
I
I
I
t-'
W
N
Moment estimates do not exist
Symbols are as in Table 4.3
e
I
I
e
e
TABLE 4.6
Observed and Expected Frequencies of Conception,
by Month, Hutterite Data
(Truncation Time = 6)

            Observed             Expected Frequency
Month       Frequency     Moment      MLE         MIN CHISQ.
1           103            97.7       100.0        99.5
2            53            62.4        60.6        58.5
3            43            41.5        40.1        39.2
4            27            28.5        28.1        28.5
5            30            20.2        20.6        21.9
6             9            14.7        15.6        17.4
Total       265           265.0       265.0       265.0
CHISQ.                    8.7678      8.4225      8.1565
(3 df)                    .01<p<.05   .01<p<.05   .01<p<.05
TABLE 4.7
Observed and Expected Frequencies of Conception,
by Month, Hutterite Data
(Truncation Time = 12)

            Observed        Expected Frequency
Month       Frequency     MLE         MIN CHISQ.
1           103           101.4        99.3
2            53            59.1        58.6
3            43            38.9        38.9
4            27            27.6        27.8
5            30            20.6        20.9
6             9            16.0        16.4
7            12            12.8        13.1
8             9            10.5        10.8
9             6             8.8         9.1
10            8             7.4         7.7
11           10             6.4         6.6
12            5             5.5         5.8
Total       315           315.0       315.0
CHISQ.                    11.7434     11.6237
(9 df)                    p>.05       p>.05
TABLE 4.8
Observed and Expected Frequencies of Conception,
by Month, PFS Data Group I
(Truncation Time = 12)

            Observed        Expected Frequency
Month       Frequency     MLE         MIN CHISQ.
1           380           381.7       370.6
2           153           149.0       144.6
3            94            85.9        85.3
4            45            57.9        58.9
5            32            42.7        44.2
6            61            33.2        35.1
7            15            26.8        28.8
8            19            22.3        24.3
9            17            19.0        21.0
10           10            16.4        18.3
11           10            14.4        16.3
12           26            12.7        14.6
Total       862           862.0       862.0
CHISQ.                    53.2741     51.1771
(9 df)                    p<.01       p<.01
TABLE 4.9
Observed and Expected Frequencies of Conception for
Grouped Data, PFS Group I

            Observed        Expected Frequency**
Month       Frequency     MLE         MIN CHISQ.
1           380           381.7       370.6
2           153           149.0       144.6
3            94            85.9        85.3
4            45            57.9        58.9
5-6          93            75.9        79.3
7-9          51            68.1        74.1
10-12        46            43.5        49.2
Total       862           862.0       862.0
CHISQ.                    12.1072     14.6594
(4 df)                    p<.01       p<.01

** Estimates from ungrouped data.
TABLE 4.10
Observed and Expected Frequencies of Conception,
by Month, PFS Data Group II
(Truncation Time = 24)

            Observed        Expected Frequencies**
Month       Frequency     MLE         MIN CHISQ.
1           129           126.5       122.4
2            57            65.1        63.6
3            53            41.3        40.8
4            26            29.2        29.1
5-6          27            39.4        39.8
7-9          46            36.2        37.2
10-12        29            23.2        24.3
13-24        41            47.1        50.8
Total       408           408.0       408.0
CHISQ.                    13.5614     14.0457
(5 df)                    .01<p<.05   .01<p<.05

** Estimates from ungrouped data.
TABLE 4.11
Observed and Expected Frequencies of Conception,
by Month, PFS Data, Group II
(Truncation Time = 36)

            Observed        Expected Frequencies**
Month       Frequency     MLE         MIN CHISQ.
1           129           125.6       120.7
2            57            66.0        63.7
3            53            42.0        40.9
4            26            29.6        29.1
5-6          27            39.6        39.7
7-9          46            35.8        36.7
10-12        29            22.6        23.7
13-24        41            44.7        48.5
25-36        18            20.1        23.0
Total       426           426.0       426.0
CHISQ.                    13.8956     15.0113
(6 df)                    .01<p<.05   .01<p<.05

** Estimates from ungrouped data.
TABLE 4.12
Observed and Expected Frequencies of Conception,
by Month, PFS Data, Group II
(Truncation Time = 48)

            Observed        Expected Frequencies**
Month       Frequencies   MLE         MIN CHISQ.
1           129           124.4       119.2
2            57            67.0        63.2
3            53            42.8        40.6
4            26            30.1        28.9
5-6          27            40.0        39.3
7-9          46            35.7        36.2
10-12        29            22.2        23.3
13-24        41            42.6        47.5
25-48        26            29.2        35.8
Total       434           434.0       434.0
CHISQ.                    14.3423     16.9218
(6 df)                    .01<p<.05   .01<p<.05

** Estimates from ungrouped data.
4.8 Results from Sampling Experiments
This section describes the results from sampling experiments designed to study the various estimators for different truncation times and for a limited number of parameter combinations. The generation of the data is done according to the method described in Section 3.7 except that only women conceiving before the truncation time are observed. The parameter combinations of a and b used to generate data are given in Table 3.18, and the truncation times are t = 12 and 24 months. For each parameter combination 100 samples of size 500 each are generated.
As before, the moment estimators are considered not available when either a or b is negative. For maximum likelihood and minimum chisquare estimates, a maximum of 50 iterations is allowed. Also, the program checks for negative values of the estimates as well as the values of the log-likelihood functions and chisquare.
The summary results based on the available estimates are given in Tables (4.13)-(4.16). The minimum chisquare estimates did not converge in a large number of cases. We have not examined here, in detail, the reasons for the large number of nonconvergent cases for the minimum chisquare estimates. This makes the comparison of the minimum chisquare estimates difficult. But in general, all the methods seem to yield highly positively biased estimates of a and b, and the bias is large for small values of t. The mean is considerably higher than the median, whereas the trimeans and medians are close. This means that the observed distributions of the estimates are somewhat positively skewed with the presence of a few extreme observations at the right tail. The high correlation between the estimates of a and b may be the reason for the smaller bias observed in the estimate of mean fecundability.
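The sampling scheme and the summary statistics used in Tables (4.13)-(4.16) can be sketched as follows. This is an illustration only: the generation method of Section 3.7 is not reproduced here, so the sketch assumes each woman draws a fecundability from a beta(a, b) distribution and then undergoes one Bernoulli trial per month, and it assumes the trimean is (Q1 + 2 median + Q3) / 4. The function names `simulate_truncated_sample` and `trimean` are hypothetical.

```python
import random
import statistics

def simulate_truncated_sample(a, b, t, size, rng):
    """Return conception months (1..t) for the women who conceive within t months.

    Women who have not conceived by month t are dropped, which mimics a
    truncated sample. Assumed model: beta(a, b) fecundability, geometric wait.
    """
    months = []
    for _ in range(size):
        p = rng.betavariate(a, b)        # this woman's monthly fecundability
        for month in range(1, t + 1):    # one Bernoulli trial per month
            if rng.random() < p:
                months.append(month)
                break                    # conceived; stop following her
    return months

def trimean(values):
    """Tukey's trimean, (Q1 + 2*median + Q3) / 4, from the sample quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return (q1 + 2 * q2 + q3) / 4

rng = random.Random(1972)
sample = simulate_truncated_sample(3.5, 10.0, 12, 500, rng)
print(len(sample), statistics.mean(sample), statistics.median(sample), trimean(sample))
```

For a right-skewed set of values the mean is pulled toward the long tail while the median and trimean stay close to each other, which is the pattern the tables report for the estimates of a and b.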
TABLE 4.13
Summary Statistics of Simulated Samples, by Truncation Time
(Number of Samples Generated 100)
a = 3.5
I
b = 10.0
P = 0.2592
t
= 12
Estimators
MOMENT
MLE
MIN CHISQ.
No. of Estimates
Obtained
96
97
78
Parameters
a
b
p
b
a
a
p
b
p
I
I
Mean
5.4305
15.6421
0.2565
s.d
5.7240
16.6643
0.0190
I
Median
3.5507
10.0023
0.2594
I
I 3.5020
Quartiles
2.2970
6.9534
0.2460
'7.0167
20.2229
4.1039
11.7053
I
I
I
I
Trimean
7.0198 I 20.1379
0.2564
5.7983
16.8007
0.2554
42.8517
0.0184
12.4497
35.8188
0.0169
9.8538
0.2588
3.0577
9.1289
0.2571
2.3463
6.5893
0.2442
2.2660
6.4128
0.2453
0.2681
6.8594
19.5634
0.2680
4.6985
14.7508
0.2672
0.2583
4.0524
11.4651
0.2575
3.2700
9.8554
0.2567
15.2909
I
I
I-'
~
r-..J
e
e
"
e
e
e
e
Table 4.13 continued:
a
I
= 3.5
b
= 10.0
Estimators
No. of Estimates
Obtained
Parameters
P = 0.2592
t
MOMENT
MLE
MIN CHISQ.
100
100
85
a
b
p
0.2617
4.3080
12.6185
0.2591
17.2061
0.0154
2.4348
7.9538
0.0135
5.0522
13.7032
0.2612
3.6422
10.5082
0.2574
0.2522
3.6320
9.7780
0.2522
2.7940
7.7291
0.2484
18.8327
0.2745
6.3165
18.7473
0.2726
5.0090
14.8150
0.2684
14.4670
0.2617
5.0132
13.9834
0.2618
3.7718
10.8902
0.2580
a
b
Mean
6.1724
17.8703
0.2618
s.d
5.6483
16.9348
Median
4.9947
Quarti1es
,
Trimean
= 24
a
b
6.2096
18.0565
0.0157
5.4653
14.5693
0.2602
3.5927
9.8970
6.3807
4.9907
p
I
P
I
I
I
I
f-'
-I:'W
TABLE 4.14
Summary Statistics of Simulated Samples, by Truncation Time
(Number of Samples Generated 100)
a = 4.5
b = 12.86
P
=
0.2592
t = 12
Estimators
MOMENT
MLE
MIN CHISQ.
No. of Estimates
Obtained
93
91
77
Parameters
a
b
Mean
7.7489
22.7957
s.d
14.6304
Median
Quarti1es
I
I
I
Trimean
-p
a
b
0.2570
9.4927
27.2414
44.2634
0.0185
35.6227
3.9306
11.5821
0.2578
2.5047
7.2853
6.3446
4.1777
p
a
b
0.2584
5.1883
15.1661
0.2558
101. 0280
0.0148
9.3803
28.7240
0.0159
4.1379
11.9016
0.2583
3.7356
10.8221
0.2567
0.2493
2.4269
7.6069
0.2495
2.4171
6.8207
0.2556
11. 9705
0.2664
6.1150
17.5325
0.2665
4.9446
14.2530
0.2633
12.1050
0.2578
4.2044
12.2357
3.7802
10.6795
0.2455
p
0.2581
I-'
.c.c-
e
e
e
-
e
e
Table 4.14 continued
a = 4.5
b = 12.86
P = 0.2592
t
= 12
Estimators
MOMENTS
MLE
MIN CHISQ.
No. of Estimates
Obtained
96
96
63
Parameters
a
b
Mean
6.1724
17.8703
s.d
5.6483
Median
Quartiles
a
iJ
b
-
P
f
a
b
P
0.2618
6.2096
I 18.0565
0.2617
4.3080
12.6185
0.2591
16.9348
0.0157
5.4653
17.2061
0.0154
2.4348
7.9538
0.0135
4.9947
14.5693
0.2608
5.0522
13.7038
0.2612
3.6422
10.5082
0.2574
3.5927
9.8970
0.2522
3.6320
9.7787
0.2522
2.7940
7.7291
0.2484
6.3807
18.8327
6.3165
18.7473
0.2726
5.0090
14.8150
0.2684
5.0132
13.9834
0.2618
3.7718
10.8902
0.2580
I
I
I
0.2745
I
Trimean
4.9907
14.4670
I
0.2617
I
~
+:VI
TABLE 4.15
Summary Statistics of Simulated Samples, by Truncation Time
(Number of Samples Generated 100)
a
= 4.5
b
= 11.57
P = 0.28
t
= 12
Estimators
MOMENT
MLE
MIN CHISQ.
No. of Estimates
Obtained
93
96
88
-p
Parameter
a
b
p
a
b
p
a
b
Mean
6.4453
16.7594
0.2774
7.1665
18.5821
0.2778
6.2087
16.4448
0.2740
s.d
11. 7038
30.3286
0.0169
11.4514
29.3952
0.0168
10.8791
28.5809
0.0169
Median
4.1695
10.4703
0.2774
4.3774
11.3095
0.2776
3.9680
10.5554
0.2745
Quarti1es
2.5625
6.6678
0.2643
2.6809
7.0743
0.2651
3.0640
7.3425
0.2683
6.9776
18.8176
0.2882
8.8295
22.7574
0.2910
4.6034
12.1277
0.2772
4.0948
11.0305
0.2737
I
Trimean
6.5084
17.7695
0.2876
4.3526
11.3445
0.2766
I
~
~
0\
e
e
e
e
e
•
e
Table 4.15 continued
a
= 4.5
b
= 11.57
p
= 0.28
t
= 24
Estimators
MOMENT
MLE
MIN CHISQ.
No. of Estimates
Obtained
100
100
71
Parameter
a
b
P
Mean
5.4369
12.3621
0.2792
5.4942
14.5211
s.d
2.7965
8.1769
0.0153
3.1099
Median
4.8551
12.6476
0.2785
Quartiles
3.5213
8.8663
6.4921
4.9309
Trimean
P
a
b
0.2790
4.0260
10.8675
0.2735
9.0677
0.0151
1.8456
5.5046
0.0132
4.7231
12.6138
0.2790
3.4308
9.5676
0.2737
0.2689
3.4872
8.5055
0.2706
2.6677
7.0794
0.2634
16.6658
0.2892
6.5137
16.9467
0.2893
4.8350
12.3615
0.2814
12.1068
0.2788
4.8618
12.6700
0.2795
3.5911
9.6440
0.2731
a
b
P
....
"
.t-
TABLE 4.16
Summary Statistics of Simulated Samples, by Truncation Time
(Number of Samples Generated 100)
a = 5.0
b = 12.86
P = 0.28
t · 12
Estimator
MOMENT
MLE
MIN CHISQ.
No. of Estimates
Obtained
93
93
78
I
Parameter
a
b
p
a
b
p
a
b
Mean
9.7093
25.7689
0.2805
14.1854
36.8921
0.2807
7.7524
20.1899
0.2783
s.d
16.6425
46.8724
0.0164
47.1985
124.1276
0.0163
15.0551
39.5968
0.0156
Median
5.0578
12.3890
0.2783
4.9035
11.9372
0.2778
4.2140
10.7577
0.2766
Quarti1es
3.0640
7.3425
0.2683
2.7990
0.2683
2.5912
7.0943
0.2676
8.8295
22.7574
0.2910
8.4980
21.8122
0.2922
6.5642
17.6701
0.2891
5.5023
13.7195
0.2790
5.2760
13.2858
0.2790
4.3958
11. 5699
0.2775
Trimean
7.4564 .
if
.....
~
(Xl
e
e
e
-
-
e
Table 4.16 continued:
a
= 5.0
b
= 12.86
Estimator
P = 0.28
t
= 24
No. of Estimates
Obtained
99
98
Parameter
a
b
p
Mean
6.1134
15.9327
0.2836
6.6571
I 17.5876
s.d
4.4203
12.9388
0.0159
6.0058
Median
5.0155
12.2207
0.2840
Quartiles
3.6969
9.2074
6.5515
5.0698
Trimean
MIN CHISQ.
MLE
MOMENT
77
a
b
0.2824
4.8012
12.8566
0.2775
17.8421
0.0155
3.4865
10.3908
0.0150
5.1111
12.9348
0.2819
3.9162
10.0836
0.2766
0.2744
4.0257
9.5165
0.2727
3.0651
7.5599
0.2679
17.3537
0.2917
6.5876
17.3696
0.2924
5.0079
13.7972
0.2856
12.7506
0.2835
5.2089
13.1890
0.2822
3.9764
10.3811
0.2767
a
b
p
p
~
~
\0
CHAPTER V
ESTIMATION PROBLEMS IN A MODEL FOR
FIRST LIVEBIRTH INTERVAL
5.1
Introduction
In this chapter, we will consider estimation problems in a model for the first livebirth interval. Such a model is sometimes preferred over conception time models for several reasons. Conceptions aborted at very early stages often go unnoticed; hence data on conception times are subject to gross errors. Also, data on livebirth intervals allow us to estimate simultaneously the risk of conception, the risk of fetal loss and the average length of the non-susceptible period due to a fetal loss. We will discuss in the following sections the estimation of these factors from a model due to George (1967). Simultaneous estimation of the parameters has not been attempted before; instead, values of some of the parameters are assumed and the rest of the unknown parameters are estimated using the moments. The derivation of the model is described in Section 5.2 and the problems of estimation are discussed in the subsequent sections.
5.2 Derivation of the Model
Time to the first livebirth includes the following components:
(1) the sum of waiting times for conceptions,
(2) the period of non-susceptibility associated with the fetal losses occurring during the interval, and
(3) the period of gestation for a livebirth.
The third component is assumed to be a constant equal to 9 months.
Let $T'$ be the duration from marriage to the termination of the pregnancy that results in the first livebirth. In actual situations we observe the random variable $T'$. In this chapter, we consider only the random variable $T$, which is the duration from marriage to the time of conception that results in the first livebirth. The random variable $T$ is related to $T'$ by the relation $T' = T + 9$. If a woman had $n$ fetal losses before the first livebirth conception, her time to first livebirth conception can be written as the sum of $(2n+1)$ random variables
$$T = X_1 + Y_1 + X_2 + Y_2 + \cdots + X_n + Y_n + X_{n+1}, \tag{5.1}$$
where $X_i$ is the waiting time for the $i$th conception, and $Y_i$ is the period of non-susceptibility due to the $i$th conception that ends in a fetal loss.
We make the following assumptions:
(1) The number of fetal losses before the first livebirth follows a geometric distribution
$$P(N = n) = (1-\theta)^n \theta, \qquad n = 0, 1, 2, \ldots, \tag{5.2}$$
where $\theta$ is the probability that a conception ends in a livebirth and $0 < \theta < 1$. This implies that the occurrences of fetal losses are independent within a woman.
(2) The $X_i$'s are independent and identically distributed random variables with density function $f_1(x)$ and the Laplace transform of $f_1(x)$, $L(f_1(x)) = \phi_1(s)$.
(3) The $Y_i$'s are independent and identically distributed random variables with density function $f_2(y)$ and the Laplace transform of $f_2(y)$, $L(f_2(y)) = \phi_2(s)$.
(4) The $X_i$'s and $Y_i$'s are independent.
Let $f(T)$ be the density function of the random variable $T$ given in (5.1). Then, for a fixed $n$, the Laplace transform of $f(T)$ is
$$L(f(T) \mid n) = \phi_1^{\,n+1}(s)\, \phi_2^{\,n}(s). \tag{5.3}$$
If $\phi(s)$ denotes the Laplace transform of $f(T)$, then
$$\phi(s) = \sum_{n=0}^{\infty} \phi_1^{\,n+1}(s)\, \phi_2^{\,n}(s)\, (1-\theta)^n \theta = \frac{\phi_1(s)\,\theta}{1 - (1-\theta)\,\phi_1(s)\,\phi_2(s)}. \tag{5.4}$$
George (1967) assumed the distributions of $X$ and $Y$ to be exponential and obtained the distribution of $T$ by treating $\phi(s)$ as the characteristic function of $T$ and inverting it by the inversion theorem. We will also make similar assumptions on the distributions of $X$ and $Y$ and invert (5.4) using the known inversions of certain Laplace transforms. Further, this will be helpful in characterizing the distribution of $T$ as a mixture of two exponential distributions.
Let
$$f_1(x) = c_1 e^{-c_1 x}, \qquad 0 < x < \infty, \tag{5.5}$$
where $c_1 > 0$, and let
$$f_2(y) = c_2 e^{-c_2 y}, \qquad 0 < y < \infty, \tag{5.6}$$
where $c_2 > 0$. Then
$$\phi_1(s) = \frac{c_1}{c_1 + s} \quad \text{and} \quad \phi_2(s) = \frac{c_2}{c_2 + s}.$$
Substituting for $\phi_1(s)$ and $\phi_2(s)$ in (5.4),
$$\phi(s) = \frac{c_1 (c_2 + s)\,\theta}{s^2 + (c_1 + c_2)\,s + c_1 c_2 \theta}. \tag{5.7}$$
Using partial fractions (5.7) can be written as
$$\phi(s) = \frac{A_1}{A + s} + \frac{B_1}{B + s}, \tag{5.8}$$
where $A$ and $B$ are the negatives of the roots of the equation $s^2 + (c_1 + c_2)\,s + c_1 c_2 \theta = 0$, and
$$A_1 = \frac{c_1 c_2 \theta - A c_1 \theta}{B - A}, \tag{5.9}$$
$$B_1 = \frac{B c_1 \theta - c_1 c_2 \theta}{B - A}. \tag{5.10}$$
The Laplace transform of $f(T)$ given in (5.7) is the sum of Laplace transforms of two exponentials. Hence by inversion of (5.7), $f(T)$ can be written as
$$f(T) = A_1 e^{-AT} + B_1 e^{-BT}. \tag{5.11}$$
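Let
$$a = \tfrac{1}{2}(c_1 + c_2) \quad \text{and} \quad b = \tfrac{1}{2}\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta}.$$
Then
$$A = a + b, \qquad B = a - b, \qquad A_1 = \frac{(b - a + c_1\theta)}{2b}\,A, \qquad B_1 = \frac{(a + b - c_1\theta)}{2b}\,B.$$
We will show that $f(T)$ can be considered as a mixture of two exponential distributions, i.e.,
$$f(T) = \alpha \lambda_1 e^{-\lambda_1 T} + (1-\alpha)\,\lambda_2 e^{-\lambda_2 T}, \tag{5.12}$$
where
$$\lambda_1 > 0, \qquad \lambda_2 > 0, \tag{5.13}$$
and $0 < \alpha < 1$.
In order to prove the above statement, we have to show that $A > 0$, $B > 0$, and $0 < A_1/A < 1$. For $c_1 > 0$, $c_2 > 0$, and $0 < \theta < 1$,
$$A = \tfrac{1}{2}(c_1 + c_2) + \tfrac{1}{2}\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta} > 0.$$
Since $(c_1 + c_2) > \sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta}$,
$$B = \tfrac{1}{2}(c_1 + c_2) - \tfrac{1}{2}\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta} > 0.$$
Substituting for $a$ and $b$ in the expression for $A_1/A$, we can write
$$\frac{A_1}{A} = \frac{1}{2}\left[1 + \frac{2 c_1 \theta - c_1 - c_2}{\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta}}\right]. \tag{5.14}$$
The absolute value of the ratio inside the brackets is less than 1 when $(2 c_1 \theta - c_1 - c_2)^2 < (c_1 + c_2)^2 - 4 c_1 c_2 \theta$, i.e., when $4 c_1^2 \theta (-1 + \theta) < 0$, and this is true for $0 < \theta < 1$. Hence from (5.14), $0 < A_1/A < 1$. Also $A_1/A + B_1/B = 1$.
Thus the distribution of $T$ can be expressed in the form given in (5.12), where
$$\lambda_1 = \tfrac{1}{2}(c_1 + c_2) + \tfrac{1}{2}\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta} = (a + b), \tag{5.15}$$
$$\lambda_2 = \tfrac{1}{2}(c_1 + c_2) - \tfrac{1}{2}\sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta} = (a - b), \tag{5.16}$$
and
$$\alpha = \frac{b - a + c_1 \theta}{2b}. \tag{5.17}$$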
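The relationships between $\alpha$, $\lambda_1$, $\lambda_2$ and the original parameters $c_1$, $c_2$ and $\theta$ are obtained below. From (5.15) and (5.16), we have
$$\lambda_1 \lambda_2 = c_1 c_2 \theta, \tag{5.18}$$
$$\lambda_1 - \lambda_2 = \sqrt{(c_1 + c_2)^2 - 4 c_1 c_2 \theta}, \tag{5.19}$$
$$\lambda_1 + \lambda_2 = c_1 + c_2. \tag{5.20}$$
The relation (5.17) gives
$$c_1 \theta = \alpha \lambda_1 + (1-\alpha)\,\lambda_2. \tag{5.21}$$
Using (5.18)-(5.21), we can express $c_1$, $c_2$ and $\theta$ in terms of $\alpha$, $\lambda_1$ and $\lambda_2$ as
$$c_1 = \frac{\alpha \lambda_1^2 + (1-\alpha)\,\lambda_2^2}{\alpha \lambda_1 + (1-\alpha)\,\lambda_2}, \tag{5.22}$$
$$c_2 = \frac{\lambda_1 \lambda_2}{\alpha \lambda_1 + (1-\alpha)\,\lambda_2}, \tag{5.23}$$
and
$$\theta = \frac{[\alpha \lambda_1 + (1-\alpha)\,\lambda_2]^2}{\alpha \lambda_1^2 + (1-\alpha)\,\lambda_2^2}. \tag{5.24}$$
The mean and the variance of $T$ are easily obtained from the definition of $T$ given in (5.1):
$$E(T) = \frac{1}{\theta}\,\frac{1}{c_1} + \frac{1-\theta}{\theta}\,\frac{1}{c_2} = \frac{\alpha}{\lambda_1} + \frac{1-\alpha}{\lambda_2}, \tag{5.25}$$
$$\mathrm{Var}(T) = \frac{\alpha}{\lambda_1^2} + \frac{1-\alpha}{\lambda_2^2} + \alpha(1-\alpha)\left(\frac{1}{\lambda_1} - \frac{1}{\lambda_2}\right)^2. \tag{5.26}$$
When $\theta = 1$, i.e., if all the conceptions result in livebirths, the distribution of $T$ is identical to the distribution of $X$, and by (5.5)
$$f(T) = c_1 e^{-c_1 T}, \qquad T > 0. \tag{5.27}$$
Thus, if the population under consideration is not subject to the risk of fetal loss before the first livebirth, the distribution of time to first livebirth can be represented by a single exponential.
It may be noted that the assumptions that the random variables $X$ and $Y$ are exponentially distributed imply that the risks involved are homogeneous among women and are constant with respect to time.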
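The change of parameters between the biological quantities (c1, c2, theta) and the mixture form (alpha, lambda1, lambda2) can be checked numerically. The sketch below is an illustration under the relations stated in this section, namely lambda1 and lambda2 as the two roots given in (5.15)-(5.16), alpha as in (5.17), and the inverse map obtained from the sum, product and c1*theta relations. The function names and the numerical values are hypothetical.

```python
import math

def to_mixture(c1, c2, theta):
    """Map (c1, c2, theta) to (alpha, lam1, lam2); requires 0 < theta < 1."""
    a = 0.5 * (c1 + c2)
    b = 0.5 * math.sqrt((c1 + c2) ** 2 - 4.0 * c1 * c2 * theta)
    lam1, lam2 = a + b, a - b
    alpha = (b - a + c1 * theta) / (2.0 * b)
    return alpha, lam1, lam2

def from_mixture(alpha, lam1, lam2):
    """Inverse map: recover (c1, c2, theta) from the mixture parameters."""
    d = alpha * lam1 + (1.0 - alpha) * lam2            # equals c1 * theta
    q = alpha * lam1 ** 2 + (1.0 - alpha) * lam2 ** 2
    return q / d, lam1 * lam2 / d, d * d / q

# Illustrative values only (not estimates from the dissertation).
c1, c2, theta = 0.25, 0.5, 0.9
alpha, lam1, lam2 = to_mixture(c1, c2, theta)
recovered = from_mixture(alpha, lam1, lam2)

# The mean of T computed from either parameterization should agree.
mean_original = 1.0 / (theta * c1) + (1.0 - theta) / (theta * c2)
mean_mixture = alpha / lam1 + (1.0 - alpha) / lam2
print(alpha, lam1, lam2, recovered, mean_original, mean_mixture)
```

The round trip is what makes the invariance argument of the next section usable: estimates of the mixture parameters translate directly into estimates of c1, c2 and theta.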
5.3 The Problem of Estimation from Complete Samples
The distribution of $T$ involves three unknown parameters $c_1$, $c_2$, and $\theta$. The problem is to get simultaneous estimates of these unknown parameters. George and Pillai (1969) assume the values of $c_2$ and $\theta$ and get an estimate of $c_1$ by equating the first sample moment $m_1$ to $E(T)$ in (5.25). The resulting estimate of $c_1$ is
$$\hat{c}_1 = \frac{c_2}{\theta c_2 m_1 - (1 - \theta)}. \tag{5.28}$$
Usually, the actual values of $c_2$ and $\theta$ are hard to determine directly. Hence we develop a method for obtaining the joint estimates.
First, the maximum likelihood estimates of $\lambda_1$, $\lambda_2$ and $\alpha$ in (5.13) are required. The invariance property of the maximum likelihood estimates (Hogg and Craig 1965) helps us to obtain the estimates of $c_1$, $c_2$ and $\theta$ using the estimates of $\lambda_1$, $\lambda_2$ and $\alpha$ and the relationships (5.22)-(5.24). In this way, we can make use of available procedures for getting the maximum likelihood estimates of the parameters of a mixture of exponentials.
The problem of estimation for a mixture of two exponentials has been considered by Hasselblad (1969) and Oppenheimer (1971). The required iterative equations can be obtained as a special case of Hasselblad's (1969) general formulation of exponential families. Oppenheimer (1971) also gives a successive substitution iterative procedure, but the density function is written in a slightly different form. We will present this procedure with the necessary modifications.
For a sample of size $n$ from a population with density $f(x)$ given in (5.12), the likelihood function is
$$L = \prod_{i=1}^{n} f(x_i), \tag{5.29}$$
and the corresponding log-likelihood is
$$\log L = \sum_{i=1}^{n} \log f(x_i). \tag{5.30}$$
The resulting maximum likelihood estimating equations are as follows:
$$\frac{\partial \log L}{\partial \alpha} = \sum_{i=1}^{n} \frac{\lambda_1 e^{-\lambda_1 x_i} - \lambda_2 e^{-\lambda_2 x_i}}{f(x_i)} = 0, \tag{5.31}$$
$$\frac{\partial \log L}{\partial \lambda_1} = \sum_{i=1}^{n} \frac{\alpha\, e^{-\lambda_1 x_i}(1 - \lambda_1 x_i)}{f(x_i)} = 0, \tag{5.32}$$
$$\frac{\partial \log L}{\partial \lambda_2} = \sum_{i=1}^{n} \frac{(1-\alpha)\, e^{-\lambda_2 x_i}(1 - \lambda_2 x_i)}{f(x_i)} = 0. \tag{5.33}$$
Multiplying (5.31) by $\alpha$ and adding $\sum_{i=1}^{n} \lambda_2 e^{-\lambda_2 x_i} / f(x_i)$ to both sides we obtain
$$\sum_{i=1}^{n} \frac{\lambda_2 e^{-\lambda_2 x_i}}{f(x_i)} = n. \tag{5.34}$$
Hence, (5.31) can be written as
$$\sum_{i=1}^{n} \frac{\lambda_1 e^{-\lambda_1 x_i}}{f(x_i)} = n. \tag{5.35}$$
Equations (5.32), (5.33) and (5.35) yield the following iterative equations:
$$\alpha^{(\nu+1)} = \frac{\alpha^{(\nu)}}{n} \sum_{i=1}^{n} \frac{\lambda_1^{(\nu)} e^{-\lambda_1^{(\nu)} x_i}}{f(x_i)}, \tag{5.36}$$
$$\lambda_j^{(\nu+1)} = \frac{\sum_{i=1}^{n} e^{-\lambda_j^{(\nu)} x_i} / f(x_i)}{\sum_{i=1}^{n} x_i\, e^{-\lambda_j^{(\nu)} x_i} / f(x_i)}, \qquad j = 1, 2, \tag{5.37}$$
where $\lambda_j^{(\nu)}$, $j = 1, 2$, and $\alpha^{(\nu)}$ denote the values of the parameters at the $\nu$th iteration, and $f(x_i)$ is evaluated at these values.
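A successive substitution scheme of this kind can be sketched in a few lines. The code below is an illustration, not the program used in the dissertation: it reads the iterative equations (5.36) and (5.37) as "update alpha by the average of lambda1 exp(-lambda1 x) / f(x) scaled by the current alpha, and update each lambda_j by the ratio of the sums of exp(-lambda_j x) / f(x) and x exp(-lambda_j x) / f(x)", all evaluated at the current parameter values. The data are synthetic and the starting values are arbitrary.

```python
import math
import random

def mixture_density(x, alpha, lam1, lam2):
    """Two-component exponential mixture density of the form (5.12)."""
    return alpha * lam1 * math.exp(-lam1 * x) + (1 - alpha) * lam2 * math.exp(-lam2 * x)

def log_likelihood(xs, alpha, lam1, lam2):
    return sum(math.log(mixture_density(x, alpha, lam1, lam2)) for x in xs)

def iterate_once(xs, alpha, lam1, lam2):
    """One successive-substitution pass using only the current parameter values."""
    s_alpha = s1 = s1x = s2 = s2x = 0.0
    for x in xs:
        e1, e2 = math.exp(-lam1 * x), math.exp(-lam2 * x)
        f = alpha * lam1 * e1 + (1 - alpha) * lam2 * e2
        s_alpha += lam1 * e1 / f
        s1 += e1 / f
        s1x += x * e1 / f
        s2 += e2 / f
        s2x += x * e2 / f
    return alpha * s_alpha / len(xs), s1 / s1x, s2 / s2x

# Synthetic sample: 30% from a rate-1.0 exponential, 70% from a rate-0.2 exponential.
rng = random.Random(0)
xs = [rng.expovariate(1.0 if rng.random() < 0.3 else 0.2) for _ in range(300)]

params = (0.5, 2.0, 0.1)                 # arbitrary starting values
start_ll = log_likelihood(xs, *params)
for _ in range(200):
    params = iterate_once(xs, *params)
final_ll = log_likelihood(xs, *params)
print(params, start_ll, final_ll)
```

Written this way each pass coincides with an EM step for the mixture, so the log-likelihood should not decrease from one iteration to the next; in practice the moment estimates would be the natural starting values.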
The asymptotic variance-covariance matrix of the estimates are
given by the inverse of the matrix
160
(5.38)
1, 2, 3,
where
and
and
From (5.12)
we have
It can be shown that
2
mIl .. a
f
IX)
(
1
·0. F'l - x
m22 • (1-a)
]2
g~
T
(5.39)
dx ,
2
(5.40)
1
m33 • a(l-a)
(5.41)
m12 - a(l-a)
(5.42)
(5.43)
and
(5.44)
Behboodian (1971) gives a methodology to evaluate these infinite integrals and provides a table with which the elements m
ij
mated for various combination of
a, A ,
l
and
A •
2
can be esti-
161
The successive iteration procedure described earlier requires
a set of initial estimates for
Al t, A
2
and
a.
For this purpose we
will use the moment estimators of these parameters as given by Rider
(1961).
Let
be the
sample moment about zero.
Equating
the first three sample moments to the corresponding population moments
we get the following equations
(5.45)
(5.46)
(5.47)
Solving these equations we obtain
a
and
Al
and
A2
=
(5.48)
are the roots of the quadratic equation:
The asymptotic variance-covariance matrix of the estimates
and
e
CIt
c
2
can be expressed as
,
W
~3x3
where
V
= A VA
---
is the variance-covariance matrix of the estimates of
(5.50)
162
,
A
The elements of the
aC
".
aC
l
aA l
aA
aC
aC
2
l
2
aC
l
aa
aC
aA l
2
aA
2
an
ae
aA l
ae
aA
ae
aa
2
e
2
A matrix are given below. rrom (5.23)
Taking the logarithms and differentiating with respect to
aC
l
aA
l
=
Al
a[2A -c ]
l l
A a+(1-a)A
2
l
(5.51)
Similarly
(l-a) (2A -c )
2 l
cle
(5.52)
(5.53)
163
From (5.20)
Hence
(5.54)
(5.55)
ee 2
eC l
- -oaCla
(5.56)
From (5.22)
taking logarithms and differentiating we get
Substituting for
and
simplifying
as = -2a -2A- ac 2 •
l l
cl
~
111\1
(5.57)
Similarly
(5.58)
164
(5.59)
The
~
matrix in (5.50) can be obtained using (5.51)-(5.59).
In an earlier section of this chapter, we have noted that a single exponential of the form $f(T) = c_1 e^{-c_1 T}$ is adequate to describe the data if the population is not subject to the risk of fetal loss or if the risk of fetal loss is negligible. We can actually perform statistical tests to see whether we have a single exponential or a mixture. Oppenheimer (1971) describes several procedures for testing this hypothesis. We will briefly describe the standard likelihood ratio test.
1. Assuming $f(x) = c_1 e^{-c_1 x}$, obtain the maximum likelihood estimate of $c_1$ and calculate the corresponding likelihood function using the estimate of $c_1$, i.e.,
$$\hat{c}_1 = \frac{1}{\bar{x}}, \qquad L(\hat{c}_1) = \hat{c}_1^{\,n} e^{-n}.$$
2. Assuming a mixture of exponentials, obtain estimates of $\lambda_1$, $\lambda_2$ and $\alpha$ using the procedures described earlier and calculate the corresponding likelihood function using the estimates in place of the true parameters, i.e.,
$$L(\hat{\lambda}_1, \hat{\lambda}_2, \hat{\alpha}) = \prod_{i=1}^{n} \left[\hat{\alpha}\hat{\lambda}_1 e^{-\hat{\lambda}_1 x_i} + (1-\hat{\alpha})\hat{\lambda}_2 e^{-\hat{\lambda}_2 x_i}\right].$$
3. Set $\ell = L(\hat{c}_1) / L(\hat{\lambda}_1, \hat{\lambda}_2, \hat{\alpha})$. Then asymptotically $-2 \log \ell$ is distributed as a chisquare distribution with 2 degrees of freedom. The degrees of freedom are obtained as the difference between the number of parameters estimated under the alternative and the null hypothesis.
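The arithmetic of the test is short enough to sketch. The helper names below are illustrative. The sketch uses two facts: the maximized log-likelihood of a single exponential is n log(1 / mean) - n, and the upper tail probability of a chisquare variable with 2 degrees of freedom at the value s is exp(-s / 2).

```python
import math

def exponential_max_loglik(xs):
    """Maximized log-likelihood of a single exponential: n*log(c1_hat) - n."""
    n = len(xs)
    c1_hat = n / sum(xs)          # MLE of c1 is the reciprocal of the sample mean
    return n * math.log(c1_hat) - n

def lrt_statistic(loglik_null, loglik_alt):
    """-2 log(l), where l is the ratio of null to alternative likelihoods."""
    return -2.0 * (loglik_null - loglik_alt)

def chisq2_p_value(stat):
    """Upper tail probability of chisquare with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Value of -2 log(l) reported for the Hutterite data in Section 5.6.
p_value = chisq2_p_value(19.4876)
print(p_value)
```

The closed form applies only to 2 degrees of freedom; other cases would need a chisquare table or a numerical routine.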
5.4 Estimation from Censored Samples
In this section, we shall consider the estimation of the risks of conception and fetal loss using singly censored data on the time to first livebirth. A marriage cohort of N women is followed for a fixed duration of t' months. The exact time to first livebirth is known only for women who gave birth during this period. For the others it is only known that the duration exceeds t' months. In this process, a woman who is pregnant at time t' and whose outcome has not yet been determined is considered to be a censored observation. We will look at the random variable T, which is 9 months less than the observed duration to the first livebirth, 9 being the constant duration of gestation of a livebirth. The observations on the random variable T can be considered as censored at t = t' - 9 months. If N women are followed, and n (< N) women give birth to a child during t' months, we consider that for these n women the exact time for a livebirth conception is known and for the remaining (N-n) women it is only known that their time for a livebirth conception exceeds t months. In the remaining part of this section we will consider the problem of estimation of the parameters from such sample observations when the underlying model is given by (5.12).
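5.4.1 Iterative Scheme
From (5.12) we have
$$f(T) = \alpha\, g_1(T) + (1-\alpha)\, g_2(T), \tag{5.60}$$
where $T > 0$ and $g_j(T) = \lambda_j e^{-\lambda_j T}$, $\lambda_j > 0$, $j = 1, 2$. Denote
$$F(t) = \int_0^t f(T)\, dT \quad \text{and} \quad G_j(t) = \int_0^t g_j(T)\, dT, \qquad j = 1, 2. \tag{5.61}$$
The probability of observing $n$ women with a livebirth conception during $t$ units of time is
$$\binom{N}{n} [F(t)]^n [1 - F(t)]^{N-n}. \tag{5.62}$$
Let $x_1, x_2, \ldots, x_n$ be the observed livebirth conception times for these $n$ women. Then the joint conditional density of $x_1, x_2, \ldots, x_n$ is given by
$$\prod_{i=1}^{n} \frac{f(x_i)}{F(t)}. \tag{5.63}$$
From (5.62) and (5.63) we obtain the following likelihood function
$$L = \binom{N}{n} [1 - F(t)]^{N-n} \prod_{i=1}^{n} f(x_i). \tag{5.64}$$
Also
$$1 - F(t) = \int_t^{\infty} f(T)\, dT = \alpha e^{-\lambda_1 t} + (1-\alpha) e^{-\lambda_2 t}. \tag{5.65}$$
The log-likelihood is
$$\log L = \text{const} + \sum_{i=1}^{n} \log f(x_i) + (N-n) \log(1 - F(t)). \tag{5.66}$$
The maximum likelihood estimating equations are obtained by taking partials of $\log L$ in (5.66). They are
$$\frac{\partial \log L}{\partial \lambda_1} = \sum_{i=1}^{n} \frac{\alpha\, e^{-\lambda_1 x_i}(1 - \lambda_1 x_i)}{f(x_i)} - \frac{(N-n)}{1 - F(t)}\, \alpha\, t\, e^{-\lambda_1 t} = 0. \tag{5.67}$$
Similarly
$$\frac{\partial \log L}{\partial \lambda_2} = \sum_{i=1}^{n} \frac{(1-\alpha)\, e^{-\lambda_2 x_i}(1 - \lambda_2 x_i)}{f(x_i)} - \frac{(N-n)}{1 - F(t)}\, (1-\alpha)\, t\, e^{-\lambda_2 t} = 0, \tag{5.68}$$
$$\frac{\partial \log L}{\partial \alpha} = \sum_{i=1}^{n} \frac{\lambda_1 e^{-\lambda_1 x_i} - \lambda_2 e^{-\lambda_2 x_i}}{f(x_i)} + \frac{(N-n)}{1 - F(t)} \left[e^{-\lambda_1 t} - e^{-\lambda_2 t}\right] = 0. \tag{5.69}$$
Multiplying (5.69) by $\alpha$ and adding $\sum_{i=1}^{n} \lambda_2 e^{-\lambda_2 x_i} / f(x_i) + (N-n) e^{-t\lambda_2} / [1 - F(t)]$ on both sides we get
$$\sum_{i=1}^{n} \frac{\lambda_2 e^{-\lambda_2 x_i}}{f(x_i)} + \frac{(N-n)\, e^{-t\lambda_2}}{1 - F(t)} = N. \tag{5.70}$$
From (5.69) and (5.70)
$$\sum_{i=1}^{n} \frac{\lambda_1 e^{-\lambda_1 x_i}}{f(x_i)} + \frac{(N-n)\, e^{-\lambda_1 t}}{1 - F(t)} = N. \tag{5.71}$$
Equations (5.67), (5.68), and (5.71) enable us to write the iterative equations as
$$\alpha^{(\nu+1)} = \frac{\alpha^{(\nu)}}{N} \left[\sum_{i=1}^{n} \frac{\lambda_1^{(\nu)} e^{-\lambda_1^{(\nu)} x_i}}{f(x_i)} + \frac{(N-n)\, e^{-\lambda_1^{(\nu)} t}}{1 - F(t)}\right], \tag{5.72}$$
$$\lambda_j^{(\nu+1)} = \frac{\displaystyle\sum_{i=1}^{n} \frac{e^{-\lambda_j^{(\nu)} x_i}}{f(x_i)} - \frac{(N-n)\, t\, e^{-\lambda_j^{(\nu)} t}}{1 - F(t)}}{\displaystyle\sum_{i=1}^{n} \frac{x_i\, e^{-\lambda_j^{(\nu)} x_i}}{f(x_i)}}, \qquad j = 1, 2. \tag{5.73}$$
We notice that (5.72) and (5.73) tend to be the same as (5.36) and (5.37) as $t \to \infty$.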
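When running an iterative scheme on censored data it helps to monitor the log-likelihood itself. The sketch below evaluates the censored log-likelihood (5.66) without its additive constant, assuming the mixture density (5.12) and the survival function 1 - F(t) = alpha exp(-lambda1 t) + (1 - alpha) exp(-lambda2 t) of (5.65). The function names and the small data set are hypothetical.

```python
import math

def mixture_density(x, alpha, lam1, lam2):
    return alpha * lam1 * math.exp(-lam1 * x) + (1 - alpha) * lam2 * math.exp(-lam2 * x)

def survival(t, alpha, lam1, lam2):
    """1 - F(t): probability that the livebirth conception time exceeds t."""
    return alpha * math.exp(-lam1 * t) + (1 - alpha) * math.exp(-lam2 * t)

def censored_loglik(xs, n_censored, t, alpha, lam1, lam2):
    """Log-likelihood for singly censored data, omitting the additive constant.

    xs         : exact conception times of the n women observed before t
    n_censored : N - n, the number of women only known to exceed t
    """
    exact_part = sum(math.log(mixture_density(x, alpha, lam1, lam2)) for x in xs)
    censored_part = n_censored * math.log(survival(t, alpha, lam1, lam2))
    return exact_part + censored_part

# Hypothetical data: five exact times (months) and three women censored at t = 15.
xs = [1.5, 2.0, 4.2, 7.7, 11.0]
value = censored_loglik(xs, 3, 15.0, 0.3, 1.0, 0.2)
print(value)
```

With no censored women the second term vanishes and the function reduces to the complete-sample log-likelihood, mirroring the remark that the censored equations tend to the complete-sample ones as t grows.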
5.4.2 Asymptotic Variance-Covariance Matrix
The asymptotic variance-covariance matrix of the estimates is
given by the inverse of the matrix
m
ij
V
3
2s
-- -ElaaV10av LJ '
i
~
8
(m
ij
),
where
i, j • 1, 2, 3 ,
j
= a.
Differentiating the first partials of
L with respect to
A
1
given in (5.67) we have
a2 log
n I2
L • _
n
L
:f.cl
(N-n)
I-F(t)
a2F
(N-n)
2
aA - [1-F(t)]2
(5.74)
1
We need the expectation of the expression given in (5.74). But,
and
E(n)
E(N-n)
= NF(t)
(5.75)
,
= N(l-F(t»
Taking expectations in (5.74) by fixing
•
n, we have
(5.76)
170
E a2 log L /
[
aA2
1
It (
ft
1 .of oJ 2 dx + .
a2f dx
(N-n) a2 f
]
n=-n·of·a~l· F(t)
n on2F(t) -l-F(t) d,,2
· 1 1
(N-n)
1-1(t)
A"
.
(2LJ2
1
Again taking expectations we get
-E [
a210g2
Lj
ft 1[Of l2
f~
0
1
=N
all
1
dx + 1-F(t)
(~]2
aA
1
•
(5.77)
In general we can write
i, j
= 1,
2, 3
where
(5.78)
From the expression of
af
~
1
given in (5.12),
f
=Q
1 (1r- -
ag
= Q.
1· 1
~
J
x' ,·g1
(l-a):~: = (l-a)g - +2 .
171
and
af
-=
aa
Substituting for the derivatives of
of the Information matrix
~
f(x) in (5.78), the elements
can be expressed as follows:
(5.79)
N
+ 1-F(t)
.(aF(t») 2
aa"
(5.81)
172
m
12
[oA2]
l dA Z
-_ _ E d log L
It
= Na(l-a) . ..
0
]
. [- 1 - x [-1. - x
Al
. A2
1
1 2
-8 8 dx
f
(5.82)
N
+ I-F(t)
aF(t) aF(t)
aA
N
+ I-F(t)
(5.83)
aa
I
dF(t) dF(t)
dA
aa
2
(5.84)
From (5.60) and (5.61)
F(t) • aG (t)
I
+ (1-a)G 2 (t)
= aG-e-AIJ +
(I-a)
G-e
-A 2J •
and
.-A 1 t
= ate
aF(t) = (l-a)te
dA Z
(5.85)
-A
t
2
(5.86)
173
[,-1..1t
of(t)
= - Ie
oa
We notice that as
t
+ ~
.-A 2J
- e
(5.87)
the variance-covariance matrix reduces to
that of complete samples.
The integrals to be evaluated in (5.79)-
(5.84) are the same as in (5.39)-(5.44) except for the upper limit of
integration.
and
The values of
values of
AI'
1.. 2
and
a
a
are obtained using the estimated
using the relationships(5.22)-(5.24).
asymptotic variance-covariance matrix of the estimates of
a
The
c ' c
l
2
and
are obtained using (5.50).
5.5 Estimation from Grouped Data
As we have pointed out earlier in Chapter III, situations arise
where individual values of the time to first livebirth is not recorded,
but rather the data are grouped into intervals.
Suppose that the range of variation of the time to first livebirth
conception is partitioned into $h$ intervals with the end points of
the intervals being $T_j$, $j = 0, 1, 2, \ldots, h$; $T_0 = 0$,
$T_h = \infty$. Let $n_i$ denote the number of women falling in the
$i$th interval, i.e., in the interval $[T_{i-1}, T_i]$, and
$\sum_{i=1}^{h} n_i = N$. The observed number of women classified in
this manner follows a multinomial distribution given by
$$P(n_1, n_2, \ldots, n_h) = \frac{N!}{\prod_{i=1}^{h} n_i!} \prod_{i=1}^{h} \pi_i^{n_i} \qquad (5.88)$$
where $\pi_i$ is the probability of an individual falling in the
$i$th interval, and
$$\pi_i = \int_{T_{i-1}}^{T_i} f(x)\, dx , \qquad i = 1, 2, \ldots, (h-1) . \qquad (5.89)$$
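Because $F(t)$ is available in closed form for the two-exponential mixture, each cell probability in (5.89) is a difference of two values of $F$. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, the interval end points are those of Table 5.4, and the parameter values are the complete-sample estimates of Table 5.1, used here only to show the arithmetic.

```python
import math


def mixture_cdf(t, lam1, lam2, a):
    """F(t) for the two-exponential mixture; math.exp(-inf) is 0, so F(inf) = 1."""
    return a * (1.0 - math.exp(-lam1 * t)) + (1.0 - a) * (1.0 - math.exp(-lam2 * t))


def cell_probabilities(end_points, lam1, lam2, a):
    """Probability of each interval [T_{i-1}, T_i] as F(T_i) - F(T_{i-1})."""
    cdf_values = [mixture_cdf(t, lam1, lam2, a) for t in end_points]
    return [upper - lower for lower, upper in zip(cdf_values, cdf_values[1:])]


# End points T_0 = 0 < T_1 < ... < T_h = infinity (months), as in Table 5.4.
END_POINTS = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0, math.inf]

if __name__ == "__main__":
    probabilities = cell_probabilities(END_POINTS, 0.2580, 0.0949, 0.8136)
    expected = [554 * p for p in probabilities]  # N = 554 women
    print([round(e, 1) for e in expected])
```

The expected frequencies for the first two intervals come out close to the 268.5 and 131.3 shown for complete samples in Table 5.4.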
The maximum likelihood estimating equations are given in (1.19) as
where
and
and
V- 1 and
~n
H are defined in (1.5) and (1.6), respectively.
~
The
~
asymptotic variance-covariance matrix of
e
is given by (1.10) as
(5.90)
(5.91)
$i = 1, 2, \ldots, (h-1)$
(5.92)
The estimates of $c_1$, $c_2$ and $\theta$ are obtained using the
estimates of $\lambda_1$, $\lambda_2$, $a$ and the relationships
(5.22)-(5.24). The asymptotic variance-covariance matrix of the
estimates of $c_1$, $c_2$ and $\theta$ is given in (5.50).
5.6 Examples
This section illustrates the estimation procedures outlined earlier
in this chapter.
The data are obtained from 554 Hutterite women, who
reported their time from marriage to first livebirth. The gestation
period for a livebirth is taken as 9 months. Eleven women who reported
their birth before 9 months from the marriage are not included in the
sample. All the women were reported to have married before the age
of 25 years.
The maximum likelihood estimates from the complete samples are
obtained using the method of iteration outlined in Section 5.3. The
integrals over an infinite range involved in the calculation of the
variance-covariance matrix are evaluated using an IBM/360 scientific
subroutine program. The program DQL32 uses a 32-point Gaussian-Laguerre
quadrature formula, which is exact whenever the integrand multiplied by
$e^{x}$ is a polynomial of degree up to 63; it is applied to the
integrands in (5.39) to (5.43). The results are presented in Table 5.1.
The results show that the mean time for a conception is 4.07 months and
that nearly 7.3% of the conceptions end in fetal loss.
The likelihood ratio criterion derived in Section 5.3 is used to
test whether a single exponential will fit the data. The calculated
value of $-2 \log l$ is 19.4876, which, referred to a chisquare
distribution with 2 degrees of freedom, is significant
(P value = 0.00006). Hence, we conclude that a single exponential is
inadequate for the data. The goodness of fit using the estimated
parameters from the ungrouped data is given in Table 5.4. For the
purpose of computing chisquare, the data are grouped as shown in the
table. The chisquare value of 6.8916 on 5 degrees of freedom (p > .05)
shows that the model fits the data.
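The P value quoted for the likelihood ratio test can be checked by hand, because the upper tail probability of a chisquare variate with 2 degrees of freedom is exactly $e^{-x/2}$. The following Python sketch does this; the function name is an illustrative assumption.

```python
import math


def chisquare_upper_tail_2df(x):
    """P(X > x) for a chisquare variate with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


if __name__ == "__main__":
    p_value = chisquare_upper_tail_2df(19.4876)  # -2 log l from the text
    print(f"{p_value:.5f}")
```

The result agrees with the reported P value of 0.00006 to the precision quoted.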
For the purpose of illustration, the observations are treated as
censored at month 24. The estimates are obtained using the iterative
procedure described in Section 5.4. The finite integrals involved in
the estimation of the variance-covariance matrix are computed using an
IBM/360 scientific subroutine. The subroutine DQG32 evaluates the
integrals using a 32 point Gauss quadrature formula, which integrates
exactly polynomials up to degree 63. The results are given in Table
5.2. The goodness of fit using the estimates from the censored sample
is shown in Table 5.4. The model adequately fits the data.
Estimates from grouped data are obtained treating the data as
grouped in Table 5.4. The estimates and the corresponding
variance-covariance matrix are calculated according to the methods
given in Section 5.5. The results are given in Table 5.3, and the
goodness of fit using these estimates is shown in Table 5.4. The model
adequately fits the data.
TABLE 5.1
Estimates and Their Standard Errors From Complete Samples
Sample Size = 554

Parameter        Estimate    Std. Error
$\lambda_1$      0.2580      0.0357
$\lambda_2$      0.0949      0.0275
$a$              0.8136      0.1323

Correlation matrix of $(\hat{\lambda}_1, \hat{\lambda}_2, \hat{a})$
0.6845
1

Parameter        Estimate    Std. Error
$c_1$            0.2454      0.0263
$c_2$            0.1075      0.0391
$\theta$         0.9274      0.1476

Correlation matrix of $(\hat{c}_1, \hat{c}_2, \hat{\theta})$
1      0.5589      0.9482
       1           0.3~71
                   1

Mean waiting time for conception $1/\hat{c}_1$ = 4.07 months.
Average time lost due to a fetal loss $1/\hat{c}_2$ = 9.30 months.
TABLE 5.2
Estimates and Their Standard Errors From Censored Samples
(Censoring time = 24 months)
Sample Size = 554

Parameter        Estimate    Std. Error
$\lambda_1$      0.2786      0.0513
$\lambda_2$      0.1190      0.0377
$a$              0.6984      0.2417

Correlation matrix of $(\hat{\lambda}_1, \hat{\lambda}_2, \hat{a})$
1      0.8123     -0.9137
       1          -0.9147
                   1

Parameter        Estimate    Std. Error
$c_1$            0.2538      0.0513
$c_2$            0.1438      0.0377
$\theta$         0.9080      0.2417

Correlation matrix of $(\hat{c}_1, \hat{c}_2, \hat{\theta})$
1      0.6770      0.8823
       1           0.3369
                   1

Mean waiting time for conception $1/\hat{c}_1$ = 3.94 months.
Average time lost due to a fetal loss $1/\hat{c}_2$ = 6.95 months.
TABLE 5.3
Estimates and Their Standard Errors From Grouped Samples
Sample Size = 554

Parameter        Estimate    Std. Error
$\lambda_1$      0.3480      0.0436
$\lambda_2$      0.1432      0.0165
$a$              0.4904      0.5004

Correlation matrix of $(\hat{\lambda}_1, \hat{\lambda}_2, \hat{a})$
1     -0.4612      0.2704
       1          -0.7630
                   1

Parameter        Estimate    Std. Error
$c_1$            0.2826      0.1001
$c_2$            0.2036      0.0696
$\theta$         0.8535      0.7916

Correlation matrix of $(\hat{c}_1, \hat{c}_2, \hat{\theta})$
1     -0.9585      0.9763
       1          -0.9900
                   1

Mean waiting time for conception $1/\hat{c}_1$ = 3.54 months.
Average time lost due to a fetal loss $1/\hat{c}_2$ = 4.91 months.
TABLE 5.4
Observed and Expected Frequencies of Livebirths, by Month, Hutterite Data

                              Expected Frequencies
             Observed     Complete    Censored     Grouped
Month *      Frequency    Samples     at t = 24    Samples
0-3          269          268.5       269.3        273.2
3-6          140          131.3       130.1        126.6
6-9           50           66.1        65.8         64.1
9-12          41           34.7        35.0         35.1
12-15         19           19.2        19.8         20.5
15-18         12           11.2        11.8         12.5
18-21          9            7.0         7.4          7.9
21-24          5            4.5         4.8          5.0
24+            9           11.5        10.0          9.1
Total        554          554.0       554.0        554.0
CHISQ.                    6.8916      6.0749       5.8598
(5 d.f.)                  p>.05       p>.05        p>.05

*Time to first livebirth conception
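The chisquare values in Table 5.4 can be recomputed from the tabulated frequencies with the usual Pearson statistic $\sum (O - E)^2 / E$. The Python sketch below is a minimal illustration; the variable and function names are assumptions, and the 5 degrees of freedom correspond to 9 cells less 1 less the 3 estimated parameters. Because the expected frequencies in the table are rounded to one decimal, the recomputed values differ slightly from the printed ones.

```python
OBSERVED = [269, 140, 50, 41, 19, 12, 9, 5, 9]

EXPECTED = {
    "complete": [268.5, 131.3, 66.1, 34.7, 19.2, 11.2, 7.0, 4.5, 11.5],
    "censored": [269.3, 130.1, 65.8, 35.0, 19.8, 11.8, 7.4, 4.8, 10.0],
    "grouped": [273.2, 126.6, 64.1, 35.1, 20.5, 12.5, 7.9, 5.0, 9.1],
}

# Upper 5 percent point of the chisquare distribution with 5 degrees of freedom.
CRITICAL_VALUE_5DF = 11.070


def pearson_chisquare(observed, expected):
    """Pearson goodness-of-fit statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    for label, expected in EXPECTED.items():
        statistic = pearson_chisquare(OBSERVED, expected)
        verdict = "adequate fit" if statistic < CRITICAL_VALUE_5DF else "poor fit"
        print(f"{label:9s} chisquare = {statistic:.4f}  ({verdict} at the 5% level)")
```

All three statistics fall well below the critical value, in line with the p > .05 entries of the table.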
CHAPTER VI
SUMMARY AND SUGGESTIONS FOR FUTURE WORK
6.1 Summary
This dissertation examines in detail the estimation problems in
certain models for conception times and for time to first livebirth.
Chapters II - IV deal with the analysis of conception times under
various assumptions.
The models considered vary according to their
assumptions on fecundability, the monthly chance of conception.
Fecundability can be either homogeneous or heterogeneous among women,
and can be constant or vary with time.
We have not considered the
case where fecundability varies with time.
Potter and Parker (1964)
propose a single beta distribution with two parameters $a$ and $b$ as
the distribution of fecundability and hence the proposed distribution
of conception time is a geometric compounded with a single beta
distribution. For this model we have examined the following:
1. estimates from complete samples, censored samples, grouped
samples and truncated samples,
2. the asymptotic variance-covariance matrix of the estimates
obtained,
3. the asymptotic relative efficiencies.
Examples are provided using the data from the Princeton Fertility
Survey and from the Hutterite population.
A sampling experiment
conducted with limited parameter combinations provides information about
biases of the estimates under censoring and truncation. Because of the
large scale nature of the study, we have not attempted here a complete
investigation with sampling experiments. A slight modification of
Potter and Parker's model is also attempted by assuming the
distribution of fecundability as a mixture of two beta distributions.
Chapter II deals with estimation in complete samples.
Since the
model using a single beta distribution of fecundability does not fit
the Princeton Fertility data, a model using a mixture of two beta
distributions is tried on this data.
Although this model also fails
to show an adequate fit to the data using the chisquare goodness of
fit test with the grouping of Potter and Parker, it does show a good
fit when grouping is slightly altered.
The changed grouping is neces-
sitated by the heavy heaping of data seen on certain months.
Chapter III presents the estimates and their properties from
censored samples. Estimates from singly censored data and multiply
censored data are given. Although the conventional moment estimates
cannot be obtained from singly censored data, estimates are obtained
using moment type functions and their asymptotic variance-covariance
matrix is also derived. Algorithms are given to calculate
the maximum likelihood estimates and minimum chisquare estimates.
The asymptotic relative efficiencies of the moment type estimators
show that these estimates perform well for small censoring times and
for large values of the parameters $a$ and $b$. But these estimates
exist only when $a > 2$. Our sampling experiments show that there is
a positive bias in all estimates and the bias decreases as the
censoring time $t$ increases. The loss of asymptotic efficiency due to
grouping is given in this chapter. This chapter also describes the
maximum likelihood estimates of the parameters of the model under
progressive censoring. The estimates are obtained under the
assumptions of homogeneous as well as heterogeneous fecundability, and
also under the assumptions that the monthly withdrawals are fixed or
random.
Chapter IV describes the estimation problems from truncated
samples. The distribution of conception time given by Potter and Parker
is truncated, considering only the population of women who have
conceived during a fixed time period. The moment, maximum likelihood
and minimum chisquare estimates are obtained from the truncated samples
and their asymptotic variance-covariance matrices are presented.
Examples using the Princeton Fertility Survey data and Hutterite data
are given treating the data as truncated at various points. The moment
estimates exist only when $a > 3$ and the variance-covariance matrix
can be computed only when $a > 6$. The asymptotic relative efficiency
of the moment estimators increases as $a$ increases, and decreases as
$b$ increases for small values of truncation time $t$. But as $t$
increases a reverse trend with $b$ is noticed. Our sampling experiments
show that the minimum chisquare estimates have a considerable problem
of convergence. This problem of convergence has not been studied here
in detail. The sampling experiments also show large positive biases
with all the estimates and the bias is large for small values of
truncation time.
Chapter V discusses the method of obtaining simultaneous estimates
of all the parameters in a model for time to first livebirth proposed
by George (1967).
The model is derived and algorithms are presented to
obtain the estimates and their asymptotic variance-covariance matrix
from complete samples, censored samples and grouped samples.
Examples
using a set of data from Hutterite women married before the age of 25
are given.
6.2 Suggestions for Future Research
On the basis of this study we make the following recommendations
for further research:
1. A look at the sampling distribution of the estimates for various
combinations of sample sizes, parameter values and different
censoring or truncation times other than what has been done here
using computer simulation.
2. Evaluation of the models under various types of misreporting of
data. This can be done either analytically or by computer simulation
by introducing certain probability models for misreporting.
3. Testing the adequacy of the model under relaxed assumptions. For
example, it may be worthwhile to see how the model with heterogeneous
fecundability fits, when fecundability decreases with time
or when fecundability varies among women according to a distribution
which is different from what we have considered.
4. The problem of convergence and large asymptotic variances in the
model of mixture of beta distributions of fecundability.
5. The model for time to first livebirth may be extended to study
interlivebirth intervals by adding the component of post partum
amenorrhea. This model may be further generalized by adding
heterogeneity of fecundability into the model.
LIST OF REFERENCES
Barret, J. C. and Marshall, J., "The risk of conception on different
days of the menstrual cycle," Population Studies, 23 (1969),
455-461.
Behboodian, J., "Information matrix for a mixture of two exponential
distribution," Institute of Statistics Mimeo Series No. '148,
University of North Carolina at Chapel Hill, 1971.
Berquó, E. S., Marques, R. M., Milanesi, M. L., Martins, J., Pinho, E.,
and Simon, I., "Levels and variations in fertility in Sao Paulo,"
Milbank Memorial Fund Quarterly, 46 (1968), 167-185.
Brass, W., "The distribution of births in human populations," Population Studies, 12 (1958), 51-72.
Brayer, F. T., Chiaze, L., and Duffy, F. J., "Calendar rhythm and
menstrual cycle range," Fertility and Sterility, 20 (1969), 279-288.
Cramer, H., Mathematical Methods of Statistics. Princeton University
Press, (1946).
Cox, D. R., Renewal Theory, Methuen, London, (1962).
Dandekar, V. M., "Certain modified forms of binomial and Poisson distributions," Sankhya A, 15 (1955), 237-250.
French, F. E. and Bierman, J. M., "Probability of fetal mortality,"
Public Health Reports, 77 (1962), 835-847.
Ferguson, T., "A method of generating best asymptotically normal estimates with application to the estimation of bacterial densities,"
Annals of Mathematical Statistics, 29 (1958), 1046-1062.
George, A., "A probability model for interlivebirth interval," Paper
presented at the 36th Session of International Statistical Institute, Sydney Australia, (1967).
George, A. and Pillai, R. K., "Comparative studies of two probability
models for interlivebirth intervals," contributed paper, London
Conference, International Union for Scientific Study of Population,
(August, 1969).
Gibson, J. R. and Mackewn, T., "Observations on all births in Birmingham 1947," British Journal of Social Medicine. 4 (1950), 221.
Gini, C., "Premieres recherches sur la fecundabilite de la femme,"
Proceedings of the International Mathematics Conference, Toronto,
(1924), 889-892.
Glass, D. V. and Grebenik, E., "The trend and pattern of fertility in
Great Britain," Papers of the Royal Commission on Population,
(1954), 255.
Glasser, J. and Lachenbruch, P. A., "Observations on the relation
between frequency and timing of intercourse and probability of
conception," Population Studies, 22 (1968), 399-407.
Graybill, F. A., Introduction to Matrices with Applications in Statistics, Wadsworth Publishing Company, California, (1969).
Gunn, D. L., Jenken, P. M., and Gunn, A. L., "Menstrual periodicity and
statistical observations on a large sample of normal cases,"
Journal of Obstetrics and Gynaecology, British Empire, 44 (1937),
839.
Haman, J. O., "The length of the menstrual cycle," American Journal of
Obstetrics and Gynecology, 43 (1942), 870.
Hasselblad, V., "Estimation of finite mixtures of distributions from
the exponential family," Journal of American Statistical Association, 64 (1969), 1459-1471.
Henry, L., "Fondements theoriques des mesures de la fecondite naturelle,"
Revue de l'Institut International de Statistique, 21 (1953), 135-151.
Henry, L., "Fecondite et famille - models mathematiques," Population,
12 (1957), 413-444.
Henry, L., "Fecondite et famille - models mathematiques," Population,
16 (1961a), 27-48.
Henry, L., "Fecondite et famille - models mathematiques: Applications
numeriques," Population, 16 (1961b), 261-282.
Henry, L., "Mesure de temps mort en fecondite naturelle," Population,
19 (1964a), 485-514.
Henry, L., "Mortalite intra-uterine et fecondabilite," Population,
19 (1964b), 899-940.
Hogg, R. V. and Craig, A. T., Introduction to Mathematical Statistics,
The Macmillan Company, New York, (1965).
Horvitz, D. G., Lachenbruch, P. A., Giesbrecht, F. G., and Shah, B. V.,
POPSIM, "A demographic microsimulation model," Invited paper,
London Conference, International Union for the Scientific Study
of Population, (August, 1969).
Jain, A. K., "Fecundability and its relation to age in a sample of
Taiwanese women," Population Studies, 23 (1969), 69-85.
Jain, A. K., Hsu, T. C., Freedman, R., and Chang, M. C., "Demographic
aspects of lactation and post partum amenorrhea," Demography,
7 (1970), 255-267.
Jain, S. P., "Post partum amenorrhea in Indian women," Contributed
papers, International Union for Scientific Study of Population,
Sydney, Australia, (1967).
James, W. H., "Estimates of fecundability," Population Studies, 17
(1963), 57-65.
Joshi, D. D., "Stochastic models utilized in Demography," World Population Conference, 1965, Vol. III., New York: United Nations,
(1965).
Katti, S. K. and Gurland, J., "The Poisson Pascal distribution,"
Biometrics, 17 (1961), 527.
Lachenbruch, P. A., "Frequency and timing of intercourse and the probability of conception," Population Studies, 21 (1967), 23-31.
Lecam, L., "On the asymptotic theory of estimation and testing hypothesis," Proceedings of Third Berkeley Symposium, 1: 129, (1956).
Loeve, M., Probability Theory, Third Edition, Van Nostrand Reinhold
Company, New York, (1963).
Majumdar, H. and Sheps, M. C., "Estimators of a type I geometric distribution from observations on conception times," Demography, 7 (1970),
349-360.
Neyman, J., "Contributions to theory of the chisquare test," Proceedings
of Berkeley Symposium, University of California Press, (1949a).
Neyman, J., "On the problem of estimating the number of schools of
fish," Publications in Statistics, 1, Berkeley, University of
California Press, (1949b), 21-36.
Oppenheimer, L., "Estimation of a mixture of exponentials for complete
and censored samples," Institute of Statistics Mimeo Series No.
??1, University of North Carolina at Chapel Hill, 1971.
Pathak, K. B., "A probability distribution for the number of conceptions," Sankhya B 28, (1966), 213-218.
Pearl, R., "Factors of human fertility and their statistical evaluation," Lancet 225, (1933), 607-611.
Pearl, R., Natural History of Population, Oxford University Press,
New York, (1939).
Perrin, E. B. and Sheps, M. C., "Human reproduction: A stochastic
process," Biometrics 20 (1964), 28-45.
Potter, R. G., "Length of the observation period as a factor affecting the contraceptive failure rate," Milbank Memorial Fund
Quarterly, 38 (1960), 140-152.
Potter, R. G., "Length of fertile period," Milbank Memorial Fund
Quarterly, 39 (1961), 142-143.
Potter, R. G., "Renewal theory and births averted," Invited paper,
London Conference, International Union for the Scientific Study
of Population, (August, 1969).
Potter, R. G., "Births averted by contraception: An approach through
renewal theory," Theoretical Population Biology, 1 (1970),
251-272.
Potter, R. G., Jain, A. K., and McCann, B., "Net delay of next conception - A highly simplified case," Population Studies, 24 (1970),
173-192.
Potter, R. G., McCann, B., and Sakoda, J. M., "Selective fecundability
and contraceptive effectiveness," Milbank Memorial Fund Quarterly,
48 (1970), 91-102.
Potter, R. G. and Parker, M., "Predicting time required to conceive,"
Population Studies, 18 (1964), 99-116.
Potter, R. G. and Sakoda, J. M., "A computer model of family building
based on expected values," Demography, 3 (1966), 450-461.
Potter, R. G. and Sakoda, J. M., "Family planning and fecundity,"
Population Studies, 20 (1967), 311-328.
Potter, R. G., New, M. L., Wyon, J. B., and Gordon, J. E., "Application
of field studies to research on the physiology of human reproduction - lactation and its effect upon birth intervals in eleven
Punjab Villages," In Sheps, M. C. and Ridley, J. C. (eds.), Public
Health and Population Change, Pittsburg: University of Pittsburg
Press, (1965).
Pyke, R., "Markov renewal processes: Definitions and preliminary
properties," Annals of Mathematical Statistics, 32 (1961a),
1231-1242.
Pyke, R., "Markov renewal processes with finitely many states,"
Annals of Mathematical Statistics, 32 (1961b), 1243-1259.
Puri, M. L. and Sen, P. K., Nonparametric Methods in Multivariate
Analysis, Wiley Publications, (1971).
Rao, C. R., Advanced Statistical Methods in Biometric Research, Wiley
Publications, New York, (1952).
Rider, P. R., "The method of moments applied to a mixture of two
exponential distributions," Annals of Mathematical Statistics,
32 (1961), 143-148.
Ridley, J. C., and Sheps, M. C., "An analytical simulation model of
human reproduction with demographic and biological components,"
Population Studies, 19 (1966), 297-310.
Saxena, P. C., "Lactation and post partum amenorrhea," All India
Seminar in Demographic Analysis, February 1969.
Sheps, M. C., "On the time required for conception," Population Studies,
18 (1964a), 85-97.
Sheps, M. C., "Pregnancy wastage as a factor in the analysis of fertility data," Demography, 1 (1964b) 111-118.
Sheps, M. C., "Application of probability models to the study of human
reproduction," In Sheps, M. C. and Ridley, J. C. (eds.), Public
Health and Population Change, University of Pittsburg Press,
Pittsburg. (1965).
Sheps, M. C., "Characteristics of a ratio used to estimate failure
rates: Occurrences per person year of exposure," Biometrics,
22 (1966), 310-321.
Sheps, M. C., "Uses of Stochastic models in the evaluation of population
policies 1. Theory and approaches to data analysis," Proceedings
of Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. IV. (1967).
Sheps, M. C., "A review of models for population change," Review of
International Statistical Institute, 39 (1971), 185-196.
Sheps, M. C. and Menken, J. A., "On closed and open birth interval in
a stable population," Invited paper, Segunda Conferencia Regional
de Poblacion, I.U.S.S.P., Mexico City, (August, 1970).
Sheps, M. C. and Menken, J. A., "A model for studying birth rates given
time dependent change in reproductive parameters," Biometrics,
27 (1971), 325-343.
Sheps, M. C. and Menken, J. A., "Distribution of birth intervals
according to the sampling frame," Theoretical Population Biology,
3 (1972a), 1-26.
Sheps, M. C. and Menken, J. A., "On estimating the risk of conception
from censored data," Paper presented at the Meeting on Population
Dynamics, University of Wisconsin, June, 1972.
Sheps, M. C., Menken, J. A., and Radick, A. P., "Probability models for
family building - An analytical review," Demography 6 (1969),
161-183.
Sheps, M. C., Menken, J. A., Ridley, J. C., and Lingner, J. W., "Birth
intervals, artifact and reality," Contributed paper, Sydney
Conference, International Union for the Scientific Study of
Population, (1967), 857-867.
Sheps, M. C., Menken, J. A., Ridley, J. C., and Lingner, J. W., "The
truncation effect in birth interval data," Journal of American
Statistical Association, 65 (1970), 678-694.
Sheps, M. C. and Mustafa, A. M., "Some considerations affecting the
design of follow up studies," (in Press). (1972)
Sheps, M. C. and Perrin, E. B., "Changes in birth rates as a function
of contraceptive effectiveness - some applications of a stochastic
model," American Journal of Public Health, 53 (1963), 1031-1046.
Sheps, M. C. and Perrin, E. B., "Further results from a human fertility
model with a variety of pregnancy outcomes," Human Biology, 38
(1966), 180-193.
Singh, S. N., "Probability models for the variation in the number of
births per couple," Journal of American Statistical Association,
58 (1963), 721-727.
Singh, S. N. , "A probability model for couple fertility," Sankhya
B 26: (1964), 89-94.
Singh, S. N., and Bhattacharya, B. N., "A generalized probability
distribution for couple fertility," Biometrics 26: (1970),
33-40.
Srinivasan, K., "An application of a probability model to the study
of interlivebirth intervals," Sankhya B 28: (1966), 1-8.
Srinivasan, K., "A set of analytic models for the study of open birth
intervals," Demography 5: (1968), 34-44.
Tietze, C., "The effect of breast feeding on the rate of conception,"
Paper presented at the International Population Conference,
New York, 1961.
Treloar, A. E., Boynton, R. E., Behn, B. G., Brown, B. W., "Variations
of the human menstrual cycle through reproductive life," International Journal of Fertility 12, (1967).
Vollman, R. F., "The degree of variability of the length of the menstrual cycle in correlation with age of woman," Gynaecologia, 142
(1956), 310.
Venkatacharya, K., "An examination of a certain bias due to truncation
in the context of simulation models of human reproduction,"
Sankhya B 31: (1969a), 397-412.
Venkatacharya, K., "Certain implications of short marital durations in
the analysis of livebirth intervals," Sankhya B 31: (1969b),
53-68.
Westoff, C. F., Potter, R. G., Sagi, P. C., and Mishler, E. G., Family
Growth in Metropolitan America, Princeton University Press,
Princeton, (1961).
Wilks, S. S., Mathematical Statistics, New York: Wiley, (1962).
Wolfers, D., "The determinants of birth intervals and their means,"
Population Studies 22: (1968), 253-262.