Boyle, Kerrie Eileen (1983). "Survival Model for Fertility Evaluation."

SURVIVAL MODEL FOR FERTILITY EVALUATION
by
Kerrie Eileen Boyle
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1450
September 1983
A Dissertation submitted to the faculty of
the University of North Carolina at Chapel Hill
in partial fulfillment of the requirements for
the degree of Doctor of Public Health in the
Department of Biostatistics.
Chapel Hill
1983
Approved by:
ABSTRACT
KERRIE EILEEN BOYLE. Survival Model for Fertility Evaluation.
(Under the direction of MICHAEL J. SYMONS.)
This research is concerned with developing methods for summarizing
the fertility experiences of a cohort of women.
A general model for
fertility evaluation based on survival techniques is presented.
The
proposed piece-wise survival function has a multiplicative exponential
hazard rate whereby the relationship between birth predictor covariates
and the reproductive experience of the women under study can be
examined.
Here the event of interest is a live birth.
The first approach to maximum likelihood estimation of the
regression coefficients is through construction of the full likelihood
function.
Race by parity by age by calendar year specific U.S. birth
rates (National Center for Health Statistics, 1976) estimate the
underlying fertility hazard rate for a woman in the study cohort with
similar characteristics during the same year interval.
The crucial
assumption of conditional independence of a woman's yearly contribution
to the likelihood is discussed.
Finally, two goodness-of-fit strategies
are presented for examining the adequacy of the piece-wise constant
hazard rate in describing these reproductive histories.
Another framework for estimating regression coefficients is
available from Cox regression (Cox, 1972).
A stratified proportional
fertility hazard rate is developed to evaluate a woman's reproductive
experience on a year by year basis.
This method also departs from the
previous in the construction of the likelihood; the partial likelihood
function is employed.
The proposed model is used to assess the effect of occupational
exposure to hazardous chemicals on the reproductive health of male
workers employed at two chemical manufacturing plants in the U.S.
The
fertility experience of the workers' wives, gathered retrospectively,
was used as a surrogate measure of the male workers' reproductive health
in order to test the null hypothesis that chemical exposure in the
workplace has no effect on the reproductive health of the workers.
TO MOM AND DAD
ACKNOWLEDGEMENTS
This research has provided me the opportunity to work closely with
my advisor, Dr. Michael J. Symons; his guidance and constant enthusiasm
are gratefully acknowledged.
Thanks are also extended to committee members
Drs. Regina Elandt-Johnson, Richard Levine, Thomas Starr, C. Suchindran,
and Allen Wilcox.
Discussions with Professor Norman L. Johnson were
very helpful.
I especially want to thank Daniel DalCorso for the programming
assistance he so willingly provided and Yuan Yang for his help with
MAXLIK.
Gratitude is also expressed to Sea Parker for her skillful
typing of this dissertation.
While at the University of North Carolina, I received financial
support from Occupational Health Training Grant Number IA028.
This
research has also been supported by the Department of Epidemiology of
the Chemical Industry Institute of Toxicology at Research Triangle Park.
All analyses were performed on their computer; moreover, the reproductive history files described herein are used with their kind permission.
Finally, I am indebted to my friends, near and far, and to my
wonderful family for their love, encouragement, and support.
TABLE OF CONTENTS

List of Tables
List of Figures

Chapter

I. INTRODUCTION AND REVIEW OF THE LITERATURE
   1.1 Introduction
   1.2 Demographic Models of Family Building
   1.3 Fertility Surveillance Methods in Occupational Health Investigations
       1.3.1 Observed versus Expected Numbers of Births
       1.3.2 Stratified Analysis
       1.3.3 Survivorship Analysis
             1.3.3.1 Life Tables
             1.3.3.2 Cox Regression
   1.4 Independence Assumption in the Analysis of Reproductive Histories
   1.5 Outline of Subsequent Chapters

II. A MODEL FOR THE FERTILITY HAZARD FUNCTION
   2.1 Introduction
   2.2 Description of the Data
   2.3 A Multiplicative Exponential Fertility Hazard Function Model
   2.4 The Likelihood Function
   2.5 The Assumption of Yearwise Conditional Independence of a Woman's Contribution to the Likelihood
   2.6 Parameter Estimation
   2.7 Simple Illustration
   2.8 Testing Hypotheses about the Model
   2.9 Goodness-of-Fit of the Model

III. ANALYSIS OF OCCUPATIONAL REPRODUCTIVE DATA USING THE MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD RATE MODEL: FULL LIKELIHOOD FUNCTION
   3.1 Introduction
   3.2 Analysis of Plant A
   3.3 Analysis of Plant B
   3.4 Summary of the Essential Features of the Multiplicative Exponential Fertility Hazard Model

IV. FERTILITY HAZARD RATE FUNCTION WITH STRATIFICATION: PARTIAL LIKELIHOOD
   4.1 Introduction
   4.2 Description of the Data
   4.3 A Cox Regression Approach to Fertility Evaluation
   4.4 Illustration: Plant A Revisited
   4.5 Summary of the Essential Model Features

V. SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH
   5.1 Summary
   5.2 Future Research
       5.2.1 Forms for the Hazard Function
       5.2.2 Regression Strategy
       5.2.3 Occupational Exposure Covariate
       5.2.4 Pooled Analyses
       5.2.5 Model Sensitivity

REFERENCES
APPENDIX A
APPENDIX B
LIST OF TABLES

Table

1.1  Hypothetical Cross-classification of Pregnancy Outcome by Characteristics
3.1  The Number of Woman-Years of Reproductive Experience and Births by Calendar Year: Plant A
3.2  The Number of Woman-Years of Reproductive Experience and Births by Woman's Age: Plant A
3.3  The Number of Woman-Years of Reproductive Experience by Woman's Parity: Plant A
3.4  The Number of Woman-Years of Reproductive Experience by Occupational Exposure and Marital Status: Plant A
3.5  The Number of Births by Occupational Exposure and Marital Status: Plant A
3.6  Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for the Multiplicative Exponential (2.3) and the Multiplicative Weibull (2.34) Fertility Hazard Model: Plant A
3.7  Observed and Expected Numbers of Births, Test Statistics and P-values for Testing Goodness-of-fit of the Multiplicative Exponential Fertility Hazard Model (2.3): Plant A
3.8  Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for Multiplicative Exponential Fertility Hazard Models for Selected Predictors of Birth Probability: Plant A
3.9  Estimates and Estimated Standard Errors for the Parameters for Multiplicative Exponential Fertility Hazard Model, All Years and Married Years Only, for Selected Predictors of Birth Probabilities: Plant A
3.10 Estimated Fertility Hazards for Women with Selected Characteristics: Plant A
3.11 The Number of Woman-Years of Reproductive Experience and Births by Calendar Year and Exposure Definition: Plant B
3.12 The Number of Woman-Years of Reproductive Experience and Births by Age and Exposure Definition: Plant B
3.13 The Number of Woman-Years of Reproductive Experience and Births by Parity and Exposure Definition: Plant B
3.14 The Number of Woman-Years of Reproductive Experience by Occupational Exposure Status and Marital Status for All Years: Plant B
3.15 The Number of Births by Occupational Exposure Status and Marital Status for All Years: Plant B
3.16 Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for the Multiplicative Exponential (2.3) and the Multiplicative Weibull (2.34) Fertility Hazard Model: Plant B
3.17 Observed and Expected Numbers of Births, Test Statistics and P-values for Testing Goodness-of-fit of the Multiplicative Exponential Fertility Hazard Model (2.3): Plant B
3.18 Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for Multiplicative Exponential Fertility Hazard Models for Selected Predictors of Birth Probability: Plant B
3.19 Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for Multiplicative Exponential Fertility Hazard Models for Selected Predictors of Birth Probability Based on All Woman-Years and on Woman-Years not Characterized by Multiple Exposure: Plant B
3.20 Estimated Fertility Hazards for Women with Selected Characteristics Based on All Woman-Years and on Woman-Years not Characterized by Multiple Exposure: Plant B
4.1  Estimates and Estimated Standard Errors for the Parameters and the Log Likelihood for Cox Regression Model: Plant A
5.1  Observed and Expected Numbers of Births From the Multiplicative Exponential Fertility Hazard Model (2.3) with Selected Parity Covariates: Plant A
LIST OF FIGURES

Figure

3.1  U.S. Birth rates per 1000 women by age for white women from 1925 and 1940 birth cohorts with live births at ages 25, 27, and 30
4.1  Calendar Year by Parity Strata Contributing/Lost to Partial Likelihood: Plant A
CHAPTER I
INTRODUCTION AND REVIEW OF THE LITERATURE
1.1 Introduction
It is difficult to determine whether chemical substances found in
the occupational environment deleteriously affect the reproductive
health of male workers exposed to these suspect agents.
Investigations
on a plant-level are feasible but numerous problems are associated with
a definitive confirmation of spermatogenesis dysfunction.
Clinical confirmation requires resource-demanding laboratory tests.
Because of the
potential implications of these lab results, workers frequently refuse
to participate in such clinical investigations.
Moreover, since the
spermatogenesis cycle is approximately seventy-four days, one-time
laboratory tests may not be sufficient to identify historical male
reproductive impairment.
Consequently, investigators of occupational
health have begun to monitor the fertility experience of workers' wives
as a surrogate measure of the male workers' reproductive health.
With a woman's reproductive history more readily available, the
investigator now must decide upon a strategy to statistically summarize
the fertility experience of an occupationally-defined population and
evaluate the association between occupational exposure and fertility.
The emphasis of this research is the development of a general statistical model for evaluating the fertility experience of a cohort of women.
Fertility analysis is more complicated than that of mortality, the
other endpoint of the life cycle that is more commonly the focus of
epidemiologic investigations of occupational health.
Death occurs only once to an individual.
When death occurs, only one person is affected.
That person may be a male or a female of any age.
Birth, on the other hand, reflects a complex interaction of personal
desires, opportunities and socioeconomic factors in addition to physiological factors.
The inevitability surrounding death is not associated with birth.
Birth is the culmination of a biological process that began
nine months previously; however, it is not the only outcome of this
process.
Unlike mortality, fertility is related to more than one person
in that the father as well as the mother is involved.
The characteristics of child and parents must be considered jointly when investigating fertility.
Another differential characteristic is the population at risk.
Only a portion of the population is exposed to the risk of birth.
Moreover, this exposure resumes several months after a birth and
may continue for years.
Fertility evaluation methods should consider these circumstances
surrounding a birth.
Few fertility models have this capability.
The
proposed survivorship model for fertility evaluation accommodates the
heterogeneity one is likely to encounter in study populations.
A review of the present state-of-the-art for describing the fertility experience of a defined population follows in subsequent sections
of this chapter.
The initial scope is broad:
the results of a survey
of the demographic literature for probability models describing the
variation in the number of children born to a woman in a specified time
period appear in Section 1.2.
The focus of Section 1.3 narrows with an
overview of the methods employed to investigate the effect of occupational exposure on fertility.
Finally, the issues surrounding the
independence/dependence assumption of reproductive outcomes at different
parity levels are presented in Section 1.4.
1.2 Demographic Models of Family Building
Various demographic models have been constructed to represent,
approximately, the process of family building.
These models can be classified into two types.
One type considers point events such as conceptions and live births.
The other concerns interval events such as
the length of time from marriage to first birth or the time interval
from the nth birth to the (n+1)st birth.
The emphasis of this review is
the former, that is, models of the distribution of point events.
When the point events are live births, the variation in the number
of children born to a woman must be assessed in order to summarize the
fertility experience of all women in a defined population.
Fisher (1929) examined the variability of human reproduction.
He computed the actual
variation in the number of children born to over 10,000 married women in
New South Wales at the end of their reproductive years.
The variance
(16.18) of this empirical distribution greatly exceeded the average
number of children born to each woman (6.19).
This finding does not
support the null hypothesis that the distribution of women by number of
children born during a fixed time period follows a Poisson distribution,
in which case the mean and the variance would be equal.
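The comparison can be made concrete with the index of dispersion (variance divided by mean), which equals one under a Poisson distribution. The following is a minimal sketch that uses only the two summary figures quoted above; the function name is illustrative and not taken from the source.

```python
def dispersion_index(mean: float, variance: float) -> float:
    """Variance-to-mean ratio; equals 1 for a Poisson distribution."""
    return variance / mean


# Summary figures quoted in the text for the New South Wales data.
ratio = dispersion_index(mean=6.19, variance=16.18)
print(f"index of dispersion = {ratio:.2f}")  # prints 2.61
```

A ratio well above one indicates over-dispersion relative to the Poisson, which is the pattern the heterogeneity models reviewed in this section try to accommodate.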
Dandekar (1955) made a pioneer effort to describe the distribution
of the number of children born to a cohort of women of a specified age
during a fixed interval of time.
Under certain conditions affecting the
independence of successive trials, he derived a modified binomial
distribution and a modified Poisson distribution as the limiting form of
the modified binomial.
Note that the former distribution considers time
as a discrete variable and the latter treats it as a continuous
variable.
The model assumptions were:
(i) the probability of success
(conception) in a single trial or unit of time is a constant, say p, and
this probability is the same for all women in the cohort; and (ii) if a
trial results in a conception, then the probability of conception during
each of the subsequent (t-1) trials is zero.
This modification of the
assumption of independence of successive trials considers the physiological fact that a woman is not exposed to the risk of conception during
her rest period, i.e., the duration of her pregnancy plus her period of
post-partum amenorrhea.
Consider time as a discrete variable.
Then P(x,m), the probability of exactly x births in m trials, is obtained recursively from the
distribution function F(x,m):

$$P(x,m) = F(x,m) - F(x-1,m) \tag{1.1}$$

where

$$F(x,m) = q^{m-xt}\left[1 + (m-xt)p + \frac{(m-xt)(m-xt+1)}{2!}p^{2} + \cdots + \frac{(m-xt)(m-xt+1)\cdots(m-xt+x-1)}{x!}p^{x}\right], \tag{1.2}$$

and q = 1-p.
Note that the terms in the bracket in (1.2) are the first (x+1)
terms in the negative binomial expansion of (1-p)^{-(m-xt)}.
To
accommodate observational-type studies, Dandekar generalized this
modified Poisson model so that all women were not exposed to the risk of
conception during that first calendar year of the investigation.
These statistical models did not adequately fit several observed
distributions of women by the total number of live births obtained from
a survey of fertility in India between 1941 and 1945.
The author
suggested that the first model assumption, a constant probability of
conception for all women over the entire observation period, may not be
valid.
Also, the discrepancy between the empirical and hypothesized
distributions may be explained in part by the fact that this model did
not differentiate between fecund and sterile women.
Singh addresses these sources of heterogeneity among women in the
study population in his extension of Dandekar's original model.
Firstly, he assumes that a woman is or is not exposed to the risk of
conception during the entire observation period of length m with probability α
and 1-α, respectively (Singh, 1963, 1968).
In other words, a woman is either fecund or sterile.
This assumption (iii) is
easily incorporated into the discrete time model (1.2) and the desired
probability is

$$P(x,m) = \alpha\, p^{x} q^{m-xt}\left[\binom{m-xt+x}{x} + \sum_{i=1}^{t-1}\binom{m-i-(x-1)(t-1)}{x-1} q^{t-i}\right]. \tag{1.3}$$

The probability model (1.3) provides a good fit to the empirical
distributions examined by Dandekar (Singh, 1963).
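The relationship between the discrete-time models can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it reads (1.2) as F(x,m) = q^(m-xt) times the first x+1 terms of the negative binomial expansion, and reads (1.3) as a "full rest periods" binomial coefficient plus a sum over rest periods truncated by the end of observation. The function names and the parameter values (m = 10, t = 3, p = 0.2) are hypothetical. With α = 1, the two expressions should agree for x ≥ 1.

```python
from math import comb, factorial, prod


def dandekar_cdf(x: int, m: int, t: int, p: float) -> float:
    """F(x, m) as read from (1.2): probability of at most x births in m trials."""
    n = m - x * t  # trials remaining after x full blocks of t trials
    if x < 0 or n < 0:
        raise ValueError("requires x >= 0 and m - x*t >= 0")
    q = 1.0 - p
    # i-th bracket term: n(n+1)...(n+i-1) p^i / i!
    bracket = sum(
        prod(n + j for j in range(i)) / factorial(i) * p**i
        for i in range(x + 1)
    )
    return q**n * bracket


def dandekar_pmf(x: int, m: int, t: int, p: float) -> float:
    """P(x, m) from (1.1): difference of successive distribution functions."""
    lower = dandekar_cdf(x - 1, m, t, p) if x > 0 else 0.0
    return dandekar_cdf(x, m, t, p) - lower


def singh_pmf(x: int, m: int, t: int, p: float, alpha: float) -> float:
    """P(x, m) as read from (1.3), for x >= 1 and m - x*t >= 0."""
    if x < 1 or m - x * t < 0:
        raise ValueError("requires x >= 1 and m - x*t >= 0")
    q = 1.0 - p
    full = comb(m - x * t + x, x)  # every rest period fits inside (0, m)
    truncated = sum(               # last rest period cut off by the end of observation
        comb(m - i - (x - 1) * (t - 1), x - 1) * q ** (t - i)
        for i in range(1, t)
    )
    return alpha * p**x * q ** (m - x * t) * (full + truncated)


for x in (1, 2, 3):
    print(x, dandekar_pmf(x, 10, 3, 0.2), singh_pmf(x, 10, 3, 0.2, alpha=1.0))
```

Setting α below one simply scales the probabilities for x ≥ 1, with the remaining mass assigned to sterile women at x = 0.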
It should be noted that Singh's model (1.3) presumes that all fecund
women are exposed to the risk of conception during the first year of
observation.
Pathak (1966) generalizes Singh's binomial model (1.3) by
introducing a fourth model assumption (iv):
a fecund couple is or is
not exposed to the risk of conception in the first unit of the observational period.
This generalization extends the usage of the proba-
bility model to observational studies, an achievement similar to
Dandekar's (1955).
Next, Singh (1963) advises that the probability of conception
differs among couples.
He proposes a constant fecundability for each
woman and a beta distribution for the fecundabilities of all women.
However, the validity of this heterogeneity proposal is never examined
empirically.
Lastly, Singh (1968) considers an alternative discrete time fertility model where the number of conceptions occurring to a female
during the entire observational period (0,m) follows a Poisson
distribution.
It is assumed that each woman has the same capacity for
conception, λ. With a "rest" period following birth and the fact that
a woman remains fertile or sterile during the entire observation period,
the desired probability is

$$P(x,m) = \alpha\left[\sum_{i=0}^{x} e^{-\lambda(m-xt)}\frac{(\lambda(m-xt))^{i}}{i!} - \sum_{i=0}^{x-1} e^{-\lambda(m-(x-1)t)}\frac{(\lambda(m-(x-1)t))^{i}}{i!}\right]. \tag{1.4}$$

Expected frequencies calculated from the probability mass function (1.4)
suitably described the empirical distribution of women by the number of
live births from the Indian fertility survey.
Note that this Poisson
model differs from Fisher's (1929) in that it accommodates various sources of heterogeneity among the women in the study with respect to the
risk of giving birth.
The methodology presented so far implicitly has equated conceptions
with live births, also known as complete conceptions.
Fetal wastage
(abortions and still births) has been ignored or, equivalently, assumed
to be non-existent.
Singh and Bhattacharya (1970) expand Singh's
probability model (1.4) to account for complete as well as incomplete
conceptions.
Variable length "rest" periods are considered in this
model, one employed after a complete conception and the other after an
incomplete conception.
Pathak (1971) treats pregnancy outcomes other
than live births in a similar manner.
Brass (1958) advocated a different modeling framework.
He investigated a compound Poisson density to describe the distribution of births
to women during an observation period.
The model assumptions include a constant risk of conception, λ, for each individual woman during the
study interval that varies among the women according to a gamma (Pearson
Type III) distribution, that is,

$$f(\lambda) = \frac{a^{k} e^{-a\lambda} \lambda^{k-1}}{\Gamma(k)},$$

where a and k vary from population to population.
Recall that Singh (1963) assumed a beta distribution for λ.
The resultant
expression for the unconditional probability of x births during the
fixed observation period (0,m) is the negative binomial probability.
The variance of this distribution always is greater than the mean.
This
relationship is consistent with Fisher's (1929) observations.
Modifications to the model to include births to fecund women only lead
to the truncated (at zero) negative binomial probability,

$$P(x,m) = \frac{\Gamma(k+x)}{x!\,\Gamma(k)}\left(\frac{m}{a+m}\right)^{x}\left[\left(\frac{a+m}{a}\right)^{k} - 1\right]^{-1} \tag{1.5}$$

to describe the likelihood of exactly x births to a fecund woman
during the time period (0,m).
Note that Brass' model (1.5) does not
differentiate between complete and incomplete pregnancies.
He proceeds
to incorporate a fixed "rest" period subsequent to each birth.
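The properties claimed for the gamma-mixed Poisson model can be illustrated numerically. The sketch below assumes the gamma density is parameterized with shape k and rate a, so that mixing Poisson counts with mean λm yields a negative binomial with success-type factor m/(a+m); the parameter values are hypothetical and the function names are illustrative. It checks that the variance exceeds the mean and that the zero-truncated form (1.5) is a proper distribution over x ≥ 1.

```python
from math import exp, lgamma, log


def log_nb_coefficient(x: int, k: float) -> float:
    """log of Gamma(k + x) / (x! Gamma(k))."""
    return lgamma(k + x) - lgamma(x + 1) - lgamma(k)


def neg_binomial_pmf(x: int, k: float, a: float, m: float) -> float:
    """Probability of x births in (0, m) under the gamma-mixed Poisson model."""
    return exp(log_nb_coefficient(x, k)
               + k * log(a / (a + m)) + x * log(m / (a + m)))


def truncated_pmf(x: int, k: float, a: float, m: float) -> float:
    """Zero-truncated form (1.5), defined for x >= 1 (fecund women only)."""
    head = exp(log_nb_coefficient(x, k) + x * log(m / (a + m)))
    return head / (((a + m) / a) ** k - 1.0)


k, a, m = 2.0, 4.0, 10.0   # hypothetical shape, rate and observation length
xs = range(400)            # the tail beyond 400 is negligible for these values
mean = sum(x * neg_binomial_pmf(x, k, a, m) for x in xs)
var = sum((x - mean) ** 2 * neg_binomial_pmf(x, k, a, m) for x in xs)
total_truncated = sum(truncated_pmf(x, k, a, m) for x in range(1, 400))
print(mean, var, total_truncated)
```

For these values the mean is km/a and the variance is (km/a)(1 + m/a), so the variance exceeds the mean for any positive m, in line with the remark above.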
Biswas (1973) considers the negative binomial model of Brass (1958)
and introduces the probability that a woman is either fecund or sterile
(α and 1-α, respectively).
The end result of this modification to the
negative binomial is similar to that of the truncated adjustment proposed by Brass.
The demographic probability models describing variation in the
number of children born to a woman in a specified time period discussed
up to this point are homogeneous with respect to time.
Specifically,
parameters describing the reproductive capacity of a woman remain
constant throughout the period of observation.
In practice, these time-
homogeneous probability models may provide a reasonable description of
the reproductive experience of a woman for study periods of short
duration.
However, to describe a woman's fertility experience over long
observation periods or, perhaps, her entire reproductive span, probability models must allow for variation in the reproductive parameters
for each individual woman.
Singh et al. (1974) and Singh (1981) develop
complex probability models where the probability of conception for a
woman is a function of her parity.
Henry (1957, 1961) completed several
theoretical studies of models where one or more of the model parameters
vary with the woman's age.
Sheps and Menken (1971) examine time-
dependent changes in reproductive behavior.
The principal function of
these latter, complex models is to elucidate the effect of various
biological factors on human reproduction.
Their application frequently
requires model simplification.
Hoem (1970) developed a series of fertility models motivated by the life
table framework.
Within the theoretical structure of Markov processes,
transition probabilities describing the likelihood of moving from one
fertility state to another are estimated.
From these probabilities the
estimated average time spent in a state can be calculated.
In Hoem's
theoretical application the issue of the heterogeneity of women in the
study cohort was directly addressed:
the states were parity by age by
marital status by interpregnancy interval by marital status duration
specific.
However, with the introduction of each of these confounders
into the analysis, the dimension of the transition matrix expands
geometrically.
From a practical viewpoint, it is doubtful that enough
observations would be available for a useful model.
In summary, several demographic models have been investigated as
possible candidates for fertility surveillance.
In the context of an
occupational health fertility study, demographic family building models
are beset by several difficulties.
The validity of the assumptions of a
homogeneous cohort of women and of the constancy of a woman's reproductive parameters over secular periods has been addressed.
Moreover, the
majority of these probability models presume that each woman is observed
for a time interval of specified length.
Adherence to such a rigid
follow-up schedule may not be feasible in some highly mobile industrial
populations.
1.3 Fertility Surveillance Methods
in Occupational Health Investigations
1.3.1 Observed versus Expected Numbers of Births
Another approach to evaluating the fertility experience of a group
of women appears in several aggregate-level fertility investigations.
This technique involves the comparison of the observed number of births
occurring among these women to an expected number, where the latter is
the number of births anticipated in the study group under the fertility
schedule of a standard population.
For example, the Princeton European
Fertility Project examined the decline in fertility in numerous European
countries since the French Revolution (Matthiessen and McCann, 1978).
To describe reduction in fertility, these investigators utilized the
"index of overall fertility", the ratio of total births in a country
to the number anticipated if women in that geographic region had the age-specific fertility of North American Hutterite wives.
Blake (1955)
researched the effect of unstable reproductive unions on the fertility
of Jamaican women.
She calculated an expected number of live births
under the assumption that a woman was continually exposed to the risk of
pregnancy during her reproductive years.
A comparison to the actual
number of live births quantified the "lost fertility" attributed to temporary
conjugal unions, an accepted sociological condition on the Island
at that time.
The application of comparing an observed number of births with an
expected for purposes of fertility evaluation in an occupational setting
has been recent.
Fisher (1978) proposed a technique for monitoring the
reproductive health of male industrial workers through a surveillance of
their wives' fertility experience during periods of occupational
exposure.
Specifically, let b be an m×1 vector such that b_j has the value
one or zero depending on whether a birth occurred or did not occur
during the jth exposed year of a woman's reproductive history
(j = 1,2,3,...,m).
In the U.S. population the conditional probability
of birth in year j for an ever-married woman of age a_j, parity c_j, race r_j
and interval since last birth (or marriage) l_j is
P(a_j,c_j,r_j,l_j) (U.S. Bureau of the Census, 1975).
Fisher expressed the
probability of a particular fertility history f(b) as

$$f(b) = \prod_{j=1}^{m} P(a_j,c_j,r_j,l_j)^{b_j}\,\left[1-P(a_j,c_j,r_j,l_j)\right]^{1-b_j}.$$

The effect of exposure on fertility was assessed by comparing the
observed number of births occurring subsequent to exposure among women
in an occupationally-defined population to an expected number.
The
expected number of births for an ever-married woman of a specific age,
parity, race and interval since last birth (or marriage) was calculated
by summing, over all possible paths or reproductive histories f(b)
leading to births during the m exposed years, the product of the probability of that history times the number of births:

$$\sum_{b_1=0}^{1}\sum_{b_2=0}^{1}\cdots\sum_{b_m=0}^{1}\left[f(b)\cdot b'b\right].$$

The vector of birth probabilities associated with each fertility history
f(b) differs to reflect the unique details of that hypothetical reproductive experience.
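The enumeration over reproductive histories can be sketched in a few lines. This is an illustration under stated assumptions rather than Fisher's own procedure: `birth_prob` is a hypothetical stand-in for the tabulated U.S. birth probabilities, race is treated as fixed for a given woman and therefore omitted from its arguments, and the interval is counted in whole years since the last birth (or marriage). Each of the 2^m histories b is weighted by f(b) and multiplied by its number of births b'b.

```python
from itertools import product
from typing import Callable

# Hypothetical signature: (age, parity, years since last birth) -> probability
BirthProb = Callable[[int, int, int], float]


def expected_births(m: int, age0: int, parity0: int, interval0: int,
                    birth_prob: BirthProb) -> float:
    """Sum of f(b) * b'b over all 2**m histories b of m exposed years."""
    total = 0.0
    for history in product((0, 1), repeat=m):
        age, parity, interval = age0, parity0, interval0
        f_b = 1.0
        for b in history:
            p = birth_prob(age, parity, interval)
            f_b *= p if b else 1.0 - p
            # Update the woman's characteristics along this hypothetical path.
            age += 1
            parity += b
            interval = 1 if b else interval + 1
        total += f_b * sum(history)  # b'b is the number of births in the path
    return total


def illustrative_prob(age: int, parity: int, interval: int) -> float:
    """Made-up schedule: lower probability at higher parity and right after a birth."""
    base = 0.20 if age < 30 else 0.10
    return base / (1 + parity) * (0.5 if interval < 2 else 1.0)


print(expected_births(5, 25, 1, 3, illustrative_prob))
```

The cost grows as 2^m, which is tolerable for exposure periods of a few years but would call for a recursive or dynamic-programming formulation over longer histories.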
Application of Fisher's methodology in an occupational setting may
produce misleading results.
Fisher's technique utilizes the birth probabilities
for the U.S. population.
Unfortunately, no attempt was made
to adjust these birth probabilities to reflect risk differences due to
factors that differentiate the U.S. population from the industrial population under study.
Recall that the expected number of births computed by Fisher's
methodology is not influenced by the woman's observed reproductive
experience during the interval of interest; it is only conditional on
her age, parity, race, marital status (ever-married), and time since
last birth at the beginning of the exposure period.
With the method
proposed by Wong et al. (1979), the expected number of births for a
woman during exposed years is calculated based on her actual fertility
experience during each year of that exposure interval.
Wong et al. (1979) examined the association of occupational exposure
to ethylene dibromide with male reproductive health through a retrospective summarization of the fertility experience of the wives of the
exposed workers.
These researchers employed an indirect method of
standardization to calculate expected numbers of births.
First, those
woman-years of reproductive experience where the husband experienced
hazardous occupational exposure were disaggregated by maternal age,
parity, race and calendar years.
Then the U.S. age by parity by race
by calendar year specific birth probabilities (National Center for
Health Statistics, 1976) were applied to the corresponding woman-years
to obtain age by parity by race by calendar year specific expected
number of births.
Thus, the expected number of births for a woman is
conditional on her age and observed parity at the beginning of each
calendar year in her reproductive history.
Observed and expected births
were summed, and ratios of observed to expected numbers of births were
examined.
Under the null hypothesis of no effect on fertility due to
the husband's hazardous occupational environment, the number of observed
births occurring during the occupational exposure period is presumed to
have a Poisson distribution with mean equal to the expected number
described above.
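The indirect standardization described above can be sketched as follows. All strata, woman-years, rates and the observed count are hypothetical values chosen for illustration; they are not taken from Wong et al. (1979) or from the National Center for Health Statistics tables. The expected number of births is the sum of woman-years times the standard birth probability in each stratum, and the lower-tail Poisson probability gives a one-sided assessment of a fertility deficit.

```python
from math import exp


def expected_births(woman_years: dict, rates: dict) -> float:
    """Apply standard stratum-specific birth probabilities to exposed woman-years."""
    return sum(wy * rates[stratum] for stratum, wy in woman_years.items())


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for X ~ Poisson(mean), accumulated term by term."""
    term = exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


# Hypothetical strata keyed by (age group, parity, race, calendar year).
woman_years = {
    ("20-24", 0, "white", 1970): 40.0,
    ("25-29", 1, "white", 1970): 60.0,
    ("30-34", 2, "white", 1971): 50.0,
}
rates = {
    ("20-24", 0, "white", 1970): 0.15,
    ("25-29", 1, "white", 1970): 0.12,
    ("30-34", 2, "white", 1971): 0.06,
}

observed = 12
expected = expected_births(woman_years, rates)
ratio = observed / expected
p_deficit = poisson_cdf(observed, expected)  # P(O <= observed) under the null
print(f"E = {expected:.1f}, O/E = {ratio:.2f}, lower-tail p = {p_deficit:.3f}")
```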
Several limitations were acknowledged by the investigators.
The
birth probabilities employed as a standard reflect the fertility
experience of both married and non-married women in the U.S. population.
Since the cohort of women under study were married, the expected number
of births may be underestimated.
In addition, efforts were made to
accommodate several sources of heterogeneity in the study population.
This standardization technique adjusted for several factors influencing
fertility, e.g., age of mother, birth cohort, race and parity.
However,
other factors such as religion and contraceptive practices were not considered in the analysis.
Finally, this method for monitoring fertility
in an industrial population does not consider that the birth probabilities for the industrial group in the absence of occupational exposure
may differ from the national experience.
This was remarked to be a
limitation of Fisher's method as well.
A recent contribution in the area of occupational fertility research
is the statistical model developed by Levine et al. (1980).
It is similar to the Wong et al. (1979) method in that it calculates an expected
number of births for a woman's reproductive period conditional upon age
and parity at the start of each calendar year.
However, the Levine
et al. method places a constraint on the expected number of births for
any one parity level:
the expected number cannot exceed unity.
This translates into the following computations.
Suppose a woman remains at a
parity level for an interval lasting more than one year.
Then the
probability of giving birth is one minus the probability of not giving
birth during this time period.
Equivalently, it is one minus the product
of the conditional probabilities of not giving birth during those
consecutive calendar years.
U.S. birth probabilities are utilized.
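The parity-level constraint can be seen in a short calculation. The yearly probabilities below are hypothetical; the function name is illustrative. Summing yearly probabilities can exceed one when a woman stays at the same parity for many years, whereas one minus the product of the yearly probabilities of no birth is always bounded by one.

```python
from math import prod


def prob_birth_within_parity(yearly_probs: list) -> float:
    """One minus the product of the yearly probabilities of not giving birth."""
    return 1.0 - prod(1.0 - p for p in yearly_probs)


yearly = [0.15, 0.12, 0.10]      # hypothetical probabilities for three years at one parity
naive_sum = sum(yearly)          # unconstrained sum of yearly expectations
constrained = prob_birth_within_parity(yearly)
print(naive_sum, constrained)

long_stay = [0.20] * 10          # ten years at the same parity level
print(sum(long_stay), prob_birth_within_parity(long_stay))
```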
The observed and expected number of births are dichotomized according
to whether they occurred while the husband was not at risk to exposure in
the plant ($O_1$ and $E_1$, respectively) or at risk to exposure ($O_2$ and $E_2$,
respectively). Two parameters are estimated. The positive constant, $\phi$,
accounts for differences in fertility between the industrial population
and the general U.S. population due to socio-economic factors.
Of particular interest is the parameter $\theta$, a positive constant which
reflects the change in fertility following occupational exposure to the
suspect agent.
The random variable $O_1$ is assumed to follow a Poisson
distribution with mean $\phi E_1$, and $O_2$ is presumed to be a Poisson random
variable, independent of $O_1$, with mean $\theta \phi E_2$. Estimates of $\phi$ and $\theta$
are $O_1/E_1$ and $(O_2/E_2)/(O_1/E_1)$, respectively. Note that $\phi$ and $\theta$
are assumed not to vary across age, parity, birth cohort, or race
groups.
Hypothesis testing and interval estimation follow from the
distributional assumptions.
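As a small worked illustration of the two estimators just described, the sketch below computes them from observed and expected birth counts. The function name and the counts are hypothetical and are not taken from Levine et al. (1980).

```python
def levine_estimates(o1, e1, o2, e2):
    """Point estimates of the two Levine et al. parameters.

    o1, e1: observed and expected births while the husband was not at risk
    o2, e2: observed and expected births while the husband was at risk
    """
    phi_hat = o1 / e1                    # socio-economic fertility differential
    theta_hat = (o2 / e2) / (o1 / e1)    # change in fertility following exposure
    return phi_hat, theta_hat

# Hypothetical counts for illustration only.
phi_hat, theta_hat = levine_estimates(o1=45, e1=50.0, o2=18, e2=30.0)
print(phi_hat, theta_hat)
```

An estimate of the exposure parameter below one would point toward reduced fertility following exposure, subject to the Poisson assumptions stated above.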
The attraction of this model lies in its ability to differentiate the
effects on fertility due to occupational exposure ($\theta$) from those due to
socio-economic factors ($\phi$).
This is feasible since the women under
study act as their own controls.
Specifically, their reproductive
experience is dichotomized so that pre-exposure fertility experience is
compared to their post-exposure fertility experience.
However, the
Poisson distribution assumption may not be valid with young women where
the probability of giving birth is not a rare event.
Levine et al. (1981) proceeded to validate this fertility surveillance method on the reproductive histories of wives whose husbands
were clinically known to have reduced sperm counts associated with occupational exposure to dibromochloropropane.
1.3.2 Stratified Analysis
Townsend et al. (1982) retrospectively gathered the reproductive
history of wives of workers exposed to chlorinated dioxins as well as
those of a control group.
With pregnancy outcome as the unit of
analysis, a stratified analytic approach was utilized to examine the
association between exposure (dichotomous independent variable) and the
dependent variable, pregnancy outcome (favorable or unfavorable).
A
variable selection procedure chose from the nine categorical covariables,
viz., the wife's age, alcohol consumption and smoking habits, her
employment in a high risk job as well as use of pharmaceuticals during
pregnancy, gravidity, the occurrence of complications during pregnancy
and during labor and birth control method, the strata for each exposure-outcome analysis.
From an examination of estimated crude and adjusted
odds ratios, these researchers concluded that "there was no biologically
meaningful association between adverse pregnancy outcomes and paternal
dioxins exposure."
Stratified analysis is one technique for controlling for confounding
variables at the analysis stage.
It is only worthwhile when there are
sufficient numbers of observations in each and every stratum. However,
as the number of confounders increases so, too, does the number of strata.
In spite of the variable selection procedure, Townsend et al. note that
most of the potential stratum-specific 2x2 tables "included no pregnancy
outcomes at all or none in one of either the dependent or independent
category levels."
One final word regarding this technique: stratified
analysis presumes an appropriate categorization scheme for each
variable.
Improper categorization may bias results, while categorization
of continuous variables may result in a loss of information.
1.3.3 Survivorship Analysis
1.3.3.1 Life Tables
Dobbins et al. (1978) studied the effect of occupational exposure on
fertility by examining the length of time to progress from one parity to
the next, that is, the time interval between successive live births.
They constructed parity-specific clinical life tables to determine the
probability of remaining at ("surviving") a given parity.
To construct an $i$th parity-specific life table, follow-up time for a
woman commences with her $i$th birth. Some of the traditional life table
functions and their modified interpretations include:
(i) $[x, x+1)$, month intervals of follow-up subsequent
to the $i$th birth;
(ii) $l_x$, number of $i$th parity women exposed to the risk
of birth at the beginning of the interval $[x, x+1)$;
(iii) $d_x$, number of $i$th parity women who give birth in
the interval $[x, x+1)$; and
(iv) $q_x$, conditional probability of giving birth in
$[x, x+1)$ by a woman of parity $i$.
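The life table functions listed above can be illustrated with a short sketch. It assumes, for simplicity, that no woman is withdrawn or lost to follow-up, so it is not the full clinical life table of Dobbins et al. (1978); the function name and the counts are hypothetical.

```python
def parity_life_table(l0, births_per_interval):
    """Build a simple parity-specific life table without censoring.

    l0: number of women entering follow-up at their i-th birth
    births_per_interval: d_x, births observed in each monthly interval [x, x+1)
    """
    rows = []
    l_x = l0
    surviving = 1.0          # probability of still remaining at the parity
    for x, d_x in enumerate(births_per_interval):
        q_x = d_x / l_x if l_x > 0 else 0.0   # conditional probability of birth
        surviving *= 1.0 - q_x
        rows.append({"x": x, "l_x": l_x, "d_x": d_x,
                     "q_x": q_x, "surviving": surviving})
        l_x -= d_x           # women who gave birth leave this parity
    return rows

# Hypothetical monthly birth counts for 100 women.
table = parity_life_table(l0=100, births_per_interval=[0, 5, 10, 17])
for row in table:
    print(row)
```

Comparing the "surviving" column between an exposed group and a control group is the kind of contrast the parity-specific analysis relies on.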
Upon completion of the life table, the parity-specific survivorship
experience of women with exposed husbands was compared to that of a
control group.
If the occupational exposure reduced workers' fertility,
then it was hypothesized that their wives would remain at ("survive") a
given parity level longer than the controls.
Besides hazardous exposures in the workplace, it is known that
pregnancy spacing may be influenced by such factors as socio-economic
status, age of the mother and race.
Life table analysis presumes homogeneity
of the two comparison groups.
When the homogeneity assumption
is violated, variables can be controlled at the analysis stage through
stratification.
However, stratification is possible only with a large
number of observations.
Namboodiri et al. (1980) employed multi-stage increment-decrement
life tables to investigate the relationship between fertility and female
employment.
Through these parity-specific life tables the transition
among five states was examined:
pregnancy; contracepting/working; not
contracepting/working; contracepting/not working; and not contracepting/
not working.
Then the probability of first passage to a state as well
as the time taken for this move was estimated.
The sample sizes
required for the construction of these life tables would be difficult to
obtain in an occupational setting.
The investigators utilized data from
the 1965 National Fertility Study on zero parity women to illustrate
their method.
The construction of life tables presumes that the events under study
are independent. The birth events in a woman's reproductive history may
be dependent. Thus, use of life tables to describe women's complete
fertility experience may violate model assumptions.
Namboodiri et al.
(1980) and Dobbins et al. (1978) circumvent this issue by considering
only one birth per reproductive history at a time through parity-specific life tables.
However, separate analyses of a woman's
parity-specific reproductive experience decrease the sample size.
Another consideration is the degree of homogeneity of the study
population.
Heterogeneity can be accommodated in life table analyses
through stratification; however, this necessitates large samples.
1.3.3.2 Cox Regression
Recently, Trost and Lurie (1980) developed a model of child spacing
based on Cox regression.
This semi-parametric hazard model utilizes an
unspecified, underlying hazard $\lambda_0(t)$. In addition, it adjusts for the
effect of covariables (Cox, 1972). The survival time modeled is the
time $t$ from marriage to first birth, or from first birth to second
birth, and so on, which is just the length of time to advance from one
parity to the next.
The hazard model is specified by
$$\lambda(t; z) = \lambda_0(t) \exp\{\beta' z\},$$
where $z$ is a vector of covariables and $\beta$ is a column vector of regression parameters.
The focus on
only a fraction of a woman's reproductive history, e.g., the time from
one birth to the next, is similar to the usage of Namboodiri et al.
(1980) and Dobbins et al. (1978).
It minimizes any concern about the
independence assumption of birth events within a family.
The authors illustrate the effect of five exogenous covariables on
the probability that a married couple has their first child after one
year.
These variables are the wife's IQ, education, and age; the
husband's income; and their place of residence as rural or urban.
Although not pursued by the investigators, an exposure indicator
variable could be introduced into this model when employed in an occupational setting to determine if exposure to hazardous material in the
working environment decreased the probability of birth of the first
child.
This rationale is similar to that presented by Dobbins et al.
(1978):
a detrimental effect on fertility may increase the length of
time spent at a particular parity.
Maximum likelihood estimates of the regression coefficients $\beta$ are
obtained from a partial likelihood function.
The rationale for the
partial likelihood is that the unknown baseline hazard $\lambda_0(t)$ drops
out of the likelihood expression used for estimating the regression
parameters $\beta$.
Asymptotic properties of maximum likelihood estimates
include approximate normality, sufficiency, efficiency, and consistency.
In an occupational fertility study, it is questionable whether sample
sizes at the higher parities would be large enough for these desirable
properties to hold.
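To make concrete why the baseline hazard drops out, the following sketch evaluates a Cox partial log-likelihood for a single covariate with no tied event times. It is an illustration under those simplifying assumptions, not the Trost and Lurie (1980) analysis; the data and the function name are hypothetical.

```python
from math import exp, log

def cox_partial_loglik(beta, times, events, z):
    """Cox partial log-likelihood, one covariate, no tied event times.

    times: waiting time to the next birth (or to censoring)
    events: 1 if a birth was observed, 0 if censored
    z: covariate value for each woman (e.g., an exposure indicator)
    """
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue                     # censored records add no event term
        # Risk set: women still under observation at time t_i.
        denom = sum(exp(beta * z[k]) for k, t_k in enumerate(times) if t_k >= t_i)
        loglik += beta * z[i] - log(denom)
    return loglik

# Hypothetical months from marriage to first birth.
times = [14.0, 20.0, 26.0, 31.0]
events = [1, 1, 0, 1]
z = [0, 1, 1, 0]
original = cox_partial_loglik(-0.5, times, events, z)
rescaled = cox_partial_loglik(-0.5, [t ** 2 for t in times], events, z)
print(original, rescaled)
```

Only the ordering of the times enters the calculation, so any order-preserving change of the time scale leaves the value unchanged; no term involving the baseline hazard appears.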
Each model examined in this section has certain limitations and
these limitations have been acknowledged.
What needs to be stressed at
this point are the advantages of each of the fertility evaluation techniques reviewed.
These advantages have been summarized below as desired
properties of a statistical model for fertility evaluation to be
employed in occupational settings. These characteristics include the
model's capability:
(i) to evaluate the effect of hazardous occupational
exposure on fertility;
(ii) to adjust for heterogeneity within the study population;
(iii) to accommodate heterogeneity between study and standard
populations; and
(iv) to summarize a fraction of or the entire reproductive
span of a woman.
1.4 Independence Assumption in the Analysis of Reproductive Histories
This section is devoted to a critical issue that arises in the
statistical analysis of reproductive histories, namely, the fundamental
assumption of independence of sampling units.
We will first approach
this issue from a statistical viewpoint; then epidemiological evidence
presented in the literature will be discussed.
A comparison of rates and proportions frequently has been employed
to describe women's reproductive histories.
Often the reproductive
history data of women participating in a fertility study are summarized
in a table similar to Table 1.1.
TABLE 1.1
HYPOTHETICAL CROSS-CLASSIFICATION OF PREGNANCY OUTCOME
BY CHARACTERISTIC

                      Characteristic
Pregnancy
Outcome            Present    Absent    Total
Incomplete            a          b        m1
Complete              c          d        m2
Total                 n1         n2       N

The null hypothesis is that the proportion of incomplete pregnancies
among the $n_1$ pregnancies with the characteristic of interest is equal
to the proportion among the $n_2$ pregnancies without this characteristic.
The chi-square statistic for this table is
$$X^2 = \frac{N(ad - bc)^2}{n_1 n_2 m_1 m_2}.$$
Under the null hypothesis, $X^2$ is distributed as a chi-square variate
with one degree of freedom.
Infante et al. (1976) utilize this procedure
to measure the association between exposure to polyvinyl chloride
and fetal death.
The model assumptions relative to Infante's investigation are:
(i) the $N$ pregnancies are mutually independent; and
(ii) each of the $n_1$ conceptions exposed to vinyl chloride has the same
probability of an incomplete outcome and each of the
$n_2$ unexposed conceptions has the same probability of being aborted (Fleiss, 1973).
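The statistic for the 2x2 table can be computed directly from the four cell counts. The sketch below follows the cell labels of Table 1.1; the function name and the counts are hypothetical and serve only to show the arithmetic.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic N(ad - bc)^2 / (n1 n2 m1 m2) for a 2x2 table.

    a, b: incomplete pregnancies with / without the characteristic
    c, d: complete pregnancies with / without the characteristic
    """
    n1, n2 = a + c, b + d        # column totals (characteristic present / absent)
    m1, m2 = a + b, c + d        # row totals (incomplete / complete)
    n = n1 + n2
    return n * (a * d - b * c) ** 2 / (n1 * n2 * m1 * m2)

# Hypothetical counts for illustration only.
x2 = chi_square_2x2(a=10, b=20, c=30, d=40)
print(x2)
```

The result is referred to a chi-square distribution with one degree of freedom, whose upper 5 percent point is approximately 3.84. The calculation treats every pregnancy as an independent unit, which is exactly the assumption questioned in this section.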
Reproductive histories analyzed with these standard procedures consider each pregnancy outcome as a sampling unit; see also Townsend
et al. (1982).
This implies that a woman's successive pregnancies are
independent events.
Violation of this independence assumption invalidates
the inferences drawn from these statistical analyses.
From a
practical standpoint, concern about violating this assumption can be
minimized if the study interval is kept short.
However, as follow-up
time increases, the likelihood that a woman will have only one pregnancy
during this period decreases.
Additional evidence supporting or opposing the assumption of statistical independence of a woman's pregnancy outcomes is culled from the
literature.
Buffler (1980) presents the design of a study of reproductive
outcomes of wives of workers employed in the chemical industry, and
concludes with remarks about the difficulties encountered in conducting
such an investigation.
An issue of deep concern relates to the statistical
treatment of the collected data.
She states that "clearly the
pregnancy is the unit of analysis, but the pregnancies experienced by a
single respondent cannot be considered as independent observations."
Alternative analytic methods are not proposed.
Kissling (1981) directly addresses the question of statistically
independent pregnancy outcomes within a family unit in her investigation
of fetal loss.
She develops a generalized model for the analysis of
nonindependent observations.
Her uniform-logistic model, based on the
logistic regression model, considers potential risk factors associated
with a pregnancy.
In addition, the model parameter $\theta^*_0$ that accounts
for the "background" risk of an adverse pregnancy outcome of a fetal
loss, is assumed to have a continuous uniform distribution over the
sample of family units.
For a dichotomous outcome, $Y = 0$ (complete
pregnancy) or $Y = 1$ (incomplete pregnancy), the uniform-logistic model
has the form
(1.6)
This compound probability (1.6) assumes independence of pregnancies
within a family unit conditional on the specific "background" risk
associated with each pregnancy.
This "background" risk may consider the
mother's age and gravidity; paternal exposure to hazardous chemicals at
the time of conception; prior fetal loss; and the mother's past and
current smoking and alcohol consumption habits.
An unconditional probability
of a fetal loss was obtained from the uniform-logistic model by
integrating the conditional probability with respect to the density of
the "background" risk.
Further analyses revealed that this unconditional
probability approached the logistic probability.
Application
of the uniform-logistic model to pregnancy history data produced estimated parameters almost identical to parameter estimates from the
logistic regression model, a model that assumes independence of
pregnancy outcomes.
A second approach to this issue has been identified.
It involves
restricting the analysis to a specific parity level.
Then one complete
pregnancy outcome per reproductive history enters the analysis and the
independence assumption definitely is not violated.
Dobbins et al.
(1978) suggest that a woman's pregnancy outcomes are not independent
events.
So as not to violate model assumptions, they restrict analyses
of pregnancy histories to specific parity levels.
As indicated
previously, parity-specific analyses do not summarize a woman's entire
reproductive experience.
Finally, several fertility evaluation methodologies presented in the
previous section consider sampling units other than a pregnancy outcome.
The fertility evaluation models advocated by Wong et al. (1979) and
Levine et al. (1980) assume that yearly fertility outcomes in a woman's
reproductive history are independent events where the independence is
conditional on the woman's age, parity, birth cohort and race at the
beginning of each year.
The unit of analysis for these two models is a
woman's annual fertility outcome.
The issue of interdependency of outcomes in a woman's reproductive
history surfaces in the epidemiologic literature as well.
Researchers
have focused on the occurrence of spontaneous abortions as a means to
investigate whether the outcome of a conception is associated with
pregnancy outcomes at higher gravidity levels.
Study results are
inconclusive.
Warburton and Fraser (1964) infer that the risk of spontaneous
abortion does vary with gravidity.
Naylor's research (1974)
supports a strong parity effect.
Kline et al. (1978) develop a log-linear
recurrence risk model to investigate this issue.
The recurrence
risk hypothesis proposes that once a woman aborts, the risk of future
abortions increases.
This is suggestive of interdependency among
pregnancy outcomes.
Controlling for the possible confounding effect of
maternal
age and recurrence risk, as indicated by a woman's history of
spontaneous abortion, Kline et al. (1978) conclude that spontaneous
abortion is not associated with gravidity.
The data were obtained from
an ongoing study of spontaneous abortion in New York.
Naylor and
Warburton (1979) consider the reproductive histories of over 14,000
women throughout the United States and examine the sequences of spontaneous abortions and live births descriptive of their pregnancy
histories.
If the risk of spontaneous abortion were independent of
gravidity, then the distribution of sequences with the same number of
live births and abortions would be uniform.
Stratified analyses support
the hypothesis that the risk of spontaneous abortion increases with
gravidity, independent of maternal age.
In the epidemiologic literature the issue regarding the
independence/dependence of events in a woman's reproductive history
remains controversial.
Moreover, each statistical method for analyzing
a woman's reproductive experience has underlying model assumptions
regarding the independence of these events.
The validity of these
assumptions in fertility investigations must be addressed.
1.5 Outline of Subsequent Chapters
In the following chapter, a piecewise survival model with a multiplicative exponential hazard rate is proposed for fertility surveillance.
This parametric model provides for a year by year evaluation of a
woman's reproductive experience.
Furthermore, it allows the researcher
to investigate for an association between occupational exposure and
fertility.
The reproductive histories of two cohorts of women, whose
husbands have experienced on-the-job exposure to hazardous chemicals,
are analyzed in Chapter III to illustrate the methods of the previous
chapter.
The framework for a semi-parametric fertility hazard model is
developed in Chapter IV.
This competitive strategy is employed to study
the fertility experience of an occupationally-defined cohort of women.
Finally, suggestions for future research in survival models for fertility evaluation appear in Chapter V.
CHAPTER II
A MODEL FOR THE FERTILITY HAZARD FUNCTION
2.1 Introduction
The proposed survival function for fertility surveillance is a
piecewise model where a woman's reproductive experience is examined on a
year by year basis.
A multiplicative exponential fertility hazard facilitates
the examination of the effect of hazardous occupational exposure
on fertility.
Moreover, the covariables in the fertility hazard model
adjust for heterogeneity between the study and the external standard
population as well as accommodate heterogeneity among the women in the
cohort under study.
The framework for this model is developed in this chapter.
A description of the experiment follows in the subsequent section.
Then
the fertility hazard rate is defined and the likelihood function is
constructed.
The assumption of conditional independence of a woman's
yearly contribution to the likelihood is investigated.
Maximum likelihood
estimates of the model parameters are examined.
The relationship
between the fertility hazard and the covariables in the model is
explored through the likelihood ratio test.
Finally, two approaches to
the question of goodness-of-fit of the model are outlined.
2.2 Description of the Data
The proposed model is used to examine the effect of occupational
exposure to a hazardous chemical compound on the reproductive health of
male workers employed at a chemical manufacturing plant.
In this
research the reproductive experience of the worker's wife is used as a
surrogate measure of the male worker's reproductive health.
Therefore,
at each chemical manufacturing plant the study cohort consists of the
wives of the workers employed at this facility on the interview date.
In the subsequent analyses, the reproductive period for a woman is
defined as the thirty-five year interval commencing with her fifteenth
year and ending with her fiftieth year, and the events of interest are the births
of her children.
The reproductive experience of the women in the cohort
was ascertained retrospectively through an interviewer-administered
questionnaire survey of the male workers at the manufacturing plant.
This questionnaire was the source of essential demographic information
as well as the couple's date of marriage, the wife's birth date and the
date of birth of each child.
If the couple were separated or divorced,
then the date of that event was also available from the survey
questionnaire.
From the male worker's employment history, his date of
hire was abstracted.
The husband's previous job classifications within
this plant with the corresponding entry and exit dates also appeared in
this document, providing information on his occupational exposure to
suspect chemicals. These latter dates were incremented by nine months
so that the husband's employment experience at the estimated time of
conception would be associated with the observed birth.
For each woman in the study cohort for each year of her reproductive
experience, the values of covariables ($z$) utilized in this research were
available from these two data sources.
These covariables are:
(i) the husband's occupational exposure status ($z_1$), an indicator
variable that is equal to one when the male worker is occupationally exposed to the suspect agent for more than six
months of the year, and zero otherwise;
(ii) the couple's marital status ($z_2$), an indicator variable that
takes the value of one when the couple is married for less
than half the year, and zero otherwise;
(iii) the natural logarithm of the woman's age ($z_3$) as of January 1
of that year where age is a continuous variable ranging from
fifteen to forty-nine;
(iv) the woman's parity ($z_4$) as of the first of the year, where
parity is the number of children that a woman has borne; and,
finally,
(v) the square of the previous covariable, i.e., parity squared.
Moreover, define $z_0 = 1$
for each woman year of reproductive experience.
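The covariate definitions above can be sketched as a small function that assembles the vector for one woman-year. The function and argument names are hypothetical, and labeling the parity-squared term as a fifth component is an assumption made for the illustration.

```python
from math import log

def covariate_vector(exposed_months, married_months, age_jan1, parity_jan1):
    """Assemble the covariate vector for one woman-year.

    exposed_months: months of the year the husband was occupationally exposed
    married_months: months of the year the couple was married
    age_jan1: woman's age on January 1 (between fifteen and forty-nine)
    parity_jan1: number of children borne as of January 1
    """
    z0 = 1.0                                      # constant term
    z1 = 1.0 if exposed_months > 6 else 0.0       # exposed more than six months
    z2 = 1.0 if married_months < 6 else 0.0       # married less than half the year
    z3 = log(age_jan1)                            # natural logarithm of age
    z4 = float(parity_jan1)                       # parity
    z5 = z4 ** 2                                  # parity squared (label assumed)
    return [z0, z1, z2, z3, z4, z5]

# Hypothetical woman-year: husband exposed 8 months, married all year.
z = covariate_vector(exposed_months=8, married_months=12, age_jan1=25.0, parity_jan1=2)
print(z)
```

Because the vector is rebuilt for every calendar year, changes in exposure, marital status, age and parity are picked up year by year.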
A further word of explanation is offered regarding the model
covariates. Originally, the age-related covariate introduced into the
model was age.
Subsequently, an age squared term was added to the
covariate vector, analogous to the inclusion of parity and parity
squared.
However, convergence was not achieved for the model with age
squared and age as covariables, so that this pair of covariates was
replaced with ln(age).
For all practical purposes the women in the study cohort were
white, so race was not included as a covariate in the model.
Birth
cohort information is considered in the fertility hazard rate model
through the national birth rates.
2.3 A Multiplicative Exponential Fertility Hazard Function Model
In general, the multiplicative model for the hazard function for
positive time $t$ is
$$\lambda(t; z, \beta) = \lambda_0(t)\, g(z, \beta), \tag{2.1}$$
where $z$ is a vector of covariables and $\beta$, a vector of unknown
parameters.
In (2.1) $\lambda_0(t)$, the underlying hazard of unspecified form,
is a function of time $t$ not depending on $z$ or $\beta$ such that $\lambda_0(t)$
is positive for all time $t$.
Also, $g(\cdot)$ is a function of the
covariables and of the unknown parameters.
Assume further that the
underlying hazard remains approximately constant over time $t$,
$\lambda_0(t) = \lambda_0$, and that $g(z, \beta) = \exp\{\beta' z\}$.
Then the hazard rate function
(2.1) can be expressed as
$$\lambda(z, \beta) = \lambda_0 \exp\{\beta' z\}. \tag{2.2}$$
This represents the multiplicative exponential model for the hazard
rate where the underlying hazard remains constant over the time
interval of interest.
Inspection of (2.2) reveals that the effect of any covariate is to
act multiplicatively on the baseline hazard rate function $\lambda_0$. This
multiplicative effect of the covariates on the hazard is known as the
proportional hazards model (Cox, 1972). The proportional hazards model
requires no distributional assumptions about the hazard rate function;
it only assumes that the hazard curves for individuals are proportional.
With the traditional parametric models, a specific shape for an
individual's hazard rate function must be specified.
Breslow (1975,
1977) thoroughly explores the methodology "for the statistical analysis
of censored survival data which arise from a model in which the factors
under investigation act multiplicatively on the hazard function of an
underlying nonparametric survival distribution."
The proposed hazard function model for fertility evaluation is based
on (2.2).
Before proceeding with this model, some notation is
clarified.
Assume the period of observation for each and every woman is
continuous.
Let $i$ index the women, each with specific race and birth
cohort characteristics.
Note that each woman retains this unique
identification ($i$) for the duration of the study.
The subscript $j$
indexes the calendar years defining the total observational period.
Considering the retrospective nature of the data collection process, the
observational period begins with the calendar year of the fifteenth
birth date of the oldest wife of any worker interviewed at the chemical
manufacturing plant and ends with the year of the interview.
Note that $j$
also reflects changes in the woman's age and parity as time
progresses.
The data collected for the $i$th woman for the $j$th calendar year are
now described. Let $t_{ij}$ be the time of the birth to the $i$th woman in
the $j$th calendar year; or if no birth occurs, $t_{ij}$ corresponds to the
end of follow-up for the $i$th woman during this year. In either case,
$t_{ij}$ is a date that has been converted into a fraction of a year interval
($0 \le t_{ij} \le 1$).
Also consider a vector $z_{ij} = (z_{1ij}, z_{2ij}, \ldots, z_{rij})$
of $r$ covariables for the $i$th woman for the $j$th calendar
year which are not explicitly a function of $t_{ij}$. The observed value of
the $k$th covariable on the $i$th woman for the $j$th calendar year ($z_{kij}$) may
be continuous or discrete ($k = 1, 2, \ldots, r$).
The covariates $z_{ij}$ focus on
the sources of heterogeneity among and within the $N$ women in the
cohort with respect to their reproductive risk.
Given these data, the proposed multiplicative exponential model for
the fertility hazard function for the $i$th woman for the $j$th year is
$$\lambda_{ij}(z_{ij}, \beta) = B_{ij} \exp\{\beta' z_{ij}\}, \tag{2.3}$$
where $B_{ij}$, the U.S. birth rate for a woman with specific race, birth
cohort and parity characteristics during calendar year $j$, estimates the
underlying fertility hazard for a woman in the U.S. population with the
same characteristics as the $i$th woman in the $j$th year. The model (2.3)
assumes a constant fertility hazard for woman ($i$) in each year interval
$[j, j+1)$ as well as proportional hazards among women given their race,
age and parity in each year ($j$).
The birth rates ($B_{ij}$'s) are not directly available, but were
obtained by a transformation of race by birth cohort by parity by calendar year specific birth probabilities (National Center for Health
Statistics, 1976).
Note that these published probabilities describe the
fertility experience of women without regard to marital status.
Let $Q_{ij}$
denote the birth probability during calendar year $j$ for a woman
in the U.S. population with race, birth cohort and parity characteristics summarized by $i$.
The transformation used to obtain the birth
rates from these published birth probabilities, $B_{ij} = -\ln(1 - Q_{ij})$, is
frequently employed in survival analysis to translate rates into proportions and vice versa; see p. 101 in Elandt-Johnson and Johnson (1980).
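A brief numerical sketch of this transformation and of the hazard in (2.3) follows. The function names and the input values are hypothetical; the birth probability shown is not taken from the National Center for Health Statistics tables.

```python
from math import exp, log1p

def birth_rate_from_probability(q):
    """Convert an annual birth probability Q into a birth rate B = -ln(1 - Q)."""
    return -log1p(-q)

def fertility_hazard(q, beta, z):
    """Hazard B * exp(beta'z) for one woman-year, with B derived from Q."""
    b = birth_rate_from_probability(q)
    linear_predictor = sum(b_k * z_k for b_k, z_k in zip(beta, z))
    return b * exp(linear_predictor)

q = 0.10                                   # illustrative birth probability
b = birth_rate_from_probability(q)
# With all coefficients zero the hazard reduces to the national birth rate.
baseline = fertility_hazard(q, beta=[0.0, 0.0], z=[1.0, 1.0])
# A hypothetical exposure coefficient of -0.3 scales the hazard by exp(-0.3).
exposed = fertility_hazard(q, beta=[0.0, -0.3], z=[1.0, 1.0])
print(b, baseline, exposed / baseline)
```

The ratio in the last line equals exp(-0.3) regardless of the birth probability used, which is the multiplicative interpretation of a regression coefficient in this model.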
Breslow et al. (1983) employ a model analogous to (2.3) to investigate the association between exposure to arsenic and death due to
respiratory cancer among a cohort of smelter workers. Their assumption
that the baseline hazards are known from national mortality data is the
same as the assumption here regarding estimation of underlying fertility
hazards.
This research focuses on the multiplicative model of the hazard
function.
An additive model for the fertility hazard is a logical, but
not practical alternative for this application.
The structure of the
multiplicative model leads to simple yet elegant interpretations of the
regression parameter $\beta$ as will be illustrated shortly.
This is not so
with the additive model.
Moreover, evidence culled from the literature
consistently supports the multiplicative model.
Osborn (1975) presents
a multiplicative model for the evaluation of independent effects of two
associated factors (maternal age and parity) on two schedules of vital
statistics rates, stillbirth rates and infant mortality rates.
His
earlier modeling of still births and births occurring in the U.S. during
the period of 1937-1941 (Osborn, 1972) indicated that the multiplicative
model provided a better fit to the empirical distribution of still
births than did an additive model.
2.4 The Likelihood Function
Consider the construction of the likelihood function $L(\beta)$.
From
(2.3) it follows that the distribution of the waiting time for a birth
to occur to the $i$th woman during the $j$th year is exponential with survival function
$$S_{ij}(t_{ij} \mid z_{ij}, \beta) = \exp\left(-\int_0^{t_{ij}} B_{ij} \exp\{\beta' z_{ij}\}\, du\right). \tag{2.4}$$
Then the contribution to the likelihood by the $i$th woman for the
$j$th calendar year of her reproductive span, if a birth occurs during this
year, is
$$B_{ij} \exp\{\beta' z_{ij}\} \cdot \exp\left(-\int_0^{t_{ij}} B_{ij} \exp\{\beta' z_{ij}\}\, du\right). \tag{2.5}$$
The second factor of (2.5) is interpreted as the probability that the
$i$th woman does not give birth from the beginning of the $j$th year up to
time $t_{ij}$, and the first factor is just the instantaneous (relative)
rate of giving birth at time $t_{ij}$, a constant for $0 \le t_{ij} \le 1$. Since
the integrand in (2.5) is not a function of $u$, this equation can be
simplified to
$$B_{ij} \exp\{\beta' z_{ij}\} \cdot \exp\left(-t_{ij} B_{ij} \exp\{\beta' z_{ij}\}\right). \tag{2.6}$$
Then $t_{ij}$ is just the fraction of the $j$th year that the $i$th woman was
at risk of giving birth.
Similar calculations reveal the contribution to the likelihood by
the $i$th woman from the $j$th year, when no birth occurs, as
$$\exp\left(-t_{ij} B_{ij} \exp\{\beta' z_{ij}\}\right), \tag{2.7}$$
for $t_{ij}$, an interval of time no longer than one year. When the
$i$th woman is at risk of giving birth for the entire $j$th year, then
$t_{ij} = 1$; however, when the $i$th woman is lost to follow-up during the
$j$th year or censored by the end of the study, $t_{ij} < 1$.
Two indicator functions facilitate the construction of the likelihood function.
First, define the (birth) indicator function $\delta_{ij}$ as
$$\delta_{ij} = \begin{cases} 1, & \text{if the } i\text{th woman gives birth during year } j, \text{ and} \\ 0, & \text{otherwise.} \end{cases}$$
The second indicator function $\phi_{ij}$ is constructed because the reproductive
periods of the $N$ women in the study do not all begin in the same
calendar year.
Say the observational period spans a total of $n$ years
where $n_i \le n$, $i = 1, 2, \ldots, N$. Then define $\phi_{ij}$ to identify those
calendar years in the observational period that correspond to the
$i$th woman's reproductive years, namely,
$$\phi_{ij} = \begin{cases} 1, & \text{if the } i\text{th woman's reproductive span included the } j\text{th calendar year, and} \\ 0, & \text{otherwise.} \end{cases}$$
Considering the piecewise nature of this approach, the contribution to
the likelihood by the $i$th woman during the year $[j, j+1)$ is expressed
as
$$\frac{\lambda_{ij}(z_{ij}, \beta)^{\delta_{ij}}\, S_{ij}(t_{ij} \mid z_{ij}, \beta)}{S_{ij}(0 \mid z_{ij}, \beta)} = \lambda_{ij}(z_{ij}, \beta)^{\delta_{ij}}\, S_{ij}(t_{ij} \mid z_{ij}, \beta), \tag{2.8}$$
where $S_{ij}(\cdot)$ is defined as in (2.4). It follows that the contribution
to the likelihood by the $i$th woman over the $n_i$ years of her reproductive
history, $L_i(\beta)$, is
$$L_i(\beta) = \prod_{j=1}^{n} \left[\lambda_{ij}(z_{ij}, \beta)^{\delta_{ij}}\, S_{ij}(t_{ij} \mid z_{ij}, \beta)\right]^{\phi_{ij}} = \prod_{j=1}^{n} \left[\left(B_{ij} \exp\{\beta' z_{ij}\}\right)^{\delta_{ij}} \exp\left(-t_{ij} B_{ij} \exp\{\beta' z_{ij}\}\right)\right]^{\phi_{ij}}. \tag{2.9}$$
In calculating the yearly contributions to the likelihood (2.9) each
Bij is the birth cohort by age by parity by race specific birth rate
for the i th woman in her jth year. Thus, a woman's yearwise contributions to the likelihood, (2.6) and (2.7), are considered to be conditionally independent, i.e., conditional upon the Bij and covariates
for the jth year. Further discussion of this yearwise conditional independence assumption appears in the following section.
Under the assumption that the probability of giving birth is independent among the N women in the cohort, define the likelihood for all women as

$$L(\boldsymbol{\beta}) = \prod_{i=1}^{N} L_i(\boldsymbol{\beta}). \tag{2.10}$$

Substitution of (2.9) into (2.10) produces the desired overall likelihood

$$L(\boldsymbol{\beta}) = \prod_{i=1}^{N} \prod_{j=1}^{n} \left[ \left( B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \right)^{\delta_{ij}} \exp\left( -t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \right) \right]^{\phi_{ij}} \tag{2.11}$$

The construction of the likelihood (2.11) on a year by year basis accommodates annual changes in covariables.
This is a way of incorporating
time-dependent covariables into the model and is used by others (Breslow
[1975], Breslow et al. [1983]). In addition to the function of updating
the same woman's risk for the next year's reproductive experience, the
vector of covariates provides a regression type adjustment for heterogeneity among the study members with respect to characteristics predictive
of birth.
2.5 The Assumption of Yearwise Conditional Independence
of a Woman's Contribution to the Likelihood
The crucial assumption of conditional independence of a woman's
yearly contributions to the likelihood is elaborated upon further.
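Consider the random variable $T_{ij}$ which corresponds to the observed values $t_{ij}$. Now for the $i$th woman define $\mathbf{t}_i = (t_{i1}, t_{i2}, t_{i3}, \ldots, t_{in_i})$ as the vector of follow-up times associated with the $i$th woman, $\mathbf{z}_i = (\mathbf{z}_{i1}, \mathbf{z}_{i2}, \mathbf{z}_{i3}, \ldots, \mathbf{z}_{in_i})$ as the corresponding vector of covariables and, lastly, $\boldsymbol{\delta}_i = (\delta_{i1}, \delta_{i2}, \delta_{i3}, \ldots, \delta_{in_i})$ as the corresponding vector of birth indicators. The joint density

$$P\{\mathbf{t}_1, \ldots, \mathbf{t}_N \mid \mathbf{z}_1, \ldots, \mathbf{z}_N, \boldsymbol{\delta}_1, \ldots, \boldsymbol{\delta}_N\} \tag{2.12}$$

describes our experimental situation.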
The assumption that the women's reproductive experiences are independent justifies a factorization of (2.12) into the product

$$\prod_{i=1}^{N} P\{\mathbf{t}_i \mid \mathbf{z}_i, \boldsymbol{\delta}_i\}. \tag{2.13}$$

Using the fundamental relationship between joint and conditional densities, the contribution of the $i$th woman can be rewritten as

$$P\{\mathbf{t}_i \mid \mathbf{z}_i, \boldsymbol{\delta}_i\} = P\{t_{i1} \mid \mathbf{z}_{i1}, \delta_{i1}\} \cdot P\{t_{i2} \mid t_{i1}, \mathbf{z}_{i1}, \delta_{i1}, \mathbf{z}_{i2}, \delta_{i2}\} \cdots P\{t_{in_i} \mid t_{i1}, \ldots, t_{i,n_i-1}, \mathbf{z}_i, \boldsymbol{\delta}_i\}. \tag{2.14}$$

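Consider the following approximation of (2.14) where

$$P\{\mathbf{t}_i \mid \mathbf{z}_i, \boldsymbol{\delta}_i\} \approx P\{t_{i1} \mid \mathbf{z}_{i1}, \delta_{i1}\} \cdot P\{t_{i2} \mid \mathbf{z}_{i2}, \delta_{i2}\} \cdots P\{t_{in_i} \mid \mathbf{z}_{in_i}, \delta_{in_i}\}. \tag{2.15}$$
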
A comparison of the $j$th factors from (2.14) and (2.15),

$$P\{t_{ij} \mid \mathbf{z}_{ij}, \delta_{ij}\} = P\{t_{ij} \mid \mathbf{z}_{ij}, \delta_{ij}, t_{i1}, \ldots, t_{i,j-1}, \mathbf{z}_{i1}, \ldots, \mathbf{z}_{i,j-1}, \delta_{i1}, \ldots, \delta_{i,j-1}\}, \tag{2.16}$$

reveals that a Markov type assumption is being made in this survival model for fertility evaluation. It effectively presumes that the updated reproductive characteristics for woman $i$ for year $j$, summarized by the vector of covariates $\mathbf{z}_{ij}$ and the corresponding U.S. birth rate $B_{ij}$, contain all the relevant information to describe her birth risk in year $j$. Notice that some history is contained in current age and parity so this conditions on more information than a first order Markov process (Feller, 1968). In terms of the multiplicative exponential fertility hazard model, this Markov type property implies that the probability of a birth for the $i$th woman for the $j$th calendar year depends on this woman's current reproductive characteristics and not on the precise set of circumstances which led to this point in her fertility period. In other words, the reproductive characteristics (age, race, parity and cohort) for a woman at the beginning of a year contain all the information necessary to describe the likelihood of a birth for that year and any additional information on her reproductive history will not alter the likelihood of an outcome during this year. Put another way, if two women enter calendar year $j$ with the same set of reproductive characteristics, then their likelihood for a birth during this year is identical, regardless of how their specific reproductive paths may have differed in the past. On the other hand, if the reproductive paths of these two women coincide only for this one year in their entire reproductive periods, then their individual yearly contributions to the likelihood will differ for all other years except this one.
In their longitudinal analysis of the risk of coronary heart disease in the Framingham study, Woodbury et al. (1979) assume a similar conditionality principle. Moreover, empirical investigation of second and third order Markov processes in their research revealed that the additional information from the higher order Markov processes greatly complicated the model and that no risk variable measured more than one time into the past was statistically significant. There is additional evidence in the literature. When examining the effect of membership in transient states in the analysis of survival data under the proportional hazards model, Breslow (1975) assumes that "the risk of death is determined solely by the current state of each patient and not by his past history," an assumption that is equivalent to the Markov property.
The concept of conditional independence is not new in occupational
fertility investigations.
Kissling (1981) developed a uniform-logistic
regression model to describe the probability of a favorable pregnancy
outcome.
This model assumes independence of pregnancies within a family
unit conditional on the specific "background" risk associated with each
pregnancy.
Surprisingly, parameter estimates from this model closely
approximated those from a logistic regression model which assumes independence of pregnancy outcomes within the family unit.
2.6 Parameter Estimation
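The parameter $\boldsymbol{\beta}$ will be estimated by standard maximum likelihood techniques. The log-likelihood, $\ln L(\boldsymbol{\beta})$, of (2.11) is

$$\ln L(\boldsymbol{\beta}) = \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} \left[ \delta_{ij} \left( \ln B_{ij} + \boldsymbol{\beta}'\mathbf{z}_{ij} \right) - t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \right]. \tag{2.17}$$
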
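The log-likelihood (2.17) is a plain sum over woman-year records, which the following minimal Python sketch makes explicit. The record layout (dictionary keys `phi`, `delta`, `t`, `B`, `z`) and the numerical values are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
import math

def log_likelihood(beta, records):
    """Evaluate (2.17) over a list of woman-year records.

    Each record is a dict with hypothetical keys:
      phi   - 1 if the calendar year lies in the woman's reproductive span, else 0
      delta - 1 if a live birth occurred during the year, else 0
      t     - follow-up time within the year (0 < t <= 1)
      B     - matching U.S. birth rate for that woman-year
      z     - list of covariate values, same length as beta
    """
    total = 0.0
    for r in records:
        if r["phi"] == 0:
            continue  # years outside the reproductive span contribute nothing
        eta = sum(b * z for b, z in zip(beta, r["z"]))  # linear predictor beta'z
        total += r["delta"] * (math.log(r["B"]) + eta) - r["t"] * r["B"] * math.exp(eta)
    return total

# Illustrative (made-up) records: intercept plus one indicator covariate.
records = [
    {"phi": 1, "delta": 1, "t": 0.4, "B": 0.12, "z": [1.0, 0.0]},
    {"phi": 1, "delta": 0, "t": 1.0, "B": 0.10, "z": [1.0, 1.0]},
    {"phi": 0, "delta": 0, "t": 1.0, "B": 0.08, "z": [1.0, 1.0]},
]
value = log_likelihood([0.0, 0.0], records)
```

With all coefficients at zero the hazard equals the national rate $B_{ij}$, so the value is simply the log-likelihood of the external standard applied to the cohort.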
The partial derivative of (2.17) with respect to $\beta_k$ gives the likelihood equations, namely,

$$\frac{\partial \ln L(\boldsymbol{\beta})}{\partial \beta_k} = \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} \left[ \delta_{ij} z_{kij} - t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\}\, z_{kij} \right] = \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} z_{kij} \left[ \delta_{ij} - t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \right], \tag{2.18}$$

for $k = 1, \ldots, r$. In general, there are no explicit solutions to the system of equations corresponding to (2.18). Maximum likelihood estimates of $\boldsymbol{\beta}$, denoted $\hat{\boldsymbol{\beta}}$, must be obtained using some iterative procedure.
In the accompanying analyses, the FORTRAN subroutine MAXLIK (Kaplan and
Elston, 1972) has been used to perform a direct search of the likelihood
surface.
This is achieved by fitting parabolas to points on the likelihood surface.
The direct search can be followed by a numerically-based
Newton-Raphson iterative procedure to confirm the maximum likelihood
estimates.
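An estimate of the reliability of $\hat{\boldsymbol{\beta}}$ is available from large sample theory. Since the expected value of $\partial^2 \ln L(\boldsymbol{\beta}) / \partial \beta_k \partial \beta_l$ as a function of $t_{ij}$ is intractable, the variance-covariance matrix is approximated by the inverse of the observed Fisher information matrix (Efron and Hinkley, 1978). The second-order partial derivatives needed for this matrix are presented below. The off-diagonal elements are

$$\frac{\partial^2 \ln L(\boldsymbol{\beta})}{\partial \beta_k\, \partial \beta_l} = -\sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij}\, z_{kij}\, z_{lij}\, t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \tag{2.19}$$

where $k \ne l$ and $k, l = 1, 2, 3, \ldots, r$, and the diagonal elements are

$$\frac{\partial^2 \ln L(\boldsymbol{\beta})}{\partial \beta_k^2} = -\sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij}\, z_{kij}^2\, t_{ij} B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} \tag{2.20}$$

where $k = 1, 2, 3, \ldots, r$.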
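The score (2.18) and the second derivatives (2.19)-(2.20) are all that a Newton-Raphson iteration needs. The sketch below is a pure-Python illustration under stated assumptions; it is not the MAXLIK subroutine, and the record layout, the zero starting value, and the small Gaussian-elimination helper are choices made here for the example only.

```python
import math

def score_and_information(beta, records):
    """Score (2.18) and observed information, the negative of (2.19)-(2.20)."""
    r = len(beta)
    score = [0.0] * r
    info = [[0.0] * r for _ in range(r)]
    for rec in records:
        if rec["phi"] == 0:
            continue  # year outside the reproductive span
        eta = sum(b * z for b, z in zip(beta, rec["z"]))
        mu = rec["t"] * rec["B"] * math.exp(eta)  # t_ij * B_ij * exp(beta'z_ij)
        for k in range(r):
            score[k] += rec["z"][k] * (rec["delta"] - mu)
            for l in range(r):
                info[k][l] += rec["z"][k] * rec["z"][l] * mu
    return score, info

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def newton_raphson(records, r, tol=1e-10, max_iter=50):
    """Iterate beta <- beta + I(beta)^{-1} U(beta) from a zero start."""
    beta = [0.0] * r
    for _ in range(max_iter):
        score, info = score_and_information(beta, records)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def standard_errors(beta, records):
    """Square roots of the diagonal of the inverse observed information."""
    _, info = score_and_information(beta, records)
    r = len(beta)
    return [math.sqrt(solve(info, [1.0 if i == k else 0.0 for i in range(r)])[k])
            for k in range(r)]

# Hypothetical woman-years: z = [intercept, exposure indicator].
records = [
    {"phi": 1, "delta": 1, "t": 0.50, "B": 0.10, "z": [1.0, 0.0]},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.12, "z": [1.0, 0.0]},
    {"phi": 1, "delta": 1, "t": 0.25, "B": 0.15, "z": [1.0, 1.0]},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.20, "z": [1.0, 1.0]},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.18, "z": [1.0, 1.0]},
]
beta_hat = newton_raphson(records, r=2)
se = standard_errors(beta_hat, records)
```

Because the log-likelihood (2.17) is concave in the coefficients, the iteration settles quickly on these toy records; real data with sparse covariate patterns may need a better starting value or step control.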
2.7 Simple Illustration
Further insight into this survivorship model for fertility evaluation can be gleaned by examining a special case of the fertility hazard (2.3) containing only an intercept term and an occupational exposure indicator covariable. Specifically, let $z_{0ij} = 1$ for all $(i, j)$ and

$$z_{1ij} = \begin{cases} 1, & \text{if the husband of the $i$th woman was occupationally exposed to the suspect agent during calendar year $j$, and} \\ 0, & \text{otherwise.} \end{cases}$$

An explicit solution for the maximum likelihood estimates of these two regression coefficients, $\beta_0$ and $\beta_1$, exists.
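The likelihood equation for $\beta_0$ obtained from (2.18) is

$$\exp\{\hat{\beta}_0\} = \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} \delta_{ij} \right] \Big/ \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} t_{ij} B_{ij} \exp\{\hat{\beta}_1 z_{1ij}\} \right]. \tag{2.21}$$
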
Separate the denominator of (2.21) into two parts to reflect the husband's occupational exposure history, namely,

$$\sum_{i=1}^{N} \left[ \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} \exp\{\hat{\beta}_1 z_{1ij}\} + \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} t_{ij} B_{ij} \exp\{\hat{\beta}_1 z_{1ij}\} \right]. \tag{2.22}$$

Considering the definition of $z_{1ij}$, the first term in (2.22) sums over a woman's reproductive years when her husband is occupationally exposed and the second sums over the unexposed years. Recall that during reproductive years when the husband is occupationally exposed to suspect chemicals,
(2.23)
while during years that the husband did not experience exposure to hazardous agents in the working environment,
(2.24)
After substituting (2.23) and (2.24) into (2.22), (2.21) can be expressed as

$$\exp\{\hat{\beta}_0\} = \frac{\displaystyle \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} \delta_{ij}}{\displaystyle \exp\{\hat{\beta}_1\} \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} + \sum_{i=1}^{N} \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} t_{ij} B_{ij}}. \tag{2.25}$$

Now focus on $\hat{\beta}_1$. The likelihood equation, $\partial \ln L(\boldsymbol{\beta}) / \partial \beta_1$, from (2.18) associated with this simple model is

$$\sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} \exp\{\hat{\beta}_1 z_{1ij}\} = \exp\{\hat{\beta}_0\}^{-1} \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} \delta_{ij}. \tag{2.26}$$

Since $z_{1ij}$ is zero during those years of a woman's reproductive span when her husband does not experience adverse occupational exposure, only reproductive years ($\phi_{ij} = 1$) corresponding to the husband's exposed years ($z_{1ij} = 1$) make non-zero contributions to the double sums in (2.26). Since (2.23) holds during these remaining reproductive years, (2.26) can be expressed as

$$\exp\{\hat{\beta}_1\} \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} = \exp\{\hat{\beta}_0\}^{-1} \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} \delta_{ij}. \tag{2.27}$$

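Thus, rearrangement of (2.27) produces

$$\exp\{\hat{\beta}_1\} = \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} \delta_{ij} \right] \Big/ \left[ \exp\{\hat{\beta}_0\} \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} \right]. \tag{2.28}$$

Substitute (2.27) into (2.25) to obtain the maximum likelihood estimate

$$\exp\{\hat{\beta}_0\} = \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} \delta_{ij} \right] \Big/ \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} t_{ij} B_{ij} \right]. \tag{2.29}$$

Replacing $\exp\{\hat{\beta}_0\}$ in (2.28) with (2.29) gives:

$$\exp\{\hat{\beta}_1\} = \frac{\left[ \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} \delta_{ij} \right] \Big/ \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} z_{1ij} \phi_{ij} t_{ij} B_{ij} \right]}{\left[ \sum_{i=1}^{N} \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} \delta_{ij} \right] \Big/ \left[ \sum_{i=1}^{N} \sum_{j=1}^{n} (1 - z_{1ij}) \phi_{ij} t_{ij} B_{ij} \right]}. \tag{2.30}$$
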
With this simple model for the fertility hazard the maximum likelihood estimate for $\exp\{\hat{\beta}_0\}$ is just the ratio of the observed number of births occurring in years in which the husband is not occupationally exposed to hazardous chemicals to an anticipated number of births for these same years based on the schedule of the general U.S. birth rates. Insight into this anticipated number of births is obtained from the following rearrangement of the first likelihood equation (2.21):

$$\sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} t_{ij} B_{ij} \exp\{\hat{\beta}_0 + \hat{\beta}_1 z_{1ij}\} = \sum_{i=1}^{N} \sum_{j=1}^{n} \phi_{ij} \delta_{ij}. \tag{2.31}$$

An expression for the anticipated number of births occurring to women in the study cohort during the observation period is just the left hand side of (2.31), which must equal the total observed number of births in the right hand side of this equation.
Now $\exp\{\hat{\beta}_1\}$ is estimated by the ratio of observed to anticipated births during those years characterized by the male worker's (husband's) occupational exposure to suspect agents divided by the analogous ratio for unexposed years. Note that the numerator of this latter estimate is similar to the fertility ratio estimate proposed by Wong et al. (1979), although Wong used birth probabilities as if they were birth rates. Moreover, the denominator of (2.30) is $\exp\{\hat{\beta}_0\}$.
An interpretation of each of these estimates follows from their construction. Firstly, $\exp\{\hat{\beta}_0\}$ can be viewed as the constant multiple that scales the underlying fertility hazard to adjust for socio-economic differences between the external standard (U.S.) and the study cohort. Since the numerator of $\exp\{\hat{\beta}_1\}$ (2.30) focusses on those years of the husband's work history that are classified as exposed, and since the denominator is precisely $\exp\{\hat{\beta}_0\}$, then $\exp\{\hat{\beta}_1\}$ can be interpreted as a multiplicative factor further modifying the baseline fertility hazard to adjust for the additional effect of occupational exposure. When $\hat{\beta}_1$ has a value less than zero, this suggests that the male worker's occupational exposure to hazardous chemicals has had a deleterious effect on his reproductive health, resulting in a reduced number of live births.
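The closed forms (2.29) and (2.30) are ratios of observed to anticipated births, so they can be computed in a single pass over the woman-year records. The following Python sketch assumes a hypothetical record layout (keys `phi`, `delta`, `t`, `B`, `z1`) and uses made-up numbers purely for illustration.

```python
def closed_form_estimates(records):
    """Return (exp(beta0_hat), exp(beta1_hat)) from (2.29) and (2.30)."""
    observed = {0: 0.0, 1: 0.0}     # births, keyed by exposure indicator z1
    anticipated = {0: 0.0, 1: 0.0}  # sum of t_ij * B_ij, keyed by z1
    for r in records:
        if r["phi"] == 0:
            continue  # year outside the reproductive span
        observed[r["z1"]] += r["delta"]
        anticipated[r["z1"]] += r["t"] * r["B"]
    exp_beta0 = observed[0] / anticipated[0]                # (2.29)
    exp_beta1 = (observed[1] / anticipated[1]) / exp_beta0  # (2.30)
    return exp_beta0, exp_beta1

# Hypothetical woman-years: two unexposed (z1 = 0) and three exposed (z1 = 1).
records = [
    {"phi": 1, "delta": 1, "t": 0.50, "B": 0.10, "z1": 0},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.12, "z1": 0},
    {"phi": 1, "delta": 1, "t": 0.25, "B": 0.15, "z1": 1},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.20, "z1": 1},
    {"phi": 1, "delta": 0, "t": 1.00, "B": 0.18, "z1": 1},
]
exp_beta0, exp_beta1 = closed_form_estimates(records)
```

In this toy example one birth is observed against 0.17 anticipated births in unexposed years and one against 0.4175 in exposed years, so the exposure factor is below one.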
2.8 Testing Hypotheses about the Model
A standard approach for hypothesis testing is the likelihood ratio test utilizing maximum likelihood estimates from both a full and a restricted model. Specifically, let $\Omega$ be a parameter space and $\omega$, a subspace of $\Omega$ in which $p$ parameters have been restricted. Let $\hat{\boldsymbol{\beta}}_{\Omega}$ and $\hat{\boldsymbol{\beta}}_{\omega}$ be the maximum likelihood estimates of the parameter $\boldsymbol{\beta}$ in $\Omega$ and $\omega$, respectively. Then under fairly general regularity conditions

$$-2 \ln \left[ L(\hat{\boldsymbol{\beta}}_{\omega}) / L(\hat{\boldsymbol{\beta}}_{\Omega}) \right] \tag{2.32}$$

is asymptotically distributed as a chi-square variate with degrees of freedom equal to the number of restricted parameters $p$; see Wilks, 1938. This procedure will be utilized to determine if a regression coefficient, say $\beta_k$, differs significantly from zero. This is equivalent to testing the null hypothesis that the corresponding covariable had no multiplicative effect on the underlying fertility hazard. Of particular interest in this application is evidence of diminished fertility (live births) among male workers when occupationally exposed to the suspect chemical agents.
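The likelihood ratio statistic needs only the two maximized log likelihoods and an upper-tail chi-square probability. A minimal Python sketch follows; the function names are introduced here for illustration, the tail probability uses the standard finite-sum expressions for integer degrees of freedom, and the log likelihoods in the example are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite correction series.
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)  # x^(k-1) / (1 * 3 * ... * (2k - 1))
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total

def likelihood_ratio_test(loglik_restricted, loglik_full, p):
    """Return the statistic -2 ln[L(restricted) / L(full)] and its P-value on p df."""
    statistic = -2.0 * (loglik_restricted - loglik_full)
    return statistic, chi2_sf(statistic, p)

# Hypothetical log likelihoods with two restricted parameters.
statistic, p_value = likelihood_ratio_test(-105.0, -102.5, p=2)
```

Here the statistic is 5.0 on two degrees of freedom, so the P-value is exp(-2.5), roughly 0.08.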
2.9 Goodness-of-Fit of the Model
Hypothesis testing is appropriate only after the statistical model
has been shown to adequately fit the data.
Two distinct approaches to
the model selection process are examined in this section.
An approach in survival analysis with covariates is to initially fit
a general family of hazard functions which includes the hypothesized
hazard rate function as a special case of its parameters; see Cox (1961)
and Lewis (1981).
Consider the Weibull model with covariates, denoted by $\lambda_{ij}(t_{ij} \mid \mathbf{z}_{ij}, c, \boldsymbol{\beta})$ for the $i$th woman for the $j$th calendar year,

$$\lambda_{ij}(t_{ij} \mid \mathbf{z}_{ij}, c, \boldsymbol{\beta}) = \frac{c\, t_{ij}^{\,c-1} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\}}{\theta_{ij}^{\,c}}, \tag{2.33}$$

where positive $c$ is the shape parameter and positive $\theta_{ij}$, the scale parameter. The location parameter is assumed to be zero. Note that the Weibull model (2.33) is a function of time $t_{ij}$.
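Let $\theta_{ij} = B_{ij}^{-1}$, then (2.33) becomes

$$\lambda_{ij}(t_{ij} \mid \mathbf{z}_{ij}, c, \boldsymbol{\beta}) = c\, B_{ij}^{\,c}\, t_{ij}^{\,c-1} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\}. \tag{2.34}$$

When $c$ is constrained to be one, (2.34) reduces to the multiplicative exponential fertility hazard (2.3):

$$\lambda_{ij}(t_{ij} \mid \mathbf{z}_{ij}, c = 1, \boldsymbol{\beta}) = B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\}. \tag{2.35}$$
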
This relationship between the multiplicative Weibull and the multiplicative exponential hazard models is important. The validity of the assumption of a constant fertility hazard for each woman year of reproductive experience inherent in the multiplicative exponential fertility hazard model (2.3) can be investigated. This is achieved through the likelihood ratio test, where

$$-2 \ln \left[ L(c = 1, \hat{\boldsymbol{\beta}}) / L(\hat{c}, \hat{\boldsymbol{\beta}}) \right] \tag{2.36}$$

is asymptotically distributed as a chi-square variate with one degree of freedom and $L(\cdot)$ is the likelihood function corresponding to the Weibull hazard with covariates (2.34). The reader should be aware that since this goodness-of-fit test (2.36) occurs within a multiplicative framework, the assumption of proportional hazards is not under investigation.
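A short Python sketch can make the nesting of the two models concrete. The Weibull log-likelihood below is derived here from the hazard (2.34), whose cumulative hazard over $[0, t_{ij}]$ is $(B_{ij} t_{ij})^c \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\}$; the text does not print this expression, so it should be read as an assumption of the sketch, as should the record layout and the numerical values.

```python
import math

def weibull_log_likelihood(c, beta, records):
    """Log-likelihood implied by the Weibull hazard (2.34), summed over woman-years."""
    total = 0.0
    for r in records:
        if r["phi"] == 0:
            continue  # year outside the reproductive span
        eta = sum(b * z for b, z in zip(beta, r["z"]))
        log_hazard = (math.log(c) + c * math.log(r["B"])
                      + (c - 1.0) * math.log(r["t"]) + eta)
        cumulative_hazard = (r["B"] * r["t"]) ** c * math.exp(eta)
        total += r["delta"] * log_hazard - cumulative_hazard
    return total

def exponential_log_likelihood(beta, records):
    """Log-likelihood (2.17) of the multiplicative exponential model."""
    total = 0.0
    for r in records:
        if r["phi"] == 0:
            continue
        eta = sum(b * z for b, z in zip(beta, r["z"]))
        total += r["delta"] * (math.log(r["B"]) + eta) - r["t"] * r["B"] * math.exp(eta)
    return total

# Hypothetical records and coefficients.
records = [
    {"phi": 1, "delta": 1, "t": 0.4, "B": 0.12, "z": [1.0, 0.0]},
    {"phi": 1, "delta": 0, "t": 1.0, "B": 0.10, "z": [1.0, 1.0]},
]
beta = [0.3, -0.5]
weibull_at_one = weibull_log_likelihood(1.0, beta, records)
exponential = exponential_log_likelihood(beta, records)
```

At $c = 1$ the two functions agree term by term, which is why (2.36) is a one degree of freedom comparison.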
Another test, similar to the chi-square test of Taulbee (1977), is proposed to evaluate the goodness-of-fit provided by the multiplicative exponential model (2.3). Such chi-square tests detect general departures from the null hypothesis but are known to lack power (Cochran, 1952). This test procedure is now described. Let $\delta(t_{ij})$ be a random variable such that

$$\delta(t_{ij}) = \begin{cases} 1, & \text{if the $i$th woman gives birth at time $t_{ij}$ during the $j$th year; and} \\ 0, & \text{otherwise.} \end{cases}$$

The value of the survival function at time $t_{ij}$, $S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \boldsymbol{\beta})$, is the probability that the $i$th woman does not give birth ("survives") up to time $t_{ij}$ in the $j$th calendar year.
Consider the $j$th calendar year for the $i$th woman. The equality

$$\Pr\{\delta(t_{ij}) = 1\} = 1 - \Pr\{\delta(t_{ij}) = 0\}, \tag{2.37}$$

can be rewritten as

$$\Pr\{\delta(t_{ij}) = 1\} = 1 - S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \boldsymbol{\beta}), \tag{2.38}$$

where $S_{ij}(\cdot)$ is defined as in (2.4). Under the null hypothesis of an adequate fit, $E(\delta(t_{ij})) = 1 - S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \boldsymbol{\beta})$ and $\mathrm{var}(\delta(t_{ij})) = S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \boldsymbol{\beta}) \cdot [1 - S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \boldsymbol{\beta})]$, for the $i$th woman during the $j$th calendar year. This implies that the observed woman-years in the $j$th calendar year correspond to independent Bernoulli trials. The outcome of interest is a live birth, where the probability of this event differs among the woman-years depending on the value of the covariates for each woman observed during that year. Now if the number of woman-years during the $j$th calendar year is large enough and if the proposed model describes the data adequately, then

$$X_j = \frac{\displaystyle \sum_{i=1}^{N} \phi_{ij} \delta_{ij} - \sum_{i=1}^{N} \phi_{ij} \left[ 1 - S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \hat{\boldsymbol{\beta}}) \right]}{\sqrt{\displaystyle \sum_{i=1}^{N} \phi_{ij} \left[ S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \hat{\boldsymbol{\beta}}) \left( 1 - S_{ij}(t_{ij} \mid \mathbf{z}_{ij}, \hat{\boldsymbol{\beta}}) \right) \right]}} \tag{2.39}$$

is distributed approximately normal with mean zero and variance one under the null hypothesis and with the large sample normal approximation to binomial. It follows that the square of (2.39) is distributed approximately as $\chi^2$ with one degree of freedom.
The observation period may range over several decades where $X_j$ (2.39) is calculated for each calendar year $j$ under observation. To facilitate interpretation and to ensure the validity of asymptotic distribution theory, these calendar year-specific statistics are summed over sets of calendar years; the proposed grouping is arbitrary, but not data suggested. Strictly, $\delta(t_{ij})$ is not independent of $\delta(t_{ij'})$, particularly for $j$ close to $j'$. Let $J_k$ be the $k$th set of calendar years where the calendar years in each set are chosen so that the births occurring in these years are more likely to be independent. For example, let the total number of sets be ten ($x = 0, 1, 2, \ldots, 9$) where set $J_x$ contains every tenth calendar year: 194x, 195x, 196x, 197x, and so on. Under the assumption of independence

$$X^2 = \sum_{j \in J_k} X_j^2 \tag{2.40}$$

is distributed as a chi-square random variate with degrees of freedom equal to the number of calendar years in set $J_k$. Note that the degrees of freedom have not been adjusted for the number of parameters estimated in the fertility hazard model, since the number of woman-years contributing to (2.39) is so large as to make such an adjustment unnecessary. In this framework, if evaluation of (2.40) for each set $J_k$ does not result in a significant difference, then this will be considered as evidence for the goodness-of-fit of the multiplicative exponential fertility hazard.
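The statistics (2.39) and (2.40) can be sketched in a few lines of Python. The tuple layout `(phi, delta, S_hat)` and the example values are hypothetical; `S_hat` stands for the fitted survival probability of one woman in one calendar year.

```python
import math

def survival(t, B, eta):
    """Fitted survival exp(-t * B * exp(eta)) under the exponential hazard (2.3)."""
    return math.exp(-t * B * math.exp(eta))

def yearly_statistic(rows):
    """X_j of (2.39); each row is (phi, delta, S_hat) for one woman in year j."""
    observed = sum(phi * delta for phi, delta, _ in rows)
    expected = sum(phi * (1.0 - s) for phi, _, s in rows)
    variance = sum(phi * s * (1.0 - s) for phi, _, s in rows)
    return (observed - expected) / math.sqrt(variance)

def set_statistic(years_in_set):
    """X^2 of (2.40): sum of squared X_j over the calendar years in one set J_k."""
    return sum(yearly_statistic(rows) ** 2 for rows in years_in_set)

# Hypothetical year: ten women under observation, each with S_hat = 0.9, two births.
year = [(1, 1, 0.9), (1, 1, 0.9)] + [(1, 0, 0.9)] * 8
x_j = yearly_statistic(year)
x2 = set_statistic([year, year])  # a set made of two such years, so 2 d.f.
```

In the example one birth is expected and two are observed, giving $X_j = 1/\sqrt{0.9}$; with only ten woman-years the normal approximation would of course be poor, so the numbers serve only to trace the arithmetic.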
CHAPTER III
ANALYSIS OF OCCUPATIONAL REPRODUCTIVE DATA
USING THE MULTIPLICATIVE EXPONENTIAL FERTILITY
HAZARD RATE MODEL: FULL LIKELIHOOD FUNCTION
3.1 Introduction
In this chapter occupational reproductive data from two chemical manufacturing plants are studied using the fertility hazard model developed in Chapter II. Recall the multiplicative exponential hazard (2.3) for the $i$th woman during the $j$th calendar year

$$\lambda_{ij}(\mathbf{z}_{ij}, \boldsymbol{\beta}) = B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\},$$

with the male worker's occupational exposure, marital status, ln (age), parity, and parity squared as covariates. Toluene diamine (TDA) has been found to be mutagenic on the Ames test in animal studies. In these work environments, exposure to toluene diamine, separately or in conjunction with dinitrotoluene (DNT), is suspected of adversely affecting the male worker's reproductive health.
The reproductive histories collected at each plant are examined separately. Each plant-specific analysis begins with a brief description of the study cohort. Subsequently, the goodness-of-fit of the multiplicative exponential fertility hazard is evaluated from two approaches. Finally, model results are presented and interpreted. This chapter concludes with a recapitulation of the essential features of this fertility hazard model.
3.2 Analysis of Plant A
In April 1981, 226 males employed at Plant A were interviewed at
which time the fertility experience of their former and/or current
wives, totaling 238 women, was recorded.
Three of these male workers were black, and the remaining 223 were white. The essential characteristics of these fertility histories follow.
A total of 6073 woman-years and 566 live birth events were tabulated
from these reproductive histories.
The study observational period,
spanning forty-six calendar years, began in 1935.
Table 3.1 as well as
the subsequent four tables describe the observational experiment, where
woman-years can be viewed as potential fertility and live births as
observed or actual fertility.
Even after careful inspection of these
tables, it is difficult to determine how well the observed fertility can
be explained as a realization of the potential fertility and such concomitant factors as the woman's age, parity, race, and marital status,
and the male worker's occupational exposure.
Table 3.1 contains the
distribution of these woman-years and birth events by calendar year.
Table 3.2 presents the distribution of woman-years of reproductive
experience and live births by woman's age.
Because of the retrospective
nature of the observational period for each woman, the distribution of
woman-years of reproductive experience decreases monotonically as age
increases.
Examination of Table 3.3 reveals that among the women from Plant A approximately 38 percent of the observed births and 33 percent of the woman-years of fertility experience occurred at parity 0.
The distribution of reproductive years and of live births by occupational exposure of the male worker to TDA and by marital status are
presented in Tables 3.4 and 3.5, respectively.
Note that 151 of the
TABLE 3.1
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
AND BIRTHS BY CALENDAR YEAR: PLANT A

Calendar Year    Woman-Years    Births
1935                   1           0
1936                   5           0
1937                  11           0
1938                  13           1
1939                  15           1
1940                  18           1
1941                  25           2
1942                  35           4
1943                  43           2
1944                  47           3
1945                  60           6
1946                  72           3
1947                  79           9
1948                  95          12
1949                 107          12
1950                 116          10
1951                 133          15
1952                 140          16
1953                 150          20
1954                 157          24
1955                 161          24
1956                 167          34
1957                 169          32
1958                 174          35
1959                 180          29
1960                 182          29
1961                 188          23
1962                 194          29
1963                 195          29
1964                 199          26
1965                 203          16
1966                 203          17
1967                 206          13
1968                 206          10
1969                 200           9
1970                 203          11
1971                 201           5
1972                 200           5
1973                 188           8
1974                 176           7
1975                 172           8
1976                 156           5
1977                 148           3
1978                 141           4
1979                 123           4
1980                 115           9
1981                 101           1
TOTAL               6073         566
TABLE 3.2
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
AND BIRTHS BY WOMAN'S AGE: PLANT A

Age      Woman-Years    Births
15           238           1
16           238           3
17           238           8
18           238          13
19           238          38
20           238          50
21           236          55
22           234          43
23           233          43
24           229          42
25           224          50
26           222          41
27           216          37
28           214          26
29           208          16
30           200          26
31           196          16
32           190          16
33           185          12
34           182          11
35           175          11
36           170           3
37           168           1
38           164           1
39           158           0
40           155           0
41           149           1
42           146           1
43           138           0
44           129           0
45           104           1
46            20           0
47             0           0
48             0           0
49             0           0
TOTAL       6073         566
TABLE 3.3
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
BY WOMAN'S PARITY: PLANT A

Parity    Woman-Years    Births
0             2018         214
1             1061         173
2             1362         103
3              903          50
4              490          17
5              186           5
6               17           3
7               36           1
8+               0           0
TOTAL         6073         566
TABLE 3.4
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
BY OCCUPATIONAL EXPOSURE AND MARITAL STATUS: PLANT A

Male Worker's                      Marital Status (z2)
Occupational
Exposure (z1)         Married         Not Married      Total
Exposed                151 (100.0)*      0 (0.0)        151 (2.5)
Not Exposed           4483 (75.7)     1439 (24.3)      5922 (97.5)
Total                 4634 (76.3)     1439 (23.7)      6073

* Figures in parentheses are percents of row totals for the interior of
the table and percents of grand total for the margin of the table.
TABLE 3.5
THE NUMBER OF BIRTHS BY OCCUPATIONAL EXPOSURE
AND MARITAL STATUS: PLANT A

Male Worker's                      Marital Status (z2)
Occupational
Exposure (z1)         Married         Not Married      Total
Exposed                  8 (100.0)*      0 (0.0)          8 (1.4)
Not Exposed            537 (96.2)       21 (3.8)        558 (98.6)
Total                  545 (96.3)       21 (3.7)        566

* Figures in parentheses are percents of row totals for the interior of
the table and percents of grand total for the margin of the table.
6073 woman-years were "flagged" because of the male worker's exposure to TDA in the workplace; all of these exposed years occurred while the couple was married. Moreover, there were eight births (1.4%) in these years. The relative magnitude of "not married" woman-years is substantial: this category accounts for almost 24 percent of the total number of woman-years. Furthermore, there were 21 births during these years to the wives of the workers in Plant A.
The appropriateness of the proposed model in describing the fertility experience of the women in Plant A must be established, so the question of goodness-of-fit of the exponential fertility hazard within the multiplicative framework is now addressed. The maximum likelihood estimates of the regression parameters in the multiplicative exponential hazard rate (2.3) and the analogous estimates from the Weibull hazard rate with covariates (2.33) are presented in Table 3.6. This table also contains the estimated standard errors of these estimates and the log likelihood of each model. Inspection of Table 3.6 discloses that the estimated regression coefficients in each model are comparable in sign and magnitude, with the greatest difference occurring between the estimated regression coefficients corresponding to the covariable ln (age), a variable not likely to vary within a year interval.
In Section 2.9, the multiplicative exponential hazard was presented as a special case of the Weibull hazard function with the same multiplicative form for the covariates. In fact, comparison of the log likelihood of the estimates from the multiplicative exponential hazard, -1503.579, with that from the Weibull, -1502.963, reveals that the null hypothesis ($H_0$: $c = 1$) cannot be rejected (P = 0.27). This first
TABLE 3.6
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS
AND THE LOG LIKELIHOOD FOR THE MULTIPLICATIVE EXPONENTIAL (2.3)
AND THE MULTIPLICATIVE WEIBULL (2.34)
FERTILITY HAZARD MODEL: PLANT A

                        Multiplicative Exponential     Multiplicative Weibull
                                    Estimated S.E.                 Estimated S.E.
Parameter (β)           M.L.E.      of Estimator       M.L.E.      of Estimator
Intercept (z0)           1.942         0.902            2.113         0.909
Occ. Exposure (z1)      -0.489         0.357           -0.491         0.357
Marital Status (z2)     -2.450         0.231           -2.476         0.232
ln (age) (z3)           -0.373         0.290           -0.452         0.297
Parity (z4)             -0.699         0.083           -0.683         0.085
Parity² (z5)             0.101         0.016            0.098         0.016
Power Transform (c)      one           zero             0.961         0.035

Log Likelihood         -1503.579                      -1502.963
goodness-of-fit investigation supports the presumption of a constant
fertility hazard for the i th woman throughout the jth calendar year
relative to the Weibull generalization.
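The P-value quoted for this comparison can be retraced from the two log likelihoods in Table 3.6. The arithmetic below is a worked restatement of (2.36), assuming the tabled log likelihoods are exact to three decimals:

```latex
-2 \ln \frac{L(c = 1, \hat{\boldsymbol{\beta}})}{L(\hat{c}, \hat{\boldsymbol{\beta}})}
  = -2 \left[ -1503.579 - (-1502.963) \right]
  = 1.232,
\qquad
\Pr\{\chi^2_1 > 1.232\} \approx 0.27 .
```

As an informal cross-check, the Wald ratio for the power transform in Table 3.6, (0.961 - 1)/0.035, is about -1.11, which points to the same conclusion.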
Consider the second approach to goodness-of-fit outlined in
Chapter II, where goodness-of-fit is investigated for each of 10 sets
of calendar years.
Here the null hypothesis of goodness-of-fit of the
proposed model is evaluated relative to the more general alternative
hypothesis that the proposed model does not adequately describe the fertility data from Plant A.
The test statistics and P-values for each set
of years are displayed in Table 3.7, using the maximum likelihood estimates of the regression coefficients in the multiplicative exponential
fertility hazard as presented in the previous table.
The observed and expected numbers of births by calendar year set ($J_k$) are also presented. Recall that $\sum_{j \in J_k} \sum_i \delta_{ij}$ is the observed number of births during calendar years ($j$) in set $J_k$ and $\sum_{j \in J_k} \sum_i (1 - \hat{S}(t_{ij}))$ is the expected number for these same years. The test statistic $X^2$ is described in equation (2.40).
Appendix A contains a listing of observed and expected
numbers of births by individual calendar year.
To maintain an overall level of significance no greater than $\alpha$, evaluate the test for each of the 10 sets with level of significance $\alpha/10$. When $\alpha = 0.05$, then $\alpha/10 = 0.005$.
Clearly, the null hypothesis
of goodness-of-fit of the multiplicative model cannot be rejected by the
results in Table 3.7 or by a visual inspection of the observed and
expected numbers of births by calendar year in Appendix A.
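The grouping of calendar years into ten sets and the $\alpha/10$ screening rule can be retraced with a short Python sketch. The yearly birth counts are those listed in Table 3.1 and the P-values are those reported in Table 3.7; the variable names are introduced here for illustration only.

```python
# Births by calendar year, 1935 through 1981, as listed in Table 3.1.
births = [0, 0, 0, 1, 1, 1, 2, 4, 2, 3, 6, 3, 9, 12, 12, 10, 15, 16, 20, 24,
          24, 34, 32, 35, 29, 29, 23, 29, 29, 26, 16, 17, 13, 10, 9, 11, 5, 5,
          8, 7, 8, 5, 3, 4, 4, 9, 1]

# Set k (k = 1, ..., 10) holds every tenth calendar year starting at 1934 + k.
observed_by_set = [sum(births[start::10]) for start in range(10)]

# P-values reported in Table 3.7 for sets 1 through 10.
p_values = [0.641, 0.415, 0.815, 0.203, 0.773, 0.454, 0.100, 0.186, 0.592, 0.552]

alpha = 0.05
threshold = alpha / len(p_values)  # per-set level of significance, alpha/10
rejected_sets = [k + 1 for k, p in enumerate(p_values) if p < threshold]
```

The set totals reproduce the observed births column of Table 3.7, and no set falls below the 0.005 threshold.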
The results of both aspects of the goodness-of-fit investigation
support the multiplicative exponential hazard rate as a model for the
TABLE 3.7
OBSERVED AND EXPECTED NUMBERS OF BIRTHS, TEST STATISTICS
AND P-VALUES FOR TESTING GOODNESS-OF-FIT OF THE
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD
MODEL (2.3): PLANT A

       Calendar Year       Observed    Expected    Test Statistic
 k     Sets: J_k           Births      Births      X²                d.f.   P-value
 1     35,45,55,65,75        54        50.786        3.387            5      0.641
 2     36,46,56,66,76        59        52.914        5.007            5      0.415
 3     37,47,57,67,77        57        56.833        2.243            5      0.815
 4     38,48,58,68,78        62        60.155        7.239            5      0.203
 5     39,49,59,69,79        55        62.107        2.521            5      0.773
 6     40,50,60,70,80        60        61.341        4.695            5      0.454
 7     41,51,61,71,81        46        59.048        9.247            5      0.100
 8     42,52,62,72           54        53.292        6.183            4      0.186
 9     43,53,63,73           59        52.982        2.798            4      0.592
10     44,54,64,74           60        52.481        3.032            4      0.552

Observed births are $\sum_{j \in J_k} \sum_i \delta_{ij}$; expected births are $\sum_{j \in J_k} \sum_i (1 - \hat{S}_{ij}(t_{ij}))$.
reproductive histories of the 238 women from Plant A. Consequently, significance testing within this model is pursued. Maximum likelihood estimates of the regression coefficients and the estimated standard errors of these parameters for various forms of the multiplicative exponential hazard rate appear in Table 3.8. The parameter estimates for Model 1 were obtained on the unrestricted parameter space. In the remaining four models, the parameter space was restricted in order to test the significance of the constrained coefficient(s) through likelihood ratio test techniques.
Of particular interest in this research is the suspected impairment of the male worker's reproductive health due to on-the-job exposure to the chemical TDA. It is hypothesized that spermatogenesis dysfunction will manifest itself as a reduction in fertility, namely, live births. The estimated regression coefficient corresponding to the male worker's occupational exposure to TDA ($\hat{\beta}_1 = -0.489$) suggests a reduction in the baseline fertility of about 40 percent (0.61 = exp{-0.489}) associated with TDA exposure in the working environment when the remaining covariates are held constant. However, this reduction is not statistically significant (P = 0.14) after adjusting for the effects of the other variables, viz., marital status, ln (age), parity, and parity squared.
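The likelihood ratio statistics and the multiplicative exposure factor quoted here follow directly from the log likelihoods and the coefficient reported in Table 3.8. The Python sketch below recomputes them; the dictionary layout and helper name are illustrative, and the tail probability handles only the one and two degree of freedom cases that arise in this table.

```python
import math

# Log likelihoods of Models 1 through 5 as reported in Table 3.8.
loglik = {1: -1503.579, 2: -1504.682, 3: -1606.781, 4: -1504.414, 5: -1538.896}
# Number of coefficients fixed at zero in each restricted model.
restricted = {2: 1, 3: 1, 4: 1, 5: 2}

def chi2_upper_tail(x, df):
    """Closed forms for the chi-square upper tail with 1 or 2 d.f. only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 is handled in this sketch")

results = {}
for model, df in restricted.items():
    statistic = -2.0 * (loglik[model] - loglik[1])  # restricted versus Model 1
    results[model] = (statistic, chi2_upper_tail(statistic, df))

# Multiplicative factor on the fertility hazard for exposed years.
exposure_factor = math.exp(-0.489)
```

Model 2 against Model 1 gives the exposure test (statistic 2.206), and the exposure factor of about 0.61 corresponds to the reduction of roughly 40 percent described above.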
The underlying fertility hazard rate in model (2.3) is estimated by $B_{ij}$, race by age by birth cohort by parity specific U.S. national birth rates. The inclusion in the model of the covariates related to age ($z_3$) and parity ($z_4$ and $z_5$) is not redundant but reflects an additional adjustment to the underlying fertility hazard associated with age and parity. Thus, the age (or parity) covariate is viewed as an interaction of departure from the U.S. birth rates and age (or parity),
TABLE 3.8
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS AND THE LOG LIKELIHOOD FOR
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODELS FOR SELECTED PREDICTORS OF
BIRTH PROBABILITY: PLANT A

                                    Maximum Likelihood Estimates
Parameter (β)            Model 1      Model 2      Model 3      Model 4      Model 5
Intercept (β0)             1.942        1.992       -2.022        0.783        4.709
                         (0.902)†     (0.903)      (0.827)      (0.070)      (0.802)
Occ. Exposure (β1)        -0.489        zero*       -0.411       -0.504       -0.570
                         (0.357)                   (0.357)      (0.357)      (0.357)
Marital Status (β2)       -2.450       -2.447        zero*       -2.401       -2.181
                         (0.231)      (0.231)                   (0.228)      (0.230)
ln (age) (β3)             -0.373       -0.391        0.718        zero*       -1.424
                         (0.290)      (0.291)      (0.270)                   (0.250)
Parity (β4)               -0.699       -0.704       -0.375       -0.733        zero*
                         (0.083)      (0.083)      (0.088)      (0.079)
Parity² (β5)               0.101        0.102        0.043        0.102        zero*
                         (0.016)      (0.016)      (0.018)      (0.016)
Log Likelihood         -1503.579    -1504.682    -1606.781    -1504.414    -1538.896
χ²                           ---        2.206      206.404        1.670       70.634
(P-value)                              (0.14)       (0.0‡)       (0.20)       (0.0‡)

* Fixed at 0.0.
† Estimated standard errors are shown in parentheses in the body of the table.
‡ P-value is zero to three decimal places.
i.e., corrections to the U.S. fertility rates with age (or parity) for
the modeling of this plant.
The additional adjustment to the underlying fertility hazard $B_{ij}$
associated with $z_3$, the natural logarithm of age, is not statistically
significant (P = 0.20).
The overall significance of the two parity-related covariates, parity ($z_4$)
and parity squared ($z_5$), is examined next.
The significant P-value (P < 0.001) implies that a nonlinear
parity adjustment of the national birth rates is required.
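The χ² statistics and P-values of Table 3.8 follow from twice the difference in log likelihoods between the full model (Model 1) and each restricted model. The sketch below reproduces them using closed-form chi-square tail probabilities for 1 and 2 degrees of freedom; the function names are illustrative assumptions, not part of the original computations.

```python
import math

def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")

def likelihood_ratio_test(loglik_full, loglik_restricted, df):
    """Return the likelihood ratio statistic and its P-value."""
    statistic = 2.0 * (loglik_full - loglik_restricted)
    return statistic, chi2_upper_tail(statistic, df)

LOGLIK_FULL = -1503.579  # Model 1, Table 3.8

# Restricted model -> (log likelihood, number of coefficients fixed at zero)
RESTRICTED = {
    "Model 2 (no occupational exposure)": (-1504.682, 1),
    "Model 3 (no marital status)": (-1606.781, 1),
    "Model 4 (no ln age)": (-1504.414, 1),
    "Model 5 (no parity terms)": (-1538.896, 2),
}

results = {
    name: likelihood_ratio_test(LOGLIK_FULL, loglik, df)
    for name, (loglik, df) in RESTRICTED.items()
}
for name, (statistic, p_value) in results.items():
    print(f"{name}: chi-square = {statistic:.3f}, P = {p_value:.3f}")
```

Model 5 drops both parity and parity squared, so its statistic is referred to a chi-square distribution with 2 degrees of freedom; the other comparisons use 1.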
The marital status covariate (z2) is unique among the covariables
included in the hazard model.
The estimated underlying fertility hazard
($B_{ij}$) is specific with respect to race, age, birth cohort, and parity.
However, these $B_{ij}$'s were calculated irrespective of marital status.
The marital status mixture within the general U.S. population is not
readily available and, indeed, may be different from the marital status
mixture found in the occupationally-defined cohort of women from Plant A.
It is believed that the experience of unmarried women in the general U.S.
population, particularly at parity zero and at younger ages, may bias
comparisons of the fertility experience of the general U.S. population
with that of the study cohort, specifically underestimating the expected
number of births during married years.
To compensate for these potentially differing mixtures, a marital status indicator covariate is
introduced into the fertility hazard model.
Recall the definition of $z_2$ given in Section 2.2 whereby $z_2$ takes
the value of one when the
woman is single, divorced, or separated from, i.e., not married to, the
male worker for the majority of that year. This tacitly associates the
national birth rates with "married" experience, and a modification of
these rates with "unmarried" years.
The estimated coefficient corresponding
to this marital status covariable ($\hat{\beta}_2 = -2.450$) is significantly
different from zero (P < 0.001).
The large negative value of this coefficient implies that the baseline "married" fertility must be severely
reduced during unmarried years to reflect the fertility experience of
the Plant A cohort, as one might have expected.
An additional analysis was undertaken as a further aid in the
interpretation of this covariate.
Only married years of reproductive
experience were studied using the multiplicative exponential fertility
hazard without the marital status covariate.
These model results are presented in Table 3.9.
The parameter estimates for model (2.3) shown
in the previous table are also displayed in this table to facilitate
comparisons.
For each of the five parameters in common, the estimates
from each model are quite similar.
These results support the inclusion
of this covariate in the fertility hazard rate model.
Another means of describing the multiplicative exponential fertility
hazard is pursued: model-specific estimates of the fertility hazard
rate are calculated.
Table 3.10 displays characteristics (occupational
exposure, marital status, age, and parity) of some "typical" women in
the Plant A study cohort during the observational period. The estimated
fertility hazard for the $i$th woman in the cohort during calendar year $j$,
$\hat{\lambda}_{ij}$, reflects the multiplicative effect of the covariates
($\mathbf{z}_{ij}$) on the
underlying hazard rate $B_{ij}$. As an example, consider an eighteen year
old, married woman with one child whose husband did not experience on-the-job exposure to TDA while employed at Plant A during 1952.
Based on
the reproductive experience of the study cohort, this particular woman's
estimated fertility rate ($\hat{\lambda}_{ij} = 0.4853$) is 30 percent higher than that
($B_{ij} = 0.3719$) of a woman in the general U.S. population with similar
age, race, birth cohort, and parity characteristics in 1952.
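The worked example can be reproduced from the Model 1 estimates of Table 3.8. The sketch below assumes that the intercept enters the multiplier as a covariate fixed at one, which is consistent with the tabulated values; the dictionary keys and function names are illustrative only, and small discrepancies arise because the published coefficients are rounded.

```python
import math

# Model 1 estimates for Plant A (Table 3.8), rounded as published.
BETA = {
    "intercept": 1.942,
    "exposed": -0.489,      # z1: husband occupationally exposed
    "not_married": -2.450,  # z2: not married for the majority of the year
    "ln_age": -0.373,       # z3: natural logarithm of age
    "parity": -0.699,       # z4
    "parity_sq": 0.101,     # z5
}

def covariate_multiplier(exposed, not_married, age, parity):
    """exp{beta'z}: the factor applied to the U.S. birth rate B_ij."""
    linear_predictor = (
        BETA["intercept"]
        + BETA["exposed"] * exposed
        + BETA["not_married"] * not_married
        + BETA["ln_age"] * math.log(age)
        + BETA["parity"] * parity
        + BETA["parity_sq"] * parity ** 2
    )
    return math.exp(linear_predictor)

def fertility_hazard(birth_rate, exposed, not_married, age, parity):
    """Estimated hazard: U.S. birth rate times the covariate multiplier."""
    return birth_rate * covariate_multiplier(exposed, not_married, age, parity)

# Eighteen year old married woman, parity one, husband unexposed, 1952.
multiplier_1952 = covariate_multiplier(exposed=0, not_married=0, age=18, parity=1)
hazard_1952 = fertility_hazard(0.3719, exposed=0, not_married=0, age=18, parity=1)
print(f"multiplier = {multiplier_1952:.3f}, hazard = {hazard_1952:.4f}")
```

A multiplier above one indicates that the cohort's fitted rate exceeds the national rate for a woman with the same race, age, birth cohort, and parity.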
TABLE 3.9
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS FOR
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODEL, ALL YEARS
AND MARRIED YEARS ONLY, FOR SELECTED PREDICTORS OF
BIRTH PROBABILITIES: PLANT A
TABLE 3.10
ESTIMATED FERTILITY HAZARDS FOR WOMEN WITH
SELECTED CHARACTERISTICS: PLANT A

Calendar   Occupational   Marital
Year (j)   Exposure       Status         Age   Parity   B_ij     exp{β̂'z_ij}   λ̂_ij
1955       No             Not married    18    0        0.0993   0.205          0.0204
1955       No             Married        18    0        0.0993   2.373          0.2356
1952       No             Married        18    1        0.3719   1.305          0.4853
1968       Yes            Married        34    2        0.0454   0.425          0.0193
1948       No             Married        40    3        0.0258   0.537          0.0139
1963       No             Married        40    1        0.0145   0.969          0.0141
1963       Yes            Married        40    1        0.0145   0.594          0.0086
To illustrate the impact of marital status on the estimated fertility
hazard $\hat{\lambda}_{ij}$, consider an eighteen year old woman in 1955 with no
children ($B_{ij} = 0.0993$).
Moreover, there is no occupational exposure to TDA.
If this woman were married, the model-specific estimate of her
fertility hazard is 0.2356 as compared to 0.0204 if she were not
married.
The effect of occupational exposure to TDA on the underlying
fertility hazard can be demonstrated similarly.
The underlying fertility hazard ($B_{ij}$) for a 40 year old woman of parity one in 1963 is
0.0145.
This woman is married.
If her husband is occupationally
exposed to TDA at Plant A, this woman's estimated fertility rate is
0.0086.
In the absence of occupational exposure to TDA, this estimate
increases to 0.0141.
3.3 Analysis of Plant B
Consider employees in a chemical manufacturing plant who are
stationed in such areas as packaging/shipping or who, working as maintenance personnel, must travel throughout the physical facility.
Often
it is difficult to categorize such individuals with respect to their
exposure status.
Moreover, in any manufacturing facility, exposure to
chemicals other than the compound under investigation may also be
suspected of impairing the male worker's reproductive health.
Inclusion
of such data in fertility evaluation analyses may confound study
results.
One approach to maintaining clearly defined exposure categories
in the analysis of occupational fertility data with the multiplicative exponential fertility hazard model (2.3) is to ignore those
woman-years of reproductive experience where the male worker's exposure
status is questionable.
Since such reproductive years can be considered
to contain vague information with respect to occupational exposure, they
will not be informative if included in the likelihood function (2.11).
This restriction of woman-years of experience is illustrated using data
from another chemical manufacturing plant, hereafter referred to as
Plant B.
With the manufacturing processes utilized in Plant B, it was difficult to differentiate occupational exposure to TDA from that to
dinitrotoluene (DNT).
The objective of the following investigation of
reproductive histories from Plant B is to assess the association between
on-the-job contact with TDA/DNT and fertility.
There are two phases to this analysis.
Initially, all woman-years of reproductive experience
from Plant B are analyzed.
Then, woman-years characterized by multiple
occupational exposure are eliminated from a subsequent analysis of the
fertility histories from this same manufacturing facility.
Results from
these two analyses will be compared to determine if elimination of vague
information from the likelihood with respect to occupational exposure
will either improve the precision of parameter estimates or reduce confounding of study results.
The operational definition of exposure must be modified to accommodate multiple exposures.
In this analysis of the Plant B study
cohort, any woman-year of reproductive experience is ignored if the
male worker spent more time in a job or a work area associated with
these multiple exposures than either in a TDA/DNT-exposed or a strictly
nonexposed activity or area. Now any remaining reproductive year was
characterized as exposed if the fraction of the year the male worker
was exposed to TDA/DNT was greater than the fraction of the year he
was unexposed; otherwise, the reproductive year was categorized as
unexposed.
The definition of the other covariates remains the same; see
Section 2.2 of Chapter II.
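The operational rule above can be written as a small decision function. This is a sketch under one reading of the text: a woman-year is dropped only when the multiple-exposure fraction exceeds both the TDA/DNT-exposed fraction and the nonexposed fraction. The function name, argument names, and return labels are hypothetical.

```python
def classify_woman_year(frac_exposed, frac_unexposed, frac_multiple):
    """Classify one woman-year from the fractions of the year the male worker
    spent in TDA/DNT-exposed, strictly nonexposed, and multiple-exposure work."""
    # Ignore the year when multiple-exposure work dominates both alternatives.
    if frac_multiple > frac_exposed and frac_multiple > frac_unexposed:
        return "ignored"
    # Otherwise the larger of the two remaining fractions decides; ties are unexposed.
    if frac_exposed > frac_unexposed:
        return "exposed"
    return "unexposed"

print(classify_woman_year(0.2, 0.2, 0.6))  # multiple-exposure work dominates
print(classify_woman_year(0.5, 0.3, 0.2))  # mostly TDA/DNT-exposed
print(classify_woman_year(0.3, 0.4, 0.3))  # mostly nonexposed
```

Ties between the exposed and unexposed fractions fall to "unexposed", matching the "otherwise" clause of the definition.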
A total of 167 white male workers in Plant B were interviewed in
October 1981, and the reproductive histories of 195 women, their current
or previous wives, were recorded.
There were 436 birth events and 3977
woman-years of reproductive experience associated with these workers and
their wives.
When reproductive years with vague occupational exposure
information were removed from the histories, these figures decreased to
424 live birth events and 3823 woman-years.
Distributions of these woman-years and births by calendar year, age,
and parity appear in Tables 3.11 through 3.13, respectively.
No births
occurred to women past 37 years of age, an age group that accounted for
over 10 percent of the woman-years of reproductive experience examined.
Inspection of the distribution of woman-years and births by parity
reveals that almost 40 percent of the reproductive years were parity
zero experience and that a similar proportion of births occurred at this
parity level.
There were no outstanding differences between the distributions
for the two definitions of occupational exposure.
Descriptive information regarding marital and occupational exposure
status can be gleaned from Tables 3.14 and 3.15.
Focussing on marital
status, 30 percent of the selected reproductive years of the study
cohort are described as "not married" woman-years.
Moreover, the number
of woman-years characterized by the male worker's occupational exposure
to TDA/DNT is 513 (12.9 percent); during these same years, 31 births
occurred to women in the Plant B study cohort.
The results of the fertility analysis of the Plant B cohort based on
all woman-years of reproductive experience are now discussed.
As
TABLE 3.11
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE AND
BIRTHS BY CALENDAR YEAR AND EXPOSURE DEFINITION: PLANT B

                  All Years           No Mult. Exp. Years
Calendar      Woman-                  Woman-
Year          Years     Births        Years     Births
1932              1          0            1          0
1933              1          0            1          0
1934              1          0            1          0
1935              2          0            2          0
1936              3          1            3          1
1937              4          1            4          1
1938              4          0            4          0
1939              7          0            7          0
1940              8          0            8          0
1941             10          0           10          0
1942             10          0           10          0
1943             12          2           12          2
1944             14          0           14          0
1945             18          0           18          0
1946             20          5           20          5
1947             23          1           23          1
1948             30          2           30          2
1949             36          4           36          4
1950             40          6           40          6
1951             41          3           41          3
1952             49         10           49         10
1953             50          5           50          5
1954             56          8           56          8
1955             61         16           61         16
1956             70          6           70          6
1957             75          8           75          8
1958             84         19           84         19
1959             92         19           90         19
1960            103         16           97         16
1961            104         15           98         15
1962            115         24          108         23
1963            122         15          116         15
1964            130         27          125         27
1965            134         19          129         19
1966            140         16          135         16
1967            143         18          136         18
1968            152         21          146         20
1969            155         15          144         13
1970            160         15          149         15
1971            159         22          155         21
1972            161         10          156          9
1973            159          8          154          7
1974            161          6          155          5
1975            166         12          163         11
1976            161         12          154         12
1977            161          8          147          8
1978            156         12          149         12
1979            142          8          135          8
1980            137          9          128          7
1981            134         12          124         11
TOTAL          3977        436         3823        424
TABLE 3.13
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
AND BIRTHS BY PARITY AND EXPOSURE DEFINITION: PLANT B

                  All Years           No Mult. Exp. Years
              Woman-                  Woman-
Parity        Years     Births        Years     Births
0              1495        175         1485        173
1               643        139          623        133
2               830         73          790         72
3               530         32          486         30
4               318         12          296         11
5                92          5           81          5
6                69          0           62          0
7                 0          0            0          0
8                 0          0            0          0
TOTAL          3977        436         3823        424
TABLE 3.14
THE NUMBER OF WOMAN-YEARS OF REPRODUCTIVE EXPERIENCE
BY OCCUPATIONAL EXPOSURE STATUS AND MARITAL STATUS
FOR ALL YEARS: PLANT B

Male Worker's                  Marital Status (z2)
Occupational
Exposure (z1)          Married          Not Married       Total
Exposed                 500 (97.5)*       13  (2.5)        513 (12.9)
Not Exposed            2321 (67.0)      1143 (33.0)       3464 (87.1)
Total                  2821 (70.9)      1156 (29.1)       3977

* Figures in parentheses are percents of row totals for the interior of
the table and percents of grand total for the margin of the table.
TABLE 3.15
THE NUMBER OF BIRTHS BY OCCUPATIONAL EXPOSURE STATUS
AND MARITAL STATUS FOR ALL YEARS: PLANT B

Male Worker's                  Marital Status (z2)
Occupational
Exposure (z1)          Married          Not Married       Total
Exposed                  31 (100.0)*       0  (0.0)         31  (7.1)
Not Exposed             384  (94.8)       21  (5.2)        405 (92.9)
Total                   415  (95.2)       21  (4.8)        436

* Figures in parentheses are percents of row totals for the interior of
the table and percents of grand total for the margin of the table.
before, the investigation of the fertility experience of the cohort of
women from Plant B is two-staged.
First, the goodness-of-fit of the
multiplicative exponential fertility hazard function is examined to
determine if the proposed fertility hazard adequately summarizes the
reproductive histories of the Plant B cohort.
Once the adequacy of the
model has been demonstrated, the regression coefficients are examined
for significance.
To avoid unnecessary repetition, this presentation
parallels that for Plant A.
Results from fitting a multiplicative exponential and a multiplicative Weibull fertility hazard to Plant B reproductive history data are
presented in Table 3.16.
The regression coefficient estimates are
remarkably similar with the greatest absolute difference occurring
between the parameter estimates associated with $\ln$ (age), $z_3$.
A comparison of the log likelihood from the multiplicative exponential model
(-1055.868), i.e., the Weibull model with the shape parameter fixed at 1.0,
with that from the multiplicative Weibull fertility hazard
function (-1055.824) through the
likelihood ratio statistic implies that $H_0$: $c = 1$ cannot be rejected
(P = 0.77). This result supports the use of the multiplicative exponential fertility hazard function to summarize the fertility experience of
the Plant B cohort versus the multiplicative Weibull hazard.
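The test of $H_0$: $c = 1$ can be checked from the two log likelihoods in Table 3.16. As a cross-check that is not part of the original analysis, the sketch below also computes a Wald statistic from the estimated shape parameter and its standard error; both statistics are referred to a chi-square distribution with one degree of freedom. The function and variable names are illustrative.

```python
import math

def chi2_1df_upper_tail(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Log likelihoods from Table 3.16.
loglik_exponential = -1055.868  # shape parameter fixed at one
loglik_weibull = -1055.824      # shape parameter estimated

lr_statistic = 2.0 * (loglik_weibull - loglik_exponential)
p_lr = chi2_1df_upper_tail(lr_statistic)

# Wald cross-check (an added illustration, not reported in the source).
c_hat, se_c_hat = 1.012, 0.041
wald_statistic = ((c_hat - 1.0) / se_c_hat) ** 2
p_wald = chi2_1df_upper_tail(wald_statistic)

print(f"LR statistic   = {lr_statistic:.3f}, P = {p_lr:.2f}")
print(f"Wald statistic = {wald_statistic:.3f}, P = {p_wald:.2f}")
```

Agreement between the two P-values is expected here because the estimated shape parameter lies well within one standard error of one.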
There is additional evidence for the appropriateness of the proposed
fertility hazard function.
Contrast the observed and expected numbers
of births by calendar year set presented in Table 3.17.
Corresponding
data by individual calendar year appear in Appendix B.
To maintain an
overall significance level of $\alpha = 0.05$,
set the significance level for
each calendar year set specific test at $\alpha/10 = 0.005$.
The vector of
P-values in Table 3.17 supports the goodness-of-fit of the multiplicative
TABLE 3.16
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS
AND THE LOG LIKELIHOOD FOR THE MULTIPLICATIVE EXPONENTIAL (2.3)
AND THE MULTIPLICATIVE WEIBULL (2.34)
FERTILITY HAZARD MODEL: PLANT B

                          Multiplicative Exponential      Multiplicative Weibull
                                      Estimated S.E.                  Estimated S.E.
Parameter (β)             M.L.E.      of Estimator        M.L.E.      of Estimator
Intercept (β0)             2.548      1.028                2.523      1.033
Occ. Exposure (β1)        -0.236      0.191               -0.232      0.191
Marital Status (β2)       -2.493      0.231               -2.488      0.231
ln (age) (β3)             -0.447      0.337               -0.430      0.342
Parity (β4)               -0.790      0.104               -0.798      0.107
Parity² (β5)               0.098      0.022                0.099      0.023
Power Transform (c)        one        zero                 1.012      0.041
Log Likelihood         -1055.868                       -1055.824
TABLE 3.17
OBSERVED AND EXPECTED NUMBERS OF BIRTHS, TEST STATISTICS
AND P-VALUES FOR TESTING GOODNESS-OF-FIT OF THE
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD
MODEL (2.3): PLANT B

       Calendar Year       Observed    Expected    Test Statistic
  k    Set J_k             Births      Births      χ²               d.f.    P-value
  1    32,42,52,62,72      44          37.932       6.658           5       0.247
  2    33,43,53,63,73      30          38.777       6.026           5       0.304
  3    34,44,54,64,74      41          42.354      12.311           5       0.031
  4    35,45,55,65,75      47          40.518       7.368           5       0.195
  5    36,46,56,66,76      40          42.370       7.712           5       0.173
  6    37,47,57,67,77      36          44.578       7.577           5       0.181
  7    38,48,58,68,78      54          44.328       7.413           5       0.192
  8    39,49,59,69,79      46          47.035       2.551           5       0.769
  9    40,50,60,70,80      46          47.672       0.564           5       0.990
 10    41,51,61,71,81      52          48.137      11.899           5       0.036

Observed births: $\sum_i \sum_{j \in J_k} \delta_{ij}$. Expected births: $\sum_i \sum_{j \in J_k} (1 - S_{ij}(t_{ij}))$.
exponential fertility hazard in a piecewise survival model in this
occupational fertility setting.
However, it is not as good as that
observed for Plant A.
Perusal of Appendix B reveals several calendar
years where the observed and anticipated number of births are not in
close agreement (e.g., 1951, 1955, 1962, 1964, 1971, and 1974).
Visual
examination of the characteristics of the woman-years of reproductive
experience associated with several of these calendar years as compared
to the characteristics of those woman-years associated with the previous
and subsequent calendar year did not reveal any obvious differences in
the values of the covariates.
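The Bonferroni-type comparison described for the ten calendar year sets amounts to checking each tabulated P-value against 0.05/10. A minimal sketch using the P-values of Table 3.17 follows; the variable names are illustrative.

```python
# P-values by calendar year set k, as tabulated in Table 3.17.
P_VALUES = {
    1: 0.247, 2: 0.304, 3: 0.031, 4: 0.195, 5: 0.173,
    6: 0.181, 7: 0.192, 8: 0.769, 9: 0.990, 10: 0.036,
}

overall_alpha = 0.05
per_test_alpha = overall_alpha / len(P_VALUES)  # 0.005 for ten sets

# Calendar year sets for which goodness-of-fit would be rejected.
rejected_sets = [k for k, p in P_VALUES.items() if p < per_test_alpha]
smallest_p = min(P_VALUES.values())

print(f"per-test level = {per_test_alpha:.3f}")
print(f"smallest P     = {smallest_p:.3f}")
print(f"rejected sets  = {rejected_sets}")
```

Sets 3 and 10 would be flagged at an unadjusted 0.05 level, which is in line with the remark that the fit is not as good as that observed for Plant A.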
With the above support for the general adequacy of this piecewise
fertility hazard model (2.3), the statistical significance of the
regression coefficients is examined.
Refer to Table 3.18 for the model
results required for this investigation.
Of primary importance in this occupational health study is the association between the male workers' on-the-job exposure to TDA/DNT ($z_1$)
and fertility.
The estimated regression coefficient for occupational
exposure ($\hat{\beta}_1 = -0.236$) translates into a 21 percent reduction in
fertility ($0.79 = \exp\{-0.236\}$) associated with on-the-job exposure to
TDA/DNT when the remaining variables are held constant.
However, this
decrement is not statistically significant (P = 0.19).
The lack of specificity of the U.S. birth rates ($B_{ij}$) with respect
to marital status was described in the previous section.
It is not
surprising that the estimated coefficient for marital status ($z_2$) for
the Plant B cohort ($\hat{\beta}_2 = -2.493$), which is statistically significant
(P < 0.001), suggests a decrease in fertility during unmarried years.
Once again, when the analysis was restricted to "married" years, the
TABLE 3.18
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS AND THE LOG LIKELIHOOD FOR
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODELS FOR SELECTED PREDICTORS OF
BIRTH PROBABILITY: PLANT B

                                    Maximum Likelihood Estimates
Parameter (β)            Model 1      Model 2      Model 3      Model 4      Model 5
Intercept (β0)             2.548        2.828       -0.020        1.186        6.741
                         (1.028)†     (1.008)      (0.968)      (0.080)      (0.927)
Occ. Exposure (β1)        -0.236        zero*       -0.020       -0.290       -0.075
                         (0.191)                   (0.191)      (0.187)      (0.190)
Marital Status (β2)       -2.493       -2.482        zero*       -2.465       -2.148
                         (0.231)      (0.231)                   (0.230)      (0.229)
ln (age) (β3)             -0.447       -0.545        0.150        zero*       -2.010
                         (0.337)      (0.329)      (0.320)                   (0.295)
Parity (β4)               -0.790       -0.782       -0.290       -0.833        zero*
                         (0.104)      (0.104)      (0.108)      (0.099)
Parity² (β5)               0.098        0.097        0.011        0.099        zero*
                         (0.022)      (0.023)      (0.024)      (0.022)
Log Likelihood         -1055.824    -1056.673    -1162.601    -1056.761    -1095.770
χ²                           ---        1.698      213.554        1.874       79.892
(P-value)                              (0.19)       (0.0‡)       (0.17)       (0.0‡)

* Fixed at 0.0.
† Estimated standard errors are shown in parentheses in the body of the table.
‡ P-value is zero to three decimal places.
five parameter estimates from a model without the marital status
covariate were similar to the corresponding five regression coefficients
estimated in Model 1 of Table 3.18.
Finally, an additional adjustment
to the U.S. fertility experience for $\ln$ (age) ($z_3$)
was not statistically significant (P = 0.17), whereas a further adjustment to these
national birth rates for parity ($z_4$)
and parity squared ($z_5$)
was necessary
in order to model the reproductive experience of this occupationally
defined cohort of women.
Efforts were made to clarify the relationship between TDA/DNT exposure and fertility in Plant B by removing from the analysis file those
reproductive years characterized by the husband's exposure to multiple
compounds.
Recall that 154 woman-years were "flagged" because of
multiple occupational exposures; see Table 3.11.
Model results
obtained from the modified data file are presented in Table 3.19.
To
aid the reader, the parameter estimates and the log likelihood obtained
from the complete data file are repeated in this table.
As before,
Model 1 is the full model and Model 2, the restricted model needed to
test the significance of the coefficient related to occupational
exposure.
There are several noteworthy results.
First, a comparison of the
estimated standard errors of the regression coefficients between the two
full models in Table 3.19 indicates that the precision of the maximum
likelihood estimates did not improve with the relatively more strict
definition of occupational exposure
utilized in the latter analysis.
Moreover, the P-value associated with the test of statistical significance of the regression coefficient associated with occupational exposure increased from 0.19 to 0.24.
Finally, the estimated regression
TABLE 3.19
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS AND THE LOG LIKELIHOOD FOR
MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODELS FOR SELECTED PREDICTORS OF
BIRTH PROBABILITY BASED ON ALL WOMAN-YEARS AND ON WOMAN-YEARS NOT CHARACTERIZED
BY MULTIPLE EXPOSURE: PLANT B

                           All Woman-Years            No Mult. Exposure Years
Parameter (β)            Model 1      Model 2         Model 1      Model 2
Intercept (β0)             2.548        2.828           2.902        3.181
                         (1.028)†     (1.008)         (1.059)      (1.034)
Occ. Exposure (β1)        -0.236        zero*          -0.218        zero*
                         (0.191)                      (0.192)
Marital Status (β2)       -2.493       -2.482          -2.496       -2.486
                         (0.231)      (0.231)         (0.231)      (0.230)
ln (age) (β3)             -0.447       -0.545          -0.564       -0.662
                         (0.337)      (0.329)         (0.347)      (0.339)
Parity (β4)               -0.790       -0.782          -0.781       -0.773
                         (0.104)      (0.104)         (0.105)      (0.105)
Parity² (β5)               0.098        0.097           0.098        0.097
                         (0.022)      (0.023)         (0.023)      (0.023)
Log Likelihood         -1055.824    -1056.673       -1020.729    -1021.410
χ²                           ---        1.698             ---        1.362
(P-value)                              (0.19)                       (0.24)

* Fixed at 0.0.
† Estimated standard errors are shown in parentheses in the body of the table.
coefficients from the two analyses are quite similar.
The extent of
this similarity is demonstrated in Table 3.20 which presents estimated
fertility hazards for women with selected characteristics based on all
woman-years of reproductive experience and then on the restricted set of
woman-years.
There is not any substantial difference between these two
estimates for the four woman-years described in this table.
With such a
comparison, it is important to remember that the multiple exposure years
were few in number (154) relative to the total number of woman-years
available from Plant B (3977).
3.4 Summary of the Essential Features of the Multiplicative
Exponential Fertility Hazard Model
Within the framework of the multiplicative exponential fertility
hazard model (2.3), the reproductive span of women from Plants A and B
are evaluated on a yearly basis.
Several assumptions must be made when
utilizing this fertility evaluation procedure.
This model assumes a
constant fertility hazard for woman (i) in each year interval [j, j+1)
as well as proportional hazards among women given their race, age, and
parity in each year (j).
An external standard for fertility experience
is introduced into the fertility hazard model:
the U.S. national birth rates,
$B_{ij}$'s, are presumed to estimate the baseline fertility hazard
in the model.
The variability in these rates is demonstrated in
Figure 3.1 where the U.S. birth rates for a white woman from the 1925
birth cohort and another from the 1940 birth cohort, each giving birth
at age 25, 27, and 30 during her reproductive span, are plotted against
age.
The uniqueness of the birth cohorts is evident from this plot:
the difference between the two sets of rates per 1000 women ranges from
3.3 at age 15 to 84.8 at age 20.
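The assumption of a constant fertility hazard within each year interval links a yearly hazard to the probability of a live birth during that year: with hazard $\lambda$ held constant over [j, j+1), the survival function is $\exp\{-\lambda t\}$, so the chance of a birth within a fraction $t$ of the year is $1 - \exp\{-\lambda t\}$. The sketch below illustrates this standard relationship; the function name is hypothetical, and the example hazard is the Table 3.10 value for the married eighteen year old of parity one in 1952.

```python
import math

def birth_probability(hazard, duration=1.0):
    """Probability of a live birth within `duration` years under a constant hazard."""
    if hazard < 0 or duration < 0:
        raise ValueError("hazard and duration must be nonnegative")
    survival = math.exp(-hazard * duration)  # probability of no birth by `duration`
    return 1.0 - survival

full_year = birth_probability(0.4853)
half_year = birth_probability(0.4853, duration=0.5)
print(f"full year: {full_year:.4f}, half year: {half_year:.4f}")
```

Summing such probabilities over the woman-years in a calendar year set gives an expected number of births of the kind compared with observed births in the goodness-of-fit tables.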
TABLE 3.20
ESTIMATED FERTILITY HAZARDS FOR WOMEN WITH SELECTED CHARACTERISTICS
BASED ON ALL WOMAN-YEARS AND ON WOMAN-YEARS NOT CHARACTERIZED BY MULTIPLE EXPOSURE:
PLANT B

                                                                 All Woman-Years          No Mult. Exp. Years
Calendar   Occupational   Marital
Year (j)   Exposure       Status         Age   Parity   B_ij     exp{β̂'z_ij}   λ̂_ij      exp{β̂'z_ij}   λ̂_ij
1955       No             Not married    18    0        0.0993   0.290          0.0288     0.294          0.0292
1963       Yes            Married        40    1        0.0145   0.971          0.0141     0.924          0.0134
1963       No             Married        40    1        0.0145   1.230          0.0178     1.149          0.0167
1968       Yes            Married        34    2        0.0454   0.636          0.0289     0.622          0.0282
[Figure 3.1: line plot of U.S. birth rates per 1000 women (vertical axis, 0 to 260) against age (horizontal axis, 15 to 40) for the 1925 and 1940 birth cohorts; asterisks mark the ages at which births occurred.]
Figure 3.1.
U.S. birth rates per 1000 women by age for white women from
1925 and 1940 birth cohorts with live births at ages 25,
27, and 30.
A crucial assumption in the construction of this piecewise survival
model for fertility evaluation is that of conditional independence of
the likelihood of each reproductive year outcome in a woman's fertility
history.
The conditioning occurs on the woman's updated reproductive
characteristics as summarized by the vector of covariates,
$\mathbf{z}_{ij}$, and on
the U.S. specific birth rates, $B_{ij}$. In addition to a time dependent
covariate type approach to updating the same woman's risk for the next
year's reproductive experience, the vector of covariates provides a
regression type adjustment for heterogeneity among the study members
with respect to characteristics predictive of birth chances.
Recall the different roles these covariates play.
The marital status covariate
improves the comparability of the U.S. national birth rates, that have
been calculated without regard to marital status, with the marital status specific reproductive histories of the study group.
On the other
hand, the parity and age related covariates reflect adaptation of the
U.S. birth rates to the unexposed experience of the study cohort.
The
relationship between fertility and the male worker's occupational exposure can now be assessed.
In Plant A (B) the reduction in the baseline
fertility associated with the male worker's exposure to TDA (TDA/DNT)
was not statistically significant.
CHAPTER IV
FERTILITY HAZARD RATE FUNCTION WITH STRATIFICATION:
PARTIAL LIKELIHOOD
4.1 Introduction
An alternative strategy for fertility evaluation, one that does not
require a schedule of fertility rates from some standard population is
now proposed.
A stratified proportional fertility hazards function is
developed to evaluate a woman's reproductive experience on a year by
year basis.
This method also departs from the previous in the construction
of the likelihood.
The partial, or so-called conditional, likelihood
approach of Cox (1972) is employed.
This chapter contains five sections.
The covariates employed in the
model are defined in Section 4.2.
The stratified fertility hazard model
and the development of the partial likelihood function are then
presented.
In Section 4.4, the reproductive histories of the women from
Plant A are used to illustrate this fertility hazard model, and these
results are then compared to those obtained from the model discussed in
Chapter II.
Lastly, the essential features of this survival model for
fertility surveillance are summarized in Section 4.5.
4.2 Description of the Data
The details of the data collection process are described in
Section 2.2 of the second chapter.
Briefly, the reproductive histories
of the wives of male employees of Plant A were obtained from an
interviewer-administered questionnaire.
Work histories were also available.
From these data, each calendar year of a woman's reproductive
period for ages 15 through 49 was characterized by her parity, her
age, and her husband's status with respect to occupational exposure to
TDA.
Moreover, it was noted if and when a live birth occurred during
each calendar year.
The covariates for the Cox model are now described.
For each year
of a woman's reproductive experience, a vector of covariates is
constructed, including:
(i) the husband's occupational exposure status ($z_1$), an indicator
variable that is equal to one when the male worker is occupationally exposed to TDA for more than six months of the year
and zero, otherwise;
(ii) the couple's marital status ($z_2$), an indicator variable that
takes the value of one when the couple is married for less
than half the year and zero, otherwise;
(iii) the natural logarithm of the woman's age ($z_3$) as of January 1
of that year where age is a continuous variable ranging from
15 to 49.
These three covariates are a subset of those included in the previous
model.
An explanation as to why parity is not included as a covariate
is provided in the next section, where the parity by calendar year specific strata for the Cox model are described.
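The three covariate definitions translate directly into a small constructor for one woman-year. This is a sketch only: measuring exposure and marriage in months per year is an assumption made for illustration, and the function and argument names are hypothetical.

```python
import math

def cox_covariates(months_exposed, months_married, age_on_jan1):
    """Return (z1, z2, z3) for one year of a woman's reproductive experience."""
    if not 15 <= age_on_jan1 <= 49:
        raise ValueError("reproductive period is defined for ages 15 through 49")
    z1 = 1 if months_exposed > 6 else 0   # exposed to TDA for more than six months
    z2 = 1 if months_married < 6 else 0   # married for less than half the year
    z3 = math.log(age_on_jan1)            # natural logarithm of age on January 1
    return z1, z2, z3

print(cox_covariates(months_exposed=8, months_married=12, age_on_jan1=25))
print(cox_covariates(months_exposed=0, months_married=3, age_on_jan1=18))
```

Note that $z_2$ is coded one for the unmarried state, so a negative coefficient corresponds to reduced fertility during unmarried years.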
4.3 A Cox Regression Approach to Fertility Evaluation
An alternative hazard model for fertility surveillance is available
from Cox regression (1972).
A stratified proportional hazards model is
defined and a partial likelihood function is constructed to describe
this occupational fertility observational experiment.
First, the strata are determined.
The calendar year by parity stratification
variable was so constructed for several reasons.
Firstly, in
retrospective fertility investigations, a varied number of live birth
events are recorded for each woman.
These events can be viewed as
multivariate failure time data for each woman as described by Prentice
et al. (1981).
A woman moves to the next stratum or state when a birth
occurs, or equivalently, when her parity changes.
Thus, parity is a
natural candidate for strata definition.
Further, the need to characterize
the male worker's occupational exposure on a yearly basis
necessitates a definition of strata that is calendar year specific as
well.
Joint consideration of these factors leads to strata that are
parity by calendar year specific.
This is one analytic strategy whereby age, marital status, and the
male worker's occupational exposure are included as covariates in the
model.
Another option considers strata that are age by parity specific
with calendar year incorporated into the model as a covariate along with
marital status and the male worker's occupational exposure.
This use of strata is similar to the use of transient states by
Mantel and Byar (1974). Mantel and Byar developed a modified life table
technique to compare the survival experience among different treatment
groups when an individual's membership in a group varied during the
observation period.
The number of life tables constructed is equivalent
to the number of treatment groups or transient states.
A subject's
person-years of experience are tabulated according to his membership in
each group and then are attributed to the corresponding life table.
The events of interest are handled similarly.
In this fertility evaluation
application, a woman's "membership" in a specific calendar year by
parity stratum cannot exceed one year, i.e., she is at risk in any stratum for at most one year.
If she is not censored from the study, she
then becomes a member of another stratum or transient state where she
will also remain for at most one year.
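The movement of a woman through calendar year by parity strata can be sketched by grouping woman-year records on the pair (calendar year, parity). The record layout and names below are hypothetical and serve only to illustrate that each woman contributes to a given stratum at most once.

```python
from collections import defaultdict

def assign_strata(woman_years):
    """Group woman-year records into (calendar year, parity) strata."""
    strata = defaultdict(list)
    for record in woman_years:
        strata[(record["year"], record["parity"])].append(record["woman"])
    return dict(strata)

# Toy histories: woman A has a birth during 1960, so her parity is one in 1961.
records = [
    {"woman": "A", "year": 1960, "parity": 0},
    {"woman": "B", "year": 1960, "parity": 0},
    {"woman": "A", "year": 1961, "parity": 1},
    {"woman": "B", "year": 1961, "parity": 0},
]

strata = assign_strata(records)
for key in sorted(strata):
    print(key, strata[key])
```

Because the calendar year changes every year, a woman leaves her stratum after at most one year whether or not a birth occurs.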
The previous notation is modified to accommodate the stratification
scheme. Let $s$ subscript the stratum. Suppose the $i$th woman in stratum
$s$, with covariate vector $\mathbf{z}_{is} = (z_{1is}, z_{2is}, z_{3is}, \ldots, z_{ris})$ not
dependent on time, gives birth at time $t_{is}$. Then the fertility hazard
function for the $i$th woman in stratum $s$ at time $t_{is}$ is
$$\lambda_{is}(t_{is} \mid s, \mathbf{z}_{is}) = \lambda_{0s}(t_{is}) \exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\}. \tag{4.1}$$
The fertility hazard model (4.1) is the product of an arbitrary function
of time, i.e., the unspecified baseline or underlying hazard $\lambda_{0s}(t_{is})$,
and a log-linear function of the covariates and the regression
parameters.
This model assumes proportional hazards within each
stratum.
Since the form of the baseline fertility hazard remains
unspecified, hazard functions similar to (4.1) are considered
semi-parametric.
Once again, the goal is the estimation of regression coefficients
in (4.1).
Let there be $d_s$ live births to women in stratum $s$ where
$t_{1s} < t_{2s} < \cdots < t_{d_s s}$ are the
ordered, assumed distinct, live birth times for positive $t_{ks}$
less than one ($k = 1, 2, 3, \ldots, d_s$).
The group of women in stratum $s$ at
risk of giving birth at time $t_{is}$
is called the stratum $s$ risk set at
$t_{is}$, and this risk set "just before" the $i$th birth, i.e., at time
$t_{is} - 0$, is denoted $R(t_{is}, s)$.
Consider a partial likelihood function for estimating the vector of
regression coefficients
in the model (4.1) that only utilizes infor-
~
mation on the women in the risk sets R(tis's).
No data external to the
study group enter the hazard model as with the hazard model (2.3) of
Chapter II.
Given that one live birth occurs in stratum $s$ at time
$t_{is}$ among the women in the risk set $R(t_{is}, s)$, then the conditional
probability that the $i$th woman is that woman who gave birth at $t_{is}$ is
written (Prentice et al., 1981)
\[
\exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\} \Big/ \Big[ \sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} \Big] . \tag{4.2}
\]
This probability does not depend on the underlying hazard $\lambda_{0s}(\cdot)$.
Similar to Cox (1975), the partial likelihood is taken as the product of
these probabilities, where the product is taken over the $s$ strata and,
within each stratum, over the $d_s$ observed live births, specifically,
\[
L(\boldsymbol{\beta}) = \prod_{s \ge 1} \prod_{i=1}^{d_s} \Big\{ \exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\} \Big/ \Big[ \sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} \Big] \Big\} . \tag{4.3}
\]
The partial likelihood for the analysis of Plant A consisted of a product of such terms over 566 risk sets, one corresponding to each of the
births; see Table 3.1 in Chapter III.
The size of these risk sets ranged from one to 98 women.
It is important to note that the partial likelihood (4.3) only uses
information on those women who are present in the stratum "just before"
a birth.
Suppose a woman withdraws from the study between two births,
say in $(t_{is}, t_{(i+1)s})$. This event does not affect the construction of
the likelihood since the underlying fertility hazard $\lambda_{0s}(\cdot)$, although
completely unspecified, is presumed not to be a function of $\boldsymbol{\beta}$.
Thus,
the information about $\boldsymbol{\beta}$ available from the observation times for such
withdrawals is trivial in most applications; see Efron (1977).
Parity is not a covariate in this model.
By definition of the
strata, all the members in a risk set $R(t_{is}, s)$
have the same parity.
Examination of the likelihood (4.3) reveals that if a parity covariable
were included in the model, this covariate could be factored out of the
sum in the denominator, and it would cancel with the identical term
found in the numerator of this expression.
It is important to note,
however, that this definition of strata does allow for a distinct
parity-related regression coefficient for each stratum.
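The cancellation described above can be written out explicitly. As a sketch, suppose a parity covariate $p_s$, common to every woman in stratum $s$, entered the model with coefficient $\gamma$; both $p_s$ and $\gamma$ are notation assumed here for illustration and do not appear in the text. A single factor of (4.3) would then become

```latex
\[
\frac{\exp\{\boldsymbol{\beta}'\mathbf{z}_{is} + \gamma p_s\}}
     {\sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s} + \gamma p_s\}}
= \frac{e^{\gamma p_s}\exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\}}
       {e^{\gamma p_s}\sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\}}
= \frac{\exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\}}
       {\sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\}} ,
\]
```

so the factor is free of $\gamma$ and the partial likelihood carries no information about a within-stratum parity effect.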
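The corresponding log likelihood of (4.3) is
\[
\ln L(\boldsymbol{\beta}) = \sum_{s \ge 1} \sum_{i=1}^{d_s} \Big[ \boldsymbol{\beta}'\mathbf{z}_{is} - \ln\Big( \sum_{\ell \in R(t_{is}, s)} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} \Big) \Big] . \tag{4.4}
\]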
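The log likelihood (4.4) is a sum over strata and, within each stratum, over risk sets. A minimal Python sketch of that computation follows; the function name, the data layout, and the toy data are assumptions made for illustration and are not taken from the Plant A analysis.

```python
import math

def log_partial_likelihood(beta, strata):
    """Stratified log partial likelihood in the form of (4.4).

    strata: dict mapping a stratum label to a list of risk sets. Each risk
    set is a pair (z_case, z_at_risk): the covariate vector of the woman who
    gave birth, and the covariate vectors of every woman at risk just before
    that birth (the case herself included).
    """
    def linear_predictor(z):
        return sum(b * x for b, x in zip(beta, z))

    total = 0.0
    for risk_sets in strata.values():
        for z_case, z_at_risk in risk_sets:
            denom = sum(math.exp(linear_predictor(z)) for z in z_at_risk)
            total += linear_predictor(z_case) - math.log(denom)
    return total

# Hypothetical toy data: one covariate (1 = exposed year, 0 = unexposed).
toy_strata = {
    (1960, 0): [((0,), [(0,), (1,), (0,)])],
    (1961, 1): [((1,), [(1,), (0,), (0,)]), ((0,), [(0,), (0,)])],
}
value_at_zero = log_partial_likelihood([0.0], toy_strata)
```

At $\boldsymbol{\beta} = \mathbf{0}$ every woman in a risk set is equally likely to be the one giving birth, so each term reduces to minus the log of the risk set size.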
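To facilitate estimation procedures, the risk set notation is simplified
in (4.3) by introducing an indicator variable $y_{is\ell}$, where
\[
y_{is\ell} =
\begin{cases}
1, & \text{if the $\ell$th woman in the cohort is a member of the $i$th risk set of stratum $s$, and} \\
0, & \text{otherwise.}
\end{cases}
\]
Then, equation (4.3) can be rewritten as
\[
L(\boldsymbol{\beta}) = \prod_{s \ge 1} \prod_{i=1}^{d_s} \Big\{ \exp\{\boldsymbol{\beta}'\mathbf{z}_{is}\} \Big/ \Big[ \sum_{\ell=1}^{N} y_{is\ell} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} \Big] \Big\} \tag{4.5}
\]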
and log likelihood (4.4), as
\[
\ln L(\boldsymbol{\beta}) = \sum_{s \ge 1} \sum_{i=1}^{d_s} \Big[ \boldsymbol{\beta}'\mathbf{z}_{is} - \ln\Big( \sum_{\ell=1}^{N} y_{is\ell} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} \Big) \Big] , \tag{4.6}
\]
where $N$ is the total number of women in the cohort.
Maximum likelihood estimates of $\boldsymbol{\beta}$
and statistical inferences based on large sample
theory are developed directly from (4.6), particularly the first and
second partial derivatives of this equation.
Expressions for these
derivatives appear below:
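\[
\frac{\partial \ln L(\boldsymbol{\beta})}{\partial \beta_k} = \sum_{s \ge 1} \sum_{i=1}^{d_s} \Big( z_{kis} - \frac{B}{A} \Big) , \tag{4.7}
\]
\[
\frac{\partial^2 \ln L(\boldsymbol{\beta})}{\partial \beta_k \, \partial \beta_j} = - \sum_{s \ge 1} \sum_{i=1}^{d_s} \frac{(AE - BC)}{A^2} , \tag{4.8}
\]
and
\[
\frac{\partial^2 \ln L(\boldsymbol{\beta})}{\partial \beta_k^2} = - \sum_{s \ge 1} \sum_{i=1}^{d_s} \frac{(AD - B^2)}{A^2} , \tag{4.9}
\]
where
\[
A = \sum_{\ell=1}^{N} y_{is\ell} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} , \qquad
B = \sum_{\ell=1}^{N} y_{is\ell} \, z_{k\ell s} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} ,
\]
\[
C = \sum_{\ell=1}^{N} y_{is\ell} \, z_{j\ell s} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} , \qquad
D = \sum_{\ell=1}^{N} y_{is\ell} \, z_{k\ell s}^2 \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} ,
\]
and
\[
E = \sum_{\ell=1}^{N} y_{is\ell} \, z_{k\ell s} z_{j\ell s} \exp\{\boldsymbol{\beta}'\mathbf{z}_{\ell s}\} .
\]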
The maximum "partial" likelihood estimates of the parameters $\boldsymbol{\beta}$ are
obtained by equating (4.7) to zero and then solving for $\beta_k$,
$k = 1, 2, 3, \ldots, r$.
Since a closed-form solution to this system of
equations is not available, an iterative procedure provides these parameter estimates.
The results of a grid search of the likelihood surface
were verified by a Newton-Raphson procedure using the FORTRAN subroutine
MAXLIK, written by Kaplan and Elston (1972).
In large sample theory, the asymptotic variance-covariance matrix of
$\hat{\boldsymbol{\beta}}$ is the inverse of the Fisher information matrix, the negative of the
matrix of expected values of second partial derivatives.
The expected value of (4.9) for $k = 1, 2, 3, \ldots, r$
would appear along the diagonal and
the expected value of (4.8) for $j, k = 1, 2, 3, \ldots, r$
$(j \ne k)$, off the diagonal.
Since the expected values of the $y_{is\ell}$'s, as functions of the
$t_{is}$'s, are intractable, the observed information matrix provides estimates of the reliability of the maximum likelihood estimates.
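The estimation steps above (solve the score equation (4.7) iteratively, then invert the observed information from (4.9)) can be sketched for a single covariate. This is an illustrative Newton-Raphson loop in Python, not the MAXLIK subroutine; the function names and the toy risk sets are assumptions.

```python
import math

def score_and_information(beta, risk_sets):
    """Score (4.7) and observed information (negative of (4.9)) for one
    covariate. risk_sets is a list of (z_case, z_at_risk) pairs pooled over
    strata; A, B and D follow the definitions given with (4.7)-(4.9)."""
    score, info = 0.0, 0.0
    for z_case, z_at_risk in risk_sets:
        A = sum(math.exp(beta * z) for z in z_at_risk)
        B = sum(z * math.exp(beta * z) for z in z_at_risk)
        D = sum(z * z * math.exp(beta * z) for z in z_at_risk)
        score += z_case - B / A
        info += (A * D - B * B) / (A * A)
    return score, info

def newton_raphson(risk_sets, beta=0.0, tol=1e-10, max_iter=50):
    """Return the maximum partial likelihood estimate and its standard
    error taken from the observed information."""
    for _ in range(max_iter):
        score, info = score_and_information(beta, risk_sets)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, risk_sets)
    return beta, 1.0 / math.sqrt(info)

# Hypothetical risk sets: covariate 1 = exposed year, 0 = unexposed year.
toy_risk_sets = [
    (0, [0, 0, 1, 1]),
    (1, [0, 1, 1]),
    (0, [0, 1]),
    (0, [0, 0, 1]),
    (1, [1, 0]),
]
beta_hat, se = newton_raphson(toy_risk_sets)
```

With $r$ covariates the same loop applies with the score as a vector and the information as an $r \times r$ matrix whose off-diagonal entries come from (4.8).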
4.4 Illustration: Plant A Revisited
The procedures detailed in the previous sections of this chapter are
employed to summarize the fertility experiences of the study cohort from
Plant A.
Before presenting these results, an explanation of the model
parameters is offered.
Differences exist in the interpretation of the parameters of the Cox
regression model as opposed to the previous model (2.3) where an external
standard was utilized as an estimate of the baseline hazard function.
Because of the structure of the underlying hazard rate in (2.3), the
coefficient for $\ln(\text{age})$
was viewed as an adaptation of U.S. fertility
rates to the Plant A experience.
The comparison in the Cox regression procedure is an internal one
whereby the birth event with its corresponding covariate vector is
contrasted with "all" women of the same parity in the same calendar year at
their observed covariate values.
This concept of internal analysis, one
that does not use any specified parameters, impacts on the interpretation of the regression coefficients.
For example,
$\exp\{\beta_3 z_{3ij}\}$, where $\beta_3$
is the regression coefficient corresponding to $\ln(\text{age})$, is
interpreted as the multiplicative factor that scales the unspecified and
unknown baseline fertility hazard to adjust for the effect of age.
This is analogous to a main effect.
The interpretation of the occupational
exposure and marital status parameter estimates, $\hat\beta_1$ and $\hat\beta_2$ respectively,
as main effects is consistent between the two models although the
underlying hazard that they modify differs, one consisting of fixed constants and the
other unspecified.
The maximum likelihood estimates and estimated standard errors of the
regression parameters in this semi-parametric approach to fertility
evaluation are displayed in Table 4.1.
Parameter estimates for Model 1
were obtained on the full parameter space; in the remaining models, the
parameter space has been restricted as indicated.
Since the regularity
conditions for the asymptotic distribution of the likelihood ratio are
slightly milder than those for the asymptotic normality of the maximum
likelihood estimator, the likelihood ratio statistic, described in
Section 2.8 of Chapter II, will be the basis for inferences.
The male worker's occupational exposure to TDA reduces the underlying
fertility hazard about 36 percent ($\exp\{\hat\beta_1\} = 0.64$) when the other
covariates are held constant, though this reduction is not statistically
significant (P = 0.20). The effect on the baseline hazard due
to marital status and $\ln(\text{age})$
is in the same direction.
Further, this reduction
is statistically significant: in both cases, the P-value is less than
0.001.

TABLE 4.1
ESTIMATES AND ESTIMATED STANDARD ERRORS FOR THE PARAMETERS
AND THE LOG LIKELIHOOD FOR COX REGRESSION MODEL: PLANT A

                                    Maximum Likelihood Estimates
Parameter ($\beta$)               Model 1      Model 2      Model 3      Model 4
Occup. exposure ($\beta_1$)       -0.441       zero *       -0.264       -0.303
                                  (0.366)†                  (0.365)      (0.365)
Marital status ($\beta_2$)        -3.555       -3.552       zero *       -3.051
                                  (0.243)      (0.243)                   (0.238)
ln (age) ($\beta_3$)              -2.363       -2.351       -0.444       zero *
                                  (0.269)      (0.269)      (0.236)
Log-likelihood                    -1816.711    -1817.537    -2019.758    -1859.930
$\chi^2$                                       1.652        406.094      86.438
(P-value)                                      (0.20)       (0.0+)       (0.0+)

* Fixed at 0.0.
† Standard errors appear in parentheses in the body of this table.
+ P-value is zero to three decimal places.
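The likelihood ratio statistics in Table 4.1 follow directly from the tabulated log likelihoods: each restricted model (2, 3, 4) is compared with Model 1. The short Python sketch below redoes that arithmetic; treating each comparison as a chi-square test with one degree of freedom is an assumption based on each restricted model fixing a single coefficient at zero.

```python
import math

# Log likelihoods as reported in Table 4.1 (Plant A).
loglik = {1: -1816.711, 2: -1817.537, 3: -2019.758, 4: -1859.930}

def lr_test(restricted, full=loglik[1]):
    """Likelihood ratio statistic and its chi-square (1 df) P-value."""
    x2 = 2.0 * (full - restricted)
    # For one degree of freedom, P(chi-square > x2) = erfc(sqrt(x2 / 2)).
    p_value = math.erfc(math.sqrt(x2 / 2.0))
    return x2, p_value

results = {model: lr_test(loglik[model]) for model in (2, 3, 4)}
for model, (x2, p_value) in results.items():
    print(f"Model {model}: X2 = {x2:.3f}, P = {p_value:.3f}")
```

The computed statistics agree with the tabulated 1.652, 406.094 and 86.438.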
4.5 Summary of the Essential Model Features
The essential features of this semi-parametric approach to fertility
evaluation through Cox regression are summarized in this section.
First, the model assumptions are reiterated.
There is an unknown
underlying fertility hazard function for each calendar year by parity
specific stratum.
Furthermore, the definition of the strata is closely
related to the proportional hazards assumption in that this model presumes proportional hazards for women within each calendar year by parity
specific stratum.
However, this assumption is not required across
strata since $\lambda_{0s}(\cdot)$
may be different.
Note that the proportional
hazards assumption is not as definitive as that associated with the
multiplicative exponential fertility hazard model (2.3), whereby the
hazards are presumed to be proportional within each age by parity by
calendar year category.
An attraction of this model in occupational fertility applications
is the aspect of internal analysis, whereby each woman's giving birth is
contrasted with that of other women from the same occupationally defined study cohort.
These women live in the same geographic area and,
seemingly, would have fairly comparable socio-economic status.
Other
sources of heterogeneity that impact on differential reproductive
behavior, such as the male worker's occupational exposure to hazardous
chemical agents, can be accommodated through the vector of covariates.
Recall that the value of these covariates may change as often as every
year.
The model parameters are straightforwardly interpreted as
multiplicative effects on the underlying hazard.
Several limitations of the model are also acknowledged.
With this
approach to parameter estimation, timing of withdrawals from the study,
specifically between two consecutive stratum-specific births, does not
enter into the partial likelihood function (4.3).
The fact that the
withdrawal times are not used in the estimation of the regression coefficients is a
characteristic of the partial likelihood approach in general.
The other
source of information loss is particular to this application of occupational fertility.
Numerous woman-years of reproductive experience never
enter the partial likelihood (4.3) because of the definition of the
strata and the subsequent construction of this likelihood function as
the product of stratum-specific conditional probabilities of birth.
Specifically, if there is a calendar year by parity combination with no
birth event, then no matter how numerous the woman-years of reproductive
experience in this stratum, they will contribute no information to the
partial likelihood (4.3).
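The information loss described above can be tallied mechanically: group the woman-years by calendar year and parity, and count those falling in strata with no birth. The Python sketch below illustrates the bookkeeping; the record layout and the toy records are assumptions, not the Plant A data.

```python
from collections import defaultdict

def lost_woman_years(records):
    """records: iterable of (calendar_year, parity, gave_birth) tuples, one
    per woman-year. Returns (lost, total), where 'lost' counts woman-years
    in calendar year by parity strata that contain no live birth."""
    years = defaultdict(int)
    births = defaultdict(int)
    for year, parity, gave_birth in records:
        years[(year, parity)] += 1
        births[(year, parity)] += int(gave_birth)
    lost = sum(n for stratum, n in years.items() if births[stratum] == 0)
    return lost, sum(years.values())

# Hypothetical woman-years: only the (1970, parity 0) stratum has a birth.
toy_records = [
    (1970, 0, False), (1970, 0, True),
    (1970, 1, False), (1970, 1, False),
    (1971, 2, False),
]
lost, total = lost_woman_years(toy_records)
```

Here three of the five woman-years sit in strata without a birth and so would contribute nothing to (4.3).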
Figure 4.1 describes this information loss for the Plant A cohort.
Each entry in this display identifies a stratum containing at least one
reproductive year.
Moreover, an "x" indicates that there was a live
birth event in this stratum, while an "o" signifies that no birth
occurred among these stratum-specific reproductive years.
Consequently,
this latter symbol ("o") identifies a stratum containing information
that does not enter the likelihood.
There are over 1050 (17.2%) woman-
years of reproductive experience that do not contribute to the partial
likelihood function and 82 percent of these woman-years occurred after
1965.
[Figure 4.1. Calendar Year by Parity Strata Contributing/Lost to Partial Likelihood: Plant A. Axes: calendar year (1935 to 1980) and parity (0 to 7). x = live birth; o = no live birth and woman-years lost.]
An option would be to modify the stratum definition by grouping
calendar years.
However, such a modification makes it impossible to
characterize the male worker's occupational exposure on a yearly basis
and to evaluate its association with the woman's reproductive experience
during this same time period, a major objective of this research effort.
Finally, a subtle difference between occupational mortality and
occupational fertility studies employing stratified-type analyses is
brought to the reader's attention.
Suppose there is a deleterious
effect on the male worker's reproductive health due to on-the-job hazardous exposures which results in a reduced number of live births. Then
a smaller number of risk sets and, possibly, strata enter such an analysis so that more information is ignored than under a "no effect"
scenario.
Consider the most extreme situation where there is complete
spermatogenesis dysfunction among the workers during exposed years,
i.e., no births occur during the corresponding woman-years.
From an
analytic point of view, there is no information available to examine the
association between fertility and the male worker's occupational
exposure, yet an obvious conclusion is available by inspection of the
data!
Contrast this with an occupational mortality study where the
number of events (deaths) would increase with a presumed deleterious
occupational exposure.
CHAPTER V
SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH
5.1 Summary
A survival model for summarizing and evaluating the fertility
experience of a cohort of women with two approaches to parameter estimation has been the focus of this research.
Since the proposed method
examines each woman's reproductive experience on a year by year basis,
the value of a woman's birth predictor covariates may change as often as
every year.
In other words, this piecewise constant hazard rate has a
time dependent covariate structure.
Hence, a woman's age, her fertility
experience and marital status as well as her husband's occupational
exposure to suspect chemicals are updated each year.
Reproductive histories of the wives of workers from two chemical manufacturing plants were investigated.
In Plant A exposure to TDA was
suspected of impairing the male worker's reproductive health.
The maximum likelihood estimates based on the full likelihood function suggest a
reduction of 39 percent in the U.S. birth experience in order to model
the fertility experience of the women from Plant A during years characterized by paternal occupational exposure.
From Table 3.8, the
regression coefficient corresponding to occupational exposure is estimated to be -0.489 with an estimated standard error of 0.357.
The analogous estimates from the partial likelihood approach to parameter
estimation, found in Table 4.1, are -0.441 and 0.366, respectively.
99
These latter statistics translate into a 36 percent reduction in fertility during years when the male worker was exposed to TDA; here the
reduction occurs to some unknown underlying fertility hazard.
The consistency in these results is noteworthy despite differences in the
construction of the likelihood and in the underlying fertility hazard
rate.
Recall that these results are not statistically significant at
$\alpha = 0.05$.
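The percent reductions quoted above are the quantities $100\,(1 - \exp\{\hat\beta\})$ for the two exposure coefficients. The Python sketch below reproduces them; the added ratio of each estimate to its standard error is a Wald-type check introduced here only for illustration, whereas the text bases its inferences on the likelihood ratio statistic.

```python
import math

def percent_reduction(beta_hat):
    """Percent reduction in the hazard implied by a log-hazard coefficient."""
    return 100.0 * (1.0 - math.exp(beta_hat))

# (estimate, standard error) for occupational exposure, Plant A.
full_likelihood = (-0.489, 0.357)     # Table 3.8
partial_likelihood = (-0.441, 0.366)  # Table 4.1

full_pct = percent_reduction(full_likelihood[0])
partial_pct = percent_reduction(partial_likelihood[0])

# Wald-type ratios (an assumption for illustration, not the text's test).
full_z = full_likelihood[0] / full_likelihood[1]
partial_z = partial_likelihood[0] / partial_likelihood[1]
print(round(full_pct), round(partial_pct))
```

Both ratios are smaller than 1.96 in absolute value, which is in line with the reported lack of significance at the 0.05 level.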
In a second plant the effect of paternal exposure to TDA/DNT on fertility was evaluated.
Because of the manufacturing processes utilized
in Plant B, exposure to these two chemicals could not be differentiated.
Model results presented in Table 3.19 suggest a 21 percent reduction in
the U.S. birth rates to describe the reproductive experience of the
wives of the male workers from Plant B in years when these workers
experienced on-the-job exposure to TDA/DNT.
Once again, this reduction
is not statistically significant.
The conclusion that there is no statistically
significant association between exposure to TDA (TDA/DNT) and
fertility is consistent with the results from other epidemiologic
investigations of these same two plants (Levine, 1983).
5.2 Future Research
5.2.1 Forms for the Hazard Function
In this dissertation, a piecewise multiplicative exponential hazard
rate model has been used to approximate the fertility experience of a
woman during her thirty-five year reproductive span. The multiplicative
exponential fertility hazard rate (2.3) for the $i$th woman during the
$j$th calendar year with covariate vector $\mathbf{z}_{ij}$
is repeated here:
\[
\lambda_{ij}(\mathbf{z}_{ij}, \boldsymbol{\beta}) = B_{ij} \exp\{\boldsymbol{\beta}'\mathbf{z}_{ij}\} . \tag{5.1}
\]
Consideration of other models for the hazard would be useful.
In
particular, an analogous additive model for the hazard rate
\[
\lambda_{ij}(\mathbf{z}_{ij}, \boldsymbol{\beta}) = B_{ij} + \boldsymbol{\beta}'\mathbf{z}_{ij} \tag{5.2}
\]
might be developed and results compared to those obtained from hazard
model (5.1).
Note that restrictions on $\boldsymbol{\beta}$
are required to ensure that
the hazard (5.2) is positive for all values of the covariates.
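The positivity restriction can be made concrete: under the additive form (5.2) a coefficient vector is admissible only if the baseline plus the linear term stays above zero for every woman-year, whereas the multiplicative form (5.1) is positive automatically. A small Python sketch follows; the function names, baseline values and covariates are hypothetical.

```python
import math

def additive_hazard(baseline, beta, z):
    """Hazard of the additive form (5.2): baseline + beta'z."""
    return baseline + sum(b * x for b, x in zip(beta, z))

def multiplicative_hazard(baseline, beta, z):
    """Hazard of the multiplicative form (5.1): baseline * exp(beta'z)."""
    return baseline * math.exp(sum(b * x for b, x in zip(beta, z)))

def is_admissible(beta, woman_years):
    """True if the additive hazard is positive for every (baseline, z)."""
    return all(additive_hazard(b0, beta, z) > 0 for b0, z in woman_years)

# Hypothetical woman-years: (baseline yearly rate, exposure indicator).
toy_years = [(0.10, (1,)), (0.05, (1,)), (0.12, (0,))]
ok = is_admissible((-0.04,), toy_years)       # 0.05 - 0.04 > 0
not_ok = is_admissible((-0.06,), toy_years)   # 0.05 - 0.06 < 0
```

In this toy setting the admissible range for the exposure coefficient is bounded below by the smallest baseline rate among exposed woman-years.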
5.2.2 Regression Strategy
A strategy for general regression exploration of higher order terms
must be developed.
Care must be exercised in choosing covariates and
higher order terms for inclusion in the multiplicative hazard rate
model.
Each candidate should be examined carefully for subject matter
interpretation.
Often the complicated explanation of higher order terms
precludes them from the model.
Further examination of the parity and parity squared covariate utilized in the hazard rate model (2.3) provides additional insights into
the selection of model covariates to describe the fertility experiences
of the women from Plant A.
As parity is an important birth confounder,
the observed and expected births for this cohort of women are aggregated
by parity; see Table 5.1.
Recall that Table 3.7 contains these data
aggregated by calendar year set.
TABLE 5.1
OBSERVED AND EXPECTED NUMBERS OF BIRTHS FROM THE MULTIPLICATIVE
EXPONENTIAL FERTILITY HAZARD MODEL (2.3)
WITH SELECTED PARITY COVARIATES: PLANT A
Perusal of these parity-specific data reveals that the fertility
hazard rate model with covariates parity, parity squared, the natural
logarithm of the woman's age, a marital status indicator, and a paternal
occupational exposure indicator, overestimates the number of births at
parity one while the observed number of births exceeds the expected
number at parities zero, two and three.
These results led to the consideration
of a model where the parity and parity squared covariates
were replaced with an intercept term for parity zero woman-years of
reproductive experience and a linear term for those reproductive years
characterized by a parity greater than zero.
These new covariates allow
for an adjustment of the U.S. birth rates for women who have
demonstrated fertility (parity one plus) as well as a corrective factor
for the special group of infertile women at parity zero.
The parity-specific
expected number of births among the Plant A cohort estimated
from this revised fertility hazard model appears in Table 5.1.
Notice
the similarity between the expected and observed number of births at
each parity level.
In addition to interpretational difficulties, the risk of multicollinearity and the resulting loss of precision necessitates a careful
selection of model covariates.
Kleinbaum, Kupper, and Morgenstern
(1982) describe such a strategy within the multiple logistic regression
framework when interest lies in estimating the regression coefficients.
However, caution must be exercised in the interpretation of statistically significant coefficients resulting from sifting through the data.
Replication on separate data sets, or split portions of the same data
set, often reveals different coefficients as significant.
5.2.3 Occupational Exposure Covariate
The definition of occupational exposure utilized in this research
was based on a six-month cutoff point.
Specifically, if the husband
was occupationally exposed to the suspect chemical for at least six
months during a calendar year, then this year in a woman's reproductive
span was classified as "exposed".
Such a definition may result in
misclassification of conceptions with respect to occupational exposure.
For example, consider an estimated conception date of November in a year
when the husband experienced on-the-job exposure from August through
December.
Since his exposure period during this year was only five
months, this reproductive year is characterized by an absence of paternal occupational exposure even though the conception occurred while the
husband was exposed.
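The misclassification in this example can be traced with a few lines of Python. The function name and the month coding (1 = January through 12 = December) are assumptions for illustration; only the six-month cutoff and the August-through-December scenario come from the text.

```python
def exposed_year(months_exposed, cutoff=6):
    """Classify a calendar year as exposed if the husband was exposed in at
    least `cutoff` distinct months of that year."""
    return len(set(months_exposed)) >= cutoff

# Exposure from August through December; conception estimated in November.
aug_to_dec = range(8, 13)
conception_month = 11

classified_exposed = exposed_year(aug_to_dec)            # five months only
exposed_at_conception = conception_month in aug_to_dec   # exposed in November
```

The year is classified as unexposed even though the conception falls inside the exposure period, which is the discrepancy the text describes.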
It is difficult to quantify occupational exposure so that other
definitions should be investigated.
Moreover, whenever occupational
exposure data are gathered retrospectively, questions are raised
regarding the quality of the exposure information.
However, consistency
of results in plant-specific analyses is reassuring evidence that the
quality of the occupational exposure data is similar in the concerned
plants.
In the analysis of Plants A and B presented in Chapter III, a
reduction in fertility was associated with exposure to TDA and TDA/DNT,
respectively; however, neither was statistically significant.
The fertility surveillance of the Plant B cohort, presented in
Section 3.3, demonstrates the flexibility of the proposed multiplicative
exponential fertility hazard rate model:
reproductive years characterized
by vague occupational exposure were not examined to prevent
confounding of model results by inexact information.
Another analytic
approach to this problem is proposed.
Construct two dummy covariables to describe the male worker's occupational exposure status, say $z_1$
and $z_1^*$, where
\[
z_1 =
\begin{cases}
1, & \text{if the male worker is exposed to TDA/DNT during the year of interest, and} \\
0, & \text{otherwise,}
\end{cases}
\]
and
\[
z_1^* =
\begin{cases}
1, & \text{if the male worker experiences multiple occupational exposure during the year of interest, and} \\
0, & \text{otherwise.}
\end{cases}
\]
Joint consideration of $z_1$ and $z_1^*$
implies no hazardous on-the-job
exposure when both $z_1$ and $z_1^*$ are zero.
With this analytic strategy
all woman-years of reproductive experience from the Plant B cohort would
be included in the analysis, and inferences regarding the association
between occupational exposure to TDA/DNT and fertility would be based on
the estimated regression coefficient corresponding to $z_1$.
The parameter
estimates and their estimated standard errors from this alternative
model can be compared to the results in Section 3.3.
This illustrates the flexibility of this piecewise constant fertility hazard rate in defining the covariate, occupational exposure.
In
a similar manner, this model provides the epidemiologist with the means
to investigate a dose-response relationship between paternal occupational exposure to a suspect chemical and fertility.
5.2.4 Pooled Analyses
Consider now that the number of workers employed at a chemical manufacturing plant is small so that the woman-years of reproductive
experience may not be sufficient for invoking large sample distribution
theory.
If there are several small manufacturing facilities where
workers are exposed to the same chemical compound, it is possible to
pool the reproductive history data from these plants, and utilize the
multiplicative exponential fertility hazard model to perform a stratified analysis on this combined data file.
A pooled analysis may also be desired when the individual analysis
of reproductive histories from two or more plants reveals an association
between fertility and the same occupational exposure that is marginally
significant in each plant.
Analysis of these pooled data may result in
a significant association.
When analyzing reproductive histories pooled over several plants,
the investigator is confronted with interesting tests of common
covariates.
Introduce Plants C and D to illustrate a strategy for this
pooled analysis.
The first step in this analytic framework is to
investigate the null hypothesis that the effect of occupational exposure
is similar in the two plants, namely,
\[
H_0: \beta_{C,\text{exposed}} = \beta_{D,\text{exposed}} . \tag{5.3}
\]
Presume that (5.3) is not rejected.
Then proceed to test the equality
of the other regression coefficients (minus the intercept term),
\[
H_0: \boldsymbol{\beta}_C = \boldsymbol{\beta}_D . \tag{5.4}
\]
Proceed to explore the effect due to occupational exposure,
\[
H_0: \beta_{\text{exposed}} = 0 , \tag{5.5}
\]
where $\beta_{\text{exposed}}$ is the pooled regression coefficient corresponding to
occupational exposure in Plants C and D.
The model for testing this
latter hypothesis will have a common adjustment for the other covariates,
say $\boldsymbol{\beta}$, or plant-specific adjustments,
$\boldsymbol{\beta}_C$ and $\boldsymbol{\beta}_D$,
depending on the
results of test (5.4).
5.2.5 Model Sensitivity
The sensitivity of the piecewise multiplicative hazard rate model
needs to be investigated.
It is desirable that this model detect
various departures from the null hypothesis of no association between
occupational exposure and fertility.
Consider the situation of a sudden
excursion in the concentration of the suspect chemical in the working
environment whereby the male workers are exposed to a high concentration
of the chemical, but for a very short time period.
Contrast this to
minimal exposure over several years.
Can the fertility hazard rate
model detect reduced fertility in each of these situations?
There are several approaches to examining the sensitivity of the
proposed model.
Utilize the model on available data sets, where a
reduction in fertility associated with paternal occupational exposure
has been demonstrated, to see if the piecewise multiplicative hazard
rate provides the same conclusion.
Simulation techniques provide
greater flexibility in that any fertility reduction scenario can be
incorporated into the derived data file.
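A simulation of the kind suggested above might begin as in the following Python sketch, which generates births under a constant yearly hazard with a multiplicative exposure effect. Everything specific here is an assumption made for illustration: the baseline hazard of 0.10, the exposure coefficient of -0.5, the seed, and the simplification of at most one birth per woman-year.

```python
import math
import random

def simulate_births(n_years, baseline, beta, exposed, rng):
    """Count births over n_years independent woman-years with yearly hazard
    baseline * exp(beta * exposed); the yearly birth probability under a
    constant hazard is 1 - exp(-hazard)."""
    hazard = baseline * math.exp(beta * exposed)
    p_birth = 1.0 - math.exp(-hazard)
    return sum(rng.random() < p_birth for _ in range(n_years))

rng = random.Random(1983)  # fixed seed so the run is reproducible
n = 20000
unexposed_births = simulate_births(n, 0.10, -0.5, 0, rng)
exposed_births = simulate_births(n, 0.10, -0.5, 1, rng)

# Crude rate ratio; it should lie near exp(-0.5), about 0.61.
rate_ratio = (exposed_births / n) / (unexposed_births / n)
```

Other scenarios, such as a brief high-concentration excursion or minimal exposure over several years, could be explored by letting the exposure covariate and its coefficient vary by year before fitting the model to the simulated file.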
REFERENCES
Billewicz, W.Z. and Thomson, A.M. (1973). Birthweights in Consecutive
Pregnancies, Journal of Obstetrics and Gynecology of the British
Commonwealth 80, 491-498.
Biswas, S. (1973). A Note on the Generalization of William Brass's
Model, Demography 10, 459-467.
Blake, J. (1955). Family Instability and Reproductive Behavior in
Jamaica, in Current Research in Human Fertility, Milbank Memorial Fund,
New York.
Blot, W.J., Shimizu, Y., Kato, H., and Miller, R.W. (1975). Frequency
of Marriage and Live Birth among Survivors Prenatally Exposed to the
Atomic Bomb, American Journal of Epidemiology 102, 128-136.
Brass, W. (1958). The Distribution of Births in Human Populations,
Population Studies 12, 51-72.
Breslow, N.E. (1975). Analysis of Survival Data under the Proportional
Hazards Model, International Statistical Review 43, 45-58.
Breslow, N. (1977). Some Statistical Models Useful in the Study of
Occupational Mortality, in Environmental Health: Quantitative Methods,
ed. A. Whittemore, Society for Industrial and Applied Mathematics,
Philadelphia, 88-103.
Breslow, N.E., Lubin, J.H., Marek, P., and Langholz, B. (1983).
Multiplicative Models and Cohort Analysis, Journal of the American
Statistical Association 78, 1-12.
Buffler, P.A. and Aase, J.M. (1980). Genetic Risks and Environmental
Surveillance: Epidemiological Aspects of Monitoring Industrial
Populations for Environmental Mutagens, Paper prepared for the
Mutagenicity Task Group of the American Industrial Hygiene Council,
Washington, D.C.
Cochran, W.G. (1952). The $\chi^2$ Test of Goodness of Fit, Annals of
Mathematical Statistics 23, 315-345.
Cox, D.R. (1961). Tests of Separate Families of Hypotheses,
Proceedings of the Fourth Berkeley Symposium 1, 105-123.
Cox, D.R. (1972). Regression Models and Life Tables (with Discussion),
Journal of the Royal Statistical Society, Series B 34, 187-220.
Cox, D.R. (1975). Partial Likelihood, Biometrika 62, 269-276.
Dandekar, V.M. (1955). Certain Modified Forms of Binomial and Poisson
Distribution, Sankhya 15, 237-251.
Dobbins, J.G., Eifler, C.W., and Buffler, P.A. (1978). The Use of
Parity Survivorship Analysis in the Study of Reproductive Outcome, Paper
prepared for the Symposium on Methodologic and Analytic Issues in
Monitoring Human Populations for Reproductive Risks Associated with
Environmental Exposures at the Society for Epidemiologic Research Annual
Meeting, Iowa City, Iowa, 14-16 June 1978.
Efron, B. (1977). The Efficiency of Cox's Likelihood Function for
Censored Data, Journal of the American Statistical Association 72,
557-565.
Efron, B. and Hinkley, D.V. (1978). Assessing the Accuracy of the
Maximum Likelihood Estimator: Observed versus Expected Fisher
Information, Biometrika 65, 457-487.
Elandt-Johnson, R.C. and Johnson, N.L. (1980). Survival Models and
Data Analysis, John Wiley and Sons, New York.
Erickson, J.D. and Bjerkedal, T. (1978). Interpregnancy Interval.
Association with Birth Weight, Stillbirth, and Neonatal Death, Journal of
Epidemiology and Community Health 32, 124-130.
Fedrick, J. and Adelstein, P. (1978). Factors Associated with Low Birth
Weight of Infants Delivered at Term, British Journal of Obstetrics and
Gynaecology 85, 1-7.
Feller, W. (1968). An Introduction to Probability Theory and Its
Applications, Vol. 1, John Wiley and Sons, New York.
Fisher, E.A. (1978). Assessing Reduced Fertility in Industrial
Populations, M.S.P.H. Thesis, Department of Epidemiology, School of
Public Health, University of North Carolina, Chapel Hill, N.C.
Fisher, R.A. (1929). The Genetical Theory of Natural Selections,
Dover, New York.
Fleiss, J.L. (1973). Statistical Methods for Rates and Proportions,
John Wiley and Sons, New York.
Henry, L.H. (1957). Fertility and Family: Mathematical Models I, in
On the Measurement of Human Fertility, translated and edited by
M.C. Sheps and E. Lapierre-Adamcyk, American Elsevier, New York.
Henry, L.H. (1961). Fertility and Family: Mathematical Models II, in
On the Measurement of Human Fertility, translated and edited by
M.C. Sheps and E. Lapierre-Adamcyk, American Elsevier, New York.
Hogue, C. (1971). Refilling the Empty Room: Using Vital Statistics to
Study Rapidity of Fetal Death Replacement, M.S.P.H. Thesis, Department
of Epidemiology, School of Public Health, University of North Carolina,
Chapel Hill, N.C.
Hoem, J.M. (1970). Probabilistic Fertility Models of the Life Table
Type, Theoretical Population Biology 1, 12-38.
Infante, P.F., Wagoner, J.K., and Waxweiler, R.J. (1976). Carcinogenic,
Mutagenic and Teratogenic Risks Associated with Vinyl Chloride, Mutation
Research 41, 131-142.
Kaplan, E.B., and Elston, R.C. (1972). A Subroutine Package for
Maximum Likelihood Estimations (MAXLIK), Institute of Statistics Mimeo
Series No. 823, Department of Biostatistics, University of North
Carolina, Chapel Hill, N.C.
Kiser, C., Graybill, W.H., Campbell, A.A. (1968). Trends and
Variations in Fertility in the United States, Harvard University Press,
Cambridge, Massachusetts.
Kissling, G. (1981). A General Model for Analysis of Nonindependent
Observations, Institute of Statistics Mimeo Series No. 1357, Department
of Biostatistics, University of North Carolina, Chapel Hill, N.C.
Kleinbaum, D.G., Kupper, L.L., and Morgenstern, H. (1982).
Epidemiologic Research, Wadsworth, Inc., California.
Kline, J., Shrout, P.E., Stein, Z., Susser, M., and Weiss, M. (1978).
An Epidemiologic Study of the Role of Gravidity in Spontaneous Abortion,
Early Human Development 1, 345-356.
Levine, R.J. (1983). Chemical Industry Institute of Toxicology,
Research Triangle Park, N.C. Personal communication.
Levine, R.J., Symons, M.J., Balogh, S.A., et al. (1980). A Method for
Monitoring the Fertility of Workers. 1. Method and Pilot Studies,
Journal of Occupational Medicine 22, 781-791.
Levine, R.J., Symons, M.J., Balogh, S.A., et al. (1981). A Method for
Monitoring the Fertility of Workers. 2. Validation of the Method among
Workers Exposed to Dibromochloropropane, Journal of Occupational Medicine
23, 183-188.
Lewis, A.W. (1981). The Burr Distribution as a General Parametric
Family in Survivorship and Reliability Theory Applications, Institute of
Statistics Mimeo Series No. 1351, Department of Biostatistics,
University of North Carolina, Chapel Hill, N.C.
Mantel, N. and Byar, D.P. (1974). Evaluation of Response-Time Data
Involving Transient States: An Illustration Using Heart-Transplant
Data, Journal of the American Statistical Association 69, 81-86.
110
Matthiessen, P.C. and McCann, J.C. (1978). The Role of Mortality in
the European Fertility Transition: Aggregate-Level Relations in The
Effects of Infant and Child Mortality on Fertility, S. Preston (ea:T,
-Academic-Vress, New-York.
Namboodiri, N.K., Suchindran, C.M., and Wyman, K. (1980). A Life Tab1 e
Approach to the Study of Work-Fertility Relationships, Paper presented
at the Annual Meeting of the American Statistical Association, Houston,
Texas, August 1980.
National Center for Health Statistics. (1976). Fertility Tables for
Birth Cohorts by Color: United States, 1917-1973, U.S. Government
Printing Office, Washington, D.C.
Naylor, A.F. (1974). Sequential Aspects of Spontaneous Abortion:
Maternal Age, Parity, and Pregnancy Compensation Artifact, Social Biology
21, 195-204.
Naylor, A.F. and Warburton, D. (1979). Sequential Analysis of
Spontaneous Abortion. II. Collaborative Study Data Show that Gravidity
Determines a Very Substantial Rise in Risk, Fertility and Sterility 31,
282-286.
Osborn, J. (1972). A Statistical Investigation into the Effects of
Maternal Age, Parity and Birth Concentration on Stillbirth and Infant
Mortality Rates. Ph.D. Thesis, University of London.
Osborn, J. (1975). A Multiplicative Model for the Analysis of Vital
Statistics Rates, Applied Statistics 24, 75-84.
Pathak, K.B. (1966). A Probability Distribution for the Number of
Conceptions, Sankhya Series B 28, 213-218.
Pathak, K.B. (1971). A Model for Estimating Fecundability of the
Currently Married Woman from the Data on Her Susceptibility--A Cohort
Approach, Demography 8, 519-524.
Prentice, R.L., Williams, B.J., and Peterson, A.V. (1981). On the
Regression Analysis of Multivariate Failure Time Data, Biometrika 68,
373-379.
Sheps, M. and Menken, J. (1971). A Model for Studying Birth Rates Given
Time Dependent Changes in Reproductive Parameters, Biometrics 27, 325-343.
Shryock, S., Siegel, S., et al. (1975). The Methods and Materials of
Demography. U.S. Bureau of the Census, U.S. Government Printing Office,
Washington, D.C.
Singh, I. (1981). A Conception Dependent Probability Distribution of
Couple Fertility, Journal of Bioscience l, 207-214.
Singh, S.N. (1963). Probability Models for the Variation in the Number
of Births per Couple, Journal of the American Statistical Association 58,
721-727.
Singh, S.N. (1968). A Chance Mechanism for the Variation in the Number
of Births per Couple, Journal of the American Statistical Association 63,
209-213.
Singh, S.N. and Bhattacharya, B.N. (1970). A Generalized Probability
Distribution for Couple Fertility, Biometrics 26, 33-40.
Singh, S.N., Bhattacharya, B.N., and Yadava, R.C. (1974). A Parity
Dependent Model for Number of Births and its Applications, Sankhya B 36,
93-102.
Taulbee, J.D. (1977). A General Model for the Hazard Rate with
Covariables and Methods for Sample Size Determination for Cohort
Studies, Institute of Statistics Mimeo Series No. 1154, Department of
Biostatistics, University of North Carolina, Chapel Hill, N.C.
Townsend, J.C., Bodner, K.M., et al. (1982). Survey of Reproductive
Events of Wives of Employees Exposed to Chlorinated Dioxins, American
Journal of Epidemiology 115, 695-713.
Trost, R.P. and Lurie, P. (1980). Estimation of Child Spacing Model with
Cox Regression Technique, Paper presented at the Annual Meeting of the
American Statistical Association, Houston, Texas, August 1980.
U.S. Bureau of the Census. (1975). Census of Population 1970, Subject
Reports, Final Report PC(2)-3B. Childspacing and Current Fertility, U.S.
Government Printing Office, Washington, D.C.
Warburton, D. and Fraser, F.C. (1964). Spontaneous Abortion Risks in
Man: Data from Reproductive Histories Collected in a Medical Genetics
Unit, American Journal of Human Genetics 23, 41-54.
Wilks, S.S. (1938). The Large-Sample Distribution of the Likelihood
Ratio for Testing Composite Hypotheses. Annals of Mathematical
Statistics 9, 60-62.
Wong, O., Utidjian, H.M., and Karten, V.S. (1979). Retrospective
Evaluation of Reproductive Performance of Workers Exposed to Ethylene
Dibromide (EDB), Journal of Occupational Medicine 21, 98-102.
Woodbury, M.A., Manton, K.G., and Stallard, E. (1979). Longitudinal
Analysis of the Dynamics and Risk of Coronary Heart Disease in the
Framingham Study, Biometrics 35, 575-585.
APPENDIX A
OBSERVED AND EXPECTED NUMBERS OF BIRTHS BASED ON
THE MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODEL (2.3)
BY CALENDAR YEAR: PLANT A

Calendar Year    Observed Births         Expected Births
j                $\sum_i \delta_{ij}$    $\sum_i (1 - \hat{S}_{ij}(t_{ij}))$

1935              0                       0.001581
1936              0                       0.009915
1937              0                       0.074894
1938              1                       0.152255
1939              1                       0.469177
1940              1                       0.514127
1941              2                       0.750533
1942              4                       1.797417
1943              2                       2.660129
1944              3                       2.727677
1945              6                       3.089753
1946              3                       5.250696
1947              9                       8.116872
1948             12                      10.357876
1949             12                      13.733681
1950             10                      14.390855
1951             15                      17.729872
1952             16                      19.362763
1953             20                      20.678313
1954             24                      23.356470
1955             24                      24.412395
1956             34                      27.037421
1957             32                      29.839058
1958             35                      31.452334
1959             29                      30.679243
1960             29                      30.291596
1961             23                      26.618463
1962             29                      24.583844
1963             29                      22.426348
1964             26                      19.224996
1965             16                      16.023496
1966             17                      13.939594
1967             13                      12.750146
1968             10                      11.906491
1969              9                      11.060829
1970             11                      10.450891
1971              5                       9.461395
1972              5                       7.548104
1973              8                       7.217621
1974              7                       7.172209
1975              8                       7.259139
1976              5                       6.676703
1977              3                       6.051680
1978              4                       6.286083
1979              4                       6.163662
1980              9                       5.693032
1981              1                       4.487201
APPENDIX B
OBSERVED AND EXPECTED NUMBERS OF BIRTHS BASED ON
THE MULTIPLICATIVE EXPONENTIAL FERTILITY HAZARD MODEL (2.3)
BY CALENDAR YEAR: PLANT B

Calendar Year    Observed Births         Expected Births
j                $\sum_i \delta_{ij}$    $\sum_i (1 - \hat{S}_{ij}(t_{ij}))$

1932              0                       0.002180
1933              0                       0.005051
1934              0                       0.010470
1935              0                       0.182547
1936              1                       0.228822
1937              1                       0.361199
1938              0                       0.262501
1939              0                       0.257274
1940              0                       0.270648
1941              0                       0.292311
1942              0                       0.524951
1943              2                       0.851293
1944              0                       1.026161
1945              0                       1.173773
1946              5                       3.265449
1947              1                       3.680697
1948              2                       3.910166
1949              4                       4.682415
1950              6                       5.629321
1951              3                       8.546828
1952             10                       8.719117
1953              5                       8.407498
1954              8                      10.483153
1955             16                      10.006583
1956              6                      10.158190
1957              8                      11.454334
1958             19                      13.782591
1959             19                      15.559347
1960             16                      15.824016
1961             15                      15.566064
1962             24                      16.369071
1963             15                      17.832910
1964             27                      18.347946
1965             19                      17.480874
1966             16                      18.029287
1967             18                      17.414518
1968             21                      15.650388
1969             15                      15.783377
1970             15                      15.802418
1971             22                      14.414412
1972             10                      12.316200
1973              8                      11.680324
1974              6                      12.485878
1975             12                      11.674124
1976             12                      10.688039
1977              8                      11.667033
1978             12                      10.722260
1979              8                      10.752802
1980              9                      10.145266
1981             12                       9.317234