Dorigo, David (1975). "Some estimation procedures for bivariate life tables."

SOME ESTIMATION PROCEDURES FOR
BIVARIATE LIFE TABLES
by
David Dorigo
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1031
January 1975
ABSTRACT
DAVID DORIGO. Some Estimation Procedures for Bivariate Life Tables.
(Under the direction of D. G. HOEL and C. M. SUCHINDRAN.)
1) Maximum likelihood estimates for the parameters of a distribution in
the bivariate exponential family are obtained through iterative techniques
when samples are subject to random censoring on both components. Data were
obtained by simulation for several values of the parameters of the
underlying bivariate distribution.
2) Nonparametric estimates based on bivariate empirical distribution
functions are presented. Suppose the range $[0, T)$ of observation is
partitioned into, say, $w$ intervals $[t_i, t_{i+1})$ ($i = 0, 1, \ldots, w-1$).
When data appear in grouped form and are censored according to a certain
scheme, maximum likelihood estimates for the unconditional probability
that the first component fails in the i-th interval and the second in
the j-th interval ($q_{ij}$) are obtained. Other parameters, such as the
probability that both components survive the i-th interval, are also
estimated.
3) To test the hypothesis of independence between the two components,
the likelihood ratio test is used for the case of grouped data. An
illustrative example, based on data concerning 2,197 amenorrheic women,
is given. A score function, intended to generalize the usual Kendall's tau
for testing independence, is proposed for the case of bivariate continuous
data with random censoring.
A Dissertation submitted to the faculty of the
University of North Carolina at Chapel Hill in
partial fulfillment of the requirements for the
degree of Doctor of Philosophy in the Department of Biostatistics.
Chapel Hill
January 1975
Approved by:
Co-adviser
Co-adviser
Reader
ACKNOWLEDGMENTS
The author would like to express his deep gratitude to Professors
D. G. Hoel and C. M. Suchindran, co-advisers: the former for his guidance
and, above all, immense patience; the latter for his encouragement and
suggestions throughout this study. The author is also grateful to
Professor G. G. Koch for specific suggestions concerning portions of this
work, and to Professors H. Bradley Wells and P. Uhlenberg for a critical
reading of the manuscript, which resulted in many improvements.
Professors Bradley Wells and Suchindran, as faculty advisers, always
honored me with their warm friendship.
The author is very much indebted to Rebecca Wesson for her
determination in handling a messy manuscript.
TABLE OF CONTENTS
CHAPTER
I. INTRODUCTION
   1.1 Univariate Life Tables: Uses, Applications
       1.1.1 Some Terminology
       1.1.2 The Problem of Estimation
       1.1.3 Concomitant Variables
   1.2 Bivariate Life Tables
       1.2.1 Definition of the Main Functions
       1.2.2 Concomitant Variables
   1.3 Contents of the Present Work
II. PARAMETRIC ESTIMATION
   2.1 The Model
       2.1.1 Choice of the Model
       2.1.2 Description of the Model
   2.2 Estimation Procedures
       2.2.1 Complete Observations
       2.2.2 Censored Observations
   2.3 Simulation of Data
   2.4 Failure Rates and Concomitant Variables
       2.4.1 Description of Failure Rates
       2.4.2 An Approximate Estimation with Concomitant Variables
       2.4.3 Expected Values
   2.5 Failure Rates in Fixed Intervals
       2.5.1 Expression of Likelihood with No Censoring
III. NONPARAMETRIC ESTIMATION
   3.1 Empirical Distribution Function as an Estimator
       3.1.1 Estimators
   3.2 Estimation with Complete Observations
       3.2.1 Maximum Likelihood Estimators
       3.2.2 Mean and Variance of the Estimators
   3.3 Estimation with Censored Observations
       3.3.1 Definition of the Main Parameters
       3.3.2 Maximum Likelihood Estimation
   3.4 An Extension of the Kaplan-Meier Approach
       3.4.1 Likelihood Expression
       3.4.2 Checking Assumptions
   3.5 Failure Rates
IV. SOME HYPOTHESIS TESTING
   4.1 Test of Independence for Continuous Data
       4.1.1 Complete Observations
       4.1.2 Censored Observations
   4.2 Test of Independence for Grouped Data
       4.2.1 Complete Observations
       4.2.2 Censored Observations
V. CONCLUDING REMARKS
CHAPTER I
INTRODUCTION
This chapter presents a general view of life tables: it introduces
some definitions that will be used in the text, mentions some
applications in which life tables are useful, and brings to light some
of the statistical problems involved. The content of the present work is
described in the last part of the chapter.
1.1. Univariate life tables: uses, applications
Information concerning the risk of mortality may be obtained
from life tables.
In its original form, the life table was conceived
as a device for expressing human mortality facts in terms of probabilities.
Such tables may be constructed from statistics relating to deaths
among the general population, or from any complete mortality data relating to groups of sufficient size.
Among the existing types of life tables, the one most useful for
projections of mortality is the so-called cohort life table. A cohort
is generally defined as a group of people sharing some common experience
during a specified period of time. If mortality is the characteristic
to be measured, the cohort would be observed from the moment of birth
of all the people in it until all of them die, the end point.
Besides their original use as a device to gauge the mortality trend
of human populations, life tables have been applied to other fields. In
life testing theory, the population that one wishes to study is simply
a collection of lifetimes, such as the intervals from installation to
failure for a set of light bulbs. The lifetime of an item is the time
span from some initial state to some terminal event. Hereafter the
latter will be referred to as failure or death. In general terms, one
will be interested in the distribution of lifetimes measuring that
time span. For instance, in the case of light bulbs, one might consider
the hours of use from the initial state of being newly installed to the
terminal event of burning out, a failure.
In medical follow-up studies, a cohort of individuals sharing
some morbidity experience is observed from an initial state, such as
the date of hospital admission, up to recovery or death. The purpose
here might be the measurement of quantities such as the expectation
of life of treated patients.
Most generally, if a nonnegative random variable, T, represents
the eventual time to failure (the length of time until a particular
functioning item fails to operate properly), then the survival function
associated with T is defined as
$$P(t) = P[T > t], \qquad t > 0.$$
This function is of fundamental importance in connection with the
following applications, among others:
(a) in medical follow-up studies, one of the most common statistical
    problems is to measure the rate at which some group of people die
    or acquire a disease. For instance, one would like to know the
    proportion of patients surviving an operation for lung cancer that
    will be alive t years after the operation;
(b) in life testing problems, one may wish to determine the rate at
    which electric bulbs fail after a given number of hours of use.
In any case, only one component is being measured in the two examples
just mentioned: the eventual time to death (in (a)), or to burning
out (in (b)). Statistically, that is generally measured by the
survival function.
In practice, some problems concerning the data frequently arise.
The most common situation encountered is when only partial information
on survival is available on some or all individuals.
If a cohort of
people is under study, then the length of time an individual is observed
depends not only on his lifetime, but also on variables like entry time,
closing date of study, lost to follow-up.
When such variables are
present, the information on an individual's status is incomplete and
the data are said to be censored.
1.1.1. Some terminology
In the sequel some terms, generally adopted in follow-up studies,
are presented. Given a cohort of people, a loss is due to the fact that
an individual in it moves or fails to return for treatment. Withdrawn
alive are those known to be alive at the end of the study; and the
time to withdrawal alive for an individual alive at the end of the
study is the length of time from entrance into the study to the end of
it. Such an observation is sometimes referred to as a right censored
observation; a late entry is termed a left censored observation. By an
open cohort one means persons entering and leaving the experience during
the period of observation. In this case, both lost to follow-up and
withdrawn alive may occur. In general, no distinction, as far as
statistical treatment is concerned, is made between them, although
Chiang [1968] does make a distinction by treating the cases lost to
follow-up as a competing risk. On the other hand, Cutler and Ederer
[1958] pointed out that, although withdrawn alive in the i-th interval
and lost to follow-up are treated alike in their analysis, they
distinguish between the two mainly for the following reason: it is
important to be aware of the number lost to follow-up because of the
problem of bias. If there is no left censoring, one has what may be
called a closed cohort (since in this case there will be no late
entries).
In a life table two quantities of interest may be defined as
follows. The survival function, $P(t)$, mentioned earlier, represents
the probability that an individual in the cohort (or a component in a
system) survives longer than t. The other is the failure function.
Because observations on a survival distribution are necessarily ordered,
it is possible to define certain quantities that have specific
interpretations related to ordering. Among those, the failure rate
function plays a fundamental role. It is used by actuaries under the
name of 'force of mortality' to compute life tables; in reliability
theory it has been called 'hazard rate'; other names are age-specific
death rate and mortality intensity. In the present work those names
will be used interchangeably.
Formally, suppose the time to failure, T, of an item is a nonnegative
random variable, its lifetime, with distribution function $F(t)$ and
probability density function $f(t) = dF(t)/dt$. Then the probability
that an item, functioning at time t, will fail, or die, in the interval
(t, t + dt) is $\lambda(t)\,dt$, where $\lambda(t)$ is the force of
mortality, which is given by
$$\lambda(t) = \frac{f(t)}{1 - F(t)}, \qquad 0 < t.$$
1.1.2. The problem of estimation
Due to incomplete information, statistical problems have appeared
in connection with the estimation of parameters such as the survival
function and failure rates. Parametrically, the basic assumption is that
the data form a sample of independent observations of random variables
with the same probability distribution. Nonparametrically, several
estimators of the survival function and of failure rates have appeared.
The standard actuarial estimate (of the conditional probability
of death in an interval) was used first in follow-up studies by
Berkson and Gage [1952]. A similar estimate was proposed by Elveback
[1958], whose estimate, however, is based on interval frequencies. A
variant of the actuarial estimate was obtained by Kaplan and Meier
[1958], called the product limit estimate, which is obtained by
restricting attention to the class of discrete distributions which
place mass only at the observed uncensored survival times. Cutler and
Ederer [1958] also used the actuarial estimate with the sole difference
that the withdrawals in any interval are to be interpreted as the sum
of the losses and those alive at the closing date of study; they made
no further distinction between the two.
Because of its importance, the estimate proposed by Kaplan and
Meier [1958] should be reviewed more closely. It is constructed in
such a way that when there is no censoring, the maximum likelihood
estimate of the proportion of deaths in each interval (into which the
entire period of observation is subdivided) coincides with the usual
binomial estimate. If censoring is present, then the following
convention is adopted: adjust each of the censored observations an
infinitesimal amount to the right, so that individuals censored in any
interval do not contribute to the number withdrawn alive in it.
Algebraically, the product limit estimate may be explained as
follows: let $P(t)$ represent the survival function at time t. Divide
the period of observation into $w$ subintervals, with the i-th
subinterval being $[t_i, t_{i+1})$. Then the actuarial estimate,
$\hat{p}_i$, of the conditional probability of surviving over the i-th
subinterval may be written as
$$\hat{p}_i = 1 - \frac{d_i}{r_i - w_i/2},$$
where
$d_i$ represents the number observed to die during the i-th subinterval;
$r_i$, the number under observation at the beginning of the i-th
subinterval;
$w_i$, the number lost during the i-th subinterval.
Now, for any particular sample of a population, consider the effect of
using sufficiently small subintervals into which the whole period of
observation is divided. These subintervals are such that at most one
death or one loss occurs in any one of them. Thus
$$\hat{p}_i = \begin{cases} 1 & \text{if no death occurs in the } i\text{-th subinterval,} \\ \dfrac{r_i - 1}{r_i} & \text{if a death occurs in the } i\text{-th subinterval (but no loss).} \end{cases}$$
Since $P(t)$ is estimated by the product of the $\hat{p}_i$'s, it yields
$$\hat{P}(t) = \prod_{i=1}^{k} \hat{p}_i,$$
where the k-th subinterval ends at time t. Note that (and this is the
core of Kaplan and Meier's reasoning) the actuarial estimate gives
the binomial law (survival beyond t being a 'success' and death before
t a 'failure'), namely
$$\hat{P}(t) = \frac{N-D}{N},$$
where
N represents the sample size;
D, the number of individuals dying before time t;
N-D, the number of individuals surviving beyond time t,
when there are no losses.
That being the case, the limit to which $\hat{P}(t)$ tends, when finer
partitions of the period of observation are considered, must also have
the binomial property. In fact, $\hat{P}(t)$ will tend to
$$\frac{N-1}{N} \cdot \frac{N-2}{N-1} \cdots \frac{N-D}{N-D+1} = \frac{N-D}{N}.$$
When losses occur the cancellation of the factors is interrupted. In
other words, the estimate of the survival function is given by a step
function that jumps at points where a death occurs.
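The limiting argument above can be illustrated with a short sketch. The function name `product_limit` and its arguments are hypothetical, chosen only for this illustration; the code assumes each individual contributes one observed time and a flag saying whether that time is a death or a loss, and it treats tied deaths one at a time.

```python
from typing import List, Tuple


def product_limit(times: List[float], died: List[bool]) -> List[Tuple[float, float]]:
    """Return (death time, estimated survival just after it) pairs."""
    # Sort by time; at tied times deaths come before losses, which mimics
    # shifting censored times an infinitesimal amount to the right.
    order = sorted(range(len(times)), key=lambda i: (times[i], not died[i]))
    at_risk = len(times)
    surv = 1.0
    steps = []
    for i in order:
        if died[i]:
            surv *= (at_risk - 1) / at_risk  # factor (r - 1) / r at a death
            steps.append((times[i], surv))
        at_risk -= 1  # deaths and losses both leave the risk set
    return steps


# No losses: after D = 3 deaths out of N = 5 the estimate is (N - D) / N.
complete = product_limit([3, 1, 4, 2, 5], [True] * 5)
# One loss at time 2 interrupts the cancellation of the factors.
censored = product_limit([1, 2, 3, 4], [True, False, True, True])
```

In the complete sample the successive factors telescope to the binomial proportion; in the censored sample the factor at time 3 is 1/2 rather than 2/3, so the product no longer reduces to a simple proportion.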
The main difference between the actuarial and the product limit
estimate resides in the fact that the latter does not allow adjustment
for withdrawals in the estimate of the survival function between
points of failure. Thus the product limit estimate tends to
overestimate the conditional probability of survival.
The product limit estimate has been extended by Turnbull [1974]
for the case of double censoring, which occurs when one has information that the event of interest has occurred prior to some lower limit
of observation.
This author reduces the problem to that of single
censoring by applying the concept of self-consistency (see Efron [1967])
of an estimate.
It is then shown that the self-consistency estimate of
the survival function is a maximum likelihood estimate and it is unique.
Parametrically, the survival function may be specified as a
member of a family of distributions.
The most common models are
exponential, Makeham-Gompertz, Weibull and log normal distributions.
The problem of estimating the failure function is rather
complex. On the basis of actual observations of times to failure, it
is difficult to distinguish among the various nonsymmetrical
probability functions. Thus the differences among the candidates
mentioned before, such as the exponential, Weibull and log normal
distributions, become significant only in the tails of the distribution,
but because of limited sample sizes actual observations are sparse in
the tails. Fortunately, the task of distinguishing these distributions
within the range of actual observation is greatly simplified by the way
the concept of failure rate itself is defined. Because in reliability
contexts there is strong reason to believe that beyond a certain age the
function $\lambda(t)$ does not decrease, due to inevitable
deterioration, the exponential law appears to be the best choice. As
a matter of fact, because the times between successive events in a
Poisson process are exponential and independent, this result is often
invoked to support the assumption of an exponential failure law.
However, an important point should be brought to attention:
earlier failures give an idea of how $\lambda(t)$ behaves for small t,
and peaks in the graph of $\lambda(t)$ suggest that failure laws other
than the exponential be sought out. Consequently, the distribution of
T cannot be known with confidence and so nonparametric methods are
called for, as mentioned earlier. Among these, it is worth mentioning
the ingenious method proposed by Grenander [1956]. The usefulness of
the force of mortality is well established in studies of the mortality
of human and animal populations, and its estimation has been attempted
in a variety of ways, on various amounts of prior knowledge and on data
in several forms. Restricting the class of distribution functions by
requiring $\lambda(t)$ to be nondecreasing, Grenander gave an algorithm
for computing the maximum likelihood estimate of the failure rate. The
only difference between his estimate and the one obtained through the
actuarial estimate is that the latter is used when the ages at death
within the interval are not known; if they are known, then Grenander's
estimate is somewhat better.
1.1.3. Concomitant variables
Concomitant information on, say, a patient's condition often
accompanies survival time information. For instance, in leukemia the
white blood count at diagnosis is a useful concomitant variable for
the analysis of survival. Thus in some situations it is reasonable
to assume that the probability of survival in the presence of a morbid
condition varies for individuals in a cohort, whether one is interested
in just one or more than one component. Also, it seems reasonable to
assume that such probability is a function of one or more variables
(like age, sex). Feigl and Zelen [1965] suggested an exponential model
to describe the survival of leukemia patients: the mean survival of
each individual is assumed to be a linear function of the white blood
cell count at diagnosis, the concomitant variable. In symbols:
$$E(T_i) = \frac{1}{\lambda_i} = a + b\,x_i,$$
where $T_i$ is the survival time for the i-th patient in the cohort;
$\lambda_i$, the corresponding exponential parameter; $x_i$, the
corresponding concomitant variable; $a$ (the effect due to group
membership) and $b$ being parameters to be estimated.
An extension of this work was done by Zippin and Armitage [1966]
in order to permit maximum likelihood estimation of the parameters of
the linear regression where not all patients in a follow-up study have
died by the end of the study, i.e., the extension allows for censored
observations.
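A minimal sketch of the likelihood that such an extension has to maximize is given below, assuming the linear-mean exponential model described above. The function name `exp_regression_loglik` and its argument names are hypothetical, and the maximization step (which in practice is iterative) is left out. A death contributes the log density, while a censored time contributes only the log survival probability.

```python
import math
from typing import Sequence


def exp_regression_loglik(a: float, b: float,
                          times: Sequence[float],
                          x: Sequence[float],
                          died: Sequence[bool]) -> float:
    """Log-likelihood of exponential survival times with mean a + b * x_i."""
    total = 0.0
    for t, xi, d in zip(times, x, died):
        mean = a + b * xi
        if mean <= 0:
            return float("-inf")  # the linear mean must stay positive
        if d:
            total += -math.log(mean)  # extra density term log(1 / mean) for a death
        total += -t / mean  # survival term, present for every patient
    return total


# With b = 0 and no censoring the maximizing value of a is the sample mean.
times, x = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
at_mean = exp_regression_loglik(2.0, 0.0, times, x, [True, True, True])
below = exp_regression_loglik(1.5, 0.0, times, x, [True, True, True])
above = exp_regression_loglik(2.5, 0.0, times, x, [True, True, True])
# One death at t = 1 and one patient still alive at t = 2.
with_censoring = exp_regression_loglik(3.0, 0.0, [1.0, 2.0], [0.0, 0.0], [True, False])
```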
Breslow [1974] went one step further by presenting a formulation
with more than one concomitant variable. It should be pointed out,
however, that these works assume a constant failure rate over the entire
period of observation, and this raises the question of how legitimate
that assumption is. On the other hand, it is not entirely clear how a
concomitant variable adjusts the survival function as expressed through
the failure function.
Nonparametrically, and still allowing for concomitant variables,
a considerable amount of work has been done in the past two years,
mainly as a result of the various possible extensions created by the
publication of Cox's [1972] article. As mentioned before, a number of
concomitant variables describing the cause of a disease, medical status
and clinical stage of disease are recorded when a patient is taken into
a study. Because such variables seem important, Cox suggests an
exponential relation between the failure function and concomitant
variables. He was then led to consider a general failure function of
the form
$$\lambda(t; \mathbf{z}) = \exp(\mathbf{z}\boldsymbol{\beta})\,\lambda_0(t), \qquad t > 0,$$
where $\mathbf{z} = (z_1, z_2, \ldots, z_p)$ represents a 1 x p vector
of concomitant variables for an individual in the cohort;
$\boldsymbol{\beta}$ represents the corresponding p x 1 vector of
parameters; and $\lambda_0(t)$ is the unknown failure function for the
condition $\mathbf{z} = \mathbf{0}$ or $\boldsymbol{\beta} = \mathbf{0}$.
Censoring is included by supposing that the only individuals subject to
it are those actually censored. It is worth mentioning that in the
absence of censoring, an obvious and very fruitful approach to the
problem of adjusting for differences in concomitant variables is by
means of the powerful method of least squares linear regression.
Depending on the type of data being analysed, transformations may be
necessary to achieve approximate normality and homoscedasticity.
However, this approach is not easily adapted for arbitrarily censored
data.
The estimation of $\boldsymbol{\beta}$ in the model was carried out by
Cox through conditional likelihood arguments. Let $R(t_i)$ denote the
risk set, which comprises the individuals whose failure time is at
least $t_i$. Also, let $r_i$ be the number of individuals selected to
die from $R(t_i)$. Then the conditional likelihood is the product of
the conditional distributions of the $r_i$ ($i = 1, 2, \ldots, k$). In
this way he regards the experiments at each time point as independent.
Then it can be shown that the likelihood of $\boldsymbol{\beta}$ is
$L(\boldsymbol{\beta})$
At this point, an interesting fact is worth mentioning. Kalbfleisch
and Prentice [1973], introducing the device of considering each of the
withdrawals as censored at the preceding uncensored survival time,
obtained the same likelihood expression given above, by working instead
with the rank statistics of the ordered failure times. The only
difference between the two approaches lies in the fact that the latter
derived the likelihood from the marginal distribution of the ranks.
So, Kalbfleisch and Prentice used only the information contained in
the order of the failure times, while Cox used the information at the
points of failure.
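As an illustration of a likelihood built from risk sets, the sketch below evaluates the commonly quoted partial log-likelihood for the case of no tied failure times, in which each death contributes the ratio of its own $\exp(\mathbf{z}\boldsymbol{\beta})$ to the sum of that quantity over the risk set. This is an assumed textbook form, not necessarily the exact expression used in the text, and the function name `cox_partial_loglik` and its arguments are hypothetical.

```python
import math
from typing import Sequence


def cox_partial_loglik(beta: Sequence[float],
                       times: Sequence[float],
                       died: Sequence[bool],
                       z: Sequence[Sequence[float]]) -> float:
    """Partial log-likelihood, assuming no tied failure times."""
    # Linear predictor z_i * beta for every individual.
    eta = [sum(b * zij for b, zij in zip(beta, zi)) for zi in z]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, died)):
        if not d_i:
            continue  # censored individuals only appear in risk sets
        # Risk set R(t_i): everyone whose observed time is at least t_i.
        denom = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(denom)
    return loglik


# With beta = 0 every ordering of three deaths is equally likely: 1 / 3!.
null_value = cox_partial_loglik([0.0], [1, 2, 3], [True] * 3, [[0.0], [1.0], [2.0]])
# Two deaths, the individual with z = 1 failing first.
two_deaths = cox_partial_loglik([1.0], [1, 2], [True, True], [[1.0], [0.0]])
```

Only the ordering of the failure times enters the computation, which is consistent with the rank-based derivation mentioned above.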
When there are ties, resulting from grouping of the data, the
continuous model proposed by Cox becomes complicated to apply. In
order to estimate the underlying survival distribution under such
circumstances, a discrete-time model, similar to that of Kaplan and
Meier, was considered. The conditional probability, $p_i(\mathbf{z})$,
of death at $t_i$ given survival at $t_i - 0$, for an individual with
concomitant variable $\mathbf{z}$, was assumed to satisfy the linear
logistic relation
$$\frac{p_i(\mathbf{z})}{1 - p_i(\mathbf{z})} = \exp(\mathbf{z}\boldsymbol{\beta})\,\frac{p_i}{1 - p_i},$$
where $p_i$ represents the corresponding parameter for the underlying
distribution. The maximum likelihood estimator of $p_i(\mathbf{z})$ for
fixed $\boldsymbol{\beta}$ can be obtained by the usual iterative
procedure.
The trouble with the above model occurs when very small partitions
of the whole period of observation are considered, since it
would be necessary to bring in increasing numbers of observations.
The inconsistency raised by this situation was somewhat eased by
Kalbfleisch and Prentice [1973], who suggested a different
discrete-time model for estimation of the underlying survival
distribution which allows a consistent set of distributions. In symbols
their model is
$$1 - p_i(\mathbf{z}) = (1 - p_i)^{\exp(\mathbf{z}\boldsymbol{\beta})}.$$
The estimation procedure here is again the usual one: iterative
calculations are required at each failure time, $t_i$, to estimate the
probability $p_i$, after which the previously derived maximum
likelihood estimate of $\boldsymbol{\beta}$ is inserted.
1.2. Bivariate life tables
One can extend the life table arguments by adding one more
component. Accordingly, the concept of survival time is extended to
more than one component. One might wish to look at two survival times
jointly, particularly when they are dependent. If $T_1$, $T_2$ denote
times to failure, the bivariate (joint) survival function of $T_1$,
$T_2$ is symbolized by
$$P(t_1, t_2) = P[T_1 > t_1,\ T_2 > t_2].$$
Data now may be censored on neither, one or both components.
The major role played by the bivariate survival function in
applications is readily seen from the following examples.
Once again, in the field of life testing, one might be interested in
knowing something about the distribution of service times that may be
expected of a two-component system. For instance, one may wish to
determine the distribution of lifetimes measuring the span from the time
the light bulbs are placed in service to the time they burn out. Note
that in a two-bulb system they both can fail simultaneously or
separately.
Suppose one is interested in the lifetime of a couple after
marriage. The observations on each couple consist of two associated
components: the lifetimes of husband and wife. Insurance companies
are interested in knowing how much a couple should pay so that the
surviving member can receive a certain amount in the event of death
of either.
In the medical literature there is strong suspicion about the
possible association between Parkinson disease and influenza.
One may
wish to know such quantities as the expected duration of influenza for
a given duration of Parkinson disease and how they are correlated.
To demographers it is important to know quantitatively the
association of the duration of post-partum amenorrhea with location,
since the former is a major factor in determining fertility, and, also,
because it is positively correlated with anovulation.
A natural way to study such problems is to represent the bivariate survival function and related functions through bivariate
life tables.
The present work intends to look at those, particularly
when the components are statistically dependent.
1.2.1. Definition of the main functions
Since in many physical situations, particularly those in which
modes of failure of various types of equipment may be physically
related, the degree of dependence between the lifetimes of the components
is crucial, it is worthwhile to recall here the several ways in which
the concept of failure rate may appear. In what follows this notion will
be reviewed in a bivariate context.
Basu [1971] defines the bivariate failure rate, given an absolutely
continuous bivariate distribution function, $F(t_1, t_2)$, with a
density function $f(t_1, t_2)$, as
$$\lambda(t_1, t_2) = \frac{f(t_1, t_2)}{P[T_1 > t_1,\ T_2 > t_2]},$$
and that is just the conditional probability density function of
$(T_1, T_2)$ given the event $T_1 > t_1$, $T_2 > t_2$. If there is
independence between the two components, then one has
$$\lambda(t_1, t_2) = \lambda_1(t_1)\,\lambda_2(t_2);$$
in other words, the bivariate failure rate equals, in this case, the
product of the corresponding univariate failure rates.
In many situations, as mentioned earlier, the two components
are associated and thus the failure rate of each component depends on
the lifetime of the other. This fact is hardly seen from Basu's
definition. So for many practical purposes his definition does not
carry any intuitive appeal. However, Cox [1972] has defined failure
rates for a bivariate context in a way that allows a complete
physical interpretation.
In fact, following his notation, one defines
$$\lambda_{10}(t) = \lim_{\Delta t \to 0} \frac{P[t \le T_1 < t + \Delta t \mid t \le T_1,\ t \le T_2]}{\Delta t},$$
so that $\lambda_{10}(t)\,dt$ represents the probability that the first
component fails (or dies) in the interval (t, t + dt) given that both
components have survived to time t. Also, one defines
$$\lambda_{21}(t \mid u) = \lim_{\Delta t \to 0} \frac{P[t \le T_2 < t + \Delta t \mid t \le T_2,\ T_1 = u]}{\Delta t} \qquad \text{for } u < t,$$
and thus $\lambda_{21}(t \mid u)\,dt$ represents the probability that
the second component fails (or dies) in the interval (t, t + dt) given
that the first component failed at u. In analogous manner one defines
$\lambda_{20}(t)$, $\lambda_{12}(t \mid u)$.
The advantage of Cox's definition (over Basu's) is evident: it
enables one to give a practical interpretation to each one of the four
failure rates so defined, and one can also reconstruct the joint
survival function.
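The reconstruction can be sketched as follows, using the standard relation between failure rates and survival probabilities rather than a formula quoted from the text. Both components survive to time t only if neither of the two initial failure rates has acted, and for $t_1 < t_2$ the joint density multiplies the rate of the first failure, the rate of the second failure given the first, and the probability of no other failure in between:

```latex
\begin{align*}
P[T_1 > t,\ T_2 > t]
  &= \exp\!\left\{-\int_0^{t} \bigl[\lambda_{10}(u) + \lambda_{20}(u)\bigr]\,du\right\},\\[4pt]
f(t_1, t_2)
  &= \lambda_{10}(t_1)\,\lambda_{21}(t_2 \mid t_1)\,
     \exp\!\left\{-\int_0^{t_1} \bigl[\lambda_{10}(u) + \lambda_{20}(u)\bigr]\,du
                  - \int_{t_1}^{t_2} \lambda_{21}(u \mid t_1)\,du\right\},
     \qquad t_1 < t_2 .
\end{align*}
```

The case $t_2 < t_1$ follows by interchanging the roles of the two components (with $\lambda_{20}$ and $\lambda_{12}$ in place of $\lambda_{10}$ and $\lambda_{21}$), and the joint survival function is then obtained by integrating the density over the region of interest.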
In reliability contexts there are situations in which the failure
of one component puts additional responsibility on the remaining one,
which becomes more prone to failure as it gets older. With that in
mind, Brindley and Thompson [1972] define a bivariate increasing failure
rate if and only if
$$\frac{P(t_1 + \Delta,\ t_2 + \Delta)}{P(t_1, t_2)} \qquad (\Delta > 0, \text{ fixed})$$
is a decreasing function of $t_1$, $t_2$ for every subset of both
variables. In other words, they merely require that the residual life of
all subsets of a set of two survival times must be increasing. One may
define decreasing failure rate in a similar way.
No matter which definition one uses, the important problem that
follows is how to estimate the failure rate in a bivariate context.
Again, one is confronted with the problem of choosing a bivariate
distribution law that fits the type of data one is handling. Here,
however, the situation is more complicated because the natural way would
be to extend the definition of the (univariate) exponential distribution
to the plane, and that cannot be done uniquely. In fact, several
authors have proposed bivariate exponential distributions as a basic
model (Freund [1961], Marshall and Olkin [1967], Gumbel [1960]).
Because of its appealing physical interpretation, Freund's bivariate
distribution has been used in the present work as a basic model when
one restricts attention to bivariate continuous data with censoring
in either component. Unfortunately, its marginal distributions are
not exponential.
In the univariate context, the exponential distribution has a
number of desirable mathematical properties, but its applicability is
limited because of the so-called 'lack of memory' property: if the
lifetime, T, of an item has the exponential distribution, previous
use does not affect its future lifetime. In other words, if an item
has not failed up to a time t, the probability distribution of its
future lifetime, T-t, is the same as if the item were quite new and
had just been placed in use at time t. (The univariate exponential
distribution is the only one with this property.) In medical
follow-up studies one might be reluctant to postulate such an alternative
since, after diagnosis of a disease, medical treatment may cause the
failure rate to decrease for a period of time. Thus, in such cases,
the 'lack of memory' property will hardly hold.
Transplanting this property to a bivariate context, one can say that
a bivariate random variable $(T_1, T_2)$ has the 'lack of memory'
property if and only if
$$P[T_1 > s_1 + t,\ T_2 > s_2 + t \mid T_1 > s_1,\ T_2 > s_2] = P[T_1 > t,\ T_2 > t] \qquad \text{for } s_1, s_2, t > 0.$$
In words: when two types of failure, whose corresponding times to
failure are $T_1, T_2 \ge 0$, are acting upon an individual, the
probability that both causes are still acting t units of time from now
is the same as if both causes were new (i.e., the residual life of both
components is independent of their present ages). One can show easily
that Freund's bivariate exponential distribution has this property. On
the other hand, Marshall and Olkin [1967] have shown (see Theorem 5.1 of
their article) that a bivariate exponential distribution has either
exponential marginals (which is not the case of Freund's bivariate
distribution) or the 'lack of memory' property, but not both
simultaneously. It should be pointed out that in the context of
bivariate life tables, the 'lack of memory' property may be of doubtful
advantage. In fact, in a two-component system, correlation is likely to
arise because one component possesses in some sense a memory of the time
to failure of the other.
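The bivariate 'lack of memory' property can be checked numerically with a small simulation. The sketch below assumes the usual description of a Freund-type model: both components have constant failure rates (here called `a` and `b`) until the first failure, after which the surviving component's rate changes to `a2` or `b2`. The function names, parameter names and numerical values are hypothetical, chosen only for illustration.

```python
import math
import random
from typing import List, Tuple


def simulate_freund(n: int, a: float, b: float, a2: float, b2: float,
                    seed: int = 0) -> List[Tuple[float, float]]:
    """Draw n pairs (T1, T2) from a Freund-type two-component model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        first = rng.expovariate(a + b)  # time of the first failure
        if rng.random() < a / (a + b):  # component 1 fails first
            pairs.append((first, first + rng.expovariate(b2)))
        else:  # component 2 fails first
            pairs.append((first + rng.expovariate(a2), first))
    return pairs


def joint_survival(pairs: List[Tuple[float, float]], t1: float, t2: float) -> float:
    """Empirical estimate of P[T1 > t1, T2 > t2]."""
    return sum(1 for x, y in pairs if x > t1 and y > t2) / len(pairs)


pairs = simulate_freund(200_000, a=1.0, b=0.5, a2=2.0, b2=1.5)
s1, s2, t = 0.3, 0.6, 0.4
conditional = joint_survival(pairs, s1 + t, s2 + t) / joint_survival(pairs, s1, s2)
unconditional = joint_survival(pairs, t, t)
print(conditional, unconditional, math.exp(-(1.0 + 0.5) * t))
```

Up to simulation error, the conditional and unconditional probabilities should agree, and both should be close to $\exp\{-(a + b)t\}$, the probability that neither component fails in an interval of length t while both are functioning.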
1.2.2. Concomitant variables
Once a definition of failure rate in a bivariate context is
agreed upon, the problem of considering concomitant variables can be
handled in the same way it is treated in the univariate life table
situation. Parametrically or not, the simplest way is to have the
same function of the concomitant variable,
$\mathbf{z} = (z_1, z_2, \ldots, z_p)$, multiplying the failure
functions. In this case the estimation of the various parameters that
will eventually appear during the calculation process is conducted in
the usual manner, although numerical procedures are very likely to be
needed.
1.3.
Contents of the present work
The main aim of this work is to develop and apply some estimation procedures for bivariate life tables.
The results obtained are a straightforward generalization of the ones already existing for the univariate case, apart, of course, from differences due mainly to some conditions imposed for technical reasons.
In the following chapter the problem of estimation based on a certain parametric model is considered.
The problem there is treated in a fairly general way in that random censoring is allowed.
Section 2.1 introduces the parametric model to be used, namely, the bivariate exponential distribution proposed by Freund [1961].
This model was
chosen mainly because of the immediate practical interpretation that it
suggests.
The only drawback resides in the fact that the model comprises four parameters, which makes the handling of the equations one faces cumbersome.
After a brief review of the estimates for the
parameters in the case of complete observations, Section 2.2 presents
the expression of the likelihood for the case of random censoring.
Estimates in this case are obtained by iterative techniques, the only
way to handle the problem for a given set of data.
The data used were
generated through a computer as described in Section 2.3.
Several samples were obtained in that way, with a subroutine being used to obtain the maximum likelihood estimates for the parameters of the model.
As voiced elsewhere in this work, concomitant variables give a better
insight in that they show certain characteristics of the data that
would not have been detected, had those variables been left out.
Section 2.4 aims to look precisely at that; and in order to handle the
problem in a complete analytical way some approximations have been used
for the estimates.
Towards the end of Chapter 2, in its last section,
the various failure rates that appear in a bivariate life table
context are shown in a more general situation.
There they are permitted
to take the form of step functions along the plane, so that they are not
assumed constant over the entire period of observation, a severe
restriction encountered in many works dealing with (univariate) life
tables.
Chapter 3 starts by briefly presenting some estimates based on
the empirical distribution function.
When one has complete data at hand, the problem of estimating the parameters in a nonparametric context is very easy, since in this case the classical multinomial scheme applies.
On the other hand, when censoring is considered the estimation
procedure will depend, of course, on how it occurs.
Section 3.2
assumes certain conditions under which the estimates proposed there
are valid.
The results in that section are largely generalizations of their univariate counterparts.
In the next section, an attempt is made to establish some conditions under which the Kaplan-Meier estimate may be extended to a bivariate context.
The last
section presents some nonparametric estimates for the various failure
rates that have been considered in this work.
Finally, Chapter 4 looks at some hypothesis testing that may be
of interest, such as the test for independence between both components
for the case of continuous data when censoring is considered.
CHAPTER II
PARAMETRIC ESTIMATION
Chapter 2 treats the problem of estimation in a bivariate life
table context when a specific parametric model is assumed.
Because ran-
dom censoring is included in the model, one will see that the only way
to obtain estimates is through numerical procedures.
Generality is
introduced in the last section when failure rates are allowed to vary
as step functions.
2.1.
The model
2.1.1.
The choice of the model
In problems concerning causes of failure, the exponential distribution has been shown to be a reasonable basis for a model.
However,
when one faces a two-component system there is no unique way, as mentioned earlier, of providing a bivariate analogue of the (univariate)
exponential distribution.
In particular, if exponential margins are
required for the bivariate analogue, it has been shown by M. Frechet
that an analogue may be provided in infinitely many ways.
However, the
property of exponential margins is not the only reasonable requirement
for a bivariate analogue to the exponential distribution.
One may wish to retain in two dimensions the 'lack of memory' property.
In this case, the property of exponential margins must be given up as shown by Marshall and Olkin [1967].
There are a number of different bivariate exponential distributions available in the literature.
Freund [1961] has provided a very
reasonable bivariate analogue to the (univariate) exponential.
2.1.2.
Description of the model
The bivariate exponential distribution proposed by Freund arises in the following way: suppose a system is comprised of two components, whose lifetimes are $T_1$ and $T_2$, respectively.
Let the density functions, when both components are acting, be
$$f_{T_1}(t_1) = \alpha e^{-\alpha t_1}, \qquad f_{T_2}(t_2) = \beta e^{-\beta t_2}.$$
The lifetimes are dependent in that a failure of either component changes the failure rate of the lifetime distribution of the other component.
Hence, when the component with lifetime $T_1$ fails at $t_1$, the failure rate of the other becomes $\beta'$; and in this case the joint density is
$$\alpha\beta' \exp[-\beta' t_2 - (\alpha+\beta-\beta')t_1], \qquad 0 < t_1 < t_2.$$
When the component with lifetime $T_2$ fails at $t_2$, the other acquires a failure rate $\alpha'$; the contribution to the joint density in this case is
$$\beta\alpha' \exp[-\alpha' t_1 - (\alpha+\beta-\alpha')t_2], \qquad 0 < t_2 < t_1.$$
Since there is no other dependence between the two components, the joint density function of $(T_1,T_2)$ is then
$$f(t_1,t_2) = \begin{cases} \alpha\beta' \exp[-\beta' t_2 - (\alpha+\beta-\beta')t_1], & 0 < t_1 < t_2, \\ \beta\alpha' \exp[-\alpha' t_1 - (\alpha+\beta-\alpha')t_2], & 0 < t_2 < t_1. \end{cases} \tag{2.1.2.1}$$
This model may realistically represent a situation where two
components perform similar functions, and the failure of one component
puts additional responsibility on the remaining one. For instance, in
such diverse situations as the failure of paired airplane engines, or
the failure of paired organs such as lungs or kidneys.
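The mechanism just described can be simulated directly. The sketch below is only an illustration under stated assumptions (the function name is hypothetical, and this constructive method is not the simulation procedure used later in this work): two exponential lifetimes with rates $\alpha$ and $\beta$ compete, and once one component fails, the survivor receives a fresh exponential residual life with rate $\beta'$ or $\alpha'$, which is legitimate because of the lack of memory of the exponential distribution.

```python
import random

def sample_freund(a, a_p, b, b_p, rng):
    """Draw one pair (t1, t2) from Freund's bivariate exponential model.

    a, b     : failure rates while both components are acting
    a_p, b_p : failure rate of the surviving component after a failure
    rng      : a random.Random instance
    """
    x = rng.expovariate(a)  # candidate failure time of component 1
    y = rng.expovariate(b)  # candidate failure time of component 2
    if x < y:
        # component 1 fails first; component 2 continues at rate b_p
        return x, x + rng.expovariate(b_p)
    # component 2 fails first; component 1 continues at rate a_p
    return y + rng.expovariate(a_p), y
```

Under this construction the first component fails first with probability $\alpha/(\alpha+\beta)$, and the time to the first failure is exponential with rate $\alpha+\beta$.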
2.2.
Estimation procedures
2.2.1.
Complete observations
Considering a random sample of N pairs of observations $(T_{1i},T_{2i})$ $(i = 1,2,\ldots,N)$ on N two-component systems, Freund [1961] obtained the following set of simultaneous maximum likelihood estimates
$$\hat\alpha = \frac{r}{S_1+S_4}, \qquad \hat\alpha' = \frac{N-r}{S_3-S_4}, \qquad \hat\beta = \frac{N-r}{S_1+S_4}, \qquad \hat\beta' = \frac{r}{S_2-S_1}, \tag{2.2.1.1}$$
where r denotes the number of first components failing before the second component;
$S_1$, the sum of the lifetimes of the first component when it fails first;
$S_2$, the sum of the lifetimes of the second component when the first component fails first;
$S_3$, the sum of the lifetimes of the first component when the second component fails first;
$S_4$, the sum of the lifetimes of the second component when it fails first.
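The estimates (2.2.1.1) are simple functions of the count r and the four sums. A minimal sketch of the computation follows; the function and key names are hypothetical, ties between the two lifetimes are assumed not to occur, and both orders of failure are assumed to be present in the sample.

```python
def freund_mle(pairs):
    """Maximum likelihood estimates (2.2.1.1) from complete pairs (t1, t2)."""
    first = [(t1, t2) for t1, t2 in pairs if t1 < t2]   # component 1 fails first
    second = [(t1, t2) for t1, t2 in pairs if t2 < t1]  # component 2 fails first
    r, n_minus_r = len(first), len(second)
    s1 = sum(t1 for t1, _ in first)    # lifetimes of comp. 1 when it fails first
    s2 = sum(t2 for _, t2 in first)    # lifetimes of comp. 2 when comp. 1 fails first
    s3 = sum(t1 for t1, _ in second)   # lifetimes of comp. 1 when comp. 2 fails first
    s4 = sum(t2 for _, t2 in second)   # lifetimes of comp. 2 when it fails first
    return {
        "alpha": r / (s1 + s4),
        "alpha_prime": n_minus_r / (s3 - s4),
        "beta": n_minus_r / (s1 + s4),
        "beta_prime": r / (s2 - s1),
    }
```

For instance, the four pairs (1,3), (2,5), (4,1), (6,2) give r = 2, $S_1 = 3$, $S_2 = 8$, $S_3 = 10$, $S_4 = 3$.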
Also Freund has shown that the variances for the above estimates are
$$V(\hat\alpha) = \frac{N\alpha\,(N\alpha + \beta(N-1))}{(N-1)^2(N-2)}, \qquad V(\hat\beta) = \frac{N\beta\,(N\beta + \alpha(N-1))}{(N-1)^2(N-2)}, \tag{2.2.1.2}$$
(and for fixed r)
$$V(\hat\alpha') = \frac{(N-r)^2\alpha'^2}{(N-r-1)^2(N-r-2)}, \qquad V(\hat\beta') = \frac{r^2\beta'^2}{(r-1)^2(r-2)}, \qquad r > 2.$$
2.2.2.
Censored observations
In this section the parameters $\alpha,\alpha',\beta,\beta'$ of the bivariate exponential distribution proposed by Freund are estimated by the method of maximum likelihood when random censoring is present.
It is assumed that observations may be randomly censored on neither, one or both components.
In practice that is often the case.
For instance, in medical
follow-up studies, in place of complete observation on each individual,
there is usually a cut-off date beyond which a surviving individual will no longer be followed.
Such curtailment of observation may be due
either to closing or cutting off the study while some individuals still
survive, or to the removal of some individuals from observation for reasons other than the one of interest.
In general, one may suppose that
each item in the sample can be regarded as potentially having both a
random censoring time and a failure time, which are assumed to be statistically independent.
Also, suppose that only the smaller of the two
such times is observed.
This type of data is called randomly censored.
But despite the incompleteness of the data, it is desired to estimate
the parameters in the underlying model and other quantities of interest.
Consider then a random sample of N pairs of observations $(T_{1i},T_{2i})$ on a two-component system from a population with joint density function given by equations (2.1.2.1), so that the i-th item in the sample has lifetime equal to $t_{1i}$ for the first component and $t_{2i}$ for the second.
Let $T_0$ be the random censoring time, independent of $(T_1,T_2)$, and with a given cumulative distribution function $G(t_0)$ and corresponding density function $g(t_0)$.
Under this scheme the sample may be characterized as follows:

Possible Cases          Observed Values
T_0 < T_1 < T_2         (t_0, t_0)
T_0 < T_2 < T_1         (t_0, t_0)
T_1 < T_2 < T_0         (t_1, t_2)
T_1 < T_0 < T_2         (t_1, t_0)
T_2 < T_1 < T_0         (t_1, t_2)
T_2 < T_0 < T_1         (t_0, t_2)
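The likelihood function for the data can be written, neglecting constants,
$$L = \prod_{S_{n_1}(i)} \{P(t_{0i},t_{0i})\,g(t_{0i})\} \cdot \prod_{S_{n_2}(i)} \{f(t_{1i},t_{2i})\,(1 - G(t_{2i}))\} \cdot \prod_{S_{n_3}(i)} \Big\{\int_{t_{0i}}^{\infty} f(t_{1i},t_2)\,dt_2 \cdot g(t_{0i})\Big\}$$
$$\qquad \cdot \prod_{S_{n_4}(i)} \{f(t_{1i},t_{2i})\,(1 - G(t_{1i}))\} \cdot \prod_{S_{n_5}(i)} \Big\{\int_{t_{0i}}^{\infty} f(t_1,t_{2i})\,dt_1 \cdot g(t_{0i})\Big\} \tag{2.2.2.1}$$
where
$$P(t_1,t_2) = \Pr[T_1 > t_1,\ T_2 > t_2] = \int_{t_1}^{\infty}\int_{t_2}^{\infty} f(u,v)\,dv\,du$$
and it is often called the joint survival function; $S_{n_1}(i)$ denotes the set of the $n_1$ items for which $T_0 < \min(T_1,T_2)$.
Similarly, one defines the other sets $S_{n_2}(i),\ldots,S_{n_5}(i)$, which correspond to the cases $T_1 < T_2 < T_0$, $T_1 < T_0 < T_2$, $T_2 < T_1 < T_0$ and $T_2 < T_0 < T_1$, respectively.
Note that additive constants which depend on the observations but not on the parameters are omitted from the likelihood expression.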
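The joint survival function above under the present model is given by
$$P(t_1,t_2) = \exp\{-(\alpha+\beta)t_1\}\Big[\frac{\beta-\beta'}{\alpha+\beta-\beta'}\exp\{-(\alpha+\beta)(t_2-t_1)\} + \frac{\alpha}{\alpha+\beta-\beta'}\exp\{-\beta'(t_2-t_1)\}\Big], \qquad 0 \le t_1 \le t_2, \tag{2.2.2.2}$$
$$P(t_1,t_2) = \exp\{-(\alpha+\beta)t_2\}\Big[\frac{\alpha-\alpha'}{\alpha+\beta-\alpha'}\exp\{-(\alpha+\beta)(t_1-t_2)\} + \frac{\beta}{\alpha+\beta-\alpha'}\exp\{-\alpha'(t_1-t_2)\}\Big], \qquad 0 \le t_2 \le t_1. \tag{2.2.2.3}$$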
(2.2.2.4)
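The log likelihood expressed from equation (2.2.2.1) becomes
$$\ln L = \sum_{S_{n_1}(i)} \{\ln P(t_{0i},t_{0i}) + \ln g(t_{0i})\} + \sum_{S_{n_2}(i)} \{\ln f(t_{1i},t_{2i}) + \ln(1 - G(t_{2i}))\} + \sum_{S_{n_3}(i)} \Big\{\ln \int_{t_{0i}}^{\infty} f(t_{1i},t_2)\,dt_2 + \ln g(t_{0i})\Big\}$$
$$\qquad + \sum_{S_{n_4}(i)} \{\ln f(t_{1i},t_{2i}) + \ln(1 - G(t_{1i}))\} + \sum_{S_{n_5}(i)} \Big\{\ln \int_{t_{0i}}^{\infty} f(t_1,t_{2i})\,dt_1 + \ln g(t_{0i})\Big\}. \tag{2.2.2.5}$$
To proceed further with the likelihood approach, an explicit parametric representation for the underlying survival distribution is needed.
Also assume that the censoring scheme follows an exponential distribution with a known parameter $\lambda$.
Thus, replacing in the log likelihood above the densities and cumulative distribution functions by their corresponding expressions, one has
$$\ln L(\alpha,\alpha',\beta,\beta') = \sum_{S_{n_1}(i)\cup S_{n_3}(i)\cup S_{n_5}(i)} \{\ln\lambda - \lambda t_{0i}\} - (\alpha+\beta)\sum_{S_{n_1}(i)} t_{0i} + \sum_{S_{n_2}(i)} \{\ln \alpha\beta' - \beta' t_{2i} - (\alpha+\beta-\beta')t_{1i} - \lambda t_{2i}\}$$
$$\qquad + \sum_{S_{n_3}(i)} \{\ln\alpha - \beta' t_{0i} - (\alpha+\beta-\beta')t_{1i}\} + \sum_{S_{n_4}(i)} \{\ln \beta\alpha' - \alpha' t_{1i} - (\alpha+\beta-\alpha')t_{2i} - \lambda t_{1i}\} + \sum_{S_{n_5}(i)} \{\ln\beta - \alpha' t_{0i} - (\alpha+\beta-\alpha')t_{2i}\}. \tag{2.2.2.6}$$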
At this point, to carry further the problem of estimating the four parameters $\alpha,\alpha',\beta,\beta'$, one must use iterative procedures, since the estimates cannot be derived analytically.
The following section will
illustrate the numerical procedures used for getting the appropriate
estimates from randomly censored observations.
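Before turning to the numerical work, the censored log likelihood can be written down as a short function. This is a sketch under stated assumptions, not the routine used in this work: each observation is coded (hypothetically) as (x1, x2, d1, d2), where x1 and x2 are the observed times and d1, d2 indicate whether the corresponding component was seen to fail; censoring is exponential with known rate lam and one censoring time applies to both components.

```python
import math

def censored_loglik(obs, a, a_p, b, b_p, lam):
    """Log likelihood of Freund's model under exponential random censoring.

    obs: iterable of (x1, x2, d1, d2); d1/d2 are True when the failure of
    component 1/2 was observed, False when it was censored.
    """
    total = 0.0
    for x1, x2, d1, d2 in obs:
        if not d1 and not d2:            # censored before either failure
            t0 = x1
            total += math.log(lam) - lam * t0 - (a + b) * t0
        elif d1 and d2 and x1 < x2:      # both observed, component 1 first
            total += (math.log(a * b_p) - b_p * x2
                      - (a + b - b_p) * x1 - lam * x2)
        elif d1 and d2:                  # both observed, component 2 first
            total += (math.log(b * a_p) - a_p * x1
                      - (a + b - a_p) * x2 - lam * x1)
        elif d1:                         # component 2 censored at t0 = x2
            total += (math.log(a) - b_p * x2 - (a + b - b_p) * x1
                      + math.log(lam) - lam * x2)
        else:                            # component 1 censored at t0 = x1
            total += (math.log(b) - a_p * x1 - (a + b - a_p) * x2
                      + math.log(lam) - lam * x1)
    return total
```

Any general-purpose optimizer can then be applied to the negative of this function over the four rates; since the censoring rate is treated as known, the terms involving only lam do not affect the location of the maximum.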
2.3.
Simulation of data
If appropriate data were readily available, one could start using the iterative techniques at once for estimating the parameters in the model.
However, real data are not available; a computer simulation procedure is then used to generate data from the underlying model.
The usual way of constructing bivariate samples for simulation purposes is to obtain an observation from the marginal distribution of one of the variables and then select the other observation from the conditional distribution which results.
The procedure may be carried out as follows:
(i) firstly, a sequence of random numbers $\{u_i\}_{i=1}^{n}$, uniformly distributed on the interval [0,1], is generated. The random variable generator VARGEN(*) was used for this purpose;
(ii) secondly, the first component of the bivariate exponential distribution is generated by using the following standard result: If G is any cumulative distribution function, it is possible to choose h such that
$$P[h(X) \le x] = G(x), \qquad -\infty < x < \infty,$$
where X is uniformly distributed over [0,1]. Labelling $\{t_{1i}\}_{i=1}^{n}$ the random sequence of numbers representing the first component, one has
$$t_1 = F_{T_1}^{-1}(u),$$
where u is an element of the sequence $\{u_i\}_{i=1}^{n}$ and $F_{T_1}$ the marginal cumulative distribution function of the first component, $T_1$;
(iii) next, the second component of the bivariate distribution is generated, forming the sequence $\{t_{2i}\}_{i=1}^{n}$ where an element of this sequence is given by
$$t_2 = F_{T_2|T_1}^{-1}(u \mid t_1),$$
where
$$F_{T_2|T_1}(t_2 \mid t_1) = \begin{cases} \dfrac{\alpha'\beta \exp\{-\alpha' t_1\}\,(1 - \exp\{-(\alpha+\beta-\alpha')t_2\})}{(\alpha-\alpha')(\alpha+\beta)\exp\{-(\alpha+\beta)t_1\} + \alpha'\beta\exp\{-\alpha' t_1\}}, & t_2 < t_1, \\[2ex] \dfrac{\alpha'\beta \exp\{-\alpha' t_1\}\,(1 - \exp\{-(\alpha+\beta-\alpha')t_1\}) + \alpha(\alpha+\beta-\alpha')\exp\{-(\alpha+\beta)t_1\}\,(1 - \exp\{-\beta'(t_2-t_1)\})}{(\alpha-\alpha')(\alpha+\beta)\exp\{-(\alpha+\beta)t_1\} + \alpha'\beta\exp\{-\alpha' t_1\}}, & t_2 \ge t_1, \end{cases}$$
which symbolizes the conditional cumulative distribution function of the second component $T_2$, given $T_1 = t_1$;
(iv) finally, a sequence $\{t_{0i}\}_{i=1}^{n}$ of exponentially distributed random numbers is generated which represents the censoring variable.

(*) VARGEN uses the Tausworthe random number generator, which is described by J. Whittlesey in Communications of the ACM, vol. 11, Sept. 1968, p. 641.
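Steps (i) to (iv) can be sketched as follows. This is an illustration only: Python's built-in generator replaces VARGEN, the distribution functions are inverted by bisection (an assumption made here for simplicity), the function names are hypothetical, and $\alpha+\beta \ne \alpha'$ is assumed.

```python
import math
import random

def marginal_cdf_t1(t, a, a_p, b, b_p):
    """Marginal distribution function of T1 under Freund's model."""
    d = a + b - a_p  # assumed nonzero
    return 1.0 - ((a - a_p) * math.exp(-(a + b) * t) + b * math.exp(-a_p * t)) / d

def conditional_cdf_t2(t2, t1, a, a_p, b, b_p):
    """Conditional distribution function of T2 given T1 = t1."""
    d = a + b - a_p
    e_ab = math.exp(-(a + b) * t1)
    e_ap = math.exp(-a_p * t1)
    denom = (a - a_p) * (a + b) * e_ab + a_p * b * e_ap
    if t2 < t1:
        return a_p * b * e_ap * (1.0 - math.exp(-d * t2)) / denom
    num = (a_p * b * e_ap * (1.0 - math.exp(-d * t1))
           + a * d * e_ab * (1.0 - math.exp(-b_p * (t2 - t1))))
    return num / denom

def invert(cdf, u):
    """Solve cdf(t) = u for t >= 0 by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(200):          # enlarge the bracket until it contains u
        if cdf(hi) >= u:
            break
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def generate_censored_sample(n, a, a_p, b, b_p, lam, seed=0):
    """Return n tuples (x1, x2, d1, d2): observed times and failure flags."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        t1 = invert(lambda t: marginal_cdf_t1(t, a, a_p, b, b_p), rng.random())
        t2 = invert(lambda t: conditional_cdf_t2(t, t1, a, a_p, b, b_p), rng.random())
        t0 = -math.log(1.0 - rng.random()) / lam   # exponential censoring time
        out.append((min(t1, t0), min(t2, t0), t1 <= t0, t2 <= t0))
    return out
```

The output coding, observed times together with failure indicators, is a convenience chosen for this sketch; the six cases listed in Section 2.2.2 can be read off from the two flags and the ordering of the observed times.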
The results of the iterative procedure are presented in the following examples.
Bivariate samples of size n = 100 were generated from (Freund's) bivariate populations for various values of the parameters.
The corresponding estimates were obtained through the subroutine MAXLIK.(*)
By supplying a set of initial values for the parameters, this subroutine searches the likelihood surface to find maximum likelihood estimates of the parameters and computes an estimate of their variance-covariance matrix.
The search is conducted in such a way as to verify that the estimates correspond to a point on the likelihood surface that is higher than a set of neighboring points.
However, only one such local maximum is found for each set of initial parameter values.
Example 1.
Population parameters chosen as follows:
$\alpha = 0.9$, $\alpha' = 1.1$, $\beta = 0.3$, $\beta' = 0.8$, $\lambda = 0.3$.

Parameter    Initial estimates    Final estimates    Standard error
$\alpha$     0.75000              1.31834            0.21304
$\alpha'$    0.95000              0.54845            0.28025
$\beta$      0.20000              1.13841            0.18873
$\beta'$     0.65000              0.59975            0.21694

Joint log likelihood of the sample (as expressed by equation (2.2.2.6)) = 43.30721 with convergence attained after 10 iterations.

(*) Kaplan, E. G. and Elston, R. C. [1972], A Subroutine Package for Maximum Likelihood Estimation (MAXLIK), UNC Institute of Statistics Mimeo Series, No. 823.

Variance-covariance matrix
$$\hat\Sigma = \begin{pmatrix} 0.045388 & 0 & 0 & 0.000645 \\ 0 & 0.078538 & 0 & 0 \\ 0 & 0 & 0.035619 & 0 \\ 0.000645 & 0 & 0 & 0.047062 \end{pmatrix}$$
Example 2.
Population parameters chosen as follows:
$\alpha = 1.0$, $\alpha' = 1.2$, $\beta = 2.0$, $\beta' = 2.2$, $\lambda = 0.2$.

Parameter    Initial estimates    Final estimates    Standard error
$\alpha$     0.75000              1.02913            0.13208
$\alpha'$    0.90000              1.38171            0.53370
$\beta$      1.50000              3.18110            0.51615
$\beta'$     1.65000              1.02569            0.60518

Joint log likelihood of the sample (as expressed by equation (2.2.2.6)) = 10.68031 with convergence attained after 12 iterations, the search being discontinued due to negligible change in likelihood.

Variance-covariance matrix
$$\hat\Sigma = \begin{pmatrix} 0.017444 & 0 & 0.005392 & 0 \\ 0 & 0.284839 & 0 & 0 \\ 0.005392 & 0 & 0.266412 & 0 \\ 0 & 0 & 0 & 0.366247 \end{pmatrix}$$
2.4.
Failure rates and concomitant variables
2.4.1.
Description of failure rates
In both medical follow-up and reliability contexts it is important to know, given a two-component system, the rate of failure of either component.
The failure function is an attempt to describe mathematically the length of life of a component in the system, whatever that may be.
In a univariate context, typical failure functions are: exponential, gamma, Weibull.
In a two-component system, their bivariate counterparts have been investigated in the recent literature.
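Specifically, adopting the definitions given by Cox [1972], one can write the following failure functions:
$$\lambda_{10}(t_1,t_2) = \lim_{dt_1 \to 0} \frac{\Pr[t_1 \le T_1 \le t_1+dt_1 \mid T_1 \ge t_1,\ T_2 \ge t_2]}{dt_1}$$
or, in words, the probability that the first component, $T_1$, fails in $(t_1, t_1+dt_1)$ given that both components have survived to times $t_1$, $t_2$, and, for $t_1 > t_2$,
$$\lambda_{12}(t_1 \mid t_2) = \lim_{dt_1 \to 0} \frac{\Pr[t_1 \le T_1 \le t_1+dt_1 \mid T_1 \ge t_1,\ T_2 = t_2]}{dt_1},$$
the probability that the first component fails in $(t_1, t_1+dt_1)$ given that the second component, $T_2$, failed at time $t_2$.
Similarly, one defines $\lambda_{20}(t_1,t_2)$ and, for $t_2 > t_1$, $\lambda_{21}(t_2 \mid t_1)$.
Now carrying further the above definitions, one gets
$$\lambda_{10}(t_1,t_2) = \lim_{dt_1 \to 0} \frac{\Pr[t_1 \le T_1 \le t_1+dt_1,\ t_2 \le T_2]}{dt_1\, P(t_1,t_2)}$$
or
$$\lambda_{10}(t_1,t_2) = -\frac{1}{P(t_1,t_2)}\,\frac{\partial P(t_1,t_2)}{\partial t_1} \tag{2.4.1.1}$$
and
$$\lambda_{12}(t_1 \mid t_2) = \lim_{dt_1,dt_2 \to 0} \frac{\Pr[t_1 \le T_1 \le t_1+dt_1,\ t_2 \le T_2 \le t_2+dt_2]}{dt_1\, \Pr[T_1 \ge t_1,\ t_2 \le T_2 \le t_2+dt_2]}$$
or
$$\lambda_{12}(t_1 \mid t_2) = -\frac{\partial^2 P(t_1,t_2)/\partial t_1\,\partial t_2}{\partial P(t_1,t_2)/\partial t_2} \tag{2.4.1.2}$$
where $P(t_1,t_2)$ represents, as before, the bivariate survival function at $(t_1,t_2)$.
Adopting the parametric form given by Freund's bivariate exponential distribution, one has immediately from equation (2.2.2.2) by taking derivatives and from equation (2.4.1.1), for $\alpha + \beta \ne \alpha', \beta'$:
$$\lambda_{10}(t_1,t_2) = \begin{cases} \dfrac{\alpha(\alpha+\beta-\beta')}{\alpha - (\beta'-\beta)\exp\{-(\alpha+\beta-\beta')(t_2-t_1)\}}, & 0 \le t_1 < t_2, \\[2ex] \dfrac{\beta\alpha' - (\alpha'-\alpha)(\alpha+\beta)\exp\{(\alpha+\beta-\alpha')(t_2-t_1)\}}{\beta - (\alpha'-\alpha)\exp\{-(\alpha+\beta-\alpha')(t_1-t_2)\}}, & 0 \le t_2 < t_1, \end{cases}$$
and, similarly, for $\alpha + \beta \ne \alpha', \beta'$:
$$\lambda_{20}(t_1,t_2) = \begin{cases} \dfrac{\beta(\alpha+\beta-\alpha')}{\beta - (\alpha'-\alpha)\exp\{-(\alpha+\beta-\alpha')(t_1-t_2)\}}, & 0 \le t_2 < t_1, \\[2ex] \dfrac{\alpha\beta' - (\beta'-\beta)(\alpha+\beta)\exp\{(\alpha+\beta-\beta')(t_1-t_2)\}}{\alpha - (\beta'-\beta)\exp\{-(\alpha+\beta-\beta')(t_2-t_1)\}}, & 0 \le t_1 < t_2. \end{cases}$$
In particular, when $t_1 = t_2 = t$,
$$\lambda_{10}(t,t) = \alpha, \qquad \lambda_{20}(t,t) = \beta.$$
Proceeding in a similar manner, one obtains the other two failure functions
$$\lambda_{12}(t_1 \mid t_2) = \alpha' \quad (t_2 < t_1)$$
and
$$\lambda_{21}(t_2 \mid t_1) = \beta' \quad (t_1 < t_2).$$
Hence, one can completely characterize Freund's bivariate exponential distribution by expressing it through these four failure functions.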
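The closed form of $\lambda_{10}$ can be compared with a direct numerical evaluation of equation (2.4.1.1). The sketch below is an illustrative check only; the function names are hypothetical and the survival function is the one of equations (2.2.2.2) and (2.2.2.3), with $\alpha+\beta \ne \alpha', \beta'$ assumed.

```python
import math

def survival(t1, t2, a, a_p, b, b_p):
    """Joint survival function of Freund's model, eqs. (2.2.2.2)-(2.2.2.3)."""
    if t1 <= t2:
        g = a + b - b_p
        return (a * math.exp(-b_p * t2 - g * t1)
                + (b - b_p) * math.exp(-(a + b) * t2)) / g
    g = a + b - a_p
    return (b * math.exp(-a_p * t1 - g * t2)
            + (a - a_p) * math.exp(-(a + b) * t1)) / g

def lambda10_closed(t1, t2, a, a_p, b, b_p):
    """Closed-form failure function of component 1 while both are acting."""
    if t1 < t2:
        g = a + b - b_p
        return a * g / (a - (b_p - b) * math.exp(-g * (t2 - t1)))
    if t1 > t2:
        d = a + b - a_p
        e = math.exp(-d * (t1 - t2))
        return (b * a_p - (a_p - a) * (a + b) * e) / (b - (a_p - a) * e)
    return a  # on the diagonal t1 = t2 the rate reduces to alpha

def lambda10_numeric(t1, t2, params, h=1e-6):
    """Equation (2.4.1.1) with a central difference for the derivative."""
    p = survival(t1, t2, *params)
    dp = (survival(t1 + h, t2, *params) - survival(t1 - h, t2, *params)) / (2 * h)
    return -dp / p
```

The numerical derivative should only be used away from the diagonal $t_1 = t_2$, where the survival function changes branch.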
2.4.2.
An approximate estimation with concomitant variable
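If concomitant information is available, analysis of the data may be carried out in several ways.
The simplest one is to have the same function of the concomitant variable, z, multiplying all the failure functions as suggested by Cox [1972]:
$$\lambda_{10}(t;z) = \alpha e^{bz}, \qquad \lambda_{20}(t;z) = \beta e^{bz}, \qquad \lambda_{12}(t \mid u;z) = \alpha' e^{bz}, \qquad \lambda_{21}(t \mid u;z) = \beta' e^{bz},$$
where b is a parameter to be estimated corresponding to the covariable z.
The case in which a vector of covariables, say $z = (z_1,z_2,\ldots,z_p)'$, is considered is a straightforward generalization of the present situation.
The joint density function in this case will be
$$f(t_1,t_2;z) = \begin{cases} \alpha\beta' e^{2bz}\exp\{-\beta' e^{bz}t_2 - (\alpha+\beta-\beta')e^{bz}t_1\}, & 0 \le t_1 < t_2, \\ \beta\alpha' e^{2bz}\exp\{-\alpha' e^{bz}t_1 - (\alpha+\beta-\alpha')e^{bz}t_2\}, & 0 \le t_2 < t_1. \end{cases} \tag{2.4.2.1}$$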
To analyze the effect of the concomitant variable on the model, consider a very simple scheme with complete observation on the survival times of both components.
Specifically, consider a cohort of size N without censoring.
Suppose that in $m_1$ cases the first component fails first and then the second component fails; and in the remaining $N - m_1$ cases the second component fails first.
Denote by
$t_{11i}$ the lifetime of the first component where the first component fails first;
$t_{21i}$ the lifetime of the second component where the first component fails first;
$t_{12j}$ the lifetime of the first component where the second component fails first and, finally,
$t_{22j}$ the lifetime of the second component where the second component fails first.
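The likelihood of the sample, neglecting constants, is given by
$$L = \prod_{i=1}^{m_1} \alpha\beta' e^{2bz_{1i}} \exp\{-\beta' e^{bz_{1i}}t_{21i} - (\alpha+\beta-\beta')e^{bz_{1i}}t_{11i}\} \cdot \prod_{j=1}^{N-m_1} \beta\alpha' e^{2bz_{2j}} \exp\{-\alpha' e^{bz_{2j}}t_{12j} - (\alpha+\beta-\alpha')e^{bz_{2j}}t_{22j}\}.$$
Taking log of the above expression
$$\mathcal{L} = \log L = m_1(\log\alpha + \log\beta') + 2b\sum_i z_{1i} - \beta'\sum_i e^{bz_{1i}}t_{21i} - \alpha\sum_i e^{bz_{1i}}t_{11i} - (\beta-\beta')\sum_i e^{bz_{1i}}t_{11i}$$
$$\qquad + (N-m_1)(\log\beta + \log\alpha') + 2b\sum_j z_{2j} - \alpha'\sum_j e^{bz_{2j}}t_{12j} - (\alpha-\alpha')\sum_j e^{bz_{2j}}t_{22j} - \beta\sum_j e^{bz_{2j}}t_{22j}. \tag{2.4.2.2}$$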
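The parameters to be estimated are $\alpha,\beta,\alpha',\beta'$ and b.
Taking derivatives in equation (2.4.2.2)
$$\frac{\partial\mathcal{L}}{\partial\alpha} = \frac{m_1}{\alpha} - \sum_i e^{bz_{1i}}t_{11i} - \sum_j e^{bz_{2j}}t_{22j} = 0. \tag{2.4.2.3}$$
Also,
$$\frac{\partial\mathcal{L}}{\partial\beta} = \frac{N-m_1}{\beta} - \sum_i e^{bz_{1i}}t_{11i} - \sum_j e^{bz_{2j}}t_{22j} = 0; \tag{2.4.2.4}$$
$$\frac{\partial\mathcal{L}}{\partial\alpha'} = \frac{N-m_1}{\alpha'} - \sum_j e^{bz_{2j}}t_{12j} + \sum_j e^{bz_{2j}}t_{22j} = 0; \tag{2.4.2.5}$$
$$\frac{\partial\mathcal{L}}{\partial\beta'} = \frac{m_1}{\beta'} - \sum_i e^{bz_{1i}}t_{21i} + \sum_i e^{bz_{1i}}t_{11i} = 0; \tag{2.4.2.6}$$
$$\frac{\partial\mathcal{L}}{\partial b} = 2\sum_i z_{1i} + 2\sum_j z_{2j} - \beta'\sum_i z_{1i}e^{bz_{1i}}t_{21i} - (\alpha+\beta-\beta')\sum_i z_{1i}e^{bz_{1i}}t_{11i} - \alpha'\sum_j z_{2j}e^{bz_{2j}}t_{12j} - (\alpha+\beta-\alpha')\sum_j z_{2j}e^{bz_{2j}}t_{22j} = 0. \tag{2.4.2.7}$$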
An explicit solution may be obtained as follows.
Let
$S_1(b) = \sum_{i=1}^{m_1} e^{bz_{1i}}t_{11i}$, the sum of the products of $e^{bz_{1i}}$ by the lifetimes of the first component where the first component fails first;
$S_2(b) = \sum_i e^{bz_{1i}}t_{21i}$, the sum of the products of $e^{bz_{1i}}$ by the lifetimes of the second component where the first component fails first;
$S_3(b) = \sum_j e^{bz_{2j}}t_{12j}$, the sum of the products of $e^{bz_{2j}}$ by the lifetimes of the first component where the second component fails first;
$S_4(b) = \sum_j e^{bz_{2j}}t_{22j}$, the sum of the products of $e^{bz_{2j}}$ by the lifetimes of the second component where the second component fails first.
Thus from equation (2.4.2.3) one obtains for any given b
$$\alpha(b) = \frac{m_1}{S_1(b) + S_4(b)}$$
and similarly from equation (2.4.2.4) for any given b
$$\beta(b) = \frac{N-m_1}{S_1(b) + S_4(b)}.$$
Equation (2.4.2.5) yields
$$\alpha'(b) = \frac{N-m_1}{S_3(b) - S_4(b)}$$
and equation (2.4.2.6) gives
$$\beta'(b) = \frac{m_1}{S_2(b) - S_1(b)}.$$
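Now the maximum likelihood estimate of b will be obtained by maximizing the log likelihood in which $\alpha,\beta,\alpha',\beta'$ are replaced by $\alpha(b)$, $\beta(b)$, $\alpha'(b)$, $\beta'(b)$.
One has then
$$\mathcal{L}(b) = \max_{\alpha,\beta,\alpha',\beta'} \mathcal{L} = m_1\{\log\alpha(b) + \log\beta'(b)\} + 2b\sum_i z_{1i} - \beta'(b)S_2(b) - \alpha(b)S_1(b) - \{\beta(b) - \beta'(b)\}S_1(b)$$
$$\qquad + (N-m_1)\{\log\beta(b) + \log\alpha'(b)\} + 2b\sum_j z_{2j} - \alpha'(b)S_3(b) - \beta(b)S_4(b) - \{\alpha(b) - \alpha'(b)\}S_4(b).$$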
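Evaluating $\partial\mathcal{L}(b)/\partial b$ and equating it to zero, it results in
$$\frac{\partial\mathcal{L}(b)}{\partial b} = m_1\Big\{\frac{1}{\alpha(b)}\frac{\partial\alpha(b)}{\partial b} + \frac{1}{\beta'(b)}\frac{\partial\beta'(b)}{\partial b}\Big\} + 2\sum_i z_{1i} - \Big\{\frac{\partial\beta'(b)}{\partial b}S_2(b) + \frac{\partial S_2(b)}{\partial b}\beta'(b)\Big\} - \Big\{\frac{\partial\alpha(b)}{\partial b}S_1(b) + \frac{\partial S_1(b)}{\partial b}\alpha(b)\Big\}$$
$$\qquad - \Big\{\frac{\partial\beta(b)}{\partial b} - \frac{\partial\beta'(b)}{\partial b}\Big\}S_1(b) - \frac{\partial S_1(b)}{\partial b}\{\beta(b) - \beta'(b)\} + (N-m_1)\Big\{\frac{1}{\beta(b)}\frac{\partial\beta(b)}{\partial b} + \frac{1}{\alpha'(b)}\frac{\partial\alpha'(b)}{\partial b}\Big\} + 2\sum_j z_{2j}$$
$$\qquad - \Big\{\frac{\partial\alpha'(b)}{\partial b}S_3(b) + \frac{\partial S_3(b)}{\partial b}\alpha'(b)\Big\} - \Big\{\frac{\partial\beta(b)}{\partial b}S_4(b) + \frac{\partial S_4(b)}{\partial b}\beta(b)\Big\} - \Big\{\frac{\partial\alpha(b)}{\partial b} - \frac{\partial\alpha'(b)}{\partial b}\Big\}S_4(b) - \frac{\partial S_4(b)}{\partial b}\{\alpha(b) - \alpha'(b)\} = 0, \tag{2.4.2.8}$$
which will yield $\hat b$ by means of an iterative procedure in the usual way.
Using this value in the expressions for $\alpha(b)$, $\beta(b)$, $\alpha'(b)$, $\beta'(b)$, the corresponding maximum likelihood estimates are obtained.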
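The two-stage scheme, explicit rates for given b followed by a one-dimensional search over b, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the data layout (triples of two lifetimes and a covariate value), the function names, the search interval and the use of a golden-section search in place of solving equation (2.4.2.8) are all choices made here.

```python
import math

def profile_estimates(b, first, second):
    """Rates alpha(b), alpha'(b), beta(b), beta'(b) for a given b.

    first : list of (t11, t21, z) where component 1 fails first
    second: list of (t12, t22, z) where component 2 fails first
    """
    s1 = sum(math.exp(b * z) * t1 for t1, t2, z in first)
    s2 = sum(math.exp(b * z) * t2 for t1, t2, z in first)
    s3 = sum(math.exp(b * z) * t1 for t1, t2, z in second)
    s4 = sum(math.exp(b * z) * t2 for t1, t2, z in second)
    m1, m2 = len(first), len(second)
    return m1 / (s1 + s4), m2 / (s3 - s4), m2 / (s1 + s4), m1 / (s2 - s1)

def profile_loglik(b, first, second):
    """Log likelihood (2.4.2.2) evaluated at the rates given by b."""
    al, al_p, be, be_p = profile_estimates(b, first, second)
    total = 0.0
    for t1, t2, z in first:
        e = math.exp(b * z)
        total += math.log(al * be_p) + 2 * b * z - be_p * e * t2 - (al + be - be_p) * e * t1
    for t1, t2, z in second:
        e = math.exp(b * z)
        total += math.log(be * al_p) + 2 * b * z - al_p * e * t1 - (al + be - al_p) * e * t2
    return total

def maximize_profile(first, second, lo=-5.0, hi=5.0, tol=1e-8):
    """Golden-section search for the b maximizing the profile log likelihood."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if profile_loglik(c, first, second) > profile_loglik(d, first, second):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)
```

When every covariate value is zero the rates returned do not depend on b and coincide with Freund's estimates for complete observations.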
Now, an approximate estimate for b can be obtained by further manipulating the equation (2.4.2.8).
In fact, this equation can be rewritten as follows:
OS4(b)
-
ob
{a(b) - a'(b)}
= O.
(2.4.2.9)
Collecting terms in a(b),S(b),S'(b)a'(b) in -equation (2.4.2.9),
one has
(2.4.2.10)
Note now that
2n+?
Under the conditions O<bzi<l (zli fixed) and zli ~l as n~ one replaces the
term $e^{bz_{1i}}$ by $1 + bz_{1i} + \frac{b^2}{2}z_{1i}^2$ (by neglecting terms of higher order in the Taylor expansion), then terms like $\sum_i e^{bz_{1i}}t_{1i}$ will become
$$\sum_i t_{1i} + b\sum_i z_{1i}t_{1i} + \frac{b^2}{2}\sum_i z_{1i}^2 t_{1i}.$$
Let
so that one can write
Or
Therefore.
:: m
1
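Analogously, one can write
and also,
and
$$\frac{\partial}{\partial b}\log(S_3(b) - S_4(b)) = \frac{\partial}{\partial b}\log\Big(\sum_j e^{bz_{2j}}t_{1j} - \sum_j e^{bz_{2j}}t_{2j}\Big)$$
$$\qquad \approx \frac{\partial}{\partial b}\log\Big(\sum_j t_{1j} - \sum_j t_{2j} + b\Big(\sum_j z_{2j}t_{1j} - \sum_j z_{2j}t_{2j}\Big) + \frac{b^2}{2}\Big(\sum_j z_{2j}^2 t_{1j} - \sum_j z_{2j}^2 t_{2j}\Big)\Big).$$
Letting $S_{21} = \sum_j t_{1j} - \sum_j t_{2j}$; $S_{22} = \sum_j z_j t_{1j} - \sum_j z_j t_{2j}$; $S_{23} = \sum_j z_j^2 t_{1j} - \sum_j z_j^2 t_{2j}$, one has
$$\frac{\partial}{\partial b}\log(S_3(b) - S_4(b)) \approx \frac{\partial}{\partial b}\Big\{\log S_{21} + \log\Big(1 + b\,\frac{S_{22}}{S_{21}} + \frac{b^2}{2}\,\frac{S_{23}}{S_{21}}\Big)\Big\}.$$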
Thus,
822
(N - m ) -81 (
+
21
and, similarly,
Going back to equation (2.4.2.10) one can write
823 )
b -S,
21
•
= I z~t1j
j
-
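Solving this last equation for b, one has an approximate estimator for it, $\hat b$, under the conditions stated earlier (p. 40).
Hence one has immediately the estimates for the failure rates based on that of $\hat b$.
In other words,
$$\hat\alpha = \frac{m_1}{S_1(\hat b) + S_4(\hat b)}, \qquad \hat\beta = \frac{N-m_1}{S_1(\hat b) + S_4(\hat b)}, \qquad \hat\alpha' = \frac{N-m_1}{S_3(\hat b) - S_4(\hat b)}, \qquad \hat\beta' = \frac{m_1}{S_2(\hat b) - S_1(\hat b)},$$
where $S_i(\hat b)$ $(i = 1,2,\ldots,4)$ denotes the estimate of $S_i$, which is obtained by replacing b by $\hat b$ in the corresponding expressions.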
2.4.3.
Expected values
Suppose that the lifetime of a component in a system can be regarded as a random variable $T \ge 0$ with probability density function $f_T(t)$.
Also, suppose that such a component has survived to age a, say.
How much longer can it expect to live?
The residual lifetime yet to be lived by the component of age a is T - a, and its expected value is the conditional expectation of T - a given that the component survived to age a.
Symbolically this is expressed by
$$E(T - a \mid T > a) = \frac{E[(T-a)\,I_E]}{P(E)} = \frac{E[(T-a)\,I_E]}{P(T > a)} = \frac{\int_a^{\infty} (t-a)\,f(t)\,dt}{\int_a^{\infty} f(t)\,dt},$$
where E represents the event $[T > a]$ and $I_E$ the indicator function of E.
This quantity will be evaluated in a bivariate context below.
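As a small numerical illustration of this ratio of integrals (a sketch only; the function name, the trapezoidal rule and the finite upper limit are choices made here), the expected residual life of an exponential lifetime can be computed and compared with the reciprocal of its failure rate, which is the value implied by the lack of memory property.

```python
import math

def mean_residual_life(density, a, upper, steps=200000):
    """E[T - a | T > a] by the trapezoidal rule on [a, upper].

    density: probability density function of T
    upper  : finite cut-off standing in for the infinite upper limit
    """
    h = (upper - a) / steps
    num = 0.0
    den = 0.0
    for k in range(steps + 1):
        t = a + k * h
        w = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        f = density(t)
        num += w * (t - a) * f
        den += w * f
    return num / den   # the step width h cancels in the ratio
```

For an exponential density with rate 0.8 the result should be close to 1/0.8 whatever the age a.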
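The expectation of life of the second component given that the first has failed at $T_1 = t_1$ and the second survives the first is given by
since, as Cox [1972] has written,
$$f(t_1,t_2) = \exp\Big\{-\int_0^{t_1} (\lambda_{10}(u) + \lambda_{20}(u))\,du - \int_{t_1}^{t_2} \lambda_{21}(u \mid t_1)\,du\Big\}\,\lambda_{10}(t_1)\,\lambda_{21}(t_2 \mid t_1), \qquad t_1 < t_2,$$
and
where
$= 0$ elsewhere.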
Therefore,
i,
I
(2.4.3.l)
Remark.
Suppose now $T_1$ and $T_2$ are independent; then take
Thus the above equation would become
!
,I
as it should be.
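Analogously, the expected residual lifetime of the first component given that the second has failed at $T_2 = t_2$ and the first survives the second is given by
Similarly,
$$f(t_1,t_2) = \exp\Big\{-\int_0^{t_2} (\lambda_{10}(u) + \lambda_{20}(u))\,du - \int_{t_2}^{t_1} \lambda_{12}(u \mid t_2)\,du\Big\}\,\lambda_{20}(t_2)\,\lambda_{12}(t_1 \mid t_2), \qquad t_2 < t_1,$$
so that
where
$= 0$ elsewhere.
Thus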
Again one assumes $T_1$ and $T_2$ are independent.
Hence,
In the sequel the results above are used in an illustrative example.
Consider the following situation: let the lifetimes of the causes acting on the same individual be represented by the random variables $T_1,T_2$.
The causes act in such a way that when both are present their failure rates are represented by $\alpha e^{bz}$ and $\beta e^{bz}$, respectively, where z is a covariable indicating group membership.
Suppose that when the first component fails the failure rate of the second component becomes $\beta' e^{bz}$; now if one lets $\beta' > \beta$, this means that the expected lifetime of the second component will be shorter as soon as the first component is out of action.
Similarly, let $\alpha' e^{bz}$ be the failure rate of the first component as soon as the second component fails.
Let it be assumed that $\alpha' > \alpha$ (i.e., the lifetime of the first component will be shorter as soon as the second component is out of action).
So that the four failure rates that may occur in such a scheme can be expressed by (as in Section 2.4.2)
$$\lambda_{10}(t) = \alpha e^{bz}, \qquad \lambda_{20}(t) = \beta e^{bz}, \qquad \lambda_{12}(t \mid u) = \alpha' e^{bz} \ (u < t), \qquad \lambda_{21}(t \mid u) = \beta' e^{bz} \ (u < t), \qquad \alpha' > \alpha,\ \beta' > \beta. \tag{2.4.3.4}$$
Note that by specifying these four failures rates as such, the distribution of survivorship is automatically determined and, thus, the procedure is no longer nonparametric.
The expectation of life of the second component given that the first component has failed at $t_1$ and the second component survives the first is given by equation (2.4.3.2).
Replacing in it the values of the failure rates given by equation (2.4.3.4), it can be shown that
$$E(T_2 - t_1 \mid T_2 > t_1,\ T_1 = t_1) = \frac{1}{\beta' e^{bz}}.$$
In particular, for b = 0
$$E(T_2 - t_1 \mid T_2 > t_1,\ T_1 = t_1) = \frac{1}{\beta'}.$$
Analogously, from equation (2.4.3.2) one can show that
$$E(T_1 - t_2 \mid T_1 > t_2,\ T_2 = t_2) = \frac{1}{\alpha' e^{bz}},$$
which reduces to simply $1/\alpha'$ when b = 0.
Let $W = \min(T_1,T_2)$ represent the lifetime spent together by both components. One may also want to know its expected value.
It is easy to show that under Freund's bivariate exponential model
$$\Pr[W > w] = P(w,w) = \exp\{-(\alpha+\beta)w\}.$$
The random variable $\min(T_1,T_2)$ has therefore an exponential distribution with parameter $(\alpha+\beta)$; with covariable z in the model, W is exponential with parameter $(\alpha+\beta)e^{bz}$.
Hence,
$$E(W) = \frac{e^{-bz}}{\alpha+\beta},$$
which reduces to
$$E(W) = \frac{1}{\alpha+\beta}$$
whenever b = 0.
Also it can be shown that
$$V(W) = \frac{e^{-2bz}}{(\alpha+\beta)^2}.$$
2.5.
Failure rates in fixed intervals
2.5.1.
Expression of likelihood with no censoring
Let a more general situation be considered now.
Suppose that
the period of observation is divided into w disjoint intervals.
Leaving censoring aside, a cohort of N individuals, who were present at the beginning of the study, entering the i-th interval may be classified into the following mutually exclusive sets:
i) those in which both components fail in the interval;
ii) those for which the first component fails in the i-th interval and the second component has failed in the j-th interval (j < i);
iii) those for which the second component fails in the j-th interval and the first component has failed in the i-th interval (i < j).
As suggested in the paper by Cox [1972], the failure rates will take the form of step functions, and may be assumed to be constant over each interval.
Translating that into the present context, one has
$$\lambda_{10}(t) = \alpha_i, \qquad \lambda_{20}(t) = \beta_i, \qquad \lambda_{12}(t \mid u) = \alpha'_i, \qquad \lambda_{21}(t \mid u) = \beta'_i \qquad \text{for } t_i \le t < t_{i+1} \quad (i = 1,2,\ldots,w).$$
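The unconditional probability that both components fail in the i-th interval is given by
$$P(S_{ii}) = \int_{t_i}^{t_{i+1}}\int_{t_i}^{t_{i+1}} f(x,y;i)\,dx\,dy \qquad (i = 1,2,\ldots,w)$$
where, by generalizing equation (2.4.3.1) for the case when the failure rates take the form of step functions,
$$f(x,y;i) = \begin{cases} \exp\Big\{-((\alpha_1+\beta_1)h_1 + (\alpha_2+\beta_2)h_2 + \cdots + (\alpha_{i-1}+\beta_{i-1})h_{i-1}) - \int_{t_i}^{x} (\alpha_i+\beta_i)\,du - \int_x^y \beta'_i\,du\Big\}\,\alpha_i\beta'_i, & x < y, \\[1.5ex] \exp\Big\{-((\alpha_1+\beta_1)h_1 + (\alpha_2+\beta_2)h_2 + \cdots + (\alpha_{i-1}+\beta_{i-1})h_{i-1}) - \int_{t_i}^{y} (\alpha_i+\beta_i)\,du - \int_y^x \alpha'_i\,du\Big\}\,\beta_i\alpha'_i, & y < x, \end{cases}$$
$h_i = t_{i+1} - t_i$ being the width of the i-th interval.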
Hence, one can
immediately write
where
Labelling the integrals on the right hand side of equation (2.5.1.1) 1
1
and 1 , respectively, one can show that
2
and
Therefore,
• e
-Ai,
-(ai+Si)h i
-aihi -(ai+Si)hi
[(ai-ai-Si) (l-e
)+(ai+Si)(e
-e
)].
(2.5.1.2)
Proceeding along the same lines, one calculates now the unconditional
probability that the first component fails in the i-th interval and the
second in the j-th interval for i < j, P(Sij)'
This can be written as
where
for
Thus
(2.5.1.3)
where
The integrals above are easily evaluated.
One can show that
and
Evaluating the integrals on the left hand side of equation (2.5.1.6),
which, after some simplification, reduces to
j
< i.
(2.5.1. 7)
Since no censoring is being considered presently, the unconditional likelihood, neglecting constants, for a cohort of size N is given
by
A
w
L •
II
)] ij
[P (S
i,j=l
ij
i~j
where $A_{ij}$ represents the number of persons whose first component fails in the i-th interval and the second in the j-th interval; $A_{ii}$ is similarly defined.
The log of the likelihood above gives, by replacing the probabilities by their corresponding expressions,
w
l
log L • . Aii 10g{K1i exp{-A i + (Si-ui-Si)ti}(l - exp{-(ui+Si)h })+
i
i"'l
w
l
+
i,j-l
j
Afj{log Klj - Aj - Aj,i + log(l - exp{-aih i }) +
i
(2.S.l.8)
where
and Ai
j
stands for A when i < j and A for A when j < i
ij
ij
1j
(i,j • 1,2, ••• ,w).
Maximum likelihood estimates for the various failure rates may
be obtained through the usual iterative techniques.
CHAPTER III
NONPARAMETRIC ESTIMATION
Chapter 3 treats the problem of bivariate life tables in a nonparametric set-up.
Estimates for the various parameters are proposed, mainly for data which are grouped.
Failure rate estimates based upon these are also developed.
Two assumptions needed to extend the Kaplan-Meier [1958] procedure to a bivariate context are suggested, and it is shown that they are satisfied for a particular distribution in the bivariate exponential family.
3.1.
Empirical distribution function as an estimator
3.1.1.
Estimators
In a univariate context, given a set of observations $t_1,t_2,\ldots,t_n$ from a population with cumulative distribution function F(t), it is natural, in the absence of additional information, to estimate F(t) by the usual empirical distribution function
$$\hat F_n(t) = \frac{1}{n}\sum_{i=1}^{n} I(t - t_i),$$
where
$$I(u) = \begin{cases} 1 & \text{if } u > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{3.1.1.1}$$
However, it should be pointed out that one would not use this estimator if there were at hand sufficient prior information about the distribution F(t), e.g., that F(t) is a member of a given parametric class such as the exponential.
Grenander [1956] used the additional information that F(t) has an increasing failure rate to get a nonparametric maximum likelihood estimate of the failure rate and, consequently, of F(t).
The estimator above may be readily extended to a bivariate context.
Let $(t_{11},t_{21}),(t_{12},t_{22}),\ldots,(t_{1n},t_{2n})$ be a sample from a population with a bivariate distribution function $F(t_1,t_2)$.
One can define
$$\hat F_n(t_1,t_2) = \frac{1}{n}\sum_{i=1}^{n} I(t_1 - t_{1i})\,I(t_2 - t_{2i}), \tag{3.1.1.2}$$
where I is defined as in expression (3.1.1.1).
Its interpretation in the present context is obvious: it is the proportion of observations with lifetime of the first component less than $t_1$ and the lifetime of the second less than $t_2$.
An estimator for the bivariate survival function follows at once.
Letting $P_1(t_1)$ be the marginal survival function with respect to the first component and $P_2(t_2)$ with respect to the second, one writes
$$\hat P_1(t_1) = 1 - \hat F_{1,n}(t_1) = 1 - \frac{1}{n}\sum_{i=1}^{n} I(t_1 - t_{1i}),$$
and
$$\hat P_2(t_2) = 1 - \hat F_{2,n}(t_2) = 1 - \frac{1}{n}\sum_{i=1}^{n} I(t_2 - t_{2i})$$
as the corresponding estimators.
Inasmuch as
$$P(t_1,t_2) = 1 - F_1(t_1) - F_2(t_2) + F(t_1,t_2),$$
one has
$$\hat P_n(t_1,t_2) = 1 - \hat F_{1,n}(t_1) - \hat F_{2,n}(t_2) + \hat F_n(t_1,t_2) \tag{3.1.1.3}$$
as an estimator for the bivariate survival function.
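These estimators are straightforward to compute. The following minimal sketch (function names are hypothetical) implements the indicator of expression (3.1.1.1), the bivariate empirical distribution function and the survival estimator obtained from it and the two marginal empirical distribution functions.

```python
def indicator(u):
    """I(u) of expression (3.1.1.1): 1 if u > 0, 0 otherwise."""
    return 1 if u > 0 else 0

def ecdf_bivariate(t1, t2, sample):
    """Proportion of pairs with first lifetime < t1 and second lifetime < t2."""
    return sum(indicator(t1 - x) * indicator(t2 - y) for x, y in sample) / len(sample)

def ecdf_marginal(t, sample, component):
    """Marginal empirical distribution function (component is 0 or 1)."""
    return sum(indicator(t - pair[component]) for pair in sample) / len(sample)

def survival_bivariate(t1, t2, sample):
    """Survival estimator: 1 - F1(t1) - F2(t2) + F(t1, t2)."""
    return (1.0 - ecdf_marginal(t1, sample, 0) - ecdf_marginal(t2, sample, 1)
            + ecdf_bivariate(t1, t2, sample))
```

With the four pairs (1,2), (2,1), (3,3), (4,5), two pairs lie below the point (2.5, 2.5) in both coordinates and two lie above it in both, so both estimators equal one half there.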
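Failure rates and bivariate survival function can be linked in the same way as in equations (2.4.1.1) and (2.4.1.2); in fact, one can write
$$\hat\lambda_{10}(t_1,t_2) = -\frac{\{\hat P_n(t_1+h,t_2) - \hat P_n(t_1,t_2)\}/h}{\hat P_n(t_1,t_2)} \tag{3.1.1.4}$$
and
$$\hat\lambda_{12}(t_1 \mid t_2) = -\frac{\{\hat P_n(t_1+h,t_2+k) - \hat P_n(t_1+h,t_2) - \hat P_n(t_1,t_2+k) + \hat P_n(t_1,t_2)\}/hk}{\{\hat P_n(t_1,t_2+k) - \hat P_n(t_1,t_2)\}/k} \tag{3.1.1.5}$$
where h and k may be thought of as arbitrarily small positive numbers.
Similarly one defines $\hat\lambda_{20}(t_1,t_2)$ and $\hat\lambda_{21}(t_2 \mid t_1)$.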
3.2.
Estimation with complete observations
3.2.1.
Maximum likelihood estimators
In this section a straightforward generalization of the univariate life table approach is presented for the case of a bivariate set of grouped data.
Let $q_{ij}$ be the unconditional probability that the first component fails in the i-th interval and the second component in the j-th interval; symbolically,
$$q_{ij} = \Pr[T_1 = i,\ T_2 = j].$$
For each individual in the cohort, define
$$a_{ij\ell} = \begin{cases} 1 & \text{if the first component of the } \ell\text{-th individual fails in the i-th interval and the second in the j-th interval,} \\ 0 & \text{otherwise.} \end{cases}$$
Suppose that one starts with a cohort of N individuals.
Thus there are N sequences of independent random variables $\{a_{ij\ell}\}$ $(\ell = 1,2,\ldots,N)$ and the corresponding (unconditional) likelihood will be
$$L = \prod_{\ell=1}^{N}\prod_{i=1}^{w}\prod_{j=1}^{w} q_{ij}^{\,a_{ij\ell}} \tag{3.2.1.1}$$
where w represents the last age group in the life table and under the restriction that
$$\sum_i\sum_j q_{ij} = 1. \tag{3.2.1.2}$$
The usual approach to the problem of finding restricted maximum likelihood estimates is by the method of Lagrange multipliers.
Let $\mu$ be the multiplier and maximize
$$\log L - \mu\Big\{\sum_i\sum_j q_{ij} - 1\Big\}$$
with respect to $q_{ij}$.
Differentiation gives
$$\frac{\partial}{\partial q_{ij}}\Big[\log L - \mu\Big\{\sum_i\sum_j q_{ij} - 1\Big\}\Big] = \frac{A_{ij}}{q_{ij}} - \mu,$$
where
$$A_{ij} = \sum_{\ell=1}^{N} a_{ij\ell},$$
which represents the number of individuals whose first component fails in the i-th interval and the second in the j-th interval.
The above derivatives will vanish for $i,j = 1,2,\ldots,w$ only if $A_{ij}/q_{ij}$ is the same, and equal to $\mu$, for all i,j.
So that the maximum likelihood estimate of $q_{ij}$ must be proportional to $A_{ij}$.
By virtue of equation (3.2.1.2)
$$\hat\mu = N$$
and thus
$$\hat q_{ij} = \frac{A_{ij}}{N} \qquad (i,j = 1,2,\ldots,w). \tag{3.2.1.3}$$
Conditional probabilities can be readily estimated.
Let
$$q_{12}(i\mid j) \qquad (j<i)$$
be the conditional probability that the first component fails in the i-th interval given that the second has failed in the j-th interval, and
$$q_{21}(i\mid j) \qquad (j<i)$$
be the conditional probability that the second component fails in the i-th interval given that the first has failed in the j-th interval. Hence,
$$q_{12}(i\mid j) = \frac{\Pr(T_1=i,\ T_1\ge i,\ T_2=j)}{\Pr(T_1\ge i,\ T_2=j)} = \frac{\Pr(T_1=i,\ T_2=j)}{\Pr(T_1\ge i,\ T_2=j)}$$
and, similarly,
$$q_{21}(i\mid j) = \frac{\Pr(T_1=j,\ T_2=i)}{\Pr(T_1=j,\ T_2\ge i)}.$$
In terms of estimates one can then write
$$\hat q_{12}(i\mid j) = \frac{A_{ij}}{r_{12}(i\mid j)} \qquad (j<i)$$
where $r_{12}(i\mid j)$ represents the number of individuals whose first component survives the i-th interval given that the second component failed in the j-th interval. Analogously,
$$\hat q_{21}(i\mid j) = \frac{A_{ji}}{r_{21}(i\mid j)} \qquad (j<i),$$
where $r_{21}(i\mid j)$ represents the number of individuals whose second component survives the i-th interval given that the first component failed in the j-th interval.
It may happen that $r_{21}(i\mid j)$ or $r_{12}(i\mid j)$ is zero, leaving the estimates $\hat q_{21}(i\mid j)$, $\hat q_{12}(i\mid j)$ undefined. To avoid this situation one may take all of them to be equal to one. In cohort studies with large sample sizes, such situations should not become a problem.
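A minimal sketch of the complete-data estimates of this section follows. It is an illustration only: the table of counts is hypothetical, indices are stored 0-based, and the risk set $r_{12}(i\mid j)$ is read here as the number of individuals with second failure in interval j whose first component has not failed before interval i, which is an assumption about the intended definition.

```python
def q_hat(A):
    """MLE q_ij = A_ij / N for a complete w-by-w table of counts."""
    N = sum(sum(row) for row in A)
    return [[a / N for a in row] for row in A]


def q12_hat(A, i, j):
    """Estimate of q12(i|j) with 0-based interval indices.

    r counts individuals with second failure in interval j and first
    failure in interval i or later (assumed reading of r12(i|j)).
    When r is zero the estimate is set to one, as suggested in the text.
    """
    r = sum(A[k][j] for k in range(i, len(A)))
    return 1.0 if r == 0 else A[i][j] / r


# Hypothetical 3-by-3 table of counts A_ij.
A = [[4, 2, 1],
     [3, 5, 2],
     [1, 1, 1]]
print(q_hat(A)[0][0])     # share of the cohort with both failures in interval 1
print(q12_hat(A, 1, 0))   # first fails in interval 2 given second failed in interval 1
```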
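3.2.2. Mean and variance of the estimators
Means and variances are readily obtainable. Note that
$$\hat q_{ij} = \frac{A_{ij}}{N},$$
with
$$\sum_{i}\sum_{j} A_{ij} = N \qquad\text{and}\qquad \sum_{i}\sum_{j} q_{ij} = 1,$$
which says that an individual with both components functioning at the beginning of the study eventually dies. Given the size of the cohort, the joint distribution of the random variables $A_{ij}$ is
$$P(A_{11},A_{12},A_{21},\ldots,A_{w-1,w},A_{w,w-1},A_{w,w}\mid N) = \frac{N!}{\prod_{i}\prod_{j} A_{ij}!}\; q_{11}^{A_{11}}\, q_{12}^{A_{12}}\cdots q_{ww}^{A_{ww}},$$
which identifies the classical multinomial scheme. Therefore,
$$E(A_{11}\mid N) = N q_{11}$$
and, in general,
$$E(A_{ij}\mid N) = N q_{ij} \qquad (i,j = 1,2,\ldots,w).$$
For given N > 0 we can write
$$E(\hat q_{ij}\mid N) = \frac{E(A_{ij}\mid N)}{N} = q_{ij},$$
which says that $\hat q_{ij}$ is unbiased for $q_{ij}$. Following standard results, it can be shown that
$$\mathrm{Var}(\hat q_{ij}\mid N) = \frac{q_{ij}(1-q_{ij})}{N}$$
and that
$$\mathrm{Cov}(\hat q_{ij},\hat q_{k\ell}\mid N) = -\frac{q_{ij}\,q_{k\ell}}{N}, \qquad i\ne k,\ j\ne \ell.$$
3.3. Estimation with censored observations
3.3.1. Definition of the main parameters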
Suppose that for the whole period of observation [0,T), the life table estimates are calculated from grouped data arising from a partition of [0,T) into w intervals $[t_i,t_{i+1})$ $(i = 0,1,\ldots,w-1)$, $t_0 = 0$, $t_w = T$. The set of grouped data can be partitioned into four categories according to survival-death status. They are:
i) both components survive past the i-th interval (then they may be lost);
ii) first component dies in the i-th interval, the second survives (then may be lost);
iii) first component survives (then may be lost) and the second component dies in the j-th interval;
iv) first component dies in the i-th interval, the second component dies in the j-th interval.
Let $q_{ij} = P[T_1 = i,\ T_2 = j]$ be the unconditional probability that the first component fails during the i-th interval and the second fails during the j-th interval;
$$Q_{ii} = P[T_1>i,\ T_2>i] = \sum_{\alpha=i+1}^{w}\sum_{\beta=i+1}^{w} q_{\alpha\beta} \qquad (i = 1,2,\ldots,w-1)$$
the probability that both components survive the i-th interval;
$$R_{ij} = P[T_1 = i,\ T_2>j;\ j\ge i] = \sum_{\beta=j+1}^{w} q_{i\beta}, \qquad w>j\ge i \quad (i = 1,2,\ldots,w-1)$$
the unconditional probability that the first component fails in the i-th interval and that the second survives the j-th interval; and
$$S_{ij} = P[T_1>i,\ T_2 = j;\ i\ge j] = \sum_{\alpha=i+1}^{w} q_{\alpha j}, \qquad w>i\ge j \quad (j = 1,2,\ldots,w-1)$$
the unconditional probability that the first component survives the i-th interval and the second fails in the j-th interval.
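The derived parameters $Q_{ii}$, $R_{ij}$ and $S_{ij}$ are partial sums of the matrix of cell probabilities $q_{ij}$. The following sketch makes the summation ranges explicit; the function names and the uniform example are hypothetical, and the matrix is stored 0-based while the interval arguments are 1-based as in the text.

```python
def Q(q, i):
    """P[T1 > i, T2 > i]: sum of q over intervals alpha, beta = i+1..w."""
    w = len(q)
    return sum(q[a][b] for a in range(i, w) for b in range(i, w))


def R(q, i, j):
    """P[T1 = i, T2 > j] for j >= i: sum of q[i][beta] over beta = j+1..w."""
    return sum(q[i - 1][b] for b in range(j, len(q)))


def S(q, i, j):
    """P[T1 > i, T2 = j] for i >= j: sum of q[alpha][j] over alpha = i+1..w."""
    return sum(q[a][j - 1] for a in range(i, len(q)))


# Hypothetical example: w = 3 intervals with equal cell probabilities.
q = [[1 / 9] * 3 for _ in range(3)]
print(Q(q, 1), R(q, 1, 1), S(q, 2, 1))
```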
To carry out the task of estimating the life table parameters in the present situation, some assumptions need to be made regarding censoring. In this section it is assumed that censoring, whenever present, occurs at the end of an interval. As for the parameters themselves, they are subject to certain constraints, which become obvious depending on the nature of the assumptions made with respect to the lifetime of the components. Some possible situations are described below:
i) both components must fail during the period of study, whether or not one or the other or both are censored. That amounts to the condition
$$\sum_{i}\sum_{j} q_{ij} = 1;$$
ii)
one or both components may still be alive at the end of the
study and just one censoring point, t, is considered, which may occur
at the end of any interval except the last.
This situation is depicted
as follows:
t < w
iii)
either or both components may still be alive at the end of
the study but, instead of just one, several censoring points,
t ,t ,···,t r ,are allowed.
l 2
t
0
t
And that can be expressed as
r r qij + Qt
i-I j-l
3.3.2.
t
0
+
0
0
t -1
-1
r Ri
i-I
0
+
t
'
0
I
j=l
St
0'
j - 1
Maximum likelihood estimation
For the entire cohort of N individuals the likelihood of the sample, neglecting constants, is
$$L = \prod_{i=1}^{w}\prod_{j=1}^{w} q_{ij}^{A_{ij}}\ \prod_{i=1}^{w-1} Q_{ii}^{B_{ii}}\ \prod_{i=1}^{w-1}\prod_{j=i}^{w-1} R_{ij}^{C_{ij}}\ \prod_{j=1}^{w-1}\prod_{i=j}^{w-1} S_{ij}^{D_{ij}},$$
restricted to
$$\sum_{i=1}^{w}\sum_{j=1}^{w} q_{ij} = 1,$$
and where
$A_{ij}$ represents, as before, the number of individuals whose first component fails in the i-th interval and the second in the j-th interval;
$B_{ii}$, the number of individuals censored at the end of the i-th interval with both components alive ($B_{00} = 0$);
$C_{ij}$, the number of individuals censored at the end of the j-th interval with the first component dying during the i-th interval $(i\le j)$;
$D_{ij}$, the number of individuals censored at the end of the i-th interval with the second component dying during the j-th interval and the first component surviving the j-th interval $(i\ge j)$.
The log likelihood equation for the entire cohort will be
$$\log L = \sum_{i=1}^{w}\sum_{j=1}^{w} A_{ij}\log q_{ij} + \sum_{i=1}^{w-1} B_{ii}\log Q_{ii} + \sum_{j=1}^{w-1}\sum_{i=j}^{w-1} D_{ij}\log S_{ij} + \sum_{i=1}^{w-1}\sum_{j=i}^{w-1} C_{ij}\log R_{ij} - \phi\Bigl(\sum_{i=1}^{w}\sum_{j=1}^{w} q_{ij} - 1\Bigr),$$
with
$$N = \sum_{j=1}^{w}\sum_{i=1}^{w} A_{ij} + \sum_{i=1}^{w-1} B_{ii} + \sum_{j=i}^{w-1}\sum_{i=1}^{w-1} C_{ij} + \sum_{i=j}^{w-1}\sum_{j=1}^{w-1} D_{ij}.$$
Taking derivatives with respect to $q_{ij}$:
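for i = j (main diagonal)
$$\frac{\partial \log L}{\partial q_{ii}} = \frac{A_{ii}}{q_{ii}} + \sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} - \phi; \tag{3.3.2.1}$$
for j > i (above main diagonal)
$$\frac{\partial \log L}{\partial q_{ij}} = \begin{cases}\dfrac{A_{1j}}{q_{1j}} + \displaystyle\sum_{k=1}^{j-1}\dfrac{C_{1k}}{R_{1k}} - \phi & \text{if } i = 1,\\[2ex] \dfrac{A_{ij}}{q_{ij}} + \displaystyle\sum_{k=1}^{i-1}\dfrac{B_{kk}}{Q_{kk}} + \sum_{k=i}^{j-1}\dfrac{C_{ik}}{R_{ik}} - \phi & \text{if } 1<i\le w;\end{cases} \tag{3.3.2.2}$$
for i > j (below main diagonal)
$$\frac{\partial \log L}{\partial q_{ij}} = \begin{cases}\dfrac{A_{i1}}{q_{i1}} + \displaystyle\sum_{k=1}^{i-1}\dfrac{D_{k1}}{S_{k1}} - \phi & \text{if } j = 1,\\[2ex] \dfrac{A_{ij}}{q_{ij}} + \displaystyle\sum_{k=1}^{j-1}\dfrac{B_{kk}}{Q_{kk}} + \sum_{k=j}^{i-1}\dfrac{D_{kj}}{S_{kj}} - \phi & \text{if } 1<j\le w.\end{cases} \tag{3.3.2.3}$$
From equations (3.3.2.1), (3.3.2.2), (3.3.2.3), setting each derivative equal to zero, one obtains
$$A_{ii} + q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} = \phi\, q_{ii}, \tag{3.3.2.4}$$
for j > i
$$\begin{aligned} A_{1j} + q_{1j}\sum_{k=1}^{j-1}\frac{C_{1k}}{R_{1k}} &= \phi\, q_{1j} && \text{if } i = 1,\\ A_{ij} + q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}} &= \phi\, q_{ij} && \text{if } 1<i\le w, \end{aligned} \tag{3.3.2.5}$$
and for i > j
$$\begin{aligned} A_{i1} + q_{i1}\sum_{k=1}^{i-1}\frac{D_{k1}}{S_{k1}} &= \phi\, q_{i1} && \text{if } j = 1,\\ A_{ij} + q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}} &= \phi\, q_{ij} && \text{if } 1<j\le w. \end{aligned} \tag{3.3.2.6}$$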
Summing the terms on each side of equations (3.3.2.4), (3.3.2.5), and (3.3.2.6), one gets
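$$\sum_{i=1}^{w} A_{ii} + \sum_{i=1}^{w} q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} = \phi\sum_{i=1}^{w} q_{ii}, \tag{3.3.2.7}$$
$$\sum_{j=2}^{w} A_{1j} + \sum_{j=2}^{w} q_{1j}\sum_{k=1}^{j-1}\frac{C_{1k}}{R_{1k}} = \phi\sum_{j=2}^{w} q_{1j}, \qquad j>i, \tag{3.3.2.8}$$
$$\mathop{\sum\sum}_{j>i\ge 2} A_{ij} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}} = \phi\mathop{\sum\sum}_{j>i\ge 2} q_{ij}, \qquad j>i, \tag{3.3.2.9}$$
$$\sum_{i=2}^{w} A_{i1} + \sum_{i=2}^{w} q_{i1}\sum_{k=1}^{i-1}\frac{D_{k1}}{S_{k1}} = \phi\sum_{i=2}^{w} q_{i1}, \qquad i>j, \tag{3.3.2.10}$$
$$\mathop{\sum\sum}_{i>j\ge 2} A_{ij} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}} = \phi\mathop{\sum\sum}_{i>j\ge 2} q_{ij}, \qquad i>j. \tag{3.3.2.11}$$
The sum of terms on the right hand side of equations (3.3.2.7) through (3.3.2.11) is equal to
$$\phi\Bigl(\sum_{i=1}^{w} q_{ii} + \sum_{j=2}^{w} q_{1j} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij} + \sum_{i=2}^{w} q_{i1} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\Bigr),$$
which reduces to
$$\phi\sum_{i=1}^{w}\sum_{j=1}^{w} q_{ij} = \phi.$$
Now, combining the left hand sides of equations (3.3.2.7), (3.3.2.9), (3.3.2.11), one has
$$\sum_{i=1}^{w} q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} = \sum_{k=1}^{w-1}\frac{B_{kk}}{Q_{kk}}\Bigl(\sum_{i>k} q_{ii} + \mathop{\sum\sum}_{j>i>k} q_{ij} + \mathop{\sum\sum}_{i>j>k} q_{ij}\Bigr) = \sum_{k=1}^{w-1}\frac{B_{kk}}{Q_{kk}}\,Q_{kk} = \sum_{k=1}^{w-1} B_{kk}. \tag{3.3.2.13}$$
Also from equations (3.3.2.7) through (3.3.2.11)
$$\sum_{i=1}^{w} A_{ii} + \sum_{j=2}^{w} A_{1j} + \mathop{\sum\sum}_{j>i\ge 2} A_{ij} + \sum_{i=2}^{w} A_{i1} + \mathop{\sum\sum}_{i>j\ge 2} A_{ij} = \sum_{i=1}^{w} A_{ii} + \mathop{\sum\sum}_{j>i\ge 1} A_{ij} + \mathop{\sum\sum}_{i>j\ge 1} A_{ij} = \sum_{i=1}^{w}\sum_{j=1}^{w} A_{ij}, \tag{3.3.2.14}$$
and collecting other similar terms from equations (3.3.2.8) and (3.3.2.9)
$$\sum_{j=2}^{w} q_{1j}\sum_{k=1}^{j-1}\frac{C_{1k}}{R_{1k}} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}} = \mathop{\sum\sum}_{j\ge i\ge 1}^{w-1}\frac{C_{ij}}{R_{ij}}\Bigl(\sum_{k>j} q_{ik}\Bigr) = \mathop{\sum\sum}_{j\ge i\ge 1}^{w-1} C_{ij}. \tag{3.3.2.15}$$
Similarly, one has from equations (3.3.2.10) and (3.3.2.11)
$$\sum_{i=2}^{w} q_{i1}\sum_{k=1}^{i-1}\frac{D_{k1}}{S_{k1}} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}} = \mathop{\sum\sum}_{i\ge j\ge 1}^{w-1}\frac{D_{ij}}{S_{ij}}\Bigl(\sum_{k>i} q_{kj}\Bigr) = \mathop{\sum\sum}_{i\ge j\ge 1}^{w-1} D_{ij}. \tag{3.3.2.16}$$
Thus summing the right hand sides as expressed by equations (3.3.2.13) through (3.3.2.16), one gets
$$\sum_{i=1}^{w}\sum_{j=1}^{w} A_{ij} + \sum_{k=1}^{w-1} B_{kk} + \mathop{\sum\sum}_{j\ge i}^{w-1} C_{ij} + \mathop{\sum\sum}_{i\ge j}^{w-1} D_{ij} = N,$$
the total number of people in the initial cohort. Hence, $\phi = N$.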
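Note that from equation (3.3.2.4)
$$A_{ii} + q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} = N q_{ii} \qquad (i = 1,2,\ldots,w),$$
one obtains
$$\hat q_{ii} = \frac{A_{ii}}{N - \sum_{k=1}^{i-1} B_{kk}/\hat Q_{kk}}, \tag{3.3.2.17}$$
which gives for i = 1, and keeping in mind that $B_{00} = 0$,
$$\hat q_{11} = \frac{A_{11}}{N},$$
the estimator of $P[T_1 = 1,\ T_2 = 1]$, the unconditional probability that both components die during the first interval.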
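From equations (3.3.2.7), (3.3.2.9), and (3.3.2.11), one has
$$\Bigl(\sum_{i=2}^{w} A_{ii} + \sum_{i=2}^{w} q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}}\Bigr) + \Bigl(\mathop{\sum\sum}_{j>i\ge 2} A_{ij} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}}\Bigr) + \Bigl(\mathop{\sum\sum}_{i>j\ge 2} A_{ij} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}}\Bigr) = N\Bigl(\sum_{i=2}^{w} q_{ii} + \mathop{\sum\sum}_{i>j\ge 2} q_{ij} + \mathop{\sum\sum}_{j>i\ge 2} q_{ij}\Bigr).$$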
Finally, from equations (3.3.2.13) through (3.3.2.16), one can readily write
$$\hat Q_{11} = \frac{\displaystyle\sum_{i=2}^{w}\sum_{j=2}^{w} A_{ij} + \sum_{i=1}^{w-1} B_{ii} + \mathop{\sum\sum}_{j\ge i\ge 2}^{w-1} C_{ij} + \mathop{\sum\sum}_{i\ge j\ge 2}^{w-1} D_{ij}}{N},$$
the maximum likelihood estimator of $P(T_1>1,\ T_2>1)$.
Going back to equation (3.3.2.17) and letting i = 2, one obtains the maximum likelihood estimator of $P[T_1 = 2,\ T_2 = 2]$,
$$\hat q_{22} = \frac{A_{22}}{N - B_{11}/\hat Q_{11}} = \frac{A_{22}}{N - \dfrac{N\,B_{11}}{\displaystyle\sum_{i\ge 2}\sum_{j\ge 2} A_{ij} + \sum_{k=1}^{w-1} B_{kk} + \mathop{\sum\sum}_{j\ge i\ge 2}^{w-1} C_{ij} + \mathop{\sum\sum}_{i\ge j\ge 2}^{w-1} D_{ij}}}.$$
For i = 3,4,...,w, the corresponding estimates of $q_{ii}$ can be obtained in like manner. Notice the similarity between the above estimate and the corresponding univariate actuarial one.
It is easy to prove the following result, which generalizes a previous result: under the assumed conditions, the bivariate survivorship function at the i-th interval, symbolized by $Q_{ii}$, can be estimated by the method of maximum likelihood as
$$\hat Q_{ii} = \frac{\displaystyle\sum_{x=i+1}^{w}\sum_{y=i+1}^{w} A_{xy} + \sum_{x=i}^{w-1} B_{xx} + \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1} C_{xy} + \mathop{\sum\sum}_{x\ge y\ge i+1}^{w-1} D_{xy}}{N - \displaystyle\sum_{k=1}^{i-1}\frac{B_{kk}}{\hat Q_{kk}}},$$
where w is any finite integer (w > 1). This is now shown below.
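In fact, summing all terms on the left hand side of equations (3.3.2.4), (3.3.2.5), (3.3.2.6) over the cells with $x,y\ge i+1$, one obtains
$$\sum_{x=i+1}^{w} A_{xx} + \mathop{\sum\sum}_{y>x\ge i+1} A_{xy} + \mathop{\sum\sum}_{x>y\ge i+1} A_{xy} = \sum_{y=i+1}^{w}\sum_{x=i+1}^{w} A_{xy} \tag{3.3.2.18}$$
and
$$\mathop{\sum\sum}_{y>x\ge i+1} q_{xy}\sum_{k=x}^{y-1}\frac{C_{xk}}{R_{xk}} = \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1}\frac{C_{xy}}{R_{xy}}\Bigl(\sum_{k>y} q_{xk}\Bigr) = \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1}\frac{C_{xy}}{R_{xy}}\,R_{xy} = \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1} C_{xy}. \tag{3.3.2.19}$$
And similarly,
$$\mathop{\sum\sum}_{x>y\ge i+1} q_{xy}\sum_{k=y}^{x-1}\frac{D_{ky}}{S_{ky}} = \mathop{\sum\sum}_{x\ge y\ge i+1}^{w-1}\frac{D_{xy}}{S_{xy}}\Bigl(\sum_{k>x} q_{ky}\Bigr) = \mathop{\sum\sum}_{x\ge y\ge i+1}^{w-1} D_{xy}. \tag{3.3.2.20}$$
And also
$$\sum_{x=i+1}^{w} q_{xx}\sum_{k=1}^{x-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{y>x\ge i+1} q_{xy}\sum_{k=1}^{x-1}\frac{B_{kk}}{Q_{kk}} + \mathop{\sum\sum}_{x>y\ge i+1} q_{xy}\sum_{k=1}^{y-1}\frac{B_{kk}}{Q_{kk}},$$
and manipulating conveniently the various limits of summation, this equals
$$Q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \sum_{k=i}^{w-1} B_{kk}. \tag{3.3.2.21}$$
The sum of terms on the right hand side of equations (3.3.2.4) through (3.3.2.6) is given by
$$\phi\Bigl(\sum_{x=i+1}^{w} q_{xx} + \mathop{\sum\sum}_{y>x\ge i+1} q_{xy} + \mathop{\sum\sum}_{x>y\ge i+1} q_{xy}\Bigr) = \phi\,Q_{ii},$$
and since $\phi = N$, one simply needs to equate the preceding expression to $N Q_{ii}$. Therefore, one has from the right hand side of equations (3.3.2.18) through (3.3.2.21)
$$\sum_{y=i+1}^{w}\sum_{x=i+1}^{w} A_{xy} + \sum_{k=i}^{w-1} B_{kk} + \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1} C_{xy} + \mathop{\sum\sum}_{x\ge y\ge i+1}^{w-1} D_{xy} + \sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}}\,Q_{ii} = N Q_{ii}.$$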
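And, starting with $\hat Q_{11}$, one can construct all estimates of Q up to that corresponding to the (i-1)-th interval and finally (from the equations above)
$$\hat Q_{ii} = \frac{\displaystyle\sum_{x=i+1}^{w}\sum_{y=i+1}^{w} A_{xy} + \mathop{\sum\sum}_{y\ge x\ge i+1}^{w-1} C_{xy} + \mathop{\sum\sum}_{x\ge y\ge i+1}^{w-1} D_{xy} + \sum_{k=i}^{w-1} B_{kk}}{N - \displaystyle\sum_{k=1}^{i-1}\frac{B_{kk}}{\hat Q_{kk}}}, \tag{3.3.2.22}$$
as postulated.
Since all the estimates of $Q_{ii}$ (i = 1,2,...,w) are available by now, one can then write from equation (3.3.2.4)
$$A_{ii} + q_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{\hat Q_{kk}} = N q_{ii},$$
which gives
$$\hat q_{ii} = \frac{A_{ii}}{N - \displaystyle\sum_{k=1}^{i-1}\frac{B_{kk}}{\hat Q_{kk}}}. \tag{3.3.2.23}$$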
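The recursive character of (3.3.2.22) and (3.3.2.23) can be seen in a small computational sketch. It is offered only as an illustration: the storage layout (0-based lists, with `B[k-1]` holding $B_{kk}$), the function names and the counts are all assumptions, and no safeguard is included for an earlier estimate $\hat Q_{kk}$ equal to zero.

```python
def survivor_estimates(A, B, C, D):
    """Return (N, Q_hat) with Q_hat[i] for i = 1..w-1, following (3.3.2.22).

    A, C, D are w-by-w lists stored 0-based (A[x-1][y-1] holds A_xy);
    B[k-1] holds B_kk for k = 1..w-1.
    """
    w = len(A)
    c_total = sum(C[x][y] for x in range(w - 1) for y in range(x, w - 1))
    d_total = sum(D[x][y] for y in range(w - 1) for x in range(y, w - 1))
    N = sum(map(sum, A)) + sum(B) + c_total + d_total
    Q_hat = {}
    for i in range(1, w):
        num = sum(A[x][y] for x in range(i, w) for y in range(i, w))
        num += sum(B[k - 1] for k in range(i, w))
        num += sum(C[x][y] for x in range(i, w - 1) for y in range(x, w - 1))
        num += sum(D[x][y] for y in range(i, w - 1) for x in range(y, w - 1))
        den = N - sum(B[k - 1] / Q_hat[k] for k in range(1, i))
        Q_hat[i] = num / den  # breaks down if an earlier Q_hat[k] is zero
    return N, Q_hat


def diagonal_estimates(A, B, N, Q_hat):
    """q_hat_ii of (3.3.2.23) for i = 1..w."""
    w = len(A)
    return {i: A[i - 1][i - 1] / (N - sum(B[k - 1] / Q_hat[k] for k in range(1, i)))
            for i in range(1, w + 1)}


# Hypothetical cohort: 10 complete pairs plus 2 pairs censored at the end
# of interval 1 with both components alive; no C or D type censoring.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 1]]
B = [2, 0]
zero = [[0] * 3 for _ in range(3)]
N, Q_hat = survivor_estimates(A, B, zero, zero)
print(N, Q_hat, diagonal_estimates(A, B, N, Q_hat))
```

In this example 8 of 12 pairs survive interval 1; after the 2 censored pairs are removed, 1 of the remaining 6 survives interval 2, so the estimate of $Q_{22}$ is (2/3)(1/6) = 1/9, in agreement with the recursion.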
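Other estimates can be obtained along the same lines. In fact, from equation (3.3.2.5), one has
$$A_{ij} + q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}} = N q_{ij}, \qquad 1<i\le w,$$
and summing over j on both sides, one obtains
$$\sum_{j=i+1}^{w} A_{ij} + \sum_{j=i+1}^{w} q_{ij}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \sum_{j=i+1}^{w} q_{ij}\sum_{k=i}^{j-1}\frac{C_{ik}}{R_{ik}} = N\sum_{j=i+1}^{w} q_{ij},$$
which can be rewritten as
$$\sum_{j=i+1}^{w} A_{ij} + R_{ii}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + \sum_{j=i}^{w-1} C_{ij} = N R_{ii}.$$
And the estimates of $Q_{ii}$ (i = 1,2,...,w) being known, one can then write (from the last equation)
$$\hat R_{ii} = \frac{\displaystyle\sum_{j=i+1}^{w} A_{ij} + \sum_{j=i}^{w-1} C_{ij}}{N - \displaystyle\sum_{k=1}^{i-1}\frac{B_{kk}}{\hat Q_{kk}}}. \tag{3.3.2.24}$$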
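Analogously, one can rewrite equation (3.3.2.6) as
$$A_{ij} + q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}} = N q_{ij}, \qquad 1<j\le w,$$
and summing over i on both sides,
$$\sum_{i=j+1}^{w} A_{ij} + \sum_{i=j+1}^{w} q_{ij}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + \sum_{i=j+1}^{w} q_{ij}\sum_{k=j}^{i-1}\frac{D_{kj}}{S_{kj}} = N\sum_{i=j+1}^{w} q_{ij},$$
which gives immediately
$$\sum_{i=j+1}^{w} A_{ij} + S_{jj}\sum_{k=1}^{j-1}\frac{B_{kk}}{Q_{kk}} + \sum_{i=j}^{w-1} D_{ij} = N S_{jj}$$
or
$$\hat S_{jj} = \frac{\displaystyle\sum_{i=j+1}^{w} A_{ij} + \sum_{i=j}^{w-1} D_{ij}}{N - \displaystyle\sum_{k=1}^{j-1}\frac{B_{kk}}{\hat Q_{kk}}}, \tag{3.3.2.25}$$
an estimate of $S_{jj}$ (j = 1,2,...,w) in terms of $Q_{kk}$.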
Finally, estimates for $R_{ij}$, $S_{ij}$, $q_{ij}$ can be obtained in a straightforward manner. In fact, for j > i equation (3.3.2.5) writes
$$A_{i\beta} + q_{i\beta}\sum_{k=1}^{i-1}\frac{B_{kk}}{Q_{kk}} + q_{i\beta}\sum_{k=i}^{\beta-1}\frac{C_{ik}}{R_{ik}} = N q_{i\beta}.$$
Summing over $\beta$ one has
and that can be rewritten as
which gives for j > i
(1 < i ~ w).
(j > i).
(3.3.2.26)
Similarly, when i > j one obtains
j-l B
i-I D
kk
kj
ASj + q
+ q
Sj k-l Qkk
Sj k=j Skj
2 ---
2
- N qSj
(1 < j :.. w)
and summing over S
w
I A
S-i+l Sj
w
+
j-l B
w
i-I D
w
kj
-N
q ,
Skj
5-i+l Sj
kk
2
q
2
-+ 2 q
2
S-i+l Sj k-l Q
S=i+l Sj k=j
kk
2
which gives for i > j
(1 > j).
(3.3.2.27)
Once the estimates of R and Sij are known, one can finally
ij
write the estimate of qij.
Indeed, from equation (3.3.2.5) for j > i
one has
(j > i)
(3.3.2.28)
and, for i > j, the estimate is obtained directly from equation
(3.3.2.6); that is
N~
(i > j).
j-1 Bkk
~
(3.3.2.29)
-,,-+
( k-1 Qkk
3.4. An extension of the Kaplan-Meier approach
3.4.1. Likelihood expression
An extension of the Kaplan-Meier [1958] procedure for the case of a bivariate sample is suggested in the sequel. The aim here is to obtain an expression for the likelihood of a bivariate sample when the data are incomplete by virtue of losses. This may be obtained by generalizing the Kaplan-Meier approach from the univariate to the bivariate case. Recall that their univariate product-limit estimate is a maximum likelihood estimate.
Consider a cohort of N individuals being observed over time. On each of them two components are observed. As time moves on, one can observe the following:
(i) events below the main diagonal are the ones for which the first component failed or was censored first;
(ii) events above the main diagonal are the ones for which the second component failed or was censored first;
(iii) events along the main diagonal are those for which both failed, or have been censored (lost), or still have both components alive (no failure) within the same interval.
Briefly, recall the univariate Kaplan-Meier procedure. Let
$t_1, t_2, \ldots$ denote failure times, and
$L_1, L_2, \ldots$ denote losses (withdrawals, i.e., censored observations).
The likelihood for the situation above is [P indicates survivorship function]
$$\begin{aligned} L ={}& [P(t_1-0)-P(t_1)]\cdot P(L_1^{(1)})\,P(L_2^{(1)})\\ &\times[P(t_2-0)-P(t_2)]\cdot P(L_1^{(2)})\\ &\times[P(t_3-0)-P(t_3)]\\ &\times[P(t_4-0)-P(t_4)]\cdot P(L_1^{(4)})\,P(L_2^{(4)})\\ &\times[P(t_5-0)-P(t_5)]. \end{aligned}$$
This expression is maximized by increasing $P(L_j^{(i)})$ and $P(t_i-0)$ and decreasing $P(t_i)$ in such a way that it is consistent with the monotonic property of P and with $t_i\le L_j^{(i)}<t_{i+1}$. This is achieved by moving the $L_j^{(i)}$ back to $t_i$ and $t_i-0$ back to $t_{i-1}$, so that
$$P(L_j^{(i)}) = P(t_i) = P_i, \qquad P(t_i-0) = P(t_{i-1}) = P_{i-1}.$$
The above likelihood can then be written
$$L = (P_0-P_1)\,P_1^{2}\,(P_1-P_2)\,P_2\,(P_2-P_3)\,(P_3-P_4)\,P_4^{2}\,(P_4-P_5),$$
with $P_0 = 1$.
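For reference, the univariate product-limit estimate that maximizes a likelihood of this form can be computed as below. This is a generic sketch of the standard Kaplan-Meier estimator, not code from the original work; the function name and data are hypothetical, and losses tied with a failure time are treated as still at risk at that time.

```python
def product_limit(times, events):
    """Kaplan-Meier product-limit estimate.

    events[i] is 1 for a failure and 0 for a loss (censored observation).
    Returns a list of (t, P_hat(t)) at each distinct failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        ties = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            surv *= 1 - deaths / n_at_risk
            out.append((t, surv))
        n_at_risk -= ties
        i += ties
    return out


# Hypothetical data: failures at 1, 3, 4 and losses at 2, 5.
print(product_limit([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
```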
In a bivariate context one would have the following. One wants to maximize $P(T_1>t,\ T_2>t)$. Four quantities will appear:
$P(t_i-0,L_j)-P(t_i,L_j)$, which represents the proportion of items with failure of the first component at $t_i$ and the second censored at $L_j$, for $L_j>t_i$;
$P(L_i,t_j-0)-P(L_i,t_j)$, the proportion of those for which the second component fails at $t_j$ and the first component is censored at $L_i$, for $L_i>t_j$;
$P(L_i,L_j)$, the proportion of items who drop out at $L_i = L_j$ (main diagonal); and
$$\Delta^2 P(t_i,t_j) = P(t_i-0,t_j-0) - P(t_i-0,t_j) - P(t_i,t_j-0) + P(t_i,t_j),$$
which represents the proportion of items with failure of both components between $(t_i-0,t_j-0)$ and $(t_i,t_j)$.
Thus one would like to maximize something proportional to
$$[P(t_i-0,L_j)-P(t_i,L_j)]\cdot[P(L_i,t_j-0)-P(L_i,t_j)]\cdot P(L_i,L_j)\cdot\Delta^2 P(t_i,t_j).$$
Before writing the expression of likelihood in the bivariate
(i)
situation, look at how the points L
j
nearest t in the univariate case.
,ti-O are moved back to the
The joint survivorship function has
the property that P(ti,t ) increases as (ti,t ) decreases.
j
P(ti,t ) by P •
j
ij
P(L
74
,L
31
P(t ,t )
7 S
j
Denote
The possible cases are, for instance:
)
= P73
= P 72
(increases as L is moved back to nearest t)
(it remains in·the same position)
P(t -O,t -O) - P
S
67
7
(both components move back so it increases,
as required).
Remark. How about differences like $P(t_8-0,L_{13}) - P(t_8,L_{13})$? According to the convention that has been used, one should have it expressed as $P_{71}-P_{81}$. In general, in the likelihood expression, one is dealing with pieces like $P(t_i-0,L_j)-P(t_i,L_j)$ and one would like to increase this difference. This can be done by increasing $P(t_i-0,L_j)$ and decreasing $P(t_i,L_j)$.
Therefore to proceed further one needs the following.
Assumption 1.
(t
2
~ t
Assume P(tl,t) - P(t 2 ,t)
~
as t
+ for
all
t ,t 2
l
l )·
Similarly, to take care of P(Li,tj-O) - P(Li,t ) one needs also,
j
P(t,t ) - P(t,t ) decreases as t increases for all t ,t with
l
2
l 2
t
2
~ t
l
•
Note that P(t,t ) - P(t,t ) for fixed t represents the proporl
2
tion of items with failure of the second component in (t ,t ).
l 2
Sim-
ilarly, P(tl,t) - P(t ,t) represents the proportion of items with failure
2
of the first component in (t ,t 2) for fixed t.
l
Finally, consider the quantity 6P(t ,t ).
i j
In the likelihood,
one wants to increase P(ti-O,tj-O), P(ti,t j ); as for P(ti-O,t ) and
j
P(ti,tj-O) they should decrease.
If one follows the convention, then
P(ti-O,t j ) - Pi-l,j
increase
P(ti,tj-O) - Pi,j-l
increase,
and this is exactly the opposite of what is required.
Note that the monotonicity of P(ti,t ) says that
j
for t
< t < ••• < t ' where the superscript identifies the correw
2
l
sponding component.
If is also assumed that
(1)
t(l) < L(l)
<
t
+l
i
i
- i
L(2) < t (2)
t(2)
~ i
i +l ,
i
(i • 1;2, ••• ,w-l)
which says that in the interval [ti,t i +l ) a loss due to either component may occur at t
i
but not at t i + •
l
At this point, another assumption is required to avoid a flagrant contradiction of having to increase and decrease simultaneously the term $P_{i-1,j}$ in the expression of the likelihood. So that, one requires the following.
Assumption 2.
For ti ~ t 1 , ti ~ t z it is assumed that the above
second difference of P(·,·) over [ti,ti;tl,t
z)
decreases with ti,ti;
i.e., let
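Under the assumptions the log likelihood may be written as
$$\begin{aligned}\log L ={}& \sum_{i=1}^{k}\ell(i)\log P_{i,i} + \sum_{i<j}\ell_{21}(j\mid i)\log(P_{i-1,j}-P_{i,j}) + \sum_{j<i}\ell_{12}(i\mid j)\log(P_{i,j-1}-P_{i,j})\\ &+ \sum_{i<j} m_{21}(j\mid i)\log(P_{i-1,j-1}-P_{i-1,j}-P_{i,j-1}+P_{i,j})\\ &+ \sum_{j<i} m_{12}(i\mid j)\log(P_{i-1,j-1}-P_{i-1,j}-P_{i,j-1}+P_{i,j}) \end{aligned} \tag{3.4.1.1}$$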
which is to be numerically maximized subject to the conditions that $P_{i,j}\ge P_{i+1,j}$, $P_{i,j}\ge P_{i,j+1}$ for all i,j = 1,2,...,k and $P_{0,0} = 1$, and where
$\ell(i)$ represents the number of losses in the i-th interval;
$\ell_{21}(j\mid i)$, the number of losses in the j-th interval given that the first component has failed in the i-th interval (i < j);
$\ell_{12}(i\mid j)$, the number of losses in the i-th interval given that the second component has failed in the j-th interval (j < i);
$m_{21}(j\mid i)$, the number of failures in the j-th interval due to the second component given that the first component has failed in the i-th interval (i < j); and
$m_{12}(i\mid j)$, the number of failures in the i-th interval due to the first component given that the second component has failed in the j-th interval (j < i).
The estimates for $P_{ij}$, for all i,j, may be obtained from the likelihood equation (3.4.1.1) above, by numerically maximizing it under Assumptions 1 and 2.
3.4.2. Checking assumptions
(a) Using Freund's bivariate distribution, one has regarding Assumption 1:
1
- a+e-S' [a exp{-S't-(a+S-S')t 2}
+ (8-8') exp{-(a+8)t}]
1
P(t ,t)-P(t ,t) • a+S-S' a exp{-S't} (exp{-(a+S-S')t 1 }
2
1
- exp{-(a+S-S')t 2 })·
Let a+S-S' • c.
Thus
decreases as t increases.
a' > a
8' > S
0+8
~
a'
ii)
for t
> t > t > 0
Z- 1
P(t1,t)-P(tz,t) •
~[a
-
exp{-a'tl-(a+a-a')t}+(a-a') exp{-(a+a)t }]
l
~
[a exp{-a'tz-(a+a-a')t}
+ (a-a') exp{-(a+B)t }].
z
Let a+a-a' '" k.
Hence,
so that P(tl,t) - P(tz,t) decreases as t increases for all
tl,tz,t (t z
~
t l )·
Similarly one can show that the assumption holds whenever tz~t>tl>O.
(b)
Assumption Z is also shown to hold for Freund's bivariate
distribution.
In fact, for u
l
2
t ,
l
U
z
~
t , one wants to show that
z
decreases as ul,u Z increases.
Recall that
+ (a-a') exp{-(a+B)t }]
l
where
c • a+a-a',
a' > a,
a' > a,
a+a
~
a'.
Note that
u
l
U
t , u2
l
~
~
t 2 , and t l
~
t
2
I
gives . u l
~ t
~
l < u2
u2
~
tl
~ t
<
2
t2
2
6 p • ~ [a eXP{-S'u 2-(a+S-S')u l } + (B-B') exp{-(a+B)u }]
2
- ~ [a exp{-B't 2-(a+B-B')u1 } + (B-B') exp{-(a+B)t }]
2
- ~ [a exp{-S'u 2-(a+B-S')t } + (B-S') exp{-(a+S)u }]
l
2
+ ~ [a exp{-B't 2-(a+B-S')t } + (S-S') exp{-(a+S)t }]
l
2
I
•C
[a
exp{-B'u 2-(a+B-B')u l }- a exp{-S't 2-(a+B-B')u }
l
- a exp{-B'u 2-(a+B-B')t l } + a exp{-B't -(a+B-B')t }]
2
l
• %[exp{-B'u 2}
(exp{-(a+B-S')u l } - exp{-(a+B-B')t l }
- exp{-(a+B-B')u } exp{-B't } + exp{-B't -(a+B-B')t }]
l
2
2
l
and obviously
2
P decreases as u ,u
increases.
l 2
It can be easily shown that for the other cases, namely,
ul
~
u
~
2
3.5.
u2
~
t 2 < t l , u2 ~ ul < t l < t 2 , u2 ~ t 2 < u l ~ t l ,
u < t < t and u l ~ u 2 ~ t 2 < t l , the condition is satisfied.
l
2
l
~
Failure rates
Using some of the previous results, it is now possible to get
estimates for the failure rates, when the set of data appear in grouped
form.
In fact, under the conditions of Section 3.3, these rates may be
obtained as follows.
From equation (2.4.1.1) one may write
-1
"
P(t,t)
Now let
be the length of the i-th interval.
Thus, using the notation of Sec-
tion 3.3
-1
=-"
but notice that
and
Hence,
w
"
-[Qi+l,i
Rii =
And therefore,
w
~
10
(i-th):::I
-.
"
1
Qi,i
Analogously, one may write
which yields, as above
-.
"
1
Qi
,i
L
kai+l
L Ci
k=i+l ik
hi
Pike
Now, since
it follows that
w
I q
,..
,..
A (i-th)
20
1
5i i l
---.~=-_.
,..
Qi,i
h
k-i+l ki
h
'"
.
i
i · Qi,i
Analogously,
1
In general, for the case when i < j, one
~12 (j Ii)
but
can write
I
Consequently,
(j > i).
On the other hand, for j < i, it is clear that
and since
one has
(j < i).
For instance, under the conditions assumed in Section 3.3, one will have after some simplification
CHAPTER IV
SOME HYPOTHESIS TESTING
Chapter 4 deals with some of the problems concerning hypothesis testing. The aim here is to begin exploring the statistical aspects involved, such as the test for independence of the components when the data available appear in a continuous form and are subject to censoring. The case when the data are in grouped form is also considered. These ideas are illustrated by two examples, one involving hypothetical data and the other based upon data for 2,197 women.
4.1. Test of independence for continuous data
4.1.1. Complete observations
Chapter 2 of the present work was concerned mainly with estimation in a given parametric context, that is, with a situation where an appropriate probability distribution on the sample space of the model, namely Freund's bivariate exponential distribution, was known apart from the values of a finite number of unknown parameters, in that case four in all. To avoid imposing strong unrealistic assumptions on that bivariate family of distributions, the use of a nonparametric approach to deal with the problem of testing is called for.
Suppose, then, that a set of N bivariate observations $(T_{11},T_{21}),(T_{12},T_{22}),\ldots,(T_{1N},T_{2N})$, one observation on each of N subjects, each pair representing the eventual survival times of the first and second components, respectively, is given. The null hypothesis of interest here is that the two variates involved in the bivariate structure are independent. That is, if $\tau$ represents a measure of association between the two components, then one would like to test whether $\tau = 0$.
It is worth noticing that this type of data, when displayed in a format that encompasses three or more categories, carries the knowledge that there is an order between the categories, and this conveys new statistical information which one may use in measuring association. The alternatives considered deal with those types of dependence for which an appropriate measure of association differs from zero. A natural way to handle the problem is, therefore, through the use of ranks. That being the case, the measurement of association is now seen to be simply the problem of measuring the correlation between the two rankings that are defined. The usual way of testing the hypothesis of independence in this context is through Kendall's [1970] correlation coefficient, $\tau$, defined as
$$\tau = P[(T_{1i}-T_{1j})(T_{2i}-T_{2j})>0] - P[(T_{1i}-T_{1j})(T_{2i}-T_{2j})<0].$$
Assuming continuous marginals, a natural estimator for this parameter is
$$\hat\tau = \frac{\displaystyle\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} U_{ij}}{\tfrac{1}{2}\,n(n-1)}, \qquad 1\le i<j\le n,$$
where
$$U_{ij} = \begin{cases} 1 & \text{if } (T_{1i}-T_{1j})(T_{2i}-T_{2j})>0,\\ -1 & \text{if } (T_{1i}-T_{1j})(T_{2i}-T_{2j})<0. \end{cases}$$
For a one-sided test of the null hypothesis, $H_0$, of independence against the alternative of the two components being positively correlated ($\tau>0$), the decision rule, at the $\alpha$ level of significance, is to reject $H_0$ whenever the score function
$$K = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} U_{ij} \ge c(\alpha,n),$$
where $c(\alpha,n)$ is a constant such that $P(K\ge c(\alpha,n)) = \alpha$. Values of $c(\alpha,n)$ are given in Kendall [1970], Table 1 (Appendix).
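The score K and the estimator of $\tau$ for complete observations can be computed directly from the definition of $U_{ij}$. The sketch below is illustrative; the function names and the three-point example are hypothetical, and pairs with a zero product (ties) contribute nothing to the score.

```python
def kendall_score(t1, t2):
    """K = sum over pairs i < j of U_ij, with U_ij = +1 for a concordant
    pair, -1 for a discordant pair and 0 when the product is zero."""
    n = len(t1)
    k = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (t1[i] - t1[j]) * (t2[i] - t2[j])
            k += (prod > 0) - (prod < 0)
    return k


def tau_hat(t1, t2):
    """Estimator of Kendall's tau: K divided by n(n-1)/2."""
    n = len(t1)
    return kendall_score(t1, t2) / (0.5 * n * (n - 1))


# Hypothetical lifetimes of the two components for three subjects.
print(kendall_score([1, 2, 3], [2, 1, 3]), tau_hat([1, 2, 3], [2, 1, 3]))
```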
4.1.2. Censored observations
When censoring occurs, one no longer can use the score function above. However, it is possible to test the hypothesis of independence under censoring by properly modifying it. In fact, in cases where just one of the two components is right-censored, Brown, Hollander and Korwar [1974] have proposed a permutation test based on Kendall's tau. In the sequel an extension of their test is suggested for the case when both components are subject to right-censoring.
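Thus, suppose that both components $T_1$ and $T_2$ are censored. In this case, the data may be described as follows:
$$(V_i,\gamma_i,Z_i,\delta_i), \qquad V_i = \min(T_{1i},C_i), \quad Z_i = \min(T_{2i},C_i),$$
where
$$\gamma_i = \begin{cases} 1 & \text{if } V_i = T_{1i}\\ 0 & \text{if } V_i = C_i \ \text{(i.e., } T_{1i} \text{ is censored at } C_i)\end{cases} \qquad \delta_i = \begin{cases} 1 & \text{if } Z_i = T_{2i}\\ 0 & \text{if } Z_i = C_i \ \text{(i.e., } T_{2i} \text{ is censored at } C_i)\end{cases}$$
(i = 1,2,...,N).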
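The score function, K, can now be rewritten as
$$K = \sum_{i<j} a_{ij}\,b_{ij},$$
where
$$a_{ij} = \begin{cases} 1 & \text{if } T_{1i}>T_{1j} \ \text{(decision based on } V_i,V_j,\gamma_i,\gamma_j)\\ 0 & \text{if } T_{1i} = T_{1j} \text{ or 'not sure'}\\ -1 & \text{if } T_{1i}<T_{1j} \ \text{(decision based on } V_i,V_j,\gamma_i,\gamma_j)\end{cases}$$
and
$$b_{ij} = \begin{cases} 1 & \text{if } T_{2i}>T_{2j} \ \text{(decision based on } Z_i,Z_j,\delta_i,\delta_j)\\ 0 & \text{if } T_{2i} = T_{2j} \text{ or 'not sure'}\\ -1 & \text{if } T_{2i}<T_{2j} \ \text{(decision based on } Z_i,Z_j,\delta_i,\delta_j)\end{cases}$$
(i,j = 1,2,...,N).
The term 'not sure' in the definitions above means that because of censoring one does not know how $T_{1i},T_{1j}$ ($T_{2i},T_{2j}$) are ordered.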
For
instance, suppose that C > T
> T
and C > T
> T ' i.e., neili
Zj
i
lj
Zi
i
ther component is censored.
In terms of the variables V's and Z's,
= 1,
that event will happen whenever Yj
or Yi = 0, Yj
= I,
Vi = Vj and 0i
case the score will be a
ij
= 1,
b
Vi >
~;
= 0, OJ = 1,
ij
and
Zi
OJ = 1,
= Zj'
Zi > Zj'
And in this
= 1.
As mentioned earlier (see Section 2.2.2), it is assumed that the random censoring is independent of each bivariate observation $(T_{1i},T_{2i})$ (i = 1,2,...,N). Let $K(w^*)$ be the observed value of the statistic K when censoring affects both components. Allowing the $T_{1i}$'s, for instance, to permute among themselves, the observed values of K will comprise a set with N! values. If $(i_1,i_2,\ldots,i_N)$ is an arbitrary permutation of (1,2,...,N), then the conditional probability (given the hypothesis, $H_0$, of independence)
$$P(W^* = f(w^*)) = (N!)^{-1}, \tag{4.2.1.1}$$
where $f(w^*)$ is an element of the set of N! transformations induced by the function f. It can be shown in the theory of nonparametric tests that equation (4.2.1.1) holds only if the independence of $T_1$ and $T_2$ implies the independence of $(V,\gamma)$ and $(Z,\delta)$.
Under the framework above, one can define an $\alpha$ size test as follows: let
$$K_{(1)}(w^*)\le K_{(2)}(w^*)\le\cdots\le K_{(N!)}(w^*)$$
denote the N! ordered values of $K(f(w^*))$, where $f(w^*)$ is an element of the permutation set. Following Brown, Hollander and Korwar, one can define a test statistic as
$$t(w^*) = \begin{cases} 1 & \text{if } K(w^*)>K_{(m^*)}(w^*)\\ 0 & \text{if } K(w^*)<K_{(m^*)}(w^*)\end{cases}$$
where $m^* = N! - [(N!)\alpha]$, the brackets indicating the largest integer not superior to the number inside it. If $K(w^*) = K_{(m^*)}(w^*)$, then the test should be randomized so as to give the level $\alpha$.
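A permutation test of this kind can be sketched for small N by enumerating all N! rearrangements of the first component. The scoring rule used below for 'not sure' comparisons (a pair is ordered only when the smaller observed value is uncensored) is an assumption in the spirit of Brown, Hollander and Korwar, as is the use of the sum over pairs i < j; the function names and data are hypothetical, and the randomization step at the boundary is not implemented.

```python
from itertools import permutations


def order_score(x_i, d_i, x_j, d_j):
    """+1 if lifetime i is known to exceed lifetime j, -1 if known to be
    smaller, 0 if tied or 'not sure'. x is the observed value and d is 1
    for an uncensored value, 0 for a right-censored one."""
    if (x_i > x_j and d_j == 1) or (x_i == x_j and d_i == 0 and d_j == 1):
        return 1
    if (x_i < x_j and d_i == 1) or (x_i == x_j and d_i == 1 and d_j == 0):
        return -1
    return 0


def k_score(first, second):
    """K = sum over i < j of a_ij * b_ij; inputs are lists of (value, indicator)."""
    n = len(first)
    return sum(order_score(*first[i], *first[j]) * order_score(*second[i], *second[j])
               for i in range(n - 1) for j in range(i + 1, n))


def permutation_p_value(first, second):
    """Observed K and the share of the N! permutations of the first
    component giving a score at least as large (one-sided)."""
    observed = k_score(first, second)
    count = total = 0
    for perm in permutations(first):
        total += 1
        count += k_score(list(perm), second) >= observed
    return observed, count / total


# Hypothetical data: (observed value, 1 = failure / 0 = censored).
first = [(1, 1), (2, 0), (3, 1), (4, 1)]
second = [(2, 1), (1, 1), (5, 0), (6, 1)]
print(permutation_p_value(first, second))
```

Full enumeration is only practical for small cohorts; for larger N a random sample of permutations would be the usual substitute.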
4.2. Test of independence for grouped data
4.2.1. Complete observations
Suppose that for N individuals in the cohort there are exactly
2
N distinct, ordered, uncensored survival times $(T_{11},T_{21}),(T_{12},T_{22}),\ldots,(T_{1N},T_{2N})$. Let $A_{ij}$ represent the number of individuals in the cohort whose first component fails (or dies) in the i-th interval and the second in the j-th interval $(i = 1,2,\ldots,w_1;\ j = 1,2,\ldots,w_2)$ as before. Consider the corresponding $w_1\times w_2$ contingency table that can be formed from the data:
$$\begin{array}{cccc} A_{11} & A_{12} & \cdots & A_{1w_2}\\ A_{21} & A_{22} & \cdots & A_{2w_2}\\ \vdots & \vdots & & \vdots\\ A_{w_1 1} & A_{w_1 2} & \cdots & A_{w_1 w_2} \end{array}$$
with $N = \sum_{i}\sum_{j} A_{ij}$.
If there are no ties among the true survival times, there will be N failures (or deaths) by the end of the study. In this context, the composite hypothesis one shall be interested in testing is that the two components are independent, that is,
$$H_0:\ q_{ij} = q_{i\cdot}\,q_{\cdot j} \quad \text{for all } i,j,$$
against the alternative
$$H_1:\ q_{ij}\ne q_{i\cdot}\,q_{\cdot j} \quad \text{for some } i,j,$$
where $q_{i\cdot}$ and $q_{\cdot j}$ are arbitrary positive parameters subject to the condition
$$\sum_{i} q_{i\cdot} = \sum_{j} q_{\cdot j} = 1.$$
This is analogous to testing the hypothesis of no correlation in a bivariate normal population. The null hypothesis, $H_0$, is that the two components, represented by their eventual survival times $T_1$ and $T_2$, respectively, are independent. Since the two components are acting on the same initial item, it is naturally expected for them to be associated. As is well known, under the null hypothesis the family of possible distributions is the multinomial parametrized by q, and the set of possible q is
$$\Theta_0 = \Bigl\{q;\ q_{ij} = q_{i\cdot}\,q_{\cdot j},\ \sum_{i} q_{i\cdot} = \sum_{j} q_{\cdot j} = 1\Bigr\}.$$
So, once the maximum likelihood estimates of $q_{ij}$ under the null hypothesis are evaluated, the test statistic reduces to the usual form
which is distributed in the limit, as N becomes large, as $\chi^2$ with $(w_1-1)(w_2-1)$ degrees of freedom. This result enables one to easily determine a test which is of size $\alpha$ (in the sense that $\beta(q) = \alpha$ for all q in $\Theta_0$, $\beta$ being the type I error), namely, the test with critical region
where $K_\alpha$ is the upper $100\alpha$ percent point of a $\chi^2_{(w_1-1)(w_2-1)}$ distribution.
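For a complete $w_1\times w_2$ table the two customary large-sample statistics for independence, Pearson's $X^2$ and the likelihood ratio statistic $-2\log\lambda$, can be computed as sketched below. Both are shown because either has the limiting $\chi^2$ distribution with $(w_1-1)(w_2-1)$ degrees of freedom; the function name and the tables are hypothetical, and every row and column total is assumed to be positive.

```python
import math


def independence_statistics(A):
    """Return (Pearson X2, likelihood ratio G2, degrees of freedom)
    for a w1-by-w2 table of counts A with positive margins."""
    rows = [sum(r) for r in A]
    cols = [sum(c) for c in zip(*A)]
    N = sum(rows)
    x2 = g2 = 0.0
    for i, r in enumerate(A):
        for j, a in enumerate(r):
            e = rows[i] * cols[j] / N   # expected count under independence
            x2 += (a - e) ** 2 / e
            if a > 0:                   # empty cells contribute zero to G2
                g2 += 2 * a * math.log(a / e)
    df = (len(rows) - 1) * (len(cols) - 1)
    return x2, g2, df


# Hypothetical 2-by-2 table with strong association.
print(independence_statistics([[10, 0], [0, 10]]))
```

The hypothesis of independence would be rejected at level $\alpha$ when the chosen statistic exceeds the upper $100\alpha$ percent point of the corresponding $\chi^2$ distribution.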
4.2.2. Censored observations
Here, once again, suppose that for the whole period of observation the life table estimates were calculated from grouped data arising from a partition of the period of observation into w intervals. The imposed condition
$$\sum_{i=1}^{w}\sum_{j=1}^{w} q_{ij} = 1$$
indicates that each item will have both components failed by the end of the study, whether either or both of them are censored or not. In this situation the data appear in the following form:
$$\begin{array}{cccc} A_{11}\ B_{11}\ C_{11}\ D_{11} & A_{12}\ C_{12} & \cdots & \\ A_{21}\ D_{21} & A_{22}\ B_{22}\ C_{22}\ D_{22} & \cdots & \\ \vdots & \vdots & \ddots & \\ & & & A_{ww} \end{array}$$
with
$$N = \sum_{i}^{w}\sum_{j}^{w} A_{ij} + \sum_{i}^{w-1} B_{ii} + \mathop{\sum\sum}_{j\ge i}^{w-1} C_{ij} + \mathop{\sum\sum}_{i\ge j}^{w-1} D_{ij},$$
the observed random variables $A_{ij}$'s, $B_{ii}$'s, $C_{ij}$'s, $D_{ij}$'s being interpreted as in Section 3.3.
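The problem of testing the hypothesis of independence in such a table presents enormous difficulties. To express such a complicated pattern of interdependence in terms of a single coefficient is very difficult. However, for large N, one can use the following procedure, which enables one to obtain a good approximation. In fact, to test the hypothesis of independence between the two components, namely
$$H_0:\ q_{ij} = q_{i\cdot}\,q_{\cdot j} \qquad (i,j = 1,2,\ldots,w),$$
which may be stated as
$$H_0:\ q\in\Theta_0\subset\Theta,$$
where $\Theta$ is the sample space defined by
$$\Theta = \Bigl\{q;\ 0<q_{ij}\le 1,\ \sum_{i,j} q_{ij} = 1 \ \text{for all } i,j\Bigr\},$$
against the alternative
$$q_{ij}\ne q_{i\cdot}\,q_{\cdot j} \qquad \text{for some } i,j,$$
one may form the likelihood ratio
$$\lambda(A_{ij},B_{ii},C_{ij},D_{ij}) = \frac{\sup_{q\in\Theta_0} L(A_{ij},B_{ii},C_{ij},D_{ij};q)}{\sup_{q\in\Theta} L(A_{ij},B_{ii},C_{ij},D_{ij};q)},$$
whose critical region, for a fixed size $\alpha$, of $\Theta_0$ against $\Theta$ is given by
$$\{\lambda\le K\},$$
where K is such that
$$\sup_{q\in\Theta_0} P(\lambda\le K) = \alpha.$$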
This ex. size test is hard to obtain since i(Aij,Bii,Cij,Dij) is rather
a complicated function of the observed random variables A'j,B.i,C .. ,
~
~
~J
D and the problem of determining their joint distribution is far
ij
from trivial.
But, for large N, the large sample theory says that
-2 log i is distributed, for all q £ eO' approximately as X2 with
(w-1)(w-1) degrees of freedom.
The number of restrictions on $q$ required to define $\Theta_0$ may be counted as follows. There are $w + w - 2$ free parameters in the restricted parameter space $\Theta_0$, namely $q_{1\cdot}, q_{2\cdot}, \ldots, q_{w-1,\cdot}$, but not $q_{w\cdot}$ because of the condition $\sum_i q_{i\cdot} = 1$, and $q_{\cdot 1}, q_{\cdot 2}, \ldots, q_{\cdot,w-1}$, but not $q_{\cdot w}$ because of the implied condition $\sum_i q_{\cdot i} = 1$. Similarly, there are $w^2 - 1$ free parameters in $\Theta$, namely $q_{11}, q_{12}, q_{21}, \ldots, q_{w-1,w}$, but not $q_{w,w}$, since $\sum_i \sum_j q_{ij} = 1$. So in order to ensure that $q \in \Theta_0$, one must impose in all $(w^2 - 1) - (w + w - 2) = (w-1)(w-1)$ restrictions on $q \in \Theta$.
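The parameter count above can be checked mechanically. The following is a minimal sketch in Python; the function names are illustrative and are not part of the original text.

```python
def free_parameters_full(w: int) -> int:
    """Free cell probabilities q_ij in a w x w table whose cells sum to 1."""
    return w * w - 1


def free_parameters_independent(w: int) -> int:
    """Free marginals under independence: (w - 1) row and (w - 1) column probabilities."""
    return (w - 1) + (w - 1)


def restrictions(w: int) -> int:
    """Restrictions needed to pass from the full space to the independence space."""
    return free_parameters_full(w) - free_parameters_independent(w)


for w in (4, 5):
    # Both columns agree: the difference equals (w - 1)(w - 1).
    print(w, restrictions(w), (w - 1) * (w - 1))
```

For $w = 5$ and $w = 4$ this gives 16 and 9, the degrees of freedom used in Examples 1 and 2.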
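Under the null hypothesis, note that
$$Q^0_{ii} = \sum_{\alpha=i+1}^{w} \sum_{\beta=i+1}^{w} q_{\alpha\beta} = \sum_{\alpha=i+1}^{w} q_{\alpha\cdot} \sum_{\alpha=i+1}^{w} q_{\cdot\alpha}$$
and
$$S^0_{ij} = \sum_{\alpha=i+1}^{w} q_{\alpha j} = \sum_{\alpha=i+1}^{w} q_{\alpha\cdot}\, q_{\cdot j} = q_{\cdot j} \sum_{\alpha=i+1}^{w} q_{\alpha\cdot} \qquad (i \ge j).$$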
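The factorization of these survival-type probabilities under independence can be illustrated numerically. The sketch below assumes the marginals are held in plain Python lists (`row[k-1]` for $q_{k\cdot}$, `col[k-1]` for $q_{\cdot k}$); the function names and the numerical values are hypothetical.

```python
import math


def q0(row, col, i):
    """Q^0_ii: probability that both components survive beyond interval i (1-based)."""
    return sum(row[i:]) * sum(col[i:])


def s0(row, col, i, j):
    """S^0_ij (i >= j): second component fails in interval j, first survives beyond i."""
    return col[j - 1] * sum(row[i:])


# Hypothetical marginals for w = 4 intervals.
row = [0.1, 0.2, 0.3, 0.4]
col = [0.4, 0.3, 0.2, 0.1]
w = len(row)

# Direct double sum over the cells q_ab = q_a. * q_.b, for comparison.
direct = sum(row[a] * col[b] for a in range(1, w) for b in range(1, w))
print(math.isclose(q0(row, col, 1), direct))
print(s0(row, col, 2, 1))
```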
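Now the maximization of the likelihood $L(A_{ij},B_{ii},C_{ij},D_{ij};q)$ for $q \in \Theta_0$ is done by the method of Lagrange multipliers. Noticing that only two multipliers are needed, say $\lambda_1$ and $\lambda_2$, the log likelihood equation can be written
$$\log L = \sum_{i=1}^{w} A_{i\cdot} \log q_{i\cdot} + \sum_{j=1}^{w} A_{\cdot j} \log q_{\cdot j} + \cdots$$
which leads, after taking derivatives, to the restricted likelihood equations:
$$\frac{A_{K\cdot} + \sum_{j=K}^{w-1} C_{Kj}}{q_{K\cdot}} + \sum_{\alpha=1}^{K-1} \frac{B_{\alpha\alpha}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} + \sum_{\beta=1}^{K-1} \sum_{\alpha=\beta}^{K-1} \frac{D_{\alpha\beta}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} - \lambda_1 = 0 \qquad (K = 1,2,\ldots,w) \tag{4.2.2.1}$$
$$\frac{A_{\cdot\ell} + \sum_{i=\ell}^{w-1} D_{i\ell}}{q_{\cdot\ell}} + \sum_{\alpha=1}^{\ell-1} \frac{B_{\alpha\alpha}}{\sum_{j=\alpha+1}^{w} q_{\cdot j}} + \sum_{\alpha=1}^{\ell-1} \sum_{\beta=\alpha}^{\ell-1} \frac{C_{\alpha\beta}}{\sum_{j=\beta+1}^{w} q_{\cdot j}} - \lambda_2 = 0 \qquad (\ell = 1,2,\ldots,w) \tag{4.2.2.2}$$
$$\sum_{i=1}^{w} \sum_{j=1}^{w} q_{ij} - 1 = 0$$
with $B_{00} \equiv 0$.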
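From equation (4.2.2.1),
$$A_{K\cdot} + \sum_{j=K}^{w-1} C_{Kj} + q_{K\cdot} \sum_{\alpha=1}^{K-1} \frac{B_{\alpha\alpha}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} + q_{K\cdot} \sum_{\beta=1}^{K-1} \sum_{\alpha=\beta}^{K-1} \frac{D_{\alpha\beta}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} = \lambda_1 q_{K\cdot},$$
and summing over $K$, one has
$$\sum_{K=1}^{w} A_{K\cdot} + \sum_{K=1}^{w-1} \sum_{j=K}^{w-1} C_{Kj} + \sum_{K=1}^{w} q_{K\cdot} \sum_{\alpha=1}^{K-1} \frac{B_{\alpha\alpha}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} + \sum_{K=1}^{w} q_{K\cdot} \sum_{\beta=1}^{K-1} \sum_{\alpha=\beta}^{K-1} \frac{D_{\alpha\beta}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} = \lambda_1. \tag{4.2.2.3}$$
But notice that
$$\sum_{K=1}^{w} q_{K\cdot} \sum_{\alpha=1}^{K-1} \frac{B_{\alpha\alpha}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} = \sum_{\alpha=1}^{w-1} \frac{B_{\alpha\alpha}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} \sum_{K=\alpha+1}^{w} q_{K\cdot} = \sum_{\alpha=1}^{w-1} B_{\alpha\alpha}$$
and also
$$\sum_{K=1}^{w} q_{K\cdot} \sum_{\beta=1}^{K-1} \sum_{\alpha=\beta}^{K-1} \frac{D_{\alpha\beta}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} = \sum_{\beta=1}^{w-1} \sum_{\alpha=\beta}^{w-1} \frac{D_{\alpha\beta}}{\sum_{i=\alpha+1}^{w} q_{i\cdot}} \sum_{K=\alpha+1}^{w} q_{K\cdot} = \sum_{\beta=1}^{w-1} \sum_{\alpha=\beta}^{w-1} D_{\alpha\beta}.$$
Thus equation (4.2.2.3) can be rewritten as
$$\sum_{K=1}^{w} A_{K\cdot} + \sum_{\alpha=1}^{w-1} B_{\alpha\alpha} + \sum_{K=1}^{w-1} \sum_{j=K}^{w-1} C_{Kj} + \sum_{\beta=1}^{w-1} \sum_{\alpha=\beta}^{w-1} D_{\alpha\beta} = \lambda_1,$$
so that under the null hypothesis
$$\lambda_1 = N.$$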
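The interchange of summation used above, which collapses the weighted sum of the $B_{\alpha\alpha}$ terms to $\sum_{\alpha} B_{\alpha\alpha}$, can be verified numerically. This is a small sketch with made-up values; `q` stands for the marginals $q_{K\cdot}$ and `B` for $B_{11},\ldots,B_{w-1,w-1}$, both hypothetical.

```python
import math


def weighted_sum(q, B):
    """Sum over K of q_K. * sum_{a < K} B_aa / (q_{a+1}. + ... + q_w.)."""
    w = len(q)
    total = 0.0
    for K in range(1, w + 1):
        # sum(q[a:]) is the probability of failing after interval a (1-based a).
        inner = sum(B[a - 1] / sum(q[a:]) for a in range(1, K))
        total += q[K - 1] * inner
    return total


q = [0.1, 0.2, 0.3, 0.4]   # hypothetical marginals, summing to 1
B = [3, 1, 2]              # hypothetical censored counts B_11, B_22, B_33

print(weighted_sum(q, B), sum(B))
```

The two printed values agree up to rounding, whatever positive marginals are used, because each $B_{\alpha\alpha}$ is divided by exactly the total weight it is later multiplied by.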
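Similarly, from equation (4.2.2.2),
$$A_{\cdot\ell} + \sum_{i=\ell}^{w-1} D_{i\ell} + q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \frac{B_{\alpha\alpha}}{\sum_{j=\alpha+1}^{w} q_{\cdot j}} + q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \sum_{\beta=\alpha}^{\ell-1} \frac{C_{\alpha\beta}}{\sum_{j=\beta+1}^{w} q_{\cdot j}} = \lambda_2 q_{\cdot\ell}.$$
Summing the last equation over $\ell$,
$$\sum_{\ell=1}^{w} A_{\cdot\ell} + \sum_{\ell=1}^{w-1} \sum_{i=\ell}^{w-1} D_{i\ell} + \sum_{\ell=1}^{w} q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \frac{B_{\alpha\alpha}}{\sum_{j=\alpha+1}^{w} q_{\cdot j}} + \sum_{\ell=1}^{w} q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \sum_{\beta=\alpha}^{\ell-1} \frac{C_{\alpha\beta}}{\sum_{j=\beta+1}^{w} q_{\cdot j}} = \lambda_2, \tag{4.2.2.4}$$
and noticing that, as before,
$$\sum_{\ell=1}^{w} q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \frac{B_{\alpha\alpha}}{\sum_{j=\alpha+1}^{w} q_{\cdot j}} = \sum_{\alpha=1}^{w-1} B_{\alpha\alpha}, \qquad \sum_{\ell=1}^{w} q_{\cdot\ell} \sum_{\alpha=1}^{\ell-1} \sum_{\beta=\alpha}^{\ell-1} \frac{C_{\alpha\beta}}{\sum_{j=\beta+1}^{w} q_{\cdot j}} = \sum\sum_{\beta \ge \alpha = 1}^{w-1} C_{\alpha\beta}.$$
Hence, equation (4.2.2.4) becomes
$$\sum_{\ell=1}^{w} A_{\cdot\ell} + \sum_{\alpha=1}^{w-1} B_{\alpha\alpha} + \sum\sum_{\beta \ge \alpha = 1}^{w-1} C_{\alpha\beta} + \sum\sum_{i \ge \ell = 1}^{w-1} D_{i\ell} = \lambda_2,$$
or
$$\lambda_2 = N.$$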
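The estimates can now be readily written. From equation (4.2.2.1), one obtains for $K = 1$
$$\frac{A_{1\cdot} + \sum_{j=1}^{w-1} C_{1j}}{q_{1\cdot}} = N,$$
which gives
$$\hat q_{1\cdot} = \frac{A_{1\cdot} + \sum_{j=1}^{w-1} C_{1j}}{N};$$
for $K = 2$ that same equation leads to
$$\hat q_{2\cdot} = \frac{A_{2\cdot} + \sum_{j=2}^{w-1} C_{2j}}{N - \dfrac{B_{11} + D_{11}}{1 - \hat q_{1\cdot}}}$$
and, in general, for $i = 1,2,\ldots,w$
$$\hat q_{i\cdot} = \frac{A_{i\cdot} + \sum_{j=i}^{w-1} C_{ij}}{N - \sum_{\alpha=1}^{i-1} \dfrac{B_{\alpha\alpha} + \sum_{\beta=1}^{\alpha} D_{\alpha\beta}}{1 - \sum_{K=1}^{\alpha} \hat q_{K\cdot}}}. \tag{4.2.2.5}$$
Similarly, from equation (4.2.2.2) for $\ell = 1$
$$\frac{A_{\cdot 1} + \sum_{i=1}^{w-1} D_{i1}}{q_{\cdot 1}} = N,$$
which gives
$$\hat q_{\cdot 1} = \frac{A_{\cdot 1} + \sum_{i=1}^{w-1} D_{i1}}{N};$$
for $\ell = 2$ equation (4.2.2.2) yields
$$\hat q_{\cdot 2} = \frac{A_{\cdot 2} + \sum_{i=2}^{w-1} D_{i2}}{N - \dfrac{B_{11} + C_{11}}{1 - \hat q_{\cdot 1}}};$$
in general for $j = 1,2,\ldots,w$
$$\hat q_{\cdot j} = \frac{A_{\cdot j} + \sum_{i=j}^{w-1} D_{ij}}{N - \sum_{\alpha=1}^{j-1} \dfrac{B_{\alpha\alpha} + \sum_{K=1}^{\alpha} C_{K\alpha}}{1 - \sum_{i=1}^{\alpha} \hat q_{\cdot i}}}. \tag{4.2.2.6}$$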
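The recursive structure of these estimates, in which each estimate uses the ones already computed for earlier intervals, can be sketched in code. The function below is only an illustration under one reading of the recursion: `failures[i]` is taken to be the numerator count for interval $i+1$ (for example $A_{i\cdot} + \sum_j C_{ij}$) and `censored[i]` the count of items censored on that component at the end of interval $i+1$ (for example $B_{ii} + \sum_\beta D_{i\beta}$). Both names and the small data set are hypothetical.

```python
def marginal_estimates(failures, censored, n):
    """Evaluate the marginal estimates one interval at a time.

    Each censored group is divided by the estimated probability of surviving
    the interval in which it was censored, and the sum of these terms is
    subtracted from n in the denominator.
    """
    q = []
    for i in range(len(failures)):
        correction = 0.0
        for a in range(i):
            surviving = 1.0 - sum(q[: a + 1])   # 1 - (q_1 + ... + q_{a+1})
            correction += censored[a] / surviving
        q.append(failures[i] / (n - correction))
    return q


# Hypothetical cohort: 3 failures in each of 3 intervals, 2 items censored
# at the end of the first interval, so n = 11.
estimates = marginal_estimates([3, 3, 3], [2, 0, 0], 11)
print(estimates, sum(estimates))
```

In this small case the estimates sum to one: the two censored items are effectively redistributed over the later intervals.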
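The likelihood ratio then becomes
$$\ell = \frac{\prod_{i=1}^{w} \hat q_{i\cdot}^{\,A_{i\cdot}} \prod_{j=1}^{w} \hat q_{\cdot j}^{\,A_{\cdot j}} \prod_{i=1}^{w-1} (\hat Q^0_{ii})^{B_{ii}} \prod\prod_{j \ge i}^{w-1} (\hat R^0_{ij})^{C_{ij}} \prod\prod_{i \ge j}^{w-1} (\hat S^0_{ij})^{D_{ij}}}{\prod_{i=1}^{w} \prod_{j=1}^{w} \hat q_{ij}^{\,A_{ij}} \prod_{i=1}^{w-1} \hat Q_{ii}^{\,B_{ii}} \prod\prod_{j \ge i}^{w-1} \hat R_{ij}^{\,C_{ij}} \prod\prod_{i \ge j}^{w-1} \hat S_{ij}^{\,D_{ij}}}.$$
The estimates in the equation above are given by equations (3.3.2.22) through (3.3.2.29) and also by equations (4.2.2.5) through (4.2.2.6).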
Example 1. For the purpose of illustrating the previous theory, consider the following hypothetical cohort of 50 people. Let $w = 5$ be the number of intervals into which the period of observation is divided.
    A_11 = 2,  A_12 = 1,  A_13 = 0,  A_14 = 1,  A_15 = 1
    A_21 = 1,  A_22 = 3,  A_23 = 1,  A_24 = 2,  A_25 = 0
    A_31 = 2,  A_32 = 1,  A_33 = 1,  A_34 = 2,  A_35 = 1
    A_41 = 1,  A_42 = 0,  A_43 = 0,  A_44 = 1,  A_45 = 0
    A_51 = 0,  A_52 = 1,  A_53 = 0,  A_54 = 0,  A_55 = 1
    B_11 = 1,  B_22 = 2,  B_33 = 0,  B_44 = 0,  B_55 = 1
    C_14 = 0,  C_24 = 3,  C_34 = 2,  C_44 = 2
    D_11 = 1,  D_21 = 2,  D_31 = 0,  D_33 = 2,  D_41 = 1,  D_43 = 1
Assuming the conditions of Section 3.3, maximum likelihood estimates yield, under the null hypothesis of independence:
$$\sum_{i=1}^{5} A_{i\cdot} \log \hat q_{i\cdot} = -35.18867, \qquad \sum_{j=1}^{5} A_{\cdot j} \log \hat q_{\cdot j} = -34.68638,$$
$$\sum\sum_{i \ge j \ge 1} D_{ij} \log \hat S^0_{ij} = -30.84211.$$
On the other hand, under the alternative
$$\sum\sum_{i,j} A_{ij} \log \hat q_{ij} = -72.52423, \qquad \sum\sum_{j \ge i \ge 1}^{4} C_{ij} \log \hat R_{ij} = -30.26369,$$
$$\sum_{i=1} B_{ii} \log \hat Q_{ii} = -5.47145, \qquad \sum\sum_{i \ge j \ge 1} D_{ij} \log \hat S_{ij} = -29.87624.$$
So that $-2 \log \ell(\hat q) = 8.29708$, which indicates nonsignificance for $\chi^2$ with 16 degrees of freedom.
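The comparison with the $\chi^2$ distribution can be reproduced without tables. The sketch below is not part of the original computation: it evaluates the upper-tail probability of a $\chi^2$ variable with integer degrees of freedom using the standard finite-sum expressions (an exponential sum for even degrees of freedom, a normal tail plus a finite sum for odd ones). The function name `chi2_sf` is an illustrative choice.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)   # builds x^(r - 1/2) / (1*3*...*(2r-1))
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total


p_value = chi2_sf(8.29708, 16)
print(p_value)
```

The resulting tail probability is far above any conventional significance level, in agreement with the conclusion of nonsignificance stated above.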
Example 2. As mentioned elsewhere in this work, a problem of special interest to demographers is the measurement of association between the duration of post-partum amenorrhea and breast feeding. The following data(*) on 2,197 women describe the number of amenorrheic women whose menses returned during an interval and the corresponding duration of lactation. For the purpose of illustration, it is assumed that the study terminated at the end of the fourth month, this being the only censoring point considered in the example.

                                      Duration of lactation (in months)
    Return of menses
    (in months)        0-1                      1-2                    2-3                    3-4
    0-1                A_11 = 97                A_12 = 1               A_13 = 1               A_14 = 1,  C_14 = 0
    1-2                A_21 = 992               A_22 = 23              A_23 = 9               A_24 = 5,  C_24 = 32
    2-3                A_31 = 561               A_32 = 39              A_33 = 27              A_34 = 9,  C_34 = 18
    3-4                A_41 = 128, D_41 = 34    A_42 = 13, D_42 = 3    A_43 = 18, D_43 = 2    A_44 = 17, B_44 = 131, C_44 = 17, D_44 = 19

(*) Salber, E. J., Feinleib, M., and MacMahon, B. [1966]. The duration of post-partum amenorrhea, American Journal of Epidemiology, vol. 82, no. 3, pp. 347-358.

Maximum likelihood estimates under the null hypothesis of independence and restricted to
are given by (using equation (4.2.2.5) iteratively)
$$\hat q_{1\cdot} = \frac{100}{2{,}197}, \qquad \hat q_{2\cdot} = \frac{1{,}061}{2{,}197}, \qquad \hat q_{3\cdot} = \frac{654}{2{,}197}, \qquad \hat q_{4\cdot} = \frac{193}{2{,}197}.$$
And similarly (using equation (4.2.2.6) iteratively)
$$\hat q_{\cdot 1} = \frac{1{,}812}{2{,}197}, \qquad \hat q_{\cdot 2} = \frac{79}{2{,}197}, \qquad \hat q_{\cdot 3} = \frac{57}{2{,}197}, \qquad \hat q_{\cdot 4} = \frac{51}{2{,}197}.$$
Since
$$\hat Q_{4\cdot} = \frac{B_{44} + \sum_{j=1}^{4} D_{4j}}{N} = \frac{189}{2{,}197}, \qquad \hat Q_{\cdot 4} = \frac{B_{44} + \sum_{i=1}^{4} C_{i4}}{N} = \frac{198}{2{,}197},$$
the remaining estimates can be obtained as
$(i = 1,2,\ldots,4)$,
$(j = 1,2,\ldots,4)$,
and
Straightforward computation gives
$$\log \max_{q \in \Theta_0} L(A\text{'s},B\text{'s},C\text{'s},D\text{'s};q) = -429.55484.$$
On the other hand, under the alternative
$$\log \max_{q \in \Theta} L(A\text{'s},B\text{'s},C\text{'s},D\text{'s};q) = -392.98827.$$
From these two, one can compute $-2 \log \ell(\hat q) = 73.13314$, which is highly significant for $\chi^2$ with 9 degrees of freedom and conforms with one's suspicion about the association between the amenorrhea period and the duration of lactation.
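The arithmetic of this example can be retraced directly from the table. The sketch below recomputes the numerators of the marginal estimates and the test statistic from the two reported log likelihoods. One assumption is made loudly: the count $C_{44}$ is taken to be 17, the value implied by $N = 2{,}197$ and by the reported $\hat Q_{\cdot 4} = 198/2{,}197$. Variable names are illustrative.

```python
N = 2197

# A[i][j]: both events observed, menses returned in interval i+1, lactation ended in j+1.
A = [
    [97, 1, 1, 1],
    [992, 23, 9, 5],
    [561, 39, 27, 9],
    [128, 13, 18, 17],
]
C4 = [0, 32, 18, 17]   # C_i4; the last entry (C_44 = 17) is an assumption, see above
D4 = [34, 3, 2, 19]    # D_4j
B44 = 131

# Numerators of the row and column marginal estimates (denominator N throughout,
# since no censoring occurs before the end of the fourth month).
row_num = [sum(A[i]) + C4[i] for i in range(4)]
col_num = [sum(A[i][j] for i in range(4)) + D4[j] for j in range(4)]

big_q_row = B44 + sum(D4)   # numerator of the estimate of Q_4.
big_q_col = B44 + sum(C4)   # numerator of the estimate of Q_.4

# Likelihood ratio statistic from the two reported maximized log likelihoods.
statistic = -2.0 * (-429.55484 - (-392.98827))

print(row_num, col_num, big_q_row, big_q_col)
print(round(statistic, 5))
```

The statistic lies well above 16.919, the tabulated upper 5 percent point of $\chi^2$ with 9 degrees of freedom, which is what "highly significant" refers to.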
CHAPTER V
CONCLUDING REMARKS
The reader by now is, hopefully, aware of the scope of bivariate life tables. At least for the sake of speculation, estimates for multivariate life tables can be obtained along the lines suggested in this work. The addition of more components, whether in a parametric context or not, will only make the equations more burdensome to handle. Under random censoring, all estimates will have to be evaluated through iterative procedures when an underlying multivariate distribution is assumed.
Questions concerning the bias of the various estimates remain untouched. In the problem of estimation from censored data, several factors are responsible for deviations between the true and estimated parameters. Among them:
i) the data available usually have not been obtained by a randomized procedure as recommended by sampling theory;
ii) in the case of human life tables, it is likely that only a rough prediction can be made, since the development of mortality depends upon factors that are very difficult to handle;
iii) in demographic studies the sample size is often very large (sometimes even complete enumeration), while in mortality investigations the available data may be of moderate size.
Thus, in the latter case, one has to pay more attention to this sort of error. It may happen that, for certain medical follow-up studies, the results in the previous chapters are not immediately applicable without modifying some of the formulas suggested.
Finally, it should be pointed out that, at present, asymptotic theory for the various estimators concerning life tables is virtually nonexistent even in a univariate context.
~
.'
.'.~
.~
i
j
I
REFERENCES
Basu, A. P. [1971].
66, 103-104.
Bivariate failure rate.
J. Amer. Statist. Assoc.
Berkson, J. and R. P. Gage [1952]. Survival curve for cancer patients
following treatment. J. Amer. Statist. Assoc. iL, 501-515.
Breslow, N.
[1974]. Covariance analysis of censored survival data.
Biometrics 30, 89-99.
Brindley, E. C., Jr. and W. A. Thompson, Jr. [1972]. Dependence and
aging aspects of multivariate survival. J. Amer. Statist. Assoc.
!I, 822-830.
Brown, W., Jr., M. Hollander, and R. Horwar [1974]. Nonparametric
tests of independence for censored data, with applications to heart
transplant studies. Reliability and Biometry: Statistical Analysis
of Lifelength (edited by F. Proschan and R. Serfling), Soc. Ind.
Appl. Math., Philadelphia.
Chiang, C. L. [1968]. Introduction to Stochastic Processes in Biostatistics, Chapter 12, J. Wiley, New York.
Cox, D. R. [1972]. Regression models and life tables. Journal of
the Royal Statistical Society, series B 34, 187-202 (with discussion).
Cutler, S. J. and F. Ederer [1958]. Maximum utilization of the life
table method in analyzing survival. J. Chronicle Disease ~,
699-712.
Efron, B. [1967]. The two-sample problem with censored data.
Fifth Berkeley Symposium~, 831-853.
Proc.
Elveback, L. [1958]. Estimation of survivorship in chronic disease:
the actuarial method. J. Amer. Statist. Assoc. 53, 420-440.
Feigl, P. and M. Zelen [1965]. Estimation of exponential survival
probabilities with concomitant information. Biometrics 11,
826-838.
Frechet, M. [1951]. Sur les tableaux de correlation dont les marges
sont donnes. Anna1es de l'Universite de Lyon, 3 Ser. no. 14A, 53
(quoted from Gumbel [1960]).
113
Freund, R. J.
tribution.
[1961]. A bivariate extension of the exponential disJ. Amer. Statist. Assoc. 56, 971-977.
Grenander, U. [1956]. On the theory of mortality measurement, Part
II, Skandinavisk Aktuarietidskrift 39, 126-153.
Gumbel, E. J. [1960]. Bivariate exponential distributions.
Statist. Assoc. ~, 698-707.
J. Amer.
Kalbfleisch, J. D. and R. L. Prentice [1973]. Marginal likelihoods
based on Cox's regression and life table model. Biometrika 60,
267-278.
Kaplan, E. L. and P. Meier [1958]. Nonparametrics estimation from
incomplete observations. J. Amer. Statist. Assoc. 53, 457-481.
Kendall, M. G. [1970].
Griffin, London.
Rank Correlation Methods, 4
th
edition, C.
Marshall, A. W. and I. 01kin [1967]. A multivariate exponential distribution. 'J. Amer. Statist. Assoc. ~, 30-44.
Turnbull, B. W. [1974]. Nonparametric estimation of a survival function with doubly censored data. J. Amer. Statist. Assoc. ~,
169-173.
Zippin, C. and P. Armitage [1966]. Use of concomitant variables and
incomplete information in the estimation of an exponential survival
parameter. Biometrics~, 665-672.