Koury, Kenneth J. (1981). "Parametric Competing Risks Models in Clinical Trials."

PARAMETRIC COMPETING RISKS MODELS IN CLINICAL TRIALS
by
Kenneth J. Koury
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1365
November 1981
PARAMETRIC COMPETING RISKS MODELS IN CLINICAL TRIALS
by
Kenneth J. Koury
A Dissertation submitted to the faculty of The
University of North Carolina at Chapel Hill in
partial fulfillment of the requirements for the
degree of Doctor of Philosophy in the Department
of Biostatistics
Chapel Hill
1981
Approved by:
Reader
KENNETH J. KOURY. Parametric Competing Risks Models in Clinical Trials
(Under the direction of PRANAB KUMAR SEN.)
Some fully general parametric competing risks models, which allow
the underlying (possibly dependent) lifetimes to be modeled as a function of an arbitrary number of covariables, are formulated, and their
resulting likelihood function is derived.
Under carefully stated reg-
ularity conditions, large sample tests of hypotheses concerning the
model parameters are obtained for the single point truncation (Type I
censoring) and single point (Type II) censoring schemes, as well as for
the case when all the observations are complete.
Due to ethical considerations and possible loss of efficiency
associated with single point truncation or censoring schemes, time-sequential procedures, which allow termination of the experiment at the
earliest possible stage based on the accumulated statistical evidence,
are often advocated for clinical trials.
For parametric competing risks
models a progressively truncated scheme (PTS) is used to develop a
(large sample) time-sequential test procedure, and this procedure is
modified to allow the presence of "nuisance" parameters and a form of
staggered entry.
Although it has limited applicability, a (restricted) sequential
procedure is also obtained in the context of repeated significance
testing, and a parallel procedure based on truncated observations is
given.
Under suitable regularity conditions, we show that the distributions of the proposed test statistics for both the time-sequential and
sequential procedures converge weakly to those of a functional of certain Gaussian processes under the appropriate null hypotheses.
Unfortunately, these null distributions are not available in the
statistical literature, and theoretical derivation appears mathematically intractable, at least in general.
However, the computer generation
of general Gaussian processes is discussed and subsequently used to provide an algorithm which obtains the distributions of the appropriate
functionals empirically.
Finally, the regularity conditions for the procedures developed in
this investigation are verified for a general independent risks exponential model and a dependent risks model based on Gumbel's bivariate exponential distribution, and numerical illustrations are provided for these
procedures using simulated data from exponential populations.
ACKNOWLEDGEMENTS
Working with my adviser, Professor P. K. Sen, has been an honor
and privilege, and it is indeed a pleasure to express my sincere appreciation for his invaluable advice, constant guidance, and encouragement,
as well as his patience, warm personality, and sense of humor.
I also
wish to thank the other committee members, Professors C. E. Davis, J. E.
Grizzle, G. Heiss, and N. J. Johnson, for their helpful comments and
suggestions.
In addition, I am grateful to Professors C. E. Davis, J. E. Grizzle,
and O. D. Williams for providing valuable practical experience and financial support through the Lipids Research Clinics Program which is funded
by the National Heart, Lung, and Blood Institute.
I wish to thank my wife, Mary Lou, for her continued support and
encouragement during my entire graduate education.
Appreciation is also
expressed to my mother, Valeria, and my late father, Kenneth, for their
encouragement and understanding.
Additionally, I would like to thank
D. L. Hawkins, a fellow student in the Department of Biostatistics, for
the many interesting statistical, as well as nonstatistical, discussions
we have had.
Finally, I must express a warm thanks to Vicky Crowder who typed
the manuscript with great skill, patience, and speed.
CONTENTS

ACKNOWLEDGEMENTS

I.   INTRODUCTION AND LITERATURE REVIEW
     1.1  Introduction
     1.2  Basic Formulation and Notation
     1.3  The Likelihood Function
     1.4  Problems of Identifiability
     1.5  Further Discussion of the Interpretation of the T_i and the Basic Assumption
     1.6  Incorporation of Concomitant Information
     1.7  Outline of Research Proposal

II.  ASYMPTOTIC MAXIMUM LIKELIHOOD PROCEDURES FOR GENERAL PARAMETRIC COMPETING RISKS MODELS
     2.1  Introduction
     2.2  Some General Parametric Competing Risks Models
     2.3  The Likelihood Function
     2.4  Asymptotic Normality of the MLE
     2.5  Large Sample Tests of Hypotheses
     2.6  An Example Involving Exponential Lifetimes
     2.7  Withdrawals and Type I Censoring
          2.7.1  Withdrawals
          2.7.2  Type I Censoring

III. TIME-SEQUENTIAL PROCEDURES FOR PARAMETRIC COMPETING RISKS MODELS
     3.1  Introduction
     3.2  Preliminary Notions and the Main Theorem
     3.3  Proof of Theorem 3.1
     3.4  Some Remarks on Theorem 3.1 and Its Assumptions
     3.5  Applications to Clinical Trials and Life Testing Problems
     3.6  Nuisance Parameters: The Case When the Null Hypothesis Does Not Completely Specify β
     3.7  Staggered Entry
     3.8  Applications to Type II Censoring

IV.  REPEATED SIGNIFICANCE TESTING: A SEQUENTIAL PROCEDURE FOR PARAMETRIC COMPETING RISKS MODELS
     4.1  Introduction
     4.2  Preliminary Notions and the Basic Invariance Principle
     4.3  Proof of Theorem 4.1
     4.4  Applications to Life Testing Experiments
          4.4.1  An Asymptotic Sequential Test of H_0: β = β_0 (Specified)
          4.4.2  A Test Procedure Based on Truncated Lifetimes
     4.5  Concluding Remarks

V.   SIMULATION OF THE DISTRIBUTION OF A FUNCTIONAL OF A GENERAL GAUSSIAN PROCESS, SOME EXPONENTIAL COMPETING RISKS MODELS, AND NUMERICAL ILLUSTRATIONS
     5.1  Introduction
     5.2  Computer Generation of a General Gaussian Process and Its Use in Simulating the Distribution of a Functional of the Process
     5.3  Some Comments on the Assumptions of the Procedures of Chapters II, III, and IV
     5.4  A General Independent Risks Exponential Model
          5.4.1  Numerical Illustration of the Progressively Truncated Test of Section 3.5: Example 5.4.1
          5.4.2  Numerical Illustration of the Progressively Truncated Test in the Presence of Nuisance Parameters: Example 5.4.2
     5.5  A Competing Risks Model Based on Gumbel's Bivariate Exponential Distribution

VI.  SUGGESTIONS FOR FURTHER RESEARCH

APPENDIX

BIBLIOGRAPHY
CHAPTER I
INTRODUCTION AND LITERATURE REVIEW
1.1. Introduction
Given the importance of clinical trials in medical research, and
the variability typically observed in the treatment outcomes, it is
easy to justify the considerable attention that statisticians have devoted to the design and analysis of these experiments.
As emphasized
by most statisticians and many research clinicians [e.g., Chalmers
(1975), Ederer (1975), and Ballintine (1975)], a blind, randomized trial, when properly conducted and analyzed, helps to insure an unbiased treatment comparison. The statistician's chief contribution to clinical trial research, however, consists of developing methodology which provides an efficient means of detecting treatment differences with a quantifiable degree of precision. Indeed, classical as well as more recently developed statistical techniques have greatly contributed to the successful use of clinical trials in conducting medical research.
Nevertheless, entirely satisfactory statistical methodology does
not yet exist for some important research settings in which 'clinical
trials play a vital role.
Specifically, consider situations where the
basic outcome or response variable of interest is the time to occurrence of a well defined event.
Typically, this event is "rare" in the
sense that during the time allotted for completion of the study, it is
expected to occur in some, but not all, of the subjects.
Examples of
such events include the occurrence of a myocardial infarction,
debilitation due to a stroke, and death from a specific form of cancer.
The analysis of data arising from these experimental situations is
called "survival analysis," and a considerable statistical methodology
has already been developed and applied to clinical trials.
[See, for
example, Kalbfleisch and Prentice (1980), Elandt-Johnson and Johnson
(1980), or Gross and Clark (1975).]
Suppose, however, that more than one type of event needs to be
considered.
For example, in a study designed to assess the effects of
a possible carcinogen, groups of animals exposed to increasing doses of
the substance might be compared with respect to their time to death
from one of several related forms of cancer.
Similarly, the same re-
sponse variable could be used to compare treatment and control groups
of cancer patients in order to assess the efficacy of a treatment for
cancer.
Situations like these, where several causes of death or
"failure" are "competing" for an individual's life or good health are
termed "competing risks" settings.
In order to simplify the presenta-
tion and discussion, death will subsequently be used as the event of
interest, and the experimental units of the study will be referred to
as patients or subjects.
It should be appreciated, however, that any
suitably defined event can be used, and the units need not be human,
or even living.
For example, a competing risks model could be used to
study the time to failure of a machine which can fail due to the
malfunction of any one of several components.
Although substantial effort has been devoted to developing the
theory of competing risks, this development has not been particularly
well adapted to its efficient use in the clinical trial settings
described above.
Features that might be considered desirable in a
competing risks methodology designed specifically for clinical trials
include:
(i)
The ability to handle censored observations.
As noted
above, time limitations usually imply that some of the
subjects will still be alive at the end of the study.
Since their time to death has not been observed, these
subjects are said to have "censored" or "truncated"
lifetimes, that is, censored by the study's (fixed)
stopping point (Type I censoring or truncation).
Al-
ternatively, in order to obtain a sufficiently powerful
statistical test for detecting treatment differences,
it is sometimes determined a priori that a certain number of deaths must be observed. If there are n subjects in all, and the predetermined number is m (< n), the trial will be stopped after the first m deaths, and the lifetimes of the remaining (n - m) subjects are therefore censored. This type of order statistic censoring is referred to as Type II censoring.
(ii)
The ability to incorporate useful concomitant information.
In clinical trials it is typically necessary
for reasons of validity and/or precision to account for
important prognostic, anthropometric, and demographic
variables.
Also, estimates and inferences concerning
the effects of these variables, as well as the treatments, on the patients' lifetimes must be made in the
censoring contexts described above.
(iii)
The ability to allow time-sequential procedures to be
utilized.
In longer running studies, ethical and eco-
nomic considerations necessitate provisions for stopping
the trial when one of the treatments shows clear superiority.
Procedures which achieve this efficiency by per-
mitting repeated treatment comparisons (at arbitrary
time points or after each death) without invalidating
the usual probability statements are termed "time-sequential."
1.2. Basic Formulation and Notation
As implied in the preceding introduction, the focus of this paper
is the application of competing risks analysis to clinical trials.
Furthermore, our attention will be restricted to parametric models,
that is, cases where prior theoretical or empirical evidence suggests
that the time to death will reasonably follow known distributions.
The
basic mathematical framework of competing risks analysis, as given by
Gail (1975), can be formulated as follows:
A population is assumed to be subject to k causes of death, c_1, c_2, ..., c_k, with T_i denoting the hypothetical time at which an individual dies of cause c_i, i = 1,...,k. These underlying lifetimes are hypothetical since each individual dies of only one cause, and hence only T = min(T_1, T_2, ..., T_k), the actual lifetime, and c_i, the actual cause of death, can be observed. Although it cannot be estimated empirically since (T_1, T_2, ..., T_k) is not observable, many authors formulate competing risks theory in terms of the following joint survival function,

    S(t_1,\ldots,t_k) = P(T_1 > t_1,\ldots,T_k > t_k),                                   (1.1)

where 0 < t_i < \infty, S(0,\ldots,0) = 1, S(\infty,\ldots,\infty) = 0, and each T_i, i = 1,...,k, is assumed to be absolutely continuous. In particular, the survival function associated with the actual lifetime is

    S_T(t) = P(T > t) = P(T_i > t,\; i = 1,\ldots,k) = S(t,t,\ldots,t).                  (1.2)
One of the classical uses of competing risks analysis involves the estimation of the survival curve of a population subject to the specific risk(s) remaining after one or more risks have been eliminated. In fact, David and Moeschberger (1978) trace this aspect of competing risks back to the 1760's when Daniel Bernoulli attempted to predict the effect of eradicating smallpox on the mortality rates of a given population. Problems of this type can be readily handled using the above formulation if one makes an assumption described by Gail (1975) as the "basic assumption of competing risk analysis," namely, that eliminating cause c_i has no effect on S( ) itself but only nullifies the corresponding argument of S( ). The survival function of a population exposed to c_1 alone, for example, is denoted by S_{1.23...k}(t) ≡ S_{1.}(t), and the above assumption implies that S_{1.23...k}(t) = S(t,0,0,...,0). Similarly, the survival function of a population exposed only to c_2,...,c_k is assumed to be S_{23...k.1}(t) = S(0,t,t,...,t). As usual, S_i(t_i) = P(T_i > t_i) = S(0,0,...,0,t_i,0,...,0) denotes the corresponding marginal survival function of T_i, and again, the basic assumption implies that S_i(t) = S_{i.12...(i-1)(i+1)...k}(t) ≡ S_{i.}(t), the survival function of a population exposed only to risk c_i.
Citing Makeham (1874) and Cornfield (1957), Gail recommends that this assumption be used cautiously since any steps taken to eliminate c_i may alter other risks as well. Prentice, Kalbfleisch, Peterson, Flournoy, Farewell, and Breslow (1978),¹ not content to be merely cautious, strongly criticize the basic assumption and question the appropriateness of formulating competing risks theory in terms of the hypothetical or latent lifetimes described above. Their point of view and a more detailed discussion of this assumption will be provided later.
Using established notation, f(t_1,...,t_k) and F(t_1,...,t_k) denote the joint pdf and joint cdf of (T_1,...,T_k), while f_i(t), F_i(t) and f_T(t), F_T(t) denote the corresponding distributions of T_i and T. Although it is not uncommon to assume that the risks act independently, i.e., S_T(t) = \prod_{i=1}^{k} P(T_i > t) = \prod_{i=1}^{k} S_i(t), the above formulation can be applied regardless of any dependencies that exist among the lifetimes. In order to allow this generality, three hazard rates are typically defined. Though equivalent to the definitions given, for example, by David and Moeschberger (1978) and Prentice, et al. (1978), the form used by Gail (1975) not only precisely defines these quantities, but offers an immediate interpretation as well:

    \lambda_T(t) dt = P[an individual dies of some cause in (t, t+dt) and all k risks are acting in (t, t+dt) | the individual has survived all c_i to time t],      (1.3)

    \lambda_i(t) dt = P[an individual dies of c_i in (t, t+dt) and c_i is the only risk acting in (t, t+dt) | the individual has survived c_i to time t],      (1.4)

    g_i(t) dt = P[an individual dies of c_i in (t, t+dt) and all k risks are acting in (t, t+dt) | the individual has survived all c_i to time t].      (1.5)

¹It should be noted that this paper, which appears in the December, 1978 issue of Biometrics, is essentially reproduced in Chapter 7 of the recent (1980) text by Kalbfleisch and Prentice, The Statistical Analysis of Failure Time Data.
Using these definitions, Gail (1975) and David and Moeschberger (1978), among others, have shown that

    \lambda_T(t) = -\frac{d}{dt} \log S_T(t),      (1.6)

    \lambda_i(t) = \Big[-\frac{\partial}{\partial x_i} S(x_1,\ldots,x_k)\Big|_{x_i = t,\; x_j = 0\,(j \neq i)}\Big]\Big/ S(0,\ldots,0,t,0,\ldots,0),      (1.7)

and

    g_i(t) = \Big[-\frac{\partial}{\partial x_i} S(x_1,\ldots,x_k)\Big|_{x_j = t\;\forall j}\Big]\Big/ S(t,t,\ldots,t).      (1.8)

Note that (1.7) implicitly depends on the basic assumption. Furthermore, from these basic relationships, it is shown that

    \lambda_T(t) = \sum_{i=1}^{k} g_i(t)      (1.9)

and

    S_T(t) = \prod_{i=1}^{k} G_i(t),      (1.10)

where G_i(t) = \exp[-\int_0^t g_i(u)\,du] is a type of cause-specific survival function. Also, assuming that the T_i are independent implies that

    (i)  S(t_1,\ldots,t_k) = \prod_{i=1}^{k} S_i(t_i)  and, in particular,  S_T(t) = \prod_{i=1}^{k} S_i(t),      (1.11)

    (ii) g_i(t) = \lambda_i(t)  and  G_i(t) = S_i(t),   i = 1,\ldots,k.      (1.12)

Gail states, however, that these conditions are weaker than the independence assumption since neither implies but both are implied by independence. He also remarks that one should not be surprised that, in general, g_i(t) \neq \lambda_i(t) and \lambda_T(t) \neq \sum_{i=1}^{k} \lambda_i(t), since an individual who has survived exposure to all k risks until time t may very well have a different risk of dying from c_i than an individual who has survived exposure to only c_i until time t.
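As a quick numerical check of relations (1.9)-(1.12), the following sketch (not part of the original text) evaluates them for two independent exponential risks; the rates and time grid are purely illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: two independent exponential risks with hazard rates
# lam[0], lam[1].  Under (1.11)-(1.12) the cause-specific hazards g_i(t)
# equal the marginal hazards, the overall hazard is their sum (1.9), and
# S_T(t) factors as in (1.10).
lam = np.array([0.5, 1.5])          # assumed constant hazards, for illustration
t = np.linspace(0.1, 3.0, 5)

g = np.tile(lam, (t.size, 1))       # g_i(t) = lam_i (constant in t)
lambda_T = g.sum(axis=1)            # (1.9): overall hazard = sum of g_i
G = np.exp(-np.outer(t, lam))       # G_i(t) = exp(-int_0^t g_i(u) du)
S_T = G.prod(axis=1)                # (1.10): S_T(t) = prod_i G_i(t)

# Check against the direct form S_T(t) = exp(-(lam_1 + lam_2) t).
assert np.allclose(S_T, np.exp(-lam.sum() * t))
print(lambda_T[0], S_T[0])
```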
1.3. The Likelihood Function
In the general parametric setting under consideration, a functional form for S( ), and hence for the g_i(t), i = 1,...,k, is assumed. The g_i(t) are, in turn, functions of unknown parameters, θ_i, which may include components relating to appropriately chosen covariables. In clinical trials, covariables denoting treatment assignment as well as potentially important anthropometric, demographic, and etiologic variables are typical, and the primary objective is to estimate and draw inferences with respect to the θ_i. Maximum likelihood procedures are usually employed, and the likelihood function is therefore of central importance.
As noted earlier, it is often necessary in clinical trials to allow for Type I (fixed time) or Type II (order statistic) censoring. In the case of Type I censoring [see, e.g., David and Moeschberger (1978)], the ℓ-th patient is under observation no later than censoring time τ_ℓ, ℓ = 1,...,n. Thus, if his lifetime T_ℓ = min(T_{1ℓ}, T_{2ℓ}, ..., T_{kℓ}) > τ_ℓ, T_ℓ will not be observed, and the ℓ-th patient is considered a "survivor". When using Type II censoring, only the first m deaths are observed, where m is a previously chosen integer (m < n); it is further assumed that all individuals are under observation for the same length of time [David and Moeschberger (1978)].

Letting d_i, i = 1,...,k, denote the number of individuals who die from cause c_i and s denote the number of survivors, we have n = \sum_{i=1}^{k} d_i + s. And letting t_{i(j)} denote the time to death of the patient with the j-th longest lifetime among those who die of cause c_i (i = 1,...,k; j = 1,...,d_i), the likelihood function, allowing for the possibility of Type I or Type II censoring, can be written as [e.g., Gail (1975), David and Moeschberger (1978), Prentice, et al. (1978)]:

    L \propto \Big[\prod_{i=1}^{k} \prod_{j=1}^{d_i} g_i(t_{i(j)})\, S_T(t_{i(j)})\Big] \prod_{u=1}^{s} S_T(\tau_{(u)}),   t_{i(j)} \le \tau_{i(j)},   i = 1,\ldots,k;\; j = 1,\ldots,d_i,      (1.13)

where τ_(u), u = 1,...,s, denotes the censoring times of the s survivors, and τ_{i(j)} denotes the (fixed) censoring time of the individual whose lifetime is t_{i(j)}.
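The next sketch, added here for illustration only, evaluates the logarithm of a likelihood of the form (1.13) under the simplifying assumption of constant cause-specific hazards g_i(t) = θ_i; the function name and all numbers are hypothetical.

```python
import numpy as np

# Minimal sketch: log of a likelihood of the form (1.13) with constant
# cause-specific hazards g_i(t) = theta_i, so S_T(t) = exp(-sum(theta) * t).
# `death_times[i]` holds the lifetimes t_i(j) for cause c_i, and
# `censor_times` holds the censoring times tau_(u) of the s survivors.
def competing_risks_loglik(theta, death_times, censor_times):
    theta = np.asarray(theta, dtype=float)
    total = theta.sum()
    loglik = 0.0
    for i, t_i in enumerate(death_times):
        t_i = np.asarray(t_i, dtype=float)
        # each death from cause c_i contributes g_i(t) * S_T(t)
        loglik += t_i.size * np.log(theta[i]) - total * t_i.sum()
    # each survivor contributes S_T(tau)
    loglik += -total * np.asarray(censor_times, dtype=float).sum()
    return loglik

# Example: two risks, a few deaths from each cause, two censored survivors.
print(competing_risks_loglik([0.5, 1.0], [[0.3, 1.2], [0.7]], [2.0, 2.5]))
```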
It is interesting to note that of the three references quoted, only David and Moeschberger provide a formal justification of the likelihood equation, and this is done only for the case when all lifetimes are observed, that is, when there is no censoring. Although extension to the case of Type I censoring is immediate, perhaps Type II censoring deserves a special comment since it involves the distribution of order statistics. Only the proportionality factor of the likelihood equation changes, however, and therefore estimates of the parameters (but not their distributions) are unaffected.

Gail (1975), on the other hand, does not consider Type II censoring, and only allows a common fixed censoring time, T, in the context of Type I censoring. This seems adequate when all patients enter the study simultaneously and are therefore under observation for the same length of time, T, unless death occurs prior to this point. Very often, however, entry into the trial is staggered, and the more general form of Type I censoring which allows individual censoring times is required. The response variable is usually defined as the time from initiation of treatment until death, and regardless of the chronological time at which a patient is placed under treatment, this time point corresponds to "study" time = 0. Since the trial is typically designed to be stopped at a predetermined chronological time point, this censoring time, when translated to the study time scale, is different for patients with different chronological entry times (see Figure 1). With respect to the form of the likelihood function given in (1.13), it can be seen that it is neither necessary to order the lifetimes in the case of Type I censoring nor is it overly complicating to do so.
Figure 1: Comparison of Chronological and Study Time Scales

[Figure: for patients #1, #2, and #3, each horizontal time line runs from the time treatment was initiated to death or censoring, shown first on the chronological time scale (between the starting date t_0 and the ending date t_1) and then on the study time scale (beginning at 0 for every patient). In the figure, • denotes the time at which treatment was initiated, x denotes the time of death, and a bar denotes a censoring time. T = t_1 - t_0 denotes the length of the study, and τ_1, τ_2, τ_3 denote the individual censoring times for patients #1, #2, and #3.]
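The conversion pictured in Figure 1 can be sketched as follows; the dates and patient labels are made up for illustration only.

```python
# Illustrative sketch of the study-time censoring times in Figure 1 (numbers
# are made up).  Entry dates and the fixed ending date are on the
# chronological scale; each patient's Type I censoring time on the study
# scale is simply (ending date - entry date).
t0, t1 = 0.0, 36.0                      # study starting and ending dates (months)
entry = {"patient 1": 0.0, "patient 2": 6.0, "patient 3": 14.0}

tau = {p: t1 - e for p, e in entry.items()}   # individual censoring times
T = t1 - t0                                   # length of the study
print(T, tau)   # a death is observed only if it occurs before tau[p]
```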
In the context of Type II censoring, each of the survivors has a common censoring time, namely t_(m) = \max_{1 \le i \le k} \{t_{i(m_i)}\}, since the study ends with the m-th death. Because this censoring scheme is based on order statistics, each patient must be under observation for the same length of time (unless death occurs before this amount of time has elapsed). This condition is satisfied, for example, when all patients enter the trial simultaneously. Alternately, if entry is staggered and the m-th death subsequently occurs at time t_(m), then, in order to insure that the appropriate order statistics have actually been recorded, observation must continue on those individuals who have not completed t_(m) time units until they do. This procedure may ultimately result in the observation of m' (m < m' < n) deaths, and, understandably, is more likely to be considered satisfactory if the entry times are relatively close.
Also, it may sometimes be desirable to combine Type I and Type II censoring, that is, the trial is stopped at the time of the m-th death or at (fixed) time T, whichever occurs first. In this case, the m* (m* ≤ m) smallest order statistics are observed, and the common censoring time for the s* = n - m* survivors is T* = min(t_(m), T).

From (1.10), S_T(t) = \prod_{\ell=1}^{k} G_\ell(t; \theta_\ell), and therefore (1.13) can be rewritten as

    L \propto \Big[\prod_{i=1}^{k} \prod_{j=1}^{d_i} g_i(t_{i(j)}; \theta_i) \prod_{\ell=1}^{k} G_\ell(t_{i(j)}; \theta_\ell)\Big] \times \prod_{u=1}^{s} \prod_{\ell=1}^{k} G_\ell(\tau_{(u)}; \theta_\ell)

and, upon rearranging the products and setting \ell = i in the second and third terms,

    L(\theta_1,\ldots,\theta_k) \propto \prod_{i=1}^{k} L_i(\theta_i),

where

    L_i(\theta_i) = \Big[\prod_{j=1}^{d_i} g_i(t_{i(j)}; \theta_i)\Big] \Big[\prod_{\ell=1}^{k} \prod_{j=1}^{d_\ell} G_i(t_{\ell(j)}; \theta_i)\Big] \prod_{u=1}^{s} G_i(\tau_{(u)}; \theta_i),      (1.14)
and consequently, if each of the g_i(t; θ_i) has a different set of parameters, i.e., θ_i ∩ θ_j = φ, ∀ i ≠ j, then a separate maximization can be performed for each of the L_i to obtain the maximum likelihood estimates of the θ_i. This factorization of the likelihood function has been noted by, among others, David and Moeschberger (1978) and Prentice, et al. (1978), and often simplifies the numerical problems associated with maximum likelihood estimation. Also, L_i(θ_i) is exactly the likelihood function that would be obtained if deaths from all causes other than c_i were treated as censored at the time of death, i.e., the t_{ℓ(j)}, ℓ = 1,...,i-1,i+1,...,k; j = 1,...,d_ℓ, as well as the τ_(u), u = 1,...,s, are treated as fixed censoring times.

Prentice, et al. (1978) note that this result provides a formal justification for the practice, commonly used in survival analysis, of treating deaths from all causes other than the main cause being studied, as well as live withdrawals, as censored, at least for the estimation of θ_i. David and Moeschberger (1978) emphasize, however, that the distribution of these estimators may depend on the distributions of the g_ℓ(t; θ_ℓ), ℓ ≠ i, and that this argument cannot be applied if θ_i ∩ θ_j ≠ φ for some j.
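As an illustration of this factorization (under an assumed constant-hazard model, not the general case treated here), the separate maximization of each L_i reduces to the familiar events-over-exposure estimate; the helper below is hypothetical.

```python
import numpy as np

# Sketch of the factorization (1.14) for constant hazards g_i(t) = theta_i
# (illustrative, not the author's code).  Estimating theta_i while treating
# deaths from all other causes, and the survivors, as censored gives the
# "events / total exposure" estimate, i.e., each L_i is maximized separately.
def mle_constant_hazards(death_times, censor_times):
    all_times = np.concatenate([np.asarray(t, float) for t in death_times]
                               + [np.asarray(censor_times, float)])
    exposure = all_times.sum()                      # total time at risk
    return [len(t) / exposure for t in death_times] # theta_i-hat, i = 1,...,k

# Two causes: cause 1 has 2 observed deaths, cause 2 has 1, plus 2 survivors.
print(mle_constant_hazards([[0.3, 1.2], [0.7]], [2.0, 2.5]))
```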
1.4. Problems of Identifiability
From (1.10), G_i(t) = \exp[-\int_0^t g_i(u)\,du], i = 1,...,k, and therefore the likelihood function (1.13 or 1.14) can be expressed completely in terms of the g_i(t), i = 1,...,k. Prentice, et al. (1978) further remark that in light of this and the likelihood factorization, standard techniques of survival analysis can be used to identify the g_i(t), i = 1,...,k. This means that these hazard functions can be directly estimated from the type of data observable from competing risks settings. For convenience, some authors denote such data by (T, I) where, as usual, T = min(T_1,...,T_k) and I is a random index identifying c_i, the actual cause of death; (T, I) is sometimes called an "identified minimum" [Elandt-Johnson (1979), Basu and Ghosh (1978)]. Prentice, et al. prefer to use (T, I, z), where z emphasizes that it may be necessary to account for concomitant information which, of course, is also observable. Additionally, Elandt-Johnson (1979) has referred to the survival function, S_T(t), and the cause-specific hazard rates, g_i(t), i = 1,...,k, as the basic estimable quantities of competing risks analysis.
Letting X_i denote the lifetime of an individual given that the cause of death is c_i, i.e., S_{X_i}(x) = P(T > x | I = i), David and Moeschberger (1978) and Birnbaum (1979) point out that both the distributions of the X_i and P(I = i), i = 1,...,k, can be estimated from data of the form (T, I) without additional assumptions. Furthermore, it is clear that given the joint distribution of the T_i, i.e., S(t_1,...,t_k), one can completely determine the distributions of the X_i and P(I = i), i = 1,...,k, or equivalently, the distribution of (T, I).
With this background, a question of fundamental importance can now be posed: given the distribution of the identified minimum, (T, I), or the distributions of the X_i and P(I = i), i = 1,...,k, quantities which can be directly estimated from competing risks data, can one determine the joint distribution of T_1,...,T_k? This is the so-called "problem of identifiability" [Birnbaum (1979), Elandt-Johnson (1979), David and Moeschberger (1978)], and the answer is, in general, no. In fact, Elandt-Johnson (1979) has shown that there is an infinity of different survival models [i.e., different random vectors (T_1,...,T_k)] leading to the same observed data, (T, I); such models are said to be equivalent. An example, using Freund's (1961) bivariate extension of the exponential distribution, is given by David and Moeschberger (1978).

On the other hand, under the assumption that the T_i, i = 1,...,k, are independent, Berman (1963) showed that the distributions of the X_i and the probabilities P(I = i), i = 1,...,k, determine the joint distribution of T_1,...,T_k and the corresponding marginals. Also, Tsiatis (1975) demonstrated that for each survival model with dependent hypothetical lifetimes there is an equivalent one with independent hypothetical lifetimes; Elandt-Johnson (1979) has further emphasized that this latter model is unique. In short, the dependent risks model with joint survival function S(t_1,...,t_k) cannot be distinguished, on the basis of the distribution of (T, I) alone, from an infinity of other models with dependent risks, or from the unique independent risks model having a joint survival function of the form \prod_{i=1}^{k} S_i^*(t_i).
Although these results may seem discouraging, it should be noted that in the context of parametric competing risks analysis, it is assumed that the functional form of the joint survival distribution, S(t_1,...,t_k), is known. In this case, David and Moeschberger (1978) point out that typically there should be no problem in completely identifying the model since the unknown parameters can be estimated from the likelihood function. Nadas (1971) has shown, for example, that given the distribution of (T, I), the parameters of the joint survival function can be uniquely identified when (T_1, T_2) are bivariate normal; further discussion is given by Elandt-Johnson (1979), Anderson and Ghurye (1977), and Basu and Ghosh (1978).

The practical aspects of the problem of identifiability in the context of parametric competing risks analysis can perhaps be summarized as follows:

(i) when the T_1,...,T_k are independent there is no problem.

(ii) when a dependent risks model is proposed, it has been chosen on the basis of prior theoretical or empirical evidence, and the many "equivalent" models are therefore irrelevant. Usually the unknown parameters of the chosen model will be estimable from the likelihood function, leading to the identifiability of the model based on its assumed form and the recorded data (T, I).

1.5. Further Discussion of the Interpretation of the T_i and the Basic Assumption
To this point competing risks problems have been formulated in terms of the hypothetical lifetimes, T_1,...,T_k, and their joint survival distribution, S(t_1,...,t_k). Although most authors use it,
Prentice, et al. (1978) criticize this approach due to the nonidentifiability, in general, of the joint survival distribution (and its marginals), and the questionable appropriateness of the "basic assumption" described earlier. Instead, they prefer to define competing risks models by specifying a distribution for the observable quantities (T, I, z) via the cause-specific hazard rates, g_i(t; z), i = 1,...,k, which represent the basic identifiable elements of competing risk data.

Furthermore, these authors note that David and Moeschberger (1978), among others, define T_i' to be the lifetime of an individual when c_i is the only risk present, i.e., when all other risks have been "eliminated", and assume that the actual lifetime is given by T = min(T_1',...,T_k') when all k risks are acting. Although they acknowledge that this formulation gives a physical interpretation to the underlying lifetimes, Prentice, et al. point out that it implicitly assumes that the time to death from c_i is the same whether all k risks are present or c_i is acting alone. Arguing that this assumption is reasonable in very few cases, they reject this formulation.
It
should be noted, however, that David and Moeschberger stress this interpretation of the hypothetical lifetimes only when it is assumed that T_1',...,T_k' are independent. In this case it has already been noted in (1.12) that g_i(t) = λ_i(t), i = 1,...,k, which means that the implicit assumption is valid, and consequently the description of T_i' given by Moeschberger and David is correct. Nevertheless, the point made by Prentice, et al. is well taken -- the basic assumption that eliminating cause c_i only nullifies the corresponding argument of the joint survival distribution, S( ), is indeed a strong and often dubious assumption. Furthermore, they state that "any assumption about the relationship between the observed T and times to failure for specific causes, given the removal of other causes, will require detailed knowledge of the system under study and of the mechanism for cause removal."
Elandt-Johnson (1979), stressing this same point, considers the basic assumption to be generally unjustified.

Although one should exercise caution in, or perhaps refrain from, interpreting the various marginal distributions as the resulting survival distributions after elimination of the corresponding causes of death, it should be emphasized that the use of hypothetical lifetimes to formulate competing risks problems does not necessarily limit the generality of the methods or lead to difficulties of interpretation.
Recall that T_i has been defined as the hypothetical time at which an individual dies of cause c_i. Thus, these quantities implicitly represent the hypothetical lifetimes under the actual study conditions, that is, when all k risks are acting in a possibly dependent fashion. This means that none of the generality of competing risks analysis is sacrificed by the hypothetical lifetimes formulation. Furthermore, the T_i can perhaps be more easily interpreted by considering the following analogy:
Suppose an electronic shooting gallery allows k contestants to shoot at a single moving target which, when hit by a light beam from any of the k competitors' rifles, immediately pops back up. The object is to hit the target as many times as possible, and therefore the competitors keep shooting regardless of how often they have scored. For the purposes of the analogy, however, the variables of interest are the respective amounts of time that elapse before the k competitors hit the target for the first time. The time for the i-th contestant is analogous to T_i and is computed by essentially ignoring any hits scored by the others. Note that the conditions of the game remain the same since all competitors stay active throughout, and although one might consider their shooting to be independent, it is not difficult to envision the existence of dependencies.
envision the existence of dependencies.
Of course, in the competing risks setting, an individual dies
only once, and therefore the lifetimes,
ical.
T ,
i
are necessarily hypothet-
Nevertheless, the proper interpretation of
sents the time to death from cause
tions, i.e., in the presence of all
T
i
is that it repre-
under the actual study condi-
k
(possibly dependent) risks,
that would ultimately be observed if deaths from all causes other than
--
c
i
could be ignored.
This concept of ignoring certain causes of death
has also been used by Elandt-Johnson (1976) in interpreting the various
marginal distributions of
S(tl, ••. ,t ).
k
Thus, letting
*
Ai(t)
the hazard rate corresponding to the (marginal) distribution of
denote
T.,
1
it follows that
*
A.(t)dt
1
an individual dies of c
in
the individual has]
i
and all k risks are survived c i to
[ acting in (t,t+dt)
time t
= P (t,t+dt)
(1.15)
and
(1.16)
Earlier (in 1.7) it was stated that, under the basic assumption, λ_i^*(t) = λ_i(t) and S_i(t) = S_{i.12...(i-1)(i+1)...k}(t) ≡ S_{i.}(t). More generally, however,

    \lambda_i^*(t) \neq \lambda_i(t),      (1.17)

and thus, if the basic assumption is invalid, S_{i.}(t) ≠ S_i(t). If T_1,...,T_k are truly independent, moreover, the basic assumption is valid, and λ_i^*(t) = λ_i(t) = g_i(t).

As previously emphasized, however, clinical trials are primarily
previously emphasized, however, clinical trials are primarily
conducted to examine treatment responses under actual study conditions,
without extrapolation to any altered state of affairs.
In this setting,
the competing risks formulation, based on the joint distribution of the
hypothetical lifetimes,
Tl ••.. ,T •
k
can be applied with full general-
ity without an implicit or explicit need for the basic (or any similar)
assumption.
Indeed, for this purpose. it is sufficient to estimate and
draw inferences with respect to the
gi(t;~i)'
which are the only ha-
zard functions corresponding to observable quantities, while the
*
Ai(t)[Si(t)].
Ai(t)[Sio(t)],
which represent hypothetical quantities. and the
which refer to an altered set of conditions that is
generally not amenable to statistical modeling. are of little. if any,
interest.
1.6. Incorporation of Concomitant Information

As discussed earlier, it is usually necessary to incorporate covariables into the analysis of data generated by a clinical trial. Since much of the research in competing risks does not appear to have been specifically developed for clinical trials, it is perhaps not surprising that many authors do not address this issue. Of those that do [e.g., David and Moeschberger (1978) and Prentice, et al. (1978)], the
emphasis has been on the introduction of covariables through the cause-specific hazard functions, g_i(t), i = 1,...,k. In the context of parametric competing risks models David and Moeschberger suggest using

    g_i(t \mid z) = g_{0i}(t)\, c(z; \beta),      (1.18)

where β is the vector of parameters corresponding to z, the vector of covariables, and c(z; β) is any function of z and β such that c(z; 0) = 1. They trace the origin of this suggestion to the following model proposed by Cox (1972) for ordinary survival analysis settings (k = 1), in which the probability density of survival time is assumed to be

    f(t \mid z) = \lambda(t) \exp[-\Lambda(t)],      (1.19)

where z, β, and c(z; β) are as above,

    \lambda(t) = \lambda_0(t)\, c(z; \beta)      (1.20)

is the hazard rate, and

    \Lambda(t) = \int_0^t \lambda(u)\,du = \Lambda_0(t)\, c(z; \beta)      (1.21)

is the cumulative hazard rate. In particular, Cox emphasizes c(z; β) = exp(β'z). Note that in the case of independent risks, the addition of a subscript i to the quantities f, λ, Λ, c, λ_0, and Λ_0 generalizes Cox's notions to a competing risks setting. For dependent risks, this formulation is further generalized by considering the g_i(t | z) functions defined above. Parametric settings involve the assumption of a specific form for the joint survival distribution, S( ), which is equivalent to specifying the form of the g_{0i}(t) (or λ_{0i}(t) under independence) in the above model, since c(z; β) is a constant (does not depend on t).
Thus, the
covariables and their corresponding regression parameters can be readily incorporated into the likelihood function displayed earlier (1.13 or
1.14), thereby enabling estimation and inference to proceed as usual.
As an example, Kalbfleisch and Prentice (1980) consider the following exponential regression model for k = 1:

    \lambda(t \mid z) = \lambda e^{-\beta' z},      (1.22)

which implies that

    f_T(t \mid z) = (\lambda e^{-\beta' z}) \exp[-(\lambda e^{-\beta' z})\, t],      \lambda, t > 0.

That is, T is exponential with mean (\lambda e^{-\beta' z})^{-1}. More generally, Prentice, et al. (1978) describe an "accelerated failure time model," in which the covariables z accelerate or decelerate the time to failure (death), with cause-specific hazard functions

    g_i(t \mid z) = g_{0i}(t e^{-\beta' z})\, e^{-\beta' z},      i = 1,\ldots,k.      (1.23)

In particular, Farewell and Prentice (1977) consider the exponential, Weibull, and log-normal distributions as special cases of this model.
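A small sketch of the two covariate models above may be helpful; the baseline hazard, coefficients, and covariable values are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Illustrative sketch of the covariate models (1.22) and (1.23) with
# c(z; beta) = exp(beta'z); the numbers and baseline hazard are made up.
def exp_reg_hazard(t, z, lam, beta):
    """Exponential regression hazard (1.22): lambda * exp(-beta'z)."""
    return lam * np.exp(-np.dot(beta, z)) * np.ones_like(t)

def aft_hazard(t, z, beta, g0):
    """Accelerated failure time hazard (1.23): g0(t*exp(-beta'z))*exp(-beta'z)."""
    scale = np.exp(-np.dot(beta, z))
    return g0(t * scale) * scale

t = np.array([0.5, 1.0, 2.0])
z = np.array([1.0, 0.0])               # e.g. a treatment indicator and a covariate
beta = np.array([0.7, -0.2])           # illustrative regression coefficients
weibull_g0 = lambda u: 1.5 * u**0.5    # an assumed Weibull baseline hazard

print(exp_reg_hazard(t, z, 2.0, beta))
print(aft_hazard(t, z, beta, weibull_g0))
```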
Typically, though not necessarily, the regression parameters will be different for each lifetime, that is, if β_i denotes the set of parameters associated with the i-th lifetime, then β_i ∩ β_j = φ, j ≠ i. When applicable, this is emphasized by writing (1.18) and (1.23), for example, as

    g_i(t \mid z) = g_{0i}(t)\, c(\beta_i; z)      (1.24)

and

    g_i(t \mid z) = g_{0i}(t e^{-\beta_i' z})\, e^{-\beta_i' z}.      (1.25)
Prentice, et al. (1978) consider this case only, and stress that, due to the factorization of the likelihood function noted in (1.14), ordinary survival analysis (k = 1) techniques can be used to estimate the β_i, i = 1,...,k. A detailed account of the various methods available is given, for example, by Kalbfleisch and Prentice (1980).

One of the most popular methods of incorporating concomitant information when a single cause of death is under consideration is Cox's (1972, 1975) proportional hazards model, a partially nonparametric technique. Prentice, et al. (1978) state this model, in the context of competing risks, as follows:

    g_i(t \mid z) = g_{0i}(t) \exp(\beta_i' z),      (1.25)

where g_{0i}( ) ≥ 0 is arbitrary (i.e., not specified as in the parametric approach). Furthermore, they express the resulting partial likelihood function as

    \prod_{i=1}^{k} \prod_{j=1}^{d_i} \frac{\exp(\beta_i' z_{i(j)})}{\sum_{\ell \in R(t_{i(j)})} \exp(\beta_i' z_\ell)},      (1.26)

where R(t_{i(j)}) denotes the set of individuals at risk just prior to t_{i(j)}. Since this is not a fully parametric model, it will not be considered further, but more detailed discussions are given, for example, by Kalbfleisch and Prentice (1973), Breslow (1974), Holt (1978), and Prentice and Breslow (1978).
The approaches described above introduce covariables via the hazard functions. Although it can be argued that it is intuitive to model this function in light of its natural interpretability, the estimates of the parameters of these models, unlike the analogous estimates obtained from classical regression analysis, do not directly correspond to the relationships displayed in simple plots of the response variables versus the covariables. For this reason it is perhaps more intuitive to use the approach usually taken in regression analysis. That is, it is assumed that T_1,...,T_k have a joint survival distribution, S(t_1,...,t_k), such that E(T_i | z) = β_i'z, where S may depend on β and z as well. Feigl and Zelen (1965) utilize this formulation for k = 1 when T is assumed to follow an exponential distribution, while Zippin and Armitage (1966) allow for Type I censoring in the same setting. In the context of competing risks (k > 1), however, this approach has received very little attention. This is probably due to the inherent restrictions on the parameters, β_i'z > 0 for all i and z, which may complicate the iterative techniques used for estimation. On the other hand, modeling hazard rates with c(z; β) > 0, and in particular, with c(z; β) = exp(β'z), avoids these annoying restrictions. It appears, therefore, that hazard rate models owe their popularity more to mathematical convenience than intuitive appeal. Nevertheless, it may be possible to develop intuition by employing some of the hazard plotting methods described, for example, by David and Moeschberger (1978) and Kalbfleisch and Prentice (1980).

Both methods permit parametric analyses based on the usual lifetime distributions, but, in a given situation, one approach may be preferred over the other on the basis of the appropriateness of the mathematical relationship used to incorporate the covariables. In most cases, however, the choice is made on the basis of numerical versus intuitive and interpretative simplicity. Unfortunately, since modeling approaches based on the mean of the hypothetical lifetimes have been reported so rarely, it is impossible to determine if the numerical complications of this method generally outweigh its convenient interpretability, or vice versa.
1.7. Outline of Research Proposal
The focus of this dissertation is on the development of parametric
statistical procedures which provide an appropriate means of analyzing
competing risks data for the various experimental designs typically
used in clinical trials and other experimental settings.
Since our at-
tention is restricted to the parametric case, it is assumed that, on the
basis of prior theoretical or empirical evidence, the parametric form of
the joint distribution of the underlying lifetimes can be specified.
It
should be emphasized that the methods presented are not based on any
specific parametric model; rather, a general theoretical framework
which can be applied to a broad class of parametric competing risks
models is proposed.
In Chapter II general parametric competing risks models are formulated in terms of the (possibly dependent) underlying lifetimes, and the
means of these lifetimes are expressed as functions of an arbitrary number
of covariables which include treatment indicators as well as any other
appropriate explanatory variables.
The resulting likelihood function
is derived, and it is shown to be a function of independent but generally nonidentically distributed random vectors.
This important fact
influences the formulation and derivation of all procedures considered
throughout this investigation since it implies that the mathematically
simpler iid framework cannot be used for general competing risks
models.
Under carefully stated regularity conditions, the asymptotic
normality of the
mle
is established for the case when all the life-
times are observed, that is, when no censoring or truncation schemes
have been used.
From this result, large sample tests of very general
hypotheses based on Wald and likelihood ratio statistics are obtained.
It is also shown that withdrawals fit into the competing risks framework already discussed, and therefore, they require no special treatment.
Finally, parallel results are obtained for the single point
truncation scheme.
In this case, the study is designed to be stopped
at a predetermined time (Type I censoring or truncation), and the data
is analyzed only at this point.
The procedures proposed for this case
are similar to those discussed above, except that they are based on
truncated observations instead of complete observations.
Due to ethical considerations and the possible loss of efficiency associated with single point truncation or censoring schemes, time-sequential procedures, which allow termination of the experiment at the earliest possible stage based on the accumulated statistical evidence, are often advocated for clinical trials. In Chapter III, a progressively truncated scheme (PTS) is used to develop a large sample time-sequential test procedure for parametric competing risks models, and this procedure is appropriately modified for the case when "nuisance" parameters are present. Early termination is possible in a PTS since the test statistic, which is based on truncated observations, is continuously monitored over (0, X], the interval of time allotted for the study, and compared to an appropriate critical value. Also, the above procedures are adapted to allow staggered entry into the trial.
In the form of staggered entry considered, patients are entered over a specified interval, [0, X_0], and then an appropriate test statistic is monitored over [X_0, X], where, as before, X denotes the planned maximum length of the study. We conclude Chapter III by showing that the single point censoring scheme can be handled as a special case of the results obtained above. In this scheme, the study is designed to end with the r-th (1 ≤ r < n) failure (Type II censoring), and the resulting data is analyzed only at this single censoring point.
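A schematic of the progressively truncated monitoring just described might look as follows; `statistic` and `critical_value` are hypothetical placeholders for the quantities developed in Chapter III, and the numbers are made up.

```python
import numpy as np

# Rough sketch of the progressively truncated monitoring idea (illustration
# only): at each successive death time t <= X the test statistic is
# recomputed from the observations truncated at t and compared with a
# critical value; the trial stops early the first time the boundary is crossed.
def monitor_pts(death_times, X, statistic, critical_value):
    for t in np.sort(np.asarray(death_times, float)):
        if t > X:                       # planned maximum study length reached
            break
        if abs(statistic(t)) >= critical_value:
            return t                    # stop: reject at this truncation point
    return None                         # boundary never crossed on (0, X]

# Toy usage with a made-up statistic that grows with the truncation point.
print(monitor_pts([0.4, 1.1, 2.3, 3.0], X=2.5,
                  statistic=lambda t: 1.2 * t, critical_value=2.0))
```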
Although it has limited applicability, a (restricted) large
sample sequential procedure is also obtained for parametric competing
risks models in the context of repeated significance testing in Chap-
ter IV, and a parallel test based on truncated observations is given.
In a repeated significance testing scheme, observations are obtained
sequentially, and the test statistic is updated after the addition of
each observation.
As soon as the accumulated statistical evidence
permits, the null hypothesis under consideration is rejected and the
experiment is terminated.
If, however, the sample size reaches a pre-
determined maximum without rejection of the null hypothesis, the experiment is terminated and the hypothesis is accepted.
This type of
sequential procedure is called "restricted" since, in contrast to the
more classical sequential designs, the sample size cannot grow beyond
the specified maximum.
It should be emphasized that the procedures of
Chapter IV are sequential in the sense that the observations are obtained sequentially, while the procedures of Chapter III, which use a
fixed number of observations, are sequential in time.
The large sample results of Chapters III and IV are obtained by
showing that, under suitable regularity conditions, the distributions
of the proposed test statistics for both the time-sequential and sequential procedures converge weakly to those of a functional of certain
Gaussian processes under the appropriate null hypotheses.
Unfortunate-
ly, these null distributions are not available in the statistical literature, and theoretical derivation appears mathematically intractable,
at least in general.
However, the computer generation of general
Gaussian processes is discussed in Chapter V and subsequently used to
provide an algorithm which obtains the distributions of the appropriate
functionals empirically.
Also, the regularity conditions of Chapters
II, III, and IV are verified for a general independent risks exponential model and a dependent risks model based on Gumbel's bivariate
exponential distribution.
This, along with some of the comments of the
earlier chapters, suggests that the procedures proposed in this investigation are applicable to a broad class of parametric competing risks
models.
Additionally, using simulated data from exponential popula-
tions, the basic time-sequential techniques of Chapter III are
illustrated in detail.
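The Chapter V simulation idea can be sketched roughly as follows, assuming (for illustration only) a Brownian-motion covariance kernel and the functional sup_t |W(t)|; this is not the dissertation's algorithm.

```python
import numpy as np

# Minimal sketch of simulating a Gaussian process on a grid and estimating
# the distribution of a functional of it, here sup_t |W(t)|.  The covariance
# below (cov(s, t) = min(s, t), i.e. Brownian motion) is purely illustrative.
rng = np.random.default_rng(1981)
grid = np.linspace(0.01, 1.0, 100)
cov = np.minimum.outer(grid, grid)          # assumed covariance kernel
chol = np.linalg.cholesky(cov + 1e-10 * np.eye(grid.size))

draws = chol @ rng.standard_normal((grid.size, 5000))   # 5000 sample paths
sup_abs = np.abs(draws).max(axis=0)                      # functional of each path
print(np.quantile(sup_abs, 0.95))   # empirical 95th percentile of sup |W(t)|
```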
Finally, in Chapter VI, we make some final comments and a few
suggestions for further research.
CHAPTER II

ASYMPTOTIC MAXIMUM LIKELIHOOD PROCEDURES FOR GENERAL PARAMETRIC COMPETING RISKS MODELS
2.1. Introduction

If the parametric form of the joint distribution of the hypothetical lifetimes in a competing risks setting is assumed known, likelihood procedures are often used to estimate and draw statistical inferences with respect to underlying model parameters. Typically, maximum likelihood estimates (mle) are found and inferences are based on the assumed asymptotic normality of the corresponding estimators. Rigorous justification for this approach is not found in the literature, an omission which cannot be ignored easily since, as shown in Section 2.3, the likelihood function arising from competing risks models is a function of independent but generally nonidentically distributed random vectors. This fact increases both the number and complexity of the regularity conditions needed to insure asymptotic normality of the mle estimators, the basic result obtained in this chapter. The current investigation deals with the classical statistical setting in which data collected during the experiment are analyzed only at its conclusion; procedures which permit time-sequential and sequential methods of analysis are developed in subsequent chapters.
Some fully general parametric competing risks models are formulated in Section 2.2 while the resulting likelihood function is formally derived in Section 2.3. In Section 2.4 an appropriate set of regularity conditions is presented and used to establish the asymptotic normality of the mle. Large sample tests of very general hypotheses based on Wald and likelihood ratio statistics are presented in Section 2.5, and Section 2.6 includes a simple example involving exponential lifetimes. Withdrawals and Type I (fixed point) censoring are considered in Section 2.7, while a discussion of Type II (order statistic) censoring is deferred until Chapter III.
2.2. Some General Parametric Competing Risks Models

As discussed in Chapter I, it is convenient, as well as conventional, to formulate competing risks models in terms of the underlying lifetimes or the hypothetical times to death associated with the various risks under consideration. Specifically, suppose a study population is exposed to k causes of death, c_1,...,c_k, and let T_i denote the hypothetical time at which an individual dies of cause c_i under the actual study conditions, that is, when all k risks are acting in a possibly dependent fashion. Further, we assume that for each individual j (j = 1,2,...,n), these hypothetical lifetimes are jointly distributed as follows:

    (T_{1j},\ldots,T_{kj})' \sim F(t_1,\ldots,t_k;\; \mu_j, \theta),      (2.1)

where

(1) F is an absolutely continuous distribution whose form is known but depends on the (unknown) parameters μ_j and θ,

(2) μ_{ij} = E(T_{ij} | z_{ij}) = β_i'z_{ij}, i = 1,...,k and j = 1,...,n, where z_{ij} is a vector of covariables and β_i is the corresponding vector of regression parameters satisfying μ_{ij} > 0 for all i and j,

(3) θ denotes any parameters other than the vector of means, μ_j = (μ_{1j},...,μ_{kj})', that are required to specify the model, and
and
(4)
--
Additionally, if a location-scale parameter model is assumed, the
hypothetical lifetimes can be expressed as
= Si' z..
- -1J
where
8.
-1.
and
zi'
-
-J
FE
i
= l, ..• ,k;
j=l, ... ,n,
are defined as above, and for
]
E,
where
+ EiJ"
=
(Ei,,···,E ,)'
kJ
J
~
j
(2.2)
= l, •.• ,n,
F (e;1l =O,L ) ,
E
E - -_E
_ - -_
(2.3)
is a given (absolutely continuous) distribution and
~1'~2' •.. '~n
are iid random vectors.
implies that, for
j
The location-scale model also
= l, ••• ,n,
(2.4)
It should be noted that a location-scale parameter model is not requlred for any of the procedures that will be subsequently developed.
It will only be necessary to specify the parametric form of the joint
32
as in model (2.1); in fact, the exponential
distribution of
distribution, which is often used to model lifetimes, is not a locationscale parameter family.
Although the location-scale parameter model is
obviously similar to the classical linear model, model (2.1) is also
.
analogous to the classical linear model since the mean lifetimes are
expressed as linear combinations of appropriate covariables.
It is
important to note that, under both models,
Tl, ••• ,T
are independent
-n
but generally not identically distributed random vectors.
While the T_j = (T_{1j},...,T_{kj})' are not observable in a competing risks setting, the following quantities are:

    T_j = \min(T_{1j},\ldots,T_{kj})   and   \delta_j = (\delta_{1j},\ldots,\delta_{kj})',   where   \delta_{ij} = 1 if T_j = T_{ij} and 0 otherwise.      (2.5)

Thus, T_j is the actual lifetime of the j-th individual while δ_j is an indicator random vector which identifies his cause of death. Following the notation established in Chapter I, S_{T_j}(t), f_{T_j}(t), and F_{T_j}(t) denote the survival function, pdf, and cdf of the lifetime of the j-th individual, T_j, while S_{T_j}(t_{1j},...,t_{kj}), f_{T_j}(t_{1j},...,t_{kj}), and F_{T_j}(t_{1j},...,t_{kj}) represent the corresponding joint distributions of the underlying lifetimes for the j-th individual, (T_{1j},...,T_{kj})'. Due to the incorporation of concomitant information in the above model, these distributions generally depend on the individual subject and therefore the subscript j is necessary. If at any point in the sequel this subscript is dropped, it is done only for convenience or clarity of presentation. Note, however, that the joint distribution of the residuals associated with the location-scale parameter model, F_ε, does not depend on j.
j.
Since the model presented in (2.1) is analogous to ordinary
regression models with respect to the manner in which it incorporates
covariab1es, it provides an intuitive as well as fully general approach
(~l' ••• '~k)
to competing risks using parameters
preted easily by many researchers.
E(Ti,lz i ,) = Si'Z'j > 0,
J - J
-
However, the restriction that
= l, ... ,k and j = 1, ... ,n,
i
-1.
that can be inter-
may complicate the numerical procedures used to estimate the model
--
parameters in certain situations.
If these restrictions create prob-
lems, one might consider modeling the mean lifetimes with the following
more general, possibly nonlinear, function:
~i'
J
where
= E(T i J, Iz- iJ·) = h(Si;zi'·),
--J
h() >
= l, .•. ,k;
(2.6)
1, ... , n ,
j
° and the other aspects of model (2.1) are unchanged.
In this model the
@i
(i
yet, for a suitably chosen
example,
i
h(§i;:ij)
= 1, ••• ,k)
h(),
= eXP(§~:ij)'
are unrestricted real vectors and
E(T ij ) =
(1
h(~i;~ij)
= 1, ••• ,k;
j
> 0.
As an
= 1, ••• ,n)
satis-
fies the requirement that the mean lifetimes be positive without
placing any restrictions on the parameters
2.3.
(§i'
i
= 1, ••• ,k).
The Likelihood Function
In order to employ likelihood procedures for estimating and drawing inferences with respect to the $\beta_i$ $(i = 1,\ldots,k)$, we must find the joint distribution of $(T,\delta)$, where $T = \min(T_1,\ldots,T_k)$ and $\delta = (\delta_1,\ldots,\delta_k)$; the subscript $j$ has been omitted for convenience. Now,

$$f_{T,\delta}\big(t,\delta=(1,0,\ldots,0)\big) = \lim_{\Delta t\downarrow 0}\frac{1}{\Delta t}\,P\{t \le T_1 \le t+\Delta t;\ T_2 > T_1;\ \ldots;\ T_k > T_1\}. \qquad(2.7)$$

But

$$P\{t \le T_1 \le t+\Delta t;\ T_2 > T_1;\ \ldots;\ T_k > T_1\} \le S_{\mathbf T}(t,t,\ldots,t) - S_{\mathbf T}(t+\Delta t,t,\ldots,t).$$

Similarly,

$$P\{t \le T_1 \le t+\Delta t;\ T_2 > T_1;\ \ldots;\ T_k > T_1\} \ge S_{\mathbf T}(t,t+\Delta t,\ldots,t+\Delta t) - S_{\mathbf T}(t+\Delta t,t+\Delta t,\ldots,t+\Delta t).$$

Thus, from (2.7) we have

$$\lim_{\Delta t\downarrow 0}\frac{S_{\mathbf T}(t,t+\Delta t,\ldots,t+\Delta t) - S_{\mathbf T}(t+\Delta t,\ldots,t+\Delta t)}{\Delta t} \le f_{T,\delta}\big(t,(1,0,\ldots,0)\big) \le \lim_{\Delta t\downarrow 0}\frac{S_{\mathbf T}(t,t,\ldots,t) - S_{\mathbf T}(t+\Delta t,t,\ldots,t)}{\Delta t}.$$

Since both limits approach $-\dfrac{\partial}{\partial x_1}S_{\mathbf T}(x_1,\ldots,x_k)\Big|_{x_i=t,\,\forall i}$, it follows that $f_{T,\delta}(t,(1,0,\ldots,0)) = g_1(t)\,S_T(t)$, where

$$g_1(t) = -\frac{\partial}{\partial x_1}S_{\mathbf T}(x_1,\ldots,x_k)\Big|_{x_i=t,\,\forall i}\Big/ S_T(t),$$

since $\delta_1 = 1$ if $T = T_1$ and $\delta_1 = 0$ otherwise. Clearly, this argument does not depend on which risk is considered and is, in fact, valid for any legitimate value of $\delta$. Therefore, for $t > 0$ and $\delta \in E_k \equiv \{(1,0,\ldots,0)_{1\times k},(0,1,0,\ldots,0),\ldots,(0,\ldots,0,1)\}$,

$$f_{T,\delta}(t,\delta) = \prod_{i=1}^k[g_i(t)]^{\delta_i}\,S_T(t), \qquad(2.8)$$

and the likelihood function is given by

$$L = \prod_{j=1}^n\Big\{\prod_{i=1}^k[g_{ij}(t_j)]^{\delta_{ij}}\Big\}\,S_{T_j}(t_j). \qquad(2.9)$$

If $D_i$ of the subjects die from cause $c_i$, and $T_{i(j)}$ denotes the lifetime of the $j$th subject dying from cause $c_i$ $(i = 1,\ldots,k;\ j = 1,\ldots,d_i)$, then (2.9) may be expressed as

$$L \propto \prod_{i=1}^k\prod_{j=1}^{d_i}g_i(t_{i(j)})\,S_T(t_{i(j)}), \qquad(2.10)$$

where the missing proportionality factor is the (multinomial) probability that $D_i = d_i$ $(i = 1,\ldots,k)$. It should be noted that (2.10) is equivalent to the likelihood function given in (1.13) without censoring, and it is similar to the likelihood derived by David and Moeschberger (1978, Chap. 2). Their derivation, however, is valid only when the risks are independent, while the above formulation and derivation is valid in general. It is also of interest to note that the above likelihood functions can be directly obtained in a heuristic manner by utilizing the intuitive interpretations of the $g_i$ and $S_T$. This latter approach is taken by most authors, at least implicitly.
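As a concrete numerical illustration of evaluating (2.9), the following sketch computes the log-likelihood for the special case of $k$ independent exponential risks (the model treated in detail in Section 2.6), where $g_i(t) = 1/\mu_i$ and $S_T(t) = \exp(-t\sum_\ell 1/\mu_\ell)$; the data arrays and parameter values are purely hypothetical.

```python
import numpy as np

def competing_risks_loglik(mu, t, delta):
    """Log-likelihood (2.9) for k independent exponential risks.

    mu    : (k,) mean lifetimes for the k causes (so g_i(t) = 1/mu_i)
    t     : (n,) observed lifetimes T_j = min(T_1j, ..., T_kj)
    delta : (n, k) cause-of-death indicators, one 1 per row
    """
    mu = np.asarray(mu, dtype=float)
    log_g = -np.log(mu)                            # log g_i(t) = -log mu_i (constant in t)
    log_S = -np.outer(t, 1.0 / mu).sum(axis=1)     # log S_T(t_j) = -t_j * sum_i 1/mu_i
    return float((delta * log_g).sum() + log_S.sum())

# hypothetical data: n = 5 subjects, k = 2 causes
t = np.array([1.2, 0.7, 2.5, 0.4, 1.9])
delta = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [0, 1]])
print(competing_risks_loglik([2.0, 3.0], t, delta))
```

For other parametric families the same structure carries over: each death contributes $\log g_i(t_j) + \log S_T(t_j)$ for its observed cause.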
Since a considerable mathematical simplification can be obtained by writing the log-likelihood function as the sum of iid random vectors, if a location-scale parameter model can be assumed, it is natural to try to express the likelihood function in terms of the residual vectors, $\epsilon_j$, as in classical linear models theory. Proceeding along these lines,

$$S_{T_j}(t_j) = P[T_{1j} > t_j,\ \ldots,\ T_{kj} > t_j] = P[T_{1j}-\mu_{1j} > t_j-\mu_{1j},\ \ldots,\ T_{kj}-\mu_{kj} > t_j-\mu_{kj}] = S_{\boldsymbol\epsilon_j}(t_j-\mu_{1j},\ldots,t_j-\mu_{kj}). \qquad(2.11)$$

Also,

$$g_{ij}(t_j) = -\frac{\partial}{\partial x_i}S_{\boldsymbol\epsilon_j}(x_1-\mu_{1j},\ldots,x_k-\mu_{kj})\Big|_{x_\ell = t_j,\,\forall \ell}\Big/ S_{\boldsymbol\epsilon_j}(t_j-\mu_{1j},\ldots,t_j-\mu_{kj}).$$

Using this, (2.9), and (2.11),

$$L\big(\beta_1,\ldots,\beta_k;\ (T_j,\delta_j),\ j=1,\ldots,n\big) = \prod_{j=1}^n\Big\{\prod_{i=1}^k\big[g_{\epsilon_i}(T_j-\mu_{1j},\ldots,T_j-\mu_{kj})\big]^{\delta_{ij}}\Big\}\,S_{\boldsymbol\epsilon_j}(T_j-\mu_{1j},\ldots,T_j-\mu_{kj}),$$

where $T_j = \min(T_{1j},\ldots,T_{kj})$ for $j = 1,\ldots,n$. Although the residual vectors $\epsilon_j = (T_{1j}-\mu_{1j},\ldots,T_{kj}-\mu_{kj})$, $j = 1,\ldots,n$, are iid, the above argument shows that the likelihood function is a function of $(T_j-\mu_{1j},\ldots,T_j-\mu_{kj})$, $j = 1,\ldots,n$, where $T_j - \mu_{\ell j} = \min(T_{1j},\ldots,T_{kj}) - \mu_{\ell j}$ does not have mean 0 for all $\ell = 1,\ldots,k$ and $j = 1,\ldots,n$. It follows that, even for the location-scale parameter model, $(T_j-\mu_{1j},\ldots,T_j-\mu_{kj})$, $j = 1,\ldots,n$, are not necessarily identically distributed, and consequently the log-likelihood function can be expressed as the sum of independent but generally nonidentically distributed random vectors. This important feature of competing risks models, which is a reflection of the fact that the actual lifetime is the minimum of $k$ underlying lifetimes, will largely dictate the mathematical formulation and derivation of all procedures that will be presented throughout this investigation.
2.4. Asymptotic Normality of the MLE

Once the maximum likelihood estimators have been obtained, usually by iterative techniques whenever covariables are included, most authors assume that these estimators are asymptotically normally distributed without precisely stating the necessary regularity conditions. Standard demonstrations of this result require that the log-likelihood function be expressible as the sum of iid random vectors. Since it was shown in the last section that this is generally not possible with competing risks models, however, the corresponding regularity conditions will increase in number as well as complexity, and it seems appropriate to carefully state and prove the required distributional result.

Before actually stating appropriate regularity conditions, it is convenient to develop some notation. Let $\beta$ denote the entire vector of model parameters. We renumber the components of $\beta$ and simplify our notation for the joint density of $(T_j,\delta_j)$ as follows:
$$\beta = (\beta_1,\ldots,\beta_q)'_{q\times 1} \qquad(2.12)$$

and

$$f_j(\beta) \equiv f_j(t_j,\delta_j) \equiv f_j(t_j,\delta_j;\beta). \qquad(2.13)$$

Also, let

$$\dot f_j(\beta) = \frac{\partial\log f_j(\beta)}{\partial\beta} = \begin{pmatrix}\partial\log f_j(\beta)/\partial\beta_1\\ \vdots\\ \partial\log f_j(\beta)/\partial\beta_q\end{pmatrix}_{q\times 1} = \begin{pmatrix}\dot f_j^{\,1}(\beta)\\ \vdots\\ \dot f_j^{\,q}(\beta)\end{pmatrix}_{q\times 1}$$

and

$$\ddot f_j(\beta) = \frac{\partial^2\log f_j(\beta)}{\partial\beta\,\partial\beta'} = \begin{pmatrix}\dfrac{\partial^2\log f_j(\beta)}{\partial\beta_1^2} & \cdots & \dfrac{\partial^2\log f_j(\beta)}{\partial\beta_1\partial\beta_q}\\ \vdots & & \vdots\\ \dfrac{\partial^2\log f_j(\beta)}{\partial\beta_q\partial\beta_1} & \cdots & \dfrac{\partial^2\log f_j(\beta)}{\partial\beta_q^2}\end{pmatrix}_{q\times q} = \big(\ddot f_j^{\,r,s}(\beta)\big)_{q\times q}.$$
Using the notation presented above, an appropriate set of regularity conditions can be stated as follows:

For $j = 1,\ldots,n$, $t_j > 0$, $\delta_j\in E_k$, and for all $\beta\in R^q$ (or an open subset of $R^q$ if a restricted parameter space is indicated), $f_j(t_j,\delta_j;\beta) > 0$ and has (continuous) first and second order partial derivatives with respect to $\beta$. (2.14)

For $r,s = 1,\ldots,q$, $j = 1,\ldots,n$, and all $\beta\in R^q$,
$$\Big|\frac{\partial}{\partial\beta_r}f_j(t_j,\delta_j;\beta)\Big| \le U_j^{\,r}(t_j) \quad\text{and}\quad \Big|\frac{\partial^2}{\partial\beta_r\partial\beta_s}f_j(t_j,\delta_j;\beta)\Big| \le U_j^{\,r,s}(t_j),$$
where $\int_0^\infty U_j^{\,r}(u)\,du < \infty$ and $\int_0^\infty U_j^{\,r,s}(u)\,du < \infty$. (2.15)

For some $\eta > 0$ and all $r,s = 1,\ldots,q$, $\epsilon > 0$, and $j = 1,\ldots,n$,
$$E\Big\{\sup_{\beta^*(\epsilon)}\big|\ddot f_j^{\,r,s}(\beta^*) - \ddot f_j^{\,r,s}(\beta)\big|\Big\}^{1+\eta} < \infty, \quad\text{where } \beta^*(\epsilon) = \{\beta^*:\ \|\beta^*-\beta\| < \epsilon\}. \quad(2.16)$$

For some $\delta > 0$ and all $r,s = 1,\ldots,q$,
$$E\big|\ddot f_j^{\,r,s}(\beta)\big|^{1+\delta} < K < \infty \quad\text{for all } j = 1,\ldots,n. \quad(2.17)$$

For some $\gamma > 0$ and each $s = 1,\ldots,q$,
$$E\big|\dot f_j^{\,s}(\beta)\big|^{2+\gamma} < K < \infty \quad\text{for all } j = 1,\ldots,n. \quad(2.18)$$

Let $\mathrm{ch}_q(I_n(\beta))$ denote the smallest ($q$th) characteristic root of
$$I_n(\beta) = \frac1n\sum_{j=1}^n I_j(\beta),$$
where $I_j(\beta)$ is the Fisher information matrix for the $j$th subject. Assume that
$$\liminf_{n\to\infty}\ \mathrm{ch}_q\big(I_n(\beta)\big) > 0. \quad(2.19)$$

Theorem 2.1: If conditions (2.14)-(2.19) hold, then with probability approaching 1 as $n$ grows large, there exists a unique mle, $\hat\beta_n$, in a small neighborhood of the true parameter value, $\beta$, and

$$\sqrt n\,(\hat\beta_n - \beta) \overset{\mathcal V}{\sim} N_q\big(0,\ I_n^{-1}(\beta)\big); \qquad(2.20)$$

that is, for every $\ell \ne 0$ ($\ell\in R^q$),

$$\frac{\sqrt n\,\ell'(\hat\beta_n - \beta)}{\sqrt{\ell'\,I_n^{-1}(\beta)\,\ell}} \overset{\mathcal D}{\to} N(0,1). \qquad(2.21)$$

Remark:
The definition of asymptotic joint normality adopted in (2.21) avoids any problems if $I_n^{-1}(\beta)$ does not converge to a limit. In practice, however, this will rarely, if ever, occur, and so it will typically be possible to write

$$\sqrt n\,\ell'(\hat\beta_n - \beta) \overset{\mathcal D}{\to} N\big(0,\ \ell'\,I^{-1}(\beta)\,\ell\big),$$

where $I(\beta) = \lim_{n\to\infty} I_n(\beta)$.

Proof: For all $r = 1,\ldots,q$ and all $j$,

$$E[\dot f_j^{\,r}(\beta)] = \sum_{\delta_j\in E_k}\int_0^\infty\frac{\partial}{\partial\beta_r}\log f_j(u,\delta_j)\cdot f_j(u,\delta_j)\,du = \sum_{\delta_j\in E_k}\int_0^\infty\frac{\partial}{\partial\beta_r}f_j(u,\delta_j)\,du = \frac{\partial}{\partial\beta_r}(1) = 0,$$

where the interchange of differentiation and integration is justified by (2.15).
Also, since

$$\frac{\partial^2}{\partial\beta_r\partial\beta_s}\log f_j(t_j,\delta_j) = \frac{\partial^2 f_j(t_j,\delta_j)/\partial\beta_r\partial\beta_s}{f_j(t_j,\delta_j)} - \dot f_j^{\,r}(\beta)\,\dot f_j^{\,s}(\beta),$$

it follows, again using (2.15), that

$$-E[\ddot f_j^{\,r,s}(\beta)] = E[\dot f_j^{\,r}(\beta)\,\dot f_j^{\,s}(\beta)] = I_j^{\,r,s}(\beta),$$

the $(r,s)$th element of the Fisher information matrix $I_j(\beta)$ for the $j$th individual. Summarizing the above, we have for $j = 1,\ldots,n$,

$$E[\dot f_j(\beta)] = 0 \qquad(2.22)$$

and

$$E[\dot f_j(\beta)\,\dot f_j'(\beta)] = -E[\ddot f_j(\beta)] = I_j(\beta), \qquad(2.23)$$

where the $(r,s)$th element of $I_j(\beta)$ is given by

$$I_j^{\,r,s}(\beta) = E[\dot f_j^{\,r}(\beta)\,\dot f_j^{\,s}(\beta)]. \qquad(2.24)$$
For any $\beta_n\in R^q$ we can write, for some $u\in R^q$,

$$\beta_n = \beta + n^{-1/2}u, \qquad(2.25)$$

where $\beta$ is the true parameter value. Letting $\hat\beta_n$ denote a mle of $\beta$ based on a sample of size $n$, we have

$$\hat\beta_n = \beta + n^{-1/2}\hat u, \quad\text{or}\quad \hat u = \sqrt n\,(\hat\beta_n - \beta), \quad\text{and}\quad L_n(\beta + n^{-1/2}\hat u) = \sup_{u}L_n(\beta + n^{-1/2}u) = \sup_{\beta_n}L_n(\beta_n), \qquad(2.26)$$

if, in fact, this mle exists. The remainder of this proof is devoted to showing that, with probability approaching one as $n$ grows large, there exists a unique mle in a small neighborhood of $\beta$, and that this mle is asymptotically normally distributed.
Before proceeding, we note that

$$n^{-1/2}\,\frac{\partial\log L_n(\beta_n)}{\partial\beta_n}\Big|_{\beta_n=\beta} = \frac{1}{\sqrt n}\sum_{j=1}^n\frac{\partial\log f_j(\beta_n)}{\partial\beta_n}\Big|_{\beta_n=\beta} = \frac{1}{\sqrt n}\sum_{j=1}^n\dot f_j(\beta) \qquad(2.27)$$

and

$$\frac1n\,\frac{\partial^2\log L_n(\beta_n)}{\partial\beta_n\,\partial\beta_n'}\Big|_{\beta_n=\beta^0} = \frac1n\sum_{j=1}^n\ddot f_j(\beta^0) = \begin{pmatrix}\frac1n\sum_{j=1}^n\ddot f_j^{\,1,1}(\beta^0) & \cdots & \frac1n\sum_{j=1}^n\ddot f_j^{\,1,q}(\beta^0)\\ \vdots & & \vdots\\ \frac1n\sum_{j=1}^n\ddot f_j^{\,q,1}(\beta^0) & \cdots & \frac1n\sum_{j=1}^n\ddot f_j^{\,q,q}(\beta^0)\end{pmatrix}. \qquad(2.28)$$
To insure that the mle closest to $\beta$ will be found it is necessary to restrict $u$ to a bounded region about $0$; that is, we require

$$\|u\| \le K \quad\text{for some } K < \infty. \qquad(2.29)$$

Let $\lambda_n(u) = \log L_n(\beta_n)$; recalling that, by (2.25), $\beta_n$ is a function of $u$, $\log L_n(\beta_n)$ can be expanded in a Taylor series about $\beta$ as follows, using (2.25):

$$\log L_n(\beta_n) = \log L_n(\beta) + n^{-1/2}\,u'\,\frac{\partial\log L_n(\beta_n)}{\partial\beta_n}\Big|_{\beta_n=\beta} + \frac12\,n^{-1}\,u'\Big[\frac{\partial^2\log L_n(\beta_n)}{\partial\beta_n\,\partial\beta_n'}\Big|_{\beta_n=\beta^0}\Big]u, \qquad(2.30)$$

where $\beta_i^0\in(\beta_i,\ \beta_i + n^{-1/2}u_i)$, that is, $|\beta_i^0 - \beta_i| \le |u_i|/\sqrt n$, $i = 1,\ldots,q$. Thus, from (2.27) and (2.28),

$$\lambda_n(u) = \log L_n(\beta) + u'\,\frac{1}{\sqrt n}\sum_{j=1}^n\dot f_j(\beta) + \frac12\,u'\Big[\frac1n\sum_{j=1}^n\ddot f_j(\beta^0)\Big]u, \qquad(2.31)$$
where

$$X_n = \frac1n\sum_{j=1}^n\ddot f_j(\beta) = \Big(\frac1n\sum_{j=1}^n\ddot f_j^{\,r,s}(\beta)\Big)_{q\times q} \qquad\text{and}\qquad Y_n = \frac1n\sum_{j=1}^n\big[\ddot f_j(\beta^0) - \ddot f_j(\beta)\big],$$

so that $\frac1n\sum_{j=1}^n\ddot f_j(\beta^0) = X_n + Y_n$. It will be shown that

$$X_n + Y_n + I_n(\beta) \overset{P}{\to} 0 \qquad(2.32)$$

and

$$Z_n(\beta) = \frac{1}{\sqrt n}\sum_{j=1}^n\dot f_j(\beta) \overset{\mathcal V}{\sim} N_q\big(0,\ I_n(\beta)\big), \qquad(2.33)$$

but the proofs are postponed until the main result is established.

Using (2.31) and (2.32),

$$\lambda_n(u) = \log L_n(\beta) + u'\,Z_n(\beta) - \tfrac12\,u'\,I_n(\beta)\,u + o_p(1).$$

Now, setting $\partial\lambda_n(u)/\partial u = 0$,

$$Z_n(\beta) - I_n(\beta)\,u + o_p(1) = 0 \qquad(2.34)$$

or

$$\hat u = I_n^{-1}(\beta)\,Z_n(\beta) + o_p(1). \qquad(2.35)$$

Thus, with probability approaching 1 as $n\to\infty$, $\hat u$ has been explicitly shown to be the unique mle [since (2.34) is linear in $u$] in a small neighborhood of $\beta$, the true parameter value. Furthermore, from (2.26), $\hat u = \sqrt n\,(\hat\beta_n - \beta)$, and, applying (2.33) and Slutsky's Theorem [Cramér (1946, p. 255)] to (2.35), $\sqrt n\,(\hat\beta_n - \beta) \overset{\mathcal V}{\sim} N_q(0,\ I_n^{-1}(\beta))$, as required.

To complete the proof, (2.32) and (2.33) must be established. First,
consider $Y_n$. Its $(r,s)$th element is given by

$$Y_n^{\,r,s} = \frac1n\sum_{j=1}^n\big[\ddot f_j^{\,r,s}(\beta^0) - \ddot f_j^{\,r,s}(\beta)\big], \qquad r,s = 1,\ldots,q.$$

Also, note that from (2.29) and (2.30), $\|\beta^0 - \beta\| \le K/\sqrt n$. Thus, for $n$ adequately large, $\beta^0\in\beta^*(\epsilon) = \{\beta^*:\ \|\beta^* - \beta\| < \epsilon\}$ with $\epsilon = K/\sqrt n$, and

$$|Y_n^{\,r,s}| \le Y_n^{*r,s} = \frac1n\sum_{j=1}^n\sup_{\beta^*(\epsilon)}\big|\ddot f_j^{\,r,s}(\beta^*) - \ddot f_j^{\,r,s}(\beta)\big|. \qquad(2.36)$$

Let $\nu_j^{\,r,s} = E\big\{\sup_{\beta^*(\epsilon)}|\ddot f_j^{\,r,s}(\beta^*) - \ddot f_j^{\,r,s}(\beta)|\big\}$. Assumption (2.16) implies that the corresponding $(1+\eta)$th absolute moments are finite, so Markov's LLN can be applied to obtain $Y_n^{*r,s} - \bar Y_n^{*r,s} \overset{P}{\to} 0$, where $\bar Y_n^{*r,s} = \frac1n\sum_{j=1}^n\nu_j^{\,r,s}$. But as $n\to\infty$, $\epsilon = K/\sqrt n\to 0$, so that

$$\lim_{n\to\infty}\frac1n\sum_{j=1}^n\nu_j^{\,r,s} = 0,$$

and therefore $Y_n^{*r,s}\overset{P}{\to}0$; in view of (2.36), $Y_n^{\,r,s}\overset{P}{\to}0$ as well. Since the above argument applies for all $r,s = 1,\ldots,q$,

$$Y_n \overset{P}{\to} 0_{q\times q}. \qquad(2.37)$$

Next, consider $X_n$. By assumption (2.17), $E|\ddot f_j^{\,r,s}(\beta)|^{1+\delta} < K < \infty$ for some $\delta > 0$ and all $j = 1,\ldots,n$, so Markov's LLN once again applies, yielding $X_n^{\,r,s} - \frac1n\sum_{j=1}^n E[\ddot f_j^{\,r,s}(\beta)]\overset{P}{\to}0$. But, from (2.24), $E[\ddot f_j^{\,r,s}(\beta)] = -I_j^{\,r,s}(\beta)$, and since this is true for all $r$ and $s$, $X_n + I_n(\beta)\overset{P}{\to}0$. Combining this result and (2.37) gives $X_n + Y_n + I_n(\beta)\overset{P}{\to}0$, which verifies (2.32).
Finally, we examine

$$Z_n(\beta) = \frac{1}{\sqrt n}\sum_{j=1}^n\dot f_j(\beta)$$

and show that $Z_n(\beta)\overset{\mathcal V}{\sim}N_q(0,\ I_n(\beta))$. To accomplish this, we need to establish that for all $\ell\ne 0$ ($\ell\in R^q$),

$$\ell'\,Z_n(\beta)\big/\sqrt{\ell'\,I_n(\beta)\,\ell} \overset{\mathcal D}{\to} N(0,1).$$

Now, $\ell'\,Z_n(\beta) = \frac{1}{\sqrt n}\sum_{j=1}^n W_j$, where the $W_j = \ell'\dot f_j(\beta)$ are independent but generally nonidentically distributed random variables. From (2.22) and (2.23), it follows that for $j = 1,\ldots,n$, $E[W_j] = 0$ and $V[W_j] = E[W_j^2] = \ell'\,I_j(\beta)\,\ell$. Let

$$s_n^2 = \sum_{j=1}^n E[W_j^2] = n\,\ell'\,I_n(\beta)\,\ell \qquad\text{and}\qquad \rho_n = \sum_{j=1}^n E|W_j|^{2+\gamma}\Big/\Big[\sum_{j=1}^n E[W_j^2]\Big]^{(2+\gamma)/2}.$$

To apply Liapunov's CLT we must show that $\rho_n\to0$ as $n\to\infty$ for all $\ell\ne0$. Notice, however, that $\rho_n$ is scale invariant; that is, replacing $\ell$ by $c\ell$ for any $c\ne0$ leaves $\rho_n$ unchanged. For this reason, it is sufficient to show that $\rho_n\to0$ as $n\to\infty$ for all $\ell$ such that $\ell'\ell = 1$.

Since $\inf_{\ell'\ell=1}(\ell'\,I_n(\beta)\,\ell) = \mathrm{ch}_q(I_n(\beta))$, the smallest ($q$th) characteristic root of $I_n(\beta)$, assumption (2.19) implies that

$$\liminf_{n\to\infty}\big(\ell'\,I_n(\beta)\,\ell\big)^{1+\gamma/2} > 0.$$

Further,

$$E|W_j|^{2+\gamma} = E\big|\ell'\dot f_j(\beta)\big|^{2+\gamma} \le \max_{1\le s\le q}|\ell_s|^{2+\gamma}\cdot q^{2+\gamma}\cdot\max_{1\le s\le q}E\big|\dot f_j^{\,s}(\beta)\big|^{2+\gamma}$$

by the $c_r$ inequality [ref. Puri and Sen (1971, p. 11)]. By assumption (2.18), however, $E|\dot f_j^{\,s}(\beta)|^{2+\gamma} < K < \infty$ for $s = 1,\ldots,q$ and all $j = 1,\ldots,n$; therefore $E|W_j|^{2+\gamma} \le K' < \infty$ for all $j = 1,\ldots,n$. It follows that

$$\rho_n \le \frac{n\,K'}{\big[n\,\ell'\,I_n(\beta)\,\ell\big]^{(2+\gamma)/2}} = \frac{K'}{n^{\gamma/2}\big(\ell'\,I_n(\beta)\,\ell\big)^{(2+\gamma)/2}} \to 0 \qquad\text{as } n\to\infty,$$

so Liapunov's CLT yields

$$\frac{1}{s_n}\sum_{j=1}^n W_j = \ell'\,Z_n(\beta)\big/\sqrt{\ell'\,I_n(\beta)\,\ell} \overset{\mathcal D}{\to} N(0,1),$$

and since this quantity is asymptotically $N(0,1)$ for all $\ell\ne0$, we have $Z_n(\beta)\overset{\mathcal V}{\sim}N_q(0,\ I_n(\beta))$, which establishes (2.33) and completes the proof. $\square$
Remark 1: Note that $I_j(\beta)$ is a function of $\beta$ and $z_j$, which, in typical experimental settings, does not systematically vary with $j$, and that each of its elements is finite by (2.17). Further, the individuals entered into any clinical trial represent a sample from a necessarily finite, though usually very large, population of possible subjects, and therefore, as the sample size increases, $I_n(\beta) = \frac1n\sum_{j=1}^n I_j(\beta)$ converges to the corresponding population quantity, whatever that might be. This discussion implies that, in the practical application of competing risks models to clinical trials (or virtually any other experimental setting), $I_n(\beta)$ converges to a matrix of finite constants as $n$ grows large and, consequently, as in the remark following Theorem 2.1, we may write

$$\sqrt n\,(\hat\beta_n - \beta) \overset{\mathcal D}{\to} N_q\big(0,\ I^{-1}(\beta)\big).$$
Remark 2: Regularity conditions (2.18) and (2.19) can be replaced with the somewhat less restrictive:

For some $\gamma > 0$ and all $s = 1,\ldots,q$,
$$E\big|\dot f_j^{\,s}(\beta)\big|^{2+\gamma} < \infty \quad\text{for all } j = 1,\ldots,n. \quad(2.18')$$

And for all $\ell\ne0$,
$$\rho_n = \sum_{j=1}^n\nu_j^{2+\gamma}\big/s_n^{2+\gamma} \to 0 \quad\text{as } n\to\infty, \quad\text{where } \nu_j^{2+\gamma} = E|W_j|^{2+\gamma} \text{ and } s_n^2 = \sum_{j=1}^n E[W_j^2]. \quad(2.19')$$

Note that (2.18) requires uniform boundedness while (2.18') does not. However, since one can usually bound $E|\dot f_j^{\,s}(\beta)|^{2+\gamma}$ by a function of $\beta$ and $z_j$, a uniform bound can easily be obtained by taking the maximum of the appropriate quantities over the index $j$. Also, since $I_j(\beta)$ is the variance-covariance matrix of $\dot f_j(\beta) = \partial\log f_j(\beta)/\partial\beta$ [see (2.23)], it is positive semi-definite for all $j$. Additionally, if, as might be expected in practical applications, we can rule out the possibility of any dependencies among $\partial\log f_j(\beta)/\partial\beta_1,\ldots,\partial\log f_j(\beta)/\partial\beta_q$, then $I_j(\beta)$ is strictly positive definite, i.e., $\ell'\,I_j(\beta)\,\ell > 0$ for all $\ell\ne0$. Further, from Remark 1, $I_n(\beta)$ converges to the corresponding population quantity, which we denote by $I(\beta) = \lim_{n\to\infty}\frac1n\sum_{j=1}^n I_j(\beta)$. But for all $\ell\ne0$,

$$\ell'\,I(\beta)\,\ell = \lim_{n\to\infty}\ell'\,I_n(\beta)\,\ell = \lim_{n\to\infty}\Big[\frac1n\sum_{j=1}^n\ell'\,I_j(\beta)\,\ell\Big] > 0,$$

as there are only finitely many different $I_j(\beta)$ and, for each $j$, $\ell'\,I_j(\beta)\,\ell > 0$. This implies that $I(\beta)$ is also positive definite and $\mathrm{ch}_q(I(\beta)) > 0$, which yields

$$\lim_{n\to\infty}\mathrm{ch}_q\big(I_n(\beta)\big) = \mathrm{ch}_q\big(I(\beta)\big) > 0.$$

That is, condition (2.19) will be satisfied in any application that is likely to be considered. In light of this discussion, (2.18) and (2.19) appear relatively easy to verify in practice, especially since (2.19') is not a particularly simple quantity to deal with.
Remark 3: Other sets of regularity conditions are certainly possible. The Liapunov-type conditions in (2.18) and (2.19) or (2.18') and (2.19'), which are sufficient for central limit theory, can be replaced by necessary and sufficient, but generally more difficult to verify, Lindeberg-type conditions. For example, a set of regularity conditions for survival analysis $(k = 1)$ with Type I censoring is given by Basu and Ghosh (1980). Their model is of the form $T_j = \min(T_{1j},c_j)$, where, for the $j$th individual, $T_{1j}$ is the time to death from the sole cause under consideration and $c_j$ is the (fixed) censoring time. Note that this is essentially a competing risks model with $T_{2j} = c_j$ assigning all of its probability mass to a single censoring point. A comprehensive, as well as mathematically rigorous, discussion of the asymptotic properties of the mle in the case of independent but nonidentically distributed random variables is given by Inagaki (1973).
Remark 4: When no covariables are included in the model (and no observations have been censored), that is, when

$$\mu_j = E(T_j) = \mu, \qquad j = 1,\ldots,n,$$

where $\mu$ is the (common) mean vector of model (2.1), the observations $(T_j,\delta_j)$ are iid random vectors and iid central limit theory can be applied. The appropriate regularity conditions for this case are (2.14) and (2.15) with the $j$th subscript omitted and (2.16) with $\eta = 0$, while the corresponding limiting distribution is

$$\sqrt n\,(\hat\beta_n - \beta) \overset{\mathcal D}{\to} N_q\big(0,\ I^{-1}(\beta)\big).$$

Note that $I(\beta)$ is the common information matrix for all individuals, that the conditions analogous to (2.14)-(2.16) are simpler in this case, and that conditions like (2.17)-(2.19) are unnecessary.
2.5. Large Sample Tests of Hypotheses

In clinical trials, as well as most other experimental settings, tests of hypotheses with respect to certain functions of the model parameters are of fundamental, and often primary, importance. Fortunately, Theorem 2.1 and its corollaries, which will be presented in this section, provide large sample tests for any hypothesis that is likely to be considered in applications.

In the first remark following the proof of Theorem 2.1 (Section 2.4), it was noted that, in typical experimental settings, $I_n(\beta)$ will converge to a matrix of finite constants. It also follows that $n^{-1/2}\,I_n^{-1}(\beta)\to 0$ as $n\to\infty$, and using this fact in conjunction with Theorem 2.1 establishes the following corollary:

Corollary 2.2: If the conditions of Theorem 2.1 hold, and if $I_n(\beta)$ converges to a matrix of finite constants as $n\to\infty$, then $\hat\beta_n\overset{P}{\to}\beta$; that is, $\hat\beta_n$ is a consistent estimator of $\beta$.

Since $I_n(\beta)\cdot I_n^{-1}(\beta) = I_q$ is idempotent, where $I_q$ denotes the $q\times q$ identity matrix, application of distribution theory for quadratic forms [Searle (1971, p. 57)] to the result of Theorem 2.1 yields

Corollary 2.3: Under the conditions of Corollary 2.2,

$$n\,(\hat\beta_n - \beta)'\,I_n(\beta)\,(\hat\beta_n - \beta) \overset{\mathcal V}{\sim} \chi^2_q. \qquad(2.38)$$
From this result it follows that a large sample test of $H_0:\beta = \beta_0$ of (approximate) size $\alpha$ can be obtained by rejecting $H_0$ iff

$$n\,(\hat\beta_n - \beta_0)'\,I_n(\beta_0)\,(\hat\beta_n - \beta_0) > \chi^2_{1-\alpha}(q), \qquad(2.39)$$

where $\chi^2_{1-\alpha}(q)$ is the $(1-\alpha)$th quantile of the chi-square distribution with $q$ degrees of freedom.

More generally, one may wish to consider testing $H_0: C\beta = \beta_0$, where $C$ is a $p\times q$ $(p\le q)$ matrix of rank $p$. Now, from Theorem 2.1,

$$E\big[\sqrt n\,C(\hat\beta_n - \beta)\big] = 0 \qquad\text{and}\qquad V\big[\sqrt n\,C(\hat\beta_n - \beta)\big] = C\,I_n^{-1}(\beta)\,C',$$

and the next result follows immediately.

Corollary 2.4: Under the conditions of Corollary 2.2, if $C$ is a $p\times q$ $(p\le q)$ matrix of rank $p$,

$$\sqrt n\,(C\hat\beta_n - C\beta) \overset{\mathcal V}{\sim} N_p\big(0,\ C\,I_n^{-1}(\beta)\,C'\big) \qquad(2.40)$$

and

$$n\,(C\hat\beta_n - C\beta)'\big[C\,I_n^{-1}(\beta)\,C'\big]^{-1}(C\hat\beta_n - C\beta) \overset{\mathcal V}{\sim} \chi^2_p. \qquad(2.41)$$

Thus, a large sample test of the general linear hypothesis, $H_0: C\beta = \beta_0$, of (approximate) size $\alpha$ can be obtained by rejecting $H_0$ iff

$$n\,(C\hat\beta_n - \beta_0)'\big[C\,I_n^{-1}(\beta)\,C'\big]^{-1}(C\hat\beta_n - \beta_0) > \chi^2_{1-\alpha}(p). \qquad(2.42)$$
Alternatively, one may wish to base test procedures on a likelihood ratio principle. Since an appropriate limiting distribution for a simple function of the generalized likelihood ratio statistic can be obtained under the same regularity conditions that guarantee asymptotic normality of the mle [ref. Kendall and Stuart (1967, Chapter 24)], we have

Corollary 2.5: Let $\beta = (\beta_u,\beta_v)$ $(u\ge1,\ v\ge0,\ u+v = q)$ be the vector of parameters, and suppose one wishes to test $H_0:\beta_u = \beta_u^0$ against $H_1:\beta_u\ne\beta_u^0$. Let $\hat\beta = (\hat\beta_u,\hat\beta_v)$ denote the (unrestricted) ml estimators of $\beta_u$ and $\beta_v$, and let $\hat{\hat\beta}_v$ denote the mle of $\beta_v$ under $H_0$. Then the generalized likelihood ratio is given by

$$\Lambda_n = \frac{L(\beta_u^0,\hat{\hat\beta}_v;\ T,\Delta)}{L(\hat\beta_u,\hat\beta_v;\ T,\Delta)},$$

and, under the conditions of Corollary 2.2, $-2\log\Lambda_n$ has a limiting (central) chi-square distribution with $u$ degrees of freedom if $H_0$ is true.

Although the form of the null hypothesis used in Corollary 2.5 may, at first, appear somewhat restrictive, it should be noted that appropriate reparameterizations permit the testing of quite general hypotheses within this framework.

We see, then, that Theorem 2.1 and its corollaries provide large sample tests for virtually any hypothesis that will be considered in applications using either Wald (Cor. 2.3, 2.4) or likelihood ratio (Cor. 2.5) statistics. Both procedures have limiting (central) chi-square distributions under $H_0$, while the power against a specific alternative can be investigated by utilizing the appropriate noncentral chi-square distribution.
Finally, as a practical note, we add that $I_n(\beta)$ will not be known in application, and therefore an estimate is needed. A natural choice is

$$I_n(\hat\beta_n) = \frac1n\sum_{j=1}^n I_j(\hat\beta_n), \qquad(2.43)$$

where $I_j(\hat\beta_n) = -E\{\ddot f_j(\beta)\}\big|_{\beta=\hat\beta_n}$. However, in competing risks settings, the required expectations are often difficult to obtain, and the "observed" information matrix, defined as follows, is commonly used:

$$I_n^{\,o}(\hat\beta_n) = \frac1n\sum_{j=1}^n I_j^{\,o}(\hat\beta_n), \qquad(2.44)$$

where

$$I_j^{\,o}(\hat\beta_n) = -\ddot f_j(\hat\beta_n) = \Big(-\frac{\partial^2\log f_j(\beta)}{\partial\beta_r\,\partial\beta_s}\Big|_{\beta=\hat\beta_n}\Big)_{q\times q}.$$

Typically, $\log f_j(\beta)$ is a continuous function of $\beta$, so Corollary 2.2 implies that $I_n(\hat\beta_n)$ is a consistent estimator of $I_n(\beta)$. Corollary 2.6, below, states that $I_n^{\,o}(\hat\beta_n)$ also consistently estimates this quantity. Consequently, $I_n(\beta)\,[I_n(\hat\beta_n)]^{-1}\overset{P}{\to}I_q$, the $q\times q$ identity matrix (and similarly for $[I_n^{\,o}(\hat\beta_n)]^{-1}$), and from this it follows that $I_n(\beta)$ can be replaced by $I_n(\hat\beta_n)$ or $I_n^{\,o}(\hat\beta_n)$ in expressions (2.20) and (2.38)-(2.42) without altering the limiting chi-square distributions.

Corollary 2.6: Under the conditions of Corollary 2.2,

(i) $I_n^{\,o}(\beta) - I_n(\beta)\overset{P}{\to}0$, and

(ii) $I_n^{\,o}(\hat\beta_n) - I_n(\beta)\overset{P}{\to}0$.

Proof: Part (i) was actually demonstrated in the proof of Theorem 2.1 [since $X_n = -I_n^{\,o}(\beta)$] by using assumption (2.17) and applying Markov's LLN. To prove part (ii), write

$$I_n^{\,o}(\hat\beta_n) = I_n^{\,o}(\beta) + \big[I_n^{\,o}(\hat\beta_n) - I_n^{\,o}(\beta)\big],$$

and then proceed as in the proof of Theorem 2.1 (compare the quantity in $[\ ]$ to the variable $Y_n$) to use (2.16), the fact that $\hat\beta_n\overset{P}{\to}\beta$ (from Corollary 2.2), and Markov's LLN to show that the quantity in $[\ ]\overset{P}{\to}0$. Using this and the result from part (i) in the above equation establishes (ii) and completes the proof of Corollary 2.6. $\square$
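Because the expectations required by (2.43) are often intractable in competing risks settings, the observed information (2.44) is the quantity one actually computes. The sketch below approximates it by central finite differences of a user-supplied total log-likelihood; the step size and calling convention are assumptions of this illustration, not part of the original development.

```python
import numpy as np

def observed_information(loglik, beta_hat, n, h=1e-5):
    """Observed information matrix (2.44): -(1/n) * Hessian of log L_n at the mle.

    loglik   : callable, beta -> total log-likelihood log L_n(beta)
    beta_hat : (q,) maximum likelihood estimate
    n        : sample size
    h        : finite-difference step
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    q = beta_hat.size
    H = np.zeros((q, q))
    for r in range(q):
        for s in range(q):
            e_r = np.eye(q)[r] * h
            e_s = np.eye(q)[s] * h
            # central difference approximation to d^2 log L_n / d beta_r d beta_s
            H[r, s] = (loglik(beta_hat + e_r + e_s) - loglik(beta_hat + e_r - e_s)
                       - loglik(beta_hat - e_r + e_s) + loglik(beta_hat - e_r - e_s)) / (4 * h * h)
    return -H / n
```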
2.6. An Example Involving Exponential Lifetimes

The example considered here examines a competing risks model for a homogeneous population with independent exponential lifetimes. Although uncharacteristic of most clinical trial settings since it does not include covariables, this model permits an analytical determination of the ml estimators of its parameters and their exact asymptotic variances. Specifically, the model is

$$T_j = \min(T_{1j},\ldots,T_{kj}),$$

where, for $j = 1,\ldots,n$, $T_{1j},\ldots,T_{kj}$ are independent exponential random variables with means $\mu_1,\ldots,\mu_k$. Thus, the observations $(T_j,\delta_j)$ are iid random vectors, and for each subject (the subscript $j$ is omitted):

$$f_{T_i}(t) = \frac{1}{\mu_i}\,e^{-t/\mu_i}, \qquad g_i(t) = 1/\mu_i, \qquad i = 1,\ldots,k, \qquad\text{and}\qquad S_T(t) = \exp\Big(-t\sum_{i=1}^k 1/\mu_i\Big). \qquad(2.45)$$

Thus, for all $j$,

$$f_j(t_j,\delta_j;\mu) = \Big\{\prod_{i=1}^k(1/\mu_i)^{\delta_{ij}}\Big\}\exp\Big(-t_j\sum_{i=1}^k 1/\mu_i\Big), \qquad(2.46)$$

and therefore

$$\log L_n(\mu) = \sum_{j=1}^n\Big[\sum_{i=1}^k\delta_{ij}\log(1/\mu_i) - t_j\sum_{i=1}^k 1/\mu_i\Big].$$

For $i = 1,\ldots,k$ it follows that

$$\frac{\partial\log L_n(\mu)}{\partial\mu_i} = \sum_{j=1}^n\big[-\delta_{ij}/\mu_i + t_j/\mu_i^2\big], \qquad(2.47)$$

and setting this equation equal to 0 yields

$$\mu_i = \sum_{j=1}^n t_j\Big/\sum_{j=1}^n\delta_{ij}.$$

But $\sum_{j=1}^n\delta_{ij} = d_i$, the number of individuals who died of cause $c_i$, and therefore the ml estimators are

$$\hat\mu_i = \sum_{j=1}^n T_j\big/D_i, \qquad i = 1,\ldots,k.$$

To establish the asymptotic normality of these estimators we need only verify the relatively mild regularity conditions for the iid case given in Remark 4 of Section 2.4. That these conditions are satisfied follows from Section 5.4 of Chapter V where the stronger conditions for the inid case are shown to hold. Now, from (2.46) and (2.47), we have for $j = 1,\ldots,n$,

$$\frac{\partial^2\log f_j(T_j,\delta_j;\mu)}{\partial\mu_i\,\partial\mu_{i'}} = \begin{cases}\delta_{ij}/\mu_i^2 - 2T_j/\mu_i^3, & i = i'\\ 0, & i\ne i',\end{cases}$$

and so the common information matrix is diagonal with $i$th diagonal element

$$-E\Big[\frac{\partial^2\log f_j(T_j,\delta_j;\mu)}{\partial\mu_i^2}\Big] = \frac{2E(T)}{\mu_i^3} - \frac{E(\delta_i)}{\mu_i^2}, \qquad i = 1,\ldots,k.$$

But $E(\delta_i) = P(T = T_i) = \mu_i^{-1}E(T)$, and from (2.45), $E[T] = \big[\sum_{\ell=1}^k\mu_\ell^{-1}\big]^{-1}$, so the $i$th diagonal element reduces to $\mu_i^{-3}\big[\sum_{\ell=1}^k\mu_\ell^{-1}\big]^{-1}$. Therefore, by Theorem 2.1,

$$\hat\mu_i = \sum_{j=1}^n T_j\big/D_i$$

is approximately normally distributed with mean $\mu_i$ and variance $\mu_i^3\sum_{\ell=1}^k\mu_\ell^{-1}\big/n$.
A more general model that assumes independent, exponentially distributed risks but allows the incorporation of an arbitrary number of covariables for each risk is presented in Section 5.4, while an exponential model with dependent risks is examined in Section 5.5.
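A short numerical sketch of this example follows: it computes the closed-form estimates $\hat\mu_i = \sum_j T_j/D_i$ and plug-in approximate standard errors from the asymptotic variance $\mu_i^3\sum_\ell\mu_\ell^{-1}/n$ derived above, assuming complete (uncensored) data; the data values are hypothetical.

```python
import numpy as np

def exponential_mle(t, delta):
    """MLEs and approximate SEs for the independent exponential competing risks model.

    t     : (n,) observed lifetimes T_j
    delta : (n, k) cause-of-death indicators
    """
    t = np.asarray(t, dtype=float)
    d = delta.sum(axis=0)                    # D_i: number of deaths from cause i
    n = t.size
    mu_hat = t.sum() / d                     # hat{mu}_i = sum_j T_j / D_i
    # asymptotic variance mu_i^3 * sum_l (1/mu_l) / n, evaluated at the estimates
    var_hat = mu_hat ** 3 * (1.0 / mu_hat).sum() / n
    return mu_hat, np.sqrt(var_hat)

t = np.array([1.2, 0.7, 2.5, 0.4, 1.9])
delta = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [0, 1]])
mu_hat, se = exponential_mle(t, delta)
print(mu_hat, se)
```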
2.7. Withdrawals and Type I Censoring

So far, this chapter has provided a general framework for handling competing risks data when the study design calls for a single analysis at the experiment's conclusion, and when all lifetimes have been observed. As mentioned in Chapter I, due to time limitations and other practical considerations, statistical techniques for analyzing clinical trial data must permit censoring as well as withdrawals. In this section it is shown that withdrawals and Type I (fixed point) censoring can be handled within the framework already presented. A discussion of Type II (order statistic) censoring is postponed until Chapter III where it is shown to be a special case of the time-sequential procedure developed there.

2.7.1. Withdrawals

Any subject who does not remain in the study until he dies from one of the $k$ risks under investigation, or until the study is terminated, is called a "withdrawal." This includes subjects who die from any cause other than the $k$ causes being investigated, those who simply refuse to continue, and those who, for any reason, are unable to complete the study. It should be emphasized that withdrawals are random phenomena since they are not planned before the trial begins but occur randomly throughout its course. For this reason withdrawals must be distinguished from censored observations, which result from the design of the study.

Since the time to withdrawal is a random variable, withdrawals are appropriately handled by using a competing risks model in which they are incorporated as one of the risks. That is, if $k$ risks are of primary interest, one formulates a model with $(k+1)$ risks in which the actual "lifetime," $T$, is given by $T = \min(T_1,\ldots,T_k,T_w)$, where $T_1,\ldots,T_k$ represent the usual underlying lifetimes, and $T_w$ denotes the time to withdrawal. Thus, withdrawals fit into the competing risks framework already presented in this chapter and require no special treatment. Note that, if desired, one could distinguish between different types of withdrawals by increasing the number of risks in the model, and that, like any other lifetime, the mean time to withdrawal can be expressed as a function of covariables. In particular, a treatment indicator can be included and the hypothesis of no treatment effect tested.

Although handling withdrawals in the manner described above is preferred, it requires that the distribution of the time to withdrawal be specified. For this reason, some researchers, especially those who wish to analyze data strictly with survival models $(k = 1)$, treat withdrawals as (Type I) censored observations. This practice, however, implicitly assumes that treatment has no effect on withdrawal times, which are then viewed as predetermined censoring times.
2.7.2. Type I Censoring

As discussed in Section 1.3, Type I censoring (also called truncation) is commonly used to meet the time limitations imposed on most experimentation, and if entry into the trial is staggered, then truncation of the trial at time $c$, say, leads to individual censoring times,

$$c_j = c - t_{0j}, \qquad j = 1,\ldots,n, \qquad(2.48)$$

for the trial participants, where $t_{0j}$ denotes the entry time of the $j$th individual. It is this general form of Type I censoring that we consider in this section.

Using notation which is similar to that presented in Chapter III, denote the censored observations by $(T_{j:c},\Delta_{j:c})$, where, for $j = 1,\ldots,n$,

$$T_{j:c} = \begin{cases}T_j, & \text{if } T_j < c_j\\ c_j, & \text{if } T_j \ge c_j\end{cases} \qquad\text{and}\qquad \Delta_{j:c} = \begin{cases}\Delta_j, & \text{if } T_j < c_j\\ 0, & \text{if } T_j \ge c_j.\end{cases} \qquad(2.49)$$

Thus, when $T_j < c_j$ the actual lifetime $(T_j)$ and cause of death $(\Delta_j)$ are observed, while $T_{j:c} = c_j$ indicates that the $j$th individual has a lifetime of at least $c_j$ units and that the cause of death is unknown $(\Delta_{j:c} = 0)$. Noting that $T_{j:c} = \min(T_{1j},\ldots,T_{kj},c_j)$, it follows that the density of $(T_{j:c},\Delta_{j:c})$ is given by

$$f^*_{j:c}(t_{j:c},\delta_{j:c};\beta) = \begin{cases}f_{T_j,\Delta_j}(t_{j:c},\delta_{j:c}), & \text{if } t_{j:c} < c_j \text{ and } \delta_{j:c}\in E_k\\ S_{T_j}(c_j), & \text{if } t_{j:c} = c_j \text{ and } \delta_{j:c} = 0\\ 0, & \text{otherwise.}\end{cases} \qquad(2.50)$$
Further, let

$$L^*_{n,c}(\beta) = \prod_{j=1}^n f^*_{j:c}(\beta)$$

denote the likelihood function for this truncated sample, let

$$\dot f^{*r}_{j:c}(\beta) = \frac{\partial\log f^*_{j:c}(\beta)}{\partial\beta_r} \qquad\text{and}\qquad \ddot f^{*r,s}_{j:c}(\beta) = \frac{\partial^2\log f^*_{j:c}(\beta)}{\partial\beta_r\,\partial\beta_s}, \qquad r,s = 1,\ldots,q,$$

and define the truncated information matrix at time $c$ by

$$I_{(c)}(\beta) = \frac1n\sum_{j=1}^n I_{j:c}(\beta), \qquad(2.51)$$

where $I_{j:c}(\beta) = E\big[\dot f^*_{j:c}(\beta)\big]\big[\dot f^*_{j:c}(\beta)\big]'$, $j = 1,\ldots,n$. Finally, let $\hat\beta_{(c)}$ denote the mle of $\beta$ based on the truncated sample, i.e., based on $L^*_{n,c}(\beta)$.
Using this notation the following corollary can be stated:

Corollary 2.7: Assume that the parameter space can be restricted to a compact subset of $R^q$, that $c < \infty$, that conditions (2.14)-(2.18) hold, and that, analogous to (2.19),

$$\liminf_{n\to\infty}\ \mathrm{ch}_q\big(I_{(c)}(\beta)\big) > 0. \qquad(2.19')$$

Then, with probability approaching 1 as $n$ grows large, there exists a unique mle, $\hat\beta_{(c)}$, in a small neighborhood of the true parameter value, $\beta$, and

$$\sqrt n\,(\hat\beta_{(c)} - \beta) \overset{\mathcal V}{\sim} N_q\big(0,\ I^{-1}_{(c)}(\beta)\big); \qquad(2.52)$$

that is, for every $\ell\ne0$,

$$\frac{\sqrt n\,\ell'(\hat\beta_{(c)} - \beta)}{\sqrt{\ell'\,I^{-1}_{(c)}(\beta)\,\ell}} \overset{\mathcal D}{\to} N(0,1). \qquad(2.53)$$
Proof: Using (2.15), from the proofs of Lemmas 3.2 and 3.7 of Chapter III, we obtain for $r,s = 1,\ldots,q$,

$$E\big[\dot f^{*r}_{j:c}(\beta)\big] = 0 \qquad\text{and}\qquad E\big[\dot f^{*r}_{j:c}(\beta)\,\dot f^{*s}_{j:c}(\beta)\big] = -E\big[\ddot f^{*r,s}_{j:c}(\beta)\big] = I^{\,r,s}_{j:c}(\beta). \qquad(2.54)$$

Also, in the proof of Theorem 3.1 it will be shown that (2.18) implies that for some $\gamma > 0$ and each $s = 1,\ldots,q$,

$$E\big|\dot f^{*s}_{j:c}(\beta)\big|^{2+\gamma} < K < \infty \quad\text{for all } j = 1,\ldots,n. \qquad(2.18')$$

Similarly, from Lemma 3.6 of Chapter III and (2.16)-(2.17) it follows that for some $\eta > 0$ and all $r,s = 1,\ldots,q$, $j = 1,\ldots,n$,

$$E\Big\{\sup_{\beta^*(\epsilon)}\big|\ddot f^{*r,s}_{j:c}(\beta^*) - \ddot f^{*r,s}_{j:c}(\beta)\big|\Big\}^{1+\eta} < \infty, \quad\text{where } \beta^*(\epsilon) = \{\beta^*:\ \|\beta^*-\beta\| < \epsilon\}, \qquad(2.16')$$

and for some $\delta > 0$ and all $r,s = 1,\ldots,q$,

$$E\big|\ddot f^{*r,s}_{j:c}(\beta)\big|^{1+\delta} < K^* < \infty \quad\text{for all } j = 1,\ldots,n. \qquad(2.17')$$

Using (2.54) and (2.16')-(2.19'), Corollary 2.7 follows from the proof of Theorem 2.1 with $(T_j,\Delta_j)$, $f_j(\beta)$, $\dot f^{\,r}_j(\beta)$, $\ddot f^{\,r,s}_j(\beta)$, $I_n(\beta)$, and $\hat\beta_n$ replaced by their truncated versions, $(T_{j:c},\Delta_{j:c})$, $f^*_{j:c}(\beta)$, $\dot f^{*r}_{j:c}(\beta)$, $\ddot f^{*r,s}_{j:c}(\beta)$, $I_{(c)}(\beta)$, and $\hat\beta_{(c)}$. $\square$

Remark: In practical applications the parameter space can always be restricted to a compact subset of $R^q$ (see the remark following Theorem 3.5 of Chapter III). Also, from (2.51), $I_{j:c}(\beta)$ is the variance-covariance matrix of $\dot f^*_{j:c}(\beta) = (\partial/\partial\beta)\log f^*_{j:c}(\beta)$, and, using a similar argument to the one given in Remark 2 of Section 2.4, it follows that assumption (2.19') will be satisfied in any application that is likely to be considered.
Corollary 2.7 can also be used to provide large sample tests of hypotheses with respect to $\beta$ (or certain functions of $\beta$) for the case of Type I censoring by using results analogous to Corollaries 2.2-2.5 of Section 2.5. Since $I_{(c)}(\beta)$ will not usually be known in applications, it must be replaced by a consistent estimator, which does not affect the asymptotic results. For this purpose it is typically easiest to use the "observed" truncated information matrix,

$$I^{\,o}_{(c)}(\hat\beta_{(c)}) = \frac1n\sum_{j=1}^n I^{\,o}_{j:c}(\hat\beta_{(c)}), \qquad(2.55)$$

where

$$I^{\,o}_{j:c}(\hat\beta_{(c)}) = \Big(-\frac{\partial^2\log f^*_{j:c}(\beta)}{\partial\beta_r\,\partial\beta_s}\Big|_{\beta=\hat\beta_{(c)}}\Big)_{q\times q}.$$

Lemma 3.7 of Chapter III shows that $I^{\,o}_{(c)}(\hat\beta_{(c)})$ is, in fact, a consistent estimator of $I_{(c)}(\beta)$.
CHAPTER III

TIME-SEQUENTIAL PROCEDURES FOR PARAMETRIC COMPETING RISKS MODELS

3.1. Introduction

As discussed in Chapter I, time-sequential procedures, which permit the early termination of clinical trials based on accumulated statistical evidence, must be developed in order to achieve an efficient design and analysis of (longer running) trials with respect to both economic and ethical considerations. Since, as previously noted, the current competing risks literature is totally lacking in this regard, the development of suitable techniques for competing risks settings seems important.

Fortunately, Sen (1976) and Sen and Tsong (1981) have obtained useful time-sequential procedures for a broad class of (parametric) survival distributions which constitute a special case of the parametric competing risks models under consideration. In his 1976 paper, Sen proposes to monitor the experiment over the time interval $[0,T]$ by basing the decision regarding possible early termination at time $t\in(0,T)$ on the order statistics which have been observed prior to time $t$. This approach, however, relies heavily on the properties of order statistics generated by a sequence of independent and identically distributed random variables. Since, as pointed out in Chapter II, the random vectors arising from a competing risks framework are independent but generally nonidentically distributed, this approach is not considered further.

Instead, the progressively truncated scheme (PTS) developed by Sen and Tsong (1981) is appropriately modified for competing risks models in Section 3.2 where the main theorem of this chapter is also stated.
It is important to note that this approach retains the inde-
pendence of the truncated observations, an essential feature which is
exploited in the proof of the main theorem given in Section 3.3.
Sec-
tion 3.4 contains some remarks on this theorem and its assumptions,
while its application to clinical trials and life testing experiments
in general is discussed in Section 3.5.
A time-sequential procedure
for the case when "nuisance" parameters are present is developed in
Section 3.6, and Section 3.7 shows how to modify the progressively
truncated tests of Sections 3.5 and 3.6 when entry into the study is
"staggered."
Type II censoring, which specifies that the experiment is
to be terminated after a predetermined number of deaths (failures) has
occurred, is considered in Section 3.8.
Examples are deferred until
Chapter V.
3.2. Preliminary Notions and the Main Theorem

At this point we wish to develop a time-sequential procedure in order to test

$$H_0:\ \beta = \beta_0\ (\text{specified}) \qquad\text{against}\qquad H_1:\ \beta\ne\beta_0 \qquad(3.1)$$

with respect to the competing risks model based on the underlying distributions specified in (2.1) or (2.6).

For this purpose we consider the approach based on progressively truncated likelihood ratio statistics (PTLRS) advanced by Sen and Tsong (1981) for life testing situations $(k = 1)$. The essential differences between their setting and the competing risks framework are: (1) indicator random variables denoting the cause of death $(\Delta_j)$, as well as the usual random variable denoting the actual lifetime $(T_j)$, must be included, (2) the observable random variables, $(T_j,\Delta_j)$, $j = 1,\ldots,n$, are independent but generally nonidentically distributed, and (3) at least $k\ (>1)$ parameters need to be considered. Nevertheless, although competing risks models require a more general setting than that considered by Sen and Tsong, it will be shown that their basic approach can be extended to obtain a similar invariance principle.

Recall that for $j = 1,\ldots,n$, the pdf of $(T_j,\Delta_j)$ is denoted by

$$f_j(\beta) \equiv f_j(t_j,\delta_j) \equiv f_j(t_j,\delta_j;\beta) \equiv f_{T_j,\Delta_j}(t_j,\delta_j;\beta) = \Big\{\prod_{i=1}^k\big[g_i(t_j;\beta)\big]^{\delta_{ij}}\Big\}\,S_{T_j}(t_j;\beta), \qquad(3.2)$$

while the information matrix, $I_j(\beta)$, is defined by

$$I_j(\beta) = E\big[\dot f_j(\beta)\,\dot f_j'(\beta)\big], \qquad(3.3)$$

where $\beta$ denotes the $(q\times1)$ vector of parameters as given in (2.12) and, for a nonnegative $h(\cdot;\beta)$, $\dot h(\cdot;\beta) = (\partial/\partial\beta)\log h(\cdot;\beta)$ and $\ddot h(\cdot;\beta) = (\partial^2/\partial\beta\,\partial\beta')\log h(\cdot;\beta)$.
We envision monitoring the experiment over the time interval $[0,X]$, where $X < \infty$ denotes the ultimate time at which the experiment is designed to end if early termination is impossible, and introduce the following truncated random variables:

$$T_{j:x} = \begin{cases}T_j, & \text{if } T_j < x\\ x, & \text{if } T_j \ge x\end{cases} \qquad\text{and}\qquad \Delta_{j:x} = \begin{cases}\Delta_j, & \text{if } T_j < x\\ 0, & \text{if } T_j \ge x,\end{cases} \qquad j = 1,\ldots,n;\ x\in[0,\infty). \qquad(3.4)$$

Note that the random vectors $(T_{j:x},\Delta_{j:x})$, $j = 1,\ldots,n$, correspond to the information available when the trial is truncated at time $x\ (\ge0)$. Specifically, when $T_j < x$ the actual lifetime $(T_j)$ and cause of death $(\Delta_j)$ are observed, while $T_{j:x} = x$ indicates that the $j$th individual has a lifetime of at least $x$ time units and that the cause of death is unknown $(\Delta_{j:x} = 0)$. Also, since $T_{j:x} = \min(T_{1j},\ldots,T_{kj},x)$, the above definitions imply that the density of $(T_{j:x},\Delta_{j:x})$ can be written as:

$$f^*_{j:x}(t_{j:x},\delta_{j:x};\beta) = f^*_{T_{j:x},\Delta_{j:x}}(t_{j:x},\delta_{j:x};\beta) = \begin{cases}f_{T_j,\Delta_j}(t_{j:x},\delta_{j:x}), & \text{if } t_{j:x} < x \text{ and } \delta_{j:x}\in E_k\\ S_j(x), & \text{if } t_{j:x} = x \text{ and } \delta_{j:x} = 0\\ 0, & \text{otherwise}\end{cases} \qquad(3.5)$$

for $j = 1,\ldots,n$ and $x\in[0,\infty)$, where $S_j(t) = 1 - F_j(t)$, $F_j(t) = F_{T_j}(t)$ denotes the cdf of $T_j$, $E_k = \{(1,0,\ldots,0)_{1\times k},\ldots,(0,\ldots,0,1)\}$, and $(t_{j:x},\delta_{j:x})$ denotes a realization of $(T_{j:x},\Delta_{j:x})$. Whenever convenient, the following will be used interchangeably to denote $f^*_{T_{j:x},\Delta_{j:x}}(t_{j:x},\delta_{j:x};\beta)$: $f^*_{j:x}(\beta)$, $f^*_{j:x}(t_{j:x},\delta_{j:x})$, and $f^*_{j:x}(T_{j:x},\Delta_{j:x})$. Also note that if $T_{j:x} < x$, then $f^*_{j:x}(T_{j:x},\Delta_{j:x}) = f_j(T_j,\Delta_j)$.

Since the $(T_{j:x},\Delta_{j:x})$, $j = 1,\ldots,n$, are independent, the likelihood function for these truncated random vectors is given by

$$L^*_{n,x}(\beta) = \prod_{j=1}^n f^*_{j:x}(\beta), \qquad x\in[0,\infty). \qquad(3.6)$$

This likelihood function is based on truncation at a given time $x$, but since the trial is to be monitored from its beginning, we consider a stochastic process based on $L^*_{n,x}(\beta)$ as $x$ varies over $[0,X]$. Specifically, for every $n\ (\ge1)$ and $X\ (>0)$, define the vector-valued process

$$W^X_{n,\beta} = \big\{W^X_{n,\beta}(x),\ 0\le x\le X\big\}, \qquad(3.7)$$

where

$$W^X_{n,\beta}(x) = n^{-1/2}\,\frac{\partial}{\partial\beta}\log L^*_{n,x}(\beta) = n^{-1/2}\sum_{j=1}^n\dot f^*_{j:x}(\beta), \qquad \dot f^*_{j:x}(\beta) = (\partial/\partial\beta)\log f^*_{j:x}(\beta), \qquad(3.8)$$

and the $r$th component of $\dot f^*_{j:x}(\beta)$ is $\dot f^{*r}_{j:x}(\beta) = (\partial/\partial\beta_r)\log f^*_{j:x}(\beta)$, $r = 1,\ldots,q$. Finally, the truncated information matrix at time $x$ for the $j$th individual, $I_{j:x}(\beta)$, is defined for every $x\in[0,\infty)$ as follows:

$$I_{j:x}(\beta) = E\big[\dot f^*_{j:x}(\beta)\big]\big[\dot f^*_{j:x}(\beta)\big]' = \Big(E\big[\dot f^{*r}_{j:x}(\beta)\,\dot f^{*s}_{j:x}(\beta)\big]\Big)_{q\times q}. \qquad(3.9)$$

Also, let

$$I_n(\beta) = \frac1n\sum_{j=1}^n I_j(\beta) \qquad\text{and}\qquad I_{(x)}(\beta) = \frac1n\sum_{j=1}^n I_{j:x}(\beta). \qquad(3.10)$$
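As a small illustration of (3.4), the sketch below forms the progressively truncated data $(T_{j:x},\Delta_{j:x})$ available at a monitoring time $x$ from hypothetical complete data; it is the data-management step underlying (3.5)-(3.8).

```python
import numpy as np

def truncate_at(x, t, delta):
    """Return (T_{j:x}, Delta_{j:x}) as in (3.4): subjects alive at x are censored at x."""
    t = np.asarray(t, dtype=float)
    observed = t < x
    t_x = np.where(observed, t, x)                    # T_{j:x}
    delta_x = np.where(observed[:, None], delta, 0)   # Delta_{j:x} = 0 if T_j >= x
    return t_x, delta_x

t = np.array([1.2, 0.7, 2.5, 0.4, 1.9])
delta = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [0, 1]])
print(truncate_at(1.0, t, delta))
```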
Now that the required notation has been introduced, we are in a position to investigate the weak convergence of $W^X_{n,\beta}$, and for this purpose make the following assumptions:

For $j = 1,\ldots,n$, $t_j > 0$, $\delta_j\in E_k$, and for all $\beta\in R^q$ (or an open subset of $R^q$ if a restricted parameter space is indicated), $f_j(t_j,\delta_j;\beta) > 0$ and has (continuous) first and second order partial derivatives with respect to $\beta$. (3.11)

For $r,s = 1,\ldots,q$, $j = 1,\ldots,n$, and all $\beta\in R^q$,
$$\Big|\frac{\partial}{\partial\beta_r}f_j(t_j,\delta_j;\beta)\Big| \le U^{\,r}_j(t_j) \quad\text{and}\quad \Big|\frac{\partial^2}{\partial\beta_r\partial\beta_s}f_j(t_j,\delta_j;\beta)\Big| \le U^{\,r,s}_j(t_j),$$
where $\int_0^\infty U^{\,r}_j(u)\,du < \infty$ and $\int_0^\infty U^{\,r,s}_j(u)\,du < \infty$. (3.12)

For some $\gamma > 0$ and $s = 1,\ldots,q$,
$$E\big|\dot f^{\,s}_j(T_j,\Delta_j;\beta)\big|^{2+\gamma} < K < \infty \quad\text{for all } j = 1,\ldots,n. \quad(3.13)$$

For any given $u_1,\ldots,u_m$, $m\ge1$, such that $0\le u_1<\cdots<u_m\le X$,
$$\liminf_{n\to\infty}\ \mathrm{ch}_{mq}(\Sigma_n) > 0, \quad(3.14)$$
where $\mathrm{ch}_{mq}(\Sigma_n)$ denotes the smallest ($mq$th) characteristic root of

$$\Sigma_n = \begin{pmatrix}I_{(u_1)}(\beta) & I_{(u_1)}(\beta) & I_{(u_1)}(\beta) & \cdots & I_{(u_1)}(\beta)\\ I_{(u_1)}(\beta) & I_{(u_2)}(\beta) & I_{(u_2)}(\beta) & \cdots & I_{(u_2)}(\beta)\\ I_{(u_1)}(\beta) & I_{(u_2)}(\beta) & I_{(u_3)}(\beta) & \cdots & I_{(u_3)}(\beta)\\ \vdots & & & & \vdots\\ I_{(u_1)}(\beta) & I_{(u_2)}(\beta) & I_{(u_3)}(\beta) & \cdots & I_{(u_m)}(\beta)\end{pmatrix}_{mq\times mq}. \qquad(3.14a)$$
For $r = 1,\ldots,q$, some $c > 0$, some $0 < K < \infty$, and sufficiently large $n$,
$$K^{(n)}_{\ell r}(\delta) \le c\,\delta \quad\text{for every } 0<\delta<1 \text{ and all } \ell\in L = \{0,1,\ldots,[X/\delta]\}, \quad(3.15)$$
where $K^{(n)}_{\ell r}(\delta)$ denotes the increment over $[\ell\delta,\ (\ell+1)\delta\wedge X]$ of the $r$th diagonal element of $I_{(x)}(\beta)$.

Note that

$$W^X_{n,\beta_r}(x) = n^{-1/2}\,\frac{\partial}{\partial\beta_r}\log L^*_{n,x}(\beta), \qquad r = 1,\ldots,q,$$

so that $W^X_{n,\beta}$ is a vector-valued stochastic process with a continuous time parameter, although none of its components, $W^X_{n,\beta_r}(x)$, $r = 1,\ldots,q$, is necessarily a continuous function of $x$. Specifically, the points of discontinuity of $W^X_{n,\beta}$ include $T_{n,1}\le\cdots\le T_{n,R(X)}$, the order statistics corresponding to those of $T_1,\ldots,T_n$ observed before time $X$, where $R(X)$ is the number of deaths occurring before time $X\ (>0)$. Thus, for every $n\ (\ge1)$ and $\beta\in R^q$, $W^X_{n,\beta}$ belongs to the $D^q[0,X]$ space endowed with the (extended) Skorokhod $J_1$ topology.

Finally, let $W^*_n = \{W^*_n(x),\ 0\le x\le X\}$ be a $q$-variate Gaussian function on $[0,X]$ such that $EW^*_n = 0$ and

$$E\big[W^*_n(u)\big]\big[W^*_n(v)\big]' = I_{(u\wedge v)}(\beta) \qquad\text{for every } (u,v):\ 0\le u,v\le X,$$

where $u\wedge v = \min(u,v)$ and $I_{(x)}(\beta)$ has been defined in (3.10); note that $W^*_n$ depends on $n$ through $I_{(x)}(\beta)$. Then we have the following basic invariance principle:

Theorem 3.1: Under $P_\beta$ and assumptions (3.11)-(3.15), $W^X_{n,\beta}$ is convergent equivalent in distribution to $W^*_n$ in the (extended) $J_1$ topology on $D^q[0,X]$.

Remark: This means that the distributions of $W^X_{n,\beta}$ and $W^*_n$ are asymptotically equivalent, though neither sequence necessarily converges to a limit. In practical applications, however, such limits will exist and we may write

$$W^X_{n,\beta} \overset{\mathcal D}{\to} W^* \quad\text{in the (extended) } J_1 \text{ topology on } D^q[0,X], \qquad(3.16)$$

where $W^* = \lim_{n\to\infty}W^*_n$.
3.3. Proof of Theorem 3.1

Before proceeding with the main proof, it will be convenient to establish several results, given in the following three lemmas:

Lemma 3.2: Under the regularity conditions of Section 3.2,

$$E\big[W^X_{n,\beta}(x)\big] = 0_{q\times1} \qquad\text{for each } x\in[0,X], \qquad(3.17)$$

and for every $(u,v):\ 0\le u,v\le X$,

$$E\big[W^X_{n,\beta}(u)\big]\big[W^X_{n,\beta}(v)\big]' = I_{(u\wedge v)}(\beta). \qquad(3.18)$$

Proof: For each $x\in[0,X]$, the $r$th component of $E[W^X_{n,\beta}(x)]$ is

$$E\big[W^X_{n,\beta_r}(x)\big] = \frac{1}{\sqrt n}\sum_{j=1}^n\Big\{\sum_{\delta_j\in E_k}\int_0^x\frac{\partial}{\partial\beta_r}\log f_j(t_j,\delta_j)\cdot f_j(t_j,\delta_j)\,dt_j + \frac{\partial}{\partial\beta_r}\log S_j(x)\cdot S_j(x)\Big\} \qquad(3.19)$$

from (3.4) and (3.5). Since, by (3.12), the operations of differentiation and integration may be interchanged,

$$E\big[W^X_{n,\beta_r}(x)\big] = \frac{1}{\sqrt n}\sum_{j=1}^n\Big\{\frac{\partial}{\partial\beta_r}\int_0^x f_{T_j}(t_j)\,dt_j - \frac{\partial}{\partial\beta_r}F_j(x)\Big\} = 0. \qquad(3.20)$$

Since (3.20) is true for every $r = 1,\ldots,q$, (3.17) holds.

Now, for any $r,s = 1,\ldots,q$, the $(r,s)$th element of $E[W^X_{n,\beta}(u)][W^X_{n,\beta}(v)]'$ is given by

$$\frac1n\sum_{j=1}^n E\big[\dot f^{*r}_{j:u}(\beta)\,\dot f^{*s}_{j:v}(\beta)\big] \qquad(3.21)$$

by the independence of $(T_j,\Delta_j)$, $j = 1,\ldots,n$, and (3.17). Suppose $u < v$. Then (3.4) and (3.5) yield

$$\dot f^{*r}_{j:u}(T_{j:u},\Delta_{j:u})\cdot\dot f^{*s}_{j:v}(T_{j:v},\Delta_{j:v}) = \begin{cases}\dot f^{\,r}_j(T_j,\Delta_j)\,\dot f^{\,s}_j(T_j,\Delta_j), & \text{if } T_j<u \text{ and } \Delta_j\in E_k\\[2pt] \big[\frac{\partial}{\partial\beta_r}\log S_j(u)\big]\,\dot f^{\,s}_j(T_j,\Delta_j), & \text{if } u\le T_j<v \text{ and } \Delta_j\in E_k\\[2pt] \big[\frac{\partial}{\partial\beta_r}\log S_j(u)\big]\big[\frac{\partial}{\partial\beta_s}\log S_j(v)\big], & \text{if } T_j\ge v \text{ and } \Delta_{j:v} = 0.\end{cases}$$

Thus,

$$E\big[\dot f^{*r}_{j:u}(\beta)\,\dot f^{*s}_{j:v}(\beta)\big] = \sum_{\delta_j\in E_k}\int_0^u\dot f^{\,r}_j(t_j,\delta_j)\,\dot f^{\,s}_j(t_j,\delta_j)\,f_j(t_j,\delta_j)\,dt_j + \Big[\frac{\partial}{\partial\beta_r}\log S_j(u)\Big]\sum_{\delta_j\in E_k}\int_u^v\dot f^{\,s}_j(t_j,\delta_j)\,f_j(t_j,\delta_j)\,dt_j + \Big[\frac{\partial}{\partial\beta_r}\log S_j(u)\Big]\Big[\frac{\partial}{\partial\beta_s}\log S_j(v)\Big]S_j(v). \qquad(3.22)$$

But, by assumption (3.12),

$$\sum_{\delta_j\in E_k}\int_u^v\dot f^{\,s}_j(t_j,\delta_j)\,f_j(t_j,\delta_j)\,dt_j = \frac{\partial}{\partial\beta_s}\int_u^v f_{T_j}(t_j)\,dt_j = \frac{\partial}{\partial\beta_s}S_j(u) - \frac{\partial}{\partial\beta_s}S_j(v),$$

so (3.22) becomes

$$\sum_{\delta_j\in E_k}\int_0^u\dot f^{\,r}_j(t_j,\delta_j)\,\dot f^{\,s}_j(t_j,\delta_j)\,f_j(t_j,\delta_j)\,dt_j + \Big[\frac{\partial}{\partial\beta_r}\log S_j(u)\Big]\Big[\frac{\partial}{\partial\beta_s}\log S_j(u)\Big]S_j(u) = E\big[\dot f^{*r}_{j:u}(\beta)\,\dot f^{*s}_{j:u}(\beta)\big]. \qquad(3.23)$$

On the other hand, if $v < u$, a similar argument yields $E[\dot f^{*r}_{j:u}(\beta)\,\dot f^{*s}_{j:v}(\beta)] = E[\dot f^{*r}_{j:v}(\beta)\,\dot f^{*s}_{j:v}(\beta)]$. Thus, from (3.21) the $(r,s)$th element of $E[W^X_{n,\beta}(u)][W^X_{n,\beta}(v)]'$ is $I^{\,r,s}_{(u\wedge v)}(\beta)$, and since this result holds for every $r,s = 1,\ldots,q$, (3.18) is established, which completes the proof of Lemma 3.2. $\square$
Let $B_{n:x}$ be the $\sigma$-field generated by $\{(T_{j:x},\Delta_{j:x}),\ j = 1,\ldots,n\}$, $x > 0$, and let $B_{n:0}$ be the trivial $\sigma$-field. Then, for every $n\ (\ge1)$, $B_{n:x}$ is nondecreasing in $x\ (\ge0)$.

Lemma 3.3: For every $r = 1,\ldots,q$, $\{W^X_{n,\beta_r}(x),\ B_{n:x},\ x\ge0\}$ is a martingale.

Proof: For $0\le w\le x$, (3.4) and (3.5) give

$$\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x}) = \dot f^{\,r}_j(T_j,\Delta_j) = \dot f^{*r}_{j:w}(T_{j:w},\Delta_{j:w}), \qquad\text{if } T_j < w \text{ and } \Delta_j\in E_k, \qquad(3.24)$$

$$\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x}) = \dot f^{\,r}_j(T_j,\Delta_j), \qquad\text{if } w\le T_j < x \text{ and } \Delta_j\in E_k, \qquad(3.25)$$

and

$$\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x}) = \frac{\partial}{\partial\beta_r}\log S_j(x), \qquad\text{if } T_{j:x} = x \text{ and } \Delta_{j:x} = 0. \qquad(3.26)$$

For $w\le x$,

$$E\big(W^X_{n,\beta_r}(x)\ \big|\ B_{n:w}\big) = \frac{1}{\sqrt n}\sum_{j=1}^n E\big\{\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x})\ \big|\ (T_{j:w},\Delta_{j:w})\big\}. \qquad(3.27)$$

Now, for $T_{j:w} < w\le x$ and $\Delta_{j:w}\in E_k$, $(T_{j:w},\Delta_{j:w}) = (T_j,\Delta_j)$, so (3.24) gives

$$E\big\{\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x})\ \big|\ (T_{j:w},\Delta_{j:w})\big\} = \dot f^{*r}_{j:w}(T_{j:w},\Delta_{j:w}). \qquad(3.28)$$

On the other hand, for $T_{j:w} = w\le x$ (and $\Delta_{j:w} = 0$), the conditional distribution of $(T_{j:x},\Delta_{j:x})$ given $T_j\ge w$ is

$$\begin{cases}f_j(t_{j:x},\delta_{j:x})/S_j(w), & \text{if } t_{j:x} < x \text{ and } \delta_{j:x}\in E_k\\ S_j(x)/S_j(w), & \text{if } t_{j:x} = x \text{ and } \delta_{j:x} = 0,\end{cases} \qquad(3.29)$$

and in this case, from (3.25), (3.26), and (3.29),

$$E\big\{\dot f^{*r}_{j:x}(T_{j:x},\Delta_{j:x})\ \big|\ T_j\ge w\big\} = \sum_{\delta_j\in E_k}\int_w^x\dot f^{\,r}_j(t_j,\delta_j)\,\frac{f_j(t_j,\delta_j)}{S_j(w)}\,dt_j + \frac{\partial}{\partial\beta_r}\log S_j(x)\cdot\frac{S_j(x)}{S_j(w)},$$

and using (3.12),

$$= \frac{1}{S_j(w)}\Big[\frac{\partial}{\partial\beta_r}\int_w^x f_{T_j}(t_j)\,dt_j + \frac{\partial}{\partial\beta_r}S_j(x)\Big] = \frac{1}{S_j(w)}\,\frac{\partial}{\partial\beta_r}S_j(w) = \frac{\partial}{\partial\beta_r}\log S_j(w). \qquad(3.30)$$

So (3.27), (3.28), and (3.30) yield

$$E\big(W^X_{n,\beta_r}(x)\ \big|\ B_{n:w}\big) = \frac{1}{\sqrt n}\sum_{j=1}^n\Big\{\dot f^{\,r}_j(T_j,\Delta_j)\cdot I(T_j<w;\ \Delta_j\in E_k) + \frac{\partial}{\partial\beta_r}\log S_j(w)\cdot I(T_{j:w}=w;\ \Delta_{j:w}=0)\Big\} = \frac{1}{\sqrt n}\sum_{j=1}^n\dot f^{*r}_{j:w}(T_{j:w},\Delta_{j:w}) = W^X_{n,\beta_r}(w).$$

Since the above argument is valid for all $r = 1,\ldots,q$, the proof of Lemma 3.3 is complete. $\square$
The following useful result, which provides an extension of Lemma 4 of Brown (1971) to continuous time parameter processes, has been given by Sen and Tsong (1981).

Lemma 3.4: Let $\{T_x,\ x\in[0,X]\}$ be a separable sub-martingale process. Then for every $\lambda > 0$,

$$P\Big\{\sup_{0\le x\le X}|T_x| > \lambda\Big\} \le \lambda^{-1}\,E\big[|T_X|\cdot I(|T_X| > \lambda)\big]; \qquad(3.31)$$

if, further, $E[T_X^2] < \infty$, then the right-hand side of (3.31) is bounded by $\lambda^{-1}\big\{E[T_X^2]\cdot P(|T_X| > \lambda)\big\}^{1/2}$.
Next, we show that the finite dimensional distributions (fdd) of $W^X_{n,\beta}$ are convergent equivalent to those of $W^*_n$. This requires that for any given $u_1,\ldots,u_m$, $m\ge1$, such that $0\le u_1<\cdots<u_m\le X$,

$$\big(W^{X\,\prime}_{n,\beta}(u_1),\ldots,W^{X\,\prime}_{n,\beta}(u_m)\big)' \overset{\mathcal V}{\sim} N_{mq}(0,\ \Sigma_n); \qquad(3.32)$$

that is, for any arbitrary real vector $\ell\ne0$,

$$\ell'\big(W^{X\,\prime}_{n,\beta}(u_1),\ldots,W^{X\,\prime}_{n,\beta}(u_m)\big)'\Big/\sqrt{\ell'\,\Sigma_n\,\ell} \overset{\mathcal D}{\to} N(0,1), \qquad(3.33)$$

where $\Sigma_n$ has been defined in (3.14a). Now, using (3.7) and (3.8),

$$\ell'\big(W^{X\,\prime}_{n,\beta}(u_1),\ldots,W^{X\,\prime}_{n,\beta}(u_m)\big)' = \sum_{j=1}^n Y_j, \qquad(3.34)$$

where $Y_j = n^{-1/2}\,\ell'\big(\dot f^{*\prime}_{j:u_1}(\beta),\ldots,\dot f^{*\prime}_{j:u_m}(\beta)\big)'$. Since, from the proof of Lemma 3.2,

$$E\big(\dot f^{*\prime}_{j:u_1}(\beta),\ldots,\dot f^{*\prime}_{j:u_m}(\beta)\big)' = 0$$

and

$$V\big(\dot f^{*\prime}_{j:u_1}(\beta),\ldots,\dot f^{*\prime}_{j:u_m}(\beta)\big)' = \Sigma_j = \begin{pmatrix}I_{j:u_1}(\beta) & I_{j:u_1}(\beta) & \cdots & I_{j:u_1}(\beta)\\ I_{j:u_1}(\beta) & I_{j:u_2}(\beta) & \cdots & I_{j:u_2}(\beta)\\ \vdots & & & \vdots\\ I_{j:u_1}(\beta) & I_{j:u_2}(\beta) & \cdots & I_{j:u_m}(\beta)\end{pmatrix}, \qquad(3.35)$$

it follows that for $j = 1,\ldots,n$, the $Y_j$ are independent random variables with

$$E(Y_j) = 0 \qquad\text{and}\qquad \mathrm{Var}(Y_j) = \frac1n\,\ell'\,\Sigma_j\,\ell. \qquad(3.36)$$

Let $\ell' = (\ell_1',\ldots,\ell_m')$, where $\ell_t' = (\ell_{t1},\ldots,\ell_{tq})$, $t = 1,\ldots,m$. Then, using the notation established in (3.8) and the line following (3.8), (3.34), and the $c_r$ inequality [ref. Puri and Sen (1971, p. 11)],

$$E|Y_j|^{2+\gamma} \le n^{-1-\gamma/2}\,(mq)^{1+\gamma}\sum_{t=1}^m\sum_{s=1}^q|\ell_{ts}|^{2+\gamma}\,E\big|\dot f^{*s}_{j:u_t}(\beta)\big|^{2+\gamma}. \qquad(3.37)$$

But

$$\dot f^{*s}_{j:u_t}(\beta) = \begin{cases}\dot f^{\,s}_j(T_j,\Delta_j), & T_j < u_t\\ \dfrac{\partial}{\partial\beta_s}\log S_j(u_t), & T_j\ge u_t,\end{cases}$$

and, using (3.12),

$$\Big|\frac{\partial}{\partial\beta_s}\log S_j(u_t)\Big|^{2+\gamma} = \Bigg|\frac{\sum_{\delta_j\in E_k}\int_{u_t}^\infty\dot f^{\,s}_j(t_j,\delta_j)\,f_j(t_j,\delta_j)\,dt_j}{\sum_{\delta_j\in E_k}\int_{u_t}^\infty f_j(t_j,\delta_j)\,dt_j}\Bigg|^{2+\gamma} \le \frac{\sum_{\delta_j\in E_k}\int_{u_t}^\infty\big|\dot f^{\,s}_j(t_j,\delta_j)\big|^{2+\gamma}f_j(t_j,\delta_j)\,dt_j}{S_j(u_t)}$$

from Jensen's inequality [ref. Hewitt and Stromberg (1965, p. 202)], so

$$E\big|\dot f^{*s}_{j:u_t}(\beta)\big|^{2+\gamma} \le E\big|\dot f^{\,s}_j(T_j,\Delta_j;\beta)\big|^{2+\gamma},$$

which does not depend on $u_t$. Thus, from (3.37), for $j = 1,\ldots,n$, and using assumption (3.13),

$$E|Y_j|^{2+\gamma} \le n^{-1-\gamma/2}\cdot\max_{\substack{1\le t\le m\\ 1\le s\le q}}|\ell_{ts}|^{2+\gamma}\cdot(mq)^{1+\gamma}\cdot mq\cdot\max_{1\le s\le q}E\big|\dot f^{\,s}_j(\beta)\big|^{2+\gamma} = n^{-1-\gamma/2}\,K', \qquad K' < \infty. \qquad(3.38)$$

Then, from (3.35), (3.36), and definition (3.14a),

$$s_n^2 = \sum_{j=1}^n E(Y_j^2) = \frac1n\sum_{j=1}^n\ell'\,\Sigma_j\,\ell = \ell'\,\Sigma_n\,\ell,$$

and therefore,

$$s_n^{2+\gamma} = \Big[\sum_{j=1}^n E(Y_j^2)\Big]^{(2+\gamma)/2} = \big(\ell'\,\Sigma_n\,\ell\big)^{(2+\gamma)/2}. \qquad(3.39)$$

In order to apply Liapunov's CLT, we must show that

$$\rho_n = \sum_{j=1}^n E|Y_j|^{2+\gamma}\Big/\Big[\sum_{j=1}^n E(Y_j^2)\Big]^{(2+\gamma)/2} \to 0 \quad\text{as } n\to\infty$$

for all $\ell\ne0$. Notice, however, that $\rho_n$ is scale invariant; that is, replacing $\ell$ by $c\ell$ for any $c\ne0$ leaves $\rho_n$ unchanged. For this reason it is sufficient to show that $\rho_n\to0$ as $n\to\infty$ for all $\ell$ such that $\ell'\ell = 1$. Now, from (3.38) and (3.39),

$$\rho_n \le \frac{n\cdot n^{-1-\gamma/2}\,K'}{\big(\ell'\,\Sigma_n\,\ell\big)^{(2+\gamma)/2}} = \frac{K'}{n^{\gamma/2}\big(\ell'\,\Sigma_n\,\ell\big)^{(2+\gamma)/2}}. \qquad(3.40)$$

Since $\inf_{\ell'\ell=1}(\ell'\,\Sigma_n\,\ell) = \mathrm{ch}_{mq}(\Sigma_n)$, the smallest ($mq$th) characteristic root of $\Sigma_n$, assumption (3.14) implies that $\liminf_{n\to\infty}(\ell'\,\Sigma_n\,\ell)^{1+\gamma/2} > 0$, and hence, using (3.40), $\rho_n\to0$ as $n\to\infty$. From this it follows that Liapunov's CLT can be applied to obtain

$$\frac{1}{s_n}\sum_{j=1}^n Y_j \overset{\mathcal D}{\to} N(0,1).$$

This establishes (3.33) and completes the proof of the asymptotic equivalence of the fdd of $W^X_{n,\beta}$ and of $W^*_n$.
Finally, it must be shown that $W^X_{n,\beta}$ is tight, and for this purpose we define, for every $0<\delta<1$,

$$w_\delta(t) = \sup\big\{\|t(x) - t(u)\|:\ 0\le u < x\le(u+\delta)\wedge X\big\}, \qquad(3.41)$$

where $t' = (t_1,\ldots,t_q)$ and $y_1\wedge y_2 = \min(y_1,y_2)$. To establish tightness, it is sufficient to show that for every $\epsilon > 0$,

$$\lim_{\delta\downarrow0}\ \limsup_{n\to\infty}\ P\big\{w_\delta(W^X_{n,\beta}) > \epsilon\big\} = 0. \qquad(3.42)$$

Since

$$\sup\big\{\|t(x)-t(u)\|:\ 0\le u<x\le(u+\delta)\wedge X\big\} \le q\sup_{1\le r\le q}w_\delta(t_r), \qquad(3.43)$$

we have $P\{w_\delta(W^X_{n,\beta}) > \epsilon\}\le P\{\sup_{1\le r\le q}w_\delta(W^X_{n,\beta_r}) > \epsilon/q\}$, so it suffices to show that for all $r = 1,\ldots,q$,

$$\lim_{\delta\downarrow0}\ \limsup_{n\to\infty}\ P\big\{w_\delta(W^X_{n,\beta_r}) > \epsilon/q\big\} = 0 \qquad\text{for every } \epsilon > 0. \qquad(3.44)$$

Next, we partition the interval $[0,X]$ into subintervals of length $\delta$: $[\ell\delta,\ (\ell+1)\delta\wedge X]$, $\ell\in L = \{0,1,\ldots,[X/\delta]+c\}$, where $c = 0$ if $[X/\delta] < X/\delta$ and $c = 1$ if $[X/\delta] = X/\delta$, $[y]$ denotes the greatest integer $\le y$, and the last subinterval has length $\le\delta$. Using definition (3.41), and noting that either $x\in(u,(\ell+1)\delta]$ or $x\in[(\ell+1)\delta,u+\delta]$ when $\ell\delta\le u\le(\ell+1)\delta$, and that $|a-b|\le|a-c|+|b-c|$,

$$w_\delta(W^X_{n,\beta_r}) \le \sup_{\ell\in L}\Big[\sup_{\ell\delta\le x\le(\ell+1)\delta}\big|W^X_{n,\beta_r}(x)-W^X_{n,\beta_r}(\ell\delta)\big| + \sup_{\ell\delta\le u\le(\ell+1)\delta}\big|W^X_{n,\beta_r}(u)-W^X_{n,\beta_r}(\ell\delta)\big|$$
$$\qquad\qquad + \sup_{(\ell+1)\delta\le x\le(\ell+2)\delta}\big|W^X_{n,\beta_r}(x)-W^X_{n,\beta_r}((\ell+1)\delta)\big| + \sup_{\ell\delta\le u\le(\ell+1)\delta}\big|W^X_{n,\beta_r}(u)-W^X_{n,\beta_r}((\ell+1)\delta)\big|\Big]$$
$$\le \sup_{\ell\in L}\Big[4\sup_{\ell\delta\le v\le(\ell+1)\delta}\big|W^X_{n,\beta_r}(v)-W^X_{n,\beta_r}(\ell\delta)\big| + \big|W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)\big|\Big]$$
$$\le 5\sup_{\ell\in L}\Big[\sup_{\ell\delta\le v\le(\ell+1)\delta}\big|W^X_{n,\beta_r}(v)-W^X_{n,\beta_r}(\ell\delta)\big|\Big],$$

where, by Lemma 3.3, $\{[W^X_{n,\beta_r}(v)-W^X_{n,\beta_r}(\ell\delta)],\ v\in[\ell\delta,(\ell+1)\delta]\}$ is a separable sub-martingale. Thus, for every $\epsilon > 0$, Lemma 3.4 can be applied to obtain

$$P\big\{w_\delta(W^X_{n,\beta_r}) > \epsilon/q\big\} \le \frac{10q}{\epsilon}\sum_{\ell\in L}\Big\{E\big[W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)\big]^2\cdot P\Big(\big|W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)\big| > \frac{\epsilon}{10q}\Big)\Big\}^{1/2}. \qquad(3.45)$$

But

$$E\big[W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)\big]^2 = \mathrm{Var}\big[W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)\big] = K^{(n)}_{\ell r}(\delta), \qquad(3.46)$$

from Lemma 3.2. Also, from (3.33), as $n\to\infty$,

$$P\big\{|W^X_{n,\beta_r}((\ell+1)\delta)-W^X_{n,\beta_r}(\ell\delta)| > \epsilon/(10q)\big\} \to P\big\{|W^*_{n,r}((\ell+1)\delta)-W^*_{n,r}(\ell\delta)| > \epsilon/(10q)\big\},$$

and since $[W^*_{n,r}((\ell+1)\delta)-W^*_{n,r}(\ell\delta)] \sim N\big(0,\ K^{(n)}_{\ell r}(\delta)\big)$,

$$P\big\{|W^*_{n,r}((\ell+1)\delta)-W^*_{n,r}(\ell\delta)| > \epsilon/(10q)\big\} \le \frac{20q}{\epsilon}\big[K^{(n)}_{\ell r}(\delta)\big]^{1/2}\frac{1}{\sqrt{2\pi}}\exp\Big\{-\frac{\epsilon^2}{200\,q^2\,K^{(n)}_{\ell r}(\delta)}\Big\} \qquad(3.47)$$

from Mills' ratio [i.e., for $x > 0$, $[1-\Phi(x)]/\phi(x)\le 1/x$; ref. Kendall and Stuart (1969, p. 136-137)].

So, from (3.45)-(3.47), for every $\epsilon > 0$, $0<\delta<1$, and sufficiently large $n$,

$$P\big\{w_\delta(W^X_{n,\beta_r}) > \epsilon/q\big\} \le \frac{10q}{\epsilon}\sum_{\ell\in L}\Big(\frac{20q}{\epsilon}\big[K^{(n)}_{\ell r}(\delta)\big]^{3/2}\frac{1}{\sqrt{2\pi}}\exp\Big\{-\frac{\epsilon^2}{200\,q^2\,K^{(n)}_{\ell r}(\delta)}\Big\}\Big)^{1/2},$$

and using assumption (3.15), since there are approximately $X/\delta$ terms in the sum, the right-hand side tends to 0 as $\delta\downarrow0$. Since this argument is valid for all $r = 1,\ldots,q$, (3.42) has been verified, and the proof of Theorem 3.1 is complete. $\square$

3.4. Some Remarks on Theorem 3.1 and Its Assumptions

The comments that follow should indicate that the assumptions of Theorem 3.1 are relatively mild, and that this result will apply to a fairly broad class of competing risks models.
Specific examples, which
are considered in Chapter V, will reinforce this claim.
Remark 1: Note from (3.35) that $\Sigma_j$ is composed of submatrices of the form $I_{j:u}$, $u\in[0,X]$, the truncated information matrices defined in (3.9). Thus, $\Sigma_j$ is a function of $\beta$ and $z_j$ which, in practical applications, does not systematically vary with $j$. And using the same argument given in the first remark following Theorem 2.1, as the sample size increases, $\Sigma_n = \frac1n\sum_{j=1}^n\Sigma_j$ converges to the corresponding population quantity, whatever that might be. This discussion implies that, in the practical application of competing risks models to clinical trials (or virtually any other experimental setting), $\Sigma_n$ converges to a matrix of finite constants as $n$ grows large, and consequently, as in the remark following Theorem 3.1, we may write

$$W^X_{n,\beta} \overset{\mathcal D}{\to} W^* \quad\text{in the (extended) } J_1 \text{ topology on } D^q[0,X]. \qquad(3.48)$$
Remark 2: Since, from (3.35), $\Sigma_j$ is the variance-covariance matrix of the variables $\dot f^*_{j:u_1}(T_{j:u_1},\Delta_{j:u_1};\beta),\ldots,\dot f^*_{j:u_m}(T_{j:u_m},\Delta_{j:u_m};\beta)$, it is positive semi-definite for all $j$. Further, if, as might be expected in practical applications, we can rule out the possibility of any linear dependencies among these variables, then $\Sigma_j$ is strictly positive definite, that is, $\ell'\,\Sigma_j\,\ell > 0$ for all $\ell\ne0$. Further, from Remark 1, $\Sigma_n$ converges to the corresponding population quantity, which we denote by $\Sigma = \lim_{n\to\infty}\Sigma_n$. But for all $\ell\ne0$,

$$\ell'\,\Sigma\,\ell = \lim_{n\to\infty}\ell'\,\Sigma_n\,\ell = \lim_{n\to\infty}\Big[\frac1n\sum_{j=1}^n\ell'\,\Sigma_j\,\ell\Big] > 0,$$

as there are only finitely many different $\Sigma_j$ and, for each $j$, $\ell'\,\Sigma_j\,\ell > 0$. This implies that $\Sigma$ is also positive definite and $\mathrm{ch}_{mq}(\Sigma) > 0$, which yields

$$\lim_{n\to\infty}\mathrm{ch}_{mq}(\Sigma_n) = \mathrm{ch}_{mq}(\Sigma) > 0.$$

That is, assumption (3.14) will be satisfied in any application that is likely to be considered.
Remark 3: Note that assumption (3.13) is identical to assumption (2.18) of Theorem 2.1, and that, as pointed out in the second remark following this theorem, $E|\dot f^{\,s}_j(T_j,\Delta_j;\beta)|^{2+\gamma}$ can usually be bounded by a function of $\beta$ and $z_j$. Consequently, the uniform bound required by (3.13) can be obtained by taking the maximum of these functions over the index $j$.
Remark 4: When no covariables are included in model (2.1) or (2.6), or equivalently, when the null hypothesis $H_0:\beta=\beta_0$ is true, the observations $(T_j,\Delta_j)$ are iid random vectors and the appropriate regularity conditions for this case simplify somewhat. Specifically, assumptions (3.11) and (3.12) are still needed, but the $j$th subscript may be omitted. Assumption (3.15) is also required, but the function defined there is independent of $n$ since the truncated information matrices are now identical for each subject. Finally, assumptions (3.13) and (3.14) are unnecessary, and under $H_0:\beta=\beta_0$ and the assumptions stated above,

$$W^X_{n,\beta} \overset{\mathcal D}{\to} W^* \quad\text{in the (extended) } J_1 \text{ topology on } D^q[0,X], \qquad(3.49)$$

where $W^*$ is similar to the Gaussian function of Theorem 3.1, except that its covariance structure is of the form

$$E\big[W^*(u)\big]\big[W^*(v)\big]' = I_{(u\wedge v)}(\beta) \qquad\text{for every } (u,v):\ 0\le u,v\le X,$$

where $I_{(x)}(\beta)$ denotes the common truncated information matrix for each subject.
3.5. Applications to Clinical Trials and Life Testing Problems

As indicated earlier, for both ethical and economic reasons, we require a procedure to test the hypothesis

$$H_0:\ \beta = \beta_0\ (\text{specified}) \qquad\text{against}\qquad H_1:\ \beta\ne\beta_0$$

with respect to the competing risks model based on the underlying distributions specified in (2.1) or (2.6), which allows the possibility of early termination at time $x\in(0,X]$ if the accumulated statistical evidence permits. To see how Theorem 3.1 may be used for this purpose, define

$$\Lambda^0_n(x) = W^{X\,\prime}_{n,\beta_0}(x)\,I^{-1}_{(x)}(\beta_0)\,W^X_{n,\beta_0}(x), \qquad(3.50)$$

where

$$W^X_{n,\beta_0}(x) = n^{-1/2}\Big\{\frac{\partial}{\partial\beta}\log L^*_{n,x}(\beta)\Big\}\Big|_{\beta=\beta_0} \qquad\text{and}\qquad I_{(x)}(\beta_0) = \frac1n\sum_{j=1}^n I_{j:x}(\beta)\Big|_{\beta=\beta_0}.$$

Then, by virtue of Theorem 3.1, under $H_0:\beta=\beta_0$, $\sup_{0<x\le X}\{\Lambda^0_n(x)\}$ is convergent equivalent in distribution to $\sup_{0<u\le X}\{W^{*\,\prime}_n(u)\,I^{-1}_{(u)}(\beta_0)\,W^*_n(u)\}$. This suggests the following progressively truncated test procedure which has an asymptotic (as $n\to\infty$) level of significance equal to $\alpha$ $(0<\alpha<1)$:

Continue experimentation as long as $x < X$ and $\Lambda^0_n(x) < w_\alpha$, where $w_\alpha$ satisfies

$$P\Big\{\sup_{0<u\le X}W^{*\,\prime}_n(u)\,I^{-1}_{(u)}(\beta_0)\,W^*_n(u) > w_\alpha\Big\} = \alpha. \qquad(3.51)$$

If, for the first time, for some $x_n\in(0,X)$, $\Lambda^0_n(x_n)\ge w_\alpha$, terminate the experiment at time $x = x_n$ and reject $H_0$. If no such $x_n\ (<X)$ exists, then stop at the predetermined time $X$ and do not reject $H_0$.

Although it does not appear that the critical values, $w_\alpha$, defined in (3.51), can be analytically determined, a general procedure is outlined in Section 5.2 which provides an empirical determination of $w_\alpha$ by simulating the Gaussian process $W^*_n$.
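A minimal sketch of the empirical determination of $w_\alpha$ follows, under the assumptions of this illustration: the Gaussian process $W^*_n$ is generated on a user-chosen grid $0 < x_1 < \cdots < x_m \le X$ using its independent-increments structure implied by $E[W^*(u)W^*(v)'] = I_{(u\wedge v)}(\beta_0)$, the quadratic form (3.50) is evaluated at each grid point, and $w_\alpha$ is taken as the empirical $(1-\alpha)$ quantile of the supremum. The grid, the supplied truncated information matrices, and the Monte Carlo size are all choices of the illustration, not prescriptions of the dissertation.

```python
import numpy as np

def critical_value(info_list, alpha=0.05, n_sim=20000, seed=0):
    """Approximate w_alpha of (3.51) by simulating W* on a grid of m time points.

    info_list : list of m (q, q) truncated information matrices I_(x_1), ..., I_(x_m)
                evaluated at beta_0 (nondecreasing in the Loewner order).
    """
    rng = np.random.default_rng(seed)
    m, q = len(info_list), info_list[0].shape[0]
    # Independent Gaussian increments: Cov[W(x_i) - W(x_{i-1})] = I_(x_i) - I_(x_{i-1}).
    inc_cov = [info_list[0]] + [info_list[i] - info_list[i - 1] for i in range(1, m)]
    sups = np.empty(n_sim)
    for b in range(n_sim):
        w = np.zeros(q)
        lam = np.empty(m)
        for i in range(m):
            w = w + rng.multivariate_normal(np.zeros(q), inc_cov[i])
            lam[i] = w @ np.linalg.solve(info_list[i], w)   # quadratic form (3.50)
        sups[b] = lam.max()
    return float(np.quantile(sups, 1 - alpha))
```

In use, the trial would then be monitored by recomputing $\Lambda^0_n(x)$ at each observed death time and stopping the first time it exceeds the simulated $w_\alpha$.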
It should be noted that as $x\to0$ both $W^X_{n,\beta_0}(x)\to0$ and $I_{(x)}(\beta_0)\to0$, and therefore, depending on their relative rates of convergence, $\Lambda^0_n(x)$ may not be well-behaved as $x\to0$ for some specific competing risks models. Although $\Lambda^0_n(x)$ may not blow up when $x$ is close to 0 for every competing risks model, this possibility requires that, in general, the test statistic be computed for strictly positive $x$, and explains why the supremums were taken over $x\in(0,X]$ in the general description given above. Thus, if this problem exists for the specific model under consideration, the test statistic, $\Lambda^0_n(x)$, is monitored over the interval $[\epsilon,X]$, and $w_\alpha$ is also determined by taking the supremum of the appropriate quantity in (3.51) over $[\epsilon,X]$, for some appropriately chosen $\epsilon > 0$. Typically, $\epsilon$ is chosen so that a given percentage of the lifetimes are expected to be observed prior to time $x = \epsilon$, which implies that $\Lambda^0_n(x)$ will be well-behaved on $[\epsilon,X]$. As indicated above, the critical values, $w_\alpha$, will also depend on this choice of $\epsilon$; for smaller values of $\epsilon$, $\Lambda^0_n(x)$ will tend to be more erratic and hence $w_\alpha$ will tend to be larger. Since one would probably not feel comfortable if the trial were terminated before at least a few of the lifetimes were observed, monitoring the study over $[\epsilon,X]$ instead of $[0,X]$ is not in any way objectionable from a practical point of view. Of course, $\epsilon$ should be chosen before experimentation begins, and if the specific model being considered does not have the problem just discussed, we may take $\epsilon = 0$.
It is natural to base the test statistic on the quadratic form, $\Lambda^0_n(x)$, defined in (3.50), since the components of $W^X_{n,\beta_0}(x)$ are weighted by the inverse of their variance-covariance matrix, $I_{(x)}(\beta_0)$. Also, for each $0 < x\le X$, under $P_{\beta_0}$,

$$\Lambda^0_n(x) \overset{\mathcal D}{\to} \chi^2_q, \qquad(3.52)$$

where $\chi^2_q$ denotes the (central) chi-squared distribution with $q$ degrees of freedom.

It is important to note, moreover, that the test statistic, $\Lambda^0_n(x)$, is invariant with respect to reparameterizations of the model. To see this, let $\theta$ denote a reparameterization of $\beta$, that is, $\theta = g(\beta)$, where $g$ is a one-to-one transformation, and let $L^*_{n,x}(\beta)$ and $\bar L^*_{n,x}(\theta)$ denote the truncated likelihood functions under the $\beta$ and $\theta$ parameterizations, respectively. Then, since $\beta = g^{-1}(\theta)$ and $\bar L^*_{n,x}(\theta) = L^*_{n,x}(g^{-1}(\theta))$, applying the chain rule gives

$$\frac{\partial}{\partial\theta}\log\bar L^*_{n,x}(\theta) = \frac{\partial}{\partial\theta}\log L^*_{n,x}\big(g^{-1}(\theta)\big) = A\cdot\frac{\partial}{\partial\beta}\log L^*_{n,x}(\beta), \qquad(3.53)$$

where

$$A_{q\times q} = \frac{\partial}{\partial\theta}\,g^{-1}(\theta)' = \begin{pmatrix}\dfrac{\partial}{\partial\theta_1}g_1^{-1}(\theta) & \cdots & \dfrac{\partial}{\partial\theta_1}g_q^{-1}(\theta)\\ \vdots & & \vdots\\ \dfrac{\partial}{\partial\theta_q}g_1^{-1}(\theta) & \cdots & \dfrac{\partial}{\partial\theta_q}g_q^{-1}(\theta)\end{pmatrix}$$

denotes the transpose of the Jacobian matrix of the transformation. From this it follows that

$$\bar W^X_{n,\theta}(x) = n^{-1/2}\,\frac{\partial}{\partial\theta}\log\bar L^*_{n,x}(\theta) = A\,W^X_{n,\beta}(x) \qquad(3.54)$$

and

$$\bar I_{(x)}(\theta) = A\,I_{(x)}(\beta)\,A'. \qquad(3.55)$$

Then, using (3.50), (3.54), and (3.55),

$$\bar\Lambda^0_n(x) = \bar W^{X\,\prime}_{n,\theta_0}(x)\,\bar I^{-1}_{(x)}(\theta_0)\,\bar W^X_{n,\theta_0}(x) = W^{X\,\prime}_{n,\beta_0}(x)\,A'\big(A\,I_{(x)}(\beta_0)\,A'\big)^{-1}A\,W^X_{n,\beta_0}(x) = W^{X\,\prime}_{n,\beta_0}(x)\,I^{-1}_{(x)}(\beta_0)\,W^X_{n,\beta_0}(x) = \Lambda^0_n(x).$$

Thus, as claimed, the test statistic for the model parameterized by $\theta$, $\bar\Lambda^0_n(x)$, is identical to the test statistic for the original parameterization, $\Lambda^0_n(x)$. Note that this argument also shows that if $W^X_{n,\beta_0}(x)$ is multiplied by any nonsingular $q\times q$ matrix, $\Lambda^0_n(x)$ remains unchanged.
3.6.
Nuisance Parameters:  The Case When the Null Hypothesis Does Not Completely Specify β

In most practical applications, it is of interest to make inferences with respect to some, but not all, of the components of β. Those parameters for which no inferences are desired but which are required to specify the model under consideration are called "nuisance" parameters. It is important to note that, as written in (2.12), β denotes the vector of all model parameters, not just the regression parameters defined in (2.1) or (2.6). In many applications, some of the regression parameters are treated as "nuisance" parameters along with any miscellaneous model parameters that might be necessary. Such a situation would occur, for example, if one were primarily interested in detecting a treatment effect, but included regression parameters corresponding to the overall mean lifetime and one or more covariables, which are then treated as nuisance parameters for the purposes of this discussion.

In order to formalize the notion presented above, partition β, the vector of model parameters displayed in (2.12), as follows:

    β_{q×1} = ( β_1'_{1×p} , β_2'_{1×(q-p)} )' ,                               (3.56)

where β_1 denotes those parameters for which inferences are to be made and β_2 denotes the nuisance parameters. We require a time-sequential procedure in order to test

    H_0: β_1 = β_1^0   against   H_1: β_1 ≠ β_1^0   (β_2 unspecified).         (3.57)
In this setting, let β_2^0 denote the true value of the vector of nuisance parameters, and let β̂^0_{2,x} denote its MLE under H_0 based on the information available at time x, that is, β̂^0_{2,x} satisfies

    [ ∂ log L*_{n,x}(β)/∂β_2 ] |_{β_1 = β_1^0, β_2 = β̂^0_{2,x}} = 0 .

Since the time-sequential procedure for testing the completely specified null hypothesis in (3.1) was based on the stochastic process

    W^X_{n,β_0} = { W^X_{n,β_0}(x) = n^{-1/2} [ ∂ log L*_{n,x}(β)/∂β ] |_{β=β_0} , 0 ≤ x ≤ X } ,

it seems intuitively reasonable to base a time-sequential procedure for testing (3.57) on

    Ŵ^X_{n,β̂_0} = { W^X_{n,β̂_0}(x) , 0 ≤ x ≤ X } ,                            (3.58)

where W^X_{n,β̂_0}(x) denotes the p×1 vector n^{-1/2} [ ∂ log L*_{n,x}(β)/∂β_1 ] evaluated at β = β̂_0, with

    β_0 = ( β_1^{0}', β_2^{0}' )'   and   β̂_0 = ( β_1^{0}', β̂^{0}_{2,x}' )' ,   (3.59)

recalling that β_2^0 denotes the true value of the vector of nuisance parameters and, as defined above, β̂^0_{2,x} denotes its estimator at time x under H_0. Also, the truncated information matrix at time x, I_(x)(β_0), is partitioned as follows:

    I_(x)(β_0) = [ I^{11}_(x)(β_0)   I^{12}_(x)(β_0) ]
                 [ I^{21}_(x)(β_0)   I^{22}_(x)(β_0) ] ,                        (3.60)

where the blocks have dimensions p×p, p×(q-p), (q-p)×p, and (q-p)×(q-p), respectively, and A(x) is defined by

    A(x) = [ I_p : -I^{12}_(x)(β_0)·[I^{22}_(x)(β_0)]^{-1} ]_{p×q} ,            (3.61)

where I_p denotes the p×p identity matrix. Finally, let W*_{n,p} = { W*_{n,p}(x), 0 ≤ x ≤ X } be a p-variate Gaussian function on [0,X] such that EW*_{n,p} = 0 and

    E[W*_{n,p}(u)][W*_{n,p}(v)]' = A(u)·I_{(u∧v)}(β_0)·A'(v)

for every (u,v): 0 ≤ u,v ≤ X, where u∧v = min(u,v). Using the notation established in (3.58)-(3.61), the following result can now be given.
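As a small computational illustration of (3.60)-(3.61), the following sketch (Python/NumPy; the block sizes and the example information matrix are illustrative assumptions) forms A(x) from a partitioned information matrix and evaluates the covariance kernel A(u)·I_{(u∧v)}(β_0)·A'(v) of the limiting p-variate Gaussian function:

```python
import numpy as np

def A_matrix(I_full, p):
    """A(x) = [ I_p , -I12 @ inv(I22) ] for a q x q information matrix
    partitioned with the first p rows/columns corresponding to beta_1."""
    I12 = I_full[:p, p:]
    I22 = I_full[p:, p:]
    return np.hstack([np.eye(p), -I12 @ np.linalg.inv(I22)])

def cov_kernel(I_of, u, v, p):
    """Covariance A(u) . I_(min(u,v)) . A(v)' of the p-variate Gaussian function,
    where I_of(x) returns the truncated information matrix at time x."""
    Au, Av = A_matrix(I_of(u), p), A_matrix(I_of(v), p)
    return Au @ I_of(min(u, v)) @ Av.T

# Illustrative truncated information: I_(x)(beta_0) = x * I0 for a fixed p.d. I0.
q, p = 4, 2
rng = np.random.default_rng(1)
M = rng.normal(size=(q, q))
I0 = M @ M.T + q * np.eye(q)
I_of = lambda x: x * I0

print(cov_kernel(I_of, 0.4, 0.7, p))   # p x p covariance between times u = 0.4 and v = 0.7
```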
Theorem 3.5:  Assume that the parameter space can be restricted to a compact subset of R^q, denoted by Ξ, and that conditions (3.11)-(3.15) of this chapter and conditions (2.16) and (2.17) of Chapter II hold. Then, under P_{β_0}, Ŵ^X_{n,β̂_0} is convergent equivalent in distribution to W*_{n,p} in the (extended) J_1 topology on D^p[0,X].

Remark:  Requiring the parameter space to be a compact (closed, bounded) subset of R^q places no real restrictions on our ability to draw inferences with respect to or find estimators of β. This follows since, in any alternative hypotheses considered in applications, β can be restricted to some closed, bounded neighborhood of the null value, β_0, and for estimation purposes we only consider values of β that can be placed in such a neighborhood of the true parameter value.
Before proceeding with the proof of this result, it will be convenient to establish the following two lemmas.

Lemma 3.6:  Assume that conditions (3.11) and (3.12) hold and that β ∈ Ξ, a compact subset of R^q. Then for all x ∈ [0,X], j = 1,...,n, r,s = 1,...,q, and K < ∞,

(i)   E| f̈*^{r,s}_{j:x}(T_{j:x}, Δ_{j:x}; β) |^K ≤ c* < ∞ ,

(ii)  E[ sup_{β*(ε)} | f̈*^{r,s}_{j:x}(T_j, Δ_j; β*) − f̈*^{r,s}_{j:x}(T_j, Δ_j; β) | ]^K ≤ ρ(ε), where β*(ε) = {β*: ||β* − β|| < ε} and ρ(ε) → 0 as ε → 0, and

(iii) I^{r,s}_{j:x}(β) is uniformly continuous in x on [0,X]; this implies that each element of I_(x)(β), and hence I_(x)(β) itself, is uniformly continuous in x on [0,X].
Proof:  To prove (i), write

    E| f̈*^{r,s}_{j:x}(T_{j:x}, Δ_{j:x}; β) |^K
        = Σ_{δ_j ∈ E_k} ∫_0^x | f̈^{r,s}_j(t_j,δ_j) |^K f_j(t_j,δ_j) dt_j + | (∂²/∂β_r∂β_s) log S_j(x) |^K · S_j(x)
        ≤ Σ_{δ_j ∈ E_k} ∫_0^∞ | f̈^{r,s}_j(t_j,δ_j) |^K f_j(t_j,δ_j) dt_j + c_j(x) ,

where, for j = 1,...,n,

    c_j(x) = | (∂²/∂β_r∂β_s) log S_j(x) |^K · S_j(x) .

The first term is finite for all β ∈ Ξ and x ∈ [0,X] by virtue of assumptions (3.11) and (3.12). Also, using (3.12),

    (∂/∂β_r) S_j(x) = − Σ_{δ_j ∈ E_k} ∫_0^x (∂/∂β_r) f_j(t_j,δ_j;β) dt_j ,

so (∂/∂β_r) S_j(x) is a continuous function of x on [0,X], and the same is true for (∂/∂β_s) S_j(x) and (∂²/∂β_r∂β_s) S_j(x); consequently, c_j(x) is a continuous function of x on [0,X]. Thus, for each j, sup_{0≤x≤X} c_j(x) is attained on the compact interval [0,X] and is finite, that is,

    c_j(x) ≤ max_j sup_{0≤x≤X} c_j(x) = c* < ∞ ,

which completes the proof of (i).

To prove (ii), note that

    E[ sup_{β*(ε)} | f̈*^{r,s}_{j:x}(T_{j:x}, Δ_{j:x}; β*) − f̈*^{r,s}_{j:x}(T_{j:x}, Δ_{j:x}; β) | ]^K = ρ(ε) .

But (3.11) and the compactness of Ξ imply that (∂²/∂β_r∂β_s) f_j(T_j,Δ_j;β) is uniformly continuous in β, and the same is true for the first order partials. From this and (3.12) it follows that (∂²/∂β_r∂β_s) log S_j(x) is a continuous function of β and, consequently, ρ(ε) → 0 as ||β* − β|| ≤ ε → 0, which proves part (ii).

To see that (iii) holds, use (3.9) to write

    I^{r,s}_{j:x}(β) = Σ_{δ_j ∈ E_k} ∫_0^x ḟ^r_j(t_j,δ_j)·ḟ^s_j(t_j,δ_j)·f_j(t_j,δ_j) dt_j
                       + [ (∂/∂β_r) S_j(x) ]·[ (∂/∂β_s) S_j(x) ] / S_j(x) .        (3.62)

But the form of the integral in the first term indicates that this term is a continuous function of x. Also, S_j(x) is continuous in x on [0,X] and, from the proof of part (i), so are (∂/∂β_r) S_j(x) and (∂/∂β_s) S_j(x). From this it follows that, since Ξ is a compact set, I^{r,s}_{j:x}(β) is uniformly continuous in x on [0,X] for all j = 1,...,n and r,s = 1,...,q, and hence each element of I_(x)(β), and I_(x)(β) itself, is uniformly continuous in x on [0,X].  □

Lemma 3.7:  If the conditions of Lemma 3.6 and conditions (2.16) and (2.17) of Chapter II hold, then, with

    I°_(x)(β) = −(1/n) ∂²/∂β∂β' log L*_{n,x}(β) = −(1/n) Σ_{j=1}^n f̈*_{j:x}(β) ,

(i)   E[ I°_(x)(β) ] = I_(x)(β) ,

(ii)  I°_(x)(β) − I_(x)(β) →_p 0 , and

(iii) I°_(x)(β̂_x) − I_(x)(β) →_p 0 ,

where β̂_x denotes the MLE of β at time x.
Proof:

    E[ I°_(x)(β) ] = E[ −(1/n) ∂²/∂β∂β' log L*_{n,x}(β) ] = −(1/n) Σ_{j=1}^n E[ f̈*_{j:x}(β) ] ,    (3.63)

where the (r,s)th element of E[ f̈*_{j:x}(β) ] is given by

    E[ f̈*^{r,s}_{j:x}(β) ] = Σ_{δ_j ∈ E_k} ∫_0^x (∂²/∂β_r∂β_s) log f_j(t_j,δ_j)·f_j(t_j,δ_j) dt_j
                             + S_j(x)·{ (∂²/∂β_r∂β_s) log S_j(x) } .                              (3.64)

But

    (∂²/∂β_r∂β_s) log f_j(t_j,δ_j) = f̈^{r,s}_j(t_j,δ_j) − ḟ^r_j(t_j,δ_j)·ḟ^s_j(t_j,δ_j) .

Using this, a similar expansion for (∂²/∂β_r∂β_s) log S_j(x), and (3.12), (3.64) may be written as

    E[ f̈*^{r,s}_{j:x}(β) ] = −E[ ḟ*^r_{j:x}(β)·ḟ*^s_{j:x}(β) ] = −I^{r,s}_{j:x}(β) .              (3.65)

Since this is true for all j = 1,...,n and r,s = 1,...,q, it follows from (3.63) that E[ I°_(x)(β) ] = (1/n) Σ_{j=1}^n I_{j:x}(β) = I_(x)(β), which establishes part (i). Note that I°_(x)(β) is the truncated analogue of the so-called "observed" information matrix.

To prove part (ii), we note that assumption (2.17) of Chapter II and part (i) of Lemma 3.6 imply that for some δ > 0 and all r,s = 1,...,q, j = 1,...,n, and x ∈ [0,X],

    U^{r,s}_{j:x} = E| f̈*^{r,s}_{j:x}(β) |^{1+δ} ≤ K + (c*)^{1+δ} < ∞ .                            (3.66)

Also, the (r,s)th element of I°_(x)(β) is given by

    I°^{r,s}_(x)(β) = −(1/n) Σ_{j=1}^n f̈*^{r,s}_{j:x}(β) ,                                          (3.67)

and from (3.66),

    lim_{n→∞} (1/n^{1+δ}) Σ_{j=1}^n U^{r,s}_{j:x} = 0 ,

so Markov's LLN applies, yielding

    −(1/n) Σ_{j=1}^n { f̈*^{r,s}_{j:x}(β) − E[ f̈*^{r,s}_{j:x}(β) ] } →_p 0 ,

that is, I°^{r,s}_(x)(β) − E[ I°^{r,s}_(x)(β) ] →_p 0. But, from part (i), E[ I°^{r,s}_(x)(β) ] = I^{r,s}_(x)(β). Since this is true for all r,s = 1,...,q, I°_(x)(β) − I_(x)(β) →_p 0, which proves (ii).

To prove part (iii), note that assumption (2.16) of Chapter II and part (ii) of Lemma 3.6 imply that for some η > 0 and all r,s = 1,...,q, j = 1,...,n, and x ∈ [0,X],

    E[ sup_{β*(ε)} | f̈*^{r,s}_{j:x}(β*) − f̈*^{r,s}_{j:x}(β) |^{1+η} ] ≤ ρ(ε) → 0  as ε → 0,         (3.68)

where β*(ε) = {β*: ||β* − β|| < ε}. Next, we write

    I°_(x)(β̂_x) − I_(x)(β) = [ I°_(x)(β̂_x) − I°_(x)(β) ] + [ I°_(x)(β) − I_(x)(β) ] ,              (3.69)

and then proceed as in the proof of Corollary 2.6 of Chapter II to use (3.68), the consistency of the MLE, and Markov's LLN to show that [ I°_(x)(β̂_x) − I°_(x)(β) ] →_p 0. Using this and the result from (ii) in (3.69) establishes (iii) and completes the proof of Lemma 3.7.  □
Proof of Theorem 3.5:  Consider

    (1/√n) [ ∂ log L*_{n,x}(β)/∂β_1 ] |_{β_1 = β_1^0, β_2 = β_2}                                     (3.70)

as a function of β_2. Expanding (3.70) by the Mean Value Theorem about β_2^0, the true parameter value, and evaluating this expression at β_2 = β̂^0_{2,x} yields

    (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β̂_0}
        = (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β_0}
          + [ (1/n) ∂² log L*_{n,x}(β)/∂β_1∂β_2' |_{β_1=β_1^0, β_2=β_2*} ]·√n( β̂^0_{2,x} − β_2^0 ) ,   (3.71)

where β_2* satisfies ||β_2* − β_2^0|| ≤ ||β̂^0_{2,x} − β_2^0||. Then, using (3.68), Markov's LLN can be applied as in the proof of Lemma 3.7 to show that, under P_{β_0}, the quantity in brackets in (3.71) differs from its value at β = β_0 by o_p(1); also, from part (ii) of Lemma 3.7,

    [ (1/n) ∂² log L*_{n,x}(β)/∂β_1∂β_2' ]|_{β=β_0} + I^{12}_(x)(β_0) →_p 0 ,

where I^{12}_(x)(β_0) has been defined in (3.60). Using these results, (3.71) can be expressed, under P_{β_0}, as

    (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β̂_0}
        = (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β_0} − I^{12}_(x)(β_0)·√n( β̂^0_{2,x} − β_2^0 ) + o_p(1) .   (3.72)

Furthermore, using Lemmas 3.6 and 3.7, we can proceed as in the proof of Theorem 2.1 [lines (2.25) through (2.35)] to show that, under P_{β_0},

    √n( β̂^0_{2,x} − β_2^0 ) = [ I^{22}_(x)(β_0) ]^{-1} · (1/√n)[ ∂ log L*_{n,x}(β)/∂β_2 ]|_{β=β_0} + o_p(1) ,   (3.73)

where I^{22}_(x)(β_0), of dimension (q−p)×(q−p), has been defined in (3.60). Substituting (3.73) into (3.72) and noting, from (3.70) and (3.58), that (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β̂_0} = W^X_{n,β̂_0}(x), while (1/√n)[ ∂ log L*_{n,x}(β)/∂β_1 ]|_{β=β_0} and (1/√n)[ ∂ log L*_{n,x}(β)/∂β_2 ]|_{β=β_0} are, respectively, the first p and last q−p components of W^X_{n,β_0}(x), we obtain, under P_{β_0},

    W^X_{n,β̂_0}(x) = ( I_p : −I^{12}_(x)(β_0)·[I^{22}_(x)(β_0)]^{-1} )·W^X_{n,β_0}(x) + o_p(1)
                   = A(x)·W^X_{n,β_0}(x) + o_p(1) ,                                                  (3.74)

where A(x) (p×q) has been defined in (3.61) and W^X_{n,β_0} = { W^X_{n,β_0}(x), 0 ≤ x ≤ X } is the q-variate stochastic process considered in Theorem 3.1.

From (3.74), Theorem 3.1, and Slutsky's theorem, it follows immediately that the fdd of Ŵ^X_{n,β̂_0} are convergent equivalent to those of W*_{n,p}, the p-variate Gaussian function of Theorem 3.5. To complete the proof of Theorem 3.5, it must be shown that Ŵ^X_{n,β̂_0} is tight. It is sufficient to show that

    sup_{D(δ)} || W^X_{n,β̂_0}(x) − W^X_{n,β̂_0}(u) || →_p 0  as δ ↓ 0 ,                               (3.75)

where D(δ) = { (x,u): |x−u| < δ, 0 ≤ x,u ≤ X }. From (3.74), however,

    sup_{D(δ)} || W^X_{n,β̂_0}(x) − W^X_{n,β̂_0}(u) ||
        ≤ sup_{D(δ)} || [A(x) − A(u)]·W^X_{n,β_0}(x) || + sup_{D(δ)} || A(u)·[ W^X_{n,β_0}(x) − W^X_{n,β_0}(u) ] || + o_p(1)
        ≤ sup_{D(δ)} ||A(x) − A(u)|| · sup_{0≤x≤X} ||W^X_{n,β_0}(x)||
          + sup_{0≤x≤X} ||A(x)|| · sup_{D(δ)} || W^X_{n,β_0}(x) − W^X_{n,β_0}(u) || + o_p(1) .          (3.76)

Recall that A(x) = [ I_p : −I^{12}_(x)(β_0)·[I^{22}_(x)(β_0)]^{-1} ] and that, from part (iii) of Lemma 3.6, each element of I_(x)(β_0) is uniformly continuous in x on [0,X]. Hence, each element of A(x) is uniformly continuous on [0,X]. It follows, therefore, that

    sup_{D(δ)} || A(x) − A(u) || → 0  as δ ↓ 0 ,                                                     (3.77)

and, since [0,X] is a compact set,

    sup_{0≤x≤X} || A(x) || = O(1) ,  that is, sup_{0≤x≤X} || A(x) || is finite.                      (3.78)

Furthermore, Theorem 3.1 implies that sup_{0≤x≤X} || W^X_{n,β_0}(x) || is bounded in probability, that is,

    sup_{0≤x≤X} || W^X_{n,β_0}(x) || = O_p(1) .                                                      (3.79)

Also, from the demonstration of the tightness of W^X_{n,β_0} given in the proof of Theorem 3.1, it follows that

    sup_{D(δ)} || W^X_{n,β_0}(x) − W^X_{n,β_0}(u) || →_p 0  as δ ↓ 0 .                                (3.80)

Substitution of (3.77)-(3.80) into (3.76) establishes (3.75) and completes the proof of Theorem 3.5.  □
A time-sequential procedure for testing H_{01}: β_1 = β_1^0 (β_2 unspecified)

Using Theorem 3.5, a progressively truncated test of

    H_0: β_1 = β_1^0   (β_2 unspecified)                                                             (3.81)

can be constructed in a parallel fashion to the progressively truncated test of H_0: β = β_0 (specified) given in Section 3.5. For this purpose define

    Λ^{1,0}_n(x) = W^X_{n,β̂_0}'(x)·[ A(x;β̂_0)·I_(x)(β̂_0)·A'(x;β̂_0) ]^{-1}·W^X_{n,β̂_0}(x) ,          (3.82)

recalling that β̂_0 = ( β_1^{0}', β̂^{0}_{2,x}' )', where β_2^0 denotes the (unknown) true value of the nuisance parameters. Then, by virtue of Theorem 3.5 (under P_{β_0}), sup_{0<x≤X} { Λ^{1,0}_n(x) } is convergent equivalent in distribution to

    sup_{0<u≤X} { W*_{n,p}'(u)·[ A(u;β̂_0)·I_(u)(β̂_0)·A'(u;β̂_0) ]^{-1}·W*_{n,p}(u) } .                (3.83)

Thus, we have the following progressively truncated test procedure, which has an asymptotic (as n → ∞) level of significance equal to α (0 < α < 1): Continue experimentation as long as Λ^{1,0}_n(x) ≤ w_α and x < X, where w_α (0 < α < 1) satisfies

    P[ sup_{0<u≤X} { W*_{n,p}'(u)·[ A(u;β̂_0)·I_(u)(β̂_0)·A'(u;β̂_0) ]^{-1}·W*_{n,p}(u) } > w_α ] = α .   (3.84)
If, for the first time, for some x = x_n (≤ X), Λ^{1,0}_n(x_n) > w_α, terminate the experiment at time x_n and reject H_0. If no such x_n (≤ X) exists, then stop at the predetermined time X and do not reject H_0.

The general procedure outlined in Section 5.2 of Chapter V may be used to determine the critical values, w_α, empirically. Also, from (3.82) and (3.84), it is important to note that these critical values, as well as the test statistic, Λ^{1,0}_n(x), depend on β̂^0_{2,x}, which must be obtained from the data. For this reason, the study is typically run for a specified time, x_1 units say, the nuisance parameters are estimated at this time, and the trial is monitored over [x_1,X]. In applications, x_1, which is specified in advance, might be chosen so that (0,x_1] is a certain percentage (e.g., 10% or 20%) of the entire time interval (0,X], or such that a certain percentage of the n lifetimes are expected to be observed prior to time x_1. Further, note that, strictly speaking, β̂^0_{2,x} should be updated for each x. In practice, however, it will usually be sufficient to update these estimates periodically; a detailed example and further discussion are provided in Example 5.4.2 of Chapter V. Finally, the same comments given for the time-sequential procedure of Section 3.5 regarding the possible erratic behavior of the test statistic as x → 0 for some specific competing risks models apply for this case as well. However, since the nuisance parameters must be estimated, the test statistic will not be monitored until time x_1 > 0 anyway, which eliminates this potential problem. Of course, if Λ^{1,0}_n(x) is not well-behaved for x close to 0, the supremum in (3.84) must be taken over [x_1,X], although this may be done even if Λ^{1,0}_n(x) is well-behaved as x → 0.
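Computationally, the progressively truncated procedure just described amounts to evaluating the quadratic form (3.82) on a grid of monitoring times and stopping the first time it exceeds w_α. A schematic sketch follows (Python; `score_beta1`, `info_matrix`, and `estimate_nuisance` are hypothetical user-supplied functions standing in for the model-specific quantities, and `w_alpha` for a critical value obtained by the simulation method of Chapter V):

```python
import numpy as np

def A_matrix(I_full, p):
    I12, I22 = I_full[:p, p:], I_full[p:, p:]
    return np.hstack([np.eye(p), -I12 @ np.linalg.inv(I22)])

def monitor_trial(times, data, beta1_null, p, w_alpha,
                  score_beta1, info_matrix, estimate_nuisance):
    """Progressively truncated test of H0: beta_1 = beta1_null with nuisance beta_2.
    Returns (decision, stopping_time)."""
    n = len(data)
    for x in times:                                            # x_1 <= ... <= X
        beta2_hat = estimate_nuisance(data, x, beta1_null)     # MLE of beta_2 under H0 at time x
        beta_hat0 = np.concatenate([beta1_null, beta2_hat])
        W = score_beta1(data, x, beta_hat0) / np.sqrt(n)       # (p,) vector W^X_{n, beta_hat0}(x)
        I = info_matrix(data, x, beta_hat0)                    # (q, q) truncated information
        A = A_matrix(I, p)
        Sigma = A @ I @ A.T                                    # covariance of W under H0
        lam = W @ np.linalg.solve(Sigma, W)                    # Lambda^{1,0}_n(x), cf. (3.82)
        if lam > w_alpha:
            return "reject H0", x                              # early termination
    return "do not reject H0", times[-1]
```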
Note that for each 0 < x ≤ X, under P_{β_0},

    Λ^{1,0}_n(x) →_D χ²_p ,                                                                          (3.85)

where χ²_p denotes the chi-squared distribution with p degrees of freedom. Also, it should be noted that, from part (iii) of Lemma 3.7, under H_0 the "observed" truncated information matrix, I°_(x)(β̂_0), can replace I_(x)(β̂_0) in equations (3.82)-(3.85).

3.7.
Staggered Entry
Very often all the patients cannot be entered into a clinical trial simultaneously; instead, they are recruited and entered over an interval of time. This type of entry is called "staggered" entry, and to handle this situation we assume that the investigator can specify, in advance, an interval, [0,X_0], over which all patients will enter the trial. Also, the trial is monitored over the interval [X_0,X], that is, we do not begin testing the null hypothesis in (3.1) or (3.57) until all the patients have been enrolled.

As in the preceding sections, let x denote the time from the entry of the first patient until the present, x ∈ [0,X]; also, let x_{0j}, j = 1,...,n, denote the time at which the jth patient enters the study, x_{0j} ∈ [0,X_0] (x_{01} = 0). In this case, however, the patients will have varying amounts of time on treatment, that is, truncation of the study at time x ∈ [X_0,X] corresponds to truncating the jth individual at time

    x_j = x − x_{0j} ,   j = 1,...,n .

This leads us to define, for x ∈ [X_0,X] and j = 1,...,n,

    T̃_{j:x} = T_j  if T_j < x_j ,   T̃_{j:x} = x_j  if T_j ≥ x_j ,                                    (3.86)
    Δ̃_{j:x} = Δ_j  if T_j < x_j ,   Δ̃_{j:x} = 0  if T_j ≥ x_j ,                                      (3.87)

    f̃*_{j:x}(t_{j:x}, δ_{j:x}; β) = f_j(t_{j:x}, δ_{j:x}; β)  if t_{j:x} < x_j and δ_{j:x} ∈ E_k ,
                                  = S_j(x_j)                 if t_{j:x} = x_j and δ_{j:x} = 0 ,
                                  = 0                         otherwise ,                             (3.88)

and

    L̃*_{n,x}(β) = Π_{j=1}^n f̃*_{j:x}(β) .                                                            (3.89)

Similarly, using (3.87)-(3.89), let

    Ĩ_(x)(β) = (1/n) Σ_{j=1}^n Ĩ_{j:x}(β) ,                                                          (3.90)

where, for r,s = 1,...,q, the element Ĩ^{r,s}_{j:x}(β) is obtained from I^{r,s}_{j:x}(β) by replacing the common truncation time x with the individual truncation time x_j of (3.86).                       (3.91)

Also, let

    Ã(x) = [ I_p : −Ĩ^{12}_(x)(β_0)·[Ĩ^{22}_(x)(β_0)]^{-1} ] ,                                        (3.92)

where Ĩ_(x)(β_0) is partitioned as in (3.60), and let

    W̃^X_{n,β_0} = { W̃^X_{n,β_0}(x) = n^{-1/2} [ ∂ log L̃*_{n,x}(β)/∂β ]|_{β=β_0} , X_0 ≤ x ≤ X }_{q×1}   (3.93)

and

    W̃^X_{n,β̃̂_0} = { W̃^X_{n,β̃̂_0}(x) , X_0 ≤ x ≤ X }_{p×1} ,                                          (3.94)
where β̃̂_0 = ( β_1^{0}', β̃̂^{0}_{2,x}' )' and β̃̂^0_{2,x} denotes the analogue of β̂^0_{2,x}. Similarly, let W̃*_n and W̃*_{n,p} denote the analogues of W*_n and W*_{n,p}, the q- and p-variate Gaussian functions of Theorems 3.1 and 3.5, and let Λ̃^0_n(x) and Λ̃^{1,0}_n(x) denote the analogues of Λ^0_n(x) and Λ^{1,0}_n(x) defined in (3.50) and (3.82). It should be emphasized that although the test statistics and related quantities are still indexed by x, the actual computation of these quantities requires the individual truncation times x_1, x_2, ..., x_n induced by x and defined in (3.86).

Noting that the f̃*_{j:x} are still independent, it can be seen that, using the notation established above, Theorems 3.1 and 3.5 (and their corresponding Lemmas and Corollaries) are valid with W^X_{n,β_0} and W^X_{n,β̂_0} replaced by W̃^X_{n,β_0} and W̃^X_{n,β̃̂_0}, and with W*_n and W*_{n,p} replaced by W̃*_n and W̃*_{n,p}. It follows that a progressively truncated test of H_0: β = β_0 (specified) can be constructed in the same fashion as the procedure given in Section 3.5 with the appropriate notational (and therefore computational) substitutions. In particular, Λ̃^0_n(x) is the test statistic, which is monitored over [X_0,X], and the supremums are taken over this interval as well (instead of [ε,X]). Similarly, using Λ̃^{1,0}_n(x) as the test statistic, a progressively truncated test of H_0: β_1 = β_1^0 (β_2 unspecified) can be constructed in a parallel fashion to the procedure of Section 3.6.
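Under staggered entry the only computational change relative to the earlier sections is that a single calendar time x induces individual truncation times x_j = x − x_{0j}. A minimal sketch of forming the truncated data of (3.86)-(3.87) (Python; the variable names are illustrative assumptions):

```python
import numpy as np

def staggered_truncate(x, entry_times, lifetimes, causes):
    """Given calendar time x (>= X0, so all patients have entered), entry times x_{0j},
    lifetimes T_j, and cause-indicator vectors delta_j, return the truncated data."""
    xj = x - np.asarray(entry_times)                       # individual truncation times x_j
    T = np.asarray(lifetimes, dtype=float)
    observed = T < xj                                      # lifetime seen before truncation
    T_trunc = np.where(observed, T, xj)                    # T~_{j:x}, cf. (3.86)
    delta_trunc = np.where(observed[:, None], causes, 0)   # Delta~_{j:x} = 0 if truncated, cf. (3.87)
    return T_trunc, delta_trunc

# Example: 3 patients, 2 competing risks, monitored at calendar time x = 5.0
T_t, d_t = staggered_truncate(5.0, entry_times=[0.0, 1.5, 3.0],
                              lifetimes=[4.2, 2.0, 9.0],
                              causes=np.array([[1, 0], [0, 1], [1, 0]]))
print(T_t, d_t, sep="\n")
```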
3.8.
Applications to Type II Censoring

Although the primary function of Theorems 3.1 and 3.5 is to provide time-sequential procedures for competing risks models, it will be shown in this section that these results also provide a means of analyzing Type II censored data. In this setting the data are analyzed only once, after a predetermined number of deaths, r (0 < r < n), have occurred. Letting T_{r,n} denote the random variable representing the time of the rth failure (out of n subjects), we assume that

    T_{r,n} →_p ξ_p ,                                                                               (3.95)

where ξ_p is a fixed time point. Also, we can write

    W^X_{n,β_0}(T_{r,n}) = W^X_{n,β_0}(ξ_p) + [ W^X_{n,β_0}(T_{r,n}) − W^X_{n,β_0}(ξ_p) ] ,

where, by (3.95) and the tightness of W^X_{n,β_0}, the quantity in brackets is o_p(1). Thus, since ξ_p is a fixed time point, Theorem 3.1 implies that W^X_{n,β_0}(T_{r,n}) is asymptotically normal with mean 0 and variance I_(ξ_p)(β_0). From this discussion we obtain the following corollary, which is useful for testing H_0: β = β_0 (specified) vs. H_1: β ≠ β_0 in the Type II censoring scheme described above.
Corollary 3.8:  Under the assumptions of Theorem 3.1 and (3.95),

    W^X_{n,β_0}'(T_{r,n})·[ I_(T_{r,n})(β_0) ]^{-1}·W^X_{n,β_0}(T_{r,n}) →_D χ²_q .

Note that in any given application, the value of T_{r,n} realized by the specific group of subjects under study, t_{r,n}, is used in the computation of the test statistic.
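Corollary 3.8 turns the Type II censored analysis into a single chi-squared test evaluated at the observed rth failure time t_{r,n}. A schematic sketch (Python; `score_vector` and `info_matrix` are hypothetical model-specific functions, and the simple record attributes `time`/`failed` are assumptions made only for illustration):

```python
import numpy as np
from scipy import stats

def type2_test(data, r, beta0, score_vector, info_matrix, alpha=0.05):
    """Test H0: beta = beta0 once the r-th failure has occurred (Corollary 3.8)."""
    failure_times = np.sort([obs.time for obs in data if obs.failed])
    t_rn = failure_times[r - 1]                        # observed r-th failure time t_{r,n}
    n = len(data)
    W = score_vector(data, t_rn, beta0) / np.sqrt(n)   # W^X_{n,beta0}(t_{r,n})
    I = info_matrix(data, t_rn, beta0)                 # I_(t_{r,n})(beta0)
    chi2 = W @ np.linalg.solve(I, W)
    q = len(beta0)
    return chi2, bool(chi2 > stats.chi2.ppf(1 - alpha, df=q))
```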
The parallel result for testing the hypothesis H_0: β_1 = β_1^0 (β_2 unspecified) in the presence of nuisance parameters [β = (β_1', β_2')'] follows similarly from (3.95) and Theorem 3.5.

Corollary 3.9:  Under the assumptions of Theorem 3.5 and (3.95),

    W^X_{n,β̂_0}'(T_{r,n})·[ A(T_{r,n};β̂_0)·I_(T_{r,n})(β̂_0)·A'(T_{r,n};β̂_0) ]^{-1}·W^X_{n,β̂_0}(T_{r,n}) →_D χ²_p .

Although Corollaries 3.8 and 3.9 provide the results needed to analyze competing risks data arising from a Type II censoring scheme, they rely on assumption (3.95), the reasonableness of which we now address.
Let T_1,...,T_n denote observations from F_1,...,F_n, and let T_{n,1} ≤ ... ≤ T_{n,n} denote the order statistics of this sample. Define

    F̄_n(t) = (1/n) Σ_{j=1}^n F_j(t) ,

and let ξ_p^{(n)} be such that F̄_n(ξ_p^{(n)}) = p. It is well known that, if r/n → p ∈ (0,1), then T_{n,r} − ξ_p^{(n)} →_p 0 as n → ∞. Further, if F̄_n is such that F̄_n → F as n → ∞, where F is independent of n, and ξ_p is such that F(ξ_p) = p, then T_{n,r} →_p ξ_p. Since the above discussion is applicable to the competing risks setting, and since, in practical applications, F̄_n will converge for large n, assumption (3.95) is indeed reasonable.
CHAPTER IV
REPEATED SIGNIFICANCE TESTING:
A SEQUENTIAL PROCEDURE
FOR PARAMETRIC COMPETING RISKS MODELS
4.1
Introduction
It is clear that unless the lifetimes under study are extremely short,
(classical) sequential methods are of little practical importance in the
analysis of failure time (lifetime) data.
Nevertheless, in this chapter
it is shown that the development of such methods for parametric competing
risks models is at least mathematically feasible.
An appropriate
sequential procedure for these models is proposed in the context of
repeated significance testing.
In this setting the experimenter obtains
observations sequentially and performs a test of an appropriate hypothesis after the addition of each observation.
As soon as the accumulated
statistical evidence permits, the null hypothesis is rejected and the
experiment terminated.
If, however, the sample size reaches a pre-
determined maximum number without rejection of the null hypothesis, the
experiment is terminated and the hypothesis is accepted.
A basic
invariance principle for parametric competing risks models which allows
the sequential approach just described is formulated in Section 4.2 and
proved in Section 4.3 where some comments on its assumptions are also
made.
Applications to life testing experiments are discussed in
Section 4.4 and the remarks of Section 4.5 conclude Chapter IV.
4.2
Preliminary Notions and the Basic Invariance Principle
We wish to develop an asymptotic procedure to sequentially test
    H_0: β = β_0   against   H_1: β ≠ β_0                                                            (4.1)
with respect to the competing risks model based on the underlying distributions specified in (2.1) or (2.6) as the sample size increases over
{l,2, ••• ,n}, where
n
denotes the (predetermined) maximum sample size
which is attained only if the test procedure based on fewer observations does not reject H_0. If this maximum sample size is reached without rejection of H_0, the experiment is terminated and H_0 is accepted.
Although the procedure just described, which is often called "repeated significance testing", is sequential in the sense that observations are obtained sequentially, it is not the "classical" sequential procedure pioneered by Wald. In the classical setting, the alternative hypothesis is usually specified, as are the type I and type II errors, and typically the expected stopping number is finite. If, however, the alternative hypothesis is not specified, the sample size may grow indefinitely large.

In the repeated significance testing scheme, on the other hand, a maximum sample size and type I error are specified, while the alternative hypothesis and type II error are left unspecified. In this regard it is similar to the ordinary fixed sample size design, except that the test statistic is repeatedly updated as each new observation is added. This allows the possibility of early termination based on the accumulated statistical evidence and may lead to a considerable savings of both time and money. Since it is hard to imagine a practical application in which the investigator cannot specify a sample size beyond which, due to economic and/or time constraints, he is unwilling to continue, the repeated significance testing procedure is the only sequential scheme that will be considered. In the development that follows we require that n, the maximum sample size, be large.
Following the notation of Chapter II, let β (q×1) denote the vector of parameters associated with model (2.1) or (2.6) as given in (2.12), and denote the density of (T_j,Δ_j), j = 1,...,n, by

    f_j(β) = Π_{i=1}^k [ g_i(t_j;β) ]^{δ_ij} · S_{T_j}(t_j;β) .                                       (4.2)

Also, let

    ḟ_j(β) = ∂ log f_j(β)/∂β = [ ḟ^r_j(β) ]_{q×1} ,                                                   (4.3)

    f̈_j(β) = ∂² log f_j(β)/∂β∂β' = [ f̈^{r,s}_j(β) ]_{q×q} ,                                           (4.4)

and

    I_n(β) = (1/n) Σ_{j=1}^n I_j(β) ,                                                                 (4.5)

where I_j(β) denotes the information matrix for the jth individual, that is,

    I_j(β) = E[ ḟ_j(β)·ḟ_j'(β) ] = ( E[ ḟ^r_j(β)·ḟ^s_j(β) ] )_{q×q} .                                 (4.6)
Using this notation and keeping in mind the requirements of the sequential procedure outlined above, we are led to define the following vector-valued process for every n (≥ 1) and β ∈ R^q:

    Z_{n,β} = { Z_{n,β}(x) = n^{-1/2} Σ_{j=1}^{[nx]} ḟ_j(β) , 0 ≤ x ≤ 1 } ,                            (4.7)

where [w] denotes the greatest integer ≤ w. Also, Z_{n,β}(x) can be expressed componentwise as

    Z_{n,β}(x) = ( Z_{n,β_1}(x) , ... , Z_{n,β_q}(x) )' ,                                             (4.8)

and we define Z_{n,β}(x) = 0 if [nx] < 1. Note that Z_{n,β} is a vector-valued stochastic process with a continuous time parameter, x, although each of its components, Z_{n,β_r}(x), r = 1,...,q, is a step function with jumps at x_1 = {x: [nx] = 1}, x_2 = {x: [nx] = 2}, ..., x_n = {x: [nx] = n}, that is, whenever a new observation is obtained. Thus, for every n (≥ 1) and β ∈ R^q, Z_{n,β} belongs to the D^q[0,1] space endowed with the (extended) Skorohod J_1 topology.

Finally, let Z*_n = { Z*_n(x), 0 ≤ x ≤ 1 } be a q-variate Gaussian function on [0,1] such that EZ*_n = 0 and E[Z*_n(u)][Z*_n(v)]' = I_{[n(u∧v)]}(β) for every (u,v): 0 ≤ u,v ≤ 1, where u∧v = min(u,v) and, for 0 ≤ x ≤ 1,

    I_{[nx]}(β) = (1/n) Σ_{j=1}^{[nx]} I_j(β) .                                                       (4.9)
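The process in (4.7) is simply a normalized partial sum of the per-subject score vectors, evaluated as observations accrue. A minimal computational sketch (Python; the (n, q) array of scores is assumed to be supplied by a model-specific routine):

```python
import numpy as np

def Z_process(scores):
    """Given an (n, q) array whose j-th row is the score f-dot_j(beta) of the j-th
    observation, return the step-function values Z_{n,beta}(x_k), k = 1,...,n, of (4.7)."""
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    return np.cumsum(scores, axis=0) / np.sqrt(n)    # row k is Z at any x with [nx] = k

# For an arbitrary x in [0,1], Z_{n,beta}(x) is the row with index [nx] - 1, or 0 if [nx] < 1.
```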
Now that the required notation has been introduced, we are in a position to investigate the weak convergence of Z_{n,β}, and for this purpose make the following assumptions:

For j = 1,...,n, all t_j > 0, δ_j ∈ E_k = { (1,0,...,0)_{1×k},...,(0,...,0,1) }, and all β ∈ R^q (or an open subset of R^q if a restricted parameter space is indicated), f_j(t_j,δ_j;β) > 0 and has (continuous) first and second order partial derivatives with respect to β.                                     (4.10)

For r,s = 1,...,q, j = 1,...,n, and all β ∈ R^q, condition (2.15) of Chapter II [equivalently, (3.12)] holds, so that differentiation with respect to β may be passed under the integral sign.          (4.11)

For some γ > 0 and each s = 1,...,q,

    E| ḟ^s_j(β) |^{2+γ} ≤ K_s < ∞   for all j = 1,...,n .                                             (4.12)

For any given u_1,...,u_m, m ≥ 1, such that 0 ≤ u_1 ≤ ... ≤ u_m ≤ 1,

    lim inf_{n→∞} ch_{mq}(D_n) > 0 ,                                                                  (4.13)

where ch_{mq}(D_n) denotes the smallest (mqth) characteristic root of the mq×mq matrix

    D_n = [ I_{[n(u_t ∧ u_{t'})]}(β) ]_{t,t'=1,...,m} ,                                               (4.13a)

that is, the (t,t') block of D_n (each block q×q) is I_{[n·min(u_t,u_{t'})]}(β).

For r = 1,...,q,

    I^{r,r}_j(β) ≤ K_r < ∞   for all j = 1,...,n ,                                                    (4.14)

where I^{r,r}_j(β) denotes the rth diagonal element of I_j(β).
Theorem 4.1:
Under
and assumptions (4.10) - (4.14), Z Q
-n,p
is convergent equivalent in distribution to
J
l
topology on
Remark:
~
*
n
in the (extended) -
nq[O,l].
This means that the distributions of
Z
Q
-n,~
and
Z*
-n
are asymptotically equivalent, though neither sequence necessarily
converges to a limit.
In Rractica1 applications, however, such limits
will exist, and we may write
-e
4.3
~
Q
.~
*
V ~,
~
where
Z*
= lim Z*
n~
-n
Proof of Theorem 4.1
Before proceeding with the main proof it will be convenient to establish the following two lemmas.

Lemma 4.2:  Under the regularity conditions of Section 4.2,

    E[ Z_{n,β}(x) ] = 0_{q×1}   for all x ∈ [0,1] ,                                                   (4.15)

and for every (u,v): 0 ≤ u,v ≤ 1,

    E[ Z_{n,β}(u) ][ Z_{n,β}(v) ]' = I_{[n(u∧v)]}(β) .                                                (4.16)

Proof:  Using assumption (4.11) it follows from line (2.22) of Chapter II that, for j = 1,...,n,

    E[ ḟ_j(β) ] = 0 ,                                                                                 (4.17)

and hence,

    E[ ḟ_j(β)·ḟ_j'(β) ] = I_j(β) .                                                                    (4.18)

Therefore, (4.15) follows directly from (4.17) and the definition of Z_{n,β}(x) given in (4.7). Further, for any 0 ≤ u,v ≤ 1, if u ≤ v, then by (4.17), (4.18), and the independence of the ḟ_j(β),

    E[ Z_{n,β}(u) ][ Z_{n,β}(v) ]' = (1/n) Σ_{j=1}^{[nu]} E[ ḟ_j(β)·ḟ_j'(β) ] = (1/n) Σ_{j=1}^{[nu]} I_j(β) = I_{[nu]}(β) ,

from (4.9). Similarly, if v ≤ u, we obtain I_{[nv]}(β), and therefore E[ Z_{n,β}(u) ][ Z_{n,β}(v) ]' = I_{[n(u∧v)]}(β), which establishes (4.16) and completes the proof of Lemma 4.2.  □
Let B_{n,x} be the σ-field generated by { (T_j,Δ_j), j = 1,...,[nx] } for every 0 ≤ x ≤ 1, and let B_{n,0} be the trivial σ-field. Then, for every n (≥ 1), B_{n,x} is nondecreasing in x ∈ [0,1].

Lemma 4.3:  For every n (≥ 1) and every r = 1,...,q, { Z_{n,β_r}(x), B_{n,x}, 0 ≤ x ≤ 1 } is a martingale.

Proof:  For 0 ≤ w ≤ x ≤ 1 and every r = 1,...,q, (4.8) yields

    E[ Z_{n,β_r}(x) | B_{n,w} ]
        = n^{-1/2} Σ_{j=1}^{[nw]} E[ ḟ^r_j(T_j,Δ_j;β) | B_{n,w} ]
          + n^{-1/2} Σ_{j=[nw]+1}^{[nx]} E[ ḟ^r_j(T_j,Δ_j;β) | B_{n,w} ] .                             (4.19)

But for j = 1,...,[nw], E[ ḟ^r_j(T_j,Δ_j;β) | B_{n,w} ] = ḟ^r_j(β), and, from the independence of the ḟ_j(β), for j = [nw]+1,...,[nx] we obtain E[ ḟ^r_j(T_j,Δ_j;β) | B_{n,w} ] = E[ ḟ^r_j(β) ] = 0, by (4.17). Using these results in (4.19) yields

    E[ Z_{n,β_r}(x) | B_{n,w} ] = n^{-1/2} Σ_{j=1}^{[nw]} ḟ^r_j(β) + 0 = Z_{n,β_r}(w) .  □
Next, we show that the finite dimensional distributions (fdd) of Z_{n,β} are convergent equivalent to those of Z*_n. This requires that for any given u_1,...,u_m, m ≥ 1, such that 0 ≤ u_1 ≤ ... ≤ u_m ≤ 1,

    Z = ( Z_{n,β}'(u_1), ..., Z_{n,β}'(u_m) )'_{mq×1} →_D N_{mq}( 0 , D_n ) ,                          (4.20)

that is, for any arbitrary real vector ℓ_{mq×1} ≠ 0,

    ℓ'Z / √(ℓ'D_nℓ) →_D N_1(0,1) ,                                                                    (4.21)

where D_n has been defined in (4.13a).

For notational convenience let k_1 = [nu_1],...,k_m = [nu_m], so that k_1,...,k_m are integers with 0 ≤ k_1 ≤ ... ≤ k_m ≤ n. Also, let ℓ' = (ℓ_1',...,ℓ_m'), where ℓ_t' = (ℓ_{t1},...,ℓ_{tq}) is 1×q, for t = 1,...,m. Using this notation, (4.8), and (4.20), we obtain

    ℓ'Z = n^{-1/2} [ Σ_{j=1}^{k_1} ℓ_1'ḟ_j(β) + Σ_{j=1}^{k_2} ℓ_2'ḟ_j(β) + ... + Σ_{j=1}^{k_m} ℓ_m'ḟ_j(β) ]
        = n^{-1/2} [ Σ_{j=1}^{k_1} (Σ_{t=1}^m ℓ_t)'ḟ_j(β) + Σ_{j=k_1+1}^{k_2} (Σ_{t=2}^m ℓ_t)'ḟ_j(β)
                     + ... + Σ_{j=k_{m-1}+1}^{k_m} ℓ_m'ḟ_j(β) ] .                                      (4.22)

Now, define

    c_j = ( c_{1j},...,c_{mj} )'_{m×1} ,  where c_{tj} = 1 if j ≤ k_t and c_{tj} = 0 if j > k_t ,      (4.23)

for t = 1,...,m, and let

    c*_j = c_j ⊗ I_q   (mq×q) ,

where I_q denotes the q×q identity matrix and ⊗ denotes the Kronecker product. Note that for k_{r-1}+1 ≤ j ≤ k_r,

    c_j = ( 0,...,0, 1,...,1 )' ,                                                                      (4.24)

with the first r−1 entries equal to 0 and the remaining m−r+1 entries equal to 1; consequently, c*_j = c_j ⊗ I_q has a zero top block of dimension (r−1)q×q and a bottom block of dimension (m−r+1)q×q consisting of stacked copies of I_q. It follows that

    ℓ'c*_j = Σ_{t=r}^m ℓ_t'   for k_{r-1} < j ≤ k_r ,                                                  (4.25)

and since this result is true for r = 1,...,m (with k_0 = 0), equation (4.22) can be rewritten as
    ℓ'Z = n^{-1/2} [ Σ_{j=1}^{k_1} ℓ'c*_j ḟ_j(β) + Σ_{j=k_1+1}^{k_2} ℓ'c*_j ḟ_j(β)
                     + ... + Σ_{j=k_{m-1}+1}^{k_m} ℓ'c*_j ḟ_j(β) ] .                                   (4.26)

That is,

    ℓ'Z = Σ_{j=1}^{k_m} Y_j ,  where Y_j = n^{-1/2} ℓ'c*_j ḟ_j(β) , j = 1,...,k_m ,                    (4.27)

are independent random variables. From (4.17) and (4.18) it follows that, for j = 1,...,k_m,

    E[Y_j] = 0                                                                                         (4.28)

and

    Var(Y_j) = n^{-1} ℓ'·c*_j I_j(β) c*_j'·ℓ .                                                         (4.29)

Note that: (i) for 1 ≤ j ≤ k_1, c*_j I_j(β) c*_j' is the mq×mq matrix with every q×q block equal to I_j(β); (ii) for k_1+1 ≤ j ≤ k_2, it is the mq×mq matrix whose (t,t') block equals I_j(β) for t,t' ≥ 2 and is zero otherwise; and so on, until (m) for k_{m-1}+1 ≤ j ≤ k_m, the only nonzero block is the (m,m) block, which equals I_j(β).                                                                    (4.30)-(4.32)

Using (4.29)-(4.32) we obtain

    s_n² = Σ_{j=1}^{k_m} Var(Y_j)
         = n^{-1} ℓ'·[ Σ_{j=1}^{k_1} c*_j I_j(β) c*_j' + Σ_{j=k_1+1}^{k_2} c*_j I_j(β) c*_j'
                       + ... + Σ_{j=k_{m-1}+1}^{k_m} c*_j I_j(β) c*_j' ]·ℓ ,

and since, by definition (4.9), I_{[nx]}(β) = n^{-1} Σ_{j=1}^{[nx]} I_j(β) for all x ∈ [0,1], and k_1 = [nu_1], k_2 = [nu_2],...,k_m = [nu_m], the bracketed sum, divided by n, assembles blockwise into the matrix D_n given in (4.13a). From this and the definition of D_n we have

    s_n² = ℓ'D_nℓ ,                                                                                    (4.33)

and the normalized sum in (4.21) may be written as ( Σ_{j=1}^{k_m} Y_j )/s_n with Liapunov ratio

    ρ_n = Σ_{j=1}^{k_m} E|Y_j|^{2+γ} / s_n^{2+γ}                                                       (4.34)

for all ℓ ≠ 0. But, as noted in the proof of Theorem 3.1 of Chapter III (and Theorem 2.1 of Chapter II), the ratio ρ_n is scale invariant, and it is therefore sufficient to show that ρ_n → 0 as n → ∞ for all ℓ such that ℓ'ℓ = 1. Using assumption (4.13) and proceeding as in the proof of Theorem 3.1, we obtain

    lim inf_{n→∞} ( ℓ'D_nℓ )^{1+γ/2} > 0   for all ℓ such that ℓ'ℓ = 1 .                               (4.35)

Let (ℓ'c*_j)_s denote the sth component of ℓ'c*_j, s = 1,...,q. Then, using (4.27) and applying the c_r-inequality,

    E|Y_j|^{2+γ} ≤ n^{-1-γ/2}·q^{1+γ}·max_{1≤j≤k_m, 1≤s≤q} |(ℓ'c*_j)_s|^{2+γ}·Σ_{s=1}^q E| ḟ^s_j(β) |^{2+γ} ,

and, by assumption (4.12),

    E|Y_j|^{2+γ} ≤ n^{-1-γ/2}·max_{1≤j≤k_m, 1≤s≤q} |(ℓ'c*_j)_s|^{2+γ}·q^{2+γ}·max_{1≤s≤q} K_s
                 = n^{-1-γ/2}·K' ,   K' < ∞ .                                                          (4.36)

From this it follows that ρ_n → 0 as n → ∞, using (4.35), and therefore Liapunov's CLT can be applied to obtain (4.21). This establishes (4.21) and completes the proof of the convergence equivalence of the fdd's of Z_{n,β} and Z*_n.
Finally, it must be shown that Z_{n,β} is tight, and for this purpose define, for every 0 < δ < 1 and r = 1,...,q,

    ω_δ^r(t) = sup{ | t_r(x) − t_r(u) | : 0 ≤ u ≤ x ≤ (u+δ)∧1 } ,                                       (4.37)

where t_r denotes the rth component of a q×1 vector t. Then, as noted in the proof of Theorem 3.1 [lines (3.41)-(3.44)], to establish the tightness of Z_{n,β} it is sufficient to show that, for all r = 1,...,q,

    lim_{δ↓0} lim_{n→∞} P{ ω_δ(Z_{n,β_r}) > ε/q } = 0   for every ε > 0 .                               (4.38)

Next, partition [0,1] into subintervals [ℓδ, (ℓ+1)δ∧1], ℓ ∈ L = { 0,1,...,[1/δ]+c' }, where c' = 0 if [1/δ] < 1/δ and c' = −1 if [1/δ] = 1/δ, so that the last subinterval has length ≤ δ. Then, using definition (4.37) and proceeding as in the proof of Theorem 3.1, it can be shown that

    ω_δ(Z_{n,β_r}) = sup{ | Z_{n,β_r}(s) − Z_{n,β_r}(u) | : 0 ≤ u ≤ s ≤ (u+δ)∧1 }
                   ≤ 2 max_{ℓ∈L} sup_{ℓδ ≤ v ≤ (ℓ+1)δ∧1} | Z_{n,β_r}(v) − Z_{n,β_r}(ℓδ) | ,

where, by Lemma 4.3, { [Z_{n,β_r}(v) − Z_{n,β_r}(ℓδ)], v ∈ [ℓδ, (ℓ+1)δ∧1] } is a separable (sub-)martingale. Thus, for every ε > 0, Lemma 3.4 of Chapter III can be applied to obtain

    P{ ω_δ(Z_{n,β_r}) > ε/q } ≤ Σ_{ℓ∈L} P{ sup_{ℓδ ≤ v ≤ (ℓ+1)δ∧1} | Z_{n,β_r}(v) − Z_{n,β_r}(ℓδ) | > ε/(10q) } .   (4.39)
But

    E[ Z_{n,β_r}((ℓ+1)δ) − Z_{n,β_r}(ℓδ) ]² = Var[ Z_{n,β_r}((ℓ+1)δ) − Z_{n,β_r}(ℓδ) ] = K^ℓ_{n,r}(δ) ,   (4.40)

say, from Lemma 4.2. Note, however, that

    K^ℓ_{n,r}(δ) = n^{-1} Σ_{j=[nℓδ]+1}^{[n(ℓ+1)δ]} I^{r,r}_j(β) ≤ n^{-1}( [n(ℓ+1)δ] − [nℓδ] )·K_r ,

by assumption (4.14), and noting that [n(ℓ+1)δ] − [nℓδ] ≤ n(ℓ+1)δ − (nℓδ − 1), this yields, for all n ≥ 1,

    K^ℓ_{n,r}(δ) ≤ δ·K_r + n^{-1}·K_r ,   0 < K_r < ∞ .                                                 (4.41)

Further, from (4.20), as n → ∞,

    P{ | Z_{n,β_r}((ℓ+1)δ) − Z_{n,β_r}(ℓδ) | > ε/(10q) }
        ≈ P{ | Z*_{n,r}((ℓ+1)δ) − Z*_{n,r}(ℓδ) | > ε/(10q) }
        ≤ 20qε^{-1}·( K^ℓ_{n,r}(δ)/2π )^{1/2}·exp{ −ε² / (200 q² K^ℓ_{n,r}(δ)) } ,                       (4.42)

since [ Z*_{n,r}((ℓ+1)δ) − Z*_{n,r}(ℓδ) ] ~ N_1( 0, K^ℓ_{n,r}(δ) ), and we can use Mills' ratio as in the proof of Theorem 3.1 to obtain this bound. So, from (4.39), (4.40), and (4.42), for every ε > 0, 0 < δ < 1, and sufficiently large n, and using (4.41),

    P{ ω_δ(Z_{n,β_r}) > ε/q }
        ≤ 20·√5·(q/ε)^{3/2}·(K_r)^{3/4}·(2π)^{-1/4}·δ^{-1/4}·exp{ −ε² / (400 q² K_r δ) } ,

since there are approximately 1/δ terms in the sum, and this bound → 0 as δ ↓ 0. Since this argument is valid for all r = 1,...,q, (4.38) has been verified, and the proof of Theorem 4.1 is complete.  □
We conclude this section with the following comments.
Remark 1:
Assumptions (4.10), (4.11), and (4.12) are identical to
assumptions (2.14) [(3.11)], (2.15) [(3.12)], and (2.18) [(3.13)] of
Chapter II [III].
Assumption (4.13) is simi1iar to assumption (2.19)
[(3.14)] of Chapter II [III] and, as argued in Remark 2 below, it will
be satisfied in practical applications, as will assumption (4.14) which
is discussed in Remark 3 below.
We note that assumption (2.17) of
Chapter II is not needed for the basic invariance principle of
Theorem 4.1 but it is required for Corollary 4.4 [line (4.49)] which
provides a consistent estimator of
I_{[nx]}(β).
Remark 2:  In Remark 2 of Section 2.4 it was noted that, in practical applications, I_n(β) converges to a limit, I(β) say, that is, I_n(β) → I(β) as n → ∞. Also, note that for 0 ≤ x ≤ 1,

    I_{[nx]}(β) = (1/n) Σ_{j=1}^{[nx]} I_j(β) = ( [nx]/n )·( [nx]^{-1} Σ_{j=1}^{[nx]} I_j(β) ) .        (4.43)

But (nx − 1)/n ≤ [nx]/n ≤ x, so for 0 ≤ x ≤ 1, [nx]/n → x, and

    I_{[nx]}(β) → x·I(β) ,   0 ≤ x ≤ 1 .                                                                (4.44)

That is, in practical applications, I_{[nx]}(β), and therefore D_n, will converge to a limit as n grows large. Further, noting that D_n is a variance-covariance matrix, an argument similar to the one given in Remark 2 of Section 3.4 shows that lim_{n→∞} ch_{mq}(D_n) > 0, and therefore assumption (4.13) will be satisfied in any application that is likely to be considered.
Remark 3:  Recall that, using assumption (4.11), it was shown in the proof of Theorem 2.1 that, for j = 1,...,n,

    I_j(β) = −E[ f̈_j(β) ] = ( −E[ f̈^{r,s}_j(β) ] )_{q×q} ,                                             (4.45)

which provides an alternate formula to (4.6) for computing I_j(β). Also, since I^{r,r}_j(β) = Var[ ḟ^r_j(β) ], r = 1,...,q, will be finite in applications, and since this quantity is a function of β and the covariables which, in typical experimental settings, do not systematically vary with j, it follows that the uniform bound required by (4.14) can be obtained by taking the maximum of the appropriate quantities over the index j, which indicates that (4.14) will be satisfied in practical applications.
4.4.
Applications to Life Testing Experiments

4.4.1.
An Asymptotic Sequential Test of H_0: β = β_0 (specified)
Using Theorem 4.1 it is possible to provide, in the context of repeated significance testing, an asymptotic sequential test of hypothesis (4.1) which is constructed in a parallel fashion to the time-sequential test of Section 3.5. For this purpose define, for 0 < x ≤ 1,

    ψ^0_n(x) = Z_{n,β_0}'(x)·I^{-1}_{[nx]}(β_0)·Z_{n,β_0}(x) ,                                          (4.46)

where

    Z_{n,β_0}(x) = n^{-1/2} Σ_{j=1}^{[nx]} { (∂/∂β) log f_j(β) }|_{β=β_0}   and   I_{[nx]}(β_0) = { I_{[nx]}(β) }|_{β=β_0} .

Then, by virtue of Theorem 4.1, under H_0: β = β_0, sup_{0<x≤1} { ψ^0_n(x) } is convergent equivalent in distribution to sup_{0<u≤1} { Z*_n'(u)·I^{-1}_{[nu]}(β_0)·Z*_n(u) }. This suggests the following sequential test, which has an asymptotic (as n → ∞) level of significance equal to α (0 < α < 1): Continue experimentation as long as ψ^0_n(x) ≤ ζ_α and [nx] ≤ n, where ζ_α (0 < α < 1) satisfies

    P[ sup_{0<u≤1} { Z*_n'(u)·I^{-1}_{[nu]}(β_0)·Z*_n(u) } > ζ_α ] = α .                                (4.47)

If, for the first time, for some x = x_n ([nx_n] = N), ψ^0_n(x_n) > ζ_α, terminate the experiment with the Nth (≤ n) observation and reject H_0.
If no such N (≤ n) exists, then stop when the sample size reaches the predetermined number n and do not reject H_0.

Although it does not appear that the critical values, ζ_α, defined in (4.47), can be analytically determined, a general procedure is outlined in Section 5.2 which provides an empirical determination of ζ_α by simulating the Gaussian process Z*_n. Note that, as remarked for the time-sequential tests of Sections 3.5 and 3.6, the quadratic forms in (4.46) and (4.47) may not be well-behaved as x → 0, which explains the use of strictly positive x for these quantities in the general description given above. In applications, the test statistic, ψ^0_n(x), is monitored over [ε,1], and the critical values, ζ_α, are determined by taking the supremum in (4.47) over this interval as well, for some appropriately chosen ε > 0. Typically, ε is chosen to be a fixed percentage of the maximum sample size, n.

It is also worth noting that, for each 0 < x ≤ 1, under P_{β_0},

    ψ^0_n(x) →_D χ²_q ,                                                                                 (4.48)

where χ²_q denotes the (central) chi-squared distribution with q degrees of freedom, and that, as shown for Λ^0_n(x) in Section 3.5, ψ^0_n(x) is invariant with respect to reparameterizations (1-1 transformations of the parameter space) of the model.

Even though (4.6) and (4.45) provide two different formulas for computing I_j(β), and hence I_{[nx]}(β), both require integrations which may be complicated for certain specific distributions. Since a consistent estimator of I_{[nx]}(β_0) can replace this matrix in (4.46) and (4.47) without affecting the asymptotic level of significance of the test procedure, the following corollary is useful in these situations.
Corollary 4.4:  Assume that conditions (4.10) and (4.11) hold, and that for some δ > 0 and all r,s = 1,...,q,

    E| f̈^{r,s}_j(β) |^{1+δ} ≤ K < ∞   for all j = 1,...,n .                                             (4.49)

Also, let

    I°_{[nx]}(β) = (1/n) Σ_{j=1}^{[nx]} I°_j(β) ,   0 ≤ x ≤ 1 ,                                         (4.50)

where

    I°_j(β) = −f̈_j(β) .                                                                                (4.51)

Then it follows that I°_{[nx]}(β) − I_{[nx]}(β) →_p 0.

Proof:  The proof follows by using Markov's LLN exactly as in Corollary 2.6 to obtain I°_m(β) − I_m(β) →_p 0, and from equation (4.43).  □

Remark:  Condition (4.49) is sufficient (but not necessary) for assumption (4.14) of Theorem 4.1 to hold.
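Operationally, the repeated significance test of Section 4.4.1, with the "observed" information of Corollary 4.4 in place of I_{[nx]}(β_0), updates the quadratic form after each new observation and stops at the first exceedance of ζ_α. A schematic sketch (Python; `score_j` and `hessian_j` are hypothetical functions returning the jth score vector and second-derivative matrix of the log-density at β_0, and `zeta_alpha` a critical value simulated as in Chapter V):

```python
import numpy as np

def repeated_significance_test(observations, beta0, zeta_alpha, score_j, hessian_j,
                               min_fraction=0.1):
    """Sequential test of H0: beta = beta0; observations arrive one at a time."""
    n = len(observations)
    q = len(beta0)
    score_sum = np.zeros(q)
    hess_sum = np.zeros((q, q))
    for k, obs in enumerate(observations, start=1):
        score_sum += score_j(obs, beta0)
        hess_sum += hessian_j(obs, beta0)
        if k < min_fraction * n:                    # epsilon: skip the erratic early stage
            continue
        Z = score_sum / np.sqrt(n)                  # Z_{n,beta0}(x) with [nx] = k, cf. (4.7)
        I_hat = -hess_sum / n                       # plug-in "observed" information, cf. (4.50)-(4.51)
        psi = Z @ np.linalg.solve(I_hat, Z)         # psi^0_n(x), cf. (4.46)
        if psi > zeta_alpha:
            return "reject H0", k                   # stop with the k-th observation
    return "do not reject H0", n
```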
4.4.2
A Test Procedure Based on Truncated Lifetimes
Even in experimental settings where the lifetimes under study are
sufficiently short so that sequential procedures might be considered it
is likely that the investigator would not want to wait too long for any
single observation.
For this reason it may be desirable to truncate each observation at time c, say, if death (failure) has not already occurred, where c is a predetermined truncation time. Fortunately, the approach taken in Section 2.7.2, which uses some results from Chapter III, will show that, under essentially the same conditions assumed for Theorem 4.1, a parallel result based on the truncated observations holds.

To see this we use the same notation as in Section 2.7.2 with c_1 = c_2 = ... = c_n = c, since the same truncation time is used for each observation in this context. Let (T_{j:c}, Δ_{j:c}), defined in (2.49), denote the truncated observations, let f*_{j:c}(β), defined in (2.50), denote the corresponding truncated densities, and let ḟ*_{j:c}(β) and f̈*_{j:c}(β) denote the truncated analogues of ḟ_j(β) and f̈_j(β). Also, for 0 ≤ x ≤ 1, define

    I_{[nx]:c}(β) = (1/n) Σ_{j=1}^{[nx]} I_{j:c}(β) ,                                                   (4.52)

where I_{j:c}(β) denotes the truncated information matrix defined in (2.51), and

    Z_{n:c,β}(x) = n^{-1/2} Σ_{j=1}^{[nx]} ḟ*_{j:c}(β) .                                                (4.53)

Finally, let Z*_{n:c} = { Z*_{n:c}(x), 0 ≤ x ≤ 1 } be a q-variate Gaussian function on [0,1] such that EZ*_{n:c} = 0 and, for every (u,v): 0 ≤ u,v ≤ 1, E[Z*_{n:c}(u)][Z*_{n:c}(v)]' = I_{[n(u∧v)]:c}(β), the truncated analogue of I_{[nx]}(β), and let D_{n:c} be defined as in (4.13a) with matrices of the form I_{[nx]}(β) replaced by I_{[nx]:c}(β). We can now state

Corollary 4.5:  Assume that the parameter space can be restricted to a compact subset of R^q, that c < ∞, that conditions (4.10)-(4.12) and (4.14) hold, and that, for any u_1 ≤ ... ≤ u_m, m ≥ 1,

    lim inf_{n→∞} ch_{mq}( D_{n:c} ) > 0 .                                                              (4.54)

Then, under P_β, Z_{n:c,β} is convergent equivalent in distribution to Z*_{n:c} in the (extended) J_1 topology on D^q[0,1].
Proof:  In the proof of Corollary 2.7 it was noted that, for j = 1,...,n, E[ ḟ*_{j:c}(β) ] = 0, and hence

    E[ ḟ*_{j:c}(β)·ḟ*_{j:c}'(β) ] = I_{j:c}(β) .                                                        (4.55)

Also, proceeding as in the proof of Corollary 2.7, we can use some results from the proof of Theorem 3.1 and Lemmas 3.6 and 3.7 to show that (4.12) and (4.14) imply

(1) for some γ > 0 and each s = 1,...,q, E| ḟ*^s_{j:c}(β) |^{2+γ} ≤ K_s < ∞ for all j,                  (4.12')

and

(2) for each r = 1,...,q, I^{r,r}_{j:c}(β) ≤ K_r < ∞ for all j.                                          (4.14')

Then, using (4.54), (4.55), (4.12'), and (4.14'), the proof of Corollary 4.5 follows exactly along the lines of the proof of Theorem 4.1.  □

Remark:  As pointed out in the remark following Theorem 3.5, the parameter space can always be restricted to a compact subset of R^q in applications. And, using the same reasoning as in Remark 2 of Section 4.3, assumption (4.54) will also be satisfied in any application that is likely to be considered. This justifies our earlier statement that a basic invariance principle similar to Theorem 4.1 holds for the truncated observations under essentially the same conditions that were assumed for Theorem 4.1.
Using Corollary 4.5, an asymptotic sequential test of H_0: β = β_0 (specified) based on the truncated observations, (T_{j:c}, Δ_{j:c}), can be constructed in the same fashion as the test of Section 4.4.1 for the untruncated observations (T_j, Δ_j). We only need to replace ψ^0_n(x) and ζ_α by ψ^0_{n:c}(x) and ζ^c_α (0 < α < 1), where

    ψ^0_{n:c}(x) = Z_{n:c,β_0}'(x)·I^{-1}_{[nx]:c}(β_0)·Z_{n:c,β_0}(x)                                   (4.56)

and ζ^c_α satisfies

    P[ sup_{0<u≤1} { Z*_{n:c}'(u)·I^{-1}_{[nu]:c}(β_0)·Z*_{n:c}(u) } > ζ^c_α ] = α .                     (4.57)

Finally, we note that, by Lemma 3.7 [using assumption (4.49)],

    I°_{m:(c)}(β_0) = (1/m) Σ_{j=1}^m I°_{j:c}(β_0) = −(1/m) Σ_{j=1}^m { f̈*_{j:c}(β) }|_{β=β_0}          (4.58)

is a consistent estimator of (1/m) Σ_{j=1}^m I_{j:c}(β_0), so, by the truncated version of (4.43),

    I°_{[nx]:(c)}(β_0)                                                                                   (4.59)

is a consistent estimator of I_{[nx]:c}(β_0).

4.5.
Concluding Remarks
As indicated earlier, applications of sequential procedures to
competing risks settings are likely to be very limited.
Nevertheless,
for those situations where they might be considered, we have shown that
it is mathematically possible to develop asymptotic sequential procedures,
in the context of repeated significance testing, for parametric competing risks models when the null hypothesis being tested completely
specifies the vector of parameters.
Although it is possible to extend
these basic results to the case when nuisance parameters are present by
using the techniques of Chapters II and III (as in the time-sequential
setting of Section 3.6), we will not provide the details due to their
limited applicability.
Finally, we note that it should also be possible to extend the results
obtained to the (perhaps more usual) sequential setting in which a
maximum sample size is not specified, that is, n = ∞.
In this case,
however, a stochastic process over the entire positive real line,
(0,∞), needs to be considered, and extra regularity conditions will be
required to verify the tightness of this process, since the demonstration of tightness given for the repeated significance testing case
will not hold on (0,∞). Again, noting its relative unimportance in applications, we leave this as a possible topic for further research.
CHAPTER V
SIMULATION OF THE DISTRIBUTION OF A FUNCTIONAL OF A
GENERAL GAUSSIAN PROCESS, SOME EXPONENTIAL COMPETING
RISKS MODELS, AND NUMERICAL ILLUSTRATIONS
5.1.
Introduction
Under suitable regularity conditions, it has been shown that the
distributions of the proposed test statistics for both the timesequential procedures of Chapter III and the sequential procedures of
Chapter IV converge weakly to those of a functional of certain Gaussian
processes under the appropriate null hypotheses.
These null distribu-
tions, however, are not found in the statistical literature, and theoretical derivation of these distributions using currently available
statistical tools seems intractable, at least in general.
For this
reason, in Section 5.2 we show how the required null distributions can
be derived empirically by providing an algorithm which simulates the
distributions of functionals of general Gaussian processes through computer generation of these processes.
Section 5.3 contains some comments
on the assumptions of the procedures of Chapters II, III, and IV, and
these regularity conditions are shown to hold for a general independent
risks exponential model in Section 5.4.
Example 5.4.1 provides a numer-
ical illustration of the progressively truncated test of Section 3.5
using a special case of this model, while Example 5.4.2 illustrates the
procedure of Section 3.6, which is appropriate when nuisance parameters
are present.
Finally, a competing risks model based on Gumbel's
bivariate exponential distribution is presented in Section 5.5, and the
assumptions required for the procedures of Chapters II, III, and IV are
verified for this model.
5.2.
Computer Generation of a General Gaussian Process and Its Use in
Simulating the Distribution of a Functional of the Process
Let G*_n = { G*_n(x), 0 ≤ x ≤ X } be a q-variate Gaussian function on C[0,X] such that, for every (x,x'): 0 ≤ x,x' ≤ X,

    EG*_n(x) = 0 ,   E[G*_n(x)][G*_n(x)]' = Σ_n(x) ,   and   E[G*_n(x)][G*_n(x')]' = Σ_n(x∧x') .        (5.1)

We wish to construct a stochastic process, G_n, which is not only convergent equivalent in distribution to G*_n, but is also easy to generate on a computer. Keeping this in mind, we proceed as follows. Let Z_1,...,Z_n be iid N_q(0, I_q) random vectors and define

    B_i = [ Σ_n(iX/n) − Σ_n((i−1)X/n) ]^{1/2}_{q×q} ,   i = 1,...,n ,   with Σ_n(0) = 0 ,

and

    G_n(x) = Σ_{i=1}^{[nx/X]} B_i Z_i ,   0 ≤ x ≤ X .                                                   (5.2)

Note that B_i is the square root of the matrix in [ ], which can be obtained, for example, by the Cholesky decomposition; it follows that this matrix must be symmetric and nonnegative definite. From (5.2), for every (x,x'): 0 ≤ x,x' ≤ X,

    EG_n(x) = 0

and

    E[G_n(x)][G_n(x')]' = Σ_{i=1}^{[nx/X]∧[nx'/X]} B_i B_i' = Σ_n( [n(x∧x')/X]·(X/n) ) ,                 (5.3)

since [nx/X] ∧ [nx'/X] = [n(x∧x')/X]. Now, suppose that, for sufficiently large n, ||Σ_n(s) − Σ_n(t)|| → 0 as |s−t| → 0. Then, since [n(x∧x')/X]·(X/n) → x∧x' as n → ∞, the sample process { G_n(x) } has the structure of the Gaussian process { G*_n(x) } as n grows large. Further, if Σ_n(x) is such that the sample process is tight, then G_n is convergent equivalent in distribution to G*_n (in the (extended) J_1 topology on D^q[0,X]).

Simulating the Distribution of a Functional of G*_n
For the test procedures of Chapters III and IV, we wish to simulate the distribution of the functional

    sup_{ε≤x≤X} { G*_n'(x)·Σ_n^{-1}(x)·G*_n(x) } ,

where ε is some appropriately chosen positive number. Guidelines for choosing ε for the test procedures of Chapters III and IV are provided in Sections 3.5, 3.6, and 4.4, immediately following the descriptions of these procedures. Also, recall that if the appropriate quadratic forms are well-behaved as x → 0, then we may take ε = 0.

Now, from (5.3) and the discussion following it,

    sup_{(nε/X)≤k≤n} { G_n'(kX/n)·Σ_n^{-1}(kX/n)·G_n(kX/n) }

is convergent equivalent in distribution to sup_{ε≤x≤X} { G*_n'(x)·Σ_n^{-1}(x)·G*_n(x) }, which suggests the following algorithm for obtaining the required simulated distribution:

(1) Generate a random sample of size n, Z_1,...,Z_n, from N_q(0, I_q).

(2) Compute B_i = { Σ_n(iX/n) − Σ_n((i−1)X/n) }^{1/2}, i = 1,...,n, where Σ_n(0) = 0.

(3) Compute G_n(kX/n) = Σ_{i=1}^k B_i Z_i, k = 1,...,n.

(4) Compute max_{ε*≤k≤n} { G_n'(kX/n)·Σ_n^{-1}(kX/n)·G_n(kX/n) }, where ε* = max{ [nε/X], 1 }.

(5) Repeat steps (1)-(4) m times, and compute the percentiles of the observed distribution of the statistic in (4).

Then, as n and m grow large, the empirically determined percentiles of step (5) will approximate the percentiles of sup_{ε≤x≤X} { G*_n'(x)·Σ_n^{-1}(x)·G*_n(x) } quite well. These percentiles can then be used directly as the critical values of the corresponding test procedures or, if desired, the empirical distribution can be smoothed by suitable Pearsonian curves (if possible), and the appropriate percentiles determined from the smoothed curve.
For the test procedures of Chapters III and IV, Σ_n(x) has the following form:

    Test Procedure      Σ_n(x)
    Section 3.5         I_(x)(β_0),                                  x ∈ [ε,X]
    Section 3.6         A(x;β̂_0)·I_(x)(β̂_0)·A'(x;β̂_0),              x ∈ [ε,X]
    Section 4.4.1       I_{[nx]}(β_0),                               x ∈ [ε,1]

For the test procedures of Sections 3.7 and 4.4.2, Σ_n(x) is an obvious modification of the forms given above. We also note that in each of the above cases, by arguments given in Chapters III and IV, either Σ_n(x) is a continuous function of x for every n, or its limit as n → ∞ is a continuous function of x, and in each of the above cases Σ_n(x) is such that the sample process G_n(x) is tight. Thus, the algorithm outlined above can be used to obtain the required simulations for all the test procedures of Chapters III and IV. In fact, the method described in this section is likely to be applicable to any general Gaussian process that occurs as the limiting process of a sample process motivated by an actual experimental setting.
5.3.
Some Comments on the Assumptions of the Procedures of Chapters II,
III, and IV
It should be noted that the following sets of regularity
conditions of Chapters II, III, and IV are equivalent:
i)   (2.14), (3.11), and (4.10),
ii)  (2.15), (3.12), and (4.11),
iii) (2.18), (3.13), and (4.12),
iv)  (2.17) and (4.49).
Also, (2.17) implies (4.14), and, as remarked earlier, conditions (2.19) [(2.19')], (3.14), and (4.13) [(4.54)] will be satisfied in practical applications. Therefore, any of the procedures of Chapters II, III, and IV may be applied if the following conditions are verified: (2.14), (2.15), (2.16), (2.17), (2.18), and (3.15).
5.4.
A General Independent Risks Exponential Model
In this example we assume that, for each individual j, the k underlying lifetimes, T_{1j},...,T_{kj}, are independent exponential random variables with means

    μ_ij = E(T_ij) = β_i'z_ij > 0 ,   i = 1,...,k ,   j = 1,...,n ,

so that

    S_{T_j}(t_j) = Π_{i=1}^k S_{T_ij}(t_j) = exp[ −t_j Σ_{i=1}^k (β_i'z_ij)^{-1} ] .                     (5.4)

Thus, from (2.8), for j = 1,...,n, the joint density of (T_j, Δ_j) is given by

    f_j(β) = f_{T_j,Δ_j}(t_j, δ_j) = Π_{i=1}^k (β_i'z_ij)^{-δ_ij} · exp[ −t_j Σ_{i=1}^k (β_i'z_ij)^{-1} ] ,   (5.5)

and

    log f_j(β) = − Σ_{i=1}^k δ_ij log(β_i'z_ij) − t_j Σ_{i=1}^k (β_i'z_ij)^{-1} .                        (5.6)

As usual, denote the entire vector of parameters by

    β = ( β_1',...,β_k' )' ,   β_i = ( β_{i1},...,β_{ip_i} )' ∈ B_i ,                                    (5.7)

where B_i ∩ B_{i'} = ∅ for all i ≠ i', that is, the parameter sets of the different risks are disjoint.
(5.8)
where
c 1j
and
c
2j
are functions of
B
but not
t •
j
Thus.
(5.9)
so
(5.10)
e-
151
Also.
(5.11)
so
(5.12)
-e
But. for all
,
j
Z -t j c 2j
= l •..•• n.
c '
ZJ
k
I
= L ~-i' >
i=l
e
so
-t.c Z·
J
J
J
and
tje
are integrable on
and
u~·s(tj).
Since this is true for all
(Z.15) is satisfied.
0
(0. 00) .
•
t
j
and therefore so are
r.s
= 1 •...• q.
e
-t.c z·
J
J
•
r
Uj(t )
j
assumption
Also, it is easy to see that assumption (Z.14)
holds as well.
Now. let
~i
be as in (5.7).
Then from (5.6).
(5.13)
Thus. for
j
= 1, ..• ,n,
since
1-6ij I < 1
and
T. ~ 0,
J
152
2+y
z
+
is,j
-= Kis,j
< max K
- lsjsn is,j
since, from (5.4),
=Kis
<
00
is exponentially distributed and therefore all
T
j
,
of its moments are finite, and since
(~i~ij)
is
= i1, ••• ,iPi'
is a vector of
> 0,
~ij
finite constants, and the same is true for
(5.14) holds for all
(5.14)
'
and all
in applications.
i
= 1, ... ,k,
Since
assump-
tion (2.18) has been verified.
Next, we note that from (5.7) and (5.13), for any i ≠ i', any β_{ir} ∈ B_i, and any β_{i's} ∈ B_{i'},

    f̈^{ir,i's}_j(β) = (∂²/∂β_{ir}∂β_{i's}) log f_j(β) = 0 ,                                              (5.15)

so assumptions (2.16) and (2.17) are satisfied trivially in this case, and we need only consider the case where β_{ir} ∈ B_i and β_{is} ∈ B_i in detail. For this case we have

    f̈^{ir,is}_j(β) = (∂²/∂β_{ir}∂β_{is}) log f_j(β)
                   = [ δ_ij (β_i'z_ij)^{-2} − 2 t_j (β_i'z_ij)^{-3} ]·z_{ir,j}·z_{is,j} ,                 (5.16)

and so, for j = 1,...,n,

    E| f̈^{ir,is}_j(β) |^{1+δ} ≤ K_{ir,is,j} ≤ max_{1≤j≤n} max_{1≤i≤k} max_{i1≤ir,is≤ip_i} K_{ir,is,j} < ∞ .   (5.17)

And since (5.17) holds for all ir,is = i1,...,ip_i and i = 1,...,k, and since, as noted above, the cross terms vanish, we have verified assumption (2.17).

Also, from (5.16) it follows that, for j = 1,...,n,

    sup_{β*(ε)} | f̈^{ir,is}_j(β*) − f̈^{ir,is}_j(β) |
        ≤ |z_{ir,j}·z_{is,j}|·sup_{β*(ε)} | (β*_i'z_ij)^{-2} − (β_i'z_ij)^{-2} |
          + 2T_j·|z_{ir,j}·z_{is,j}|·sup_{β*(ε)} | (β*_i'z_ij)^{-3} − (β_i'z_ij)^{-3} |
        ≤ (1 + 2T_j)·|z_{ir,j}·z_{is,j}|·ρ_ε ,

where ρ_ε = max( sup_{β*(ε)} |(β*_i'z_ij)^{-2} − (β_i'z_ij)^{-2}| , sup_{β*(ε)} |(β*_i'z_ij)^{-3} − (β_i'z_ij)^{-3}| ) and β*(ε) = {β*: ||β* − β|| < ε}. Therefore, since E(1 + 2T_j)^{1+η} < ∞ and ρ_ε → 0 as ε ↓ 0,

    E[ sup_{β*(ε)} | f̈^{ir,is}_j(β*) − f̈^{ir,is}_j(β) |^{1+η} ]
        ≤ E(1 + 2T_j)^{1+η}·|z_{ir,j}·z_{is,j}|^{1+η}·ρ_ε^{1+η} → 0  as ε ↓ 0 .

Because this is true for all ir,is = i1,...,ip_i and i = 1,...,k, assumption (2.16) holds.
Since assumption (3.15) involves the truncated information matrix,

    I_(x)(β) = (1/n) Σ_{j=1}^n I_{j:x}(β) ,

and since this matrix is needed to apply the methods of Chapter III, it will be convenient to derive I_{j:x}(β) at this time. Of the two expressions given in Chapter III, the one obtained in Lemma 3.7 seems easier to apply; that is, the (r,s)th element of I_{j:x}(β) is given by

    I^{r,s}_{j:x}(β) = −[ Σ_{δ_j ∈ E_k} ∫_0^x f̈^{r,s}_j(t_j,δ_j;β)·f_j(t_j,δ_j;β) dt_j
                          + S_{T_j}(x)·S̈^{r,s}_{T_j}(x) ] ,                                              (5.18)

where S̈^{r,s}_{T_j}(x) = (∂²/∂β_r∂β_s) log S_{T_j}(x). From (5.4), for ir = i1,...,ip_i and i = 1,...,k,

    Ṡ^{ir}_{T_j}(x) = (∂/∂β_{ir}) log S_{T_j}(x) = ( x·z_{ir,j} )/(β_i'z_ij)² ,                           (5.19)

while for any ir,is = i1,...,ip_i and i = 1,...,k,

    S̈^{ir,is}_{T_j}(x) = (∂²/∂β_{ir}∂β_{is}) log S_{T_j}(x) = −( 2x·z_{ir,j}·z_{is,j} )/(β_i'z_ij)³ .     (5.20)

It follows from (5.15), (5.18), and (5.20) that

    I^{ir,i's}_{j:x}(β) = 0   for all ir, i's with i ≠ i' .                                               (5.21)

On the other hand, for ir,is = i1,...,ip_i and i = 1,...,k, we can write, from (5.4), (5.5), (5.16), and (5.20),

    S_{T_j}(x) = e^{−xK_4}   and   S̈^{ir,is}_{T_j}(x) = −K_2·x ,                                          (5.22)

where

    K_2 = 2 z_{ir,j}·z_{is,j}/(β_i'z_ij)³   and   K_4 = Σ_{i=1}^k (β_i'z_ij)^{-1} .

Carrying out the integration in (5.18) in closed form [(5.23)-(5.26)] and combining the result with (5.21), we obtain the following general expression for an arbitrary element of I_{j:x}(β): for all ir = i1,...,ip_i, i's = i'1,...,i'p_{i'}, and i,i' = 1,...,k,

    I^{ir,i's}_{j:x}(β) = ( z_{ir,j}·z_{is,j} )·(β_i'z_ij)^{-3}·A^{-1}·( 1 − e^{−xA} )   if i = i' ,
                        = 0                                                               if i ≠ i' ,     (5.27)

where A = Σ_{i=1}^k (β_i'z_ij)^{-1}.
We are now in a position to verify assumption (3.15). First, note that if, for all 0 < δ < 1, each ℓ = 0,1,...,[X/δ], all ir = i1,...,ip_i, i = 1,...,k, and all n ≥ 1,

    I^{ir,ir}_{(((ℓ+1)δ)∧X)}(β) − I^{ir,ir}_{((ℓδ))}(β) ≤ K_{ir}·δ ,   K_{ir} < ∞ ,                       (5.28)

then (3.15) holds; that is, (5.28) implies (3.15). But, from (5.27), the corresponding per-individual difference is

    I^{ir,ir}_{j:((ℓ+1)δ)∧X}(β) − I^{ir,ir}_{j:(ℓδ)}(β)
        = c^{ir}_j·A^{-1}·e^{−Aℓδ}·( 1 − e^{−Aδ} ) ≤ c^{ir}_j·δ ,

since e^{−Aℓδ} ≤ 1 and 1 − e^{−Aδ} ≤ Aδ, where c^{ir}_j = z²_{ir,j}·(β_i'z_ij)^{-3} < ∞; taking K_{ir} = max_{1≤j≤n} c^{ir}_j < ∞ and averaging over j establishes (5.28) and therefore verifies (3.15).

Since assumptions (2.14)-(2.18) and (3.15) have been verified for the general independent risks exponential model of this section, the comments of Section 5.3 indicate that any of the (appropriate) procedures of Chapters II, III, or IV may be used to analyze data which follow this model. Finally, it is worthwhile to note that, letting x → ∞ in (5.27), we obtain the (untruncated) information matrix for the jth individual. Specifically, for all ir = i1,...,ip_i, i's = i'1,...,i'p_{i'}, i,i' = 1,...,k, and j = 1,...,n,

    I^{ir,i's}_j(β) = ( z_{ir,j} )( z_{is,j} )·A^{-1}·(β_i'z_ij)^{-3}   if i = i' ,
                    = 0                                                 if i ≠ i' .                        (5.29)

Thus, both I_{j:x}(β) and I_j(β) are block diagonal matrices. Also, using (5.27) and (5.29), one can readily obtain I_n(β), I_(x)(β), and I_{[nx]}(β), quantities which are needed to compute the test statistics of Chapters II, III, and IV.
5.4.1.  Numerical Illustration of the Progressively Truncated Test of Section 3.5:  Example 5.4.1

We illustrate the time-sequential procedure of Section 3.5 with a simple hypothetical example which is a special case of the general independent risks exponential model considered above.  In order to motivate the calculations that will follow, suppose a biochemist is trying to develop an improved pharmacologic treatment for a certain type of cancer.  Although similar drugs increase the survival time of cancer patients, they also tend to elicit undesirable cardiovascular side effects.  The scientist hopes that his new formulation will eliminate this side effect, and he is ready to begin animal experimentation.  From past experience, it is known that, for a particular strain of laboratory mice with induced tumors, the time to death from the form of cancer under consideration is exponentially distributed with mean $\mu_1 = 30$ days when the mice are left untreated.  Further, it is thought that the time to development of a cardiovascular problem in these (untreated) animals is independently exponentially distributed with mean $\mu_2 = 60$ days.  Since the experimenter would like some indication of whether or not the treatment is worth pursuing and developing as soon as possible, a progressively truncated test of

$$H_0:\ \mu = (\mu_1\ \ \mu_2)' = (30\ \ 60)'$$

is advocated.  That is, the experiment will be terminated as soon as the survival experience of a group $(n = 50)$ of treated animals differs significantly from the known distribution of similar untreated animals.  The scientist decides that if no difference is found by the time 100 days have elapsed, or if a significant difference is found before 100 days and the data indicate that $\mu_2 < 60$, then the treatment will not be developed.  On the other hand, the drug will be given further consideration if early termination is possible and the data indicate that $\mu_1 > 30$ and $\mu_2$ is close to 60.
Fifty mice can be accommodated in the available facilities, and from the above description an (independent risks) exponential competing risks model is appropriate, with $k = 2$ risks, a maximum study length of $X = 100$ days, and $n = 50$ mice, each having underlying lifetimes $T_{1j}$ and $T_{2j}$ with means $E(T_{ij}) = \mu_{ij} = \mu_i$ for $i = 1,2$ and all $j = 1,\ldots,n$, where $T_{1j}$ denotes the time to death from cancer, and $T_{2j}$ denotes the time to the development of cardiovascular abnormalities for the $j$th mouse.  Then $T_j = \min(T_{1j},T_{2j})$ denotes his actual "lifetime," and $\delta_j = (\delta_{1j}\ \ \delta_{2j})$ indicates the cause of "death."  Hence, from (5.4) and (5.5), the survival function (for $T_j$) and the joint density of $(T_j,\delta_j)$, $j = 1,\ldots,50$, are given by

$$S_{T_j}(t_j) = e^{-t_j A} \quad\text{and}\quad f_j(t_j,\delta_j;\mu) = e^{-t_j A}\,\mu_1^{-\delta_{1j}}\,\mu_2^{-\delta_{2j}} , \qquad (5.30)$$

where $A = 1/\mu_1 + 1/\mu_2$.  From this it follows that

$$\dot f_{j:x}^{*\,i}(\mu_0) = \Big[(\partial/\partial\mu_i)\log f_j(t_j,\delta_j;\mu)\Big]_{\mu=\mu_0}\,I[t_j \le x] + \Big[(\partial/\partial\mu_i)\log S_j(x)\Big]_{\mu=\mu_0}\,I[t_j > x] ,\qquad i = 1,2 , \qquad (5.31)$$
where $I[\,\cdot\,] = 1$ if the expression in $[\,\cdot\,]$ is true and $= 0$ if it is false.  Also, since the $(T_j,\delta_j)$, $j = 1,\ldots,50$, are iid in this example, all 50 mice have the same truncated information matrix, $I_{j:x}(\mu)$; therefore, using (5.27) with $z_{ij} = 1$ and $(\beta_i'z_{ij}) = \mu_i$ for $i = 1,2$ and all $j$,

$$I_{j:x}(\mu) = \frac{1 - e^{-xA}}{A}\begin{bmatrix}\mu_1^{-3} & 0\\ 0 & \mu_2^{-3}\end{bmatrix} , \qquad (5.32)$$

where $A = (1/\mu_1) + (1/\mu_2)$.
As indicated above, the hypothesis to be tested is

$$H_0:\ \mu = \mu_0 = (30\ \ 60)' ,$$

and a test based on a time-sequential procedure is particularly appealing.  Since $H_0$ completely specifies the vector of model parameters, it is appropriate to use the progressively truncated test of Section 3.5, which is based on monitoring the test statistic

$$\hat\Lambda_{50}^0(x) = W_{50,\mu_0}'(x)\,I_{(x)}^{-1}(\mu_0)\,W_{50,\mu_0}(x) \qquad (5.33)$$

over the interval $(0,100]$ (since $\hat\Lambda_{50}^0(x) \to 0$ as $x \to 0$, we have taken $\varepsilon = 0$), where

$$W_{50,\mu_0}(x) = n^{-1/2}\sum_{j=1}^{50}\dot f_{j:x}^*(t_j,\delta_j;\mu_0) . \qquad (5.34)$$

Thus, for a given set of data, $(t_j,\delta_{1j},\delta_{2j})$, $j = 1,\ldots,50$, one can compute $\hat\Lambda_{50}^0(x)$ for any $x \in [0,100]$ and for a given $\mu_0 = (30\ \ 60)'$ using definition (5.34) and the expressions given in (5.31) and (5.32).
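For readers who wish to reproduce such a computation, here is a minimal Python sketch of $\hat\Lambda_{50}^0(x)$ under the forms of (5.31)-(5.33) given above, with $\mu_0 = (30\ \ 60)'$.  The function and variable names, and the use of the inverse of (5.32) in the quadratic form, are assumptions of this sketch rather than the author's program.

```python
import numpy as np

def lambda_stat(x, t, d1, d2, mu0=(30.0, 60.0)):
    """Progressively truncated statistic for Example 5.4.1 (illustrative).
    t, d1, d2 are NumPy arrays of observed lifetimes and cause indicators."""
    mu1, mu2 = mu0
    lam = 1.0 / mu1 + 1.0 / mu2
    n = len(t)
    obs = t <= x
    # truncated score (5.31): observed part for t_j <= x, censored part otherwise
    score1 = np.where(obs, t / mu1**2 - d1 / mu1, x / mu1**2)
    score2 = np.where(obs, t / mu2**2 - d2 / mu2, x / mu2**2)
    W = np.array([score1.sum(), score2.sum()]) / np.sqrt(n)
    # truncated information (5.32): diagonal and common to all subjects
    I = (1.0 - np.exp(-x * lam)) / lam * np.diag([mu1**-3, mu2**-3])
    return float(W @ np.linalg.solve(I, W))          # W' I^{-1} W, as in (5.33)
```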
Now, as $x$ varies over $[0,100]$, $\hat\Lambda_{50}^0(x)$ is compared with an appropriate critical value, $\hat w_\alpha$ $(0 < \alpha < 1)$, which satisfies

$$P\Big[\sup_{0\le x\le 100}\big\{W_{50}^{*\prime}(x)\,I_{(x)}^{-1}(\mu_0)\,W_{50}^*(x)\big\} > \hat w_\alpha\Big] = \alpha . \qquad (5.35)$$

Since $W_{50}^*(x)$ is a 2-variate Gaussian function on $[0,100]$ with $E[W_{50}^*(x)][W_{50}^*(x')]' = I_{(x\wedge x')}(\mu)$ for every $x,x' \le 100$, $\hat w_\alpha$ can be empirically determined by using the method of Section 5.2 [with $\Gamma_n(x) = I_{(x)}(\mu_0)$, $\varepsilon = 0$, and $X = 100$] to simulate the distribution of

$$\sup_{0\le x\le 100}\big\{W_{50}^{*\prime}(x)\,I_{(x)}^{-1}(\mu_0)\,W_{50}^*(x)\big\} .$$

Although this method should be used with much larger values of $m$, we consider $n = m = 200$ sufficient for the purposes of this illustration.  For $X = 100$, $n = m = 200$, $\Gamma_n(x) = I_{(x)}(\mu_0)$ defined as in (5.32), and $\mu_0 = (30\ \ 60)'$, the following results were obtained:

        α:       .50     .25     .10     .05     .01
        ŵ_α:    4.87    6.52    8.97   10.77   16.19        (5.36)
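Because $E[W_{50}^*(x)][W_{50}^*(x')]' = I_{(x\wedge x')}(\mu_0)$, the limiting process has independent increments and can be built up cumulatively on a grid, so empirical critical values can be obtained by Monte Carlo in the following way.  This Python sketch is only an illustration in that spirit; the grid size, replication count, and names are assumptions, and it is not the Section 5.2 algorithm itself.

```python
import numpy as np

def simulate_w_alpha(alphas=(0.10, 0.05, 0.01), mu0=(30.0, 60.0),
                     X=100.0, n_grid=200, m=200, seed=0):
    """Rough Monte Carlo sketch of the critical values w_alpha in (5.35)-(5.36)."""
    rng = np.random.default_rng(seed)
    mu1, mu2 = mu0
    lam = 1.0 / mu1 + 1.0 / mu2
    xs = np.linspace(X / n_grid, X, n_grid)
    def info(x):                                     # truncated information (5.32)
        return (1.0 - np.exp(-x * lam)) / lam * np.diag([mu1**-3, mu2**-3])
    infos = np.array([info(x) for x in xs])
    sups = np.empty(m)
    for r in range(m):
        W = np.zeros(2)
        prev = np.zeros((2, 2))
        sup_val = 0.0
        for k in range(n_grid):
            inc_cov = infos[k] - prev                # covariance of the Gaussian increment
            W = W + rng.multivariate_normal(np.zeros(2), inc_cov)
            prev = infos[k]
            sup_val = max(sup_val, float(W @ np.linalg.solve(infos[k], W)))
        sups[r] = sup_val
    return {a: float(np.quantile(sups, 1.0 - a)) for a in alphas}
```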
163
To complete the illustration, a set of data,
j • 1, ••• ,50,
is needed, and it was simulated as follows:
(a)
Generate observations,
(c)
Repeat steps (a) and (b) for
data of the form
0_
(t j ,Olj,02j)'
and
from two independent
j . 1, ..• ,50,
(t j ,Olj,6 2j ),
thus obtaining
as required.
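A minimal Python sketch of steps (a)-(c) is given below.  The exponential means shown are placeholders only (they are the values used later in the power study); the means actually used to generate the Table A.1 data are not reproduced here.

```python
import numpy as np

def simulate_sample(n=50, mu=(45.0, 60.0), seed=1):
    """Sketch of the simulation scheme (a)-(c); means are placeholder values."""
    rng = np.random.default_rng(seed)
    T1 = rng.exponential(mu[0], size=n)      # step (a): two independent exponentials
    T2 = rng.exponential(mu[1], size=n)
    t = np.minimum(T1, T2)                   # step (b): observed "lifetime"
    d1 = (T1 <= T2).astype(int)              # ...and cause-of-"death" indicators
    d2 = 1 - d1
    return t, d1, d2                         # step (c): data (t_j, delta_1j, delta_2j)
```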
The data generated in this manner, given in Table A.1 of the appendix, were then used to compute the test statistic, $\hat\Lambda_{50}^0(x)$, and the results shown in Table 5.4.1 were obtained.  Thus, using $\alpha = .05$, the experiment is terminated at day 56, and $H_0:\ \mu = (30\ \ 60)'$ is rejected, since $\hat\Lambda_{50}^0(x) > \hat w_{.05} = 10.77$ for the first time at $x = 56$ days.
It is easy to show that the mle of $\mu$ at time $x$ [i.e., the value $\hat\mu(x)$ based on the truncated likelihood function $L_{50,x}^*(\mu) = \prod_{j=1}^n f_{j:x}^*(\mu)$] is given by $\hat\mu(x) = (\hat\mu_{1,x}\ \ \hat\mu_{2,x})'$, where

$$\hat\mu_{i,x} = \Big(\sum_{j=1}^n t_{j:x}\Big)\Big/\Big(\sum_{j=1}^n \delta_{ij:x}\Big) ,\qquad i = 1,2 , \qquad (5.37)$$

and where

$$t_{j:x} = \begin{cases} t_j , & \text{if } t_j \le x \\ x , & \text{if } t_j > x \end{cases}$$
TABLE 5.4.1
OBSERVED VALUES OF $\hat\Lambda_{50}^0(x)$ FOR EXAMPLE 5.4.1

  x    $\hat\Lambda_{50}^0(x)$      x    $\hat\Lambda_{50}^0(x)$
  1        2.563                   31        7.311
  2        1.919                   32        7.009
  3        1.847                   33        6.006
  4        3.676                   34        6.300
  5        3.118                   35        5.881
  6        1.949                   36        6.740
  7        2.520                   37        6.680
  8        3.119                   38        6.150
  9        3.753                   39        6.510
 10        4.389                   40        6.870
 11        2.780                   41        5.887
 12        3.885                   42        6.162
 13        3.361                   43        6.444
 14        4.432                   44        6.734
 15        5.600                   45        7.032
 16        5.235                   46        7.337
 17        5.920                   47        7.649
 18        6.319                   48        7.970
 19        6.505                   49        8.298
 20        7.622                   50        8.633
 21        8.808                   51        8.977
 22        8.061                   52        9.328
 23        8.229                   53        9.686
 24        9.274                   54       10.053
 25        7.160                   55       10.428
 26        8.021                   56       10.810*
 27        7.949                   57       11.200
 28        6.960                   58       11.598
 29        7.675                   59       12.004
 30        6.695                   60       10.599
and

$$\delta_{j:x} = (\delta_{1j:x}\ \ \delta_{2j:x})' = \begin{cases} \delta_j , & \text{if } t_j \le x \\ (0\ \ 0)' , & \text{if } t_j > x . \end{cases}$$
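In words, (5.37) divides the total (truncated) time on test by the number of cause-specific events observed by time $x$, as in the following sketch (illustrative names, assuming NumPy arrays for the data).

```python
import numpy as np

def truncated_mle(x, t, d1, d2):
    """Truncated mle (5.37): total truncated time on test over event counts."""
    t_x = np.minimum(t, x)                   # t_{j:x}
    obs = t <= x
    d1_x, d2_x = d1 * obs, d2 * obs          # delta_{ij:x}
    return t_x.sum() / d1_x.sum(), t_x.sum() / d2_x.sum()
    # For the Table A.1 data at x = 56 this yields approximately (50.4, 78.7).
```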
For the data of this example, these computations yield

$$\sum_{j=1}^{50} t_{j:56} = 1259.8 ,\qquad \sum_{j=1}^{50}\delta_{1j:56} = 25 ,\qquad \sum_{j=1}^{50}\delta_{2j:56} = 16 ,$$

and hence,

$$\hat\mu(56) = \begin{bmatrix}50.4\\ 78.7\end{bmatrix} .$$
Thus, the data suggest that the treated mice survive the cancer for a longer period, and that the cardiovascular side effects are at least no worse than expected under the null hypothesis.  Consequently, the experimenter decides to continue developing his drug.

Note that the observed stopping time (56 days) is considerably less than the maximum study length (100 days), which indicates that the progressively truncated test procedure employed here may lead to substantial savings of both time and money.
To obtain a rough indication of the power of this procedure, 100 exponential samples were generated as above $(n = 50,\ \mu_1 = 45,\ \mu_2 = 60)$, the test procedure was performed for each sample, and the following were computed:

"Empirical Power" = .46   and   "Empirical Expected Stopping Time" = 70.0 .

In fact, of the 46 (out of 100) samples which rejected $H_0:\ \mu = (30\ \ 60)'$ before 100 days had elapsed, the average stopping time was 35 days.  Again, this illustrates that the time-sequential procedure has the potential to achieve substantial savings over the more traditional single point analyses.  Also, the "power" does not seem too low considering the moderate sample size $(n = 50)$ and the fact that the data were generated from a population which differed from the null population in only 1 of the 2 parameters.
For comparative purposes, suppose it was decided to use a single point truncation (Type I censoring) scheme at $X = 100$ days to analyze these data.  The procedure given in Section 2.7.2 (using $c_1 = c_2 = \cdots = c_{50} = c = 100$ as the common truncation time) is appropriate for this purpose.  Actually, since the longest observed lifetime in our sample (Table A.1 of Appendix A) is $99.8 < 100$, the asymptotic tests of Section 2.5 for complete observations could be used.  However, we intend to make a power calculation and, in general, all of the lifetimes will not necessarily be shorter than 100 days.  From the sample data, we compute

$$\sum_{j=1}^{50} t_j = 1387.7 ,\qquad \sum_{j=1}^{50}\delta_{1j} = 29 ,\qquad \sum_{j=1}^{50}\delta_{2j} = 21 ,$$

and since all the lifetimes have been observed by day 100,

$$\hat\mu(100) = \hat\mu = (47.9\ \ 66.1)' . \qquad (5.38)$$
Also, from (5.32),

$$I_{(100)}(\hat\mu) = \begin{bmatrix}0.0002458 & 0\\ 0 & 0.0000935\end{bmatrix} . \qquad (5.39)$$

By Corollary 2.7, under $H_0:\ \mu = \mu_0 = (30\ \ 60)'$, $\sqrt n(\hat\mu - \mu_0)$ is asymptotically $N_2[0,\ I_{(100)}^{-1}(\mu_0)]$, and so $n(\hat\mu - \mu_0)'\,I_{(100)}(\hat\mu)\,(\hat\mu - \mu_0)$ is asymptotically $\chi^2(2)$.  Using (5.38) and (5.39), the test statistic can be calculated for our sample as

$$n(\hat\mu - \mu_0)'\,I_{(100)}(\hat\mu)\,(\hat\mu - \mu_0) = 4.11 .$$

Since $\chi^2_{.95}(2) = 5.99$, $H_0$ cannot be rejected using $\alpha = .05$.
Moreover, under the specific alternative $H_1:\ \mu = (45\ \ 60)'$, $\sqrt n(\hat\mu - \mu_0)$ is asymptotically $N_2[\sqrt{50}\,(15\ \ 0)',\ I_{(100)}^{-1}(\mu)]$, so the test statistic $n(\hat\mu - \mu_0)'\,I_{(100)}(\hat\mu)\,(\hat\mu - \mu_0)$ is asymptotically $\chi^2(2,\Delta)$, where the non-centrality parameter is

$$\Delta = n\,(15\ \ 0)\,I_{(100)}(\hat\mu)\,(15\ \ 0)' = 2.77 .$$

Thus, the power of the above test procedure (using $\alpha = .05$) against the alternative $H_1:\ \mu = (45\ \ 60)'$ is

$$P\{\chi^2(2,\,2.77) > 5.99\} = .30 .$$
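This power figure can be checked numerically with a non-central chi-square tail probability, as in the sketch below; the second diagonal entry of the information matrix is the value reconstructed in (5.39) and should be treated as approximate.

```python
import numpy as np
from scipy.stats import ncx2, chi2

# Rough check of the single-point power computation for Example 5.4.1.
n, crit = 50, chi2.ppf(0.95, df=2)                 # critical value, about 5.99
I100 = np.diag([0.0002458, 0.0000935])             # I_(100)(mu-hat), assumed values
shift = np.array([15.0, 0.0])                      # mu_1 - mu_0 under H1
nc = n * shift @ I100 @ shift                      # non-centrality, about 2.77
power = ncx2.sf(crit, df=2, nc=nc)                 # about .30
print(round(float(nc), 2), round(float(power), 2))
```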
The above calculations show that, when the data of Table A.1 are analyzed with the more traditional methods of Section 2.7.1, $H_0:\ \mu = (30\ \ 60)'$ cannot even be rejected using $\alpha = .05$, and that the power of this test against $H_1:\ \mu = (45\ \ 60)'$ is only .30.  Although it appears that this single point test procedure does not compare favorably with the progressively truncated test described earlier, it must be noted that the single point test statistic used the quantity $I_{(100)}(\hat\mu)$, while the progressively truncated test used $I_{(x)}(\mu_0)$.  Of course, under the null hypothesis $H_0:\ \mu = \mu_0$, $n(\hat\mu - \mu_0)'\,I_{(100)}(\mu_0)\,(\hat\mu - \mu_0)$ is also asymptotically $\chi^2(2)$, and calculating this quadratic form for the sample data yields 11.96, which is highly significant $(p < .005)$.  Thus, the single point test procedure based on using $\mu_0$ in the calculation of the information matrix is much more powerful than the parallel test which uses $\hat\mu$, and it is certain to be more powerful than the corresponding progressively truncated test, which agrees with our intuition.  Even though the progressively truncated test is less powerful than the corresponding single point test, the single point scheme may involve a considerable loss of efficiency--if the single truncation time is too early, the observed results may not have reached statistical significance by this time, and if the truncation time is too late, an unnecessary expenditure of time, money, and possibly lives of the experimental units may occur.

It should be noted, moreover, that the single point test procedure using $I_{(100)}(\mu_0)$ will not be equally powerful against all alternatives that are a given distance from $\mu_0$.  That is, due to the form of $I_{(x)}(\mu)$ based on an exponential model, the direction of the alternative as well as its distance from the null value is important in determining the test's power.  For this reason, the procedure using $I_{(100)}(\mu_0)$ will not always be more powerful than the procedure using $I_{(100)}(\hat\mu)$.  In particular, against alternatives $H_1:\ \mu = \mu_1$ such that $\mu_1 < \mu_1^0$ and $\mu_2 < \mu_2^0$, the procedure using the mle, $\hat\mu$, will tend to be more powerful.
169
5.4.2
Numerical Illustration of the Progressively Truncated Test in the
Presence of Nuisance Parameters: Example 5.4.2
Suppose a clinical trial is being conducted in order to assess the
efficacy of a new treatment for two different serious diseases.
Two
hundred patients are randomly assigned into treatment and control
groups, and it is determined that these two groups are comparable with
respect to their baseline characteristics.
For each patient, the res-
ponse variable is the time from initiation of treatment to death, and
the cause of death is also recorded.
possible causes of death,c
l
In this setting, there are two
and
and a competing risks framework
is used to model the patient's observed lifetime,
two underlying lifetimes,
and
T ,
j
in terms of the
as follows:
denotes the (hypothetical) time from initiation of treatment
where
to death from cause
i • 1,2.
For the purposes of this numerical
and
illustration, we assume that
are independently exponen-
tia11y distributed with means
~ij
where
z.
J
= -1
Zj
• E(T ij )
=1
= BiO
if the
+ Bilz j ,
jth
for
i = 1,2
and
j
=
1, •.• ,200 ,
patient is in the treatment group, and
if he is in the control group.
Note that this model is a
special case of the general independent risks exponential model considered earlier, and that the survival function and joint density of
(T ,6.>
j -J
for
j
= l, .•. ,n
are therefore given by
170
and
f. «(3) = f
T
J j
where
~j = (l/~lj)
,U. (tj'~';~)
J
(5.40)
A
-J
+ (1/~2j)'
It is decided that the trial will run for at most $X = 100$ weeks, and, for ethical reasons, we require a procedure which allows the trial to be terminated as soon as the accumulated statistical evidence is sufficient to reject the null hypothesis of no treatment effect,

$$H_0:\ (\beta_{11}\ \ \beta_{21})' = (0\ \ 0)' . \qquad (5.41)$$

The progressively truncated test of Section 3.6 is appropriate in this case since the null hypothesis does not completely specify the vector of model parameters; that is, the overall mean lifetimes, $\beta_{10}$ and $\beta_{20}$, are "nuisance" parameters in this context.  Following the notation of Section 3.6, we denote the entire vector of parameters by

$$\beta = \begin{bmatrix}\beta_1\\ \beta_2\end{bmatrix} = (\beta_{11}\ \ \beta_{21}\ \ \beta_{10}\ \ \beta_{20})' , \qquad (5.42)$$

where $\beta_1 = (\beta_{11}\ \ \beta_{21})'$ denotes those parameters for which inferences are to be made, and $\beta_2 = (\beta_{10}\ \ \beta_{20})'$ denotes the nuisance parameters.  Since $n = 200$ and $X = 100$ for this example, we monitor the following test statistic:
$$\hat\Lambda_{200}^1(x) = W_{200,\hat\beta_0}'(x)\,\big[\hat A(x;\hat\beta_0)\,\tilde I_{(x)}(\hat\beta_0)\,\hat A'(x;\hat\beta_0)\big]^{-1}\,W_{200,\hat\beta_0}(x) . \qquad (5.43)$$

Recall that $\hat\beta_{2,x}^0$ denotes the mle of $\beta_2$ at time $x$ computed under $H_0$.  If $H_0$ is true, then $\mu_{ij} = \mu_i = \beta_{i0}$, $i = 1,2$, so (5.37) yields

$$\hat\beta_{2,x}^0 = \begin{bmatrix}\hat\beta_{10,x}^0\\ \hat\beta_{20,x}^0\end{bmatrix} ,\qquad \hat\beta_{i0,x}^0 = \Big(\sum_{j=1}^{200}t_{j:x}\Big)\Big/\Big(\sum_{j=1}^{200}\delta_{ij:x}\Big) ,\quad i = 1,2 , \qquad (5.44)$$

where $t_{j:x}$ and $\delta_{ij:x}$ have been defined in (5.37).  Also,

$$W_{200,\hat\beta_0}(x) = (200)^{-1/2}\sum_{j=1}^{200}\Big[(\partial/\partial\beta_1)\log f_{j:x}^*(\beta)\Big]\Big|_{\beta=\hat\beta_0} = (200)^{-1/2}\sum_{j=1}^{200}\begin{bmatrix}\dot f_{j:x}^{*,11}(\hat\beta_0)\\[2pt] \dot f_{j:x}^{*,21}(\hat\beta_0)\end{bmatrix} , \qquad (5.45)$$

where, using (5.40), it can be shown that

$$\dot f_{j:x}^{*,i1}(\hat\beta_0) = z_j\Big[\frac{t_j}{(\hat\beta_{i0,x}^0)^2} - \frac{\delta_{ij}}{\hat\beta_{i0,x}^0}\Big]\,I[t_j \le x] + z_j\,\frac{x}{(\hat\beta_{i0,x}^0)^2}\,I[t_j > x] ,\qquad i = 1,2 , \qquad (5.46)$$

and $I[\,\cdot\,]$ denotes the indicator function of the expression in $[\,\cdot\,]$.
Next, let $\hat\lambda_0 = (1/\hat\beta_{10,x}^0) + (1/\hat\beta_{20,x}^0)$.  Then, from (5.27),

$$I_{j:x}(\hat\beta_0) = \frac{1 - e^{-\hat\lambda_0 x}}{\hat\lambda_0}
\begin{bmatrix}
z_j^2(\hat\beta_{10,x}^0)^{-3} & 0 & z_j(\hat\beta_{10,x}^0)^{-3} & 0\\
0 & z_j^2(\hat\beta_{20,x}^0)^{-3} & 0 & z_j(\hat\beta_{20,x}^0)^{-3}\\
z_j(\hat\beta_{10,x}^0)^{-3} & 0 & (\hat\beta_{10,x}^0)^{-3} & 0\\
0 & z_j(\hat\beta_{20,x}^0)^{-3} & 0 & (\hat\beta_{20,x}^0)^{-3}
\end{bmatrix} ,$$

and since $z_j^2 = 1$ for all $j$, with $z_j = 1$ for 100 patients and $z_j = -1$ for the other 100 patients,

$$\tilde I_{(x)}(\hat\beta_0) = \frac{1}{200}\sum_{j=1}^{200} I_{j:x}(\hat\beta_0) = \frac{1 - e^{-\hat\lambda_0 x}}{\hat\lambda_0}
\begin{bmatrix}
(\hat\beta_{10,x}^0)^{-3} & 0 & 0 & 0\\
0 & (\hat\beta_{20,x}^0)^{-3} & 0 & 0\\
0 & 0 & (\hat\beta_{10,x}^0)^{-3} & 0\\
0 & 0 & 0 & (\hat\beta_{20,x}^0)^{-3}
\end{bmatrix} . \qquad (5.47)$$
Thus, from definitions (3.60) and (3.61),

$$\tilde I_{(x)22}(\hat\beta_0) = \frac{1 - e^{-\hat\lambda_0 x}}{\hat\lambda_0}\begin{bmatrix}(\hat\beta_{10,x}^0)^{-3} & 0\\ 0 & (\hat\beta_{20,x}^0)^{-3}\end{bmatrix}
\quad\text{and}\quad
\hat A(x;\hat\beta_0) = \big(I_2\ \ 0\big)_{2\times 4} . \qquad (5.48)$$

Consequently, from (5.47) and (5.48),

$$\hat A(x;\hat\beta_0)\,\tilde I_{(x)}(\hat\beta_0)\,\hat A'(x;\hat\beta_0) = \frac{1 - e^{-\hat\lambda_0 x}}{\hat\lambda_0}\begin{bmatrix}(\hat\beta_{10,x}^0)^{-3} & 0\\ 0 & (\hat\beta_{20,x}^0)^{-3}\end{bmatrix} , \qquad (5.49)$$

where $\hat\lambda_0 = (1/\hat\beta_{10,x}^0) + (1/\hat\beta_{20,x}^0)$.  Note that, except for the change in notation, (5.49) is identical to (5.32) of Example 5.4.1.
One can now use (5.43)-(5.46) and (5.49) to compute the test statistic, $\hat\Lambda_{200}^1(x)$, for each $x$.  An appropriate critical value, $\hat w_\alpha'$ $(0 < \alpha < 1)$, which satisfies equation (3.84), can be empirically determined by using the method of Section 5.2 with $\Gamma_n(x) = \hat A(x;\hat\beta_0)\,\tilde I_{(x)}(\hat\beta_0)\,\hat A'(x;\hat\beta_0)$, $\varepsilon = 0$, and $X = 100$.  [As in Example 5.4.1, taking $\varepsilon = 0$ is permissible since $\hat\Lambda_{200}^1(x) \to 0$ as $x \to 0$.]  Note, however, that $\hat\beta_0 = (0'\ \ \hat\beta_{2,x}^{0\prime})'$.  Thus, as pointed out in Section 3.6, both the test statistic, $\hat\Lambda_{200}^1(x)$, and the functional which must be simulated [which depends on $\Gamma_n(x)$] not only depend on parameters which must be estimated from the data, but, strictly speaking, these estimates, $\hat\beta_{2,x}^0$, should be updated for each $x$.
In practice, however, the following strategy might be adopted:

(a)  Run the experiment until time $x_1$, and then, using $\hat\beta_{2,x_1}^0$, obtain $\hat w_\alpha'$ and calculate $\hat\Lambda_{200}^1(x)$ for $x \in [x_1,x_2)$.

(b)  At time $x_2$, calculate $\hat\beta_{2,x_2}^0$ and rerun the simulation program using these estimates to update $\hat w_\alpha'$.  Then calculate $\hat\Lambda_{200}^1(x)$ based on $\hat\beta_{2,x_2}^0$ for $x \in [x_2,x_3)$, and so on.

(c)  Continue this process of updating $\hat\beta_{2,x}^0$ at specific time points until $H_0$ is rejected or $x = X$ is reached.

For this particular example, $n = 200$, $X = 100$, and we chose $x_1 = 20$, $x_2 = 30$, $x_3 = 40$, etc.  That is, the trial is run for 20 weeks, $\hat\beta_{2,x}^0$ is obtained for $x = 20$, and these estimates are subsequently updated every 10 weeks.
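A rough Python sketch of this updating strategy follows.  It uses the reconstructed forms of (5.44), (5.46), and (5.49), and, for brevity, it holds the critical value fixed between (and across) updates at the week-20 value from Table 5.4.2; it is an illustration of the scheme rather than the author's program.

```python
import numpy as np

def stat_5_43(x, t, d1, d2, z, beta10, beta20):
    """Statistic monitored in Section 5.4.2, as reconstructed in (5.43)-(5.49)."""
    n = len(t)
    obs = t <= x
    s1 = z * np.where(obs, t / beta10**2 - d1 / beta10, x / beta10**2)   # (5.46)
    s2 = z * np.where(obs, t / beta20**2 - d2 / beta20, x / beta20**2)
    W = np.array([s1.sum(), s2.sum()]) / np.sqrt(n)
    lam0 = 1.0 / beta10 + 1.0 / beta20
    G = (1 - np.exp(-lam0 * x)) / lam0 * np.diag([beta10**-3, beta20**-3])  # (5.49)
    return float(W @ np.linalg.solve(G, W))

def monitor(t, d1, d2, z, X=100, x1=20, step=10, w_alpha=10.75):
    """Strategy (a)-(c): update the nuisance mles (5.44) every `step` weeks and
    stop as soon as the statistic exceeds w_alpha (held fixed here as a
    simplification; in the text it is re-simulated at each update)."""
    for x in range(x1, X + 1):
        if (x - x1) % step == 0:
            t_x, obs = np.minimum(t, x), t <= x
            beta10 = t_x.sum() / (d1 * obs).sum()    # beta-hat_10,x under H0
            beta20 = t_x.sum() / (d2 * obs).sum()    # beta-hat_20,x under H0
        if stat_5_43(x, t, d1, d2, z, beta10, beta20) > w_alpha:
            return x                                 # week at which H0 is rejected
    return None                                      # no rejection by x = X
```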
Table A.2 of the appendix contains the specific data, $(t_j,\delta_{1j},\delta_{2j},z_j)$, $j = 1,\ldots,200$, used for this example.  [For the treatment group $(z_j = 1)$ the underlying lifetimes were generated from exponential populations with means $\mu_{1j} = 50$ and $\mu_{2j} = 60$, while $\mu_{1j} = 80$ and $\mu_{2j} = 100$ were used for the control group $(z_j = -1)$.]  For these data, the test procedure described above was performed, and the results shown in Table 5.4.2 were obtained.  Thus, using $\alpha = .05$, the trial is stopped at week 43, $H_0:\ (\beta_{11}\ \ \beta_{21})' = (0\ \ 0)'$ is rejected, and we conclude that the treatment is beneficial.  This example clearly shows the advantage of using time-sequential procedures in clinical trials; that is, as soon as a treatment is shown to be effective (or superior to a standard treatment), all patients can be placed on the treatment and receive its beneficial effects.
TABLE 5.4.2
VALUES OF $\hat\beta_{10,x}^0$, $\hat\beta_{20,x}^0$, $\hat w_{.05}'$, AND $\hat\Lambda_{200}^1(x)$ FOR EXAMPLE 5.4.2

  x    $\hat\beta_{10,x}^0$   $\hat\beta_{20,x}^0$   $\hat w_{.05}'$   $\hat\Lambda_{200}^1(x)$
 20         67.1                 83.5               10.75             10.089
 21                                                                    9.934
 22                                                                    9.421
 23                                                                    6.429
 24                                                                    5.895
 25                                                                    4.951
 26                                                                    3.720
 27                                                                    4.747
 28                                                                    5.316
 29                                                                    4.000
 30         68.3                 77.3               11.47              4.011
 31                                                                    5.333
 32                                                                    5.393
 33                                                                    6.214
 34                                                                    6.275
 35                                                                    6.345
 36                                                                    7.007
 37                                                                    7.093
 38                                                                    7.183
 39                                                                    7.276
 40         74.8                 77.2               10.40              8.131
 41                                                                    8.886
 42                                                                    9.671
 43                                                                   10.520*
 44                                                                   11.081
 45                                                                   11.589
Thus, time-sequential procedures may lead not only to a savings of time and money, but may save or prolong some lives as well.

Finally, we remark that the practice of updating $\hat\beta_{2,x}^0$ only at specified intervals seems reasonable since, as shown in Table 5.4.2, $\hat\Lambda_{200}^1(x)$ and $\hat w_\alpha'$ are not greatly affected when $\hat\beta_{2,x}^0$ is varied.  The information presented in Tables 5.4.3 and 5.4.4 provides additional evidence which also supports this claim.  The relative invariance of the $\hat w_\alpha$ with respect to moderate changes in the underlying parameters is especially noteworthy since each of the simulations reported in Table 5.4.3 was carried out using the method of Section 5.2 with $m = n = 200$.  For real data, one would feel more comfortable using $m = n = 1000$, and the critical values, especially the extreme ones, would probably vary even less than those of Table 5.4.3 for different values of the underlying parameters.  The relative invariance of the critical values, as well as the test statistic, is probably due to the fact that, marginally, at each time point $x$, both the functional being simulated and the statistic are quadratic forms which have $\chi^2$ distributions regardless of the values of the underlying parameters.
TABLE 5.4.3
SIMULATED VALUES OF $\hat w_\alpha$ FOR VARIOUS $\hat\beta_{2,x}^0$

  $\hat\beta_{10,x}^0$   $\hat\beta_{20,x}^0$   $\hat w_{.10}$   $\hat w_{.05}$   $\hat w_{.01}$
        67.1                  83.5               9.27            10.75            11.86
        68.3                  77.3               9.25            11.47            16.59
        74.8                  77.2               8.67            10.40            12.62
        30.0                  60.0               8.97            10.77            16.19
TABLE 5.4.4
OBSERVED VALUES OF $\hat\Lambda_{200}^1(x)$ FOR EXAMPLE 5.4.2 AND VARIOUS $\hat\beta_{2,x}^0$

        $\hat\beta_{10,x}^0$:    67.1    68.3    74.8    65.0
        $\hat\beta_{20,x}^0$:    83.5    77.3    77.2    80.0

  x     $\hat\Lambda(x)$  $\hat\Lambda(x)$  $\hat\Lambda(x)$  $\hat\Lambda(x)$
 20        10.08            9.86            9.93            9.93
 21         9.93            9.77            9.88            9.80
 22         9.42            9.26            9.31            9.31
 23         6.42            6.36            6.37            6.39
 24         5.89            5.86            5.87            5.87
 25         4.95            4.96            4.97            4.95
 26         3.71            3.73            3.71            3.74
 27         4.74            4.70            4.64            4.75
 28         5.31            5.26            5.18            5.32
 29         4.00            3.98            3.90            4.03
 30         4.02            4.01            3.93            4.05
 31         5.37            5.33            5.23            5.39
 32         5.42            5.39            5.29            5.45
 33         6.29            6.21            6.08            6.31
 34         6.34            6.27            6.13            6.37
 35         6.41            6.34            6.19            6.44
 36         7.09            7.00            6.84            7.12
 37         7.17            7.09            6.91            7.21
 38         7.25            7.18            7.00            7.30
 39         7.34            7.27            7.08            7.39
 40         8.41            8.33            8.13            8.47
 41         9.21            9.11            8.88            9.27
 42        10.04            9.92            9.67           10.10
 43        10.93           10.80           10.52           11.00
 44        11.48           11.37           11.08           11.57
 45        12.05           11.92           11.58           12.14
 46        12.25           12.12           11.77           12.35
 47        13.63           13.49           13.12           13.74
 48        13.86           13.74           13.34           13.99
 49        13.69           13.57           13.16           13.83
 50        13.91           13.81           13.37           14.07
5.5.  A Competing Risks Model Based on Gumbel's Bivariate Exponential Distribution

The standard joint cumulative distribution function for Gumbel's bivariate exponential distribution, as given by Johnson and Kotz (1975), can be modified as follows to allow the means of the marginals, $T_1$, $T_2$, to differ from unity:

$$F_{T_1,T_2}(t_1,t_2) = 1 - e^{-t_1/\mu_1} - e^{-t_2/\mu_2} + e^{-(1/\mu_1\mu_2)(\mu_2 t_1 + \mu_1 t_2 + \theta t_1 t_2)} , \qquad (5.50)$$

$t_1 > 0$, $t_2 > 0$, $\mu_1 > 0$, $\mu_2 > 0$, $\theta > 0$.  It follows that the marginal distributions are given by

$$F_{T_i}(t_i) = 1 - e^{-t_i/\mu_i} ,\qquad t_i > 0,\ \mu_i > 0,\ i = 1,2 , \qquad (5.51)$$

that is, $T_i$ is exponentially distributed with mean $\mu_i$, $i = 1,2$, and therefore the joint survival function is

$$S_{T_1,T_2}(t_1,t_2) = e^{-(1/\mu_1\mu_2)(\mu_2 t_1 + \mu_1 t_2 + \theta t_1 t_2)} . \qquad (5.52)$$

Also, the survival function of the observed lifetime in a competing risks framework, $T = \min(T_1,T_2)$, is given by

$$S_T(t) = S_{T_1,T_2}(t,t) = e^{-(1/\mu_1\mu_2)(\mu_2 t + \mu_1 t + \theta t^2)} , \qquad (5.53)$$
and, using (1.8), we obtain the cause-specific hazard functions,

$$g_1(t) = \frac{1}{\mu_1}\Big(1 + \frac{\theta t}{\mu_2}\Big) \quad\text{and}\quad g_2(t) = \frac{1}{\mu_2}\Big(1 + \frac{\theta t}{\mu_1}\Big) . \qquad (5.54)$$

Further, from (5.53), the pdf of $T$ is

$$f_T(t) = \Big[\frac{1}{\mu_1} + \frac{1}{\mu_2} + \frac{2\theta t}{\mu_1\mu_2}\Big]\,e^{-(1/\mu_1\mu_2)(\mu_2 t + \mu_1 t + \theta t^2)} ,\qquad t > 0 ,$$

so

$$E(T^k) = \int_0^\infty t^k f_T(t)\,dt \le \int_0^\infty t^k\Big[\frac{1}{\mu_1} + \frac{1}{\mu_2} + \frac{2\theta t}{\mu_1\mu_2}\Big]\,e^{-t(1/\mu_1 + 1/\mu_2)}\,dt < \infty ,\qquad k = 1,2,\ldots , \qquad (5.55)$$

that is, all the moments of $T$ are finite.
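For concreteness, the following Python sketch evaluates the basic quantities of this model directly from the reconstructed formulas (5.52)-(5.55); the function and argument names are illustrative.

```python
import numpy as np

def gumbel_quantities(t, mu1, mu2, theta):
    """Survival function, cause-specific hazards, and pdf of T = min(T1, T2)
    for the Gumbel-based competing risks model."""
    S = np.exp(-(mu2 * t + mu1 * t + theta * t**2) / (mu1 * mu2))   # S_T(t), (5.53)
    g1 = (1.0 / mu1) * (1.0 + theta * t / mu2)                      # hazards, (5.54)
    g2 = (1.0 / mu2) * (1.0 + theta * t / mu1)
    f = (g1 + g2) * S                                               # pdf of T, as in (5.55)
    return S, g1, g2, f
```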
(Tj'~')'
j "" l, ••• ,n denote a sample from a population
-J
exposed to two competing risks whose joint survival distribution is
given by (5.52).
Then, adding a subscript
it follows from (2.8) that for
j
= l, •.• ,n,
j
to the above quantites,
the joint density of
180
(5.56)
Now that the
f.(S),
J -
j
= l, ••. ,n,
have been specified, all of the quantities needed to carry out the procedures of Chapters II, III, and IV can be obtained from these key functions.
To justify using these methods, however, the necessary regular-
ity conditions must be verified for this model.
Using a similar approach to that taken for the independent risks
exponential model of Section 5.4, and noting, as in (5.55), that
2
e
-(cIt. + c 2 t j )
J
< e
-clt j
for
c
> 0,
2
it is easy to show that
assumptions (2.14) and (2.15) hold.
From (5.56) it follows that, for $j = 1,\ldots,n$,

$$\dot f_j^{\,1}(\beta) = -\frac{\delta_{1j}}{\mu_1} - \frac{\delta_{2j}\,\theta t_j}{\mu_1^2 + \theta\mu_1 t_j} + \frac{t_j}{\mu_1^2} + \frac{\theta t_j^2}{\mu_1^2\mu_2} ,$$

$$\dot f_j^{\,2}(\beta) = -\frac{\delta_{2j}}{\mu_2} - \frac{\delta_{1j}\,\theta t_j}{\mu_2^2 + \theta\mu_2 t_j} + \frac{t_j}{\mu_2^2} + \frac{\theta t_j^2}{\mu_1\mu_2^2} ,$$

and

$$\dot f_j^{\,3}(\beta) = \frac{\delta_{1j}\,t_j}{\mu_2 + \theta t_j} + \frac{\delta_{2j}\,t_j}{\mu_1 + \theta t_j} - \frac{t_j^2}{\mu_1\mu_2} . \qquad (5.57)$$

Also, differentiating once more with respect to each of $\mu_1$, $\mu_2$, and $\theta$ gives the corresponding second-order quantities $\ddot f_j^{\,r,s}(\beta)$, $r,s = 1,2,3$.  (5.58)
Similarly,

$$\dot S_j^{\,1}(x) = \frac{x}{\mu_1^2} + \frac{\theta x^2}{\mu_1^2\mu_2} ,\qquad \dot S_j^{\,2}(x) = \frac{x}{\mu_2^2} + \frac{\theta x^2}{\mu_2^2\mu_1} ,\qquad\text{and}\qquad \dot S_j^{\,3}(x) = -\frac{x^2}{\mu_1\mu_2} , \qquad (5.59)$$

while

$$\ddot S_j^{\,1,1}(x) = -\frac{2x}{\mu_1^3} - \frac{2\theta x^2}{\mu_1^3\mu_2} ,\qquad \ddot S_j^{\,2,2}(x) = -\frac{2x}{\mu_2^3} - \frac{2\theta x^2}{\mu_2^3\mu_1} ,\qquad \ddot S_j^{\,3,3}(x) = 0 ,$$

$$\ddot S_j^{\,1,2}(x) = -\frac{\theta x^2}{\mu_1^2\mu_2^2} ,\qquad \ddot S_j^{\,1,3}(x) = \frac{x^2}{\mu_1^2\mu_2} ,\qquad\text{and}\qquad \ddot S_j^{\,2,3}(x) = \frac{x^2}{\mu_1\mu_2^2} . \qquad (5.60)$$
Now, combining (5.57) and (5.59), we obtain $\dot f_{j:x}^*(\beta)$, while $\ddot f_{j:x}^*(\beta)$ is obtained from (5.58) and (5.60), where $\dot f_{j:x}^*$ and $\ddot f_{j:x}^*$ are the truncated quantities needed for the time-sequential methods of Chapter III and for those methods of Chapters II and IV which are based on truncated observations.  Due to the nature of the integrals involved, one would not be tempted to compute the information matrix, $I_j(\beta) = -E[\ddot f_j(\beta)]$, or the truncated information matrix, $I_{j:x}(\beta) = -E[\ddot f_{j:x}^*(\beta)]$, analytically.  Instead, the following consistent estimators would be used:

$$\hat I_n(\beta) = -n^{-1}\sum_{j=1}^n \ddot f_j(\beta) \qquad\text{and}\qquad \hat I_n^{(x)}(\beta) = -n^{-1}\sum_{j=1}^n \ddot f_{j:x}^*(\beta) . \qquad (5.61)$$
Next, we note that

(1)  all moments of $T_j$ are finite [from (5.55)],

(2)  $0 < \mu_i < \infty$ for $i = 1,2$ and $0 < \theta < \infty$,

(3)  $|\delta_{ij}| \le 1$ for $i = 1,2$ and all $j$, and

(4)  $c_1^p\,t_j^p/(c_2 + c_3 t_j)^p \le c_1^p\,t_j^p/c_2^p$ for $c_2 > 0$, $c_3 \ge 0$, all $p > 0$, and all $j$, since $t_j > 0$.  (5.62)

Using these facts, the above inequalities, and proceeding as in Section 5.4 for the independent risks exponential model, it is easy to show that, for the quantities given in (5.57) and (5.58), assumptions (2.16), (2.17), and (2.18) are satisfied.
Finally, although $I_n^{(x)}(\beta) = n^{-1}\sum_{j=1}^n I_{j:x}(\beta)$ has not been analytically determined, it is not difficult to show that assumption (3.15) holds for the estimator in (5.61), which is a consistent estimator of $I^{(x)}(\beta)$.

In light of the comments of Section 5.3, the above discussion indicates that the methods of Chapters II, III, and IV may be used to analyze data which are known to follow the competing risks model based on Gumbel's bivariate exponential distribution.  Also, note that the means of the underlying lifetimes, $\mu_1$ and $\mu_2$, could have been written as linear combinations of an arbitrary number of covariables, as in the independent risks exponential model of Section 5.4.  The basic arguments presented above would remain unchanged, even though the notation and resulting computations would be considerably more complicated.
Remark:  Although the required calculations for this dependent risks model were perhaps a little more tedious than those for the independent risks exponential model of Section 5.4, essentially the same arguments were used to verify the regularity conditions of Chapters II, III, and IV for both models.  The common key property of these models is the finiteness of the moments of the actual lifetimes, $T_j = \min(T_{1j},\ldots,T_{kj})$.  This suggests that the assumed regularity conditions are fairly mild, and although the computations may be more involved for more complicated models, the methods developed in this dissertation should be applicable to a broad class of competing risks models.
CHAPTER VI
SUGGESTIONS FOR FURTHER RESEARCH
Given the parametric form of the joint distribution of the underlying lifetimes, the procedures of Chapters II, III, and IV provide statistical tools which can be used to analyze competing risks data that
has been collected under a wide variety of experimental designs.
In
particular, these methods allow the investigator to design the study to
observe all the lifetimes, to use single point truncation (Type I censoring) or single point censoring (Type II censoring) schemes, to incorporate a suitable progressively truncated scheme, or to make repeated
significance tests.
If the individual risks under consideration are in-
dependent, then it is generally possible to propose an acceptable parametric form for the underlying joint distribution of these risks, and
the procedures developed are readily applicable.
Of course, this is
also true when the risks are dependent and their joint distributions can
be specified.
However, due to the identifiability problem noted in
Chapter I, these joint distributions cannot be directly estimated from
the observed data in a competing risks setting involving dependent risks.
As a result, widely accepted dependent risks models are not generally
available, especially in medical applications.
Unfortunately, in medi-
cal and other biological applications where a competing risks model is
appropriate, it is highly likely that the risks under consideration will
be dependent.
For this reason, it seems that further research aimed at
developing acceptable parametric competing risks models for specific
clinical applications is crucial.
Once appropriate models are available,
the methods developed in this dissertation can be immediately applied.
A modification of the time-sequential procedures of Chapter III,
which allows staggered entry into the study but permits the monitoring
to begin only after all subjects are enrolled, has already been proposed
in Section 3.7.
Although this modification is fairly simple and is
likely to be acceptable for most applications, it should be possible to
allow the study to be monitored from its beginning by considering a two-dimensional stochastic process (which is a function of the increasing
sample size as well as time) and investigating the weak convergence of
this sample process to an appropriate two-dimensional Gaussian function.
Such an approach was taken, for example, by Majumdar and Sen (1978) when
they developed an appropriate progressively censored scheme based on
rank statistics for the simple regression problem in the presence of one risk (k = 1).
Also, the weak convergence of the proposed test statistics of
Chapters III and IV could be examined under contiguous alternatives, as
Sen and Tsong (1981) have done for parametric survival models (k = 1) under a progressively truncated scheme.
Such results might then be used
to compare the local power of large sample tests proposed by future research with the tests considered in this investigation.
For practical
purposes, however, more useful information regarding the properties of
the procedures of Chapters III and IV can be obtained by performing more
elaborate numerical studies (than the simple illustrations of Chapter V)
using specific competing risks models.
In particular, the determination
of empirical power and empirical stopping time is quite important in
assessing the efficiency of procedures which allow early termination.
In conclusion, we note that in some cases a competing risks model might be used to provide a supplementary or exploratory analysis of data even when only one cause of death (failure) is of primary concern.  For example, in most clinical trials only one cause of death is under investigation, but withdrawals inevitably occur.  This indicates that a competing risks model with k = 2 risks could be used to assess the effect of treatment on the time to withdrawal.  Similarly, one might use a model with two risks to determine whether a competing cause of death needs to be considered in addition to the primary cause.  As indicated earlier, if a dependent risks model is necessary, it may be hard to justify using a specific parametric model.  Nevertheless, in situations like these, an exponential model, for example, could be employed to help answer the pertinent questions, and although the estimates and probability statements arising from such an approach do not constitute a formal analysis, the rough approximations obtained may provide additional useful information which cannot be extracted from an ordinary survival model.
TABLE A.2.a
COMPETING RISKS DATA FOR EXAMPLE 5.4.2
Treatment Group (z_j = 1)
[Columns: j, t_j, δ_1j, δ_2j for j = 1,...,100; "--" denotes a truncated lifetime, i.e., t_j > 100.]

TABLE A.2.b
COMPETING RISKS DATA FOR EXAMPLE 5.4.2
Control Group (z_j = -1)
[Columns: j, t_j, δ_1j, δ_2j for j = 101,...,200; "--" denotes a truncated lifetime, i.e., t_j > 100.]
BIBLIOGRAPHY

Anderson, T. W. and Ghurye, S. G. (1977). Identification of parameters by the distribution of a maximum random variable. Journal of the Royal Statistical Society - Series B, 39, 337-342.

Ballintine, E. J. (1975). Objective measurements and the double-masked procedure. American Journal of Ophthalmology, 79, 763-767.

Basu, A. P. and Ghosh, J. K. (1978). Identifiability of the multinormal and other distributions under competing risk model. Journal of Multivariate Analysis, 8, 413-429.

Basu, A. P. and Ghosh, J. K. (1980). Asymptotic properties of a solution to the likelihood equation with life-testing applications. Journal of the American Statistical Association, 75, 410-414.

Berman, S. M. (1963). Note on extreme values, competing risks and semi-Markov processes. Annals of Mathematical Statistics, 34, 1104-1106.

Birnbaum, Z. W. (1979). On the mathematics of competing risks. DHEW Publication No. (PHS) 79-1351. U.S. Department of Health, Education, and Welfare.

Breslow, N. E. (1974). Covariance analysis of censored survival data. Biometrics, 30, 89-99.

Brown, B. M. (1971). Martingale central limit theorems. Annals of Mathematical Statistics, 42, 59-66.

Chalmers, T. C. (1975). Ethical aspects of clinical trials. American Journal of Ophthalmology, 79, 753-758.

Cornfield, J. (1957). The estimation of the probability of developing a disease in the presence of competing risks. American Journal of Public Health, 47, 601-607.

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society - Series B, 34, 187-220.

Cox, D. R. (1975). Partial likelihood. Biometrika, 62, 269-276.

Cramer, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press.

David, H. A. and Moeschberger, M. L. (1978). The Theory of Competing Risks, Griffin's Statistical Monographs and Courses, No. 39. London: Griffin.

Ederer, F. (1975). Why do we need controls? Why do we need to randomize? American Journal of Ophthalmology, 79, 758-762.

Elandt-Johnson, R. C. (1976). Conditional failure time distributions under competing risk theory with dependent failure times and proportional hazard rates. Scandinavian Actuarial Journal, 1976, 37-51.

Elandt-Johnson, R. C. (1979). Equivalence and nonidentifiability in competing risks: A review and critique. Institute of Statistics Mimeo Series No. 1222. Chapel Hill: University of North Carolina.

Elandt-Johnson, R. C. and Johnson, N. L. (1980). Survival Models and Data Analysis. New York: John Wiley and Sons.

Farewell, V. T. and Prentice, R. L. (1977). A study of the distributional shape in life testing. Technometrics, 19, 69-76.

Feigl, P. and Zelen, M. (1965). Estimation of exponential survival probabilities with concomitant information. Biometrics, 21, 826-838.

Freund, J. E. (1961). A bivariate extension of the exponential distribution. Journal of the American Statistical Association, 56, 971-977.

Gail, M. H. (1975). A review and critique of some models used in competing risk analysis. Biometrics, 31, 209-222.

Gross, A. J. and Clark, V. A. (1975). Survival Distributions: Reliability Applications in the Biomedical Sciences. New York: John Wiley and Sons.

Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis. New York: Springer-Verlag.

Holt, J. D. (1978). Competing risk analysis with special reference to matched pair experiments. Biometrika, 65, 159-166.

Inagaki, N. (1973). Asymptotic relations between the likelihood estimating function and the maximum likelihood estimator. Annals of the Institute of Statistical Mathematics, 25, 1-26.

Johnson, N. L. and Kotz, S. (1975). A vector multivariate hazard rate. Journal of Multivariate Analysis, 5, 53-66.

Kalbfleisch, J. D. and Prentice, R. L. (1973). Marginal likelihoods based on Cox's regression and life model. Biometrika, 60, 267-278.

Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. New York: John Wiley and Sons.

Kendall, M. G. and Stuart, A. (1967). The Advanced Theory of Statistics, Volume 2: Inference and Relationship (Second Edition). New York: Hafner.

Kendall, M. G. and Stuart, A. (1969). The Advanced Theory of Statistics, Volume 1: Distribution Theory (Third Edition). New York: Hafner.

Majumdar, H. and Sen, P. K. (1978). Nonparametric testing for simple regression under progressive censoring with staggering entry and random withdrawal. Communications in Statistics - Theory and Methods, A7, 349-371.

Makeham, W. M. (1874). On an application of the theory of the composition of decremental forces. Journal of the Institute of Actuaries (London), 18, 317-322.

Nadas, A. (1971). The distribution of the identified minimum of a normal pair determines the distribution of the pair. Technometrics, 13, 201-202.

Prentice, R. L. and Breslow, N. E. (1978). Retrospective studies and failure time models. Biometrika, 65, 153-158.

Prentice, R. L., Kalbfleisch, J. D., Peterson, A. V., Jr., Flournoy, N., Farewell, V. T., and Breslow, N. E. (1978). The analysis of failure times in the presence of competing risks. Biometrics, 34, 541-554.

Puri, M. L. and Sen, P. K. (1971). Nonparametric Methods in Multivariate Analysis. New York: John Wiley and Sons.

Searle, S. R. (1971). Linear Models. New York: John Wiley and Sons.

Sen, P. K. (1976). Weak convergence of progressively censored likelihood ratio statistics and its role in asymptotic theory of life testing. Annals of Statistics, 4, 1247-1257.

Sen, P. K. and Tsong, Y. (1981). An invariance principle for progressively truncated likelihood ratio statistics. Metrika, 28, (in press).

Tsiatis, A. (1975). A nonidentifiability aspect of the problem of competing risks. Proceedings of the National Academy of Sciences USA, 72, 20-22.

Zippin, C. and Armitage, P. (1966). Use of concomitant variables and incomplete survival information in the estimation of an exponential survival parameter. Biometrics, 22, 665-672.