Addressing an idiosyncrasy in estimating survival

Addressing an idiosyncrasy in estimating survival curves using
double-sampling in the presence of self-selected right censoring.
Constantine E. Frangakis
Department of Biostatistics, Johns Hopkins University
615 N. Wolfe St., Baltimore, MD 21205, U.S.A.
[email protected]
and
Donald B. Rubin
Department of Statistics, Harvard University
1 Oxford St., Cambridge, MA 02138, U.S.A.
[email protected]
In Biometrics (2001, 57: 333-353), with discussion.
S UMMARY. We investigate the use of follow-up samples of individuals to estimate survival
curves from studies that are subject to right-censoring from two sources: (i) early termination of the study, namely, administrative censoring, or (ii) censoring due to lost data prior to
administrative censoring, so-called dropout. We assume that, for the full cohort of individuals, administrative censoring times are independent of the subjects’ inherent characteristics,
including survival time. To address the loss to censoring due to dropout, which we allow to be
possibly selective, we consider an intensive second phase of the study where a representative
sample of the originally lost subjects is subsequently followed and their data recorded. As with
double-sampling designs in survey methodology, the objective is to provide data on a representative subset of the dropouts. Despite assumed full response from the follow-up sample,
we show that, in general in our setting, administrative censoring times are not independent of
survival times within the two subgroups, nondropouts and sampled dropouts. As a result, the
stratified Kaplan-Meier estimator is not appropriate for the cohort survival curve. Moreover,
using the concept of potential outcomes, as opposed to observed outcomes, and thereby explicitly formulating the problem as a missing data one, reveals and addresses these complications.
We present an estimation method based on the likelihood of an easily observed subset of the
data, and study its properties analytically for large samples. We evaluate our method in a realistic situation by simulating data that match published margins on survival and dropout from
an actual hip-replacement study. Limitations and extensions of our design and analytic method
are discussed.
K EY W ORDS: Double-sampling; Dropouts; Loss to follow-up; Potential outcomes; Rubin
Causal Model.
1
1. Introduction
1 1 Motivation.
When studies follow subjects over time and investigate a time-to-event outcome
, such as
survival, the outcome may not be available for all subjects at the end of the study. We focus on
the situation with the simultaneous occurrence of: (i) administrative censoring, that is, censoring due to early termination of the study, and (ii) dropout, or loss to follow-up, which occurs
when the subject interrupts contact with the investigator before the end of the study and before
the survival time is observed. An important example is patients who have surgery in which a
human joint, e.g, the hip, is replaced with an artificial one, a prosthesis. Often, the prosthesis wears out and needs replacement, and surgeons are interested in the time,
, between the
first surgery and the next replacement, that is, the survival time of the prosthesis. Our work is
motivated by the large number of dropout patients often occurring after surgery of joints (e.g.,
Gartland, 1988; Dorey and Amstutz, 1989; Murray, Carr, and Bulstrode, 1993). Our goal is to
estimate the survival function of the cohort of all subjects entering the study.
In order to estimate this survival function in the presence of censoring, it is common to
assume that entry times into the study do not relate to survival and thus, essentially, that administrative censoring is ignorable in the sense of Rubin (1976, 1978). Moreover, it is sometimes
assumed that dropout is also ignorable, possibly after conditioning on observed covariates (e.g.,
Dobbs, 1980). Ignorability of dropout then essentially requires that a subject whose survival is
censored by dropout after time , say, from entry, has comparable survival to a subject whose
survival is censored after time by administrative censoring. Although ignorability of administrative censoring can be plausible, ignorability of dropout is suspect (Sims, 1973; Austin et
al., 1979), and one plan to lessen the reliance on this assumption is to use a random (or representative) sample of the original dropouts. Assume that, for this subset of subjects, the investigator successfully obtains the data intended to have been recorded in the absence of dropout,
2
namely either the actual failure time
or knowledge that it would have been administratively
censored. Similar two-phase designs are known in the survey literature as double-sampling
designs (Cochran, 1963; Glynn, Laird, and Rubin, 1993; Rotnitzky and Robins, 1995; Zanutto,
1998) and the analyses of their data are typically simple when the outcome is either fully observed or fully missing. Problems involving double-sampling with competing risks have been
discussed by Hogan and Laird (1996), and Flehinger, Reiser and Yashchin (1998), although under different frameworks, assumptions and goals from ours, whereas Baker, Wax, and Patterson
(1993) and Wax, Baker, and Patterson (1993) discussed double-sampling of dropouts assuming
discrete survival data and in the absence of competing varying administrative censoring.
1 2 Purpose and outline.
We have two aims. First, by formulating an appropriate framework for our problem that links
the concept of potential outcomes (Neyman, 1923; Rubin, 1974, 1978) to competing risks, we
show how to use the data from the double-sampled subjects to draw inference for the cohort
survival function. Obtaining valid inference using double-sampling with survival data is more
subtle than with survey data because of an idiosyncrasy that can render usual analyses incorrect when estimating the cohort survival function. Demonstrating this fact is our second aim,
because, to our knowledge, this has not been discussed in the literature.
The procedure we focus on needs only a stratification of the intended data, and, for example,
does not rely on the times of study entry and of study-dropout for each dropout subject (such
times might not be available to the analyst due to confidentiality constraints). When more than
the intended data are available for analysis, our procedure will not be fully efficient, and one
could construct locally semiparametric efficient estimators (e.g., Robins, Rotnitzky, and Zhao,
1994), or could use fully parametric methods, and it would be interesting in future work to
compare these procedures to the proposed method in realistic finite samples.
3
In the next section, we develop our framework using potential outcomes that are characteristics inherent to subjects. We posit an assumption that treats administrative censoring as a
randomization variable, and thus independent of the potential outcomes. This reveals a specific
structure on the two subgroups of subjects for whom the only reason for possibly missing data
is administrative censoring: the nondropout group, and the recovered subset of the dropouts. In
particular, for each of these two groups, it is tempting to apply standard survival methods that
assume independence between survival and administrative censoring times, and then combine
these two separate estimates as with two-phase designs in survey literature. Under our assumptions, however, we show that within each of the two subgroups, the administrative censoring
times generally correlate with survival times. Consequently, Kaplan-Meier estimators (Kaplan
and Meier, 1958) that stratify by these groups, or, alternatively, likelihood-based methods that
ignore this correlation, generally do not appropriately estimate the survival curves within either
of the subgroups, nor hence, the survival curve for the cohort. In Section 3, we discuss addressing the problem from the perspective of missing data and we derive an estimation procedure
that, under our assumptions, appropriately estimates the cohort survival curve. In Section 4,
we evaluate our procedure in situations that are realistic in practice. Although double-sampling
when faced with both dropout and administrative censoring is natural, we do not have a completely real data example available for demonstration. Consequently, we use published margins
for survival curve, dropout rate, sample size accrual rate, and available length of follow-up from
an actual hip-replacement study, to project simulated double-sampling data, which we use to
compare our procedure to the re-weighted stratified Kaplan-Meier estimators. The final section gives some further remarks for the design and estimation methods. The appendix provides
technical details.
4
2. The Problem
2 1 Potential outcomes and goal.
Consider a population of subjects, for example, having surgery to have the human hip joint
replaced by a prosthesis. A common complication is that subjects drop out, that is, are lost
to follow-up, in the sense that they interrupt contact with the investigator before their survival
times becomes known. For clarity, we define potential outcomes (Neyman, 1923; Rubin, 1974,
1978) for subjects of the population
before making probability statements, as in the Rubin
Causal Model (Holland, 1986). Suppose that subjects ipate in a hypothetical study depicted in Fig. 1. Let
study, e.g., the time of surgery, and
from the population partic-
be the calendar time of entry into the
be the survival time of the prosthesis, counting from
surgery for each subject (Fig. 1(a)). Also in part (a) of the figure, let
be the length of time,
counting from entry, for which subject would remain in the study, i.e., without dropping out,
if the study remained open indefinitely, where . The potential outcome is treated as
a characteristic inherent to subject at this time in this study, like birth-date and gender. For
subjects
who would drop out in that hypothetical scenario,
, whereas for
subjects who would not drop out before failure of their prosthesis no matter when the study
would end, so we say that the indicator is the indicator function.
!" $#
is the true dropout status, where
&%'#
[Figure 1 about here.]
Now that the potential outcomes have been defined, we consider the actual study. Let
)(+*&,
be the calendar time when the study actually ends, e.g., June 1 1999, as in Fig. 1(b).
- /0)(+*1,23. be the administrative censoring time, and
.
4 min 56-7# and 89: ;-$# , be, respectively, the length of survival time that lies
Using standard notation, we let
within the administrative censoring time, and the indicator for whether that length is the full
survival time. The data
4 <8=>#
are the intended data, that is, the information the investigator
would record in the absence of any dropout. But when 5
4 , neither 4 nor 8 is observed,
4
@<A$B C D !E ># ; @<A$B F for subjects who are not dropouts
in the study, whereas @<A$B HG for the dropouts in the study (Fig. 1(c)).
Note that the observed, study-dropout status, @<A$B , is not always equal to the true dropout
status . For example, by comparing Fig. 1 (a) to (c), subjects 4 and 5 who are administratively
censored in the study (and thus have @<A$B D ) are a mixture of true nondropouts ?IJK# and
true dropouts ?IHGL# .
We let M)>N# be the fraction of all subjects in whose survival time exceeds , for )OPG .
Our goal is to learn about the survival function M)>N# . Our methods can be extended to allow
but rather
?<6@<A$B #
where
for observed covariates. However, to communicate our main arguments without complicating
notation, we assume that no covariates are recorded or, alternatively, that we are already within
cells of observed covariates.
2 2 Initial aspects of study design.
Assume that subjects
that
Q R S are a simple random sample from the population
, and
is large enough so that observations on different sampled subjects can be treated as
independent. We focus discussion on cohorts
of subjects that are homogeneous (e.g., in
terms of a common surgical method over time) in the sense that entry times are not related
with either survival times,
, or true dropout times .
We can express this in the following
assumption, which we will make throughout the rest of the paper.
A SSUMPTION 1. Randomness of entry times:
TU < V6.$# T U V6$# T U ! )>#W
where, here and in the sequel, the measure Pr ?# is the one induced by the random sampling
from the population .
6
One implication of Assumption 1 is that, in the full cohort of subjects, survival and administrative censoring times are independent: Pr <6->#.
Pr >#
Pr ?-># , which is a common
assumption (Fleming and Harrington, 1991, and references therein; Hougaard, 1999). Another immediate implication is that true dropout status and administrative censoring times are
independent: Pr ?<6-># Pr ?X$# Pr !-$# .
2 3 Double-sampling and observed data.
In joint-replacement studies, dropout occurs for a variety of reasons (e.g., the subject moves,
has pain and visits a different physician, feels very well and interrupts the hospital visits).
Because dropout is possibly related to survival (as in similar settings, Sims, 1973; Austin et al.,
1979), we consider a second phase of data collection, where the investigator selects a subset of
the study-dropouts, (i.e., with
@<A$B YG
), and pursues them intensively enough to record their
4 < 68># , which would have been observed at calendar time Z(+*&, . We let MII;
if subject has /@VA$B G and is pursued in this second phase, such as subject 2 in Fig. 1(d),
and we let MI[;G otherwise. Pursuing dropouts is costly (e.g., Dorey and Amstutz, 1989), so
intended data
the size of the sample would depend on available resources and rarely include all those with
@<A$B 3G
. For situations where
4 <68$#
is still not recorded among some subjects with M\I; ,
see Section 5. Also relevant to the design of the second phase of data collection is the following
issue.
The data 4 V68$#
for dropouts with early entry times generally carry more information than
those with later entry times in the sense that the former subjects have been in the study longer.
Moreover, early entry subjects are more likely to be study-dropouts because they have been
followed for a longer period. Therefore, a simple random sample from
]^ C @<A$B _G:`
will
automatically oversample study-dropouts with earlier entry times. For this reason, and because
we generally will not know exactly the optimal design without prior knowledge of the survival
7
times of the dropouts, we assume for simplicity that, from those with a@<A$B
chooses the set
]^ C MI/bR`
3G
, the investigator
by simple random sampling that, for convenience, we treat as
independent Bernoulli trials with common and known selection probability Pr VM"c
de @VA$B GL#gfIhjilk . We comment on extensions in Section 5.
2 4 Idiosyncrasy.
After the double-sampling, we have two groups of subjects with
in phase 1,
]^ C /@<A$B bR`
, and the sampled subset
]^ C MmbR`
4
V 68$# :
the nondropouts
of the dropouts; for both
groups, the only reason for missing survival time is administrative censoring. The two groups
can have different survival curves, and, because only some subjects of those with @VA$B HG are
R` versus ]nMo"YR` , after
recovered, an analysis that ignores the group classification, l] @<A$B Y
double-sampling (e.g., as in Dorey and Amstutz, 1989), is not appropriate generally.
Therefore, in this section we consider the consequence of using the following procedure:
(i) constructing Kaplan-Meier estimators within these two observed subgroups, and then (ii)
combining the two estimators, weighted by the appropriate proportions of
]^ C @<A$B pGq`
, to estimate the cohort survival function
MX>N# .
]^ C @<A$B FR`
and
This stratified Kaplan-Meier
(SKM) statistic is the standard estimator when censoring is solely administrative. The statistic
SKM would be consistent for
of the survival times
with
@<A$B rG
M)$N#
if the administrative censoring times
for study-nondropouts, i.e., with @<A$B J
-)
were independent
, and for study-dropouts, i.e.,
. However, this independence does not generally hold under Assumption 1 in
either group; the following result is derived as an application of Bayes’ theorem and after some
algebra (proof omitted).
8
R ESULT 1. Under Assumption 1, and with positive s and ,
TU T U OtKd @<A$B J
-3su# t
O Kde3J^#
TU
TU
vxw ! T yU JK#mzg{+?s^T|NU # ?xO}sl63HGL# ?3J^#\z
?tO}s^6y~GL# 
TU TU Ot Kd xO€s^6H3GL# where {+?s^|N# C OgKdeH;^#
TU T
U
OxKde @<A$B ~G 6-3s# OxKde3HG t€su#
In Result 1, Pr O‚Kd 6-0rs#
@<A$B
is generally a function of
s
and
and, therefore, survival and
administrative censoring times are correlated within both strata defined by observed dropout
status
@<A$B .
This complication is not due to the simple random sampling design from the
study-dropouts in the second phase because the correlation between survival and censoring
times is also present in the study-nondropouts. Focusing on the latter group, Result 1 shows
-
{+?s^|N#ƒ for all s^| . A necessary and sufficient set
of conditions for this is that: (i) survival times be independent of true dropout statuses ,
and (ii) among true dropouts, ]^ C Z+G:` , survival times be independent of dropout times
. However, by definition for true dropout, for all . If, in addition, there is a survival
that
and
would be independent if
time in the cohort that is smaller than the largest observed dropout time, which is expected to
be true in most realistic cases, then it can be easily shown that conditions (i) and (ii) above
cannot both hold. It follows that the straightforward within-stratum Kaplan-Meier estimators
are generally inconsistent and, hence, that generally the combined statistic SKM is not an
appropriate estimator for the cohort survival curve
M)>N# .
Result 1 also implies that standard
“ignorable” (Rubin, 1978) likelihood or Bayesian survival methods that stratify on
@<A$B
and
ignore the correlation between survival and administrative censoring times would also suffer
from bias, even in large samples with the correct model for M)$N# .
9
The spurious correlation within observed groups in Result 1 occurs because the observed
data are non-trivial functions of the potential outcomes that reflect scientifically relevant characteristics of subjects. The distinction between potential outcomes and observed data occurs, more generally, in studies that suffer from deviations from protocol, including treatmentnoncompliance and incomplete outcomes, and can be critical in defining and estimating quantities of interest (e.g., Rubin, 1978; Robins and Greenland, 1994; Angrist, Imbens and Rubin,
1996; Frangakis and Rubin, 1999; Robins, Greenland and Hu, 1999; Rubin and Frangakis,
1999).
3. Addressing the Problem: Missing Data Perspective
3 1 General.
Because the selection mechanism of study-dropouts at the second phase is known conditionally
on the first phase, any data
4 <8=>#
missing in stratum
? @VA$B 9GL#
after double-sampling are
missing at random (Rubin, 1976), a fact that we can use for robust estimation using likelihood
principles. Under Assumption 1, the likelihood function of data
„ ;]l @<A$B 6-<…Mo!`n†] 4 568 @<A$B JR`n†]l @<A$B HG:…MooHG:`n†] 4 <68V6 @<A$B HG …Moo;R`
C C C is proportional to
X?‡Qd „ #Hˆ
where
‡
Pr v
4 V68<6-V6 @<A$B JR‰6‡
#1Šq‹!Ž ŒS
Pr ?-<V6
Ž
@<A$B 3G ‰6‡
# hS5 Š ‹!Ž ŒS k7h5:i k
Pr 4 V68<6-V6<6 V@ A$B 3
Ž
G ‰6‡R# i
represents parameters governing the distribution of all observables. Consider also the
likelihood function of the reduced data
„’‘ ƒ]l …Mo?`†“]: 4 ?68!# C @VA$B r
@<A$B
10
or Moƒn` ,
proportional to
!‡Qd
„ ‘
#Hˆ
Making inference on
Ž
Ž
4
V68|de @VA$B ;
‰6‡R#K• Š ‹!ŒS Pr <8=Nde @<A$B HG ‰6‡R#K• i
~”
”
v Pr ? @<A$B D
‰6‡
# Š ‹!Ž Œ ] Pr ? @VA$B 3
G ‰6‡R#…` 5 Š ‹?Ž Œ 
MX>N#
Pr 4
X?‡Hd
by maximizing
„
#
when
‡
is unrestricted is generally not
possible. Alternative ways to address the problem, under Assumption 1, include
(i) maximizing the reduced data likelihood X?‡Qd
„ ‘
#
with unrestricted ‡ ;
(ii) positing semiparametric submodels, ‡ , for ‡ ;
B
(iii) positing fully parametric submodels for ‡ .
For approach (i), the components of ‡ that are identifiable from X?‡Qd
„–‘ #
identify M)$N# , so
this method will produce a consistent estimator of the survival curve for general distributions
‡
. This method provides the basic intuition for how the problem can be addressed. Because
this method uses only the /@<A$B –stratification of the intended data
4 V68$#
of nondropouts and
of dropouts that have been double-sampled, it can be applied even in constrained situations,
for example, when the entry times &V3.(+*1,.2—-$# and dropout times are hidden from the
analyst because of confidentiality concerns.
For approach (ii), work of Robins, Rotnitzky, and Zhao (1994) can be used to construct
estimators for
M)$N#
that take the form of inverse probability of censoring weighted (IPCW)
‡ B –optimal estimators within that class
asymptotically: are efficient when the working model ‡ is correct; remain consistent when
B
‘
‡ B is incorrect; and are more efficient than that of (i) when the additional data „ 2 „ are
statistics and include estimator (i) as a member. The
available. In approach (iii) maximum likelihood or Bayesian inference can be used.
We focus on approach (i), thereby providing insight into the situation in the simple case with
11
minimal data available. Nevertheless, it would be interesting to compare the three methods to
see how much efficiency is lost using (i) relative to (ii) and (iii) in realistic settings where the
extra information is available.
3 2 Maximum likelihood from X?‡Qd
Assuming that the random variables
„’‘
#.
and - are absolutely continuous, first we re-parameterize
the problem in terms of hazard functions. Define the net hazard function of interest for the full
cohort of subjects,
˜
net
$N# C lim™1š.›] Pr
>œ
0z;žmd
EŸN#| Rž¡`
. By Assumption 1,
survival and administrative censoring times are independent in the full cohort. Therefore, the
net hazard function in the full cohort is equal to the crude hazard function in the full cohort,
defined as
˜
crd
>N# C ‚z3žmd 4 E‚N#| Rž¡`
lim™1š.›] Pr >¢
(e.g., Fleming and Harrington,
1991). Furthermore, we can generally decompose the crude hazard function of the full cohort
to the crude hazard functions for the two subgroups defined by the observed dropout status
£ > N# C @<A$B , ˜ crd
lim™1š.›] Pr
˜
©
crd
$ xmztžmd 4 E}…@<A$BP¤ #| nž¥`
$N#c£N¦ §
£ $N# C ›|¨ R©
Pr ?
£ $N#N˜ crd
£ > N#W
, using
where
£ £
@VA$B ~¤"d 4 E}N# « &£ ¬j§ ª $N#$£&f ¬ £1¬ |› ¨  ª $N#$f
(3 2)
£ $N# Pr 4 E}Kde@VA$BP¤ # are the probabilities of being in the “risk set” at time ,
ª
£f Pr ? @VA$B P¤ # , for ¤­~G ® , and the expressions in (3 2) follow from Bayes’ theorem. The
time dependent weights £ >N# reflect that (3 2) is a stratification at every time point. At this
©
£ $N# , and the probabilities £ $N# and f £ can be estimated by
stage, the crude hazard functions ˜ crd
ª
’
„
‘
# with no further modeling assumptions.
maximizing the likelihood function X!‡’d
For convenience in presentation, abbreviate the two groups ]^ C @<A$B ;n` and ]^ C MoI;n`
by ¯ and ¯=› , respectively. Using standard notation, define the “risk set” and failure processes:

and where
12
‘
‘
4
4
° $ N# C ±: Z
EFN# and ² $ N# C ³: ZcN#N8 ; and for the groups
«
«
‘
‘
¯ £ |¤J G ® , by ° £ >N# C S ´Kµ·¶ ° $ N# and ² £ $N# C ¸´®µ¹¶ ² >N# . Using the notation
½ £
$N# C ž»º¢>N# for the differential of right-continuous processes º¢>N# , we define the processes ¼ crd
¾
›h |¨ ¿ k ž»² £ ${o#| ° £ !{o# , and
¿
½
£ !{o#1ž½ ¼ crd
£ ${I#…
where
¼ >N# C £|¦ §
¼
›
›|¨ 
À ©
(3 3)
£ >N# C « ª ¼ £ >N# £&f ¼ ¬ £ £1¬ £ >N# C ~° £ >N#| n £ ¼
£&¬'§ ›|¨ ¼ $N# f ¼
ª¼

ª
©
«
and where £ is the number of people in group ¯ £ , and f ¼ £ :? @VA$B H¤ #| n , for ¤¢G ® .
½ £
$N# maximize the likelihood X?‡yd „’‘ # with respect to, and are centered
The processes ¼ crd
¾
½ £
£ !{o#1ž»{ , regardless
>N# C ›|¨ ¿ ˜ crd
approximately at, the cumulative crude hazard functions crd
h k
for each individual, by
of dependence between survival and administrative censoring times. Similarly, the empirical
£
ª ¼ > N#
distributions
approximately at,
and the ratios
£
ª >N#
and f
¾
£
f ¼£
maximize
„ ‘#
X?‡;d Q
with respect to, and are centered
respectively. Then, by (3 2) and (3 3), the process
½
¼ >N#
will be
›|¨ ¿ k ˜ crd !{o#NžL{ , and thus, by Assumption 1, at the net cumulative
h
¾
½
hazard function net $N# C ›|¨ ¿ k ˜ net !{o#NžL{ .
h
½
The estimator ¼ $N# in (3 3) can also be viewed as having a generalized form of a “weighted
centered approximately at
logrank” statistic (Fleming and Harrington, 1991), where, here, the weights vary both across
time and between the two estimated crude hazard functions. However, the approximate vari-
« N£ § ¾ ¿ £
£ !{o#NžL{’ ½ net $N# , (given in (A.2)), has
|› ¨  › $ {I#N˜ crd
©
a different form from that of usual weighted logrank statistics. For a time |(Px)(+*&, such that
Pr OP5(Ád @<A$B 3¤ # and Pr !-9O}&(Âde @<A$B H¤ # are positive for ¤ÃyG ® , the following result,
½
whose proof is outlined in the appendix, summarizes the large-sample distribution of ¼ $N# .
ability of
½
¼ >N#
around the estimand,
13
R ESULT 2. Under Assumption , and for G­tx1( ,
½
½ÇÈ>É
)ÅÄ Æ ¼ $N#[2
>N#^Ê–Ë Ìc$N#W
Ë
H
Also, for Ð with GÑJÐ
|Á&(
weakly in distribution, as
Í
Ìc$N# is a Gaussian process with ­]nÌF>N#…`ÎÏG .
, the covariance ÒuÓKÔm]nÌcVÐ^#W…Ìc>N#…` has the form (A.2) given in
, where
the appendix.
The expression for ҁÓKÔI]nÌc?Ðl#……Ìc$N#6` in (A.2) involves quantities, defined in the appendix,
½ crd
£ $ N# , £ >N# , and f £ , ¤­HG:® . By replacing the latter with
ª
½ crd
£
£
their sample analogues ¼ >N# , ¼ >N# , and f ¼ £ in all the defining expressions in the appendix,
ª
Õ
we let the statistic ¼ VÐ
|N# be the resulting sample analogue of expression (A.2). Then, using
Result 2 and Slutsky’s theorem, it can be easily shown that, for times with G=xg|( ,
that are functions of the unknowns
½
½
ÅÄ Æ ¼ $N#2
in distribution, as
wise) for
½
net
½
×WØ:Ùm]»2Ú¼ $N#6`
$N# .
ÏË
net
Å
$N#^Ê Æ Õ ¼ $…|N#KÊ  Ä Ë ²Ö?G:®^#W
(3 5)
. We can then use (3 5) to obtain confidence intervals (point-
Using the identity
MX>N#Á×WØ:Ùm]»2
½
net
$N#6`
, we obtain an estimator,
)¼M $N# C , that is consistent for the survival curve. We can use the same relation to obtain
confidence intervals for
½
Í
net
MX>N#
as the reciprocals of the exponentiated confidence intervals for
>N# .
X?‡Ûd „Ü‘ # instead of
X?‡Úd „ # , it requires only the implication of Assumption 1 that survival and administrative
censoring times - are independent in the cohort, and partially unobserved, group of subjects.
Note that, because this estimation method is based on likelihood
The full structure of Assumption 1 was important in Result 1 to demonstrate that stratified
Kaplan-Meier estimators are generally not appropriate in this setting.
14
4. Projected Simulations from Actual Study
4 1 Setting.
Results for the “Tharies” type of hip-replacement prostheses were reported by Dorey and Amstutz (1989), who discussed 1985 data for the first 100 such prostheses implanted in their
hospital starting in 1975. These prostheses were implanted in the first 2-year window (19751977) and, although they all had a potential follow-up time
-)
of at least 8 years, 45 (45%)
of the patients were lost to follow-up within 8 years following their surgery and before prosthesis failure (i.e.,
ÝrÞ
, although the dropout times are not known to us). Among those
original dropouts, the intended data
4 568$#
were recovered for 35 patients (78%=35/45) af-
ter an intensive search reported by the investigators. Subsequently, the survival curve for the
cohort was computed using a single Kaplan-Meier estimator, where the non-sampled dropouts
were treated as administratively censored. Because the general inconsistency of this unstratified Kaplan-Meier estimator in our framework follows from techniques used in Sec. 2.4, we do
not simulate detailed results for it, but we use their reported curve as a reference “true curve”
below. In addition, the study had, essentially, no administrative censoring because patients who
got Tharies in the 8-year window 1977-1985 were ignored. In this Section, by projecting information from the Tharies study to allow potential inclusion of patients from the whole 10-year
window available in 1985, we will compare our method with the stratified Kaplan-Meier when
both dropout and administrative censoring are present, in conditions summarized in Table 1.
To reflect looking at accrual within the 10-year window, in all 12 conditions of Table 1,
(+*1,3 K G years and we simulated entry times uniformly in (0,10) [so, -.ß
)
1KGa2x.!#Zàá?G  KGL# ]. We matched points on the Tharies empirical survival curve of Dorey
Z‘
and Amstutz (1989; Fig. 1, “complete analysis”) to a model for where C ÏâSÓ
ã· !# is
normally distributed, giving mean äm塿/Yâ¸ÓRã·?ç éè years # and standard deviation ê¡å¥æÁ9G ìë
í Þ .
The resulting survival probabilities M)$N# at times ƒî ë and 8 years are 72.65%, 65.19%,
we set
15
and 58.20%, and, in the following, will be treated as true and will be our estimation goal. The
simulations follow Assumption 1.
The Tharies’ study-dropout rate of 45% in the first 8 years of follow-up means that the
fraction of true dropouts
?ZIHGL#
would be larger than 45%. To reflect this, we fixed Pr !/I
GL#­ÏïRG
ð , and used the models described in the next paragraphs to get study-dropout rates
f (ñoòq B ó C Pr ? @<A$B ~Gode-"O}ÞL# in the range 40% to 46% when, as with the Tharies’ 45% study(+
dropout, we condition on -OyÞ . We also report our rates f B C Pr ? @VA$B JGL# over the full
10-year window. To investigate plausible conditions for the remaining associations between
true dropout status, survival, and dropout time among true dropouts, for which we do not have
further information from the Tharies study, we posit the following models.
We draw conditionally on the survival times following the probit model Pr !ÁIJ:d&â¸Ó
ã¡
N#Hô]lõ Š a
z Š 6`
$#
, where ô/1%j# is the standard normal cumulative distribution function. Because
we have fixed the marginal probability of being a true dropout to 50%, we vary Š Û2aR6G ®
,
Š V qí jíRè 6G ®2 í:ìíRè # . Under this setting, â¸ÓRã· # has the same standard deviation within both sets ]^ C ZG:` and ]^ C cn` , and de dqY generates a difference of
Š
1.05 standard deviations between the means of â¸Ó
ã· # in the two groups defined by . So, the
meaning of is how long the true nondropout patients’ Tharies would last compared to true
Š
which determines õ
dropouts.
Among true dropouts,
survival time
.
]^ C Z\yG:`
, we allow the dropout time +
To do this, we simulate
and right-truncated at
Z‘
. That is,
to be related to the
from a log-normal regression on
‘ ‘
‘
‘
P…6HHGL#à Pr ]l d €…6yHG:6 `» where
‘ ‘
‘
‘
P…6H3GL#3² w Iä ÷Læzgø ê¥÷
æ >[2ùä¹å¥æ|#Wêm÷
ú æ 1)2ùø ú6# Pr ? d
ê 塿

âSÓ
ã·?#Kdö ‘
‘ ?â¸Ó
ã· >#|#
16
spectively, for simplicity.
corr ?5
|d – GL# ,
‘
äm÷
梃â¸Ó
ã·!)(+*&,n# and ê¥÷LæÃ_ê 塿 , re‘
By varying the parameter ø we induced different values of ø C where we fix the mean and standard deviation of
,
the correlation between patient dropout times and prosthesis survival
among the true dropouts.
To match the observed sample size accrual rate of 100 patients in the actual Tharies 2year window, for each individual study and in each of the 12 conditions defined by the above
settings we simulated
}³ïnG
G
patients in the 10-year window. In each individual study, we
subsequently double-sampled study-dropouts with a 50% probability each.
We set our goal to compare performance, in terms of coverage rates of nominal 95% confidence intervals and mean squared errors for the survival probability M)>N# of Tharies at Hî:
ë
and 8 years, and evaluated over 2500 replicated studies for each condition in Table 1, using
(i) the procedure described in Section 3, labeled IML because it is the maximum likelihood of
the
@<A$B –stratification of the intended data; and (ii) the procedure labeled SKM, based on the
stratified Kaplan-Meier estimator described in Section 2.4 and a normal approximation to its
distribution. The standard error for SKM is obtained by the delta method; these standard errors
were very close to the sampling standard deviations of SKM as computed over the simulation
replications for each condition (not shown). An S-plus5 program with FORTRAN subroutines for general implementation of procedure IML, and an S-plus5 program for generating the
simulations described here are available from the authors.
4 2 Results.
Based on the conditions in Table 1, when the true dropout patients’ Tharies have longer survival
times than those of the true nondropouts (conditions with
comparably except for the cases where the correlation
ø
Š ³2a
), SKM and IML perform
between patients’ dropout time and
Tharies survival is 0.34 or 0.16, where IML is superior to SKM. These cases, among all those
17
with
Š û2a
, have the highest fraction of study-dropouts f
(+ñoòq B ó
and closest to the observed
45% based on the 100 patients reported by Dorey and Amstutz (1989). This indicates that, even
with relatively small association between survival time and dropout time among true dropouts,
the fraction of study-dropouts influences the performance of the SKM procedure.
[ Table
1 here ]
This influence can also be seen in the conditions where true dropouts would have the same
Tharies survival distribution as the true nondropouts (conditions with
Š _G
). Because the
survival distribution for the whole cohort is the same across all sensitivity conditions, the prostheses for true dropout patients must have shorter survival times under conditions G than
Š under conditions ü2 . Consequently, by comparing the results assuming 2 to
Š
Š
those assuming 9G with comparable values of the correlation ø , we observe that (i) there
Š
(+
(
are larger study-dropout rates, f ñoòqB ó and f B , and (ii) in these cases, SKM notably undercovers the true probability values. Nevertheless, IML has sufficient coverage, and is also more
accurate than SKM as indicated by the mean squared errors.
Finally, the conditions of Table 1 in which the prostheses for true dropout patients have
Š r
) are
these conditions, among all conditions in the Table, give the highest study-dropout rates f
(ñoòq B ó ,
smaller survival times than the prostheses for true nondropouts (conditions with
associated with high correlation, ø , between patients’ dropout time and Tharies survival. Also,
and the closest to the observed 45% from the actual study. In these conditions, SKM performs
very poorly, whereas IML performs quite well.
5. Further Remarks and Extensions
Among the dropouts pursued in the second phase, there can be a small number of cases for
which the data
4 V68$#
still do not get recorded, perhaps if a person has died before Z(+*1, . If,
in these cases, one still wished to project a hypothetical status of the prosthesis at (+*1, , which
18
would have to be assessed based on other observed information, we assume these special cases
would be treated as administratively censored, and, therefore, practically we can still assume
that if MoI; , then the data
4
< 68$#
are recovered.
As noted in Section 2.3, the data
4
< 8=>#
for dropouts with early entry times generally
carry more information than those with later times. However, a nonprobability sampling in the
second phase that would include only study-dropouts with early entry times cannot generally
give appropriate inference for the cohort without using parametric model extrapolation because,
by Result 1, the study-dropouts with early entry times have different survival distributions from
the study-dropouts with later entry times.
We restricted attention to the simple random sampling at the second phase, for convenience
in demonstrating estimation and to clarify the complication of using the stratified KaplanMeier estimator even in this simple design. Alternative designs in the second phase can be
implemented that use different probabilities of selection for different subjects. When these
probabilities depend on continuous, as opposed to discrete, covariates, such as the entry times,
the method proposed here based solely on stratification on
@<A$B
status would need adjustment
to reflect the design probabilities of selection when estimating the cohort survival curve.
As mentioned in Section 3.1, when, in addition to the data
„–‘
, the data
„
are available
for analysis, the proposed method can be further improved by semiparametric or parametric
methods. Moreover, the implementation of estimation can be simulation-based, as opposed to
analytic, and can be motivated even within our simple setting where closed-form approximate
inference exists. For example, one could use the permutation distribution of the data the subsample ]^
C Mom;R`
of study-dropouts to “impute” the missing data 4 58=>#
4 <8=>#
of
for the non-
subsampled study-dropouts. For each such configuration, a Kaplan-Meier statistic could be
calculated for the “imputed” full cohort, where censoring and survival times are independent,
and these statistics could then be averaged to give an estimate for the cohort survival curve.
19
Because the total number of such permutations can be very large, this approach practically
would rely on principled simulation methods coupled with principled methods of analysis of
multiply imputed datasets (Rubin, 1987; Glynn, Laird and Rubin, 1993; Tu, Meng, and Pagano,
1993; Efron, 1994; Lin, Fleming and Wei, 1994; Rubin, 1996; Wang and Robins, 1998).
Simulation-based implementation of estimation can be particularly relevant in extensions of
our setting when analytic derivation becomes less tractable, for example, in situations when
Assumption 1 is relaxed to allow for non-homogeneous cohorts with calendar time trends in
survival and true dropout behavior.
A final comment is that we expect many studies of survival employ at least some type of
informal double-sampling of dropouts, and we hope that the availability of our framework and
methods stimulate the study and use of double-sampling to address dropout with survival data.
ACKNOWLEDGEMENT
The authors thank the orthopedic clinic of IKA Hospital in Melissia, Athens, Greece, for hosting in part this work; Stuart G. Baker and Emanuel Frangakis for useful discussions; the editor
and reviewers for helpful comments; and the support from the U.S. National Institute of Child
Health and Human Development (R01 HD38209), National Institute of Mental Health (R01
MH56639), National Institute for Drug Abuse (R01 DA10184), the Johns Hopkins H.-C. Yang
Memorial Faculty Fund, and the National Science Foundation.
R EFERENCES
Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996). Identification of causal effects using instrumental
variables (with Discussion). Journal of the American Statistical Association 91, 444–472.
Austin, M. A., Berreyesa, S., Elliott, J. L., Wallace, R. B., Barret–Connor, E., and Criqui, M. H. (1979).
Methods for determining long-term survival in a population based study. American Journal of Epidemiology 110, 747–752.
20
Baker, S. G., Wax, Y., and Patterson, B. H. (1993). Regression analysis of grouped survival data: informative censoring and double sampling. Biometrics, 49, 379–389.
Billingsley, P. (1979). Probability and Measure. New York: Wiley.
Cochran, W. (1963). Sampling Techniques. New York: Wiley.
Dobbs, H. S. (1980). Survivorship of total hip replacements. The Journal of Bone and Joint Surgery [Br]
62, 168–172.
Dorey, F. and Amstutz, H. C. (1989). The validity of survivorship analysis in total joint arthroplasty. The
Journal of Bone and Joint Surgery [Am] 71, 544–548.
Efron, B. (1994). Missing data, imputation, and the Bootstrap (with Discussion). Journal of the American Statistical Association 89, 463–478.
Flehinger, B. J., Reiser, B., and Yashchin, E. (1998). Survival with competing risks and masked causes
of failures. Biometrika 85, 151–164.
Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. New York:
Wiley.
Frangakis, C. E. (1999). Coexistent complications with noncompliance with study-protocols, and implications for statistical analysis. Unpublished PhD thesis (appendix of Part 3), Department of Statistics, Harvard University.
Frangakis, C. E. and Rubin, D. B. (1999). Addressing complications of intention-to-treat analysis in
the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes.
Biometrika 86, 365–379.
Gartland, J. J. (1988). Orthopaedic clinical research. Deficiencies in experimental design and determination of outcome. The Journal of Bone and Joint Surgery [Am] 70, 1357–1364.
Glynn, R. J., Laird, N. M., and Rubin, D. B. (1993). Multiple imputation in mixture models for nonignorable nonresponse with follow-ups. Journal of the American Statistical Association 88, 984–993.
Hogan, J. W. and Laird, N. M. (1996). Intention-to-treat analyses for incomplete repeated measures data.
Biometrics 52, 1002–1017.
Holland, P. (1986). Statistics and causal inference. Journal of the American Statistical Association 81,
21
945-970.
Hougaard, P. (1999). Fundamentals of survival data. Biometrics 55, 13–22.
Kaplan, E. L. and Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of
the American Statistical Association 53, 457–481.
Lin, D. Y., Fleming, T. R., and Wei, L. J. (1994). Confidence bands for survival curves under the proportional hazards model. Biometrika 81, 73–81.
Murray, D. W., Carr, A. J., and Bulstrode, C. (1993). Survival analysis of joint replacements. The Journal
of Bone and Joint Surgery [Br] 75, 697–704.
Neyman, J. (1923). On the application of probability theory to agricultural experiments: essay on principles, Section 9. Translated in Statistical Science, 5, 465–480, 1990.
Robins, J. M. and Greenland, S. (1994). Adjusting for differential rates of prophylaxis therapy for PCP
in high-versus low-dose AZT treatment arms in an AIDS randomized trial. Journal of the American
Statistical Association 89, 737–479.
Robins, J. M., Greenland, S., and Hu, F.-C. (1999). Estimation of the causal effect of a time-varying
exposure on the marginal mean of a repeated binary outcome (with Discussion). Journal of the
American Statistical Association 94, 687–712..
Robins, J. M., Rotnitzky, A., and Zhao, L.-P. (1994). Estimation of regression coefficients when some
regressors are not always observed. Journal of the American Statistical Association 89, 846–866.
Rotnitzky, A. and Robins, J. M. (1995). Semiparametric regression with follow-up of nonrespondents.
Technical report, Department of Biostatistics, Harvard School of Public Health.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies.
Journal of Educational Psychology 66, 688–701.
Rubin, D. B. (1976). Inference and missing data. Biometrika 63, 581–592.
Rubin, D. B. (1978). Bayesian inference for causal effects. Annals of Statistics 6, 34–58.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Rubin, D. B. (1996). Multiple imputation after 18+ years (with Discussion). Journal of the American
Statistical Association 91, 473–489.
22
Rubin, D. B. and Frangakis, C. E. (1999). Comment on “Estimation of the causal effect of a time-varying
exposure on the marginal mean of a repeated binary outcome” by J. M. Robins, S. Greenland, and
F.-C. Hu. Journal of the American Statistical Association 94, 702–704.
Sims, A. C. P. (1973). Importance of a high tracing rate in long-term medical follow-up studies. Lancet
No. 2, 433–435.
Tu, X. M., Meng, X. L., and Pagano, M. (1993). The AIDS epidemic: estimating survival after AIDS
diagnosis from surveillance data. Journal of the American Statistical Association 88, 26–36.
Wang, N. and Robins, J. M. (1998). Large-sample theory for parametric multiple imputation procedures.
Biometrika 85, 935–948.
Wax, Y., Baker, S. G., and Patterson, B. H. (1993). A score test for non-informative censoring using
doubly sampled grouped survival data. Applied Statistics 42, 159–172.
Zanutto, E. L. (1998). Imputation for unit nonresponse: modeling sampled nonresponse follow-up, administrative records, and matched substitutes. PhD thesis, Department of Statistics, Harvard University.
A PPENDIX : O UTLINE
OF PROOF OF
½
R ESULT 2
¼ >N# , we will use some additional definitions. For OxG ,
£
«
„
„ >N# ;
let >N#ƒ] £&¬j§ ›|¨ £&¬ $N#>f £&¬ `
 ú , and, for ¤ÚFG \ , let: £ ¨ ý?þWÿ ®>N# C r12Â^# f¥›<f
>
N
#
ª 
ª
£ ¨ ý?þ ®$N# C ‚&2aK# £ $f¥›Vf ›K$N# „ $N# ; and £ ¨ ý ®$N# C F12© ^# £  >N# ›l$N# „ $N# . In the above,
ª
ª  ª
© £ ¨ ý?þuÿÄ ®$N# is the quantity obtained when the© the Äweight £ $N# in (3 2) is regarded as a function
© ›l>N#W $N# , and f , and is differentiated partially© with respect to ›l>N# . The quantities
of

ª
ª 
ª
£ ¨ ý?þ $N# and £ ¨ ý $N# have similar interpretation. In an analogous way, we define the funcĽ
« Ä £N§ ¾ ¿ £
« £N§ ¾ ¿ £
½ £
½
½ £
©
tions crd
${o# , crd
!{o# ,
›|¨  › ¨ ý?þWÿ ${I#1ž crd
›|¨  › ¨ ý?þ ${I#1ž crd
ý?þWÿ >N# © C ý?þ >N# C Ä
Ä
« £N§ ¾ ¿ £ ©
½
½ £
and crd
!{o# . Using these definitions, we© express Ì9h Kk1$N# C ›|¨  › ¨ ý K${o#Nž crd
ý >N# C ½ Ä ½
© Ä form, in Lemma A.
ÅÄ Æ ¼ >N#2 net $N#KÊ in a linearized
To state the approximate variability of
23
L EMMA A. Under Assumption , and for G
gx1(
,
Ì hKk $N#ÁÌ ý h ®k $ N#\zƒ£N¦ § Ì ý?hþ ®¶ k >N#\zxÌ ýh
®¶k $N#\zq$  ÅÄ #…
›|¨ 
Ä
¿ ½
Ì ýh Kk $N#~ ÅÄ f ¼  2Úf  # ž ý !{o#…
›
Ä
Ä
¿ ‘
À
£
£
Å
Å
£
£

Ä
Ä
h
®
k
Ì ý?þ ¶ $N#~ ¦
¨ 5$N#…
¨ &>N#
]l° !{o#[2 ª
› ¸´®µ ¶
À ¿ £
Å
Å
${I# ž
£ ¨ &$N#…
£ ¨ 5$N# £ Ä
Ì ýh
K¶k $N#~ £  Ä ¦
£
!{o#
›
¸´®µ¹¶
À ©ª ¿
‘
£ ¨ 1$N#3² ‘ $N#[2
and
° !{o#Nž
›
À
where
£ !{o#…`lž ½ ?ý þW ¶ !{o#W
£ ¨ 5${I#…
½£ ${o#
Proof. We have the identity
Ì h®k >N# £N¦ § )ÅÄ
|› ¨ 
¿ £ 2 £
¿ £
¼ £ ž»² £ z
¦ ž £ ¨ "!$ # £
°
°
›
›
©
À ©
À © ¸ ´®µ ¶
(A.1)
where we have suppressed the variable of integration. The first summands in the right hand
side of (A.1) are asymptotically equivalent to
expansion we have
¾ ¿
½ £
ÅÄ › ¼ £ 2 £ #Nž crd
.
©
©
Moreover, by a Taylor
ÅÄ ¼ £ 2 £ #~ ÅÄ f ¼  2Úf  # £ ¨ ý z ÅÄ £ ¦ § ª ¼ £ æ2 ª £ æ…# £ ¨ ý?þ ¶ æ z%q!  ÅÄ #
æ |› ¨ 
©
©
© Ä
©
Substituting this expansion in (A.1), and noting that ° £ n £ converges uniformly in probability
to £ in & G |5((' as £ Ë Í , by the Glivenko-Cantelli Theorem (Billingsley, 1979, Theorem
ª
20.6), Lemma A follows after some algebra.
Using Corollary B.1.1 of Fleming and Harrington (1991), in order to show Result 2 it suffices to show two conditions: (i) for any finite set of times converges in distribution to
VÌc$  #W S 6ÌF>)®#N# , where Ì
24
 S N ) , the vector ?Ì h®k $  #W S 6Ì hKk >)K#|#
is a Gaussian process with moments
as in Result 2, and (ii) the process Ì
h®k
is “tight” in the sense of Fleming and Harrington (1991,
p. 340).
The large-sample normality in condition (i) follows from applying the central limit theorem
and Theorem 2.4.4 of Fleming and Harrington (1991) on Lemma A. This application also
shows that, for fixed times G­€Ð
|}X(+*1, , the covariance cov ]nÌc?Ðl#……Ìc$N#6` is given by,
Õ
Õ
Õ
ý V Ð
|N#\z £N¦ §
?ý þ ¶ ?ÐLNN#\z ý
¶
Ä
|› ¨ 
where
Õ ý KVÐ
|N# gf f¥› ¿ ž ½ ${o# B ž ½
C  ›
ý ›
Ä
Ä
À
À
and, for ›D^ q]6f¥›Vf h'ilk ` , J^ …f

¿
$N#\z
ý
Ä
Õ
ý?þ ¶ ¨ ¶ VÐ
|N#\z
Õ
ý?þ ¶ ¨ ¶ $…6Ðl#…
(A.2)
‘
${ W#  , and ¤­HG:®R
‘
B £
£ ${o# £ !{ ‘ # '^ž ½.ý?þ ¶ $ { ‘ #Nž .½ ?ý þ ¶ !{o#…
&
]
+
*-,lØm!{"{ #6`Z2
ª
ª
› › ª
À 02À 1eÇ ¨ ¿ £
h B k ] !{o#…` ú ½£ Õ ý
KVÐ
|N# / £
${o#W and
¶
C
£ ${I# ž
›
©
ª
À
021eÇ ¨ ¿
h
k B £ ${ ‘ # £
B
½ ‘ ½ £ Õ ý?þ ¨ ®VÐ
|N# J23 £
!
o
{
1
#
ž
!o{ #
¶ ¶
ª
C
ý?þ ¶ !{ 1# ž
£
4
$
I
{
#
›
À
À ª
©
Õ
ý?þ ¶ VÐ
|N# C £
The proof of condition (ii) is not difficult but is more laborious and is omitted here. Further
details can be found in Frangakis (1999, appendix of Part 3).
25
Figure 1. Potential outcomes and observed data with double-sampling. For each part, solid
lines represent observed information, dashed lines represent unobserved information: (a) survival times, (arrows), and true dropouts (circles); (b) administrative censoring with true dropout;
(c) study-dropouts; (d) subject 2 is double-sampled.
26
Subject
R
Ei
E max
i
1
1
2
3
4
5
1
0
0
0
0
0
1
1
1
(a)
E max
Subject
(c)
∆i
E max
1
0
1
0
0
1
2
3
4
5
obs
Ri
Si
0
1
0
0
0
(b)
(d)
27
Table 1
Projected Tharies prosthesis study of Sec. 4: sensitivity inference for the prostheses survival probabilities M)>N# at Hî
çLïRð
ë
and 8 years from primary surgery; coverage of nominal
confidence intervals and mean squared error.
576
-1
8:9
-0.4 -0.2
8
;=<?
>A@ E
BDCF
(%)
;=<?>A@
(%)
0
0.0
0.2
-0.4 -0.2
1
0.0
0.2
-0.4 -0.2
0.0
0.2
0.16 0.34 0.46 0.60
40.3 38.8 36.7 34.6
19.7 19.4 18.3 16.4
0.34 0.45 0.57 0.67
43.0 42.2 41.4 40.5
23.1 23.2 22.5 21.5
0.55 0.63 0.68 0.72
45.8 44.9 44.2 44.0
26.5 26.8 27.3 27.0
93.1 94.4 93.9 95.4
94.2 95.5 94.4 94.6
86.4 85.7 84.3 82.8
94.2 95.3 94.3 93.8
36.2 36.8 40.5 38.0
94.0 94.1 94.1 94.3
1.10 0.97 0.96 0.87
0.91 0.84 0.92 0.88
1.35 1.32 1.37 1.46
1.06 0.99 1.03 1.03
4.35 4.35 4.17 4.13
1.10 1.05 1.11 1.05
92.0 93.3 93.9 94.9
94.5 95.6 93.5 95.0
86.2 85.2 84.2 81.5
95.3 94.8 95.6 94.1
33.8 34.7 35.4 35.7
94.4 94.0 93.9 94.5
1.65 1.43 1.33 1.25
1.28 1.19 1.28 1.25
1.84 1.91 1.98 2.13
1.34 1.37 1.36 1.40
6.34 6.20 6.20 6.07
1.38 1.44 1.43 1.38
90.3 92.4 93.5 94.1
94.1 94.9 94.2 94.8
87.6 85.8 85.2 83.4
94.8 94.6 94.4 94.5
44.6 45.5 47.5 46.8
94.2 94.2 94.2 94.7
2.78 2.36 2.08 1.94
1.83 1.69 1.77 1.70
2.43 2.48 2.58 2.73
1.90 1.90 1.89 1.85
7.46 7.35 7.17 7.07
1.94 1.94 1.85 1.83
G2HJI"K
Coverage (%)
SKM
IML
HL2MN:O7K
MSE
SKM
IML
G2HP:K
Coverage (%)
SKM
IML
HL2MN:O7K
MSE
SKM
IML
G2HJQ"K
Coverage (%)
SKM
IML
HL2MN:O7K
R
MSE
SKM
IML
(+ñoòq B óTS
corr UWa
Pr UWV
cbed Y V @<A$B
SfX _
S
XY[Z ]\
^+_
;
R
( B
are considered because
S
Pr UWV
a ?ghd
@<A$B
S
X _
; only positive values of
for true dropouts.
SKM, stratified Kaplan-Meier method; IML, method of Section 3.2.
28
`
S