Joint Modelling of Accelerated Failure Time
and Longitudinal Data
By
Yi-Kuan Tseng
Joint Work With
Professor Jane-Ling Wang
Professor Fushing Hsieh
Tseng, Y.K., Hsieh, F., and Wang, J.L. (2005). Biometrika, 92, pp. 587-603.
[Figure: CD4 count plot of five patients (IDs 69, 58, 62, 64, 74). x-axis: Days (0 to 3500); y-axis: CD4 count (0 to 60).]
I. Introduction
$W(t) = X(t) + e(t)$
$X(t)$: longitudinal covariates
$e(t)$: independent measurement error
$\lambda\{t \mid \bar{X}(t)\} = \lambda_0(t) \exp\{\beta X(t)\}$
$\bar{X}(t) = \{X(s) : 0 \le s \le t\}$: covariate history up to time t
$\beta$: regression parameter
$\lambda_0$: unspecified baseline hazard rate function
Example: CD4 counts and time to AIDS (or death)
$X(t) = b_0 + b_1 t$
Self and Pawitan (1992), DeGruttola and Tu (1994),
Tsiatis et al. (1995), Faucett and Thomas (1996),
Wulfsohn and Tsiatis (1997), Bycott and Taylor (1998),
Dafni and Tsiatis (1998), Tsiatis and Davidian (2001)
$X(t) = f(t)^T b + U(t)$
$f(t)$: a vector of known functions of time t; $U(t)$: a stochastic process
Taylor et al. (1994), LaValley and DeGruttola (1996),
Henderson et al. (2000), Wang and Taylor (2001), Xu and Zeger (2001)
Two-stage partial likelihood approaches
-truncation causes bias
Joint likelihood approaches
-robust to the distribution of random effects
- unbiased
- efficient
Bayesian approaches
Conditional score approaches
The accelerated failure time (AFT) model is an attractive alternative when the proportional hazards assumption fails.
For time-independent covariates X:
$\log T = -\beta' X + e$
T: survival time; X: time-independent covariates; e: random error
Suppose $S_0$ is the baseline survival function (that of $T \mid X = 0$). Then
$S(t) = S_0\{t e^{\beta' X}\} = S_0(u)$
$U \sim S_0$, where $T e^{\beta' X} = U$
U: the time a subject would have lived had there been no exposure (X = 0)
Example: $u = t \exp(\beta' X) = t \times 2$. A subject at t = 30 (years old) has the same survival probability as a baseline subject at u = 60 (years old), i.e. the subject ages twice as fast.
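The rescaling in the example above can be checked numerically. The sketch below is only an illustration: the function names and the exponential baseline `s0` are assumptions made here, not part of the slides.

```python
import math

def baseline_time(t, beta, x):
    """Map observed time t to the baseline time scale u = t * exp(beta'x)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return t * math.exp(linear_predictor)

def survival(t, beta, x, baseline_survival):
    """S(t | x) = S0(t * exp(beta'x)) for time-independent covariates."""
    return baseline_survival(baseline_time(t, beta, x))

# Slide example: exp(beta'x) = 2, so t = 30 corresponds to u = 60.
beta = [math.log(2.0)]
x = [1.0]
u = baseline_time(30.0, beta, x)

# Hypothetical baseline survival function, chosen only for illustration.
def s0(u):
    return math.exp(-u / 100.0)

s_exposed = survival(30.0, beta, x, s0)   # equals s0(60)
```

Any other baseline survival function could be passed in place of `s0`; the point is only that the covariate acts by stretching or shrinking the time axis.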
For time-dependent covariates X(t), we consider the AFT model in Cox and Oakes (1984):
$U \sim S_0$, where $U = \psi\{\bar{X}(T); \beta\} = \int_0^T \exp\{\beta' X(s)\}\, ds$
$S\{t \mid \bar{X}(t)\} = S_0[\psi\{\bar{X}(t); \beta\}]$
Biological meaning: allows the entire covariate history to influence the subject-specific risk.
For an absolutely continuous $S_0$, the hazard rate function given the covariate history is:
$\lambda\{t \mid \bar{X}(t)\} = \lambda_0\left[\int_0^t e^{\beta' X(s)}\, ds\right] e^{\beta' X(t)} = \lambda_0[\psi\{\bar{X}(t); \beta\}]\, \psi'\{\bar{X}(t); \beta\}$
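For a linear trajectory X(s) = b0 + b1 s, the time transformation (the integral of exp{beta X(s)} from 0 to t, called `psi` here) has a closed form, which makes the hazard formula easy to evaluate. The following is a minimal sketch under that assumption; the function names and the numbers in the demo are illustrative and not taken from the slides.

```python
import math

def psi_closed_form(t, beta, b0, b1):
    """psi(t) = int_0^t exp{beta * (b0 + b1 * s)} ds for a linear trajectory."""
    rate = beta * b1
    if abs(rate) < 1e-12:
        return math.exp(beta * b0) * t
    return math.exp(beta * b0) * math.expm1(rate * t) / rate

def psi_numeric(t, beta, trajectory, n_steps=2000):
    """Trapezoidal approximation of int_0^t exp{beta * X(s)} ds for any X."""
    h = t / n_steps
    values = [math.exp(beta * trajectory(k * h)) for k in range(n_steps + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def hazard(t, beta, b0, b1, baseline_hazard):
    """lambda(t | history) = lambda0(psi(t)) * psi'(t), with psi'(t) = exp{beta * X(t)}."""
    psi_prime = math.exp(beta * (b0 + b1 * t))
    return baseline_hazard(psi_closed_form(t, beta, b0, b1)) * psi_prime

def event_time_from_baseline(u, beta, b0, b1):
    """Invert psi: return the T with psi(T) = u.
    Assumes 1 + beta*b1*u*exp(-beta*b0) > 0, otherwise no finite T exists."""
    rate = beta * b1
    if abs(rate) < 1e-12:
        return u * math.exp(-beta * b0)
    return math.log1p(rate * u * math.exp(-beta * b0)) / rate

# Illustrative values only.
beta, b0, b1, t = 1.0, 1.0, 0.5, 2.0
exact = psi_closed_form(t, beta, b0, b1)
approx = psi_numeric(t, beta, lambda s: b0 + b1 * s)
t_back = event_time_from_baseline(exact, beta, b0, b1)
h_const = hazard(t, beta, b0, b1, lambda u: 1.0)  # constant baseline hazard
```

With a constant baseline hazard of 1 the hazard reduces to exp{beta X(t)}, which is a quick sanity check on the formula.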
If the baseline hazard is unspecified, the expression corresponds to a semi-parametric model.
Robins and Tsiatis (1992): rank estimating equation
Lin and Ying (1995): asymptotic consistency and normality
Hsieh (2003): over-identified estimating equation
Goal of the study: provide effective estimators for β, with an unspecified baseline hazard, and for the parameters of the longitudinal process.
Different assumptions on baseline hazard:
-- Wulfsohn and Tsiatis (1997)
Discrete baseline hazard with jumps at event times
-- Our assumption:
The baseline hazard is a step function.
II. Joint AFT and Longitudinal model
Notation:
$T_i$: event time of subject i, $i = 1, \ldots, n$
$C_i$: censoring time
$V_i$: observed time, $\min(T_i, C_i)$
$\Delta_i = 1(T_i \le C_i)$: event indicator
$t_i$: measurement schedule $(t_{ij} : t_{ij} \le V_i)$, $j = 1, \ldots, m_i$
$W_i$: response $(W_{ij} : t_{ij} \le V_i)$
$X_i(\cdot)$: time-dependent covariate
$e_i$: measurement error
Observed data for each i: $(V_i, \Delta_i, W_i, t_i)$, independent across i.
Model for longitudinal data:
$W_i = X_i(t_i) + e_i$
$X_i(t) = b_i^T \rho(t)$ (linear mixed effects model)
$\rho(t) = \{\rho_1(t), \ldots, \rho_p(t)\}^T$: vector of known functions of time t
$b_i^T = (b_{i1}, \ldots, b_{ip})$: p-dimensional random effects, $b_i \sim N_p(\mu, \Sigma)$, independent of $e_i$
$e_i \sim N(0, \sigma_e^2 I)$
Examples:
$p = 2$, $\{\rho_0(t), \rho_1(t)\} = (1, t)$
$p = k$, $\{\rho_0(t), \ldots, \rho_{p-1}(t)\} = (1, t, \ldots, t^{k-1})$
$p = 2$, $\{\rho_0(t), \rho_1(t)\} = \{\log(t), t - 1\}$
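To make the longitudinal model concrete, the sketch below simulates responses from the linear mixed effects model with basis (1, t). It assumes NumPy is available. The parameter values are the targets listed in the simulation tables later in the deck; the function names and the choice to ignore truncation of the schedule at the observed time are assumptions of this illustration.

```python
import numpy as np

def rho(t):
    """Basis (1, t) from the first example; returns an array of shape (len(t), 2)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(t), t])

def simulate_longitudinal(n, times, mu, Sigma, sigma_e, rng):
    """Draw b_i ~ N(mu, Sigma), then W_ij = b_i' rho(t_j) + e_ij with e_ij ~ N(0, sigma_e^2).
    Simplification: every subject is measured at all times (no truncation at V_i)."""
    b = rng.multivariate_normal(mu, Sigma, size=n)      # (n, p) random effects
    X = b @ rho(times).T                                 # (n, m) true trajectories
    W = X + rng.normal(0.0, sigma_e, size=X.shape)       # (n, m) noisy responses
    return b, X, W

rng = np.random.default_rng(0)
mu = np.array([1.0, 0.5])
Sigma = np.array([[0.01, -0.001], [-0.001, 0.001]])
times = np.arange(8)                                     # scheduled times 0, 1, ..., 7
b, X, W = simulate_longitudinal(100, times, mu, Sigma, 0.5, rng)  # sigma_e^2 = 0.25
```

Swapping `rho` for another basis, such as (log t, t - 1), changes only the design matrix; the rest of the simulation is unchanged.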
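Model for survival:
$\lambda\{t \mid \bar{X}_i(t)\} = \lambda(t \mid \beta, b_i) = \lambda_0\{\psi(t; \beta, b_i)\}\, \psi'(t; \beta, b_i)$
where
$\psi(t; \beta, b_i) = \int_0^t e^{\beta X_i(s)}\, ds = \int_0^t e^{\beta b_i^T \rho(s)}\, ds$,
$\psi'(t; \beta, b_i) = e^{\beta X_i(t)} = e^{\beta b_i^T \rho(t)}$
Joint likelihood: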
Assumptions
-- noninformative censoring
-- noninformative measurement schedule tij ,
both are independent of future covariate
history and random effects bi
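$L(\theta) = L(\beta, \mu, \Sigma, \sigma_e^2, \lambda_0) = \prod_{i=1}^{n} \left[ \int \left\{ \prod_{j=1}^{m_i} f(W_{ij} \mid b_i, t_i, \sigma_e^2) \right\} f(V_i, \Delta_i \mid b_i, t_i, \lambda_0, \beta)\, f(b_i \mid \mu, \Sigma)\, db_i \right]$
$f(W_{ij} \mid b_i, t_i, \sigma_e^2)$: density of $N\{b_i^T \rho(t_{ij}), \sigma_e^2\}$
$f(b_i \mid \mu, \Sigma)$: density of $N(\mu, \Sigma)$
$f(V_i, \Delta_i \mid b_i, t_i, \lambda_0, \beta) = [\lambda_0\{\psi(V_i; \beta, b_i)\}\, \psi'(V_i; \beta, b_i)]^{\Delta_i} \exp\left\{ -\int_0^{\psi(V_i; \beta, b_i)} \lambda_0(t)\, dt \right\}$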
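The survival factor of the likelihood is simple to evaluate once the baseline hazard is a step function. The sketch below computes log f(V, Delta | b) for a linear trajectory X(t) = b0 + b1 t. The representation of the step function by `knots` and `heights`, and all function names, are assumptions made for this illustration.

```python
import math
from bisect import bisect_left

def cumulative_step_hazard(u, knots, heights):
    """int_0^u lambda0(s) ds, where lambda0 = heights[k] on (knots[k], knots[k+1]]."""
    total = 0.0
    for k, c in enumerate(heights):
        lo, hi = knots[k], knots[k + 1]
        if u <= lo:
            break
        total += c * (min(u, hi) - lo)
    return total

def step_hazard(u, knots, heights):
    """lambda0(u); taken to be zero outside (knots[0], knots[-1]]."""
    if u <= knots[0] or u > knots[-1]:
        return 0.0
    return heights[bisect_left(knots, u) - 1]

def log_survival_density(v, delta, b, beta, knots, heights):
    """log f(V, Delta | b) for X(t) = b[0] + b[1] * t.
    Assumes lambda0 > 0 at psi(v) whenever delta == 1."""
    b0, b1 = b
    rate = beta * b1
    scale = math.exp(beta * b0)
    psi = scale * v if abs(rate) < 1e-12 else scale * math.expm1(rate * v) / rate
    out = -cumulative_step_hazard(psi, knots, heights)
    if delta == 1:
        # log lambda0(psi) + log psi'(v), with psi'(v) = exp{beta * X(v)}
        out += math.log(step_hazard(psi, knots, heights)) + beta * (b0 + b1 * v)
    return out

# Constant baseline hazard 1 on (0, 100]; values are illustrative only.
knots, heights = [0.0, 100.0], [1.0]
ll_event = log_survival_density(2.0, 1, (1.0, 0.0), 0.5, knots, heights)
ll_censored = log_survival_density(2.0, 0, (1.0, 0.0), 0.5, knots, heights)
```

For this constant-hazard demo the event and censored contributions differ only by the term beta X(V), which matches the formula above.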
III. EM Algorithm
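Complete data likelihood:
$L^*(\theta) = \prod_{i=1}^{n} \left[ \left\{ \prod_{j=1}^{m_i} f(W_{ij} \mid b_i, t_i, \sigma_e^2) \right\} f(V_i, \Delta_i \mid b_i, t_i, \lambda_0, \beta)\, f(b_i \mid \mu, \Sigma) \right]$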
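M-step:
Let $E\{h(b_i) \mid V_i, \Delta_i, W_i, t_i, \hat{\theta}\} = E_i\{h(b_i)\}$ be the conditional expectation based on the current estimate $\hat{\theta} = (\hat{\beta}, \hat{\mu}, \hat{\Sigma}, \hat{\sigma}_e^2, \hat{\lambda}_0)$.
Differentiating $E_i\{\log L^*(\theta)\}$ gives:
$\hat{\mu} = \sum_{i=1}^{n} E_i(b_i) / n$,
$\hat{\Sigma} = \sum_{i=1}^{n} E_i\{(b_i - \hat{\mu})(b_i - \hat{\mu})^T\} / n$,
$\hat{\sigma}_e^2 = \sum_{i=1}^{n} \sum_{j=1}^{m_i} E_i\{W_{ij} - b_i^T \rho(t_{ij})\}^2 \Big/ \sum_{i=1}^{n} m_i$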
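The three closed-form updates can be written in terms of each subject's conditional mean and covariance of b_i, since E_i{(b_i - mu)(b_i - mu)'} equals the conditional covariance plus the outer product of the centred conditional mean. The sketch below assumes NumPy and assumes those conditional moments have already been produced by the E-step; the function and argument names are hypothetical.

```python
import numpy as np

def m_step_closed_forms(post_means, post_covs, W, A):
    """Closed-form updates for mu, Sigma and sigma_e^2.
    post_means[i], post_covs[i]: E_i(b_i) and Cov_i(b_i) from the E-step;
    W[i]: responses of subject i; A[i]: design matrix with rows rho(t_ij)'."""
    n = len(post_means)
    mu_hat = np.mean(post_means, axis=0)
    Sigma_hat = np.zeros((len(mu_hat), len(mu_hat)))
    rss, n_obs = 0.0, 0
    for m, C, w, a in zip(post_means, post_covs, W, A):
        d = m - mu_hat
        Sigma_hat += C + np.outer(d, d)           # E_i{(b - mu)(b - mu)'}
        resid = w - a @ m
        # E_i sum_j (W_ij - b' rho_j)^2 = squared residual at the mean + trace term
        rss += resid @ resid + np.trace(a @ C @ a.T)
        n_obs += len(w)
    return mu_hat, Sigma_hat / n, rss / n_obs

# Tiny illustrative input: two subjects, degenerate (zero) conditional covariance.
design = np.array([[1.0, 0.0], [1.0, 1.0]])
means = [np.array([1.0, 0.5]), np.array([1.2, 0.4])]
covs = [np.zeros((2, 2)), np.zeros((2, 2))]
resp = [design @ m + np.array([0.1, -0.1]) for m in means]
mu_hat, Sigma_hat, sigma_e2_hat = m_step_closed_forms(means, covs, resp, [design, design])
```

In practice the conditional moments would themselves be Monte Carlo approximations, so these updates inherit that simulation error.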
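For $\lambda_0$:
Let $T_1, \ldots, T_d$ denote the d distinct uncensored event times.
The corresponding baseline survival times are:
$u_k = \int_0^{T_k} \exp\{\beta b_k^T \rho(s)\}\, ds$, $k = 1, \ldots, d$
Estimate $u_k$ using the current estimate of $\beta$ and the current empirical Bayes estimate of $b_i$.
Let $u_{(k)}$ denote these estimates in ascending order: $0 = u_{(0)} < u_{(1)} < \cdots < u_{(d)}$.
Therefore, we have
$\lambda_0(u) = \sum_{k=1}^{d} C_k\, 1\{u_{(k-1)} < u \le u_{(k)}\}$
$\hat{C}_k = \dfrac{\sum_{i=1}^{n} E_i[\Delta_i\, 1\{u_{(k-1)} < u_i \le u_{(k)}\}]}{\sum_{i=1}^{n} E_i[\{u_{(k)} - u_{(k-1)}\}\, 1\{u_{(k-1)} < u_i \le u_{(k)}\}]}$
For $\beta$:
Plug $\hat{\lambda}_0$ into $E_i\{\log L^*(\theta)\}$:
$\sum_{i=1}^{n} E_i\left[ \Delta_i \log\left\{ \sum_{j=1}^{d} C_j\, 1\{u_{(j-1)} < u_i \le u_{(j)}\} \right\} + \Delta_i \{\beta b_i^T \rho(V_i)\} - \sum_{j=1}^{d} C_j \{u_{(j)} - u_{(j-1)}\}\, 1\{u_{(j-1)} < u_i \le u_{(j)}\} \right]$
$\quad + \sum_{i=1}^{n} E_i\{\log f(b_i \mid \mu, \Sigma)\} + \sum_{i=1}^{n} E_i\left\{ \sum_{j=1}^{m_i} \log f(W_{ij} \mid b_i, \sigma_e^2) \right\}$
There is no closed-form expression for $\beta$. We may maximize the conditional likelihood by a numerical method.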
E-step:
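To compute $E_i(\cdot)$, we need knowledge of $f(b_i \mid V_i, \Delta_i, W_i, t_i, \hat{\theta})$, which can be expressed as:
$\dfrac{f(V_i, \Delta_i \mid b_i, t_i, \hat{\theta})\, f(b_i \mid W_i, t_i, \hat{\theta})}{\int f(V_i, \Delta_i \mid b_i, t_i, \hat{\theta})\, f(b_i \mid W_i, t_i, \hat{\theta})\, db_i}$
Let $\mu^* = \{\mu^T \rho(t_{i1}), \ldots, \mu^T \rho(t_{i m_i})\}^T$ and $A = \{\rho(t_{i1}), \ldots, \rho(t_{i m_i})\}^T$, so that $\mu^* = A\mu$.
Then $\begin{pmatrix} W_i \\ b_i \end{pmatrix} \sim N\left\{ \begin{pmatrix} \mu^* \\ \mu \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right\}$, and therefore
$b_i \mid W_i, t_i, \hat{\theta} \sim N\{\mu + \Sigma_{21} \Sigma_{11}^{-1} (W_i - A\mu),\ \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\}$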
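The conditional normal distribution of the random effects given the responses can be computed directly. Under the longitudinal model, the blocks work out to Sigma11 = A Sigma A' + sigma_e^2 I, Sigma21 = Sigma A' and Sigma22 = Sigma; these expressions follow from W_i = A b_i + e_i and are spelled out here as part of the illustration. The sketch assumes NumPy, and the function name and input values are hypothetical.

```python
import numpy as np

def conditional_random_effects(W, A, mu, Sigma, sigma_e2):
    """Mean and covariance of b_i | W_i under W_i = A b_i + e_i,
    b_i ~ N(mu, Sigma), e_i ~ N(0, sigma_e2 * I)."""
    S11 = A @ Sigma @ A.T + sigma_e2 * np.eye(len(W))    # Var(W_i)
    S21 = Sigma @ A.T                                     # Cov(b_i, W_i)
    gain = np.linalg.solve(S11, S21.T).T                  # S21 @ inv(S11), S11 symmetric
    mean = mu + gain @ (W - A @ mu)
    cov = Sigma - gain @ S21.T                            # S22 - S21 inv(S11) S12
    return mean, cov

# Illustrative input: basis (1, t) at times 0, 1, 2, 3.
times = np.arange(4.0)
A = np.column_stack([np.ones_like(times), times])
mu = np.array([1.0, 0.5])
Sigma = np.array([[0.01, -0.001], [-0.001, 0.001]])
W = np.array([1.1, 1.4, 2.2, 2.4])
cond_mean, cond_cov = conditional_random_effects(W, A, mu, Sigma, 0.25)

# Cross-check against the equivalent precision form of the same distribution.
prec = np.linalg.inv(Sigma) + A.T @ A / 0.25
alt_cov = np.linalg.inv(prec)
alt_mean = alt_cov @ (np.linalg.inv(Sigma) @ mu + A.T @ W / 0.25)
```

Solving the linear system instead of inverting Sigma11 explicitly is a numerical convenience, not something the slides prescribe.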
To derive $E_i(\cdot)$, we may generate M multivariate normal sequences for $b_i \mid W_i, t_i, \hat{\theta}$, denoted by $N_i = (N_{i1}, \ldots, N_{iM})$:
$E_i\{h(b_i)\} \approx \dfrac{\sum_{j=1}^{M} h(N_{ij})\, f(V_i, \Delta_i \mid N_{ij}, t_i, \hat{\theta})}{\sum_{j=1}^{M} f(V_i, \Delta_i \mid N_{ij}, t_i, \hat{\theta})}$, where M is large.
The accuracy increases as M increases. To obtain higher accuracy with less computing time, we may follow the suggestion in Wei and Tanner (1990): use small values of M in the initial iterations of the algorithm, and increase M as the algorithm moves closer to convergence.
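The Monte Carlo approximation is a self-normalised weighted average: draws from the conditional normal distribution of b_i given W_i are weighted by the survival density. The sketch below assumes NumPy and takes the log survival density as a user-supplied function; working on the log scale and subtracting the maximum is a numerical safeguard added here, and all names are hypothetical.

```python
import numpy as np

def mc_conditional_expectation(h, draws, log_surv_density):
    """Approximate E_i{h(b_i)} by sum_j h(N_ij) w_j / sum_j w_j,
    where w_j = f(V_i, Delta_i | N_ij) and draws[j] = N_ij."""
    log_w = np.array([log_surv_density(b) for b in draws])
    w = np.exp(log_w - log_w.max())      # rescale to avoid underflow
    w /= w.sum()
    values = np.array([h(b) for b in draws])
    return np.tensordot(w, values, axes=1)

rng = np.random.default_rng(1)
draws = rng.multivariate_normal([1.0, 0.5], [[0.01, 0.0], [0.0, 0.001]], size=500)

# Sanity check: a constant survival density gives equal weights (plain average).
flat = mc_conditional_expectation(lambda b: b, draws, lambda b: 0.0)

# Toy density (illustration only) that favours larger intercepts b[0].
tilted = mc_conditional_expectation(lambda b: b[0], draws, lambda b: 5.0 * b[0])
```

Following the Wei and Tanner (1990) suggestion, the number of draws passed in could start small and grow across EM iterations.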
We encounter two difficulties when estimating the standard error of $\hat{\beta}$:
The EM algorithm involves missing information
- Remedies in Louis (1982) and McLachlan and Krishnan (1997) are valid for a finite-dimensional parameter space.
No explicit profile likelihood
- Need projection onto all other parameters
- However, it is very hard to derive because of $\lambda_0$
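Bootstrap technique in Efron (1994):
1. Generate a bootstrap sample $O^*$ from the original observed data $O$.
2. Apply the EM algorithm to the bootstrap sample $O^*$ to derive the MLE $\hat{\theta}^*$.
3. Repeat steps 1 and 2 B times.
4. Compute $\mathrm{Cov}(\hat{\theta}) = \dfrac{1}{B - 1} \sum_{b=1}^{B} (\hat{\theta}^*_b - \bar{\theta}^*)(\hat{\theta}^*_b - \bar{\theta}^*)^T$, where $\bar{\theta}^* = \sum_{b=1}^{B} \hat{\theta}^*_b / B$.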
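The four bootstrap steps can be sketched generically. In the code below the `estimator` argument stands in for a full EM fit; the demo uses a sample mean purely so that the example runs quickly. NumPy is assumed, and the function name and demo data are hypothetical.

```python
import numpy as np

def bootstrap_covariance(data, estimator, B, rng):
    """Resample subjects with replacement B times, re-estimate, and return
    the covariance (1 / (B - 1)) * sum_b (est_b - mean)(est_b - mean)'."""
    n = len(data)
    estimates = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                 # step 1: bootstrap sample
        estimates.append(estimator([data[i] for i in idx]))  # step 2: refit
    estimates = np.asarray(estimates, dtype=float)       # step 3: B replicates
    centered = estimates - estimates.mean(axis=0)
    return centered.T @ centered / (B - 1), estimates    # step 4: covariance

rng = np.random.default_rng(2)
data = list(rng.normal(size=(50, 2)))                    # one row per subject
# Stand-in estimator (NOT the EM algorithm): the sample mean of the rows.
cov_hat, estimates = bootstrap_covariance(data, lambda s: np.mean(s, axis=0), 100, rng)
boot_sd = np.sqrt(np.diag(cov_hat))
```

The bootstrap standard deviations reported in the tables of the application section correspond to the square roots of the diagonal of such a covariance matrix.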
IV. Simulation Studies
Sample size n = 100 with 100 MC replications
-- preliminary scheduled measurement times: (0, 1, ..., 7)
-- $\rho(t) = (1, t)$
-- $\mu = (1, 0.5)^T$
-- $\lambda_0 \equiv 1$, $\beta = 1$, $\sigma_e^2 = 0.25$
(i) No censoring, with $(\sigma_{11}, \sigma_{12}, \sigma_{22}) = (0.01, -0.001, 0.001)$
(ii) With censoring time following an exponential distribution with mean 25.
(iii) Same setting except $\sigma_{22} = 0.3$, and the 35% negative values of $b_i$ are truncated
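Event times for settings (i) and (ii) can be generated by drawing a baseline time U from a unit exponential (baseline hazard 1) and inverting the time transformation for the basis (1, t). The sketch below assumes NumPy and uses the covariance targets from the result tables; the inversion formula, the function name, and the requirement that beta times the slope be positive are assumptions of this illustration rather than the authors' stated procedure.

```python
import numpy as np

def simulate_event_data(n, beta, mu, Sigma, rng, censor_mean=None):
    """Simulate (b_i, U_i, T_i, V_i, Delta_i) with lambda0 = 1 and X_i(t) = b_i1 + b_i2 * t.
    T_i solves int_0^T exp{beta * X_i(s)} ds = U_i."""
    b = rng.multivariate_normal(mu, Sigma, size=n)
    rate = beta * b[:, 1]
    if np.any(rate <= 0):
        raise ValueError("this sketch assumes beta * b_i2 > 0 for every subject")
    u = rng.exponential(1.0, size=n)                     # baseline times, U ~ Exp(1)
    T = np.log1p(rate * u * np.exp(-beta * b[:, 0])) / rate
    if censor_mean is None:
        C = np.full(n, np.inf)                           # setting (i): no censoring
    else:
        C = rng.exponential(censor_mean, size=n)         # setting (ii)
    V = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return b, u, T, V, delta

rng = np.random.default_rng(3)
mu = np.array([1.0, 0.5])
Sigma = np.array([[0.01, -0.001], [-0.001, 0.001]])
b, u, T, V, delta = simulate_event_data(100, 1.0, mu, Sigma, rng)
b2, u2, T2, V2, delta2 = simulate_event_data(100, 1.0, mu, Sigma, rng, censor_mean=25.0)
```

Measurements would then be kept only at scheduled times not exceeding each subject's observed time V.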
(i) Normal random effects without censoring
          β        μ1       μ2       σ11      σ12       σ22      σe2
target    1        1        0.5      0.01     -0.001    0.001    0.25
mean      1.0075   0.9955   0.5013   0.0087   -0.0011   0.0009   0.2528
SD        0.0945   0.0163   0.0055   0.0015   0.0002    0.0002   0.0135
(ii) Normal random effects with censoring
          β        μ1       μ2       σ11      σ12       σ22      σe2
target    1        1        0.5      0.01     -0.001    0.001    0.25
mean      0.9918   0.9944   0.5015   0.0083   -0.0011   0.0009   0.2516
SD        0.1272   0.0249   0.0056   0.0023   0.0004    0.0002   0.0198
(iii) Nonnormal random effects with censoring
                   β        μ1       μ2       σ11      σ12       σ22      σe2
target             1        1        0.5      0.01     -0.001    0.001    0.25
empirical target   1        0.9993   0.6758   0.0104   -0.0058   0.1358   0.2753
mean               0.9950   1.0007   0.6682   0.0099   -0.0006   0.1627   0.2500
SD                 0.1091   0.0140   0.0535   0.0004   0.0036    0.0318   0.0223
V. Application on Medfly data
The medfly (Mediterranean fruit fly) data:
-- From Carey et al. (1998)
-- We focus on the 251 female medflies with the highest egg reproduction (> 1150 eggs).
-- Range of event times: 22 to 99
-- Range of total reproduction: 1151 to 2349
-- No censoring and no missing data
Relationship between daily egg laying and mortality
-- The proportional hazards assumption is violated (scaled Schoenfeld residual test, p-value 0.00305)
[Figure: Profiles of daily egg laying of the first three flies (subject1, subject2, subject3). x-axis: Time (0 to 120); y-axis: log(# of daily egg laying + 1) (0 to 5).]
Initial model:
$W^*(t) = X^*(t) + e^*(t)$
$X^*(t) = t^{b_0} \exp(b_1 t)$
Log transformed model:
$W(t) = \log\{W^*(t) + 1\} = X(t) + e(t)$
$X(t) = b_0 \log(t) + b_1 (t - 1)$
The parameter estimates derived from the original data and 100 bootstrap samples under the joint AFT model
                 β         μ1       μ2        σ11      σ12       σ22      σe
fitted values    -0.4340   2.1227   -0.1442   0.3701   -0.0482   0.0068   0.8944
bootstrap mean   -0.4313   2.1112   -0.1429   0.3651   -0.0483   0.0066   0.8958
bootstrap SD     0.0115    0.0375   0.0051    0.0353   0.0002    0.0005   0.0223
Fitting incomplete medfly data:
-- Randomly select 1-7 days as the schedule times for each individual.
-- Then add the day of death as the last schedule time. Each individual may therefore have 2-8 repeated measurements.
-- The resulting sub-data set is further censored by an exponential distribution with mean 500 (20% censoring rate).
The parameter estimates derived from the incomplete data and 100 bootstrap samples under the joint AFT model
                 β         μ1       μ2        σ11      σ12       σ22      σe
fitted values    -0.3890   2.2011   -0.1665   0.2833   -0.0382   0.0051   0.9775
bootstrap mean   -0.3526   2.1986   -0.1575   0.2862   -0.0398   0.0057   0.9712
bootstrap SD     0.0323    0.0461   0.0074    0.0351   0.0046    0.0006   0.0570
The End