Survival Analysis and the ACT study

Laura Gibbons, PhD
Thanks to An Introduction to Survival Analysis Using Stata
Acknowledgement
• Funded in part by Grant R13 AG030995 from
the National Institute on Aging
• The views expressed in written conference
materials or publications and by speakers and
moderators do not necessarily reflect the
official policies of the Department of Health
and Human Services; nor does mention by
trade names, commercial practices, or
organizations imply endorsement by the U.S.
Government.
What is survival analysis?
• Time to event data.
• It’s not just a question of who gets demented,
but when.
• Event, survive, and fall are generic terms.
ACT example
Risk for Late-life Re-injury, Dementia, and Death
Among Individuals with Traumatic Brain Injury:
A population-based study
Kristen Dams-O’Connor, Laura E Gibbons,
James D Bowen, Susan M McCurry, Eric B Larson,
Paul K Crane.
J Neurol Neurosurg Psychiatry 2013 Feb;84(2):177-82.
TBI-LOC = Traumatic Brain Injury with
Loss of Consciousness
Outcomes (different ways of defining failure)
• TBI-LOC during follow-up
• Dementia
• Death
Survival function
• The number who survive out of the total number at risk
• In this example, “failure” is a TBI-LOC after baseline.
• 4225 participants, with 96 TBI-LOC after baseline.
Survival Function
[Figure: survival (0.80 to 1.00) against years since baseline (0 to 16).
Curves: No TBI-LOC before baseline; TBI-LOC before baseline.]
Hazard function
• The probability of failing given survival until this
time (currently at risk)
• The hazard function reflects the hazard at each
time point.
• It’s usually easier to look at the cumulative hazard
graph.
Cumulative Hazard for TBI with LOC
(with 95% confidence bands)
[Figure: cumulative hazard (0 to .2) against years since baseline (0 to 16).
Curves: TBI-LOC before baseline; No TBI-LOC before baseline.]
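A minimal sketch of how a cumulative hazard curve like this can be estimated, using the Nelson-Aalen estimator (sum of events divided by number at risk at each event time). The function name and the toy data are illustrative assumptions, not the ACT data or the software used for the slides.

```python
from collections import Counter

def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the subject failed at that time, 0 if censored
    """
    failures = Counter(t for t, e in zip(times, events) if e)
    cum_hazard, curve = 0.0, []
    for t in sorted(failures):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        cum_hazard += failures[t] / at_risk        # hazard increment d / n
        curve.append((t, cum_hazard))
    return curve

# Hypothetical toy data: years of follow-up and failure indicators.
toy_times = [2, 3, 3, 5, 8, 8, 9, 12]
toy_events = [1, 0, 1, 1, 0, 1, 0, 1]
for t, h in nelson_aalen(toy_times, toy_events):
    print(f"t={t:>2}  H(t)={h:.3f}")
```

Censored subjects count in the risk set up to their censoring time but never add to the numerator.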
Think carefully about onset of
time at risk
• Study entry
• Time-dependent covariates
• What to do about exposures which occur before
study entry (left truncation)
ACT: TBI-LOC during follow-up
• Used study entry as onset of time at risk
• Exposure: report of first TBI-LOC at baseline
   None (n = 3619)
   At age < 25 (n = 371)
   At age 25-54 (n = 104)
   At age 55 to baseline (n = 131)
• No time-dependent covariates for this example
Time axis
Continuous – exact failure time is known
Discrete – time interval for failure is known
ACT onsetdate for dementia outcomes
The midpoint between the two study visits
(biennial and/or annual) that precede the date
of the consensus of dementia. The date of the
consensus of dementia is defined as the earliest
consensus that resulted in a positive diagnosis
of dementia (DSM-IV) and that was not later
reversed as a false positive.
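The midpoint rule can be sketched as below. This is only an illustration of taking the halfway point between two visit dates; the helper name and the example dates are hypothetical, and the real onsetdate also depends on the consensus rules described above.

```python
from datetime import date

def midpoint_onset(earlier_visit, later_visit):
    """Date halfway between two visit dates (rounded down to a whole day)."""
    return earlier_visit + (later_visit - earlier_visit) // 2

# Hypothetical biennial visits two years apart.
print(midpoint_onset(date(2004, 3, 1), date(2006, 3, 1)))
```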
Age as the time axis
• Makes sense in an aging study.
• Often modeled as baseline age + time.
Ties
• Multiple events occurring at the same time.
• Make sure your software is handling this the
way you want.
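To see why the choice matters, here is a sketch of how two common approximations, Breslow and Efron, treat one set of tied events in the Cox partial likelihood. The function and the toy numbers are illustrative assumptions; check your own software's documentation for its default.

```python
import math

def tied_loglik(event_scores, risk_scores, method="efron"):
    """Partial log-likelihood contribution of a single event time.

    event_scores -- exp(x * beta) for the subjects failing at this time
    risk_scores  -- exp(x * beta) for everyone at risk (failures included)
    """
    d = len(event_scores)
    total_risk = sum(risk_scores)
    total_event = sum(event_scores)
    numerator = sum(math.log(s) for s in event_scores)
    if method == "breslow":
        # Every tied failure is compared with the full risk set.
        denominator = d * math.log(total_risk)
    elif method == "efron":
        # Tied failures are progressively removed from the risk set.
        denominator = sum(
            math.log(total_risk - (k / d) * total_event) for k in range(d)
        )
    else:
        raise ValueError("method must be 'breslow' or 'efron'")
    return numerator - denominator

# Two tied failures among five equally weighted subjects at risk.
print(tied_loglik([1.0, 1.0], [1.0] * 5, "breslow"))
print(tied_loglik([1.0, 1.0], [1.0] * 5, "efron"))
```

With few ties the two methods give nearly the same answer; they diverge as ties become a larger share of the risk set.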
Think carefully about censoring
• Censoring: The event time is unknown
• No longer at risk
• Missing data – random or informative?
• Hope it’s noninformative: the distribution of censoring
times is independent of event times, conditional on
covariates (analogous to missing at random, MAR).
Right censoring
Event is unobserved due to
• Drop out
• Study end
• Competing event (more on this later)
Interval censoring
• Know it occurs between visits, but not when
• Assume failure time is uniformly distributed in that
interval
• An issue in ACT (hence onsetdate)
Left censoring
The event occurred before the study began.
• What about those whose TBI-LOC resulted in death or
dementia before age 65? They are not in our study.
• Worry about this one.
Left truncation
Onset of risk was before study entry.
• We used our 4-category exposure, but risk really must
be defined as “TBI-LOC before age 25 and not left-censored”, etc.
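One way to picture left truncation is as delayed entry into the risk set: with age as the time axis, a person counts as at risk only between study entry and exit. The sketch below assumes that (entry, exit] convention; the subject records are hypothetical.

```python
def at_risk(subjects, age):
    """IDs of subjects under observation at a given age.

    Each subject is (id, entry_age, exit_age). A subject joins the risk set
    only after entering the study and leaves it at failure or censoring.
    """
    return [sid for sid, entry, exit_age in subjects if entry < age <= exit_age]

# Hypothetical subjects: ID, age at study entry, age at exit.
cohort = [("A", 65.0, 80.0), ("B", 72.0, 90.0), ("C", 68.0, 70.5)]
print(at_risk(cohort, 70.0))  # B has not yet entered at age 70
print(at_risk(cohort, 75.0))  # C has already left by age 75
```

Anyone who failed before they could enroll never appears in `cohort` at all, which is the part no risk-set bookkeeping can repair.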
ACT censoring variables
• Competing event: onsetdate (dementia) or
• Visit date (visitdt)
• Withdrawal date (withdrawdt)
(The FH data does not include anyone who withdrew.)
• Death date (deathdt)
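A sketch of how dates like these might be combined into an exit date and a failure indicator. The rule used here (follow-up ends at the earliest of the event date and the available censoring dates) is an assumption for illustration, not a description of the ACT data processing; the function name and dates are hypothetical.

```python
from datetime import date

def end_of_followup(event_date, censor_dates):
    """Return (exit_date, failed) for one participant.

    event_date   -- date of the outcome of interest, or None if never observed
    censor_dates -- candidate censoring dates (for example onsetdate, visitdt,
                    withdrawdt, deathdt); None entries are ignored, but at
                    least one date must be present
    """
    censor = min(d for d in censor_dates if d is not None)
    if event_date is not None and event_date <= censor:
        return event_date, 1   # event seen while still under observation
    return censor, 0           # censored before any observed event

# Hypothetical participant: event in 2005, censoring dates afterwards.
print(end_of_followup(date(2005, 6, 1), [date(2008, 1, 15), None, date(2010, 3, 2)]))
```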
Modeling
Nonparametric – Kaplan-Meier
Survival Function
[Figure: Kaplan-Meier survival (0.80 to 1.00) against years since baseline (0 to 16).
Curves: No TBI-LOC before baseline; TBI-LOC before baseline.]
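A minimal sketch of the Kaplan-Meier (product-limit) estimator: at each event time, multiply the running survival by the fraction of the risk set that did not fail. The function name and toy data are illustrative assumptions, not the ACT data.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the subject failed at that time, 0 if censored
    """
    failures = Counter(t for t, e in zip(times, events) if e)
    survival, curve = 1.0, []
    for t in sorted(failures):
        at_risk = sum(1 for u in times if u >= t)  # risk set just before t
        survival *= 1 - failures[t] / at_risk      # conditional survival at t
        curve.append((t, survival))
    return curve

# Hypothetical toy data: years of follow-up and failure indicators.
km_times = [2, 3, 3, 5, 8, 8, 9, 12]
km_events = [1, 0, 1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(km_times, km_events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```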
Log-rank test for equality of survivor functions

                           |   Events     Events
p                          |  observed   expected
---------------------------+------------------------
No TBI-LOC before baseline |        66      82.70
TBI-LOC before baseline    |        30      13.30
---------------------------+------------------------
Total                      |        96      96.00

                 chi2(1) =      24.44
                 Pr>chi2 =     0.0000
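The observed and expected columns can be understood through a small sketch of the two-group log-rank test. At each event time, the expected count for a group is the number of events multiplied by that group's share of the risk set. The function and toy data are illustrative assumptions, and the sketch assumes both groups overlap in follow-up so the variance is not zero.

```python
import math

def logrank(times, events, groups):
    """Two-group log-rank test; groups holds 0/1 labels.

    Returns (observed, expected, chi2, p_value) for group 1.
    """
    observed = expected = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of d1 given the margins
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, chi-square with 1 df
    return observed, expected, chi2, p_value

# Hypothetical toy data: group 1 fails early, group 0 fails late.
lr_times = [1, 2, 3, 4, 5, 6]
lr_events = [1, 1, 1, 1, 1, 1]
lr_groups = [1, 1, 1, 0, 0, 0]
print(logrank(lr_times, lr_events, lr_groups))
```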
Semi-parametric (Cox)
Assumes the hazards are proportional
[Figure: cumulative hazard (0.00 to 0.15) against years since baseline (0 to 16).
Curves: TBI-LOC before baseline; No TBI-LOC before baseline.]
Looks like a reasonable assumption here, but we looked
at a variety of graphs and statistics to make sure.
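For intuition about what a Cox fit does, here is a sketch that maximizes the partial likelihood for a single covariate by Newton-Raphson. It is a teaching illustration under simple assumptions (one covariate, Breslow handling of ties, no delayed entry), not the model fitted in the paper; all names and data are hypothetical.

```python
import math

def score_and_information(beta, times, events, x):
    """First derivative and negative second derivative of the partial log-likelihood."""
    score = info = 0.0
    for t_i, failed, x_i in zip(times, events, x):
        if not failed:
            continue
        # Risk set: everyone still under observation at this failure time.
        risk = [(math.exp(beta * x_j), x_j)
                for t_j, x_j in zip(times, x) if t_j >= t_i]
        s0 = sum(w for w, _ in risk)
        s1 = sum(w * x_j for w, x_j in risk)
        s2 = sum(w * x_j ** 2 for w, x_j in risk)
        score += x_i - s1 / s0                # covariate minus its weighted mean
        info += s2 / s0 - (s1 / s0) ** 2      # weighted variance in the risk set
    return score, info

def fit_cox(times, events, x, steps=25):
    """Newton-Raphson estimate of the log hazard ratio, starting from 0."""
    beta = 0.0
    for _ in range(steps):
        score, info = score_and_information(beta, times, events, x)
        beta += score / info
    return beta

# Hypothetical toy data: x = 1 marks the exposed group.
cox_times = [1, 2, 3, 4, 5, 6, 7, 8]
cox_events = [1, 1, 1, 1, 1, 1, 1, 1]
cox_x = [1, 0, 1, 1, 0, 1, 0, 0]
beta_hat = fit_cox(cox_times, cox_events, cox_x)
print("log HR:", beta_hat, "HR:", math.exp(beta_hat))
```

Only the ordering of failure times enters the calculation, which is why no baseline hazard has to be specified.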
Hazard Ratios
Baseline report of age at first TBI with LOC as a
predictor of TBI with LOC after study enrolment,
controlling for age, sex, and years of education.
Age at first TBI         Late life TBI with LOC     HR (95% CI)
with LOC                 cases/person years
None prior to baseline   66/21,945                  1 (Reference)
< 25                     15/2147                    2.54 (1.42, 4.52)
25-54                    6/678                      3.24 (1.40, 7.52)
55-baseline              9/798                      3.79 (1.89, 7.62)
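As a rough check on the table, crude incidence rates and rate ratios can be computed from the cases and person-years alone. These are unadjusted, so they are expected to differ somewhat from the adjusted hazard ratios; the variable and function names are illustrative.

```python
# Cases and person-years copied from the table above.
rows = {
    "None prior to baseline": (66, 21945),
    "< 25": (15, 2147),
    "25-54": (6, 678),
    "55-baseline": (9, 798),
}

def rate_per_1000(cases, person_years):
    """Crude incidence rate per 1000 person-years."""
    return 1000 * cases / person_years

reference = rate_per_1000(*rows["None prior to baseline"])
crude = {}
for group, (cases, person_years) in rows.items():
    rate = rate_per_1000(cases, person_years)
    crude[group] = (rate, rate / reference)  # (rate, crude rate ratio)
    print(f"{group:<24} rate={rate:6.2f}  ratio={rate / reference:5.2f}")
```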
Model checking
• Proportional hazards assumption
• Covariate form
• Baseline, lag or current visit covariate
• Et cetera
Parametric
Can be proportional hazard models
• Exponential. Constant baseline hazard.
• Weibull. Hazard is monotone increasing or
decreasing, depending on the shape parameter.
• Gompertz. Hazard rates increase or
decrease exponentially over time.
• See Flexible Parametric Survival Analysis
Using Stata for many more.
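The three baseline hazards can be written out as a short sketch. The parameterizations below (lam for scale, p for the Weibull shape, gamma for the Gompertz rate) are one common convention and are assumptions here; other texts and packages use different symbols.

```python
import math

def exponential_hazard(t, lam):
    """Constant hazard: the same at every time t."""
    return lam

def weibull_hazard(t, lam, p):
    """Rises over time when p > 1, falls when p < 1, constant when p == 1."""
    return lam * p * t ** (p - 1)

def gompertz_hazard(t, lam, gamma):
    """Grows exponentially when gamma > 0, decays when gamma < 0."""
    return lam * math.exp(gamma * t)

for t in (1.0, 5.0, 10.0):
    print(t,
          exponential_hazard(t, 0.1),
          weibull_hazard(t, 0.1, 2.0),
          gompertz_hazard(t, 0.1, 0.2))
```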
Accelerated failure time
• Risk is not constant over time.
• Time ratios. Ratios > 1 indicate LONGER
survival.
Types of accelerated failure time
(AFT) models
• Gamma. 3 parameter. Most flexible. Fit a
gamma model and see which parameters are
relevant.
• Exponential, Weibull can also be formulated as
AFT models. In the Weibull model, the risk
increases over time when β > 1.
• Log-normal. The hazard increases and then
decreases.
• Log-logistic. Very similar to log-normal.
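A sketch of the log-logistic model shows both ideas: a hazard that can rise and then fall, and a time ratio that stretches or compresses the time scale. The parameter names (alpha for the median, k for the shape, tr for the time ratio) are assumptions chosen for illustration.

```python
def loglogistic_survival(t, alpha, k):
    """S(t) = 1 / (1 + (t / alpha)^k); alpha is the median survival time."""
    return 1 / (1 + (t / alpha) ** k)

def loglogistic_hazard(t, alpha, k):
    """For k > 1 the hazard rises to a peak and then declines."""
    return (k / alpha) * (t / alpha) ** (k - 1) / (1 + (t / alpha) ** k)

def survival_with_time_ratio(t, alpha, k, tr):
    """AFT effect: a time ratio tr rescales time, S(t | tr) = S0(t / tr)."""
    return loglogistic_survival(t / tr, alpha, k)

# With alpha = 10 and k = 2 the hazard peaks at t = 10.
for t in (5.0, 10.0, 20.0):
    print(t, loglogistic_hazard(t, 10.0, 2.0))

# A time ratio of 0.5 halves the median: survival reaches 0.5 at t = 5.
print(survival_with_time_ratio(5.0, 10.0, 2.0, 0.5))
```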
Baseline report of TBI-LOC
and the risk of dementia
• Proportional hazards assumption not tenable.
• The log-logistic AFT model was the winner, reflecting
an increased risk over time.
• You can compare AICs to pick the best-fitting model, or pick
one based on your hypothesis.
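Comparing AICs is simple arithmetic once each model's log-likelihood and parameter count are known. The log-likelihoods and parameter counts below are made up for illustration and are not the values from the ACT analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits: (log-likelihood, number of estimated parameters).
fits = {
    "exponential": (-1520.4, 5),
    "weibull": (-1498.7, 6),
    "log-logistic": (-1495.2, 6),
    "gamma": (-1494.9, 7),
}
scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

The penalty of 2 per parameter is why a more flexible model does not win automatically.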
Our AFT model for any dementia
• Controlling for baseline age, gender, education, and any
APOE-4 alleles
• Remember that TR > 1 => longer survival
• TBI-LOC is not significant (NS). Older baseline age and APOE-4 are
associated with shorter survival. Female sex and education are
associated with longer survival.
------------------------------------------------
          _t |      TR    [95% Conf. Interval]
-------------+----------------------------------
       base4 |
         <25 |    1.02       0.87       1.20
       25-54 |    1.04       0.78       1.38
         55+ |    1.06       0.81       1.39
             |
10 years age |    0.53       0.49       0.57
      female |    1.15       1.05       1.26
   education |    1.02       1.00       1.04
       apoe4 |    0.69       0.63       0.76
------------------------------------------------
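One way to read the table is to convert each time ratio into a percent change in time to dementia and to check whether its confidence interval excludes 1. The helper names are illustrative. Note that the lower bound for education is reported as 1.00, so rounding leaves this simple check ambiguous for that row.

```python
# (TR, lower, upper) copied from the table above.
time_ratios = {
    "base4 <25": (1.02, 0.87, 1.20),
    "base4 25-54": (1.04, 0.78, 1.38),
    "base4 55+": (1.06, 0.81, 1.39),
    "10 years age": (0.53, 0.49, 0.57),
    "female": (1.15, 1.05, 1.26),
    "education": (1.02, 1.00, 1.04),
    "apoe4": (0.69, 0.63, 0.76),
}

def percent_change(tr):
    """Percent change in survival time implied by a time ratio."""
    return (tr - 1) * 100

def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1

for name, (tr, lower, upper) in time_ratios.items():
    print(f"{name:<13} {percent_change(tr):+6.0f}%  CI excludes 1: {excludes_one(lower, upper)}")
```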
Competing risks
• Individuals are at-risk for AD, vascular dementia,
other dementias
• No longer at risk for one type once diagnosed with
another (assuming we’re dealing with first
diagnosis)
• Use cause-specific hazard functions and
cumulative incidence functions.
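A cumulative incidence function can be sketched with the Aalen-Johansen approach: at each event time, add the probability of having survived all causes so far multiplied by the cause-specific fraction failing. The coding of causes (0 for censored, 1 and 2 for event types) and the toy data are assumptions for illustration.

```python
def cumulative_incidence(times, causes, cause):
    """Return [(t, CIF(t))] for one cause in the presence of competing events.

    causes -- 0 if censored, otherwise an integer code for the event type
    """
    overall_survival, cif, curve = 1.0, 0.0, []
    for t in sorted({u for u, c in zip(times, causes) if c != 0}):
        n = sum(1 for u in times if u >= t)
        d_all = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        cif += overall_survival * d_cause / n   # uses survival just before t
        overall_survival *= 1 - d_all / n       # any event removes the subject
        curve.append((t, cif))
    return curve

# Hypothetical data: 1 = AD, 2 = other dementia, 0 = censored.
cr_times = [1, 2, 3, 4, 5]
cr_causes = [1, 2, 0, 1, 2]
print(cumulative_incidence(cr_times, cr_causes, cause=1))
print(cumulative_incidence(cr_times, cr_causes, cause=2))
```

Unlike one minus a Kaplan-Meier curve that censors the competing event, these curves add up to the overall probability of any event.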
What is going on in competing risks?
• Which is it?
• One process determines dementia and another says
which type
• Two separate processes, and one event censors the
other.
• In our analysis of TBI-LOC predicting AD, we censored at
other dementia diagnoses, but we could have modeled
multiple dementia outcomes (assuming adequate
numbers).
• The competing risk model may be more accurate,
because the time to AD and the time to other dementia
are probably correlated.
Shared frailty, aka unobserved
heterogeneity or random-effects
• There may be variability in individuals’ underlying
(baseline) risk for an event that is not directly
measurable.
• One way of dealing with patients from different
cities, for example.
• The assumption is that the effect is random and
multiplicative on the hazard function.
• Need to distinguish between hazard for
individuals and the population average.
• Population hazard can fall while the individual
hazards rise.
• The frailer individuals have failed already, so the
overall hazard rate drops. Yet time is passing, so
each person’s risk is still rising.
• In a shared frailty model, HR estimates are for
time 0.
• Covariate effects decrease as the frail fail.
• Gamma frailty models. Covariate effects
completely disappear over time.
• Inverse-Gaussian models. Covariate effects
decrease but do not disappear over time.
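The gamma frailty statements can be checked with a small sketch. It assumes each individual has hazard z * scale * 2t, where z is a gamma frailty with mean 1 and variance theta; the population hazard is then h(t) / (1 + theta * H(t)), with H the individual cumulative hazard at z = 1. The specific hazard 2t and all names are illustrative assumptions, and the sketch uses individual rather than shared frailty for simplicity.

```python
def population_hazard(t, theta, scale=1.0):
    """Population-average hazard under gamma frailty with variance theta."""
    individual_hazard = scale * 2 * t        # rises steadily for every person
    cumulative_hazard = scale * t * t
    return individual_hazard / (1 + theta * cumulative_hazard)

def population_hazard_ratio(t, theta, hr):
    """Population-level ratio when the individual-level hazard ratio is hr."""
    baseline_cumulative = t * t
    return hr * (1 + theta * baseline_cumulative) / (1 + theta * hr * baseline_cumulative)

# The individual hazard 2t keeps rising, yet the population hazard peaks and falls.
for t in (0.5, 1.0, 2.0, 4.0):
    print(t, population_hazard(t, theta=1.0))

# An individual-level HR of 2 shrinks toward 1 at the population level.
for t in (0.0, 1.0, 5.0, 100.0):
    print(t, population_hazard_ratio(t, theta=1.0, hr=2.0))
```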
Other issues in survival analysis not
covered today include
• Events that can occur more than once
(heart attacks, for example)
• Parallel processes
• Unshared frailty
Questions, discussion