HSRP 734: Advanced Statistical Methods
July 3, 2008

Objectives
- Describe the situations under which this method would be useful
- Describe censored data
- Describe the survivor function, the hazard function, and their relationship

What is Survival Analysis?
Survival analysis is a collection of statistical procedures for analyzing data in which the outcome variable of interest is time to an event.

What do we mean by Time?
- Length of follow-up until the event of interest occurs
- Follow-up can start at (for example)
  1. Registration into a clinical trial
  2. Time of employment
- Age of the individual at the time of the event

What do we mean by Event?
- Usually we mean death, thus the name "survival" analysis
- Relapse
- Disease incidence
- Can also be a positive event
  - Discharge from psychiatric counseling
  - Normalization of WBC count

Why not just use what we already know?
- Means, t-tests, and linear regression
- Counts and chi-sq tests, and logistic regression

Censored Data
In survival analysis, an observation consists of two random components:
- An observed variable representing time (e.g., actual time until death, or time until last follow-up)
- A Bernoulli random variable (0 or 1) indicating whether the observation is censored or not: 1 if we observed a failure, 0 if we have a censored observation

Censored Data
An observation is censored when we have only incomplete information about the exact survival time due to a random factor.
- Non-informative censoring: whether an observation is censored or not is independent of the value of the observation
- Informative censoring: whether an observation is censored or not depends on the value of the observation
We will require non-informative censoring mechanisms. If censoring is informative, then these methods will generate biased results.
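As a minimal sketch of the two components described above, the snippet below records one observation as a (time, status) pair. It assumes, purely for illustration, that follow-up for a subject simply stops at a known time, so the observed time is the smaller of the true event time and the end of follow-up. The names `Observation`, `observe`, `event_time`, and `followup_end` are hypothetical and do not come from the course material.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    time: float   # observed time: event time, or time of last follow-up
    status: int   # 1 = failure observed, 0 = censored


def observe(event_time: float, followup_end: float) -> Observation:
    """Hypothetical helper: what is recorded when follow-up stops at followup_end."""
    if event_time <= followup_end:
        # The event happened while the subject was still being followed.
        return Observation(event_time, 1)
    # Follow-up ended first: we only know the event time exceeds followup_end.
    return Observation(followup_end, 0)


print(observe(14, 20))  # event observed during follow-up
print(observe(35, 20))  # censored at the end of follow-up
```

In real data the true event time of a censored subject is never seen; the function takes it as an input only to show how the pair arises.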
Types of censoring
- Right censoring: true survival time is greater than what we observed
- Left censoring: true survival time is less than what we observed
- Interval censoring: subjects are not observed continuously and we only know the event happened between time A and time B (e.g., annual testing of the partner of an HIV+ individual)

Three common reasons for right censoring
- Person does not experience the event before the study ends
- Person is lost to follow-up during the study period
- Person withdraws from the study because of death (if death is not the outcome of interest) or some other reason (e.g., adverse drug reaction)

What does the data look like?
[Figure: follow-up timelines for subjects 1 to 5 on a time axis from 0 to 20, with "drop out" and "end of study" marked]

What does the data look like?

| ID | Time | Status |
|----|------|--------|
| 1  | 20   | 0      |
| 2  | 14   | 1      |
| 3  | 17   | 0      |
| 4  | 13   | 0      |
| 5  | 5    | 1      |

Survival Distribution
The probability distribution of the survival times can be described in five different, equivalent ways:
1. probability density function
2. cumulative distribution function
3. survival function = 1 - cumulative distribution function
4. hazard function
5. cumulative hazard function

Survival Distribution
- Distribution of times to event, called "survival times" even when the "event" is not "death"
- Let T = survival time (T ≥ 0) and t = a specified value for T
- Survival times follow a continuous distribution with times ranging from zero to infinity
- Ordinary methods for estimating and comparing continuous distributions cannot be used with survival data due to the presence of censoring

Probability Density Function f(t)

$$f(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P[t \le T < t + \Delta t]$$

It is difficult to estimate the density directly because of censoring: a histogram is not a direct estimate of f(t).

Cumulative Distribution Function F(t)

$$F(t) = P[T \le t] = \int_0^t f(s)\,ds$$

Survival Function S(t)

$$S(t) = \Pr[T > t] = \int_t^{\infty} f(u)\,du = 1 - \int_0^t f(u)\,du = 1 - F(t)$$

- Monotone nonincreasing function
- S(0) = 1
- S(+∞) = 0

Hazard Function λ(t)
Instantaneous death rate at time t, given alive at time t:

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{\text{Prob. event in } (t,\, t + \Delta t) \text{ given survived to } t}{\Delta t}$$

- Other names for the hazard function include: Force of Mortality, Incidence Function
- It is a rate: probability per unit time (sec^-1, years^-1)

Hazard Function λ(t)
So, you survived to time t; what is the probability that the event occurs within the next increment of time Δt? Now standardize this conditional probability to a per-unit-of-time basis. As the increment Δt gets very small (goes to 0), this conditional probability per unit time becomes an instantaneous rate.

Some simple features of λ(t):
- λ(t) is nonnegative and has no upper bound: it takes values in [0, ∞)
- λ(t) could be instantaneously increasing, decreasing, or constant

Cumulative Hazard Function Λ(t)

$$\Lambda(t) = \int_0^t \lambda(s)\,ds = -\ln S(t)$$

Equivalently, $S(t) = \exp(-\Lambda(t))$ and $\lambda(t) = f(t)/S(t)$.

Survival Distribution
- Any one of these five functions is enough to specify the survival distribution; each can be derived from any of the others
- The most important models for survival analysis are models of the hazard rate λ(t)
- When λ(t) is high, S(t) decreases faster. When λ(t) is low, S(t) decreases more slowly.

Example
Consider a clinical trial in patients with acute myelogenous leukemia (AML) comparing two groups of patients: no maintenance treatment with chemotherapy (X=0) vs. maintenance chemotherapy treatment (X=1)
- Demographic and other clinical variables present
- Time to relapse is of interest
- Event = relapse

Example

| Group | Weeks in remission, i.e., time to relapse |
|-------|-------------------------------------------|
| Maintenance chemo (X=1) | 9, 13, 13+, 18, 23, 28+, 31, 34, 45+, 48, 161+ |
| No maintenance chemo (X=0) | 5, 5, 8, 8, 12, 16+, 23, 27, 30+, 33, 43, 45 |

+ indicates a censored time to relapse; e.g., 13+ = more than 13 weeks to relapse

Grouped data/life table analysis
- Divide the time period into intervals appropriate for the data; use more intervals in periods of changing incidence
- For each person, tally time spent at risk (person-time, e.g., person-years) in each interval
- Tally the events in each interval

Grouped data/life table analysis
Estimate the incidence rates (hazard rates) as the ratio of the number of events to the total time at risk in an interval:

$$\hat{\lambda} = \frac{\text{\# of events}}{\text{person-time}}$$

Example cont.

| Interval (weeks) | Maintained: Events | Maintained: Person-time (weeks) | Not maintained: Events | Not maintained: Person-time (weeks) |
|-------|---|-----|---|----|
| 0-5   | 0 | 55  | 2 | 60 |
| 5-10  | 1 | 54  | 2 | 46 |
| 10-15 | 1 | 46  | 1 | 37 |
| 15-20 | 1 | 38  | 0 | 31 |
| 20-25 | 1 | 33  | 1 | 28 |
| 25-30 | 0 | 28  | 1 | 22 |
| 30-35 | 2 | 20  | 1 | 13 |
| 35-40 | 0 | 15  | 0 | 10 |
| 40-45 | 0 | 15  | 2 | 8  |
| 45-50 | 1 | 8   | - | -  |
| 50+   | 0 | 111 | - | -  |

Survival function
The "Survival Function" is defined as S(t) = Pr(survived beyond time t).
For example, suppose t = end of follow-up time bin 3:
S(t) = Pr(survived > t)
     = Pr(survived through bin 1 and survived through bin 2 and survived through bin 3)
     = Pr(survived bin 1) × Pr(survived bin 2 given survived bin 1) × Pr(survived bin 3 given survived bin 1 and bin 2)

Survival function
Calculate the probability of surviving through bin j of follow-up time by finding the complement of the probability of dying in bin j:
Pr(survived bin j) = 1 - Pr(died in bin j)
Pr("die" in bin j) is approximated by

$$P_j = \frac{\text{\# events in bin } j}{\text{average number of people at risk in bin } j} = \frac{y_j}{N_j / L_j} = \frac{y_j L_j}{N_j}$$

where
- $y_j$ = # of events in bin j
- $N_j$ = time at risk (person-time) in bin j
- $L_j$ = length of bin j (must be small for the approximation to work well)
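The tallying steps and the two ratios above can be sketched in Python as follows. This is an illustrative sketch, not the course's own code. It assumes that a time equal to an upper bin edge belongs to that bin, i.e. bins are (start, end], and that the open-ended "50+" bin is closed at the largest observed time (161 weeks). The names `tally` and `EDGES` are hypothetical.

```python
# Weeks in remission from the example; True marks a censored time (the "+" entries).
maintained = [(9, False), (13, False), (13, True), (18, False), (23, False),
              (28, True), (31, False), (34, False), (45, True), (48, False),
              (161, True)]
not_maintained = [(5, False), (5, False), (8, False), (8, False), (12, False),
                  (16, True), (23, False), (27, False), (30, True), (33, False),
                  (43, False), (45, False)]

# Bin edges in weeks. Assumption: the last bin ends at the largest observed time.
EDGES = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 161]


def tally(data, edges=EDGES):
    """Return one (L_j, N_j, y_j) tuple per bin: bin length, person-time, events."""
    rows = []
    for start, end in zip(edges[:-1], edges[1:]):
        # Time each person spends at risk inside the bin (zero once they have left).
        person_time = sum(max(0, min(t, end) - start) for t, _ in data)
        # Uncensored times falling in (start, end] count as events in this bin.
        events = sum(1 for t, censored in data
                     if not censored and start < t <= end)
        rows.append((end - start, person_time, events))
    return rows


for length, person_time, events in tally(maintained):
    if person_time == 0:
        continue  # nobody at risk in this bin, so no rate can be estimated
    rate = events / person_time             # estimated hazard rate, per week
    p_die = events * length / person_time   # P_j = y_j * L_j / N_j
    print(length, person_time, events, round(rate, 3), round(p_die, 2))
```

Under these assumptions the tallies agree with the person-time and event counts in the example table for both groups. For the group not maintained on chemo, the last two bins have zero person-time, which corresponds to the "-" entries.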
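Survival function
Then, use $P_j$, the probability of dying in bin j, to estimate the survival function S(t):

$$\hat{S}(t) = \prod_{j=1}^{t} \left[1 - \Pr(\text{die in bin } j)\right] = \prod_{j=1}^{t} (1 - P_j) = \prod_{j=1}^{t} \left(1 - \frac{y_j L_j}{N_j}\right)$$

where the product runs over bins 1 through t. The calculations needed for $\hat{S}(t)$, the estimated survival function, are usually organized into a "life table", as follows:
- $S_j$ = Pr(survived beyond the end of bin j)
- $S_0$ = 1

Survival function

| j | Lj | Maintained: Nj | Maintained: yj | Maintained: 1-Pj | Maintained: Sj | Not maintained: Nj | Not maintained: yj | Not maintained: 1-Pj | Not maintained: Sj |
|----|-----|-----|---|------|------|----|---|------|------|
| 1  | 5   | 55  | 0 | 1    | 1    | 60 | 2 | 0.83 | 0.83 |
| 2  | 5   | 54  | 1 | 0.91 | 0.91 | 46 | 2 | 0.78 | 0.65 |
| 3  | 5   | 46  | 1 | 0.89 | 0.81 | 37 | 1 | 0.86 | 0.56 |
| 4  | 5   | 38  | 1 | 0.87 | 0.70 | 31 | 0 | 1    | 0.56 |
| 5  | 5   | 33  | 1 | 0.85 | 0.60 | 28 | 1 | 0.82 | 0.46 |
| 6  | 5   | 28  | 0 | 1    | 0.60 | 22 | 1 | 0.77 | 0.36 |
| 7  | 5   | 20  | 2 | 0.5  | 0.30 | 13 | 1 | 0.62 | 0.22 |
| 8  | 5   | 15  | 0 | 1    | 0.30 | 10 | 0 | 1    | 0.22 |
| 9  | 5   | 15  | 0 | 1    | 0.30 | 8  | 2 | 0*   | 0    |
| 10 | 5   | 8   | 1 | 0.38 | 0.11 | -  | - | -    | 0    |
| 11 | 111 | 111 | 0 | 1    | 0.11 | -  | - | -    | 0    |

* The raw estimate was negative and was set to zero.

Survival function
Trouble with follow-up time bins that are too wide (bin 9, not maintained on chemo):

$$1 - P_j = 1 - \frac{y_j L_j}{N_j} = 1 - \frac{10}{8} = -0.25$$

Work-around: set the probability $1 - P_j$ to zero whenever the estimate is negative.

To display the estimated survivor function, plot $\hat{S}(t)$ vs. t
- For grouped data: plot $\hat{S}(t)$ at the end of each time interval, connecting the points with line segments (not steps like Kaplan-Meier)
- At time = 0, plot $\hat{S}(t)$ = 1

Survival function
[Figure: estimated survival curves for "Maintained on chemo" and "Not maintained on chemo"; vertical axis from 0.0 to 1.0, horizontal axis in weeks from 0 to 50]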
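The life-table calculation above, including the work-around for negative estimates, can be sketched in Python. This is a minimal illustration under the slide's approximation $P_j = y_j L_j / N_j$. The function name `life_table` and the row layout (L_j, N_j, y_j) are choices made here for the example, not part of the course material.

```python
def life_table(rows):
    """rows: one (L_j, N_j, y_j) tuple per bin.

    Returns a list of (1 - P_j, S_j) pairs, starting from S_0 = 1.
    """
    out = []
    surv = 1.0
    for length, person_time, events in rows:
        p_survive = 1 - events * length / person_time   # 1 - P_j
        p_survive = max(p_survive, 0.0)   # work-around: clip negative estimates to 0
        surv *= p_survive                 # S_j = S_(j-1) * (1 - P_j)
        out.append((p_survive, surv))
    return out


# (L_j, N_j, y_j) taken from the life table in the text.
maintained_rows = [(5, 55, 0), (5, 54, 1), (5, 46, 1), (5, 38, 1), (5, 33, 1),
                   (5, 28, 0), (5, 20, 2), (5, 15, 0), (5, 15, 0), (5, 8, 1),
                   (111, 111, 0)]
not_maintained_rows = [(5, 60, 2), (5, 46, 2), (5, 37, 1), (5, 31, 0), (5, 28, 1),
                       (5, 22, 1), (5, 13, 1), (5, 10, 0), (5, 8, 2)]

for label, rows in [("maintained", maintained_rows),
                    ("not maintained", not_maintained_rows)]:
    print(label)
    for j, (p_survive, surv) in enumerate(life_table(rows), start=1):
        print(j, round(p_survive, 2), round(surv, 2))
```

For plotting, the S_j values would be placed at the end of each bin and joined with line segments, with S = 1 at time 0, as described above.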