slides

On the Analysis of Tuberculosis Studies
with Intermittent Missing Sputum Data
Daniel Scharfstein
Johns Hopkins University
[email protected]
July 12, 2013
Daniel O. Scharfstein
Tuberculosis Studies with Missing Sputum Data
Happy Birthday!
Collaborators
- Andrea Rotnitzky
- Maria Abraham
- Aidan McDermott
- Lawrence Geiter
- Richard Chaisson
Observed Data
[Figure: observed data, shown in two panels (Ethambutol, Moxifloxacin). Horizontal axis: 1 to 8. Vertical axis: 1:n0, with ticks from 1 to 71.]
Example of Patient Data
                              Visit
Line                   1  2  3  4  5  6  7  8    L   R+1  S
1    Culture Results   ?  +  ?  -  ?  -  -  -    3   6    {3, 4, 6}
2    Converter?        N  N  ?  ?  ?  Y  Y  Y
Observed Data
[Figure: observed data, shown in two panels (Ethambutol, Moxifloxacin). Horizontal axis: 1 to 9. Vertical axis: 1:n0, with ticks from 1 to 71.]
Example of Patient Data
                              Visit
Line                   1  2  3  4  5  6  7  8    L   R+1  S
1    Culture Results   ?  +  ?  -  ?  -  -  -    3   6    {3, 4, 6}
2    Converter?        N  N  ?  ?  ?  Y  Y  Y

- Need assumptions on the distribution of the time of culture conversion given the observed data.
- Conversion at 3: V3 -, V5 -; at 4: V3 +, V5 -; at 6: V5 +
- Forward time: visit 3, visit 5 given 3-, visit 5 given 3+
- Reverse time: visit 5, visit 3 given 5-
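
The mapping from a patient's culture results to the set S of possible conversion times can be sketched in a few lines. This is only an illustration under assumptions not stated on the slides: cultures are encoded as a string with '+' (positive), '-' (negative) and '?' (missing), and the time of culture conversion is taken to be one plus the last visit with a positive culture. The function name `feasible_conversion_times` is hypothetical.

```python
from itertools import product


def feasible_conversion_times(cultures):
    """Return the set of conversion times compatible with the observed cultures.

    cultures: string with one character per visit (visit 1 first), using
    '+' for positive, '-' for negative and '?' for missing (assumed encoding).
    Assumed definition: T = 1 + (last visit with a positive culture).
    """
    missing = [i for i, c in enumerate(cultures) if c == "?"]
    times = set()
    # Enumerate every way of filling in the missing cultures.
    for fill in product("+-", repeat=len(missing)):
        completed = list(cultures)
        for i, value in zip(missing, fill):
            completed[i] = value
        last_positive = max(
            (i + 1 for i, c in enumerate(completed) if c == "+"), default=0
        )
        times.add(last_positive + 1)
    return times


# Patient from the example: visits 1, 3 and 5 missing, visit 2 positive.
print(sorted(feasible_conversion_times("?+?-?---")))  # [3, 4, 6]
```

Under these assumptions, L and R+1 in the table appear to correspond to the smallest and largest elements of the returned set.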
Example of Patient Data
                              Visit
Line                   1  2  3  4  5  6  7  8    L   R+1  S
1    Culture Results   ?  +  ?  -  ?  -  -  -    3   6    {3, 4, 6}
2    Converter?        N  N  ?  ?  ?  Y  Y  Y
3    Culture Results   ?  +  ?  -  +  -  -  -    6   6    {6}
4    Converter?        N  N  N  N  N  Y  Y  Y
5    Culture Results   ?  +  ?  -  -  -  -  -    3   4    {3, 4}
6    Converter?        N  N  ?  Y  Y  Y  Y  Y
Example of Patient Data
                              Visit
Line                   1  2  3  4  5  6  7  8    L   R+1  S
1    Culture Results   ?  +  ?  -  ?  -  -  -    3   6    {3, 4, 6}
2    Converter?        N  N  ?  ?  ?  Y  Y  Y
3    Culture Results   ?  +  ?  -  +  -  -  -    6   6    {6}
4    Converter?        N  N  N  N  N  Y  Y  Y
5    Culture Results   ?  +  ?  -  -  -  -  -    3   4    {3, 4}
6    Converter?        N  N  ?  Y  Y  Y  Y  Y
7    Culture Results   ?  +  +  -  -  -  -  -    4   4    {4}
8    Converter?        N  N  N  Y  Y  Y  Y  Y
9    Culture Results   ?  +  -  -  -  -  -  -    3   3    {3}
10   Converter?        N  N  Y  Y  Y  Y  Y  Y
Benchmark Assumptions
We postulate that
\[ P[T = r+1 \mid O = o] = P[T = r+1 \mid O = o^{(r)}] \tag{1} \]
\[ P[T = k \mid T \le k, O = o] = P[T = k \mid O = o^{(k-1)}] \tag{2} \]
Sensitivity Analysis
\[ P[T = r+1 \mid O = o] = \frac{P[T = r+1 \mid O = o^{(r)}] \exp(\alpha)}{h_{r+1}(o^{(r)}; \alpha)} \tag{3} \]
\[ P[T = k \mid T \le k, O = o] = \frac{P[T = k \mid O = o^{(k-1)}] \exp(\alpha)}{h_k(o^{(k-1)}; \alpha)} \tag{4} \]
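
To see how α moves a benchmark probability, here is a minimal sketch of an exponential tilt. The slides do not define h; the code assumes it is the normalizing constant that keeps the tilted quantity a probability, h = 1 - p + p exp(α), which is a common choice in models of this type but is an assumption here. The function name `tilt` is hypothetical.

```python
import math


def tilt(p, alpha):
    """Exponentially tilt a benchmark probability p by alpha.

    Assumption (not stated on the slides): the denominator h is the
    normalizing constant 1 - p + p * exp(alpha).
    """
    h = 1.0 - p + p * math.exp(alpha)
    return p * math.exp(alpha) / h


# alpha = 0 returns the benchmark probability unchanged.
print(tilt(0.3, 0.0))
# Positive alpha raises the probability, negative alpha lowers it.
print(tilt(0.5, math.log(3)))   # approximately 0.75
print(tilt(0.5, -math.log(3)))  # approximately 0.25
```

Under this assumed form of h, the tilt adds α to the log odds of the benchmark probability, so α = 0 recovers the benchmark assumptions.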
Curse of Dimensionality
- For each realization $O = o$ with $|S| > 1$, we need to estimate $P[T = k \mid O = o^{(k-1)}]$.
- These probabilities cannot be estimated non-parametrically.
- Postulate a parametric model for the law of the observed data $O$ given baseline covariates $X$.
- This model induces parametric models for $P[T = k \mid O = o^{(k-1)}]$, which ultimately enable estimation of $P[T = k \mid O = o]$ by borrowing information across strata $O = o^{(k-1)}$.
Model for Observed Data
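- $O_k = (M_k^c, C_k^{obs}, M_k^s, S_k^{obs})$; $O = (X, \overline{O}_K)$.
- Model the law of $O$ given $X$ by modeling the distribution of $O_k$ given $\overline{O}_{k-1}$ and $X$ for all $k = 1, \ldots, K$:
\[ \operatorname{logit}\{P[M_k^c = 1 \mid \overline{O}_{k-1}, X]\} = a(k, O_{k-1}, X; \gamma^{(a)}) \]
\[ \operatorname{logit}\{P[C_k^{obs} = 1 \mid M_k^c = 0, \overline{O}_{k-1}, X]\} = b(k, O_{k-1}, X; \gamma^{(b)}) \]
\[ \operatorname{logit}\{P[M_k^s = 1 \mid M_k^c, C_k^{obs}, \overline{O}_{k-1}, X]\} = c(k, M_k^c, C_k^{obs}, O_{k-1}, X; \gamma^{(c)}) \]
\[ \operatorname{logit}\{P[S_k^{obs} = 1 \mid M_k^s = 0, M_k^c, C_k^{obs}, \overline{O}_{k-1}, X]\} = d(k, M_k^c, C_k^{obs}, O_{k-1}, X; \gamma^{(d)}) \]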
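
The four logistic regressions factor the distribution of one visit's data into a chain of Bernoulli terms. The sketch below shows the resulting log-likelihood contribution of a single visit, given the four linear predictors already evaluated. It assumes, from the conditioning events in the model, that a value of 1 for the first and third components means the corresponding result is missing, so the second and fourth components contribute only when observed. The function names and argument names are hypothetical.

```python
import math


def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_loglik(y, eta):
    """Log-probability of a binary outcome y under logit(P[y = 1]) = eta."""
    p = expit(eta)
    return math.log(p if y == 1 else 1.0 - p)


def visit_loglik(m_c, c_obs, m_s, s_obs, eta_a, eta_b, eta_c, eta_d):
    """Log-likelihood contribution of one visit under the sequential model.

    eta_a..eta_d are the values of a, b, c, d for this visit (assumed to be
    computed elsewhere from the visit history, covariates and gamma).
    """
    ll = bernoulli_loglik(m_c, eta_a)
    if m_c == 0:  # second component is modeled only when m_c = 0
        ll += bernoulli_loglik(c_obs, eta_b)
    ll += bernoulli_loglik(m_s, eta_c)
    if m_s == 0:  # fourth component is modeled only when m_s = 0
        ll += bernoulli_loglik(s_obs, eta_d)
    return ll


# With all linear predictors at 0, each modeled term contributes log(0.5).
print(visit_loglik(0, 1, 0, 0, 0.0, 0.0, 0.0, 0.0))
```

Summing such contributions over visits and patients would give a log-likelihood to maximize over γ, provided a, b, c and d are given concrete forms, which the slides leave unspecified.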
Inference
- For all realizations $O = o$ with $|S| > 1$, we can express the conditional probability $P[T = k \mid O = o^{(k-1)}]$ as a given function of $o^{(k-1)}$ and $\gamma = (\gamma^{(a)}, \gamma^{(b)}, \gamma^{(c)}, \gamma^{(d)})$.
- We can express $P[T = r+1 \mid O = o]$ and $P[T = k \mid T \le k, O = o]$ as given functions, $P[T = r+1 \mid O = o; \gamma; \alpha]$ and $P[T = k \mid T \le k, O = o; \gamma; \alpha]$, of $o$, $\gamma$ and $\alpha$.
Inference
- Estimate $\gamma$ by $\hat\gamma$ using maximum likelihood.
- Estimate $P[T = k]$ by $\frac{1}{n} \sum_{i=1}^{n} \hat P[T_i = k; \alpha]$, where $\hat P[T_i = k; \alpha]$ equals 0 if $k \notin S_i$, equals 1 if $|S_i| = 1$ and $k \in S_i$, equals
\[ P[T = R_i + 1 \mid O = O_i; \hat\gamma; \alpha] \]
if $|S_i| > 1$ and $k = R_i + 1$, and equals
\[ \{1 - P[T = R_i + 1 \mid O = O_i; \hat\gamma; \alpha]\} \left\{ \prod_{\substack{k < s < R_i + 1 \\ s \in S_i}} \left(1 - P[T = s \mid T \le s, O = O_i; \hat\gamma; \alpha]\right) \right\} \times P[T = k \mid T \le k, O = O_i; \hat\gamma; \alpha] \]
if $|S_i| > 1$ and $k \in S_i$, $k < R_i + 1$.
- Confidence intervals by non-parametric bootstrap.
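
The case-by-case definition above can be read as a backward recursion over the feasible set of one patient. The following sketch computes the patient-level probabilities from that recursion; it takes the fitted conditional probabilities as inputs rather than deriving them from a model, and all numbers in the usage example are hypothetical. The function name `patient_distribution` and its arguments are illustrative only.

```python
def patient_distribution(feasible, p_right, cond):
    """Distribution of T over one patient's feasible set.

    feasible: iterable of feasible conversion times (the set S_i).
    p_right:  P[T = R_i + 1 | O = O_i] for the largest feasible time.
    cond:     dict mapping each smaller feasible time k to
              P[T = k | T <= k, O = O_i].
    """
    times = sorted(feasible)
    if len(times) == 1:
        return {times[0]: 1.0}
    probs = {times[-1]: p_right}
    below = 1.0 - p_right  # probability that T is below the time just handled
    for k in reversed(times[:-1]):
        probs[k] = below * cond[k]
        below *= 1.0 - cond[k]
    return probs


# Hypothetical inputs for a patient with S = {3, 4, 6}.
print(patient_distribution({3, 4, 6}, p_right=0.5, cond={4: 0.4, 3: 1.0}))
```

In this example the conditional probability at the smallest feasible time is set to 1, so the three probabilities sum to one. Averaging these patient-level probabilities over all patients, with 0 for times outside each patient's feasible set, gives the estimate of P[T = k].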
Data Analysis
- Treatment groups were not balanced with respect to cavitation status at baseline: 81.1% and 56.9% of patients have cavitation in the moxifloxacin and ethambutol arms, respectively.
- For each treatment group, estimate the distribution of time of culture conversion by a weighted average of the cavitation-specific distributions of time of culture conversion.
- The weights are the marginal (i.e., not conditional on treatment arm) proportions of patients with and without cavitation at baseline, respectively.
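
The cavitation adjustment described above amounts to a weighted average of two stratum-specific distributions, using the same weights in both arms. A minimal sketch follows; the distributions and weights in the usage example are made-up numbers for illustration, not values from the study, and the function name `standardize` is hypothetical.

```python
def standardize(dist_by_stratum, weights):
    """Weighted average of stratum-specific distributions.

    dist_by_stratum: dict mapping stratum label to a list of probabilities
                     (one entry per conversion time).
    weights:         dict mapping stratum label to its marginal proportion.
    """
    length = len(next(iter(dist_by_stratum.values())))
    return [
        sum(weights[s] * dist_by_stratum[s][k] for s in dist_by_stratum)
        for k in range(length)
    ]


# Hypothetical arm-specific inputs over three conversion times.
arm_distributions = {
    "cavitation": [0.1, 0.3, 0.6],
    "no_cavitation": [0.2, 0.5, 0.3],
}
pooled_weights = {"cavitation": 0.7, "no_cavitation": 0.3}
print(standardize(arm_distributions, pooled_weights))
```

Because the weights are pooled across arms, applying the same call to each arm puts the two adjusted distributions on a common cavitation mix.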
Data Analysis
- Under the benchmark assumptions, the estimated probabilities of being a culture converter by week 8 are 92.5% and 75.5% in the moxifloxacin and ethambutol arms, respectively.
- The estimated difference is 17.0% (95% CI: [4.5%, 29.3%]).
- The benchmark analysis suggests a statistically significant difference in culture conversion by week 8 in favor of moxifloxacin.
Data Analysis
[Figure: Probability (vertical axis, 0.5 to 1.0) plotted against the sensitivity parameter (horizontal axis, -10 to 5) in two panels: Ethambutol (α0) and Moxifloxacin (α1).]
Data Analysis
[Figure: plot over α0 (Ethambutol, horizontal axis) and α1 (Moxifloxacin, vertical axis), each ranging from -10 to 5.]
Data Analysis
[Figure: Probability (vertical axis, 0.0 to 1.0) plotted against Visit (horizontal axis, 0 to 8) in two panels: Ethambutol and Moxifloxacin.]
Data Analysis
[Figure: Signed Distance (vertical axis, -0.20 to 0.10) plotted against the sensitivity parameter (horizontal axis, -10 to 5) in two panels: Ethambutol (α0) and Moxifloxacin (α1).]
Data Analysis
- Compare the treatment-specific distributions of time to culture conversion.
- Estimate a common treatment effect over time, using Cox's (1972) logistic model for discrete survival data.
- This model assumes that
\[ \frac{h_z(k)}{1 - h_z(k)} = \tau_k \exp(\beta z), \qquad k = 1, \ldots, 8, \quad z = 0, 1, \]
where $h_z(k) = P_z[T = k \mid T \ge k]$ and $\tau_1, \ldots, \tau_8 \ge 0$.
- $\exp(\beta)$ is the ratio, comparing moxifloxacin to ethambutol, of the odds of first becoming a culture converter at visit $k$ given culture conversion at or after visit $k$.
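
The discrete hazards that enter this model can be obtained from a distribution of conversion times by dividing each probability by the probability of converting at that visit or later. The sketch below illustrates this with a made-up distribution in which some patients have not converted by the last visit; the function names are hypothetical.

```python
def discrete_hazards(pmf):
    """Return h(k) = P[T = k | T >= k] for k = 1, ..., len(pmf).

    pmf[k - 1] is P[T = k]; the entries may sum to less than one when
    some probability lies beyond the last visit.
    """
    hazards = []
    at_risk = 1.0  # P[T >= 1]
    for p in pmf:
        hazards.append(p / at_risk)
        at_risk -= p
    return hazards


def hazard_odds(hazards):
    """Odds h / (1 - h) for each hazard; requires every hazard below one."""
    return [h / (1.0 - h) for h in hazards]


# Hypothetical distribution over three visits (0.4 not yet converted).
example_hazards = discrete_hazards([0.2, 0.3, 0.1])
print(example_hazards)
print(hazard_odds(example_hazards))
```

Under the model, the odds computed for the two arms at the same visit would differ by the common factor exp(β).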
Data Analysis
- For each choice of $\alpha_0$ and $\alpha_1$, we minimize the following objective function:
\[ \sum_{z=0}^{1} \sum_{k=1}^{8} \left\{ \frac{\hat h_z(k)}{1 - \hat h_z(k)} - \tau_k \exp(\beta z) \right\}^2 \]
with respect to $\tau_1, \ldots, \tau_8 \ge 0$ and $\beta$, where $\hat h_z(k) = \hat P_z[T = k \mid T \ge k]$.
- For each choice of $\alpha_0$ and $\alpha_1$, this method finds the "closest" fitting logistic model to the "data": $\{\hat h_z(k) : k = 1, \ldots, 8, \; z = 0, 1\}$.
- Even if the model is incorrectly specified, this approach still provides a valid test of the null hypothesis of no treatment effect.
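
One way to carry out this least-squares fit is sketched below. It is not taken from the slides: for fixed β the optimal τ_k has a closed form, and substituting it back leaves a quadratic equation in exp(β), whose positive root is used. The sketch assumes all estimated odds are non-negative and that the cross-product of the two odds vectors is positive. The function names and the input numbers are hypothetical.

```python
import math


def fit_logistic_model(odds0, odds1):
    """Least-squares fit of odds_z(k) to tau_k * exp(beta * z), z = 0, 1.

    odds0, odds1: estimated hazard odds for z = 0 and z = 1, one per visit.
    Returns (beta, taus). Assumes sum(odds0 * odds1) > 0.
    """
    s00 = sum(a * a for a in odds0)
    s01 = sum(a * b for a, b in zip(odds0, odds1))
    s11 = sum(b * b for b in odds1)
    # Profiling out tau_k gives s01*theta^2 + (s00 - s11)*theta - s01 = 0
    # for theta = exp(beta); the positive root minimizes the objective.
    theta = ((s11 - s00) + math.sqrt((s11 - s00) ** 2 + 4 * s01 ** 2)) / (2 * s01)
    # For fixed theta, each tau_k minimizes (a - tau)^2 + (b - theta*tau)^2.
    taus = [(a + theta * b) / (1 + theta ** 2) for a, b in zip(odds0, odds1)]
    return math.log(theta), taus


def objective(odds0, odds1, beta, taus):
    """Sum of squared differences between the odds and tau_k * exp(beta * z)."""
    theta = math.exp(beta)
    return sum(
        (a - t) ** 2 + (b - theta * t) ** 2
        for a, b, t in zip(odds0, odds1, taus)
    )


# Hypothetical odds that satisfy the model exactly with exp(beta) = 2.
beta_hat, taus_hat = fit_logistic_model([0.1, 0.2, 0.5], [0.2, 0.4, 1.0])
print(beta_hat, taus_hat)
```

With non-negative odds the fitted τ_k are automatically non-negative, so the constraint needs no separate handling in this sketch. A general-purpose numerical optimizer would be a reasonable alternative.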
Data Analysis
- Under the benchmark assumptions, the estimated hazard ratio is 3.41 (95% CI: [1.16, 16.90]), indicating that patients treated with moxifloxacin have a statistically significantly shorter time to culture conversion than those treated with ethambutol.
Data Analysis
[Figure: plot over α0 (horizontal axis) and α1 (vertical axis), each ranging from -10 to 5; labeled Moxifloxacin.]
Discussion
- Rashomon
- Coarsening at Random
- Everything is relative.
- Sensitivity analysis parameters are not scientifically interpretable.
- Look at induced distributions.
- Requires scientific judgement.