Semiparametric estimation of the average causal

Biostatistics (2010), 11, 1, pp. 34–47
doi:10.1093/biostatistics/kxp034
Advance Access publication on October 8, 2009
Semiparametric estimation of the average causal effect of
treatment on an outcome measured after a
postrandomization event, with missing outcome data
PETER B. GILBERT∗
Department of Biostatistics, University of Washington, Seattle, WA 98105, USA and Fred Hutchinson
Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
[email protected]
YUYING JIN
Department of Biostatistics, University of Washington, Seattle, WA 98105, USA
S UMMARY
In the past decade, several principal stratification–based statistical methods have been developed for testing and estimation of a treatment effect on an outcome measured after a postrandomization event. Two
examples are the evaluation of the effect of a cancer treatment on quality of life in subjects who remain
alive and the evaluation of the effect of an HIV vaccine on viral load in subjects who acquire HIV infection. However, in general the developed methods have not addressed the issue of missing outcome data,
and hence their validity relies on a missing completely at random (MCAR) assumption. Because in many
applications the MCAR assumption is untenable, while a missing at random (MAR) assumption is defensible, we extend the semiparametric likelihood sensitivity analysis approach of Gilbert and others (2003)
and Jemiai and Rotnitzky (2005) to allow the outcome to be MAR. We combine these methods with the
robust likelihood–based method of Little and An (2004) for handling MAR data to provide semiparametric estimation of the average causal effect of treatment on the outcome. The new method, which does not
require a monotonicity assumption, is evaluated in a simulation study and is applied to data from the first
HIV vaccine efficacy trial.
Keywords: Causal inference; HIV vaccine trial; Missing at random; Posttreatment selection bias; Principal stratification; Sensitivity analysis.
1. I NTRODUCTION
For randomized placebo-controlled efficacy trials of an HIV vaccine evaluated in HIV-uninfected volunteers, a primary objective is to evaluate the effect of vaccination on the incidence of HIV infection.
Another objective, which was secondary in 2 trials of antibody-based vaccines (Flynn and others, 2005;
Pitisittithum and others, 2006) and co-primary in 2 trials of T cell–based vaccines (Buchbinder and others,
∗ To whom correspondence should be addressed.
c The Author 2009. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected].
Semiparametric estimation of the average causal effect of treatment
35
2008), is to evaluate the effect of vaccination on HIV viral load measured after HIV infection. Two causal
approaches of interest for assessing the latter objective are intent-to-treat (ITT) methods that assess the
burden of illness (a composite end point that combines the infection end point and the postinfection end
point) (Chang and others, 1994) and conditional methods that assess the postinfection end point in the
principal stratum of subjects who would be HIV-infected under either treatment assignment (the so-called
“always infected” stratum). As discussed by Gilbert and others (2003) [henceforth GBH] and others, the
2 approaches address different substantive questions. For example, in registrational/licensure trials, the
ITT approach may be most appropriate for the primary analysis and the conditional approach would
be used in secondary analyses of the mechanistic vaccine effect on the postinfection outcome, whereas
in preregistrational test-of-concept trials, the conditional approach might be used for the primary analysis
(Mehrotra and others, 2006).
Here, we develop methodology for the conditional approach to evaluate the causal vaccine effect on
the postinfection end point. For concreteness, we refer to the first postrandomization event as infection
and the outcome of interest measured after this event as viral load. However, the method has general
application to evaluating causal treatment effects on outcomes measured after a postrandomization event,
including studies of quality of life (Rubin, 2000), prostate cancer severity (Shepherd and others, 2008),
kidney disease, and cancer screening (Joffe and others, 2007).
Because the set of trial participants who are in the always infected stratum is unknown, causal treatment effects for this group are not identified from randomized trial data. Two approaches for addressing
the nonidentifiability have been to derive sharp bounds for causal effects (Jemiai and Rotnitzky, 2003;
Hudgens and others, 2003; Zhang and Rubin, 2003) and to estimate causal effects under an additional set
of identifiability assumptions that include models describing the nature and degree of possible selection
bias, with a sensitivity analysis to explore how the inferences vary over a range of the selection models
(GBH; Hayden and others, 2005; Hudgens and Halloran, 2006; Jemiai and Rotnitzky, 2005 (henceforth
JR); Jemiai and others, 2007; Shepherd, Gilbert, Jemiai and others, 2006; Shepherd and others, 2007,
2008). All these methods assume that viral load is observed for all infected subjects. Therefore, the validity of their inferences depends on a missing completely at random assumption (MCAR). While the
MCAR assumption is often untenable, trials may collect sufficient data on participant characteristics to
make a missing at random (MAR) assumption plausible. This article extends the approach of GBH and
JR to accommodate MAR missingness of the viral load end point.
To illustrate the new method, we focus on the first efficacy trial (Flynn and others, 2005) and define
the end point Y p to be the pre-antiretroviral therapy (pre-ART) log10 viral load at the month 12 visit post
HIV infection diagnosis. Only subjects who have not started ART by the month 12 visit contribute a value
Y p . We exclude viral load values measured after ART initiation because ART strongly effects viral load
levels (Gilbert and others, 2003). Inferences about Y p apply to a population where ART is not prescribed
during the first 12 months after infection diagnosis.
Of the 368 subjects who became HIV-infected during the trial, 121 had Y p observed, 138 had missing
data because they initiated ART prior to the month 12 study visit, and 109 had missing data because they
dropped out prior to initiating ART and prior to the visit. Figure 1 shows the VaxGen data on Y p = Y3 in
relationship to 2 key covariates: Y1 is early pre-ART square root CD4 cell count, and Y2 is early pre-ART
log10 viral load; these variables average all pre-ART values measured within 2 months of infection diagnosis. Viral load early and at 12 months were positively correlated (Figures 1(e) and (f)), with Spearman rank
correlation 0.68 (0.55) for the vaccine (placebo) group. There is evidence of a negative correlation between
early CD4 cell count and viral load at 12 months (Figures 1(c) and (d)) for the vaccine group (Spearman
correlation = −0.33) but not for the placebo group (Spearman correlation = 0.16).
In addition, lower levels of Y1 and higher levels of Y2 were highly predictive of ART initiation
(P values<0.001 in a multivariate Cox model; Gilbert and others, 2005), showing that MCAR was badly
violated. MAR may be plausible, however, because physicians base decisions to prescribe ART on the
36
P. B. G ILBERT AND Y. J IN
Fig. 1. VaxGen trial data: jittered pre-antiretroviral (pre-ART) viral loads at the month 12 postinfection diagnosis
visit (Y3 ) ((a) and (b)); pre-ART viral loads at the month 12 postinfection diagnosis visit versus square root CD4 cell
counts early after infection diagnosis (Y3 versus Y1 ; (c) and (d)); pre-ART viral loads at the month 12 postinfection
diagnosis visit vs early after infection diagnosis (Y3 versus Y2 ; (e) and (f)).
monitoring of CD4 cell counts, viral loads, HIV-related clinical events, and comorbidities (Hammer and
others, 2008). In fact, MAR missingness due to ART initiation may approximately hold “systematically”
in settings where ART is offered to all infected participants when their biomarkers are observed to cross
prespecified thresholds or they present with prespecified symptomatic illnesses.
Semiparametric estimation of the average causal effect of treatment
37
Two popular approaches to making valid inferences under MAR are inverse probability weighted
(IPW)–based methods (e.g. Robins and others, 1995) and likelihood-based methods (e.g. Little and
Rubin, 2002). For some HIV vaccine trials, the former methods are expected to be relatively inefficient and
possibly unstable because the estimated weights for some subjects are expected to be near zero. Specifically, for a subject with Y p observed, the weight in the denominator of the estimating equation equals
the estimated probability that the subject did not drop out by the month 12 visit multiplied by the
estimated probability that the subject did not start ART by the visit conditional on not dropping out.
These estimates can be computed as fitted values in regression models based on the subject’s covariates.
The latter conditional probability may be near zero for subjects whose CD4 cell counts drop below the
prespecified threshold at which ART initiation is recommended (currently the recommended threshold
is between 200 and 500 cells/mm3 depending on the country). In fact, for ethical reasons any efficacy
trial will offer ART to all infected participants who meet treatment criteria, such that the more successful the treatment coverage the closer some estimated weights in the denominator may be to zero.
Therefore, for some HIV vaccine trials, IPW methods are expected to provide relatively imprecise
inferences.
While likelihood methods are less subject to the instability problem, they are susceptible to misspecification of the model relating Y p to covariates. To partially ameliorate this problem, we use the
robust likelihood–based method of Little and An (2004) (henceforth LA) that is based on penalized
splines of the propensity score. Our approach only handles monotone missing data (i.e. dropout); an
alternative approach would handle nonmonotone missingness using parametric multiple imputation and
Monte Carlo Markov chain, for example, by extending the method of Mogg and Mehrotra (2007) for
analyzing viral load data to address postrandomization selection bias. The article is organized as follows. Section 2 describes the causal estimand of interest and identifiability assumptions. Section 3 shows
how to combine the methodologies of GBH/JR and LA into a procedure for consistently estimating the
causal estimand under an MAR assumption. Section 4 evaluates the new method in a simulation study.
Section 5 applies the method to the first HIV vaccine efficacy trial, and Section 6 offers concluding
remarks.
2. AVERAGE CAUSAL EFFECT ESTIMAND AND IDENTIFIABILITY ASSUMPTIONS
2.1
Notation and estimand
Let Z be treatment assignment, and let X be a q-vector of baseline covariates fully observed for everyone.
Let S be the indicator of the postrandomization event. Subjects experiencing S = 1 are subsequently
evaluated at V visits, where variables Y1 , . . . , Yn 1 are collected at visit 1, variables Yn 1 +1 , . . . , Yn 1 +n 2 are
V
collected at visit 2, and so on, with variables YV −1 n +1 , . . . , Y p collected at visit V , where p = i=1
ni .
i=1
i
The entire collection of p variables measured after S = 1 is Y ≡ (Y1 , . . . , Y p ) , where Y p is the outcome
variable of interest. For j = 1, . . . , p, let M j be the indicator of whether Y j is missing and set M =
(M1 , . . . , M p ) . The variables Y are only meaningful if S = 1; thus, Y and M are undefined if S = 0 and
we denote this by Y = M = ∗. For HIV vaccine trials, Z is vaccination assignment (Z = 1, vaccine;
Z = 0, placebo) and S is HIV infection diagnosis during the trial.
Each participant has potential infection outcome S(1) if assigned vaccine and S(0) if assigned placebo.
For Z = 0, 1, the potential outcomes Y (Z ) ≡ (Y1 (Z ), Y2 (Z ), . . . , Y p (Z )) and M(Z ) ≡ (M1 (Z ),
M2 (Z ), . . . , M p (Z )) are defined if S(Z ) = 1; otherwise Y (Z ) ≡ ∗ and M(Z ) ≡ ∗. With µz ≡
E(Y p (z)|S(0) = S(1) = 1) for z = 0, 1, the “average causal effect (ACE)” estimand of interest is
ACE ≡ µ1 − µ0 . Our goal is to estimate the ACE based on assumptions and the observed i.i.d. data
(Z i , X i , Si , Mi , Yi ), i = 1, . . . , N .
38
P. B. G ILBERT AND Y. J IN
2.2
MAR assumption
For subjects with S = 1, let Yobs denote the components of Y that are observed and Ymis denote the
components of Y that are missing. Let f be the conditional cumulative distribution function (cdf) of
M given Y and S = 1, f (M|Y, S = 1, ν), where ν denotes unknown parameters. MAR states that
missingness depends only on the observed values Yobs , that is,
f (M|Y, S = 1, ν) = f (M|Yobs , S = 1, ν)
for all Ymis , ν.
For simplicity, we develop the methods for a setting where Y1 , . . . , Y p−1 are fully observed in infected
subjects (with S = 1) and only Y p has missing values. In Section 6, we describe how to extend the method
to the case of monotone missing data.
2.3
Identifiability assumptions
Throughout, we make the following 3 assumptions.
A1: Stable unit treatment values assumption (Rubin, 1978).
A2: The treatment assignment Z is independent of (X, S(0), S(1), M(0), M(1), Y (0), Y (1)).
A3: For infected subjects (with S = 1), the missing data mechanism for Y is MAR.
All the papers for studying the causal vaccine effect in the always infected stratum cited in Section 1
assume A1 and A2; we refer the reader to these articles for discussion about their justification. As discussed in Section 1, the MAR assumption A3 may be quite plausible when missingness is mainly due to
ART initiation and ART guidelines are used. If the missing data mechanism is believed to be non-MAR,
then the methodology here may mislead and should be used with caution. An advantageous feature of the
MAR assumption is that investigators can design clinical trials to collect the covariate data that make it
plausible.
We next postulate additional assumptions that identify the ACE and are indexed by fixed sensitivity
parameters. Following JR, we suppose 3 models, which we refer to collectively as A4:
g0 (Pr(S(1) = 1|S(0) = 1, Y p (0) = y) = α0 + β0 y,
(2.1)
g1 (Pr(S(0) = 1|S(1) = 1, Y p (1) = y) = α1 + β1 y,
(2.2)
Pr(S(0) = 1|S(1) = 1) = φ,
(2.3)
where g0 and g1 are known invertible link functions whose inverses are continuous in α0 and α1 , α0 and
α1 are unknown parameters to be estimated, and β0 , β1 , and φ are known sensitivity parameters that are
varied over plausible ranges, where subject matter experts can help define the plausible ranges. With logit
links g0 and g1 , β0 is interpreted as the difference in the log odds of infection in the vaccine group given
infection in the placebo group with y versus y − 1 viral load, and β1 is interpreted similarly reversing
the role of vaccine and placebo. The parameter φ is interpreted as the probability that a subject infected
in the vaccine group would also be infected in the placebo group. Except for the method of JR, all of the
previously developed methods cited above assume φ = 1 (i.e. monotonicity, that the vaccine does not
increase the risk of infection for any subject), in which case the selection model (2.2) is superfluous and
only model (2.1) is used, and the only sensitivity parameter is β0 . In this case, the methods of GBH and
JR are equivalent. For greater applicability of the method, here, we allow for nonmonotonic settings by
considering the trio of models (2.1–2.3).
Semiparametric estimation of the average causal effect of treatment
39
3. E STIMATION OF THE ACE
3.1
Estimation of the ACE for complete data
For the case that Y p is measured from all subjects with S = 1, JR proved that A1, A2, and A4 identify
the ACE and developed unbiased estimating equations that can be solved to obtain a consistent asymptotically normal estimator of the ACE. Shepherd and others (2008) summarized this result together with
a computational procedure for solving the equation and for obtaining the consistent sandwich variance
N
Ui (θ) = 0 is detailed using
estimator. With θ ≡ ( p0 , p1 , α0 , α1 , µ0 , µ1 ) , JR’s estimating equation i=1
our notation in Appendix A ([A.1]–[A.6]) in the supplementary material available at Biostatistics online.
JR’s estimating equation was derived based on the following relationships, which form the basis for
integrating the LA and JR methods:
E[g0−1 (Y p ; α0 , β0 )|S = 1, Z = 0] = φ pp10 ,
(3.1)
E[g1−1 (Y p ; α1 , β1 )|S = 1, Z = 1] = φ,
(3.2)
E[Y p g0−1 (Y p ; α0 , β0 )|S = 1, Z = 0] = φ pp10 µ0 ,
(3.3)
E[Y p g1−1 (Y p ; α1 , β1 )|S = 1, Z = 1] = φµ1 .
(3.4)
3.2 Estimation of the ACE for MAR data
We show how to augment JR’s estimating equations to handle missing data via LA’s method. Following
Section 5 of LA, we suppose q + p 3 so that there are at least 2 covariates. For infected subjects,
separately for groups Z = 0 and Z = 1, define the logit of the propensity score for Y p to be observed,
given the covariates X, Y1 , . . . , Y p−1 :
Y1∗ ≡ logit(Pr(M p = 0|X, Y1 , . . . , Y p−1 , S = 1, Z = z)).
(3.5)
LA observed that, conditional on the propensity score and assuming MAR, the missingness of Y p does
not depend on Z , X , Y1 , . . . , Y p−1 (Rosenbaum and Rubin, 1983). Consequently, each of the expectations
in (3.1–3.4) can be written as
E[h(Y p )|S = 1, Z = z] = E[(1 − M p )h(Y p )|S = 1, Z = z]
+ E[M p × E(h(Y p )|Y1∗ , X 2 , . . . , X q , Y1 , . . . , Y p−1 , S = 1, Z = z)|S = 1, Z = z],
(3.6)
where h(Y p ) denotes a function of Y p . Therefore, JR’s estimating function Ui (θ) can be modified to
accommodate MAR missingness, where the modified function UiM (θ) uses (3.6) instead of (3.1)–(3.4).
The estimating function UiM (θ) is detailed in Appendix A ([A.7]–[A.12]) in the supplementary material
i [h(Y p )] for h(Y p ) =
available at Biostatistics online. The function UiM (θ) includes predicted values E
M
−1
−1
gz (Y p ; α0 , β0 ) or Y p gz (Y p ; α0 , β0 ), for z = 0, 1, and Ui (θ) becomes fully specified once LA’s techi [h(Y p )].
nique is applied to obtain the predicted values E
3.3
Calculation of the fitted values
We use LA’s (Section 5) penalized spline propensity prediction method to predict Y p for infected subjects
with missing data. First, a logistic regression model is fit relating M p to (X 1 , . . . , X q , Y1 , . . . , Y p−1 ) ,
40
P. B. G ILBERT AND Y. J IN
∗ for all subjects with S = 1. Second, a spline regression model
which yields estimated propensities Y
1
∗ is fit using subjects for whom Y p is observed. Beyond Y
∗ , additional covariates that predict
of Y p on Y
1
1
Y p may be entered into the regression model parametrically. From this fitted model, the value E[Y p |Y1∗ ,
X 2 , . . . , X q , Y1 , . . . , Y p−1 , S = 1, Z = z] is predicted for each subject with S M p = 1 using his or her
i [gz−1 (Y p ; αz , βz )]
∗ and other covariates. Third, as described below, fitted values E
estimated propensity Y
1
−1
and E i [Y p gz (Y p ; αz , βz )] are computed using these regression fits and either numerical integration (for
a general link function) or a closed-form equation (for an inverse probit link function). These steps may
be done in the vaccine and placebo groups separately.
Applied to our setting, LA’s specific propensity spline prediction model replaces one of the predictor
variables, say X 1 , by Y1∗ and supposes
(X 2 , . . . , X q , Y1 , . . . , Y p−1 |Y1∗ , S = 1, Z = z)
X
X
Y
Y
(Y1∗ ), . . . , szq
(Y1∗ ), sz1
(Y1∗ ), . . . szp−1
(Y1∗ )), z ),
∼ N ((sz2
(3.7)
(Y p |Y1∗ , X 2 , . . . , X q , Y1 , . . . , Y p−1 , S = 1, Z = z, γz )
Y
∗
(Y1∗ ) + r z (X 1∗ , . . . , X q∗ , Y1∗ , . . . , Y p−1
, γz ), σz2 ),
∼ N (szp
for z = 0, 1, where for j = 2, . . . , q, szXj (Y1∗ ) = E(X j |Y1∗ , S = 1, Z = z) is a spline for the regression
of X j on Y1∗ , where X ∗j = X j − szXj (Y1∗ ); and for j = 1, . . . , p − 1, szYj (Y1∗ ) = E(Y j |Y1∗ , S = 1, Z = z)
is a spline for the regression of Y j on Y1∗ , where Y j∗ = Y j − szYj (Y1∗ ). Furthermore, for z = 0, 1, r z
is a parametric function with unknown parameter vector γz that satisfies r z (Y1∗ , 0, . . . , 0, γz ) = 0 for
i [Y p ] =
all γz . For subject i in group z with Si M pi = 1, the predicted value of Y p is obtained as E
∗
∗
∗
∗
∗
∗
X
∗
∗
Y
∗ ), Y
yi1 ) + r z (
xi1 , . . . , xiq , yi1 , . . . , yi( p−1) ; γ
xi j = xi j − sz j (
xi1 ), yi j = yi j − sz j (
yi1
szXj
szp (
z ), where denotes the sample estimate of the spline szXj , szYj denotes the sample estimate of the spline szYj , xi j is the
realization of X i j , and yi j is the realization of Yi j . The model (3.7) enters X ∗j and Y j∗ as covariates; in
follow-up work, Zhang and Little (2005) showed that the LA method—and hence our sensitivity analysis
method—remains valid if the X ∗j , Y j∗ are replaced with the X j , Y j .
i [gz−1 (Y p ; αz , βz )] as
Next, using (3.7), for a general link function, we compute the fitted values E
−1
1
gz (y; αz , βz ) σz ), where is the cdf of the standard normal distribution. The
σz d([y − E i [Y p ]]/
−1
predicted value E i [Y p gz (Y p ; αz , βz )] is also computed using numerical integration. For the case that gz
i [gz−1 (Y p ; αz , βz )]
is the inverse probit function, numerical integration is not needed and the fitted values E
−1
i [Y p gz (Y p ; αz , βz )] can instead be computed as simple functions of E
i [Y p ]. This procedure is
and E
described in Appendix A in the supplementary material available at Biostatistics online.
3.4
Computational algorithm for estimating the ACE
N
=
µ0 is computed by solving i=1
UiM (θ) = 0 (UiM (θ) is defined in Appendix
The estimate ACE
µ1 − A [A.7]–[A.12] in the supplementary material available at Biostatistics online) with the following steps.
N
Step 1: Estimate p0 ≡ Pr(S(0) = 1) by solving i=1
U1iM ( p0 ) = 0, and estimate p1 ≡ Pr(S(1) = 1)
N
by solving i=1 U2iM ( p1 ) = 0.
i [g −1 (Y p ; α0 , β0 )] into U M (θ) and solve for α0 in
Step 2: Plug p0 and the fitted values E
3i
0
N M
i [g −1 (Y p ; α1 , β1 )]
U
(α
)
=
0
with
a
1D
line
search.
Similarly,
plug
p
and
the
fitted
values E
0
1
i=1 3i
1
M
into U4i (θ) and solve for α1 .
Semiparametric estimation of the average causal effect of treatment
41
i [Y p g −1 (Y p ; α0 , β0 )] into U M (θ) and
Step 3: Plug the estimates of p0 and α0 and the fitted values E
0
5i
N
N
U6iM (µ1 ) = 0 for µ1 .
solve i=1 U5iM (µ0 ) = 0 for µ0 . Similarly, solve i=1
Under the monotonicity assumption (i.e. φ = 1), the computational algorithm is the same except that the
second part of Step 2 is omitted (α1 is no longer relevant), and U6iM (θ) simplifies (given in expression
(A.13) of Appendix A in the supplementary material available at Biostatistics online). If gz is inverse
probit, then Steps 2 and 3 use closed-form formulas (A.14) and (A.15) in Appendix A in the supplementary material available at Biostatistics online; otherwise they use numerical integration as described in
Section 3.3.
3.5
Standard errors and CIs
and confidence intervals (CIs) for ACE is
Deriving asymptotic-based standard error estimators for ACE
difficult, given the smoothing and tuning parameter selection involved with the penalized splines. We
develop a bootstrap approach. As LA did not develop CIs nor standard error estimators, the performance
of this approach is of interest even for the setting without a postrandomization event.
Within each treatment group Z = z separately, B bootstrap data sets are constructed by sampling with
replacement N z realizations of (X i , Si , Mi , Yi )|Z i = z, z = 0, 1. The estimation procedure described
are
µ1 , and ACE
in Section 3.4 is carried out for each bootstrap data set. Then, standard errors for µ0 , estimated by the sample standard deviations of the bootstrap estimates, and (1 − α) × 100% CIs are
obtained as the α/2 and 1 − α/2 percentiles of the bootstrap estimates.
3.6
Robustness of the ACE estimator
LA’s Theorem 1 states a double robustness–type property of the propensity spline prediction procedure,
see Kang and Schafer (2007) and comments within for further discussion. In our setting, this robustness
property can be stated as follows.
3.1 Assume A1–A4, and let µ0 and µ1 be the estimators of µ0 and µ1 , respectively, obtained
T HEOREM N
by solving i=1
UiM (θ) = 0. Then, for z = 0, 1, µz is a consistent estimator of µz if either (i) the mean
of Y p conditional on (Y1∗ , X 2 , . . . , X q , Y1 , . . . , Y p−1 , S = 1, Z ) in (3.7) is correctly specified or (ii) the
propensity Y1∗ is correctly specified, and (iii) E[X ∗j |Y1∗ , S = 1, Z = z] = szXj (Y1∗ ) for j = 2, . . . , q and
E[Y j∗ |Y1∗ , S = 1, Z = z] = szYj (Y1∗ ) for j = 1, . . . , p.
LA comment that the robustness property of (iii) is that r z does not need to be correctly specified and
suggest that (b2) is a weak assumption because of the flexibility of the spline regression models.
4. S IMULATION STUDY
We briefly describe the simulation study of an HIV vaccine trial, with expanded details in Appendix B in
the supplementary material available at Biostatistics online.
4.1
Design of simulation study
Four simulation experiments were carried out such that A1–A4 hold with φ = 1 (i.e. monotonicity holds)
and g0−1 (y; α0 , β0 ) = (α0 + β0 y). We created moderate or extreme selection bias specified by β0 = 1
or β0 = 3. Based on the VaxGen trial, we consider the 3 variables Y1 , Y2 , and Y3 defined in Section 1
42
P. B. G ILBERT AND Y. J IN
(the outcome Y3 is month 12 pre-ART log10 viral load; Y1 is early pre-ART square root CD4 cell count;
Y2 is early pre-ART log10 viral load).
Each simulation generated data with an even chance of assignment to vaccine or placebo, an overall
placebo infection rate of 25%, and Y (0) = (Y1 (0), Y2 (0), Y3 (0)) multivariate normal, with means and
variances equal to the sample estimates obtained in the VaxGen trial (Gilbert and others, 2005). The correlations were set to cor(Y1 (0), Y2 (0)) = cor(Y1 (0), Y3 (0)) = −0.5 and cor(Y2 (0), Y3 (0)) = 0.8. For infected vaccine group subjects, Y (1) was set to Y (0) + (0, 0, −0.5) , which specifies the true ACE = −0.5.
We assumed complete data for Y1 and Y2 in infected subjects and created missing data for Y3 that was
either MCAR or MAR. In either case, Y3 is missing in about 50% of infected subjects. The MAR scenario
generated missingness indicators from a logistic regression model of M3 on Y1 and Y2 with coefficients
chosen from a fit to the VaxGen data.
Four methods were used to estimate the ACE with a 95% bootstrap percentile CI. The first was ordinary least squares complete case (OLSCC), which simply compares sample averages of Y3 between
groups in all infected subjects that have Y3 measured. The next 3 methods are causal: before deletion
(BD), which applies GBH/JR to Y3 using the complete data (an unknowable gold standard for comparison); CC, a complete case analysis that applies GBH/JR; SPPL, the new method with penalized spline
propensity prediction based on a regression of Y3 on the spline of Y1∗ , where Y1∗ is the linear predictor of
the estimated propensity to observe Y3 from a linear logistic regression of M3 on Y1 and Y2 . This SPPL
method uses 15 equally spaced fixed knots and a truncated linear basis and uses model (3.7) with a linear
additive model for Y2∗ : r z (Y1∗ , Y2∗ , γz ) = γz1 Y1∗ + γz2 Y2∗ , for z = 0, 1. Note that the OLSCC method is
equivalent to the GBH/JR method for the special case that φ = 1 and β0 = β1 = 0.
4.2
Simulation results
Table 1 shows bias and root mean squared error (RMSE) of the estimated ACE and the coverage probability of the nonparametric bootstrap CIs. In all cases, the OLSCC method is badly biased because it ignores
Table 1. Bias, RMSE and 95% coverage probability for the OLSCC, BD, CC, and SPPL methods (true
ACE = −0.5)
OLSCC
BD
CC
SPPL
MCAR data
Bias
β0 = 1
ACE
ACE coverage
β0 = 3
ACE
ACE coverage
RMSE
0.426
0.464
27.6%
0.608
0.627
2.2%
Bias
RMSE
Bias
RMSE
Bias
RMSE
Bias
RMSE
−0.002
0.147
91.6%
−0.020
0.214
92.8%
−0.002
0.169
93.4%
0.001
0.152
91.8%
−0.037
0.213
93.4%
0.003
0.168
94.6%
MAR data
β0 = 1
ACE
ACE coverage
β0 = 3
ACE
ACE coverage
Bias
RMSE
Bias
RMSE
Bias
RMSE
0.343
0.372
37.6%
−0.005
0.131
94.8%
−0.025
0.177
97.4%
0.002
0.154
96.8%
0.464
−0.004
0.135
95.6%
−0.050
0.194
94%
−0.001
0.156
95.6%
0.482
6.8%
NOTE: BD = GBH/JR applied to complete data BD (unattainable gold standard); CC = GBH/JR applied to CCs; SPPL = spline
propensity prediction with a linear predictor.
Semiparametric estimation of the average causal effect of treatment
43
Fig. 2. Simulation study: boxplots of nonparametric bootstrap percentile 95% CI lengths for the ACE.
the postrandomization selection bias. As expected, the CC method is unbiased under MCAR missingness but biased under MAR missingness, whereas the new SPPL method is unbiased under both MCAR
and MAR data. Corresponding to this, the CIs about the ACE for the CC method have too-low coverage
probability under MAR data, whereas the CIs for the SPPL method are always near the correct level. The
RMSEs for the SPPL method are 21.1% lower on average than those of the CC method, showing that the
new method provides substantially better estimators than the existing methods.
Figure 2 shows distributions of the CI lengths, demonstrating that the SPPL method provides much
more precise estimates than the CC method. Furthermore, under MAR missingness the bootstrap variance
had a similar distribution as the Monte Carlo sample variances of the ACE
s for the
estimates of ACE
SPPL method but not for the CC method, indicating that the bootstrap variance estimator for the SPPL
method is unbiased (not shown). Additional simulations were conducted under the above design except
that Y (1) was set to Y (0) + (0, −0.5, −0.5) . The results were similar, demonstrating that the new method
performs well when there is an average causal effect on a postinfection covariate used to predict the
outcome of interest.
5. E XAMPLE
We now apply the newly developed SPPL method to the VaxGen data. The same method as evaluated
in the simulations was applied, except with 3000 bootstrap replications. Our outcome of interest Y3 and
44
P. B. G ILBERT AND Y. J IN
covariates Y1 and Y2 are the same as considered above. Of the 368 subjects who became HIV-infected,
there is nearly complete data for Y1 (n = 334) and Y2 (n = 341), and, as discussed above, one-third of
the infected subjects have Y3 observed (n = 121) (Figure 1). The R 2 value of a linear regression of Y3
on Y1 and Y2 is 0.40, suggesting that accounting for the predictive information in Y1 and Y2 with the new
method is expected to provide greater efficiency than the CC method that does not leverage this positive
correlation.
Because the estimated incidence of HIV infection was slightly lower in the vaccine group than the
placebo group (Flynn and others, 2005), the data support (but of course do not prove) the monotonicity
assumption that φ = 1. Accordingly, we first conduct a sensitivity analysis assuming φ = 1. For the sensitivity parameter β0 ranging from −2 to 2, which in our opinion covers the plausible range for potential
selection bias, Figure 3 shows point and 95% bootstrap CI estimates of the ACE by the CC method and
by the new method. Both analyses support no causal vaccine effect on viral load, and the new method
provides slightly narrower CIs.
Acknowledging that monotonocity may not hold, we next performed a sensitivity analysis setting
φ = 0.8. This value was chosen based on the fact that the upper 95% confidence limit for the hazard ratio
(vaccine/placebo) was 1.17. A value φ = 0.8 assumes that a vaccinated subject who becomes infected
would have a 20% chance of avoiding infection had he/she been assigned placebo, which approximately
specifies a 17% plausible elevation of infection risk in the vaccine group. The shaded regions in Figure 4
show values of (β0 , β1 ) under which a 95% CI for the ACE excludes 0. Figure 4 shows that only under
conditions of large selection bias in certain directions are the data consistent with a beneficial or harmful
effect. We conclude that the data are consistent with the null hypothesis of ACE = 0.
We conducted the example using an inverse probit link function gz . With this link, interpreting the
sensitivity parameters βz requires thinking in the “Z-metric” (normal quantitle metric), where a one unit
increase in Y p leads to an increase in the probit score −1 (Pr(S(1) = 1|S(0) = 1, Y p (0) = y) by
βz standard deviations. For those uncomfortable thinking in the Z-metric, it may be preferable to use
Fig. 3. Estimated ACE and 95% CIs with φ = 1. Solid lines are for the SPPL method and dashed lines are for the CC
method.
Semiparametric estimation of the average causal effect of treatment
45
Fig. 4. (a) Estimated ACE, (b) lower 95% confidence limits, and (c) upper 95% confidence limits with φ = 0.8. Solid
lines are for the SPPL method and dashed lines are for the CC method.
an alternative link function that they find more interpretable, for example, logit. Furthermore, it is good
practice in sensitivity analysis to repeat the analysis for multiple link functions.
6. C ONCLUDING REMARKS
Recently developed methods for estimating causal treatment effects on an outcome measured after a postrandomization event, including the methods of GBH and JR, are valid under an MCAR assumption. We
have extended the GBH and JR methods to allow for MAR missingness, by augmenting it with LA’s robust likelihood approach that uses penalized spline propensity prediction. The simulation study showed
that the new method corrects for the bias of the existing methods and improves precision by incorporating
covariates that predict the outcome.
We have focused on accommodating missing data on the outcome of interest Y p , and as such have supposed complete data for the postrandomization event S. The method could be extended to accommodate
MAR missingness of S via IPW approach of Shepherd and others, 2008 to address this issue.
If the missing outcome data follow a monotone pattern, such that Y j+1 , . . . , Y p are missing for all subjects for whom Y j is missing, then the method can be extended using the approach described in Section 7
of LA. Specifically, the propensity spline model (3.7) can be applied sequentially to each block of missing
i [gz−1 (Y p ; αz , βz )] and E
i [Y p gz−1 (Y p ; αz , βz )] appearing in
values. With this approach, the fitted values E
46
P. B. G ILBERT AND Y. J IN
the estimating function UiM (θ) are computed by replacing missing covariate values by predicted values in
sequential regression models.
Extensions of interest include accommodating nonmonotone missing data and expanding the sensitivity analysis to evaluate the impact on inferences of deviations from the MAR assumption.
S UPPLEMENTARY MATERIAL
Supplementary material is available at http://biostatistics.oxfordjournals.org.
ACKNOWLEDGMENTS
Conflict of Interest: None declared.
F UNDING
National Institutes of Health (RO1 AI054165-06) to P.B.
R EFERENCES
B UCHBINDER , S. P., M EHROTRA , D. V., D UERR , A., F ITZGERALD , D. W., M OGG , R., L I , D., G ILBERT, P. B.,
L AMA , J. R., M ARMOR , M., D EL R IO , C. and others (2008). The Step study: the first test-of-concept efficacy
trial of a cell-mediated immunity HIV vaccine. Lancet 372, 1881–1893.
C HANG , M. N., G UESS , H. A. AND H EYSE , J. F. (1994). Reduction in the burden of illness: a new efficacy measure
for prevention trials. Statistics in Medicine 13, 1807–1814.
F LYNN , N. M., F ORTHAL , D. N., H ARRO , C. D., J UDSON , F. N., M AYER , K. H., PARA , M. F. AND T HE RGP 120
HIV VACCINE S TUDY G ROUP (2005). Placebo-controlled phase 3 trial of recombinant glycoprotein 120 vaccine
to prevent HIV-1 infection. The Journal of Infectious Diseases 191, 654–665.
G ILBERT, P. B., ACKERS , M. L., B ERMAN , P. W., F RANCIS , D. P., P OPOVIC , V., H U , D. J., H EYWARD , W. L.,
S INANGIL , F., S HEPHERD , B., G URWITH , M. (2005). HIV-1 virologic and immunologic progression and
antiretroviral therapy initiation among HIV-1-infected participants in an efficacy trial of recombinant glycoprotein
120 vaccine. The Journal of Infectious Diseases 192, 974–983.
G ILBERT, P. B., B OSCH , R. AND H UDGENS , M. G. (2003). Sensitivity analysis for the assessment of causal vaccine
effects on viral load in HIV vaccine trials. Biometrics 59, 531–541.
H AMMER , S. M., E RON , J R , J. J., R EISS , P., S CHOOLEY, R. T., T HOMPSON , M. A., WALMSLEY, S., C AHN , P.,
F ISCHL , M. A., G ATELL , J. M., H IRSCH , M. S. and others (2008). Antiretroviral treatment of adult HIV infection: 2008 recommendations of the International AIDS Society—USA panel. Journal of the American Medicial
Association 300, 555–570.
H AYDEN , D., PAULER , D. K. AND S CHOENFELD , D. (2005). An estimator for treatment comparisons among
survivors in randomized trials. Biometrics 61, 305–310.
H UDGENS , M. G. AND H ALLORAN , M. E. (2006). Causal vaccine effects on binary postinfection outcomes. Journal
of the American Statistical Association 101, 51–64.
H UDGENS , M. G., H OERING , A. AND S ELF, S. G. (2003). On the analysis of viral load endpoints in HIV vaccine
trials. Statistics in Medicine 22, 2281–2298.
J EMIAI , Y. AND ROTNITZKY, A. (2005). Semiparametric methods for inferring treatment effects on outcomes defined
only if a post-randomization event occurs, [Unpublished PhD. Dissertation]. Harvard School of Public Health,
Department of Biostatistics.
Semiparametric estimation of the average causal effect of treatment
47
J EMIAI , Y., ROTNITZKY, A., S HEPHERD , B. AND G ILBERT, P. B. (2007). Semiparametric estimation of treatment
effects given base-line covariates on an outcome measured after a post-randomization event occurs. Journal of the
Royal Statistical Society, Series B 69, 879–902.
J OFFE , M. M., S MALL , D. AND H SU , C.-Y. (2007). Defining and estimating intervention effects for groups that will
develop an auxiliary outcome. Statistical Science 22, 74–97.
K ANG , J. D. Y. AND S CHAFER , J. L. (2007). Demystifying double robustness: a comparison of alternative strategies
for estimating a population mean from incomplete data. Statistical Science 22, 523–539.
L ITTLE , R. AND A N , H. (2004). Robust likelihood-based analysis of multivariate data with missing values. Statistica
Sinica 14, 949–968.
L ITTLE , R. J. A.
AND
RUBIN , D. B. (2002). Statistical Analysis with Missing Data. New York: Wiley.
L ITTLE , R AND Z HANG , G. (2008). Extensions of the penalized spline of propensity prediction method of imputation. Biometrics, doi: 10.1111/j.1541-0420.2008.01155.x.
M EHROTRA , D. V., L I , X. AND G ILBERT, P. B. (2006). A comparison of eight methods for the dual-endpoint
evaluation of efficacy in a proof-of-concept HIV vaccine trial. Biometrics 62, 893–900.
M OGG , R. AND M EHROTRA , D. V. (2007). Analysis of antiretroviral immunotherapy trials with potentially
non-normal and incomplete longitudinal data. Statistics in Medicine 26, 484–497.
P ITISUTTITHUM , P., G ILBERT, P. B., G URWITH , M., H EYWARD , W., M ARTIN , M., VAN G RIENSVEN , F.,
H U , D., TAPPERO , J. W., C HOOPANYA , K., T HE BANGKOK VACCINE E VALUATION G ROUP. (2006).
Randomized, double-blind, placebo-controlled efficacy trial of a bivalent recombinant glycoprotein 120 HIV-1
vaccine among injection drug users in Bangkok, Thailand. The Journal of Infectious Diseases 194, 1661–1671.
ROBINS , J. M., ROTNITZKY, A. AND Z HAO , L. P. (1995). Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association 90, 106–121.
ROSENBAUM , P. R. AND RUBIN , D. B. (1983). Assessing sensitivity to an unobserved binary covariate in an
observational study with binary outcome. The Journal of the Royal Statistical Society, Series B 45, 212–218.
RUBIN , D. B. (1978). Bayesian inference for causal effects. Annals of Statistics 6, 34–58.
RUBIN , D. B. (2000). Comment on “Causal inference without counterfactuals” by A.P. Dawid. Journal of the
American Statistical Association 95, 435–437.
S HEPHERD , B., G ILBERT, P. B., J EMIAI , Y. AND ROTNITZKY, A. (2006). Sensitivity analyses comparing outcomes
only existing in a subset selected post-randomization, conditional on covariates, with application to HIV vaccine
trials. Biometrics 62, 332–342.
S HEPHERD , B., G ILBERT, P. B. AND L UMLEY, T. (2007). Sensitivity analyses comparing time-to-event outcomes
only existing in a subset selected post-randomization, conditional on covariates, with application to HIV vaccine
trials. Journal of the American Statistical Association 102, 573–582.
S HEPHERD , B., G ILBERT, P. B. AND M EHROTRA , D. V. (2006). Eliciting a counterfactual sensitivity parameter.
The American Statistician 61, 56–63.
S HEPHERD , B. E., R EDMAN , M. W. AND A NKERST, D. P. (2008). Does finasteride affect the severity of prostate
cancer? A causal sensitivity analysis. Journal of the American Statistical Association 103, 1392–1404.
Z HANG , J. L. AND RUBIN , D. B. (2003). Estimation of causal effects via principal stratification when some outcomes
are truncated by “death.” Journal of Educational and Behavioral Statistics 28, 353–368.
[Received December 30, 2008; revised July 29, 2009; accepted for publication July 30, 2009]