American Journal of Epidemiology
© The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of
Public Health. All rights reserved. For permissions, please e-mail: [email protected].
Vol. 182, No. 6
DOI: 10.1093/aje/kwv098
Advance Access publication:
August 26, 2015
Practice of Epidemiology
The Choice of Analytical Strategies in Inverse-Probability-of-Treatment–Weighted
Analysis: A Simulation Study
Shibing Yang, Juan Lu, Charles B. Eaton, Spencer Harpe, and Kate L. Lapane*
* Correspondence to Dr. Kate L. Lapane, Department of Quantitative Health Sciences, University of Massachusetts Medical School,
55 Lake Avenue North, Worcester, MA 01655-0002 (e-mail: [email protected]).
Initially submitted September 23, 2014; accepted for publication April 6, 2015.
We sought to explore the impact of intention to treat and complex treatment use assumptions made during weight
construction on the validity and precision of estimates derived from inverse-probability-of-treatment–weighted analysis. We simulated data assuming a nonexperimental design that attempted to quantify the effect of statin on lowering low-density lipoprotein cholesterol. We created 324 scenarios by varying parameter values (effect size,
sample size, adherence level, probability of treatment initiation, associations between low-density lipoprotein cholesterol and treatment initiation and continuation). Four analytical approaches were used: 1) assuming intention to
treat; 2) assuming complex mechanisms of treatment use; 3) assuming a simple mechanism of treatment use; and
4) assuming invariant confounders. With a continuous outcome, estimates assuming intention to treat were biased
toward the null when the treatment effect was nonnull and patients were nonadherent after treatment initiation. For each 1%
decrease in the proportion of patients staying on treatment after initiation, the bias in estimated average treatment
effect increased by 1%. Inverse-probability-of-treatment–weighted analyses that took into account the complex
mechanisms of treatment use generated approximately unbiased estimates. Studies estimating the actual effect
of a time-varying treatment need to consider the complex mechanisms of treatment use during weight construction.
as-treated analysis; data simulation; intention to treat; marginal structural models
Abbreviations: IPTW, inverse-probability-of-treatment–weighted estimation; ITT, intention to treat; LDL-C, low-density lipoprotein
cholesterol; MSM, marginal structural model.
Am J Epidemiol. 2015;182(6):520–527
Inverse-probability-of-treatment–weighted estimation (IPTW) of marginal structural models (MSMs) has been increasingly used to adjust for time-varying confounding in nonexperimental studies (1, 2). Unlike conventional methods, IPTW controls for confounding through assigning a weight to each participant, which is proportional to the inverse of the probability of receiving the observed history of treatment use (3, 4). In the presence of time-varying confounders influenced by prior treatments, IPTW can adjust for the confounding without blocking the indirect effects mediated by confounders or introducing selection bias (3, 4).
The validity of the IPTW method relies on the correct estimation of the conditional probability of receiving observed treatment (4, 5). Many studies applying IPTW have made an intention-to-treat (ITT) assumption (2), which means that once treatment is initiated, patients are assumed to stay on that treatment for the remaining study period (6). Invoking this assumption simplifies the weight construction process and the assumption of no uncontrolled confounding (2, 6, 7). However, violating the ITT assumption to some degree is common (8). In routine clinical practice in the United States, about one third to one half of the patients do not take medications as prescribed by their doctors (9). When nonadherence is substantial, ITT analyses estimate the effect of initiating a treatment rather than the actual treatment effect (10). Depending on the reasons for discontinuation, the ITT approach may be perfectly reasonable. Indeed, the first discussion (to our knowledge) of the ITT assumption argued that the ITT estimator was in fact the parameter of public health interest owing to the likelihood that reasons for discontinuation stemmed from toxicity (11).
Studies using IPTW often chose an as-treated analytical strategy (2); that is, they categorized patients according to the treatment actually received by the patients during the
study period (10, 12). Different from ITT analyses, as-treated
analyses attempt to estimate the effect of actual treatment
(10). To correctly estimate the actual effect of a time-varying
treatment, however, investigators need to consider the complex mechanisms of treatment use in the process of weight
construction (6). In this study, by “complex mechanism”
we mean that the impacts of potential confounders are different for different treatment regimens. For instance, several applications of IPTW have demonstrated that the relationships
of confounders to initiating the treatment under study were
different from continuing the treatment (13, 14). However,
few studies performing as-treated analyses actually consider
complex mechanisms of treatment use in their analysis (2).
To our knowledge, no previous study has evaluated the impact of adopting different analytical strategies on the performance of IPTW estimates. The objective of this study was to
compare the validity and precision of IPTW estimates derived
from several commonly used analytical approaches to constructing weights.
METHODS
This research did not require ethics review because no
human subjects were involved.
Data generation
We generated data assuming a nonexperimental design
that attempted to answer a hypothetical study question: “What
is the effect of taking statin for 12 weeks by the entire sample
on lowering low-density lipoprotein cholesterol (LDL-C)?”
We chose this question because the efficacy of statin on lowering LDL-C has been established (15, 16) and because patterns of statin use among patients with hypercholesterolemia
were extensively studied (17–21). These studies provided us
the parameters to generate data that mimicked the real-world
situation (22).
Data were generated on the basis of the causal diagram in
Figure 1. In this diagram, LDL denotes levels of LDL-C,
A indicates use of statin, and subscripts 0, 1, and 2 represent
baseline, 6, and 12 weeks after baseline, respectively. The hypothetical data set included 1,000 patients who were newly
diagnosed with hypercholesterolemia and failed to control
LDL-C through lifestyle changes (15).
[Figure 1 diagram; nodes: LDL0, LDL1, LDL2, A0, A1]
Figure 1. The causal diagram guiding data generation. LDL denotes
levels of low-density lipoprotein cholesterol, A indicates use of statin
medication, and the subscripts 0, 1, and 2 represent baseline, 6 weeks
after baseline, and 12 weeks after baseline, respectively.
At t0, we assumed that the baseline LDL level (LDL0) was
normally distributed, with a mean of 130 mg/dL and a standard deviation of 35 mg/dL (23, 24). To simplify the discussion, we further assumed that the probability of initiating
treatment (A0) depended on the level of LDL0 and that A0 had
a binomial distribution. The mean of A0 was generated from
the following formula:
$$\operatorname{logit}\bigl(\Pr(A_0 = 1 \mid \mathrm{LDL}_0)\bigr) = \alpha_0 + \log(\mathrm{OR}_{\text{LDL-Initiation}}) \times \text{LDL\_level}_0, \tag{1}$$
where α0, ORLDL-Initiation, and LDL_level0 denote the intercept, the odds ratio of starting statin treatment comparing a higher LDL-C level with a lower level, and the baseline LDL-C level in categorical format, respectively. LDL_level values of 0, 1, and 2 correspond to LDL0 less than 160, between 160 and 190, and greater than 190 mg/dL, respectively (15).
ORLDL-Initiation was set at 1.5 on the basis of the literature
that a higher LDL-C level was associated with a greater probability of initiating statin treatment (18, 25). α0 was set at
0.3146, so that 60% of the study participants initiated treatment at t0 (17).
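As a rough check on formula 1, the marginal probability of initiation implied by α0 = 0.3146 can be computed directly from the stated distribution of LDL0. The sketch below is illustrative only and is not the authors' code: the function names `expit` and `initiation_probability` are hypothetical, and it assumes the category cut points fall exactly at 160 and 190 mg/dL.

```python
import math
from statistics import NormalDist


def expit(x: float) -> float:
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def initiation_probability(alpha0: float = 0.3146, or_init: float = 1.5,
                           mean: float = 130.0, sd: float = 35.0) -> float:
    """Marginal Pr(A0 = 1) implied by formula 1 (hypothetical helper)."""
    ldl0 = NormalDist(mean, sd)
    # Probability of each LDL_level0 category: <160, 160-190, >190 mg/dL.
    p_level = [
        ldl0.cdf(160.0),
        ldl0.cdf(190.0) - ldl0.cdf(160.0),
        1.0 - ldl0.cdf(190.0),
    ]
    # Average the conditional initiation probability over the categories.
    return sum(p * expit(alpha0 + math.log(or_init) * level)
               for level, p in enumerate(p_level))


if __name__ == "__main__":
    print(round(initiation_probability(), 3))
```

With the base-case values, this evaluates to approximately 0.60, in line with the 60% initiation described above.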
At t1, LDL-C was assumed to be reduced on average by 30%
from LDL0 among those who initiated treatment at t0 and remained unchanged among those who did not (16, 26). A random error was added to LDL1 so that its standard deviation
was ∼40 mg/dL. A1 was generated separately for those who
did not initiate treatment at t0 (i.e., A0 = 0) and those who did
(i.e., A0 = 1). Among those with A0 = 0, A1 was generated in
the same way as A0 using formula 1, except that A1 was determined by levels of LDL1 instead of LDL0. For those with
A0 = 1, the probability of continuing treatment at t1 depended
on the reduction in LDL-C from t0 to t1. In this study, we defined adherence level as the probability of continuing treatment at t1 among those who started treatment at t0. Among
those with A0 = 1, A1 was simulated from a binomial distribution with its mean generated from the following formula:
$$\operatorname{logit}\bigl(\Pr(A_1 = 1 \mid A_0 = 1, \mathrm{LDL}_0, \mathrm{LDL}_1)\bigr) = \gamma_0 + \log(\mathrm{OR}_{\text{LDL-Continuation}}) \times \text{LDL\_Red}, \tag{2}$$
where LDL_Red was 0 if reduction in LDL-C was less than
30% of LDL0 and 1 if the reduction was greater than 30% of
LDL0. Reduction by 30% of LDL0 was the average change in
LDL-C from t0 to t1 among those with A0 = 1, so LDL_Red
was actually a dummy variable with value 1 indicating an
above-average reduction in LDL-C. ORLDL-Continuation was
set at 1.5 so that patients with an above-average reduction
in LDL-C had 50% higher odds of continuing statin treatment
compared with those having below-average reduction (20,
21). γ0 was set at 0.6528 so that 70% of those with A0 = 1
continued treatment at t1 (19, 20).
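The link between γ0 and the adherence level in formula 2 can be sketched numerically. This is a hedged illustration, not the authors' procedure. It assumes that half of the initiators have LDL_Red = 1 (`p_red = 0.5`), which would hold if the random error around the 30% average reduction is symmetric. The names `adherence` and `solve_gamma0` are hypothetical, and the text reports no intercepts for the alternative adherence levels.

```python
import math


def expit(x: float) -> float:
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def adherence(gamma0: float, or_cont: float = 1.5, p_red: float = 0.5) -> float:
    """Marginal Pr(A1 = 1 | A0 = 1) implied by formula 2.

    p_red is the ASSUMED share of initiators with LDL_Red = 1.
    """
    return ((1.0 - p_red) * expit(gamma0)
            + p_red * expit(gamma0 + math.log(or_cont)))


def solve_gamma0(target: float, or_cont: float = 1.5, p_red: float = 0.5) -> float:
    """Find the intercept giving a target adherence by bisection."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    lo, hi = -20.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        # adherence() increases with gamma0, so move the bracket accordingly.
        if adherence(mid, or_cont, p_red) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(round(adherence(0.6528), 3))
    print(round(solve_gamma0(0.70), 4))
```

Under that assumption, γ0 = 0.6528 corresponds to an adherence level of about 70%. An adherence level of exactly 100% has no finite intercept, so the solver rejects it.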
At t2, we assumed that, among patients who initiated treatment at t1 (i.e., with A0 = 0 and A1 = 1), LDL-C on average decreased by 30% from LDL1, which was the same treatment
effect we specified for treatment initiation at t0 on LDL1;
among patients continuing treatment at t1 (i.e., with A0 = 1
and A1 = 1), LDL-C on average decreased by 14.3% from
LDL1, which corresponded to a total decrease of 40% from
LDL0 after 12 weeks of treatment (16, 26); among those discontinuing treatment at t1 (i.e., with A0 = 1 and A1 = 0), LDL-C
rebounded to LDL0 (27, 28); among patients never starting
treatment (i.e., with A0 = 0 and A1 = 0), LDL-C remained unchanged from LDL1. A random error was added to LDL2 so
that its standard deviation was ∼45 mg/dL. Based on these
specifications, the true effect size of 12-week treatment with
statin was a 40% decrease from baseline, that is, 130 mg/dL ×
(−40%) = −52 mg/dL. In this study, we defined effect size as
the difference in LDL-C after treating the entire study sample
with statin for 12 weeks and LDL-C after withholding statin
from the study sample.
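The percentages above can be checked with a short piece of arithmetic. This restates the specifications given in the text and is not an additional result:

$$
(1 - 0.30) \times (1 - 0.143) \approx 0.60, \qquad \text{so} \qquad \mathrm{LDL}_2 \approx 0.60 \times \mathrm{LDL}_0 \quad (\text{a 40\% total decrease}),
$$

$$
\beta = 130\ \text{mg/dL} \times (-0.40) = -52\ \text{mg/dL}.
$$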
To assess the performance of analytical approaches under
various scenarios, besides the base-case scenario described
above, we also generated alternative scenarios with parameter
values varied on the probability of treatment initiation,
ORLDL-Initiation, ORLDL-Continuation, adherence level, effect size,
and sample size. In alternative scenarios, we generated data
with 10% of the patients starting treatment at t0, ORLDL-Initiation
or ORLDL-Continuation equal to 3, adherence level equal to 50%,
60%, 80%, 90%, or 100%, effect size equal to 0 mg/dL or
−26 mg/dL (i.e., 20% decrease from baseline), and sample
size equal to 200 or 20,000. We chose the sample size of
20,000 observations to mimic epidemiologic studies using
an administrative database that often have large sample sizes.
The parameter values for the base-case and alternative scenarios are summarized in Table 1.
Analytical approaches
To evaluate the impact of taking statin for 12 weeks by the
entire study sample, we analyzed the simulated data using
IPTW based on 4 different approaches to constructing weights:
1) IPTW assuming ITT (ITT-IPTW); 2) IPTW assuming
complex mechanisms of treatment use (Complex-IPTW);
3) IPTW assuming a simple mechanism of treatment use
(Simple-IPTW); and 4) IPTW assuming invariant confounders
(Invar-IPTW). For the Complex-IPTW, Simple-IPTW, and
Invar-IPTW approaches, we conducted as-treated analyses
(12). Complex-IPTW acknowledged that the impact of confounders on initiating a treatment was different from their
impact on continuing the treatment. Different models were
specified in estimating the probabilities of treatment initiation
and treatment continuation. Simple-IPTW assumed that confounders had the same impact on initiating and continuing the
treatment. As such, the same model specification was used in
estimating the probabilities of initiating and continuing treatment. Our previous review also found some studies performing
Invar-IPTW analyses, in which they used baseline covariates
to predict changes in treatment status during the follow-up
period (2). The probability of discontinuing the treatment
was estimated on the basis of baseline covariates.
The weight construction process in each method is described in Table 2. In all methods, weights were first estimated separately at t0 and t1, which were the unconditional
probability of receiving observed treatment divided by the
conditional probability of receiving observed treatment given
confounders (4, 29, 30). A patient’s final weight was the
product of his/her weights at t0 and t1 (4, 29, 30). We assumed
that all methods correctly recognized the mechanism of treatment initiation at t0 and, thus, shared the same process of
weight construction at t0. The differences among methods
were in the way of estimating the probability of treatment
use at t1. ITT-IPTW, Complex-IPTW, and Simple-IPTW differed in estimating the conditional probability of continuing
the treatment at t1. ITT-IPTW assumed that the probability of
continuing treatment was 1, and thus the weight was 1 at t1 for
those with A0 = 1; Complex-IPTW assumed that treatment
continuation depended on reduction in LDL-C, which was
consistent with the true data generation process; Simple-IPTW assumed that treatment continuation depended on
levels of LDL-C, which was the same as treatment initiation.
Finally, Invar-IPTW used LDL0 to predict treatment initiation and continuation at t1.
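The weight construction described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names and the `(marginal, conditional)` argument pairs are hypothetical, and the probabilities of A = 1 are assumed to come from models fitted beforehand. In this sketch the Complex-IPTW, Simple-IPTW, and Invar-IPTW approaches differ only in which covariate the supplied conditional probabilities were estimated from.

```python
from typing import Tuple


def stabilized_weight(a: int, p_marginal: float, p_conditional: float) -> float:
    """Pr(A = a) / Pr(A = a | confounders); inputs are probabilities of A = 1."""
    numerator = p_marginal if a == 1 else 1.0 - p_marginal
    denominator = p_conditional if a == 1 else 1.0 - p_conditional
    return numerator / denominator


def weight_t1(approach: str, a0: int, a1: int,
              init: Tuple[float, float], cont: Tuple[float, float]) -> float:
    """Weight at t1.

    init and cont are (marginal, conditional) pairs of Pr(A1 = 1) taken from
    the initiation model (A0 = 0) and the continuation model (A0 = 1).
    """
    if a0 == 0:
        return stabilized_weight(a1, *init)
    if approach == "ITT":
        return 1.0  # ITT-IPTW: continuation is assumed certain
    return stabilized_weight(a1, *cont)


def final_weight(w0: float, w1: float) -> float:
    """A patient's final weight is the product of the weights at t0 and t1."""
    return w0 * w1


if __name__ == "__main__":
    # Hypothetical patient: initiated at t0, discontinued at t1.
    cont = (0.70, 0.742)   # illustrative marginal and conditional probabilities
    init = (0.60, 0.58)    # unused here because a0 == 1
    print(weight_t1("Complex", 1, 0, init, cont))
    print(weight_t1("ITT", 1, 0, init, cont))
```

In the usage example, the discontinuing patient receives a weight above 1 at t1 under an as-treated approach, whereas ITT-IPTW leaves the weight at 1.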
Table 1. Parameter Values Used for Data Generation
Each entry lists the parameter, its meaning, the base-case value (probability in %, OR, or value) with reference number(s), and the values used in alternative scenarios.
Pr(A0 = 1)
  Meaning: Probability of starting statin treatment at baseline or time 1
  Base-case scenario: 60% (reference 17)
  Alternative scenarios: 10%
ORLDL-Initiation
  Meaning: Odds ratio of starting statin treatment comparing a higher LDL-C level with a lower level
  Base-case scenario: 1.5 (reference 18)
  Alternative scenarios: 3
Pr(A1 = 1 | A0 = 1)
  Meaning: Probability of continuing statin treatment at time 1 among those on treatment at baseline (i.e., adherence level)
  Base-case scenario: 70% (references 19, 20)
  Alternative scenarios: 50%, 60%, 80%, 90%, 100%
ORLDL-Continuation
  Meaning: Odds ratio of continuing statin treatment comparing above-average reduction in LDL-C with below-average reduction
  Base-case scenario: 1.5 (references 20, 21)
  Alternative scenarios: 3
β
  Meaning: Effect size: the difference in LDL-C after treating the entire sample for 12 weeks and LDL-C after withholding statin from the entire sample
  Base-case scenario: −52 mg/dL (references 15, 16)
  Alternative scenarios: 0 mg/dL, −26 mg/dL
No.
  Meaning: Sample size
  Base-case scenario: 1,000
  Alternative scenarios: 200; 20,000
Abbreviations: LDL-C, low-density lipoprotein cholesterol; OR, odds ratio.
Table 2. Approaches of Constructing Weights^a
Modeling Approach Designation, followed by Weight Construction
ITT-IPTW: marginal structural models assuming intention to treat
  At t0: w0 = Pr(A0 = a0)/Pr(A0 = a0 | LDL_Level0)
  At t1: If A0 = 0, w1 = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level1); if A0 = 1, w1 = 1
  Final weight: wfinal = w0 × w1
Complex-IPTW: marginal structural models assuming complex mechanisms of treatment use
  At t0: w0 = Pr(A0 = a0)/Pr(A0 = a0 | LDL_Level0)
  At t1: If A0 = 0, w1* = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level1); if A0 = 1, w1* = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Red)
  Final weight: wfinal* = w0 × w1*
Simple-IPTW: marginal structural models assuming simple mechanism of treatment use
  At t0: w0 = Pr(A0 = a0)/Pr(A0 = a0 | LDL_Level0)
  At t1: If A0 = 0, w1** = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level1); if A0 = 1, w1** = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level1)
  Final weight: wfinal** = w0 × w1**
Invar-IPTW: marginal structural models assuming invariant confounders
  At t0: w0 = Pr(A0 = a0)/Pr(A0 = a0 | LDL_Level0)
  At t1: If A0 = 0, w1*** = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level0); if A0 = 1, w1*** = Pr(A1 = a1)/Pr(A1 = a1 | LDL_Level0)
  Final weight: wfinal*** = w0 × w1***
Abbreviations: IPTW, inverse-probability-of-treatment–weighted estimation; ITT, intention to treat.
^a LDL_Levelt equals 0 if LDLt is <160 mg/dL, 1 if 160–190 mg/dL, and 2 if >190 mg/dL; LDL_Red is 0 if LDL0 − LDL1 is ≤30% of LDL0 and 1 if LDL0 − LDL1 is >30% of LDL0; Pr(At = at), unconditional probability of receiving observed treatment at time t; Pr(At = at | LDL_Levelt), conditional probability of receiving observed treatment at time t given the level of low-density lipoprotein cholesterol at time t.
The probability of using treatment given LDL-C was estimated with logistic regression models. For instance, the conditional probability of initiating treatment at t0 given the level
of LDL0 was estimated by using the following logistic model:
$$\operatorname{logit}\bigl(\Pr(A_0 = 1 \mid \text{LDL\_Level}_0)\bigr) = \eta_0 + \eta_1 \times \text{LDL\_Level}_0. \tag{3}$$
For those with A0 = 0, the probability of receiving observed
treatment at t0 was 1 minus the predicted probability derived
from model 3.
After the final weight was constructed for each subject, the
second step was to fit a weighted outcome model to estimate
the effect of statin on LDL2. We used a linear model:
$$\mathrm{LDL}_2 = \beta_0 + \beta_1 \times A_{11} + \beta_2 \times A_{10} + \beta_3 \times A_{01} + \varepsilon, \tag{4}$$
where A11 indicates statin use at both t0 and t1, and A10 and A01 indicate statin use only at t0 and only at t1, respectively. Because ITT-IPTW assumed that no patients discontinued the treatment once they initiated it, in ITT-IPTW analyses, A11 represents treatment initiation at t0, A01 represents treatment initiation at t1, and A10 is always 0. The parameter of interest in this study is β1, which estimates the difference between LDL-C after the study population was treated with statin for 12 weeks and LDL-C when none of the population was treated with statin (4).
Assessment of method performance
We simulated 2,000 data sets, and with each data set we performed analyses with the 4 approaches described above. Thus, each analytical method generated 2,000 estimates. We evaluated the validity of different methods using percentage bias, which was calculated as the difference between the average of 2,000 estimates and the true effect size divided by the true effect size (22). To compare the precision of estimates derived from different methods, we calculated the standard deviation of the 2,000 estimates under each scenario (22).
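The outcome model and the two performance measures can be sketched as follows. This is an illustrative sketch, not the authors' code, and the function names and the string labels for treatment groups are hypothetical. It relies on the fact that A11, A10, and A01 are mutually exclusive indicators. A weighted least-squares fit of the outcome model therefore reproduces the weighted group means, and β1 equals the weighted mean of LDL2 in the always-treated group minus that in the never-treated group. The sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def beta1_estimate(group: Sequence[str], ldl2: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Weighted mean LDL2 in group "11" minus weighted mean in group "00".

    group holds one label per patient: "11", "10", "01", or "00".
    Under ITT-IPTW, every patient with A0 = 1 would be labeled "11".
    """
    def group_mean(label: str) -> float:
        rows = [(y, w) for g, y, w in zip(group, ldl2, weights) if g == label]
        return weighted_mean([y for y, _ in rows], [w for _, w in rows])

    return group_mean("11") - group_mean("00")


def percent_bias(estimates: Sequence[float], true_effect: float) -> float:
    """(mean of estimates - true effect) / true effect, in percent."""
    if true_effect == 0:
        raise ValueError("percentage bias is undefined for a null effect")
    return (mean(estimates) - true_effect) / true_effect * 100.0


def empirical_se(estimates: Sequence[float]) -> float:
    """Standard deviation of the estimates across simulated data sets."""
    return stdev(estimates)


if __name__ == "__main__":
    print(beta1_estimate(["11", "11", "00", "00"], [80, 70, 130, 120], [1, 3, 1, 1]))
    print(percent_bias([-46.8, -46.8], -52.0))
```

With a negative true effect, a negative percentage bias from this function corresponds to an estimate that is closer to the null than the truth.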
RESULTS
We simulated 324 scenarios with parameter values varied
on effect size (3 options), sample size (3 options), adherence
level (6 options), probability of treatment initiation (2 options), and associations between LDL-C and treatment initiation and continuation (3 options). Figures 2 and 3 show
results for scenarios with β, n, Pr(A0 = 1), ORLDL-Continuation,
and ORLDL-Initiation set at the base-case values, as well as scenarios in which we changed 1 parameter at a time while keeping all the others at their base values. To fully illustrate the
impact of nonadherence on the performance of different approaches, we reported results under all 6 adherence levels.
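The scenario count can be reproduced by enumerating the grid of parameter values. The sketch below is illustrative. The three odds-ratio combinations in `OR_PAIRS` are an assumption: the text states only that there were 3 options, and the figure legends confirm the pairs (1.5, 1.5), (3, 1.5), and (1.5, 3). All variable names are hypothetical.

```python
from itertools import product

EFFECT_SIZES = [-52, -26, 0]                 # mg/dL
SAMPLE_SIZES = [200, 1000, 20000]
ADHERENCE_LEVELS = [50, 60, 70, 80, 90, 100]  # %
INITIATION_PROBS = [60, 10]                  # %
# ASSUMED combinations of (OR for initiation, OR for continuation).
OR_PAIRS = [(1.5, 1.5), (3.0, 1.5), (1.5, 3.0)]

scenarios = [
    {
        "effect_size": beta,
        "sample_size": n,
        "adherence": adherence,
        "initiation_prob": p_init,
        "or_initiation": or_init,
        "or_continuation": or_cont,
    }
    for beta, n, adherence, p_init, (or_init, or_cont) in product(
        EFFECT_SIZES, SAMPLE_SIZES, ADHERENCE_LEVELS, INITIATION_PROBS, OR_PAIRS
    )
]

if __name__ == "__main__":
    print(len(scenarios))  # 3 x 3 x 6 x 2 x 3
```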
Figure 2 shows the simulated bias in estimates from the 4
analytical approaches. When there was no treatment effect,
ITT-IPTW estimates were close to the true effect size regardless of adherence levels (Figure 2A). When the true effect was
nonnull and the adherence level was less than 100%, ITT-IPTW estimates were biased toward the null (Figure 2B–
2F). Bias in ITT-IPTW estimates was not influenced by the
probability of treatment initiation, magnitude of confounding,
sample size, or effect size (results for scenarios with an effect
size of −26 mg/dL were not shown but similar to those with an
effect size of −52 mg/dL). In these simulations, the extent of
bias in ITT-IPTW estimates was linearly correlated with adherence levels: a 1% decrease in the proportion of patients staying
on treatment after initiation was associated with an approximately 1% increase in the bias in ITT-IPTW estimates.
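One heuristic reading of this linear relationship is offered here as a sketch, not as a result from the simulations. Let p denote the adherence level and set aside the confounding adjustment. Under ITT-IPTW, everyone who initiated at t0 is classified as treated for 12 weeks. By the data-generating assumptions, the fraction p who continued experienced the full effect β, while the fraction 1 − p who discontinued rebounded to LDL0 and so contributed no effect. The symbols p and \hat{\beta}_{\mathrm{ITT}} are introduced only for this illustration.

$$
E\bigl[\hat{\beta}_{\mathrm{ITT}}\bigr] \approx p\,\beta + (1 - p) \times 0 = p\,\beta,
\qquad
\text{Bias}\,(\%) \approx \frac{p\,\beta - \beta}{\beta} \times 100\% = -(1 - p) \times 100\%.
$$

Under this approximation, p = 0.90 gives a bias of about −10%, and each 1-percentage-point drop in adherence adds about 1 percentage point of bias toward the null.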
[Figure 2, panels A) through F). x-axis: Adherence Level, % (100 to 50); y-axis: Bias, mean in A) and Bias, % in B) through F).]
Figure 2. Simulated bias in 4 analytical approaches with marginal structural models under various scenarios. Bias was calculated as the mean of effect estimates from 2,000 trials, and $\text{Bias}\,(\%) = \bigl[\sum (\hat{\beta} - \beta)/\beta \times 100\%\bigr]/2{,}000$. In A), data points above the horizontal line y = 0 indicate that estimates were biased upward, whereas in B) through F), data points above the line y = 0 indicate estimates were biased downward. The lines marked with diamonds denote ITT-IPTW estimates, squares denote Complex-IPTW estimates, triangles denote Simple-IPTW estimates, and circles denote Invar-IPTW estimates. Parameter setup: In A), β = 0 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in B), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in C), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 3, sample size = 1,000, Pr(A0 = 1) = 60%; in D), β = −52 mg/dL, ORLDL-Initiation = 3, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in E), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 10%; and in F), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 200, Pr(A0 = 1) = 60%. IPTW, inverse-probability-of-treatment–weighted estimation; ITT, intention to treat.
Complex-IPTW estimates were approximately unbiased regardless of the effect size or choices of other parameter values.
When the sample size was 200 (Figure 2F) or ORLDL-Initiation
was 3 (Figure 2D), Complex-IPTW estimates were biased upward by less than 2%. Under other scenarios with nonnull
treatment effect as shown in Figure 2, Complex-IPTW estimates were biased by less than 0.5%. Simple-IPTW estimates
were biased downward under all scenarios except some scenarios with a sample size of 200 or ORLDL-Initiation of 3. This
downward bias became more evident when ORLDL-Continuation
was 3 (Figure 2C) or the adherence rate decreased. Invar-IPTW estimates were biased upward for most scenarios.
This upward bias became more evident when ORLDL-Initiation
was 3, but less so when ORLDL-Continuation was 3 or the adherence rate decreased.
The empirical standard errors of estimates derived from the
4 approaches are shown in Figure 3. Under scenarios with no
treatment effect, standard errors of ITT-IPTW estimates increased when 10% of the study sample initiated treatment
or ORLDL-Continuation was 3 (data not shown), but they did
not depend on levels of adherence (Figure 3A). When the
treatment effect was nonnull, standard errors of ITT-IPTW
estimates increased along with the levels of nonadherence,
and this relationship was also observed for estimates derived
from the other 3 approaches (Figure 3B–3F).
Compared with ITT-IPTW estimates, Complex-IPTW estimates had larger standard errors when there was no treatment effect, but smaller or similar standard errors when the
treatment effect was nonnull. Compared with Complex-IPTW
estimates, Simple-IPTW estimates had slightly larger standard
errors under all scenarios except those with no treatment effects, and Invar-IPTW estimates had slightly larger standard errors under all scenarios except those with a sample size of 200 (Figure 3F) or ORLDL-Initiation of 3 (Figure 3D).
[Figure 3, panels A) through F). x-axis: Adherence Level, % (100 to 50); y-axis: Standard Error.]
Figure 3. Simulated standard errors of estimates from 4 analytical approaches of marginal structural models under various scenarios. Standard error was the standard deviation of the estimates from 2,000 trials. The lines marked with diamonds denote ITT-IPTW estimates, squares denote Complex-IPTW estimates, triangles denote Simple-IPTW estimates, and circles denote Invar-IPTW estimates. Parameter setup: In A), β = 0 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in B), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in C), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 3, sample size = 1,000, Pr(A0 = 1) = 60%; in D), β = −52 mg/dL, ORLDL-Initiation = 3, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 60%; in E), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 1,000, Pr(A0 = 1) = 10%; and in F), β = −52 mg/dL, ORLDL-Initiation = 1.5, ORLDL-Continuation = 1.5, sample size = 200, Pr(A0 = 1) = 60%. IPTW, inverse-probability-of-treatment–weighted estimation; ITT, intention to treat.
DISCUSSION
Under simple yet realistically constructed scenarios, our
simulation study demonstrated that ITT-IPTW estimates were
biased toward the null when the treatment effect was nonnull and
patients were nonadherent after treatment initiation. The extent
of bias in ITT-IPTW estimates appeared dependent solely on
the level of nonadherence. IPTW analyses that took into account the complex mechanisms of treatment use generated
approximately unbiased estimates without a noticeable sacrifice in precision when the treatment effect was nonnull.
IPTW assuming a simple mechanism of treatment use failed
to correctly model the relationship between LDL-C and treatment continuation. As such, the negative confounding by
LDL-C could not be fully controlled in the weighted population. As expected, this confounding bias became more evident
when the impact of LDL-C on treatment continuation became
stronger (i.e., ORLDL-Continuation = 3). Similarly, the weight
construction process in IPTW assuming invariant confounders did not correctly model the relationships of LDL-C to either treatment initiation or continuation. As the impact of
LDL-C on treatment initiation increased, a positive confounding
bias became more dominant and, as the impact of LDL-C
on treatment continuation increased, a negative bias became
more dominant. With other parameters held constant, as
the adherence level approached 50%, the negative association
between LDL-C and treatment continuation contributed more strongly to the overall association between
LDL-C and statin use at t1 after controlling for LDL0. Therefore, the Simple-IPTW and Invar-IPTW estimates that were
biased by residual confounding of LDL1 became smaller in
value as the adherence rate approached 50%.
Conventional wisdom suggests that IPTW considering the
complex mechanisms of treatment use might generate more
valid but less precise estimates than the ITT-IPTW analyses (6).
Our study further suggested that this was true under scenarios
with null treatment effect but not with nonnull treatment effect.
The standard error of an IPTW estimate probably depends on
the variation of constructed weights (30), variance of the study
exposure, and mean squared error of the outcome model (i.e., a
linear regression model) (31). By incorporating the probability
of treatment continuation, Complex-IPTW analyses generated
weights that had a larger variance than those in ITT-IPTW analyses. This explains the finding that, under scenarios with no
treatment effect, the Complex-IPTW estimates had larger standard errors than ITT-IPTW estimates. However, when there
was a nonnull treatment effect, the standard errors of ITT-IPTW
estimates were probably inflated by the increased mean squared
errors due to the misspecification of the study exposure in outcome models (31).
When the effect of continued treatment is estimated, it is
well known that ITT analyses are biased toward the null
when the effect is nonnull and there is treatment nonadherence (10). However, to our knowledge, this was the first study
that explored the bias in ITT-IPTW estimates in relation to
levels of nonadherence and patterns of confounding. Under the
causal structure assessed in our study, for each 1% decrease in
treatment adherence experienced by the sample, the bias in
the average treatment effect increased by 1%. Even when
the adherence level was as high as 90%, the ITT-IPTW analyses underestimated the treatment effect by ∼10%. This underestimation may be especially problematic for drug safety
studies, because the analyses may miss the harmful medication effects (10). Admittedly, the ITT estimates may be relevant to the decision making of “assigning” a population to
certain medical intervention, because not everyone in the
population will take the assigned treatment (32). However,
as shown in our study, ITT estimates depend on the adherence
level and may be generalizable only to populations with the same
compliance behavior.
Our study demonstrated the necessity of considering the
different relationships between confounders and different
treatment regimens in as-treated analyses. Besides the relationships between LDL-C and statin initiation and continuation illustrated in our study, the phenomenon of complex
treatment use was also noted by other studies (13, 14). Cook
et al. (13) reported that potential confounders, such as occurrence of angina and transient ischemic attacks, were negatively
associated with continuing aspirin treatment but positively
correlated with starting aspirin. To properly perform Complex-IPTW analyses, substantive knowledge regarding the relationships between potential confounders and different treatment
regimens (e.g., initiation, continuation, and resumption) should
guide the specification of treatment models during weight construction. Furthermore, the findings that IPTW estimates assuming invariant confounders were biased by uncontrolled
confounding emphasized the importance of collecting information on time-varying factors that predict the study outcomes
and also bring about changes in treatment use (10).
Several limitations must be considered. First, we simulated
scenarios with treatment use varied only at 2 time points. For
situations involving more time points, the mechanisms of
treatment use become more complicated. For instance, the
impact of confounders on treatment resumption may be
different from their impact on treatment initiation or continuation. In addition, information on time-varying confounders
that bring about changes in treatment use, as well as a large
sample size, is necessary to correctly model the complex
mechanisms of treatment use. Furthermore, studies with more
time points and performing as-treated analyses often need to
make assumptions about the dose-response relationship between treatment use and study outcome (12). Second, to avoid
the problem of noncollapsibility of some effect measures (e.g.,
odds ratio, hazard ratio) and thus simplify the interpretation of
the results, we chose a continuous outcome in this study.
Whether our findings extend to different types of outcomes,
such as time-to-event outcomes or categorical outcomes,
needs to be explored.
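The noncollapsibility problem mentioned above can be seen in a small numerical example. The Python sketch below uses hypothetical risks in 2 equally sized strata with no confounding (treatment is assumed independent of stratum); none of these numbers come from the study. The stratum-specific odds ratios are identical, yet the marginal odds ratio differs from them, whereas the risk difference is the same at both levels.

```python
def odds_ratio(risk_treated, risk_untreated):
    """Odds ratio comparing treated with untreated risk."""
    odds_treated = risk_treated / (1 - risk_treated)
    odds_untreated = risk_untreated / (1 - risk_untreated)
    return odds_treated / odds_untreated


# Hypothetical (risk if treated, risk if untreated) in 2 equally sized strata.
strata = [(0.8, 0.5), (0.5, 0.2)]

stratum_ors = [odds_ratio(t, u) for t, u in strata]   # both are about 4
stratum_rds = [t - u for t, u in strata]              # both are about 0.3

# With equal stratum sizes and treatment independent of stratum,
# marginal risks are simple averages of the stratum-specific risks.
marginal_treated = sum(t for t, _ in strata) / len(strata)
marginal_untreated = sum(u for _, u in strata) / len(strata)

marginal_or = odds_ratio(marginal_treated, marginal_untreated)  # about 3.45
marginal_rd = marginal_treated - marginal_untreated             # about 0.3
```

Because IPTW targets a marginal effect, an odds ratio from a weighted analysis can differ from a conditional estimate even without bias, which is the interpretive difficulty that a continuous outcome (and a difference in means) avoids.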
In conclusion, under a range of simulated scenarios, we
demonstrated that IPTW estimates assuming ITT were biased
toward the null when a nonnull treatment effect and nonadherence
after treatment initiation were both present. With a continuous outcome,
we found that this bias was linearly correlated with nonadherence
levels. Studies attempting to estimate the actual effect of a time-varying treatment should take into account the complex mechanisms of treatment use in the process of weight construction.
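The linear relationship between ITT bias and nonadherence can be motivated by a back-of-the-envelope derivation. This is a simplification rather than the simulation model itself: it assumes that a proportion p of initiators stays on treatment and receives the full effect, that those who discontinue receive no effect, and that noninitiators never start treatment. The symbols (delta for the true effect of sustained treatment, p for the proportion staying on treatment) are introduced here for illustration only.

```latex
\begin{align}
  \hat{\delta}_{\mathrm{ITT}} &\approx p\,\delta + (1 - p)\cdot 0 = p\,\delta, \\
  \text{relative bias} &= \frac{\hat{\delta}_{\mathrm{ITT}} - \delta}{\delta}
                        \approx \frac{p\,\delta - \delta}{\delta} = -(1 - p).
\end{align}
```

Under these assumptions, each 1-percentage-point decrease in p moves the estimate 1 percentage point further toward the null, and the bias vanishes when the true effect is null or adherence is complete.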
ACKNOWLEDGMENTS
Author affiliations: Division of Epidemiology, Department
of Family Medicine and Population Health, Virginia Commonwealth University, Richmond, Virginia (Shibing Yang,
Juan Lu); Center for Primary Care and Prevention, Memorial
Hospital of Rhode Island, Pawtucket, Rhode Island (Charles
B. Eaton); Departments of Family Medicine and Epidemiology, Warren Alpert Medical School and School of Public
Health, Brown University, Providence, Rhode Island (Charles
B. Eaton); College of Pharmacy, Midwestern University,
Downers Grove, Illinois (Spencer Harpe); and Department
of Quantitative Health Sciences, University of Massachusetts
Medical School, Worcester, Massachusetts (Kate L. Lapane).
This study was supported by the National Heart, Lung, and
Blood Institute (contract HHSN268201000020C, reference
no. BAA-NHLBI-AR1006). The OAI is a public-private partnership comprising 5 contracts (N01-AR-2-2258; N01-AR-2-2259; N01-AR-2-2260; N01-AR-2-2261; N01-AR-2-2262)
funded by the National Institutes of Health, a branch of the Department of Health and Human Services, that is conducted by
the OAI Study Investigators. Private funding partners include
Pfizer, Inc., Novartis Pharmaceuticals Corporation, Merck Research Laboratories, and GlaxoSmithKline. Private sector
funding for the OAI is managed by the Foundation for the National Institutes of Health.
Conflict of interest: none declared.
REFERENCES
1. Suarez D, Borràs R, Basagaña X. Differences between marginal
structural models and conventional models in their exposure
effect estimates: a systematic review. Epidemiology. 2011;
22(4):586–588.
Am J Epidemiol. 2015;182(6):520–527
2. Yang S, Eaton CB, Lu J, et al. Application of marginal structural
models in pharmacoepidemiologic studies: a systematic review.
Pharmacoepidemiol Drug Saf. 2014;23(6):560–571.
3. Hernán MA, Hernández-Díaz S, Robins JM. A structural
approach to selection bias. Epidemiology. 2004;15(5):615–625.
4. Robins JM, Hernán MA, Brumback B. Marginal structural
models and causal inference in epidemiology. Epidemiology.
2000;11(5):550–560.
5. Hernán MA, Robins JM. Estimating causal effects from
epidemiological data. J Epidemiol Community Health. 2006;
60(7):578–586.
6. Platt RW, Brookhart MA, Cole SR, et al. An information
criterion for marginal structural models. Stat Med. 2013;32(8):
1383–1393.
7. Neugebauer R, Fireman B, Roy JA, et al. Dynamic marginal
structural modeling to evaluate the comparative effectiveness of
more or less aggressive treatment intensification strategies in
adults with type 2 diabetes. Pharmacoepidemiol Drug Saf.
2012;21(suppl 2):99–113.
8. Osterberg L, Blaschke T. Adherence to medication. N Engl J
Med. 2005;353(5):487–497.
9. Kripalani S, Yao X, Haynes RB. Interventions to enhance
medication adherence in chronic medical conditions: a
systematic review. Arch Intern Med. 2007;167(6):540–550.
10. Hernán MA, Hernández-Díaz S. Beyond the intention-to-treat
in comparative effectiveness research. Clin Trials. 2012;9(1):
48–55.
11. Cole SR, Hernán MA, Margolick JB, et al. Marginal structural
models for estimating the effect of highly active antiretroviral
therapy initiation on CD4 cell count. Am J Epidemiol. 2005;
162(5):471–478.
12. Danaei G, Rodríguez LA, Cantero OF, et al. Observational data
for comparative effectiveness research: an emulation of
randomised trials of statins and primary prevention of coronary
heart disease. Stat Methods Med Res. 2013;22(1):70–96.
13. Cook NR, Cole SR, Buring JE. Aspirin in the primary
prevention of cardiovascular disease in the Women’s Health
Study: effect of noncompliance. Eur J Epidemiol. 2012;27(6):
431–438.
14. Yang S, Eaton CB, McAlindon TE, et al. Effects of
glucosamine and chondroitin supplementation on knee
osteoarthritis: an analysis with marginal structural models.
Arthritis Rheumatol. 2015;67(3):714–723.
15. Expert Panel on Detection, Evaluation, and Treatment of High
Blood Cholesterol in Adults. Executive summary of the third
report of the National Cholesterol Education Program (NCEP)
expert panel on detection, evaluation, and treatment of high
blood cholesterol in adults (Adult Treatment Panel III). JAMA.
2001;285(19):2486–2497.
16. Adams SP, Tsang M, Wright JM. Lipid lowering efficacy
of atorvastatin. Cochrane Database Syst Rev. 2012;12:
CD008226.
17. Centers for Disease Control and Prevention (CDC). Vital signs:
prevalence, treatment, and control of high levels of low-density
lipoprotein cholesterol—United States, 1999–2002 and 2005–
2008. MMWR Morb Mortal Wkly Rep. 2011;60(4):109–114.
18. Caspard H, Chan AK, Walker AM. Compliance with a statin
treatment in a usual-care setting: retrospective database analysis
over 3 years after treatment initiation in health maintenance
organization enrollees with dyslipidemia. Clin Ther. 2005;
27(10):1639–1646.
19. Simons LA, Ortiz M, Calcino G. Long term persistence with
statin therapy—experience in Australia 2006–2010. Aust Fam
Physician. 2011;40(5):319–322.
20. Simons LA, Levis G, Simons J. Apparent discontinuation rates
in patients prescribed lipid-lowering drugs. Med J Aust. 1996;
164(4):208–211.
21. Benner JS, Pollack MF, Smith TW, et al. Association between
short-term effectiveness of statins and long-term adherence to
lipid-lowering therapy. Am J Health Syst Pharm. 2005;62(14):
1468–1475.
22. Burton A, Altman DG, Royston P, et al. The design of
simulation studies in medical statistics. Stat Med. 2006;25(24):
4279–4292.
23. Ballantyne CM, Blazing MA, King TR, et al. Efficacy and
safety of ezetimibe co-administered with simvastatin compared
with atorvastatin in adults with hypercholesterolemia. Am J
Cardiol. 2004;93(12):1487–1494.
24. Faergeman O, Hill L, Windler E, et al. Efficacy and tolerability
of rosuvastatin and atorvastatin when force-titrated in patients
with primary hypercholesterolemia: results from the ECLIPSE
Study. Cardiology. 2008;111(4):219–228.
25. Pearson TA, Laurora I, Chu H, et al. The Lipid Treatment
Assessment Project (L-TAP): a multicenter survey to evaluate
the percentages of dyslipidemic patients receiving
lipid-lowering therapy and achieving low-density lipoprotein
cholesterol goals. Arch Intern Med. 2000;160(4):459–467.
26. Cubeddu LX, Cubeddu RJ, Heimowitz T, et al. Comparative
lipid-lowering effects of policosanol and atorvastatin: a
randomized, parallel, double-blind, placebo-controlled trial. Am
Heart J. 2006;152(5):982.e1–982.e5.
27. Chen H, Ren JY, Xing Y, et al. Short-term withdrawal of
simvastatin induces endothelial dysfunction in patients with
coronary artery disease: a dose-response effect dependent on
endothelial nitric oxide synthase. Int J Cardiol. 2009;131(3):
313–320.
28. van der Harst P, Asselbergs FW, Hillege HL, et al. Effect of
withdrawal of pravastatin therapy on C-reactive protein and
low-density lipoprotein cholesterol. Am J Cardiol. 2007;
100(10):1548–1551.
29. Hernán MA, Brumback B, Robins JM. Marginal structural
models to estimate the causal effect of zidovudine on the
survival of HIV-positive men. Epidemiology. 2000;11(5):
561–570.
30. Cole SR, Hernán MA. Constructing inverse probability weights
for marginal structural models. Am J Epidemiol. 2008;168(6):
656–664.
31. Kutner MH, Nachtsheim CJ, Neter J, et al. Applied Linear
Statistical Models. 5th ed. New York, NY: McGraw-Hill/Irwin;
2005.
32. Shrier I, Steele RJ, Verhagen E, et al. Beyond intention to treat:
What is the right question? Clin Trials. 2014;11(1):28–37.