
HEALTH ECONOMICS
Health Econ. 16: 1205–1225 (2007)
Published online 27 February 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/hec.1217
EXPECTED VALUE OF SAMPLE INFORMATION FOR WEIBULL
SURVIVAL DATA
ALAN BRENNAN* and SAMER A. KHARROUBI†
University of Sheffield, Yorkshire, UK
SUMMARY
Expected value of sample information (EVSI) involves simulating data collection, Bayesian updating, and reexamining decisions. Bayesian updating in Weibull models typically requires Markov chain Monte Carlo (MCMC).
We examine five methods for calculating posterior expected net benefits: two heuristic methods (data lumping
and pseudo-normal); two Bayesian approximation methods (Tierney & Kadane, Brennan & Kharroubi); and the
gold standard MCMC. A case study computes EVSI for 25 study options. We compare accuracy, computation time
and trade-offs of EVSI versus study costs.
Brennan & Kharroubi (B&K) approximates expected net benefits to within 1% of MCMC. Other methods,
data lumping (+54%), pseudo-normal (−5%) and Tierney & Kadane (+11%), are less accurate. B&K also produces
the most accurate EVSI approximation. Pseudo-normal is also reasonably accurate, whilst Tierney & Kadane
consistently underestimates and data lumping exhibits large variance. B&K computation is 12 times faster than the
MCMC method in our case study. Though not always faster, B&K provides most computational efficiency when
net benefits require appreciable computation time and when many MCMC samples are needed.
The methods enable EVSI computation for economic models with Weibull survival parameters. The approach
can generalize to complex multi-state models and to survival analyses using other smooth parametric distributions.
Copyright © 2007 John Wiley & Sons, Ltd.
Received 24 April 2006; Revised 3 November 2006; Accepted 19 December 2006
KEY WORDS:
value of information; sample size; clinical trial design; Weibull model; proportional hazards; cost-effectiveness
INTRODUCTION
Expected value of sample information (EVSI) quantifies the expected value to the decision maker of
obtaining sample information before making a decision (Raiffa, 1968). In health economics, value
of perfect information methods are used in sensitivity analysis and quantifying the potential value of
research (Claxton and Posnett, 1996; Claxton, 1999a; Felli and Hazen, 1998; Meltzer, 2001; Brennan
et al., 2002a,b; Coyle et al., 2003; Yokota and Thompson, 2004a,b; Tappenden et al., 2004), whilst EVSI
is promoted for determining optimum sample sizes and allocation rates in health and clinical studies
(Thompson and Graham, 1996; Claxton et al., 2001; Claxton and Thompson, 2001; Chilcott et al., 2003;
Brennan et al., 2002c; Ades et al., 2004). Mathematically, we assume a decision model with uncertain
parameters θ, a joint prior probability distribution given current evidence H, p(θ|H), and a choice
*Correspondence to: School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, Yorkshire, S1 4DA,
UK. E-mail: a.brennan@sheffield.ac.uk
†Formerly Research Assistant, Centre for Bayesian Statistics and Health Economics (CHEBS), Department of Probability and
Statistics, University of Sheffield, UK. Currently: Lecturer in Statistics, University of York, Yorkshire, UK.
between interventions D = {D1, D2, ..., DN} with net benefit functions NB(Di, θ). A new research study
with a specified Design would provide data X_θI on parameters of interest θI, giving a posterior density
via Bayesian updating p(θ|X_θI), and enabling a revised decision given the data. EVSI is the difference
between the expected value of a decision made after the proposed research and the expected value of a
decision made with only current information (Brennan et al., 2002c; Ades et al., 2004), i.e.

\mathrm{EVSI}(\theta_I, \mathrm{Design}) = E_{X_{\theta_I}}\left[\max_{D}\left\{E_{(\theta_I^{c},\theta_I \mid X_{\theta_I})}\left[NB(D,\theta_I,\theta_I^{c})\right]\right\}\right] - \max_{D}\left\{E_{\theta}\left[NB(D,\theta)\right]\right\}   (1)

where θ = {θI, θI^c} and E_z[f(z)] = \int f(z) p(z)\,dz.
Except in very particular circumstances, current algorithms to compute (1) recommend nested Monte
Carlo sampling combined with Bayesian updating (Brennan et al., 2002c; Ades et al., 2004). First, we use
Monte Carlo to produce a sample of the parameters of interest, θI^sample. Next we use Monte Carlo to
simulate a data-set of a specified sample size and design, X_θI^sample, by sampling data assuming that the true
values of the model parameters of interest are θI^sample. The simulated data-set provides new evidence on
the model parameters, and we then use Bayesian updating to compute or estimate the posterior
probability distribution p(θ|X_θI^sample). Finally, we estimate the net benefit produced by a revised decision
given the data. To do this we use a further Monte Carlo integration conditional on the data-set, sampling
parameters from the posterior p(θ|X_θI^sample), to evaluate the inner expectations for each intervention in the
first term of (1). This whole process is repeated many times: the outer Monte Carlo sampling process is
used to produce different simulated data-sets, and each data-set is followed by a Bayesian update and then a
nested inner Monte Carlo sampling process to estimate the conditional expected net benefits.
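To make the algorithm concrete, a minimal sketch in [R] is given below. This is an illustration rather than the authors' code: sample_prior(), simulate_data(), sample_posterior() and net_benefit() are hypothetical placeholders for the model-specific steps just described.

# Sketch of the nested (two-level) Monte Carlo EVSI algorithm.
# sample_prior(), simulate_data(), sample_posterior() and net_benefit() are
# hypothetical placeholders for the model-specific steps described in the text.
evsi_nested_mc <- function(design, decisions, n_outer = 1000, n_inner = 10000) {
  # Baseline: expected net benefit of each decision under current information
  prior_draws <- sample_prior(n_inner)
  baseline <- max(sapply(decisions, function(d) mean(net_benefit(d, prior_draws))))
  # Outer loop simulates data-sets; inner loop estimates posterior expected net benefits
  max_post_enb <- replicate(n_outer, {
    theta_I <- sample_prior(1)                        # 'true' parameters of interest
    X       <- simulate_data(design, theta_I)         # simulated study data
    post    <- sample_posterior(X, n_inner)           # Bayesian update (e.g. MCMC)
    max(sapply(decisions, function(d) mean(net_benefit(d, post))))
  })
  mean(max_post_enb) - baseline                       # EVSI for this design
}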
The Bayesian updating process can also be complex. To date, EVSI studies (Claxton et al., 2001;
Chilcott et al., 2003; Brennan et al., 2002c; Ades et al., 2004) have focussed on prior probability
distributions and simulated data which are conjugate, simplifying the computation using analytic
formulae for the posterior probability distribution of the model parameters. Computation remains
significant because we have many sampled data-sets, each of which require Bayesian updates and Monte
Carlo simulation to compute the posterior expectation (e.g. 1000 data-sets and 10 000 inner samples per
data-set = 10 million model runs). If the prior and the data are non-conjugate, then the dominant
approach currently is to use Markov Chain Monte Carlo (MCMC) methods (Spiegelhalter et al., 2000)
to undertake Bayesian updating and produce simulated samples from the posterior probability
distribution (e.g. an application in WinBUGS http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml).
This is a substantial computational expense itself and must be repeated for each simulated data-set,
which can result in very substantial computation times.
An increasing number of health economic models utilize the Weibull distribution to examine survival
times. The most common analyses relate to mortality but the distribution can also examine time to
clinical events (e.g. myocardial infarction) or other administration related events (e.g. duration on drug
therapy before switching treatment). The advantages of the Weibull distribution are that: (a) it is a
large family incorporating many different survival curve shapes, (b) the expected mean survival, i.e. the
area under or between survival curves, is easy to compute, and (c) it can be adjusted for covariates using
multivariate Weibull regression. The Weibull distribution can be parameterized in at least four
ways (Collett, 1997; Abrams et al., 1996; Lecoutre et al., 2002; Abernethy, 2000) (see Appendix A).
Here, we follow Abrams et al. (1996), using (θ1, θ2) = (log γ, log λ), and the key equations for the Weibull
follow:

Instantaneous hazard:   h(t) = e^{\theta_2} e^{\theta_1} t^{e^{\theta_1}-1}   (2)

Probability density function:   f(t) = e^{\theta_2} e^{\theta_1} t^{e^{\theta_1}-1} \exp\left(-e^{\theta_2} t^{e^{\theta_1}}\right)   (3)

Mean survival:   E(T) = \left(\frac{1}{e^{\theta_2}}\right)^{1/e^{\theta_1}} \Gamma\left[\frac{1}{e^{\theta_1}} + 1\right], \quad \text{where } \Gamma[x] = \int_0^{\infty} u^{x-1} e^{-u}\,du   (4)

Log likelihood:   \log L(\theta_1,\theta_2) = \sum_{i=1}^{n} d_i(\theta_1+\theta_2) + (e^{\theta_1}-1)\sum_{i=1}^{n} d_i \log(t_i) - \sum_{i=1}^{n} e^{\theta_2} t_i^{e^{\theta_1}}   (5)
The maximum likelihood estimates for the parameters are usually obtained by using numerical methods
(e.g. Newton–Raphson) to maximize the log-likelihood function. If there are covariates x_i for the ith
individual in a study, the proportional hazards model assumes that the hazard can be adjusted
proportionately, i.e. h_i(t) = \exp(\beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}) e^{\theta_2} e^{\theta_1} t^{e^{\theta_1}-1}, where the βi are the log hazard
ratios, and the equations above can be modified accordingly. Although it is becoming common to
characterize the log Weibull parameters (θ1, θ2 and the βi's) with a joint multivariate normal
distribution, the prior normal distribution for the parameters is not conjugate with the Weibull
distributed individual survival data. This results in a computational problem for implementing EVSI
with the Weibull model.
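The maximum likelihood step referred to here (and used later to construct the prior in Table II) can be sketched in [R] as follows. This is our illustration rather than the authors' code, and the simulated data and starting values are purely for demonstration.

# Fit (theta1, theta2) = (log shape, log scale) by maximising log-likelihood (5) with nlm()
set.seed(1)
true_gamma <- 1.1; true_lambda <- 0.2                      # Collett-style (gamma, lambda)
t_obs <- rweibull(44, shape = true_gamma, scale = true_lambda^(-1 / true_gamma))
d_obs <- as.numeric(t_obs <= 6); t_obs <- pmin(t_obs, 6)   # administrative censoring at 6 years

neg_loglik <- function(theta, times, died) {               # minus log-likelihood (5)
  g <- exp(theta[1]); lam <- exp(theta[2])
  -(sum(died * (theta[1] + theta[2])) + (g - 1) * sum(died * log(times)) - lam * sum(times^g))
}
fit <- nlm(neg_loglik, p = c(0, -1), times = t_obs, died = d_obs, hessian = TRUE)
theta_hat <- fit$estimate        # MLE of (theta1, theta2)
V_hat     <- solve(fit$hessian)  # approximate variance matrix for a normal characterisation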
In this study we evaluate 5 methods of Bayesian updating for use in computing EVSI with Weibull
models. The first two are quick heuristic approximations which estimate the posterior Weibull
parameters assuming that they are normally distributed; the third is our own novel Brennan & Kharroubi
Bayesian approximation to estimate the conditional expected net benefit; the fourth is a similar, previously
published Bayesian approximation (Tierney and Kadane, 1986); and the fifth is a full implementation of
MCMC. We test each approach in an illustrative case study building on a classic
example in Collett’s textbook (Collett, 1997). We set out a simple decision model incorporating quality of
life and cost issues together with an illustrative cost function for proposed trials and examine 25 different
proposed study designs. Finally, we discuss implications for future applications and research.
METHODS
Case study and traditional sample size calculation
Collett uses the example of a clinical trial in chronic liver disease to illustrate the traditional sample size
calculation for survival studies (Collett, 1997). Patients with chronic active hepatitis can have rapid
progression to liver failure and early death. Survival data exists on 44 standard therapy patients. We
have digitized the published Kaplan–Meier curve (Collett Figure 9.1) assuming no censoring. There are
33 deaths over 6 years and 11 survivors up to almost 10 years. The new therapy ‘is expected to increase
the five year survival probability from 0.35 to 0.55’ (Collett example 9.1).
The traditional sample size calculation is based on the general proportional hazards model, the log
rank test statistic and the null hypothesis of no difference between treatments (i.e. a true log
proportional hazard θ = 0) (Collett, 1997). Random chance means there is a probability that we could
form incorrect conclusions after obtaining the data, but the larger the sample the less likely this is to
happen. To establish how large the sample needs to be to avoid wrong conclusions, statisticians demand
that clinicians/decision makers define (i) the 'significance level' α: a low but acceptable level of
probability of a type I error, whereby the data suggest a survival difference when the null hypothesis is
actually true, (ii) the clinically significant difference θR: an improvement in survival at which clinicians
would feel happy to adopt the new treatment, and (iii) β: a low but acceptable level of probability of
type II error, whereby one does not reject the null hypothesis even though the true difference is θR.
Collett shows that the sample size is given by

\text{No. of deaths required in study} = \frac{4(z_{\alpha/2}+z_{\beta})^2}{\theta_R^2}   (6)
Table I. Traditional sample size calculation (α = 0.05, β = 0.1)

                           Accrual period (years)
Follow-up (years)          0.25      0.75      1.50
0.25                       7178      1286       907
0.50                       1196       893       755
1.00                        756       756       671
2.00                        550       514       380
3.00                        352       336       282
5.00                        257       245       230
\text{Total sample required } n = \frac{\text{No. of deaths required}}{1 - \frac{1}{6}\left[\bar{S}(f) + 4\bar{S}(0.5a+f) + \bar{S}(a+f)\right]}   (7)

where z_{α/2} and z_β are the upper α/2 and β points of the standard normal distribution; a is the length of
the accrual period during which individuals are recruited, f is the length of the follow-up period after
recruitment is completed (i.e. total study duration is a + f) and \bar{S}(t) = \frac{1}{2}[S_S(t) + S_N(t)], the weighted
average of the survivor functions S_S(t) and S_N(t) for individuals on the standard and new therapies,
respectively, at time t, under the assumption that the log-hazard ratio is θR.
Applying Equation (6) to the case study, with α = 0.05, 1 − β = 0.9 and the reference proportional
hazard ratio (ψR = log(0.55)/log(0.35) ⇒ θR = −0.5621) gives:

\text{No. of deaths required } d_{\text{liver}} = \frac{4 \times (1.96+1.28)^2}{(-0.5621)^2} \approx 133   (8)

Applying Equation (7) with a = 18 months, f = 24 months, S_S(2 years) = 0.70, S_S(4 years) = 0.45 and
S_S(6 years) = 0.25 approximately gives:

n = \frac{133}{0.35} = 380   (9)
We have repeated the same approach here for a number of alternative assumptions about accrual rate a
and follow-up period f (Table I).
Statistical inference underlies the traditional sample sizing approach, and its use to inform adoption
decisions has been criticized (Claxton, 1999b). The approach has an implicit decision rule as follows:
(a) if the data show a survival difference significant at the α level in favour of the new therapy then it
should be adopted, (b) if the data show no proven difference at the α level then retain standard therapy.
It also has an implicit trade-off between the size of the sample (i.e. study costs) and the risk of type I and
type II errors. This implicit trade-off ignores three important factors: (1) the costs of data collection, (2)
the cost and benefit consequences of making a wrong adoption decision, and (3) the likelihood that the true
value of θ actually is θR. To perform this trade-off explicitly we need a health economic model.
Health economic decision model and research cost function
To investigate the computation of EVSI in survival studies we have developed an illustrative case study
health economic decision model. The model is built around the Collett example but introduces
additional parameters which are almost always important in health economic analysis of survival
models, i.e. costs of the treatments, costs of ongoing care and quality of life of patients with the disease.
The model parameters, and their existing prior uncertainty are illustrative (fictional) but serve to
produce a simple yet realistic form of model in order to explore EVSI calculations.
[Figure 1 shows survival curves (% surviving, 0–100%, against years, 0–12): the Kaplan–Meier curve and the
fitted Weibull curve for standard therapy, and the prior expected Weibull curve for the new therapy.]
Figure 1. Case study survival data based on Collett example
The health economic decision model has net benefit functions for a standard and a new treatment.
The first three model parameters are the Weibull shape (θ1), scale (θ2) and the log hazard ratio defining
the relative effect of the new therapy (θ3). Figure 1 shows the source data for standard therapy survival
in Kaplan–Meier form, together with the prior survival curves for each therapy. There are four other
parameters: the mean utility for patients with the disease (θ4), the mean cost per day of ongoing care
(θ5), and the mean cost of standard and new therapy (θ6, θ7). λWTP refers to the societal willingness to
pay for a QALY and essentially converts quality adjusted survival into monetary terms. The net benefit
functions are:
NB_1 = \lambda_{WTP}\left[\left(\frac{1}{e^{\theta_2}}\right)^{1/e^{\theta_1}} \Gamma\left[1 + 1/e^{\theta_1}\right] \times \theta_4\right] - \theta_5 \times \left(\frac{1}{e^{\theta_2}}\right)^{1/e^{\theta_1}} \Gamma\left[1 + 1/e^{\theta_1}\right] - \theta_6   (10)

NB_2 = \lambda_{WTP}\left[\left(\frac{1}{e^{\theta_2} e^{\theta_3}}\right)^{1/e^{\theta_1}} \Gamma\left[1 + 1/e^{\theta_1}\right] \times \theta_4\right] - \theta_5 \times \left(\frac{1}{e^{\theta_2} e^{\theta_3}}\right)^{1/e^{\theta_1}} \Gamma\left[1 + 1/e^{\theta_1}\right] - \theta_7   (11)
Table II shows the prior probability distributions for the uncertain model parameters. Given the data
on the 44 patients and assuming θ1 and θ2 are normally distributed, we find their prior means and
variance matrix using maximum likelihood estimation (MLE), a process undertaken in the [R] statistical
package by specifying the log-likelihood function and using the Newton–Raphson function nlm(). For
θ3, Collett defined a reference log hazard ratio θR = −0.5621 and we assume a process of elicitation has
been undertaken to express uncertainty around this. We also assume that there is no prior information
on any correlation between θ3 and the baseline Weibull parameters θ1 and θ2, and that θ4, θ5, θ6 and θ7
are mutually independent.
A probabilistic sensitivity analysis using 1 million Monte Carlo iterations showed that the new
treatment has higher expected net benefit (£36 601 versus £92 931). However, 28.8% of iterations
showed the standard rather than the new therapy as the optimal decision, partly due to the fact that the
specified variance for the hazard ratio results in some iterations where survival on the new therapy is
actually worse than on standard therapy. The overall EVPI is estimated at £5025 per patient.
Table II. Prior probability density parameters for multi-variate normal distribution of decision model parameters

Parameter              Mean        Variance matrix
θ1  Shape               0.10299     0.02215  −0.03560  0        0      0     0       0
θ2  Scale              −1.62790    −0.03560   0.08752  0        0      0     0       0
θ3  Log Haz ratio      −0.68552     0         0        0.24493  0      0     0       0
θ4  Utility             0.7         0         0        0        0.01   0     0       0
θ5  Cost/day (£)       40           0         0        0        0      100   0       0
θ6  Cost(Std) (£)      5000         0         0        0        0      0     10 000  0
θ7  Cost(New) (£)      15 000       0         0        0        0      0     0       10 000

λWTP = £30 000.
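For illustration, the net benefit functions (10) and (11) and a small-scale version of the probabilistic sensitivity analysis described above can be coded in [R] as follows. This is a sketch rather than the authors' released code; it uses mvrnorm() from the MASS package and the prior mean and variance values as we have transcribed them in Table II.

library(MASS)                                      # for mvrnorm()
lambda_wtp <- 30000
mu <- c(0.10299, -1.62790, -0.68552, 0.7, 40, 5000, 15000)
V  <- diag(c(0.02215, 0.08752, 0.24493, 0.01, 100, 10000, 10000))
V[1, 2] <- V[2, 1] <- -0.03560                     # prior covariance of (theta1, theta2)

mean_surv <- function(th1, th2) (1 / exp(th2))^(1 / exp(th1)) * gamma(1 + 1 / exp(th1))
net_benefit <- function(th) {                      # th: matrix with columns theta1..theta7
  es_std <- mean_surv(th[, 1], th[, 2])
  es_new <- mean_surv(th[, 1], th[, 2] + th[, 3])  # hazard multiplied by exp(theta3)
  cbind(std = lambda_wtp * es_std * th[, 4] - th[, 5] * es_std - th[, 6],
        new = lambda_wtp * es_new * th[, 4] - th[, 5] * es_new - th[, 7])
}

theta <- mvrnorm(100000, mu, V)                    # probabilistic sensitivity analysis draws
nb    <- net_benefit(theta)
colMeans(nb)                                       # expected net benefit of each therapy
mean(apply(nb, 1, max)) - max(colMeans(nb))        # overall EVPI per patient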
The costs of research depend upon the study design options. We examine five alternative sample sizes
(n = 50, 100, 200, 400, 1000) over five study follow-up durations (6 months, 1, 2, 3, 5 years), giving 25
alternative study designs. We examine two scenarios for the cost function for research. In Scenario 1,
study set-up costs are £3 000 000; the cost of initial recruitment is high, with a unit cost per patient who
starts the trial of £10 000; whilst the additional cost per patient year of follow-up is relatively cheap at
£1500. In Scenario 2, the initial recruitment costs are assumed zero but the costs of ongoing follow-up are
much larger at £15 000 per patient year.
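The two cost functions can be written down directly. The sketch below assumes (our reading) that the £3 000 000 set-up cost applies in both scenarios and that unit costs scale with the total number of patients across both arms.

trial_cost <- function(n_per_arm, followup_years, scenario = 1) {
  n_total <- 2 * n_per_arm
  if (scenario == 1) {
    3e6 + 10000 * n_total + 1500 * n_total * followup_years   # costly recruitment, cheap follow-up
  } else {
    3e6 + 15000 * n_total * followup_years                    # zero recruitment cost, costly follow-up
  }                                                           # 3e6 set-up cost assumed in both scenarios
}
trial_cost(100, 5, scenario = 1)    # = 6,500,000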
Bayesian updating and approximating the posterior expectation
Because the prior joint normal distribution for the Weibull parameters is not conjugate with Weibull
data there is no analytic formula for the posterior probability density of θ1, θ2 and θ3. We investigate
five methods for producing estimates of the posterior expected net benefit given a simulated data-set.
The data-set simulation process begins by defining the number of patients (N_new) in each arm of the
proposed trial and the follow-up period planned (D_new). Monte Carlo is used to sample the values of the
log shape, scale and hazard ratio parameters (θ1^sample, θ2^sample and θ3^sample) from their prior distribution.
The next step is to use Monte Carlo sampling from the Weibull distribution to sample a survival time
for each of the patients on standard therapy in the trial (using the R function rweibull for N_new patients
with arguments based on θ1^sample and θ2^sample). Survival times for patients on new therapy are also
sampled (using rweibull with the same shape but a scale parameter adjusted using θ3^sample). The final
step is to apply censoring to the data-set based on the specified follow-up period for the trial. In this case
study, we assumed that all simulated patients who survive beyond the D_new follow-up period are
censored and known to be still alive at time D_new. The simulated data-sets can then be used to test the
five different methods of estimating posterior expected net benefits.
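The data-set simulation step can be sketched as below (again illustrative rather than the authors' code). Note that the scale argument of rweibull() equals λ^(−1/γ) in the Collett parameterization, i.e. exp(−θ2/exp(θ1)).

simulate_trial <- function(theta1, theta2, theta3, n_new, d_new) {
  shape     <- exp(theta1)
  scale_std <- exp(-theta2 / shape)                  # standard therapy scale
  scale_new <- exp(-(theta2 + theta3) / shape)       # hazard multiplied by exp(theta3)
  t_all <- c(rweibull(n_new, shape, scale_std), rweibull(n_new, shape, scale_new))
  data.frame(arm  = rep(c(0, 1), each = n_new),      # 0 = standard, 1 = new therapy
             time = pmin(t_all, d_new),              # administrative censoring at D_new
             died = as.numeric(t_all <= d_new))
}
X_sim <- simulate_trial(0.10299, -1.62790, -0.68552, n_new = 200, d_new = 2)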
Heuristic data lumping. The first method is a simple heuristic which makes several assumptions. We
have prior existing data, e.g. from the Kaplan–Meier curve for the 44 patients. We have a simulated
data-set of N_new patients in each arm. This method simply lumps these two data-sets together, and uses
maximum likelihood estimation on the combined data-set to estimate the posterior parameters θ̂1, θ̂2
and θ̂3 and their associated variance matrix (Brennan and Kharroubi, 2003). Having obtained these
values, we compute the expected net benefit of each treatment by a Monte Carlo integration sampling
the posterior parameters from a joint multi-variate normal distribution.
An advantage is that this method requires relatively little computation. The first major component is
the use of Newton–Raphson or equivalent numerical methods to obtain the maximum likelihood estimates of
the posterior parameters and their associated variance matrix. The second component is the Monte
Carlo sampling to compute the posterior expected net benefits. The main disadvantages are: (a) the
method is not using formal Bayesian methods to update the parameters but rather lumping the two
data-sets together, (b) we are assuming that the existing knowledge is all in the form of individual-level
data, (c) the data when merged together are treated as though they are from the same underlying
distribution and there is no opportunity to use methods for adjusting for any bias, and (d) we assume
the posterior Weibull parameters are normally distributed.
Heuristic assuming normally distributed Weibull parameters. The second method is also a relatively
simple heuristic. We first compute the MLE and variance matrix for the Weibull parameters using prior
data only. We assume that the uncertainty in these parameters (log shape, scale and hazard) can be
correctly characterized with a multivariate normal distribution, Prior ~ N(m, V). Next we take the
simulated data-set alone, and use maximum likelihood estimation to quantify a mean and a variance
matrix for the parameters. Again, we assume that the uncertainty can be correctly characterized with a
multivariate normal distribution, such that Data alone ~ N(m′, V′). Note that if we have a large sample
size and/or a long follow-up period for the simulated trial, then these parameters will be estimated with
greater accuracy and the variance matrix V′ will have smaller variance and covariance elements. To
combine the prior evidence and the simulated data, we then assume that we can use the Bayesian
updating formula for the multi-variate normal distribution to produce posterior parameter estimates
and variance matrix, i.e. Posterior ~ N(m″, V″) where

m'' = (V'^{-1} + V^{-1})^{-1}(V'^{-1}m' + V^{-1}m), \qquad V'' = (V'^{-1} + V^{-1})^{-1}   (12)
Again, to compute the expected net benefit of each treatment given the simulated data-set we undertake
a Monte Carlo process, sampling these posterior parameters from the joint multi-variate normal
posterior distribution.
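Formula (12) amounts to a precision-weighted combination of the two normal approximations and is only a few lines of matrix algebra; the function below is an illustrative sketch assuming the prior (m, V) and data-only (m′, V′) estimates are already available from the MLE steps.

normal_update <- function(m, V, m_dat, V_dat) {      # equation (12)
  V_post <- solve(solve(V_dat) + solve(V))
  m_post <- V_post %*% (solve(V_dat) %*% m_dat + solve(V) %*% m)
  list(mean = as.vector(m_post), var = V_post)
}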
This method also requires relatively little computation. It has the same two major components as data
lumping, i.e. Newton–Raphson to estimate the parameters and associated variance matrix, and the
Monte Carlo sampling to compute the posterior expected net benefits. A further advantage of this
approach is that there is no requirement for the prior evidence to be individual level data. This enables
the use of summary measures available to the analyst (e.g. median survival at 5 years), elicitation
approaches or more complex evidence synthesis methods to produce the prior probability distribution.
The main disadvantage is that the method is not using formal Bayesian methods. It is technically
incorrect because the simulated data are not normally distributed and so the uncertainty in the Weibull
parameters is not automatically normally distributed, i.e. the distributions are not conjugate. We are
assuming that the parameter estimates from the MLE process are normally distributed. The question is,
how good an approximation of the true posterior distribution for these parameters does this assumption
produce? The accuracy of this heuristic (and indeed the data lumping approach) depends on context. If
the prior information is weak (i.e. based on little knowledge and with considerable uncertainty) and the
data very strong (i.e. a large sample size), then the data will dominate and the result could well be almost
equivalent to formal Bayesian updating. This is similarly the case if the priors are strong and the data
collection exercise very small. Between these extremes, these heuristics may have reduced accuracy.
A novel approach to approximating posterior expected net benefit (Brennan & Kharroubi). The third
method is a more complex Bayesian approach, which allows us to estimate the posterior expected net
benefit efficiently. This approach avoids two computationally intensive tasks (i) formal updating to obtain
the full posterior probability distribution, and (ii) Monte Carlo integration. Brennan and Kharroubi
describe the approach in detail elsewhere (Brennan and Kharroubi, 2007), building on earlier work on
Laplace approximations for Bayesian quantities (Sweeting and Kharroubi, 2003). A detailed derivation
(Kharroubi and Brennan, 2005) produces an approximation formula for the posterior expectation of a
real valued function v(θ) of d uncertain parameters (θ = θ1, ..., θd) given particular sample data X:

E\{v(\theta) \mid X\} \approx v(\hat{\theta}) + \sum_{i=1}^{d}\left(a_i^{-}v(\theta_i^{-}) + a_i^{+}v(\theta_i^{+}) - v(\hat{\theta})\right)   (13)
θ̂ is the posterior mode, the values (θ̂1, ..., θ̂d) that maximize the posterior density function given
the data X. Each θi⁻ and θi⁺ is a specific point in the d-dimensional space at which the function
v(θ) is evaluated. A weighted average of these evaluations is taken with weights ai⁻ and ai⁺ applied. The
points θi⁻ and θi⁺, and weights ai⁻ and ai⁺, depend on both the prior probability distribution for
the model parameters θ and on the data X, but are independent of the function v and can therefore
be used to compute posterior expectations of several net benefit functions. To aid intuitive
understanding, Appendix B provides further explanation of the meaning of the points θi⁻ and θi⁺,
and weights ai⁻ and ai⁺. Note that we can consider the first term of (13) to be a first-order
approximation of the posterior expectation (i.e. the function evaluated at the posterior mode) and the
full equation to be a second-order approximation. In Appendix C, we explain briefly how to compute
each component of formula (13). If we let v(θ) = NB(D, θ) for each decision option D in Equation (1),
then we can approximate EVSI by

\mathrm{EVSI} \approx E_{X_{\theta_I}}\left[\max_{D}\left\{NB(D,\hat{\theta}) + \sum_{i=1}^{d}\left(a_i^{-}NB(D,\theta_i^{-}) + a_i^{+}NB(D,\theta_i^{+}) - NB(D,\hat{\theta})\right)\right\}\right] - \max_{D}\left\{E_{\theta}\left[NB(D,\theta)\right]\right\}   (14)
The number of evaluations of the net benefit function required to estimate EVSI using the
approximation formula is 2d + 1 times the number of treatment strategies times the number of
simulated data-sets (outer loops). This can be a much smaller number of evaluations of the net benefit
functions than when computing the expected net benefit by Monte Carlo. For example, if we have a
model with 20 parameters, and 2 treatments, and plan to evaluate EVSI using 1000 simulated data-sets,
then using Equation (14) would require 82 000 net benefit evaluations, whereas using traditional Monte
Carlo based on 10 000 inner samples would require 20 million net benefit evaluations (around 250 times
more). Thus, if the net benefit function itself takes an appreciable time to compute the method could
provide considerable time savings. In the section ‘Computation Time’, we quantify the computation
trade-off for our case study model.
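Once the posterior mode, the points θi± and the weights ai± have been computed for a given simulated data-set (see Appendix C), applying formula (13) is a simple weighted sum. The sketch below assumes those quantities are already held in matrix and vector form and is our illustration rather than the authors' code.

# theta_hat: length-d posterior mode; theta_minus, theta_plus: d x d matrices whose i-th
# rows hold the points theta_i- and theta_i+; a_minus, a_plus: length-d weight vectors.
posterior_expectation <- function(nb_fn, theta_hat, theta_minus, theta_plus, a_minus, a_plus) {
  v_hat <- nb_fn(theta_hat)
  correction <- sum(sapply(seq_along(theta_hat), function(i) {
    a_minus[i] * nb_fn(theta_minus[i, ]) + a_plus[i] * nb_fn(theta_plus[i, ]) - v_hat
  }))
  v_hat + correction    # 2d + 1 evaluations of nb_fn per decision option, as noted in the text
}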
In order to apply Equation (13) in our case study, we need expressions for the log posterior density
and its partial derivative with respect to each parameter. Using Bayes theorem, the posterior
density given a data-set X will be p(θ|X, H) ∝ L(θ) × p(θ|H), where L(θ) is the likelihood function
of the data X given parameters θ and p(θ|H) is the prior for the parameters. In our case study we
have a Weibull proportional hazards model with two treatments (for standard therapy x_i = 0
and for new therapy x_i = 1) and a single parameter describing the log proportional hazard, i.e. θ3.
We defined the prior density of the three Weibull parameters (θ1, θ2, θ3) and the other model
parameters (θ4, ..., θ7) as multivariate normal. If we denote θ as multivariate normal with
dimension d = 7, mean μ and variance matrix Σ, then the prior probability density function can be
written as

\text{Prior } p(\theta \mid H) = \frac{1}{\sqrt{(2\pi)^{d}|\Sigma|}} \exp\left(-0.5(\theta-\mu)^{T}\Sigma^{-1}(\theta-\mu)\right)   (15)
Extending Equation (5) for the Weibull log likelihood to the proportional hazards case, taking logs of
Equation (15) and ignoring constants, we get the log posterior density:

l(\theta) \propto \sum_{i=1}^{n} d_i(\theta_1+\theta_2+\theta_3 x_i) + (e^{\theta_1}-1)\sum_{i=1}^{n} d_i\log(t_i) - \sum_{i=1}^{n} e^{\theta_3 x_i} e^{\theta_2} t_i^{e^{\theta_1}} - 0.5(\theta-\mu)^{T}\Sigma^{-1}(\theta-\mu)   (16)
which we can then maximize to find the posterior mode θ̂, and also use to compute all of the
components of (14). We also require the vector of partial derivatives of the log-likelihood function with
respect to each model variable, as described in Appendix D.
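Written in [R], the log posterior density (16) (up to an additive constant) takes the following form; its negative can be passed to nlm() to locate the posterior mode. The argument names are ours: times, died and x are the simulated survival times, death indicators and treatment covariate (0 = standard, 1 = new), and mu, Sigma are the prior mean vector and variance matrix of all seven parameters.

log_posterior <- function(theta, times, died, x, mu, Sigma) {   # equation (16), up to a constant
  g <- exp(theta[1])
  loglik <- sum(died * (theta[1] + theta[2] + theta[3] * x)) +
            (g - 1) * sum(died * log(times)) -
            sum(exp(theta[3] * x) * exp(theta[2]) * times^g)
  logprior <- -0.5 * drop(t(theta - mu) %*% solve(Sigma) %*% (theta - mu))
  loglik + logprior
}
# e.g. posterior mode: nlm(function(th) -log_posterior(th, times, died, x, mu, Sigma), p = mu)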
The advantages of avoiding two substantial aspects of computation (Bayesian updating to obtain a
posterior distribution and Monte Carlo integration) must be weighed against the computation of θi⁻, θi⁺,
ai⁻ and ai⁺, which requires some conditional maximization using numerical methods. The further main
disadvantage of the method is that it requires the analyst to mathematically describe the log posterior
density function and its partial derivatives.
Tierney & Kadane's Laplace approximation for posterior expectations. In 1986, Tierney and Kadane
produced a method to approximate the posterior expectation. The posterior expectation of v(θ) given n
items of sample data X^(n) is written as

E_n[v] = E[v(\theta) \mid X^{(n)}] = \frac{\int v(\theta)\, e^{\ell(\theta)} p(\theta)\,d\theta}{\int e^{\ell(\theta)} p(\theta)\,d\theta}   (17)

where ℓ(θ) here is the log-likelihood function and p(θ) is the prior density. The essence of the method is
to compute the mode of the numerator and denominator of (17) and apply Laplace's method for
approximating integrals. Tierney and Kadane define two functions, L_T = [log p + ℓ]/n and L*_T =
[log(v) + log p + ℓ]/n, and their modes θ̂_T and θ̂*_T respectively. Then Tierney & Kadane show that:

E_n[v] = \left(\frac{\det\Sigma^{*}}{\det\Sigma}\right)^{1/2} \exp\left\{n\left[L_T^{*}(\hat{\theta}_T^{*}) - L_T(\hat{\theta}_T)\right]\right\}   (18)

where Σ* and Σ are minus the inverse Hessian matrix of L*_T and L_T evaluated at θ̂*_T and θ̂_T,
respectively.
Computation of the Tierney and Kadane approximation requires us to specify the two functions L_T
and L*_T, compute θ̂*_T and θ̂_T, and minus the inverse Hessians. In practice we use numerical methods to
compute these entities (again nlm() in [R]). The advantages of this method are the same as those for the
Brennan–Kharroubi method above. Our method needs to recompute the components of the formula for
each different simulated data-set, but the components are independent of the net benefit function used
and therefore apply to all of the d = 1 to D strategies. One further disadvantage of Tierney and Kadane
is that we need to recompute the components of its formula (i.e. the L*_T function, θ̂*_T and Σ*) not only
for each data-set, but also separately for each different net benefit function.
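A compact way to see the mechanics of (18) is the sketch below, which reuses a log posterior function such as the one above and assumes that the function v(θ) of interest is strictly positive (since log v appears in L*_T). It is an illustration of the approach rather than the authors' implementation.

tk_expectation <- function(v, log_post, theta_start, n_obs, ...) {
  negLT  <- function(th) -log_post(th, ...) / n_obs                 # minus L_T
  negLTs <- function(th) -(log(v(th)) + log_post(th, ...)) / n_obs  # minus L*_T
  fit  <- nlm(negLT,  theta_start, hessian = TRUE)
  fits <- nlm(negLTs, theta_start, hessian = TRUE)
  Sig  <- solve(fit$hessian)                                        # minus inverse Hessian of L_T
  Sigs <- solve(fits$hessian)                                       # minus inverse Hessian of L*_T
  sqrt(det(Sigs) / det(Sig)) * exp(n_obs * (fit$minimum - fits$minimum))   # equation (18)
}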
Markov chain Monte Carlo sampling in WinBUGS. MCMC is a method for producing samples from the
posterior distribution (Gilks et al., 1996). In health related evidence synthesis, the approach is most
commonly implemented using the statistical package WinBUGS (Spiegelhalter et al., 2001). The
sampling algorithm (e.g. Gibbs sampling) generates an instance from the distribution of each variable in
turn, conditional on the current values of the other variables. After many iterations the process
stabilizes, and it can be shown that the resulting samples are from the joint posterior distribution.
To implement MCMC, the analyst must specify the statistical model for the parameters including the
likelihood function as well as the sample data. We also specify initial values for the posterior
parameters, the number of ‘burn in’ runs to ignore, and the number of samples to generate from the
posterior. In the context of EVSI, we need to repeatedly call WinBUGS to produce samples from the
posterior for each separate sampled data-set. Fortunately we can use the ‘R2WinBUGS’ package to call
WinBUGS from within a program in [R]. This package uses the function ‘bugs.data()’ to prepare and
send the sample data file to WinBUGS, and the function ‘bugs()’ to call WinBUGS, undertake the
MCMC sampling and return the resulting samples from the posterior to [R]. These samples can then be
used within a Monte Carlo integration (probabilistic sensitivity analysis) to produce the posterior
expected net benefit for each treatment.
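A heavily simplified sketch of that workflow is shown below. The BUGS model string, file name and the use of independent normal priors (rather than the joint multivariate normal prior of Table II) are simplifications for illustration only; t_dat, cen_dat and x_dat are assumed to hold the simulated trial data (for deaths, t is the observed time and t.cen = 0; for censored patients, t = NA and t.cen is the censoring time). In WinBUGS, dweib(v, lambda) has survivor function exp(−lambda t^v) and right censoring is handled with the I(lower, ) construct.

library(R2WinBUGS)
model_string <- "
model {
  for (i in 1:n) {
    t[i] ~ dweib(shape, lambda[i]) I(t.cen[i], )
    lambda[i] <- exp(theta2 + theta3 * x[i])
  }
  shape <- exp(theta1)
  theta1 ~ dnorm(0.103, 45.1)       # illustrative independent priors (precision = 1/variance)
  theta2 ~ dnorm(-1.628, 11.4)
  theta3 ~ dnorm(-0.686, 4.08)
}"
writeLines(model_string, "weibull_model.txt")
post <- bugs(data = list(t = t_dat, t.cen = cen_dat, x = x_dat, n = length(x_dat)),
             inits = NULL, parameters.to.save = c("theta1", "theta2", "theta3"),
             model.file = "weibull_model.txt", n.chains = 1,
             n.iter = 15000, n.burnin = 5000)
theta_post <- post$sims.matrix      # posterior samples for the inner Monte Carlo step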
The advantage of MCMC is that it can work on a wide class of distributions and in particular it
solves the problem of Bayesian updating if the prior and data are non-conjugate (such as normal priors
for our survival model parameters and Weibull survival data). It also produces direct samples from the
posterior which can be used immediately in the Monte Carlo simulation to compute the posterior
expected net benefits. The disadvantage, particularly in the context of EVSI, is computation time. There
are three large computation processes required. First we require large numbers of simulated sample
data-sets. Second, for each simulated data-set we require long runs of the MCMC simulation to produce
the approximation to the true posterior. Third, for each generated posterior, we need Monte Carlo
integration to quantify the expected net benefit given the data.
Analysis
To test the accuracy of the five methods in approximating the posterior expected net benefit, we tested
each approach on the same simulated data-sets. We simulated five separate data-sets with 44 patients
for standard therapy and 44 patients on new therapy, assuming censoring at 6 years. These data-sets
vary across the range of possible trial results and were obtained by sampling 10 000 values for the
Weibull parameters from their joint distribution, computing expected survival difference in each case,
picking the 10th, 30th, 50th, 70th and 90th percentiles and simulating a data-set given these values. This
provided 5 sample data-sets which are large enough to affect the prior evidence but not so large as to
dominate the Bayesian update computations. The three methods which use Monte Carlo integration to
compute the posterior expectation were run with 10 000 iterations. The ‘gold-standard’ MCMC method
simulated 10 000 samples after a ‘burn-in’ of 5000.
As a second test we analyzed EVSI estimates for the 25 alternative study designs using each method.
This was based on 1000 simulated data-sets for each study design. Here probabilistic sensitivity analyses
were based on 1000 samples from the joint posterior distribution for the parameters. For the MCMC
method we simulated 1000 samples from the posterior after a 'burn-in' of 1000. Computation times were
also examined, tested on a standard PC (a Pentium 4 1.8 GHz personal computer with 512 MB RAM).
Finally, we examined our Brennan & Kharroubi approximation on its own, again simulating 1000
data-sets to compute estimated EVSI for each of the 25 study design options and trading these off
against the two cost of research scenarios.
RESULTS
Comparison of posterior expectations for 5 sample data-sets
Using five exemplar simulated data-sets, the estimated posterior expected net benefit for new and
standard therapies provide 10 comparisons in all. Figure 2 shows the results for each method when
compared against the MCMC expectation using 10 000 samples which could be considered the ‘gold
standard’ (the MCMC expectation is given a value of 1.00). It can be seen that our own approximation
(B&K) is within 1% of the MCMC based approximation on each occasion. In contrast, the data
lumping heuristic is much more variable resulting in substantially different estimates from the MCMC
(up to 54%). The pseudo-normal heuristic is the next most accurate but consistently underestimates the
posterior expected net benefits (by around 5%). The Tierney and Kadane method has over-estimated
posterior expected net benefits (by on average 11%), a result which has been shown to occur elsewhere
in the circumstance of relatively small numbers of new data points from censored multivariate normal
data distributions (Kharroubi, 2001).
Comparison of EVSI results between methods using 1000 sample data-sets
Figure 3 below shows the MCMC EVSI estimates compared to the other methods using 1000
simulated data-sets. The results show that our approximation (B&K) is very close to the MCMC. The
mean difference across the 25 study design options when expressed on the scale with overall EVPI
indexed to 100 was just 0.04, the maximum difference was 2.4 percentage points and the root mean
square difference 1.2. The next most accurate method was the pseudo-normal approach, with a root
mean squared difference from MCMC of 1.9, but lower accuracy when the simulated data-set is
relatively small. The Tierney and Kadane approximation marginally underestimated the EVSI, on
average by around 4.5 percentage points on the indexed scale. The data lumping method (not shown on
the graph) had much larger errors, with a root mean squared difference from MCMC of almost 20
points.
[Figure 2 is a chart comparing the posterior expected net benefit estimates from B&K, data lumping,
pseudo-normal and T&K in the 5 simulated data-sets (n = 44 in each arm), indexed to MCMC = 1.00, on a
scale from 0.80 to 1.20.]
Figure 2. Comparison of posterior expected net benefit estimates using 5 sample data-sets (indexed to MCMC
posterior expectation = 1.00)

[Figure 3 is a chart of the EVSI estimates (indexed to overall EVPI = 100) from MCMC, B&K, T&K and
pseudo-normal across the 25 study design options.]
Figure 3. Comparison of EVSI results between methods using 1000 sample data-sets

The relation between Figures 2 and 3 is as follows. Figure 2 is based on just 5 sample data-sets and
reports the posterior expectations for each, showing how variable the result can be in comparison to the
MCMC gold standard. Figure 3 is based on 1000 different data-sets for each of the 25 study designs and
reports the EVSI, i.e. the average of the maximum posterior expectations over all data-sets. The
important issue for EVSI is the differences in Figure 3 which shows that our approximation provides
reasonable estimates of EVSI and that the pseudo-normal heuristic is the next most accurate
approximation.
It is also possible to use the first-order version of our approximation, i.e. approximating posterior
expectations with E{v(θ)|X} ≈ v(θ̂), rather than the full version of Equation (13). Applying this in our
case study model, we found that for the smaller and shorter studies, the first-order estimates of EVSI
were higher than the second order (by up to 8 percentage points when indexed to overall EVPI = 100).
As the sample size and study duration increase, the first- and second-order estimates become much
closer, and for the largest proposed trial (1000 patients in each arm, followed for 5 years) the EVSI
estimates were within 0.5 percentage points. This is to be expected because as the sample size increases
the impact of the non-linearity in the net benefit functions, and hence the need for the second-order term,
reduces.
Computation time
The computation time required for the MCMC method to compute EVSI for all 25 study designs based
on just a single simulated data-set using 10 000 iterations for the Monte Carlo integration with a burn-in
of 5000 was 53 min 56 s. If 1000 simulated data-sets were used, this translates to 37.5 days to complete
the EVSI analysis. In contrast, our B&K approximation took 4 min 32 s to compute EVSI for all 25
study designs for a single simulated data-set, which translates to a computation time of 3.1 days for 1000
data-sets. The results show that our approximation achieved equivalent accuracy to MCMC
approximately 12 times faster. Table III below shows the relative speed of B&K approximation versus
MCMC for each individual study design. The approximation shows greater relative speed when the
study duration is longer, i.e. when there are fewer censored patients. The actual sample size makes little
difference to the relative speed of the methods. These computation time comparisons are made using
10 000 iterations for the Monte Carlo sampling of posterior expected net benefits. Reducing the number
of MCMC samples and inner level Monte Carlo sampling both to 1000 reduces accuracy of the EVSI
estimate but still leaves our B&K approximation 1.6 times faster.
Importance of cost function and prevalence for optimal sample sizing
In considering efficient allocation of resources to research we cannot simply look at EVSI per patient
but must think both about the prevalence of patients facing the decision between strategy options and
about the trial costs. The optimal research design is different depending on these parameters. This is
shown using four illustrative scenarios with two different prevalence estimates and two different trial
cost functions (Table IV).
Table III. Computation time comparisons. Relative speed of novel Bayesian approximation versus MCMC with 10 000 iterations

                                   Sample size
Duration of follow-up (years)      50      100     200     400     1000
0.5                                4.1     4.3     3.9     3.9     3.6
1                                  6.0     6.7     6.0     5.9     5.5
2                                  8.2     9.3     8.2     8.0     7.8
3                                 11.5    12.8    10.8    10.7    10.4
5                                 15.2    18.5    15.0    14.3    14.2
Table IV. Population EVSI versus trial costs for first and second illustrative trial cost functions, with population = 10 000 and 5000

                                  Scenario 1                               Scenario 2
                                  Duration of follow-up (D_new)            Duration of follow-up (D_new)
Sample size per arm (N_new)        0.5     1      2      3      5           0.5     1      2      3      5

Part A: Prevalence = 10 000
Population EVSI (£ millions)
  50                               4.2    6.3    9.1   11.6   15.2          4.2    6.3    9.1   11.6   15.2
  100                              6.2    9.4   12.8   14.0   18.7          6.2    9.4   12.8   14.0   18.7
  200                             10.7   12.5   13.5   19.1   18.8         10.7   12.5   13.5   19.1   18.8
  400                             14.8   17.2   18.7   18.2   20.7         14.8   17.2   18.7   18.2   20.7
  1000                            19.5   18.5   18.3   18.7   23.0         19.5   18.5   18.3   18.7   23.0
Trial costs (£ millions)
  50                               4.1    4.2    4.3    4.5    4.8          3.8    4.5    6.0    7.5   10.5
  100                              5.2    5.3    5.6    5.9    6.5          4.5    6.0    9.0   12.0   18.0
  200                              7.3    7.6    8.2    8.8   10.0          6.0    9.0   15.0   21.0   33.0
  400                             11.6   12.2   13.4   14.6   17.0          9.0   15.0   27.0   39.0   63.0
  1000                            24.5   26.0   29.0   32.0   38.0         18.0   33.0   63.0   93.0  153.0
Expected net value of research (£ millions)
  50                               0.2    2.2    4.8    7.1   10.5          0.5    1.8    3.1    4.1    4.7
  100                              1.1    4.1    7.2    8.1   12.2*         1.7    3.4    3.8    2.0    0.7
  200                              3.4    4.9    5.3   10.3    8.8          4.7    3.5   -1.5   -1.9  -14.2
  400                              3.2    5.0    5.3    3.6    3.7          5.8*   2.2   -8.3  -20.8  -42.3
  1000                            -5.0   -7.5  -10.7  -13.3  -15.0          1.5  -14.5  -44.7  -74.3 -130.0

Part B: Prevalence = 5000
Population EVSI (£ millions)
  50                               2.1    3.2    4.6    5.8    7.6          2.1    3.2    4.6    5.8    7.6
  100                              3.1    4.7    6.4    7.0    9.4          3.1    4.7    6.4    7.0    9.4
  200                              5.4    6.3    6.7    9.5    9.4          5.4    6.3    6.7    9.5    9.4
  400                              7.4    8.6    9.4    9.1   10.4          7.4    8.6    9.4    9.1   10.4
  1000                             9.8    9.2    9.2    9.3   11.5          9.8    9.2    9.2    9.3   11.5
Trial costs (£ millions): as in Part A
Expected net value of research (£ millions)
  50                              -2.0   -1.0    0.3    1.3    2.9*        -1.6   -1.3   -1.4   -1.7   -2.9
  100                             -2.0   -0.6    0.8    1.1    2.9         -1.4   -1.3   -2.6   -5.0   -8.6
  200                             -1.9   -1.3   -1.5    0.7   -0.6         -0.6   -2.7   -8.3  -11.5  -23.6
  400                             -4.2   -3.6   -4.0   -5.5   -6.6         -1.6   -6.4  -17.6  -29.9  -52.6
  1000                           -14.7  -16.8  -19.8  -22.7  -26.5         -8.2  -23.8  -53.8  -83.7 -141.5

*Optimal study design for each scenario.
Under Scenario 1 we have a relatively expensive initial recruitment and cheaper follow-up costs. With
a prevalence of 10 000 patients, the optimum study design would have 100 patients per arm, followed up
for 5 years, giving an expected net value of research of £12.2m. Although larger sample sizes would
provide increased EVSI, their increased costs outweigh the benefits, and the expected net value of
research for a study of 1000 patients per arm followed for 5 years would be −£15.0m. When the
alternative (Scenario 2) cost function is used, there is a much higher cost premium to be paid for longer
follow-up. The result is that shorter trials with a large sample size provide the most expected net value of
research. The optimum over the study designs examined is the study with 400 patients per arm followed
for just 6 months. With a population of 5000 ready to benefit, only 7 of the studies have positive
expected value under trial cost Scenario 1 and the optimum shifts again, this time to a study of
N_new = 50, D_new = 6 months. For Scenario 2, there are no proposed studies with positive value and we
should in principle say that the proposed research is too expensive to be worthwhile.
DISCUSSION
If health economics is going to contribute to clinical trial design then the capacity to analyse survival
trials is essential. This paper has developed and tested a new approach to computing EVSI for
survival studies. It can be used alongside traditional sample sizing to provide explicit trade-offs
of the costs versus the value of data collection. This can help to clarify and replace the implicit
trade-offs provided by specified power and significance levels. The case study shows how the
cost function for research and the prevalence of patients can affect which study design, if any,
might be considered optimal. Our approximation achieves more efficient computation in two
ways: replacing the need to implement MCMC to obtain the posterior and replacing the need to use
Monte Carlo integration (probabilistic sensitivity analysis) to evaluate the posterior expected net
benefit. The case study shows our approximation to be robust and efficient in comparison to other
approaches.
The case study application here is a relatively simple Weibull proportional hazards model.
Such models are common in health economic analyses of treatments where survival or death
are the only two important health states. Our approach can easily extend to more complex models with
several different states having Weibull transition rates between them. The case study assumed prior
distributions in which the treatment effect (θ3) was not correlated with the Weibull parameters (θ1, θ2).
Our method would still work if there were such correlations, whether they be elicited, or a correlation
revealed in prior data of some form. In fact our method will work for any smooth parametric model
of survival such as the Gompertz or gamma density functions. The approach could also work in
circumstances where the hazard is assumed to change over time depending on the model
assumed. Simple piecewise exponential forms are sometimes assumed for analyses of survival data.
Bayesian updating for the scale parameters of such piecewise exponential models would
not be conjugate and so the traditional MCMC approach would be required. If the points at
which the ‘pieces’ fit together are known then the B&K approach would work because the posterior
probability density function should be smooth and differentiable. More complex piecewise scenarios
with change points which are unknown and determined by the data would probably also work
but would require further thought. In our case study model, the short and long follow-up trials
both consider survival as the clinical endpoint. In many real clinical trials, shorter term studies
might also examine surrogate outcome measures, e.g. changes in risk factors such as cholesterol
or blood pressure, whilst longer term trials would provide more direct evidence of hard endpoints.
To account for this issue, the health economic model would need to examine explicitly each
of these parameter types (surrogate and final outcomes) together with the uncertainty in the
relationship between them. All of this is possible within the framework set out here. We must
also note that traditional sample sizing does not specify the Weibull, but assumes only a constant
proportional hazard applied to a non-parametric survival baseline (the Cox model). Traditional
sample sizing leaves the functional form of the survival curve open. Whilst this has benefits in
terms of generalizability, it means that a key requirement in health economic evaluation cannot
be met, i.e. to compute the mean survival or mean survival difference between treatments. In our
case study we specify a Weibull form for the survival curve. Statisticians will wish to assess the
appropriateness or otherwise of the Weibull or other parametric models for a particular survival
analysis in practice.
Bayesian updating in the Weibull model has been examined in some previous studies. Abrams
et al. (1996) used the Tierney and Kadane approximation with a vague prior to perform interim
trial analysis. Lecoutre et al. (2002) assume the Weibull shape parameter is different between treatments,
breaking the proportional hazards assumption. They allow the scale parameter to have any density
function, specify the prior density of the shape parameter conditional on the scale parameter to be of
inverse gamma form, and then show that the posterior also takes the inverse gamma form. Their
approach is relatively intensive, requiring numerical approximation to get the posterior followed by
Monte Carlo sampling to compute the posterior expected net benefits. Further research to apply the
Lecoutre et al. method within EVSI might be beneficial for those applications where the assumptions
are known to apply.
Our B&K approximation is generalizable beyond survival models to many health economic
decision model contexts. It will work for almost any form of joint prior density p(θ|H),
likelihood function L(θ) and net benefit functions NB(D, θ). It requires only smooth, differentiable
mathematical functions for the joint prior density, and an expression for the likelihood of the
simulated sample data, such that the log posterior density and its partial derivatives can be written
or approximated numerically. Practical limitations relate primarily to computation and model
set-up time.
We have shown that computation time can be substantially faster than when repeatedly
using MCMC via WinBUGS for each simulated data-set. Note that our method will not necessarily
always be faster. If the net benefit is simple to calculate, if we have conjugate distributions giving easy
access to posterior distributions (which we do not for the Weibull), and if the number of Monte-Carlo
inner samples required to gain accurate estimates of the conditional expectations is small, then the
traditional Monte-Carlo approach may be faster than using our approximation. If these conditions are
relaxed then the approximation is likely to provide computational efficiency gains. The time to set up
the statistical model and program is clearly greater than following the simple rules for traditional
sample size calculation. Once set up of course, the programs can be amended to apply to new survival
trials relatively easily. The [R] code for our analysis is available at http://www.shef.ac.uk/chebs/
software.
Our relatively simple case study defines the study design by two parameters, sample size and follow-up period. More complex specification of the trial design could be incorporated into our framework, e.g.
unequal samples in different arms and more detailed modelling of accrual, dropouts and censoring.
Similarly, the modelling of the cost structure of the study options can be more sophisticated when
necessary. The proportional hazards model in our case study used just one covariate (i.e. treatment) but
if prior information exists, other covariates can also be included. The approach also applies beyond
mortality to any model parameters which have the Weibull distribution, e.g. time to clinical events or
time to withdrawal from drug treatment.
Some might argue for a hybrid form of analysis using the health economic decision model
to define the decision maker's reference significant difference (θE) at which the decision threshold is
crossed, and then using the traditional sample size approach. This is not compatible with decision
theory based on expected value of the policy options, because the traditional sample size calculation has
embedded within it the implicit decision rule that the standard treatment will be adopted until the new
treatment is shown to be more effective at the α significance level. It would also leave implicit the trade-off
of the costs versus value of the data collection. To be compatible with decisions based on expected
value, we must make the full move to computing EVSI. The single important assumption required to
make the move is that we are able to specify in advance a prior probability density function for the
proportional hazard.
Further research could extend these ideas to the Bayesian concept of assurance and to
Bayesian clinical trial simulation. Assurance in clinical trials (O'Hagan and Stevens, 2001; O'Hagan
et al., 2001), rather than just computing the power (1 − β) for a particular value of θ = θR,
instead integrates over the range of possible values of θ and produces, essentially, the expected power of
the trial. Traditional sample sizing is often repeated over several alternative values of θR, using
published data or expert opinion to consider the relative plausibility of different assumptions. It appears
to be a small step to move from this to specifying a probability distribution for θ and computing
assurance. Assurance has the advantage of explicitly accounting for the uncertainty in θ but, when
compared to EVSI, still leaves implicit the trade-off between costs of research versus the value of data
collection. Bayesian clinical trial simulation (BCTS) is similar to EVSI in simulating alternative data
collection exercises (see discussion in O’Hagan et al., 2001). However, it allows for much more complex
decision rules, for continuation of data collection and for the adoption decision, than the simple
maximum expected net benefit rule we use in EVSI. Our approach to computing posterior expected net
benefits might also be useful in BCTS.
In conclusion, we have developed and tested a Bayesian approximation formula for posterior
expectations of real valued functions given observed data, E(NB(D, θ)|X), in the context of EVSI for
survival trials. The case study builds on a classic text book example to show how the formal integration
of economic considerations is both feasible and potentially profound in the design of survival trials. The
approximation method is very generalizable, working for any net benefit function and any smooth
mathematically defined joint probability distribution. Computation time reductions are substantial and
likely to be even greater for more complex, computationally expensive decision models. We hope that
health economists and statisticians will take the method forward and apply EVSI in planning a wide
range of survival trials.
APPENDIX A: DIFFERENT PARAMETERIZATIONS FOR THE WEIBULL
DISTRIBUTION
Collett (1997) provides an excellent introduction to survival analysis in medical statistics, denoting
r as the number of deaths amongst n individuals, d_i = 1 if a death has occurred at time t_i and d_i = 0 if
the ith survival time t_i is censored. Collett specifies γ as the shape parameter and λ as the scale
parameter.
When modelling uncertainty in survival it has been recommended that analysts fit Weibull survival
curves to the data and assume the uncertainty in the Weibull parameters (γ, λ) is well characterized by a
lognormal distribution (e.g. http://www.york.ac.uk/inst/che/training/modelling.htm,
http://www.shef.ac.uk/chebs/news/news1.html).
At least four different parameterizations are available for the Weibull distribution. This can be
confusing to the new reader and here we set out the differences. In this paper we use the log form set out
in the second row of Table AI (Abrams et al., 1996). This means that (θ1, θ2) can be characterized as
multivariate normal.
For each parameterization, the probability density function f(t) can be obtained from the standard
result in survival analysis that f(t) = h(t)S(t).
Table AI. Different parameterizations for the Weibull distribution

Method [Reference]: Shape; Scale; Hazard function $h(t)$; Survivor function $S(t)$
Collett (Collett, 1997): $\gamma$; $\lambda$; $\lambda \gamma t^{\gamma - 1}$; $\exp(-\lambda t^{\gamma})$
Log form (Abrams et al., 1996): $\theta_1 = \log \gamma$; $\theta_2 = \log \lambda$; $e^{\theta_2} e^{\theta_1} t^{e^{\theta_1} - 1}$; $\exp(-e^{\theta_2} t^{e^{\theta_1}})$
Lecoutre (Lecoutre et al., 2002): $\beta_L = \gamma$; $\lambda_L = (1/\lambda)^{1/\gamma}$; $\beta_L \lambda_L^{-\beta_L} t^{\beta_L - 1}$; $\exp(-(t/\lambda_L)^{\beta_L})$
Reliability (Abernethy, 2000): $\gamma$; $\lambda_w = \lambda^{-1/\gamma}$; $(\gamma/\lambda_w)(t/\lambda_w)^{\gamma - 1}$; $\exp(-(t/\lambda_w)^{\gamma})$
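As a quick numerical check (not part of the original paper), the following [R] sketch confirms, for
arbitrary illustrative values of the shape and scale, that the parameterizations in Table AI describe the
same distribution and that $f(t) = h(t)S(t)$ agrees with R's built-in dweibull, which uses the reliability
shape/scale form.

# Illustrative check (not from the paper): the Weibull parameterizations in
# Table AI describe the same distribution. Shape and scale values are arbitrary.
g   <- 1.5                                    # Collett shape gamma
lam <- 0.2                                    # Collett scale lambda
t   <- seq(0.5, 5, by = 0.5)

S_collett <- exp(-lam * t^g)                  # Collett survivor function
h_collett <- lam * g * t^(g - 1)              # Collett hazard function

theta1 <- log(g); theta2 <- log(lam)          # log form (Abrams et al.)
S_log  <- exp(-exp(theta2) * t^exp(theta1))

lambda_w <- lam^(-1 / g)                      # reliability scale, as used by dweibull()
S_rel    <- exp(-(t / lambda_w)^g)

all.equal(S_collett, S_log)                   # survivor functions coincide
all.equal(S_collett, S_rel)
all.equal(h_collett * S_collett,              # f(t) = h(t)S(t) matches dweibull()
          dweibull(t, shape = g, scale = lambda_w))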
APPENDIX B: THE MEANING OF $\hat{\theta}$, $\theta_i^{+}$, $\theta_i^{-}$, $\alpha_i^{+}$ AND $\alpha_i^{-}$
A brief explanation of the meaning of each element of Equation (13) is worthwhile. The elements are
easiest to conceptualize in the univariate case, i.e. where there is only one uncertain parameter,
giving the formula

$E\{v(\theta) \mid X\} \approx v(\hat{\theta}) + \{\alpha^{-} v(\theta^{-}) + \alpha^{+} v(\theta^{+}) - v(\hat{\theta})\}$   (B1)
In the univariate case, $\theta^{+}$ and $\theta^{-}$ are each simply one standard deviation away from the mode of the
posterior probability density, $\theta^{\pm} = \hat{\theta} \pm \sigma$, where $\sigma = j^{-1/2}$ quantifies the standard deviation. The
approximation requires us to evaluate the function $v(\theta)$ at just these three points $\hat{\theta}$, $\theta^{+}$ and $\theta^{-}$. This
contrasts with the evaluation of an expectation using Monte Carlo sampling from many random points
across the posterior density of $\theta$. A weighted average of the two evaluations at $\theta^{+}$ and $\theta^{-}$ is taken, with
weights $\alpha^{+}$ and $\alpha^{-}$. In the univariate case, each weight is a function of the first derivative of the log of
the posterior density function evaluated at the two points $\theta^{+}$ and $\theta^{-}$. That is,
$\alpha^{+} = \dfrac{1}{1 - l'(\theta^{+})/l'(\theta^{-})}$   and   $\alpha^{-} = \dfrac{1}{1 - l'(\theta^{-})/l'(\theta^{+})} = 1 - \alpha^{+}$
Note that $\alpha^{\pm}$ are approximately $\tfrac{1}{2}$. In the special case when the posterior density is symmetric,
$\theta^{+}$ and $\theta^{-}$ are equidistant from $\hat{\theta}$ and the first derivatives (slopes) of the log posterior density at
these two points are equal and opposite, i.e. $l'(\theta^{-})/l'(\theta^{+}) = -1$. This results in $\alpha^{+} = \alpha^{-} = \tfrac{1}{2}$.
More generally, if the posterior probability density is, say, positively skewed, then the slope of the log
posterior density at $\theta^{-}$ will be steeper than at $\theta^{+}$, so that $l'(\theta^{-})/l'(\theta^{+})$ is less than $-1$ and
hence $\alpha^{-} < \tfrac{1}{2} < \alpha^{+}$. That is, with a positively skewed posterior distribution, more weight is
given to the evaluation of the function at $v(\theta^{+})$ than to the evaluation at $v(\theta^{-})$.
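A purely illustrative [R] sketch of these quantities follows. It uses a gamma density as a stand-in for a
positively skewed posterior and an arbitrary smooth function $v(\theta)$, neither of which is taken from the
paper's case study, and compares the approximation (B1) with the expectation obtained by numerical
integration.

# Illustrative univariate sketch of approximation (B1): gamma(a, b) stand-in
# posterior and an arbitrary function v(theta). Not the paper's case study.
a <- 5; b <- 1                               # hypothetical posterior parameters
v <- function(theta) log(theta)              # arbitrary smooth function of interest

l      <- function(theta) (a - 1) * log(theta) - b * theta   # log posterior (up to a constant)
l_dash <- function(theta) (a - 1) / theta - b                # its first derivative

theta_hat   <- (a - 1) / b                   # posterior mode
j           <- (a - 1) / theta_hat^2         # observed information, -l''(theta_hat)
sigma       <- j^(-1 / 2)
theta_plus  <- theta_hat + sigma
theta_minus <- theta_hat - sigma

alpha_plus  <- 1 / (1 - l_dash(theta_plus) / l_dash(theta_minus))
alpha_minus <- 1 - alpha_plus

# Approximation (B1) versus the expectation by numerical integration
approx_Ev <- v(theta_hat) +
  (alpha_minus * v(theta_minus) + alpha_plus * v(theta_plus) - v(theta_hat))
exact_Ev  <- integrate(function(th) v(th) * dgamma(th, a, rate = b), 0, Inf)$value
c(approximation = approx_Ev, exact = exact_Ev)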
[Figure: panel (a) shows the posterior probability density function with $\hat{\theta}$, $\theta^{-}$ and $\theta^{+}$ marked;
panel (b) shows the log posterior density with the slopes $l'(\theta^{-})$ and $l'(\theta^{+})$ marked.]
Illustration of (a) $\hat{\theta}$, $\theta^{+}$ and $\theta^{-}$ and (b) $\alpha^{+}$ and $\alpha^{-}$ when the posterior probability density function is a
normal distribution.
APPENDIX C: STEPS FOR COMPUTING $\hat{\theta}$, $\theta_i^{+}$, $\theta_i^{-}$, $\alpha_i^{+}$ AND $\alpha_i^{-}$
We next explain the steps for computation. $\hat{\theta}$ is the posterior mode, i.e. the vector $(\hat{\theta}_1, \ldots, \hat{\theta}_d)$ that
maximizes the posterior density function given the data $X$. If the distributions are conjugate, then we can
often use conjugate formulae to compute the posterior mode analytically. For example, if $\theta$ and $X$ are
both multivariate normal such that $\theta \sim N(m, V)$ and $X \sim N(m', V')$, then the posterior mode, which in
this case is equal to the posterior mean, is given analytically by $\hat{\theta} = (V'^{-1} + V^{-1})^{-1}(V'^{-1} m' + V^{-1} m)$.
In the general case we will need an iterative numerical optimization process
such as the Newton-Raphson technique to estimate $\hat{\theta}$. (In our case study applications we used [R]
software with the optim or nlm functions, writing the posterior density function mathematically and
then computing the $\hat{\theta}$ which minimizes the negative of the posterior density, or the negative log
posterior, which is often easier mathematically.)
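A minimal [R] sketch of this step is given below. It is not the paper's full seven-parameter model: it finds
the posterior mode of the log-form Weibull parameters $\theta = (\log \gamma, \log \lambda)$ for simulated, possibly
censored survival data, assuming independent normal priors; the prior means, standard deviations and
data are all hypothetical.

# Minimal sketch (hypothetical data and priors, not the paper's case study):
# posterior mode of theta = (log gamma, log lambda) by minimizing the
# negative log posterior with optim().
set.seed(42)
n      <- 50
t_raw  <- rweibull(n, shape = 1.3, scale = 2)        # simulated survival times
censor <- rexp(n, rate = 0.2)                        # simulated censoring times
time   <- pmin(t_raw, censor)
delta  <- as.numeric(t_raw <= censor)                # 1 = death observed, 0 = censored

prior_mean <- c(0, -1)                               # hypothetical prior for (theta1, theta2)
prior_sd   <- c(1, 1)

neg_log_post <- function(theta) {
  g <- exp(theta[1]); lam <- exp(theta[2])           # shape gamma, scale lambda (Collett form)
  loglik   <- sum(delta * (log(lam) + log(g) + (g - 1) * log(time)) - lam * time^g)
  logprior <- sum(dnorm(theta, prior_mean, prior_sd, log = TRUE))
  -(loglik + logprior)
}

fit <- optim(c(0, 0), neg_log_post, hessian = TRUE)
theta_hat <- fit$par          # posterior mode
J         <- fit$hessian      # J = j(theta_hat), used in the steps below
theta_hat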
$\theta_i^{+}$ and $\theta_i^{-}$ are themselves vectors: each is the $i$th row of the matrix $\theta^{+}$ or $\theta^{-}$, respectively. The
matrix $\theta^{+}$ has the following structure:
Matrix Diagram 1: $\theta^{+}$

$\theta^{+} = \begin{pmatrix}
\hat{\theta}_1 + (k_1)^{-1/2} & & \hat{\theta}^{(2)}(\theta_1^{+}) & & \\
\hat{\theta}_1 & \hat{\theta}_2 + (k_2)^{-1/2} & & \hat{\theta}^{(3)}(\theta_2^{+}) & \\
\vdots & & \ddots & & \\
\hat{\theta}_1 & \cdots & \hat{\theta}_{i-1} & \hat{\theta}_i + (k_i)^{-1/2} & \hat{\theta}^{(i+1)}(\theta_i^{+}) \\
\vdots & & & & \ddots \\
\hat{\theta}_1 & \hat{\theta}_2 & \cdots & \hat{\theta}_{d-1} & \hat{\theta}_d + (k_d)^{-1/2}
\end{pmatrix}$
The off-diagonal elements to the left of and below the diagonal are simply the posterior modes for
the first $i-1$ components of $\theta$, i.e. $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_{i-1}$.
The diagonal elements are $\theta_1^{+}, \theta_2^{+}, \ldots, \theta_d^{+}$, where $\theta_i^{+} = \hat{\theta}_i + (k_i)^{-1/2}$. The constant $k_i$ comes
from the information matrix $J$ and is the reciprocal of the first element of the matrix $[J^{(i)}]^{-1}$. To get $k_i$
we need to undertake the following steps (a worked [R] sketch follows the list):
1. Compute $J$, which is $j(\hat{\theta})$, where $j(\theta) = -\partial^2 l / \partial \theta \, \partial \theta^{\mathrm{T}}$ is the matrix of second-order partial derivatives
of $l(\theta)$. For almost all probability distributions with a defined mathematical form we can
simply undertake the partial differentiation with respect to $\theta$ twice. In the exceptions, we can
again use numerical methods.
2. Get $J^{(i)}$, which is the sub-matrix of $J$ beginning at the $i$th row and $i$th column, as set out in the
diagram below.
Matrix Diagram 2: $J$ and $J^{(i)}$

$J = \begin{pmatrix}
-\dfrac{\partial^2 l(\theta)}{\partial \theta_1^2} & -\dfrac{\partial^2 l(\theta)}{\partial \theta_1 \partial \theta_2} & \cdots & & \\
-\dfrac{\partial^2 l(\theta)}{\partial \theta_2 \partial \theta_1} & -\dfrac{\partial^2 l(\theta)}{\partial \theta_2^2} & & & \\
\vdots & & \ddots & & \\
& & & -\dfrac{\partial^2 l(\theta)}{\partial \theta_i^2} & \cdots \\
& & & \vdots & \ddots
\end{pmatrix}$

where $J^{(i)}$ is the $(d-i+1) \times (d-i+1)$ sub-matrix in the bottom right-hand corner of $J$, whose leading
element is $-\partial^2 l(\theta)/\partial \theta_i^2$.
3. Invert the matrix $J^{(i)}$ to give $[J^{(i)}]^{-1}$. (In our case studies [R] inverts the matrix with the simple
command solve( ), which uses numerical methods.)
4. Finally, pick out the first element of $[J^{(i)}]^{-1}$.
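Continuing the illustrative sketch from earlier in this appendix (and therefore reusing the hypothetical
neg_log_post, theta_hat and J objects defined there), steps 1-4 and the diagonal elements of $\theta^{+}$ and
$\theta^{-}$ might look as follows.

# Continues the earlier sketch: theta_hat and J (= j(theta_hat)) come from
# optim(..., hessian = TRUE) on the negative log posterior.
d <- length(theta_hat)
k <- numeric(d)
theta_plus_diag <- numeric(d)
for (i in 1:d) {
  J_i  <- J[i:d, i:d, drop = FALSE]               # step 2: sub-matrix from row/column i
  k[i] <- 1 / solve(J_i)[1, 1]                    # steps 3-4: reciprocal of first element of the inverse
  theta_plus_diag[i] <- theta_hat[i] + k[i]^(-1/2)   # diagonal element of theta-plus
}
theta_minus_diag <- theta_hat - k^(-1/2)          # analogous diagonal of theta-minus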
The off-diagonal elements to the right of, and above, the diagonal of $\theta^{+}$ are the most complicated to
compute. $\hat{\theta}^{(i+1)}(\theta_i^{+})$ is defined as a vector of $(d-i)$ elements, i.e. $\hat{\theta}_{(i+1)}(\theta_i^{+}), \hat{\theta}_{(i+2)}(\theta_i^{+}), \ldots, \hat{\theta}_{(d)}(\theta_i^{+})$. These
elements are computed by maximizing the posterior density function of $\theta$ conditional on the first
$i$ elements of $\theta$ being defined by the earlier elements in the row $\theta_i^{+}$. To get these we need to find the
$(d-i)$ elements of $\theta$ that maximize $l(\theta)$ given that the first $(i-1)$ components of $\theta$ are $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_{i-1}$ and
the $i$th component of $\theta$ is $\theta_i^{+} = \hat{\theta}_i + (k_i)^{-1/2}$. Again, we use iterative numerical optimization to find the
solution. (In [R] we used the optim function to minimize the negative log posterior with
fixed values for the first $i$ elements of $\theta$; see the sketch below.) We do not need to do this for the final row of the matrix, but we do
need it for the first $d-1$ rows, and it needs to be done separately for $\theta^{+}$ and $\theta^{-}$, thus requiring $2(d-1)$
numerical optimizations.
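A hedged sketch of this conditional optimization, again continuing the running two-parameter example
(so only one free component remains per row), is:

# Continues the same sketch: build the full matrices theta_plus and theta_minus.
# Row i fixes the first (i-1) components at theta_hat, sets component i to its
# perturbed value, and maximizes l(theta) over the remaining (d - i) components.
build_row <- function(i, perturbed_i) {
  row <- theta_hat
  row[i] <- perturbed_i
  if (i < d) {
    free <- (i + 1):d
    cond_nlp <- function(x) {            # negative log posterior with first i elements fixed
      th <- row
      th[free] <- x
      neg_log_post(th)
    }
    row[free] <- optim(theta_hat[free], cond_nlp, method = "BFGS")$par
  }
  row
}
theta_plus  <- t(sapply(1:d, function(i) build_row(i, theta_hat[i] + k[i]^(-1/2))))
theta_minus <- t(sapply(1:d, function(i) build_row(i, theta_hat[i] - k[i]^(-1/2))))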
To compute $\alpha_i^{+}$ and $\alpha_i^{-}$ we need to compute $l_i(\theta_i^{\pm})$ and $\nu_i(\theta_i^{\pm}) = |j^{(i+1)}(\theta_i^{\pm})|^{1/2}$. Here $l_i(\theta_i^{+})$ is the partial
derivative of the log posterior density function with respect to $\theta_i$, i.e. $\partial l(\theta)/\partial \theta_i$, evaluated at the point $\theta_i^{+}$.
This is usually obtained analytically by undertaking the partial differentiation. $\nu_i(\theta_i^{+})$ is obtained using
the $j(\theta)$ matrix illustrated in Matrix Diagram 2, taking the determinant of the sub-matrix $j^{(i+1)}(\theta)$
when $\theta = \theta_i^{+}$ (a numerical sketch follows).
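The weights $\alpha_i^{\pm}$ themselves then follow from Equation (13) in the main text. The sketch below, again
continuing the running example, computes only the two ingredients named here and uses the numDeriv
package for a numerical gradient and Hessian; this is purely illustrative, since in the paper the derivatives
are obtained analytically as in Appendix D.

# Continues the same sketch: numerical ingredients for the alpha weights.
library(numDeriv)
log_post <- function(theta) -neg_log_post(theta)

l_i_plus  <- numeric(d)   # partial derivative of l w.r.t. theta_i, at theta_i-plus
nu_i_plus <- numeric(d)   # |j^(i+1)(theta_i-plus)|^(1/2)
for (i in 1:d) {
  th <- theta_plus[i, ]
  l_i_plus[i] <- grad(log_post, th)[i]
  if (i < d) {
    j_full <- -hessian(log_post, th)       # j(theta) = -second derivative matrix of l
    nu_i_plus[i] <- sqrt(det(j_full[(i + 1):d, (i + 1):d, drop = FALSE]))
  } else {
    nu_i_plus[i] <- 1                      # empty sub-matrix: determinant taken as 1 by convention here
  }
}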
Note that, in this multi-parameter situation, if the function $v(\theta)$ is linear, the parameters $\theta$ are
independent and the posterior probability distribution is symmetric with the mode equal to the mean
($\hat{\theta} = \theta^{\mathrm{mean}}$), then the first-term approximation is accurate because the matrices $\theta^{+}$ and $\theta^{-}$ are diagonal,
$\alpha^{+} = \alpha^{-} = \tfrac{1}{2}$, and hence the second term in the approximation is zero.
APPENDIX D: PARTIAL DERIVATIVES OF THE LOG POSTERIOR DENSITY
To undertake the Brennan & Kharroubi form of approximation, we also require the vector of partial
derivatives of the log posterior density with respect to each model parameter, i.e. $l_i(\theta) = \partial l(\theta)/\partial \theta_i$. We
denote $O$ as the vector of partial derivatives of the last term of (16), that is $O = -\Sigma^{-1}(\theta - m)$, and $O_i$ as
the $i$th element of that vector of functions. The partial derivative with respect to $\theta_1$ is
$l_1(\theta) = \sum_{i=1}^{n} \delta_i + e^{\theta_1} \sum_{i=1}^{n} \delta_i \log(t_i) - \sum_{i=1}^{n} e^{\theta_3 x_i} e^{\theta_2} t_i^{e^{\theta_1}} \log(t_i)\, e^{\theta_1} + O_1$   (D1)
The partial derivative with respect to $\theta_2$ is

$l_2(\theta) = \sum_{i=1}^{n} \delta_i - \sum_{i=1}^{n} e^{\theta_3 x_i} e^{\theta_2} t_i^{e^{\theta_1}} + O_2$   (D2)
In the simplest case where there is just one $\beta = \theta_3$, i.e. only one covariate, the partial derivative with
respect to $\theta_3$ is

$l_3(\theta) = \sum_{i=1}^{n} \delta_i x_i - \sum_{i=1}^{n} x_i e^{\theta_3 x_i} e^{\theta_2} t_i^{e^{\theta_1}} + O_3$   (D3)
For the remaining parameters, we again denote $O_i$ as the $i$th element of the vector $O = -\Sigma^{-1}(\theta - m)$
of partial derivatives of the last term of (16), and thus the partial derivatives with respect to
$\theta_4$, $\theta_5$, $\theta_6$ and $\theta_7$, respectively, are $O_4$, $O_5$, $O_6$ and $O_7$.
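As a check on (D1)-(D3), the following self-contained [R] sketch implements the analytic gradient for a
hypothetical data set with one binary covariate and a multivariate normal prior on $(\theta_1, \theta_2, \theta_3)$,
with all numbers chosen purely for illustration, and compares it with a numerical gradient from the
numDeriv package.

# Illustrative check of (D1)-(D3) against a numerical gradient.
# Data, covariate and prior are hypothetical; only theta_1..theta_3 are used.
library(numDeriv)
set.seed(7)
n     <- 40
x     <- rbinom(n, 1, 0.5)                           # single binary covariate
t_i   <- rweibull(n, shape = 1.2, scale = 2) * exp(-0.3 * x)
delta <- rbinom(n, 1, 0.8)                           # hypothetical event indicators (check only)
m     <- c(0, -0.5, 0)                               # prior mean
Sigma <- diag(c(1, 1, 0.5))                          # prior covariance

log_post <- function(theta) {
  g <- exp(theta[1]); lam <- exp(theta[2]); lin <- theta[3] * x
  loglik   <- sum(delta * (lin + theta[2] + theta[1] + (g - 1) * log(t_i)) -
                  exp(lin) * lam * t_i^g)
  logprior <- -0.5 * as.numeric(t(theta - m) %*% solve(Sigma) %*% (theta - m))
  loglik + logprior
}

analytic_grad <- function(theta) {
  g <- exp(theta[1]); lam <- exp(theta[2])
  H <- exp(theta[3] * x) * lam * t_i^g                       # cumulative hazard terms
  O <- as.vector(-solve(Sigma) %*% (theta - m))              # prior contribution
  c(sum(delta) + g * sum(delta * log(t_i)) - g * sum(H * log(t_i)) + O[1],   # (D1)
    sum(delta) - sum(H) + O[2],                                              # (D2)
    sum(delta * x) - sum(x * H) + O[3])                                      # (D3)
}

theta0 <- c(0.1, -0.4, 0.2)
all.equal(analytic_grad(theta0), grad(log_post, theta0), tolerance = 1e-6)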
REFERENCES
Abernethy RB. 2000. The New Weibull Handbook. Reliability and Statistical Analysis for Predicting Life, Safety,
Survivability, Risk, Cost and Warranty Claims (4th edn). Abernethy: Florida.
Abrams K, Ashby D, Errington D. 1996. A Bayesian approach to Weibull survival models: application to a cancer
clinical trial. Lifetime Data Analysis 2: 159–174.
Ades AE, Lu G, Claxton K. 2004. Expected value of sample information calculations in medical decision modeling.
Medical Decision Making 24: 207–227.
Brennan A, Kharroubi SA. 2003. Calculating expected value of sample information for the Weibull distribution.
Presented at the 25th Annual Meeting of the Society for Medical Decision Making, October 2003, Chicago
Conference poster, University of Sheffield. Available at http://www.shef.ac.uk/content/1/c6/03/85/60/
evsi weibull.ppt
Brennan A, Kharroubi SA. 2007. Efficient computation of partial expected value of sample information using
Bayesian approximation. Journal of Health Economics 26: 122–148.
Brennan A, Chilcott J, Kharroubi SA, O’Hagan A. 2002a. A two level Monte Carlo approach to calculating
expected value of perfect information: resolution of the uncertainty in methods. Presented at the 24th Annual
Meeting of the Society for Medical Decision Making, October 23, 2002, Baltimore. Conference poster, University
of Sheffield. Available at http://www.shef.ac.uk/content/1/c6/03/85/60/evpi.ppt
Brennan A, Chilcott J, Kharroubi SA, O’Hagan A. 2002b. A two level Monte Carlo approach to calculation
expected value of sample information: how to value a research design. Presented at the 24th Annual Meeting of
the Society for Medical Decision Making, October 23, 2002, Baltimore. Conference poster, University of
Sheffield. Available at http://www.shef.ac.uk/content/1/c6/03/85/60/evsi.ppt
Brennan A, Kharroubi SA, Chilcott J, O’Hagan A. 2002c. A two level Monte Carlo approach to calculating
expected value of perfect information: resolution of the uncertainty in methods. Discussion Paper, University of
Sheffield. Available at http://www.shef.ac.uk/content/1/c6/02/96/05/brennan.doc
Chilcott J, Brennan A, Booth A, Karnon J, Tappenden P. 2003. The role of modelling in prioritising and planning
clinical trials. Health Technology Assessment 7.
Claxton K. 1999a. Bayesian approaches to the value of information: implications for the regulation of new health
care technologies. Health Economics 8: 269–274.
Claxton K. 1999b. The irrelevance of inference: a decision making approach to the stochastic evaluation of health
care technologies. Journal of Health Economics 18: 341–364.
Claxton K, Posnett J. 1996. An economic approach to clinical trial design and research priority-setting. Health
Economics 5: 513–524.
Claxton K, Thompson K. 2001. A dynamic programming approach to efficient clinical trial design. Journal of
Health Economics 20: 797–822.
Claxton K, Neumann PJ, Araki S, Weinstein MC. 2001. Bayesian value-of-information analysis. An application to
a policy model of Alzheimer's disease. International Journal of Technology Assessment in Health Care 17: 38–55.
Collett D. 1997. Modelling Survival Data in Medical Research. Chapman & Hall: London.
Coyle D, Buxton MJ, O’Brien BJ. 2003. Measures of importance for economic analysis based on decision modeling.
Journal of Clinical Epidemiology 56: 989–997.
Felli JC, Hazen GB. 1998. Sensitivity analysis and the expected value of perfect information. Medical Decision
Making 18: 95–109.
Gilks WR, Richardson S, Spiegelhalter DJ. 1996. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC:
London.
Kharroubi SA. 2001. Hybrid simulation and asymptotic techniques for Bayesian computation. PhD Thesis.
Kharroubi SA, Brennan A. 2005. A novel formulation for approximate Bayesian computation based on signed
roots of log-density ratios. Research Report No. 553/05, Department of Probability and
Statistics, University of Sheffield. http://www.shef.ac.uk/content/1/c6/02/56/37/laplace.pdf
Lecoutre B, Mabika B, Derzko G. 2002. Assessment and monitoring in clinical trials when survival curves have
distinct shapes: a Bayesian approach with Weibull modelling. Statistics in Medicine 21: 663–674.
Meltzer D. 2001. Addressing uncertainty in medical cost effectiveness analysis. Implications of expected utility
maximization for methods to perform sensitivity analysis and the use of cost-effectiveness analysis to set
priorities for medical research. Journal of Health Economics 20: 109–129.
O’Hagan A, Stevens JW. 2001. Bayesian assessment of sample size for clinical trials of cost effectiveness. Medical
Decision Making 21: 219–230.
O’Hagan A, Stevens JW, Campbell MJ. 2001. Assurance in clinical trial design. Pharmaceutical Statistics 4:
187–201.
Raiffa H. 1968. Decision Analysis: Introductory Lectures on Choice Under Uncertainty. Addison-Wesley:
Reading, MA.
Spiegelhalter DJ, Myles JP, Jones DR, Abrams KR. 2000. Bayesian methods in health technology assessment:
a review. Health Technology Assessment 4.
Spiegelhalter DJ, Thomas A, Best N, Lunn D. 2001. WinBUGS User Manual: Version 1.4. MRC Biostatistics Unit:
Cambridge, UK.
Sweeting TJ, Kharroubi SA. 2003. Some new formulae for posterior expectations and Bartlett corrections. Test 12:
497–521.
Tappenden P, Chilcott J, Eggington S, Oakley J, McCabe C. 2004. Methods for expected value of information
analysis in complex health economic models: developments on the health economics of interferon-β and
glatiramer acetate for multiple sclerosis. Health Technology Assessment 8.
Thompson KM, Graham JD. 1996. Going beyond the single number: using probabilistic risk assessment to improve
risk management. Human and Ecological Risk Assessment 2: 1008–1025.
Tierney L, Kadane J. 1986. Accurate approximations for posterior moments and marginal densities. Journal of the
American Statistical Association 81: 82–86.
Yokota F, Thompson KM. 2004a. Value of information literature analysis (VOILA): a review of applications in
health risk management. Medical Decision Making 24: 287–298.
Yokota F, Thompson KM. 2004b. Value of information (VOI) analysis in environmental health risk management
(EHRM). Risk Analysis 24.