THE MARGINAL UTILITY OF HEALTH
Is health more important when you have less?
Research Paper July 2014
Wilbert van den Hout
Department of Medical Decision Making (Afdeling Medische Besliskunde), LUMC
CONTENTS
Executive summary
    Context
    Study
    Consequences
    Recommendation
Abstract
    Background
    Methods
    Results
    Conclusions
Introduction
    Research questions
Methods
    Sample
    Internet questionnaire
    Statistical analyses
Results
    Analysis within the standard EQ5D tariff model
    Analysis within the disaggregated model
    Direct preference question
Discussion
    Implications
    Interpretation
    Conclusion
References
Appendix A: Visual analogue scale for valuing health states, from the personal perspective
Appendix B: Visual analogue scale for valuing health changes, from the policy perspective
Appendix C: Direct preference question for treatment in poor or relatively good health, depending on perspective
Executive summary
Context
For economic evaluations it is important to be able to determine the value of health properly.
This study was prompted by the advice of the Raad voor de Volksgezondheid (Council for Public
Health), which recommends, among other things, applying a ceiling to the costs per QALY when
assessing whether interventions qualify for funding from collective resources. It was also
suggested that this threshold be made dependent on the burden of disease, with a maximum of
€80,000 per QALY for very severe conditions.
Using different thresholds implies that a comparable health improvement may cost more, and so
is more valuable, for someone with a more severe condition. The literature is unclear on whether
this is justified. The literature on distributive justice generally supports this priority for
patients with more severe conditions. But the literature on classification systems such as the
EQ5D, SF6D and HUI shows exactly the opposite: that the same health improvement is worth more in
good health.
Study
The aim of the current study is to reproduce both lines of research within a single study and to
see whether the difference can be explained by the difference in perspective (as a policy maker
or personal) and/or by the difference in the format used (as health improvements or as health
states).
Respondents from the general population (response 52%, n=710) were asked, using an internet
questionnaire, for their judgements of the value of health. Health was presented using EQ5D
profiles and was valued using visual analogue scales.
The results consistently show that the same health improvement is considered more valuable in
good health than in poor health. This also holds for the health improvements valued from the
policy perspective, which corresponds most closely to the framing used in the literature on
distributive justice. When asked directly about their preference, a large majority also prefer
treatment in relatively good health.
Consequences
The results raise serious doubts about the use of different cost-per-QALY thresholds depending
on the severity of the condition. In the standard cost-per-QALY framework, a larger health
improvement is allowed to cost more. Patients in worse health may have more room for improving
their health, and in that case higher costs are acceptable. Making that ceiling dependent on the
burden of disease goes a step further. Then not only may a larger improvement cost more, but the
same health improvement may also cost more for someone with a more severe condition. That
approach is at odds with the results of our study: our respondents consistently considered the
same health improvement more valuable in better health.
A cost-per-QALY ceiling that depends on the burden of disease can be seen as a way to combine
the criteria of cost-effectiveness and necessity into a single overarching criterion. In the
Dunning funnel, cost-effectiveness and necessity were applied as two independent filters: a
treatment had to meet both criteria separately to qualify for reimbursement from collective
resources. Our study supports this approach with independent filters. This does not mean that a
cost-per-QALY ceiling must necessarily be applied rigidly. An economically expensive treatment
could still be justified by, for example, scientific or non-health benefits, by priority for
vulnerable populations, or by considerations of solidarity, justice, fairness and need. But the
diversity of this type of argument makes it unlikely that they can be captured together with
cost-effectiveness in a single overarching criterion. The results of our study give no reason to
do so.
Our study also has consequences for how the criterion of necessity is defined. Explicitly
linking the criteria of cost-effectiveness and necessity requires that necessity, like
cost-effectiveness, be defined in a quantitative and gradual way. Burden of disease meets this
requirement and is currently used as an important determinant of necessity. Keeping the criteria
of cost-effectiveness and necessity separate leaves room for other definitions, which could do
more justice to the less gradual nature of necessity. The Dunning committee, for example,
defined necessity in terms of premature death and normal participation in society.
Finally, it may be noted that it was not easy to have the general population judge
cost-effectiveness. Complicating the cost-effectiveness framework with variable thresholds could
hamper the public debate on cost-effectiveness even further.
Recommendation
The results of our study raise serious doubts about the use of different cost-per-QALY
thresholds depending on the severity of the condition. As long as a better justification is
lacking, it is advisable not to make the criterion of cost-effectiveness dependent on the burden
of disease.
This study was funded by the Health Technology Assessment (HTA) methodology
programme for high-cost medicines of the Netherlands Organisation for Health Research and
Development (ZonMw), project number 152002016
Research paper February 2014
Abstract
THE MARGINAL UTILITY OF HEALTH
Is health more important when you have less?
Background
Evidence is conflicting on whether similar health changes are more valuable in poor health than
in good health. Research on distributive justice suggests that policy makers should value health
improvements more for those in worse health. This has led to differential cost-effectiveness
thresholds, accepting higher costs for patients with higher disease severity. However, health
state classification systems valued from the personal perspective invariably show that
differences in health are more important in better health. The aim of the current study is to
test whether the discrepancy between research on distributive justice and health state
classification systems can be explained by the difference in format and in perspective.
Methods
Respondents from the Dutch general public (response 52%, n=710) filled out an internet
questionnaire, valuing health from different perspectives (personal or policy) and in different
formats (states and changes). Health was presented using EQ5D profiles, and valued using
visual analogue scales. Data were analysed using random effects models, to investigate whether
health improvements are more valuable in good health or in poor health.
Results
Few differences were found between the personal and the policy perspective. Estimates from the
state format and the change format were more different. However, in all analyses, health
improvements were considered more valuable in good health than in poor health.
Conclusions
Our results consistently show that health improvements are considered more valuable in good
health. These results question the robustness of the foundations for using differential
cost-effectiveness thresholds.
Introduction
It has been argued that cost-effectiveness acceptability thresholds should depend on disease
severity, accepting higher costs for patients with higher disease severity [1,2]. In the Netherlands,
costs could be acceptable up to 10,000 euro per quality-adjusted life year (QALY) for diseases
with low severity, but up to 80,000 euro per QALY for diseases with high severity (Figure 1).
Such differential thresholds are consistent with the economic concept of diminishing marginal
utility. The additional utility derived from something good (like additional money or particular
products) often decreases as the already available amount increases. Indeed, when asked to
assign a similar health improvement to either someone in poor health or someone in good health
(Figure 2), many people prefer to improve the health of the person in poor health [3,4]. This
suggests that health has diminishing marginal utility, assigning more value to health
improvements for those with less health.
However, research on tariffs for health state classification systems like the EQ5D, SF6D and HUI
shows a different picture. These tariffs estimate utility functions for health states and they invariably
show increasing marginal utility, assigning more value to health changes in good health. For
example, the EQ5D is a classification system consisting of five domains: mobility, self-care, usual
activities, pain/discomfort, and anxiety/depression [5]. In the Dutch EQ5D tariff [6], the value of an
improvement from being confined to bed to having no mobility problems can range from 0.16 to
0.47: the same mobility improvement is considered about three times more valuable for an
otherwise healthy patient than for an otherwise extremely ill patient (Figure 3). This increasing
marginal utility in the EQ5D tariff model is represented by two so-called interaction terms that
assign disutility to the first problematic and the first extremely problematic domain. Unlike
the otherwise healthy patient, the otherwise extremely ill patient has no improvement on these
interaction terms when only mobility is improved. The interaction terms are present in most
estimated EQ5D utility functions [7,8]. The SF6D has a structure with similar interaction terms
[9]. Increasing marginal utility for the multiplicative HUI results from preference
complementarity, where the impact of two impairments is less than the sum of the two
individual impacts [10].
Figure 1. Differential cost-effectiveness thresholds, depending on disease severity [1]
Thus, tariffs for health classification systems show increasing marginal utility, as opposed to the
diminishing marginal utility that is found by research on distributive justice. It is unclear why
both lines of research contradict each other. One explanation could be the difference in
perspective. Compared to the policy perspective, the personal perspective used to estimate
tariffs may suppress solidarity issues and may be more averse to the first deterioration than to
further deteriorations. A second explanation could be that both lines of research differ in the
framing of health as either states or as changes, leading to differences in how adaptation is
incorporated [11]. Becoming ill is worse than being ill. Research on health state classification
systems presents stable adapted descriptions. In contrast, by focussing on the changes, the
research on distributive justice may neglect adaptation. As a result, valuations using a change
format may be more averse to larger deteriorations, i.e. to worse states. Kahneman and Tversky
assert that, as a basic principle of perception and judgment, the carriers of value are the
changes, rather than final states [12].
Figure 2. Which patient would you prefer to give treatment?
Research questions
The aim of the current study is to reproduce both lines of research within a single study.
Valuations are obtained from the Dutch general population, analysing the marginal utility
depending on format (change versus state) and perspective (personal versus policy). The
research questions are:
1. Does valuation using a state format invoke increasing marginal utility, whereas
valuation using a change format invokes diminishing marginal utility?
2. Does valuation using a personal perspective invoke increasing marginal utility, whereas
valuation using a policy perspective invokes diminishing marginal utility?
Figure 3. Value of an extreme improvement on mobility, depending on
the level of the other domains (according to the Dutch EQ5D tariff [6])
Methods
Respondents from the general public were asked to value a set of health states and a set of
health changes, either from their personal perspective or from a policy perspective. Valuing
health states from the personal perspective resembles research on health state classification
systems. Valuing health changes from the policy perspective resembles research on distributive
justice.
Sample
Participants, at least 18 years old, were recruited through a company for marketing research.
Quota sampling was used to achieve a sample that was representative of the Dutch population
with regard to age, gender and education. Respondents were approached through email and
invited to participate in the study. The invitation email contained an internet link for
respondents to start the questionnaire. After completion of the questionnaire the respondents
received points that could be exchanged for a gift voucher or donated to charity.
To estimate the standard 12-parameter EQ5D tariff model for the 2×2 conditions (state or
change format, with personal or policy perspective), the total number of parameters is 48.
Requiring ten respondents per parameter [13], we aimed at including 500 respondents.
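The arithmetic behind this target can be written out as follows. Reading 500 as a rounding up of the computed minimum is our interpretation; the text does not state a rounding rule.

```latex
\[
  \underbrace{12}_{\text{parameters per condition}} \times \underbrace{4}_{\text{conditions}} = 48
  \quad\Rightarrow\quad
  48 \times 10 \ \text{respondents per parameter} = 480 \;\approx\; 500 .
\]
```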
Internet questionnaire
Respondents filled out a single internet questionnaire, including a part A valuing states and a
part B valuing changes. Each respondent was presented with both parts A and B, in random
order. We considered it too difficult for a respondent to change perspective halfway through the
questionnaire. Therefore, each respondent was randomly assigned a questionnaire that was entirely from
the personal perspective or entirely from the policy perspective. In the policy (or social)
perspective, respondents were asked to imagine that they were a policy maker who had to value
health for groups of patients. In the personal perspective, respondents were asked to value health
for themselves. Respondents were primed with their perspective before the valuation tasks, and
the perspective was briefly repeated on each valuation screen.
In part A, nine health states were valued. Health states were described using the profiles of the
EQ5D classification system. First, respondents were asked which health state they considered
worse: profile 33333 or dead. Next they were asked three times to value a set of three health
states on a visual analogue scale (VAS) ranging from 100 (= best imaginable health) to 0 (=
worst imaginable health) (see Appendix A). All VAS values were divided by 100. For those
who considered profile 33333 worse than dead, a group-level linear transformation was used to
map the average value for dead to 0 (thus mapping the value for 33333 at -0.532).
In part B, nine health changes were valued. Health changes were described as improvements
from one EQ5D profile to another EQ5D profile. First, respondents were asked which health
improvement they considered more valuable: 33333→11111 or dead→11111. Next they were
asked three times to value a set of three health changes on a VAS ranging from 0 (= no
improvement) to 100 (= most valuable improvement) (see Appendix B). All VAS values were
divided by 100. For those who considered the improvement 33333→11111 more valuable than
dead→11111, a group-level linear transformation was used to map the average value for
dead→11111 to 1 (thus mapping the value for 33333→11111 at 1.526).
The health changes presented in part B were changes to or from the frequently used seventeen
EQ5D profiles that were first used by Macran [14,15]. For each respondent, a random set of
nine changes was chosen, including six improvements to or from a Macran profile (varying only
one or two domains), two improvements from a Macran profile to 11111, and one self-change
from a Macran profile to itself (included as a consistency check). Changes were chosen to
include small, intermediate and large improvements, and were presented in random order. The
health states valued in part A were chosen as the respondent’s first nine unique states from the
eighteen states defining the nine changes in part B. This way, values for the assessed changes in
part B could be compared within respondents to the differences between the values for the
assessed states in part A.
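The selection of part A states from the part B changes can be sketched as below. This is only an illustration under stated assumptions: the order in which states are scanned (initial state before destination state, change by change) is our guess, the function name is hypothetical, and the example profiles are made up rather than taken from the Macran set.

```python
def first_unique_states(changes, k=9):
    """Return the first k unique states met while scanning the changes in order.

    Assumption (not stated in the paper): each change contributes its initial
    state first and its destination state second.
    """
    selected = []
    for initial, destination in changes:
        for state in (initial, destination):
            if state not in selected:
                selected.append(state)
                if len(selected) == k:
                    return selected
    return selected


# Hypothetical set of nine changes (initial profile, destination profile).
example_changes = [
    ("21111", "11111"),
    ("32211", "22211"),
    ("33333", "23333"),
    ("12121", "12121"),  # self-change, used as a consistency check
    ("22222", "11111"),
    ("23232", "22232"),
    ("31111", "11111"),
    ("22322", "22222"),
    ("13311", "12311"),
]

part_a_states = first_unique_states(example_changes)
print(part_a_states)
```

Because every selected state also appears in one of the respondent's changes, a change valuation can be compared with the difference between the two corresponding state valuations wherever both states were selected.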
Near the end of the questionnaire, after parts A and B, a non-numerical direct preference
question was asked on whether equally effective treatment was preferred in poor health or in
rather good health (see Appendix C). In the personal perspective respondents were asked “In
which situation would you prefer to receive this treatment for yourself?” In the policy
perspective they were asked “Which patient group would you prefer to give this treatment?”
Statistical analyses
We analysed the data both within the standard EQ5D model and in a more general
disaggregated model. The standard EQ5D tariff model has twelve parameters: one for each
non-optimal level of each domain, plus two interaction terms over all domains combined. The ten
parameters for the levels of the domains do not show whether health is more valuable in good
health or in poor health: if the difference between the levels 1 and 2 is less valuable than the
difference between the levels 2 and 3, then this merely shows that the wording for level 2 is
closer to level 1 than to level 3. Marginal utility is reflected in the two interaction terms.
Deterioration to level 2 or 3 of a certain domain invokes the interaction terms if the initial state
had no problems at all, but these same deteriorations do not invoke those interaction terms if the
initial state already had a level 2 or 3 on some domain. Therefore, the signs of the interaction
terms show whether the same health change is more valuable in good or in poor health.
The coefficients of the standard EQ5D tariff model were estimated using the following
model [5]:
\[
  1 - U_{ij} = \sum_{d=1}^{5} \sum_{\ell=2}^{3} \alpha_{d\ell} L_{d\ell ij} + \beta_2 N_{2ij} + \beta_3 N_{3ij} + \varepsilon_{ij} ,
\]
where Uij denotes the valuation for the j-th health state valued by the i-th respondent, Ldℓij
indicates whether the d-th domain of that state is at the ℓ-th level, and N2ij and N3ij indicate
whether any of the domains is at least at level 2 and 3, respectively. Under this standard EQ5D
tariff model, the value of the assessed changes is equal to the difference between the values of
their initial and destination states:
\[
  D_{ij} = \sum_{d=1}^{5} \sum_{\ell=2}^{3} \alpha_{d\ell} \Delta L_{d\ell ij} + \beta_2 \Delta N_{2ij} + \beta_3 \Delta N_{3ij} + \varepsilon_{ij} ,
\]
where Dij denotes the valuation for the j-th health change valued by the i-th respondent, and the
Δ-variables denote the difference between the initial and the destination state for the respective
indicator variables. Although the variables of the state and the change model are different, their
structure and parameters are the same. Therefore, with appropriate transformations of the
variables in the models, the state and change valuations of each respondent can be aggregated in
a combined model and their estimated coefficients can be compared. Diminishing and
increasing marginal utility for health can be concluded from the signs of the estimated
coefficients for the interaction terms (i.e. β2 and β3) in the different parameter sets.
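A minimal sketch of how this tariff structure turns an EQ5D profile into a value, and a pair of profiles into the value of a change, is given below. It uses the Table 1 estimates for the personal perspective in the state format. One assumption is ours: the negative table entries are read as utility decrements, so that U = 1 plus the sum of the applicable coefficients. The function and variable names are hypothetical.

```python
# Table 1 estimates, personal perspective, state format.
# Assumption: negative entries are decrements on U, i.e. U = 1 + sum(coefficients).
ALPHA = {  # (domain number 1..5, level 2..3) -> coefficient
    (1, 2): -0.07, (1, 3): -0.16,  # mobility
    (2, 2): -0.07, (2, 3): -0.11,  # self-care
    (3, 2): -0.03, (3, 3): -0.09,  # usual activities
    (4, 2): -0.06, (4, 3): -0.20,  # pain/discomfort
    (5, 2): -0.09, (5, 3): -0.17,  # anxiety/depression
}
BETA2 = -0.31  # interaction term: any domain at level 2 or 3
BETA3 = -0.14  # interaction term: any domain at level 3


def state_value(profile):
    """Value of an EQ5D profile such as '31111' under the standard tariff structure."""
    levels = [int(ch) for ch in profile]
    value = 1.0
    for domain, level in enumerate(levels, start=1):
        if level > 1:
            value += ALPHA[(domain, level)]
    if any(level >= 2 for level in levels):
        value += BETA2
    if any(level == 3 for level in levels):
        value += BETA3
    return value


def change_value(initial, destination):
    """Value of an improvement: destination state value minus initial state value."""
    return state_value(destination) - state_value(initial)


for initial, destination in [("31111", "11111"), ("32222", "12222"), ("33333", "13333")]:
    print(initial, "->", destination, round(change_value(initial, destination), 2))
```

The same mobility improvement removes both interaction terms for the otherwise healthy patient, only the level-3 term when the other domains are at level 2, and neither term when the other domains are at level 3. This is why negative interaction terms correspond to increasing marginal utility.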
The analysis based on the interaction terms of the standard EQ5D tariff model requires the
validity of that model. For this reason, marginal utility was also investigated in a more direct
way that does not rely on the validity of the standard model. In this disaggregated model, fifteen
types of transitions were distinguished: the level 1→2, level 2→3 and level 1→3 transitions, for
each of the five domains. This disaggregated approach separates the fifteen types of transitions,
and estimates how their value depends on the severity of the entire initial health state. The
following model was estimated:
\[
  D_{ij} = \sum_{t=1}^{15} T_{tij} \left[ \gamma_t + \delta_t S_{ij} \right] + \varepsilon_{ij} ,
\]
where Ttij indicates whether the j-th health change valued by the i-th respondent includes a
transition of the t-th type and Sij denotes the severity of the initial state in that health change.
Severity is quantified here by the unweighted average level over the five domains, ranging from
severity 0 for profile 11111, through severity 0.5 for profile 22222, to severity 1 for profile
33333. For the separate domains this model is more flexible than the EQ5D tariff model (with
six parameters per domain instead of two). Moreover, marginal utility is quantified by 15
parameters (one for each transition type), instead of two overall interaction terms. This
disaggregated model is basically a model for the value of the health change, but was applied
both to the valuations for the changes and to the differences between the valuations for the
corresponding initial and final states. Diminishing and increasing marginal utility for health was
concluded from the signs of the estimated coefficients for severity (i.e. δt for t =1..15) in the
different parameter sets.
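The severity measure and the resulting value of a single transition can be sketched as follows. The rescaling (average level minus 1, divided by 2) is inferred from the anchors given above (0 for 11111, 0.5 for 22222, 1 for 33333) and is not quoted from the paper; the function names are hypothetical. The example parameters are the Table 2 estimates for the 3→1 mobility transition in the personal perspective, state format.

```python
def severity(profile):
    """Unweighted average level over the five domains, rescaled to the range 0..1.

    Inferred rescaling: (mean level - 1) / 2, which gives 0 for '11111',
    0.5 for '22222' and 1 for '33333'.
    """
    levels = [int(ch) for ch in profile]
    return (sum(levels) / len(levels) - 1) / 2


def transition_value(gamma, delta, initial_profile):
    """Estimated value of one transition type: gamma + delta * severity of the initial state."""
    return gamma + delta * severity(initial_profile)


# Table 2, mobility 3->1, personal perspective, state format.
GAMMA, DELTA = 0.46, -0.18

for initial in ("31111", "32222", "33333"):
    print(initial, round(severity(initial), 1), round(transition_value(GAMMA, DELTA, initial), 2))
```

A negative delta means that the same transition is worth less as the initial state becomes more severe, which is the pattern referred to here as increasing marginal utility.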
To allow for within-respondent comparisons of the state and change valuations, the analysis
includes those respondents who finished both valuation parts A and B. Model parameters were
estimated using random effects models, to account for the repeated valuations from the same
respondent and format. For the direct preference questions, proportions were analysed using
binomial and Fisher’s exact tests. All analyses were performed using Stata/IC 11.2 for
Windows. P-values less than 0.05 were considered statistically significant.
Table 1: Estimated parameters for the standard EQ5D tariff model, in the four conditions

| Parameter | Personal perspective (n=383), state format | Personal perspective (n=383), change format | Policy perspective (n=327), state format | Policy perspective (n=327), change format |
|---|---|---|---|---|
| Interaction terms | | | | |
| Any domain at level 2 or 3 (β2) | -0.31 ‡ | -0.20 ‡ | -0.24 ‡ | -0.17 ‡ |
| Any domain at level 3 (β3) | -0.14 ‡ | -0.19 ‡ | -0.12 ‡ | -0.17 ‡ |
| Mobility | | | | |
| level 2 (α12) | -0.07 ‡ | -0.04 * | -0.06 ‡ | -0.08 ‡ |
| level 3 (α13) | -0.16 ‡ | -0.03 | -0.13 ‡ | -0.04 |
| Self-care | | | | |
| level 2 (α22) | -0.07 ‡ | -0.02 | -0.06 ‡ | -0.06 † |
| level 3 (α23) | -0.11 ‡ | 0.04 | -0.12 ‡ | 0.04 |
| Usual activities | | | | |
| level 2 (α32) | -0.03 † | -0.05 * | -0.03 * | -0.03 |
| level 3 (α33) | -0.09 ‡ | -0.03 | -0.11 ‡ | -0.09 ‡ |
| Pain/discomfort | | | | |
| level 2 (α42) | -0.06 ‡ | -0.07 ‡ | -0.09 ‡ | -0.13 ‡ |
| level 3 (α43) | -0.20 ‡ | -0.07 ‡ | -0.24 ‡ | -0.16 ‡ |
| Anxiety/depression | | | | |
| level 2 (α52) | -0.09 ‡ | -0.07 ‡ | -0.09 ‡ | -0.08 ‡ |
| level 3 (α53) | -0.17 ‡ | -0.03 | -0.17 ‡ | -0.08 ‡ |

* P-value ≤ 0.05; † P-value ≤ 0.01; ‡ P-value ≤ 0.001
Results
Of 1371 respondents opening the questionnaire, 1092 (80%) started their first valuation part (A or
B), 874 (64%) finished that first valuation part, 804 (59%) started their second valuation part, and
710 (52%) finished both valuation parts (A and B).
Among the study population, 54% were female and the average age was 49.6 (SD 14.5, range 18 to
81). The highest completed education level was primary school for 12%, pre-vocational
secondary training for 14%, junior vocational for 13%, senior vocational secondary for 10%, senior
general secondary for 11%, pre-university for 12%, higher professional for 14%, and university for 14%. The
family situation was living with parents for 7%, single without children for 22%, single with
child(ren) for 5%, living together without children for 31%, living together with child(ren) for 28%, and
otherwise for 7%. These numbers are roughly representative of the Dutch population.
The self-change we included to check for consistency was correctly valued at 0 (i.e. no
improvement) by 225 (32%) respondents; the non-zero valuations were evenly distributed over the
interval from 0 to 1.526. The self-change was rated as less valuable than the other two improvements
on the same valuation screen by 456 (64%) respondents.
Analysis within the standard EQ5D tariff model
Table 1 shows the estimated parameters of the EQ5D tariff model, for the four conditions of the
study. Most importantly, in all four conditions the estimated coefficients for both interaction terms
were significantly negative (all P < 0.001). The negative interaction terms imply that the
corresponding utility functions all have increasing marginal utility, considering health improvement
more valuable in good health than in poor health.
Few differences were seen between the estimated parameters for the personal and the policy
perspective, neither for the state format (P ≤ 0.05 for 3 out of 12 parameters) nor for the change
format (P ≤ 0.05 for 3 out of 12 parameters). For the state format, the parameters for the first
interaction term and for the pain/discomfort domain were significantly different (β2 at P = 0.001,
α42 at P = 0.04 and α43 at P = 0.04). For the change format, the parameters for the pain/discomfort
domain and the level-3 parameter for the anxiety/depression domain were significantly different
(α42 at P = 0.01, α43 at P < 0.001, and α53 at P = 0.03).
The estimates for the state and the change format differed more often, both in the personal
perspective (P ≤ 0.05 for 8 out of 12 parameters) and in the policy perspective (P ≤ 0.05 for 5 out of
12 parameters). For the state format, the estimated level-3 coefficients were overall about double the
size of the level-2 coefficients (α.3 ≈ 2 × α.2), suggesting that the impact of extreme problems is
considered about twice as large as the impact of some problems. For the change format, the level-3
coefficients were more similar in size to the level-2 coefficients (α.3 ≈ α.2), and the worse value for
level 3 is mostly reflected by the second interaction term (β3), which is larger than for the state format.
Analysis within the disaggregated model
Table 2 shows the estimated parameters using the disaggregated model, for the four conditions of the
study. Most importantly, of the 60 estimated δ coefficients, 40 were significantly negative and none
were significantly positive. The negative δ coefficients imply that the value of the changes decreases
with severity, so health improvement is more valuable in good health than in poor health.
Few differences were seen between the estimated parameters for the personal and the policy
perspective, neither for the state format (P ≤ 0.05 for 1 out of 30 parameters) nor for the change
format (P ≤ 0.05 for 2 out of 30 parameters). For the state format, the γ parameter for the 3→1
change on mobility was significantly different (P = 0.004), and for the change format the γ and δ
parameters for the 3→2 change on mobility were significantly different (P = 0.05 and P = 0.01, respectively).
Table 2: Estimated parameters for the disaggregated model, in the four conditions
(Estimated value of the change is γ + δ × Severity).

| Transition | Domain | Personal (n=383), state format: γ | Personal (n=383), state format: δ | Personal (n=383), change format: γ | Personal (n=383), change format: δ | Policy (n=327), state format: γ | Policy (n=327), state format: δ | Policy (n=327), change format: γ | Policy (n=327), change format: δ |
|---|---|---|---|---|---|---|---|---|---|
| 2→1 | Mobility | 0.34 ‡ | -0.32 † | 0.47 ‡ | -0.56 ‡ | 0.33 ‡ | -0.47 ‡ | 0.43 ‡ | -0.66 ‡ |
| 2→1 | Self-care | 0.28 ‡ | -0.34 ‡ | 0.45 ‡ | -0.71 ‡ | 0.26 ‡ | -0.29 † | 0.36 ‡ | -0.54 ‡ |
| 2→1 | Usual activities | 0.26 ‡ | -0.10 | 0.50 ‡ | -0.79 ‡ | 0.15 ‡ | 0.14 | 0.31 ‡ | -0.35 † |
| 2→1 | Pain/discomfort | 0.31 ‡ | -0.36 ‡ | 0.44 ‡ | -0.37 ‡ | 0.26 ‡ | -0.36 ‡ | 0.43 ‡ | -0.45 ‡ |
| 2→1 | Anxiety/depression | 0.34 ‡ | -0.46 ‡ | 0.42 ‡ | -0.55 ‡ | 0.33 ‡ | -0.40 ‡ | 0.36 ‡ | -0.43 ‡ |
| 3→2 | Mobility | 0.15 | -0.07 | 0.45 ‡ | -0.21 | 0.08 | 0.14 | 0.72 ‡ | -0.77 ‡ |
| 3→2 | Self-care | 0.01 | 0.21 | 0.47 † | -0.34 | 0.16 | -0.05 | 0.51 † | -0.62 † |
| 3→2 | Usual activities | 0.28 ‡ | -0.32 * | 0.42 ‡ | -0.35 † | 0.16 | -0.05 | 0.45 ‡ | -0.38 ‡ |
| 3→2 | Pain/discomfort | 0.43 ‡ | -0.44 * | 0.47 ‡ | -0.33 † | 0.33 ‡ | -0.38 * | 0.37 ‡ | -0.35 † |
| 3→2 | Anxiety/depression | 0.19 * | -0.21 | 0.33 ‡ | -0.33 † | 0.14 | -0.04 | 0.26 ‡ | -0.15 |
| 3→1 | Mobility | 0.46 ‡ | -0.18 | 0.43 ‡ | -0.22 | 0.09 | 0.22 | 0.44 ‡ | -0.30 † |
| 3→1 | Self-care | 0.15 | 0.28 | 0.45 ‡ | -0.32 † | 0.08 | 0.25 | 0.27 ‡ | -0.20 |
| 3→1 | Usual activities | 0.34 ‡ | -0.13 | 0.63 ‡ | -0.74 ‡ | 0.29 ‡ | 0.03 | 0.51 ‡ | -0.43 † |
| 3→1 | Pain/discomfort | 0.58 ‡ | -0.62 ‡ | 0.57 ‡ | -0.46 ‡ | 0.55 ‡ | -0.49 ‡ | 0.48 ‡ | -0.27 ‡ |
| 3→1 | Anxiety/depression | 0.54 ‡ | -0.50 ‡ | 0.59 ‡ | -0.55 ‡ | 0.44 ‡ | -0.38 ‡ | 0.50 ‡ | -0.45 ‡ |

* P-value ≤ 0.05; † P-value ≤ 0.01; ‡ P-value ≤ 0.001
The estimates for the state and the change format differed more often, both in the personal
perspective (P ≤ 0.05 for 18 out of 30 parameters) and in the policy perspective (P ≤ 0.05 for 18 out
of 30 parameters).
Direct preference question
When asked directly from the personal perspective, 72% of the respondents preferred equally
effective treatment in a situation where they were in rather good health, over a situation where they
were in poor health. In the policy perspective, 86% of the respondents preferred giving treatment to a
patient group in rather good health, over a patient group in poor health. The percentages differed by
perspective (72% versus 86%, P < 0.001), but in both perspectives more respondents preferred
treatment in rather good health than in poor health (72% versus 28% and 86% versus 14%, both
P < 0.001).
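The within-perspective comparison (for example 72% versus 28%) is a test of a single proportion against 50%. A minimal sketch of an exact two-sided binomial test is given below. The counts are NOT reported in the paper: they are approximate reconstructions from the reported percentages and group sizes (276 of 383 is about 72%, 281 of 327 is about 86%), assuming every respondent in each group answered the question. The function name is hypothetical, and the original analysis was run in Stata, not with this code.

```python
from math import comb


def binomial_two_sided_p(successes, n):
    """Exact two-sided binomial test of H0: p = 0.5.

    Under this symmetric null, the two-sided P-value is twice the smaller tail,
    capped at 1.
    """
    tail = min(successes, n - successes)
    smaller_tail = sum(comb(n, k) for k in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * smaller_tail)


# Approximate, reconstructed counts (assumption; not reported in the paper).
p_personal = binomial_two_sided_p(276, 383)  # about 72% preferring treatment in good health
p_policy = binomial_two_sided_p(281, 327)    # about 86% preferring treatment in good health

print(p_personal < 0.001, p_policy < 0.001)
```

With these reconstructed counts both P-values fall far below 0.001, in line with the reported result. The comparison between the two perspectives (72% versus 86%) is a separate two-group test and is not covered by this sketch.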
Discussion
The research questions of this study were whether the change format and the policy
perspective invoke diminishing marginal utility, whereas the state format and the personal
perspective invoke increasing marginal utility. These questions are answered negatively: all
conditions invoke the same increasing marginal utility, assigning more value to health
improvement for those in better health. This consistent finding was true even for the health
changes valued from the policy perspective, which we designed to resemble the research on
distributive justice. As an illustration, Table 3 shows the value of an extreme improvement on
mobility, depending on the overall health of the person. For the different conditions and analyses, the
same improvement on mobility is consistently more valuable for an otherwise healthy patient than
for an otherwise extremely ill patient.
Table 3: Value of an extreme improvement on mobility, depending on the levels of the other domains

| Improvement | Formula | Dutch EQ5D tariff * | Personal perspective, state format | Policy perspective, change format |
|---|---|---|---|---|
| Within the standard EQ5D tariff model † | | | | |
| 31111 → 11111 | α13 + β2 + β3 | 0.47 | 0.61 | 0.38 |
| 32222 → 12222 | α13 + β3 | 0.40 | 0.30 | 0.21 |
| 33333 → 13333 | α13 | 0.16 | 0.16 | 0.04 |
| Within the disaggregated model ‡ | | | | |
| 31111 → 11111 | γ1 + 0.2 δ1 | | 0.42 | 0.38 |
| 32222 → 12222 | γ1 + 0.6 δ1 | | 0.35 | 0.26 |
| 33333 → 13333 | γ1 + 1.0 δ1 | | 0.28 | 0.14 |

* According to [6], also shown in Figure 3
† According to the parameters presented in Table 1
‡ According to the parameters for the 11-th transition type presented in Table 2

Implications
In the standard health economic cost-per-QALY framework, larger health improvements are allowed
to have higher costs. For example, costs can be acceptable up to 40,000 euro times the QALY gain.
In this approach, patients in worse health may have a larger scope for improving their health, and in
that case higher costs are acceptable. Using differential cost-effectiveness thresholds goes a step
further. Not only are larger health gains allowed to be more expensive, but also similar health gains
are allowed to be more expensive for those with more severe disease. Such an approach with
differential thresholds has been advocated in the Netherlands (Figure 1) and several other countries
[1,2]. Somewhat similarly, the UK’s National Institute for Health and Care Excellence (NICE)
accepts higher costs for patients nearing the end of their lives [16]. The use of differential thresholds
has been based on public debate and on research on distributive justice. We were unable to replicate
this line of research. Like research on utility tariffs, our study consistently showed the opposite
result, that health improvement is considered more valuable in good health.
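The difference between a fixed and a differential threshold can be made concrete with a small sketch. The 40,000 euro per QALY benchmark is the example used above, and the 80,000 euro maximum for very severe conditions is the figure mentioned in the executive summary. The lower severity tiers in `DIFFERENTIAL_THRESHOLDS`, the example cost and QALY gain, and the function names are hypothetical placeholders, not values taken from any guideline.

```python
# Illustrative sketch of fixed versus differential cost-per-QALY thresholds.
FIXED_THRESHOLD = 40_000  # euro per QALY, the example benchmark from the text

# Hypothetical severity-dependent thresholds (euro per QALY); only the
# 80,000 maximum for very severe conditions is mentioned in the paper.
DIFFERENTIAL_THRESHOLDS = {"mild": 20_000, "moderate": 40_000, "severe": 80_000}

def max_acceptable_cost(qaly_gain, threshold):
    """Maximum acceptable cost is the threshold times the QALY gain."""
    return threshold * qaly_gain

def is_cost_effective(cost, qaly_gain, threshold):
    """An intervention passes if its cost does not exceed that maximum."""
    return cost <= max_acceptable_cost(qaly_gain, threshold)

qaly_gain = 0.5   # hypothetical: same health gain for every patient group
cost = 25_000     # hypothetical: same cost for every patient group

print("fixed:", is_cost_effective(cost, qaly_gain, FIXED_THRESHOLD))
for severity, threshold in DIFFERENTIAL_THRESHOLDS.items():
    print(severity, is_cost_effective(cost, qaly_gain, threshold))
```

Under the fixed threshold the same gain at the same cost is judged identically for everyone. Under the differential thresholds it is accepted only for the most severe group, which is exactly the implicit claim that the gain is worth more there.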
The utilitarian cost-per-QALY approach has often been criticized for ignoring equity concerns [17].
Differential cost-per-QALY thresholds can be seen as a way to unify utilitarian and egalitarian
principles in a single framework. However, it only addresses inequity in health and there may be
more effective ways to reduce health inequity than through health care provision [18]. The current
study suggests that the question whether an intervention is expensive should be considered on its own,
in a framework with fixed cost-per-QALY benchmarks: health benefits should receive the same
weight regardless of other characteristics of the people receiving the health benefit [19]. Such fixed
benchmarks do not demand strict decision making: an intervention that is economically expensive
may still be justified by scientific or non-health benefits, by priority for vulnerable populations,
concerns about solidarity, equity, fairness and need, or by other ethical considerations [17,18,20,21].
However, we argue that currently there is insufficient evidence for a unifying theory that does justice
to the complexity of such non-economic considerations.
Interpretation
According to the results of our study, a health improvement is more valuable for those who are
healthier. This is different from the value derived from, for example, money: a gain of a thousand
euro is probably less valuable for those who are wealthier than for a poorer individual. Why is health
different from wealth? One explanation may lie in the instrumental value of health: health is not
(only) an intrinsic value in itself, but is instrumental, for example, in allowing us to participate in
society. Therefore, health may only be important above a certain threshold. Improvements that
remain below the threshold may be less important because they are insufficient to improve our
participation in society. Another explanation for the increasing marginal utility could be that our
aspiration level for health is so high that any deviation from optimal health is unacceptable. As a
result, the first deterioration may be much more important than further deteriorations. This all-or-nothing
dichotomisation may be stronger among the general population than among patients who
already have a less-than-optimal reference point. A similar, more general, third explanation is
provided by prospect theory, in which the sensitivity for a change diminishes with the distance from
the reference point [22-25]. Prospect theory is best known for assigning more value to losses than to
gains, but it also assigns more value to changes near the reference point than to changes further away
from the reference point. This results in an S-shaped utility function that is concave for gains
(diminishing marginal utility) and convex for losses (increasing marginal utility) [23]. As our
respondents were from the general public (which is generally preferred for economic appraisal),
most respondents will have experienced the presented health scenarios as losses, resulting in the
increasing marginal utility we found.
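The diminishing-sensitivity argument can be illustrated with a power value function of the kind used in cumulative prospect theory [22]. In this sketch the exponent 0.88 and the loss-aversion coefficient 2.25 are the commonly quoted estimates from that literature. Treating perfect health as the reference point (0) and health problems as losses on a 0-to-1 scale, as well as the function names, are assumptions made only for illustration.

```python
# Illustrative prospect-theory value function with diminishing sensitivity.
ALPHA = 0.88   # curvature, assumed equal for gains and losses
LAMBDA = 2.25  # loss-aversion coefficient

def value(x):
    """S-shaped value of an outcome x relative to the reference point 0."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * (-x) ** ALPHA

def value_of_improvement(start, end):
    """Value gained by moving from outcome `start` to outcome `end`."""
    return value(end) - value(start)

# The same 0.2 improvement, experienced in the domain of losses:
near_reference = value_of_improvement(-0.2, 0.0)       # otherwise healthy
far_from_reference = value_of_improvement(-1.0, -0.8)  # otherwise very ill

print(f"near reference point: {near_reference:.3f}")
print(f"far from reference:   {far_from_reference:.3f}")
```

Because the function is convex for losses, the improvement that restores the reference point is valued more than an equally large improvement far below it, which mirrors the increasing marginal utility reported here.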
Apart from the nature of the marginal utility of health, it is also important to speculate on why we
were unable to reproduce the results from the research on distributive justice. One explanation could
be that we failed to capture essential elements from that research. We considered the change format
and the policy perspective to be the crucial elements that we adopted in our study. By being more
explicit about the nature of health changes, we may have suppressed other ethical considerations.
Also, it was not straightforward how inequity and how health improvements should be represented
(the vertical axis of Figure 2). We have chosen to present them by severity and change in terms of
the EQ5D classification system, i.e. current problems on quality of life domains. Alternative
representations for inequity could be in terms of fair innings, prognosis or proportional shortfall
[3,4,17], but we think that combinations of quality of life with time durations are too difficult for an
internet questionnaire. An alternative representation for the health improvement could have been in
terms of utility. The question then is which patient would be preferred to receive the similar utility
improvement. We have not used this presentation, because we consider it internally inconsistent:
utility scales should have the interval property, so by definition equal numerical utility differences
represent equally strong preference differences, regardless of their location on the utility scale
[26,27].
A second explanation for why we were unable to reproduce the results from research on distributive
justice could be in the visual analogue scales we used. VAS measurements are relatively
straightforward, but comparing and valuing health changes is an unusual and difficult task. Many of
our respondents did not give a correct low rating to the self-change we included as a consistency
check. Also, the scaling properties of VAS valuations are important. In our study we have basically
assumed that the scales have the interval property: that they are anchored at 0 and 1 (for dead and
perfect health) and that equally large differences across the scale are equally valuable. This
assumption has been questioned and VAS valuations for health states are often transformed using a
power transformation to improve their interval properties [28,29]. Clearly the interval assumption
and transformations are closely related to diminishing and increasing marginal utility.
Transformation of the VAS valuations in our study could have an impact on the results, but would
require further research on their validity for the scales we used to value health changes. Although the
VAS measurements may not have the proper scale properties, their results were confirmed by the
direct preference questions (which asked for a choice instead of valuations). The Time Trade-Off
and the Standard Gamble measurement tools are believed to have better scaling properties than the
VAS, but we decided not to use those because we considered it important to use a similar
measurement tool to value states and changes and we were unable to design proper Time Trade-Off
or Standard Gamble procedures that could be used to value health changes.
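How a power transformation interacts with marginal utility can be seen in a small sketch. The form u = 1 - (1 - v)^a is one commonly used transformation of VAS values. The exponent below is a hypothetical placeholder, not an estimate from this study or from [28,29], and the function names are illustrative.

```python
# Illustrative power transformation of VAS valuations
# (0 = dead, 1 = perfect health).
EXPONENT = 1.6  # hypothetical value, chosen only for illustration

def transform_vas(v, exponent=EXPONENT):
    """Map a raw VAS value v in [0, 1] to a transformed utility."""
    return 1.0 - (1.0 - v) ** exponent

def transformed_gain(v_from, v_to, exponent=EXPONENT):
    """Transformed value of an improvement from v_from to v_to."""
    return transform_vas(v_to, exponent) - transform_vas(v_from, exponent)

# Two improvements that are equally large on the raw VAS scale:
gain_in_poor_health = transformed_gain(0.2, 0.4)
gain_in_good_health = transformed_gain(0.6, 0.8)

print(f"raw gain 0.2 in poor health -> {gain_in_poor_health:.3f}")
print(f"raw gain 0.2 in good health -> {gain_in_good_health:.3f}")
```

With an exponent above 1 the transformation is concave, so equal raw differences become larger in poor health; an exponent below 1 has the opposite effect. The choice of transformation therefore bears directly on whether marginal utility appears diminishing or increasing.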
Finally, a third explanation for why we were unable to reproduce the research assigning higher value
to changes at lower health could be found in an analogy with similar research on orphan drugs and
on end-of-life treatment. Despite strong general support for statements expressing a desire for equal
treatment rights for patients with rare diseases, researchers found little evidence that a societal
preference for rarity exists if treatment of patients with rare diseases is at the expense of treatment of
those with common diseases [30,31]. Other research calls into question whether a policy of giving
higher priority to end-of-life treatments than to other types of treatments is supported by the public,
particularly if the health gains offered by the treatments being ‘de-prioritised’ are larger than those
offered by the end-of-life treatments [32,33]. Similarly, respondents may report a socially desirable
preference for improvement for those in poorer health, but this consideration may be of little
importance when weighed against other criteria.
Conclusion
Regardless of which explanation is valid, our study consistently shows that health improvement is
considered more valuable in good health. These results question the robustness of the foundations
for using differential cost-effectiveness thresholds.
References
1. Council for Public Health and Health Care. Sensible and sustainable care [In Dutch]. The
Hague: Council for Public Health and Health Care, 2006.
2. Paris V, Belloni A. Value in Pharmaceutical Pricing. OECD Health Working Papers No. 63.
OECD Publishing, 2013.
3. Stolk EA, Pickee SJ, Ament AH, Busschbach JJ. Equity in health care prioritisation: an
empirical inquiry into social value. Health Policy 2005; 74: 343-355.
4. Nord E. Concerns for the worse off: fair innings versus severity. Soc Sci Med. 2005; 60: 257-263.
5. Dolan P. Modeling valuations for EuroQol health states. Med Care. 1997; 35: 1095-1108.
6. Lamers LM, Stalmeier PF, McDonnell J, Krabbe PF, Van Busschbach JJ. Measuring quality
of life in economic evaluations: the Dutch EQ-5D tariff [In Dutch].
Ned Tijdschr Geneeskd. 2005; 149: 1574-1578.
7. Xie F, Gaebel K, Perampaladas K, Doble B, Pullenayegum E. Comparing EQ-5D Valuation
Studies: A Systematic Review and Methodological Reporting Checklist. Med Decis
Making. 2014; 34: 8-20.
8. Greiner W, Weijnen T, Nieuwenhuizen M, Oppe S, Badia X, Busschbach J, Buxton M, Dolan
P, Kind P, Krabbe P, Ohinmaa A, Parkin D, Roset M, Sintonen H, Tsuchiya A, de
Charro F. A single European currency for EQ-5D health states. Results from a six-country
study. Eur J Health Econ. 2003; 4: 222-231.
9. Brazier J, Roberts J, Deverill M. The estimation of a preference-based measure of health from
the SF-36. J Health Econ. 2002; 21: 271-292.
10. Feeny D, Furlong W, Torrance GW, Goldsmith CH, Zhu Z, DePauw S, Denton M, Boyle
M. Multiattribute and single-attribute utility functions for the health utilities index
mark 3 system. Med Care. 2002; 40: 113-128.
11. Stiggelbout AM, De Vogel-Voogt E. Health state utilities: a framework for studying the gap
between the imagined and the real. Value Health. 2008; 11: 76-87.
12. Kahneman D, Tversky A. Prospect Theory: An Analysis of Decision under Risk.
Econometrica 1979; 47: 263-292 (page 277).
13. Altman DG. Practical statistics for medical research. London: Chapman & Hall, 1991 (page
349).
14. Macran S, Kind P. Valuing EQ-5D health states using a modified MVH protocol:
preliminary results. In 16th Plenary Meeting of the EuroQol Group, Sitges, 6–9
November 1999. Discussion Papers, Badia X, Herdman M, Roset M (eds). Institut de
Salut Publica de Catalunya: Spain, 2000; 205–240.
15. Tsuchiya A, Ikeda S, Ikegami N, Nishimura S, Sakai I, Fukuda T, Hamashima C, Hisashige
A, Tamura M. Estimating an EQ-5D population value set: the case of Japan. Health
Econ. 2002; 11: 341-353.
16. Chalkidou K. Evidence and values: paying for end-of-life drugs in the British NHS. Health
Econ Policy Law. 2012; 7: 393-409.
17. Wagstaff A. QALYs and the equity-efficiency trade-off. J Health Econ. 1991; 10: 21-41.
18. Powers M, Faden R. Inequalities in health, inequalities in health care: four generations of
discussion about justice and cost-effectiveness analysis. Kennedy Inst Ethics J. 2000;
10: 109-127.
19. National Institute for Health and Care Excellence (NICE). Guide to the methods of
technology appraisal. London: National Institute for Health and Care Excellence, 2013.
20. Neumann PJ, Weinstein MC. Legislating against use of cost-effectiveness information. N
Engl J Med 2010; 363: 1495-1497.
21. Cookson R. Can the NICE "End-of-Life Premium" Be Given a Coherent Ethical
Justification? J Health Polit Policy Law. 2013; 38: 1129-1148.
22. Tversky A, Kahneman D. Advances in Prospect Theory: Cumulative Representation of
Uncertainty. J Risk Uncertainty 1992; 5: 297-323.
23. Treadwell JR, Lenert LA. Health values and prospect theory. Med Decis Making 1999; 19:
344-352.
24. Hill SA, Neilson W. Inequality Aversion and Diminishing Sensitivity. J Econ Psych 2007;
28: 143-153.
25. Winter L, Parker B. Current Health and Preferences for Life-Prolonging Treatments: An
Application of Prospect Theory to End-of-Life Decision Making. Soc Sci Med. 2007;
65: 1695-1707.
26. Gold MR, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in Health and
Medicine. New York: Oxford University Press, 1996 (pages 90-91, 94 and 99).
27. Torrance GW. Utility measurement in healthcare: the things I never got to.
Pharmacoeconomics. 2006; 24: 1069-1078.
28. Stiggelbout AM, Eijkemans MJ, Kiebert GM, Kievit J, Leer JW, De Haes HJ. The ‘utility’
of the visual analog scale in medical decision making and technology assessment. Is it
an alternative to the time trade-off? Int J Technol Assess Health Care 1996; 12: 291-298.
29. Torrance GW, Feeny D, Furlong W. Visual analog scales: do they have a role in the
measurement of preferences for health states? Med Decis Making. 2001; 21: 329-334.
30. Desser AS, Gyrd-Hansen D, Olsen JA, Grepperud S, Kristiansen IS. Societal views on
orphan drugs: cross sectional survey of Norwegians aged 40 to 67. BMJ 2010; 341:
c4715.
31. Desser AS. Prioritizing treatment of rare diseases: a survey of preferences of Norwegian
doctors. Soc Sci Med. 2013; 94: 56-62.
32. Shah K, Devlin N. Understanding social preferences regarding the prioritisation of
treatments addressing unmet need and severity. Research Paper 12/05. London: Office
of Health Economics, 2012.
33. Shah K, Tsuchiya A, Risa Hole A, Wailoo A. Valuing health at the end of life: A
stated preference discrete choice experiment. NICE Decision Support Unit report.
Sheffield: Decision Support Unit, 2012.
APPENDIX A: Visual Analogue Scale used to value health states,
from the personal perspective
APPENDIX B: Visual Analogue Scale used to value health changes,
from the policy perspective
APPENDIX C: Direct preference questions for
treatment in poor or in rather good health, depending on perspective
Consider two situations. In one situation you are in rather good health. In the other situation you are in poor health. You can get treatment in one of these two situations. Treatment is equally effective in both situations and provides a fair-sized health improvement. If you had to choose, in which situation would you prefer to receive this treatment for yourself?
   When I am in poor health
   When I am in rather good health

Consider two groups of patients. One patient group is in rather good health. The other patient group is in poor health. You can give treatment to one of these two patient groups. Treatment is equally effective in both patient groups and provides a fair-sized health improvement. If you had to choose, which patient group would you prefer to give this treatment?
   The patients in poor health
   The patients in rather good health