preference-based measures in economic evaluation in health care

P1: FDJ
March 31, 2000
12:0
Annual Reviews
CHAP-24
Annu. Rev. Public Health. 2000. 21:587–611
Copyright © 2000 by Annual Reviews. All rights reserved
PREFERENCE-BASED MEASURES IN ECONOMIC
EVALUATION IN HEALTH CARE
Peter J. Neumann, Sue J. Goldie, and
Milton C. Weinstein
Program on the Economic Evaluation of Medical Technology, Center
for Risk Analysis, Harvard School of Public Health, Boston, Massachusetts 02115;
e-mail: [email protected]
Key Words utilities, preferences, quality-adjusted life years gained, QALYS,
cost-utility analysis
■ Abstract Estimating preferences for states of health has been an active area of
research in recent years. Unlike psychophysical approaches, which discriminate levels of health status, preference-based approaches incorporate values or utilities for
health outcomes and can be used in cost-effectiveness analyses to aid resource allocation decisions. This chapter considers issues and controversies involved in using
preference-based measures in economic evaluation in health care, with a particular emphasis on cost-utility analysis and the estimation of quality-adjusted life years. Topics
considered include techniques for measuring preferences, the use of preference-based
classification systems, the relationship between patient and community preferences,
methods for obtaining utilities from clinical trials, mapping health status to health
utilities, the development of “off-the-shelf” preference weights, and proposed alternatives to quality-adjusted life years. We also consider applications of cost-utility
analyses to public health interventions. Although cost-utility analyses have become
more popular recently, many challenges remain for the field. Widespread acceptance
of the methodology likely awaits more consensus on measurement techniques, as well
as educational efforts in the public health and medical communities on the usefulness
of the approach.
INTRODUCTION
Health policy researchers and analysts have long been interested in the question
of how people value health. The issue lies at the heart of attempts to assess the
relative worth of different health and medical interventions. If we know the value
people attach to the health improvement they receive from different interventions,
it could help to determine how to provide most efficiently more of the outcomes
that people desire and fewer that they do not (47, 117).
In this chapter we consider preference-based measures in economic evaluation in health care. We start by reviewing how preferences are incorporated into
economic evaluations, focusing on cost-utility analysis, which has emerged as the
recommended practice for the field (134). [Some analysts use the generic term
“cost-effectiveness analyses” to include all forms of analyses measuring costs per
unit of health effect (47). Here we follow the practice used by Drummond et al (28)
by referring to analyses measuring cost per quality-adjusted life year (QALY)
as “cost-utility analyses.”] Next, we describe the concept of QALYs, followed
by a discussion of emerging topics in preference estimation. The final section
considers applications to public health interventions.
INCORPORATING PREFERENCES INTO
ECONOMIC EVALUATION
Cost-Benefit Analysis
Health policy analysts traditionally used cost-benefit analysis (CBA) to assess the
value of health programs (136, 137). In CBA, analysts estimate the net social
benefit of a program or intervention as the incremental benefit of a program minus
the incremental cost. All costs and benefits are measured in monetary units (e.g.
dollars). The approach is useful because it leads to a simple decision-making rule:
If a program’s incremental benefits exceed its incremental costs (i.e. its net benefit is positive), then it should be adopted. However, CBA also raises measurement difficulties, because it requires the monetary
valuation of health benefits.
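The decision rule can be illustrated with a minimal Python sketch. The function names and dollar figures are hypothetical and serve only to show the arithmetic; they are not drawn from any study cited here.

```python
def net_benefit(incremental_benefit, incremental_cost):
    """Net social benefit, with both inputs in monetary units (e.g. dollars)."""
    return incremental_benefit - incremental_cost


def should_adopt(incremental_benefit, incremental_cost):
    """CBA decision rule: adopt the program when its net benefit is positive."""
    return net_benefit(incremental_benefit, incremental_cost) > 0


# Hypothetical program: $1.2 million in monetized benefits, $0.9 million in costs.
program_net_benefit = net_benefit(1_200_000, 900_000)
print(program_net_benefit, should_adopt(1_200_000, 900_000))
```

The hard part in practice is not this subtraction but obtaining a defensible monetary value for the benefit term.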
Early on, cost-benefit analysts tended to quantify health benefits with a “human
capital” approach. That is, the value of reduced health was measured as the lost
earnings of affected individuals. The advantage of the human capital approach was
that it approximated value as the “productive potential” to society that would be
lost through morbidity and mortality. It also permitted a relatively straightforward
calculation. The disadvantage, as critics such as Schelling (104) and Mishan (77)
noted, is that the approach has no basis in economic theory—because it ignores
underlying consumer preferences and implies that unproductive periods such as
leisure time and retirement are without value.
These observers noted that a superior approach would consider the fact that
consumers make tradeoffs between health and other goods and services. People
don’t spend all of their money to relieve their symptoms or to reduce their risk of
death; instead, they consume to the point at which the improvement justifies the
costs. Therefore, health should be valued by determining how much individuals
are willing to pay for it. Unlike human capital, willingness-to-pay measures are
preference based. The metric is monetary, which allows tradeoffs with costs and
nonhealth consequences.
A number of researchers have attempted to measure the value of health by
assessing what people are willing to pay for specific health benefits. Economists
typically measure the value of commodities by examining the prices of goods and
services bought and sold in the marketplace. But since private markets for health
benefits do not generally exist, it is difficult and often impossible to measure the
value of health by appealing to market data. Therefore, researchers have turned to
other measures. Some have taken “revealed preference” approaches by imputing
willingness to pay from comparable market prices or wages [e.g. the willingness
to accept occupational risk could be valued as the incremental wage paid to such
workers (128)]. The problem is that prices and wages may not be truly comparable
[i.e. unique properties of risk and benefit associated with a job may be confounded
by the magnitude of the risk itself (93)].
Others have used direct surveys of consumers, called willingness-to-pay or
“contingent valuation” surveys, because responses are contingent on a hypothetical market for the good or service of interest (1, 4, 59, 83, 89, 110, 115). The advantages of the approach are that it is grounded explicitly in principles of welfare
economics and provides a means to quantify the benefits of difficult-to-estimate
factors such as the psychic benefits of symptom relief. The disadvantage is that
researchers have often found that the method does not produce reliable estimates
(111, 112).
Cost-Effectiveness Analysis
In recent years, cost-effectiveness analysis has emerged as a favored analytic technique for economic evaluation in health care.1 A major appeal of cost effectiveness
over cost-benefit analysis is that it allows analysts to quantify health benefits in
terms of health rather than in monetary units.
Cost-effectiveness analyses show the relationship between the net resources
used (costs) and the net health benefits achieved (effects) for a specific intervention compared with a specific alternative strategy. Cost-effectiveness analyses
involve comparisons between two alternatives or between the presence and absence of an intervention—the cost per effect (C/E) ratio reflects the difference in
an intervention’s costs divided by the difference in its health effectiveness (47). If
ratios are expressed in similar units, they can be compared to determine the most
efficient ways to furnish health benefits.
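As a rough illustration of the C/E ratio described above, the following Python sketch divides the difference in costs by the difference in health effects. The function name and the input values are hypothetical.

```python
def ce_ratio(cost_new, effect_new, cost_alt, effect_alt):
    """Cost per unit of health effect for an intervention versus an alternative."""
    delta_cost = cost_new - cost_alt
    delta_effect = effect_new - effect_alt
    if delta_effect == 0:
        raise ValueError("No difference in effectiveness; the ratio is undefined.")
    return delta_cost / delta_effect


# Hypothetical inputs: the new strategy costs $50,000 and yields 120 units of
# health effect; the alternative costs $20,000 and yields 100 units.
ratio = ce_ratio(50_000, 120, 20_000, 100)
print(ratio)  # 30,000 / 20 = 1500.0 dollars per additional unit of effect
```

Ratios computed this way are comparable across interventions only when the effect is measured in the same unit.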
Many cost-effectiveness analysts have expressed health benefits in terms of
intermediate outcomes specific to the treatment and disease under investigation.
For example, a researcher studying alternative strategies to prevent cancer might
evaluate and compare each strategy by the costs incurred per cancer case prevented.
The approach is advantageous in that it focuses narrowly on the clinical problem
and is familiar to the clinicians who treat the disease. A disadvantage is that it
does not permit comparisons of treatments for cancer with interventions for other
conditions. For example, the cost per case of cervical cancer detected cannot
be readily compared with the cost per case of Alzheimer’s disease detected. To
inform societal decisions about which of many competing interventions produces
the greatest gain in health for the resources expended, we need to express the
numerators and denominators of C/E ratios of diverse medical interventions in
similar terms.
1 Many analysts have also conducted cost-identification analyses (which compute the net
costs associated with an intervention but ignore health outcomes), cost-consequence analyses (which compute and list components of incremental costs and consequences of alternative programs without any attempt to aggregate results into a single metric), and
cost-minimization analyses (which compare the costs of two interventions in which the
outcomes are presumed to be equal). However, none of these methods incorporate preferences explicitly.
One way to standardize cost-effectiveness ratios is to measure the health effects
of interventions in terms of life expectancy—the cost-effectiveness ratio for each
alternative would reflect the costs per year of life gained. A limitation of this
approach is that life expectancy alone does not take into account the quality of
additional time that is gained (e.g. an added month of life with disability or pain is
valued the same as an added month without disability or pain). Ideally, an analysis
would capture such effects.
The widely recommended approach is to measure health outcomes in terms
of QALYs to incorporate both the prolongation and quality of life (134). The
advantages of QALYs are twofold: they capture in a single measure gains from
both reduced morbidity and reduced mortality, and they incorporate the value or
preferences people have for different outcomes (28).
THE CONCEPT OF QALYS
QALYs represent the benefit of a health intervention in terms of time in a series
of quality-weighted health states, in which the quality weights reflect the desirability of living in the state, typically from “perfect” health (weighted 1.0) to dead
(weighted 0.0) (91, 117, 135). Once the quality weights are obtained for each state,
they are multiplied by the time spent in the state; these products are summed to
obtain the total number of QALYs.
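The calculation just described can be sketched in a few lines of Python. The health profile below is hypothetical and chosen only to make the arithmetic easy to follow.

```python
def qalys(profile):
    """Sum of quality weight x time over a sequence of (weight, years) pairs."""
    return sum(weight * years for weight, years in profile)


# Hypothetical profile: 2 years in perfect health, 4 years at weight 0.5,
# and 4 years at weight 0.25.
profile = [(1.0, 2.0), (0.5, 4.0), (0.25, 4.0)]

life_years = sum(years for _, years in profile)
total_qalys = qalys(profile)
print(life_years, total_qalys)  # 10.0 life years, 2 + 2 + 1 = 5.0 QALYs
```

The same ten years of survival would count as ten QALYs only if every year were spent in perfect health.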
Researchers have used a number of techniques over the years to construct
the quality weights. One option is to use the standard-gamble and time-tradeoff
techniques, which have a sound theoretical basis in economic utility theory. These
methods involve asking respondents to value health states by explicitly considering
how much they would be willing to sacrifice to avoid being in a particular health
state. (A variation is the person tradeoff approach, which involves asking people
how many outcomes of one kind they consider equivalent to some quantity of
outcomes of another kind (87a).)
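A minimal sketch of how responses to these two choice-based techniques are conventionally converted into weights follows. The scoring formulas are the usual textbook ones for a chronic state preferred to dead and are an assumption here, since the text above does not spell them out; the function names and example responses are hypothetical.

```python
def time_tradeoff_weight(years_in_state, equivalent_healthy_years):
    """Time tradeoff: the respondent is indifferent between `years_in_state`
    in the health state and a shorter `equivalent_healthy_years` in perfect health."""
    if not 0 <= equivalent_healthy_years <= years_in_state:
        raise ValueError("Equivalent healthy years must lie between 0 and years_in_state.")
    return equivalent_healthy_years / years_in_state


def standard_gamble_weight(indifference_probability):
    """Standard gamble: the respondent is indifferent between the health state
    for certain and a gamble giving perfect health with this probability
    (and death otherwise). The weight equals that probability."""
    if not 0 <= indifference_probability <= 1:
        raise ValueError("Probability must lie between 0 and 1.")
    return indifference_probability


# Hypothetical responses for the same health state.
tto = time_tradeoff_weight(years_in_state=10, equivalent_healthy_years=7.5)
sg = standard_gamble_weight(0.75)
print(tto, sg)
```

In both cases the weight reflects what the respondent would give up, either time or certainty of survival, to avoid the state.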
Alternative elicitation techniques include rating scales and ratio scales. In
the former, respondents are asked to express the strength of their preferences for
particular health states by marking a point on a scale. The scale may be a visual-analog scale, which contains no internal markings (raters mark a point between two
anchor states, such as dead and perfect health), or a category-rating scale, which
is divided into discrete intervals. With a ratio scale, also called magnitude
estimation, respondents assess the disutility of states as multiples of the disutility
of a reference state.
As Drummond et al (28) note, people often use the terms “utility,” “value,” and
“preference” interchangeably. Preference is a general term to describe the desirability of a set of outcomes. Values and utilities are different types of preferences
that depend on the elicitation method; values are measured under conditions of
certainty (rating scale, time-tradeoff), and utilities are measured under conditions of uncertainty that satisfy certain axioms of expected utility theory (standard
gamble).
Cost-utility analysis has its roots in expected utility theory (129), which describes a normative model of rational decision making under conditions of uncertainty. Preference or quality weights elicited
with methods other than the standard gamble are thus not technically
utilities. But even QALYs built from standard-gamble preferences
are considered utilities only if certain restrictive assumptions hold.2 To
date, only a few studies have examined the empirical evidence for the descriptive
validity of QALYs as utilities, with mixed results (12, 63).
Regardless of the technique used to estimate them, the quality weights must meet
several conditions (28). First, they should be based on individuals’ preferences
for health states, as opposed to psychophysical (sometimes called psychometric)
methods, which provide numerical assessments to reflect individuals’ health status.
Psychometric approaches are designed to discriminate among levels of health
status, for example, the presence, frequency, or intensity of capabilities or feelings
(99). Such measures (e.g. the Medical Outcomes Study 36-item short-form health
survey) can be useful in measuring changes in health status over time, in predicting
future health outcomes, and in discriminating among individuals with different
diseases (99). But a limitation is that these measures do not necessarily reflect the
value that either patients or members of the general population place on the various
attributes of health being measured. That is, simply summing up the scores
on various health status scales does not ensure that a higher total corresponds to a state
that individuals themselves view as better (47). For example, two individuals without
the use of their legs might have the same numerical ranking on a psychometric
scale but value that health state very differently. In contrast, preference-based
approaches incorporate values or utilities for health outcomes and can be used in
cost-effectiveness analyses to aid resource allocation decisions (117).
A second condition is that the quality weights must be measured on an interval scale (133a),
that is, a scale on which equal intervals have an equivalent interpretation.
The reason is that cost-effectiveness analysis does not distinguish between, for example,
a gain from 0.1 to 0.2 and a gain from 0.6 to 0.7, but treats all numerically identical gains as
equal. Interval scaling means that comparisons of incremental gains in QALYs
are valid across programs or interventions. For example, an intervention that takes
a quadriplegic from a value of 0.1 to 0.3 is treated the same as an intervention that
takes a healthy person with back pain from 0.8 to 1.0.
2 These conditions include independence of preferences (utility scales for length of life and
quality of life can be specified independently rather than conditionally upon the level of the
other attribute); constant proportional tradeoffs between longevity and quality of life (one
would be willing to give up some fraction of one’s life years in order to improve the quality
of those years from one level to a preferred level, and that the fraction depends only on the
two quality levels and not on the length of life at the outset) and risk neutrality (utilities
are directly proportional to longevity for a fixed quality level) (28, 135). QALYs can be
represented by a more general, risk-adjusted model if the utility independence and constant
proportional tradeoff assumptions hold (95a).
A third condition for the quality weights is that they be anchored on measures
of perfect health and dead. Traditionally, perfect health has been given a value of
1 and dead a value of 0, both for convenience and because giving perfect health
a value of 1 means that QALYs reflect units of perfect health years (28). The
anchors are required if the weights are used in calculations of QALYs, because
they ensure that the absence of a year is equivalent to a year at zero weight (0.0
QALYs). Health states may also be judged to be “worse than dead” and take on
values less than zero.
Observers have noted that many cost-utility analyses use the upper anchor of
the scale to reflect only the absence of a particular health condition, ignoring the
fact that the average patient is still subject to chronic and acute conditions (37).
One way to correct for this is to use average age-specific health-related quality-of-life weights from population-based studies. In addition, others point out that
health-related quality of life of those whose lives have been saved or extended by
a health intervention may be influenced by age, gender, race, or socioeconomic
status (47). One way to address this is to conduct sensitivity analyses to indicate
explicitly how the results of analyses are affected by these characteristics (47).
A recent study of the cost effectiveness of a new drug to treat patients with
Alzheimer’s disease illustrates how the preference measures can be used (82a).
The study used a state-transition or Markov model to characterize the progression
of Alzheimer’s disease through different disease stages and residential settings
(Figure 1). Data from a clinical trial were used to simulate the paths of disease
progression for patients (reflected by the arrows in the diagram) with and without
treatment. If each state is assigned a cost and quality-of-life weight, the paths of
two cohorts (one with and one without the drug) can be compared in terms of
their costs and quality-adjusted life expectancy. The model could thus address the
question of whether a new drug, through its ability to slow cognitive deterioration
and thus forestall progression to more severe Alzheimer’s disease stages (and more
costly residential settings), would produce health care cost savings and/or quality-of-life improvements over the option of no treatment. To measure the preference
weights associated with each disease stage and setting, the authors administered
the Health Utilities Index Mark 2 (HUI2) to caregivers of Alzheimer’s patients.
The weights obtained are shown in Table 1. It is important to emphasize that these
weights were obtained using the HUI2 and that they should not be read as
an absolute or universal interpretation of preferences for the health states in
question. As noted below, other preference elicitation techniques or instruments
can yield very different weights.

Figure 1 An Alzheimer’s disease policy model. [From Neumann et al (82a), reprinted with
permission]

TABLE 1 Preference weights in Alzheimer’s disease (a)

Stage/setting        Patients    Caregivers
Mild
  Community            0.68        0.86
  Nursing home         0.71        0.86
Moderate
  Community            0.54        0.86
  Nursing home         0.48        0.88
Severe
  Community            0.37        0.86
  Nursing home         0.31        0.88

(a) Source: Neumann et al (82a).
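The state-transition logic described above can be sketched as a small cohort simulation in Python. This is a simplified illustration under stated assumptions, not the published model: it collapses residential settings, uses the community patient weights from Table 1, and relies on transition probabilities, costs, a drug effect, and a ten-year undiscounted horizon that are all hypothetical.

```python
STATES = ["mild", "moderate", "severe", "dead"]

# Patient weights for community settings from Table 1; dead is anchored at 0.
WEIGHTS = {"mild": 0.68, "moderate": 0.54, "severe": 0.37, "dead": 0.0}

# Hypothetical annual cost per person in each state (not from the study).
COSTS = {"mild": 10_000.0, "moderate": 20_000.0, "severe": 40_000.0, "dead": 0.0}


def transition_matrix(p_progress):
    """Hypothetical annual transition probabilities; every row sums to 1.

    p_progress is the yearly probability of moving from mild to moderate,
    the only parameter the illustrative drug is assumed to change.
    """
    return {
        "mild": {"mild": 0.95 - p_progress, "moderate": p_progress, "severe": 0.0, "dead": 0.05},
        "moderate": {"mild": 0.0, "moderate": 0.60, "severe": 0.30, "dead": 0.10},
        "severe": {"mild": 0.0, "moderate": 0.0, "severe": 0.80, "dead": 0.20},
        "dead": {"mild": 0.0, "moderate": 0.0, "severe": 0.0, "dead": 1.0},
    }


def run_cohort(matrix, drug_cost=0.0, cycles=10):
    """Return (total cost, total QALYs) per person over yearly cycles, undiscounted."""
    dist = {"mild": 1.0, "moderate": 0.0, "severe": 0.0, "dead": 0.0}
    total_cost = 0.0
    total_qalys = 0.0
    for _ in range(cycles):
        alive = 1.0 - dist["dead"]
        total_cost += sum(dist[s] * COSTS[s] for s in STATES) + alive * drug_cost
        total_qalys += sum(dist[s] * WEIGHTS[s] for s in STATES)
        # Advance the cohort by one cycle.
        dist = {t: sum(dist[s] * matrix[s][t] for s in STATES) for t in STATES}
    return total_cost, total_qalys


cost_none, qalys_none = run_cohort(transition_matrix(0.30))
cost_drug, qalys_drug = run_cohort(transition_matrix(0.20), drug_cost=1_500.0)

print(f"No treatment: cost={cost_none:,.0f}  QALYs={qalys_none:.2f}")
print(f"With drug:    cost={cost_drug:,.0f}  QALYs={qalys_drug:.2f}")
print(f"Incremental:  cost={cost_drug - cost_none:,.0f}  QALYs={qalys_drug - qalys_none:.2f}")
```

Swapping in weights from a different instrument changes only the `WEIGHTS` dictionary, which makes it easy to see how sensitive the results are to the choice of preference measure.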
KEY TOPICS IN PREFERENCE ESTIMATION
Estimation of preferences in economic evaluation continues to be a very active area
of research. Key topics include (a) techniques for measuring preferences, (b) the
use of preference-weighted health state classification systems, (c) the relationship
between patient and community preferences, (d) options for obtaining utilities
from clinical trials, (e) mapping health status to health utility measures, (f) off-the-shelf preference weights, and (g) alternatives to QALYs. Each of these topics
is considered briefly below. Note that this discussion is intended as a review of
selected topics. The interested reader is encouraged to explore the references in
more detail.
Techniques for Measuring Preference Weights
As noted, researchers have used a variety of methods to estimate preference
weights. Many experts prefer choice methods (e.g. standard-gamble or time-tradeoff methods) over scaling methods (e.g. rating scales and ratio scales) (28).
Scaling methods are frequently used for convenience, however, and investigators often use a mixture of methods within a given study (28).
Considerable research over the years has tested the validity and reliability of the
various techniques and the feasibility of eliciting responses (47, 117, 118). Validity
and reliability have been measured in numerous ways, with various results reported
(117). For example, when judged against the standard gamble, the time-tradeoff technique has been
found to be relatively valid, whereas the rating scale has not (116).
In general, rating scales, although easy to use, have been found to be subject to
measurement bias (119). A number of studies have found that different elicitation
methods lead to different preference weights, with most reporting that standard-gamble scores are greater than time-tradeoff scores, which are in turn greater than
visual-analog scores (13, 28, 53, 81, 90, 97, 127).
In recent years, researchers have experimented with various alternative elicitation techniques, including estimating general-population utilities by using one
binary-gamble question per respondent (16). [This technique involves asking a
standard-gamble question of different subgroups of the population in which the
risk of death varies across subgroups. The mean utility is then estimated by the area
above the proportional distribution of responses indicating acceptance of the gamble (16)]. Other methods that have been used include “chained procedures” to
measure temporary health states (temporary health states weighed indirectly with
the aid of intermediate anchor states) (57). Methods for optimizing sampling
strategies for estimating QALYs have also been explored (96).
Investigating the impact of different survey administration methods is also an
active area of research. For example, investigators have recently reported that the
elicitation protocol used to search for subjects’ utility values can strongly influence
results (66) and that telephone interviews yield similar time-tradeoff values and
standard-gamble utilities compared with face-to-face interviews (127).
Using Preference Classification Systems
In terms of methods for describing health states, some researchers have measured
quality-of-life weights based on direct, holistic utility assessment (e.g. asking patients a time-tradeoff question), whereas others have based weights on prespecified,
health state classification systems. With direct-utility assessment, individuals are
asked questions in which their current health state or a hypothetical health state
described by a “vignette,” video clip, or other description is typically placed on a
0–1 scale between perfect health (1.0) and dead (0.0).
Several research groups have developed universal or “generic” health state
classification systems, such as the Rosser Index (101), the HUI (32, 120), the
Quality of Well-Being Scale (61), the EuroQol (31), and the Health and Limitations Index (46). These systems are designed to be complete and general enough
to apply across many different types of conditions and treatments. They provide an
indirect means of obtaining preference weights: Patients are assigned a health state
classification based on responses to health status questionnaires, and prespecified
preference weights obtained from other populations are then applied.
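The indirect route described above (classify the patient, then look up a prespecified weight) can be sketched as follows. The two-attribute system and its weights are entirely hypothetical; they are not the scoring function of the HUI, the Quality of Well-Being Scale, the EuroQol, or any other real instrument.

```python
# Hypothetical prespecified weights for a toy two-attribute system, keyed by
# (mobility level, pain level), where level 1 is the best level.
PRESPECIFIED_WEIGHTS = {
    (1, 1): 1.00,
    (1, 2): 0.80,
    (2, 1): 0.70,
    (2, 2): 0.50,
}


def classify(responses):
    """Map questionnaire responses to a health state (a tuple of levels)."""
    return (responses["mobility"], responses["pain"])


def preference_weight(responses):
    """Look up the weight that another population assigned to this state."""
    return PRESPECIFIED_WEIGHTS[classify(responses)]


# Hypothetical questionnaire responses from two patients.
patients = [
    {"mobility": 1, "pain": 2},
    {"mobility": 2, "pain": 2},
]
weights = [preference_weight(p) for p in patients]
print(weights)
```

The patients supply only descriptions of their health; the valuation comes from whoever provided the prespecified weights.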
However, the systems differ in terms of how they define the relevant domains or
attributes of health, as well as the techniques used for obtaining preference weights.
Preference weights in the HUI, for example, are based on multiattribute utility
theory, in which the domains of the classification system are regarded as attributes
in a utility function. The latest version, the HUI Mark 3, contains eight attributes—
vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain—with
five to six levels of functioning per attribute (39). Preference measurements are
based on the visual-analog scale and the standard-gamble instruments and were
collected from a population sample in Hamilton, Ontario.
The Quality of Well-Being Scale categorizes patients by symptoms and levels
of functioning, which are represented by scales of mobility, physical activity, and
social activity (61). Classifications are also based on the symptom or problem
that individuals find most undesirable. The scoring function is based on category-scaling measurements of a random sample of the general public.
The EQ-5D system developed by the EuroQol Group contains five attributes
(mobility, self-care, usual activity, pain/discomfort, and anxiety/depression),
with three levels per attribute (61a). The scoring function for preferences is based on the time-tradeoff technique applied to a random sample of the
adult population in the United Kingdom.
Each of the three systems has been used extensively (28). [The Health and
Limitations Index (HAL-ex), which uses a nationally representative sample to
estimate quality-adjustment factors (46), is a more recent effort that has not yet been
used widely.] In terms of comparing scores produced by the different instruments,
researchers have reported both similarities and differences in preference weights
obtained with different instruments in the same population (45, 88). One recent
study, for example, reported substantial agreement between patient self-ratings
obtained using the EuroQol instrument and patients’ utility scores on the HUI
for states representing lower levels of functioning, but the study also reported
differences for higher levels of functioning (45).
Researchers have reported varied results in comparisons of preferences assessed
with generic instruments vs preferences assessed directly. A number of studies
have reported sizeable differences among the different approaches (17, 18, 124),
although some research has found similarities (40). Gabriel et al (40), for example,
recently reported that preferences estimated with the HUI did not differ significantly from direct time-tradeoff scores in women with osteoporotic fractures.
The Panel on Cost Effectiveness in Health and Medicine noted that, in selecting
a health state classification system, the system should reflect the domains important for the particular problem under consideration (47). If the cost-effectiveness
analysis is intended for use in a reference case analysis (an analysis that incorporates a standard set of methodological practices and is intended to aid in broad
societal resource allocation decisions), the preference measure used should be a
generic one or be calibrated in such a way that it is capable of being compared
with a generic system (47).
A general limitation of generic health state classification systems is that they
may lack sensitivity to important differences in particular diseases. For example, a
recent study found that the HUI2 scores for caregivers of patients with Alzheimer’s
disease were insensitive to changes in patients’ disease stage, despite the fact that
caregiver burden, measured with disease-specific instruments, increased with patients’ disease severity (84). For this reason, disease-specific classification systems
that are appropriately preference weighted may play an important role, and they
could be mapped to a generic measure suitable in the reference case (47). Examples include the Q-tility Index (133) for cancer and the Functional Capacity Index
for trauma (71).
Relationship Between Patient and Community Preferences
Researchers have long debated the relevant populations to serve as the source of the
preferences for health states. A common practice has been to use the preferences
of the study investigators themselves (86). Often, preferences of clinicians have
been used, under the rationale that they are most familiar with the conditions under
investigation. Some studies have used patient preferences, because they reflect the
values of the individuals most directly affected by clinical decisions, whereas others
have used a representative or convenience sample of the general population.3 The
argument for community-based preferences is that societal resource allocation
decisions should be made by appealing to population-based community values
(47).
A number of researchers have found that individuals afflicted with a specific
disease tend to value their health state more highly than those who have not experienced the condition (29, 40, 102), although similarities in preferences have
also been reported between patients and nonpatients (7, 68, 92, 100). Researchers
have also reported that patient values are higher than their surrogates believe
(122, 123). [Wide individual-to-individual variations in health values are common
(36, 82, 123)]. The general explanation for disparities between patients and nonpatients is that patients adapt to accommodate their limitations and alter their goals
and expectations (47).
3 Alternatively, some analysts do not use anyone’s preferences for measuring health-related
quality of life, but instead follow the psychometric tradition of asking a standard set of health
status questions and then scoring responses. The scores are then used as or transformed into
indicators of preference weights for the QALY calculation (86). Although this approach
does require respondents to assess various attributes of health, as noted above it does not
directly reflect respondents’ preferences for one health state over another.
The Panel on Cost Effectiveness recommended community preferences for
health states as the most appropriate preferences for use in a reference case analysis and that weights should be collected from a representative sample of the
general population (47). They noted that a consistent set of community weights
for health conditions and health states, used across studies intended to inform
resource allocation, would significantly improve the comparability of analyses.
Options for Obtaining Utilities from Clinical Trials
A topic receiving considerable attention in recent years is the collection of preference weights directly in randomized clinical trials. The appeal of the approach
lies in its potential to produce more reliable and precise estimates (33). Collecting
such data as part of an ongoing trial also has logistical advantages, because the data
collection can be combined with administration of other instruments (33). The
approach is not without problems, however, because it may add to overall study
costs and to respondent burden. Also, because health values may be relatively insensitive to clinical changes, clinical trials with these values as endpoints may
require large sample sizes (125). There are also questions about the acceptability
of the results in the clinical community.
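To give a sense of why small differences in health values can demand large trials, the sketch below applies the standard normal-approximation formula for comparing two means. The formula is a general statistical result rather than a method taken from the studies cited here, and the standard deviation and effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm(sd, difference, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect a mean difference in
    utility between two groups (two-sided test, equal standard deviations)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2)


# Hypothetical inputs: utilities with a standard deviation of 0.25.
n_small_effect = n_per_arm(sd=0.25, difference=0.05)
n_larger_effect = n_per_arm(sd=0.25, difference=0.10)
print(n_small_effect, n_larger_effect)
```

Because the required sample size grows with the inverse square of the difference, halving the detectable difference roughly quadruples the number of patients needed.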
Researchers have investigated a number of options for obtaining preferences
from clinical trials, including the use of direct, holistic utility assessment (15, 64),
as well as generic health state classification systems (34, 58, 138).
Others have used disease-specific classification systems. An example of a
cancer-specific system is the Quality-Adjusted Time Without Symptoms or Toxicity (Q-TWiST), developed for assigning patients in clinical trials to health states,
using clinical-trials data (43). Weights for Q-TWiST states have not yet been obtained directly; rather, analyses with Q-TWiST typically rely on threshold analyses
of break-even utility values for toxicity and symptoms relative to TWiST and death
(44).
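A threshold analysis of this kind can be sketched as follows. The sketch assumes the usual Q-TWiST formulation, in which time with toxicity (TOX) and time after relapse (REL) are weighted by utilities between 0 and 1, TWiST is weighted at 1, and death at 0. The durations for the two trial arms are hypothetical numbers chosen for illustration.

```python
def q_twist(tox, twist, rel, u_tox, u_rel):
    """Quality-adjusted time: TWiST counts fully; TOX and REL are down-weighted."""
    return u_tox * tox + twist + u_rel * rel


def break_even_u_tox(a, b, u_rel):
    """Toxicity utility at which arms a and b have equal Q-TWiST, given u_rel.

    Solves u_tox * d_tox + d_twist + u_rel * d_rel = 0 for u_tox.
    Assumes the arms differ in time spent with toxicity (d_tox != 0).
    """
    d_tox = a["tox"] - b["tox"]
    d_twist = a["twist"] - b["twist"]
    d_rel = a["rel"] - b["rel"]
    return -(d_twist + u_rel * d_rel) / d_tox


# Hypothetical mean months in each state for two trial arms.
arm_a = {"tox": 6.0, "twist": 30.0, "rel": 8.0}   # more toxicity, longer TWiST
arm_b = {"tox": 1.0, "twist": 26.0, "rel": 14.0}  # less toxicity, longer relapse

for u_rel in (0.0, 0.5, 1.0):
    threshold = break_even_u_tox(arm_a, arm_b, u_rel)
    print(f"u_rel = {u_rel:.1f}: arms are equal when u_tox = {threshold:.2f}")
```

A break-even value outside the 0 to 1 range means that one arm is preferred for every admissible toxicity utility at that relapse utility, which is the kind of conclusion a threshold analysis can support without directly elicited weights.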
Mapping Health Status to Health Preferences
A number of researchers have recently explored the relationship between psychometric health status measures and preference measures (3, 9, 17–21, 38, 70, 98,
106, 124). One goal has been to find a way to obtain preference-based measures
from widely used health status instruments such as the Short Form (SF)-12 (131)
and SF-36. However, the studies generally find poor-to-moderate correlations (in
the range of 0.2–0.45) between psychometric and preference-based scales (comparisons across studies must be made with caution because analyses differ across
population, sample size, and instrument used). The results indicate that psychometric and preference-based approaches measure different aspects of health for
different purposes.
The studies tend not to lend strong support for the estimation of utilities directly
from the SF-36 or other health status instruments, although some have reported
more promising results. Fryback and colleagues (38), for example, reported that
a six-variable regression equation drawn from five of the SF-36 components predicted 59.6% of the variance in quality of well-being (QWB) scores. Lundberg
et al (70) found that a regression model that used the 12 items of the SF-12 scale explained 50% of the variance in rating-scale responses in a larger general-population
sample. Bult et al (21) noted that a much higher percentage (85%) of variation
was explained when the heterogeneity across subjects (calculated with latent class
analysis to estimate unknown parameters and class membership) was taken into
account.
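The "variance explained" figures quoted above are R-squared values from regressions of a preference score on health status scores. A minimal single-predictor sketch is given below; the cited studies used several predictors, and the data here are invented purely to show the calculation.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: a health status score (0-100) and an elicited utility (0-1).
status_score = [25, 40, 55, 60, 70, 85, 90]
utility = [0.55, 0.70, 0.62, 0.80, 0.74, 0.95, 0.85]

slope, intercept = fit_line(status_score, utility)
r2 = r_squared(status_score, utility, slope, intercept)
print(f"predicted utility = {intercept:.3f} + {slope:.4f} * score; R^2 = {r2:.2f}")
```

With a single predictor, R-squared equals the squared correlation, so correlations in the range of 0.2–0.45 correspond to only about 4% to 20% of variance explained.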
Off-the-Shelf Weights
Leaders in the field have called for a standard catalog of weights that could be used
in any cost effectiveness analysis in lieu of primary data collection for each new
analysis (47). The idea is that the catalog would comprise well-described health
states with preference scores for each state, to allow users possessing enough descriptive information about health states, but not preference weights themselves, to
obtain values for their analyses. The Panel on Cost-Effectiveness has recommended
criteria for an ideal system for such a catalog, including derivation from a theory-based method on which empirical data have been collected; availability of weights
from a representative, community-based sample; low burden of administration;
and ability to furnish weights for health states, illnesses, and conditions (47).
Although no system to date meets the criteria, there have been a number of
promising attempts to collect such weights. The Beaver Dam Health Outcomes
Study (36), for example, based on >1300 respondents from the general population,
has reported time-tradeoff and QWB scores for a variety of conditions. Selected
preference weights are shown in Table 2. The idea is that researchers conducting
cost-utility analyses could use these population reference values in estimating the
value of a particular treatment (36). For example, a treatment that relieves asthma
would return a patient with this condition from a quality-of-life level of 0.71 to a
level of 0.87; these values would in turn be used in the QALY calculation. Note
that persons without asthma have an average time-tradeoff score of
0.87 and not 1.00, reflecting the fact that they have other conditions and are not
considered in “perfect health.”
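The asthma example can be carried through to a QALY calculation as sketched below. The weights 0.71 and 0.87 are the time-tradeoff values quoted above; the 10-year duration of benefit and the 3% discount rate are assumptions made for illustration.

```python
def discounted_qalys(weight, years, rate=0.0):
    """QALYs from living `years` whole years at a constant preference weight.

    Each year t (starting at t = 0) contributes weight / (1 + rate) ** t.
    """
    return sum(weight / (1 + rate) ** t for t in range(years))


WEIGHT_WITH_ASTHMA = 0.71     # time-tradeoff score, persons with asthma
WEIGHT_WITHOUT_ASTHMA = 0.87  # time-tradeoff score, persons without asthma
YEARS_OF_BENEFIT = 10         # hypothetical duration of symptom relief

undiscounted_gain = (discounted_qalys(WEIGHT_WITHOUT_ASTHMA, YEARS_OF_BENEFIT)
                     - discounted_qalys(WEIGHT_WITH_ASTHMA, YEARS_OF_BENEFIT))
discounted_gain = (discounted_qalys(WEIGHT_WITHOUT_ASTHMA, YEARS_OF_BENEFIT, rate=0.03)
                   - discounted_qalys(WEIGHT_WITH_ASTHMA, YEARS_OF_BENEFIT, rate=0.03))

print(f"QALYs gained per patient, undiscounted: {undiscounted_gain:.2f}")
print(f"QALYs gained per patient, discounted at 3%: {discounted_gain:.2f}")
```

Using 0.87 rather than 1.00 as the post-treatment weight matters: assuming a return to perfect health would nearly double the estimated gain per year (0.29 rather than 0.16).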
Dolan and colleagues (27) have reported valuations for various health states
based on the time-tradeoff method. Another effort has resulted in a set of health-related quality-of-life scores for chronic conditions based on nationally representative U.S. data from the National Health Interview Survey (46). Elsewhere,
researchers have developed a comprehensive catalog of preference weights by
using secondary data from published cost-utility analyses (8).
TABLE 2 Selected preference weights from Beaver Dam Health Outcomes Studyᵃ

                        Time trade-off scores            QWB scores
                        (age-adjusted)                   (age-adjusted)
                        Persons with   Persons without   Persons with   Persons without
Condition               condition      condition         condition      condition
Asthma                  0.71           0.87              0.68           0.73
Arthritis               0.82           0.90              0.69           0.75
Angina                  0.79           0.87              0.66           0.73
Stroke                  0.90           0.86              0.68           0.73
Severe back pain        0.79           0.88              0.67           0.74
Migraine                0.82           0.86              0.70           0.73
Myocardial infarction   0.73           0.86              0.64           0.73
Diabetes (insulin)      0.63           0.87              0.66           0.73
Depression              0.70           0.87              0.65           0.73
Hiatal hernia           0.85           0.86              0.70           0.73

ᵃSource, Fryback et al (36). Abbreviation: QWB, quality of well-being. Reprinted with permission.

Proposed Alternatives to QALYs
Some researchers have argued for methods other than QALYs, maintaining that
the QALY approach is too complex (24) or, in some cases, advocating the use of
more complex measures (41, 74, 75). Proposed alternatives include healthy-years
equivalents (HYEs) (74), saved young life equivalents (87), and disability-adjusted
life years (DALYs) (79), although the options have their own limitations and are
subject to debate themselves (47).
HYEs have been proposed as an alternative to QALYs (41, 74, 75) based on
the claim that they avoid certain restrictive assumptions about preferences. For
example, supporters claim that HYEs generalize from the constant proportionality
of QALYs by permitting the rate of tradeoff between life years and quality of life
to depend on the life span.
HYEs are calculated by measuring the utility for each possible “health pathway” of a stream of changing health states and converting this utility to an HYE
by a second measurement. There is, however, considerable debate about this second component, which has been shown to be essentially equivalent to a simple
time-tradeoff question (78, 130). Johannesson et al (60) argue that HYEs are by
definition the same as the equivalent number of years in full health in the time-tradeoff method developed by Torrance et al (121); they are essentially a generalization
of risk-neutral QALYs in which the assumption of a constant proportional tradeoff
between life years and quality of life is relaxed.
Furthermore, the burden of utility assessment is appreciable because the number
of HYEs must be calculated for every possible duration of time in the health state; in
other words, HYEs require independent valuations of all possible health scenarios
rather than individual health states (60). Thus, HYEs do not offer a practical
solution to the problem of assigning utilities to health profiles for various qualities
of life, because of the enormous scope of the task of assessing time tradeoffs for
all possible sequences of health states over a lifetime.
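The scale of the task can be illustrated with a simple count. The sketch below assumes, hypothetically, a classification with a small number of health states and a lifetime divided into equal periods; it compares the number of single states that a QALY weight set must cover with the number of distinct health profiles that would each need their own valuation.

```python
from itertools import product


def n_profiles(n_states, n_periods):
    """Number of distinct sequences of health states over the given periods."""
    return n_states ** n_periods


# Tiny case, enumerated explicitly: 3 hypothetical states over 2 periods.
states = ("full health", "moderate disability", "severe disability")
small_case = list(product(states, repeat=2))
print(f"3 states over 2 periods: {len(small_case)} profiles, e.g. {small_case[1]}")

# Larger cases can only be counted, not listed.
for periods in (5, 10, 20):
    print(f"5 states over {periods} periods: {n_profiles(5, periods):,} profiles")
```

The number of states to value stays fixed at five in these examples, while the number of profiles grows exponentially with the number of periods.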
DALYs have been widely used in economic evaluations conducted outside
the United States (80), and a large group of evaluations of international health interventions
report health outcomes as cost per DALY (56). DALYs were developed as the
measurement unit for the Global Burden of Disease Study (80), whose goal was
to quantify the burden of disease and injury in human populations.
In effect, DALYs are similar to QALYs in that they provide a metric for quantifying life expectancy after adjusting for morbidity. However, whereas QALY
weights are based on social preferences, DALY weights also incorporate age adjustments, based implicitly on economic productivity (i.e. young or middle-aged
adults receive higher weights than the elderly or small children). DALYs incorporate two components, years of life lost (YLLs) and years lived with disability
(YLDs). Thus DALYs from any given condition are simply the sum of YLLs and
YLDs from the condition such that
DALY_i = YLL_i + YLD_i,
where i is the condition. YLLs are calculated by using standard expected years
of life lost and are discounted and age adjusted. YLDs are time lived in health
states worse than perfect health, weighted by the preference weight for each health
state. Preference weights for 22 indicator conditions have been developed using
the person tradeoff method. Seven classes of disability have been defined based on
these 22 indicator conditions and distributions of disabling severity generated for
several hundred treated and untreated disabling sequelae. Both years of life and
years with disability are also weighted by age-specific weighting factors, which
assign a greater value to a year of young or middle-aged adult life as compared
with a year of life lived by young children or the elderly. DALYs are not without
their own critics. Some have raised questions about the equity and ethics of the
age-weightings, for example (78a). Also, DALYs rely on Japanese life tables, no
matter what the actual target population.
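The two components can be combined as in the following sketch. It is a deliberate simplification: discounting and age weighting are omitted, the disability weight is assumed to run from 0 (perfect health) to 1 (equivalent to death), and all counts, durations, and weights are hypothetical.

```python
def years_of_life_lost(deaths, standard_years_remaining):
    """YLL: deaths times the standard life expectancy remaining at age of death."""
    return deaths * standard_years_remaining


def years_lived_with_disability(cases, disability_weight, duration_years):
    """YLD: cases times disability weight (0 = perfect health, 1 = death) times duration."""
    return cases * disability_weight * duration_years


def dalys(yll, yld):
    """DALYs for a condition are the sum of its YLLs and YLDs."""
    return yll + yld


# Hypothetical condition: 100 deaths with 30 standard years remaining each,
# and 2000 nonfatal cases lasting 5 years at a disability weight of 0.2.
yll = years_of_life_lost(deaths=100, standard_years_remaining=30)
yld = years_lived_with_disability(cases=2000, disability_weight=0.2, duration_years=5)
print(f"YLL = {yll:.0f}, YLD = {yld:.0f}, DALYs = {dalys(yll, yld):.0f}")
```

Note the direction of the scale: a DALY is a year of healthy life lost, so interventions are judged by DALYs averted, whereas QALYs are gained.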
RECENT APPLICATIONS TO PUBLIC HEALTH
INTERVENTIONS
In recent years, cost-utility analysis has been used to evaluate hundreds of interventions, ranging in scope from public health to clinical medicine (85). A recent review
of this literature underscores the growth of the field and the variations in methods
that analysts have used to estimate preference weights (85). The analyses have covered a wide range of conditions and interventions (Table 3).⁴ Most articles have focused on tertiary prevention (57.5%), followed by secondary (32.5%) and primary
prevention (10.1%). Of interest to public health professionals, there have been
analyses of screening strategies (10.5%), health education interventions (5.3%),
and immunizations (3.9%). Table 4 presents a list of selected cost-utility analyses
⁴Note that Tables 1 and 2 are organized in part by diagnostic categories, though the
assignment of diagnosis may be somewhat arbitrary since individuals frequently have
coexisting morbidities.
TABLE 3 Description of published cost-utility analyses 1976–1997 by prevention stage, condition, and type of interventionᵃ

Parameter                                      Nᵇ     Percent

Prevention stageᶜ
  Primary                                      23     10.1
  Secondary                                    74     32.5
  Tertiary                                     131    57.5

Condition (ICD-9 category)
  Circulatory system                           58     24.6
  Neoplasms                                    40     17.9
  Infectious and parasitic                     35     15.6
  Genitourinary system                         14     6.3
  Digestive system                             12     5.4
  Musculoskeletal system                       12     5.4
  Endocrine, nutritional, and metabolic        11     4.9
  Nervous system and sense organs              11     4.9
  Mental disorders                             11     4.5
  Blood and blood-forming organs               8      3.6
  Respiratory system                           5      2.2
  Injury and poisonings                        4      1.8
  Conditions originating in perinatal period   4      1.8
  Variousᵈ                                     3      3.0

Type of intervention
  Pharmaceutical                               73     32.0
  Surgical                                     41     18.0
  Diagnostic                                   26     11.4
  Screening                                    24     10.5
  Medical procedure                            16     7.0
  Care delivery                                13     5.7
  Health education/behavior                    12     5.3
  Immunizations                                9      3.9
  Medical device                               6      2.6
  Other                                        2      0.9

ᵃSource, Neumann et al (85).
ᵇN = 228 studies.
ᶜPrimary preventive measures are those provided to prevent onset of a targeted condition (e.g. routine immunization of healthy children). Secondary preventive measures identify and treat asymptomatic persons who have already developed risk factors or preclinical disease, but in whom the condition has not become clinically apparent (e.g. screening for high blood pressure). Tertiary preventive measures include all medical or surgical interventions designed to limit disability after harm has occurred (see references 114a and 126).
ᵈArticles covered diseases in more than one category.
TABLE 4 Cost-utility analyses of selected preventive interventions, 1992–1997ᵃ

Year  Intervention studiedᵇ                                            Reference
1992  Hormone replacement therapy                                      23
1992  Mammography screening                                            50
1993  Autologous blood donation/transfusion                            11
1994  Hib vaccination                                                  51
1994  Hib catch-up immunization                                        73
1994  Hormone replacement therapy/lifestyle intervention               42
1994  Preoperative autologous donation of various units                10
1994  Preoperative autologous donation, overall                        48
1994  Screening for prostate cancer                                    62
1994  Solvent-detergent-treated fresh-frozen plasma                    5
1995  Preoperative autologous blood donation                           30
1995  Breast cancer screening                                          14
1995  Childhood vaccines                                               105
1995  Screening blood donors for hemochromatosis                       2
1995  Screening blood donors to prevent posttransfusion hepatitis B    22
        and C infection
1995  Selective HBV vaccination                                        72
1996  Behavioral group HIV-prevention intervention                     52
1996  PRP-T conjugate Hib vaccine                                      67
1996  Screening for abdominal aortic aneurysm                          113
1996  Screening for asymptomatic carotid atherosclerotic disease       26
1996  Screening for mild thyroid failure                               25
1996  Transdermal nicotine patch as an adjunct to physician’s          35
        smoking cessation counseling
1997  Cardiovascular risk reduction program                            103
1997  Hepatitis A vaccination in health care workers                   109
1997  HIV postexposure chemoprophylaxis                                94
1997  HIV testing protocols for donated blood                          6
1997  HIV prevention skills training for men who have sex with men     95
1997  Pneumococcal bacterium vaccination in the elderly                107
1997  Routine use of intraoperative autologous transfusion device      55
1997  Screening for carotid disease in asymptomatic patients           65
1997  Universal cancer screening program                               54

ᵃSource, Stone et al (114).
ᵇAbbreviations: Hib, Haemophilus influenzae type b; HBV, hepatitis B virus; HIV, human immunodeficiency virus; PRP-T, polysaccharide-tetanus toxoid.
TABLE 5 Methods used for estimating preference weights in published cost-utility analysesᵃ

Measurement of preference weights        Nᵇ     Percent

Measurement scale used
  Study specific                         109    47.8
  Based on previous study                60     26.3
  Preexisting generic                    44     19.3
  Preexisting disease specific           2      0.9
  Could not be determined                13     5.7

Source of preferencesᶜ
  Author                                 74     32.4
  Clinician                              59     25.9
  Patient                                55     24.1
  Community                              52     22.8

Preference measurement technique
  Author’s judgment                      77     33.8
  Formal techniqueᵈ                      127    55.7
    Rating scale/magnitude estimation    59     25.9
    Time trade off                       46     20.2
    Standard gamble                      22     9.6
  Could not be determined                59     25.9

ᵃSource, Neumann et al (85).
ᵇN = 228 studies.
ᶜMore than one response allowed per article.
ᵈIncludes 59 (25.9%) using rating scale, 46 (20.2%) using time trade off, and 22 (9.6%) using standard gamble.
of preventive interventions published from 1992 to 1997. The list underscores the
diversity of interventions analyzed, from vaccines for pneumococcal pneumonia,
Haemophilus influenzae type b, measles, tetanus, typhoid, dengue, and hepatitis A
to screening strategies for breast cancer, cervical cancer, thyroid disease, and tuberculosis. Recent summaries of the cost-effectiveness ratios of these interventions
are provided by Graham et al (49) and Stone et al (114). Economic information
from cost-utility analyses and related approaches is also increasingly being considered or used to develop clinical guidelines for public health. An example is a
recent report by the Centers for Disease Control, which examined 19 prevention
strategies based on the health impact and cost of the related disease, injury, and
disability and the effectiveness and cost effectiveness of the strategy (76). Cost
effectiveness has also been considered in the development of other public health
guidelines, including the report of the U.S. Preventive Services Task Force (126).
A challenge for all of these efforts is that economic evaluations frequently use
different methods, raising questions about the comparability of studies. Table 5
summarizes the methods used for estimating preferences in published cost-utility
analyses, for example. These data underscore the considerable variation in the
measurement scales, sources of preferences, and preference measurement techniques used. For example, studies most often used health states specific to the study
at hand (47.8%); only 19.3% have used a preexisting generic health state classification system (e.g. HUI) as recommended by the Panel on Cost-Effectiveness in
Health and Medicine. Most frequently, the authors themselves were the source of
preferences (32.4%), followed by clinicians (25.9%), patients (24.1%), and members of the community (22.8%). Whereas just over half of the studies used some
kind of formal measurement technique (55.7%), many studies relied on authors’
judgment (33.8%); in over one-quarter of the cases, the measurement technique
could not be determined.
In general, few studies have adhered to recommendations now provided by
leaders in the field for preference estimation. In some ways these results are not
surprising, because estimating preference weights remains a young field, characterized by ongoing debate on several conceptual issues, such as the appropriate
source of these weights. Still, the extent of the variations observed and the persistence of such trends over time (86) are troubling and may contribute to the lack
of acceptability of the method among decision makers (69, 108). Concerns about
the comparability and credibility of analyses will likely persist without further
improvements and standards in the field (85).
CONCLUSIONS
Estimating preferences in economic evaluation in health care continues to be an
extremely active area of research. Although cost-utility analyses have become
more popular in recent years, many challenges remain for the field. Widespread
acceptance of the methodology may await more consensus over measurement
techniques, as well as educational efforts in the medical community on the potential
usefulness of the approach. The challenge for the future will be to find reliable
and valid measurement techniques.
ACKNOWLEDGMENTS
The authors are grateful to Sally Araki for comments on an earlier version of this
manuscript and to Vijay Ramakrishnan for excellent research assistance.
Visit the Annual Reviews home page at www.AnnualReviews.org
LITERATURE CITED
1. Acton JP. 1973. Evaluating public programs to save lives: the case of heart attacks. R-950-RC, Rand Corp., Santa Monica, Calif.
2. Adams PC, Gregor JC, Kertesz AE,
Valberg LS. 1995. Screening blood donors for hereditary hemochromatosis:
decision analysis model based on a
30-year database. Gastroenterology 109:
177–88
3. Andresen EM, Patrick DL, Carter WB,
Malmgren JA. 1995. Comparing the performance of health status measures for
healthy older adults. J. Am. Geriatr. Soc.
43:1030–34
4. Appel LJ, Steinberg EP, Powe NR, Anderson GF, Dwyer SA, Faden RR. 1990. Risk
reduction from low osmolality contrast media: What do patients think it is worth?
Med. Care 28:324–37
5. AuBuchon JP, Birkmeyer JD. 1994. Safety
and cost-effectiveness of solvent-detergent-treated plasma: in search of zero-risk
blood supply. JAMA 272:1210–14
6. AuBuchon JP, Birkmeyer JD, Busch MP.
1997. Cost effectiveness of expanded human immunodeficiency virus testing protocols for donated blood. Transfusion
37:45–51
7. Balaban DJ, Sagi PC, Goldfarb NI, Nettler S. 1986. Weights for scoring the quality
of well-being instrument among rheumatoid arthritics: a comparison to general
population weights. Med. Care 24:973–80
8. Bell C, Chapman RC, Sandberg EA, Stone
PW, Neumann PJ. 1999. An off-the-shelf
help list: a comprehensive catalog of preference weights from published cost-utility
analyses. Med. Decis. Mak. 19:519(abstr.)
9. Bennett KJ, Torrance GW, Moran L, Smith
F, Goldsmith CH. 1997. Health state utilities in knee replacement surgery: the development and evaluation of McKnee. J.
Rheumatol. 24:1796–805
10. Birkmeyer JD, AuBuchon JP, Littenberg
B, O’Connor GT, Nease RF, et al. 1994.
Cost-effectiveness of preoperative autologous donation in coronary artery bypass grafting. Ann. Thorac. Surg. 57:161–
68
11. Birkmeyer JD, Goodnough LT, AuBuchon JP, Littenberg B. 1993. The costeffectiveness of preoperative autologous
blood donation for total hip and knee replacement. Transfusion 33:544–51
12. Bleichrodt H, Johannesson M. 1996. An experimental test of constant proportional tradeoff and utility independence. Med. Decis. Mak. 17:21–32
13. Bleichrodt H, Johannesson M. 1997. An experimental test of the theoretical foundation for rating scale valuations. Med. Decis. Mak. 17(2):208–16
14. Boer R, de Koning HJ, van Oortmarssen GJ, van der Maas PJ. 1995. In search of the best upper age limit for breast cancer screening. Eur. J. Cancer 31A:2040–43
15. Bombardier C, Ware J, Russell IJ, Larson M, Chalmers A, et al. 1986. Auranofin therapy and quality of life in patients with rheumatoid arthritis: results of a multicenter trial. Am. J. Med. 81:565–81
16. Bosch JL, Hammitt JK, Weinstein JC, Hunink MGM. 1998. Estimating general population utilities using one binary-gamble question per respondent. Med. Decis. Mak. 18:381–90
17. Bosch JL, Hunink MGM. 1996. The relationship between descriptive and valuational quality-of-life measures in patients with intermittent claudication. Med. Decis. Mak. 165:217–25
18. Bosch JL, Hunink MGM, Tetteroo E, Bos JJ, Mali WPTM. 1994. Quality of life assessment in patients with peripheral arterial disease. Med. Decis. Mak. 14(4):425 (Abstr.)
19. Brazier J, Jones N, Kind P. 1993. A comparison of two health status measures: Euroqol meets SF-36. Presented at Health Econ. Study Group/Fac. Public Health Med. Conf., Univ. York, York, UK
20. Bult JR, Bosch JL, Hunink MGM. 1996. Heterogeneity in the relationship between the standard-gamble utility measure and health-status dimensions. Med. Decis. Mak. 16:226–23
21. Bult JR, Hunink MGM, Tsevat J, Weinstein MC. 1998. Heterogeneity in the relationship between the time tradeoff and short-form-36 for HIV-infected and primary care patients. Med. Care 36:523–32
22. Busch MP, Korelitz JJ, Kleinman SH, AuBuchon JP, Schreiber GB. 1995. The Retrovirus Epidemiology Donor Study: declining value of alanine aminotransferase in screening of blood donors to prevent posttransfusion hepatitis B and C virus infection. Transfusion 35:903–10
23. Cheung AP, Wren BG. 1992. A cost-effectiveness analysis of hormone replacement therapy in the menopause. Med. J. Aust. 156:312–16
24. Cox D, Fitzpatrick R, Fletcher A, Gore S, Spiegelhalter D, Jones D. 1992. Quality-of-life assessment: Can we keep it simple? J. R. Stat. Soc. A 155(3):353–93
25. Danese MD, Powe NR, Sawin CT, Ladenson PW. 1996. Screening for mild thyroid failure at the periodic health examination. JAMA 276:285–92
26. Derdeyn CP, Powers WJ. 1996. Cost-effectiveness of screening for asymptomatic carotid atherosclerotic disease. Stroke 27:1944–50
27. Dolan P, Gudex C, Kind P, Williams A. 1996. The time trade-off method: results from a general population study. Health Econ. 5:141–54
28. Drummond MF, O’Brien B, Stoddart GL, Torrance GW. 1997. Methods for the Economic Evaluation of Health Care Programmes. Oxford, UK: Oxford Univ. Press
29. Epstein AM, Hall JA, Tognetti J, Son LH, Conant L Jr. 1989. Using proxies to evaluate quality of life: Can they provide valid information about patients’ health status and satisfaction with medical care? Med. Care 27(Suppl.):S91–98
30. Etchason J, Petz L, Keeler E, Calhoun L, Kleinman S, Snider C. 1995. The cost effectiveness of preoperative autologous blood donations. N. Engl. J. Med. 332:719–24
31. EuroQol Group. 1990. EuroQol - a new facility for the measurement of health related quality of life. Health Policy 16:199–208
32. Feeny D, Furlong W, Boyle M, Torrance G. 1995. Multi-attribute health states classification systems: health utilities index. Pharmacoeconomics 7:490–502
33. Feeny D, Labelle R, Torrance GW. 1990. Integrating economic evaluations and quality of life assessments. In Quality of Life Assessments in Clinical Trials, ed. B Spilker, pp. 85–95. New York: Raven
34. Feeny DH, Torrance GW. 1989. Incorporating utility-based quality-of-life assessment measures in a randomised trial. Med. Care 27(Suppl. 3):S190–204
35. Fiscella K, Franks P. 1996. Cost-effectiveness of the transdermal nicotine patch as an adjunct to physicians’ smoking cessation counseling. JAMA 275:1247–51
36. Fryback DG, Dasbach EJ, Klein R, Klein BEK, Dorn N, et al. 1993. The Beaver Dam Health Outcomes Study: initial catalog of health-state quality factors. Med. Decis. Mak. 13:89–102
37. Fryback DG, Lawrence WF. 1997. Dollars may not buy as many QALYs as we think: a problem with defining quality-of-life adjustments. Med. Decis. Mak. 17:276–84
38. Fryback DG, Lawrence WF, Martin PA, Klein R, Klein BE. 1997. Predicting quality of well-being scores from the SF-36: results from the Beaver Dam Health Outcomes Study. Med. Decis. Mak. 17:1–9
39. Furlong W, Feeny D, Torrance GW. 1998. Multiplicative, multi-attribute utility function for the Health Utilities Index Mark 3 (HUI3) system: a technical report. Work. Pap. 98-11. McMaster Univ. Cent. Health Econ. Policy Anal., Montreal, Canada
40. Gabriel SE, Kneeland TS, Melton LJ, Moncur MM, Ettinger B, Tosteson ANA. 1999. Health-related quality of life in economic evaluation of osteoporosis. Med. Decis. Mak. 19:141–48
41. Gafni A, Birch S. 1993. Economics, health and health economics: HYEs versus QALYs. J. Health Econ. 11:325–39
42. Geelhoed E, Harris A, Prince R. 1994. Cost-effectiveness analysis of hormone replacement therapy and lifestyle intervention for hip fracture. Aust. J. Public Health 18:153–60
Gelber RD, Goldhirsch A. 1986. A new 53.
endpoint for the assessment of adjuvant
therapy in postmenopausal women with
operable breast cancer. J. Clin. Oncol. 4:
1772–79
Gelber RD, Goldhirsch A, Cole BF. 1993. 54.
Evaluation of effectiveness: Q-TWiST.
Cancer Treat. Rev. 19:73–84
Glick HA, Polsky D, Willke RJ, Schulman KA. 1999. A comparison of preference assessment instruments used in a clin- 55.
ical trial. Med. Decis. Mak. 19:265–75
Gold MR, Franks P, McCoy KI, Fryback
DG. 1998. Toward consistency in costutility analyses: using national measures to
create condition-specific values. Med. Care
56.
36(6):778–92
Gold MR, Siegel JE, Russell LB, Weinstein
MC. 1996. Cost-effectiveness in Health
and Medicine. Oxford, UK: Oxford Univ.
57.
Press
Goodnough LM, Grishaber JE, Birkmeyer
JD, Monk TG, Catalona WJ. 1994. Efficacy and cost-effectiveness of autologous
blood predeposit in patients undergoing
radical prostatectomy procedures. Urology
58.
44:226–31
Graham JD, Corso PS, Morris JM, SeguiGomez M, Weinstein MC. 1998. Evaluating the cost-effectiveness of clinical and
public health measures. Annu. Rev. Public
59.
Health 19:125–52
Hall J, Gerard K, Salkeld G, Richardson J.
1992. A cost-utility analysis of mammography screening in Australia. Soc. Sci. Med.
60.
34:993–1004
Harris A, Hendrie D, Bower C, Payne J, de
Klerk N, Stanley F. 1994. The burden of
Haemophilus influenzae type b disease in
Australia and an economic appraisal of the 61.
vaccine PRP-OMP. Med. J. Aust. 160:483–
88
Holtgrave DR, Kelly JA. 1996. Preventing HIV/AIDS among high-risk urban
women: the cost-effectiveness of a be- 61a.
607
havioral group intervention. Am. J. Public
Health 86:1442–45
Hornberger JC, Redelmeier DA, Petersen
J. 1992. Variability among methods to assess patients’ well-being and consequent
effect on a cost-effectiveness analysis. J.
Clin. Epidemiol. 45(5):505–12
Hristova L, Hakama M. 1997. Effect of
screening for cancer in the Nordic countries on deaths, cost and quality of life up
to the year 2017. Acta Oncol. 36(Suppl.
9):1–60
Huber TS, McGorray SP, Carlton LC, Irwin
PB, Flug RR, Flynn TC. 1997. Intraoperative autologous transfusion during elective
infrarenal aortic reconstruction: a decision analysis model. J. Vasc. Surg. 25:984–
94
Jamison DT, Mosley WH, Measham AR,
Bobadilla JL, eds. 1993. Disease Control
Priorities in Developing Countries. New
York: Oxford Univ. Press
Jansen SJ, Stiggelbout AM, Wakker PP,
Vliet Vlieland TP, Leer JW, et al. 1998.
Patients’ utilities for cancer treatments: a
study of the chained procedure for the standard gamble and time tradeoff. Med. Decis.
Mak. 18:391–99
Jenkinson C, Gray A, Doll H, Lawrence K,
Keoghane S, Layte R. 1997. Evaluation of
index and profile measures of health status
in a randomized controlled trial. Med. Care
35:1109–18
Johannesson M, Jonsson B. 1991. Willingness to pay for antihypertensive therapy:
results of a Swedish pilot study. J. Health
Econ. 10:461–74
Johannesson M, Pliskin J, Weinstein M.
1993. Are healthy-years equivalents an improvement over quality-adjusted life years?
Med. Decis. Mak. 13(4):281–86
Kaplan RM, Anderson JP, Wu AW,
Matthews WC, Kozin F, Orenstein D. 1989.
The Quality of Well-Being Scale: applications in AIDS, cystic fibrosis, and arthritis.
Med. Care 27(Suppl. 3):S27–43
Kind P. 1996. The EuroQol instrument: an
?
P1: FDJ
March 31, 2000
608
62.
63.
64.
65.
66.
67.
68.
69.
70.
12:0
Annual Reviews
NEUMANN
■
GOLDIE
■
CHAP-24
WEINSTEIN
index of health-related quality of life. In 71. MacKenzie EJ, Damiano AM, Miller TS,
Luchter S. 1996. The development of
Quality of Life and Pharmacoeconomics in
the Functional Capacity Index. J. Trauma
Clinical Trials, ed. B Spilker, pp. 191–201.
41:799–807
New York: Raven. 2nd ed.
Krahn MD, Mahoney JE, Eckman MH, 72. Mangtani P, Hall AJ, Normand CE. 1995.
Hepatitis B vaccination: The cost effecTrachtenberg J, Pauker SG, Detsky A.
tiveness of alternative strategies in Eng1994. Screening for prostate cancer: a deland and Wales. J. Epidemiol. Community
cision analytic view. JAMA 272:773–80
Health 49:238–44
Kupperman M, Shiboski S, Feeny D, Elkin
EP, Washington AE. 1997. Can preference 73. McIntyre P, Hall J, Leeder S. 1994. An
economic analysis of alternatives for childscores for discrete states be used to dehood immunisation against Haemophilus
rive preference scores for an entire path of
influenzae type b disease. Aust. J. Public
events? An application to prenatal diagnoHealth 18:394–400
sis. Med. Decis. Mak. 17:42–55
Laupacis A, Wong C, Churchill D, Canadian Erythropoietin Study Group. 1991. The use of generic and specific quality-of-life measures in hemodialysis patients treated with erythropoietin. Control. Clin. Trials 12:168S–79S
Lee TT, Solomon NA, Heidenreich PA, Oehlert J, Garber AM. 1997. Cost-effectiveness of screening for carotid stenosis in asymptomatic patients. Ann. Intern. Med. 126:337–46
Lenert LA, Cher DJ, Goldstein MK, Bergen MR, Garber A. 1998. The effect of search procedures on utility elicitations. Med. Decis. Mak. 18:76–83
Livartowski A, Boucher J, Detournay B, Reinert P. 1996. Cost-effectiveness evaluation of vaccination against Haemophilus influenzae invasive diseases in France. Vaccine 14:495–500
Llewellyn-Thomas H, Sutherland HJ, Tibshirani R, Ciampi A, Till JE, Boyd NF. 1984. Describing health states: methodologic issues in obtaining values for health states. Med. Care 22:543–52
Luce BR, Lyles CA, Rentz AM. 1996. The view from managed care pharmacy. Health Aff. 15(4):168–76
Lundberg L, Johannesson M, Isacson DGL, Borgquist L. 1999. The relationship between health-state utilities and the SF-12 in a general population. Med. Decis. Mak. 19:128–40
74. Mehrez A, Gafni A. 1989. Quality adjusted life years, utility theory, and healthy-years equivalents. Med. Decis. Mak. 9(2):142–49
75. Mehrez A, Gafni A. 1993. Health-years equivalents versus quality-adjusted life years: in pursuit of progress. Med. Decis. Mak. 13:287–92
76. Messonier ML, Corso PS, Teutsch SM, Haddix AC, Harris JR. 1999. An ounce of prevention: What are the returns? Am. J. Prev. Med. 16(3):248–63
77. Mishan EJ. 1971. Evaluation of life and limb: a theoretical approach. J. Polit. Econ. 79:687–706
78. Morrison GC. 1997. Healthy years equivalent and time tradeoff: What is the difference? J. Health Econ. 16:563–78
78a. Morrow RH, Bryant JH. 1995. Health policy approaches to measuring and valuing human life: conceptual and ethical issues. Am. J. Public Health 85(10):1356–60
79. Murray CJ. 1994. Quantifying the burden of disease: the technical basis for disability-adjusted life years. Bull. WHO 72(3):429–45
80. Murray CJL, Lopez AD. 1996. The Global Burden of Disease. Geneva, Switzerland: WHO. 990 pp.
81. Nease RF, Brooks WB. 1995. Patient desire for information and decision making in health care decisions: the Autonomy Preference Index and the Health Opinion Survey. J. Gen. Intern. Med. 10(11):593–600
82. Nease RF Jr, Kneeland T, O’Conner GT, Sumner W, Lumpkins C, et al. 1995. Variation in patient utilities for outcomes of the management of chronic stable angina: implications for clinical practice guidelines. JAMA 273:1185–90
82a. Neumann PJ, Hermann RC, Kuntz KM, Araki SS, Duff SB, et al. 1999. Cost-effectiveness of Donepezil in the treatment of mild or moderate Alzheimer’s disease. Neurology 52:1138–45
83. Neumann PJ, Johannesson M. 1994. The willingness to pay for in vitro fertilization: a pilot study using contingent valuation. Med. Care 32(7):686–99
84. Neumann PJ, Kuntz KM, Leon J, Araki SS, Hermann RC, et al. 1999. Health utilities in Alzheimer’s disease: a cross-sectional study of patients and caregivers. Med. Care 37:27–32
85. Neumann PJ, Stone PW, Chapman RH, Sandberg EA. 1999. A formal audit of 228 published cost-utility analyses. Harv. Sch. Public Health Work. Pap. Harv. Univ., Boston, Mass.
86. Neumann PJ, Zinner DE, Wright JC. 1997. Are methods for estimating QALYs in cost-effectiveness analyses improving? Med. Decis. Mak. 17:402–8
87. Nord E. 1992. An alternative to QALYs: the saved young life equivalent (SAVE). Br. Med. J. 305(6858):875–77
87a. Nord E. 1995. The person trade-off approach to valuing health care programs. Med. Decis. Mak. 15(3):201–8
88. Nord E. 1996. Health status index models for use in resource allocation decisions. Int. J. Technol. Assess. Health Care 12:31–44
89. O’Brien B, Gafni A. 1996. When do the “dollars” make sense?: toward a conceptual framework for contingent valuation studies in health care. Med. Decis. Mak. 16(3):288–99
90. O’Leary JF, Fairclough DL, Jankowski MK, Weeks JC. 1995. Comparison of time trade-off utilities and rating scale values of cancer patients and their relatives: evidence for a possible plateau relationship. Med. Decis. Mak. 15:132–37
91. Patrick DL, Erickson P. 1993. Health Status and Health Policy. New York: Oxford Univ. Press
92. Patrick DL, Sittampalam Y, Somerville S, Carter W, Bergner M. 1985. A cross-cultural comparison of health status values. Am. J. Public Health 75:1402–7
93. Pauly MV. 1995. Valuing health care benefits in money terms. In Valuing Health Care, ed. F Sloan, pp. 99–124. London: Cambridge Univ. Press
94. Pinkerton SD, Holtgrave DR, Pinkerton HJ. 1997. Cost-effectiveness of chemoprophylaxis after occupational exposure to HIV. Arch. Intern. Med. 177:1972–80
95. Pinkerton SD, Holtgrave DR, Valdisseri RO. 1997. Cost-effectiveness of HIV-prevention skills training for men who have sex with men. AIDS 11:347–57
95a. Pliskin JS, Shepard DS, Weinstein MC. 1980. Utility functions for life years and health status. Oper. Res. 28:206–24
96. Ramsey SD, Etzioni R, Troxel A, Urban N. 1997. Optimizing sampling strategies for estimating quality-adjusted life-years. Med. Decis. Mak. 17:431–38
97. Read JL, Quinn RJ, Berwick DM, Fineberg HV, Weinstein MC. 1984. Preferences for health outcomes: comparison of assessment methods. Med. Decis. Mak. 4:315–29
98. Revicki DA. 1992. Relationship between health utility and psychometric health status measures. Med. Care 30(Suppl.):MS274–82
99. Revicki DA, Kaplan RM. 1993. Relationship between psychometric and utility-based approaches to the measurement of health-related quality of life. Qual. Life Res. 2:477–87
100. Revicki DA, Shakespeare A, Kind P. 1996. Preferences for schizophrenia-related health states: a comparison of patients, caregivers, and psychiatrists. Int. Clin. Psychopharmacol. 11:101–8
101. Rosser R, Kind P. 1978. A scale of valuations of states of illness: Is there a social consensus? Int. J. Epidemiol. 7(4):347–58
102. Sackett D, Torrance G. 1978. The utility of different health states as perceived by the general public. J. Chronic Dis. 31(11):697–704
103. Salkeld G, Phongsavan P, Oldenburg B, Johannesson M, Convery P, et al. 1997. The cost-effectiveness of a cardiovascular risk reduction program in general practice. Health Policy 41:105–19
104. Schelling TC. 1968. The life you save may be your own. In Problems in Public Expenditure Analysis, ed. SB Chase, pp. 127–76. Washington, DC: Brookings Inst.
105. Shepard DS, Walsh JA, Kleinau E, Stansfield S, Bhalotra S. 1995. Setting priorities for the Children’s Vaccine Initiative: a cost-effectiveness approach. Vaccine 13:707–14
106. Shmueli A. 1999. Subjective health status and health values in the general population. Med. Decis. Mak. 19:122–26
107. Sisk JE, Moskowitz AJ, Whang W, Lin JD, Fedson DS, et al. 1997. Cost-effectiveness of vaccination against Pneumococcal bacteremia among elderly people. JAMA 278:1333–39
108. Sloan FA, Whetten-Goldstein K, Wilson A. 1997. Hospital pharmacy decisions, cost-containment, and the use of cost-effectiveness analysis. Soc. Sci. Med. 45(4):525–33
109. Smith S, Weber S, Wiblin T, Nettleman M. 1997. Cost-effectiveness of hepatitis A vaccination in health care workers. Infect. Control Hosp. Epidemiol. 18:688–91
110. Sorum PC. 1999. Measuring patient preferences by willingness to pay to avoid: the case of acute otitis media. Med. Decis. Mak. 19:27–37
111. Stalhammer NO. 1996. An empirical note on willingness-to-pay and starting point bias. Med. Decis. Mak. 16:242–47
112. Stalhammer NO, Johannesson M. 1996. Valuation of health changes with the contingent valuation method: a test of scope and question order effects. Health Econ. 5:531–41
113. St Leger AS, Spencely M, McColloum CN, Mossa M. 1996. Screening for abdominal aortic aneurysm: a computer assisted cost-utility analysis. Eur. J. Vasc. Endovasc. Surg. 11:183–90
114. Stone PW, Teutsch SM, Chapman RH, Bell C, Neumann PJ. 1999. Cost-utility analyses of clinical preventive services. Harv. Sch. Public Health Work. Pap., Harv. Univ. Boston, Mass. 32 pp.
114a. Tengs TO, Adams ME, Pliskin JS, Safran DG, Siegel JE, et al. 1995. Five-hundred life-saving interventions and their cost-effectiveness. Risk Anal. 15(3):369–90
115. Thompson MS. 1986. Willingness to pay and accept risks to cure chronic disease. Am. J. Public Health 76:392
116. Torrance GW. 1976. Social preferences for health states: an empirical evaluation of three measurement techniques. Socio-Econ. Plan. Sci. 10:128–36
117. Torrance GW. 1986. Measurement of health state utilities for economic appraisal. J. Health Econ. 5:1–30
118. Torrance GW, Boyle MH, Horwood SP. 1982. Application of multi-attribute utility theory to measure social preferences for health states. Oper. Res. 30:1043–69
119. Torrance GW, Feeny DH, Furlong WJ, Barr RD, Zhang Y, Wang Q. 1996. Multi-attribute preference functions for a comprehensive health status classification system: health utilities index mark 2. Med. Care 24(7):702–22
120. Torrance GW, Furlong WJ, Feeny DH, Boyle M. 1995. Multi-attribute preference functions: health utilities index. Pharmacoeconomics 7:503–20
121. Torrance GW, Thomas W, Sackett D. 1972. A utility maximization model for evaluation of health care programs. Health Serv. Res. 7(2):118–33
122. Tsevat FJ, Cook EF, Green ML, Matchar DB, Dawson NV, et al. 1995. Health values of the seriously ill. Ann. Intern. Med. 122:514–20
123. Tsevat J, Dawson MD, Wu AW, Lynn J, Soukup JR, et al. 1998. Health values of hospitalized patients 80 years old or older. JAMA 279:371–75
124. Tsevat J, Solzan JG, Kuntz KM, Ragland J, Currier JS, et al. 1996. Health values of patients infected with human immunodeficiency virus: relationship to mental health and physical functioning. Med. Care 34:44–57
125. Tsevat J, Weeks JC, Guadagnoli E, Tosteson ANA, Mangione CM, et al. 1994. Using health-related quality of life information: clinical encounters, clinical trials, and health policy. J. Gen. Intern. Med. 9:576–82
126. U.S. Preventive Services Task Force. 1996. Guide to Clinical Preventive Services. Baltimore, MD: Williams & Wilkins. 2nd ed. 953 pp.
127. Van Wijck EEE, Bosch JL, Hunink MGM. 1998. Time-tradeoff values and standard gamble utilities assessed during telephone interviews versus face-to-face interviews. Med. Decis. Mak. 18:400–5
128. Viscusi WK. 1993. The value of risks to life and health. J. Econ. Lit. 31:1912–46
129. Von-Neumann J, Morgenstern O. 1944. Theory of Games and Economic Behavior. Princeton, NJ: Princeton Univ. Press. 625 pp.
130. Wakker P. 1996. A criticism of healthy years equivalents. Med. Decis. Mak. 16:207–14
131. Ware JE, Kosinski M, Keller SD. 1996. A 12-item short-form health survey: construction of scales and preliminary tests of reliability and validity. Med. Care 34(3):220–33
132. Ware JE, Sherbourne CD. 1992. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med. Care 30(6):473–83
133. Weeks J, O’Leary J, Fairclough D, Paltiel D, Weinstein MC. 1994. The Q-tility index: a new tool for assessing health-related quality of life and utilities in clinical trials and clinical practice. Proc. Am. Soc. Clin. Oncol. 13th, Dallas, TX. p. 436 (Abstr.) Washington, DC: Am. Soc. Clin. Oncol.
133a. Weinstein MC, Fineberg HV. 1980. Clinical Decision Analysis. Philadelphia, PA: Saunders. 351 pp.
134. Weinstein MC, Siegel JE, Gold MR, Kamlet MS, Russell LB. 1996. Recommendations of the Panel on Cost-Effectiveness in Health and Medicine. JAMA 276:1253–58
135. Weinstein MC, Stason WB. 1977. Foundations of cost-effectiveness analysis for health and medical practice. N. Engl. J. Med. 296:716–21
136. Weisbrod BA. 1961. The Economics of Public Health. Philadelphia, PA: Univ. Penn. Press. 127 pp.
137. Williams A. 1974. The cost-benefit approach. Br. Med. Bull. 30(3):252–56
138. Wu AW, Mathews WC, Brysk LT, Atkinson JH, Grant I, et al. 1990. Quality of life in a placebo-controlled trial of Zidovudine in patients with AIDS and AIDS-related complex. J. Acquired Immune Defic. Syndr. 3:683–90