P1: FDJ March 31, 2000 12:0 Annual Reviews CHAP-24

Annu. Rev. Public Health. 2000. 21:587–611
Copyright © 2000 by Annual Reviews. All rights reserved

PREFERENCE-BASED MEASURES IN ECONOMIC EVALUATION IN HEALTH CARE

Peter J. Neumann, Sue J. Goldie, and Milton C. Weinstein
Program on the Economic Evaluation of Medical Technology, Center for Risk Analysis, Harvard School of Public Health, Boston, Massachusetts 02115; e-mail: [email protected]

Key Words utilities, preferences, quality-adjusted life years gained, QALYs, cost-utility analysis

Abstract Estimating preferences for states of health has been an active area of research in recent years. Unlike psychophysical approaches, which discriminate levels of health status, preference-based approaches incorporate values or utilities for health outcomes and can be used in cost-effectiveness analyses to aid resource allocation decisions. This chapter considers issues and controversies involved in using preference-based measures in economic evaluation in health care, with a particular emphasis on cost-utility analysis and the estimation of quality-adjusted life years. Topics considered include techniques for measuring preferences, the use of preference-based classification systems, the relationship between patient and community preferences, methods for obtaining utilities from clinical trials, mapping health status to health utilities, the development of “off-the-shelf” preference weights, and proposed alternatives to quality-adjusted life years. We also consider applications of cost-utility analyses to public health interventions. Although cost-utility analyses have become more popular recently, many challenges remain for the field. Widespread acceptance of the methodology likely awaits more consensus on measurement techniques, as well as educational efforts in the public health and medical communities on the usefulness of the approach.

INTRODUCTION

Health policy researchers and analysts have long been interested in the question of how people value health. The issue lies at the heart of attempts to assess the relative worth of different health and medical interventions. Knowing the value people attach to the health improvement they receive from different interventions could help to determine how to provide most efficiently more of the outcomes that people desire and fewer of those that they do not (47, 117).

In this chapter we consider preference-based measures in economic evaluation in health care. We start by reviewing how preferences are incorporated into economic evaluations, focusing on cost-utility analysis, which has emerged as the recommended practice for the field (134). [Some analysts use the generic term “cost-effectiveness analyses” to include all forms of analyses measuring costs per unit of health effect (47). Here we follow the practice used by Drummond et al (28) by referring to analyses measuring cost per quality-adjusted life year (QALY) as “cost-utility analyses.”] Next, we describe the concept of QALYs, followed by a discussion of emerging topics in preference estimation. The final section considers applications to public health interventions.

INCORPORATING PREFERENCES INTO ECONOMIC EVALUATION

Cost-Benefit Analysis

Health policy analysts traditionally used cost-benefit analysis (CBA) to assess the value of health programs (136, 137). In CBA, analysts estimate the net social benefit of a program or intervention as the incremental benefit of the program minus its incremental cost. All costs and benefits are measured in monetary units (e.g. dollars). The approach is useful because it leads to a simple decision-making rule: If a program’s incremental benefits exceed its incremental costs (that is, if its net benefit is positive), then it should be adopted.
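
The decision rule can be written compactly. In the sketch below, the symbols $\Delta B$, $\Delta C$, and $\mathit{NSB}$ are our own notation rather than the authors'; they stand for the incremental benefit, the incremental cost, and the net social benefit of a program relative to its comparator, all expressed in monetary units.

```latex
\[
  \mathit{NSB} \;=\; \Delta B \;-\; \Delta C ,
  \qquad
  \text{adopt the program if } \mathit{NSB} > 0 .
\]
```

Everything hinges on $\Delta B$ being stated in money, which is the measurement problem taken up next.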

However, CBA also raises measurement difficulties, because it requires the monetary valuation of health benefits. Early on, cost-benefit analysts tended to quantify health benefits with a “human capital” approach. That is, the value of reduced health was measured as the lost earnings of affected individuals. The advantage of the human capital approach was that it approximated value as the “productive potential” to society that would be lost through morbidity and mortality. It also permitted a relatively straightforward calculation. The disadvantage, as critics such as Schelling (104) and Mishan (77) noted, is that the approach has no basis in economic theory, because it ignores underlying consumer preferences and implies that unproductive periods such as leisure time and retirement are without value.

These observers noted that a superior approach would consider the fact that consumers make tradeoffs between health and other goods and services. People do not spend all of their money to relieve their symptoms or to reduce their risk of death; instead, they consume to the point at which the improvement justifies the costs. Therefore, health should be valued by determining how much individuals are willing to pay for it. Unlike human capital, willingness-to-pay measures are preference based. The metric is monetary, which allows tradeoffs with costs and nonhealth consequences.

A number of researchers have attempted to measure the value of health by assessing what people are willing to pay for specific health benefits. Economists typically measure the value of commodities by examining the prices of goods and services bought and sold in the marketplace. But since private markets for health benefits do not generally exist, it is difficult and often impossible to measure the value of health by appealing to market data. Therefore, researchers have turned to other measures.
Some have taken “revealed preference” approaches by imputing willingness to pay from comparable market prices or wages [e.g. the willingness to accept occupational risk could be valued as the incremental wage paid to such workers (128)]. The problem is that prices and wages may not be truly comparable [i.e. unique properties of risk and benefit associated with a job may be confounded by the magnitude of the risk itself (93)]. Others have used direct surveys of consumers, called willingness-to-pay or “contingent valuation” surveys, because responses are contingent on a hypothetical market for the good or service of interest (1, 4, 59, 83, 89, 110, 115). The advantages of the approach are that it is grounded explicitly in principles of welfare economics and provides a means to quantify the benefits of difficult-to-estimate factors such as the psychic benefits of symptom relief. The disadvantage is that researchers have often found that the method does not produce reliable estimates (111, 112).

Cost-Effectiveness Analysis

In recent years, cost-effectiveness analysis has emerged as a favored analytic technique for economic evaluation in health care (see footnote 1). A major appeal of cost-effectiveness analysis over cost-benefit analysis is that it allows analysts to quantify health benefits in terms of health rather than in monetary units. Cost-effectiveness analyses show the relationship between the net resources used (costs) and the net health benefits achieved (effects) for a specific intervention compared with a specific alternative strategy. Cost-effectiveness analyses involve comparisons between two alternatives or between the presence and absence of an intervention; the cost per effect (C/E) ratio reflects the difference in costs between the intervention and its comparator divided by the difference in their health effectiveness (47). If ratios are expressed in similar units, they can be compared to determine the most efficient ways to furnish health benefits.
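
The C/E ratio described above can be illustrated with a few lines of Python. This is a minimal sketch, not a method taken from the chapter: the `Strategy` class, the `ce_ratio` function, and the cost and effect figures are all hypothetical names and numbers chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Expected cost and health effect per person for one strategy."""
    name: str
    cost: float    # monetary units, e.g. dollars
    effect: float  # health units, e.g. life years or QALYs


def ce_ratio(intervention: Strategy, comparator: Strategy) -> float:
    """Difference in cost divided by difference in health effect."""
    delta_cost = intervention.cost - comparator.cost
    delta_effect = intervention.effect - comparator.effect
    if delta_effect == 0:
        raise ValueError("No difference in effect; the ratio is undefined.")
    return delta_cost / delta_effect


# Hypothetical figures, for illustration only.
usual_care = Strategy("usual care", cost=10_000.0, effect=5.0)
new_program = Strategy("new program", cost=14_000.0, effect=5.5)

ratio = ce_ratio(new_program, usual_care)
print(f"{ratio:,.0f} per unit of health effect gained")
```

Ratios computed this way are comparable across interventions only when the effect is measured in the same unit for each of them, which is the point developed in the paragraphs that follow.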

[Footnote 1: Many analysts have also conducted cost-identification analyses (which compute the net costs associated with an intervention but ignore health outcomes), cost-consequence analyses (which compute and list components of incremental costs and consequences of alternative programs without any attempt to aggregate results into a single metric), and cost-minimization analyses (which compare the costs of two interventions in which the outcomes are presumed to be equal). However, none of these methods incorporates preferences explicitly.]

Many cost-effectiveness analysts have expressed health benefits in terms of intermediate outcomes specific to the treatment and disease under investigation. For example, a researcher studying alternative strategies to prevent cancer might evaluate and compare each strategy by the costs incurred per cancer case prevented. The approach is advantageous in that it focuses narrowly on the clinical problem and is familiar to the clinicians who treat the disease. A disadvantage is that it does not permit comparisons of treatments for cancer with interventions for other conditions. For example, the cost per case of cervical cancer detected cannot be readily compared with the cost per case of Alzheimer’s disease detected. To inform societal decisions about which of many competing interventions produces the greatest gain in health for the resources expended, we need to express the numerators and denominators of C/E ratios of diverse medical interventions in similar terms.

One way to standardize cost-effectiveness ratios is to measure the health effects of interventions in terms of life expectancy; the cost-effectiveness ratio for each alternative would then reflect the cost per year of life gained. A limitation of this approach is that life expectancy alone does not take into account the quality of additional time that is gained (e.g.
an added month of life with disability or pain is valued the same as an added month without disability or pain). Ideally, an analysis would capture such effects. The widely recommended approach is to measure health outcomes in terms of QALYs to incorporate both the prolongation and quality of life (134). The advantages of QALYs are twofold: they capture in a single measure gains from both reduced morbidity and reduced mortality, and they incorporate the value or preferences people have for different outcomes (28).

THE CONCEPT OF QALYS

QALYs represent the benefit of a health intervention in terms of time in a series of quality-weighted health states, in which the quality weights reflect the desirability of living in the state, typically from “perfect” health (weighted 1.0) to dead (weighted 0.0) (91, 117, 135). Once the quality weights are obtained for each state, they are multiplied by the time spent in the state; these products are summed to obtain the total number of QALYs.

Researchers have used a number of techniques over the years to construct the quality weights. One option is to use the standard-gamble and time-tradeoff techniques, which have a sound theoretical basis in economic utility theory. These methods involve asking respondents to value health states by explicitly considering how much they would be willing to sacrifice to avoid being in a particular health state. [A variation is the person tradeoff approach, which involves asking people how many outcomes of one kind they consider equivalent to some quantity of outcomes of another kind (87a).] Alternative elicitation techniques include rating scales and ratio scales. In the former, respondents are asked to express the strength of their preferences for particular health states by marking a point on a scale.
The scale may be a visual-analog scale, which contains no internal markings (raters mark a point between two anchor states, such as dead and perfect health), or a category-rating scale, which is divided into discrete intervals. With a ratio scale, also called magnitude estimation, respondents assess the disutility of states as multiples of the disutility of a reference state.

As Drummond et al (28) note, people often use the terms “utility,” “value,” and “preference” interchangeably. Preference is a general term to describe the desirability of a set of outcomes. Values and utilities are different types of preferences that depend on the elicitation method; values are measured under conditions of certainty (rating scale, time-tradeoff), and utilities are measured under conditions of uncertainty that satisfy certain axioms of expected utility theory (standard gamble).

Cost-utility analysis has its roots in expected utility theory (129), which describes a normative model of rational decision making under conditions of uncertainty. Quality weights derived with methods other than the standard gamble are thus not technically utilities. But even QALYs developed with preferences constructed from standard gambles are considered utilities only if certain restrictive assumptions hold (see footnote 2). To date, only a few studies have examined the empirical evidence for the descriptive validity of QALYs as utilities, with mixed results (12, 63).

Regardless of the technique used to estimate them, the quality weights must meet several conditions (28). First, they should be based on individuals’ preferences for health states, as opposed to psychophysical (sometimes called psychometric) methods, which provide numerical assessments to reflect individuals’ health status.
Psychometric approaches are designed to discriminate among levels of health status, for example, the presence, frequency, or intensity of capabilities or feelings (99). Such measures (e.g. the Medical Outcomes Study 36-item short-form health survey) can be useful in measuring changes in health status over time, in predicting future health outcomes, and in discriminating among individuals with different diseases (99). But a limitation is that these measures do not necessarily reflect the value that either patients or members of the general population place on the various attributes of health being measured. That is, simply summing the scores on various health status scales does not ensure that a higher total corresponds to a state that individuals would view as better (47). For example, two individuals without the use of their legs might have the same numerical ranking on a psychometric scale but value that health state very differently. In contrast, preference-based approaches incorporate values or utilities for health outcomes and can be used in cost-effectiveness analyses to aid resource allocation decisions (117).

A second condition is that QALYs must be measured on an interval scale (133a), that is, a scale on which equal intervals have an equivalent interpretation. The reason is that cost-effectiveness analysis does not distinguish between a gain from, for example, 0.1 to 0.2 and a gain from 0.6 to 0.7, but treats all numerically identical gains as equal. Interval scaling means that comparisons in the incremental gains of QALYs are valid across programs or interventions. For example, an intervention that takes a quadriplegic from a value of 0.1 to 0.3 is treated the same as an intervention that takes a healthy person with back pain from 0.8 to 1.0.

[Footnote 2: These conditions include independence of preferences (utility scales for length of life and quality of life can be specified independently rather than conditionally upon the level of the other attribute); constant proportional tradeoffs between longevity and quality of life (one would be willing to give up some fraction of one’s life years in order to improve the quality of those years from one level to a preferred level, and that fraction depends only on the two quality levels and not on the length of life at the outset); and risk neutrality (utilities are directly proportional to longevity for a fixed quality level) (28, 135). QALYs can be represented by a more general, risk-adjusted model if the utility independence and constant proportional tradeoff assumptions hold (95a).]

A third condition for the quality weights is that they be anchored on measures of perfect health and dead. Traditionally, perfect health has been given a value of 1 and dead a value of 0, both for convenience and because giving perfect health a value of 1 means that QALYs reflect units of perfect health years (28). The anchors are required if the weights are used in calculations of QALYs, because they ensure that the absence of a year is equivalent to a year at zero weight (0.0 QALYs). Health states may also be judged to be “worse than dead” and take on values less than zero.

Observers have noted that many cost-utility analyses use the upper anchor of the scale to reflect only the absence of a particular health condition, ignoring the fact that the average patient is still subject to chronic and acute conditions (37). One way to correct for this is to use average age-specific health-related quality-of-life weights from population-based studies.
In addition, others point out that the health-related quality of life of those whose lives have been saved or extended by a health intervention may be influenced by age, gender, race, or socioeconomic status (47). One way to address this is to conduct sensitivity analyses to indicate explicitly how the results of analyses are affected by these characteristics (47).

A recent study of the cost effectiveness of a new drug to treat patients with Alzheimer’s disease illustrates how the preference measures can be used (82a). The study used a state-transition or Markov model to characterize the progression of Alzheimer’s disease through different disease stages and residential settings (Figure 1). Data from a clinical trial were used to simulate the paths of disease progression for patients (reflected by the arrows in the diagram) with and without treatment. If each state is assigned a cost and quality-of-life weight, the paths of two cohorts, one with and one without the drug, can be compared in terms of their costs and quality-adjusted life expectancy. The model could thus address the question of whether a new drug, through its ability to slow cognitive deterioration and thus forestall progression to more severe Alzheimer’s disease stages (and more costly residential settings), would produce health care cost savings and/or quality-of-life improvements over the option of no treatment.

Figure 1 An Alzheimer’s disease policy model. [From Neumann et al (82a), reprinted with permission]

TABLE 1 Preference weights in Alzheimer’s disease (a)

Stage / setting        Patients   Caregivers
Mild
  Community            0.68       0.86
  Nursing home         0.71       0.86
Moderate
  Community            0.54       0.86
  Nursing home         0.48       0.88
Severe
  Community            0.37       0.86
  Nursing home         0.31       0.88

(a) Source: Neumann et al (82a).
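
The cohort comparison described above can be sketched in a few lines of Python. This is an illustrative toy, not the published model: the quality weights are the community-setting patient values as we read them from Table 1, but the annual transition probabilities, the 50% reduction in progression, the 10-year horizon, and all function names are hypothetical assumptions. Residential settings, costs, and discounting are left out for brevity.

```python
STATES = ("mild", "moderate", "severe", "dead")

# Patient weights, community setting, as read from Table 1; dead is 0.0.
WEIGHTS = {"mild": 0.68, "moderate": 0.54, "severe": 0.37, "dead": 0.0}

# HYPOTHETICAL annual transition probabilities (not taken from the study).
NO_DRUG = {
    "mild":     {"mild": 0.70, "moderate": 0.25, "severe": 0.00, "dead": 0.05},
    "moderate": {"mild": 0.00, "moderate": 0.60, "severe": 0.30, "dead": 0.10},
    "severe":   {"mild": 0.00, "moderate": 0.00, "severe": 0.80, "dead": 0.20},
    "dead":     {"mild": 0.00, "moderate": 0.00, "severe": 0.00, "dead": 1.00},
}


def slow_progression(matrix, relative_risk):
    """Scale mild->moderate progression; the freed probability stays in mild."""
    treated = {src: dict(row) for src, row in matrix.items()}
    old = matrix["mild"]["moderate"]
    new = old * relative_risk
    treated["mild"]["moderate"] = new
    treated["mild"]["mild"] = matrix["mild"]["mild"] + (old - new)
    return treated


def quality_adjusted_life_years(matrix, weights, years, start="mild"):
    """Sum, over yearly cycles, of (share of cohort in state) x (state weight)."""
    cohort = {s: 0.0 for s in STATES}
    cohort[start] = 1.0
    total = 0.0
    for _ in range(years):
        # Credit one year in the current distribution, then move the cohort.
        total += sum(cohort[s] * weights[s] for s in STATES)
        cohort = {
            dst: sum(cohort[src] * matrix[src][dst] for src in STATES)
            for dst in STATES
        }
    return total


DRUG = slow_progression(NO_DRUG, relative_risk=0.5)  # hypothetical effect size
qalys_without = quality_adjusted_life_years(NO_DRUG, WEIGHTS, years=10)
qalys_with = quality_adjusted_life_years(DRUG, WEIGHTS, years=10)
print(f"QALYs without drug: {qalys_without:.2f}")
print(f"QALYs with drug:    {qalys_with:.2f}")
print(f"QALYs gained:       {qalys_with - qalys_without:.2f}")
```

Attaching a cost to each state and accumulating it in the same loop would give the cost side of the comparison; the difference in costs divided by the difference in QALYs is then the cost-utility ratio.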

To measure the preference weights associated with each disease stage and setting, the authors administered the Health Utilities Index Mark 2 (HUI2) to caregivers of Alzheimer’s patients. The weights obtained are shown in Table 1. It is important to emphasize that these weights were obtained using the HUI2 and should not be taken to reflect an absolute or universal interpretation of preferences for the health states in question. As noted below, other preference elicitation techniques or instruments can yield very different weights.

KEY TOPICS IN PREFERENCE ESTIMATION

Estimation of preferences in economic evaluation continues to be a very active area of research. Key topics include (a) techniques for measuring preferences, (b) the use of preference-weighted health state classification systems, (c) the relationship between patient and community preferences, (d) options for obtaining utilities from clinical trials, (e) mapping health status to health utility measures, (f) off-the-shelf preference weights, and (g) alternatives to QALYs. Each of these topics is considered briefly below. Note that this discussion is intended as a review of selected topics. The interested reader is encouraged to explore the references in more detail.

Techniques for Measuring Preference Weights

As noted, researchers have used a variety of methods to estimate preference weights. Many experts prefer choice methods (e.g. standard-gamble or time-tradeoff methods) over scaling methods (e.g. rating scales and ratio scales) (28). Scaling methods are frequently used for convenience, however, and investigators often use a mixture of methods in a given study (28). Considerable research over the years has tested the validity and reliability of the various techniques and the feasibility of eliciting responses (47, 117, 118).
Validity and reliability have been measured in numerous ways, with various results reported (117). For example, in comparison with the standard gamble, the time-tradeoff technique has been found to be relatively valid, whereas the rating scale has not (116). In general, rating scales, although easy to use, have been found to be subject to measurement bias (119). A number of studies have found that different elicitation methods lead to different preference weights, with most reporting that standard-gamble scores are greater than time-tradeoff scores, which are in turn greater than visual-analog scores (13, 28, 53, 81, 90, 97, 127).

In recent years, researchers have experimented with various alternative elicitation techniques, including estimating general-population utilities by using one binary-gamble question per respondent (16). [This technique involves asking a standard-gamble question of different subgroups of the population in which the risk of death varies across subgroups. The mean utility is then estimated by the area above the proportional distribution of responses indicating acceptance of the gamble (16).] Other methods that have been used include “chained procedures” to measure temporary health states (temporary health states weighed indirectly with the aid of intermediate anchor states) (57). Methods for optimizing sampling strategies for estimating QALYs have also been explored (96).

Investigating the impact of different survey administration methods is also an active area of research. For example, investigators have recently reported that the elicitation protocol used to search for subjects’ utility values can strongly influence results (66) and that telephone interviews yield time-tradeoff values and standard-gamble utilities similar to those from face-to-face interviews (127).
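
For readers unfamiliar with the two choice-based techniques discussed in this section, the following sketch shows how a single response is usually converted to a weight. The formulas are the common textbook formulation for a chronic state preferred to dead; they are not spelled out in this chapter, so treat them, along with the function names and the example responses, as assumptions made for illustration.

```python
def time_tradeoff_weight(years_in_full_health: float, years_in_state: float) -> float:
    """Weight implied when x years in perfect health is judged equal to
    t years in the health state: weight = x / t."""
    if not 0 <= years_in_full_health <= years_in_state or years_in_state <= 0:
        raise ValueError("Need 0 <= x <= t and t > 0.")
    return years_in_full_health / years_in_state


def standard_gamble_weight(indifference_probability: float) -> float:
    """Weight implied when the respondent is indifferent between the state
    for certain and a gamble giving perfect health with probability p and
    death with probability 1 - p: weight = p."""
    if not 0 <= indifference_probability <= 1:
        raise ValueError("Probability must lie between 0 and 1.")
    return indifference_probability


# Hypothetical responses from one respondent about the same health state.
tto = time_tradeoff_weight(years_in_full_health=7.0, years_in_state=10.0)
sg = standard_gamble_weight(indifference_probability=0.85)
print(f"time-tradeoff weight:  {tto:.2f}")
print(f"standard-gamble weight: {sg:.2f}")
```

The example responses were picked so that the standard-gamble weight exceeds the time-tradeoff weight, the ordering that most of the studies cited above report; the numbers themselves carry no empirical meaning.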

Using Preference Classification Systems

In terms of methods for describing health states, some researchers have measured quality-of-life weights based on direct, holistic utility assessment (e.g. asking patients a time-tradeoff question), whereas others have based weights on prespecified health state classification systems. With direct utility assessment, individuals are asked questions in which their current health state or a hypothetical health state described by a “vignette,” video clip, or other description is typically placed on a 0–1 scale between dead (0.0) and perfect health (1.0).

Several research groups have developed universal or “generic” health state classification systems, such as the Rosser Index (101), the HUI (32, 120), the Quality of Well-Being Scale (61), the EuroQol (31), and the Health and Limitations Index (46). These systems are designed to be complete and general enough to apply across many different types of conditions and treatments. They provide an indirect means of obtaining preference weights: Patients are assigned a health state classification based on responses to health status questionnaires, and prespecified preference weights obtained from other populations are then applied. However, the systems differ in terms of how they define the relevant domains or attributes of health, as well as the techniques used for obtaining preference weights.

Preference weights in the HUI, for example, are based on multiattribute utility theory, in which the domains of the classification system are regarded as attributes in a utility function. The latest version, the HUI Mark 3, contains eight attributes (vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain) with five to six levels of functioning per attribute (39).
Preference measurements are based on the visual-analog scale and the standard-gamble instruments and were collected from a population sample in Hamilton, Ontario.

The Quality of Well-Being Scale categorizes patients by symptoms and levels of functioning, which are represented by scales of mobility, physical activity, and social activity (61). Classifications are also based on the symptom or problem that individuals find most undesirable. The scoring function is based on category-scaling measurements of a random sample of the general public.

The EQ-5D system developed by the EuroQol Group contains five attributes (mobility, self-care, usual activity, pain/discomfort, and anxiety/depression) with three levels per attribute (61a). The scoring function for preferences is based on the time-tradeoff technique used on a random sample of the adult population in the United Kingdom.

Each of the three systems has been used extensively (28). [The Health and Limitations Index (HAL-ex), which uses a nationally representative sample to estimate quality-adjustment factors (46), is a more recent effort that has not yet been used widely.]

In terms of comparing scores produced by the different instruments, researchers have reported both similarities and differences in preference weights obtained with different instruments in the same population (45, 88). One recent study, for example, reported substantial agreement between patient self-ratings obtained using the EuroQol instrument and patients’ utility scores on the HUI for states representing lower levels of functioning, but the study also reported differences for higher levels of functioning (45).

Researchers have reported varied results in comparisons of preferences assessed with generic instruments vs preferences assessed directly. A number of studies have reported sizeable differences among the different approaches (17, 18, 124), although some research has found similarities (40).
Gabriel et al (40), for example, recently reported that preferences estimated with the HUI did not differ significantly from direct time-tradeoff scores in women with osteoporotic fractures.

The Panel on Cost Effectiveness in Health and Medicine noted that analysts selecting a health state classification system should choose one that reflects the domains important for the particular problem under consideration (47). If the cost-effectiveness analysis is intended for use in a reference case analysis (an analysis that incorporates a standard set of methodological practices and is intended to aid in broad societal resource allocation decisions), the preference measure used should be a generic one or be calibrated in such a way that it is capable of being compared with a generic system (47).

A general limitation of generic health state classification systems is that they may lack sensitivity to important differences in particular diseases. For example, a recent study found that the HUI2 scores for caregivers of patients with Alzheimer’s disease were insensitive to changes in patients’ disease stage, despite the fact that caregiver burden, measured with disease-specific instruments, increased with patients’ disease severity (84). For this reason, disease-specific classification systems that are appropriately preference weighted may play an important role, and they could be mapped to a generic measure suitable in the reference case (47). Examples include the Q-tility Index (133) for cancer and the Functional Capacity Index for trauma (71).

Relationship Between Patient and Community Preferences

Researchers have long debated the relevant populations to serve as the source of the preferences for health states. A common practice has been to use the preferences of the study investigators themselves (86).
Often, preferences of clinicians have been used, under the rationale that they are most familiar with the conditions under investigation. Some studies have used patient preferences, because they reflect the values of the individuals most directly affected by clinical decisions, whereas others have used a representative or convenience sample of the general population (see footnote 3). The argument for community-based preferences is that societal resource allocation decisions should be made by appealing to population-based community values (47).

A number of researchers have found that individuals afflicted with a specific disease tend to value their health state more highly than those who have not experienced the condition (29, 40, 102), although similarities in preferences have also been reported between patients and nonpatients (7, 68, 92, 100). Researchers have also reported that patient values are higher than their surrogates believe (122, 123). [Wide individual-to-individual variations in health values are common (36, 82, 123).] The general explanation for disparities between patients and nonpatients is that patients adapt to accommodate their limitations and alter their goals and expectations (47).

[Footnote 3: Alternatively, some analysts do not use anyone’s preferences for measuring health-related quality of life, but instead follow the psychometric tradition of asking a standard set of health status questions and then scoring responses. The scores are then used as or transformed into indicators of preference weights for the QALY calculation (86). Although this approach does require respondents to assess various attributes of health, as noted above it does not directly reflect respondents’ preferences for one health state over another.]

The Panel on Cost Effectiveness recommended community preferences for health states as the most appropriate preferences for use in a reference case analysis and recommended that weights be collected from a representative sample of the general population (47). They noted that a consistent set of community weights for health conditions and health states, used across studies intended to inform resource allocation, would significantly improve the comparability of analyses.

Options for Obtaining Utilities from Clinical Trials

A topic receiving considerable attention in recent years is the collection of preference weights directly in randomized clinical trials. The appeal of the approach lies in its potential to produce more reliable and precise estimates (33). Collecting such data as part of an ongoing trial also has logistical advantages, because the data collection can be combined with administration of other instruments (33). The approach is not without problems, however, because it may add to overall study costs and to respondent burden. Also, because health values may be relatively insensitive to clinical changes, clinical trials with these values as endpoints may require large sample sizes (125). There are also questions about the acceptability of the results in the clinical community.

Researchers have investigated a number of options for obtaining preferences from clinical trials, including the use of direct, holistic utility assessment (15, 64), as well as generic health state classification systems (34, 58, 138). Others have used disease-specific classification systems. An example of a cancer-specific system is the Quality-Adjusted Time Without Symptoms or Toxicity (Q-TWiST), developed for assigning patients in clinical trials to health states, using clinical-trials data (43).
Weights for Q-TWiST states have not yet been obtained directly; rather, analyses with Q-TWiST typically rely on threshold analyses of break-even utility values for toxicity and symptoms relative to TWiST and death (44).

Mapping Health Status to Health Preferences

A number of researchers have recently explored the relationship between psychometric health status measures and preference measures (3, 9, 17–21, 38, 70, 98, 106, 124). One goal has been to find a way to obtain preference-based measures from widely used health status instruments such as the Short Form (SF)-12 (131) and SF-36. However, the studies generally find poor-to-moderate correlations (in the range of 0.2–0.45) between psychometric and preference-based scales (comparisons across studies must be made with caution because analyses differ across population, sample size, and instrument used). The results indicate that psychometric and preference-based approaches measure different aspects of health for different purposes.

The studies generally do not lend strong support to the estimation of utilities directly from the SF-36 or other health status instruments, although some have reported more promising results. Fryback and colleagues (38), for example, reported that a six-variable regression equation drawn from five of the SF-36 components predicted 59.6% of the variance in quality of well-being (QWB) scores. Lundberg et al (70) found that a regression model that used the 12 items of the SF-12 scale explained 50% of the variance in rating-scale responses in a larger general-population sample. Bult et al (21) noted that a much higher percentage (85%) of variation was explained when the heterogeneity across subjects (calculated with latent class analysis to estimate unknown parameters and class membership) was taken into account.
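To make the idea of "variance explained" in these mapping studies concrete, the following minimal sketch fits a one-predictor least-squares line from a health status score to a preference score and reports R². It is an illustration only: the studies cited above used multi-variable regressions on real survey data, whereas the scores below are invented for the example, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y that the fitted line accounts for."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_resid / ss_total


# Invented data (NOT from any cited study): a 0-100 health status score
# and a 0-1 preference score for eight hypothetical respondents.
status_scores = [35, 50, 55, 60, 70, 80, 85, 95]
utilities = [0.60, 0.55, 0.75, 0.62, 0.80, 0.70, 0.90, 0.85]

intercept, slope = fit_line(status_scores, utilities)
r2 = r_squared(status_scores, utilities, intercept, slope)
print(f"predicted utility = {intercept:.3f} + {slope:.4f} * status score")
print(f"share of variance explained (R^2) = {r2:.2f}")
```

An R² well below 1 in such a fit is the pattern the text describes: a status score can track preferences on average while leaving much of the individual variation unexplained.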
Off-the-Shelf Weights

Leaders in the field have called for a standard catalog of weights that could be used in any cost-effectiveness analysis in lieu of primary data collection for each new analysis (47). The idea is that the catalog would comprise well-described health states with preference scores for each state, to allow users possessing enough descriptive information about health states, but not preference weights themselves, to obtain values for their analyses. The Panel on Cost-Effectiveness has recommended criteria for an ideal system for such a catalog, including derivation from a theory-based method on which empirical data have been collected; availability of weights from a representative, community-based sample; low burden of administration; and ability to furnish weights for health states, illnesses, and conditions (47).

Although no system to date meets the criteria, there have been a number of promising attempts to collect such weights. The Beaver Dam Health Outcomes Study (36), for example, based on >1300 respondents from the general population, has reported time-tradeoff and QWB scores for a variety of conditions. Selected preference weights are shown in Table 2. The idea is that researchers conducting cost-utility analyses could use these population reference values in estimating the value of a particular treatment (36). For example, a treatment that relieves asthma would return a patient with this condition from a quality-of-life level of 0.71 to a level of 0.87; these values would in turn be used in the QALY calculation. Note that persons without asthma have an average time-tradeoff score of 0.87 and not 1.00, reflecting the fact that they have other conditions and are not considered in "perfect health." Dolan and colleagues (27) have reported valuations for various health states based on the time-tradeoff method.
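Returning to the asthma example, the arithmetic by which such weights enter a QALY calculation can be sketched as follows. The weights 0.71 and 0.87 are the time-tradeoff values quoted in the text; the 10-year horizon, the 3% discount rate, and the convention of leaving the first year undiscounted are assumptions made for illustration, and `qaly_gain` is a hypothetical helper rather than a function from any published toolkit.

```python
def qaly_gain(weight_before, weight_after, years, discount_rate=0.0):
    """QALYs gained when a preference weight improves for `years` whole years.

    Each year contributes (weight_after - weight_before), discounted back to
    the present; year 0 is left undiscounted (an assumed convention).
    """
    for w in (weight_before, weight_after):
        if not 0.0 <= w <= 1.0:
            raise ValueError("preference weights are expected to lie in [0, 1]")
    gain = 0.0
    for t in range(years):
        gain += (weight_after - weight_before) / (1.0 + discount_rate) ** t
    return gain


# Time-tradeoff weights quoted in the text for persons with and without asthma.
WITH_ASTHMA = 0.71
WITHOUT_ASTHMA = 0.87

# Assumed horizon and discount rate, for illustration only.
undiscounted = qaly_gain(WITH_ASTHMA, WITHOUT_ASTHMA, years=10)
discounted = qaly_gain(WITH_ASTHMA, WITHOUT_ASTHMA, years=10, discount_rate=0.03)

print(f"QALYs gained over 10 years, undiscounted: {undiscounted:.2f}")
print(f"QALYs gained over 10 years, 3% discount:  {discounted:.2f}")
```

Because the reference level is 0.87 rather than 1.00, the gain per year of relief is 0.16 QALYs, not 0.29; an analysis that assumed a return to perfect health would overstate the benefit.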
Another effort has resulted in a set of health-related quality-of-life scores for chronic conditions based on nationally representative U.S. data from the National Health Interview Survey (46). Elsewhere, researchers have developed a comprehensive catalog of preference weights by using secondary data from published cost-utility analyses (8).

TABLE 2 Selected preference weights from Beaver Dam Health Outcomes Studyᵃ

                         Time trade-off scores (age-adjusted)   QWB scores (age-adjusted)
Condition                Persons with   Persons without         Persons with   Persons without
                         condition      condition               condition      condition
Asthma                   0.71           0.87                    0.68           0.73
Arthritis                0.82           0.90                    0.69           0.75
Angina                   0.79           0.87                    0.66           0.73
Stroke                   0.90           0.86                    0.68           0.73
Severe back pain         0.79           0.88                    0.67           0.74
Migraine                 0.82           0.86                    0.70           0.73
Myocardial infarction    0.73           0.86                    0.64           0.73
Diabetes (insulin)       0.63           0.87                    0.66           0.73
Depression               0.70           0.87                    0.65           0.73
Hiatal hernia            0.85           0.86                    0.70           0.73

ᵃ Source, Fryback et al (36). Abbreviation: QWB, quality of well-being. Reprinted with permission.

Proposed Alternatives to QALYs

Some researchers have argued for methods other than QALYs, maintaining that the QALY approach is too complex (24) or, in some cases, advocating the use of more complex measures (41, 74, 75). Proposed alternatives include healthy-years equivalents (HYEs) (74), saved young life equivalents (87), and disability-adjusted life years (DALYs) (79), although the options have their own limitations and are subject to debate themselves (47).

HYEs have been proposed as an alternative to QALYs (41, 74, 75) based on the claim that they avoid certain restrictive assumptions about preferences. For example, supporters claim that HYEs generalize from the constant proportionality of QALYs by permitting the rate of tradeoff between life years and quality of life to depend on the life span.
HYEs are calculated by measuring the utility for each possible "health pathway" of a stream of changing health states and converting this utility to an HYE by a second measurement. There is, however, considerable debate about this second component, which has been shown to be essentially equivalent to a simple time-tradeoff question (78, 130). Johannesson et al (60) argue that HYEs are by definition the same as the equivalent number of years in full health in the time-tradeoff developed by Torrance et al (121); they are essentially a generalization of risk-neutral QALYs in which the assumption of a constant proportional tradeoff between life years and quality of life is relaxed. Furthermore, the burden of utility assessment is appreciable because the number of HYEs must be calculated for every possible duration of time in the health state; in other words, HYEs require independent valuations of all possible health scenarios rather than individual health states (60). Thus, HYEs do not offer a practical solution to the problem of assigning utilities to health profiles for various qualities of life, because of the enormous scope of the task of assessing time tradeoffs for all possible sequences of health states over a lifetime.

DALYs have been widely used in economic evaluations conducted outside the United States (80), and outcomes for a large group of international health interventions have been reported as cost per DALY (56). DALYs were developed as the measurement unit for the Global Burden of Disease Study (80), whose goal was to quantify the burden of disease and injury in human populations. In effect, DALYs are similar to QALYs in that they provide a metric for quantifying life expectancy after adjusting for morbidity. However, whereas QALY weights are based on social preferences, DALY weights also incorporate age adjustments, based implicitly on economic productivity (i.e.
young or middle-aged adults receive higher weights than the elderly or small children).

DALYs incorporate two components, years of life lost (YLLs) and years lived with disability (YLDs). Thus the DALYs from any given condition are simply the sum of the YLLs and YLDs from that condition:

$$\mathrm{DALY}_i = \mathrm{YLL}_i + \mathrm{YLD}_i,$$

where $i$ is the condition. YLLs are calculated by using standard expected years of life lost and are discounted and age adjusted. YLDs are time lived in health states worse than perfect health, weighted by the preference weight for each health state. Preference weights for 22 indicator conditions have been developed using the person tradeoff method. Seven classes of disability have been defined based on these 22 indicator conditions and distributions of disabling severity generated for several hundred treated and untreated disabling sequelae. Both years of life and years with disability are also weighted by age-specific weighting factors, which assign a greater value to a year of young or middle-aged adult life as compared with a year of life lived by young children or the elderly.

DALYs are not without their own critics. Some have raised questions about the equity and ethics of the age weightings, for example (78a). Also, DALYs rely on Japanese life tables, regardless of the actual target population.

RECENT APPLICATIONS TO PUBLIC HEALTH INTERVENTIONS

In recent years, cost-utility analysis has been used to evaluate hundreds of interventions, ranging in scope from public health to clinical medicine (85). A recent review of this literature underscores the growth of the field and the variations in methods that analysts have used to estimate preference weights (85). The analyses have covered a wide range of conditions and interventions (Table 3).⁴ Most articles have focused on tertiary prevention (57.5%), followed by secondary (32.5%) and primary prevention (10.1%).
Of interest to public health professionals, there have been analyses of screening strategies (10.5%), health education interventions (5.3%), and immunizations (3.9%). Table 4 presents a list of selected cost-utility analyses of preventive interventions published from 1992 to 1997. The list underscores the diversity of interventions analyzed, from vaccines for pneumococcal pneumonia, Haemophilus influenzae type b, measles, tetanus, typhoid, dengue, and hepatitis A to screening strategies for breast cancer, cervical cancer, thyroid disease, and tuberculosis. Recent summaries of the cost-effectiveness ratios of these interventions are provided by Graham et al (49) and Stone et al (114).

Economic information from cost-utility analyses and related approaches is also increasingly being considered or used to develop clinical guidelines for public health. An example is a recent report by the Centers for Disease Control, which examined 19 prevention strategies based on the health impact and cost of the related disease, injury, and disability and the effectiveness and cost effectiveness of the strategy (76). Cost effectiveness has also been considered in the development of other public health guidelines, including the report of the U.S. Preventive Services Task Force (126).

⁴ Note that Tables 1 and 2 are organized in part by diagnostic categories, though the assignment of diagnosis may be somewhat arbitrary since individuals frequently have coexisting morbidities.

TABLE 3 Description of published cost-utility analyses, 1976–1997, by prevention stage, condition, and type of interventionᵃ

Parameter                                       Nᵇ    Percent
Prevention stageᶜ
  Primary                                       23    10.1
  Secondary                                     74    32.5
  Tertiary                                     131    57.5
Condition (ICD-9 category)
  Circulatory system                            58    24.6
  Neoplasms                                     40    17.9
  Infectious and parasitic                      35    15.6
  Genitourinary system                          14     6.3
  Digestive system                              12     5.4
  Musculoskeletal system                        12     5.4
  Endocrine, nutritional, and metabolic         11     4.9
  Nervous system and sense organs               11     4.9
  Mental disorders                              11     4.5
  Blood and blood-forming organs                 8     3.6
  Respiratory system                             5     2.2
  Injury and poisonings                          4     1.8
  Conditions originating in perinatal period     4     1.8
  Variousᵈ                                       3     3.0
Type of intervention
  Pharmaceutical                                73    32.0
  Surgical                                      41    18.0
  Diagnostic                                    26    11.4
  Screening                                     24    10.5
  Medical procedure                             16     7.0
  Care delivery                                 13     5.7
  Health education/behavior                     12     5.3
  Immunizations                                  9     3.9
  Medical device                                 6     2.6
  Other                                          2     0.9

ᵃ Source, Neumann et al (85).
ᵇ N = 228 studies.
ᶜ Primary preventive measures are those provided to prevent onset of a targeted condition (e.g. routine immunization of healthy children). Secondary preventive measures identify and treat asymptomatic persons who have already developed risk factors or preclinical disease, but in whom the condition has not become clinically apparent (e.g. screening for high blood pressure). Tertiary preventive measures include all medical or surgical interventions designed to limit disability after harm has occurred (see references 114a and 126).
ᵈ Articles covered diseases in more than one category.

TABLE 4 Cost-utility analyses of selected preventative interventions, 1992–1997ᵃ

Year   Reference   Intervention studiedᵇ
1992   23          Hormone replacement therapy
1992   50          Mammography screening
1993   11          Autologous blood donation/transfusion
1994   51          Hib vaccination
1994   73          Hib catch-up immunization
1994   42          Hormone replacement therapy/lifestyle intervention
1994   10          Preoperative autologous donation of various units
1994   48          Preoperative autologous donation, overall
1994   62          Screening for prostate cancer
1994   5           Solvent-detergent-treated fresh-frozen plasma
1995   30          Preoperative autologous blood donation
1995   14          Breast cancer screening
1995   105         Childhood vaccines
1995   2           Screening blood donors for hemochromatosis
1995   22          Screening blood donors to prevent posttransfusion hepatitis B and C infection
1995   72          Selective HBV vaccination
1996   52          Behavioral group HIV-prevention intervention
1996   67          PRP-T conjugate Hib vaccine
1996   113         Screening for abdominal aortic aneurysm
1996   26          Screening for asymptomatic carotid atherosclerotic disease
1996   25          Screening for mild thyroid failure
1996   35          Transdermal nicotine patch as an adjunct to physician's smoking cessation counseling
1997   103         Cardiovascular risk reduction program
1997   109         Hepatitis A vaccination in health care workers
1997   94          HIV postexposure chemoprophylaxis
1997   6           HIV testing protocols for donated blood
1997   95          HIV prevention skills training for men who have sex with men
1997   107         Pneumococcal bacterium vaccination in the elderly
1997   55          Routine use of intraoperative autologous transfusion device
1997   65          Screening for carotid disease in asymptomatic patients
1997   54          Universal cancer screening program

ᵃ Source, Stone et al (114).
ᵇ Abbreviations: Hib, Haemophilus influenzae type b; HBV, hepatitis B virus; HIV, human immunodeficiency virus; PRP-T, polysaccharide-tetanus toxoid.

TABLE 5 Methods used for estimating preference weights in published cost-utility analysesᵃ

Measurement of preference weights           Nᵇ    Percent
Measurement scale used
  Study specific                           109    47.8
  Based on previous study                   60    26.3
  Preexisting generic                       44    19.3
  Preexisting disease specific               2     0.9
  Could not be determined                   13     5.7
Source of preferencesᶜ
  Author                                    74    32.4
  Clinician                                 59    25.9
  Patient                                   55    24.1
  Community                                 52    22.8
Preference measurement technique
  Author's judgment                         77    33.8
  Formal techniqueᵈ                        127    55.7
    Rating scale/magnitude estimation       59    25.9
    Time trade off                          46    20.2
    Standard gamble                         22     9.6
  Could not be determined                   59    25.9

ᵃ Source, Neumann et al (85).
ᵇ N = 228 studies.
ᶜ More than one response allowed per article.
ᵈ Includes 59 (25.9%) using rating scale, 46 (20.2%) using time trade off, and 22 (9.6%) using standard gamble.
A challenge for all of these efforts is that economic evaluations frequently use different methods, raising questions about the comparability of studies. Table 5, for example, summarizes the methods used for estimating preferences in published cost-utility analyses. These data underscore the considerable variation in the measurement scales, sources of preferences, and preference measurement techniques used. For example, studies most often used health states specific to the study at hand (47.8%); only 19.3% used a preexisting generic health state classification system (e.g. HUI) as recommended by the Panel on Cost-Effectiveness in Health and Medicine. Most frequently, the authors themselves were the source of preferences (32.4%), followed by clinicians (25.9%), patients (24.1%), and members of the community (22.8%). Whereas just over half of the studies used some kind of formal measurement technique (55.7%), many studies relied on authors' judgment (33.8%); in over one-quarter of the cases, the measurement technique could not be determined. In general, few studies have adhered to recommendations now provided by leaders in the field for preference estimation.

In some ways these results are not surprising, because estimating preference weights remains a young field, characterized by ongoing debate on several conceptual issues, such as the appropriate source of these weights. Still, the extent of the variations observed and the persistence of such trends over time (86) are troubling and may contribute to the lack of acceptability of the method among decision makers (69, 108). Concerns about the comparability and credibility of analyses will likely persist without further improvements and standards in the field (85).

CONCLUSIONS

Estimating preferences in economic evaluation in health care continues to be an extremely active area of research.
Although cost-utility analyses have become more popular in recent years, many challenges remain for the field. Widespread acceptance of the methodology may await more consensus over measurement techniques, as well as educational efforts in the medical community on the potential usefulness of the approach. The challenge for the future will be to find reliable and valid measurement techniques.

ACKNOWLEDGMENTS

The authors are grateful to Sally Araki for comments on an earlier version of this manuscript and to Vijay Ramakrishnan for excellent research assistance.

LITERATURE CITED

1. Acton JP. 1973. Evaluating public programs to save lives: the case of heart attacks. R-950-RC, Rand Corp., Santa Monica, Calif.
2. Adams PC, Gregor JC, Kertesz AE, Valberg LS. 1995. Screening blood donors for hereditary hemochromatosis: decision analysis model based on a 30-year database. Gastroenterology 109:177–88
3. Andresen EM, Patrick DL, Carter WB, Malmgren JA. 1995. Comparing the performance of health status measures for healthy older adults. J. Am. Geriatr. Soc. 43:1030–34
4. Appel LJ, Steinberg EP, Powe NR, Anderson GF, Dwyer SA, Faden RR. 1990. Risk reduction from low osmolality contrast media: What do patients think it is worth? Med. Care 28:324–37
5. AuBuchon JP, Birkmeyer JD. 1994. Safety and cost-effectiveness of solvent-detergent-treated plasma: in search of zero-risk blood supply. JAMA 272:1210–14
6. AuBuchon JP, Birkmeyer JD, Busch MP. 1997. Cost effectiveness of expanded human immunodeficiency virus testing protocols for donated blood. Transfusion 37:45–51
7. Balaban DJ, Sagi PC, Goldfarb NI, Nettler S. 1986. Weights for scoring the quality of well-being instrument among rheumatoid arthritics: a comparison to general population weights. Med. Care 24:973–80
8. Bell C, Chapman RC, Sandberg EA, Stone PW, Neumann PJ. 1999.
An off-the-shelf help list: a comprehensive catalog of preference weights from published cost-utility analyses. Med. Decis. Mak. 19:519 (Abstr.)
9. Bennett KJ, Torrance GW, Moran L, Smith F, Goldsmith CH. 1997. Health state utilities in knee replacement surgery: the development and evaluation of McKnee. J. Rheumatol. 24:1796–805
10. Birkmeyer JD, AuBuchon JP, Littenberg B, O'Connor GT, Nease RF, et al. 1994. Cost-effectiveness of preoperative autologous donation in coronary artery bypass grafting. Ann. Thorac. Surg. 57:161–68
11. Birkmeyer JD, Goodnough LT, AuBuchon JP, Littenberg B. 1993. The cost-effectiveness of preoperative autologous blood donation for total hip and knee replacement. Transfusion 33:544–51
12. Bleichrodt H, Johannesson M. 1996. An experimental test of constant proportional tradeoff and utility independence. Med. Decis. Mak. 17:21–32
13. Bleichrodt H, Johannesson M. 1997. An experimental test of the theoretical foundation for rating scale valuations. Med. Decis. Mak. 17(2):208–16
14. Boer R, de Koning HJ, van Oortmarssen GJ, van der Maas PJ. 1995. In search of the best upper age limit for breast cancer screening. Eur. J. Cancer 31A:2040–43
15. Bombardier C, Ware J, Russell IJ, Larson M, Chalmers A, et al. 1986. Auranofin therapy and quality of life in patients with rheumatoid arthritis: results of a multicenter trial. Am. J. Med. 81:565–81
16. Bosch JL, Hammitt JK, Weinstein JC, Hunink MGM. 1998. Estimating general population utilities using one binary-gamble question per respondent. Med. Decis. Mak. 18:381–90
17. Bosch JL, Hunink MGM. 1996. The relationship between descriptive and valuational quality-of-life measures in patients with intermittent claudication. Med. Decis. Mak. 165:217–25
18. Bosch JL, Hunink MGM, Tetteroo E, Bos JJ, Mali WPTM. 1994. Quality of life assessment in patients with peripheral arterial disease. Med. Decis. Mak. 14(4):425 (Abstr.)
19. Brazier J, Jones N, Kind P. 1993. A comparison of two health status measures: Euroqol meets SF-36. Presented at Health Econ. Study Group/Fac. Public Health Med. Conf., Univ. York, York, UK
20. Bult JR, Bosch JL, Hunink MGM. 1996. Heterogeneity in the relationship between the standard-gamble utility measure and health-status dimensions. Med. Decis. Mak. 16:226–23
21. Bult JR, Hunink MGM, Tsevat J, Weinstein MC. 1998. Heterogeneity in the relationship between the time tradeoff and short form-36 for HIV-infected and primary care patients. Med. Care 36:523–32
22. Busch MP, Korelitz JJ, Kleinman SH, AuBuchon JP, Schreiber GB. 1995. The Retrovirus Epidemiology Donor Study: declining value of alanine aminotransferase in screening of blood donors to prevent posttransfusion hepatitis B and C virus infection. Transfusion 35:903–10
23. Cheung AP, Wren BG. 1992. A cost-effectiveness analysis of hormone replacement therapy in the menopause. Med. J. Aust. 156:312–16
24. Cox D, Fitzpatrick R, Fletcher A, Gore S, Spiegelhalter D, Jones D. 1992. Quality-of-life assessment: Can we keep it simple? J. R. Stat. Soc. A 155(3):353–93
25. Danese MD, Powe NR, Sawin CT, Ladenson PW. 1996. Screening for mild thyroid failure at the periodic health examination. JAMA 276:285–92
26. Derdeyn CP, Powers WJ. 1996. Cost-effectiveness of screening for asymptomatic carotid atherosclerotic disease. Stroke 27:1944–50
27. Dolan P, Gudex C, Kind P, Williams A. 1996. The time trade-off method: results from a general population study. Health Econ. 5:141–54
28. Drummond MF, O'Brien B, Stoddart GL, Torrance GW. 1997. Methods for the Economic Evaluation of Health Care Programmes. Oxford, UK: Oxford Univ. Press
29. Epstein AM, Hall JA, Tognetti J, Son LH, Conant L Jr. 1989. Using proxies to evaluate quality of life: Can they provide valid information about patients' health status and satisfaction with medical care? Med. Care 27(Suppl.):S91–98
30. Etchason J, Petz L, Keeler E, Calhoun L, Kleinman S, Snider C. 1995. The cost effectiveness of preoperative autologous blood donations. N. Engl. J. Med. 332:719–24
31. EuroQol Group. 1990. EuroQol - a new facility for the measurement of health related quality of life. Health Policy 16:199–208
32. Feeny D, Furlong W, Boyle M, Torrance G. 1995. Multi-attribute health states classification systems: health utilities index. Pharmacoeconomics 7:490–502
33. Feeny D, Labelle R, Torrance GW. 1990. Integrating economic evaluations and quality of life assessments. In Quality of Life Assessments in Clinical Trials, ed. B Spilker, pp. 85–95. New York: Raven
34. Feeny DH, Torrance GW. 1989. Incorporating utility-based quality-of-life assessment measures in a randomised trial. Med. Care 27(Suppl. 3):S190–204
35. Fiscella K, Franks P. 1996. Cost-effectiveness of the transdermal nicotine patch as an adjunct to physicians' smoking cessation counseling. JAMA 275:1247–51
36. Fryback DG, Dasbach EJ, Klein R, Klein BEK, Dorn N, et al. 1993. The Beaver Dam Health Outcomes Study: initial catalog of health-state quality factors. Med. Decis. Mak. 13:89–102
37. Fryback DG, Lawrence WF. 1997. Dollars may not buy as many QALYs as we think: a problem with defining quality-of-life adjustments. Med. Decis. Mak. 17:276–84
38. Fryback DG, Lawrence WF, Martin PA, Klein R, Klein BE. 1997. Predicting quality of well-being scores from the SF-36: results from the Beaver Dam Health Outcomes Study. Med. Decis. Mak. 17:1–9
39. Furlong W, Feeny D, Torrance GW. 1998. Multiplicative, multi-attribute utility function for the Health Utilities Index Mark 3 (HUI3) system: a technical report. Work. Pap. 98-11. McMaster Univ. Cent. Health Econ. Policy Anal., Montreal, Canada
40. Gabriel SE, Kneeland TS, Melton LJ, Moncur MM, Ettinger B, Tosteson ANA. 1999. Health-related quality of life in economic evaluation of osteoporosis. Med. Decis. Mak. 19:141–48
41. Gafni A, Birch S. 1993.
Economics, health and health economics: HYEs versus QALYs. J. Health Econ. 11:325–39
42. Geelhoed E, Harris A, Prince R. 1994. Cost-effectiveness analysis of hormone replacement therapy and lifestyle intervention for hip fracture. Aust. J. Public Health 18:153–60
43. Gelber RD, Goldhirsch A. 1986. A new endpoint for the assessment of adjuvant therapy in postmenopausal women with operable breast cancer. J. Clin. Oncol. 4:1772–79
44. Gelber RD, Goldhirsch A, Cole BF. 1993. Evaluation of effectiveness: Q-TWiST. Cancer Treat. Rev. 19:73–84
45. Glick HA, Polsky D, Willke RJ, Schulman KA. 1999. A comparison of preference assessment instruments used in a clinical trial. Med. Decis. Mak. 19:265–75
46. Gold MR, Franks P, McCoy KI, Fryback DG. 1998. Toward consistency in cost-utility analyses: using national measures to create condition-specific values. Med. Care 36(6):778–92
47. Gold MR, Siegel JE, Russell LB, Weinstein MC. 1996. Cost-effectiveness in Health and Medicine. Oxford, UK: Oxford Univ. Press
48. Goodnough LM, Grishaber JE, Birkmeyer JD, Monk TG, Catalona WJ. 1994. Efficacy and cost-effectiveness of autologous blood predeposit in patients undergoing radical prostatectomy procedures. Urology 44:226–31
49. Graham JD, Corso PS, Morris JM, Segui-Gomez M, Weinstein MC. 1998. Evaluating the cost-effectiveness of clinical and public health measures. Annu. Rev. Public Health 19:125–52
50. Hall J, Gerard K, Salkeld G, Richardson J. 1992. A cost-utility analysis of mammography screening in Australia. Soc. Sci. Med. 34:993–1004
51. Harris A, Hendrie D, Bower C, Payne J, de Klerk N, Stanley F. 1994. The burden of Haemophilus influenzae type b disease in Australia and an economic appraisal of the vaccine PRP-OMP. Med. J. Aust. 160:483–88
52. Holtgrave DR, Kelly JA. 1996. Preventing HIV/AIDS among high-risk urban women: the cost-effectiveness of a behavioral group intervention. Am. J. Public Health 86:1442–45
53. Hornberger JC, Redelmeier DA, Petersen J. 1992. Variability among methods to assess patients' well-being and consequent effect on a cost-effectiveness analysis. J. Clin. Epidemiol. 45(5):505–12
54. Hristova L, Hakama M. 1997. Effect of screening for cancer in the Nordic countries on deaths, cost and quality of life up to the year 2017. Acta Oncol. 36(Suppl. 9):1–60
55. Huber TS, McGorray SP, Carlton LC, Irwin PB, Flug RR, Flynn TC. 1997. Intraoperative autologous transfusion during elective infrarenal aortic reconstruction: a decision analysis model. J. Vasc. Surg. 25:984–94
56. Jamison DT, Mosley WH, Measham AR, Bobadilla JL, eds. 1993. Disease Control Priorities in Developing Countries. New York: Oxford Univ. Press
57. Jansen SJ, Stiggelbout AM, Wakker PP, Vliet Vlieland TP, Leer JW, et al. 1998. Patients' utilities for cancer treatments: a study of the chained procedure for the standard gamble and time tradeoff. Med. Decis. Mak. 18:391–99
58. Jenkinson C, Gray A, Doll H, Lawrence K, Keoghane S, Layte R. 1997. Evaluation of index and profile measures of health status in a randomized controlled trial. Med. Care 35:1109–18
59. Johannesson M, Jonsson B. 1991. Willingness to pay for antihypertensive therapy: results of a Swedish pilot study. J. Health Econ. 10:461–74
60. Johannesson M, Pliskin J, Weinstein M. 1993. Are healthy-years equivalents an improvement over quality-adjusted life years? Med. Decis. Mak. 13(4):281–86
61. Kaplan RM, Anderson JP, Wu AW, Matthews WC, Kozin F, Orenstein D. 1989. The Quality of Well-Being Scale: applications in AIDS, cystic fibrosis, and arthritis. Med. Care 27(Suppl. 3):S27–43
61a. Kind P. 1996. The EuroQol instrument: an index of health-related quality of life. In Quality of Life and Pharmacoeconomics in Clinical Trials, ed. B Spilker, pp. 191–201. New York: Raven. 2nd ed.
62. Krahn MD, Mahoney JE, Eckman MH, Trachtenberg J, Pauker SG, Detsky A. 1994. Screening for prostate cancer: a decision analytic view. JAMA 272:773–80
63. Kupperman M, Shiboski S, Feeny D, Elkin EP, Washington AE. 1997. Can preference scores for discrete states be used to derive preference scores for an entire path of events? An application to prenatal diagnosis. Med. Decis. Mak. 17:42–55
64. Laupacis A, Wong C, Churchill D, Canadian Erythropoietin Study Group. 1991. The use of generic and specific quality-of-life measures in hemodialysis patients treated with erythropoietin. Control. Clin. Trials 12:168S–79S
65. Lee TT, Solomon NA, Heidenreich PA, Oehlert J, Garber AM. 1997. Cost-effectiveness of screening for carotid stenosis in asymptomatic patients. Ann. Intern. Med. 126:337–46
66. Lenert LA, Cher DJ, Goldstein MK, Bergen MR, Garber A. 1998. The effect of search procedures on utility elicitations. Med. Decis. Mak. 18:76–83
67. Livartowski A, Boucher J, Detournay B, Reinert P. 1996. Cost-effectiveness evaluation of vaccination against Haemophilus influenzae invasive diseases in France. Vaccine 14:495–500
68. Llewellyn-Thomas H, Sutherland HJ, Tibshirani R, Ciampi A, Till JE, Boyd NF. 1984. Describing health states: methodologic issues in obtaining values for health states. Med. Care 22:543–52
69. Luce BR, Lyles CA, Rentz AM. 1996. The view from managed care pharmacy. Health Aff. 15(4):168–76
70. Lundberg L, Johannesson M, Isacson DGL, Borgquist L. 1999. The relationship between health-state utilities and the SF-12 in a general population. Med. Decis. Mak. 19:128–40
71. MacKenzie EJ, Damiano AM, Miller TS, Luchter S. 1996. The development of the Functional Capacity Index. J. Trauma 41:799–807
72. Mangtani P, Hall AJ, Normand CE. 1995. Hepatitis B vaccination: the cost effectiveness of alternative strategies in England and Wales. J. Epidemiol. Community Health 49:238–44
73. McIntyre P, Hall J, Leeder S. 1994. An economic analysis of alternatives for childhood immunisation against Haemophilus influenzae type b disease. Aust. J. Public Health 18:394–400
74. Mehrez A, Gafni A. 1989. Quality adjusted life years, utility theory, and healthy-years equivalents. Med. Decis. Mak. 9(2):142–49
75. Mehrez A, Gafni A. 1993. Healthy-years equivalents versus quality-adjusted life years: in pursuit of progress. Med. Decis. Mak. 13:287–92
76. Messonier ML, Corso PS, Teutsh SM, Haddix AC, Harris JR. 1999. An ounce of prevention: What are the returns? Am. J. Prev. Med. 16(3):248–63
77. Mishan EJ. 1971. Evaluation of life and limb: a theoretical approach. J. Polit. Econ. 79:687–706
78. Morrison GC. 1997. Healthy years equivalent and time tradeoff: What is the difference? J. Health Econ. 16:563–78
78a. Morrow RH, Bryant JH. 1995. Health policy approaches to measuring and valuing human life: conceptual and ethical issues. Am. J. Public Health 85(10):1356–60
79. Murray CJ. 1994. Quantifying the burden of disease: the technical basis for disability-adjusted life years. Bull. WHO 72(3):429–45
80. Murray CJL, Lopez AD. 1996. The Global Burden of Disease. Geneva, Switzerland: WHO. 990 pp.
81. Nease RF, Brooks WB. 1995. Patient desire for information and decision making in health care decisions: the Autonomy Preference Index and the Health Opinion Survey. J. Gen. Intern. Med. 10(11):593–600
82. Nease RF Jr, Kneeland T, O'Conner GT, Sumner W, Lumpkins C, et al. 1995. Variation in patient utilities for outcomes of the management of chronic stable angina: implications for clinical practice guidelines. JAMA 273:1185–90
82a. Neumann PJ, Hermann RC, Kuntz KM, Araki SS, Duff SB, et al. 1999. Cost-effectiveness of Donepezil in the treatment of mild or moderate Alzheimer's disease. Neurology 52:1138–45
83. Neumann PJ, Johannesson M. 1994.
The willingness to pay for in vitro fertilization: a pilot study using contingent valuation. Med. Care 32(7):686–99 84. Neumann PJ, Kuntz KM, Leon J, Araki SS, Hermann RC, et al. 1999. Health utilities in Alzheimer’s disease: a crosssectional study of patients and caregivers. Med. Care 37:27–32 85. Neumann PJ, Stone PW, Chapman RH, Sandberg EA. 1999. A formal audit of 228 published cost-utility analyses. Harv. Sch. Public Health Work. Pap. Harv. Univ., Boston, Mass. 86. Neumann PJ, Zinner DE, Wright JC. 1997. Are methods for estimating QALYs in cost-effectiveness analyses improving? Med. Decis. Mak. 17:402–8 87. Nord E. 1992. An alternative to QALYs: the saved young life equivalent (SAVE). Br. Med. J. 305(6858):875–77 87a. Nord E. 1995. The person trade-off approach to valuing health care programs. Med. Decis. Mak. 15(3):201–8 88. Nord E. 1996. Health status index models for use in resource allocation decisions. Int. J. Technol. Assess. Health Care 12:31–44 89. O’Brien B, Gafni A. 1996. When do the “dollars” make sense?: toward a conceptual framework for contingent valuation studies in health care. Med. Decis. Mak. 16(3):288–99 90. O’Leary JF, Faircloth DL, Jankowski MK, Weeks JC. 1995. Comparison of time trade-off utilities and rating scale 91. 92. values of cancer patients and their relatives: evidence for a possible plateau relationship. Med. Decis. Mak. 15:132–37 Patrick DL, Erickson P. 1993. Health Status and Health Policy. New York: Oxford Univ. Press Patrick DL, Sittampalam Y, Somerville S, Carter W, Bergner M. 1985. A crosscultural comparison of health status values. Am. J. Public Health 75:1402–7 Pauly MV. 1995. Valuing health care benefits in money terms. In Valuing Health Care, ed. F Sloan, pp. 99–124. London: Cambridge Univ. Press Pinkerton SD, Holtgrave DR, Pinkerton HJ. 1997. Cost-effectiveness of chemoprophylaxis after occupational exposure to HIV. Arch. Intern. Med. 177:1972– 80 Pinkerton SD, Holtgrave DR, Valdisseri RO. 1997. 
Cost-effectiveness of HIVprevention skills training for men who have sex with men. AIDS 11:347–57 Pliskin JS, Shepard DS, Weinstein MC. 1980. Utility functions for life years and health status. Oper. Res. 28:206–24 Ramsey SD, Etzioni R, Troxel A, Urban N. 1997. Optimizing sampling strategies for estimating quality-adjusted life-years. Med. Decis. Mak. 17:431–38 Read JL, Quinn RJ, Berwick DM, Fineberg HV, Weinstein MC. 1984. Preferences for health outcomes: comparison of assessment methods. Med. Decis. Mak. 4:315–29 Revicki DA. 1992. Relationship between health utility and psychometric health status measures. Med. Care 30(Suppl.):MS274–82 Revicki DA, Kaplan RM. 1993. Relationship between psychometric and utilitybased approaches to the measurement of health-related quality of life. Qual. Life Res. 2:477–87 Revicki DA, Shakespeare A, Kind P. 1996. Preferences for schizophrenia-related health states: a comparison of ? 93. 94. 95. 95a. 96. 97. 98. 99. 100. 609 P1: FDJ March 31, 2000 610 101. 102. 103. 104. 105. 106. 107. 108. 109. 110. 111. 12:0 Annual Reviews NEUMANN ■ GOLDIE ■ CHAP-24 WEINSTEIN on willingness-to-pay and starting point patients, caregivers, and psychiatrists. Int. bias. Med. Decis. Mak. 16:242–47 Clin. Psychopharmacol. 11:101–8 Rosser R, Kind P. 1978. A scale of valua- 112. Stalhammer NO, Johanesson M. 1996. Valuation of health changes with the contions of states of illness: Is there a social tingent valuation method: a test of scope consensus? Int. J. Epidemiol. 7(4):347– and question order effects. Health Econ. 58 5:531–41 Sackett D, Torrance G. 1978. The utility of different health states as perceived 113. St Leger AS, Spencely M, McColloum CN, Mossa M. 1996. Screening for abby the general public. J. Chronic Dis. dominal aortic aneurysm: a computer as31(11):697–704 sisted cost-utility analysis. Eur. J. Vasc. Salkeld G, Phongsavan P, Oldenburg B, Endovasc. Surg. 11:183–90 Johannesson M, Convery P, et al. 1997. 
The cost-effectiveness of a cardiovascular 114. Stone PW, Teutsch SM, Chapman RH, Bell C, Neumann PJ. 1999. Cost-utility risk reduction program in general pracanalyses of clinical preventive services. tice. Health Policy 41:105–19 Harv. Sch. Public Health Work. Pap., Schelling TC. 1968. The life you save Harv. Univ. Boston, Mass. 32 pp. may be your own. In Problems in Public Expenditure Analysis, ed. SB Bhase, 114a. Tengs, Adams ME, Pliskin JS, Safram DG, Siegel JE, et al. 1995. Five-hundred pp. 127–76. Washington, DC: Brookings life-saving interventions and their costInst. effectiveness. Risk Anal. 15(3):369–90 Shepard DS, Walsh JA, Kleinau E, Stansfield S, Bhalotra S. 1995. Setting pri- 115. Thompson MS. 1986. Willingness to pay and accept risks to cure chronic disease. orities for the Children’s Vaccine InitiaAm. J. Public Health 76:392 tive: a cost-effectiveness approach. Vac116. Torrance GW. 1976. Social preferences cine 13:707–14 for health states: an empirical evaluation Shmueli A. 1999. Subjective health status of three measurement techniques. Socioand health values in the general populaEcon. Plan. Sci. 10:128–36 tion. Med. Decis. Mak. 19:122–26 Sisk JE, Moskowitz AJ, Whang W, 117. Torrance GW. 1986. Measurement of health state utilities for economic apLin JD, Fedson DS, et al. 1997. Costpraisal. J. Health Econ. 5:1–30 effectiveness of vaccination against Pneumococcal bacteremia among elderly peo- 118. Torrance GW, Boyle MH, Horwood SP. 1982. Application of multi-attribute utilple. JAMA 278:1333–39 ity theory to measure social preferences Sloan FA, Whetten-Goldstein K, Wilfor health states. Oper. Res. 30:1043–69 son A. 1997. Hospital pharmacy decisions, cost-containment, and the use of 119. Torrance GW, Feeny DH, Furlong WJ, Barr RD, Zhang Y, Wang Q. 1996. cost-effectiveness analysis. Soc. Sci. Med. 
Multi-attribute preference functions for 45(4):525–33 a comprehensive health status classificaSmith S, Weber S, Wiblin T, Nettleman tion system: health utilities index mark 2. M. 1997. Cost-effectiveness of hepatitis Med. Care 24(7):702–22 A vaccination in health care workers. Infect. Control Hosp. Epidemiol. 18:688–91 120. Torrance GW, Furlong WJ, Feeny DH, Boyle M. 1995. Multi-attribute preferSorum PC. 1999. Measuring patient prefence functions: health utilities index. erences by willingness to pay to avoid: Pharmacoeconomics 7:503–20 the case of acute otitis media. Med. De121. Torrance GW, Thomas W, Sackett D. cis. Mak. 19:27–37 1972. A utility maximization model Stalhammer NO. 1996. An empirical note ? P1: FDJ March 31, 2000 12:0 Annual Reviews CHAP-24 PREFERENCES IN ECONOMIC EVALUATION 122. 123. 124. 125. 126. 127. 128. 129. 130. 611 for evaluation of health care programs. 131. Ware JE, Kosinski M, Keller SD. 1996. A 12-item short-form health survey: conHealth Serv. Res. 7(2):118–33 struction of scales and preliminary tests Tsevat FJ, Cook EF, Green ML, Matchar of reliability and validity. Med. Care DB, Dawson NV, et al. 1995. Health val34(3):220–33 ues of the seriously ill. Ann. Intern. Med. 132. Ware JE, Sherbourne CD. 1992. The 122:514–20 MOS 36-item short-form health survey Tsevat J, Dawson MD, Wu AW, Lynn J, (SF-36). I. Conceptual framework and Soukup JR, et al. 1998. Health values of item selection. Med. Care 30(6):473–83 hospitalized patients 80 years old or older. 133. Weeks J, O’Leary J, Fairclough D, Paltiel JAMA 279:371–75 D, Weinstein MC. 1994. The Q-tility inTsevat J, Solzan JG, Kuntz KM, Ragland dex: a new tool for assessing healthJ, Currier JS, et al. 1996. Health values related quality of life and utilities in clinof patients infected with human immunical trials and clinical practice. Proc. Am. odeficiency virus: relationship to menSoc. Clin. Oncol. 13th, Dallas, TX. p. 436 tal health and physical functioning. Med. (Abstr.) Washington, DC: Am. 
Soc. Clin. Care 34:44–57 Oncol. Tsevat J, Weeks JC, Guadagnoli E, Tosteson ANA, Mangione CM, et al. 1994. Us- 133a. Weinstein MC, Fineberg HV. 1980. Clinical Decision Analysis. Philadelphia, PA: ing health-related quality of life informaSaunders. 351 pp. tion: clinical encounters, clinical trials, and health policy. J. Gen. Intern. Med. 134. Weinstein MC, Siegel JE, Gold MR, Kamlet MS, Russell LB. 1996. Rec9:576–82 ommendations of the Panel on CostU.S. Preventive Services Task Force. Effectiveness in Health and Medicine. 1996. Guide to Clinical Preventive SerJAMA 276:1253–58 vices. Baltimore, MD: Williams & 135. Weinstein MC, Stason WB. 1977. FounWilkins. 2nd ed. 953 pp. dations of cost-effectiveness analysis for Van Wijck EEE, Bosch JL, Hunink health and medical practice. N. Engl. J. MGM. 1998. Time-tradeoff values and Med. 296:716–21 standard gamble utilities assessed during telephone interviews versus face-to-face 136. Weisbrod BA. 1961. The Economics of Public Health. Philadelphia, PA: Univ. interviews. Med. Decis. Mak. 18:400–5 Penn. Press. 127 pp. Viscusi WK. 1993. The value of risks to 137. Williams A. 1974. The cost-benefit aplife and health. J. Econ. Lit. 31:1912–46 proach. Br. Med. Bull. 30(3):252–56 Von-Neumann J, Morgenstern O. 1944. Theory of Games and Economic Behav- 138. Wu AW, Mathews WC, Brysk LT, Atkinson JH, Grant I, et al. 1990. Quality ior. Princeton, NJ: Princeton Univ. Press. of life in a placebo-controlled trial of 625 pp. Zidovudine in patients with AIDS and Wakker P. 1996. A criticism of healthy AIDS-related complex. J. Acquired Imyears equivalents. Med. Decis. Mak. mune Defic. Syndr. 3:683–90 16:207–14 ?