HEALTH ECONOMICS
Health Econ. 12: 281–294 (2003)
Published online 12 June 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/hec.729

Design techniques for stated preference methods in health economics

Fredrik Carlsson(a,*) and Peter Martinsson(a,b)
(a) Department of Economics, Göteborg University, Gothenburg, Sweden
(b) Department of Economics, Lund University, Lund, Sweden

Summary
This paper discusses different design techniques for stated preference surveys in health economic applications. In particular, we focus on different design techniques, i.e. how to combine the attribute levels into alternatives and choice sets, for choice experiments. Design is a vital issue in choice experiments, since the combination of alternatives in the choice sets determines the degree of precision obtainable from the estimates and welfare measures. In this paper we compare orthogonal, cyclical and D-optimal designs, where the latter allows expectations about the true parameters to be included when creating the design. Moreover, we discuss how to obtain prior information on the parameters and how to conduct a sequential design procedure during the actual experiment in order to improve the precision of the estimates. The designs are evaluated according to their ability to predict the true marginal willingness to pay under different specifications of the utility function in Monte Carlo simulations. Our results suggest that the designs produce unbiased estimates, but that orthogonal designs result in a larger mean square error than D-optimal designs. This result is expected when correct priors on the parameters are used in D-optimal designs. However, the simulations show that the welfare measures are not very sensitive to choice sets generated from a D-optimal design with biased priors. Copyright © 2002 John Wiley & Sons, Ltd.
Keywords: choice experiments; health economics; optimal design

Introduction

Many interesting research questions in health economics cannot be investigated using revealed preference data, and the development of stated preference methods during the last decade has increased the possibilities of exploring individuals' preferences, for example for non-existing healthcare programmes, the relationship between doctor and patient, and willingness-to-pay (WTP) for different health-care treatments. The key to a successful stated preference survey is to ask questions in a clever way, i.e. to generate one or several questions in such a way that the maximum amount of information is collected from each respondent, given other constraints such as complexity for the respondent, the length of the questionnaire and the cost of the survey. Recently, there has been an increased interest in choice experiments (conjoint analysis) in health economics for eliciting individuals' preferences for different treatment and prevention programmes (e.g. [1–7]). In a choice experiment, each individual is presented with a sequence of choice sets and is asked to choose their preferred alternative in each of the choice sets presented. Each choice set contains several alternatives defined by a set of attributes and attribute levels. Individuals' preferences are revealed by their choices. From their responses it is possible to estimate the marginal rate of substitution (MRS) between the attributes and, given that a cost attribute is included, the marginal WTP for the attributes.

*Correspondence to: Department of Economics, Gothenburg University, Box 640, 405 30 Gothenburg, Sweden. Tel.: +46 31 7734174; fax: +46 31 7731043; e-mail: [email protected]
Received 21 May 2001; Accepted 18 February 2002
The main difference from a contingent valuation survey, in which individuals are asked to value the realisation of a scenario, is that the set-up with repeated choices focuses on individuals' trade-offs between several different attributes. Thus, we can consider the contingent valuation method to be a special type of choice experiment. An intrinsic problem with all surveys is that we cannot ask individuals about everything and hence we have to limit the information collection process. A description of the development of a choice experiment, which is applicable to all types of stated preference surveys, is given by Ryan and Hughes [3], who identify five stages: (1) determination of the attributes, (2) assigning levels to the attributes, (3) construction of the choice sets by combining the attribute levels in each of the alternatives, (4) collection of responses and (5) econometric analysis of the data. In this paper, we focus on the third stage, which is called the statistical design. The design issue is less complex when collecting interval data such as open-ended contingent valuation questions or time-trade-off experiments (see e.g. [8]). For open-ended contingent valuation there is essentially no design problem, and for models with continuous data there has been extensive research on optimal designs [9]. However, open-ended contingent valuation surveys have been heavily criticised, e.g. on the grounds that people are more familiar with making discrete decisions and that it is easier to act strategically in an open-ended survey (see e.g. [10]). Recently, choice experiments have also been applied to time-trade-off experiments [11]. Thus, there seems to be a trend that discrete choice questions are becoming the dominant format for eliciting preferences in stated preference surveys.
The purpose of this paper is to describe and compare a number of different design techniques that can be used in choice experiments and to discuss how to implement these techniques in practice. In Monte Carlo simulations we compare traditional orthogonal designs with designs that can explicitly incorporate prior information on preferences, so-called D-optimal designs. In order to illustrate the problem of optimal design, consider the following example. We wish to conduct a choice experiment in order to estimate the relative importance of certain attributes of a certain treatment programme. The programme is described by three attributes in addition to the cost attribute: Chance of successful operation, Continuity of contact with the same staff, and Amount of information received about the operation. The design problem deals with how to combine the attribute levels into alternatives and choice sets. To explain why a certain choice set should not appear in a choice experiment, consider the following example. The attribute Chance of successful operation is most likely the most important one for the subjects. Thus, we can expect that if one alternative in a choice set has a 90% chance of successful operation and the other a 10% chance, and the cost of the operation does not differ much between the alternatives, most respondents will choose the first alternative irrespective of the levels of the other attributes. Asking respondents to choose between two such alternatives would not provide much information, since respondents will not make trade-offs between the attributes. Thus, it is important to present attribute levels in a choice set so that no attribute becomes dominant or inferior.
Traditional designs, such as orthogonal designs, disregard this aspect and only ensure that we can estimate the effects of the different attributes independently of each other. In practice, a researcher would not allow such extreme attribute levels in the same choice experiment. A D-optimal design explicitly considers the importance of the levels of the attributes, and ensures that the alternatives in the choice sets provide more information about the trade-offs between the different attributes. However, this requires explicit incorporation of prior information about the respondents' preferences into the design. Thus, a key issue when applying more advanced designs is the need for more prior information. One source of information is results from previous studies, but primarily the information is obtained from own focus groups and pilot studies. It should be noted that orthogonal designs also require prior information in order to choose the attribute levels in such a way that dominant and inferior attributes are avoided. In the section titled 'Implementing optimal designs' we discuss the practical issues regarding the collection of prior information. In addition, we also present the sequential design approach, in which information collected at an early stage of the choice experiment is used as prior information to update the design.

The objective of a choice experiment in health economics is often to calculate the MRS between different attributes. In particular, welfare measures such as the WTP for a whole programme or for specific attributes of a programme are of main interest. The precision of the estimated parameters is of vital importance and is, of course, linked to how efficiently individuals' preferences are revealed in the choice experiment.
As far as we are aware, little is known about the effect of using different statistical design techniques in choice experiments on the estimated parameters and the calculated MRS. We therefore use Monte Carlo simulations to compare orthogonal designs with other design techniques with respect to their ability to correctly estimate marginal WTP (or any other MRS) in pair-wise choice experiments. The rest of the paper is organised in the following way. In the section titled 'Statistical design of choice experiments' we discuss the theory of optimal design, beginning with the case of linear models before moving to non-linear ones, which characterise discrete choices. Furthermore, we discuss practical issues in implementing optimal designs in a choice experiment. The results of the Monte Carlo simulations on the efficiency of different design techniques in eliciting correct marginal WTP, in situations with both correct and incorrect prior information, are presented in the section titled 'Comparison of design approaches in pair-wise choice experiments'. Finally, the section titled 'Discussion' concludes the paper.

Statistical design of choice experiments

Theory of optimal design

The objective of an optimal statistical design is to extract the maximum amount of information from the respondents subject to the number of attributes, attribute levels and other characteristics of the survey, such as its cost and length. The central question is then how to select the attribute levels to be included in the stated preference experiment in order to extract maximum information from each individual. Let us assume a continuous response and two attributes, attribute 1 with two levels (a, b) and attribute 2 also with two levels (c, d). Given the number of attributes and attribute levels, we can create a full factorial design.
A full factorial design includes all alternatives that can be created from the combinations of attribute levels; in this case it contains four alternatives (ac, ad, bc and bd). Let us assume that we are interested in valuing each attribute separately. The value of attribute 1 can be calculated as the difference in WTP between alternative (bc) and alternative (ac) (or, alternatively, between (bd) and (ad)). The value of attribute 2 can be obtained in a similar fashion. If we instead use the alternatives (ac) and (bd), subtracting the WTP of alternative (bd) from the WTP of alternative (ac) results in a WTP in which the values of the changes in attributes 1 and 2 are confounded, i.e. it does not provide any information on the valuation of the individual attributes. A design that avoids this redundant combination is the orthogonal fractional design, which only contains the uncorrelated alternatives (ac, ad, bc). In order to analyse our responses, we can use the following standard linear model:

y = β'x + ε  (1)

where y is the continuous response, β is a vector of parameters, x is a matrix of attribute levels (the design matrix) and ε is a vector of error terms. The parameter vector can be obtained from

β̂ = (x'x)^{-1} x'y  (2)

and the covariance matrix of β̂, Ω, is

Ω = σ²(x'x)^{-1}  (3)

In our case the main objective is to estimate all coefficients with high precision in order to calculate an accurate value of each attribute. Thus, the problem of optimal design can be seen as the problem of defining the design matrix, x, in such a way that the 'size' of the covariance matrix of the estimator is minimised, implying precise estimates of the parameters. One common measure of efficiency, which relates to the covariance matrix, is D-efficiency:

D-efficiency = [|Ω|^{1/K}]^{-1}  (4)

where K is the number of parameters to estimate.
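To make Equations (3) and (4) concrete, the following sketch computes the D-efficiency of a linear design matrix. The -1/+1 effects coding of the two-level attributes and the comparison between the fractional and full factorial designs are illustrative choices, not taken from the paper.

```python
import numpy as np

def d_efficiency(x, sigma2=1.0):
    """D-efficiency of a linear design, Eq. (4): [|Omega|^(1/K)]^(-1),
    where Omega = sigma^2 (x'x)^(-1) is the OLS covariance of Eq. (3)."""
    k = x.shape[1]
    omega = sigma2 * np.linalg.inv(x.T @ x)
    return 1.0 / np.linalg.det(omega) ** (1.0 / k)

# Two two-level attributes, effects-coded as -1/+1, plus an intercept.
# Rows: the orthogonal fractional design (ac, ad, bc) from the text...
frac = np.array([[1, -1, -1],
                 [1, -1,  1],
                 [1,  1, -1]])
# ...and the full factorial (ac, ad, bc, bd).
full = np.array([[1, -1, -1],
                 [1, -1,  1],
                 [1,  1, -1],
                 [1,  1,  1]])

# The full factorial is more efficient, but the fraction still
# identifies both attribute effects separately.
eff_frac = d_efficiency(frac)
eff_full = d_efficiency(full)
```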
There are also several other efficiency criteria, such as A- and G-efficiency; they are all highly correlated, and the main reason for choosing D-efficiency is that it is less computationally burdensome (see e.g. Kuhfeld et al. [12]). In the case of a linear model, an optimal design must fulfil two criteria in order to be D-efficient: orthogonality and level balance [12]. The level balance criterion requires that the levels of each attribute occur with equal frequency in the design. Kuhfeld et al. [12] use a computerised search algorithm, based on a modified Fedorov algorithm, to maximise D-efficiency in order to construct an efficient, but not necessarily orthogonal, linear design. A modified Fedorov algorithm works in the following way: an initial design is randomly drawn from the full factorial design. Starting from this initial design, the algorithm iteratively exchanges alternatives in the design with ones from a list of candidate alternatives until D-efficiency cannot be improved any further. In a choice experiment, individuals choose the alternative they prefer from a choice set. Thus, we are not analysing a linear model but a non-linear one, due to the discrete nature of the dependent variable. From the point of view of maximising the amount of information, it would be desirable if all individuals could rank all possible attribute level combinations. This would, however, be too cognitively demanding as well as time consuming, and hence the complexity of the choice experiment needs to be reduced. One way is to let the individuals compare a few alternatives in a choice set. The amount of information collected from each individual can then be increased by repeated choices.
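Before turning to choice sets, the exchange idea behind the modified Fedorov algorithm for the linear case can be sketched as follows. This is an illustrative toy version, not Kuhfeld et al.'s implementation; the candidate coding, design size and seed are all assumptions.

```python
import itertools
import numpy as np

def d_criterion(x):
    """|x'x|^(1/K): maximising this is equivalent to maximising
    D-efficiency, since it is a monotone transform of Eq. (4)."""
    det = np.linalg.det(x.T @ x)
    return det ** (1.0 / x.shape[1]) if det > 0 else 0.0

def modified_fedorov(candidates, n_rows, seed=0):
    """Start from a random subset of the candidate alternatives and
    exchange rows against candidates until no single swap improves
    the criterion any further."""
    rng = np.random.default_rng(seed)
    design = candidates[rng.choice(len(candidates), n_rows, replace=False)]
    improved = True
    while improved:
        improved = False
        for r in range(n_rows):
            for cand in candidates:
                trial = design.copy()
                trial[r] = cand
                if d_criterion(trial) > d_criterion(design) + 1e-9:
                    design, improved = trial, True
    return design

# Candidate list: full factorial of two three-level attributes
# (coded -1/0/+1), plus an intercept column.
cands = np.array([[1, a, b] for a, b in itertools.product([-1, 0, 1], repeat=2)])
best = modified_fedorov(cands, n_rows=4)
```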
The central question is then how to combine the alternatives from the full factorial design into choice sets so that the maximum amount of information is extracted, given other constraints such as the number of choice sets in the experiment. Huber and Zwerina [13] identify four principles for an efficient design of a choice experiment based on a non-linear model: (i) level balance, (ii) orthogonality, (iii) minimal overlap and (iv) utility balance. A design that satisfies these principles has maximum D-efficiency. A design has minimal overlap when an attribute level does not repeat itself within a choice set. Utility balance requires that the utilities of the alternatives in a choice set are equal. The utility that individuals derive from an alternative is considered to be associated with the levels of the attributes of the alternative. Following the standard random utility framework [14], the utility function, U_i, is composed of a deterministic part, V_i, and an unobservable part, ε_i, such that

U_i = V_i + ε_i  (5)

The attributes are arguments in the deterministic component, and thus Equation (5) can also be expressed as

U_i = β'x_i + ε_i  (6)

where β is a vector of parameters and x_i is a vector of attributes for alternative i. It is important to distinguish between generic models and alternative-specific models. In a generic model, the attributes have the same impact on utility independent of the alternative, while in an alternative-specific model some or all parameters of a specific attribute differ between alternatives. To illustrate the simplest alternative-specific case, let us assume that a certain type of illness can be cured either by taking a pill or by a syringe, and that the effects on the probability of health improvement, the number of future visits, etc. are the same.
This illustrates the case in which the medical technique differs between the alternatives, and individuals may have inherent preferences for being treated with either a pill or a syringe. The alternative-specific term can then be included as an attribute in Equation (6). A generic model would, for example, be applied when the overall treatment programme does not differ between the alternatives, i.e. in our case either pills or syringes are assumed to be the treatment method for all alternatives. There are several types of statistical designs that consider some or all of Huber and Zwerina's [13] requirements for an efficient design of a choice experiment. Traditionally, orthogonal designs, in which the levels of each attribute vary independently, have been preferred. The main reason for this is probably that the parameter estimates of a linear model are uncorrelated when an orthogonal design is used, and that canned routines for this design strategy exist in statistical packages. In general, the choice sets are then created either simultaneously or by comparing the orthogonal array with a constant base alternative (see e.g. Louviere [15]). Creating the choice sets simultaneously means that the design is selected from the collective factorial. The collective factorial is an L^(A·C) factorial, where C is the number of alternatives and each alternative has A attributes with L levels. The second case, comparison with a base alternative, would, for example, be motivated by an existing treatment programme with which the orthogonal array is compared.

A cyclical design is an easy extension of the orthogonal approach, but it can only be applied in the case of a generic model. First, each of the alternatives in the orthogonal design is allocated to a different choice set.
The additional alternatives are then constructed by cyclically adding alternatives to each choice set based on the attribute levels: each attribute level in the new alternative is the next higher level to the one in the previous alternative, and if the highest level has been attained, the attribute level is set to its lowest level. For example, suppose we have two attributes with three levels, 1–3, and an alternative with attribute levels 1 and 3. The additional alternative will then have attribute levels 2 and 1. By construction, this design has level balance, orthogonality and minimal overlap; hence it satisfies all the principles of an optimal design for a choice experiment except utility balance (Bunch et al. [16]). Let us now illustrate the optimal design for a discrete choice model when the principle of utility balance is also considered. Suppose that we wish to estimate the parameters of the utility function in Equation (6) and that the choice experiment consists of N choice sets, where each choice set, S_n, consists of J_n alternatives, such that S_n = {x_1n, ..., x_Jn,n}, where x_in is a vector of attribute levels. The standard model is the conditional logit, where the error terms are assumed to be independently and identically distributed type I extreme value. The choice probability for alternative i from a choice set S_n is then

P_in(S_n, β) = e^{β'x_in} / Σ_{j=1}^{J_n} e^{β'x_jn}  (7)

McFadden [14] shows that the maximum likelihood estimator is consistent and asymptotically normally distributed with mean equal to β and the following covariance matrix:

Ω = (Z'PZ)^{-1} = [Σ_{n=1}^{N} Σ_{j=1}^{J_n} z_jn' P_jn z_jn]^{-1}  (8)

where

z_jn = x_jn − Σ_{i=1}^{J_n} x_in P_in

The covariance matrix in (8) depends on the true parameters of the utility function, since the choice probabilities depend on them. In the case of a linear model, the covariance matrix is instead proportional to the inverse cross-product of the design matrix, i.e. Ω = (X'X)^{-1}σ².
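Equations (7) and (8) can be computed directly. The following sketch also defines the D-error, the design criterion minimised later in the text; the two example choice sets and the prior vector are purely illustrative.

```python
import numpy as np

def choice_probs(x_set, beta):
    """Conditional logit probabilities for one choice set, Eq. (7).
    x_set is a (J, K) array with one row per alternative."""
    v = x_set @ beta
    e = np.exp(v - v.max())            # stabilised softmax
    return e / e.sum()

def mnl_covariance(choice_sets, beta):
    """Asymptotic covariance of the conditional-logit MLE, Eq. (8)."""
    k = len(beta)
    info = np.zeros((k, k))
    for x_set in choice_sets:
        p = choice_probs(x_set, beta)
        z = x_set - p @ x_set          # z_jn = x_jn - sum_i P_in x_in
        info += z.T @ (p[:, None] * z) # sum_j P_jn z_jn' z_jn
    return np.linalg.inv(info)

def d_error(choice_sets, beta):
    """|Omega|^(1/K); minimising the D-error maximises D-efficiency."""
    return np.linalg.det(mnl_covariance(choice_sets, beta)) ** (1.0 / len(beta))

# Two illustrative pairwise choice sets and an assumed prior vector.
sets = [np.array([[1.0, 0.0], [0.0, 1.0]]),
        np.array([[2.0, 1.0], [0.0, 0.0]])]
beta = np.array([1.0, -0.5])
probs = choice_probs(sets[0], beta)
omega = mnl_covariance(sets, beta)
```

Calling `mnl_covariance` with a zero vector for `beta` reproduces the zero-prior special case of Equation (9), since every choice probability then collapses to 1/J_n.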
In the discrete choice model, the covariance matrix depends not only on the combinations of attribute levels in the experiment, but also on how the alternatives are combined into choice sets, because the combinations of alternatives in the choice sets affect the covariance and hence the efficiency of the design. Thus, there is a gain in efficiency if prior information about the parameters can be used when creating the design. Huber and Zwerina [13] show that it is possible to substantially improve efficiency by using expected parameter values when constructing the choice sets. More specifically, they extend the cyclical design of Bunch et al. [16] by incorporating priors on the distribution of the parameters. Zwerina et al. [17] explicitly formalise the ideas of Huber and Zwerina by adapting the search algorithm of Kuhfeld et al. [12] to the four principles for efficient choice designs described in Huber and Zwerina [13]. This means that they add the principle of utility balance when creating the design. The search algorithm now minimises the D-error for a given set of parameter values, and the need for pre-specified parameter values is easily seen in (7) and (8). Consequently, this approach requires prior information on the parameters. Based on Zwerina et al. [17], a SAS programme using this search algorithm was created. The SAS programme allows for a large variety of designs; with this approach, only the number of attributes, the number of alternatives and the number of choice sets, in addition to the assumed utility function, have to be specified. In Appendix C we show an example of the information that has to be specified in the SAS application in order to generate a design. Closed-ended contingent valuation surveys can be seen as a special case of a choice experiment.
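The D-error-minimising exchange over choice sets can be sketched as follows. This is a toy pairwise generic version with illustrative priors, candidates and starting design; it is not the SAS implementation used in the paper.

```python
import itertools
import numpy as np

def d_error(choice_sets, beta):
    """D-error |Omega|^(1/K) for a conditional logit design (Eqs 7-8)."""
    k = len(beta)
    info = np.zeros((k, k))
    for x_set in choice_sets:
        v = x_set @ beta
        p = np.exp(v - v.max())
        p /= p.sum()
        z = x_set - p @ x_set
        info += z.T @ (p[:, None] * z)
    sign, logdet = np.linalg.slogdet(info)
    return np.inf if sign <= 0 else np.exp(-logdet / k)

def improve_design(choice_sets, candidates, beta, sweeps=3):
    """One-alternative-at-a-time exchange: replace an alternative with a
    candidate whenever that lowers the D-error of the whole design."""
    sets = [s.copy() for s in choice_sets]
    for _ in range(sweeps):
        for n in range(len(sets)):
            for j in range(sets[n].shape[0]):
                for cand in candidates:
                    trial = sets[n].copy()
                    trial[j] = cand
                    new = sets[:n] + [trial] + sets[n + 1:]
                    if d_error(new, beta) < d_error(sets, beta) - 1e-12:
                        sets = new
    return sets

# Illustrative priors and candidate alternatives (two attributes, -1/0/+1).
beta = np.array([1.0, 0.5])
cands = np.array(list(itertools.product([-1.0, 0.0, 1.0], repeat=2)))
start = [cands[[0, 1]], cands[[2, 3]], cands[[4, 5]]]
better = improve_design(start, cands, beta)
```

Because the acceptance test is a strict improvement of the overall D-error, the criterion decreases monotonically over the sweeps; with zero priors the same loop reduces to a search under utility-neutral probabilities.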
The requirement of a priori information in a choice experiment is similar to the implementation of optimal designs for closed-ended contingent valuation surveys, where a priori information on the distribution of WTP is needed in order to determine the bids (see e.g. [18, 19]). The main difference is that our design problem involves several attributes, and not only the bid, which complicates the design issue. It is important to understand that, irrespective of the choice of statistical design, information about the respondents' preferences for the attributes is needed, even in the case of an orthogonal design. For example, if the attribute levels are assigned in such a way that the entire choice in a choice set depends on the levels of a single attribute, not much information is extracted. In order to avoid this situation, some prior information is needed. Moreover, the amount of information extracted is then determined by how appropriately the attribute levels are set, which relates to knowledge of the shape of the utility function. Consequently, the difference between the design strategies is not the prior information that is needed, but the way in which that information enters the creation of the design. In some situations we may not have any prior expectations and may therefore assume that all parameters are equal to zero. This may be the case at an early stage in the development of a choice experiment. If we set the parameters in the utility function to zero, the covariance matrix in Equation (8) reduces to

Ω = (Z'PZ)^{-1} = [Σ_{n=1}^{N} (1/J_n) Σ_{j=1}^{J_n} z_jn' z_jn]^{-1}  (9)

where

z_jn = x_jn − (1/J_n) Σ_{i=1}^{J_n} x_in

In this case the design problem of a choice experiment is simplified. In this paper we distinguish between a D-optimal design based on prior expectations of the parameter values, called a DP-optimal design, and a design based on no prior expectations, i.e.
zero parameter values, a DZ-optimal design. Kanninen [20] presents a more general approach to optimal design, in which the number of attribute levels is also part of the design problem. Kanninen shows that in a D-optimal design each attribute should have only two levels, even in the case of a multinomial choice experiment, and that the levels should be set at two extreme points of the distribution of the parameters. This means that level balance will not be satisfied in a multinomial choice experiment.

Implementing optimal designs

As seen in the section titled 'Theory of optimal design', the implementation of statistical designs in choice experiments in some cases requires a great amount of information on respondents' preferences. A natural and important question is how good priors on the parameters of the utility function are to be obtained. Some indicative information may be obtained from previous studies, but running focus groups and pilot studies is of vital importance. In general, pilot studies and focus groups are very important for a successful choice experiment. As a preliminary step, they are used to collect information about suitable attributes and attribute levels to include in the experiment. Furthermore, they are often used to test the questionnaire and to provide information about how respondents receive and interpret the information presented. We argue that pilot studies should be used more explicitly for design purposes, by estimating the parameters and then using these estimates, together with other prior information on the parameters, in the design of the final experiment. The results from these pre-tests may change not only the design, but also the attribute levels. For example, if the levels of an attribute are far apart, that attribute may completely dominate the choices made by the individuals.
Thus, the development of a choice experiment may involve changing both the attribute levels and the design. A problem in all designs is that biased priors on the parameters influence the design and thus limit the amount of information extracted, since the design applied is then inefficient. An interesting design approach is presented in Kanninen [21], who shows that a sequential design approach improves the efficiency of the design for closed-ended contingent valuation surveys. In a sequential design approach, the researcher uses data collected in the real survey to update the priors on the distribution of the parameters, which are then applied to update the design of the survey. We suggest that the same strategy be used for the design of choice experiments. With the design approach presented in Zwerina et al. [17], this would mean that the responses collected during the actual choice experiment are used to estimate a model, and a new design is then created from the new parameter estimates. With an orthogonal design, the approach would be the same, as the parameters need to be estimated in order to evaluate whether the attribute levels are set to optimal levels. Kanninen [20] presents a simpler approach to updating the design, given a number of attributes and alternatives. The D-optimal design results in certain response probabilities, as is easily seen in Equation (7). By calculating the observed response probabilities, a balancing attribute can be included in order to achieve the desired response probabilities; the balancing attribute would most naturally be the cost attribute. However, this design approach still relies on prior information on the distribution of the parameters: the difference is that the updating is easier to conduct, as a completely new design does not have to be created.
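A minimal sketch of the estimate-then-update step in the sequential procedure is given below, with simulated respondents and a binary logit on attribute differences fitted by Newton's method. The true parameters, wave design and sample size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
beta_true = np.array([1.0, -0.5])   # illustrative "true" preferences

def simulate_wave(diffs, beta, n_resp):
    """Simulated pairwise choices: diffs holds x_1n - x_2n per choice set;
    each respondent answers every set (True = first alternative chosen)."""
    p = 1.0 / (1.0 + np.exp(-diffs @ beta))
    return rng.random((n_resp, len(diffs))) < p

def fit_logit(diffs, choices, iters=25):
    """Newton-Raphson MLE for the binary logit on difference data."""
    n_resp = choices.shape[0]
    x = np.repeat(diffs, n_resp, axis=0)            # one row per answer
    y = choices.T.reshape(-1).astype(float)         # aligned responses
    beta = np.zeros(diffs.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-x @ beta))
        grad = x.T @ (y - p)
        hess = x.T @ (x * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

# Wave 1: a fixed starting design (differences between the two alternatives).
wave1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
beta_hat = fit_logit(wave1, simulate_wave(wave1, beta_true, 200))
```

In a real application, `beta_hat` would then replace the pilot priors when the next wave's choice sets are created, for example by feeding it into a D-error-minimising search.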
A few choice experiments have used the D-optimal design of Zwerina et al. [17]. Johnson et al. [22] use this algorithm without including any priors on the beta parameters in a health choice experiment, while Carlsson and Martinsson [23], in a choice experiment on donations to environmental projects, use the results of a pilot study as prior information on the parameter distribution. Steffens et al. [24] adapt the updating approach suggested by Kanninen [20] in a choice experiment on bird watching and find that updating improved the efficiency of the estimates. All information obtained from pilot studies and focus groups has to be analysed carefully; its credibility and relevance always have to be investigated. However, it is difficult to obtain unbiased information from pilot studies, and the D-optimal approach is likely to be more sensitive to this information than an orthogonal design approach. In the next section, in which we compare different design approaches, we therefore also investigate the impact of misspecified priors on the parameter distribution on the results of a design based on these priors.

Comparison of design approaches in pair-wise choice experiments

Description of designs and experiments

In this section we use Monte Carlo simulations to compare different design techniques in a pair-wise choice experiment with regard to their effect on the estimated coefficients and the marginal WTP. We evaluate the orthogonal design, the cyclical design and the D-optimal design with and without priors. Let us assume that a scenario is created that describes two different alternatives. Each alternative is described by four variables: three continuous variables, x1, x2 and x3, which describe three different attributes of the alternative, and a continuous variable, Cost, which indicates the cost of realising the alternative.
The variables x1, x2 and Cost are three-level attributes, while x3 is a two-level attribute. The attributes could, for example, be those mentioned in the introduction: likelihood of continued contact with the same staff, chance of successful operation and whether information is received about the operation. In the generic case, the parameters of the attributes are the same across the alternatives, whereas in the alternative-specific model some or all parameters of at least one attribute differ between alternatives. Furthermore, we also study the cases with and without a base case. A base case is included when, for example, we want to include a fixed treatment, usually the current one, as an alternative in the choice sets. We conduct three different types of experiments, which represent the three most common applications in health economics: (i) Experiment 1, alternative specific without a base case; (ii) Experiment 2, alternative specific with a constant base case; and (iii) Experiment 3, generic without a constant base case. We assume a linear utility function; the true difference in utility between the two alternatives is represented by Equation (10) for the generic case and by Equation (11) for the alternative-specific case:

Δu = x1 + 0.5x2 + 0.2x3 − 0.003Cost  (10)

Δu = 0.1 + x1 + 0.5x2 + 0.2x3 − 0.003Cost  (11)

The alternative-specific model is the simplest possible, where the alternatives differ only by an intrinsic value represented by a non-zero constant, while the parameters of the attributes are the same. The three experiments and the designs we test in each experiment are summarised in Table 1. The D-optimal designs are created in SAS using the search algorithm presented in Zwerina et al. [17] (see Appendix C), and the orthogonal designs by using the ADX menu system in SAS. In Experiment 1, an orthogonal design is compared to a D-optimal design approach.
We create the D-optimal designs in two different ways: with correct prior information on the parameters (DP-optimal) and with no prior information on the true parameters, which amounts to assuming the parameters are equal to zero (DZ-optimal). Furthermore, we create two D-optimal designs based on biased expectations of the true parameter values. Hence, we also indirectly demonstrate how a sensitivity analysis of a D-optimal design can be conducted. The attribute levels are x1 = 3, 4, 5; x2 = 15, 20, 25; x3 = 2, 4; and Cost = 500, 700, 900. The created choice sets are presented in Table A1 in Appendix A.

Table 1. Summary description of experiments

Experiment 1: Alternative-specific design without a constant base case
- Orthogonal: choice sets created simultaneously, i.e. the design is created from the full factorial design.
- DZ-optimal: D-optimal design with no prior information on the parameters.
- DP-optimal: D-optimal design with correct prior information on the parameters.
- DP-optimal false prior 1: D-optimal design based on biased expectations; only the cost variable is biased.
- DP-optimal false prior 2: D-optimal design based on biased expectations; all but the cost variable are biased.

Experiment 2: Alternative-specific design with a base case
- Orthogonal: an orthogonal design with a fixed base case.
- DZ-optimal: D-optimal design with no prior information on the parameters.
- DP-optimal: D-optimal design with correct prior information on the parameters.

Experiment 3: Generic design without a constant base case
- Orthogonal: choice sets created simultaneously, i.e. the design is created from the full factorial design; same design as in Experiment 1.
- Cyclical: cyclical design created from an orthogonal main-effects array with nine alternatives.
- DZ-optimal: D-optimal design with no prior information on the parameters.
- DP-optimal: D-optimal design with correct prior information on the parameters.
The 'Utility difference' expressions in Table 1 are, in the order they appear: Δu = 0.1 + x1 + 0.5x2 + 0.2x3 − 0.003Cost; Δu = 0.1 + x1 + 0.5x2 + 0.2x3 − 0.001Cost; Δu = 0.2 + 2x1 + x2 + 0.4x3 − 0.003Cost; Δu = 0.1 + x1 + 0.5x2 + 0.2x3 − 0.003Cost; and Δu = x1 + 0.5x2 + 0.2x3 − 0.003Cost.

Each row in Table A1 represents one alternative in a choice set, and consequently each choice set (Set) is presented on two subsequent rows. The true choice probability (Prob) in each choice set is calculated using the true utility-difference function in Equation (10), which relates to the deterministic part of the utility function, and the logit choice probability in Equation (7). From Table A1 it is clear that there are differences in the choice probabilities between the different design approaches. In the orthogonal design there are several choice sets where one of the alternatives has a choice probability almost equal to one. Alternatives with a choice probability close to one are also found in the case of the DZ-optimal design, although less frequently. For the DP-optimal design the choice probabilities are much more balanced in terms of utility, even in the cases where the parameters are biased. When the choice probability is close to one, very little information is obtained. In Experiment 2, an orthogonal design with a constant base case is compared to a D-optimal design approach. One explanation for the frequent use of this kind of design approach, apart from when we want to compare an existing programme with a new programme, is probably that it is easier to create an orthogonal design when one alternative is fixed, since the choice sets are then defined by construction. The levels of the base case are x1 = 2, x2 = 22.5, x3 = 3 and Cost = 600. In Experiment 3, an orthogonal design is compared to a cyclical design and a D-optimal design approach.
In this experiment we use the same orthogonal design as in Experiment 1, since the orthogonal design is, by definition, unaffected by whether or not a constant is included. The designs for Experiments 2 and 3 are not presented in the paper, but are available upon request. The designs in the different experiments are evaluated in Monte Carlo simulations. In the simulations each individual makes 18 pair-wise choices, the reason being that an orthogonal main-effects design, i.e. a design that assumes no interaction effects between the attributes, consists of 18 choice sets. We use two sample sizes in the simulations, 75 and 150 respondents, resulting in 1350 and 2700 observations, respectively, and the simulations are repeated 2000 times for each sample size. We have deliberately chosen rather small sample sizes, since samples are often small in applied research in health economics. Furthermore, since we have chosen two sample sizes we can also illustrate the effect on efficiency when the sample size increases. A more detailed description of the data-generating process is relegated to Appendix B.

Results

We evaluate the different design techniques by comparing the estimated marginal WTP between two alternatives to the 'true' marginal WTP. Since we specify the utility function and the parameters, we can calculate a 'true' marginal WTP. In Table 2 we present the two alternatives for which the marginal WTP is estimated.

Table 2. Attribute levels for the two treatment programmes

Attribute   Treatment programme 1   Treatment programme 2   Difference in levels
x1          3                       5                       −2
x2          20                      15                      5
x3          4                       2                       2

The marginal WTP between the two alternatives described in Table 2 is calculated from the parameters in a standard fashion as (see e.g. [25]
for details)

marginal WTP = −(β0/βCost) − Σ_{i=1}^{3} (βi/βCost) Δlevel_i   (12)

The true marginal WTP is 300 for the generic design and 333.33 for the alternative-specific design, based on Equations (10) and (11), respectively. We apply two different measures to evaluate the designs: (i) the bias in the estimated marginal WTP and (ii) the mean square error (MSE) of the estimated marginal WTP, i.e. the average of the squared difference between the marginal WTP obtained in the simulation and the true marginal WTP. The results of the simulations are presented in Table 3. In most simulations the designs are approximately unbiased. Using a D-optimal design with priors on the parameters results, as expected, in smaller MSE. In the first simulation experiment all designs are close to unbiased except for the orthogonal design, which on average overestimates marginal WTP. The orthogonal design also has the largest MSE, between 3.7 and 4.4 times the MSE of the DP-optimal design. The DZ-optimal design performs better than the orthogonal design; the MSE of the orthogonal design is 1.3–1.5 times the MSE of the DZ-optimal design, and in addition the DZ-optimal design is less biased. Using biased priors results in a larger marginal WTP, but the MSE increases only slightly. This result should be interpreted with some care, since in practice false priors may also affect the choice of attribute levels. In our case it is difficult to obtain large differences between D-optimal designs created on biased priors, at least where the differences between the biased priors are moderate. Moreover, the simulations indicate that the MSE decreases as the sample size increases, as expected. In the second simulation experiment, where the orthogonal design contains a base case, all designs perform relatively well in terms of being unbiased, except that the DZ-optimal design overestimates the true marginal WTP when the sample size is 75.
In comparison to Experiment 1, the MSE is larger for all designs when the sample size is 75. The orthogonal design has a very large MSE, 8.3–8.6 times the MSE of the DP-optimal design. The DZ-optimal design also has a large MSE, but the MSE of the orthogonal design is still 1.9–2.3 times the MSE of the DZ-optimal design.

Table 3. Bias and MSE in estimated marginal WTP for different sample sizes and design techniques

                              Sample size 75          Sample size 150
                              Bias      MSE           Bias      MSE
Experiment 1
  Orthogonal                  13.10     8836.62       6.49      3836.65
  DZ-optimal                  2.53      5724.42       0.01      2857.66
  DP-optimal                  0.03      2007.69       0.09      1045.30
  DP-optimal false prior 1    0.02      2312.04       0.07      1115.42
  DP-optimal false prior 2    2.53      2346.63       0.69      1273.80
Experiment 2
  Orthogonal                  3.23      21440.65      1.62      11299.06
  DZ-optimal                  8.87      11560.86      1.46      4813.53
  DP-optimal                  0.93      2590.67       0.38      1305.47
Experiment 3
  Orthogonal                  9.67      7607.44       6.56      3943.58
  Cyclical                    7.29      4938.31       1.47      2470.40
  DZ-optimal                  7.12      11270.29      6.37      5394.62
  DP-optimal                  0.37      1414.62       0.75      706.17

In the third simulation experiment all designs, except for the DP-optimal design, overestimate the true marginal WTP. This experiment indicates that the cyclical design performs better than both the orthogonal design and the DZ-optimal design; the orthogonal design has an MSE 1.6 times that of the cyclical design, and the DZ-optimal design has an MSE 2.2–2.3 times that of the cyclical design. Consequently, in our case the DZ-optimal design performs worse than the orthogonal design. Again, the DP-optimal design has the lowest MSE; the cyclical design has an MSE 3.5 times that of the DP-optimal design. These simulations indicate that there are large differences between the design approaches with respect to MSE. A DP-optimal design, i.e. one using the true parameters, results in a much lower MSE compared to the other approaches.
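The 'true' values that these estimates are judged against follow directly from Equation (12) and the differences in Table 2. A minimal sketch, in Python rather than the paper's SAS (the names are ours):

```python
# 'True' marginal WTP between the two programmes of Table 2, via Equation (12).
# Attribute-level differences are programme 1 minus programme 2.
DLEVEL = {"x1": -2.0, "x2": 5.0, "x3": 2.0}
BETA = {"x1": 1.0, "x2": 0.5, "x3": 0.2}   # attribute parameters, Eqs (10)/(11)
B_COST = -0.003                            # cost parameter

def marginal_wtp(constant=0.0):
    """Equation (12): -(b0/bCost) - sum_i (bi/bCost) * dlevel_i."""
    trade = sum(BETA[a] * DLEVEL[a] for a in BETA)
    return -(constant + trade) / B_COST

print(marginal_wtp())     # generic design (Eq. 10): approximately 300
print(marginal_wtp(0.1))  # alternative-specific design (Eq. 11): approximately 333.33
```

Note that the x1 difference enters negatively, since programme 1 has the lower x1 level; without that sign the 'true' values of 300 and 333.33 cannot be reproduced.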
Furthermore, in two of the experiments, a DZ-optimal approach performs better than an orthogonal design, while in the third the orthogonal design has a lower MSE than the DZ-optimal design. The DP-optimal design is also less sensitive to the sample size, and provides unbiased estimates even at a low sample size.

Discussion

The objectives of this paper were to describe and compare different design approaches for stated preference surveys in health economics, with special emphasis on choice experiments, and to discuss how to use design techniques in empirical applications. In particular we concentrate on two types of designs: the orthogonal design, which has been the most frequently used design in health economics, and D-optimal designs. The purpose of a choice experiment is to measure the rates at which individuals trade off the attributes; hence it is important to assign attribute levels so that individuals actually make trade-offs. Furthermore, it is important to combine attribute levels into alternatives and choice sets in such a way that the maximum amount of information is extracted given, for example, the number of choice sets to be included. Higher precision in the estimates can potentially be obtained if we have some prior knowledge about individuals' preferences when creating the design, and this type of information should be obtained during the development of the choice experiment through literature reviews, focus groups and pilot studies. However, we also suggest a sequential design approach in which the collected data are analysed during the real experiment, so that the design is improved as the experiment proceeds. It is important to be aware that the need for prior information is not unique to D-optimal designs: in an orthogonal design, for example, prior information is needed in order to select attribute levels in such a way that no attribute becomes either superior or inferior to the others.
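The sequential idea can be sketched end-to-end. The listing below is an illustration in Python rather than the paper's SAS, under stated assumptions: five hypothetical pair-wise choice sets (not those of Table A1), pilot choices simulated from Equation (11) with a logistic error, and a hand-rolled Newton-Raphson binary logit whose interim estimates would then replace the initial priors when generating choice sets for the remaining respondents. All function names are ours.

```python
import math
import random

TRUE_BETA = (0.1, 1.0, 0.5, 0.2, -0.003)   # constant + attribute betas, Eq. (11)

# Five illustrative pair-wise choice sets (x1, x2, x3, Cost); hypothetical,
# not the designs of Table A1.
PAIRS = [
    ((3, 25, 2, 900), (5, 15, 4, 500)),
    ((4, 20, 4, 700), (5, 20, 2, 900)),
    ((3, 15, 4, 500), (4, 20, 2, 700)),
    ((5, 15, 4, 900), (4, 20, 2, 500)),
    ((4, 15, 2, 500), (3, 20, 4, 700)),
]

def z_vec(a1, a2):
    """Constant plus attribute differences for a pair of alternatives."""
    return [1.0] + [u - v for u, v in zip(a1, a2)]

def simulate_respondent(rng):
    """Choices of one pilot respondent: true utility difference plus a
    logistic error drawn by inverse CDF."""
    obs = []
    for a1, a2 in PAIRS:
        z = z_vec(a1, a2)
        dv = sum(b * x for b, x in zip(TRUE_BETA, z))
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # guard against log(0)
        obs.append((z, 1 if dv + math.log(u / (1.0 - u)) > 0 else 0))
    return obs

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def logit_mle(obs, k=5, iters=20):
    """Hand-rolled Newton-Raphson for the binary logit on utility differences."""
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[1e-9 * (i == j) for j in range(k)] for i in range(k)]  # tiny ridge
        for z, y in obs:
            lin = max(-30.0, min(30.0, sum(b * x for b, x in zip(beta, z))))
            p = 1.0 / (1.0 + math.exp(-lin))
            for i in range(k):
                grad[i] += (y - p) * z[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * z[i] * z[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Interim analysis after a 150-respondent pilot wave: these estimates would
# replace the initial priors when designing the remaining choice sets.
rng = random.Random(1)
pilot = [ob for _ in range(150) for ob in simulate_respondent(rng)]
priors = logit_mle(pilot)
print(priors)   # interim parameter estimates, to be used as updated priors
```

In a real application the interim estimation would of course use the respondents' actual choices, and the updated priors would be fed back into the design search rather than merely printed.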
The Monte Carlo simulations indicate that the proposed DP-optimal approach performs much better than any of the other approaches we analyse. It is not surprising that knowledge about the true parameters results in an efficient design yielding better estimates of marginal WTP. More important are the relative performance of the design approaches and the sensitivity of the estimated marginal WTP to biased priors on the parameters, which is the likely case in real-world applications with limited information. The simulations indicate that there are large differences between the design approaches with respect to the MSE. Furthermore, a small bias in the priors does not seem particularly serious for a D-optimal design, and it still results in better estimates than a traditional orthogonal design. In conclusion, we have shown in simulations that the choice of design technique when creating a choice experiment will affect the precision of the estimates, in our case measured by marginal WTP. Generally, we strongly recommend that researchers use a D-optimal design, and as shown in Appendix C, it is easy to apply. The gain from applying a D-optimal design is hard to generalise, as several factors are involved, such as the functional form of the underlying utility function, the number of attributes, the number of attribute levels, etc. If time and budget permit, it is preferable to at least allow for the possibility of using a sequential design approach. More complicated designs, such as those including interaction effects or restrictions on the number of choice sets, can also be handled by the design approach presented in Zwerina et al. [17].
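The criterion that such a design search minimises can be illustrated compactly. The sketch below is in Python rather than the paper's SAS, with hypothetical five-set designs (not those of Table A1): for pair-wise binary-logit choice sets, the information matrix under prior β is I = Σ_s p_s(1−p_s) z_s z_s′, where z_s stacks the constant and the attribute differences, and the D-error is det(I)^(−1/K) for K parameters, lower being better.

```python
import math

PRIOR = (0.1, 1.0, 0.5, 0.2, -0.003)   # prior betas: constant, x1, x2, x3, Cost

def d_error(choice_sets, beta=PRIOR):
    """D-error of a pair-wise design under prior beta: det(I)^(-1/K), with
    binary-logit information matrix I = sum_s p_s(1-p_s) z_s z_s'."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for a1, a2 in choice_sets:
        z = [1.0] + [u - v for u, v in zip(a1, a2)]   # constant + attribute diffs
        p = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, z))))
        for i in range(k):
            for j in range(k):
                info[i][j] += p * (1.0 - p) * z[i] * z[j]
    return det(info) ** (-1.0 / k)

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            d = -d
        if a[col][col] == 0.0:
            return 0.0
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d

# Two hypothetical five-set designs over the same attribute levels: BALANCED
# keeps choice probabilities moderate, while DEGENERATE pits a clearly
# superior alternative against a clearly inferior one in every set.
BALANCED = [((3, 25, 2, 900), (5, 15, 4, 500)),
            ((4, 20, 4, 700), (5, 20, 2, 900)),
            ((3, 15, 4, 500), (4, 20, 2, 700)),
            ((5, 15, 4, 900), (4, 20, 2, 500)),
            ((4, 15, 2, 500), (3, 20, 4, 700))]
DEGENERATE = [((5, 25, 4, 500), (3, 15, 2, 900)),
              ((5, 25, 2, 500), (3, 15, 4, 900)),
              ((4, 25, 4, 500), (3, 15, 2, 900)),
              ((5, 20, 4, 500), (3, 15, 2, 900)),
              ((5, 25, 4, 700), (3, 15, 2, 900))]

print(d_error(BALANCED) < d_error(DEGENERATE))   # True: the balanced design wins
```

A search algorithm in the spirit of Zwerina et al. [17] essentially swaps attribute levels between candidate choice sets and keeps any change that lowers this quantity.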
Furthermore, we would recommend that researchers study the sensitivity of the measured marginal WTP to different priors, as performed in this paper, during the development of the choice experiment, and thereby gain information about whether additional pilot surveys are needed. We also consider the sensitivity to biased priors and the relative performance of different design techniques to be an important part of the discussion of the results from a choice experiment, in particular if the results will be used as input to public policy. In general, only the sensitivity of the estimations is discussed in the results, by calculating e.g. confidence intervals for welfare measures, but as shown in this paper an inefficient design per se may result in biased welfare measures.

Acknowledgements

We wish to thank participants of a seminar at Lund University and two anonymous referees for valuable comments. Joel Huber, Fuqua School of Business, Duke University, kindly provided the SAS code; the code is also available at ftp://ftp.sas.com/techsup/download/technote/ts643/. Financial support from the Swedish Transport and Communications Research Board, the Bank of Sweden Tercentenary Foundation and the Swedish National Institute of Public Health is gratefully acknowledged.

Appendix A

The optimal designs for Experiment 1 are shown in Table A1.

Table A1.
Optimal designs, Experiment 1 DZ-optimal Orthogonal DP-optimal DP-optimal false prior 1 DP-optimal false prior 2 Set Cost X1 X2 X3 Prob Cost X1 X2 X3 Prob Cost X1 X2 X3 Prob Cost X1 X2 X3 Prob Cost X1 X2 X3 Prob 1 1 2 2 3 3 4 4 5 5 4 5 4 3 5 5 3 3 25 25 20 25 25 20 15 20 20 2 4 4 2 4 2 4 2 2 0.79 0.21 0.55 0.45 0.60 0.40 0.65 0.35 0.69 500 700 900 700 700 500 900 500 700 500 900 900 900 700 700 900 700 500 5 4 4 5 5 4 3 3 4 15 25 25 25 20 20 25 20 20 2 4 2 2 2 2 2 4 4 0.04 0.96 0.29 0.71 0.75 0.25 0.83 0.17 0.94 900 500 500 900 500 900 500 700 500 4 3 4 3 3 4 5 4 3 25 20 15 20 15 20 25 20 20 2 4 2 4 4 2 2 4 2 0.88 0.12 0.35 0.65 0.14 0.86 0.98 0.02 0.01 700 900 500 700 700 900 900 500 900 4 3 5 4 3 5 5 4 4 15 20 20 25 25 25 15 15 20 4 2 2 4 4 2 2 4 4 0.40 0.60 0.23 0.77 0.29 0.71 0.38 0.62 0.69 700 900 500 900 900 700 700 900 500 5 3 5 3 5 3 3 5 3 15 20 20 25 20 25 25 20 25 4 2 4 2 4 2 4 2 2 0.65 0.35 0.35 0.65 0.35 0.65 0.45 0.55 0.40
Table A1 (continued) DZ-optimal Orthogonal 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 900 700 500 500 700 700 500 900 900 900 700 500 500 700 900 500 700 700 700 500 500 900 500 900 500 700 900 3 3 4 3 5 4 5 3 4 5 5 5 5 5 3 4 4 4 3 3 3 4 4 5 3 3 5 20 25 25 20 25 20 15 20 15 15 15 25 20 25 15 25 15 15 25 15 15 15 20 20 25 15 20 2 4 2 4 4 2 4 4 4 4 2 4 4 4 4 2 4 4 4 2 2 4 4 2 4 2 4 0.06 0.25 0.75 0.02 0.98 0.65 0.35 0.83 0.17 0.48 0.52 0.93 0.07 1.00 0.00 1.00 0.00 0.02 0.98 0.52 0.48 0.03 0.97 0.12 0.88 0.01 0.99 700 900 500 700 900 900 500 900 700 700 900 500 900 900 700 700 900 900 700 500 700 700 500 700 500 700 500 5 4 5 3 5 3 4 4 5 5 4 5 3 3 4 4 5 5 3 4 3 5 4 3 5 5 3 25 15 20 25 15 20 15 20 15 25 15 20 25 15 20 25 15 25 15 20 25 20 25 15 25 15 25 DP-optimal 4 2 4 2 4 2 4 4 2 4 2 2 4 4 2 4 2 4 2 4 2 2 4 4 2 4 2 0.99 0.01 0.99 0.96 0.04 0.50 0.50 0.80 0.20 1.00 0.00 0.60 0.40 0.03 0.97 0.99 0.01 1.00 0.00 0.40 0.60 0.08 0.92 0.00 1.00 0.04 0.96 500 500 900 500 900 500 900 700 500 700 500 500 700 700 900 900 700 500 700 900 500 500 700 900 500 900 700 5 3 5 4 3 5 4 3 4 4 5 3 5 5 3 4 5 5 3 5 3 5 3 3 4 3 5 15 25 20 15 20 20 25 25 20 20 15 20 20 20 25 25 20 15 20 15 20 20 25 25 20 20 15 2 2 4 2 4 4 2 2 4 2 4 4 2 2 4 2 4 4 2 4 2 4 2 4 2 2 4

Appendix B: Data-generating process

By using the attribute levels from the designs in Appendix A, we can calculate the difference in utility from the deterministic part reported in Equations (10) and (11). In the random utility framework there is also an unobservable part. We assume that the unobservable part is logistically distributed. Thus, based on the difference in utility from the deterministic part and a randomly drawn error term from a logistic distribution, we obtain the 'observed' difference in utility between the alternatives.
This results in the following specifications for the generic model and the alternative-specific model to be applied in the simulations:

w = x1 + 0.5x2 + 0.2x3 − 0.003Cost + Z   (B1)

w = 0.1 + x1 + 0.5x2 + 0.2x3 − 0.003Cost + Z   (B2)

where Z is a logistically distributed error term and w is the difference in utility between the alternatives. However, we do not observe this difference, only the alternative that has been chosen. Thus the continuous response, w, is transformed into a binary answer, ŷ, with the cut-off point for w set to 0. A positive value of w indicates that the 'observed' difference in utility favours alternative 1, hence the observed choice would be alternative 1, and vice versa. In order to evaluate marginal WTP we regress ŷ on x1, x2, x3 and Cost by applying a standard conditional logit estimator.

Table A1 (continued), remaining columns: 0.31 0.80 0.20 0.35 0.65 0.55 0.45 0.65 0.35 0.65 0.35 0.29 0.71 0.45 0.55 0.65 0.35 0.65 0.35 0.23 0.77 0.65 0.35 0.69 0.31 0.40 0.60 DP-optimal false prior 1 DP-optimal false prior 2 700 900 500 700 900 500 900 900 500 700 500 500 900 700 500 500 700 900 500 700 500 900 700 900 700 500 700 500 700 900 500 900 900 500 900 700 500 900 900 500 700 900 900 500 500 900 500 700 500 700 900 500 700 500 5 5 3 3 5 4 3 3 4 4 5 4 5 3 5 5 3 4 5 5 3 3 4 4 3 3 4 15 15 20 25 20 15 20 25 25 20 15 25 25 20 15 20 25 25 25 20 25 25 20 25 25 25 25 4 2 4 2 4 4 2 4 2 2 4 2 4 4 2 4 2 4 2 2 4 2 4 2 4 2 4 0.31 0.12 0.88 0.69 0.31 0.55 0.45 0.15 0.85 0.65 0.35 0.48 0.52 0.60 0.40 0.65 0.35 0.15 0.85 0.20 0.80 0.65 0.35 0.52 0.48 0.33 0.67 5 4 5 5 3 3 4 4 3 3 4 3 4 5 3 3 5 3 4 4 5 3 5 5 3 4 3 20 15 15 15 20 25 20 15 15 15 15 20 15 20 25 20 15 20 20 25 25 20 15 15 20 15 15 4 4 2 2 4 2 4 2 4 2 4 2 4 4 2 2 4 4 2 4 2 2 4 4 2 2 4 0.60 0.52 0.48 0.60 0.40 0.50 0.50 0.52 0.48 0.48 0.52 0.50 0.50 0.65 0.35 0.27 0.73 0.67 0.33 0.52 0.48 0.69 0.31 0.23 0.77 0.52 0.48
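The data-generating process of Equations (B1) and (B2) is easy to reproduce. A small Python sketch (our function names), drawing the logistic error by inverse CDF and checking that simulated choice shares match the logit probability:

```python
import math
import random

def simulate_choice(dv, rng):
    """One simulated response: add a standard logistic error to the
    deterministic utility difference dv; return 1 if alternative 1 wins."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # guard against log(0)
    eta = math.log(u / (1.0 - u))                    # logistic draw, inverse CDF
    return 1 if dv + eta > 0 else 0

rng = random.Random(42)
dv = 1.5                                  # an illustrative utility difference
p = 1.0 / (1.0 + math.exp(-dv))           # logit probability of alternative 1
share = sum(simulate_choice(dv, rng) for _ in range(100000)) / 100000.0
print(abs(share - p) < 0.01)              # True: simulated share matches p
```

This is exactly why the cut-off at w = 0 yields a conditional logit model for the observed binary choices.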
For a particular choice set in a specific simulation, we use the same simulated error term for all designs when calculating the simulated response, w. Hence, the difference in w between the designs relates only to the differences in the attribute levels derived from the designs, thus enabling us to compare the designs thoroughly.

Appendix C: Initial SAS code

/* Specify the beta vector. The number of values corresponds to */
/* the number of main effects; in this case 7 main effects, */
/* as the design has 1 two-level attribute and 3 three-level attributes. */
/* The values depend on the assumptions about the beta vector; */
/* in the case shown we assume no prior information. */
%let beta = 0 0 0 0 0 0 0;
/* Specify the number of alternatives in each choice set. */
%let nalts = 2;
/* Specify the total number of choice sets to be generated. */
%let nsets = 15;
/* Create the full factorial design, using PROC PLAN. */
/* The design has 1 two-level attribute and 3 three-level attributes. */
proc plan ordered;
   factors x1=2 x2=3 x3=3 x4=3 / noprint;
   output out=candidat;
run;
/* The full factorial design is then coded and the model is */
/* specified. In this case the model contains only main effects */
/* and no interaction effects. Interaction effects can be */
/* specified in the model statement as e.g. x1*x2. */
proc transreg design data=candidat;
   model class(x1 x2 x3 x4 /);
   output out=tmp_cand;
run;
/* The data set tmp_cand is then read into the design program */
/* specified in Zwerina et al. [17]. The code can be downloaded at */
/* ftp://ftp.sas.com/techsup/download/technote/ts643/ */

References

1. Maas A, Stalpers L. Assessing utilities by means of health system alternatives: methodology for analysis. Health Services Res 1992; 9: 35–52.
2. Propper C. The disutility of time spent on the United Kingdom's National Health Service waiting list. J Human Resour 1995; 30: 677–700.
3. Ryan M, Hughes J.
Using conjoint analysis to assess women's preferences for miscarriage management. Health Econ 1997; 6: 261–273.
4. San Miguel F, Ryan M, McIntosh E. Establishing women's preferences for the treatment of menorrhagia using the technique of conjoint analysis. Health Economics Research Unit Discussion Paper No. 06/97, University of Aberdeen, 1997.
5. Van der Pol M, Cairns J. Establishing patient preferences for blood transfusion support: an application of conjoint analysis. J Health Services Res Policy 1998; 3: 70–76.
6. Verheof C, Maas A, Stalpers L, Verbeek A, Wobbes T, Daal W. The feasibility of additive conjoint measurement in measuring utilities in breast cancer patients. Health Policy 1991; 17: 39–50.
7. Vick S, Scott A. Agency in health care: examining patients' preferences for attributes of the doctor-patient relationship. J Health Econ 1998; 17: 587–605.
8. Zweifel P, Breyer F. Health Economics. Oxford University Press: Oxford, 1997.
9. Zwerina K. Discrete Choice Experiments in Marketing. Physica-Verlag: Heidelberg, 1997.
10. NOAA. Natural resource damage assessments under the Oil Pollution Act of 1990. Fed Regist 1993; 58: 4601–4614.
11. van der Pol M, Cairns J. Estimating time preferences for health using discrete choice experiments. Soc Sci Med 2000; 52: 1459–1470.
12. Kuhfeld W, Tobias R, Garrat M. Efficient experimental design with marketing research applications. J Marketing Res 1994; 31: 545–557.
13. Huber J, Zwerina K. The importance of utility balance in efficient choice designs. J Marketing Res 1996; 33: 307–317.
14. McFadden D. Conditional logit analysis of qualitative choice behaviour. In Frontiers in Econometrics, Zarembka P (ed.). Academic Press: New York, 1974; 105–142.
15. Louviere J. Analyzing decision making: metric conjoint analysis. Sage University Paper Series on Quantitative Applications in the Social Sciences, Vol.
67, Sage: Newbury Park, 1988.
16. Bunch D, Louviere J, Anderson D. A comparison of experimental design strategies for multinomial logit models: the case of generic attributes. Working Paper, Graduate School of Management, University of California at Davis, 1994.
17. Zwerina K, Huber J, Kuhfeld W. A general method for constructing efficient choice designs. Working Paper, Fuqua School of Business, Duke University, 1996.
18. Alberini A. Optimal designs for discrete choice contingent valuation surveys: single-bound, double-bound, and bivariate models. J Environ Econ Manage 1995; 28: 287–306.
19. Kanninen B. Optimal experimental design for double-bounded dichotomous choice contingent valuation. Land Econ 1993; 69: 138–146.
20. Kanninen B. Optimal design for multinomial choice experiments. J Marketing Res 2002; 39: 214–227.
21. Kanninen B. Design of sequential experiments for contingent valuation studies. J Environ Econ Manage 1993; 25: 1–11.
22. Johnson R, Banzhaf M, Desvouges W. Willingness to pay for improved respiratory and cardiovascular health: a multiple-format stated-preference approach. Health Econ 2000; 9: 295–317.
23. Carlsson F, Martinsson P. Do hypothetical and actual marginal willingness to pay differ in choice experiments? Application to the valuation of the environment. J Environ Econ Manage 2001; 41: 179–192.
24. Steffens K, Lupi F, Kanninen B, Hoen J. Implementing an optimal experimental design for binary choice experiments: an application to bird watching in Michigan. In Benefits and Costs of Resource Policies Affecting Public and Private Land, Polasky S (ed.). Western Regional Research Publication, 2000.
25. Johansson P-O. Evaluating Health Risks: An Economic Approach. Cambridge University Press: Cambridge, 1995.