Hybrid discrete choice models: gained insights versus increasing effort

Petr Mariel ∗
Department of Applied Economics III (Econometrics and Statistics), University of the Basque Country
Avda. Lehendakari Aguirre, 83, E48015 Bilbao, Spain
E-mail: [email protected]
Tel: +34.94.601.3848
Fax: +34.94.601.3754

Jürgen Meyerhoff a,b
a) Institute for Landscape Architecture and Environmental Planning, Technical University of Berlin, D-10623 Berlin, Germany
E-mail: [email protected]
b) The Kiel Institute for the World Economy, Duesternbrooker Weg 120, 24105 Kiel, Germany

∗ Corresponding author

Abstract

Hybrid choice models expand the standard models in discrete choice modelling by incorporating psychological factors as latent variables. They could therefore provide further insights into choice processes and underlying taste heterogeneity, but the costs of estimating these models often increase significantly. This paper compares the results from a hybrid choice model and a classical random parameter logit. The point of departure for this analysis is whether researchers and practitioners should add hybrid choice models to their suite of routinely estimated models. Our comparison reveals, in line with the few prior studies, that hybrid models gain in efficiency through the inclusion of additional information. Which of the two approaches to use, however, depends on the objective of the analysis. If disentangling preference heterogeneity is most important, a hybrid model seems to be preferable. If the focus is on predictive power, a standard random parameter logit model might be the better choice. Finally, we give recommendations for an adequate use of hybrid choice models based on known principles of elementary scientific inference.

Keywords: discrete choice, hybrid choice model, land use, random parameter logit, marginal willingness to pay, latent variable

JEL: Q51, C35

1.
Introduction

Hybrid Choice Models (HCMs) have recently become more popular in discrete choice modelling as they expand standard choice models by incorporating psychological factors that may affect decision making. Generally, HCMs extend the specification of the traditional random utility model (RUM) by incorporating additional decision protocols in order to relax the simplifying assumptions and enrich the underlying behavioural characterizations. These extensions comprise, among others, flexible disturbances (e.g., factor analytic) to mimic more complex error structures and to allow for the explicit modelling of latent psychological factors such as attitudes (Ben-Akiva et al., 2002).

This paper aims to contribute to the current literature by investigating whether HCMs provide further insights that justify the higher costs of estimation. The question arises from our experience in developing and estimating several latent variable models (Bartczak et al., 2015; Hoyos et al., 2015; Mariel et al., 2015). In all these studies we found that the modelling process was very complex and costly because of the high number of coefficients and the complex likelihood function, whose numerous local maxima make its maximization tricky. On the other hand, the models yielded new insights compared to more conventional models such as the Random Parameter Logit (RPL) or the Latent Class Model. Thus, more knowledge about the potential gains of HCMs seems valuable, as the majority of choice experiment applications nowadays routinely apply approaches capable of capturing unobserved taste heterogeneity, such as RPL or latent class models. Given this, the question is whether researchers and practitioners mainly interested in the outcome of choice experiments should also move ahead and add HCMs to the suite of routinely estimated models, as discrete choice models are increasingly used in environmental valuation studies (see e.g.
Can and Alp, 2012 or Justes et al., 2014).

Accordingly, in this paper we focus on a closer comparison of the results from an HCM and those from an RPL. As the case for the comparison we analyse the effect of different design dimensions on the propensity to select the status quo (SQ) option in discrete choice tasks. The literature generally provides evidence that people tend to choose the SQ option disproportionately often and that this behaviour is, at least partially, triggered by the design characteristics of the choice sets used in the survey (Boxall et al., 2009; Rolfe and Bennett, 2009; Zhang and Adamowicz, 2011). The data are from a study applying a design-of-designs approach (Caussade et al., 2005) resulting in 16 different choice designs. Across those designs the following five design dimensions vary systematically: the number of choice sets, the number of alternatives, the number of attributes, the number of levels and the range of attribute levels (see Meyerhoff et al., 2015). Moreover, we use an attitudinal scale developed for measuring impulsivity. We were motivated to add this scale by the recently increasing interest in whether personal traits explain (stated) choice behaviour (e.g., Grebitus et al., 2013). In the following we compare the HCM and the RPL with respect to the impact of the design dimensions on the frequency of SQ choices, the distribution of the marginal WTP estimates, and the extent to which they allow insights into respondents’ decision making.

The paper is organized as follows. Section 2 discusses the definition of the latent variables used in HCMs, Section 3 describes the methodological framework and Section 4 the case study. Section 5 then presents the main results and, finally, Section 6 is devoted to discussion and conclusions.

2.
Latent variables and structural equation models

A latent variable is one of the foundation stones of a structural equation model, but there is no general definition of a latent variable that covers all its applications. Non-formal definitions consider latent variables as “hypothetical variables” that cannot be directly measured (MacCallum and Austin, 2000). Among the formal definitions, we find the local independence definition (Hambleton et al., 1991), the expected value definition (Lord and Novick, 1968, pp. 29-30), the definition of latent variables as nondeterministic functions of observed variables (Bentler, 1982) and the sample realization definition suggested by Bollen (2002, p. 612), which seems to be simple and flexible: “A latent random (or nonrandom) variable is a random (or nonrandom) variable for which there is no sample realization for at least some observations in a given sample”.

Structural Choice Models (Rungie, Coote and Louviere, 2011, 2012) combine structural equation modelling (SEM) with discrete choice models, assuming that the latent variables have random coefficients with multivariate distributions with unknown parameters. The model incorporates factors that influence the random coefficients and can influence each other through links in the structural equations. In the following we focus on related but different models, called HCMs, in which the latent variables represent the characteristics of individuals, typically constructs like attitudes (Ben-Akiva et al., 2002). These latent variables are treated as endogenous and related to sociodemographic characteristics in structural equations, but, at the same time, they are explanatory in measurement equations relating them to observed indicators. This type of model has been increasingly used in all fields in which discrete choice models are applied. Nevertheless, criticism of these models has increased at the same pace.
Thus far, latent variable models have most frequently been applied and supported in transportation by, among others, Abou-Zeid et al. (2010), Walker et al. (2010), Daly et al. (2012), Prato et al. (2012), Glerum et al. (2014), Kamargianni and Polydoropoulou (2014), Kim et al. (2014) and Paulssen et al. (2014). In a recent paper, also in transportation, Hess et al. (2013) find better performance of HCMs in terms of efficiency, represented by lower standard errors, and argue that this approach presents a theoretical advantage in terms of endogeneity bias and measurement error, but that its practical implications seem limited.

Chorus and Kroesen (2014) go even further in their criticism. They state that HCMs do not support the derivation of travel demand policies that aim to change travel behaviour through changes in a latent variable, for two reasons: the non-trivial endogeneity of the latent variable with respect to travel choice, and the cross-sectional nature of the latent variable, which does not allow for claims concerning changes in the variable at the individual level. The first argument is probably highly case specific, as the endogeneity of the latent variable can be empirically irrelevant. The second argument definitely needs future research, as it is not obvious how strongly the cross-sectional nature of the attitudinal information affects the performance of HCMs.

Recently, Dekker et al. (2014) investigated to what extent choices of leisure activities and related travel are driven by the satisfaction of needs through a particular leisure activity. They include in their choice model latent variables representing the anticipated level of individual needs-satisfaction from a particular leisure activity. Using a stated choice dataset involving choices between leisure activities, they contrast regret-minimisation based discrete choice models including and excluding the subjective measurements of needs-satisfaction.
Their empirical results show that, not unexpectedly, a large portion of the unobserved heterogeneity (around 40%) in the activity-specific utility levels can be attributed to anticipated needs-satisfaction.

In environmental valuation, the HCM has been applied, among others, by Hess and Beharry-Borg (2012), Bartczak et al. (2015), Hoyos et al. (2015), Mariel et al. (2015), and Lundhede et al. (2015). In general, they all support the finding that HCMs provide greater insights into attitudes as additional drivers of choices. Both Lundhede et al. (2015) and Bartczak et al. (2015) found, for example, a significant influence of age on the latent variable and subsequently on WTP estimates. In some cases gains in efficiency were also achieved. Nevertheless, Dekker et al. (2013), who additionally asked follow-up questions to record respondents’ response certainty, note, in a rather critical way, that this additional information does not significantly improve the explanation of the observed choices. Kløjgaard and Hess (2014), applying the HCM approach to data from a health survey, also express scepticism about latent variable models. They found that only a small share of the overall heterogeneity was linked to the latent variable. According to their interpretation, the weak link could be explained by preference heterogeneity being unrelated to attitudes and perceptions or, more precisely, by the specific attitudinal statements measured in the survey not being directly linked to preference heterogeneity.

Some of the issues related to the use of latent variables in HCMs might be avoided by learning from the SEM literature. Cliff (1983), for example, gives some warnings and advice to structural modellers, reminding them of four principles of elementary scientific inference that are perfectly applicable to discrete choice models with latent variables.

The first principle is that data do not confirm a model; they only fail to refute it.
That is, an estimated model cannot tell us about what is not in it. Generally, it is thus recommended to estimate multiple specifications and functional forms of a model in order to better understand the underlying data-generating process.

The second principle is that post hoc does not imply propter hoc; that is, a significant coefficient in an estimated model does not always mean causality. That principle can be related to the critique by Chorus and Kroesen (2014) regarding the cross-sectional nature of the latent variable. Because of this characteristic, the latent variable is not appropriate for analysing changes at the individual level.

The third principle is crucial in HCMs as it states that just giving something a name does not mean that we understand it. This is directly related to the definition of a latent variable, which is usually defined through associations with a set of indicators. Cliff (1983, p. 121) states: “... we can only interpret our results very cautiously unless or until we have included enough indicators of a variable in our analysis, and have satisfied not only ourselves but sceptical colleagues and critics that we have done so”. The meaning of the latent variable will always, to some extent, be wrong, and our indicators will, to some extent, be unreliable. Moreover, in HCMs the definition of the latent variables is usually neither based on theoretical foundations nor proved through empirical work. There are, however, accepted scales that measure, for example, attitudes with a tested set of questions, like locus of control (Rotter, 1975) or environmental beliefs (Stern, 2000), which can easily be incorporated in choice models. If the set of follow-up questions has not been based on theoretical findings, a preliminary exploratory multivariate analysis should at least be applied to confirm the structure of the underlying constructs.

The fourth principle is that ex post facto explanations are untrustworthy.
If a model has been adjusted on the basis of its fit or lack of fit to a particular data set, its statistical status is precarious until it can be tested on a new data set. Regarding that principle, a simple prediction, such as the one used in this application, can help in model comparison and can shed light on the real performance of the model and on how close the model is to the true data-generating process.

3. Model specification

We use two model specifications in this paper to investigate the influence of the design dimensionality on stated choices. The first is an HCM consisting, apart from measurement equations for attitudinal indicators, of two types of structural equation, one for the choice model and one for the latent variable model. The structural equation for the choice model is based on random utility theory, which links the deterministic model with a statistical model of human behaviour. Under this framework, the utility $U_{int}$ of alternative $i$ for respondent $n$ in choice situation $t$ (from a total of $T_n$ choice occasions) is given by:

$$U_{int} = V_{int} + \varepsilon_{int}, \tag{1}$$

where $V_{int}$ in a classical logit model depends on observable explanatory variables, which are usually attributes ($x_{int}$), and a vector of attribute parameters $\beta$. The term $\varepsilon_{int}$ is a random variable following an extreme value distribution with location parameter 0 and scale parameter 1. In an HCM, $V_{int}$ also depends on the latent variable $LV_n$ and a vector of parameters $\alpha$, usually representing the interaction terms of the latent and explanatory variables. Now let $j_{n,t}$ be the alternative chosen by consumer $n$ in choice situation $t$, such that $P_{n,t}(j_{n,t})$ gives the logit probability of the observed choice for consumer $n$ in choice situation $t$. The logit probability of consumer $n$’s observed sequence of choices is $P_n = \prod_{t=1}^{T_n} P_{n,t}(j_{n,t})$.
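To make the sequence probability $P_n$ concrete, the following is a minimal sketch in plain Python. The function names, the utilities and the chosen alternatives are hypothetical illustration values, not taken from the survey data; the code only assumes the standard logit formula implied by equation (1).

```python
import math

def logit_probabilities(utilities):
    """Logit choice probabilities for one choice situation.

    `utilities` holds the deterministic utilities V_int of all alternatives.
    Subtracting the maximum avoids overflow and leaves the result unchanged.
    """
    v_max = max(utilities)
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def sequence_probability(utilities_by_task, chosen_by_task):
    """P_n: product over choice situations of the probability of the chosen alternative."""
    p = 1.0
    for utilities, chosen in zip(utilities_by_task, chosen_by_task):
        p *= logit_probabilities(utilities)[chosen]
    return p

# Hypothetical respondent: two choice tasks with three alternatives each.
V = [[0.0, 0.5, -0.2], [0.0, 0.1, 0.3]]
chosen = [1, 2]  # index of the chosen alternative in each task
P_n = sequence_probability(V, chosen)
```

In an actual estimation the logarithm of $P_n$ would normally be accumulated instead of the raw product, because the product becomes very small when a respondent answers up to 24 choice sets.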
The second structural equation, for the latent variable, is given by

$$LV_n = h(Z_n, \gamma) + \omega_n, \tag{2}$$

where $h(Z_n, \gamma)$ represents the deterministic part of $LV_n$, with the specification $h(\cdot)$ being linear in our case, $Z_n$ being a vector of the socio-demographic variables of respondent $n$, and $\gamma$ being a vector of parameters. Additionally, $\omega_n$ is a normally distributed random disturbance with zero mean and standard deviation $\sigma_\omega$. In our case, the latent variable should represent the level of impulsivity of the respondents.

Measurement equations use the values of the attitudinal indicators as dependent variables, and explain their values with the help of the latent variables. The $\ell$th indicator (of the total of $L$ indicators) for respondent $n$ is therefore defined as:

$$I_{\ell n} = m(LV_n, \zeta) + v_n, \tag{3}$$

where the indicator $I_{\ell n}$ is a function of the latent variable $LV_n$ and a vector of parameters $\zeta$. The specification of $v_n$ determines the behaviour of the measurement model and depends on the nature of the indicator. Responses to impulsivity statements in our case study are collected using a Likert-type response scale, so that the measurement equations are given by a typical ordinal logit (Mariel et al., 2015) in which, apart from the parameters $\zeta$, the corresponding thresholds $\tau$ need to be estimated.

The model is finally estimated by maximum simulated likelihood. The estimation involves maximizing the joint likelihood of the observed sequence of choices ($P_n$) and the observed answers to the attitudinal questions ($L_{I_{\ell n}}$). The two components are conditional on the given realization of the latent variable $LV_n$. Accordingly, the log-likelihood function of the model is given by integration over $\omega_n$:

$$LL(\beta, \gamma, \zeta, \tau) = \sum_{n=1}^{N} \ln \int_{\omega} \left( P_n \prod_{\ell=1}^{L} L_{I_{\ell n}} \right) g(\omega)\, d\omega. \tag{4}$$

Thus, the joint likelihood function (4) depends on the parameters of the utility functions included in (1), the parameters for the socio-demographic interactions in the latent variable specification defined in (2), and the parameters for the measurement equations defined in (3). Daly et al. (2012) describe different identification procedures. We follow the Bolduc normalization by setting $\sigma_\omega$ equal to 1. All model components are estimated simultaneously, and the estimates are cross-checked using PythonBiogeme (Bierlaire, 2003, 2008) and Ox (Doornik, 2001).

The benchmark model for the hybrid setting described above is a typical RPL model in which we assume that $\beta_n$ is a vector of the true, but unobserved, taste coefficients for consumer $n$. We assume that $\beta_n$ is distributed over consumers with density $g(\beta \mid \Omega)$. In this case, if $P_{n,t}^{R}(j_{n,t} \mid \beta)$ gives the logit probability of the observed choice for consumer $n$ in choice situation $t$, the logit probability of consumer $n$’s observed sequence of choices is:

$$P_n^{R}(\Omega) = \int_{\beta} \prod_{t=1}^{T_n} P_{n,t}^{R}(j_{n,t} \mid \beta)\, g(\beta \mid \Omega)\, d\beta. \tag{5}$$

The log-likelihood function for the observed choices is then:

$$LL(\Omega) = \sum_{n=1}^{N} \ln \left( P_n^{R}(\Omega) \right). \tag{6}$$

4. Case study

The survey aimed at measuring preferences for land use changes in Germany.¹ Thus, the selected choice attributes comprise share of forest, land consumption, biodiversity conservation and a price attribute (Table 4.1). All attributes except those concerning biodiversity conservation were presented in all designs, while the biodiversity attributes were used to adjust the number of attributes according to the design plan proposed by Hensher (2004). Following this approach, 16 separate efficient designs were created using the C-efficiency criterion, which minimizes the variance of WTP (Scarpa and Rose, 2008). The designs were optimized for an MNL model.
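As background to the C-efficiency criterion mentioned above, the sketch below illustrates the quantity whose variance such a design targets. It assumes the usual MNL definition of marginal WTP as the negative ratio of an attribute coefficient to the cost coefficient, and approximates its variance with the delta method. Whether the design software used exactly this approximation is an assumption, and all numbers are hypothetical.

```python
def wtp_and_variance(beta_k, beta_cost, var_k, var_cost, cov_k_cost):
    """Marginal WTP = -beta_k / beta_cost and its delta-method variance.

    The gradient of the ratio with respect to (beta_k, beta_cost) is
    (-1 / beta_cost, beta_k / beta_cost**2).
    """
    wtp = -beta_k / beta_cost
    g_k = -1.0 / beta_cost
    g_c = beta_k / beta_cost ** 2
    variance = (g_k ** 2) * var_k + (g_c ** 2) * var_cost + 2.0 * g_k * g_c * cov_k_cost
    return wtp, variance

# Hypothetical prior values for one attribute and the cost coefficient.
wtp, var_wtp = wtp_and_variance(
    beta_k=0.03, beta_cost=-0.01, var_k=1e-4, var_cost=1e-6, cov_k_cost=0.0
)
```

A design that lowers the variances and covariance of the relevant coefficients lowers this WTP variance, which is the intuition behind optimizing for WTP precision and not only for coefficient precision.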
Table 4.1: Attributes used in the Choice Experiment

| Attribute | Description |
|---|---|
| FOREST | Percentage changes in the share of forest (positive and negative) |
| LAND | Percentage changes in land conversion for housing development and traffic (positive and negative) |
| BIO | Biodiversity in the whole landscape including all landscape types |
| BIO_AGRAR | Agricultural landscape biodiversity |
| BIO_FOREST | Forest landscape biodiversity |
| BIO_URBAN | Urban area biodiversity |
| BIO_OTHER1 | Biodiversity in other landscape types: forests, urban areas, mountains, water |
| BIO_OTHER2 | Biodiversity in other landscape types: urban areas, mountains, water |
| BIO_OTHER3 | Biodiversity in other landscape types: mountains, water |
| COST | Contribution to a landscape fund in € per year |

Table 4.2 provides an overview of the 16 designs and of how the dimensions of the choice sets vary across designs. All choice tasks included an SQ alternative, i.e., a zero price option with no environmental changes, plus two or more alternatives depending on the design-of-designs plan. Choices in the choice experiment regarding landscape changes had to be made by considering the landscape within a distance of about 15 kilometres from the respondent’s place of residence. Respondents for the nationwide online survey were recruited from a panel of a survey company. Each respondent was randomly allocated to one of the 16 designs.

¹ See Meyerhoff et al. (2015) for more details of the design of the choice experiment and the survey.

Table 4.2: Design overview

| Design | Sets | Alternatives | Attributes | Levels | Range | Interviews completed |
|---|---|---|---|---|---|---|
| 1 | 24 | 4 | 5 | 3 | Base | 67 |
| 2 | 18 | 4 | 5 | 4 | +20% | 63 |
| 3 | 24 | 3 | 6 | 2 | +20% | 66 |
| 4 | 12 | 3 | 6 | 4 | Base | 59 |
| 5 | 6 | 3 | 4 | 3 | +20% | 82 |
| 6 | 24 | 3 | 4 | 4 | -20% | 45 |
| 7 | 6 | 4 | 7 | 2 | -20% | 181 |
| 8 | 12 | 5 | 4 | 4 | +20% | 65 |
| 9 | 24 | 5 | 4 | 4 | Base | 65 |
| 10 | 6 | 5 | 7 | 3 | +20% | 128 |
| 11 | 6 | 4 | 6 | 4 | -20% | 71 |
| 12 | 12 | 5 | 5 | 2 | -20% | 68 |
| 13 | 18 | 4 | 7 | 2 | Base | 83 |
| 14 | 18 | 3 | 4 | 3 | -20% | 65 |
| 15 | 12 | 3 | 5 | 2 | Base | 99 |
| 16 | 18 | 5 | 6 | 3 | -20% | 76 |

Note: The number of interviews does not include those respondents who always chose the SQ option.
The questionnaire also included scales to capture different attitudes or personality traits of the respondents. One of these was a scale developed for measuring impulsivity. The scale is meant to provide a measurement instrument that allows the psychological trait of impulsivity to be recorded in an economical way, i.e., in a way that consumes only a small amount of interview time. The scale follows the UPPS (Urgency, Premeditation, Perseverance and Sensation Seeking Impulsive Behavior Scale) approach. Kovaleva et al. (2012) point out that there is still no standard definition of impulsiveness but that it is assumed that the construct is multidimensional and thus comprises various aspects of impulsive behaviour. These include, among others, i) the tendency to act without thinking and without sufficient information for a decision, ii) the tendency to prefer a smaller immediate reward, and iii) the tendency to choose riskier alternatives or the inability to assess the risks associated with decisions correctly. Therefore, the UPPS approach comprises the four subscales urgency, intention, endurance and willingness to take risks. Each subscale is addressed using two items. Table 4.3 reports the wording of the attitudinal statements and the direction of the association with the latent construct impulsivity. Kovaleva et al. (2012) show that their scale performs well and allows a reliable and valid measurement of impulsivity.
Table 4.3: Attitudinal questions

| Item | Subscale | Direction | Statement |
|---|---|---|---|
| impul1 | urgency | + | Sometimes I do things impulsively that I shouldn't do |
| impul2 | urgency | + | I sometimes do things to cheer myself up that I later regret |
| impul3 | intention | - | I usually think carefully before I act |
| impul4 | intention | - | I usually consider things carefully and logically before I make up my mind |
| impul5 | endurance | - | I always bring to an end what I have started |
| impul6 | endurance | - | I plan my schedule so that I get everything done on time |
| impul7 | willingness to take risks | + | I am willing to take risks |
| impul8 | willingness to take risks | + | I am happy to take chances |

The scale was added to the survey in order to shed light on the link between respondents’ psychological traits and their stated choices in the survey. We expect that respondents who tend to be more impulsive are more likely to choose alternatives with a positive price, i.e., not the SQ option, and that this intensifies when the choice sets become more complex with a higher dimensionality. The reason for this is that people who are said to be more impulsive are, among other things, expected to be more likely to act without reflecting on the consequences and to be more likely to take risks (Kovaleva et al., 2012). To some extent, however, the scale, which was provided by a leading social science research centre in Germany (GESIS - Leibniz Institute for the Social Sciences), was added in an experimental manner, as we expected it to be a reliable measurement instrument enabling us to estimate HCMs. The literature applying latent variable models indicates that not using reliable measurement instruments reduces the possibility of estimating an HCM.

5. Results

Table 5.1 describes the variables used in the econometric models, along with their descriptive statistics. Because of item non-response, the usable sample comprises 23,118 responses from 1,661 individuals. Briefly, the mean age is 42.3 years, the share of female respondents is 53% and the mean disposable income of the respondents’ households is 17,500 Euros.
As the survey was conducted online, we did not expect the sample to be representative of the population in Germany. Not all people have access to the Internet and, above all, not all use it regularly. Obvious deviations exist for the variables education and income. Compared to the German population, the share of respondents with higher education is too large and thus the disposable incomes are also too high. However, as we did not plan to aggregate, for example, welfare measures based on the model results, we assume in the following that the model comparison is not affected by the sample composition.

Table 5.1: Summary statistics

| Variable (Attribute) | Description | Mean | Std.Dev. | Min | Max |
|---|---|---|---|---|---|
| AGE | Age | 42.31 | 13.53 | 19 | 84 |
| MAN | Gender: Male | 0.47 | 0.50 | 0 | 1 |
| HIGHEDUC | Level of education > secondary | 0.39 | 0.49 | 0 | 1 |
| INCOME | Income | 17,500.00 | 34,959.09 | 450 | 100,000 |
| POSITION | Position of the choice set | 9.18 | 6.22 | 1 | 24 |
| ALTERNATIVES | Number of alternatives | 3.85 | 0.81 | 3 | 5 |
| ATTRIBUTES | Number of attributes | 5.32 | 1.08 | 4 | 7 |
| WIDE | Wide level range | 0.29 | 0.45 | 0 | 1 |
| NARROW | Narrow level range | 0.33 | 0.47 | 0 | 1 |
| LEVEL3 | Three level range | 0.28 | 0.45 | 0 | 1 |
| LEVEL4 | Four level range | 0.33 | 0.47 | 0 | 1 |

In addition to the socio-economic information, the respondents were asked a series of attitudinal questions regarding impulsivity, as presented in Table 4.3. Table 5.2 shows the response distributions on a 5-point Likert scale. For each statement, values closer to five equate to stronger agreement while values closer to one equate to stronger disagreement.
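The way such 5-point responses enter the measurement equations (3) of Section 3 can be sketched with a generic ordinal logit. The loading `zeta` and the four thresholds below are hypothetical values chosen only for illustration; they are not the estimates reported later, and the exact threshold parameterization used in the estimation is not reproduced here.

```python
import math

def logistic_cdf(x):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logit_probs(lv, zeta, thresholds):
    """Probabilities of the K = len(thresholds) + 1 ordered response categories.

    P(I = k | LV) = F(tau_k - zeta * LV) - F(tau_{k-1} - zeta * LV),
    with tau_0 = -infinity and tau_K = +infinity; thresholds must be increasing.
    """
    cuts = [logistic_cdf(tau - zeta * lv) for tau in thresholds]
    cdf = [0.0] + cuts + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]

# Hypothetical item with a positive loading on the latent variable.
taus = [-3.0, -1.0, 0.5, 2.5]
probs_low = ordered_logit_probs(lv=-1.0, zeta=0.8, thresholds=taus)
probs_high = ordered_logit_probs(lv=1.0, zeta=0.8, thresholds=taus)
```

With a positive `zeta`, a higher value of the latent variable shifts probability mass towards category 5 ("applies completely"), which is the direction marked "+" in Table 4.3.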
Table 5.2: Responses to the impulsivity attitudinal questions

| Item | Subscale | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| impul1 | urgency | 4% | 35% | 24% | 32% | 5% |
| impul2 | urgency | 10% | 40% | 24% | 24% | 2% |
| impul3 | intention | 1% | 10% | 17% | 58% | 14% |
| impul4 | intention | 1% | 8% | 18% | 58% | 15% |
| impul5 | endurance | 1% | 3% | 11% | 60% | 25% |
| impul6 | endurance | 2% | 16% | 16% | 50% | 16% |
| impul7 | willingness to take risks | 3% | 29% | 26% | 37% | 5% |
| impul8 | willingness to take risks | 1% | 20% | 30% | 43% | 6% |

Note: 1 = doesn’t apply at all, 5 = applies completely

As a first step, an exploratory factor analysis was conducted on the responses to the attitudinal questions. The exploratory factor analysis employed principal axis factor analysis. According to Table 5.3, it seems reasonable to choose a two-factor solution, as the percentage of variance explained decreases sharply at the third factor, and the highest factor loadings appear in the columns for Factors 1 and 2. An HCM with all eight attitudinal questions and two latent variables would have a very high number of parameters (82), which could lead to numerical issues in the estimation procedure. As parsimony is also an important issue for model development, we estimated numerous alternative model specifications and selected a subset of questions, using as the criterion the significance of the parameters $\zeta$ in the measurement equations (3). This is why only three attitudinal questions (impul1, impul7 and impul8) were finally included in the HCM, which therefore incorporates only one latent variable representing the first factor (Table 5.3). This one-latent-variable solution is also in line with the definition of our attitudinal questions, as impul1 is related to urgency and impul7 and impul8 to willingness to take risks. Our latent variable represents, therefore, urgency and risk propensity.

What is pursued here is the satisfaction of the third principle introduced by Cliff (1983), stating that just giving something a name does not mean that we understand it. In our case we chose three indicators of clearly stated theoretical concepts, basing our decision on a factor analysis of our data.
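The "Proportion" and "Cumulative" columns of Table 5.3 can be approximately reproduced from the reported eigenvalues. The short sketch below assumes that, as is common in principal axis factoring output, each proportion is the eigenvalue divided by the sum of all eigenvalues (which may include negative values because the reduced correlation matrix is analysed). Since the table reports eigenvalues rounded to two decimals, the recomputed figures can differ from the table in the last digit.

```python
# Eigenvalues as reported (rounded) in Table 5.3.
eigenvalues = [1.92, 1.16, 0.49, 0.16, -0.18, -0.19, -0.21, -0.23]

total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

# Running sum of the proportions gives the cumulative column.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for k, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"Factor{k}: eigenvalue={ev:5.2f}  proportion={p:5.2f}  cumulative={c:5.2f}")
```

The sharp drop in the proportion between the second and third factor is the pattern used in the text to justify the two-factor solution.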
Table 5.3: Exploratory factor analysis

Eigenvalues and percentages

| Factor | Eigenvalue | Proportion | Cumulative |
|---|---|---|---|
| Factor1 | 1.92 | 0.66 | 0.66 |
| Factor2 | 1.16 | 0.40 | 1.06 |
| Factor3 | 0.49 | 0.17 | 1.23 |
| Factor4 | 0.16 | 0.06 | 1.28 |
| Factor5 | -0.18 | -0.06 | 1.22 |
| Factor6 | -0.19 | -0.07 | 1.15 |
| Factor7 | -0.21 | -0.07 | 1.08 |
| Factor8 | -0.23 | -0.08 | 1.00 |

Factor loadings

| Variable | Factor1 | Factor2 | Factor3 | Factor4 |
|---|---|---|---|---|
| impul1 | 0.56 | 0.00 | 0.36 | 0.10 |
| impul2 | 0.46 | -0.07 | 0.43 | 0.06 |
| impul3 | -0.61 | 0.27 | 0.28 | -0.11 |
| impul4 | -0.60 | 0.26 | 0.27 | -0.13 |
| impul5 | -0.26 | 0.45 | 0.00 | 0.21 |
| impul6 | -0.33 | 0.35 | -0.04 | 0.23 |
| impul7 | 0.52 | 0.56 | -0.07 | -0.11 |
| impul8 | 0.47 | 0.61 | -0.11 | -0.08 |

As outlined in Section 3, the specification of an HCM requires the specification of two types of structural equations, one for the choice model and one for the latent variable model. Following equation (1), the structural equation for the choice model has a deterministic term $V_{int}$, defined in our case as:

$$
\begin{aligned}
V_{int} = \beta' X_{int} ={}& (ASC_i + \alpha_{ASC_i} LV_n) + (\beta_{FOREST} + \alpha_{FOREST} LV_n)\, FOREST_{int} \\
&+ (\beta_{LAND} + \alpha_{LAND} LV_n)\, LAND_{int} + \beta_{BIO}\, BIO_{int} - \exp(\beta_{COST} + \alpha_{COST} LV_n)\, COST_{int},
\end{aligned}
\tag{7}
$$

where $FOREST$, $LAND$, $BIO$ and $COST$ are the choice attributes described in Table 4.1 and $\alpha_{ASC_i} = 0 \;\forall\; i \neq SQ$. The attribute $BIO$ is substituted by the corresponding split attributes in designs including $BIO\_AGRAR$, $BIO\_FOREST$, $BIO\_URBAN$, $BIO\_OTHER1$, $BIO\_OTHER2$ and $BIO\_OTHER3$. In addition, we include alternative specific constants $ASC_i$ for all but one of the alternatives.

Note that the functional form of equation (7) resembles an RPL with the key attributes ($FOREST$, $LAND$ and $COST$) being random to allow for a more straightforward comparison of the results obtained from the two models. According to (7), and apart from the key attributes, $ASC_{SQ}$ is also assumed to be random, which allows a possible SQ effect caused by impulsivity and/or complexity of the design to be analysed. In the RPL the coefficients $\beta_{FOREST}$, $\beta_{LAND}$ and $ASC_{SQ}$ are assumed to be normally, and $\beta_{COST}$ to be log-normally, distributed, which is in line with (7).
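The structure of equation (7) can be illustrated with a small sketch. The dictionary keys, function names and all numerical values are hypothetical and serve only to show how the latent variable shifts the coefficients; the marginal WTP function is the ratio implied by (7) for the forest attribute, not an estimate from the paper.

```python
import math

def systematic_utility(asc, forest, land, bio, cost, lv, beta, alpha, is_sq):
    """Deterministic utility V_int following the form of equation (7).

    The LV shifts the ASC only for the status quo alternative
    (alpha_ASC_i = 0 for all i != SQ). The cost coefficient is
    -exp(beta_COST + alpha_COST * LV), so it is always negative.
    """
    asc_term = asc + (alpha["ASC_SQ"] * lv if is_sq else 0.0)
    forest_coef = beta["FOREST"] + alpha["FOREST"] * lv
    land_coef = beta["LAND"] + alpha["LAND"] * lv
    cost_coef = -math.exp(beta["COST"] + alpha["COST"] * lv)
    return asc_term + forest_coef * forest + land_coef * land + beta["BIO"] * bio + cost_coef * cost

def marginal_wtp_forest(lv, beta, alpha):
    """Marginal WTP for FOREST implied by (7): forest coefficient over exp(cost index)."""
    return (beta["FOREST"] + alpha["FOREST"] * lv) / math.exp(beta["COST"] + alpha["COST"] * lv)

# Hypothetical parameter values for illustration only.
beta = {"FOREST": 0.03, "LAND": -0.02, "BIO": 0.01, "COST": 0.0}
alpha = {"ASC_SQ": 0.5, "FOREST": -0.01, "LAND": 0.01, "COST": -1.0}

v_at_zero = systematic_utility(asc=0.5, forest=10.0, land=5.0, bio=2.0, cost=1.0,
                               lv=0.0, beta=beta, alpha=alpha, is_sq=False)
```

Because both the numerator and the denominator of the WTP ratio depend on $LV_n$, heterogeneity in the latent variable translates directly into heterogeneity in marginal WTP.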
Moreover, we assume that there is a vector of individual characteristics and complexity variables that affects the mean of these random parameter distributions. To make the two competing models similar, we include in the vector affecting the mean of the random parameters the same variables as those included in the deterministic part of the latent variable $LV_n$ defined in (2).

Figure 5.1: Empirical distributions of random parameters (panels: Share of forest, Land conversion, Cost, ASCsq)
Note: the solid line represents the individual contributions of each random parameter and the dashed line a normal density (log-normal for Cost).

As the selection of the parameters’ distribution is a key issue in the RPL methodology, we applied the empirical approach proposed by Hensher and Greene (2003) to describe graphically the empirical distributions of the random parameters. In this procedure, the same model is estimated on different data subsets, each created by leaving one individual out. The differences between the parameter estimates obtained from these subsets and the parameter estimates for the whole sample provide the contribution (incremental marginal utility) of a specific individual to the overall sample mean parameter estimate and can, therefore, indicate the type of underlying individual preference heterogeneity. Figure 5.1 shows the shape of these individual contributions for each random parameter (solid line) together with a normal density (dashed line). The cost coefficient, however, is plotted with a lognormal dashed density. The lognormal distribution (with a sign change), assumed for the cost parameter, ensures finite moments for the WTP distributions (Daly, Hess and Train, 2012). Figure 5.1 shows that there are no sizeable deviations of the individual contributions from the previously assumed density shapes for the random parameters.

Table 5.4 presents the maximum simulated log-likelihood estimation obtained from the RPL using 200 Halton draws.
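To illustrate how Halton draws are used to simulate the integral in equation (5), the following sketch approximates the sequence probability for a model with a single normally distributed coefficient. It is a simplified stand-in, not the specification estimated in the paper: the function names, the attribute levels and the parameter values are hypothetical, and real applications typically use one Halton dimension per random coefficient and different draws per respondent.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Element number `index` (starting at 1) of the Halton sequence for a prime base."""
    result, fraction = 0.0, 1.0
    i = index
    while i > 0:
        fraction /= base
        result += fraction * (i % base)
        i //= base
    return result

def normal_halton_draws(n_draws, base=2):
    """Transform Halton points in (0, 1) into standard normal draws via the inverse CDF."""
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(halton(i, base)) for i in range(1, n_draws + 1)]

def simulated_sequence_prob(x_by_task, chosen_by_task, mean, sd, draws):
    """Average over draws of the product over tasks of the logit probability of the chosen alternative."""
    total = 0.0
    for z in draws:
        coef = mean + sd * z  # one draw of the random coefficient
        prob = 1.0
        for x_by_alt, chosen in zip(x_by_task, chosen_by_task):
            utilities = [coef * x for x in x_by_alt]
            v_max = max(utilities)
            exps = [math.exp(u - v_max) for u in utilities]
            prob *= exps[chosen] / sum(exps)
        total += prob
    return total / len(draws)

# Hypothetical respondent: two tasks, one attribute level per alternative.
draws = normal_halton_draws(200)
tasks = [[0.0, 1.0, 2.0], [0.0, 2.0, 1.0]]
chosen = [0, 0]
p_sim = simulated_sequence_prob(tasks, chosen, mean=-0.5, sd=0.3, draws=draws)
```

The product over tasks is taken inside the average, so the same draw of the coefficient applies to all choices of a respondent, which is what distinguishes the panel form in (5) from simulating each choice independently.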
The high number of observations and the high number of different utility function specifications due to the complex design do not allow for using more Halton draws, as this would increase estimation costs drastically. However, both models were estimated with two different software packages (PythonBiogeme and Ox) and with various sets of starting values to check the stability of the presented results. The estimated means and standard deviations of all random coefficients are presented in the upper part of Table 5.4, together with the estimated coefficients representing the heterogeneity in mean. The lower part of the same table presents the estimates of the non-random coefficients.

Table 5.4: Random parameter logit estimation

Observations: 23118   Respondents: 1661   Parameters: 58
Log_L: -17669.9   AIC: 35455.8   BIC: 35922.6   CAIC: 35980.6

                          Share of forest       Land conversion       Cost                  ASC SQ
                          Value        p-value  Value        p-value  Value        p-value  Value        p-value
Mean                       0.0299 ***  <0.01    -0.0251 ***  <0.01    -0.442       0.24      1.041       0.21
St. Dev.                   0.0316 ***  <0.01     0.0203 ***  <0.01    -2.702  ***  <0.01     3.548  ***  <0.01
Mean heterogeneity:
Choice task position       0.0002 **   0.04     -0.0004 ***  <0.01     0.0068      0.06      0.0446 ***  <0.01
Number of alternatives     0.0002      0.87      0.0004      0.65     -0.1896 ***  <0.01    -1.1274 ***  <0.01
Number of attributes      -0.0007      0.54      0.0024 ***  <0.01    -0.4745 ***  <0.01     0.0963      0.36
Wide level range          -0.0107 ***  <0.01     0.0034 **   0.02     -0.6457 ***  <0.01     0.8278 ***  <0.01
Narrow level range         0.0079 ***  <0.01    -0.0045 ***  <0.01    -0.0918      0.27     -0.3945      0.06
Three levels               0.0053      0.06      0.0031      0.06     -2.3069 ***  <0.01     0.6809 **   0.01
Four levels                0.0076 **   0.02      0.0012      0.54     -1.4857 ***  <0.01     0.3379      0.24
Age                       -0.0001      0.46      0.0000      0.58      0.0091 ***  <0.01     0.0168 **   0.01
Male                      -0.0013      0.52      0.0009      0.47     -0.8476 ***  <0.01    -0.0931      0.60
Higher Education           0.0016      0.47     -0.0053 ***  <0.01    -0.0128      0.88     -0.1263      0.54

Other coefficients:        Value        p-value
Biodiversity-Whole         0.0105 ***  <0.01
Biodiversity-Agricultural  0.0081 ***  <0.01
Biodiversity-Forest        0.0123 ***  <0.01
Biodiversity-Urban         0.0095 ***  <0.01
Biodiversity-Other1        0.0096 ***  <0.01
Biodiversity-Other2        0.0092 ***  <0.01
Biodiversity-Other3        0.0064 ***  <0.01
ASC2                       0.1042 ***  <0.01
ASC3                      -0.1331 ***  <0.01
ASC4                      -0.3949 ***  <0.01

Table 5.5 presents the maximum simulated likelihood estimation results of the HCM, also obtained using only 200 Halton draws. The upper part of the table presents the estimates of the key attributes together with the corresponding LV effect coefficients (𝛼). The coefficients (𝛾) of the structural equation of the LV defined in (2) are on the left-hand side of the table and the coefficients of the measurement equations (𝜁) on the right-hand side. These are presented together with the thresholds estimated using the ordinal logit model (defined as 𝜏ℓ, 𝜏ℓ + 𝛿1ℓ, 𝜏ℓ + 𝛿2ℓ, 𝜏ℓ + 𝛿3ℓ) for the three attitudinal response scales.

Table 5.5: HCM model estimation

Observations: 23118   Respondents: 1661   Parameters: 43
Log_L: -26315.4   AIC: 52716.8   BIC: 53062.9   CAIC: 53105.9

                          Share of forest       Land conversion       Cost                  ASC SQ
                          Value        p-value  Value        p-value  Value        p-value  Value        p-value
Coefficient                0.0366 ***  <0.01    -0.0228 ***  <0.01     3.0761 ***  <0.01     4.1742 ***  <0.01
Effect of the LV          -0.009  ***  <0.01     0.008  ***  <0.01    -1.335  ***  <0.01    -3.857  ***  <0.01

Structural equation        Value        p-value
Choice task position      -0.0093 ***  <0.01
Number of alternatives     0.1441 ***  <0.01
Number of attributes       0.0589 **   <0.01
Wide level range          -0.0101      0.91
Narrow level range         0.0608      0.31
Three levels               0.2387 ***  <0.01
Four levels                0.1141      0.12
Age                        0.0007      0.65
Male                       0.3284 ***  <0.01
Higher Education           0.1273      0.13

Measurement equation parameters
Thresholds and constants   Value        p-value
𝜏1                        -3.686  ***  <0.01
𝛿11                        2.571  ***  <0.01
𝛿21                        1.076  ***  <0.01
𝛿31                        2.623  ***  <0.01
𝜏2                        -4.005  ***  <0.01
𝛿12                        2.717  ***  <0.01
𝛿22                        1.130  ***  <0.01
𝛿32                        2.686  ***  <0.01
𝜏3                        -4.682  ***  <0.01
𝛿13                        3.115  ***  <0.01
𝛿23                        1.333  ***  <0.01
𝛿33                        2.710  ***  <0.01
Coefficients of the LV
𝜁1                        -0.533  **   0.03
𝜁2                        -0.492  **   0.05
𝜁8                        -0.279  **   0.04

Other coefficients:        Value        p-value
Biodiversity-Whole         0.0139 ***  <0.01
Biodiversity-Agricultural  0.0070 ***  <0.01
Biodiversity-Forest        0.0090 ***  <0.01
Biodiversity-Urban         0.0077 ***  <0.01
Biodiversity-Other1        0.0087 ***  <0.01
Biodiversity-Other2        0.0048 ***  <0.01
Biodiversity-Other3        0.0072 ***  <0.01
ASC2                       0.0609 ***  0.01
ASC3                      -0.0985 ***  <0.01
ASC4                      -0.3309 ***  <0.01

A comparison of Tables 5.4 and 5.5 leads to the following conclusions. The estimates of the non-random coefficients, as well as the mean values of the Share of forest and Land conversion coefficients, are very close. The main differences between the models are found in the Cost and 𝐴𝑆𝐶𝑆𝑄 coefficients. Many design dimensions and socio-demographic variables have significant effects on the means of the random coefficients in the RPL and on the LV in the HCM. A direct comparison, however, is not possible because the RPL is more flexible, in the sense that it allows different impacts of these variables on the mean of each random coefficient, whereas in the HCM this effect is modelled through a latent concept of impulsivity and is therefore assumed to be the same for all coefficients. Given that the 𝜁 coefficients in (3) are negative, high values of the latent variable correspond to less impulsive, and more risk-averse, individuals. Thus, as 𝛼𝐴𝑆𝐶𝑆𝑄 is negative, more impulsive individuals with high risk propensity are more likely to choose an alternative different from the SQ option, which confirms our a priori expectations.

Next, based on the results from Tables 5.4 and 5.5, we simulate the marginal WTP values and the distribution of the 𝐴𝑆𝐶𝑆𝑄 for the sample population of respondents, using 10,000 draws of the corresponding normal (𝐴𝑆𝐶𝑆𝑄, Share of forest, Land conversion) and log-normal (Cost) distributions, taking into account the heterogeneity in mean coefficients.
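The RPL part of this simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name and arguments are hypothetical, the attribute and cost draws are assumed to be independent, and the heterogeneity-in-mean terms are collapsed into two optional shift arguments that default to zero. With the shifts left at zero the output will therefore not reproduce the values reported in the paper.

```python
import math
import random
import statistics


def simulate_wtp_quartiles(mean_attr, sd_attr, mu_cost, sigma_cost,
                           shift_attr=0.0, shift_cost=0.0,
                           n_draws=10_000, seed=1):
    """Simulate marginal WTP = -beta_attr / beta_cost and return its quartiles.

    beta_attr is normal; beta_cost is lognormal with a sign change, i.e.
    beta_cost = -exp(mu_cost + shift_cost + sigma_cost * z).
    shift_attr and shift_cost stand in for the heterogeneity-in-mean terms
    of one respondent profile (hypothetical inputs).
    """
    rng = random.Random(seed)
    wtp = []
    for _ in range(n_draws):
        beta_attr = mean_attr + shift_attr + sd_attr * rng.gauss(0.0, 1.0)
        beta_cost = -math.exp(mu_cost + shift_cost + sigma_cost * rng.gauss(0.0, 1.0))
        wtp.append(-beta_attr / beta_cost)
    return tuple(statistics.quantiles(wtp, n=4))


# Example call with the Share of forest and Cost estimates of Table 5.4
# (absolute value of the Cost standard deviation, all shifts set to zero).
q25, q50, q75 = simulate_wtp_quartiles(0.0299, 0.0316, -0.442, 2.702)
print(q25, q50, q75)
```

Because the cost coefficient enters the denominator through an exponential, a large value of `sigma_cost` produces the wide, heavy-tailed WTP distributions that are typical of this specification.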
Similarly, the simulated marginal WTP and the distribution of 𝐴𝑆𝐶𝑆𝑄 for the HCM are computed by using 10,000 draws for the LV of each respondent and taking into account the coefficients of the structural equation for the LV. Table 5.6 presents the distribution of the WTP of the two models obtained for the two attributes Share of forest and Land conversion. As can easily be seen, the median values are similar, but the distribution of the WTP obtained by the RPL is much wider. This could indicate a better performance of the HCM in terms of less variation of WTP values (Hess et al., 2013). However, we have to be cautious with the comparison, as the large intervals in the RPL are likely to be, at least partially, driven by the heavy-tailed lognormal distribution of the cost coefficient. Nevertheless, the cost coefficient in the HCM is also modelled in a lognormal-like way, $-\exp(\beta_{COST} + \alpha_{COST} LV_n)\, COST_{int}$.

Table 5.6: Distribution of marginal WTP obtained by RPL and HCM models

RPL               25th percentile   Median   75th percentile
Share of forest         0.1           2.0         22.4
Land conversion       -10.8          -0.9          0.0

HCM               25th percentile   Median   75th percentile
Share of forest         1.3           2.5          4.8
Land conversion        -2.0          -1.3         -0.7

As the means of the random coefficients in the RPL model, as well as the latent variable in the HCM, depend on various design dimensions and socio-demographic variables, the WTP values can be simulated for specific subgroups of respondents. Tables 5.7 and 5.8 demonstrate how the distributions of WTP change under different scenarios characterized by different values of two design dimension variables. This allows us to analyse the effect of these variables on the WTP distributions.

Table 5.7: Effects of position in a series of choice occasions on the marginal WTP distribution

Share of forest - RPL
Position      25th perc.   Median   75th perc.
Low (<5)         0.0         1.9       21.9
High (>13)       0.1         2.2       21.5

Land conversion - RPL
Position      25th perc.   Median   75th perc.
Low (<5)        -9.3        -0.7        0.0
High (>13)     -12.0        -1.3       -0.1

Share of forest - HCM
Position      25th perc.   Median   75th perc.
Low (<5)         1.4         2.8        5.1
High (>13)       1.1         2.3        4.4

Land conversion - HCM
Position      25th perc.   Median   75th perc.
Low (<5)        -2.0        -1.3       -0.7
High (>13)      -1.9        -1.2       -0.6

Table 5.8: Effects of number of alternatives on the marginal WTP distribution

Share of forest - RPL
Alternatives  25th perc.   Median   75th perc.
Low (3)          0.0         1.2       13.2
High (5)         0.2         4.2       47.4

Land conversion - RPL
Alternatives  25th perc.   Median   75th perc.
Low (3)         -7.3        -0.7        0.0
High (5)       -17.9        -1.3        0.0

Share of forest - HCM
Alternatives  25th perc.   Median   75th perc.
Low (3)          1.1         2.2        4.1
High (5)         1.5         3.0        5.5

Land conversion - HCM
Alternatives  25th perc.   Median   75th perc.
Low (3)         -1.9        -1.1       -0.6
High (5)        -2.1        -1.4       -0.7

The effects in Tables 5.7 and 5.8 are, as expected, in the direction of the sign of the corresponding heterogeneity in mean coefficient (RPL) or the structural equation coefficient (HCM). These effects are not always in the same direction in both approaches, as the models rely on different assumptions. If we focus on shifts in the median values, we conclude that these are not as large as we would expect in all cases presented above but, nevertheless, they are too large to be ignored. For example, the median WTP value for the Share of forest attribute changes from 1.2 to 4.2 in the RPL and from 2.2 to 3.0 in the HCM as a consequence of the change in the number of alternatives from 3 to 5.

Using the same procedure, the distribution of 𝐴𝑆𝐶𝑆𝑄 was simulated in the RPL and HCM models under different scenarios. Table 5.9 characterizes the changes in those distributions attributable to design dimension variables. For example, the two approaches confirm that a choice task appearing later in the sequence of tasks increases the utility of the SQ alternative, leading to a higher probability of it being chosen. This can be due to the fatigue effect (e.g., Boxall et al., 2009).
On the other hand, as expected, in the two approaches a higher number of alternatives has the opposite effect; that is, more alternatives lead to a lower probability of the SQ choice. The same result was obtained in Oehlmann et al. (2014).

Table 5.9: Distribution of 𝐴𝑆𝐶𝑆𝑄 under different scenarios

                           RPL                                  HCM
                           25th perc.   Median   75th perc.     25th perc.   Median   75th perc.
Position            Low      -4.3        -1.8       0.8           -3.3        -0.6       2.1
                    High     -3.3        -0.9       1.6           -2.5         0.2       2.9
Alternatives        Low      -2.8        -0.3       2.0           -2.2         0.4       3.1
                    High     -4.9        -2.6      -0.2           -3.6        -1.0       1.7
Attributes          Low      -3.9        -1.3       1.2           -2.7        -0.1       2.6
                    High     -3.7        -1.2       1.3           -3.1        -0.4       2.3
Wide level range    No       -4.2        -1.7       0.8           -3.0        -0.3       2.4
                    Yes      -3.0        -0.5       1.9           -2.8         0.0       2.7
Narrow level range  No       -4.2        -1.7       0.8           -2.8        -0.1       2.6
                    Yes      -4.3        -1.8       0.7           -3.2        -0.5       2.2

Next, we compared the performance of the two models by two simple approaches. First, in a similar way to the simulation of the marginal WTP distributions, we simulated the probabilities of each alternative based on the sample population of respondents, using 10,000 draws. If we assume that the highest probability coincides with the choice prediction, we get the classification tables of observed and predicted outcomes presented in Table 5.10. The results are presented as shares of all observations. As can be observed, there are only minor differences between the results for the two models. Both models, RPL and HCM, predict very similarly, but at the same time also poorly.
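The step from simulated probabilities to a classification table can be sketched as follows. This is a minimal illustration on hypothetical data: the function names, the zero-based outcome indices and the four example choice situations are assumptions made for this example, not the authors' implementation.

```python
def classification_table(probabilities, observed, n_outcomes):
    """Cross-tabulate predicted against observed outcomes as shares.

    Rows are predicted outcomes and columns are observed outcomes, both
    zero-based. The prediction for each choice situation is the alternative
    with the highest simulated probability.
    """
    n_cases = len(observed)
    table = [[0.0] * n_outcomes for _ in range(n_outcomes)]
    for probs, obs in zip(probabilities, observed):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        table[predicted][obs] += 1.0 / n_cases
    return table


def share_correct(table):
    """Sum of the diagonal cells, i.e. the share of correctly predicted choices."""
    return sum(table[j][j] for j in range(len(table)))


# Four hypothetical choice situations with three alternatives each.
example_probabilities = [
    [0.50, 0.30, 0.20],
    [0.20, 0.50, 0.30],
    [0.10, 0.20, 0.70],
    [0.40, 0.35, 0.25],
]
example_observed = [0, 1, 1, 0]
example_table = classification_table(example_probabilities, example_observed, 3)
print(share_correct(example_table))
```

In this toy example three of the four choices are predicted correctly. Choice sets with different numbers of alternatives can be handled in the same way, as the prediction only looks at the probabilities available in each situation.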
Table 5.10: Classification table of observed and predicted outcomes

RPL
                   Observed
Predicted      1       2       3       4       5      Total
1            0.084   0.036   0.066   0.030   0.015    0.230
2            0.036   0.085   0.066   0.030   0.014    0.230
3            0.061   0.066   0.158   0.022   0.013    0.320
4            0.030   0.032   0.026   0.056   0.010    0.155
5            0.012   0.010   0.011   0.010   0.021    0.063
Total        0.224   0.229   0.327   0.147   0.073    1.000

HCM
                   Observed
Predicted      1       2       3       4       5      Total
1            0.086   0.032   0.065   0.030   0.013    0.225
2            0.030   0.089   0.068   0.032   0.015    0.233
3            0.059   0.062   0.151   0.019   0.013    0.304
4            0.029   0.030   0.026   0.054   0.006    0.145
5            0.020   0.017   0.017   0.014   0.027    0.093
Total        0.224   0.229   0.327   0.147   0.073    1.000

If we transform the information into one indicator, defined as $R^2_{Count} = \frac{1}{N}\sum_j n_{jj}$, where $n_{jj}$ is the number of correct predictions for outcome $j$, located on the diagonal cells of the two tables, we get $R^2_{Count} = 0.403$ for the RPL model and $R^2_{Count} = 0.406$ for the HCM model. If we make our prediction more realistic and use in each simulation step a draw from a uniform [0,1] distribution to generate a choice prediction based on the predicted probabilities, then the values of $R^2_{Count}$ drop slightly to 0.371 and 0.373 in the RPL and HCM models respectively. Unsurprisingly, the difference is also very small.

If we analyse the contribution of the attributes and attitudinal questions to the prediction in more detail, we can subtract from the numerator and denominator of $R^2_{Count}$ the number of cases in the outcome with the highest frequency (in our case outcome 3), and we obtain an adjusted $R^2_{Count}$ which is, in our case, 0.114 for the RPL and 0.118 for the HCM. Our knowledge of attributes and attitudinal questions, compared to a prediction based only on the marginal distributions, reduces the error in prediction by only 11.4% and 11.8% respectively.

There are other simple indicators related to observed and unobserved heterogeneity that can be used to compare the RPL and HCM.
The random coefficients are an appealing part of the RPL, but we would certainly prefer to interpret a model in which the unobserved heterogeneity represents only a small part of the random coefficients. The same is true for the HCMs. Actually, an RPL-like definition of the HCM coefficients (𝛽𝐹𝑂𝑅𝐸𝑆𝑇 + 𝛼𝐹𝑂𝑅𝐸𝑆𝑇 𝐿𝑉𝑛) is a nice way to disentangle the preference heterogeneity through the use of the underlying construct. To achieve this goal, the coefficients 𝛾 in (2) should be sufficiently large so that ℎ(𝑍𝑛, 𝛾) represents a high proportion of the total variation of the latent variable. Table 5.11 presents the ratios of the variances of observed and unobserved heterogeneity. For the HCM, the table presents the ratio of the variances of ℎ(𝑍𝑛, 𝛾) and 𝜔𝑛 defined in (2), computed by the use of the same simulations as those used in the above prediction exercise. The values in the RPL column have been computed in a similar way.

Table 5.11: Observed/unobserved heterogeneity ratios

                    RPL      HCM
Share of forest     0.076    0.067
Land conversion     0.077    0.067
Cost                0.155    0.067
𝐴𝑆𝐶𝑆𝑄               0.097    0.067

As can be observed from Table 5.11, the ratios are low, but this finding is not unusual in the literature (Dekker et al., 2013; Kløjgaard and Hess, 2014).

6. Discussion

The objective of this paper was to investigate whether the insights gained from HCMs, which have been applied more frequently in the recent literature, justify the additional effort. As a case study, we used a data set based on a design-of-designs approach, which allows for the analysis of the influence of choice task complexity on model outcomes. Regarding the influence of the design dimensions, we find that both the HCM and the RPL model show that the design dimensions influence the WTP distribution. The results are obviously not exactly the same for the two models, as the more flexible RPL specification allows us to see different effects of the design dimensions on WTP for each attribute.
Both approaches, moreover, confirm that all the design dimensions in the analysis influence the marginal WTP values and, consequently, some conclusions can be drawn. Firstly, it is important to choose the design dimensions of choice sets carefully, as they can significantly influence the outcomes. Our results show that the highest influence corresponds to the number of alternatives and the number of attribute levels. Secondly, the design dimensions are also related to the frequency of SQ choices. According to our results, more alternatives in the choice set have a negative impact on the frequency of SQ choices. This can be explained by the so-called preference matching effect (Zhang and Adamowicz, 2011), i.e., giving respondents more alternatives in a choice set increases the probability that they find an alternative that matches their preferences. By contrast, the number of choice tasks faced by a respondent positively affects the frequency of SQ choices, i.e., the later in the sequence of choice sets, the higher the propensity to choose the SQ alternative. This might be caused by respondent fatigue at the end of the sequence of choice sets. To what extent learning and fatigue take place while responding to a discrete choice experiment is, however, still under investigation (for a recent study see Campbell et al., 2015).

In this study we have only focused on the design dimensions and have not incorporated other aspects of complexity, such as the total number of level changes or the similarity of alternatives measured, for example, through entropy (e.g., Zhang and Adamowicz, 2011). Therefore, we might not have captured all those aspects of complexity that influence the propensity to choose the SQ alternative. The reason for this is that we wanted to focus here on the comparison of the models. Readers interested in the relationship between the other aspects of complexity and SQ choices are thus referred to Oehlmann et al. (2014).
Finally, regarding the effect of impulsivity on the propensity to choose the SQ option, we conclude that more impulsive and risk-seeking people are more likely to choose a non-SQ alternative. The findings add to increasing evidence about the relationship between personality traits and choices (Farizo et al., 2016).

The main objective of this paper, as stated in the introduction, was to compare, more closely than is usually done, an HCM with the more commonly used RPL model. The comparison includes performance, the insights gained through the estimation and the subsequent post-estimation analysis. We therefore believe that our results add new insights to the ongoing debate regarding the performance and additional value of HCMs (Chorus and Kroesen, 2014; Dekker et al., 2014; Kløjgaard and Hess, 2014; Vij and Walker, 2016).

The two competing models in our case study were specified in a similar way so that their comparison would be relatively easy. The two models allow for preference heterogeneity of three key attributes. One part of this heterogeneity is linked to the dimensionality of the choice tasks and to socio-demographic variables. The other part remains random. The main difference between the two approaches is that the taste heterogeneity in the RPL model is not linked to any underlying latent attitudes. Thus, a comparison in terms of model fit is not straightforward. Some authors compute the log-likelihood value of competing models corresponding only to the choice part of the model. However, this procedure is debatable, as the log-likelihood function is maximized taking into account all the parameters of the model. This is why in the literature the debate about the suitability of the HCM usually remains in the discussion of the actual differences in the implied sensitivities of alternative model specifications.

The work of Glerum et al. (2014) is an exception, presenting an interesting validation of their model in relation to the fourth principle of the SEM literature, that ex post facto explanations are untrustworthy. They estimate the HCM on 80% of the data and compute the choice probabilities for the remaining 20% of the data. Assuming that the highest predicted probability corresponds to the chosen alternative, the authors compare this to the actual choice. They also use the $\bar{\rho}^2$ as an additional indicator of the validity of the HCM in comparison to a plain MNL model, concluding, unsurprisingly, that the HCM performs better. A different approach was presented by Kløjgaard and Hess (2014), who try to disentangle the influence of the latent variable, but their conclusion is not very optimistic. Only a small share of the overall heterogeneity is linked to the latent variable, which explains only slightly more than 6% of the total variance. The validation of the HCM should therefore be an important part of any empirical application based on HCM methodology, as the criticism of this approach based on empirical evidence (Kløjgaard and Hess, 2014) and theoretical foundations (Chorus and Kroesen, 2014; Dekker et al., 2013) has increased considerably.

If we focus on our comparison of the performance of the two models, the first conclusion, based on the prediction exercise (Table 5.10), is that the two models perform very similarly and that no great differences can be found in their prediction outcomes. The second conclusion is that the two models perform very poorly in forecasting, as they are only able to predict around 40% of the actual outcomes. This low percentage is not a big issue in our case, as the main goal of discrete choice models in the environmental field is usually not prediction but policy making based on the WTP values obtained.
Nevertheless, it shows that methodologically complex models like those presented in this application, which are widely used and not only in environmental studies, can reveal complex forms of inter-variable relations and how the variables relate to preference heterogeneity, but fail to give a good representation of the underlying data-generating process.

Regarding the observed/unobserved heterogeneity ratios of the two models (Table 5.11), the low ratio for the HCM is perfectly in line with Kløjgaard and Hess (2014), as in their case study only 6% of the overall heterogeneity is linked to the latent variable. This may require a revision of the main idea behind the relationship between preference heterogeneity and attitudes, i.e., that attitudes contribute substantially to explaining taste heterogeneity. As can be seen in Table 5.11, the RPL model does not amend this shortcoming. The standard deviations of the random parameters are too high, leaving little space for the design dimensions and socio-demographic variables to represent a relevant portion of the overall heterogeneity. It is obviously questionable whether the assumed distributions are the appropriate ones in this case, but the same pattern of a low ratio is observed for all coefficients. This point is related to the first principle of elementary scientific inference: data do not confirm a model, they can only fail to refute it. Low ratios of observed and unobserved heterogeneity in many HCM applications lead us to revise our a priori assumption that the unobserved constructs representing our latent attitudes are related to the socio-demographic characteristics usually included in a choice experiment, not only in environmental valuation. The modelling approach should also be tested in detail. The significant reduction of the unobserved heterogeneity obtained by Dekker et al. (2014) by the use of a mixture of the RPL and HCM approaches seems to be a promising line for future research.
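To relate the ratios of Table 5.11 to statements expressed as a share of the overall heterogeneity, a ratio r of observed to unobserved variance can be converted into the share r / (1 + r). The short Python sketch below does this for the reported ratios. It assumes that the observed and unobserved components are uncorrelated, so that their variances add up to the total variance; the function and variable names are hypothetical.

```python
def observed_share(ratio):
    """Convert an observed/unobserved variance ratio into the observed share of total variance.

    Assumes the two components are uncorrelated, so total variance = observed + unobserved.
    """
    return ratio / (1.0 + ratio)


# Ratios as reported in Table 5.11.
ratios = {
    "RPL": {"Share of forest": 0.076, "Land conversion": 0.077, "Cost": 0.155, "ASC SQ": 0.097},
    "HCM": {"Share of forest": 0.067, "Land conversion": 0.067, "Cost": 0.067, "ASC SQ": 0.067},
}

shares = {
    model: {name: round(observed_share(value), 3) for name, value in by_coefficient.items()}
    for model, by_coefficient in ratios.items()
}
print(shares)
```

Under this assumption, the HCM ratio of 0.067 corresponds to roughly 6% of the total variance of the latent variable, and the largest RPL ratio (Cost, 0.155) to roughly 13%.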
As already stated in the literature, HCMs gain in efficiency by the inclusion of additional information (attitudinal questions) in the choice model. This is why, in our case, the WTP distributions derived from the HCM present lower variations. Our case is rather atypical, because the plain RPL model has more coefficients than the HCM. The typical situation is the opposite one, i.e., the number of parameters in an HCM rises rapidly with each attitudinal debriefing question. Thus, our case is an atypical example where the HCM simplifies a complex RPL model devoted to the capture of complex observed heterogeneity.

Our recommendation based on the present evidence is that people conducting choice experiments can consider adding the HCM to their suite of models even if the costs are high and these models are not yet available in standard econometric packages. Nevertheless, there are many case-specific external issues that need to be taken into account to be able to reduce unobserved heterogeneity. This is in line with Vij and Walker (2016), who systematically evaluate the benefits of the HCM framework in comparison with a more traditional choice model without latent variables, using a set of criteria based on statistical considerations and relevance to practice and policy. The study finds the statistical benefits of the HCM to be smaller than previously believed. According to their conclusions, the HCM can improve predictions and reduce the variance of the parameter estimates only in some specific cases. However, in terms of relevance for practice and policy, they recommend the use of the HCM, as it allows one to measure, test and quantify the impact of latent constructs on observable behaviour through, for example, willingness to pay estimates or elasticity of demand.

The overall conclusion of our model comparison leads us to highlight the importance of the final use of an estimated model.
If we are interested in learning something new from the model and in disentangling the preference heterogeneity further than a standard RPL allows, an HCM is a promising option. This view is supported by Vij and Walker (2016, p. 212): “Unlike simpler choice models, ICLV models provide a mathematical framework for testing and applying complex theories of behaviour, and lend structure and meaning to underlying sources of heterogeneity.” If model fit and predictive power are the goals, simpler models can be a more adequate choice. However, so far most studies in environmental valuation investigate taste heterogeneity and its drivers.

Finally, we would like to highlight the importance of the SEM principles in the use of HCMs summarized in Section 2. Specifically, the third and fourth principles of elementary scientific inference are not given sufficient attention by practitioners. The third is related to the theoretical foundation of the latent variable and the necessity to apply exploratory multivariate analysis to the attitudinal responses before including them in the model. Further, the last principle, related to model validation, is almost completely missing in the recent choice modelling literature. The present application uses classical approaches to gain deeper insight into the performance of the applied complex models. In addition to the methodological issues there is, finally, a very important issue with respect to the estimation procedure: we found that it is essential to use various starting values, as the HCMs can quickly end up in local maxima.

All in all, both the RPL and the HCM allow for an understanding of some of the complex impacts of the complexity variables on the structural coefficients, but the two models confirm that a large portion of the unobserved heterogeneity remains unexplained. Future research should find the missing variables or alternative model specifications that could reduce this unobserved heterogeneity.
Acknowledgments

The authors acknowledge financial support from the Department of the Environment of the Basque Government and from the Department of Education of the Basque Government through grant IT-642-13 (UPV/EHU Econometrics Research Group), Spanish Ministry of Economy and Competitiveness through grant ECO2014-52587-R as well as from the German Federal Ministry of Education and Research Funding (Fkz. 033L029G; Fkz 01LL0909A). This paper was partially written when Petr Mariel was visiting the Durham University Business School. He gratefully acknowledges the support provided by the Basque Government (Ikermugikortasuna 2016) for this stay.

References

Abou-Zeid, M., Ben-Akiva, M., Bierlaire, M., Choudhury, C., Hess, S., 2010. Attitudes and value of time heterogeneity, in: Van de Voorde, E., Vanelslander, T. (Eds.), Applied Transport Economics – A Management and Policy Perspective. De Boeck Publishing, Antwerp, pp. 523-545.

Bartczak, A., Mariel, P., Chilton, S., Meyerhoff, J., 2015. The impact of latent risk preferences on valuing preservation of threatened lynx populations in Poland, forthcoming in Australian Journal of Agricultural and Resource Economics, doi: 10.1111/1467-8489.12123.

Ben-Akiva, M., McFadden, D., Train, K., Walker, J., Bhat, C., Bierlaire, M., Bolduc, D., Boersch-Supan, A., Brownstone, D., Bunch, D.S., Daly, A., De Palma, A., Gopinath, D., Karlstrom, A., Munizaga, M.A., 2002. Hybrid choice models: Progress and challenges. Marketing Letters 13(3), 163-175.

Bentler, P.M., 1982. Linear systems with multiple levels and types of latent variables, in: Jöreskog, K.G., Wold, H. (Eds.), Systems Under Indirect Observation. North-Holland, Amsterdam, pp. 101-130.

Bierlaire, M., 2003. BIOGEME: A free package for the estimation of discrete choice models, in: Chevroulet, T., Sevestre, A. (Eds.), Proc. 3rd Swiss Transportation Research Conf., 19-21 March 2003, Monte-Verita, Ascona, Switzerland.

Bierlaire, M., 2008. An Introduction to BIOGEME Version 1.7. Available at: biogeme.epfl.ch.

Bollen, K., 2002. Latent variables in psychology and the social sciences. Annual Review of Psychology 53, 605-634.

Boxall, P., Moon, A., Adamowicz, W.L., 2009. Complexity in decision making: The effect of task structure and context on participant response. Australian Journal of Agricultural and Resource Economics 53(4), 503-519.

Campbell, D., Boeri, M., Doherty, E., Hutchinson, W.G., 2015. Learning, fatigue and preference formation in discrete choice experiments. Journal of Economic Behaviour & Organization 119, 345-363.

Can, O., Alp, E., 2012. Valuing of environmental improvements in a specially protected marine area: a choice experiment approach in Göcek Bay, Turkey. Science of Total Environment 8, 439-291.

Chorus, C., Kroesen, M., 2014. On the (im-)possibility of deriving transport policy implications from hybrid choice models. Transport Policy 36, 217-222.

Cliff, N., 1983. Some cautions concerning the application of causal modeling methods. Multivariate Behavioral Research 18, 115-126.

Daly, A., Hess, S., Patruni, B., Potoglou, D., Rohr, C., 2012. Using ordered attitudinal indicators in a latent variable choice model: A study of the impact of security on rail travel behaviour. Transportation 39, 267-297. DOI 10.1007/s11116-011-9351-z.

Daly, A., Hess, S., Train, K., 2012. Assuring finite moments for willingness to pay in random coefficient models. Transportation 39(1), 19-31. DOI 10.1007/s11116-011-9331-3.

Dekker, T., Hess, S., Hofkes, M., Brouwer, R., 2013. Hybrid choice models for decision uncertainty: Implicitly or explicitly uncertain? The 3rd International Choice Modelling Conference, The Sebel Pier One Sydney, 3-5 July 2013.

Dekker, T., Hess, S., Arentze, T., Chorus, C., 2014. Incorporating needs-satisfaction in a discrete choice model of leisure activities. Journal of Transport Geography 38, 66-74.

Doornik, J., 2001. Ox: An Object-Oriented Matrix Language. Timberlake Consultants Press, London.

Farizo, B.A., Oglethorpe, D., Soliño, M., 2016. Personality traits and environmental choices: On the search for understanding. Science of the Total Environment 566-567, 157-167.

Glerum, A., Atasoy, B., Bierlaire, M., 2014. Using semi-open questions to integrate perceptions in choice models. The Journal of Choice Modelling 10, 11-33.

Grebitus, C., Lusk, J.L., Nayga Jr, R.M., 2013. Explaining differences in real and hypothetical experimental auctions and choice experiments with personality. Journal of Economics Psychology 26, 11-16.

Hambleton, R.K., Swaminathan, H., Rogers, H.J., 1991. Fundamentals of Item Response Theory. Sage, Newbury Park, CA.

Hensher, D.A., Greene, W.H., 2003. The mixed logit model: The state of practice. Transportation 30(2), 133-176. DOI 10.1023/A:1022558715350.

Hess, S., Beharry-Borg, N., 2012. Accounting for latent attitudes in willingness-to-pay studies: The case of coastal water quality improvements in Tobago. Environmental and Resource Economics 52(1), 109-131. DOI 10.1007/s10640-011-9522-6.

Hess, S., Shires, J., Jopson, A., 2013. Accommodating underlying pro-environmental attitudes in a rail travel context: Application of a latent variable latent class specification. Transportation Research Part D: Transport and Environment 25 (December 2013), 42-48.

Hoyos, D., Mariel, P., Hess, S., 2015. Incorporating environmental attitudes in discrete choice models: An exploration of the utility of the awareness of consequences scale. Science of the Total Environment 505, 1100-1111.

Justes, A., Barberán, R., Farizo, B., 2014. Economic valuation of domestic water uses. Science of Total Environment 8, 472-712.

Kamargianni, M., Polydoropoulou, A., 2014. Development of a hybrid choice model to investigate the effects of teenagers’ attitudes towards walking and cycling on mode choice behavior. Transportation Research Record 2382, 151-161.

Kim, J., Rasouli, S., Timmermans, H., 2014. Expanding scope of hybrid choice models allowing for mixture of social influences and latent attitudes: Application to intended purchase of electric cars. Transportation Research Part A 69, 71-85. DOI 10.1016/j.tra.2014.08.016.

Kløjgaard, M.E., Hess, S., 2014. Understanding the formation and influence of attitudes in patients’ treatment choices for lower back pain: Testing the benefits of a hybrid choice model approach. Social Science & Medicine, DOI 10.1016/j.socscimed.2014.05.058.

Kovaleva, A., Beierlein, C., Kemper, C.J., Rammstedt, B., 2012. Eine Kurzskala zur Messung von Impulsivität nach dem UPPS-Ansatz: Die Skala Impulsives-Verhalten-8 (I-8) (GESIS Working Papers 2012|20). Köln: GESIS.

Lord, F.M., Novick, M.R., 1968. Statistical theories of mental test scores. Addison-Wesley, Reading, MA.

Lundhede, T.H., Jacobsen, J.B., Hanley, N., Strange, N., Thorsen, B.J., 2015. Incorporating outcome uncertainty and prior outcome beliefs in stated preferences. Land Economics 91, 296-316.

MacCallum, R.C., Austin, J.T., 2000. Applications of structural equation modeling in psychological research. Annual Review of Psychology 51, 201-226.

Mariel, P., Meyerhoff, J., Hess, S., 2015. Heterogeneous preferences toward landscape externalities of wind turbines – combining choices and attitudes in a hybrid model. Renewable and Sustainable Energy Reviews 41, 647-657.

Meyerhoff, J., Oehlmann, M., Weller, P., 2015. The influence of design dimensions on stated choices in an environmental context. Environmental and Resource Economics 61(3), 385-407.

Oehlmann, M., Weller, P., Meyerhoff, J., 2014. Complexity-induced Status Quo Effects in Discrete Choice Experiments for Environmental Valuation. Paper presented at Annual Conference 2014 (Hamburg): Evidence-based Economic Policy from German Economic Association. https://www.econstor.eu/bitstream/10419/100616/1/VfS_2014_pid_615.pdf

Paulssen, M., Temme, D., Vij, A., Walker, J., 2014. Values, attitudes and travel behavior: A hierarchical latent variable mixed logit model of travel mode choice. Transportation 41, 873-888.

Prato, G.G., Bekhor, S., Pronello, C., 2012. Latent variables and route choice behavior. Transportation 39(2), 299-319.

Rolfe, J., Bennett, J., 2009. The impact of offering two versus three alternatives in choice modelling experiments. Ecological Economics 68, 1140-1148.

Rotter, J.B., 1975. Some problems and misconceptions related to the construct of internal versus external control of reinforcement. Journal of Consulting and Clinical Psychology 43, 56-67. DOI 10.1037/h0076301.

Rungie, C., Coote, L.V., Louviere, J.J., 2011. Structural choice modelling: Theory and applications to combining choice experiments. Journal of Choice Modelling 4(3), 1-29.

Rungie, C., Coote, L.V., Louviere, J.J., 2012. Latent variables in discrete choice experiments. Journal of Choice Modelling 5(3), 145-156.

Scarpa, R., Rose, J.M., 2008. Design efficiency for non-market valuation with choice modelling: How to measure it, what to report and why. The Australian Journal of Agricultural and Resource Economics 52, 253-282.

Stern, P.C., 2000. Towards a coherent theory of environmentally significant behavior. Journal of Social Issues 56, 407-424.

Vij, A., Walker, J.L., 2016. How, when and why integrated choice and latent variable models are latently useful. Transportation Research Part B: Methodological 90, 192-217. DOI 10.1016/j.trb.2016.04.021.

Walker, J., Li, J., Srinivasan, S., Bolduc, D., 2010. Travel demand models in the developing world: Correcting for measurement errors. Transportation Letters 2, 231-243.

Zhang, J., Adamowicz, W.L., 2011. Unraveling the choice format effect: A context-dependent random utility model. Land Economics 87(4), 730-743.