Cognition xxx (2011) xxx–xxx
Contents lists available at SciVerse ScienceDirect. Journal homepage: www.elsevier.com/locate/COGNIT

Cognitive models of risky choice: Parameter stability and predictive accuracy of prospect theory

Andreas Glöckner (Max Planck Institute for Research on Collective Goods, Bonn, Germany) and Thorsten Pachur (Cognitive and Decision Sciences, Department of Psychology, University of Basel, Switzerland)

Article history: Received 5 February 2011; revised 6 December 2011; accepted 6 December 2011. doi:10.1016/j.cognition.2011.12.002

Keywords: Prospect theory; Risky choice; Individual differences; Cognitive modeling; Heuristics

Abstract

In the behavioral sciences, a popular approach to describe and predict behavior is cognitive modeling with adjustable parameters (i.e., which can be fitted to data). Modeling with adjustable parameters allows, among other things, measuring differences between people. At the same time, parameter estimation also bears the risk of overfitting. Are individual differences as measured by model parameters stable enough to improve the ability to predict behavior as compared to modeling without adjustable parameters? We examined this issue in cumulative prospect theory (CPT), arguably the most widely used framework to model decisions under risk. Specifically, we examined (a) the temporal stability of CPT's parameters; and (b) how well different implementations of CPT, varying in the number of adjustable parameters, predict individual choice relative to models with no adjustable parameters (such as CPT with fixed parameters, expected value theory, and various heuristics). We presented participants with risky choice problems and fitted CPT to each individual's choices in two separate sessions (which were 1 week apart). All parameters were correlated across time, in particular when using a simple implementation of CPT.
CPT allowing for individual variability in parameter values predicted individual choice better than CPT with fixed parameters, expected value theory, and the heuristics. CPT's parameters thus seem to pick up stable individual differences that need to be considered when predicting risky choice.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

In the behavioral sciences, a popular approach to describe and predict behavior is cognitive modeling. Many cognitive models contain adjustable parameters; depending on the values of the parameters, the model makes different predictions. There are two main reasons for modeling with adjustable parameters. First, adjustable parameters allow simulating reactions to variations in the task, such as adaptations of decision thresholds in evidence accumulation models (e.g., Busemeyer & Townsend, 1993), or stopping rules for retrieval in memory models (e.g., Raaijmakers & Shiffrin, 1981). Second, adjustable parameters can accommodate individual differences among people. For instance, the generalized context model for categorization (e.g., Nosofsky, 1984) has adjustable parameters to allow for individual variability in the subjective weights for the cue dimensions; in learning models, adjustable parameters allow for individual differences in memory decay and learning rate (e.g., Rieskamp & Otto, 2006; Sutton & Barto, 1998).

Author note: Both authors contributed equally. This research was supported by a Swiss National Science Foundation Grant to Thorsten Pachur (100014 125047/2). Our thanks go to Jörg Rieskamp for discussing modeling issues and to Laura Wiles for copy editing the manuscript. Corresponding author: Andreas Glöckner, Max Planck Institute for Research on Collective Goods, Kurt-Schumacher-Str. 10, D-53113 Bonn, Germany. Tel.: +49 (0)2 28 9 14 16 857; fax: +49 (0)2 28 9 14 16 858. E-mail: [email protected].
A model with adjustable parameters, however, can be a mixed blessing: on the one hand, adjustable parameters can increase the scope of a model in that they allow extending the conditions under which the model is defined. Hence, parameters can potentially increase the testability of a model (i.e., its empirical content; Popper, 1934/2005; see also Glöckner & Betsch, 2011). On the other hand, adjustable parameters also increase the model's flexibility. This can be problematic because a very flexible model may be fit to almost any set of observations and can therefore be difficult to falsify, decreasing its empirical content. Moreover, the greater a model's flexibility, the greater its risk, when fitted to data, of picking up unsystematic variance (i.e., overfitting; Roberts & Pashler, 2000). Therefore, it is important to test to what extent multi-parameter models that can capture, for instance, individual differences, in fact reflect these differences reliably—that is, that they are superior in predicting future behavior as compared to models without adjustable parameters. One way to address this question is to examine the stability of individually fitted parameter values across time. Currently, systematic tests of the consistency and temporal stability of individual differences as measured by parameters in cognitive models are very rare (e.g., Birnbaum, 1999; Yechiam & Busemeyer, 2008).
As a consequence, it is unclear to what degree individual differences in parameter values can be replicated and thus indicate genuine and robust individual differences, or whether they result from overfitting and thus reflect mainly unsystematic variability. In this article, we illustrate the issue of parameter stability in cognitive models by examining the temporal stability of parameter values in cumulative prospect theory (Tversky & Kahneman, 1992; see also Kahneman & Tversky, 1979), arguably the most prominent descriptive model of decision making under risk. In decisions under risk, the decision maker evaluates options with different outcomes that occur with some probability (e.g., winning $100 with a probability of 30% and losing $200 with a probability of 70%). To predict a decision, in cumulative prospect theory the outcomes and the probabilities are transformed nonlinearly into subjective values and decision weights, respectively (for details see below). The extent of the transformations, assumed to reflect psychophysical aspects of information processing as well as the relative weighting of positive and negative outcomes, is governed by adjustable parameters. By virtue of its adjustable parameters, cumulative prospect theory can capture differences in risky decision making among people, and several investigations have applied cumulative prospect theory to study individual differences (Booij & van de Kuilen, 2009; Fehr-Duda, Gennaro, & Schubert, 2006; Gonzalez & Wu, 1999; Harbaugh, Krause, & Vesterlund, 2002; Pachur, Hanoch, & Gummerum, 2010). Importantly, not all models of risky choice have adjustable parameters.
As models without adjustable parameters are more transparent (or penetrable; Willis & Pothos, 2012) and have been shown to be less prone to fit unsystematic variability than multi-parameter models (e.g., Dawes, 1979; Gigerenzer & Brighton, 2009; Makridakis, Hibon, & Moser, 1979), some authors have argued for simple models that forgo a transformation of outcomes and probabilities and have no adjustable parameters (e.g., Brandstätter, Gigerenzer, & Hertwig, 2006; Brandstätter, Gigerenzer, & Hertwig, 2008; Payne, Bettman, & Johnson, 1993). One possibly problematic aspect of these simple models is that they have been proposed to apply to a relatively narrow range of conditions only, which limits their scope (and reduces their empirical content, see Glöckner & Betsch, 2011). More importantly, they may lag behind parameterized models in terms of predictive power because of their inability to accommodate individual differences within one model. Do they? The question whether modeling with adjustable parameters has added value in predicting behavior is also critical for another reason. When testing cumulative prospect theory against parameter-free heuristics, some studies have used (various sets of) fixed parameter values across all people to derive its predictions (e.g., Brandstätter et al., 2008; Hilbig, 2008; Johnson, Schulte-Mecklenbeck, & Willemsen, 2008; but see Glöckner & Betsch, 2008). Such an approach might be appropriate if the stability of individual variability in parameter values is rather low. To the extent that parameter variability between people reflects robust differences in decision making, however, using a fixed set of parameters might underestimate the predictive power of cumulative prospect theory. 
In the following, we examine the merits of modeling individual differences in risky choice with the adjustable parameters of cumulative prospect theory by (a) testing the temporal stability of individually fitted parameter values; and (b) comparing the predictive accuracy of several versions of cumulative prospect theory varying in flexibility against each other as well as against models that have no adjustable parameters.2 As the stability of obtained parameter estimates—and by extension a model's ability to predict behavior—might be influenced by the fitting method (cf. Birnbaum, 2008a), we will also compare implementations of CPT that differ in terms of the fit index used to estimate parameters.

Footnote 2: As a cautionary note it should be kept in mind that model evaluations based on an index of fit alone (e.g., correlations) can lead to wrong conclusions, and should ideally be complemented by tests of critical properties of the model in question. For a demonstration and discussion, see Birnbaum (1973, 1974, 2008a).

2. Cumulative prospect theory

2.1. Model

According to cumulative prospect theory (CPT), the outcomes of an option (e.g., a gamble) are evaluated with respect to a reference point, which is often the status quo (i.e., current wealth level) but might also be influenced by frames, goals, and hopes. The subjective value V of an option with outcomes x_1 ≤ … ≤ x_k ≤ 0 ≤ x_{k+1} ≤ … ≤ x_n and corresponding probabilities p_1, …, p_n is given by:

V = \sum_{i=1}^{k} \pi_i^- v(x_i) + \sum_{j=k+1}^{n} \pi_j^+ v(x_j),    (1)

where v is a value function satisfying v(0) = 0; π⁺ and π⁻ are decision weights for gains and losses, respectively, which result from a rank-dependent transformation of the outcomes' probabilities. The decision weights are defined as:

\pi_1^- = w^-(p_1)
\pi_n^+ = w^+(p_n)
\pi_i^- = w^-(p_1 + … + p_i) - w^-(p_1 + … + p_{i-1})    for 1 < i ≤ k
\pi_j^+ = w^+(p_j + … + p_n) - w^+(p_{j+1} + … + p_n)    for k < j < n    (2)

with w⁺ and w⁻ being the probability weighting function for gains and losses, respectively (see below). The weights for probabilities of losses (i.e., i ≤ k) represent the marginal contribution of the respective probability to the total probability of worse outcomes, and the weights for probabilities of gains (i.e., j > k) represent the marginal contribution of the respective probability to better outcomes. Several functional forms of the value and weighting functions have been proposed (see Stott, 2006, for an overview). In the current experiment, we use the value function suggested by Tversky and Kahneman (1992), which is defined as

v(x) = x^α            if x ≥ 0
v(x) = -λ(-x)^β       if x < 0    (3)

α and β capture the concave and convex curvature of the value function in the gain and loss domains, respectively, and are assumed to be smaller than 1, indicating decreasing marginal utility. The parameter λ reflects the relative weighting of losses and gains and is usually found to be larger than 1, indicating loss aversion (but see Lopes & Oden, 1999). The weighting function assumes an inverse S-shaped transformation of the objective probabilities. Here we compare two commonly used forms of the weighting function. First, we implemented CPT with the one-parameter form proposed by Tversky and Kahneman (1992), which is defined as

w^+(p) = p^{γ+} / (p^{γ+} + (1 - p)^{γ+})^{1/γ+}    if x ≥ 0,
w^-(p) = p^{γ-} / (p^{γ-} + (1 - p)^{γ-})^{1/γ-}    if x < 0.    (4)

γ⁺ and γ⁻ reflect the sensitivity to probability differences in the gain and loss domains, respectively, and are assumed to be smaller than 1, yielding overweighting of small probabilities and underweighting of moderate-to-large probabilities.
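To make these functional forms concrete, here is a minimal Python sketch. The function names are ours; the default parameter values are the Tversky and Kahneman (1992) estimates quoted later in the paper and are purely illustrative; `w_gw` implements the two-parameter weighting function discussed next.

```python
def v(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function (Eq. 3): concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def w_tk(p, gamma):
    """One-parameter weighting function (Eq. 4; Tversky & Kahneman, 1992)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def w_gw(p, gamma, delta):
    """Two-parameter weighting function (Eq. 5, introduced below):
    gamma governs curvature, delta governs elevation."""
    return delta * p ** gamma / (delta * p ** gamma + (1 - p) ** gamma)

def cpt_value(outcomes, probs, gamma_gain=0.61, gamma_loss=0.69):
    """Subjective value V (Eqs. 1-2) with rank-dependent decision weights,
    using the one-parameter weighting function."""
    pairs = sorted(zip(outcomes, probs))             # x_1 <= ... <= x_n
    xs = [x for x, _ in pairs]
    ps = [p for _, p in pairs]
    n = len(xs)
    k = sum(1 for x in xs if x < 0)                  # losses occupy indices 0..k-1
    total = 0.0
    for i in range(k):                               # losses: cumulate from the worst outcome
        pi = w_tk(sum(ps[:i + 1]), gamma_loss) - w_tk(sum(ps[:i]), gamma_loss)
        total += pi * v(xs[i])
    for j in range(k, n):                            # gains: cumulate from the best outcome
        pi = w_tk(sum(ps[j:]), gamma_gain) - w_tk(sum(ps[j + 1:]), gamma_gain)
        total += pi * v(xs[j])
    return total
```

Under these illustrative parameters, a mixed gamble such as "win 100 with p = .5, lose 50 with p = .5" receives a negative subjective value, reflecting loss aversion.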
Second, we implemented CPT with a two-parameter weighting function that allows separating the function's curvature from its elevation (e.g., Gonzalez & Wu, 1999):

w^+(p) = δ+ p^{γ+} / (δ+ p^{γ+} + (1 - p)^{γ+})    if x ≥ 0
w^-(p) = δ- p^{γ-} / (δ- p^{γ-} + (1 - p)^{γ-})    if x < 0    (5)

where γ⁺ and γ⁻ (both <1) govern the curvature of the weighting function in the gain and loss domains, respectively, and indicate the sensitivity to probability differences. The parameters δ⁺ and δ⁻ (both >0) govern the elevation of the weighting function for gains and losses, respectively, and can be interpreted as the attractiveness of gambling. Note that compared to the one-parameter weighting function, the two-parameter weighting function thus provides an additional way to quantify risk aversion (which has traditionally been assumed to be captured by the curvature of the value function).

2.2. Individual variability and temporal stability of CPT parameters

By virtue of its specific value and weighting functions, CPT can account for several established violations of expected value theory and expected utility theory, such as the Allais paradox, the fourfold pattern, and the certainty effect (for an overview see Wu, Zhang, & Gonzalez, 2004; see Birnbaum, 2008b, and Birnbaum & Chavez, 1997, for limitations of CPT).3 Although CPT has often been tested in terms of its ability to account for modal choices (e.g., Brandstätter et al., 2006; Kahneman & Tversky, 1979; Tversky & Kahneman, 1992), there are also studies that have fitted CPT to individual participants, finding considerable heterogeneity in parameter values across individuals (e.g., Abdellaoui, Bleichrodt, & Haridon, 2008; Gonzalez & Wu, 1999). Moreover, the constructs captured by CPT's parameters—such as decreasing marginal utility, probability sensitivity, and loss aversion—are often assumed to reflect stable, trait-like variables (Yechiam & Ert, 2011).
Consistent with this view, individual differences in CPT's parameters have been shown to be systematically related to variables such as gender (Booij & van de Kuilen, 2009; Fehr-Duda et al., 2006), age (Harbaugh et al., 2002), and personality (Pachur et al., 2010). The associations of CPT parameter values with individual variables suggest that the values might to a certain degree be stable over time. But are individual differences in risk attitude and the parameters modeling these differences indeed temporally stable? Several studies have demonstrated some degree of stability of individual differences in risky decision making over time (e.g., Levin, Hart, Weller, & Harshman, 2007). For instance, Andersen, Harrison, Lau, and Rutström (2008) found people's risk attitudes (measured across a period of 3–17 months) to be correlated between .35 and .58. Similarly, studies examining the temporal stability of preferences in individual choice problems have found considerable—though far from perfect—consistency (e.g., Abdellaoui et al., 2008; Ballinger & Wilcox, 1997; Camerer, 1989; Hey, 2001). Does this consistency also extend to the constructs underlying risk preferences as measured by CPT's parameters (i.e., decreasing marginal utility, probability sensitivity, and loss aversion)? Note that analyses of parameter stability reach beyond analyses of choice stability, as the former also allow an assessment of the specific factors underlying an observed level of choice stability. For instance, it is possible (at least in principle) that instabilities in choice are primarily due to variation in one single parameter (e.g., loss aversion), whereas others (e.g., probability sensitivity) remain constant. In fact, as a given level of, say, loss aversion can be consistent with several different choice patterns (across a set of gamble problems), it is even conceivable that parameter stability is very high while choice consistency is rather low (e.g., because people are indifferent between two gambles). Interestingly, there are some indications that the temporal stability of the constructs captured by CPT's parameters is not very high. Zeisberger, Vrecko, and Langer (in press) asked participants to provide certainty equivalents for the same set of 24 lotteries in two sessions that were 1 month apart. From these certainty equivalents the authors estimated individual CPT parameters and concluded that for a non-negligible proportion of participants parameters varied strongly over time. In a study in which participants were presented with the identical set of gamble problems repeated in close succession, Birnbaum (1999) found that for more than half of the participants CPT fitted to one set made less than 75% correct predictions for the other set (although individually fitted parameters for the two-parameter weighting function were correlated over time). How much does this apparent instability in parameter values—which might be due to overfitting in the parameter estimation or genuine instability—compromise CPT's ability to predict people's choices?

Footnote 3: Concerning the limitations of CPT, Birnbaum (2008b) showed, for example, that it cannot account for more than a dozen well established phenomena in risky choice (e.g., violations of coalescing). Note, however, that these phenomena have been mainly demonstrated in gambles with more than two outcomes. In problems with two-outcome gambles—as the ones used in the current experiment—CPT and the alternatively suggested Transfer of Attention eXchange (TAX) model (Birnbaum & Chavez, 1997) make essentially the same predictions (Birnbaum & Navarrete, 1998).
If parameter variability mainly reflects unsystematic variance, might it even be justified to ignore individual differences captured by model parameters altogether (as in many models of heuristics)? And how much flexibility in a model is necessary to capture reliable individual differences? To address the latter issue, we also test different implementations of CPT varying in the number of adjustable parameters.

3. Experiment: are adjustable parameters in CPT worth their price?

3.1. General approach

We presented participants with risky choice problems in two sessions, separated by 1 week. CPT's parameters were fitted to each participant at each of the two sessions separately. There were three main goals of our study: first, to examine the stability of people's choices across time; second, to examine the stability of individually fitted CPT parameters across time; third, to evaluate how well different implementations of CPT can predict risky choice, and how their performance compares to the performance of expected utility theory, expected value theory, and several simple heuristics that have no adjustable parameters.

3.2. Implementations of CPT

We modeled our participants' choices with various implementations of CPT, varying both in terms of the number of adjustable parameters and the fit index used to estimate the parameter values. The number of adjustable parameters varied depending on whether the one-parameter or the two-parameter weighting function was used, and whether parameters were estimated separately for gains and losses or not.
As an index to quantify the fit between CPT's predictions and participants' choices we used either (a) the percentage of mismatching choices (a measure often used to compare and fit models of risky choice; Brandstätter et al., 2006; Glöckner & Betsch, 2008; Hau, Pleskac, Kiefer, & Hertwig, 2008; Ungemach, Chater, & Stewart, 2009)—that is, if the subjective value of gamble A, V(A), was higher than the subjective value of gamble B, V(B), CPT predicted that gamble A is chosen, and we determined the parameter values that minimized the percentage of gamble problems where the predictions of CPT diverged from the participant's choices;4 or (b) the deviation measure G² (Sokal & Rohlf, 1994), which expresses the negative log likelihood of the set of predicted choices:

G² = -2 \sum_{i=1}^{N} \ln[f_i(y|θ)].    (6)

N refers to the total number of gamble problems, and f(y|θ) refers to the probability with which CPT, based on a particular set of parameter values θ, predicts an individual's choice y. If gamble A was chosen, then f(y|θ) = p_i(A, B), where p_i(A, B) is the predicted probability that gamble A is chosen over gamble B; if gamble B was chosen, then f(y|θ) = 1 - p_i(A, B). To determine p_i(A, B), we used an exponential version of Luce's choice rule (Luce, 1959; cf. Rieskamp, 2008),

p(A, B) = e^{φV(A)} / (e^{φV(A)} + e^{φV(B)}),    (7)

where φ > 0 is a sensitivity parameter, specifying how sensitively the predicted probability reacts to differences between the gambles' subjective values V(A) and V(B).5

Footnote 4: Minimizing the percentage of mismatching choices can potentially lead to problems in the identification of optimal parameter values since minimizing is not done on a perfectly continuous objective function (i.e., various sets of parameter values can result in the same proportion of matching choices). For various reasons, we are convinced that this is only of minor importance for our data set.
First, note that in the first step of our parameter estimation we scanned the entire parameter space to determine the best starting points for the subsequent parameter optimization—making it unlikely that we did not identify optimal parameter values. Second, we checked the 20 best-fitting parameter sets for all participants and found that for those rare cases where there was more than one set of parameter values with the same best fit, the parameter values differed merely on the fourth decimal place. Finally, we also conducted a parameter estimation using a complete grid search of the parameter space (with the parameter values in steps of 0.05). As it turned out, the obtained values were similar to the ones we obtained using the simplex method (which we used for the analyses reported below).

Footnote 5: Note that evaluating CPT based on a probabilistic fit index (such as G²) implicitly assumes a specific error theory (i.e., how errors of the model are distributed), whereas using a deterministic fit index (such as the percentage of mismatching choices) does not. Although we find that different fit indices lead to very similar estimates of CPT's parameters (see below), the choice between a probabilistic and a deterministic fit index can still have strong implications. When using G² to evaluate a deterministic version of CPT implemented by setting the sensitivity parameter φ in Equation 7 to 100 (cf. Rieskamp, 2008), this model was unable to fit the data better than chance level (based on G²), replicating the findings by Rieskamp (2008). Nevertheless, fitting CPT based on a deterministic fit index still led, as reported below, to a high percentage of correct predictions. In other words, one's conclusion about the performance of a deterministic model may critically depend on whether G² or the percentage of correct predictions is used as the performance measure.
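The two fit indices might be sketched as follows. The function names are ours; the choice probability is computed from the value difference, which is algebraically identical to Equation (7) but avoids numerical overflow for large φ·V.

```python
import math

def p_choose_a(v_a, v_b, phi=0.2):
    """Exponential Luce choice rule (Eq. 7): probability that gamble A is chosen.
    Written in terms of the value difference for numerical stability."""
    return 1.0 / (1.0 + math.exp(phi * (v_b - v_a)))

def g_squared(chose_a, probs_a):
    """Deviation measure G^2 (Eq. 6): -2 times the summed log-likelihood of the
    observed choices under the model's predicted choice probabilities."""
    return -2.0 * sum(math.log(p if c else 1.0 - p)
                      for c, p in zip(chose_a, probs_a))

def pct_mismatching(chose_a, values_a, values_b):
    """Deterministic fit index: percentage of choices that diverge from the
    prediction 'choose A iff V(A) > V(B)'."""
    miss = sum((va > vb) != c for c, va, vb in zip(chose_a, values_a, values_b))
    return 100.0 * miss / len(chose_a)
```

Note how G² rewards confident correct predictions (p close to 1) and punishes confident wrong ones, whereas the percentage of mismatching choices treats p = .51 and p = .99 alike.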
Table 1
The eight implementations of cumulative prospect theory (CPT) tested in the experiment.

Implementation of CPT | Outcome sensitivity | Probability sensitivity | Elevation | Loss aversion | Response sensitivity | No. of adjustable parameters

Fit index: % mismatching choices
CPT(3,pc)    | α | γ+ = γ−  | –        | λ | – | 3
CPT(4,pc)    | α | γ+, γ−   | –        | λ | – | 4
CPTGW(4,pc)  | α | γ+ = γ−  | δ+ = δ−  | λ | – | 4
CPTGW(6,pc)  | α | γ+, γ−   | δ+, δ−   | λ | – | 6

Fit index: G²
CPT(4,g2)    | α | γ+ = γ−  | –        | λ | φ | 4
CPT(5,g2)    | α | γ+, γ−   | –        | λ | φ | 5
CPTGW(5,g2)  | α | γ+ = γ−  | δ+ = δ−  | λ | φ | 5
CPTGW(7,g2)  | α | γ+, γ−   | δ+, δ−   | λ | φ | 7

Note. The numbers in the brackets indicate the number of adjustable parameters in the model and whether the criterion at fitting was G² (g2) or the percentage of mismatching choices (pc).

Depending on the fit index used, we estimated the parameters such that either the percentage of mismatching choices or G² was minimized. Note that compared to implementations fitted to minimize the percentage of mismatching choices, implementations fitted to minimize G² can be more flexible in fitting the data because they have one additional parameter. In addition, G² is a more sensitive measure of fit as it treats an observed choice of gamble A differently depending on whether the predicted probability was, say, p(A, B) = .51 or p(A, B) = .99, whereas the percentage of mismatching choices does not treat these cases differently. CPT with the one-parameter weighting function and fitted to minimize the percentage of mismatching choices has a total of five parameters: two for the sensitivity to differences in outcomes (α and β, for the gain and loss domain, respectively), two for the sensitivity to differences in probability (γ+ and γ−, for the gain and loss domain, respectively), and a loss aversion parameter (λ).
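The first estimation step described in footnote 4 (scanning a coarse grid of the parameter space and keeping the best point as the starting value for the subsequent simplex optimization) might look like this for a hypothetical two-parameter model; the `value_fn` signature is our simplification.

```python
from itertools import product

def grid_search(problems, chose_a, value_fn, step=0.05):
    """Coarse grid search over (alpha, gamma), minimizing the percentage of
    mismatching choices (cf. footnote 4; parameter values in steps of 0.05).
    problems: list of (gamble_a, gamble_b); chose_a: list of bools;
    value_fn(gamble, alpha, gamma): hypothetical model value of one gamble."""
    grid = [round(i * step, 2) for i in range(1, int(round(1 / step)) + 1)]
    best_params, best_miss = None, float("inf")
    for alpha, gamma in product(grid, grid):
        miss = sum((value_fn(a, alpha, gamma) > value_fn(b, alpha, gamma)) != c
                   for (a, b), c in zip(problems, chose_a))
        if miss < best_miss:
            best_params, best_miss = (alpha, gamma), miss
    return best_params, 100.0 * best_miss / len(problems)
```

In the paper, the best grid points then seed a simplex (Nelder-Mead) optimization; only the grid stage is sketched here.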
In CPT with the two-parameter weighting function, there are two parameters for the probability sensitivity (γ+ and γ− for the gain and loss domain, respectively) and two for the elevation of the weighting function (δ+ and δ− for the gain and loss domain, respectively). When fitted to minimize G², all implementations of CPT have one additional parameter (φ). Because it has been shown that estimating different exponents of the value function for gains and losses (i.e., α and β) can lead to unstable estimates of λ (Nilsson, Rieskamp, & Wagenmakers, 2011), we estimated only one common exponent α for both domains. To examine how the flexibility of CPT impacts its ability to account for people's choices, we implemented additional versions in which the parameters for the weighting function were fixed to be the same for the gain and loss domains. Overall, we thus tested eight different implementations of CPT: when fitted to minimize the percentage of mismatching choices, the implementations with the one-parameter weighting function had either three (α, γ, and λ) or four (α, γ+, γ−, and λ) parameters; the implementations with the two-parameter weighting function had either four (α, γ, δ, and λ) or six parameters (α, γ+, γ−, δ+, δ−, and λ). When fitted to minimize G², the implementations with the one-parameter weighting function had either four (α, γ, λ, and φ) or five (α, γ+, γ−, λ, and φ) parameters; the implementations with the two-parameter weighting function had either five (α, γ, δ, λ, and φ) or seven parameters (α, γ+, γ−, δ+, δ−, λ, and φ). Table 1 provides an overview of the eight implementations of CPT that we tested.

3.3. Competing models

To evaluate the merits of CPT's multi-parameter framework for predicting choice, we tested its performance against the performance of expected utility theory (EU), according to which the subjective value of an option with n outcomes and corresponding probabilities is determined as

V = \sum_{i=1}^{n} p_i x_i^α    for x ≥ 0, and
V = \sum_{i=1}^{n} p_i (-1)(-x_i)^α    for x < 0.    (8)

In other words, EU assumes that the outcomes x are transformed using a power function with exponent α, whereas the probabilities p are not transformed. Further, we examined how much predictive power is gained by fitting CPT to each individual participant by comparing its performance with the performance of several benchmarks that have no adjustable parameters (and thus ignore individual differences). First, we determined for each implementation of CPT the performance when using one common parameter set across all participants, namely the median of the individually fitted parameter values. Second, we determined the performance of CPT using the parameter set reported by Tversky and Kahneman (1992), which has been used in previous tests of CPT (e.g., Brandstätter et al., 2006; Glöckner & Betsch, 2008). Specifically, the parameter values were α = 0.88, β = 0.88, λ = 2.25, γ+ = 0.61, and γ− = 0.69. Finally, we tested expected value theory (EV), which forgoes transforming outcomes and probabilities, and 11 heuristics that have been proposed as models of risky choice. The heuristics are described in Table 2 (cf. Brandstätter et al., 2006; Payne et al., 1993).

Table 2
Heuristics for risky choice.

Priority heuristic: Go through reasons in the order of: minimum gain, probability of minimum gain, and maximum gain. Stop examination if the minimum gains differ by 1/10 (or more) of the maximum gain; otherwise, stop examination if probabilities differ by 1/10 (or more) of the probability scale. Choose the gamble with the more attractive gain (probability). For loss gambles, the heuristic remains the same except that "gains" are replaced by "losses". For mixed gambles, the heuristic remains the same except that "gains" are replaced by "outcomes".

Equiprobable: Calculate the arithmetic mean of all outcomes for each gamble. Choose the gamble with the highest mean.

Equal-weight: Calculate the sum of all outcomes for each gamble. Choose the gamble with the highest sum.

Better than average: Calculate the grand average of all outcomes from all gambles. For each gamble, count the number of outcomes equal to or above the grand average. Then choose the gamble with the highest number of such outcomes.

Tallying: Give a tally mark to the gamble with (a) the higher minimum gain, (b) the higher maximum gain, (c) the lower probability of the minimum gain, and (d) the higher probability of the maximum gain. For losses, replace "gain" with "loss" and "higher" with "lower" (and vice versa). Choose the gamble with the highest number of tally marks.

Probable: Categorize probabilities as probable (i.e., p ≥ .50 for a two-outcome gamble, p ≥ .33 for a three-outcome gamble, etc.) or improbable. Cancel improbable outcomes. Then calculate the arithmetic mean of the probable outcomes for each gamble. Finally, choose the gamble with the highest mean.

Minimax: Choose the gamble with the highest minimum outcome.

Maximax: Choose the gamble with the highest outcome.

Lexicographic: Determine the most likely outcome of each gamble and choose the gamble with the better outcome. If both outcomes are equal, determine the second most likely outcome of each gamble, and choose the gamble with the better (second most likely) outcome. Proceed until a decision is reached.

Least likely: Identify each gamble's worst outcome. Then choose the gamble with the lowest probability of the worst outcome.

Most likely: Identify each gamble's most likely outcome. Then choose the gamble with the highest, most likely outcome.

Note. Heuristics are from Brandstätter et al. (2006) and Payne et al. (1993).

3.4. Method

3.4.1. Participants

Sixty-six participants, mainly students from the University of Bonn, took part in the experiment (25 male, mean age 24.7 years, SD = 5.2), which was conducted at the MPI Decision Lab. The experiment consisted of two sessions, each of which lasted for about 30 min. The second session was run 1 week after the first. In addition to a flat fee of 22 € (US $30.80), participants received a performance-contingent payment that ranged from −9.90 € to 10.00 € (US −$13.86 to $14). Two subjects did not show up for the second session, leaving us with complete data sets for 64 participants.

3.4.2. Material and design

We used sets of two-outcome gamble problems that had been used in previous studies, which were rescaled to fit the same range of outcomes (i.e., from −1000 € to 1200 €). The first set (R-problems) consisted of 180 randomly generated problems investigated by Rieskamp (2008; scaled by factor 10): 60 pure gain, 60 pure loss, and 60 mixed gamble problems. The second set (GB-problems) consisted of 40 problems that had been constructed to differentiate between the priority heuristic and CPT (Glöckner & Betsch, 2008; scaled by varying factors and, if necessary, adapted to maintain the basic properties of the original problems). In both sets the expected values of the gambles within a problem were similar (i.e., the ratio of expected values was smaller than 1:2), thus avoiding obvious choices.
The third set (HLG-problems) consisted of 10 problems designed by Holt and Laury (2002; low-payoff version scaled by factor 200) to measure risk aversion, with each problem involving a choice between a gamble with two medium outcomes and a gamble with a high and a low outcome; and eight problems adapted from tasks designed by Gächter, Johnson, and Herrmann (2007) to measure loss aversion, with each problem involving a choice between a 50:50 chance of either losing 100 € or receiving a gain that varied across problems (i.e., 50, 100, 150, 200, 220, 240, 300, or 400 €) and a gamble that pays nothing with certainty. We included the specifically designed problems of the HLG set to improve parameter estimation of cumulative prospect theory’s key constructs.

Fig. 1. Format used in the experiment to present the gamble problems. [The figure shows two gambles side by side; in the example displayed, Gamble 1 offers A: 50% 100 € and B: 50% −50 €, and Gamble 2 offers A: 65% −40 € and B: 35% 340 €.]

From the above sets, we created two sets of 138 problems each. First, all of the 18 HLG-problems and half of the GB-problems (randomly selected) were included in both sets to allow examining participants’ choice consistency. Second, using the odd/even method, the R-problems and the remaining GB-problems were distributed equally across the two sets. One set was presented at the first session, the other at the second session of the experiment. The order in which the two sets were presented was counterbalanced across participants. At the end of each session, one gamble problem was randomly selected and the participant received an additional bonus according to her choice in the respective problem with an exchange rate of 100:1 €.

3.4.3. Procedure
In each session, participants were introduced to the task and the incentive scheme and were then presented with 138 gamble problems, as shown in Fig. 1.6 The problems were presented in an individually randomized order.
For each problem, participants were instructed to indicate their choice.

6 Although there is currently little evidence that the way gamble problems are presented impacts choice behavior (see Glöckner & Herbold, 2011, Figure 3), we cannot rule out that presenting the pieces of information for a gamble within a box might have increased the compensatory processing assumed by utility models such as CPT.

Please cite this article in press as: Glöckner, A., & Pachur, T. Cognitive models of risky choice: Parameter stability and predictive accuracy of prospect theory. Cognition (2012), doi:10.1016/j.cognition.2011.12.002

Table 3
Performance, expressed as the percentage of matching choices, of the different implementations of CPT in fitting (i.e., when using the best-fitting set of parameters) as well as in backward (i.e., when predicting choices at the first session, t1, based on parameters fitted at the second session, t2) and forward prediction (i.e., when predicting choices at t2 based on parameters fitted at t1). All values are means with SEs in parentheses.
Fit index:      % Mismatching choices                              G2
Parameters:     3           4           4GW         6GW            4           5           5GW         7GW

Fitting
  t1            80.3 (0.6)  81.0 (0.6)  81.1 (0.5)  82.6 (0.5)     76.1 (0.7)  76.9 (0.7)  76.0 (0.7)  78.4 (0.6)
  t2            80.8 (0.6)  81.7 (0.6)  81.7 (0.6)  83.4 (0.6)     76.0 (0.8)  77.5 (0.7)  76.8 (0.8)  78.8 (0.7)
  Total         80.6 (0.4)  81.4 (0.5)  81.4 (0.4)  83.0 (0.4)     76.0 (0.6)  77.2 (0.5)  76.4 (0.6)  78.6 (0.5)
Prediction
  Backward      75.4 (0.7)  75.6 (0.7)  74.2 (0.8)  75.2 (0.7)     73.4 (0.8)  74.5 (0.7)  74.2 (0.8)  75.5 (0.7)
  Forward       75.8 (0.6)  75.2 (0.7)  75.1 (0.7)  75.1 (0.7)     74.9 (0.7)  75.9 (0.7)  75.1 (0.7)  76.2 (0.7)
  Total         75.6 (0.6)  75.4 (0.6)  74.6 (0.7)  75.1 (0.6)     74.1 (0.7)  75.2 (0.6)  74.6 (0.7)  75.8 (0.6)

Note: ‘‘GW’’ in the superscript of the number of free parameters indicates an implementation of CPT with the two-parameter weighting function investigated by Gonzalez and Wu (1999).

Table 4
Median estimates for the parameters of the different implementations of cumulative prospect theory as well as the median G2 for the implementations fitted based on G2. In the implementations 4GW and 5GW, for the parameters γ and δ only one common value was estimated for the gain and the loss domains.
Fit index:      % Mismatching choices            G2
Parameters:     3      4      4GW    6GW         4      5      5GW    7GW
α (t1)          0.64   0.67   0.77   0.68        0.74   0.74   0.75   0.74
α (t2)          0.75   0.79   0.81   0.74        0.71   0.70   0.72   0.74
γ(+) (t1)       0.76   0.61   0.68   0.68        0.73   0.74   0.71   0.67
γ(+) (t2)       0.69   0.65   0.65   0.61        0.70   0.58   0.68   0.63
λ (t1)          1.66   1.55   1.86   1.61        1.35   1.27   1.52   1.05
λ (t2)          1.80   1.55   1.99   1.72        1.31   1.19   1.46   1.11
φ (t1)          –      –      –      –           0.08   0.06   0.09   0.10
φ (t2)          –      –      –      –           0.11   0.06   0.13   0.13
G2 (t1)         –      –      –      –           155.8  136.8  135.2  123.0
G2 (t2)         –      –      –      –           150.4  132.7  131.1  118.7
[The rows for γ(−), δ(+), and δ(−) could not be reliably recovered from the source.]

Note: ‘‘GW’’ in the superscript of the number of free parameters indicates an implementation of CPT with the two-parameter weighting function investigated by Gonzalez and Wu (1999).

Participants indicated their choice by pressing the respective key on the left or the right side of the keyboard.

3.5. Results

3.5.1. Consistency of preferences
We first analyzed the consistency of individual preferences by comparing, for each participant, the choices in the 38 problems that were presented in both sessions. On average, people made the same choice at the two sessions in 79% (robust SE = 0.01) of the cases.

3.5.2. Stability of CPT parameters
Next we fitted the different implementations of CPT to each participant’s choices, using either the percentage of mismatching choices or G2 as fit index. To reflect the main assumptions of CPT, the parameter values were restricted as follows: 0 < α ≤ 1; 0 < λ ≤ 10; 0 < γ(+) ≤ 1; 0 < γ(−) ≤ 1; 0 < δ(+) ≤ 4; 0 < δ(−) ≤ 4. For the implementations of CPT fitted to minimize G2, the sensitivity parameter was restricted as 0 < φ ≤ 10 (cf. Rieskamp, 2008).
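To make the fitting machinery concrete, here is a minimal Python sketch (ours, not the authors’ code) of a simple CPT implementation for two-outcome gambles: a common curvature parameter α and weighting parameter γ across domains, loss aversion λ, and, for the G2-based implementations, a sensitivity parameter φ in a logistic choice rule that stands in for the paper’s Eq. (7). The defaults are commonly reported estimates (e.g., λ = 2.25 from Tversky & Kahneman, 1992, as cited above).

```python
import math

def v(x, alpha=0.88, lam=2.25):
    # Value function: concave for gains, convex and (by lambda) steeper for losses.
    return x ** alpha if x >= 0 else -lam * ((-x) ** alpha)

def w(p, gamma=0.61):
    # One-parameter probability weighting function (Tversky & Kahneman, 1992).
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def cpt_value(gamble, alpha=0.88, lam=2.25, gamma=0.61):
    # CPT valuation of a two-outcome gamble [(x1, p1), (x2, p2)], with one
    # common weighting function for gains and losses (the simple implementation).
    (xa, pa), (xb, pb) = gamble
    if xa >= 0 and xb >= 0:  # pure gain: rank-dependent weight on the larger outcome
        (x1, p1), (x2, p2) = sorted(gamble, reverse=True)
        return w(p1, gamma) * v(x1, alpha, lam) + (1 - w(p1, gamma)) * v(x2, alpha, lam)
    if xa <= 0 and xb <= 0:  # pure loss: weight on the more extreme loss
        (x1, p1), (x2, p2) = sorted(gamble)
        return w(p1, gamma) * v(x1, alpha, lam) + (1 - w(p1, gamma)) * v(x2, alpha, lam)
    # mixed gamble: each outcome is weighted within its own domain
    return w(pa, gamma) * v(xa, alpha, lam) + w(pb, gamma) * v(xb, alpha, lam)

def p_choose_a(gamble_a, gamble_b, phi=0.1, **params):
    # Probabilistic choice rule: logistic in the valuation difference,
    # with sensitivity phi (a stand-in for the paper's Eq. (7)).
    d = cpt_value(gamble_a, **params) - cpt_value(gamble_b, **params)
    return 1.0 / (1.0 + math.exp(-phi * d))

def g_squared(problems, chose_a, phi=0.1, **params):
    # G2 = -2 * log-likelihood of the observed choices.
    ll = 0.0
    for (ga, gb), a in zip(problems, chose_a):
        p = p_choose_a(ga, gb, phi, **params)
        ll += math.log(p if a else 1.0 - p)
    return -2.0 * ll
```

Fitting on the percentage of mismatching choices instead simply drops φ and scores deterministic choices (the option with the higher `cpt_value` is predicted).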
To reduce the risk of being stuck in local minima, for the parameter estimation we first conducted a grid search to identify the set of parameter values that minimized the respective deviation measure (i.e., the percentage of mismatching choices and G2); depending on the implementation of CPT, we partitioned the parameter space such that there were at most 80,000 parameter combinations, with a similar partitioning across the different parameters. Then the 20 best-fitting combinations of grid values emerging from this grid search were used as starting points for subsequent optimization using the simplex method (Nelder & Mead, 1965), as implemented in MATLAB.

Table 5
Correlations of individually estimated parameter values across time.

Fit index:      % Mismatching choices            G2
Parameters:     3      4      4GW    6GW         4      5      5GW    7GW
α               0.49   0.33   0.22   0.36        0.50   0.48   0.52   0.59
γ(+)            0.50   0.24   0.41   0.18        0.31   0.54   0.56   0.44
γ(−)            –      0.06   –      0.28        –      0.40   –      0.28
δ(+)            –      –      0.13   0.32        –      –      0.17   0.23
δ(−)            –      –      –      0.03        –      –      –      0.25
λ               0.61   0.50   0.50   0.42        0.27   0.30   0.34   0.15
φ               –      –      –      –           0.22   0.33   0.23   0.32

Notes. N = 64. Significant correlations (p < .05) are in bold. ‘‘GW’’ in the superscript of the number of free parameters indicates an implementation of CPT with the two-parameter weighting function investigated by Gonzalez and Wu (1999).

Table 3 shows the performance (expressed as percentage of matching choices) of the different implementations of CPT when they were fitted to the data.7 As can be seen, performance tends to increase with the number of adjustable parameters (irrespective of fit index), so in fitting higher complexity paid off somewhat.
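The two-stage estimation just described (coarse grid search, then refinement from the 20 best grid points) can be sketched as follows. This is an illustration rather than the authors’ MATLAB code: a crude stepwise search stands in for the Nelder–Mead simplex, and `predict` stands for any function mapping a gamble pair and a parameter vector to a predicted choice.

```python
import itertools

def mismatch_rate(params, problems, choices, predict):
    # Proportion of observed choices the model fails to reproduce.
    preds = [predict(ga, gb, params) for ga, gb in problems]
    return sum(p != c for p, c in zip(preds, choices)) / len(choices)

def grid_then_refine(problems, choices, predict, grids, n_best=20):
    # Stage 1: evaluate every grid combination and keep the n_best of them.
    scored = sorted(
        (mismatch_rate(p, problems, choices, predict), p)
        for p in itertools.product(*grids)
    )
    candidates = [p for _, p in scored[:n_best]]
    # Stage 2: local refinement around each candidate (a stand-in for the
    # Nelder-Mead simplex used in the paper).
    best_fit, best_p = scored[0]
    for p in candidates:
        for step in (0.05, 0.01):
            for i in range(len(p)):
                for delta in (-step, step):
                    q = list(p); q[i] += delta; q = tuple(q)
                    f = mismatch_rate(q, problems, choices, predict)
                    if f < best_fit:
                        best_fit, best_p = f, q
    return best_p, best_fit
```

The same two-stage scheme applies when G2 rather than the mismatch rate is the deviation measure; only the scoring function changes.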
Table 4 shows the median best-fitting parameter values, separately for the 8 different implementations of CPT. Compared to previous studies (for an overview, see Fox & Poldrack, 2008), we obtained very similar patterns. Specifically, γ(−) was always higher than γ(+), γ was usually smaller than δ, and δ(+) was always higher than δ(−). The major difference occurred with regard to the loss aversion parameter λ, for which we found a relatively low value (Tversky & Kahneman, 1992, for instance, report a value of 2.25). The different implementations of CPT yielded rather similar values for the parameters, with the largest variability occurring on λ (which varied between 1.05 and 1.99).

7 To derive the predictions of the CPT implementations with parameters fitted based on G2, we assumed that the option with the higher predicted probability was chosen.

How stable are the individually fitted parameter values across time? To address this question, we correlated the obtained individual values between the two sessions. Table 5 shows the results. First, the correlations showed a large effect size (according to the classification by Cohen, 1988) for almost all parameters, and this held for most implementations of CPT. Nevertheless, if one assumes that the constructs captured by CPT’s parameters reflect trait-like variables (Yechiam & Ert, 2011), one might argue that the magnitude of correlation should have been higher. Second, the correlations tended to be higher for simpler versions of CPT. Paired-sample t-tests showed that for all implementations with a one-parameter weighting function the parameter values did not differ between the two sessions (all t < 1.89, all p > .05). For the implementations with a two-parameter weighting function, by contrast, parameter stability was lower. Specifically, with the exception of the implementation with four parameters estimated to minimize the percentage of mismatching choices (i.e., CPTGW(4,PC)), for all implementations at least
one parameter differed significantly between the sessions. The most notable case was the implementation with five parameters fitted to minimize G2 (i.e., CPTGW(5,G2)), for which δ increased while both λ and γ decreased from the first to the second session (all t > 2.25, p < .05).

3.5.3. Predicting risky choice
Next we asked: how well do the different implementations of CPT predict (rather than fit) people’s choices? For that purpose, we determined the percentage of cases in which CPT, using the parameters fitted to the individual choices at the first session, correctly predicted the choices at the second session (forward prediction)—and vice versa (backward prediction). As Table 3 shows, while more flexible implementations (i.e., those with a higher number of adjustable parameters) had performed better in fitting, this advantage disappeared in prediction, and all implementations performed rather similarly. In fact, the highest average percentages of correct predictions were achieved by the simplest (three parameters fitted to minimize the percentage of mismatching choices: 75.6%) and the most complex (seven parameters fitted to minimize G2: 75.8%) implementations. In other words, the increased flexibility of CPT afforded by fitting some parameters separately for gains and losses and by having a two-parameter weighting function does not seem to yield higher predictive power.

Next, we compared the performance of the implementations of CPT with individualized sets of parameter values to the performance of several benchmarks. First, we determined the predictive accuracy of each implementation when using the set of median parameter values (as reported in Table 4) for all participants. Fig. 2 shows the results, with performance averaged first across backward and forward prediction and then across participants.
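The forward and backward prediction scheme can be written down schematically; in the sketch below (ours, not the paper’s code), `fit` and `predict` are placeholders for a model’s estimation routine and choice rule:

```python
def prediction_accuracy(params, problems, choices, predict):
    # Share of observed choices that the model, with fixed parameters, reproduces.
    hits = sum(predict(ga, gb, params) == c
               for (ga, gb), c in zip(problems, choices))
    return hits / len(choices)

def cross_session_accuracy(fit, predict, session1, session2):
    # Forward prediction: parameters fitted at t1 predict choices at t2;
    # backward prediction: parameters fitted at t2 predict choices at t1.
    params_t1 = fit(*session1)   # each session is a (problems, choices) pair
    params_t2 = fit(*session2)
    forward = prediction_accuracy(params_t1, *session2, predict)
    backward = prediction_accuracy(params_t2, *session1, predict)
    return forward, backward
```

Averaging `forward` and `backward` per participant, and then across participants, yields the prediction entries reported in Table 3 and Fig. 2.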
As the gray bars show, the performance based on the set of median parameter values was worse than the performance based on individual parameter sets for each participant (t > 2.00, p < .05; for all implementations except CPT(4,G2) and CPTGW(5,G2)).8 Nevertheless, CPT based on the median parameter values of our current sample of participants outperformed CPT based on the set of parameter values reported by Tversky and Kahneman (1992).

8 An alternative approach to test CPT’s predictive power based on parameter values for the aggregate is to use the set of parameters fitted to the aggregate choices. This approach yielded very similar parameter values as the median values and a similar, though slightly worse, performance in prediction.

[Fig. 2 is a bar chart of the percentage of correct predictions per model, contrasting individual parameter values (white bars) with aggregate median parameter values (gray bars); accuracies ranged from about 75.8% for the best CPT implementations down to about 49.2%.] Fig. 2. Average performance of the different models in predicting participants’ individual choices. CPT = cumulative prospect theory with a one-parameter weighting function; CPTGW = CPT with Gonzalez and Wu’s two-parameter weighting function; EU = expected utility theory. The numbers in the brackets indicate the number of adjustable parameters in the model and whether the criterion at fitting was G2 (g2) or percentage of mismatching choices (pc); CPT(TK) uses the median parameter values (for each participant) reported in Tversky and Kahneman (1992); EV = expected value theory; PH = priority heuristic; EQUI = equiprobable; EQW = equal weight; BTA = better than average; TALLY = tallying; PROB = probable; MINI = minimax; MAXI = maximax; LEX = lexicographic; LL = least likely; ML = most likely. Error bars indicate the standard errors of the mean.

How does CPT fare relative to EU, which also assumes decreasing marginal utility but forgoes transforming probabilities? To test EU, we fitted it both on the percentage of mismatching choices and on G2. Specifically, we estimated the exponent of the utility function (Eq. (8); with one common parameter for gains and losses); when using G2 as fit index, we additionally estimated a sensitivity parameter for the choice rule (similarly as in Eq. (7)).9 The fitted parameter values were then used to determine the performance in backward and forward prediction. Fig. 2 shows that EU’s performance (averaged first across backward and forward prediction and then across participants) was clearly worse than the performance of the worst CPT implementation (i.e., CPT with parameters from Tversky & Kahneman, 1992; for both fitting indices: t > 3.6, p < .001).

Above we have seen that ignoring individual differences in CPT’s parameter values leads to worse performance in prediction. Moreover, the inferior performance of EU, which in contrast to CPT does not transform probabilities and does not assume loss aversion, suggests that taking these factors into account improves performance.
How do EV and the heuristics (Table 2), neither of which considers individual differences or assumes a transformation of outcomes and probabilities, perform? Fig. 2 shows that both EV (all t > 7.84, p < .001) and all heuristics (all t > 14.23, p < .001) are clearly outperformed by all implementations of CPT—including the one with the fixed parameter set from Tversky and Kahneman (1992). In addition, EV outperformed all heuristics (all t > 3.77, p < .001). Of the heuristics, the priority heuristic and the minimax heuristic performed best.

9 Fitting the exponent in expected utility theory’s utility function to minimize the percentage of mismatching choices yielded a median value of 0.76. Fitting the exponent to minimize G2 yielded a median value of 0.68 for the exponent of the utility function and 0.12 for the sensitivity parameter.

4. Discussion

We examined the degree to which the multi-parameter framework of CPT, arguably the most widely used approach to model people’s decisions under risk, helps predict individual choice relative to models that have no adjustable parameters. Specifically, are the individual differences captured by CPT’s parameters sufficiently stable over time that estimating them improves predictive power? And does parameter stability (and, by implication, predictive power) depend on the number of adjustable parameters with which CPT is implemented?

There are three key results. First, the stability of individuals’ preferences over a period of 1 week was relatively high (79%). Second, individual differences as measured by CPT’s parameters—decreasing marginal utility, loss aversion, and probability sensitivity—were correlated over time, indicating that the parameter values reflect relatively stable individual differences among people. Nevertheless, there was some indication that an implementation using a two-parameter weighting function led to less stable parameter values.
Third and most importantly, with a predictive accuracy of more than 75%, implementations of CPT allowing for heterogeneity in parameter values among people consistently outperformed implementations that use a common parameter set for all participants, as well as expected value theory and various heuristics—all of which cannot accommodate individual differences within one model. Finally, all implementations of CPT outperformed expected utility theory, which, though also able to capture individual differences with parameters, forgoes a transformation of probabilities and (in our implementation) does not assume loss aversion.

Our results also provide insights as to which aspects of CPT’s parametric ‘‘menagerie’’ actually add value in predicting choice. Researchers intending to use CPT to model choice have to make several decisions when implementing the model. Should a one-parameter or a two-parameter weighting function be used? Should parameters be estimated separately for gains and losses, or should one common set be estimated for both domains? Based on our findings, it appears that whatever reliable individual differences between decision makers exist, they can (at least in a relatively homogeneous population such as the students investigated in our experiment) be captured by a rather simple implementation of CPT with three parameters: a common parameter for the utility curvature across gains and losses, a loss aversion parameter, and a one-parameter weighting function, fitted to minimize the percentage of mismatching choices.
Adding further complexity to CPT does not yield higher predictive power. First, as separate weighting-function parameters for gains and losses do not lead to better prediction, differences between gains and losses seem to be sufficiently represented by the loss aversion parameter λ. Second, adding an elevation parameter to the weighting function—and thus being able to capture risk aversion not only by the curvature of the value function but also by the weighting function—does not lead to better performance in prediction (notwithstanding the psychological appropriateness of differentiating between sensitivity and attractiveness; Gonzalez & Wu, 1999). We cannot exclude that gamble problems can be constructed in which a two-parameter weighting function might prove superior in prediction, but for the broad set of gamble problems used in our experiment the additional sophistication provided by a two-parameter function seems to be completely offset by a greater amount of overfitting. The limited value of greater complexity also extends to how parameters are estimated: although using G2 as fit index allows, in principle, a more sensitive tailoring of the model to the data, it does not lead to a higher percentage of correctly predicted choices than using the percentage of mismatching choices (at fitting) as fit index.

Concerning the stability of participants’ individual parameter values, the correlations between the two sessions showed a large effect size; nevertheless, the correlations were far from perfect (Table 5). Note that, in addition to overfitting and genuine instability in the underlying constructs, a further possible contributor to the observed level of (in)stability is that not all choice problems were identical at the two sessions (although they were drawn from the same sets of problems, and currently little is known about how characteristics of gamble problems translate into parameter values; e.g., Holt & Laury, 2002).
Future studies could thus examine parameter stability when using the exact same set of choice problems across all sessions. Our findings have important implications for current debates on how to model risky choice. In particular, outcome tests comparing heuristics and variants of expected-utility models, including CPT, have evaluated the predictive power of cumulative prospect theory using one common parameter set (e.g., Brandstätter et al., 2006). Based on our findings, such an approach is likely to underestimate the performance of cumulative prospect theory. In line with previous findings (Glöckner & Betsch, 2008), however, we find that even with a common parameter set CPT outperforms heuristics (e.g., priority heuristic; Brandstätter et al., 2006) as well as expected value theory in predicting individual choice. We showed that people reliably differ in how they make decisions (even when given the same set of choice problems). How to accommodate these individual differences in models of risky choice? Our results suggest that CPT’s multi-parameter framework offers one viable way. Alternatively, individual differences in risky choice could be modeled assuming that different people rely on different heuristics (e.g., Brandstätter et al., 2008; Payne et al., 1993). An additional analysis focusing on the heuristics only, however, showed that this approach is rather limited (at least based on the set of heuristics investigated here). First, only 48% of our participants were best described by the same heuristic at both sessions; second, using the best-performing heuristic at the first session to predict choices at the second session (and vice versa) yielded, on average, only 62.4% correct predictions, which is far below the predictive accuracy achieved with CPT. 
It should be emphasized that although in our analysis CPT emerges as the superior model in predicting the outcome of people’s choices, this does not mean that it also provides a good description of the information processing steps underlying people’s choices. In fact, in light of people’s limited capacity for carrying out complex processing deliberately (Simon, 1956), it seems unlikely that they explicitly calculated weighted sums, as described in CPT’s algebraic formulation. In addition, process tests clearly speak against this explanation (Cokely & Kelley, 2009; Glöckner & Herbold, 2011; Pachur, Hertwig, Gigerenzer, & Brandstätter, submitted for publication; Payne & Braunstein, 1978). What are alternative mechanisms? One possibility is that choices result from automatic processes (see Glöckner & Witteman, 2010, for a recent review) that can lead to compensatory information integration—as modeled by evidence accumulation (Busemeyer & Townsend, 1993; Rieskamp, 2008) or coherence-maximizing parallel constraint satisfaction mechanisms (Glöckner & Herbold, 2011; Holyoak & Simon, 1999). To be sure, by having multiple adjustable parameters, current models of automatic processes face the potential problem of overfitting (Roberts & Pashler, 2000); in addition, they have not yet been submitted to rigorous quantitative tests in risky choice (but see Rieskamp, 2008; Scheibehenne, Rieskamp, & González-Vallejo, 2009). Nevertheless, qualitative tests have shown that models of automatic processes can account well for choices, decision times, and patterns in information acquisition (Glöckner & Herbold, 2011).
Alternatively, people might rely on heuristic principles for information search and integration (e.g., Brandstätter et al., 2006; Russo & Dosher, 1983; Tversky, 1972). Despite the negative results for existing models of heuristics from outcome tests reported in this article, evidence from neuroimaging (Venkatraman, Payne, Bettman, Luce, & Huettel, 2009), verbal protocol studies (Cokely & Kelley, 2009), as well as eye tracking (Russo & Dosher, 1983) and information board studies (Mann & Ball, 1994; Payne & Braunstein, 1978) suggests that people often process ‘‘information about the gambles in ways inconsistent with compensatory models of risky decision making’’ (Payne & Braunstein, 1978, p. 554). These results highlight that a challenge for future research is to reconcile the apparently conflicting lines of evidence from process tests and outcome tests. 5. Conclusion A key objective in cognitive modeling is simplicity. However, as Einstein (1934) put it: ‘‘[T]he supreme goal of all theory is to make the irreducible basic element as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.’’ (p. 165) That is, things should be made as simple as possible, but not simpler. Various approaches, ranging from multi-parameter frameworks such as cumulative prospect theory to simple heuristics that ignore part of the information and cannot accommodate individual differences within one model, have been proposed to predict people’s risky choices. We have shown that simpler implementations of cumulative prospect theory yielded more stable parameter estimates and were as robust in prediction as considerably more complex implementations. However, further simplifying models by ignoring individual differences (as in models of heuristics) may be too high a price in the quest for valid accounts of risky choice. References Abdellaoui, M., Bleichrodt, H., & Haridon, O. L. (2008). 
A tractable method to measure utility and loss aversion under prospect theory. Journal of Risk and Uncertainty, 36, 245–266. Andersen, S., Harrison, G., Lau, M., & Rutström, E. (2008). Lost in space: Are preferences stable? International Economic Review, 49, 1091–1112. Ballinger, T. P., & Wilcox, N. T. (1997). Decisions, error and heterogeneity. The Economic Journal, 107, 1090–1105. Birnbaum, M. H. (1973). The devil rides again—Correlation as an index of fit. Psychological Bulletin, 79, 239–242. Birnbaum, M. H. (1974). Reply to the devil’s advocates: Don’t confound model testing and measurement. Psychological Bulletin, 81, 854–859. Birnbaum, M. H. (1999). Testing critical properties of decision making on the Internet. Psychological Science, 10, 399–407. Birnbaum, M. H. (2008a). Evaluation of the priority heuristic as a descriptive model of risky decision making: Comment on Brandstätter, Gigerenzer, and Hertwig (2006). Psychological Review, 115, 253–260. Birnbaum, M. H. (2008b). New paradoxes of risky decision making. Psychological Review, 115, 463–501. Birnbaum, M. H., & Chavez, A. (1997). Tests of theories of decision making: Violations of branch independence and distribution independence. Organizational Behavior and Human Decision Processes, 71, 161–194. Birnbaum, M. H., & Navarrete, J. B. (1998). Testing descriptive utility theories: Violations of stochastic dominance and cumulative independence. Journal of Risk and Uncertainty, 17, 49–78. Booij, A. S., & van de Kuilen, G. (2009). A parameter-free analysis of the utility of money for the general population under prospect theory. Journal of Economic Psychology, 30, 651–666. Brandstätter, E., Gigerenzer, G., & Hertwig, R. (2006). The priority heuristic: Making choices without trade-offs. Psychological Review, 113, 409–432. Brandstätter, E., Gigerenzer, G., & Hertwig, R. (2008). Risky choice with heuristics: Reply to Birnbaum (2008), Johnson, Schulte-Mecklenbeck, and Willemsen (2008), and Rieger and Wang (2008).
Psychological Review, 115, 281–289. Busemeyer, J. R., & Townsend, J. T. (1993). Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review, 100, 432–459. Camerer, C. (1989). An experimental test of several generalized utility theories. Journal of Risk and Uncertainty, 2, 61–104. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum. Cokely, E. T., & Kelley, C. M. (2009). Cognitive abilities and superior decision making under risk: A protocol analysis and process model evaluation. Judgment and Decision Making, 4, 20–33. Dawes, R. M. (1979). The robust beauty of improper linear models. American Psychologist, 34, 571–582. Einstein, A. (1934). On the method of theoretical physics. Philosophy of Science, 1, 163–169. Fehr-Duda, H., de Gennaro, M., & Schubert, R. (2006). Gender, financial risk, and probability weights. Theory and Decision, 60, 283–313. doi:10.1007/s11238-005-4590-0. Fox, C. R., & Poldrack, R. A. (2008). Prospect theory and the brain. In P. Glimcher, E. Fehr, C. Camerer, & R. Poldrack (Eds.), Handbook of neuroeconomics. San Diego: Academic Press. Gächter, S., Johnson, E. J., & Herrmann, A. (2007). Individual-level loss aversion in riskless and risky choices. SSRN eLibrary. Gigerenzer, G., & Brighton, H. (2009). Homo heuristicus: Why biased minds make better inferences. Topics in Cognitive Science, 1, 107–144. Glöckner, A., & Betsch, T. (2008). Do people make decisions under risk based on ignorance? An empirical test of the priority heuristic against cumulative prospect theory. Organizational Behavior and Human Decision Processes, 107, 75–95. Glöckner, A., & Betsch, T. (2011). The empirical content of theories in judgment and decision making: Shortcomings and remedies. Judgment and Decision Making, 6, 711–721. Glöckner, A., & Herbold, A.-K. (2011).
An eye-tracking study on information processing in risky decisions: Evidence for compensatory strategies based on automatic processes. Journal of Behavioral Decision Making, 24, 71–98. Glöckner, A., & Witteman, C. L. M. (2010). Beyond dual-process models: A categorization of processes underlying intuitive judgment and decision making. Thinking and Reasoning, 16, 1–25. Gonzalez, R., & Wu, G. (1999). On the shape of the probability weighting function. Cognitive Psychology, 38, 129–166. Harbaugh, W. T., Krause, K., & Vesterlund, L. (2002). Risk attitudes of children and adults: Choices over small and large probability gains and losses. Experimental Economics, 5, 53–84. Hau, R., Pleskac, T. J., Kiefer, J., & Hertwig, R. (2008). The description-experience gap in risky choice: The role of sample size and experienced probabilities. Journal of Behavioral Decision Making, 21, 493–518. Hey, J. D. (2001). Does repetition improve consistency? Experimental Economics, 4, 5–54. Hilbig, B. E. (2008). One-reason decision making in risky choice? A closer look at the priority heuristic. Judgment and Decision Making, 3, 457–462. Holt, C. A., & Laury, S. K. (2002). Risk aversion and incentive effects. American Economic Review, 92, 1644–1655. Holyoak, K. J., & Simon, D. (1999). Bidirectional reasoning in decision making by constraint satisfaction. Journal of Experimental Psychology: General, 128, 3–31. Johnson, E. J., Schulte-Mecklenbeck, M., & Willemsen, M. C. (2008). Process models deserve process data: A comment on Brandstätter, Gigerenzer, and Hertwig (2006). Psychological Review, 115, 263–273. Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–292. Levin, I. P., Hart, S. S., Weller, J. A., & Harshman, L. A. (2007). Stability of choices in a risky decision making task: A 3-year longitudinal study with children and adults. Journal of Behavioral Decision Making, 20, 241–252.
Please cite this article in press as: Glöckner, A., & Pachur, T. Cognitive models of risky choice: Parameter stability and predictive accuracy of prospect theory. Cognition (2012), doi:10.1016/j.cognition.2011.12.002
Lopes, L. L., & Oden, G. C. (1999). The role of aspiration level in risky choice: A comparison of cumulative prospect theory and SP/A theory. Journal of Mathematical Psychology, 43, 286–313.
Makridakis, S., Hibon, M., & Moser, C. (1979). Accuracy of forecasting: An empirical investigation. Journal of the Royal Statistical Society, Series A, 142, 97–145.
Mann, L., & Ball, C. (1994). The relationship between search strategy and risky choice. Australian Journal of Psychology, 46, 131–136.
Nelder, J., & Mead, R. (1965). A simplex method for function minimization. Computer Journal, 7, 308–313.
Nilsson, H., Rieskamp, J., & Wagenmakers, E.-J. (2011). Hierarchical Bayesian parameter estimation for cumulative prospect theory. Journal of Mathematical Psychology, 55, 84–93.
Nosofsky, R. M. (1984). Choice, similarity, and the context theory of classification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 114.
Pachur, T., Hanoch, Y., & Gummerum, M. (2010). Prospects behind bars: Analyzing decisions under risk in a prison population. Psychonomic Bulletin and Review, 17, 630–636.
Pachur, T., Hertwig, R., Gigerenzer, G., & Brandstätter, E. (submitted for publication). Cognitive processes in risky choice.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. New York, NY: Cambridge University Press.
Payne, J. W., & Braunstein, M. L. (1978). Risky choice: An examination of information acquisition behavior. Memory and Cognition, 6, 554–561.
Popper, K. R. (1934/2005). Logik der Forschung [The logic of scientific discovery] (11th ed.). Tübingen: Mohr Siebeck.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93–134.
Rieskamp, J. (2008).
The probabilistic nature of preferential choice. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 1446–1465.
Rieskamp, J., & Otto, P. E. (2006). SSL: A theory of how people learn to select strategies. Journal of Experimental Psychology: General, 135, 207–236.
Roberts, S., & Pashler, H. (2000). How persuasive is a good fit? A comment on theory testing. Psychological Review, 107, 358–367.
Russo, J. E., & Dosher, B. A. (1983). Strategies for multiattribute binary choice. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 676–696.
Scheibehenne, B., Rieskamp, J., & González-Vallejo, C. (2009). Cognitive models of choice: Comparing decision field theory to the proportional difference model. Cognitive Science, 33, 911–939.
Simon, H. A. (1956). Rational choice and the structure of the environment. Psychological Review, 63, 129–138.
Sokal, R. R., & Rohlf, F. J. (1994). Biometry: The principles and practice of statistics in biological research. New York: Freeman.
Stott, H. P. (2006). Cumulative prospect theory’s functional menagerie. Journal of Risk and Uncertainty, 32, 101–130.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79, 281–299.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Ungemach, C., Chater, N., & Stewart, N. (2009). Are probabilities overweighted or underweighted, when rare outcomes are experienced (rarely)? Psychological Science, 20, 473–479.
Venkatraman, V., Payne, J. W., Bettman, J. R., Luce, M. F., & Huettel, S. A. (2009). Separate neural mechanisms underlie choices and strategic preferences in risky decision making. Neuron, 62, 593–602.
Wills, A. J., & Pothos, E. M. (2012). On the adequacy of current empirical evaluations of formal models of categorization.
Psychological Bulletin, 138, 102–125. doi:10.1037/a0025715.
Wu, G., Zhang, J., & Gonzalez, R. (2004). Decision under risk. In D. J. Koehler & N. Harvey (Eds.), Blackwell handbook of judgment and decision making (pp. 399–423). Malden, MA: Blackwell.
Yechiam, E., & Busemeyer, J. R. (2008). Evaluating generalizability and parameter consistency in learning models. Games and Economic Behavior, 63, 370–394.
Yechiam, E., & Ert, E. (2011). Risk attitude in decision making: In search of trait-like constructs. Topics in Cognitive Science, 3, 166–186.
Zeisberger, S., Vrecko, D., & Langer, T. (in press). Measuring the time stability of prospect theory preferences. Theory and Decision. doi:10.1007/s11238-010-9234-3.