Learning Under Compound Risk vs. Learning Under Ambiguity - An Experiment Othon M. Moreno∗ Yaroslav Rosokha† July 2013 Abstract We design and conduct an economic experiment to investigate the learning process of the agents under compound risk and under ambiguity. We gather data for subjects choosing between lotteries involving risky and ambiguous urns. Decisions are made in conjunction with a sequence of random draws with replacement, allowing us to estimate the beliefs of the agents at different moments in time. For each of the urn types we estimate the initial prior and a model of Bayesian updating allowing for base rate fallacies. Our findings suggest an important difference in updating behavior between risky and ambiguous environments. Specifically, after controlling for the initial prior, when updating under ambiguity subjects significantly underweight the new signal, while when updating under compound risk subjects are essentially Bayesian. JEL Classification Codes: C81, C91, D81, D83 Keywords: Ambiguity; Risk; Discrete Choice; Experiments; Learning. ∗ † Banco de México. Email: [email protected] Purdue University, Krannert School of Management. Email: [email protected] 1 1 Introduction Decision-making under uncertainty is one of the most important areas of study in eco- nomics both in its single-agent (decision theory) and multiple-agent (game theory) forms. The current paradigm in economic modeling for the agent’s decision under uncertainty over time relies heavily on the Bayesian Updating (BU) rule which specifies how new information is incorporated into the decision making problem. The results of experimental testing in the past decades, however, are unsettling (see Camerer (1995) for a comprehensive survey of these results). In fact the recurrent violations of BU predictions have been a fertile ground for alternative decision-making models such as bounded rationality (Kahneman and Frederick (2002), Rabin and Schrag (1999)). As fist noted by Knight (1921), it is important to distinguish between risk (uncertainty with known probabilities) and ambiguity (uncertainty with unknown probabilities). In particular we say that a problem is ambiguous if there is not sufficient information to generate a unique prior probability. An example of a purely risky problem is playing roulette, where all (rational) players agree on a unique probability of generating the outcomes and where the potential gains and losses can be interpreted as odds. However, there are many real life problems, where agents need to make decision with imprecise information at best. Take for example the decision of constructing a stock portfolio, where the nature of uncertainty cannot be reduced to odds; or making decisions using conflicting recommendations from two or more experts. This is an important area in economics and psychology, that encompasses both normative and positive decision theory. Ellsberg (1961) made an important point: attitudes towards risk are distinct from attitudes towards ambiguity. Experimental studies by Stahl (2011) and Halevy (2007) show that, while we can explain the aggregate behavior using an ambiguity averse model, there is substantial heterogeneity at the individual level where a significant fraction of participants are ambiguity-neutral. While these studies focus on the decision-making under ambiguity in static environments, in the real world, people often have to revise their decisions (learn) 2 upon arrival of new information. In their seminal article, Kahneman and Tversky (1973) present evidence that individuals over-value new information relative to Bayes rule (a judgment bias known as representativeness). El-Gamal and Grether (1995), using a compound lottery set-up, show that the most important rules used by the subjects are: (a) Bayes rule, (b) a representativeness , and (c) conservatism (under-weighting the new signal). Both of these studies, however, consider only learning in risky environments. In this paper, we ask the question “how do people learn in uncertain environments?” We focus on the problem of how people incorporate new information, as it becomes available, to refine their assessment of the likelihood of events. In particular, we investigate whether the learning process in an ambiguous environment differ from the learning process in a compound risky environment? The applications in mind are problems such as portfolio re-balancing in the face of new information, which involve a continual judgment of the likelihood of events. In fact, there has been recent efforts in the theoretical front to explain the process used to incorporate new information in dynamic problems involving ambiguous beliefs (Epstein and Schneider (2007) and Hanany and Klibanoff (2009)). But, as pointed out by Hey (2002) and Camerer et al. (2003), problems of this kind are virtually unexplored in experimental settings. We design and conduct an economic experiment to compare the learning process in risky environments and the learning process in ambiguous environments. The participants are required to choose between pairs of lotteries involving urns of black and white marbles with unknown proportions. Repeated sampling with replacement from such urns allow the participants to learn (update their beliefs about) the composition of the urn. In order to identify the difference between updating under ambiguous and risky scenarios we use two types of urns: “compound” and “ambiguous”. The composition of the “compound” urn is the result of a known randomization device, in other words, the objective probabilities are told to the subjects. The composition process of the “ambiguous” urn is unknown to the participants, that is, the objective probabilities are not told to the subjects. 3 We develop a model of subjective belief updating that combines features of the “anchorand-adjustment” type models with Bayesian updating type models. An assumption that permits us to relate the two is that the prior is distributed according to a Beta distribution. The properties of a Beta distribution lead to an updating process that is analytically tractable and can be estimated using standard gradient-descent methods. Each prior is characterized by two parameters - the mean and the weight, and the updating process is characterized by the weight placed on the new signal. Thus, while weights that are placed on the initial beliefs provide an insight into the belief formation, weights that are placed on the new signal capture learning behavior. We find that there is a significant difference at the aggregate level between learning in ambiguous environments and learning in risky environments. Specifically, when updating under ambiguity subjects significantly underweight the new signal (conservatism), while when updating under risk subjects tend to over-weight the new signal (representativeness); however, by adding a degree of freedom to capture the strength of initial beliefs we find that the behavior in risky environments is consistent with the Bayesian updating rule. We also find that initial prior for the compound urn is not the one that participants “should have formed” as a result of the composition process. Specifically, participants place most of the weight on the extreme outcomes, while disregarding the middle outcome. Related papers to ours are Massey and Wu (2005) and Corgnet et al. (2012). Massey and Wu (2005) employ an experimental paradigm to study the detection of regime switching by participants who know the true parameters of the underlying process. Varying the probability to which the regime is likely to switch they find that individuals are most prone to under-reaction in unstable environments with noisy signal. In our view, the relevant parameters of a process in the real world are rarely specified and it is crucial to know any differences in behavior that might arise due to presence of ambiguity. We find that in ambiguous environments subjects place significantly lower weight on the new information when compared to the risky environment. Corgnet et al. (2012) examine the effect that ambiguity 4 has on financial markets in an experimental settings. They find that the role of ambiguity in explaining price anomalies is limited. Our study is more fundamental in nature, instead of looking at experimental market outcomes, such as price and quantity traded, which might be affected by the market structure itself, our goal is to investigate belief updating directly. Our finding of significant under-reaction to the new information in ambiguous environments, and corresponding over-reaction in risky environment, may explain some of the systematic errors that investors make documented in the behavioral and empirical finance literature (Thaler (2005), Jegadeesh and Titman (1993), Barberis et al. (1998), and Shefrin (2002)). It is particularly relevant for an environment where investors have difficulties interpreting the information and might hold on the investment (under-react) until they are sure that the “good news” are not transitory (Barberis et al. (1998)). While Jegadeesh and Titman (1993) note the possibility of distinct updating behavior attributed to different information structures, their argument that ambiguity arises when evaluating the long-term prospects, requires the investors to overreact under ambiguity. This is in contrast to Barberis et al. (1998) and Shefrin (2002) who attribute the ambiguity to the short run and suggest under-reaction due to ambiguity. The deviation from the Bayesian paradigm in an ambiguous environment may also explain some of the learning behavior in repeated games where there is uncertainty about an opponent’s type. For example, the learning rule underlying fictitious play is equivalent to belief updating using Bayes rule in the face of observed play (Fudenberg and Levine (1998)). And, if all players use Bayesian updating then play converges to a Nash equilibrium (Jordan (1995)). However, when there is not enough information to objectively form unique prior about an opponent’s type, in other words the environment is intrinsically ambiguous, differences in learning behavior in response to a signal (action by an opponent) will have strong implications for both the learning dynamics and the equilibrium convergence. Our results delineate the effect of ambiguity on learning behavior in a controlled laboratory setting, specifically we find significant evidence of under-reaction to new information 5 under ambiguity. We also find that controlling for the initial prior matters, even in the case when there is enough information to objectively form a unique prior, as is the case for the compound environment. The rest of the paper is organized as follows: In Section 2 we describe the experimental design, elicitation procedure, and present an overview of the data. In Section 3 we introduce the learning models and estimation procedure used. In Section 4 we test hypotheses of interest and discuss the results. Finally, in Section 5 we conclude. 2 Experimental Design The task in the experiment is simple: a sequence of marbles is drawn with replacement from an urn whose precise black/white composition is unknown to the participant. Concurrent with the draws, the participant is asked a series of questions in the form of: Please pick one of the two alternatives: A B $28 if the marble drawn is black $X $0 otherwise where X is some fixed amount.1 Successive draws are made from the same urn, replacing the marble after each draw. As each new marble is drawn, the color of the marble is new information regarding the composition of the urn. This new information could affect the decision-maker’s beliefs about the black/white composition of the urn, and hence the subsequent valuation of options A and B. Our methodology allows us to track the valuations of the options at every drawing round, providing indirect evidence on the updating process. At the end of the experiment, participants are compensated according to an incentive compatible procedure.2 1 2 The amount of $X offered to the participants varies significantly in the range $1 to $27. One lottery is picked at random and carried out at the end of the experiment. 6 2.1 Urn Types Urns in the experiment differ based on the information regarding their composition. Specifically, we distinguish two types of risky urns: risky urns (R), whose exact composition is known to the participants; compound urns (C ), whose composition “process” is known.3 And ambiguous urns (A), whose composition process is unknown.4 Figure 1: Urns ? R1 R2 R3 C ? A Figure 1 presents the three risky urns (R1-R3 ), the compound urn (C ), and the ambiguous urn (A) used in the experiment. We will use urns R1 -R3 to elicit the risk preferences of the participants, while the C and the A urns will be used to investigate the learning behavior of the participants. 2.2 Treatments Treatments of the experiment vary according to the order in which urns are presented. We refer to all tasks corresponding to the same urn as a stage. Specifically, we distinguish five stages of the experiment: during stage 1-3, the participants are presented with urns R1 R3 ; during stages 4 and 5, the participants are presented with either (i) the C urn or (ii) the A urn. The order in which urns are presented will distinguish treatment 1 and treatment 2. 3 Four urns of known compositions (two R1 , one R2 , and one R3 ) are placed in a box and one is randomly drawn to be the C urn used for the draws. 4 Unlike the C urn, no objective probabilities are given for the composition of A urn - there is ambiguity about the number of black marbles. 7 Treatments are summarized in Table 1.5 Notice, that the three risky urns are always shown in the same order for both treatment, while the order of the C and the A urns changes between treatments. Table 1: Treatments Stage Treatment 1 Treatment 2 1 R1 R1 2 R2 R2 3 R3 R3 4 C A 5 A C We will use the following notation throughout the paper: superscript j in xji will refer to the treatment (j ∈ {1, 2}), while subscript i will refer to the type of urn (i ∈ {R1 , R2 , R3 , C, A}). 2.3 Decision Tasks We use a modified version of the iterative Multiple Price List Design (iMPL) as described by Harrison and Cox (2008) (Section 3.1). In particular, in each decision round a participant is presented with the following task: The outcome of the lottery is based on the color of the ball that will be drawn from urn i ∈ {R1 , R2 , R3 , C, A}. Please choose between lotteries A and B for each question. A B vs 1) $6 $28 if black, $0 otherwise 2) $9 $28 if black, $0 otherwise 3) $12 $28 if black, $0 otherwise 4) $14 $28 if black, $0 otherwise Note that payoff for the safe option, A, varies throughout the experiment. For details about the design of decision tasks please refer to Appendix A. 5 Ri - risky urn (composition is known). C - compound urn (composition process is known). A - ambiguous urn (composition process is unknown). 8 2.4 Draws During stages 4 and 5 a sequence of draws with replacement is observed by the participants. These draws do no affect the payoffs but serve as a signal about the urn composition. Specifically, 3 draws from an urn are made and then a new decision task is presented to the participants.6 We refer to each decision task as round. For example, round 0 represents choices made before any draws are observed, round 1 represents choices after the first set of 3 draws was observed, round 2 represents choices after the second set of 3 draws, and so on. Therefore, in total, we have 5 rounds during stages 4 and 5. 2.5 Administration and Data Seventy eight subjects were recruited for the experiment using ads posted at different locations on the University of Texas at Austin campus.7 Ten sessions of the experiment were administered with numbers of participants varying between six and eleven. The experiment was programmed and conducted by the authors using the software z-Tree (Fischbacher (2007)). All lotteries were executed by physical randomization devices. Of the seventy eight subjects there were four whose responses are omitted because they displayed no understanding of the task at hand.8 The data is summarized by session and is presented in Table 2. Table 2: Data Summary Session N Treatment Av. Earn. 1 9 1 15.67 2 5 1 23.40 3 11 1 11.55 4 7 1 19.14 5 10 2 21.50 6 6 5 2 19.00 7 6 2 21.67 8 10 2 11.40 9 6 1 17.33 10 5 2 15.00 Overall 74 – 17.57 The choice of 3 draws (12 draws total) was made based on average rate of learning across 1000 simulations of random urns. 7 All subjects were restricted to be undergraduate students. 8 They chose either all A or all B for at least half of the rounds in a sequence 9 In total each participant made 52 decisions, but only one was selected randomly and carried out after all decisions have been made. The duration of the experiment was about 45 minutes with an average payoff of $12.57 from the responses plus a $5.00 show up fee. Thus, on average each question was worth $0.24. 3 Estimation In this section we present the estimation procedure and the learning models considered. 3.1 Estimation Procedure We assume throughout the paper that the aggregate behavior can be summarized by a representative agent whose utility function is parametrized using a normalized version of the CRRA functional form.9 We use a CRRA utility representation of the form: u(x) = (x)1−γ − 1 1−γ (1) where x is the lottery prize and γ 6= 1 is the risk aversion parameter to be estimated. Thus γ = 0 corresponds to a risk neutral gent, and γ > (<)0 corresponds to a risk averse (loving) agent. To ensure that the possible utility values lie between zero and one for all possible payoffs, a further normalization proves to be useful: U (x) = u(x + F ) − u(F ) u(28 + F ) − u(F ) (2) where F is the show-up fee paid regardless of the outcome of the lottery. The representative agent chooses the option with the highest expected value given her current belief, subject to an error, which is assumed to be distributed according to a logistic distribution centered at zero. Using the maximum likelihood approach we estimate 9 We pool the choices made by all participants and estimate a single set of parameters for each model. 10 a latent structural model of choice of the form: PAt = 1 1 + e−λ(Ept [U (Ai,t )−U (Bi,t )]) , (3) where PAt is the probability that subject chooses lottery A at round t, Ai,t and Bi,t is the ith choice pair presented to the participants in round t; pt represents the belief of a marble drawn being black in round t; and λ is a parameter capturing the precision with which the agent makes a choice when evaluating the difference in expected payoffs between lotteries. Although there might be differences in the precision parameter depending on urn types or treatments, since we have no theory to explain such differences and to avoid over-fitting of the data, we impose one λ for the purposes of structural estimation of our models. The learning models tested in this paper differ on the way pt evolves over time as new information is revealed to the agents. Detailed description of these models follows in Sections 3.2.1 and 3.2.2. A useful characteristic of the theories we consider is that each can be identified by an initial prior, p0 , and an updating rule specified by a parameter vector θ. So for a given p0 and θ, at any point in time, we can elicit indirectly the beliefs about the urn composition, i.e. the probability of the next drawn marble being black. Using this probability we can obtain an expected utility value for the two alternatives presented. Using the Logit specification in equation 3 we obtain a maximum likelihood estimates of θ and p0 . 3.2 Learning Models We consider two learning models: 1) Reinforcement Learning Heuristic (RLH) (Section 3.2.1) for which the current belief is the exponential recency-weighted average of observed signals; and 2) Bayesian updating model with Base Rate Fallacy (BUBRF) (section 3.2.2) which generalizes the standard Bayesian updating allowing for new signals to have different weight relative to the prior. 11 3.2.1 Reinforcement Learning Heuristic (RLH) The learning behavior for the Reinforcement Learning Heuristic can be characterized by a simple updating rule that is a convex combination between the belief at time t − 1 and the information provided by the new signal, st . To formalize this type of model, let pt be the belief about the probability of a black marble being drawn in round t, st the signal at round t, and ρ ∈ (0, 1) the weight assigned to the new signals. Then this belief evolves over time according to the following equation: pt = (1 − ρ) × pt−1 + ρ × where st 3 st . 3 (4) is the fraction of black balls drawn at round t. This formulation of pt is also interpreted as an exponential, recency-weighted average of past signals (Sutton and Barto (1998)). Using equation (4) in equation (3) we can jointly estimate γ, p0 , ρ, and λ. 3.2.2 Bayesian Updating with Base Rate Fallacy (BUBRF) Goeree et al. (2007) introduce a generalization of Bayesian model for a two-round two-urn setting, that allows for deviations from Bayesian updating (BU). We extend their framework to a more general setting with more than two periods and two urns. Then, assuming that the subjective prior is distributed according to a Beta distribution, and using the principle of exponential decay (ElSalamouny et al. (2009)), we reformulate the generalization of BU rule in a form similar to (4). Intuition. Consider an agent who observes one signal s from the C urn but acts as if she observed two identical signals. She can use Bayes’ Rule to compute the probability that draws were made from the R1 urn given the first observation: likelihood posterior z }| { P r(R1 |s) = P prior z }| {z }| { P r(s|R1 )P r(R1 ) p∈R1 ,R2 ,R3 P r(s|p)P r(p) 12 Then, given the second observation, the agent would update her beliefs that draws were made from the R1 : likelihood 2 prior history z }| { z }| { z}|{ P r(s|R1 )P r(R1 |s) P r(s|R1 ) P r(R1 ) P r(R1 | s, s ) = P =P 2 P r(s|p)P r(p|s) p∈R1 ,R2 ,R3 p∈R1 ,R2 ,R3 P r(s|p) P r(p) Similarly, if the same agent acted “as if” she observed three identical signals, the likelihood would be raised to the power of 3. Notice, however, that this agent has to start with a prior in order to apply the Bayes rule. This is not a problem for the C urn because there is enough information provided to objectively form a unique prior. But for the A urn it is not as straight forward. Our approach will be to estimate the prior for both, the C and the A urns. Model. Let a scalar β capture the number of “as-if” observations.10 In other words, β captures the weight players place on the new signal relative to the prior. In order to distinguish between the old and the new information (signals) we will use β t . It is necessary either to impose a prior or to estimate a subjective prior over the urn composition. We choose the second option. Furthermore, we will not impose or limit a prior to be over specific proportions. Equation (5) presents the model in the most general form: t P (p|Ht ) = ´ 1 0 L(st |p)β P (p|Ht−1 ) L(st |z)β t P (z|Ht−1 )dF (z) (5) where P (p|Ht ) is the posterior belief over p at round t, L(st |p) is the likelihood of observing signal st given p, and P (p|Ht−1 ) is the prior over p at t − 1. Our notation implies that signal st is incorporated into the history of signals Ht . In order to make the model tractable we introduce the following assumption. Assumption 1: Prior, P (p|Ht−1 ), is distributed according to a Beta distribution Let us suppose that the prior at time t is distributed according to a Beta distribution with 10 For example, if β = 1 an agent acts “as-if” she observed one signal (i.e. Bayesian). If β = 2 an agent acts “as-if” she observed two signals. 13 parameters at−1 , and bt−1 ; i.e. Ht−1 is summarized by (at−1 , bt−1 ). First, note that after observing a signal11 , st , the posterior is also distributed according to Beta distribution with parameters at = at−1 + β t st (6) t bt = bt−1 + β (3 − st ) where at is the number of successes and bt is the number of failures perceived by the subject at round t. This formulation is equivalent to the one used by ElSalamouny et al. (2009) for beta-based computational trust frameworks.12 Then using the property of the Beta distribution we can express the mean p̄t as follows: h p̄t where ρ̄t = t i p̄t−1 + at−1β+bt−1 st at−1 + β t st at h i = = = t at + b t at−1 + bt−1 + 3β t 1 + at−13β+bt−1 st = (1 − ρ̄t ) × p̄t−1 + ρ̄t × 3 3β t . at−1 +bt−1 +3β t (7) Notice, that estimation of (7) comes down to being able to estimate a0 , b0 , and β which fully characterize the law of motion of the beliefs. A further modification proves to be extremely useful for interpretation and estimation. Let pt = at at +bt and Nt = at + bt ; then, instead of estimating a0 and b0 , we can estimate p0 and N0 . This is useful because we can directly estimate the belief, p0 , from the responses provided by the agents in round 0, and N0 can be interpreted as the weight of that prior belief which will be estimated jointly with the weights of the signals. In this way we can rewrite equation (7) as follows: p̄t = Nt−1 p̄t−1 Nt 11 + 3β t Nt × st 3 (8) Recall, that we are drawing three marbles between the decision rounds. The difference being that ElSalamouny et al. (2009) normalize the strength of the most recent signal to 1 (which in our setting just means dividing everything by β t ) 12 14 where Nt = Nt−1 +3β t . That is, Nt tracks the number of draws perceived by the subject. This model has a natural interpretation relative to Bayesian updating. A value of β > 1 implies an over-weighting of new information relative to past beliefs (representativeness heuristic). A value of β < 1 corresponds to under-weighting of new information (conservatism heuristic). And β = 1 is simply standard Bayesian updating rule. In a special case, when a subject starts with N0 = 0 and uses Bayesian updating rule, Nt is the actual number of draws made up to round t. Relationship between RLH and BUBRF. Let us consider the difference between RLH and BUBRF models in more detail. First, from equation (4), since ρ is constant over time, the implication for the BUBRF model is that Nt = and ρ = β−1 β 3β t Nt is constant over time, which means that 3β t+1 β−1 (9) with β > 1. Notice that when β → 1, Nt → ∞ for all t, which is equivalent to ρ → 0 and means that the new signal does not affect the initial belief. Also notice that under the RLH specification the weight of a new signal is always a constant proportion of the past signals, while the weight of the prior, N0 , is 3β 13 . β−1 In contrast, BUBRF in equation (7) does not impose any relationship between the weight of the prior and the weight of a new signal; furthermore BUBRF is not restricted to the over-weighting behavior (β > 1). 13 β > 1 and N0 > 3. 15 0.7 0.7 Figure 2: BUBRF with β = 2 Belief (p) 0.5 0.6 N0 = 0 N0 = 1000 N0 = 6 RLH* Bayes** 0.3 0.3 0.4 0.4 Belief (p) 0.5 0.6 N0 = 0 N0 = 1000 N0 = 6 RLH* Bayes** 0 1 2 3 Draw Round (t) 4 0 1 2 3 Draw Round (t) 4 *RLH with ρ = .5 **Bayesian updating starting from the prior implied by the principle of insufficient information (p0 = .5, N0 = 2,β = 1). Figure 2 presents BUBRF implementation for β = 2 and several different N0 . We would like to point out that according to equation (9), the BUBRF model with β = 2 and N0 = 6 (green line) is equivalent to RLH model with ρ = .5 (blue line), which is why you cannot see a difference between the two lines. 4 Results This section is organized as follows. In Section 4.1 we present the estimation results for the risk aversion parameter using responses only from stages 1-3. In Section 4.2 we consider estimates of subjective beliefs in round 0 (before any draws have been made). These estimations are carried out only on a subset of the data for two reasons. First, we would like to compare results with prior studies on risk and ambiguity attitudes in static environments. And second, we carry out preliminary tests to narrow in on the main result of the paper.14 In Sections 4.3 and 4.4 we present estimation results for BUBRF and RLH 14 For example, we test for Ambiguity Aversion and Reduction of Compound Lotteries and compare results with prior findings. We could have carried out same tests as part of joint estimation done in Sections 4.4 and 4.3, however for exposition and comparison purposes we do these tests separately. 16 models respectively, however, unlike Sections 4.1 and 4.2 we carry out joint estimation. 4.1 Risk Aversion In stages 1-3 the agents are endowed with objective probability, p0 , and we assume that the belief of the representative agent is consistent with this probability. Table 3 presents the MLE estimates of equation 3 with standard errors reported in parenthesis. Table 3: Risk Aversion γ λ logL .45 (.04) 10.97 (.81) -489.21 γ is the risk aversion parameter; λ is the precision with which the agent makes a choice when evaluating the difference in expected payoffs between lotteries. λ = 11 implies a 75% probability of choosing the best of two options that differ by only 10% in expected utility. Our results corresponds to a representative agent that exhibits levels of risk aversion which are commonly observed in studies using micro-level data. The estimates for the risk aversion parameter are in line with previous findings in the experimental literature (see Harrison and Rutström (2008), Harrison and Cox (2008)). 4.2 Initial Beliefs Prior studies on ambiguity aversion emphasize an important distinction between valuation of risky and ambiguous lotteries. Principally, at the aggregate level, the valuation of the ambiguous urn is lower than the valuation of a compound and risky urns (Halevy (2007)), which is explained by population being ambiguity averse. In this section we present our results about attitudes towards compound risk and ambiguity and make a comparison with prior studies in the literature.15 15 For the purposes of this paper we focus on initial belief; however, all conclusions made in this section also hold true for urn-valuation approach. 17 Recall, that during round 0 of each stage no draws have yet been made, hence the only information given to the subjects pertains to the composition process of the urn. Using equation (3) and responses from round 0 of stages 4 and 5 we estimate an initial belief about the probability that a black marble will be drawn. In other words, we estimate a subjective belief that is consistent with participants choices in round 0. Note that we are fixing the risk-aversion parameter to the estimates obtained from stages 1-3 only (γ = .45 from Table 3). The estimates for the unrestricted model, for which initial beliefs can vary by both order of presentation and urn type, are presented in column IB1 of Table 4. We observe that the initial belief is the highest for the C urn in treatment 1, the second highest for the A urn in treatment 1, followed closely by the initial belief for the A urn in treatment 2, and, surprisingly, the initial belief about the probability of black marble being drawn is the lowest for the C urn in treatment 2. In what follows we test several hypotheses to gain a better understanding of the exact nature of this outcome. 18 Table 4: Initial Beliefs IB1 IB2 pj0A H0 : – γ p10C p20C p10A p20A λ0 logL # par AIC rank H1 IB1 IB2 IB3 IB4 IB5 .45 .48 (.03) .40 (.03) .44 (.03) .42 (.03) 5.85 (.71) -365.35 5 740.70 4 .45 .48 (.03) .40 (.03) .43 (.02) .43 (.02) 5.85 (.71) -365.48 4 738.95 2 – .615 – = p0A IB3 pj0A = .5 IB4 pj0i = p0i .45 .45 .48 (.03) .44 (.02) .39 (.03) .44 (.02) .50 .43 (.02) .50 .43 (.02) 5.50 (.70) 5.81 (.71) -371.25 -367.45 3 3 748.50 740.89 6 5 p-values .003 .123 .001 .047 – NA – IB5 pj0i = pj0 IB6 pj0i = p0 .45 .46 (.02) .41 (.02) .46 (.02) .41 (.02) 5.84 (.71) -366.04 3 738.08 1 .45 .43 (.02) .43 (.02) .43 (.02) .43 (.02) 5.81 (.71) -367.59 2 739.17 3 .502 NA NA NA – .215 .121 NA .595 .078 γ is the risk aversion parameter; pj0i is the initial subjective belief about probability of black marble being drawn from urn of type i ∈ {C, A} in treatment j ∈ {1, 2}; λ0 is the precision parameter. λ0 = 6 implies a 65% probability of choosing the best of two options that differ by only 10% in expected utility. H-IB2: Initial beliefs about the A urn may not vary by treatment. (H0 : pj0A = p0A ) We test this hypothesis by imposing a restriction on the initial beliefs in ambiguous environments, and present our estimates in column IB2 of Table 4. With a p-value of 0.615 we fail to reject the null and conclude that there is no evidence of difference between initial beliefs about the probability of black marble being drawn from and ambiguous urn in treatments 1 and 2. H-IB3: Initial beliefs about the A urn are consistent with the principle of insufficient information. ( H0 : pj0A = .5 ) In the Gilboa and Schmeidler (1989) framework an ambiguity averse agent acts according to the worst case scenario prior. For the A urn, this means that the initial belief is lower 19 than the one implied by the principle of insufficient information (i.e. pi0A < .5).16 Thus, to test the hypothesis of ambiguity aversion we impose the restriction of p10A = p20A = .50, and present the estimates in column IB3. Using likelihood ratio test for nested models, we obtain a p-value of .001 relative to the model in IB2. Therefore, we have statistically significant evidence to reject the restriction, in other words, we have evidence of ambiguity aversion. Next, we test whether the nature of uncertainty determines the initial beliefs about the urn composition. This is an extension of H-IB2 where we additionally restrict the initial beliefs about the C urn to be the same between treatments. H-IB4: Initial beliefs may vary by urn type but not by treatment. ( H0 : pj0i = p0i ) We test this hypothesis in column IB4. With a p-value of .047 relatively to the model in column IB2 we reject this restriction at the 5% significance level. Therefore, we have evidence that initial beliefs about the C urns do vary by treatments. The following hypothesis consider the potential order effect. The next hypothesis considers an order effect. H-IB5: Initial beliefs may vary by treatment but not by urn type. ( H0 : pj0i = pj0 ) We test this hypothesis in column IB5 using a likelihood ratio test and obtain a p-value of .502 relative to IB1. Therefore we fail to reject the null hypothesis, and so when an ambiguous urn follows a compound urn the initial beliefs are the same as those for the compound urn. Furthermore, model in column IB5 is the best according to the Akaike Information Criteria. Finally, we test whether there is any differences in initial beliefs at all. H-IB6: Initial beliefs may vary neither by treatment nor by urn type. ( H0 : pj0i = p0 ) Using a likelihood ratio test we obtain a p-value of .078 relative to the model IB5, which is significant at the 10% level but not at the 5% level. However, when estimating the initial 16 Assuming that participants consider a set of possible priors whose means are symmetric around .50. 20 beliefs jointly with the learning models, and making use of the full set of data, we are capable of rejecting the IB6 specification at the 5% significance level. Furthermore the model IB6 is dominated by IB5 according to the Akaike Information Criteria. To summarize our results for subjective beliefs in the static environment, we find evidence against the principle of insufficient information and evidence of an order effect. Specifically, in ambiguous environments subjects assign a probability significantly lower than the one implied by the reduction of compound lotteries when starting with the prior implied by the principle of insufficient information. This result is consistent with ambiguity aversion documented in the literature on decision making under ambiguity in static environments. The order effect comes into play when initial beliefs about the second urn are anchored at the level of initial beliefs about the first urn. In other words, when the C urn is presented first, the initial beliefs about the A urn that follows are not significantly different from the beliefs about the preceding C urn. And vice versa, when the A urn is presented first, the initial beliefs about the C urn that follows are not significantly different from the beliefs about the preceding A urn. It is worth re-emphasizing, that estimates in Table 4 were generated using only responses from round 0, therefore we would expect a difference when estimating the same beliefs as part of a broader learning model. Nevertheless, we will use the results for each of the learning models to verify that the initial beliefs obtained from each model are consistent with those estimated from round 0. 4.3 RLH In what follows we test several restrictions of the Reinforcement Learning Heuristic to determine if there is any difference in updating behavior between treatments or uncertainty types. The estimation results for the Reinforcement Learning Heuristic are presented in Table 5. Recall that, we assume that the precision parameter is constant across treatments, 21 urn types, and rounds.17 We can notice that values of the risk aversion parameter, γ, in Table 5 are within one standard error from the estimates in Section 4.1. Similarly, the estimated values of the initial beliefs in Table 5 and the estimates in Section 4.2 are within one standard error. In other words, joint estimation results are consistent with what we obtained by estimating risk aversion and initial beliefs 4.2 in Section 4.1 and 4.2. What we observe from the unrestricted model RLH1 is that weights do vary for different set-ups. Particularly, the weights the for the C urn are close between treatments, while weights for ambiguous urn in treatment 1 are lower than in treatment 2. We proceed to formally test these differences in updating between treatments and urn types. First, we consider whether the conclusion in IB5 holds when this model is implemented. In other words we test whether pj0i = pj0 , under this model specification. H-RLH2: Initial beliefs may vary by treatment but not by urn type. ( H0 : pj0i = pj0 ) With a p-value of .850 we fail to reject this restriction, hence there is no significant evidence against IB5, and therefore it will serve as a starting point for further restrictions of the RLH. Second, we consider whether there is any difference in initial beliefs either across urn types or treatments. H-RLH3: Initial beliefs may vary neither by treatment nor by urn type. ( H0 : pj0i = p0 ) With a p-value of .014 we reject this restriction relative to RLH2, hence there is significant evidence against RLH3. Together with RLH2 this means that the initial beliefs vary by treatment. This serves to reinforce the H-IB5 and H-IB6 hypotheses. Third, we consider whether the updating parameter, ρ, depends on treatment but not the urn type. 17 That is we assume that, once the belief is selected, the decision task is the same for the C urn as it is for the A urn, and therefore the error should be the same. 22 H-RLH4: Updating weights may vary by treatment but not by urn type. (H0 : ρji = ρj ) Imposing the restriction we estimate and present the model in column RLH 3. Using a likelihood ratio test relative to RLH2 we find a p-value of .015, therefore we reject the null and conclude that there is statistically significant evidence to reject the hypothesis of same updating by order treatment. Next, we test whether the updating parameter, ρ, is the same for the two ambiguous urns and for the two compound urns, but may differ across treatments. Table 5: Reinforcement Learning Heuristic Models RLH1 RLH2 H0 : – pj0i = pj0 γ p10R p20R p10A p20A ρ1C ρ2C ρ1A ρ2A λRL logL # par AIC rank H1 RLH1 RLH2 RLH3 RLH4 RLH5 .39 (.08) .48 (.04) .37 (.05) .45 (.03) .38 (.04) .47 (.07) .43 (.09) .17 (.08) .38 (.07) 3.79 (.23) -2484.95 10 4989.91 4 .39 (.08) .46 (.03) .37 (.03) .46 (.03) .37 (.03) .47 (.07) .42 (.08) .17 (.08) .39 (.07) 3.79 (.22) -2485.12 8 4986.23 1 – .850 – RLH3 RLH4 RLH5 RLH6 pj0i = p0 pj0i = pj0 ρji = ρj pj0i = pj0 ρji = ρi pj0i = pj0 ρji = ρ .45 (.05) .46 (.03) .38 (.03) .46 (.03) .38 (.03) .44 (.05) .44 (.05) .31 (.05) .31 (.05) 3.74 (.22) -2488.19 6 4986.82 2 .39 (.07) .46 (.03) .38 (.03) .46 (.03) .38 (.03) .39 (.04) .39 (.04) .39 (.04) .39 (.04) 3.73 (.22) -2489.44 5 4988.88 3 .292 .101 NA NA – .110 .034 NA .662 .044 .39 (.08) .39 (.07) .42 (.03) .46 (.03) .42 (.03) .38 (.03) .42 (.03) .46 (.03) .42 (.03) .38 (.03) .48 (.07) .37 (.05) .38 (.09) .41 (.06) .18 (.08) .37 (.05) .36 (.07) .41 (.06) 3.79 (.22) 3.73 (.22) -2485.12 -2484.41 7 6 4990.21 4990.69 5 6 p-values .097 .067 .014 .015 – NA – γ is the risk aversion parameter; pj0i is the initial subjective belief about probability of black marble being drawn from urn of type i ∈ {C, A} in treatment j ∈ {1, 2}; ρji is the weight placed on new signal; λRL is the precision parameter. λRL = 3.75 implies about 60% probability of choosing the best of two options that differ by only 10% in expected utility. 23 H-RLH5: Updating weights may vary by urn type but not by treatment. (H0 : ρji = ρi ) Imposing the restriction we estimate and present the model in column RLH 5. Using a likelihood ratio test relative to RLH2 we find a p-value of .101, therefore we fail to reject the null and conclude that there is no statistically significant evidence to reject the hypothesis of same updating by uncertainty type. Finally, we test whether type of uncertainty matter for the updating. H-RLH6: Updating weights vary neither by urn type nor by treatment. (H0 : ρji = ρ) Using a likelihood ratio test relative we obtain p-values of .034 and .044 relative to RLH2 and RLH5 respectively. Therefore we reject the restriction at the 5% level. Furthermore, models RLH2 and RLH5 are best according to the Akaike Information Criterion among the RLH type models. Figure 3 illustrates two sequences of beliefs implied by RLH5. In particular, we can observe the differences between the the rates at which the representative agent updates his beliefs under risky and ambiguous environments. Notice that the difference between treatments for the C urns and the A urns comes from the difference in the initial beliefs analyzed in Section 4.2. As we can see the updating process for the ambiguous environment is less volatile than the one observed for the compound environment. 24 0.7 0.7 Figure 3: RLH5 Belief (p) 0.5 0.6 C1 C2 A1 A2 Bayes* 0.3 0.3 0.4 0.4 Belief (p) 0.5 0.6 C1 C2 A1 A2 Bayes* 0 1 2 3 Draw Round (t) 4 0 1 2 3 Draw Round (t) 4 *Bayesian updating starting from the prior implied by the principle of insufficient information (p0 = .5, N0 = 2). To summarize, in this section we considered one of the simplest learning models and found significant evidence of different updating weights depending on the type of uncertainty. Specifically, weights placed on the signal in compound risky environments are significantly higher than weights placed on the signal in ambiguous environments. 4.4 BUBRF The Bayesian Updating Base Rate Fallacy model under consideration has several nice features. On one hand, it allows us to interpret the estimates relative to the standard Bayesian Updating. On the other hand, using the BUBRF model we can determine to what extend the strength of the initial beliefs influence the learning behavior once new draws are made. Estimation results for the BUBRF model are presented in Table 6. The model in column BUBRF1 presents unrestricted estimates. Notice that estimates of the risk aversion parameter are slightly higher as compared to Sections 4.1 and 4.3; however the differences between estimates based only on stages 1-3 and joint estimates as part of the BUBRF model are insignificant. Also notice, that the estimates of the initial beliefs (p10C , p20C , p10A , p20A ) are not significantly different from our estimates in Section 4.2, 25 which where estimated only on responses from round 0. Next, we will consider the plausible hypotheses about variation in levels and strengths of the initial beliefs. Table 6: BUBRF BUBRF1 BUBRF2 H0 : – p0i = p0 j j N0i = N0 p0A = p0C j N0i = N0 βi = β j +BU BRF 2 βi = βi +BU BRF 2 βi = β +BU BRF 3 βC = 1 +BU BRF 5 γ p10C p20C p10A p20A 1 N0C 2 N0C 1 N0A 2 N0A 1 βC 2 βC 1 βA 2 βA λBRF logL # par AIC rank H1 .55 (.06) .50 (.04) .39 (.04) .44 (.05) .41 (.04) .19 (.43) 3.47 (1.88) .02 (.28) 1.10 (1.09) .89 (.15) 1.35 (.36) .38 (.11) .45 (.39) 4.08 (.24) -2458.03 14 4944.06 4 .55 (.06) .48 (.03) .41 (.04) .48 (.03) .41 (.04) .03 (.25) 2.54 (1.09) .03 (.25) 2.54 (1.09) .88 (.15) 1.25 (.32) .38 (.10) .83 (.27) 4.04 (.24) -2459.72 10 4939.45 3 .56 (.07) .45 (.03) .45 (.03) .45 (.03) .45 (.03) .42 (.41) .42 (.41) .42 (.41) .42 (.41) .88 (.16) 1.39 (.38) .39 (.13) .18 (.19) 3.93 (.24) -2466.82 9 4949.64 7 .55 (.06) .48 (.03) .40 (.03) .48 (.03) .40 (.03) .02 (.30) 1.47 (.54) .02 (.30) 1.47 (.54) .95 (.14) .95 (.14) .48 (.11) .48 (.11) 4.00 (.24) -2461.72 8 4939.43 2 .52 (.06) .47 (.03) .40 (.03) .47 (.03) .40 (.03) .19 (.32) 2.02 (.61) .19 (.32) 2.02 (.61) .74 (.10) .74 (.10) .74 (.10) .74 (.10) 4.03 (.23) -2465.58 7 4945.17 6 .56 (.05) .47 (.03) .40 (.03) .47 (.03) .40 (.03) .02 (.30) 1.48 (.53) .02 (.30) 1.48 (.53) 1.00 1.00 .48 (.11) .48 (.11) 3.99 (.23) -2461.79 6 4937.57 1 BUBRF1 – .495 – .007 .001 – .56 (.06) .47 (.03) .40 (.03) .47 (.03) .40 (.03) .17 (.28) 2.92 (1.15) .17 (.28) 2.92 (1.15) .66 (.11) 1.05 (.25) .66 (.11) 1.05 (.25) 4.04 (.23) -2464.32 8 4944.64 5 p-values .050 .010 NA – .288 .136 NA NA – .035 .008 NA .114 .005 – .378 .248 NA NA .708 NA j BUBRF2 BUBRF3 BUBRF4 j BUBRF3 j j BUBRF4 j BUBRF5 BUBRF6 BUBRF5 j BUBRF6 j BUBRF7 j γ is the risk aversion parameter; pj0i is the initial subjective belief about probability of black marble j being drawn; βij is the weight placed on the new signal; N0i is the weight placed on the initial belief; λBRF is the precision parameter. λBRF = 4 implies about a 60% probability of choosing the best of two options that differ by only 10% in expected utility. First we will explore whether the conclusion from IB5 holds when BUBRF is estimated jointly with the risk aversion parameter γ and the initial belief p0 . It is noteworthy that in addition to the mean of the initial prior (what we refer to as the “initial belief” throughout 26 the paper), the BUBRF specification has a parameter that captures the weight of the initial belief, N0 . Therefore to test the hypothesis about equality of the priors by treatment we test both pj0i = pj0 and N0ij = N0j . H-BUBRF2: Initial prior may vary by treatment but not by urn type. ( H0 : pj0i = pj0 , N0ij = N0j ) With a p-value of .495 we fail to reject this restriction, hence there is no significant evidence against IB5, and therefore it will serve as a starting point for further restrictions of the BUBRF model. Next, let us consider whether there is any variability in priors at all. H-BUBRF3: Initial prior may vary neither by treatment nor by urn type. ( H0 : pj0i = p0 , N0ij = N0 ) Using a likelihood ratio test we obtain a p-value of .007 relative to BUBRF1 and a p-value of .001 relative to BUBRF2, and therefore there is significant evidence against this hypothesis. This means that subjects do have significantly different priors between treatments. We next test whether similar conclusion holds for weights that subjects place on the new signals. H-BUBRF4: Updating weights may vary by treatment but not by urn type. ( βij = β j ) With a p-value of .010 relative to BUBRF2 we reject this restriction. Therefore there is significant evidence that weights placed on the new signal may vary by uncertainty type. Next, we test a hypothesis about the weights that subjects place on new information depending on urn type. H-BUBRF5: Updating weights may vary by urn type but not by treatment. ( βij = βi ) Using a likelihood ratio approach we test this hypothesis and with a p-value of .136 we fail to reject this hypothesis against the alternative in BUBRF2. In other words there is no significant evidence against weights varying by uncertainty type. Now, we test the hypothesis of common weights across treatments and urn types. 27 H-BUBRF6: Updating weights vary neither by urn type nor by treatment. (βij = β) Using a likelihood ratio approach we test this hypothesis and with a p-value of .005 we reject the restriction. In other words, similarly to the H-BUBRF4 we conclude that there is significant evidence that weights vary by uncertainty type. Finally, the weights placed on the new signals in the compound environment for BUBRF1 BUBRF6 are close to 1. Recall, the special case when the weight is equal to 1 corresponds to Bayesian Updating. We formally test this hypothesis in H-BUBRF7 H-BUBRF7: Updating weights for the C urn are consistent with Bayesian Updating. ( βCj = 1) With a p-value of .708 we fail to reject this model relative to the model in column BUBRF5. Thus, there is no evidence against the hypothesis that agents place the Bayesian Updating in the compound environment. Model BUBRF7 is the best according to the Akaike Information Criterion. Figure 4 illustrates the implications of BUBRF7 under two sequences of draws. Notice that under BUBRF7 there is a difference in behavior between the two treatments for the compound environment that is due to the difference in initial priors. The learning process in a compound environment is different from the learning process in an ambiguous environment as implied by BUBRF7 and can be seen from a reaction to a new signal after drawing in round 2 in Figure 4. 28 0.7 0.7 Figure 4: BUBRF7 Belief (p) 0.5 0.6 C1 C2 A1 A2 Bayes* 0.3 0.3 0.4 0.4 Belief (p) 0.5 0.6 C1 C2 A1 A2 Bayes* 0 1 2 3 Draw Round (t) 4 0 1 2 3 Draw Round (t) 4 *Bayesian updating starting from the prior implied by the principle of insufficient information (p0 = .5, N0 = 2). Difference between Bayes and C1 is due to difference in strengths of initial 1 = 0.02). beliefs (N0C The main result for the BUBRF model is that the weight assigned to new information is significantly lower for ambiguous urns than for risky urns, and is independent of the order in which the urns were presented. In particular, subjects significantly underweight new information in ambiguous environments relative to Bayesian Updating, but act essentially Bayesian in risky environments. This behavior is consistent with under-reaction documented by Shefrin (2002) in financial markets. 4.5 BUBRF vs. RLH We now proceed to compare results for the RLH and the BUBRF models. Under both specifications we conclude that subjects assign significantly lower weight to a new signal in ambiguous environments than in risky environments. The difference between these two models is the way new information is incorporated in relation to the original belief. Under the RLH specification, the weight on the initial belief is treated similarly to the weight of the signals. On the other hand, the BUBRF allows for the weight on the initial belief to 29 be different from the weight on the signal. Furthermore, the weight on a new signal under the RLH specification, by construction, is greater than implied by Bayesian Updating. In contrast, the estimates of the BUBRF model suggest significant under-weighting (relative to the Bayesian Updating) of new information in ambiguous environments. Because the RLH model is a special case of the BUBRF model (equation (9)), we can test the hypothesis that subjects behave according to the RLH against the alternative that they behave according to the BUBRF. H-RLH: Updating behavior is according to the RLH. (N0ij = 3β , β−1 β > 1) Using a likelihood ratio test for nested models we obtain a p-values that are less than .001 for all RLH1-RLH6 models when tested relative to the BUBRF1 model, and therefore there is statistically significant evidence against the RLH specifications. This result emphasizes the importance of treating weights of the initial belief separately from weights of the signal, and allowing for under-reaction to new information. 4.6 Positive vs. Normative In this section we will discuss the implications of a difference between the positive and the normative approaches for the decision making under uncertainty. Particularly, under the positive approach we consider the BUBRF7, the best descriptive model according to the Akaike Information Criterion. And under the normative approach we consider the standard Bayesian updating when starting from the diffuse prior and adjusting for the composition procedure.18 Table 7 presents parameters for the best descriptive model, BUBRF7, and the prescriptive model for both, the C and the A urns when each is presented first. 18 The diffuse prior, or state of complete ignorance, corresponds to N−1 = 0. In other words, before we started composing the urns, participants were completely ignorant of urn composition. During construction of the A urn we added one black and one white marble, hence N0A = 2 and p0A = .5. The construction procedure of the C urn implies N0C = 4 and p0C = .5. 30 Table 7: Models Positive .47 (.03) .40 (.03) .02 (.30) 1.48 (.53) 1.00 .48 (.11) p10C p20A 1 N0C 2 N0A 1 βC 2 βA Normative .50 .50 4.00 2.00 1.00 1.00 j pj0i is the initial belief about probability of black marble being drawn; N0i is the weight placed on j the initial belief; βi is the weight placed on the new signal; Left column of Figure 5 presents priors estimated under BUBRF7, while right column presents priors that participants “should have used”. It appears that participants placed more emphasis on the extreme outcomes for both the C and the A urn. And, even when the mean is consistent with the reduction of compound lotteries, as is the case for the C urn, the prior itself is vastly different. This result is interesting in of itself and deserves to be explored in future studies. 2.5 Positive 2.0 C1 : p0 = 0.47, N0 = 0.02 A2 : p0 = 0.4, N0 = 1.48 C1 : p0 = 0.5, N0 = 4 A2 : p0 = 0.5, N0 = 2 0.5 0.5 0.0 0.0 0.00 Normative Density 1.0 1.5 Density 1.0 1.5 2.0 2.5 Figure 5: Priors 0.25 0.50 0.75 Initial Belief (p0 ) 1.00 31 0.00 0.25 0.50 0.75 Initial Belief (p0 ) 1.00 The difference in learning behavior between compound and ambiguous environments, which is the main result of this paper, is demonstrated on the left column of Figure 6. Specifically, in ambiguous environments participants significantly underweight the new information and as a result the updating process is less volatile. Figure 6: Belief Updating Normative 0.7 0.7 Positive C1 A2 0.3 0.3 0.4 0.4 Belief (p) 0.6 0.5 Belief (p) 0.5 0.6 C1 A2 0 1 2 3 Draw Round (t) 4 0 1 2 3 Draw Round (t) 4 Let us consider the learning behavior about the C urn ( Figure 6, red line) . What appears to be a significant over-weighting of the first signal in the left column (positive view) , as compared to the right column (normative view), is attributed to the difference between the subjective and the objective priors and not to the difference in the updating behavior. In fact, the updating behavior after drawing rounds 3 and 4 is very much the same between the left and the right columns. Interestingly, if instead of estimating the prior we had imposed the objective prior for the C urn, the updating behavior would be consistent with significant over-weighting of new information and the base rate fallacy heuristic, and hence violation of the Bayesian Updating rule This result emphasizes importance of understanding the initial belief formation and needs further investigation. 32 The learning behavior about the A urn ( Figure 6, blue dotted line) is different between the positive and normative views in two dimensions - the initial belief and the updating behavior. First, the subjective initial belief is significantly lower than the one derived from the urn composition procedure. Second, the new information is significantly under-weighted, which is consistent with the conservatism heuristic documented in the literature. 5 Conclusion The goal of our paper is to contribute to the understanding of human probability judg- ment in uncertain environments, and more specifically to the understanding of the reaction to new information under uncertainty. We designed and conducted an economic experiment in order to estimate differences in the learning processes of agents under risk and ambiguity. Participants were required to make sequential choices over pairs of lotteries involving two types of urns: (i) a risky urn that was built using a known randomization device, and (ii) an ambiguous urn where the composition process of was unknown to the participants. For each urn type we estimated two models of learning: (1) Bayesian updating allowing for base rate fallacy (BUBRF); (2) Reinforcement Learning (RLH). We find a significant evidence for difference in learning behavior under both models. We find evidence that the order of presentation affects the subjective prior considered by the participants. Specifically, the initial beliefs about the compound urn are significantly lower after the participants have been exposed to ambiguous lotteries. We interpret this result as evidence of anchoring. This effect, although undocumented in the ambiguity aversion literature to the extent of our knowledge, is widely known in psychology as the anchoring effect, named by Tversky and Kahneman (1974). This meta-anchoring effect is worth exploring in greater detail in the future. The importance of treating the initial beliefs separately from the new information is highlighted when we test, and reject, the RLH specification against more general BUBRF. 33 We find that the adjustment rate is significantly lower in ambiguous environments than in risky environments for both the RLH and BUBRF models. Furthermore, the rate at which new information is incorporated in ambiguous environment is significantly lower than implied by the Bayesian updating. By introducing an additional degree of freedom to capture weights of the initial beliefs we find that new information in the compound environment is incorporated at a rate consistent with Bayesian Updating. The apparent deviation from the Bayesian paradigm could be the result of the considered models not being designed with ambiguity in mind, which could oversimplify the updating process. Recently, some theoretical models that focus on the updating of beliefs under ambiguity have been developed, most notably Epstein and Schneider (2007) multiple prior model. The observed conservatism in the updating process could be the result of an agent acting according to the worst case scenario prior in the set of considered priors each of which is updated according to Bayesian updating rule. Unfortunately the nature of our data does not allow us to test for these hypotheses. The observed lower initial anchor probability is also consistent with a setting that chooses the worst case scenario (or a convex combination between the best and worst case scenarios) out of a set containing more than one prior. Further research will include more elaborate experimental designs that allow for the estimation of multiple prior models. The documented differences in the updating behavior under risk and under ambiguity are also relevant when explaining the phenomena of momentum and overreaction to information in financial markets. In particular it is plausible that the ambiguous nature of information causes the market to under-react in the short run, and as ambiguity dissipates, the agents adjust their portfolios and (possibly) overreact in the long run. This type of anomalies in the updating heuristics make room for trading strategies that yield abnormal returns, as documented by Shefrin (2002) and Jegadeesh and Titman (1993). 34 Acknowledgments We owe thanks to Dale Stahl for his comments and funding for this project. References Barberis, N., Shleifer, A., Vishny, R., 1998. A model of investor sentiment. Journal of financial economics 49, 307–343. Camerer, C., 1995. Individual decision making, in: Handbook of Experimental Economics. Princeton University Press, pp. 587–703. Camerer, C.F., Loewenstein, G., Rabin, M., 2003. Advances in Behavioral Economics. Princeton University Press. Corgnet, B., Kujal, P., Porter, D., 2012. Reaction to public information in markets: How much does ambiguity matter? *. The Economic Journal . El-Gamal, M.A., Grether, D.M., 1995. Are people bayesian? uncovering behavioral strategies. Journal of the American statistical Association , 1137–1145. Ellsberg, D., 1961. Risk, ambiguity, and the savage axioms. The Quarterly Journal of Economics 75, 643–669. ElSalamouny, E., Krukow, K.T., Sassone, V., 2009. An analysis of the exponential decay principle in probabilistic trust models. Theoretical computer science 410, 4067–4084. Epstein, L.G., Schneider, M., 2007. Learning under ambiguity. The Review of Economic Studies 74, 1275–1303. Fischbacher, U., 2007. z-tree: Zurich toolbox for ready-made economic experiments. Experimental Economics 10, 171–178. 35 Fudenberg, D., Levine, D.K., 1998. The Theory of Learning in Games. The MIT Press. Gilboa, I., Schmeidler, D., 1989. Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18. Goeree, J.K., Palfrey, T.R., Rogers, B.W., McKelvey, R.D., 2007. Self-correcting information cascades. Review of Economic Studies 74, 733–762. Halevy, Y., 2007. Ellsberg revisited: An experimental study. Econometrica 75, 503–536. Hanany, E., Klibanoff, K., 2009. Updating ambiguity averse preferences. The B.E. Journal of Theoretical Economics 9. Harrison, G.W., Cox, J.C., 2008. Risk aversion in experiments. volume 12. Emerald Group Pub Ltd. Harrison, G.W., Rutström, E.E., 2008. Risk aversion in the laboratory . Hey, J., 2002. Experimental economics and the theory of decision making under risk and uncertainty. The GENEVA Papers on Risk and Insurance-Theory 27. Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: Implications for stock market efficiency. The Journal of Finance 48. Jordan, S.J., 1995. Bayesian learning in repeated games. Games and Economic Behavior 9, 8–20. Kahneman, D., Frederick, S., 2002. Representativeness revisited: Attribute substitution in intuitive judgment. Heuristics and biases: The psychology of intuitive judgment . Kahneman, D., Tversky, A., 1973. On the psychology of prediction. Psychological review 80, 237. Knight, F., 1921. Risk, Uncertainty, and Profit. University of Chicago Press. 36 Massey, C., Wu, G., 2005. Detecting regime shifts: The causes of under- and overreaction. Management Science 51, 932–947. Rabin, M., Schrag, J.L., 1999. First impressions matter: A model of confirmatory bias*. Quarterly Journal of Economics 114, 37–82. Shefrin, H., 2002. Beyond greed and fear: Understanding behavioral finance and the psychology of investing. Oxford University Press, USA. Stahl, D.O., 2011. Heterogeneity of ambiguity preferences. Working Papers . Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning: An Introduction. The MIT Press. Thaler, R.H. (Ed.), 2005. Advances in Behavioral Finance, Volume II. Princeton University Press. Tversky, A., Kahneman, D., 1974. Judgment under uncertainty: Heuristics and biases. Science 185, 1124 –1131. 37
© Copyright 2026 Paperzz