
Learning Under Compound Risk vs. Learning Under Ambiguity - An Experiment
Othon M. Moreno∗
Yaroslav Rosokha†
July 2013
Abstract
We design and conduct an economic experiment to investigate the learning process of the agents under compound risk and under ambiguity. We gather data for
subjects choosing between lotteries involving risky and ambiguous urns. Decisions are
made in conjunction with a sequence of random draws with replacement, allowing us
to estimate the beliefs of the agents at different moments in time. For each of the
urn types we estimate the initial prior and a model of Bayesian updating allowing for
base rate fallacies. Our findings suggest an important difference in updating behavior
between risky and ambiguous environments. Specifically, after controlling for the initial prior, when updating under ambiguity subjects significantly underweight the new
signal, while when updating under compound risk subjects are essentially Bayesian.
JEL Classification Codes: C81, C91, D81, D83
Keywords: Ambiguity; Risk; Discrete Choice; Experiments; Learning.
∗ Banco de México. Email: [email protected]
† Purdue University, Krannert School of Management. Email: [email protected]
1 Introduction
Decision-making under uncertainty is one of the most important areas of study in eco-
nomics both in its single-agent (decision theory) and multiple-agent (game theory) forms.
The current paradigm in economic modeling for the agent’s decision under uncertainty over
time relies heavily on the Bayesian Updating (BU) rule which specifies how new information
is incorporated into the decision making problem. The results of experimental testing in
the past decades, however, are unsettling (see Camerer (1995) for a comprehensive survey of
these results). In fact, the recurrent violations of BU predictions have been fertile ground for
alternative decision-making models such as bounded rationality (Kahneman and Frederick
(2002), Rabin and Schrag (1999)).
As first noted by Knight (1921), it is important to distinguish between risk (uncertainty
with known probabilities) and ambiguity (uncertainty with unknown probabilities). In particular we say that a problem is ambiguous if there is not sufficient information to generate
a unique prior probability. An example of a purely risky problem is playing roulette, where
all (rational) players agree on a unique probability of generating the outcomes and where
the potential gains and losses can be interpreted as odds. However, there are many real-life problems where agents need to make decisions with, at best, imprecise information. Take, for example, the decision of constructing a stock portfolio, where the nature of uncertainty cannot be reduced to odds, or making decisions using conflicting recommendations from two or more experts. This is an important area in economics and psychology that encompasses both normative and positive decision theory.
Ellsberg (1961) made an important point: attitudes towards risk are distinct from attitudes towards ambiguity. Experimental studies by Stahl (2011) and Halevy (2007) show
that, while we can explain the aggregate behavior using an ambiguity averse model, there is
substantial heterogeneity at the individual level, with a significant fraction of participants being ambiguity-neutral. While these studies focus on decision-making under ambiguity
in static environments, in the real world, people often have to revise their decisions (learn)
upon arrival of new information. In their seminal article, Kahneman and Tversky (1973)
present evidence that individuals over-value new information relative to Bayes rule (a judgment bias known as representativeness). El-Gamal and Grether (1995), using a compound
lottery set-up, show that the most important rules used by the subjects are: (a) Bayes rule,
(b) representativeness, and (c) conservatism (under-weighting the new signal). Both of
these studies, however, consider only learning in risky environments.
In this paper, we ask the question “how do people learn in uncertain environments?” We
focus on the problem of how people incorporate new information, as it becomes available, to
refine their assessment of the likelihood of events. In particular, we investigate whether the
learning process in an ambiguous environment differs from the learning process in a compound risky environment. The applications we have in mind are problems such as portfolio re-balancing in
the face of new information, which involve a continual judgment of the likelihood of events.
In fact, there have been recent efforts on the theoretical front to explain the process used to
incorporate new information in dynamic problems involving ambiguous beliefs (Epstein and
Schneider (2007) and Hanany and Klibanoff (2009)). But, as pointed out by Hey (2002)
and Camerer et al. (2003), problems of this kind are virtually unexplored in experimental
settings.
We design and conduct an economic experiment to compare the learning process in risky
environments and the learning process in ambiguous environments. The participants are
required to choose between pairs of lotteries involving urns of black and white marbles
with unknown proportions. Repeated sampling with replacement from such urns allows the
participants to learn (update their beliefs about) the composition of the urn. In order to
identify the difference between updating under ambiguous and risky scenarios we use two
types of urns: “compound” and “ambiguous”. The composition of the “compound” urn is
the result of a known randomization device, in other words, the objective probabilities are
told to the subjects. The composition process of the “ambiguous” urn is unknown to the
participants, that is, the objective probabilities are not told to the subjects.
We develop a model of subjective belief updating that combines features of "anchor-and-adjustment" type models with Bayesian updating type models. An assumption that
permits us to relate the two is that the prior is distributed according to a Beta distribution.
The properties of a Beta distribution lead to an updating process that is analytically tractable
and can be estimated using standard gradient-descent methods. Each prior is characterized
by two parameters - the mean and the weight, and the updating process is characterized
by the weight placed on the new signal. Thus, while weights that are placed on the initial
beliefs provide an insight into the belief formation, weights that are placed on the new signal
capture learning behavior.
We find that there is a significant difference at the aggregate level between learning in
ambiguous environments and learning in risky environments. Specifically, when updating
under ambiguity subjects significantly underweight the new signal (conservatism), while
when updating under risk subjects tend to over-weight the new signal (representativeness);
however, by adding a degree of freedom to capture the strength of initial beliefs we find that
the behavior in risky environments is consistent with the Bayesian updating rule. We also
find that the initial prior for the compound urn is not the one that participants "should have
formed” as a result of the composition process. Specifically, participants place most of the
weight on the extreme outcomes, while disregarding the middle outcome.
Papers related to ours are Massey and Wu (2005) and Corgnet et al. (2012). Massey and Wu (2005) employ an experimental paradigm to study the detection of regime switching by participants who know the true parameters of the underlying process. Varying the probability with which the regime is likely to switch, they find that individuals are most prone to under-reaction in unstable environments with noisy signals. In our view, the relevant parameters of a process in the real world are rarely specified, and it is crucial to know any differences in behavior that might arise due to the presence of ambiguity. We find that in ambiguous environments subjects place significantly lower weight on the new information when
compared to the risky environment. Corgnet et al. (2012) examine the effect that ambiguity
has on financial markets in an experimental setting. They find that the role of ambiguity in explaining price anomalies is limited. Our study is more fundamental in nature: instead of looking at experimental market outcomes, such as price and quantity traded, which might be affected by the market structure itself, our goal is to investigate belief updating directly.
Our finding of significant under-reaction to new information in ambiguous environments, and corresponding over-reaction in risky environments, may explain some of the systematic errors by investors documented in the behavioral and empirical finance literature (Thaler (2005), Jegadeesh and Titman (1993), Barberis et al. (1998), and Shefrin (2002)). It is particularly relevant for an environment where investors have difficulties interpreting the information and might hold on to the investment (under-react) until they are sure that the "good news" is not transitory (Barberis et al. (1998)). While Jegadeesh and Titman (1993) note the possibility of distinct updating behavior attributed to different information structures, their argument that ambiguity arises when evaluating long-term prospects requires the investors to overreact under ambiguity. This is in contrast to Barberis et al. (1998) and Shefrin (2002), who attribute the ambiguity to the short run and suggest under-reaction due to ambiguity.
The deviation from the Bayesian paradigm in an ambiguous environment may also explain some of the learning behavior in repeated games where there is uncertainty about an
opponent’s type. For example, the learning rule underlying fictitious play is equivalent to
belief updating using Bayes rule in the face of observed play (Fudenberg and Levine (1998)).
And, if all players use Bayesian updating then play converges to a Nash equilibrium (Jordan
(1995)). However, when there is not enough information to objectively form a unique prior about an opponent's type (in other words, when the environment is intrinsically ambiguous), differences in learning behavior in response to a signal (action by an opponent) will have strong
implications for both the learning dynamics and the equilibrium convergence.
Our results delineate the effect of ambiguity on learning behavior in a controlled laboratory setting; specifically, we find significant evidence of under-reaction to new information under ambiguity. We also find that controlling for the initial prior matters, even in the
case when there is enough information to objectively form a unique prior, as is the case for
the compound environment. The rest of the paper is organized as follows: In Section 2 we
describe the experimental design, elicitation procedure, and present an overview of the data.
In Section 3 we introduce the learning models and estimation procedure used. In Section 4
we test hypotheses of interest and discuss the results. Finally, in Section 5 we conclude.
2 Experimental Design
The task in the experiment is simple: a sequence of marbles is drawn with replacement
from an urn whose precise black/white composition is unknown to the participant. Concurrent with the draws, the participant is asked a series of questions in the form of:
Please pick one of the two alternatives:
A: $28 if the marble drawn is black, $0 otherwise
B: $X
where X is some fixed amount.1 Successive draws are made from the same urn, replacing
the marble after each draw. As each new marble is drawn, the color of the marble is new
information regarding the composition of the urn. This new information could affect the
decision-maker’s beliefs about the black/white composition of the urn, and hence the subsequent valuation of options A and B. Our methodology allows us to track the valuations of the
options at every drawing round, providing indirect evidence on the updating process. At the
end of the experiment, participants are compensated according to an incentive compatible
procedure.2
1 The amount of $X offered to the participants varies significantly in the range $1 to $27.
2 One lottery is picked at random and carried out at the end of the experiment.
2.1 Urn Types
Urns in the experiment differ based on the information regarding their composition.
Specifically, we distinguish three types of urns: risky urns (R), whose exact composition is known to the participants; compound urns (C), whose composition "process" is known;3 and ambiguous urns (A), whose composition process is unknown.4
Figure 1: Urns (R1, R2, R3, C, A)
Figure 1 presents the three risky urns (R1-R3 ), the compound urn (C ), and the ambiguous urn (A) used in the experiment. We will use urns R1 -R3 to elicit the risk preferences of
the participants, while the C and the A urns will be used to investigate the learning behavior
of the participants.
2.2 Treatments
Treatments of the experiment vary according to the order in which urns are presented.
We refer to all tasks corresponding to the same urn as a stage. Specifically, we distinguish
five stages of the experiment: during stages 1-3, the participants are presented with urns R1-R3; during stages 4 and 5, the participants are presented with either (i) the C urn or (ii) the
A urn. The order in which urns are presented will distinguish treatment 1 and treatment 2.
3 Four urns of known compositions (two R1, one R2, and one R3) are placed in a box and one is randomly drawn to be the C urn used for the draws.
4 Unlike the C urn, no objective probabilities are given for the composition of the A urn - there is ambiguity about the number of black marbles.
Treatments are summarized in Table 1.5 Notice that the three risky urns are always shown in the same order for both treatments, while the order of the C and the A urns changes
between treatments.
Table 1: Treatments

Stage          1    2    3    4    5
Treatment 1    R1   R2   R3   C    A
Treatment 2    R1   R2   R3   A    C
We will use the following notation throughout the paper: superscript j in x^j_i will refer to the treatment (j ∈ {1, 2}), while subscript i will refer to the type of urn (i ∈ {R1, R2, R3, C, A}).
2.3 Decision Tasks
We use a modified version of the iterative Multiple Price List Design (iMPL) as described
by Harrison and Cox (2008) (Section 3.1). In particular, in each decision round a participant
is presented with the following task:
The outcome of the lottery is based on the color of the ball that will be drawn
from urn i ∈ {R1 , R2 , R3 , C, A}. Please choose between lotteries A and B for
each question.
      A           B
1)   $6     vs    $28 if black, $0 otherwise
2)   $9     vs    $28 if black, $0 otherwise
3)   $12    vs    $28 if black, $0 otherwise
4)   $14    vs    $28 if black, $0 otherwise
Note that the payoff for the safe option, A, varies throughout the experiment. For details about the design of the decision tasks please refer to Appendix A.
5 Ri - risky urn (composition is known). C - compound urn (composition process is known). A - ambiguous urn (composition process is unknown).
2.4 Draws
During stages 4 and 5 a sequence of draws with replacement is observed by the participants. These draws do not affect the payoffs but serve as a signal about the urn composition. Specifically, 3 draws from an urn are made and then a new decision task is presented to the participants.6 We refer to each decision task as a round. For example, round 0 represents
choices made before any draws are observed, round 1 represents choices after the first set of
3 draws was observed, round 2 represents choices after the second set of 3 draws, and so on.
Therefore, in total, we have 5 rounds during stages 4 and 5.
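For concreteness, the timing of a stage can be sketched as follows; this is a minimal simulation of the draw structure only (the true urn proportion used here is a hypothetical placeholder, and this is not the experimental software):

```python
import random

def simulate_stage(p_black=0.5, sets_of_draws=4, draws_per_set=3, seed=1):
    """Sketch of the stage 4-5 timing: a round-0 decision task with no draws,
    then 4 sets of 3 draws with replacement, each followed by a new decision task."""
    rng = random.Random(seed)
    history = []                      # all draws observed so far
    rounds = [list(history)]          # round 0: no draws observed yet
    for _ in range(sets_of_draws):
        new_draws = ["black" if rng.random() < p_black else "white"
                     for _ in range(draws_per_set)]
        history.extend(new_draws)
        rounds.append(list(history))  # a decision round follows each set of draws
    return rounds                     # 5 rounds, 12 draws in total by round 4

for t, seen in enumerate(simulate_stage()):
    print(f"round {t}: {len(seen)} draws observed")
```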
2.5 Administration and Data
Seventy-eight subjects were recruited for the experiment using ads posted at different
locations on the University of Texas at Austin campus.7 Ten sessions of the experiment
were administered with numbers of participants varying between six and eleven. The experiment was programmed and conducted by the authors using the software z-Tree (Fischbacher
(2007)). All lotteries were executed by physical randomization devices.
Of the seventy-eight subjects there were four whose responses are omitted because they displayed no understanding of the task at hand.8 The data is summarized by session and is
presented in Table 2.
Table 2: Data Summary

Session    N    Treatment    Av. Earn.
1          9    1            15.67
2          5    1            23.40
3         11    1            11.55
4          7    1            19.14
5         10    2            21.50
6          5    2            19.00
7          6    2            21.67
8         10    2            11.40
9          6    1            17.33
10         5    2            15.00
Overall   74    –            17.57
6 The choice of 3 draws (12 draws total) was made based on the average rate of learning across 1000 simulations of random urns.
7 All subjects were restricted to be undergraduate students.
8 They chose either all A or all B for at least half of the rounds in a sequence.
In total each participant made 52 decisions, but only one was selected randomly and carried out after all decisions had been made. The duration of the experiment was about
45 minutes with an average payoff of $12.57 from the responses plus a $5.00 show up fee.
Thus, on average each question was worth $0.24.
3 Estimation
In this section we present the estimation procedure and the learning models considered.
3.1 Estimation Procedure
We assume throughout the paper that the aggregate behavior can be summarized by a
representative agent whose utility function is parametrized using a normalized version of the
CRRA functional form.9 We use a CRRA utility representation of the form:
u(x) = (x^{1-γ} - 1) / (1 - γ)        (1)
where x is the lottery prize and γ ≠ 1 is the risk aversion parameter to be estimated. Thus γ = 0 corresponds to a risk neutral agent, and γ > (<) 0 corresponds to a risk averse (loving)
agent. To ensure that the possible utility values lie between zero and one for all possible
payoffs, a further normalization proves to be useful:
U(x) = (u(x + F) - u(F)) / (u(28 + F) - u(F))        (2)
where F is the show-up fee paid regardless of the outcome of the lottery.
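The utility specification is straightforward to transcribe; the sketch below is a direct reading of equations (1) and (2) in Python, with the show-up fee F set to the $5.00 used in the experiment and the top prize set to $28:

```python
def crra(x, gamma):
    """CRRA utility, equation (1); requires gamma != 1 and a positive argument."""
    return (x ** (1 - gamma) - 1) / (1 - gamma)

def normalized_utility(x, gamma, F=5.0, top_prize=28.0):
    """Normalized utility, equation (2): maps payoffs in [0, top_prize] into [0, 1]."""
    return (crra(x + F, gamma) - crra(F, gamma)) / (crra(top_prize + F, gamma) - crra(F, gamma))

# For example, normalized_utility(14, gamma=0.45) is the utility of the $14 safe amount
# under the risk aversion estimate reported in Section 4.1.
```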
The representative agent chooses the option with the highest expected value given
her current belief, subject to an error, which is assumed to be distributed according to a
logistic distribution centered at zero. Using the maximum likelihood approach we estimate a latent structural model of choice of the form:

P_{A,t} = 1 / (1 + e^{-λ(E_{p_t}[U(A_{i,t}) - U(B_{i,t})])}),        (3)

9 We pool the choices made by all participants and estimate a single set of parameters for each model.
where P_{A,t} is the probability that the subject chooses lottery A at round t; A_{i,t} and B_{i,t} are the i-th choice pair presented to the participants in round t; p_t represents the belief that a marble drawn in round t will be black; and λ is a parameter capturing the precision with which the
agent makes a choice when evaluating the difference in expected payoffs between lotteries.
Although there might be differences in the precision parameter depending on urn types or
treatments, since we have no theory to explain such differences and to avoid over-fitting
of the data, we impose one λ for the purposes of structural estimation of our models. The
learning models tested in this paper differ on the way pt evolves over time as new information
is revealed to the agents. Detailed description of these models follows in Sections 3.2.1 and
3.2.2.
A useful characteristic of the theories we consider is that each can be identified by an
initial prior, p0 , and an updating rule specified by a parameter vector θ. So for a given p0
and θ, at any point in time, we can elicit indirectly the beliefs about the urn composition, i.e.
the probability of the next drawn marble being black. Using this probability we can obtain
an expected utility value for the two alternatives presented. Using the Logit specification in
equation (3) we obtain maximum likelihood estimates of θ and p_0.
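As an illustration of how the estimation works, the following sketch (our own, not the authors' code) evaluates the choice probability in equation (3) and the resulting sample log-likelihood, taking the belief path p_t as given from one of the learning models below; option A is treated as the fixed dollar amount and B as the $28-if-black lottery, following the task display in Section 2.3:

```python
import math

def u_norm(x, gamma, F=5.0):
    """Normalized CRRA utility from equations (1)-(2), with show-up fee F and top prize $28."""
    crra = lambda z: (z ** (1 - gamma) - 1) / (1 - gamma)
    return (crra(x + F) - crra(F)) / (crra(28.0 + F) - crra(F))

def prob_choose_A(p_t, x_safe, gamma, lam):
    """Equation (3): logit probability of choosing the fixed amount A = $x_safe over
    the lottery B = ($28 if black, $0 otherwise), given belief p_t that the marble is black."""
    eu_A = u_norm(x_safe, gamma)
    eu_B = p_t * u_norm(28.0, gamma) + (1 - p_t) * u_norm(0.0, gamma)
    return 1.0 / (1.0 + math.exp(-lam * (eu_A - eu_B)))

def log_likelihood(data, gamma, lam):
    """data: iterable of (chose_A, p_t, x_safe) observations; returns the log-likelihood
    to be maximized over the parameters (the learning-model parameters enter through p_t)."""
    ll = 0.0
    for chose_A, p_t, x_safe in data:
        pA = prob_choose_A(p_t, x_safe, gamma, lam)
        ll += math.log(pA if chose_A else 1.0 - pA)
    return ll
```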
3.2 Learning Models
We consider two learning models: 1) Reinforcement Learning Heuristic (RLH) (Section
3.2.1) for which the current belief is the exponential recency-weighted average of observed
signals; and 2) the Bayesian updating model with Base Rate Fallacy (BUBRF) (Section 3.2.2), which generalizes standard Bayesian updating by allowing new signals to have a different weight relative to the prior.
3.2.1 Reinforcement Learning Heuristic (RLH)
The learning behavior for the Reinforcement Learning Heuristic can be characterized by
a simple updating rule that is a convex combination between the belief at time t − 1 and the
information provided by the new signal, st . To formalize this type of model, let pt be the
belief about the probability of a black marble being drawn in round t, st the signal at round
t, and ρ ∈ (0, 1) the weight assigned to the new signals. Then this belief evolves over time
according to the following equation:
p_t = (1 - ρ) × p_{t-1} + ρ × (s_t / 3),        (4)

where s_t/3 is the fraction of black balls drawn at round t. This formulation of p_t is also interpreted as an exponential, recency-weighted average of past signals (Sutton and Barto (1998)). Using equation (4) in equation (3) we can jointly estimate γ, p_0, ρ, and λ.
3.2.2 Bayesian Updating with Base Rate Fallacy (BUBRF)
Goeree et al. (2007) introduce a generalization of the Bayesian model for a two-round, two-urn setting that allows for deviations from Bayesian updating (BU). We extend their framework to a more general setting with more than two periods and more than two urns. Then, assuming that the subjective prior is distributed according to a Beta distribution, and using the principle of exponential decay (ElSalamouny et al. (2009)), we reformulate the generalized BU rule in a form similar to (4).
Intuition. Consider an agent who observes one signal s from the C urn but acts as if she
observed two identical signals. She can use Bayes’ Rule to compute the probability that
draws were made from the R1 urn given the first observation:
Pr(R1 | s) = Pr(s | R1) Pr(R1) / Σ_{p ∈ {R1, R2, R3}} Pr(s | p) Pr(p)    (posterior = likelihood × prior, normalized)
Then, given the second observation, the agent would update her beliefs that draws were
made from the R1 :
Pr(R1 | s, s) = Pr(s | R1) Pr(R1 | s) / Σ_{p ∈ {R1, R2, R3}} Pr(s | p) Pr(p | s) = Pr(s | R1)^2 Pr(R1) / Σ_{p ∈ {R1, R2, R3}} Pr(s | p)^2 Pr(p),

where the previous posterior Pr(R1 | s) plays the role of the prior (the history) and the likelihood is squared.
Similarly, if the same agent acted "as if" she observed three identical signals, the likelihood would be raised to the power of 3. Notice, however, that this agent has to start with a prior in order to apply Bayes' rule. This is not a problem for the C urn because there is enough information provided to objectively form a unique prior. But for the A urn it is not as straightforward. Our approach will be to estimate the prior for both the C and the A urns.
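To make the intuition concrete, the following sketch applies the "as-if" rule to a discrete set of candidate urns. The prior weights (two R1 urns, one R2, one R3 in the box) follow the composition process described in Section 2.1; the black-marble probabilities attached to R1-R3 are hypothetical, since the exact compositions are not reproduced here:

```python
def as_if_posterior(prior, likelihood, signal, beta):
    """Posterior over candidate urns when one signal is treated 'as if' it were
    observed beta times, i.e. the likelihood is raised to the power beta."""
    weights = {urn: likelihood(signal, urn) ** beta * prior[urn] for urn in prior}
    total = sum(weights.values())
    return {urn: w / total for urn, w in weights.items()}

# Hypothetical probabilities that a drawn marble is black for each candidate urn.
p_black = {"R1": 0.3, "R2": 0.5, "R3": 0.7}
prior = {"R1": 0.5, "R2": 0.25, "R3": 0.25}   # two R1 urns, one R2, one R3 in the box
lik = lambda signal, urn: p_black[urn] if signal == "black" else 1 - p_black[urn]

print(as_if_posterior(prior, lik, "black", beta=1))  # standard Bayes after one black draw
print(as_if_posterior(prior, lik, "black", beta=2))  # acts as if two black draws were seen
```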
Model. Let a scalar β capture the number of “as-if” observations.10 In other words,
β captures the weight players place on the new signal relative to the prior. In order to
distinguish between the old and the new information (signals) we will use β^t. It is necessary
either to impose a prior or to estimate a subjective prior over the urn composition. We
choose the second option. Furthermore, we will not impose or limit a prior to be over
specific proportions. Equation (5) presents the model in the most general form:
P(p | H_t) = L(s_t | p)^{β^t} P(p | H_{t-1}) / ∫_0^1 L(s_t | z)^{β^t} P(z | H_{t-1}) dF(z)        (5)
where P(p | H_t) is the posterior belief over p at round t, L(s_t | p) is the likelihood of observing signal s_t given p, and P(p | H_{t-1}) is the prior over p at t - 1. Our notation implies that signal s_t is incorporated into the history of signals H_t.
In order to make the model tractable we introduce the following assumption.
Assumption 1: The prior, P(p | H_{t-1}), is distributed according to a Beta distribution.
Let us suppose that the prior at time t is distributed according to a Beta distribution with
10 For example, if β = 1 an agent acts "as if" she observed one signal (i.e., Bayesian). If β = 2 an agent acts "as if" she observed two signals.
parameters a_{t-1} and b_{t-1}; i.e., H_{t-1} is summarized by (a_{t-1}, b_{t-1}). First, note that after observing a signal,11 s_t, the posterior is also distributed according to a Beta distribution with parameters
a_t = a_{t-1} + β^t s_t
b_t = b_{t-1} + β^t (3 - s_t)        (6)
where a_t is the number of successes and b_t is the number of failures perceived by the subject at round t. This formulation is equivalent to the one used by ElSalamouny et al. (2009) for beta-based computational trust frameworks.12 Then using the property of the Beta distribution we can express the mean p̄_t as follows:
distribution we can express the mean p̄t as follows:
h
p̄t
where ρ̄t =
t
i
p̄t−1 + at−1β+bt−1 st
at−1 + β t st
at
h
i
=
=
=
t
at + b t
at−1 + bt−1 + 3β t
1 + at−13β+bt−1
st
= (1 − ρ̄t ) × p̄t−1 + ρ̄t ×
3
3β t
.
at−1 +bt−1 +3β t
(7)
Notice that estimation of (7) comes down to being able to estimate a_0, b_0, and β, which fully characterize the law of motion of the beliefs.
A further modification proves to be extremely useful for interpretation and estimation. Let p_t = a_t / (a_t + b_t) and N_t = a_t + b_t; then, instead of estimating a_0 and b_0, we can estimate p_0 and N_0. This is useful because we can directly estimate the belief, p_0, from the responses provided by the agents in round 0, and N_0 can be interpreted as the weight of that prior belief, which will be estimated jointly with the weights of the signals. In this way we can rewrite equation (7) as follows:
p̄_t = (N_{t-1} / N_t) × p̄_{t-1} + (3β^t / N_t) × (s_t / 3)        (8)
11 Recall that we are drawing three marbles between the decision rounds.
12 The difference being that ElSalamouny et al. (2009) normalize the strength of the most recent signal to 1 (which in our setting just means dividing everything by β^t).
where N_t = N_{t-1} + 3β^t. That is, N_t tracks the number of draws perceived by the subject. This model has a natural interpretation relative to Bayesian updating. A value of β > 1 implies over-weighting of new information relative to past beliefs (representativeness heuristic). A value of β < 1 corresponds to under-weighting of new information (conservatism heuristic). And β = 1 is simply the standard Bayesian updating rule. In the special case when a subject starts with N_0 = 0 and uses the Bayesian updating rule, N_t is the actual number of draws made up to round t.
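A minimal sketch of the BUBRF recursion in equations (6)-(8): the round-t signal enters with weight β^t, β = 1 reproduces standard Bayesian updating from the prior (p_0, N_0), β < 1 produces conservatism, and β > 1 produces representativeness:

```python
def bubrf_beliefs(p0, N0, beta, signals):
    """Equations (6)-(8): mean belief p_t with perceived number of draws N_t,
    where the round-t signal (0..3 black marbles) carries weight beta**t."""
    p, N = p0, N0
    path = [p]
    for t, s_t in enumerate(signals, start=1):
        w = 3 * beta ** t                    # weight contributed by the three new draws
        p = (N / (N + w)) * p + (w / (N + w)) * (s_t / 3.0)
        N = N + w                            # N_t = N_{t-1} + 3 * beta^t
        path.append(p)
    return path

# e.g. bubrf_beliefs(p0=0.5, N0=2, beta=1, signals=[2, 1, 3, 0]) is plain Bayesian updating.
```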
Relationship between RLH and BUBRF. Let us consider the difference between RLH
and BUBRF models in more detail. First, from equation (4), since ρ is constant over time,
the implication for the BUBRF model is that 3β^t / N_t is constant over time, which means that

N_t = 3β^{t+1} / (β - 1)        (9)

and ρ = (β - 1) / β with β > 1. Notice that when β → 1, N_t → ∞ for all t, which is equivalent to ρ → 0 and means that the new signal does not affect the initial belief. Also notice that under the RLH specification the weight of a new signal is always a constant proportion of the past signals, while the weight of the prior, N_0, is 3β / (β - 1).13 In contrast, BUBRF in equation (7) does not impose any relationship between the weight of the prior and the weight of a new signal; furthermore BUBRF is not restricted to the over-weighting behavior (β > 1).

13 β > 1 and N_0 > 3.
Figure 2: BUBRF with β = 2. Beliefs (p) over draw rounds (t) for N_0 = 0, N_0 = 1000, N_0 = 6, RLH*, and Bayes**.
*RLH with ρ = .5.
**Bayesian updating starting from the prior implied by the principle of insufficient information (p_0 = .5, N_0 = 2, β = 1).
Figure 2 presents the BUBRF implementation for β = 2 and several different N_0. We would like to point out that, according to equation (9), the BUBRF model with β = 2 and N_0 = 6 (green line) is equivalent to the RLH model with ρ = .5 (blue line), which is why the two lines are indistinguishable.
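The equivalence in equation (9) is easy to verify numerically; the short check below (with an arbitrary, hypothetical signal sequence) computes both paths side by side for β = 2, N_0 = 3β/(β - 1) = 6, and ρ = (β - 1)/β = .5:

```python
# Numerical check of equation (9): RLH with rho = .5 vs. BUBRF with beta = 2, N0 = 6.
signals = [2, 1, 3, 0]        # hypothetical counts of black marbles per set of three draws
p_rlh = p_bub = 0.5           # common starting belief
N = 6.0                       # N0 = 3*beta/(beta - 1) with beta = 2
for t, s in enumerate(signals, start=1):
    p_rlh = 0.5 * p_rlh + 0.5 * (s / 3.0)                      # RLH update, rho = .5
    w = 3 * 2.0 ** t                                           # BUBRF weight of round-t signal
    p_bub = (N / (N + w)) * p_bub + (w / (N + w)) * (s / 3.0)  # BUBRF update
    N += w
    print(t, round(p_rlh, 4), round(p_bub, 4))                 # the two columns coincide
```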
4 Results
This section is organized as follows. In Section 4.1 we present the estimation results
for the risk aversion parameter using responses only from stages 1-3. In Section 4.2 we
consider estimates of subjective beliefs in round 0 (before any draws have been made).
These estimations are carried out only on a subset of the data for two reasons. First, we
would like to compare results with prior studies on risk and ambiguity attitudes in static
environments. And second, we carry out preliminary tests to narrow in on the main result
of the paper.14 In Sections 4.3 and 4.4 we present estimation results for the BUBRF and RLH models, respectively; however, unlike in Sections 4.1 and 4.2, we carry out joint estimation.

14 For example, we test for Ambiguity Aversion and Reduction of Compound Lotteries and compare results with prior findings. We could have carried out the same tests as part of the joint estimation done in Sections 4.3 and 4.4; however, for exposition and comparison purposes we do these tests separately.
4.1 Risk Aversion
In stages 1-3 the agents are endowed with an objective probability, p_0, and we assume that the belief of the representative agent is consistent with this probability. Table 3 presents the MLE estimates of equation (3) with standard errors reported in parentheses.
Table 3: Risk Aversion

γ            λ              logL
.45 (.04)    10.97 (.81)    -489.21
γ is the risk aversion parameter; λ is the precision with which the agent makes a choice when
evaluating the difference in expected payoffs between lotteries. λ = 11 implies a 75% probability
of choosing the best of two options that differ by only 10% in expected utility.
Our results correspond to a representative agent that exhibits levels of risk aversion commonly observed in studies using micro-level data. The estimates for the risk aversion parameter are in line with previous findings in the experimental literature (see Harrison and Rutström (2008), Harrison and Cox (2008)).
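The interpretation of the precision parameter in the note to Table 3 can be verified directly from the logit rule in equation (3); a small check, assuming a 10% gap in (normalized) expected utility:

```python
import math

def prob_better_option(lam, utility_gap):
    """Probability of choosing the better of two options under the logit rule (3)."""
    return 1.0 / (1.0 + math.exp(-lam * utility_gap))

print(round(prob_better_option(10.97, 0.10), 2))  # ~0.75 for the stage 1-3 estimate of lambda
print(round(prob_better_option(5.85, 0.10), 2))   # ~0.64, close to the 65% quoted for lambda_0 in Table 4 below
```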
4.2 Initial Beliefs
Prior studies on ambiguity aversion emphasize an important distinction between the valuation of risky and ambiguous lotteries. Principally, at the aggregate level, the valuation of the ambiguous urn is lower than the valuation of the compound and risky urns (Halevy (2007)), which is explained by the population being ambiguity averse. In this section we present our
results about attitudes towards compound risk and ambiguity and make a comparison with
prior studies in the literature.15
15 For the purposes of this paper we focus on initial beliefs; however, all conclusions made in this section also hold true for the urn-valuation approach.
Recall that during round 0 of each stage no draws have yet been made; hence, the only
information given to the subjects pertains to the composition process of the urn. Using
equation (3) and responses from round 0 of stages 4 and 5 we estimate an initial belief about
the probability that a black marble will be drawn. In other words, we estimate a subjective
belief that is consistent with participants' choices in round 0. Note that we are fixing the
risk-aversion parameter to the estimates obtained from stages 1-3 only (γ = .45 from Table
3).
The estimates for the unrestricted model, for which initial beliefs can vary by both order
of presentation and urn type, are presented in column IB1 of Table 4. We observe that
the initial belief is the highest for the C urn in treatment 1, the second highest for the A
urn in treatment 1, followed closely by the initial belief for the A urn in treatment 2, and,
surprisingly, the initial belief about the probability of black marble being drawn is the lowest
for the C urn in treatment 2. In what follows we test several hypotheses to gain a better
understanding of the exact nature of this outcome.
Table 4: Initial Beliefs

           IB1         IB2              IB3            IB4              IB5               IB6
H0:        –           p^j_0A = p_0A    p^j_0A = .5    p^j_0i = p_0i    p^j_0i = p^j_0    p^j_0i = p_0
γ          .45         .45              .45            .45              .45               .45
p^1_0C     .48 (.03)   .48 (.03)        .48 (.03)      .44 (.02)        .46 (.02)         .43 (.02)
p^2_0C     .40 (.03)   .40 (.03)        .39 (.03)      .44 (.02)        .41 (.02)         .43 (.02)
p^1_0A     .44 (.03)   .43 (.02)        .50            .43 (.02)        .46 (.02)         .43 (.02)
p^2_0A     .42 (.03)   .43 (.02)        .50            .43 (.02)        .41 (.02)         .43 (.02)
λ0         5.85 (.71)  5.85 (.71)       5.50 (.70)     5.81 (.71)       5.84 (.71)        5.81 (.71)
logL       -365.35     -365.48          -371.25        -367.45          -366.04           -367.59
# par      5           4                3              3                3                 2
AIC        740.70      738.95           748.50         740.89           738.08            739.17
rank       4           2                6              5                1                 3

p-values (H1 in rows)
IB1        –           .615             .003           .123             .502              .215
IB2                    –                .001           .047             NA                .121
IB3                                     –              NA               NA                NA
IB4                                                    –                NA                .595
IB5                                                                     –                 .078

γ is the risk aversion parameter; p^j_0i is the initial subjective belief about the probability of a black marble being drawn from an urn of type i ∈ {C, A} in treatment j ∈ {1, 2}; λ0 is the precision parameter. λ0 = 6 implies a 65% probability of choosing the best of two options that differ by only 10% in expected utility.
H-IB2: Initial beliefs about the A urn may not vary by treatment. (H0: p^j_0A = p_0A)

We test this hypothesis by imposing a restriction on the initial beliefs in ambiguous environments, and present our estimates in column IB2 of Table 4. With a p-value of 0.615 we fail to reject the null and conclude that there is no evidence of a difference between initial beliefs about the probability of a black marble being drawn from an ambiguous urn in treatments 1 and 2.
H-IB3: Initial beliefs about the A urn are consistent with the principle of insufficient information. (H0: p^j_0A = .5)
In the Gilboa and Schmeidler (1989) framework an ambiguity averse agent acts according
to the worst case scenario prior. For the A urn, this means that the initial belief is lower than the one implied by the principle of insufficient information (i.e., p^j_0A < .5).16 Thus, to test the hypothesis of ambiguity aversion we impose the restriction p^1_0A = p^2_0A = .50, and
present the estimates in column IB3. Using a likelihood ratio test for nested models, we obtain a p-value of .001 relative to the model in IB2. Therefore, we have statistically significant evidence to reject the restriction; in other words, we have evidence of ambiguity aversion.
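The nested-model comparisons reported in Table 4 (and later in Tables 5 and 6) are standard likelihood ratio tests; the sketch below reproduces the IB3-versus-IB2 computation from the log-likelihoods and parameter counts in the table:

```python
from scipy.stats import chi2

def lr_test(logL_restricted, logL_unrestricted, df):
    """Likelihood ratio test for nested models: 2*(logL_u - logL_r) ~ chi2(df)."""
    stat = 2.0 * (logL_unrestricted - logL_restricted)
    return stat, chi2.sf(stat, df)

stat, p = lr_test(logL_restricted=-371.25, logL_unrestricted=-365.48, df=4 - 3)
print(round(stat, 2), round(p, 3))   # p ~= .001, as in the IB3 column of Table 4
```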
Next, we test whether the nature of uncertainty determines the initial beliefs about the
urn composition. This is an extension of H-IB2 where we additionally restrict the initial
beliefs about the C urn to be the same between treatments.
H-IB4: Initial beliefs may vary by urn type but not by treatment. (H0: p^j_0i = p_0i)
We test this hypothesis in column IB4. With a p-value of .047 relative to the model in column IB2 we reject this restriction at the 5% significance level. Therefore, we have evidence that initial beliefs about the C urn do vary by treatment. The next hypothesis considers the potential order effect.
H-IB5: Initial beliefs may vary by treatment but not by urn type. (H0: p^j_0i = p^j_0)
We test this hypothesis in column IB5 using a likelihood ratio test and obtain a p-value
of .502 relative to IB1. Therefore we fail to reject the null hypothesis, and so when an
ambiguous urn follows a compound urn the initial beliefs are the same as those for the
compound urn. Furthermore, the model in column IB5 is the best according to the Akaike Information Criterion.
Finally, we test whether there are any differences in initial beliefs at all.
H-IB6: Initial beliefs may vary neither by treatment nor by urn type. (H0: p^j_0i = p_0)
Using a likelihood ratio test we obtain a p-value of .078 relative to the model IB5, which
is significant at the 10% level but not at the 5% level. However, when estimating the initial beliefs jointly with the learning models, and making use of the full set of data, we are able to reject the IB6 specification at the 5% significance level. Furthermore, the model IB6 is dominated by IB5 according to the Akaike Information Criterion.

16 Assuming that participants consider a set of possible priors whose means are symmetric around .50.
To summarize our results for subjective beliefs in the static environment, we find evidence
against the principle of insufficient information and evidence of an order effect. Specifically,
in ambiguous environments subjects assign a probability significantly lower than the one
implied by the reduction of compound lotteries when starting with the prior implied by
the principle of insufficient information. This result is consistent with ambiguity aversion
documented in the literature on decision making under ambiguity in static environments.
The order effect comes into play when initial beliefs about the second urn are anchored at
the level of initial beliefs about the first urn. In other words, when the C urn is presented
first, the initial beliefs about the A urn that follows are not significantly different from the
beliefs about the preceding C urn. And vice versa, when the A urn is presented first, the
initial beliefs about the C urn that follows are not significantly different from the beliefs
about the preceding A urn.
It is worth re-emphasizing that the estimates in Table 4 were generated using only responses from round 0; therefore, we would expect a difference when estimating the same beliefs as
part of a broader learning model. Nevertheless, we will use the results for each of the learning
models to verify that the initial beliefs obtained from each model are consistent with those
estimated from round 0.
4.3 RLH
In what follows we test several restrictions of the Reinforcement Learning Heuristic to
determine if there is any difference in updating behavior between treatments or uncertainty
types. The estimation results for the Reinforcement Learning Heuristic are presented in
Table 5. Recall that we assume that the precision parameter is constant across treatments, urn types, and rounds.17
We note that the values of the risk aversion parameter, γ, in Table 5 are within one standard error of the estimates in Section 4.1. Similarly, the estimated values of the initial beliefs in Table 5 and the estimates in Section 4.2 are within one standard error. In other words, the joint estimation results are consistent with what we obtained by estimating risk aversion and initial beliefs separately in Sections 4.1 and 4.2.
What we observe from the unrestricted model RLH1 is that the weights do vary across set-ups. In particular, the weights for the C urn are close between treatments, while the weights for the ambiguous urn in treatment 1 are lower than in treatment 2. We proceed to formally test these differences in updating between treatments and urn types.
First, we consider whether the conclusion in IB5 holds when this model is implemented.
In other words, we test whether p^j_0i = p^j_0 under this model specification.
H-RLH2: Initial beliefs may vary by treatment but not by urn type. (H0: p^j_0i = p^j_0)
With a p-value of .850 we fail to reject this restriction, hence there is no significant
evidence against IB5, and therefore it will serve as a starting point for further restrictions
of the RLH.
Second, we consider whether there is any difference in initial beliefs either across urn
types or treatments.
H-RLH3: Initial beliefs may vary neither by treatment nor by urn type. (H0: p^j_0i = p_0)
With a p-value of .014 we reject this restriction relative to RLH2, hence there is significant
evidence against RLH3. Together with RLH2 this means that the initial beliefs vary by
treatment. This serves to reinforce the H-IB5 and H-IB6 hypotheses.
Third, we consider whether the updating parameter, ρ, depends on treatment but not
the urn type.
17 That is, we assume that, once the belief is selected, the decision task is the same for the C urn as it is for the A urn, and therefore the error should be the same.
H-RLH4: Updating weights may vary by treatment but not by urn type. (H0: ρ^j_i = ρ^j)

Imposing the restriction we estimate and present the model in column RLH4. Using a likelihood ratio test relative to RLH2 we find a p-value of .015; therefore, we reject the null and conclude that there is statistically significant evidence against updating weights that vary only by treatment.
Next, we test whether the updating parameter, ρ, is the same across treatments for the two ambiguous urns and for the two compound urns, but may differ by urn type.
Table 5: Reinforcement Learning Heuristic Models

           RLH1        RLH2            RLH3          RLH4             RLH5             RLH6
H0:        –           p^j_0i = p^j_0  p^j_0i = p_0  p^j_0i = p^j_0,  p^j_0i = p^j_0,  p^j_0i = p^j_0,
                                                     ρ^j_i = ρ^j      ρ^j_i = ρ_i      ρ^j_i = ρ
γ          .39 (.08)   .39 (.08)       .39 (.08)     .39 (.07)        .45 (.05)        .39 (.07)
p^1_0C     .48 (.04)   .46 (.03)       .42 (.03)     .46 (.03)        .46 (.03)        .46 (.03)
p^2_0C     .37 (.05)   .37 (.03)       .42 (.03)     .38 (.03)        .38 (.03)        .38 (.03)
p^1_0A     .45 (.03)   .46 (.03)       .42 (.03)     .46 (.03)        .46 (.03)        .46 (.03)
p^2_0A     .38 (.04)   .37 (.03)       .42 (.03)     .38 (.03)        .38 (.03)        .38 (.03)
ρ^1_C      .47 (.07)   .47 (.07)       .48 (.07)     .37 (.05)        .44 (.05)        .39 (.04)
ρ^2_C      .43 (.09)   .42 (.08)       .38 (.09)     .41 (.06)        .44 (.05)        .39 (.04)
ρ^1_A      .17 (.08)   .17 (.08)       .18 (.08)     .37 (.05)        .31 (.05)        .39 (.04)
ρ^2_A      .38 (.07)   .39 (.07)       .36 (.07)     .41 (.06)        .31 (.05)        .39 (.04)
λRL        3.79 (.23)  3.79 (.22)      3.79 (.22)    3.73 (.22)       3.74 (.22)       3.73 (.22)
logL       -2484.95    -2485.12        -2485.12      -2484.41         -2488.19         -2489.44
# par      10          8               7             6                6                5
AIC        4989.91     4986.23         4990.21       4990.69          4986.82          4988.88
rank       4           1               5             6                2                3

p-values (H1 in rows)
RLH1       –           .850            .097          .067             .292             .110
RLH2                   –               .014          .015             .101             .034
RLH3                                   –             NA               NA               NA
RLH4                                                 –                NA               .662
RLH5                                                                  –                .044

γ is the risk aversion parameter; p^j_0i is the initial subjective belief about the probability of a black marble being drawn from an urn of type i ∈ {C, A} in treatment j ∈ {1, 2}; ρ^j_i is the weight placed on the new signal; λRL is the precision parameter. λRL = 3.75 implies about a 60% probability of choosing the best of two options that differ by only 10% in expected utility.
H-RLH5: Updating weights may vary by urn type but not by treatment. (H0: ρ^j_i = ρ_i)

Imposing the restriction we estimate and present the model in column RLH5. Using a likelihood ratio test relative to RLH2 we find a p-value of .101; therefore, we fail to reject the null and conclude that there is no statistically significant evidence against updating weights that depend only on the uncertainty type.
Finally, we test whether the type of uncertainty matters for updating at all.
H-RLH6: Updating weights vary neither by urn type nor by treatment. (H0: ρ^j_i = ρ)
Using likelihood ratio tests we obtain p-values of .034 and .044 relative to RLH2 and RLH5, respectively. Therefore, we reject the restriction at the 5% level. Furthermore, models RLH2 and RLH5 are the best according to the Akaike Information Criterion among the RLH-type models.
Figure 3 illustrates two sequences of beliefs implied by RLH5. In particular, we can observe the differences between the rates at which the representative agent updates
his beliefs under risky and ambiguous environments. Notice that the difference between
treatments for the C urns and the A urns comes from the difference in the initial beliefs
analyzed in Section 4.2. As we can see the updating process for the ambiguous environment
is less volatile than the one observed for the compound environment.
Figure 3: RLH5. Beliefs (p) over draw rounds (t) for C1, C2, A1, A2, and Bayes* under two sequences of draws.
*Bayesian updating starting from the prior implied by the principle of insufficient information (p_0 = .5, N_0 = 2).
To summarize, in this section we considered one of the simplest learning models and
found significant evidence of different updating weights depending on the type of uncertainty.
Specifically, weights placed on the signal in compound risky environments are significantly
higher than weights placed on the signal in ambiguous environments.
4.4 BUBRF
The Bayesian Updating Base Rate Fallacy model under consideration has several nice
features. On one hand, it allows us to interpret the estimates relative to the standard
Bayesian Updating. On the other hand, using the BUBRF model we can determine to what extent the strength of the initial beliefs influences the learning behavior once new draws are
made. Estimation results for the BUBRF model are presented in Table 6.
The model in column BUBRF1 presents unrestricted estimates. Notice that estimates
of the risk aversion parameter are slightly higher as compared to Sections 4.1 and 4.3;
however the differences between estimates based only on stages 1-3 and joint estimates as
part of the BUBRF model are insignificant. Also notice that the estimates of the initial beliefs (p^1_0C, p^2_0C, p^1_0A, p^2_0A) are not significantly different from our estimates in Section 4.2, which were estimated only on responses from round 0. Next, we consider plausible hypotheses about variation in the levels and strengths of the initial beliefs.
Table 6: BUBRF

           BUBRF1       BUBRF2           BUBRF3          BUBRF4        BUBRF5        BUBRF6        BUBRF7
H0:        –            p^j_0i = p^j_0,  p^j_0i = p_0,   β^j_i = β^j,  β^j_i = β_i,  β^j_i = β,    β^j_C = 1,
                        N^j_0i = N^j_0   N^j_0i = N_0    +BUBRF2       +BUBRF2       +BUBRF2       +BUBRF5
γ          .55 (.06)    .55 (.06)        .56 (.07)       .56 (.06)     .55 (.06)     .52 (.06)     .56 (.05)
p^1_0C     .50 (.04)    .48 (.03)        .45 (.03)       .47 (.03)     .48 (.03)     .47 (.03)     .47 (.03)
p^2_0C     .39 (.04)    .41 (.04)        .45 (.03)       .40 (.03)     .40 (.03)     .40 (.03)     .40 (.03)
p^1_0A     .44 (.05)    .48 (.03)        .45 (.03)       .47 (.03)     .48 (.03)     .47 (.03)     .47 (.03)
p^2_0A     .41 (.04)    .41 (.04)        .45 (.03)       .40 (.03)     .40 (.03)     .40 (.03)     .40 (.03)
N^1_0C     .19 (.43)    .03 (.25)        .42 (.41)       .17 (.28)     .02 (.30)     .19 (.32)     .02 (.30)
N^2_0C     3.47 (1.88)  2.54 (1.09)      .42 (.41)       2.92 (1.15)   1.47 (.54)    2.02 (.61)    1.48 (.53)
N^1_0A     .02 (.28)    .03 (.25)        .42 (.41)       .17 (.28)     .02 (.30)     .19 (.32)     .02 (.30)
N^2_0A     1.10 (1.09)  2.54 (1.09)      .42 (.41)       2.92 (1.15)   1.47 (.54)    2.02 (.61)    1.48 (.53)
β^1_C      .89 (.15)    .88 (.15)        .88 (.16)       .66 (.11)     .95 (.14)     .74 (.10)     1.00
β^2_C      1.35 (.36)   1.25 (.32)       1.39 (.38)      1.05 (.25)    .95 (.14)     .74 (.10)     1.00
β^1_A      .38 (.11)    .38 (.10)        .39 (.13)       .66 (.11)     .48 (.11)     .74 (.10)     .48 (.11)
β^2_A      .45 (.39)    .83 (.27)        .18 (.19)       1.05 (.25)    .48 (.11)     .74 (.10)     .48 (.11)
λBRF       4.08 (.24)   4.04 (.24)       3.93 (.24)      4.04 (.23)    4.00 (.24)    4.03 (.23)    3.99 (.23)
logL       -2458.03     -2459.72         -2466.82        -2464.32      -2461.72      -2465.58      -2461.79
# par      14           10               9               8             8             7             6
AIC        4944.06      4939.45          4949.64         4944.64       4939.43       4945.17       4937.57
rank       4            3                7               5             2             6             1

p-values (H1 in rows)
BUBRF1     –            .495             .007            .050          .288          .035          .378
BUBRF2                  –                .001            .010          .136          .008          .248
BUBRF3                                   –               NA            NA            NA            NA
BUBRF4                                                   –             NA            .114          NA
BUBRF5                                                                 –             .005          .708
BUBRF6                                                                               –             NA

γ is the risk aversion parameter; p^j_0i is the initial subjective belief about the probability of a black marble being drawn; β^j_i is the weight placed on the new signal; N^j_0i is the weight placed on the initial belief; λBRF is the precision parameter. λBRF = 4 implies about a 60% probability of choosing the best of two options that differ by only 10% in expected utility.
First we will explore whether the conclusion from IB5 holds when BUBRF is estimated
jointly with the risk aversion parameter γ and the initial belief p0 . It is noteworthy that in
addition to the mean of the initial prior (what we refer to as the "initial belief" throughout the paper), the BUBRF specification has a parameter that captures the weight of the initial belief, N_0. Therefore, to test the hypothesis about equality of the priors by treatment we test both p^j_0i = p^j_0 and N^j_0i = N^j_0.
H-BUBRF2: Initial prior may vary by treatment but not by urn type. (H0: p^j_0i = p^j_0, N^j_0i = N^j_0)
With a p-value of .495 we fail to reject this restriction, hence there is no significant
evidence against IB5, and therefore it will serve as a starting point for further restrictions of
the BUBRF model. Next, let us consider whether there is any variability in priors at all.
H-BUBRF3: Initial prior may vary neither by treatment nor by urn type. (H0: p^j_0i = p_0, N^j_0i = N_0)
Using a likelihood ratio test we obtain a p-value of .007 relative to BUBRF1 and a p-value
of .001 relative to BUBRF2, and therefore there is significant evidence against this hypothesis. This means that subjects do have significantly different priors between treatments.
We next test whether a similar conclusion holds for the weights that subjects place on the new signals.
H-BUBRF4: Updating weights may vary by treatment but not by urn type. (H0: β^j_i = β^j)
With a p-value of .010 relative to BUBRF2 we reject this restriction. Therefore, there is significant evidence that the weights placed on the new signal vary by uncertainty type. Next, we test a hypothesis about the weights that subjects place on new information depending on urn type.
H-BUBRF5: Updating weights may vary by urn type but not by treatment. (H0: β^j_i = β_i)
Using a likelihood ratio approach we test this hypothesis, and with a p-value of .136 we fail to reject it against the alternative in BUBRF2. In other words, there is no significant evidence against restricting the weights to vary only by uncertainty type. Next, we test the hypothesis of common weights across treatments and urn types.
H-BUBRF6: Updating weights vary neither by urn type nor by treatment. (H0: β^j_i = β)
Using a likelihood ratio approach we test this hypothesis, and with a p-value of .005 we reject the restriction. In other words, similarly to H-BUBRF4, we conclude that there is significant evidence that the weights vary by uncertainty type.
Finally, the weights placed on the new signals in the compound environment are close to 1 for BUBRF1-BUBRF6. Recall that the special case when the weight is equal to 1 corresponds to Bayesian Updating. We formally test this hypothesis as H-BUBRF7.
H-BUBRF7: Updating weights for the C urn are consistent with Bayesian Updating. (H0: β^j_C = 1)
With a p-value of .708 we fail to reject this model relative to the model in column BUBRF5. Thus, there is no evidence against the hypothesis that agents apply Bayesian Updating in the compound environment. Model BUBRF7 is the best according to the Akaike Information Criterion.
Figure 4 illustrates the implications of BUBRF7 under two sequences of draws. Notice
that under BUBRF7 there is a difference in behavior between the two treatments for the
compound environment that is due to the difference in initial priors. The learning process in
a compound environment is different from the learning process in an ambiguous environment
as implied by BUBRF7, as can be seen from the reaction to a new signal after the draws in
round 2 in Figure 4.
Figure 4: BUBRF7. Beliefs (p) over draw rounds (t) for C1, C2, A1, A2, and Bayes* under two sequences of draws.
*Bayesian updating starting from the prior implied by the principle of insufficient information (p_0 = .5, N_0 = 2). Difference between Bayes and C1 is due to the difference in strengths of initial beliefs (N^1_0C = 0.02).
The main result for the BUBRF model is that the weight assigned to new information is
significantly lower for ambiguous urns than for risky urns, and is independent of the order
in which the urns were presented. In particular, subjects significantly underweight new
information in ambiguous environments relative to Bayesian Updating, but act essentially
Bayesian in risky environments. This behavior is consistent with under-reaction documented
by Shefrin (2002) in financial markets.
4.5 BUBRF vs. RLH
We now proceed to compare results for the RLH and the BUBRF models. Under both
specifications we conclude that subjects assign significantly lower weight to a new signal
in ambiguous environments than in risky environments. The difference between these two
models is the way new information is incorporated in relation to the original belief. Under
the RLH specification, the weight on the initial belief is treated similarly to the weight of
the signals. On the other hand, the BUBRF allows for the weight on the initial belief to
29
be different from the weight on the signal. Furthermore, the weight on a new signal under
the RLH specification, by construction, is greater than implied by Bayesian Updating. In
contrast, the estimates of the BUBRF model suggest significant under-weighting (relative to
the Bayesian Updating) of new information in ambiguous environments.
Because the RLH model is a special case of the BUBRF model (equation (9)), we can
test the hypothesis that subjects behave according to the RLH against the alternative that
they behave according to the BUBRF.
H-RLH: Updating behavior is according to the RLH. (H0: N^j_0i = 3β/(β - 1), β > 1)
Using a likelihood ratio test for nested models we obtain p-values that are less than .001
for all RLH1-RLH6 models when tested relative to the BUBRF1 model, and therefore there
is statistically significant evidence against the RLH specifications. This result emphasizes
the importance of treating weights of the initial belief separately from weights of the signal,
and allowing for under-reaction to new information.
4.6 Positive vs. Normative
In this section we discuss the implications of the difference between the positive and the normative approaches to decision making under uncertainty. In particular, under the positive approach we consider BUBRF7, the best descriptive model according to the Akaike Information Criterion. Under the normative approach we consider standard Bayesian updating starting from the diffuse prior and adjusting for the composition procedure.18
Table 7 presents parameters of the best descriptive model, BUBRF7, and of the prescriptive model for both the C and the A urns when each is presented first.
18 The diffuse prior, or state of complete ignorance, corresponds to N_{-1} = 0. In other words, before we started composing the urns, participants were completely ignorant of the urn composition. During construction of the A urn we added one black and one white marble, hence N_0A = 2 and p_0A = .5. The construction procedure of the C urn implies N_0C = 4 and p_0C = .5.
Table 7: Models

           Positive      Normative
p^1_0C     .47 (.03)     .50
p^2_0A     .40 (.03)     .50
N^1_0C     .02 (.30)     4.00
N^2_0A     1.48 (.53)    2.00
β^1_C      1.00          1.00
β^2_A      .48 (.11)     1.00

p^j_0i is the initial belief about the probability of a black marble being drawn; N^j_0i is the weight placed on the initial belief; β^j_i is the weight placed on the new signal.
The left column of Figure 5 presents priors estimated under BUBRF7, while the right column presents the priors that participants "should have used." It appears that participants placed more emphasis on the extreme outcomes for both the C and the A urn. And, even when the mean is consistent with the reduction of compound lotteries, as is the case for the C urn, the prior itself is vastly different. This result is interesting in and of itself and deserves to be explored in future studies.
Figure 5: Priors. Density of the initial belief (p_0). Positive (left): C1: p_0 = 0.47, N_0 = 0.02; A2: p_0 = 0.4, N_0 = 1.48. Normative (right): C1: p_0 = 0.5, N_0 = 4; A2: p_0 = 0.5, N_0 = 2.
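The priors in Figure 5 are Beta densities with parameters implied by the (p_0, N_0) pairs of Table 7 via a = p_0 N_0 and b = (1 - p_0) N_0 (the mapping introduced in Section 3.2.2); a sketch that should reproduce the four curves, using scipy for the Beta density:

```python
import numpy as np
from scipy.stats import beta

def prior_density(p0, N0, grid):
    """Beta prior implied by mean p0 and weight N0: a = p0*N0, b = (1 - p0)*N0."""
    return beta.pdf(grid, p0 * N0, (1 - p0) * N0)

grid = np.linspace(0.01, 0.99, 99)
curves = {
    "C1 positive":  prior_density(0.47, 0.02, grid),  # mass piles up at the extreme outcomes
    "A2 positive":  prior_density(0.40, 1.48, grid),
    "C1 normative": prior_density(0.50, 4.00, grid),  # Beta(2, 2)
    "A2 normative": prior_density(0.50, 2.00, grid),  # Beta(1, 1), i.e. uniform
}
```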
The difference in learning behavior between compound and ambiguous environments, which is the main result of this paper, is demonstrated in the left column of Figure 6. Specifically, in ambiguous environments participants significantly underweight the new information, and as a result the updating process is less volatile.
Figure 6: Belief Updating. Beliefs (p) over draw rounds (t) for C1 and A2 under the positive (left) and normative (right) views.
Let us consider the learning behavior about the C urn (Figure 6, red line). What appears to be a significant over-weighting of the first signal in the left column (positive view), as compared to the right column (normative view), is attributed to the difference between the subjective and the objective priors and not to a difference in the updating behavior. In fact, the updating behavior after drawing rounds 3 and 4 is very much the same between the left and the right columns. Interestingly, if instead of estimating the prior we had imposed the objective prior for the C urn, the updating behavior would be consistent with significant over-weighting of new information and the base rate fallacy heuristic, and hence a violation of the Bayesian Updating rule. This result emphasizes the importance of understanding the initial belief formation and needs further investigation.
The learning behavior about the A urn (Figure 6, blue dotted line) is different between
the positive and normative views in two dimensions - the initial belief and the updating
behavior. First, the subjective initial belief is significantly lower than the one derived from
the urn composition procedure. Second, the new information is significantly under-weighted,
which is consistent with the conservatism heuristic documented in the literature.
5 Conclusion
The goal of our paper is to contribute to the understanding of human probability judg-
ment in uncertain environments, and more specifically to the understanding of the reaction
to new information under uncertainty. We designed and conducted an economic experiment
in order to estimate differences in the learning processes of agents under risk and ambiguity.
Participants were required to make sequential choices over pairs of lotteries involving two types of urns: (i) a risky urn that was built using a known randomization device, and (ii) an ambiguous urn whose composition process was unknown to the participants. For each urn type we estimated two models of learning: (1) Bayesian updating allowing for base rate fallacy (BUBRF); and (2) the Reinforcement Learning Heuristic (RLH). We find significant evidence of a difference in learning behavior under both models.
We find evidence that the order of presentation affects the subjective prior considered
by the participants. Specifically, the initial beliefs about the compound urn are significantly
lower after the participants have been exposed to ambiguous lotteries. We interpret this result as evidence of anchoring. This effect, although, to the best of our knowledge, undocumented in the ambiguity aversion literature, is widely known in psychology as the anchoring effect, named by Tversky and Kahneman (1974). This meta-anchoring effect is worth exploring in greater detail in the future. The importance of treating the initial beliefs separately from the new information is highlighted when we test, and reject, the RLH specification against the more general BUBRF.
We find that the adjustment rate is significantly lower in ambiguous environments than
in risky environments for both the RLH and BUBRF models. Furthermore, the rate at
which new information is incorporated in ambiguous environments is significantly lower than implied by Bayesian updating. By introducing an additional degree of freedom to capture
weights of the initial beliefs we find that new information in the compound environment is
incorporated at a rate consistent with Bayesian Updating.
The apparent deviation from the Bayesian paradigm could be the result of the considered
models not being designed with ambiguity in mind, which could oversimplify the updating
process. Recently, some theoretical models that focus on the updating of beliefs under
ambiguity have been developed, most notably Epstein and Schneider (2007) multiple prior
model. The observed conservatism in the updating process could be the result of an agent
acting according to the worst case scenario prior in the set of considered priors each of
which is updated according to Bayesian updating rule. Unfortunately the nature of our
data does not allow us to test for these hypotheses. The observed lower initial anchor
probability is also consistent with a setting that chooses the worst case scenario (or a convex
combination between the best and worst case scenarios) out of a set containing more than
one prior. Further research will include more elaborate experimental designs that allow for
the estimation of multiple prior models.
The documented differences in the updating behavior under risk and under ambiguity are
also relevant when explaining the phenomena of momentum and overreaction to information
in financial markets. In particular it is plausible that the ambiguous nature of information
causes the market to under-react in the short run, and as ambiguity dissipates, the agents
adjust their portfolios and (possibly) overreact in the long run. This type of anomaly in the updating heuristics makes room for trading strategies that yield abnormal returns, as
documented by Shefrin (2002) and Jegadeesh and Titman (1993).
Acknowledgments
We owe thanks to Dale Stahl for his comments and funding for this project.
References
Barberis, N., Shleifer, A., Vishny, R., 1998. A model of investor sentiment. Journal of
financial economics 49, 307–343.
Camerer, C., 1995. Individual decision making, in: Handbook of Experimental Economics.
Princeton University Press, pp. 587–703.
Camerer, C.F., Loewenstein, G., Rabin, M., 2003. Advances in Behavioral Economics.
Princeton University Press.
Corgnet, B., Kujal, P., Porter, D., 2012. Reaction to public information in markets: How much does ambiguity matter? The Economic Journal.
El-Gamal, M.A., Grether, D.M., 1995. Are people Bayesian? Uncovering behavioral strategies. Journal of the American Statistical Association, 1137–1145.
Ellsberg, D., 1961. Risk, ambiguity, and the savage axioms. The Quarterly Journal of
Economics 75, 643–669.
ElSalamouny, E., Krukow, K.T., Sassone, V., 2009. An analysis of the exponential decay
principle in probabilistic trust models. Theoretical computer science 410, 4067–4084.
Epstein, L.G., Schneider, M., 2007. Learning under ambiguity. The Review of Economic
Studies 74, 1275–1303.
Fischbacher, U., 2007. z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics 10, 171–178.
Fudenberg, D., Levine, D.K., 1998. The Theory of Learning in Games. The MIT Press.
Gilboa, I., Schmeidler, D., 1989. Maxmin expected utility with non-unique prior. Journal of
Mathematical Economics 18.
Goeree, J.K., Palfrey, T.R., Rogers, B.W., McKelvey, R.D., 2007. Self-correcting information
cascades. Review of Economic Studies 74, 733–762.
Halevy, Y., 2007. Ellsberg revisited: An experimental study. Econometrica 75, 503–536.
Hanany, E., Klibanoff, P., 2009. Updating ambiguity averse preferences. The B.E. Journal of Theoretical Economics 9.
Harrison, G.W., Cox, J.C., 2008. Risk aversion in experiments. volume 12. Emerald Group
Pub Ltd.
Harrison, G.W., Rutström, E.E., 2008. Risk aversion in the laboratory .
Hey, J., 2002. Experimental economics and the theory of decision making under risk and
uncertainty. The GENEVA Papers on Risk and Insurance-Theory 27.
Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: Implications
for stock market efficiency. The Journal of Finance 48.
Jordan, J.S., 1995. Bayesian learning in repeated games. Games and Economic Behavior 9, 8–20.
Kahneman, D., Frederick, S., 2002. Representativeness revisited: Attribute substitution in
intuitive judgment. Heuristics and biases: The psychology of intuitive judgment .
Kahneman, D., Tversky, A., 1973. On the psychology of prediction. Psychological review
80, 237.
Knight, F., 1921. Risk, Uncertainty, and Profit. University of Chicago Press.
Massey, C., Wu, G., 2005. Detecting regime shifts: The causes of under- and overreaction.
Management Science 51, 932–947.
Rabin, M., Schrag, J.L., 1999. First impressions matter: A model of confirmatory bias. Quarterly Journal of Economics 114, 37–82.
Shefrin, H., 2002. Beyond greed and fear: Understanding behavioral finance and the psychology of investing. Oxford University Press, USA.
Stahl, D.O., 2011. Heterogeneity of ambiguity preferences. Working paper.
Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning: An Introduction. The MIT Press.
Thaler, R.H. (Ed.), 2005. Advances in Behavioral Finance, Volume II. Princeton University
Press.
Tversky, A., Kahneman, D., 1974. Judgment under uncertainty: Heuristics and biases.
Science 185, 1124 –1131.