Risk, Estimation Uncertainty, and Unexpected Uncertainty

Introduction
Experiment
Conclusion
Bayesian Learning under Three Kinds of
Uncertainty: Risk, Estimation Uncertainty, and
Unexpected Uncertainty
Élise Payzan-LeNestour
Australian School of Business, UNSW Sydney
October 2010
Élise Payzan-LeNestour
Society for Neuroeconomics
Motivation
Natural sampling tasks: decision maker explores (“samples”)
reward prospects and learns about their values
Model-based (Bayesian) learning or model-free (RL) learning?
Both modes coexist in the brain (e.g., Balleine ea 2005, Gläscher
ea 2010); the mode that influences behavior is the one that is more
adapted to the current situation (Daw ea 2005)
→ The question is irrelevant:
when the two modes are algorithmically the same; see the
Kalman Filter (Aoki 1987)
when the answer is already known: model-free RL does as well as
model-based learning in the long run; model-based outperforms
in the transient period (Balleine ea 2005, Daw ea 2005)
Here the question is asked in the context of unstable natural sampling
Motivation (cont.)
Unstable sampling tasks: reward probabilities jump over
time so decision maker is continuously experiencing transient
periods
In such contexts even the most sophisticated kind of model-free
RL can’t perform as well as Bayesian learning (Courville ea
2006, Choi ea 2009)
So Bayesian learning should control behavior ... to the extent that the
brain can implement it!
Bayesian learning is complex here: it requires assessing Risk
(Expected Uncertainty), jump likelihood (Unexpected
Uncertainty) (Yu&Dayan 2005), and Estimation Uncertainty
(Ambiguity) combined
Can the human brain approximate it?
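To make the three quantities concrete, here is a minimal sketch of a Bayesian learner for a single arm whose reward probability can jump. It is an illustration under simplifying assumptions that are not taken from the talk: outcomes are binary, a jump redraws the reward probability uniformly, the jump probability (`hazard`) is known, and beliefs live on a discrete grid. The function names are hypothetical.

```python
def make_grid(n=101):
    """Candidate values for p = P(reward), evenly spaced on [0, 1]."""
    return [i / (n - 1) for i in range(n)]


def bayes_step(belief, grid, outcome, hazard):
    """Update the belief over p after one outcome (0 or 1).

    Assumption: before each outcome, p is redrawn uniformly with
    probability `hazard` (a jump); otherwise it is unchanged.
    Returns the new belief and the posterior probability of a jump.
    """
    n = len(grid)
    uniform = [1.0 / n] * n
    # Likelihood of the observed outcome at each candidate p.
    like = [p if outcome == 1 else 1.0 - p for p in grid]
    # Evidence for the outcome without a jump and with a jump.
    ev_stay = sum(b * l for b, l in zip(belief, like))
    ev_jump = sum(u * l for u, l in zip(uniform, like))
    total = (1.0 - hazard) * ev_stay + hazard * ev_jump
    # Unexpected uncertainty: how likely is it that a jump just occurred?
    p_jump = hazard * ev_jump / total
    posterior = [((1.0 - hazard) * b + hazard * u) * l / total
                 for b, u, l in zip(belief, uniform, like)]
    return posterior, p_jump


def summarize(belief, grid):
    """Return (posterior mean of p, risk, estimation uncertainty)."""
    mean = sum(b * p for b, p in zip(belief, grid))
    # Estimation uncertainty: posterior variance of p (shrinks with learning).
    est_unc = sum(b * (p - mean) ** 2 for b, p in zip(belief, grid))
    # Risk: outcome variance at the posterior mean (irreducible by learning).
    risk = mean * (1.0 - mean)
    return mean, risk, est_unc
```

In this sketch, a surprising outcome after a long stable run raises `p_jump` sharply, while estimation uncertainty falls as evidence accumulates and rises again after a likely jump.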
Evidence for Bayesian Learning
Behavioral evidence exists but with rat subjects (Gallistel ea
2001)
Neural evidence exists for separate encoding of the three levels
of uncertainty (e.g., Preuschoff ea 2006-2008, Hsu ea 2005,
Yoshida ea 2006, Huettel ea 2006, Rutishauser ea 2006, Behrens ea
2007, Den Ouden ea 2010, Watson ea 2007)
But the levels were studied separately or without independent control of
all three (Behrens ea 2007); it is unclear whether the human brain
can tease apart representations of Risk and Unexpected Uncertainty,
which are antagonistic (Yu&Dayan 2005)
This study: provides behavioral and neural evidence in a six-armed
bandit task where the arms jump over time
Bringing the Three Kinds of Uncertainty Together
Behavioral Evidence for Bayesian Learning
Neural Evidence for Bayesian Learning
Restless Bandit Task
Six-armed bandit in which reward processes jump over time
Two ways to learn option values in this task
1. Model-based (Bayesian): tracks hidden outcome
contingencies by detecting jumps on the spot → quick to
adapt
Requires teasing apart Risk, Estimation Uncertainty and
Unexpected Uncertainty
2. Model-free RL: predicts next outcome from observation
of past outcomes; purely correlational → slower to adapt
(“backward looking”)
Representation of uncertainty either absent (Rescorla-Wagner)
or monolithic (Pearce-Hall)
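The two model-free rules named above can be sketched as follows. This is an illustrative sketch, not the specification fitted in the study: the Pearce-Hall variant shown is one common "hybrid" form in which a single associability term scales the learning rate, and the parameter names (`alpha`, `kappa`, `eta`) are assumptions.

```python
def rescorla_wagner(outcomes, alpha=0.1, v0=0.0):
    """Delta rule with a fixed learning rate: no representation of uncertainty."""
    v = v0
    values = []
    for r in outcomes:
        values.append(v)        # prediction held before observing r
        v += alpha * (r - v)    # move the prediction toward the outcome
    return values


def pearce_hall(outcomes, kappa=0.5, eta=0.3, v0=0.0, assoc0=1.0):
    """Delta rule whose learning rate is scaled by one 'associability' term.

    Associability tracks recent absolute prediction errors, so it lumps
    all sources of surprise into a single (monolithic) quantity.
    """
    v, assoc = v0, assoc0
    values = []
    for r in outcomes:
        values.append(v)
        delta = r - v
        v += kappa * assoc * delta
        assoc = (1.0 - eta) * assoc + eta * abs(delta)
    return values
```

Both rules only look backward at realized outcomes; neither separates risk from estimation uncertainty or from the likelihood of a jump.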
Evidence for Bayesian Learning Sought at Two Levels
Behavior: Bayesian and RL learning produce different
behaviors in the task
⇒ Did subjects act more like Bayesians?
Note: Bayesian and RL use the same exploration policy – softmax
choice rule (Ishii ea 2005)
Imaging: Truly Bayesian metrics that RL would ignore
⇒ Do we see neural activation correlating with these metrics?
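For reference, a softmax choice rule of the kind mentioned in the note above can be sketched as below. The inverse-temperature parameter `beta` and the function names are assumptions for illustration; the talk does not state the exact parameterization used.

```python
import math
import random


def softmax_probs(values, beta=1.0):
    """Choice probabilities proportional to exp(beta * value)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    z = sum(exps)
    return [e / z for e in exps]


def softmax_choice(values, beta=1.0, rng=random):
    """Sample one option index according to the softmax probabilities."""
    probs = softmax_probs(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

Because both learners would share this rule, any difference in predicted choices comes from the option values they feed into it, not from the exploration policy.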
Behavioral Evidence for Bayesian Learning
Bayesian model better predicts behavior than does RL (62
subjects, 500 choices per subject) (Payzan 2010)
Subjects directed exploration towards the best-known options, i.e.,
were ambiguity-averse
Behavioral marker of Bayesian learning (no representation of
ambiguity under RL)
Neural markers? ⇒ Examine whether activation correlates
with unexpected uncertainty, estimation uncertainty, and risk
Imaging Study
Adapted the task: same stochastic structure, same
information, but only 2 options proposed for choice on each
trial
Replicated previous behavioral results (17 subjects, 260 choices
per subject)
GLM with 4 onset regressors: cue (phasic), cue (tonic), outcome
(phasic), outcome (tonic)
Included (model-derived) Unexpected Uncertainty, Estimation
Uncertainty, and Risk signals as parametric modulators at cue and
at outcome
Look at activations across the group
Main Results (preliminary)
Risk signals in anterior insula (Huettel ea 2006, Preuschoff ea
2006, Christopoulos ea 2009,...)
Ambiguity signals in ACC (Behrens ea 2007), right superior
parietal lobule, bilateral MFG (Huettel ea 2006, Gläscher ea
2010)
Unexpected Uncertainty signals in vmPFC (Hampton ea 2006,
Den Ouden ea 2010), parahippocampal gyrus (Rutishauser ea
2006), posterior cingulate and ACC (anterior to Behrens ea 2007; cf.
Den Ouden ea 2010), anterior insula (Watson ea 2007)
Results (cont.)
Attentional (salience and executive control) processes in
amygdala, IFG (Corbetta ea 2000, MacDonald ea 2000, Gläscher
ea 2010), superior temporal lobule (Yantis ea 2002, Gläscher ea
2010)
Part of vmPFC covaries with Bayesian expected value (Tricomi
ea)
Ventral striatum responds to value of realized outcome
(O’Doherty ea 2002, McClure ea 2004-2007,...)
Summary
Discussion
Conclusion
Found neural correlates of Risk, Unexpected Uncertainty, and
Estimation Uncertainty in anterior insula, ACC, lateral PFC and
superior parietal
Since
1. Correlation of neural activity with Bayesian uncertainty signals is
not parasitic on general attentional mechanisms (orienting or
executive control)
2. Encoding of the three categories of uncertainty is
quintessentially Bayesian (RL ignores them)
⇒ Neural evidence for Bayesian learning in the task
Strengthens the behavioral evidence
Boundaries of Bayesian Learning?
Follow-up experiment: same board game but with a fourth level
of uncertainty added: Model (or Knightian) Uncertainty (Knight
1921, Keynes 1921, Basili ea 2009,...)
Parameter uncertainty (Ambiguity) vs. Model Uncertainty
Bayesian learning becomes inefficient in this case (Draper 1995,
Diaconis&Freedman 1985)
According to Daw ea 2005, people should fall back to RL
Confirmed by our data
Main Message of This Study
We found that
Bayesian learning controlled behavior when Estimation
Uncertainty reduced to Ambiguity
Its control evaporated when Estimation Uncertainty further meant Model
Uncertainty
⇒ Decision making under Model Uncertainty is not a
special, more complex case of decision making under Ambiguity
Acknowledgments
Collaborators
Peter Bossaerts
Simon Dunne
John O’Doherty
Funding
Science Foundation Ireland, Wellcome Trust
NCCR Finrisk and Swiss Finance Institute.
Behavioral Evidence for Bayesian Learning
[Figure: scatter plot of per-subject model fits; x-axis: -LL of RL, y-axis: -LL of FB]
Legend: Comparative fits for each subject (a point below the 45-degree
line indicates that the Bayesian model fits better)
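As a hedged illustration of how a per-subject -LL such as the one plotted could be computed, the sketch below scores a sequence of choices under a softmax choice rule. The function name, the `beta` parameter, and the input layout (one list of option values per trial) are assumptions; the study's actual fitting procedure is not described in these slides.

```python
import math


def neg_log_likelihood(value_history, choices, beta):
    """Negative log-likelihood of observed choices under a softmax rule.

    value_history[t] holds the model's option values before trial t;
    choices[t] is the index of the option the subject picked on trial t.
    Lower values mean the model predicts the choices better.
    """
    nll = 0.0
    for values, c in zip(value_history, choices):
        m = max(values)
        # log of the softmax normalizer, computed stably
        log_z = beta * m + math.log(sum(math.exp(beta * (v - m)) for v in values))
        nll -= beta * values[c] - log_z
    return nll
```

Computing this quantity once with values from a Bayesian learner and once with values from an RL learner gives the two coordinates of a subject's point in such a plot.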
Evidence for ambiguity aversion
[Figure: scatter plot of per-subject model fits; x-axis: -LL of standard FB, y-axis: -LL of Ambiguity-Averse FB]
With “Ambiguity-Averse” model, option value = expected value minus a penalty proportional to the level of estimation uncertainty
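A minimal sketch of an ambiguity-averse valuation, in which an option's value is its expected value less a penalty that grows with estimation uncertainty. The name `penalty_weight` and the numbers in the usage note are hypothetical and chosen only for illustration.

```python
def ambiguity_averse_value(expected_value, estimation_uncertainty, penalty_weight):
    """Expected value minus a penalty proportional to estimation uncertainty."""
    return expected_value - penalty_weight * estimation_uncertainty
```

For example, with `penalty_weight = 1.0`, a well-known option with expected value 0.50 and estimation uncertainty 0.02 is valued above a poorly known option with expected value 0.55 and estimation uncertainty 0.08, so choice shifts towards the better-known option. Setting `penalty_weight = 0.0` recovers the standard model.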
fMRI Design
Cue presentation and choice (2s)
Waiting stage (4s)
Outcome presentation (1.5s)
GLM
[Figure: GLM design diagram; labels not recoverable from the extracted text]
Note: Uncertainty signals orthogonalized relative to learning
rate (LR): LR meant to pick up general attentional activation
(salience and executive control)
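One common way to orthogonalize a regressor relative to another is to mean-center both and keep the residual of a least-squares projection. The sketch below shows that generic technique; whether the analysis software used in the study does exactly this is an assumption, and the function name is hypothetical.

```python
def orthogonalize(signal, reference):
    """Remove from `signal` the part linearly explained by `reference`.

    Both series are mean-centered first; the result is uncorrelated
    with the reference (their dot product is zero).
    """
    n = len(signal)
    s_mean = sum(signal) / n
    r_mean = sum(reference) / n
    s = [x - s_mean for x in signal]
    r = [x - r_mean for x in reference]
    denom = sum(x * x for x in r)
    if denom == 0:
        return s  # constant reference: nothing to project out
    slope = sum(a * b for a, b in zip(s, r)) / denom
    return [a - slope * b for a, b in zip(s, r)]
```

Applied with the learning rate as `reference`, any variance an uncertainty signal shares with the learning rate is credited to the learning-rate regressor, so remaining activation cannot be attributed to it.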
Expected value
p<.01, SVC at (9, 45, -13): vmPFC
Outcome delivery
outcome (phasic)
p<.001 (uncorrected): ventral striatum, mPFC
Neural Correlates of Learning rate
p<.001 (uncorrected): right cerebellum, left MFG, left amygdala
Risk signal
p<.001 (uncorrected): left anterior insula
Ambiguity signals
p<.001 (uncorrected): anterior cingulate, right superior parietal
lobule, bilateral MFG, bilateral occipital lobe, left precuneus
Unexpected Uncertainty signals
p<.001: posterior left cingulate, vmPFC,
bilateral insula, caudate,
bilateral parahippocampal gyrus, anterior STG, right inferior
parietal lobule