Bayesian Theory of Mind: Modeling Human Reasoning about Beliefs, Desires, Goals, and Social Relations
Harvard Decision-Making Workshop, April 18, 2012
Chris L. Baker, MIT

P(Belief, Desire | Action) = P(Action | Belief, Desire) P(Belief, Desire) / P(Action)
Commonsense Psychology
Fritz Heider; Heider & Simmel, 1944
“If we removed all knowledge of scientific psychology from our world, problems in interpersonal relations might easily be coped with and solved much as before. Man would still ‘know’ how to avoid doing something asked of him, and how to get someone to agree with him; he would still ‘know’ when someone was angry and when someone was pleased. He could even offer sensible explanations for the ‘whys’ of much of his behavior and feelings. In other words, the ordinary person has a great and profound understanding of himself and of other people which, though unformulated and only vaguely conceived, enables him to interact with others in more or less adaptive ways.” (Heider, 1958)

How Do We Make These Inferences? Theory of Mind (ToM)
[Schematic: from the Situation (World State, Agent State) and the Agent's Action, the Observer infers the Agent's Belief, Desire, and Intention.]
cf. Dennett (1987), Wellman & Bartsch (1988), Perner (1991); also the BDI model (Bratman, 1987)
Questions:
• What are the form and content of the theory?
  – Are there general, abstract organizing principles (from other fields)?
• What are the psychological mechanisms (algorithms) for learning, inference and prediction?
• How do the knowledge and mechanisms develop / how are they acquired?

Outline
• Background
  – Inverse planning framework for social reasoning
  – Experiment on adult goal inference
• The present studies
  – Bayesian Theory of Mind (BToM): modeling adults' and children's joint belief-desire inferences
  – Using BToM for learning "what is where" from social observations
• Experimental and computational approach
  – Quantitative psychophysics: collect fine-grained judgments elicited by systematically varying, controlled stimuli
  – Developmental data: probe the emergence of knowledge and mechanisms
  – Comparative computational modeling
    • Bottom-up, cue-based classification of action
    • Theory-based, generative models of belief-, desire- and intention-based planning that support Bayesian inference

The Development of Social Inference
[Figure: 12-month-olds' looking-time paradigm — observed behavior, inference problem, surprising outcome, expected outcome.]
Principle of Rational Action
• Expectation that agents will take efficient Actions to achieve their Goals, given their Beliefs about the Constraints of the World.
• Provides "systematic inferential and predictive generativity" (Gergely & Csibra, 2003)

Principle of Rational Action as Forward Planning
Markov Decision Process (MDP)
[Schematic: the Situation (World State, Agent State) and the Goal feed into Planning, which produces the Action.]
Forward planning: P(Action | Goal, Situation)
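As an illustration only (not the talk's actual model code), the forward model P(Action | Goal, Situation) can be sketched as soft value iteration over a small MDP with a soft-max decision rule (the assumptions listed on the next slide). The transition tensor, rewards, and parameter values below are assumptions for the example.

```python
import numpy as np

def softmax_policy(T, R, beta=2.0, gamma=0.95, iters=200):
    """Soft-max ("Boltzmann") MDP planning sketch.

    T[s, a, s2] = P(s2 | s, a); R[s, a] = goal reward minus movement cost.
    Returns pi[s, a] = P(Action = a | State = s, Goal), with rationality
    parameter beta (higher beta -> more deterministic, efficient actions).
    """
    n_states = T.shape[0]
    V = np.zeros(n_states)
    for _ in range(iters):                            # soft value iteration
        Q = R + gamma * np.einsum('sap,p->sa', T, V)
        Qmax = Q.max(axis=1, keepdims=True)
        V = (Qmax + np.log(np.exp(beta * (Q - Qmax)).sum(axis=1, keepdims=True)) / beta).ravel()
    Q = R + gamma * np.einsum('sap,p->sa', T, V)
    pi = np.exp(beta * (Q - Q.max(axis=1, keepdims=True)))
    return pi / pi.sum(axis=1, keepdims=True)
```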
MDP Planning
• Model of a classical economic agent who acts to maximize expected utility or value
• Assumptions:
  – Single agent, fully observable world
  – Action cost is proportional to movement distance
  – Goal rewards trade off against action costs
  – Stochastic "soft-max" decision-making

Social Inference as Inverse Planning
Markov Decision Process (MDP)
[Schematic, as above: the Situation and the Goal feed into Planning, which produces the Action; forward planning gives P(Action | Goal, Situation).]
Bayesian Inverse Planning
• Inference involves working backward from observed Actions to infer the underlying Goal (plus other mental states / unobservables)
• "Ideal-observer" response
• Inference is dynamic: it progresses over the course of the agent's trajectory

Inverse planning: P(Goal | Action, Situation) = P(Action | Goal, Situation) P(Goal | Situation) / P(Action | Situation)
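A minimal sketch of the inversion step, assuming the soft-max policies from the planning sketch above; the function and variable names are illustrative, not taken from the published model.

```python
import numpy as np

def goal_posterior(state_action_pairs, policies, prior):
    """Bayesian inverse planning sketch: P(Goal | Actions, Situation).

    policies[g][s, a] = P(a | s, goal = g), e.g. from softmax_policy above.
    prior[g] = P(Goal = g | Situation).
    """
    log_post = np.log(np.asarray(prior, dtype=float))
    for s, a in state_action_pairs:
        log_post += np.log([pi[s, a] for pi in policies])   # per-step likelihoods
    post = np.exp(log_post - log_post.max())
    return post / post.sum()                                 # normalize over goals
```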
Dynamic Goal Inference
[Dynamic Bayes net: world W; at each timestep t, the goal G_t and state S_t determine the action A_t, which leads to state S_{t+1}; goals may change between timesteps.]

Online (forward) inference:
  P(g_t | s_{1:t+1}, w) ∝ P(s_{t+1} | g_t, s_t, w) Σ_{g_{t-1}} P(g_t | g_{t-1}, w) P(g_{t-1} | s_{1:t}, w)

Retrospective (smoothed) inference:
  P(g_t | s_{1:T}, w) ∝ P(g_t | s_{1:t+1}, w) P(s_{t+2:T} | g_t, s_{1:t+1}, w)

Baker, Saxe, & Tenenbaum (2007, 2009)
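The forward (online) recursion above can be written as a short filtering loop. This is an illustrative sketch with assumed inputs: the one-step likelihoods would come from the soft-max policy under each candidate goal.

```python
import numpy as np

def online_goal_inference(step_likelihoods, goal_transition, prior):
    """Filtered goal posteriors P(g_t | s_{1:t+1}, w) at every timestep (sketch).

    step_likelihoods[t][g] = P(s_{t+1} | g_t = g, s_t, w)
    goal_transition[g_prev, g] = P(g_t = g | g_{t-1} = g_prev, w)
    prior[g] = P(g_0 = g | w)
    """
    post = np.asarray(prior, dtype=float)
    filtered = []
    for lik in step_likelihoods:
        post = np.asarray(lik) * (np.asarray(goal_transition).T @ post)  # predict goal switch, then weight
        post = post / post.sum()
        filtered.append(post)
    return filtered
```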
Goal Inference as Inverse Planning
[Example stimuli: an agent moves toward one of three marked goals (A, B, C); goal-inference judgments were collected at timesteps 3, 7, 10, 11, and 13 along each trajectory.]
[Results: bar plots of people's goal ratings versus the model's posterior at each judgment point; trials like these are challenging for low-level accounts.]
Baker, Saxe, & Tenenbaum (2007, 2009)
Goal Inference as Inverse Planning
[Scatter plots from Baker et al. (Cognition 113, 2009, 329-349): model predictions (x-axes), using best-fitting parameter settings, versus people's online goal inferences (y-axes) for all Experiment 1 trials. Panels compare a movement-based model against the inverse planning models M2 (β=2.0, γ=0.25), M3 (β=2.5, κ=0.5), and H (β=2.5); correlations range from r = 0.83 to r = 0.98. Trials 10 and 11, which are ambiguous between goals but globally unambiguous, are plotted in black; they account for most of the outliers for the alternative models, and on them the inverse planning models predicted the pattern of subjects' judgments more accurately than the alternatives.]

Summary
• Bayesian inverse planning models correlate highly with adult goal inferences
• A movement-based model had a slightly lower correlation with people's goal inferences, showing particular difficulty in trials with long-range intentional structure
• Related "inverse decision making" models: Baker, Tenenbaum & Saxe, 2006; Verma & Rao, 2006; Baker, Goodman & Tenenbaum, 2008; Yoshida, Dolan & Friston, 2008, 2010; Lucas, Griffiths, Xu & Fawcett, 2009; Goodman, Baker & Tenenbaum, 2009; Bergen, Evans & Tenenbaum, 2010; Ullman, Baker, Macindoe, Evans, Goodman & Tenenbaum, 2010; Tauber & Steyvers, 2011; Jern & Kemp, 2011
Baker, Saxe, & Tenenbaum (2009)
The Present Studies

Goal Inference as Inverse Planning vs. Joint Belief-Desire Inference in a Bayesian Theory of Mind
[Schematic comparison. Left, MDP goal inference: the Situation (World State, Agent State) and Goal feed Planning, which produces the Action (principle of rational action). Right, Partially Observable MDP: the Agent Observes the Situation and forms a Belief; Belief and Desire feed Planning, which produces the Action (principles of rational action and rational belief).]

FOOD TRUCKS
• Food trucks: Korean (K), Lebanese (L), Mexican (M)
• Parking for food trucks is limited; parking spots are marked by yellow squares
• All possibilities for the two spots: {(K,L), (K,M), (K,N), (L,K), (L,M), (L,N), (M,K), (M,L), (M,N), (N,K), (N,L), (N,M)}
• In the example trial, the possibilities are {(K,L), (K,M), (K,N)}
• Participants rate the agent's Degree of Desire (1-7) for K, L, M and (Initial) Degree of Belief (1-7) over L, M, N

Bayesian Theory of Mind: Computational Problem
P(Belief, Desire | Action, Situation) = P(Action | Belief, Desire) P(Belief | Situation) P(Desire) / P(Action | Situation)
Forward problem: POMDP planning. Inverse problem: Bayesian social inference.

Generative Model: POMDP
• (Initial) Degree of Belief over possible worlds, e.g. L: 2/3, M: 1/6, N: 1/6
  Possible worlds: (L)ebanese behind the building; (M)exican behind the building; (N)othing behind the building
• Degree of Desire (1-7) for each truck, e.g. K: 3, L: 1, M: 7

Hypothesis
• Combinatorial experimental design + fine-grained Belief & Desire judgments = variation captured by BToM
• Comparison with alternative models will demonstrate the necessity of mentalistic representation

Assumptions
• The agent's degree of desire is represented on a discrete 1-7 utility scale for K, L, M
• Utilities trade off against the relative cost of movement
• The agent's beliefs are represented as a probability distribution over possible worlds L, M, N
• The joint hypothesis space of beliefs and desires is enumerated from sets of discretized utilities and belief points

Representing Belief, Desire
Degree of Belief (distribution over possible worlds L, M, N):
  Definitely "L":        (1.0, 0.0, 0.0)
  Definitely "M":        (0.0, 1.0, 0.0)
  Definitely "N":        (0.0, 0.0, 1.0)
  Completely uncertain:  (1/3, 1/3, 1/3)

Degree of Desire: Korean 1-7, Lebanese 1-7, Mexican 1-7 => trades off against the relative cost of movement

Possible worlds: 1. (L)ebanese behind the building; 2. (M)exican behind the building; 3. (N)othing behind the building
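A sketch of how the joint hypothesis space just described could be enumerated and scored. The action_loglik function stands in for the POMDP planner's likelihood P(Action | Belief, Desire) and is an assumed placeholder, not part of the published model code.

```python
import numpy as np
from itertools import product

def btom_posterior(action_loglik):
    """Enumerate discretized desires and belief points, then apply Bayes' rule (sketch).

    action_loglik(desires, belief) should return log P(Action | Belief, Desire),
    computed by forward POMDP planning (not implemented here).
    Desires: utilities 1-7 for (K, L, M); beliefs: points over worlds (L, M, N).
    """
    belief_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
                     (1/3, 1/3, 1/3)]                          # coarse belief grid
    hypotheses, log_scores = [], []
    for desires in product(range(1, 8), repeat=3):             # 7^3 desire settings
        for belief in belief_points:
            hypotheses.append((desires, belief))
            log_scores.append(action_loglik(desires, belief))  # flat priors assumed
    w = np.exp(np.array(log_scores) - max(log_scores))
    return hypotheses, w / w.sum()                             # joint posterior
```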
!
W
S1
S2
ST
O2
OT
!
O1
!
B2
!
B1
!
!
!
A1
!
!
BT
!
A2
Related work: ZeVlemoyer, Milch & Kaelbling (2009), Choi & Kim (2009, 2011) !
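The principle of rational belief corresponds to the B_{t-1} -> B_t update in the graphical model above; a minimal sketch (illustrative names, assumed inputs) is below. The TrueBelief and FixedBelief alternatives on the next slide amount to replacing this update with a point mass on the true world, or skipping it entirely.

```python
import numpy as np

def update_belief(prior_belief, obs_likelihood):
    """Rational belief update B_{t-1} -> B_t (sketch).

    prior_belief[w]   = P(world = w) before the new observation.
    obs_likelihood[w] = P(O_t | world = w, agent state S_t), e.g. determined
                        by what the agent can see from its current location.
    """
    post = np.asarray(prior_belief, dtype=float) * np.asarray(obs_likelihood, dtype=float)
    return post / post.sum()
```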
Alternative Models
• TrueBelief
  – Assumption: the agent knows which truck is in each parking spot.
  – An "associationist" model that infers Desire based on which truck the agent moves toward.
• FixedBelief
  – Assumption: the agent begins with an initial belief state that never gets updated.
  – A simple theory of Belief that does not depend on perceptual access to the World.

Experimental Design
• 3 environments: Gap-R, Gap-L, Gap-RL
• The initially occluded truck ("L") could be Present or Absent
• 5 paths:
  – Check-Left; "L"
  – Check-Left; "K"
  – Check-Right; "L"
  – Check-Right; "K"
  – No-Check; "K"
• Randomized (=> 54 total trials):
  – Trial order
  – Truck labels (ordering scrambled on each trial)
  – X-, Y-reflection
  – Agent color, name/gender
  – Belief/Desire rating order

Procedure

Qualitative Results
[Example stimuli with people's mean desire ratings (1-7 for K, L, M) and initial belief ratings (over L, M, N), alongside the model's predictions.]
(Baker, Saxe, & Tenenbaum, 2011)
More Qualitative Results
[Model versus people: desire (1-7 for K, L, M) and belief (over L, M, N) judgments for conditions A1, A2, B1, B2, C1-C4.]
(Baker, Saxe, & Tenenbaum, 2011)
Overall Results
[Scatter plots: people's desire inferences (1-7) and belief inferences versus BToM model predictions.]

Correlations with people's judgments:
              BToM    TrueBelief   FixedBelief
  Desire      0.90    0.61         0.67
  Belief      0.76    0.39         0.11

Irrational trials: cases where paths had no rational interpretation according to the model.
(Baker, Saxe, & Tenenbaum, 2011)
Discussion
• BToM captures fine-grained adult Desire judgments very accurately
• Belief judgments were predicted less accurately by BToM, but still relatively well compared to simpler alternative models
• "Irrational trials" illustrate where the POMDP-based theory fails, but also where it succeeds
• What does BToM tell us about children's ToM development?

Experiment (3-6 year-olds)
• Experimental phase: "Check-Stay". Testing phase: "No-Check", "Check-Turn"
• Which fruit is Bunny's favorite? Which fruit is Bunny's least favorite?
Richardson, Baker, Tenenbaum & Saxe (2012; in prep.)

Modeling
• BToM
  – Possible worlds: {Y+R, Y, R, -}
  – Bunny strongly desires one fruit over the others
• "Desire Theorist" (DT)
  – Bunny is assumed to know the true World State
  – Bunny strongly desires one fruit over the others
Richardson, Baker, Tenenbaum & Saxe (2012)
Results: "Favorite" judgments
[Proportion of responses (humans: ages 3-4, 5-6, adult) and model probabilities (DT, BToM) for the favorite fruit in the Check-Stay, No-Check, and Check-Turn conditions.]
Richardson, Baker, Tenenbaum & Saxe (2012)

Results: "Least Favorite" judgments
[Proportion of responses (humans: ages 3-4, 5-6, adult) and model probabilities (DT, BToM) for the least favorite fruit in the Check-Stay, No-Check, and Check-Turn conditions.]
Richardson, Baker, Tenenbaum & Saxe (2012)
Discussion
• The ability to perform joint Belief-Desire attributions increases from ages 3-6
• The trajectory parallels the development of the ability to pass the Sally-Anne false-belief task
• BToM and DT provide accounts of the discontinuity between 3 and 5, and of the continuity from preschool to adulthood
• Ongoing work: 1-3 year-olds
Richardson, Baker, Tenenbaum & Saxe (2012; in prep.)

Learning What is Where with BToM
• Does ToM allow us to learn about the World by observing others?
• Can we extend BToM to Situations where aspects of the World State are unknown to the Observer?
[Schematic comparison: Joint Belief-Desire Inference versus Joint Belief and World-State Inference. In both, the Situation (World State, Agent State) yields the agent's Observation; the principle of rational belief links the Observation to the agent's Belief, and the principle of rational action links Belief and Desire to the Action. In the second, the Observer must also infer the World State itself.]
FOOD CARTS
• At a certain university food hall, there are three carts that serve food every day: Afghani (A), Burmese (B) and Colombian (C)
• The carts can park in any configuration in the North (N), West (W), and East (E) spots on any given day
• Harold likes A best, B second best, and C least
• TASK: figure out where each cart is, based on where Harold walks
[Example sight-lines from various locations.]
Jara-Ettinger, Baker & Tenenbaum (2012)

FOOD CARTS
• All possible locations for the three carts (N, W, E) are shown
• Sometimes carts A and B can be closed
• Harold always goes to his favorite cart that is open
• Subjects rated the likelihood of each configuration on a 0-10 scale
• They were encouraged to consider whether the carts were open or closed in making their inferences, but ratings were only collected for the 6 possible configurations
Jara-Ettinger, Baker & Tenenbaum (2012)

Experimental Design & Procedure
• First, participants completed a training phase in which carts were always "open"
  – They inferred configurations for 3 training examples
• Before the testing phase, the possibility that A and B could be "closed" was explained with a step-by-step example
• Testing stimuli included 9 "complete paths" and 6 "partial paths", generated by enumerating all valid action sequences
• In the testing phase, subjects saw all trajectories in randomized order
Jara-Ettinger, Baker & Tenenbaum (2012)
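To extend BToM to unknown world states, the observer's hypothesis space can also range over cart configurations and open/closed status. The sketch below is illustrative only: path_loglik is an assumed placeholder for the BToM likelihood of Harold's observed path under each hypothesis.

```python
import numpy as np
from itertools import permutations, product

def world_posterior(path_loglik):
    """Joint inference over cart configurations (sketch, illustrative names).

    Hypotheses: assignments of carts (A, B, C) to spots (N, W, E), crossed with
    open/closed status for A and B.  path_loglik(config, open_status) should
    return log P(observed path | configuration, status) from the BToM model.
    Returns the posterior over the 6 configurations, marginalizing open/closed.
    """
    configs = list(permutations('ABC'))                   # 6 spot assignments
    scores = np.zeros(len(configs))
    for i, config in enumerate(configs):
        for a_open, b_open in product([True, False], repeat=2):
            scores[i] += np.exp(path_loglik(config, (a_open, b_open)))  # flat priors
    return configs, scores / scores.sum()
```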
Alternative Model (F-40)
• A cue-based alternative model with 7 key features and 40 free parameters (one for each feature, plus an additive constant, multiplied by 5 independent response variables) was fit to human responses using multinomial logistic regression
[Key features illustrated on the N/W/E map.]
Jara-Ettinger, Baker & Tenenbaum (2012)
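For comparison, a cue-based model of this general kind can be fit with off-the-shelf multinomial logistic regression. The sketch below is schematic (random placeholder data, simplified feature and response structure), not a reimplementation of the actual F-40 model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: one row of 7 path features per trial, and the configuration
# (0-5) that subjects rated most likely on that trial.  Purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))          # 7 cue-based features per observed path
y = rng.integers(0, 6, size=200)       # most-preferred configuration label

clf = LogisticRegression(max_iter=1000).fit(X, y)   # multinomial fit over 6 classes
print(clf.predict_proba(X[:1]))        # predicted rating profile for one path
```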
Example Qualitative Results
[Bar plots: human judgments, BToM model, and F-40 model ratings over the six cart configurations, for a training path, an incomplete path, and a complete path.]
Jara-Ettinger, Baker & Tenenbaum (2012)
Results (All Conditions)
[Bar plots of human judgments, BToM model, and F-40 model ratings over the six cart configurations, for every testing condition.]
Jara-Ettinger, Baker & Tenenbaum (2012)

Illustrative BToM vs. F-40 Contrasts
Jara-Ettinger, Baker & Tenenbaum (2012)
Quantitative Results
[Scatter plots: mean human judgments from each rating, in each condition (114 = 19 x 6 in total), as a function of BToM (r = 0.91) and F-40 (r = 0.64) model predictions.]
• BToM predicts people's inferences closely, while the range of F-40 is compressed and its overall correlation is lower
Jara-Ettinger, Baker & Tenenbaum (2012)
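For concreteness, the correlations reported above can be computed directly from paired vectors of mean human ratings and model predictions. The arrays below are random stand-ins, not the experimental data.

```python
import numpy as np

# Hypothetical arrays: one entry per (condition, configuration) pair (114 total).
human_means = np.random.rand(114)      # stand-in for mean human ratings
model_preds = np.random.rand(114)      # stand-in for BToM (or F-40) predictions

r = np.corrcoef(human_means, model_preds)[0, 1]   # Pearson correlation
print(f"r = {r:.2f}")
```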
Individual Subjects Analysis
[Scatter plot: each point is the overall correlation of one subject's ratings with F-40 (y-axis), plotted against that subject's correlation with BToM (x-axis).]
• 80% of subjects had a higher correlation with BToM than with F-40
• The red "X" indicates the correlation of BToM with the mean human judgments vs. that of F-40
Jara-Ettinger, Baker & Tenenbaum (2012)
Summary
• People use ToM to make joint mental-state inferences, and inferences about aspects of the world, from observations of behavior
• Human inferences come surprisingly close to those of an ideal BToM observer in two challenging tasks
[Schematic comparison, as before: Joint Belief-Desire Inference versus Joint Belief and World-State Inference.]
Discussion
• Introduced intuitive principles of social inference:
  – Principle of rational action
  – Principle of rational belief
  – Principle of rational interaction

Principle of Rational Interaction
[Schematic. Goal Inference as Inverse Planning (single agent, MDP) extends to Social Goal Inference as Inverse Planning: Agent 1 and Agent 2 each have a Goal, a Belief about the other's goal, and an Agent State within a shared World State; each plans an Action. This is a Markov game.]
Baker, Goodman, & Tenenbaum (2008); Ullman, Baker, Macindoe, Evans, Goodman & Tenenbaum (2009)
Discussion
• Introduced intuitive principles of social inference:
  – Principle of rational action
  – Principle of rational belief
  – Principle of rational interaction
  – Subsumed by formal accounts of rational agency
• Proposed social inference mechanisms:
  – Inverse Planning
  – Inverse Inference

Discussion and Future Research
• Limitations
  – Are rationality principles limited to social inference about simple spatial situations and actions?
  – Algorithmic issues
    • Naive enumeration methods for inference are wildly intractable; optimization / search-based methods are an active research area (Ng & Russell, 2000; Dvijotham & Todorov, 2009; Ziebart, Bagnell & Dey, 2011; Choi & Kim, 2011)

Future Work
• Goal Inference as Inverse Planning (MDP): principle of rational action
• Joint Belief-Desire Inference with a Bayesian Theory of Mind (POMDP): principles of rational action and rational belief
• Social Goal Inference as Inverse Planning (Markov game): principle of rational interaction
• Partially observable stochastic game (POSG)

Conclusions
• In several experiments, people's inferences about agents' goals, joint beliefs and desires, and social intentions approach those of an ideal Bayesian observer
• Abstract principles of rationality (action, belief, and interaction) capture aspects of the knowledge and mechanisms enabling human social inference
• Further work is needed to connect these proposals to infant and child development

Thank you!
Acknowledgements
• Noah Goodman • Tomer Ullman • Julian Jara-Ettinger • Hilary Richardson • Leslie Kaelbling • Nancy Kanwisher • Tenenbaum & Saxe Lab members
Questions?