MSDM Workshop @ AAMAS-09
Two Level Recursive Reasoning by Humans
Playing Sequential Fixed-Sum Games
Authors:
Adam Goodie, Prashant Doshi, Diana Young
Depts. of Psychology and Computer Science
University of Georgia
Outline

Introduction
◦ Recursive reasoning
◦ Related work

Experimental study
◦ Problem setting
◦ Participants
◦ Methodology
Results
Discussion

Recursive reasoning

Strategic recursive reasoning in multi-agent settings
(what do I think that you think that I think...)

Multi-agent decision making frameworks
◦ RMM
◦ I-POMDP

Theory-of-Mind

Real-world application settings
◦ UAV
Related work (I)

Harsanyi (1967)
◦ agent types
◦ common knowledge

Mertens and Zamir (1985)
◦ hierarchical belief system

Aumann (1999)
◦ recursive beliefs
Related work (II)
Theory of mind (ToM) and behavioral game theory

Stahl and Wilson (1995)
◦ a symmetric 3×3 matrix game
◦ 4% of subjects attributed recursive reasoning to their opponents

Hedden and Zhang (2002)
◦ a sequential, two-player, general-sum game (Centipede game)
◦ subjects predominantly began with first-level reasoning
◦ a low percentage of subjects used second-level reasoning when pitted
against first-level co-players

Ficici and Pfeffer (2008)
◦ a 3-player, one-shot negotiation game
◦ subjects reasoned about others while negotiating
◦ insufficient evidence to determine whether level-two models fit
the observed data better than level-one models
Experimental study
Problem setting
Participants
Methodology

◦ Opponent models
◦ Payoff structures
◦ Design of task
Problem Setting



Two-player alternating-move
Fixed-sum
Complete and perfect information
UAV cover story scenario
Probabilities for players I and II in the cover story scenario
Participants

162 subjects
Undergraduate students enrolled in lower-level
Psychology courses at the University of Georgia
Incentives

◦ performance-contingent monetary rewards
◦ partial course credit
Methodology

Opponent models
◦ myopic
◦ predictive
Payoff structure
Design of task

◦ training phase
◦ test phase
Opponent models
1. Myopic (First-level reasoning)
Player II chooses its action based on the outcomes at
states B and C
2. Predictive (Second-level reasoning)
Player II chooses its action by reasoning about what a rational
player I will do.
Payoff Structure

trivial games
◦ D < C < B < A
◦ A < B < C < D

diagnostic game
◦ C < B < A < D
◦ Different action choices for different opponent models
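
The link between the two opponent models and the payoff orderings can be checked with a minimal sketch. It rests on assumptions that the slides do not spell out: each ordering ranks player I's payoff at states A to D, player II receives the complement (fixed-sum), and at A, B and C the player to move either stays (ending the game there) or moves to the next state. The function names and the payoff values 0.2 to 0.8 are hypothetical and chosen only for illustration.

```python
# Hypothetical model of the game; the structure is an assumption, not taken from the slides.
# p[s] is player I's payoff if the game ends at state s; player II gets 1 - p[s].

def player_i_at_c(p):
    """Player I's last decision: stay at C or move on to D."""
    return "move" if p["D"] > p["C"] else "stay"

def myopic_ii(p):
    """Myopic player II compares only the outcomes at B and C."""
    return "move" if p["C"] < p["B"] else "stay"

def predictive_ii(p):
    """Predictive player II anticipates player I's rational choice at C."""
    end = "D" if player_i_at_c(p) == "move" else "C"
    return "move" if p[end] < p["B"] else "stay"

def best_first_move(p, opponent):
    """Player I's payoff-maximizing choice at A against the given opponent model."""
    if opponent(p) == "stay":
        outcome = "B"
    else:
        outcome = "D" if player_i_at_c(p) == "move" else "C"
    return "move" if p[outcome] > p["A"] else "stay"

def payoffs(ordering):
    """Turn an ordering such as 'C<B<A<D' into illustrative payoffs 0.2, 0.4, 0.6, 0.8."""
    states = ordering.replace(" ", "").split("<")
    return {s: (rank + 1) / 5 for rank, s in enumerate(states)}

for ordering in ["D<C<B<A", "A<B<C<D", "C<B<A<D"]:
    p = payoffs(ordering)
    print(ordering,
          "| vs myopic:", best_first_move(p, myopic_ii),
          "| vs predictive:", best_first_move(p, predictive_ii))
```

Under these assumptions, the two trivial orderings give player I the same best first move against either opponent, while the diagnostic ordering gives "move" against a myopic player II and "stay" against a predictive one.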
Design of Task

Training phase
◦ trivial games
◦ criterion
▪ no rationality errors in the 5 most recent games
◦ initial phase
▪ 15 games
◦ kickoff
▪ participants who failed to meet the criterion after 40 total training games

Test phase
◦ 40 diagnostic games
◦ interspersed with 40 games of the forms C < A < B < D and D < B < A < C
◦ groups based on opponents
▪ half against myopic ones
▪ half against predictive ones
◦ In each opponent model group
▪ half played the abstract version
▪ half played the UAV cover story together with the abstract version
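
The training-phase rules listed above can be sketched as a small status check. This is one possible reading, not the authors' procedure: it assumes the criterion is first evaluated once the 15-game initial phase is complete, and the function name and return labels are hypothetical.

```python
# Hypothetical helper; the thresholds are the ones listed on the slide,
# but how they combine is an assumption.

def training_status(errors, initial_games=15, window=5, max_games=40):
    """Classify a participant after each training game.

    errors: list of booleans, one per training game played so far
            (True means a rationality error in that game).
    """
    played = len(errors)
    # Criterion: no rationality errors in the most recent `window` games,
    # checked only after the initial phase has been completed (assumption).
    if played >= initial_games and not any(errors[-window:]):
        return "advance"
    # Participants still failing the criterion after max_games are removed.
    if played >= max_games:
        return "kicked off"
    return "continue"


print(training_status([True] * 10 + [False] * 5))   # clean last 5 of 15 games
print(training_status([False] * 14 + [True]))       # error in the latest game
print(training_status([True] * 40))                 # never met the criterion
```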
Results

Time period
◦ three months (September-November 2008)

Monetary incentives
◦ 50 cents/correct action, average $30/participant

Training Phase
◦ 162 subjects (26 kicked off)

Test Phase
◦ 136 participants (70 female)


Choices were more accurate when the opponent followed the predictive model
No significant difference between the two versions of the game
Mean proportion of accurate choices across all participants in each of the 4 groups
Mean proportions marginalized over the abstract and realistic versions
Mean proportions marginalized over myopic and predictive opponents
Count of participants grouped according to their proportion of accurate choices
Discussion

UAV cover story neither improved nor reduced performance
◦ This particular cover story had no effect

Did the subjects employ minimax or backward induction?
◦ Exit questionnaire revealed that most subjects did not use these
◦ Independent evaluators concurred that most subjects thought recursively

In some settings, humans tend to reason at higher levels of recursion
Thank you
Questions?