Multi-player, non-zero-sum games
[Figure: game tree with a utility tuple at each node; tuples shown include (4,3,2), (1,5,2), (7,4,1), and (7,7,1)]
• Utilities are tuples
• Each player maximizes their own utility at each node
• Utilities get propagated (backed up) from children to parents
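The backup rule in the bullets above can be sketched in Python. This is only an illustration under stated assumptions, not something taken from the slides: the nested-list tree encoding, the function name `backup`, the round-robin turn order by depth, and the example leaf utilities are all hypothetical.

```python
def backup(node, player=0, num_players=3):
    """Return the utility tuple backed up to `node`.

    A leaf is a tuple of utilities (one entry per player); an internal
    node is a list of children. Assumption: players move in round-robin
    order, so the mover at depth d is player d % num_players.
    """
    if isinstance(node, tuple):  # leaf: utilities are given directly
        return node
    next_player = (player + 1) % num_players
    child_values = [backup(child, next_player, num_players) for child in node]
    # The player to move keeps the child tuple that is best for them.
    return max(child_values, key=lambda utilities: utilities[player])


# Hypothetical tree: player 0 moves at the root, player 1 next, player 2 last.
tree = [
    [[(1, 2, 6), (4, 3, 2)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 3, 2)], [(7, 7, 1), (5, 4, 5)]],
]
print(backup(tree))  # (5, 4, 5)
```

Each player looks only at their own component of the tuple, which is why the value at the root need not be the best outcome for any other player.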
Game theory
• Game theory deals with systems of interacting
agents where the outcome for an agent depends
on the actions of all the other agents
– Applied in sociology, politics, economics, biology,
and, of course, AI
• Agent design: determining the best strategy for a rational agent in a given game
• Mechanism design: how to set the rules of the game to ensure a desirable outcome
Simultaneous single-move games
• Players must choose their actions at the same time, without
knowing what the others will do
– Form of partial observability
Normal form representation:
                     Player 2
Player 1     0,0      1,-1    -1,1
            -1,1      0,0      1,-1
             1,-1    -1,1      0,0
Payoff matrix
(row player’s utility is listed first)
Note: this is a zero-sum game
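The zero-sum property can be checked mechanically: in every cell, the two utilities must add up to zero. The sketch below assumes the matrix is stored as a list of rows of (row utility, column utility) pairs; the names `payoffs` and `is_zero_sum` are illustrative.

```python
# Payoff matrix from the slide: each cell is (row player's utility, column player's utility).
payoffs = [
    [(0, 0), (1, -1), (-1, 1)],
    [(-1, 1), (0, 0), (1, -1)],
    [(1, -1), (-1, 1), (0, 0)],
]


def is_zero_sum(matrix):
    """True if the utilities in every cell sum to zero."""
    return all(sum(cell) == 0 for row in matrix for cell in row)


print(is_zero_sum(payoffs))  # True
```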
Prisoner’s dilemma
• Two criminals have been arrested and the police visit them separately
• If one player testifies against the other and the other refuses, the one who testified goes free and the one who refused gets a 10-year sentence
• If both players testify against each other, they each get a 5-year sentence
• If both refuse to testify, they
each get a 1-year sentence
                Alice: Testify    Alice: Refuse
Bob: Testify    -5,-5             -10,0
Bob: Refuse     0,-10             -1,-1
(Alice’s utility is listed first)
Prisoner’s dilemma
• Alice’s reasoning:
– Suppose Bob testifies. Then I get 5 years if I testify and 10 years if I refuse. So I should testify.
– Suppose Bob refuses. Then I go free if I testify, and get 1 year if I refuse. So I should testify.
• Dominant strategy: A strategy
whose outcome is better for the
player regardless of the strategy
chosen by the other player
                Alice: Testify    Alice: Refuse
Bob: Testify    -5,-5             -10,0
Bob: Refuse     0,-10             -1,-1
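Alice's case-by-case reasoning can be automated. The following is a minimal sketch, assuming "better" means strictly better against every action of the opponent; the dictionary encoding and the name `dominant_strategy` are illustrative choices, and the payoffs are the ones on the slide.

```python
ACTIONS = ("testify", "refuse")

# payoff[(alice_action, bob_action)] = (alice_utility, bob_utility)
payoff = {
    ("testify", "testify"): (-5, -5),
    ("refuse", "testify"): (-10, 0),
    ("testify", "refuse"): (0, -10),
    ("refuse", "refuse"): (-1, -1),
}


def dominant_strategy(payoff, player):
    """Return the strictly dominant action for `player` (0 = Alice, 1 = Bob), or None."""

    def utility(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return payoff[profile][player]

    for candidate in ACTIONS:
        alternatives = [a for a in ACTIONS if a != candidate]
        # The candidate must beat every alternative against every opponent action.
        if all(utility(candidate, opp) > utility(alt, opp)
               for alt in alternatives for opp in ACTIONS):
            return candidate
    return None


print(dominant_strategy(payoff, 0))  # testify
print(dominant_strategy(payoff, 1))  # testify
```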
Prisoner’s dilemma
• Nash equilibrium: A pair of
strategies such that no player
can get a bigger payoff by
switching strategies, provided
the other player sticks with the
same strategy
– (Testify, testify) is a dominant
strategy equilibrium
• Pareto optimal outcome: It is
impossible to make one of the
players better off without making
another one worse off
• In a non-zero-sum game, a Nash
equilibrium is not necessarily
Pareto optimal!
                Alice: Testify    Alice: Refuse
Bob: Testify    -5,-5             -10,0
Bob: Refuse     0,-10             -1,-1
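Both definitions can be tested by brute force over the four outcomes. The sketch below is an illustration restricted to pure strategies; the function names `is_nash` and `is_pareto_optimal` and the dictionary encoding are assumptions of this example, not part of the slides.

```python
from itertools import product

ACTIONS = ("testify", "refuse")

# payoff[(alice_action, bob_action)] = (alice_utility, bob_utility)
payoff = {
    ("testify", "testify"): (-5, -5),
    ("refuse", "testify"): (-10, 0),
    ("testify", "refuse"): (0, -10),
    ("refuse", "refuse"): (-1, -1),
}


def is_nash(profile):
    """No player gets a bigger payoff by switching alone."""
    alice, bob = profile
    alice_ok = all(payoff[(a, bob)][0] <= payoff[profile][0] for a in ACTIONS)
    bob_ok = all(payoff[(alice, b)][1] <= payoff[profile][1] for b in ACTIONS)
    return alice_ok and bob_ok


def is_pareto_optimal(profile):
    """No other outcome helps one player without hurting the other."""
    current = payoff[profile]
    for other in payoff.values():
        nobody_worse = all(o >= c for o, c in zip(other, current))
        somebody_better = any(o > c for o, c in zip(other, current))
        if nobody_worse and somebody_better:
            return False
    return True


profiles = list(product(ACTIONS, repeat=2))
nash = [p for p in profiles if is_nash(p)]
pareto = [p for p in profiles if is_pareto_optimal(p)]
print(nash)    # [('testify', 'testify')]
print(pareto)  # [('testify', 'refuse'), ('refuse', 'testify'), ('refuse', 'refuse')]
```

The only Nash equilibrium, (testify, testify), is also the only outcome that is not Pareto optimal, since (refuse, refuse) is better for both players.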
Prisoner’s dilemma in real life
• Price war
• Arms race
• Steroid use
              Cooperate             Defect
Cooperate     Win – win             Win big – lose big
Defect        Lose big – win big    Lose – lose
http://en.wikipedia.org/wiki/Prisoner’s_dilemma
Is there any reasonable way to
get a better answer?
• Superrationality (Douglas Hofstadter)
– Assume that the answer to a symmetric problem will
be the same for both players
– Maximize the payoff to each player while considering
only identical strategies
– Not a conventional model in game theory
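As a rough illustration of the superrational rule, the sketch below restricts attention to outcomes where both players choose the same action and picks the best of those. It assumes pure strategies and a symmetric game, and reuses the prisoner's dilemma payoffs from the earlier slides; the name `superrational_choice` is hypothetical.

```python
ACTIONS = ("testify", "refuse")

# payoff[(alice_action, bob_action)] = (alice_utility, bob_utility)
payoff = {
    ("testify", "testify"): (-5, -5),
    ("refuse", "testify"): (-10, 0),
    ("testify", "refuse"): (0, -10),
    ("refuse", "refuse"): (-1, -1),
}


def superrational_choice(payoff, actions):
    """Best action when both players are assumed to choose identically."""
    # In a symmetric game both players get the same payoff on the diagonal,
    # so it is enough to look at the first player's utility.
    return max(actions, key=lambda action: payoff[(action, action)][0])


print(superrational_choice(payoff, ACTIONS))  # refuse
```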
Stag hunt
                  Hunter 1: Stag    Hunter 1: Hare
Hunter 2: Stag    2,2               1,0
Hunter 2: Hare    0,1               1,1
• Is there a dominant strategy for either player?
• Is there a Nash equilibrium?
– (Stag, stag) and (hare, hare)
• Model for cooperative activity
Prisoner’s dilemma vs. stag hunt
Prisoner’s dilemma
              Cooperate             Defect
Cooperate     Win – win             Win big – lose big
Defect        Lose big – win big    Lose – lose
Players can gain by defecting unilaterally
Stag hunt
              Cooperate             Defect
Cooperate     Win big – win big     Win – lose
Defect        Lose – win            Win – win
Players lose by defecting unilaterally
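The difference between the two games comes down to the sign of one number: what a player gains by defecting alone when both were cooperating. The sketch below uses the numeric payoffs from the earlier slides and assumes the reading "cooperate" = refuse / hunt stag and "defect" = testify / hunt hare; the function name is illustrative.

```python
# game[(my_action, their_action)] = my utility; both games are symmetric.
# "C" = cooperate (refuse / hunt stag), "D" = defect (testify / hunt hare).
prisoners_dilemma = {("C", "C"): -1, ("C", "D"): -10, ("D", "C"): 0, ("D", "D"): -5}
stag_hunt = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 1, ("D", "D"): 1}


def gain_from_unilateral_defection(game):
    """Change in my utility if I defect while the other player keeps cooperating."""
    return game[("D", "C")] - game[("C", "C")]


print(gain_from_unilateral_defection(prisoners_dilemma))  # 1  (defecting pays)
print(gain_from_unilateral_defection(stag_hunt))          # -1 (defecting costs)
```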
Coordination game (Battle of the sexes)
                     Wife: Ballet    Wife: Football
Husband: Ballet      3, 2            0, 0
Husband: Football    0, 0            2, 3
or
                     Wife: Ballet    Wife: Football
Husband: Ballet      3, 2            0, 0
Husband: Football    1, 1            2, 3
• Is there a dominant strategy?
• Is there a Nash equilibrium?
– (Ballet, ballet) or (football, football)
• How do we figure out which equilibrium to choose?
Game of Chicken
                      Player 2: Straight    Player 2: Chicken
Player 1: Straight    -10, -10              1, -1
Player 1: Chicken     -1, 1                 0, 0
(Player 1’s utility is listed first)
• Is there a dominant strategy for either player?
• Is there a Nash equilibrium?
– (Straight, chicken) or (chicken, straight)
• Anti-coordination game: it is mutually beneficial for the
two players to choose different strategies
– Model of escalated conflict in humans and animals (hawk-dove game)
• How are the players to decide what to do?
– Pre-commitment or threats
– Different roles: the “hawk” is the territory owner and the “dove” is the intruder, or vice versa
http://en.wikipedia.org/wiki/Game_of_chicken