Simultaneous-move games with mixed strategies

Simultaneous-move games
with mixed strategies
1
Until now, the strategy of each agent corresponded to a single
action. Such strategies in simultaneous-move games are called
pure strategies.
Ex: (prisoners’ dilemma)
Cooperate
Both pure strategies
Defect
2
No equilibrium: no pair of actions best response
to each other
HINGIS
DL
CC
DL
50
80
CC
90
20
SELES
3
pure strategy in a simultanous-move game = action
also = expectation on other’s action
Take a step back in time:
Hingis’ expectation on other’s action is uncertain.
Thus
Hingis’ best-response is uncertain
We want to model this uncertainty
4
To handle these issues, we will introduce:
1.
2.
mixed strategies
mixed strategy Nash equilibrium
5
What is a mixed strategy?
It specifies that a pure strategy be chosen randomly
that is,
it assigns probabilities to the agent's pure strategies.
Choose a mixed strategy =
choose a probability (of playing) for each pure strategy
A mixed strategy is a probability distribution on pure strategies
6
Intuition for a mixed strategy
A population of agents (say tennis players)
Interpretation 1:
Some percentage p of them plays DL all the time
The remaining percentage (1-p) of them plays CC all the time
Interpretation 2:
Each agent sometimes plays DL
and sometimes plays CC
( p percent of the time )
( (1-p) percent of the time )
7
Example: (chicken game between Mercedes and BMW drivers)
Some Mercedes drivers always Swerve, some always go straight
Example:(tennis game) sometimes DL, sometimes CC
Example: (the chicken game)
I throw a coin. If heads, I swerve; if tails, I go straight.
So a mixed strategy is two probability numbers:
1/2 for Swerve (in general p for Swerve)
1/2 for Straight (in general 1-p for Straight)
8
NOTE:
Every pure strategy is a degenerate mixed strategy
It simply says that this pure strategy be chosen 100% of the time.
That is, you assign the probability number 1 to that pure strategy and
the probability number 0 to all other strategies.
9
A (strategic) game with mixed strategies is
1. A set of players
N
2. For each player i in N, a set of his strategies:
?
3. For each player i in N, his payoff function:
?
10
Defining mixed strategies
Si
the set of pure strategies of player i
Π(Si ) the set probability distributions on Si
= the set of mixed strategies of player i
πi in Π(Si )
πi(si )
is a typical mixed strategy for i
the probability of player i playing pure strategy si
πi : Si Æ [0,1] is a function such that the sum of πi(si ) numbers is 1
11
Defining payoffs of mixed strategies
ui( . ) player i’s payoffs from pure strategies
Example:
ui( si , s-i )
Ui( . ) player i’s payoffs from mixed strategies
Example:
Ui( πi , π-i )
Important assumption:
Ui( πi , π-i ) is the expected payoff of i from lottery ( πi , π-i )
it is a weighted average of ui( si , s-i ) values where
the weight of
ui( si , s-i ) is
π1 (s1 ). π2 (s2 ). … πn (sn )
12
The expected payoff of a lottery
Example:
Say you will get payoff X1 with probability p1
X2 with probability p2
……….
Xn with probability pn
Then your expected payoff is the weighted average:
p1 X1 + p2 X2 +…. + pn Xn
13
Agent i’s expected payoff from the mixed strategy profile: ( π1 ,…, πn )
The summation is
over all pure
strategy profiles.
The probability of the
pure strategy profile
(s1,…,sn) being played
Agent i’s payoff from the
pure strategy profile
(s1,…,sn)
14
πSeles(DL) = 0.4
πSeles(CC) = 0.6
πHingis(DL) = 0.7
πHingis(CC) = 0.3
HINGIS
0.7
0.4
SELES
0.6
DL
CC
0.28
0.42
DL
50
90
0.3
0.12
0.18
CC
80
20
0.28 50 + 0.12 80 + 0.42 90 + 0.18 20
= 65
15
π1(DL) = p
π1(CC) = (1-p)
π2(DL) = q
π2(CC) = (1-q)
HINGIS
DL
CC
DL
50
80
CC
90
20
SELES
16
A (strategic) game with mixed strategies is
1. A set of players
N
2. For each player i in N,
a set of his pure strategies:
Si
and from that a set of his mixed strategies: Π(Si )
3. For each player i in N,
his payoff function on pure strategies:
and from that his payoff function on mixed strategies:
ui( . )
Ui( . )
17
Payoffs are no more ordinal
With pure strategies, two games are equivalent if in them players’
ranking of outcomes are identical
Ex: prisoner’s dilemma and students doing a joint project
This is no more true when mixed strategies are allowed
Because, the players’ ranking of lotteries might be different in the
two games.
18
Example:
Consider the lottery which gives 0.25 probability to each cell
COLUMN
L
ROW
C
T
2, 1
1, 0
B
1, 0
-2, 1
COLUMN
L
ROW
C
T
8, 1
1, 0
B
1, 0
-2, 1
19
Nash equilibrium in mixed strategies:
Specify a mixed strategy for each agent
that is, choose a mixed strategy profile
with the property that
each agent’s mixed strategy is a
best response
to her opponents’ strategies.
Intuition for mixed strategy Nash equilibrium
It is a steady state of the society in which the frequency of each
action is fixed
(with pure strategies it was a fixed action instead)20
Seles vs. Hingis: a zero-sum game with no pure strategy Nash eq.
What would the expectations be in reality?
HINGIS
SELES
DL
CC
DL
50
80
CC
90
20
FIGURE 5.1 Seles’s Success Percentages in the Tennis Point
21
Copyright © 2000 by W.W. Norton & Company
Extending the game
Allow mixed strategies πH for Hingis:
play DL with probability q and
play CC with probability (1-q)
If Seles plays DL, her expected payoff is
USeles ( DL , πH ) = 50 q + 80 (1-q)
If Seles plays CC, her expected payoff is
USeles ( CC , πH ) = 90 q + 20 (1-q)
22
Extending the game table for the column player
An infinite number of new columns
HINGIS
SELES
DL
CC
q-Mix
DL
50
80
50q + 80(1 – q)
CC
90
20
90q + 20(1 – q)
Given a q-choice for Hingis, what will Seles choose?
FIGURE 5.4 Payoff Table with Hingis’s Mixed Strategy
23
Copyright © 2000 by W.W. Norton & Company
Seles’s
Success (%)
When Seles
plays
DL and CC
90
80
62
50
20
0
0.6
1
Hingis’s q-Mix
FIGURE 5.5 Diagrammatic Solution for Hingis’s q-Mix
24
Copyright © 2000 by W.W. Norton & Company
Seles’ best responses to different q choices of Hingis:
Against q s.t. q < 0.6
Seles plays DL
Against q s.t. 0.6 < q
Seles plays CC
Against
Both pure strategies best response
q = 0.6
If two pure strategies are both best
responses, then any mixture of them is
also a best response
25
Fix q = 0.6
Then
USeles ( DL , πH ) = USeles ( CC , πH ) = 62
Fix a mixed strategy πS for Seles:
play DL with probability p and
play CC with probability (1-p)
Seles’ expected payoff from (πS , πH ) is
USeles (πS , πH ) = pq USeles( DL , DL ) + p(1-q) USeles( DL , CC )
+ (1-p)q USeles( CC , DL ) + (1-p)(1-q) USeles( CC , CC )
= p USeles( DL , πH ) + (1-p) USeles( CC , πH )
= p 62 + (1-p) 62 = 62 .
26
Seles’ best responses to different q choices of Hingis:
Against q s.t. q < 0.6
Seles plays p = 1
Against q s.t. 0.6 < q
Seles plays p = 0
Against
Seles plays any p in [0,1]
q = 0.6
27
The best response curve of Seles:
28
What will Hingis do?
Fix a mixed strategy πS for Seles:
play DL with probability p and
play CC with probability (1-p)
If Hingis plays DL, her expected payoff is:
UHingis ( πS , DL ) = 100 – [50 p + 90 (1-p)] = 100 - USeles ( πS , DL )
If Hingis plays CC, her expected payoff is:
UHingis ( πS , CC ) = 100 – [80 p + 20 (1-p)] = 100 - USeles ( πS , CC )
29
Extending the game table for the row player
An infinite number of new rows
HINGIS
SELES
DL
CC
DL
50
80
CC
90
20
p-Mix
50p + 90(1 – p) 80p + 20(1 – p)
Given a p-choice for Seles, what will Hingis choose?
FIGURE 5.2 Payoff Table with Seles’s Mixed Strategy
30
Copyright © 2000 by W.W. Norton & Company
Seles’ payoffs (Hingis’ payoffs are 100 minus these)
Seles’s
Success (%)
Hingis
playing
DL and CC
90
DL
80
62
50
CC
20
0
0.7
1
Seles’s p-Mix
FIGURE 5.3 Diagrammatic Solution for Seles’s p-Mix
31
Copyright © 2000 by W.W. Norton & Company
Hingis’ payoffs
0.7
100
Seles’s
p-Mix
1
80
50
CC
38
Hingiss’s
20
Success (%)
10
DL
FIGURE 5.3 Diagrammatic Solution for Seles’sp-Mix
32
Hingis’ best responses to different p choices of Seles:
Against p s.t. p < 0.7
Hingis plays CC
Against p s.t. 0.7 < p
Hingis plays DL
Against
Both pure strategies best response
p = 0.7
( => all mixed strategies best response )
33
Hingis’ best responses to different p choices of Seles:
Against p s.t. p < 0.7
Hingis plays q = 0
Against p s.t. 0.7 < p
Hingis plays q = 1
Against
Hingis plays any q in [0,1]
p = 0.7
34
The best response curve of Hingis:
35
Best response
curve of
Hingis
Best response
curve of
Seles
q1
Best response
curves
combined
p1
q1
0.6
0
0
0.7
1
p
0
0
FIGURE 5.6 Best-Response Curves and Equilibrium
0.6
1
q
0
0
0.7
36
1
p
Copyright © 2000 by W.W. Norton & Company
The mixed strategy Nash equilibrium is
( ( 0.7 , 0.3 ) , ( 0.6 , 0.4 ) )
and the payoff profile resulting from this equilibrium is
( 62 , 100-62 )
Note:
At the equilibrium, Seles gets the same expected payoff from DL and CC
USeles ( DL , πH ) = USeles ( CC , πH ) = 62
Similarly, Hingis gets the same payoff from DL and CC
UHingis ( πS , DL ) = UHingis ( πS , CC ) =100 -62
THIS IS SOMETHING GENERAL !
37
Very Useful Proposition:
If a player is playing a mixed strategy as a best response
and if he assigns positive probability to two his actions (say A and B)
then his expected payoffs from these two actions are equal
Proof:
Suppose his expected payoff from A was larger than B
(note that his current payoff is a weighted average of these)
Then he can transfer some probability (i.e. weight) from B to A and
increase his payoff
But this means, his original strategy was not a best response, a
contradiction. So, the expected payoffs of A and B must be equal. 38
Use this proposition to develop a faster way of finding equilibrium
Dixit and Skeath call it: “leave the opponent indifferent” method
HINGIS
SELES
DL
CC
q-Mix
DL
50
80
50q + 80(1 – q)
CC
90
20
90q + 20(1 – q)
p-Mix
50p + 90(1 – p)
80p + 20(1 – p)
39
Conterintuitive change in mixture probabilities
Hingis’ payoffs from DL has increased: will her q choice increase?
HINGIS
SELES
DL
CC
q-Mix
DL
30
80
30q + 80(1 – q)
CC
90
20
90q + 20(1 – q)
p-Mix
30p + 90(1 – p)
80p + 20(1 – p)
40
DEAN
JAMES
Swerve
Straight
Swerve
0, 0
–1, 1
– (1 – q) = q – 1,
(1 – q)
Straight
1, –1
–2, –2
q – 2(1 – q) = 3q – 2,
– q – 2(1 – q) = q – 2
p-Mix
q-Mix
1 – p,
– p – 2(1 – p) = p – 2,
– (1 – p) = p – 1
p – 2(1 – p) = 3p – 2
FIGURE 5.7 Mixing Strategies in Chicken
41
Copyright © 2000 by W.W. Norton & Company
Dean’s
Payoff
James’
Payoff
Upper
envelope 1
1
0
0.5
James’
p-Mix
–0. 5
–1
–2
1
Upper
envelope
1
0
1
0.5
1
Dean’s
q-Mix
– 0. 5
–1
When Dean plays
Straight and Swerve
–2
FIGURE 5.8 Optimal Responses with Mixed Strategies in Chicken
When James plays
Straight and Swerve
42
Copyright © 2000 by W.W. Norton & Company
p1
q1
q1
0.5
0
0
0.5
1
p
0
0
0.5
FIGURE 5.9 Best-Response Curves and Mixed Strategy
Equilibria in Chicken
1
q
0
0
0.5
1
p
43
Copyright © 2000 by W.W. Norton & Company
Observe 1:
A pure strategy Nash equilibrium of a game in pure strategies,
after mixed strategies are allowed,
continues to be an equilibrium
They are now mixed strategy equilibria where agents choose very
simple (degenerate) mixed strategies (i.e., play this action with
probability 1 and play everything else with probability 0)
Observe 2:
A Nash equilibrium in mixed strategies where each agent plays a
pure strategy with probability one,
after mixed strategies are dropped,
continues to be an equilibrium in pure strategies.
44
HUMANITIES FACULTY
Lab
SCIENCE
Theater
FACULTY
p-Mix
Lab
Theater
q-Mix
2, 1
0, 0
2q, q
0, 0
1, 2
1 – q, 2(1– q)
2p, p
1 – p, 2(1 – p)
FIGURE 5.10 Mixing in the Battle-of-the-Two-Cultures Game
45
Copyright © 2000 by W.W. Norton & Company
Humanities’
Payoff
Science’s
Payoff
2
2
Humanities
choose
Theater and Lab
Sciences
choose
Theater and Lab
1
0
1
2 /3
1
Science’s p-Mix
FIGURE 5.11 Best Responses with Mixed Strategies in the Battle
of the Two Cultures
0
1/ 3
1
Humanities’ q-Mix
46
Copyright © 2000 by W.W. Norton & Company
q 1
1/3
0
0
2 /3
FIGURE 5.12 Mixed-Strategy Equilibria in the Battle of the Two Cultures
1
p
47
Copyright © 2000 by W.W. Norton & Company