Stat155 Game Theory Lecture 13: General-Sum Games - b

Outline for today
Stat155
Game Theory
Lecture 13: General-Sum Games
Two-player general-sum games
Peter Bartlett
Multiplayer general-sum games
Definitions: payoff matrices, dominant strategies, safety strategies,
Nash equilibrium.
Example: Cheetahs and gazelles
Nash equilibrium
Example: Polluting factories
October 11, 2016
1 / 21
General-sum games
2 / 21
General-sum games
Dominated pure strategies
Notation
A pure strategy ei for Player I is dominated by ei 0 in payoff matrix A if, for
all j ∈ {1, . . . , n},
aij ≤ ai 0 j .
A two-person general-sum game is specified by two payoff matrices,
A, B ∈ Rm×n .
Simultaneously, Player I chooses i ∈ {1, . . . , m} and the Player II
chooses j ∈ {1, . . . , n}.
Similarly, a pure strategy ej for Player II is dominated by ej 0 in payoff
matrix B if, for all i ∈ {1, . . . , m},
Player I receives payoff aij .
Player II receives payoff bij .
bij ≤ bij 0 .
3 / 21
4 / 21
General-sum games
General-sum games
Nash equilibria
Safety strategies
A pair (x∗ , y∗ ) ∈ ∆m × ∆n is a Nash equilibrium for payoff matrices
A, B ∈ Rm×n if
A safety strategy for Player I is an x∗ ∈ ∆m that satisfies
max x > Ay∗ = x∗> Ay∗ ,
min x∗> Ay = max min x > Ay .
y ∈∆n
x∈∆m
x∈∆m y ∈∆n
max x∗> By = x∗> By∗ .
x∗ maximizes the worst case expected gain for Player I.
y ∈∆n
Similarly, a safety strategy for Player II is a y∗ ∈ ∆n that satisfies
If Player I plays x∗ and Player II plays y∗ , neither player has an
incentive to unilaterally deviate.
min x > By∗ = max min x > By .
x∈∆m
y ∈∆n x∈∆m
x∗ is a best response to y∗ , y∗ is a best response to x∗ .
y∗ maximizes the worst case expected gain for Player II.
In general-sum games, there might be many Nash equilibria, with
different payoff vectors.
5 / 21
Example: Cheetahs and Gazelles
6 / 21
Example: Cheetahs and Gazelles
Dominant strategy?
Payoff matrices
large
small
(Karlin and Peres, 2016)
large
(`/2, `/2)
(s, `)
Payoff matrices
small
(`, s)
(s/2, s/2)
large
small
(s ≤ `).
(s ≤ `).
7 / 21
large
(`/2, `/2)
(s, `)
small
(`, s)
(s/2, s/2)
For ` ≥ 2s, large is a
dominant strategy.
Suppose ` ≤ 2s.
Pure Nash equilibria?
(large, small), (small, large).
But who gets the large
gazelle?
8 / 21
Example: Cheetahs and Gazelles
Example: Cheetahs and Gazelles
Example: ` = 8, s = 6.
Equilibrium is when L(x) = S(x):
x ∗ = (2` − s)/(` + s) = 5/7.
Mixed Nash equilibrium?
If Cheetah I plays Pr(large) = x, Cheetah II’s payoffs are:
large:
small:
`
L(x) = x + `(1 − x),
2
s
S(x) = sx + (1 − x).
2
Equilibrium is when these are equal: x ∗ = (2` − s)/(` + s).
(Karlin and Peres, 2016)
Think of x as the proportion of a
population of cheetahs that would
greedily pursue the large gazelle.
For a randomly chosen pair of
cheetahs, if x > x ∗ , S(x) > L(x),
and non-greedy cheetahs will do
better. And vice versa. Evolution
pushes the proportion to x ∗ . This is
the evolutionarily stable strategy.
9 / 21
Comparing two-player general-sum and zero-sum games
10 / 21
Comparing two-player general-sum and zero-sum games
Zero-sum games
1
A pair of safety strategies is a Nash equilibrium (minimax theorem)
2
Hence, there is always a Nash equilibrium.
3
If there are multiple Nash equilibria, they form a convex set, and the
expected payoff is identical within that set.
Thus, any two Nash equilibria give the same payoff.
Zero-sum games
4
If each player has an equalizing mixed strategy
(that is, x > A = v 1> and Ay = v 1),
then this pair of strategies is a Nash equilibrium.
(from the principle of indifference)
General-sum games
General-sum games
4
1
A pair of safety strategies might be unstable.
(Opponent aims to maximize their payoff, not minimize mine.)
2
There is always a Nash equilibrium (Nash’s Theorem).
3
There can be multiple Nash equilibria, with different payoff vectors.
11 / 21
If each player has an equalizing mixed strategy
for their opponent’s payoff matrix
(that is, x > B = v2 1> and Ay = v1 1),
then this pair of strategies is a Nash equilibrium.
12 / 21
Outline
Multiplayer general-sum games
Notation
A k-person general-sum game is specified by k utility functions,
uj : S1 × S2 × · · · × Sk → R.
Two-player general-sum games
Player j can choose strategies sj ∈ Sj .
Definitions: payoff matrices, dominant strategies, safety strategies,
Nash equilibrium.
Example: Cheetahs and gazelles
Simultaneously, each player chooses a strategy.
Player j receives payoff uj (s1 , . . . , sk ).
Multiplayer general-sum games
Nash equilibrium
Example: Polluting factories
k = 2: u1 (i, j) = aij , u2 (i, j) = bij .
For s = (s1 , . . . , sk ), let s−i denote the strategies without the ith one:
s−i = (s1 , . . . , si−1 , si+1 , . . . , sk ).
And write (si , s−i ) as the full vector.
14 / 21
13 / 21
Multiplayer general-sum games
Multiplayer general-sum games
Definition
Definition
A vector (s1∗ , . . . , sk∗ ) ∈ S1 × · · · × Sk is a pure Nash equilibrium for utility
functions u1 , . . . , uk if, for each player j ∈ {1, . . . , k},
max uj (sj , s∗−j ) = uj (sj∗ , s∗−j ).
sj ∈Sj
If the players play these sj∗ , nobody has an incentive to unilaterally
deviate: each player’s strategy is a best response to the other players’
strategies.
A sequence (x1∗ , . . . , xk∗ ) ∈ ∆S1 × · · · × ∆Sk (called a strategy profile) is a
Nash equilibrium for utility functions u1 , . . . , uk if, for each player
j ∈ {1, . . . , k},
max uj (xj , x∗−j ) = uj (xj∗ , x∗−j ).
xj ∈∆Sj
Here, we define
uj (x∗ ) = Es1 ∼x1 ,...,sk ∼xk uj (s1 , . . . , sk )
X
=
x1 (s1 ) · · · xk (sk )uj (s1 , . . . , sk ).
s1 ∈S1 ,...,sk ∈Sk
If the players play these mixed strategies xj∗ , nobody has an incentive
to unilaterally deviate: each player’s mixed strategy is a best response
to the other players’ mixed strategies.
15 / 21
16 / 21
Example: Polluting Factories
Example: Polluting Factories
Set xi = (pi , 1 − pi ), that is,
Pr(Player i plays purify) = pi .
For 0 < pi < 1, we have a Nash
equilibrium iff
ui (purify, x−i ) = ui (pollute, x−i ).
(Karlin and Peres, 2016)
Solving shows there are two
symmetric mixed Nash equilibria:
√
3± 3
p1 = p2 = p3 =
.
6
(Karlin and Peres, 2016)
(Karlin and Peres, 2016)
Pure equilibria?
(purify, purify, pollute), (purify, pollute, purify), (pollute, purify, purify).
“Tragedy of the commons”: (pollute, pollute, pollute)
18 / 21
17 / 21
Example: Polluting Factories
pi = Pr(i plays purify).
Plot: cost for p1 = p2 = p3 .
Blue curve: −ui (purify, x−i )
= p 2 + 2p(1 − p) + 4(1 − p)2
Red curve: −ui (pollute, x−i )
= 6p(1 − p) + 3(1 − p)2 .
4.0
3.0
What√if p is a little less than
(3 + 3)/6 ≈ 0.79?
purify has lower cost.
What about
√ near
p = (3 − 3)/6 ≈ 0.21?
Not an attractor!
2.5
2.0
1.5
1.0
0.5
0.0
0.0
Imagine that we draw random
factories from a population
with proportion p that pollute.
What√if p is a little more than
(3 + 3)/6 ≈ 0.79?
pollute has lower cost.
purify
pollute
3.5
Example: Polluting Factories
0.2
0.4
0.6
p:=p_1=p_2=p_3
0.8
1.0
(Karlin and Peres, 2016)
4.0
purify
pollute
3.5
Nash equilibria:
(p, p, p) with
√
p = (3 + √3)/6 ≈ 0.79.
p = (3 − 3)/6 ≈ 0.21.
p = 0.
(purify, purify, pollute),
(purify, pollute, purify),
(pollute, purify, purify).
3.0
2.5
2.0
1.5
1.0
What about near p = 0?
pollute has lower cost.
0.5
0.0
0.0
19 / 21
0.2
0.4
0.6
p:=p_1=p_2=p_3
0.8
1.0
20 / 21
Outline
Two-player general-sum games
Definitions: payoff matrices, dominant strategies, safety strategies,
Nash equilibrium.
Example: Cheetahs and gazelles
Multiplayer general-sum games
Nash equilibrium
Example: Polluting factories
21 / 21