1 Game Theory

Problem set 8, from Osborne's An Introduction to Game Theory:
Ex. (426.1), 428.1, 429.1, 430.1, 431.1, (431.2)
Repeated games
The grim (trigger) strategy
A reminder
1. Begin by playing C and do not initiate a deviation from C
2. If the other played D, play D for ever after.
Is the grim strategy a Nash equilibrium?
i.e. is the pair (grim, grim) a N.E.?
Not in a finitely repeated Prisoners' Dilemma.
Punishment does not seem to work in the finitely repeated game.
         C      D
  C     2,2    0,3
  D     3,0    1,1
Every Nash equilibrium of the finitely repeated P.D.
generates a path along which the players play only D
Proof:
Consider the last time that any of the players plays C
along the Nash Equilibrium path. (assume it is player 1)
After that period they both play D
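Along the equilibrium path:
  player 1   …  C  D  D  D  D  D  …
  player 2   …  ?  D  D  D  D  D  …
If he switches to play D:
  player 1   …  D  D  D  D  D  D  …
  player 2   …  ?  ?  ?  ?  ?  ?  …
He is better off in the period of the switch, and in every later period D earns him at least 1, whatever player 2 plays. So the original path could not have been a Nash equilibrium path.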
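The argument can be checked numerically for the grim strategy. The following is a minimal sketch, assuming a 5-period game with undiscounted total payoffs; the names `play`, `grim` and `grim_defect_last` are hypothetical and are not taken from the slides.

```python
# Stage-game payoffs from the slides: (payoff to player 1, payoff to player 2).
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (0, 3),
          ("D", "C"): (3, 0), ("D", "D"): (1, 1)}

def grim(opp_history, t, T):
    """Play C until the other player has played D, then D forever."""
    return "D" if "D" in opp_history else "C"

def grim_defect_last(opp_history, t, T):
    """Like grim, but defect in the final period (t runs from 0 to T-1)."""
    if t == T - 1:
        return "D"
    return grim(opp_history, t, T)

def play(s1, s2, T):
    """Total (undiscounted) payoffs when s1 meets s2 for T periods."""
    h1, h2 = [], []
    total1 = total2 = 0
    for t in range(T):
        a1 = s1(h2, t, T)  # each strategy sees the other's past actions
        a2 = s2(h1, t, T)
        h1.append(a1)
        h2.append(a2)
        u1, u2 = PAYOFF[(a1, a2)]
        total1 += u1
        total2 += u2
    return total1, total2

T = 5  # assumed horizon, for illustration only
print(play(grim, grim, T))              # (10, 10)
print(play(grim_defect_last, grim, T))  # (11, 8)
```

Defecting in the last period cannot be punished, so it is a profitable deviation from (grim, grim) in the finite game.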
An infinitely repeated Prisoners' Dilemma
A reminder: sub-games
[Figure: game tree of the repeated game, in which players 1 and 2 choose C or D at every node; each node starts a sub-game]
An infinitely repeated game
A history at time t is:
$$\{a^1, a^2, \dots, a^t\}$$
where $a^i$ is a vector of actions taken at time i;
$a^i$ is [C,C] or [D,C] etc.
A strategy is a function that assigns an action to each history:
$$S(a^1, a^2, \dots, a^t) \in \{C, D\} \quad \text{for all histories } (a^1, a^2, \dots, a^t)$$
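As an illustration of a strategy as a function of histories, here is a small sketch of the grim strategy for player 1. The type alias `History` and the function names are assumptions made for this example.

```python
from typing import List, Tuple

# A history is a list of action profiles (action of player 1, action of player 2).
History = List[Tuple[str, str]]

def grim_for_player_1(history: History) -> str:
    """Return C unless player 2 has played D somewhere in the history."""
    return "D" if any(a2 == "D" for _, a2 in history) else "C"

def always_defect(history: History) -> str:
    """A strategy that ignores the history altogether."""
    return "D"

print(grim_for_player_1([]))                        # C
print(grim_for_player_1([("C", "C"), ("C", "D")]))  # D
```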
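The payoff of player 1 following a history
$$\{a^0, a^1, \dots, a^t, \dots\}$$
is a stream $\{G_1(a^0), G_1(a^1), \dots, G_1(a^t), \dots\}$.
The utility of a stream is its discounted average:
$$u(w_0, w_1, \dots, w_t, \dots) = (1-\delta)\sum_{t=0}^{\infty} \delta^t w_t = c$$
where c is the constant payoff s.t.
$$\sum_{t=0}^{\infty} \delta^t w_t = \sum_{t=0}^{\infty} \delta^t c$$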
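A short numerical sketch of the discounted average, assuming the infinite sum is truncated after a long finite prefix; the function name `discounted_average` and the sample streams are illustrative.

```python
def discounted_average(stream, delta):
    """(1 - delta) * sum over t of delta**t * w_t, for a finite prefix of the stream."""
    return (1 - delta) * sum(delta**t * w for t, w in enumerate(stream))

delta = 0.9
# A constant stream of 2 has discounted average 2 (up to truncation error).
print(discounted_average([2] * 500, delta))
# A stream that pays 3 once and 1 afterwards is worth (1 - delta) * 3 + delta * 1.
print(discounted_average([3] + [1] * 499, delta))
```

The factor (1 - δ) rescales the discounted sum so that a constant stream c has utility exactly c.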
If the payoff stream of a player is a cycle (of length n):
$$w_0, w_1, \dots, w_{n-1},\; w_0, w_1, \dots, w_{n-1},\; w_0, w_1, \dots, w_{n-1},\; \dots$$
his utility is:
$$(1-\delta)\Big[\, w_0 + \delta w_1 + \delta^2 w_2 + \dots + \delta^{n-1} w_{n-1} + \delta^{n}\big(w_0 + \delta w_1 + \dots + \delta^{n-1} w_{n-1}\big) + \delta^{2n}\big(w_0 + \delta w_1 + \dots + \delta^{n-1} w_{n-1}\big) + \dots \Big]$$
$$= (1-\delta)\,\frac{w_0 + \delta w_1 + \delta^2 w_2 + \dots + \delta^{n-1} w_{n-1}}{1-\delta^{n}}$$
$$= \frac{w_0 + \delta w_1 + \delta^2 w_2 + \dots + \delta^{n-1} w_{n-1}}{1 + \delta + \delta^2 + \dots + \delta^{n-1}} \;\xrightarrow[\delta \to 1]{}\; \frac{w_0 + w_1 + w_2 + \dots + w_{n-1}}{n}$$
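The limit can be observed numerically. This is a sketch under the assumption of an arbitrary example cycle (3, 0, 1); `cycle_utility` is a hypothetical helper implementing the closed form (1 - δ)(w0 + δw1 + ... )/(1 - δ^n).

```python
def cycle_utility(cycle, delta):
    """Discounted average of an infinitely repeated cycle of payoffs."""
    n = len(cycle)
    one_pass = sum(delta**k * w for k, w in enumerate(cycle))
    return (1 - delta) * one_pass / (1 - delta**n)

cycle = [3, 0, 1]  # example cycle, chosen only for illustration
for delta in (0.5, 0.9, 0.99, 0.999):
    print(delta, cycle_utility(cycle, delta))

# As delta approaches 1 the utility approaches the plain average of the cycle.
print(sum(cycle) / len(cycle))
```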
Is the pair (grim, grim) a N.E.?
Not in a finitely repeated Prisoners' Dilemma.
(grim, grim) is a N.E. in the infinitely repeated P.D. if the discount factor δ is sufficiently large,
i.e. if the future is sufficiently important.
Assume that player 2 plays 'grim'.
If at some time t player 1 considers deviating from C (for the first time):
  time       …  t-2  t-1   t   t+1  t+2  …
  player 1   …   C    C    D    D    D   …
  player 2   …   C    C    C    D    D   …
(once player 2 punishes with D forever, player 1's best reply after the deviation is D)
while if he did not deviate:
  time       …  t-2  t-1   t   t+1  t+2  …
  player 1   …   C    C    C    C    C   …
  player 2   …   C    C    C    C    C   …
The payoffs of player 1 from the deviation period on, with the stage game
         C      D
  C     2,2    0,3
  D     3,0    1,1
If he deviates:
$$3 + \sum_{t=1}^{\infty} \delta^t$$
If he does not deviate:
$$2 + \sum_{t=1}^{\infty} 2\delta^t$$
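Player 1 will not deviate if:
$$2 + \sum_{t=1}^{\infty} 2\delta^t \;\ge\; 3 + \sum_{t=1}^{\infty} \delta^t \;\iff\; \sum_{t=1}^{\infty} \delta^t = \frac{\delta}{1-\delta} \ge 1 \;\iff\; \delta \ge \frac{1}{2}$$
(grim, grim) is a N.E. if the discount factor δ is sufficiently large,
i.e. if the future is sufficiently important.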
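The no-deviation condition can be checked for a few values of δ. The sketch below uses the closed form δ/(1 - δ) for the geometric sum; the function names are hypothetical.

```python
def payoff_if_deviate(delta):
    """3 today, then 1 in every later period: 3 + sum_{t>=1} delta**t."""
    return 3 + delta / (1 - delta)

def payoff_if_cooperate(delta):
    """2 in every period: 2 + sum_{t>=1} 2 * delta**t."""
    return 2 + 2 * delta / (1 - delta)

for delta in (0.3, 0.5, 0.7):
    stays = payoff_if_cooperate(delta) >= payoff_if_deviate(delta)
    print(delta, payoff_if_cooperate(delta), payoff_if_deviate(delta), stays)
```

With these stage payoffs the two streams are equally good at δ = 1/2, and cooperation is strictly better for any larger δ.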
However,
(grim, grim) is not a sub-game perfect equilibrium of the game.
Assume player 1 follows the grim strategy, and that in the last period (C,D) was played.
Player 1's (grim) reaction will be:
  player 1   …  C  D  D  D  …
If player 2 follows grim:
  player 2   …  D  C  D  D  …
but he could do better with:
  player 2   …  D  D  D  D  …
Following grim, player 2 earns 0 in the first period of this sub-game and 1 afterwards; playing D at once earns him 1 in every period.
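The comparison in this sub-game can be made concrete. The sketch below assumes δ = 0.9 and truncates the infinite stream after 2000 periods; `PAYOFF_2`, `discounted_average` and the variable names are illustrative.

```python
# Payoff to player 2, keyed by (action of player 1, action of player 2).
PAYOFF_2 = {("C", "C"): 2, ("C", "D"): 3, ("D", "C"): 0, ("D", "D"): 1}

def discounted_average(p1_actions, p2_actions, delta):
    """Discounted average payoff of player 2 along a (truncated) path."""
    return (1 - delta) * sum(
        delta**t * PAYOFF_2[(a1, a2)]
        for t, (a1, a2) in enumerate(zip(p1_actions, p2_actions))
    )

HORIZON = 2000  # truncation of the infinite game, assumed long enough
delta = 0.9     # assumed discount factor

player_1 = ["D"] * HORIZON                    # grim reaction to player 2's D
follow_grim = ["C"] + ["D"] * (HORIZON - 1)   # player 2 sticks to grim
defect_now = ["D"] * HORIZON                  # player 2 plays D immediately

u_grim = discounted_average(player_1, follow_grim, delta)
u_defect = discounted_average(player_1, defect_now, delta)
print(u_grim, u_defect)
```

In the sub-game, following grim is worth about δ to player 2, while defecting at once is worth about 1, so the deviation pays for every δ < 1.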
Strategies as Finite Automata
• A finite automaton has a finite number of states (+ initial state)
• Each state is characterized by an action
• Input changes the state of the automaton
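The grim strategy
[Figure: automaton with two states. The initial state plays C and stays there on input C; input D moves it to a state that plays D, which it never leaves.]
A state and its action
Inputs: The actions of the other player {C,D}
The transition: How inputs change the state
Initial State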
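A finite automaton of this kind is easy to sketch in code. The class `Automaton`, the state names `coop` and `punish`, and the helper `run` are assumptions made for illustration.

```python
class Automaton:
    """A strategy as a finite automaton: states, one action per state, transitions."""

    def __init__(self, initial, action, transition):
        self.state = initial          # initial state
        self.action = action          # state -> action played in that state
        self.transition = transition  # (state, input) -> next state

    def act(self):
        return self.action[self.state]

    def update(self, opp_action):
        """The input is the action just played by the other player."""
        self.state = self.transition[(self.state, opp_action)]

def make_grim():
    return Automaton(
        initial="coop",
        action={"coop": "C", "punish": "D"},
        transition={("coop", "C"): "coop", ("coop", "D"): "punish",
                    ("punish", "C"): "punish", ("punish", "D"): "punish"},
    )

def run(automaton, opp_actions):
    """Actions the automaton plays, period by period, against given inputs."""
    played = []
    for opp in opp_actions:
        played.append(automaton.act())
        automaton.update(opp)
    return played

print(run(make_grim(), ["C", "C", "D", "C", "C"]))  # ['C', 'C', 'C', 'D', 'D']
```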
Some more strategies
Modified Grim
[Figure: automaton diagrams of modified grim strategies, with states that play C or D and transitions labelled C, D or C,D]
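Some more strategies
Tit for Tat
[Figure: automaton with two states. The initial state plays C and stays there on input C; input D moves it to a state that plays D, which stays on input D and returns to the initial state on input C.]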
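Tit for Tat can be written as a two-state transition table. This is a minimal sketch; the dictionary layout and the names `TIT_FOR_TAT` and `run` are hypothetical.

```python
# Each state is named after the action it plays; the input is the other player's action.
TIT_FOR_TAT = {
    "initial": "C",
    "transition": {("C", "C"): "C", ("C", "D"): "D",
                   ("D", "C"): "C", ("D", "D"): "D"},
}

def run(automaton, opp_actions):
    """Actions played period by period against a given sequence of inputs."""
    state = automaton["initial"]
    played = []
    for opp in opp_actions:
        played.append(state)  # the state's name is the action it plays
        state = automaton["transition"][(state, opp)]
    return played

# Tit for Tat starts with C and then repeats the other player's previous action.
print(run(TIT_FOR_TAT, ["C", "D", "D", "C", "C"]))  # ['C', 'C', 'D', 'D', 'C']
```

Unlike grim, the automaton returns to the cooperative state as soon as the other player plays C again.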
Axelrod's Tournament
Robert Axelrod: The Evolution of Cooperation, 1984
Tit for Tat (Nice !!!)
[Figure: Robert Axelrod; automaton diagrams of strategies, including Tit for Tat]
Can you still bite ???
Modified Tit for Tat
[Figure: automaton diagrams with the captions "a modification", "'simpler' than" and "a strategy that exploits the weakness of"; the strategies the captions point to appear only in the drawings]
What payoffs are N.E. payoffs of the infinitely repeated P.D.?
The FEASIBLE payoffs as δ → 1:
[Figure: the set of feasible payoff pairs in the (π1, π2) plane, spanned by the stage-game payoffs (2,2), (0,3), (3,0) and (1,1)]
Clearly, Nash equilibrium payoffs are ≥ (1,1).
The folk theorem (R. Aumann, J. Friedman):
All feasible payoffs above (1,1) can be obtained as Nash equilibrium payoffs.
[Figure: the region of feasible payoffs above (1,1) in the (π1, π2) plane]
Proof:
Choose a point in the region of feasible payoffs above (1,1).
It can be represented as:
$$\alpha_1 (2,2) + \alpha_2 (3,0) + \alpha_3 (0,3) + \alpha_4 (1,1), \qquad \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 1, \quad \alpha_i \ge 0$$
The coefficients αi can be approximated by rational numbers. Assume:
$$\alpha_i = \frac{n_i}{m}, \quad n_i, m \text{ are integers}, \qquad n_1 + n_2 + n_3 + n_4 = m$$
Consider the cycle of length m:
$$\underbrace{(2,2),\dots,(2,2)}_{n_1},\; \underbrace{(3,0),\dots,(3,0)}_{n_2},\; \underbrace{(0,3),\dots,(0,3)}_{n_3},\; \underbrace{(1,1),\dots,(1,1)}_{n_4}$$
If the players follow this cycle, their payoff will be approximately the chosen point when the discount factor δ is close to 1.
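The construction can be traced on an example. The sketch below assumes the weights (1/2, 1/4, 1/4, 0), which give the target point (1.75, 1.75); `build_cycle` and `cycle_utility` are hypothetical helper names.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Stage-game outcomes in the order used in the proof: (2,2), (3,0), (0,3), (1,1).
OUTCOMES = [(2, 2), (3, 0), (0, 3), (1, 1)]

def build_cycle(alphas):
    """Cycle of length m with n_i = alpha_i * m copies of the i-th outcome."""
    m = reduce(lambda a, b: a * b // gcd(a, b),
               (a.denominator for a in alphas), 1)  # common denominator
    cycle = []
    for outcome, alpha in zip(OUTCOMES, alphas):
        cycle.extend([outcome] * int(alpha * m))
    return cycle

def cycle_utility(payoffs, delta):
    """Discounted average of an infinitely repeated cycle of payoffs."""
    n = len(payoffs)
    one_pass = sum(delta**k * w for k, w in enumerate(payoffs))
    return (1 - delta) * one_pass / (1 - delta**n)

# Assumed example weights; they sum to 1 and are already rational.
alphas = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)]
cycle = build_cycle(alphas)
print(cycle)  # [(2, 2), (2, 2), (3, 0), (0, 3)]

delta = 0.999
u1 = cycle_utility([p[0] for p in cycle], delta)
u2 = cycle_utility([p[1] for p in cycle], delta)
print(u1, u2)  # both close to 1.75
```

The two payoffs differ slightly for δ < 1 because the order of outcomes within the cycle matters under discounting; the difference vanishes as δ approaches 1.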
A strategy:
Follow the sequence of the cycle as long as the other player does.
If not, play D forever.
This pair of strategies is a N.E.
One Deviation Property and Agent Equilibria
One Deviation Property
A player cannot increase his payoff in a sub-game in
which he is the first to move, by changing his action in
that node only.
A Theorem
A strategy profile is a sub-game perfect equilibrium in an
extensive game with perfect information iff both strategies
have the one deviation property
(no proof)