Chapter 8
Solving Extensive Form Games
8.1 The Extensive Form of a Game
The extensive form of a game contains the following information:
(1) the set of players
(2) the order of moves (that is, who moves when)
(3) the players' payoffs as a function of the moves that were made
(4) the players' sets of actions for each move they have to make
(5) the information each player has before each move he has to make
(6) probability distributions over any exogenous events.
8.1.1 Extensive Form Presentation of a Game, Some Terminology
A game in extensive form starts at an initial decision node at which player i
makes a decision.
Each of the possible choices by player i is represented by a branch.
At the end of each branch is another decision node at which player j has to
make a choice; again each choice is represented by a branch. This can be
repeated until no more choices are made.
We then reach the end of the game, represented by terminal nodes. At each
terminal node, we list the players' payoffs arising from the sequence of moves
leading to that terminal node.
We are in a game of perfect information if all players know in which decision
node they are when making a choice.
The set of all decision nodes between which a player cannot distinguish is called an information set.
Hence, in games of perfect information each information set contains exactly
one decision node.
Note: Any simultaneous move game can be represented as a game in extensive form with imperfect information.
Example: (The Battle of the Sexes)
The battle of the sexes with normal form representation

                 Player 2
                  L       R
   Player 1  U   8,2     0,0
             D   0,0     2,8

can alternatively be represented as a game in extensive form.
Example: "Macho" game
Take the same payoff vectors. Assume that player 1 moves first.
To derive the normal form representation of this extensive form game, first write down the strategy sets:

s₁ ∈ {U, D}
s₂ ∈ {L|U ∧ L|D,  L|U ∧ R|D,  R|U ∧ L|D,  R|U ∧ R|D}

Denote
LL: L|U ∧ L|D
LR: L|U ∧ R|D
RL: R|U ∧ L|D
RR: R|U ∧ R|D
The normal form representation of the game is then

           LL      LR      RL      RR
     U    8,2     8,2     0,0     0,0
     D    0,0     2,8     0,0     2,8
where Player 1 is the row player and Player 2 is the column player.
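As an illustrative aside (this and the code are additions, not part of the original notes), the derivation of the normal form can be automated; the Python sketch below enumerates player 2's contingent plans and rebuilds the 2×4 payoff matrix above.

```python
# A minimal sketch: derive the 2x4 normal form of the "Macho" game from its
# extensive form. Player 1 moves first (U or D); player 2 observes the move and
# responds with L or R. Payoffs are the battle-of-the-sexes payoffs used above.
from itertools import product

payoff = {("U", "L"): (8, 2), ("U", "R"): (0, 0),
          ("D", "L"): (0, 0), ("D", "R"): (2, 8)}

# A strategy of player 2 is a complete contingent plan: one response for each
# possible move of player 1, e.g. ("L", "R") means "L if U and R if D" (= LR).
p2_strategies = list(product("LR", repeat=2))        # LL, LR, RL, RR
for a1 in ["U", "D"]:
    row = [payoff[(a1, s2[0] if a1 == "U" else s2[1])] for s2 in p2_strategies]
    print(a1, row)
# U [(8, 2), (8, 2), (0, 0), (0, 0)]
# D [(0, 0), (2, 8), (0, 0), (2, 8)]
```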
8.1.2 Definition of a game in extensive form
A game in extensive form consists of the following items:
(1) a finite set of nodes X, a finite set of actions A, and a finite set of players N
(2) a function p : X → X ∪ {∅} assigning a single immediate predecessor to each node x:
[p(x) is non-empty for all x ∈ X except for one node, designated the initial node x₀, for which p(x₀) = ∅. The set of immediate successors of x is then s(x) = p⁻¹(x): s(x) is empty iff node x is a terminal node (s(x) = ∅), and s(x) is nonempty iff node x is a decision node. The set of all predecessors and successors of node x is found by iterating p and s until p(x̃) = x₀ for the iterated predecessors x̃ and s(x̂) = ∅ for the iterated successors x̂.]
(3) a function α : X \ {x₀} → A giving the action that leads to any noninitial node x from its immediate predecessor p(x), with the property that if x′, x″ ∈ s(x) and x′ ≠ x″, then α(x′) ≠ α(x″). The choice set available at decision node x is c(x) = {a ∈ A | a = α(x′) for some x′ ∈ s(x)}.
(4) A collection of information sets ℋ and a function H : X → ℋ assigning each decision node x to an information set H(x) ∈ ℋ. That is, the information sets in ℋ form a partition of X. The function H satisfies: if H(x) = H(x′), then c(x) = c(x′). In words, all decision nodes assigned to a single information set have the same choices available. The choices available at information set H are written as C(H) = {a ∈ A | a ∈ c(x) for x ∈ H}.
(5) A function ι : ℋ → N ∪ {0} assigning each information set in ℋ to the player (or to Nature, denoted as player 0) who moves at the decision nodes in that set. The collection of player i's information sets is denoted by ℋᵢ = {H ∈ ℋ | i = ι(H)}.
(6) A function ρ : ℋ₀ × A → [0, 1] assigning probabilities to actions at information sets where Nature moves, satisfying ρ(H, a) = 0 if a ∉ C(H) and Σ_{a ∈ C(H)} ρ(H, a) = 1 for all H ∈ ℋ₀.
(7) A collection of payoff functions u = {u₁(·), ..., uₙ(·)} assigning utilities to the players for each terminal node that can be reached, uᵢ : T → ℝ, where T is the set of terminal nodes,

T = {x ∈ X | s(x) = ∅}.
A game in extensive form is specified by

Γ_E = {X, A, N, p(·), α(·), ℋ, H(·), ι(·), ρ(·), (uᵢ(·))ᵢ},

where p(·) is the predecessor function.
[A finite number of actions, players and moves is assumed only for expositional convenience.]
A strategy for player i is a function sᵢ : ℋᵢ → A such that

sᵢ(H) ∈ C(H) for all H ∈ ℋᵢ,

where C(H) is the set of choices available at information set H.
In words, a strategy of player i defines an action for each information set of player i.
A strategy is a complete contingent plan. It specifies actions at information sets that may not be reached during the actual play of the game.
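As a further illustration (again an addition, not from the notes), the definition translates directly into a small data structure: nodes carry the player who moves, their information set, their successors indexed by actions, and payoffs at terminal nodes; a strategy profile is a map from information sets to actions. All names in the sketch are illustrative.

```python
# Minimal sketch of an extensive form game with perfect information: every
# information set contains exactly one decision node (the "Macho" game).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    player: Optional[int] = None          # None marks a terminal node
    info_set: Optional[str] = None        # H(x): the node's information set
    children: dict = field(default_factory=dict)   # action -> successor node
    payoffs: tuple = ()                   # utilities, set only at terminal nodes

terminal = {key: Node(payoffs=p) for key, p in
            {("U", "L"): (8, 2), ("U", "R"): (0, 0),
             ("D", "L"): (0, 0), ("D", "R"): (2, 8)}.items()}

# Player 2 has one decision node (and one information set) per observed move.
xU = Node(player=2, info_set="2 after U",
          children={"L": terminal[("U", "L")], "R": terminal[("U", "R")]})
xD = Node(player=2, info_set="2 after D",
          children={"L": terminal[("D", "L")], "R": terminal[("D", "D" if False else "R")]})
x0 = Node(player=1, info_set="1 at root", children={"U": xU, "D": xD})

def outcome(node, strategy_profile):
    """Follow a strategy profile (info_set -> action) to a terminal node."""
    while node.player is not None:
        node = node.children[strategy_profile[node.info_set]]
    return node.payoffs

# Player 1 plays D; player 2's complete contingent plan is "R after U, R after D".
print(outcome(x0, {"1 at root": "D", "2 after U": "R", "2 after D": "R"}))  # (2, 8)
```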
8.2 Nash Equilibrium
Note that each strategy profile (s₁(·), ..., sₙ(·)) leads to a terminal node (if Nature does not move) or a probability distribution over terminal nodes if Nature moves.
We can thus rewrite (expected) utility functions to depend on strategy profiles, denoted by Uᵢ.
We can then define a Nash equilibrium of the game Γ_E as a strategy profile (s₁*(·), ..., sₙ*(·)) such that for all i

Uᵢ(sᵢ*(·), s₋ᵢ*(·)) ≥ Uᵢ(sᵢ(·), s₋ᵢ*(·))

for all sᵢ with sᵢ(H) ∈ C(H) for all H ∈ ℋᵢ.
We can also consider mixed strategies σᵢ(·), which are randomizations over pure strategies.
The notion of Nash equilibrium then generalizes as in Chapter 5.
Example: (The Battle of the Sexes)
We already analyzed the NE of this game.
To find Nash equilibria it is convenient to work with the normal form.
Example: ("The Macho Game")
           LL      LR      RL      RR
     U    8,2     8,2     0,0     0,0
     D    0,0     2,8     0,0     2,8
To support (D, RR) as an equilibrium, player 2 has to play R at the information set that would be reached if player 1 played U.
This is the threat of player 2 to choose the bad outcome (0,0).
This threat is not credible: after the choice U, a rational player 2 should choose L.
Hence, we can reject the equilibrium (D,RR) [and also the equilibrium (U,LL)]
as containing threats that are not credible. We may therefore want to consider
a stronger equilibrium concept that excludes threats that are not credible.
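A brute-force check of the pure Nash equilibria of this normal form takes only a few lines; the following sketch (an illustrative addition, not from the notes) confirms that (U, LL), (U, LR) and (D, RR) are the pure NE. The remaining equilibrium (U, LR) is the one in which player 2's behavior after both moves is sequentially rational.

```python
# Enumerate pure-strategy Nash equilibria of the Macho game's normal form.
rows, cols = ["U", "D"], ["LL", "LR", "RL", "RR"]
payoffs = {
    ("U", "LL"): (8, 2), ("U", "LR"): (8, 2), ("U", "RL"): (0, 0), ("U", "RR"): (0, 0),
    ("D", "LL"): (0, 0), ("D", "LR"): (2, 8), ("D", "RL"): (0, 0), ("D", "RR"): (2, 8),
}

def is_nash(r, c):
    u1, u2 = payoffs[(r, c)]
    no_dev_1 = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)  # player 1 cannot gain
    no_dev_2 = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)  # player 2 cannot gain
    return no_dev_1 and no_dev_2

print([rc for rc in payoffs if is_nash(*rc)])
# [('U', 'LL'), ('U', 'LR'), ('D', 'RR')]
```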
8.3 Subgame Perfect Nash Equilibrium, Backward Induction
Definition 1 A subgame of an extensive form game Γ_E is a subset of the game with the following properties:
A subgame starts with a single decision node. It contains exactly this decision node and all of its successors.
If a decision node x is in the subgame, then all x′ ∈ H(x) are also in the subgame.
Remark: The game as a whole is a subgame.
Example 7: (The Battle of the Sexes)
This game has only one subgame, the whole game.
Remark:
Any simultaneous move game has only one subgame.
Example ("Macho game"):
This game has three subgames:
the whole game
126
CHAPTER 8. SOLVING EXTENSIVE FORM GAMES
the game starting at x0
the game starting at x00
Remark:
If a game Γ_E is a game of perfect information, there exists a subgame starting at each decision node x ∈ X.
The general idea for solving a finite game is to exclude non-Nash behavior, that is, we require players to play a Nash equilibrium in each subgame.
The solution procedure for a finite game of perfect information is called backward induction. With perfect information only sequential rationality is required.
For this, we start at decision nodes that have only terminal nodes as successors. If player i decides at such a node x, then he chooses the action that maximizes his payoff at this decision node. All other branches leaving the decision node x can be eliminated.
We can then move up the tree to eliminate more branches.
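For finite trees of perfect information this procedure is a short recursion; the sketch below (an illustrative addition, not from the notes) solves the "Macho" game and returns both the equilibrium payoffs and the complete contingent plan, including choices at nodes that are never reached.

```python
# Backward induction on a finite game tree of perfect information.
# A tree is either a terminal payoff tuple or a pair (player, {action: subtree}).
def backward_induction(tree, path=""):
    if isinstance(tree, tuple):                     # terminal node: payoffs
        return tree, {}
    player, branches = tree
    plan, best_action, best_payoffs = {}, None, None
    for action, subtree in branches.items():
        payoffs, subplan = backward_induction(subtree, path + action)
        plan.update(subplan)                        # keep off-path choices too
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_action, best_payoffs = action, payoffs
    plan[path or "root"] = (player, best_action)    # node identified by its history
    return best_payoffs, plan

# The Macho game: index 0 is player 1, index 1 is player 2.
macho = (0, {"U": (1, {"L": (8, 2), "R": (0, 0)}),
             "D": (1, {"L": (0, 0), "R": (2, 8)})})
value, plan = backward_induction(macho)
print(value)   # (8, 2)
print(plan)    # {'U': (1, 'L'), 'D': (1, 'R'), 'root': (0, 'U')}
```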
Example:
Sequential rationality implies that players choose UL along the equilibrium
path.
Proposition 2 (Zermelo's Theorem) Every finite game of perfect information has a pure strategy equilibrium that can be derived through backward induction (equivalent to subgame perfect NE as defined below). If no player has the same payoffs at any two terminal nodes, there exists a unique NE that can be derived through backward induction.
Example:
(on backward induction)
Unique solution by backward induction.
For finite games (with perfect or imperfect information) we can generalize the procedure by considering Nash equilibria. The generalized backward induction procedure works as follows:
1. Identify all NE of the final subgames (those that have no other subgames nested within them).
2. Select one NE in each of those subgames and derive the reduced extensive form in which the subgames are replaced by the equilibrium payoffs of the selected NE of those subgames.
3. Repeat stages 1 and 2 for the reduced game until this procedure yields a path of play from the initial node to a terminal node.
This procedure selects a subset of the NE of the game in extensive form Γ_E.
It selects all subgame perfect NE (formal definition later).
Example:
Player 1 first chooses between Out, which ends the game with payoffs (2,2), and In. After In, the players play a subgame at x₁ in which they move simultaneously: player 1 chooses F or A, player 2 chooses F, A or N.

Subgame at x₁ (player 1 chooses the row, player 2 the column):

            F        A        N
     F     0,2    -1,-2     4,3
     A    -4,0     3,1      5,0

Normal form of the whole game (a strategy of player 1 specifies the initial move and the action at x₁):

                        F        A        N
     Out ∧ F if In     2,2      2,2      2,2
     Out ∧ A if In     2,2      2,2      2,2
     In ∧ F if In      0,2    -1,-2      4,3
     In ∧ A if In     -4,0     3,1       5,0

3 Nash equilibria:
(Out ∧ F if In, F)
(Out ∧ A if In, F)
(In ∧ A if In, A)
Only (In ∧ A if In, A) is subgame perfect.
NE at x₁: (A, A)
Reduced "game": replacing the subgame at x₁ by its equilibrium payoffs (3,1), player 1 chooses between Out with payoffs (2,2) and In with payoffs (3,1).
Player 1 chooses In in the reduced game!
[Playing F in the subgame that starts at x₁ is not "credible".]
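The two steps of the generalized procedure can be checked numerically for this example; the following sketch (an illustrative addition, not from the notes) finds the pure NE of the subgame at x₁ and then performs the reduction.

```python
# Generalized backward induction for the entry example above.
sub = {("F", "F"): (0, 2), ("F", "A"): (-1, -2), ("F", "N"): (4, 3),
       ("A", "F"): (-4, 0), ("A", "A"): (3, 1),  ("A", "N"): (5, 0)}
rows, cols = ["F", "A"], ["F", "A", "N"]

# Step 1: pure NE of the final subgame.
ne = [(r, c) for r in rows for c in cols
      if all(sub[(r2, c)][0] <= sub[(r, c)][0] for r2 in rows)
      and all(sub[(r, c2)][1] <= sub[(r, c)][1] for c2 in cols)]
print(ne, sub[ne[0]])          # [('A', 'A')] (3, 1)

# Step 2: reduced game, player 1 compares Out against the subgame's NE payoff.
out_payoff, in_payoff = (2, 2), sub[ne[0]]
print("In" if in_payoff[0] > out_payoff[0] else "Out")   # In
```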
Note that in the above example we require more than sequential rationality.
Players anticipate that only NE will be played in the following subgames.
Definition 3 A strategy profile s in an n-player extensive form game Γ_E is a subgame perfect Nash equilibrium (SPNE) if it induces a Nash equilibrium in every subgame of Γ_E.
Observations:
If the only subgame is the game as a whole, every NE is an SPNE.
(⟹ a NE of a simultaneous move game is always subgame perfect)
An SPNE induces an SPNE in every subgame of Γ_E.
Formal justification for using the generalized backward induction procedure:
Proposition 4 Consider an extensive form game Γ_E and some subgame S_E of Γ_E. Suppose that the strategy profile s_S is an SPNE of the subgame S_E. Let Γ̂_E be the reduced game formed by replacing the subgame S_E by a terminal node with payoffs equal to those arising from play of s_S. Then
(1) In any SPNE s of Γ_E in which s_S is an SPNE of S_E, the players' reduced strategies ŝ of s in the reduced game Γ̂_E constitute an SPNE of Γ̂_E.
(2) If ŝ is an SPNE of Γ̂_E, then the strategy profile s that specifies the moves in s_S at information sets belonging to S_E and that specifies the moves described by ŝ at information sets not belonging to S_E is an SPNE of Γ_E.
Remarks:
Remark 1:
A special class of dynamic games Γ_E are finitely repeated simultaneous move games: the game played in stage t, Γ_t (t = 1, ..., N), is a simultaneous move game. If each Γ_t has a unique Nash equilibrium σ*_t, then there exists a unique subgame perfect equilibrium of the multistage game Γ_E. It consists of each player i playing σ*_{t,i} in stage t regardless of previous play.
Remark 2:
If certain variables x can be changed faster than other variables y, then, when modelling such a situation, the choice of x comes at a later stage than the choice of y.
Remark 3:
In games of perfect information with a finite upper bound on the number of moves along any path and with continuous action spaces, one finds subgame perfect equilibria by substituting the last player's choice by his best response, thus reducing the game by one stage, and by repeating this process until one reaches the initial node.
Example:
Stackelberg duopoly with quantity choice
P(q) = 1 − q
zero marginal costs of production
we can consider quantity choices qᵢ ∈ [0, 1]
Stage 1: Player 1 chooses q₁
Stage 2: Player 2 chooses q₂

π₁(q₁, q₂) = q₁ (1 − q₁ − q₂)
π₂(q₁, q₂) = q₂ (1 − q₁ − q₂)

At stage 2, player 2 chooses its best response given q₁:

r₂(q₁) = arg max_{q₂} π₂(q₁, q₂)
f.o.c.: 1 − q₁ − 2q₂ = 0 ⟺ r₂(q₁) = (1 − q₁)/2

Anticipating the reaction of player 2 at stage 2, player 1 maximizes π₁(q₁, r₂(q₁)) with respect to q₁:

max_{q₁ ∈ [0,1]} π₁(q₁, r₂(q₁)) = q₁ (1 − q₁ − (1 − q₁)/2) = q₁ (1 − q₁)/2

f.o.c.: 1/2 − q₁ = 0 ⟹ q₁* = 1/2

In the subgame perfect equilibrium:

q₁* = 1/2,  q₂* = r₂(1/2) = 1/4,  p* = 1/4,  π₁* = 1/8,  π₂* = 1/16.

The leader obtains higher profits than the follower (compare with Cournot competition, where each firm would earn 1/9).
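The two-stage substitution can also be done symbolically; the sketch below is an illustrative addition (assuming the sympy library is available) that reproduces q₁* = 1/2, q₂* = 1/4 and the profits 1/8 and 1/16.

```python
# Illustrative check of the Stackelberg solution with sympy.
import sympy as sp

q1, q2 = sp.symbols("q1 q2", nonnegative=True)
pi1 = q1 * (1 - q1 - q2)          # leader's profit
pi2 = q2 * (1 - q1 - q2)          # follower's profit

# Stage 2: follower's best response r2(q1) from the first-order condition.
r2 = sp.solve(sp.diff(pi2, q2), q2)[0]                       # (1 - q1)/2

# Stage 1: leader maximizes pi1(q1, r2(q1)).
q1_star = sp.solve(sp.diff(pi1.subs(q2, r2), q1), q1)[0]     # 1/2
q2_star = r2.subs(q1, q1_star)                               # 1/4

print(q1_star, q2_star)                                      # 1/2 1/4
print(pi1.subs({q1: q1_star, q2: q2_star}),                  # 1/8
      pi2.subs({q1: q1_star, q2: q2_star}))                  # 1/16
```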
8.4 Bargaining Games
Rubinstein’s Sequential Bargaining Game
1 Euro to be split between 2 players, player 1 and player 2.
Players make alternating offers.
Player 1 starts and proposes to split the 1 Euro as (s₁, 1 − s₁).
Player 2 accepts or rejects.
If he accepts, payoffs are paid out. Otherwise we enter period 2. Player 2 now makes a proposal (s₂, 1 − s₂) and player 1 can accept or reject.
If he accepts, payoffs (s₂, 1 − s₂) are paid out; otherwise we enter period 3, where it is again the turn of player 1.
This game continues until an agreement is reached, possibly for an infinite number of periods.
Players are impatient: they prefer a given payoff in an earlier period. This is captured by a discount factor δ, 0 < δ < 1.
Game of Perfect Information
Consider first a bargaining game with a finite number of periods: player 1 makes an offer, player 2 can make a counteroffer or accept. If player 1 does not accept the counteroffer, payoffs (s, 1 − s) will be paid out.
We can solve this game by backward induction.
Consider 3 periods.
In period 3 we assume that player 1 can receive s, which is worth δs in period 2. Hence, if player 2 makes an offer in period 2 of less than δs, player 1 will reject.

⟹ s₂ ≥ δs

Hence, player 2 proposes (δs, 1 − δs).
[note that 1 − δs > δ(1 − s)]
With this, player 2 can guarantee a payoff of 1 − δs in period 2 for himself.
This payoff is worth δ(1 − δs) in period 1. Hence, in period 1 player 1 has to offer at least δ(1 − δs) to player 2 to make him accept the offer:

1 − s₁ ≥ δ(1 − δs)

Hence, player 1 proposes (1 − δ(1 − δs), δ(1 − δs)).
In the unique subgame perfect equilibrium of this game, player 1 proposes (1 − δ(1 − δs), δ(1 − δs)) in period 1 and player 2 accepts.
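The same backward induction extends to any finite horizon; the following sketch (an illustrative addition, not from the notes) computes the period-1 proposal numerically and confirms the 3-period formula (1 − δ(1 − δs), δ(1 − δs)).

```python
# Backward induction in a finite alternating-offers game. In the last period
# the split (s, 1-s) is imposed; in every earlier period the proposer offers
# the responder exactly the discounted value of the responder's continuation
# payoff and keeps the rest.
def finite_bargaining(delta, s, periods=3):
    v1, v2 = s, 1 - s                          # continuation payoffs in the last period
    for t in range(periods - 1, 0, -1):        # player 1 proposes in odd periods
        if t % 2 == 1:
            v1, v2 = 1 - delta * v2, delta * v2
        else:
            v1, v2 = delta * v1, 1 - delta * v1
    return v1, v2

delta, s = 0.9, 0.5
print(finite_bargaining(delta, s))                            # period-1 proposal
print(1 - delta * (1 - delta * s), delta * (1 - delta * s))   # formula from the text
```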
Consider now Rubinstein's infinite bargaining game.
Player 1 makes a proposal in odd periods, player 2 in even periods.
Observation:
If we reach any odd period other than t = 1, then the subgame starting at this point is identical to the game starting at t = 1 (with payoffs valued at the corresponding point in time).
[stationarity of the game]
Exchanging the roles of the players, this holds for all periods.
Let s̄₁ denote the largest payoff that player 1 gets in any SPNE.
By stationarity, this is also the largest amount player 2 can expect for himself in the subgame starting in period 2 (where player 2 proposes); valued at period 1 it is worth δs̄₁.
⟹ the smallest payoff that player 1 gets in any SPNE, s̲₁, is determined by 1 = δs̄₁ + s̲₁, i.e.

s̲₁ = 1 − δs̄₁    (*)

Claim:

s̄₁ ≤ 1 − δs̲₁    (**)

Reason: By stationarity, s̲₁ is also the smallest payoff player 2 can get in the subgame starting in period 2. Hence, player 2 only accepts a period-1 offer if he receives at least δs̲₁, which leaves at most 1 − δs̲₁ for player 1.
Combining (**) and (*):

s̄₁ ≤ 1 − δs̲₁ = 1 − δ(1 − δs̄₁) ⟺ s̄₁ ≤ 1/(1 + δ) ≤ 1 − δs̄₁ = s̲₁

Since s̲₁ ≤ s̄₁ by definition, it follows that s̄₁ = s̲₁.
⟹ Player 1's payoff in SPNE is uniquely determined; denote it by s₁*.
From (*) we know that

s₁* = 1 − δs₁* ⟺ s₁* = 1/(1 + δ) ⟹ s₂* = δ/(1 + δ).
Hence, we have the following result
Proposition 5 The Rubinstein bargaining game has a unique SPNE. In equilibrium, player 1 receives payoff 1/(1 + δ) and player 2 receives payoff δ/(1 + δ).
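The fixed-point logic behind the proposition is easy to check numerically (an illustrative addition, not from the notes): iterating the map s ↦ 1 − δ(1 − δs), which expresses the proposer's payoff in terms of the proposer's payoff two periods later, converges to 1/(1 + δ).

```python
# Numerical check of the Rubinstein shares: iterate s -> 1 - delta*(1 - delta*s),
# whose unique fixed point is 1/(1 + delta).
delta = 0.9
s = 0.0                        # start from an arbitrary guess
for _ in range(200):
    s = 1 - delta * (1 - delta * s)
print(s, 1 / (1 + delta))      # both ~0.5263: player 1's equilibrium share
print(delta / (1 + delta))     # player 2's share ~0.4737
```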
Remark: As players become more and more patient, each player receives 1/2:

lim_{δ→1} uᵢ(δ) = 1/2.

8.5 Infinitely Repeated Games with Observable Actions and Folk Theorems
In a repeated game we distinguish between the payoff that accrues in the stage game and the overall payoff that is obtained when playing the repeated game. The stage-game payoff function is denoted by gᵢ : A → ℝ, where A = ×_{i∈N} Aᵢ. Denote by Δ(A) the space of probability distributions over A. A mixed action of player i is denoted by αᵢ. Let aᵗ = (a₁ᵗ, ..., aₙᵗ) be the actions that are played in period t, where t = 0, 1, 2, ....
In a game with observable actions, players observe the realized actions at the end of each period (in applications this may mean that they can make the correct inferences although they may not observe the other players' actions directly); this is common knowledge. Hence in period t ≥ 1 the game has a history hᵗ = (a⁰, a¹, ..., aᵗ⁻¹), which is common knowledge among players. In period t = 0 the game has just started; the null history is denoted by h⁰. Let Hᵗ be the space of all possible histories in period t. (Note that players only observe actual choices; they do not observe mixed actions.)
A pure strategy sᵢ for player i in the repeated game is a sequence of maps {sᵢᵗ}_{t=0,1,2,...}. Here sᵢᵗ maps possible period-t histories hᵗ to actions aᵢ ∈ Aᵢ, i.e. sᵢᵗ : Hᵗ → Aᵢ. A mixed (behavior) strategy σᵢ in the repeated game is a sequence of maps σᵢᵗ from Hᵗ to Δ(Aᵢ).
Interpretation of infinite horizon: in each period players think that the game continues with positive probability.
Specification of the payoff function for the infinitely repeated game: use the discount factor δ and consider the normalized discounted sum

uᵢ = E[(1 − δ) Σ_{t=0}^{∞} δᵗ gᵢ(σᵗ(hᵗ))]
where (1 − δ) is a normalization factor.
In each period a new subgame begins. For any strategy profile σ and any history hᵗ in period t, compute each player's expected discounted payoff from period t on (measured in period-t units),

E[(1 − δ) Σ_{τ=t}^{∞} δ^{τ−t} gᵢ(σ^τ(h^τ))]

This is called the continuation payoff.
Remark: If α* is a Nash equilibrium of the stage game, then the strategies "each player i plays αᵢ* in each period, independent of the history" form a Nash equilibrium of the repeated game.
Implication: Repeated play of a game does not decrease the set of equilibrium payoffs.
We will do two things. First, we look at NE of the infinitely repeated game. Then we look at SPNE.
Let us first consider what individually rational and feasible payoffs in the infinitely repeated game are. With respect to individual rationality, we define player i's reservation utility or minmax value as

v̲ᵢ = min_{α₋ᵢ} max_{αᵢ} gᵢ(αᵢ, α₋ᵢ).

This is the payoff player i obtains when the other players minimize player i's payoff, while player i maximizes his payoff given this behavior. Clearly, player i's payoff is at least v̲ᵢ in any Nash equilibrium of the stage game.
Remark: Player i's payoff is at least v̲ᵢ in any Nash equilibrium of the infinitely repeated game, irrespective of the level of the discount factor δ.
A strategy profile that gives a payoff below the minmax value to at least one player violates individual rationality for that player and can thus be ruled out a priori as an equilibrium.
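For a two-player game given as a payoff table, pure-action minmax values take one line each; the sketch below (an illustrative addition that restricts attention to pure actions, which happens to be harmless here) uses the prisoner's dilemma matrix from the example further below.

```python
# Pure-action minmax values for the prisoner's dilemma used below.
A1, A2 = ["U", "D"], ["L", "R"]
g = {("U", "L"): (-2, -2), ("U", "R"): (-10, -1),
     ("D", "L"): (-1, -10), ("D", "R"): (-6, -6)}

# v_i = min over the opponent's action of player i's best-reply payoff
v1 = min(max(g[(a1, a2)][0] for a1 in A1) for a2 in A2)
v2 = min(max(g[(a1, a2)][1] for a2 in A2) for a1 in A1)
print(v1, v2)   # -6 -6: each player can be held down to the "Confess" payoff
```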
Next consider feasible payoffs. We have to be careful since the set of feasible payoffs in the stage game (and thus in the repeated game for small discount factors) need not be convex. The reason is that many convex combinations of pure-strategy payoffs cannot be obtained by independent randomizations but require strategies to be correlated (as we have seen earlier in our analysis of correlated equilibrium, e.g. in the "battle of the sexes"). As we saw earlier, we can use a public randomization device to obtain a convex set of payoffs. In the repeated game we need such a randomization device in each period. Let the randomization device in the infinitely repeated game thus be a sequence {ω⁰, ω¹, ω², ...} of independent draws from the uniform distribution on the interval [0, 1], whose realization ωᵗ is observed at the beginning of period t. The history in period t is then

hᵗ = (a⁰, a¹, ..., aᵗ⁻¹, ω⁰, ω¹, ..., ωᵗ)

and the space of all histories in period t is denoted by Hᵗ. A pure strategy sᵢ is then a sequence of maps sᵢᵗ : Hᵗ → Aᵢ.
Denote the set of feasible payoffs by

V = convex hull {v | ∃ a ∈ A : g(a) = v}
We then obtain our first folk theorem:
Proposition 6 For every vector v ∈ V with vᵢ ≥ v̲ᵢ for all players i, there exists a δ̲ < 1 such that for all δ ∈ (δ̲, 1) there is a Nash equilibrium of the infinitely repeated game with payoffs v.
Idea of the proof: punish any deviation by using the minmax strategy.
However, in the subgame that starts after such a deviation, this behavior does not constitute a Nash equilibrium. Hence, the equilibrium is not subgame perfect. We now turn to the analysis of the SPNE of the infinitely repeated game.
Instead of using minmax punishments, players can use the weaker punishment in which, in response to a deviation, all players choose Nash equilibrium actions of the stage game. This gives rise to the following result.
Proposition 7 (Friedman, Review of Economic Studies 1971) Let α* be a Nash equilibrium of the stage game with payoffs v*. Then for any v ∈ V with vᵢ ≥ vᵢ* for all players i, there exists a δ̲ < 1 such that for all δ ∈ (δ̲, 1) there is a subgame perfect Nash equilibrium of the infinitely repeated game with payoffs v.
Example:
Infinitely repeated prisoner's dilemma; discount factor δ.

                 Player 2
                   L          R
   Player 1  U   -2,-2     -10,-1
             D   -1,-10     -6,-6
For a sufficiently high discount factor, the outcome (-2,-2) can be supported as an SPNE outcome.
The following strategies will do the job:
always play "Don't Confess" (U or L, respectively)
unless there has been a deviation in the past by any player;
in that case always play "Confess" (D, R).
[we use the payoff matrix of the prisoner's dilemma; the story cannot be told as nicely as in the one-shot version]
Such strategies are called "grim trigger strategies".
How does it work?
Following these strategies gives payoff (1/(1 − δ))·(−2) to both players.
A deviation gives payoff −1 + (δ/(1 − δ))·(−6).
Following the strategies is at least as good as deviating if

(1/(1 − δ))·(−2) ≥ −1 + (δ/(1 − δ))·(−6)
⟺ −2 ≥ −(1 − δ) − 6δ
⟺ 1 ≤ 5δ ⟺ δ ≥ 1/5
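The threshold δ ≥ 1/5 can also be checked directly by comparing the two discounted payoff streams (an illustrative addition, not from the notes):

```python
# Compare the grim-trigger payoff with the one-shot deviation payoff in the
# repeated prisoner's dilemma above.
def cooperation_payoff(delta):
    return -2 / (1 - delta)                    # play (U, L) forever

def deviation_payoff(delta):
    return -1 + delta * (-6) / (1 - delta)     # gain -1 once, then (D, R) forever

for delta in (0.1, 0.2, 0.25, 0.9):
    sustainable = cooperation_payoff(delta) >= deviation_payoff(delta)
    print(delta, sustainable)    # sustainable exactly when delta >= 1/5
```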
Example: Bertrand duopoly (homogeneous goods) [at equal prices demand is split equally]
For δ ≥ 1/2, any price p ∈ [c, pᵐ] can be the price along the equilibrium path in an SPNE.
(With grim trigger strategies, a deviator slightly undercuts, earns the whole market profit π(p) once and zero thereafter, while colluding yields π(p)/2 per period; collusion is thus sustainable iff (1/(1 − δ))·π(p)/2 ≥ π(p), i.e. δ ≥ 1/2.)
A different type of strategy consists of punishing a deviator with the minmax strategy for a number of periods and then "rewarding" the players who carried out the punishment. This works if rewarding the punishers does not reward the deviator; this requires the set of feasible payoffs to be of sufficient dimension. Denote by dim V the dimension of the set V.
Proposition 8 (Fudenberg and Maskin, Econometrica 1986) Suppose dim V = N. For every vector v ∈ V with vᵢ > v̲ᵢ for all players i, there exists a δ̲ < 1 such that for all δ ∈ (δ̲, 1) there is a subgame perfect Nash equilibrium of the infinitely repeated game with payoffs v.
For pure strategies the proof is easy to follow. For mixed strategies it is more involved. Note that the dimensionality condition can be weakened to dim V = N − 1.