Randomized or Mixed Strategies

January 25
perfect recall : in an extensive form game, a player does not forget what he once knew
Example 15 A game that fails to satisfy perfect recall (MWG, p. 225):
notation for the set of mixed strategies: $\Delta(S_i)$
common notation for a mixed strategy: $\sigma_i$
mixed vs. pure strategies
note assumption of independent randomizations
behavioral strategy in extensive form game: independent randomization at each information set
mixed strategy: randomization over pure strategies
For games of perfect recall, the two forms of randomization are equivalent
Question: Is it possible to distinguish behavioral from mixed strategies in the above example of a game
with imperfect recall?
Let’s try. We consider the following pure strategies of player 2: (R,R,L) and (L,L,R). The first letter
indicates his move at the top left node, the second at the top right node, and the third at the information set.
Assume player 2 plays each of these pure strategies with probability 1/2. Notice that at his information
set he chooses L only if he reached this set by choosing R at the top left node and R only if he reached this
set by choosing L at the top right node. His choice at the information set is clearly not independent of his
choices at preceding nodes, and so we can’t represent it as a behavioral strategy.
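The dependence can be checked directly. The following is a minimal Python sketch, not part of the original notes: it encodes the two pure strategies as triples (move at the top left node, move at the top right node, move at the information set) and compares the joint probability of "R at the top left node and L at the information set" with the product of the two marginal probabilities. The variable names are illustrative.

```python
from fractions import Fraction

# Mixed strategy of player 2: each pure strategy is played with probability 1/2.
# Each triple is (top-left node, top-right node, information set).
mixed = {("R", "R", "L"): Fraction(1, 2), ("L", "L", "R"): Fraction(1, 2)}

def prob(event):
    """Probability that the drawn pure strategy satisfies `event`."""
    return sum(p for s, p in mixed.items() if event(s))

p_first_R = prob(lambda s: s[0] == "R")   # R at the top-left node
p_info_L = prob(lambda s: s[2] == "L")    # L at the information set
p_joint = prob(lambda s: s[0] == "R" and s[2] == "L")

# A behavioral strategy randomizes independently at each information set,
# which would require p_joint == p_first_R * p_info_L.
independent = (p_joint == p_first_R * p_info_L)
print(p_joint, p_first_R * p_info_L, independent)
```

The joint probability is 1/2 while the product of the marginals is 1/4, so the two choices are correlated.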
Question: How do we interpret mixed strategies?
1. Does a player actually randomize?
2. Does it represent frequency within a population?
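Applying the notion of strict dominance to the mixed or randomized strategies of a game: $\sigma_i'$ strictly dominates $\sigma_i$ if
$$u_i(\sigma_i', \sigma_{-i}) > u_i(\sigma_i, \sigma_{-i})$$
for every profile $\sigma_{-i}$ of mixed strategies of the other players.
Notice no expected value notation.
$\sigma_i(s_i) \geq 0$ denotes the probability that agent $i$ selects $s_i$
$$\sum_{s_i \in S_i} \sigma_i(s_i) = 1$$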
Eliminating the strictly dominated mixed strategies in a game.
1. Eliminate the strictly dominated pure strategies (see the exercise below).
2. $\sigma_i'$ strictly dominates $\sigma_i$ if and only if it gives strictly more for every profile of pure strategies $s_{-i}$, since
$$u_i(\sigma_i', \sigma_{-i}) - u_i(\sigma_i, \sigma_{-i}) = \sum_{s_{-i} \in S_{-i}} \Big[ \prod_{k \neq i} \sigma_k(s_k) \Big] \big[ u_i(\sigma_i', s_{-i}) - u_i(\sigma_i, s_{-i}) \big]$$
Notice that if it holds for every profile of pure strategies of the opponents, then it also holds for every
profile of mixed strategies of the opponents.
Exercise 16 (Exercise 8.B.6) If a pure strategy $s_i$ is strictly dominated, then so is any mixed strategy
that plays $s_i$ with positive probability.
Example 17 p. 241
1\2    L        R
U      10, 1    0, 4
M      4, 2     4, 3
D      0, 5     10, 2
Notice that neither player has any strictly dominated strategies, if we consider only pure strategies. The
expected payoff from a 50-50 randomization of U,D, however, produces an expected payoff of 5, which strictly
dominates the payoff of 4 that comes from playing M.
1\2    L        R
U      10, 1    0, 4
M      6, 2     6, 3
D      0, 5     10, 2
Note that $\frac{1}{2}U + \frac{1}{2}D$ is strictly dominated by M even though M does not dominate either U or D.
Rationalizable Strategies (8.C)
Bernheim and Pearce
Definition 18
1. $\sigma_i$ is a best response to $\sigma_{-i}$ if $u_i(\sigma_i, \sigma_{-i}) \geq u_i(\sigma_i', \sigma_{-i})$ for all $\sigma_i' \in \Delta(S_i)$
2. $\sigma_i$ is never a best response if it is not a best response to any strategy $\sigma_{-i}$
Clearly, a player should never use a strategy that is never a best response
Eliminating strategies that are never best responses eliminates all strictly dominated strategies, and
perhaps more
Strategies that remain after iterative elimination of strategies that are never best responses: those that
a rational player can justify, or rationalize, with some reasonable conjecture concerning the behavior of his
rivals (reasonable in the sense that his opponents are not presumed to play strategies that are never best
responses, etc.). "Rationalizable" intuitively means that there is a plausible explanation that would justify
the use of the strategy.
12 1
2
3
4
1 0 7 2 5 7 0
0 1
0 1
Example 19 2 5 2 3 3 5 2
3 7 0 2 5 0 7
0 1
4 0 0 0 −2 0 0 10 −1
Determine the set of rationalizable pure strategies. First eliminate
121 + 123 . 4 then strictly dominated by 2 once 4 is deleted.
12 1
2
3
4
2
3
12 1
12
1 0 7 2 5 7 0
0 1
1 0 7 2 5 7 0
1
2 5 2 3 3 5 2
0 1 ⇒ 2 5 2 3 3 5 2 ⇒
2
3 7 0 2 5 0 7
3 7 0 2 5 0 7
0 1
3
4 0 0 0 −2 0 0 10 −1
4 0 0 0 −2 0 0
8
4 , which is strictly dominated by
2
3
1
0 7 2 5 7 0
5 2 3 3 5 2
7 0 2 5 0 7
Example 20 What bids are rationalizable in the first price auction?
Suppose
$$b_{-i} = \max_{j \neq i} b_j > v_i$$
so bidder $i$ can’t both win the auction and make a profit. Then any bid below $b_{-i}$ is a best response to
the bids of the opponents. Since $b_{-i}$ is arbitrary in this discussion, it follows that any bid of any bidder is
rationalizable. Hmmm...rationalizability doesn’t help much here.
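A small numerical illustration of this argument, as a Python sketch that is not part of the notes: the value, the rival bid, and the integer bid grid are hypothetical, and a tie is treated as a loss, which is an assumption about the tie-breaking rule.

```python
def first_price_payoff(bid, value, highest_other_bid):
    """Payoff in a first-price auction; a tie is treated as a loss (assumption)."""
    return value - bid if bid > highest_other_bid else 0

value = 10               # hypothetical value of bidder i
highest_other_bid = 12   # hypothetical highest rival bid, above the value
bids = range(0, 21)      # integer bid grid, chosen only for illustration

payoffs = {b: first_price_payoff(b, value, highest_other_bid) for b in bids}
best = max(payoffs.values())
best_responses = [b for b in bids if payoffs[b] == best]
print(best, best_responses)
```

Every bid below the rival bid loses and earns 0, while every winning bid earns a negative payoff, so all losing bids are best responses.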
Example 21 What bids are rationalizable in the second price auction?
Again, any bid $b_i$ is a best response to some profile of bids by the other bidders. Interestingly, the second
price auction is a game in which game theorists have confidence in a unique prediction (i.e., bidding one’s
value by each bidder is the unique dominant strategy equilibrium). It isn’t determined by rationality alone,
however. Perhaps this point is helpful in understanding why subjects sometimes fail to play their unique
dominant strategies in experimental tests of such procedures as the second price auction.
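The contrast between the two statements can be seen on a small grid. The following Python sketch is not from the notes: the value and the integer bid grid are hypothetical, and ties are treated as a loss (an assumed tie-breaking rule). It checks that bidding one's value is weakly dominant, and also that every bid on the grid is a best response to some rival bid.

```python
def second_price_payoff(bid, value, highest_other_bid):
    """Winner pays the highest rival bid; a tie is treated as a loss (assumption)."""
    return value - highest_other_bid if bid > highest_other_bid else 0

value = 10            # hypothetical value of bidder i
grid = range(0, 21)   # integer grid for own bids and for the highest rival bid

# Bidding the value does at least as well as any other bid, whatever the rivals do.
truthful_weakly_dominant = all(
    second_price_payoff(value, value, m) >= second_price_payoff(b, value, m)
    for b in grid for m in grid)

def is_best_response(b, m):
    """True if bid b maximizes the payoff against highest rival bid m."""
    return second_price_payoff(b, value, m) == max(
        second_price_payoff(x, value, m) for x in grid)

# Yet every bid is a best response to SOME rival bid, so none is ruled out.
every_bid_is_some_best_response = all(
    any(is_best_response(b, m) for m in grid) for b in grid)
print(truthful_weakly_dominant, every_bid_is_some_best_response)
```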
Rationalizable strategies: A strategy is rationalizable if there is a "reasonable" conjecture concerning
the behavior of opponents under which the given strategy is a best response. The "reasonable" conjecture
requires that the behavior of the opponents also be best responses in this sense, and so on.
= iterative elimination of strategies that are never best responses
Example above: a1, a2, a3 are rationalizable for player 1; b1, b2, b3 are rationalizable for player 2
The point of rationalizability is to see how far you can go in analyzing a game based solely upon common
knowledge of rationality. Let’s downplay the term "common knowledge of rationality" for now, interpreting
informally as "both players know that each other is rational", and so on. As the example shows, however,
common knowledge of rationality may not take us very far in analyzing a game. In the remainder of Chapter
8, we’ll go further by requiring some form of equilibrium behavior in the play of the game.
Theorem 22 In games with $I = 2$ players, the rationalizable strategies are exactly those that survive iterative deletion
of strictly dominated strategies (p. 245 of MWG).
In the case of $I > 2$ players, the set of rationalizable strategies may be strictly smaller than the
set of strategies that survive iterative deletion of strictly dominated strategies. For all values of
$I$, rationalizability is a different motivation for the choice and elimination of strategies from iterative deletion
of strictly dominated strategies.
Nash Equilibrium (sec. 8.D)
Nash equilibrium: best response with a correct conjecture by each player concerning the strategies of the
other players
Definition 23 An $I$-tuple $(s_1, \ldots, s_I)$ of pure strategies is a pure strategy Nash equilibrium if, for each
player $i$,
$$u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$$
for all other pure strategies $s_i' \in S_i$ of player $i$.
Nash (1951)
Nash equilibrium adds to rationalizability the constraint that the players be correct in their conjectures
about each others’ behaviors.
Example 24 Meeting in NY:
12


 100 100
0 0

0 0
100 100
2 pure strategy Nash equilibria
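The two equilibria can be found by checking the Nash inequality cell by cell. The following is a minimal Python sketch, not from the notes: the function name `pure_nash_equilibria` is hypothetical, and strategies are referred to by index (0 for the first location, 1 for the second).

```python
from itertools import product

def pure_nash_equilibria(u1, u2):
    """Return all (row, column) pairs at which neither player gains by deviating."""
    n_rows, n_cols = len(u1), len(u1[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        row_ok = all(u1[r][c] >= u1[k][c] for k in range(n_rows))  # player 1 cannot gain
        col_ok = all(u2[r][c] >= u2[r][k] for k in range(n_cols))  # player 2 cannot gain
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Meeting in NY: both players get 100 if they choose the same location, else 0.
u1 = [[100, 0], [0, 100]]
u2 = [[100, 0], [0, 100]]
equilibria = pure_nash_equilibria(u1, u2)
print(equilibria)
```

The check returns the two diagonal cells, in which both players go to the same location.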
Mixed Strategy Nash Equilibria
$S_i^+$: those strategies that $i$ uses with positive probability in the given strategy $\sigma_i$
$u_i(s_i, \sigma_{-i}) = u_i(s_i', \sigma_{-i})$ for all $s_i, s_i' \in S_i^+$
$u_i(s_i, \sigma_{-i}) \geq u_i(s_i', \sigma_{-i})$ for all $s_i \in S_i^+$ and $s_i' \notin S_i^+$
Example 25 Meeting in NY:
12


 100 100
0 0

0 0
1000 1000
each player goes to GCS with probability 111 in a mixed strategy Nash equilibrium
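The probability follows from the indifference condition for mixed strategy equilibria: each player must get the same expected payoff from both locations. The following Python sketch is not from the notes, and it assumes GCS is the location where meeting pays 1000.

```python
from fractions import Fraction

a, b = 100, 1000   # payoff from meeting at the other location and at GCS

# Let q be the probability that the opponent goes to GCS.
# Indifference: a * (1 - q) = b * q, so q = a / (a + b).
q = Fraction(a, a + b)

payoff_other = a * (1 - q)   # expected payoff from going to the other location
payoff_gcs = b * q           # expected payoff from going to GCS
print(q, payoff_other, payoff_gcs)
```

Both locations then yield the same expected payoff of 1000/11, so the player is willing to randomize.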
Example 26 (The Braess Paradox) There are 4000 motorists who drive each morning from the point
labeled Start to the point labeled Finish. There are two possible routes, one through $A$ and one through $B$.
The routes Start → $A$ and $B$ → Finish can be thought of as bridges or limited capacity roads. The travel
time for each motorist on each of these routes is $n/100$ minutes, where $n$ is the total number of motorists
who choose that particular route. Travel time thus increases linearly in the number of motorists who choose
a particular route. The routes $A$ → Finish and Start → $B$ are high capacity, modern roads that are each
sufficiently large to handle all 4000 motorists without increasing the travel time. The travel time on these
routes is 45 minutes, regardless of how many motorists travel on the route.
We assume that each motorist wishes to minimize his total travel time from Start to Finish, taking into
account the travel pattern determined by the routes chosen by all other motorists. We thus interpret this
as a game with 4000 players in which each player chooses either the top route through $A$ or the bottom route
through $B$. Each motorist therefore has two possible strategies. A motorist will change his route in favor
of a shorter trip. We thus look for a distribution of motorists across the two routes that forms a Nash
equilibrium, i.e., no motorist can benefit by changing routes, given the choices of every other motorist.
We first characterize a property of a Nash equilibrium. We seek a number $n_T$ of motorists for the top
route and a number $n_B$ of motorists for the bottom route, where
$$n_T + n_B = 4000.$$
The travel time on the top route is
$$\frac{n_T}{100} + 45$$
and the travel time on the bottom route is
$$\frac{n_B}{100} + 45.$$
A motorist who switches from the top route to the bottom route changes his travel time from
$$\frac{n_T}{100} + 45$$
to
$$\frac{n_B + 1}{100} + 45$$
because he adds a motorist on the bottom route. For the driver on the top route to have no incentive to
switch, it must be the case that
$$\frac{n_B + 1}{100} + 45 \geq \frac{n_T}{100} + 45 \iff n_B + 1 \geq n_T.$$
Similarly, for a driver on the bottom to have no incentive to switch, it must be the case that
$$\frac{n_T + 1}{100} + 45 \geq \frac{n_B}{100} + 45 \iff n_T + 1 \geq n_B.$$
For a Nash equilibrium, it is necessary that no driver want to switch, i.e., both of these inequalities hold.
Therefore,
$$n_T + 1 \geq n_B \geq n_T - 1.$$
The number $n_B$ equals either $n_T - 1$, $n_T$, or $n_T + 1$. Recall that there are 4000 motorists, and so
$$n_T + n_B = 4000.$$
If $n_B = n_T - 1$, then
$$n_T + n_B = 2n_T - 1 = 4000,$$
which contradicts $n_T$ being a whole number. Similarly, $n_B = n_T + 1$ is not possible, and so
$$n_T = n_B = 2000$$
is the only possibility for a Nash equilibrium. It is clear that this indeed is a Nash equilibrium distribution
of motorists, for a driver who changes routes strictly increases his travel time. Each motorist’s travel time
in the only Nash equilibrium is
$$\frac{2000}{100} + 45 = 65$$
minutes.
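The uniqueness argument can be confirmed by brute force over every possible split of the 4000 motorists. The following is a minimal Python sketch, not from the notes: the names `route_time` and `is_equilibrium` are hypothetical, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

N = 4000  # motorists

def route_time(n):
    """Travel time on a route whose limited-capacity leg carries n motorists."""
    return Fraction(n, 100) + 45

def is_equilibrium(n_top):
    """True if no motorist gains by switching when n_top motorists take the top route."""
    n_bottom = N - n_top
    # A top-route driver who switches faces n_bottom + 1 motorists, and vice versa.
    top_stays = n_top == 0 or route_time(n_bottom + 1) >= route_time(n_top)
    bottom_stays = n_bottom == 0 or route_time(n_top + 1) >= route_time(n_bottom)
    return top_stays and bottom_stays

equilibria = [n for n in range(N + 1) if is_equilibrium(n)]
equilibrium_time = route_time(equilibria[0])
print(equilibria, equilibrium_time)
```

The only split that passes the check is 2000 motorists on each route, with a travel time of 65 minutes.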
There’s no paradox yet, but here it comes! Suppose next that in the interest of improving traffic flow a
one-way route is added from $A$ to $B$:
For simplicity, we’ll assume that travel time on the route $A$ → $B$ equals zero. How does the addition
of this "shortcut" change the travel time of motorists in a Nash equilibrium? We claim that the route
Start → $A$ → $B$ → Finish is the unique dominant strategy of every motorist in this new game of choosing
one’s route. To verify this, we select a motorist and consider his choice of a route given that the choices of
the other 3999 motorists determine a value of $n_T$ along the route Start → $A$ and a value $n_B$ along the route
$B$ → Finish. It is not necessarily the case that $n_T + n_B = 3999$, because some of the other motorists may
take the shortcut $A$ → $B$ and thus count among both the numbers $n_T$ and $n_B$. It is true, however, that $n_T$,
$n_B \leq 3999$ and $n_T + n_B \geq 3999$.
The selected motorist now has 3 possible routes with 3 possible travel times:
route                        travel time
Start → A → Finish           $\frac{n_T + 1}{100} + 45$
Start → A → B → Finish       $\frac{n_T + 1}{100} + \frac{n_B + 1}{100}$
Start → B → Finish           $\frac{n_B + 1}{100} + 45$
The "+1" indicates the congestion that the selected motorist creates by adding himself to a particular
route. We have
$$\frac{n_T + 1}{100} + \frac{n_B + 1}{100} \leq \frac{n_T + 1}{100} + \frac{4000}{100} = \frac{n_T + 1}{100} + 40 < \frac{n_T + 1}{100} + 45$$
and similarly,
$$\frac{n_T + 1}{100} + \frac{n_B + 1}{100} \leq \frac{4000}{100} + \frac{n_B + 1}{100} = 40 + \frac{n_B + 1}{100} < \frac{n_B + 1}{100} + 45.$$
The route Start → $A$ → $B$ → Finish is therefore fastest for the selected motorist regardless of the decisions
of the other motorists and the values of $n_T$ and $n_B$ that are determined by these decisions. It is therefore a
dominant strategy for each motorist.
The unique dominant strategy equilibrium outcome is therefore that all motorists select the route Start →
$A$ → $B$ → Finish. The driving time of each motorist is then
$$\frac{4000}{100} + \frac{4000}{100} = 80$$
minutes, which is strictly more than the 65 minutes required before the shortcut $A$ → $B$ was introduced.
This is the Braess paradox, namely, adding a shortcut can increase the average travel time. Conversely, the
average travel time may be decreased by closing a road!
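The dominance argument and the resulting 80 minutes can be checked numerically. The following Python sketch is not from the notes: it calls the intermediate points A (top) and B (bottom), which is an assumed labeling, writes n_top and n_bottom for the numbers of other motorists on Start → A and B → Finish, and tests a coarse grid of feasible loads rather than every pair.

```python
from fractions import Fraction

N = 4000      # motorists
FIXED = 45    # minutes on each high-capacity road

def congested(n):
    """Travel time on a limited-capacity road carrying n motorists."""
    return Fraction(n, 100)

def shortcut_strictly_fastest(n_top, n_bottom):
    """Compare the three routes of the selected motorist, who adds 1 to each
    limited-capacity road he uses (A and B are assumed node labels)."""
    via_top = congested(n_top + 1) + FIXED                          # Start -> A -> Finish
    via_bottom = congested(n_bottom + 1) + FIXED                    # Start -> B -> Finish
    via_shortcut = congested(n_top + 1) + congested(n_bottom + 1)   # Start -> A -> B -> Finish
    return via_shortcut < via_top and via_shortcut < via_bottom

# Coarse grid of loads, including the extreme value 3999; feasibility
# requires n_top + n_bottom >= 3999.
loads = list(range(0, N, 100)) + [N - 1]
shortcut_dominant_on_grid = all(
    shortcut_strictly_fastest(t, b)
    for t in loads for b in loads if t + b >= N - 1)

# When everyone takes the shortcut, both limited-capacity roads carry all N motorists.
time_all_shortcut = congested(N) + congested(N)
print(shortcut_dominant_on_grid, time_all_shortcut)
```

Even in the worst case of 3999 other motorists on both limited-capacity roads, the shortcut takes 80 minutes against 85 for either alternative.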
More generally, we can apply this to congestion in any kind of network in which users choose their own
routes. Adding a link in a network can diminish the performance of the network while deleting a link can
improve performance. This depends upon the assumption that network users array themselves as in a Nash
equilibrium.
Questions about MSNE:
1. Why do players bother to randomize when it doesn’t alter their expected payoffs?
2. Equilibrium depends upon randomization according to precise probabilities. Do we believe that people
behave in this way?
3. Independence of randomization vs. correlated equilibrium (Aumann).
Discussion of Nash Equilibrium:
Why should we expect the players to play a Nash equilibrium? This remains an active and incomplete area
of research.
1. Nash equilibrium as a consequence of rational inference. But as we saw, rationalizability is the
consequence of common knowledge of rationality and the structure of the game, and it does not
necessarily lead to correct conjectures on the part of the players.
2. Nash equilibrium as a necessary condition if there is a unique predicted outcome of the game. Q: Why
would players believe that there is a unique way to play the game? It isn’t a consequence of rationality.
3. Focal points, e.g., meeting in NY. As with 2., a focal point would have to be a Nash equilibrium.
4. Nash equilibrium as a self-enforcing agreement. Nonbinding communication before the game. But
shouldn’t the process of communication be modeled, and don’t the players communicate strategically?
V. Smith: If players can communicate, then they will act cooperatively (not Nash).
5. Nash equilibrium as a stable social convention. E.g., which side of the sidewalk to walk on. Stability
of the social convention requires Nash equilibrium.
Binmore’s parable of the quadratic equation
8.E Games of Incomplete Information: Bayesian Nash Equilibrium
games of complete information vs. games of incomplete information: The issue is incomplete knowledge of
one’s opponent’s preferences over the possible outcomes of the game.
formulated by Harsanyi, corecipient with Nash and Selten of the first Nobel Prize in economics awarded
in the field of game theory
This is a mathematically rigorous way of modeling the idea that players make decisions using their beliefs
about the preferences of each other. The beliefs are modeled here using probability theory, and as with
Nash equilibrium, it is typically necessary to assume that the beliefs of each player about the others are
correct. How beliefs turn out to be correct, or whether or not beliefs are consistent with probability theory,
are both legitimate concerns for the theory of games of incomplete information. We know, however, that
people make decisions in situations of uncertainty using their beliefs, and this is a logically consistent way
of modeling this kind of interaction among economic agents.
Bayesian Hypothesis: Whatever an agent doesn’t know for certain, he has complete, probabilistic beliefs
about
Ex.: Auctions
represent as game of imperfect information with a move of nature
incomplete vs. imperfect information
Example 27 The DA’s Brother (continued)
There are two types of player 2: with probability $\mu$, the game is
type I for player 2:
1\2    DC         C
DC     0, −2      −10, −1
C      −1, −10    −5, −5
and with probability $1 - \mu$, the game is
type II for player 2:
1\2    DC         C
DC     0, −2      −10, −7
C      −1, −10    −5, −11
Type II represents a psychic penalty for confessing (i.e., player 2 hates being a "rat"). Notice that the
issue is the payoffs to player 2. Player 2 is assumed to know his type while player 1 knows the probabilities
of the two different games. In other games, it will be important to also assume that player 2 knows the
beliefs of player 1 so that he can think about how player 1 considers his options.
Player 2’s dominant strategy is to choose C if he is of type I and DC if he is type II. Player 1 therefore
evaluates his choices as follows:
C: $\mu(-5) + (1 - \mu)(-1) = -1 - 4\mu$
DC: $\mu(-10) + (1 - \mu)(0) = -10\mu$
The equilibrium therefore depends on the value of $\mu$: player 1 will choose C if
$$-1 - 4\mu > -10\mu \iff 1 < 6\mu \iff \mu > \frac{1}{6}$$
and player 1 will choose DC if
$$\mu < \frac{1}{6}.$$
We have 2 "candidate" equilibria:
1: C
2: C,DC (C if type I, DC if type II)
and
1: DC
2: C,DC
Which is actually an equilibrium depends on the value of $\mu$. In the case of $\mu = 1/6$, each is an equilibrium.
This is an example of a pure strategy Bayesian Nash equilibrium ("pure strategy" because there is no
randomization in the choice of moves). It would typically be computed and discussed without reference to
the extensive form representation.
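The threshold can be checked by evaluating player 1's two expected payoffs at a few probabilities. The following is a minimal Python sketch, not from the notes: `mu` stands for the probability that player 2 is of type I (the symbol is an assumption), the function names are hypothetical, and player 2 is taken to play his dominant strategy for each type.

```python
from fractions import Fraction

def player1_payoffs(mu):
    """Expected payoffs to player 1 when type I plays C and type II plays DC."""
    confess = mu * (-5) + (1 - mu) * (-1)        # equals -1 - 4*mu
    dont_confess = mu * (-10) + (1 - mu) * 0     # equals -10*mu
    return confess, dont_confess

def best_replies(mu):
    """Player 1's best replies, as a list of 'C' and/or 'DC'."""
    c, dc = player1_payoffs(mu)
    if c > dc:
        return ["C"]
    if c < dc:
        return ["DC"]
    return ["C", "DC"]

high = best_replies(Fraction(1, 2))    # mu above 1/6
low = best_replies(Fraction(1, 10))    # mu below 1/6
knife_edge = best_replies(Fraction(1, 6))
print(high, low, knife_edge)
```

Player 1 confesses when type I is likely enough, does not confess otherwise, and is indifferent exactly at 1/6.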