Equilibrium Selection In Coordination Games

Presenter: Yijia Zhao
September 7, 2005
Overview of Coordination Games
• A class of symmetric, simultaneous-move, complete-information games that exhibit multiple Nash equilibria.
• Example 1. Row player’s payoff matrix:

    350  350   700
    250  550  1000
      0    0   600
• Other examples: dial-wait problem, technology selection
for firms with compatible products.
Nash Equilibrium as a Solution Concept
• In Example 1, there are two pure-strategy Nash equilibria: (1, 1) and (2, 2).
• If cooperation is allowed, (3, 3) is a far better outcome
for both players.
• Experimental results strongly support the hypothesis that
the outcome will be a Nash equilibrium. (Cooper et al.,
1990)
• The coordination issue hence becomes an equilibrium selection issue.
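As a quick check of the equilibria claimed for Example 1, here is a minimal Python sketch. The function name symmetric_pure_equilibria is an illustrative choice, not something from the talk, and the sketch only looks for symmetric pure-strategy equilibria (k, k).

```python
# Row player's payoff matrix from Example 1
# (rows: own action, columns: opponent's action).
U = [
    [350, 350, 700],
    [250, 550, 1000],
    [0, 0, 600],
]

def symmetric_pure_equilibria(U):
    """Return the 1-based actions k for which (k, k) is a Nash equilibrium.

    In a symmetric game, (k, k) is an equilibrium when action k is a best
    reply to itself: U[k][k] >= U[i][k] for every alternative action i.
    """
    n = len(U)
    return [k + 1 for k in range(n)
            if all(U[k][k] >= U[i][k] for i in range(n))]

print(symmetric_pure_equilibria(U))  # [1, 2]
```

Action 3 is excluded because against an opponent playing 3, deviating to action 2 pays 1000 rather than 600.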
Two Types of Coordination Failure

    350  350   700
    250  550  1000
      0    0   600
• Players may fail to coordinate on a single equilibrium.
• Players coordinate on a single equilibrium that is Pareto dominated.
Deductive vs. Inductive Equilibrium Selection Principles
• Deductive selection principles assume that decision makers possess beliefs consistent with some equilibrium.
• They do not attempt to explain how decision makers acquire these beliefs.
• Inductive selection principles use learning and evolutionary dynamics to predict equilibrium.
• The idea is that repeated interaction may allow decision
makers to learn to coordinate on some equilibrium.
Deductive Selection Principles
• Payoff dominance: One Nash equilibrium is said to payoff
dominate another Nash equilibrium if for every player the
payoff is strictly higher in the first one.
• (2, 2) is the payoff dominant equilibrium in example 1.
• Security: We define the secure equilibrium action as the equilibrium action k that maximizes $\min_{j \in NE} U_{kj}$, where U is the row player’s payoff matrix.
• (1, 1) is the secure equilibrium in example 1.
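The two deductive principles above can be sketched in Python for Example 1. The function names and the zero-based list NE are illustrative assumptions. Because the game is symmetric, comparing diagonal payoffs is enough to check payoff dominance for both players.

```python
# Row player's payoff matrix from Example 1.
U = [
    [350, 350, 700],
    [250, 550, 1000],
    [0, 0, 600],
]
NE = [0, 1]  # zero-based equilibrium actions, i.e. (1, 1) and (2, 2)

def payoff_dominant(U, ne):
    """Equilibrium action whose own payoff strictly exceeds every other
    equilibrium's payoff, or None if no equilibrium dominates all others."""
    best = max(ne, key=lambda k: U[k][k])
    if all(U[best][best] > U[k][k] for k in ne if k != best):
        return best
    return None

def secure(U, ne):
    """Equilibrium action k maximizing min over j in NE of U[k][j]."""
    return max(ne, key=lambda k: min(U[k][j] for j in ne))

print(payoff_dominant(U, NE) + 1)  # 2
print(secure(U, NE) + 1)           # 1
```

Action 1 guarantees 350 against either equilibrium action, while action 2 can yield as little as 250, which is why security and payoff dominance disagree here.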
Pairwise Risk Dominance


    (a, a)  (b, c)
    (c, b)  (d, d)
• Assume a > c and d > b. Then there are two pure-strategy Nash equilibria, (1, 1) and (2, 2).
• The Nash product is the product of both players’ deviation losses at a particular equilibrium: $(a - c)^2$ at (1, 1) and $(d - b)^2$ at (2, 2).
• A Nash equilibrium is said to pairwise risk dominate another if it has a strictly higher Nash product (Harsanyi and Selten 1988).
• (1, 1) pairwise risk dominates (2, 2) if $(a - c)^2 > (d - b)^2$.
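The Nash product comparison is simple enough to write out directly. The following is a small sketch with a hypothetical function name. The first call applies it to Example 1 restricted to actions 1 and 2 (a = 350, b = 350, c = 250, d = 550).

```python
def pairwise_risk_dominant(a, b, c, d):
    """Return 1 if (1, 1) risk dominates, 2 if (2, 2) does, None on a tie.

    The row player receives a at (1, 1), b at (1, 2), c at (2, 1) and
    d at (2, 2); the game is symmetric, so each Nash product is a square.
    """
    if not (a > c and d > b):
        raise ValueError("need a > c and d > b for two strict equilibria")
    product_11 = (a - c) ** 2  # Nash product at (1, 1)
    product_22 = (d - b) ** 2  # Nash product at (2, 2)
    if product_11 > product_22:
        return 1
    if product_22 > product_11:
        return 2
    return None

print(pairwise_risk_dominant(350, 350, 250, 550))  # 2
print(pairwise_risk_dominant(4, 3, 0, 5))          # 1
```

In the restricted Example 1 game the Nash products are 100^2 at (1, 1) and 200^2 at (2, 2), so (2, 2) pairwise risk dominates (1, 1).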
Risk Dominance
• Pairwise risk dominance relation is not transitive.
• For n×n games, define an extension of pairwise risk dominance based on Harsanyi and Selten’s heuristic justification.
• Let $\Delta_{NE}$ be the simplex on NE, the set of Nash equilibria.
• For $j \in NE$, define $q_j^{RD}$ as the relative proportion of $\Delta_{NE}$ for which j is the best response to some belief in $\Delta_{NE}$.
• $k \in NE$ is risk dominant if k maximizes $U_k q^{RD}$.
• Coincides with pairwise risk dominance in 2 × 2 games
and ensures transitivity in symmetric n × n games.
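One way to approximate this definition numerically is a Monte Carlo estimate over beliefs, sketched below in Python. Several choices here are assumptions made for illustration and are not taken from the talk: beliefs are drawn uniformly from the simplex, best responses are taken among equilibrium actions only, and the final score uses only the payoffs against equilibrium actions. All function names are hypothetical.

```python
import random

# Row player's payoff matrix from Example 1.
U = [
    [350, 350, 700],
    [250, 550, 1000],
    [0, 0, 600],
]
NE = [0, 1]  # zero-based equilibrium actions

def sample_simplex(m, rng):
    """Draw a point uniformly from the simplex with m vertices
    by normalizing independent exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(draws)
    return [x / total for x in draws]

def risk_dominant(U, ne, samples=20000, seed=0):
    """Estimate q^RD by sampling beliefs, then return (argmax of U_k q^RD, q^RD)."""
    rng = random.Random(seed)
    counts = [0] * len(ne)
    for _ in range(samples):
        belief = sample_simplex(len(ne), rng)
        # Expected payoff of each equilibrium action against this belief.
        payoffs = [sum(U[k][j] * b for j, b in zip(ne, belief)) for k in ne]
        counts[payoffs.index(max(payoffs))] += 1
    q = [c / samples for c in counts]
    scores = [sum(U[k][j] * qj for j, qj in zip(ne, q)) for k in ne]
    return ne[scores.index(max(scores))], q

best, q = risk_dominant(U, NE)
print(best + 1, q)
```

For Example 1, action 1 is the best response only when the belief puts more than 2/3 on action 1, so the estimate of q should be close to (1/3, 2/3), and action 2 comes out as risk dominant, in line with the pairwise comparison.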
Inductive Selection Principles
• Examples include fictitious play and its variations, reinforcement learning, and dynamic systems that are hybrids of the above.
• Dynamic learning models may not converge.
• Equilibrium selection may be sensitive to differences in
initial conditions.
• The experiments use the logit best reply with inertia and adaptive expectations (LBRIAE) model (Stahl 1999).
The Formal LBRIAE Model
• Let q(t, θ) denote the expected probability of play in period t based on the history of play up to and including
period t − 1.
• Let p(t − 1) denote the actual frequency of play in period
t − 1.
• q(0, θ) and p(0) are specified as the uniform distribution
over actions.
The Formal LBRIAE Model, Continued
• A proportion δ of the population behaves according to
    q(t, θ) = θq(t − 1, θ) + (1 − θ)p(t − 1),    (1)
  i.e., with probability θ the past action is repeated and with probability 1 − θ the recent past is mimicked.
• A proportion 1 − δ of the population chooses a logit best
reply to equation 1, i.e. each strategy is played in proportion to an exponential function of the utility it has yielded
in the past.
• This probabilistic choice function is then mixed with the uniform distribution, with a positive probability ε of trembles.
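The building blocks described above can be sketched in Python as follows. This is not Stahl's implementation: the function names, the precision parameter nu, the tremble parameter eps, and the numeric values are all assumptions for illustration. The logit reply is computed from expected payoffs against the expectation q, and the split of the population by δ is left out because the slides do not spell out how the pieces are combined.

```python
import math

def update_expectation(q_prev, p_prev, theta):
    """Equation (1): q(t) = theta * q(t-1) + (1 - theta) * p(t-1)."""
    return [theta * q + (1 - theta) * p for q, p in zip(q_prev, p_prev)]

def logit_best_reply(U, q, nu):
    """Choice probabilities proportional to exp(nu * expected payoff against q)."""
    expected = [sum(u * qj for u, qj in zip(row, q)) for row in U]
    top = max(expected)  # subtract the maximum for numerical stability
    weights = [math.exp(nu * (e - top)) for e in expected]
    total = sum(weights)
    return [w / total for w in weights]

def with_trembles(probs, eps):
    """Mix a choice distribution with the uniform distribution."""
    n = len(probs)
    return [(1 - eps) * p + eps / n for p in probs]

# Example 1 payoffs with uniform initial expectation and play.
U = [
    [350, 350, 700],
    [250, 550, 1000],
    [0, 0, 600],
]
uniform = [1 / 3, 1 / 3, 1 / 3]
q1 = update_expectation(uniform, uniform, theta=0.5)
choice = with_trembles(logit_best_reply(U, q1, nu=0.01), eps=0.05)
print(choice)
```

Against a uniform expectation, action 2 has the highest expected payoff (600), so it receives the largest choice probability in this sketch.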
More on the LBRIAE Selection Principle
• It can be seen as a more sophisticated variation of stochastic fictitious play.
• Maximum likelihood estimates are calculated for the four parameters, including θ, δ, and ε.
• There is no guarantee that the limit dynamics will converge.
• It has been shown to outperform other leading dynamic
learning models in experiments. (Stahl 1999)
Experimental Design
• Five symmetric games are selected for which each selection principle makes a unique prediction.
• Parameters estimated with another set of games in Stahl 1999 were used for LBRIAE. A total of 10,000 simulations were produced, and a large population was used.
• In each session, one of the five games was played. A participant’s payoff was determined by her choice and the
percentage distribution of the choices of all other participants.
More on the Experiments
• Participants were seated at private computer terminals
separated from the other participants.
• Description of the game and instructions were common
knowledge among all participants.
• Each player could form hypotheses about the choices of the other players, and the hypothetical payoff to each action was calculated and shown on screen.
Criterion One
• For each of the five games, aggregated choices are calculated over all sessions of the game.
• For each game, the proportion of choices that are consistent with each equilibrium selection principle is calculated.
• A simple average is taken over all five games for each equilibrium selection principle.
• The results are shown in Table 1.
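The calculation behind Criterion One can be illustrated with a short Python sketch. The data layout, function names, and the numbers in the example are made up for illustration and are not the experimental data.

```python
def consistency_share(choices, predicted_action):
    """Fraction of aggregated choices equal to the action a principle predicts."""
    return sum(1 for c in choices if c == predicted_action) / len(choices)

def criterion_one(games, principle):
    """Simple average, over games, of the share of choices consistent
    with the given selection principle."""
    shares = [consistency_share(g["choices"], g["predictions"][principle])
              for g in games]
    return sum(shares) / len(shares)

# Hypothetical data: two games, choices pooled over all sessions of each game.
games = [
    {"choices": [1, 1, 2, 2],
     "predictions": {"security": 1, "payoff_dominance": 2}},
    {"choices": [1, 1, 1, 3],
     "predictions": {"security": 1, "payoff_dominance": 3}},
]

print(criterion_one(games, "security"))          # 0.625
print(criterion_one(games, "payoff_dominance"))  # 0.375
```

Averaging per-game shares, rather than pooling all choices, gives each game equal weight regardless of how many choices it contains.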
Performance According to Criterion One
• On average, 70% of the aggregated choices are consistent
with the LBRIAE principle.
• Both risk dominance and security perform well above 50%.
• Payoff dominance only captures 8.4% of all choices.
• The results are robust across the games.
Performance According to Criterion Two
• We say that the final outcome in session i of a game is equilibrium selection principle x if at least 75% of the choices match the prediction of x.
• We compute the proportion of the experimental sessions
for which x was the final outcome.
• LBRIAE once again performs best while payoff dominance performs worst.
• Risk dominance has an average of 64%, but for game 13
it predicts none of the outcomes.
• The results are robust across the games; see Table 2.
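The 75% rule of Criterion Two can be sketched in Python as below. The function names and the session data are hypothetical. Which choices within a session count toward the 75% (all periods or only the last ones) is not stated on the slide, so the sketch simply takes whatever list of choices it is given.

```python
def final_outcome(choices, predictions, threshold=0.75):
    """Return the principles whose predicted action accounts for at
    least `threshold` of the session's choices."""
    n = len(choices)
    return [name for name, action in predictions.items()
            if sum(1 for c in choices if c == action) / n >= threshold]

def criterion_two(sessions, predictions, principle):
    """Proportion of sessions in which `principle` was the final outcome."""
    hits = sum(1 for s in sessions
               if principle in final_outcome(s, predictions))
    return hits / len(sessions)

# Hypothetical sessions of one game (each list holds one session's choices).
predictions = {"security": 1, "payoff_dominance": 2}
sessions = [
    [1, 1, 1, 2],
    [2, 2, 2, 2],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]

print(criterion_two(sessions, predictions, "security"))          # 0.5
print(criterion_two(sessions, predictions, "payoff_dominance"))  # 0.25
```

The third session has no final outcome under this rule, since neither action reaches 75% of the choices.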
Conclusions
• Both performance criteria rank LBRIAE as a clear winner
over deductive equilibrium selection principles.
• Risk dominance and security both outperform payoff dominance.
• Some final remarks.