sols - UCSB ECE

ECE 194V Game Theory and Multiagent Systems
Homework #4 Solutions
1. For each of the following 2 × 2 games, the column player is using a randomized strategy of
L with probability p and R with probability 1 − p. The row player seeks to optimize the
expected payoff.
• BoS:

              B       S
      B     2, 1    0, 0
      S     0, 0    1, 2

• Stag hunt:

              Stag    Hare
      Stag    2, 2    0, 1
      Hare    1, 0    1, 1

• Typewriter:

              Alt     Std
      Alt     3, 3    0, 0
      Std     0, 0    1, 1
(a) In each case, the row player picks the top row if the left-hand expression is greater and the bottom row if the right-hand expression is greater.
i. BoS:
2p + 0(1 − p) vs 0p + 1(1 − p)
ii. Stag hunt:
2p + 0(1 − p) vs 1p + 1(1 − p)
iii. Typewriter:
3p + 0(1 − p) vs 0p + 1(1 − p)
(b) Indifference occurs at equality
i. BoS: p = 1/3
ii. Stag hunt: p = 1/2
iii. Typewriter: p = 1/4
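The indifference points can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper name `indifference_p` and the payoff layout (row player's payoffs only, listed as [[top-left, top-right], [bottom-left, bottom-right]]) are assumptions made for illustration.

```python
from fractions import Fraction

def indifference_p(row_payoffs):
    """Probability p of Left at which the row player is indifferent.

    row_payoffs = [[a, b], [c, d]] holds the row player's payoffs:
    the top row earns a*p + b*(1-p), the bottom row earns c*p + d*(1-p).
    Assumes a - b - c + d is nonzero.
    """
    (a, b), (c, d) = row_payoffs
    # Solve a*p + b*(1-p) = c*p + d*(1-p) for p.
    return Fraction(d - b, (a - b) - (c - d))

# Row player's payoffs for the three games above.
games = {
    "BoS": [[2, 0], [0, 1]],
    "Stag hunt": [[2, 0], [1, 1]],
    "Typewriter": [[3, 0], [0, 1]],
}
thresholds = {name: indifference_p(m) for name, m in games.items()}
```

The row player prefers the top row when p exceeds the threshold for that game and the bottom row when p is below it.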
(c) Compute all Nash equilibria (both mixed and pure) for each game.
i. BoS: (B, B), (S, S) are the pure Nash equilibria. There is also a mixed equilibrium
in which Row plays B with probability 2/3 and Column plays B with probability
1/3 (thus, Row plays S with probability 1/3 and Column plays S with probability
2/3).
ii. Stag hunt: (Stag, Stag) and (Hare, Hare) are the pure Nash Equilibria. There is
also a mixed equilibrium in which Row and Column each play Stag with probability
1/2.
iii. Typewriter: (Alt, Alt) and (Std, Std) are the pure Nash Equilibria. There is also
a mixed equilibrium in which Row and Column each play Alt with probability 1/4
(and Std with probability 3/4).
The following shows how to compute the Nash Equilibria for stag hunt. The other two are
similar.
              Stag    Hare
      Stag    2, 2    0, 1
      Hare    1, 0    1, 1
• Suppose the row and column players use mixed strategies (p, 1 − p) and (q, 1 − q),
respectively, where p and q are the probabilities of playing Stag.
• The best response of the row player is:
\[
B_{\text{row}}(q) =
\begin{cases}
1 & \text{if } 2q > q + (1 - q) \\
[0, 1] & \text{if } 2q = q + (1 - q) \\
0 & \text{if } 2q < q + (1 - q)
\end{cases}
\]
which can be rewritten as
\[
B_{\text{row}}(q) =
\begin{cases}
1 & \text{if } q > 1/2 \\
[0, 1] & \text{if } q = 1/2 \\
0 & \text{if } q < 1/2
\end{cases}
\]
• Because of the symmetry between players, the best response for the column player is
\[
B_{\text{col}}(p) =
\begin{cases}
1 & \text{if } p > 1/2 \\
[0, 1] & \text{if } p = 1/2 \\
0 & \text{if } p < 1/2
\end{cases}
\]
• The resulting Nash equilibria in terms of (p*, q*) pairs are
(1, 1), (0, 0), and (1/2, 1/2).
This is illustrated in the best response plot:
[Figure: best response plot; both axes range from 0 to 1.]
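The stag hunt computation can be reproduced with a short script. This is a sketch under stated assumptions: the function names and the index convention (action 0 = Stag, action 1 = Hare, payoffs indexed [row action][column action]) are introduced here for illustration, and the mixed equilibrium formula assumes a fully mixed equilibrium exists.

```python
from fractions import Fraction

# Stag hunt payoffs indexed [row action][column action]; 0 = Stag, 1 = Hare.
U_ROW = [[2, 0], [1, 1]]
U_COL = [[2, 1], [0, 1]]

def pure_equilibria(u_row, u_col):
    """Return action pairs where neither player gains by deviating alone."""
    found = []
    for r in range(2):
        for c in range(2):
            row_ok = u_row[r][c] >= u_row[1 - r][c]
            col_ok = u_col[r][c] >= u_col[r][1 - c]
            if row_ok and col_ok:
                found.append((r, c))
    return found

def mixed_equilibrium(u_row, u_col):
    """Fully mixed equilibrium (p, q) from the two indifference conditions.

    p = P(row plays action 0), q = P(column plays action 0).
    Assumes both denominators are nonzero, which holds for this game.
    """
    # q makes the row player indifferent between its two rows.
    q = Fraction(u_row[1][1] - u_row[0][1],
                 u_row[0][0] - u_row[0][1] - u_row[1][0] + u_row[1][1])
    # p makes the column player indifferent between its two columns.
    p = Fraction(u_col[1][1] - u_col[1][0],
                 u_col[0][0] - u_col[1][0] - u_col[0][1] + u_col[1][1])
    return p, q

pure = pure_equilibria(U_ROW, U_COL)
mixed = mixed_equilibrium(U_ROW, U_COL)
```

Here `pure` holds the index pairs for (Stag, Stag) and (Hare, Hare), and `mixed` holds the interior crossing of the two best response curves.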
2. Grad School Competition:
(a) The payoff matrix is:

                 0 Weeks       1 Week        2 Weeks
      0 Weeks    1.5, 1.5      0, 2          0, 1
      1 Week     2, 0          0.5, 0.5      −1, 1
      2 Weeks    1, 0          1, −1         −0.5, −0.5

(b) There are no strictly or weakly dominated strategies.
(c) The unique mixed-strategy equilibrium has both players playing each action with equal
probabilities, or working (0, 1, 2) weeks with probability (1/3, 1/3, 1/3), respectively.
One way to find this is to compute the strategy for each player that would make the
other player indifferent between all of his strategies. For example, we can solve for
the Column-player strategies that make the Row player indifferent. This means that
if Column player works (0, 1, 2) weeks with probabilities (p1 , p2 , p3 ), these probabilities
must satisfy the system of equations defined by
\[
\begin{aligned}
p_1 + p_2 + p_3 &= 1 \\
2p_1 + 0.5p_2 - p_3 &= 1.5p_1 \\
p_1 + p_2 - 0.5p_3 &= 1.5p_1
\end{aligned}
\]
The first equation ensures that the probabilities sum to 1, the second equation ensures
that the Row player is indifferent between working 0 weeks and working 1 week, and the
third equation ensures that the Row player is indifferent between working 0 weeks and
working 2 weeks.
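The system can be solved exactly with rational arithmetic. The sketch below is illustrative only: the `solve` helper is a hypothetical Gauss-Jordan routine written for this example, and the two indifference equations are rearranged so that all unknowns sit on the left-hand side.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b exactly by Gauss-Jordan elimination.

    Assumes A is nonsingular.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

half = Fraction(1, 2)
A = [
    [1, 1, 1],            # p1 + p2 + p3 = 1
    [half, half, -1],     # (2 - 1.5) p1 + 0.5 p2 - p3 = 0
    [-half, 1, -half],    # (1 - 1.5) p1 + p2 - 0.5 p3 = 0
]
b = [1, 0, 0]
p = solve(A, b)
```

The solution `p` is the uniform distribution over the three actions, matching the equilibrium stated above.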
3. Bribes: Two players find themselves in a legal battle over a patent. The patent is worth 20
to each player, so the winner would receive 20 and the loser 0. Given the norms of the country,
it is common to bribe the judge hearing a case. Each player can offer a bribe secretly, and the
one whose bribe is the highest will be awarded the patent. If both choose not to bribe, or if
the bribes are the same amount, then each has an equal chance of being awarded the patent.
If a player does bribe, then the bribe can be valued at either 9 or 20. Any other number
is considered very unlucky, and the judge would surely rule against a party who offered a
different number.
(a) The unique pure Nash equilibrium is (Bribe 9, Bribe 9). One way to find it is by
iterated elimination of strictly dominated strategies, like so:
                  Bribe 0    Bribe 9    Bribe 20
      Bribe 0     10, 10     0, 11      0, 0
      Bribe 9     11, 0      1, 1       −9, 0
      Bribe 20    0, 0       0, −9      −10, −10
Note that for both players, Bribe 20 is strictly dominated by Bribe 9, so if we eliminate
this strategy for both players, the game is reduced to this:
                 Bribe 0    Bribe 9
      Bribe 0    10, 10     0, 11
      Bribe 9    11, 0      1, 1
Now, for both players, Bribe 0 is strictly dominated by Bribe 9, so we can eliminate
Bribe 0 for both. Only the profile (Bribe 9, Bribe 9) remains, so it must be the only
pure Nash equilibrium.
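The elimination steps above can be sketched in code. The function names are hypothetical, only domination by pure strategies is checked, and the game's symmetry is used so that a single surviving action set serves both players.

```python
# Row player's payoffs in the bribe game, keyed by (own bribe, other's bribe).
# The game is symmetric, so the column player's payoff at (r, c) equals the
# row player's payoff at (c, r).
ACTIONS = [0, 9, 20]
U = {
    (0, 0): 10, (0, 9): 0, (0, 20): 0,
    (9, 0): 11, (9, 9): 1, (9, 20): -9,
    (20, 0): 0, (20, 9): 0, (20, 20): -10,
}

def strictly_dominated(a, alive):
    """True if some surviving action beats `a` against every surviving action."""
    return any(
        all(U[(b, c)] > U[(a, c)] for c in alive)
        for b in alive if b != a
    )

def iterated_elimination(actions):
    """Repeatedly remove strictly dominated actions (same set for both players)."""
    alive = list(actions)
    while True:
        dominated = [a for a in alive if strictly_dominated(a, alive)]
        if not dominated:
            return alive
        alive = [a for a in alive if a not in dominated]

survivors = iterated_elimination(ACTIONS)
```

The first pass removes Bribe 20 and the second removes Bribe 0, leaving only Bribe 9.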
(b) Including the Bribe 15 strategy means there are no pure Nash equilibria left. This can
be verified by checking all new action profiles on the new payoff matrix:
                  Bribe 0    Bribe 9    Bribe 15    Bribe 20
      Bribe 0     10, 10     0, 11      0, 5        0, 0
      Bribe 9     11, 0      1, 1       −9, 5       −9, 0
      Bribe 15    5, 0       5, −9      −5, −5      −15, 0
      Bribe 20    0, 0       0, −9      0, −15      −10, −10
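Checking every action profile by hand is tedious, so the search can be automated. In this sketch the `payoff` rule (patent worth 20, ties split evenly, bribe paid whether or not the case is won) is an assumption read off from the payoff matrices, and the function names are illustrative.

```python
from itertools import product

BRIBES = [0, 9, 15, 20]

def payoff(mine, theirs):
    """Expected payoff: patent worth 20, ties split evenly, bribe always paid."""
    if mine > theirs:
        prize = 20
    elif mine == theirs:
        prize = 10
    else:
        prize = 0
    return prize - mine

def pure_nash(actions):
    """All profiles (r, c) of the symmetric game with no profitable unilateral deviation."""
    equilibria = []
    for r, c in product(actions, repeat=2):
        row_ok = all(payoff(r, c) >= payoff(d, c) for d in actions)
        col_ok = all(payoff(c, r) >= payoff(d, r) for d in actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

with_15 = pure_nash(BRIBES)
without_15 = pure_nash([0, 9, 20])
```

With Bribe 15 available the list is empty, while the three-action game returns only the profile (9, 9).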
(c) In the mixed equilibrium, each player bribes (0, 9, 15, 20) with probabilities
(0.4, 0.5, 0.1, 0), respectively. To find this:
• Note that Bribe 20 can never have positive weight in any mixed Nash equilibrium,
because payoffs can always be increased by moving weight to a different strategy.
• Solve for the Column-player strategies that make the Row player indifferent. This
means that if Column player Bribes (0, 9, 15) with probabilities (p1 , p2 , p3 ), these
probabilities must satisfy the system of equations defined by
\[
\begin{aligned}
p_1 + p_2 + p_3 &= 1 \\
11p_1 + p_2 - 9p_3 &= 10p_1 \\
5p_1 + 5p_2 - 5p_3 &= 10p_1
\end{aligned}
\]
The first equation ensures that the probabilities sum to 1, the second equation
ensures that the Row player is indifferent between bribing 0 and bribing 9, and
the third equation ensures that the Row player is indifferent between bribing 0 and
bribing 15.
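As a sanity check on the stated mixed equilibrium, one can compute the expected payoff of each bribe against the mix (0.4, 0.5, 0.1, 0). The sketch below assumes the same payoff rule as the matrices (ties split evenly, bribe always paid); the names `MIX` and `expected_payoffs` are illustrative.

```python
from fractions import Fraction

BRIBES = [0, 9, 15, 20]
# Candidate equilibrium mix over BRIBES, taken from the solution text.
MIX = [Fraction(2, 5), Fraction(1, 2), Fraction(1, 10), Fraction(0)]

def payoff(mine, theirs):
    """Expected payoff: patent worth 20, ties split evenly, bribe always paid."""
    if mine > theirs:
        prize = 20
    elif mine == theirs:
        prize = 10
    else:
        prize = 0
    return prize - mine

def expected_payoffs(mix):
    """Expected payoff of each pure bribe against an opponent playing `mix`."""
    return {
        a: sum(w * payoff(a, b) for b, w in zip(BRIBES, mix))
        for a in BRIBES
    }

values = expected_payoffs(MIX)
```

Bribes 0, 9, and 15 all earn the same expected payoff of 4, while Bribe 20 earns 0, so no deviation from the mix is profitable.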
4. (a) The set of Nash equilibria is characterized in the same fashion as Problem 1. Specifically,
in this game, there is one mixed Nash equilibrium in which each player plays H with
probability 2/3. There are also two pure Nash equilibria: (D, H) and (H, D).
A correlated equilibrium is a joint distribution z over action profiles such that, for every
player i and every pair of actions a_i, a_i',
\[
\sum_{a_{-i} \in A_{-i}} U_i(a_i, a_{-i})\, z^{(a_i, a_{-i})} \ge
\sum_{a_{-i} \in A_{-i}} U_i(a_i', a_{-i})\, z^{(a_i, a_{-i})} .
\]
Accordingly, the set of correlated equilibria satisfies the following 4 inequalities:
\[
\begin{aligned}
a_1 = H &\Rightarrow U_1(H,H) z^{(H,H)} + U_1(H,D) z^{(H,D)} \ge U_1(D,H) z^{(H,H)} + U_1(D,D) z^{(H,D)} \\
a_1 = D &\Rightarrow U_1(D,H) z^{(D,H)} + U_1(D,D) z^{(D,D)} \ge U_1(H,H) z^{(D,H)} + U_1(H,D) z^{(D,D)} \\
a_2 = H &\Rightarrow U_2(H,H) z^{(H,H)} + U_2(D,H) z^{(D,H)} \ge U_2(H,D) z^{(H,H)} + U_2(D,D) z^{(D,H)} \\
a_2 = D &\Rightarrow U_2(H,D) z^{(H,D)} + U_2(D,D) z^{(D,D)} \ge U_2(H,H) z^{(H,D)} + U_2(D,H) z^{(D,D)}
\end{aligned}
\]
Inserting numbers from the payoff matrix, we have that a correlated equilibrium is a
tuple (z^{(H,H)}, z^{(D,H)}, z^{(H,D)}, z^{(D,D)}) that satisfies these four inequalities:
\[
\begin{aligned}
6z^{(H,D)} &\ge z^{(H,H)} + 4z^{(H,D)} \\
z^{(D,H)} + 4z^{(D,D)} &\ge 6z^{(D,D)} \\
6z^{(D,H)} &\ge z^{(H,H)} + 4z^{(D,H)} \\
z^{(H,D)} + 4z^{(D,D)} &\ge 6z^{(D,D)}
\end{aligned}
\]
With these payoffs, a player whose opponent plays H with probability h is indifferent
between H and D when 6(1 − h) = h + 4(1 − h), which gives h = 2/3.
(b) In this particular game, the Nash equilibria have joint action probabilities
(z^{(H,H)}, z^{(D,H)}, z^{(H,D)}, z^{(D,D)}) = (0, 1, 0, 0), (0, 0, 1, 0), and (4/9, 2/9, 2/9, 1/9).
All three of these tuples satisfy the above inequalities; the mixed equilibrium satisfies
all four with equality.
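Formally, we can prove NE ⊆ CE. Suppose z is a Nash equilibrium. If a_i is a possible
signal to player i, then p_i^{a_i} > 0, which implies that
\[
\sum_{a_{-i} \in A_{-i}} U_i(a_i, a_{-i})\, z^{(a_i, a_{-i})} \ge
\sum_{a_{-i} \in A_{-i}} U_i(\bar{a}_i, a_{-i})\, z^{(a_i, a_{-i})}
\]
for every alternative action \bar{a}_i, since z is a Nash equilibrium: every action played
with positive probability is a best response, so the player is indifferent among the actions
in the support. Hence, z is a correlated equilibrium.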
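Thus, the Venn diagram looks like this:
[Figure: Venn diagram with the set of Nash equilibria drawn inside the set of correlated equilibria.]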
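The four inequalities can be checked mechanically for any candidate distribution. In the sketch below, the payoff table is read off from the coefficients of the inequalities (an assumption, since the payoff matrix itself is not reproduced in these solutions), and the function name is illustrative. The coin flip between the two pure equilibria is an added example, not taken from the original solution.

```python
from fractions import Fraction

# Payoffs read off from the coefficients of the four inequalities (assumed).
# Keys are (row action, column action); values are (U1, U2).
U = {
    ("H", "H"): (0, 0), ("H", "D"): (6, 1),
    ("D", "H"): (1, 6), ("D", "D"): (4, 4),
}
ACTIONS = ["H", "D"]

def is_correlated_equilibrium(z):
    """Check that no player gains by deviating from any recommended action."""
    for a in ACTIONS:          # recommended action
        for dev in ACTIONS:    # candidate deviation
            # Player 1: sum over the column player's actions.
            keep = sum(U[(a, b)][0] * z[(a, b)] for b in ACTIONS)
            swap = sum(U[(dev, b)][0] * z[(a, b)] for b in ACTIONS)
            if keep < swap:
                return False
            # Player 2: sum over the row player's actions.
            keep = sum(U[(b, a)][1] * z[(b, a)] for b in ACTIONS)
            swap = sum(U[(b, dev)][1] * z[(b, a)] for b in ACTIONS)
            if keep < swap:
                return False
    return True

half = Fraction(1, 2)
pure_dh = {("H", "H"): 0, ("D", "H"): 1, ("H", "D"): 0, ("D", "D"): 0}
coin_flip = {("H", "H"): 0, ("D", "H"): half, ("H", "D"): half, ("D", "D"): 0}
all_hh = {("H", "H"): 1, ("D", "H"): 0, ("H", "D"): 0, ("D", "D"): 0}
```

The pure equilibrium (D, H) and the coin flip between (D, H) and (H, D) both pass the check, while placing all weight on (H, H) fails. The coin flip is not a product distribution, so it is a correlated equilibrium that is not a Nash equilibrium.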
5. A zero-sum game has a payoff (for the row player) given by
1 3
4 2
(a) Compute the security strategies for both players using pure actions, and conclude that
the game does not have a value in this case.
(b) Repeat using mixed strategies, and compute the value of the game.
Pure strategies:
• For the row player, the worst case for top is 1 and for bottom is 2. Therefore,
\[ a^*_{\text{row}} = \text{Bottom}, \quad \underline{v} = 2 \]
• For the column player, the worst case for left is 4 and for right is 3. Therefore,
\[ a^*_{\text{col}} = \text{Right}, \quad \overline{v} = 3 \]
• Note that $\underline{v} < \overline{v}$, so the game has no value in pure actions.
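The pure security levels are a maximin over rows and a minimax over columns, which takes only a few lines to compute. The function name below is an assumption made for this sketch.

```python
# Row player's payoffs; the row player maximizes, the column player minimizes.
M = [[1, 3],
     [4, 2]]

def pure_security_levels(m):
    """Return (lower value, upper value) over pure actions."""
    lower = max(min(row) for row in m)         # maximin for the row player
    upper = min(max(col) for col in zip(*m))   # minimax for the column player
    return lower, upper

lower, upper = pure_security_levels(M)
```

Since `lower` is strictly less than `upper`, the matrix has no saddle point in pure actions.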
Mixed strategies:
• Assume the row player mixes according to (p, 1 − p) over (top, bottom) and the column player mixes according to (q, 1 − q) over (left, right).
• For the row player, the plot below illustrates the consequences of mixing with p. The
horizontal axis is p. The vertical axis is payoff. The two lines correspond to column
player using left or right. The thick line represents the worst case as function of p. The
peak is the maximin.
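[Figure: payoff versus p for the column actions left and right, with the worst case drawn as a thick line; p ranges from 0 to 1 and the payoff from 1 to 4.]
Note that $\underline{v} = 2.5$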
• Likewise for the minimizing column player, the following is a plot of the worst case as a
function of q. The valley is the minimax.
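[Figure: payoff versus q for the row actions top and bottom, with the worst case for the column player highlighted; q ranges from 0 to 1 and the payoff from 1 to 4.]
Note that $\overline{v} = 2.5$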
• The value of the game is 2.5. The security strategies are
p* = 1/2 and q* = 1/4.
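The peak and the valley in the two plots occur where the two payoff lines cross, so the mixed security strategies follow from two indifference conditions. For the row player the lines are 4 − 3p (column plays left) and 2 + p (column plays right). The sketch below uses the resulting closed-form expressions; the function name is illustrative, and the formulas assume a 2 × 2 game with no pure saddle point, so that the crossing lies strictly inside [0, 1].

```python
from fractions import Fraction

def solve_2x2_zero_sum(m):
    """Mixed security strategies for a 2x2 zero-sum game without a saddle point.

    m holds the row player's payoffs [[a, b], [c, d]].
    Returns (p, q, value): p = P(row plays top), q = P(column plays left).
    """
    (a, b), (c, d) = m
    denom = a - b - c + d
    # p equalizes the row player's payoff against left and right.
    p = Fraction(d - c, denom)
    # q equalizes the column player's loss against top and bottom.
    q = Fraction(d - b, denom)
    value = Fraction(a * d - b * c, denom)
    return p, q, value

p_star, q_star, value = solve_2x2_zero_sum([[1, 3], [4, 2]])
```

For the matrix in this problem the result agrees with the security strategies and the value read off from the plots.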