ECE 194V Game Theory and Multiagent Systems
Homework #4 Solutions

1. For each of the following 2 × 2 games, the column player uses a randomized strategy: L with probability $p$ and R with probability $1 - p$. The row player seeks to maximize the expected payoff.

• BoS:
              B       S
      B     2, 1    0, 0
      S     0, 0    1, 2

• Stag hunt:
              Stag    Hare
      Stag    2, 2    0, 1
      Hare    1, 0    1, 1

• Typewriter:
              Alt     Std
      Alt     3, 3    0, 0
      Std     0, 0    1, 1

(a) In each case, the left-hand expression is the row player's expected payoff from the top row and the right-hand expression is the expected payoff from the bottom row. Pick "top" if the left-hand expression is greater and "bottom" if the right-hand expression is greater.
   i. BoS: $2p + 0(1 - p)$ vs $0p + 1(1 - p)$
   ii. Stag hunt: $2p + 0(1 - p)$ vs $1p + 1(1 - p)$
   iii. Typewriter: $3p + 0(1 - p)$ vs $0p + 1(1 - p)$

(b) Indifference occurs when the two expressions are equal:
   i. BoS: $p = 1/3$
   ii. Stag hunt: $p = 1/2$
   iii. Typewriter: $p = 1/4$

(c) Compute all Nash equilibria (both mixed and pure) for each game.
   i. BoS: (B, B) and (S, S) are the pure Nash equilibria. There is also a mixed equilibrium in which Row plays B with probability 2/3 and Column plays B with probability 1/3 (thus, Row plays S with probability 1/3 and Column plays S with probability 2/3).
   ii. Stag hunt: (Stag, Stag) and (Hare, Hare) are the pure Nash equilibria. There is also a mixed equilibrium in which Row and Column each play Stag with probability 1/2.
   iii. Typewriter: (Alt, Alt) and (Std, Std) are the pure Nash equilibria. There is also a mixed equilibrium in which Row and Column each play Alt with probability 1/4 (and Std with probability 3/4).

The following shows how to compute the Nash equilibria for stag hunt. The other two are similar.

              Stag    Hare
      Stag    2, 2    0, 1
      Hare    1, 0    1, 1

• Suppose the row and column players use mixed strategies $(p, 1 - p)$ and $(q, 1 - q)$, respectively, where $p$ and $q$ are the probabilities of playing Stag.
• The best response of the row player (expressed as the probability of playing Stag) is:
\[
B_{\text{row}}(q) =
\begin{cases}
1 & \text{if } 2q > q + (1 - q) \\
[0, 1] & \text{if } 2q = q + (1 - q) \\
0 & \text{if } 2q < q + (1 - q)
\end{cases}
\]
which can be rewritten as
\[
B_{\text{row}}(q) =
\begin{cases}
1 & \text{if } q > 1/2 \\
[0, 1] & \text{if } q = 1/2 \\
0 & \text{if } q < 1/2
\end{cases}
\]
• Because of the symmetry between players, the best response for the column player is
\[
B_{\text{col}}(p) =
\begin{cases}
1 & \text{if } p > 1/2 \\
[0, 1] & \text{if } p = 1/2 \\
0 & \text{if } p < 1/2
\end{cases}
\]
• The resulting Nash equilibria in terms of $(p^*, q^*)$ pairs are $(1, 1)$, $(0, 0)$, and $(1/2, 1/2)$. This is illustrated in the best response plot:

[Figure: best response plot of $B_{\text{row}}$ and $B_{\text{col}}$; both axes run from 0 to 1.]

2. Grad School Competition:

(a)
                 0 Weeks      1 Week       2 Weeks
      0 Weeks    1.5, 1.5     0, 2         0, 1
      1 Week     2, 0         0.5, 0.5     −1, 1
      2 Weeks    1, 0         1, −1        −0.5, −0.5

(b) There are no strictly or weakly dominated strategies.

(c) The unique mixed-strategy equilibrium has both players playing each action with equal probability, that is, working (0, 1, 2) weeks with probabilities (1/3, 1/3, 1/3), respectively. One way to find this is to compute the strategy for each player that would make the other player indifferent between all of his strategies. For example, we can solve for the Column-player strategies that make the Row player indifferent. This means that if the Column player works (0, 1, 2) weeks with probabilities $(p_1, p_2, p_3)$, these probabilities must satisfy the system of equations
\begin{align*}
p_1 + p_2 + p_3 &= 1 \\
2p_1 + 0.5p_2 - p_3 &= 1.5p_1 \\
p_1 + p_2 - 0.5p_3 &= 1.5p_1
\end{align*}
The first equation ensures that the probabilities sum to 1, the second equation ensures that the Row player is indifferent between working 0 weeks and working 1 week, and the third equation ensures that the Row player is indifferent between working 0 weeks and working 2 weeks.

3. Bribes: Two players find themselves in a legal battle over a patent. The patent is worth 20 to each player, so the winner would receive 20 and the loser 0. Given the norms of the country, it is common to bribe the judge hearing a case. Each player can offer a bribe secretly, and the one whose bribe is the highest will be awarded the patent.
If both choose not to bribe, or if the bribes are the same amount, then each has an equal chance of being awarded the patent. If a player does bribe, then the bribe can be valued at either 9 or 20. Any other number is considered very unlucky, and the judge would surely rule against a party who offered a different number.

(a) The unique pure Nash equilibrium is (Bribe 9, Bribe 9). One way to find it is by iterated elimination of strictly dominated strategies, like so:

                  Bribe 0     Bribe 9     Bribe 20
      Bribe 0     10, 10      0, 11       0, 0
      Bribe 9     11, 0       1, 1        −9, 0
      Bribe 20    0, 0        0, −9       −10, −10

Note that for both players, Bribe 20 is strictly dominated by Bribe 9, so if we eliminate this strategy for both players, the game is reduced to this:

                  Bribe 0     Bribe 9
      Bribe 0     10, 10      0, 11
      Bribe 9     11, 0       1, 1

Now, for both players, Bribe 0 is strictly dominated by Bribe 9, so we can eliminate Bribe 0 for both. We are left with only the profile (Bribe 9, Bribe 9), which must be the only pure Nash equilibrium.

(b) Including the Bribe 15 strategy means there are no pure Nash equilibria left. This can be verified by checking all action profiles on the new payoff matrix:

                  Bribe 0     Bribe 9     Bribe 15    Bribe 20
      Bribe 0     10, 10      0, 11       0, 5        0, 0
      Bribe 9     11, 0       1, 1        −9, 5       −9, 0
      Bribe 15    5, 0        5, −9       −5, −5      −15, 0
      Bribe 20    0, 0        0, −9       0, −15      −10, −10

(c) The mixed equilibrium assigns probabilities (0.4, 0.5, 0.1, 0) to bribes of (0, 9, 15, 20), respectively. To find this:
• Note that Bribe 20 can never have positive weight in any mixed Nash equilibrium, because payoffs can always be increased by moving weight to a different strategy.
• Solve for the Column-player strategies that make the Row player indifferent.
This means that if the Column player bribes (0, 9, 15) with probabilities $(p_1, p_2, p_3)$, these probabilities must satisfy the system of equations
\begin{align*}
p_1 + p_2 + p_3 &= 1 \\
11p_1 + p_2 - 9p_3 &= 10p_1 \\
5p_1 + 5p_2 - 5p_3 &= 10p_1
\end{align*}
The first equation ensures that the probabilities sum to 1, the second equation ensures that the Row player is indifferent between bribing 0 and bribing 9, and the third equation ensures that the Row player is indifferent between bribing 0 and bribing 15.

4. (a) The set of Nash equilibria is characterized in the same fashion as Problem 1. Specifically, in this game, there is one mixed Nash equilibrium in which each player plays H with probability 2/3: with the payoffs used below, a player whose opponent plays H with probability $q$ is indifferent when $6(1 - q) = q + 4(1 - q)$, which gives $q = 2/3$. There are also two pure Nash equilibria: (D, H) and (H, D).

A correlated equilibrium is a joint distribution $z$ over action profiles such that, for every player $i$ and every pair of actions $a_i, a_i' \in A_i$,
\[
\sum_{a_{-i} \in A_{-i}} U_i(a_i, a_{-i})\, z^{(a_i, a_{-i})} \;\ge\; \sum_{a_{-i} \in A_{-i}} U_i(a_i', a_{-i})\, z^{(a_i, a_{-i})} .
\]
Accordingly, the set of correlated equilibria satisfies the following 4 inequalities:
\begin{align*}
a_1 = H &\Rightarrow U_1(H,H)\, z^{(H,H)} + U_1(H,D)\, z^{(H,D)} \ge U_1(D,H)\, z^{(H,H)} + U_1(D,D)\, z^{(H,D)} \\
a_1 = D &\Rightarrow U_1(D,H)\, z^{(D,H)} + U_1(D,D)\, z^{(D,D)} \ge U_1(H,H)\, z^{(D,H)} + U_1(H,D)\, z^{(D,D)} \\
a_2 = H &\Rightarrow U_2(H,H)\, z^{(H,H)} + U_2(D,H)\, z^{(D,H)} \ge U_2(H,D)\, z^{(H,H)} + U_2(D,D)\, z^{(D,H)} \\
a_2 = D &\Rightarrow U_2(H,D)\, z^{(H,D)} + U_2(D,D)\, z^{(D,D)} \ge U_2(H,H)\, z^{(H,D)} + U_2(D,H)\, z^{(D,D)}
\end{align*}
Inserting numbers from the payoff matrix, a correlated equilibrium is a tuple $\left(z^{(H,H)}, z^{(D,H)}, z^{(H,D)}, z^{(D,D)}\right)$ of nonnegative probabilities summing to 1 that satisfies these four inequalities:
\begin{align*}
6 z^{(H,D)} &\ge z^{(H,H)} + 4 z^{(H,D)} \\
z^{(D,H)} + 4 z^{(D,D)} &\ge 6 z^{(D,D)} \\
6 z^{(D,H)} &\ge z^{(H,H)} + 4 z^{(D,H)} \\
z^{(H,D)} + 4 z^{(D,D)} &\ge 6 z^{(D,D)}
\end{align*}

(b) In this particular game, the Nash equilibria have joint action probabilities of $\left(z^{(H,H)}, z^{(D,H)}, z^{(H,D)}, z^{(D,D)}\right) = (0, 1, 0, 0)$, $(0, 0, 1, 0)$, and $(4/9, 2/9, 2/9, 1/9)$. All three of these tuples satisfy the above inequalities (the mixed equilibrium satisfies each of them with equality).

Formally, we can prove $NE \subseteq CE$. Suppose $z$ is a Nash equilibrium, so that $z$ is the product of the players' mixed strategies. If $a_i$ is a possible signal to player $i$, then $p_i^{a_i} > 0$, which implies that
\[
\sum_{a_{-i} \in A_{-i}} U_i(a_i, a_{-i})\, z^{(a_i, a_{-i})} \;\ge\; \sum_{a_{-i} \in A_{-i}} U_i(\bar{a}_i, a_{-i})\, z^{(a_i, a_{-i})}
\]
since $z$ is a Nash equilibrium. (Remember the indifference property: every action played with positive probability in a Nash equilibrium is a best response.) Hence, it is a correlated equilibrium. Thus, the Venn diagram looks like this:

[Figure: Venn diagram with the set of Nash equilibria (NE) contained in the set of correlated equilibria (CE).]

5. A zero-sum game has a payoff (for the row player) given by
\[
\begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix}
\]
(a) Compute the security strategies for both players using pure actions, and conclude that the game does not have a value in this case.
(b) Repeat using mixed strategies, and compute the value of the game.

Pure strategies:
• For the row player, the worst case for top is 1 and for bottom is 2. Therefore, $a^*_{\text{row}} = \text{Bottom}$ and $\underline{v} = 2$.
• For the column player, the worst case for left is 4 and for right is 3. Therefore, $a^*_{\text{col}} = \text{Right}$ and $\overline{v} = 3$.
• Note that $\underline{v} < \overline{v}$, so the game has no value in pure strategies.

Mixed strategies:
• Assume the row player mixes according to $(p, 1 - p)$ over (top, bottom) and the column player mixes according to $(q, 1 - q)$ over (left, right).
• For the row player, the plot below illustrates the consequences of mixing with $p$. The horizontal axis is $p$. The vertical axis is payoff. The two lines correspond to the column player using left, $p + 4(1 - p)$, or right, $3p + 2(1 - p)$. The thick line represents the worst case as a function of $p$. The peak is the maximin.

[Figure: row player's payoff against left and against right as a function of $p$, with the worst case (lower envelope) in bold.]

Note that $\underline{v} = 2.5$.
• Likewise for the minimizing column player, the following is a plot of the worst case as a function of $q$. The two lines correspond to the row player using top, $q + 3(1 - q)$, or bottom, $4q + 2(1 - q)$. The valley is the minimax.

[Figure: row player's payoff for top and for bottom as a function of $q$, with the worst case for the column player (upper envelope) in bold.]

Note that $\overline{v} = 2.5$.
• The value of the game is 2.5. The security strategies are $p^* = 1/2$ and $q^* = 1/4$.
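The mixed-strategy calculation in Problem 5 can be checked numerically. The following is a minimal sketch, not part of the original solutions: the function names `row_security` and `col_security` are illustrative, and the code assumes a 2 × 2 matrix of payoffs to the maximizing row player. It relies on the fact that the worst-case payoff is the lower envelope of two lines, so its maximum lies at an endpoint or at the crossing point.

```python
from fractions import Fraction


def row_security(A):
    """Maximin mixed strategy of the row player in a 2x2 zero-sum game.

    A[i][j] is the payoff to the (maximizing) row player.
    Returns (p, v): p is the probability of the top row and v is the
    payoff the row player can guarantee.
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in A]

    def worst(p):
        # Payoff against Left is a*p + c*(1-p); against Right, b*p + d*(1-p).
        return min(a * p + c * (1 - p), b * p + d * (1 - p))

    # The lower envelope of two lines peaks at an endpoint or at the crossing.
    candidates = [Fraction(0), Fraction(1)]
    denom = (a - c) - (b - d)
    if denom != 0:
        p_cross = (d - c) / denom
        if 0 <= p_cross <= 1:
            candidates.append(p_cross)
    p_star = max(candidates, key=worst)
    return p_star, worst(p_star)


def col_security(A):
    """Minimax mixed strategy of the column player.

    The column player's problem is the row player's problem in the game
    with payoff matrix -A transposed. Returns (q, v): q is the probability
    of the left column and v is the payoff the column player can hold
    the row player to.
    """
    neg_transpose = [[-A[0][0], -A[1][0]], [-A[0][1], -A[1][1]]]
    q, v = row_security(neg_transpose)
    return q, -v


A = [[1, 3], [4, 2]]
print(row_security(A))  # (Fraction(1, 2), Fraction(5, 2))
print(col_security(A))  # (Fraction(1, 4), Fraction(5, 2))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the crossing point and the value come out as 1/2, 1/4, and 5/2 rather than floating-point approximations. Because the endpoints are also checked, the same sketch returns a pure security strategy when the matrix has a saddle point.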