Minimax regret strategies in simple games

VERY preliminary

Michal Lewandowski*

September 19, 2014

Abstract

Goeree and Holt (2001) presented several simple games in which the Nash equilibrium and the experimentally observed strategies agree. They then showed that it is possible to modify these games in such a way that the Nash equilibrium of the resulting games disagrees with the experimental prediction. One explanation of this fact is that players cannot be sure that their opponents are perfectly rational, yet only in that case is the NE strategy optimal. Otherwise, one has to take into account that some strategies are riskier than others. The riskiness of a given strategy can be measured by its maximum regret.

Keywords: minimax regret, Nash equilibrium
JEL Classification Numbers: C7

* Warsaw School of Economics, [email protected]

1 Minimax regret strategies

Consider a game Γ in strategic form:

1. The set of players N
2. For each player i ∈ N, a set of actions A_i
3. For each player i ∈ N, a payoff function u_i : A → R

Notation:

- The set of action profiles of all players is denoted by A ≡ ×_{i∈N} A_i, with a typical element denoted by a.
- The set of action profiles of all players but player i is denoted by A_{-i} ≡ ×_{j∈N\{i}} A_j, with a typical element denoted by a_{-i}.

Definition 1. The regret from choosing a'_i ∈ A_i given a profile of the other players' actions a'_{-i} ∈ A_{-i} is defined as:

    R_i(a'_i, a'_{-i}) = max_{a_i ∈ A_i} u_i(a_i, a'_{-i}) − u_i(a'_i, a'_{-i})    (1)

The maximum regret from choosing a'_i ∈ A_i is then given by:

    max_{a_{-i} ∈ A_{-i}} R_i(a'_i, a_{-i})    (2)

A minimax regret strategy for player i ∈ N is an action a*_i ∈ A_i such that:

    a*_i ∈ argmin_{a_i ∈ A_i} max_{a_{-i} ∈ A_{-i}} R_i(a_i, a_{-i})    (3)

Definition 2. A Nash equilibrium in pure strategies is a profile of actions a* ∈ A such that:

    a*_i ∈ argmax_{a_i ∈ A_i} u_i(a_i, a*_{-i}),   ∀i ∈ N    (4)

Example 1.
Consider two games of a "choose an effort" variety:

    Payoff table
          L       R
    T   2, 2   −3, 1
    B   1, −3   1, 1

    Regret table
          L      R    max
    T   0, 0   4, 1     4
    B   1, 4   0, 0     1
    max    4      1

In the first game above, there are two Nash equilibria in pure strategies, (T, L) and (B, R), with payoffs (2, 2) and (1, 1) respectively, and one mixed strategy equilibrium (0.8T + 0.2B, 0.8L + 0.2R). Contrary to the mixed strategy equilibrium prediction, many experiments have confirmed that B and R are played much more often than T and L, respectively. This is captured by the minimax regret strategies, which reveal that strategies T and L are much riskier than strategies B and R (the maximum regret levels of the former are both 4 and of the latter both 1).

    Payoff table
          L      R
    T   5, 5   0, 1
    B   1, 0   1, 1

    Regret table
          L      R    max
    T   0, 0   1, 4     1
    B   4, 1   0, 0     4
    max    1      4

In the second game above, there are again two Nash equilibria in pure strategies, (T, L) and (B, R), with payoffs (5, 5) and (1, 1) respectively, and one mixed strategy equilibrium (0.2T + 0.8B, 0.2L + 0.8R). Contrary to the mixed strategy equilibrium prediction, many experiments have confirmed that T and L are played much more often than B and R, respectively. This is captured by the minimax regret strategies, which reveal that strategies B and R are much riskier than strategies T and L (the maximum regret levels of the former are both 4 and of the latter both 1).

Example 2. Consider three games of a "matching pennies" variety.
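The regret tables of Example 1 can be reproduced directly from Definition 1. The following sketch is illustrative and not part of the paper; the function names are mine, and payoffs are entered as a dictionary mapping action pairs to the row player's payoff.

```python
# Illustrative sketch (not from the paper): worst-case regret per Definition 1
# and the minimax regret strategies of equation (3), for one player of a
# finite two-player game.  Payoffs map (own action, other action) -> payoff.

def max_regret(u, own_actions, other_actions, a):
    """Worst-case regret of action a over all opponent actions."""
    return max(
        max(u[(b, a_j)] for b in own_actions) - u[(a, a_j)]
        for a_j in other_actions
    )

def minimax_regret(u, own_actions, other_actions):
    """Actions minimizing worst-case regret, and the minimax regret value."""
    worst = {a: max_regret(u, own_actions, other_actions, a) for a in own_actions}
    m = min(worst.values())
    return {a for a, r in worst.items() if r == m}, m

# Row player of the first game in Example 1.
u_row = {("T", "L"): 2, ("T", "R"): -3, ("B", "L"): 1, ("B", "R"): 1}
print(minimax_regret(u_row, ("T", "B"), ("L", "R")))  # ({'B'}, 1)
```

The same two functions apply to the column player after transposing the payoff dictionary.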
The first one is symmetric whereas the other two are asymmetric:

    Game A
    Payoff table
           L        R
    T   80, 40   40, 80
    B   40, 80   80, 40

    Regret table
           L       R    max
    T    0, 40   40, 0    40
    B    40, 0   0, 40    40
    max     40      40

    Game B
    Payoff table
           L        R
    T  320, 40   40, 80
    B   40, 80   80, 40

    Regret table
           L       R    max
    T    0, 40   40, 0    40
    B   280, 0   0, 40   280
    max     40      40

    Game C
    Payoff table
           L        R
    T   44, 40   40, 80
    B   40, 80   80, 40

    Regret table
           L       R    max
    T    0, 40   40, 0    40
    B     4, 0   0, 40     4
    max     40      40

The Nash equilibria of these games are compared with the experimentally observed strategies in the following table:

              NE                              Exp
    Game A    0.50T + 0.50B, 0.50L + 0.50R    0.48T + 0.52B, 0.48L + 0.52R
    Game B    0.50T + 0.50B, 0.13L + 0.87R    0.96T + 0.04B, 0.16L + 0.84R
    Game C    0.50T + 0.50B, 0.91L + 0.09R    0.08T + 0.92B, 0.80L + 0.20R

Interestingly, the NE strategies of the Row player are the same in all three games, which sharply contrasts with the experimental evidence. Let's look at the riskiness of the strategies. For the Column player, both strategies have the same maximum regret in all three games. It is therefore safe for the Column player to play what the Nash equilibrium suggests, and this roughly reflects what Column players actually play in these games. The riskiness levels of the Row player's strategies, however, differ substantially across the three games. In Game A the maximum regrets of strategies T and B are equal, in Game B the maximum regret of strategy T is much lower than that of strategy B, and in Game C it is the opposite. This is reflected in the experimental evidence, in which strategy T is played much more often than strategy B in Game B, with exactly the opposite happening in Game C.

Example 3. Let's consider the travellers' dilemma with two players and action space A_i = {180, 181, ..., 300} for each i ∈ N = {1, 2}.
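The Row player's maximum regrets in Games A, B, and C of Example 2 can be checked with a few lines of code. The sketch below is illustrative and not part of the paper; only the (T, L) payoff differs across the three games, so the other payoffs are held fixed.

```python
# Illustrative sketch (not from the paper): the Row player's maximum regrets
# in Games A, B, and C of Example 2.  Payoffs map (row, column) -> Row payoff.

def row_max_regrets(u):
    """Map each row action to its worst-case regret (Definition 1)."""
    rows, cols = ("T", "B"), ("L", "R")
    return {
        a: max(max(u[(b, c)] for b in rows) - u[(a, c)] for c in cols)
        for a in rows
    }

# Row payoffs shared by all three games; only (T, L) varies.
base = {("T", "R"): 40, ("B", "L"): 40, ("B", "R"): 80}
for name, tl in [("A", 80), ("B", 320), ("C", 44)]:
    print("Game", name, row_max_regrets({**base, ("T", "L"): tl}))
# Game A: T and B both 40; Game B: B jumps to 280; Game C: B falls to 4.
```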
Payoffs are the following:

    u_i(a'_i, a'_j) = min(a'_i, a'_j) + P(a'_i, a'_j),   i ≠ j,  i, j ∈ N,

where, for a constant P ∈ Z_+,

    P(a'_i, a'_j) =  P   if a'_i < a'_j
                     0   if a'_i = a'_j
                    −P   if a'_i > a'_j

When P = 0 or P = 1, any profile of strategies with a_i = a_j, i ≠ j, is a pure strategy Nash equilibrium. When P > 1, the only Nash equilibrium, which also survives iterated elimination of weakly dominated strategies, is for both players to bid 180. Experimental evidence shows that if P is equal to 180, most players do in fact bid low (i.e. close to 180), but if P is equal to 5, then most players bid high (i.e. close to 300).

Let's see how risky the strategies in this game are. The regret of player i at a given strategy profile (a'_i, a'_j) ∈ A is equal to:

    R_i(a'_i, a'_j) = max_{a_i ∈ A_i} [min(a_i, a'_j) + P(a_i, a'_j)] − [min(a'_i, a'_j) + P(a'_i, a'_j)]

To calculate it for P ≥ 1 we need to consider two cases (the case P = 0 is immediate: bidding 300 then yields the best-reply payoff against any a'_j, so its regret is zero). If a'_j = 180, the best reply is to bid 180, which yields 180, so:

    R_i(a'_i, 180) = 180 − 180 = 0          if a'_i = 180
    R_i(a'_i, 180) = 180 − (180 − P) = P    if a'_i > 180

If a'_j > 180, the best reply is to bid a'_j − 1, which yields a'_j − 1 + P, so:

    R_i(a'_i, a'_j) = a'_j − 1 + P − [min(a'_i, a'_j) + P(a'_i, a'_j)]

We can summarize this in the form of a table:

                  a'_i = 180     a'_i > 180
    a'_j = 180    0              P
    a'_j > 180    a'_j − 181     a'_j − 1 + P − min(a'_i, a'_j) − P(a'_i, a'_j)

So the maximum regret of player i from a given strategy a'_i ∈ A_i is equal to:

    max regret_i(a'_i) = 119                        when a'_i = 180
                         max(P, 118)                when a'_i = 181
                         max(2P − 1, 299 − a'_i)    when a'_i ≥ 182

Now we can solve for the minimax regret strategies in this game, which is summarized in the following table:

    Values of P              Set of minimax regret strategies    Minimax value
    P = 0                    {300}                               0
    P ∈ {1, ..., 59}         {300 − 2P, ..., 299, 300}           2P − 1
    P ∈ {60, 61, ..., 118}   {181}                               118
    P = 119                  {180, 181}                          119
    P ∈ {120, 121, ...}      {180}                               119

For example, for P = 0 the only minimax regret strategy is to bid 300. For P = 5, any bid in the set {290, 291, ..., 300} is a minimax regret strategy. For P = 180 the only minimax regret strategy is to bid 180.

Example 4.
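The minimax regret table for Example 3 can be verified by brute force over the full action space {180, ..., 300}. The sketch below is illustrative and not part of the paper; the function names are mine.

```python
# Brute-force check (illustrative, not from the paper) of Example 3's minimax
# regret table for the travellers' dilemma with bids 180..300.

def u(a, b, P):
    """min(a, b) plus the reward/penalty term P(a, b)."""
    return min(a, b) + (P if a < b else (-P if a > b else 0))

def minimax_regret(P, actions=range(180, 301)):
    """Set of bids minimizing worst-case regret, and the minimax value."""
    # Best attainable payoff against each opponent bid b.
    best = {b: max(u(x, b, P) for x in actions) for b in actions}
    # Worst-case regret of each own bid a.
    worst = {a: max(best[b] - u(a, b, P) for b in actions) for a in actions}
    m = min(worst.values())
    return sorted(a for a, r in worst.items() if r == m), m

print(minimax_regret(5))    # ([290, ..., 300], 9), i.e. minimax value 2P - 1
print(minimax_regret(180))  # ([180], 119)
```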
Consider the game of choosing an effort level, where N = {1, 2} and A_i = {110, 111, ..., 170} for i ∈ N. The payoffs at a given profile of actions (a'_i, a'_j) ∈ A are defined as follows:

    u_i(a'_i, a'_j) = min(a'_i, a'_j) − c·a'_i,   where c ∈ (0, 1).

The set of Nash equilibria consists of all pairs (a, a), where a ∈ A_i. The experimental evidence, on the other hand, suggests that if the cost of effort is low (c = 0.1), then people usually choose high effort, whereas if the cost of effort is high (c = 0.9), then they choose low effort. Again, this is not captured by the NE prediction.

Let's calculate the riskiness of the strategies involved. Since c < 1, the best reply to a'_j is to match it, which yields (1 − c)·a'_j. The regret at a given profile of actions (a'_i, a'_j) ∈ A is therefore given by:

    R_i(a'_i, a'_j) = max_{a_i ∈ A_i} [min(a_i, a'_j) − c·a_i] − [min(a'_i, a'_j) − c·a'_i]
                    = a'_j − c·a'_j − min(a'_i, a'_j) + c·a'_i

The maximum regret of a strategy a'_i is given by:

    max regret_i(a'_i) = max_{a_j ∈ A_j} [a_j − c·a_j − min(a'_i, a_j) + c·a'_i]
                       = max(170(1 − c) − a'_i, −110c) + c·a'_i

(the first term of the max is attained at a_j = 170, the second at a_j = 110). The minimax regret is then given by:

    min_{a_i ∈ A_i} [max(170(1 − c) − a_i, −110c) + c·a_i]

Let's define a*_i ∈ A_i as the value of player i's strategy at which the two terms of the above max function are equal:

    170(1 − c) − a*_i = −110c  ⟹  a*_i = 170 − c(170 − 110)

To find the minimax regret, we compare three candidate strategies:

    a'_i = 110   ⟹  max regret_i(110)  = (1 − c)(170 − 110)
    a'_i = a*_i  ⟹  max regret_i(a*_i) = c(1 − c)(170 − 110)
    a'_i = 170   ⟹  max regret_i(170)  = c(170 − 110)

Since c ∈ (0, 1), it must be that the minimax regret strategy is a'_i = a*_i, and the minimax regret value is max regret_i(a*_i) = c(1 − c)(170 − 110). For example, if c = 0.1 the minimax regret strategy is 164, and if c = 0.9 it is 116, which is in accordance with the experimental evidence.

Example 5.
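Example 4's closed-form solution a*_i = 170 − 60c can be checked numerically against a brute-force search over the action space. The sketch below is illustrative and not part of the paper; the function name is mine.

```python
# Numerical check (illustrative, not from the paper) of Example 4: in the
# effort game with actions 110..170 and cost c, the minimax regret strategy
# should equal 170 - 60c (an integer for c = 0.1 and c = 0.9).

def minimax_regret_effort(c, actions=range(110, 171)):
    """Return the effort level minimizing worst-case regret."""
    def u(a, b):
        return min(a, b) - c * a
    # Best attainable payoff against each opponent effort b (match it).
    best = {b: max(u(x, b) for x in actions) for b in actions}
    # Worst-case regret of each own effort a; return its argmin.
    worst = {a: max(best[b] - u(a, b) for b in actions) for a in actions}
    return min(worst, key=worst.get)

print(minimax_regret_effort(0.1))  # 164
print(minimax_regret_effort(0.9))  # 116
```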
The following is the so-called extended coordination game:

    Payoff table
           L         M        R
    T   90, 90     0, 0    x, 40
    B    0, 0   180, 180   0, 40

Since strategy R is never a best response, it cannot be part of any Nash equilibrium. It is easy to see that the set of Nash equilibria of this game is the following, irrespective of the value of x:

    {(T, L), (B, M), (0.67T + 0.33B, 0.67L + 0.33M)}

Experiments reveal that if x = 0, then strategy B is played almost exclusively (96% of all subjects), whereas when x = 400, strategy B is played somewhat less often (64% of all subjects). In both cases the Column player plays strategy M more often than the other strategies (84% in the first case and 76% in the second case).

Let's calculate the riskiness of the strategies involved (for x ≥ 0):

    Regret table
           L         M         R         max
    T    0, 0    180, 90    0, 50        180
    B   90, 180    0, 0     x, 140       max(90, x)
    max    180       90       140

So:

- if x < 180, then the minimax regret strategies are (B, M)
- if x > 180, then the minimax regret strategies are (T, M)

Example 6. The following is the so-called Kreps game:

    Payoff table
           L           M         NN         R
    T   200, 50      0, 45     10, 30    20, −250
    B    0, −250   10, −100    30, 30    50, 40

The set of Nash equilibria is

    {(T, L), (B, R), (0.97T + 0.03B, 0.05L + 0.95M)}

The experimental evidence is summarized in the following table (observed frequencies in brackets):

                  L [26%]     M [8%]     NN [68%]    R [0%]
    T [68%]    200, 50       0, 45      10, 30    20, −250
    B [32%]     0, −250    10, −100     30, 30    50, 40

How about the minimax regret strategies?

    Regret table
           L          M        NN        R        max
    T    0, 0      10, 5    20, 20    30, 300      30
    B  200, 290    0, 140    0, 10     0, 0       200
    max    290       140       20       300

The only Nash equilibria in pure strategies of this game are (T, L) and (B, R). On the other hand, the only minimax regret pure strategy profile is (T, NN).

Definition 3. Given any strategic form game Γ, a randomized strategy for any player i is a probability distribution over A_i. Let Δ(A_i) denote the set of all possible randomized strategies for player i.
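Returning briefly to Example 6: the regret table for the Kreps game can be checked directly from the payoff tables. The sketch below is illustrative and not part of the paper; the variable names are mine.

```python
# Illustrative check (not from the paper) of the Kreps game regret table in
# Example 6.  u1 holds Row payoffs, u2 Column payoffs, indexed (row, column).

rows, cols = ("T", "B"), ("L", "M", "NN", "R")
u1 = {("T", "L"): 200, ("T", "M"): 0,    ("T", "NN"): 10, ("T", "R"): 20,
      ("B", "L"): 0,   ("B", "M"): 10,   ("B", "NN"): 30, ("B", "R"): 50}
u2 = {("T", "L"): 50,  ("T", "M"): 45,   ("T", "NN"): 30, ("T", "R"): -250,
      ("B", "L"): -250, ("B", "M"): -100, ("B", "NN"): 30, ("B", "R"): 40}

# Worst-case regret of each Row action over all Column actions, and vice versa.
row_worst = {a: max(max(u1[(b, c)] for b in rows) - u1[(a, c)] for c in cols)
             for a in rows}
col_worst = {c: max(max(u2[(a, d)] for d in cols) - u2[(a, c)] for a in rows)
             for c in cols}
print(row_worst)  # {'T': 30, 'B': 200} -> Row's minimax regret strategy is T
print(col_worst)  # NN attains the minimum (20) -> Column plays NN
```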
The set of all randomized strategy profiles will be denoted by Δ(A) = ×_{i∈N} Δ(A_i). It must be that:

    Σ_{a_i ∈ A_i} σ_i(a_i) = 1,   ∀i ∈ N

We will write σ ≡ (σ_i)_{i∈N}, where σ_i ≡ (σ_i(a_i))_{a_i ∈ A_i} for each i. For any randomized strategy profile σ, let u_i(σ) denote the expected payoff that player i gets when the players independently choose their pure strategies according to σ:

    u_i(σ) = Σ_{a ∈ A} (Π_{j∈N} σ_j(a_j)) u_i(a),   ∀i ∈ N

For any σ'_i ∈ Δ(A_i), we denote by (σ'_i, σ_{-i}) the randomized strategy profile in which the i-th component is σ'_i and all other components are as in σ. Thus:

    u_i(σ'_i, σ_{-i}) = Σ_{a ∈ A} (Π_{j∈N\{i}} σ_j(a_j)) σ'_i(a_i) u_i(a)

Definition 4. A randomized strategy profile σ* ∈ Δ(A) is a Nash equilibrium of Γ if the following holds:

    σ*_i ∈ argmax_{σ_i ∈ Δ(A_i)} u_i(σ_i, σ*_{-i}),   ∀i ∈ N    (5)

Definition 5. A randomized strategy σ*_i ∈ Δ(A_i) is a randomized minimax regret strategy of player i ∈ N in the game Γ if the following holds:

    σ*_i ∈ argmin_{σ_i ∈ Δ(A_i)} R_i(σ_i, σ*_{-i})    (6)

where R_i(σ_i, σ*_{-i}) = Σ_{a ∈ A} (Π_{j∈N\{i}} σ*_j(a_j)) σ_i(a_i) R_i(a_i, a_{-i}) is player i's expected regret.

Proposition 1.1. Minimax regret mixed strategies are equivalent to Nash equilibrium mixed strategies on the domain where both are defined.

Proof.

    min_{σ_i ∈ Δ(A_i)} R_i(σ_i, σ*_{-i})
      = min_{σ_i ∈ Δ(A_i)} Σ_{a ∈ A} (Π_{j∈N\{i}} σ*_j(a_j)) σ_i(a_i) R_i(a_i, a_{-i})
      = min_{σ_i ∈ Δ(A_i)} Σ_{a ∈ A} (Π_{j∈N\{i}} σ*_j(a_j)) σ_i(a_i) [max_{a'_i ∈ A_i} u_i(a'_i, a_{-i}) − u_i(a_i, a_{-i})]
      = Σ_{a ∈ A} (Π_{j∈N\{i}} σ*_j(a_j)) σ_i(a_i) max_{a'_i ∈ A_i} u_i(a'_i, a_{-i})
        − max_{σ_i ∈ Δ(A_i)} Σ_{a ∈ A} (Π_{j∈N\{i}} σ*_j(a_j)) σ_i(a_i) u_i(a_i, a_{-i})
      = CONST − max_{σ_i ∈ Δ(A_i)} u_i(σ_i, σ*_{-i})

The first term does not depend on σ_i, since summing σ_i(a_i) over a_i gives 1; hence it is a constant. It follows that σ_i minimizes expected regret against σ*_{-i} if and only if it maximizes expected payoff against σ*_{-i}, so the profiles satisfying condition (6) for all i coincide with the Nash equilibria of Definition 4. □

References

Goeree, J. K. and C. A. Holt (2001). Ten little treasures of game theory and ten intuitive contradictions. The American Economic Review 91(5), 1402–1422.