ECO 5346 Sec 001
Klaus Becker
Fall 2013
Homework #8 Solutions

1. There is a rough neighborhood with $n \geq 2$ residents. Each resident has to decide whether to engage in the crime of theft. If an individual chooses to be a thief and is not caught by the police, he receives a payoff of $W$. If he is caught by the police, his payoff is $Z$. If he chooses not to commit theft, he receives a zero payoff. Assume $W > 0 > Z$. All $n$ residents simultaneously decide whether or not to commit theft. The probability of a thief being caught equals $1/m$, where $m$ is the total number of residents who choose to engage in theft. Thus, the probability of being caught is lower when more crimes are committed and the police have more crimes to investigate. Find all pure strategy Nash equilibria.

The payoff from being a thief, given that $m-1$ other people have also chosen to be thieves, is

$$\frac{m-1}{m} W + \frac{1}{m} Z.$$

One Nash equilibrium is when no one chooses to be a thief. Given that everyone else chooses not to turn to crime, a person who chooses to engage in theft is caught for sure, so his payoff is $Z$, which is less than 0 (the payoff from not being a thief).

Now consider an equilibrium in which $m$ residents choose theft, where $1 \leq m \leq n-1$, so that some but not all choose to be thieves. For it to be optimal for those $m$ residents to be thieves, it must be true that

$$\frac{m-1}{m} W + \frac{1}{m} Z \geq 0. \tag{SOL5.7.1}$$

The left-hand side expression is the payoff from being a thief, while 0 is the payoff from not engaging in theft. For the $n-m$ residents who have avoided a life of crime to find it optimal to be law-abiding, it must be true that

$$\frac{m}{m+1} W + \frac{1}{m+1} Z \leq 0. \tag{SOL5.7.2}$$

Again, the left-hand side expression is the payoff from being a thief, but notice that there would now be $m+1$ thieves if this resident also chose a life of crime.

Let us convince ourselves that if (SOL5.7.1) is true then (SOL5.7.2) cannot be true. The expression $\frac{m-1}{m} W + \frac{1}{m} Z$ from (SOL5.7.1) is a weighted average of $W$, which is positive, and $Z$, which is negative.
If (SOL5.7.1) is true then that weighted average is non-negative. Now consider $\frac{m}{m+1} W + \frac{1}{m+1} Z$ from (SOL5.7.2). It is also a weighted average, but notice that it puts more weight on the positive number (a weight of $\frac{m}{m+1}$ rather than $\frac{m-1}{m}$) and thus must be larger than $\frac{m-1}{m} W + \frac{1}{m} Z$. Hence, if $\frac{m-1}{m} W + \frac{1}{m} Z$ is non-negative then $\frac{m}{m+1} W + \frac{1}{m+1} Z$ must be positive. But this means (SOL5.7.2) cannot be true. Intuitively, if $m$ residents find it optimal to engage in theft, then so will $m+1$ residents, since the chances of getting caught are lower when more people are criminals. From this we conclude that there cannot be a Nash equilibrium in which some but not all residents are thieves.

The only remaining possibility is that all $n$ residents choose crime. This is an equilibrium if and only if

$$\frac{n-1}{n} W + \frac{1}{n} Z \geq 0.$$

To sum up: if $\frac{n-1}{n} W + \frac{1}{n} Z \geq 0$ then there are two Nash equilibria, one with all residents as criminals and one with none of them as criminals. If $\frac{n-1}{n} W + \frac{1}{n} Z < 0$ then there is one Nash equilibrium, in which all residents are law-abiding.

2. For the game in Figure 1, find all mixed-strategy Nash equilibria.

Figure 1

                     Player 2
                  x      y      z
Player 1   a    2,3    1,4    3,2
           b    5,1    2,3    1,2
           c    3,7    4,6    5,4
           d    4,2    1,3    6,1

First note that c strictly dominates a and y strictly dominates z. Thus, any Nash equilibrium in mixed strategies must assign zero probability to those dominated strategies. We can then eliminate them so that the reduced game is

Figure SOL7.6.1

                     Player 2
                  x      y
Player 1   b    5,1    2,3
           c    3,7    4,6
           d    4,2    1,3

For this reduced game, b strictly dominates d, so the latter can be deleted. The reduced game is

Figure SOL7.6.2

                     Player 2
                  x      y
Player 1   b    5,1    2,3
           c    3,7    4,6

This game has no pure-strategy Nash equilibria. To find the mixed-strategy Nash equilibria, let $p$ denote the probability that player 1 chooses b and $q$ denote the probability that player 2 chooses x.
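The elimination steps and the claim that the reduced game has no pure-strategy equilibrium can be checked mechanically. The following is only a minimal sketch: the names `GAME`, `eliminate`, and `pure_nash` are illustrative, and the routine tests dominance by pure strategies only, which is all this game requires.

```python
# Payoffs from Figure 1: keys are (Player 1 strategy, Player 2 strategy),
# values are (Player 1 payoff, Player 2 payoff).
GAME = {
    ("a", "x"): (2, 3), ("a", "y"): (1, 4), ("a", "z"): (3, 2),
    ("b", "x"): (5, 1), ("b", "y"): (2, 3), ("b", "z"): (1, 2),
    ("c", "x"): (3, 7), ("c", "y"): (4, 6), ("c", "z"): (5, 4),
    ("d", "x"): (4, 2), ("d", "y"): (1, 3), ("d", "z"): (6, 1),
}


def eliminate(rows, cols):
    """Iteratively delete strategies strictly dominated by another pure strategy."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            # r is dominated if some other row o does strictly better in every column.
            if any(all(GAME[(o, c)][0] > GAME[(r, c)][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            # c is dominated if some other column o does strictly better in every row.
            if any(all(GAME[(r, o)][1] > GAME[(r, c)][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols


def pure_nash(rows, cols):
    """Return all cells where neither player gains from a unilateral deviation."""
    return [(r, c) for r in rows for c in cols
            if all(GAME[(r, c)][0] >= GAME[(o, c)][0] for o in rows)
            and all(GAME[(r, c)][1] >= GAME[(r, o)][1] for o in cols)]


rows, cols = eliminate("abcd", "xyz")
print(rows, cols)             # surviving strategies
print(pure_nash(rows, cols))  # pure-strategy equilibria of the reduced game
```

The order in which strategies are removed may differ from the order used in the text, but iterated deletion of strictly dominated strategies reaches the same reduced game either way.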
The equilibrium conditions ensuring that players want to randomize are:

$$5q + 2(1-q) = 3q + 4(1-q) \;\Rightarrow\; q = 1/2$$

$$1 \cdot p + 7(1-p) = 3p + 6(1-p) \;\Rightarrow\; p = 1/3$$

Hence $(p, q) = (1/3, 1/2)$, where $p$ denotes the probability that player 1 chooses b and $q$ denotes the probability that player 2 chooses x. The first condition makes player 1 indifferent between b and c; the second makes player 2 indifferent between x and y.

3. Galileo Galilei is potentially confronted by the Inquisition. To first describe what actually transpired, Pope Urban VIII referred Galileo to the Inquisition and he was brought to trial on April 12, 1633. After verbal persuasion from the Commissary General of the Inquisition, Galileo confessed that he had gone too far in supporting the Copernican theory in one of his books (even though he hadn't). Galileo was then given an "examination of intention," which involves showing the instruments of torture to the accused. The final hearing by the Inquisition was held on June 22, 1633, at which time the 69-year-old Galileo pleaded for mercy because of his "regrettable state of physical unwellness." With the threat of torture and imprisonment lurking in the background, the Inquisitors forced Galileo to "abjure, curse and detest" his work. Galileo complied in every way and was convicted and sentenced to life imprisonment and religious penances. Due to his age (and possibly his stature), the sentence was commuted to house arrest. He was allowed to return to his villa near Florence, where he would remain for the last years of his life. That is history, and Figure 2 represents a simple game-theoretic model of it.

Figure 2: Galileo Galilei and the Inquisition
(extensive form; payoffs are listed in the order Urban VIII, Galileo, Inquisitor)

Urban VIII
  Do not refer: (3, 5, 3)
  Refer: Galileo
    Confess: (5, 3, 4)
    Do not confess: Inquisitor
      Do not torture: (2, 4, 2)
      Torture: Galileo
        Confess: (4, 1, 5)
        Do not confess: (1, 2, 1)

(a) Find all of the Nash equilibria.

The strategic form games are shown in Figure SOL8.1.1.
Figure SOL8.1.1
(rows are Urban VIII's strategies, columns are Galileo's strategies; payoffs are listed in the order Urban VIII, Galileo, Inquisitor. DNR = Do not refer, R = Refer, C = Confess, DNC = Do not confess. Galileo's strategy gives his action at his first decision node, then his action at his last decision node.)

Inquisitor: Torture

                    Galileo
            C/C      C/DNC    DNC/C    DNC/DNC
Urban DNR   3,5,3    3,5,3    3,5,3    3,5,3
      R     5,3,4    5,3,4    4,1,5    1,2,1

Inquisitor: Do not torture

                    Galileo
            C/C      C/DNC    DNC/C    DNC/DNC
Urban DNR   3,5,3    3,5,3    3,5,3    3,5,3
      R     5,3,4    5,3,4    2,4,2    2,4,2

The Nash equilibria are: (DNR, DNC/DNC, Torture), (R, C/C, Torture), (R, C/DNC, Torture), (DNR, DNC/C, Do not torture), (DNR, DNC/DNC, Do not torture).

(b) Find all of the subgame perfect Nash equilibria (SPE).

In his last decision node (which is associated with the path Refer, Do not confess, Torture), Galileo chooses Do not confess, since a payoff of 2 exceeds a payoff of 1. Given this choice, the Inquisitor chooses Do not torture (a payoff of 2 rather than 1). At his first decision node (associated with Urban VIII having chosen Refer), Galileo chooses Do not confess (a payoff of 4 rather than 3). Finally, using the above result, Urban VIII chooses Do not refer, as it produces a payoff of 3, which is greater than the payoff of 2 from playing Refer. Hence the unique subgame perfect Nash equilibrium is (DNR, DNC/DNC, Do not torture).

(c) For each Nash equilibrium that is not a subgame perfect Nash equilibrium, explain why it is not a subgame perfect Nash equilibrium.

There are 4 Nash equilibria which are not subgame perfect Nash equilibria. In Nash equilibria (DNR, DNC/DNC, Torture) and (R, C/DNC, Torture), the Inquisitor is making a non-optimal decision by choosing to torture Galileo, given that Galileo plays Do not confess in his last decision node. In Nash equilibria (R, C/C, Torture) and (DNR, DNC/C, Do not torture), Galileo is making a non-optimal decision at his last decision node. He should play Do not confess instead.

4. President X is in a strategic confrontation with Y's leader Z. Z's type determines whether or not he has weapons of mass destruction (WMD), where the probability he has WMD is $w$, with $0 < w < 3/5$. After learning his type, Z decides whether or not to allow inspections. If he allows inspections then assume they reveal WMD if Z has them and do not reveal WMD if he doesn't.
If he doesn't allow inspections then uncertainty about whether he has WMD remains. At that point, President X decides whether or not to invade. The extensive form is shown in Figure 3.

Figure 3: The WMD game
(Nature first chooses Z's type; Z then chooses whether to allow inspections; X then chooses whether to invade. X's two decision nodes following Do not allow inspections lie in the same information set.)

Z's type   Z's action                  X's action      Z's payoff   X's payoff
WMD        Allow inspections           Invade              1            3
WMD        Allow inspections           Do not invade       3            1
WMD        Do not allow inspections    Invade              2            3
WMD        Do not allow inspections    Do not invade       9            1
No WMD     Allow inspections           Invade              1            5
No WMD     Allow inspections           Do not invade       4            9
No WMD     Do not allow inspections    Invade              2            6
No WMD     Do not allow inspections    Do not invade       8            9

Note that X learns Z's type if Z allows inspections but remains in the dark if he does not. Z's payoffs are such that, regardless of whether he has WMD, his ordering of the outcomes (from best to worst) is: no inspections and no invasion, inspections and no invasion, no inspections and invasion, and inspections and invasion. Thus, he prefers not to allow inspections but is most motivated to avoid an invasion. X's preference ordering depends very much on whether Z has WMD. If Z has WMD, X wants to invade; if he does not, then X prefers not to invade.

Find consistent beliefs for X and values for $b$ and $h$, where $0 < b < 1$ and $0 < h < 1$, whereby the following strategy pair is a perfect Bayes-Nash equilibrium:

Z's strategy:
1. If I have WMD then do not allow inspections.
2. If I do not have WMD then allow inspections with probability $h$.

President X's strategy:
3. If Z allows inspections and WMD are found then invade.
4. If Z allows inspections and WMD are not found then do not invade.
5. If Z does not allow inspections then invade with probability $b$.

Consider the following beliefs for X:
i) if Z allows inspections and WMD are found then Z has WMD with probability one;
ii) if Z allows inspections and WMD are not found then Z has WMD with probability zero; and
iii) if Z does not allow inspections then Z has WMD with probability $\frac{w}{w + (1-w)(1-h)}$.

Let us begin by showing the consistency of X's beliefs. When inspections are allowed, it is trivial that beliefs are consistent (they are actually consistent with the truth, not just Z's strategy).
When inspections are not allowed, the posterior probability of Z having WMD is given by Bayes' Rule to be

$$\frac{w \cdot 1}{w \cdot 1 + (1-w)(1-h)},$$

as with probability $w$ Z has WMD and, in that event, he does not allow inspections with probability one, while with probability $1-w$ Z does not have WMD and, in that event, he does not allow inspections with probability $1-h$.

Turning to X's strategy, its optimality is clear when there are inspections, whether WMD are found or not. When inspections are not allowed, X is content to randomize (that is, $0 < b < 1$) if and only if

$$\frac{w}{w + (1-w)(1-h)} \cdot 3 + \frac{(1-w)(1-h)}{w + (1-w)(1-h)} \cdot 6 = \frac{w}{w + (1-w)(1-h)} \cdot 1 + \frac{(1-w)(1-h)}{w + (1-w)(1-h)} \cdot 9.$$

The left-hand side expression is the expected payoff from invading and the right-hand side expression is the expected payoff from not invading. Multiplying through by the common denominator gives $2w = 3(1-w)(1-h)$, and solving for $h$ yields

$$h = \frac{3 - 5w}{3 - 3w}.$$

Note that $\frac{3-5w}{3-3w} > 0$ and $\frac{3-5w}{3-3w} < 1$ if and only if $0 < w < 3/5$. The latter condition was assumed.

When he has WMD, it is clearly optimal for Z not to allow inspections. When he does not have WMD, it is optimal to randomize if and only if

$$2b + 8(1-b) = 4,$$

where he earns a payoff of 4 by allowing inspections (in which case there is no invasion) and an expected payoff of $2b + 8(1-b)$ from not allowing inspections (in which case there is an invasion with probability $b$). Solving this equation, we find $b = 2/3$.

5. A well-known strategy for sustaining cooperation is Tit-for-Tat. With Tit-for-Tat, a player starts off with cooperative play and then does whatever the other player did last period. Tit-for-Tat embodies the idea that "What goes around comes around." For the Trench Warfare game, it takes the form: In period 1, choose miss. In period $t \geq 2$, choose miss if the other player chose miss last period and choose kill if the other player chose kill last period. For the infinitely repeated Trench Warfare game discussed in class, derive conditions for Tit-for-Tat to be a subgame perfect Nash equilibrium (SPE).
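The Tit-for-Tat rule just stated can be sketched as a short simulation, which makes it easy to see what play looks like after a single deviation. This is only an illustration: the function names `tit_for_tat` and `play`, and the `allied_deviation` argument used to force a one-period departure from the strategy, are hypothetical and not part of the original problem.

```python
def tit_for_tat(other_history):
    """Miss in period 1; afterwards copy the other player's previous action."""
    return "miss" if not other_history else other_history[-1]


def play(periods, allied_deviation=None):
    """Simulate both sides using Tit-for-Tat for a given number of periods.

    allied_deviation (hypothetical helper argument) maps a 1-based period
    number to an action the Allied side plays instead of the prescribed one.
    Returns a list of (Allied action, German action) pairs.
    """
    allied_deviation = allied_deviation or {}
    allied, german = [], []
    for t in range(1, periods + 1):
        # Each side looks only at what the other side did in the previous period.
        a = allied_deviation.get(t, tit_for_tat(german))
        g = tit_for_tat(allied)
        allied.append(a)
        german.append(g)
    return list(zip(allied, german))


# No deviation: both sides miss in every period.
print(play(4))
# The Allied side kills once in period 2, then returns to Tit-for-Tat.
print(play(5, {2: "kill"}))
```

With no deviation the path is (miss, miss) throughout. After a single kill by one side, the two sides alternate between (kill, miss) and (miss, kill) in every later period, because each side keeps copying the other's previous action.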
Consider the Allied Soldiers and either period 1 or a period in which both sides (the Allied Soldiers and the German Soldiers) chose miss in the previous period. Both players are to choose miss, and this will result in them both choosing miss in the ensuing period and every period thereafter. The resulting payoff sequence is 4 forever, which has a present value of $4/(1-\delta)$. Alternatively, a player, for example the Allied Soldiers, could choose kill, which will yield a payoff of 6 in the current period. According to their strategies, the Allied Soldiers will choose miss and the German Soldiers will choose kill in the next period. In the period after that, the Allied Soldiers will choose kill and the German Soldiers will choose miss. They will keep alternating in their actions in all periods. The payoff from choosing kill is then

$$6 + \delta(0) + \delta^2(6) + \delta^3(0) + \delta^4(6) + \cdots = \frac{6}{1-\delta^2}.$$

Thus, choosing miss is optimal if and only if

$$\frac{4}{1-\delta} \geq \frac{6}{1-\delta^2} \;\Rightarrow\; \frac{4}{1-\delta} \geq \frac{6}{(1-\delta)(1+\delta)} \;\Rightarrow\; 4(1+\delta) \geq 6 \;\Rightarrow\; \delta \geq \frac{1}{2}.$$

Now consider a history in which the Allied Soldiers chose miss and the German Soldiers chose kill in the preceding period. If the Allied Soldiers act according to their strategy by choosing kill (and the German Soldiers do similarly and choose miss), the two sides alternate from then on and the Allied Soldiers' payoff is $6/(1-\delta^2)$. If the Allied Soldiers instead chose miss then the actions would be (miss, miss) in the current period and, therefore, in all ensuing periods as well. The payoff for that is $4/(1-\delta)$. For it to be optimal for the Allied Soldiers to choose kill when, in the previous period, they chose miss and the German Soldiers chose kill, it must be true that

$$\frac{6}{1-\delta^2} \geq \frac{4}{1-\delta} \;\Rightarrow\; \frac{6}{(1-\delta)(1+\delta)} \geq \frac{4}{1-\delta} \;\Rightarrow\; 6 \geq 4(1+\delta) \;\Rightarrow\; \delta \leq \frac{1}{2}.$$

Next consider a history in which the Allied Soldiers chose kill in the previous period and the German Soldiers chose miss. The Allied Soldiers' prescribed action of miss is preferable to choosing kill if and only if

$$0 + \delta(6) + \delta^2(0) + \delta^3(6) + \cdots \geq 2 + \delta(2) + \delta^2(2) + \delta^3(2) + \cdots \;\Rightarrow\; \frac{6\delta}{1-\delta^2} \geq \frac{2}{1-\delta} \;\Rightarrow\; 6\delta \geq 2(1+\delta) \;\Rightarrow\; \delta \geq \frac{1}{2}.$$

Finally, we have a history in which both chose kill in the previous period.
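It is indeed optimal to choose kill if and only if

$$2 + \delta(2) + \delta^2(2) + \delta^3(2) + \cdots \geq 0 + \delta(6) + \delta^2(0) + \delta^3(6) + \cdots \;\Rightarrow\; \frac{2}{1-\delta} \geq \frac{6\delta}{1-\delta^2} \;\Rightarrow\; \delta \leq \frac{1}{2}.$$

Putting all of these conditions together, Tit-for-Tat is a subgame perfect Nash equilibrium when

$$\delta \geq \frac{1}{2} \text{ and } \delta \leq \frac{1}{2} \;\Rightarrow\; \delta = \frac{1}{2}.$$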
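The four conditions can be checked numerically for any discount factor. The sketch below assumes the stage payoffs used in the solution above (4 when both miss, 6 for kill against miss, 0 for miss against kill, 2 when both kill); the helper names `constant`, `alternating`, and `tit_for_tat_is_spe` are illustrative only, and exact fractions are used so the knife-edge case is not blurred by rounding.

```python
from fractions import Fraction


def constant(x, d):
    """Present value of receiving x in every period: x / (1 - d)."""
    return x / (1 - d)


def alternating(first, second, d):
    """Present value of first, second, first, second, ...:
    (first + d * second) / (1 - d**2)."""
    return (first + d * second) / (1 - d * d)


def tit_for_tat_is_spe(d):
    """Check the four one-period deviation conditions at discount factor d."""
    conditions = [
        # After (miss, miss) or in period 1: miss forever vs. starting with kill.
        constant(4, d) >= alternating(6, 0, d),
        # After own miss, other's kill: prescribed kill vs. deviating to miss.
        alternating(6, 0, d) >= constant(4, d),
        # After own kill, other's miss: prescribed miss vs. deviating to kill.
        alternating(0, 6, d) >= constant(2, d),
        # After (kill, kill): prescribed kill vs. deviating to miss.
        constant(2, d) >= alternating(0, 6, d),
    ]
    return all(conditions)


for d in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5)):
    print(d, tit_for_tat_is_spe(d))
```

Because the first and second conditions compare the same two payoffs in opposite directions (and likewise the third and fourth), all four can hold only when each holds with equality, which is why a single value of the discount factor survives.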