MATH 1340, HOMEWORK #10 SOLUTIONS
DUE THURSDAY, APRIL 27
1. Consider the following 2 × 2 zero-sum game:
        C    N
   C   -1    5
   N    3   -5
(a) What is Row’s minimax strategy?
(b) What is Column’s minimax strategy?
(c) What is the expected value of the game for Row (assuming that Row and Column adopt
their respective minimax strategies)?
(d) Suppose that Column is forced to adopt its minimax strategy, but Row is free to choose any
strategy (pure or mixed). What is the highest possible expected value that Row can get?
The lowest?
(e) I mentioned in class that deviating from the minimax strategy is “suboptimal” in a certain
sense. This question will try to nail down precisely what that means.
Suppose you took what I said in the most literal sense: that this leads to worse outcomes
(i.e. lower utility). Now suppose that both Row and Column deviate from their minimax
strategy. But this is a zero-sum game, so it cannot be the case that both Row and Column
lose utility, because any loss for Row is a gain for Column, and vice versa.
What is going on here? In other words, what does the minimax theorem actually guarantee about the minimax strategy? Is there any reason why Row or Column should deviate
from their minimax strategies?
Solution. We rearrange the given matrix as:
        N    C
   C    5   -1
   N   -5    3
Row’s minimax strategy is to choose C with probability (3−(−5)) / ((5−(−1))+(3−(−5))) = 8/14.
Column’s minimax strategy is to choose N with probability (3−(−1)) / ((5−(−1))+(3−(−5))) = 4/14.
Expected value of the game for Row is ((5∗3)−(−1)∗(−5)) / ((5−(−1))+(3−(−5))) = 10/14.
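The three quantities above can be checked with a short script. This is only a sketch: the function name solve_2x2 and its argument order are illustrative choices, and the formulas assume a 2 × 2 game with no saddle point, so that both players must mix.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d):
    """Mixed minimax solution of the zero-sum game [[a, b], [c, d]] (payoffs to Row).

    Assumes the game has no saddle point (hypothetical helper, not from the course).
    Returns (p, q, v): p = P(Row plays row 1), q = P(Column plays column 1),
    v = value of the game for Row.
    """
    denom = Fraction((a - b) + (d - c))
    p = (d - c) / denom          # makes Column indifferent between its columns
    q = (d - b) / denom          # makes Row indifferent between its rows
    v = (a * d - b * c) / denom  # expected payoff to Row at the solution
    return p, q, v

# Rearranged matrix from the solution: rows (C, N), columns (N, C).
p, q, v = solve_2x2(5, -1, -5, 3)
print(p, q, v)  # 4/7 2/7 5/7, i.e. 8/14, 4/14 and 10/14
```

Fraction is used instead of floating point so that the results reduce exactly to the fractions in the solution.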
If Column is forced to adopt its minimax strategy, and Row chooses C, then the value of the game
for Row is (5) ∗ (4/14) + (−1) ∗ (10/14) = 10/14. If Column is forced to adopt its minimax strategy,
and Row chooses N , then the value of the game for Row is (−5) ∗ (4/14) + (3) ∗ (10/14) = 10/14.
Therefore, if Row chooses C with probability q and N with probability 1 − q, then the value of the
game is q ∗ (10/14) + (1 − q) ∗ (10/14) = 10/14. Hence, Row always gets a value of 10/14 if Column
is forced to adopt its minimax strategy. Therefore this is both the highest and the lowest possible
expected value that Row can get.
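As a quick numerical check of this argument, the sketch below evaluates Row's expected payoff for several values of q while Column plays its minimax mix. The names A, col_mix and row_expected_value are illustrative, not part of the original solution.

```python
from fractions import Fraction

# Payoffs to Row in the rearranged matrix: rows (C, N), columns (N, C).
A = [[5, -1], [-5, 3]]
col_mix = [Fraction(4, 14), Fraction(10, 14)]  # Column's minimax mix over (N, C)

def row_expected_value(q):
    """Expected payoff to Row when Row plays C with probability q."""
    row_mix = [q, 1 - q]
    return sum(row_mix[i] * col_mix[j] * A[i][j]
               for i in range(2) for j in range(2))

# Try q = 0, 1/10, ..., 1 and collect the distinct payoffs.
values = {row_expected_value(Fraction(k, 10)) for k in range(11)}
print(values)  # a single value, 5/7 = 10/14
```

Every choice of q produces the same payoff, which is why the highest and lowest values coincide.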
The minimax theorem guarantees that Row gets an expected value of at least V by adopting its minimax
strategy, no matter what Column does. Likewise, Column gets at least −V by adopting its minimax
strategy, no matter what Row does. If both players deviate then it is possible that Row gets more
than V or that Column gets more than −V, but not both at once, since this is a zero-sum game.
There is no reason for Row or Column to deviate, because neither knows what the other player is
doing. A player who deviates can only lower the value that they are guaranteed irrespective of
what the other player does.
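The idea of a guaranteed value can be illustrated on the game from this problem. The sketch below computes the worst case for Row over Column's pure replies; the name row_guarantee is hypothetical, and checking only pure replies is enough because a mixed reply is an average of pure ones.

```python
from fractions import Fraction

# Payoffs to Row in the rearranged matrix: rows (C, N), columns (N, C).
A = [[5, -1], [-5, 3]]

def row_guarantee(q):
    """Worst-case expected payoff to Row when Row plays C with probability q."""
    return min(q * A[0][j] + (1 - q) * A[1][j] for j in range(2))

for q in [Fraction(0), Fraction(1, 2), Fraction(8, 14), Fraction(1)]:
    print(q, row_guarantee(q))

# A deviation can still pay off if the opponent also deviates:
# Row playing pure C against Column playing pure N yields A[0][0] = 5 > 10/14.
```

Only q = 8/14 guarantees 10/14; every other q has a lower worst case, even though a particular pair of deviations may favor one player.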
2. (a) [TP 10.13] Find the saddle points in the following games:
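Solution. The saddle points are marked with an asterisk.
        C    N
   C   -1   -2
   N    3*   4

        C    N    V
   C    1   -1    2
   N    3    1   -2
   V    1    0    2
(This game has no saddle point: the row minima are −1, −2, 0 and the column maxima are 3, 1, 2,
and no entry is both.)

        C    N    V
   C    3    1*   4
   N    1    0   -2
   V    2    1*   3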
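The saddle points of the three games can be found mechanically by testing each entry against the definition (smallest in its row, largest in its column). The function name saddle_points and the zero-based index pairs are illustrative choices.

```python
def saddle_points(matrix):
    """Return (row, col) index pairs whose entry is a row minimum and a column maximum."""
    points = []
    for i, row in enumerate(matrix):
        for j, entry in enumerate(row):
            column = [r[j] for r in matrix]
            if entry == min(row) and entry == max(column):
                points.append((i, j))
    return points

# The three games above, with strategies indexed in the order listed (C, N, V).
games = [
    [[-1, -2], [3, 4]],
    [[1, -1, 2], [3, 1, -2], [1, 0, 2]],
    [[3, 1, 4], [1, 0, -2], [2, 1, 3]],
]
for g in games:
    print(saddle_points(g))
# [(1, 0)]
# []
# [(0, 1), (2, 1)]
```

With indices 0, 1, 2 standing for C, N, V, the first game has a saddle point at (N, C), the second has none, and the third has saddle points at (C, N) and (V, N).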
(b) For the last game in [TP 10.13], show that the saddle point is a Nash equilibrium (in pure
strategies), that is, neither player can unilaterally improve their utility by changing their strategy.
Proof. The value at both saddle points is 1, so the Row player gets 1 and the Column player gets −1.
If the Row player unilaterally deviates then the value that the Row player gets is 0 or 1,
neither of which is bigger than the value at the saddle point. If the Column player unilaterally
deviates from the (C, N) saddle point then the Column player gets −3 or −4, which are both smaller
than −1. If the Column player unilaterally deviates from the (V, N) saddle point then the Column player
gets −2 or −3, which are both smaller than −1. Hence, neither player can improve their utility by
unilaterally deviating.
(c) [TP 10.12] An outcome in a two-person zero-sum game, even allowing for the possibility of
more than two strategy choices for each player, is called a saddle point if it is simultaneously the
smallest (or tied for such) entry in its row and the largest (or tied for such) entry in its column.
Prove that an outcome is a saddle point if and only if it is a Nash equilibrium (in pure strategies).
Proof. If an outcome is a saddle point then it is the smallest entry in its row. If we negate all
the values in that row, it must be the highest value in the row. This is the value for the Column
player. If the Column player were to unilaterally deviate then the Column player would get some other
value in this (negated) row, none of whose entries are higher than the Column player’s existing
value. If an outcome is a saddle point then it is also the highest value in its column. If the Row player
were to unilaterally deviate then the Row player would get some other value from the same column,
but none of the other entries in the column are higher than the Row player’s existing value. Hence,
neither player can gain by unilaterally deviating from the saddle point, which shows that every saddle point
is a Nash equilibrium in a zero-sum game.
Conversely, if an outcome is a Nash equilibrium, then the Column player will not want to unilaterally deviate.
This means that the current value for the Column player is at least as good as every other entry in the
same row. Since this is a zero-sum game, the value for the Column player is the negative of the entry in
the row. Therefore if we negate all the entries of the row, the value is the largest in the row, which
means that the original entry is the smallest in the row. If an outcome is a Nash equilibrium, then
the Row player will not want to unilaterally deviate. This means that the current value for the
Row player is at least as good as every other entry in the column. Hence, the outcome which is a
Nash equilibrium is simultaneously the smallest entry in its row and the largest entry in its column,
proving that it is a saddle point.
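The equivalence can also be spot-checked numerically. The sketch below is not a proof; it compares the two definitions on randomly generated integer matrices. The helper names, the seed and the matrix sizes are all arbitrary choices made for illustration.

```python
import random

def is_saddle(m, i, j):
    """Entry is the smallest in its row and the largest in its column (ties allowed)."""
    return m[i][j] == min(m[i]) and m[i][j] == max(row[j] for row in m)

def is_pure_nash(m, i, j):
    """Neither player gains by deviating alone; Row gets m[i][j], Column gets -m[i][j]."""
    row_cannot_gain = all(m[k][j] <= m[i][j] for k in range(len(m)))
    col_cannot_gain = all(-m[i][l] <= -m[i][j] for l in range(len(m[0])))
    return row_cannot_gain and col_cannot_gain

random.seed(0)  # fixed seed so the run is repeatable
agree = True
for _ in range(1000):
    rows, cols = random.randint(2, 4), random.randint(2, 4)
    m = [[random.randint(-3, 3) for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            agree = agree and (is_saddle(m, i, j) == is_pure_nash(m, i, j))
print(agree)  # True
```

Small entries in the range −3 to 3 are used so that ties occur often, which exercises the "or tied for such" part of the definition.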
3. [TP 10.14] In the book Superior Beings, the author, Steven J. Brams (1983) of New York University’s Department of Politics, asks the following question: If God is omniscient (“all knowing”),
would we be able to determine if He had this power via our interactions with Him? Brams’s methodology consists of
(a) Rigorously defining omniscience game-theoretically,
(b) Modeling our relationship with God game-theoretically,
(c) Analyzing the game in 2 ways: assuming God is not omniscient and then assuming God is
omniscient,
(d) Concluding that if the outcomes are different then we can detect the power if He has it.
The fundamental game that Brams uses to model our relationship with God is the so-called Revelation Game:
                 Believe   Don’t Believe
Reveal            (3,4)        (1,1)
Don’t Reveal      (4,2)        (2,3)
We take these preferences as given.
(a) Analyze the game. That is, does either God or People have a dominant strategy and, if so and
the other side knows it, what will the outcome be?
Solution. God has a dominant strategy to not reveal Himself (because 4 > 3 and 2 > 1). People
do not have a dominant strategy but knowing that God has a dominant strategy they will choose
not to believe as (2,3) is preferred by the people over (4,2).
(b) Assume that God is omniscient and that this means, game-theoretically, that God knows
which strategy People will choose. This is equivalent to saying that people move first and then
God will respond with His move (and the game ends). Analyze this version of the game.
Solution. If people choose to believe then God will choose not to reveal Himself as 4 > 3. If people
choose not to believe then God will choose not to reveal Himself as 2 > 1. So God will
again choose not to reveal Himself, irrespective of people’s first move. Of the two possible
outcomes, (4,2) and (2,3), people prefer (2,3), so they choose not to believe.
(c) Can we determine if God is omniscient by this interaction?
Solution. We cannot determine if God is omniscient by this interaction because the outcomes are
the same in parts (a) and (b).
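The comparison of parts (a) and (b) can be reproduced in a few lines. This is a sketch under the stated preferences, with the first number in each pair read as God's ranking and the second as People's. The dictionary layout and helper names are illustrative.

```python
# Ordinal payoffs (God, People) from the Revelation Game; 4 = best, 1 = worst.
payoffs = {
    ("Reveal", "Believe"): (3, 4),
    ("Reveal", "Don't Believe"): (1, 1),
    ("Don't Reveal", "Believe"): (4, 2),
    ("Don't Reveal", "Don't Believe"): (2, 3),
}
god_moves = ["Reveal", "Don't Reveal"]
people_moves = ["Believe", "Don't Believe"]

def god_best_reply(p):
    """God's best move when People play p."""
    return max(god_moves, key=lambda g: payoffs[(g, p)][0])

def people_best_reply(g):
    """People's best move when God plays g."""
    return max(people_moves, key=lambda p: payoffs[(g, p)][1])

# (a) A dominant strategy is a best reply to every move of the other side.
god_replies = {god_best_reply(p) for p in people_moves}
people_replies = {people_best_reply(g) for g in god_moves}
god_dominant = god_replies.pop() if len(god_replies) == 1 else None
people_dominant = people_replies.pop() if len(people_replies) == 1 else None
outcome_a = (god_dominant, people_best_reply(god_dominant))

# (b) People move first, anticipating God's best reply to each of their moves.
people_first = max(people_moves, key=lambda p: payoffs[(god_best_reply(p), p)][1])
outcome_b = (god_best_reply(people_first), people_first)

print(god_dominant, people_dominant)
print(outcome_a, outcome_b)
```

Both analyses end at (Don’t Reveal, Don’t Believe) with payoffs (2,3), which is the basis for the answer to part (c).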
4. [TP 5.4] Apportionment methods can also be used in non-political contexts. Consider, for
example, the situation in which a mathematics department has 10 faculty members, each of whom
will teach 2 classes in the fall semester. These 20 sections need to accommodate seven different
courses with enrollments (and “quota” to be commented upon momentarily) as follows.
Course              Enrollment   Quota
Calculus I                 121    5.45
Calculus II                 94    4.23
Calculus III                76    3.42
Linear Algebra              48    2.16
Real Variables              20    0.91
Cryptology                  24    1.08
Math and Politics           61    2.75
Total                      444   20
(a) Explain how the quotas were calculated.
Solution. Quota = (Enrollment × 20) / 444.
(b) Find the apportionment of the 20 sections among the 7 courses resulting from the use of
Hamilton’s method.
Solution. After rounding all quotas down to the nearest whole number, we get
(5, 4, 3, 2, 0, 1, 2)
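whose sum is 17. The remaining 3 sections are allocated to the courses with the largest fractional
parts of their quotas, namely the fifth, seventh and first entries (0.91, 0.75 and 0.45), to get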
(6, 4, 3, 2, 1, 1, 3)
whose sum is 20.
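Hamilton's method as applied here can be sketched in a few lines. The function name hamilton is illustrative, and this version does not handle ties between fractional parts, which do not arise in this problem. Quotas are computed exactly from the enrollments, not taken from the rounded table.

```python
from fractions import Fraction
from math import floor

def hamilton(populations, seats):
    """Apportion seats by Hamilton's method (largest fractional remainders)."""
    total = sum(populations)
    quotas = [Fraction(p * seats, total) for p in populations]
    allocation = [floor(q) for q in quotas]   # lower quotas
    leftover = seats - sum(allocation)
    # Hand the leftover seats to the largest fractional parts, one each.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - allocation[i],
                   reverse=True)
    for i in order[:leftover]:
        allocation[i] += 1
    return allocation

enrollments = [121, 94, 76, 48, 20, 24, 61]
print(hamilton(enrollments, 20))  # [6, 4, 3, 2, 1, 1, 3]
```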
5. [TP 5.5] Prove that for any number of parties, if an allocation is envy-free, then it must also be
proportional.
Proof. We will prove the contrapositive of the statement. Assume there is a division of goods
among n people and the division is not proportional. There is at least one person who has received
less than 1/n of the total goods according to his valuation. According to this person, the remaining
n − 1 people have received greater than 1 − 1/n portion of the goods. The average portion of the
goods received by the n − 1 people is greater than (1 − 1/n)/(n − 1) = 1/n. Hence, according to the
same valuation, there is at least one person who has received greater than 1/n portion of the goods.
The person who received less than 1/n portion of the goods is envious of this person and hence the
division is not envy-free.
6. [TP 5.9] Is the divide-and-choose procedure manipulable? That is, can one player achieve a
strictly better outcome by misrepresenting his true valuation of the cake? Prove that it is not, or
give an example where one player achieves a strictly better outcome than obtained with an honest
application of the divide-and-choose procedure.
Solution. Let us assume the cake is rectangular and both people have a personal valuation of where
the cake should be cut for a 50-50 split. The person cutting the cake does not know what the other
person’s valuation is. However, if the person cutting the cake took a chance and was able to cut
the cake between the places where both of them were thinking the cake should have been cut, then
the person cutting the cake does strictly better. There is a risk that in misrepresenting the true
valuation, the person may end up cutting on either side of the two people’s true valuations in which
case, they will get a strictly worse outcome. Thus, the person cutting the cake cannot guarantee a
better outcome by misrepresenting their true valuation of the cake.
The person who is not dividing the cake has to choose one piece after the cake has been cut. If
this person does not choose the bigger portion (according to him), then he can only possibly get
a worse outcome. Thus, this person cannot guarantee a better outcome by misrepresenting their
true valuation of the cake.
Hence, neither of the players can guarantee a strictly better outcome by misrepresenting their
true valuation of the cake.
(Note: during the divide and choose procedure the players do not communicate with each other.
The first player divides the cake, the second player chooses one piece and the first player takes the
other piece.)
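The cutter's gamble can be made concrete with a toy model. Everything in this sketch is an assumption made for illustration: the cake is the interval [0, 1], the cutter values the piece [0, x] at x, and the chooser values it at x². The cutter's honest 50-50 cut is then at 1/2, while the chooser's is near 0.707.

```python
from fractions import Fraction

def cutter_value(x):
    """Cutter's value of the left piece [0, x] (assumed uniform valuation)."""
    return x

def chooser_value(x):
    """Chooser's value of the left piece [0, x] (assumed valuation x**2)."""
    return x * x

def cutter_share(cut):
    """Cutter's value of the piece left after the chooser takes their preferred one."""
    if chooser_value(cut) >= Fraction(1, 2):   # chooser prefers the left piece
        return 1 - cutter_value(cut)
    return cutter_value(cut)                   # chooser takes the right piece

for cut in [Fraction(1, 2), Fraction(3, 5), Fraction(4, 5), Fraction(2, 5)]:
    print(cut, cutter_share(cut))
```

Cutting honestly at 1/2 yields 1/2. Cutting at 3/5, between the two 50-50 points, yields 3/5. Cutting at 4/5 or 2/5, outside that range, yields only 1/5 or 2/5, which is the risk described in the solution.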