Assignment 1: Solutions

Math-Comp 553: Algorithmic Game Theory.
1. Computing Nash Equilibria I.
Let Blockbuster and VideoMarket be two competing firms in a video rental
industry which has declined to the point that it cannot support both companies
profitably. Each firm has three possible strategies. It can (i) exit the industry
immediately (E), (ii) exit at the end of this quarter (T), or (iii) exit at the end of
the next quarter (N). When a company exits, its profit is zero
from that point onward. Each quarter in which both companies operate yields each
of them a loss of 1. Each quarter in which a firm operates alone yields it a profit
of 2. The payoff of each firm is the sum of its quarterly profits and losses.
(a) Formulate this game in normal form.
(b) Does the game have a pure strategy Nash equilibrium?
(c) Find a mixed strategy Nash equilibrium.
Solution:
(a) The only place we have to be careful is when Blockbuster chooses T and
VideoMarket chooses N. The payoff in this situation is (−1, 1), since both
firms lose 1 in the first quarter and VideoMarket gains 2 in the second
quarter.
Payoffs are listed as (Blockbuster, VideoMarket).

                           VideoMarket
                       E         T         N
                E     0,0       0,2       0,4
Blockbuster     T     2,0     -1,-1      -1,1
                N     4,0      1,-1     -2,-2
(b) Note that T is never a best response to any pure strategy of the
opponent. So in searching for a PSNE we need only consider N and E. It is
then easy to verify that (N, E) and (E, N) are both PSNE.
(c) Suppose Blockbuster (symmetrically, VideoMarket) plays E and N with
probability 1/2 each. This has payoffs 2, 1/2, −1 against the pure strategies
E, T, N, respectively. But the pure strategy T has payoffs 2, −1, −1 against
the pure strategies E, T, N, respectively. Thus T is weakly dominated by
the mixed strategy (1/2, 0, 1/2). Consequently there are mixed strategy NE in
which neither player puts any probability on strategy T.
Now let pB be the probability that Blockbuster exits immediately (E), and qB
the probability that it stays until the end of the next quarter (N). Since we
look for an equilibrium with no weight on the weakly dominated strategy T, we
have qB = 1 − pB. The expected utility to VideoMarket of choosing E
is zero and the expected utility of choosing N is 4pB − 2qB. The expected
utilities of E and N to VideoMarket must match in a NE if it places positive
probability on both strategies. Thus we have
0 = 4pB − 2qB = 4pB − 2(1 − pB) = 6pB − 2
which means pB = 1/3 and qB = 2/3. Thus the MSNE strategy for Blockbuster is
(1/3, 0, 2/3). A symmetric argument shows that VideoMarket also plays the
mixed strategy (1/3, 0, 2/3) at the MSNE.
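The equilibria found in (b) and (c) can be checked mechanically. The following is a minimal sketch, not part of the original solution: the matrix U and the helper names are our own, and the check relies on the game being symmetric (the column player's payoff at (r, c) is U[c][r]).

```python
from fractions import Fraction as F

# Row player's payoffs; rows and columns are ordered E, T, N.
# The game is symmetric, so the column player's payoff at (r, c) is U[c][r].
U = [[0, 0, 0],
     [2, -1, -1],
     [4, 1, -2]]

def payoffs_against(mix):
    """Expected payoff of each pure strategy against an opponent playing `mix`."""
    return [sum(F(U[s][t]) * mix[t] for t in range(3)) for s in range(3)]

def is_symmetric_equilibrium(mix):
    """True if every strategy in the support of `mix` is a best response to `mix`."""
    vals = payoffs_against(mix)
    best = max(vals)
    return all(v == best for v, m in zip(vals, mix) if m > 0)

def pure_equilibria():
    """All pure profiles (row, col) where neither player gains by deviating."""
    found = []
    for r in range(3):
        for c in range(3):
            row_ok = all(U[r][c] >= U[d][c] for d in range(3))
            col_ok = all(U[c][r] >= U[d][r] for d in range(3))
            if row_ok and col_ok:
                found.append(("ETN"[r], "ETN"[c]))
    return found

mixed = [F(1, 3), F(0), F(2, 3)]
print(pure_equilibria())                 # the two PSNE from part (b)
print(is_symmetric_equilibrium(mixed))   # the MSNE from part (c)
```

Against (1/3, 0, 2/3) all three pure strategies earn 0, so T is also a best response here; it simply is not needed in the support.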
2. Computing Nash Equilibria II.
Assume that Videotron and Bell are providing a product (some specialized
service) to three different companies {A, B, C}. Assume that each company
has a maximum budget for the product of $1000. The sellers, Videotron and
Bell, select prices pV, pB ∈ [0, 1000]. Due to geographical constraints company
A can only buy from Videotron, and company B can only buy from
Bell. However, company C will buy from the cheaper of the two sellers. (Assume
that if Videotron and Bell offer the same price then C will buy from Videotron
- better customer service!) Let the utility of the sellers be their revenue. Does
the corresponding game have a pure strategy Nash equilibrium?
Solution
There is no pure strategy Nash equilibrium. Suppose pV > 500. If Bell slightly
undercuts the price and sets 500 < pB < pV then C will buy from Bell, which
gives uB = 2pB > 1000. If instead Bell prices at pB ≥ pV then uB = pB ≤ 1000.
So Bell does best by slightly undercutting Videotron. A similar argument shows
that Videotron will then best respond by setting pV = pB (as Videotron wins C in
the case of a tie). Thus pV > 500 cannot be part of a PSNE. On the other hand,
if pV ≤ 500 then undercutting earns Bell less than 1000, so Bell's best response
is to set pB = 1000. But this cannot be a PSNE, as then Videotron will also best
respond with pV = 1000. So there are no PSNE.
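As a sanity check only (not a proof, since the real price space is continuous), one can search a discretized version of the game for profiles with no profitable unilateral deviation. The grid step of 10 and the function names below are assumptions made for this sketch.

```python
def revenues(pv, pb):
    """Revenues (Videotron, Bell): A buys from Videotron, B from Bell,
    and C buys from the cheaper seller, with ties going to Videotron."""
    if pv <= pb:
        return 2 * pv, pb
    return pv, 2 * pb

def has_profitable_deviation(pv, pb, grid):
    """True if either seller can strictly raise its revenue by changing price."""
    uv, ub = revenues(pv, pb)
    if any(revenues(d, pb)[0] > uv for d in grid):
        return True
    return any(revenues(pv, d)[1] > ub for d in grid)

# Assumption: prices restricted to multiples of 10 to keep the search small.
grid = range(0, 1001, 10)
stable = [(pv, pb) for pv in grid for pb in grid
          if not has_profitable_deviation(pv, pb, grid)]
print(stable)   # an empty list means no pure equilibrium on this grid
```

On the grid, "slightly undercutting" becomes undercutting by one step, which mirrors the cycle of best responses described in the solution.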
3. Social Welfare and Nash Equilibria.
In a large house, n roommates are sharing internet. Suppose the shared internet
channel has a maximum download capacity of 1 unit. Each roommate i can
choose to download $x_i$ units where $x_i \in [0, 1]$. Each roommate would like to
download as much as possible, but the quality of the download deteriorates with
the total bandwidth used. If the total amount that all roommates download,
that is $\sum_{i=1}^{n} x_i$, reaches or exceeds the maximum download capacity then
the utility of each roommate is zero. The utility for roommate i is then given by
\[
u_i =
\begin{cases}
e^{x_i} \prod_{j=1}^{n} e^{-x_i x_j} - 1 & \text{if } \sum_{i=1}^{n} x_i < 1 \\
0 & \text{if } \sum_{i=1}^{n} x_i \ge 1
\end{cases}
\]
(a) Find a Nash equilibrium in this game.
(b) What is the social welfare of this equilibrium?
(c) Show that the optimal social welfare can be arbitrarily larger than the
welfare at equilibrium.
Solution:
(a) Each roommate is trying to maximize their own utility. So focus on how
the ith player can maximize its utility. This is done by simple calculus
as follows:
\[
\begin{aligned}
\frac{d}{dx_i} u_i
&= \frac{d}{dx_i}\Big(e^{x_i}\prod_{j=1}^{n} e^{-x_i x_j} - 1\Big) \\
&= \frac{d}{dx_i}\Big(e^{x_i\left(1-\sum_{j=1}^{n} x_j\right)}\Big) \\
&= \Big(\frac{d}{dx_i}\, x_i\Big(1-\sum_{j=1}^{n} x_j\Big)\Big)\, e^{x_i\left(1-\sum_{j=1}^{n} x_j\right)} \\
&= \Big(\Big(1-\sum_{j=1}^{n} x_j\Big) - x_i\Big)\, e^{x_i\left(1-\sum_{j=1}^{n} x_j\right)}
\end{aligned}
\]
To optimise we want this derivative to be zero. It is zero only when
$(1-\sum_{j=1}^{n} x_j) - x_i = 0$. That is, $2x_i = 1 - \sum_{j \ne i} x_j$. To find the
equilibrium, we solve the optimization problem simultaneously for each roommate. Thus,
\[
x_i = \frac{1-\sum_{j \ne i} x_j}{2} \qquad \forall i \in \{1, \dots, n\}
\]
The above system of equations has a unique solution $x_i = \frac{1}{n+1}$ for all i.
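A quick numerical check of the first-order condition: the sketch below compares a brute-force best response with the closed form $(1 - \sum_{j \ne i} x_j)/2$. The grid resolution and the choice n = 4 are arbitrary assumptions, and the function names are ours.

```python
import math

def utility(i, x):
    """Utility of roommate i under the profile x (list of download amounts)."""
    total = sum(x)
    if total >= 1:
        return 0.0
    # e^{x_i} * prod_j e^{-x_i x_j} - 1 simplifies to e^{x_i (1 - total)} - 1
    return math.exp(x[i] * (1 - total)) - 1

def best_response(i, x, steps=10_000):
    """Grid search for roommate i's best reply to the others' choices in x."""
    best_xi, best_u = 0.0, -1.0
    for k in range(steps + 1):
        trial = list(x)
        trial[i] = k / steps
        u = utility(i, trial)
        if u > best_u:
            best_xi, best_u = trial[i], u
    return best_xi

n = 4
x = [1 / (n + 1)] * n
print(best_response(0, x))        # grid estimate of roommate 0's best reply
print((1 - sum(x[1:])) / 2)       # value from the first-order condition
```

Both printed values should be close to 1/(n+1), so no roommate wants to deviate from the symmetric profile.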
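(b) The social welfare of the Nash equilibrium is
\[
\begin{aligned}
\sum_i u_i
&= \sum_i \Big(e^{\frac{1}{n+1}\left(1-\sum_{j=1}^{n}\frac{1}{n+1}\right)} - 1\Big) \\
&= n\Big(e^{\frac{1}{(n+1)^2}} - 1\Big) \\
&\approx n\Big(1 + \frac{1}{(n+1)^2} - 1\Big) \\
&\approx \frac{1}{n} \quad \text{for large } n
\end{aligned}
\]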
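(c) We simply set $x_i = \frac{1}{2n}$ for all i; then
$u_i = e^{\frac{1}{2n}\left(1-\frac{1}{2}\right)} - 1 = e^{\frac{1}{4n}} - 1$. The social
welfare in this scenario is
\[
SW^* = \sum_i u_i = n e^{\frac{1}{4n}} - n \approx n\Big(1 + \frac{1}{4n}\Big) - n = \frac{1}{4}
\]
Since $SW_{opt} \ge SW^*$, we know
\[
\frac{SW_{opt}}{SW_{NE}} \ge \frac{SW^*}{SW_{NE}}
\ge \frac{1/4}{n\big(e^{\frac{1}{(n+1)^2}} - 1\big)}
\approx \frac{1/4}{n\big(1/(n+1)^2\big)}
= \frac{(n+1)^2}{4n}
\ge \frac{n}{4}
\]
which grows without bound as n increases.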
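The two welfare levels can also be compared numerically without the first-order approximations. This is a small sketch under the assumption of symmetric profiles; `welfare_symmetric` is a name introduced here, not one from the assignment.

```python
import math

def welfare_symmetric(n, x):
    """Social welfare when each of n roommates downloads x (requires n*x < 1)."""
    assert n * x < 1
    return n * (math.exp(x * (1 - n * x)) - 1)

for n in (10, 100, 1000):
    sw_ne = welfare_symmetric(n, 1 / (n + 1))      # Nash equilibrium profile
    sw_star = welfare_symmetric(n, 1 / (2 * n))    # the profile used in part (c)
    print(n, sw_ne, sw_star, sw_star / sw_ne, n / 4)
```

For each n the printed ratio should be at least n/4, in line with the bound derived above.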
4. Nash Equilibria: Bluffing in Poker.
In this question, we will use a poker-like game to show that betting with an
inferior hand, called bluffing, is an integral part of optimal poker strategies. The
two-player game, which we will call Basic Endgame has the following rules:
• Before play starts, the two players, A and B, each put one dollar into the
pot (i.e. the pool of bets which will go to the winner of the game).
• A is then dealt a single card face down from a deck consisting solely of
‘winning cards’ (W) and ‘losing cards’ (L). The probability of being dealt
W is p.
• Once A looks at his card, his option is to CHECK or BET. If he CHECKS,
then the card is flipped over and the pot goes to A if it is W or to B if it
is L. If he BETS, he puts X dollars into the pot, and B has a turn.
• If A BETS, B can either CALL or FOLD. If B FOLDS, then the pot goes
to A and the game ends, regardless of what card A has. If B CALLS, then
B also puts X dollars into the pot and the card is flipped over, with the
pot going to A if it is W or to B if it is L.
The normal form of this game looks like this (A is the row player):
                     Card W                          Card L
(A, B)      Call               Fold         Call               Fold
Check       (1, −1)            (1, −1)      (−1, 1)            (−1, 1)
Bet         (X + 1, −X − 1)    (1, −1)      (−X − 1, X + 1)    (1, −1)
We want the optimal strategy for this game which will be given by three numbers: α, the probability that A bets with L, β, the probability that B calls a
bet from A, and γ, the probability that A bets with W.
(a) Find γ without calculating.
(b) Write A’s expected profit as a function of α and β. Note that this game
is zero-sum, i.e. A’s profits match B’s losses.
(c) Show that there is a p∗ such that, for p ≥ p∗, α = 1, β = 0 (and γ as in
(a)) is the optimal strategy profile. Find p∗.
(d) For p < p∗ , find a Nash equilibrium. What are A’s expected profits/losses
in this case? Given X, at what value of p does A break even with the Nash
strategy in expectation?
(e) Consider the special case of p = 1/2, X = 2. Compute A’s expected
losses using the Nash equilibrium strategy when A’s card is L. Note that
A would lose exactly as much money with L if he never bluffed. What,
then, is the value of bluffing?
Solution:
(a) If A bets with W, he will always win at least 1 dollar, and can possibly win
X + 1 dollars. If A checks with W, he will always win exactly one dollar.
So betting dominates checking and γ = 1.
(b) Player A’s expected payoff ΠA is
p(β(X + 1) + (1 − β) · 1) + (1 − p)(α(β(−X − 1) + (1 − β) · 1) + (1 − α) · (−1))
For the subsequent questions it will help to rewrite this in a simpler form:
ΠA = p(β(X + 1) + (1 − β) · 1) + (1 − p)(α(β(−X − 1) + (1 − β) · 1) + (1 − α) · (−1))
   = p(βX + 1) + (1 − p)(α(−βX − 2β + 1) + (α − 1))
   = pβX + p − (1 − p)α(βX + 2β) + (1 − p)(2α − 1)
   = β(pX − (1 − p)α(X + 2)) + (1 − p)(2α − 1) + p
   = β(pX − (1 − p)α(X + 2)) + 2α(1 − p) + (2p − 1)
(c) Since the game is zero-sum, B is trying to minimize ΠA. That is, Player
B’s expected payoff is:
ΠB = β((1 − p)α(X + 2) − pX) − 2α(1 − p) + (1 − 2p)
If pX ≥ (1 − p)(X + 2) then, since α ≤ 1, no matter what strategy A
takes, the coefficient of β is non-positive. So β = 0 is a best response. In
that case ΠA = 2α(1 − p) + (2p − 1) is increasing in α, so the best strategy
for A is α = 1. Simplifying the inequality, we
get p ≥ (X + 2)/(2X + 2), so p∗ = (X + 2)/(2X + 2).
(d) Taking α = pX/[(1 − p)(X + 2)] (which is less than 1, since p < p∗ ) makes
the expression ΠB = −ΠA independent of β, so any value of β is a best
response. Rewriting ΠA as
ΠA (α, β) = (2p − 1) + pXβ + (1 − p)α[2 − (X + 2)β]
we see that any α is a best response to β = 2/(X + 2). So the mixed
strategy profile (α = pX/[(1 − p)(X + 2)], β = 2/(X + 2)) is a Nash
equilibrium. A’s profit is (2p − 1) + 2pX/(X + 2), which is non-negative
for p ≥ (X + 2)/(4X + 4).
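The formulas in (b) to (d) can be checked with exact arithmetic. The sketch below assumes γ = 1 and an integer bet size X; `profit_A` and `equilibrium` are illustrative names of our own.

```python
from fractions import Fraction as F

def profit_A(p, X, alpha, beta):
    """A's expected profit when A always bets with W (gamma = 1)."""
    with_W = beta * (X + 1) + (1 - beta) * 1
    bluff = beta * (-X - 1) + (1 - beta) * 1
    with_L = alpha * bluff + (1 - alpha) * (-1)
    return p * with_W + (1 - p) * with_L

def equilibrium(p, X):
    """Equilibrium (alpha, beta) from parts (c) and (d); X must be an integer."""
    p_star = F(X + 2, 2 * X + 2)
    if p >= p_star:
        return F(1), F(0)
    return p * X / ((1 - p) * (X + 2)), F(2, X + 2)

p, X = F(1, 2), 2
alpha, beta = equilibrium(p, X)
print(alpha, beta, profit_A(p, X, alpha, beta))

# At the equilibrium each player is indifferent over the other's mixing variable.
print(profit_A(p, X, alpha, 0) == profit_A(p, X, alpha, 1))
print(profit_A(p, X, 0, beta) == profit_A(p, X, 1, beta))
```

With X = 2 the break-even probability (X + 2)/(4X + 4) equals 1/3, and evaluating `profit_A` at the equilibrium for p = 1/3 gives exactly zero.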
(e) With p = 1/2 and X = 2, the equilibrium from part (d) has α = β = 1/2.
Given that A holds L, the probability of A checking and losing a dollar is 1/2.
The probability of A betting and being called for a loss of 3 dollars is
1/4. The probability of A betting and successfully bluffing for a
dollar profit is 1/4. So A’s expected profit when his card is L is
−1/2 − 3/4 + 1/4 = −1, which,
indeed, is exactly what he would lose if he just checked with L all the time.
So the purpose of optimal bluffing is not to try and win pots with a losing
hand. A bluffs because, if he did not, B would never call when A had a
winning hand (with α = 0 the coefficient of β in ΠB from part (c) is −pX < 0,
so B’s best response is β = 0). A’s profit comes from getting B to
call against W, and he can only accomplish this if B knows that calling
sometimes pays off, i.e. that A is sometimes bluffing.
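To put a number on the value of bluffing in this special case, the sketch below compares A's overall expected profit at the equilibrium with the profit from never bluffing, assuming B then folds to every bet. The variable names are ours.

```python
from fractions import Fraction as F

p, X = F(1, 2), 2
alpha, beta = F(1, 2), F(1, 2)      # equilibrium values from part (d)

# A's expected profit conditional on holding L
check = (1 - alpha) * (-1)              # checks and loses the ante
called = alpha * beta * (-X - 1)        # bluffs and is called
folded = alpha * (1 - beta) * 1         # bluffs and B folds
profit_L = check + called + folded

# Conditional on holding W, A always bets (gamma = 1)
profit_W = beta * (X + 1) + (1 - beta) * 1

with_bluffing = p * profit_W + (1 - p) * profit_L

# Assumption: if A never bluffs, B folds to every bet, so W wins only the ante.
never_bluff = p * 1 + (1 - p) * (-1)

print(profit_L, profit_W, with_bluffing, never_bluff)
```

The loss with L is the same in both cases; the gain shows up entirely in the W branch, where B's calls raise A's conditional profit from 1 to 2.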
5. Social Choice Theory: Approval Voting.
In approval voting, each voter may vote for as many of the k candidates as
they wish. Thus, voter i votes for a set Si ⊆ [k]. Implicitly i approves of the
candidates in Si but disapproves of the candidates in [k] \ Si .
The candidates are then ranked in order of the total number of votes that they
receive.
(a) Does this voting mechanism satisfy unanimity and independence of irrelevant alternatives?
(b) Assume that voter i has a set Si∗ of candidates that it likes. Then, given an
ordering π = (π1, π2, . . . , πk), let bi(πj) = 1 if πj ∈ Si∗ and bi(πj) = 0 if
πj ∉ Si∗. Thus π has a binary representation bi(π) = (bi(π1), bi(π2), . . . , bi(πk)).
Given two orderings π = (π1, π2, . . . , πk) and π′ = (π′1, π′2, . . . , π′k) of the
candidates, we assume that voter i prefers π over π′ if and only if bi(π) >
bi(π′). Is this voting mechanism incentive compatible?
Solution:
(a) Unanimity: If all voters prefer candidate a over b then for all voters i we
have a ∈ Si and b ∉ Si. If there are n voters then a has n votes and b has
zero. Thus a > b and unanimity is established.
IIA: Suppose a > b. This can only happen when a is included in more voters’
approval sets than b. If a new candidate c enters the race then each
voter i chooses either c ∈ Si or c ∉ Si, but this does not change the number of
voters whose approval set contains a. The same logic is true for b. Hence
we still retain a > b.
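The IIA argument can be illustrated with a small tally. This is only a sketch: the ballots are made-up example data, and breaking ties by the order of `candidates` is an assumption, since the assignment does not specify a tie-breaking rule.

```python
from collections import Counter

def approval_ranking(ballots, candidates):
    """Rank candidates by approval count, highest first.
    Ties are broken by position in `candidates` (an assumption)."""
    votes = Counter()
    for ballot in ballots:
        votes.update(ballot)
    return sorted(candidates, key=lambda c: (-votes[c], candidates.index(c)))

# Hypothetical ballots over candidates a and b only
before = [{"a"}, {"a", "b"}, {"b"}, {"a"}]
# The same voters after candidate c enters; approvals of a and b are unchanged
after = [{"a", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c"}]

r1 = approval_ranking(before, ["a", "b"])
r2 = approval_ranking(after, ["a", "b", "c"])
print(r1, r2)
```

Because the vote counts of a and b do not depend on who approves c, a stays ahead of b in both rankings.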
(b) Suppose the social welfare function g selects the ordering π when voter i
votes truthfully (Si = Si∗), and let π′ be an ordering that voter i strictly prefers:
g(>1, . . . , >i, . . . , >n) = π,    bi(π′) > bi(π)    (1)
We want to check whether there exists a >′i that can change the outcome of
the game to π′. That is to say,
g(>1, . . . , >′i, . . . , >n) = π′    (2)
In other words, we want to see if voter i can change its Si to S′i so as
to get a better ordering, that is, an ordering in which the candidates it
likes are ranked higher.
Suppose πi ∈ Si. Then the only change that voter i can make for this candidate
is to exclude it from Si, i.e. πi ∉ S′i. But that means the total number of votes
for πi will decrease and πi may move to the right in π′, i.e. π′j = πi for some
j ≥ i, and therefore bi(π′) ≤ bi(π). So the change does no good to voter i
in this case.
Now suppose πi ∉ Si. Then by the only possible change, πi ∈ S′i, we add
one to the total number of votes for πi. Therefore πi may move to the
left in this case, i.e. π′j = πi for some j ≤ i. But that means that one of the
candidates in Si might move to the right, and therefore bi(π′) ≤ bi(π),
so the change is not beneficial to voter i in this case either. Hence voter i
cannot gain by misreporting, and the mechanism is incentive compatible.
6. Social Choice Theory: Median Voter Theorem.
Consider an election where candidates are points on [0, 1] and each voter i has
a utility function fi : [0, 1] → ℝ and ranks the candidates from highest utility
to lowest utility. This is called a ‘One-Dimensional Political Spectrum’ model.
Assume each player i has the single-peaked utility ui(x) = 1 − ||x − xi|| for some
xi ∈ [0, 1]. [1]
[1] The voter’s utility is “single-peaked” as fi takes its maximum at xi and fi is increasing
on [0, xi] and decreasing on [xi, 1] (so xi is the only ‘peak’ of the function).
(a) If the voting system asks players to report their xi’s and then selects the
candidate closest to the voter with the median xi, prove that it is incentive
compatible, i.e., a weakly dominant strategy for the players is to report
their true xi, assuming the number of voters is odd.
To answer the next two problems, you will need this definition. Candidate
A is a Condorcet Winner, if, for every other candidate B, the number of
voters who prefer A to B is larger than the number of voters who prefer
B to A.
The Condorcet Winner is a natural extension of the concept of a Majority
Winner in a two-candidate election. However, there is no guarantee that
a Condorcet Winner exists.
(b) Prove the following Theorem (taking the number of voters to be odd):
Median Voter Theorem: In a 1-d political spectrum model, if all the
voters have single-peaked symmetric utility functions, the candidate most
preferred by the voter with the median peak is a Condorcet Winner.
(c) Show that the Median Voter Theorem fails in the ‘2-d political spectrum’
model by finding a 3-voter, 3-candidate example with single-peaked utilities
that has a conflict cycle containing all the candidates, and thus no
Condorcet Winner.
Solution:
(a) We give the proof for an odd number of voters N. Clearly if i is the median
voter, he gets the best possible result by stating his true preference. So
consider the case xi < xM, where M is the median voter. By stating x′i, his
goal is to move the median to M′, with xM′ < xM, so that the mechanism
selects a candidate closer to his true value xi. But it’s not hard to see that
i can’t move the median to the left! If he reports x′i < xM, then the median
will still be at xM′ = xM, since xM still has (N − 1)/2 of the voters to the left
of it and (N − 1)/2 of the voters to the right of it. If he reports x′i ≥ xM, then
either the median won’t move, or it will move right, since now more than
half of the voters will be at or to the right of xM. So in either case, i cannot
move the median to the left by lying. Moving the median to the right does not
help either: a new winner c′ would lie to the right of the old winner c, and
since xM is at least as close to c as to c′, the point xi < xM is strictly closer
to c. The same argument shows that if xi > xM,
then i cannot move the median to the right (and thus closer to him). So i
can never improve his utility by misrepresenting his xi, and therefore the
mechanism is incentive compatible.
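The argument can be spot-checked by brute force on a small instance. The peaks, candidate positions and grid of possible lies below are made-up example data, and `outcome` breaks ties between equally close candidates by list order, which is an assumption.

```python
def outcome(reports, candidates):
    """Candidate closest to the median report (odd number of voters)."""
    median = sorted(reports)[len(reports) // 2]
    return min(candidates, key=lambda c: abs(c - median))

def can_gain_by_lying(peaks, candidates, grid):
    """True if some voter gets a strictly closer winner by misreporting."""
    for i, x_i in enumerate(peaks):
        honest = abs(outcome(peaks, candidates) - x_i)
        for lie in grid:
            reports = peaks[:i] + [lie] + peaks[i + 1:]
            if abs(outcome(reports, candidates) - x_i) < honest - 1e-12:
                return True
    return False

peaks = [0.1, 0.35, 0.5, 0.8, 0.9]            # hypothetical true peaks
candidates = [0.0, 0.3, 0.45, 0.7, 1.0]       # hypothetical candidate positions
grid = [k / 100 for k in range(101)]          # reports a voter might try

print(outcome(peaks, candidates))
print(can_gain_by_lying(peaks, candidates, grid))
```

A single instance proves nothing in general, but a `True` here would signal a mistake in the proof or in the mechanism as coded.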
(b) Let O be the candidate most preferred by the median voter and xO be
his position on the line. Let the peak of the median voter be at xM ,
and his utility be fM (x). We will show the result for xO < xM . For
xO ≥ xM , the proof is almost exactly the same. It should be obvious that
no candidate can be in (xO , xM ], because fM (x) is increasing on [0, xM ]
and if a candidate A was in (xO , xM ], A would be preferred to O by the
median voter. Consider a candidate A at position xA . If xA < xO , then
every voter with a peak to the right of the median voter must prefer O
to A. Since the median voter prefers O that means a majority of voters
prefer O to A. Now consider xA > xM . Every voter with a peak to the
left of xO prefers O to A and so does the median voter. But what about
the voters with peaks in (xO , xM )? We need them to prefer O to A as
well in order to get a majority. Here we use the fact that the graph for
any ui (x) is just a translated version of the graph for uM (x). Since the
median voter prefers O over A, we have u(xO − xM ) > u(xA − xM ). For
any voter i with xi ∈ (xO , xM ), we have xA > xM > xi > xO . Then,
as u is single-peaked with a peak at x = 0, u(xA − xi ) < u(xA − xM ),
and u(xO − xi ) > u(xO − xM ). So putting the inequalities together we
have u(xO − xi ) > u(xA − xi ), and voter i prefers O to A. So all voters
with a peak to the left of xM prefer O to A and so does the median voter,
therefore a majority of voters prefer O to A. Since any candidate must be
placed, by assumption, on [0, xO ) ∪ (xM , 1], we have shown that, for any
candidate A, the number of voters that prefer O to A is larger than the
number that prefer A to O. So O is a Condorcet Winner.
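The theorem can be illustrated on a concrete 1-d instance by counting pairwise majorities directly. The peaks and candidate positions are made-up example data, and the helper names are ours.

```python
def prefers(peak, a, b):
    """True if a voter with this peak strictly prefers candidate a to b."""
    return abs(a - peak) < abs(b - peak)

def condorcet_winner(peaks, candidates):
    """Return the candidate that beats every other one by majority, or None."""
    for a in candidates:
        if all(sum(prefers(x, a, b) for x in peaks) >
               sum(prefers(x, b, a) for x in peaks)
               for b in candidates if b != a):
            return a
    return None

peaks = [0.1, 0.35, 0.5, 0.8, 0.9]            # hypothetical peaks, median 0.5
candidates = [0.0, 0.3, 0.45, 0.7, 1.0]       # hypothetical candidate positions

median_peak = sorted(peaks)[len(peaks) // 2]
median_favourite = min(candidates, key=lambda c: abs(c - median_peak))
print(median_favourite, condorcet_winner(peaks, candidates))
```

In this instance the median voter's favourite candidate and the Condorcet winner coincide, as the theorem predicts.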
(c) Take the space to be [0, 1] × [0, 1]. Consider an equilateral triangle with
the candidates A, B and C at the vertices, say, the origin, (1, 0) and (1/2, √3/2). Use
the utility function from (a), so the three voters choose their preferences
based on which candidate is closest to their peak. We will try and position
them to get an intransitive cycle. Let the peak of voter 1 lie on the line
segment AB, at distance 1/2 − ε from A (so 1 lies slightly closer to A than
B). Similarly, put voter 2 on BC at distance 1/2 − ε from B and put voter
3 on AC at distance 1/2 − ε from C. Then for small enough ε > 0, voter 1 will
have preference A > B > C, voter 2 will have preference B > C > A and
voter 3 will have preference C > A > B. So we get an A > B > C > A
cycle, and no candidate can be a Condorcet Winner.
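The cycle can be verified numerically. In this sketch ε = 0.05 is an arbitrary choice of a "small enough" value, and the helper names are ours.

```python
import math

A, B, C = (0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)
candidates = {"A": A, "B": B, "C": C}

def point_on_segment(start, end, dist):
    """Point at distance `dist` from `start` along a unit-length segment to `end`."""
    return (start[0] + dist * (end[0] - start[0]),
            start[1] + dist * (end[1] - start[1]))

eps = 0.05          # assumption: any small positive value should do
d = 0.5 - eps
voters = [point_on_segment(A, B, d),    # voter 1, on AB near A
          point_on_segment(B, C, d),    # voter 2, on BC near B
          point_on_segment(C, A, d)]    # voter 3, on AC near C

def ranking(peak):
    """Candidates ordered from closest to farthest from the voter's peak."""
    return sorted(candidates, key=lambda name: math.dist(peak, candidates[name]))

def majority_prefers(a, b):
    """True if a strict majority of voters rank a above b."""
    votes = sum(r.index(a) < r.index(b) for r in map(ranking, voters))
    return votes > len(voters) / 2

print([ranking(v) for v in voters])
print(majority_prefers("A", "B"), majority_prefers("B", "C"), majority_prefers("C", "A"))
```

All three majority comparisons come out true, so every candidate loses some pairwise contest.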