Bayesian games: incomplete information games

Wieslaw Zielonka
www.liafa.univ-paris-diderot.fr/~zielonka
mail: [email protected]
LIAFA, Université Paris Diderot - Paris 7
November 3, 2014
1 Definition
Notation: for a finite nonempty set X, by ∆(X) we denote the set of probability
distributions over X.
We assume a finite set N of players and a finite set Ω of possible worlds. For each
player i the game specifies:
• a set Ai (usually finite) of available actions,
• a finite set Ti of signals¹ of player i,
• player i’s belief function pi : Ti −→ ∆(Ω × T−i), where
$T_{-i} = \prod_{j \in N \setminus \{i\}} T_j$ is the set of signal profiles of the other players.
We will write pi(ω, t−i|ti) to denote the probability that player i of type ti ∈ Ti
assigns to (ω, t−i).
• for each signal ti ∈ Ti, a set Ai(ti) ⊆ Ai of actions available to player i when his signal is ti,
• for each signal ti ∈ Ti a utility (payoff) mapping ui(·|ti) : A × Ω × T−i −→ R,
where $A = \prod_{j \in N} A_j$ and $T = \prod_{j \in N} T_j$ are the sets of action profiles and signal profiles. Thus
ui(a, ω, t−i|ti) is the payoff of player i in the world ω if the signal (type) of
player i is ti, a ∈ A is the action profile, and t−i is the signal profile of the other players.
The intuition is that a player knows neither the state of the real world ω ∈ Ω
nor the types of the other players. He does, however, receive a signal which can reveal to him
(partially or completely) information about the world state ω and about the types
of the other players. The signal that player i receives is his private information, and
the belief function indicates how his belief depends on the signal that he receives
(on his type).
¹ Instead of the signal received by agent i we will often speak about the type of player i. Thus
either we say that player i receives a signal ti or, equivalently, we say that player i is of type ti.
Upon receiving his signal, each player decides which action to take. A
strategy of player i is a mapping
σi : Ti −→ ∆(Ai )
from his signals to probability distributions over actions such that supp(σi (ti )) ⊆
Ai (ti ) where
supp(σi (ti )) = {ai ∈ Ai | σi (ti )(ai ) > 0}.
Player i playing according to σi will play ai with probability σi (ti )(ai ) if his signal
is ti . To simplify the notation, we write σi (ai |ti ) instead of σi (ti )(ai ) and this can
be read as the probability of playing ai under the condition that player i receives
signal ti .
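As a minimal sketch of this definition, a strategy can be stored as a table from signals to distributions over actions, and the support condition can be checked directly. The names `support` and `is_valid_strategy`, the signals t1, t2 and the actions a, b, c are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def support(dist):
    """Actions played with positive probability: supp(sigma_i(t_i))."""
    return {action for action, prob in dist.items() if prob > 0}


def is_valid_strategy(sigma, available):
    """Check that sigma maps every signal t_i to a probability
    distribution whose support lies inside available[t_i] = A_i(t_i)."""
    if set(sigma) != set(available):
        return False
    for t_i, dist in sigma.items():
        if any(prob < 0 for prob in dist.values()):
            return False
        if sum(dist.values()) != 1:
            return False
        if not support(dist) <= available[t_i]:
            return False
    return True


# Hypothetical data: two signals with different available action sets.
available = {"t1": {"a", "b"}, "t2": {"b", "c"}}
sigma = {
    "t1": {"a": Fraction(1, 4), "b": Fraction(3, 4)},
    "t2": {"c": Fraction(1)},
}
print(is_valid_strategy(sigma, available))
```

Exact fractions are used so that the test "probabilities sum to 1" does not depend on a floating-point tolerance.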
All this structure is common knowledge among the players.
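We say that players have consistent priors if there exists a probability distribution
p ∈ ∆(Ω × T) such that each player's beliefs are the conditional distributions of p given his own signal, i.e.
if
\[
p_i(\omega, t_{-i} \mid t_i)
= \frac{p(\omega, (t_i, t_{-i}))}{\sum_{\omega' \in \Omega} \sum_{t'_{-i} \in T_{-i}} p(\omega', (t_i, t'_{-i}))}
= \frac{p(\omega, (t_{-i}, t_i))}{p(t_i)},
\tag{1}
\]
where p(ti) denotes the marginal probability of the signal ti under p, assumed to be positive.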
In most applications we assume consistent prior beliefs, but this is not necessary in
general: one can also examine what happens if the players' beliefs are not consistent.
Thus pi(·, ·|ti) is the belief of agent i if he is of type ti.
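Formula (1) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original notes: the function name `beliefs_from_prior`, the encoding of the prior as a dictionary, and the toy prior (player 0 learns the world, player 1 always receives the same signal "s") are all hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def beliefs_from_prior(prior, i):
    """Derive player i's beliefs from a common prior via formula (1).

    prior maps (omega, t) to a probability, where t is the tuple of all
    players' signals. The result maps each signal t_i of positive
    probability to a dict {(omega, t_minus_i): p_i(omega, t_-i | t_i)}.
    """
    marginal = defaultdict(Fraction)  # p(t_i), the denominator of (1)
    for (omega, t), prob in prior.items():
        marginal[t[i]] += prob

    beliefs = defaultdict(dict)
    for (omega, t), prob in prior.items():
        if marginal[t[i]] > 0:
            t_minus_i = t[:i] + t[i + 1:]
            beliefs[t[i]][(omega, t_minus_i)] = prob / marginal[t[i]]
    return dict(beliefs)


# Hypothetical prior: two equally likely worlds; player 0 receives signal
# 1 or 2 depending on the world, player 1 always receives signal "s".
prior = {
    ("w1", (1, "s")): Fraction(1, 2),
    ("w2", (2, "s")): Fraction(1, 2),
}
informed = beliefs_from_prior(prior, 0)
uninformed = beliefs_from_prior(prior, 1)
print(informed)
print(uninformed)
```

By construction the beliefs obtained this way are consistent; inconsistent beliefs would have to be specified player by player instead of being derived from a single p.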
Example 1 (Bargaining game). There are two players, a seller and a buyer. The
buyer wishes to acquire an object belonging to the seller. They engage in the
following bargaining game.
Both players simultaneously make bids for the object. If the buyer’s bid B is greater than
or equal to the seller’s bid S then the object changes owner and the buyer pays
(B + S)/2 to the seller. If B < S then there is no trade (this is a one-round game: if there
is no trade the players part company and no other attempt is made).
Each player assigns some private value to the object. For example, for the seller the
object is worth tS = 30 euros and for the buyer it is worth tB = 60 euros; in other
words, selling at a price lower than 30 euros and buying at a price higher than 60
euros are considered by them as a loss. We say that tS and tB are the types of the
seller and the buyer respectively. Obviously each player knows his own type but he
does not know the type of his adversary. We suppose that TS = TB = {1, . . . , 100}
are the sets of possible types and each player assigns the probability 1/100 to the event
that his adversary is of a particular type t.
Formally, we can assume that there is only one world state, thus Ω can be left out
of our description. The seller’s belief function is such that pS(t|tS) = 1/100 for each
t ∈ TB and each of his types tS, and the buyer’s belief is pB(t|tB) = 1/100 for each t ∈ TS
and each of his types tB.
The bids B and S are the actions taken by the players. In general these actions are
not equal to the types tB and tS: thinking strategically, the seller will tend
to propose bids higher than tS and the buyer may propose bids lower
than tB, the seller trying to maximize his profit and the buyer trying to minimize his
expenses. There is, however, a tension in the game: a seller asking too much can
prevent a trade even when trading at a lower price would still be profitable for him, and similarly
a buyer proposing bids that are too low can break the trade.
The seller’s utility is
\[
u_S((B, S) \mid t_S) =
\begin{cases}
0 & \text{if } B < S, \\
\frac{B+S}{2} - t_S & \text{otherwise.}
\end{cases}
\]
Thus the seller’s utility does not depend on the type of the buyer: it is 0 if no
trade occurs, and if the trade occurs it is equal to the amount of money that the seller receives minus
his value of the object.
Similarly the buyer’s utility is
\[
u_B((B, S) \mid t_B) =
\begin{cases}
0 & \text{if } B < S, \\
t_B - \frac{B+S}{2} & \text{otherwise,}
\end{cases}
\]
i.e. 0 if there is no trade, and the value of the object minus the amount of money
that he pays for it if the trade occurs.
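The two utility functions of Example 1 translate directly into code. The sketch below is only an illustration: bids are assumed to be integers, and the helper `expected_seller_utility` additionally assumes, purely hypothetically, that the buyer bids his true type, which is not claimed to be an equilibrium strategy.

```python
from fractions import Fraction


def seller_utility(B, S, t_S):
    """u_S((B, S) | t_S): price minus the seller's value if trade occurs."""
    return Fraction(0) if B < S else Fraction(B + S, 2) - t_S


def buyer_utility(B, S, t_B):
    """u_B((B, S) | t_B): the buyer's value minus price if trade occurs."""
    return Fraction(0) if B < S else t_B - Fraction(B + S, 2)


def expected_seller_utility(S, t_S, buyer_bid=lambda t_B: t_B):
    """Expected utility of a seller of type t_S who bids S, under the
    uniform belief p_S(t | t_S) = 1/100 on the buyer's types 1..100.
    buyer_bid is an assumed bidding rule (truthful by default)."""
    return sum(
        Fraction(1, 100) * seller_utility(buyer_bid(t_B), S, t_S)
        for t_B in range(1, 101)
    )


# Types from the example: t_S = 30, t_B = 60. Bids B = 50, S = 40 lead to
# a trade at price 45; bids B = 35, S = 40 lead to no trade.
print(seller_utility(50, 40, 30), buyer_utility(50, 40, 60))
print(seller_utility(35, 40, 30), buyer_utility(35, 40, 60))
print(expected_seller_utility(30, 30))
```

Whenever trade occurs the two utilities sum to tB − tS, so the bids only determine how this surplus is split and whether it is realized at all.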
2 Bayes-Nash equilibria in Bayesian games
A strategy profile σ∗ = (σ1∗, . . . , σn∗) is a Bayes-Nash equilibrium if for each player i
and each signal ti ∈ Ti the expected payoff of player i of type ti cannot be improved by changing
his strategy.
But what is the expected payoff of player i if he is of type ti ?
For a−i ∈ A−i and t−i ∈ T−i, the probability that the players different from i play an
action profile a−i = (a1, . . . , ai−1, ai+1, . . . , an) if they receive the signal profile t−i is
given by
\[
\sigma^*_{-i}(a_{-i} \mid t_{-i}) = \sigma^*_1(a_1 \mid t_1) \cdots \sigma^*_{i-1}(a_{i-1} \mid t_{i-1}) \, \sigma^*_{i+1}(a_{i+1} \mid t_{i+1}) \cdots \sigma^*_n(a_n \mid t_n).
\]
Then player i’s expected payoff is
\[
u_i(\sigma^* \mid t_i) = \sum_{t_{-i} \in T_{-i}} \sum_{a_{-i} \in A_{-i}} \sum_{a_i \in A_i} \sum_{\omega \in \Omega}
p_i(\omega, t_{-i} \mid t_i) \, \sigma^*_i(a_i \mid t_i) \, \sigma^*_{-i}(a_{-i} \mid t_{-i}) \, u_i((a_{-i}, a_i), \omega, t_{-i} \mid t_i).
\]
Replacing σi∗ in this formula by another strategy σi we get $u_i((\sigma^*_{-i}, \sigma_i) \mid t_i)$, the
expected payoff that player i obtains using strategy σi.
Thus the strategy profile σ∗ is an equilibrium if for each player i ∈ N, each
signal ti ∈ Ti and each strategy σi of player i,
\[
u_i((\sigma^*_{-i}, \sigma^*_i) \mid t_i) \ge u_i((\sigma^*_{-i}, \sigma_i) \mid t_i).
\]
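The expected payoff formula and the equilibrium condition can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names, the argument layout and the toy coordination game are hypothetical, and the action restrictions Ai(ti) are ignored (every action is assumed available to every type). Since the expected payoff is linear in σi(·|ti), it suffices to test deviations to pure actions.

```python
from fractions import Fraction
from itertools import product


def expected_payoff(i, t_i, sigma, belief, utility, actions, types, worlds):
    """u_i(sigma | t_i) for player i of type t_i.

    sigma[j][t_j] maps actions of player j to probabilities;
    belief(omega, t_minus_i, t_i) plays the role of p_i(omega, t_-i | t_i);
    utility(a, omega, t_minus_i, t_i) plays the role of u_i(a, omega, t_-i | t_i).
    """
    others = [j for j in range(len(actions)) if j != i]
    total = Fraction(0)
    for omega in worlds:
        for t_minus_i in product(*(types[j] for j in others)):
            p = belief(omega, t_minus_i, t_i)
            if p == 0:
                continue
            for a in product(*actions):  # a is a full action profile
                weight = p * sigma[i][t_i].get(a[i], 0)
                for k, j in enumerate(others):
                    weight *= sigma[j][t_minus_i[k]].get(a[j], 0)
                if weight:
                    total += weight * utility(a, omega, t_minus_i, t_i)
    return total


def is_best_response(i, t_i, sigma, belief, utility, actions, types, worlds):
    """True if no pure deviation of type t_i of player i is profitable."""
    args = (belief, utility, actions, types, worlds)
    current = expected_payoff(i, t_i, sigma, *args)
    for a_i in actions[i]:
        deviation = list(sigma)
        deviation[i] = {**sigma[i], t_i: {a_i: Fraction(1)}}
        if expected_payoff(i, t_i, deviation, *args) > current:
            return False
    return True


# Hypothetical toy game: one world, one type per player, common payoffs.
ACTIONS = [["A", "B"], ["A", "B"]]
TYPES = [["s"], ["s"]]
WORLDS = ["w"]
PAYOFF = {("A", "A"): 2, ("B", "B"): 1}


def belief(omega, t_minus_i, t_i):
    return Fraction(1)


def utility(a, omega, t_minus_i, t_i):
    return PAYOFF.get(a, 0)


half = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
mixed = [{"s": half}, {"s": half}]
pure = [{"s": {"A": Fraction(1)}}, {"s": {"A": Fraction(1)}}]
game = (belief, utility, ACTIONS, TYPES, WORLDS)
print(expected_payoff(0, "s", mixed, *game))
print(is_best_response(0, "s", mixed, *game))
print(is_best_response(0, "s", pure, *game))
```

A profile is a Bayes-Nash equilibrium when `is_best_response` holds for every player and every type; the check has to be repeated with each player's own belief and utility functions.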
Remark 1. If the players have consistent priors and the sets of states, signals and actions
are finite then there exist Bayes-Nash equilibria.
Example 2. N = {1, 2}, T1 = {t11, t12}, T2 = {t2} (player 1 has two types, player
2 has one type).
We shall identify here type profiles with the states of nature. The two types of
player 1 have the same probabilities:
\[
p(t_{11}, t_2) = p(t_{12}, t_2) = \frac{1}{2}.
\]
There are two states of nature, one corresponding to the type profile (t11, t2) and the
other corresponding to (t12, t2). The set of actions of player 1 is different for each of
his types.
The matrices below show the games played for each type profile. Note that player
1 knows which game is played but player 2 does not.
Type profile t = (t11, t2):

                     Player 2
                     L       R
   Player 1   T1     1, 0    0, 2
              B1     0, 3    1, 0

Type profile t = (t12, t2):

                     Player 2
                     L       R
   Player 1   T2     0, 2    1, 1
              B2     1, 0    0, 2

Calculating Bayes-Nash equilibria.
Suppose that players play using the following strategies:

Type profile t = (t11, t2):

                             Player 2
                             q       1 − q
                             L       R
   Player 1   x       T1     1, 0    0, 2
              1 − x   B1     0, 3    1, 0

Type profile t = (t12, t2):

                             Player 2
                             q       1 − q
                             L       R
   Player 1   y       T2     0, 2    1, 1
              1 − y   B2     1, 0    0, 2

which means that player 1 selects T1 with probability x and B1 with probability 1 − x
if the first game is played, and he selects T2 with probability y and B2 with probability
1 − y if the second game is played.
Player 2 selects L with probability q and R with probability 1 − q in both games.
We first show that q can be neither 0 nor 1.
• If q = 1 then player 1’s best response is T1 in the first game and B2 in the
second. But player 2’s best response to this strategy is R (q = 0). It follows
that q = 1 is not a part of a Bayesian equilibrium.
• If q = 0 then player 1’s best response is B1 in the first game and T2 in the
second. But then the best response of player 2 to this strategy is L (q = 1).
Thus the strategy of player 2 in a Bayes equilibrium must be completely mixed,
implying that playing L or R must give him the same payoff.
This implies that
\[
\frac{1}{2}\bigl(x \cdot 0 + (1 - x) \cdot 3\bigr) + \frac{1}{2}\bigl(y \cdot 2 + (1 - y) \cdot 0\bigr)
= \frac{1}{2}\bigl(x \cdot 2 + (1 - x) \cdot 0\bigr) + \frac{1}{2}\bigl(y \cdot 1 + (1 - y) \cdot 2\bigr)
\]
giving
\[
x = \frac{1 + 3y}{5}. \tag{2}
\]
For (xT1 + (1 − x)B1, yT2 + (1 − y)B2) to be a strategy belonging to a Bayes equilibrium
this strategy must be a best response to qL + (1 − q)R.
• If q < 1/2, player 1’s best response is x = 0 and y = 1, which does not satisfy
(2).
• If q > 1/2, player 1’s best response is x = 1 and y = 0, which does not satisfy
(2).
• If q = 1/2, player 1’s payoff is 1/2 whatever strategy he plays, so every
strategy of player 1 is a best response to (1/2)L + (1/2)R.
Thus the Bayes equilibria are all strategy profiles with q = 1/2 for player 2 and
\[
x = \frac{1 + 3y}{5}, \qquad 0 \le y \le 1, \qquad \frac{1}{5} \le x \le \frac{4}{5}
\]
for player 1.
The payoff of player 1 is 1/2 in each Bayes equilibrium; the payoff of player 2 depends
on the strategy of player 1 and is equal to
\[
\frac{1}{2} \cdot 3(1 - x) + \frac{1}{2} \cdot 2y = \frac{12 + y}{10}.
\]
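The computation of Example 2 can be checked numerically. The sketch below is an illustration only: it encodes the two payoff matrices, writes T and B for the top and bottom rows of either game (T1/B1 in the first, T2/B2 in the second), and tests the best-response conditions for a candidate profile (x, y, q); all function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Payoff pairs (player 1, player 2), indexed by (row, column).
G1 = {("T", "L"): (1, 0), ("T", "R"): (0, 2),
      ("B", "L"): (0, 3), ("B", "R"): (1, 0)}   # type profile (t11, t2)
G2 = {("T", "L"): (0, 2), ("T", "R"): (1, 1),
      ("B", "L"): (1, 0), ("B", "R"): (0, 2)}   # type profile (t12, t2)


def u1(game, row, q):
    """Payoff of player 1 from a pure row when player 2 plays L w.p. q."""
    return q * game[(row, "L")][0] + (1 - q) * game[(row, "R")][0]


def u2(col, x, y):
    """Expected payoff of player 2 from a pure column against (x, y)."""
    first = x * G1[("T", col)][1] + (1 - x) * G1[("B", col)][1]
    second = y * G2[("T", col)][1] + (1 - y) * G2[("B", col)][1]
    return HALF * first + HALF * second


def is_equilibrium(x, y, q):
    """Each type of player 1 and player 2 must attain the best pure payoff."""
    for game, prob_top in ((G1, x), (G2, y)):
        top, bottom = u1(game, "T", q), u1(game, "B", q)
        if prob_top * top + (1 - prob_top) * bottom < max(top, bottom):
            return False
    left, right = u2("L", x, y), u2("R", x, y)
    return q * left + (1 - q) * right >= max(left, right)


for y in (Fraction(0), Fraction(1, 3), Fraction(1)):
    x = (1 + 3 * y) / 5          # relation (2)
    print(x, y, is_equilibrium(x, y, HALF), u2("L", x, y))
```

Profiles that violate relation (2), or that use q different from 1/2, fail the same test.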
Example 3. Two players receive a lottery ticket with a number (a prize) from a
finite set W = {w1, . . . , wk}, where w1 < . . . < wk. The probability of receiving
a ticket with the prize wi is p(wi) for each player ($\sum_{i=1}^{k} p(w_i) = 1$) and the two
prizes are independently and identically distributed. Each player chooses a ticket
and learns the prize of his ticket. Then both players are asked simultaneously and
independently if they agree to exchange the tickets. The exchange takes
place only if both players want to exchange; otherwise there is no exchange and everybody
receives the prize according to his own ticket. Model this situation as a
Bayesian game and find the equilibria.
The sets of types are T1 = T2 = W . For player i the probability pi (wj |wm ) that he
assigns to the event that his adversary is of type wj when he himself is of type wm
is p(wj ).
The set of action profiles is A = {(X, Y ) | X, Y ∈ {E, K}} where E means exchange
while K means keep.
The utility mapping for player 1 is
\[
u_1((X, Y), t_2 \mid t_1) =
\begin{cases}
t_2 & \text{if } X = Y = E, \\
t_1 & \text{otherwise.}
\end{cases}
\]
The utility for player 2 is defined in a similar way (just exchange the indices 1 and
2 in the formula above).
Let a1 and a2 be the highest possible prizes at which players 1 and 2, respectively, decide to
exchange with a positive probability. Without loss of generality we can assume that
a1 ≥ a2. Suppose also that a1 > w1. In this case player 1 of type a1 would
exchange knowing that his adversary exchanges only if the adversary's type is ≤ a2 ≤ a1.
If the exchange takes place then player 1 is never better off, and he is strictly
worse off whenever the adversary's prize is below a1. Thus it is better for player 1 of type a1 not to exchange in this case.
Thus the only Bayes-Nash equilibria are those in which a player chooses E with positive probability
only if his type is the lowest possible, w1. For all other types the players are better off
not exchanging.
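The conclusion of Example 3 can be checked by brute force on a small instance. The sketch below is an illustration under assumptions that are not in the notes: the prizes are 1, 2, 3, 4 with uniform probabilities, only pure strategies are enumerated, and a pure strategy is encoded as the set of prizes at which the player agrees to exchange.

```python
from fractions import Fraction
from itertools import combinations

PRIZES = [1, 2, 3, 4]  # hypothetical prizes w1 < w2 < w3 < w4
PROB = {w: Fraction(1, len(PRIZES)) for w in PRIZES}  # assumed uniform


def gain_from_exchange(own, other_set):
    """Expected payoff of E minus payoff of K for a player holding `own`,
    when the adversary agrees to exchange exactly at prizes in other_set."""
    return sum(PROB[s] * (s - own) for s in other_set)


def is_best_response(my_set, other_set):
    """Every type must (weakly) prefer the action that my_set prescribes."""
    for w in PRIZES:
        gain = gain_from_exchange(w, other_set)
        if w in my_set and gain < 0:
            return False
        if w not in my_set and gain > 0:
            return False
    return True


def subsets(items):
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


equilibria = [
    (s1, s2)
    for s1 in subsets(PRIZES)
    for s2 in subsets(PRIZES)
    if is_best_response(s1, s2) and is_best_response(s2, s1)
]
for s1, s2 in equilibria:
    print(sorted(s1), sorted(s2))
```

In this instance every pure equilibrium has both exchange sets contained in {w1}, in line with the argument above.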
Example 4 (More information may hurt). Consider the following Bayesian game.
There are two players N = {1, 2}, and two possible worlds Ω = {ω1, ω2}. The sets
of actions are A1 = {U, D} for player 1 and A2 = {L, M, R} for player 2. Player
1 receives a signal which gives him full information about the world realized
by nature, while player 2 receives the same signal in both possible worlds, i.e. he
has no information about which world was realized but he knows that both worlds
have the same probability 1/2.
The formal presentation is the following. The signal space of player 1 is T1 = {1, 2};
the signal space of player 2 contains just one element and can therefore be omitted.
Player 1’s belief mapping is
\[
p_1(\omega_i \mid j) =
\begin{cases}
1 & \text{if } i = j, \\
0 & \text{otherwise,}
\end{cases}
\]
i.e. he receives signal 1 in the world ω1 and signal 2 in the world ω2.
Player 2’s belief mapping is
\[
p_2(\omega_1) = p_2(\omega_2) = \frac{1}{2},
\]
i.e. for him both worlds have the same probability.
The payoff functions for each of the worlds are given by

World ω1:

            L        M        R
   U        1, 2ε    1, 0     1, 3ε
   D        2, 2     0, 0     0, 3

World ω2:

            L        M        R
   U        1, 2ε    1, 3ε    1, 0
   D        2, 2     0, 3     0, 0

where 0 < ε < 1/2. The intuition is the following. Either the game ω1 or the game
ω2 is chosen by nature, each with probability 1/2. Player 1 receives a signal which
indicates to him without any ambiguity which game will be played. Player 2’s signal
does not reveal any useful information: he does not know whether ω1 or ω2 is played. We
shall write a strategy of player 1 as a pair (x1, x2) where x1 is the action he plays
when he receives signal 1 and x2 is the action he plays when signal 2 is received (we
limit ourselves to deterministic (or pure) strategies). Similarly a pure strategy of
player 2 is any y in {L, M, R} (for player 2 we also consider only pure strategies).
The unique Bayes-Nash equilibrium is ((D, D), L), giving the expected payoff (2, 2).
(To see that this is a Nash equilibrium note that if player 1 knows that player 2
plays L then in both games his best response is to play D. Conversely, if
player 2 knows that player 1 plays D in ω1 and in ω2, then he can calculate that if
he plays L then his expected payoff is (2 + 2)/2 = 2, if he plays M then his expected
payoff is (0 + 3)/2 = 3/2, and finally if he plays R then his expected payoff is
(3 + 0)/2 = 3/2. Clearly playing L is the best option.)
Suppose now that player 2 is fully informed about the state. In ω1 the unique Nash
equilibrium is (U, R) while in ω2 it is (U, M), both giving the payoff (1, 3ε). Since
0 < ε < 1/2 we have 3ε < 3/2 < 2, so the payoff of player 2 is smaller in the second game than in the first one:
it is better for him not to be informed which game is played.
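The claims of Example 4 can be verified by enumerating pure strategies. The sketch below is an illustration under explicit assumptions: it fixes the hypothetical value ε = 1/4, and it assumes that player 2's payoffs in row U carry the factor ε (2ε for L, and 3ε for R in ω1 or for M in ω2), as the condition 0 < ε < 1/2 and the comparison of payoffs require. The function names are not from the notes.

```python
from fractions import Fraction
from itertools import product

EPS = Fraction(1, 4)  # hypothetical value with 0 < EPS < 1/2

# Payoff pairs (player 1, player 2), indexed by (row, column).
GAMES = {
    "w1": {("U", "L"): (1, 2 * EPS), ("U", "M"): (1, 0), ("U", "R"): (1, 3 * EPS),
           ("D", "L"): (2, 2), ("D", "M"): (0, 0), ("D", "R"): (0, 3)},
    "w2": {("U", "L"): (1, 2 * EPS), ("U", "M"): (1, 3 * EPS), ("U", "R"): (1, 0),
           ("D", "L"): (2, 2), ("D", "M"): (0, 3), ("D", "R"): (0, 0)},
}
ROWS, COLS = "UD", "LMR"


def u2_uninformed(plan, col):
    """Expected payoff of player 2 when each world has probability 1/2."""
    return sum(Fraction(1, 2) * GAMES[w][(plan[w], col)][1] for w in GAMES)


def uninformed_equilibria():
    """Pure Bayes-Nash equilibria when only player 1 knows the world."""
    found = []
    for x1, x2, y in product(ROWS, ROWS, COLS):
        plan = {"w1": x1, "w2": x2}
        p1_ok = all(GAMES[w][(plan[w], y)][0] >= GAMES[w][(r, y)][0]
                    for w in GAMES for r in ROWS)
        p2_ok = all(u2_uninformed(plan, y) >= u2_uninformed(plan, c)
                    for c in COLS)
        if p1_ok and p2_ok:
            found.append(((x1, x2), y))
    return found


def informed_equilibria(world):
    """Pure Nash equilibria of a single world when both players know it."""
    g = GAMES[world]
    return [(r, c) for r, c in product(ROWS, COLS)
            if all(g[(r, c)][0] >= g[(r2, c)][0] for r2 in ROWS)
            and all(g[(r, c)][1] >= g[(r, c2)][1] for c2 in COLS)]


print(uninformed_equilibria())
print(informed_equilibria("w1"), informed_equilibria("w2"))
```

With these numbers player 2 obtains 2 when uninformed and 3ε = 3/4 when informed, which illustrates the title of the example.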
The following two exercises are taken from Maschler et al.[1].
Exercise 1. Nicolas would like to sell a company that he owns to Marc. The
company’s true value is an integer between 10 and 12 (including 10 and 12), in
millions of dollars. Marc has to make a take-it-or-leave-it offer, and Nicolas has to
decide whether to accept the offer or to reject it. If Nicolas accepts the offer,
the company is sold, Nicolas’s payoff is the amount he got, and Marc’s payoff is
the difference between the company’s true value and the amount that he paid. If
Nicolas rejects the offer, the company is not sold, Nicolas’s payoff is the value of
the company, and Marc’s payoff is zero. For each one of the following information
structures, describe the situation as a game with incomplete information, and find
all the Bayesian equilibria in the corresponding game. In each case, the description
of the situation is common knowledge among the players. In determining Nicolas’s
action set, note that Nicolas knows what Marc’s offer is when he decides whether
or not to accept the offer.
1. Neither Nicolas nor Marc knows the company’s true value; both ascribe
probability 1/3 to each possible value.
2. Nicolas knows the company’s true value, whereas Marc does not know it, and
ascribes probability 1/3 to each possible value.
3. Marc does not know the company’s worth and ascribes probability 1/3 to each
possible value. Marc further ascribes probability p to the event that Nicolas
knows the value of the company, and probability 1 − p to the event that Nicolas
does not know the value of the company, and instead ascribes probability 1/3 to
each possible value.
Exercise 2. Two or three players are about to play a game: with probability 1/2
the game involves Players 1 and 2 and with probability 1/2 the game involves Players
1, 2 and 3. Players 2 and 3 know which game is being played. In contrast, Player
1, who participates in the game under all conditions, does not know whether he is
playing against Player 2 alone, or against both Players 2 and 3. If the game involves
Players 1 and 2 the game is given by the following matrix, where Player 1 chooses
the row, and Player 2 chooses the column:

            L       R
   T        0, 0    2, 1
   B        2, 1    0, 0

with Player 3 receiving no payoff.
If the game involves all three players, the game is given by the following two matrices,
where Player 1 chooses the row, Player 2 chooses the column, and Player 3 chooses
the matrix:

Matrix W:

            L          R
   T        1, 2, 4    0, 0, 0
   B        0, 0, 0    2, 1, 3

Matrix E:

            L          R
   T        2, 1, 3    0, 0, 0
   B        0, 0, 0    1, 2, 4

(a) What are the states of nature in this game?
(b) Find two Bayesian equilibria in pure strategies.
(c) Find an additional Bayesian equilibrium by identifying a strategy vector in which
all the players of all types are indifferent between their two possible actions.
References
[1] Michael Maschler, Eilon Solan, and Shmuel Zamir. Game Theory. Cambridge
University Press, 2014.