
Multiagent Systems
Solution Concepts for Normal Form (Matrix) Games
© Manfred Huber 2012
Solution Concepts
- Solution concepts define interesting outcomes in a multiagent game, e.g.:
  - Pareto-optimal outcomes
  - Pure-strategy Nash equilibrium
  - Mixed-strategy Nash equilibrium
  - Maxmin strategy profile
  - Minmax strategy profile
- Important questions:
  - Do they exist?
  - How can they be computed?
Zero-Sum Games
- Zero-sum games are games in which the sum of the payoffs of all agents is equal to 0 for every strategy profile.
  - Any game in which an affine transformation of the utility functions of the form u' = k*u + l, k > 0, l ≥ 0 leads to a zero-sum game can itself be treated as a zero-sum game.
  - A 2-player zero-sum game is completely antagonistic: u1 = -u2
Nash Equilibria in 2-Player Zero-Sum Games
- In 2-player zero-sum games, the Nash equilibria are the minmax and maxmin strategy profiles
  - Maxmin for agent 1 is equal to minmax for agent 2
- A solution can be found by searching through the possible sets of support for the mixed strategies
  - Exponential complexity
- The maxmin solution can be formulated as a linear program
Linear Programming
- Linear programs are optimization problems under linear, closed (non-strict) inequality constraints:

    maximize    \sum_{i=1}^{n} w_i x_i
    subject to  \sum_{i=1}^{n} a_{i,j} x_i \le b_j    \forall j = 1, ..., m
                x_i \ge 0                             \forall i = 1, ..., n

  - The x_i are the variables to solve for
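As an illustration (not part of the original slides), a minimal sketch of solving such a linear program with SciPy's linprog; the weights and constraint coefficients below are made-up example values.

```python
import numpy as np
from scipy.optimize import linprog

# Made-up example LP: maximize 3*x1 + 2*x2
# subject to  x1 + x2 <= 4,  x1 + 3*x2 <= 6,  x1, x2 >= 0.
# linprog minimizes, so the objective weights w are negated.
w = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])

res = linprog(c=-w, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print("optimal x:", res.x, "objective:", -res.fun)   # optimum at x = (4, 0), objective 12
```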
2-Player Zero-Sum Game
- In 2-player zero-sum games, maxmin and minmax can be expressed as linear programs
- Maxmin for agent 1 or minmax for agent 2:

    minimize    U_1^*
    subject to  \sum_{k \in A_2} u_1(a_1^j, a_2^k) \, s_2^k \le U_1^*    \forall j \in A_1
                \sum_{k \in A_2} s_2^k = 1
                s_2^k \ge 0    \forall k \in A_2
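A sketch (not part of the original slides) of this LP using scipy.optimize.linprog; the helper name and the matching-pennies test matrix are illustrative. The variables are agent 2's mixed strategy s2 together with the slack value U_1^*; by symmetry, the maximizing LP on the next slide yields agent 1's strategy the same way.

```python
import numpy as np
from scipy.optimize import linprog

def maxmin_value_and_s2(U1):
    """Solve: minimize U1* s.t. sum_k U1[j,k]*s2[k] <= U1* for all j, sum_k s2[k] = 1, s2 >= 0.
    U1[j, k] is agent 1's payoff; rows j are agent 1's actions, columns k are agent 2's."""
    n, m = U1.shape
    c = np.zeros(m + 1)
    c[-1] = 1.0                                   # minimize U1*
    A_ub = np.hstack([U1, -np.ones((n, 1))])      # U1 @ s2 - U1* <= 0, one row per j
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])   # probabilities sum to 1
    bounds = [(0, None)] * m + [(None, None)]     # s2 >= 0, U1* unrestricted
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return res.x[-1], res.x[:m]                   # game value for agent 1, agent 2's strategy

# Matching pennies as an illustrative test: value 0, s2 = (0.5, 0.5).
U1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
print(maxmin_value_and_s2(U1))
```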
2-Player Zero-Sum Game
- Minmax for agent 1 or maxmin for agent 2:

    maximize    U_1^*
    subject to  \sum_{j \in A_1} u_1(a_1^j, a_2^k) \, s_1^j \ge U_1^*    \forall k \in A_2
                \sum_{j \in A_1} s_1^j = 1
                s_1^j \ge 0    \forall j \in A_1
Maxmin Strategies for General Games
- Maxmin strategies can also be computed for general-sum games
  - The maxmin strategy determines the agent's strategy based only on its own utility function
- Computation by re-designing the game (see the sketch below):
  - Construct a zero-sum game G by keeping the agent's payoffs and replacing the other agent's payoff function (with the negation of the agent's payoffs)
  - The maxmin solution for the agent in the modified game G is the same as in the original game
- A maxmin strategy in a general-sum game is not necessarily a Nash equilibrium
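A sketch (not part of the original slides) of this idea: since only agent 1's payoffs matter, the auxiliary zero-sum game G can be handled implicitly by solving the maximizing LP from the previous slides on U1 alone; the function name and the example matrix are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def maxmin_strategy(U1):
    """Agent 1's maxmin strategy in a general-sum game.  Only U1 enters, which is the
    point of the auxiliary zero-sum game G (the other agent's payoffs become -U1):
    maximize U1* s.t. sum_j U1[j,k]*s1[j] >= U1* for all k, sum_j s1[j] = 1, s1 >= 0."""
    n, m = U1.shape
    c = np.zeros(n + 1)
    c[-1] = -1.0                                   # maximize U1*  ->  minimize -U1*
    A_ub = np.hstack([-U1.T, np.ones((m, 1))])     # U1* - sum_j U1[j,k]*s1[j] <= 0
    b_ub = np.zeros(m)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n], res.x[-1]

# Made-up general-sum example: agent 2's payoffs are irrelevant to agent 1's maxmin.
U1 = np.array([[3.0, 0.0],
               [1.0, 2.0]])
print(maxmin_strategy(U1))    # expected: s1 = (0.25, 0.75), guaranteed value 1.5
```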
Nash Equilibria for 2-Player Non-Zero-Sum Games
- For non-zero-sum games, linear programming no longer works because there is no single objective function
- The complexity of computing Nash equilibria in general-sum games is unknown
  - It is assumed to be worst-case exponential
- 2-player general-sum games can be formulated as Linear Complementarity Problems (LCPs)
  - An LCP is a constraint satisfaction problem without an objective function
  - The formulation considers both players' utility functions
LCP Formulation for 2-Player Non-Zero-Sum Games
- The LCP combines the constraints of both agents and adds complementarity constraints:

    \sum_{k \in A_2} u_1(a_1^j, a_2^k) \, s_2^k + r_1^j = U_1^*    \forall j \in A_1
    \sum_{j \in A_1} u_2(a_1^j, a_2^k) \, s_1^j + r_2^k = U_2^*    \forall k \in A_2
    \sum_{k \in A_2} s_2^k = 1 ,    \sum_{j \in A_1} s_1^j = 1
    s_2^k \ge 0 ,  s_1^j \ge 0            \forall k \in A_2 ,  \forall j \in A_1
    r_2^k \ge 0 ,  r_1^j \ge 0            \forall k \in A_2 ,  \forall j \in A_1
    r_2^k s_2^k = 0 ,  r_1^j s_1^j = 0    \forall k \in A_2 ,  \forall j \in A_1
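As an illustration (not part of the original slides), a sketch of checking whether a candidate pair of mixed strategies satisfies these LCP conditions, taking U_1^* and U_2^* to be the expected payoffs; the function name and the matching-pennies test are illustrative.

```python
import numpy as np

def satisfies_lcp(U1, U2, s1, s2, tol=1e-9):
    """Check the LCP conditions above for a candidate (s1, s2).
    U1, U2: payoff matrices (rows = agent 1's actions, columns = agent 2's actions)."""
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    U1_star = s1 @ U1 @ s2                 # agent 1's expected payoff
    U2_star = s1 @ U2 @ s2                 # agent 2's expected payoff
    r1 = U1_star - U1 @ s2                 # slack of each pure strategy of agent 1
    r2 = U2_star - U2.T @ s1               # slack of each pure strategy of agent 2
    return (abs(s1.sum() - 1) < tol and abs(s2.sum() - 1) < tol
            and (s1 > -tol).all() and (s2 > -tol).all()                      # probabilities
            and (r1 > -tol).all() and (r2 > -tol).all()                      # nonnegative slacks
            and (abs(r1 * s1) < tol).all() and (abs(r2 * s2) < tol).all())   # complementarity

# Matching pennies as an illustrative test: the uniform pair is the unique equilibrium.
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
U2 = -U1
print(satisfies_lcp(U1, U2, [0.5, 0.5], [0.5, 0.5]))   # True
print(satisfies_lcp(U1, U2, [1.0, 0.0], [1.0, 0.0]))   # False
```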
Nash Equilibria for 2-Player Non-Zero-Sum Games
- The LCP for 2-player games can be solved, e.g., using the Lemke-Howson algorithm
  - Moves along the edges of a labeled graph in strategy space
  - Worst-case exponential
- Another solution is to search the space of supports (sketched below):
  - Determine which sets of actions could yield a mixed Nash equilibrium
  - Compute the corresponding probabilities and verify that the result is an equilibrium
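A sketch (not part of the original slides) of support enumeration for 2-player games, assuming small action sets and a nondegenerate game (equal-size supports); for each support pair it solves the indifference equations and then verifies the equilibrium conditions. All names are illustrative.

```python
import numpy as np
from itertools import combinations

def support_enumeration(U1, U2, tol=1e-9):
    """Yield mixed Nash equilibria (s1, s2) of a small 2-player general-sum game."""
    n, m = U1.shape
    supports = lambda size: (c for k in range(1, size + 1) for c in combinations(range(size), k))
    for S1 in supports(n):
        for S2 in supports(m):
            if len(S1) != len(S2):                 # nondegenerate case: square systems only
                continue
            k = len(S1)
            # Agent 1 indifferent over S1: solve for s2 on S2 and the common payoff u1.
            A = np.zeros((k + 1, k + 1)); b = np.zeros(k + 1)
            A[:k, :k] = U1[np.ix_(S1, S2)]; A[:k, k] = -1.0
            A[k, :k] = 1.0; b[k] = 1.0
            # Agent 2 indifferent over S2: solve for s1 on S1 and the common payoff u2.
            B = np.zeros((k + 1, k + 1)); d = np.zeros(k + 1)
            B[:k, :k] = U2[np.ix_(S1, S2)].T; B[:k, k] = -1.0
            B[k, :k] = 1.0; d[k] = 1.0
            try:
                x, y = np.linalg.solve(A, b), np.linalg.solve(B, d)
            except np.linalg.LinAlgError:
                continue
            s2_S, u1 = x[:k], x[k]
            s1_S, u2 = y[:k], y[k]
            if (s1_S < -tol).any() or (s2_S < -tol).any():   # probabilities must be nonnegative
                continue
            s1 = np.zeros(n); s1[list(S1)] = s1_S
            s2 = np.zeros(m); s2[list(S2)] = s2_S
            # Verify: no profitable deviation to any pure strategy outside the support.
            if (U1 @ s2 <= u1 + tol).all() and (U2.T @ s1 <= u2 + tol).all():
                yield s1, s2

# Illustrative battle-of-the-sexes style game: two pure and one mixed equilibrium.
U1 = np.array([[2.0, 0.0], [0.0, 1.0]])
U2 = np.array([[1.0, 0.0], [0.0, 2.0]])
for s1, s2 in support_enumeration(U1, U2):
    print(s1, s2)
```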
Domination and Strategy Removal
- Strategies of a player can be related by domination:
  - s_i strictly dominates s_i' for player i if
      \forall s_{-i} \in S_{-i} :  u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})
  - s_i weakly dominates s_i' for player i if
      \forall s_{-i} \in S_{-i} :  u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})   and
      \exists s_{-i} \in S_{-i} :  u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})
  - s_i very weakly dominates s_i' for player i if
      \forall s_{-i} \in S_{-i} :  u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})
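As an illustration (not part of the original slides), a sketch of these three tests restricted to pure strategies of the row player, given only its payoff matrix; the function name and example call are illustrative.

```python
import numpy as np

def dominates(U1, j, j_prime, mode="strict"):
    """Does row strategy j dominate row strategy j_prime for the row player?
    U1[j, k] is the row player's payoff; columns k range over the opponent's strategies."""
    diff = U1[j, :] - U1[j_prime, :]        # payoff difference against every opponent choice
    if mode == "strict":
        return bool((diff > 0).all())
    if mode == "weak":
        return bool((diff >= 0).all() and (diff > 0).any())
    if mode == "very_weak":
        return bool((diff >= 0).all())
    raise ValueError(mode)

# Row-player payoffs of the example game on the following slides (rows U, M, D; columns L, C, R).
U1 = np.array([[3.0, 0.0, 0.0],
               [1.0, 1.0, 5.0],
               [0.0, 4.0, 0.0]])
print(dominates(U1, 0, 2, "strict"))        # U vs. D: False (D is better against C)
```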
Domination and Nash Equilibria
- A dominant strategy is a strategy that dominates all others
- A strategy profile consisting of dominant strategies for all players must be a Nash equilibrium
Iterated Removal of Dominated Strategies
- No strategy played in an equilibrium can be strictly dominated by another strategy
  - All strictly dominated strategies can therefore be removed while still maintaining the solutions of the game
- Iterated removal of dominated strategies repeats the removal process until no further dominated strategies remain
Iterated Removal of Dominated Strategies
- Example (rows are player 1's actions U, M, D; columns are player 2's actions L, C, R; entries list player 1's payoff, player 2's payoff):

         L        C        R
    U    3, 1     0, 1     0, 0
    M    1, 1     1, 1     5, 0
    D    0, 1     4, 1     0, 0

  - R is strictly dominated by L for player 2 and can be removed
  - In the reduced game, M is strictly dominated by the mixed strategy [0.5: U; 0.5: D] for player 1 and can be removed (see the sketch below)
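A sketch (not part of the original slides) of iterated removal of strictly dominated strategies, using a small LP to test whether a pure strategy is strictly dominated by some mixed strategy over the remaining actions; on the example game above it removes R and then M. All names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_dominated(U, row):
    """Is pure strategy `row` strictly dominated by a mixture sigma of the other rows of U?
    LP: maximize eps  s.t.  sum_j sigma[j]*U[j,k] >= U[row,k] + eps  for every column k,
    sum_j sigma[j] = 1, sigma >= 0; strictly dominated iff the optimal eps is positive."""
    others = [j for j in range(U.shape[0]) if j != row]
    n_cols = U.shape[1]
    c = np.zeros(len(others) + 1); c[-1] = -1.0            # variables: sigma, eps; maximize eps
    A_ub = np.hstack([-U[others, :].T, np.ones((n_cols, 1))])
    b_ub = -U[row, :]
    A_eq = np.hstack([np.ones((1, len(others))), np.zeros((1, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * len(others) + [(None, None)])
    return res.status == 0 and res.x[-1] > 1e-9

def iterated_removal(U1, U2):
    """Repeatedly remove strictly dominated rows (player 1) and columns (player 2)."""
    rows, cols = list(range(U1.shape[0])), list(range(U1.shape[1]))
    changed = True
    while changed:
        changed = False
        for j in rows[:]:
            if strictly_dominated(U1[np.ix_(rows, cols)], rows.index(j)):
                rows.remove(j); changed = True; break
        for k in cols[:]:
            if strictly_dominated(U2[np.ix_(rows, cols)].T, cols.index(k)):
                cols.remove(k); changed = True; break
    return rows, cols

U1 = np.array([[3., 0., 0.], [1., 1., 5.], [0., 4., 0.]])   # player 1 (rows U, M, D)
U2 = np.array([[1., 1., 0.], [1., 1., 0.], [1., 1., 0.]])   # player 2 (columns L, C, R)
print(iterated_removal(U1, U2))   # expected: rows [0, 2] (U, D) and columns [0, 1] (L, C)
```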
Iterated Removal of Dominated Strategies
- Iterated removal preserves Nash equilibria
  - Strict dominance preserves all equilibria
  - Weak or very weak dominance preserves at least one equilibrium
  - The removal order can influence which equilibria are preserved
- Iterated removal can be used as a preprocessing step for Nash equilibrium calculation
Correlated Equilibria
- In coordination games in particular, the Nash equilibrium does not achieve very good performance
  - The equilibrium results in strategies that often end up in low-payoff outcomes

              go            wait
    go        -100, -100    10, 0
    wait      0, 10         -10, -10

- What could solve this?
Correlated Equilibria
- A solution is to coordinate the random choices of the players
  - The coordination has to be outside the control and insight of each player, or they could change their strategy
  - Use a central random variable with a common, known distribution and a private signal to each of the players
    - Each signal is correlated with the other signals but does not determine the other players' signals
Correlated Equilibria
- A correlated equilibrium is a tuple (v, π, σ) where
  - v is a vector of random variables with domains D
  - π is the joint distribution of v
  - σ is a vector of mappings σ_i from D_i to actions in A_i
  - for every agent i and every alternative mapping σ_i':

    \sum_{d \in D} \pi(d) \, u_i(\sigma_1(d_1), \dots, \sigma_i(d_i), \dots, \sigma_n(d_n))
        \ge \sum_{d \in D} \pi(d) \, u_i(\sigma_1(d_1), \dots, \sigma_i'(d_i), \dots, \sigma_n(d_n))
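A sketch (not part of the original slides) of checking this condition for the go/wait game shown earlier, assuming a fair "traffic light" device that recommends (go, wait) or (wait, go) with probability 0.5 each, and where each player follows its private signal (σ_i is the identity). Names and the device distribution are illustrative.

```python
import itertools

ACTIONS = ["go", "wait"]
# Payoffs of the go/wait game from the earlier slide, indexed by the joint action (a1, a2).
u = [{("go", "go"): -100, ("go", "wait"): 10, ("wait", "go"): 0, ("wait", "wait"): -10},   # player 1
     {("go", "go"): -100, ("go", "wait"): 0, ("wait", "go"): 10, ("wait", "wait"): -10}]   # player 2

# Assumed correlating device: a fair traffic light recommending exactly one player to go.
pi = {("go", "wait"): 0.5, ("wait", "go"): 0.5}

def is_correlated_eq(pi, u):
    """Check that no player gains by applying some deviation map (signal -> action)."""
    for i in range(2):
        follow = sum(p * u[i][d] for d, p in pi.items())
        for dev in itertools.product(ACTIONS, repeat=len(ACTIONS)):
            dev_map = dict(zip(ACTIONS, dev))
            deviate = sum(p * u[i][tuple(dev_map[d[j]] if j == i else d[j] for j in range(2))]
                          for d, p in pi.items())
            if deviate > follow + 1e-9:
                return False
    return True

print(is_correlated_eq(pi, u))   # True: following the traffic light is a correlated equilibrium
```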
Correlated Equilibria
- For every Nash equilibrium there exists a corresponding correlated equilibrium
  - If the joint distribution is replaced by the players' decoupled probabilistic choices and each mapping is reduced to the action choice indicated by its domain value, the correlated equilibrium reduces to a Nash equilibrium
- Not every correlated equilibrium is a Nash equilibrium
Computing Correlated Equilibria
- Linear programming constraints for a CE, with p(a) the probability of joint action a:

    \sum_{a \in A \mid a_i \in a} p(a) \, u_i(a) \ge \sum_{a \in A \mid a_i \in a} p(a) \, u_i(a_i', a_{-i})    \forall i \in N, \forall a_i, a_i' \in A_i
    p(a) \ge 0    \forall a \in A
    \sum_{a \in A} p(a) = 1

- Objective function: e.g. social welfare

    maximize  \sum_{a \in A} \Big( p(a) \sum_{i \in N} u_i(a) \Big)

  - The result is not necessarily a Nash equilibrium
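A sketch (not part of the original slides) of this LP for a 2-player game, maximizing social welfare over the joint distribution p; it is built with scipy.optimize.linprog and uses the go/wait payoffs from the earlier slide. Names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def social_welfare_ce(U1, U2):
    """Correlated equilibrium maximizing social welfare in a 2-player game (a sketch).
    Variables: the joint distribution p[j, k] over action profiles, flattened row-major."""
    n, m = U1.shape
    idx = lambda j, k: j * m + k
    A_ub, b_ub = [], []
    # Player 1: for every recommended row j and deviation j', following is at least as good.
    for j in range(n):
        for jp in range(n):
            if jp == j:
                continue
            row = np.zeros(n * m)
            for k in range(m):
                row[idx(j, k)] = U1[jp, k] - U1[j, k]
            A_ub.append(row); b_ub.append(0.0)
    # Player 2: for every recommended column k and deviation k'.
    for k in range(m):
        for kp in range(m):
            if kp == k:
                continue
            row = np.zeros(n * m)
            for j in range(n):
                row[idx(j, k)] = U2[j, kp] - U2[j, k]
            A_ub.append(row); b_ub.append(0.0)
    c = -(U1 + U2).reshape(-1)                       # maximize total payoff
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                  A_eq=np.ones((1, n * m)), b_eq=[1.0], bounds=[(0, None)] * (n * m))
    return res.x.reshape(n, m), -res.fun

# go/wait game from the earlier slide: the mass ends up on the two coordinated outcomes.
U1 = np.array([[-100., 10.], [0., -10.]])    # row player
U2 = np.array([[-100., 0.], [10., -10.]])    # column player
print(social_welfare_ce(U1, U2))
```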
Computing Correlated Equilibria
n 
CE are easier to compute than Nash Equilibria
n 
n 
Only one randomization in CE and product of
independent probabilities in NE
Constraints for NE:
$
'
$
'
- && ui (a) # p j (a j ))) * - && ui (ai ',a+i ) # p j (a j )))
a "A |a i "a %
j "N
( a "A |a' i "a %
j "N \{i}
(
p(a) * 0
,i " N, ,ai '" Ai
- p(a) = 1
a "A
This is a nonlinear constraint – No linear program
©
!Manfred Huber 2012
Computing Nash Equilibria
- No algorithm is known that computes Nash equilibria for n-player general-sum games in polynomial time
  - A number of iterative algorithms provide good approximations
  - Multiagent learning can lead to efficient approximate solutions