Approximate Nash Equilibrium Computation

Approximate Nash Equilibrium Computation
Paul W. Goldberg
Department of Computer Science
University of Oxford, U.K.
Bristol Algorithms Days
3rd Feb. 2016
Goldberg
Approximate Nash Equilibrium Computation
The computational challenge
Input: payoff matrices R, C of bimatrix game G
Output: a Nash equilibrium of G
Centralised. No “strategic data”. I don’t care about social welfare.
Two key (annoying?) features
PPAD-complete ...?
quasi-polytime (but not polytime) algorithm
known for approximate version
Unusual answers for both exact and approx
versions... Coincidence??
similar results for other classes of games.
What’s wrong with NP-completeness?
NP-complete problems have yes-instances and no-instances...
In searching for a Nash equilibrium, every instance (game) is a
yes-instance! Every game has one (Nash 1951), and a suggested
NE is easy to check.
reduce from (say) SAT to NASH: what should happen to the
no-instances?
It’s conceivable some other “NASH is as hard as NP” proof could exist...
TFNP: total function computation in NP
NASH ∈ TFNP:
“TF”: every game has an outcome
“NP”: a “transparent”, easily-checkable outcome
Best-known “hard” TFNP problem: FACTORING — given a
number, output its prime factorisation; hardness needed for much
crypto
Hard TFNP problems: an unhappy family
Happy families are all alike; every unhappy
family is unhappy in its own way.
— Leo Tolstoy
For our purposes:
NP-complete problems are all alike; every hard
TFNP problem is hard in its own way.
— don’t quote me
Work in progress on this...
PPAD: a happy subfamily of TFNP
END OF THE LINE (informally): given a start of line, find an end
Papadimitriou, C.H.: On the Complexity of the Parity
Argument and Other Inefficient Proofs of Existence, J.
Comput. Syst. Sci. (1994)
A possible solution (again, informally)
More formally, let’s model a queue as a directed graph where each
node has at most one incoming arc and at most one outgoing
arc; given a source, find another source or a sink.
END OF THE LINE
Circuits Succ and Pred; n inputs, n outputs; graph on 2^n vertices
with arc from u to v iff Succ(u) = v and Pred(v) = u
Given that 0 has a successor but no predecessor, find another vertex of
degree 1.
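A minimal sketch of the definition above, assuming Succ and Pred are ordinary Python functions on small integers rather than circuits. The names `has_arc`, `follow_line`, `succ` and `pred` are illustrative, not from the talk. The walk simply follows the line from vertex 0; it can take up to 2^n steps, which is why this naive approach is not efficient.

```python
from typing import Callable

Vertex = int
Circuit = Callable[[Vertex], Vertex]


def has_arc(succ: Circuit, pred: Circuit, u: Vertex, v: Vertex) -> bool:
    """Arc u -> v exists iff Succ(u) = v and Pred(v) = u (self-loops ignored)."""
    return u != v and succ(u) == v and pred(v) == u


def follow_line(succ: Circuit, pred: Circuit, start: Vertex = 0) -> Vertex:
    """Walk forward from a source until reaching a vertex with no outgoing arc.

    Assumes `start` has no predecessor; since every vertex has in-degree at
    most one, the walk cannot revisit a vertex and therefore terminates.
    """
    u = start
    while True:
        v = succ(u)
        if not has_arc(succ, pred, u, v):
            return u  # u has an incoming arc but no outgoing one: a sink
        u = v


# Hypothetical toy instance on 3-bit vertices: line 0 -> 1 -> 2 -> 3 -> 4,
# with vertices 5, 6, 7 isolated (they map to themselves).
def succ(u: Vertex) -> Vertex:
    return u + 1 if u < 4 else u


def pred(v: Vertex) -> Vertex:
    return v - 1 if 1 <= v <= 4 else v


end_of_line = follow_line(succ, pred)
```

Here `end_of_line` is vertex 4, the other degree-1 vertex of the toy graph.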
stands for “Polynomial Parity Argument on a graph, Directed
version”
Papadimitriou, C.H.: On the Complexity of the Parity
Argument and Other Inefficient Proofs of Existence, J.
Comput. Syst. Sci. (1994)
PPA: same sort of thing, but undirected graph
As it happens, FACTORING belongs to PPA, related to PPAD...
Emil Jeřábek: Integer factoring and modular square roots
J. Comput. Syst. Sci., to appear
suggestive —but only suggestive— that PPAD is hard
Digression: oracle model of PPAD assumes query access to
functions Succ and Pred: 2^n → 2^n. Query complexity of search for
a solution is poly in the circuit model but not in the oracle model.
There are oracle separation results for PPAD and other subclasses
of TFNP
From NASH to ε-NASH:
Bounded rationality fixes irrationality
With 3 players, NE may have irrational values (Nash ’51), and in
general, for any k > 2 players, n strategies, algebraic degree of
values may be exponential in n... also for graphical games
ε-Nash equilibrium
Incentive to deviate is at most ε (rather than no incentive to
deviate); solution can have values that are multiples of ε/kn ∈ Q.
To be meaningful, assume payoffs in some bounded range, usually [0, 1].
Negative (hardness) results carry over to exact NE (useful for first
PPAD-hardness results)
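To make the definition concrete, here is a small sketch that measures each player's incentive to deviate in a bimatrix game (R, C) with payoffs in [0, 1]. The function names and the matching-pennies example are assumptions made for illustration, not part of the talk.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Strategy = List[float]


def regrets(R: Matrix, C: Matrix, x: Strategy, y: Strategy) -> Tuple[float, float]:
    """Return (row regret, column regret): best-response payoff minus actual payoff."""
    # Expected payoff of each pure row against y, and each pure column against x.
    row_vals = [sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    col_vals = [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(len(y))]
    row_actual = sum(x[i] * row_vals[i] for i in range(len(x)))
    col_actual = sum(y[j] * col_vals[j] for j in range(len(y)))
    return max(row_vals) - row_actual, max(col_vals) - col_actual


def is_eps_nash(R: Matrix, C: Matrix, x: Strategy, y: Strategy,
                eps: float, tol: float = 1e-12) -> bool:
    """(x, y) is an eps-NE iff neither player can gain more than eps by deviating."""
    return max(regrets(R, C, x, y)) <= eps + tol


# Hypothetical example: matching pennies rescaled to payoffs in [0, 1].
R = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.0, 1.0], [1.0, 0.0]]
uniform = [0.5, 0.5]
pure = [1.0, 0.0]
```

With these inputs, the uniform profile has zero regret for both players (an exact NE), while the pure profile leaves the column player a regret of 1, so it is not an ε-NE for any ε < 1.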
ε-NASH versus ε-Well-Supported NASH
ε-NASH: average payoff is worse than best-response by at most ε; but player may do much worse, with low probability
ε-WSNE (stronger!): anything a player does with positive
probability pays at most ε less than best-response.
The support of a probability distribution is the set of events
that get non-zero probability: for a mixed strategy, all the
pure strategies that may get chosen.
i.e. anything in the support of a player’s mixed strategy is within ε of best
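The gap between the two notions shows up in a tiny example. The sketch below computes, for each player, the average regret (the ε-NE quantity) and the worst regret over the support (the ε-WSNE quantity). The function name and the 2 × 1 game are hypothetical, chosen only to illustrate the "much worse, with low probability" remark.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Strategy = List[float]
Pair = Tuple[float, float]


def support_regrets(R: Matrix, C: Matrix, x: Strategy, y: Strategy) -> Tuple[Pair, Pair]:
    """Return ((avg_row, avg_col), (ws_row, ws_col)).

    avg_*: best-response payoff minus expected payoff (eps-NE measure).
    ws_*:  best-response payoff minus the worst pure strategy played with
           positive probability (eps-WSNE measure).
    """
    row_vals = [sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    col_vals = [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(len(y))]
    avg = (max(row_vals) - sum(x[i] * row_vals[i] for i in range(len(x))),
           max(col_vals) - sum(y[j] * col_vals[j] for j in range(len(y))))
    ws = (max(max(row_vals) - row_vals[i] for i in range(len(x)) if x[i] > 0),
          max(max(col_vals) - col_vals[j] for j in range(len(y)) if y[j] > 0))
    return avg, ws


# Hypothetical game: the row player puts 10% weight on a strategy paying 0
# when the other strategy pays 1; the column player has a single strategy.
R = [[1.0], [0.0]]
C = [[0.0], [0.0]]
x = [0.9, 0.1]
y = [1.0]
avg, ws = support_regrets(R, C, x, y)
```

Here the profile is roughly a 0.1-NE (average regret about 0.1) but only a 1-WSNE, since the bad strategy in the support is a full unit worse than the best response.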
A good start: ε = 1/2 in poly time
Daskalakis, Mehta, and Papadimitriou: A note on
approximate Nash equilibria. WINE’07; TCS 2009
[Figure: example pair of payoff matrices with entries in [0, 1], annotated with the probabilities chosen in the steps below]
1. Player 1 chooses arbitrary strategy i; gives it probability 1/2.
2. Player 2 chooses best response j to i; gives it probability 1.
3. Player 1 finds best response k to j; gives it probability 1/2.
They also find 5/6-WSNE in poly-time...
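The three steps translate almost directly into code. The sketch below is an illustration under stated assumptions: payoffs lie in [0, 1], ties are broken by lowest index, and the function names and example matrices are hypothetical (they are not the matrices from the slide figure). The guarantee follows because player 1 plays a best response to j half the time, and player 2's j is a best response to the half of player 1's weight that sits on i.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Strategy = List[float]


def dmp_half_approx(R: Matrix, C: Matrix, i: int = 0) -> Tuple[Strategy, Strategy]:
    """Return a 1/2-approximate NE (x, y) following the three steps above."""
    m, n = len(R), len(R[0])
    j = max(range(n), key=lambda col: C[i][col])   # step 2: best response to row i
    k = max(range(m), key=lambda row: R[row][j])   # step 3: best response to column j
    x = [0.0] * m
    x[i] += 0.5                                    # step 1: arbitrary row, weight 1/2
    x[k] += 0.5                                    # weight 1/2 on k (x[i] = 1 if k == i)
    y = [0.0] * n
    y[j] = 1.0
    return x, y


def max_regret(R: Matrix, C: Matrix, x: Strategy, y: Strategy) -> float:
    """Largest gain either player could obtain by deviating to a pure strategy."""
    row_vals = [sum(R[a][b] * y[b] for b in range(len(y))) for a in range(len(x))]
    col_vals = [sum(x[a] * C[a][b] for a in range(len(x))) for b in range(len(y))]
    row_regret = max(row_vals) - sum(x[a] * row_vals[a] for a in range(len(x)))
    col_regret = max(col_vals) - sum(y[b] * col_vals[b] for b in range(len(y)))
    return max(row_regret, col_regret)


# Hypothetical 3 x 3 game with payoffs in [0, 1].
R = [[0.2, 0.9, 0.1], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8]]
C = [[0.2, 0.1, 0.2], [0.2, 0.2, 0.8], [0.1, 0.0, 0.2]]
x, y = dmp_half_approx(R, C, i=0)
```

On this example the procedure picks j = 0 and k = 2, so player 1 mixes rows 0 and 2 equally and player 2 plays column 0; the resulting regret is well below the 1/2 bound.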
Computing ε-NE: the key facts
fact card
1. For ε > 0, support size of ε-NE is O(log n) (Althöfer; Lipton
et al); for ε < 1/2 support is Ω(log n) (Feder et al)
2. For any fixed ε > 0, complexity is n^{O(log n)}; gives us hope for
poly-time approx’n scheme PTAS (LMM ’03)
3. PTAS for ε-NE can be turned into a PTAS for
ε-well-supported-NE (DGP’09; CDT’09) (kick out strategies
from ε-NE that pay less than best-response − ε/2). But, ε-WSNE
requires ε^2/8-NE. (E.g., 3/4-WSNE needs approx 0.07-NE)
4. but, best ε for NE is just over 1/3, for WSNE, just under 2/3
Althöfer: On sparse approximations to randomized
strategies and convex combinations. Linear Algebra and its
Applications 1994
Lipton, Markakis, and Mehta: Playing Large Games using
Simple Strategies. EC, ’03
Feder, Nazerzadeh and Saberi: Approximating Nash
Equilibria using Small-Support Strategies. EC (2007)
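As a worked instance of the ε^2/8 relation quoted on the fact card, taking that relation at face value (the subscripts WSNE and NE are labels introduced here for readability):

```latex
\[
\epsilon_{\mathrm{WSNE}} = \tfrac{3}{4}
\quad\Longrightarrow\quad
\epsilon_{\mathrm{NE}} = \frac{\epsilon_{\mathrm{WSNE}}^{2}}{8}
= \frac{(3/4)^{2}}{8} = \frac{9}{128} \approx 0.0703 .
\]
\[
\text{Conversely, } \epsilon_{\mathrm{NE}} \approx 0.3393
\quad\Longrightarrow\quad
\epsilon_{\mathrm{WSNE}} = \sqrt{8\,\epsilon_{\mathrm{NE}}}
\approx \sqrt{2.714} \approx 1.65 > 1 .
\]
```

The second line suggests why this conversion does not help at current constants: with payoffs in [0, 1], a guarantee above 1 is vacuous, so the best known WSNE bounds have to be obtained by separate algorithms.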
ε-Nash equilibrium in quasi-poly time
Given: n × n game...
Let 𝒩 be a Nash equilibrium. (mixed: in general a probability
distribution)
Draw N samples from 𝒩; let 𝒩̂ be the uniform distribution over
these samples
Empirical payoffs converge to payoffs arising from 𝒩...
How big does N need to be for uniform convergence to within
additive ε? O(log n / ε^2)! So, 𝒩̂ is an ε-NE with support size
O(log n)
𝒩̂ can be found by support enumeration in time n^{O(log n)}; also
works for k (constant) players
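The sampling step can be sketched as follows. This is an illustration only: it samples from a given mixed strategy `y` (standing in for one player's side of the equilibrium), builds the empirical distribution, and measures how far any row's expected payoff moves. The default constant `c = 12` in `samples_needed` is the value usually quoted from LMM ’03 and should be treated as an assumption; all names and the random test game are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List

Matrix = List[List[float]]
Strategy = List[float]


def samples_needed(n: int, eps: float, c: float = 12.0) -> int:
    """Sample count of the form c * ln(n) / eps^2 (the constant c is an assumption)."""
    return math.ceil(c * math.log(n) / eps ** 2)


def empirical_strategy(p: Strategy, num_samples: int, rng: random.Random) -> Strategy:
    """Draw num_samples pure strategies from p; return the empirical distribution."""
    draws = rng.choices(range(len(p)), weights=p, k=num_samples)
    counts = Counter(draws)
    return [counts[i] / num_samples for i in range(len(p))]


def max_payoff_error(R: Matrix, y: Strategy, y_hat: Strategy) -> float:
    """Largest change in any row's expected payoff when y is replaced by y_hat."""
    return max(abs(sum(row[j] * (y[j] - y_hat[j]) for j in range(len(y))))
               for row in R)


# Hypothetical experiment: random 50 x 50 payoffs in [0, 1], uniform y.
rng = random.Random(0)
n = 50
R = [[rng.random() for _ in range(n)] for _ in range(n)]
y = [1.0 / n] * n
num_samples = 200
y_hat = empirical_strategy(y, num_samples, rng)
err = max_payoff_error(R, y, y_hat)
```

Every probability in `y_hat` is a multiple of 1/N, which is what makes enumeration over all such "N-uniform" strategies possible. For small n the bound from `samples_needed` exceeds n itself, so the statement is only interesting asymptotically.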
Is there a PTAS? ← the big question
Poly-time for constant ε is unsatisfying; PTAS would be redemptive
progress on additive ε-NE
[Figure: timeline of progress on ε, with a vertical axis from ε = 1 down to ε = 0 and a horizontal axis of years from 2006 to 2014 ("Now"). Well-supported results in blue; comm-bounded results in red. Labels in the plot: DMP, GP, KPP, FGSS, KS, CDFFJS, CFJ, BBM, TS. Values marked: 5/6, 3/4, 0.732, 2/3, 0.6619, 0.6608, 0.6528, 1/2 + δ, 0.5, 0.437 (symmetric only), 0.38197+ζ, 0.36392, 0.3393.]
constant support size not enough for ε < 1/2:
consider random zero-sum win-lose games of size n × n:
[Figure: n × n matrix of random 0/1 entries, built up over several steps; the last step marks probability 1/n on player 1's strategies]
With high probability, for any pure strategy by player 1, player 2 can “win”
Indeed, as n increases, this is true if player 1 may mix 2 of his strategies
or indeed, any constant number of strategies
But, for large n, player 1 can guarantee a payoff of about 1/2 by randomizing
over his strategies (w.h.p., as n increases)
Feder, Nazerzadeh and Saberi: Approximating Nash
Equilibria using Small-Support Strategies. EC (2007)
How big a support do you need?
If fewer than log(n) strategies are used by player 1, there is a high
probability that player 2 can win...
Hence Ω(log(n)) is a lower bound on support size needed.
Matches O(log(n)) upper bound.
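One way to see the lower bound is a union-bound calculation. The sketch below assumes the entries are independent fair coin flips (the talk only says "random") and uses the threshold k ≤ (1/2) log_2 n for convenience; an entry of 1 means player 1 wins.

```latex
\begin{align*}
&\text{Fix a set } S \text{ of } k \text{ rows. A given column is all-zero on } S
  \text{ with probability } 2^{-k}. \\
&\Pr[\text{no column is all-zero on } S] = (1 - 2^{-k})^{n} \le e^{-n 2^{-k}}. \\
&\Pr[\exists\, S,\ |S| = k,\ \text{with no all-zero column}]
  \le \binom{n}{k} e^{-n 2^{-k}} \le \exp\!\left(k \ln n - n 2^{-k}\right). \\
&\text{If } k \le \tfrac{1}{2}\log_2 n \text{ then } n 2^{-k} \ge \sqrt{n}, \text{ so the bound is at most }
  \exp\!\left(\tfrac{1}{2}\log_2 n \cdot \ln n - \sqrt{n}\right) \to 0 .
\end{align*}
```

So with high probability every set of at most (1/2) log_2 n rows has a column on which player 1 loses outright, giving payoff 0 against a game value of about 1/2; this is consistent with the Ω(log n) support bound for ε < 1/2.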
ε-Well-Supported NE
The KS algorithm (for 2/3-WSNE of game (R, C)):
1. look for pure profiles that pay each player ≥ 1/3
2. If we find one, use it
3. else solve the zero-sum game (R − C, C − R); use resulting profile
Kontogiannis and Spirakis: Well supported approximate
equilibria in bimatrix games. Algorithmica (2010)
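Steps 1 and 2 of the KS algorithm are easy to sketch; step 3 needs a zero-sum (linear programming) solver and is deliberately left out here. The code assumes payoffs in [0, 1], and the function names and example matrices are hypothetical. The reason a pure profile paying each player at least 1/3 suffices is that no deviation can pay more than 1, so each player's regret is at most 2/3.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def find_good_pure_profile(R: Matrix, C: Matrix,
                           threshold: float = 1 / 3) -> Optional[Tuple[int, int]]:
    """Step 1: return the first pure profile (i, j) paying both players >= threshold."""
    for i, (r_row, c_row) in enumerate(zip(R, C)):
        for j, (r, c) in enumerate(zip(r_row, c_row)):
            if r >= threshold and c >= threshold:
                return i, j
    return None  # caller would fall back to step 3 (not implemented in this sketch)


def pure_ws_regret(R: Matrix, C: Matrix, i: int, j: int) -> float:
    """Well-supported regret of the pure profile (i, j): the larger best-response gap."""
    row_regret = max(R[a][j] for a in range(len(R))) - R[i][j]
    col_regret = max(C[i]) - C[i][j]
    return max(row_regret, col_regret)


# Hypothetical 2 x 2 game with payoffs in [0, 1].
R = [[0.4, 0.2], [1.0, 0.0]]
C = [[0.5, 0.9], [0.1, 0.3]]
profile = find_good_pure_profile(R, C)
```

On this example the search returns profile (0, 0), whose well-supported regret is about 0.6, within the 2/3 guarantee.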
So, we can compute ε-WSNE for ε slightly less than 2/3....
Feels like we should be able to do better....
Next: should we give up on search for PTAS, and prove that none
exists?
LOGNP: class of NP problems that require logarithmic amount of
non-determinism; quasi-poly algorithms
Papadimitriou and Yannakakis: On Limited
Nondeterminism and the Complexity of the V-C
Dimension. J. Comput. Syst. Sci. (1996)
Arguable that quasi-poly is best you can do...
Reduce to ε-NASH?
Similar problem to before: LOGNP-complete problems “have”
no-instances
Need a “hard-looking” TFNP problem solvable in time n^{O(log n)}
Reasonable question: Can we reduce, for sufficiently small ε, from
ε/2-NASH to (say) ε-NASH? (trying to “compare like with like”)
new starting-point for “LOGNP-hardness” of ε-NASH
Babichenko, Papadimitriou and Rubinstein: Can Almost
Everybody be Almost Happy? PCP for PPAD and the
Inapproximability of Nash, Procs of ITCS (2016)
“exponential time hypothesis” for PPAD: END OF THE LINE
requires time 2^{Ω̃(n)}
PPAD-completeness of NASH goes via intermediate problem
GCIRCUIT: compute fixpoint of arithmetic circuit.
Approximate version ε-GCIRCUIT also recently shown to be
PPAD-complete
Rubinstein: Inapproximability of Nash equilibrium, STOC
(2015)
New conjecture (BPR’15): For some δ, ε > 0, there’s a quasilinear
reduction from END OF THE LINE to (ε, δ)-GCIRCUIT. With
ETH for PPAD, (ε, δ)-GCIRCUIT requires time 2^{Ω̃(n)}
It follows from the conjecture that there exist ε′, δ′ such that it takes
exponential time to ε′-satisfy a fraction 1 − δ′ of GCIRCUIT
elements.
(With the close relationship between GCIRCUIT and graphical
games, you can’t keep almost everybody almost happy...)
From that, there’s an ε > 0 such that ε-NASH really requires time
n^{Ω̃(log n)}.
The self-critical bit: are we asking the right question?
Why ε-NE?
:-( lack of scale-invariance is a downside.
:-( Any result just for some constant ε is unsatisfying,
even ε = 0.01
“rival” notions
Trembling-hand perfect: if a strategy is suboptimal, give it prob at most ε
proper ε-NE: if s is worse than s′, Pr[s] ≤ ε · Pr[s′]
Conclusions
There’s a rich theory developing aiming to explain limits to
what we seem to manage to achieve in ε-NASH results
scope for progress in reducing ε...
very little known about games of > 2 players.
fun stuff being done in query complexity of approx NE
Thanks!