A tropical approach to zero

A tropical approach to zero-sum repeated
games
[email protected]
INRIA and CMAP, École Polytechnique
Inauguration du PGMO
École Polytechnique / ENSTA, Sep. 18-19, 2012
Based on joint work with Akian,Guterman (IJAC, 2012) Allamigeon,
Goubault (JCTA 2011), and Katz (LAA 2011).
PGMO research project (Allamigeon).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
1 / 54
The message
Tropical convex sets ' zero-sum games
(decision over time, one player = MDP )
tropical polyhedra / LP : finite state and action,
deterministic
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
2 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” =
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
“5/2” =
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
“5/2” =3
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
“5/2” =3
“23 ” =
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
“5/2” =3
“23 ” =“2 × 2 × 2” = 6
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
“2 × 3” =5
“5/2” =3
“23 ” =“2 × 2 × 2” = 6
√
“ −1” =
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
7 0
2
“
”=
−∞ 3
1
“2 × 3” =5
“5/2” =3
“23 ” =“2 × 2 × 2” = 6
√
“ −1” =−0.5
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
Max-plus or tropical algebra
In an exotic country, children are taught that:
“a + b” = max(a, b)
“a × b” = a + b
So
“2 + 3” = 3
7 0
2
9
“
”=
−∞ 3
1
4
“2 × 3” =5
“5/2” =3
“23 ” =“2 × 2 × 2” = 6
√
“ −1” =−0.5
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
3 / 54
The sister algebra: min-plus
“a + b” = min(a, b)
“a × b” = a + b
“2 + 3” = 2
“2 × 3” = 5
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
4 / 54
Invented by several schools.
In particular at EDF:
M. Gondran and M. Minoux. Valeurs propres et vecteurs
propres dans les dioı̈des et leur interprétation en théorie
des graphes. EDF, Bulletin de la Direction des Etudes et
Recherches, Serie C, Mathématiques Informatique,
2:25–41, 1977.
“Au = λu”,
max (Aij + uj ) = λ + ui .
1≤j≤n
See also Gondran and Minoux’s book, Graphes et
algorithmes, Eyrolles, 1979.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
5 / 54
The term “tropical” refers to Imre Simon, 1943 2009
who lived in Sao Paulo (south tropic).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
6 / 54
Cuninghame-Green 1960- OR (scheduling, optimization)
Vorobyev ∼65 . . . Zimmerman, Butkovic; Optimization
Maslov ∼ 80’- . . . Kolokoltsov, Litvinov, Samborskii, Shpiz. . . Quasi-classic
analysis, variations calculus
Simon ∼ 78- . . . Hashiguchi, Leung, Pin, Krob, . . . Automata theory
Gondran, Minoux ∼ 77 Operations research
Cohen, Quadrat, Viot ∼ 83- . . . Olsder, Baccelli, S.G., Akian initially
discrete event systems, then optimal control, idempotent probabilities,
combinatorial linear algebra
Nussbaum 86- Nonlinear analysis, dynamical systems, also related work in
linear algebra, Friedland 88, Bapat ˜94
Kim, Roush 84 Incline algebras
Fleming, McEneaney ∼00- max-plus approximation of HJB
Salut, Del Moral ∼95 Puhalskii ∼99, idempotent probabilities.
now in tropical geometry, after Viro, Mikhalkin, Passare, Sturmfels and many.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
7 / 54
What are tropical convex sets?
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
8 / 54
Some elementary tropical geometry
A tropical line in the plane is the set of (x, y ) such that
the max in
“ax + by + c”
is attained at least twice.
max(x, y , 0)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
9 / 54
Some elementary tropical geometry
A tropical line in the plane is the set of (x, y ) such that
the max in
max(a + x, b + y , c)
is attained at least twice.
max(x, y , 0)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
9 / 54
Two generic tropical lines meet at a unique point
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
10 / 54
By two generic points passes a unique tropical line
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
10 / 54
non generic case
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
10 / 54
non generic case resolved by perturbation
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
10 / 54
Tropical segments:
f
g
[f , g ] := {“λf + µg ” | λ, µ ∈ R ∪ {−∞}, “λ + µ = 1”}.
(The condition “λ, µ ≥ 0” is automatic.)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
11 / 54
Tropical segments:
f
g
[f , g ] := { sup(λ + f , µ + g ) | λ, µ ∈
R ∪ {−∞}, max(λ, µ) = 0}.
(The condition λ, µ ≥ −∞ is automatic.)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
11 / 54
Tropical convex set: f , g ∈ C =⇒ [f , g ] ∈ C
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
12 / 54
Tropical convex set: f , g ∈ C =⇒ [f , g ] ∈ C
Tropical convex cone: ommit “λ + µ = 1”, i.e., replace
[f , g ] by {sup(λ + f , µ + g ) | λ, µ ∈ R ∪ {−∞}}
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
12 / 54
Homogeneization
A convex set C in Rnmax is a cross section of a convex
cone Ĉ in Rn+1
max ,
Ĉ := {(λ + u, λ) | u ∈ C , λ ∈ Rmax }
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
13 / 54
The tropical point of view arises with log glasses
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
14 / 54
Gelfand, Kapranov, and Zelevinsky defined the amoeba of an
algebraic variety V ⊂ (C∗ )n to be the “log-log plot”
A(V ) := {(log |z1 |, . . . , log |zn |) | (z1 , . . . , zn ) ∈ V } .
y +x +1=0
max(log |x|, log |y |, 0)
attained twice
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
15 / 54
Gelfand, Kapranov, and Zelevinsky defined the amoeba of an
algebraic variety V ⊂ (C∗ )n to be the “log-log plot”
A(V ) := {(log |z1 |, . . . , log |zn |) | (z1 , . . . , zn ) ∈ V } .
y +x +1=0
max(log |x|, log |y |, 0)
attained twice
|y | ≤ |x| + 1, |x| ≤ |y | + 1, 1 ≤ |x| + |y |
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
15 / 54
Gelfand, Kapranov, and Zelevinsky defined the amoeba of an
algebraic variety V ⊂ (C∗ )n to be the “log-log plot”
A(V ) := {(log |z1 |, . . . , log |zn |) | (z1 , . . . , zn ) ∈ V } .
y +x +1=0
max(log |x|, log |y |, 0)
attained twice
X := log |x|, Y := log |y |
Y ≤ log(e X + 1), X ≤ log(e Y + 1), 0 ≤ log(e X + e Y )
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
15 / 54
Viro’s log-glasses, related to Maslov’s dequantization
a +h b := h log(e a/h + e b/h ),
h → 0+
With h-log glasses, the amoeba of the line retracts to the
tropical line as h → 0+
y +x +1=0
max(log |x|, log |y |, 0)
attained twice
max(a, b) ≤ a +h b ≤ h log 2 + max(a, b)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
16 / 54
Tropical convex sets are deformations of classical convex
sets
Briec and Horvath 04
[a, b] := {λa +p µb, λ, µ ≥ 0, λ +p µ = 1}
a +p b = (ap + b p )1/p
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
17 / 54
Tropical half-spaces
Given a, b ∈ Rnmax , a, b 6≡ −∞,
H := {x ∈ Rnmax | “ax ≤ bx”}
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
18 / 54
Tropical half-spaces
Given a, b ∈ Rnmax , a, b 6≡ −∞,
H := {x ∈ Rnmax | max ai + xi ≤ max bi + xi }
1≤i≤n
1≤i≤n
x3
max(x1 , x2 , −2 + x3 )
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
18 / 54
Tropical half-spaces
Given a, b ∈ Rnmax , a, b 6≡ −∞,
H := {x ∈ Rnmax | max ai + xi ≤ max bi + xi }
1≤i≤n
x3
1≤i≤n
max(x1 , −2 + x3 ) ≤ x2
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
18 / 54
Tropical half-spaces
Given a, b ∈ Rnmax , a, b 6≡ −∞,
H := {x ∈ Rnmax | max ai + xi ≤ max bi + xi }
1≤i≤n
1≤i≤n
x3
x1 ≤ max(x2 − 2 + x3 )
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
18 / 54
Tropical half-spaces
Given a, b ∈ Rnmax , a, b 6≡ −∞,
H := {x ∈ Rnmax | max ai + xi ≤ max bi + xi }
1≤i≤n
1≤i≤n
x3
max(x2 − 2 + x3 ) ≤ x1
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
18 / 54
Theorem (Zimmermann; Samborski, Shpiz; Cohen, SG, Quadrat,
Singer; Develin, Sturmfels; Joswig. . . )
A tropical convex cone closed (in the Euclidean topology)
is the intersection of tropical half-spaces.
Rmax is equipped with the topology of the metric
(x, y ) 7→ maxi |e xi − e yi | inherited from the Euclidean
topology by log-glasses.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
19 / 54
Tropical polyhedral cones
can be defined as intersections of finitely many half-spaces
x3
V
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
20 / 54
Tropical polyhedral cones
can be defined as intersections of finitely many half-spaces
x3
2 + x1 ≤ max(x2 , 3 + x3 )
V
x1
Stephane Gaubert (INRIA and CMAP)
x2
A tropical approach to zero-sum games
PGMO
20 / 54
Tropical polyhedral cones
can be defined as intersections of finitely many half-spaces
V
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
20 / 54
Equivalently, a tropical polyhedral cone is generated by
finitely many extreme rays.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
21 / 54
Cyclic polytope, 4 vertices support half-space
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
22 / 54
Cyclic polytope, 4 vertices, support half-space
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
22 / 54
Fundamental operation: Conversion from inequalities to
extreme points.
tropical double description Allamigeon, SG, Goubault,
(STACS’10)
ocaml TPLib library by Allamigeon
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
23 / 54
10 inequalities, 24 extreme points, but 1381
pseudo-vertices.
Generated by TPLib and polymake.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
24 / 54
More generally, all the results of classical convexity have
tropical analogues, but sometimes more degenerate. . .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
25 / 54
From tropical convexity to games
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
26 / 54
Shapley operators
Ti (x) = max min
a∈A b∈B
riab
+
X
Pijab xj ,
i ∈ [n] =: Ω
j∈[n]
If xj ∈ Rn is the value tomorrow in state j, Ti (x) is the
value of the game today.
T is order preserving and additively homogeneous:
x ≤ y =⇒ T (x) ≤ T (y )
T (α + x) = α + T (x),
∀α ∈ R
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
27 / 54
The mean payoff problem
Compute
χ(T ) = lim T k (x)/k
k→∞
Mean payoff per time unit.
The limit is independent of the choice of x ∈ Rn (T is
non-expansive).
The limit is known to exist in particular if the graph of T
is semi-algebraic (Bewley, Kohlberg, Neyman).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
28 / 54
Observation
If ∃u ∈ Rn , T (u) ≥ u, the game is super-fair, i.e.,
χ(T ) ≥ 0 (all initial states are winning).
Proof.
T k (u) ≥ u for all k ≥ 1, by Monotonicity,
limk T k (u)/k ≥ 0.
Similarly, if T (u) ≤ u, the game is subfair.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
29 / 54
Example: additive Richman game / tug of war
Max and Min flip a coin to decide who makes the move.
Min always pay.
3
2
1
−1
1
2
−1
Stephane Gaubert (INRIA and CMAP)
2
2
−8
3
A tropical approach to zero-sum games
PGMO
30 / 54
Example: additive Richman game / tug of war
Max and Min flip a coin to decide who makes the move.
Min always pay.
3
2
1
−1
1
2
−1
2
2
−8
3
1
vik+1 = (max(cij + vjk ) + min (cij + vjk )) .
j: i→j
2 j: i→j
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
30 / 54
3
2
1
1
2
−1
2
2
−1
−8
3
 
v1 = 12 (max(2 + v1 , 3 + v2 , −1 + v3 ) + min(2 + v1 , 3 + v2 , −1 + v3 )
5
0 v2 = 1 (max(−1 + v1 , 2 + v2 , −8 + v3 ) + min(−1 + v1 , 2 + v2 , −8 + v3 )
2
4
v3 = 12 (max(2 + v1 , 1 + v2 ) + min(2 + v1 , 1 + v2 )
this game is fair
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
31 / 54
1
v = (max vjk + min vjk ) ,
j: i→j
2 j: i→j
vi , i ∈ boundary prescribed:
discrete variant of Laplacian infinity (Oberman), or
Richman games (Loeb et al.), Tug of war (Peres et al.).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
32 / 54
From games to tropical convexity
If T is a Shapley operator, then the set of subsolutions
C = {u | T (u) ≥ u}
(showing that the game is superfair) is a tropical
(max-plus) convex cone.
Proof. Since T monotone, u, v ∈ C =⇒
T (sup(u, v )) ≥ sup(u, v ), so sup(u, v ) ∈ C .
Since T (λ + u) = λ + u for λ ∈ R, u ∈ C implies
λ + u ∈ C . QED.
Supersolutions constitute a min-plus convex cone.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
33 / 54
Do certificates exist?
Proposition
If T is a piecewise affine Shapley operator = game with
perfect information and finite action spaces, then
χ(T ) ≥ 0 iff T (u) ≥ u for some u ∈ Rn .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
34 / 54
Do certificates exist?
Proposition
If T is a piecewise affine Shapley operator = game with
perfect information and finite action spaces, then
χ(T ) ≥ 0 iff T (u) ≥ u for some u ∈ Rn .
Follows from Kohlberg’s theorem on invariant half-lines:
∃u, T (u + s~η ) = u + (s + 1)~η , for all s ≥ 0.
We have χ(T ) = ~η .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
34 / 54
Do certificates exist?
Proposition
If T is a piecewise affine Shapley operator = game with
perfect information and finite action spaces, then
χ(T ) ≥ 0 iff T (u) ≥ u for some u ∈ Rn .
Follows from Kohlberg’s theorem on invariant half-lines:
∃u, T (u + s~η ) = u + (s + 1)~η , for all s ≥ 0.
We have χ(T ) = ~η .
Theorem (SG, Gunawardena TAMS 04)
If digraph G (T ) is strongly connected, exists λ ∈ R,
T (u) = λ + u. Then, χ(T ) = (λ, . . . , λ).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
34 / 54
In general, χ(T ) ≥ 0 does not imply the existence of
u ∈ Rn such that T (u) ≥ u.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
35 / 54
In general, χ(T ) ≥ 0 does not imply the existence of
u ∈ Rn such that T (u) ≥ u.
However. . .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
35 / 54
T extends continuously to (R ∪ {−∞})n
(image of Rn+ by log glasses).
Setting xi = −∞ in T (x) prohibits player MAX to reach
state i.
Theorem
At least one state is winning (∃i, lim inf k (T k (x))i /k ≥ 0)
if and only if there exist u ∈ (R ∪ {−∞})n , T (u) ≥ u,
u 6≡ −∞.
Combine a Denjoy-Wolff type result of Nussbaum
(LAA86) with SG, Gunawardena (TAMS04). Actually, there
exists i such that limk (T k (x))i /k = limk maxj (T k (x))j /k.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
36 / 54
Correspondence between tropical convexity and
zero-sum games
Theorem (Akian, SG, Guterman, IJAC 2012)
TFAE:
C closed tropical convex cone
C = {u ∈ (R ∪ {−∞})n | u ≤ T (u)} for some
Shapley operator T
Then, MAX has at least one winning state
(∃i, χi (T ) ≥ 0) if and only if C 6= {(−∞, . . . , −∞)} .
Moreover, tropical polyhedra correspond to deterministic
games with finite action spaces. Then, state i is winning
iff ui 6= −∞ for some u ∈ C .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
37 / 54
x3
x3
x1
x2 x1
states 1,2,3 winning
Stephane Gaubert (INRIA and CMAP)
x2
states 2,3 winning
A tropical approach to zero-sum games
PGMO
38 / 54
Proof. (Equivalence). A closed tropical convex cone can
be written as
\
C=
Hi
i∈I
where (Hi )i∈I is a family of tropical half-spaces.
Hi : “Ai x ≤ Bi x”
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
39 / 54
Proof. (Equivalence). A closed tropical convex cone can
be written as
\
C=
Hi
i∈I
where (Hi )i∈I is a family of tropical half-spaces.
Hi : max aij +xj ≤ max bik +xk ,
1≤j≤n
1≤k≤n
aij , bik ∈ R∪{−∞}
[T (x)]j = inf −aij + max bik + xk .
i∈I
Stephane Gaubert (INRIA and CMAP)
1≤k≤n
A tropical approach to zero-sum games
PGMO
39 / 54
Proof. (Equivalence). A closed tropical convex cone can
be written as
\
C=
Hi
i∈I
where (Hi )i∈I is a family of tropical half-spaces.
Hi : max aij +xj ≤ max bik +xk ,
1≤j≤n
aij , bik ∈ R∪{−∞}
1≤k≤n
[T (x)]j = inf −aij + max bik + xk .
i∈I
1≤k≤n
x ≤ T (x) ⇐⇒ max aij + xj ≤ max bik + xk , ∀i ∈ I .
1≤j≤n
Stephane Gaubert (INRIA and CMAP)
1≤k≤n
A tropical approach to zero-sum games
PGMO
39 / 54
Hi : max aij + xj ≤ max bik + xk
1≤j≤n
1≤k≤n
[T (x)]j = inf −aij + max bik + xk .
i∈I
1≤k≤n
Interpretation of the game
State of MIN: variable xj , j ∈ {1, . . . , n}
State of MAX: half-space Hi , i ∈ I
In state xj , Player MIN chooses a tropical half-space
Hi with xj in the LHS
In state Hi , player MAX chooses a variable xk at the
RHS of Hi
Payment −aij + bik .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
40 / 54
x1 ≤ a + max(x2 − 2, x3 − 1)
−2 + x2 ≤ a + max(x1 , x3 − 1)
max(x2 − 2, x3 − a) ≤ x1 + 2
(H1 )
(H2 )
(H3 )
value χ(T )j = (2a + 1)/2, ∀j.
0
1
1
1
2
2
a−2
0
1
a
2
a−1
−2
−2
a−1
−2
a−1
2
3
−2
2
a−1
2
−a
Stephane Gaubert (INRIA and CMAP)
3
3
A tropical approach to zero-sum games
−a
3
PGMO
41 / 54
x3
x3
H3
H1
x1
H3
H2
H2
x1
x2
x2
H1
a = −3/2, victorious strategy of Min: certificate of
emptyness involving ≤ n inequalities (Helly)
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
42 / 54
x3
x3
H3
H1
x1
H3
H2
x1
H2
x2
x2
H1
a = 1, victorious strategy of Max: tropical polytrope 6= ∅
included in the convex set
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
42 / 54
Corollary
Deciding whether a tropical polyhedron given by
inequalities is empty is polynomial-time equivalent to
deciding whether a mean payoff deterministic game is
winning.
Remark. Deciding whether a mean payoff stochastic
game is winning reduces to a tropical convex
programming problem.
Existence of polynomial time algorithm open since Gurvich,
Karzanov, Khachyan 86.
Problem has a good characterization in the sense of
Edmonds: NP ∩ coNP class (Condon). So cannot be
NP-hard unless NP = coNP.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
43 / 54
Tropical polyhedra correspond to classical polyhedra with
exponentially large coefficients exp(βAij ). (Allamigeon, SG,
Katz, SG, Meunier, Schewe).
Existence of a polynomial time algorithm algorithm for
tropical LP is somehow related to the existence of a
strongly polynomial time algorithm in classical LP.
Several pseudo-polynomial algorithms have been proposed
for mean-payoff deterministic games / tropical LP:
pumping Gurvich, Karzanov, Khachyan 86
value iteration Zwick, Paterson 97
cyclic projections Butkovic, Cuninghame-Green 04, SG,
Sergei, Akian, SG, Nitica, Singer 11
policy iteration Cochet-Terrasson, SG, Gunawardena 98,
Dinghra SG 06, Bjorklund and Vorobyov 07, chaloupka 11.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
44 / 54
For stochatic games with perfect information, even a
pseudo-polynomial algorithm is not known (see Boros,
Elbassioni, Gurvich, Makino 10).
Policy iteration is currently the most effective algorithm
to solve large scale problems.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
45 / 54
Solving Richman games
150
0 50
Time (s)
250
Policy iteration algorithm of Cochet-Terrasson, SG 06, Akian,
Cochet-Terrasson, Detournay, SG 12 / PIgames C library
0
10000
20000 30000 40000
Number of nodes
50000
CPU time (single proc. Intel(R) Xeon(R) W3540 - 2.93GHz with 8Go of
RAM), max, mean, and min over 500 random graphs for each dim.
106 nodes ∼ 2.104 sec = 5 hours, LU is the bottleneck.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
46 / 54
Idea of the policy iteration for games (multichain):
T = inf T π ,
π
T π one player (easy)
Fix π. Compute an invariant half-line s 7→ v + s~η of T π ,
so
T π (v + s~η ) = v + (s + 1)~η .
Choose new strategy π 0 such that
0
T π (v + s~η ) = T (v + s~η )
for large s
(lexicographic minimaxisation).
0
Easy to show χ(T π ) ≤ χ(T π ).
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
47 / 54
Policy iteration for games goes back to Hoffman-Karp 66
assuming that every pair of strategies σ, δ yields an
irreducible Markov chain.
Variant does terminate in the discounted case Denardo
(1967), also Rao, Chandrasekaran, and Nair (1973), Raghavan
and Syed (2003).
Mean payoff reducible case remained unsolved,
actually algorithm may cycle.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
48 / 54
Policy iteration for games, reducible case
Tropical solution to cycling: at a degenerate iteration,
0
χ(T π ) = χ(T π )
Cycling arises from the nonuniqueness of the new
invariant half-line s 7→ w 0 + s~η which is selected.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
49 / 54
Policy iteration for games, reducible case
Tropical solution to cycling: at a degenerate iteration,
0
χ(T π ) = χ(T π )
Cycling arises from the nonuniqueness of the new
invariant half-line s 7→ w 0 + s~η which is selected.
Rule: enforce wi0 = wi on every node i of the critical
graph or “discrete stochastic Aubry set” of π 0 .
= set of nodes which occur infinitely often, a.s., in an
optimal stationnary strategy.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
49 / 54
Policy iteration for games, reducible case
Tropical solution to cycling: at a degenerate iteration,
0
χ(T π ) = χ(T π )
Cycling arises from the nonuniqueness of the new
invariant half-line s 7→ w 0 + s~η which is selected.
Rule: enforce wi0 = wi on every node i of the critical
graph or “discrete stochastic Aubry set” of π 0 .
= set of nodes which occur infinitely often, a.s., in an
optimal stationnary strategy.
Akian, SG 03, w 0 is uniquely defined by its value on C .
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
49 / 54
Concluding remarks
Tropical algebra gives a geometrical approach to
zero-sum games
It has suggested new algorithms
Complexity of mean-payoff games is open (unlike in
the discounted case, recent results by Ye, Hansen,
Miltersen and Zwick)
Tropical approach is also useful in the infinite
dimensional setting. Then, the issue is the curse of
dimensionality.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
50 / 54
Max-plus basis method Fleming, McEneaney, Akian, SG,
Lakhoua . . . . To solve an Hamilton-Jacobi equation
H(x, Dx v ) = 0, approximate the value function
Rd → R by a finite supremum of quadratic forms
v ' sup wi
i
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
51 / 54
In the one player deterministic case, use max-plus
linearity of the evolution semigroup to design a
convergent fixed point scheme.
For switched LQ problems, semigroup is propagated
by Riccati, and wi are trimmed by Shor relaxation,
leading to a cod free complexity estimate McEneaney:
time polynomial in d, but exponential in 1/precision.
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
52 / 54
coarse certified approximations can be obtained in a
cod way.
This is correlated by work in program verification:
solving the fixed point problem in static analysis by
abstract interpretation is equivalent to zero-sum
games with negative discount. In the template
method of Manna et al, and its non-linear
generalization by Adje, SG, Goubault, the set of
reachable values of the program variables is
approximated by the level set v ≤ 0 of
v = sup wi
i
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
53 / 54
Thank you!
Stephane Gaubert (INRIA and CMAP)
A tropical approach to zero-sum games
PGMO
54 / 54