Games with Secure Equilibria: A Theory for Component

Games with Secure Equilibria
Krishnendu Chatterjee (Berkeley)
Thomas A. Henzinger (EPFL)
Marcin Jurdzinski (Warwick)
Classification of 2-Player Games
• Zero-sum games: complementary payoffs.
     1,-1    0,0
    -1,1     2,-2
• Non-zero-sum games: arbitrary payoffs.
     3,1     1,0
     3,2     4,2
Classical Notion of Rationality
Nash equilibrium:
none of the players gains by deviation.
Payoffs (row, column):
     3,1     1,0
     3,2     4,2
New Notion of Rationality
Nash equilibrium: none of the players gains by deviation.
Secure equilibrium: none hurts the opponent by deviation.
Payoffs (row, column):
     3,1     1,0
     3,2     4,2
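The two notions can be compared mechanically on a small matrix game. The sketch below is illustrative only: it assumes the four payoff pairs form a 2x2 matrix read row by row, with each pair giving (row payoff, column payoff), and the names PAYOFF, is_nash, and is_secure are hypothetical.

```python
from itertools import product

# Assumed layout: PAYOFF[row][col] = (row player's payoff, column player's payoff).
PAYOFF = [
    [(3, 1), (1, 0)],
    [(3, 2), (4, 2)],
]

def is_nash(r, c):
    """No unilateral deviation strictly increases the deviator's own payoff."""
    row_ok = all(PAYOFF[r2][c][0] <= PAYOFF[r][c][0] for r2 in range(2))
    col_ok = all(PAYOFF[r][c2][1] <= PAYOFF[r][c][1] for c2 in range(2))
    return row_ok and col_ok

def is_secure(r, c):
    """A deviation that lowers the opponent's payoff must also lower the deviator's own."""
    v1, v2 = PAYOFF[r][c]
    row_ok = all(not (PAYOFF[r2][c][1] < v2) or PAYOFF[r2][c][0] < v1
                 for r2 in range(2))
    col_ok = all(not (PAYOFF[r][c2][0] < v1) or PAYOFF[r][c2][1] < v2
                 for c2 in range(2))
    return row_ok and col_ok

# Enumerate all cells, keep the Nash equilibria, then keep the secure ones.
nash = [(r, c) for r, c in product(range(2), repeat=2) if is_nash(r, c)]
secure_eq = [cell for cell in nash if is_secure(*cell)]
```

Under the assumed layout, three of the four cells are Nash equilibria, but only the top-left cell with payoff (3,1) is also secure: in the other two, one player can lower the opponent's payoff at no cost to herself.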
Secure Equilibria
• Natural notion of rationality for component systems:
– First, a component tries to meet its spec.
– Second, a component may obstruct the other components.
• For -regular specs, there is always unique maximal
secure equilibrium.
Games on Graphs
• Modeling component interaction:
– Vertices = states
– Players = components
– Moves = transitions
• Applications:
– Synthesis (control) of sequential systems
– Verification of adversarial specs
– Receptiveness
– Compatibility
– Early error detection
– (Bi)simulation checking
– Model checking
– Game semantics
– etc.
Example: Verification
Starvation Freedom
(mutual exclusion protocols, cache coherence protocols)
In a multi-process system, can a process P that wishes
to proceed always eventually proceed no matter what
the other processes do?
∀□(a ⇒ ⟨⟨P⟩⟩◇b)
∀□(a ⇒ ∀◇b)   X
∀□(a → ∃◇b)   X
Example: Verification
Starvation Freedom
(mutual exclusion protocols, cache coherence protocols)
In a multi-process system, can a process P that wishes
to proceed always eventually proceed no matter what
the other processes do, provided they meet their specs?
∀□(a ⇒ ⟨⟨P⟩⟩◇b)   X
∀□(a ⇒ ∀◇b)   X
∀□(a → ∃◇b)   X
Games on Graphs
• Turn-based (perfect-information) games:
– Game graph G = ((V,E), (V1,V2)).
– E ⊆ V × V: serial edge relation.
– (V1,V2): partition of the vertex set V.
• The game is played by moving a token along
edges of the graph:
– V1: player-1 moves the token.
– V2: player-2 moves the token.
Example: A Game Graph
[Figure: game graph with states s0, s1, s2, s3]
Plays and Strategies
• Play (outcome) of a game:
– Infinite path (s0,s1,…) of states si ∈ V such that (si,si+1) ∈ E for all i ≥ 0.
– Ω: set of all plays.
• Player-1 strategy:
– Given a prefix of a play ending in V1, specifies how to extend the play.
– σ: V* · V1 → V such that (s, σ(x·s)) ∈ E for all x ∈ V* and s ∈ V1.
– Symmetric definition for player-2 strategies π.
• Given two strategies σ, π and a start state s, there is a unique play
ω_{σ,π}(s).
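The way a pair of strategies determines a unique play can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the four-state graph EDGES, the ownership sets V1 and V2, and the two sample strategies are hypothetical and chosen only for illustration.

```python
# Hypothetical game graph; every state has at least one successor (serial edge relation).
EDGES = {"s0": ["s1", "s3"], "s1": ["s0", "s2"], "s2": ["s2"], "s3": ["s3"]}
V1 = {"s0", "s2", "s3"}  # assumed player-1 states
V2 = {"s1"}              # assumed player-2 states

def play_prefix(sigma, pi, start, length):
    """Return the first `length` states of the unique play from `start`."""
    history = [start]
    while len(history) < length:
        current = history[-1]
        # The owner of the current state extends the play.
        strategy = sigma if current in V1 else pi
        nxt = strategy(tuple(history))
        if nxt not in EDGES[current]:
            raise ValueError(f"illegal move {current} -> {nxt}")
        history.append(nxt)
    return history

# A strategy maps a finite history ending in the player's own state to a successor.
def sigma(history):
    return {"s0": "s1", "s2": "s2", "s3": "s3"}[history[-1]]

def pi(history):
    return "s2"  # player 2 moves only at s1 in this graph

prefix = play_prefix(sigma, pi, "s0", 5)
```

Because each strategy is a function of the history and exactly one player owns each state, the next state is always determined, which is why the play is unique.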
Example
[Figure: the game graph with states s0, s1, s2, s3]
• Example of a play: s0 s1 s2
• Strategies that yield this play:
- Player-1: s0 → s1
- Player-2: s1 → s2
Memoryless Strategies
• Independent of the history of the play:
σ: V1 → V
π: V2 → V
• Yield simple controllers.
• Existence puts games into NP.
Objectives and Payoffs
• What the players are playing for:
– Player-1 objective: play in set Φ1 ⊆ V^ω.
– Player-2 objective: play in set Φ2 ⊆ V^ω.
– General objectives: Borel sets in the Cantor topology.
– Finite-state objectives: ω-regular sets (level 2.5 Borel sets).
• From objectives to payoffs:
– If ω_{σ,π}(s) ∈ Φi, then player i gets payoff 1, else payoff 0.
– The payoff profile for a strategy profile (σ,π) at a state s is
(v1^{σ,π}(s), v2^{σ,π}(s)).
Classification of Games
• Zero-sum games:
– Complementary objectives: Φ2 = ¬Φ1.
– Possible payoff profiles (1,0) and (0,1).
• Non-zero-sum games:
– Arbitrary objectives Φ1, Φ2.
– Possible payoff profiles (1,1), (1,0), (0,1), and (0,0).
Zero-Sum Games on Graphs
• Winning:
- Winning-1 states s: (∃σ) (∀π) ω_{σ,π}(s) ∈ Φ1.
- Winning-2 states s: (∃π) (∀σ) ω_{σ,π}(s) ∈ Φ2.
• Determinacy:
– Every state is winning-1 or winning-2.
– Borel determinacy [Martin 75].
– Memoryless determinacy for parity games [Emerson/Jutla 91].
(1,0)
(0,1)
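For the special case of a reachability objective, the winning-1 states can be computed with the standard attractor fixpoint; the states left over are winning for player 2 in the complementary (safety) game. The sketch below is not taken from the slides: the function name attractor and the example graph are assumptions made for illustration.

```python
def attractor(edges, owned_by_1, target):
    """States from which player 1 can force a visit to `target` (least fixpoint)."""
    win = set(target)
    changed = True
    while changed:
        changed = False
        for state, successors in edges.items():
            if state in win:
                continue
            if state in owned_by_1:
                # Player 1 needs one successor inside the current winning set.
                ok = any(t in win for t in successors)
            else:
                # Player 2 must be unable to avoid the current winning set.
                ok = all(t in win for t in successors)
            if ok:
                win.add(state)
                changed = True
    return win

# Hypothetical graph: player 2 owns s1 and can bounce back to s0 forever.
EDGES = {"s0": ["s1", "s3"], "s1": ["s0", "s2"], "s2": ["s2"], "s3": ["s3"]}
winning_1 = attractor(EDGES, owned_by_1={"s0", "s2", "s3"}, target={"s2"})
winning_2 = set(EDGES) - winning_1  # complement, by determinacy
```

In this graph player 1 wins only from s2 itself, since player 2 can always answer s0 → s1 with s1 → s0; handing s1 to player 1 instead would make s0 and s1 winning as well.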
Non-Zero-Sum Games on Graphs
Nash equilibrium (,) at state s:
(8 ’) v1, (s) ¸ v1’, (s)
(8 ’) v2, (s) ¸ v2,’ (s)
Example: Reachability Game
[Figure: game graph with states s0, s1, s2, s3 and target sets R1, R2]
Objective for player i is to visit Ri.
Example
[Figure: the reachability game graph with states s0, s1, s2, s3 and target sets R1, R2]
Nash equilibria:
(s0 → s1, s1 → s2): (1,0)
(s0 → s3, s1 → s0): (0,1)
(s0 → s1, s1 → s0): (0,0)
ω-Regular Objectives
Synthesis:
- Zero-sum game controller versus plant.
- Control against all plant behaviors.
Verification:
- Non-zero-sum specs for components.
- Components may behave adversarially,
but without threatening their own specs.
Non-Zero-Sum Verification Games
Drawbacks of Nash equilibrium:
- Does not capture adversarial behavior.
- Not unique.
A new notion of equilibrium:
- Takes into account both non-zero-sum payoffs and
adversarial behavior.
- Captures the essence of component-based systems.
- Unique for Borel objectives.
- Computable for -regular objectives.
Secure Equilibria
• Secure strategy profile (σ,π) at state s:
(∀π') ( v1^{σ,π'}(s) < v1^{σ,π}(s) ⇒ v2^{σ,π'}(s) < v2^{σ,π}(s) )
(∀σ') ( v2^{σ',π}(s) < v2^{σ,π}(s) ⇒ v1^{σ',π}(s) < v1^{σ,π}(s) )
• A secure profile (σ,π) is a contract:
if player-1 deviates to lower player-2's payoff,
her own payoff decreases as well, and vice versa.
• Secure equilibrium:
a secure strategy profile that is also a Nash equilibrium.
Secure Equilibria
[Figure: payoff matrices with entries 3,3; 1,3; 2,1; 0,0; 3,1; 2,2; 0,0; 1,2, each given as (row, column)]
Example
[Figure: the reachability game graph with states s0, s1, s2, s3 and target sets R1, R2]
Nash equilibria:
(s0 → s1, s1 → s2): (1,0)   not secure
(s0 → s3, s1 → s0): (0,1)   not secure
(s0 → s1, s1 → s0): (0,0)   secure
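The classification of these profiles can be reproduced by brute force. The sketch below rests on several assumptions: the graph is inferred from the listed strategy profiles (player 1 moves at s0, player 2 moves at s1, and s2, s3 are treated as sinks), the target sets are taken to be R1 = {s2} and R2 = {s3}, and only memoryless deviations are checked, which is enough for this small example but not in general. All identifiers are hypothetical.

```python
from itertools import product

CHOICES_1 = ["s1", "s3"]  # assumed moves of player 1 at s0
CHOICES_2 = ["s0", "s2"]  # assumed moves of player 2 at s1
R1, R2 = {"s2"}, {"s3"}   # assumed target sets

def payoff(m1, m2):
    """Payoff profile of the play from s0 when s0 moves to m1 and s1 moves to m2."""
    visited, state = set(), "s0"
    while state not in visited:
        visited.add(state)
        if state == "s0":
            state = m1
        elif state == "s1":
            state = m2
        # s2 and s3 are sinks: the state stays put and the loop terminates.
    return (int(bool(visited & R1)), int(bool(visited & R2)))

def is_nash(m1, m2):
    v1, v2 = payoff(m1, m2)
    return (all(payoff(d, m2)[0] <= v1 for d in CHOICES_1)
            and all(payoff(m1, d)[1] <= v2 for d in CHOICES_2))

def is_secure(m1, m2):
    v1, v2 = payoff(m1, m2)
    # A deviation is harmless if the deviator loses too or the opponent is not hurt.
    ok1 = all(payoff(d, m2)[0] < v1 or payoff(d, m2)[1] >= v2 for d in CHOICES_1)
    ok2 = all(payoff(m1, d)[1] < v2 or payoff(m1, d)[0] >= v1 for d in CHOICES_2)
    return ok1 and ok2

# Map each Nash profile to (payoff profile, is it secure?).
result = {(m1, m2): (payoff(m1, m2), is_secure(m1, m2))
          for m1, m2 in product(CHOICES_1, CHOICES_2) if is_nash(m1, m2)}
```

Under these assumptions the three Nash profiles come out as on the slide: the (1,0) and (0,1) profiles are not secure because the losing player can drag the winner down to (0,0) for free, while the (0,0) profile is secure.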
Lexicographic Payoff Profile Ordering
• Player-1 preference ⪰1:
– (v1,v2) ≻1 (v'1,v'2) iff v1 > v'1 or ( v1 = v'1 and v2 < v'2 ).
– (v1,v2) ⪰1 (v'1,v'2) iff (v1,v2) ≻1 (v'1,v'2) or (v1,v2) = (v'1,v'2).
• Player-2 preference ⪰2 symmetric.
• Captures payoff maximization with external adversarial choice.
• Provides notion of maximality:
– Player-1: (1,0) ⪰1 (1,1) ⪰1 (0,0) ⪰1 (0,1)
– Player-2: (0,1) ⪰2 (1,1) ⪰2 (0,0) ⪰2 (1,0)
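The lexicographic preference is easy to state in code, which also makes the two chains above checkable. A minimal sketch; the function names are hypothetical.

```python
def strictly_prefers_1(p, q):
    """Player 1 strictly prefers p to q: higher own payoff, or equal own
    payoff and lower payoff for the opponent."""
    (v1, v2), (w1, w2) = p, q
    return v1 > w1 or (v1 == w1 and v2 < w2)

def prefers_1(p, q):
    return strictly_prefers_1(p, q) or p == q

def prefers_2(p, q):
    # Symmetric definition: swap the roles of the two components.
    return prefers_1(p[::-1], q[::-1])

profiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Sorting by (own payoff, negated opponent payoff) realizes the same order.
ranking_1 = sorted(profiles, key=lambda p: (p[0], -p[1]), reverse=True)
ranking_2 = sorted(profiles, key=lambda p: (p[1], -p[0]), reverse=True)
```

Each ranking lists the payoff profiles from most to least preferred for the respective player, so a secure equilibrium is simply an equilibrium with respect to these two total orders.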
Alternative Characterization
A secure equilibrium is an equilibrium with respect to
the ⪰1 and ⪰2 payoff profile orderings:
A strategy profile (σ,π) is a secure equilibrium at s iff
(∀σ') (v1^{σ,π}(s), v2^{σ,π}(s)) ⪰1 (v1^{σ',π}(s), v2^{σ',π}(s))
(∀π') (v1^{σ,π}(s), v2^{σ,π}(s)) ⪰2 (v1^{σ,π'}(s), v2^{σ,π'}(s))
Example: Büchi Game
[Figure: game graph with states s0, s1, s2, s3, s4 and Büchi sets B1, B2]
Objective for player i is to visit Bi infinitely often.
Example
[Figure: the Büchi game graph with states s0, s1, s2, s3, s4 and Büchi sets B1, B2]
Nash equilibria:
(s0 → s4, s1 → s4): (0,0)   secure
(s0 → s1, s1 → s0): (1,0)   not secure
(s0 → s2, s3 → s1): (1,1)   secure
Example
[Figure: the Büchi game graph with states s0, s1, s2, s3, s4 and Büchi sets B1, B2]
• Secure equilibrium (σ,π) with payoff (1,1) at s0:
σ: if s1 → s0, then s2 else s4.
π: if s3 → s1, then s0 else s4.
• Pair of “retaliating” strategies.
• Require memory.
Maximal Secure Equilibria
Theorem: At every state s of a graph game
with Borel objectives, there is a unique
secure equilibrium profile that is maximal
with respect to both ⪰1 and ⪰2.
This is the rational behavior of both players at s if they wish to
1. satisfy their own objectives and, then,
2. sabotage the opponent’s objective.
Strongly Winning and Retaliating Strategies
• Winning strategies:
– Player-1 wins for the objective Φ1.
– Player-2 wins for Φ2.
• Strongly winning strategies:
– Player-1 wins for the objective Φ1 ∧ ¬Φ2.
– Player-2 wins for Φ2 ∧ ¬Φ1.
• Retaliating strategies:
– Player-1 wins for the objective Φ2 ⇒ Φ1.
– Player-2 wins for Φ1 ⇒ Φ2.
Winning Sets
• W1: set of states such that player-1 has a winning strategy.
• W2: set of states such that player-2 has a winning strategy.
• W10: set of states such that player-1 has a strongly winning strategy.
• W01: set of states such that player-2 has a strongly winning strategy.
• W11: set of states s such that there is a pair (σ,π) of retaliating
strategies with ω_{σ,π}(s) ⊨ Φ1 ∧ Φ2.
• W00: set of states s such that each player has a retaliating strategy and
for every pair (σ,π) of retaliating strategies, ω_{σ,π}(s) ⊨ ¬Φ1 ∧ ¬Φ2.
State Space Partition
[Figure: state space diagram. Region labels: W01 = ⟨⟨2⟩⟩(Φ2 ∧ ¬Φ1); W10 = ⟨⟨1⟩⟩(Φ1 ∧ ¬Φ2); ⟨⟨2⟩⟩(¬Φ1 ∨ Φ2); ⟨⟨1⟩⟩(¬Φ2 ∨ Φ1); retaliating strategies]
There is no player-2 retaliating strategy in W10,
and no player-1 retaliating strategy in W01.
Retaliating Strategy Pairs
• Every player-1 retaliating strategy σ ensures for every
player-2 strategy π that Φ2 ⇒ Φ1.
• Hence every pair (σ,π) of retaliating strategies ensures
(¬Φ1 ∨ ¬Φ2) ⇒ (¬Φ1 ∧ ¬Φ2).
• W11 and W00 are the regions of the state space where both
players have retaliating strategies:
– W11 contains the states where some pair (σ,π) of retaliating
strategies ensures Φ1 ∧ Φ2.
– W00 contains the states where all pairs (σ,π) of retaliating
strategies lead to ¬Φ1 ∧ ¬Φ2.
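The step from the two retaliation guarantees to the "both or neither" conclusion can be spelled out as a short derivation. Writing Φ1 and Φ2 for the objectives of player 1 and player 2, and reading each formula as a property of the play produced by a retaliating pair, one way to see it is:

```latex
\begin{align*}
(\Phi_2 \Rightarrow \Phi_1) \wedge (\Phi_1 \Rightarrow \Phi_2)
  &\iff (\Phi_1 \Leftrightarrow \Phi_2) \\
  &\iff (\Phi_1 \wedge \Phi_2) \vee (\neg\Phi_1 \wedge \neg\Phi_2) \\
  &\implies \bigl( (\neg\Phi_1 \vee \neg\Phi_2) \Rightarrow (\neg\Phi_1 \wedge \neg\Phi_2) \bigr)
\end{align*}
```

So under a pair of retaliating strategies the only possible payoff profiles are (1,1) and (0,0), which is what separates W11 from W00.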
State Space Partition
[Figure: state space partitioned into four regions: W01 = ⟨⟨2⟩⟩(Φ2 ∧ ¬Φ1), W10 = ⟨⟨1⟩⟩(Φ1 ∧ ¬Φ2), W00, and W11]
Uniqueness of Maximal Secure Equilibria
W11: (1,1) is a secure equilibrium, and hence maximal
W10: (1,0) is the only secure equilibrium
W01: (0,1) is the only secure equilibrium
W00: (0,0) is the only possible secure equilibrium
W11 is the set of states where both players can collaborate
to win, yet each player keeps an “insurance policy.”
Generalization of Determinacy
Zero-sum games (Φ2 = ¬Φ1): the state space is partitioned into W1 and W2.
Non-zero-sum games (Φ1, Φ2): the state space is partitioned into W01, W10, W00, and W11.
Computing the Partition
• For ω-regular Φ1 and Φ2.
• Computing
– W10 = ⟨⟨1⟩⟩(Φ1 ∧ ¬Φ2)
– W01 = ⟨⟨2⟩⟩(Φ2 ∧ ¬Φ1)
involves solving games with conjunctive (Streett)
objectives. Winning strategies for these are generally not memoryless.
• We need to compute W11 ⊆ ⟨⟨1,2⟩⟩(Φ1 ∧ Φ2) = ∃(Φ1 ∧ Φ2).
[Figure: game graph with states s0, s1 and Büchi sets B1, B2; caption: in W10 even though ⟨⟨1,2⟩⟩(Φ1 ∧ Φ2)]
Computing the Partition
[Figure: state space diagram. Region labels: W01 = ⟨⟨2⟩⟩(Φ2 ∧ ¬Φ1); W10 = ⟨⟨1⟩⟩(Φ1 ∧ ¬Φ2); ⟨⟨2⟩⟩(Φ1 ⇒ Φ2); ⟨⟨1⟩⟩(Φ2 ⇒ Φ1); U1: ⟨⟨1⟩⟩Φ1; U2: ⟨⟨2⟩⟩Φ2; ⟨⟨1⟩⟩¬Φ2 and ⟨⟨2⟩⟩¬Φ1 with threat strategies σT, πT; ⟨⟨1,2⟩⟩(Φ1 ∧ Φ2) with cooperation strategies σC, πC]
Computing W11
• The pair (σC+σT, πC+πT) is a pair of winning retaliating strategies:
– Player-1 follows σC until reaching U1 ∪ U2 (then switches to a known
retaliating pair) or until player-2 deviates from πC (then switches to σT).
– Player-2 behaves symmetrically.
• From states s where there are no cooperative strategies, for all
retaliating strategies (σ,π) we have ω_{σ,π}(s) ⊨ ¬Φ1 ∧ ¬Φ2.
• Hence W11 = ⟨⟨1,2⟩⟩(Φ1 ∧ Φ2) in the subgame without W10 ∪ W01.
Computing the Partition
[Figure: final partition. Region labels: W01 = ⟨⟨2⟩⟩(Φ2 ∧ ¬Φ1); W10 = ⟨⟨1⟩⟩(Φ1 ∧ ¬Φ2); U1: ⟨⟨1⟩⟩Φ1; U2: ⟨⟨2⟩⟩Φ2; ⟨⟨1,2⟩⟩(Φ1 ∧ Φ2); W00]
Computing the Partition
1,2 LTL: 2EXPTIME [Pnueli/Rosner].
1,2 parity: coNP [Emerson/Jutla].
Application: Compositional Verification
P1 ∈ W1 (Φ1)
P2 ∈ W2 (Φ2)
Φ1 ∧ Φ2 ⇒ Φ
P1||P2 ⊨ Φ

P1 ∈ (W10 ∪ W11) (Φ1)
P2 ∈ (W01 ∪ W11) (Φ2)
Φ1 ∧ Φ2 ⇒ Φ
P1||P2 ⊨ Φ

W1 ⊂ W10 ∪ W11
W2 ⊂ W01 ∪ W11
An assume/guarantee rule.
Summary
Rational behavior in non-zero-sum graph games:
– not as restrictive as zero-sum games.
– captures the essence of multi-component systems.
– unique equilibria.
– computable for ω-regular objectives.
Reference: LICS 2004.