Games with Secure Equilibria

Krishnendu Chatterjee (Berkeley)
Thomas A. Henzinger (EPFL)
Marcin Jurdzinski (Warwick)

Classification of 2-Player Games
• Zero-sum games: complementary payoffs.
• Non-zero-sum games: arbitrary payoffs.

Zero-sum example (row, column):
   1,-1    0,0
  -1,1     2,-2

Non-zero-sum example (row, column):
  3,1   1,0
  3,2   4,2

Classical Notion of Rationality
Nash equilibrium: none of the players gains by deviation.

  3,1   1,0
  3,2   4,2
  (row, column)

New Notion of Rationality
Nash equilibrium: none of the players gains by deviation.
Secure equilibrium: none hurts the opponent by deviation.

Secure Equilibria
• Natural notion of rationality for component systems:
  - First, a component tries to meet its spec.
  - Second, a component may obstruct the other components.
• For $\omega$-regular specs, there is always a unique maximal secure equilibrium.

Games on Graphs
• Modeling component interaction:
  - Vertices = states
  - Players = components
  - Moves = transitions
• Applications:
  - Synthesis (control) of sequential systems
  - Verification of adversarial specs
  - Receptiveness
  - Compatibility
  - Early error detection
  - (Bi)simulation checking
  - Model checking
  - Game semantics
  - etc.

Example: Verification
Starvation Freedom (mutual exclusion protocols, cache coherence protocols)
In a multi-process system, can a process P that wishes to proceed always eventually proceed, no matter what the other processes do?
  $\forall\Box\,(a \Rightarrow \langle\langle P \rangle\rangle \Diamond b)$
  $\forall\Box\,(a \Rightarrow \forall\Diamond b)$  ✗
  $\forall\Box\,(a \to \exists\Diamond b)$  ✗

Example: Verification
Starvation Freedom (mutual exclusion protocols, cache coherence protocols)
In a multi-process system, can a process P that wishes to proceed always eventually proceed, no matter what the other processes do, provided they meet their specs?
  $\forall\Box\,(a \Rightarrow \langle\langle P \rangle\rangle \Diamond b)$  ✗
  $\forall\Box\,(a \Rightarrow \forall\Diamond b)$  ✗
  $\forall\Box\,(a \to \exists\Diamond b)$  ✗

Games on Graphs
• Turn-based (perfect-information) games:
  - Game graph $G = ((V,E), (V_1,V_2))$.
  - $E \subseteq V \times V$: serial edge relation.
  - $(V_1,V_2)$: partition of the vertex set $V$.
• The game is played by moving a token along edges of the graph:
  - $V_1$: player-1 moves the token.
  - $V_2$: player-2 moves the token.

Example: A Game Graph
[Figure: game graph with states $s_0$, $s_1$, $s_2$, $s_3$.]

Plays and Strategies
• Play (outcome) of a game:
  - Infinite path $(s_0, s_1, \ldots)$ of states $s_i \in V$ such that $(s_i, s_{i+1}) \in E$ for all $i \ge 0$.
  - $\Omega$: set of all plays.
• Player-1 strategy:
  - Given a prefix of a play ending in $V_1$, specifies how to extend the play.
  - $\sigma: V^* \cdot V_1 \to V$ such that $(s, \sigma(x \cdot s)) \in E$ for all $x \in V^*$ and $s \in V_1$.
  - Symmetric definition for player-2 strategies $\pi$.
• Given two strategies $\sigma$, $\pi$ and a start state $s$, there is a unique play $\omega_{\sigma,\pi}(s)$.

Example
[Figure: game graph with states $s_0$, $s_1$, $s_2$, $s_3$.]
• Example of a play: $s_0\, s_1\, s_2$
• Strategies that yield this play:
  - Player-1: $s_0 \to s_1$
  - Player-2: $s_1 \to s_2$

Memoryless Strategies
• Independent of the history of the play:
  - $\sigma: V_1 \to V$
  - $\pi: V_2 \to V$
• Yield simple controllers.
• Existence puts games into NP.

Objectives and Payoffs
• What the players are playing for:
  - Player-1 objective: play in set $\Phi_1 \subseteq V^\omega$.
  - Player-2 objective: play in set $\Phi_2 \subseteq V^\omega$.
  - General objectives: Borel sets in the Cantor topology.
  - Finite-state objectives: $\omega$-regular sets (level 2.5 Borel sets).
• From objectives to payoffs:
  - If $\omega_{\sigma,\pi}(s) \in \Phi_i$, then player $i$ gets payoff 1, else payoff 0.
  - The payoff profile for a strategy profile $(\sigma,\pi)$ at a state $s$ is $(v_1^{\sigma,\pi}(s), v_2^{\sigma,\pi}(s))$.

Classification of Games
• Zero-sum games:
  - Complementary objectives: $\Phi_2 = \neg\Phi_1$.
  - Possible payoff profiles (1,0) and (0,1).
• Non-zero-sum games:
  - Arbitrary objectives $\Phi_1$, $\Phi_2$.
  - Possible payoff profiles (1,1), (1,0), (0,1), and (0,0).

Zero-Sum Games on Graphs
• Winning:
  - Winning-1 states $s$: $(\exists\sigma)(\forall\pi)\ \omega_{\sigma,\pi}(s) \in \Phi_1$.
  - Winning-2 states $s$: $(\exists\pi)(\forall\sigma)\ \omega_{\sigma,\pi}(s) \in \Phi_2$.
• Determinacy:
  - Every state is winning-1 or winning-2.
  - Borel determinacy [Martin 75].
  - Memoryless determinacy for parity games [Emerson/Jutla 91].
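For the simplest objective, reaching a target set, the winning-1 states can be computed with the standard attractor fixpoint. The sketch below is illustrative only: the edge list, the state ownership, and the target set are assumptions (the slides show a four-state graph but the extracted text does not give its edges), and `attractor` is a hypothetical helper name.

```python
from typing import Dict, List, Set


def attractor(succ: Dict[str, List[str]], owner: Dict[str, int],
              player: int, target: Set[str]) -> Set[str]:
    """States from which `player` can force the token into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v, outs in succ.items():
            if v in attr:
                continue
            if owner[v] == player:
                # The player moves here: one edge into the attractor suffices.
                ok = any(w in attr for w in outs)
            else:
                # The opponent moves here: every edge must lead into the attractor.
                ok = all(w in attr for w in outs)
            if ok:
                attr.add(v)
                changed = True
    return attr


# Assumed graph: s0 belongs to player 1, s1 to player 2, s2 and s3 are sinks.
SUCC = {"s0": ["s1", "s3"], "s1": ["s0", "s2"], "s2": ["s2"], "s3": ["s3"]}
OWNER = {"s0": 1, "s1": 2, "s2": 1, "s3": 1}

win1 = attractor(SUCC, OWNER, player=1, target={"s2"})
win2 = set(SUCC) - win1  # by determinacy, player 2 wins everywhere else
```

With these assumed edges, player 2 can always move from s1 back to s0, so only the target itself is winning for player 1, and the two winning sets partition the state space as determinacy requires.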
[Figure: state space labeled with the payoff profiles (1,0) and (0,1).]

Non-Zero-Sum Games on Graphs
Nash equilibrium $(\sigma,\pi)$ at state $s$:
  $(\forall\sigma')\ v_1^{\sigma,\pi}(s) \ge v_1^{\sigma',\pi}(s)$
  $(\forall\pi')\ v_2^{\sigma,\pi}(s) \ge v_2^{\sigma,\pi'}(s)$

Example: Reachability Game
[Figure: game graph with states $s_0$, $s_1$, $s_2$, $s_3$ and target sets $R_1$, $R_2$.]
Objective for player $i$ is to visit $R_i$.
Nash equilibria:
  - $(s_0 \to s_1,\ s_1 \to s_2)$: (1,0)
  - $(s_0 \to s_3,\ s_1 \to s_0)$: (0,1)
  - $(s_0 \to s_1,\ s_1 \to s_0)$: (0,0)

$\omega$-Regular Objectives
Synthesis:
  - Zero-sum game: controller versus plant.
  - Control against all plant behaviors.
Verification:
  - Non-zero-sum specs for components.
  - Components may behave adversarially, but without threatening their own specs.

Non-Zero-Sum Verification Games
Drawbacks of Nash equilibrium:
  - Does not capture adversarial behavior.
  - Not unique.
A new notion of equilibrium:
  - Takes into account both non-zero-sum payoffs and adversarial behavior.
  - Captures the essence of component-based systems.
  - Unique for Borel objectives.
  - Computable for $\omega$-regular objectives.

Secure Equilibria
• Secure strategy profile $(\sigma,\pi)$ at state $s$:
  $(\forall\pi')\ \big(v_1^{\sigma,\pi'}(s) < v_1^{\sigma,\pi}(s) \Rightarrow v_2^{\sigma,\pi'}(s) < v_2^{\sigma,\pi}(s)\big)$
  $(\forall\sigma')\ \big(v_2^{\sigma',\pi}(s) < v_2^{\sigma,\pi}(s) \Rightarrow v_1^{\sigma',\pi}(s) < v_1^{\sigma,\pi}(s)\big)$
• A secure profile $(\sigma,\pi)$ is a contract: if player-1 deviates to lower player-2's payoff, her own payoff decreases as well, and vice versa.
• Secure equilibrium: secure strategy profile that is also a Nash equilibrium.

Secure Equilibria
  3,3   1,3          2,1   0,0
  3,1   2,2          0,0   1,2
  (row, column)

Example
[Figure: the reachability game with states $s_0$, $s_1$, $s_2$, $s_3$ and target sets $R_1$, $R_2$.]
Nash equilibria:
  - $(s_0 \to s_1,\ s_1 \to s_2)$: (1,0), not secure
  - $(s_0 \to s_3,\ s_1 \to s_0)$: (0,1), not secure
  - $(s_0 \to s_1,\ s_1 \to s_0)$: (0,0), secure

Lexicographic Payoff Profile Ordering
• Player-1 preference $\succeq_1$:
  - $(v_1,v_2) \succ_1 (v'_1,v'_2)$ iff $v_1 > v'_1$ or ($v_1 = v'_1$ and $v_2 < v'_2$).
  - $(v_1,v_2) \succeq_1 (v'_1,v'_2)$ iff $(v_1,v_2) \succ_1 (v'_1,v'_2)$ or $(v_1,v_2) = (v'_1,v'_2)$.
• Player-2 preference $\succeq_2$ is symmetric.
• Captures payoff maximization with external adversarial choice.
• Provides a notion of maximality:
  - Player-1: $(1,0) \succeq_1 (1,1) \succeq_1 (0,0) \succeq_1 (0,1)$
  - Player-2: $(0,1) \succeq_2 (1,1) \succeq_2 (0,0) \succeq_2 (1,0)$

Alternative Characterization
A secure equilibrium is an equilibrium with respect to the $\succeq_1$ and $\succeq_2$ payoff profile orderings:
A strategy profile $(\sigma,\pi)$ is a secure equilibrium at $s$ iff
  $(\forall\sigma')\ (v_1^{\sigma,\pi}(s), v_2^{\sigma,\pi}(s)) \succeq_1 (v_1^{\sigma',\pi}(s), v_2^{\sigma',\pi}(s))$
  $(\forall\pi')\ (v_1^{\sigma,\pi}(s), v_2^{\sigma,\pi}(s)) \succeq_2 (v_1^{\sigma,\pi'}(s), v_2^{\sigma,\pi'}(s))$

Example: Buechi Game
[Figure: game graph with states $s_0$, $s_1$, $s_2$, $s_3$, $s_4$ and Buechi sets $B_1$, $B_2$.]
Objective for player $i$ is to visit $B_i$ infinitely often.
Nash equilibria:
  - $(s_0 \to s_4,\ s_1 \to s_4)$: (0,0), secure
  - $(s_0 \to s_1,\ s_1 \to s_0)$: (1,0), not secure
  - $(s_0 \to s_2,\ s_3 \to s_1)$: (1,1), secure

Example
[Figure: the Buechi game with states $s_0$, $s_1$, $s_2$, $s_3$, $s_4$ and Buechi sets $B_1$, $B_2$.]
• Secure equilibrium $(\sigma,\pi)$ with payoff (1,1) at $s_0$:
  - $\sigma$: if $s_1 \to s_0$, then $s_2$ else $s_4$.
  - $\pi$: if $s_3 \to s_1$, then $s_0$ else $s_4$.
• Pair of "retaliating" strategies.
• Require memory.

Maximal Secure Equilibria
Theorem: At every state $s$ of a graph game with Borel objectives, there is a unique secure equilibrium profile that is maximal with respect to both $\succeq_1$ and $\succeq_2$.
This is the rational behavior of both players at $s$ if they wish to
  1. satisfy their own objectives and, then,
  2. sabotage the opponent's objective.

Strongly Winning and Retaliating Strategies
• Winning strategies:
  - Player-1 wins for the objective $\Phi_1$.
  - Player-2 wins for $\Phi_2$.
• Strongly winning strategies:
  - Player-1 wins for the objective $\Phi_1 \wedge \neg\Phi_2$.
  - Player-2 wins for $\Phi_2 \wedge \neg\Phi_1$.
• Retaliating strategies:
  - Player-1 wins for the objective $\Phi_2 \Rightarrow \Phi_1$.
  - Player-2 wins for $\Phi_1 \Rightarrow \Phi_2$.

Winning Sets
• $W_1$: set of states s.t. player-1 has a winning strategy.
• $W_2$: set of states s.t. player-2 has a winning strategy.
• $W_{10}$: set of states s.t. player-1 has a strongly winning strategy.
• $W_{01}$: set of states s.t. player-2 has a strongly winning strategy.
• $W_{11}$: set of states $s$ s.t. there is a pair $(\sigma,\pi)$ of retaliating strategies with $\omega_{\sigma,\pi}(s) \models \Phi_1 \wedge \Phi_2$.
• $W_{00}$: set of states $s$ s.t. each player has a retaliating strategy and for every pair $(\sigma,\pi)$ of retaliating strategies, $\omega_{\sigma,\pi}(s) \models \neg\Phi_1 \wedge \neg\Phi_2$.

State Space Partition
[Figure: state space with the following regions.]
  - $W_{10} = \langle\langle 1 \rangle\rangle (\Phi_1 \wedge \neg\Phi_2)$; outside it, $\langle\langle 2 \rangle\rangle (\neg\Phi_1 \vee \Phi_2)$.
  - $W_{01} = \langle\langle 2 \rangle\rangle (\Phi_2 \wedge \neg\Phi_1)$; outside it, $\langle\langle 1 \rangle\rangle (\neg\Phi_2 \vee \Phi_1)$.
  - Remaining region: retaliating strategies.
There is no player-2 retaliating strategy in $W_{10}$, and no player-1 retaliating strategy in $W_{01}$.

Retaliating Strategy Pairs
• Every player-1 retaliating strategy ensures for every player-2 strategy that $\Phi_2 \Rightarrow \Phi_1$.
• Hence every pair $(\sigma,\pi)$ of retaliating strategies ensures $(\neg\Phi_1 \vee \neg\Phi_2) \Rightarrow (\neg\Phi_1 \wedge \neg\Phi_2)$.
• $W_{11}$ and $W_{00}$ are the regions of the state space where both players have retaliating strategies:
  - $W_{11}$ contains the states where some pair $(\sigma,\pi)$ of retaliating strategies ensures $\Phi_1 \wedge \Phi_2$.
  - $W_{00}$ contains the states where all pairs $(\sigma,\pi)$ of retaliating strategies lead to $\neg\Phi_1 \wedge \neg\Phi_2$.
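To make the difference between Nash and secure equilibria concrete, the reachability example from the earlier slides can be checked by brute force. This is a minimal sketch under stated assumptions: the edges are inferred from the moves listed on the slides, $s_2$ and $s_3$ are treated as sinks with self-loops, $R_1 = \{s_2\}$ and $R_2 = \{s_3\}$ are read off the listed payoffs, and deviations are restricted to memoryless choices, which happens to be enough for this four-state game but is not true in general. All function names are hypothetical.

```python
from itertools import product

# Assumed edges: player 1 moves at s0, player 2 moves at s1, s2 and s3 are sinks.
SUCC = {"s0": ["s1", "s3"], "s1": ["s2", "s0"], "s2": ["s2"], "s3": ["s3"]}
R1, R2 = {"s2"}, {"s3"}  # assumed reachability targets
P1_CHOICES = SUCC["s0"]
P2_CHOICES = SUCC["s1"]


def payoff(sigma, pi, start="s0"):
    """Payoff profile of the play induced by the choices sigma (at s0) and pi (at s1)."""
    move = {"s0": sigma, "s1": pi, "s2": "s2", "s3": "s3"}
    seen, s = [], start
    while s not in seen:  # the play is eventually periodic
        seen.append(s)
        s = move[s]
    visited = set(seen)
    return (int(bool(visited & R1)), int(bool(visited & R2)))


def prefers1(a, b):
    """Lexicographic preference of player 1: own payoff first, then a lower opponent payoff."""
    return (a[0], -a[1]) >= (b[0], -b[1])


def prefers2(a, b):
    """Symmetric lexicographic preference of player 2."""
    return (a[1], -a[0]) >= (b[1], -b[0])


def is_nash(sigma, pi):
    v = payoff(sigma, pi)
    return (all(v[0] >= payoff(d, pi)[0] for d in P1_CHOICES)
            and all(v[1] >= payoff(sigma, d)[1] for d in P2_CHOICES))


def is_secure_equilibrium(sigma, pi):
    # Equilibrium with respect to the two lexicographic orderings.
    v = payoff(sigma, pi)
    return (all(prefers1(v, payoff(d, pi)) for d in P1_CHOICES)
            and all(prefers2(v, payoff(sigma, d)) for d in P2_CHOICES))


profiles = list(product(P1_CHOICES, P2_CHOICES))
nash = [(s, p, payoff(s, p)) for s, p in profiles if is_nash(s, p)]
secure = [(s, p, payoff(s, p)) for s, p in profiles if is_secure_equilibrium(s, p)]
```

Under these assumptions the enumeration finds the three Nash equilibria listed on the slides, with payoffs (1,0), (0,0), and (0,1), and only the (0,0) profile passes the secure check: in each of the other two, one player can lower the opponent's payoff without lowering her own.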
State Space Partition
[Figure: state space partitioned into four regions.]
  - $W_{01} = \langle\langle 2 \rangle\rangle (\Phi_2 \wedge \neg\Phi_1)$
  - $W_{10} = \langle\langle 1 \rangle\rangle (\Phi_1 \wedge \neg\Phi_2)$
  - $W_{00}$
  - $W_{11}$

Uniqueness of Maximal Secure Equilibria
  - $W_{11}$: (1,1) is a secure equilibrium, and hence maximal.
  - $W_{10}$: (1,0) is the only secure equilibrium.
  - $W_{01}$: (0,1) is the only secure equilibrium.
  - $W_{00}$: (0,0) is the only possible secure equilibrium.
$W_{11}$ is the set of states where both players can collaborate to win, yet each player keeps an "insurance policy."

Generalization of Determinacy
  - Zero-sum games, $\Phi_2 = \neg\Phi_1$: the state space splits into $W_1$ and $W_2$.
  - Non-zero-sum games, $\Phi_1$, $\Phi_2$: the state space splits into $W_{01}$, $W_{10}$, $W_{00}$, and $W_{11}$.

Computing the Partition
• For $\omega$-regular $\Phi_1$ and $\Phi_2$.
• Computing
  - $W_{10} = \langle\langle 1 \rangle\rangle (\Phi_1 \wedge \neg\Phi_2)$
  - $W_{01} = \langle\langle 2 \rangle\rangle (\Phi_2 \wedge \neg\Phi_1)$
  involves solving games with conjunctive (Streett) objectives. Winning strategies for these are generally not memoryless.
• We need to compute $W_{11} \subseteq \langle\langle 1,2 \rangle\rangle (\Phi_1 \wedge \Phi_2) = \exists (\Phi_1 \wedge \Phi_2)$.
[Figure: states $s_0$, $s_1$ with sets $B_1$, $B_2$; a state lies in $W_{10}$ even though it is in $\langle\langle 1,2 \rangle\rangle (\Phi_1 \wedge \Phi_2)$.]

Computing the Partition
[Figure: state space with the following labeled regions.]
  - $W_{01} = \langle\langle 2 \rangle\rangle (\Phi_2 \wedge \neg\Phi_1)$ and $W_{10} = \langle\langle 1 \rangle\rangle (\Phi_1 \wedge \neg\Phi_2)$
  - $\langle\langle 2 \rangle\rangle (\Phi_1 \Rightarrow \Phi_2)$ and $\langle\langle 1 \rangle\rangle (\Phi_2 \Rightarrow \Phi_1)$
  - $U_1$, labeled $\langle\langle 1 \rangle\rangle \Phi_1$, and $U_2$, labeled $\langle\langle 2 \rangle\rangle \Phi_2$
  - $\langle\langle 2 \rangle\rangle \neg\Phi_1$ and $\langle\langle 1 \rangle\rangle \neg\Phi_2$: threat strategies $\sigma_T$, $\pi_T$
  - $\langle\langle 1,2 \rangle\rangle (\Phi_1 \wedge \Phi_2)$: cooperation strategies $\sigma_C$, $\pi_C$

Computing $W_{11}$
• The pair $(\sigma_C + \sigma_T,\ \pi_C + \pi_T)$ is a pair of winning retaliating strategies:
  - Player-1 follows $\sigma_C$ until reaching $U_1 \cup U_2$ (then switches to a known retaliating pair) or until player-2 deviates from $\pi_C$ (then switches to $\sigma_T$).
  - Player-2 behaves symmetrically.
• From states $s$ where there are no cooperative strategies, for all pairs $(\sigma,\pi)$ of retaliating strategies we have $\omega_{\sigma,\pi}(s) \models \neg\Phi_1 \wedge \neg\Phi_2$.
• Hence $W_{11} = \langle\langle 1,2 \rangle\rangle (\Phi_1 \wedge \Phi_2)$ in the subgame without $W_{10} \cup W_{01}$.
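The switching behaviour of the combined strategy can be sketched as a history-dependent function. This is only an illustration of the "cooperate until the opponent deviates, then threaten" part; it leaves out the switch on reaching $U_1 \cup U_2$, and the helper names, the tiny state set, and the example strategies are all hypothetical.

```python
from typing import Callable, Dict, List, Set

History = List[str]
Strategy = Callable[[History], str]


def cooperate_then_threaten(coop_self: Strategy, coop_other: Strategy,
                            threat_self: Strategy, other_states: Set[str]) -> Strategy:
    """Follow coop_self while the opponent has followed coop_other; otherwise play threat_self."""
    def strategy(history: History) -> str:
        for i in range(len(history) - 1):
            prefix = history[: i + 1]
            # An opponent move that differs from the cooperative one is a deviation.
            if history[i] in other_states and history[i + 1] != coop_other(prefix):
                return threat_self(history)
        return coop_self(history)
    return strategy


def memoryless(table: Dict[str, str]) -> Strategy:
    """Wrap a state-to-successor table as a strategy that ignores the history."""
    return lambda history: table[history[-1]]


# Hypothetical game: player 1 owns a and d, player 2 owns b, c is a sink.
sigma_c = memoryless({"a": "b", "d": "a"})   # cooperation strategy of player 1
pi_c = memoryless({"b": "a"})                # cooperation strategy of player 2
sigma_t = memoryless({"a": "c", "d": "c"})   # threat strategy of player 1
sigma = cooperate_then_threaten(sigma_c, pi_c, sigma_t, other_states={"b"})
```

On the history `["a", "b", "a"]` player 2 has cooperated, so `sigma` keeps cooperating; on `["a", "b", "d"]` player 2 has left the cooperative path, so `sigma` switches to the threat move. The resulting strategy needs memory even though its three components are memoryless.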
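Computing the Partition
[Figure: final partition of the state space.]
  - $W_{01} = \langle\langle 2 \rangle\rangle (\Phi_2 \wedge \neg\Phi_1)$
  - $W_{10} = \langle\langle 1 \rangle\rangle (\Phi_1 \wedge \neg\Phi_2)$
  - $U_1$, labeled $\langle\langle 1 \rangle\rangle \Phi_1$, and $U_2$, labeled $\langle\langle 2 \rangle\rangle \Phi_2$
  - $\langle\langle 1,2 \rangle\rangle (\Phi_1 \wedge \Phi_2)$
  - $W_{00}$

Computing the Partition
  - $\Phi_1$, $\Phi_2$ LTL: 2EXPTIME [Pnueli/Rosner].
  - $\Phi_1$, $\Phi_2$ parity: coNP [Emerson/Jutla].

Application: Compositional Verification
Premises:
  $P_1 \in W_1(\Phi_1)$
  $P_2 \in W_2(\Phi_2)$
  $\Phi_1 \wedge \Phi_2 \Rightarrow \Phi$
Conclusion:
  $P_1 \| P_2 \models \Phi$

Application: Compositional Verification
Premises:
  $P_1 \in (W_{10} \cup W_{11})(\Phi_1)$
  $P_2 \in (W_{01} \cup W_{11})(\Phi_2)$
  $\Phi_1 \wedge \Phi_2 \Rightarrow \Phi$
Conclusion:
  $P_1 \| P_2 \models \Phi$
Here $W_1 \subset W_{10} \cup W_{11}$ and $W_2 \subset W_{01} \cup W_{11}$.
An assume/guarantee rule.

Summary
Rational behavior in non-zero-sum graph games:
  - not as restrictive as zero-sum games.
  - captures the essence of multi-component systems.
  - unique equilibria.
  - computable for $\omega$-regular objectives.

Reference: LICS 2004.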