Imperfect Information in Games for Multi-Agent Systems

Internship Report, May 23, 2016 – July 29, 2016.
Raphaël Berthon (ENS Rennes, France)
Supervisors: Bastien Maubert and Aniello Murano (University of Naples Federico II, Italy)

Abstract. Problems involving agents arise in a wide range of fields: agents can represent robots in artificial intelligence, users in security, but also processors in computer architecture. A formal approach to these problems is to represent the world as a game structure, where the agents are the players, and a task involving the agents is a logical formula in a given language. In this framework, the model checking problem is to check whether the task (a formula) is satisfied on the model of the world (the graph of the game). A question that usually arises is whether a group of agents has a strategy to reach a given goal in the game. We study this problem for highly expressive logics, in the setting where agents have imperfect information about their world.

1 Introduction

Deciding whether a group of agents can fulfill a given task involving strategies (the task is represented by a logical formula) in a given setting (represented by a graph, the game structure) is a problem met both in AI and in computer security. The model checking problem is to check whether the task (the formula) holds on the model of the world (the graph of the game). Various logics exist to express questions about strategies. The more powerful the logic used to describe a goal, the more difficult it is to decide whether the goal can be reached [1]. We know the complexity bounds for most of these logics when players know exactly the state of the world at each step of the game [2–6]. This is not always the case: players may not be able to distinguish between some situations, or may forget information. To represent this, the notion of imperfect information has been introduced [7, 11].
A useful framework is perfect recall: agents recall everything they have seen, but they cannot distinguish between some of the states of the game. Very few results are known on logics dealing with strategies in the perfect recall framework. Deciding whether there is a winning strategy for a winning condition is undecidable in the general case [7], even for very weak logics like CTL. For these weak logics, some restrictions have been introduced to make the problem decidable [8, 9]. The complexity with perfect recall is almost always non-elementary (in fact, tower-exponential complete), and some optimal restrictions (such that any weaker restriction yields undecidability) are known [10].

Some useful logics: The logics we use are defined on Game Structures: graphs whose states are labeled with propositional variables, and where the joint decision of the set of players determines the transition taken from a given state (in fact, it depends on the history of all the states visited so far). We first present Linear Temporal Logic [2] (LTL). This logic includes boolean formulas over the variables of the current state, together with new operators. The next unary operator (denoted X) means that a given property will hold at the next step. The until binary operator (denoted U) means that a first property holds at every state until a second property eventually holds. Computation Tree Logic [3] (CTL) is a logic where the quantifier E (there exists a path) is used to say that the team composed of all the players in the game has a strategy to achieve a given goal, and A (for all paths) to say that whatever their strategies, they will reach this goal. An important remark is that the combination of E and ∧ can express that from a given state, two futures are possible. For example, we can require that the players have a way to make p true at the next state, and also a way to make p false at the next state. This is written EXp ∧ EX¬p.
This formula is satisfied if the current state has a transition leading to one state where p is true, and another leading to a state where p is false. As our formulas have to deal with all possible behaviors of the system, a given point in general has more than one possible future, and we shall view this set of possible futures as a tree. In this report, we focus on a generalization of CTL, Alternating-time Temporal Logic [4] (ATL), and some variations. In ATL, an existential quantifier ⟨⟨A⟩⟩ϕ can be used to say that the players in the set A can team up to fulfill the goal ϕ. Taking the team made of all the players in the game, we recover the operator E. Solving games with imperfect information and CTL or LTL objectives has been extensively studied, but apart from the general undecidability result, nothing was known for stronger logics like ATL under perfect recall. We first study ATL∗, an extension of ATL. Of the four logics presented above, only ATL∗ will be formally defined, but it is easy to understand LTL, CTL, and ATL as restrictions of this logic. In ATL, ⟨⟨A⟩⟩ can only precede specific kinds of formulas (strategy quantifiers can only be followed by X or U, and those cannot appear without a strategy quantifier). The ∗ indicates that this constraint is lifted. In general, when a logic is marked with ∗, there exists a constrained version of it.

Plan of the report: In this internship, we studied the model checking of more expressive logics dealing with strategies, in the special case where there is a hierarchy on players, as introduced by Peterson and Reif [11, 12]. This framework is defined in Section 2. We chose this specific restriction because almost all known decidable cases for weaker logics can be reduced to it [13, 14, 3, 15]. In Section 3, we show that ATL∗ model checking under the hierarchical assumption is decidable (in fact the result is more general).
We also consider ATL∗ with both strategy contexts (ATL∗sc [6]) and imperfect information. Strategy contexts are interesting because they can express Nash equilibria, which are widely used in artificial intelligence. To prove results on this logic, we defined a second logic equivalent to the first one, QCTL∗i (QCTL∗ [16] with imperfect information). These definitions are presented in Section 4. We prove the equivalence, and then the undecidability of both logics (even under the hierarchical assumption) by a reduction from MSO with equal level; these results are presented in Section 5. A decidability result was also obtained for ATLsc with imperfect information; it required introducing a new restriction. It is not described in detail in this report, and only mentioned in the conclusion.

2 Preliminaries

2.1 Game Structure

The logics presented earlier use games (or rather descriptions of what can happen in a given game) as their models. Games are represented by transition systems, where the choices of the players (and sometimes non-determinism) decide the next move.

Definition 1. A Game Structure [4] is a tuple G = ⟨k, Q, Π, π, (da)a∈{1,...,k}, δ⟩ such that:
– k ∈ N∗ is the number of players. Each player is designated by its index a ∈ {1, . . . , k}. We write Σ = {1, . . . , k}.
– Q is a finite set of states.
– Π is a finite set of propositional variables.
– π : Q → P(Π) is the labeling function: if q ∈ Q, then π(q) ⊆ Π is the set of propositional variables true at the state q.
– da : Q → N∗ is such that da(q) is the number of moves available to player a ∈ {1, . . . , k} at state q ∈ Q. The moves of a player at a given state are designated by their index i ∈ {1, . . . , da(q)}. For each state q, a move vector is a tuple ⟨j1, . . . , jk⟩ such that for every player a, we have ja ∈ {1, . . . , da(q)}. D is the move function: D(q) is the set {1, . . . , d1(q)} × · · · × {1, . . . , dk(q)} of move vectors at q.
– δ : Q × (N∗)^k → Q is the transition function: for a state q ∈ Q, if every player a ∈ Σ chooses ja (i.e. the players choose the move vector ⟨j1, . . . , jk⟩ ∈ D(q)), then the resulting state is δ(q, j1, . . . , jk).

Example 1. Figure 1 gives an example of a game structure, G1, with two players; we will use this example throughout the report. The transitions are labeled with move vectors in parentheses, the players' moves separated by commas. The propositional variables are not shown. Player 1 can only play + or ×; player 2 can play +, × or W.

[Fig. 1. G1, a Game Structure with two players, with states qi (initial), q=, q≠, qw, q> and q⊥.]

The symbol ∗ denotes that the transition can be taken whatever this player chooses. In this game, the first move decides non-deterministically whether the players go to state q= or q≠. From state q=, players 1 and 2 must play the same move to go to q>, otherwise they go to q⊥. The situation is reversed in q≠, where players 1 and 2 must choose different moves to go to q>. Finally, player 2 can force the game to wait a turn by playing W. In q=, this loops directly, but from q≠ the players first pass through qw and must play W another time to come back.

Now we define some basic notions on games. Let G be a Game Structure. A state qj is a successor of qi iff there is a move vector leading from qi to qj. A computation λ ∈ Q^ℕ is a sequence such that every λ[i+1] is a successor of λ[i]. A q-computation is a computation such that λ[0] = q. We use λ[0, i] to designate the (finite) prefix q0, q1, . . . , qi and λ[i, ∞] for the (infinite) suffix qi, qi+1, . . . A strategy fa : Q+ → N for player a ∈ Σ is a function mapping every finite prefix λ of a computation to a move. Thus, if λ ∈ Q+ and q is the last state of λ, fa(λ) ≤ da(q): the chosen move must be available.
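As an illustration, a Game Structure in the sense of Definition 1 can be encoded directly. The following Python sketch is our own encoding, not part of the report's formal development: moves are integer indices and the transition function δ is a dictionary.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class GameStructure:
    k: int          # number of players
    states: set     # Q
    labeling: dict  # pi: state -> set of true propositions
    d: dict         # d[(q, a)]: number of moves of player a at state q
    delta: dict     # delta[(q, (j1, ..., jk))]: resulting state

    def move_vectors(self, q):
        """D(q): all joint moves available at state q."""
        ranges = [range(1, self.d[(q, a)] + 1) for a in range(1, self.k + 1)]
        return list(product(*ranges))

    def successors(self, q):
        """States reachable from q in one step."""
        return {self.delta[(q, jv)] for jv in self.move_vectors(q)}

# A two-state toy game with two players, each with one move at each state:
g = GameStructure(
    k=2, states={"q0", "q1"}, labeling={"q0": set(), "q1": {"win"}},
    d={("q0", 1): 1, ("q0", 2): 1, ("q1", 1): 1, ("q1", 2): 1},
    delta={("q0", (1, 1)): "q1", ("q1", (1, 1)): "q1"},
)
```

The dictionary encoding makes the availability constraint fa(λ) ≤ da(q) easy to check: a move vector is legal exactly when each component lies in the range prescribed by d.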
The outcome of FA from q, denoted out(q, FA), is the set of q-computations that the players in A enforce when they start from q and follow the strategies in FA.

2.2 ATL∗

Syntax: We recall that the purpose of Alternating-time logic (here ATL∗) is to express conditions on strategies held by subsets of players. We use the same definition of ATL∗ as in [4], except that for the sake of simplicity we do not distinguish state formulas and path formulas.

Definition 2. Let Σ = {1, . . . , k} be a set of players and Π a finite set of propositional variables. An ATL∗ formula is given by the following grammar:

ϕ ::= p ∈ Π | ¬ϕ | ϕ ∨ ϕ | Xϕ | ϕUϕ | ⟨⟨A⟩⟩ϕ for all A ⊆ Σ

We only study closed formulas, i.e. formulas where all temporal operators are in the scope of some strategy operator (⟨⟨A⟩⟩ϕ or ⟦A⟧ϕ, as defined below).

Semantics: G, λ |= ϕ denotes that the path λ satisfies the formula ϕ in the structure G. When G is clear from the context, we omit it.

Definition 3. The semantics of ATL∗ is defined inductively as follows:
– λ |= p for propositions p ∈ Π iff p ∈ π(λ[0]).
– λ |= ¬ϕ iff λ ⊭ ϕ.
– λ |= ϕ1 ∨ ϕ2 iff λ |= ϕ1 or λ |= ϕ2.
– λ |= ⟨⟨A⟩⟩ϕ iff there exists FA, a set of strategies, one for each player in A, such that for all computations λ′ ∈ out(λ[0], FA), we have λ′ |= ϕ.
– λ |= Xϕ iff λ[1, ∞] |= ϕ.
– λ |= ϕ1Uϕ2 iff there exists a position i ≥ 0 such that λ[i, ∞] |= ϕ2 and for all j with 0 ≤ j < i, we have λ[j, ∞] |= ϕ1.

We will use the dual of ⟨⟨A⟩⟩, defined as ⟦A⟧ϕ := ¬⟨⟨A⟩⟩¬ϕ. As ⟨⟨A⟩⟩ϕ means that the players in A can cooperate to make ϕ true, the dual ⟦A⟧ϕ means that the players in A cannot cooperate to make ϕ false. We will also use the finally operator Fϕ := ⊤Uϕ, meaning that at some point in the future, ϕ will be true.
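To illustrate the temporal operators, here is a toy checker for X, U and F over a finite prefix of a computation (our own illustration, not part of the report). Real LTL semantics are over infinite computations, so this sketch is only faithful when the prefix is long enough to decide the formula.

```python
# Formulas are nested tuples; a trace is a list of sets of propositions
# true at each position (a finite prefix of a computation).

def holds(formula, trace, i=0):
    op = formula[0]
    if op == "prop":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "X":      # next: the subformula holds at position i+1
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "U":      # until: phi2 at some j, phi1 everywhere before j
        for j in range(i, len(trace)):
            if holds(formula[2], trace, j):
                return all(holds(formula[1], trace, m) for m in range(i, j))
        return False
    if op == "F":      # finally: F phi is True U phi
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    raise ValueError(op)

trace = [{"p"}, {"p"}, {"q"}]
print(holds(("U", ("prop", "p"), ("prop", "q")), trace))  # p U q: True
```

Note that for U it suffices to test the first position where the second property holds: if the first property fails before that position, it also fails for every later witness.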
It is important to notice that in this semantics, when a new strategy operator is met, all previous strategies are forgotten, even the strategies of the players not quantified by this new operator. We sometimes write G, q |= ⟨⟨A⟩⟩ϕ (using q instead of λ) to say that a formula is true from a given state of the game: as the ⟨⟨A⟩⟩ operator considers a new set of outcomes, only the current state is relevant. We study the case where the strategy context is kept starting in Section 4.1.

Example 2. Take the structure G1 of Figure 1, and a variable lost that is true only on q⊥. Player 2 has a strategy such that lost is never true, namely to always play W: G1, qi |= ⟨⟨2⟩⟩¬F lost. Similarly, if state q> has a variable win set to true, players 1 and 2 can cooperate to make win true: player 1 always plays +, and player 2 plays + in state q= and × in state q≠. Thus G1, qi |= ⟨⟨1, 2⟩⟩F win.

2.3 Imperfect information

Perfect recall: Until now, we have presented games with perfect information, where players have full access to the current computation λ[0, n]. With imperfect information, players do not have access to the exact computation. In the case of perfect recall, players still recall all they observe, but they cannot distinguish between some states. To represent what the k players see, we use different sets of indexes, Q1, . . . , Qj. States of the game are elements of Q1 × · · · × Qj. Each player i has an observation set oi ⊆ {1, . . . , j}: they can only see these components of the states. Thus the observation function of player i is the projection on the components in oi, written πoi : Q1 × · · · × Qj → ∏l∈oi Ql. Two states v and v′ are indistinguishable for player i if they are identical after application of the observation function: v ≡i v′ iff πoi(v) = πoi(v′). This notation generalizes naturally to paths.
A strategy for player i is a function fi : (Q1 × · · · × Qj)+ → N. When players have imperfect information, we add the constraint that players must make the same strategic choices in indistinguishable situations: if λ[0, n] ≡i λ′[0, n], then fi(λ[0, n]) = fi(λ′[0, n]). The model checking problem becomes much harder, as we have to take into account what every player sees. In the general case, games with imperfect information and perfect recall are undecidable [11]. As ATL∗ with imperfect information (ATL∗i) can express the existence of winning strategies in games with imperfect information, this logic is also undecidable.

[Fig. 2. G1 under imperfect information: (a) q= and q≠ merged; (b) q=, q≠, and qw merged.]

Example 3. Figure 2 shows the game G1 when players have imperfect information. Some states are merged and players do not see the whole game: in Figure 2(a) only states q= and q≠ are merged; in Figure 2(b), q=, q≠ and qw are merged. Players know the "real" game; the subgames are only a convenient way to understand what they see during the game. A better representation of what a player with perfect recall actually sees would use a powerset construction, but there is no room for this graph in this report. With the same notation as in Example 2, with player 1 seeing the game as in Figure 2(a) and player 2 as in Figure 2(b), the formula ⟨⟨1, 2⟩⟩F win still holds at qi. A strategy for player 2 is to play W twice, and then play +. Player 1 will either see that they stayed twice in the merged state q=,≠, meaning that the "real" state is q=, and will play +; otherwise, player 1 will see that they went through state qw before coming back to q=,≠, meaning that the real state is q≠, and will play ×.
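The uniformity constraint on strategies can be illustrated concretely. The following sketch (our own encoding, not from the report, with 0-based component indices unlike the 1-based indexing above) checks that a strategy, given as a table from histories to moves, prescribes the same move on histories that project to the same observation.

```python
# States are tuples over Q1 x ... x Qj; obs is a player's observation
# set o_i, given as a set of (0-based) component indices.

def project(state, obs):
    """pi_{o_i}: keep only the components the player observes."""
    return tuple(state[l] for l in sorted(obs))

def observed(history, obs):
    """Apply the observation function to a whole history."""
    return tuple(project(s, obs) for s in history)

def is_uniform(strategy, obs):
    """strategy maps histories (tuples of states) to moves; return True
    iff indistinguishable histories always receive the same move."""
    seen = {}
    for history, move in strategy.items():
        key = observed(history, obs)
        if seen.get(key, move) != move:
            return False  # two indistinguishable histories, two moves
        seen[key] = move
    return True

h1, h2 = (("a", "x"),), (("a", "y"),)        # differ only in component 1
print(is_uniform({h1: 1, h2: 2}, obs={0}))   # sees only component 0: False
print(is_uniform({h1: 1, h2: 2}, obs={1}))   # sees the differing one: True
```

A strategy that fails this check exploits information the player does not have, which is exactly what the constraint fi(λ[0, n]) = fi(λ′[0, n]) forbids.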
Hierarchy on observations: As said before, games with imperfect information and perfect recall are undecidable. We now present a useful restriction. There is a hierarchy between the observations of the players [11, 12] if there is an order on the players such that players with smaller indexes see more than players with higher indexes. More formally, with k players, for every i < k, oi+1 ⊆ oi. With CTL or LTL winning conditions, when there is a hierarchy on players, games with imperfect information and perfect recall are decidable [8, 9]. Even though this condition is not optimal, it is very strong, as almost all known decidability results (including some optimal ones) on these logics with perfect recall reduce to a hierarchy. This assumption is also very natural, as agents often come with a hierarchy, whether in what robots can see, or in the privileges on a computer ranging from the simple user to the administrator.

3 Solving ATL∗i with perfect recall

3.1 Overview

Let ϕ be an ATL∗i formula and G = ⟨k, Q, Π, π, (da)a∈{1,...,k}, δ⟩ a game structure. To check whether ϕ holds on the game structure G, we define a function solve_ATL∗i which returns the set of states of Q where ϕ is satisfied. Let is_ltl be a function such that is_ltl(ϕ) returns true if ϕ is an LTL formula and false otherwise. For ϕ an LTL formula, we assume a function solve_LTL(ϕ, A, G) which returns the set of states of Q where the players in A have a winning strategy for the winning condition ϕ. It is worth noticing that when there is a hierarchy on the players, we know how to solve games with imperfect information, perfect recall, and LTL winning conditions. In fact, our result is more general, as it lifts all the decidable cases of games with LTL objectives and perfect recall [8, 9] to ATL∗i.

Theorem 1.
For all classes of game structures on which games with LTL objectives, imperfect information, and perfect recall are decidable, ATL∗i model checking is decidable.

To prove this, we flatten the formula with a bottom-up approach. We search for an innermost strategy quantifier, in a subformula of the form ⟨⟨A⟩⟩ϕ. The formula ϕ under this quantifier is an LTL formula, so we know how to find the states where the agents in A have a strategy to enforce ϕ. We add a new propositional variable p⟨⟨A⟩⟩ϕ to those states, to indicate that A can enforce ϕ there, and replace ⟨⟨A⟩⟩ϕ by p⟨⟨A⟩⟩ϕ in the formula. Since we consider closed formulas, applying this method inductively yields a boolean formula, which is easy to solve, for example with a function solve_bool. In our algorithm, we use the union of games G1 ∪ G2. We do not define it in the general case, but only for the games met in this algorithm. The games we consider are almost identical, differing only in their propositional variables: they have a shared subset of variables, and a subset unique to each game. We take the union of all these propositional variables. The shared variables have the same valuation in both games, and we keep it; each unique variable has a valuation defined in only one of the games, and we keep that valuation.
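The bottom-up flattening just described can be sketched in runnable form. This is an illustration of our own: formulas are nested tuples, the game is a plain dictionary, and solve_ltl is assumed as a black box returning the states where a coalition can enforce an LTL objective.

```python
# Sketch of the flattening behind solve_ATL*_i. Example formula:
# ("coalition", [1, 2], ("F", ("prop", "win"))).

_fresh = [0]  # counter used to generate fresh propositional variables

def is_ltl(phi):
    """True iff phi contains no strategy quantifier."""
    if phi[0] == "coalition":
        return False
    return all(is_ltl(s) for s in phi[1:] if isinstance(s, tuple))

def flatten(phi, game, solve_ltl):
    if phi[0] == "prop":
        return phi
    if phi[0] == "coalition":
        _, coalition, psi = phi
        if not is_ltl(psi):
            psi = flatten(psi, game, solve_ltl)   # psi is now pure LTL
        winning = solve_ltl(psi, coalition, game)
        _fresh[0] += 1
        name = f"p{_fresh[0]}"                    # fresh variable
        for q in winning:                         # label the winning states
            game["labeling"][q].add(name)
        return ("prop", name)
    # boolean and temporal operators: recurse on subformulas
    return (phi[0],) + tuple(
        flatten(s, game, solve_ltl) if isinstance(s, tuple) else s
        for s in phi[1:])

game = {"states": {"q0", "q1"}, "labeling": {"q0": set(), "q1": set()}}
stub = lambda psi, A, g: {"q1"}   # stand-in for the assumed LTL solver
phi = flatten(("coalition", [1], ("F", ("prop", "win"))), game, stub)
print(phi, game["labeling"]["q1"])  # ('prop', 'p1') {'p1'}
```

For simplicity this sketch mutates a single game in place rather than building the union G1 ∪ G2; since the fresh variables are pairwise distinct, the effect is the same as the union described above.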
3.2 Model Checking Algorithm

let flatten(ϕ, G) = match ϕ with
| ⟨⟨A⟩⟩ψ ->
    if is_ltl(ψ) then (ψ′, G′) := (ψ, G); sol := solve_LTL(ψ, A, G)
    else
      (ψ′, G′) := flatten(ψ, G)
      sol := solve_LTL(ψ′, A, G′)
    end
    Π′ := Π ∪ {pϕ}
    for q in Q do
      if q ∈ sol then π′(q) := π(q) ∪ {pϕ} else π′(q) := π(q) end
    done
    return (pϕ, G′ = ⟨k, Q, Π′, π′, (da)a∈{1,...,k}, δ⟩)
| ¬ψ ->
    (ψ′, G′) := flatten(ψ, G)
    return (¬ψ′, G′)
| ψ1 ∨ ψ2 ->
    (ψ1′, G1) := flatten(ψ1, G)
    (ψ2′, G2) := flatten(ψ2, G)
    return (ψ1′ ∨ ψ2′, G1 ∪ G2)
| Xψ ->
    (ψ′, G′) := flatten(ψ, G)
    return (Xψ′, G′)
| ψ1 U ψ2 ->
    (ψ1′, G1) := flatten(ψ1, G)
    (ψ2′, G2) := flatten(ψ2, G)
    return (ψ1′ U ψ2′, G1 ∪ G2)

let solve_ATL∗i(ϕ, G) =
  (ϕ′, G′) := flatten(ϕ, G)
  return solve_bool(ϕ′, ∅, G′)

In the proof, we say that (ϕ1, G1) is equivalent to (ϕ2, G2) iff G1 and G2 share the same set Q of states, and for all q ∈ Q, it holds that G1, q |= ϕ1 iff G2, q |= ϕ2.

3.3 Correctness of the algorithm

We have to show that q0 is in solve_ATL∗i(ϕ, G) iff G, q0 |= ϕ. For this, we prove the following lemma by induction:

Lemma 1. For G a game structure and ϕ an ATL∗i formula, (ϕ′, G′) = flatten(ϕ, G) is equivalent to (ϕ, G), and ϕ′ is an LTL formula.

The proof can be found in Appendix A.1.

4 Strategy Context: Some new logics

We have proved two important results on ATL∗ when strategy contexts are kept. First, this logic is undecidable, even with a hierarchy on players. Second, it becomes decidable if the players can only choose their strategies in a given order. In this report, we only prove the undecidability result. We first define the logic we study, ATL∗sc,i. Then we introduce a new logic, QCTL∗i. We show that it is equivalent to ATL∗sc,i, and then prove our results directly on it.

4.1 ATL∗sc,i

Overview: The difference between an ATL∗ formula and an ATL∗ formula with strategy contexts (denoted ATL∗sc) lies in the semantics of the formula, and more precisely in the semantics of the strategy quantifier.
When a new strategy quantifier ⟨⟨A⟩⟩ is met in ATL∗sc, the players not in A keep their strategy. To represent this, we add a context to our models, giving the current strategies of the players. We use a definition close to the one by Laroussinie and Markey [6]. In fact we study ATL∗sc with imperfect information and perfect recall, denoted ATL∗sc,i, and compare it to ATL∗ in the same framework.

Syntax: Let Σ = {1, . . . , k} be a set of players and Π a finite set of propositional variables.

Definition 4. The syntax of ATL∗ with strategy context (denoted ATL∗sc) is the same as that of ATL∗, and is given by the following grammar:

ϕ ::= p ∈ Π | ¬ϕ | ϕ ∨ ϕ | Xϕ | ϕUϕ | ⟨⟨a⟩⟩ϕ for all a ∈ Σ

We only allow quantification over one player at a time, but as the context is kept, this makes no difference with respect to quantification over sets of players; it only makes the proofs simpler. As for ATL∗, we only study closed formulas, i.e. formulas of the form ⟨⟨a⟩⟩ϕ or ⟦a⟧ϕ.

Semantics: First recall that the strategies of player i are the elements of F = N^((Q1×···×Qj)+). Strategies will in fact be elements of F∅ = F ∪ {∅}, where ∅ denotes that no strategy has been assigned to player i. A context is a set of strategies, one for each player: a context is an element of (F∅)^k. Updating a context C with a new strategy fa, denoted C ← fa, means that we replace the current strategy of player a in C by the strategy fa. We use C, G, λ |= ϕ to denote that in the context C, the computation λ satisfies the formula ϕ in the structure G. When G is clear from the context, we omit it. The semantics is almost the same as for ATL∗, except for ⟨⟨a⟩⟩. This operator is also the only one to use the strategy context: the other operators just pass the context along in the recursive definition of the semantics. For both these reasons, we only give the new definition of ⟨⟨a⟩⟩:

Definition 5.
The semantics of ⟨⟨a⟩⟩ϕ in ATL∗sc,i is:
– C, λ |= ⟨⟨a⟩⟩ϕ iff there exists fa, a strategy for player a, such that with C′ = C ← fa, for all computations λ′ ∈ out(λ[0], C′), we have C′, λ′ |= ϕ.

While no strategy context has been set, the context is ∅^k. We write ⟦a⟧ϕ for ¬⟨⟨a⟩⟩¬ϕ.

Example 4. Take the same problem as in Example 3, with game G1 of Figure 1, player 1 seeing as in Figure 2(a) and player 2 seeing as in Figure 2(b). We can use ATL∗sc,i to express properties that we cannot express in ATL∗i. For example, we can ask whether player 2 has a single strategy such that the players never lose (lost is never true, i.e. the players never reach q⊥), and such that if player 1 cooperates, they can win (reach win in q>). The strategy context lets us require that the strategy of player 2 not change between the left member of ∧ (player 2 plays alone) and the right one (players 1 and 2 play together). Formally, we try to model-check the formula ⟨⟨2⟩⟩(¬F lost ∧ ⟨⟨1⟩⟩F win) on the model G1, with player 1 unable to distinguish q= from q≠, and player 2 unable to distinguish q=, q≠, and qw. This formula does not hold; there is no such strategy: if player 2 waits and always plays W, then q> can never be reached, and if player 2 chooses to play either + or ×, then player 1 may refuse to cooperate and play the wrong move, sending them to q⊥.

As we have seen in the example, ATL∗sc,i is more powerful than ATL∗i. One interesting consequence is that ATL∗sc,i is powerful enough to express Nash equilibria:

Definition 6. Let A be a set of players, such that each player a ∈ A has an objective ϕa. There is a Nash equilibrium if the players have a strategy profile such that no individual player could change their strategy to obtain a better outcome. This is expressed by the following formula:
⟨⟨A⟩⟩ ⋀a∈A (¬ϕa ⇒ ¬⟨⟨a⟩⟩ϕa)

4.2 QCTL∗i

Overview: QCTL∗ consists of CTL∗ formulas enriched with quantifiers over propositional variables: ∃q.ϕ means that at each step, we decide the valuation of the propositional variable q. This logic is well known, and known to be equivalent to ATL∗sc [16]. We enrich it with imperfect information on propositional quantification: in QCTL∗i, we have different quantification operators ∃i q.ϕ. This way, for two different states and the paths that led to them, we can require the same decision for the value of q whenever the paths are indistinguishable for operator i. We show that this new logic is equivalent to ATL∗sc with imperfect information and perfect recall (ATL∗sc,i). The semantics of a QCTL∗i formula is given on a Kripke Structure. Like a game structure, a Kripke structure is a labeled graph, but without the notion of players. The logics presented earlier used games (or rather descriptions of what can happen in a given game) as their models; here, we only use computations, as there are no players.

Definition 7. A Kripke Structure is a tuple K = ⟨Q, Π, π, δ⟩ such that:
– Q is a finite set of states.
– Π is a finite set of propositional variables.
– π : Q → P(Π) is the labeling function: if q ∈ Q, then π(q) ⊆ Π is the set of propositional variables true at the state q.
– δ ⊆ Q × Q is a binary relation, the transition relation: for q, q′ ∈ Q, (q, q′) ∈ δ means that the transition from q to q′ is available.

Game structures can be seen as Kripke structures by forgetting the moves and the players. Most of the definitions given earlier still hold. We define the computation tree from a state q: it is the tree whose nodes are sequences of states, each representing a history and labeled with the propositions that hold in its last state. The computation tree can be seen as the unfolding of the structure K from the state q; we denote it by TK.
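The unfolding of a Kripke structure into its computation tree can be sketched as follows. The tree itself is infinite, so this sketch (our own encoding, not from the report) is bounded by a depth parameter; nodes are histories, labeled by the propositions of their last state.

```python
# delta: set of pairs (q, q') giving the transition relation;
# labeling: dict mapping each state to its set of true propositions.

def unfold(delta, labeling, q, depth):
    """Return {history: labels} for all histories from q up to `depth`."""
    tree = {(q,): labeling[q]}
    frontier = [(q,)]
    for _ in range(depth):
        nxt = []
        for hist in frontier:
            for (s, t) in delta:
                if s == hist[-1]:            # extend the history by one step
                    node = hist + (t,)
                    tree[node] = labeling[t]  # label = last state's props
                    nxt.append(node)
        frontier = nxt
    return tree

delta = {("a", "b"), ("b", "a")}
lab = {"a": {"p"}, "b": set()}
print(sorted(unfold(delta, lab, "a", 2)))  # histories up to length 3
```

Each key of the returned dictionary is one node of TK; the "application of λ[0, k] to T" used below simply selects the subtree rooted at the corresponding history.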
If T is a computation tree rooted at q, and λ[0, k] a finite q-computation in T, the application of λ[0, k] to T is the subtree of T rooted at the node obtained by taking at each step the transition proposed by λ; this represents a move along λ in the tree.

Syntax: Let Π be a finite set of propositional variables.

Definition 8. A Quantified CTL formula with imperfect information and perfect recall (QCTL∗i) is a formula given by the following grammar:

ϕ ::= p ∈ Π | ¬ϕ | ϕ ∨ ϕ | Xϕ | ϕUϕ | Eϕ | ∃i q.ϕ for all i ∈ {1, . . . , k}

We only allow closed formulas, i.e. formulas beginning with Eϕ or with Aϕ := ¬E¬ϕ.

Semantics: Let K = ⟨Q, Π, π, δ⟩ be a Kripke structure. The semantics is defined on the unfolding of this structure from a given state: we use T, the computation tree of K from q. In the framework of imperfect information with perfect recall, we take Q = Q1 × · · · × Qj. Let k be the number of distinguished existential quantifier operators ∃i; they are used to describe imperfect information. For each i ∈ {1, . . . , k}, let oi ⊆ {1, . . . , j} be an observation set, meaning that operator i can only see these components of the states: each operator i has an observation function πoi which gives the observable components of a state. We add that there is a hierarchy on the operators: oi+1 ⊆ oi. We write T, λ |= ϕ to denote that the computation λ satisfies the formula ϕ in the tree T. When the current path is not relevant, we just write T |= ϕ.

Definition 9. The semantics of QCTL∗i formulas is defined as:
– T, λ |= p for propositions p ∈ Π iff p ∈ π(λ[0]).
– T, λ |= ¬ϕ iff T, λ ⊭ ϕ.
– T, λ |= ϕ1 ∨ ϕ2 iff T, λ |= ϕ1 or T, λ |= ϕ2.
– T, λ |= Eϕ iff there exists a λ[0]-computation λ′ such that T, λ′ |= ϕ.
– T, λ |= Xϕ iff, with T′ the tree obtained from the application of λ[0] to T, we have T′, λ[1, ∞] |= ϕ.
– T, λ |= ϕ1Uϕ2 iff there exists a position i ≥ 0 such that, with T′ the tree obtained from the application of λ[0, i] to T, we have T′, λ[i, ∞] |= ϕ2, and for all j such that 0 ≤ j < i, with Tj′ the tree obtained from the application of λ[0, j] to T, we have Tj′, λ[j, ∞] |= ϕ1.
– T, λ |= ∃i p.ϕ iff there exists a valuation of p for each node of the computation tree (yielding a new tree T′) such that for all finite paths ρ, ρ′ in the computation tree, if πoi(ρ) = πoi(ρ′), then the valuation of p at ρ equals the valuation of p at ρ′, and in this modified tree, T′, λ |= ϕ.

5 Undecidability result on ATL∗sc,i

We only give a sketch of the undecidability proof.

5.1 Equivalence between ATL∗sc,i and QCTL∗i

First, we show the equivalence between our two logics, which can be expressed in two parts:

Proposition 1. For every ATL∗sc,i formula ϕ, there is a QCTL∗i formula ϕ′ such that G, q |= ϕ iff TG |= ϕ′.

Proposition 2. For every QCTL∗i formula ϕ, there is an ATL∗sc,i formula ϕ′ such that TG |= ϕ iff G, q |= ϕ′.

From ATL∗sc,i to QCTL∗i: As in Laroussinie and Markey [6], going from ATL∗sc,i to QCTL∗i is straightforward except for strategy quantifiers. When a strategy quantifier is encountered, we use a propositional quantifier with the same observation set as the quantified player, and add a formula ensuring both that the choices of this quantifier represent a strategy, and that when this strategy is followed, the translation of the formula under the strategy quantifier is true. The full translation is given in Appendix A.2.

From QCTL∗i to ATL∗sc,i: The translation of QCTL∗i into ATL∗sc,i is shorter but a bit less intuitive than the reverse [6]. We take k players, one per variable introduced by an existential quantifier, each with the same observation set as the corresponding quantifier.
Each player associated to a variable chooses its truth value, and we add one more player, k + 1, who decides the moves taken in the graph. The full translation is given in Appendix A.3.

5.2 Undecidability: Hierarchy on observations

Overview: We do not give the full formal undecidability proof here. It is known that QCTL∗ is equivalent to Monadic Second Order logic (MSO) interpreted on trees [16]. It is also known that MSO on trees becomes undecidable as soon as a binary predicate equal_level is added to the logic (the resulting logic is denoted MSOeq_lvl), where equal_level(q, q′) holds if and only if q and q′ are at the same height in the tree [16]. As we can already translate any MSO formula to QCTL∗i (using the same translation as for QCTL∗), we only show here that we can translate the equal_level predicate to QCTL∗i, even in the case where there is a hierarchy on players.

Proposition 3. For every MSOeq_lvl formula ϕ and tree T, there exists a QCTL∗i formula ϕ′ (which is effectively computable) such that T |= ϕ iff TK |= ϕ′.

Proof. We only need to introduce one existential quantifier: a blind quantifier ∃1, which can only see the number of steps taken so far (blind means that the observation set of 1 is ∅). We translate the equal_level predicate by adding a propositional variable pq to every state q, and by saying that this quantifier has a way to label the tree with a fresh variable x such that x is true whenever pρ or pρ′ is met, and x is true at most once on each path. This formula is verified on a tree if and only if ρ and ρ′ are at the same height in the tree, as the quantifier can only see the height of the nodes and must label indistinguishable nodes in the same way. The difficult part is not to make the formula true when the two nodes are at the same height; it is to make it false when they are not. More formally, we replace equal_level(ρ, ρ′) by:

∃1 x.AG [(pρ ⇒ x) ∧ (pρ′ ⇒ x) ∧ (x ⇒ XG¬x)]

Theorem 2.
Model checking of QCTL∗i with imperfect information, perfect recall, and hierarchy on quantifiers is undecidable.

Sketch of proof. In the proof of undecidability of MSO with equal_level, the undecidable fragment consists of formulas starting with a first existential quantifier under which the equal_level predicate appears; this fragment can be translated to QCTL∗i with a quantifier seeing everything followed by a blind quantifier. As a consequence, QCTL∗i with a hierarchy on quantifiers is undecidable. By the translation defined in Section 5.1, ATL∗sc,i is then undecidable, even with a hierarchy on players.

Corollary 1. Model checking of ATL∗sc,i with imperfect information, perfect recall, and hierarchy on players is undecidable.

Proof. By Proposition 2. In the translation of QCTL∗i to ATL∗sc,i, the hierarchy on the quantifiers becomes a hierarchy on players, and the assumptions of imperfect information and perfect recall are preserved. As QCTL∗i with the previous restrictions is undecidable and can be translated to ATL∗sc,i with imperfect information, perfect recall, and hierarchy on players, this second logic with these restrictions is also undecidable.

6 Conclusion

Overview: We have proved that the model checking problem with imperfect information, perfect recall, and hierarchy on players is decidable for ATL∗i and undecidable for ATL∗sc,i. The undecidability result is strong, but not optimal: the proof only uses formulas with an alternation depth of 1, and two players. This result does not rule out a decidable fragment expressive enough to reason about the existence of Nash equilibria.
Other results: We have also proved, but not presented here, that the model checking of ATL∗sc,i with perfect recall and hierarchy on players becomes decidable in the case where we also have a hierarchy on the order of quantification: the strategy quantifiers for the players knowing more must be nested deeper than the quantifiers for the players knowing less. We proved it on QCTL∗i (and obtained it on ATL∗sc,i by using Proposition 1). The decision method is very close to Kupferman and Vardi's thin trees method [3, 15]. We only used the narrow operator, and a projection operator close to the one used in the QCTL∗ decidability proof. This result is strong, as removing the hierarchy on quantifications leads to the previous undecidability result. In the proof, we had to suppose that propositional variables were visible to every player (so that no player confounds two states with different propositional variables). We do not know yet whether this assumption can be removed.

Future works: The decidability of the model checking of Nash equilibria with perfect recall, and the optimality of our decidability result on ATL∗sc,i, are two possible axes of work. A third one is imperfect information on Strategy Logic (SL). SL has been described in various ways [5], but very few results are known, and none of them use imperfect information. We hope that the logics and methods introduced in this report (QCTL∗i and a reduction from MSOeq_lvl on trees) may be used to obtain results on some restrictions of SL, for a well-chosen definition of imperfect information.

Acknowledgements. I am very grateful to Aniello Murano and Bastien Maubert, who involved themselves a lot during their supervision of my internship. I would also like to thank Antonio, Dario, and Giovanni for their warm welcome and their invaluable help.

References

1. Moshe Y. Vardi. Alternating automata and program verification. In Computer Science Today, pages 471–485. Springer, 1995.
2. Wolfgang Thomas. Automata on infinite objects.
Handbook of Theoretical Computer Science, Volume B, pages 133–191, 1990.
3. Orna Kupferman, Moshe Y. Vardi, and Pierre Wolper. An automata-theoretic approach to branching-time model checking. Journal of the ACM (JACM), 47(2):312–360, 2000.
4. Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. Alternating-time temporal logic. Journal of the ACM (JACM), 49(5):672–713, 2002.
5. Fabio Mogavero, Aniello Murano, and Moshe Y. Vardi. Reasoning about strategies. In LIPIcs-Leibniz International Proceedings in Informatics, volume 8. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2010.
6. François Laroussinie and Nicolas Markey. Augmenting ATL with strategy contexts. Information and Computation, 245:98–123, 2015.
7. Orna Kupferman and Moshe Y. Vardi. Church's problem revisited. Bulletin of Symbolic Logic, pages 245–263, 1999.
8. Dietmar Berwanger and Anup Basil Mathew. Infinite games with finite knowledge gaps. arXiv preprint arXiv:1411.5820, 2014.
9. Dietmar Berwanger, Anup Basil Mathew, and Marie Van den Bogaard. Hierarchical information patterns and distributed strategy synthesis. In International Symposium on Automated Technology for Verification and Analysis, pages 378–393. Springer, 2015.
10. Bernd Finkbeiner and Sven Schewe. Uniform distributed synthesis. In 20th Annual IEEE Symposium on Logic in Computer Science (LICS'05), pages 321–330. IEEE, 2005.
11. Gary Peterson, John Reif, and Salman Azhar. Lower bounds for multiplayer noncooperative games of incomplete information. Computers & Mathematics with Applications, 41(7):957–992, 2001.
12. Gary Peterson, John Reif, and Salman Azhar. Decision algorithms for multiplayer noncooperative games of incomplete information. Computers & Mathematics with Applications, 43(1):179–206, 2002.
13. Amir Pnueli and Roni Rosner. On the synthesis of a reactive module. In Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 179–190. ACM, 1989.
14. Amir Pnueli and Roni Rosner.
Distributed reactive systems are hard to synthesize. In Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on, pages 746–757. IEEE, 1990.
15. Orna Kupferman and Moshe Y. Vardi. Synthesizing distributed systems. In Logic in Computer Science, 2001. Proceedings. 16th Annual IEEE Symposium on, pages 389–398. IEEE, 2001.
16. François Laroussinie and Nicolas Markey. Quantified CTL: expressiveness and complexity. arXiv preprint arXiv:1411.4332, 2014.

A Appendix

A.1 Correctness of the algorithm solving ATL∗i with perfect recall

Lemma 1. For G a game structure and ϕ an ATL∗ formula, (ϕ′, G′) = flatten(ϕ, G) is equivalent to (ϕ, G), and ϕ′ is an LTL formula.

As solve_ATL∗(ϕ, G) returns the states of G on which flatten(ϕ, G) (which is an LTL formula) holds, if flatten(ϕ, G) is equivalent to (ϕ, G), then q0 is in solve_ATL∗(ϕ, G) iff G, q0 |= ϕ.

– If ϕ = ⟨⟨A⟩⟩ψ, with ψ an LTL formula, we have indeed that for all q ∈ Q, G, q |= ⟨⟨A⟩⟩ψ iff G′, q |= pϕ.
– Else, if ϕ = ⟨⟨A⟩⟩ψ with ψ not an LTL formula, we have by the induction hypothesis that (ψ′, G′) = flatten(ψ, G) is equivalent to (ψ, G). As ψ′ is an LTL formula, we are back to the first case.
– If ϕ = ¬ψ, as (ψ, G) and (ψ′, G′) = flatten(ψ, G) are equivalent, so are (¬ψ, G) and (¬ψ′, G′).
– If ϕ = ψ1 ∨ ψ2, we have (ψ′1, G1) and (ψ′2, G2) respectively equivalent to (ψ1, G) and (ψ2, G). As G1 is only G with added propositional variables, which can be made to appear only in ψ′1 (resp. ψ′2), we have that (ψ1 ∨ ψ2, G) is equivalent to (ψ′1 ∨ ψ′2, G1 ∪ G2).
– If ϕ = Xψ, the proof is exactly the same as for ¬ψ.
– If ϕ = ψ1 U ψ2, the proof is exactly the same as for ψ1 ∨ ψ2.

Thus, (ϕ′, G′) = flatten(ϕ, G) is equivalent to (ϕ, G), and ϕ′ is an LTL formula. As a consequence, G′, q |= ϕ′ iff G, q |= ϕ. As demonstrated earlier, this proves that a state q0 is in solve_ATL∗(ϕ, G) iff G, q0 |= ϕ.

A.2 Translation from ATL∗sc,i to QCTL∗i

Proposition 1.
For every ATL∗sc,i formula ϕ, there is a QCTL∗i formula ϕ′ such that G, q |= ϕ iff TG |= ϕ′.

The translation has to keep track of the current context C of strategies already chosen. The translation is not difficult for most of the operators, except ⟨⟨a⟩⟩ϕ. To translate ⟨⟨a⟩⟩ϕ, we introduce two QCTL∗i formulas. ϕstrat(a) introduces a first set of propositional variables, {m^a_1, …, m^a_k}, one for each possible move of player a. In each node of the tree, exactly one of them is true; this represents the move chosen by player a in the strategy being represented. ϕout(A) introduces a new variable, pout, which is only true on the paths given by the conjunction of the strategy described above and the strategies already in the context. Finally we add a formula stating that if we follow pout (our strategy and our context), then the translation of ϕ is true.

For the first operators, the translation (without distinguishing state formulas from path formulas) is the following:

p^C = p
(¬ϕ)^C = ¬ϕ^C
(Xϕ)^C = Xϕ^C
(ϕ1 ∨ ϕ2)^C = ϕ1^C ∨ ϕ2^C
(ϕ1 U ϕ2)^C = ϕ1^C U ϕ2^C

To translate ⟨⟨a⟩⟩ϕ, we introduce the always operator, Aϕ ≝ ¬E¬ϕ, whose meaning is that the formula ϕ is true on all possible paths going from the current state. We also define the finally operator, Fϕ ≝ ⊤Uϕ, and the one we use, globally: Gϕ ≝ ¬F¬ϕ. The globally operator means that a formula is true on all the states we meet along the current path.

We recall that da : Q → N∗ is such that da(q) is the number of moves available to player a at state q ∈ Q. Let A be a subset of the players Σ; at state q, the set of states reachable when each player a ∈ A plays a given move ma is denoted by Next(q, A, (ma)a∈A). Formally,

Next(q, A, (ma)a∈A) = ⋃_{(mi)i∉A, mi ∈ {1,…,di(q)}} δ(q, (ma)a∈{1,…,k}).

For m = (m_ia)a∈A, we write pm for the formula ⋀_{a∈A} m^a_{ia}.
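To make the definition of Next concrete, it can be read as an enumeration over the moves of the players outside A. The following is a minimal sketch of that enumeration, assuming moves are numbered 1..d(i, q) and the transition function is given as a dictionary; all names and encodings here are illustrative, not the report's notation.

```python
from itertools import product

def next_states(q, A, moves_A, d, delta, players):
    """Successors of state q when each player a in A plays moves_A[a],
    while the remaining players range over all their available moves.

    d(i, q)  -> number of moves of player i at q (numbered 1..d(i, q))
    delta    -> dict mapping (q, full_move_vector) to the successor state
    players  -> ordered tuple of all player indices
    """
    others = [i for i in players if i not in A]
    result = set()
    # Enumerate every combination of moves for the players outside A.
    for combo in product(*(range(1, d(i, q) + 1) for i in others)):
        move = dict(moves_A)
        move.update(zip(others, combo))
        vector = tuple(move[i] for i in players)  # full move vector, in player order
        result.add(delta[(q, vector)])
    return result
```

For instance, with two players each having two moves at q0, fixing player 1's move and letting player 2 range over its moves yields exactly the union in the definition above.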
We now define our new formulas:

ϕstrat(a) = AG ⋀_{q∈Q} [ pq ⇒ ⋁_{i∈{1,…,da(q)}} ( m^a_i ∧ ⋀_{j≠i} ¬m^a_j ) ]

ϕout(A) = pout ∧ AG(¬pout ⇒ AX¬pout) ∧ AG [ pout ⇒ ⋁_{q∈Q} ( pq ∧ ⋁_{m∈(di(q))i∈A} ( pm ∧ AX( ⋁_{q′∈Next(q,A,m)} pq′ ⇔ pout ) ) ) ]

Then, we can finally define our translation:

(⟨⟨a⟩⟩ϕ)^C = ∃^a m^a_1. … .m^a_k. pout. ( ϕstrat(a) ∧ ϕout(C ∪ {a}) ∧ A(Gpout ⇒ ϕ^{C∪{a}}) )

A.3 Translation from QCTL∗i to ATL∗sc,i

Proposition 2. For every QCTL∗i formula ϕ, there is an ATL∗sc,i formula ϕ′ such that TG |= ϕ iff G, q0 |= ϕ′.

With k the number of propositional variables under a strategy quantifier, take k + 1 players. The transitions are such that player k + 1 chooses the moves between the states q ∈ Q (the "real states"), and can choose to go to a state cq,i where player i chooses the move. These states represent the choice of whether at q the i-th variable is true (going to state pi) or false (going to state p⊥).

To translate a QCTL∗i formula defined on a Kripke structure K = ⟨Q, Π, π, δ⟩, we take the game G = ⟨k′, Q′, Π′, π′, (da)a∈{1,…,k′}, δ′⟩. With k the number of quantified variables in our QCTL∗i formula, we take k′ = k + 1 (each player i has the same observation set as the existential quantifier used to quantify Pi), and Q′ = Q ∪ {cq,i | i = 1, …, k, q ∈ Q} ∪ {pi | i = 1, …, k} ∪ {p⊥}. We add new variables: Π′ = Π ∪ {Pi | i = 1, …, k} ∪ {PQ}. Finally, we change π′ such that for q ∈ Q, π′(q) = π(q) ∪ {PQ}, for i ∈ {1, …, k}, π′(pi) = {Pi}, and otherwise π′(cq,i) = π′(p⊥) = ∅.

The changes to the transition function δ′ give us the numbers of moves available da; we only give the details for the transition function. Let M = (j1, …, jk+1) be a move vector. For each q ∈ Q, let q′1, …, q′m be the successors of q in K. We take δ′(q, M) = q′j for j ∈ {1, …, m} and M such that jk+1 = j. We also add, for each state q ∈ Q, a transition to cq,i for i ∈ {1, …, k} when in M we have jk+1 = m + i. Finally, we add that δ′(cq,i, M) = pi if ji = 1 and δ′(cq,i, M) = p⊥ if ji = 2.

We use the following translation for the QCTL∗i formula itself (note that the translation does not use the model), where ϕ~ denotes the translation of ϕ:

(¬ϕ)~ = ¬ϕ~
(ϕ1 ∨ ϕ2)~ = ϕ1~ ∨ ϕ2~
(ϕ1 U ϕ2)~ = ϕ1~ U ϕ2~
(Xϕ)~ = Xϕ~
(∃Pi.ϕ)~ = ⟨⟨{i}⟩⟩ϕ~
(Eϕ)~ = ⟨⟨{k + 1}⟩⟩(GPQ ∧ ϕ~)
p~ = ⟨⟨{k + 1}⟩⟩X⟨⟨{k + 1}⟩⟩Xp if p ∈ {P1, …, Pk}, and p~ = p otherwise
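Since this translation is purely syntactic, it can be sketched as a recursion over the formula tree. The following is an illustrative sketch only, using a hypothetical tuple-based encoding of formulas (operator name first) rather than the report's actual definitions.

```python
def translate(phi, k):
    """Syntactic translation of a QCTL*_i formula into an ATL*_sc,i formula.

    Formulas are tuples: ('atom', p), ('not', f), ('or', f, g), ('X', f),
    ('U', f, g), ('E', f), ('exists', i, f).  k is the number of quantified
    variables P1..Pk; player k + 1 is the one moving in the graph.
    """
    op = phi[0]
    if op == 'atom':
        p = phi[1]
        if p.startswith('P') and p[1:].isdigit():
            # Quantified variable: two moves of player k+1 cross the
            # gadget states c_{q,i} -> p_i / p_bot before p can be read.
            return ('coalition', [k + 1], ('X',
                    ('coalition', [k + 1], ('X', phi))))
        return phi
    if op == 'not':
        return ('not', translate(phi[1], k))
    if op in ('or', 'U'):
        return (op, translate(phi[1], k), translate(phi[2], k))
    if op == 'X':
        return ('X', translate(phi[1], k))
    if op == 'E':
        # E becomes a strategy quantifier for player k+1, constrained to
        # stay on the real states (G PQ).
        return ('coalition', [k + 1],
                ('and', ('G', ('atom', 'PQ')), translate(phi[1], k)))
    if op == 'exists':
        # Exists P_i becomes a strategy quantifier for player i.
        return ('coalition', [phi[1]], translate(phi[2], k))
    raise ValueError('unknown operator: %r' % op)
```

The recursion mirrors the table above case by case; only the atoms P1..Pk and the operators E and ∃ actually change shape.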