Chronicles Construction Starting from the Fault Model of the System to Diagnose

Bruno Guerraz¹ and Christophe Dousson²

Abstract. This article addresses the diagnosis of distributed systems such as telecommunication networks. Among the various techniques used for on-line diagnosis, we are interested in chronicle recognition. A chronicle is a set of patterns of observable events that are temporally constrained. Within the framework of diagnosis, the set of chronicles gives the signatures of the system failures and constitutes the expertise needed for the diagnosis. As for rule-based systems, acquiring this expertise is problematic. We propose a method based on Petri net unfolding to generate the chronicles needed for the diagnosis, starting from the fault model of the system to be diagnosed.

1 Introduction

This paper is related to the monitoring of dynamic systems such as telecommunication networks: more precisely, we address the problem of following the dynamic changes of the system in order to detect (and identify) its faulty states. Our approach relies on the fault model of the system. Modeling is based on a tile representation which was introduced in [1]: a tile is a transition of a partial state of the system. The advantage of such tile-based modeling over more classical Petri net-based approaches is that knowledge of the global behavior of the system is not required. Moreover, tiles are generic: the same tile is used several times in the same model. Telecommunication networks are characterized, on the one hand, by the large size of the model and, on the other hand, by the fact that the majority of the events are not observable; moreover, several transitions may have the same label (an alarm generated by the network). These characteristics lead to ambiguity when the diagnosis is performed on line directly with the model, and thus to a combinatorial explosion of explanations.
To solve this problem, several methods have been proposed. [5] proposes a stochastic approach: the diagnosis problem is defined as the computation of the most likely history of a partially stochastic Petri net given a sequence of observed alarms. Another approach is to compile the model off line in order to minimise the on-line work. [11] proposes such an approach: a structure called a “diagnoser”, built from the global model of the system, is used for the on-line diagnosis.

In this work, we take time information into account in the model and we propose a way to “compile” the fault model of the system into chronicles. These chronicles then allow the diagnosis of the system by on-line recognition. A chronicle represents some pieces of the evolution of the observed system. A chronicle is a partial order of observable events constrained by time and labeled by a faulty situation. Figure 1 shows a chronicle which contains five events. Its interpretation is the following: the event “B” must occur between 3 and 4 units of time after “A”, the event “C” must occur between 10 and 15 units of time after “B” and between 2 and 5 units of time after “D”, etc.

¹ France Telecom R&D, 2 avenue Pierre Marzin, 22307 Lannion cedex, France. e-mail: [email protected]
² –id.–, e-mail: [email protected]

Figure 1. A chronicle: a partial order of observable events with some time constraints between them.

More recently, [7] proposes the notion of template languages, which is similar to chronicles (events constrained by time), with the difference that templates model the correct behavior of the system and are used to confirm it, whereas in the “chronicle-based diagnosis” proposed in [2], chronicles model the faulty behavior of a system and the diagnosis consists in tracking on line the occurrences of these “abnormal” chronicles.
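The interpretation of a chronicle as interval constraints between event dates can be made concrete with a small sketch. It only encodes the three constraints of Figure 1 that the text spells out (the constraint involving “E” is left out because its endpoints are not stated), and the dictionary layout and the function name `matches` are assumptions made for this illustration, not the recognition engine used by the authors.

```python
# Constraints of Figure 1 that are stated in the text:
# (source, target) -> (lower, upper) means
# lower <= date(target) - date(source) <= upper.
CHRONICLE = {
    ("A", "B"): (3, 4),
    ("B", "C"): (10, 15),
    ("D", "C"): (2, 5),
}


def matches(dates, constraints):
    """Return True if the dated events satisfy every interval constraint."""
    for (src, dst), (low, high) in constraints.items():
        if src not in dates or dst not in dates:
            return False  # an expected event is missing
        delay = dates[dst] - dates[src]
        if not (low <= delay <= high):
            return False
    return True


# "C" arrives 12 units after "B" and 3 units after "D": accepted.
accepted = matches({"A": 0, "B": 3, "D": 12, "C": 15}, CHRONICLE)
# "C" arrives 17 units after "B": rejected.
rejected = matches({"A": 0, "B": 3, "D": 12, "C": 20}, CHRONICLE)
```

An on-line recognizer would process events incrementally instead of checking a complete set of dates, but the accepted patterns are the same.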
The advantage of the chronicle-based approach is, on the one hand, the simplicity and expressive power of the formalism and, on the other hand, the efficiency of the on-line recognition tools. In the context of telecommunication network monitoring, each abnormal situation can be described by one or more chronicles; the events correspond to time-stamped alarms and the constraints relate to their occurrence dates. “Chronicle-based diagnosis” consists in tracking on line the occurrences of these chronicles in the flow of alarms raised by the telecommunication equipment.

In section 2, we describe an application example that we use to illustrate most of the concepts presented later. In section 3, we introduce the initial tile model and how it is refined by adding time to define temporal tiles. Then, in section 4, we present an algorithm which builds a structure containing all the possible behaviors of the temporal tile system, starting from the initial temporal tiles: the time branching process. This algorithm is inspired by Petri net unfolding, a well-known partial order semantics of Petri nets introduced in [10] and described in more detail in [3]. In section 5, we explain how we extract the chronicles from the maximal time occurrence net and how we build all the chronicles necessary for the diagnosis of the system.

2 Application example

We consider an example taken from our application context, the monitoring of a Synchronous Digital Hierarchy (SDH) telecommunication network. The whole SDH model describes the behavior of the network when a fault occurs and how its effects are propagated across all the equipment. In the general case, faults are propagated through the hierarchy of SDH layers of a given piece of equipment (vertical propagation) and also among pieces of equipment (horizontal propagation) through the links and connections.
Figure 2 gives an example of a laser breakdown on the physical layer (optical fiber): (1) the emitter detects the problem and generates a Transmission Failed (TF) alarm and (2) the right-hand receiver of the peer detects the absence of signal and generates a Loss Of Signal (LOS) alarm. This behavior corresponds to the ITU-T Recommendations on SDH networks³. At this stage, a safety mechanism (3) called “Automatic Laser Switch” (ALS) stops the emitter in order to avoid damage caused by emitting a laser on a cut fiber. Then, as before, the emitter in turn generates a TF and, finally, (4) the left-hand receiver generates a LOS. All these steps are also accompanied by vertical propagations, which are not described in this paper but can be found in [1].

³ ITU-T Recommendation G.774: “Synchronous digital hierarchy (SDH) Management information model for the network element view”.

Figure 2. Example of the laser emitter breakdown: (1) the transmitter breaks down and generates a TF alarm, (2) the receiver notices an absence of signal and generates a LOS alarm, (3) the transmitter stops automatically and generates a TF alarm and (4) the receiver in turn generates a LOS alarm.

3 Representation

3.1 Tiles

The tile formalism was introduced in [6]. Let $\mathcal{V}$ be a finite set of variables $v$. Each variable $v$ takes its value in a finite domain denoted by $D_v$. The set of states of the system is defined by $X_{\mathcal{V}} = \prod_{v \in \mathcal{V}} D_v$. Thus, for $V \subseteq \mathcal{V}$, we can define the set of local states by $X_V = \prod_{v \in V} D_v$. Elements of $X_V$ are denoted by $x_V$ and, for $v \in V$, we denote by $v(x_V)$ the value of the variable $v$ in the state $x_V$.

In the sequel, transitions between two local states are referred to as tiles. Formally, a tile is a quadruplet $\theta = \langle V, x_V^-, \alpha, x_V^+ \rangle$, where $V$ is a subset of variables and $(x_V^-, \alpha, x_V^+)$ is a transition between the local states $x_V^-$ and $x_V^+$. $\alpha$ labels the transition and ranges over a set of possible event labels which contains the dumb label, denoted by $\varepsilon$. Figure 3 shows a tile modeling the ALS mechanism of section 2. The variables are “E” for Emission and “R” for Reception. Variables are represented by circles and the transition by a box. We also denote the pre-condition and the post-condition of a tile $\theta$ by $pre(\theta) = (V, x_V^-)$ and $post(\theta) = (V, x_V^+)$.

Figure 3. A tile: if the system is in the partial state {R:failed, E:ok}, then the dumb label $\varepsilon$ is emitted, the value of the variable “R” remains the same and the value of the variable “E” becomes “failed”.

3.2 Time

We consider time as a linearly ordered discrete set of instants whose resolution is sufficient for modeling the dynamics of the environment (i.e. any change can be adequately represented as taking place at some instant of the set). Our time-map manager relies on time points as elementary primitives and we handle time constraints as binary constraints between them. We define a time constraint graph $T$ as a set of time points with time constraints between them. These constraints are numerical and are expressed as pairs of real numbers $C_T(t, t') = [I^-, I^+]$ corresponding to the lower and upper bounds on the temporal distance from $t$ to $t'$. The special values $+\infty$ and $-\infty$ can also be used (so the symbolic constraint after can be expressed as $[0, +\infty]$). Notice also that, if $t$ or $t'$ is not defined in the time graph $T$, then we can always define $C_T(t, t') = [-\infty, +\infty]$ (in other words, if a time point is not constrained by a graph, then the default constraint is $[-\infty, +\infty]$).

We use the following operators for constraint propagation ($\oplus$) and constraint conjunction ($\cap$):

\[ I \oplus J = [I^- + J^-,\ I^+ + J^+] \tag{1} \]

\[ I \cap J = [\max(I^-, J^-),\ \min(I^+, J^+)] \tag{2} \]

We also define a partial order relation (denoted by $\preceq$) between two time graphs as follows:

\[ T \preceq T' \equiv_{def} \forall (t, t') \in T^2,\ C_T(t, t') \subseteq C_{T'}(t, t') \tag{3} \]

Due to constraint propagation, a constraint graph may have many equivalent representations (two constraint graphs are equivalent if they admit exactly the same sets of solutions for all the $t_i$). But we can prove that there exists a unique equivalent constraint graph which is minimal (in the sense of the relation defined by equation 3). It is computed by a path-consistency algorithm (PC-2) with a complexity of $O(n^3)$ [8]. In the following, we denote this minimal graph by $\widehat{T}$; it is the canonical representative of its equivalence class.

In order to merge the constraints of two time graphs, we also define the union ($\cup$) of time graphs. The resulting graph ($T = T' \cup T''$) contains all the time points of $T'$ and $T''$ and its constraints are defined as follows⁴:

\[ \forall (t, t') \in T^2,\ C_T(t, t') = C_{T'}(t, t') \cap C_{T''}(t, t') \tag{4} \]

Figure 4 shows the result of the union of two time constraint graphs, $T = T' \cup T''$. Of course, a graph resulting from the union of two graphs can be inconsistent. For instance, if the time constraint of the figure $C_{T'}(t_2, t_3) = [5, 13]$ is replaced by $[0, 0]$, then $T''$ is not consistent.

⁴ We have the following relation: $(T' \cup T'') \preceq T'$.

Figure 5. The left tile models the laser breakdown of section 2 (the variable “Emission” becomes failed); the right tile means that when “Reception” becomes failed, then after 1 to 4 units of time the alarm LOS is generated and Sync2 becomes ok.
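As an illustration of the time constraint graphs of section 3.2, the following Python sketch implements the operators of equations (1) and (2), the union of equation (4) and a consistency check. It is not the PC-2 algorithm cited in the text: for simplicity it swaps in a Floyd-Warshall style tightening loop, which also yields the minimal graph for this kind of interval constraint. The dictionary representation and the names `compose`, `conj`, `union` and `minimal` are assumptions made for this example.

```python
import math
from itertools import combinations

INF = math.inf
ANY = (-INF, INF)  # default constraint between unconstrained time points


def compose(i, j):
    """Equation (1): chain a constraint t -> u with a constraint u -> t'."""
    return (i[0] + j[0], i[1] + j[1])


def conj(i, j):
    """Equation (2): conjunction of two constraints on the same pair."""
    return (max(i[0], j[0]), min(i[1], j[1]))


def get(graph, a, b):
    """Constraint from a to b; a stored (b, a) constraint is read backwards."""
    if a == b:
        return (0, 0)
    if (a, b) in graph:
        return graph[(a, b)]
    if (b, a) in graph:
        low, high = graph[(b, a)]
        return (-high, -low)
    return ANY


def points(graph):
    return sorted({p for pair in graph for p in pair})


def union(g1, g2):
    """Equation (4): pairwise conjunction over the time points of both graphs."""
    pts = sorted(set(points(g1)) | set(points(g2)))
    merged = {}
    for a, b in combinations(pts, 2):
        c = conj(get(g1, a, b), get(g2, a, b))
        if c != ANY:
            merged[(a, b)] = c
    return merged


def minimal(graph):
    """Return the tightest equivalent constraints, or None if inconsistent."""
    pts = points(graph)
    c = {(a, b): get(graph, a, b) for a in pts for b in pts}
    for k in pts:
        for a in pts:
            for b in pts:
                c[(a, b)] = conj(c[(a, b)], compose(c[(a, k)], c[(k, b)]))
    if any(low > high for low, high in c.values()):
        return None  # some interval is empty: no solution exists
    return c


g = {("A", "B"): (3, 4), ("B", "C"): (10, 15)}
m = minimal(g)                                     # infers A -> C
tight = minimal(union(g, {("A", "C"): (0, 14)}))   # tightens B -> C
broken = minimal(union(g, {("A", "C"): (0, 5)}))   # inconsistent union
```

In this example the propagation infers `m[("A", "C")] == (13, 19)`, the union with an extra constraint tightens `B -> C` to `(10, 11)`, and the last union is detected as inconsistent.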
Figure 4. The union of two time constraint graphs.

3.3 Temporal tiles

A temporal tile is a couple $\tau = \langle \theta, T \rangle$ where $\theta$ is the atemporal part of the tile, defined as in section 3.1, and $T$ is the temporal part of the tile, defined as a time graph according to section 3.2. The links between the pre- and post-conditions of $\theta$ and the instants of the time graph are ensured through the changes of value of the variables. We denote by $t_{v(x_V)}$ the instant when the variable $v$ passes from a value different from $v(x_V)$ to $v(x_V)$, and by $t_\alpha$ the date when the event $\alpha$ occurs. All these instants are in the time graph associated with the tile $\tau$ and, therefore, any kind of constraint can be set between any pair of such instants.

Some constraints on a temporal tile are due to a domain axiom: a variable can change value only once at a given instant. So, for a tile $\tau = \langle \theta, T \rangle$ with $\theta = \langle V, x_V^-, \alpha, x_V^+ \rangle$, for every $v \in V$ such that $v(x_V^-) \neq v(x_V^+)$, the following constraint is added:

\[ C_T(t_{v(x_V^-)}, t_{v(x_V^+)}) = [1, +\infty] \tag{5} \]

A system of temporal tiles is a triple $\Sigma = \langle \mathcal{V}, X_0, \mathcal{T} \rangle$ where $\mathcal{V}$ is a finite set of variables, $X_0$ is a set of initial states and $\mathcal{T}$ is a finite set of temporal tiles with $\mathcal{V} = \bigcup_{\tau \in \mathcal{T}} V_\tau$ (we assume that at the initial state all the variables have just changed value).

The example of section 2 can be modeled with a system of 17 temporal tiles. Some of these temporal tiles are extracted from the ITU-T standards and the others (such as the Automatic Laser Switch procedure) from expertise. There are three variables for each piece of equipment: “Emission” (E), “Reception” (R) and “Operational state” (O). The variables “Reception” and “Emission” take their values in {ok, failed}, “Operational state” in {enabled, disabled}. Some synchronization variables between pieces of equipment are also added to the model. At the initial state all the equipment is operational (i.e. “Emission” and “Reception” are “ok” and “Operational state” is “enabled”). Figure 5 shows some of these temporal tiles. The left temporal tile models the emission breakdown and the right temporal tile models the emission of the alarm “Loss Of Signal” (LOS). The instant notation is as described above; for instance, $t_{R(failed)}$ corresponds to the instant when the value of the variable “R” passes from “ok” to “failed”.

4 Behavior of the system

We present in the following sections a way to describe the whole behavior of a temporal tile system and an algorithm to build it.

4.1 Temporal run

Definition 1 (Run) The interleaved sequence of states and events $x_0, a_1, x_1, a_2, \ldots$ is a run $R$ of the system $\Sigma$ if $x_0 \in X_0$ and, for each $k > 0$, there exists a tile $\theta = \langle V_k, x_{V_k}^-, \alpha, x_{V_k}^+ \rangle$ such that:

\[ \begin{cases} a_k = \alpha \\ v(x_{k-1}) = v(x_k) & \text{if } v \notin V_k \\ v(x_{k-1}) = v(x_{V_k}^-),\ v(x_k) = v(x_{V_k}^+) & \text{if } v \in V_k \end{cases} \tag{6} \]

As the state changes are described in the tiles, a run is also completely described by an initial state and a sequence of tiles. From now on, a run is described as $x_0, \theta_1, \theta_2, \ldots$ and each state $x_k$ can be computed from $x_0, \theta_1, \ldots, \theta_k$. Moreover, as tiles define local transitions on partial states, two successive tiles can be based on two disjoint sets of variables. In this case these tiles are said to be “concurrent” and, by exchanging the order of both tiles, we obtain two equivalent runs. Henceforth, we do not define a run as a sequence of tiles but as an equivalence class of sequences of tiles up to concurrency. This brings us to consider a run as a partial order of tiles.

We extend the notion of run to the framework of temporal tiles. A temporal run is a set of temporal tiles and an initial state such that the sequence of the atemporal parts of the tiles is a run and its associated time constraint graph is consistent.

Definition 2 (Temporal run) The pair $\langle x_0, (\tau_1, \tau_2, \ldots) \rangle$ with $\tau_i = \langle \theta_i, T_{\tau_i} \rangle$ is a temporal run of $\Sigma$ iff:

\[ R = \langle x_0, (\theta_1, \theta_2, \ldots) \rangle \text{ is a run of } \Sigma \text{ and } T = \Big( \bigcup_{\tau \in R} T_\tau \Big) \cup T_{caus} \text{ is consistent} \tag{7} \]

where $T_{caus}$ is the time constraint graph resulting from causality: $\forall (\tau_i, \tau_j) \in R^2$, if $\tau_i$ is a predecessor of $\tau_j$ in the partial order of tiles, then $t_{\alpha_i}$ and $t_{\alpha_j}$ are time points of $T_{caus}$ and $C_{T_{caus}}(t_{\alpha_i}, t_{\alpha_j}) = [1, +\infty]$.

The tricky point of this definition is the construction of the time constraint graph, as the time points correspond to the changes of value of the variables. If, in a tile, the value of a variable is the same in the pre-set and the post-set, then the value of this variable does not change when this tile is appended to the run. Figure 6 shows the beginning of a temporal run (top) and its time constraint graph (bottom). When the tile labeled by LOS is appended to the run, the value of the variable R does not change. Thus, when the tile labeled by $\alpha$ is appended, the time constraint $[1, 2]$ is added between $t_{R(failed)}$ and $t_\alpha$: indeed, the instant when the variable R changes value is $t_{R(failed)}$. The constraint $[1, +\infty]$ is added between $t_{LOS}$ and $t_\alpha$ in order to ensure causality between the first tile and the second: it ensures that the cause precedes the consequence.

Figure 6. A temporal run: a run with its associated time constraint graph. Constraints of the tile labeled by LOS: $C_T(t_{R(failed)}, t_{LOS}) = [1, 4]$, $C_T(t_{sync2(failed)}, t_{LOS}) = [0, +\infty]$, $C_T(t_{LOS}, t_{sync2(ok)}) = [0, 0]$. Constraints of the tile labeled by $\alpha$: $C_{T'}(t_{R(failed)}, t_\alpha) = [1, 2]$, $C_{T'}(t_\alpha, t_{R(ok)}) = [0, 0]$.

To summarize, a temporal run can be defined by an initial state and a partial order of temporal tiles and, in an equivalent way according to definition 2, by an (atemporal) run and a (consistent) time constraint graph.

4.2 Petri net and occurrence net

A run of a system described with temporal tiles is one of the possible behaviors of this system.
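To make the notion of run concrete, the sketch below encodes atemporal tiles as pre- and post-conditions on partial states and replays a sequence of tiles from an initial state, in the spirit of Definition 1. The `Tile` class, the helper names and the label string `"eps"` for the dumb label are assumptions made for this illustration; the three tiles are simplified versions of the reception failure, the ALS mechanism and the emission breakdown described in the text.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    pre: dict    # partial state required before the tile (variable -> value)
    label: str   # event label; "eps" stands for the dumb label
    post: dict   # partial state produced by the tile


def enabled(state, tile):
    """A tile can be appended if the state agrees with its pre-condition."""
    return all(state.get(v) == value for v, value in tile.pre.items())


def fire(state, tile):
    """Apply the tile: variables outside the tile keep their value."""
    if not enabled(state, tile):
        raise ValueError(f"tile {tile.label!r} is not enabled in {state}")
    new_state = dict(state)
    new_state.update(tile.post)
    return new_state


def run(x0, tiles):
    """Compute the states x0, x1, ... visited by a sequence of tiles."""
    states = [dict(x0)]
    for tile in tiles:
        states.append(fire(states[-1], tile))
    return states


def concurrent(t1, t2):
    """Two tiles are concurrent when they involve disjoint sets of variables."""
    return set(t1.pre).isdisjoint(t2.pre)


x0 = {"R": "ok", "E": "ok"}
reception_failure = Tile({"R": "ok"}, "eps", {"R": "failed"})
als = Tile({"R": "failed", "E": "ok"}, "eps", {"R": "failed", "E": "failed"})
emission_breakdown = Tile({"E": "ok"}, "eps", {"E": "failed"})

states = run(x0, [reception_failure, als])
order_1 = run(x0, [reception_failure, emission_breakdown])[-1]
order_2 = run(x0, [emission_breakdown, reception_failure])[-1]
```

Since `reception_failure` and `emission_breakdown` touch disjoint variables, exchanging them leads to the same final state, which is why the text treats a run as a partial order of tiles rather than a sequence.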
The notion of occurrence net, introduced within the framework of Petri nets, is a formalization of a run including nondeterministic choices (or conflicts). Thus, an occurrence net represents several runs of a system together. We give below the definitions of a Petri net and then of an occurrence net.

Definition 3 (Petri net) A Petri net is a tuple $N = (Pl, Tr, L, In)$ where $Pl$ is the set of places, $Tr$ is the set of transitions, $L$ is the set of links between the places and the transitions (i.e. $L \subseteq (Pl \times Tr) \cup (Tr \times Pl)$) and $In \subseteq Pl$ is the initial marking. For each transition $t \in Tr$, the preset ${}^\bullet t$ is the set of places connected upstream to $t$, i.e. ${}^\bullet t = \{p \in Pl \mid (p, t) \in L\}$. The postset of $t$, $t^\bullet$, is the set of places connected downstream to $t$, i.e. $t^\bullet = \{p \in Pl \mid (t, p) \in L\}$. The preset and the postset are also defined for each place $p \in Pl$: ${}^\bullet p = \{t \in Tr \mid (t, p) \in L\}$ and $p^\bullet = \{t \in Tr \mid (p, t) \in L\}$.

Definition 4 (Causality, Conflict, Concurrency) Given two nodes $n$ and $n'$ of a Petri net (places or transitions), we say that $n$ causes $n'$ (written $n \le n'$) if either $n = n'$ or there is a path from $n$ to $n'$. We say that $n$ and $n'$ are in conflict (written $n \# n'$) if there is a place $p$, different from $n$ and $n'$, from which one can reach $n$ and $n'$ by two different paths. And we say that $n$ and $n'$ are concurrent if neither $n \le n'$, nor $n' \le n$, nor $n \# n'$.

Definition 5 (co-set) A set of concurrent nodes is called a co-set. More formally, $CO$ is a co-set iff $\forall (n, n') \in CO^2$ with $n \neq n'$, $n$ and $n'$ are concurrent.

Definition 6 (Occurrence net) An occurrence net $N = (Pl, Tr, L, In)$ is defined as a finite acyclic Petri net such that each place has at most one predecessor ($|{}^\bullet p| \le 1$), no transition is in self-conflict (we never have $t \# t$) and the initial marking is $In = \{p \in Pl \mid {}^\bullet p = \emptyset\}$.

Usually, occurrence nets are used for describing several behaviors of a Petri net [3] by way of the branching process of a Petri net, which is an occurrence net with a homomorphism to the Petri net.
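Definitions 3 and 4 can be exercised on a small net. The sketch below is only an illustration: the class `Net`, its method names and the place names are hypothetical, and the phrase “two different paths” is read in the usual unfolding sense, i.e. two paths that leave the place through two distinct transitions. The example net mimics the three-transition structure described for Figure 7 (one transition producing R:failed, then two transitions competing for that place).

```python
class Net:
    """A Petri net structure (places, transitions, links), without marking."""

    def __init__(self, places, transitions, links):
        self.places = set(places)
        self.transitions = set(transitions)
        self.links = set(links)

    def pre(self, n):
        """Preset of a node: nodes linked upstream to n."""
        return {a for (a, b) in self.links if b == n}

    def post(self, n):
        """Postset of a node: nodes linked downstream from n."""
        return {b for (a, b) in self.links if a == n}

    def causes(self, n, m):
        """n <= m: n equals m or there is a path from n to m."""
        seen, stack = set(), [n]
        while stack:
            x = stack.pop()
            if x == m:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(self.post(x))
        return False

    def conflict(self, n, m):
        """n # m: some other place branches towards n and m (assumed reading)."""
        for p in self.places - {n, m}:
            successors = self.post(p)
            for t1 in successors:
                for t2 in successors - {t1}:
                    if self.causes(t1, n) and self.causes(t2, m):
                        return True
        return False

    def concurrent(self, n, m):
        return not (self.causes(n, m) or self.causes(m, n) or self.conflict(n, m))


net = Net(
    places={"R_ok", "E_ok", "R_failed", "R_failed_2", "R_failed_3", "E_failed"},
    transitions={"t1", "t2", "t3"},
    links={
        ("R_ok", "t1"), ("t1", "R_failed"),
        ("R_failed", "t2"), ("t2", "R_failed_2"),
        ("R_failed", "t3"), ("E_ok", "t3"),
        ("t3", "R_failed_3"), ("t3", "E_failed"),
    },
)
```

With this net, `t1` causes `t2`, the transitions `t2` and `t3` are in conflict because both consume `R_failed`, and the places `R_failed` and `E_ok` are concurrent, so `{R_failed, E_ok}` is a co-set.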
A branching process represents several runs of a Petri net but it is not necessarily “maximal”: it does not necessarily contain full runs and, moreover, it may not contain all the possible runs of the system. The complete branching process of a Petri net is called the unfolding and, in most cases, it is infinite.

4.3 Time branching process

We extend the notion of occurrence net by adding time information and we define the time branching process as a time occurrence net associated with a temporal tile system. Then, we give the algorithm that builds the maximal time branching process.

A time occurrence net is an occurrence net in which a time point is associated with each node (place or transition), together with some time constraints between these time points. It is important to notice that, because of the conflicts, the set of all the time constraints does not constitute a time constraint graph. Indeed, two nodes in conflict do not belong to the same run. Thus, there cannot be any time constraint between their time points.

Definition 7 (Time occurrence net) A time occurrence net $N_t = \langle N, T \rangle$ is an occurrence net $N = (Pl, Tr, L, In)$ in which each node $n \in Pl \cup Tr$ is associated with one time point in $T$ and, $\forall (n, n') \in (Pl \cup Tr)^2$ such that we do not have $n \# n'$, there is a constraint in $T$ between the time points associated with $n$ and $n'$.

As the branching process is an occurrence net associated with a Petri net, we want to associate a time occurrence net with a set of tiles in order to define a time branching process. A time branching process of a tile system is a time occurrence net in which each transition, with its pre- and post-set, corresponds to one temporal tile of the system.

Definition 8 (Time branching process) A time branching process of $\Sigma$ is a time occurrence net $N_t$ in which each transition $t$ of $N_t$, with its post- and preset, is associated with a tile $\tau$ of $\Sigma$. The time points associated with ${}^\bullet t \cup \{t\} \cup t^\bullet$ must satisfy the time constraints of the tile $\tau$.
This defines a map $\mu$ such that, for each transition $t \in N_t$, $\mu(t) = \tau$, $\mu({}^\bullet t) = pre(\tau)$ and $\mu(t^\bullet) = post(\tau)$.

We give below the algorithm that constructs the maximal time branching process of a temporal tile system. This construction proceeds like a puzzle game. The algorithm starts with the branching process made of the places corresponding to the initial marking of the system. The temporal tiles are added one at a time. To manage the conflicts and the time constraints, we associate a time constraint graph with each transition $t$ of the branching process. This graph, denoted by $T(t)$, is the union of the time constraints of the temporal tile $\tau$ such that $\mu(t) = \tau$, the time constraint graphs of the transitions which cause this transition ($\bigcup_{t' \mid t'^\bullet \cap {}^\bullet t \neq \emptyset} T(t')$) and the time constraint graph $T_{caus}$ of causality constructed as in section 4.1. Notice that $T(t)$ is the time constraint graph of the temporal run defined by the tile $\tau = \mu(t)$ and all its predecessors in the puzzle. The time constraint graph is built and attached to the transition when the temporal tile is added to the “puzzle”.

Figure 7 shows the time branching process $N$ of a temporal tile system of three tiles $\tau_1$ (reception failure), $\tau_2$ (emission of the LOS alarm) and $\tau_3$ (ALS mechanism) with an initial state $X_0$ given by R:ok and E:ok. The time constraint graph $T(t_1)$ associated with the transition $t_1$ is the time constraint graph of the tile $\tau_1 = \mu(t_1)$. The transitions $t_2$ and $t_3$ are in conflict; thus, their associated time constraint graphs have the same prefix $T(t_1)$, which is united with the time constraint graph of $\tau_2$ for $t_2$ and of $\tau_3$ for $t_3$. Notice that $T(t_2)$ is the time constraint graph of the temporal run $\langle X_0, (\tau_1, \tau_2) \rangle$ and $T(t_3)$ the time constraint graph associated with the temporal run $\langle X_0, (\tau_1, \tau_3) \rangle$.

Figure 7. A system of three temporal tiles $\tau_1$, $\tau_2$ and $\tau_3$, its time branching process $N$ and the time constraint graphs associated with each transition of $N$: $t_1$, $t_2$ and $t_3$.

In order to build the maximal time branching process, we need the notion of “(atemporal) tiles that can be added to a given occurrence net”. For a given occurrence net $N$, the possible extensions of $N$, denoted by $pe(N)$, are the pairs $(\tau, X)$, where $X$ is a co-set of places of $N$ and $\tau$ is a tile such that:

• $\mu(X) = pre(\tau)$
• $\forall t \in N$, $\mu(t) \neq \tau$ or $\mu({}^\bullet t) \neq X$

For a tile system $\Sigma = \langle \mathcal{V}, X_0, \mathcal{T} \rangle$, the construction of its maximal time occurrence net is described by the algorithm below:

    $N \leftarrow X_0$
    while $pe(N) \neq \emptyset$ do
        choose a pair $(\tau, X)$ from $pe(N)$
        if $(\bigcup_{t \mid t^\bullet \cap X \neq \emptyset} T(t)) \cup T_\tau \cup T_{caus}$ is consistent then
            append $\tau$ to $N$
            $T(tr(\tau)) \leftarrow T_\tau \cup (\bigcup_{t \mid t^\bullet \cap X \neq \emptyset} T(t)) \cup T_{caus}$
        end if
    end while

This algorithm thus makes it possible to obtain the complete behavior of a system of temporal tiles (i.e. all the temporal runs in the same structure). These temporal runs are maximal and, in most cases, they are not finite. Our objective is to detect the failures of the system as soon as possible; that is why we can limit the time length of the puzzle by adding a time constraint $[0, \omega]$ between the initial state and each transition of the puzzle, $\omega$ being the maximum length of the puzzle. The value of $\omega$ is given by expertise.

5 Extraction of the chronicles

The previous sections show how we can build all the temporal runs of the system. If we project each run on its observable part, we obtain a set of observable events (transitions with a label different from $\varepsilon$) constrained by time. This is a chronicle, which is the signature of this run.
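The projection of a run on its observable part can be sketched as follows. The example uses the three constraints discussed for Figure 6 and assumes, for the sake of illustration only, that the events LOS and α are observable while the change of R is not. The functions `minimal_bounds` and `project` are hypothetical names, and the propagation is a plain Floyd-Warshall pass rather than the authors' implementation. The point of the sketch is that the constraints must be propagated before the non-observable time points are dropped, otherwise the information they carry is lost.

```python
import math

INF = math.inf


def minimal_bounds(points, constraints):
    """Tightest upper bounds d[a][b] on date(b) - date(a) (Floyd-Warshall)."""
    d = {a: {b: (0 if a == b else INF) for b in points} for a in points}
    for (a, b), (low, high) in constraints.items():
        d[a][b] = min(d[a][b], high)   # date(b) - date(a) <= high
        d[b][a] = min(d[b][a], -low)   # date(a) - date(b) <= -low
    for k in points:
        for a in points:
            for b in points:
                d[a][b] = min(d[a][b], d[a][k] + d[k][b])
    if any(d[a][a] < 0 for a in points):
        raise ValueError("inconsistent time constraint graph")
    return d


def project(points, constraints, observable):
    """Keep only the propagated constraints between observable time points."""
    d = minimal_bounds(points, constraints)
    obs = [p for p in points if p in observable]
    return {
        (a, b): (-d[b][a], d[a][b])
        for i, a in enumerate(obs)
        for b in obs[i + 1:]
    }


points = ["R_failed", "LOS", "alpha"]
run_constraints = {
    ("R_failed", "LOS"): (1, 4),      # tile labeled by LOS
    ("R_failed", "alpha"): (1, 2),    # tile labeled by alpha
    ("LOS", "alpha"): (1, INF),       # causality between the two tiles
}
chronicle = project(points, run_constraints, observable={"LOS", "alpha"})
```

Under these assumptions the resulting chronicle states that α occurs exactly one unit of time after LOS, which is much tighter than the $[1, +\infty]$ constraint that a projection without propagation would keep.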
Thus, if we extract from the branching process all the runs relevant to a fault propagation, we can project them in order to define all the relevant signatures of this fault. This is the intuitive idea of our approach.

In order to build all the chronicles necessary for the diagnosis (i.e. all the fault signatures), we build a maximal time branching process for each failure of the system. Notice that, in a tile system, each failure is modeled by a tile (the left tile of figure 5 models the laser emission breakdown). In a tile system $\Sigma = \langle \mathcal{V}, X_0, \mathcal{T} \rangle$, we distinguish the tiles modeling a primary breakdown (denoted by $\mathcal{T}_c$) from the other tiles (denoted by $\mathcal{T}_o$): $\mathcal{T} = \mathcal{T}_c \cup \mathcal{T}_o$ and $\mathcal{T}_c \cap \mathcal{T}_o = \emptyset$. For each tile $\tau$ of $\mathcal{T}_c$, we build the maximal time branching process of the system $\langle \mathcal{V}, X_0, \mathcal{T}_o \cup \{\tau\} \rangle$ with the algorithm of section 4.3. This maximal branching process describes all the complete behaviors of the system when the failure modeled by $\tau$ occurs. Then we are interested in the maximal temporal runs of this fault in order to obtain the chronicles which are the complete signatures of this fault.

We denote a temporal run as $\langle R, T \rangle$ where $R$ is the run without the time constraints (as defined in section 4.1) and $T$ is the time graph of the run. We define a partial order relation between two temporal runs (for the time constraints, this comparison relies on the minimal propagated graphs) as follows:

\[ \langle R, T \rangle \le \langle R', T' \rangle \equiv_{def} R \subseteq R' \text{ and } \widehat{T} \preceq \widehat{T'} \tag{8} \]

This partial order allows us to define the maximal temporal runs from a maximal run. But the main difficulty is to extract directly the maximal temporal runs (the chronicles). Indeed, let us refer to figure 8: on the left side we have the time constraint graph of a maximal run. To simplify the example, all the events of this run are observable. The maximal chronicles we want to extract (the chronicles corresponding to the maximal temporal runs) are C1 and C2. C1 is the minimal time constraint graph of R.
This graph is maximal: indeed, there is no larger graph. The chronicle C2 is maximal too: there is no graph that is both larger and less constrained. But C2 is not a temporal run of the system. Indeed, as we see on the graph of C1, if the alarm “LOS” occurs between one and two units of time after the alarm “TF”, then the alarm “TF” must occur again. The real maximal temporal run is the graph of C2 with the constraint [3, 5] instead of [1, 5] ([3, 5] is the complement of [1, 2] in [1, 5]), but the computation of all the exclusive constraint graphs is NP-complete and possibly leads to a high number of chronicles.

Figure 8. A maximal run R and the two chronicles extracted from R: C1 and C2.

As our goal is to obtain chronicles which will be recognised on line, it is sufficient to extract these chronicles as above if we add exclusion links between some of them: the recognition of a chronicle must exclude (or must have priority over) the recognition of the “smaller” chronicles extracted from the same maximal run. A chronicle C′ is smaller than the chronicle C if the time points of C′ are in C. In our example, C2 is smaller than C1; therefore the recognition of C1 has priority over the recognition of C2.

6 Future works

Some open problems raised in the previous sections are listed below. We propose directions of investigation in order to improve our approach. The last part gives another point of view on this work, which provides some clues for extensions other than the addition of time.

First of all, we saw that the maximal runs can be infinite and we cut them with an upper bound on the duration; this is acceptable when one roughly knows the dynamics of the modeled system (which is often the case).
Nevertheless, within the framework of Petri nets, [9] showed that there exists a finite structure which contains all the information related to the behavior of a safe Petri net: this structure is called the branching prefix. The algorithm to construct the branching prefix was refined in [4]. We plan to study how to adapt this algorithm in order to deal with time; this work will probably be connected to timed Petri nets.

Secondly, the extraction outputs a set of chronicles with exclusion links between some of them. For the time being, we rely on the on-line recognition engine to deal with these links (the engine must delay some recognitions in order to ensure, for each recognised chronicle, that a chronicle with a higher priority will not be recognised later), but we think that these links could be handled by an off-line processing of the extracted chronicle models.

Finally, if we consider the time graph as a set of parameters (of any kind) and a set of constraints (of any kind, too), our approach consists in adding a set of applicability rules to each tile and extending the puzzle construction of the occurrence net in order to satisfy these constraints. The properties we need are that we should be able to verify the satisfiability of any conjunctive set of constraints (in order to know whether a tile is a possible extension or not) and that a partial order relation between the extracted patterns is defined (in order to extract the maximal ones).

7 Conclusion

In the framework of chronicle recognition, we presented a way to build the set of chronicles needed to diagnose a system in which fault propagation is modeled by a temporal tile system. We first refined the notion of tile by adding time: the temporal tile. Then, we defined the notions of temporal run and time branching process of a temporal tile system. As the branching process of a Petri net represents several runs of a Petri net in the same structure, the time branching process represents several temporal runs of a temporal tile system. We presented an algorithm that builds the maximal time branching process (i.e. a structure containing all the temporal runs). We showed how to extract chronicles from the time branching process and how to build all the chronicles necessary to diagnose the system.

REFERENCES

[1] A. Aghasaryan, C. Dousson, E. Fabre, A. Osmani, and Y. Pencolé, ‘Modeling fault propagation in telecommunications networks for diagnosis purposes’, in 18th World Telecommunications Congress (WTC’02), Paris, France, (September 2002).
[2] C. Dousson, ‘Extending and unifying chronicle representation with event counters’, in Proc. of the 15th ECAI, pp. 257–261, Lyon, France, (July 2002). F. van Harmelen, IOS Press.
[3] J. Engelfriet, ‘Branching processes of Petri nets’, Acta Informatica, 28, 575–591, (1991).
[4] J. Esparza, S. Romer, and W. Vogler, ‘An Improvement of McMillan’s Unfolding Algorithm’, Tools and Algorithms for Construction and Analysis of Systems, 87–106, (1996).
[5] E. Fabre, A. Aghasaryan, A. Benveniste, R. Boubour, and C. Jard, ‘Fault detection and diagnosis in distributed systems: an approach by partially stochastic Petri nets’, Journal of Discrete Event Dynamic Systems, (June 1998). Kluwer Academic Publishers, Boston.
[6] C. Jard, ‘Synthesis of distributed testers from true-concurrency models of reactive systems’, in International Journal of Information and Software Technology, Elsevier, (2002).
[7] L. Holloway and N. Pandalai, ‘Template Languages for Fault Monitoring of Timed Discrete Event Processes’, IEEE Transactions on Automatic Control, 45(5), (May 2000).
[8] Alan K. Mackworth and Eugene C. Freuder, ‘The complexity of constraint satisfaction revisited’, Artificial Intelligence, 59, 57–62, (1993). Elsevier Science Publishers.
[9] K. McMillan, Symbolic model checking: an approach to the state explosion problem, Computer science, Carnegie Mellon University, 1993.
[10] M. Nielsen, G. Plotkin, and G. Winskel, ‘Petri nets, event structures and domains’, Theoretical Computer Science, 13(1), 85–108, (1980). Elsevier.
[11] M. Sampath, S. Lafortune, and D. Teneketzis, ‘Active diagnosis of discrete-event systems’, IEEE Transactions on Automatic Control, 908–929, (July 1998).