Diagnosis of Intermittent Faults - EECS @ UMich

Discrete Event Dynamic Systems: Theory and Applications, 14, 171±202, 2004
# 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.
Diagnosis of Intermittent Faults
OLIVIER CONTANT
[email protected]
Department of Electrical Engineering and Computer Science, The University of Michigan, 1301 Beal Avenue,
Ann Arbor, MI 48109-2122 USA
STEÂPHANE LAFORTUNE
[email protected]
Department of Electrical Engineering and Computer Science, The University of Michigan, 1301 Beal Avenue,
Ann Arbor, MI 48109-2122 USA
DEMOSTHENIS TENEKETZIS
[email protected]
Department of Electrical Engineering and Computer Science, The University of Michigan, 1301 Beal Avenue,
Ann Arbor, MI 48109-2122 USA
Abstract. The diagnosis of ``intermittent'' faults in dynamic systems modeled as discrete event systems is
considered. In many systems, faulty behavior often occurs intermittently, with fault events followed by
corresponding ``reset'' events for these faults, followed by new occurrences of fault events, and so forth. Since
these events are usually unobservable, it is necessary to develop diagnostic methodologies for intermittent faults.
Prior methodologies for detection and isolation of permanent faults are no longer adequate in the context of
intermittent faults, since they do not account explicitly for the dynamic behavior of these faults. This paper
addresses this issue by: (i) proposing a modeling methodology for discrete event systems with intermittent faults;
(ii) introducing new notions of diagnosability associated with fault and reset events; and (iii) developing
necessary and suf®cient conditions, in terms of the system model and the set of observable events, for these
notions of diagnosability. The de®nitions of diagnosability are complementary and capture desired objectives
regarding the detection and identi®cation of faults, resets, and the current system status (namely, is the fault
present or absent). The associated necessary and suf®cient conditions are based upon the technique of
``diagnosers'' introduced in earlier work, albeit the structure of the diagnosers needs to be enhanced to capture the
dynamic nature of faults in the system model. The diagnosability conditions are veri®able in polynomial time in
the number of states of the diagnosers.
Keywords: diagnosability, intermittent faults, fault diagnosis, fault detection
1.
Introduction
Practical experience has shown that detection and isolation of many classes of faults in
dynamic systems can be approached as a problem of state estimation and inferencing for
discrete event systems (Aghasaryan et al., 1998; Benveniste, 2003; Bouloutas, 1990;
Console, 2000; Debouk et al., 2000; Garcia et al., 2002; Jiang and Kumar, 2002; Jiang et al.,
2002; Lafortune et al., 2001; Lamperti and Zanella, 1999; Lin, 1994; Lin et al., 1993; Lunze,
2000; Pandalai and Holloway, 2000; PencoleÂ, 2000; Pencole et al., 2001; Sampath, 2001;
Sampath et al., 1998, 1995, 1996; Sengupta, 2001; Sinnamohideen, 2001; Westerman et al.,
1998; Hastrudi Zad et al., 1998). In many systems, faulty behavior often occurs
intermittently, with fault events followed by corresponding ``reset'' events for these faults,
followed by new occurrences of fault events, and so forth. In hardware systems, intermittent
172
CONTANT ET AL.
faults are typically caused by bad electrical contacts (e.g., faulty relays), ``sticky''
components (e.g., stuck valves), overheating of chips, noisy measurements from sensors,
power surges, and so forth. Intermittent faults occur in software systems as well; consider
for instance exceptions and interrupts that are caused by some unknown ``bugs'' and that
lead to crashes and reboots. The methodologies used in Aghasaryan et al. (1998),
Benveniste et al. (2003), Bouloutas (1990), Console (2000), Debouk et al. (2000), Garcia
(2002), Jiang and Kumar (2002), Lafortune et al. (2001), Lamperti and Zanella (1999), Lin
(1994), Lin et al. (1993), Lunze (2000), Pandalai and Holloway (2000), Pencole (2000),
Pencole et al. (2001), Sampath (2001), Sampath et al. (1998, 1995, 1996), Sengupta (2001),
Sinnamohideen (2001), Westerman et al. (1998) and Hastrudi Zad et al. (1998) assume that
once faults occur, they remain in effect permanently; hence, the terminology ``failures'' is
often used for these permanent faults. Furthermore, to the best of our knowledge, diagnostic
methodologies developed in the ®eld of model-based reasoning in arti®cial intelligence
(which are close in spirit to the discrete event systems methodologies, since they are also
based on qualitative system models) are also geared towards the diagnosis of permanent
faults; see, for example, Darwiche and Provan (1996), Dvorak and Kuipers (1992), Provan
and Chen (1998, 1999), and Williams and Nayak (1996).
Methodologies for diagnosing permanent faults are no longer adequate in the context of
intermittent faults, since they do not account explicitly for the dynamic behavior of these
faults that manifests itself in the form of alternating (unobservable) fault and reset events.
The work in Aghasaryan et al. (1998) and Benveniste et al. (2003) allows intermittent
faults but does not propose a systematic framework for their detection and isolation. The
linear temporal logic approach to failure diagnosis presented in Jiang and Kumar (2002)
may prove a viable approach for detection and isolation of intermittent faults, but this
remains to be explored. The recent work (Jiang et al., 2002) presents a state-based
modeling of faults (and implicitly their resets) and focuses on the diagnosis of the number
of occurrences of faults. Our focus in this paper is different from that in Jiang et al. (2002).
Our main concern is the diagnosis of the current status of the system (i.e., which faults are
present, which faults have never occurred, and which faults are reset). Our principal
contributions in this regard are:
*
A novel modeling methodology for discrete event systems with intermittent faults and
their associated reset events.
*
A set of four new de®nitions of diagnosability associated with the fault and reset
events.
*
Necessary and suf®cient conditions, in terms of the system model and the set of
observable events, for each of the four notions of diagnosability.
We consider untimed models of discrete event systems. The four de®nitions of
diagnosability are complementary and capture a hierarchy of desired objectives regarding
the detection and identi®cation of faults, resets, and the current system status (namely,
which faults are present, absent, or reset). The associated necessary and suf®cient
conditions are based upon the techniques introduced in the diagnoser approach of
DIAGNOSIS OF INTERMITTENT FAULTS
173
Sampath et al. (1995), albeit the structure of the diagnoser automata needs to be altered to
capture the dynamic nature of faults in the system model. The enhancements to the
``diagnoser approach'' are due to the introduction of new labels for faults and resets in the
construction of diagnosers, which in turn leads to the consideration of new types of
indeterminate cycles (Sampath et al., 1995) that serve to characterize violations of the four
types of diagnosability. The necessary and suf®cient conditions are veri®able in
polynomial time in the number of states of the diagnosers.
This paper is organized as follows. Section 2 ®rst presents some necessary background
and then focuses on the modeling of intermittent faults and on how their dynamic behavior
is captured by three types of labels. Section 3 presents the four notions of diagnosability
that are proposed to thoroughly capture desired objectives in the context of intermittent
faults. Modi®ed diagnosers that explicitly account for the new label types in building state
estimates are described in Section 4. Sections 5 and 6 then develop the four sets of
necessary and suf®cient conditions associated with the four notions of diagnosability. A
detailed example of the modeling of intermittent faults and the ensuing system analysis is
given in Section 7. Conclusions appear in Section 8.
2.
2.1.
Modeling of System and Intermittent Faults
System Model and Assumptions
We assume that the reader is familiar with basic notions in ®nite-state automata and
regular languages (see, for example, Cassandras and Lafortune, 1999). The system to be
diagnosed is modeled as an automaton
G ˆ …X; S; d; x0 †
…1†
where X is the state space, S is the set of events, d is the partial transition function, and x0 is
the initial state of the system. Model G accounts for the normal and failed behavior of the
system. The behavior of the system is described by the pre®x-closed language L…G†
generated by G. Henceforth, we shall denote L…G† by L. L is a subset of S , where S
denotes the Kleene closure of the set S. The language L generated by G is assumed to be
live. This means that there is a transition de®ned at each state x in X, i.e., the system cannot
reach a point at which no event is possible. The liveness assumption on L is made for the
sake of simplicity. With slight modi®cations, all the main results of this paper hold true
when the liveness assumption is relaxed (cf. the results in Sampath et al., 1998).
Some of the events in S are observable, i.e., their occurrence can be observed, while the
_ uo , where So
rest are unobservable. Thus, the event set S is partitioned as S ˆ So [S
represents the set of observable events and Suo represents the set of unobservable events.
The observable events in the system may be one of the following: commands issued by the
controller, sensor readings immediately after the execution of the above commands, and
changes of sensor readings. The unobservable events may be fault events, reset events, or
other events that cause changes in the system state not recorded by sensors. See Sampath et
al. (1996) for a methodology on how to construct the system model G from models of
174
CONTANT ET AL.
system components and sensor readings. We assume that there does not exist in G any
cycle of unobservable events, i.e., 9 n0 [ N such that V s1 ts2 [ L; t [ Suo )k t k no where
k t k is the length of trace t. This assumption ensures that observations occur with some
regularity. Since detection of faults is based on observable transitions of the system, we
require that G does not generate arbitrarily long sequences of unobservable events.
In the context of diagnosis of intermittent faults, let Sf (S denote the set of fault
events and let Sr (S denote the corresponding set of fault reset events which should be
diagnosed. In this regard, the set Sf is assumed to be composed of m different fault
events, Sf ˆ f f1 ; . . . ; fm g, and the set Sr is assumed to be composed of the
corresponding resets, Sr ˆ fr1 ; . . . ; rm g. Each fault event fi has its corresponding fault
reset event ri , where ri cannot happen until fi occurs at least once. This last assumption
points out the fact that we are dealing with intermittent faults hence each fault event can
potentially be reset; if known permanent faults are present, they can be handled by the
methodology in Sampath et al. (1995). Without loss of generality, we assume that
Sf (Suo and Sr (Suo , since an observable fault event or an observable fault reset event
can be diagnosed trivially. The objective is to identify the occurrence, if any, of the fault
events and their corresponding reset events, while tracking the observable events
generated by the system.
The assumptions made above on the system under investigation are required in all the
results of this paper (even if not explicitly stated).
To de®ne diagnosability, we need the following notation. The empty trace is denoted by
e. Let s denote the pre®x-closure of trace s where s [ S . We denote by L=s the
postlanguage of L after s, i.e.,
L
:ˆ ft [ S : st [ Lg
s
We de®ne the projection P : S ?So in the usual manner (Ramadge and Wonham, 1989)
P…e† ˆ e
P…s† ˆ s
if s [ So
P…s† ˆ e
P…ss† ˆ P…s†P…s†
if s [ Suo
where s [ S ; s [ S
…2†
Thus, P simply ``erases'' the unobservable events in a trace. The inverse projection
operator PL 1 is de®ned as
PL 1 …y† ˆ fs [ L : P…s† ˆ yg
…3†
We will write C… fi † to denote the set of all traces of L that end with the fault event fi . That
is,
C… fi † :ˆ fsfi [ Lg
…4†
175
DIAGNOSIS OF INTERMITTENT FAULTS
Similarly, we will write C…ri † to denote the set of all traces of L that end with the reset
event ri . That is,
C…ri † :ˆ fsri [ Lg
…5†
Consider s [ S and s [ S . We use the notation s [ s to denote the fact that s is an event in
the trace s.
We de®ne
Xo ˆ fxo g [ fx [ X : x has an observable event into itg
…6†
Let L…G; x† denote the set of all traces that originate from state x of G. Lo …G; x† denotes the
set of all traces that originate from state x and end at the ®rst observable event. Ls …G; x†
denotes those traces in Lo …G; x† that end with the particular observable event s; sf denotes
the ®nal event of trace s. Formally,
Lo …G; x† ˆ fs [ L…G; x† : s ˆ us; u [ Suo ; s [ So g
…7†
Ls …G; x† ˆ fs [ Lo …G; x† : sf ˆ sg
…8†
and
Finally, we de®ne the non-deterministic automaton
G0 ˆ …Xo ; So ; dG0 ; x0 †
…9†
the generator of the language
L…G0 † ˆ P…L† ˆ ft : t ˆ P…s† for some s [ Lg
…10†
The elements Xo ; So , and x0 are as de®ned above. The transition relation of G0 is given by
dG0 (…Xo 6S6Xo † and is de®ned as follows:
…x; s; x0 † [ dG0
2.2.
if
d…x; s† ˆ x0
for some s [ Ls …G; x†
…11†
Modeling of Intermittent Faults
As in Sampath et al. (1995), we use the notion of label to identify special changes in the
status of the system. The labels are symbols that allow us to keep track of the occurrence
of selected events along the system's evolution. We de®ne the set of non-intermittent
and present fault labels DF ˆ fF1 ; F2 ; . . . ; Fm g, the set of reset fault labels DFIR ˆ
IR
IR
fFIR
1 ; F2 ; . . . ; Fm g, the set of intermittent and present fault labels DFIP ˆ
IP
IP
fF1 ; F2 ; . . . ; FIP
m g, and the set D ˆ fNg [ DF [ DFIR [ DFIP .
Next, we de®ne the label function `R that will be applied to traces and subtraces in L…G†.
176
CONTANT ET AL.
DEFINITION 1 The label function `R : S ?2D is de®ned as follows. Let o be a trace in S .
Then
`R …o† ˆ fNg
if Vi : … fi 6 [ o†6…ri 6 [ o†;
and Vi [ f1; . . . ; mg;
Fi [ `R …o†
if … fi [ o†6…ri 6 [ o†;
FIR
i
FIP
i
[ ` …o†
if 9 s; s0 : …o ˆ ss0 †6‰s [ C…ri †Š6… fi 6 [ s0 †; and
[ `R …o†
if 9 s; s0 : …o ˆ ss0 †6‰s [ C… fi †Š6…ri [ s†6…ri 6 [ s0 †
R
Hence, if `R …o† is fNg, i.e., ``normal'', then no event from the set of fault events Sf and
no event from the set of reset events Sr have occurred along the trace. If `R …o† contains the
label Fi , then the fault event fi has occurred along o but the reset event ri has not occurred
along o. If `R …o† contains the label FIR
i , then both the fault event fi and the reset event ri
have occurred at least one time or possibly multiple times along o, but the last of the two
to have occurred in o is ri . Finally, if `R …o† contains the label FIP
i , then both the fault event
fi and the reset event ri have occurred at least one time or possibly multiple times along o,
but the last of the two to have occurred in o is fi .
Figures 1 and 2 illustrate the label function `R . Figure 1 shows how the label evolves as a
trace gets extended by the occurrence and re-occurrence of fault and reset events. In Figure
IP
2, the notation …x; fF1 ; FIR
2 ; F3 g† means that along a trace that leads to state x the events
f1 ; f2 ; r2 ; f3 ; r3 have occurred, the event r2 was the last one to occur among f2 and r2 , and the
event f3 was the last one to occur among f3 and r3 .
Figure 1. Label function evolution diagram.
Figure 2. States and label function.
DIAGNOSIS OF INTERMITTENT FAULTS
177
Remark 1: Let o [ L…G†.
i.
IP
R
If `R …o† ˆ fNg, then Fi ; FIR
i ; Fi 6 [ ` …o† for all i.
ii.
R
IP
R
If Fi [ `R …o† for some i, then N 6 [ `R …o†; FIR
i 6 [ ` …o†, and Fi 6 [ ` …o†.
iii.
R
R
R
IP
R
If FIR
i [ ` …o† for some i, then N 6 [ ` …o†; Fi 6 [ ` …o†, and Fi 6 [ ` …o†.
iv.
R
R
R
IR
R
If FIP
i [ ` …o† for some i, then N 6 [ ` …o†; Fi 6 [ ` …o†, and Fi 6 [ ` …o†.
2.3.
Recurrent Faults
We de®ne and motivate the notions of Sf -recurrent and Sr -recurrent languages.
DEFINITION 2 Sf -recurrence and Sr -recurrence
a.
b.
A pre®x-closed and live language L is said to be Sf -recurrent with respect to the set of
fault events Sf and the set of reset events Sr if the following holds:
Vfi [ Sf ; i [ f1; . . . ; mg; 9ni [ N such that
L
‰Vs [ C… fi †Š Vt [
…12†
‰k t k ni ) ri [ tŠ
s
A pre®x-closed and live language L is said to be Sr -recurrent with respect to the set of
fault events Sf and the set of reset events Sr if the following holds:
Vri [ Sr ; i [ f1; . . . ; mg; 9ni [ N such that
L
…13†
‰k t k ni ) fi [ tŠ
‰Vs [ C…ri †Š Vt [
s
The above notions are motivated by the following considerations. We are concerned with
the dynamic behavior of discrete event systems where failure and reset events occur
continuously along any path of the systems's evolution; such a behavior is ensured by Sf recurrence and Sr -recurrence. These notions imply that fault and reset events occur with
Figure 3. Sf -recurrence and Sr -recurrence.
178
CONTANT ET AL.
some regularity along any possible behavior of the system; as will be seen in Section 5,
such regularity allows the diagnosis of these events. That is, Sf -recurrence (respectively,
Sr -recurrence), along with some other conditions (identi®ed in Section 5), ensure that
there are instances where we are sure that certain faults are present in (respectively, absent
from) the system.
3.
Notions of Diagnosability
Intermittent faults are dynamic, that is, they can repeatedly occur and reset. This feature
renders their diagnosis considerably more complicated and intricate than the diagnosis of
permanent failures. The notion of diagnosability proposed in Sampath et al. (1995) for
permanent failures relies on the fact that the status of faults remains ®xed after their
occurrence. In systems with intermittent faults, the fault's status evolves along with the
system's evolution. Consequently, the notion of diagnosability proposed in Sampath et al.
(1995) does not capture all the key issues associated with the diagnosis of intermittent
faults. Since intermittent faults are dynamic, one can imagine several different notions of
diagnosability, each providing a different amount of information about the status of faults.
Ideally, one would want to detect the existence of instances where the status of faults
( present or reset) is precisely known. A weaker notion of diagnosability would require that
one may want to ensure detection of the occurrence of a fault, or detection of a fault's
reset, without necessarily identifying any instance where the status of the fault is precisely
known. These considerations motivate the de®nition of four types of diagnosability. Two
of these types are related to the occurrence of intermittent faults; their ``dual'' notions are
related to the reset of intermittent faults.
The ®rst two notions of diagnosability, Type-P and Type-R, assert the presence of a fault
or the absence of an intermittent fault at a speci®c instance.
DEFINITION 3 Type-P diagnosability
A pre®x-closed and live language L is said to be Type-P diagnosable with respect to
projection P, the set of fault events Sf , and the set of reset events Sr if the following holds:
L
‰Vi [ f0; . . . ; mgŠ…9ni [ N†‰Vs [ C… fi †Š Vt [
‰k t k ni ) DP Š
s
where the diagnosability condition DP is
R
9t0 t : w [ ‰PL 1 P…st0 †Š ) ‰Fi [ `R …w†ŠV‰FIP
i [ ` …w†Š
Type-P diagnosability, where P stands for present, implies that after a fault occurs along
the system's evolution it is possible to identify an instance where, based on the available
information, we are certain that the fault is present in the system. Such an instance is
depicted in Figure 4. We use the label FPi to denote either Fi or FIP
i .
DIAGNOSIS OF INTERMITTENT FAULTS
179
Figure 4. Type-P diagnosability.
DEFINITION 4 Type-R diagnosability
A pre®x-closed and live language L is said to be Type-R diagnosable with respect to
projection P, the set of fault events Sf , and the set of reset events Sr if the following holds:
L
‰Vi [ f0; . . . ; mgŠ…9ni [ N†‰Vs [ C…ri †Š Vt [
‰k t k ni ) DR Š
s
where the diagnosability condition DR is
R
9t0 t : w [ ‰PL 1 P…st0 †Š ) ‰FIR
i [ ` …w†Š
Type-R diagnosability, where R is to be interpreted as meaning reset, implies that after a
fault reset occurs along the system's evolution, it is possible to identify an instance where
we are certain that the fault is absent in the system. Such an instance is illustrated in
Figure 5.
The following notions of diagnosability are weaker than Type-P and Type-R, as they
allow the detection of the occurrence of a fault or the detection of a fault's reset without
180
CONTANT ET AL.
Figure 5. Type-R diagnosability.
necessarily identifying any instances where the status of the fault ( present or reset) is
precisely known.
DEFINITION 5 Type-O diagnosability
A pre®x-closed and live language L is said to be Type-O diagnosable with respect to
projection P, the set of fault events Sf , and the set of reset events Sr if the following holds:
L
‰Vi [ f0; . . . ; mgŠ…9ni [ N†‰Vs [ C… fi †Š Vt [
‰k t k ni ) DO Š
s
where the diagnosability condition DO is
R
IP
R
o [ ‰PL 1 P…st†Š ) ‰Fi [ `R …o†ŠV‰FIR
i [ ` …o†ŠV‰Fi [ ` …o†Š
Type-O diagnosability, where O stands for occurrence, has the following
meaning. Suppose that fault fi occurs along the system's evolution. Then, after at most
ni events, it is possible to identify the occurrence of fi . This means that all possible system
behaviors (i.e., all possible sequences of events in the system) which are compatible with
DIAGNOSIS OF INTERMITTENT FAULTS
181
Figure 6. Type-O diagnosability.
the information available after ni further events contain the fault event fi . However, TypeO diagnosability does not guarantee perfect knowledge of the status of the fault event fi at
any stage. This means that along some of the possible traces fi may be reset, along some
others fi may reset and reoccur more than once, yet along some others fi may never be
reset. This fact is depicted in Figure 6. For the sake of convenience, the label FO
i shall be
IP
used to denote either Fi ; FIR
,
or
F
.
i
i
Type-I diagnosability, where I is to be interpreted as meaning intermittent, is in some
way the dual notion of Type-O diagnosability.
DEFINITION 6 Type-I diagnosability
A pre®x-closed and live language L is said to be Type-I diagnosable with respect to
projection P, the set of fault events Sf , and the set of reset events Sr if the following holds:
L
‰Vi [ f0; . . . ; mgŠ…9ni [ N†‰Vs [ C…ri †Š Vt [
‰k t k ni ) DI Š
s
where the diagnosability condition DI is
R
IP
R
w [ ‰PL 1 P…st†Š ) ‰FIR
i [ ` …w†ŠV‰Fi [ ` …w†Š
It is possible to give an interpretation of Type-I diagnosability similar to that of Type-O
diagnosability. For the sake of convenience, the label FIi shall be used to denote either FIR
i
182
CONTANT ET AL.
Figure 7. Type-I diagnosability.
or FIP
i . The key difference between the two notions is the following. In Type-O
diagnosability, one can assert the occurrence of a fault event fi , but one cannot be certain as
to whether or not fi has reset, or whether fi has reset and reoccurred once or many times. In
Type-I diagnosability, one can assert the occurrence and reset of a fault event fi , but one
cannot be certain as to whether or not fi has reoccurred once or many times. This fact is
depicted in Figure 7. The above difference illustrates the duality between these two
notions of diagnosability.
As Type-P and Type-R are stronger notions of diagnosability than Type-O and Type-I,
respectively, we would expect that Type-P (respectively, Type-R) diagnosability implies
Type-O (Type-I) diagnosability. This is indeed true, and it is a direct consequence of the
above de®nitions.
PROPOSITION 1
i.
Type-P diagnosability ) Type-O diagnosability.
ii.
Type-R diagnosability ) Type-I diagnosability.
Figure 8 summarizes the results of Proposition 1 and the implications of the various
notions of diagnosability. Table 1 summarizes the labels introduced in this section and in
Section 2.
183
DIAGNOSIS OF INTERMITTENT FAULTS
Figure 8. Different notions of diagnosability.
Table 1. Table of labels.
Label
Meaning
Fi
FIP
i
FIR
i
FO
i
FIi
FPi
Fault presence without reset occurrence
Fault presence after reset occurrence
Fault reset
IP
Fault occurrence (Fi or FIR
i or Fi )
IR
IP
Reset occurrence (Fi or Fi )
Fault presence (Fi or FIP
i )
Remark 2: The various notions of diagnosability introduced in this section are natural
extensions of the notion of diagnosability for permanent faults introduced in Sampath et al.
(1995). Intermittent faults evolve dynamically along with the system's evolution. The
diagnosability notions introduced here are suitable for dealing with the diagnosis of faults
that evolve dynamically. The primary objective is to determine conditions necessary and
suf®cient to ensure Type-P and Type-R diagnosability. We also wish to determine
conditions necessary and suf®cient to guarantee the weaker notions of Type-O and Type-I
diagnosability.
4.
The Diagnoser
The notion of a diagnoser automaton was originally introduced in Sampath et al. (1995).
A diagnoser automaton, or simply diagnoser, serves two purposes: (i) on-line detection
184
CONTANT ET AL.
and isolation of permanent faults by observing the system behavior; and (ii) off-line
analysis of the diagnosability properties of the system regarding permanent faults. The
latter is based on an examination of the structure of the diagnoser in order to determine
the presence or absence of certain types of cycles termed indeterminate cycles. It turns
out that diagnosers are still at the core of the methodology presented in this paper for
detecting and isolating intermittent faults, albeit their structure needs to be modi®ed to
account for the dynamics of intermittent faults captured by the labeling rules presented in
Section 2.2. We denote the modi®ed diagnosers by GRd, but continue to refer to them as
diagnosers for ease of reading.
The diagnoser GRd for G is an automaton
GRd ˆ …Qd ; So ; dd ; q0 †
…14†
where Qd ; So ; dd , and q0 have the usual interpretation. Its initial state q0 is de®ned to be
f…x0 ; fNg†g. The transition function dd is de®ned in the same way as in Sampath et al.
(1995), but under the new label propagation function LPR de®ned below. The state space
Qd is composed of the states of the diagnoser that are reachable from q0 under dd .
Therefore, a state qd of GRd is of the form
qd ˆ f…x1 ; `1 †; . . . ; …xn ; `n †g
where xi [ Xo and `i [ D. Consequently, any label `i is of the form `i ˆ fNg or `i (DnfNg,
where `i satis®es the following conditions:
i.
IP
if …Fi [ `i † then …FIR
i 6 [ `i † and …Fi 6 [ `i †
ii.
IP
if …FIR
i [ `i † then …Fi 6 [ `i † and …Fi 6 [ `i †
iii.
IR
if …FIP
i [ `i † then …Fi 6 [ `i † and …Fi 6 [ `i †
Since faults evolve dynamically the rules governing label propagation are different from
those of Sampath et al. (1995). These rules are speci®ed by the label propagation function
LPR , which is de®ned as follows.
DEFINITION 7 The label propagation function is denoted by LPR. LPR : Xo 6D6So ?D.
Given x [ Xo ; ` [ D, and s [ Lo …G; x†, LPR propagates the label ` over s starting from x
and following the dynamics of G and the rules of the label function `R . It is de®ned as
follows:
i.
LPR …x; `; s† ˆ fNg
ii.
Vi [ f1; . . . ; mg;
if
Vi; …` ˆ fNg†6… fi 6 [ s†6…ri 6 [ s†;
185
DIAGNOSIS OF INTERMITTENT FAULTS
Fi [ LPR …x; `; s†
R
FIR
i [ LP …x; `; s†
R
FIP
i [ LP …x; `; s†
if
if
if
1:
…Fi [ `†6…ri 6 [ s†; or
2:
IP
…FIR
i 6 [ `†6…Fi 6 [ `†6… fi [ s†6 …ri 6 [ s†;
1:
R
R
IR
…FIR
i [ `†6‰` …s† ˆ fNgV` …s† ˆ fFi gŠ;
2:
R
IR
‰…Fi [ `†V…FIP
i [ `†Š6` …s† ˆ fFi g; or
3:
` ˆ fNg6`R …s† ˆ fFIR
i g;
1:
R
R
…FIP
i [ `†6‰` …s† ˆ fNgV` …s† ˆ fFi gV
`R …s† ˆ fFIP
i gŠ;
2:
R
R
IP
…FIR
i [ `†6‰` …s† ˆ fFi gV` …s† ˆ fFi gŠ; or
3:
‰` ˆ fNg V ` ˆ fFi gŠ6`R …s† ˆ fFIP
i g
Note that De®nition 7 is consistent with De®nition 1. The behavior of LPR is illustrated
below by four different cases (not exhaustive). Let x0 [ S…x; s† with d…x; ss† ˆ x0 and let `0
be the label associated with x0 obtained by propagating ` associated with x:
a.
if ` ˆ fNg and s contains no fault events, then the label `0 is also fNg.
b.
0
IP
if ` ˆ fFi ; Fj ; FIP
k g and s contains no fault events, then the label ` is also fFi ; Fj ; Fk g.
c.
if ` ˆ fNg and s contains fault events fi ; fj ; fk , and reset events rj ; rk with the reset
event rj and the fault event fk being the last ones to occur among f fj ; rj g and f fk ; rk g,
IP
respectively, then `0 ˆ fFi ; FIR
j ; Fk g.
d.
IP
if ` ˆ fFi ; FIR
j ; Fk g and s contains the two fault events fj and f` , and the two reset
IP
IR
events ri and rk , then `0 ˆ fFIR
i ; Fj ; Fk ; F` g.
We state a few properties of the diagnoser that follow directly from its construction.
Property 1: Let q [ Qd . Then
…x1 ; `1 †; …x2 ; `2 † [ q , 9 s1 ; s2 [ L
such that
s1f ; s2f [ So ; d…x0 ; s1 † ˆ x1 ; d…x0 ; s2 † ˆ x2
and
P…s1 † ˆ P…s2 †
Property 2: Let q1 ; q2 [ Qd and s [ S such that …x1 ; `1 † [ q1 ; …x2 ; `2 † [ q2 ; dd ‰q1 ; P…s†Š ˆ
q2 . Then
186
CONTANT ET AL.
O
…FO
i 6 [ `2 † ) …Fi 6 [ `1 †
Property 3: Let q1 ; q2 [ Qd
dd ‰q1 ; P…s†Š ˆ q2 . Then
and
s [ S
such
that
…x1 ; `1 † [ q1 ;
…x2 ; `2 † [ q2 ;
…FIi 6 [ `2 † ) …FIi 6 [ `1 †
In summary, the diagnoser GRd is constructed as follows. Let the current state of the
diagnoser be q1 , and let the next observed event be s. The new state of the diagnoser q2 is
computed via the following three-step process:
1.
For every state estimate x in q1 , compute the reach due to s, given by
S…x; s† ˆ fd…x; ss† : s [ Suo and d…x; ss† is de®ned in Gg.
2.
Let x0 [ S…x; s† with d…x; ss† ˆ x0 . Propagate the label ` associated with x according to
De®nition 7 to obtain the label `0 associated with x0 .
3.
Let q2 be the set of all …x0 ; `0 † pairs computed following the previous steps, for each
…x; `† in q1 .
Figure 9 presents an example of the construction of a diagnoser. The set of observable
events in So ˆ fa; b; lg. They are two faults types, F1 and F2 , with corresponding fault
and reset events f f1 ; r1 g and f f2 ; r2 g, respectively. The initial state of GRd is f…1; fNg†g,
denoted by 1N in the ®gure for the sake of simplicity. Similar compact notation is used in
all the ®gures in this paper. The effect of the new label propagation function LPR , as
compared to the function LP in Sampath et al. (1995), manifests itself when state 8 is ®rst
reached after observed trace ab, yielding the label FIR
1 due to the r1 transition between
Figure 9. Diagnoser construction.
DIAGNOSIS OF INTERMITTENT FAULTS
187
states 5 and 6. As the system settles in the cycle 5±6±7±8±9±10±5, the diagnoser will
IR
IR
IP
alternate between the states f…2; fNg†; …5; fFIP
1 ; F2 g†g and f…3; fNg†; …8; fF1 ; F2 g†g.
5.
Necessary and Suf®cient Conditions for Type-P and Type-R Diagnosability
We present necessary and suf®cient conditions for a language L to be Type-P diagnosable
or Type-R diagnosable.
5.1.
Preliminaries
P
IR
P IR
We de®ne the notions of FPi …FIR
i †-certain states, Fi …Fi †-uncertain states, and Fi …Fi †indeterminate cycles in a way similar to Sampath et al. (1995).
DEFINITION 8
1.
A state q [ Qd is said to be FPi -certain if V…x; `† [ q; FPi [ `.
2.
IR
A state q [ Qd is said to be FIR
i -certain if V…x; `† [ q; Fi [ `.
3.
A state q [ Qd is said to be FPi -uncertain if 9…x; `†; …y; `0 † [ q, such that FPi [ ` and
FPi 6 [ `0 .
4.
0
IR
A state q [ Qd is said to be FIR
i -uncertain if 9…x; `†; …y; ` † [ q, such that Fi [ ` and
IR
0
Fi 6 [ ` .
We will say that a state is non-Fxi -certain if it is not Fxi -certain; note that non-Fxi -certain
does not imply Fxi -uncertain.
DEFINITION 9 A set of states x1 ; x2 ; . . . ; xn [ X is said to form a cycle in G if 9s [ L…G; x1 †
such that s ˆ s1 s2 . . . sn and d…x` ; s` † ˆ x…` ‡ 1†mod n ; ` ˆ 1; 2; . . . ; n.
DEFINITION 10 FPi -indeterminate cycle
A set of non-FPi -certain states q1 ; q2 ; . . . ; qn [ Qd , with at least one FPi -uncertain state, is
said to form an FPi -indeterminate cycle if the following condition (C1) is satis®ed.
C1. States q1 ; q2 ; . . . ; qn [ Qd form a cycle in GRd with dd …qu ; su † ˆ qu ‡ 1 ;
u ˆ 1; . . . ; n 1; dd …qn ; sn † ˆ q1 where su [ So ; u ˆ 1; . . . ; n.
Considering the states q1 ; q2 ; . . . ; qn [ Qd ; 9 …xku ; `ku † [ qu ; u ˆ 1; . . . ; n, and k ˆ 1; . . . ; m
such that:
…FPi [ `ku † for some u and k
188
CONTANT ET AL.
and the sequence of states fxku g; u ˆ 1; . . . ; n; k ˆ 1; . . . ; m, forms a cycle in G0 with
…xku ; su ; xk…u ‡ 1† † [ dG0 ;
u ˆ 1; . . . ; n
1; k ˆ 1; . . . ; m
…xkn ; sn ; xk1 ‡ 1 † [ dG0 ;
k ˆ 1; . . . ; m
1
and
1
…xm
n ; sn ; x1 † [ dG0 ;
DEFINITION 11 FIR
i -indeterminate cycle
IR
A set of non-FIR
i -certain states q1 ; q2 ; . . . ; qn [ Qd , with at least one Fi -uncertain state, is
IR
said to form an Fi -indeterminate cycle if the following condition (C2) is satis®ed.
C2. States q1 ; q2 ; . . . ; qn [ Qd form a cycle in GRd with dd …qu ; su † ˆ qu ‡ 1 ; u ˆ 1;
. . . ; n 1; dd …qn ; sn † ˆ q1 where su [ So , u ˆ 1; . . . ; n.
Considering the states q1 ; q2 ; . . . ; qn [ Qd ; 9 …xku ; `ku † [ qu ; u ˆ 1; . . . ; n, and k ˆ 1; . . . ; m
such that:
k
…FIR
i [ `u † for some u and k
and the sequence of states fxku g; u ˆ 1; . . . ; n; k ˆ 1; . . . ; m, forms a cycle in G0 with
…xku ; su ; xk…u ‡ 1† † [ dG0 ;
u ˆ 1; . . . ; n
1; k ˆ 1; . . . ; m
…xkn ; sn ; xk1 ‡ 1 † [ dG0 ;
k ˆ 1; . . . ; m
1
and
1
…xm
n ; sn ; x1 † [ dG0
The proof of the following result is straight forward and therefore omitted.
PROPOSITION 2 If a language L is Sf -recurrent and Sr -recurrent, then every cycle in any
system model G, such that l…G† ˆ L, has the following properties:
1.
The cycle contains the fault event fi iff it contains the reset event ri .
2.
Consider any trace o ˆ s1 s2 s3 in L where s1 [ S , s3 [ S , and 9 x1 ; x2 ; . . . ;
xn [ X; s2 [ L…G; x1 †, s2 ˆ …s1 s2 sn †m s1 s2 sp and d…x` ; s` † ˆ x…` ‡ 1†mod n ;
` ˆ 1; 2; . . . ; n; p n, and m; n; p [ N (i.e., s2 directly enters a cycle in the state
transition diagram G and completes at least one loop of this cycle). If fi [ s1 then fi [ s2
and ri [ s2 (i.e., any cycle that occurs after the occurrence of a fault necessarily
contains the fault and the reset).
189
DIAGNOSIS OF INTERMITTENT FAULTS
5.2.
Main Results
The following assumptions (A1) and (A2), together with Sf - and Sr -recurrence (equations
(12) and (13)), are critical for the development of necessary and suf®cient conditions to
ensure Type-P and Type-R diagnosability.
A1. Vo ˆ s1 yn s2 [ L; s1 ; s2 [ S ; n [ N, s.t. fi ; ri [ y; y satis®es the
y ˆ y1 fi y2 so y3 ri y4 where so [ So ; y1 ; y4 [ S ; y2 ; y3 [ …So [ Suo †nffi ; ri g.
following:
A2. Vo ˆ s1 yn s2 [ L; s1 ; s2 [ S ; n [ N, s.t. fi ; ri [ y; y satis®es the
y ˆ y1 ri y2 so y3 fi y4 where so [ So ; y1 ; y4 [ S ; y2 ; y3 [ …So [ Suo †nffi ; ri g.
following:
Assumption (A1) [resp.(A2)] implies that for any cycle in G, there exists at least one
observable event between at least one pair … fi ; ri †‰resp:…ri ; fi †Š. This excludes the
possibility of having a cycle with only unobservable events between all pairs
IR
… fi ; ri †‰…ri ; fi †Š, which would have prevented the label FIP
i ‰resp:Fi Š from appearing in
the corresponding cycle of the diagnoser GRd and therefore prevented the detection of any
instance where the fault [resp.reset] is present.
The results that follow provide conditions necessary and suf®cient to ensure Type-P and
Type-R diagnosability.
THEOREM 1 Type-P diagnosability
Consider the language L generated by automaton G. Assume that L is Sf - and Sr recurrent and satis®es (A1). L is Type-P diagnosable iff there are no FPi -indeterminate
cycles in the diagnoser GRd .
Proof:
*
Necessity:
We must prove that if the language L is Type-P diagnosable then there are no FPi indeterminate cycles. We prove the contrapositive, namely, that if there are FPi indeterminate cycles in the diagnoser GRd then the language L is not Type-P
diagnosable. Assume there exist states q1 ; q2 ; . . . ; qn [ Qd that form an FPi indeterminate cycle and dd …qi ; si † ˆ q…i ‡ 1† mod n . Consider the sequence of state
components …xku ; `ku † [ qu ; u ˆ 1; . . . ; n; k ˆ 1; . . . ; m with m [ N, that forms a cycle in
00
G0 with FPi [ `ku00 , for some u00 [ f1; . . . ; ng and some k00 [ f1; . . . ; mg. By De®nition 10 a
sequence of states with the above feature exists. Also De®nition 10 directly implies
that q1 ; q2 ; . . . ; qn are non-FPi -certain states with at least one FPi -uncertain state. In
0
0
other words, there exists at least one state qu0 ; u0 [ f1; . . . ; ng, such that …xku0 ; `ku0 † [ qu0
P
k0 0
and Fi [ `u0 ; k [ f1; . . . ; mg. Then we have
d…xku ; sku su † ˆ xk…u ‡ 1† ;
u ˆ 1; . . . ; n
1; k ˆ 1; . . . ; m
d…xkn ; skn sn †
k ˆ 1; . . . ; m
1
ˆ
xk1 ‡ 1 ;
190
CONTANT ET AL.
and
m
1
d…xm
n ; sn sn † ˆ x1
where
sku [ L…G; xku † and sku [ Suo
0
By Proposition 2, since FPi [ `ku0 ; fi [ sku for some u [ f1; . . . ; ng and k [ f1; . . . ; mg. Let
s [ C…fi † such that
0
d…x0 ; s† ˆ xku0
Let t [ L=s be arbitrarily long such that dd …q0 ; st† ˆ qu ; u [ f1; . . . ; ng, and let v be a
subtrace of st such that dd …qu00 ; v† ˆ qu00 ; u00 [ f1; . . . ; ng. Note that the presence of the
subtrace v implies that the events contained in v, and therefore in st, cycle at least once
between states q1 to qn of the diagnoser. Pick any t0 t. Then dd …q0 ; st0 † ˆ qu for some
u [ f1; . . . ; ng. Since q1 ; q2 ; . . . ; qn are non-FPi -certain states, Vt0 ; 9o [ PL 1 ‰P…st0 †Š
such that FPi 6 [ `R …o†. Therefore, the chosen s violates the de®nition of Type-P
diagnosability. Hence L is not Type-P diagnosable.
*
Suf®ciency:
We must prove that if there are no FPi -indeterminate cycles then the language L is
Type-P diagnosable. We prove the contrapositive, namely, that if the language L is not
Type-P diagnosable then there exists at least one FPi -indeterminate cycle. Assume that
the language L is not Type-P diagnosable. Then, from De®nition 3, this implies that
L
0
…15†
‰9i [ f0; . . . ; m gŠ…Vni [ N†‰9s [ C… fi †Š 9t [
‰k t k ni 6:DŠ
s
where the ``non-diagnosability'' condition :D is
Vt0 t : 9w [ ‰PL 1 P…st0 †Š…FPi 6 [ `R …w††
…16†
Consider the above s [ C… fi †. Since G and GRd are ®nite-state automata and L is Sf recurrent and Sr -recurrent, we can ®nd ni in equations (12) and (13) such that if
k t k> ni then the following ®ve properties hold (see Figure 10).
i. fi [ t
ii.
st leads to a cycle in G0 through state components xku ,
…xku ; `ku † [ qu ; u ˆ 1; . . . ; n, and k ˆ 1; . . . ; m, form a cycle in G0 with
where
191
DIAGNOSIS OF INTERMITTENT FAULTS
Figure 10. The suf®ciency part of the proof of Theorem 1.
…xku ; su ; xk…u ‡ 1† † [ dG0 ;
u ˆ 1; . . . ; n
1; k ˆ 1; . . . ; m
…xkn ; sn ; xk1 ‡ 1 † [ dG0 ;
k ˆ 1; . . . ; m
1
and
1
…xm
n ; sn ; x1 † [ dG0
iii. st leads to a corresponding cycle in GRd , that is, there exist states
q1 ; q2 ; . . . ; qn [ Qd that form a cycle in GRd with
dd …qu ; su † ˆ qu ‡ 1 ; u ˆ 1; . . . ; n
1; and
dd …qn ; sn † ˆ q1 where su [ So ; u ˆ 1; . . . ; n
iv.
Because of (A1), there exists a pre®x st01 of st such that:
0
0
0
*
d…x0 ; st01 † ˆ xku0 ; u0 [ f1; . . . ; ng; k0 [ f1; . . . ; mg, where …xku0 ; `ku0 † [ qu0 and
0
P
k
*
Fi [ `u0
v.
As a result of (A1), there exists a pre®x st02 of st such that:
*
st01 is a pre®x of st02 , i.e., k t01 k<k t02 kk t k,
0
0
0
*
d…x0 ; st02 † ˆ d…x0 ; st01 † ˆ xku0 , where …xku0 ; `ku0 † [ qu0 and
0
P
k
*
Fi [ `u0 .
192
CONTANT ET AL.
Thus state qu0 is FPi -uncertain. Furthermore, equation (16) directly implies that none of the
states q1 ; q2 ; . . . ; qn are FPi -certain. Therefore the cycle q1 ; q2 ; . . . ; qn [ Qd is an FPi indeterminate cycle.
j
THEOREM 2 Type-R diagnosability
Consider the language L generated by automaton G. Assume that L is Sf - and Sr recurrent and satis®es …A2†. L is Type-R diagnosable iff there are no FIR
i -indeterminate
cycles in the diagnoser GRd .
The proof of Theorem 2 is omitted as it is similar to that of Theorem 1.
COROLLARY 1 Suppose that L is Sf - and Sr -recurrent and satis®es (A1), (A2), and
equations (12) and (13). Then L is Type-P diagnosable iff it is Type-R diagnosable.
Proof: Assume that L is not Type-P diagnosable. Then there exists a set of states
q1 ; q2 ; . . . ; qn [ Qd that form an FPi -indeterminate cycle. By Proposition 2, (A1) and (A2),
IR
q1 ; q2 ; . . . ; qn is also a set of non-FIR
i -certain states with at least one Fi -uncertain state.0
k
Furthermore since FPi [ `ku for some u and k, u ˆ 1; . . . ; n, and k ˆ 1; . . . ; m, then FIR
i [ `u0
for some u0 ; k0 ; u0 ˆ 1; . . . ; n; k0 ˆ 1; . . . ; m. Therefore, the states q1 ; q2 ; . . . ; qn [ Qd form
an FIR
i -indeterminate cycle. Hence L is not Type-R diagnosable.
By arguments similar to the above we can show that if the system is not Type-R
diagnosable then it is not Type-P diagnosable.
j
Remark 3: Corollary 1 is not true if Assumption (A1) or (A2) is relaxed.
We conclude this section by observing that Assumptions (A1) and (A2), along with the
Sf - and Sr -recurrent assumptions, are only suf®cient but not necessary in the proofs of
Theorems 1 and 2, respectively. We record this observation in the following corollary.
COROLLARY 2 Consider the language L generated by automaton G.
i.
R
There are no FIP
i -indeterminate cycles in Gd if L is Type-P diagnosable.
ii.
R
There are no FIR
i -indeterminate cycles in Gd if L is Type-R diagnosable.
5.3.
Examples Illustrating the Results
Figures 11 and 12 present examples of systems with Sf - and Sr -recurrent languages that
satisfy all assumptions, including (A1) and (A2). The system in Figure 11 is neither TypeP diagnosable nor Type-R diagnosable. The set of states 5, 7 and 9, 11 both form cycles in
G0 (not shown in ®gure). States q1 and q2 in GRd , which respectively include 7, 11 and 5, 9
as components, form a cycle in GRd .
In the system of Figure 12, states q1 ; q2 , and q3 form a cycle in GRd and are non-FPi certain. State q2 is the only FPi -uncertain state of the cycle. However, there are no cycles in
G0 corresponding to the sequence of state components 10±12±13. Therefore, there are no
DIAGNOSIS OF INTERMITTENT FAULTS
193
Figure 11. Example of a system that is not Type-P and not Type-R diagnosable.
Figure 12. Example of a Type-P and Type-R diagnosable system.
FPi -indeterminate cycles and the language L is Type-P diagnosable. By Corollary 1 it is
also Type-R diagnosable.
6.
Necessary and Suf®cient Conditions for Type-O and Type-I Diagnosability
We present necessary and suf®cient conditions for a language L to be Type-O diagnosable
or Type-I diagnosable.
194
6.1.
CONTANT ET AL.
Preliminaries
DEFINITION 12
1.
O
A state q [ Qd is said to be FO
i -certain if V…x; `† [ q; Fi [ `.
2.
A state q [ Qd is said to be FIi -certain if V…x; `† [ q; FIi [ `.
3.
O
0
A state q [ Qd is said to be FO
i -uncertain if 9…x; `†; …y; ` † [ q, such that Fi [ ` and
O
0
Fi 6 [ ` .
4.
A state q [ Qd is said to be FIi -uncertain if 9…x; `†; …y; `0 † [ q, such that FIi [ ` and
FIi 6 [ `0 .
DEFINITION 13 FO
i -indeterminate cycle
A set of FO
-uncertain
states q1 ; q2 ; . . . ; qn [ Qd is said to form an FO
i
i -indeterminate cycle if
the following condition (C3) is satis®ed.
C3. States q1 ; q2 ; . . . ; qn [ Qd form a cycle in GRd with dd …qu ; su † ˆ qu ‡ 1 ; u ˆ 1;
. . . ; n 1; dd …qn ; sn † ˆ q1 where su [ So ; u ˆ 1; . . . ; n.
r
Considering the states q1 ; q2 ; . . . ; qn [ Qd ; 9…xku ; `ku †; …yru ; `~u † [ qu ; u ˆ 1; . . . ; n; k ˆ 1;
. . . ; m, and r ˆ 1; . . . ; m0 such that:
k
O
~r
‰…FO
i [ `u †6…Fi 6 [ `u †Š; for all u; k; and r
…17†
and the sequences of states fxku g; u ˆ 1; . . . ; n; k ˆ 1; . . . ; m, and fyru g; u ˆ 1; . . . ;
n; r ˆ 1; . . . ; m0 , form cycles in G0 with
…xku ; su ; xk…u ‡ 1† † [ dG0 ;
…xkn ; sn ; xk1 ‡ 1 † [ dG0 ;
u ˆ 1; . . . ; n
k ˆ 1; . . . ; m
1; k ˆ 1; . . . ; m
1
and
1
…xm
n ; sn ; x1 † [ dG0
and
…yru ; su ; yr…u ‡ 1† † [ dG0 ;
…yrn ; sn ; yr1 ‡ 1 † [ dG0 ;
and
0
1
…ym
n ; sn ; y1 † [ dG0 :
u ˆ 1; . . . ; n
r ˆ 1; . . . ; m0
1; r ˆ 1; . . . ; m0
1
195
DIAGNOSIS OF INTERMITTENT FAULTS
DEFINITION 14 FIi -indeterminate cycle
A set of FIi -uncertain states q1 ; q2 ; . . . ; qn [ Qd is said to form an FIi -indeterminate cycle if
the condition (C4) is satis®ed.
C4. The same as (C3) with the exception that clause (17) is replaced by the following:
‰…FIi [ `ku †6…FIi 6 [ `~ur †Š for all u; k; and r
6.2.
…18†
Main Results
The results that follow provide necessary and suf®cient conditions to ensure Type-O and
Type-I diagnosability.
THEOREM 3 Type-O diagnosability
Consider the language L generated by automaton G. L is Type-O diagnosable iff there are
R
no FO
i -indeterminate cycles in the diagnoser Gd .
Proof:
*
Necessity:
We prove that if a language L is Type-O diagnosable then there are no FO
i -indeterminate
cycles. By contradiction, assume there exist states q1 ; q2 ; . . . ; qn [ Qd such that they
k k
r ~r
form an FO
i -indeterminate cycle and dd …qi ; si † ˆ q…i ‡ 1† mod n . Let …xu ; `u †; …yu ; `u † [ qu ;
u ˆ 1; . . . ; n; k ˆ 1; . . . ; m, and r ˆ 1; . . . ; m0 , with m; m0 [ N, form corresponding
O
k
~r
cycles in G0 with FO
i [ `u and Fi 6 [ ` u . Then we have
d…xku ; sku su † ˆ xk…u ‡ 1† ;
u ˆ 1; . . . ; n
1; k ˆ 1; . . . ; m
d…xkn ; skn sn †
k ˆ 1; . . . ; m
1
d…yru ; s~ru su † ˆ yr…u ‡ 1† ;
u ˆ 1; . . . ; n
1; r ˆ 1; . . . ; m0
d…yrn ; s~rn sn † ˆ yr1 ‡ 1 ;
r ˆ 1; . . . ; m0
ˆ
xk1 ‡ 1 ;
and
m
1
d…xm
n ; sn s n † ˆ x 1
and
and
0
0
1
~m
d…ym
n ;s
n sn † ˆ y1
where
1
196
CONTANT ET AL.
sku [ L…G; xku †; s~ru [ L…G; yru †
and
sku ; s~ru [ Suo
Since …x11 ; `11 †; …y11 ; `~11 † [ q1 ; 9s0 ; s~0 [ L such that d…x0 ; s0 † ˆ x11 , d…y0 ; s~0 † ˆ y11 and
O
1
~r
P…s0 † ˆ P…~
s0 † from Property 1. Furthermore, FO
i [ `1 implies that fi [ s0 , and Fi 6 [ `u
r
means that fi 6 [ s~0 and fi 6 [ s~u for all u; r.
Consider the two traces
m
m
o ˆ so …s11 s1 s12 s2 ; . . . ; s1n sn s21 s1 s22 s2 ; . . . ; s2n sn ; . . . ; sm
1 s1 s2 s2 ; . . . ; sn sn †
km0
km
~ ˆ s~o …~
~m
~m
s11 s1 s~12 s2 ; . . . ; s~1n sn s~21 s1 s~22 s2 ; . . . ; s~2n sn ; . . . ; s~m
o
1 s1 s
2 s2 ; . . . ; s
n sn †
0
0
0
0
~ [ L; P…o† ˆ P…o†
~ ˆ P…s0 †…s1 s2 . . . sn †kmm , and
for arbitrarily large k. Then o [ L; o
O
R
R ~
~ Let s [ so be such that s [ C… fi †,
FO
since fi 6 [ o.
i [ ` …o† since fi [ o, while Fi 6 [ ` …o†
and let t [ L=s be such that o ˆ st. By choosing k to be arbitrarily large, we can get
R ~
~ [ PL 1 ‰P…st†Š and FO
~
k t k > n for any given n [ N. Furthermore o
since fi 6 [ o.
i 6 [ ` …o†
Therefore, the chosen s violates the de®nition of Type-O diagnosability. Hence L is not
Type-O diagnosable.
*
Suf®ciency:
We prove that if there are no FO
i -indeterminate cycles then the language L is Type-O
diagnosable. Assume that the diagnoser GRd does not have any FO
i -indeterminate cycle,
i ˆ 1; 2; . . . ; m. For any FO
i , pick any s [ L such that s [ C… fi † and let d…x0 ; s† ˆ x. Pick
any t1 [ L0 …G; x†. Since we assume there are no cycles of unobservable events in G,
there exists n0 [ N such that k t1 k n0 . Let d…x0 ; st1 † ˆ x1 and correspondingly in GRd ,
let dd ‰q0 ; P…st1 †Š ˆ q1 . Then …x1 ; `1 † [ q1 and FO
i [ `1 .
O
We distinguish two cases: (I) q1 is FO
i -certain and (II) q1 is Fi -uncertain.
Case I: Suppose q1 is FO
i -certain. Then, by De®nition 12,
R
…Vo [ PL 1 ‰P…st1 †Š†FO
i [ ` …o†
O
Hence L is diagnosable for FO
i with ni ˆ n0 . Since this is true for any Fi ; L is
diagnosable.
O
Case II: Suppose q1 is FO
i -uncertain. Consider any …z; `† [ q1 such that Fi [ `. We shall
O
0 0
then refer to z as an ``x-state'' of q1 . Likewise, if …z ; ` † [ q1 such that Fi 6 [ `0 , then we
shall denote z0 as a ``y-state'' of q1 . We have assumed that there are no FO
i indeterminate cycle in GRd . Recalling the de®nition of an FO
-indeterminate
cycle,
this
i
assumption means that one of the following is true: (i) there are no cycles of FO
i uncertain states in GRd , or (ii) there exists one or more cycles of FO
-uncertain
states
i
q1 ; q2 ; . . . ; qn in GRd but corresponding to any such cycle in GRd , there do not exist two
sequences fxku g and fyru g; u ˆ 1; 2; . . . ; n, and k; r [ N such that both of them form
DIAGNOSIS OF INTERMITTENT FAULTS
197
cycles in G0 (fxku g is composed of ``x-states'' of qu , and fyru g is composed of ``ystates'' of qu ; u ˆ 1; 2; . . . ; n, cf. De®nition 13).
i.
O
R
Suppose that there are no cycles of FO
i -uncertain states in Gd . Then every Fi O
uncertain state will lead to an Fi -certain state in a bounded number of transitions
by Property 2 of label propagation.
ii.
R
Suppose that there exists a cycle C of FO
i -uncertain states q1 ; q2 ; . . . ; qn in Gd . We
show that whenever a fault happens, i.e., when the true state of the system is an
``x-state'', it is not possible to loop for arbitrarily long in this cycle in GRd and
thereby never detect the fault.
~
Pick any ``y-state'' yu [ qu , and let the corresponding label be `~u . Since FO
i 6 [ `u , the
~
~
pair …yu ; `u † [ qu could only have resulted from a pair …yu 1 ; `u 1 † [ qu 1 such that
~
FO
i 6 [ `u 1 , because of Property 2. That is the ``y-state'' yu cannot be a successor of
any ``x-state'' xu 1 along the corresponding trace in G0 . Thus, by backward induction,
we can always build a cycle of states in G0 involving some or all of the ``y-states'' of
qu ; u ˆ 1; 2; . . . ; n. These ``y-states'' then constitute the sequence yru . Since the cycle
O
of FO
i -uncertain states q1 ; q2 ; . . . ; qn is not Fi -indeterminate, there does not exist a
cycle in G0 involving the ``x-states'' of qu ; u ˆ 1; 2; . . . ; n. Hence, if we pick any ``xstate'' xu in state qu in the cycle C, then a suf®cient long trace p [ L…G; xu †, guaranteed
by the liveness assumption, will leave the cycle C of FO
i -uncertain states, and will lead
to an FO
-certain
state.
j
i
THEOREM 4 Type-I diagnosability
Consider the language L generated by automaton G. L is Type-I diagnosable iff there are
no FIi -indeterminate cycles in the diagnoser GRd .
The proof of Theorem 4 is similar to that of Theorem 3 and is therefore omitted.
7.
A Pump-Valve-Controller Example
Consider a small system consisting of mechanical components: a pump, a valve, a controller,
and one ¯ow sensor. For the sake of simplicity we assume that the pump and the controller do
not fail. The valve has two intermittent fault modes, namely, a stuck- and unstuck-closed
fault mode (labelled Fx1 ) and a stuck- and unstuck-open fault mode (labelled Fx2 ).
In order to construct the complete model G, we ®rst need to form the parallel
composition …Pump k Valve k Controller† of the individual component models; see
Figure 13. The initial states are POFF (Pump OFF), VC (Valve Closed), and C1 (Controller
State #1). The meaning of the events and their observable or unobservable status are
shown in Table 2. The second step is to combine the sensor map with the synchronized
component models …Pump k Valve k Controller† and obtain the complete model G. The
¯ow sensor takes either the value F, indicating ¯ow, or the value NF, indicating no ¯ow.
The global sensor map is listed in Table 3. This table represents the sensor mapping of the
198
CONTANT ET AL.
Table 2. Event list.
C_P: Close_Pump (observable)
O_P: Open_Pump (observable)
NL: No Load (observable)
L: Load (observable)
C_V: Close_Valve (observable)
S_C: Stuck_Closed (unobservable)
V_C: Valve_Closed (unobservable)
US_C: UnStuck_Closed (unobservable)
O_V: Open_Valve (observable)
S_O: Stuck_Open (unobservable)
V_O: Valve_Open (unobservable)
US_O: UnStuck_Open (unobservable)
Table 3. Global sensor map.
f …PON; VC; † ˆ NF
f …PON; X1; † ˆ NF
f …PON; SC; † ˆ NF
f …PON; X3; † ˆ NF
f …PON; VCi ; † ˆ NF
f …PON; X5; † ˆ NF
f …POFF; ; † ˆ NF
Figure 13. Component models.
f …PON; VO; † ˆ F
f …PON; X2; † ˆ F
f …PON; SO; † ˆ F
f …PON; X4; † ˆ F
f …PON; VOi ; † ˆ F
f …PON; X6; † ˆ F
DIAGNOSIS OF INTERMITTENT FAULTS
199
Figure 14. Portions of diagnoser GRd showing the indeterminate cycles.
states obtained in the parallel composition …Pump k Valve k Controller†. Each state is
therefore composed of three ordered state components, for the respective states of the
pump, valve, and controller. The symbol represents any possible state of the given
component. The resulting G was obtained using the UMDES-LIB1 software and has 64
states; it is not depicted here. The next step is to construct the diagnoser GRd for G. This is
done using UMDES-LIB; GRd has 104 states. By inspecting GRd , we test the four
diagnosability conditions de®ned in this paper. This analysis shows that there are only two
indeterminate cycles in GRd and they are both FIR
2 -indeterminate cycles. Figure 14 and
Tables 4 and 5 describe the two portions of the diagnoser GRd that contain the FIR
2 indeterminate cycles. We draw the following conclusions:
*
*
*
I
The system is Type-O and Type-I diagnosable since neither FO
i - nor Fi -indeterminate
cycles exist.
R
No FIP
i -indeterminate cycles exist in Gd ; i ˆ 1; 2. Based on Theorem 1 we need to
verify if certain assumptions are satis®ed in order to conclude about Type-P
diagnosability. The language L is Sf - and Sr -recurrent and Assumption (A1) is
satis®ed for both fault types. Therefore the system is Type-P diagnosable.
Based on Corollary 2 (ii) the fault of type 2 is not Type-R diagnosable since there exist
R
IR
FIR
2 -indeterminate cycles in Gd . Due to the absence of F1 -indeterminate cycles, we
need to check for the assumptions mentioned in Theorem 2 in order to conclude about
Type-R diagnosability. Assumption (A2) is violated for both fault types and thus
despite the absence of FIR
1 -indeterminate cycles, no conclusion can be drawn about
Type-R diagnosability for faults of type 1.
200
CONTANT ET AL.
Table 4. Description of diagnoser states involved in indeterminate cycle 1.
State Number
Concatenation of the Component States
62
66
42
47
52
57
IP
IR
f(poff,x4,c2)FIP
2 ; (poff,so,c2)F2 ; (poff,vci,c2)F2 g
IP
f(pon,x4,c3)FIP
;
(pon,so,c3)F
g
2
2
f(pon,so,c4)FIP
2 g
IP
f(pon,so,c5)F2 g
f(pon,x4,c6)FIP
2 g
f(poff,x4,c7)FIP
2 g
Table 5. Description of diagnoser states involved in indeterminate cycle 2.
8.
State Number
Concatenation of the Component States
103
104
90
96
99
101
IP
IR IP
IR IR
f(poff,x4,c2)FIR
1 F2 ; (poff,so,c2)F1 F2 ; (poff,vci,c2)F1 F2 g
IP
IR IP
f(pon,x4,c3)FIR
1 F2 ; (pon,so,c3)F1 F2 g
IP
f(pon,so,c4)FIR
1 F2 g
IP
f(pon,so,c5)FIR
1 F2 g
IP
F
f(pon,x4,c6)FIR
1 2 g
IP
f(poff,x4,c7)FIR
1 F2 g
Conclusion
Intermittent faults are dynamic in nature, i.e., they can occur, reset, and reoccur repeatedly
during a system's operation. They are distinctly different from permanent faults which
never reset after they occur. Therefore, system models that capture permanent failures and
existing methodologies for diagnosis of permanent failures are not appropriate for
modeling and diagnosis of intermittent faults.
We proposed a method for modeling intermittent faults and their resets in the context of
discrete event system models. We de®ned notions of diagnosability that provide
information about the status of intermittent faults at different levels of detail. Finally,
we developed, via conditions that are necessary and suf®cient to ensure different notions
of diagnosability, a methodology for diagnosis of intermittent faults.
Our investigation has revealed that: (i) diagnosis of intermittent faults is a problem
considerably more intricate than the diagnosis of permanent failures; and (ii) the
philosophy and approach to diagnosis of permanent failures developed in Sampath et al.
(1995) is powerful and generic, as evidenced by the nature of the results of this paper.
Acknowledgment
This research was supported in part by NSF grant ECS-0080406.
DIAGNOSIS OF INTERMITTENT FAULTS
201
Notes
1. http://www.eecs.umich.edu/umdes/
References
Aghasaryan, A., Fabre, E., Benveniste, A., Boubour, R., and Jard, C. 1998. Fault detection and diagnosis in
distributed systems: An approach by partially stochastic Petri nets. Discrete Event Dynamic Systems: Theory
and Applications 8(2): 203±231.
Benveniste, A., Fabre, E., Haar, S., and Jard, C. 2003. Diagnosis of asynchronous discrete event systems, a net
unfolding approach. IEEE Trans. Automatic Control 48(5): 714±727.
Bouloutas, A. T. 1990. Modeling fault management in communication networks. Ph.D. thesis, Columbia
University.
Cassandras, C. G., and Lafortune, S. 1999. Introduction to Discrete Event Systems. Kluwer Academic Publishers.
Console, L. 2000. Diagnosis and diagnosability analysis using process algebras. In A. Darwiche and G. Provan
(eds.), Proc. DX'00: Eleventh International Workshop on Principles of Diagnosis, June, pp. 25±32.
Darwiche, A., and Provan, G. 1996. Exploiting system structure in model-based diagnosis of discrete event
systems. In Pro. Seventh International Workshop on the Principles of Diagnosis, DX-96, October, Val Morin,
Canada.
Dvorak, D., and Kuipers, B. 1992. Model based monitoring of dynamic systems. In W. Hamscher, L. Console, and
J. Kleer (eds.), Readings in Model Based Diagnosis, Morgan Kaufmann, pp. 249±254.
Debouk, R., Lafortune, S., and Teneketzis, D. 2000. Coordinated decentralized protocols for failure diagnosis of
discrete-event systems. Discrete Event Dynamic Systems: Theory and Applications 10(1/2): 33±86.
Garcia, E., Morant, F., Blasco-Gimenez, R., Correcher, A., and Quiles, E. 2002. Centralized modular diagnosis
and the phenomenon of coupling. In Proc. IEEE WODES '02 International Workshop on Discrete Event
Systems, October, pp. 161±168.
Hastrudi Zad, S., Kwong, R. H., and Wonham, W. M. 1998. Fault diagnosis in discrete-event systems: Framework
and model reduction. In Proc. 37th IEEE Conf. on Decision and Control, December, pp. 3769±3774.
Jiang, S., and Kumar, R. 2002. Failure diagnosis of discrete event systems with linear-time temporal logic fault
speci®cations. In Proc. 2002 American Control Conference, May, pp. 128±133.
Jiang, S., Kumar, R., and Garcia, H. E. 2002. Diagnosis of repeated failures in discrete event systems. In Proc. of
the 41st IEEE Conference on Decision and Control, December, pp. 4000±4005.
Lafortune, S., Teneketzis, D., Sampath, M., Sengupta, R., and Sinnamohideen, K. 2001. Failure diagnosis of
dynamic systems: An approach based on discrete event systems. In Proc. 2001 American Control Conference,
June, pp. 2058±2071.
Lamperti, G., and Zanella, M. 1999. Diagnosis of discrete event systems integrating synchronous and
asynchronous behavior. In Proceedings of the Ninth International Workshop on Principles of Diagnosis,
DX'99, pp. 129±139.
Lin, F. 1994. Diagnosability of discrete event systems and its applications. Discrete Event Dynamic Systems:
Theory and Applications 4(2): 197±212.
Lin, F., Markee, J., and Rado, B. 1993. Design and test of mixed signal circuits: A discrete event approach. In
Proc. 32th IEEE Conf. on Decision and Control, December, pp. 246±251.
Lunze, J. 2000. Diagnosis of quantised systems. In IFAC SafeProcess 2000, June, pp. 28±39.
Pandalai, D. N., and Holloway, L. E. 2000. Template languages for fault monitoring of discrete event processes.
IEEE Transactions on Automatic Control 45(5): 868±882.
PencoleÂ, Y. 2000. Decentralized diagnoser approach: Application to telecommunication networks. In A.
Darwiche and G. Provan (eds.), Proc. DX'00: Eleventh International Workshop on Principles of Diagnosis,
June, pp. 185±192.
PencoleÂ, Y., Cordier, M.-O., and RozeÂ, L. 2001. A decentralized model-based diagnostic tool for complex
systems. In Proc. of the 13th IEEE International Conf. on Tools with Arti®cial Intelligence, November, pp. 95±
102.
202
CONTANT ET AL.
Provan, G., and Chen, Y.-L. 1998. Diagnosis of timed discrete event systems using temporal causal networks:
Modeling and analysis. In Proc. of the 1998 International Workshop on Discrete Event Systems (WODES'98),
IEE, August, pp. 152±154.
Provan, G., and Chen, Y.-L. 1999. Model-based diagnosis and control recon®guration for discrete event systems:
An integrated approach. In Proc. 38th IEEE Conf. on Decision and Control, December, pp. 1762±1768.
Ramadge, P. J., and Wonham, W. M. 1989. The control of discrete event systems. Proc. IEEE 77(1): 81±98.
Sampath, M. 2001. A hybrid approach to failure diagnosis of industrial systems. In Proc. 2001 American Control
Conf., June.
Sampath, M., Lafortune, S., and Teneketzis, D. 1998. Active diagnosis of discrete event systems. IEEE Trans.
Automatic Control 43(7): 908±929.
Sampath, M., Sengupta, R., Sinnamohideen, K., Lafortune, S., and Teneketzis, D. 1995. Diagnosability of discrete
event systems. IEEE Transactions on Automatic Control 40(9): 1555±1575.
Sampath, M., Sengupta, R., Sinnamohideen, K., Lafortune, S., and Teneketzis, D. 1996. Failure diagnosis using
discrete event models. IEEE Transactions on Control Systems Technology 4(2): 105±124.
Sengupta, R. 2001. Discrete-event diagnostics of automated vehicles and highways. In Proc. 2001 American
Control Conf., June.
Sinnamohideen, K. 2001. Discrete-event diagnostics of heating, ventilation, and air-conditioning systems. In
Proc. 2001 American Control Conf., June.
Westerman, G., Kumar, R., Stroud, C., and Heath, J. R. 1998. Discrete event system approach for delay fault
analysis in digital circuits. In Proc. 1998 American Control Conf.
Williams, B., and Nayak, P. 1996. A model-based approach to reactive self-con®guring systems. In Proceedings
of the AAAI.