Structural transformations of probabilistic finite state machines

International Journal of Control
Vol. 81, No. 5, May 2008, 820–835

I. CHATTOPADHYAY and A. RAY*
The Pennsylvania State University, University Park, PA 16802, USA
(Received 12 June 2007; in final form 25 September 2007)
Probabilistic finite state machines have recently emerged as a viable tool for modelling
and analysis of complex non-linear dynamical systems. This paper rigorously establishes
such models as finite encodings of probability measure spaces defined over symbol strings.
The well-known Nerode equivalence relation is generalized to the probabilistic setting,
and pertinent results on existence and uniqueness of minimal representations of probabilistic
finite state machines are presented. The binary operations of probabilistic synchronous
composition and projective composition, which have applications in symbolic model-based
supervisory control and in symbolic pattern recognition problems, are introduced. The results
are elucidated with numerical examples and are validated on experimental data for statistical
pattern classification in a laboratory environment.
1. Introduction and motivation
Probabilistic finite state machines have recently emerged
as a modelling paradigm for constructing causal models
of complex dynamics. The general inapplicability of
classical identification algorithms in complex non-linear
systems has led to development of several techniques
for construction of probabilistic representations of
dynamical evolution from observed system behaviour.
The essential feature of a majority of such reported
approaches is partial or complete departure from the
classical continuous-domain modelling towards a formal
language theoretic and hence symbolic paradigm
(Ray 2004, Shalizi and Shalizi 2004). The continuous
range of a sufficiently long observed data set is
discretized and tagged with labels to obtain a symbolic
sequence (Ray 2004), which is subsequently used to
compute a language-theoretic finite state probabilistic
predictor via recursive model update algorithms.
Symbolization essentially discretizes the continuous
state space and gives rise to probabilistic dynamics
from the underlying deterministic process, as illustrated
in figure 1.
*Corresponding author. Email: [email protected]

Among various reported symbolic reconstruction algorithms, causal-state splitting reconstruction (CSSR) (Shalizi and Shalizi 2004) computes optimal representations (e.g., $\varepsilon$-machines) and is reported to
yield the minimal representation consistent with accurate prediction. In contrast, the D-Markov construction
(Ray 2004) produces a sub-optimal model, but it has
a significant computational advantage and has been
shown to be better suited for online detection of small
parametric anomalies in dynamic behaviour of physical
processes (Rajagopalan and Ray 2006).
This paper addresses the issue of structural manipulation of such inferred probabilistic models of system
dynamics. The ability to transform and manipulate the
automaton structure is critical for design of supervisory
control algorithms for symbolic models and real-time
pattern recognition from symbol sequences. Specific
issues are delineated in the sequel.
1.1 Applications of symbolic model-based control
The natural setting for developing control algorithms
for symbolic models is that of probabilistic languages.
The notion of probabilistic languages in the context
of studying qualitative stochastic behaviour of discrete-event systems first appeared in Garg (1992a, b), where
the concept of p-languages (‘p’ implying probabilistic)
is introduced and an algebra is developed to
model probabilistic languages based on concurrency.
International Journal of Control, ISSN 0020-7179 print / ISSN 1366-5820 online, © 2008 Taylor & Francis, http://www.tandf.co.uk/journals, DOI: 10.1080/00207170701704746
Figure 1. Emergence of probabilistic dynamics from the underlying deterministic system due to discretization; $q_1$, $q_2$, $q_3$ are symbolic states and $\sigma$, $\tau$ are symbolic events.

Figure 2. Symbolic model and control specification.
A multitude of control algorithms for p-language-theoretic models have been reported. Earlier approaches (Lawford and Wonham 1993, Kumar and Garg 2001) attempt a direct generalization of Ramadge and Wonham's supervisory control theory (Ramadge and Wonham 1987) for deterministic languages and prove to be somewhat cumbersome in practice. A significantly
simpler approach is suggested in Ray (2005),
Chattopadhyay (2006) and Chattopadhyay and Ray
(2007), where supervisory control laws are synthesized
by elementwise maximization of a language measure
vector (Ray 2005, Chattopadhyay and Ray 2006) to
ensure that the generated event strings cause the
supervised plant to visit the "good" states while attempting to avoid the "bad" states optimally in a probabilistic sense. The notion of "good" and "bad" is induced by specifying scalar weights on the model states, with relatively more negative weights indicating less desirable states. Unlike the previous approaches, the measure-theoretic approach does not require a "specification automaton"; however, a specification weight is assigned to each state of the finite state machine. (Note: These states are different from the
states obtained via symbolic reconstruction of observed
physical data.)
Figure 2 illustrates the underlying concept. The symbolic model shown on the left has three states $q_1$, $q_2$, $q_3$, while the control objective is specified by weights $+1$ and $-1$ on states $q_A$, $q_B$ of the two-state automaton on the right.
Recalling that every finite state automaton induces
a right invariant partition on the set of all possible
Figure 3. Imposing control specification by probabilistic synchronous composition of automata.
finite length strings, the above situation is illustrated by
figures 3(a) and (b). The operation of probabilistic
synchronous composition, defined later in this paper,
resolves the problem by considering the product
partition in figure 3(c). Then, the given model is transformed into the one shown in figure 3(d), on which the optimization algorithm reported in Chattopadhyay and Ray (2007) can be directly applied to yield the optimal supervision policy.
1.2 Applications of symbolic pattern recognition
As mentioned earlier, the symbolic reconstruction
algorithms (Ray 2004, Shalizi and Shalizi 2004) generate
probabilistic finite state models from observed time
series. However, in a pattern classification problem, one
may be only interested in a given class of possible future
evolutions. For example, as illustrated in figure 4, while
the systems G1, G2, . . . , Gk yield different symbolic
models $A_1, A_2, \ldots, A_k$, we may be only interested in matching a given template, i.e., in knowing how similar the systems are as far as strings with an even number of 0s are concerned (note: $q_A$ = {strings with an even number of 0s}).
The operation of projective composition, defined in
this paper, allows transformation of each model Ai
to the structure of the template while preserving the
distribution over the strings of interest, and is of critical
importance in symbolic pattern classification problems.
As shown in the sequel, the model order of the machines
$A_i$ is not particularly important; hence, projective composition accomplishes model order reduction within a quantifiable error.

Figure 4. Symbolic template matching problem.

1.3 Organization of the paper

The paper is organized in seven sections including the present one. Section 2 presents preliminary concepts and pertinent results that are necessary for subsequent development. Section 3 introduces the concept of probabilistic finite state automata as finite encodings of probability measure spaces. The concept of Nerode equivalence is generalized to probabilistic automata, and the key results on existence and uniqueness of minimal representations are established. Section 4 presents metrics on the space of probability measures on symbolic strings, which are shown to induce pseudometrics on the space of probabilistic finite state automata. Along this line, the concept of probabilistic synchronous composition is introduced and the results are elucidated with a simple example. Section 5 defines projective composition, and invariance of projected distributions is established. A numerical example is provided for clarity of exposition. Section 6 demonstrates applicability of the developed method to a pattern classification problem on experimental data. The paper is summarized and concluded in §7 with recommendations for future research.

2. Preliminary notions

A deterministic finite state automaton (DFSA) is defined (Hopcroft et al. 2001) as a quintuple $G_i = (Q, \Sigma, \delta, q_i, Q_m)$, where $Q$ is the finite set of states, $q_i \in Q$ is the initial state, and $\Sigma$ is the (finite) alphabet of events. The Kleene closure of $\Sigma$, denoted as $\Sigma^*$, is the set of all finite-length strings of events including the empty string $\varepsilon$; the set of all finite-length strings of events excluding the empty string is denoted as $\Sigma^+$; and the set of all strictly infinite-length strings of events is denoted as $\Sigma^\omega$. A subset of $\Sigma^\omega$ is called an $\omega$-language on the alphabet $\Sigma$, and a subset of $\Sigma^*$ is called a $*$-language. If the meaning is clear from context, we refer to a set of strings simply as a language. The function $\delta : Q \times \Sigma \to Q$ represents the state transition map, $\delta^* : Q \times \Sigma^* \to Q$ is the reflexive and transitive closure (Hopcroft et al. 2001) of $\delta$, and $Q_m \subseteq Q$ is the set of marked (i.e., accepting) states. For given functions $f$ and $g$, we denote the composition as $f \circ g$.

Definition 1: The classical Nerode equivalence $N$ (Hopcroft et al. 2001) on $\Sigma^*$ with respect to a given language $L \subseteq \Sigma^*$ is defined as:

$$\forall x, y \in \Sigma^*, \quad \Big( x \, N \, y \iff \big( \forall u \in \Sigma^*, \ (xu \in L) \iff (yu \in L) \big) \Big). \tag{1}$$

A language $L$ is regular if and only if the corresponding Nerode equivalence is of finite index (Hopcroft et al. 2001).
Probabilistic finite state automata (PFSA) considered in this paper are built upon deterministic finite state automata (DFSA) with a specified event generating function. The formal definition is stated next.

Definition 2 (PFSA): A probabilistic finite state automaton (PFSA) is a quintuple $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, where the quadruple $(Q, \Sigma, \delta, q_i)$ is a DFSA with unspecified marked states and the mapping $\tilde{\pi} : Q \times \Sigma \to [0, 1]$ satisfies the following condition:

$$\forall q_j \in Q, \quad \sum_{\sigma \in \Sigma} \tilde{\pi}(q_j, \sigma) = 1. \tag{2}$$

In the sequel, $\tilde{\pi}$ is denoted as the event generating function. For a PFSA $P_i$, the cardinality of the set of states is denoted as $\textsc{numstates}(P_i)$.
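To make Definition 2 concrete, a PFSA can be stored directly as a transition map $\delta$ and an event generating function $\tilde{\pi}$ over a finite state set. The sketch below checks the normalization condition of equation (2); the state names, alphabet, and probability values are hypothetical, chosen only for illustration.

```python
# A minimal sketch of Definition 2: a PFSA as (Q, Sigma, delta, pi_tilde).
# The two-state machine below is hypothetical, not one from the paper.

Q = ['q1', 'q2']                      # finite state set
SIGMA = ['0', '1']                    # event alphabet
delta = {('q1', '0'): 'q1', ('q1', '1'): 'q2',
         ('q2', '0'): 'q1', ('q2', '1'): 'q2'}   # transition map
pi_tilde = {('q1', '0'): 0.7, ('q1', '1'): 0.3,
            ('q2', '0'): 0.4, ('q2', '1'): 0.6}  # event generating function

def is_valid_pfsa(Q, SIGMA, pi_tilde, tol=1e-9):
    """Check equation (2): event probabilities out of every state sum to 1."""
    return all(abs(sum(pi_tilde[(q, s)] for s in SIGMA) - 1.0) <= tol
               for q in Q)

print(is_valid_pfsa(Q, SIGMA, pi_tilde))  # True for the values above
```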
Definition 3: For every PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, there is an associated stochastic matrix $\Pi \in \mathbb{R}^{\textsc{numstates}(P_i) \times \textsc{numstates}(P_i)}$, called the state transition probability matrix, which is defined as follows:

$$\Pi_{jk} = \sum_{\sigma \in \Sigma : \, \delta(q_j, \sigma) = q_k} \tilde{\pi}(q_j, \sigma). \tag{3}$$

We further note that for every stochastic matrix $\Pi$, there exists at least one row-vector $\wp$ such that

$$\wp \, \Pi = \wp, \quad \text{where } \forall j, \ \wp_j \geq 0 \ \text{ and } \ \sum_{j=1}^{\textsc{numstates}(P_i)} \wp_j = 1, \tag{4}$$

where $\wp$ is a stable long-term distribution over the PFSA states. If $\Pi$ is irreducible, then $\wp$ is unique. Otherwise, there may exist more than one possible solution to equation (4), one for each eigenvector corresponding to the unity eigenvalue. However, if the initial state is specified (as it is in this paper), then $\wp$ is always unique. Several efficient algorithms have been reported in Kemeny and Snell (1960), Harrod and Plemmons (1984) and Stewart (1999) for computation of $\wp$.
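As an illustration of equations (3) and (4), the matrix $\Pi$ can be assembled from $\tilde{\pi}$ and $\delta$, and a stationary row vector $\wp$ approximated by power iteration (one of several possible methods; the references above give more efficient ones). The two-state machine is hypothetical.

```python
# Sketch of Definition 3: build the stochastic matrix Pi from a PFSA and
# approximate a stationary row vector wp with wp @ Pi = wp (equation (4)).
# The example machine is hypothetical.

Q = ['q1', 'q2']
SIGMA = ['0', '1']
delta = {('q1', '0'): 'q1', ('q1', '1'): 'q2',
         ('q2', '0'): 'q1', ('q2', '1'): 'q2'}
pi_tilde = {('q1', '0'): 0.7, ('q1', '1'): 0.3,
            ('q2', '0'): 0.4, ('q2', '1'): 0.6}

idx = {q: j for j, q in enumerate(Q)}
n = len(Q)

# Equation (3): Pi[j][k] = sum of pi_tilde(q_j, s) over s with delta(q_j, s) = q_k.
Pi = [[0.0] * n for _ in range(n)]
for (q, s), p in pi_tilde.items():
    Pi[idx[q]][idx[delta[(q, s)]]] += p

# Power iteration for wp; Pi is irreducible here, so wp is unique.
wp = [1.0 / n] * n
for _ in range(500):
    wp = [sum(wp[j] * Pi[j][k] for j in range(n)) for k in range(n)]

print([round(p, 4) for p in wp])  # ≈ [0.5714, 0.4286], i.e., (4/7, 3/7)
```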
Key definitions and results from measure theory that are
used here are recalled.
Definition 4 ($\sigma$-Algebra): A collection $\mathfrak{M}$ of subsets of a non-empty set $X$ is said to be a $\sigma$-algebra (Rudin 1988) in $X$ if $\mathfrak{M}$ has the following properties:

(1) $X \in \mathfrak{M}$;
(2) If $A \in \mathfrak{M}$, then $A^c \in \mathfrak{M}$, where $A^c$ is the complement of $A$ relative to $X$, i.e., $A^c = X \setminus A$;
(3) If $A = \bigcup_{n=1}^{\infty} A_n$ and $A_n \in \mathfrak{M}$ for $n \in \mathbb{N}$, then $A \in \mathfrak{M}$.

Theorem 1: If $\mathcal{F}$ is any collection of subsets of $X$, there exists a smallest $\sigma$-algebra $\mathfrak{M}^*$ in $X$ such that $\mathcal{F} \subseteq \mathfrak{M}^*$.

Proof: See (Rudin 1988, Theorem 1.10). □
Definition 5 (Measure): A finite (non-negative) measure is a countably additive function $\mu$, defined on a $\sigma$-algebra $\mathfrak{M}$, whose range is $[0, K]$ for some $K \in \mathbb{R}$. Countable additivity means that if $\{A_i\}$ is a disjoint countable collection of members of $\mathfrak{M}$, then

$$\mu\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} \mu(A_i). \tag{5}$$

Theorem 2: If $\mu$ is a (non-negative) measure on a $\sigma$-algebra $\mathfrak{M}$, then

(1) $\mu(\varnothing) = 0$;
(2) (Monotonicity) $A \subseteq B \Rightarrow \mu(A) \leq \mu(B)$ if $A, B \in \mathfrak{M}$.

Proof: See (Rudin 1988, Theorem 1.19). □
Definition 6: A probability measure on a non-empty set with a specified $\sigma$-algebra $\mathfrak{M}$ is a finite non-negative measure on $\mathfrak{M}$. Although not required by the theory, a probability measure is defined to have the unit interval $[0, 1]$ as its range.

Definition 7: A probability measure space is a triple $(X, \mathfrak{M}, q)$, where $X$ is the underlying set, $\mathfrak{M}$ is the $\sigma$-algebra in $X$ and $q$ is a finite non-negative measure on $\mathfrak{M}$.
3. Properties of probabilistic finite state automata

For any $x \in \Sigma^*$, the language $x\Sigma^\omega$ has an important physical interpretation pertaining to systems modelled as probabilistic language generators (figure 5). A string $x \in \Sigma^*$ can be interpreted as a symbol sequence that has already been generated, and any string in $x\Sigma^\omega$ qualifies as a possible future evolution. Thus, the language $x\Sigma^\omega$ is conceptually associated with the current dynamical state of the modelled system.
Figure 5. Interpretation of the language $x\Sigma^\omega$ pertaining to dynamical evolution of a language generator.
Definition 8: Given an alphabet $\Sigma$, the set $\mathcal{B}_\Sigma \subseteq 2^{\Sigma^\omega}$ is defined to be the $\sigma$-algebra generated by the set $\{L : L = x\Sigma^\omega \text{ where } x \in \Sigma^*\}$, i.e., the smallest $\sigma$-algebra on the set $\Sigma^\omega$ which contains the set $\{L : L = x\Sigma^\omega \text{ where } x \in \Sigma^*\}$.

Remark 1: The cardinality of $\mathcal{B}_\Sigma$ is $\aleph_1$ because both $2^{\Sigma^*}$ and $\Sigma^\omega$ have cardinality $\aleph_1$.

The following relations in the probability measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ are consequences of Definition 8:

- $q(\Sigma^\omega) = q(\varepsilon\Sigma^\omega) = 1$;
- $\forall x, u \in \Sigma^*$, $xu\Sigma^\omega \subseteq x\Sigma^\omega$ and hence $q(xu\Sigma^\omega) \leq q(x\Sigma^\omega)$.

Notation 1: For brevity, the probability $q(x\Sigma^\omega)$ is denoted as $q(x)$ $\forall x \in \Sigma^*$ in the sequel.
Next, the notion of probabilistic Nerode equivalence $N_q$ is introduced on $\Sigma^*$ for representing the measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ in the form of a PFSA. In this context, the following logical formulae are introduced.

Definition 9: For $x, y \in \Sigma^*$,

$$U_1(x, y) = \big( q(x) = 0 \ \wedge \ q(y) = 0 \big) \tag{6a}$$

$$U_2(x, y) = \big( q(x) \neq 0 \ \wedge \ q(y) \neq 0 \big) \ \wedge \ \left( \forall u \in \Sigma^*, \ \frac{q(xu)}{q(x)} = \frac{q(yu)}{q(y)} \right). \tag{6b}$$

Theorem 3 (Probabilistic Nerode equivalence): Given an alphabet $\Sigma$, every measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ induces a right-invariant equivalence relation $N_q$ on $\Sigma^*$ defined as

$$\forall x, y \in \Sigma^*, \quad x \, N_q \, y \iff \big( U_1(x, y) \vee U_2(x, y) \big). \tag{7}$$
Proof: Reflexivity and symmetry of the relation $N_q$ follow from Definition 9. Let $x, y, z \in \Sigma^*$ be distinct and arbitrary strings such that $x \, N_q \, y$ and $y \, N_q \, z$. Then, the transitivity of $N_q$ follows from equation (7) and Definition 9. Hence, $N_q$ is an equivalence relation.

To establish right-invariance (Hopcroft et al. 2001) of $N_q$, it suffices to show that

$$\forall x, y \in \Sigma^*, \quad x \, N_q \, y \Rightarrow \big( \forall u \in \Sigma^*, \ xu \, N_q \, yu \big). \tag{8}$$
Let $x, y, u$ be arbitrary strings in $\Sigma^*$ such that $x \, N_q \, y$. If $q(x) = 0$, then $q(y) = 0$ from equation (7), and it follows from the monotonicity property of the measure (see Theorem 2) that $q(xu) = 0$ and $q(yu) = 0$, which implies the truth of $U_1(xu, yu)$ and hence the truth of $xu \, N_q \, yu$. If $q(x) \neq 0$, then $(x \, N_q \, y) \wedge (q(x) \neq 0)$ implies $q(y) \neq 0$. If $q(xu) = 0$, then $U_2(x, y)$ yields $q(yu) = 0$, so that $U_1(xu, yu)$ holds. Otherwise, for every $v \in \Sigma^*$,

$$\frac{q(xuv)}{q(xu)} = \frac{q(xuv)}{q(x)} \cdot \frac{q(x)}{q(xu)} = \frac{q(yuv)}{q(y)} \cdot \frac{q(y)}{q(yu)} = \frac{q(yuv)}{q(yu)}, \tag{9}$$

where the middle equality follows from $U_2(x, y)$ applied to the strings $uv$ and $u$. Hence, $U_2(xu, yu)$ holds, i.e., $xu \, N_q \, yu$. □
Definition 10 (Perfect encoding): Given an alphabet $\Sigma$, a PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$ is defined to be a perfect encoding of the measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ if $\forall \xi \in \Sigma^+$ with $\xi = \sigma_1 \sigma_2 \ldots \sigma_r$,

$$q(\xi) = \tilde{\pi}(q_i, \sigma_1) \prod_{k=1}^{r-1} \tilde{\pi}\big( \delta^*(q_i, \sigma_1 \ldots \sigma_k), \sigma_{k+1} \big). \tag{10}$$

Remark 2: The implications of Definition 10 are as follows: the encoding introduced is perfect in the sense that the measure $q$ can be reconstructed without error from the specification of $P_i$.

Theorem 4: A PFSA is a perfect encoding if and only if the corresponding probabilistic Nerode equivalence $N_q$ is of finite index.

Proof (Left to Right): Let $Q$ be the finite set of equivalence classes of the relation $N_q$, and let the PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$ be constructed as follows:

(1) Since $N_q$ is an equivalence relation on $\Sigma^*$, there exists a unique $q_i \in Q$ such that $\varepsilon \in q_i$. The initial state of $P_i$ is set to $q_i$.
(2) If $x \in q_j$ and $x\sigma \in q_k$, then $\delta(q_j, \sigma) = q_k$.
(3) $\tilde{\pi}(q_j, \sigma) = q(x\sigma)/q(x)$, where $x \in q_j$.

First, we verify that steps 2 and 3 are consistent in the sense that $\delta$ and $\tilde{\pi}$ are well-defined. The probabilistic Nerode equivalence (see Theorem 3) implies that if $x, y \in \Sigma^*$, then $\big( (x \in q_j) \wedge (x\sigma \in q_k) \wedge (y \in q_j) \big) \Rightarrow (y\sigma \in q_k)$. Therefore, the constructed $\delta$ is well-defined. Similarly, since $(x, y \in q_j) \Rightarrow \big( q(x\sigma)/q(x) = q(y\sigma)/q(y) \big)$, the constructed $\tilde{\pi}$ is also well-defined. Therefore, steps 2 and 3 are consistent.

For $\xi = \sigma_1 \sigma_2 \ldots \sigma_r \in \Sigma^+$, it follows that

$$q(\xi) = q(\sigma_1) \prod_{k=2}^{r} \frac{q(\sigma_1 \ldots \sigma_k)}{q(\sigma_1 \ldots \sigma_{k-1})} = \tilde{\pi}(q_i, \sigma_1) \prod_{k=1}^{r-1} \tilde{\pi}\big( \delta^*(q_i, \sigma_1 \ldots \sigma_k), \sigma_{k+1} \big).$$

Hence, the criterion for perfect encoding (see Definition 10) is satisfied.

(Right to Left): Let the PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$ be a perfect encoding, and let the probabilistic Nerode equivalence $N_q$ be of infinite index. Then, there exists an infinite set of strings $h \subset \Sigma^*$ such that each element of $h$ belongs to a distinct $N_q$-equivalence class; that is, $\forall h_j, h_k \in h$ with $j \neq k$, we have $\neg(h_j \, N_q \, h_k)$. Since $q(h_j) = q(h_k) = 0$ implies $h_j \, N_q \, h_k$, there can exist at most one element $h_0 \in h$ such that $q(h_0) = 0$. That is, $q(h_j) \neq 0 \ \forall h_j \in h \setminus \{h_0\}$.

For the PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, where $Q$ is the finite set of states, there exist $q_\ell \in Q$ and $h_j, h_k \in h$ such that $\delta^*(q_i, h_j) = \delta^*(q_i, h_k) = q_\ell$. Let $\xi \in \Sigma^+$ with $\xi = \sigma_1 \sigma_2 \ldots \sigma_r$. Since $P_i$ is a perfect encoding, it follows from Definition 10 that

$$q(h_j \xi) = q(h_j) \, \tilde{\pi}(q_\ell, \sigma_1) \prod_{m=1}^{r-1} \tilde{\pi}\big( \delta^*(q_\ell, \sigma_1 \ldots \sigma_m), \sigma_{m+1} \big)$$

$$q(h_k \xi) = q(h_k) \, \tilde{\pi}(q_\ell, \sigma_1) \prod_{m=1}^{r-1} \tilde{\pi}\big( \delta^*(q_\ell, \sigma_1 \ldots \sigma_m), \sigma_{m+1} \big).$$

Now, it follows that

$$\big( q(h_j) \neq 0 \ \wedge \ q(h_k) \neq 0 \big) \ \wedge \ \left( \forall \xi \in \Sigma^*, \ \frac{q(h_j \xi)}{q(h_j)} = \frac{q(h_k \xi)}{q(h_k)} \right) \ \Rightarrow \ U_2(h_j, h_k) \ \Rightarrow \ h_j \, N_q \, h_k,$$

which contradicts the initial assertion that $\neg(h_j \, N_q \, h_k)$ $\forall h_j, h_k \in h$. This completes the proof. □

The construction in the first part of Theorem 4 is stated in the form of Algorithm 1.

Algorithm 1: Construction of a PFSA from the probability measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$

input: $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ such that $N_q$ is of finite index
output: $P_i$
1 begin
2   Let $Q = \{q_j : j \in J \subset \mathbb{N}\}$ be the set of equivalence classes of the relation $N_q$;
3   Set the initial state of $P_i$ as $q_i$ such that $\varepsilon$ belongs to the equivalence class $q_i$; if $x \in q_j$ and $x\sigma \in q_k$, then set $\delta(q_j, \sigma) = q_k$;
4   Set $\tilde{\pi}(q_j, \sigma) = q(x\sigma)/q(x)$, where $x \in q_j$;
5 end
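A depth-limited sketch of Algorithm 1: strings are grouped by their next-symbol conditional distributions $q(x\sigma)/q(x)$, approximating the $N_q$-equivalence classes. A full implementation needs the measure itself; here $q$ is supplied by a hypothetical two-state generator, so the construction should recover its two states.

```python
# Depth-limited sketch of Algorithm 1: approximate the N_q-equivalence
# classes by grouping strings with identical next-symbol conditionals
# q(x s)/q(x). The generating two-state machine is hypothetical.
from itertools import product

SIGMA = ['0', '1']
delta = {('q1', '0'): 'q1', ('q1', '1'): 'q2',
         ('q2', '0'): 'q1', ('q2', '1'): 'q2'}
pi_tilde = {('q1', '0'): 0.7, ('q1', '1'): 0.3,
            ('q2', '0'): 0.4, ('q2', '1'): 0.6}

def q(x, state='q1'):
    """String probability per equation (10); stands in for the measure q."""
    p = 1.0
    for s in x:
        p *= pi_tilde[(state, s)]
        state = delta[(state, s)]
    return p

def signature(x):
    """Next-symbol conditional distribution of x, used as the grouping key."""
    return tuple(round(q(x + s) / q(x), 9) for s in SIGMA)

# Group all strings of length <= 3 (empty string included) by signature;
# each group approximates one state of the constructed PFSA.
classes = {}
for r in range(4):
    for x in map(''.join, product(SIGMA, repeat=r)):
        classes.setdefault(signature(x), []).append(x)

print(len(classes))  # 2 discovered states for this example
```

The rounding inside `signature` is a practical tolerance; with estimated (rather than exact) probabilities, a statistical similarity test would replace exact grouping.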
Corollary 1 (to Theorem 4): A PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$ induces a probability measure $q$ on the $\sigma$-algebra $\mathcal{B}_\Sigma$, and the corresponding probabilistic Nerode equivalence is of finite index.
Proof: Let a probability measure $q$ be constructed on the $\sigma$-algebra $\mathcal{B}_\Sigma$ as follows:

$$\forall \xi \in \Sigma^+, \quad q(\xi) = \tilde{\pi}(q_i, \sigma_1) \left( \prod_{k=1}^{r-1} \tilde{\pi}\big( \delta^*(q_i, \sigma_1 \ldots \sigma_k), \sigma_{k+1} \big) \right), \quad \text{where } \xi = \sigma_1 \sigma_2 \ldots \sigma_r.$$

It follows from Definition 10 that $P_i$ perfectly encodes the measure $q$, and Theorem 4 implies that the corresponding $N_q$ is of finite index. □

On account of Corollary 1, we can map any given PFSA to a measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$.
Definition 11: Let $\mathscr{P}$ be the space of all probability measures on $\mathcal{B}_\Sigma$ and let $\mathscr{A}$ be the space of all possible PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$.

- The map $H : \mathscr{A} \to \mathscr{P}$ is defined as $H(P_i) = q$ such that

$$\forall \xi \in \Sigma^+, \quad q(\xi) = \tilde{\pi}(q_i, \sigma_1) \left( \prod_{k=1}^{r-1} \tilde{\pi}\big( \delta^*(q_i, \sigma_1 \ldots \sigma_k), \sigma_{k+1} \big) \right), \quad \text{where } \xi = \sigma_1 \sigma_2 \ldots \sigma_r. \tag{11}$$

- The map $H^{-1} : \mathscr{P} \to \mathscr{A}$ is defined as

$$H^{-1}(q) = \begin{cases} P_i \ \text{given by Algorithm 1} & \text{if } N_q \text{ is of finite index} \\ \text{undefined} & \text{otherwise.} \end{cases} \tag{12}$$

Lemma 1: $P_i$ is a perfect encoding of $H(P_i)$.

Proof: The proof follows from Definition 10 and Definition 11. □
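The map $H$ of Definition 11 is directly computable for any finite string: equation (11) is a running product of $\tilde{\pi}$ values along the $\delta$-path. The sketch below (hypothetical machine) also checks a consequence of equation (2): the probabilities of all strings of any fixed length sum to 1.

```python
# Sketch of the map H (equation (11)): q(xi) is the product of pi_tilde
# values along the delta-path of xi from the initial state. Hypothetical machine.
from itertools import product

SIGMA = ['0', '1']
delta = {('q1', '0'): 'q1', ('q1', '1'): 'q2',
         ('q2', '0'): 'q1', ('q2', '1'): 'q2'}
pi_tilde = {('q1', '0'): 0.7, ('q1', '1'): 0.3,
            ('q2', '0'): 0.4, ('q2', '1'): 0.6}

def H(xi, state='q1'):
    """Probability q(xi) assigned by the PFSA to the cylinder xi Sigma^omega."""
    p = 1.0
    for s in xi:
        p *= pi_tilde[(state, s)]
        state = delta[(state, s)]
    return p

# Equation (2) forces the measure of Sigma^omega to be 1 at every depth:
# the probabilities of all strings of a fixed length r sum to 1.
for r in range(1, 5):
    total = sum(H(''.join(x)) for x in product(SIGMA, repeat=r))
    print(r, round(total, 9))
```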
Next we show that, similar to classical finite state
machines, an arbitrary PFSA can be uniquely minimized. However, the sense in which the minimization
is achieved is somewhat different. To this end, we
introduce the notion of reachable states in a PFSA and
define isomorphism of two PFSA.
Definition 12 (Reachable states): Given a PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, the set of reachable states $\text{RCH}(P_i) \subseteq Q$ is defined as:

$$\tilde{q} \in \text{RCH}(P_i) \iff \exists \xi = \sigma_1 \ldots \sigma_R \in \Sigma^* \ \text{such that} \ \big( \delta^*(q_i, \xi) = \tilde{q} \big) \wedge \left( \tilde{\pi}(q_i, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}\big( \delta^*(q_i, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) > 0 \right).$$
Remark 3: The strict positivity condition in Definition 12 ensures that every state in the set of reachable states can actually be attained with a strictly non-zero probability. In other words, for every state $q_j \in \text{RCH}(P_i)$, there exists at least one string $\omega$, initiating from $q_i$ and eventually terminating on state $q_j$, such that the generation probability of $\omega$ is strictly positive.
Definition 13 (Isomorphism): Two PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$ and $P'_{i'} = (Q', \Sigma, \delta', q'_{i'}, \tilde{\pi}')$ are defined to be isomorphic if there exists a bijective map $\Lambda : \text{RCH}(P_i) \to \text{RCH}(P'_{i'})$ such that, $\forall q_j \in \text{RCH}(P_i)$ and $\forall \sigma_k \in \Sigma$,

$$\tilde{\pi}(q_j, \sigma_k) \neq 0 \ \Rightarrow \ \Big( \tilde{\pi}'\big( \Lambda(q_j), \sigma_k \big) = \tilde{\pi}(q_j, \sigma_k) \ \wedge \ \delta'\big( \Lambda(q_j), \sigma_k \big) = \Lambda\big( \delta(q_j, \sigma_k) \big) \Big).$$

Remark 4: The notion of isomorphism stated in Definition 13 generalizes graph isomorphism to PFSA by considering only the states that can be reached with non-zero probability and transitions that have a non-zero probability of occurrence.
Theorem 5 (Minimization of PFSA): For a PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, $H^{-1} \circ H(P_i)$ is the unique minimal realization of $P_i$ in the sense that the following conditions are satisfied:

(1) The PFSA $H^{-1} \circ H(P_i)$ perfectly encodes the probability measure $H(P_i)$.
(2) For any PFSA $P'_{i'}$ that perfectly encodes $H(P_i)$, the inequality $\textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) \leq \textsc{card}\big( \text{RCH}(P'_{i'}) \big)$ holds.
(3) The equality $\textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) = \textsc{card}\big( \text{RCH}(P'_{i'}) \big)$ implies isomorphism of $H^{-1} \circ H(P_i)$ and $P'_{i'}$ in the sense of Definition 13.
Proof:

(1) The proof follows from the construction in Theorem 4.

(2) Let $P'_{i'} = (Q', \Sigma, \delta', q'_{i'}, \tilde{\pi}')$ be an arbitrary PFSA that perfectly encodes the probability measure $H(P_i)$. Let us construct a PFSA $P^\dagger_{i'} = (Q' \cup \{q_d\}, \Sigma, \delta^\dagger, q_{i'}, \tilde{\pi}^\dagger)$, where $q_d$ is a new state not in $Q'$, as follows:

$$\forall q'_j \in Q', \ \forall \sigma_k \in \Sigma, \quad \delta^\dagger(q'_j, \sigma_k) = \begin{cases} q_d & \text{if } \tilde{\pi}'(q'_j, \sigma_k) = 0 \\ \delta'(q'_j, \sigma_k) & \text{otherwise} \end{cases} \tag{13a}$$

$$\forall \sigma_k \in \Sigma, \quad \delta^\dagger(q_d, \sigma_k) = q_d \tag{13b}$$

$$\forall q'_j \in Q', \ \forall \sigma_k \in \Sigma, \quad \tilde{\pi}^\dagger(q'_j, \sigma_k) = \tilde{\pi}'(q'_j, \sigma_k). \tag{13c}$$

It is seen that $P^\dagger_{i'}$ perfectly encodes $H(P_i)$ as well, which follows from Definition 10 and equation (13c). It is claimed that

$$\textsc{card}\big( \text{RCH}(P^\dagger_{i'}) \big) = \textsc{card}\big( \text{RCH}(P'_{i'}) \big) \tag{14}$$

based on the following rationale. Let $q'_j \in \text{RCH}(P'_{i'})$. Following Definition 12, there exists a string $x = \sigma_1 \ldots \sigma_R \in \Sigma^*$ such that $\delta'^*(q_{i'}, x) = q'_j$ and $\tilde{\pi}'(q_{i'}, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}'\big( \delta'^*(q_{i'}, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) > 0$. It follows from equation (13c) that $\tilde{\pi}^\dagger(q_{i'}, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}^\dagger\big( \delta^{\dagger*}(q_{i'}, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) > 0$, and hence we conclude, using equation (13a), that $\delta^{\dagger*}(q_{i'}, x) = q'_j \neq q_d$, which then implies that $q'_j \in \text{RCH}(P^\dagger_{i'})$. Hence we have $\textsc{card}(\text{RCH}(P'_{i'})) \leq \textsc{card}(\text{RCH}(P^\dagger_{i'}))$. By a similar argument, $\textsc{card}(\text{RCH}(P^\dagger_{i'})) \leq \textsc{card}(\text{RCH}(P'_{i'}))$, and hence equation (14) holds.

Next, we claim

$$\forall x, y \in \Sigma^*, \quad \delta^{\dagger*}(q_{i'}, x) = \delta^{\dagger*}(q_{i'}, y) \ \Rightarrow \ x \, N_{H(P_i)} \, y \tag{15}$$

based on the following rationale. Let $x, y \in \Sigma^*$ be such that $\delta^{\dagger*}(q_{i'}, x) = \delta^{\dagger*}(q_{i'}, y)$. It follows from equations (13a-c) that

$$\begin{cases} H(P_i)(x) = 0 \ \wedge \ H(P_i)(y) = 0 & \text{if } \delta^{\dagger*}(q_{i'}, x) = q_d \\ H(P_i)(x) \neq 0 \ \wedge \ H(P_i)(y) \neq 0 & \text{otherwise.} \end{cases} \tag{16}$$

Now, if $H(P_i)(x) = 0 \wedge H(P_i)(y) = 0$, then it follows from equation (6a) that $x \, N_{H(P_i)} \, y$. On the other hand, if $H(P_i)(x) \neq 0 \wedge H(P_i)(y) \neq 0$, then, since $P^\dagger_{i'}$ perfectly encodes $H(P_i)$, for all $u = \sigma_1 \ldots \sigma_R \in \Sigma^*$, with $q^\dagger = \delta^{\dagger*}(q_{i'}, x) = \delta^{\dagger*}(q_{i'}, y)$,

$$\frac{H(P_i)(xu)}{H(P_i)(x)} = \tilde{\pi}^\dagger(q^\dagger, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}^\dagger\big( \delta^{\dagger*}(q^\dagger, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) = \frac{H(P_i)(yu)}{H(P_i)(y)} \ \Rightarrow \ x \, N_{H(P_i)} \, y \quad \text{by equation (6b)}.$$

We define a map $\Gamma : \text{RCH}(H^{-1} \circ H(P_i)) \to \text{RCH}(P^\dagger_{i'})$ as follows. Let $q^\# \in \text{RCH}(H^{-1} \circ H(P_i))$ and let $e(q^\#)$ be the equivalence class of the relation $N_{H(P_i)}$ represented by $q^\#$. Let $x = \sigma_1 \ldots \sigma_R \in e(q^\#)$. Then

$$H(P_i)(x) > 0 \ \text{(see Definition 12)} \ \Rightarrow \ \tilde{\pi}^\dagger(q_{i'}, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}^\dagger\big( \delta^{\dagger*}(q_{i'}, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) > 0 \ \text{(since } P^\dagger_{i'} \text{ perfectly encodes } H(P_i)\text{)} \ \Rightarrow \ \delta^{\dagger*}(q_{i'}, x) \in \text{RCH}(P^\dagger_{i'}).$$

Let $\Gamma(q^\#) = \delta^{\dagger*}(q_{i'}, x)$. Note that $\Gamma(q^\#)$ may, a priori, depend on the choice of $x$. Let $q^\#_1, q^\#_2 \in \text{RCH}(H^{-1} \circ H(P_i))$ be such that $\Gamma(q^\#_1) = \Gamma(q^\#_2)$. If $x_1, x_2$ are the corresponding strings chosen to define $\Gamma(q^\#_1), \Gamma(q^\#_2)$, we have $\delta^{\dagger*}(q_{i'}, x_1) = \delta^{\dagger*}(q_{i'}, x_2)$, which, by claim (15), implies $x_1 \, N_{H(P_i)} \, x_2$, i.e., $q^\#_1 = q^\#_2$. Hence we conclude that $\Gamma$ is injective, which, in turn, implies

$$\textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) \leq \textsc{card}\big( \text{RCH}(P^\dagger_{i'}) \big). \tag{17}$$

Finally, from equations (14) and (17), it follows that

$$\textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) \leq \textsc{card}\big( \text{RCH}(P'_{i'}) \big). \tag{18}$$

(3) Let $P'_{i'} = (Q', \Sigma, \delta', q_{i'}, \tilde{\pi}')$ be an arbitrary PFSA that perfectly encodes $H(P_i)$ such that

$$\textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) = \textsc{card}\big( \text{RCH}(P'_{i'}) \big). \tag{C1}$$

Let the PFSA $H^{-1} \circ H(P_i)$ be denoted as $(Q^\#, \Sigma, \delta^\#, q^\#_{i^\#}, \tilde{\pi}^\#)$, and let $e(q^\#_j)$ denote the equivalence class of $N_{H(P_i)}$ that $q^\#_j$ represents. We define a map $\gamma : \text{RCH}(H^{-1} \circ H(P_i)) \to 2^{\text{RCH}(P'_{i'})}$ as follows:

$$\gamma(q^\#_j) = \big\{ q'_j \in Q' : \exists x \in e(q^\#_j) \ \text{s.t.} \ \delta'^*(q_{i'}, x) = q'_j \big\}. \tag{19}$$

We claim

$$\forall q^\#_j, q^\#_k \in Q^\#, \quad q^\#_j \neq q^\#_k \ \Rightarrow \ \gamma(q^\#_j) \cap \gamma(q^\#_k) = \varnothing. \tag{C2}$$

To verify C2, let $q'_\ell \in \gamma(q^\#_j) \cap \gamma(q^\#_k)$. Hence there exist $x_j \in e(q^\#_j)$ and $x_k \in e(q^\#_k)$ such that

$$q'_\ell = \delta'^*(q_{i'}, x_j) = \delta'^*(q_{i'}, x_k) \tag{20}$$

and, since $x_j$ and $x_k$ belong to distinct equivalence classes,

$$\neg\big( x_j \, N_{H(P_i)} \, x_k \big) \ \Rightarrow \ \exists u \in \Sigma^* \ \text{s.t.} \ \frac{H(P_i)(x_j u)}{H(P_i)(x_j)} \neq \frac{H(P_i)(x_k u)}{H(P_i)(x_k)}. \tag{21}$$

However, since $P'_{i'}$ perfectly encodes $H(P_i)$, for all $u = \sigma_1 \ldots \sigma_R \in \Sigma^*$,

$$\frac{H(P_i)(x_j u)}{H(P_i)(x_j)} = \tilde{\pi}'(q'_\ell, \sigma_1) \prod_{r=1}^{R-1} \tilde{\pi}'\big( \delta'^*(q'_\ell, \sigma_1 \ldots \sigma_r), \sigma_{r+1} \big) = \frac{H(P_i)(x_k u)}{H(P_i)(x_k)},$$

which contradicts equation (21). Next, we claim that

$$\forall q^\#_j \in \text{RCH}(H^{-1} \circ H(P_i)), \quad \textsc{card}\big( \gamma(q^\#_j) \big) = 1. \tag{C3}$$

To verify C3, suppose there exist $x_1, x_2 \in e(q^\#_j)$ such that

$$\delta'^*(q_{i'}, x_1) = q'_j, \quad \delta'^*(q_{i'}, x_2) = q'_k, \quad \text{with } q'_j \neq q'_k. \tag{22}$$

Then, since by C2 the sets $\gamma(q^\#)$ are pairwise disjoint subsets of $\text{RCH}(P'_{i'})$,

$$\textsc{card}\big( \gamma(q^\#_j) \big) > 1 \ \Longrightarrow \ \sum_{q^\#_k \in \text{RCH}(H^{-1} \circ H(P_i))} \textsc{card}\big( \gamma(q^\#_k) \big) > \textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big) \ \Longrightarrow \ \textsc{card}\big( \text{RCH}(P'_{i'}) \big) > \textsc{card}\big( \text{RCH}(H^{-1} \circ H(P_i)) \big),$$

which contradicts C1, thus proving C3.

On account of C2 and C3, let us define a bijective map $\tilde{\Lambda} : \text{RCH}(H^{-1} \circ H(P_i)) \to \text{RCH}(P'_{i'})$ as $\tilde{\Lambda}(q^\#_j) = \delta'^*(q_{i'}, x)$, $x \in e(q^\#_j)$. Then, for all $\sigma_k \in \Sigma$ and all $q^\#_j \in \text{RCH}(H^{-1} \circ H(P_i))$, with $x \in e(q^\#_j)$,

$$\tilde{\pi}^\#(q^\#_j, \sigma_k) = \frac{H(P_i)(x \sigma_k)}{H(P_i)(x)} = \tilde{\pi}'\big( \tilde{\Lambda}(q^\#_j), \sigma_k \big) \tag{23}$$

$$\tilde{\Lambda}\big( \delta^\#(q^\#_j, \sigma_k) \big) = \delta'^*(q_{i'}, x \sigma_k) = \delta'\big( \delta'^*(q_{i'}, x), \sigma_k \big) = \delta'\big( \tilde{\Lambda}(q^\#_j), \sigma_k \big), \tag{24}$$

which implies that $H^{-1} \circ H(P_i)$ and $P'_{i'}$ are isomorphic in the sense of Definition 13. This completes the proof. □
Theorem 6: For a PFSA $P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$, the function $\tilde{\pi} : Q \times \Sigma \to [0, 1]$ can be extended to $\tilde{\pi}^* : Q \times \Sigma^* \to [0, 1]$ as

$$\forall q_j \in Q, \quad \tilde{\pi}^*(q_j, \varepsilon) = 1; \qquad \forall \xi \in \Sigma^*, \ \forall \sigma \in \Sigma, \quad \tilde{\pi}^*(q_j, \xi\sigma) = \tilde{\pi}^*(q_j, \xi) \, \tilde{\pi}\big( \delta^*(q_j, \xi), \sigma \big). \tag{25}$$

Proof: Let $q = H(P_i)$. We note that $P_i$ perfectly encodes $q$ (see Lemma 1). It follows from Theorem 4 that

$$\forall q_j \in Q, \quad \tilde{\pi}^*(q_j, \varepsilon) = \frac{q(x\varepsilon)}{q(x)} = \frac{q(x)}{q(x)} = 1, \quad \text{where } \delta^*(q_i, x) = q_j.$$

Similarly, for a string $\xi\sigma$ initiating from state $q_j$, where $\xi \in \Sigma^*$ and $\sigma \in \Sigma$, we have

$$\frac{q(x\xi\sigma)}{q(x)} = \frac{q(x\xi)}{q(x)} \cdot \frac{q(x\xi\sigma)}{q(x\xi)}. \tag{26}$$

We note that $q(x\xi)/q(x) = \tilde{\pi}^*(q_j, \xi)$. Also, $\delta^*(q_i, x) = q_j$ implies $\delta^*(q_j, \xi) = \delta^*(q_i, x\xi)$. Therefore, $q(x\xi\sigma)/q(x\xi) = \tilde{\pi}\big( \delta^*(q_j, \xi), \sigma \big)$ and hence

$$\tilde{\pi}^*(q_j, \xi\sigma) = \tilde{\pi}^*(q_j, \xi) \, \tilde{\pi}\big( \delta^*(q_j, \xi), \sigma \big).$$

This completes the proof. □

Theorem 7: For a measure space $(\Sigma^\omega, \mathcal{B}_\Sigma, q)$ such that $N_q$ is of finite index,

$$H \circ H^{-1}(q) = q, \tag{27}$$

i.e., $H \circ H^{-1}$ is the identity map from $\mathscr{P}$ onto itself.

Proof: Let $H^{-1}(q) = P_i = (Q, \Sigma, \delta, q_i, \tilde{\pi})$. We note that $P_i$ perfectly encodes $q$ (see Lemma 1). Let $H(P_i) = q'$. We claim

$$\forall x \in \Sigma^*, \quad q(x) = q'(x).$$

The result is immediate for $|x| = 0$, i.e., $x = \varepsilon$. For $|x| \geq 1$, we proceed by the method of induction. For $|x| = 1$, we note that

$$\forall \sigma \in \Sigma, \quad q'(\sigma) = \tilde{\pi}(q_i, \sigma) = q(\sigma) \quad \text{(perfect encoding).}$$

Next, let us assume that $q'(x) = q(x)$ for all $x \in \Sigma^*$ with $|x| = r \in \mathbb{N}$. Then, for every such $x$ and every $\sigma \in \Sigma$, it follows that

$$q'(x\sigma) = q'(x) \, \tilde{\pi}(q_j, \sigma) = q(x) \, \tilde{\pi}(q_j, \sigma) = q(x\sigma), \quad \text{where } \delta^*(q_i, x) = q_j.$$

This completes the proof. □

4. Metrization of the space $\mathscr{P}$ of probability measures on $\mathcal{B}_\Sigma$

Metrization of $\mathscr{P}$ is important for differentiating physical processes modelled as dynamical systems evolving probabilistically on discrete state spaces of finite cardinality. In this section, we introduce two metric families, each of which captures a different aspect of such dynamical behaviour; they can be combined to form physically meaningful and useful metrics for system analysis and design.

Definition 14: Given two probability measures $q_1, q_2$ on the $\sigma$-algebra $\mathcal{B}_\Sigma$ and a parameter $s \in [1, \infty]$, the function $d_s : \mathscr{P} \times \mathscr{P} \to [0, \infty)$ is defined as follows:

$$d_s(q_1, q_2) = \sup_{x \in \Sigma^*} \left( \sum_{j=1}^{|\Sigma|} \left| \frac{q_1(x\sigma_j)}{q_1(x)} - \frac{q_2(x\sigma_j)}{q_2(x)} \right|^s \right)^{1/s}, \quad \forall s \in [1, \infty) \tag{28a}$$

$$d_\infty(q_1, q_2) = \sup_{x \in \Sigma^*} \ \max_{\sigma \in \Sigma} \left| \frac{q_1(x\sigma)}{q_1(x)} - \frac{q_2(x\sigma)}{q_2(x)} \right|. \tag{28b}$$

Theorem 8: The space $\mathscr{P}$ of all probability measures on $\mathcal{B}_\Sigma$ is $d_s$-metrizable for $s \in [1, \infty]$.

Proof: Strict positivity and symmetry properties of a metric follow directly from Definition 14. Validity of the remaining property of triangular inequality follows by application of the Minkowski inequality (Rudin 1988). □
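The recursion of equation (25) can be evaluated directly, and it agrees with the product of $\tilde{\pi}$ values along the $\delta$-path from the chosen state. A sketch, assuming a hypothetical two-state machine:

```python
# Sketch of the extension in equation (25): pi_star(q, eps) = 1 and
# pi_star(q, xi sigma) = pi_star(q, xi) * pi_tilde(delta_star(q, xi), sigma).
# The two-state machine is hypothetical.

SIGMA = ['0', '1']
delta = {('q1', '0'): 'q1', ('q1', '1'): 'q2',
         ('q2', '0'): 'q1', ('q2', '1'): 'q2'}
pi_tilde = {('q1', '0'): 0.7, ('q1', '1'): 0.3,
            ('q2', '0'): 0.4, ('q2', '1'): 0.6}

def delta_star(q, xi):
    """Reflexive and transitive closure of delta."""
    for s in xi:
        q = delta[(q, s)]
    return q

def pi_star(q, xi):
    """Equation (25), peeling off the last symbol of xi."""
    if xi == '':
        return 1.0
    head, last = xi[:-1], xi[-1]
    return pi_star(q, head) * pi_tilde[(delta_star(q, head), last)]

print(round(pi_star('q2', '011'), 6))  # 0.4 * 0.3 * 0.6 = 0.072
```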
Definition 15: Let $M$ be a right-invariant equivalence relation on $\Sigma^*$ and let the $i$th equivalence class of $M$ be denoted as $M_i$, $i \in I$, where $I$ is an arbitrary index set. Let $q$ be a probability measure on the $\sigma$-algebra $\mathcal{B}_\Sigma$ inducing the probabilistic Nerode equivalence $N_q$ on $\Sigma^*$, with the $j$th equivalence class of $N_q$ denoted as $N_q^j$, $j \in J$, where $J$ is an index set distinct from $I$. Then, the map $\Xi_M : \mathscr{P} \to [0, 1]^{\textsc{card}(I) \times \textsc{card}(J)}$ is defined as

$$\big( \Xi_M(q) \big)_{ij} = \sum_{x \in M_i \cap N_q^j} q(x).$$

Definition 16: Let $q_1, q_2$ be two probability measures on the $\sigma$-algebra $\mathcal{B}_\Sigma$. Then, the function $d_F : \mathscr{P} \times \mathscr{P} \to [0, \infty)$ is defined as follows:

$$d_F(q_1, q_2) = \big\| \Xi_{N_{q_2}}(q_1) - \Xi_{N_{q_2}}(q_2) \big\|_F, \tag{29}$$

where $\|\Theta\|_F = \sqrt{\operatorname{Trace}[\Theta^H \Theta]}$ is the Frobenius norm of the operator $\Theta$, and $\Theta^H$ is the Hermitian transpose of $\Theta$. Definition 16 implies that if $I$ and $J$ are the index sets corresponding to $N_{q_1}$ and $N_{q_2}$ respectively, then $\Xi_{N_{q_1}}(q_2) \in [0, 1]^{\textsc{card}(I) \times \textsc{card}(J)}$ and $\Xi_{N_{q_2}}(q_1) \in [0, 1]^{\textsc{card}(J) \times \textsc{card}(I)}$.
Theorem 9: The function $d_F$ is a pseudometric on the space $\mathscr{P}$ of probability measures.

Proof: The Frobenius norm on a probability space satisfies the metric properties except strict positivity, because of the almost-sure property of a probability measure. □
probability measure. Thus when comparing two probabilistic finite state machines, we need not concern
ourselves with whether the machines are represented in
their minimal realizations; the distance between two
non-minimal realizations of the same PFSA is always
zero. However this implies that ,s only qualifies as a
pseudo-metric on a.
Theorem 10: For 8 2 ½0, 1Þ and 8s 2 ½1, 1Þ, the para
metized function ,s ¼ dF þ ð1 Þds is a metric on p.
4.1 Explicit computation of the pseudometric m
for PFSA
Proof: Following Theorems 8 and 9, ds is a metric
for s 2 ½1, 1 and dF is a pseudometric on p.
Non-negativity, finiteness, symmetry and sub-additivity
of ,s follow from the respective properties of dF and ds.
Strict positivity of ,s on 2 ½0, 1Þ is established below
The pseudometric ,s is computed explicitly for pairs of
PFSA over the same alphabet. Before proceeding to the
general case, ,s is computed for the special case, where
the pair of PFSA have identical state sets, initial states
and transition maps.
,s ðq1 , q2 Þ ¼ 0 ) ð1 Þds ðq1 , q2 Þ ¼ 0 ) q1 ¼ q2 :
ð30Þ
Lemma 2: Given two PFSA P1i ¼ ðQ, , , qi , ~ 1 Þ,
P2i ¼ ðQ, , , qi , ~ 2 Þ, and 2 , the steps for computation
of 0, s ðP1i , P2i Þ are
œ
Remark 5: If two physical processes are modelled as
discrete-event dynamical systems, then the respective
probabilistic language generators can be associated with
probability measures q1 and q2 . The metric ds ðq1 , q2 Þ is
related to the production of single symbols as arbitrary
strings and hence captures the difference in short term
dynamic evolution. In contrast, the pseudometric dF is
related to generation of all possible strings and therefore
captures the difference in long term behavior of the
physical processes. The metric ,s thus captures
the both short-term and long-term behaviour with
respective relative weights of 1 and .
Definition 17: The metric ,s on p for 2 ½0, 1Þ,
s 2 ½1, 1, induces a function ,s on a a as follows:
8Pi , P0i0 2 a, ,s ðPi , P0i0 Þ ¼ ,s ðHðPi Þ, HðP0i0 ÞÞ:
ð31Þ
Corollary 2 (to Theorem 10): The function ,s in
Definition 17 for 2 ½0, 1Þ and s 2 ½1, 1 is a pseudometric
on a. Specifically, the following condition holds:
,s ðPi , H1 HðPi ÞÞ ¼ 0:
ð32Þ
Proof: Following Theorem 7,
,s ðPi , H1 HðPi ÞÞ ¼ ,s ðHðPi Þ, H H1 HðPi ÞÞ
¼ ,s ðHðPi Þ, ðH H1 Þ HðPi ÞÞ
¼ ,s ðHðPi Þ, HðPi ÞÞ ¼ 0:
œ
Remark 6: Corollary 2 can be physically interpreted
to imply that the metric family ,s does not
differentiate between different realizations of the same
Set ðqj Þ ¼ ~ 1 ðqj , Þ ~ 2 ðqj , Þ
Then, 0, s ðP1i , P2i Þ ¼ max
ðqj Þ
s :
qj 2 Q
P
Proof: Let H(Pᵢ)(xσ)/H(Pᵢ)(x) denote a |Σ|-dimensional vector-valued function of x, with one component for each σ ∈ Σ. For proof of the lemma, it suffices to show that the following relation holds:

  sup_{x ∈ Σ*} d_s( H(P¹ᵢ)(xσ)/H(P¹ᵢ)(x), H(P²ᵢ)(xσ)/H(P²ᵢ)(x) ) = max_{qⱼ ∈ Q} ‖Δ(qⱼ)‖_s.

Since P¹ᵢ perfectly encodes H(P¹ᵢ), it follows that (∀x, y ∈ Σ*, δ*(qᵢ, x) = δ*(qᵢ, y)) implies

  H(P¹ᵢ)(xσₖ)/H(P¹ᵢ)(x) = H(P¹ᵢ)(yσₖ)/H(P¹ᵢ)(y) = π̃¹(qⱼ, σₖ),

where δ*(qᵢ, x) = qⱼ. A similar argument holds for H(P²ᵢ). Hence, it follows that for computing Θ_{0,s}(P¹ᵢ, P²ᵢ), only one string needs to be considered for each state qⱼ ∈ Q. That is,

  Θ_{0,s}(P¹ᵢ, P²ᵢ) = max_{x: δ*(qᵢ,x)=qⱼ} ‖ H(P¹ᵢ)(xσₖ)/H(P¹ᵢ)(x) − H(P²ᵢ)(xσₖ)/H(P²ᵢ)(x) ‖_s
                    = max_{qⱼ ∈ Q} ‖ π̃¹(qⱼ, ·) − π̃²(qⱼ, ·) ‖_s
                    = max_{qⱼ ∈ Q} ‖Δ(qⱼ)‖_s.  □
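The computation in Lemma 2 reduces to a row-wise norm of the difference of the two probability maps. The following is a minimal sketch in Python; the row-list encoding and the function name are illustrative and not from the paper:

```python
from math import inf

def theta_0(pi1, pi2, s=inf):
    """Theta_{0,s} for two PFSA sharing state set, initial state and
    transition map: the maximum over states of the s-norm of the
    difference between the rows of the probability maps."""
    def s_norm(v):
        if s == inf:
            return max(abs(x) for x in v)          # sup-norm
        return sum(abs(x) ** s for x in v) ** (1.0 / s)
    # row j of pi1/pi2 lists pi~(q_j, sigma) for each sigma in Sigma
    return max(s_norm([a - b for a, b in zip(r1, r2)])
               for r1, r2 in zip(pi1, pi2))
```

For instance, comparing a row (0.8, 0.2) with (0.1, 0.9) yields a sup-norm difference of 0.7, and the lemma takes the maximum of such values over all states.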
Lemma 3: Let ℘¹, ℘² be the stable probability distributions for the PFSA P¹ᵢ = (Q, Σ, δ, qᵢ, π̃¹) and P²ᵢ = (Q, Σ, δ, qᵢ, π̃²), respectively. Then

  lim_{v→1} Θ_{v,s}(P¹ᵢ, P²ᵢ) = d₂(℘¹, ℘²).

Proof: Since P¹ᵢ and P²ᵢ have the same initial state and state transition maps,

  N^j_{H(P¹ᵢ)} ∩ N^k_{H(P²ᵢ)} = ∅  if j ≠ k,   N^j_{H(P¹ᵢ)} = N^j_{H(P²ᵢ)}  otherwise,

where N^j_{H(P¹ᵢ)} and N^k_{H(P²ᵢ)} are the jth and kth equivalence classes (i.e., states qⱼ and qₖ) for P¹ᵢ and P²ᵢ, respectively. The result follows from Definition 16 and Corollary 1, noting that

  H(P¹ᵢ)( N^k_{H(P²ᵢ)} ) = { ℘¹ⱼ = Σ_{x: δ*(qᵢ,x)=qⱼ} ϑ_q(x),  if j = k;  0,  otherwise. }  □

Theorem 11: Given two PFSA P¹ᵢ = (Q, Σ, δ, qᵢ, π̃¹) and P²ᵢ = (Q, Σ, δ, qᵢ, π̃²), the pseudometric Θ_{v,s}(P¹ᵢ, P²ᵢ) can be computed explicitly for v ∈ [0,1) and s ∈ [1,∞] as

  Θ_{v,s}(P¹ᵢ, P²ᵢ) = v lim_{v′→1} Θ_{v′,s}(P¹ᵢ, P²ᵢ) + (1−v) Θ_{0,s}(P¹ᵢ, P²ᵢ).   (33)

Proof: The result follows from Theorem 10 and Corollary 2. □
The algorithm for computation of the pseudometric is presented below.

Algorithm 2: Computation of Θ_{v,s}(Pᵢ, P′_{i′})
input: Pᵢ, P′_{i′}, v, s;  output: Θ_{v,s}(Pᵢ, P′_{i′})
1  begin
2    Compute P¹² = (Q, Σ, δ, q, π̃¹²) = Pᵢ ⊗ P′_{i′};
3    Compute P²¹ = (Q, Σ, δ, q, π̃²¹) = P′_{i′} ⊗ Pᵢ;
4    for j = 1 to CARD(Q) do
5      Δ(j) = ‖ π̃¹²(qⱼ, ·) − π̃²¹(qⱼ, ·) ‖_s;
6    endfor
7    Θ_{0,s}(Pᵢ, P′_{i′}) = maxⱼ Δ(j);
8    Compute ℘¹²;   /* state prob. for P¹² (Def. 2.3) */
9    Compute ℘²¹;   /* state prob. for P²¹ (Def. 2.3) */
10   Compute d = ‖ ℘¹² − ℘²¹ ‖₂;
11   Compute Θ_{v,s}(Pᵢ, P′_{i′}) = v·d + (1−v)·Θ_{0,s}(Pᵢ, P′_{i′});
12 end

To extend the approach presented in Lemma 2 to arbitrary pairs of PFSA, we need to define the synchronous composition of a pair of PFSA.
Definition 18: The binary operation of probabilistic synchronous composition of PFSA, denoted ⊗ : 𝒜 × 𝒜 → 𝒜, is defined as follows. Let

  Pᵢ = (Q, Σ, δ, qᵢ, π̃),   G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃′).

Then, Pᵢ ⊗ G_{i′} = (Q × Q′, Σ, δ̂, (qᵢ, q′_{i′}), π̂), where

  δ̂((qⱼ, q′_{k′}), σ) = ( δ(qⱼ, σ), δ′(q′_{k′}, σ) )  if δ(qⱼ, σ) and δ′(q′_{k′}, σ) are defined, and is undefined otherwise;   (34)

  π̂((qⱼ, q′_{k′}), σ) = π̃(qⱼ, σ).   (35)

Remark 7: Synchronous composition for PFSA is not commutative, i.e., for an arbitrary pair Pᵢ and G_{i′},

  Pᵢ ⊗ G_{i′} ≠ G_{i′} ⊗ Pᵢ.

Synchronous composition of PFSA is, however, associative, i.e.,

  ∀P¹_{i₁}, P²_{i₂}, P³_{i₃},  (P¹_{i₁} ⊗ P²_{i₂}) ⊗ P³_{i₃} = P¹_{i₁} ⊗ (P²_{i₂} ⊗ P³_{i₃}).   (36)
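Definition 18 admits a direct implementation as a reachable product construction. The sketch below assumes PFSA encoded as nested dictionaries (state → symbol → successor, and state → symbol → probability); the encodings and names are illustrative, not from the paper:

```python
def sync_compose(d1, pi1, q1, d2, pi2, q2):
    """Probabilistic synchronous composition (sketch of Definition 18).

    d1, d2  : transition maps {state: {symbol: next_state}}
    pi1, pi2: probability maps {state: {symbol: prob}}
    q1, q2  : initial states
    Returns (delta_hat, pi_hat, initial) over reachable product states;
    per eq. (35), pi_hat carries the probabilities of the FIRST argument.
    """
    delta_hat, pi_hat = {}, {}
    frontier, seen = [(q1, q2)], {(q1, q2)}
    while frontier:
        s, t = frontier.pop()
        delta_hat[(s, t)], pi_hat[(s, t)] = {}, {}
        for sigma, s_next in d1[s].items():
            if sigma in d2[t]:          # defined in both components
                nxt = (s_next, d2[t][sigma])
                delta_hat[(s, t)][sigma] = nxt
                pi_hat[(s, t)][sigma] = pi1[s][sigma]
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return delta_hat, pi_hat, (q1, q2)
```

Non-commutativity is visible directly in the sketch: swapping the arguments swaps which machine supplies the probability map.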
Theorem 11: For a pair of PFSA Pᵢ and G_{i′} over the same alphabet,

  H(Pᵢ) = H(Pᵢ ⊗ G_{i′}).   (37)

Proof: Let ϑ = H(Pᵢ) and ϑ′ = H(Pᵢ ⊗ G_{i′}). It suffices to show that

  ∀x ∈ Σ*,  ϑ(x) = ϑ′(x).

For |x| = 0, i.e., x = ε, the result is immediate. For |x| ≥ 1, we use the method of induction. Since Pᵢ perfectly encodes H(Pᵢ),

  ∀σ ∈ Σ,  ϑ(σ) = π̃(qᵢ, σ) = π̂((qᵢ, q′_{i′}), σ) = ϑ′(σ).

Hence the relation is true for |x| ≤ 1. With the induction hypothesis

  ∀x ∈ Σ* s.t. |x| = r ∈ ℕ,  ϑ(x) = ϑ′(x),   (38)

we proceed with an arbitrary σ ∈ Σ to yield

  ϑ(xσ) = ϑ(x) π̃(qⱼ, σ),  where δ*(qᵢ, x) = qⱼ
        = ϑ′(x) π̂((qⱼ, q′_{j′}), σ),  where δ̂*((qᵢ, q′_{i′}), x) = (qⱼ, q′_{j′})
        = ϑ′(xσ).

This completes the proof. □
Theorem 12: Given a pair of PFSA Pᵢ, P′_{i′} and arbitrary parameters v ∈ [0,1) and s ∈ [1,∞], Algorithm 2 computes Θ_{v,s}(Pᵢ, P′_{i′}).

Proof: By Theorem 11, ∀v ∈ [0,1), s ∈ [1,∞],

  Θ_{v,s}(Pᵢ, P′_{i′}) = Θ_{v,s}(Pᵢ ⊗ P′_{i′}, P′_{i′} ⊗ Pᵢ).   (39)

Since Pᵢ ⊗ P′_{i′} and P′_{i′} ⊗ Pᵢ have the same state sets, initial states and transition maps (see Definition 18), correctness of Algorithm 2 follows from Lemmas 2 and 3. □
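Following Theorem 12, the core of Algorithm 2 operates on the two synchronous compositions, which share transition structure and initial state and differ only in their probability maps. A sketch of lines 4–11 in Python, computing the stable distribution by plain power iteration (this assumes the underlying chain converges; the paper's Definition 2.3 is not reproduced here, and all encodings are illustrative):

```python
def stationary(delta, pi, init, iters=2000):
    """Stable state probability vector by power iteration (sketch);
    assumes the underlying Markov chain converges."""
    states = sorted(delta)
    idx = {q: i for i, q in enumerate(states)}
    p = [0.0] * len(states)
    p[idx[init]] = 1.0
    for _ in range(iters):
        nxt = [0.0] * len(states)
        for q in states:
            for sigma, q2 in delta[q].items():
                nxt[idx[q2]] += p[idx[q]] * pi[q][sigma]
        p = nxt
    return p

def theta_vs(delta, pi12, pi21, init, v=0.5, s=float('inf')):
    """Lines 4-11 of Algorithm 2: delta and init are shared by the two
    synchronous compositions; only the probability maps differ."""
    def s_norm(vec):
        if s == float('inf'):
            return max(abs(x) for x in vec)
        return sum(abs(x) ** s for x in vec) ** (1.0 / s)
    theta0 = max(s_norm([pi12[q][a] - pi21[q][a] for a in pi12[q]])
                 for q in delta)
    p12 = stationary(delta, pi12, init)
    p21 = stationary(delta, pi21, init)
    d = sum((a - b) ** 2 for a, b in zip(p12, p21)) ** 0.5
    return v * d + (1 - v) * theta0
```

As expected from the pseudometric property, feeding the same probability map on both sides yields a distance of zero.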
Example 1: The theoretical results of §4 are illustrated with a numerical example. The following PFSA are considered:

  P¹_{q₁} = ({q₁, q₂, q₃}, {0, 1}, δ₁, q₁, π̃¹)   (40)
  P²_{qA} = ({qA, qB}, {0, 1}, δ₂, qA, π̃²)   (41)

as shown in figures 6 and 7, respectively, and figure 8 illustrates the computed compositions P¹_{q₁} ⊗ P²_{qA} (above) and P²_{qA} ⊗ P¹_{q₁} (below). Following Algorithm 2, we have

  Δ_s = [ ‖(0.4−0.9, 0.6−0.1)‖_s
          ‖(0.2−0.9, 0.8−0.1)‖_s
          ‖(0.3−0.9, 0.7−0.1)‖_s
          ‖(0.3−0.7, 0.7−0.3)‖_s
          ‖(0.4−0.7, 0.6−0.3)‖_s
          ‖(0.2−0.7, 0.8−0.3)‖_s ].   (42)
As an illustration, we set s = ∞. Hence,

  Δ_∞ = [0.5  0.7  0.6  0.4  0.3  0.5]ᵀ   (43)
  ⟹ Θ_{0,∞}(P¹_{q₁}, P²_{qA}) = max(Δ_∞) = 0.7.   (44)

The final state probabilities are computed to be

  ℘_{P¹⊗P²} = [0.15  0.07  0.09  0.2  0.22  0.27]   (45)
  ℘_{P²⊗P¹} = [0.29  0.29  0.29  0.043  0.043  0.043].   (46)

For v = 0.5, we have

  Θ_{0.5,∞}(P¹_{q₁}, P²_{qA}) = 0.5 d₂(℘_{P¹⊗P²}, ℘_{P²⊗P¹}) + 0.5 × 0.7 = 0.4599.

The pseudonorm Θ_{0,∞}(P¹_{q₁}, P²_{qA}) = 0.7 is interpreted as follows. There exists a string x ∈ Σ* and an event σ ∈ Σ such that the probability of occurrence of σ, given that x has already occurred, is 70% more in one system compared to the other. Also, the occurrence probability of any event, given that an arbitrary string has already occurred, differs by no more than 70% between the two systems.

The composition P¹_{q₁} ⊗ P²_{qA}, shown in the upper part of figure 8, is an encoding of the measure H(P¹_{q₁}) and hence is a non-minimal realization of P¹_{q₁}, while the composition P²_{qA} ⊗ P¹_{q₁}, shown in the lower part of figure 8, encodes H(P²_{qA}) and therefore is a non-minimal realization of P²_{qA}. Although the structures of the two compositions are identical in a graph-theoretic sense (i.e., there is a graph isomorphism between the compositions), they represent very different probability distributions on B_{Σ*}.
5. Model order reduction for PFSA

This section investigates the possibility of encoding an arbitrary probability distribution on B_{Σ*} by a PFSA with a pre-specified graph structure. As expected, such encodings will not always be perfect. However, we will show that the incurred error can be rigorously computed, which makes the approach useful for very close approximation of large PFSA models by smaller ones.

Definition 19: The binary operation of projective composition ⊗̃ : 𝒜 × 𝒜 → 𝒜 is defined as follows. Let

  Pᵢ = (Q, Σ, δ, qᵢ, π̃),
  G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃′),
  G_{i′} ⊗ Pᵢ = (Q′ × Q, Σ, δ̂, (q′_{i′}, qᵢ), π̂).
Figure 6. PFSA P¹.
Figure 7. PFSA P².
Figure 8. P¹ ⊗ P² (above) and P² ⊗ P¹ (below).
For notational simplicity, set ∀qⱼ ∈ Q and ∀q′_{k′} ∈ Q′,

  #(q′_{k′}, qⱼ) = Σ_{x: δ̂*((q′_{i′}, qᵢ), x) = (q′_{k′}, qⱼ)} H(G_{i′})(x).

Then, G_{i′} ⊗̃ Pᵢ = (Q, Σ, δ, qᵢ, π̃ᶻ) such that

  π̃ᶻ(qⱼ, σ) = [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) π̂((q′_{k′}, qⱼ), σ) ] / [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) ],   (47)

where π̂ is the probability map of the synchronous composition G_{i′} ⊗ Pᵢ.

Theorem 13: For PFSA Pᵢ, Gⱼ and Hₖ over the same alphabet,

  (1) Pᵢ ⊗̃ (Gⱼ ⊗̃ Hₖ) = Pᵢ ⊗̃ Hₖ;
  (2) (Pᵢ ⊗̃ Gⱼ) ⊗̃ Hₖ ≠ Pᵢ ⊗̃ (Gⱼ ⊗̃ Hₖ)   (non-associative);
  (3) Pᵢ ⊗̃ Gⱼ ≠ Gⱼ ⊗̃ Pᵢ   (non-commutative).

Proof: The results follow from Definition 19. □

We justify the nomenclature 'projective' composition in the following theorem.

Theorem 14: For arbitrary PFSA Pᵢ and G_{i′} over the same alphabet,

  (G_{i′} ⊗̃ Pᵢ) ⊗̃ Pᵢ = G_{i′} ⊗̃ Pᵢ.   (48)

Proof: Let Pᵢ = (Q, Σ, δ, qᵢ, π̃). Definition 19 implies that G_{i′} ⊗̃ Pᵢ = (Q, Σ, δ, qᵢ, π̃ᶻ), for π̃ᶻ computed as specified in equation (47). It further follows from Definition 19 that (G_{i′} ⊗̃ Pᵢ) ⊗̃ Pᵢ = (Q, Σ, δ, qᵢ, π̃ᶻᶻ), i.e., (G_{i′} ⊗̃ Pᵢ) ⊗̃ Pᵢ and G_{i′} ⊗̃ Pᵢ have the same state set, initial state and state transition maps. Thus, it suffices to show that

  ∀qⱼ ∈ Q, ∀σ ∈ Σ,  π̃ᶻᶻ(qⱼ, σ) = π̃ᶻ(qⱼ, σ).   (49)

Considering the probabilistic synchronous composition (G_{i′} ⊗̃ Pᵢ) ⊗ Pᵢ = (Q × Q, Σ, δ̂, (qᵢ, qᵢ), π̂) (see Definition 18), we note that

  ∀x ∈ Σ*,  δ̂*((qᵢ, qᵢ), x) = (qⱼ, qⱼ)  for some qⱼ ∈ Q.

It follows that, for qₖ ≠ qⱼ,

  #(qₖ, qⱼ) = Σ_{x: δ̂*((qᵢ, qᵢ), x) = (qₖ, qⱼ)} H(G_{i′} ⊗̃ Pᵢ)(x) = 0.   (50)

Finally, we conclude that ∀qⱼ ∈ Q, ∀σ ∈ Σ,

  π̃ᶻᶻ(qⱼ, σ) = [ Σ_{qₖ ∈ Q} #(qₖ, qⱼ) π̂((qₖ, qⱼ), σ) ] / [ Σ_{qₖ ∈ Q} #(qₖ, qⱼ) ]   (51)
             = [ #(qⱼ, qⱼ) π̂((qⱼ, qⱼ), σ) ] / #(qⱼ, qⱼ)
             = π̂((qⱼ, qⱼ), σ)   (see Definition 18)   (52)
             = π̃ᶻ(qⱼ, σ).

This completes the proof. □

Definition 20 (projected distribution): The projected distribution ℘ ∈ [0,1]^{NUMSTATES(Pᵢ)} of an arbitrary PFSA G_{i′} with respect to a given PFSA Pᵢ is defined by the map ⟦·⟧_{Pᵢ} : 𝒜 → [0,1]^{NUMSTATES(Pᵢ)} as follows:

  ⟦G_{i′}⟧_{Pᵢ} = ℘ ∈ [0,1]^{NUMSTATES(Pᵢ)},

such that if Nⱼ is the jth equivalence class (i.e., the jth state) of Pᵢ, then

  Σ_{x ∈ Nⱼ} H(G_{i′})(x) = ℘ⱼ.

Projective composition preserves the projected distribution, as established next.

Theorem 15 (Projected Distribution Invariance): For two arbitrary PFSA Pᵢ and G_{i′} over the same alphabet,

  ⟦G_{i′}⟧_{Pᵢ} = ⟦G_{i′} ⊗̃ Pᵢ⟧_{Pᵢ}.

Proof: Let Pᵢ = (Q, Σ, δ, qᵢ, π̃) and G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃′). It follows that G_{i′} ⊗̃ Pᵢ = (Q, Σ, δ, qᵢ, π̃ᶻ), where π̃ᶻ is as computed in Definition 19. Using the same notation as in Definition 19, we have ∀σ ∈ Σ,

  Σ_{x: δ*(qᵢ, x) = qⱼ} H(G_{i′})(xσ)
    = Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) π̃′(q′_{k′}, σ)
    = { Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) } · [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) π̃′(q′_{k′}, σ) ] / [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) ]
    = { Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) } π̃ᶻ(qⱼ, σ).   (53)

We note that ⟦G_{i′}⟧_{Pᵢ} is a probability vector, i.e.,

  Σ_{j=1}^{NUMSTATES} ⟦G_{i′}⟧_{Pᵢ}|ⱼ = Σ_{x ∈ Σ*} H(G_{i′})(x) = 1.

Since ⟦G_{i′}⟧_{Pᵢ}|ⱼ = Σ_{x: δ*(qᵢ, x) = qⱼ} H(G_{i′})(x) = Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ), it follows that ∀σ ∈ Σ,

  [ Σ_{x: δ*(qᵢ, x) = qⱼ} H(G_{i′})(xσ) ] / ⟦G_{i′}⟧_{Pᵢ}|ⱼ = π̃ᶻ(qⱼ, σ)

  ⟹ Σ_{σ: δ(qⱼ, σ) = q_ℓ} [ Σ_{x: δ*(qᵢ, x) = qⱼ} H(G_{i′})(xσ) ] / ⟦G_{i′}⟧_{Pᵢ}|ⱼ = Σ_{σ: δ(qⱼ, σ) = q_ℓ} π̃ᶻ(qⱼ, σ)

  ⟹ ( 1 / ⟦G_{i′}⟧_{Pᵢ}|ⱼ ) Σ H(G_{i′})(xσ_{jℓ}) = Π̃(qⱼ, q_ℓ),   (54)
where σ_{jℓ} ∈ Σ_{jℓ} ⊆ Σ such that σ ∈ Σ_{jℓ} ⟹ δ(qⱼ, σ) = q_ℓ, and Π̃(qⱼ, q_ℓ) is the jℓth element of the stochastic state transition matrix Π̃ corresponding to the PFSA G_{i′} ⊗̃ Pᵢ. It follows from (54) that

  Σ_{qⱼ ∈ Q} Σ H(G_{i′})(xσ_{jℓ}) = Σ_{qⱼ ∈ Q} ⟦G_{i′}⟧_{Pᵢ}|ⱼ Π̃(qⱼ, q_ℓ)
  ⟹ ⟦G_{i′}⟧_{Pᵢ}|_ℓ = Σ_{qⱼ ∈ Q} ⟦G_{i′}⟧_{Pᵢ}|ⱼ Π̃(qⱼ, q_ℓ).   (55)

It follows that ⟦G_{i′}⟧_{Pᵢ} satisfies the vector equation

  ⟦G_{i′}⟧_{Pᵢ} = ⟦G_{i′}⟧_{Pᵢ} Π̃.   (56)

We note that ⟦G_{i′} ⊗̃ Pᵢ⟧_{Pᵢ} is the stable probability distribution of the PFSA G_{i′} ⊗̃ Pᵢ and hence we have

  ⟦G_{i′} ⊗̃ Pᵢ⟧_{Pᵢ} = ⟦G_{i′} ⊗̃ Pᵢ⟧_{Pᵢ} Π̃.   (57)

In general, a stochastic matrix may have more than one eigenvector corresponding to unity eigenvalue (Bapat and Raghavan 1997). However, as per our definition of PFSA (see Definition 2), the initial state is explicitly specified. It follows that the right-hand side of (53) assumes that all strings begin from the same state qᵢ ∈ Q. Hence:

  ⟦G_{i′}⟧_{Pᵢ} = ⟦G_{i′} ⊗̃ Pᵢ⟧_{Pᵢ}.   (58)

This completes the proof. □

The idea is further clarified in the commutative diagram of figure 9.

Probabilistic synchronous composition is an exact representation with no loss of statistical information, but the model order increases due to the product automaton construction. On the other hand, the projective composition has the same number of states as the second argument in (·) ⊗̃ (·). Both representations have exactly the same projected distribution with respect to a fixed second argument, thus making ⊗̃ an extremely useful tool for model order reduction. Algorithm 3 computes the projective composition of two arbitrary PFSA.

Algorithm 3: Computation of Projective Composition
input: Pᵢ = (Q, Σ, δ, qᵢ, π̃), G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃′)
output: Pᵢ ⊗̃ G_{i′}
1  begin
2    Compute Pᵢ ⊗ G_{i′} = (Q × Q′, Σ, δ̂, (qᵢ, q′_{i′}), π̂);   /* see Definition 18 */
3    Compute ℘;   /* state prob. for Pᵢ ⊗ G_{i′} (Def. 2.3) */
4    Set up matrix T s.t. Tⱼₖ = ℘((qⱼ, q′ₖ));
5    Compute π̃ᶻ(q′ₖ, σ) = [ Σⱼ Tⱼₖ π̂((qⱼ, q′ₖ), σ) ] / [ Σⱼ Tⱼₖ ];
6    return Pᵢ ⊗̃ G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃ᶻ)
7  end
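Line 5 of Algorithm 3 is a stationary-probability-weighted average of the product machine's probability rows. A sketch of that step in Python, with dictionary encodings that are illustrative rather than from the paper:

```python
def project(pi_hat, stat, onto):
    """Line 5 of Algorithm 3 (sketch): collapse a synchronous
    composition onto one component of its product states by a
    stationary-probability-weighted average of probability rows.

    pi_hat: {(q, q'): {symbol: prob}} for the product machine
    stat  : {(q, q'): stable probability}  (Def. 2.3)
    onto  : index (0 or 1) of the component state to keep
    """
    weights, acc = {}, {}
    for pair, probs in pi_hat.items():
        k = pair[onto]
        weights[k] = weights.get(k, 0.0) + stat[pair]
        row = acc.setdefault(k, {})
        for sigma, p in probs.items():
            row[sigma] = row.get(sigma, 0.0) + stat[pair] * p
    return {k: {sigma: p / weights[k] for sigma, p in row.items()}
            for k, row in acc.items()}
```

With numbers of the kind appearing in Example 2, a product row pair carrying probabilities (0.9, 0.1) and (0.7, 0.3) with weights ≈0.2917 and ≈0.0417 projects to a symbol-1 probability of ≈0.875, matching the π̃²¹ rows reported there.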
5.1 Physical significance of projected distribution invariance

Given a symbolic language-theoretic PFSA model for a physical system of interest, one is often concerned with only a certain class of possible future evolutions. For example, in the paradigm of deterministic finite state automata (DFSA) (Ramadge and Wonham 1987), the control requirements are expressed in the form of a specification language or a specification automaton. In that setting, it is critical to determine which state of the specification automaton the system is currently visiting. In contrast, for a PFSA, the issue is the probability of a certain class of future evolutions. For example, given a large-order model of a physical system, it might be necessary to work with a much smaller-order PFSA that has the same long-term behaviour with respect to a specified set of event strings. Although projective composition may incur a representation error in general, the long-term distribution over the states of the projected model is preserved, as shown in Theorem 15.

5.2 Incurred error in projective composition

Given any two PFSA Pᵢ and G_{i′}, the incurred error in the projective composition operation Pᵢ ⊗̃ G_{i′} is quantified in the pseudometric defined in §4 as

  Θ_{v,s}(Pᵢ, Pᵢ ⊗̃ G_{i′}).   (59)

Next, we establish a sufficient condition for guaranteeing zero incurred error in projective composition.
Theorem 16: For arbitrary PFSA Pᵢ = (Q, Σ, δ, qᵢ, π̃) and G_{i′} = (Q′, Σ, δ′, q′_{i′}, π̃′) with corresponding probabilistic Nerode equivalence relations N and N′, we have

  N ≤ N′ ⟹ Θ_{v,s}(G_{i′}, G_{i′} ⊗̃ Pᵢ) = 0.

Proof: N ≤ N′ implies that there exists a possibly non-injective map f : Q → Q′ such that

  ∀x ∈ Σ*,  δ*(qᵢ, x) = qⱼ ∈ Q ⟹ δ′*(q′_{i′}, x) = f(qⱼ) ∈ Q′.

It then follows from Definition 19 that

  #(q′_{k′}, qⱼ) = 0  if f(qⱼ) ≠ q′_{k′}.
Denoting G_{i′} ⊗ Pᵢ = (Q′ × Q, Σ, δ̂, (q′_{i′}, qᵢ), π̂) and G_{i′} ⊗̃ Pᵢ = (Q, Σ, δ, qᵢ, π̃ᶻ), we have from Definition 19 that

  π̃ᶻ(qⱼ, σ) = [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) π̂((q′_{k′}, qⱼ), σ) ] / [ Σ_{q′_{k′} ∈ Q′} #(q′_{k′}, qⱼ) ]
            = π̂((f(qⱼ), qⱼ), σ) = π̃′(f(qⱼ), σ),

where the last step follows from Definition 18. The proof is completed by noting that ∀x ∈ Σ*,

  H(G_{i′})(x) = π̃′(q′_{i′}, x) = π̂((f(qᵢ), qᵢ), x) = π̃ᶻ(qᵢ, x) = H(G_{i′} ⊗̃ Pᵢ)(x),

where π̃(q, x) denotes the product of the transition probabilities along the string x starting from state q. □
Example 2: The results of §5 are illustrated considering the PFSA models described in Example 1. Given the PFSA models P¹_{q₁} = ({q₁, q₂, q₃}, Σ, δ₁, q₁, π̃¹) and P²_{qA} = ({qA, qB}, Σ, δ₂, qA, π̃²) (see (40) and (41)), we compute the projected compositions P¹_{q₁} ⊗̃ P²_{qA} = ({qA, qB}, Σ, δ₂, qA, π̃¹²) and P²_{qA} ⊗̃ P¹_{q₁} = ({q₁, q₂, q₃}, Σ, δ₁, q₁, π̃²¹). The synchronous compositions P¹_{q₁} ⊗ P²_{qA} and P²_{qA} ⊗ P¹_{q₁} were computed in Example 1 and are shown in figure 8. Denoting the associated stochastic transition matrices for P¹_{q₁} ⊗ P²_{qA} and P²_{qA} ⊗ P¹_{q₁} as Π¹² and Π²¹, respectively, the nonzero entries of the rows, indexed by the product states, are:

  Π¹²:  (q₁,qA): {1 ↦ 0.2, 0 ↦ 0.8};  (q₂,qA): {1 ↦ 0.3, 0 ↦ 0.7};  (q₃,qA): {1 ↦ 0.4, 0 ↦ 0.6};
        (q₁,qB): {1 ↦ 0.2, 0 ↦ 0.8};  (q₂,qB): {1 ↦ 0.3, 0 ↦ 0.7};  (q₃,qB): {1 ↦ 0.4, 0 ↦ 0.6}
  Π²¹:  (q₁,qA): {1 ↦ 0.9, 0 ↦ 0.1};  (q₂,qA): {1 ↦ 0.9, 0 ↦ 0.1};  (q₃,qA): {1 ↦ 0.9, 0 ↦ 0.1};
        (q₁,qB): {1 ↦ 0.7, 0 ↦ 0.3};  (q₂,qB): {1 ↦ 0.7, 0 ↦ 0.3};  (q₃,qB): {1 ↦ 0.7, 0 ↦ 0.3}

The stable probability distributions ℘¹² and ℘²¹ are computed to be

  ℘¹² = [0.1458  0.0695  0.0864  0.2017  0.2186  0.2780]   (60a)
  ℘²¹ = [0.2917  0.2917  0.2917  0.0417  0.0417  0.0417].   (60b)

Using Algorithm 3, we compute the event generating functions π̃¹² and π̃²¹ as

  π̃¹² = [ 0.7197  0.2803
          0.6891  0.3109 ],   (61)

  π̃²¹ = [ 0.1250  0.8750
          0.1250  0.8750
          0.1250  0.8750 ],   (62)

where the columns correspond to the symbols 0 and 1, respectively. We note that the stable distributions for P¹_{q₁} ⊗̃ P²_{qA} and P²_{qA} ⊗̃ P¹_{q₁} are given by

  ℘̃¹² = [0.3017  0.6983],   ℘̃²¹ = [0.3333  0.3333  0.3333].

The operations are illustrated in figures 10 and 11, and invariance of the projected distribution is checked as follows:

  ℘¹²(1) + ℘¹²(2) + ℘¹²(3) = 0.3017 = ℘̃¹²(1)   (63a)
  ℘¹²(4) + ℘¹²(5) + ℘¹²(6) = 0.6983 = ℘̃¹²(2)   (63b)
  ℘²¹(2) + ℘²¹(5) = 0.333 = ℘̃²¹(2)   (63c)
  ℘²¹(3) + ℘²¹(6) = 0.333 = ℘̃²¹(3).   (63d)

Figure 9. Commutative diagram relating probabilistic synchronous composition, projective composition and the original projected distribution.
Figure 10. P¹_{q₁} projectively composed with P²_{qA}.
Figure 11. P²_{qA} projectively composed with P¹_{q₁}.

6. An engineering application of pattern recognition

Projective composition is applied to a symbolic pattern identification problem. Continuous-valued data from a laser ranging array in a sensor fusion test bed are fed to a symbolic model reconstruction algorithm (CSSR)
(Shalizi and Shalizi 2004) to yield probabilistic finite state models over a four-letter alphabet. A maximum entropy partitioning scheme (Rajagopalan and Ray 2006) is employed to create the symbolic alphabet on the continuous time series. Figure 12 depicts the results from four different experimental runs. Two of those runs, in the top two rows of figure 12, correspond to a human subject moving in the sensor field; the other two runs, in the bottom two rows, correspond to a robot representing an unmanned ground vehicle (UGV). The symbolic reconstruction algorithm yields PFSA having disparate numbers of states in each of the above four cases (i.e., two each for the human subject and the robot), with their graph structures being significantly different. The resulting patterns (i.e., state probability vectors) for these PFSA models in each of the four cases are shown on the left side of figure 12. The models are then projectively composed with a 64-state D-Markov machine (Ray 2004) having alphabet size 4 and depth 3. The resulting pattern vectors are shown in the right-hand column of figure 12. The four rows in figure 12 demonstrate the applicability of projective composition to statistical pattern classification; the state probability vectors of the projected models unambiguously identify the respective patterns of a human subject and a UGV.
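For context, a D-Markov machine with alphabet size 4 and depth 3 has at most 4³ = 64 states, each a depth-3 word over the alphabet. A minimal sketch of the estimation idea behind such machines (after Ray 2004; the function name and encoding are illustrative), where π̃ is obtained from relative symbol frequencies:

```python
from collections import Counter, defaultdict

def d_markov(symbols, depth):
    """D-Markov machine estimation (sketch): states are the depth-D
    words observed in the symbol sequence; pi~ is estimated from the
    relative frequencies of the symbols that follow each word."""
    counts = defaultdict(Counter)
    for i in range(depth, len(symbols)):
        state = tuple(symbols[i - depth:i])
        counts[state][symbols[i]] += 1
    return {state: {sym: n / sum(c.values()) for sym, n in c.items()}
            for state, c in counts.items()}
```

Only observed depth-D words become states, so the machine has at most |Σ|^D states, and each estimated row sums to one.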
7. Summary, conclusions and future work

This paper presents a rigorous measure-theoretic approach to probabilistic finite state machines. Key concepts from classical language theory, such as the Nerode equivalence relation, are generalized to the probabilistic paradigm, and the existence and uniqueness of minimal representations for PFSA are established. Two binary operations, namely, probabilistic synchronous composition and projective composition of PFSA, are introduced and their properties are investigated. Numerical examples have been provided for clarity of exposition. The applicability of the defined binary operators has been demonstrated on experimental data from a laboratory test bed in a pattern identification and classification problem. This paper lays the framework for three major directions of future research and the associated applications.

. Probabilistic non-regular languages: Since projective composition can be used to obtain smaller-order models with quantifiable error, the possibility of projectively composing infinite-state probabilistic models with finite state machines must be investigated. The extension of the theory developed in this paper to non-regular probabilistic languages would prove invaluable in handling strictly non-Markovian
Figure 12. Experimental validation of projective composition in pattern recognition: (a) and (b) correspond to ranging data for a human subject in the sensor field; (c) and (d) correspond to a UGV.
models in the symbolic paradigm, especially physical processes that fail to have the semi-martingale property, e.g., fractional Brownian motion (Decreusefond and Ustunel 1999). Future work will investigate language-theoretic non-regularity as the symbolic analogue of chaotic behaviour in the continuous domain.
. Optimal control: The reported measure-theoretic approach to optimal supervisor design in PFSA models will be extended, in the light of the developments reported in this paper, to situations where the control specification is given as weights on the states of DFSA models disparate from the plant under consideration. Such a generalization would allow the fusion of Ramadge and Wonham's constraint-based supervision approach (Ramadge and Wonham 1987) with the measure-theoretic approach reported in Chattopadhyay (2006) and Chattopadhyay and Ray (2007). This new control synthesis tool would prove invaluable in the design of event-driven controllers in probabilistic robotics.
. Pattern identification: Preliminary application to pattern classification has already been demonstrated in §6. Future research will formalize the approach and investigate methodologies for optimally choosing the plant model on which to project the constructed PFSA so as to yield maximum algorithmic performance. Future investigations will also explore the applicability of the structural transformations developed in this paper to the fusion, refinement and computation of bounded-order symbolic models of observed system behaviour in complex dynamical systems.
Acknowledgements
The authors would like to thank Dr. Eric Keller for his
valuable contribution in obtaining the experimental
results.
This work has been supported in part by the
U.S. Army Research Office under Grant Nos.
W911NF-06-1-0469 and W911NF-07-1-0376.
References
R. Bapat and T. Raghavan, Nonnegative Matrices and Applications,
Cambridge University Press, 1997.
I. Chattopadhyay, ‘‘Quantitative control of probabilistic discrete event
systems,’’ PhD Dissertation, Dept. of Mech. Engg. Pennsylvania
State University, http://etda.libraries.psu.edu/theses/approved/
WorldWideIndex/ETD-1443, 2006.
I. Chattopadhyay and A. Ray, ‘‘Renormalized measure of regular
languages’’, Int. J. Contr., 79, pp. 1107–1117, 2006.
I. Chattopadhyay and A. Ray, ‘‘Language-measure-theoretic optimal
control of probabilistic finite-state systems’’, Int. J. Contr., 80,
pp. 1271–1290, 2007.
L. Decreusefond and A. Ustunel, ''Stochastic analysis of the fractional Brownian motion'', Potential Analysis, 10, pp. 177–214,
1999.
V. Garg, ‘‘Probabilistic languages for modeling of DEDs’’,
Proceedings of 1992 IEEE Conference on Information and Sciences,
Princeton, NJ, pp. 198–203, March 1992a.
V. Garg, ‘‘An algebraic approach to modeling probabilistic
discrete event systems’’, Proceedings of 1992 IEEE Conference
on Decision and Control, Tucson, AZ, pp. 2348–2353, December
1992b.
W.J. Harrod and R.J. Plemmons, ''Comparison of some direct
methods for computing the stationary distributions of Markov
chains'', SIAM J. Sci. Statist. Comput., 5, pp. 453–469, 1984.
J.E. Hopcroft, R. Motwani and J.D. Ullman, Introduction to Automata
Theory, Languages, and Computation, 2nd ed., Boston, MA, USA:
Addison-Wesley, 2001, pp. 45–138.
J.G. Kemeny and J.L. Snell, Finite Markov Chains, 2nd ed.,
New York: Springer, 1960.
R. Kumar and V. Garg, ‘‘Control of stochastic discrete event systems
modeled by probabilistic languages’’, IEEE Trans. Autom. Contr.,
46, pp. 593–606, 2001.
M. Lawford and W. Wonham, ‘‘Supervisory control of probabilistic
discrete event systems’’, Proceedings of 36th Midwest Symposium on
Circuits and Systems, pp. 327–331, 1993.
V. Rajagopalan and A. Ray, ‘‘Symbolic time series analysis via
wavelet-based partitioning’’, Signal Process., 86, pp. 3309–3320,
2006.
P.J. Ramadge and W.M. Wonham, ‘‘Supervisory control of a class of
discrete event processes’’, SIAM J. Contr. Optimiz., 25, pp. 206–230,
1987.
A. Ray, ‘‘Symbolic dynamic analysis of complex systems for anomaly
detection’’, Signal Process, 84, pp. 1115–1130, 2004.
A. Ray, ‘‘Signed real measure of regular languages for discrete-event
supervisory control’’, Int. J. Contr., 78, pp. 949–967, 2005.
W. Rudin, Real and Complex Analysis, 3rd ed., New York: McGraw
Hill, 1988.
C.R. Shalizi and K.L. Shalizi, ''Blind construction of optimal
nonlinear recursive predictors for discrete sequences'', in AUAI'04:
Proceedings of the 20th Conference on Uncertainty in Artificial
Intelligence, Arlington, Virginia, United States, AUAI Press,
pp. 504–511, 2004.
W. Stewart, Computational Probability: Numerical Methods for
Computing Stationary Distribution of Finite Irreducible Markov
Chains, New York: Springer, 1999.