A Finite Characterization of
Weak Lumpable Markov Processes.
Part I: The Discrete Time Case.
Gerardo Rubino and Bruno Sericola
IRISA – INRIA
Campus de Beaulieu, 35042 Rennes Cedex, FRANCE
Abstract
We consider an irreducible and homogeneous Markov chain (discrete time) with finite
state space. Given a partition of the state space, it is of interest to know whether the aggregated
process constructed from the first one with respect to the partition is also Markov homogeneous. We give a characterization of this situation by means of a finite algorithm. This
algorithm computes the set of all initial probability distributions of the starting homogeneous
Markov chain leading to an aggregated homogeneous Markov chain.

Keywords: Markov Chains, Aggregation, Weak Lumpability
1 Introduction
This paper is the natural continuation of the work performed in G. Rubino and B. Sericola (1989).
To remain compatible with this previous paper, we keep the same notation. Let us first recall
the problem of weak lumping in Markov chains.
Let X = (X_n)_{n≥0} be a homogeneous irreducible Markov chain evolving in discrete time. The
state space is assumed to be finite and denoted by E = {1, 2, ..., N}. The stationary distribution
of X is denoted by π. Let us denote by B = {B(1), B(2), ..., B(M)} a partition of the state space
and by n(m) the cardinal of B(m). We assume the states of E ordered such that

    B(1) = {1, ..., n(1)},
    ...
    B(m) = {n(1) + ... + n(m-1) + 1, ..., n(1) + ... + n(m)},
    ...
    B(M) = {n(1) + ... + n(M-1) + 1, ..., N}.
With the given process X we associate the aggregated stochastic process Y with values in the set
F = {1, 2, ..., M}, defined by

    Y_n = m  ⟺  X_n ∈ B(m),   for all n ≥ 0.

It is easily checked from this definition and the irreducibility of X that the process Y obtained is
also irreducible in the following sense: for all m ∈ F and for all l ∈ F such that IP(Y_0 = l) > 0, there
exists n > 0 such that IP(Y_n = m | Y_0 = l) > 0.
The Markov property of X means that given the state in which X is at the present time,
the future and the past of the process are independent. Clearly, this is no longer true for the
aggregated process Y in the general case (see the example below). This paper deals with the
conditions under which the process Y is also a homogeneous Markov chain.
The homogeneous Markov chain X is given by its transition probability matrix P and its
initial distribution α; we shall denote it by (α, P) when necessary. The (i, j) entry of matrix P
is denoted by P(i, j). We shall denote by agg(α, P, B) the aggregated process constructed from
(α, P) over the partition B. Let us denote by A the set of all probability vectors with N entries.
Of course, it is possible to have the situation in which agg(α, P, B) is a homogeneous Markov
chain for any α ∈ A. This happens, for instance, in the following example where E = {1, 2, 3},
B(1) = {1}, B(2) = {2, 3} and
    P = ( 1-p   p    0  )
        (  q    0   1-q )
        (  q   1-q   0  )
with 0 < p ≤ 1 and 0 < q < 1. In this case, X is said to be strongly lumpable with respect to the
partition B, and this is a well known property with a simple characterization. For every i ∈ E and
m ∈ F, let us denote by P(i, B(m)) the transition probability of passing in one step from state
i to the subset B(m) of E, that is

    P(i, B(m)) = Σ_{j ∈ B(m)} P(i, j).

Then, X is strongly lumpable with respect to the partition B if and only if, for every pair of
elements l, m ∈ F, the probability P(i, B(m)) has the same value for every i ∈ B(l). This common
value is the transition probability from state l to state m for the aggregated homogeneous Markov
chain Y.
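This characterization is easy to test mechanically. The following is a minimal sketch (ours, not part of the paper), written in Python with exact rational arithmetic; the names row_block_sums and is_strongly_lumpable are our own, and the matrix is the strongly lumpable example above with one admissible choice of p and q.

    from fractions import Fraction as F

    def row_block_sums(P, blocks):
        # For each state i, the vector (P(i, B(1)), ..., P(i, B(M))).
        return [[sum(P[i][j] for j in B) for B in blocks] for i in range(len(P))]

    def is_strongly_lumpable(P, blocks):
        # P(i, B(m)) must take the same value for every i inside the same block B(l).
        sums = row_block_sums(P, blocks)
        return all(sums[i] == sums[B[0]] for B in blocks for i in B)

    p, q = F(1, 2), F(1, 3)                 # any 0 < p <= 1, 0 < q < 1
    P = [[1 - p, p,     F(0)],
         [q,     F(0),  1 - q],
         [q,     1 - q, F(0)]]
    blocks = [[0], [1, 2]]                  # B(1) = {1}, B(2) = {2, 3}, with 0-based states
    print(is_strongly_lumpable(P, blocks))  # True: states 2 and 3 both reach B(1) with probability q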
The opposite situation, in which for any initial distribution α the process agg(α, P, B) is
not Markov, is illustrated by the following example. Consider again a three state chain with
B(1) = {1}, B(2) = {2, 3} and

    P = ( 1-p   p   0 )
        (  0    0   1 )
        (  1    0   0 )
For any value of p with 0 < p ≤ 1 and for any initial probability distribution, we have

    IP(X_{n+1} ∈ B(1) | X_n ∈ B(2), X_{n-1} ∈ B(1)) = 0

and

    IP(X_{n+1} ∈ B(1) | X_n ∈ B(2), X_{n-1} ∈ B(2)) = 1.
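Both conditional probabilities can be checked by exhaustive path enumeration. The sketch below is our own illustration (not from the paper); the helper prob and the concrete choices of α, p and n are ours, and the output reproduces the two values 0 and 1 above.

    from fractions import Fraction as F
    from itertools import product

    p = F(1, 2)                                  # any 0 < p <= 1 behaves the same way
    P = [[1 - p, p,    F(0)],
         [F(0),  F(0), F(1)],
         [F(1),  F(0), F(0)]]
    alpha = [F(1), F(0), F(0)]                   # a concrete initial distribution
    B = {1: [0], 2: [1, 2]}                      # the two blocks, with 0-based states

    def prob(constraints):
        # IP(X_0 in C_0, ..., X_k in C_k) obtained by summing the weights of all paths.
        total = F(0)
        for path in product(*constraints):
            w = alpha[path[0]]
            for a, b in zip(path, path[1:]):
                w *= P[a][b]
            total += w
        return total

    n = 2                                        # look at X_{n-1}, X_n, X_{n+1}
    for l in (1, 2):
        free = [range(3)] * (n - 1)              # X_0, ..., X_{n-2} left unconstrained
        num = prob(free + [B[l], B[2], B[1]])
        den = prob(free + [B[l], B[2]])
        print(l, num / den)                      # prints 0 for l = 1 and 1 for l = 2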
In Kemeny and Snell (1976), the authors show that a third situation is possible, in which there
exists a proper subset of A, denoted here by A_M, such that for any α ∈ A_M the process agg(α, P, B)
is Markov homogeneous, while for any α ∉ A_M the aggregated process is not Markov. In this more
general situation, X is said to be weakly lumpable with respect to the given partition B (see
Kemeny and Snell (1976) for some examples). Moreover, in the same reference a sufficient condition
for weak lumpability is given.

In Abdel-Moneim and Leysieffer (1982), the authors analyse the problem of the characterization
of weak lumpability. An interesting technique is introduced, but we have shown in G. Rubino and
B. Sericola (1989) that their characterization is wrong.
The work performed here is an extension of G. Rubino and B. Sericola (1989). We obtain
a finite characterization of weak lumpability by means of an algorithm which computes the set
A_M. In G. Rubino and B. Sericola (1989), A_M is given as an infinite intersection of sets. The
main result obtained in this paper is the reduction of this infinite intersection to a finite one.
The paper is organized as follows. The next section gives the background material and notation
and recalls the main results obtained in G. Rubino and B. Sericola (1989). In Section 3, we show
how the desired set A_M can be obtained by means of a finite algorithm. Section 4 presents an
example in which the set A_M is computed, and Section 5 concludes the paper.
2 Notation and main results of G. Rubino and B. Sericola (1989)
By convention, vectors are row vectors. Column vectors will be indicated by means of the
transposition operator (·)^T. A vector with all its entries equal to 1 will be denoted simply by 1;
its dimension will be defined by the context.

For each l ∈ F and α ∈ A, we denote by T_l.α the "restriction" of α to the subset B(l). That is,
T_l.α is a vector with n(l) entries and its i-th entry is (T_l.α)(i) = α(n(1) + ... + n(l-1) + i),
i = 1, 2, ..., n(l). When T_l.α ≠ 0, we denote by α^{B(l)} the vector of A defined by
α^{B(l)}(i) = α(i)/||T_l.α|| if i belongs to B(l), and 0 otherwise, where ||β|| denotes the sum of the
components of the nonnegative real vector β. Each time we write a vector of the form α^{B(m)}
with α ∈ A in the sequel, we implicitly mean that this vector is defined, i.e. T_m.α ≠ 0.
To clarify this notation, let us give some examples. Suppose that N = 5 and B = {B(1), B(2)}
with B(1) = {1, 2, 3} and B(2) = {4, 5}. Then,

for α = (1/10, 1/10, 1/5, 2/5, 1/5):

    T_1.α = (1/10, 1/10, 1/5),    α^{B(1)} = (1/4, 1/4, 1/2, 0, 0),    T_1.α^{B(1)} = (1/4, 1/4, 1/2),
    T_2.α = (2/5, 1/5),           α^{B(2)} = (0, 0, 0, 2/3, 1/3),      T_2.α^{B(2)} = (2/3, 1/3);

for α = (1/2, 1/2, 0, 0, 0):

    α^{B(1)} = (1/2, 1/2, 0, 0, 0) and α^{B(2)} is not defined since T_2.α = (0, 0).
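In code, these two operators amount to an index selection followed by a renormalization. Below is a small sketch in Python (our helpers, not from the paper; the names restrict and normalized_restriction are ours) reproducing the first example.

    from fractions import Fraction as F

    def restrict(alpha, block):
        # T_l.alpha: the entries of alpha indexed by B(l).
        return [alpha[i] for i in block]

    def normalized_restriction(alpha, block, N):
        # alpha^{B(l)}: alpha renormalized on B(l) and set to zero elsewhere; None if undefined.
        mass = sum(alpha[i] for i in block)
        if mass == 0:
            return None
        return [alpha[i] / mass if i in block else F(0) for i in range(N)]

    alpha = [F(1, 10), F(1, 10), F(1, 5), F(2, 5), F(1, 5)]
    B1, B2 = [0, 1, 2], [3, 4]                       # 0-based versions of B(1) and B(2)
    print(restrict(alpha, B1))                       # [1/10, 1/10, 1/5]
    print(normalized_restriction(alpha, B1, 5))      # [1/4, 1/4, 1/2, 0, 0]
    print(normalized_restriction(alpha, B2, 5))      # [0, 0, 0, 2/3, 1/3]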
Recall the definition of the main object of this analysis:

    A_M = {α ∈ A : Y = agg(α, P, B) is a homogeneous Markov chain}.

For l ∈ F, we shall denote by P̃_l the n(l) × M matrix defined by

    P̃_l(j, m) = P(n(1) + ... + n(l-1) + j, B(m)),   1 ≤ j ≤ n(l), m ∈ F.

In other words, P̃_l(j, m) is the probability of passing in one step from the j-th state in B(l) to
the set B(m).

When the set A_M is not empty, the transition probability matrix of the Markov chain Y,
denoted by P̂, is the same for every α ∈ A_M. In this case, its l-th row P̂_l can be computed by
P̂_l = (T_l.π^{B(l)}) P̃_l (see G. Rubino and B. Sericola (1989)).
To illustrate these definitions, consider the following matrix

    P = ( 1/6   1/6   1/6   1/2 )
        ( 1/8   3/8   1/4   1/4 )
        ( 3/8   1/8   1/4   1/4 )
        ( 1/4   1/4   1/4   1/4 )

with B = {B(1), B(2)}, B(1) = {1, 2, 3} and B(2) = {4}. We have

    P̃_1 = ( 1/2   1/2 )
          ( 3/4   1/4 )
          ( 3/4   1/4 )

    P̃_2 = ( 3/4   1/4 )

    π = (3/13, 3/13, 3/13, 4/13)

and

    P̂ = ( 2/3   1/3 )
        ( 3/4   1/4 ).
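These quantities are straightforward to compute from P, π and the partition. The following sketch (ours; the function names Ptilde and Phat_row are not from the paper) reproduces the matrices just given.

    from fractions import Fraction as F

    P = [[F(1, 6), F(1, 6), F(1, 6), F(1, 2)],
         [F(1, 8), F(3, 8), F(1, 4), F(1, 4)],
         [F(3, 8), F(1, 8), F(1, 4), F(1, 4)],
         [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]]
    blocks = [[0, 1, 2], [3]]                        # B(1) = {1, 2, 3}, B(2) = {4}, 0-based
    pi = [F(3, 13), F(3, 13), F(3, 13), F(4, 13)]    # stationary distribution of P

    def Ptilde(P, blocks, l):
        # n(l) x M matrix of one-step probabilities from the states of B(l) to each block.
        return [[sum(P[i][j] for j in B) for B in blocks] for i in blocks[l]]

    def Phat_row(P, blocks, pi, l):
        # l-th row of Phat, computed as (T_l.pi^{B(l)}) Ptilde_l.
        w = [pi[i] for i in blocks[l]]
        w = [x / sum(w) for x in w]
        Pt = Ptilde(P, blocks, l)
        return [sum(w[i] * Pt[i][m] for i in range(len(w))) for m in range(len(blocks))]

    print(Ptilde(P, blocks, 0))           # [[1/2, 1/2], [3/4, 1/4], [3/4, 1/4]]
    print(Phat_row(P, blocks, pi, 0))     # [2/3, 1/3]
    print(Phat_row(P, blocks, pi, 1))     # [3/4, 1/4]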
Let α be an element of A, k a positive integer and i_1, i_2, ..., i_k a sequence of k elements of F.
We define recursively the vector f(α, B(i_1), ..., B(i_k)) ∈ A by

    f(α, B(i_1)) = α^{B(i_1)},
    f(α, B(i_1), ..., B(i_h)) = ( f(α, B(i_1), ..., B(i_{h-1})) P )^{B(i_h)},   2 ≤ h ≤ k,

whenever these operations make sense. The reader can check the following probabilistic interpretation
of f. If β = f(α, B(i_0), ..., B(i_n)) then

    β(j) = IP(X_n = j | X_n ∈ B(i_n), ..., X_0 ∈ B(i_0)).

That is, β is the probability distribution of X_n given (X_0 ∈ B(i_0), ..., X_n ∈ B(i_n)).
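The recursion translates directly into code. Here is a sketch (ours, not the authors'; it reuses the normalized restriction α^{B(l)} of Section 2) together with a check on the 4-state example above.

    from fractions import Fraction as F

    def normalized_restriction(v, block, N):
        mass = sum(v[i] for i in block)
        if mass == 0:
            return None                          # the operation "does not make sense"
        return [v[i] / mass if i in block else F(0) for i in range(N)]

    def f(alpha, P, blocks, idx):
        # f(alpha, B(i_1), ..., B(i_k)) for the index sequence idx = (i_1, ..., i_k).
        N = len(alpha)
        beta = normalized_restriction(alpha, blocks[idx[0]], N)
        for i in idx[1:]:
            if beta is None:
                return None
            beta = [sum(beta[a] * P[a][b] for a in range(N)) for b in range(N)]   # beta P
            beta = normalized_restriction(beta, blocks[i], N)
        return beta

    P = [[F(1, 6), F(1, 6), F(1, 6), F(1, 2)],
         [F(1, 8), F(3, 8), F(1, 4), F(1, 4)],
         [F(3, 8), F(1, 8), F(1, 4), F(1, 4)],
         [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]]
    alpha = [F(1, 4)] * 4
    # distribution of X_1 given X_0 in B(1) and X_1 in B(1): (1/3, 1/3, 1/3, 0)
    print(f(alpha, P, [[0, 1, 2], [3]], [0, 0]))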
Using this definition, we define the basic sequence of sets

    A_1 = {α ∈ A : (T_l.α^{B(l)}) P̃_l = P̂_l for every l ∈ F}

and, for j ≥ 2,

    A_j = {α ∈ A : for every β = f(α, B(i_1), ..., B(i_k)) with k ≤ j, we have β ∈ A_1}.

As an example, the reader can check that for the previous 4-dimensional transition probability
matrix, the set A_1 can be written as

    A_1 = {α ∈ A : α = λ (1/3, t, 2/3 - t, 0) + (1 - λ) (0, 0, 0, 1),  0 ≤ t ≤ 2/3,  0 ≤ λ ≤ 1}.
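This parametric description is easy to spot-check: the condition for l = 2 is trivial because B(2) is a singleton, so membership in A_1 reduces to (T_1.α^{B(1)}) P̃_1 = P̂_1 = (2/3, 1/3). A quick check (ours) for one admissible choice of the parameters:

    from fractions import Fraction as F

    t, lam = F(1, 5), F(1, 2)                        # any 0 <= t <= 2/3 and 0 < lam <= 1
    alpha = [lam * F(1, 3), lam * t, lam * (F(2, 3) - t), 1 - lam]
    Pt1 = [[F(1, 2), F(1, 2)], [F(3, 4), F(1, 4)], [F(3, 4), F(1, 4)]]
    w = [x / sum(alpha[:3]) for x in alpha[:3]]      # T_1.alpha^{B(1)}
    row = [sum(w[i] * Pt1[i][m] for i in range(3)) for m in range(2)]
    print(row)                                       # [2/3, 1/3], so alpha belongs to A_1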
We then have the basic result (proved in G. Rubino and B. Sericola (1989))

    A_M = ∩_{j≥1} A_j.

In the same paper, some properties of A_M are given. Let us say that a subset U of A is stable
by right product by P if for all x ∈ U the vector xP ∈ U. The principal properties of the set A_M
that we need here are:

(1) α ∈ A_M ⟹ α^{B(l)} ∈ A_M, for all l ∈ F;

(2) α ∈ A_M ⟹ for all n ≥ 1, (1/n) Σ_{k=1}^{n} α P^k ∈ A_M;

(3) A_M ≠ ∅ ⟹ π ∈ A_M;

(4) A_M ≠ ∅ ⟹ A_M is a convex closed set;

(5) A_1 is stable by right product by P ⟺ A_M = A_1;

(6) for all j ≥ 1, A_{j+1} ⊆ A_j;

(7) if there exists j ≥ 1 such that A_{j+1} = A_j, then A_M = A_{j+k} for all k ≥ 1;

(8) A_M = ∩_{j≥1} A_j.
See G. Rubino and B. Sericola (1989) for the proofs of these properties. In the next section, we
show that the sequence of sets (A_j)_{j≥1} is a stationary sequence whose limit is equal to A_N. That
is, A_M = A_N, where N denotes the cardinal of the state space E of the process X. Moreover, a
finite algorithm to compute this set is given.
3 An algorithm to compute A_M
To obtain the announced result, we need some preliminary lemmas. The first one gives a way to
compute the set A_{j+1} as a function of A_j.

Lemma 3.1 For all j ≥ 1,

    A_{j+1} = {α ∈ A_j : α^{B(l)} P ∈ A_j for every l ∈ F}.

Proof. The proof is a direct consequence of the definition of f. It only uses the property

    f(α, B(i_1), B(i_2), ..., B(i_k)) = f(α^{B(i_1)} P, B(i_2), ..., B(i_k)),

which is immediate. □
For every l ∈ F, we denote by P_l the n(l) × N sub-matrix of P corresponding to the transition
probabilities of X from the states of B(l) to the states of E. Let us define, for every l ∈ F, the
n(l) × M matrix H_l as H_l = P̃_l - 1^T P̂_l, and recall that a polytope of ℝ^N is a bounded convex
polyhedral subset of ℝ^N (Rockafellar (1970)).

Lemma 3.2 For all j ≥ 1, A_j is a polytope of ℝ^N.

Proof. Verify first that A_1 can be written as

    A_1 = {α ∈ A : α H = 0}

where H is the N × M² block diagonal matrix whose l-th block is H_l. When necessary, we shall
also write H = Diag(H_l). So, A_1 is a polytope (see Rockafellar (1970)).

Now, for every j ≥ 1, we define the N × M^{j+1} matrix H^{[j]} as

    H^{[1]} = H,
    H^{[j+1]} = Diag(P_l H^{[j]}) for j ≥ 1.

The previous lemma allows us to write

    A_{j+1} = {α ∈ A_j : α H^{[j+1]} = 0}.

The result follows by definition (see Rockafellar (1970)). □
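The construction used in this proof can be carried out mechanically. The sketch below (ours; the names mat_mul, block_diag, H1 and next_H are not from the paper) builds H = Diag(H_l) and applies the recursion H^{[j+1]} = Diag(P_l H^{[j]}) with exact rationals.

    from fractions import Fraction as F

    def mat_mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
                for i in range(len(A))]

    def block_diag(pieces):
        # Stack matrices diagonally: block l occupies its own rows and its own columns.
        width = sum(len(p[0]) for p in pieces)
        out, col = [], 0
        for p in pieces:
            for row in p:
                out.append([F(0)] * col + row + [F(0)] * (width - col - len(row)))
            col += len(p[0])
        return out

    def H1(P, blocks, pi):
        # H^{[1]} = Diag(H_l) with H_l = Ptilde_l - 1^T Phat_l.
        pieces = []
        for B in blocks:
            Pt = [[sum(P[i][j] for j in Bm) for Bm in blocks] for i in B]
            w = [pi[i] for i in B]
            w = [x / sum(w) for x in w]
            Phat_l = [sum(w[i] * Pt[i][m] for i in range(len(B))) for m in range(len(blocks))]
            pieces.append([[Pt[i][m] - Phat_l[m] for m in range(len(blocks))]
                           for i in range(len(B))])
        return block_diag(pieces)

    def next_H(P, blocks, H):
        # H^{[j+1]} = Diag(P_l H^{[j]}), P_l being the rows of P indexed by B(l).
        return block_diag([mat_mul([P[i] for i in B], H) for B in blocks])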
For α and β ∈ ℝ^N, we denote by D(α, β) the line containing the two points α and β, that
is, D(α, β) = {λα + (1 - λ)β : λ ∈ ℝ}. Recall that the dimension of a convex set C of ℝ^N is the
dimension of the smallest affine subset containing C; we shall denote it by dim(C).

Lemma 3.3 For all j ≥ 1 such that A_{j+1} ≠ ∅, we have

    dim(A_{j+1}) = dim(A_j)  ⟹  A_{j+1} = A_j.
Proof. We have seen that for all j ≥ 1, A_{j+1} ⊆ A_j. If the dimension is equal to 0, the result
follows immediately. Suppose now that the common dimension is at least 1. Let γ be a point of
A_j, and let α and β be two points belonging to A_{j+1} such that γ ∈ D(α, β); such points exist
since the two sets have the same dimension. Now, there exists λ ∈ ℝ such that γ = λα + (1 - λ)β.
That is, using the previous lemma, γ H^{[j+1]} = λ α H^{[j+1]} + (1 - λ) β H^{[j+1]} = 0. This proves
that γ ∈ A_{j+1} and then that A_{j+1} = A_j. □
We are now ready to prove the main result of this paper. Recall that N denotes the cardinal
of the state space of the starting homogeneous Markov chain X.

Theorem 3.4

    A_M = A_N.

Proof. Consider the sequence A_1, ..., A_N. From property (6) of the previous section, this
sequence is decreasing. If two consecutive elements of A_1, ..., A_N are equal, property (7) of the
previous section leads to A_M = A_N. If all its elements are different, the previous lemma implies
that the sequence of their dimensions is strictly decreasing and so, since dim(A) = N - 1, we
have dim(A_j) < N - j for j = 1, 2, ..., N - 1, and then A_N = ∅. Property (8) of the previous
section leads to the result. □
Before giving the algorithm which computes the set A_M, we need another interesting property,
given in the following lemma, which generalizes property (5) of the previous section.

Lemma 3.5 A_j is stable by right product by P ⟺ A_M = A_j.

Proof. A_M being stable by right product by P (property (2) of the previous section), the
implication from the right to the left is immediate. Conversely, suppose that A_j is stable by right
product by P. Verify then that

    for all j ≥ 1,  α ∈ A_j ⟺ α^{B(l)} ∈ A_j for every l ∈ F.

Then Lemma 3.1 leads to A_{j+1} = A_j and property (7) implies A_M = A_j. □
To determine the set A_M, we can proceed as follows. Assume that the chain X is not strongly
lumpable with respect to the partition B. The first step consists of calculating the set A_1 and
determining whether it is stable by right product by P. If A_1 is stable by right product by P,
Lemma 3.5 says that A_M = A_1. If A_1 is not stable by right product by P, we first verify whether,
for all l ∈ F, the vector π^{B(l)} P is in A_1. If there exists l ∈ F such that π^{B(l)} P ∉ A_1, then
π ∉ A_2 and A_M = ∅ (consequence of properties (3) and (7) of the previous section). If π ∈ A_2
then we calculate this set and we repeat the previous operations.

Denote by Φ the following procedure, where P(A) is the set of all subsets of A:

    Φ : P(A) → P(A)
        U ↦ {α ∈ U : α^{B(l)} P ∈ U for every l ∈ F}.

With this notation, A_{j+1} = Φ(A_j) (Lemma 3.1). The algorithm then has the following form.
    if X is strongly lumpable then A_M := A; stop
    else U := A_1
        loop
            if U is stable by P break by found endif
            if ∃ l ∈ F : π^{B(l)} P ∉ U break by empty endif
            U := Φ(U)
        endloop
        exits
            exit found : A_M := U
            exit empty : A_M := ∅
        endexits
    endif
Remark that all the sets considered in this algorithm are polytopes and so they can be determined
in a unique way by giving their vertices.
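A concrete rendering of this loop, based on that remark, is sketched below (our code, not the authors'). It assumes that a vertex-enumeration routine vertices_of for the current polytope is available from an external library (not shown), that H^{[1]} and the recursion H^{[j+1]} = Diag(P_l H^{[j]}) are built as in the sketch following Lemma 3.2, and that the vectors π^{B(l)} have been precomputed. Since A_j is convex and α ↦ αP is linear, stability can be tested on the vertices alone, and membership in A_j amounts to α H^{[k]} = 0 for k = 1, ..., j.

    def vec_mat(v, M):
        return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

    def in_Aj(alpha, Hs):
        # alpha in A_j, where Hs = [H^{[1]}, ..., H^{[j]}] and alpha is a probability vector.
        return all(all(x == 0 for x in vec_mat(alpha, H)) for H in Hs)

    def compute_AM_vertices(P, H_first, next_H, pi_restrictions, vertices_of):
        """Vertex description of A_M ([] if A_M is empty), for a chain that is not
        strongly lumpable.  H_first = H^{[1]};  next_H maps H^{[j]} to H^{[j+1]};
        pi_restrictions = [pi^{B(1)}, ..., pi^{B(M)}];  vertices_of(Hs) enumerates
        the vertices of {alpha in A : alpha H^{[k]} = 0, k = 1, ..., j} (assumed given)."""
        Hs = [H_first]
        while True:                                        # stops, since A_M = A_N (Theorem 3.4)
            verts = vertices_of(Hs)                        # vertices of the current A_j
            if all(in_Aj(vec_mat(v, P), Hs) for v in verts):
                return verts                               # A_j stable by P, so A_M = A_j
            if any(not in_Aj(vec_mat(r, P), Hs) for r in pi_restrictions):
                return []                                  # some pi^{B(l)} P outside A_j: A_M empty
            Hs.append(next_H(Hs[-1]))                      # otherwise move on to A_{j+1}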
4 Example
To illustrate our algorithm, let us take as an example the process X = (α, P) on the state space
E = {1, 2, 3, 4, 5, 6} with

    P = ( 3/14   3/14   3/14   3/14   1/14   1/14 )
        ( 1/6    1/6    1/6    1/3    1/12   1/12 )
        ( 1/8    3/8    1/4    1/6    1/24   1/24 )
        ( 3/8    1/8    1/4    1/6    1/24   1/24 )
        ( 1/5    1/5    1/5    1/5    1/10   1/10 )
        ( 1/5    1/5    1/5    1/5    1/10   1/10 )

and consider the partition B = {B(1), B(2)} where B(1) = {1, 2, 3, 4} and B(2) = {5, 6}. We
have N = 6, M = 2, n(1) = 4, n(2) = 2. We see that X is not strongly lumpable with respect to
the partition B. The stationary distribution of X is

    π = (42/193, 42/193, 42/193, 42/193, 25/386, 25/386)

and we have

    π^{B(1)} = (1/4, 1/4, 1/4, 1/4, 0, 0),
    π^{B(2)} = (0, 0, 0, 0, 1/2, 1/2).
The matrices P̃_1 and P̃_2 are given by

    P̃_1 = ( 6/7     1/7  )
          ( 5/6     1/6  )
          ( 11/12   1/12 )
          ( 11/12   1/12 )

    P̃_2 = ( 4/5   1/5 )
          ( 4/5   1/5 )

This leads to

    P̂_1 = (37/42, 5/42),    P̂_2 = (4/5, 1/5).
The polytope A_1 can be determined by the matrix H^{[1]} (see the proof of Lemma 3.2), given by

    H^{[1]} = ( -1/42    1/42   0   0 )
              ( -1/21    1/21   0   0 )
              (  1/28   -1/28   0   0 )
              (  1/28   -1/28   0   0 )
              (    0       0    0   0 )
              (    0       0    0   0 ).
Therefore, the vertices of the polytope A_1 are

    v_1 = ( 3/5,   0,    0,   2/5,  0,  0 ),
    v_2 = ( 3/5,   0,   2/5,   0,   0,  0 ),
    v_3 = (  0,   3/7,   0,   4/7,  0,  0 ),
    v_4 = (  0,   3/7,  4/7,   0,   0,  0 ),
    v_5 = (  0,    0,    0,    0,   0,  1 ),
    v_6 = (  0,    0,    0,    0,   1,  0 ).
The set A_1 is not stable by right product by P since v_2 P ∉ A_1 (v_2 P H^{[1]} ≠ 0). Furthermore,
π^{B(1)} P ∈ A_1 and π^{B(2)} P ∈ A_1 since π^{B(1)} P H^{[1]} = 0 and π^{B(2)} P H^{[1]} = 0. This means that the set
A_2 ≠ ∅ (π ∈ A_2). The polytope A_2 can be determined by the matrix H^{[2]} given by
    H^{[2]} = (    0        0      0   0   0   0   0   0 )
              (  1/168   -1/168    0   0   0   0   0   0 )
              ( -1/168    1/168    0   0   0   0   0   0 )
              (    0        0      0   0   0   0   0   0 )
              (    0        0      0   0   0   0   0   0 )
              (    0        0      0   0   0   0   0   0 ).

The vertices of the polytope A_2 are v_1, v_5, v_6 and v_7, where

    v_7 = ( 0, 3/7, 3/7, 1/7, 0, 0 ).
The set A_2 is not stable by right product by P since v_1 P ∉ A_2 (v_1 P H^{[2]} ≠ 0). Furthermore,
π^{B(1)} P ∈ A_2 and π^{B(2)} P ∈ A_2 since π^{B(1)} P H^{[2]} = 0 and π^{B(2)} P H^{[2]} = 0. This means that
A_3 ≠ ∅ (π ∈ A_3). The polytope A_3 can be determined by the matrix H^{[3]} given by
    H^{[3]} = (    0          0       0  ...  0 )
              (    0          0       0  ...  0 )
              (  1/1344   -1/1344     0  ...  0 )
              ( -1/1344    1/1344     0  ...  0 )
              (    0          0       0  ...  0 )
              (    0          0       0  ...  0 )

(a 6 × 16 matrix whose only nonzero entries lie in its first two columns).
The vertices of the polytope A_3 are v_5, v_6 and π^{B(1)}. Finally, the polytope A_3 is stable by
right product by P since v_5 P H^{[3]} = 0, v_6 P H^{[3]} = 0 and π^{B(1)} P H^{[3]} = 0. So, we have A_M = A_3,
that is

    A_M = { λ_1 (0, 0, 0, 0, 0, 1) + λ_2 (0, 0, 0, 0, 1, 0) + λ_3 (1/4, 1/4, 1/4, 1/4, 0, 0) :
            λ_1, λ_2, λ_3 ∈ [0, 1],  λ_1 + λ_2 + λ_3 = 1 }.
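The computations of this section are easy to reproduce. The following self-contained sketch (ours, not from the paper) rebuilds H^{[1]}, H^{[2]} and H^{[3]} for this example with exact rationals and confirms that the images under P of the three vertices of A_3 stay in A_3, hence that A_3 is stable by right product by P.

    from fractions import Fraction as F

    P = [[F(3, 14)] * 4 + [F(1, 14)] * 2,
         [F(1, 6)] * 3 + [F(1, 3)] + [F(1, 12)] * 2,
         [F(1, 8), F(3, 8), F(1, 4), F(1, 6), F(1, 24), F(1, 24)],
         [F(3, 8), F(1, 8), F(1, 4), F(1, 6), F(1, 24), F(1, 24)],
         [F(1, 5)] * 4 + [F(1, 10)] * 2,
         [F(1, 5)] * 4 + [F(1, 10)] * 2]
    blocks = [[0, 1, 2, 3], [4, 5]]
    pi = [F(42, 193)] * 4 + [F(25, 386)] * 2

    def vm(v, M):                                   # row vector times matrix
        return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

    def diag(pieces):                               # block-diagonal stacking of matrices
        width, out, col = sum(len(p[0]) for p in pieces), [], 0
        for p in pieces:
            out += [[F(0)] * col + r + [F(0)] * (width - col - len(r)) for r in p]
            col += len(p[0])
        return out

    Pt = [[[sum(P[i][j] for j in Bm) for Bm in blocks] for i in B] for B in blocks]
    w = [[pi[i] / sum(pi[k] for k in B) for i in B] for B in blocks]     # T_l.pi^{B(l)}
    Ph = [vm(w[l], Pt[l]) for l in range(2)]        # rows of Phat: (37/42, 5/42), (4/5, 1/5)
    Hs = [diag([[[Pt[l][i][m] - Ph[l][m] for m in range(2)] for i in range(len(B))]
                for l, B in enumerate(blocks)])]    # H^[1]
    for _ in range(2):                              # H^[2] and H^[3]
        Hs.append(diag([[vm(P[i], Hs[-1]) for i in B] for B in blocks]))

    def in_A3(a):
        return all(x == 0 for H in Hs for x in vm(a, H))

    v5, v6 = [F(0)] * 5 + [F(1)], [F(0)] * 4 + [F(1), F(0)]
    piB1 = [F(1, 4)] * 4 + [F(0)] * 2
    print([in_A3(vm(v, P)) for v in (v5, v6, piB1)])   # [True, True, True]: A_M = A_3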
5 Conclusions
In this paper, we have analyzed the set of all initial probability distributions of an irreducible
and homogeneous Markov chain which lead to a homogeneous aggregated Markov chain, given the
transition probability matrix and a partition of the state space. We have obtained a constructive
characterization of this set by means of a finite algorithm. In a subsequent paper, we analyse the
continuous time case. Basically, it is shown that it is always possible to come down to the discrete
time case using the uniformization technique. The case of homogeneous Markov processes with
absorbing states seems to be the first possible direction in which to extend these results.
References
[1] A.M. Abdel-Moneim and F.W. Leysieffer, Weak lumpability in finite Markov chains, J. Appl.
Prob. 19 (1982) 685-691.

[2] J.G. Kemeny and J.L. Snell, Finite Markov Chains, Springer-Verlag, New York Heidelberg
Berlin, 1976.

[3] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, New Jersey, 1970.

[4] G. Rubino and B. Sericola, On weak lumpability in Markov chains, J. Appl. Prob. 26 (1989)
446-457.