Saarland University
Faculty of Natural Sciences and Technology I
Department of Computer Science
Master’s Program in Computer Science
Master’s Thesis
Parametric Markov Model Analysis
submitted by
Ernst Moritz Hahn
on 2008-04-30
Advisor
Dipl.-Inform. Lijun Zhang
Reviewers
Prof. Dr.-Ing. Holger Hermanns
Prof. Bernd Finkbeiner, Ph.D.
Statement
Hereby I confirm that this thesis is my own work and that I have
documented all sources used.
Saarbrücken, 2008-04-30
Declaration of Consent
Herewith I agree that my thesis will be made available through the
library of the Computer Science Department.
Saarbrücken, 2008-04-30
Abstract
The analysis of complex systems is an important field in computer science. For qualitative properties of systems, this problem is quite well understood by now: there is broad tool support for the specification of models and properties as well as for their analysis.
Currently, quantitative properties are more and more becoming the focus of interest. These quantities include complicated timing behavior and continuous state-spaces as well as stochastic and probabilistic characteristics.
The purpose of this thesis is the analysis of systems with random behavior and certain unknown parameters, like the reliability of sub-components. The approach analyzes a property for all parameter evaluations of a system without fixing concrete values beforehand. That is, all possible values of the parameters are considered at once. Among other things, the result acquired by such an analysis allows for finding optimal parameter evaluations for a given property.
Acknowledgement
I would like to express my gratitude to Prof. Holger Hermanns for his continuous support and proactive guidance throughout the course of this Master's thesis. His invaluable feedback and rich insights were essential for my research endeavour.
This thesis would not have been possible without the active support of Lijun Zhang. He not only had the initial idea for the topic but also encouraged me throughout the work. Our discussions led to valuable research insights.
Prof. Bernd Finkbeiner immediately agreed to be the second reviewer and gave valuable comments.
Björn Wacher wrote a large part of the infrastructure on which the tool implemented along with this thesis is based. I would like to thank him for doing such a great job.
Christa Schäfer offered the best organizational support I could expect.
To conclude this section, I would like to express my thanks to my friends and my family. Their constant support and assistance helped me enormously and gave me much-needed aid and encouragement.
Contents

1 Introduction
2 Markov Models
   2.1 Discrete-Time Markov Chains
   2.2 Continuous-Time Markov Chains
   2.3 Markov Decision Processes
   2.4 Parametric Models
      2.4.1 Parametric DTMCs
      2.4.2 Parametric CTMCs
      2.4.3 Parametric MDPs
   2.5 Reward Models
3 Algorithms for Parametric Models
   3.1 Bounded Reachability for PDTMCs
   3.2 Bounded Reachability for PCTMCs
      3.2.1 Exponomials
      3.2.2 Uniformization
   3.3 Parameterization on Time
   3.4 Unbounded Reachability for Markov Chains
      3.4.1 State-Elimination Approach
      3.4.2 Extensions
   3.5 Rewards
      3.5.1 Reachability Rewards
      3.5.2 State-Elimination Approach
   3.6 Handling Parametric Markov Decision Processes
      3.6.1 Straight-Forward Approach
      3.6.2 Alternative Approach
   3.7 Complexity
4 Optimizations
   4.1 Optimizations in State-Space Generation
   4.2 Bisimulation
5 Case Studies
   5.1 The Tool
      5.1.1 Architecture
      5.1.2 Input Language
      5.1.3 Evaluating Functions
   5.2 Zeroconf
   5.3 An Instable Model
   5.4 Bounded Retransmission Protocol
   5.5 Protein Synthesis
6 Conclusion and Future Work
A Appendix
   A.1 Zeroconf
   A.2 An Instable Model
   A.3 Bounded Retransmission Protocol
   A.4 Protein Synthesis
Bibliography
Chapter 1
Introduction
Markov processes were introduced by Andrei Andreyevich Markov (1856–1922). Since then, they have been applied successfully to reason about quantitative properties in a large number of areas such as computer science, engineering, mathematics, biology and many more.
A Markov chain is a special kind of stochastic process. The characteristic property of this class of stochastic processes is that, knowing only a limited amount of the history, one can make forecasts about the further development of the process as if the full history were known. In this thesis, first-order Markov chains with a finite state-space are considered. In these models, the state-space is finite and the future depends only on the present and not on the earlier history.
Since computer systems became available, numerous methods and tools have been developed to handle Markov processes, and ongoing progress in computational methods and computer systems makes it possible to handle Markov models of increasing size. A good introduction to the numerical solution of Markov chains can be found in [25]. A comparison of recent tools for handling these models is given in [15]. In addition to techniques for reasoning about properties of Markov chains, efficient means have been developed to represent in computer memory very large models for which an explicit representation is not possible. For example, in [2] a method is introduced to represent Markov chains using MTBDDs, a data structure that takes advantage of structural similarities between different parts of a model.
This Master's thesis is about parametric Markov processes, first introduced in [8] for discrete-time models. Parametric Markov chains are Markov chains in which certain aspects of the model are not fixed, but depend on parameters of the model. As an example, consider a computer network in which a sender sends packages to a receiver over a lossy channel, as shown in Figure 1.1. If a package is sent, it is received with probability x but lost with probability 1 − x. Given this model, it is possible to consider the probability that a complete file, consisting of a sequence of several packages, is transmitted successfully with no more than 10 transmissions. If standard methods are used, the reliability x has to be instantiated with a concrete number before checking this property.

[Figure 1.1: Parametric model of a computer network. A sender transmits packages to a receiver; a package is received with probability x (e.g., 0.999) and lost with probability 1 − x (e.g., 0.001).]
In this thesis, however, we propose novel methods to handle parametric analyses. Instead of a numeric or Boolean result, we compute a function which depends on the parameters. For the example model and property, the result would be a function of x giving the probability that the file is successfully transmitted within the given bound on the number of transmissions. The method introduced in this thesis allows curve sketching on the resulting function using standard mathematical means. Having a result formula also makes it possible to solve optimization problems for Markov models. Previously, this was only possible by trying out a number of variable evaluations using standard probabilistic model checkers and guessing optima from the resulting values, or by using manual methods. For instance, once we have computed the result function for the example of Figure 1.1, we can reason about the minimal reliability x of the channel that is needed such that a whole file is successfully transmitted with a probability greater than or equal to 0.0001. Our methods also allow for a fast plot of the function to visualize the results. This makes it possible to quickly get an overview of the influence the parameters have on the behaviour of the model.
We developed the tool Param, with which we show the feasibility of the approach on a number of case studies, even though, as seen later in Section 3.7, the theoretical bounds on time and memory are rather bad. In Chapter 5, Param is successfully applied to a number of case studies. For instance, we considered the bounded retransmission protocol (Section 5.4), a protocol similar to the example of Figure 1.1. In this protocol, a number of packages are sent over a network with a limited number of retransmissions allowed. Both the channel over which packages are sent and the channel used to acknowledge that a package was received are considered unreliable. This case study has already been handled in Prism [20]. Prism is a tool which allows for the description of Markov models in a higher-level language, that is, using variables, guarded commands, etc. Prism programs also allow for the usage of undefined constants in the high-level specification. Assume the acknowledgement channel to have a reliability of 0.99
and consider the property “The maximum probability that eventually the sender does not
report a successful transmission”.
[Figure 1.2: Plots in Prism and our tool. The plot shows the maximum probability that eventually the sender does not report a successful transmission (y axis, 0 to 0.005) over the reliability of the transmission channel (x axis, 0.9 to 1); the points x0, . . . , x9 mark individual Prism runs.]
Using Prism, it is possible to generate a plot with the reliability of the transmission channel on the x axis and the probability of the property on the y axis. However, for each value pair (xi, yi) of the plot, a new run of Prism has to be executed, as shown in Figure 1.2. With the approach of this thesis, we can instead evaluate the function after applying the analysis once, allowing us to generate the plot much faster. It also allows further inquiry into the result function, for example computing the minimal reliability needed to guarantee that the probability of a successful file transmission is above a certain bound.
The contributions of this thesis are:
• providing definitions and algorithms for parametric discrete-time and continuous-time Markov chains as well as Markov decision processes,
• handling reward properties,
• several case studies showing the feasibility of the approach,
• a tool to apply parametric analysis on models specified in a widely used higher-level description language.
The thesis is organized as follows: In Chapter 2 the formal models to be handled are described. Chapter 3 puts forward problems to be solved on the models of Chapter 2, together with algorithms for their solution. Chapter 4 complements this by describing optimizations for the computations of the former chapter. This is important because calculations may be more time- and memory-consuming in the parametric setting than in the non-parametric one, where computations for the largest part involve mathematical operations on floating-point numbers. Chapter 5 introduces Param and shows its application on a number of case studies. In Chapter 6 we give a conclusion of the results as well as a number of directions for future work. Appendix A contains supplementary material for the case studies.
Chapter 2
Markov Models
This chapter is organized as follows: In Sections 2.1, 2.2 and 2.3 we give definitions for
discrete-time and continuous-time Markov models with and without non-determinism. In
Section 2.4 these models are extended with parameters. Section 2.5 extends the previous
models by rewards, which can be interpreted as costs or bonuses. The model types
introduced here are the ones that our tool Param can actually handle.
2.1 Discrete-Time Markov Chains
Definition 2.1.1 A Discrete-Time Markov Chain (DTMC) is a tuple D = (S, S0, P) where
• S is a finite set of states,
• P : S × S → [0, 1] is a probability matrix such that P(s, S) ∈ {0, 1} for all s ∈ S, where P(s, A) = Σ_{s′∈A} P(s, s′) for A ⊆ S,
• S0 ⊆ S is a set of initial states.
A state s ∈ S is called absorbing if P(s, S) = 0. A path of a DTMC is a finite or infinite sequence σ = s0 s1 s2 . . . of its states such that for all i ≥ 0: P(si, si+1) > 0. A path is maximal if it is infinite, or if it is finite and its last state is absorbing. Path_D(s) denotes the set of all maximal paths of D starting in s. With σ[i], the (i + 1)th state of σ is denoted, that is, σ[i] = si. The unique probability measure on a set of paths starting in s [18] is given by Pr_s.
A system can be modelled as a DTMC when it is reasonable to assume that time passes in discrete steps. Consider the example from [19] given in Figure 2.1. Here, throwing a dice is simulated by repeatedly tossing a coin. During the simulation process, we remember the state of the simulation. Starting in s0, we toss a coin. If the result is heads, we proceed to state s1, otherwise to s2. Each of these transitions has probability 1/2. In each of the simulator states where a result has not yet been computed, a coin is tossed and the next state is chosen, as shown in the figure. Finally, one of the states di, giving the result of the simulation, is reached. Notice that the number of tosses needed to get a result is not bounded, but the probability of not reaching any of the final states within a certain step limit becomes arbitrarily small with an increasing number of coin tosses. A possible path through this model is σ = s0 s2 s6 s2 s6 d6 ∈ Path_D(s0). The path has probability Pr_{s0}(σ) = (1/2)^5 = 1/32.

[Figure 2.1: Simulating a dice by tossing a coin. From s0, coin tosses with probability 1/2 each lead through the intermediate states s1, . . . , s6 to the result states d1, . . . , d6.]
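To make the path-probability computation concrete, here is a minimal Python sketch; the dictionary encoding of the transitions and the helper path_prob are ours, not part of the thesis:

import fractions

def path_prob(P, sigma):
    # multiply the transition probabilities along a path
    p = fractions.Fraction(1)
    for s, t in zip(sigma, sigma[1:]):
        p *= P[(s, t)]
    return p

# only the transitions used by the example path, each with probability 1/2
P = {(a, b): fractions.Fraction(1, 2)
     for a, b in [('s0', 's2'), ('s2', 's6'), ('s6', 's2'), ('s6', 'd6')]}
print(path_prob(P, ['s0', 's2', 's6', 's2', 's6', 'd6']))   # 1/32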
Obviously, a DTMC induces a labelled directed graph G_D = (S, E) where E = {(s, s′) | P(s, s′) > 0}, with the values of P as an edge labelling. For the other Markov models considered in this thesis, the situation is similar. Because of this, usual graph algorithms can easily be applied to Markov chains.
2.2 Continuous-Time Markov Chains
Definition 2.2.1 A Continuous-Time Markov Chain (CTMC) is a tuple C = (S, S0, R) where S and S0 are as in Definition 2.1.1 for DTMCs. Instead of a probability matrix, a CTMC has a rate matrix R : S × S → R≥0.
Let R(s, A) = Σ_{s′∈A} R(s, s′) for A ⊆ S. With E(s) = R(s, S) we denote the sum of all outgoing rates of a state. If E(s) = 0, we call s absorbing. Given that the system is in state s, the probability that s is left within t time units is 1 − e^{−E(s)·t}. If there is more than one state s′ such that R(s, s′) > 0, the probability to move to s′ (in a single step) when s is left is P(s, s′) = R(s, s′)/E(s). A path of a CTMC is an alternating sequence σ = s0 t0 s1 t1 . . . such that R(si, si+1) > 0 for all i ≥ 0. In this definition, ti gives the time spent in state si along the path. The notation σ@t denotes the state the Markov chain is in at time t, that is, σ@t = σ[i] where i is the smallest index with t ≤ Σ_{j=0}^{i} tj. A path is called maximal if it is infinite or ends in an absorbing state s. As in the discrete-time setting, Path_C(s) stands for the set of maximal paths through C starting in s, and Pr_s represents the unique probability measure [18] on those paths. Continuous-time systems are also called stochastic, whereas discrete-time systems are referred to as probabilistic.
[Figure 2.2: M/M/1/3 queue as a CTMC. Transitions si → si+1 with rate 5 and si+1 → si with rate 7, for states s0, . . . , s3.]
CTMCs are the model type of choice when modelling the continuous flow of time is relevant. Among others, this is the case for queueing models [11]. In such a model, customers may enter a network with a certain rate λ and leave it with rate µ. If the number of customers is limited to 3 and λ = 5, µ = 7, this is an M/M/1/3 queue, which results in the CTMC depicted in Figure 2.2. Here, si stands for the state in which there are i customers in the queue. In s0 there is only a transition to s1 with rate λ, because the queue is empty. In s3, no further customers can enter the queue, because the queue is full. In the other states, two transitions are possible, for the exit and entrance of customers respectively. Given such a CTMC, we are interested in properties such as: “What is the probability that at time 10.4 exactly two customers are in the queue?”.
We shall give two different transformations from CTMCs to DTMCs which will be
relevant for the algorithms later in the thesis.
Definition 2.2.2 The uniformized DTMC of a CTMC C = (S, S0, R) is defined by unif(C) = (S, S0, P) where P = I + Q/q, with I the identity matrix, Q = R − diag(E) and q ≥ max{E(s) | s ∈ S}, where diag(E) is the diagonal matrix with the exit rates E(s) as the entries of its main diagonal.
The rate q is called the uniformization rate.
Definition 2.2.3 The embedded DTMC emb(C) of a CTMC C = (S, S0, R) is defined as emb(C) = (S, S0, P) where P(s, s′) = R(s, s′)/E(s) for s, s′ ∈ S in case E(s) ≠ 0. If E(s) = 0, let P(s, S) = 0.
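Both transformations are easy to compute; the following Python sketch builds them for the M/M/1/3 chain of Figure 2.2. The numpy encoding is ours, and all exit rates are positive here, so the embedded DTMC is well defined:

import numpy as np

R = np.array([[0., 5., 0., 0.],
              [7., 0., 5., 0.],
              [0., 7., 0., 5.],
              [0., 0., 7., 0.]])
E = R.sum(axis=1)                 # exit rates E(s)
q = E.max()                       # uniformization rate, q >= max E(s)

P_unif = np.eye(4) + (R - np.diag(E)) / q   # Definition 2.2.2
P_emb = R / E[:, None]                      # Definition 2.2.3
print(P_unif.sum(axis=1), P_emb.sum(axis=1))  # all rows sum to 1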
2.3 Markov Decision Processes
Markov decision processes are extensions of DTMCs with non-deterministic decisions.
Non-deterministic choices may result from user inputs, sensors, etc.
Definition 2.3.1 A Markov decision process (MDP) is a tuple M = (S, S0, Act, P) where S and S0 are as for DTMCs and Act is a finite set of actions. Further, P : (S × Act × S) → [0, 1] is a three-dimensional probability matrix. For each a ∈ Act and each s ∈ S, P must satisfy P(s, a, S) ∈ {0, 1}, where P(s, a, A) = Σ_{s′∈A} P(s, a, s′) for A ⊆ S.
For each state s ∈ S, we call the action a ∈ Act enabled if P (s, a, S) = 1 and disabled
otherwise.
Definition 2.3.2 Let M = (S, S0 , Act, P) be an MDP and s ∈ S. Then the set of enabled
actions for a state s ∈ S can be given as Act(s) = {a ∈ Act | P (s, a, S) = 1}.
A path of an MDP is an alternating sequence σ = s0 a0 s1 a1 . . . such that si ∈ S, ai ∈ Act(si) and P(si, ai, si+1) > 0. A path is called maximal if it is either infinite or ends in an absorbing state s, that is, a state with Act(s) = ∅. With Path_M(s) we denote the maximal paths starting in s. In the graphical notation, we will denote states as before, and the non-deterministic decisions of a state will be denoted by smaller black circles. These black circles are labelled with their action name if the concrete name is relevant.
[Figure 2.3: Example MDP with states s0, s1, s2 and actions a, b, c. In s0, action a leads back to s0 with probability 7/10 and to s1 with probability 3/10; the remaining depicted distributions have probabilities 1 or 1/2 each.]
As an example, consider the model from Figure 2.3. Here, for the initial state s0, the actions a, b and c are enabled. We have P(s0, a, s0) = 7/10, P(s0, a, s1) = 3/10, and so on. A path through this model is, for example, s0 a s0 b s1 a s1 a s2.
2.4 Parametric Models
A parametric Markov model is a Markov model equipped with a set of parameters; given a concrete valuation of the parameters, it corresponds to a concrete non-parametric model. In this thesis, Markov models are considered in which the number of states is fixed and the parameters influence only the matrices P or R respectively. For this kind of parametric models, definitions are given in the following subsections. An idea of how to handle other types of parametric Markov models, in which parameters could influence the number of states, is discussed as future work in Chapter 6.
In the model types introduced in the following sections, matrices have functions as entries. We fix a set of variables V = {x1, . . . , xn}. We assume the domain of xi is an interval I(xi). The two kinds of functions considered here are multivariate polynomials and rational functions.
Definition 2.4.1 A polynomial f over V is a sum of monomials

f(x1, . . . , xn) = Σ_{i1,...,in} a_{i1,...,in} · x1^{i1} · · · xn^{in}

where each ij ∈ N0 and each a_{i1,...,in} ∈ R.
Definition 2.4.2 A rational function f over a set of variables V = {x1, . . . , xn} is a fraction

f(x1, . . . , xn) = f1(x1, . . . , xn) / f2(x1, . . . , xn)

of two polynomials f1, f2 over V.
So, in this definition, with V = {x, y}, f(x, y) = 2xy² + 4x³y⁴ is a polynomial and g(x, y) = x/(x + y) is a rational function. Let FV = {f : V → R} denote the set of functions from V to R, FV,poly the set of polynomials over V and FV,ratio the set of rational functions over V. For practical purposes, we restrict FV to be either the space of multivariate polynomials FV,poly or of rational functions FV,ratio.
In many cases it is necessary to restrict the evaluations of a variable to guarantee that
the model resulting from the instantiation of variables is actually a Markov model. For
discrete-time models this means that probabilities should not exceed 1 and should not be
below 0 whereas for continuous-time models rates must be non-negative. Because of this,
the intervals I(v) of the variables are often critical for the validity of the results.
2.4.1 Parametric DTMCs
Definition 2.4.3 A parametric DTMC (PDTMC) is a tuple D = (S, S0, P, V) where S and S0 are as in Definition 2.1.1 and V = {v1, . . . , vn} is a finite set of parameters. The probability matrix is a function P : S × S → FV, satisfying P(s, S) ∈ {0, 1} for each s ∈ S.
An example of a PDTMC with P in the space of polynomials and V = {x} is given in Figure 2.4.
Notice that not all variable evaluations of x represent a valid DTMC. In this example, I(x) = [0, 1] guarantees a valid DTMC for all values in this range. We introduce the notion of valid evaluations.
[Figure 2.4: Example PDTMC over V = {x}: s0 has a self-loop with probability x/2, a transition to s1 with probability x/2 and a transition to s2 with probability 1 − x; s1 returns to s0 with probability 1 − x/2, has a self-loop with probability x/4 and moves to s3 with probability x/4; s2 and s3 have self-loops with probability 1.]
Definition 2.4.4 An evaluation g : V → R of the variables of a PDTMC D = (S, S0, P, V) is valid if Dg = (S, S0, Pg), with Pg(si, sj) = P(si, sj)(g(v1), . . . , g(vn)), is a DTMC.
[Figure 2.5: Simulating a dice by tossing a biased coin. The structure is that of Figure 2.1, with each coin toss yielding one successor with probability x and the other with probability 1 − x.]
As another example, in Figure 2.5 a parametric variant of the dice model from Section 2.1 is given. Here, the probability that the coin shows heads is x and the probability that it shows tails is 1 − x. Note that the values for x are restricted to the interval [0, 1].
2.4.2 Parametric CTMCs
Definition 2.4.5 A parametric CTMC (PCTMC) is a tuple (S, S0, R, V) where S and S0 are as in Definition 2.2.1. As in Section 2.4.1, V is a finite set of parameters. The transition rate matrix is a function R : S × S → FV.
An example of a PCTMC is given in Figure 2.6. The model is a parametric version of the M/M/1/3 model where, in contrast to Figure 2.2, the rates are not fixed. However, they are restricted to non-negative values.
[Figure 2.6: Parametric M/M/1/3 queue: transitions si → si+1 with rate λ and si+1 → si with rate µ.]
Definition 2.4.6 An evaluation g : V → R of the variables of a PCTMC C = (S, S0 , R, V )
is valid if Cg = (S, S0 , Rg ) with Rg (si , sj ) = R(si , sj )(a1 , . . . , an ), ai = g(vi ), vi ∈ V is a
CTMC.
2.4.3 Parametric MDPs
Definition 2.4.7 A parametric MDP (PMDP) M is a tuple (S, S0, Act, P, V) where S, S0 and Act are as for MDPs. As in Section 2.4.1, V is a finite set of parameters. The transition probability matrix is a function P : S × Act × S → FV, where for each s ∈ S and a ∈ Act we have P(s, a, S) ∈ {0, 1}.
An example of a PMDP is given in Figure 2.7; it corresponds to the non-parametric example of Figure 2.3. Validity is defined analogously to Definitions 2.4.4 and 2.4.6 for PDTMCs and PCTMCs, respectively.
2.5 Reward Models
Markov reward models allow for reasoning about properties involving costs or bonuses. For example, consider the model of an automaton which, while it is working, can be used to acquire a certain amount of profit per hour. Given a description of the system, which includes a depiction of the system structure, possible sources of system failures, etc., plus the reward – the amount of money acquired – properties like the expected hourly income in the long run, or the probability that the system has gained x or more units of money within 10 hours, can be considered. For all Markov model types given above, extensions with rewards are possible.
Definition 2.5.1 A DMRM is a tuple R = (D, r) where D = (S, S0 , P) is a DTMC and
r : S ∪ (S × S) → R is a reward assignment function.
[Figure 2.7: Example PMDP, the parametric counterpart of the MDP of Figure 2.3, with probabilities x, 1 − x, y and 1 − y in place of the fixed values.]
Definition 2.5.2 A PDMRM is a tuple R = (D, r) where D = (S, S0 , P, V ) is a PDTMC
and r : S ∪ (S × S) → FV is a reward assignment function.
For each state s ∈ S, r(s) is the reward that is gained for staying in s for one step, whereas r(s, s′) denotes the reward for taking the transition from s to s′. Orthogonal extensions are possible for CTMCs (CMRM) as well as for MDPs and parametric models (PDMRM, PCMRM). For parametric models, we define r to have its range in FV (a space of functions) instead of the real values. As for transition probabilities and rates, the values of r will be assumed to be polynomials or rational functions.
[Figure 2.8: Example reward model. State s0 carries reward 1 and state s1 reward 0; the transition s0 → s1 has rate 0.001 and reward 0, and the transition s1 → s0 has rate 0.1 and reward −2.]
Rewards may express both costs and benefits, depending on the model and the property to be expressed. For the model described above, a quite abstract CMRM version could be R = (C, r) with C = ({s0, s1}, {s0}, {(s0, s1, 0.001), (s1, s0, 0.1)}) and r = {(s0, 1), (s1, 0), ((s1, s0), −2)}. So, initially the system is in the working state s0, in which an amount of one unit of money is acquired. It fails with a rate of 0.001. While it is broken, no money is earned. It is repaired with a rate of 0.1, and a repair costs two units of money. In Figure 2.8 we depict a graphical representation.
Chapter 3
Algorithms for Parametric Models
This chapter introduces several algorithms in the form in which they are actually implemented in our tool Param. The organization of this chapter is as follows: In Section 3.1 we discuss an algorithm for bounded reachability for PDTMCs. Section 3.2 is about bounded reachability for PCTMCs. In Section 3.3, techniques from Section 3.2 are extended to handle parametric time for PCTMCs. Section 3.4 handles the problem of unbounded-time reachability for Markov models. In Section 3.5 the problem of reachability reward properties is considered. In Section 3.6 we show how to handle parametric MDPs and, especially, how we can apply the techniques previously introduced for PDTMCs to them.
3.1 Bounded Reachability for PDTMCs
Computing the bounded reachability for a DTMC D means computing the probability to reach a set of target states B from a state s0 ∈ S0 within n steps. This probability is denoted by P_D(s, n, B). If it is clear from the context which Markov model is meant, we will leave out the index D. The states of B are assumed to be absorbing. If this is not the case, it can be fixed beforehand by removing all outgoing transitions of states of B. This transformation does not affect the probability to reach B from a state s0 ∈ S0, because the probability to reach B from s0 only depends on the prefixes of the paths starting in s0 which end in B.
Let D = (S, S0, P) be a DTMC. Then, for all s ∈ S:

P(s, 0, B) = 1 if s ∈ B, and P(s, 0, B) = 0 else.

For all n > 0:

P(s, n, B) = 1 if s ∈ B, and P(s, n, B) = Σ_{s′∈S} P(s, s′) · P(s′, n − 1, B) if s ∉ B.
The meaning of these equations is the following: In case 0 steps are left to reach the target states, the probability is 1 if and only if the state considered is itself a target state. If more than 0 steps are left, the probability is 1 if the state considered is a target state. Otherwise, the probability is calculated from the probability of moving to a successor of the state and then reaching the target states within the n − 1 steps that are left. This set of equations leads to an algorithm in which we first calculate P(s, 0, B) for each s ∈ S, then use these values to compute P(s, 1, B), and so on until P(s, n, B).
This algorithm can be applied to PDTMCs directly by just adding and multiplying the functions like ordinary numbers. If the PDTMC transitions are given as polynomials, the result is also a polynomial, because only addition and multiplication are used, operations under which polynomials are closed.
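The following Python sketch implements this iteration with exact polynomial arithmetic via sympy, on the PDTMC of Figure 2.4 with B = {s3}; the matrix encoding is ours:

import sympy

x = sympy.symbols('x')
# P[s][t] = transition probability function from state s to state t
P = [[x/2,     x/2, 1 - x, 0  ],   # s0
     [1 - x/2, x/4, 0,     x/4],   # s1
     [0,       0,   1,     0  ],   # s2 (absorbing)
     [0,       0,   0,     1  ]]   # s3 (absorbing target)
B = {3}

def bounded_reach(P, B, n):
    # returns the polynomials P(s, n, B) for all states s
    prob = [sympy.Integer(1 if s in B else 0) for s in range(len(P))]
    for _ in range(n):
        prob = [sympy.Integer(1) if s in B else
                sympy.expand(sum(P[s][t] * prob[t] for t in range(len(P))))
                for s in range(len(P))]
    return prob

print(bounded_reach(P, B, 3)[0])   # 3*x**3/32 + x**2/8, as derived below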
[Figure 3.1: Algorithm for bounded reachability properties of PDTMCs, illustrated on the PDTMC of Figure 2.4 with target state s3. For each step n = 0, . . . , 3, the values P(s, n, B) are annotated at the states; at s0 they grow from 0 (steps 0 and 1) to x²/8 (step 2) and 3x³/32 + x²/8 (step 3).]
The algorithm is illustrated in Figure 3.1 for the sample PDTMC from Figure 2.4. We want to compute the probability that from the initial state (marked by an incoming arrow) the target state (marked by an additional circle) is reached within 3 steps. For each step n, the values P(s, n, B) are given in the states. The result is P(s0, 3, B) = (3/32)x³ + (1/8)x².
[Figure 3.2: Probability P(s0, n, B) to reach B in n steps in Figure 3.1, plotted over x ∈ [0, 1] for n = 2, 3, 4, 5.]
A plot of the probabilities to reach B in n steps is given in Figure 3.2 for several values of n. The following list gives the results as polynomials:

• P(s0, 2, B)(x) = (1/8)x²
• P(s0, 3, B)(x) = (3/32)x³ + (1/8)x²
• P(s0, 4, B)(x) = (3/128)x⁴ + (5/32)x³ + (1/8)x²
• P(s0, 5, B)(x) = −(9/512)x⁵ + (15/128)x⁴ + (5/32)x³ + (1/8)x²
Notice that if exact arithmetic is used for the calculations with polynomials, the results here are exact, at once for all values x ∈ I(x).
Consider again the dice model with a biased coin of Figure 2.5. Let us assume that we have a player who wins if the dice shows 6, but who is willing to toss the coin no more than n times. For each n, the result is a polynomial. For several n, the results are depicted in Figure 3.3 and listed in the following:

• P(s0, 3, {d6})(x) = x³ − 2x² + x
• P(s0, 5, {d6})(x) = x⁵ − 4x⁴ + 7x³ − 6x² + 2x
• P(s0, 7, {d6})(x) = x⁷ − 6x⁶ + 16x⁵ − 24x⁴ + 22x³ − 12x² + 3x
• P(s0, 9, {d6})(x) = x⁹ − 8x⁸ + 29x⁷ − 62x⁶ + 86x⁵ − 80x⁴ + 50x³ − 20x² + 4x
• P(s0, 11, {d6})(x) = x¹¹ − 10x¹⁰ + 46x⁹ − 128x⁸ + 239x⁷ − 314x⁶ + 296x⁵ − 200x⁴ + 95x³ − 30x² + 5x
• P(s0, 13, {d6})(x) = x¹³ − 12x¹² + 67x¹¹ − 230x¹⁰ + 541x⁹ − 920x⁸ + 1163x⁷ − 1106x⁶ + 791x⁵ − 420x⁴ + 161x³ − 42x² + 6x
As expected, with a growing number of steps allowed, this probability increases. The value for x (the probability that heads is tossed) which maximizes the probability to throw a 6 is different for each step limit. For n = 3, 4 it is ≈ 0.3333, for n = 5, 6 it is ≈ 0.2707, etc. As seen later in Figure 3.11, for n → ∞ the maximizing value tends to x → 0 and the maximum probability to throw a 6 goes towards 1/2. Notice, however, that for x = 0 the probability is zero, which means that the probability to throw a 6 never becomes 1/2 or higher. The results are equal for the pairs (3, 4), (5, 6), etc. The reason is that after the coin has been tossed an odd number of times, the process is either in one of the states di or in one of the states s1 or s2, and from those two states at least two further steps are needed to reach a target state.
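These maximizing values can be cross-checked by ordinary calculus on the result polynomials; a small sympy sketch, with a naive root filtering of our own:

import sympy

x = sympy.symbols('x')
for n, p in [(3, x**3 - 2*x**2 + x),
             (5, x**5 - 4*x**4 + 7*x**3 - 6*x**2 + 2*x)]:
    roots = sympy.Poly(sympy.diff(p, x), x).nroots()
    cands = [sympy.re(r) for r in roots
             if abs(sympy.im(r)) < 1e-12 and 0 < sympy.re(r) < 1]
    best = max(cands, key=lambda r: p.subs(x, r))
    print(n, best, p.subs(x, best))
# n = 3: maximum at x ~ 0.3333; n = 5: maximum at x ~ 0.2707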
3.2 Bounded Reachability for PCTMCs
Computing the bounded reachability for a CTMC C means computing the probability that from a given state s0 ∈ S0 a set of target states B is reached within t time units. This is denoted by P_C(s0, t, B), where we may leave out the index C if appropriate, as before. The difference to the problem described in Section 3.1 is that within any time bound t, any number of jumps – that is, transitions from one state to another – may occur in the CTMC, though with a small probability for very large or small numbers of transitions.
The probability π(t)(s) that a CTMC C = (S, S0, R) is in a given state s at a given point of time t is the solution of the differential equation

π̇(t) = Q π(t).

It can be calculated [25] by

π(t) = π(0) e^{Q·t}.

Thereby, Q is the infinitesimal generator matrix where
• Q(s, s′) = R(s, s′) for s ≠ s′
• Q(s, s) = −E(s) + R(s, s)

So, instead of self-loops, Q contains on its diagonal the negative sum of the rates of transitions to other states. The vector π(0)(s) is an initial distribution giving the probability to be in state s at time 0.

[Figure 3.3: Probability of throwing a 6 in no more than n steps, plotted over the probability x of heads for n = 3/4, 5/6, 7/8, 9/10, 11/12 and 13/14.]
3.2.1 Exponomials
The entries π(t)(s) can be expressed as exponomials [23], that is, functions of the form

Σ_{i=1}^{n} a_i · t^{k_i} · e^{b_i·t}

where the k_i and n are integers with k_i ≥ 0, n ≥ 1, and a_i, b_i ∈ C. Notice that though the shape of this definition looks simple, concrete instantiations for parametric models may not be. For example, making state s3 of the example from Figure 2.6 absorbing and using the computer algebra system Maple to calculate π(t)(s3) exactly leads to a formula of several pages (after simplification). The reason for this is that the a_i and b_i may depend on variables of the Markov chain and may be of a rather complicated form. Also, to compute the exponomial, the eigenstructure of parametric matrices has to be considered. Calculating eigenvalues is already a complicated problem for non-parametric matrices. Exact solutions are not possible in many cases, or are too computationally expensive. Approximate solutions may suffer from problems like stiffness [25] and other numerical problems. Because of these problems, we chose not to consider the approach based on the eigenstructure of matrices in this thesis, but another method instead.
3.2.2 Uniformization
Another method to calculate bounded reachability probabilities for CTMCs, which is more feasible in the parametric case, is called uniformization. For this method, given a CTMC, the uniformized DTMC is considered (see Definition 2.2.2).

Let t be the time bound of the reachability property and q be the uniformization rate, where it is required that q ≥ max{E(s) | s ∈ S}. The calculations are based on the Poisson distribution P_{q·t}(X = k), which gives the probability that exactly k jumps occur within t time units. This distribution takes its largest value around k ≈ q · t and decreases as k becomes smaller or larger than this value. Because of this, for calculations up to a certain precision, it is possible to give a left and a right bound for the non-negligible probabilities. The common method to calculate values of the Poisson distribution up to a certain precision ε, together with left and right bounds, is the Fox-Glynn algorithm [10]. Computing the bounded reachability for C is now based on using this algorithm to get the probability that k steps are taken within t time units. The uniformized DTMC unif(C) is used to determine in which state the CTMC is after the kth step.
The variant of this method used here is given in [17]. The algorithm from this paper is given as Algorithm 1.

Algorithm 1 Algorithm from [17] used to calculate bounded reachability for CTMCs
1.1: P_{q·t}, Lε, Rε := FoxGlynn(q · t, ε)
1.2: sol := 0
1.3: b := l_B
1.4: for k = 1 to Lε − 1 do
1.5:    b := P · b
1.6: end for
1.7: for k = Lε to Rε do
1.8:    b := P · b
1.9:    sol := sol + P_{q·t}(X = k) · b
1.10: end for
1.11: return sol

Here, P_{q·t}(X = k) is the approximated Poisson distribution, and Lε, Rε are the left and right bounds, respectively, when calculating the Poisson probabilities up to a certain precision ε. Furthermore, l_B is a vector in which the entry for a state s is 1 if s is in B and 0 otherwise. For each state s, the result vector sol is supposed to give the probability to reach B from s within t time units. In lines 1.4 to 1.10 this vector is computed. The matrix-vector multiplications in lines 1.5 and 1.8 correspond to state transitions in unif(C). The first loop just executes the first Lε − 1 multiplications, as the probability that fewer than Lε jumps occur is negligible. In the second loop, for all k for which the probability that exactly k jumps happen is non-negligible, the probabilities to reach B within exactly k jumps, weighted by the Poisson probabilities, are added up.
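For the non-parametric case, the loop structure of Algorithm 1 is easy to reproduce; the Python sketch below uses scipy's Poisson probabilities and a naive right truncation bound in place of the Fox-Glynn routine, on the M/M/1/3 chain of Figure 2.2 with s3 made absorbing (all encodings are ours):

import numpy as np
from scipy.stats import poisson

R = np.array([[0., 5., 0., 0.],
              [7., 0., 5., 0.],
              [0., 7., 0., 5.],
              [0., 0., 0., 0.]])            # target s3 made absorbing
E = R.sum(axis=1)
q = E.max()                                  # uniformization rate
P = np.eye(4) + (R - np.diag(E)) / q         # unif(C), Definition 2.2.2

def bounded_reach(P, q, t, B):
    b = np.array([1. if s in B else 0. for s in range(len(P))])
    sol = poisson.pmf(0, q * t) * b          # k = 0 term
    K = int(q * t + 10 * np.sqrt(q * t) + 10)  # naive truncation bound
    for k in range(1, K + 1):
        b = P @ b
        sol = sol + poisson.pmf(k, q * t) * b
    return sol

print(bounded_reach(P, q, t=1.0, B={3}))     # P(s, 1.0, {s3}) per state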
As for bounded reachability of discrete-time models, this approach can be extended to PCTMCs. To apply the method to parametric models, some points have to be taken care of. As noted before, the uniformization rate q has to be greater than or equal to E(s) for all states s ∈ S. However, because the rates of a PCTMC depend on parameters, such a bound is not available in general, unless the parameters of the model are bounded to a certain interval.
As an example, consider the parametric M/M/1/3 model from Figure 2.6. If we limit λ + µ to 2, we can uniformize with q = 2, as shown in Figure 3.4. In case we use this q as the uniformization rate but evaluate the resulting polynomial at λ + µ > 2, the results may differ from the exact value by more than ε: the uniformization rate was chosen too small for these values, and in consequence the truncation points were chosen too low.
[Figure 3.4: Uniformized M/M/1/3 PDTMC for q = 2: transitions si → si+1 with probability λ/2 and si+1 → si with probability µ/2; self-loops with probability 1 − λ/2 at s0, 1 − λ/2 − µ/2 at s1 and s2, and 1 − µ/2 at s3.]
Uniformization has the advantage that we do not have to consider the eigenstructure of matrices, which turned out not to be feasible, as mentioned before. Notice, however, that the result computed here is not exact, even if exact arithmetic is used, so in all cases an approximate solution will result. When evaluating the function returned by the algorithm, we must be aware of possible problems, as described in Section 5.1.3.
3.3 Parameterization on Time
In the algorithm in Section 3.2 the probability to reach B within a fixed time bound t
was considered. In this section it shall be shown how to compute these probabilities for
all t with 0 ≤ t ≤ tmax for a given tmax > 0 at once. Here tmax is a maximal time bound
that is of interest for some reachability property.
Definition 3.3.1 The time-parametric PCTMC of a PCTMC C = (S, S0, R, V) is the PCTMC Ct = (S, S0, R′, V′) where V′ = V ∪̇ {t} and R′ = R · t. We call the variable t the time-parameter of Ct.
In this PCTMC, each rate of the original PCTMC is multiplied by a parameter t giving the time bound. Then, the following holds:

Lemma 3.3.2 The probability to reach B within t time units in C equals the probability to reach B within 1 time unit in Ct, given parameter t.

Proof Let Q be the infinitesimal generator matrix of C and Q_t be the infinitesimal generator matrix of Ct. Consider the exact probabilities π(C)(t) = π(C)(0) e^{Q·t} of being in a given state of C at time t. By definition of the time-parametric PCTMC, it holds that π(C)(0) = π(Ct)(0). Also, e^{Q_t·1} = e^{Q·t}. Because of this, π(C)(t) = π(Ct)(1). So, as those probabilities are equal for each state of S ⊇ B, they are also equal for the target states B.
In combination with the uniformization approach, this method can be used to get approximating polynomials for the bounded-time reachability for all t ≤ tmax. Of course, it can also be used to get time-parametric CTMCs from non-parametric ones. Assume we are given a PCTMC C and a set of target states B. Further assume that we have a maximum time bound tmax and that we are interested in all P(s0, t, B) with s0 ∈ S0 and 0 ≤ t ≤ tmax. Assume we also have a uniformization rate q for C. Now we transform C into the time-parametric PCTMC Ct. For the parameter evaluations of interest in C and all 0 ≤ t ≤ tmax, due to the definition of Ct, the number q0 = q · tmax is greater than or equal to all rates occurring in Ct. Because of this, we can uniformize Ct with the uniformization rate q0. If we apply Algorithm 1 to unif(Ct) for a time bound of 1, we obtain a formula that is parametric in the parameters of C but also in the time t.
In [25], a method is given in which bounded reachability probabilities for CTMCs are calculated for several points of time. In the method described there, when calculating the probability for time t, results from t′ < t can be reused. For this reason, with that method the overall effort when computing values for a set of time points is lower than performing the computations for all of them individually. With the approach presented here, however, the result is a function providing an approximation for all points of time up to a certain maximal one at once.
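A runnable sketch of this combination, for the concrete M/M/1/3 chain (λ = 5, µ = 7, target s3 made absorbing), is given below: the rates are multiplied by a symbolic t, the chain is uniformized with q0 = q · tmax, and the loop of Algorithm 1 is run for time bound 1 with exact Poisson weights. The fixed truncation bound K replaces Fox-Glynn, and all encodings are ours:

import sympy

t = sympy.symbols('t', nonnegative=True)
tmax, q = 1, 12                          # q >= maximal exit rate 5 + 7
q0 = q * tmax
R = sympy.Matrix([[0, 5, 0, 0],
                  [7, 0, 5, 0],
                  [0, 7, 0, 5],
                  [0, 0, 0, 0]]) * t     # rate matrix of Ct
E = [sum(R.row(i)) for i in range(4)]    # exit rates
P = sympy.eye(4) + (R - sympy.diag(*E)) / q0   # unif(Ct)

b = sympy.Matrix([0, 0, 0, 1])           # l_B for B = {s3}
sol = sympy.exp(-q0) * b                 # Poisson term for k = 0
for k in range(1, 41):                   # naive truncation, K = 40
    b = P * b
    sol = sol + sympy.exp(-q0) * sympy.Integer(q0)**k / sympy.factorial(k) * b
poly = sympy.expand(sol[0])              # approximates P(s0, t, B) for 0 <= t <= tmax
print(sympy.N(poly.subs(t, 1)), sympy.N(poly.subs(t, sympy.Rational(1, 2))))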
3.4 Unbounded Reachability for Markov Chains
Computing the unbounded reachability for a DTMC D means computing the probability that, from a state s0 ∈ S0, a set of target states B is eventually reached. This probability shall be denoted by P_D(s, B) or P_D(s, ∞, B). As before, we may leave out the index D if appropriate. In contrast to bounded reachability, where the number of steps allowed to reach B was limited, we do not restrict the number of steps here.

In the following, two methods for PDTMCs are presented. For a PCTMC, unbounded-time reachability can be derived from the embedded PDTMC [25]. Thus, we consider only PDTMCs.
For all s ∈ S, this probability can be expressed [7] as

P(s, B) = 1 if s ∈ B,
P(s, B) = 0 if s ∈ B̄,
P(s, B) = Σ_{s′∈S} P(s, s′) · P(s′, B) else.

Here, B̄ denotes the set of all states which cannot reach B with a non-zero probability at all. This set can be calculated beforehand: let R be the set of states that can be reached in the reversed graph of D starting at some state of B. R can be computed, for example, by a depth-first search. Then B̄ = S \ R.
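This precomputation is a plain graph search; a small Python sketch, with a graph encoding and names of our own:

def cannot_reach(succ, states, B):
    # succ[s] = set of successors of s (transitions with positive probability)
    pred = {s: set() for s in states}        # build the reversed graph
    for s in states:
        for t in succ[s]:
            pred[t].add(s)
    reach = set(B)                            # states that can reach B
    stack = list(B)
    while stack:                              # depth-first search from B
        for p in pred[stack.pop()]:
            if p not in reach:
                reach.add(p)
                stack.append(p)
    return set(states) - reach                # this is the set B-bar

# states of Figure 2.4: the non-target sink s2 is exactly the set returned
succ = {0: {0, 1, 2}, 1: {0, 1, 3}, 2: {2}, 3: {3}}
print(cannot_reach(succ, [0, 1, 2, 3], B={3}))   # {2}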
The linear equation system given above can be solved using Gaussian elimination. In [8], an alternative approach to Gaussian elimination was introduced, which has the following advantages:
• it allows finding the probability to reach several sets of target states at once,
• it is not necessary to identify the set B̄ of states which cannot reach the target states B at all.
We first recall the method introduced in [8] to handle PDTMCs by exploiting state-elimination. Then we propose a few important extensions developed in this thesis.
3.4.1 State-Elimination Approach
In [8], PDTMCs with a single initial state are transformed into finite automata. Thereby, the states stay the same, the initial state also stays the same, and the target states become the final states of the automaton. Transition probabilities are described by symbols of the automaton's alphabet of the form p/q or x, representing rational numbers or variables. Afterwards, the regular expression (with multiplicities, that is, e.g. a|a may not be shortened to a) describing the language of such an automaton is calculated. Then, using a function val : regex → FV,ratio, these regular expressions are transformed into rational functions representing the probability to eventually reach the target states. The recursive definition of val is given below:

1. val(p/q) = p/q
2. val(x) = x, x ∈ V
3. val(r|s) = val(r) + val(s)
4. val(r.s) = val(r) · val(s)
5. val(r*) = 1 / (1 − val(r))
Cases 1 and 2 are the base cases of the recursive definition of val. If the regular expression is a rational number or a variable, the resulting value is just that value. In case of an alternative, as in 3, the probabilities of the two possible choices have to be added up. In case of a sequence, as in 4, they have to be multiplied. In case 5, r* represents the possible evaluations ε, r, rr, rrr, . . .; the probability of each of them has to be added up. This leads to the formula

val(r*) = Σ_{i=0}^{∞} val(r)^i

which is a geometric series, implying that val(r*) = 1/(1 − val(r)).
[Figure 3.5: Example finite automaton obtained from the PDTMC of Figure 2.4: s0 has a self-loop a, a transition b to s2 and a transition c to s1; s1 has a transition d back to s0, a self-loop e and a transition f to the final state s3; s2 has a self-loop 1.]
To illustrate the approach, consider again the model from Figure 2.4 with B = {s3}. Viewing this as the finite automaton shown in Figure 3.5, the following regular expression describes its language:

r = a*c(e|(da*c))*f

with a = x/2, b = 1 − x, c = x/2, d = 1 − x/2, e = x/4, f = x/4. Using the definition of val and inserting the values for a, b, c, d, e, f, we get

val(r) = x² / (8 − 10x + 3x²).
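This value can be reproduced mechanically by applying the val rules in a computer algebra system; a quick sympy check of our own:

import sympy

x = sympy.symbols('x')
a, c, d, e, f = x/2, x/2, 1 - x/2, x/4, x/4   # labels of Figure 3.5

star = lambda v: 1 / (1 - v)                   # rule 5: val(r*) = 1/(1 - val(r))
val_r = star(a) * c * star(e + d * star(a) * c) * f
print(sympy.cancel(val_r))                     # x**2/(3*x**2 - 10*x + 8)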
A graph of this function is given in Figure 3.6.

[Figure 3.6: Evaluations of x²/(8 − 10x + 3x²) for x ∈ [0, 1].]

In [8], Daws considers the usage of state-elimination for finite automata to obtain the regular expression to evaluate. State-elimination is a standard method to transform finite automata into regular expressions. It removes all states from an automaton except the initial and final ones while maintaining its language. For this algorithm, edges are allowed to be labelled with regular expressions instead of just symbols of the alphabet of the initial automaton. If there is more than one initial or final state, this can be fixed beforehand by using ε-labelled transitions.
In Figure 3.7 it is shown how to remove a state s″. Let s → s′ denote that there is a direct transition from s to s′ in the finite automaton. Consider the set A(s″) = {(s, s′) | s → s″ ∧ s″ → s′}. That is, A(s″) is the set of all pairs (s, s′) where s is a state that has a direct transition to s″, and s″ has a direct transition to s′. In the figure, state s″ is eliminated by replacing the edge label d by (ac*b)|d. This is possible because, instead of moving directly from s to s′ by d, there is the alternative to move to s″ by a, stay there by taking c zero or more times, and then leave to s′ by b. Notice that a, b, c and d may already be complex regular expressions resulting from former eliminations. If there is no former direct connection between s and s′, the alternative choice is left out. Non-target states without transitions leaving the state can just be removed together with all their incoming transitions. Also notice that s = s′ is not forbidden; in this case, a self-loop will be added or modified.
[Figure 3.7: State-elimination to get regular expressions from automata. Left: s reaches s″ by a, s″ has a self-loop c and reaches s′ by b, and s reaches s′ directly by d. Right: s″ has been eliminated, and the edge from s to s′ is relabelled (ac*b)|d.]
In Figure 3.8 this technique is demonstrated on the example of Figure 3.5. In box 1, the original model is given. In box 2, the former initial state has been made non-initial, and instead a new initial state with no incoming transitions has been added. An ε-transition connects the new initial state with the former one. Now the non-target sink state s2 is eliminated. The result of this is depicted in box 3: because s2 did not have any outgoing non-self-loop transitions, it was just removed together with its incoming transitions. Now we eliminate s0. The effect of this is shown in box 4. Before, it was possible to move from s̃0 to s0 by ε, then stay in s0 by taking the a-labelled transition, and finally move to s1. This has been replaced by a transition labelled with a*c from s̃0 to s1. It was also possible to move from s1 to s0 with d, then stay in s0 with a, and move back to s1 by c. This possibility is replaced by the regular expression da*c. Because s1 already had a self-loop labelled with e, this label is changed to e|(da*c). Now s1 is removed, resulting in box 5. The final regular expression is given in box 6; it is equal to the transition label from the initial to the final state.

Notice that this algorithm is not confluent with respect to the regular expressions returned; different elimination orders may lead to different regular expressions. For example, if in Figure 3.8 we had eliminated the states in the order s2, s1, s0, we would have obtained the regular expression (a|(ce*d))*ce*f instead of a*c(e|(da*c))*f. State-elimination is confluent, however, with respect to the language of the regular expressions generated. Because of this, when keeping multiplicities, the rational functions obtained by val are also the same [8]. For this reason, the elimination order can be chosen in a way that is computationally advantageous.
[Figure 3.8: Example for state-elimination on the automaton of Figure 3.5, in six boxes: (1) the original model; (2) a new initial state s̃0 without incoming transitions is added and connected to s0 by ε; (3) s2 is eliminated; (4) s0 is eliminated, yielding an edge a*c from s̃0 to s1 and the self-loop e|(da*c) at s1; (5) s1 is eliminated; (6) the resulting regular expression a*c(e|(da*c))*f.]
3.4.2 Extensions
In the following, a number of extensions to the approach put forward previously are described. In the form given here, they are implemented in the tool Param developed along with this thesis. A further extension, transforming PMDPs to PDTMCs in order to exploit state-elimination, is considered in Section 3.6. In Section 3.5.2 we extend the approach to handle a certain reward property.
[Figure 3.9: State-elimination to get rational functions from automata: eliminating s″ replaces the label d of the edge from s to s′ by a · 1/(1 − c) · b + d.]
Computing Rational Functions Directly. Instead of first computing a regular expression and then transforming it into a rational function, it is also possible to compute the rational function directly, by using the modified rule for state-elimination given in Figure 3.9. This was also conjectured in [8], but not pursued further there.

When using state-elimination directly, we have to use sub-stochastic PDTMCs, in which we allow P(s, S) < 1. This is necessary if we want to eliminate a non-target sink state s: it is possible that for a state s′ with a direct transition to s we have 0 ≤ P(s′, S \ {s}) < 1, so the elimination of s leaves s′ with an invalid probability sum. This is no large problem, however. If a state becomes sub-stochastic, this can only happen because a sink state was eliminated, and the probability that is missing is the probability with which the state would move to the sink state. If we define the probability measure of paths as before, we have the same probability of reaching B, as paths to a non-target sink state would never reach B. So, we can eliminate non-target sink states without changing the reachability probability.
The removal of non-sink, non-target states does not affect the probability to reach B either, if done with our modified state-elimination method. Assume we remove a non-initial, non-target state s″. We have to consider the paths in the old and the new model. As depicted in Figure 3.9, we consider each tuple (s, s′) of states where s is a state with a transition to s″ and s′ is a state with a transition from s″. If P(s, s″) = a, P(s″, s′) = b and there is a self-loop with P(s″, s″) = c, the probability to first move from s to s″, stay in s″ for a while, and then leave to s′ can be calculated as a · 1/(1 − c) · b. So, the probability to move from s to s′ over s″, or directly from s to s′ in one step, can be computed as a · 1/(1 − c) · b + d, if P(s, s′) = d. The probability of all paths starting in s0 and ending in the target states which had a detour over s″ is the same as the probability of the paths where the detour has been replaced by the modified direct transition from s to s′. Because of this, we can eliminate s″ and modify P(s, s′) as described while maintaining the reachability probability.
The state-elimination process for computing rational functions has, in principle, the same sequence of steps as the previous one. First, all states except the initial and the final state are eliminated. After this, there is only one transition left, from the initial to the final state. This transition is labelled with the probability to eventually move from the initial to the final state in the original model. Because the removal of states does not change the probability to reach the target states, the order of elimination does not influence the result acquired, but may only influence the performance of the algorithm.
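A compact Python sketch of this direct elimination, using sympy rational functions on the PDTMC of Figure 2.4 with B = {s3}; the dictionary encoding and the elimination order are ours:

import sympy

x = sympy.symbols('x')
# transition functions; self-loops of absorbing states are omitted
P = {(0, 0): x/2, (0, 1): x/2, (0, 2): 1 - x,
     (1, 0): 1 - x/2, (1, 1): x/4, (1, 3): x/4}

def eliminate(P, states):
    P = dict(P)
    for v in states:
        loop = P.pop((v, v), sympy.Integer(0))
        ins = {s: p for (s, t), p in P.items() if t == v}
        outs = {t: p for (s, t), p in P.items() if s == v}
        P = {(s, t): p for (s, t), p in P.items() if v not in (s, t)}
        for s, a in ins.items():              # rule of Figure 3.9
            for t, b in outs.items():
                d = P.get((s, t), sympy.Integer(0))
                P[(s, t)] = sympy.cancel(a * (1 / (1 - loop)) * b + d)
    return P

P2 = eliminate(P, [2, 1])                     # drop the sink s2, then s1
loop = P2.pop((0, 0), sympy.Integer(0))       # absorb the remaining self-loop
print(sympy.cancel(P2[(0, 3)] / (1 - loop)))  # x**2/(3*x**2 - 10*x + 8)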
In the following paragraphs, we will show how to extend this approach to several initial
and target states. We will also consider certain special cases where the result evaluation
turns out to be invalid.
Multiple Initial States. In [8], only PDTMCs with a single initial state were handled. We discuss how to handle a set of initial states. Let D = (S, S0, P, V) be a PDTMC. We define D′ = (S′, S̃0, P′, V) by: S′ = S ∪̇ S̃0, where S̃0 = {s̃0 | s0 ∈ S0}, that is, for each original initial state s0 in S0 there is a copy s̃0 in S̃0. The original initial states are still states of the model, but are not initial anymore. Let

P′(s, s′) = P(s, s′) if s, s′ ∈ S,
P′(s, s′) = 1 if s = s̃0 ∈ S̃0 and s′ = s0 ∈ S0 is the corresponding original initial state,
P′(s, s′) = 0 else.

When eliminating all states except those of S̃0, we get the reachability probabilities for all initial states of the original model. A state s̃0 ∈ S̃0 cannot be reached from any other state s̃0′ ∈ S̃0 with s̃0 ≠ s̃0′. So, the probability that from s̃0 the set of target states is reached is not influenced by s̃0′, and it is not necessary to eliminate s̃0′ for calculating the probability for s̃0. It follows that, when eliminating all states of D′ not in S̃0, the probability for each original state of S0 to reach the set of target states can be obtained by considering the corresponding state in S̃0.
Multiple Target States. In state-elimination for regular expressions, the usual approach is also to introduce a single target state, inserting ε-transitions from all former target states in B to this one. For Markov chains, this is not really necessary, as it is possible to just eliminate all states except those from S̃0 and B, assuming that initial states do not have any incoming transitions and that target states are absorbing. Assume that all other states have been eliminated, such that the remaining model, denoted by D″, only consists of initial and target states. For each s0 ∈ S0 and s ∈ B, if there is a transition from s̃0 to s, it is labelled with the probability to reach s from s0. This means that for each initial state s0 we can just add up the labels of the outgoing transitions of s̃0 to get the probability to reach the states B: P_D(s0, B) = P_{D″}(s̃0, B).

Now assume that we do not have one single set of target states B, but several disjoint target sets B1, . . . , Bn. To handle this situation, we can just eliminate the non-initial, non-target states as before. Then, for each s0 ∈ S0, we add up the labels from s̃0 individually for each Bi: P_D(s0, Bi) = P_{D″}(s̃0, Bi).
Invalid Evaluations. In [8] it is stated that the possibility that val(r*) is undefined, as may be the case if val(r) = 1 and val(r*) is to be evaluated, is irrelevant. However, this is not always true, as shown in the example PDTMC of Figure 3.10.

[Figure 3.10: Example of a PDTMC where certain values need special treatment: s0 moves to s1 and to the target s2 with probability 1/2 each; s1 has a self-loop with probability x and a transition to s2 with probability 1 − x.]
Here, we have val(r) = (1/2) · (1/(1 − x)) · (1 − x) + 1/2. For x ≠ 1, we can shorten the term to 1. However, if x = 1, then s1 is a (non-target) sink state and the probability to reach s2 is just 1/2.

The problem of possible evaluation-dependent divisions by zero can occur if, during the state-elimination, a state with both leaving transitions and a self-loop is removed. Let p be the label of the self-loop transition of a state with leaving transitions that is going to be eliminated. If p = 1 for an evaluation g : V → R of the variables V of the PDTMC, the result is invalid for g, as we are dividing by zero. To avoid this, we can list those cases and treat them specially afterwards, for example by inserting the values into the model and doing a new analysis run. Only those evaluations g are truly problematic in which ∀v ∈ V : g(v) ∈ I(v), that is, in which all variables are in their valid ranges. Cases where a division by zero only occurs if variables are set to invalid values are irrelevant.
The true reachability probabilities in those special cases can only be lower than the values obtained from the resulting function (in case we have no formal division by zero after shortening it). This is because, with such a variable evaluation, we stay in a so-called bottom strongly connected component (BSCC), an SCC from which B can never be reached.

In cases where the denominator is near zero, numerical problems are possible when evaluating the function with inexact arithmetic. However, if the result can be shortened symbolically, it is possible to avoid this problem in some cases, as in the example given before, under the condition that x ≠ 1.
An Example Finally, we demonstrate the usage of state-elimination on the dice model with a biased coin of Figure 2.5. Consider the property that the dice simulator rolls a 6 when the number of times the coin may be tossed is not restricted. Using state-elimination, this turns out to be

f(x) = (x^2 − 2x + 1) / (−x + 2)

This function is only valid for 0 < x < 1. For x ∈ {0, 1}, no di state will ever be reached: if x = 1, the only possible path with a non-zero probability is s0 s1 s3 s1 s3 . . ., which does not lead to a target state. In the other case, x = 0, we only have the path s0 s2 s6 s2 s6 . . ., which does not lead to a target state either. This means that the dice simulator does not terminate. With a fair coin (x = 1/2) this value is 1/6, as expected. Because f(x) is strictly decreasing, x = 1/2 is also the only value with f(x) = 1/6. A plot of f(x) is given in Figure 3.11.
[Figure 3.11: Probability of throwing a 6 (plot of f(x), the probability to throw a 6, ranging from 0 to 0.5 with 1/6 marked, against the probability of head from 0 to 1)]
In Figure 3.12 the process of state-elimination is demonstrated on this model. For
readability, the non-parametric version is used there.
[Figure 3.12: State-elimination for unbounded reachability on the dice model (the non-parametric dice model with coin probabilities 1/2 is reduced step by step until only the initial state s0 and the target states d1, . . . , d6 remain)]
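As a companion to the figure, the following sketch (Python with sympy; a simplified illustration of the procedure of Section 3.4.1, not the Param implementation) performs the elimination directly on the parametric model and reproduces the function f(x) from above:

import sympy as sp

x = sp.symbols('x')

# Transition matrix of the parametric dice model of Figure 2.5,
# as nested dictionaries; the d states are absorbing targets.
P = {
    's0': {'s1': x, 's2': 1 - x},
    's1': {'s3': x, 's4': 1 - x},
    's2': {'s5': x, 's6': 1 - x},
    's3': {'s1': x, 'd1': 1 - x},
    's4': {'d2': x, 'd3': 1 - x},
    's5': {'d4': x, 'd5': 1 - x},
    's6': {'s2': 1 - x, 'd6': x},
}

def eliminate(P, s):
    # Remove state s; each path u -> s -> t becomes a direct transition
    # with probability pd + pa * pc / (1 - pb), as in Section 3.4.1.
    loop = P.get(s, {}).get(s, sp.Integer(0))
    succ = {t: q for t, q in P.pop(s, {}).items() if t != s}
    for row in P.values():
        if s in row:
            pa = row.pop(s)
            for t, pc in succ.items():
                row[t] = sp.cancel(row.get(t, 0) + pa * pc / (1 - loop))

for s in ['s1', 's3', 's4', 's5', 's6', 's2']:
    eliminate(P, s)

print(P['s0']['d6'])   # equal to (x**2 - 2*x + 1)/(-x + 2); at x = 1/2 this is 1/6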
3.5
Rewards
In this section we consider reward models and propose algorithms for computing solutions
for reward-based properties. We first define the problem of reachability rewards and then
explain how it can be solved by a state-elimination based approach.
3.5.1
Reachability Rewards
Let R = (D, r), D = (S, S0, P, V), r : S ∪ (S × S) → F_V be a PDMRM. The reachability reward from a state s ∈ S0 to a set of target states B is the expected reward that is accumulated along the paths from s to B. This means that, for paths which start in s, the transition and state rewards of the path are summed up, but only until a state of B is reached.
[Figure 3.13: An example reward model (four states s0, s1, s2, s3; state rewards, e.g. 3 for s0 and 3z for s2, are written inside the states, and transition rewards are written after the probabilities, separated by a slash, e.g. 1/z for the transition from s0 to s1)]
As an example, consider the model of Figure 3.13. Here, state rewards are written inside the states, and transition rewards are written to the right of the probabilities, separated by a slash. One path of this model is σ = s0 s1 s1 s2 s3. The reward accumulated along σ is 3 + z + 0 + z + 0 + z + 3z + 3 = 6 + 6z. The probability of the example path is 1 · x · (1 − x) · (1 − x). The weighted reward of a path is its probability multiplied by its reward; for σ this value is x · (1 − x) · (1 − x) · (6 + 6z). Summing up the weighted rewards of all paths starting from a state s0 ∈ S0 leads to the expected reward. Formally, reachability rewards can be defined as follows:
Definition 3.5.1 Let R = (D, r), D = (S, S0, P, V), r : S ∪ (S × S) → F_V be a PDMRM and B ⊆ S a set of target states. The reachability reward from a state s0 ∈ S0 to B is defined as

R(s0) = Exp_D(s0, w)

where

w(σ) = Σ_{i=0}^{min{j | σ[j] ∈ B} − 1} ( r(σ[i]) + r(σ[i], σ[i+1]) )

and

Exp_D(s, w) = Σ_{σ ∈ Path_D(s)} Pr_s(σ) · w(σ)
Following [1], reachability rewards can be expressed by the following equation system:

Exp_D(s, w) = 0  if s ∈ B
Exp_D(s, w) = ∞  if B cannot be reached from s
Exp_D(s, w) = r(s) + Σ_{s' ∈ S} P(s, s') · ( r(s, s') + Exp_D(s', w) )  otherwise

As for unbounded reachability, this equation system can be solved, for example, by Gaussian elimination or one of the numerous other approaches for linear equation systems.
3.5.2
State-Elimination Approach
In Section 3.4.1, a method based on the elimination of model states was used to calculate the probability to reach a set of target states in a PDTMC. In that section, we eliminated non-target, non-initial states and modified the model in a way that keeps the reachability probability valid. In this section, we adapt the method to calculate reachability rewards for PDMRMs.
Transforming State to Transition Rewards First, we show that state rewards can be transformed into transition rewards. The idea is to add the state reward of a state s to the reward of each outgoing transition of s (and to set the state reward to 0). The accumulated rewards along a path from the initial states till the target set B are not changed by this transformation.

Lemma 3.5.2 Let R = (D, r), D = (S, S0, P, V), r : S ∪ (S × S) → F_V be a PDMRM. This model can be transformed into another PDMRM R' = (D, r') with r'(s) = 0 and r'(s, s') = r(s, s') + r(s), where s, s' ∈ S, without changing the expected reachability rewards for a set of target states B.
Proof Both PDMRMs R and R' have the same paths, as they have the same underlying PDTMC. If we can show that a path σ has the same reward in R as in R', we are done, because in this case Exp_D(s, w) = Σ_{σ ∈ Path_D(s)} Pr_s(σ) · w(σ) is the same for both models. Recall the definition w(σ) = Σ_{i=0}^{min{j | σ[j] ∈ B} − 1} ( r(σ[i]) + r(σ[i], σ[i+1]) ).
Let σ be a path of R and let k = min{j | σ[j] ∈ B} − 1. In R it is

w_R(σ) = Σ_{i=0}^{k} ( r(σ[i]) + r(σ[i], σ[i+1]) )

On the other hand, in R' we have

w_R'(σ) = Σ_{i=0}^{k} ( r'(σ[i]) + r'(σ[i], σ[i+1]) )
        = Σ_{i=0}^{k} ( 0 + r'(σ[i], σ[i+1]) )
        = Σ_{i=0}^{k} ( r(σ[i], σ[i+1]) + r(σ[i]) )
        = w_R(σ)

So, in both PDMRMs all paths have the same reward, meaning that the expected reachability reward is also the same for both.
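The transformation of the lemma is mechanical; as a small illustration (plain Python with sympy expressions as rewards, using hypothetical dictionary representations for P and the reward functions, not the Param code itself):

import sympy as sp

z = sp.symbols('z')

def push_state_rewards(P, state_rew, trans_rew):
    # Lemma 3.5.2: r'(s) = 0 and r'(s, s') = r(s, s') + r(s).
    new_trans = {(s, t): trans_rew.get((s, t), 0) + state_rew.get(s, 0)
                 for s, row in P.items() for t in row}
    return {s: 0 for s in state_rew}, new_trans

# State s0 of Figure 3.13: state reward 3, transition to s1 with reward z.
print(push_state_rewards({'s0': {'s1': 1}}, {'s0': 3}, {('s0', 's1'): z}))
# ({'s0': 0}, {('s0', 's1'): z + 3})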
State-Elimination Now that we have a model with only transition rewards, we slightly change the meaning of these rewards. Before, transition rewards denoted the exact reward gained by taking a transition. We change the meaning to the expected reward under the condition that the transition is taken. The transformation done in the following will affect other reward measures (for example, instantaneous, accumulated, . . . ) but will not change the expected reachability rewards.
This allows us to do state-elimination in a similar way as for unbounded reachability properties, as illustrated in Figure 3.14. We want to eliminate a state s1 which has an incoming transition a from state s0 with probability pa and expected reward ra. The state s1 also has a self-loop b and an outgoing transition c to s2. State s0 could also have a direct transition d to s2. When removing s1, in case there is no previous transition from s0 to s2, the probability to move from s0 to s2 will be pe = pa · (1/(1 − pb)) · pc, as for the unbounded reachability state-elimination.
Now we also have to specify the reward of the new transition. The reward to add there is the expected reward under the condition that first a transition from s0 to s1 is taken, then we stay in s1 for zero or more steps, and then leave to s2. This reward is a sum of three summands. The first one results from the fact that, when moving from s0 to s1, in all cases the expected reward ra results. For the second one, we must consider that, when moving from s1 to s2, we get the expected reward rc. The last summand is the expected reward resulting from a possible self-loop of s1. The expected number of times the self-loop transition is taken is pb/(1 − pb). As the reward associated with this loop is rb, the expected reward is (pb/(1 − pb)) · rb. From this we conclude that the reward of the new transition e from s0 to s2 when removing s1 must be re = ra + rc + (pb/(1 − pb)) · rb.
However, there may already exist a transition d from s0 to s2. Due to the definition of the probability matrix, we only allow one transition from a state to another, which means that we have to combine the new transition e with the old one d into a combined transition f. For the probabilities pd and pe, this is simple: as in the case of unbounded reachability probabilities in Section 3.4.1, we can just add them up to pf = pe + pd. However, we also have to combine the rewards of the two transitions. Let rd be the reward of d and re the one of e. Remember that the reward associated with a transition is the expected reward under the condition that the transition is taken. The probability that d is taken under the condition that either d or e is taken is pd/(pd + pe), and for e it is pe/(pd + pe). This means that the expected reward when taking d under the condition that d or e is taken is (pd/(pd + pe)) · rd, and for e it is (pe/(pd + pe)) · re. The expected reward when taking either d or e is the sum of those values. This value is equal to the one of the combined transition f which substitutes d and e. So, the new reward is

rf = (pe/(pe + pd)) · re + (pd/(pe + pd)) · rd = (pe · re + pd · rd)/(pe + pd)
[Figure 3.14: State-elimination for handling reachability rewards (the transitions a, b, c and d of the text are first combined into e with pe = pa · (1/(1 − pb)) · pc and re = ra + rc + (pb/(1 − pb)) · rb, and then d and e are merged into f with pf = pe + pd and rf = (pe · re + pd · rd)/(pe + pd))]
After the state-elimination, the resulting PDMRM is a tuple R' = (D', r'), D' = (S', S0', P', V), which only consists of a number of initial states S0', the target states B and transitions from initial to target states. For each initial state s0 ∈ S0', the reachability reward for B is

R(s0) = Σ_{s' ∈ B} P'(s0, s') · r'(s0, s')
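The two combination rules derived above can be written down compactly; the following sketch (Python with sympy, an illustration rather than the actual implementation) computes the probability and expected reward of the combined transition f:

import sympy as sp

def combine(pa, ra, pb, rb, pc, rc, pd=0, rd=0):
    # One elimination step of Figure 3.14: transition a into the eliminated
    # state, self-loop b, exit c and a possibly pre-existing transition d.
    pe = sp.cancel(pa * pc / (1 - pb))
    re = ra + rc + pb / (1 - pb) * rb      # expected reward of the self-loops
    pf = sp.cancel(pe + pd)
    rf = sp.cancel((pe * re + pd * rd) / (pe + pd))
    return pf, rf

# Entering a state with probability 1/2 (reward 1) that has a self-loop 1/2
# (reward 1) and exit probability 1/2 (reward 1): one loop is expected, so
# the combined transition has probability 1/2 and expected reward 3.
h = sp.Rational(1, 2)
print(combine(h, 1, h, 1, h, 1))   # (1/2, 3)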
As an example of the application of the method, we consider the dice simulator with a biased coin of Figure 2.5. One property to consider is the expected number of times the coin has to be thrown to get a result. This can be modelled by adding a reward of 1 to each transition and defining the set {d1, . . . , d6} as the set of target states B. As an illustration, consider the path σ = s0 s1 s3 d1. In σ, three transitions are performed, corresponding to three coin throws and also to a reward of three. When computing the expected number of coin throws, the result is the function
f(x) = (−x^4 + 2x^3 − x^2 − 2) / (−x^4 + 2x^3 + x^2 − 2x)

depending on the probability x that the coin shows head. A graph of this function is given in Figure 3.15 for x = 1/10, . . . , 9/10. For x ∈ {0, 1} the value is undefined, as f(x) goes towards infinity when approaching these values (from the right and from the left, respectively). For both values, B will never be reached, as can be seen in the model, which means that the reward will be infinite. The minimal value 11/3 is taken for x = 1/2, that is, for the non-biased coin.
[Figure 3.15: Expected number of throws needed to get a result (plot of f(x), the expected number of throws from 3 to 11, against the probability of head from 0.1 to 0.9)]
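The function above can also be checked against the equation system of Section 3.5.1; the following sketch (Python with sympy, assuming the transition structure of the dice model of Figure 2.5) solves the system symbolically:

import sympy as sp

x = sp.symbols('x')
E = {s: sp.symbols('E_' + s) for s in ['s0', 's1', 's2', 's3', 's4', 's5', 's6']}

# Exp(s) = 1 + sum_s' P(s, s') * Exp(s'), with Exp = 0 in the target
# states d_i; every transition carries reward 1 (one coin throw).
eqs = [
    sp.Eq(E['s0'], 1 + x * E['s1'] + (1 - x) * E['s2']),
    sp.Eq(E['s1'], 1 + x * E['s3'] + (1 - x) * E['s4']),
    sp.Eq(E['s2'], 1 + x * E['s5'] + (1 - x) * E['s6']),
    sp.Eq(E['s3'], 1 + x * E['s1']),   # with probability 1 - x we reach d1
    sp.Eq(E['s4'], 1),                 # both successors are target states
    sp.Eq(E['s5'], 1),
    sp.Eq(E['s6'], 1 + (1 - x) * E['s2']),
]
sol = sp.solve(eqs, list(E.values()), dict=True)[0]
f = sp.cancel(sol[E['s0']])
print(f)                                # the rational function given above
print(f.subs(x, sp.Rational(1, 2)))     # 11/3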
3.6
Handling Parametric Markov Decision Processes
In the following, we describe how PMDPs can be handled. We consider the problems of unbounded and bounded-time reachability for MDPs.
In contrast to Markov chains, in MDPs there is no longer a unique probability for reachability properties, because in each state, before the probabilistic choice, an action is chosen non-deterministically. Because of this, we consider the maximal and minimal reachability probabilities.
The non-deterministic decisions of MDPs are resolved by the notion of schedulers [5]. We first recall their definition.
Definition 3.6.1 Given an MDP M = (S, S0, Act, P), a stationary scheduler is a partial function D : S ⇀ Act fixing the decision for each non-absorbing state of the MDP such that D(s) ∈ Act(s). With MD(M) we denote the set of stationary schedulers of M.

Definition 3.6.2 Given an MDP M = (S, S0, Act, P) and a stationary scheduler D : S ⇀ Act, the induced Markov chain M_D is defined as M_D = (S, S0, P') where P'(s, s') = P(s, D(s), s') if D(s) is defined, and P'(s, s') = 0 otherwise.
For each MDP M = (S, S0, Act, P) and each set of target states B, there exists a stationary scheduler D : S ⇀ Act that maximizes (or minimizes) the probability for each s ∈ S0 to reach B within unbounded time [5] when taking the decision chosen by D in each step. This probability equals the probability of reaching B in the induced DTMC M_D. Having the possibility to make the choice dependent on the whole history does not lead to a possibly higher (or lower) probability of reaching B. So, for each s ∈ S0, the maximal probability to reach B can be expressed as

max_{D ∈ MD(M)} P_{M_D}(s, B)

3.6.1
Straightforward Approach
For non-parametric models, this leads to a single real number. However, because PMDPs are actually an (infinite) family of MDPs, optimal scheduler decisions may depend on the transition parameters. For different variable evaluations, different schedulers may be necessary to maximize or minimize reachability probabilities. Each possible scheduler of a PMDP induces a different PDTMC with a different reachability probability. There are only finitely many stationary schedulers, however. For this reason, given a PMDP M = (S, S0, Act, P, V), the maximal reachability probability can be written as

max(H)

where H is a finite set of polynomials or rational functions, each f ∈ H being the reachability probability of the Markov chain induced by one scheduler.
[Figure 3.16: A parametric MDP (in s0, action a leads to s1 with probability 1/2, while action b leads to s1 with probability x and to s2 with probability 1 − x)]
Consider the PMDP given in Figure 3.16. For variable evaluations with x < 1/2, to maximize the probability to reach s1 from s0 it is necessary to use the scheduler taking action a, leading to a probability of 1/2 of reaching s1. If x > 1/2, action b must be taken. So in this case, the maximal probability is given by

f(x) = max{1/2, x} = 1/2 if x ≤ 1/2, and x otherwise.
For bounded time properties, stationary schedulers do not suffice to handle maximal
or minimal reachability properties [5]. Instead, schedulers dependent on the current step
have to be used.
Definition 3.6.3 Given an MDP M = (S, S0, Act, P), a step-dependent scheduler is a partial function D : S × {0, . . . , n − 1} ⇀ Act, n ∈ N, providing the decisions for each non-absorbing state s ∈ S of the MDP and each step number i ∈ {0, . . . , n − 1} such that D(s, i) ∈ Act(s). With SD(M) we denote the set of step-dependent schedulers for M.
Definition 3.6.4 Given an MDP M = (S, S0, Act, P) and a step-dependent scheduler D : S × {0, . . . , n − 1} ⇀ Act, the induced Markov chain is defined as M_D = (S', S0', P') where

• S' = S × {0, . . . , n}
• S0' = S0 × {0}
• P'((s, i), (s', j)) = P(s, D(s, i), s') if 0 ≤ i ≤ n − 1, j = i + 1 and D(s, i) is defined, and 0 otherwise
In the induced Markov chain we have n + 1 different copies of the MDP. The states S × {0} are for the initial situation, and the states S × {1, . . . , n} are for the situations after one or more steps. So, the n + 1 layers correspond to n + 1 points of time. Transitions are chosen such that in each layer i the scheduler decision D(·, i) specifies the probabilistic choice for a state. The successor of a state of layer i is on the next layer i + 1, simulating a step in the original MDP.
For each MDP M = (S, S0, Act, P), set of target states B and step limit n, there exists a step-dependent scheduler D : S × {0, . . . , n − 1} ⇀ Act such that, for each s ∈ S0, the probability to reach B within n steps is maximized when taking the action the scheduler decides for each state and step number. Also, this probability is the same as the probability of reaching B' = B × {0, . . . , n} from (s, 0) ∈ S0' in the induced DTMC M_D.
Algorithm 2 Algorithm to calculate the maximal probability to reach B within n steps for each state of an MDP M
2.1: for all s ∈ S do
2.2:     P(s, B) = if s ∈ B then 1 else 0
2.3: end for
2.4: for i = 1 to n do
2.5:     for all s ∈ S do
2.6:         if Act(s) = ∅ then
2.7:             P'(s, B) = if s ∈ B then 1 else 0
2.8:         else
2.9:             P'(s, B) = max_{a ∈ Act(s)} Σ_{s' ∈ S} P(s, a, s') · P(s', B)
2.10:        end if
2.11:    end for
2.12:    P(·, B) = P'(·, B)
2.13: end for
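A direct transcription of Algorithm 2 for the non-parametric case might look as follows (a Python sketch; keeping states, actions and probabilities in plain dictionaries is a choice of this illustration):

def max_bounded_reach(S, Act, P, B, n):
    # Algorithm 2: maximal probability to reach B within n steps.
    prob = {s: 1.0 if s in B else 0.0 for s in S}              # lines 2.1-2.3
    for _ in range(n):                                         # line 2.4
        prob = {s: (1.0 if s in B else 0.0) if not Act[s]      # line 2.7
                   else max(sum(p * prob[t] for t, p in P[s][a].items())
                            for a in Act[s])                   # line 2.9
                for s in S}
    return prob

# The MDP of Figure 3.17 below: action a stays in s0 with 9/10 and reaches
# s1 with 1/10; action b reaches s1 and s2 with probability 1/2 each.
S = ['s0', 's1', 's2']
Act = {'s0': ['a', 'b'], 's1': [], 's2': []}
P = {'s0': {'a': {'s0': 0.9, 's1': 0.1}, 'b': {'s1': 0.5, 's2': 0.5}}}
print(max_bounded_reach(S, Act, P, {'s1'}, 2)['s0'])   # 0.55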
[Figure 3.17: Stationary schedulers are insufficient for maximal bounded reachability (in s0, action a moves to s1 with probability 1/10 and stays in s0 with probability 9/10; action b moves to s1 and to s2 with probability 1/2 each)]
To understand why stationary schedulers are not sufficient to maximize bounded-time reachability for MDPs, consider the example given in Figure 3.17 and the probability to reach s1 from s0. For unbounded reachability, we can use a scheduler D with D(s0) = a to maximize it: it is Σ_{i=0}^{∞} (9/10)^i · (1/10) = 1. If we limit the number of steps to n = 2, however, s1 would be reached with a probability of 1/10 + (9/10) · (1/10) = 0.19. A scheduler with D(s0) = b would lead to a probability of 0.5. Now consider the usage of a step-dependent scheduler with D(s0, 0) = a and D(s0, 1) = b. Using it to resolve the non-determinism results in a maximal bounded reachability probability of 1/10 + (9/10) · (1/2) = 0.55, which is higher than the probability possible with any stationary scheduler.
With Algorithm 2 it is possible to calculate bounded reachability probabilities. Initially, in lines 2.1 to 2.3, P(s, B) is set to 1 for all states within B and to 0 for the other ones. Then, in the first iteration of the main loop (line 2.4), the maximal probability to reach B within 1 step is calculated. In the next iteration, from this value the maximal probability for 2 steps is calculated, and so on.
In the case of non-parametric models, all calculations are possible using real numbers. In the case of parametric models with transitions labelled with polynomials or rational functions, calculations can not simply be performed using these kinds of functions, because these function spaces are not closed under the maximum operation. Instead, calculations can be executed using a set H of such functions. As before, the property to be represented is then, for each variable evaluation, the maximal value of the functions of H. The following laws allow for computations with functions given in this representation (for non-negative f_i, g_j); a small sketch of this set arithmetic is given after the list:

• max{f1, . . . , fn} · max{g1, . . . , gm} = max{f_i · g_j | 1 ≤ i ≤ n, 1 ≤ j ≤ m}
• max{f1, . . . , fn} + max{g1, . . . , gm} = max{f_i + g_j | 1 ≤ i ≤ n, 1 ≤ j ≤ m}
• max{max A, max B} = max(A ∪ B)
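A minimal sketch of this set representation (Python with sympy; set elements are sympy expressions; the reduction step discussed below is not included):

import sympy as sp

def set_mul(F, G):
    return {sp.expand(f * g) for f in F for g in G}   # for non-negative functions

def set_add(F, G):
    return {sp.expand(f + g) for f in F for g in G}

def set_max(F, G):
    return F | G                                      # max{max F, max G}

x = sp.symbols('x')
H = {sp.Rational(1, 2), x}
print(set_add(H, H))   # {1, x + 1/2, 2*x} represents the pointwise maximum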
[Figure 3.18: Reducing the set representing the maximum function (plot of the functions of H over 0 ≤ x ≤ 1, together with their pointwise maximum)]
To make calculations feasible, it is necessary to minimize the sizes of the sets used during the calculations. For example, consider H = {1/2, x, (11/10)x − 1/10} and let 0 ≤ x ≤ 1. As seen in Figure 3.18, the maximum function can be represented as well by the reduced set H' = {1/2, x}. Possible reductions of these sets depend on the restrictions of the variable ranges. For example, if x were not restricted to be at most 1, (11/10)x − 1/10 could not be removed from H (though in this example, this would lead to a "probability" larger than 1). The reduction is nontrivial, because it is necessary to consider the maxima of the functions as well as their intersections and the valid ranges of the variables. It is planned to later link the tool developed along with this thesis to a computer algebra system to check the feasibility of this approach.
[Figure 3.19: Transforming a small MDP to a PDTMC as in [8] (the original state s0 with actions a, b, c next to its encodings: D1 uses the variables v_{s0,a}, v_{s0,b}, v_{s0,c}; D2 uses v_{s0,a}, v_{s0,b} and 1 − v_{s0,a} − v_{s0,b}; D3 uses v_{s0,a}, (1 − v_{s0,a}) · v_{s0,b} and 1 − v_{s0,a} − (1 − v_{s0,a}) · v_{s0,b})]
3.6.2
Alternative Approach
Unbounded Time It is possible to replace non-determinism by parameters, as conjectured in [8]. We can replace each non-deterministic choice by a new variable, as in Figure 3.19, "encoding D1". Then we can apply state-elimination as described in Section 3.5.2, which gives us a result function. The parameters of this function are the variables used to resolve the non-determinism.
Definition 3.6.5 Let M = (S, S0, Act, P) be an MDP. Then a stationary encoding is a partial function D : S × Act ⇀ F_{V,poly} such that D is defined for each non-absorbing state and is undefined for absorbing states.
Definition 3.6.6 Let M = (S, S0, Act, P) be an MDP and D : S × Act ⇀ F_{V,poly} a stationary encoding of M. The encoding PDTMC is defined as enc(M) = (S', S0, P', V) where S' = S ∪̇ (S × Act) and

P'(s, s') = D(s, a)  if s ∈ S, a ∈ Act(s), s' = (s, a)
P'(s, s') = P(s'', a, s')  if s = (s'', a) ∈ S × Act, s' ∈ S
P'(s, s') = 0  otherwise
The encoding D1(s, a) = v_{s,a} for all s ∈ S, a ∈ Act(s) is the encoding in which each action of a state is encoded by a different variable, as shown in Figure 3.19 ("encoding D1"). We can define a second one in which we can save one variable. Let A : S ⇀ Act be a partial function that chooses an arbitrary enabled action for each non-absorbing state s ∈ S. Then we can define the encoding D2 (for s non-absorbing, a ∈ Act(s)):

D2(s, a) = v_{s,a}  if A(s) ≠ a
D2(s, a) = 1 − Σ_{b ∈ Act(s), b ≠ A(s)} v_{s,b}  if A(s) = a
An example of this encoding is also given in Figure 3.19; there we chose A(s0) = c.
Let B : S × Act ⇀ {1, . . . , |Act|} be a further partial function that is defined for non-absorbing states. It shall enumerate the enabled actions of a state s from 1 to |Act(s)| in an arbitrary way. With B^{-1} : S × {1, . . . , |Act|} ⇀ Act we denote the function which, for each state s, gives the action with the given number in s. So if for a state s ∈ S we have Act(s) = {a, b, c}, we can declare B(s, a) = 1, B(s, b) = 2, B(s, c) = 3; we then have B^{-1}(s, 1) = a, B^{-1}(s, 2) = b, B^{-1}(s, 3) = c. By using B, a further encoding D3 is possible:

D3(s, a) = v_{s,a}  if B(s, a) = 1
D3(s, a) = ( 1 − Σ_{j=1}^{B(s,a)−1} D3(s, B^{-1}(s, j)) ) · v_{s,a}  if 1 < B(s, a) < |Act(s)|
D3(s, a) = 1 − Σ_{j=1}^{|Act(s)|−1} D3(s, B^{-1}(s, j))  if B(s, a) = |Act(s)|
For this encoding there is also an example given in Figure 3.19. When compared with D2, this encoding has the following advantage: Let V be the set of variables used to encode the non-determinism. Then each evaluation g : V → R of the variables with 0 ≤ g(v) ≤ 1 for all v ∈ V is valid in the sense of Definition 2.4.4. Notice, however, that even with the encodings D2 and D3, in the worst case exponentially many evaluations are necessary to obtain the maximal reachability probability; in contrast to D1, though, this is not always the case. The approach given above extends easily to PMDPs M = (S, S0, Act, P, V), where the set V of variables is enlarged by the new variables for the non-deterministic decisions. In special cases, the variables resolving the non-determinism may disappear from the result. This was the case in the bounded retransmission case study of Section 5.4.
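The weights produced by D3 can be generated by a simple stick-breaking scheme; the following sketch (Python with sympy, an illustration of the definition rather than tool code) shows that they sum up to 1 for every evaluation:

import sympy as sp

def d3_weights(vs):
    # vs = [v_1, ..., v_{k-1}] for k enabled actions of a state.
    weights, rest = [], sp.Integer(1)
    for v in vs:
        weights.append(rest * v)   # (1 - v_1)...(1 - v_{j-1}) * v_j
        rest *= (1 - v)
    weights.append(rest)           # weight of the last action
    return weights

va, vb = sp.symbols('v_a v_b')
print(d3_weights([va, vb]))
# [v_a, v_b*(1 - v_a), (1 - v_a)*(1 - v_b)], cf. encoding D3 in Figure 3.19
print(sp.expand(sum(d3_weights([va, vb]))))   # 1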
It is to be noticed that the functions resulting from state-elimination after such a transformation may not be valid for minimizing schedulers, but they are valid for maximizing schedulers. The reason that the result may not be valid for minimizing schedulers is that there may exist states which are part of a BSCC for certain variable evaluations but are not part of a BSCC in general, as noticed in Section 3.4.1. If a variable replacing the non-determinism controls whether a state is part of a BSCC or not, this variable may be shortened away, which means that the actual probability is lower than the one given by the resulting formula.
For the validity for maximizing schedulers, we have to take into account that, as seen in Section 3.4.1, the actual result under a certain variable evaluation can only be lower than the one obtained by state-elimination. The only case in which this happens is that such a variable evaluation leads to a BSCC which is no BSCC for other variable evaluations. However, if we want to maximize the value, there is no reason to assign the variables resolving the non-determinism in a way leading to BSCCs when this can be avoided, because doing so can only lead to lower reachability probabilities. Thus, the result is valid for maximizing schedulers.
Bounded Time As mentioned before, for bounded reachability properties stationary schedulers are insufficient to calculate minimal or maximal probabilities. Because of this, it is not possible to simply replace the non-deterministic choice of each state by a single probabilistic one. Instead, we replace the non-determinism of each state by n different probabilistic choices, where n is the step limit of the property. This is possible by a construction similar to the Markov chain induced by a step-dependent scheduler.

Definition 3.6.7 A partial function D : S × Act × {0, . . . , n − 1} ⇀ F_{V,poly} that is defined for each non-absorbing state, where n ∈ N is a step limit, is called a step-dependent encoding.

One possible step-dependent encoding is

D1'(s, a, k) = v_{s,a,k}
For the stationary encodings D2 and D3 , corresponding step-dependent encodings
exist. Now we can define the encoding PDTMC.
Definition 3.6.8 Let M = (S, S0, Act, P) be an MDP, n a step limit and D a step-dependent encoding. The encoding PDTMC is defined as enc(M, n) = (S', S0', P', V) where S' = (S × {0, . . . , n}) ∪̇ (S × {0, . . . , n − 1} × Act), S0' = S0 × {0}, and

P'(s, s') = D(s'', a, k)  if s = (s'', k) ∈ S × {0, . . . , n − 1}, a ∈ Act(s''), s' = (s'', k, a)
P'(s, s') = P(s'', a, s''')  if s = (s'', k, a) ∈ S × {0, . . . , n − 1} × Act, s' = (s''', k + 1)
P'(s, s') = 0  otherwise
In contrast to the stationary encoding, we can not just add a single additional state for each non-deterministic choice. If we have a step limit of n, the maximal probability can depend on n different decisions. Because of this, n + 1 copies of the original states are used – n in which non-deterministic decisions have been taken, and one for the initial situation. These states are elements of the set S × {0, . . . , n}, meaning we have n + 1 state layers. Because we want to start at layer 0, we define the set of initial states as S0' = S0 × {0}.
The states of S × {0, . . . , n − 1} × Act are the ones in which a non-deterministic decision has been taken. A state (s, 0, a) means that in layer zero – that is, in the first step – the non-deterministic choice a was used in s. The transition encoding is as follows: for each state (s, k), 0 ≤ k ≤ n − 1, there is a transition for each enabled action a ∈ Act(s) to a state (s, k, a) in which this decision has been taken. The non-deterministic choice is encoded by a parametric probabilistic one, using the variable v_{s,k,a}. In a state (s, k, a) in which a decision has been taken, there is the probabilistic choice of the original state s when action a is used. However, instead of moving to a successor state of s as in the original, we move to the corresponding successor state on the next layer, to allow different decisions at different step numbers.
Given an MDP M, a step limit n and a set of target states B, the maximal probability of reaching B can be calculated by transforming M into the encoding PDTMC enc(M, n) with step limit n, defining B' = B × {0, . . . , n}, and applying the algorithm for unbounded reachability on enc(M, n) with B' as the set of target states. We use B × {0, . . . , n} because the target states can be reached after each decision. The maximal value is given by the maximum over the variables used to encode the non-determinism. If the original MDP was parametric, the maximum may depend on the variable evaluations of M.
3.7
Complexity
In the previous sections we have introduced a number of algorithms for parametric Markov models. In this section, we give the corresponding complexity results.
Firstly, we recall the complexity of bounded-time reachability for non-parametric models. In the discrete-time case the complexity is O(mk), where m is the number of non-zero entries of the probability matrix P and k is the step limit. For bounded-time reachability analysis for CTMCs the complexity is of order O(mqt) [3], where m is the number of non-zero entries of P, q is the uniformization rate and t is the time bound.
We can show the worst-case complexity of state-elimination to be O(n^3). So, state-elimination has the same theoretical complexity as Gaussian elimination. To prove this, we consider the family of DTMCs {D_n}_{n ∈ N} (sketched in Figure 3.20) with D_n = (S, S0, P), where S = {s0, . . . , s_{n+1}} and S0 = {s0}. Further, let P be defined as follows:

P(s, s') = 1/(n + 1)  if s = s_i, s' = s_j, 0 ≤ i ≤ n, 1 ≤ j ≤ n + 1
P(s, s') = 0  otherwise
[Figure 3.20: Complexity of state-elimination (the DTMC D_n, in which each of the states s0, . . . , s_n has a transition with probability 1/(n + 1) to each of s1, . . . , s_{n+1})]
In the part of D_n consisting of the states S' = {s1, . . . , s_n}, there is a connection from each state to each other state in this set. When applying the state-elimination algorithm, we have to eliminate all states of S'. The elimination of an arbitrary s_i, 1 ≤ i ≤ n, leads to 2n edges being removed and (n − 1)^2 edges being modified. The model resulting from this transformation is the DTMC D_{n−1}. We have to eliminate states this way until we reach D_0, implying that we have to perform O(n^3) operations. The graphs of the D_i have as many connections as are possible for a model of this size in which the initial state has no incoming transitions and the target state has no outgoing ones. This means that the case considered here yields the worst-case complexity of the algorithm.
Now we consider the complexity of our parametric analyses. The algorithms work exactly the same except for the implementation of the mathematical operations. These operations turn out to be the bottleneck of the approach. Consider a parametric variant D_n' of D_n where the probability between two states s_i, s_j is not given by 1/(n + 1) but by a parameter p_ij. If we compute P(s, n, B) in the modified PDTMC, as a result we get the formula

p_00 + p_01 + p_02 + . . . + p_01 · p_11 + p_01 · p_12 + . . .

which is of exponential size in the number of edges, because the sum consists of the products of all possible combinations of the p_ij. We also need exponential time to compute this term, so both memory usage and run-time are exponential. However, in practice the worst-case complexity seldom turns out to be relevant for real-world models. The reason is that in most cases only a limited number of variables is used, and that the functions may grow large but still stay within a range that can be handled by today's computers. This is demonstrated in Chapter 5.
Chapter 4
Optimizations
In this chapter, certain optimizations implemented in Param are described. All of them are extensions of techniques from the non-parametric setting. Because operations are more expensive for parametric models, they are especially useful here and are used in the implementation. In Section 4.1 we describe optimizations during state-space exploration. Minimization techniques based on strong and weak bisimulation are delineated in Section 4.2.
4.1
Optimizations in State-Space Generation
Markov models are usually generated from higher-level descriptions, like the one described later in Section 5.1.2. As mentioned in Section 3.4.1, the model generation process starts by computing the set of initial states, then continues by computing the states reachable in one step from the initial states, then the states reachable from these states, and so on, till a fix-point is reached.
However, depending on the property to be checked, exploring the whole state-space is usually not necessary. For both bounded and unbounded reachability, it is not required to further explore states of the set of target states B, because we are only interested in the probability of reaching B. States that are only reachable from B are not of interest, as they do not contribute to the probability of reaching this set. Also, for both bounded and unbounded reachability it is not necessary to distinguish between different target states. It suffices to represent B by a single target state and to redirect all transitions leading to some state of B to this single target state, which maintains the reachability probability. For bounded reachability, it is not necessary to explore the model beyond depth n, where, for discrete-time models, n is the step limit, whereas for continuous-time properties n is the right truncation point of the Fox-Glynn algorithm.
4.2
Bisimulation
Generally, a bisimulation is an equivalence relation over states such that states in the same equivalence class behave equally, in the sense that they can not be distinguished by any property of a given class. The quotient of a model is a model in which the states are equivalence classes of bisimilar states of the former model. The states of the original model are then bisimilar to the ones of the new transition system. Bisimulation was originally introduced by Milner (1980) and Park (1981) for non-deterministic systems without probabilities.
For probabilistic and stochastic systems, several versions of bisimulation exist. Which one is appropriate depends on the property that shall be preserved by the bisimulation [4]. We consider several bisimulations that have already been studied in the non-parametric setting:
Strong Bisimulation for PDTMCs
Definition 4.2.1 Let D = (S, S0, P, V) be a PDTMC and T ⊆ S × S. T is a strong bisimulation if for all s1 T s2, C ∈ S/T : P(s1, C) = P(s2, C).

Definition 4.2.2 Let D = (S, S0, P, V) be a PDTMC and T ⊆ S × S be a strong bisimulation on D. The quotient PDTMC up to T is the PDTMC D' = (S', S0', P', V) where S' = S/T, S0' = {C ∈ S/T | ∃s0 ∈ S0 : s0 ∈ C}, and P'(C, C') = P(s, C') for some s ∈ C.
Strong bisimulation for PDTMCs preserves unbounded as well as bounded reachability properties. We assume the states of B to be absorbing; thus, all states of B are bisimilar by definition.
Strong Bisimulation for PCTMCs

Definition 4.2.3 Let C = (S, S0, R, V) be a PCTMC and T ⊆ S × S. T is a strong bisimulation if for all s1 T s2, C ∈ S/T : R(s1, C) = R(s2, C).

The quotient is defined accordingly. Strong bisimulation for PCTMCs, as in the case of PDTMCs, preserves bounded as well as unbounded reachability properties.
Weak Bisimulation for PDTMCs

Definition 4.2.4 Let D = (S, S0, P, V) be a PDTMC and T ⊆ S × S. All transitions P(s, s') with s T s' are considered silent steps. A state for which only silent steps exist, that is, P(s, [s]_T) = 1, is considered silent. With silent_T(D) we denote the silent states of D given T. T is a weak bisimulation if for all s1 T s2:

• if P(s_i, [s_i]_T) < 1 for i = 1, 2, then for all C ∈ S/T with C ≠ [s1]_T = [s2]_T:

P(s1, C) / (1 − P(s1, [s1]_T)) = P(s2, C) / (1 − P(s2, [s2]_T))

• s1 can reach a state outside [s1]_T iff s2 can reach a state outside [s2]_T (in a sequence of steps)
Definition 4.2.5 Let D = (S, S0, P, V) be a PDTMC and T ⊆ S × S a weak bisimulation for PDTMCs. The quotient PDTMC up to T is the PDTMC D' = (S', S0', P', V) where S' = S/T, S0' = {C ∈ S/T | ∃s0 ∈ S0 : s0 ∈ C} and

P'(C, C') = P(s, C') / (1 − P(s, [s]_T)) for some s ∈ C with s ∉ silent_T(D)
Weak bisimulation for PDTMCs does not preserve bounded reachability properties, but it maintains unbounded ones. The definition is coarser than that of strong bisimulation: if two states are related by a strong bisimulation, they can also be related by a weak one. For this reason, weak bisimulation quotients are usually smaller than strong ones. Similar to Section 3.4.2, we must take care not to divide by zero; it is possible that the quotient is not valid for all variable evaluations.
Weak Bisimulation for PCTMCs
Definition 4.2.6 Let C = (S, S0, R, V) be a PCTMC and T ⊆ S × S. T is a weak bisimulation if for all s1 T s2, C ∈ S/T, C ≠ [s1]_T = [s2]_T : R(s1, C) = R(s2, C).

Weak bisimulation for PCTMCs preserves time-bounded as well as unbounded reachability properties.
Strong bisimulation has been shown to be profitable in the non-parametric case [16]. As computations on parametric Markov processes are more expensive, bisimulation quotienting can be even more useful here. Of course, we have to make sure that the computations used to create the quotient are not more expensive than actually executing the final analyses. In the following, we describe how the creation of quotients of parametric Markov models was implemented in the tool.
Refinement Algorithm The algorithm used here is based on signature refinement, which was extended to Markov chains in [9]. In signature refinement, the starting point is an initial partitioning of the state-space. This initial partitioning is refined by splitting its elements on some criterion violating the bisimulation conditions, depending on the current partitioning of the state-space. This is repeated until a fix-point is reached, that is, until, due to the refinement criterion, no more splitting is needed. The final partitioning is the coarsest bisimulation of the given definition refining the initial partitioning specified before. The base of the implemented refinement process is Algorithm 3.

Algorithm 3 Base algorithm for partition refinement
3.1: P = CreateInitialPartition(C)
3.2: queue = P
3.3: while ¬queue.empty() do
3.4:     B = queue.dequeue()
3.5:     B' = {{s' ∈ B | sig(P)(s) = sig(P)(s')} | s ∈ B}
3.6:     P = P \ {B} ∪ B'
3.7:     queue.append(B')
3.8: end while
3.9: CreateQuotient_sig(C, P)
First of all, in line 3.1, the initial partitioning P is created. When considering reachability properties, all target states are put into one partition and the remaining states into another one. If all target states are sink states (or at least only have transitions to other target states), the partition containing the target states will not be split during the refinement. In case of a reward analysis, all non-target states with the same reward are put into one initial equivalence class. Here, of course, creating the quotient only makes sense if not all states have different reward values.
In line 3.2, a priority queue is filled with the initial partitioning. In lines 3.3-3.4, partitions are taken off the queue as long as it is not empty. Some criterion can be defined for the order in which elements are taken out; one possibility is to take out smaller partitions first. The idea behind this is that it may be better to split a partition completely, or as far as possible, before handling other partitions: if we first split a partition A once and then split a partition B, where the splitting of B depends on A, then B would have to be split again if A is split again later. However, if A is first split completely, the splitting of B resulting from the partitions contained in A can be performed in one step. As this is just a heuristic, other priorities may be appropriate for certain models.
In line 3.5, an equivalence class B is split if necessary, based on a given signature, which depends on the kind of bisimulation to be used. In the following line 3.6, the old equivalence class is removed, and in line 3.7 the new classes resulting from the split are added to the queue for further splitting.
The actual split of a class is performed by computing a signature sig(s, P) for each state of it. The new classes consist of states which have the same signature.
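A compact sketch of this refinement loop for strong bisimulation on PDTMCs (Python with sympy; a naive fix-point iteration instead of the queue-based Algorithm 3, and not the Param implementation) is the following; the signature used is the one defined in the next paragraph:

import sympy as sp

def sig(s, P, partition):
    # Signature of s: the probability of moving into each class.
    return tuple(sp.simplify(sum(P.get(s, {}).get(t, 0) for t in block))
                 for block in partition)

def refine(P, initial_partition):
    partition = [frozenset(b) for b in initial_partition]
    while True:
        new = []
        for block in partition:
            groups = {}
            for s in block:
                groups.setdefault(sig(s, P, partition), []).append(s)
            new.extend(frozenset(g) for g in groups.values())
        if len(new) == len(partition):   # fix-point: nothing was split
            return new
        partition = new

x = sp.symbols('x')
P = {'s1': {'t': x, 'u': 1 - x}, 's2': {'t': x, 'u': 1 - x}, 't': {}, 'u': {}}
print(refine(P, [{'s1', 's2'}, {'t'}, {'u'}]))   # s1 and s2 remain lumped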
For strong bisimulation for PDTMCs, the signature is

sig(s, P) : P → F_V,  sig(s, P)(A) = P(s, A)

and for strong bisimulation for PCTMCs, the P simply becomes an R. For weak bisimulation for PCTMCs, the signature is

sig(s, P) : (P \ [s]_P) → F_V,  sig(s, P)(A) = R(s, A) for A ≠ [s]_P

For the last case, weak bisimulation for PDTMCs, the situation is a bit more involved. For all s ∉ silent_P(D) it is

sig_ns(s, P) : (P \ [s]_P) → F_V,  sig_ns(s, P)(A) = P(s, A) / (1 − P(s, [s]_P)) for A ≠ [s]_P

and sig(s, P)(A) = sig_ns(s, P)(A). For s ∈ silent_P(D), the signature is defined as the set of classes the state can reach via steps inside its own class:

sig(s, P) = { A ∈ P \ [s]_P | ∃σ ∈ Path_D(s), ∃i > 0 : σ[i] ∈ A ∧ ∀ 0 ≤ j < i : σ[j] ∈ [s]_P }
Implementing these signatures is straightforward for strong bisimulation and for weak bisimulation for PCTMCs. The split of silent states in weak bisimulation for PDTMCs is possible by a search on the reversed graph. The implementation for this thesis labels each silent state with the set of all classes the state may reach. This allows for more fine-granular splits and faster convergence in some cases, when compared to algorithms where silent states are labelled with "*" if they can reach two or more partitions.
PDMRMs with transition rewards can be handled by first transforming transition rewards into state rewards.
Lemma 4.2.7 Let R = (D, r), D = (S, S0, P, V), r : S ∪ (S × S) → F_V be a PDMRM. Then R can be transformed into another PDMRM R' = (D', r') with no transition rewards, where D' = (S', S0, P', V), S' = S ∪̇ {(s, s') | s, s' ∈ S, P(s, s') > 0},

P'(s, s') = P(s, s'')  if s ∈ S, s' = (s, s''), s'' ∈ S
P'(s, s') = 1  if s = (s'', s'), s'' ∈ S
P'(s, s') = 0  otherwise

r'(s, s') = 0 for all transitions, and

r'(s) = r(s)  if s ∈ S
r'(s) = r(s', s'')  if s = (s', s'')
The construction is linear in the number of transitions m, as m new states and 2m new transitions are introduced. It adds an intermediate state for each transition of the original model, which is then given the same reward as the transition it is added for.
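As a small illustration of this construction (plain Python dictionaries with hypothetical names, not tool code):

def lift_transition_rewards(P, state_rew, trans_rew):
    # Lemma 4.2.7: an intermediate state (s, t) per transition carries the
    # transition reward as its state reward; transitions become reward-free.
    P2, rew2 = {}, dict(state_rew)
    for s, row in P.items():
        P2[s] = {}
        for t, p in row.items():
            mid = (s, t)
            P2[s][mid] = p            # original probability into (s, t)
            P2[mid] = {t: 1}          # then continue with probability 1
            rew2[mid] = trans_rew.get((s, t), 0)
    return P2, rew2

print(lift_transition_rewards({'s': {'t': 1}}, {'s': 2, 't': 0}, {('s', 't'): 5}))
# ({'s': {('s', 't'): 1}, ('s', 't'): {'t': 1}}, {'s': 2, 't': 0, ('s', 't'): 5})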
Lemma 4.2.8 The construction given above keeps reward properties valid. For bounded reachability, the step limit allowed to reach B has to be doubled.

Proof Let R = (D, r), D = (S, S0, P, V), r : S ∪ (S × S) → F_V be a PDMRM and R' the PDMRM from the construction given above. Further, let σ = s0 s1 s2 . . . be a maximal path of R starting from an initial state s0 ∈ S0. There exists a corresponding maximal path σ' = s0 s0' s1 s1' s2 s2' . . . in R', where s_i' ∈ S × S. Also, each maximal path σ' of R' is of this form. Because of this, there is a bijection between the maximal paths of the two models. Corresponding maximal paths σ and σ' have the same reward, and the maximal path σ' of R' is twice the length of σ. This means that if in the path σ of R the set B can be reached within n steps, then for σ' a number of 2n steps is needed.
Notice that there is no similar construction for CTMCs. The problem is that in CTMCs the reward acquired in a state s during a sojourn of duration t is r(s) · t, which depends on a continuous quantity instead of a discrete step, as for DTMCs. The transitions are assumed to be instantaneous, leading to a fixed amount of gained reward each time they are taken. Because of this, we can not simply transform transition rewards into state rewards as in the discrete-time case.
As mentioned, operations on polynomials and rational functions are quite expensive, and we are not willing to redo operations on polynomials which we have already performed before. Because of this, a value caching technique was implemented in Param. The assumption here is that during the calculations only a small set of values will actually be used. Initially, all values appearing in P (or R) are given a unique identifier i ∈ N. For each arithmetical operation f : N × N ⇀ N needed for the partition refinement (addition, subtraction, division), a data structure is created to handle values of f. All operations are performed with the unique identifiers as operands. If the result of a calculation is not found in the value cache, the actual polynomials or rational functions are taken, the result is calculated on them and given a new unique identifier, inserted into the value cache, and the new unique identifier is returned. This solution turned out to be considerably more performant than doing all calculations directly on polynomials or rational functions.
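The following sketch (Python with sympy; a simplification of the idea with hypothetical names, not the Param code) illustrates the value cache:

import operator
import sympy as sp

class ValueCache:
    def __init__(self):
        self.id_of, self.val_of, self.memo = {}, [], {}

    def intern(self, f):
        # Assign each distinct (cancelled) rational function a unique id.
        f = sp.cancel(f)
        if f not in self.id_of:
            self.id_of[f] = len(self.val_of)
            self.val_of.append(f)
        return self.id_of[f]

    def apply(self, op, i, j):
        # Operate on identifiers; the polynomials are only touched when
        # the combination (op, i, j) has not been computed before.
        key = (op, i, j)
        if key not in self.memo:
            self.memo[key] = self.intern(op(self.val_of[i], self.val_of[j]))
        return self.memo[key]

x = sp.symbols('x')
cache = ValueCache()
a, b = cache.intern(x), cache.intern(1 - x)
print(cache.val_of[cache.apply(operator.add, a, b)])   # 1, computed only once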
Notice that when there is a need to find the variable evaluations which may lead to a division by zero (see Section 3.4.1), we need to apply preprocessing on the graph structure, as this information may be lost in the quotient.
Chapter 5
Case Studies
In this chapter, the techniques put forward previously are applied to a number of case studies. For this purpose, the tool Param, a program for the analysis of parametric Markov models, has been developed. We first give an introduction to Param in Section 5.1. Then, we demonstrate that it can be successfully applied in a number of case studies: the Zeroconf protocol (5.2), a numerically problematic model (5.3), a model of the bounded retransmission protocol (5.4) and a model of protein synthesis in biological cells (5.5).
5.1
The Tool
We have implemented a prototype within the tool Param to handle parametric Markov models and have applied the techniques introduced in the previous chapters. Section 5.1.1 gives an overview of the tool and its general architecture, while in Section 5.1.2 the modelling language to be used with Param is described. In Section 5.1.3 we consider the evaluation of result functions.
5.1.1
Architecture
Param works as a command line tool. Models are specified in a higher-level input language derived from that of Prism [20], as described in Section 5.1.2. Properties can be specified as PCTL/CSL properties without nested operators, plus an extension from Prism to specify reward properties.
[Figure 5.1: Architecture of Param (model and property are read by the state-space exploration component, which produces a low-level model; optionally, the bisimulation quotient component, consisting of a generic refiner and sub-components such as DTMC strong and DTMC weak, builds a quotient; the analysis components, among them CTMC embedding and unification, DTMC unbounded reachability, DTMC and CTMC bounded reachability and DTMC reachability rewards, use the state-elimination component where needed and write a result file)]
The general architecture of the tool is given in Figure 5.1. Initially, the state-space exploration component generates a Markov model from a higher-level description. This exploration process incorporates the optimizations from Section 4.1. The model resulting from the exploration can be exported to a file.
Before further processing, it is possible to apply bisimulation to construct the quotient of the model for the analysis. The quotient component is implemented in a modular way: there is one part that is independent of the concrete bisimulation definition (weak, strong, . . . ), which is responsible for managing the partition, choosing the class to be refined next, etc. In addition, there are sub-components implementing the specific bisimulation definitions. On the one hand, this allows adding further bisimulation definitions without having to rewrite the whole refinement component; on the other hand, some optimizations are possible without having to rewrite the sub-component of each bisimulation. If requested, the quotient is exported to a file.
Depending on the property under consideration, different algorithms are called. The procedures for PDTMC unbounded reachability and for PDTMC reachability rewards both make use of the state-elimination component. PDTMC bounded reachability and PCTMC bounded reachability are implemented as described in the corresponding sections. Finally, the result is written to a file for further examination.
To summarize, the features of Param are:
• reading and optimized exploration of a higher-level model into an explicit representation
• handling probabilistic, stochastic and discrete-time non-deterministic models
• weak and strong bisimulation
• state-elimination based unbounded reachability
• state-elimination based reachability rewards
• PDTMC bounded reachability
• PCTMC bounded reachability (by uniformization)
5.1.2
Input Language
The parser for the input language originates from the tool Pass used in [26]. It has been extended for the thesis tool by the param keyword and the ability to parse formulas for the specification of reward properties. In the following, the basic syntax of this language is given, together with an informal description of its semantics. A formal specification of the closely related input language of Prism can be found on its homepage [20].
Models in the input language start with a keyword specifying their type. This may be dtmc, ctmc or mdp (or certain synonyms of them).
After this, global variables to be used in the program may be defined. For now, Param only allows finite-ranged variables. They are specified as <name> : <type>; where <name> is the name of the variable and <type> may be either bool or [i..j], where i and j specify the range of a finite integer variable. In the semantics of the program, that is, the induced Markov model, states are evaluations of all variables of the model. So, for example, if there are two variables a : [0..2] and b : [1..3] in a model, one state of the model is a=0, b=2.
Constants can be specified by const <type> <name> = <value>;. Here, <type> may be float or int, specifying the type of the constant, and <value> is a value complying with the specified type.
We introduce a new keyword to specify parametric variables. Parameters are specified by param float <name>;. The keyword float is required here because later extensions may allow further types of parameters.
After these specifications, a number of modules follow. The definition of a module is started by module and ended by endmodule. A module may contain a number of local variables and constants, defined like the global variables introduced before. Each module also consists of a number of guarded commands. A guarded command has the form
[<actionname>] <guard> ->
<prob/rate> : <var-assignment> +
<prob/rate> : <var-assignment> +
...
<prob/rate> : <var-assignment>
;
The guard is a Boolean formula where the atomic formulas are predicates over the
model variables. Remember that states are evaluations of the model variables. The
guarded commands specify the successors of model states.
If the guard of exactly one guarded command is fulfilled in a state, the distribution over its successors is specified in the following way: each <prob/rate> : <var-assignment> specifies a successor state. The <var-assignment> is of the form (s1'=<ev1>) & ... & (sn'=<evn>), assigning a new value <evi> to each variable si to be changed. Variables for which no primed version is given keep their old value. The probability or rate to move to the state induced by this assignment is given by <prob/rate>. For discrete-time models, the <prob/rate> specifications of one guarded command must add up to 1. Notice that <prob/rate> does not only allow fixed numbers but also formulas over the variables, constants and parameters of the model, allowing us to use parameters here by inserting variables previously specified by the param keyword.
If two or more guarded commands are active in one state, the semantics depends on the model type considered. For PDTMCs, each command is taken with the same probability; if in some state there are n active commands, each of them is taken with probability 1/n. So, if a certain variable assignment is executed with probability p in case its command is executed, then, if n commands are active, it is executed with a probability of p/n. For PCTMCs, all rates are just left as they are, leading to a race between all possible successors. For PMDPs, there is a non-deterministic choice between the different guarded commands, each leading to a distribution over successor states induced by the respective guarded command.
64
If two or more guarded commands from different modules are activated, the result
depends on the <actionname>. If the <actionname> is left out as is allowed, the semantic
is as described in the previous paragraph. The same is the case for actions which just
occur in one of the modules but not in others. If an action name occurs in several modules,
the corresponding actions can only be executed synchronously in all the modules. In such
a synchronous execution, in each of the modules one variable assignment is executed at
the same time. If several guarded commands write to the same variable the result is
undefined. Because each module may only modify its own variables, this can only occur
for global ones.
After the module specifications, the initial states are given with init <init-states> endinit, where <init-states> is a Boolean formula of predicates over the model variables. All states which fulfill this formula are initial.
Finally, the rewards may be specified by
rewards
<state-reward> ;
<state-reward> ;
...
<transition-reward> ;
<transition-reward> ;
...
endrewards
A <state-reward> entry of the form <guard> : <reward> gives the rewards for the states, as described in Section 2.5. If <guard> is fulfilled in a state, this state is given the reward specified by <reward>. If the guards of several <state-reward> entries are fulfilled, their values are added up.
A <transition-reward> is specified as [<actionname>] <guard> : <reward>, where the optional <actionname> specifies which transitions are rewarded. If for a state the guard <guard> is fulfilled, all its outgoing transitions resulting from an action with <actionname> (or, with no action name, if it is not given in the reward specification) are given the reward specified by <reward>. As for state rewards, if the guards of multiple <transition-reward> entries are fulfilled, the rewards are added up.
The higher-level descriptions in this format can be found in the appendix for all case
studies considered in the following sections.
As an example for the input language, consider the specification of the dice model
with a biased coin of Figure 2.5:
probabilistic

param float x;

module die
  // local state
  s : [0..7] init 0;
  // value of the dice
  d : [0..6] init 0;

  [] s=0 -> x : (s'=1) + 1-x : (s'=2);
  [] s=1 -> x : (s'=3) + 1-x : (s'=4);
  [] s=2 -> x : (s'=5) + 1-x : (s'=6);
  [] s=3 -> x : (s'=1) + 1-x : (s'=7) & (d'=1);
  [] s=4 -> x : (s'=7) & (d'=2) + 1-x : (s'=7) & (d'=3);
  [] s=5 -> x : (s'=7) & (d'=4) + 1-x : (s'=7) & (d'=5);
  [] s=6 -> 1-x : (s'=2) + x : (s'=7) & (d'=6);
endmodule
rewards
[] true : 1;
endrewards
The first line tells us that it is a probabilistic model. It has the parameter x, which is the probability that the coin used to simulate the dice shows head. There is only a single module, die. The variable s specifies whether the current state is one of the s_i, in which case it is in the range 0, . . . , 6, or one of the d_i states where a result of the dice has already been decided. In the latter case, s = 7, and the second variable d gives the result. In case s ≠ 7, d is meaningless. The variable s is initialized to 0, to ensure the dice simulator starts in the initial state s0. The variable d is also set to 0; this is needed to avoid constructing identically behaving initial states for each possible value of d, leading to an unnecessary blow-up of the model. After the variable declarations, the transition specifications follow. For each value of s in the range 0, . . . , 6, the possible successors are given. For example, in the case s=0 there is a transition to s=1 with probability x and a transition to s=2 with probability 1 − x, corresponding to the transitions of state s0 of Figure 2.5. For s=6, we have a transition with probability x, setting s=7, meaning that the simulation is finished, and setting d=6, telling us that the simulated dice shows 6.
5.1.3
Evaluating Functions
The output of Param is a rational function or a polynomial. It is often necessary to evaluate it at concrete numbers, for example to draw a graph of the function. The resulting functions calculated by Param are often quite large and of high degree. In the case study of the bounded retransmission protocol of Section 5.4, polynomials occur which take several pages and have a degree of more than 10000. Coefficients may be negative: we often have the situation that a state has two leaving transitions, where one is labelled with x and the other one with 1 − x. This often implies the occurrence of a minus sign in the resulting functions. For example, this is the case in the bounded retransmission protocol, where we have channels with a reliability of pK, such that a transmission succeeds with probability pK and fails with probability 1 − pK.
These three factors (size, high degree and the occurrence of negative numbers) make the evaluation of the resulting functions numerically problematic. The high degree of the polynomials may lead to underflow in calculations, subtraction may lead to cancellation, and the size of the functions aggravates the problems. Because of these factors, evaluating the polynomials in a straightforward manner using single or double-precision arithmetic turned out to be infeasible.
For the graphs used in the case study of Section 5.4, a Maple worksheet was written which reads result functions and evaluates them in a given range for a number of values using infinite-precision arithmetic. These numbers are then converted to floating point values and written to a data file. After this, Gnuplot is used to plot the graph.
However, for larger result files, the memory usage quickly exceeded 1GB, and Maple crashed. To evaluate result functions of very large size, more involved numerical methods or a more efficient implementation will have to be used. One of the most widely known methods is the Horner scheme, but there also exist other approaches [24].
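As a small illustration of such an evaluation strategy, the following sketch combines Horner's scheme with exact rational arithmetic (Python's fractions module), converting to floating point only at the very end; the coefficient lists encode the dice function f(x) from Section 3.4.1:

from fractions import Fraction

def horner(coeffs, x):
    # Evaluate sum_i coeffs[i] * x**i exactly (coefficients by increasing degree).
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + Fraction(c)
    return acc

# f(x) = (x^2 - 2x + 1) / (-x + 2), the dice example of Section 3.4.1
num, den = [1, -2, 1], [2, -1]
xval = Fraction(1, 2)
print(float(horner(num, xval) / horner(den, xval)))   # 0.1666..., i.e. 1/6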
5.2
Zeroconf
Zeroconf is a technology for networking without previous configuration. The motivation
behind Zeroconf is to realize a mechanism to allow the installation of a network as well
as its operation in the most simple way. This case study is based on an example of [8]
where certain aspects of the IPv4 Zeroconf protocol are modelled. In contrary to this
paper, the model was specified in Prism language and the calculations were performed
using Param and not manually as in [6] and [8].
The part considered handles the collision-avoiding mechanism of this protocol. When
a new host joins the network, it randomly selects an address among the K = 65024
possible ones. With m hosts in the network, the collision probability is q = m/K.
In case no collision occurs, the host joins the network. If a collision occurs and is
detected, the host selects a new address and tries again. It may happen, however, that
a collision is not detected. A host must ask n times before being allowed to consider
the chosen address as valid. If after the nth time no collision has been detected, the host will
erroneously consider the chosen address as valid. As n is a parameter influencing the
structure of the model, it cannot yet be handled with the techniques provided by this
thesis. Because of this, the analyses considered here are parametric in p and q only, and
different analyses have to be executed for different n. A sketch of the model is given in
Figure 5.2.
[Figure 5.2: Zeroconf collision detection — from s0 to sok with probability 1 − q and to s1 with probability q; from each si to si+1 with probability p and back to s0 with probability 1 − p; from sn to serr with probability p]
The property to which we applied Param is the probability that finally a correct
address will be selected, which is represented by the target state sok. For each n tested,
the result turned out to be

f(p, q) = (q − 1) / (−p^n q + q − 1)
which is the same value as in [8]. For this analysis, the state-elimination approach of
Section 3.4.1 was used. For this model, the order of state-elimination is critical for the
time of the analysis. In Figure 5.3 two different elimination orders are given.
The order sketched on the left side of the figure first eliminates the states reachable
earlier (states with lower numbers), resulting in the generation of more intermediate
transitions. The effect of the elimination order is shown in Figure 5.4 where the time
needed for the analyses is given for both elimination orders for several n.
In the order on the left of Figure 5.3, we would first eliminate the old initial state, after
inserting a new one without incoming transitions. As this state had incoming transitions
from all states si, 1 ≤ i ≤ n, and 2 outgoing transitions, we remove n transitions but add
n new ones to the target states and also n more new transitions to state 3. When
eliminating the next state, n incoming transitions have to be removed and n − 1 new
incoming transitions have to be added to state 4. Proceeding with the next state, again
n − 1 transitions have to be removed and n − 2 have to be inserted. The elimination of all states except
the target state and the new initial state thus leads to a total of O(n²) transitions modified
in this process.
[Figure 5.3: Two orders of state-elimination for Zeroconf; the numbers attached to the states give their position in the respective elimination order]
If the elimination order on the right side is used, much less work is necessary. Initially,
state 1 is removed. When removing the states 2 to 5 in this order, each time only a constant
number of transitions is modified. After this, only the states s0, sok and serr and a
constant number of transitions are left for each version of the model with a corresponding
elimination order. This means that, by using this order, only a linear number of transitions
is modified during the process.
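A minimal sketch of the state-elimination step of Section 3.4.1 is given below, using sympy for the rational-function arithmetic; the dictionary-based representation is a choice made for this sketch, not Param's data structure. It eliminates the states of the Zeroconf chain for n = 2 in the cheap, right-to-left order.

import sympy as sp

def eliminate(prob, v):
    """Remove state v, rerouting every u -> v -> w pair through u -> w."""
    loop = prob.pop((v, v), sp.Integer(0))
    ins = {u: pr for (u, w), pr in prob.items() if w == v}
    outs = {w: pr for (u, w), pr in prob.items() if u == v}
    for u in ins:
        del prob[(u, v)]
    for w in outs:
        del prob[(v, w)]
    for u, pu in ins.items():
        for w, pw in outs.items():
            prob[(u, w)] = sp.simplify(
                prob.get((u, w), sp.Integer(0)) + pu * pw / (1 - loop))

p, q = sp.symbols('p q')
# Zeroconf chain for n = 2: states 0 (initial), 1, 2, 'ok' and 'err'
prob = {(0, 'ok'): 1 - q, (0, 1): q,
        (1, 0): 1 - p, (1, 2): p,
        (2, 0): 1 - p, (2, 'err'): p}
for v in (2, 1):                      # eliminate the states far from s0 first
    eliminate(prob, v)
# the remaining self-loop on the initial state is resolved at the end
result = sp.simplify(prob[(0, 'ok')] / (1 - prob.get((0, 0), 0)))
print(result)                         # (1 - q)/(1 - q + p**2*q): the n = 2 case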
n     time left (s)   time right (s)
10    0.7             0.2
20    2.5             0.2
30    5.8             0.2
40    11.5            0.3
100   91.0            0.5
Figure 5.4: Effect of the order of state-elimination in Zeroconf
5.3 An Instable Model
[Figure 5.5: Model with transition probabilities varying in magnitudes — from s0: probability 0.5 to sn+1, x to s1 and 0.5 − x to s−1; from each si: probability x to si+1 and 1 − x to s−1; from sn: probability x to sn+1]
Markov models in which transition probabilities vary in magnitude lead to problems
when checking properties on them using non-exact arithmetic. For example, consider the
model of Figure 5.5. With x set to 10^-6 and n = 10, Prism was not able to verify hard
bounds on the probability to finally reach sn+1. Using Prism, it could neither be verified
that this probability is larger than or equal to 1/2 nor that it is smaller than or equal to 1/2. A formula
requesting the numerical result delivered 1/2.
However, using Param we are able to obtain a formula representing the probability
to finally reach sn+1 . It turns out to be
f(x) = x^11 + 1/2
[Figure 5.6: Plot of the function f(x) = x^11 + 1/2 on 0 ≤ x ≤ 0.5]
Taking x = 10^-6, the resulting value is 1/2 + 10^-66, which means that the value is larger
than 1/2. This is also the case for all 0 < x ≤ 1/2. As seen in Figure 5.6, for 0 ≤ x up to about 0.30
the values of this function differ only very slightly from 1/2, which explains the
problems observed in Prism.
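The following small Python experiment makes the precision issue concrete, assuming the closed form f(x) = x^11 + 1/2 obtained above: the offset 10^-66 lies far below the roughly 10^-16 relative precision of IEEE double-precision arithmetic, so a floating-point evaluation cannot distinguish f(10^-6) from 1/2, while exact rational arithmetic can.

from fractions import Fraction

x = 1e-6
print(x**11 + 0.5 == 0.5)             # True: the term x**11 vanishes in doubles

x_exact = Fraction(1, 10**6)
f_exact = x_exact**11 + Fraction(1, 2)
print(f_exact > Fraction(1, 2))       # True: the value is exactly 1/2 + 10**-66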
5.4 Bounded Retransmission Protocol
This case study is adapted from Prism and originated from [14]. It describes the bounded
retransmission protocol (BRP), which is a variant of the alternating bit protocol. In the
BRP, files that are sent are divided into N chunks. For each of them, the
number of retransmissions allowed is bounded by MAX. There exists a channel K for
the data and another channel L for sending acknowledgements for packages received.
Both are assumed to lose packages with a certain probability. There actually exist two
different versions of the case study, one which is a purely probabilistic model and one that
includes non-determinism. In this thesis, we considered both of them. For the analysis
of the non-deterministic model, we encoded the non-deterministic choice by a parametric
probabilistic one, as described in Section 3.6.
For different values of N and MAX, several variants of the model were constructed and
analyzed. As parameters, the reliability of the channels K (pK) and L (pL) was taken.
In the original model, these were fixed to be 0.98 and 0.99 respectively. We analyze
the version of the model which is parametric in pL and pK . We consider four different
properties of the Prism case study:
1. “The maximum probability that eventually the sender does not report a successful
transmission”
2. “The maximum probability that eventually the sender reports an uncertainty on
the success of the transmission”
3. “The maximum probability that eventually the sender reports an unsuccessful transmission after more than 8 chunks have been sent successfully”
4. “The maximum probability that eventually the receiver does not receive any chunk
and the sender tried to send a chunk”
First, we show the quotient of a small version of the BRP model with N = 2, MAX = 2
for property 1; it is given in Figure 5.7. The quotient is the same for both the probabilistic
and the non-deterministic case, where non-determinism is resolved by parametric probabilistic
choice. The non-lumped probabilistic model consisted of 78 states and 99 transitions, and
the non-deterministic one contained 255 states and 335 transitions. For this model and
property, the resulting formula turns out to be
−pK^6 pL^6 + 6 pK^5 pL^5 − 15 pK^4 pL^4 + 18 pK^3 pL^3 − 9 pK^2 pL^2 + 1
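As a quick plausibility check (an illustration, not part of Param), the formula can be evaluated exactly at the channel reliabilities pK = 0.98 and pL = 0.99 of the original Prism model:

from fractions import Fraction

pK, pL = Fraction(98, 100), Fraction(99, 100)
u = pK * pL
f = -u**6 + 6*u**5 - 15*u**4 + 18*u**3 - 9*u**2 + 1
print(float(f))     # about 5.3e-05 for this small N=2, MAX=2 instance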
                    no bisimulation                       bisimulation (weak)
N     MAX  States   Trans.   Time (s)  Mem (MB)   States  Trans.  Time (s)  Mem (MB)   Result
64    4    3976     5379     8         8          643     1282    4         8          1.50E-06
64    5    4809     6531     11        10         771     1538    5         10         4.48E-08
256   4    15880    21507    161       27         2563    5122    42        27         6.02E-06
256   5    19209    26115    238       32         3075    6146    56        32         1.79E-07
1024  4    63496    86019    10021     98         10243   20482   1187      98         2.41E-05
1024  5    76809    104451   26327     118        12291   24578   2514      118        7.17E-07
For property 1, results are given in the table above for the probabilistic version without non-determinism. Several different instantiations of N and MAX were considered.
The “Result” column gives an approximation of the result for the variable evaluation
pK = 0.98 and pL = 0.99, as in the original Prism model. Evaluating the formula with
these values was performed after the analysis. The actual formulas (without pL and pK
instantiated to concrete values) are too large to be given here (several DIN A4 pages for
larger models).
[Figure 5.7: Graph of the weak bisimulation quotient of the BRP model with N=2, MAX=2; transitions are labelled with pK, 1 − pK, pL and 1 − pL]
The number of states and transitions as well as the time and memory needed grow with increasing N
and MAX. As can be seen from the table, for larger versions of the model up to about
eight hours were needed when the model was not minimized. Using weak bisimulation
minimization sped up the analysis considerably; it took only two minutes for the analysis with quotienting. The weak bisimulation quotients of corresponding probabilistic and
non-deterministic versions of the model turned out to be exactly the same, which means
that the maximal probabilities are the same as well. Analyses on the non-deterministic model without prior minimization took several minutes even for the smallest models. The problem
is that resolving the non-determinism requires many more variables than the
probabilistic case, where there were only two. This led to complicated intermediate values on transitions during state-elimination. When considering the bisimulation quotient,
analyses of non-deterministic models took more time than for the corresponding purely
probabilistic versions, but were still feasible.
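For intuition, the following is a minimal sketch of signature-based partition refinement computing a strong bisimulation quotient of a parametric DTMC. It is illustrative only: Param uses weak bisimulation, which needs an additional treatment of internal steps, and the representation (sympy polynomials, expanded to obtain a canonical form) is an assumption of this sketch.

import sympy as sp

def lump(states, prob, label):
    """states: list; prob[(u, v)]: sympy expression; label[s]: state labelling."""
    block = {s: label[s] for s in states}          # initial partition by labels
    while True:
        sig = {}
        for s in states:
            acc = {}                               # block -> summed probability
            for (u, v), pr in prob.items():
                if u == s:
                    acc[block[v]] = sp.expand(acc.get(block[v], 0) + pr)
            sig[s] = (block[s], frozenset(acc.items()))
        if len(set(sig.values())) == len(set(block.values())):
            return block                           # partition is stable
        block = sig                                # refine and iterate

x = sp.symbols('x')
states = ['a', 'b', 'goal']
prob = {('a', 'goal'): x, ('a', 'a'): 1 - x,
        ('b', 'goal'): x, ('b', 'b'): 1 - x,
        ('goal', 'goal'): sp.Integer(1)}
label = {'a': '', 'b': '', 'goal': 'goal'}
print(lump(states, prob, label))                   # 'a' and 'b' share a block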
For properties 2 and 3, the performance results turned out to be similar, with respect to
the time and memory needed as well as the number of states and transitions.
For property 4, the number of states and transitions of the original model was similar to
the former ones. However, much less time was needed for the analysis and the quotient
turned out to be much smaller, consisting of only 9 states and 14 transitions for N=1024,
MAX=5. Also, the resulting formulas were much shorter than for the other properties
(no more than a line, even for N=1024, MAX=5).
Evaluation Figure 5.8 gives plots of the probabilities of property 1 for the two model
instantiations (N=32, MAX=3) and (N=32, MAX=4). Parameter pL was fixed to 0.99
to allow for a two-dimensional graph. As the interval for pK, the range [0.9, 1.0] was chosen.
As expected, with increasing reliability of channel K, the probability that the sender
does not finally report a successful transmission decreases. Using this plot, it is easy to see
that if we want the probability of this failure to be below 10^-4, for N=32 and MAX=4
one must have pK larger than about 0.93 if pL = 0.99. Note that such analyses were not
possible with existing tools like Prism, Mrmc or Pass.
[Figure 5.8: Plots for the BRP model, property 1 — probability over pK ∈ [0.9, 1.0] with pL = 0.99, for (N=32, MAX=3) and (N=32, MAX=4)]
As exemplified in Figure 5.9 for the other model instantiations considered, the
ratio between the results for properties 1 to 3 for models with parameters (N=n, MAX=m)
and (N=n, MAX=m+1) was similar to the one above, though not exactly the same. For
pL = 0.99 and pK = 0.98, this ratio was about 33, meaning that allowing one more
retransmission increases safety by a factor of 33 if the channels have the given reliability.
For property 4 this was different; for example, for pL = 0.99 and pK = 0.98 the ratio was
about 50. With increasing reliability of channel K, the ratio grows. This means that the
effect of allowing more retransmissions is even greater if the reliability is already high.
[Figure 5.9: Plot for the BRP model, property 1 — ratio between the results for (N=32, MAX=3) and (N=32, MAX=4) over pK; the ratio is about 33 at pK = 0.98 and grows with pK]
Notice that, for the reasons mentioned in Section 3.6, the weak bisimulation quotient
for the non-deterministic model variants is not valid for minimizing schedulers. Indeed,
when using Prism to check the minimal instead of maximal probabilities, they turned
out to be zero for all properties.
In the current preliminary implementation, the memory consumption is considerably larger
than optimal, and run times could also be improved, as many optimizations remain
to be implemented. However, for larger models, analysis times turned out to be
shorter than when using Prism, even though Param performed a parametric analysis.
For smaller models, Prism is faster.
A Reward Property Finally, consider the analysis of reachability rewards. We consider the expected number of times a data package has to be sent over channel K until the
sender terminates, either in a success or an error state. For N=32, MAX=2, about 37 seconds
were needed.
In Figure 5.10, an overview of the expected number of data transmissions for the model
with N=32 and MAX=2 is given. For pL = 0.90 and pL = 0.95, there exists a value pKmax
for pK that is not equal to 1 and which maximizes the expected number of transmissions needed.
[Figure 5.10: Expected number of data transmissions (N=32, MAX=2) over pK ∈ [0.8, 1.0], with curves for pL = 0.80, pL = 0.90 and pL = 0.95]
To understand this, one must take into account that the sender may terminate
with both success and failure. For pK < pKmax, the reliability of channel K is so
low that decreasing it further leads to a high probability of an early failure of the sender. For
pK > pKmax, the reliability is already so high that, with increasing reliability, fewer and
fewer retransmissions are needed on average for a successful transmission.
5.5 Protein Synthesis
We analyze a stochastic Petri net (SPN) model of protein synthesis [12]. In biological
cells, each protein is encoded by a certain gene. If the gene is active, the corresponding
protein will be synthesized. Proteins may also degrade and thus disappear after some
time. Activation and deactivation of genes, protein synthesis (in case of an active gene)
as well as protein degradation are modelled by stochastic rates.
The SPN model is depicted in Figure 5.11. The place p1 corresponds to an inactive
gene encoding the protein, p2 corresponds to an active gene, and p3 gives the number of
existing proteins. The transition t1 deactivates the gene with rate µ, while t2 activates it
with rate λ. If the gene is active, t3 can produce new proteins with rate ν. Each individual
protein degrades with rate δ, which is modelled by the transition t4.
We consider the property that within time t a state is reached in which 10 or more proteins exist.
[Figure 5.11: SPN for protein synthesis — t2 moves the token from p1 to p2 with rate λ, t1 moves it back with rate µ, t3 adds a token to p3 with rate ν, and t4 removes a token from p3 with rate p3 · δ]
The parameters λ = 1, µ = 5, ν = 1, δ = 0.02 were fixed, and parametrization was performed on the time, as described in Section 3.3, for 0 ≤ t ≤ 10 with a uniformization
rate of q = 72. When applying Param to the model, the results were as follows (a code sketch of the underlying computation is given after the list):
• total time: 72 s
• number of model states: 21
• number of model transitions: 48
• memory used: 28 MB
• probabilities are given in Figure 5.12
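The following Python sketch reproduces the shape of this computation for one fixed time bound. It is an illustration under stated assumptions, not Param's implementation: the protein count is truncated at 10 (the target is made absorbing, which is exact for reachability), the Poisson weights are computed by a naive iteration (a real implementation would use the Fox–Glynn algorithm [10]), and Param computes a polynomial in t rather than a number for a single t.

import math
import numpy as np

lam, mu, nu, delta = 1.0, 5.0, 1.0, 0.02

# states (g, k): gene inactive/active (g = 0/1), k proteins; k = 10 is the target
states = [(g, k) for g in (0, 1) for k in range(11)]
idx = {s: i for i, s in enumerate(states)}
n = len(states)

Q = np.zeros((n, n))                      # CTMC generator matrix
for (g, k), i in idx.items():
    if k >= 10:
        continue                          # target states are made absorbing
    if g == 0:
        Q[i, idx[(1, k)]] += lam          # gene activation (t2)
    else:
        Q[i, idx[(0, k)]] += mu           # gene deactivation (t1)
        Q[i, idx[(1, k + 1)]] += nu       # protein synthesis (t3)
    if k > 0:
        Q[i, idx[(g, k - 1)]] += k * delta  # degradation (t4)
    Q[i, i] = -Q[i].sum()

q = 72.0                                  # uniformization rate, as above
P = np.eye(n) + Q / q                     # uniformized DTMC

def reach_prob(t, eps=1e-12):
    """Probability that >= 10 proteins exist within time t, from (0, 0)."""
    v = np.zeros(n)
    v[idx[(0, 0)]] = 1.0
    target = [i for (g, k), i in idx.items() if k >= 10]
    total, poisson, k = 0.0, math.exp(-q * t), 0
    while poisson > eps or k < q * t:
        total += poisson * v[target].sum()
        v = v @ P
        k += 1
        poisson *= q * t / k
    return total

print(reach_prob(10.0))                   # roughly 1e-4, cf. Figure 5.12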
As expected, the probability that 10 or more protein molecules are synthesized increases with the time bound. We also considered several larger uniformization rates. As seen
in Figure 5.13, for larger uniformization rates, which are necessary to handle larger time
bounds, the run-time grows more than linearly.
[Figure 5.12: Probability for protein synthesis, parametric on time; the probability rises to about 10^-4 at t = 10]
[Figure 5.13: Run-time for protein synthesis, parametric on time, as a function of the uniformization rate]
Chapter 6
Conclusion and Future Work
In this thesis, we have put forward the analysis of parametric Markov models. We considered the analysis of bounded and unbounded reachability properties for both discrete- and continuous-time models. For unbounded reachability, we used a state-elimination approach. For this technique, we noticed that it may be necessary to perform additional validity
checks, which was not considered in the previous work this thesis is based on.
The approach for probabilistic parametric models was extended to models involving
non-determinism in addition to random behavior. A method was introduced to transform
non-determinism into parametric probabilistic choice. This allows the techniques
introduced in this thesis for purely probabilistic models to be applied.
The approach of state-elimination was extended from handling unbounded reachability
properties to handling reward properties.
A number of optimizations were put forward. The first part of the chapter about
them dealt with improvements during the state-space exploration. The second part dealt with
the application of bisimulation minimization to parametric models.
Finally, an implementation of the techniques developed in this thesis was described.
It was applied to a number of case studies, showing the feasibility of the approach.
There are several directions for future work that could extend the contributions of this
thesis.
PCTL/CSL Model Checking Model checking means specifying the correct behaviour
of a system by a formula in a certain logic and then automatically checking whether a
formal system fulfills the specified property by considering all its possible behaviors. For
non-parametric Markov models, the logic PCTL is used for discrete-time systems and
CSL for continuous-time systems. In this context, model checking has the
advantage that, by using nested formulas, more complicated properties can be expressed
than the reachability considered in this thesis. However, in the general case, model checking
for parametric Markov models is problematic. For example, consider the PDTMC of
Figure 6.1. It consists of 4 states. The state s3 is labelled by a, allowing it
to be distinguished from the others in a PCTL formula. Let A be some polynomial. Further,
consider the PCTL formula P=? [F P≤1/2 [X a]]. It requests the probability (P=?) that finally
(F) a state is reached in which the probability that the next state (X) is labelled with a is smaller than or equal to 1/2 (P≤1/2).
The result is 1 if A ≥ 1/2 and 0 otherwise. Deciding whether s0 fulfills this property involves
computation of the roots of A, which is only possible by approximative methods if A has
a degree larger than 4. Also, the result is no longer expressible by polynomials or rational
functions, but may need a more complicated (piecewise) definition. For certain classes of
formulas or under certain circumstances, however, model checking for parametric Markov
models may still be feasible.
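To make the obstacle concrete, consider the following sympy sketch with an arbitrarily chosen degree-5 polynomial A: the satisfaction set of the inner operator is bounded by the roots of A − 1/2, which for degree larger than 4 are in general only available through numerical isolation, not in closed form.

import sympy as sp

x = sp.symbols('x')
A = x**5 - x + sp.Rational(1, 3)          # an example parameter polynomial
boundary = sp.Poly(A - sp.Rational(1, 2), x)
roots = boundary.real_roots()
print(roots)                              # implicitly defined CRootOf objects
print([r.evalf(20) for r in roots])       # numerically isolated values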
PMTBDD A Binary Decision Diagram (BDD) [21] is a data structure that can be used
to implicitly represent Kripke structures. Such a representation is often much more compact than an explicit state representation of a model. Multi-Terminal BDDs (MTBDDs),
an extension to BDDs, can be used to represent Markov models. MTBDDs could be
modified to allow polynomials as terminal nodes to represent parametric models. There
also exists an approach [22] to use a special form of BDDs for representing polynomials
exploiting structural similarities between them.
[Figure 6.1: PCTL/CSL model checking — a four-state PDTMC with transitions labelled A and 1 − A; state s3 is labelled with a]
Parameterization In Structure The parametric model types considered were parametric in the rates or probabilities of their transitions. We did not, however, consider
models which are parametric in structure. For example, in the Zeroconf model, the
parameter n had to be fixed before the analysis. An interesting approach could be to
transform Prism models into linear recursive expressions and use a computer algebra
system to solve them. For instance, the Zeroconf model together with the property
considered can be specified by:
s−1 = 1
sn+1 = 0
s0 = (1 − q) s−1 + q s1
sm = (1 − p) s0 + p sm+1,  0 < m ≤ n
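The following sympy sketch illustrates the idea: once the structural parameter n is fixed, the recurrence becomes a finite linear system that a computer algebra system solves symbolically in p and q, reproducing the closed form of Section 5.2.

import sympy as sp

p, q = sp.symbols('p q')
n = 4                                     # structural parameter, fixed in advance

s = sp.symbols(f's0:{n + 1}')             # unknowns s_0, ..., s_n
# the boundary values s_{-1} = 1 and s_{n+1} = 0 are substituted directly
eqs = [sp.Eq(s[0], (1 - q) * 1 + q * s[1])]
eqs += [sp.Eq(s[m], (1 - p) * s[0] + p * (s[m + 1] if m < n else 0))
        for m in range(1, n + 1)]

sol = sp.solve(eqs, list(s), dict=True)[0]
print(sp.simplify(sol[s[0]]))             # equivalent to (1 - q)/(1 - q + q*p**n)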
Reducing Polynomials For now, in the implementation the intermediate values of the
computation were either exact or only had non-exact coefficients. In case intermediate
values become too large, it could be interesting to think about ways of approximating the
polynomials to save memory. It would then be necessary to implement an error-bound analysis to
guarantee the usefulness of the results.
More Efficient Implementation There are numerous ways the implementation could
be improved on the more technical side. For example, the state exploration is rather slow.
Also, we could compare the efficiency of different libraries for handling polynomials or rational functions and possibly link Param to several of them. We could also try to take
advantage of the fact that there are often states with corresponding incoming and outgoing
transitions (which are not bisimilar) by eliminating them at once and computing the
new polynomials or rational functions for the new transitions only once. For example, consider
the model of Figure 5.7: here the states 3 and 6, which are not bisimilar, have incoming
and outgoing transitions labelled with the same probabilities.
State-Elimination Before Composition The Prism language allows several modules to be specified. The modules may interact with each other in a well-defined way.
Currently, Param handles module interaction by flattening the model, that is,
by resolving the module structure into a single module. A possible extension of Param
could perform a state-space exploration for each module first and then minimize the state-space of each module by state-elimination (possibly after applying an appropriate form
of bisimulation) before composing the state-spaces. This way, for some models it may
be possible to save memory by avoiding the generation of the whole state-space of the original
model before minimization.
Parallelization Several parts of Param are possible candidates for a
parallel implementation. This includes the state-space exploration as well as the calculation of the bisimulation quotient. For state-space exploration, this would be especially
efficient in combination with the approach put forward in the last paragraph, as then the
state-space of each module could be handled in parallel, avoiding the need for fine-granular
synchronization. State-elimination can be parallelized by splitting the state-space into
n sets Ai, where n equals the number of processors available, and then having
one thread for each Ai eliminate all non-initial, non-target states of Ai that have no direct
connection to an Aj, i ≠ j. Notice, however, that this only works if the library handling the
transition values is thread-safe.
Combination With Predicate Abstraction Predicate abstraction is a technique
for minimizing the state-space of a formal model. In this method, a set of predicates
is introduced. It can then be used to generate an abstraction of the state-space by
combining all states for which the same subset of predicates is fulfilled, assuming that a
state is an evaluation of the model variables. In [26], this method has been applied successfully
to probabilistic systems for the first time. It is planned to combine the results of this
thesis with those of [26] to allow predicate abstraction of parametric probabilistic systems.
Appendix A
Appendix
This appendix gives, for each of the models from Chapter 5, a description in the
higher-level language as well as the property files.
A.1 Zeroconf
Model File
// Model taken from Daws04
// This version by Ernst Moritz Hahn ([email protected])
probabilistic
param double p;
param double q;
const int n = 75;
module main
s: [-2..n+1];
[b] (s=-1) -> (s'=-2);
[a] (s=0) -> 1-q : (s'=-1) + q : (s'=1);
[a] (s>0) & (s<n+1) -> 1-p : (s'=0) + p : (s'=s+1);
endmodule
init
s = 0
endinit
rewards
[a] true : 1;
[b] true : n-1;
endrewards
Formula File
// Pmax=? [ (true) U (s=-1) ]
R=? [ F s=-2 | s=75+1 ]
A.2 An Instable Model
Model File
// Using symbolic analyses to avoid loss of precision verifying awkward model
// Author: Ernst Moritz Hahn ([email protected])
probabilistic
param float epsilon;
const int n=10;
module main
s: [-1..n+1];
[] (s=-1) -> (s'=-1);
[] (s=0) -> 0.5 : (s'=n+1) + epsilon : (s'=1) + 0.5-epsilon : (s'=-1);
[] (s>0) & (s<n+1) -> epsilon : (s'=s+1) + 1-epsilon : (s'=-1);
[] (s=n+1) -> (s'=n+1);
endmodule
init
s = 0
endinit
Formula File
Pmax=? [ true U (s=n+1) ]
A.3 Bounded Retransmission Protocol
Model File The original Prism model for the Bounded Retransmission Protocol is
shipped along with the Prism distribution 3.2.beta1. In the following, the model modified
for this thesis is given.
// bounded retransmission protocol [D’AJJL01]
// gxn/dxp 23/05/2001
// Modified by Ernst Moritz Hahn ([email protected])
probabilistic
// number of chunks
const int N = 64;
// maximum number of retransmissions
const int MAX = 5;
// reliability of channels
param float pL;
param float pK;
global T : bool;
module sender
s : [0..6];
// 0 idle
// 1 next_frame
// 2 wait_ack
// 3 retransmit
// 4 success
// 5 error
// 6 wait sync
srep : [0..3];
// 0 bottom
// 1 not ok (nok)
// 2 do not know (dk)
// 3 ok (ok)
nrtr : [0..MAX];
i : [0..N];
bs : bool;
s_ab : bool;
fs : bool;
ls : bool;
// idle
[NewFile] (s=0) -> (s'=1) & (i'=1) & (srep'=0);
// next_frame
[aF] (s=1) -> (s'=2) & (fs'=(i=1)) & (ls'=(i=N)) & (bs'=s_ab) & (nrtr'=0);
// wait_ack
[aB] (s=2) -> (s'=4) & (s_ab'=!s_ab);
[TO_Msg] (s=2) -> (s'=3);
[TO_Ack] (s=2) -> (s'=3);
// retransmit
[aF] (s=3) & (nrtr<MAX) -> (s'=2) & (fs'=(i=1)) & (ls'=(i=N)) & (bs'=s_ab) & (nrtr'=nrtr+1);
[] (s=3) & (nrtr=MAX) & (i<N) -> (s'=5) & (srep'=1);
[] (s=3) & (nrtr=MAX) & (i=N) -> (s'=5) & (srep'=2);
// success
[] (s=4) & (i<N) -> (s'=1) & (i'=i+1);
[] (s=4) & (i=N) -> (s'=0) & (srep'=3);
// error
[SyncWait] (s=5) -> (s'=6);
// wait sync
[SyncWait] (s=6) -> (s'=0) & (s_ab'=false);
endmodule
module receiver
r : [0..5];
// 0 new_file
// 1 fst_safe
// 2 frame_received
// 3 frame_reported
// 4 idle
// 5 resync
rrep : [0..4];
// 0 bottom
// 1 fst
// 2 inc
// 3 ok
// 4 nok
fr : bool;
lr : bool;
br : bool;
r_ab : bool;
recv : bool;
// new_file
[SyncWait] (r=0) -> (r'=0);
[aG] (r=0) -> (r'=1) & (fr'=fs) & (lr'=ls) & (br'=bs) & (recv'=T);
// fst_safe_frame
[] (r=1) -> (r'=2) & (r_ab'=br);
// frame_received
[] (r=2) & (r_ab=br) & (fr=true) & (lr=false) -> (r'=3) & (rrep'=1);
[] (r=2) & (r_ab=br) & (fr=false) & (lr=false) -> (r'=3) & (rrep'=2);
[] (r=2) & (r_ab=br) & (fr=false) & (lr=true) -> (r'=3) & (rrep'=3);
[aA] (r=2) & !(r_ab=br) -> (r'=4);
// frame_reported
[aA] (r=3) -> (r'=4) & (r_ab'=!r_ab);
// idle
[aG] (r=4) -> (r'=2) & (fr'=fs) & (lr'=ls) & (br'=bs) & (recv'=T);
[SyncWait] (r=4) & (ls=true) -> (r'=5);
[SyncWait] (r=4) & (ls=false) -> (r'=5) & (rrep'=4);
// resync
[SyncWait] (r=5) -> (r'=0) & (rrep'=0);
endmodule
module checker
// [NewFile] (T=false) -> (T'=false);
[NewFile] (T=false) -> (T'=true);
endmodule
module channelK
k : [0..2];
// idle
[aF] (k=0) -> pK : (k'=1) + 1-pK : (k'=2);
// sending
[aG] (k=1) -> (k'=0);
// lost
[TO_Msg] (k=2) -> (k'=0);
endmodule
module channelL
l : [0..2];
// idle
[aA] (l=0) -> pL : (l'=1) + 1-pL : (l'=2);
// sending
[aB] (l=1) -> (l'=0);
// lost
[TO_Ack] (l=2) -> (l'=0);
endmodule
init
T=false &
bs = false &
s_ab = false &
fs = false &
ls = false &
fr = false &
lr = false &
br = false &
r_ab = false &
recv = false &
srep = 0 &
i = 0 &
k = 0 &
l = 0 &
r = 0 &
s = 0 &
nrtr = 0 &
rrep = 0
endinit
rewards
[aF] true: 1;
endrewards
Property Files
Property 1
Pmax=? [ true U s=5 & T ]
Property 2
Pmax=? [ true U s=5 & T & srep=2 ]
Property 3
Pmax=? [ true U s=5 & T & srep=1 & i>8 ]
Property 4
Pmax=? [ true U !(srep=0) & T & !recv ]
Reward Property
R=? [ F (s=5 | (srep=3)) & T ]
A.4 Protein Synthesis
Model File
// Model taken from GossP98
// This version by Ernst Moritz Hahn ([email protected])
stochastic
param float time;
const lambda = 1.0; // activation rate
const mu = 5.0;     // inactivation rate
const nu = 1.0;     // synthesis rate
const delta = 0.02; // degradation rate
module protein_synthesis
// places
p1: int; // inactive gene
p2: int; // active gene
p3: int; // protein
// transitions
[] p2>0 -> mu*time : (p2'=p2-1) & (p1'=p1+1);
[] p1>0 -> lambda*time : (p1'=p1-1) & (p2'=p2+1);
[] p2>0 -> nu*time : (p3'=p3+1);
[] p3>0 -> p3 * delta * time : (p3'=p3-1);
endmodule
// initial marking
init
p1 = 1 &
p2 = 0 &
p3 = 0
endinit
Formula File
Pmax=?[true U[0,1] p3>=10]
Bibliography
[1] Suzana Andova, Holger Hermanns, and Joost-Pieter Katoen. Discrete-time rewards
model-checked. In Kim Guldstrand Larsen and Peter Niebert, editors, FORMATS,
volume 2791 of Lecture Notes in Computer Science, pages 88–104. Springer, 2003.
[2] Christel Baier, Edmund M. Clarke, Vassili Hartonas-Garmhausen, Marta Z.
Kwiatkowska, and Mark Ryan. Symbolic model checking for probabilistic processes.
In Automata, Languages and Programming, pages 430–440, 1997.
[3] Christel Baier, Boudewijn R. Haverkort, Holger Hermanns, and Joost-Pieter Katoen.
Model-checking algorithms for continuous-time Markov chains. IEEE Trans. Software
Eng., 29(6):524–541, 2003.
[4] Christel Baier, Joost-Pieter Katoen, Holger Hermanns, and Verena Wolf. Comparative branching-time semantics for Markov chains. Inf. Comput., 200(2):149–214,
2005.
[5] Bianco and de Alfaro. Model checking of probabilistic and nondeterministic systems.
FSTTCS: Foundations of Software Technology and Theoretical Computer Science,
15, 1995.
[6] Henrik C. Bohnenkamp, Peter van der Stok, Holger Hermanns, and Frits W. Vaandrager. Cost-optimization of the IPv4 Zeroconf protocol. In DSN, pages 531–540.
IEEE Computer Society, 2003.
[7] Frank Ciesinski and Marcus Größer. On probabilistic computation tree logic. In
Christel Baier, Boudewijn R. Haverkort, Holger Hermanns, Joost-Pieter Katoen,
and Markus Siegle, editors, Validation of Stochastic Systems, volume 2925 of Lecture
Notes in Computer Science, pages 147–188. Springer, 2004.
[8] Conrado Daws. Symbolic and parametric model checking of discrete-time Markov
chains. In Zhiming Liu and Keijiro Araki, editors, ICTAC, volume 3407 of Lecture
Notes in Computer Science, pages 280–294. Springer, 2004.
[9] Salem Derisavi. A symbolic algorithm for optimal Markov chain lumping. In Grumberg and Huth [13], pages 139–154.
[10] Bennett L. Fox and Peter W. Glynn. Computing Poisson probabilities. Commun.
ACM, 31(4):440–445, 1988.
[11] Donald Gross and Carl M. Harris. Fundamentals of queueing theory (2nd ed.). John
Wiley & Sons, Inc., New York, NY, USA, 1985.
[12] Peter J. E. Gross and Jean Peccoud. Quantitative modeling of stochastic systems
in molecular biology by using stochastic Petri nets. Proc. Natl. Acad. Sci. USA,
95:6750–6755, 1998.
[13] Orna Grumberg and Michael Huth, editors. Tools and Algorithms for the Construction and Analysis of Systems, 13th International Conference, TACAS 2007, Held as
Part of the Joint European Conferences on Theory and Practice of Software, ETAPS
2007 Braga, Portugal, March 24 - April 1, 2007, Proceedings, volume 4424 of Lecture
Notes in Computer Science. Springer, 2007.
[14] L. Helmink, M. Sellink, and F. Vaandrager. Proof-checking a data link protocol. In
H. Barendregt and T. Nipkow, editors, Proc. International Workshop on Types for
Proofs and Programs (TYPES’93), volume 806 of LNCS, pages 127–165. Springer,
1994.
[15] David N. Jansen, Joost-Pieter Katoen, Marcel Oldenkamp, Mariëlle Stoelinga, and
Ivan Zapreev. How fast and fat is your probabilistic model checker? In Haifa
Verification Conference, HVC’07, LNCS. Springer, 2007. To be published.
[16] Joost-Pieter Katoen, Tim Kemna, Ivan S. Zapreev, and David N. Jansen. Bisimulation minimisation mostly speeds up probabilistic model checking. In Grumberg and
Huth [13], pages 87–101.
[17] Joost-Pieter Katoen, Marta Kwiatkowska, Gethin Norman, and David Parker. Faster
and symbolic CTMC model checking. Lecture Notes in Computer Science, 2165:23–
38, 2001.
[18] J. Kemeny, J. Snell, and A. Knapp. Denumerable Markov chains. Van Nostrand,
1966.
[19] D.E. Knuth and A.C. Yao. The complexity of nonuniform random number generation. In J. F. Traub, editor, Algorithms and Complexity: New Directions and Recent
Results, pages 357–428. 1976.
[20] M. Kwiatkowska, G. Norman, and D. Parker. PRISM: Probabilistic symbolic
model checker. In P. Kemper, editor, Proc. Tools Session of Aachen 2001 International Multiconference on Measurement, Modelling and Evaluation of Computer-Communication Systems, pages 7–12, September 2001. Available as Technical Report
760/2001, University of Dortmund.
[21] C.Y. Lee. Representation of Switching Circuits by Binary-Decision Programs. Bell
Systems Technical Journal, 38:985–999, July 1959.
[22] S. Minato. Implicit manipulation of polynomials using zero-suppressed BDDs. In
EDTC ’95: Proceedings of the 1995 European conference on Design and Test, page
449, Washington, DC, USA, 1995. IEEE Computer Society.
[23] A. V. Ramesh and Kishor Trivedi. Semi-numerical transient analysis of Markov models. In ACM-SE 33: Proceedings of the 33rd annual Southeast regional conference,
pages 13–23, New York, NY, USA, 1995. ACM.
[24] L L Shumaker and W Volk. Efficient evaluation of multivariate polynomials. Comput.
Aided Geom. Des., 3(2):149–154, 1986.
[25] William J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1994.
[26] Björn Wachter, Lijun Zhang, and Holger Hermanns. Probabilistic model checking
modulo theories. In QEST ’07: Proceedings of the Fourth International Conference
on Quantitative Evaluation of Systems, pages 129–140, Washington, DC, USA, 2007.
IEEE Computer Society.