Chapter 3
MARKOV PROCESS
3.0 INTRODUCTION
A Markov process is a stochastic (random) process used in decision problems in which the probability of transition to any future state depends only on the current state and not on the manner in which that state was reached.
Mathematically, P{X_n | X_{n-1}, X_{n-2}, ..., X_2, X_1} = P{X_n | X_{n-1}}.
Markov analysis involves the study of the present behaviour of a system in order to predict the future behaviour of the same system. It was introduced by the Russian mathematician Andrey A. Markov; the general theory concerning Markov processes was developed by A. N. Kolmogorov, W. Feller and others. Markov processes are a special class of mathematical models used in decision problems associated with dynamic systems. They are perhaps most widely used as a marketing aid for examining and predicting customer behaviour concerning loyalty to one brand of a given product and switching patterns to other brands of the same product. Other applications that have been found for Markov processes include:
Manpower planning
Monsoon rainfall prediction
Forecast of wind power generation
Assessing the behaviour of stock prices
Scheduling hospital admissions
A simple Markov chain is a discrete random process with the Markovian property. In general, the term Markov chain is used to refer to a Markov process that is discrete with a finite state space. Usually a Markov chain is defined for a discrete set of times (i.e. a discrete-time Markov chain), although some authors use the same terminology where time can take continuous values.
In a Markov process the possible states (i.e. the state space) that the system under focus could take at any point of time are clearly defined. As the system changes its state randomly, it is difficult to predict the next future state with certainty; instead, the statistical properties of the system's future state are forecast. The change of the system from one state to another state is called a transition, and the probability associated with this state transition is called the transition probability. The state space and the associated transition probabilities characterize the Markov chain.
For instance, the system may be a customer interested in a certain commodity who has the option of selecting the commodity from three available brands, namely A, B and C. If the customer selects brand A, we might say that the system is in state 1. If the customer selects brand B, we might say that the system is in state 2, and if he/she selects brand C, we might say that the system is in state 3. Similarly, the system might be the inventory level of a given item: zero inventory could be denoted by state 1, one to ten units of inventory by state 2, and so on. The relevant questions one may pose are:
1. If the system is in state 'i' (i = 1, 2, 3, …) on day one, then what is the probability that it is in state 'j' (j = 1, 2, 3, …) in 'n' steps, i.e. on day 2 or day 3 or day 4 or … on day n?
2. After a large number of steps (n → ∞), what is the probability that the system will be in state 's'?
3. If a company currently has a certain share of the market, what
share of the market will it have n steps from now?
Markov processes are capable of answering these and many other
questions relative to dynamic systems.
A simple Markov process is illustrated in the following example.
Let us consider monsoon rainfall over a particular region as a system with two states: the first being that rainfall occurs and the second that rainfall does not occur. During the monsoon, if rainfall occurs today, the probability that there will be rainfall a day later is 0.7, and the probability that there will be no rainfall a day later is 0.3. If there is no rainfall today, the probability that there will be rainfall a day later is 0.6 and the probability that there will be no rainfall a day later is 0.4. Let state-1 represent rainfall and state-2 represent no rainfall; then the transition probabilities can be represented as given in Table 3.1.
                               To: Rainfall (State-1)    To: No rainfall (State-2)
From: Rainfall (State-1)               0.7                        0.3
From: No rainfall (State-2)            0.6                        0.4
Table 3.1: Transition probabilities for monsoon rainfall example
The process is represented in Fig. 3.1 by two flow diagrams whose upward branches represent moving to state-1 and whose downward branches represent moving to state-2.
Suppose the monsoon starts with state-1 (rainfall). There is then a 0.7 probability that there will be rainfall on the 2nd day. Now consider the state on the 3rd day. The probability of state-1 on the 3rd day is 0.67 (i.e. 0.49 + 0.18). The corresponding probability of state-2 on the 3rd day, given that the system started in state-1 on the 1st day, is 0.33 (i.e. 0.21 + 0.12). Transition probabilities for subsequent days can be computed in a similar manner.
[Fig. 3.1 shows two day-by-day flow diagrams. Starting with state-1 on the 1st day, the 3rd-day branch probabilities are 0.49, 0.21, 0.18 and 0.12; starting with state-2 on the 1st day, they are 0.42, 0.18, 0.24 and 0.16.]
Fig 3.1: Flow diagram representing the Markov Process
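The day-by-day probabilities in Fig. 3.1 are simply entries of powers of the one-step transition matrix. As a minimal sketch (assuming Python with numpy is available; neither appears in the original text), the 3rd-day values can be reproduced as follows:

import numpy as np

# One-step transition matrix from Table 3.1
# rows = today's state (state-1 rainfall, state-2 no rainfall), columns = tomorrow's state
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Two-step (3rd-day) transition probabilities are the entries of P squared
P2 = P @ P
print(P2)
# Row 1: [0.67, 0.33] -> starting in state-1, i.e. 0.49 + 0.18 and 0.21 + 0.12
# Row 2: [0.66, 0.34] -> starting in state-2, i.e. 0.42 + 0.24 and 0.18 + 0.16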
Table 3.2 shows that the probability of state-1 on any future day tends towards 2/3, irrespective of the initial state on day-1. This probability is called the steady state probability of being in state-1. On similar lines, the steady state probability of state-2 on any future day is 1/3 (i.e. 1 − 2/3).
Day        Probability that there will be state-1 on a future day,
Number     given that it started in
           STATE-1 on day-1            STATE-2 on day-1
1          1.0                         0.0
2          0.7                         0.6
3          0.67    … (2/3)             0.66    … (2/3)
4          0.667   … (2/3)             0.666   … (2/3)
5          0.6667  … (2/3)             0.6666  … (2/3)
6          0.66667 … (2/3)             0.66666 … (2/3)
Table 3.2: Steady state probabilities for monsoon rainfall example
These steady state probabilities find considerable significance in several decision processes. For example, if we are deciding whether to hire a machine with two states, working (state-1) and breakdown (state-2), the steady state probability of state-2 indicates the fraction of time the machine would be in breakdown condition in the long run, and this fraction of breakdown time would be the key factor in deciding whether or not to hire the said machine.
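As an illustrative sketch of how such steady state probabilities can be obtained numerically (the text itself obtains them by inspection of Table 3.2; numpy is assumed here), one can either keep powering the transition matrix or solve the linear system πP = π together with Σπ = 1:

import numpy as np

P = np.array([[0.7, 0.3],     # transition matrix of Table 3.1
              [0.6, 0.4]])

# Option 1: power the matrix until the rows stop changing
Pn = np.linalg.matrix_power(P, 50)
print(Pn)                     # every row is approximately [2/3, 1/3]

# Option 2: solve pi P = pi together with sum(pi) = 1 (least-squares form)
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)                     # approximately [0.6667, 0.3333]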
3.1 CHARACTERISTICS OF MARKOV PROCESS:
The important characteristics of a first order Markov process, or simple Markov process, are:
1. The probabilities of going to each of the states of the system depend only on the current state and not on the manner in which the current state was reached. This means that the next state of the system depends on the current state and is completely independent of the previous states of the system. This property is popularly known as the property of "no memory", which simply means that there is no need to remember how the process reached a particular state at a particular period.
2. There are initial conditions that take on less and less importance as the process operates, eventually 'washing out' when the process reaches the steady state. Accordingly, the term steady state probability is defined as the long run probability of being in a particular state, after the process has been operating long enough to wash out the initial conditions.
3. In a Markov process we assume that the process is discrete in state
space and also in time.
4. We also assume that in a simple Markov process the switching behaviour is represented by a transition matrix (a matrix containing the transition probabilities). The conditional probabilities of moving from one state to another, or remaining in the same state, in a single time period are termed transition probabilities.
3.2 MARKOV PROCESS:
A stochastic system is said to follow a Markov process if the
occurrence of a future state depends on the immediately preceding
state only.
Therefore, if t_0 < t_1 < … < t_n represent instants on the time scale, then the set of random variables {X(t_n)} whose state space is S = {x_0, x_1, …, x_{n-1}, x_n} is said to follow a Markov process provided it satisfies the Markovian property:
P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, …, X(t_0) = x_0} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
for all X(t_0), X(t_1), …, X(t_n).
If the random process at time t_n is in state x_n, the future state of the random process X(t_{n+1}) at time t_{n+1} depends only on the present state x_n and not on the past states x_{n-1}, x_{n-2}, …, x_0.
Examples of Markov processes are:
A first order differential equation is Markovian.
The probability of rain today depends on the weather conditions that existed over the last two days and not on earlier weather conditions.
3.2.1 First Order Markov Chain (FOMC)
A first order Markov process is based on the following three assumptions:
(i) The set of possible outcomes is finite.
(ii) The probability of the next outcome (state) depends only on the immediately preceding outcome.
(iii) The transition probabilities are constant over time.
A simple Markov process is discrete and constant over time. A system is said to be discrete in time if it is examined at regular intervals, e.g. daily, monthly or yearly.
Mathematically, the probability
P_{x(n-1), x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
is called the FOMC transition probability; it represents the probability of moving from one state to another future state.
The second order Markov process assumes that the probability of the next outcome (state) may depend on the two previous outcomes. Likewise, an l-th order Markov process assumes that the probability of the next state can be calculated by obtaining and taking account of the past 'l' states.
Second order Markov process is discussed in detail in Sec 3.10.
3.3 CLASSIFICATION OF MARKOV PROCESS
A Markov process can be classified into four types, based on the nature of the values taken by 't' and X_i.
(i) A continuous random process satisfying the Markov property is called a continuous parameter Markov process, as 't' and X_i are both continuous.
(ii) A continuous random sequence satisfying the Markov property is called a discrete parameter Markov process, as 't' is discrete and X_i is continuous.
(iii) A discrete random process satisfying the Markov property is called a continuous parameter Markov chain, as 't' is continuous and X_i is discrete.
(iv) A discrete random sequence satisfying the Markov property is called a discrete parameter Markov chain, as 't' and X_i are both discrete.
Classification of states of a Markov process:
The different types of states of a Markov process are listed below:
(i) State 'j' is said to be accessible from state 'i' if P_ij^(n) > 0 for some n ≥ 0.
(ii) If two states 'i' and 'j' are accessible from each other, they are said to communicate. In general,
(a) any state 'i' communicates with itself for all i ≥ 0 (from the definition of communicating states);
(b) if state 'i' communicates with state 'k' and state 'k' communicates with state 'j', then state 'i' communicates with state 'j' (from the Chapman-Kolmogorov equations).
(iii) Two states that communicate are in the same class. Two classes of states are either identical or disjoint.
(iv) The Markov chain is irreducible if there is only one class, i.e. if all states communicate with each other: P_ij^(n) > 0 for some n and for all 'i' and 'j'.
(v) A state is said to be an absorbing state if no other state is accessible from it; that is, for an absorbing state 'i', P_ii = 1. A Markov chain is absorbing if
(a) it has at least one absorbing state, and
(b) it is possible to go from every non-absorbing state to at least one absorbing state.
(vi) State 'i' is said to be recurrent if, starting in state 'i', the process will ever re-enter state 'i' with probability P_i = 1. If state 'i' is recurrent, then starting from state 'i', the process will re-enter state 'i' again and again. So a state 'i' is recurrent if
Σ_{n=1}^∞ P_ii^(n) = ∞.
If state 'i' communicates with state 'j' and state 'i' is recurrent, then state 'j' is also recurrent.
μ_i = Σ_n n P_ii^(n) is the expected or mean recurrence time of state 'i'. State 'i' is positive recurrent if, starting in 'i', the expected time until the process returns to state 'i' is finite. A recurrent state need not be positive recurrent; such recurrent states are called null recurrent. But in a finite state Markov chain, all recurrent states are positive recurrent.
(vii) State 'i' is said to be transient if, starting from state 'i', the process will re-enter state 'i' with probability < 1. If a state 'i' is transient, then whenever the process enters state 'i', there is a probability 1 − P_i that it will never enter state 'i' again. So if state 'i' is transient, then
Σ_{n=1}^∞ P_ii^(n) < ∞.
(viii) A state 'i' is called an essential state if it communicates with every state it leads to. Assume P_jk^(n) > 0; then if P_kj^(m) > 0 for some m, 'j' is an essential state.
(ix) A state 'i' is said to have period 'd' if P_ii^(n) = 0 whenever 'n' is not divisible by 'd', i.e. for all values of 'n' other than d, 2d, 3d, …, and 'd' is the largest integer with this property. Equivalently, a state 'i' is periodic with period 'd' if the GCD of the set {n : P_ii^(n) > 0, n ≥ 1} is d > 1. A state is said to be aperiodic if its period is 1.
(x) Positive recurrent, aperiodic states are called ergodic. For an ergodic Markov chain, it is possible to pass from one state to another in a finite number of steps, regardless of the present state. A special case of an ergodic Markov chain is the regular Markov chain. A regular Markov chain is defined as a chain having a transition matrix P such that some power of P has only non-zero (positive) probability values. All regular chains must be ergodic chains. The easiest way to check whether an ergodic chain is regular is to square the transition matrix repeatedly until all zeros are removed.
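The squaring test described above can be sketched in code as follows (is_regular is a hypothetical helper written here for illustration, not a standard library routine; numpy assumed):

import numpy as np

def is_regular(P, max_squarings=10):
    """Repeatedly square the transition matrix; the chain is regular
    if some power has only strictly positive entries."""
    Q = np.array(P, dtype=float)
    for _ in range(max_squarings):
        if np.all(Q > 0):
            return True
        Q = Q @ Q
    return False

# Example: the one-step matrix has a zero entry, but its square does not
P = np.array([[0.0, 1.0],
              [0.5, 0.5]])
print(is_regular(P))   # True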
3.4 TRANSITION PROBABILITY AND TRANSITION PROBABILITY
MATRIX
The probability of moving from one state to another, or remaining in the same state, during a single period is called the transition probability.
P_{x(n-1), x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
is called the FOMC transition probability. This represents the conditional probability that the system is now in state x_n at time t_n, given that it was previously in state x_{n-1} at time t_{n-1}.
The transition probabilities can be arranged in a matrix of size m x m, and such a matrix is called the one step Transition Probability Matrix (TPM), represented as below:
    P = [ P_11  P_12  …  P_1m
          P_21  P_22  …  P_2m
          …
          P_m1  P_m2  …  P_mm ] ,   where 'm' represents the number of states.
The matrix P is a square matrix, each of whose elements is non-negative, and the sum of the elements in each row is unity, i.e.
Σ_{j=1}^{m} P_ij = 1,   i = 1 to m,   and 0 ≤ P_ij ≤ 1.
The initial estimates of P_ij can be computed as P_ij = N_ij / N_i (i, j = 1 to m), where N_ij is the number of items or units or observations in the raw data sample that transitioned from state i to state j, and N_i is the number of raw data observations in state i.
In general, any matrix P whose elements are non-negative and whose elements sum to unity in each row (or each column) is called a transition matrix or stochastic matrix. Since the number of rows is equal to the number of columns in the matrix, it gives a complete description of the Markov process.
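A minimal sketch of the estimate P_ij = N_ij / N_i from raw data is given below (the observed sequence and the helper estimate_tpm are hypothetical, introduced only for illustration; states are assumed to be coded 0, …, m−1):

import numpy as np

def estimate_tpm(sequence, m):
    """Count transitions i -> j in an observed state sequence and
    divide each row of counts by the row total N_i."""
    N = np.zeros((m, m))
    for a, b in zip(sequence[:-1], sequence[1:]):
        N[a, b] += 1
    row_totals = N.sum(axis=1, keepdims=True)
    return N / np.where(row_totals == 0, 1, row_totals)

# Hypothetical observed sequence over 3 states
seq = [0, 1, 1, 2, 0, 0, 1, 2, 2, 1, 0]
print(estimate_tpm(seq, 3))    # each non-empty row sums to 1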
3.5 n- STEP TRANSITION PROBABILITIES
Consider a system which is in state i at time t = 0; we are interested in the probability that the system moves to state j at time t = n (sometimes these time periods are counted as a number of steps). The n-step transition probabilities P_ij^(n) can be represented in matrix form as:
    P^(n) = [ P_11^(n)  P_12^(n)  P_13^(n)  …  P_1m^(n)
              P_21^(n)  P_22^(n)  P_23^(n)  …  P_2m^(n)
              …
              P_m1^(n)  P_m2^(n)  P_m3^(n)  …  P_mm^(n) ]
In this matrix, P_21^(n) represents the probability that the system, currently in state 2, will move to state 1 after 'n' steps.
3.5.1 State Probabilities
Let P_i(n) represent the probability that the system occupies state 'i' at time n, and suppose it moves to state 'j' in one transition. One should note that the transition probability P_ij is independent of time, whereas the absolute (or state) probability P_i(n) depends on time. If the number of possible states is 'm', then
Σ_{i=1}^{m} P_i(n) = 1   and   Σ_{j=1}^{m} P_ij = 1 for all 'i'.
If all the state probabilities are given at time t = n, then the state probabilities at time t = n+1 can be calculated by the equation:
P_j(n+1) = Σ_{i=1}^{m} P_i(n) P_ij ;   n = 0, 1, 2, …
i.e. the probability of being in state 'j' at time t = n+1 is the sum, over all states 'i', of the probability of being in state 'i' at time n multiplied by the probability of moving from state 'i' to state 'j'.
To make the procedure more understandable, the equations for each state probability at time t = n+1 can be written as:
P_1(n+1) = P_1(n) P_11 + P_2(n) P_21 + … + P_m(n) P_m1
P_2(n+1) = P_1(n) P_12 + P_2(n) P_22 + … + P_m(n) P_m2
…
P_m(n+1) = P_1(n) P_1m + P_2(n) P_2m + … + P_m(n) P_mm
This set of equations can be arranged in matrix form as
[P_1(n+1)  P_2(n+1)  …  P_m(n+1)] = [P_1(n)  P_2(n)  …  P_m(n)] [ P_11  P_12  …  P_1m
                                                                  P_21  P_22  …  P_2m
                                                                  …
                                                                  P_m1  P_m2  …  P_mm ]
In compact form it can be written as
X_{n+1} = X_n P        ---(3.1)
where X_{n+1} is the row vector of state probabilities at time t = n+1, X_n is the row vector of state probabilities at time t = n, and P is the matrix of transition probabilities.
If the state probabilities at time t = 0 are known, say X_0, then the state probabilities can be determined at any time by solving the matrix equation (3.1), i.e.
X_1 = X_0 P
X_2 = X_1 P = X_0 P^2
X_3 = X_2 P = X_1 P^2 = X_0 P^3
…
X_n = X_{n-1} P = X_0 P^n
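Equation (3.1) translates directly into code. A short sketch for the rainfall example follows, assuming the system starts in state-1 (the initial vector X_0 is chosen here purely for illustration):

import numpy as np

P = np.array([[0.7, 0.3],
              [0.6, 0.4]])
X = np.array([1.0, 0.0])        # X_0: assume the system starts in state-1

for n in range(1, 6):
    X = X @ P                   # X_n = X_{n-1} P, equation (3.1)
    print(n, X)
# X_n approaches the steady-state vector [2/3, 1/3]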
3.6 CHAPMAN-KOLMOGOROV EQUATION
The Chapman-Kolmogorov equation provides a method to compute the n-step transition probabilities. It embodies the fact that the n-step probabilities, for any n, can be calculated from the 1-step probabilities.
The equation can be represented as
P_ij^(n+m) = Σ_{k=1}^{M} P_ik^(n) P_kj^(m)   for all n, m ≥ 0 and for all i, j = 1, 2, …, M (the state space).
P_ik^(n) P_kj^(m) represents the probability that a process beginning in state 'i' will go to state 'j' in (n+m) transitions or steps through a path taking it into state 'k' at the n-th transition. So, summing over all intermediate states 'k' gives the probability that the process will be in state 'j' after (n+m) transitions.
Proof:
We know that P_ij^(n+m) = P{X_{n+m} = j | X_0 = i}.
State 'j' can be reached from state 'i' through an intermediate state 'k'. So
P_ij^(n+m) = Σ_{k=1}^{M} P{X_{n+m} = j, X_n = k | X_0 = i}
           = Σ_{k=1}^{M} P{X_{n+m} = j | X_n = k, X_0 = i} P{X_n = k | X_0 = i}
           = Σ_{k=1}^{M} P{X_{n+m} = j | X_n = k} P{X_n = k | X_0 = i}        (Markov property)
           = Σ_{k=1}^{M} P{X_m = j | X_0 = k} P{X_n = k | X_0 = i}
           = Σ_{k=1}^{M} P_kj^(m) P_ik^(n)
As mentioned at the start of this section, the Chapman-Kolmogorov equations embody the fact that the n-step probabilities, for any n, can be calculated from the 1-step probabilities. This is easily understood by repeated application of the Chapman-Kolmogorov equations as below:
P^(1) = P   (since P^(1) is just P, the 1-step Transition Probability Matrix)
P^(2) = P^(1+1) = P^(1) P^(1) = P·P = P^2
P^(3) = P^(2+1) = P^(2) P^(1) = P^2·P = P^3
Now, by induction,
P^(n) = P^(n-1+1) = P^(n-1) P^(1) = P^(n-1)·P = P^n
Hence, the n-step transition probability matrix may be determined by multiplying the matrix P by itself 'n' times.
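The relation P^(n+m) = P^(n) P^(m) can be checked numerically for any stochastic matrix; a small sketch using the rainfall matrix of Table 3.1 (numpy assumed):

import numpy as np
from numpy.linalg import matrix_power

P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

n, m = 2, 3
lhs = matrix_power(P, n + m)                    # P^(n+m)
rhs = matrix_power(P, n) @ matrix_power(P, m)   # P^(n) P^(m)
print(np.allclose(lhs, rhs))                    # True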
3.7 STOCHASTIC MATRIX:
A special case of non-negative matrices is the stochastic matrix (transition matrix), for which A = [a_ij] is constrained by the following two conditions:
0 ≤ a_ij ≤ 1   and   Σ_j a_ij = 1.
The stochastic matrix is regular if 0 < a_ij ≤ 1 subject to Σ_j a_ij = 1.
A matrix is called doubly stochastic if, with 0 ≤ a_ij ≤ 1 for all i, j,
Σ_i a_ij = Σ_j a_ij = 1.
Here a_ij is the transition probability of going from state i to state j.
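These conditions are easy to verify numerically. The helpers below are a sketch written for this text (not library functions): they check row sums for a stochastic matrix and both row and column sums for a doubly stochastic one.

import numpy as np

def is_stochastic(A, tol=1e-9):
    A = np.asarray(A, dtype=float)
    return bool(np.all(A >= 0) and np.all(A <= 1)
                and np.allclose(A.sum(axis=1), 1, atol=tol))

def is_doubly_stochastic(A, tol=1e-9):
    A = np.asarray(A, dtype=float)
    return bool(is_stochastic(A) and np.allclose(A.sum(axis=0), 1, atol=tol))

P = [[0.5, 0.5],
     [0.5, 0.5]]
print(is_stochastic(P), is_doubly_stochastic(P))   # True True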
3.8 EIGEN VALUES AND EIGEN VECTORS OF MATRIX
Let A_{n×n} X_{n×1} = λ X_{n×1} hold for some scalar λ.
Then λ is called a characteristic root (or eigen value, or latent root) of A, and X is called a characteristic vector (or eigen vector) of A, or the right characteristic vector of A.
If A is non-symmetric (non-Hermitian) then A′ ≠ A, and if A′X = λX holds for some scalar λ, then X is called the left characteristic vector corresponding to λ. If A is a symmetric matrix then the left and right characteristic vectors are the same.
Since AX = λX,
(A − λI)X = 0
is a system of n linear and homogeneous equations, and |A − λI| is called the characteristic polynomial of A.
If A = [a_ij]_{n×n}, then
|A − λI| = | a_11−λ   a_12     …   a_1n   |
           | a_21     a_22−λ   …   a_2n   |
           | …                            |
           | a_n1     a_n2     …   a_nn−λ |
         = λ^n σ_0 − λ^{n-1} σ_1 + λ^{n-2} σ_2 − … + (−1)^n σ_n,
where σ_0 = 1, σ_n = |A|, and σ_i = the sum of the principal minors of order i in A, i = 1, 2, …, n−1.
The characteristic polynomial can therefore be written as
λ^n σ_0 − λ^{n-1} σ_1 + λ^{n-2} σ_2 − … + (−1)^n σ_n = 0
i.e.  Σ_{r=0}^{n} (−1)^r λ^{n-r} σ_r = 0.
Theorem 3.1
If λ_1, λ_2, …, λ_n are the characteristic roots of A, then A − KI has the characteristic roots λ_1 − K, λ_2 − K, …, λ_n − K, where K is a scalar.
Proof: We have
0 = |A − λI| = Σ_{r=0}^{n} (−1)^r λ^{n-r} σ_r
             = λ^n σ_0 − λ^{n-1} σ_1 + λ^{n-2} σ_2 − … + (−1)^n σ_n.
On factorization,
(λ − λ_1)(λ − λ_2) … (λ − λ_n) = 0,
where λ_1, λ_2, …, λ_n are the roots of |A − λI| = 0, i.e. the characteristic roots of A.
The characteristic roots of A − KI are given by
|A − (K + λ)I| = 0
i.e.
Σ_{r=0}^{n} (−1)^r (λ + K)^{n-r} σ_r = 0
(λ + K)^n σ_0 − (λ + K)^{n-1} σ_1 + (λ + K)^{n-2} σ_2 − … + (−1)^n |A| = 0
(replacing λ by K + λ in |A − λI| = Σ_{r=0}^{n} (−1)^r λ^{n-r} σ_r = 0).
On factorization,
(λ + K − λ_1)(λ + K − λ_2) … (λ + K − λ_n) = 0
(λ − (λ_1 − K))(λ − (λ_2 − K)) … (λ − (λ_n − K)) = 0.
This shows that (A − KI) has the characteristic roots (λ_1 − K), (λ_2 − K), …, (λ_n − K) if A has the characteristic roots λ_1, λ_2, …, λ_n.
Theorem 3.2
If λ is a characteristic root of A, then λ^n is a characteristic root of A^n.
Proof:
AX = λX.
Pre-multiplying both sides by A, we get
A^2 X = A(λX) = λ(AX) = λ(λX) = λ^2 X
A^3 X = A(λ^2 X) = λ^2 (AX) = λ^3 X
A^4 X = λ^4 X
…
A^n X = λ^n X, for any positive integral index n.
Hence λ^n is a characteristic root of A^n.
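Theorem 3.2 can be verified numerically for any square matrix. The matrix A below is an arbitrary example chosen only for illustration (numpy assumed):

import numpy as np
from numpy.linalg import eig, matrix_power

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

vals, _ = eig(A)                        # characteristic roots of A
vals_pow, _ = eig(matrix_power(A, 4))   # characteristic roots of A^4

# The eigenvalues of A^4 are the fourth powers of the eigenvalues of A
print(np.sort(vals**4))
print(np.sort(vals_pow))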
3.9 SPECTRAL DECOMPOSITION OF A MATRIX:
It is a method of matrix decomposition based on eigen values, and hence it is also called eigen decomposition. It is applicable to square matrices. In this method, a matrix is decomposed into a product of three matrices, of which one is diagonal. Thus the decomposition of a matrix into three matrices composed of its eigen values and eigen vectors is known as spectral decomposition.
An n x n matrix P will always have 'n' eigen values, which can be arranged (in more than one way) to form a diagonal matrix D of size n x n and a corresponding matrix V of non-zero columns which satisfies the eigen value equation PV = VD.
Let P have eigen values λ_1, λ_2, λ_3, …, λ_k and corresponding eigen vectors X_1, X_2, X_3, …, X_k, where
X_1 = (x_11, x_12, …, x_1k)′,  X_2 = (x_21, x_22, …, x_2k)′,  …,  X_k = (x_k1, x_k2, …, x_kk)′.
Let the matrix of eigen vectors be
V = [X_1  X_2  …  X_k] = [ x_11  x_21  …  x_k1
                           x_12  x_22  …  x_k2
                           …
                           x_1k  x_2k  …  x_kk ]
and the matrix of eigen values be
D = [ λ_1  0    …  0
      0    λ_2  …  0
      …
      0    0    …  λ_k ]
where D is a diagonal matrix.
Then
PV = P[X_1  X_2  …  X_k] = [PX_1  PX_2  …  PX_k]
   = [λ_1 X_1  λ_2 X_2  …  λ_k X_k]          (as we know PX = λX)
   = [ λ_1 x_11  λ_2 x_21  …  λ_k x_k1
       λ_1 x_12  λ_2 x_22  …  λ_k x_k2
       …
       λ_1 x_1k  λ_2 x_2k  …  λ_k x_kk ]
   = [ x_11  x_21  …  x_k1 ]   [ λ_1  0    …  0   ]
     [ x_12  x_22  …  x_k2 ] · [ 0    λ_2  …  0   ]
     [ …                   ]   [ …                ]
     [ x_1k  x_2k  …  x_kk ]   [ 0    0    …  λ_k ]
   = VD
This results in the decomposition of P as
P = V D V^{-1}.
For a square matrix P, this type of decomposition is always possible as long as V is non-singular (invertible).
Further, by squaring both sides of the above equation,
P^2 = (V D V^{-1})(V D V^{-1}) = V D (V^{-1} V) D V^{-1} = V D^2 V^{-1}.
Mathematically, the spectral decomposition of powers can therefore be represented as
P^i = V D^i V^{-1},  where i = 1 to n.
n-step Transition Probability Matrix (TPM) of a First Order Markov Chain using the Spectral Decomposition Method:
If P represents the four state TPM, then the higher order transition probabilities are obtained by the following procedure.
i. Determine the eigen values of the transition probability matrix P by solving |P − λI| = 0.
ii. If all eigen values, say λ_1, λ_2, λ_3, …, λ_k, are distinct, then obtain k column vectors, say X_1, X_2, X_3, …, X_k, corresponding to the eigen values by solving PV = VD, or PX = λX where X ≠ 0.
iii. Denote these column vectors (eigen vectors) by the matrix V, where V = (X_1, X_2, X_3, …, X_k), and obtain V^{-1}.
iv. Compute D, the diagonal matrix formed from the eigen values of P:
D = [ λ_1  0    …  0
      0    λ_2  …  0
      …
      0    0    …  λ_k ]
v. The higher order Transition Probability Matrix (TPM) of the four state Markov chain can then be computed using the equation P^i = V D^i V^{-1}, where i = 1 to n.
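A minimal sketch of steps (i)-(v) in code, applied to the two-state rainfall matrix of Table 3.1 (any TPM with distinct eigen values is handled the same way; numpy assumed):

import numpy as np

P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

eigvals, V = np.linalg.eig(P)   # steps (i)-(iii): eigen values and matrix V of eigen vectors
D = np.diag(eigvals)            # step (iv): diagonal matrix of eigen values
V_inv = np.linalg.inv(V)

n = 5
P_n = V @ np.linalg.matrix_power(D, n) @ V_inv          # step (v): P^n = V D^n V^-1
print(np.allclose(P_n, np.linalg.matrix_power(P, n)))   # True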
3.10 HIGHER ORDER MARKOV CHAIN
A Second Order Markov Chain (SOMC) assumes that the probability of the next state depends on the two immediately preceding states. Second order time dependence means that the transition probabilities depend on the states at lags of both one and two time periods. The transition probabilities for a second order Markov chain therefore require three subscripts. We then have a Second Order Markov Chain (SOMC) whose transition probabilities are
P_{x(n-2), x(n-1), x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, X(t_{n-2}) = x_{n-2}}.
Similarly, for a third order Markov chain, the notation requires four subscripts on the transition counts and transition probabilities. As the model under study consists of 4 states, the SOMC Transition Probability Matrix (TPM) (Andre Berchtold et al. 2002) can be formulated as
TPM P (rows are labelled by the pair of preceding states (i, j) and columns by the current state k, so each entry is P_ijk):

  i    j    |   I      II     III    IV
  I    I    |  P111   P112   P113   P114
  II   I    |  P211   P212   P213   P214
  III  I    |  P311   P312   P313   P314
  IV   I    |  P411   P412   P413   P414
  I    II   |  P121   P122   P123   P124
  II   II   |  P221   P222   P223   P224
  III  II   |  P321   P322   P323   P324
  IV   II   |  P421   P422   P423   P424
  I    III  |  P131   P132   P133   P134
  II   III  |  P231   P232   P233   P234
  III  III  |  P331   P332   P333   P334
  IV   III  |  P431   P432   P433   P434
  I    IV   |  P141   P142   P143   P144
  II   IV   |  P241   P242   P243   P244
  III  IV   |  P341   P342   P343   P344
  IV   IV   |  P441   P442   P443   P444
The size of the TPM will be m^l x m, and the number of transition probabilities to be calculated in each TPM will be m^(l+1). Table 3.3 gives the number of transition probabilities for various combinations of order (l) and number of states (m).
Number of     Order (l) of     Size of     No. of
states (m)    Markov Chain     the TPM     Transition probabilities
    2              1             2x2             4
    2              2             4x2             8
    2              3             8x2            16
    2              4            16x2            32
    3              1             3x3             9
    3              2             9x3            27
    3              3            27x3            81
    3              4            81x3           243
    4              1             4x4            16
    4              2            16x4            64
    4              3            64x4           256
    4              4           256x4          1024
Table 3.3: Number of transition probabilities for various combinations of order (l) and states (m)
3.11 Parsimonious modeling of Higher order Markov Chain using
Weighted Moving Transition Probabilities
From Table 3.3 it is evident that for a higher order Markov chain with a larger state space, the size of the TPM becomes very large. Estimating several hundreds of parameters is very time consuming, and it also becomes difficult to analyse the results and draw conclusions or decisions.
Moreover, as maintenance cost is proportional to the number of items falling in each state at a certain time, the state probabilities are more important than the intermediate states through which the current state has been reached. Therefore the focus on developing a model that yields a better forecast of the proportion of items in each state at a certain time period is justifiable.
To address this, a parsimonious model, the Weighted Moving Transition Probabilities (WMTP) method, is introduced that approximates higher order Markov chains. This makes the number of parameters to be computed in each TPM far smaller: the size of the TPM is only m x m. Each element of the TPM is the probability of the occurrence of a particular event at time t given the probabilities of the immediately previous l (= order of the Markov chain) time periods, i.e. at times from (t−l) to (t−1). The effect of each lag is taken into account by assigning weights.
For an l-order Markov chain, in general, the probabilities can be estimated as
P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, …, X(t_{n-l}) = x_{n-l}} = Σ_{g=1}^{l} δ_g (P_ij)_g
subject to Σ_{g=1}^{l} δ_g = 1 and δ_g ≥ 0.
Here δ_g is the weight parameter corresponding to lag g, and the (P_ij)_g are the transition probabilities of the corresponding m x m TPM. The method is based on the premise that the most recent value is the most relevant for estimating the future value; consequently the weights decrease as older lags are considered.
The Weighted Moving Transition Probabilities (WMTP) for l = 2 (the second order Markov process) can be written as:
(P_ij)_n = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, X(t_{n-2}) = x_{n-2}}
         = Σ_{g=1}^{2} δ_g (P_ij)_g = δ_{n-1} (P_ij)_{n-1} + δ_{n-2} (P_ij)_{n-2}
where δ_{n-1} + δ_{n-2} = 1 and δ ≥ 0.
As shown in Fig. 3.2, a real Second Order Markov Chain (SOMC) carries the combined influence of the lags, whereas the WMTP model analogue carries the independent influences of each lag on the present.
Fig. 3.2: Pictorial representation of Real SOMC and WMTP modeling of SOMC
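A sketch of the WMTP combination for l = 2 is given below. The lag-wise m x m matrices P_lag1 and P_lag2 and the weights are hypothetical values used only to illustrate the weighted sum; in practice they would be estimated from data as described above.

import numpy as np

def wmtp(P_lags, weights):
    """Weighted Moving Transition Probabilities: combine the m x m TPM
    of each lag into one m x m matrix using non-negative weights that
    sum to one (most recent lag first)."""
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0) and np.all(weights >= 0)
    return sum(w * np.asarray(P, dtype=float) for w, P in zip(weights, P_lags))

# Hypothetical lag-1 and lag-2 transition matrices for a 2-state chain
P_lag1 = [[0.7, 0.3],
          [0.6, 0.4]]
P_lag2 = [[0.5, 0.5],
          [0.4, 0.6]]

print(wmtp([P_lag1, P_lag2], weights=[0.7, 0.3]))   # rows still sum to 1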
Summary: The Markov process, a stochastic mathematical model, is used in decision-making in the face of a great amount of uncertainty. A Markov process is based on the premise that the future state of the system depends on the current state but not on how it reached the present state. The general characteristics of the Markov process, the different states of a Markov process, the order of a Markov process, transition probabilities etc. are discussed in detail. The stochastic matrix, eigen values and eigen vectors of a matrix, and the spectral decomposition of a matrix are also discussed.
By defining higher order multi-state Markov processes and the difficulties in estimating the corresponding transition probabilities, the reasons for introducing the Weighted Moving Transition Probabilities (WMTP) technique are discussed in detail. The Weighted Moving Transition Probabilities (WMTP) technique is a parsimonious model that approximates higher order Markov chains.