Lab 2: Markov Chains – stationary distributions

Math 486 Spring 2011
due: March 7, 2011
Note: Turn in your programs/output with the necessary comments. Electronic
submission is ok, but in this case all work (including code) has to be contained in a
single file suitable for print.
All the files pertinent to this Lab are located at http://www.nmt.edu/~olegm/486/Lab2
1  Finding a stationary distribution
As you already know, a stationary distribution π (if it exists) satisfies the equations

    π = πP,      Σ_i π_i = 1
Given P, you can compute π using

A = P' - eye(n);      % transpose: solve (P' - I)x = 0
A(n,:) = ones(1,n);   % replace the last (redundant) equation ...
b = zeros(n,1);
b(n) = 1;             % ... with the normalization sum(pi) = 1
pi = A\b              % note: this shadows Matlab's built-in constant pi
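As a sanity check, here is the recipe applied to a hypothetical two-state chain (this example is not from the text). The balance equation 0.1 π_1 = 0.3 π_2 together with π_1 + π_2 = 1 gives π = (0.75, 0.25), which the code reproduces:

```matlab
% Hypothetical two-state chain for checking the method.
P = [0.9 0.1; 0.3 0.7];
n = size(P,1);
A = P' - eye(n);      % (P' - I)x = 0
A(n,:) = ones(1,n);   % normalization row
b = zeros(n,1);
b(n) = 1;
p = A\b               % expect approximately [0.75; 0.25]
```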
Exercise 1.
(warm-up) Exercise 1.5, p. 209 – solve using Matlab.
2  Stationary distributions: Pagerank
Markov Chains have been used to great benefit by Google, Inc. in its famous
Pagerank algorithm. The Google founders, Brin and Page, were looking for ways to
rank websites according to their popularity.
The problem is that the raw number of links to a website does not determine
its popularity: a criterion based on link counts is easily foiled by creating a
clique of websites that link to each other but are otherwise not connected to anything
important.
Their idea was as follows. Start at a random webpage. Then, out of the links on
that webpage, pick the next webpage to visit at random. If there are no outgoing links
on the page, start over with another random page. This imitates the behavior of
a user clicking on links at random.
Consider the “Web” that consists of 10 websites connected as follows. The links
between websites can be represented graphically as a directed graph (an arrow from
one site to another indicates a link in that direction).
[Figure: directed graph of the links among sites A–J. The individual arrows are
not reproduced here; the corresponding transition structure is encoded in the
file interweb1.m.]
According to the number of links, site C should be the most popular. Sites D, E, F,
H, I, J should be equally popular, as each has 2 links pointing to it. Note that sites
H, I, J form a separate connected component of the graph.
To avoid the problem of non-uniqueness of the stationary distribution (which arises
when the graph has more than one connected component), Brin and Page decided to
allow a small probability of jumping from any state to any other state at random. This
keeps the advantages of the random-walk idea, while making the graph of Web
links connected and the corresponding transition probability matrix P regular.
The stationary probabilities then provide a measure of a website's popularity: simply
put, a more popular website will be visited more often by the Markov Chain.
The transition probability matrix for this example is given in the file interweb1.m.
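The random-jump modification can be sketched for any transition matrix. Below, a tiny 3-site matrix stands in for the one in interweb1.m (the matrix and the variable names `P`, `pj`, `Pnew` are illustrative assumptions, not taken from that file): with probability pj the surfer jumps to one of the n sites uniformly at random.

```matlab
% Hypothetical 3-site example; in the Lab, P comes from interweb1.m.
P = [0 0.5 0.5; 1 0 0; 1 0 0];
pj = 0.01;                           % random jump probability
n = size(P,1);
Pnew = (1 - pj)*P + (pj/n)*ones(n);  % each row still sums to 1; Pnew is regular
```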
Exercise 2.
(a) Calculate the stationary distribution for the random jump probability pj = 0.01.
(b) Which site(s) are most popular? Does the "clique" strategy adopted by websites
H, I, J work?
3  Stationary distributions for the age replacement system
(based on Sections IV.2.4-2.5)
A system (e.g. an aircraft) has components that wear down. In order for the system
to function, the components need to be replaced from time to time. However, the
cost of preventive maintenance may be lower than the cost of the downtime caused
by the failure of a component. Thus, it might make sense to replace components
preventively. We're going to do the cost analysis and develop recommendations for
the optimal replacement schedule.
Suppose for simplicity that there is only one component. Another simplifying
assumption is that the lifetime of a component has a discrete distribution: it has an
integer lifetime T (in years, say), given by

    P(T = k) = a_k,    k = 1, 2, 3, ...
Let the replacement cost be K, and let the cost due to downtime (e.g. lost productivity,
or defence capability, etc.) be D.
We will find the average cost associated with the N-th policy: "replace the component
that is N years old." If you replace too often, your replacement costs will soar.
If you always replace on failure, then you always incur D, which can be costly. How
do you pick the best N?
The answer is to compute the expected cost under the N-th policy, for each N. That, in
turn, depends on the stationary distribution of a Markov Chain.
For the N-th policy, we have the Markov Chain with the following N-by-N matrix
(p. 222), whose states 0, 1, ..., N−1 record the age of the current component:

              0      1       2       3     ···   N−1
       0   [  p0    1−p0     0       0     ···    0  ]
       1   [  p1     0      1−p1     0     ···    0  ]
  P =  2   [  p2     0       0      1−p2   ···    0  ]
       ⋮   [  ⋮      ⋮       ⋮       ⋮      ⋱     ⋮  ]
      N−1  [  1      0       0       0     ···    0  ]
Here, p_k is the probability that a k-year-old part will break down during the next
year, found by

    p_k = a_{k+1} / (a_{k+1} + a_{k+2} + ...),    k = 0, 1, ..., N − 1
Suppose that you found the stationary distribution π for this Chain. Then,

    average number of replacements (per year)
      = average proportion of time spent in State 0 (brand-new component) = π0.

The average number of planned replacements (per year) is πN−1. Thus, the long-term
rate of failures (and, therefore, of downtime) is π0 − πN−1. Overall,

    Average Cost = K π0 + D (π0 − πN−1)
Note that π depends on N .
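Putting the pieces together, the whole computation can be sketched in Matlab. The lifetime distribution below is a hypothetical stand-in (it is not the one from Exercise 3), and the variable names are assumptions:

```matlab
% Average cost under the N-th policy, for a hypothetical lifetime
% distribution a = [a1 a2 a3] (must sum to 1).
a = [0.2 0.3 0.5];
K = 1; D = 1;
N = 3;

% p_k = a_{k+1} / (a_{k+1} + a_{k+2} + ...),  k = 0, ..., N-1
p = zeros(1,N);
for k = 0:N-1
    p(k+1) = a(k+1) / sum(a(k+1:end));
end

% Build the transition matrix: from age k, fail (go to 0) w.p. p_k,
% otherwise age by one year; at age N-1, replace for sure.
P = zeros(N);
for k = 1:N-1
    P(k,1)   = p(k);
    P(k,k+1) = 1 - p(k);
end
P(N,1) = 1;

% Stationary distribution, as in Section 1.
A = P' - eye(N);
A(N,:) = ones(1,N);
b = zeros(N,1); b(N) = 1;
st = A\b;

cost = K*st(1) + D*(st(1) - st(N))   % = 15/23, approximately 0.652
```

Wrapping this in a loop over N (and over the two (K, D) pairs) gives the comparison asked for in Exercise 3.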
Exercise 3.
For

    a1 = 0.1,  a2 = 0.2,  a3 = 0.3,  a4 = 0.4,

compute the Average Cost under the N-th policy, N = 1, ..., 4. Determine the optimal N
for K = 1, D = 1, and also for K = 1, D = 5.
Exercise 4.
Try to determine the optimal policy for a component with geometric lifetime

    a_k = (1 − p)^(k−1) p,    k = 1, 2, ....

(Pick some limit on N and let p = 1/2 for simplicity. Let K = 1, D = 1.)
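Before coding Exercise 4, it may help to check numerically what the p_k formula gives for a geometric lifetime. The snippet below is a sketch (the truncation length `kmax` and variable names are assumptions); the tail sum a_{k+1} + a_{k+2} + ... must be truncated at some finite index:

```matlab
% Numerical check of p_k = a_{k+1} / (a_{k+1} + a_{k+2} + ...)
% for the geometric lifetime a_k = (1-p)^(k-1) * p with p = 1/2.
p = 1/2;
k = 1:50;                       % truncate the infinite tail at 50 terms
a = (1 - p).^(k-1) * p;
kmax = 6;
pk = zeros(1,kmax);
for j = 0:kmax-1
    pk(j+1) = a(j+1) / sum(a(j+1:end));
end
pk                              % all entries approximately equal to p
```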