Some Applications of Eigenvectors: Markov Chains

Lecture 30: Some Applications of Eigenvectors:
Markov Chains and Chemical Reaction Systems
Winfried Just
Department of Mathematics, Ohio University
November 9, 2015
Winfried Just, Ohio University
MATH3200, Lecture 30: Applications of Eigenvectors
Review: Eigenvectors and left eigenvectors
A nonzero column vector $\vec{x}$ is an eigenvector (also called a right eigenvector)
of a square matrix $A$ with eigenvalue $\lambda$ if
\[ A\vec{x} = \lambda\vec{x}. \]
A nonzero row vector $\vec{y}$ is a left eigenvector of a square matrix $A$
with eigenvalue $\lambda$ if
\[ \vec{y}A = \lambda\vec{y}. \]
Note that $\vec{y}$ left-multiplies $A$ here.
Homework 68: Prove that $\vec{x}$ is a (right) eigenvector of $A$ with
eigenvalue $\lambda$ if, and only if, $\vec{y} = \vec{x}^T$ is a left eigenvector of $A^T$ with
the same eigenvalue $\lambda$.
Ohio University – Since 1804
Winfried Just, Ohio University
Department of Mathematics
MATH3200, Lecture 30: Applications of Eigenvectors
Solution for Homework 68
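Solution: Let $\vec{x}$ be a nonzero column vector. Then
$\vec{x}^T A^T = (A\vec{x})^T$ by the formula for the transpose of a matrix
product.
The left-hand side is equal to $(\lambda\vec{x})^T = \lambda\vec{x}^T$ if, and only if, $\vec{x}^T$ is a left
eigenvector of $A^T$ with eigenvalue $\lambda$.
The right-hand side is equal to $(\lambda\vec{x})^T$ if, and only if, $\vec{x}$ is an
eigenvector of $A$ with eigenvalue $\lambda$.
Since the two sides are always equal, the two conditions are equivalent.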
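The statement of Homework 68 can be spot-checked numerically. The following is a minimal sketch in plain Python; the matrix `A`, the vector `x`, and the helper names `matvec`, `vecmat`, and `transpose` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction as F

def matvec(A, x):
    """Return the column-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def vecmat(y, A):
    """Return the row-vector product y A."""
    return [sum(y[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]

# Hypothetical example: A x = 3 x for x = [1, 1]^T, and A is not symmetric.
A = [[F(2), F(1)],
     [F(0), F(3)]]
x = [F(1), F(1)]
lam = F(3)

# x is a right eigenvector of A ...
right_ok = matvec(A, x) == [lam * xi for xi in x]
# ... and x^T (same entries, read as a row) is a left eigenvector of A^T.
left_ok = vecmat(x, transpose(A)) == [lam * xi for xi in x]
print(right_ok, left_ok)
```

A single example is of course not a proof; it only illustrates what the equivalence claims.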
Review: Markov chains
A Markov chain is a stochastic process.
Time proceeds in discrete steps t = 0, 1, 2, . . .
At each time t the process can only be in one of several states
that are numbered 1, . . . , n.
The probability of being in a given state at time t + 1 depends
only on the state at time t.
The matrix $P = [p_{ij}]_{n \times n}$ gives the transition probabilities $p_{ij}$
from state $i$ at time $t$ to state $j$ at time $t + 1$.
When $\vec{x}(t) = [x_1(t), \dots, x_n(t)]$ is the probability distribution
for the states at time $t$, then the probability distribution
$\vec{x}(t + 1)$ at time $t + 1$ is given by
\[ \vec{x}(t + 1) = \vec{x}(t)P = [x_1(t), \dots, x_n(t)]P. \]
Review: Markov chains for weather.com light
Time proceeds in steps of days.
State 1: sunny day, State 2: rainy day.
Each day is somehow unambiguously classified in this way.
The meaning of the transition probabilities:
$p_{11}$ is the probability that a sunny day is followed by another sunny day.
$p_{12}$ is the probability that a sunny day is followed by a rainy day.
$p_{21}$ is the probability that a rainy day is followed by a sunny day.
$p_{22}$ is the probability that a rainy day is followed by another rainy day.
\[ P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} \]
$\vec{x}(t) = [x_1(t), x_2(t)]$, where
$x_1(t)$ is the probability that day $t$ will be a sunny day.
$x_2(t)$ is the probability that day $t$ will be a rainy day.
An example of P for weather.com light
Let
\[ P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = \begin{bmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{bmatrix}. \]
A sunny day is followed by another sunny day with
probability 0.6.
A sunny day is followed by a rainy day with probability 0.4.
A rainy day is followed by a sunny day with probability 0.3.
A rainy day is followed by another rainy day with
probability 0.7.
P is a stochastic matrix, which means that each row adds up to 1.
This will be true for every transition probability matrix of a
Markov chain, as each state i must be followed by some state in
the next time step.
One-step transitions for our example of P
Let
\[ P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = \begin{bmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{bmatrix}. \]
Consider the following probability distributions for day $t$:
$\vec{x}(t) = [1, 0]$ means that day $t$ is sunny for sure.
$\vec{y}(t) = [0.5, 0.5]$ means equal likelihood of a sunny or a rainy day.
Note that the probabilities of all states always add up to 1.
The corresponding probabilities for the next day are:
\[ \vec{x}(t + 1) = [1, 0]\,P = [1, 0] \begin{bmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{bmatrix} = [0.6, 0.4] \]
\[ \vec{y}(t + 1) = [0.5, 0.5]\,P = [0.5, 0.5] \begin{bmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{bmatrix} = [0.45, 0.55] \]
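The two one-step computations for the weather example can be reproduced with a few lines of Python. This is a minimal sketch using exact fractions; the function name `step` is an illustrative choice, not notation from the lecture.

```python
from fractions import Fraction as F

def step(x, P):
    """One time step of the chain: return the row vector x P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

# Transition matrix of the weather example (state 1 = sunny, state 2 = rainy).
P = [[F(6, 10), F(4, 10)],
     [F(3, 10), F(7, 10)]]

x_next = step([F(1), F(0)], P)        # day t is sunny for sure
y_next = step([F(1, 2), F(1, 2)], P)  # equal likelihood of sunny or rainy

print([float(v) for v in x_next])  # [0.6, 0.4]
print([float(v) for v in y_next])  # [0.45, 0.55]
```

Both results are again probability distributions: their entries add up to 1.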
The eigenvalues and an eigenvector for P of our example
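Let
\[ P = \begin{bmatrix} 0.6 & 0.4 \\ 0.3 & 0.7 \end{bmatrix}, \qquad P - \lambda I = \begin{bmatrix} 0.6 - \lambda & 0.4 \\ 0.3 & 0.7 - \lambda \end{bmatrix}. \]
\[ \det(P - \lambda I) = \lambda^2 - 1.3\lambda + 0.3 = (1 - \lambda)(0.3 - \lambda). \]
The eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 0.3$.
Let's find an eigenvector with eigenvalue 1:
Form
\[ P - 1I = \begin{bmatrix} 0.6 - 1 & 0.4 \\ 0.3 & 0.7 - 1 \end{bmatrix} = \begin{bmatrix} -0.4 & 0.4 \\ 0.3 & -0.3 \end{bmatrix}. \]
Solve
\[ \begin{bmatrix} -0.4 & 0.4 \\ 0.3 & -0.3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad\text{that is,}\quad \begin{aligned} -0.4x_1 + 0.4x_2 &= 0 \\ 0.3x_1 - 0.3x_2 &= 0 \end{aligned} \]
By setting $x_1 = 1$, we see that $\vec{x} = [1, 1]^T$ is an eigenvector with
eigenvalue 1 of $P$.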
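The hand computation above can be cross-checked numerically. The sketch below solves the characteristic polynomial of a 2x2 matrix with the quadratic formula; the helper name `eigenvalues_2x2` is hypothetical, and floating-point results are only expected to agree with 1 and 0.3 up to rounding.

```python
import math

def eigenvalues_2x2(M):
    """Real eigenvalues of a 2x2 matrix from lambda^2 - tr(M) lambda + det(M) = 0."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

P = [[0.6, 0.4],
     [0.3, 0.7]]
lam1, lam2 = eigenvalues_2x2(P)
print(round(lam1, 10), round(lam2, 10))

# Check that [1, 1]^T is an eigenvector with eigenvalue 1: P [1, 1]^T = [1, 1]^T.
ones = [1.0, 1.0]
P_ones = [sum(p * v for p, v in zip(row, ones)) for row in P]
print(P_ones)
```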
The meaning of the eigenvector $[1, 1]^T$
Eigenvectors with eigenvalues $\lambda \neq 1$ are less important for
transition matrices of Markov chains, so we will skip finding an
eigenvector with eigenvalue $\lambda_2 = 0.3$ in our example. But we will
take a closer look at the eigenvector $[1, 1]^T$ with eigenvalue $\lambda_1 = 1$.
Let $A = [a_{ij}]_{n \times n}$ be any square matrix. Then $[1, 1, \dots, 1]^T$ is an
eigenvector of $A$ with eigenvalue $\lambda$ if, and only if,
\[
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}
=
\begin{bmatrix}
a_{11} + a_{12} + \dots + a_{1n} \\
a_{21} + a_{22} + \dots + a_{2n} \\
\vdots \\
a_{n1} + a_{n2} + \dots + a_{nn}
\end{bmatrix}
=
\begin{bmatrix} \lambda \\ \lambda \\ \vdots \\ \lambda \end{bmatrix}.
\]
Thus $[1, 1, \dots, 1]^T$ is an eigenvector of $A$ with eigenvalue $\lambda$ if, and
only if, each row of $A$ adds up to $\lambda$.
In particular, $[1, 1, \dots, 1]^T$ is an eigenvector of $A$ with eigenvalue 1
if, and only if, each row of $A$ adds up to 1, as the rows of a stochastic matrix do.
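The row-sum criterion is easy to test in code. A minimal sketch, where the helper names `row_sums` and `ones_eigenvalue` and the second matrix `B` are illustrative assumptions:

```python
from fractions import Fraction as F

def row_sums(A):
    """A [1, ..., 1]^T equals the column vector of row sums of A."""
    return [sum(row) for row in A]

def ones_eigenvalue(A):
    """Return lambda if [1, ..., 1]^T is an eigenvector of A, else None."""
    sums = row_sums(A)
    return sums[0] if all(s == sums[0] for s in sums) else None

P = [[F(6, 10), F(4, 10)],
     [F(3, 10), F(7, 10)]]      # weather example: every row adds up to 1
B = [[F(1), F(2)],
     [F(3), F(1)]]              # hypothetical matrix with unequal row sums

print(ones_eigenvalue(P), ones_eigenvalue(B))  # 1 None
```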
We have proved a theorem ...
The observation on the previous slide proves parts (a) and (b) of
the following result:
Theorem
Let $P = [p_{ij}]_{n \times n}$ be the matrix of transition probabilities for a
Markov chain. Then
(a) $\lambda^* = 1$ is an eigenvalue of $P$.
(b) $[1, 1, \dots, 1]^T$ is an eigenvector of $P$ with eigenvalue 1.
(c) Every eigenvalue $\lambda$ of $P$ satisfies $|\lambda| \le |\lambda^*| = 1$.
Part (c) is a consequence of a more general theorem called the
Perron-Frobenius Theorem that goes beyond the scope of this
course. This part says that $\lambda^* = 1$ is a so-called leading eigenvalue
of $P$.
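How about the eigenvectors of $P^T$?
Since every square matrix has the same eigenvalues as its
transpose, $\lambda^* = 1$ must also be an eigenvalue of $P^T$.
Let's find a corresponding eigenvector for our example of $P$:
Form
\[ P^T - 1I = \begin{bmatrix} 0.6 & 0.3 \\ 0.4 & 0.7 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -0.4 & 0.3 \\ 0.4 & -0.3 \end{bmatrix}. \]
Solve
\[ \begin{bmatrix} -0.4 & 0.3 \\ 0.4 & -0.3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad\text{that is,}\quad \begin{aligned} -0.4x_1 + 0.3x_2 &= 0 \\ 0.4x_1 - 0.3x_2 &= 0 \end{aligned} \]
We find that every vector of the form $\vec{x} = [x_1, \tfrac{4}{3}x_1]^T$ with $x_1 \neq 0$ is an
eigenvector with eigenvalue 1 of $P^T$. These are the only
eigenvectors with eigenvalue 1 of $P^T$.
Here it will be useful to find the eigenvector $\vec{x} = [x_1, x_2]^T$ with
$x_1 + x_2 = 1$, that is, $1 = x_1 + \tfrac{4}{3}x_1 = \tfrac{7}{3}x_1$.
It is $\vec{x} = [\tfrac{3}{7}, \tfrac{4}{7}]^T \approx [0.4286, 0.5714]^T$.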
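The same normalized eigenvector can be obtained by solving a linear system exactly. The sketch below assumes that the eigenvector with eigenvalue 1 of $P^T$ is unique up to scaling, so that one equation of the singular system $(P^T - I)\vec{x} = \vec{0}$ can be replaced by the condition $x_1 + \dots + x_n = 1$. The function names `solve` and `stationary` are hypothetical.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination (M square and nonsingular)."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def stationary(P):
    """Solve (P^T - I) x = 0 together with x_1 + ... + x_n = 1.

    Assumption: the solution is unique. The last equation of the singular
    system is replaced by the normalization condition.
    """
    n = len(P)
    M = [[P[j][i] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    M[-1] = [F(1)] * n
    return solve(M, [F(0)] * (n - 1) + [F(1)])

P = [[F(6, 10), F(4, 10)],
     [F(3, 10), F(7, 10)]]
x_star = stationary(P)
print(x_star)                                   # [Fraction(3, 7), Fraction(4, 7)]
print([round(float(v), 4) for v in x_star])     # [0.4286, 0.5714]
```

Exact fractions avoid the rounding that the decimal values 0.4286 and 0.5714 introduce.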
The meaning of the eigenvector $[0.4286, 0.5714]^T$ of $P^T$
By the result of Homework 68, the vector
$([0.4286, 0.5714]^T)^T = [0.4286, 0.5714]$ is a left eigenvector of $P$.
Moreover, since the coordinates add up to 1 and are nonnegative,
$[0.4286, 0.5714]$ is a probability distribution.
It follows that if the probability distribution of the weather in our
example on day $t$ is
\[ \vec{x}(t) = [0.4286, 0.5714], \]
then the probability distribution of the weather on day $t + 1$ is
\[ \vec{x}(t + 1) = [0.4286, 0.5714]P = [0.4286, 0.5714]. \]
$\vec{x}^* = [0.4286, 0.5714]$ is a stationary (probability) distribution,
which means that it remains the same on the next and all future
days.
In fact, $\vec{x}^* = [0.4286, 0.5714]$ is the only stationary distribution in
this example.
These observations generalize
Theorem
Let $P$ be the transition probability matrix of a Markov chain with $n$
states and let $\vec{x}^* = [x_1, x_2, \dots, x_n]$ be a probability distribution. Then
(a) $\vec{x}^*$ is a stationary distribution for this Markov chain if, and only
if, $(\vec{x}^*)^T$ is an eigenvector with eigenvalue 1 of $P^T$.
(b) There exists at least one stationary distribution $\vec{x}^*$ of the
Markov chain.
(c) If $\vec{x}^*$ is the only stationary distribution of the Markov chain
and the chain is aperiodic (it does not cycle through its states with a
fixed period), then for any given initial distribution $\vec{x}(0)$, the
distributions $\vec{x}(t)$ approach $\vec{x}^*$ as $t \to \infty$.
Point (b) follows from point (a) and the previous theorem.
Note also that in point (c) it is necessary that $\vec{x}^*$ is unique,
because when we start in one stationary distribution $\vec{y}^*$ then we
cannot approach another stationary distribution $\vec{x}^*$.
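For the weather example, the convergence in point (c) can be observed by multiplying repeatedly by $P$. The following is a minimal sketch; the three initial distributions and the number of 50 steps are arbitrary illustrative choices.

```python
def step(x, P):
    """One time step: return the row vector x P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.6, 0.4],
     [0.3, 0.7]]                                   # weather example
starts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]      # hypothetical initial distributions

finals = []
for x in starts:
    for t in range(50):
        x = step(x, P)
    finals.append(x)
    print([round(v, 4) for v in x])   # each line: [0.4286, 0.5714]
```

The distance to the stationary distribution shrinks by the factor $\lambda_2 = 0.3$ in every step, which is why 50 steps are far more than needed here.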
Some alternative versions of weather.com light
Let us consider some other transition probability matrices P for
weather.com light Markov chains.
Homework 69: (a) Let $P_1 = I_2$ (the weather always stays the
same). Show that in this case every probability
distribution $\vec{x} = [x_1, x_2]$ is a stationary distribution.
(b) Let $P_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$.
Show that in this case $\vec{x}^* = [0.5, 0.5]$ is the unique stationary
distribution.
(c) Find a third transition probability matrix $P_3$ with stationary
distribution $\vec{x}^* = [0.5, 0.5]$.
(d) Formulate a condition on $P$ that appears to guarantee
that $\vec{x}^* = [0.5, 0.5]$ is a stationary distribution and prove that it
does.
Remember Waldo?
Waldo is highly gregarious and motivated, and he spends all of his
evenings working with six students on his MATH 3200 homework.
At 7 p.m. he visits a randomly chosen student i among those six,
and then operates as follows:
He starts working with i. After 10 minutes, he flips a fair coin.
If the coin comes up heads, he continues working with i for
another 10 minutes before flipping the coin again.
If the coin comes up tails, he moves to the room of a randomly
chosen friend of i and repeats the procedure.
He never tires of these efforts until 1 a.m.
Where should we go looking for Waldo at midnight?
The stationary distribution for Waldo
Waldo’s itinerary can be modeled as a Markov chain with
states i = 1, 2, . . . , 6, where one time step lasts 10 minutes.
State i simply means that Waldo is in i’s room.
The transition probability matrix for this Markov chain is
\[
P = \begin{bmatrix}
1/2 & 0 & 0 & 1/4 & 0 & 1/4 \\
0 & 1/2 & 0 & 1/4 & 0 & 1/4 \\
0 & 0 & 1/2 & 1/4 & 1/4 & 0 \\
1/8 & 1/8 & 1/8 & 1/2 & 0 & 1/8 \\
0 & 0 & 1/2 & 0 & 1/2 & 0 \\
1/6 & 1/6 & 0 & 1/6 & 0 & 1/2
\end{bmatrix} = [p_{ij}]_{6 \times 6}
\]
Homework 70: Show that this Markov chain has a unique
stationary probability distribution and find it.
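The question of where to look for Waldo at midnight can be explored by iterating the chain. The sketch below assumes that the entries of `P` are read correctly from the slide, that Waldo starts with a uniformly random student at 7 p.m., and that 7 p.m. to midnight corresponds to 30 steps of 10 minutes. The name `step` is an illustrative choice.

```python
from fractions import Fraction as F

def step(x, P):
    """One 10-minute time step: return the row vector x P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

h, q, e, s = F(1, 2), F(1, 4), F(1, 8), F(1, 6)
P = [[h, 0, 0, q, 0, q],
     [0, h, 0, q, 0, q],
     [0, 0, h, q, q, 0],
     [e, e, e, h, 0, e],
     [0, 0, h, 0, h, 0],
     [s, s, 0, s, 0, h]]
# Sanity check: P is a stochastic matrix.
assert all(sum(row) == 1 for row in P)

x = [F(1, 6)] * 6        # 7 p.m.: each student is equally likely
for _ in range(30):      # 30 steps of 10 minutes: 7 p.m. to midnight
    x = step(x, P)

best = max(range(6), key=lambda i: x[i]) + 1   # states are numbered 1..6
print([round(float(v), 3) for v in x], "most likely room:", best)
```

This only approximates the long-run behavior numerically; it does not replace the argument asked for in Homework 70.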
Eigenvectors with eigenvalue 0 and the nullspace of A
Let $A$ be a square matrix. Let $N(A)$ denote the set of all
eigenvectors of $A$ with eigenvalue 0 together with the zero
vector $\vec{0}$. It is the nullspace of $A$. (A nullspace can be defined for
any matrix $A$, but only for square matrices in terms of
eigenvectors.)
Proposition
(a) $N(A)$ has a nonzero element $\vec{x} \neq \vec{0}$ if, and only if, $A$ is singular.
(b) $N(A)$ is the set of all solutions $\vec{x}$ of the homogeneous
system $A\vec{x} = \vec{0}$, including the trivial solution $\vec{x} = \vec{0}$.
(c) If $\vec{a}_1, \vec{a}_2, \dots, \vec{a}_n$ denote the column vectors of $A$, then $N(A)$ is
the set of all vectors $[x_1, x_2, \dots, x_n]^T$ of coefficients such that
$x_1\vec{a}_1 + x_2\vec{a}_2 + \dots + x_n\vec{a}_n = \vec{0}$.
(d) $N(A)$ is the set of all vectors $\vec{x}$ such that $T_A(\vec{x}) = \vec{0}$.
$N(A)$ is also called the kernel of $T_A$.
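A basis of $N(A)$ can be computed by row reduction, which is one way to make point (b) concrete. The sketch below uses exact fractions; the function names `rref` and `nullspace` are hypothetical, and the example matrix is $P - I$ from the weather example.

```python
from fractions import Fraction as F

def rref(A):
    """Reduced row echelon form of A; returns (R, pivot_columns)."""
    R = [[F(v) for v in row] for row in A]
    pivots, r = [], 0
    for c in range(len(R[0])):
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        R[r], R[p] = R[p], R[r]
        R[r] = [v / R[r][c] for v in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                f = R[i][c]
                R[i] = [vi - f * vr for vi, vr in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots

def nullspace(A):
    """Basis of N(A): one vector for each non-pivot (free) column."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [F(0)] * n
        v[free] = F(1)
        for row, pc in zip(R, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis

# P - I for the weather example is singular, so N(P - I) is nontrivial.
P_minus_I = [[F(-4, 10), F(4, 10)],
             [F(3, 10), F(-3, 10)]]
basis = nullspace(P_minus_I)
print(basis)                                   # [[Fraction(1, 1), Fraction(1, 1)]]
print(nullspace([[1, 0], [0, 1]]))             # []  (nonsingular matrix)
```

The basis vector $[1, 1]^T$ is the eigenvector with eigenvalue 1 of $P$ found earlier, since $N(P - I)$ consists of those eigenvectors together with $\vec{0}$.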
Review: Chemical reaction networks;
net change of concentrations
Consider a chemical reaction network like:
\[
\begin{aligned}
A + 2B &\rightleftharpoons 2C \\
A + 2C &\rightleftharpoons 2D \\
A + B &\rightleftharpoons D \\
B + D &\rightleftharpoons 2C
\end{aligned}
\]
If initial concentrations are denoted by $[A]_0, [B]_0, [C]_0, [D]_0$ and
concentrations are measured again after some time and denoted by
$[A]_1, [B]_1, [C]_1, [D]_1$, then the vector
\[ \vec{w} = [[A]_1 - [A]_0, [B]_1 - [B]_0, [C]_1 - [C]_0, [D]_1 - [D]_0] \]
represents the net change in concentrations.
If some coordinate $[X]_1 - [X]_0$ is positive, then a net production of
compound $X$ was observed; if some coordinate $[X]_1 - [X]_0$ is
negative, then a net consumption of compound $X$ was observed.
Review: Chemical reaction networks;
reaction vectors and stoichiometric matrix
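The reaction vectors of the chemical reaction network
\[
\begin{aligned}
(1)\quad A + 2B &\rightleftharpoons 2C \\
(2)\quad A + 2C &\rightleftharpoons 2D \\
(3)\quad A + B &\rightleftharpoons D \\
(4)\quad B + D &\rightleftharpoons 2C
\end{aligned}
\]
are, with coordinates in the order A, B, C, D,
\[
\vec{v}_1 = \begin{bmatrix} -1 \\ -2 \\ 2 \\ 0 \end{bmatrix}, \quad
\vec{v}_2 = \begin{bmatrix} -1 \\ 0 \\ -2 \\ 2 \end{bmatrix}, \quad
\vec{v}_3 = \begin{bmatrix} -1 \\ -1 \\ 0 \\ 1 \end{bmatrix}, \quad
\vec{v}_4 = \begin{bmatrix} 0 \\ -1 \\ 2 \\ -1 \end{bmatrix}.
\]
They represent the net changes in concentrations if only one reaction
occurs and consumes one mole of its first reactant.
They can be written as the columns of the stoichiometric matrix $S$.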
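Written out for this network, and assuming the species are ordered A, B, C, D along the rows and the reactions 1 to 4 along the columns, the stoichiometric matrix would read:

```latex
\[
S = \begin{bmatrix} \vec{v}_1 & \vec{v}_2 & \vec{v}_3 & \vec{v}_4 \end{bmatrix}
  = \begin{bmatrix}
      -1 & -1 & -1 &  0 \\
      -2 &  0 & -1 & -1 \\
       2 & -2 &  0 &  2 \\
       0 &  2 &  1 & -1
    \end{bmatrix}
\]
```

Each column can be read off its reaction: for example, reaction 1 consumes one A and two B and produces two C, which gives the first column.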
Review: The linear transformation $T_S$
If we let $\vec{k} = [k_1, k_2, k_3, k_4]^T$ be the column vector of average net
rates at which the reactions occur over a given time interval, then
the matrix product
\[ S\vec{k} = \vec{w} \]
gives us the net change in concentrations.
Positive values $k_i > 0$ signify that the forward reaction dominates;
negative values $k_i < 0$ signify that the backward reaction
dominates.
When $\vec{k} = \vec{0}$ over arbitrarily short time intervals, then each
reaction is at equilibrium.
When $S\vec{k} = \vec{0}$ over arbitrarily short time intervals, then no
observable change occurs and the system is at equilibrium.
The nullspace $N(S)$ is the set of all rate vectors $\vec{k}$ where the
system is at equilibrium.
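The product $S\vec{k}$ can be evaluated directly for the four-reaction network of this lecture. In the sketch below the matrix `S` is assembled from the reaction vectors with rows ordered A, B, C, D; the rate vector `k` and the function name `net_change` are hypothetical choices for illustration.

```python
# Columns of S are the reaction vectors v1..v4; rows are the species A, B, C, D.
S = [[-1, -1, -1,  0],
     [-2,  0, -1, -1],
     [ 2, -2,  0,  2],
     [ 0,  2,  1, -1]]

def net_change(S, k):
    """Return w = S k, the net change in concentrations for rate vector k."""
    return [sum(s * kj for s, kj in zip(row, k)) for row in S]

k = [2, 1, 0, 0]   # hypothetical rates: reaction 1 at rate 2, reaction 2 at rate 1
w = net_change(S, k)
print(w)   # [-3, -4, 2, 2]

# With all rates zero, every reaction is at equilibrium and nothing changes.
print(net_change(S, [0, 0, 0, 0]))   # [0, 0, 0, 0]
```

Here A and B show a net consumption while C and D show a net production, so this particular `k` is not an element of $N(S)$.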
The rank of the stoichiometric matrix S
Recall the result of Homework 49:
Proposition
Suppose $S$ represents a stoichiometric matrix of order $m \times n$ for $n$
reactions between $m$ chemical species in a closed reaction
system (without net inflow, net outflow, or contributions from or
to other reactions).
Then $r(S) < m$.
It follows that if $m = n$, then $S$ is singular, so that it has at least
one eigenvector with eigenvalue 0.
Each such eigenvector represents a vector of reaction rates where
the system is at equilibrium, but at least one reaction is not at
equilibrium.
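The inequality $r(S) < m$ can be checked for the four-reaction network of this lecture. The sketch below computes the rank by Gaussian elimination with exact fractions; the function name `rank` is hypothetical, and `S` is assembled from the reaction vectors with rows ordered A, B, C, D.

```python
from fractions import Fraction as F

def rank(A):
    """Rank of A via Gaussian elimination with exact fractions."""
    R = [[F(v) for v in row] for row in A]
    r = 0
    for c in range(len(R[0])):
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        R[r], R[p] = R[p], R[r]
        for i in range(r + 1, len(R)):
            f = R[i][c] / R[r][c]
            R[i] = [vi - f * vr for vi, vr in zip(R[i], R[r])]
        r += 1
        if r == len(R):
            break
    return r

# Columns are the reaction vectors v1..v4; rows are the species A, B, C, D.
S = [[-1, -1, -1,  0],
     [-2,  0, -1, -1],
     [ 2, -2,  0,  2],
     [ 0,  2,  1, -1]]
m = len(S)
print(rank(S), m)   # 2 4
```

Since the rank is smaller than $m = n = 4$, this $S$ is singular, in line with the proposition.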
Homework problems
Homework 71: Let $S$ be a stoichiometric matrix of order $n \times n$,
and let $\vec{k}$ be an eigenvector with eigenvalue 0 for $S$. Show that $\vec{k}$
must have at least 2 nonzero coordinates.
Homework 72: Let $S$ be a stoichiometric matrix for the chemical
reaction network
\[
\begin{aligned}
A + 2B &\rightleftharpoons 2C \\
A + 2C &\rightleftharpoons 2D \\
A + B &\rightleftharpoons D \\
B + D &\rightleftharpoons 2C
\end{aligned}
\]
Find the set of all eigenvectors of $S$ with eigenvalue 0.