Probability square roots for (2 × 2) transition matrices

Marie-Anne Guerry
MOSI
Vrije Universiteit Brussel
Pleinlaan 2, B-1050 Brussels, Belgium
e-mail: [email protected]
July 30, 2012
Abstract
For a (2 × 2) probability matrix P necessary and sufficient conditions are formulated under
which there exists a probability square root, i.e. a probability matrix A satisfying P = A.A.
The probability square roots are expressed in analytic form.
Key words. Transition matrix; Probability square root; Embeddable problem
AMS subject classifications. 15A23; 15A51; 60J10
1 Introduction
Under the time-homogeneity assumption, a discrete-time Markov model is characterized by a transition matrix P = (pij) that is a probability matrix. The element pij of this matrix corresponds to the transition probability from state Si to state Sj over a time interval of unit length 1. Based on the information of the transition matrix P, the evolution of the stock vector can be described by n(t) = n(t − 1).P = n(0).P^t. Since the transition probabilities pij refer to time intervals of length 1, extrapolations of the stock vector are obtained for the subsequent values t = 1, 2, ....
Although the available personnel data on stocks and flows may be restricted to, for example, an annual basis (time intervals of length 1), it can be interesting to have information on the stocks semiannually, i.e. for t = 0.5, 1, 1.5, 2, 2.5, .... Under the time-homogeneity assumption, a transition matrix P(0.5) with respect to time intervals of length 0.5 satisfies:

n(t) = n(t − 0.5).P(0.5) and n(t − 0.5) = n(t − 1).P(0.5)

Consequently n(t) = n(t − 1).P = n(t − 1).P(0.5).P(0.5).
Therefore, in finding a transition matrix P(0.5) with respect to a time interval of length 0.5, the question is whether there exists a probability matrix A such that P = A.A. In other words, the question is whether the original Markov chain with transition matrix P and time unit 1 is embeddable in a Markov chain with time unit 0.5. This particular embedding problem can be expressed in terms of probability matrices that are square roots of a transition matrix: for the matrix P of transition probabilities with respect to a time interval of length 1, the question is whether there exists a transition matrix A of a Markov chain with time unit 0.5 that is compatible with P. Such a matrix A is a probability matrix that is a square root of P, referred to in this paper as a probability square root. If probability square roots exist for the transition matrix estimated with respect to a one-unit time interval, these matrices provide information on the transition probabilities with respect to time intervals of length 0.5.
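As a numerical sketch of this half-interval idea (pure Python; the matrix A and the stock numbers below are hypothetical illustration values, not taken from the paper), squaring a half-unit matrix A yields the one-unit matrix P, and applying A twice to n(0) reproduces n(0).P:

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def vecmat(n, X):
    """Row vector times 2x2 matrix: n(t) = n(t-1).X."""
    return [sum(n[k] * X[k][j] for k in range(2)) for j in range(2)]

A = [[0.9, 0.1],            # hypothetical half-unit transition matrix P(0.5)
     [0.2, 0.8]]
P = matmul(A, A)            # one-unit transition matrix P = A.A

n0 = [120.0, 30.0]          # hypothetical initial stock vector n(0)
n_half = vecmat(n0, A)      # semiannual stocks n(0.5) = n(0).A
n_one = vecmat(n_half, A)   # n(1) = n(0.5).A, which equals n(0).P
```

Here P = A.A works out to [[0.83, 0.17], [0.34, 0.66]], so A plays the role of the semiannual transition matrix compatible with the annual matrix P.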
The initial formulation of the embedding problem was introduced by Elfving and concerns the question whether for a discrete-time Markov chain there exists a compatible continuous-time Markov process, i.e. whether for a transition matrix P there exists an intensity matrix Q such that exp(Q) = P ([1]). Embeddability in a continuous-time Markov process is discussed in detail by Singer and Spilerman ([4]).
This paper deals with necessary and sufficient conditions for embeddability in a discrete-time Markov chain with two states. For (2 × 2) transition matrices, the probability square roots are described in analytic form.
2 Probability square roots for a (2 × 2) transition matrix: conditions and analytic form
In this section, necessary and sufficient conditions for the existence of a probability square root A of a (2 × 2) transition matrix P are presented. The probability matrices A satisfying P = A.A are described in analytic form.
Theorem 2.1 For a (2 × 2) probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$:

• If c < d, there does not exist a probability square root A of P.

• If c = d, there exists exactly one probability square root A of P, namely the probability matrix A = (aij) with a11 = a21 = c = d.

• If c > d and 1 − c + d ≠ 0, there exists at least one probability square root A of P, namely the probability matrix A = (aij) with
$$a_{11} = \frac{\sqrt{c-d}\,(1-c)+d}{1-c+d} \quad \text{and} \quad a_{21} = d\,\frac{1-\sqrt{c-d}}{1-c+d}$$
Moreover, in case
$$a_{11} = \frac{\sqrt{c-d}\,(c-1)+d}{1-c+d} \quad \text{and} \quad a_{21} = d\,\frac{1+\sqrt{c-d}}{1-c+d}$$
are both elements of [0, 1], there exists a second probability square root A = (aij) of P.

• If c > d and 1 − c + d = 0, both $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ are probability square roots of P.
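The first branch of the case c > d, 1 − c + d ≠ 0 can be checked numerically; the following sketch (illustrative values c = 0.7, d = 0.2) implements the stated formulas for a11 and a21 and verifies that A.A reproduces P:

```python
import math

def prob_sqrt(c, d):
    """First probability square root A of P = [[c, 1-c], [d, 1-d]],
    per Theorem 2.1, valid for c > d and 1 - c + d != 0."""
    s = math.sqrt(c - d)
    a11 = (s * (1 - c) + d) / (1 - c + d)
    a21 = d * (1 - s) / (1 - c + d)
    return [[a11, 1 - a11], [a21, 1 - a21]]

c, d = 0.7, 0.2            # illustrative values with c > d
A = prob_sqrt(c, d)
# Square A; the result should equal P = [[c, 1-c], [d, 1-d]].
P = [[sum(A[i][k] * A[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
```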
Proof
For a (2 × 2) probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$, the probability matrix A = (aij) is a probability square root of P if and only if
$$a_{11}^2 + (1-a_{11})\,a_{21} = c \qquad a_{11}\,a_{21} + (1-a_{21})\,a_{21} = d \qquad (1)$$
In the special case of a probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ satisfying 1 − c + d = 0, the matrix P equals $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and has two probability square roots, namely $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
In the special case of a probability matrix A = (aij) with a11 = 1, the matrix A can only be a square root of a probability matrix of the form $P = \begin{pmatrix} 1 & 0 \\ d & 1-d \end{pmatrix}$. Under these conditions and according to (1), P has exactly one probability square root, namely the probability matrix A = (aij) with a11 = 1 and $a_{21} = 1 - \sqrt{1-d}$.
The further reasoning can be restricted to the situation of a probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ satisfying 1 − c + d ≠ 0 and a probability square root A = (aij) with a11 ≠ 1: the system (1) results in the following quadratic equation in a11
$$(1-c+d)\,a_{11}^2 - 2d\,a_{11} + c^2 - c + d = 0 \qquad (2)$$
with discriminant $D = 4(1-c)^2(c-d)$.
Therefore, in case c < d, no probability square root exists for the matrix P. In case c = d, the matrix P has exactly one probability square root A. According to (1), this matrix A satisfies a11 = a21 = c = d. In case c > d, equation (2) results in the following solutions for a11:
$$a_{11} = \frac{\sqrt{c-d}\,(1-c)+d}{1-c+d} \quad \text{and} \quad a_{11} = \frac{\sqrt{c-d}\,(c-1)+d}{1-c+d}$$
According to (1), a21 can be expressed as $a_{21} = \frac{c-a_{11}^2}{1-a_{11}}$, resulting respectively in:
$$a_{21} = d\,\frac{1-\sqrt{c-d}}{1-c+d} \quad \text{and} \quad a_{21} = d\,\frac{1+\sqrt{c-d}}{1-c+d}$$
Since both $a_{11} = \frac{\sqrt{c-d}\,(1-c)+d}{1-c+d}$ and $a_{21} = d\,\frac{1-\sqrt{c-d}}{1-c+d}$ are elements of [0, 1], there exists at least one probability square root A = (aij) of P.
Which proves the Theorem. □
In the following Theorem a necessary and sufficient condition for a (2 × 2) probability matrix
P to have a probability square root is expressed in terms of the trace tr(P ) of the matrix P .
Theorem 2.2 For a (2 × 2) probability matrix P there exists a probability square root if and only
if tr(P ) ≥ 1.
Proof
From Theorem 2.1 it is known that for the probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ there exists a probability square root under the condition c ≥ d, and there does not exist a probability square root in case c < d. Since the trace of P equals tr(P) = 1 + c − d, the condition c ≥ d is equivalent to tr(P) ≥ 1. □
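A minimal sketch of the trace criterion of Theorem 2.2 (the matrices tested are illustrative values only):

```python
def has_prob_sqrt(c, d):
    """Existence criterion of Theorem 2.2 for P = [[c, 1-c], [d, 1-d]]:
    a probability square root exists iff tr(P) = c + (1 - d) >= 1,
    which is equivalent to c >= d."""
    return c + (1 - d) >= 1

checks = [has_prob_sqrt(0.7, 0.2),    # c > d: root exists   -> True
          has_prob_sqrt(0.25, 0.25),  # c = d: unique root   -> True
          has_prob_sqrt(0.1, 0.6)]    # c < d: no root       -> False
```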
Theorem 2.3 For a (2 × 2) probability matrix P there exists a probability square root if and only
if all the eigenvalues of P are nonnegative.
Proof
The probability matrix P has 1 as an eigenvalue. Denoting the second eigenvalue of P by λ, the trace of P can be expressed as tr(P) = 1 + λ. Therefore, the necessary and sufficient condition tr(P) ≥ 1 for the existence of a probability square root is equivalent to λ ≥ 0. □
Theorem 2.4 A (2 × 2) probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ with nonnegative eigenvalues 1 and λ is diagonalizable: P = T⁻¹.D.T for
$$T = \begin{pmatrix} d & 1-c \\ k & -k \end{pmatrix} \qquad T^{-1} = \frac{1}{k(1-c+d)} \begin{pmatrix} k & 1-c \\ k & -d \end{pmatrix} \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}$$
with k ∈ ℝ, k ≠ 0.
Proof
In case the eigenvalue 1 of the probability matrix P has algebraic multiplicity equal to 2, the Jordan blocks are of order (1 × 1) ([2]). The diagonal matrix D as well as the matrix P are then equal to the identity matrix $I_{2\times 2} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
In the situation that the eigenvalue 1 has algebraic multiplicity equal to 1, P has eigenvalues 1 and λ ≠ 1. According to the Perron–Frobenius Theorem, the eigenvalue λ is less than 1 ([3]). Consequently 0 ≤ λ < 1. On the one hand, for a left eigenvector E of P = (pij) associated with the eigenvalue λ ≠ 1, E.(P − λ.I) = 0 holds and therefore:
$$\sum_j \left[E.(P - \lambda I)\right]_j = \sum_j \sum_i E_i\,p_{ij} - \lambda \sum_j E_j = \sum_i E_i \sum_j p_{ij} - \lambda \sum_i E_i = \sum_i E_i\,(1-\lambda) = 0$$
which implies that $\sum_i E_i = 0$ since λ ≠ 1. A left eigenvector E of P is therefore of the form E = (k, −k) with k ∈ ℝ. On the other hand, the eigenspace corresponding to the eigenvalue 1 can be described as {a.(d, 1 − c) | a ∈ ℝ}.
Consequently P = T⁻¹.D.T is diagonalizable with
$$T = \begin{pmatrix} d & 1-c \\ k & -k \end{pmatrix} \qquad T^{-1} = \frac{1}{k(1-c+d)} \begin{pmatrix} k & 1-c \\ k & -d \end{pmatrix} \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}$$
□
Theorem 2.5 For the (2 × 2) probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ with nonnegative eigenvalues 1 and λ, the matrix
$$\sqrt{P} = \frac{1}{1-c+d} \begin{pmatrix} d+(1-c)\sqrt{\lambda} & (1-c)(1-\sqrt{\lambda}) \\ d\,(1-\sqrt{\lambda}) & 1-c+d\sqrt{\lambda} \end{pmatrix}$$
is a probability square root.
Proof
According to Theorem 2.4, the probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$ is diagonalizable with
$$T = \begin{pmatrix} d & 1-c \\ k & -k \end{pmatrix} \qquad T^{-1} = \frac{1}{k(1-c+d)} \begin{pmatrix} k & 1-c \\ k & -d \end{pmatrix} \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}$$
Since P = T⁻¹.D.T and λ ≥ 0, for $\sqrt{D} = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{\lambda} \end{pmatrix}$ it holds that
$$T^{-1}.\sqrt{D}.T = \frac{1}{1-c+d} \begin{pmatrix} d+(1-c)\sqrt{\lambda} & (1-c)(1-\sqrt{\lambda}) \\ d\,(1-\sqrt{\lambda}) & 1-c+d\sqrt{\lambda} \end{pmatrix}$$
is a square root of P with each of the row sums equal to 1.
Moreover, since 0 ≤ λ ≤ 1:
$$\left(T^{-1}.\sqrt{D}.T\right)_{11} = \frac{d+(1-c)\sqrt{\lambda}}{1-c+d} \ge 0 \qquad \left(T^{-1}.\sqrt{D}.T\right)_{12} = \frac{1-c}{1-c+d}\,(1-\sqrt{\lambda}) \ge 0$$
$$\left(T^{-1}.\sqrt{D}.T\right)_{21} = \frac{d}{1-c+d}\,(1-\sqrt{\lambda}) \ge 0 \qquad \left(T^{-1}.\sqrt{D}.T\right)_{22} = \frac{1-c+d\sqrt{\lambda}}{1-c+d} \ge 0$$
which proves that the matrix $T^{-1}.\sqrt{D}.T$ has elements that are all nonnegative and therefore is a probability square root of P. □
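The formula of Theorem 2.5 can be checked numerically; the sketch below (illustrative values c = 0.7, d = 0.2, and λ = tr(P) − 1 = c − d for the second eigenvalue) builds √P and verifies that it is a probability matrix squaring to P:

```python
import math

def sqrt_P(c, d):
    """Probability square root of P = [[c, 1-c], [d, 1-d]] per Theorem 2.5,
    with lambda = c - d the second eigenvalue of P (requires c >= d)."""
    s = math.sqrt(c - d)
    f = 1.0 / (1 - c + d)
    return [[f * (d + (1 - c) * s), f * (1 - c) * (1 - s)],
            [f * d * (1 - s),       f * (1 - c + d * s)]]

c, d = 0.7, 0.2
R = sqrt_P(c, d)
row_sums = [sum(row) for row in R]   # both should equal 1
# Square R; the result should equal P = [[c, 1-c], [d, 1-d]].
P = [[sum(R[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
```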
Let us introduce the following notations for λ ≥ 0:
$$\sqrt{D} = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{\lambda} \end{pmatrix} \quad \text{and} \quad \sqrt{D}^{\,-} = \begin{pmatrix} 1 & 0 \\ 0 & -\sqrt{\lambda} \end{pmatrix}$$
The following Theorem provides, in analytic form, square roots of a (2 × 2) probability matrix based on its diagonalized form.
Theorem 2.6 A (2 × 2) probability matrix $P = \begin{pmatrix} c & 1-c \\ d & 1-d \end{pmatrix}$, with nonnegative eigenvalues 1 and λ, has the following square roots:
$$\sqrt{P} = \frac{1}{1-c+d} \begin{pmatrix} d+(1-c)\sqrt{\lambda} & (1-c)(1-\sqrt{\lambda}) \\ d\,(1-\sqrt{\lambda}) & 1-c+d\sqrt{\lambda} \end{pmatrix} \quad \text{and} \quad \sqrt{P}^{\,-} = \frac{1}{1-c+d} \begin{pmatrix} d-(1-c)\sqrt{\lambda} & (1-c)(1+\sqrt{\lambda}) \\ d\,(1+\sqrt{\lambda}) & 1-c-d\sqrt{\lambda} \end{pmatrix}$$
Proof
Since the matrix P = T⁻¹.D.T is diagonalizable with
$$T = \begin{pmatrix} d & 1-c \\ k & -k \end{pmatrix} \qquad T^{-1} = \frac{1}{k(1-c+d)} \begin{pmatrix} k & 1-c \\ k & -d \end{pmatrix} \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}$$
P has the following square roots:
$$\sqrt{P} = T^{-1}.\sqrt{D}.T \quad \text{and} \quad \sqrt{P}^{\,-} = T^{-1}.\sqrt{D}^{\,-}.T$$
Which proves the Theorem. □
Theorem 2.7 For a (2 × 2) probability matrix P = (pij), with nonnegative eigenvalues 1 and λ, the following properties regarding probability square roots hold:
• In case λ = 0: P is a probability square root of itself.
• In case λ > 0: P has at least one probability square root, namely $\sqrt{P}$. In case $d-(1-c)\sqrt{\lambda} \ge 0$ and $1-c-d\sqrt{\lambda} \ge 0$, P has a second probability square root, being $\sqrt{P}^{\,-}$.
Proof
For λ = 0: the square root $\sqrt{P}$ equals the idempotent matrix P itself, and therefore P is a probability square root of P.
For λ > 0: the fact that $\sqrt{P}$ is a probability square root of P is a direct result of the proof of Theorem 2.5. The matrix $\sqrt{P}^{\,-}$ is a square root of P according to Theorem 2.6, and it is a probability matrix under the conditions $d-(1-c)\sqrt{\lambda} \ge 0$ and $1-c-d\sqrt{\lambda} \ge 0$.
Which proves the Theorem. □
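The existence test of Theorem 2.7 for the second root $\sqrt{P}^{\,-}$ can be sketched as follows (illustrative parameter values; λ = c − d as in Theorem 2.4):

```python
import math

def second_root_exists(c, d):
    """Theorem 2.7 test: for lambda = c - d > 0, the second square root
    sqrt(P)^- is a probability matrix iff d - (1-c)*sqrt(lambda) >= 0
    and 1 - c - d*sqrt(lambda) >= 0."""
    lam = c - d
    if lam <= 0:
        return False
    s = math.sqrt(lam)
    return d - (1 - c) * s >= 0 and 1 - c - d * s >= 0

examples = [second_root_exists(0.5, 0.4),   # both conditions hold -> True
            second_root_exists(0.7, 0.2)]   # sign test fails      -> False
```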
3 Further research questions
In this paper the embeddability problem is discussed for two-state discrete-time Markov chains in terms of probability square roots: conditions for embeddability are presented and probability square roots are described in analytic form. It is a challenge for further research to find necessary and sufficient embeddability conditions for discrete-time Markov chains with more than two states.
References
[1] Elfving, G. (1937). Zur Theorie der Markoffschen Ketten. Acta Soc. Sci. Fennicae, n. Ser. A 2(8), 1–17.
[2] Minc, H. (1988). Nonnegative matrices. Wiley, New York.
[3] Seneta, E. (1973). Non-negative matrices: An introduction to theory and applications. George Allen and Unwin, London.
[4] Singer, B., Spilerman, S. (1976). The representation of social processes by Markov chains. The American
Journal of Sociology 82, 1–54.