Probability square roots for (2 × 2) transition matrices

Marie-Anne Guerry
MOSI, Vrije Universiteit Brussel
Pleinlaan 2, B-1050 Brussels, Belgium
e-mail: [email protected]

July 30, 2012

Abstract

For a (2 × 2) probability matrix P, necessary and sufficient conditions are formulated under which there exists a probability square root, i.e. a probability matrix A satisfying P = A.A. The probability square roots are expressed in analytic form.

Key words. Transition matrix; Probability square root; Embeddable problem

AMS subject classifications. 15A23; 15A51; 60J10

1 Introduction

Under the assumption of time homogeneity, a discrete-time Markov model is characterized by a transition matrix P = (pij) that is a probability matrix. The element pij of this matrix corresponds to the transition probability from state Si to state Sj over a time interval of unit length 1. Based on the transition matrix P, the evolution of the stock vector is described by n(t) = n(t−1).P = n(0).P^t. Since the transition probabilities pij refer to time intervals of length 1, extrapolations of the stock vector can be computed for the subsequent values t = 1, 2, ....

Although the available personnel data on stocks and flows may be restricted to, for example, an annual basis (time intervals of length 1), it can be interesting to have information on the stocks semiannually, i.e. for t = 0.5, 1, 1.5, 2, 2.5, .... Under the time-homogeneity assumption, a transition matrix P(0.5) with respect to time intervals of length 0.5 satisfies

n(t) = n(t − 0.5).P(0.5) and n(t − 0.5) = n(t − 1).P(0.5)

Consequently n(t) = n(t − 1).P = n(t − 1).P(0.5).P(0.5). Therefore, in finding a transition matrix P(0.5) with respect to a time interval of length 0.5, the question is whether there exists a probability matrix A such that P = A.A. In other words, the question is whether the original Markov chain with transition matrix P and time unit 1 is embeddable in a Markov chain with time unit 0.5.
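The relation n(t) = n(t − 1).P = n(t − 1).P(0.5).P(0.5) can be illustrated numerically. The following minimal sketch (the half-year matrix A, the stock vector n0, and all function names are illustrative choices of this note, not taken from the paper) builds P = A.A and checks that one annual step with P agrees with two semiannual steps with A:

```python
def matmul2(a, b):
    # Product of two 2x2 matrices given as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def step(n, p):
    # One transition of the stock vector: n(t) = n(t-1).P
    return [sum(n[k] * p[k][j] for k in range(2)) for j in range(2)]

# Hypothetical transition matrix A for a half-year interval.
A = [[0.9, 0.1],
     [0.2, 0.8]]

# The compatible annual transition matrix is P = A.A.
P = matmul2(A, A)   # ≈ [[0.83, 0.17], [0.34, 0.66]]

# Propagating the stocks one year with P, or twice with A, agrees.
n0 = [100.0, 50.0]
n1_annual = step(n0, P)
n1_semi = step(step(n0, A), A)
```

Note that P inherits row sums equal to 1 from A, so the product of two probability matrices is again a probability matrix; the open question addressed below is the converse direction, recovering A from P.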
This particular embeddable problem can be expressed in terms of probability matrices that are square roots of a transition matrix: for the matrix P of transition probabilities with respect to a time interval of length 1, the question is whether there exists a transition matrix A of a Markov chain with time unit 0.5 that is compatible with P. Such a matrix A is in fact a probability matrix that is a square root of P, which will be referred to in this paper as a probability square root. In case there exist probability square roots for the estimated transition matrix with respect to a one-unit time interval, these matrices provide information on the transition probabilities with respect to time intervals of length 0.5.

The initial formulation of the embeddable problem was introduced by Elfving and concerns the question whether for a discrete-time Markov chain there exists a compatible continuous-time Markov process, i.e. whether for a transition matrix P there exists an intensity matrix Q such that exp(Q) = P ([1]). Embeddability in a continuous-time Markov process is discussed in detail by Singer and Spilerman ([4]). This paper deals with necessary and sufficient conditions for embeddability in a discrete-time Markov chain with 2 states. For (2 × 2) transition matrices, the probability square roots are described in analytic form.

2 Probability square roots for a (2 × 2) transition matrix: conditions and analytic form

In this section, for a (2 × 2) transition matrix P, necessary and sufficient conditions for the existence of a probability square root A are presented. The probability matrices A satisfying P = A.A are described in analytic form.

Theorem 2.1 For a (2 × 2) probability matrix P = [[c, 1−c], [d, 1−d]]:

• If c < d there does not exist a probability square root A of P.

• If c = d there exists exactly one probability square root A of P, namely the probability matrix A = (aij) with a11 = a21 = c = d.
• If c > d and 1 − c + d ≠ 0 there exists at least one probability square root A of P, namely the probability matrix A = (aij) with

a11 = (√(c − d).(1 − c) + d)/(1 − c + d) and a21 = d.(1 − √(c − d))/(1 − c + d)

Moreover, in case

a11 = (√(c − d).(c − 1) + d)/(1 − c + d) and a21 = d.(1 + √(c − d))/(1 − c + d)

are both elements of [0, 1], there exists a second probability square root A = (aij) of P.

• If c > d and 1 − c + d = 0, both [[1, 0], [0, 1]] and [[0, 1], [1, 0]] are probability square roots of P.

Proof
For a (2 × 2) probability matrix P = [[c, 1−c], [d, 1−d]], the probability matrix A = (aij) is a probability square root of P if and only if

a11² + (1 − a11).a21 = c
a11.a21 + (1 − a21).a21 = d          (1)

In the special case of a probability matrix P = [[c, 1−c], [d, 1−d]] satisfying 1 − c + d = 0 (which forces c = 1 and d = 0), the matrix P equals [[1, 0], [0, 1]] and has 2 probability square roots, namely [[1, 0], [0, 1]] and [[0, 1], [1, 0]].

In the special case of a probability matrix A = (aij) with a11 = 1, the matrix A can only be a square root of a probability matrix of the form P = [[1, 0], [d, 1−d]]. Under these conditions and according to (1), P has exactly one probability square root, namely the probability matrix A = (aij) with a11 = 1 and a21 = 1 − √(1 − d).

The further reasoning can therefore be restricted to the situation of a probability matrix P = [[c, 1−c], [d, 1−d]] satisfying 1 − c + d ≠ 0 and a probability square root A = (aij) with a11 ≠ 1. The system (1) results in the following quadratic equation in a11:

(1 − c + d).a11² − (2d).a11 + (c² − c + d) = 0          (2)

with discriminant D = 4(1 − c)².(c − d). Therefore, in case c < d, no probability square root exists for the matrix P. In case c = d, the matrix P has exactly one probability square root A. According to (1), this matrix A satisfies a11 = a21 = c = d.
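The case analysis of Theorem 2.1 and the closed-form solutions of the quadratic (2) can be sketched in code. This is an illustrative sketch; the function name is ours, and the exact float comparisons (c == d, 1 − c + d == 0) are only meant for clean rational inputs:

```python
import math

def probability_square_roots(c, d):
    """All probability square roots of P = [[c, 1-c], [d, 1-d]],
    following the case analysis of Theorem 2.1.
    Assumes 0 <= c <= 1 and 0 <= d <= 1."""
    if c < d:                        # discriminant 4(1-c)^2 (c-d) < 0
        return []
    if c == d:                       # P is idempotent and is its own root
        return [[[c, 1 - c], [c, 1 - c]]]
    if 1 - c + d == 0:               # forces c = 1, d = 0, i.e. P = I
        return [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
    s = math.sqrt(c - d)
    roots = []
    for sign in (1, -1):             # the two solutions of quadratic (2)
        a11 = (sign * s * (1 - c) + d) / (1 - c + d)
        a21 = d * (1 - sign * s) / (1 - c + d)
        if 0 <= a11 <= 1 and 0 <= a21 <= 1:
            roots.append([[a11, 1 - a11], [a21, 1 - a21]])
    return roots
```

For example, probability_square_roots(0.83, 0.34) returns (up to floating-point rounding) the single probability square root [[0.9, 0.1], [0.2, 0.8]], while probability_square_roots(0.2, 0.5) returns an empty list, in line with the c < d case.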
In case c > d, equation (2) has the following solutions for a11:

a11 = (√(c − d).(1 − c) + d)/(1 − c + d) and a11 = (√(c − d).(c − 1) + d)/(1 − c + d)

According to (1), a21 can be expressed as a21 = (c − a11²)/(1 − a11), resulting respectively in

a21 = d.(1 − √(c − d))/(1 − c + d) and a21 = d.(1 + √(c − d))/(1 − c + d)

Since both a11 = (√(c − d).(1 − c) + d)/(1 − c + d) and a21 = d.(1 − √(c − d))/(1 − c + d) are elements of [0, 1], there exists at least one probability square root A = (aij) of P. Which proves the Theorem. □

In the following Theorem, a necessary and sufficient condition for a (2 × 2) probability matrix P to have a probability square root is expressed in terms of the trace tr(P) of the matrix P.

Theorem 2.2 For a (2 × 2) probability matrix P there exists a probability square root if and only if tr(P) ≥ 1.

Proof
From Theorem 2.1 it is known that for the probability matrix P = [[c, 1−c], [d, 1−d]] there exists a probability square root under the condition c ≥ d, and there does not exist a probability square root in case c < d. Since the trace of P equals tr(P) = 1 + c − d, the condition c ≥ d is equivalent with tr(P) ≥ 1. □

Theorem 2.3 For a (2 × 2) probability matrix P there exists a probability square root if and only if all the eigenvalues of P are nonnegative.

Proof
The probability matrix P has 1 as an eigenvalue. Denoting the second eigenvalue of P by λ, the trace of P can be expressed as tr(P) = 1 + λ. Therefore, the necessary and sufficient condition tr(P) ≥ 1 for the existence of a probability square root is equivalent with λ ≥ 0. □

Theorem 2.4 A (2 × 2) probability matrix P = [[c, 1−c], [d, 1−d]] with nonnegative eigenvalues 1 and λ is diagonalizable: P = T⁻¹.D.T for

T = [[d, 1−c], [k, −k]],   T⁻¹ = (1/(k.(1 − c + d))).[[k, 1−c], [k, −d]],   D = [[1, 0], [0, λ]]

with k ∈ R, k ≠ 0.

Proof
In case the eigenvalue 1 of the probability matrix P has algebraic multiplicity equal to 2, the Jordan blocks are of order (1 × 1) ([2]). The diagonal matrix D as well as the matrix P are then equal to the identity matrix I2×2 = [[1, 0], [0, 1]].

In the situation that the eigenvalue 1 has algebraic multiplicity equal to 1, P has eigenvalues 1 and λ ≠ 1. According to the Perron–Frobenius Theorem, the eigenvalue λ is less than 1 ([3]). Consequently 0 ≤ λ < 1.

On the one hand, a left eigenvector E of P = (pij) associated with the eigenvalue λ ≠ 1 satisfies E.(P − λ.I) = 0 and therefore

Σ_j [E.(P − λ.I)]_j = Σ_j Σ_i Ei.pij − λ.Σ_j Ej = Σ_i Ei.(Σ_j pij) − λ.Σ_i Ei = Σ_i Ei.(1 − λ) = 0

which implies that Σ_i Ei = 0 since λ ≠ 1. A left eigenvector E of P is therefore of the form E = (k, −k) with k ∈ R. On the other hand, the left eigenspace corresponding with the eigenvalue 1 can be described as {a.(d, 1−c) | a ∈ R}. Consequently, P = T⁻¹.D.T is diagonalizable with

T = [[d, 1−c], [k, −k]],   T⁻¹ = (1/(k.(1 − c + d))).[[k, 1−c], [k, −d]],   D = [[1, 0], [0, λ]]  □

Theorem 2.5 For the (2 × 2) probability matrix P = [[c, 1−c], [d, 1−d]] with nonnegative eigenvalues 1 and λ, the matrix

√P = (1/(1 − c + d)).[[d + (1−c).√λ, (1−c).(1 − √λ)], [d.(1 − √λ), 1 − c + d.√λ]]

is a probability square root.

Proof
According to Theorem 2.4, the probability matrix P = [[c, 1−c], [d, 1−d]] is diagonalizable with

T = [[d, 1−c], [k, −k]],   T⁻¹ = (1/(k.(1 − c + d))).[[k, 1−c], [k, −d]],   D = [[1, 0], [0, λ]]

Since P = T⁻¹.D.T and λ ≥ 0, for √D = [[1, 0], [0, √λ]] it holds that

T⁻¹.√D.T = (1/(1 − c + d)).[[d + (1−c).√λ, (1−c).(1 − √λ)], [d.(1 − √λ), 1 − c + d.√λ]]

is a square root of P with each of the row sums equal to 1. Moreover, since 0 ≤ λ ≤ 1:

(T⁻¹.√D.T)11 = (d + (1−c).√λ)/(1 − c + d) ≥ 0
(T⁻¹.√D.T)12 = ((1−c)/(1 − c + d)).(1 − √λ) ≥ 0
(T⁻¹.√D.T)21 = (d/(1 − c + d)).(1 − √λ) ≥ 0
(T⁻¹.√D.T)22 = (1 − c + d.√λ)/(1 − c + d) ≥ 0

which proves that the matrix T⁻¹.√D.T has elements that are all nonnegative and therefore is a probability square root of P. □

Let us introduce the following notations for λ ≥ 0:

√D = [[1, 0], [0, √λ]] and √D⁻ = [[1, 0], [0, −√λ]].

The following Theorem provides, in analytic form, square roots of a (2 × 2) probability matrix based on its diagonalized form.
Theorem 2.6 A (2 × 2) probability matrix P = [[c, 1−c], [d, 1−d]], with nonnegative eigenvalues 1 and λ, has the following square roots:

√P = (1/(1 − c + d)).[[d + (1−c).√λ, (1−c).(1 − √λ)], [d.(1 − √λ), 1 − c + d.√λ]]
√P⁻ = (1/(1 − c + d)).[[d − (1−c).√λ, (1−c).(1 + √λ)], [d.(1 + √λ), 1 − c − d.√λ]]

Proof
Since the matrix P = T⁻¹.D.T is diagonalizable with

T = [[d, 1−c], [k, −k]],   T⁻¹ = (1/(k.(1 − c + d))).[[k, 1−c], [k, −d]],   D = [[1, 0], [0, λ]]

P has the square roots √P = T⁻¹.√D.T and √P⁻ = T⁻¹.√D⁻.T. Which proves the Theorem. □

Theorem 2.7 For a (2 × 2) probability matrix P = (pij), with nonnegative eigenvalues 1 and λ, the following properties regarding probability square roots hold:

• In case λ = 0: P is a probability square root of itself.

• In case λ > 0: P has at least one probability square root, namely √P. In case d − (1−c).√λ ≥ 0 and 1 − c − d.√λ ≥ 0, P has a second probability square root, being √P⁻.

Proof
For λ = 0: the square root √P equals the idempotent matrix P itself, and therefore P is a probability square root of P.
For λ > 0: the fact that √P is a probability square root of P is a direct result of the proof of Theorem 2.5. The matrix √P⁻ is a square root of P according to Theorem 2.6 and is a probability matrix under the conditions d − (1−c).√λ ≥ 0 and 1 − c − d.√λ ≥ 0. Which proves the Theorem. □

3 Further research questions

In this paper, the embeddability problem is discussed for 2-state discrete-time Markov chains in terms of probability square roots: conditions for embeddability are presented and the probability square roots are described in analytic form. It is a challenge for further research to find necessary and sufficient embeddability conditions for discrete-time Markov chains with more than 2 states.

References

[1] Elfving, G. (1937). Zur Theorie der Markoffschen Ketten. Acta Soc. Sci. Fennicae, n. Ser. A 2, 8, 1–17.

[2] Minc, H. (1988). Nonnegative matrices. Wiley, New York.

[3] Seneta, E. (1973). Non-negative matrices.
An introduction to theory and applications. George Allen and Unwin, London.

[4] Singer, B., Spilerman, S. (1976). The representation of social processes by Markov chains. The American Journal of Sociology 82, 1–54.