The Matrix Logarithm: from Theory to Computation

Nick Higham
School of Mathematics
The University of Manchester
[email protected]
http://www.ma.man.ac.uk/~higham/

6th European Congress of Mathematics, July 2012

Matrix Logarithm

Definition. A logarithm of $A \in \mathbb{C}^{n\times n}$ is any matrix $X$ such that $e^X = A$.

Implicit definition.
Properties, classification?

Outline

1 Definition and Properties
2 Applications
3 Theory
4 Computing the Matrix Logarithm and its Fréchet derivative

Cayley and Sylvester

Matrix algebra developed by Arthur Cayley, FRS (1821–1895) in Memoir on the Theory of Matrices (1858). Cayley considered matrix square roots.
Term “matrix” coined in 1850 by James Joseph Sylvester, FRS (1814–1897). Gave (1883) first definition of $f(A)$ for general $f$.

Multiplicity of Definitions

"There have been proposed in the literature since 1880 eight distinct definitions of a matric function, by Weyr, Sylvester and Buchheim, Giorgi, Cartan, Fantappiè, Cipolla, Schwerdtfeger and Richter."
R. F. Rinehart, The Equivalence of Definitions of a Matric Function (1955)

Jordan Canonical Form

$$Z^{-1} A Z = J = \operatorname{diag}(J_1, \dots, J_p), \qquad
J_k = \begin{bmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix} \in \mathbb{C}^{m_k \times m_k}.$$

Definition.
$$f(A) = Z f(J) Z^{-1} = Z \operatorname{diag}(f(J_k)) Z^{-1}, \qquad
f(J_k) = \begin{bmatrix} f(\lambda_k) & f'(\lambda_k) & \cdots & \dfrac{f^{(m_k-1)}(\lambda_k)}{(m_k-1)!} \\ & f(\lambda_k) & \ddots & \vdots \\ & & \ddots & f'(\lambda_k) \\ & & & f(\lambda_k) \end{bmatrix}.$$

Primary and Nonprimary Logarithms

$A = \operatorname{diag}(1, 1, e, e)$.
Primary: $\log(A) = \operatorname{diag}(0, 0, 1, 1)$.
Nonprimary: $\log(A) = \operatorname{diag}(0, 2\pi i, 1, 1)$.

Cauchy Integral Theorem

Definition.
$$f(A) = \frac{1}{2\pi i} \int_\Gamma f(z) (zI - A)^{-1} \, dz,$$
where $f$ is analytic on and inside a closed contour $\Gamma$ that encloses the spectrum $\lambda(A)$.

Mercator’s Series

By integrating $(1 + t)^{-1} = 1 - t + t^2 - t^3 + \cdots$ between $0$ and $x$ we obtain Mercator’s series (1668),
$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \qquad |x| < 1.$$
For $A \in \mathbb{C}^{n\times n}$,
$$\log(I + A) = A - \frac{A^2}{2} + \frac{A^3}{3} - \frac{A^4}{4} + \cdots, \qquad \rho(A) < 1.$$

Composite Functions

Theorem. $f(t) = g(h(t)) \Rightarrow f(A) = g(h(A))$.
Corollary. $\exp(\log(A)) = A$ when $\log(A)$ is defined.
What about $\log(\exp(A)) = A$?
Matrix unwinding number
$$U(A) = \frac{A - \log(\exp(A))}{2\pi i}.$$

Toolbox of Matrix Functions

$$\frac{d^2 y}{dt^2} + A y = 0, \qquad y(0) = y_0, \quad y'(0) = y_0'$$
has solution
$$y(t) = \cos(\sqrt{A}\, t)\, y_0 + (\sqrt{A})^{-1} \sin(\sqrt{A}\, t)\, y_0'.$$
But
$$\begin{bmatrix} y' \\ y \end{bmatrix} = \exp\left( \begin{bmatrix} 0 & -tA \\ t I_n & 0 \end{bmatrix} \right) \begin{bmatrix} y_0' \\ y_0 \end{bmatrix}.$$
In software want to be able to evaluate interesting $f$ at matrix args as well as scalar args.
MATLAB has expm, logm, sqrtm, funm.

Application: Control Theory

Convert continuous-time system
$$\frac{dx}{dt} = F x(t) + G u(t), \qquad y = H x(t) + J u(t),$$
to discrete-time state-space system
$$x_{k+1} = A x_k + B u_k, \qquad y_k = H x_k + J u_k.$$
Have
$$A = e^{F\tau}, \qquad B = \left( \int_0^\tau e^{Ft} \, dt \right) G,$$
where $\tau$ is the sampling period.
MATLAB Control System Toolbox: c2d and d2c.

The Average Eye

First order character of optical system characterized by transference matrix
$$T = \begin{bmatrix} S & \delta \\ 0 & 1 \end{bmatrix} \in \mathbb{R}^{5\times 5},$$
where $S \in \mathbb{R}^{4\times 4}$ is symplectic:
$$S^T J S = J = \begin{bmatrix} 0 & I_2 \\ -I_2 & 0 \end{bmatrix}.$$
Average $m^{-1} \sum_{i=1}^m T_i$ is not a transference matrix.
Harris (2005) proposes the average $\exp\bigl(m^{-1} \sum_{i=1}^m \log(T_i)\bigr)$.
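The average proposed on "The Average Eye" slide can be tried numerically. The following is a minimal sketch, not Harris's own procedure: the transference matrices are synthetic test data (an assumption), built as exponentials of small Hamiltonian matrices so that each $S$ block is symplectic, and the helper names `random_transference` and `symplectic_residual` are hypothetical. It compares how far the arithmetic mean and the log-based mean are from satisfying $S^T J S = J$.

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)
I2 = np.eye(2)
J = np.block([[np.zeros((2, 2)), I2], [-I2, np.zeros((2, 2))]])

def random_transference(scale=0.1):
    """Synthetic T = [[S, delta], [0, 1]] with S symplectic (test data only)."""
    M = scale * rng.standard_normal((4, 4))
    H = J @ (M + M.T)            # J times a symmetric matrix is Hamiltonian
    T = np.eye(5)
    T[:4, :4] = expm(H)          # exponential of a Hamiltonian matrix is symplectic
    T[:4, 4] = scale * rng.standard_normal(4)
    return T

def symplectic_residual(T):
    """Frobenius norm of S^T J S - J for the leading 4x4 block S of T."""
    S = T[:4, :4]
    return np.linalg.norm(S.T @ J @ S - J)

Ts = [random_transference() for _ in range(5)]

# Arithmetic mean: generally leaves the set of transference matrices.
arithmetic_mean = sum(Ts) / len(Ts)

# Log-based mean: exp of the mean of the principal logarithms.
mean_log = sum(logm(T) for T in Ts) / len(Ts)
log_mean = np.real(expm(mean_log))   # imaginary parts, if any, are rounding noise here

print("residual, arithmetic mean:", symplectic_residual(arithmetic_mean))
print("residual, log-based mean: ", symplectic_residual(log_mean))
print("last row of log-based mean:", log_mean[4])
```

The design relies on the principal logarithm mapping each symplectic block into the Hamiltonian matrices, which form a linear space, so averaging there and exponentiating returns to the symplectic matrices.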
Markov Models

Time-homogeneous continuous-time Markov process with transition probability matrix $P(t) \in \mathbb{R}^{n\times n}$.
Transition intensity matrix $Q$:
$$q_{ij} \ge 0 \ (i \ne j), \qquad \sum_{j=1}^n q_{ij} = 0, \qquad P(t) = e^{Qt}.$$
For discrete-time Markov processes:
Embeddability problem. When does a given stochastic $P$ have a real logarithm $Q$ that is an intensity matrix?

Markov Models (1): Example

With $x = -e^{-2\sqrt{3}\pi} \approx -1.9 \times 10^{-5}$,
$$P = \frac{1}{3} \begin{bmatrix} 1 + 2x & 1 - x & 1 - x \\ 1 - x & 1 + 2x & 1 - x \\ 1 - x & 1 - x & 1 + 2x \end{bmatrix}.$$
$P$ diagonalizable, $\Lambda(P) = \{1, x, x\}$.
Every primary log complex (can’t have complex conjugate ei’vals).
Yet a generator is the non-primary log
$$Q = 2\sqrt{3}\pi \begin{bmatrix} -2/3 & 1/2 & 1/6 \\ 1/6 & -2/3 & 1/2 \\ 1/2 & 1/6 & -2/3 \end{bmatrix}.$$

Markov Models (2)

Suppose $P \equiv P(1)$ has a generator $Q = \log P$. Then $P(t)$ at other times is $P(t) = \exp(Qt)$.
E.g., if $P$ transition matrix for 1 year,
$$P(1/12) = e^{\frac{1}{12} \log P} \equiv P^{1/12}$$
is matrix for 1 month.
Problem: $\log P$, $P^{1/k}$ may have wrong sign patterns ⇒ “regularize”.

HIV to AIDS Transition

Estimated 6-month transition matrix.
Four AIDS-free states and 1 AIDS state.
2077 observations (Charitos et al., 2008).
$$P = \begin{bmatrix}
0.8149 & 0.0738 & 0.0586 & 0.0407 & 0.0120 \\
0.5622 & 0.1752 & 0.1314 & 0.1169 & 0.0143 \\
0.3606 & 0.1860 & 0.1521 & 0.2198 & 0.0815 \\
0.1676 & 0.0636 & 0.1444 & 0.4652 & 0.1592 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}.$$
Want to estimate the 1-month transition matrix.
$\Lambda(P) = \{1, 0.9644, 0.4980, 0.1493, -0.0043\}$.
N. J. Higham and L. Lin. On pth roots of stochastic matrices, LAA, 2011.
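The "Markov Models (1)" example can be checked directly. This sketch, using only the $P$, $Q$ and $x$ given on that slide, confirms numerically that $e^{Q} = P$ and that $Q$ has the sign pattern and zero row sums of an intensity matrix. As an illustration of the "Markov Models (2)" idea it also forms $e^{Q/12}$; treating $P$ as a one-year matrix is an assumption made here for the sake of the example.

```python
import numpy as np
from scipy.linalg import expm

c = 2.0 * np.sqrt(3.0) * np.pi
x = -np.exp(-c)                      # x = -exp(-2*sqrt(3)*pi), about -1.9e-5

# P = (1/3) * [[1+2x, 1-x, 1-x], [1-x, 1+2x, 1-x], [1-x, 1-x, 1+2x]]
P = (np.full((3, 3), 1.0 - x) + 3.0 * x * np.eye(3)) / 3.0

# Candidate generator from the slide (a non-primary logarithm of P).
Q = c * np.array([[-2/3,  1/2,  1/6],
                  [ 1/6, -2/3,  1/2],
                  [ 1/2,  1/6, -2/3]])

off_diagonal = Q[~np.eye(3, dtype=bool)]
print("max |exp(Q) - P|      :", np.abs(expm(Q) - P).max())
print("row sums of Q         :", Q.sum(axis=1))
print("min off-diagonal of Q :", off_diagonal.min())

# Transition matrix over one twelfth of the period covered by P.
P_short = expm(Q / 12.0)
print("row sums of exp(Q/12) :", P_short.sum(axis=1))
print("min entry of exp(Q/12):", P_short.min())
```

Because $Q$ is a genuine intensity matrix, $e^{Qt}$ is stochastic for every $t \ge 0$, so no regularization is needed in this particular example.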
Logs of $A = I_3$

$$B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \qquad
D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
$$e^B = e^C = e^D = I_3.$$
$\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$.

All Solutions of $e^X = A$

Theorem (Gantmacher, 1959). $A \in \mathbb{C}^{n\times n}$ nonsing with Jordan canonical form $Z^{-1} A Z = J = \operatorname{diag}(J_1, J_2, \dots, J_p)$. All solutions to $e^X = A$ are given by
$$X = Z U \operatorname{diag}(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}) U^{-1} Z^{-1},$$
where
$$L_k^{(j_k)} = \log(J_k(\lambda_k)) + 2 j_k \pi i I_{m_k},$$
$j_k \in \mathbb{Z}$ arbitrary, and $U$ an arbitrary nonsing matrix that commutes with $J$.

All Solutions of $e^X = A$: Classified

Theorem. $A \in \mathbb{C}^{n\times n}$ nonsing: $p$ Jordan blocks, $s$ distinct ei’vals.
$e^X = A$ has a countable infinity of solutions that are primary functions of $A$:
$$X_j = Z \operatorname{diag}(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}) Z^{-1},$$
where $\lambda_i = \lambda_k$ implies $j_i = j_k$.
If $s < p$ then $e^X = A$ has non-primary solutions
$$X_j(U) = Z U \operatorname{diag}(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}) U^{-1} Z^{-1},$$
where $j_k \in \mathbb{Z}$ arbitrary, $U$ arbitrary nonsing with $UJ = JU$, and for each $j$ there exist $i$ and $k$ s.t. $\lambda_i = \lambda_k$ while $j_i \ne j_k$.

Logs of $A = I_3$ (again)

$$C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \qquad
D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
$$e^{0} = e^C = e^D = I_3.$$
$\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$.
$$U = \begin{bmatrix} 1 & \alpha & 0 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{bmatrix}, \quad \alpha \in \mathbb{C}, \qquad
X = U \operatorname{diag}(2\pi i, -2\pi i, 0) U^{-1} = 2\pi i \begin{bmatrix} 1 & -2\alpha & 2\alpha^2 \\ 0 & -1 & \alpha \\ 0 & 0 & 0 \end{bmatrix}.$$
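The logarithms of $I_3$ listed on the slides above can be confirmed numerically. In this sketch the matrices $C$, $D$ and $U$ are taken from the slides, while the value of $\alpha$ and the function name `nonprimary_log` are arbitrary choices made for illustration. It also compares $U \operatorname{diag}(2\pi i, -2\pi i, 0) U^{-1}$ with the closed form $2\pi i\,[1, -2\alpha, 2\alpha^2;\ 0, -1, \alpha;\ 0, 0, 0]$ obtained by multiplying the factors out.

```python
import numpy as np
from scipy.linalg import expm

two_pi = 2.0 * np.pi
I3 = np.eye(3)

C = np.array([[0.0,     two_pi - 1.0, 1.0],
              [-two_pi, 0.0,          0.0],
              [-two_pi, 0.0,          0.0]])
D = np.array([[0.0,     two_pi, 1.0],
              [-two_pi, 0.0,    0.0],
              [0.0,     0.0,    0.0]])

def nonprimary_log(alpha):
    """X = U diag(2*pi*i, -2*pi*i, 0) U^{-1} with U = [[1,a,0],[0,1,a],[0,0,1]]."""
    U = np.array([[1, alpha, 0],
                  [0, 1, alpha],
                  [0, 0, 1]], dtype=complex)
    L = np.diag([2j * np.pi, -2j * np.pi, 0])
    return U @ L @ np.linalg.inv(U)

alpha = 0.5                                  # arbitrary illustrative value
X = nonprimary_log(alpha)
X_closed = 2j * np.pi * np.array([[1, -2 * alpha, 2 * alpha**2],
                                  [0, -1,         alpha],
                                  [0,  0,         0]])

for name, M in [("C", C), ("D", D), ("X(alpha)", X)]:
    print(f"||exp({name}) - I|| =", np.linalg.norm(expm(M) - I3))
print("||X - closed form|| =", np.linalg.norm(X - X_closed))
```

Each matrix has the distinct eigenvalues $0, \pm 2\pi i$, so it is diagonalizable and its exponential is exactly the identity; the printed residuals only reflect rounding.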
Square Roots of Rotations

$$G(\theta) = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$
$G(\theta/2)$ is the natural square root of $G(\theta)$.
For $\theta = \pi$,
$$G(\pi) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad G(\pi/2) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$
$G(\pi/2)$ is a nonprimary square root.

Principal Logarithm and pth Root

Let $A \in \mathbb{C}^{n\times n}$ have no eigenvalues on $\mathbb{R}^-$.
Principal log. $X = \log(A)$ denotes unique $X$ such that $e^X = A$ and $-\pi < \operatorname{Im} \lambda(X) < \pi$.
Principal pth root. For integer $p > 0$, $X = A^{1/p}$ is unique $X$ such that $X^p = A$ and $-\pi/p < \arg(\lambda(X)) < \pi/p$.

Henry Briggs (1561–1630)

Arithmetica Logarithmica (1624)
Logarithms to base 10 of 1–20,000 and 90,000–100,000 to 14 decimal places.
"Briggs must be viewed as one of the great figures in numerical analysis."
Herman H. Goldstine, A History of Numerical Analysis (1977)

Briggs’ Log Method (1617)

$$\log(ab) = \log a + \log b \ \Rightarrow\ \log a = 2 \log a^{1/2}.$$
Use repeatedly:
$$\log a = 2^k \log a^{1/2^k}.$$
Write $a^{1/2^k} = 1 + x$ and note $\log(1 + x) \approx x$.
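Briggs' idea is easy to try in floating point. The sketch below is an illustration of the approximation $\log a \approx 2^k (a^{1/2^k} - 1)$, not a reconstruction of Briggs' hand computation; the function names and the default $k = 20$ are assumptions chosen here. The base-10 variant multiplies by $\log_{10} e$.

```python
import math

def briggs_log(a, k=20):
    """Approximate the natural log of a > 0 by k square roots: 2^k * (a^(1/2^k) - 1)."""
    root = a
    for _ in range(k):
        root = math.sqrt(root)       # after the loop, root = a^(1/2^k) = 1 + x
    return 2**k * (root - 1.0)       # log(1 + x) is approximated by x

def briggs_log10(a, k=20):
    """Base-10 variant: 2^k * log10(e) * (a^(1/2^k) - 1)."""
    return math.log10(math.e) * briggs_log(a, k)

a = 2.0
print("Briggs ln  :", briggs_log(a),   " math.log  :", math.log(a))
print("Briggs log10:", briggs_log10(a), " math.log10:", math.log10(a))
```

Increasing `k` reduces the truncation error but amplifies the rounding error in `root - 1.0` by the factor $2^k$, so in double precision there is a limit to how large `k` can usefully be.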
Briggs worked to base 10 and used
$$\log_{10} a \approx 2^k \cdot \log_{10} e \cdot (a^{1/2^k} - 1).$$

When Does log(BC) = log(B) + log(C)?

Theorem. Let $B, C \in \mathbb{C}^{n\times n}$ commute and have no ei’vals on $\mathbb{R}^-$. If for every ei’val $\lambda_j$ of $B$ and the corr. ei’val $\mu_j$ of $C$,
$$|\arg \lambda_j + \arg \mu_j| < \pi,$$
then $\log(BC) = \log(B) + \log(C)$.

Proof. $\log(B)$ and $\log(C)$ commute, since $B$ and $C$ do. Therefore
$$e^{\log(B) + \log(C)} = e^{\log(B)} e^{\log(C)} = BC.$$
Thus $\log(B) + \log(C)$ is some logarithm of $BC$. Then
$$\operatorname{Im}(\log \lambda_j + \log \mu_j) = \arg \lambda_j + \arg \mu_j \in (-\pi, \pi),$$
so $\log(B) + \log(C)$ is the principal logarithm of $BC$.

Inverse Scaling and Squaring Method

Take $B = C$ in previous theorem:
$$\log A = \log\bigl(A^{1/2} \cdot A^{1/2}\bigr) = 2 \log(A^{1/2}),$$
since $\arg \lambda(A^{1/2}) \in (-\pi/2, \pi/2)$.
Use Briggs’ idea:
$$\log A = 2^k \log A^{1/2^k}.$$
Kenney & Laub’s (1989) inverse scaling and squaring method:
Bring $A$ close to $I$ by repeated square roots.
Approximate $\log(A^{1/2^s})$ using an $[m/m]$ Padé approximant $r_m(x) \approx \log(1 + x)$.
Rescale to find $\log(A)$.
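The three steps of inverse scaling and squaring can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Kenney and Laub algorithm or any library's implementation: the Padé degree is fixed at $m = 2$, using $r_2(x) = (6x + 3x^2)/(6 + 6x + x^2)$, the stopping tolerance `tol` is an ad hoc choice, square roots are delegated to `scipy.linalg.sqrtm`, and the function names are hypothetical.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, solve

def pade22_log1p(X):
    """[2/2] Pade approximant to log(I + X): (6I + 6X + X^2)^{-1} (6X + 3X^2)."""
    I = np.eye(X.shape[0])
    X2 = X @ X
    # Numerator and denominator are polynomials in X, so they commute.
    return solve(6 * I + 6 * X + X2, 6 * X + 3 * X2)

def logm_iss(A, tol=1e-2, max_sqrt=60):
    """Inverse scaling and squaring sketch; returns (approximate log(A), s)."""
    I = np.eye(A.shape[0])
    R = np.array(A, dtype=float)
    s = 0
    # Step 1: repeated square roots until A^(1/2^s) is close to I.
    while np.linalg.norm(R - I, 1) > tol and s < max_sqrt:
        R = sqrtm(R)
        s += 1
    # Step 2: Pade approximation of log(A^(1/2^s)); step 3: rescale by 2^s.
    return 2**s * pade22_log1p(R - I), s

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # illustrative matrix with eigenvalues 5 and 2
L, s = logm_iss(A)
print("square roots taken:", s)
print("max difference from scipy.linalg.logm:", np.abs(L - logm(A)).max())
```

A smaller `tol` lowers the Padé truncation error at the price of more square roots, which is the trade-off between $s$ and $m$ that a production algorithm tunes adaptively.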
Choice of Parameters s, m

Must have $\|I - A^{1/2^s}\| < 1$.
Larger Padé degree $m$ means smaller $s$.
Let $h_{2m+1}(X) = e^{r_m(X)} - X - I$.
Assume $\rho(r_m(X)) < \pi$, so $\log(e^{r_m(X)}) = r_m(X)$. Then
$$r_m(X) = \log(I + X + h_{2m+1}(X)) =: \log(I + X + \Delta X),$$
where
$$h_{2m+1}(X) = \sum_{k=2m+1}^{\infty} c_k X^k.$$

Bounding the Backward Error

Want to bound norm of $h_{2m+1}(X) = \sum_{k=2m+1}^{\infty} c_k X^k$.
Non-normality implies $\rho(A) \ll \|A\|$.
Note that
$$\rho(A) \le \|A^k\|^{1/k} \le \|A\|, \qquad k = 1\colon\infty,$$
and $\lim_{k\to\infty} \|A^k\|^{1/k} = \rho(A)$.
Use $\|A^k\|^{1/k}$ instead of $\|A\|$ in the truncation bounds.

[Figure: for $A = \begin{bmatrix} 0.9 & 500 \\ 0 & -0.5 \end{bmatrix}$, plot on a logarithmic scale ($10^0$ to $10^{10}$) against $k = 0$ to $20$, with legend entries $\|A\|_2^k$, $\|A^k\|_2$, $(\|A^5\|_2^{1/5})^k$, $(\|A^{10}\|_2^{1/10})^k$ and $\|A^k\|_2^{1/k}$.]

Algorithm of Al-Mohy & H (2012)

Truncation bounds use $\|A^k\|^{1/k}$ rather than $\|A\|$, leading to major benefits in speed and accuracy.
Matrix norms not such a blunt tool!
Use estimates of $\|A^k\|$ (alg of H & Tisseur (2000)).
Choose $s$ and $m$ to achieve double precision backward error at minimal cost.
Initial Schur decomposition: $A = Q T Q^*$.
Directly and accurately compute certain elements of $T^{1/2^s} - I$ and $\log(T)$. Use
$$a^{1/2^s} - 1 = \frac{a - 1}{\prod_{i=1}^{s} (1 + a^{1/2^i})}.$$

Fréchet Derivative of Logarithm

$$f(A + E) - f(A) - L(A, E) = o(\|E\|).$$
Integral formula
$$L(A, E) = \int_0^1 \bigl(t(A - I) + I\bigr)^{-1} E \bigl(t(A - I) + I\bigr)^{-1} \, dt.$$
Method based on
$$f\left( \begin{bmatrix} X & E \\ 0 & X \end{bmatrix} \right) = \begin{bmatrix} f(X) & L(X, E) \\ 0 & f(X) \end{bmatrix}.$$
Kenney & Laub (1998): Kronecker–Sylvester alg, Padé of $\tanh(x)/x$. Requires complex arithmetic.

Algorithm of Al-Mohy, H & Relton (2012)

Fréchet differentiate the ISS algorithm!
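As a baseline for any Fréchet derivative algorithm, the $2n \times 2n$ block identity from the "Fréchet Derivative of Logarithm" slide gives a simple, if more expensive, way to obtain $L_{\log}(X, E)$ from a single logarithm. The sketch below is an illustration only: the matrices `X` and `E`, the step `h` and the function name are assumptions made here, and the central finite difference serves purely as a sanity check.

```python
import numpy as np
from scipy.linalg import logm

def frechet_log_block(X, E):
    """L_log(X, E) read off as the (1,2) block of log([[X, E], [0, X]])."""
    n = X.shape[0]
    M = np.block([[X, E], [np.zeros((n, n)), X]])
    # np.real discards any rounding-level imaginary part in the computed log.
    return np.real(logm(M))[:n, n:]

X = np.array([[4.0, 1.0],
              [2.0, 3.0]])           # illustrative matrix, eigenvalues 5 and 2
E = np.array([[1.0, 0.0],
              [2.0, -1.0]])          # illustrative direction

L_block = frechet_log_block(X, E)

# Central finite difference approximation to the same directional derivative.
h = 1e-5
L_fd = np.real(logm(X + h * E) - logm(X - h * E)) / (2 * h)

print("block formula:\n", L_block)
print("max difference from finite difference:", np.abs(L_block - L_fd).max())
```

The block approach costs one logarithm of a matrix of twice the dimension, which is one motivation for differentiating the inverse scaling and squaring recurrence instead.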
$E_0 = E$
for $i = 1\colon s$
    Compute $A^{1/2^i}$.
    Solve the Sylvester eqn $A^{1/2^i} E_i + E_i A^{1/2^i} = E_{i-1}$.
end
$\log(A) \approx 2^s r_m(A^{1/2^s} - I)$
$L_{\log}(A, E) \approx 2^s L_{r_m}(A^{1/2^s} - I, E_s)$

Backward Error Result
$$r_m(X) = \log(I + X + \Delta X), \qquad L_{r_m}(X, E) = L_{\log}(I + X + \Delta X, E + \Delta E).$$

Conclusions & Future Directions

Log appears in a growing number of applications.
Have good algorithms for $\log(A)$, $L_{\log}(A)$ and estimating the condition number.
If $A$ is real can work entirely in real arithmetic.
Conditioning of $f(A)$.
Non-primary functions.
Functions of structured matrices.

References

A. H. Al-Mohy and N. J. Higham. A new scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl., 31(3):970–989, 2009.

A. H. Al-Mohy and N. J. Higham. Improved inverse scaling and squaring algorithms for the matrix logarithm. SIAM J. Sci. Comput., 34(4):C152–C169, 2012.

A. H. Al-Mohy, N. J. Higham, and S. D. Relton. Computing the Fréchet derivative of the matrix logarithm and estimating the condition number. MIMS EPrint 2012.72, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, July 2012. 17 pp.

T. Charitos, P. R. de Waal, and L. C. van der Gaag. Computing short-interval transition matrices of a discrete-time Markov chain from partially observed data. Statistics in Medicine, 27:905–921, 2008.

W. F. Harris. The average eye. Ophthal. Physiol. Opt., 24:580–585, 2005.

N. J. Higham. The Matrix Function Toolbox. http://www.ma.man.ac.uk/~higham/mftoolbox.

N. J. Higham. Evaluating Padé approximants of the matrix logarithm. SIAM J. Matrix Anal. Appl., 22(4):1126–1135, 2001.

N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008. ISBN 978-0-898716-46-7. xx+425 pp.

N. J. Higham and A. H. Al-Mohy. Computing matrix functions. Acta Numerica, 19:159–208, 2010.

N. J. Higham and L. Lin. On pth roots of stochastic matrices. Linear Algebra Appl., 435(3):448–463, 2011.

C. S. Kenney and A. J. Laub. Condition estimates for matrix functions. SIAM J. Matrix Anal. Appl., 10(2):191–209, 1989.

C. S. Kenney and A. J. Laub. A Schur–Fréchet algorithm for computing the logarithm and exponential of a matrix. SIAM J. Matrix Anal. Appl., 19(3):640–663, 1998.