Newton's Method for the Matrix Square Root*
By Nicholas J. Higham
Mathematics of Computation, Vol. 46, No. 174 (Apr. 1986), pp. 537-549.
Published by the American Mathematical Society. Stable URL: http://www.jstor.org/stable/2007992

Abstract. One approach to computing a square root of a matrix A is to apply Newton's method to the quadratic matrix equation F(X) ≡ X^2 − A = 0. Two widely-quoted matrix square root iterations obtained by rewriting this Newton iteration are shown to have excellent mathematical convergence properties. However, by means of a perturbation analysis and supportive numerical examples, it is shown that these simplified iterations are numerically unstable. A further variant of Newton's method for the matrix square root, recently proposed in the literature, is shown to be, for practical purposes, numerically stable.

1. Introduction. A square root of an n × n matrix A with complex elements, A ∈ C^{n×n}, is a solution X ∈ C^{n×n} of the quadratic matrix equation

(1.1)    F(X) ≡ X^2 − A = 0.

A natural approach to computing a square root of A is to apply Newton's method to (1.1).
For a general function G: C^{n×n} → C^{n×n}, Newton's method for the solution of G(X) = 0 is specified by an initial approximation X_0 and the recurrence (see [14, p. 140], for example)

(1.2)    X_{k+1} = X_k − G'(X_k)^{-1}G(X_k),    k = 0, 1, 2, ...,

where G' denotes the Fréchet derivative of G. Identifying

    F(X + H) = X^2 − A + (XH + HX) + H^2

with the Taylor series for F, we see that F'(X) is a linear operator, F'(X): C^{n×n} → C^{n×n}, defined by

    F'(X)H = XH + HX.

Thus Newton's method for the matrix square root can be written

         X_0 given,
(N):
(1.3)    X_k H_k + H_k X_k = A − X_k^2,
(1.4)    X_{k+1} = X_k + H_k,    k = 0, 1, 2, ....

Applying the standard local convergence theorem for Newton's method [14, p. 148], we deduce that the Newton iteration (N) converges quadratically to a square root X of A if ||X − X_0|| is sufficiently small and if the linear transformation F'(X) is nonsingular. However, the most stable and efficient methods for solving Eq. (1.3), [1], [6], require the computation of a Schur decomposition of X_k, assuming X_k is full. Since a square root of A can be obtained directly and at little extra cost once a single Schur decomposition, that of A, is known, [2], [9], we see that in general Newton's method for the matrix square root, in the form (N), is computationally expensive.

    Received October 22, 1984; revised July 30, 1985.
    1980 Mathematics Subject Classification. Primary 65F30, 65H10.
    Key words and phrases. Matrix square root, Newton's method, numerical stability.
    * This work was carried out with the support of a SERC Research Studentship.
    ©1986 American Mathematical Society 0025-5718/86 $1.00 + $.25 per page

It is therefore natural to attempt to "simplify" iteration (N). Since X commutes with A = X^2, a reasonable assumption (which we will justify in Theorem 1) is that the commutativity relation X_k H_k = H_k X_k holds, in which case (1.3) may be written 2X_k H_k = 2H_k X_k = A − X_k^2, and we obtain from (N) two new iterations

(1.5)    (I):     Y_{k+1} = ½(Y_k + Y_k^{-1}A),
(1.6)    (II):    Z_{k+1} = ½(Z_k + AZ_k^{-1}).

These iterations are well known; see, for example, [2], [7, p. 395], [11], [12], [13].

Consider the following numerical example. Using iteration (I) on a machine with approximately nine decimal digit accuracy, we attempted to compute a square root of the symmetric positive definite Wilson matrix [16, pp. 93, 123]

         [10  7  8  7]
    W =  [ 7  5  6  5]
         [ 8  6 10  9]
         [ 7  5  9 10]

for which the 2-norm condition number κ₂(W) = ||W||_2 ||W^{-1}||_2 ≈ 2984. Two implementations of iteration (I) were employed (for the details see Section 5). The first is designed to deal with general matrices, while the second is for the case where A is positive definite and takes full advantage of the fact that all iterates are (theoretically) positive definite (see Corollary 1). In both cases we took Y_0 = I; as we will prove in Theorem 2, for this starting value iteration (I) should converge quadratically to W^{1/2}, the unique symmetric positive definite square root of W. Denoting the computed iterates by Ȳ_k, the results obtained were as in Table 1. Both implementations failed to converge; in the first, Ȳ_20 was unsymmetric and indefinite. In contrast, a further variant of the Newton iteration, to be defined in Section 4, converged to W^{1/2} in nine iterations.

Clearly, iteration (I) is in some sense "numerically unstable". This instability was noted by Laasonen [13] who, in a paper apparently unknown to recent workers in this area, stated without proof that for a matrix with real, positive eigenvalues iteration (I) "if carried out indefinitely, is not stable whenever the ratio of the largest to the smallest eigenvalue of A exceeds the value 9". We wish to draw attention to this important and surprising fact. In Section 3 we provide a rigorous proof of Laasonen's claim. We show that the original Newton method (N) does not suffer from this numerical instability, and we identify in Section 4 an iteration, proposed in [4], which has the computational simplicity of iteration (I) and yet does not suffer from the instability which impairs the practical performance of (I).

TABLE 1
||W^{1/2} − Ȳ_k||_1 for Implementations 1 and 2, k = 0, 1, ..., 20. For both implementations the error decreases from about 1.1 × 10^1 at k = 0 to order 10^{-4} near k = 10, and thereafter grows; by k = 20 the error for Implementation 1 is of order 10^6, while Implementation 2 terminates with the error "Ȳ_k not positive definite".

We begin by analyzing the mathematical convergence properties of the Newton iteration.

2. Convergence of Newton's Method. In this section we derive conditions which ensure the convergence of Newton's method for the matrix square root, and we establish to which square root the method converges for a particular set of starting values. (For a classification of the set {X: X^2 = A} see, for example, [9].)

First, we investigate the relationship between the Newton iteration (N) and its offshoots (I) and (II). To begin, note that the Newton iterates X_k are well-defined if and only if, for each k, Eq. (1.3) has a unique solution, that is, the linear transformation F'(X_k) is nonsingular. This is so if and only if X_k and −X_k have no eigenvalue in common [7, p. 194], which requires in particular that X_k be nonsingular.

THEOREM 1. Consider the iterations (N), (I) and (II). Suppose X_0 = Y_0 = Z_0 commutes with A and that all the Newton iterates X_k are well-defined. Then
(i) X_k commutes with A for all k,
(ii) X_k = Y_k = Z_k for all k.

Proof. We sketch an inductive proof of parts (i) and (ii) together. The case k = 0 is given. Assume the results hold for k. From the remarks preceding the theorem we see that both the linear transformation F'(X_k) and the matrix X_k are nonsingular. Define

    G_k = ½(X_k^{-1}A − X_k).

Using X_k A = AX_k we have, from (1.3), F'(X_k)G_k = F'(X_k)H_k. Thus H_k = G_k, and so from (1.4),

(2.1)    X_{k+1} = X_k + G_k = ½(X_k + X_k^{-1}A),

which commutes with A. It follows easily from (2.1) that X_{k+1} = Y_{k+1} = Z_{k+1}. □
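Theorem 1 can be observed numerically: started from Y_0 = Z_0 = I, which commutes with A, iterations (I) and (II) produce identical iterates. The following sketch checks this in plain Python; the hand-rolled 2 × 2 helpers, the test matrix and the tolerances are illustrative choices, not from the paper.

```python
# Iterations (I): Y <- (Y + Y^{-1}A)/2 and (II): Z <- (Z + A Z^{-1})/2,
# started from the commuting value Y_0 = Z_0 = I, generate the same
# sequence, as Theorem 1 asserts.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    # closed-form inverse of a 2x2 matrix
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[ A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det,  A[0][0] / det]]

def step_I(Y, A):
    T = matmul(inv2(Y), A)                     # Y^{-1} A
    return [[0.5 * (Y[i][j] + T[i][j]) for j in range(2)] for i in range(2)]

def step_II(Z, A):
    T = matmul(A, inv2(Z))                     # A Z^{-1}
    return [[0.5 * (Z[i][j] + T[i][j]) for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]                   # SPD, eigenvalues 1 and 3
Y = Z = [[1.0, 0.0], [0.0, 1.0]]               # commuting starting value
for _ in range(8):
    Y, Z = step_I(Y, A), step_II(Z, A)
    assert all(abs(Y[i][j] - Z[i][j]) < 1e-12  # identical up to roundoff
               for i in range(2) for j in range(2))

YY = matmul(Y, Y)                              # after 8 steps, Y*Y ~ A
assert all(abs(YY[i][j] - A[i][j]) < 1e-10 for i in range(2) for j in range(2))
```

The two update formulae differ only in the order of the factors, and that order is immaterial precisely because every iterate commutes with A.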
Thus, provided the initial approximation X_0 = Y_0 = Z_0 commutes with A and the correction equation (1.3) is nonsingular at each stage, the Newton iteration (N) and its variants (I) and (II) yield the same sequence of iterates. We now examine the convergence of this sequence, concentrating for simplicity on iteration (I) with starting value a multiple of the identity matrix. Note that the starting values Y_0 = I and Y_0 = A lead to the same sequence Y_1 = ½(I + A), Y_2, ....

For our analysis we assume that A is diagonalizable, that is, there exists a nonsingular matrix Z such that

(2.2)    Z^{-1}AZ = Λ = diag(λ_1, ..., λ_n),

where λ_1, ..., λ_n are the eigenvalues of A. The convenience of this assumption is that it enables us to diagonalize the iteration. For, defining

(2.3)    D_k = Z^{-1}Y_k Z,

we have from (1.5),

(2.4)    D_{k+1} = ½(Z^{-1}Y_k Z + (Z^{-1}Y_k Z)^{-1}Z^{-1}AZ) = ½(D_k + D_k^{-1}Λ),

so that if D_0 is diagonal, then by induction all the successive transformed iterates D_k are diagonal too.

THEOREM 2. Let A ∈ C^{n×n} be nonsingular and diagonalizable, and suppose that none of A's eigenvalues is real and negative. Let

    Y_0 = mI,    m > 0.

Then, provided the iterates {Y_k} in (1.5) are defined,

    lim_{k→∞} Y_k = X

and

(2.5)    ||Y_{k+1} − X|| ≤ ½ ||Y_k^{-1}|| ||Y_k − X||^2,

where X is the unique square root of A for which every eigenvalue has positive real part.

Proof. We will use the notation (2.2). In view of (2.3) and (2.4) it suffices to analyze the convergence of the sequence {D_k}. D_0 = mI is diagonal, so D_k is diagonal for each k. Writing D_k = diag(d_i^{(k)}) we see from (2.4) that

    d_i^{(k+1)} = ½(d_i^{(k)} + λ_i/d_i^{(k)}),    1 ≤ i ≤ n,

that is, (2.4) is essentially n uncoupled scalar Newton iterations for the square roots √λ_i, 1 ≤ i ≤ n. Consider therefore the scalar iteration

    z_{k+1} = ½(z_k + a/z_k).

We will use the relations [17, p. 84]

(2.6)    z_{k+1} ± √a = (z_k ± √a)^2 / (2z_k),

(2.7)    (z_{k+1} − √a)/(z_{k+1} + √a) = ((z_0 − √a)/(z_0 + √a))^{2^{k+1}}.

If a does not lie on the nonpositive real axis then we can choose √a to have positive real part, in which case it is easy to see that for real z_0 > 0,

    |(z_0 − √a)/(z_0 + √a)| < 1.
Consequently, for a and z_0 of the specified form we have from (2.7), provided that the sequence {z_k} is defined,

    lim_{k→∞} z_k = √a,    Re √a > 0.

Since the eigenvalues λ_i and the starting values d_i^{(0)} = m > 0 are of the form of a and z_0, respectively,

(2.8)    lim_{k→∞} D_k = Λ^{1/2} = diag(λ_i^{1/2}),    Re λ_i^{1/2} > 0,

and thus

    lim_{k→∞} Y_k = ZΛ^{1/2}Z^{-1} ≡ X

(provided the iterates {Y_k} are defined), which is clearly a square root of A whose eigenvalues have positive real part. The uniqueness of X follows from Theorem 4 in [9]. Finally, we can use (2.6), with the minus sign, to deduce that

    D_{k+1} − Λ^{1/2} = ½ D_k^{-1}(D_k − Λ^{1/2})^2;

performing a similarity transformation by Z gives

    Y_{k+1} − X = ½ Y_k^{-1}(Y_k − X)^2,

from which (2.5) follows on taking norms. □

Theorem 2 shows, then, that under the stated hypotheses on A, iterations (N), (I) and (II) with starting value a multiple of the identity matrix, when defined, will indeed converge: quadratically, to a particular square root of A, the form of whose spectrum is known a priori.

Several comments are worth making. First, we can use Theorem 4 in [9] to deduce that the square root X in Theorem 2 is indeed a function of A, in the sense defined in [5, p. 96]. (Essentially, B is a function of A if B can be expressed as a polynomial in A.) Next, note that the proof of Theorem 2 relies on the fact that the matrix which diagonalizes A also diagonalizes each iterate Y_k. This property is maintained for Y_0 an arbitrary function of A, and under suitable conditions convergence can still be proved, but the spectrum {±λ_1^{1/2}, ..., ±λ_n^{1/2}} of the limit matrix, if it exists, will depend on Y_0. Finally, we remark that Theorem 2 can be proved without the assumption that A is diagonalizable, using, for example, the technique in [13].

We conclude this section with a corollary which applies to the important case where A is Hermitian positive definite.

COROLLARY 1. Let A ∈ C^{n×n} be Hermitian positive definite. If Y_0 = mI, m > 0, then the iterates {Y_k} in (1.5) are all Hermitian positive definite, lim_{k→∞} Y_k = X, where X is the unique Hermitian positive definite square root of A, and (2.5) holds.

3. Stability Analysis. We now consider the behavior of Newton's method for the matrix square root, and its variants (I) and (II), when the iterates are subject to perturbations. We will regard these perturbations as arising from rounding errors sustained during the evaluation of an iteration formula, though our analysis is quite general.

Consider first iteration (I) with Y_0 = mI, m > 0, and make the same assumptions as in Theorem 2. Let Ȳ_k denote the computed kth iterate, Ȳ_0 = Y_0, and define

    Δ_k = Ȳ_k − Y_k.

Our aim is to analyze how the error matrix Δ_k propagates at the (k + 1)st stage (note the distinction between Δ_k and the "true" error matrix Ȳ_k − X). To simplify the analysis we assume that no rounding errors are committed when computing Ȳ_{k+1}, so that

(3.1)    Ȳ_{k+1} = ½(Ȳ_k + Ȳ_k^{-1}A) = ½(Y_k + Δ_k + (Y_k + Δ_k)^{-1}A).

Using the perturbation result [15, p. 188 ff.]

(3.2)    (A + E)^{-1} = A^{-1} − A^{-1}EA^{-1} + O(||E||^2),

we obtain

    Ȳ_{k+1} = ½(Y_k + Δ_k + Y_k^{-1}A − Y_k^{-1}Δ_k Y_k^{-1}A) + O(||Δ_k||^2).

Subtracting (1.5) yields

(3.3)    Δ_{k+1} = ½(Δ_k − Y_k^{-1}Δ_k Y_k^{-1}A) + O(||Δ_k||^2).

Using the notation (2.2) and (2.3), let

(3.4)    Δ̂_k = Z^{-1}Δ_k Z,

and transform (3.3) to obtain

(3.5)    Δ̂_{k+1} = ½(Δ̂_k − D_k^{-1}Δ̂_k D_k^{-1}Λ) + O(||Δ̂_k||^2).

From the proof of Theorem 2,

(3.6)    D_k = diag(d_i^{(k)}),

so with

(3.7)    Δ̂_k = (δ_ij^{(k)}),

Eq. (3.5) can be written elementwise as

    δ_ij^{(k+1)} = ½(1 − λ_j/(d_i^{(k)}d_j^{(k)})) δ_ij^{(k)} + O(||Δ̂_k||^2),    1 ≤ i, j ≤ n.

Since D_k → Λ^{1/2} as k → ∞ (see (2.8)) we can write

(3.8)    d_i^{(k)} = λ_i^{1/2} + e_i^{(k)},

where e_i^{(k)} → 0 as k → ∞. Then

(3.9)    δ_ij^{(k+1)} = α_ij^{(k)} δ_ij^{(k)} + O(||Δ̂_k||^2),    1 ≤ i, j ≤ n,

where

(3.10)    α_ij^{(k)} = ½(1 − λ_j/(d_i^{(k)}d_j^{(k)})) = ½(1 − (λ_j/λ_i)^{1/2}) + O(e^{(k)}),

and e^{(k)} = max_i |e_i^{(k)}| → 0 as k → ∞. To ensure the numerical stability of the iteration we require that the error amplification factors (3.10) be bounded in modulus by 1; hence we require, in particular, that

(3.11)    ½|1 − (λ_j/λ_i)^{1/2}| ≤ 1,    1 ≤ i, j ≤ n.

This is a severe restriction on the matrix A. For example, if A is Hermitian positive definite the condition is equivalent to (cf. [13])

(3.12)    κ₂(A) ≤ 9,

where the condition number κ₂(A) = ||A||_2 ||A^{-1}||_2.
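The threshold in (3.12) is easy to check numerically: the limiting amplification factor ½(1 − (λ_j/λ_i)^{1/2}) of (3.10) has modulus at most 1 precisely when the eigenvalue ratio is at most 9. A minimal sketch in plain Python, with illustrative ratios:

```python
# Amplification factor alpha(r) = (1 - sqrt(r))/2 from (3.10), where
# r = lambda_j / lambda_i.  Stability of iteration (I) requires
# |alpha(r)| <= 1 for every eigenvalue ratio r, i.e. r <= 9 for a
# Hermitian positive definite matrix (condition (3.12)).
import math

def alpha(r):
    return 0.5 * (1.0 - math.sqrt(r))

# At or below the threshold the factor has modulus at most 1 ...
assert all(abs(alpha(r)) <= 1.0 for r in [1.0, 4.0, 9.0])

# ... while beyond it, errors are amplified at every stage.
assert abs(alpha(16.0)) == 1.5        # a 50% growth per iteration
assert abs(alpha(2984.0)) > 26        # eigenvalue ratio of the Wilson matrix
```

The last line explains the behavior in Table 1: for the Wilson matrix the worst error component is multiplied by roughly 27 at each stage once the iteration has nearly converged.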
To clarify the above analysis it is helpful to consider a particular example. Suppose A is Hermitian positive definite, so that in (2.2) we can take Z = Q, where Q = (q_1, ..., q_n) is unitary. Thus,

(3.13)    Q*AQ = Λ = diag(λ_1, ..., λ_n),    Q*Q = I,

and (cf. (2.3))

(3.14)    Q*Y_k Q = D_k = diag(d_i^{(k)}).

Consider the special (unsymmetric) rank-one perturbation

    Δ_k = ε q_i q_j*,    ||Δ_k||_2 = ε > 0,    i ≠ j.

For this Δ_k the Sherman–Morrison formula [7, p. 3] gives

    Ȳ_k^{-1} = (Y_k + Δ_k)^{-1} = Y_k^{-1} − Y_k^{-1}Δ_k Y_k^{-1}

(the denominator in the Sherman–Morrison formula contributes nothing here, since q_j* Y_k^{-1} q_i = 0 for i ≠ j). Using this identity in (3.1) we obtain, on subtracting (1.5),

(3.15)    Δ_{k+1} = ½(Δ_k − Y_k^{-1}Δ_k Y_k^{-1}A),

that is, (3.3) with the order term zero. Using (3.13) and (3.14) in (3.15), we have

    Δ_{k+1} = ½(Δ_k − (QD_k^{-1}Q*)(ε q_i q_j*)(QD_k^{-1}Q*)(QΛQ*))
            = ½(Δ_k − ε QD_k^{-1}e_i e_j* D_k^{-1}ΛQ*)
            = ½(1 − λ_j/(d_i^{(k)}d_j^{(k)})) Δ_k.

Let Y_k = A^{1/2} (the Hermitian positive definite square root of A), so that D_k = Λ^{1/2}, and choose i, j so that λ_j/λ_i = κ₂(A). Then

    Δ_{k+1} = ½(1 − κ₂(A)^{1/2}) Δ_k.

Assuming that Ȳ_{k+2}, Ȳ_{k+3}, ..., like Ȳ_{k+1}, are computed exactly from the preceding iterates, it follows that

    Δ_{k+r} = [½(1 − κ₂(A)^{1/2})]^r Δ_k,    r > 0.

In this example, Ȳ_k is an arbitrarily small distance ε away from A^{1/2} in the 2-norm, yet if κ₂(A) > 9 the subsequent iterates diverge, growing unboundedly.

Consider now the Newton iteration (N) with X_0 = mI, m > 0, so that by Theorem 1, X_k ≡ Y_k, and make the same assumptions as in Theorem 2. Then

(3.16)    X_k − X → 0 as k → ∞

(quadratically), where X is the square root of A defined in Theorem 2. Let X_k be perturbed to X̄_k = X_k + Δ_k and denote the corresponding perturbed sequence of iterates (computed exactly from X̄_k) by {X̄_{k+r}}_{r≥0}. The standard local convergence theorem for Newton's method implies that, for ||X̄_k − X|| sufficiently small, that is, for k sufficiently large and ||Δ_k|| sufficiently small,

(3.17)    X̄_{k+r} − X → 0 as r → ∞

(quadratically). From (3.16) and (3.17) it follows that

    Δ_{k+r} = X̄_{k+r} − X_{k+r} → 0 as r → ∞.

Thus, unlike iteration (I), the Newton iteration (N) has the property that once convergence is approached, a suitable norm of the error matrix Δ_k = X̄_k − X_k is not magnified, but rather decreased, in succeeding iterations.

To summarize, for iterations (N) and (I) with initial approximation mI (m > 0), our analysis shows how a small perturbation Δ_k in the kth iterate is propagated at the (k + 1)st stage. For iteration (I), depending on the eigenvalues of A, a small perturbation Δ_k in Y_k may induce perturbations of increasing norm in succeeding iterates, and the sequence {Ȳ_k} may "diverge" from the sequence of true iterates {Y_k}. The same conclusion applies to iteration (II), for which a similar analysis holds. In contrast, for large k, the Newton iteration (N) damps a small perturbation Δ_k in X_k.

Our conclusion, then, is that in simplifying Newton's method to produce the ostensibly attractive formulae (1.5) and (1.6), one sacrifices numerical stability of the method.

4. A Further Newton Variant. The following matrix square root iteration is derived in [4] using the matrix sign function:

           P_0 = A,    Q_0 = I,
(4.1)      P_{k+1} = ½(P_k + Q_k^{-1}),
(III):
(4.2)      Q_{k+1} = ½(Q_k + P_k^{-1}),    k = 0, 1, 2, ....

It is easy to prove by induction (using Theorem 1) that if {Y_k} is the sequence computed from (1.5) with Y_0 = I, then

(4.3)    P_k = Y_k,
(4.4)    Q_k = A^{-1}Y_k,    k ≥ 1.

Thus if A satisfies the conditions of Theorem 2 and the sequence {P_k, Q_k} is defined, then

    lim_{k→∞} P_k = X,    lim_{k→∞} Q_k = X^{-1},

where X is the square root of A defined in Theorem 2.

At first sight, iteration (III) appears to have no advantage over iteration (I). It is in general no less computationally expensive; it computes simultaneously approximations to X and X^{-1}, when probably only X is required; and intuitively the fact that A is present only in the initial conditions, and not in the iteration formulae, is displeasing. However, as we will now show, this "coupled" iteration does not suffer from the numerical instability which vitiates iteration (I).
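Iteration (III) is simple to implement as stated. The following sketch runs (4.1)-(4.2) in plain Python with a hand-rolled 2 × 2 inverse and checks that P_k ≈ A^{1/2} and Q_k ≈ A^{-1/2}; the test matrix and iteration count are illustrative choices, not from the paper.

```python
# Sketch of iteration (III): P_{k+1} = (P_k + Q_k^{-1})/2,
# Q_{k+1} = (Q_k + P_k^{-1})/2, starting from P_0 = A, Q_0 = I.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    # closed-form inverse of a 2x2 matrix
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[ A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det,  A[0][0] / det]]

def avg(A, B):
    return [[0.5 * (A[i][j] + B[i][j]) for j in range(2)] for i in range(2)]

def db_sqrt(A, iters=25):
    P, Q = A, [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(iters):
        # simultaneous (coupled) update: both right-hand sides use
        # the old P and Q
        P, Q = avg(P, inv2(Q)), avg(Q, inv2(P))
    return P, Q

A = [[5.0, 4.0], [4.0, 5.0]]    # SPD, eigenvalues 1 and 9; A^{1/2} = [[2,1],[1,2]]
P, Q = db_sqrt(A)
assert all(abs(P[i][j] - [[2.0, 1.0], [1.0, 2.0]][i][j]) < 1e-10
           for i in range(2) for j in range(2))        # P ~ A^{1/2}
QA = matmul(Q, A)
assert all(abs(QA[i][j] - P[i][j]) < 1e-9
           for i in range(2) for j in range(2))        # Q ~ A^{-1/2}
```

Note that the coupled update must use the old P_k and Q_k on both right-hand sides; overwriting P before computing Q_{k+1} would give a different (and not equivalent) iteration.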
To parallel the analysis in Section 3, suppose the assumptions of Theorem 2 hold, let P̄_k and Q̄_k denote the computed iterates from iteration (III), define

    E_k = P̄_k − P_k,    F_k = Q̄_k − Q_k,

and assume that at the (k + 1)st stage P̄_{k+1} and Q̄_{k+1} are computed exactly from P̄_k and Q̄_k. Then from (4.1) and (4.2), using (3.2), we have

    P̄_{k+1} = ½(P_k + E_k + Q_k^{-1} − Q_k^{-1}F_k Q_k^{-1}) + O(||F_k||^2),
    Q̄_{k+1} = ½(Q_k + F_k + P_k^{-1} − P_k^{-1}E_k P_k^{-1}) + O(||E_k||^2).

Subtracting (4.1) and (4.2), respectively, gives

(4.5)    E_{k+1} = ½(E_k − Q_k^{-1}F_k Q_k^{-1}) + O(g_k^2),
(4.6)    F_{k+1} = ½(F_k − P_k^{-1}E_k P_k^{-1}) + O(g_k^2),

where g_k = max{||E_k||, ||F_k||}. From (2.2), (2.3), (4.3), (4.4) and (3.6),

    Z^{-1}P_k Z = D_k = diag(d_i^{(k)}),    Z^{-1}Q_k Z = Λ^{-1}D_k;

thus, defining

    Ê_k = Z^{-1}E_k Z,    F̂_k = Z^{-1}F_k Z,

we can transform (4.5) and (4.6) into

    Ê_{k+1} = ½(Ê_k − D_k^{-1}Λ F̂_k D_k^{-1}Λ) + O(g_k^2),
    F̂_{k+1} = ½(F̂_k − D_k^{-1}Ê_k D_k^{-1}) + O(g_k^2).

Written elementwise, using the notation Ê_k = (ε_ij^{(k)}), F̂_k = (φ_ij^{(k)}), these equations become

(4.7)    ε_ij^{(k+1)} = ½(ε_ij^{(k)} − θ_ij^{(k)} φ_ij^{(k)}) + O(g_k^2),
(4.8)    φ_ij^{(k+1)} = ½(φ_ij^{(k)} − μ_ij^{(k)} ε_ij^{(k)}) + O(g_k^2),

where

    θ_ij^{(k)} = λ_i λ_j/(d_i^{(k)}d_j^{(k)}) = (λ_i λ_j)^{1/2} + O(e^{(k)})

and

    μ_ij^{(k)} = 1/(d_i^{(k)}d_j^{(k)}) = (λ_i λ_j)^{-1/2} + O(e^{(k)}),

using (3.8). It is convenient to write Eqs. (4.7) and (4.8) in vector form:

(4.9)    h_ij^{(k+1)} = M_ij^{(k)} h_ij^{(k)} + O(g_k^2),

where

    h_ij^{(k)} = [ε_ij^{(k)}, φ_ij^{(k)}]^T

and

    M_ij^{(k)} = ½ [ 1            −θ_ij^{(k)} ]  =  M_ij + O(e^{(k)}),
                   [ −μ_ij^{(k)}  1           ]

    M_ij = ½ [ 1                     −(λ_i λ_j)^{1/2} ]
             [ −(λ_i λ_j)^{-1/2}     1                ].

It is easy to verify that the eigenvalues of M_ij are zero and one; denote a corresponding pair of eigenvectors by x_0 and x_1 and let

    h_ij^{(k)} = a_0^{(k)} x_0 + a_1^{(k)} x_1.

If we make the further assumption that no new errors are introduced at the (k + 2)nd stage of the iteration onwards (so that the analysis is tracing how an isolated pair of perturbations at the kth stage is propagated), then for k large enough and g_k small, we have, by induction,

(4.10)    h_ij^{(k+r)} ≈ M_ij^r h_ij^{(k)} = M_ij(a_0^{(k)} x_0 + a_1^{(k)} x_1) = a_1^{(k)} x_1,    r > 0.

While ||h_ij^{(k+1)}|| may exceed ||h_ij^{(k)}|| by the factor ||M_ij^{(k)}|| ≈ ||M_ij|| ≥ 1 (taking norms in (4.9)), from (4.10) it is clear that the vectors h_ij^{(k+1)}, h_ij^{(k+2)}, ... remain approximately constant; that is, the perturbations introduced at the kth stage have only a bounded effect on succeeding iterates.
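The crucial property in this argument, that the limiting matrix M_ij is idempotent with eigenvalues 0 and 1, can be verified directly. A minimal sketch, with an illustrative eigenvalue pair:

```python
# M = (1/2) [[1, -t], [-1/t, 1]] with t = sqrt(lambda_i * lambda_j) is
# the limiting error-propagation matrix of iteration (III) in (4.9).
# M is idempotent (M^2 = M), so its eigenvalues are 0 and 1 and
# M^r h = M h for all r >= 1: an isolated perturbation is not amplified.
import math

lam_i, lam_j = 1.0, 4.0                  # illustrative eigenvalues
t = math.sqrt(lam_i * lam_j)
M = [[0.5, -0.5 * t], [-0.5 / t, 0.5]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M2 = matmul2(M, M)
assert all(abs(M2[i][j] - M[i][j]) < 1e-14 for i in range(2) for j in range(2))

# trace = 1 and det = 0, consistent with eigenvalues {0, 1}
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
assert abs(trace - 1.0) < 1e-14 and abs(det) < 1e-14
```

Contrast this with iteration (I), whose scalar amplification factor (3.10) can exceed 1 in modulus: there, repeated application compounds the growth, while here repeated application of M changes nothing after the first step.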
Our analysis shows that iteration (III) does not suffer from the unstable error propagation which affects iteration (I), and suggests that iteration (III) is, for practical purposes, numerically stable. In the next section we supplement the theory which has been given so far with some numerical test results.

5. Numerical Examples. In this section we give some examples of the performance in finite-precision arithmetic of iteration (I) (with Y_0 = I) and iteration (III). When implementing the iterations we distinguished the case where A is symmetric positive definite; since the iterates also possess this attractive property (see Corollary 1) it is possible to use the Cholesky decomposition and to work only with the "lower triangles" of the iterates.

To define our implementations, it suffices to specify our algorithm for evaluating W = B^{-1}C, where B = Y_k, C = A in iteration (I), and B = P_k or Q_k, C = I in iteration (III). For general A we used an LU factorization of B (computed by Gaussian elimination with partial pivoting) to solve by substitution the linear systems BW = C. For symmetric positive definite A we first formed B^{-1}, and then computed the (symmetric) product B^{-1}C; B^{-1} was computed from the Cholesky decomposition B = LL^T, by inverting L and then forming the (symmetric) product B^{-1} = L^{-T}L^{-1}.

The operation counts for one stage of each iteration in our implementations, measured in flops [7, p. 32], are as follows.

TABLE 2
Flops per stage: A ∈ R^{n×n}

                                     Iteration (I)    Iteration (III)
    General A                        4n^3/3           2n^3
    Symmetric positive definite A    n^3              n^3

The computations were performed on a Commodore 64 microcomputer with unit roundoff [7, p. 33] u = 2^{-32} ≈ 2.33 × 10^{-10}. In the following λ(A) denotes the spectrum of A.

Example 1. Consider the Wilson matrix example given in Section 1.
W is symmetric positive definite and (κ₂(W)^{1/2} − 1)/2 ≈ 27, so the theory of Section 3 predicts that for this matrix iteration (I) may exhibit numerical instability, and that for large enough k,

(5.1)    ||Ȳ_{k+1} − Y_{k+1}||_1 ≤ 27 ||Ȳ_k − Y_k||_1 ≈ 27 ||W^{1/2} − Ȳ_k||_1.

Note from Table 1 that for Implementation 1 there is approximate equality throughout in (5.1) for k ≥ 6; this example supports the theory well. Strictly, the analysis of Section 3 does not apply to Implementation 2, but the overall conclusion is valid (essentially, the error matrices Δ_k are forced to be symmetric, but they can still grow as k increases).

Example 2 [8].

         [5  4  1  1]
    A =  [4  5  1  1],    λ(A) = {1, 2, 5, 10},    κ₂(A) = 10.
         [1  1  4  2]
         [1  1  2  4]

Iterations (I) and (III) both converged in seven iterations. Note that condition (3.12) is not satisfied by this matrix; thus the failure of this condition to hold does not necessarily imply divergence of the computed iterates from iteration (I).

Example 3.

         [ 1    0     0    0 ]
    A =  [−1   .01    0    0 ],    λ(A) = {.01, 1, 100 ± 100i}.
         [−1   −1   100  100 ]
         [−1   −1  −100  100 ]

Note that the lower quasi-triangular form of A is preserved by iterations (I) and (III). Iteration (I) diverged, while iteration (III) converged within ten iterations. Briefly, iteration (I) behaved as follows.

TABLE 3
||Ȳ_k − Ȳ_{k−1}||_1 for iteration (I) at k = 1, 6, 7, 8, 9, 12: the differences decrease to order 10^{-2} and then grow, reaching order 10^5 by k = 12.

Example 4 [3]. For the 4 × 4 test matrix of [3], iteration (I) diverged, but iteration (III) converged in eight iterations to a real square root (cf. [3], where a nonreal square root was computed).

Example 5 [8].

         [4  1  1]
    A =  [2  4  1],    λ(A) = {3, 3, 6};    A is defective.
         [0  1  4]

Both iterations converged in six steps.

We note that in Examples 3 and 4 condition (3.11) is not satisfied; the divergence of iteration (I) in these examples is "predicted" by the theory of Section 3.

6. Conclusions. When A is a full matrix, Newton's method for the matrix square root, defined in Eqs. (1.3) and (1.4), is unattractive compared to the Schur decomposition approach described in [2], [9]. Iterations (I) and (II), defined by (1.5) and (1.6), are closely related to the Newton iteration, since if the initial approximation X_0 = Y_0 = Z_0 commutes with A, then the sequences of iterates {X_k}, {Y_k} and {Z_k} are identical (see Theorem 1). In view of the relative ease with which Eqs. (1.5) and (1.6) can be evaluated, these two Newton variants appear to have superior computational merit. However, as our analysis predicts, and as the numerical examples in Section 5 illustrate, iterations (I) and (II) can suffer from numerical instability, sufficient to cause the sequence of computed iterates to diverge, even though the corresponding exact sequence of iterates is mathematically convergent. Since this happens even for well-conditioned matrices, iterations (I) and (II) must be classed as numerically unstable; they are of little practical use.

Iteration (III), defined by Eqs. (4.1) and (4.2), is also closely related to the Newton iteration and was shown in Section 4 to be numerically stable under suitable assumptions. In our practical experience (see Section 5) iteration (III) has always performed in a numerically stable manner.

As a means of computing a single square root, of the form described in Theorem 2, iteration (III) can be recommended: it is easy to code and it does not require the use of sophisticated library routines (important in a microcomputer environment, for example). In comparison, the Schur method [2], [9] is more powerful, since it yields more information about the problem and it can be used to determine a "well-conditioned" square root (see [9]); it has a similar computational cost to iteration (III), but it does require the computation of a Schur decomposition of A.

Since doing this work, we have developed a new method for computing the square root A^{1/2} of a symmetric positive definite matrix A; see [10]. The method is related to iteration (I), and the techniques of this paper can be used to show that the method is numerically stable.
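To illustrate the recommendation, here is a self-contained sketch of iteration (III) applied to the Wilson matrix of Section 1, in plain Python with a Gauss–Jordan inverse; the inversion routine, iteration count and tolerance are implementation choices for this sketch, not the paper's Commodore 64 implementation.

```python
# Iteration (III) on the Wilson matrix W, whose condition number
# kappa_2(W) ~ 2984 is far beyond the stability threshold (3.12) for
# iteration (I); the coupled iteration still delivers P_k ~ W^{1/2}
# in IEEE double precision.

def inv(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))   # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def db_sqrt(A, iters=30):
    n = len(A)
    P = [row[:] for row in A]                         # P_0 = A
    Q = [[float(i == j) for j in range(n)] for i in range(n)]   # Q_0 = I
    for _ in range(iters):
        Pi, Qi = inv(P), inv(Q)                       # old iterates only
        P = [[0.5 * (P[i][j] + Qi[i][j]) for j in range(n)] for i in range(n)]
        Q = [[0.5 * (Q[i][j] + Pi[i][j]) for j in range(n)] for i in range(n)]
    return P

W = [[10.0, 7.0, 8.0, 7.0],
     [ 7.0, 5.0, 6.0, 5.0],
     [ 8.0, 6.0, 10.0, 9.0],
     [ 7.0, 5.0, 9.0, 10.0]]
X = db_sqrt(W)
R = matmul(X, X)
resid = max(abs(R[i][j] - W[i][j]) for i in range(4) for j in range(4))
assert resid < 1e-8    # X*X ~ W despite kappa_2(W) ~ 2984
```

This is exactly the situation of Table 1 in which iteration (I) diverged; the contrast in behavior is the practical content of Sections 3 and 4.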
Acknowledgments. I am pleased to thank Dr. G. Hall and Dr. I. Gladwell for their interest in this work and for their comments on the manuscript. I also thank the referee for helpful suggestions.

Department of Mathematics
University of Manchester
Manchester M13 9PL, England

1. R. H. Bartels & G. W. Stewart, "Solution of the matrix equation AX + XB = C," Comm. ACM, v. 15, 1972, pp. 820-826.
2. Å. Björck & S. Hammarling, "A Schur method for the square root of a matrix," Linear Algebra Appl., v. 52/53, 1983, pp. 127-140.
3. E. D. Denman, "Roots of real matrices," Linear Algebra Appl., v. 36, 1981, pp. 133-139.
4. E. D. Denman & A. N. Beavers, "The matrix sign function and computations in systems," Appl. Math. Comput., v. 2, 1976, pp. 63-94.
5. F. R. Gantmacher, The Theory of Matrices, Vol. I, Chelsea, New York, 1959.
6. G. H. Golub, S. Nash & C. F. Van Loan, "A Hessenberg-Schur method for the problem AX + XB = C," IEEE Trans. Automat. Control, v. AC-24, 1979, pp. 909-913.
7. G. H. Golub & C. F. Van Loan, Matrix Computations, Johns Hopkins Univ. Press, Baltimore, Maryland, 1983.
8. R. T. Gregory & D. L. Karney, A Collection of Matrices for Testing Computational Algorithms, Wiley, New York, 1969.
9. N. J. Higham, Computing Real Square Roots of a Real Matrix, Numerical Analysis Report No. 89, University of Manchester, 1984; Linear Algebra Appl. (To appear.)
10. N. J. Higham, Computing the Polar Decomposition - With Applications, Numerical Analysis Report No. 94, University of Manchester, 1984; SIAM J. Sci. Statist. Comput. (To appear.)
11. W. D. Hoskins & D. J. Walton, "A faster method of computing the square root of a matrix," IEEE Trans. Automat. Control, v. AC-23, 1978, pp. 494-495.
12. W. D. Hoskins & D. J. Walton, "A faster, more stable method for computing the pth roots of positive definite matrices," Linear Algebra Appl., v. 26, 1979, pp. 139-163.
13. P. Laasonen, "On the iterative solution of the matrix equation AX^2 − I = 0," M.T.A.C., v. 12, 1958, pp. 109-116.
14. J. M. Ortega, Numerical Analysis: A Second Course, Academic Press, New York, 1972.
15. G. W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.
16. C.-E. Fröberg, Introduction to Numerical Analysis, 2nd ed., Addison-Wesley, Reading, Mass., 1969.
17. P. Henrici, Elements of Numerical Analysis, Wiley, New York, 1964.