Newton's Method for the Matrix Square Root
Author(s): Nicholas J. Higham
Source: Mathematics of Computation, Vol. 46, No. 174 (Apr., 1986), pp. 537-549
Published by: American Mathematical Society
Stable URL: http://www.jstor.org/stable/2007992
Newton's Method for the Matrix Square Root*
By Nicholas J. Higham
Abstract. One approach to computing a square root of a matrix A is to apply Newton's method to the quadratic matrix equation F(X) ≡ X^2 - A = 0. Two widely-quoted matrix square root iterations obtained by rewriting this Newton iteration are shown to have excellent mathematical convergence properties. However, by means of a perturbation analysis and supportive numerical examples, it is shown that these simplified iterations are numerically unstable. A further variant of Newton's method for the matrix square root, recently proposed in the literature, is shown to be, for practical purposes, numerically stable.
1. Introduction. A square root of an n × n matrix A with complex elements, A ∈ C^{n×n}, is a solution X ∈ C^{n×n} of the quadratic matrix equation

(1.1)  F(X) = X^2 - A = 0.
A natural approach to computing a square root of A is to apply Newton's method to (1.1). For a general function G: C^{n×n} → C^{n×n}, Newton's method for the solution of G(X) = 0 is specified by an initial approximation X_0 and the recurrence (see [14, p. 140], for example)

(1.2)  X_{k+1} = X_k - G'(X_k)^{-1} G(X_k),  k = 0, 1, 2, ...,
where G' denotes the Fréchet derivative of G. Identifying

F(X + H) = X^2 - A + (XH + HX) + H^2

with the Taylor series for F we see that F'(X) is a linear operator, F'(X): C^{n×n} → C^{n×n}, defined by

F'(X)H = XH + HX.
Thus Newton's method for the matrix square root can be written

(N):  X_0 given,
(1.3)  X_k H_k + H_k X_k = A - X_k^2,
(1.4)  X_{k+1} = X_k + H_k,  k = 0, 1, 2, ....
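The iteration (N) can be sketched in a few lines of Python (an illustrative sketch, with names of our own choosing, not code from the paper). For clarity it solves the Sylvester correction equation (1.3) naively via Kronecker products rather than with the Schur-based solvers of [1], [6]:

```python
import numpy as np

def newton_sqrt_full(A, X0, iters=25):
    """Newton iteration (N): at each step solve the Sylvester equation
    X_k H + H X_k = A - X_k^2 for H (eq. (1.3)), then update
    X_{k+1} = X_k + H (eq. (1.4))."""
    n = A.shape[0]
    X = X0.astype(float).copy()
    I = np.eye(n)
    for _ in range(iters):
        R = A - X @ X                        # right-hand side A - X_k^2
        # vec(X H + H X) = (I kron X + X^T kron I) vec(H), columnwise vec
        M = np.kron(I, X) + np.kron(X.T, I)
        h = np.linalg.solve(M, R.flatten(order="F"))
        X = X + h.reshape(n, n, order="F")
    return X
```

For dense X_k this Kronecker formulation costs O(n^6) per step, which illustrates in an exaggerated way the point made below: solving (1.3) efficiently requires a Schur decomposition of X_k at every step, so (N) is expensive compared with a single Schur decomposition of A.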
Applying the standard local convergence theorem for Newton's method [14, p. 148], we deduce that the Newton iteration (N) converges quadratically to a square root X of A if ||X - X_0|| is sufficiently small and if the linear transformation F'(X) is nonsingular. However, the most stable and efficient methods for solving Eq. (1.3), [1], [6], require the computation of a Schur decomposition of X_k, assuming X_k is full. Since a square root of A can be obtained directly and at little extra cost once a single Schur decomposition, that of A, is known [2], [9], we see that in general Newton's method for the matrix square root, in the form (N), is computationally expensive.

Received October 22, 1984; revised July 30, 1985.
1980 Mathematics Subject Classification. Primary 65F30, 65H10.
Key words and phrases. Matrix square root, Newton's method, numerical stability.
* This work was carried out with the support of a SERC Research Studentship.
©1986 American Mathematical Society
It is therefore natural to attempt to "simplify" iteration (N). Since X commutes with A = X^2, a reasonable assumption (which we will justify in Theorem 1) is that the commutativity relation

X_k H_k = H_k X_k

holds, in which case (1.3) may be written

2 X_k H_k = 2 H_k X_k = A - X_k^2,

and we obtain from (N) two new iterations

(1.5)  (I):   Y_{k+1} = ½(Y_k + Y_k^{-1} A),
(1.6)  (II):  Z_{k+1} = ½(Z_k + A Z_k^{-1}).

These iterations are well known; see, for example, [2], [7, p. 395], [11], [12], [13].
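In floating-point code the simplified iteration (I) is essentially a two-line loop; the following sketch (the function name is ours, not the paper's) evaluates Y_k^{-1}A by solving a linear system rather than forming an explicit inverse:

```python
import numpy as np

def sqrt_iteration_I(A, iters=20):
    """Iteration (I): Y_{k+1} = (Y_k + Y_k^{-1} A) / 2 with Y_0 = I.
    Y_k^{-1} A is obtained as the solution W of Y_k W = A."""
    Y = np.eye(A.shape[0])
    for _ in range(iters):
        Y = 0.5 * (Y + np.linalg.solve(Y, A))
    return Y
```

One linear-system solve per step is all that is required, which is precisely the computational appeal, relative to (N), that the rest of the paper weighs against the iteration's numerical instability.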
Consider the following numerical example. Using iteration (I) on a machine with approximately nine decimal digit accuracy, we attempted to compute a square root of the symmetric positive definite Wilson matrix [16, pp. 93, 123]

W = [10  7  8  7]
    [ 7  5  6  5]
    [ 8  6 10  9]
    [ 7  5  9 10],

for which the 2-norm condition number κ_2(W) = ||W||_2 ||W^{-1}||_2 ≈ 2984. Two implementations of iteration (I) were employed (for the details see Section 5). The first is designed to deal with general matrices, while the second is for the case where A is positive definite and takes full advantage of the fact that all iterates are (theoretically) positive definite (see Corollary 1). In both cases we took Y_0 = I; as we will prove in Theorem 2, for this starting value iteration (I) should converge quadratically to W^{1/2}, the unique symmetric positive definite square root of W.

Denoting the computed iterates by Ȳ_k, the results obtained were as in Table 1. Both implementations failed to converge; in the first, Ȳ_20 was unsymmetric and indefinite. In contrast, a further variant of the Newton iteration, to be defined in Section 4, converged to W^{1/2} in nine iterations.

Clearly, iteration (I) is in some sense "numerically unstable". This instability was noted by Laasonen [13] who, in a paper apparently unknown to recent workers in this area, stated without proof that for a matrix with real, positive eigenvalues iteration (I) "if carried out indefinitely, is not stable whenever the ratio of the largest to the smallest eigenvalue of A exceeds the value 9". We wish to draw attention to this important and surprising fact. In Section 3 we provide a rigorous proof of Laasonen's claim. We show that the original Newton method (N) does not suffer from this numerical instability and we identify in Section 4 an iteration, proposed in [4], which has the computational simplicity of iteration (I) and yet does not suffer from the instability which impairs the practical performance of (I).
TABLE 1
||W^{1/2} - Ȳ_k||_1 for Implementation 1 and Implementation 2

(The tabulated values are garbled in this scan. Both implementations reduce the error from 4.9 at k = 0 to order 10^{-4} or smaller near k = 10; thereafter the errors grow, with Implementation 1 reaching ||W^{1/2} - Ȳ_20||_1 ≈ 1.2 × 10^6 and Implementation 2 terminating with the error "Ȳ_k not positive definite".)
We begin by analyzing the mathematical convergence properties of the Newton iteration.
2. Convergence of Newton's Method. In this section we derive conditions which ensure the convergence of Newton's method for the matrix square root and we establish to which square root the method converges for a particular set of starting values. (For a classification of the set {X : X^2 = A} see, for example, [9].)

First, we investigate the relationship between the Newton iteration (N) and its offshoots (I) and (II). To begin, note that the Newton iterates X_k are well-defined if and only if, for each k, Eq. (1.3) has a unique solution, that is, the linear transformation F'(X_k) is nonsingular. This is so if and only if X_k and -X_k have no eigenvalue in common [7, p. 194], which requires in particular that X_k be nonsingular.
THEOREM 1. Consider the iterations (N), (I) and (II). Suppose X_0 = Y_0 = Z_0 commutes with A and that all the Newton iterates X_k are well-defined. Then
(i) X_k commutes with A for all k,
(ii) X_k = Y_k = Z_k for all k.
Proof. We sketch an inductive proof of parts (i) and (ii) together. The case k = 0 is given. Assume the results hold for k. From the remarks preceding the theorem we see that both the linear transformation F'(X_k), and the matrix X_k, are nonsingular. Define

G_k = ½(X_k^{-1} A - X_k).

Using X_k A = A X_k we have, from (1.3),

F'(X_k) G_k = A - X_k^2 = F'(X_k) H_k.
Thus H_k = G_k, and so from (1.4),

(2.1)  X_{k+1} = X_k + G_k = ½(X_k + X_k^{-1} A),

which commutes with A. It follows easily from (2.1) that X_{k+1} = Y_{k+1} = Z_{k+1}. □
Thus, provided the initial approximation X_0 = Y_0 = Z_0 commutes with A and the correction equation (1.3) is nonsingular at each stage, the Newton iteration (N) and its variants (I) and (II) yield the same sequence of iterates. We now examine the convergence of this sequence, concentrating for simplicity on iteration (I) with starting value a multiple of the identity matrix. Note that the starting values Y_0 = I and Y_0 = A lead to the same sequence Y_1 = ½(I + A), Y_2, ....
For our analysis we assume that A is diagonalizable, that is, there exists a nonsingular matrix Z such that

(2.2)  Z^{-1} A Z = Λ = diag(λ_1, ..., λ_n),

where λ_1, ..., λ_n are the eigenvalues of A. The convenience of this assumption is that it enables us to diagonalize the iteration. For, defining

(2.3)  D_k = Z^{-1} Y_k Z,

we have from (1.5),

(2.4)  D_{k+1} = ½(Z^{-1} Y_k Z + (Z^{-1} Y_k Z)^{-1} Z^{-1} A Z) = ½(D_k + D_k^{-1} Λ),

so that if D_0 is diagonal, then by induction all the successive transformed iterates D_k are diagonal too.
THEOREM 2. Let A ∈ C^{n×n} be nonsingular and diagonalizable, and suppose that none of A's eigenvalues is real and negative. Let

Y_0 = mI,  m > 0.

Then, provided the iterates {Y_k} in (1.5) are defined,

lim_{k→∞} Y_k = X

and

(2.5)  ||Y_{k+1} - X|| ≤ ½ ||Y_k^{-1}|| ||Y_k - X||^2,

where X is the unique square root of A for which every eigenvalue has positive real part.
Proof. We will use the notation (2.2). In view of (2.3) and (2.4) it suffices to analyze the convergence of the sequence {D_k}. D_0 = mI is diagonal, so D_k is diagonal for each k. Writing D_k = diag(d_i^{(k)}) we see from (2.4) that

d_i^{(k+1)} = ½(d_i^{(k)} + λ_i/d_i^{(k)}),  1 ≤ i ≤ n,

that is, (2.4) is essentially n uncoupled scalar Newton iterations for the square roots √λ_i, 1 ≤ i ≤ n.

Consider therefore the scalar iteration

z_{k+1} = ½(z_k + a/z_k).
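This scalar recurrence is the classical Newton (Heron) iteration for √a; a minimal illustrative sketch (function name ours):

```python
def scalar_newton_sqrt(a, z0, iters=12):
    """Scalar Newton iteration z_{k+1} = (z_k + a/z_k)/2 for sqrt(a)."""
    z = z0
    for _ in range(iters):
        z = 0.5 * (z + a / z)
    return z
```

Its quadratic convergence (roughly doubling the number of correct digits per step) is exactly what Theorem 2 inherits, eigenvalue by eigenvalue, for the matrix iteration.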
We will use the relations [17, p. 84]

(2.6)  z_{k+1} ± √a = (z_k ± √a)^2 / (2 z_k),
(2.7)  (z_{k+1} - √a)/(z_{k+1} + √a) = ((z_0 - √a)/(z_0 + √a))^{2^{k+1}}.
If a does not lie on the nonpositive real axis then we can choose √a to have positive real part, in which case it is easy to see that for real z_0 > 0, |(z_0 - √a)/(z_0 + √a)| < 1. Consequently, for a and z_0 of the specified form we have from (2.7), provided that the sequence {z_k} is defined,

lim_{k→∞} z_k = √a,  Re √a > 0.

Since the eigenvalues λ_i and the starting values d_i^{(0)} = m > 0 are of the form of a and z_0, respectively, then

(2.8)  lim_{k→∞} D_k = diag(λ_i^{1/2}) = Λ^{1/2},  Re λ_i^{1/2} > 0,

and thus

lim_{k→∞} Y_k = Z Λ^{1/2} Z^{-1} = X

(provided the iterates {Y_k} are defined), which is clearly a square root of A whose eigenvalues have positive real part. The uniqueness of X follows from Theorem 4 in [9].
Finally, we can use (2.6), with the minus sign, to deduce that

D_{k+1} - Λ^{1/2} = ½ D_k^{-1} (D_k - Λ^{1/2})^2;

performing a similarity transformation by Z gives

Y_{k+1} - X = ½ Y_k^{-1} (Y_k - X)^2,

from which (2.5) follows on taking norms. □
Theorem 2 shows, then, that under the stated hypotheses on A iterations (N), (I) and (II) with starting value a multiple of the identity matrix, when defined, will indeed converge: quadratically, to a particular square root of A the form of whose spectrum is known a priori.

Several comments are worth making. First, we can use Theorem 4 in [9] to deduce that the square root X in Theorem 2 is indeed a function of A, in the sense defined in [5, p. 96]. (Essentially, B is a function of A if B can be expressed as a polynomial in A.) Next, note that the proof of Theorem 2 relies on the fact that the matrix which diagonalizes A also diagonalizes each iterate Y_k. This property is maintained for Y_0 an arbitrary function of A, and under suitable conditions convergence can still be proved, but the spectrum {±√λ_1, ..., ±√λ_n} of the limit matrix, if it exists, will depend on Y_0. Finally, we remark that Theorem 2 can be proved without the assumption that A is diagonalizable, using, for example, the technique in [13].
We conclude this section with a corollary which applies to the important case where A is Hermitian positive definite.

COROLLARY 1. Let A ∈ C^{n×n} be Hermitian positive definite. If Y_0 = mI, m > 0, then the iterates {Y_k} in (1.5) are all Hermitian positive definite, lim_{k→∞} Y_k = X, where X is the unique Hermitian positive definite square root of A, and (2.5) holds.
3. Stability Analysis. We now consider the behavior of Newton's method for the matrix square root, and its variants (I) and (II), when the iterates are subject to perturbations. We will regard these perturbations as arising from rounding errors sustained during the evaluation of an iteration formula, though our analysis is quite general.
Consider first iteration (I) with Y_0 = mI, m > 0, and make the same assumptions as in Theorem 2. Let Ȳ_k denote the computed kth iterate, Ȳ_0 = Y_0, and define

Δ_k = Ȳ_k - Y_k.

Our aim is to analyze how the error matrix Δ_k propagates at the (k + 1)st stage (note the distinction between Δ_k and the "true" error matrix Ȳ_k - X). To simplify the analysis we assume that no rounding errors are committed when computing Ȳ_{k+1}, so that

(3.1)  Ȳ_{k+1} = ½(Ȳ_k + Ȳ_k^{-1} A) = ½(Y_k + Δ_k + (Y_k + Δ_k)^{-1} A).
Using the perturbation result [15, p. 188 ff.]

(3.2)  (A + E)^{-1} = A^{-1} - A^{-1} E A^{-1} + O(||E||^2),

we obtain

Ȳ_{k+1} = ½(Y_k + Δ_k + Y_k^{-1} A - Y_k^{-1} Δ_k Y_k^{-1} A) + O(||Δ_k||^2).

Subtracting (1.5) yields

(3.3)  Δ_{k+1} = ½(Δ_k - Y_k^{-1} Δ_k Y_k^{-1} A) + O(||Δ_k||^2).
Using the notation (2.2) and (2.3), let

(3.4)  Δ̃_k = Z^{-1} Δ_k Z,

and transform (3.3) to obtain

(3.5)  Δ̃_{k+1} = ½(Δ̃_k - D_k^{-1} Δ̃_k D_k^{-1} Λ) + O(||Δ̃_k||^2).

From the proof of Theorem 2,

(3.6)  D_k = diag(d_i^{(k)}),

so with

(3.7)  Δ̃_k = (δ_ij^{(k)}),

Eq. (3.5) can be written elementwise as

δ_ij^{(k+1)} = γ_ij^{(k)} δ_ij^{(k)} + O(||Δ̃_k||^2),  1 ≤ i, j ≤ n,

where

(3.8)  γ_ij^{(k)} = ½(1 - λ_j/(d_i^{(k)} d_j^{(k)})).

Since D_k → Λ^{1/2} as k → ∞ (see (2.8)) we can write

(3.9)  d_i^{(k)} = λ_i^{1/2} + ε_i^{(k)},

where ε_i^{(k)} → 0 as k → ∞. Then

(3.10)  γ_ij^{(k)} = ½(1 - (λ_j/λ_i)^{1/2}) + O(ε^{(k)}),  1 ≤ i, j ≤ n.
To ensure the numerical stability of the iteration we require that the error amplification factors γ_ij^{(k)} be bounded in modulus by 1; hence we require, in particular, that

(3.11)  ½ |1 - (λ_j/λ_i)^{1/2}| ≤ 1,  1 ≤ i, j ≤ n.

This is a severe restriction on the matrix A. For example, if A is Hermitian positive definite the condition is equivalent to (cf. [13])

(3.12)  κ_2(A) ≤ 9,

where the condition number κ_2(A) = ||A||_2 ||A^{-1}||_2.
To clarify the above analysis it is helpful to consider a particular example. Suppose A is Hermitian positive definite, so that in (2.2) we can take Z = Q = (q_1, ..., q_n), where Q is unitary. Thus,

(3.13)  Q* A Q = Λ = diag(λ_1, ..., λ_n),  Q* Q = I,

and (cf. (2.3))

(3.14)  Q* Y_k Q = D_k = diag(d_i^{(k)}).

Consider the special (unsymmetric) rank-one perturbation

Δ_k = ε q_i q_j*,  i ≠ j;  ||Δ_k||_2 = ε > 0.
For this Δ_k the Sherman-Morrison formula [7, p. 3] gives

(Y_k + Δ_k)^{-1} = Y_k^{-1} - Y_k^{-1} Δ_k Y_k^{-1}.

Using this identity in (3.1) we obtain, on subtracting (1.5),

(3.15)  Δ_{k+1} = ½(Δ_k - Y_k^{-1} Δ_k Y_k^{-1} A),

that is, (3.3) with the order term zero. Using (3.13) and (3.14) in (3.15), we have

Δ_{k+1} = ½(Δ_k - (Q D_k^{-1} Q*)(ε q_i q_j*)(Q D_k^{-1} Q*)(Q Λ Q*))
        = ½(Δ_k - ε Q D_k^{-1} e_i e_j* D_k^{-1} Λ Q*)
        = ½(1 - λ_j/(d_i^{(k)} d_j^{(k)})) Δ_k.
Let Y_k = A^{1/2} (the Hermitian positive definite square root of A), so that D_k = Λ^{1/2}, and choose i, j so that λ_j/λ_i = κ_2(A). Then

Δ_{k+1} = ½(1 - κ_2(A)^{1/2}) Δ_k.

Assuming that Ȳ_{k+2}, Ȳ_{k+3}, ..., like Ȳ_{k+1}, are computed exactly from the preceding iterates, it follows that

Δ_{k+r} = [½(1 - κ_2(A)^{1/2})]^r Δ_k,  r ≥ 0.

In this example, Ȳ_k is an arbitrary distance ε away from A^{1/2} in the 2-norm, yet if κ_2(A) > 9 the subsequent iterates diverge, growing unboundedly.
Consider now the Newton iteration (N) with X_0 = mI, m > 0, so that by Theorem 1, X_k = Y_k, and make the same assumptions as in Theorem 2. Then

(3.16)  X_k - X → 0  as k → ∞

(quadratically), where X is the square root of A defined in Theorem 2. Let X_k be perturbed to X̃_k = X_k + Δ_k and denote the corresponding perturbed sequence of iterates (computed exactly from X̃_k) by {X̃_{k+r}}_{r≥0}. The standard local convergence theorem for Newton's method implies that, for ||X̃_k - X|| sufficiently small, that is, for k sufficiently large and ||Δ_k|| sufficiently small,

(3.17)  X̃_{k+r} - X → 0  as r → ∞

(quadratically). From (3.16) and (3.17) it follows that

Δ_{k+r} = X̃_{k+r} - X_{k+r} → 0  as r → ∞.

Thus, unlike iteration (I), the Newton iteration (N) has the property that once convergence is approached, a suitable norm of the error matrix Δ_k = X̃_k - X_k is not magnified, but rather decreased, in succeeding iterations.

To summarize, for iterations (N) and (I) with initial approximation mI (m > 0), our analysis shows how a small perturbation Δ_k in the kth iterate is propagated at the (k + 1)st stage. For iteration (I), depending on the eigenvalues of A, a small perturbation Δ_k in Y_k may induce perturbations of increasing norm in succeeding iterates, and the sequence {Ȳ_k} may "diverge" from the sequence of true iterates {Y_k}. The same conclusion applies to iteration (II), for which a similar analysis holds. In contrast, for large k, the Newton iteration (N) damps a small perturbation Δ_k in X_k.

Our conclusion, then, is that in simplifying Newton's method to produce the ostensibly attractive formulae (1.5) and (1.6), one sacrifices numerical stability of the method.
4. A Further Newton Variant. The following matrix square root iteration is derived in [4] using the matrix sign function:

        P_0 = A,  Q_0 = I,
(III):
(4.1)  P_{k+1} = ½(P_k + Q_k^{-1}),
(4.2)  Q_{k+1} = ½(Q_k + P_k^{-1}),  k = 0, 1, 2, ....

It is easy to prove by induction (using Theorem 1) that if {Y_k} is the sequence computed from (1.5) with Y_0 = I, then

(4.3)  P_k = Y_k,
(4.4)  Q_k = A^{-1} Y_k,  k = 1, 2, ....

Thus if A satisfies the conditions of Theorem 2 and the sequence {P_k, Q_k} is defined, then

lim_{k→∞} P_k = X,  lim_{k→∞} Q_k = X^{-1},

where X is the square root of A defined in Theorem 2.
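Iteration (III) is equally easy to code. The sketch below (our own naming) follows (4.1)-(4.2) directly, with explicit inverses; the essential point in an imperative language is that the pair must be updated simultaneously, so Q_{k+1} is formed from the old P_k:

```python
import numpy as np

def sqrt_iteration_III(A, iters=25):
    """Coupled iteration (III): P_0 = A, Q_0 = I,
    P_{k+1} = (P_k + Q_k^{-1})/2,  Q_{k+1} = (Q_k + P_k^{-1})/2.
    Under the hypotheses of Theorem 2, P_k -> A^{1/2} and
    Q_k -> A^{-1/2}."""
    P = A.astype(float).copy()
    Q = np.eye(A.shape[0])
    for _ in range(iters):
        # Tuple assignment evaluates both right-hand sides first,
        # so Q's update uses the previous P, as (4.2) requires.
        P, Q = 0.5 * (P + np.linalg.inv(Q)), 0.5 * (Q + np.linalg.inv(P))
    return P, Q
```

On the Wilson matrix of Section 1, where iteration (I) diverges, this coupled form converges and stays converged, consistent with the stability analysis that follows.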
At first sight, iteration (III) appears to have no advantage over iteration (I). It is in general no less computationally expensive; it computes simultaneously approximations to X and X^{-1}, when probably only X is required; and intuitively the fact that A is present only in the initial conditions, and not in the iteration formulae, is displeasing. However, as we will now show, this "coupled" iteration does not suffer from the numerical instability which vitiates iteration (I).
To parallel the analysis in Section 3, suppose the assumptions of Theorem 2 hold, let P̃_k and Q̃_k denote the computed iterates from iteration (III), define

E_k = P̃_k - P_k,  F_k = Q̃_k - Q_k,

and assume that at the (k + 1)st stage P̃_{k+1} and Q̃_{k+1} are computed exactly from P̃_k and Q̃_k. Then from (4.1) and (4.2), using (3.2), we have

P̃_{k+1} = ½(P_k + E_k + Q_k^{-1} - Q_k^{-1} F_k Q_k^{-1}) + O(||F_k||^2),
Q̃_{k+1} = ½(Q_k + F_k + P_k^{-1} - P_k^{-1} E_k P_k^{-1}) + O(||E_k||^2).

Subtracting (4.1) and (4.2), respectively, gives

(4.5)  E_{k+1} = ½(E_k - Q_k^{-1} F_k Q_k^{-1}) + O(g_k^2),
(4.6)  F_{k+1} = ½(F_k - P_k^{-1} E_k P_k^{-1}) + O(g_k^2),

where g_k = max{||E_k||, ||F_k||}.
From (2.2), (2.3), (4.3), (4.4) and (3.6),

Z^{-1} P_k Z = D_k,  Z^{-1} Q_k Z = Λ^{-1} D_k,  D_k = diag(d_i^{(k)});

thus, defining

Ẽ_k = Z^{-1} E_k Z,  F̃_k = Z^{-1} F_k Z,

we can transform (4.5) and (4.6) into

Ẽ_{k+1} = ½(Ẽ_k - D_k^{-1} Λ F̃_k D_k^{-1} Λ) + O(g_k^2),
F̃_{k+1} = ½(F̃_k - D_k^{-1} Ẽ_k D_k^{-1}) + O(g_k^2).

Written elementwise, using the notation

Ẽ_k = (e_ij^{(k)}),  F̃_k = (f_ij^{(k)}),

these equations become

(4.7)  e_ij^{(k+1)} = ½(e_ij^{(k)} - θ_ij^{(k)} f_ij^{(k)}) + O(g_k^2),
(4.8)  f_ij^{(k+1)} = ½(f_ij^{(k)} - φ_ij^{(k)} e_ij^{(k)}) + O(g_k^2),

where

θ_ij^{(k)} = λ_i λ_j/(d_i^{(k)} d_j^{(k)}) = (λ_i λ_j)^{1/2} + O(ε^{(k)})

and

φ_ij^{(k)} = 1/(d_i^{(k)} d_j^{(k)}) = (λ_i λ_j)^{-1/2} + O(ε^{(k)}),

using (3.8) and (3.10). It is convenient to write Eqs. (4.7) and (4.8) in vector form:
(4.9)  h_ij^{(k+1)} = M_ij^{(k)} h_ij^{(k)} + O(g_k^2),

where

h_ij^{(k)} = [e_ij^{(k)}, f_ij^{(k)}]^T

and

M_ij^{(k)} = ½ [ 1             -θ_ij^{(k)} ]
               [ -φ_ij^{(k)}    1          ]  = M_ij + O(ε^{(k)}).
It is easy to verify that the eigenvalues of M_ij are zero and one; denote a corresponding pair of eigenvectors by x_0 and x_1 and let

h_ij^{(k)} = α_0^{(k)} x_0 + α_1^{(k)} x_1.

If we make a further assumption that no new errors are introduced at the (k + 2)nd stage of the iteration onwards (so that the analysis is tracing how an isolated pair of perturbations at the kth stage is propagated), then for k large enough and g_k small, we have, by induction,

(4.10)  h_ij^{(k+r)} ≈ M_ij^r h_ij^{(k)} = M_ij (α_0^{(k)} x_0 + α_1^{(k)} x_1) = α_1^{(k)} x_1,  r > 0.

While ||h_ij^{(k+1)}|| may exceed ||h_ij^{(k)}|| by the factor ||M_ij^{(k)}|| ≈ ||M_ij|| ≥ 1 (taking norms in (4.9)), from (4.10) it is clear that the vectors h_ij^{(k+1)}, h_ij^{(k+2)}, ... remain approximately constant, that is, the perturbations introduced at the kth stage have only a bounded effect on succeeding iterates.

Our analysis shows that iteration (III) does not suffer from the unstable error propagation which affects iteration (I) and suggests that iteration (III) is, for practical purposes, numerically stable.

In the next section we supplement the theory which has been given so far with some numerical test results.
5. Numerical Examples. In this section we give some examples of the performance in finite-precision arithmetic of iteration (I) (with Y_0 = I) and iteration (III).

When implementing the iterations we distinguished the case where A is symmetric positive definite; since the iterates also possess this attractive property (see Corollary 1) it is possible to use the Choleski decomposition and to work only with the "lower triangles" of the iterates.

To define our implementations, it suffices to specify our algorithm for evaluating W = B^{-1}C, where B = Y_k, C = A in iteration (I), and B = P_k or Q_k, C = I in iteration (III). For general A we used an LU factorization of B (computed by Gaussian elimination with partial pivoting) to solve by substitution the linear systems BW = C. For symmetric positive definite A we first formed B^{-1}, and then computed the (symmetric) product B^{-1}C; B^{-1} was computed from the Choleski decomposition B = LL^T, by inverting L and then forming the (symmetric) product B^{-1} = L^{-T}L^{-1}.

The operation counts for one stage of each iteration in our implementations, measured in flops [7, p. 32], are as follows.
TABLE 2
Flops per stage: A ∈ R^{n×n}

                                  Iteration (I)    Iteration (III)
General A                            4n^3/3             2n^3
Symmetric positive definite A         n^3                n^3
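The symmetric positive definite kernel described above can be sketched as follows (a hypothetical helper of our own, not the paper's code, following its recipe B^{-1} = L^{-T}L^{-1} from the Choleski factor B = LL^T):

```python
import numpy as np

def spd_inverse_times(B, C):
    """Form W = B^{-1} C for symmetric positive definite B:
    factor B = L L^T (Choleski), invert the triangular factor L,
    form the symmetric product B^{-1} = L^{-T} L^{-1}, then multiply."""
    L = np.linalg.cholesky(B)      # lower triangular, B = L @ L.T
    Linv = np.linalg.inv(L)        # in a careful code: by back substitution
    Binv = Linv.T @ Linv           # symmetric by construction
    return Binv @ C
```

Forming B^{-1} explicitly lets an implementation exploit symmetry (working only with lower triangles), which is what brings the per-stage cost in Table 2 down for positive definite A.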
The computations were performed on a Commodore 64 microcomputer with unit roundoff [7, p. 33] u = 2^{-32} ≈ 2.33 × 10^{-10}. In the following λ(A) denotes the spectrum of A.
Example 1. Consider the Wilson matrix example given in Section 1. W is symmetric positive definite and (κ_2(W)^{1/2} - 1)/2 ≈ 27, so the theory of Section 3 predicts that for this matrix iteration (I) may exhibit numerical instability, and that for large enough k

(5.1)  ||Y_{k+1} - Ȳ_{k+1}||_1 ≤ 27 ||Y_k - Ȳ_k||_1 = 27 ||Δ_k||_1.

Note from Table 1 that for Implementation 1 there is approximate equality throughout in (5.1) for k ≥ 6; this example supports the theory well. Strictly, the analysis of Section 3 does not apply to Implementation 2, but the overall conclusion is valid (essentially, the error matrices Δ_k are forced to be symmetric, but they can still grow as k increases).
Example 2 [8].

A = [5 4 1 1]
    [4 5 1 1]
    [1 1 4 2]
    [1 1 2 4],   λ(A) = {1, 2, 5, 10},  κ_2(A) = 10.

Iterations (I) and (III) both converged in seven iterations.

Note that condition (3.12) is not satisfied by this matrix; thus the failure of this condition to hold does not necessarily imply divergence of the computed iterates from iteration (I).
Example 3.

A = [  1     0     0     0]
    [ -1   .01     0     0]
    [ -1    -1   100   100]
    [ -1    -1  -100   100],   λ(A) = {.01, 1, 100 ± 100i}.

Note that the lower quasi-triangular form of A is preserved by iterations (I) and (III). Iteration (I) diverged while iteration (III) converged within ten iterations. Briefly, iteration (I) behaved as follows.
TABLE 3

 k     ||Ȳ_k - Ȳ_{k-1}||_1
 1     9.9 × 10^1
 6     2.3 × 10^{-1}
 7     2.1 × 10^{-1}
 8     4.0 × 10^{-2}
 9     2.1
12     4.8 × 10^5
Example 4 [3]. A is a 4 × 4 matrix taken from [3]; its entries are illegible in this scan, and its spectrum contains a complex conjugate pair. Iteration (I) diverged, but iteration (III) converged in eight iterations to a real square root (cf. [3], where a nonreal square root was computed).
Example 5 [8].

A = [4 1 1]
    [2 4 1]
    [0 1 4],   λ(A) = {3, 3, 6};  A is defective.

Both iterations converged in six steps.
We note that in Examples 3 and 4 condition (3.11) is not satisfied; the divergence of iteration (I) in these examples is "predicted" by the theory of Section 3.
6. Conclusions. When A is a full matrix, Newton's method for the matrix square root, defined in Eqs. (1.3) and (1.4), is unattractive compared to the Schur decomposition approach described in [2], [9]. Iterations (I) and (II), defined by (1.5) and (1.6), are closely related to the Newton iteration, since if the initial approximation X_0 = Y_0 = Z_0 commutes with A, then the sequences of iterates {X_k}, {Y_k} and {Z_k} are identical (see Theorem 1). In view of the relative ease with which Eqs. (1.5) and (1.6) can be evaluated, these two Newton variants appear to have superior computational merit. However, as our analysis predicts, and as the numerical examples in Section 5 illustrate, iterations (I) and (II) can suffer from numerical instability, sufficient to cause the sequence of computed iterates to diverge, even though the corresponding exact sequence of iterates is mathematically convergent. Since this happens even for well-conditioned matrices, iterations (I) and (II) must be classed as numerically unstable; they are of little practical use.

Iteration (III), defined by Eqs. (4.1) and (4.2), is also closely related to the Newton iteration and was shown in Section 4 to be numerically stable under suitable assumptions. In our practical experience (see Section 5) iteration (III) has always performed in a numerically stable manner.

As a means of computing a single square root, of the form described in Theorem 2, iteration (III) can be recommended: it is easy to code and it does not require the use of sophisticated library routines (important in a microcomputer environment, for example). In comparison, the Schur method [2], [9] is more powerful, since it yields more information about the problem and it can be used to determine a "well-conditioned" square root (see [9]); it has a similar computational cost to iteration (III) but it does require the computation of a Schur decomposition of A.

Since doing this work, we have developed a new method for computing the square root A^{1/2} of a symmetric positive definite matrix A; see [10]. The method is related to iteration (I) and the techniques of this paper can be used to show that the method is numerically stable.
Acknowledgments. I am pleased to thank Dr. G. Hall and Dr. I. Gladwell for their interest in this work and for their comments on the manuscript. I also thank the referee for helpful suggestions.
Department of Mathematics
University of Manchester
Manchester M13 9PL, England
1. R. H. BARTELS & G. W. STEWART, "Solution of the matrix equation AX + XB = C," Comm. ACM, v. 15, 1972, pp. 820-826.
2. A. BJÖRCK & S. HAMMARLING, "A Schur method for the square root of a matrix," Linear Algebra Appl., v. 52/53, 1983, pp. 127-140.
3. E. D. DENMAN, "Roots of real matrices," Linear Algebra Appl., v. 36, 1981, pp. 133-139.
4. E. D. DENMAN & A. N. BEAVERS, "The matrix sign function and computations in systems," Appl. Math. Comput., v. 2, 1976, pp. 63-94.
5. F. R. GANTMACHER, The Theory of Matrices, Vol. I, Chelsea, New York, 1959.
6. G. H. GOLUB, S. NASH & C. F. VAN LOAN, "A Hessenberg-Schur method for the problem AX + XB = C," IEEE Trans. Automat. Control, v. AC-24, 1979, pp. 909-913.
7. G. H. GOLUB & C. F. VAN LOAN, Matrix Computations, Johns Hopkins Univ. Press, Baltimore, Maryland, 1983.
8. R. T. GREGORY & D. L. KARNEY, A Collection of Matrices for Testing Computational Algorithms, Wiley, New York, 1969.
9. N. J. HIGHAM, Computing Real Square Roots of a Real Matrix, Numerical Analysis Report No. 89, University of Manchester, 1984; Linear Algebra Appl. (To appear.)
10. N. J. HIGHAM, Computing the Polar Decomposition, with Applications, Numerical Analysis Report No. 94, University of Manchester, 1984; SIAM J. Sci. Statist. Comput. (To appear.)
11. W. D. HOSKINS & D. J. WALTON, "A faster method of computing the square root of a matrix," IEEE Trans. Automat. Control, v. AC-23, 1978, pp. 494-495.
12. W. D. HOSKINS & D. J. WALTON, "A faster, more stable method for computing the pth roots of positive definite matrices," Linear Algebra Appl., v. 26, 1979, pp. 139-163.
13. P. LAASONEN, "On the iterative solution of the matrix equation AX^2 - I = 0," M.T.A.C., v. 12, 1958, pp. 109-116.
14. J. M. ORTEGA, Numerical Analysis: A Second Course, Academic Press, New York, 1972.
15. G. W. STEWART, Introduction to Matrix Computations, Academic Press, New York, 1973.
16. C.-E. FRÖBERG, Introduction to Numerical Analysis, 2nd ed., Addison-Wesley, Reading, Mass., 1969.
17. P. HENRICI, Elements of Numerical Analysis, Wiley, New York, 1964.