On Convergence Criteria for the Method of Successive Over-Relaxation

By C. G. Broyden

Received February 19, 1963.
1. Introduction.
The solution of a set of n simultaneous linear equations

(1.1)    Mx = c,

where M is non-singular and both M and c are real, is often attempted by iterative
methods, which rely on the systematic improvement of an approximate solution.
In order to simplify the algebra, it is convenient to consider the system

(1.2)    Ax = b,

where

(1.3)    A = DM,    b = Dc,

and D is a diagonal matrix chosen so that the elements of the principal diagonal of
A are unity.
A may then be expressed as the sum of three matrices

(1.4)    A = I + L + U,

where I is the unit matrix and L, U are lower and upper triangular matrices respectively.
The type of iterative process considered here is that known as the extrapolated
Gauss-Seidel method, or the method of successive over-relaxation (SOR).
Convergence criteria have been established for this method by Ostrowski [3]
for the case where M is symmetric. This paper derives sufficient conditions for the
convergence of the method when applied to problems involving non-symmetric
matrices.
2. The SOR Method. Let x_{i+1} and x_i be successive approximate solutions to the equation Ax = b. The SOR method is defined by

(2.1)    x_{i+1} = x_i - ω[(I + U)x_i + Lx_{i+1} - b].
(See, e.g., [2], where, however, a notation different from that employed here is used.)
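For readers who wish to experiment, a minimal sketch (in Python with NumPy; the function names and driver loop are illustrative and not part of the original paper) of one sweep of equation (2.1), applied to the normalized system Ax = b of equations (1.2) and (1.3), is:

```python
import numpy as np

def sor_sweep(A, b, x, omega):
    """One sweep of equation (2.1) for Ax = b, where A has unit diagonal.

    Components are updated in place, so entries with index below the current
    row already hold the new values and supply the L x_{i+1} term of (2.1).
    """
    x = x.copy()
    for k in range(len(b)):
        # row-k residual, using new values for j < k and old values for j >= k
        r = A[k, :] @ x - b[k]
        x[k] -= omega * r
    return x

def sor_solve(M, c, omega, iterations=100):
    """Illustrative driver: normalize Mx = c to Ax = b as in (1.2)-(1.3), then iterate."""
    D = np.diag(1.0 / np.diag(M))       # D chosen so that diag(A) = I
    A, b = D @ M, D @ c
    x = np.zeros_like(b)
    for _ in range(iterations):
        x = sor_sweep(A, b, x, omega)
    return x
```

With ω = 1 the sweep reduces to the Gauss-Seidel method.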
Eliminating U from equation (2.1) by substitution from equation (1.4) gives
(2.2)    (I + ωL)(x_{i+1} - x_i) = -ω(Ax_i - b).

Now the residual ε_i corresponding to the approximate solution x_i is defined by

ε_i = Ax_i - b.
Therefore

ε_{i+1} = Ax_{i+1} - b

and

A^{-1}(ε_{i+1} - ε_i) = x_{i+1} - x_i.

Hence equation (2.2) becomes

(I + ωL)A^{-1}(ε_{i+1} - ε_i) = -ωε_i,

which gives the recurrence relation

(2.3)    ε_{i+1} = [I - ωA(I + ωL)^{-1}]ε_i.
This has the form

(2.4)    ε_{i+1} = (I - B)ε_i,
and a sufficient condition for the convergence of processes of this type will now be
derived.
Theorem 1. A sufficient condition for the convergence of processes where the residual vectors obey the equation

ε_{i+1} = (I - B)ε_i

is that there exist matrices S and G such that

S > 0

and

G = B^T S + SB - B^T SB > 0.

(The notation S > 0 means that S is symmetric and positive definite, and the superscript T indicates transposition.)
Proof. Let f_i = ε_i^T Sε_i. Then since, by hypothesis, S > 0, f_i > 0 for ε_i ≠ 0. Hence, f_i → 0 as i → ∞ is a necessary and sufficient condition for convergence. Now

f_{i+1} = ε_{i+1}^T Sε_{i+1} = ε_i^T (I - B^T)S(I - B)ε_i = ε_i^T (S - G)ε_i.

Let φ_i = ε_i^T Gε_i. Hence, f_{i+1} = f_i - φ_i. A sufficient condition for f_i to tend to zero with increasing i is that there exists a positive constant k such that

(2.5)    φ_i ≥ kf_i,

for then f_{i+1} ≤ (1 - k)f_i and the sequence f_i converges.
Since S > 0, all its eigenvalues are real and positive, and if the largest is λ_max then

(2.6)    f_i ≤ λ_max ε_i^T ε_i.
(See, e.g., [1], p. 65.) Now since S = S^T, the matrix G is symmetric and its eigenvalues are real. Denote the smallest by μ_min. Then φ_i ≥ μ_min ε_i^T ε_i ([1], p. 65). Hence, from equation (2.6),

φ_i ≥ (μ_min/λ_max) f_i.

Now if G > 0, μ_min > 0 and equation (2.5) will be satisfied. This proves Theorem 1.
Corollary. If S > 0 and G ≤ 0, i.e., G is negative semidefinite, then the process will never converge.
Proof. If G ≤ 0, φ_i ≤ 0, and f_{i+1} ≥ f_i. Hence f_i can never be reduced to zero, and the corollary is proved.
Theorem 1 will now be applied to the SOR method. Since, from equation (2.3), B = ωA(I + ωL)^{-1}, a sufficient condition for SOR to converge is that there exists a matrix S where S > 0 such that

(2.7)    ω(I + ωL^T)^{-1}A^T S + ωSA(I + ωL)^{-1} - ω^2(I + ωL^T)^{-1}A^T SA(I + ωL)^{-1} > 0.
This condition may be simplified by using the following lemma.
Lemma. A necessary and sufficient condition for P > 0 is that Q^T PQ > 0, where Q is any non-singular matrix.
Proof. Let Qx = z. Then x^T Q^T PQx = z^T Pz. But since Q is non-singular, for every non-zero x there exists a non-zero z, and conversely. The lemma follows.
Since (I + ωL) is non-singular, the sufficient condition (2.7) may be transformed by the lemma into

(2.8)    ω[A^T S(I + ωL) + (I + ωL^T)SA - ωA^T SA] > 0.
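As a computational aside (not part of the original paper), the criterion of Theorem 1 can be tested numerically for a given M, ω, and trial matrix S. The sketch below, assuming NumPy, forms B = ωA(I + ωL)^{-1} directly and checks that G is positive definite, which by the lemma is equivalent to condition (2.8):

```python
import numpy as np

def theorem_1_holds(M, omega, S=None):
    """Check G = B^T S + S B - B^T S B > 0 for the SOR matrix B = omega*A*(I + omega*L)^{-1}.

    S defaults to the identity; any symmetric positive definite S may be tried.
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    A = np.diag(1.0 / np.diag(M)) @ M        # normalization (1.3): diag(A) = I
    L = np.tril(A, k=-1)                     # strictly lower triangular part of A
    S = np.eye(n) if S is None else np.asarray(S, dtype=float)
    B = omega * A @ np.linalg.inv(np.eye(n) + omega * L)
    G = B.T @ S + S @ B - B.T @ S @ B        # the matrix G of Theorem 1
    return bool(np.all(np.linalg.eigvalsh(G) > 0))
```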
3. Symmetric Matrices. Suppose that M is symmetric. Put S = D^{-1}M^{-1}D^{-1}. Condition (2.8) becomes

(3.1)    ω[D^{-1}(I + ωL) + (I + ωL^T)D^{-1} - ωM] > 0.

Equations (1.3), (1.4), and the assumed symmetry of M give

M = D^{-1}(I + L + U) = (I + L^T + U^T)D^{-1}.

Equating the upper triangular partitions of these two representations of M gives

(3.2)    D^{-1}U = L^T D^{-1}.

Hence equation (3.1) reduces to

(3.3)    ω(2 - ω)D^{-1} > 0.
Now if M > 0 it follows from the lemma that S > 0, and since if M > 0 then
D > 0, the condition (3.3) obtains. Hence SOR will converge if M > 0 and
0 < ω < 2. If, however, ω < 0 or ω > 2, the matrix G becomes negative definite,
so by the corollary to Theorem 1, SOR will not converge for ω lying outside the
range 0 to 2.
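The symmetric case can be illustrated numerically. The sketch below (assuming NumPy; the test matrix is a hypothetical example, not taken from the paper) computes the spectral radius of the matrix I - ωA(I + ωL)^{-1} appearing in the recurrence (2.3); a value below one means the residuals decay:

```python
import numpy as np

def sor_spectral_radius(M, omega):
    """Spectral radius of I - omega*A*(I + omega*L)^{-1} from equation (2.3)."""
    A = np.asarray(M, dtype=float)
    A = np.diag(1.0 / np.diag(A)) @ A        # normalize so that diag(A) = I
    n = A.shape[0]
    L = np.tril(A, k=-1)
    T = np.eye(n) - omega * A @ np.linalg.inv(np.eye(n) + omega * L)
    return max(abs(np.linalg.eigvals(T)))

# Hypothetical symmetric positive definite example
M = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
print(sor_spectral_radius(M, 1.2))   # below 1: converges, since 0 < omega < 2
print(sor_spectral_radius(M, 2.5))   # not below 1: fails for omega outside (0, 2)
```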
4. Non-Symmetric Matrices. Take for S in equation (2.8) the matrix (A^T)^{-1}A^{-1}. Since A is non-singular this automatically fulfills the conditions of symmetry and positive definiteness. The condition for convergence becomes

ωA^{-1}(I + ωL) + ω(I + ωL^T)(A^T)^{-1} - ω^2 I > 0

and since A is non-singular, the lemma gives

ω(I + ωL)A^T + ωA(I + ωL^T) - ω^2 AA^T > 0.

Decomposing A by equation (1.4) and simplifying gives

(4.1)    ω(A + A^T) - ω^2[(I + U)(I + U^T) - LL^T] > 0.
A second condition, analogous to that given by equation (4.1), may be derived by putting S = I in equation (2.8). This gives

ωA^T(I + ωL) + ω(I + ωL^T)A - ω^2 A^T A > 0.

Decomposing A and simplifying gives the sufficient condition for convergence

(4.2)    ω(A + A^T) - ω^2[(I + U^T)(I + U) - L^T L] > 0.
It will now be shown that if A + A^T > 0 a positive ω may be found such that conditions (4.1) and (4.2) hold.

Theorem 2. If P > 0 and Q = Q^T there exists a positive ω such that P + ωQ > 0.
Proof. Let f_1 = x^T Px. Since P > 0 all its eigenvalues are real and positive. Let the smallest be λ_min. Then

f_1 ≥ λ_min x^T x.

Now Q is symmetric, hence all its eigenvalues are real. Denote the algebraically smallest by μ_min. If

f_2 = x^T Qx

then

f_2 ≥ μ_min x^T x

and

f = x^T(P + ωQ)x ≥ (λ_min + ωμ_min)x^T x.

Consider now the two cases:

(a) μ_min ≥ 0. In this case f > 0 for all ω ≥ 0.

(b) μ_min = -|μ_min| < 0. Then

f ≥ (λ_min - ω|μ_min|)x^T x,

so f > 0 for

ω < λ_min/|μ_min|

and x ≠ 0.
This proves Theorem 2.
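As an illustrative aside (assuming NumPy; not part of the paper), Theorem 2 can be turned into an explicit bound for condition (4.2). Dividing (4.2) by ω > 0 leaves (A + A^T) - ω[(I + U^T)(I + U) - L^T L] > 0, which Theorem 2, with P = A + A^T and Q equal to the negated bracket, guarantees for all sufficiently small positive ω:

```python
import numpy as np

def omega_bound(A):
    """Upper bound on omega for which condition (4.2) is guaranteed by Theorem 2.

    Returns None if A + A^T is not positive definite; returns inf if the
    bracketed matrix has no positive eigenvalue, so every omega > 0 works.
    Assumes A already has unit diagonal, as in equation (1.4).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L, U = np.tril(A, k=-1), np.triu(A, k=1)
    R = (np.eye(n) + U.T) @ (np.eye(n) + U) - L.T @ L       # bracket of (4.2)
    lam_min = np.linalg.eigvalsh(A + A.T).min()             # smallest eigenvalue of P
    mu_max = np.linalg.eigvalsh(R).max()                    # largest eigenvalue of R
    if lam_min <= 0:
        return None
    return np.inf if mu_max <= 0 else lam_min / mu_max      # (4.2) holds for 0 < omega < bound
```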
A further sufficient condition for convergence may be derived from equation
(4.2).
Define the matrices P and Q by

P = (I + L^T)(I + L) - U^T U,
Q = (I + U^T)(I + U) - L^T L,

so that

P + Q = A + A^T.

Equation (4.2) becomes

(4.3)    ωP - ω(ω - 1)Q > 0.
Hence SOR will converge for ω = 1 if P > 0. A similar condition may be derived from equation (4.1) in the same way.
5. Conclusions. Theorem 2 indicates that conditions (4.1) and (4.2) will be satisfied if A + A^T is positive definite and a sufficiently small positive value of ω is used. Now any matrix A may be expressed as the sum of symmetric and anti-symmetric components, and the matrix A + A^T is merely double the symmetric component. Thus if a matrix is decomposed in this way and the symmetric component is positive definite, it will always be possible to find an ω such that successive over-relaxation, or possibly successive under-relaxation, will converge. This is clearly a more general form of Ostrowski's criterion, to which it reduces in the limiting case when the anti-symmetric component becomes zero.
Condition (4.3), although derived from the same equation, is rather different in character. It shows that SOR will converge for ω = 1 if

(5.1)    (I + L^T)(I + L) - U^T U > 0.
This leads to the conclusion that matrices exist for which an important factor in
guaranteeing convergence of the method of successive over-relaxation is "lower
triangular dominance". In the limiting case when U becomes zero, (5.1) holds,
and if ω takes the value unity the equations are solved in one step.
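A minimal numerical check of condition (5.1) (assuming NumPy; illustrative only) is to form the matrix (I + L^T)(I + L) - U^T U and test its eigenvalues:

```python
import numpy as np

def condition_5_1_holds(A):
    """Test (I + L^T)(I + L) - U^T U > 0, the sufficient condition (5.1) for omega = 1.

    Assumes A has unit diagonal so that A = I + L + U as in equation (1.4).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L, U = np.tril(A, k=-1), np.triu(A, k=1)
    P = (np.eye(n) + L.T) @ (np.eye(n) + L) - U.T @ U
    return bool(np.all(np.linalg.eigvalsh(P) > 0))
```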
That the conditions P > 0 and A + A^T > 0 are not equivalent is probably
best demonstrated by examples.
(5.2)
A = [l -»].
In this case A is strongly anti-symmetric, and its symmetric component is positive definite, although neither P nor Q possesses this property. Successive relaxation will lead to a solution for ω sufficiently small.
(5.3)
A.{J
*]
Here A is lower-triangularly dominant. P is positive definite, but A + A^T and Q are not. Convergence is guaranteed for ω = 1.
It should be emphasised here that although conditions (4.1), (4.2) and (4.3)
are sufficient for convergence, they are not necessary. In particular, the largest
value of ω for which (4.1) and (4.2) are valid may well be exceeded without the
process diverging. They do, though, show that there exist two quite definite types
of non-symmetric matrix for which SOR will always converge provided that a
suitable value of ω is chosen.
6. Acknowledgments. The author is grateful to the directors of the English
Electric Co. for permission to publish this paper, and to his colleagues, F. Ford
and J. M. Williamson, for their helpful comments and criticisms.
Atomic Power Division,
English Electric Co.,
Whetstone, Leicester, England.
1. E. Bodewig, Matrix Calculus, North-Holland Publishing Co., Amsterdam, 1959.
2. G. E. Forsythe & W. R. Wasow, Finite Difference Methods for Partial Differential Equations, John Wiley and Sons, Inc., New York, 1960.
3. A. M. Ostrowski, "On the linear iteration procedures for symmetric matrices," Rend. Mat. e Appl., v. 13, 1954, p. 140.
On Inverses of Finite Segments of the Generalized Hilbert Matrix

By Jean L. Lavoie

Received April 15, 1963. Revised July 24, 1963.
The purpose of this note is to show that two theorems given by Smith [1] on
inverses of finite segments of the generalized Hilbert Matrix can be proved in a
simple manner by using results from the theory of generalized hypergeometric
series.
The usual notation for generalized hypergeometric functions will be used:

(1)    pFq(a_1, ···, a_p; b_1, ···, b_q; z) = Σ_{K=0}^{∞} [Π_{j=1}^{p} (a_j)_K / Π_{j=1}^{q} (b_j)_K] z^K / K!,

where

(σ)_μ = Γ(σ + μ)/Γ(σ).
See Erdélyi [2], Chapters 2 and 4 for details.
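For concreteness, a minimal sketch (in Python; illustrative only, with convergence and parameter checks omitted) of evaluating the series (1) by its term-to-term recurrence is:

```python
def pFq(a, b, z, terms=50):
    """Partial sum of the generalized hypergeometric series (1).

    a and b are the lists of numerator parameters a_1,...,a_p and denominator
    parameters b_1,...,b_q; each term is built from the previous one using the
    Pochhammer recurrence (sigma)_{K+1} = (sigma)_K * (sigma + K).
    """
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        num = 1.0
        for aj in a:
            num *= aj + k
        den = 1.0
        for bj in b:
            den *= bj + k
        term *= num / (den * (k + 1)) * z      # next term: parameter ratio times z/(k+1)
    return total
```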
Let H_n represent a finite segment of the generalized Hilbert matrix, i.e.,

(2)    H_n = (h_{ij}),    h_{ij} = (p + i + j - 1)^{-1},    i, j = 1, 2, ···, n.

Here n is the order of the segment and obviously

p ≠ -1, -2, ···, -(2n - 1).
We shall assume that the above conditions on i, j, and p hold throughout this paper.
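A minimal construction of such a segment (in Python with NumPy; illustrative only) is:

```python
import numpy as np

def hilbert_segment(n, p=0):
    """n-by-n segment of the generalized Hilbert matrix, h_ij = 1/(p + i + j - 1).

    Indices i, j run from 1 to n; p must avoid the excluded values
    -1, -2, ..., -(2n - 1) so that no denominator vanishes.
    """
    idx = np.arange(1, n + 1)
    return 1.0 / (p + idx[:, None] + idx[None, :] - 1)
```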