ESTIMATING THE OPTIMAL EXTRAPOLATION

ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER FOR
EXTRAPOLATED ITERATIVE METHODS WHEN SOLVING SEQUENCES OF
LINEAR SYSTEMS
A Thesis
Presented to
The Graduate Faculty of The University of Akron
In Partial Fulfillment
of the Requirements for the Degree
Master of Science
Curtis J. Anderson
December, 2013
ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER FOR
EXTRAPOLATED ITERATIVE METHODS WHEN SOLVING SEQUENCES OF
LINEAR SYSTEMS
Curtis J. Anderson
Thesis
Approved:
Accepted:
Advisor
Dr. Yingcai Xiao
Dean of the College
Dr. Chand Midha
Co-Advisor
Dr. Zhong-Hui Duan
Dean of the Graduate School
Dr. George Newkome
Co-Advisor
Dr. Ali Hajjafar
Date
Department Chair
Dr. Yingcai Xiao
ii
ABSTRACT
Extrapolated iterative methods for solving systems of linear equations require the selection of an extrapolation parameter which greatly influences the rate of convergence.
Some extrapolated iterative methods provide analysis on the optimal extrapolation
parameter to use, however, such analysis only exists for specific problems and a general method for parameter selection does not exist. Additionally, the calculation of
the optimal extrapolation parameter can often be too computationally expensive to
be of practical use.
This thesis presents an algorithm that will adaptively modify the extrapolation parameter when solving a sequence of linear systems in order to estimate
the optimal extrapolation parameter. The result is an algorithm that works for any
general problem and requires very little computational overhead. Statistics on the
quality of the algorithm’s estimation are presented and a case study is given to show
practical results.
iii
TABLE OF CONTENTS
Page
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
v
CHAPTER
I.
INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
1.1 Stationary Iterative Methods . . . . . . . . . . . . . . . . . . . . . .
2
1.2 Convergence of Iterative Methods . . . . . . . . . . . . . . . . . . . .
3
1.3 Well-Known Iterative Methods . . . . . . . . . . . . . . . . . . . . .
4
1.4 Extrapolated Iterative Methods . . . . . . . . . . . . . . . . . . . . .
10
1.5 Sequence of Linear Systems . . . . . . . . . . . . . . . . . . . . . . .
13
II. AN ALGORITHM FOR ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER . . . . . . . . . . . . . . . . . . . . . . . . . .
15
2.1 Spectral Radius Estimation . . . . . . . . . . . . . . . . . . . . . . .
15
2.2 Extrapolated Spectral Radius Function Reconstruction . . . . . . . .
20
2.3 Parameter Selection . . . . . . . . . . . . . . . . . . . . . . . . . . .
28
2.4 Solver Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . .
39
III. PERFORMANCE ANALYSIS . . . . . . . . . . . . . . . . . . . . . . . .
43
IV. CASE STUDY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
48
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
53
iv
LIST OF FIGURES
Figure
Page
1.1
The CR transition disk in relation to the unit disk . . . . . . . . . . . .
11
2.1
Statistics on the ratio of the estimated average reduction factor (σ̄i )
to the spectral radius (ρ) as i is varied for randomly generated systems.
17
Statistics on the ratio of the estimated spectral radius (ρ̄) to the
spectral radius (ρ) compared against the number of iterations required for convergence for randomly generated systems. . . . . . . . . .
18
Example of ρ̄ estimating points on the extrapolated spectral radius
function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
19
The square of the magnitude of the spectral radius shown as the composition of the square of the magnitude of the respective eigenvalues
for two randomly generated systems (only relevant eigenvalues are
shown). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
2.5
Example of segmentation of samples based upon algorithm 2.2.1. . . . .
25
2.6
Constrained regression (left) versus unconstrained regression (right)
applied to the same problem. . . . . . . . . . . . . . . . . . . . . . . .
26
Example of estimating the extrapolated spectral radius function
from samples that are all superior to non-extrapolated iteration (below the black dotted line). . . . . . . . . . . . . . . . . . . . . . . . . .
30
3.1
Performance results for Gauss-Seidel. . . . . . . . . . . . . . . . . . . .
45
3.2
Performance results for SOR w = 1.5. . . . . . . . . . . . . . . . . . . .
46
3.3
Performance results for SOR w = 1.8. . . . . . . . . . . . . . . . . . . .
47
4.1
Slices of the volume solved in the example problem at particular times.
49
4.2
Sparsity plot of the Crank-Nicholson coefficient matrix for an 8x8x8
discretization (blue entries are nonzero entries of the matrix). . . . . . .
50
2.2
2.3
2.4
2.7
v
4.3
Benchmark results for a ’pulsed’ source (S(x, y, z, t) = sin(t)). . . . . .
51
4.4
Benchmark results for a constant source. (S(x, y, z, t) = 0) . . . . . . .
52
vi
CHAPTER I
INTRODUCTION
Many applications in scientific computing require solving a sequence of linear systems
A{i} x{i} = b{i} for i = 0, 1, 2, ... for the exact solution x{i} = A{i}−1 b{i} where A{i} ∈
Rn×n , x{i} ∈ Rn , and b{i} ∈ Rn . Several direct methods exist for solving systems
such as Gaussian elimination and LU decomposition that do not require explicitly
computing A{i}−1 , however, such methods may not be the most efficient or accurate
methods to use on very large problems. Direct methods often require the full storage
of matrices in computer memory as well as O(n3 ) operations to compute the solution.
As matrices become large these drawbacks become increasingly prohibitive. Iterative
methods are an alternative class of methods for solving linear systems that alleviate
many of the drawbacks of direct methods.
Iterative methods start with an initial solution vector x(0) and subsequently
{ }∞
generate a sequence of vectors x(i) i=1 which converge to the solution x. Iteration
takes the form of a recursive function
x(i+1) = Φ(x(i) ),
1
(1.1)
which is expected to converge to x, however, convergence for iterative methods is not
always guaranteed and is dependent upon both the iterative method and problem.
The calculation of each iteration is often achieved with significantly lower computation
and memory usage than direct methods.
There are several choices for iterative functions, leading to two major classes
of iterative methods. Stationary methods have an iteration matrix that determine
convergence while non-stationary methods, such as Krylov subspace methods, do
not have an iteration matrix and convergence is dependent upon other factors [1, 2].
Throughout this thesis the name iterative method is intended to refer to stationary
methods since only stationary methods and their extrapolations are studied.
1.1 Stationary Iterative Methods
Let a nonsingular n × n matrix A be given and a system of linear equations, Ax = b,
with the exact solution x = A−1 b. We consider an arbitrary splitting A = N − P for
the matrix A, where N is nonsingular. A stationary iterative method can be found
by substituting the splitting into the original problem
(N − P )x = b
(1.2)
N x(i+1) = P x(i) + b
(1.3)
and then setting the iteration
2
and solving for x(i+1) ,
x(i+1) = N −1 P x(i) + N −1 b.
(1.4)
This method was first developed in this generality by Wittmeyer in 1936 [3]. Convergence to the solution is heavily dependent upon the choice of splitting for N and P
and is not generally guaranteed for all matrices, however, for certain types of matrices some splittings can guarantee convergence. Intelligent choices for splittings rarely
require finding the matrix N −1 explicitly and instead computation occurs by solving
the system in equation (1.3) for x(i+1) . In the following sections the most well-known
iterative methods are introduced along with general rules of convergence.
1.2 Convergence of Iterative Methods
Convergence is the cornerstone of this thesis since it is required for the method to be
of any use. It is well known that convergence occurs if and only if the spectral radius
of the iteration matrix, N −1 P , is less than one [4], that is, if λi is an eigenvalue of
the matrix N −1 P then
ρ(N −1 P ) = max |λi | < 1.
1≤i≤n
(1.5)
Additionally, the rate of convergence for an iterative method occurs at the
same rate that ρ(N −1 P )i → 0 as i → ∞. Thus, we are not only concerned about having ρ(N −1 P ) within unity, but also making ρ(N −1 P ) as small as possible to achieve
the fastest convergence possible. The rate of convergence between two iterative methods can be compared through the following method: If ρ1 and ρ2 are the respective
3
spectral radii of two iterative methods then according to
ρn1 = ρm
2
(1.6)
iterative method 1 will require n iterations to reach the same level of convergence as
iterative method 2 with m iterations. Solving explicitly for the ratio of the number
of iterations required results in
n
ln(ρ2 )
=
.
m
ln(ρ1 )
(1.7)
Equation (1.7) allows for the iteration requirements for different methods to be easily
compared. For example, if ρ1 = .99 and ρ2 = .999, then we see
ln(ρ2 )
ln(ρ1 )
= 0.0995, thus,
method 1 requires only 9.95% the iterations that method 2 requires. Additionally, if
ρ1 = .4 and ρ2 = .5, then we see
ln(ρ2 )
ln(ρ1 )
= 0.7565, thus, method 1 requires only 75.65%
the iterations that method 2 requires. The first example shows how important small
improvements can be for spectral radii close to 1 while the second example shows
that the absolute change in value of a spectral radius is not an accurate predictor
of the improvement provided by a method. Thus, for a proper comparison, equation
(1.7) should be used when comparing two methods.
1.3 Well-Known Iterative Methods
Given a nonsingular n × n matrix A, the matrices L, U, and D are taken as the
strictly lower triangular, the strictly upper triangular, and the diagonal part of A,
4
respectively. The following sections detail the most well-known splittings and their
convergence properties.
1.3.1 Jacobi Method
The Jacobi method is defined by the splitting A = N − P , where N = D and
P = −(L + U ) [5, 6]. Therefore, the Jacobi iterative method can be written in vector
form as
x(k+1) = −D−1 (L + U )x(k) + D−1 b.
(1.8)
In scalar form, the Jacobi method is written as

(k+1)
xi

n
∑

1 
(k)

=− 
aij xj − bi 

aii j=1
f or i = 1, 2, · · · , n.
j̸=i
Algorithm 1.3.1 implements a one Jacobi iteration in MATLAB.
Algorithm 1.3.1: Jacobi Iteration
1 % Compute one i t e r a t i o n s o f t h e J a c o b i method
2 function [ x new ] = J a c o b i I t e r a t i o n (A, x o l d , b )
3
4
% Find s i z e o f t h e matrix
5
n = length ( b ) ;
6
7
% I n i t i a l i z e a r r a y f o r next i t e r a t i o n v a l u e
8
x new = zeros ( n , 1 ) ;
9
10
f or i = 1 : n
11
f or j = 1 : n
12
i f ( i ∼= j )
13
x new ( i ) = x new ( i ) + A( i , j ) ∗ x o l d ( j ) ;
14
end
15
end
16
5
(1.9)
17
x new ( i ) = ( b ( i )−x new ( i ) ) /A( i , i ) ;
18
end
19 end
The Jacobi method is guaranteed to converge for diagonally dominant systems [7]. A system is said to be diagonally dominant provided that it’s coefficient
matrix has the property that |aii | >
∑n
j=1
j̸=i
|aij | for i = 1,2, . . . , n. This means that
in each row of the coefficient matrix the magnitude of the diagonal element is larger
than the sum of the magnitudes of all other elements in the row.
1.3.2 Gauss-Seidel Method
The Gauss-Seidel method is a modification of the Jacobi method that can sometimes
improve convergence. If the Jacobi method computes elements sequentially then the
(k+1)
elements x1
(k+1)
ment xi
(k+1)
, x2
(k+1)
, . . . , xi−1
have already been computed by the time the ith ele-
is computed. The Gauss-Seidel method makes use of these more recently
computed elements by substituting them in place of the older values. Therefore, in
(k+1)
the computation of the element xi
(k)
(k)
(k)
xi+1 , xi+2 , . . . , xn
the Gauss-Seidel method utilizes the elements
(k+1)
from the k th iteration and the elements x1
(k+1)
, x2
(k+1)
, . . . , xi−1
from the (k + 1)th iteration. This substitution results in the the scalar equation
(k+1)
xi
1
=−
aii
( i−1
∑
j=1
(k+1)
aij xj
+
n
∑
)
(k)
aij xj
− bi
f or i = 1, 2, · · · , n.
j=i+1
Algorithm 1.3.2 implements one Gauss-Seidel iteration in MATLAB.
6
(1.10)
Algorithm 1.3.2: Gauss-Seidel Iteration
1 % Compute one i t e r a t i o n o f t h e Gauss−S e i d e l method
2 function [ x new ] = G a u s s S e i d e l I t e r a t i o n (A, x o l d , b )
3
4
% Find s i z e o f t h e matrix
5
n = length ( b ) ;
6
7
% I n i t i a l i z e a r r a y f o r new i t e r a t i o n v a l u e
8
x new = zeros ( n , 1 ) ;
9
10
f or i = 1 : n
11
x new ( i ) = b ( i ) ;
12
13
f or j = 1 : i −1
14
x new ( i ) = x new ( i ) − A( i , j ) ∗ x new ( j ) ;
15
end
16
17
f or j = i +1:n
18
x new ( i ) = x new ( i ) − A( i , j ) ∗ x o l d ( j ) ;
19
end
20
21
x new ( i ) = x new ( i ) /A( i , i ) ;
22
end
23 end
To write the vector form of the Gauss-Seidel method, let the matrices L, U,
and D be defined as earlier. By taking the splitting N = D + L and P = −U the
vector form of Gauss-Seidel can be shown as
x(k+1) = −(D + L)−1 U x(k) + (D + L)−1 b.
(1.11)
It is well known that if the matrix A is diagonally dominant then the Gauss-Seidel
method is convergent and the rate of convergence is at least as fast as the rate of
convergence of the Jacobi method [7]. Also, convergence is guaranteed for positive
definite matrices [4].
7
1.3.3 Successive Over-Relaxation Method
The successive over-relaxation (SOR) method is a modification of the Gauss-Seidel
method that introduces a relaxation parameter to affect convergence. SOR was first
introduced by David M. Young in his 1950 dissertation [8]. In order to approximate
(k+1)
xi
, SOR introduces the temporary Gauss-Seidel approximation
(k+1)
x̂i
1
=
aii
(k)
that is extrapolated with xi
(k+1)
xi
(
bi −
i−1
∑
(k+1)
aij xj
j=1
−
n
∑
)
(k)
aij xj
(1.12)
j=i+1
by a parameter ω. In other words, we obtain
(k+1)
(k)
(k)
(k+1)
(k)
+ (1 − ω)xi = xi + ω(x̂i
− xi )
[
]
i−1
n
∑
∑
ω
(k)
(k+1)
(k)
= (1 − ω)xi +
bi −
aij xj
−
aij xj .
aii
j=1
j=i+1
= ωx̂i
(1.13)
Rearranging (1.13) for i = 1, 2, . . . , n, the scalar form of SOR is
∑
(i−1)
(k+1)
aii xi
+ω
(k+1)
aij xj
(k)
= (1 − ω)aii xi − ω
j=1
n
∑
(k)
aij xj + ωbi .
(1.14)
j=i+1
In vector form (1.14) can be written as
Dx(k+1) + ωLx(k+1) = (1 − ω)Dxk − ωU x(k) + ωb
or
1
1
(D + ωL)x(k+1) = ((1 − ω)D − ωU )x(k) + b.
ω
ω
8
(1.15)
Equation (1.15) shows that the SOR method splits A as A = N − P , where
1
N = D + L and P =
ω
(
)
1
− 1 D − U,
ω
(1.16)
resulting in the iteration matrix
H(ω) = N −1 P = (D + ωL)−1 ((1 − ω)D − ωU ).
(1.17)
Therefore, in vector form, SOR can be written as
x(k+1) = (D + ωL)−1 ((1 − ω)D − ωU )x(k) + (D + ωL)−1 ωb.
Algorithm 1.3.3 implements one iteration of SOR in MATLAB.
Algorithm 1.3.3: SOR Iteration
1 % Compute one i t e r a t i o n o f t h e SOR method
2 function [ x new ] = S O R I t e r a t i o n (A, x o l d , b , w)
3
4
% Find s i z e o f matrix
5
n = length ( b ) ;
6
7
% I n i t i a l i z e a r r a y f o r next i t e r a t i o n v a l u e
8
x new = zeros ( n , 1 ) ;
9
10
f or i = 1 : n
11
x new ( i ) = b ( i ) ;
12
13
f or j = 1 : i −1
14
x new ( i ) = x new ( i ) − A( i , j ) ∗ x new ( j ) ;
15
end
16
17
f or j = i +1:n
18
x new ( i ) = x new ( i ) − A( i , j ) ∗ x o l d ( j ) ;
19
end
20
9
(1.18)
21
r e s u l t ( i ) = x o l d ( i ) + w∗ ( x new ( i ) /A( i , i )−x o l d ( i ) ) ;
22
end
23 end
Notice that if ω = 1, the SOR method reduces to the Gauss-Seidel method.
Convergence of SOR depends on ρ(H(ω)) and it is well known (Kahan’s theorem)
that for an arbitrary matrix A, ρ(H(w)) ≥ |ω − 1| [9]. This implies that if the
SOR method converges, ω must belong to the interval (0,2). Furthermore, if A is
symmetric positive definite, then for any ω in (0,2), SOR is convergent [10].
1.4 Extrapolated Iterative Methods
An extrapolated scheme which converges to the same solution as (1.4) can be defined
by
x(k+1) = µΦ(x(k) ) + (1 − µ)x(k) = x(k) + µ(Φ(x(k) ) − x(k) )
(1.19)
where µ is an extrapolation parameter. Substitution of Φ from (1.4) into (1.19) results
in the extrapolated iterative method [11, 12],
x(k+1) = ((1 − µ)I + µN −1 P )x(k) + µN −1 b.
(1.20)
Note that equation (1.20) is equivalent to equation (1.4) when µ = 1.
The addition of an extrapolation parameter changes the rules of convergence
for a splitting which is now convergent if and only if ρ((1−µ)I +µN −1 P ) < 1. In [12],
the controlled relaxation method (CR method) was introduced which analyzed the
10
Figure 1.1: The CR transition disk in relation to the unit disk
convergence of extrapolated iteration. It was found that an appropriate µ can be
chosen to make extrapolated iteration converge if all the eigenvalues of the iteration
matrix, N −1 P , have real parts all greater than 1 or all less than 1. If the eigenvalues
of N −1 P are {λj = αj + iβj }nj=1 then the eigenvalues of (1 − µ)I + µN −1 P are
{1 − µ + µλj }nj=1 . If αj < 1, then for 0 < µ < ∞, 1 − µ + µλj will shift λj to a position
on the half-line starting with the point (1,0) and passing through λj . In [12], it is
shown that if
µλj =
1 − Real(λj )
1 − Real(λj )
=
,
2
|1 − λj |
1 + |λj |2 − 2Real(λj )
then 1 − µλj + µλj λj will shift λj to a point on the circle with center
1
2
(1.21)
(1
2
)
, 0 and radius
called the transition circle (see figure 1.1). Notice that if λj is in the interior of the
transition circle, then µλj ≥ 1, otherwise 0 < µλj < 1. Additionally, if αj > 1 then
µλj < 0.
11
Now suppose all the eigenvalues have a real part less than 1 (for all j, αj < 1).
For each 1 ≤ j ≤ n, equation (1.21) provides a shift parameter µλj by which λj will be
transformed onto the transition circle. Next, define µ = min1≤j≤n µλj . For 1 ≤ j ≤ n,
1 − µ + µλj are the eigenvalues of the matrix (1 − µ)I + µN −1 P which belong to the
transition disk and at least one of them lies on the transition circle.
In the case where all the eigenvalues of N −1 P have real parts greater than 1
(for all j, αj > 1) equation (1.21) will shift λj to 1 − µλj + µλj λj on the transition
circle and µ = max1≤j≤n µλj will shift all eigenvalues of the iteration matrix (1 −
µ)I + µN −1 P inside or on the transition disk.
Starting with a matrix where all the eigenvalues of the iteration matrix N −1 P
have real parts all less than 1 or all greater than 1, the CR method (1.20) with an
appropriate µ will converge independent of the initial vector x(0) . However, the shift
parameter µ may push the eigenvalues of (1 − µ)I + µN −1 P too close to the point
(1,0) which is on the boundary of the unit disk causing slower convergence. At this
point, the spectral radius can be modified by shifting all the eigenvalues to the left
on the shift lines. This can be done by applying the CR method to (1.20) with a µ∗
larger than one. The resulting iterative method is called controlled over-relaxation
(COR) defined by the iteration
x(k+1) = ((1 − µµ∗ )I + µµ∗ N −1 P )x(k) + µµ∗ N −1 b.
12
(1.22)
Since one of the eigenvalues of the matrix (1−µ)I +µN −1 P lies on the transition disk,
it is shown in [12] that (1.22) with the calculated µ converges if and only if 0 < µ∗ < 2.
While COR is helpful for understanding the rules of convergence, it’s reliance upon
the eigenvalues for the calculation of µ makes it impractical for real world use which
is discussed later. Henceforth, in this thesis the extrapolation parameter µ has no
relation to the calculated µ from COR.
One final note is that since the extrapolated spectral radius function
ρ(M (µ)) = max |(1 − µ) + µλi |
1≤i≤n
(1.23)
is the composition of a linear equation, the absolute value, and the max function, the
spectral radius of an extrapolated iterative method is convex. Therefore, there exists
a single optimal extrapolation parameter that will provide the global minimum of the
extrapolated spectral radius function resulting in the fastest convergence possible.
1.5 Sequence of Linear Systems
As mentioned previously, many problems in scientific computing require solving not
just one system of linear equations but a sequence of linear systems
A{i} x{i} = b{i} f or i = 1, 2, . . .
13
(1.24)
Often there is the requirement that the system A{i} x{i} = b{i} must be solved in order
to generate the system A{i+1} x{i+1} = b{i+1} , that is, A{i+1} = f (A{i} , x{i} , b{i} ) and
b{i+1} = g(A{i} , x{i} , b{i} ). In the case that A{i} is constant, then the iteration matrix
remains invariant as elements of the sequence are solved and thus a single particular
extrapolation parameter will be optimal for all the systems in the sequence.
The repeated solving of linear systems is the cornerstone of this thesis as it
allows what would be expensive computations to be estimated very cheaply by evaluating properties of previously solved systems. The evaluation of previous systems
would not be possible if only one system is being solved and thus is only useful when
solving a sequence of linear systems. The proposed algorithm for estimating the optimal extrapolation parameter is outlined in chapter 2 and performance results are
presented in chapters 3 and 4.
14
CHAPTER II
AN ALGORITHM FOR ESTIMATING THE OPTIMAL EXTRAPOLATION
PARAMETER
An algorithm for estimating the optimal extrapolation parameter for an extrapolated
stationary solver can be broken down into four parts. The first part of the algorithm
estimates the spectral radius of the extrapolated iteration matrix for a particular
extrapolation parameter. The second part of the algorithm reconstructs the extrapolated spectral radius function from spectral radius estimation samples. The third
part of the algorithm establishes rules that dictate how subsequent sample locations
are chosen and where the optimal value lies. The fourth part of the algorithm is the
integration of the previous three parts into the iterative solver. The result of this
algorithm is an iteratively refined estimation of the optimal extrapolation parameter.
2.1 Spectral Radius Estimation
Determining the eigenvalues and thus the spectral radius of a matrix can be extremely
expensive and is often more expensive than solving a system of linear equations. For
an n×n matrix, typical algorithms for finding the eigenvalues of a matrix, such as the
QR method, require O(n3 ) operations as well as the full storage of a matrix in memory.
Iterative techniques such as power iteration exist for finding the dominant eigenvalue
15
and dominant eigenvector of a matrix, however, the iteration is not guaranteed to
converge if there is more than one dominant eigenvalue and thus cannot be used
for general cases. Therefore, calculating the spectral radius based upon eigenvalue
techniques is not practical for use in a low-cost general algorithm.
The spectral radius can, however, be estimated based upon the rate of convergence of a stationary iterative solver. As the number of iterations computed by
a solver grows, the rate of convergence is dictated by the spectral radius, thus, the
average reduction factor per iteration is able to give an estimation of the spectral radius when enough iterations are computed. For a particular extrapolation parameter,
iteration is carried out which generates the solution sequence {xi }ni=0 which should
converge to the solution x. The average reduction factor for i iterations, defined
by [4], can be written as
(
σi :=
||xi − x||
||x0 − x||
) 1i
.
(2.1)
Note that the calculation of σi requires the exact value x which is unknown. The final
iteration computed, xn , can be used as an estimation of x resulting in the estimation
of the average reduction factor
(
σ̄i :=
||xi − xn ||
||x0 − xn ||
) 1i
,
f or 0 < i < n.
(2.2)
Care must now be given to how well σ̄i estimates the spectral radius along with
which σ̄i in the set {σ̄i }n−1
i=1 should be chosen to represent the final average reduction
factor ρ̄ for this set of data. The number of elements in {xi }, and thus xn , are usually
16
1
σ̄i /ρ
0.8
0.6
0.4
0.2
0
5th Percentile
25th Percentile
50th Percentile
75th Percentile
95th Percentile
0
10
20
30
40
50
60
70
80
90
100
i used for σ̄i as percent of n
Figure 2.1: Statistics on the ratio of the estimated average reduction factor (σ̄i ) to
the spectral radius (ρ) as i is varied for randomly generated systems.
determined by the tolerance of the solver for the problem at hand. Thus, the choice of
n is not controllable in the calculation of ρ̄. The remaining problem is choosing i such
that 0 < i < n and σ̄i is as accurate as possible for estimating the spectral radius.
Figure 2.1 presents statistics that were gathered on the appropriate i to choose for the
computation of ρ̄. Randomly generated positive definite systems were solved using
the Gauss-Seidel method and the sequence {xi }, and thus {σ̄i }, were analyzed. Due
to the fact that each system can require a different number of iterations to achieve
convergence, the horizontal axis of figure 2.1 is given as i as a percentage of n so the
17
1
ρ̄/ρ
0.8
0.6
0.4
0.2
0
5th Percentile
25th Percentile
50th Percentile
75th Percentile
95th Percentile
0
25
50
75
100
125
150
Iterations for solution convergence
Figure 2.2: Statistics on the ratio of the estimated spectral radius (ρ̄) to the spectral
radius (ρ) compared against the number of iterations required for convergence for
randomly generated systems.
statistics are normalized for the number of iterations required. The vertical axis of
figure 2.1 is the ratio of the average reduction factor σ̄i to the spectral radius, letting
values close to 1 show that the estimation is accurate in estimating the spectral radius.
We see that choosing an element in the sequence {σ̄i } for use as the final estimation ρ̄
is fairly straight forward because choosing an i that is close to n provides the highest
quality estimation. Although there is a very small dip in accuracy towards 100 in
figure 2.1, σ̄n−1 provides sufficient accuracy for use as ρ̄.
18
1.05
1
0.95
0.9
0.85
0.8
0.75
0.7
0.65
Spectral Radius
ρ̄
0
0.5
1
1.5
2
2.5
Extrapolation Parameter
Figure 2.3:
function.
Example of ρ̄ estimating points on the extrapolated spectral radius
Now that the selection of ρ̄ is understood, the accuracy and precision of its
estimation of the spectral radius must be evaluated. Figure 2.1 normalized i and
n for each sample for purpose of analyzing σ̄i free from particular cases of i and
n. Figure 2.2 shows that the accuracy and precision of ρ̄ in estimating the spectral
radius is somewhat dependent upon n. Notice that the 50th percentile remains fairly
steady for all values of n, this shows that ρ̄ can provide a fairly accurate estimation
of the spectral radius over nearly all values of n. However, looking at the spread of
the percentiles for smaller values of n we see that the precision of the estimation is
reduced. While the highest quality estimations are preferred, because the algorithm
19
proposed is based upon statistical methods, issues due to lack of precision can be
mitigated by gathering more samples.
Figure 2.3 shows an example of ρ̄ being used to estimate the spectral radius of
an extrapolated iteration matrix as the extrapolation parameter is varied. It should
be noted that if iteration diverges then ρ̄ → 1 which can be seen on the far right of
figure 2.3. Detecting divergence can be useful so unnecessary computation is avoided.
2.2 Extrapolated Spectral Radius Function Reconstruction
The previous section gives a technique for cheaply estimating the spectral radius after
solving one system of linear equations. If solving a sequence of linear systems, the
extrapolation parameter can be varied so the extrapolated spectral radius function
can be cheaply sampled while solving elements of the sequence. These samples contain
random variations that makes them unsuitable for direct reconstruction techniques
such as interpolation. To deal with the random nature of the samples a statistical
regression can be used for the reconstruction of the extrapolated spectral radius
function. Figure 2.3 shows an example of the spectral radius function that needs to
be reconstructed along with the samples that could be used for the reconstruction.
Analyzing the mathematical properties of the spectral radius function gives some
insight into the regression model that should be used. The greatest consideration in
the regression model is that the square of the spectral radius function is a piecewise
quadratic function, that is, if λj = αj + iβj is an eigenvalue of the iteration matrix
20
1
0.9
0.9
0.8
0.8
0.7
0.7
Square of Magnitude
Square of Magnitude
1
0.6
0.5
0.4
0.6
0.5
0.4
0.3
0.3
0.2
0.2
0.1
0
0.1
Spectral Radius
Eigenvalues
0
0.5
1
1.5
Extrapolation Parameter ( µ )
0
2
Spectral Radius
Eigenvalues
0
0.5
1
1.5
Extrapolation Parameter ( µ )
2
Figure 2.4: The square of the magnitude of the spectral radius shown as the composition of the square of the magnitude of the respective eigenvalues for two randomly
generated systems (only relevant eigenvalues are shown).
N −1 P and M (µ) is the extrapolated iteration matrix then
ρ(M (µ)) = max |(1 − µ) + µλj |
1≤j≤n
=⇒ ρ(M (µ))2 = max |(1 − µ) + µλj |2
1≤j≤n
= max 1 + 2(αj − 1)µ + ((αj − 1)2 + βj2 )µ2 .
1≤j≤n
(2.3)
(2.4)
(2.5)
Typical extrapolated spectral radius functions are made up of 2 to 5 segments but
are not strictly limited in number. Since the purpose of reconstructing the spectral
radius function is to estimate the location of the minimum value which will only occur
where the derivative of a segment is equal to zero or at an intersection of segments,
21
at most only the two segments adjacent to the minimum value are required to be
reconstructed.
2.2.1 Segmentation Method
Given a set of samples of the extrapolated spectral radius function, the goal of segmentation is to sort the samples into groups based upon the segment they were sampled
from. Without any knowledge of which segment a sample came from, segments can
be estimated by grouping samples such that the error of the piecewise regression is
minimized. Algorithm 2.2.1 is proposed as a method for finding the optimal grouping
of samples to best minimize regression error and is explained in the following.
Algorithm 2.2.1: Segmentation
1 function [ segments ] = FindSegments ( x i n p u t , y i n p u t )
2
3
% S o r t samples
4
[ x s o r t p e r m u t a t i o n ] = sort ( x i n p u t ) ;
5
y = y input ( sort permutation ) ;
6
7
% I n i t i a l i z e values
8
n = length ( x ) ;
9
e ( 1 : n ) = Inf ;
10
11
% Generate e r r o r t a b l e
12
f or i = 1 : n−1
13
% Get r e g r e s s i o n c o e f f i c i e n t s
14
BL = R e g r e s s i o n (X( 1 : i ) ,Y( 1 : i ) ) ;
15
BR = R e g r e s s i o n (X( i +1:n ) ,Y( i +1:n ) ) ;
16
17
% C a l c u l a t e and s t o r e r e g r e s s i o n e r r o r
18
EL = R e g r e s s i o n E r r o r (BL,X( 1 : i ) ,Y( 1 : i ) ) ;
19
ER = R e g r e s s i o n E r r o r (BR,X( i +1:n ) ,Y( i +1:n ) ) ;
20
21
e ( i ) = EL + ER;
22
end
23
24
% C a l c u l a t e optimum c h o i c e s
25
s p l i t l o c a t i o n = find ( error == min( error ) ) ;
26
22
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44 end
% Find i n t e r s e c t i o n o f segments
a = BL( 3 )−BR( 3 ) ;
b = BL( 2 )−BR( 2 ) ;
% Make s u r e samples t o t h e l e f t o f t h e i n t e r s e c t i o n b e l o n g t o
% t h e l e f t segment and sa mp le s t o t h e r i g h t t h e r i g h t segment
i f ( a ˜= 0 && −b/a > 0 )
f or j = 2 : n
i f (X( j ) > −b/ a )
segments = [ 1 j −1; j n ] ;
return ;
end
end
end
% I f t h e r e i s no i n t e r s e c t i o n , r e t u r n r e s u l t s
segments = [ 1 s p l i t l o c a t i o n ( 1 ) ; s p l i t l o c a t i o n ( 1 ) +1 n ] ;
As previously mentioned, at most only two segments need to be found, which
we call the left segment and the right segment. Samples must belong to either the left
segment or the right segment. To prepare the data, the samples need to be sorted by
their x position so that groups are easily determined by the segmentation position i
for a left group x(1 : i) and a right group x(i + 1 : n). Note that x and y are parallel
arrays so sorting and grouping methods should handle y data accordingly. Next, all
the possible segmentations are iterated through. For each potential segmentation a
regression for the left segment and right segments are computed, the total error of
the regressions is calculated, then the error is stored in an array that tracks the total
error for each potential segmentation. The optimal segmentation will be the one
that minimizes the amount of error in the regression. However, due to the nature of
regression, in some scenarios samples that belong to the left segment may end up on
the right side of the intersection of segments or vice-versa with right samples. Thus,
23
a final classification of samples into left and right segments is made by assigning all
samples to the left of the intersection to the left segment and all the samples to the
right of the intersection to the right segment, as seen in lines 33-40 of algorithm 2.2.1.
Once the optimal segmentation is found, a regression can be run on the left
and right segment samples to reconstruct the spectral radius. Figure 2.5 shows an
example of segmentation done by algorithm 2.2.1 with the left segment colored green
and right segment colored red. A comparison of the regressions provided by the
segmentation (solid lines) is made against the quadratic functions they are trying
to reconstruct(dotted lines). The ability of the segmentation algorithm to find the
segments when given good data is clearly satisfactory. Because of the statistical
nature of regressions, additional sample points can be used to refine the regression
for more accurate results.
2.2.2 Regression Constraints
Figure 2.5 shows a good reconstruction generated from accurate samples that are
uniformly spaced across our area of interest, however, such well behaved samples are
rarely the case. Utilizing the mathematical properties of the extrapolated spectral
radius function allows constraints to be added to the regression to make reconstructions as accurate as possible. Equation (2.5) shows that the square of each segment of
the extrapolated spectral radius function is a quadratic function, thus, the regression
model for segment j of the spectral radius function is aj µ2 + bj µ + cj . The mathematics of equation (2.5) also show that restrictions can be placed on the coefficients of
24
1
0.9
0.8
Square of Magnitude
0.7
0.6
0.5
0.4
0.3
Spectral Radius
Left Eigenvalue
Right Eigenvalue
Left Sample
Left Regression
Right Sample
Right Regression
0.2
0.1
0
0
0.2
0.4
0.6
0.8
1
1.2
Extrapolation Parameter ( µ )
1.4
1.6
1.8
2
Figure 2.5: Example of segmentation of samples based upon algorithm 2.2.1.
the quadratic function. First, note that cj must always be equal to 1. Additionally,
as noted in [12], if {λi = αi + iβi }ni=1 are the eigenvalues of the iteration matrix then
extrapolated iteration converges if and only if every αi is less than 1, thus, 2(αi − 1)
is always negative dictating that bj must always be negative. Finally, (αi − 1)2 + βi2
is always positive dictating that aj must also always be positive. Applying these constraints can be trivial by transforming the data and constraints into a non-negative
least squares problem. Algorithm 2.2.2 implements these transformations which are
explained in the following.
If there are n samples with m explanatory variables and X ∈ Rn×m , Y ∈
Rn×1 , and B ∈ Rm×1 where X contains the explanatory samples, Y contains the
response samples, and B contains the regression coefficients then the typical least
25
1
1
0.9
0.9
0.8
0.8
0.7
Square of Magnitude
Square of Magnitude
0.7
0.6
0.5
0.4
0.3
0.5
0.4
0.3
Spectral Radius
Left Sample
Left Regression
Right Sample
Right Regression
0.2
0.1
0
0.6
0
0.5
1
1.5
Extrapolation Parameter ( µ )
Spectral Radius
Left Sample
Left Regression
Right Sample
Right Regression
0.2
0.1
0
2
0
0.5
1
1.5
Extrapolation Parameter ( µ )
2
Figure 2.6: Constrained regression (left) versus unconstrained regression (right)
applied to the same problem.
squares problem attempts to choose the vector B such that ||XB − Y || is minimized.
The non-negative least squares problem also attempts to choose the vector B such
that ||XB − Y || is minimized, however, it is subject to B ≥ 0. An algorithm for
solving non-negative least squares problems is given in [13].
The first transformation to set up the non-negative least squares problem will
be to account for cj being equal to 1. If the ith sample point consists of the values
26
(µi , yi ) and the j th segment consists of the coefficients aj , bj and cj , then
yi ≈ aj µ2i + bj µi + cj
(2.6)
=⇒ yi ≈ aj µ2i + bj µi + 1
(2.7)
=⇒ yi − 1 ≈ aj µ2i + bj µi
(2.8)
=⇒ ȳi ≈ aj µ2i + bj µi
(2.9)
where ȳi = yi − 1.
Now the regression can be run on equation (2.9) and only two coefficients need to
be determined. The additional constraints dictated by the mathematics require that
aj > 0 and bj < 0 which means it is not currently a non-negative problem. Through
substitution, the least squares problem that satisfies
ȳi ≈ aj µ2i + bj µi
where ȳi = yi − 1, aj > 0, and bj < 0
(2.10)
can be transformed into
ȳi ≈ aj µ2i + bj µ̄i
where ȳi = yi − 1, µ̄i = −µi , aj > 0, and bj > 0.
(2.11)
The substitution of −µi for µ̄i in equation (2.11) allows µ̄i to absorb the negative
constraint, resulting in constraints that all require positiveness and thus is a problem
fit for non-negative least squares.
27
Figure 2.6 shows the effect that constraints can have on the reconstruction.
The left plot applies the constraints and shows an accurate reconstruction that passes
through (0,1) with both segments being convex. The right plot, however, has no
constraints and clearly does not pass through (0,1) and the left segment is concave.
Not only does this lead to an inaccurate reconstruction but it also leads to difficulties
in selecting an optimal extrapolation parameter because the segments do not intersect.
Algorithm 2.2.2: Constrained Regression
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
% r e s u l t i s a 3−e l e m e n t a r r a y t h a t c o n t a i n s t h e c o e f f i c i e n t s
% of a quadratic function
function [ r e s u l t ] = C o n s t r a i n e d R e g r e s s i o n ( x i n p u t , y i n p u t )
n = length ( x i n p u t ) ;
% Transform data f o r c o n s t r a i n e d r e g r e s s i o n
y ( i ) = y i n p u t −1;
x ( 1 , : ) = −x i n p u t ;
x ( 2 , : ) = x input . ˆ 2 ;
% Run non−n e g a t i v e l e a s t s q u a r e s r e g r e s s i o n
n n l s r e s u l t = NNLS( x , y ) ;
% Transform n n l s r e s u l t back t o u s a b l e r e u l t s
result (1) = 1;
r e s u l t (2) = −n n l s r e s u l t s (1) ;
result (3) = n n l s r e s u l t s (2) ;
end
2.3 Parameter Selection
The previous sections have outlined how to sample the extrapolated spectral radius
function and how to reconstruct the extrapolated spectral radius function from samples, however, the question of where to take samples from has yet to be addressed.
28
Samples up to this point have been chosen in a uniformly spaced region around the
area of interest, i.e. around the optimal value. However, without any prior information about the the region surrounding the optimal value the problem of where
to select samples from becomes difficult. Poor choices for sample locations can be
computationally expensive due to the required solving of a linear system with an
extrapolation parameter that is suboptimal. Since an extrapolation parameter of 1
is equivalent to no extrapolation, parameter selection should try to find the optimal
extrapolation parameter while always using extrapolation parameters that provide superior convergence over non-extrapolated iteration. Figure 2.7 shows a reconstructed
spectral radius function and contains a dotted black line that samples should ideally
be taken below since any sample in this region would be superior to non-extrapolated
iteration.
The goal of parameter selection is to estimate the optimal extrapolation parameter while also refining the reconstruction of the extrapolated spectral radius
function. Parameter selection can be broken down into three important functions,
EstimatedOptimalMu, ApplyJitter, and ParameterSelection which are described in
the following sections.
2.3.1 EstimatedOptimalMu
As previously mentioned, the minimum value of the extrapolated spectral radius
function will occur at either the intersection of segments or on a segment which has a
derivative equal to zero. Thus, the first step in finding the optimal parameter will be
29
1
0.9
0.8
Square of Magnitude
0.7
0.6
0.5
0.4
0.3
Spectral Radius
Left Sample
Left Regression
Right Sample
Right Regression
0.2
0.1
0
0
0.2
0.4
0.6
0.8
1
1.2
Extrapolation Parameter ( µ )
1.4
1.6
1.8
2
Figure 2.7: Example of estimating the extrapolated spectral radius function from
samples that are all superior to non-extrapolated iteration (below the black dotted
line).
to find the intersection of the two segments that have been reconstructed. If aL and
bL are the quadratic and linear coefficients of the left segment, respectively, and aR
and bR are the coefficients of the right segment, then the intersection will be located
where
aL µ2 + bL µ + 1 = aR µ2 + bR µ + 1
=⇒ (aL − aR )µ2 + (bL − bR )µ = 0.
30
(2.12)
(2.13)
Let aI = aL − aR and bI = bL − bR be the coefficients of equation (2.13), then the
segment intersections occur at
µ=
−bI
and µ = 0.
aI
(2.14)
Because the intersection at µ = 0 is already known from the constraint placed on
the regression, we are interested in the intersection at µ =
−bI
aI
which is calculated on
lines 14 - 16 of algorithm 2.3.1. However, it is possible that aI is zero because either
the non-negative least squares regression determined 0 to be the quadratic coefficients for both segments or both segments share the same non-zero quadratic term.
In either scenario, an intersection cannot be found and an optimal parameter cannot
yet be estimated. The procedure under these conditions is to take another sample
slightly beyond the current range of samples, as seen on lines 25 - 31 of algorithm
2.3.1. Additional samples will quickly eliminate scenarios with no intersections as the
reconstruction is refined. Note that the value search speed is a tune-able parameter
that determines how far the next parameter can be from the previous sample locations. Choosing a small value for search speed can result in slow exploration of the
sample space, however, large values for search speed can lead to samples far away
from the optimal value leading to expensive computation.
If aI ̸= 0 and thus an intersection besides (0,1) is found, then both the left
and right segments are checked to see if they contain a minimum within their region,
if one does, then the location of the minimum is used as the optimal parameter,
31
otherwise, the intersection is used as the minimum. Checking if a segment contains
the minimum does not require explicitly calculating the actual value of the function
at it’s minimum, only the location of the segment minimums are needed which are
located at
−bL
2aL
and
−bR
2aR
for the left and right segments respectively. Then, if the left
segment’s minimum is to the left of the intersection or the right segment’s minimum
to the right of the intersection the appropriate location of the optimal extrapolation
parameter can be determined as seen in lines 34- 45 of algorithm 2.3.1.
If the optimal parameter is estimated at a location that is far outside the
range of where samples have been taken then the estimation may be very inaccurate.
This issue is common when there are very few samples to base an estimation on. In
the case that this situation does occur, lines 50-54 in algorithm 2.3.1 limit how far
away the next sample will be so parameters don’t travel too far into unsampled areas.
Algorithm 2.3.1: Estimated Optimal Mu
1 function [mu] = EstimatedOptimalMu (X, Y, s e a r c h S p e e d , h a s d i v e r g e n c e )
2
3
% Find Segments
4
Segments = FindSegments (X,Y) ;
5
6
% Find e q u a t i o n o f each segment
7
BL = C o n s t r a i n e d R e g r e s s i o n (X( Segments ( 1 , 1 ) : Segments ( 1 , 2 ) ) ,
8
Y( Segments ( 1 , 1 ) : Segments ( 1 , 2 ) ) ) ;
9
10
BR = C o n s t r a i n e d R e g r e s s i o n (X( Segments ( 2 , 1 ) : Segments ( 2 , 2 ) ) ,
11
Y( Segments ( 2 , 1 ) : Segments ( 2 , 2 ) ) ) ;
12
13
% Find i n t e r s e c t i o n o f segments
14
a = BL( 3 )−BR( 3 ) ;
15
b = BL( 2 )−BR( 2 ) ;
16
i n t e r s e c t i o n = −b/a ;
17
18
% I f t h e r e i s no i t e r s e c t i o n f o r p o s i t i v e mu, o r t h e r i g h t
19
% segment i s ontop o f t h e l e f t segment
20
i f ( a == 0 | | i n t e r s e c t i o n < 0 | | BL( 2 ) < BR( 2 ) )
32
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55 end
% I f d i v e r g e n c e has o c c u r e d and t h e r i g h t segment i s
% ontop o f t h e l e f t segment sa m p l e s s h o u l d be taken from
% t h e r i g h t s i d e o f t h e l e f t group o f s a m p l e s
i f ( h a s d i v e r g e n c e == 1 && BL( 2 ) < BR( 2 ) )
mu = max(X( Segments ( 1 , 1 ) : Segments ( 1 , 2 ) ) ) ∗(1+ s e a r c h s p e e d ) ;
% El se , j u s t s e a r c h beyond t h e maximum sample parameter
else
mu = max(X) ∗ ( 1 + s e a r c h s p e e d ) ;
end
else
% I f t h e l e f t segment c o n t a i n s t h e s p e c t r a l r a d i u s minimum
i f (BL( 3 ) ˜= 0 && −BL( 2 ) / ( 2 ∗BL( 3 ) ) < i n t e r s e c t i o n )
mu = −BL( 2 ) / ( 2 ∗BL( 3 ) ) ;
% El se − i f t h e r i g h t segment c o n t a i n s t h e s p e c t r a l r a d i u s
minimum
e l s e i f (BR( 3 ) ˜= 0 && −BR( 2 ) / ( 2 ∗BR( 3 ) ) > i n t e r s e c t i o n )
mu = −BR( 2 ) / ( 2 ∗BR( 3 ) ) ;
% E l s e t h e i n t e r s e c t i o n c o n t a i n s t h e s p e c t r a l r a d i u s minimum
else
mu = i n t e r s e c t i o n ;
end
end
% I f t h e o p t i m a l parameter i s found o u t s i d e o f t h e r a n g e o f
% t h e samples t h e sample s h o u l d be l i m i t e d by t h e s e a r c h s p e e d
i f (mu < min(X) ∗ ( 1 − s e a r c h s p e e d ) )
mu = min(X) ∗ ( 1 − s e a r c h s p e e d ) ;
e l s e i f (mu > max(X) ∗ ( 1 + s e a r c h s p e e d ) )
mu = max(X) ∗ ( 1 + s e a r c h s p e e d ) ;
end
2.3.2 Jitter
Occasionally, if the estimated optimal parameter is used repeatedly for the sample
location the estimation may get stuck in an inaccurate reconstruction because subsequent samples do not add enough variation to refine the estimation. To alleviate
this issue a jitter term is introduced so that subsequent parameters are slightly different each time. As seen from the speedup equation in section 1.2, small changes to
33
the spectral radius can have large affects on the amount of computation that must
be done. Therefore, jitter of the extrapolation parameter should not be based upon
adding unrestricted random variation to µ but instead should be intelligently applied
variation to µ based upon restrictions derived from the speedup equation.
If the estimated optimal spectral radius is ρOP T and requires n iterations to
converge and the allowed jittered cost is α times the number iterations the optimal
iteration requires, with α ≥ 1, then
n
ρnα
LIM IT = ρOP T
(2.15)
and
1
α
ρOP
T = ρLIM IT .
(2.16)
Equation (2.16), calculated on line 10 of algorithm 2.3.2, places a limit on the highest
spectral radius allowed by jitter while remaining within the computation envelope
provided by α. Note that if the cost of jitter is up to 10% the iterations of the
optimal iteration then α = 1.1.
Next, a range for the extrapolation parameter that satisfies the limit placed
on the spectral radius from jitter must be found. In the case that a segment is
quadratic then it has two locations that equal the jitter limit that are located at the
solutions of
aµ2 + bµ + 1 = ρLIM IT .
34
(2.17)
Any value of the extrapolation parameter between the two solutions to equation
(2.17) will satisfy the jitter limit. Additionally, in the case of a linear segment then
the solution to
ρLIM IT = bµ + 1
(2.18)
will provide one limit for the extrapolation parameter that satisfies the jitter limit.
The distance from this linear restriction to the optimum extrapolation parameter
is mirrored to the other side of the optimum parameter so that a bounded limit is
given in either case of a quadratic or linear segment. Lines 12 - 43 of algorithm 2.3.2
implement the calculation of range bounds for the left and right segments. Finally,
as calculated on lines 45-57, with the tightest bounds on the extrapolation parameter
from both the left segment and right segment that adhere to the jitter limit, a value
in the range is randomly chosen as the jittered µ value. Note that the jitter limit α
could be made a function of the number of samples available so that well sampled
estimations can use less jitter for improved performance.
Algorithm 2.3.2: Jitter Algorithm
1 function [mu] = A p p l y J i t t e r (BL,BR, mu, m a x j i t t e r c o s t )
2
3
% Find t h e y v a l u e a t mu ( t h e max o f l e f t and r i g h t segments )
4
l e f t y = BL( 3 ) ∗ muˆ2 + BL( 2 ) ∗ mu + 1 ;
5
r i g h t y = BR( 3 ) ∗ muˆ2 + BR( 2 ) ∗ mu + 1 ;
6
7
c u r r e n t y = max( l e f t y , r i g h t y ) ;
8
9
% Maximum v a l u e t h e j i t t e r v a l u e s h o u l d a t t a i n
10
y l i m i t = c u r r e n t y ˆ(1/ m a x j i t t e r c o s t ) ;
11
12
% I f t h e l e f t segment i s q u a d r a t i c
13
i f (BL( 3 ) ˜=0)
14
% Find t h e s o l t i o n t o t h e q u a d r a t i c e q u a t i o n
35
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58 end
BL left bound
BL left bound
= −BL( 2 )−sqrt (BL( 2 ) ˆ2−4∗BL( 3 ) ∗(1− y l i m i t ) ) ;
= B L l e f t b o u n d / ( 2 ∗BL( 3 ) ) ;
BL ri g ht bo un d = −BL( 2 )+sqrt (BL( 2 ) ˆ2−4∗BL( 3 ) ∗(1− y l i m i t ) ) ;
BL ri g ht bo un d = BL r i g ht b o u n d / ( 2 ∗BL( 3 ) ) ;
% El se , t h e l e f t segment i s l i n e a r
else
j i t D i s t = abs ( ( y l i m i t −1)/BL( 2 ) − mu) ;
B L l e f t b o u n d = mu − j i t D i s t ;
BL ri g ht bo un d = mu + j i t D i s t ;
end
% I f t h e r i g h t segment i s q u a d r a t i c
i f (BR( 3 ) ˜=0)
B R l e f t b o u n d = −BR( 2 )−sqrt (BR( 2 ) ˆ2−4∗BR( 3 ) ∗(1− y l i m i t ) ) ;
B R l e f t b o u n d = B R l e f t b o u n d / ( 2 ∗BR( 3 ) ) ;
BR right bound = −BR( 2 )+sqrt (BR( 2 ) ˆ2−4∗BR( 3 ) ∗(1− y l i m i t ) ) ;
BR right bound = BR right bound / ( 2 ∗BR( 3 ) ) ;
% El se , t h e r i g h t segment i s l i n e a r
else
j i t D i s t = abs ( ( y l i m i t −1)/BR( 2 ) − mu) ;
B R l e f t b o u n d = mu − j i t D i s t ;
BR right bound = mu + j i t D i s t ;
end
% Setup bounds on t h e j i t t e r ( u s e t h e t i g h t e s t bounds )
l e f t b o u n d = max( B L l e f t b o u n d , B R l e f t b o u n d ) ;
r i g h t b o u n d = min( BL right bound , BR right bound ) ;
% Generate random v a l u e f o r j i t t e r
j i t t e r a m o u n t = rand ( ) ;
% Apply j i t t e r randomly t o l e f t o r r i g h t o f mu
if ( jitter amount < .5)
mu = l e f t b o u n d + (mu − l e f t b o u n d ) ∗ ( 2 ∗ j i t t e r a m o u n t ) ;
else
mu = mu + ( r i g h t b o u n d −mu) ∗ ( 2 ∗ ( j i t t e r a m o u n t −.5) ) ;
end
36
2.3.3 Parameter Selection
The previous sections provide a method to estimate the optimal extrapolation parameter as well as a method to intelligently add variation to refine the reconstruction
of the spectral radius function. Parameter selection ties these functions together with
rules to manage the selection of a parameter. Algorithm 2.3.3 will be examined which
implements these rules.
First, lines 4-10 declare parameters that are used in this function and are
tune-able by users. Next, lines 13-14 sort the samples by the x location while making
sure to also track the y value accordingly. The sorted samples make further processing
easier by knowing the data is in order. The first rule in parameter selection takes
place on lines 17-20 which returns a default initial value if no samples have been
evaluated yet. This gives a starting point for the algorithm and an initial value of 1
makes the initial value equivalent to non-extrapolated iteration. Additionally, there
needs to be at least 2 samples for any analysis to take place, thus, lines 24-27 return
a parameter slightly beyond the initial value to get additional samples.
Next, lines 30-36 search for divergence by checking for sample values that are
very close to 1, as dictated by divergence value. If all samples are declared divergent,
then lines 39-42 implement a bisection of the minimum sample location in an attempt
to find a parameter that will converge. Additionally, lines 45-51 detect if there has
been divergence, in which case it calculates x limit which will clamp subsequent
estimations to the midpoint between the last convergent sample and the first divergent
sample. This clamp will limit the possibility of further divergent samples.
37
Finally, lines 54-55 trim out divergent samples so only convergent samples
are used in the reconstruction of the spectral radius function. Lines 58-61 find the
estimated optimal µ and apply jitter. And lastly, lines 65-67 apply the clamp from
x limit that was calculated earlier.
Algorithm 2.3.3: Parameter Selection
1 function [mu] = P a r a m e t e r S e l e c t i o n ( X input , Y input )
2
3
% Set parameters
4
n input
= length ( X input ) ; % Number o f sa m p l e s taken
5
6
search speed
= 1.1;
% P e r c e n t a g e beyond sample r a n g e we
7
% can l o o k
8
initial value
= 1;
% Where t h e f i r s t g u e s s i s
9
d i v e r g e n c e v a l u e = . 9 9 9 9 ; % Should be s l i g h t l y l e s s than 1
10
jitter
= 1.1;
% Should be s l i g h t l y l a r g e r than 1
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
% S o r t samples by X p o s i t i o n
[X IX ] = sort ( X input ) ;
Y = Y input ( IX ) ;
% I f t h e r e a r e no s a m ples t o a n a l y z e , r e t u r n i n i t i a l v a l u e
i f ( isempty (X) )
mu = i n i t i a l v a l u e ;
return ;
end
% I f t h e r e i s o n l y one p o i n t you must g e n e r a t e a n o t h e r p o i n t
% b e f o r e a n a l y s i s can be done
i f ( length (X) <= 2 )
mu = max(X) ∗ ( 1 − s e a r c h s p e e d ) ;
return ;
end
% D e t e c t where d i v e r g e n c e o c c u r s
divergence location = n input ;
f or i = n i n p u t : −1:1
i f (Y( i ) < d i v e r g e n c e v a l u e )
divergence location = i ;
break ;
end
38
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68 end
end
% I f e v e r y sample has been d i v e r g e n t , b i s e c t s m a l l e s t checked mu
i f ( d i v e r g e n c e l o c a t i o n == 1 && d i v e r g e n c e l o c a t i o n ˜= n i n p u t )
mu = min(X) / 2 ;
return ;
end
% C r e a t e an upper l i m i t f o r t h e next parameter
i f ( divergence location < n input )
has divergence = 1;
x l i m i t = (X( d i v e r g e n c e l o c a t i o n )+ X( d i v e r g e n c e l o c a t i o n +1) )
/2;
else
has divergence = 0;
x l i m i t = Inf ;
end
% Trim out d i v e r g e n t sa m p les
X = X( 1 : d i v e r g e n c e l o c a t i o n ) ;
Y = Y( 1 : d i v e r g e n c e l o c a t i o n ) ;
% Find t h e e s t i m a t e d o p t i m a l mu and t h e c o e f f o f t h e f u n c t i o n s
minimum mu = EstimatedOptimalMu (X, Y, s e a r c h s p e e d , h a s d i v e r g e n c e )
% Apply j i t t e r
mu = A p p l y J i t t e r (BL,BR, minimum mu , j i t t e r ) ;
% Check t o make s u r e f i n a l v a l u e i s n ’ t i n an a r e a c l o s e
% t o d i v e r g e n t , i f so , clamp i t t o t h e l i m i t
i f (mu > x l i m i t )
mu = x l i m i t ;
end
2.4 Solver Integration
The results of the previous sections yield the function ParameterSelection, which abstracts away all of the logic of choosing a parameter; integrating the proposed algorithm into an existing solver therefore requires only a few modifications. One requirement of implementing the algorithm is the storage of previous samples, so persistent variables are declared and initialized in lines 9-18, which store samples between calls to MySolver.
Line 24 contains the main iteration loop that is typical of every iterative solver; however, there is additional logic on lines 30-77 to manage sample collection. The calculation of the estimated spectral radius and the storage of samples occur on line 51 and lines 59-64, respectively. Section 2.1 evaluated the number of iterations required for an accurate estimation and concluded that roughly 200 iterations are needed to estimate the spectral radius accurately. When many more iterations are required to solve a system than are needed to estimate the spectral radius, a new parameter can be selected whenever enough iterations have been computed for an estimation; lines 32, 38, and 46 therefore check whether a sufficient number of iterations have been computed. When a divergent parameter has been used, all of the work computed with that parameter is useless because it has taken the iterate farther away from the solution, so line 55 reverts to the value held before the divergent iterations occurred.
The integration of the proposed algorithm into the solver and the use of persistent memory allow a user to call MySolver with no additional knowledge of or requirements on the problem; a short usage sketch is given after the listing. This fully abstracts the workings of the proposed algorithm, making it available for general use by any user.
Algorithm 2.4: Solver Integration
 1 function [x, solver_iteration] = MySolver(A, x, b, SolverType, solution_tol, SOR_w)
 2
 3     % Set parameters
 4     max_iterations = 5000;
 5     restart_pos    = 200;
 6     restart_min    = 150;
 7
 8     % Create persistent (static) variables
 9     persistent mu_array;
10     persistent rate_array;
11     persistent sample;
12
13     % Initialize values if uninitialized
14     if (isempty(mu_array))
15         mu_array   = [];
16         rate_array = [];
17         sample     = 1;
18     end
19
20     % Initialize mu
21     mu = 0;
22
23     % Iterate up to max_iterations times
24     for solver_iteration = 1:max_iterations
25
26         % Calculate the error in the current solution
27         error_norm = norm(A*x - b);
28
29         % If solved OR there are enough iterations to sample
30         if (error_norm < solution_tol || ...
31             mod(solver_iteration, restart_pos) == 0 || ...
32             solver_iteration == 1)
33
34             % If it's not the first iteration
35             if (solver_iteration > 1)
36
37                 % Find the number of iterations used for this sample
38                 if (mod(solver_iteration, restart_pos) == 0)
39                     sample_iterations = restart_pos;
40                 else
41                     sample_iterations = mod(solver_iteration, restart_pos);
42                 end
43
44                 % Only gather a sample if there were enough
45                 % iterations or you are forced to
46                 if (sample_iterations > restart_min || ...
47                     solver_iteration < restart_pos)
48
49                     % Calculate estimate for spectral radius
50                     estimated_spectral_radius = (norm(xp - x) / ...
51                         norm(sample_x0 - x))^(2/(sample_iterations - 1));
52
53                     % If NOT solved AND divergent, throw away divergent iterations
54                     if (error_norm > solution_tol && estimated_spectral_radius > 0.999)
55                         x = sample_x0;
56                     end
57
58                     % Make sure x0-x did not cause NaN, then record
59                     if (~isnan(estimated_spectral_radius))
60                         mu_array(sample)   = mu;
61                         rate_array(sample) = estimated_spectral_radius;
62                         sample = sample + 1;
63                     end
64                 end
65             end
66
67             % If solved, quit iterating
68             if (error_norm < solution_tol)
69                 break;
70             end
71
72             % Select new parameter
73             mu = ParameterSelection(mu_array, rate_array);
74
75             % Record starting point for sample
76             sample_x0 = x;
77         end
78
79         xp = x;
80
81         x = ComputeIteration(A, x, b);
82
83         % Apply extrapolation
84         x = xp + mu*(x - xp);
85     end
86 end
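As a usage illustration only (this driver is not part of the thesis listings, and the test matrix, tolerance, and solver-type string are assumed values), the following sketch shows how MySolver might be called over a sequence of right-hand sides so that its persistent sample arrays accumulate from one system to the next:

% Hypothetical driver for MySolver; all values below are illustrative.
A = delsq(numgrid('S', 12));            % a 100 x 100 SPD test matrix
n = size(A, 1);
solution_tol = 1e-8;                    % assumed stopping tolerance
x = zeros(n, 1);                        % initial guess for the first system
total_iterations = 0;
for element = 1:100                     % sequence of 100 right-hand sides
    b = rand(n, 1);                     % b(i) with random entries in [0, 1]
    % The previous solution x is reused as the starting guess; the persistent
    % arrays inside MySolver carry the (mu, rate) samples across calls.
    [x, iterations] = MySolver(A, x, b, 'GaussSeidel', solution_tol, 1.0);
    total_iterations = total_iterations + iterations;
end
fprintf('Total iterations for the sequence: %d\n', total_iterations);

When a new, unrelated sequence is started, the persistent state can be discarded with clear MySolver so that samples from the previous sequence do not influence the parameter estimate.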
CHAPTER III
PERFORMANCE ANALYSIS
This chapter evaluates the performance of the proposed algorithm by randomly generating sequences of linear systems, solving them, and recording the number of iterations required for convergence. To generate a sequence of linear systems, a 100 × 100 symmetric positive definite matrix with a randomly chosen condition number between 50 and 150 was generated for the coefficient matrix A, and vectors with random entries between 0 and 1 were generated for b(i) to form the sequence. For each benchmark, the first 100 elements of each of 250 sequences of linear systems were solved; one possible construction of such test systems is sketched below.
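For concreteness, one way such a test system could be constructed in MATLAB is sketched here; the thesis does not specify the exact generator, so the eigenvalue-based construction below is an assumption.

% Sketch of a possible test-system generator (assumed construction).
n     = 100;
kappa = 50 + 100*rand();           % condition number drawn from [50, 150]
[Q, ~] = qr(randn(n));             % random orthogonal matrix
d = linspace(1, kappa, n)';        % eigenvalues spread from 1 to kappa
A = Q*diag(d)*Q';                  % symmetric positive definite, cond(A) = kappa
A = (A + A')/2;                    % enforce exact symmetry against round-off
b = rand(n, 1);                    % one right-hand side b(i) of the sequence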
Figures 3.1a, 3.2a, and 3.3a analyze the ratio of the number of iterations required by the proposed algorithm to the number of iterations required when using the optimal extrapolation parameter, found through a brute-force bisection search; a sketch of such a reference search follows this paragraph. Ratios less than or equal to 1 show that the proposed algorithm is able to find the optimal extrapolation parameter accurately. The figures show that by roughly the 30th element of a sequence, even the 95th percentile estimates the optimal extrapolation parameter very accurately. Note that in some scenarios the proposed algorithm exceeds the performance of the theoretically optimal parameter found by brute force. This superoptimal improvement stems from the fact that the spectral radius determines the rate of convergence only as the number of iterations goes to infinity, whereas iteration usually terminates after a few hundred iterations once sufficient convergence is achieved. Because the proposed algorithm is based on the average reduction factor rather than the spectral radius, superoptimal convergence can sometimes be achieved.
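The reference parameter can be found by a direct search over the extrapolation parameter. The sketch below is an assumed illustration, not the exact bisection procedure used for the figures: it relies on a hypothetical helper IterationCount(mu) that solves a representative system with a fixed parameter mu and returns the iteration count, and it assumes that count is unimodal in mu.

% Hypothetical brute-force reference search for the optimal parameter.
% IterationCount(mu) is an assumed helper returning the number of
% iterations the extrapolated solver needs with a fixed parameter mu.
lo = 0.1;                           % assumed lower end of the search bracket
hi = 2.0;                           % assumed upper end of the search bracket
for refinement = 1:30               % repeatedly shrink the bracket
    m1 = lo + (hi - lo)/3;
    m2 = hi - (hi - lo)/3;
    if (IterationCount(m1) <= IterationCount(m2))
        hi = m2;                    % minimum lies in [lo, m2]
    else
        lo = m1;                    % minimum lies in [m1, hi]
    end
end
optimal_mu = (lo + hi)/2;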
Figures 3.1b, 3.2b, and 3.3b analyze the ratio of the number of iterations required by the proposed algorithm to the number of iterations required by the non-extrapolated method. Ratios less than 1 show that the proposed method is more computationally efficient than not using the proposed algorithm. As with the previous figures, by roughly the 30th element of the sequence even the 95th percentile becomes steady as the estimate of the parameter becomes accurate. Figures 3.1b and 3.2b show a clear improvement, with median ratios well below 1; figure 3.3b, however, shows little improvement, with a median ratio very close to 1. The lack of improvement in figure 3.3b is caused by an optimal extrapolation parameter that is very close to 1, so even a very accurate estimate of the optimal parameter does not yield a large improvement.
The conclusion to be drawn from these figures is that the proposed algorithm accurately estimates the optimal extrapolation parameter after solving a small number of elements in the sequence; whether extrapolation provides a significant speedup, however, depends on the splitting and on the problem at hand.
Figure 3.1: Performance results for Gauss-Seidel. (a) Quality of optimal spectral radius estimation: iterations for the proposed algorithm / iterations for optimal extrapolation versus sequence position, shown as 5th, 25th, 50th, 75th, and 95th percentiles. (b) Improvement beyond non-extrapolated iteration: iterations for the proposed algorithm / iterations for no extrapolation versus sequence element.
Figure 3.2: Performance results for SOR w = 1.5. (a) Quality of optimal spectral radius estimation. (b) Improvement beyond non-extrapolated iteration. (Axes and percentiles as in figure 3.1.)
Figure 3.3: Performance results for SOR w = 1.8. (a) Quality of optimal spectral radius estimation. (b) Improvement beyond non-extrapolated iteration. (Axes and percentiles as in figure 3.1.)
CHAPTER IV
CASE STUDY
The previous chapter used randomly generated systems to analyze the performance of the proposed algorithm; however, randomly generated systems may not accurately reflect problems that occur in the real world. This chapter implements the Crank-Nicolson method for solving the diffusion equation over a 3D domain in order to analyze the performance of the proposed algorithm in a real-world scenario.
The partial differential equation that will be solved is
U_t = a(U_{xx} + U_{yy} + U_{zz}) + S(x, y, z, t)    (4.1)
where S(x, y, z, t) is a source term and a is a constant diffusion coefficient.
The Crank-Nicolson method discretizes the diffusion equation over space and time, requiring a linear system to be solved at each time step and thus generating a sequence of linear systems [14, 15]. Figure 4.1 shows slices of the solved volume in this example at different time steps. When discretized, the diffusion equation becomes
Figure 4.1: Slices of the volume solved in the example problem at particular times. (a) t = .25. (b) t = 1.25.
\begin{aligned}
&\left(\frac{1}{\Delta t} + \frac{a}{\Delta x^2} + \frac{a}{\Delta y^2} + \frac{a}{\Delta z^2}\right) U^{n+1}_{i,j,k}
 - \frac{a}{2\Delta x^2}\left(U^{n+1}_{i-1,j,k} + U^{n+1}_{i+1,j,k}\right) \\
&\qquad - \frac{a}{2\Delta y^2}\left(U^{n+1}_{i,j-1,k} + U^{n+1}_{i,j+1,k}\right)
 - \frac{a}{2\Delta z^2}\left(U^{n+1}_{i,j,k-1} + U^{n+1}_{i,j,k+1}\right) \\
&= \left(\frac{1}{\Delta t} - \frac{a}{\Delta x^2} - \frac{a}{\Delta y^2} - \frac{a}{\Delta z^2}\right) U^{n}_{i,j,k}
 + \frac{a}{2\Delta x^2}\left(U^{n}_{i-1,j,k} + U^{n}_{i+1,j,k}\right) \\
&\qquad + \frac{a}{2\Delta y^2}\left(U^{n}_{i,j-1,k} + U^{n}_{i,j+1,k}\right)
 + \frac{a}{2\Delta z^2}\left(U^{n}_{i,j,k-1} + U^{n}_{i,j,k+1}\right)
 + S^{n+1/2}_{i,j,k}
\end{aligned}
(4.2)
and the coefficient matrix A(i) is determined from the left-hand side of equation (4.2), while the vector b(i) is determined from the right-hand side. If nx, ny, and nz are the numbers of grid points in the respective dimensions, then the resulting coefficient matrix is of size nx ny nz × nx ny nz. Doubling the resolution of the grid in each dimension therefore multiplies the number of unknowns by 8, and hence the number of matrix entries by 8^2 = 64, the vast majority of which are zero, as seen in figure 4.2.
Figure 4.2: Sparsity plot of the Crank-Nicolson coefficient matrix for an 8x8x8 discretization (blue entries are the nonzero entries of the matrix).
Considering the memory and computational requirements of direct system solvers, this example clearly favours iterative methods and sparse storage as grid sizes become large; a sketch of assembling this sparse system is given below.
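To make the structure of A(i) and b(i) concrete, the following sketch assembles the system of equation (4.2) with sparse Kronecker products. It is illustrative only: the variable names are assumptions, and boundary values are not folded into the right-hand side as they would be in the full example.

% Illustrative assembly of the Crank-Nicolson system (boundary handling
% simplified, so this is not the exact setup behind figure 4.2).
nx = 8; ny = 8; nz = 8;                  % grid points per dimension
dx = 1; dy = 1; dz = 1; dt = 0.25; a = 1;
ex = ones(nx, 1); ey = ones(ny, 1); ez = ones(nz, 1);
Dxx = spdiags([ex -2*ex ex], -1:1, nx, nx)/dx^2;   % 1D second differences
Dyy = spdiags([ey -2*ey ey], -1:1, ny, ny)/dy^2;
Dzz = spdiags([ez -2*ez ez], -1:1, nz, nz)/dz^2;
Ix = speye(nx); Iy = speye(ny); Iz = speye(nz);
L = kron(Iz, kron(Iy, Dxx)) + kron(Iz, kron(Dyy, Ix)) + kron(Dzz, kron(Iy, Ix));
I = speye(nx*ny*nz);
A = I/dt - (a/2)*L;                      % left-hand side operator of (4.2)
B = I/dt + (a/2)*L;                      % operator applied to U^n on the right
% At each time step, with current solution u and source values s at t^(n+1/2),
% the right-hand side is b = B*u + s, and A*u_new = b is handed to the solver.
u = 2*ones(nx*ny*nz, 1);                 % e.g. a constant initial condition
s = zeros(nx*ny*nz, 1);                  % source samples S^(n+1/2) (here zero)
b = B*u + s;

Each time step contributes one element (A, b) to the sequence, so the solver, and the parameter estimate it accumulates, is reused from step to step.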
For this particular example, the diffusion coefficient a is chosen to be 1, the boundary conditions are constant and equal to 2, ∆x = ∆y = ∆z = 1, ∆t = .25, and nx = ny = nz = 30. Each system was solved using the non-extrapolated Gauss-Seidel method as well as the extrapolated Gauss-Seidel method utilizing the proposed algorithm, and the number of iterations required for convergence was recorded as the metric for comparison. Two different scenarios for the source function S(x, y, z, t) were tested.
The first scenario uses a source function that acts as a 'pulse', S(x, y, z, t) = sin(t).
Figure 4.3: Benchmark results for a 'pulsed' source (S(x, y, z, t) = sin(t)), comparing the proposed algorithm with the original (non-extrapolated) method. (a) Iterations for convergence per sequence element. (b) Total iterations computed for the sequence.
The pulse creates continual change in the simulated volume, ensuring that the iterative solver must do work at every time step. The effect of the pulse is clearly visible in figure 4.3a as bumps, caused by the varying amount of change in the solution between time steps and the correspondingly larger number of iterations. The first elements of the sequence in figure 4.3a show the initial variation in the optimal parameter estimate, including one sequence element for which a suboptimal parameter causes noticeably slower convergence. Figure 4.3b shows the accumulated improvement of the proposed algorithm over the non-extrapolated version: the proposed algorithm requires only about 70% of the iterations of the non-extrapolated version.
The second scenario uses no source term, that is, S(x, y, z, t) = 0. The solution then approaches a steady state, and the iterative solver requires very little work as time passes.
Figure 4.4: Benchmark results for a constant source (S(x, y, z, t) = 0), comparing the proposed algorithm with the original (non-extrapolated) method. (a) Iterations for convergence per sequence element. (b) Total iterations for convergence over the sequence.
Figure 4.4a shows that the proposed algorithm is less successful at estimating a parameter that provides an improvement, owing to the decline in the amount of work that needs to be done and, with it, a decline in the quality of the spectral radius estimates. Figure 4.4b shows that, even with this difficulty, the overall effectiveness of the proposed algorithm is not completely compromised.
BIBLIOGRAPHY
[1] C. T. Kelley. Iterative Methods for Linear and Nonlinear Equations. Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1995.
[2] Y. Saad and M. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing, 7(3):856-869, 1986.
[3] Helmut Wittmeyer. Über die Lösung von linearen Gleichungssystemen durch Iteration. ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik, 16(5):301-310, 1936.
[4] Richard Varga. Matrix Iterative Analysis. Springer-Verlag, Berlin/New York, 2000.
[5] Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2nd edition, 2003.
[6] Gene Golub. Matrix Computations. Johns Hopkins University Press, Baltimore, 1989.
[7] Josef Stoer. Introduction to Numerical Analysis. Springer, New York, 2002.
[8] David Young. Iterative Methods for Solving Partial Difference Equations of Elliptic Type. PhD thesis, Harvard University, Cambridge, MA, 1950.
[9] L. A. Hageman and D. M. Young. Applied Iterative Methods. Dover Books on Mathematics. Dover Publications, 2004.
[10] Hans Schwarz. Numerical Analysis of Symmetric Matrices. Prentice-Hall, Englewood Cliffs, NJ, 1973.
[11] P. Albrecht and M. P. Klein. Extrapolated iterative methods for linear systems. SIAM Journal on Numerical Analysis, 21(1):192-201, 1984.
[12] Ali Hajjafar. Controlled over-relaxation method and the general extrapolation method. Applied Mathematics and Computation, 174(1):188-198, 2006.
[13] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. 3rd edition, 1995.
[14] S. V. Patankar. Numerical Heat Transfer and Fluid Flow. Series in Computational Methods in Mechanics and Thermal Sciences. Taylor & Francis, 1980.
[15] C. Pozrikidis. Introduction to Theoretical and Computational Fluid Dynamics. Oxford University Press, 1997.