
Math 662 Course Project
Glenn Elliott
May 3, 2009
Summary
In this assignment I explore various computational methods that may be used to solve partial
differential equations. Specifically, I solve the Poisson problem from Homework #5: ∆u = u_xx +
u_yy = f(x, y) = xy(1 − x)(1 − y) for (x, y) ∈ [0, 1] × [0, 1], with u = 0 on the boundary.
The Poisson problem is discretized using a five-point stencil with a step size of h = 1/2^M, for
M = 5, 6, and 7. The various step sizes allow us to examine how problem size affects the accuracy
and speed of the tested numerical methods.
Tables 1 through 3 summarize the results of the different numerical methods at each step size:
accuracy (expressed as relative error), speed¹, and iteration count. A more detailed analysis of the
methods follows.
Method                      ||ε||1       ||ε||2       ||ε||∞       Time (s)   Iteration Count
LU                          7.853e-4     8.059e-4     7.853e-4     0.7400     X
LU (sparse)                 7.853e-4     8.059e-4     7.853e-4     0.0120     X
GS (row)                    2.524e-4     2.407e-4     2.524e-4     0.0160     713
GS (col)                    2.524e-4     2.407e-4     2.524e-4     0.0160     713
GS (row/col)                2.524e-4     2.407e-4     2.524e-4     0.0160     713
GS (row), centered          2.534e-4     2.424e-4     2.540e-4     0.0160     717
SOR (ω = 1.4)               3.676e-4     3.869e-4     3.676e-4     0.0120     342
SOR (ω = 1.85)              7.366e-4     7.643e-4     7.366e-4     0.0040     65
SOR (ω = 2.2)               2.635e+155   2.074e+155   2.635e+155   0.0560     1960
SOR (ω = 1.4), centered     3.420e-4     3.684e-4     3.419e-4     0.0080     352
SOR (ω = 1.85), centered    6.897e-4     7.106e-4     6.897e-4     0.0040     122
SOR (ω = 2.2), centered     1.264e+157   8.036e+156   6.000e+156   0.0040     120
CG                          7.853e-4     8.059e-4     7.853e-4     0.0080     32
CGN                         7.846e-4     8.055e-4     7.846e-4     0.0120     177
GMRES                       7.853e-4     8.059e-4     7.853e-4     0.0520     32
BiCGstab                    1.172e-3     1.172e-3     1.172e-3     0.0040     10

Table 1: M = 5

¹ Executed in the Python Scipy environment on a Pentium M processor operating at 1.7 GHz.
Method                      ||ε||1       ||ε||2       ||ε||∞       Time (s)   Iteration Count
LU                          1.963e-4     2.015e-4     1.963e-4     39.3704    X
LU (sparse)                 1.963e-4     2.015e-4     1.963e-4     0.0520     X
GS (row)                    3.939e-3     3.917e-3     3.939e-3     0.2120     2279
GS (col)                    3.939e-3     3.917e-3     3.939e-3     0.1920     2279
GS (row/col)                3.939e-3     3.917e-3     3.939e-3     0.2000     2279
GS (row), centered          3.945e-3     3.923e-3     3.945e-3     0.2120     2282
SOR (ω = 1.4)               1.569e-3     1.557e-3     1.569e-3     0.1280     1126
SOR (ω = 1.85)              1.149e-4     1.099e-4     1.149e-4     0.1280     253
SOR (ω = 2.2)               1.978e+155   1.733e+155   1.978e+155   0.1240     1046
SOR (ω = 1.4), centered     1.585e-3     1.573e-3     1.585e-3     0.1320     1134
SOR (ω = 1.85), centered    1.601e-4     1.554e-4     1.602e-4     0.0400     310
SOR (ω = 2.2), centered     1.873e+158   1.363e+158   1.408e+158   0.0120     58
CG                          1.964e-4     2.015e-4     1.964e-4     0.0360     64
CGN                         1.963e-4     2.015e-4     1.963e-4     0.4000     646
GMRES                       1.871e-4     1.926e-4     1.871e-4     0.2240     295
BiCGstab                    1.168e-3     9.099e-4     1.168e-3     0.0200     20

Table 2: M = 6

Method                      ||ε||1       ||ε||2       ||ε||∞       Time (s)   Iteration Count
LU*                         X            X            X            X          X
LU (sparse)                 4.907e-5     5.039e-5     4.907e-5     0.2680     X
GS (row)                    1.633e-2     1.626e-2     1.633e-2     2.5681     6832
GS (col)                    1.633e-2     1.626e-2     1.633e-2     2.3481     6832
GS (row/col)                1.633e-2     1.626e-2     1.633e-2     2.9681     6832
GS (row), centered          1.634e-2     1.637e-2     1.634e-2     2.5841     6834
SOR (ω = 1.4)               7.027e-3     6.994e-3     7.027e-3     1.6321     3524
SOR (ω = 1.85)              1.257e-3     1.248e-3     1.257e-3     0.5360     880
SOR (ω = 2.2)               8.394e+154   7.295e+154   8.394e+154   0.4360     933
SOR (ω = 1.4), centered     4.041e-3     7.008e-3     7.041e-3     1.6401     3530
SOR (ω = 1.85), centered    1.316e-3     1.309e-3     1.316e-3     0.4280     927
SOR (ω = 2.2), centered     2.030e+160   1.378e+160   1.564e+160   0.0160     28
CG                          4.908e-5     5.039e-5     4.9082e-5    0.2000     131
CGN*                        X            X            X            X          X
GMRES                       3.921e-5     4.058e-5     3.921e-5     3.6682     1368
BiCGstab                    3.132e-3     2.480e-4     3.132e-3     0.1080     34

Table 3: M = 7

* Exceeded system capabilities.
Detailed Analysis
Exact Solution
The exact solution u(x, y) is given by the double sine series

    u(x, y) = Σ_{n=1}^{∞} Σ_{m=1}^{∞} E_mn sin(mπx) sin(nπy)

with coefficients

    E_mn = [−4 / ((mπ)² + (nπ)²)] · [2(−2 + 2 cos(mπ) + mπ sin(mπ)) (nπ cos(nπ/2) − 2 sin(nπ/2)) sin(nπ/2)] / (m³ n³ π⁶).
Relative error between the analytical solution and the numerically computed solutions will be studied
closely in the subsequent sections. Figure 1 is a plot of the exact solution. Though there will
be significant differences in the amount and distribution of relative error between the numerical
methods, the computed solutions themselves do not vary greatly (in terms of scale) from Figure
1, so they are not reproduced.
Figure 1: Exact solution of u(x, y)
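For reference, the truncated series can be sampled directly in NumPy. The sketch below mirrors the sampleAnalyitcal routine in the appendix driver; the function name and the 25-term truncation are illustrative choices of mine, not part of the graded code:

import numpy as np

def sample_exact(nx, terms=25):
    # Evaluate the truncated double sine series for u(x, y)
    # on an nx-by-nx grid over [0, 1] x [0, 1].
    x = np.linspace(0.0, 1.0, nx)
    X, Y = np.meshgrid(x, x, indexing='ij')
    u = np.zeros((nx, nx))
    for n in range(1, terms + 1):
        npi = n * np.pi
        for m in range(1, terms + 1):
            mpi = m * np.pi
            # Series coefficient E_mn, exactly as defined above.
            Emn = (-4.0 / (mpi**2 + npi**2)) * \
                  (2.0 * (-2.0 + 2.0 * np.cos(mpi) + mpi * np.sin(mpi)) *
                   (npi * np.cos(npi / 2) - 2.0 * np.sin(npi / 2)) *
                   np.sin(npi / 2)) / (m**3 * n**3 * np.pi**6)
            u += Emn * np.sin(mpi * X) * np.sin(npi * Y)
    return u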
LU Factorization Methods
The LU factorization methods are “exact” solvers in the sense that the number of operations is
fixed and the entire solution is computed. In terms of accuracy, there is no difference between
the dense LU and sparse LU solvers (both of which use partial pivoting). This is to be expected,
since the same numerical operations are performed. The sparse LU solver is faster since operations
(multiplications with zero) may be skipped. Furthermore, the sparse solver is able to tackle larger
problem sets since less memory is needed. For example, the dense solver was unable to solve the
M = 7 discretization (a 129 × 129 grid).
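A minimal sketch of the two solver paths, using the same SciPy entry points as the appendix driver. The Kronecker-product assembly is a stand-in for the loop-built matrix in math.py (with the sign flipped so the operator is positive definite), not the exact matrix used for the timings above:

import numpy as np
import scipy.sparse as sp
import scipy.linalg
import scipy.sparse.linalg

n = 31                                            # interior grid points per side
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))   # five-point stencil
b = np.random.rand(n * n)                         # stand-in right-hand side

# Dense LU with partial pivoting: factor, then back-substitute.
lu, piv = scipy.linalg.lu_factor(A.toarray())
x_dense = scipy.linalg.lu_solve((lu, piv), b)

# Sparse LU (SuperLU): factorized() returns a reusable solve callable
# that exploits the sparsity pattern and uses far less memory.
solve = scipy.sparse.linalg.factorized(A.tocsc())
x_sparse = solve(b)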
Table 4 contains contour plots of the relative error between the sparse LU computed solution
and the exact solution. It can be seen that accuracy increases with problem size, and that it improves
the most in the center of the solution. This increased accuracy is not entirely expected, since exact
solutions often suffer from stability problems proportional to the conditioning of the matrix being
processed.

Table 4: Relative error contour plots for LU methods. (Left to Right: M = 5, 6, 7)
Regarding execution time, the dense LU solver scales very poorly. Quadrupling the problem
set size (from M = 5 to M = 6) increased execution time by a factor of 52. The sparse LU solver,
however, shows itself to be competitive at all tested problem sizes. This is due to the highly optimized
software package² used to compute the solution. While perhaps not directly applicable to solving
PDEs, the existence of highly optimized exact solvers is a good thing, since some of the iterative
methods give poor results when used to compute full solutions.

² SuperLU: http://crd.lbl.gov/~xiaoye/SuperLU/
Gauss-Seidel Methods (GS)
Three different iteration schemes were used to test Gauss-Seidel (GS) properties. However, there
appears to be no difference in accuracy or convergence (expressed by iteration count) with this
particular problem. I suspect that this is due to several symmetry properties of the problem being
solved: (1) f(x, y) is symmetric in the x/y directions, with a solution evenly distributed about
(1/2, 1/2); (2) the stencil is symmetric; and (3) the boundary conditions are equal on all four sides
of the boundary.
With regard to accuracy, GS is one of the most accurate algorithms for M = 5, but an order
of magnitude worse than the best methods when M = 7. Table 5 shows the relative error for the GS
(row-inner) method. Contours for the other iteration schemes are omitted since they are the same.
Note that accuracy decreases as the grid size increases.
Table 5: Relative error contour plots for GS (row-inner) method. (Left to Right: M = 5, 6, 7)
It is interesting to note that the gradient of the relative error is determined by the order in which
the u_ij's are visited. This can be seen in the off-centered contour of the relative error plot for M
= 5 in Table 5, and the trend continues in the M = 6 and M = 7 plots (though it is perhaps more
difficult to observe there). Figure 2 shows what happens to the gradient when iteration takes place in
reverse order, iterating from n − 1 to 1 instead of 1 to n − 1 for both inner and outer loops: the
contour is mirrored across the line y = x.
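A minimal NumPy sketch of one sweep makes the ordering dependence concrete; the reverse=True path is the mirrored sweep of Figure 2. This function is my illustration, not the Fortran kernel used for the measurements, and f_h2 is assumed to be f already multiplied by h²:

import numpy as np

def gs_sweep(u, f_h2, reverse=False):
    # One in-place Gauss-Seidel sweep over the interior of u.
    # Freshly updated neighbors are read immediately, so the visit
    # order (forward vs. reversed) skews where error accumulates.
    m, n = u.shape
    rows = range(m - 2, 0, -1) if reverse else range(1, m - 1)
    cols = range(n - 2, 0, -1) if reverse else range(1, n - 1)
    for i in rows:
        for j in cols:
            u[i, j] = 0.25 * (u[i + 1, j] + u[i - 1, j] +
                              u[i, j + 1] + u[i, j - 1] - f_h2[i, j])
    return u

Alternating the reverse flag from sweep to sweep approximates the direction-switching scheme described next.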
Based on this observation, I implemented another GS algorithm that dynamically switches iteration
direction, for both inner and outer loops, cycling through four iteration schemes. The result is that
the relative error remains centered within the solution and is better distributed. However, the degree
of accuracy remains about the same (except in the lower right corner), and the iteration count is
modestly increased (a few extra iterations at each problem size). The results of this experiment can
be observed in Table 6.
In terms of execution time, GS scales poorly. Between M = 6 and M = 7, execution time increases
by about a factor of 12, though the problem set size only increased by a factor of 4. Furthermore,
the results returned by the method are less accurate than those of other methods. Additional
iterations will not improve results very much (at least not quickly), since the GS algorithm exits once
the relative change between iterations, measured in the L2-norm, drops below a small threshold.
This decrease in norms is asymptotic, so improvements through additional iterations would take a
very long time.
In terms of speed, GS only outperforms dense LU and GMRES.
Figure 2: GS with iteration direction reversed. (M = 5)
Table 6: GS relative error with alternating iteration direction. (Left to Right: M = 5, 6, 7)
Successive Over Relaxation (SOR)
The next method tested was successive over relaxation (SOR). Three different values for ω were
used: 1.4, 1.85, and 2.2. From Tables 1, 2, and 3, it can be seen that SOR, in general, is faster and
more accurate than GS and LU methods.
The relative errors for SOR can be seen in Table 7. Note that the contours exhibit the same
asymmetries as those of GS. There are two important aspects to note in these plots. First, like GS,
accuracy decreases as the grid size increases. Second, in the case where ω is 2.2, the solution
does not converge; the over-relaxation was too great. SOR suffers from the same problem as GS in
that additional iterations will not greatly improve accuracy, due to the asymptotic decrease in iterative
improvements.
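The SOR update differs from Gauss-Seidel only by the relaxation step, as in sor.f90. A sketch of one sweep (for a symmetric positive-definite problem like this one, SOR is known to converge only for 0 < ω < 2, which is consistent with the ω = 2.2 blow-ups in the tables):

import numpy as np

def sor_sweep(u, f_h2, omega):
    # One in-place SOR sweep: compute the Gauss-Seidel value ustar,
    # then over-relax the current value toward it by a factor omega.
    m, n = u.shape
    for i in range(1, m - 1):
        for j in range(1, n - 1):
            ustar = 0.25 * (u[i + 1, j] + u[i - 1, j] +
                            u[i, j + 1] + u[i, j - 1] - f_h2[i, j])
            u[i, j] += omega * (ustar - u[i, j])
    return u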
In terms of execution time, SOR is much faster than GS. This is clearly due to the smaller number
of iterations executed, each of comparable cost. However, SOR's execution still does not scale very
well with data size. Between M = 6 and M = 7, execution time increased by roughly 12 and 10 times
for ω = 1.4 and ω = 1.85, respectively. This is the same behavior as GS, and it should be expected:
though over-relaxation speeds up convergence, SOR does not address data size in any way differently
than GS.
With the exception of M = 5, SOR is more accurate than GS. For larger data sets, SOR shows
itself to be superior to GS in terms of both accuracy and speed. However, it is not better than the CG
methods, or even the optimized sparse LU method.
Table 7: Successive Over Relaxation relative error. (Top to Bottom: ω = 1.4, 1.85, 2.2; Left to
Right: M = 5, 6, 7)
I also tried to address the asymmetries in SOR in the same manner as with GS, by switching
between iteration orders. While this change did little to improve GS results, it helps mitigate the
degradation of accuracy with increased problem size in SOR. For example, for standard SOR (with
ω = 1.85), accuracy worsened by a factor of about 63 between M = 6 and M = 7; for the modified
method, this factor was only about 8. Table 8 contains contour plots of the relative errors for
ω = 1.4 and 1.85 (note that 2.2 still diverges). The relative errors remain centered instead of
forming arcs.
Table 8: Successive Over Relaxation relative error with alternating iteration direction (Top to Bottom: ω = 1.4, 1.85; Left to Right: M = 5, 6, 7)
Conjugate Gradient (CG)
The conjugate gradient method is the second most accurate technique among the tested methods,
and it is also the second fastest. In exact arithmetic, CG is guaranteed to converge in at most N
steps, where N is the dimension of the matrix being solved. In practice it does much better here:
the iteration counts track the grid side length n = 33, 65, 129 for M = 5, 6, 7, which reflects the
growth of the condition number rather than of the matrix dimension. For example, when n = 33,
32 iterations are executed; when n = 129, 131 iterations are executed. This behavior causes CG to
scale very well with problem size: though the matrix quadruples in size, the number of iterations
needed to solve the linear system only doubles. However, the amount of work done per iteration
still increases, so execution time more than doubles.
Unlike the GS and SOR methods, accuracy increases with problem set size. CG is only outdone
in speed by BiCGstab.
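Counting iterations through SciPy's callback hook, as the appendix driver does with its Counter class (the Kronecker assembly is again a positive-definite stand-in for the loop-built matrix):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

n = 31
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
b = np.random.rand(n * n)

count = [0]
def tick(xk):                     # invoked once per CG iteration
    count[0] += 1

x, info = scipy.sparse.linalg.cg(A, b, callback=tick)
print("info = %d after %d iterations" % (info, count[0]))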
Table 9: Relative error contour plots for Conjugate Gradient method. (Left to Right: M = 5, 6, 7)
Conjugate Gradient on Normalized Equations (CGN)
The CG method may only be used on symmetric positive-definite linear systems. However, CG can
still be used if the system is transformed into a positive-definite form by computing the normal
equations, AᵀA. The Poisson matrix is already symmetric positive-definite, but we can still explore
how CG behaves on the associated normal equations.
From Tables 1, 2, and 3, it can be seen that CGN is nearly as accurate as the CG method (at
least to three or four significant digits). However, the nice scaling behavior of CG, where iteration
count was strongly tied to problem size, no longer holds: the number of iterations between M = 5
and M = 6 nearly quadruples. This poor scaling is due to the increased condition number of the
problem, which slows convergence. The conditioning of CGN is κ(A)² instead of the κ(A) of the
CG method.
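A sketch of the transformation (the explicit product AᵀA is what exhausted memory at M = 7; the matrix stand-in is as before):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

n = 31
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
b = np.random.rand(n * n)

# Normal equations: A^T A is SPD for any nonsingular A, but its band is
# wider and its condition number is roughly kappa(A)**2, slowing CG.
AtA = A.T.dot(A)
Atb = A.T.dot(b)
x, info = scipy.sparse.linalg.cg(AtA, Atb)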
There were other problems with this method as well. Scipy was unable to handle the computation
of the normal equations when the problem set was large (M = 7) due to system constraints.
Table 10 shows the relative error contour plots for CGN. Note how they differ little from the
plots in Table 9.
Table 10: Relative error contour plots for Conjugate Gradient method on the normal equations.
(Left to Right: M = 5, 6)
Generalized Minimal Residual (GMRES)
The Generalized Minimal Residual method, or GMRES, is the most accurate method tested, though it
is the slowest iterative method tested. In terms of performance, GMRES is several times slower than CG
(partly because GMRES does not exploit the symmetry of the Poisson matrix). Furthermore, the gap
between GMRES and CG performance does not scale by a constant factor, meaning CG will
outperform GMRES by ever greater factors as problem size increases.
Still, GMRES yields the best results.
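GMRES's memory cost grows with the stored Krylov basis, which is why restarted variants exist. A sketch of a restarted call (the restart keyword has also been spelled restrt in some SciPy releases, so the exact parameter name is version-dependent):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

n = 31
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
b = np.random.rand(n * n)

# Cap the Krylov basis at 20 vectors; GMRES restarts from the current
# iterate when the cap is reached, trading convergence speed for memory.
x, info = scipy.sparse.linalg.gmres(A, b, restart=20)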
Table 11: Relative error contour plots for GMRES method. (Left to Right: M = 5, 6, 7)
Biconjugate Gradient Stabilized (BiCGstab)
The Biconjugate Gradient Stabilized method was the fastest of those tested, beating out the second
fastest method, CG, by a factor of two to three and outperforming the other methods by larger
margins. BiCGstab has even better scaling behavior than CG: while the number of CG iterations was
tied to the problem size, BiCGstab's is not. BiCGstab often required less than 1/3
the number of iterations needed by CG.
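The call mirrors CG's (same stand-in matrix). Note that each BiCGstab iteration costs two matrix-vector products, so the iteration counts in the tables understate its per-iteration work relative to CG:

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

n = 31
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
b = np.random.rand(n * n)

count = [0]
def tick(xk):                     # invoked once per BiCGstab iteration
    count[0] += 1

x, info = scipy.sparse.linalg.bicgstab(A, b, callback=tick)
print("info = %d after %d iterations" % (info, count[0]))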
BiCGstab’s accuracy is not as good as the other Krylov methods, CG and GMRES (even SOR
could be more accurate with a proper ω-value). Table 12 contains the contour plots of relative error
for BiCGstab. The plots are very unusual in comparison to those from the other methods. I cannot
explain why this is.
Table 12: Relative error contour plots for BiCGstab method. (Left to Right: M = 5, 6, 7)
Comparisons
Comparisons between the various numerical methods have already been made in the Detailed Analysis
sections, but let me restate a few important points here. First, GMRES showed itself to be the most
accurate method, though it was one of the slowest. CG performed very well in scaling and accuracy;
it offers the best balance of speed and accuracy among the tested methods. BiCGstab was the fastest
method, but it leaves much to be desired in terms of accuracy. GS and SOR methods may be useful
for small problem sets, but quickly degrade on larger ones. Finally, sparse LU methods can
perform very well in terms of accuracy and speed if the implementation is optimized. In fact, the sparse
LU solver performed nearly as well as CG in both respects. Problem sets
would have to be larger still for CG to outstrip sparse LU in performance.
Difficulty of implementation is hard to quantify, for two reasons. First, even
simple algorithms may become very difficult to implement once optimized. For example, matrix
multiplication becomes difficult to implement when blocked for a processor cache; likewise, a
straightforward linear-system solver may become very complex when optimized. Second,
I only had to implement (starting from a template) the easiest of the algorithms tested: GS and SOR.
Since Scipy implements the remaining methods, it is hard for me to judge the difficulty of
implementing them. However, reviewing the pseudo-code for the various methods, BiCGstab is
probably the most complex algorithm to implement, followed by CG. GMRES, while simple
to implement, has a significant memory overhead, as it must maintain the iteratively built unitary
matrix Qn (though “restart” variants exist to address this). Furthermore, updating Qn each iteration
(needed for an optimized implementation) adds further complexity.
One last note regarding Scipy: I found that the sparsity structure of the matrix used could
impact performance by orders of magnitude. Scipy offers several different flavors of sparse matrices,
each differing in the way sparse data is organized. I found that the diagonal sparse format performed
best in most cases (the blocked sparse format performed marginally better for CGN). It is not
surprising that the diagonal format performs well here, since the nonzeros of the Poisson
matrix lie along a few diagonals. Alternative formats include dictionary-of-keys, compressed row and
column, list-of-lists, and others.
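A sketch of the conversion workflow (assemble in LIL, then convert for solving; the timing loop is indicative only, and the formats shown are a subset of what Scipy offers):

import time
import numpy as np
import scipy.sparse as sp

n = 63
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A_lil = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tolil()
v = np.random.rand(n * n)

# Matrix-vector products dominate iterative-solver cost, so the storage
# format chosen after assembly matters far more than the LIL build step.
for name, B in [("lil", A_lil), ("csr", A_lil.tocsr()), ("dia", A_lil.todia())]:
    t0 = time.time()
    for _ in range(200):
        B.dot(v)
    print("%s: %.4f s" % (name, time.time() - t0))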
Conclusion
In this project I examined the accuracy and performance of various numerical methods that may be
used to solve PDEs. Each tested method showed its own characteristics and particular usefulness,
such that no one method can wholly supplant the others. These individual characteristics lead a
practitioner to pick the best method for a specific application, weighing data set size and the
trade-off between speed and accuracy.
Appendix (code)
math.py                  Python code used to drive the experiments.

gauss_seidel.f90         Fortran code for executing the Gauss-Seidel method. Supports
                         row-inner, col-inner, and row/col alternating iteration schemes.

gauss_seidel_flip.f90    Fortran code for executing the Gauss-Seidel method. Supports
                         dynamic iteration direction switching (1 to n, then n to 1).
                         Row-inner scheme only.

sor.f90                  Fortran code for executing the SOR method. Supports the
                         row-inner iteration scheme only.

sor_flip.f90             Fortran code for executing the SOR method. Supports dynamic
                         iteration direction switching (1 to n, then n to 1). Row-inner
                         scheme only.
math.py
# -*- coding: utf-8 -*-
import resource
from pylab import *
from numpy import *

import scipy.io
import scipy.linalg
import scipy.sparse
import scipy.sparse.linalg

from sor import *
from sor_flip import *
from gauss_seidel import *
from gauss_seidel_flip import *

# Used to count the number of iterations a Scipy method takes
# to complete.
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self, x):
        self.count = self.count + 1


# Used to measure CPU time. Sum of User and System cpu usage.
# This helps avoid counting delays in computation induced
# by other processes running on the system.
def cpu():
    return (resource.getrusage(resource.RUSAGE_SELF).ru_utime +
            resource.getrusage(resource.RUSAGE_SELF).ru_stime)


def savePlot(mat, name):
    res = linspace(0, 1, 200)
    clf()
    contourf(mat, origin = 'image')
    colorbar()
    axis('off')
    savefig(name)
    clf()


def computeRelativeError(error, sol):
    relError = zeros(error.shape)
    for i in range(error.shape[0]):
        for j in range(error.shape[1]):
            temp = abs(sol[i,j])
            if(temp > 0.000000001):
                relError[i,j] = abs(error[i,j]) / temp
    return relError


# Tests the various PDE-solving methods.
def doIt(A, F, solution):

    Adia = A.todia()
    Acsc = A.tocsc()

    f = array(F.flatten())[0]   # vectorized F matrix.
    sq = int(sqrt(f.shape[0]))

    SolNorm_1 = linalg.norm(solution, ord = 1)
    SolNorm_2 = linalg.norm(solution, ord = 2)
    SolNorm_Inf = linalg.norm(solution, ord = inf)

    # by LU factorization
    print "by LU:"
    try:
        Adense = A.todense()
        t = cpu()
        x = scipy.linalg.lu_solve(scipy.linalg.lu_factor(Adense), f)
        t2 = cpu()

        error = reshape(x, (sq,sq)) - solution

        print "\ttime taken:",t2-t
        print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
        #savePlot(array(abs(error)), "lu_dense_"+str(sq)+".eps")
        savePlot(array(computeRelativeError(error, solution)), "lu_dense_"+str(sq)+".eps")
        #print reshape(x, (sq,sq))
        print ""
    except ValueError as e:
        print "Caught Exception: ",e


    # by sparse LU factorization
    print "by Sparse LU:"
    t = cpu()
    sparseSolver = scipy.sparse.linalg.factorized(Acsc)
    x = sparseSolver(f)
    t2 = cpu()

    error = reshape(x, (sq,sq)) - solution

    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "lu_sparse_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error, solution)), "lu_sparse_"+str(sq)+".eps")
    #print reshape(x, (sq,sq))
    print ""

    # by Gauss-Seidel
    print "by Gauss-Seidel (row-inner):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = gauss_seidel(u0, F, 1e-5, 0)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "gs_row_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "gs_row_"+str(sq)+".eps")
    #print u
    print ""


    print "by Gauss-Seidel (col-inner):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = gauss_seidel(u0, F, 1e-5, 1)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "gs_col_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "gs_col_"+str(sq)+".eps")
    #print u
    print ""

    print "by Gauss-Seidel (row/col alternating):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = gauss_seidel(u0, F, 1e-5, 2)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "gs_rowcol_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "gs_rowcol_"+str(sq)+".eps")
    #print u
    print ""

    print "by Gauss-Seidel (centered):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = gauss_seidel_flip(u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    savePlot(array(computeRelativeError(error,solution)), "gs_centered_"+str(sq)+".eps")
    #print u
    print ""


    # by Successive Over-Relaxation
    print "by SOR (w = 1.4):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor(1.4, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_1.4_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_1.4_"+str(sq)+".eps")
    #print u
    print ""


    print "by SOR (w = 1.85):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor(1.85, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_1.85_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_1.85_"+str(sq)+".eps")
    print ""


    print "by SOR (w = 2.2):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor(2.2, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_2.2_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_2.2_"+str(sq)+".eps")
    print ""


    print "by SOR (w = 1.4, centered):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor_flip(1.4, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_1.4_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_1.4_centered_"+str(sq)+".eps")
    #print u
    print ""


    print "by SOR (w = 1.85, centered):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor_flip(1.85, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_1.85_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_1.85_centered_"+str(sq)+".eps")
    print ""


    print "by SOR (w = 2.2, centered):"
    u0 = zeros(F.shape)
    t = cpu()
    u, err, iters = sor_flip(2.2, u0, F, 1e-5)
    t2 = cpu()

    error = u - solution

    print "\titerations:",iters
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "sor_2.2_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "sor_2.2_centered_"+str(sq)+".eps")
    print ""


    # by CG
    print "by CG:"
    iterations = Counter()
    t = cpu()
    x, info = scipy.sparse.linalg.cg(Adia, f, callback = iterations.increment)
    t2 = cpu()

    error = reshape(x, (sq,sq)) - solution

    print "\titerations:",iterations.count
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "cg_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "cg_"+str(sq)+".eps")
    #print reshape(x, (sq,sq))
    print ""


    # by GMRES
    print "by GMRES:"
    iterations = Counter()
    t = cpu()
    x, info = scipy.sparse.linalg.gmres(Adia, f, callback = iterations.increment)
    t2 = cpu()

    error = reshape(x, (sq,sq)) - solution

    print "\titerations:",iterations.count
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "gmres_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "gmres_"+str(sq)+".eps")
    #print reshape(x, (sq,sq))
    print ""


    # by CG w/ normalized A
    print "by CG (normalized system):"
    try:
        iterations = Counter()
        AA = dot(A.T,A).tobsr()
        AF = array(dot(A.T, scipy.sparse.lil_matrix(f).T).todense()).flatten(1)

        t = cpu()
        x, info = scipy.sparse.linalg.cg(AA, AF, callback = iterations.increment)
        t2 = cpu()

        error = reshape(x, (sq,sq)) - solution

        print "\titerations:",iterations.count
        print "\ttime taken:",t2-t
        print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
        #savePlot(array(abs(error)), "cgn_"+str(sq)+".eps")
        savePlot(array(computeRelativeError(error,solution)), "cgn_"+str(sq)+".eps")
        #print reshape(x, (sq,sq))
        print ""
    except ValueError as e:
        print "Caught Exception: ",e


    # by BICGstab
    print "by BICGstab:"
    iterations = Counter()
    t = cpu()
    x, info = scipy.sparse.linalg.bicgstab(Adia, f, callback = iterations.increment)
    t2 = cpu()

    error = reshape(x, (sq,sq)) - solution

    print "\titerations:",iterations.count
    print "\ttime taken:",t2-t
    print "\trelative error norms:",linalg.norm(error, 1)/SolNorm_1, linalg.norm(error, 2)/SolNorm_2, linalg.norm(error, inf)/SolNorm_Inf
    #savePlot(array(abs(error)), "bicgstab_"+str(sq)+".eps")
    savePlot(array(computeRelativeError(error,solution)), "bicgstab_"+str(sq)+".eps")
    #print reshape(x, (sq,sq))
    print ""


# Samples the analytical solution to the PDE given by:
#   f(x,y) = x*(1-x) * y*(1-y)
#   w/ boundary conditions equal to 0.
def sampleAnalyitcal(xi, yi, num):
    sol = zeros((xi.shape[0], yi.shape[0]))

    for n in range(num+1):
        N = n + 1
        Npi = N * math.pi

        for m in range(num+1):
            M = m + 1
            Mpi = M * math.pi
            # Phi
            phi = dot(sin(Mpi * xi).T, sin(Npi * yi))

            # Emn
            emn = (-4 / ((Mpi)**2 + (Npi)**2)) * \
                  (2 * (-2 + 2*cos(Mpi) + Mpi * sin(Mpi)) * \
                   (Npi * cos(Npi/2) - 2*sin(Npi / 2)) * sin(Npi / 2)) / \
                  (M**3 * N**3 * math.pi**6)

            sol = sol + emn * phi

    return sol


#####
# Test Drive Code Follows
#####

size = 33    # m = 5
#size = 65   # m = 6
#size = 129  # m = 7

print "A is: ",size,"x",size
print ""

h = 1.0/(size - 1)
hSq = size*size
A = scipy.sparse.lil_matrix((hSq,hSq))

for i in range(size):
    for j in range(size):
        if((i == 0) or (i == size-1) or (j == 0) or (j == size-1)):  # boundary
            A[j*size+i, j*size+i] = 1
        else:  # set up the stencil
            A[j*size+i, j*size+i] = -4        # center
            A[j*size+i, j*size+i-1] = 1       # left
            A[j*size+i, j*size+i+1] = 1       # right
            A[j*size+i, j*size+i-size] = 1    # down
            A[j*size+i, j*size+i+size] = 1    # up


xi = matrix(linspace(0,1,size))
yi = matrix(linspace(0,1,size))

# (n, m) = 100 is good enough for machine epsilon. (kind of lazy)
solution = sampleAnalyitcal(xi, yi, 25)


# Sample f(x,y) = [x(1-x) * y(1-y)] * h**2
F = dot(multiply(xi, 1 - xi).T, multiply(yi, 1 - yi)) * (h**2)

# DO IT!
doIt(A, F, solution)
gauss_seidel.f90
! Apply Gauss-Seidel iter times
SUBROUTINE gauss_seidel(m,n,u0,f,tol,mode,u,err,k)
  IMPLICIT NONE
  ! Interface declarations
  ! INs
  INTEGER, INTENT(IN) :: m,n
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: u0
  ! Note: f is assumed to be premultiplied by h**2
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: f
  DOUBLE PRECISION, INTENT(IN) :: tol
  INTEGER, INTENT(IN) :: mode
  ! OUTs
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(0:m,0:n) :: u
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(3) :: err
  INTEGER, INTENT(OUT) :: k
  ! Internal declarations
  INTEGER i,j
  DOUBLE PRECISION :: ustar,uold,nrm(3)

  ! Boundary conditions u(0,j),u(i,0),u(m,j),u(i,n)
  ! are assumed to be predefined
  u = u0

  ! mode = 0 -> row-inner
  ! mode = 1 -> col-inner
  ! mode = 2 -> row/col
  IF (mode .eq. 0) THEN
    ! row-inner
    DO k=1,100000 ! arbitrary max iteration.
      err = 0.; nrm = 0.
      DO i=1,m-1
        DO j=1,n-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
      err = err/nrm; err(2) = SQRT(err(2))
      IF(err(2) .le. tol) THEN
        EXIT ! close enough on 2-norm. exit.
      ENDIF
    END DO
  ELSE IF (mode .eq. 1) THEN
    ! col-inner
    DO k=1,100000 ! arbitrary max iteration.
      err = 0.; nrm = 0.
      DO j=1,n-1
        DO i=1,m-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
      err = err/nrm; err(2) = SQRT(err(2))
      IF(err(2) .le. tol) THEN
        EXIT ! close enough on 2-norm. exit.
      ENDIF
    END DO
  ELSE
    DO k=1,100000 ! arbitrary max iteration.
      err = 0.; nrm = 0.
      IF(MOD(k, 2) .eq. 1) THEN
        ! row-inner
        DO i=1,m-1
          DO j=1,n-1
            uold = u(i,j)
            ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
            err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
            err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
            err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
            u(i,j) = ustar
          END DO
        END DO
      ELSE
        ! col-inner
        DO j=1,n-1
          DO i=1,m-1
            uold = u(i,j)
            ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
            err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
            err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
            err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
            u(i,j) = ustar
          END DO
        END DO
      ENDIF
      err = err/nrm; err(2) = SQRT(err(2))
      IF(err(2) .le. tol) THEN
        EXIT ! close enough on 2-norm. exit.
      ENDIF
    END DO
  ENDIF
END SUBROUTINE gauss_seidel
gauss_seidel_flip.f90
! Apply Gauss-Seidel iter times
SUBROUTINE gauss_seidel_flip(m,n,u0,f,tol,u,err,k)
  IMPLICIT NONE
  ! Interface declarations
  ! INs
  INTEGER, INTENT(IN) :: m,n
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: u0
  ! Note: f is assumed to be premultiplied by h**2
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: f
  DOUBLE PRECISION, INTENT(IN) :: tol
  ! OUTs
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(0:m,0:n) :: u
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(3) :: err
  INTEGER, INTENT(OUT) :: k
  ! Internal declarations
  INTEGER i,j,temp
  DOUBLE PRECISION :: ustar,uold,nrm(3)

  ! Boundary conditions u(0,j),u(i,0),u(m,j),u(i,n)
  ! are assumed to be predefined
  u = u0

  DO k=1,100000 ! arbitrary max iteration.
    err = 0.; nrm = 0.
    temp = MOD(k,4)
    ! Iterate across the matrix in different orders.
    IF(temp .eq. 0) THEN
      ! ++
      DO i=1,m-1
        DO j=1,n-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
    ELSE IF (temp .eq. 1) THEN
      ! +-
      DO i=1,m-1
        DO j=n-1,1,-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
    ELSE IF (temp .eq. 2) THEN
      ! -+
      DO i=m-1,1,-1
        DO j=1,n-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
    ELSE
      ! --
      DO i=m-1,1,-1
        DO j=n-1,1,-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          err(1) = err(1) + ABS(ustar-uold); nrm(1) = nrm(1) + ABS(ustar)
          err(2) = err(2) + (ustar-uold)**2; nrm(2) = nrm(2) + ustar**2
          err(3) = MAX(err(3),ABS(ustar-uold)); nrm(3) = MAX(nrm(3),ABS(ustar))
          u(i,j) = ustar
        END DO
      END DO
    ENDIF
    err = err/nrm; err(2) = SQRT(err(2))
    IF(err(2) .le. tol) THEN
      EXIT ! close enough on 2-norm. exit.
    ENDIF
  END DO
END SUBROUTINE gauss_seidel_flip
sor.f90
! Apply SOR iter times
SUBROUTINE sor(m,n,omega,u0,f,tol,u,err,k)
  IMPLICIT NONE
  ! Interface declarations
  ! INs
  INTEGER, INTENT(IN) :: m,n
  DOUBLE PRECISION, INTENT(IN) :: omega
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: u0
  ! Note: f is assumed to be premultiplied by h**2
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: f
  DOUBLE PRECISION, INTENT(IN) :: tol
  ! OUTs
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(0:m,0:n) :: u
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(3) :: err
  INTEGER, INTENT(OUT) :: k
  ! Internal declarations
  INTEGER i,j
  DOUBLE PRECISION :: ustar,uold,unew,nrm(3)

  ! Boundary conditions u(0,j),u(i,0),u(m,j),u(i,n)
  ! are assumed to be predefined
  u = u0

  DO k=1,100000 ! arbitrary maximum number of iterations.
    err = 0.; nrm = 0.
    DO i=1,m-1
      DO j=1,n-1
        uold = u(i,j)
        ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
        unew = uold + omega*(ustar-uold)
        err(1) = err(1) + ABS(unew-uold); nrm(1) = nrm(1) + ABS(unew)
        err(2) = err(2) + (unew-uold)**2; nrm(2) = nrm(2) + unew**2
        err(3) = MAX(err(3),ABS(unew-uold)); nrm(3) = MAX(nrm(3),ABS(unew))
        u(i,j) = unew
      END DO
    END DO
    err = err/nrm; err(2) = SQRT(err(2))
    IF(err(2) .le. tol) THEN
      EXIT ! close enough on 2-norm. exit.
    ENDIF
  END DO
END SUBROUTINE sor
sor_flip.f90
! Apply SOR iter times
SUBROUTINE sor_flip(m,n,omega,u0,f,tol,u,err,k)
  IMPLICIT NONE
  ! Interface declarations
  ! INs
  INTEGER, INTENT(IN) :: m,n
  DOUBLE PRECISION, INTENT(IN) :: omega
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: u0
  ! Note: f is assumed to be premultiplied by h**2
  DOUBLE PRECISION, INTENT(IN), DIMENSION(0:m,0:n) :: f
  DOUBLE PRECISION, INTENT(IN) :: tol
  ! OUTs
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(0:m,0:n) :: u
  DOUBLE PRECISION, INTENT(OUT), DIMENSION(3) :: err
  INTEGER, INTENT(OUT) :: k
  ! Internal declarations
  INTEGER i,j,temp
  DOUBLE PRECISION :: ustar,uold,unew,nrm(3)

  ! Boundary conditions u(0,j),u(i,0),u(m,j),u(i,n)
  ! are assumed to be predefined
  u = u0

  DO k=1,100000 ! arbitrary maximum number of iterations.
    err = 0.; nrm = 0.
    temp = MOD(k,4)
    ! Iterate across the matrix in different orders.
    IF (temp .eq. 0) THEN
      ! ++
      DO i=1,m-1
        DO j=1,n-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          unew = uold + omega*(ustar-uold)
          err(1) = err(1) + ABS(unew-uold); nrm(1) = nrm(1) + ABS(unew)
          err(2) = err(2) + (unew-uold)**2; nrm(2) = nrm(2) + unew**2
          err(3) = MAX(err(3),ABS(unew-uold)); nrm(3) = MAX(nrm(3),ABS(unew))
          u(i,j) = unew
        END DO
      END DO
    ELSE IF (temp .eq. 1) THEN
      ! +-
      DO i=1,m-1
        DO j=n-1,1,-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          unew = uold + omega*(ustar-uold)
          err(1) = err(1) + ABS(unew-uold); nrm(1) = nrm(1) + ABS(unew)
          err(2) = err(2) + (unew-uold)**2; nrm(2) = nrm(2) + unew**2
          err(3) = MAX(err(3),ABS(unew-uold)); nrm(3) = MAX(nrm(3),ABS(unew))
          u(i,j) = unew
        END DO
      END DO
    ELSE IF (temp .eq. 2) THEN
      ! -+
      DO i=m-1,1,-1
        DO j=1,n-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          unew = uold + omega*(ustar-uold)
          err(1) = err(1) + ABS(unew-uold); nrm(1) = nrm(1) + ABS(unew)
          err(2) = err(2) + (unew-uold)**2; nrm(2) = nrm(2) + unew**2
          err(3) = MAX(err(3),ABS(unew-uold)); nrm(3) = MAX(nrm(3),ABS(unew))
          u(i,j) = unew
        END DO
      END DO
    ELSE
      ! --
      DO i=m-1,1,-1
        DO j=n-1,1,-1
          uold = u(i,j)
          ustar = 0.25d0*(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1)-f(i,j))
          unew = uold + omega*(ustar-uold)
          err(1) = err(1) + ABS(unew-uold); nrm(1) = nrm(1) + ABS(unew)
          err(2) = err(2) + (unew-uold)**2; nrm(2) = nrm(2) + unew**2
          err(3) = MAX(err(3),ABS(unew-uold)); nrm(3) = MAX(nrm(3),ABS(unew))
          u(i,j) = unew
        END DO
      END DO
    ENDIF
    err = err/nrm; err(2) = SQRT(err(2))
    IF(err(2) .le. tol) THEN
      EXIT ! close enough on 2-norm. exit.
    ENDIF
  END DO
END SUBROUTINE sor_flip