CSE 245 Lecture Notes

CSE 245: Computer Aided Circuit Simulation and Verification
Matrix Computations: Iterative Methods I
Chung-Kuan Cheng
Outline
- Introduction
- Direct Methods
- Iterative Methods
  - Formulations
  - Projection Methods
  - Krylov Space Methods
  - Preconditioned Iterations
  - Multigrid Methods
  - Domain Decomposition Methods
Introduction
[Overview diagram of solution methods:]
- Direct methods (LU decomposition, domain decomposition): general and robust, but can become complicated when N >= 1M.
- Iterative methods (Jacobi, Gauss-Seidel, conjugate gradient, GMRES, multigrid, with preconditioning): an excellent choice for SPD matrices, but remain an art for arbitrary matrices.
Introduction: Matrix Condition
Ax = b
With errors, we have
(A + eE) x(e) = b + ed
Thus, the deviation is
x(e) - x = e (A + eE)^-1 (d - Ex)
|x(e) - x| / |x| <= e |A^-1| (|d|/|x| + |E|) + O(e^2)
                 <= e |A| |A^-1| (|d|/|b| + |E|/|A|) + O(e^2)
We define the matrix condition number as
K(A) = |A| |A^-1|,
i.e.
|x(e) - x| / |x| <= K(A) (|ed|/|b| + |eE|/|A|) + O(e^2)
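As a quick illustration of this bound, here is a minimal NumPy sketch (not part of the original slides); the matrix is the same 2x2 SPD example used later in the notes, and the perturbation d is chosen arbitrarily:

```python
import numpy as np

# Sketch: compare the actual relative error caused by perturbing b
# with the first-order bound K(A) * |e d| / |b| from the slide.
A = np.array([[3.0, 2.0],
              [2.0, 6.0]])          # SPD example matrix (reused later in the notes)
b = np.array([2.0, -8.0])

x = np.linalg.solve(A, b)

eps = 1e-6                          # the small parameter e
d = np.array([1.0, 1.0])            # illustrative perturbation of the right-hand side
x_eps = np.linalg.solve(A, b + eps * d)

rel_err = np.linalg.norm(x_eps - x) / np.linalg.norm(x)
bound   = np.linalg.cond(A, 2) * (eps * np.linalg.norm(d) / np.linalg.norm(b))

print("K(A)              =", np.linalg.cond(A, 2))   # |A| * |A^-1| in the 2-norm
print("relative error    =", rel_err)
print("first-order bound =", bound)                  # rel_err <= bound (to first order)
```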
Introduction: Gershgorin Circle Theorem
For every eigenvalue r of matrix A, there exists an i such that
|r - a_ii| <= sum_{j != i} |a_ij|
Proof: Given r and an eigenvector v such that Av = rv,
let i be an index with |v_i| >= |v_j| for all j != i. We have
sum_j a_ij v_j = r v_i
Thus
(r - a_ii) v_i = sum_{j != i} a_ij v_j
r - a_ii = (sum_{j != i} a_ij v_j) / v_i
Therefore, since |v_j| / |v_i| <= 1,
|r - a_ii| <= sum_{j != i} |a_ij|
Note: if equality holds, then the |v_i| are equal for all i.
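The theorem is easy to check numerically. A small sketch (not from the slides), with an arbitrary illustrative matrix:

```python
import numpy as np

# Sketch: build the Gershgorin discs of a matrix and verify that
# every eigenvalue lies in at least one disc.
A = np.array([[ 4.0,  1.0,  0.5],
              [-1.0,  3.0,  1.0],
              [ 0.5,  1.0, -2.0]])

centers = np.diag(A)                                   # a_ii
radii   = np.sum(np.abs(A), axis=1) - np.abs(centers)  # sum_{j != i} |a_ij|

for lam in np.linalg.eigvals(A):
    in_some_disc = np.any(np.abs(lam - centers) <= radii)
    print("eigenvalue", lam, "inside some disc:", in_some_disc)
```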
Iterative Methods
Stationary:
x^(k+1) = G x^(k) + c
where G and c do not depend on the iteration count k.
Non-stationary:
x^(k+1) = x^(k) + a_k p^(k)
where the computation involves information that changes at each iteration.
Stationary: Jacobi Method
In the i-th equation, solve for the value of x_i while assuming the other entries of x remain fixed:
sum_{j=1}^{N} m_ij x_j = b_i   =>   x_i = (b_i - sum_{j != i} m_ij x_j) / m_ii
which gives the iteration
x_i^(k) = (b_i - sum_{j != i} m_ij x_j^(k-1)) / m_ii
In matrix terms the method becomes:
x^(k) = D^-1 (L + U) x^(k-1) + D^-1 b
where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, i.e., M = D - L - U.
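A minimal sketch of the Jacobi iteration above, assuming NumPy; the function name, tolerance, and iteration limit are illustrative:

```python
import numpy as np

# Jacobi sketch: x^(k) = D^-1 (L + U) x^(k-1) + D^-1 b.
def jacobi(M, b, x0=None, tol=1e-10, max_iter=500):
    D_diag = np.diag(M)                     # diagonal entries m_ii
    R = M - np.diag(D_diag)                 # off-diagonal part, equal to -(L + U)
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    for k in range(max_iter):
        x_new = (b - R @ x) / D_diag        # all entries use the previous iterate
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

M = np.array([[3.0, 2.0], [2.0, 6.0]])      # diagonally dominant, SPD
b = np.array([2.0, -8.0])
x, iters = jacobi(M, b)
print(x, iters)                             # converges to the solution of Mx = b
```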
Stationary: Gauss-Seidel
Like Jacobi, but now assume that previously computed results are used as soon as they are available:
x_i^(k) = (b_i - sum_{j < i} m_ij x_j^(k) - sum_{j > i} m_ij x_j^(k-1)) / m_ii
In matrix terms the method becomes:
x^(k) = (D - L)^-1 (U x^(k-1) + b)
where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, i.e., M = D - L - U.
Stationary: Successive Overrelaxation (SOR)
Devised by applying extrapolation to Gauss-Seidel in the form of a weighted average:
x_i^(k) = w xbar_i^(k) + (1 - w) x_i^(k-1)
where xbar_i^(k) = (b_i - sum_{j < i} m_ij x_j^(k) - sum_{j > i} m_ij x_j^(k-1)) / m_ii is the Gauss-Seidel iterate.
In matrix terms the method becomes:
x^(k) = (D - wL)^-1 (wU + (1 - w)D) x^(k-1) + w (D - wL)^-1 b
where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, i.e., M = D - L - U.
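A sketch of SOR exactly as described above (Gauss-Seidel update blended with the previous iterate), assuming NumPy; the default w = 1.5 is only an illustrative choice:

```python
import numpy as np

# SOR sketch: x_i^(k) = w * xbar_i + (1 - w) * x_i^(k-1),
# where xbar_i is the Gauss-Seidel update of component i.
def sor(M, b, w=1.5, x0=None, tol=1e-10, max_iter=500):
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            gs = (b[i] - M[i, :i] @ x[:i] - M[i, i+1:] @ x_old[i+1:]) / M[i, i]
            x[i] = w * gs + (1.0 - w) * x_old[i]
        if np.linalg.norm(x - x_old) < tol:
            return x, k + 1
    return x, max_iter

print(sor(np.array([[3.0, 2.0], [2.0, 6.0]]), np.array([2.0, -8.0]), w=1.2))
```

With w = 1 the update reduces to plain Gauss-Seidel.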
SOR
- Choose w to accelerate the convergence:
  x_i^(k) = w xbar_i^(k) + (1 - w) x_i^(k-1)
- w = 1: reduces to Gauss-Seidel
- 1 < w < 2: over-relaxation
- w < 1: under-relaxation
Convergence of Stationary Method
- Linear equation: Mx = b
- A sufficient condition for convergence of Jacobi and Gauss-Seidel is that the matrix M is strictly diagonally dominant:
  |m_ii| > sum_{j=1, j != i}^{N} |m_ij|   for all i
- If M is symmetric positive definite, SOR converges for any w with 0 < w < 2.
- A necessary and sufficient condition for convergence is that the magnitude of the largest eigenvalue (the spectral radius) of the iteration matrix G is smaller than 1, where
  - Jacobi:       G = D^-1 (L + U)
  - Gauss-Seidel: G = (D - L)^-1 U
  - SOR:          G = (D - wL)^-1 (wU + (1 - w)D)
Convergence of Gauss-Seidel
Claim: for a symmetric positive definite matrix A = D - L - L^T (so that U = L^T), the eigenvalues of G = (D - L)^-1 L^T lie inside the unit circle.
Proof:
G_1 = D^(1/2) G D^(-1/2) = (I - L_1)^-1 L_1^T, where L_1 = D^(-1/2) L D^(-1/2).
Let G_1 x = r x, with x normalized so that x^T x = 1. Then
L_1^T x = r (I - L_1) x
x^T L_1^T x = r (1 - x^T L_1 x)
Writing y = x^T L_1 x (and noting x^T L_1^T x = y), this is
y = r (1 - y), so r = y / (1 - y), and |r| <= 1 iff Re(y) <= 1/2.
Since A = D - L - L^T is positive definite, D^(-1/2) A D^(-1/2) = I - L_1 - L_1^T is positive definite, hence
1 - 2 x^T L_1 x > 0, i.e. y < 1/2, and therefore |r| < 1.
Linear Equation: an optimization problem
- Quadratic function of vector x:
  f(x) = (1/2) x^T A x - b^T x + c
- Matrix A is positive definite if x^T A x > 0 for any nonzero vector x.
- If A is symmetric positive definite, f(x) is minimized by the solution of Ax = b.
Linear Equation: an optimization problem
- Quadratic function:
  f(x) = (1/2) x^T A x - b^T x + c
- Derivative (gradient):
  ∇f(x) = (1/2) A^T x + (1/2) A x - b
- If A is symmetric:
  ∇f(x) = A x - b
- If A is positive definite:
  f(x) is minimized by setting ∇f(x) to 0, i.e., Ax = b,
  for a symmetric positive definite matrix A.
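A small numerical sketch of this connection (not from the slides), using the 2x2 example matrix from the later steepest descent slide; the constant c and the random test points are illustrative:

```python
import numpy as np

# Sketch: for symmetric positive definite A, f(x) = 0.5 x^T A x - b^T x + c
# has gradient A x - b, so its minimizer is exactly the solution of A x = b.
A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
c = 0.0

f     = lambda x: 0.5 * x @ A @ x - b @ x + c
gradf = lambda x: A @ x - b

x_star = np.linalg.solve(A, b)
print("gradient at the solution:", gradf(x_star))   # ~ [0, 0]

for _ in range(5):                                  # f is larger at nearby points
    p = x_star + np.random.randn(2)
    print(f(p) > f(x_star))                         # True: f(p) = f(x*) + 0.5 (p-x*)^T A (p-x*)
```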
Gradient of quadratic form
The gradient ∇f(x) points in the direction of steepest increase of f(x).
Symmetric Positive-Definite Matrix A
If A is symmetric positive definite, p is an arbitrary point, and x is the solution point x = A^-1 b, then
f(p) = f(x) + (1/2) (p - x)^T A (p - x)
Since (1/2) (p - x)^T A (p - x) > 0 for p != x, we have
f(p) > f(x) if p != x.
If A is not positive definite:
[Figure: quadratic forms for (a) a positive-definite matrix, (b) a negative-definite matrix, (c) a singular matrix, (d) a positive-indefinite matrix.]
Non-stationary Iterative Method
- Start from an initial guess x_0 and adjust it until it is close enough to the exact solution:
  x_(i+1) = x_(i) + a_(i) p_(i),  i = 0, 1, 2, 3, ...
  p_(i): adjustment direction
  a_(i): step size
- How do we choose the direction and the step size?
Steepest Descent Method (1)
- Choose the direction in which f decreases most quickly: the direction opposite to ∇f(x_(i)):
  -∇f(x_(i)) = b - A x_(i) = r_(i)
  which is also the direction of the residual.
- Update:
  x_(i+1) = x_(i) + a_(i) r_(i)
Steepest Descent Method (2)
- How to choose the step size? Line search:
  a_(i) should minimize f along the direction r_(i), which means (d/da) f(x_(i+1)) = 0.
  (d/da) f(x_(i+1)) = ∇f(x_(i+1))^T (d/da) x_(i+1) = ∇f(x_(i+1))^T r_(i) = 0
  => r_(i+1)^T r_(i) = 0          (since ∇f(x_(i+1)) = -r_(i+1); successive residuals are orthogonal)
  => (b - A x_(i+1))^T r_(i) = 0
  => (b - A (x_(i) + a_(i) r_(i)))^T r_(i) = 0
  => (b - A x_(i))^T r_(i) = a_(i) (A r_(i))^T r_(i)
  => a_(i) = (r_(i)^T r_(i)) / (r_(i)^T A r_(i))
Steepest Descent Algorithm
Given x_0, iterate until the residual is smaller than the error tolerance:
  r_(i) = b - A x_(i)
  a_(i) = (r_(i)^T r_(i)) / (r_(i)^T A r_(i))
  x_(i+1) = x_(i) + a_(i) r_(i)
Steepest Descent Method: example
  [3 2] [x_1]   [ 2]
  [2 6] [x_2] = [-8]
[Figure:
(a) Starting at (-2, -2), take the direction of steepest descent of f.
(b) Find the point on the intersection of these two surfaces that minimizes f.
(c) Intersection of surfaces.
(d) The gradient at the bottommost point is orthogonal to the gradient of the previous step.]
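Running the steepest_descent sketch above on this example reproduces the figure's setup; the starting point (-2, -2) is taken from the slide:

```python
import numpy as np

A  = np.array([[3.0, 2.0], [2.0, 6.0]])
b  = np.array([2.0, -8.0])
x0 = np.array([-2.0, -2.0])            # starting point from the figure

x, iters = steepest_descent(A, b, x0)  # sketch defined after the algorithm slide
print(x, iters)                        # converges to the exact solution [2, -2]
```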
Iterations of Steepest Descent Method
[Figure: iterates of the steepest descent method.]
Convergence of Steepest Descent-1
Let v_k = [0, 0, ..., 1, ..., 0, 0]^T (1 in the k-th position), k = 1, ..., n.
Eigenvectors: express the error in the eigenvector basis {v_j} of A,
  e_(i) = sum_{j=1}^{n} ξ_j v_j
Eigenvalues: λ_j, j = 1, 2, ..., n
Energy norm: ||e||_A = (e^T A e)^(1/2)
Convergence of Steepest Descent-2
||e_(i+1)||_A^2 = e_(i+1)^T A e_(i+1)
  = (e_(i) + a_(i) r_(i))^T A (e_(i) + a_(i) r_(i))
  = e_(i)^T A e_(i) + 2 a_(i) r_(i)^T A e_(i) + a_(i)^2 r_(i)^T A r_(i)
  = ||e_(i)||_A^2 - 2 (r_(i)^T r_(i))^2 / (r_(i)^T A r_(i)) + ((r_(i)^T r_(i)) / (r_(i)^T A r_(i)))^2 r_(i)^T A r_(i)      (using r_(i) = -A e_(i))
  = ||e_(i)||_A^2 - (r_(i)^T r_(i))^2 / (r_(i)^T A r_(i))
  = ||e_(i)||_A^2 [ 1 - (r_(i)^T r_(i))^2 / ((r_(i)^T A r_(i)) (e_(i)^T A e_(i))) ]
  = ||e_(i)||_A^2 w^2
where, writing e_(i) = sum_j ξ_j v_j with eigenvalues λ_j,
  w^2 = 1 - (sum_j ξ_j^2 λ_j^2)^2 / ((sum_j ξ_j^2 λ_j^3)(sum_j ξ_j^2 λ_j))
Convergence Study (n=2)
Assume λ_1 >= λ_2 > 0. Let
  k = λ_1 / λ_2   (spectral condition number)
  u = ξ_2 / ξ_1
and e_(i) = sum_{j=1}^{2} ξ_j v_j. Then
  w^2 = 1 - (ξ_1^2 λ_1^2 + ξ_2^2 λ_2^2)^2 / ((ξ_1^2 λ_1 + ξ_2^2 λ_2)(ξ_1^2 λ_1^3 + ξ_2^2 λ_2^3))
      = 1 - (k^2 + u^2)^2 / ((k + u^2)(k^3 + u^2))
Plot of w
[Figure: w as a function of k and u.]
Case Study
[Figure: case study of steepest descent convergence.]
Bound of Convergence
2
2 2
(
k

u
)
2
  1
( k  u 2 )( k 3  u 2 )
4k 4
 1 5
k  2k 4  k 3
( k  1) 2

( k  1) 2
k 1

k 1
It can be proved that it is
also valid for n>2, where
k  max / min
30
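The bound is easy to observe numerically. A sketch (not from the slides) that runs a few steepest descent steps on the 2x2 example and compares the per-step reduction of the energy norm of the error with (k - 1)/(k + 1):

```python
import numpy as np

# Sketch: check ||e_(i+1)||_A <= ((k-1)/(k+1)) ||e_(i)||_A for the 2x2 example,
# with k = lambda_max / lambda_min (here k = 7/2 = 3.5, bound ~ 0.556).
A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
x_exact = np.linalg.solve(A, b)

lam = np.linalg.eigvalsh(A)           # eigenvalues in ascending order
k = lam[-1] / lam[0]
bound = (k - 1) / (k + 1)

energy = lambda e: np.sqrt(e @ A @ e)  # ||e||_A

x = np.array([-2.0, -2.0])            # starting point from the example slide
for i in range(5):
    r = b - A @ x
    a = (r @ r) / (r @ (A @ r))
    x_next = x + a * r
    ratio = energy(x_next - x_exact) / energy(x - x_exact)
    print(f"step {i}: ratio = {ratio:.4f}  (bound {bound:.4f})")
    x = x_next
```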