Introduction.. - Network Systems Laboratory

EE692
Parallel and Distributed Computation | Prof. Song Chong
Ch. 2 Algorithms for Systems of Linear Equations
Korea Advanced Institute of Science and Technology
No.1
Network Systems Lab.
Overview
■ Consider the system of linear equations
$$Ax = b$$
$A$: $n \times n$ real matrix, $b$: vector in $\mathbb{R}^n$
■ Direct methods find the "exact" solution (Sec. 2.1~2.3) with a finite number of operations, typically of the order of $n^3$
e.g.) Gaussian elimination
Overview (Cont’d)
■ Iterative methods do not obtain an exact solution of $Ax = b$ in finite time, but they converge to a solution asymptotically
➢ Often yield a solution, within acceptable precision, after a relatively small number of iterations
➢ Usually preferred when $n$ is very large
➢ May have smaller storage requirements than direct methods
■ Performance measures
➢ Direct method: complexity
➢ Iterative method: speed of convergence
e.g.) geometric convergence: $\|x(t) - x^*\| \le c\,\rho^t$ for some constants $c > 0$ and $0 < \rho < 1$
Classical Iterative Methods
■ Assume that $A$ is invertible so that $Ax = b$ has a unique solution. Write the $i$-th equation as
$$\sum_{j=1}^{n} a_{ij} x_j = b_i, \quad \forall i$$
■ Assuming $a_{ii} \neq 0$ and solving for $x_i$,
$$x_i = -\frac{1}{a_{ii}} \Big[ \sum_{j \neq i} a_{ij} x_j - b_i \Big], \quad \forall i \qquad (1)$$
■ If $x_j$, $j \neq i$, (or estimates of them) are available, one can obtain $x_i$ (or an estimate of $x_i$)
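Jacobi Algorithm
■ Starting with some initial vector $x(0) \in \mathbb{R}^n$, evaluate $x(t)$, $t = 1, 2, \ldots$, using the iteration
$$x_i(t+1) = -\frac{1}{a_{ii}} \Big[ \sum_{j \neq i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
■ If the sequence $\{x(t)\}$ converges to a limit $x^*$, then $x^*$ satisfies Eq. (1) for each $i$.
■ Condition for convergence?
e.g.)
>> Case 1: Convergence case
$$2x_1 - x_2 = 0 \ \ \text{(eq. 1)}, \qquad -x_1 + 2x_2 = 0 \ \ \text{(eq. 2)}$$
$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
>> Case 2: Divergence case (the same two equations, listed in the opposite order)
$$\begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$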
Jacobi Algorithm (Cont’d)
[Figure: Jacobi iterates x(0), x(1), x(2) in the (x1, x2) plane, two panels. Convergence case: first equation 2x1 - x2 = 0, second equation -x1 + 2x2 = 0. Divergence case: first equation -x1 + 2x2 = 0, second equation 2x1 - x2 = 0.]
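The convergence and divergence cases can be reproduced with a minimal Jacobi sketch in plain Python. The function names, the starting point x(0) = (1, 1) and the number of sweeps are illustrative assumptions, not part of the slides.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: every x_i is recomputed from the old vector x only."""
    n = len(b)
    return [
        -(sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i]) / A[i][i]
        for i in range(n)
    ]


def run_jacobi(A, b, x0, sweeps):
    """Apply a fixed number of Jacobi sweeps starting from x0."""
    x = list(x0)
    for _ in range(sweeps):
        x = jacobi_step(A, b, x)
    return x


b = [0.0, 0.0]
x0 = [1.0, 1.0]                      # assumed starting point
# Case 1: equations in the order (2x1 - x2 = 0, -x1 + 2x2 = 0)
x_conv = run_jacobi([[2.0, -1.0], [-1.0, 2.0]], b, x0, 20)
# Case 2: the same equations in the opposite order
x_div = run_jacobi([[-1.0, 2.0], [2.0, -1.0]], b, x0, 20)
```

In Case 1 each sweep halves the iterate, so it shrinks toward the solution x* = 0; in Case 2 each sweep doubles it, so the iteration diverges even though the system has the same solution.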
Gauss-Seidel Algorithm
■ Starting with $x(0) \in \mathbb{R}^n$,
$$x_i(t+1) = -\frac{1}{a_{ii}} \Big[ \sum_{j < i} a_{ij} x_j(t+1) + \sum_{j > i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
■ Any other order of updating is possible
➢ Different orders of updating may produce substantially different results for the same problem
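A minimal Gauss-Seidel sketch in plain Python, with an optional update order. The function name, the `order` parameter and the right-hand side b = (1, 1) are assumptions chosen for illustration.

```python
def gauss_seidel_sweep(A, b, x, order=None):
    """One Gauss-Seidel sweep; x is updated in place, in the given order."""
    n = len(b)
    for i in (order if order is not None else range(n)):
        # x already holds the new values of the components updated earlier in this sweep
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = -(s - b[i]) / A[i][i]
    return x


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]                  # illustrative right-hand side; exact solution is (1, 1)
x = [0.0, 0.0]
for _ in range(30):
    gauss_seidel_sweep(A, b, x)
```

Starting from (0, 0), one sweep in the order (1, 2) gives (0.5, 0.75), while the order (2, 1) gives (0.75, 0.5), which shows how the update order changes the iterates.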
Relaxation of Iterative Methods
■ Relaxation of Eq. (1) using a relaxation parameter $\gamma$ (between 0 and 1)
$$x_i = (1-\gamma)\, x_i - \frac{\gamma}{a_{ii}} \Big[ \sum_{j \neq i} a_{ij} x_j - b_i \Big], \quad \forall i$$
■ Jacobi overrelaxation (JOR)
$$x_i(t+1) = (1-\gamma)\, x_i(t) - \frac{\gamma}{a_{ii}} \Big[ \sum_{j \neq i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
➢ Convex combination of $x_i(t)$ and the Jacobi iteration
■ Gauss-Seidel overrelaxation, i.e., successive overrelaxation (SOR)
$$x_i(t+1) = (1-\gamma)\, x_i(t) - \frac{\gamma}{a_{ii}} \Big[ \sum_{j < i} a_{ij} x_j(t+1) + \sum_{j > i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
➢ Convex combination of $x_i(t)$ and the Gauss-Seidel iteration
■ JOR and SOR are widely used because they often converge faster if $\gamma$ is suitably chosen
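The two relaxed iterations can be sketched as follows. The function names, the test system and the value gamma = 0.5 are assumptions for illustration only.

```python
def jor_step(A, b, x, gamma):
    """One JOR sweep: convex combination of x_i(t) and the Jacobi update."""
    n = len(b)
    return [
        (1 - gamma) * x[i]
        - gamma / A[i][i] * (sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i])
        for i in range(n)
    ]


def sor_sweep(A, b, x, gamma):
    """One SOR sweep: like JOR, but x is updated in place, component by component."""
    n = len(b)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (1 - gamma) * x[i] - gamma / A[i][i] * (s - b[i])
    return x


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]                  # illustrative system with solution (1, 1)
gamma = 0.5                     # assumed relaxation parameter
x_jor = [0.0, 0.0]
x_sor = [0.0, 0.0]
for _ in range(100):
    x_jor = jor_step(A, b, x_jor, gamma)
    sor_sweep(A, b, x_sor, gamma)
```

With gamma = 1 both functions reduce to the plain Jacobi and Gauss-Seidel updates, respectively.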
Richardson’s method
■ The following equation is obtained by rewriting $Ax = b$:
$$Ax = b \iff x = x - \gamma\,(Ax - b)$$
$$\Rightarrow\ x_i(t+1) = x_i(t) - \gamma \Big[ \sum_{j} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
■ Richardson-Gauss-Seidel (RGS) method
$$x_i(t+1) = x_i(t) - \gamma \Big[ \sum_{j < i} a_{ij} x_j(t+1) + \sum_{j \ge i} a_{ij} x_j(t) - b_i \Big], \quad \forall i$$
■ A more general form using an invertible matrix $B$
$$Ax = b \iff x = x - B\,(Ax - b)$$
$$\Rightarrow\ x(t+1) = x(t) - B\,\big(Ax(t) - b\big)$$
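A minimal sketch of Richardson's iteration, x(t+1) = x(t) - gamma (A x(t) - b), in plain Python. The function name, the test system and the step size gamma = 0.5 are illustrative assumptions; no division by the diagonal entries is needed, unlike in the Jacobi update.

```python
def richardson_step(A, b, x, gamma):
    """One Richardson sweep: move x against the residual A x - b, scaled by gamma."""
    n = len(b)
    return [
        x[i] - gamma * (sum(A[i][j] * x[j] for j in range(n)) - b[i])
        for i in range(n)
    ]


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]                  # illustrative system with solution (1, 1)
x = [0.0, 0.0]
for _ in range(60):
    x = richardson_step(A, b, x, gamma=0.5)   # assumed step size
```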
Parallel Implementation
■ Jacobi, JOR and Richardson’s algorithms are straightforward to implement in parallel
■ Gauss-Seidel, SOR and RGS algorithms are in general not well suited for parallel implementation because they are inherently sequential
■ Typical termination criterion used in practice
$$\|Ax(t) - b\| \le \varepsilon$$
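The termination test can be sketched as a loop that stops once the residual norm falls below a tolerance. The choice of the max-norm, the tolerance 1e-8, the sweep cap and all function names are assumptions for illustration.

```python
def residual_norm(A, b, x):
    """Max-norm of A x - b (the choice of norm is an assumption)."""
    n = len(b)
    return max(abs(sum(A[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))


def jacobi_step(A, b, x):
    """One Jacobi sweep, used here only as an example iteration."""
    n = len(b)
    return [
        -(sum(A[i][j] * x[j] for j in range(n) if j != i) - b[i]) / A[i][i]
        for i in range(n)
    ]


def iterate_until(step, A, b, x, eps=1e-8, max_sweeps=10_000):
    """Repeat `step` until ||A x - b|| <= eps or the sweep cap is reached."""
    sweeps = 0
    while residual_norm(A, b, x) > eps and sweeps < max_sweeps:
        x = step(A, b, x)
        sweeps += 1
    return x, sweeps


A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 1.0]                  # illustrative system with solution (1, 1)
x, sweeps = iterate_until(jacobi_step, A, b, [0.0, 0.0])
```

The sweep cap guards against iterations that do not converge, such as the divergence case of the Jacobi example.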
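Applications: Poisson’s equation
■ Find a function $f: [0,1]^2 \to \mathbb{R}$ that satisfies
$$\frac{\partial^2 f}{\partial x^2}(x,y) + \frac{\partial^2 f}{\partial y^2}(x,y) = g(x,y), \quad (x,y) \in [0,1]^2 \qquad (1)$$
where $g: [0,1]^2 \to \mathbb{R}$ is a known function and $f$ has prescribed values on the boundary of the unit square.
■ Let
$$f_{i,j} = f\Big(\frac{i}{N}, \frac{j}{N}\Big), \qquad g_{i,j} = g\Big(\frac{i}{N}, \frac{j}{N}\Big), \qquad 0 \le i, j \le N$$
[Figure: $(N+1) \times (N+1)$ grid on the unit square with corner points (0,0), (N,0), (0,N), (N,N) and grid spacing $\Delta = 1/N$.]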
Applications: Poisson’s equation (cont’d)
■ Assume that $f$ is sufficiently smooth and that $\Delta$ is a small scalar. Then
$$\frac{\partial^2 f}{\partial x^2}(x,y) \approx \frac{1}{\Delta^2} \big[ f(x+\Delta, y) - 2f(x,y) + f(x-\Delta, y) \big] \qquad (2)$$
$$\frac{\partial^2 f}{\partial y^2}(x,y) \approx \frac{1}{\Delta^2} \big[ f(x, y+\Delta) - 2f(x,y) + f(x, y-\Delta) \big] \qquad (3)$$
by Prop. A.33 in Appendix A.
■ By plugging (2) and (3) into (1), with $\Delta = 1/N$,
$$f_{i,j} = \frac{1}{4} \big( f_{i+1,j} + f_{i-1,j} + f_{i,j+1} + f_{i,j-1} \big) - \frac{1}{4N^2}\, g_{i,j}, \quad 0 < i, j < N$$
■ A system of $(N-1)^2$ linear equations in $(N-1)^2$ unknowns, i.e., it can be represented in the form $Ax = b$.
Applications: Poisson’s equation (cont’d)
■ JOR algorithm
$$f_{i,j}(t+1) = (1-\gamma)\, f_{i,j}(t) + \frac{\gamma}{4} \big[ f_{i+1,j}(t) + f_{i-1,j}(t) + f_{i,j+1}(t) + f_{i,j-1}(t) \big] - \frac{\gamma}{4N^2}\, g_{i,j}, \quad 0 < i, j < N$$
where $f_{i,j}(t) = f_{i,j}$ are known whenever $i$ or $j$ is equal to 0 or $N$.
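A minimal sketch of this JOR iteration on the grid, in plain Python. The function name, gamma = 0.9, the sweep count and the test problem are assumptions: the boundary data come from f(x, y) = x^2 + y^2 with g = 4, for which the five-point formula is exact at the grid points.

```python
def poisson_jor(N, g, boundary, gamma=0.9, sweeps=500):
    """JOR for the discretized Poisson equation on an (N+1) x (N+1) grid."""
    # f[i][j] approximates f(i/N, j/N); boundary values stay fixed, interior starts at 0
    f = [[boundary(i / N, j / N) if i in (0, N) or j in (0, N) else 0.0
          for j in range(N + 1)] for i in range(N + 1)]
    for _ in range(sweeps):
        new = [row[:] for row in f]      # Jacobi-type: read old values, write new ones
        for i in range(1, N):
            for j in range(1, N):
                nbrs = f[i + 1][j] + f[i - 1][j] + f[i][j + 1] + f[i][j - 1]
                new[i][j] = ((1 - gamma) * f[i][j] + gamma / 4 * nbrs
                             - gamma / (4 * N * N) * g(i / N, j / N))
        f = new
    return f


def exact(x, y):
    """Assumed test solution: its second derivatives sum to 4."""
    return x * x + y * y


N = 6
f = poisson_jor(N, g=lambda x, y: 4.0, boundary=exact)
err = max(abs(f[i][j] - exact(i / N, j / N))
          for i in range(N + 1) for j in range(N + 1))
```

All interior points within one sweep depend only on values from the previous sweep, which is why this update parallelizes naturally over the grid.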
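Applications: Power Control of CDMA Uplink
[Figure: users 1, 2 and 3 transmitting to the base station with attenuations g1, g2 and g3.]
■ Assume $K$ users in a cell. The SINR per chip of user $i$, denoted by $\mathrm{SINR}_c^i$, is
$$\mathrm{SINR}_c^i = \frac{\varepsilon_c^i}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} \varepsilon_c^j + N_0}$$
where $\varepsilon_c^i$ is the received energy per chip for user $i$ and $N_0$ is the noise.
■ Since each bit is encoded onto a pseudonoise sequence of length $G_i$ chips at the transmitter, the received energy per bit for user $i$ is $\varepsilon_b^i = G_i\, \varepsilon_c^i$.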
Applications: Power Control of CDMA Uplink (cont’d)
■ The SINR of user $i$, or equivalently the ratio of the received energy per bit to the interference and noise per chip (commonly called $\varepsilon_b^i / I_0^i$ in the CDMA literature), is
$$\mathrm{SINR}_i = \frac{\varepsilon_b^i}{I_0^i} = \frac{G_i\, \varepsilon_c^i}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} \varepsilon_c^j + N_0} = \frac{G_i\, \dfrac{p_i g_i}{W}}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} \dfrac{p_j g_j}{W} + N_0}$$
where $p_i$ (joules/sec) is the transmit power of user $i$ and $g_i$ is the attenuation of user $i$’s signal to the base station. Hence
$$\mathrm{SINR}_i = \frac{G_i\, p_i g_i}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} p_j g_j + N_0 W}, \quad i = 1, \ldots, K$$
Applications: Power Control of CDMA Uplink (cont’d)
■ To achieve equally reliable communication,
$$\mathrm{SINR}_i \ge \beta$$
where $\beta$ is a certain threshold.
■ The data rate of user $i$, $R_i$ (bits/sec), is
$$R_i = \frac{W}{G_i}$$
and $G_i$ is called the processing gain of user $i$.
Applications: Power Control of CDMA Uplink (cont’d)
■ The power control problem of the CDMA uplink is to find the minimal nonnegative transmit power vector $p = [p_1, p_2, \ldots, p_K]$ satisfying
$$\frac{G_i\, p_i g_i}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} p_j g_j + N_0 W} \ge \beta, \quad i = 1, \ldots, K$$
That is, find a nonnegative $p$ satisfying
$$\frac{G_i\, p_i g_i}{\sum_{\substack{j=1,\ldots,K \\ j \neq i}} p_j g_j + N_0 W} = \beta, \quad i = 1, \ldots, K$$
$$\iff\ p_i = \frac{\beta}{G_i g_i} \sum_{\substack{j=1,\ldots,K \\ j \neq i}} p_j g_j + \frac{\beta N_0 W}{G_i g_i}, \quad i = 1, \ldots, K$$
■ A system of $K$ linear equations in $K$ unknowns, i.e., it can be represented in the form $Ax = b$.
Applications: Power Control of CDMA Uplink (cont’d)
■ JOR algorithm
For each user $i$,
$$p_i(t+1) = (1-\gamma)\, p_i(t) + \gamma \Big[ \frac{\beta}{G_i g_i} \sum_{\substack{j=1,\ldots,K \\ j \neq i}} p_j(t)\, g_j + \frac{\beta N_0 W}{G_i g_i} \Big]$$
where $\beta$, $G_i$, $g_i$, $N_0$ and $W$ are given.
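A minimal sketch of this JOR power update in plain Python. All numerical values (three users, the gains, the processing gains, beta, N0, W, gamma) and the function names are hypothetical and chosen only so that the target SINR is feasible.

```python
def sinr(p, g, G, N0, W):
    """SINR_i = G_i p_i g_i / (sum_{j != i} p_j g_j + N0 W) for every user i."""
    total = sum(pj * gj for pj, gj in zip(p, g))
    return [G[i] * p[i] * g[i] / (total - p[i] * g[i] + N0 * W)
            for i in range(len(p))]


def power_control_jor(g, G, beta, N0, W, gamma=0.8, iters=200):
    """JOR iteration on the transmit powers, starting from p = 0."""
    K = len(g)
    p = [0.0] * K
    for _ in range(iters):
        total = sum(pj * gj for pj, gj in zip(p, g))   # received power from all users
        p = [(1 - gamma) * p[i]
             + gamma * beta / (G[i] * g[i]) * (total - p[i] * g[i] + N0 * W)
             for i in range(K)]
    return p


# Hypothetical parameters (illustrative units)
g = [1.0, 0.5, 0.25]
G = [10.0, 10.0, 10.0]
beta, N0, W = 2.0, 0.5, 2.0
p = power_control_jor(g, G, beta, N0, W)
achieved = sinr(p, g, G, N0, W)
```

Each user's update needs only the total received power and its own parameters, so the updates for different users can be computed in parallel. In this example the user with the weakest gain ends up with the largest transmit power.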
Parallelization of Iterative Methods Using Dependency Graph
■ Consider a Jacobi-type iteration in the general form
$$x_i(t+1) = f_i\big(x_1(t), \ldots, x_n(t)\big), \quad i = 1, \ldots, n$$
■ The communication required for this iteration can be described by means of a directed graph $G = (N, A)$, called the dependency graph.
■ The set of nodes $N$ is $\{1, \ldots, n\}$, corresponding to the components of $x$. Let $(i, j)$ be an arc of the dependency graph if and only if the function $f_j$ depends on $x_i$.
e.g.)
$$x_1(t+1) = f_1\big(x_1(t), x_3(t)\big)$$
$$x_2(t+1) = f_2\big(x_1(t), x_2(t)\big)$$
$$x_3(t+1) = f_3\big(x_2(t), x_3(t), x_4(t)\big)$$
$$x_4(t+1) = f_4\big(x_2(t), x_4(t)\big)$$
[Figure: the corresponding dependency graph on nodes 1, 2, 3, 4.]
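The arc set of the example can be derived mechanically from the argument lists of f_1, ..., f_4. The dictionary layout and variable names below are assumptions for illustration.

```python
# depends_on[j] lists the components x_i that appear as arguments of f_j
depends_on = {1: [1, 3], 2: [1, 2], 3: [2, 3, 4], 4: [2, 4]}

# (i, j) is an arc if and only if f_j depends on x_i
arcs = sorted((i, j) for j, args in depends_on.items() for i in args)

# sends_to[i]: the components whose update needs the value of x_i
sends_to = {i: sorted(j for (a, j) in arcs if a == i) for i in depends_on}
```

Here `sends_to[2]` is `[2, 3, 4]`: after each iteration, the processor holding x_2 must communicate its value to the processors updating x_3 and x_4.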
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
■ The dependency over iterations can be described by means of a directed acyclic graph (DAG) where the nodes are of the form $(i, t)$ and the arcs are of the form $((i, t), (j, t+1))$.
[Figure: DAG for the example with layers t = 0, 1, 2, each containing the nodes (1,t), (2,t), (3,t), (4,t).]
The depth of a single iteration (sweep) is 1.
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
■ Consider a Gauss-Seidel type iteration in the general form
$$x_i(t+1) = f_i\big(x_1(t+1), \ldots, x_{i-1}(t+1), x_i(t), \ldots, x_n(t)\big), \quad i = 1, \ldots, n$$
➢ Often preferable since it incorporates the newest available information, thereby sometimes converging faster than the Jacobi type
➢ May be completely non-parallelizable since it is sequential in nature
■ When the dependency graph is sparse, it is possible that certain component updates can be parallelized
■ The degree of parallelism may depend on the update ordering
e.g.) ordering 1, 2, 3, 4
[Figure: DAG from the nodes (1,0), (2,0), (3,0), (4,0) to the nodes (1,1), (2,1), (3,1), (4,1) for this ordering.]
The depth of a single iteration (sweep) is 3.
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
e.g.) ordering 1, 3, 4, 2
[Figure: DAG on the nodes (1,0), (2,0), (3,0), (4,0), (1,1), (3,1), (4,1), (2,1) for this ordering.]
The depth of a single iteration (sweep) is 2.
■ Finding an optimal update ordering that maximizes parallelism in the Gauss-Seidel algorithm is equivalent to an optimal coloring problem.
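The sweep depths quoted for the two orderings can be checked with a short sketch. The function name and the level-based computation are assumptions: a component is placed one level after the deepest component it depends on that is updated earlier in the same sweep.

```python
# depends_on[i] lists the components that appear as arguments of f_i (example above)
depends_on = {1: [1, 3], 2: [1, 2], 3: [2, 3, 4], 4: [2, 4]}


def sweep_depth(order, depends_on):
    """Number of parallel steps needed for one Gauss-Seidel sweep in this order."""
    position = {i: k for k, i in enumerate(order)}
    level = {}
    for i in order:
        # only components updated earlier in the sweep supply new (t+1) values
        earlier = [j for j in depends_on[i] if position[j] < position[i]]
        level[i] = 1 + max((level[j] for j in earlier), default=0)
    return max(level.values())


depth_1234 = sweep_depth([1, 2, 3, 4], depends_on)
depth_1342 = sweep_depth([1, 3, 4, 2], depends_on)
```

For the ordering 1, 3, 4, 2 the components 1, 3 and 4 all sit on the first level, so they can be updated in parallel before component 2.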
Parallelization of Iterative Methods Using Dependency Graph (cont’d)
■ Given the dependency graph $G = (N, A)$, a coloring of $G$ using $K$ colors is defined as a mapping $h: N \to \{1, \ldots, K\}$ that assigns a color $k = h(i)$ to each node $i \in N$.
■ Prop. 2.5
There exists an ordering such that a sweep of the Gauss-Seidel algorithm can be performed in $K$ parallel steps if and only if there exists a coloring of the dependency graph that uses $K$ colors and with the property that there exists no positive cycle with all nodes on the cycle having the same color.
■ Prop. 2.6
Suppose that $(i, j) \in A$ if and only if $(j, i) \in A$. Then, there exists an ordering such that a sweep of the Gauss-Seidel algorithm can be performed in $K$ parallel steps if and only if there exists a coloring of the dependency graph that uses at most $K$ colors and such that adjacent nodes have different colors.
■ Unfortunately, optimal coloring problems are intractable, i.e., there is no known efficient algorithm for solving them.
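Since optimal coloring is intractable in general, a heuristic is a natural substitute. The sketch below uses greedy coloring, which is not guaranteed to be optimal; it is an assumption, not the method of the slides. The example is a symmetric dependency graph of a 4 x 4 grid with nearest-neighbor coupling, as arises from the Poisson discretization.

```python
def greedy_coloring(neighbors):
    """Give each node the smallest color (1, 2, ...) unused by its colored neighbors."""
    color = {}
    for node in neighbors:                       # visiting order = insertion order
        used = {color[m] for m in neighbors[node] if m in color}
        color[node] = next(k for k in range(1, len(neighbors) + 2) if k not in used)
    return color


n = 4
nodes = [(i, j) for i in range(n) for j in range(n)]
neighbors = {
    (i, j): [(i + di, j + dj)
             for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
             if 0 <= i + di < n and 0 <= j + dj < n]
    for (i, j) in nodes
}
color = greedy_coloring(neighbors)
num_colors = max(color.values())
```

Two colors suffice for this grid (the familiar red-black ordering), so by Prop. 2.6 a Gauss-Seidel sweep can be done in two parallel steps: all nodes of one color, then all nodes of the other.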
Convergence Analysis of Classical Iterative Methods
■ Prop. 4.1
If the sequence $\{x(t)\}$ generated by any of the algorithms presented above converges, then it converges to a solution of $Ax = b$.
Uniform representation of the different algorithms
■ Let $B = A - D$, where $D$ is a diagonal matrix whose entries are equal to the corresponding diagonal entries of $A$.
■ Assuming that the diagonal entries of $A$ are nonzero, the Jacobi algorithm can be written as
$$x(t+1) = -D^{-1}B\,x(t) + D^{-1}b \qquad \text{<Jacobi>}$$
➢ $Ax = b \iff (B + D)x = b \iff Dx = -Bx + b \iff x = -D^{-1}Bx + D^{-1}b$
■ Similarly, the JOR algorithm can be written as
$$x(t+1) = \big[(1-\gamma)I - \gamma D^{-1}B\big]\,x(t) + \gamma D^{-1}b \qquad \text{<JOR>}$$
Uniform representation of the different algorithms (cont’d)
Decompose $A = L + D + U$, where $L$ is strictly lower triangular, $D$ is diagonal, and $U$ is strictly upper triangular.
Then, the Gauss-Seidel algorithm can be written as
$$x(t+1) = -D^{-1}\big(Lx(t+1) + Ux(t) - b\big)$$
$$\Rightarrow\ x(t+1) = -(I + D^{-1}L)^{-1}D^{-1}U\,x(t) + (I + D^{-1}L)^{-1}D^{-1}b \qquad \text{<Gauss-Seidel>}$$
Uniform representation of the different algorithms (cont’d)
Similarly, SOR can be written as
$$x(t+1) = (1-\gamma)\,x(t) - \gamma D^{-1}\big(Lx(t+1) + Ux(t) - b\big)$$
$$\Rightarrow\ x(t+1) = (I + \gamma D^{-1}L)^{-1}\big[(1-\gamma)I - \gamma D^{-1}U\big]\,x(t) + \gamma\,(I + \gamma D^{-1}L)^{-1}D^{-1}b \qquad \text{<SOR>}$$
Finally, Richardson's method can be written as
$$x(t+1) = (I - \gamma A)\,x(t) + \gamma b \qquad \text{<Richardson>}$$
The uniform representation is
$$x(t+1) = Mx(t) + Gb$$
where $M$ is the iteration matrix.
Uniform representation of the different algorithms (cont’d)
■ Assume that $I - M$ is invertible (fact: $A$ invertible with nonzero diagonal entries $\Rightarrow$ $I - M$ invertible for all the algorithms; see Ex. 6.1). Then, there exists a unique $x^*$ satisfying $x^* = Mx^* + Gb$.
■ Let $y(t) = x(t) - x^*$. Then,
$$y(t+1) = My(t)$$
■ The solution is then $y(t) = M^t y(0)$ for every $t$.
■ $y(t) \to 0$ for every $y(0)$ iff $M^t \to 0$ iff all the eigenvalues of $M$ have a magnitude smaller than 1, i.e., the spectral radius $\rho(M) < 1$.
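The spectral radius criterion can be checked on the two 2 x 2 Jacobi examples given earlier (Case 1 and Case 2). The sketch below is limited to 2 x 2 matrices so that the eigenvalues follow from the characteristic polynomial; the function names are assumptions.

```python
import cmath


def jacobi_matrix(A):
    """Iteration matrix M = -D^{-1} B with B = A - D, for A given as nested lists."""
    n = len(A)
    return [[0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)]
            for i in range(n)]


def spectral_radius_2x2(M):
    """Largest eigenvalue magnitude of a 2 x 2 matrix (roots of l^2 - tr l + det)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + root) / 2), abs((tr - root) / 2))


rho_case1 = spectral_radius_2x2(jacobi_matrix([[2.0, -1.0], [-1.0, 2.0]]))
rho_case2 = spectral_radius_2x2(jacobi_matrix([[-1.0, 2.0], [2.0, -1.0]]))
```

Case 1 gives a spectral radius of 0.5 (convergence) and Case 2 gives 2 (divergence), in agreement with the behavior of the Jacobi iterates in the two cases.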
Uniform representation of the different algorithms (cont’d)
■ Prop. 6.1
Assume that $I - M$ is invertible, let $x^*$ satisfy $x^* = Mx^* + Gb$, and let $\{x(t)\}$ be the sequence generated by the iteration $x(t+1) = Mx(t) + Gb$. Then,
$$\lim_{t \to \infty} x(t) = x^* \ \text{ for all choices of } x(0) \ \text{ iff } \ \rho(M) < 1$$
■ Note: $G$ and $b$ have nothing to do with convergence
■ Proof) to be done!