Inverse–Based Multilevel ILU Preconditioners Theory, Applications

Inverse–Based Multilevel ILU Preconditioners
Theory, Applications and Software
Matthias Bollhöfer
Yousef Saad
TU Berlin / TU Braunschweig
Univ. of Minnesota
8- TH C OPPER M OUNTAIN C ONFERENCE ON I TERATIVE M ETHODS
M ARCH 28 – A PRIL 02, 2004
April 02, 2004
O UTLINE
1. inverse–based ILUs, theoretical background
2. Multilevel algorithms
3. ILUPACK
4. Numerical experiments
5. Conclusions
supported by the DFG research center “Mathematics for key technologies” in Berlin
0
Inverse–Based ILUs
Ax = b
LU DECOMPOSITION
!
!
!
INCOMPLETE
A=
B F
E C
= LDU + E ≈
@
@
@
@
@
@
@
@
@
@
@
@
@
@
@
, where L, U are unit diagonal.
I NVERSE –BASED D ROPPING
At elimination step k, drop lik , ukj if
−1
|lik | ∗ ke>
k L k 6 τ |dkk |,
|ukj | ∗ kU −1ek k 6 τ |dkk |,
where τ is a drop tolerance [B. ’01, ’03], [Li, Saad, Chow ’03].
=⇒ L−1 and U −1 serve as approximate inverses
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
1
Inverse–Based ILU — Approximate Inverse Relation
FACTORED A PPROXIMATE I NVERSE (AINV), [Benzi, Tůma ’98]
!
!
!
>
W AZ =
@
@
@
@
@
@
@
@
@
@
W ←→ L−>,
=D+E
Z ←→ U −1
THEOREM [B., Saad ’01]. Suppose that the AINV process and the inverse–based ILU are computed
using the same approximate Schur complement.
Let τ be the drop tolerance. Then
|(I − W L>)ij | 6 2(j − i)τ,
M. Bollhöfer/Y. Saad
|(I − ZU )ij | 6 2(j − i)τ
∀i < j.
Inverse–Based Multilevel ILUs — Theory, Applications and Software
2
Impact on the Inverse Error
G ROWTH OF THE INVERSE TRIANGULAR FACTORS
Problem. kL−1k and kU −1k may become very large.
⇒ pivoting recommended.
BUT WHAT KIND OF PIVOTING IS APPROPRIATE ?
Consider the approximate factorization
B F
LB 0
DB 0
UB U F
+Ek ,
A=
=
E C
LE I
0 SC
0 I
| {z } |
{z
}|
{z
}
Lk
Dk
Uk
where Lk and Uk> are unit lower triangular matrices and the leading k × k part of Dk is diagonal.
I NVERSE ERROR
−1
Fk := L−1
k Ek Uk
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
3
Inverse Error Bound
LEMMA. Denote by vk and wk the entries in column k of Lk Dk (resp. row k of Dk Uk ) that are dropped
at step k of the incomplete LU decomposition.
1. Suppose that the approximate Schur complement SC is defined via
SC = C − LE DB UF (S–version)
X
> −1
−1
> −1
⇒ Fk =
L−1
v
e
U
+
L
e
w
U
k l l k
k l l
k
l6k
2. Suppose that the approximate Schur complement SC is defined via
−1
−U
U
F
B
SC = −LE L−1
(T–version)
B I A
I
X
>
> −1
⇒ Fk =
L−1
v
e
+
e
w
U
l l
k l l
k
l6k
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
4
Application: Inverse-Based Pivoting
C ONSEQUENCES
−1
• kL−1
k k and kUk k contribute to the inverse error
−1
• For the S–version even their PRODUCT kL−1
k k · kUk k!
I NVERSE – BASED PIVOTING
Control kL−1k, kU −1k 6 κ by a factor or skip strategy
@
@
@
@
@
@
@
@
factor
∗
At step k :
@
@
@
@
@
@
@
@
%
∗
&
@
@
@
@
@
@
@
@
skip
∗
Rows/columns that exceed the prescribed bound are pushed to the end
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
5
Inverse–Based Multilevel ILUs
S KETCH OF THE TEMPLATES
1. Compute a static partial reordering
A → P >AQ =
B F
E C
−1
2. Factor P >AQ and control kL−1
k k, kU k 6 κ.
B̂ F̂
LB 0
DB 0
UB U F
P >AQ −→ P̂ >AQ̂ =
=
LE I
0 SC
0 I
Ê Ĉ
3. Apply strategy recursively to the approximate Schur complement SC (multilevel scheme)
1.
2.
3.
A → P >AQ → P̂ >AQ̂ →
1.
2.
3.
S → PS>SQS → P̂S>S Q̂S →
L·D·U F
E
S
LS · DS · US FS
ES
T
···
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
6
Inverse-Based Multilevel ILUs — Templates
T EMPLATE 1. STATIC REORDERINGS
Examples.
• RCM (Reverse Cuthill–McKee), MMD (Multiple Minimum Degree), AMF (Approximate Minimum
Fill), INDSET (independent set, ARMS), MC64 (from HSL).
• ddPQ [Saad ’03]. Permutations P, Q as a compromise between diagonal dominance and fill
1. Define numerical weights
max |aij |
j
ρi = P
,
|a
|
j ij
max |aij |
i
γj = P
i |aij |
2. define fill weights, e.g.
fi = |{j : aij 6= 0}, gj = |{i : aij 6= 0}, hij = (fi − 1)(gj − 1) + 1, . . .
3. order ratios in decreasing order, e.g. order
ρlj γj
ρiγki
γj
ρi
∀i,
∀j}, . . .
{ ∀i}, { ∀j}, {
gki
flj
hi,ki
hlj ,j
4. use associated indices (i, j) and skip duplicate entries
Further selection criteria, e.g. enforce submatrix to be close to lower triangular
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
7
Inverse–Based Multilevel ILUs — Templates
−1
T EMPLATE 2. PILUC — INVERSE - BASED ILU THAT CONTROLS kL−1
k k,kUk k
-
↓
??
• ILU (Crout–version)
−→
@
@
@
@
@
@
@
@
factor
∗
•
−1
kL−1
k k, kUk k
6 κ controlled by diagonal pivoting
@
@
@
@
@
@
@
@
%
∗
&
@
@
@
@
@
@
@
@
skip
∗
−1
• kL−1
k k, kUk k estimated [Cline, Moler, Stewart, Wilkinson ’77], [B. ’03]
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
8
Inverse–Based Multilevel ILUs — Templates
T EMPLATE 3. M ULTILEVEL SCHEME
(P̂ >AQ̂)−1 =
B F
E C
−1
−1
LB 0
DB 0
UB UF
≈
LE I
0 S
0 I
−1 B̃
0
−B̃ −1F
−1
−1
≈
+
SC −E B̃
I , where B̃ = LB DB UB
0 0
I
Recursive application to SC −→ (additive) algebraic multigrid scheme (or block ILU)
• use concept of ARMS [Saad, Suchomel ’99], i.e. skip LE , UF .
• After AMG is set up, SC can be discarded.
• Without LE , UF , forward/backward substitution twice as expensive, but more accurate and less
memory needed.
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
9
Inverse–Based Multilevel ILUs — Templates
A PPROXIMATE S CHUR COMPLEMENTS
1. SC = C − LE DB UF
use analogous approximation scheme for B ≈ LB DB UB
(S–version)
−1
−U
U
F
B
2. SC = −LE L−1
B I A
I
use analogous approximation scheme for B ≈ LB DB UB
(T–version)
3. Use S–version for the computation of B ≈ LB DB UB
Use T–version for the computation of SC
(M–version)
A PPROXIMATION ORDER
Denote by τ the drop tolerance and let B = LB DB UB + EB
−1
2
• If we use 1., then kEB k = O(τ ), kL−1
B EB UB k = O(τ κ )
−1
2
• if we use 2., then kEB k = O(τ 2), kL−1
B EB UB k = O(τ κ)
• Similar arguments apply to SC
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
10
Preconditioning Package: ILUPACK
Preconditioning software package (C/FORTRAN 77) based on inverse–based multilevel ILUs
ILUPACK
SPARSKIT
A R M S
M. Bollhöfer/Y. Saad
Interfaces
BLAS
SPARSPAK
MC64
Appr. Min. Fill
Inverse–Based Multilevel ILUs — Theory, Applications and Software
11
ILUPACK — General
NAMES A ND F ORMATS
1. ILUPACK currently supports real single/double precision and complex single/double precision.
2. Apart from the arithmetic ILUPACK distinguishes between two classes of matrices, namely
general matrices and symmetric (Hermitian) positive definite matrices.
3. matrices are stored in SPARSKIT format (compressed sparse row format).
SPARSKIT, ARMS, . . .
• Several routines from SPARSKIT incorporated (GMRES, FGMRES, ILUT, ILUTP,. . . )
• Reorderings (INDSET, ddPQ) and several tools from ARMS adapted
• Most common reorderings supported or at least interfaces are provided
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
12
ILUPACK — Templates
T EMPLATE 1
• Preconditioner allows three arbitrarily chosen permutation/scaling functions
1. Initial preprocessing (applied only once)
2. Regular reordering (applied to any subsequent system)
3. Final pivoting (if the regular reordering fails)
• Custom reorderings possible
T EMPLATE 2
Partial ILUC (Crout version) with static permutation and diagonal pivoting (F77 core routine)
T EMPLATE 3
• Three different versions of approximate Schur complements are supported (S–, M– and T–
version). M–version is default.
• After the computation of the approximate Schur complement the parts LE and UF are discarded.
• if the fill exceeds a certain amount, the algorithm switches to full-matrix processing.
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
13
ILUPACK — Experiments
P RECONDITIONERS
1. inverse–based multilevel ILU (referred as “ILUPACK”)
2. ILUTP
3. ILUC (single–level, inverse–based)
O RDERINGS
PQ, RCM, . . . + row/column scaling
I TERATIVE S OLVERS
• GMRES, PCG. Iteration stopped if kb − Ax(k)k 6
√
eps kb − Ax(0)k
• Iterative solver treated as failure, if a break down occurred or more than 500 steps were needed.
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
14
ILUPACK — Application to Unstructured Systems
E XAMPLE . Matrix ATT/TWOTONE
Computation time
400
350
Memory requirement
12
ILUC, MMD+MC64
ILUTP, MMD+MC64
ILUPACK−M, PQ
ILUPACK−S, PQ
10
ILUC, MMD+MC64
ILUTP, MMD+MC64
ILUPACK−M, PQ
ILUPACK−S, PQ
300
8
250
6
200
150
4
100
2
50
0
−2
10
−4
10
drop tol.
M. Bollhöfer/Y. Saad
−6
10
0
−2
10
−4
10
drop tol.
Inverse–Based Multilevel ILUs — Theory, Applications and Software
−6
10
15
ILUPACK — Application to Unstructured Systems
E XAMPLE . Matrix VAVASIS/AV41092
Computation time
Memory requirement
600
500
12
ILUTP, RCM
ILUPACK−M, PQ
ILUPACK−S, PQ
10
400
8
300
6
200
4
100
2
0
−2
10
−4
10
drop tol.
M. Bollhöfer/Y. Saad
−6
10
0
ILUTP, RCM
ILUPACK−M, PQ
ILUPACK−S, PQ
−2
10
−4
10
drop tol.
Inverse–Based Multilevel ILUs — Theory, Applications and Software
−6
10
16
ILUPACK — dependence on inverse bound κ
Almost no difference between κ = 10 , κ = 25 , κ = 50 , κ = 100
64 LARGE TEST MATRICES , NUMBER OF SUCCESSFUL COMPUTATIONS
ILUPACK M–version
ILUPACK S–version
70
70
60
60
50
50
40
40
30
30
20
20
10
10
0
1
2
3
4
−∗
drop tol. [10 ]
M. Bollhöfer/Y. Saad
5
0
1
2
3
4
−∗
drop tol. [10 ]
Inverse–Based Multilevel ILUs — Theory, Applications and Software
5
17
ILUPACK — dependence on inverse bound κ
Almost no difference between κ = 10 , κ = 25 , κ = 50 , κ = 100
64 LARGE TEST MATRICES , NUMBER OF SUCCESSFUL COMPUTATIONS
ILUPACK M–version
ILUTP + RCM
70
70
60
60
50
50
40
40
30
30
20
20
10
10
0
1
2
3
4
−∗
drop tol. [10 ]
M. Bollhöfer/Y. Saad
5
0
1
2
3
−∗
drop tol. [10 ]
Inverse–Based Multilevel ILUs — Theory, Applications and Software
4
5
18
ILUPACK — dependence on inverse bound κ
Almost no difference between κ = 10 , κ = 25 , κ = 50 , κ = 100
64 LARGE TEST MATRICES , NUMBER OF SUCCESSFUL COMPUTATIONS
ILUPACK M–version
ILUTP + RCM + MC64
70
70
60
60
50
50
40
40
30
30
20
20
10
10
0
1
2
3
4
−∗
drop tol. [10 ]
M. Bollhöfer/Y. Saad
5
0
1
2
3
−∗
drop tol. [10 ]
Inverse–Based Multilevel ILUs — Theory, Applications and Software
4
5
19
ILUPACK — Abused as AMG
E XAMPLE . cylindric shell problem
Tight bounds on kU −1k 6 κ implicitly yield FINE GRID/COARSE GRID selection
16
κ= 2
κ= 5
κ= 10
κ= 25
κ= 50
κ=100
Computation time
14
12
3
κ= 2
κ= 5
κ= 10
κ= 25
κ= 50
κ=100
2.5
Memory requirement
2
10
1.5
8
1
6
4
−1
10
M. Bollhöfer/Y. Saad
drop tol.
−2
10
0.5
−1
10
drop tol.
Inverse–Based Multilevel ILUs — Theory, Applications and Software
−2
10
20
ILUPACK — Abused as AMG
E XAMPLE . cylindric shell problem
Number of levels. κ = 2, κ = 5, κ = 10, κ = 25, κ = 50, κ = 100.
4
Number of levels depending on κ
3.5
3
2.5
2
1.5
1
0.5
0
drop tol.
M. Bollhöfer/Y. Saad
1
2
0.1
0.05
3
4
0.025
0.01
5
0.005
Inverse–Based Multilevel ILUs — Theory, Applications and Software
21
Conclusions
1. Multilevel preconditioners using inverse–based ILUs
2. numerical experiments indicate higher robustness with respect to parameter tuning
(time dependent problems, nonlinear equations)
3. Main objective: inverse triangular factors are kept bounded
4. AMG-like preconditioners realized in the software package ILUPACK that offers inverse–based
multilevel ILUs for a broad class of matrices
5. Still to do: reordering strategies for PDE-type problems
ILUPACK BEING RELEASED BY A PRIL /M AY, 2004
http://www.math.tu-berlin.de/ilupack/
M. Bollhöfer/Y. Saad
Inverse–Based Multilevel ILUs — Theory, Applications and Software
22