Implementation and numerical stability of saddle point solvers

Miro Rozložník
joint results with Pavel Jiránek and Valeria Simoncini
Institute of Computer Science, Czech Academy of Sciences,
Prague, Czech Republic
Recent developments in the solution of indefinite systems
Workshop, TU Eindhoven, April 17, 2012
Saddle point problems
We consider a saddle point problem with the symmetric 2 × 2 block form
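$$\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix}.$$
- A is a square n × n nonsingular (symmetric positive definite) matrix,
- B is a rectangular n × m matrix of (full column) rank m.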
Applications: mixed finite element approximations, weighted least squares,
constrained optimization, computational fluid dynamics, electromagnetism etc.
[Benzi, Golub and Liesen, 2005], [Elman, Silvester, Wathen, 2005]. For the
updated list of applications leading to saddle point problems contact [Benzi].
Iterative solution of saddle point problems
1. segregated approach: outer iteration for solving the reduced Schur
complement or null-space projected system;
2. coupled approach with block preconditioning: iteration scheme for
solving the preconditioned system;
3. rounding errors in floating point arithmetic: numerical stability of the
solver
Numerous solution schemes: inexact Uzawa algorithms, inexact null-space
methods, inner-outer iteration methods, two-stage iteration processes,
multilevel or multigrid methods, domain decomposition methods
Numerous preconditioning techniques and schemes: block diagonal
preconditioners, block triangular preconditioners, constraint preconditioning,
Hermitian/skew-Hermitian preconditioning and other splittings, combination
preconditioning
Numerous iterative solvers: conjugate gradient (CG) method, MINRES,
GMRES, flexible GMRES, GCR, BiCG, BiCGSTAB, ...
Delay of convergence and limit on the final accuracy
Numerical experiments: a small model example
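$A = \mathrm{tridiag}(1, 4, 1) \in \mathbb{R}^{100 \times 100}$, $B = \mathrm{rand}(100, 20)$, $f = \mathrm{rand}(100, 1)$,
$\kappa(A) = \|A\| \cdot \|A^{-1}\| = 5.9990 \cdot 0.4998 \approx 2.9983$,
$\kappa(B) = \|B\| \cdot \|B^{\dagger}\| = 7.1695 \cdot 0.4603 \approx 3.3001$.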
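The model problem can be set up in a few lines of NumPy. This is only a sketch: the seed and the use of `numpy.random.default_rng` are assumptions (the slides do not say how `rand` was seeded or scaled), so the numbers printed for B will differ from those quoted above. Only the values for A are deterministic.

```python
import numpy as np

n, m = 100, 20

# A = tridiag(1, 4, 1): symmetric positive definite
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

rng = np.random.default_rng(0)   # assumed seed, not taken from the slides
B = rng.random((n, m))           # uniform entries in [0, 1), in the spirit of rand(100, 20)
f = rng.random(n)

norm_A = np.linalg.norm(A, 2)
norm_Ainv = np.linalg.norm(np.linalg.inv(A), 2)
sv_B = np.linalg.svd(B, compute_uv=False)   # singular values in descending order

print("||A|| =", norm_A, "||A^-1|| =", norm_Ainv, "kappa(A) =", norm_A * norm_Ainv)
print("||B|| =", sv_B[0], "||B^+|| =", 1.0 / sv_B[-1], "kappa(B) =", sv_B[0] / sv_B[-1])
```

The spectral norm of the pseudoinverse is the reciprocal of the smallest singular value, which is why a single SVD of B suffices for κ(B).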
Schur complement reduction method
- Compute y as a solution of the Schur complement system
  $$B^T A^{-1} B y = B^T A^{-1} f,$$
- compute x as a solution of
  $$A x = f - B y.$$
- Segregated vs. coupled approach: $x_k$ and $y_k$ are approximate solutions to x and y, respectively.
- Inexact solution of systems with A: every computed solution $\hat{u}$ of $Au = b$ is interpreted as an exact solution of a perturbed system
  $$(A + \Delta A)\hat{u} = b + \Delta b, \quad \|\Delta A\| \le \tau \|A\|, \quad \|\Delta b\| \le \tau \|b\|, \quad \tau \kappa(A) \ll 1.$$
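The two steps of the reduction can be sketched with dense direct solves, that is, with τ at the level of the unit roundoff. The problem data mirror the small model example; the seed is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)          # assumed seed
n, m = 100, 20
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))
f = rng.random(n)

# Solves with A: A^{-1} B (m right-hand sides) and A^{-1} f
AinvB = np.linalg.solve(A, B)
Ainvf = np.linalg.solve(A, f)

S = B.T @ AinvB                          # Schur complement B^T A^{-1} B
y = np.linalg.solve(S, B.T @ Ainvf)      # B^T A^{-1} B y = B^T A^{-1} f
x = np.linalg.solve(A, f - B @ y)        # A x = f - B y

res1 = np.linalg.norm(f - A @ x - B @ y)   # first block equation
res2 = np.linalg.norm(B.T @ x)             # second block equation
print(res1, res2)
```

Forming S explicitly is only sensible for small m; the iterative variants on the following slides avoid it by applying S through inner solves with A.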
Iterative solution of the Schur complement system
choose $y_0$, solve $A x_0 = f - B y_0$
outer iteration ($k = 0, 1, \dots$):
    compute $\alpha_k$ and $p_k^{(y)}$
    $y_{k+1} = y_k + \alpha_k p_k^{(y)}$
    inner iteration:
        solve $A p_k^{(x)} = -B p_k^{(y)}$
        back-substitution:
            A: $x_{k+1} = x_k + \alpha_k p_k^{(x)}$,
            B: solve $A x_{k+1} = f - B y_{k+1}$,
            C: solve $A u_k = f - A x_k - B y_{k+1}$, $x_{k+1} = x_k + u_k$.
    $r_{k+1}^{(y)} = r_k^{(y)} - \alpha_k B^T p_k^{(x)}$
Accuracy in the saddle point system
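$$\|f - A x_k - B y_k\| \le \frac{O(\alpha_1)\,\kappa(A)}{1 - \tau \kappa(A)} \left(\|f\| + \|B\| Y_k\right),$$
$$\|-B^T x_k - r_k^{(y)}\| \le \frac{O(\alpha_2)\,\kappa(A)}{1 - \tau \kappa(A)} \|A^{-1}\| \|B\| \left(\|f\| + \|B\| Y_k\right),$$
$$Y_k \equiv \max\{\|y_i\| \mid i = 0, 1, \dots, k\}.$$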
Back-substitution scheme
Scheme                                                                      α1   α2
A: Generic update         $x_{k+1} = x_k + \alpha_k p_k^{(x)}$              τ    u
B: Direct substitution    $x_{k+1} = A^{-1}(f - B y_{k+1})$                 τ    τ
C: Corrected dir. subst.  $x_{k+1} = x_k + A^{-1}(f - A x_k - B y_{k+1})$   u    τ
Schemes B and C require an additional system with A.
$$-B^T A^{-1} f + B^T A^{-1} B y_k = -B^T x_k - B^T A^{-1}(f - A x_k - B y_k)$$
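The three back-substitution schemes can be compared in a small experiment. The sketch below runs CG on the Schur complement system in the sign convention of the slides ($r_k^{(y)} = -B^T x_k$). The way an inexact solve is modelled is an assumption made for illustration: each solve with A is replaced by an exact solve with $A + \Delta A$, where $\Delta A$ is a random matrix scaled so that $\|\Delta A\|_2 = \tau \|A\|_2$. The helper names `inexact_solve` and `schur_cg`, the seeds and the stopping tolerance are hypothetical.

```python
import numpy as np

def inexact_solve(A, b, tau, rng):
    """Exact solution of (A + dA) u = b with ||dA||_2 = tau * ||A||_2 (assumed model)."""
    E = rng.standard_normal(A.shape)
    dA = tau * np.linalg.norm(A, 2) * E / np.linalg.norm(E, 2)
    return np.linalg.solve(A + dA, b)

def schur_cg(A, B, f, scheme, tau, max_iter=200, tol=1e-14, seed=0):
    """CG on the Schur complement system with back-substitution scheme 'A', 'B' or 'C'."""
    rng = np.random.default_rng(seed)
    y = np.zeros(B.shape[1])
    x = inexact_solve(A, f - B @ y, tau, rng)       # A x0 = f - B y0
    r = -B.T @ x                                    # r0^(y) = -B^T x0
    p = r.copy()
    r0_norm = np.linalg.norm(r)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * r0_norm:      # recursively updated residual
            break
        px = inexact_solve(A, -B @ p, tau, rng)     # inner solve: A p^(x) = -B p^(y)
        alpha = (r @ r) / (p @ (B.T @ px))          # B^T p^(x) = -S p^(y)
        y = y + alpha * p
        if scheme == "A":                           # generic update
            x = x + alpha * px
        elif scheme == "B":                         # direct substitution
            x = inexact_solve(A, f - B @ y, tau, rng)
        elif scheme == "C":                         # corrected direct substitution
            x = x + inexact_solve(A, f - A @ x - B @ y, tau, rng)
        else:
            raise ValueError("scheme must be 'A', 'B' or 'C'")
        r_new = r - alpha * (B.T @ px)
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x, y

n, m = 100, 20
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
data_rng = np.random.default_rng(1)                 # assumed seed
B = data_rng.random((n, m))
f = data_rng.random(n)

results = {}
for scheme in "ABC":
    x, y = schur_cg(A, B, f, scheme, tau=1e-10)
    results[scheme] = (np.linalg.norm(f - A @ x - B @ y), np.linalg.norm(B.T @ x))
    print(scheme, "first block:", results[scheme][0], "second block:", results[scheme][1])
```

According to the table, one would expect the first-block residual to be smallest for scheme C and the second-block residual to be smallest for scheme A; the printed values allow a check of this for the chosen τ.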
Generic update: $x_{k+1} = x_k + \alpha_k p_k^{(x)}$
[Figure: two panels plotted against the iteration number k (0 to 300), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the residual norm ‖f − A x_k − B y_k‖; the other shows the relative residual norms ‖−B^T x_k‖/‖−B^T x_0‖ and ‖r_k^{(y)}‖/‖r_0^{(y)}‖.]
Direct substitution: $x_{k+1} = A^{-1}(f - B y_{k+1})$
[Figure: two panels plotted against the iteration number k (0 to 300), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the residual norm ‖f − A x_k − B y_k‖; the other shows the relative residual norms ‖−B^T x_k‖/‖−B^T x_0‖ and ‖r_k^{(y)}‖/‖r_0^{(y)}‖.]
Corrected direct substitution: $x_{k+1} = x_k + A^{-1}(f - A x_k - B y_{k+1})$
[Figure: two panels plotted against the iteration number k (0 to 300), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the residual norm ‖f − A x_k − B y_k‖; the other shows the relative residual norms ‖−B^T x_k‖/‖−B^T x_0‖ and ‖r_k^{(y)}‖/‖r_0^{(y)}‖.]
Null-space projection method
- Compute $x \in N(B^T)$ as a solution of the projected system
  $$(I - \Pi) A (I - \Pi) x = (I - \Pi) f,$$
- compute y as a solution of the least squares problem
  $$B y \approx f - A x,$$
  where $\Pi = B(B^T B)^{-1} B^T$ is the orthogonal projector onto $R(B)$.
- Schemes with the inexact solution of least squares with B: every computed approximate solution $\bar{v}$ of a least squares problem $Bv \approx c$ is interpreted as an exact solution of a perturbed least squares problem
  $$(B + \Delta B)\bar{v} \approx c + \Delta c, \quad \|\Delta B\| \le \tau \|B\|, \quad \|\Delta c\| \le \tau \|c\|, \quad \tau \kappa(B) \ll 1.$$
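A direct (non-iterative) sketch of the null-space approach follows. It uses a full QR factorization of B to obtain an orthonormal basis Z of $N(B^T)$; this choice of basis, the variable names and the seed are assumptions made for illustration, not something prescribed by the slides.

```python
import numpy as np

rng = np.random.default_rng(0)          # assumed seed
n, m = 100, 20
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))
f = rng.random(n)

# Full QR of B: the first m columns of Q span R(B), the remaining n - m span N(B^T)
Q, _ = np.linalg.qr(B, mode="complete")
Q1, Z = Q[:, :m], Q[:, m:]
Pi = Q1 @ Q1.T                          # orthogonal projector onto R(B)

# Projected system restricted to N(B^T): (Z^T A Z) w = Z^T f, x = Z w
w = np.linalg.solve(Z.T @ A @ Z, Z.T @ f)
x = Z @ w

# y from the least squares problem B y ~ f - A x
y, *_ = np.linalg.lstsq(B, f - A @ x, rcond=None)

I_n = np.eye(n)
res_proj = np.linalg.norm((I_n - Pi) @ A @ (I_n - Pi) @ x - (I_n - Pi) @ f)
res1 = np.linalg.norm(f - A @ x - B @ y)
res2 = np.linalg.norm(B.T @ x)
print(res_proj, res1, res2)
```

Because $f - Ax$ lies in $R(B)$ once x solves the projected system, the least squares problem for y is consistent and the first block equation holds up to rounding.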
Null-space projection method
choose $x_0$, solve $B y_0 \approx f - A x_0$
outer iteration ($k = 0, 1, \dots$):
    compute $\alpha_k$ and $p_k^{(x)} \in N(B^T)$
    $x_{k+1} = x_k + \alpha_k p_k^{(x)}$
    inner iteration:
        solve $B p_k^{(y)} \approx r_k^{(x)} - \alpha_k A p_k^{(x)}$
        back-substitution:
            A: $y_{k+1} = y_k + p_k^{(y)}$,
            B: solve $B y_{k+1} \approx f - A x_{k+1}$,
            C: solve $B v_k \approx f - A x_{k+1} - B y_k$, $y_{k+1} = y_k + v_k$.
    $r_{k+1}^{(x)} = r_k^{(x)} - \alpha_k A p_k^{(x)} - B p_k^{(y)}$
Accuracy in the saddle point system
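$$\|f - A x_k - B y_k - r_k^{(x)}\| \le \frac{O(\alpha_3)\,\kappa(B)}{1 - \tau \kappa(B)} \left(\|f\| + \|A\| X_k\right),$$
$$\|-B^T x_k\| \le \frac{O(\tau)\,\kappa(B)}{1 - \tau \kappa(B)} \|B\| X_k,$$
$$X_k \equiv \max\{\|x_i\| \mid i = 0, 1, \dots, k\}.$$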
Back-substitution scheme
Scheme                                                                          α3
A: Generic update         $y_{k+1} = y_k + p_k^{(y)}$                           u
B: Direct substitution    $y_{k+1} = B^{\dagger}(f - A x_{k+1})$                τ
C: Corrected dir. subst.  $y_{k+1} = y_k + B^{\dagger}(f - A x_{k+1} - B y_k)$  u
Schemes B and C require an additional least squares problem with B.
Generic update: $y_{k+1} = y_k + p_k^{(y)}$
[Figure: two panels plotted against the iteration number (0 to 100), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the relative residual norms ‖f − A x_k − B y_k‖/‖f − A x_0 − B y_0‖ and ‖r_k^{(x)}‖/‖r_0^{(x)}‖; the other shows the residual norm ‖−B^T x_k‖.]
Direct substitution: $y_{k+1} = B^{\dagger}(f - A x_{k+1})$
[Figure: two panels plotted against the iteration number (0 to 100), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the relative residual norms ‖f − A x_k − B y_k‖/‖f − A x_0 − B y_0‖ and ‖r_k^{(x)}‖/‖r_0^{(x)}‖; the other shows the residual norm ‖−B^T x_k‖.]
Corrected direct substitution: $y_{k+1} = y_k + B^{\dagger}(f - A x_{k+1} - B y_k)$
[Figure: two panels plotted against the iteration number (0 to 100), with curves for τ = O(u), τ = 10^{-10}, τ = 10^{-6} and τ = 10^{-2}. One panel shows the relative residual norms ‖f − A x_k − B y_k‖/‖f − A x_0 − B y_0‖ and ‖r_k^{(x)}‖/‖r_0^{(x)}‖; the other shows the residual norm ‖−B^T x_k‖.]
Stationary iterative methods
- $Ax = b$, $A = M - N$
- A: $M x_{k+1} = N x_k + b$
  B: $x_{k+1} = x_k + M^{-1}(b - A x_k)$
- Inexact solution of systems with M: every computed solution y of $My = z$ is interpreted as an exact solution of a perturbed system
  $$(M + \Delta M) y = z, \quad \|\Delta M\| \le \tau \|M\|, \quad \tau \kappa(M) \ll 1.$$
Accuracy of the computed approximate solution
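A: $M x_{k+1} = N x_k + b$
$$\frac{\|\hat{x}_{k+1} - x\|_\infty}{\|x\|_\infty} \le \tau\, \frac{\|\, |M^{-1}|\, |M|\, |x| \,\|_\infty}{\|x\|_\infty}$$
B: $x_{k+1} = x_k + M^{-1}(b - A x_k)$
$$\frac{\|\hat{x}_{k+1} - x\|_\infty}{\|x\|_\infty} \le O(u)\, \frac{\|\, |M^{-1}|\, |A|\, |x| \,\|_\infty}{\|x\|_\infty}$$
new value = old value + small correction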
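The two forms of the iteration are algebraically identical and differ only in how rounding and solve errors enter. A minimal sketch with the Jacobi splitting (an assumption made for illustration; the slides do not fix M) runs both forms side by side with exact solves:

```python
import numpy as np

n = 100
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
rng = np.random.default_rng(0)       # assumed seed
b = rng.random(n)

M = np.diag(np.diag(A))              # Jacobi splitting A = M - N (illustrative choice)
N = M - A

x_a = np.zeros(n)
x_b = np.zeros(n)
for _ in range(60):
    x_a = np.linalg.solve(M, N @ x_a + b)          # scheme A: M x_{k+1} = N x_k + b
    x_b = x_b + np.linalg.solve(M, b - A @ x_b)    # scheme B: old value + small correction

x_ref = np.linalg.solve(A, b)
err_a = np.linalg.norm(x_a - x_ref, np.inf) / np.linalg.norm(x_ref, np.inf)
err_b = np.linalg.norm(x_b - x_ref, np.inf) / np.linalg.norm(x_ref, np.inf)
print(err_a, err_b)
```

With exact solves both errors end up at rounding level; the bounds above describe how the two forms separate once the solves with M carry a backward error of size τ.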
Two-stage iterative methods
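$$M_1 x_{k+1/2} = N_1 x_k + b, \quad A = M_1 - N_1$$
$$M_2 x_{k+1} = N_2 x_{k+1/2} + b, \quad A = M_2 - N_2$$
$$x_{k+1/2} = x_k + M_1^{-1}(b - A x_k)$$
$$x_{k+1} = x_{k+1/2} + M_2^{-1}(b - A x_{k+1/2})$$
⇔
$$x_{k+1} = x_k + (M_1^{-1} + M_2^{-1} - M_2^{-1} A M_1^{-1})(b - A x_k)$$
$$\phantom{x_{k+1}} = x_k + (I + M_2^{-1} N_1) M_1^{-1} (b - A x_k)$$
$$\phantom{x_{k+1}} = x_k + M_2^{-1} (I + N_2 M_1^{-1})(b - A x_k)$$
$$\frac{\|\hat{x}_{k+1} - x\|_\infty}{\|x\|_\infty} \le O(u)\, \frac{\|\, |M_2^{-1}(I + N_2 M_1^{-1})|\, |A|\, |x| \,\|_\infty}{\|x\|_\infty}$$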
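The equivalence of the three expressions for the combined operator can be checked numerically. The sketch below uses a Jacobi splitting for the first stage and a Gauss-Seidel (lower triangular) splitting for the second; both choices, the problem size and the seed are assumptions made for illustration.

```python
import numpy as np

n = 50
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
I = np.eye(n)

M1 = np.diag(np.diag(A)); N1 = M1 - A     # first splitting (Jacobi, illustrative)
M2 = np.tril(A);          N2 = M2 - A     # second splitting (Gauss-Seidel, illustrative)
M1inv = np.linalg.inv(M1)
M2inv = np.linalg.inv(M2)

# Three algebraically equal forms of the two-stage operator
G1 = M1inv + M2inv - M2inv @ A @ M1inv
G2 = (I + M2inv @ N1) @ M1inv
G3 = M2inv @ (I + N2 @ M1inv)

rng = np.random.default_rng(0)            # assumed seed
b = rng.random(n)
xk = rng.random(n)

# One two-stage step versus the combined single-step form
x_half = xk + M1inv @ (b - A @ xk)
x_next = x_half + M2inv @ (b - A @ x_half)
x_combined = xk + G1 @ (b - A @ xk)

print(np.linalg.norm(G1 - G2), np.linalg.norm(G1 - G3), np.linalg.norm(x_next - x_combined))
```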
Preconditioning of saddle point problems
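$\mathcal{A}$ symmetric indefinite, $\mathcal{P}$ positive definite
$$\mathcal{A} = \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \approx \mathcal{P} = R^T R$$
$$\left(R^{-T} \mathcal{A} R^{-1}\right) R \begin{pmatrix} x \\ y \end{pmatrix} = R^{-T} \begin{pmatrix} f \\ 0 \end{pmatrix}$$
$R^{-T} \mathcal{A} R^{-1}$ is symmetric indefinite!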
Symmetric indefinite or nonsymmetric preconditioner
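$\mathcal{P}$ symmetric indefinite or nonsymmetric
$$\mathcal{P}^{-1} \mathcal{A} \begin{pmatrix} x \\ y \end{pmatrix} = \mathcal{P}^{-1} \begin{pmatrix} f \\ 0 \end{pmatrix}$$
$$\mathcal{A} \mathcal{P}^{-1}\, \mathcal{P} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix}$$
$\mathcal{P}^{-1} \mathcal{A}$ and $\mathcal{A} \mathcal{P}^{-1}$ are nonsymmetric!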
Schur complement approach with indefinite preconditioner
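$$\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix}, \quad \mathcal{P} = \begin{pmatrix} A & B \\ B^T & B^T A^{-1} B - I \end{pmatrix}$$
$$\mathcal{A} \mathcal{P}^{-1} = \begin{pmatrix} I & 0 \\ (I - S) B^T A^{-1} & S \end{pmatrix}$$
$S = B^T A^{-1} B$; $\mathcal{A} \mathcal{P}^{-1}$ is nonsymmetric but diagonalizable and it has a 'nice' spectrum!
$$\sigma(\mathcal{A} \mathcal{P}^{-1}) \subset \{1\} \cup \sigma(B^T A^{-1} B)$$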
[Durazzi, Ruggiero 2003], [Fortin, El-Maliki, 2009?]
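The block form of the preconditioned matrix is easy to confirm numerically on a small random example. The dimensions follow the 25 × 5 model problem used later in the talk; the seed and the variable names (`calA`, `calP`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed
n, m = 25, 5
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))

AinvB = np.linalg.solve(A, B)             # A^{-1} B
S = B.T @ AinvB                           # S = B^T A^{-1} B
I_n, I_m = np.eye(n), np.eye(m)

calA = np.block([[A, B], [B.T, np.zeros((m, m))]])
calP = np.block([[A, B], [B.T, S - I_m]])

AP_inv = calA @ np.linalg.inv(calP)
# A is symmetric, so B^T A^{-1} = (A^{-1} B)^T
expected = np.block([[I_n, np.zeros((n, m))],
                     [(I_m - S) @ AinvB.T, S]])

form_ok = np.allclose(AP_inv, expected, atol=1e-8)
eig_S = np.linalg.eigvalsh(S)             # eigenvalues other than 1 are those of S
print(form_ok, eig_S.min(), eig_S.max())
```

Since the product is block lower triangular with diagonal blocks I and S, its spectrum is {1} together with the eigenvalues of S, which is what the inclusion above states.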
Krylov method with the preconditioner: basic properties
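$$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \quad r_0 = \begin{pmatrix} 0 \\ s_0 \end{pmatrix}, \quad e_{k+1} = \begin{pmatrix} x - x_{k+1} \\ y - y_{k+1} \end{pmatrix}$$
$$r_{k+1} = \begin{pmatrix} f \\ 0 \end{pmatrix} - \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix}$$
$$r_0 = \begin{pmatrix} 0 \\ s_0 \end{pmatrix} \Rightarrow r_{k+1} = \begin{pmatrix} 0 \\ s_{k+1} \end{pmatrix} \Rightarrow A x_{k+1} + B y_{k+1} = f$$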
Preconditioned CG method: saddle point problem and indefinite
preconditioner
$$r_{k+1}^T \mathcal{P}^{-1} r_j = 0, \quad j = 0, \dots, k$$
$y_{k+1}$ is an iterate from CG applied to the Schur complement system
$$B^T A^{-1} B y = B^T A^{-1} f,$$
satisfying
$$\|y - y_{k+1}\|_{B^T A^{-1} B} = \min_{u \in y_0 + K_{k+1}(B^T A^{-1} B,\, B^T A^{-1} f)} \|y - u\|_{B^T A^{-1} B}$$
Preconditioned CG algorithm
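$$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \quad r_0 = b - \mathcal{A} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} 0 \\ s_0 \end{pmatrix}, \quad \begin{pmatrix} p_0^{(x)} \\ p_0^{(y)} \end{pmatrix} = \mathcal{P}^{-1} r_0 = \mathcal{P}^{-1} \begin{pmatrix} 0 \\ s_0 \end{pmatrix}$$
$k = 0, 1, \dots$
$$\alpha_k = \left( \begin{pmatrix} 0 \\ s_k \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} 0 \\ s_k \end{pmatrix} \right) \Big/ \left( \mathcal{A} \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix}, \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix} \right)$$
$$\begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix} = \begin{pmatrix} x_k \\ y_k \end{pmatrix} + \alpha_k \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix}$$
$$r_{k+1} = r_k - \alpha_k \mathcal{A} \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix} = \begin{pmatrix} 0 \\ s_{k+1} \end{pmatrix}$$
$$\beta_k = \left( \begin{pmatrix} 0 \\ s_{k+1} \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} 0 \\ s_{k+1} \end{pmatrix} \right) \Big/ \left( \begin{pmatrix} 0 \\ s_k \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} 0 \\ s_k \end{pmatrix} \right)$$
$$\begin{pmatrix} p_{k+1}^{(x)} \\ p_{k+1}^{(y)} \end{pmatrix} = \mathcal{P}^{-1} \begin{pmatrix} 0 \\ s_{k+1} \end{pmatrix} + \beta_k \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix} = \begin{pmatrix} -A^{-1} B p_{k+1}^{(y)} \\ p_{k+1}^{(y)} \end{pmatrix}$$
Standard PCG recurrences:
$$\alpha_k = \frac{(r_k, z_k)}{(\mathcal{A} p_k, p_k)}, \quad z_{k+1} = \mathcal{P}^{-1} r_{k+1}, \quad \beta_k = \frac{(r_{k+1}, z_{k+1})}{(r_k, z_k)}, \quad p_{k+1} = z_{k+1} + \beta_k p_k$$
Generic update: $x_{k+1} = x_k + \alpha_k p_k^{(x)}$ with $p_k^{(x)} = -A^{-1} B p_k^{(y)}$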
Saddle point problem and indefinite constraint preconditioner
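$$\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix}, \quad \mathcal{P} = \begin{pmatrix} I & B \\ B^T & 0 \end{pmatrix}$$
$$\mathcal{A} \mathcal{P}^{-1} = \begin{pmatrix} A(I - \Pi) + \Pi & (A - I) B (B^T B)^{-1} \\ 0 & I \end{pmatrix}$$
$\Pi = B(B^T B)^{-1} B^T$ - orth. projector onto span(B)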
[Lukšan, Vlček, 1998], [Gould, Keller, Wathen 2000]
[Perugia, Simoncini, Arioli, 1999], [R, Simoncini, 2002]
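As a sanity check, the block form of $\mathcal{A}\mathcal{P}^{-1}$ for the constraint preconditioner can be verified numerically. The sketch uses dense inverses on a small random example; the seed and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed
n, m = 25, 5
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))
I_n, I_m = np.eye(n), np.eye(m)

BtB_inv = np.linalg.inv(B.T @ B)
Pi = B @ BtB_inv @ B.T                    # orthogonal projector onto span(B)

calA = np.block([[A, B], [B.T, np.zeros((m, m))]])
calP = np.block([[I_n, B], [B.T, np.zeros((m, m))]])

AP_inv = calA @ np.linalg.inv(calP)
expected = np.block([[A @ (I_n - Pi) + Pi, (A - I_n) @ B @ BtB_inv],
                     [np.zeros((m, n)), I_m]])

form_ok = np.allclose(AP_inv, expected, atol=1e-8)
proj_ok = np.allclose(Pi @ Pi, Pi, atol=1e-10)
print(form_ok, proj_ok)
```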
Indefinite constraint preconditioner: spectral properties
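$\mathcal{A} \mathcal{P}^{-1}$ nonsymmetric and non-diagonalizable!
but it has a 'nice' spectrum:
$$\sigma(\mathcal{A} \mathcal{P}^{-1}) \subset \{1\} \cup \sigma(A(I - \Pi) + \Pi) \subset \{1\} \cup \left( \sigma((I - \Pi) A (I - \Pi)) \setminus \{0\} \right)$$
and only 2 by 2 Jordan blocks!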
[Lukšan, Vlček 1998], [Gould, Wathen, Keller, 1999], [Perugia, Simoncini 1999]
Krylov method with the constraint preconditioner: basic properties
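$$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \quad r_0 = \begin{pmatrix} s_0 \\ 0 \end{pmatrix}, \quad e_{k+1} = \begin{pmatrix} x - x_{k+1} \\ y - y_{k+1} \end{pmatrix}$$
$$r_{k+1} = \begin{pmatrix} f \\ 0 \end{pmatrix} - \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix}$$
$$r_0 = \begin{pmatrix} s_0 \\ 0 \end{pmatrix} \Rightarrow r_{k+1} = \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix} \Rightarrow B^T (x - x_{k+1}) = 0 \Rightarrow x_{k+1} \in \mathrm{Null}(B^T)!$$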
Preconditioned CG method: error norm
$$r_{k+1}^T \mathcal{P}^{-1} r_j = 0, \quad j = 0, \dots, k$$
$x_{k+1}$ is an iterate from CG applied to
$$(I - \Pi) A (I - \Pi) x = (I - \Pi) f,$$
satisfying
$$\|x - x_{k+1}\|_A = \min_{u \in x_0 + \mathrm{span}\{(I - \Pi) s_j\}} \|x - u\|_A$$
[Lukšan, Vlček 1998], [Gould, Wathen, Keller, 1999]
Preconditioned CG algorithm
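$$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \quad r_0 = b - \mathcal{A} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} s_0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} p_0^{(x)} \\ p_0^{(y)} \end{pmatrix} = \mathcal{P}^{-1} r_0 = \mathcal{P}^{-1} \begin{pmatrix} s_0 \\ 0 \end{pmatrix}$$
$k = 0, 1, \dots$
$$\alpha_k = \left( \begin{pmatrix} s_k \\ 0 \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} s_k \\ 0 \end{pmatrix} \right) \Big/ \left( \mathcal{A} \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix}, \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix} \right)$$
$$\begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix} = \begin{pmatrix} x_k \\ y_k \end{pmatrix} + \alpha_k \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix}$$
$$r_{k+1} = r_k - \alpha_k \mathcal{A} \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix} = \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix}$$
$$\beta_k = \left( \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix} \right) \Big/ \left( \begin{pmatrix} s_k \\ 0 \end{pmatrix}, \mathcal{P}^{-1} \begin{pmatrix} s_k \\ 0 \end{pmatrix} \right)$$
$$\begin{pmatrix} p_{k+1}^{(x)} \\ p_{k+1}^{(y)} \end{pmatrix} = \mathcal{P}^{-1} \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix} + \beta_k \begin{pmatrix} p_k^{(x)} \\ p_k^{(y)} \end{pmatrix}$$
Standard PCG recurrences:
$$\alpha_k = \frac{(r_k, z_k)}{(\mathcal{A} p_k, p_k)}, \quad z_{k+1} = \mathcal{P}^{-1} r_{k+1}, \quad \beta_k = \frac{(r_{k+1}, z_{k+1})}{(r_k, z_k)}, \quad p_{k+1} = z_{k+1} + \beta_k p_k$$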
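The recurrences translate directly into code. The following sketch applies PCG with the constraint preconditioner to the 25 × 5 model problem, starting from $x_0 = 0$, $y_0 = 0$ so that $r_0 = (f, 0)^T$ has a zero second block. The dense solve with $\mathcal{P}$, the stopping rule based on $(r_k, z_k)$, the iteration cap and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed
n, m = 25, 5
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))
f = rng.random(n)

calA = np.block([[A, B], [B.T, np.zeros((m, m))]])
calP = np.block([[np.eye(n), B], [B.T, np.zeros((m, m))]])
b = np.concatenate([f, np.zeros(m)])

u = np.zeros(n + m)                       # [x0; y0] = 0, hence r0 = [f; 0]
r = b - calA @ u
z = np.linalg.solve(calP, r)              # z0 = P^{-1} r0
p = z.copy()
rz = r @ z                                # equals s^T (I - Pi) s
rz0 = rz
for k in range(n - m + 5):
    if rz <= 1e-24 * rz0:                 # projected residual has converged
        break
    Ap = calA @ p
    alpha = rz / (Ap @ p)
    u = u + alpha * p
    r = r - alpha * Ap
    z = np.linalg.solve(calP, r)
    rz_new = r @ z
    beta = rz_new / rz
    p = z + beta * p
    rz = rz_new

x, y = u[:n], u[n:]
ref = np.linalg.solve(calA, b)
err_x = np.linalg.norm(x - ref[:n])
err_y = np.linalg.norm(y - ref[n:])
constraint = np.linalg.norm(B.T @ x)
res_norm = np.linalg.norm(b - calA @ u)
print("err_x:", err_x, "err_y:", err_y, "||B^T x||:", constraint, "||r||:", res_norm)
```

The printed values show that x converges and stays in the null space of $B^T$; whether y and the full residual norm converge as well depends on the scaling of the problem.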
Preconditioned CG method: residual norm
$$\|x_{k+1} - x\| \to 0$$
but in general
$$y_{k+1} \not\to y$$
which is reflected in
$$\|r_{k+1}\| = \left\| \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix} \right\| \not\to 0!$$
but under appropriate scaling yes!
Preconditioned CG method: residual norm
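$$x_{k+1} \to x$$
$$x - x_{k+1} = \varphi_{k+1}((I - \Pi) A (I - \Pi))(x - x_0)$$
$$s_{k+1} = \varphi_{k+1}(A(I - \Pi) + \Pi)\, s_0$$
$$\sigma((I - \Pi) A (I - \Pi)) \sim \sigma(A(I - \Pi) + \Pi)\,?$$
$$1 \in \sigma((I - \Pi)\, \alpha A\, (I - \Pi)) \setminus \{0\} \ \Rightarrow\ \|r_{k+1}\| = \left\| \begin{pmatrix} s_{k+1} \\ 0 \end{pmatrix} \right\| \to 0!$$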
How to avoid misconvergence?
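- Scaling by a constant $\alpha > 0$ such that $1 \in \mathrm{conv}(\sigma((I - \Pi)\, \alpha A\, (I - \Pi)) \setminus \{0\})$:
  $$\begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix} \iff \begin{pmatrix} \alpha A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} x \\ \alpha y \end{pmatrix} = \begin{pmatrix} \alpha f \\ 0 \end{pmatrix}$$
  $$v: \ \|(I - \Pi) v\| \ne 0, \quad \alpha = \frac{1}{((I - \Pi) v, A (I - \Pi) v)}$$
- Scaling by a diagonal $A \to (\mathrm{diag}(A))^{-1/2} A\, (\mathrm{diag}(A))^{-1/2}$ often gives what we want!
- Different direction vector $p_k^{(y)}$ so that $\|r_{k+1}\| = \|s_{k+1}\|$ is locally minimized:
  $$y_{k+1} = y_k + (B^T B)^{-1} B^T s_k$$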
[Braess, Deuflhard, Lipnikov 1999], [Hribar, Gould, Nocedal, 1999], [Jiránek, R, 2008]
Numerical experiments: a small model example
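$A = \mathrm{tridiag}(1, 4, 1) \in \mathbb{R}^{25 \times 25}$, $B = \mathrm{rand}(25, 5) \in \mathbb{R}^{25 \times 5}$, $f = \mathrm{rand}(25, 1) \in \mathbb{R}^{25}$
$\sigma(A) \subset [2.0146, 5.9854]$

α = 1/τ    $\sigma\left( \begin{pmatrix} \alpha A & B \\ B^T & 0 \end{pmatrix} \begin{pmatrix} I & B \\ B^T & 0 \end{pmatrix}^{-1} \right)$
1/100      [0.0207, 0.0586] ∪ {1}
1/10       [0.2067, 0.5856] ∪ {1}
1/4        [0.5170, 1.4641]
1          {1} ∪ [2.0678, 5.8563]
4          {1} ∪ [8.2712, 23.4252]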
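The intervals in the table are α times the nonzero eigenvalues of $(I - \Pi) A (I - \Pi)$, which can be computed from an orthonormal basis Z of $N(B^T)$. In the sketch below the seed is an assumption, so the interval endpoints will differ slightly from the tabulated ones; the spectrum of A and the qualitative picture (only α = 1/4 places 1 inside the interval) do not depend on the random draw.

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed
n, m = 25, 5
A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.random((n, m))

# Orthonormal basis Z of N(B^T) from a full QR factorization of B
Q, _ = np.linalg.qr(B, mode="complete")
Z = Q[:, m:]

lam_A = np.linalg.eigvalsh(A)             # spectrum of A, ascending
lam = np.linalg.eigvalsh(Z.T @ A @ Z)     # nonzero eigenvalues of (I - Pi) A (I - Pi)

contains_one = {}
for alpha in (1 / 100, 1 / 10, 1 / 4, 1.0, 4.0):
    lo, hi = alpha * lam[0], alpha * lam[-1]
    contains_one[alpha] = bool(lo <= 1.0 <= hi)
    print(f"alpha = {alpha:6.2f}: [{lo:.4f}, {hi:.4f}], 1 inside: {contains_one[alpha]}")
```

By eigenvalue interlacing, the eigenvalues of $Z^T A Z$ lie inside the spectral interval of A, which is why all tabulated intervals are contained in α · [2.0146, 5.9854].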
[Figure: residual norms ‖r_k‖ versus iteration number (0 to 60), with curves for τ = 100, τ = 1 and τ = 4.]
[Figure: A-norm of errors ‖x − x_k‖_A versus iteration number (0 to 60), with curves for τ = 100, τ = 1 and τ = 4.]
Conclusions: segregated solution approach
- The accuracy measured by the residuals of the saddle point problem depends on the choice of the back-substitution scheme [Jiránek, R, 2008]. The schemes with generic or corrected substitution updates deliver approximate solutions which satisfy either the first or the second block equation to working accuracy.
- Care must be taken when solving nonsymmetric systems [Jiránek, R, 2008]: all bounds of the limiting accuracy depend on the maximum norm of computed iterates, cf. [Greenbaum 1994, 1997], [Sleijpen, et al. 1994].
[Figure: relative residual norms ‖B^T A^{-1} f − B^T A^{-1} B y_k‖/‖B^T A^{-1} f‖ and ‖r_k^{(y)}‖/‖B^T A^{-1} f‖ versus iteration number k (0 to 450).]
Conclusions: coupled approach with indefinite preconditioner
- Short-term recurrence methods are applicable for saddle point problems with indefinite preconditioning at a cost comparable to that of symmetric solvers. There is a tight connection between the simplified Bi-CG algorithm and the classical CG.
- The convergence of CG applied to a saddle point problem with an indefinite preconditioner is not guaranteed for all right-hand side vectors. For a particular set of right-hand sides the convergence can be achieved by an appropriate scaling of the saddle point problem.
- Since the maximum attainable accuracy depends heavily on the size of computed residuals, a good scaling of the problem leads to approximate solutions satisfying both block equations to working accuracy.
Thank you for your attention.
http://www.cs.cas.cz/~miro
M. Rozložník and V. Simoncini, Krylov subspace methods for saddle point problems with indefinite preconditioning, SIAM J. Matrix Anal. Appl., 24 (2002), pp. 368–391.
P. Jiránek and M. Rozložník, Limiting accuracy of segregated solution methods for nonsymmetric saddle point problems, J. Comput. Appl. Math., 215 (2008), pp. 28–37.
P. Jiránek and M. Rozložník, Maximum attainable accuracy of inexact saddle point solvers, SIAM J. Matrix Anal. Appl., 29 (2008), pp. 1297–1321.