Null space preconditioners for saddle point problems

Jennifer Pestana ([email protected])
Joint work with Tyrone Rees, STFC Rutherford Appleton Laboratory
February 1, 2017

Where do saddle point systems arise?

- How do we minimise the error between measurements and a model?
- How do we choose parameters so that our model is as close as possible to a desired state?
- How can we exploit the structure present when models themselves include a minimisation principle?

Best Linear Unbiased Estimate (BLUE)

Two observations of the true state x_t:

  y_1 = x_t + ε_1,   y_2 = x_t + ε_2.

The errors ε_1, ε_2 have zero mean, are uncorrelated, and have variances σ_1^2, σ_2^2.

BLUE:
- Linear: x_a = v_1 y_1 + v_2 y_2
- Unbiased: E(x_a) = x_t = v_1 E(y_1) + v_2 E(y_2) = (v_1 + v_2) x_t  ⇒  v_1 + v_2 = 1
- Best: minimise E[(x_a − E(x_a))^2] = v_1^2 σ_1^2 + v_2^2 σ_2^2

This gives the constrained problem

  min_{v_1, v_2}  v_1^2 σ_1^2 + v_2^2 σ_2^2   s.t.   v_1 + v_2 = 1,

that is,

  min_v  J(v) = v^T D v   s.t.   v^T 1 = 1,

where v = [v_1; v_2] and D = diag(σ_1^2, σ_2^2). Equivalently,

  min_v  J(v) = (1/2) v^T D v   s.t.   b(v, q) = v^T 1 q = q.

Numerical weather prediction

Estimate the true state of the atmosphere x_i^A at time i, using models, a background state, and observations.

4D-Var:

  J(x_0) = (1/2)(x_0 − x_0^B)^T B^{−1}(x_0 − x_0^B) + (1/2) Σ_{i=1}^N (y_i − H_i(x_i))^T R_i^{−1}(y_i − H_i(x_i)),

subject to the nonlinear model dynamics x_i = M(x_{i−1}), i = 1, …, N.

- x_0 ∈ R^n: guess x_0 = x(t_0) at the beginning of the assimilation window
- x_0^B: background (prior) at t_0
- y_i ∈ R^m, i = 1, …, N: observation vectors at time i
- H_i: maps the state vector x_i from model space to observation space
- M_{i,i−1}: integration of the numerical model from step i − 1 to step i
- B, R_i: background and observation error covariance matrices

Example: Stokes equations

  −∇²u + ∇p = f,   ∇·u = 0   in Ω, with boundary conditions,

  J(v) = ∫_Ω ∇v : ∇v − ∫_Ω f · v,
  b(v, q) = ∫_Ω v · ∇q = 0 = g(q).

Mathematical formulation

Variational problem:

  u = arg min_{v ∈ X}  J(v) = (1/2) a(v, v) − f(v)   such that   b(v, q) = g(q) for all q ∈ M.

- X and M may have finite or infinite dimension
- |a(v, w)| ≤ C_a ‖v‖_X ‖w‖_X for all v, w ∈ X
- a(v, w) = a(w, v) and a(v, v) ≥ 0 for all v, w ∈ X
- |b(v, q)| ≤ C_b ‖v‖_X ‖q‖_M for all v ∈ X, q ∈ M
- f ∈ X′ and g ∈ M′ are bounded linear functionals

Saddle point problems

The Lagrange function

  L(v, q) = J(v) + [b(v, q) − g(q)],   q ∈ M,

coincides with J when the constraints
are satisfied.

Saddle point problem: find (u, p) ∈ X × M such that

  a(u, v) + b(v, p) = f(v)   for all v ∈ X,
  b(u, q) = g(q)             for all q ∈ M.

The solution is a saddle point of L:

  L(u, q) ≤ L(u, p) ≤ L(v, p)   for all (v, q) ∈ X × M.

Matrix formulation

(Potentially after discretization) find (x, y) such that

  𝒜 [x; y] = [f; g],   𝒜 = [A  B^T; B  0],

where A ∈ R^{n×n} is symmetric positive semidefinite (SPSD) and B ∈ R^{m×n}, m ≤ n, has full rank.

BLUE

  min_v  (1/2) v^T D v   s.t.   v^T 1 = 1

leads to the saddle point problem

  [D    1] [v]   [0]
  [1^T  0] [q] = [1].

Numerical weather prediction

  J(x_0) = (1/2)(x_0 − x_0^B)^T B^{−1}(x_0 − x_0^B) + (1/2) Σ_{i=1}^N (y_i − H_i(x_i))^T R_i^{−1}(y_i − H_i(x_i))
  s.t. x_i = M(x_{i−1}), i = 1, …, N.
After linearisation this leads to the saddle point problem

  [B    0    I ] [λ ]   [b]
  [0    R̂    Ĥ ] [μ ] = [d]
  [I    Ĥ^T  0 ] [δx]   [0].

Stokes equations

  [−∇²  ∇] [u]   [f]
  [∇·   0] [p] = [0]   in Ω, with boundary conditions.

After discretizing we get

  𝒜 [u; p] = [f; g],   𝒜 = [A  B^T; B  0].

Other sources of saddle point problems

- Optimization: quadratic programming, interior point methods, optimal control
- PDEs with conservation laws: biharmonic equations in mixed form, Maxwell equations, Navier–Stokes equations
- Interpolation: hybrid schemes
- Constrained or weighted least squares, ...

See Benzi, Golub and Liesen (2005).

How do we solve saddle point problems?

  𝒜 [x; y] = [f; g] = b

- Direct methods (Gaussian elimination): perform a fixed number of operations and obtain a single approximation
- Iterative methods (Krylov methods): start with an initial guess that is updated at each iteration, giving successively better approximations

How do we solve large systems?

- For large systems direct methods can be infeasible
- Most iterative methods rely on matrix-vector products with 𝒜
- If these matrix-vector products can be applied cheaply then iterative methods can be used to solve very large problems
Sparse matrices

Sparse matrices have few nonzeros, either because of the problem or because of the numerical approximation.

Krylov subspace methods

To solve 𝒜w = b:

- Choose w_0 and compute r_0 = b − 𝒜w_0
- For k = 1, 2, …, choose w_k such that w_k − w_0 ∈ K_k(𝒜, r_0) = span{r_0, 𝒜r_0, …, 𝒜^{k−1} r_0}

There are many methods, depending on matrix properties: the conjugate gradient method, MINRES, GMRES, BiCG, BiCGStab, QMR, TFQMR, IDR, ...

MINRES

If 𝒜 is symmetric we can apply MINRES, which minimizes ‖r_k‖_2.

MINRES convergence bound:

  ‖r_k‖_2 / ‖r_0‖_2 ≤ min_{p ∈ Π_k, p(0)=1} max_{λ ∈ σ(𝒜)} |p(λ)|

[Figure: example eigenvalue distributions on (0, 1).]

Clustered eigenvalues, or few distinct eigenvalues, that are bounded away from the origin are generally good.

How do we ensure fast convergence?
Preconditioning

  𝒜w = [A  B^T; B  0] [x; y] = [f; g] = b

- For MINRES, convergence bounds guarantee fast convergence if the eigenvalues are clustered
- For GMRES, clustered eigenvalues can also be useful (e.g., Campbell et al., 1996)
- If convergence is poor, solve an equivalent (preconditioned) system:

  P^{−1/2} 𝒜 P^{−1/2} (P^{1/2} w) = P^{−1/2} b   or   𝒜 P^{−1} (P w) = b

Schur complement (range space) preconditioners

  𝒜 = [I         0] [A  0 ] [I  A^{−1}B^T]
      [BA^{−1}   I] [0  −S] [0  I        ],   S = BA^{−1}B^T.

With approximations Â ≈ A and Ŝ ≈ S:

  P_cs = [Â  0; 0  Ŝ],   P_us = [Â  B^T; 0  −Ŝ],
  P_cons = [I  0; BÂ^{−1}  I] [Â  0; 0  −Ŝ] [I  Â^{−1}B^T; 0  I]

(Kuznetsov, 1995; Murphy, Golub & Wathen, 2000)

If Â and Ŝ are good approximations of A and S then P^{−1}𝒜 has clustered eigenvalues; many successful preconditioners are based on this approach.

Nonsingularity

  𝒜 [x; y] = [f; g],   A ∈ R^{n×n} SPSD,   B ∈ R^{m×n} full rank, m ≤ n.

Theorem (Theorem 3.2, Benzi, Golub & Liesen (2005)). Assume that A is symmetric positive semidefinite and B has full rank.
Then a necessary and sufficient condition for the saddle point matrix 𝒜 to be nonsingular is null(A) ∩ null(B) = {0}.

We need A to be positive definite on null(B).

The nullspace method

  [A  B^T; B  0] [x; y] = [f; g]

Split x = x_p + x̄, where Bx_p = g and x̄ ∈ null(B). Then

  [A  B^T; B  0] [x̄; y] = [h; 0],   h = f − Ax_p.

Let Z span the nullspace of B (i.e.
BZ = 0), so that x̄ = Z x_n and

  A Z x_n + B^T y = h.

Premultiplying by Z^T eliminates y, since Z^T B^T = 0:

  Z^T A Z x_n = Z^T h,

a smaller, symmetric positive definite system. This is the nullspace method.

Nullspace factorizations

Nullspace method:

1. Find a particular solution x_p ∈ R^n of B x_p = g
2. Find Z whose columns span the nullspace of B
3. Since x = x_p + Z x_n, solve Z^T A Z x_n = Z^T (f − A x_p)
4. Solve the overdetermined system B^T y = f − A x

More generally, x = x_p + x̄ = Y x_r + Z x_n with range(Y) = range(B^T) and range(Z) = null(B).

Nullspace factorization

We want sparse and well conditioned Y and Z. Partition

  𝒜 = [A11  A12  B1^T; A21  A22  B2^T; B1  B2  0],   B1 ∈ R^{m×m} nonsingular,

and take the fundamental basis

  Z_f = [−B1^{−1} B2; I],   Y_f = [B1^{−1}; 0].

Then

  𝒜 = [I             0  0        ] [A11  0  B1^T] [I  B1^{−1} B2    0]
      [B2^T B1^{−T}  I  X B1^{−1}] [0    N  0   ] [0  I             0]
      [0             0  I        ] [B1   0  0   ] [0  B1^{−T} X^T   I],

where N = Z_f^T A Z_f and X = Z_f^T [A11; A21].

Can build preconditioners by approximating and/or dropping blocks.
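The four steps of the nullspace method can be sketched in dense NumPy. This is an illustrative implementation, not the talk's code: the function name and the small random test problem are mine, and it uses the fundamental basis Z_f = [−B1^{−1} B2; I], assuming the leading m×m block B1 of B is nonsingular.

```python
import numpy as np

def nullspace_method(A, B, f, g):
    """Solve [[A, B^T], [B, 0]] [x; y] = [f; g] via the nullspace method.

    Assumes B = [B1 B2] with B1 (the leading m x m block) nonsingular,
    and A symmetric positive definite on null(B).
    """
    m, n = B.shape
    B1, B2 = B[:, :m], B[:, m:]
    # Fundamental nullspace basis Z = [-B1^{-1} B2; I], so B @ Z = 0.
    Z = np.vstack([-np.linalg.solve(B1, B2), np.eye(n - m)])
    # Step 1: particular solution of B x_p = g.
    xp = np.zeros(n)
    xp[:m] = np.linalg.solve(B1, g)
    # Step 3: solve the smaller, SPD reduced system Z^T A Z x_n = Z^T (f - A x_p).
    xn = np.linalg.solve(Z.T @ A @ Z, Z.T @ (f - A @ xp))
    x = xp + Z @ xn
    # Step 4: recover y from the overdetermined (but consistent)
    # system B^T y = f - A x, here via least squares.
    y, *_ = np.linalg.lstsq(B.T, f - A @ x, rcond=None)
    return x, y

# Usage: a small random SPD A and B = [I  B2], so B1 = I is nonsingular.
rng = np.random.default_rng(1)
n, m = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                                # SPD
B = np.hstack([np.eye(m), rng.standard_normal((m, n - m))])
f, g = rng.standard_normal(n), rng.standard_normal(m)

x, y = nullspace_method(A, B, f, g)
print(np.linalg.norm(A @ x + B.T @ y - f), np.linalg.norm(B @ x - g))
```

Both residual norms printed at the end are at roundoff level, confirming that (x, y) solves the full saddle point system even though only the (n − m)-dimensional SPD system Z^T A Z was factorized.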
Central null preconditioner

Keep only the middle factor of the nullspace factorization, with Ñ ≈ N:

  P_cn = [A11  0  B1^T; 0  Ñ  0; B1  0  0]

Eigenvalues of P_cn^{−1}𝒜 depend on N^{−1} A22.

Upper null preconditioner

  P_un = [A11  A12  B1^T; 0  Ñ  0; B1  B2  0]

Theorem (P. & Rees, 2016). Let Ñ be a nonsingular approximation to N. Then σ(P_un^{−1}𝒜) = {1} ∪ σ(Ñ^{−1}N).

Constraint preconditioner

  P_conn = [I             0  0        ] [A11  0  B1^T] [I  B1^{−1} B2    0]
           [B2^T B1^{−T}  I  X B1^{−1}] [0    Ñ  0   ] [0  I             0]
           [0             0  I        ] [B1   0  0   ] [0  B1^{−T} X^T   I]

Theorem (P. & Rees, 2016). The matrix P_conn is an exact constraint preconditioner. Also, if Ñ is nonsingular then σ(P_conn^{−1}𝒜) = {1} ∪ σ(Ñ^{−1}N).

P_conn is equivalent to a preconditioner in GALAHAD.

Summary:

- Diagonal preconditioner: cheap to apply; eigenvalues of P_cn^{−1}𝒜 depend on N^{−1} A22
- Triangular preconditioner: σ(P_un^{−1}𝒜) = {1} ∪ σ(Ñ^{−1}N); can also apply a lower triangular preconditioner using short-term recurrences
- Constraint preconditioner: an exact constraint preconditioner with σ(P_conn^{−1}𝒜) = {1} ∪ σ(Ñ^{−1}N); can be used in projected conjugate gradients (Gould, Hribar & Nocedal, 2001) or projected MINRES (Gould, Orban & Rees, 2014)

Results: Optimization matrices

  min  (1/2) x^T H x + f^T x   s.t.
Bx = g,  x ≥ 0.

A primal-dual interior point method leads to saddle point systems with blocks

  [H + X_k^{−1} Z_k  B^T]
  [B                 0  ].

Results: Optimization problems

Incomplete Cholesky factorizations (iteration counts; * denotes failure to converge):

  Matrix    n      m      Pus  Pcs  Pcons  Pun  Pcn  Pconn
  AUG3DC    3873   1000   10   21   9      16   33   16
  CONT-200  40397  39601  *    *    *      24   43   23
  CVXQP3 S  100    75     7    14   7      6    33   5
  DPKLO1    133    77     15   30   15     8    28   7
  DTOC3     14999  9998   6    12   6      5    10   5
  GOULDQP3  699    349    7    13   6      6    27   6
  LISWET1   10002  10000  6    9    4      2    4    1
  MOSARQP1  2500   700    25   51   25     8    22   7
  PRIMAL1   325    85     4    8    4      13   25   12
  QPCSTAIR  467    356    10   20   10     20   40   19
  STCQP2    4097   2052   13   26   13     21   22   20
  YAO       2002   2000   5    9    4      2    5    1

Results: Optimization matrices

Identity matrices (iteration counts; * denotes failure to converge):

  Matrix    n      m      Pus  Pcs  Pcons  Pun  Pcn  Pconn
  AUG3DC    3873   1000   35   71   35     87   166  91
  CONT-200  40397  39601  *    *    *      28   55   28
  CVXQP3 S  100    75     82   150  83     26   44   26
  DPKLO1    133    77     51   102  51     24   50   25
  DTOC3     14999  9998   *    *    *      6    10   6
  GOULDQP3  699    349    20   39   19     40   71   41
  LISWET1   10002  10000  *    *    *      3    5    4
  MOSARQP1  2500   700    464  927  464    15   29   15
  PRIMAL1   325    85     77   154  77     40   79   41
  QPCSTAIR  467    356    247  490  247    51   93   53
  STCQP2    4097   2052   267  528  267    85   95   93
  YAO       2002   2000   *    *    *      3    5    4

Results: F-matrices

In F-matrices B is a gradient matrix.
  Matrix  n      m      Pus  Pcs  Pcons  Pun  Pcn  Pconn
  DORT    13360  9607   121  237  120    12   32   12
  DORT2   7515   5477   117  233  116    8    30   8
  L3P     17280  12384  44   87   43     16   36   15
  M3P     2160   1584   24   47   23     12   31   11
  S3P     270    207    12   23   11     10   28   9
  dan2    63750  46661  7    13   6      9    48   8

Matrices from Tůma (2002); de Niet and Wubs (2008).

Conclusions and outlook

Conclusions:
- Saddle point problems arise in a variety of applications (constrained minimisation)
- For large sparse linear systems, Krylov subspace methods are often a good choice BUT only if we have a good preconditioner
- Discussed preconditioners based on a nullspace factorization

Outlook:
- What is the best nullspace basis to choose?
- What happens when we replace blocks by approximations?
- What applications is this approach best suited to?

Thank you!

References I

[BGL05] M. Benzi, G. H. Golub, and J. Liesen, Numerical solution of saddle point problems, Acta Numer. 14 (2005), 1–137.
[FJ97] R. Fletcher and T. Johnson, On the stability of null-space methods for KKT systems, SIMAX 18 (1997), 938–958.
[GHN01] N. I. M. Gould, M. E. Hribar, and J. Nocedal, On the solution of equality constrained quadratic programming problems arising in optimization, SISC 23 (2001), 1376–1395.
[GOR14] N. I. M. Gould, D. Orban, and T. Rees, Projected Krylov methods for saddle-point systems, SIMAX 35 (2014), 1329–1343.
[PR16] J. Pestana and T. Rees, Null-space preconditioners for saddle point problems, SIMAX 37 (2016), 1103–1128.
[PW15] J. Pestana and A. J.
Wathen, Natural preconditioning and iterative methods for saddle point systems, SIREV 57 (2015), 71–91.

References II

[RS14] T. Rees and J. A. Scott, The null-space method and its relationship with matrix factorizations for sparse saddle point systems, Tech. Report RAL-TR-2014-016, STFC Rutherford Appleton Laboratory, 2014.