SYMMETRIC INDEFINITE PRECONDITIONERS FOR SADDLE POINT PROBLEMS WITH APPLICATIONS TO PDE-CONSTRAINED OPTIMIZATION PROBLEMS

JOACHIM SCHÖBERL* AND WALTER ZULEHNER†

Abstract. We consider large scale sparse linear systems in saddle point form, or, equivalently, large scale quadratic optimization problems with linear equality constraints. A natural property in this context is the coercivity of the cost functional only on the set satisfying the constraints. To enforce coercivity everywhere, an augmented Lagrangian approach is usually chosen. However, the adjustment of the involved parameters is a critical issue. We present a different approach that is not based on such an explicit augmentation technique. A symmetric and indefinite preconditioner for the saddle point problem is constructed and analyzed. Under appropriate conditions the preconditioned saddle point system is symmetric and positive definite with respect to a particular scalar product. Therefore, conjugate gradient acceleration can be used. If applied to a typical infinite-dimensional optimization problem from optimal control with an $L^2$-type cost functional and an elliptic state equation with distributed control, it is shown that the convergence rate for solving the discretized problem with the proposed preconditioned conjugate gradient method is independent of the mesh size of the underlying subdivision and also independent of the involved regularization parameter. Numerical experiments for such a model problem illustrate the theoretical results.

Key words. saddle point problems, indefinite preconditioners, conjugate gradient methods, optimization problems with PDE-constraints

AMS subject classifications. 65F10, 15A12, 49M15

1. Introduction. In this paper we consider large scale sparse linear systems of equations in saddle point form
\[
\begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix} \begin{pmatrix} x \\ p \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix},
\]
where $A$ is a real, symmetric and positive semi-definite $n$-by-$n$ matrix, $B$ is a real $m$-by-$n$ matrix with full rank $m \le n$, and $B^T$ denotes the transpose of $B$. Such systems typically result from the discretization of mixed variational problems for systems of partial differential equations (in short: PDEs), see Brezzi and Fortin [4], in particular, from the discretization of optimization problems with PDE-constraints.

A natural property of such a problem is that $A$ is positive definite on the kernel of $B$, i.e.,
\[
(Ax, x) > 0 \quad \text{for all } x \in \ker B \text{ with } x \neq 0, \tag{1.1}
\]
where $(x, y)$ denotes the Euclidean scalar product. In combination with the full rank of $B$, this condition guarantees that the matrix
\[
\mathcal{K} = \begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}
\]
is non-singular. As long as the matrix $A$ is positive definite not only on $\ker B$ but on the whole space $\mathbb{R}^n$, the Schur complement $S = B A^{-1} B^T$ is well-defined, and several approaches for an efficient solution procedure have been proposed; see the review article by Benzi, Golub and Liesen [2]. In this paper, however, we focus on systems where $A$ is positive definite in a stable way (to be specified later) only on $\ker B$, a typical situation for certain classes of optimization problems with PDE-constraints.

* Center for Computational Engineering Science, RWTH Aachen, D-52074 Aachen, Germany ([email protected]).
† Institute of Computational Mathematics, Johannes Kepler University, A-4040 Linz, Austria ([email protected]). The work was supported in part by the Austrian Science Foundation (FWF) under the grant SFB F013.
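To see condition (1.1) at work on a concrete instance, the following small check (a Python sketch with toy data chosen purely for illustration; not part of the original presentation) constructs a matrix $A$ that is singular on the whole space but positive definite on $\ker B$, and confirms that $\mathcal K$ is nevertheless non-singular:

    import numpy as np
    from scipy.linalg import null_space

    n, m = 4, 1
    A = np.diag([1.0, 1.0, 1.0, 0.0])     # symmetric positive semi-definite, singular on R^n
    B = np.array([[0.0, 0.0, 0.0, 1.0]])  # full row rank; ker B = span{e1, e2, e3}

    Z = null_space(B)                        # columns form a basis of ker B
    print(np.linalg.eigvalsh(Z.T @ A @ Z))   # all eigenvalues positive, so (1.1) holds

    K = np.block([[A, B.T], [B, np.zeros((m, m))]])
    print(np.linalg.cond(K))                 # finite: K is non-singular although A is not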
One strategy to enforce definiteness on the whole space $\mathbb{R}^n$ is the augmented Lagrangian approach, where the matrix $A$ is replaced by a matrix of the form $A_W = A + B^T W B$ with an appropriate matrix $W$, see e.g. Fortin and Glowinski [5]. This does not change the solution of the equation, and $A_W$ becomes positive definite if $W$ is positive definite. It is, however, a delicate issue to choose the matrix $W$ in order to obtain good convergence properties, see the discussion in Golub and Greif [6].

Another strategy is to use positive definite block diagonal preconditioners, either applied to the original system, see e.g. Rusten and Winther [8], Silvester and Wathen [9], or to the augmented system, see e.g. Golub, Greif and Varah [7]. Then the preconditioned system remains indefinite. The conjugate gradient method is then not applicable; instead, other Krylov subspace methods suitable for indefinite systems, such as MINRES, have been proposed to accelerate the iteration. In Bramble and Pasciak [3] a preconditioning technique was introduced that leads to a symmetric and positive definite system, which can then be solved by the conjugate gradient method. However, this technique requires a symmetric and positive definite approximation $\hat A$ with $\hat A < A$. Thus it could be applied only after augmentation if $A$ is not positive definite on the whole space $\mathbb{R}^n$.

In this paper we take a different approach and construct preconditioners $\hat{\mathcal K}$ for the original system matrix $\mathcal K$ (without augmentation) which, nevertheless, also work well in the case that $A$ is positive definite only on the kernel of $B$. It will be shown that the preconditioned matrix $\hat{\mathcal K}^{-1}\mathcal K$ is even symmetric and positive definite in some appropriate scalar product. Therefore, conjugate gradient acceleration can be applied. In contrast to Bramble and Pasciak [3], this new technique requires a symmetric and positive definite approximation $\hat A$ with $\hat A > A$, which is easier to achieve and can be applied even if $A$ is positive definite only on the kernel of $B$.

The paper is organized as follows: In Section 2 the class of preconditioners is introduced and analyzed. Section 3 describes how the algebraic conditions for the preconditioners are linked to the conditions of the theorem of Brezzi for mixed variational problems, and a general strategy for constructing the preconditioners is sketched. In Section 4 a model problem from optimal control is discussed and preconditioners are constructed which are robust with respect to the mesh size as well as to the involved regularization parameter. Numerical experiments are presented in Section 5, followed by some concluding remarks.

Throughout the paper the following notations are used: $M < N$ ($N > M$) iff $N - M$ is positive definite, and $M \le N$ ($N \ge M$) iff $N - M$ is positive semi-definite, for symmetric matrices $M$ and $N$. For a symmetric and positive definite matrix $M$ the associated scalar product $(v, w)_M$ and norm $\|v\|_M$ are given by $(v, w)_M = (M v, w)$ and $\|v\|_M = (v, v)_M^{1/2}$, where $(v, w)$ (without index) denotes the Euclidean scalar product. The Euclidean norm of a vector $v$ is denoted by $\|v\|$ (without index).

2. A Class of Symmetric and Indefinite Preconditioners. A well-known class of preconditioners is given by
\[
\hat{\mathcal K} = \begin{pmatrix} \hat A & B^T \\ B & B \hat A^{-1} B^T - \hat S \end{pmatrix},
\]
where $\hat A$ and $\hat S$ are symmetric and positive definite matrices, see Bank, Welfert and Yserentant [1]. More precisely, we assume that $\hat A$ and $\hat S$ are preconditioners, i.e., efficient evaluations of $\hat A^{-1} w$ and $\hat S^{-1} q$ are available.

We have the following factorization:
\[
\hat{\mathcal K} = \begin{pmatrix} I & 0 \\ B \hat A^{-1} & I \end{pmatrix} \begin{pmatrix} \hat A & B^T \\ 0 & -\hat S \end{pmatrix},
\]
which implies that $\hat{\mathcal K}$ is non-singular and that the solution of a linear system
\[
\hat{\mathcal K} \begin{pmatrix} w \\ q \end{pmatrix} = \begin{pmatrix} s \\ t \end{pmatrix}
\]
reduces to the consecutive solution of the following three linear systems:
\[
\hat A \hat w = s, \qquad \hat S q = B \hat w - t, \qquad \hat A w = s - B^T q.
\]
So, one application of the preconditioner $\hat{\mathcal K}$ requires two applications of the preconditioner $\hat A$ and one application of the preconditioner $\hat S$.
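In code, one application of $\hat{\mathcal K}^{-1}$ looks as follows (a minimal Python sketch; solve_Ahat and solve_Shat stand for the — here unspecified — applications of $\hat A^{-1}$ and $\hat S^{-1}$, and B is the constraint matrix):

    def apply_khat_inv(s, t, solve_Ahat, solve_Shat, B):
        """Solve Khat (w, q) = (s, t) via the block triangular factorization:
        two applications of Ahat^{-1} and one application of Shat^{-1}."""
        w_tilde = solve_Ahat(s)            # Ahat w_tilde = s
        q = solve_Shat(B @ w_tilde - t)    # Shat q = B w_tilde - t
        w = solve_Ahat(s - B.T @ q)        # Ahat w = s - B^T q
        return w, q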
In Bank, Welfert and Yserentant [1] and later in Zulehner [10], this preconditioner was analyzed for the case that $A$ is positive definite. One important part of the analysis easily carries over to the case considered here:

Theorem 2.1. Assume that $A \ge 0$, condition (1.1) is satisfied, and $\operatorname{rank} B = m$. Let $\hat A > 0$ and $\hat S > 0$.
1. If
\[
\hat A \ge A \quad \text{and} \quad \hat S \le B \hat A^{-1} B^T, \tag{2.1}
\]
then all eigenvalues of $\hat{\mathcal K}^{-1} \mathcal K$ are real and positive.
2. If
\[
\hat A > A \quad \text{and} \quad \hat S < B \hat A^{-1} B^T, \tag{2.2}
\]
then $\hat{\mathcal K}^{-1} \mathcal K$ is symmetric and positive definite with respect to the scalar product
\[
\left( \begin{pmatrix} w \\ q \end{pmatrix}, \begin{pmatrix} x \\ p \end{pmatrix} \right)_{\mathcal D} = ((\hat A - A)x, w) + ((B \hat A^{-1} B^T - \hat S)p, q). \tag{2.3}
\]

Proof. Apply Theorem 5.2 from Zulehner [10] to the regularized matrices $A + \varepsilon I$ and $\hat A + \varepsilon I$ for $\varepsilon > 0$, take the limit $\varepsilon \to 0$ and observe that $\mathcal K$ is non-singular.

However, the estimates for the extreme eigenvalues of $\hat{\mathcal K}^{-1} \mathcal K$ were derived under the assumption that $A$ is positive definite on the whole space. In particular, the estimate for the smallest eigenvalue degenerates if directly applied to the case considered here. In this paper this gap will be closed.

First of all, we have to discuss reasonable assumptions on $\hat A$ and $\hat S$ which measure the quality of these preconditioners. Comparing the matrix $\mathcal K$ and the preconditioner $\hat{\mathcal K}$, it seems natural to consider $\hat A$ as an approximation to $A$ at least on $\ker B$, and to consider $\hat S$ as an approximation to the so-called inexact Schur complement $H$, given by
\[
H = B \hat A^{-1} B^T.
\]
In order to quantify the quality of these approximations, we introduce two positive constants $\alpha$ and $\beta$ with
\[
(Ax, x) \ge \alpha\, (\hat A x, x) \quad \text{for all } x \in \ker B \qquad \text{and} \qquad B \hat A^{-1} B^T \le \beta\, \hat S.
\]
Observe that we still require at least condition (2.1); therefore $\alpha \le 1$ and $\beta \ge 1$. The closer $\alpha$ and $\beta$ are to 1, the better, we expect, the preconditioner $\hat{\mathcal K}$ will be. Now we have:

Theorem 2.2. Assume that $A \ge 0$, condition (1.1) is satisfied, and $\operatorname{rank} B = m$. Let $\hat A > 0$ and $\hat S > 0$ with
\[
\alpha\, (\hat A x, x) \le (Ax, x) \quad \text{for all } x \in \ker B \qquad \text{and} \qquad A \le \hat A, \tag{2.4}
\]
and
\[
\hat S \le B \hat A^{-1} B^T \le \beta\, \hat S. \tag{2.5}
\]
Then
\[
\lambda_{\max}(\hat{\mathcal K}^{-1} \mathcal K) \le \beta + \sqrt{\beta^2 - \beta}
\]
and
\[
\lambda_{\min}(\hat{\mathcal K}^{-1} \mathcal K) \ge \frac{1}{2} \left[ 2 + \alpha - 1/\beta - \sqrt{(2 + \alpha - 1/\beta)^2 - 4\alpha} \right].
\]

Proof. The upper bound follows directly from Theorem 5.2 in Zulehner [10], again by considering the regularized matrices $A + \varepsilon I$ and $\hat A + \varepsilon I$ for $\varepsilon > 0$ with $\varepsilon \to 0$.

For the lower bound we consider an eigenvalue $\lambda$ of the matrix $\hat{\mathcal K}^{-1} \mathcal K$:
\[
\mathcal K \begin{pmatrix} x \\ p \end{pmatrix} = \lambda\, \hat{\mathcal K} \begin{pmatrix} x \\ p \end{pmatrix},
\]
which is equivalent to the eigenvalue problem
\[
\mathcal K \begin{pmatrix} x \\ p \end{pmatrix} = \mu\, \mathcal D \begin{pmatrix} x \\ p \end{pmatrix}
\quad \text{with} \quad
\lambda = \frac{\mu}{1 + \mu}
\quad \text{and} \quad
\mathcal D = \hat{\mathcal K} - \mathcal K = \begin{pmatrix} \hat A - A & 0 \\ 0 & B \hat A^{-1} B^T - \hat S \end{pmatrix},
\]
or, in an equivalent variational form:
\[
\begin{aligned}
(Ax, w) + (Bw, p) &= \mu\, ((\hat A - A)x, w) && \text{for all } w \in \mathbb{R}^n, \\
(Bx, q) &= \mu\, ((B \hat A^{-1} B^T - \hat S)p, q) && \text{for all } q \in \mathbb{R}^m.
\end{aligned}
\]
Now, two cases are distinguished. Firstly, for the case $\mu \le 0$: since $\lambda$ must be positive by Theorem 2.1, it follows that $\mu < -1$ and hence $\lambda = \mu/(1 + \mu) > 1$. (The case $\mu = -1$ can be excluded, since $\hat{\mathcal K}$ is non-singular.) So, in this case, the eigenvalues $\lambda$ are bounded from below by 1.

Next, we consider the remaining case $\mu > 0$. Let
\[
W = \ker B, \qquad W^\perp = \{x \in \mathbb{R}^n : (\hat A x, w) = 0 \text{ for all } w \in W\}.
\]
Then there is a unique representation of $x$ of the following form:
\[
x = x_1 + x_2 \quad \text{with } x_1 \in W \text{ and } x_2 \in W^\perp.
\]
Now the variational form reads:
\[
\begin{aligned}
(Ax_1, w_1) + (Ax_2, w_1) &= \mu \left[ ((\hat A - A)x_1, w_1) - (Ax_2, w_1) \right] && \text{for all } w_1 \in W, \\
(Ax_1, w_2) + (Ax_2, w_2) + (Bw_2, p) &= \mu \left[ -(Ax_1, w_2) + ((\hat A - A)x_2, w_2) \right] && \text{for all } w_2 \in W^\perp, \\
(Bx_2, q) &= \mu\, ((B \hat A^{-1} B^T - \hat S)p, q) && \text{for all } q \in \mathbb{R}^m.
\end{aligned}
\]
From the first equation we obtain for $w_1 = x_1$:
\[
\alpha\, (\hat A x_1, x_1) \le (Ax_1, x_1) = \mu\, ((\hat A - A)x_1, x_1) - (\mu + 1)(Ax_2, x_1).
\]
Using
\[
(Aw_2, w_1) = -((\hat A - A)w_2, w_1) \le ((\hat A - A)w_1, w_1)^{1/2} ((\hat A - A)w_2, w_2)^{1/2} \le \sqrt{1 - \alpha}\, \|w_1\|_{\hat A} \|w_2\|_{\hat A}
\]
for all $w_1 \in W$, $w_2 \in W^\perp$ (note that $(\hat A - A)w_1, w_1) \le (1 - \alpha)(\hat A w_1, w_1)$ on $W$ and $\hat A - A \le \hat A$ by $A \ge 0$), it follows that
\[
\alpha\, \|x_1\|_{\hat A}^2 \le \mu\, (1 - \alpha)\, \|x_1\|_{\hat A}^2 + (\mu + 1) \sqrt{1 - \alpha}\, \|x_1\|_{\hat A} \|x_2\|_{\hat A},
\]
which implies
\[
\alpha\, \|x_1\|_{\hat A} \le \mu\, (1 - \alpha)\, \|x_1\|_{\hat A} + (\mu + 1) \sqrt{1 - \alpha}\, \|x_2\|_{\hat A}.
\]
From the second equation we obtain
\[
\sup_{w_2 \in W^\perp} \frac{(Bw_2, p)}{\|w_2\|_{\hat A}}
= \sup_{w_2 \in W^\perp} \frac{-(\mu + 1)(Ax_1, w_2) + ((\mu(\hat A - A) - A)x_2, w_2)}{\|w_2\|_{\hat A}}
\le (\mu + 1)\sqrt{1 - \alpha}\, \|x_1\|_{\hat A} + (\mu + 1)\, \|x_2\|_{\hat A}.
\]
From the third equation we obtain
\[
\sup_{0 \neq q} \frac{(Bx_2, q)}{\|q\|_H}
= \sup_{0 \neq q} \frac{\mu\, ((B \hat A^{-1} B^T - \hat S)p, q)}{\|q\|_H}
\le \mu\, (1 - 1/\beta)\, \|p\|_H.
\]
Observe that, for the left-hand sides of the last two inequalities, we have the following well-known representations:
\[
\sup_{w_2 \in W^\perp} \frac{(Bw_2, p)}{\|w_2\|_{\hat A}}
= \sup_{w \in \mathbb{R}^n} \frac{(Bw, p)}{\|w\|_{\hat A}}
= (B \hat A^{-1} B^T p, p)^{1/2} = \|p\|_H
\]
and
\[
\sup_{0 \neq q \in \mathbb{R}^m} \frac{(Bx_2, q)}{\|q\|_H}
= (B^T H^{-1} B x_2, x_2)^{1/2}
= (\hat A\, \hat A^{-1} B^T H^{-1} B x_2, x_2)^{1/2}
= (\hat A x_2, x_2)^{1/2} = \|x_2\|_{\hat A},
\]
since $P = \hat A^{-1} B^T H^{-1} B$ is a projector onto $W^\perp$, so $P x_2 = x_2$ for $x_2 \in W^\perp$.

Hence, in summary,
\[
\begin{pmatrix} \alpha & -\sqrt{1-\alpha} & 0 \\ -\sqrt{1-\alpha} & -1 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} \|x_1\|_{\hat A} \\ \|x_2\|_{\hat A} \\ \|p\|_H \end{pmatrix}
\le \mu
\begin{pmatrix} 1-\alpha & \sqrt{1-\alpha} & 0 \\ \sqrt{1-\alpha} & 1 & 0 \\ 0 & 0 & 1 - 1/\beta \end{pmatrix}
\begin{pmatrix} \|x_1\|_{\hat A} \\ \|x_2\|_{\hat A} \\ \|p\|_H \end{pmatrix},
\]
or, in short,
\[
K e \le \mu\, D e \tag{2.6}
\]
with
\[
K = \begin{pmatrix} \alpha & -\sqrt{1-\alpha} & 0 \\ -\sqrt{1-\alpha} & -1 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
D = \begin{pmatrix} 1-\alpha & \sqrt{1-\alpha} & 0 \\ \sqrt{1-\alpha} & 1 & 0 \\ 0 & 0 & 1 - 1/\beta \end{pmatrix}, \quad
e = \begin{pmatrix} \|x_1\|_{\hat A} \\ \|x_2\|_{\hat A} \\ \|p\|_H \end{pmatrix}.
\]
Since $K^{-1}$ is non-negative element-wise, it follows that
\[
e \le \mu\, K^{-1} D e.
\]
Elementary calculations show that
\[
\nu_+ = \frac{1}{2\alpha} \left[ 2 - \alpha - 1/\beta + \sqrt{(2 - \alpha - 1/\beta)^2 + 4\alpha(1 - 1/\beta)} \right]
\]
is a non-negative eigenvalue of $K^{-1} D$ with component-wise non-negative left eigenvector $l^T$, given by
\[
l^T = \left( \sqrt{1 - \alpha},\; 1,\; \alpha \nu_+ + \alpha - 1 \right).
\]
Then $l^T e \le \mu\, \nu_+\, l^T e$. Obviously $l^T e \ge 0$. The case $l^T e = 0$ can easily be excluded by taking into account (2.6) and $e \neq 0$. Therefore, after dividing by $l^T e > 0$, we obtain
\[
\mu \ge \frac{1}{\nu_+}.
\]
Consequently,
\[
\lambda = \frac{\mu}{1 + \mu} \ge \frac{1}{1 + \nu_+} = \frac{1}{2} \left[ 2 + \alpha - 1/\beta - \sqrt{(2 + \alpha - 1/\beta)^2 - 4\alpha} \right].
\]
This lower bound is obviously smaller than 1, which was the lower bound for the first case $\mu \le 0$. This completes the proof.

Under the slightly stronger conditions
\[
\alpha\, (\hat A x, x) \le (Ax, x) \quad \text{for all } x \in \ker B \qquad \text{and} \qquad A < \hat A, \tag{2.7}
\]
and
\[
\hat S < B \hat A^{-1} B^T \le \beta\, \hat S, \tag{2.8}
\]
it is justified by Theorem 2.1 to apply the standard conjugate gradient method to the preconditioned system
\[
\hat{\mathcal K}^{-1} \mathcal K \begin{pmatrix} x \\ p \end{pmatrix} = \hat{\mathcal K}^{-1} \begin{pmatrix} f \\ g \end{pmatrix} \tag{2.9}
\]
with respect to the scalar product (2.3). It is well-known that the error $e^{(k)}$ of the $k$-th iterate $(x^{(k)}, p^{(k)})^T$, measured in the corresponding energy norm, can be estimated by
\[
e^{(k)} \le \frac{2 q^k}{1 + q^{2k}}\, e^{(0)}
\quad \text{with} \quad
q = \frac{\sqrt{\kappa(\hat{\mathcal K}^{-1}\mathcal K)} - 1}{\sqrt{\kappa(\hat{\mathcal K}^{-1}\mathcal K)} + 1},
\]
where $\kappa(\hat{\mathcal K}^{-1}\mathcal K)$ denotes the relative condition number
\[
\kappa(\hat{\mathcal K}^{-1}\mathcal K) = \frac{\lambda_{\max}(\hat{\mathcal K}^{-1}\mathcal K)}{\lambda_{\min}(\hat{\mathcal K}^{-1}\mathcal K)}.
\]
From Theorem 2.2 the following upper bound for the relative condition number follows:
\[
\kappa(\hat{\mathcal K}^{-1}\mathcal K) \le \frac{2\left(\beta + \sqrt{\beta^2 - \beta}\right)}{2 + \alpha - 1/\beta - \sqrt{(2 + \alpha - 1/\beta)^2 - 4\alpha}} \equiv \kappa(\alpha, \beta).
\]
This shows that the convergence rate $q$ can be bounded in terms of $\alpha$ and $\beta$ only. If the preconditioners are chosen such that $\alpha$ and $\beta$ are independent of certain parameters like the mesh size $h$ of the discretization or some involved regularization parameter $\nu$, then the convergence rate is also robust with respect to such parameters.
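The bound $\kappa(\alpha, \beta)$ is explicit, so it is easy to evaluate; the following Python fragment (an illustrative helper, not from the paper) computes it together with the resulting convergence factor $q$:

    import numpy as np

    def kappa_bound(alpha, beta):
        """Upper bound kappa(alpha, beta) from Theorem 2.2 (0 < alpha <= 1 <= beta)."""
        lam_max = beta + np.sqrt(beta**2 - beta)
        d = 2.0 + alpha - 1.0 / beta
        lam_min = 0.5 * (d - np.sqrt(d**2 - 4.0 * alpha))
        return lam_max / lam_min

    kappa = kappa_bound(2.0 / 3.0, 4.0 / 3.0)    # the values arising in Section 4
    q = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    print(kappa, q)                              # approximately 4.4 and 0.35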
Furthermore, for $\alpha \to 1$ and $\beta \to 1$, the lower and upper bounds for the eigenvalues in Theorem 2.2 both approach 1 (implying that all eigenvalues of the preconditioned matrix $\hat{\mathcal K}^{-1}\mathcal K$ approach 1), leading to a relative condition number approaching 1 and a convergence factor $q$ approaching 0.

In the limit case $\alpha = 1$ and $\beta = 1$ one easily derives from the conditions (2.4) and (2.5) the following representations for the preconditioners:
\[
\hat A = A + B^T W B \quad \text{and} \quad \hat S = B \hat A^{-1} B^T
\]
for some matrix $W \ge 0$. Then we obtain for $\hat{\mathcal K}$:
\[
\hat{\mathcal K} = \begin{pmatrix} A + B^T W B & B^T \\ B & 0 \end{pmatrix}.
\]
From the considerations above it follows in this case that all eigenvalues of $\hat{\mathcal K}^{-1}\mathcal K$ must be equal to 1. Moreover, it can easily be shown that
\[
\left[ \hat{\mathcal K}^{-1}\mathcal K - I \right]^2 = 0,
\]
so the corresponding Richardson method terminates at the solution after two steps.

In a simplified way one could describe the proposed strategy as follows: good preconditioners $\hat A$ can be interpreted as good approximations to some augmented matrix $A + B^T W B$, but we do not change the matrix $A$ itself in the system matrix $\mathcal K$. This seems to be only a slight variation of the augmented Lagrangian approach, where $A$ itself is replaced by $A + B^T W B$ in $\mathcal K$. However, the actual construction of the preconditioner is not based on first selecting some augmentation matrix $W$ and then preconditioning the augmented matrix. Instead, as will be detailed in the next section, it follows naturally from the analysis of an underlying (infinite-dimensional) variational problem, whose discretization leads to the discussed large scale linear systems of equations in saddle point form.

3. Application to mixed variational problems. Let the (infinite-dimensional) variational problem be of the following form: Find $x \in X$ and $p \in Q$ such that
\[
\begin{aligned}
a(x, w) + b(w, p) &= \langle F, w \rangle && \text{for all } w \in X, \\
b(x, q) &= \langle G, q \rangle && \text{for all } q \in Q.
\end{aligned}
\]
Here, $X$ and $Q$ are real Hilbert spaces, $a: X \times X \to \mathbb{R}$ and $b: X \times Q \to \mathbb{R}$ are bilinear forms, $F: X \to \mathbb{R}$ and $G: Q \to \mathbb{R}$ are continuous linear functionals, and $\langle F, w \rangle$ ($\langle G, q \rangle$) denotes the evaluation of $F$ ($G$) at the element $w$ ($q$). The existence and uniqueness of a solution to this mixed variational problem is well-established (theorem of Brezzi, see Brezzi and Fortin [4]) under the following conditions:

1. The bilinear form $a$ is bounded; in particular, its norm $\|a\|$ satisfies
\[
a(w, w) \le \|a\|\, \|w\|_X^2 \quad \text{for all } w \in X.
\]
2. The bilinear form $a$ is coercive on $\ker B = \{w \in X : b(w, q) = 0 \text{ for all } q \in Q\}$: there exists a constant $\alpha_0 > 0$ such that
\[
a(w, w) \ge \alpha_0\, \|w\|_X^2 \quad \text{for all } w \in \ker B.
\]
3. The bilinear form $b$ is bounded; in particular, its norm $\|b\|$ satisfies
\[
\sup_{0 \neq w \in X} \frac{b(w, q)}{\|w\|_X} \le \|b\|\, \|q\|_Q \quad \text{for all } q \in Q.
\]
4. The bilinear form $b$ satisfies the inf-sup condition: there exists a constant $k_0 > 0$ such that
\[
\sup_{0 \neq w \in X} \frac{b(w, q)}{\|w\|_X} \ge k_0\, \|q\|_Q \quad \text{for all } q \in Q.
\]

Under the additional assumptions that
5. the bilinear form $a$ is symmetric on $X$: $a(x, w) = a(w, x)$ for all $x, w \in X$, and
6. the bilinear form $a$ is non-negative on $X$: $a(w, w) \ge 0$ for all $w \in X$,
the theorem of Brezzi implies the equivalence of the mixed variational problem to the following constrained optimization problem: Find $x \in X_g$ such that
\[
J(x) = \min_{w \in X_g} J(w) \tag{3.1}
\]
with
\[
J(w) = \frac{1}{2}\, a(w, w) - \langle F, w \rangle
\quad \text{and} \quad
X_g = \{w \in X : b(w, q) = \langle G, q \rangle \text{ for all } q \in Q\}.
\]
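For readers who prefer the optimization point of view, one reasoning step (standard, but not spelled out in the original text) makes the equivalence transparent: the mixed variational problem is the first-order optimality system of the Lagrangian
\[
\mathcal L(w, q) = \frac{1}{2}\, a(w, w) - \langle F, w \rangle + b(w, q) - \langle G, q \rangle;
\]
setting the derivatives of $\mathcal L$ with respect to $w$ and $q$ to zero yields exactly the two variational equations, with $p$ playing the role of the Lagrange multiplier for the constraint $w \in X_g$.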
For discretizing the infinite-dimensional problem the spaces $X$ and $Q$ are replaced by finite-dimensional subspaces $X_h \subset X$ and $Q_h \subset Q$, which results in the following finite-dimensional variational problem: Find $x_h \in X_h$ and $p_h \in Q_h$ such that
\[
\begin{aligned}
a(x_h, w_h) + b(w_h, p_h) &= \langle F, w_h \rangle && \text{for all } w_h \in X_h, \\
b(x_h, q_h) &= \langle G, q_h \rangle && \text{for all } q_h \in Q_h.
\end{aligned}
\]
By introducing suitable basis functions in $X_h$ and $Q_h$, we finally obtain the following saddle point problem in matrix-vector notation:
\[
A_h x_h + B_h^T p_h = f_h, \qquad B_h x_h = g_h,
\]
where $x_h$ and $p_h$ denote the corresponding vectors of coefficients with respect to these basis functions.

We assume that the conditions of the theorem of Brezzi are also satisfied in $X_h$ and $Q_h$. This is trivial for the first and the third condition, but not at all for the second and the fourth. To simplify the notation, the same symbols are used for the constants. In matrix-vector notation these conditions read:
\[
A_h \le \|a\|\, X_h, \tag{3.2}
\]
\[
(A_h w_h, w_h) \ge \alpha_0\, (X_h w_h, w_h) \quad \text{for all } w_h \in \ker B_h, \tag{3.3}
\]
\[
B_h X_h^{-1} B_h^T \le \|b\|^2\, Q_h, \tag{3.4}
\]
\[
B_h X_h^{-1} B_h^T \ge k_0^2\, Q_h. \tag{3.5}
\]
Here, $X_h$ and $Q_h$ denote the matrices representing the scalar products $(x, w)_X$ and $(p, q)_Q$ as bilinear forms on $X_h$ and $Q_h$, respectively:
\[
(x_h, w_h)_X = (X_h x_h, w_h), \qquad (p_h, q_h)_Q = (Q_h p_h, q_h).
\]
For the third and fourth condition we used the well-known representation
\[
\sup_{0 \neq w_h \in X_h} \frac{b(w_h, q_h)}{\|w_h\|_X} = (B_h X_h^{-1} B_h^T q_h, q_h)^{1/2}.
\]
Comparing with the conditions (2.7) and (2.8), it is reasonable to choose for $\hat A_h$ a properly scaled preconditioner of $X_h$, and for $\hat S_h$ a properly scaled preconditioner of $Q_h$, say
\[
\hat A_h = \frac{1}{\sigma}\, \hat X_h \quad \text{and} \quad \hat S_h = \frac{\sigma}{\tau}\, \hat Q_h \tag{3.6}
\]
for some real parameters $\sigma > 0$ and $\tau > 0$, and with preconditioners $\hat X_h$ and $\hat Q_h$ satisfying, e.g., the spectral estimates
\[
(1 - q_X)\, (\hat X_h w_h, w_h) \le (w_h, w_h)_X \le (\hat X_h w_h, w_h) \tag{3.7}
\]
and
\[
(1 - q_Q)\, (\hat Q_h q_h, q_h) \le (q_h, q_h)_Q \le (\hat Q_h q_h, q_h). \tag{3.8}
\]
Combining all estimates we easily obtain the following result:

Lemma 3.1. Assume that (3.2)-(3.8) hold. Then the conditions (2.7) and (2.8) are satisfied with
\[
\alpha = \sigma\, (1 - q_X)\, \alpha_0 \quad \text{and} \quad \beta = \tau\, \|b\|^2,
\]
if the parameters $\sigma$ and $\tau$ are chosen such that
\[
\sigma < \frac{1}{\|a\|} \quad \text{and} \quad \tau > \frac{1}{(1 - q_X)(1 - q_Q)\, k_0^2}.
\]

Proof. We have
\[
A_h \le \|a\|\, X_h \le \|a\|\, \hat X_h = \sigma\, \|a\|\, \hat A_h < \hat A_h
\]
if $\sigma < 1/\|a\|$. Next,
\[
(A_h w_h, w_h) \ge \alpha_0\, (X_h w_h, w_h) \ge (1 - q_X)\, \alpha_0\, (\hat X_h w_h, w_h) = \alpha\, (\hat A_h w_h, w_h) \quad \text{for all } w_h \in \ker B_h
\]
with $\alpha = \sigma\, (1 - q_X)\, \alpha_0$. Next,
\[
B_h \hat A_h^{-1} B_h^T = \sigma\, B_h \hat X_h^{-1} B_h^T \le \sigma\, B_h X_h^{-1} B_h^T \le \sigma\, \|b\|^2\, Q_h \le \sigma\, \|b\|^2\, \hat Q_h = \beta\, \hat S_h
\]
with $\beta = \tau\, \|b\|^2$. Finally,
\[
B_h \hat A_h^{-1} B_h^T = \sigma\, B_h \hat X_h^{-1} B_h^T \ge \sigma\, (1 - q_X)\, B_h X_h^{-1} B_h^T \ge \sigma\, (1 - q_X)\, k_0^2\, Q_h \ge \sigma\, (1 - q_X)(1 - q_Q)\, k_0^2\, \hat Q_h = \tau\, (1 - q_X)(1 - q_Q)\, k_0^2\, \hat S_h > \hat S_h
\]
if $\tau > 1/[(1 - q_X)(1 - q_Q)\, k_0^2]$.

Good and efficient preconditioners $\hat X_h$ and $\hat Q_h$ are usually available, as will be shown for a model problem in the next section. Therefore, the quantities $q_X$ and $q_Q$, which measure the quality of these preconditioners, are typically quite small, say about 0.1. Roughly speaking, the parameter $\sigma$ has to be sufficiently small, while the parameter $\tau$ has to be sufficiently large, in order to guarantee the conditions (2.7) and (2.8). On the other hand, in order to obtain low upper bounds for the condition number of the preconditioned matrix, $\alpha$ should be as large as possible and $\beta$ as small as possible, i.e., $\sigma$ should be as large as possible and $\tau$ as small as possible. This, of course, requires at least a rough quantitative knowledge of the constants $\|a\|$ and $k_0$, which are involved in the choice of $\sigma$ and $\tau$.
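The bookkeeping of Lemma 3.1 is easily automated; the following Python helper (illustrative only — all constants are assumed inputs, not computed from a discretization) returns admissible $\alpha$ and $\beta$ for given Brezzi constants, preconditioner qualities and scaling parameters:

    def spectral_bounds(alpha0, norm_a, norm_b, k0, qX, qQ, sigma, tau):
        """alpha and beta of Lemma 3.1; raises if sigma, tau are inadmissible."""
        if not sigma < 1.0 / norm_a:
            raise ValueError("need sigma < 1/||a||")
        if not tau > 1.0 / ((1.0 - qX) * (1.0 - qQ) * k0**2):
            raise ValueError("need tau > 1/((1-qX)(1-qQ) k0^2)")
        return sigma * (1.0 - qX) * alpha0, tau * norm_b**2

    # model problem of Section 4: ||a|| = ||b|| = 1, alpha0 = 2/3, k0^2 = 3/4
    print(spectral_bounds(2/3, 1.0, 1.0, (3/4)**0.5, 0.05, 0.05, sigma=0.9, tau=1.5))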
Next, we will study a particular model problem from optimal control, where the parameters $\|a\|$, $\alpha_0$, $\|b\|$, and $k_0$ are known.

4. A model problem from optimal control. Let $\Omega \subset \mathbb{R}^d$ be an open and bounded set. We consider the following optimization problem with PDE-constraints: Find the state $y \in H^1(\Omega)$ and the control $u \in L^2(\Omega)$ such that
\[
J(y, u) = \min_{(z,v) \in H^1(\Omega) \times L^2(\Omega)} J(z, v),
\]
subject to the state equation with distributed control $u$ on the right-hand side,
\[
-\Delta y + y = u \ \text{in } \Omega, \qquad \frac{\partial y}{\partial n} = 0 \ \text{on } \partial\Omega,
\]
where the cost functional is given by
\[
J(y, u) = \frac{1}{2} \int_\Omega (y - y_d)^2 \, dx + \frac{\nu}{2} \int_\Omega u^2 \, dx.
\]
More precisely, we prescribe the state equation in weak form:
\[
\int_\Omega \nabla y \cdot \nabla q \, dx + \int_\Omega y\, q \, dx = \int_\Omega u\, q \, dx \quad \text{for all } q \in H^1(\Omega).
\]
Let $X = Y \times U$ with $Y = H^1(\Omega)$, $U = L^2(\Omega)$, and $Q = H^1(\Omega)$. With $x = (y, u) \in X$, $w = (z, v) \in X$ and $q \in Q$ we introduce the following bilinear forms and linear functionals:
\[
\begin{aligned}
a(x, w) &= \int_\Omega y\, z \, dx + \nu \int_\Omega u\, v \, dx, \\
b(w, q) &= \int_\Omega \nabla z \cdot \nabla q \, dx + \int_\Omega z\, q \, dx - \int_\Omega v\, q \, dx, \\
\langle F, w \rangle &= \int_\Omega y_d\, z \, dx, \qquad \langle G, q \rangle = 0.
\end{aligned}
\]
With this setting the optimization problem is of the standard form (3.1).

The conditions of the theorem of Brezzi can easily be verified for the Hilbert spaces $X = Y \times U$ and $Q$ introduced above, equipped with the standard scalar products $(y, z)_{H^1(\Omega)}$ in $Y$, $(u, v)_{L^2(\Omega)}$ in $U$ and $(p, q)_{H^1(\Omega)}$ in $Q$. Then, however, the parameters $\|a\|$, $\alpha_0$, $\|b\|$ and $k_0$ depend on the regularization parameter $\nu$, eventually resulting in convergence rates also depending on $\nu$. With a different scaling of the scalar products in $Y$, $U$ and $Q$ we obtain parameters $\|a\|$, $\alpha_0$, $\|b\|$ and $k_0$ independent of $\nu$, eventually leading to preconditioners with convergence rates robust in $\nu$. In particular, we consider the following new scalar products $(y, z)_Y$ in $Y = H^1(\Omega)$, $(u, v)_U$ in $U = L^2(\Omega)$ and $(p, q)_Q$ in $Q = H^1(\Omega)$:
\[
(y, z)_Y = (y, z)_{L^2(\Omega)} + \sqrt{\nu}\, (y, z)_{H^1(\Omega)}, \qquad (u, v)_U = \nu\, (u, v)_{L^2(\Omega)},
\]
and
\[
(p, q)_Q = \frac{1}{\nu}\, (p, q)_{L^2(\Omega)} + \frac{1}{\sqrt{\nu}}\, (p, q)_{H^1(\Omega)},
\]
and we set $(x, w)_X = (y, z)_Y + (u, v)_U$ for $x = (y, u)$, $w = (z, v) \in X = Y \times U$. With these definitions of the scalar products the following properties can be verified:

Lemma 4.1.
1. The bilinear form $a$ is bounded:
\[
a(x, x) \le \|x\|_X^2 \quad \text{for all } x \in X.
\]
2. The bilinear form $a$ is coercive on $\ker B$:
\[
a(x, x) \ge \alpha_0\, \|x\|_X^2 \quad \text{for all } x \in \ker B, \quad \text{with } \alpha_0 = \frac{2}{3}.
\]
3. The bilinear form $b$ is bounded:
\[
\sup_{0 \neq w \in X} \frac{b(w, q)}{\|w\|_X} \le \|q\|_Q \quad \text{for all } q \in Q.
\]
4. The bilinear form $b$ satisfies the inf-sup condition:
\[
\sup_{0 \neq w \in X} \frac{b(w, q)}{\|w\|_X} \ge k_0\, \|q\|_Q \quad \text{for all } q \in Q, \quad \text{with } k_0 = \sqrt{\frac{3}{4}}.
\]

Proof. (1) is trivial. For (2) take $x = (y, u) \in \ker B$. Then:
\[
(y, q)_{H^1(\Omega)} = (u, q)_{L^2(\Omega)} \quad \text{for all } q \in H^1(\Omega).
\]
In particular, it follows for $q = y$:
\[
\|y\|_{H^1(\Omega)}^2 = (u, y)_{L^2(\Omega)} \le \|u\|_{L^2(\Omega)}\, \|y\|_{L^2(\Omega)},
\]
which implies
\[
\|x\|_X^2 = \|y\|_Y^2 + \|u\|_U^2 \le \|y\|_{L^2(\Omega)}^2 + \sqrt{\nu}\, \|y\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} + \nu\, \|u\|_{L^2(\Omega)}^2.
\]
Then $a(x, x) \ge \alpha_0\, \|x\|_X^2$ is certainly satisfied if
\[
a(x, x) \ge \alpha_0 \left[ \|y\|_{L^2(\Omega)}^2 + \sqrt{\nu}\, \|y\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} + \nu\, \|u\|_{L^2(\Omega)}^2 \right],
\]
which is equivalent to
\[
(1 - \alpha_0)\, \|y\|_{L^2(\Omega)}^2 - \alpha_0 \sqrt{\nu}\, \|y\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} + (1 - \alpha_0)\, \nu\, \|u\|_{L^2(\Omega)}^2 \ge 0.
\]
This is obviously the case for $\alpha_0 = 2/3$, since
\[
\frac{1}{3} \|y\|_{L^2(\Omega)}^2 - \frac{2}{3} \sqrt{\nu}\, \|y\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} + \frac{1}{3} \nu\, \|u\|_{L^2(\Omega)}^2
= \frac{1}{3} \left( \|y\|_{L^2(\Omega)} - \sqrt{\nu}\, \|u\|_{L^2(\Omega)} \right)^2 \ge 0.
\]
To show (3) and (4) we start with the following formula:
\[
\sup_{0 \neq w \in X} \frac{b(w, q)^2}{\|w\|_X^2}
= \sup_{0 \neq (z,v) \in Y \times U} \frac{\left[ (z, q)_{H^1(\Omega)} - (v, q)_{L^2(\Omega)} \right]^2}{\|z\|_Y^2 + \|v\|_U^2}
= \sup_{0 \neq z \in Y} \frac{(z, q)_{H^1(\Omega)}^2}{\|z\|_Y^2} + \sup_{0 \neq v \in U} \frac{(v, q)_{L^2(\Omega)}^2}{\|v\|_U^2}
= \sup_{0 \neq z \in Y} \frac{(z, q)_{H^1(\Omega)}^2}{\|z\|_Y^2} + \frac{1}{\nu}\, \|q\|_{L^2(\Omega)}^2.
\]
Then (3) easily follows from the estimates
\[
\sup_{0 \neq z \in Y} \frac{(z, q)_{H^1(\Omega)}^2}{\|z\|_Y^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
\le \sup_{0 \neq z \in Y} \frac{\|z\|_{H^1(\Omega)}^2 \|q\|_{H^1(\Omega)}^2}{\|z\|_Y^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
= \sup_{0 \neq z \in Y} \frac{\|z\|_{H^1(\Omega)}^2 \|q\|_{H^1(\Omega)}^2}{\|z\|_{L^2(\Omega)}^2 + \sqrt{\nu}\, \|z\|_{H^1(\Omega)}^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
\le \frac{1}{\sqrt{\nu}} \|q\|_{H^1(\Omega)}^2 + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2 = \|q\|_Q^2.
\]
For (4) observe that
\[
\sup_{0 \neq z \in Y} \frac{(z, q)_{H^1(\Omega)}^2}{\|z\|_Y^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
\ge \frac{\|q\|_{H^1(\Omega)}^4}{\|q\|_Y^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
= \frac{\|q\|_{H^1(\Omega)}^4}{\|q\|_{L^2(\Omega)}^2 + \sqrt{\nu}\, \|q\|_{H^1(\Omega)}^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2.
\]
The inf-sup condition
\[
\sup_{0 \neq w \in X} \frac{b(w, q)}{\|w\|_X} \ge k_0\, \|q\|_Q
\]
is certainly satisfied if
\[
\frac{\|q\|_{H^1(\Omega)}^4}{\|q\|_{L^2(\Omega)}^2 + \sqrt{\nu}\, \|q\|_{H^1(\Omega)}^2} + \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2
\ge k_0^2\, \|q\|_Q^2
= k_0^2 \left[ \frac{1}{\nu} \|q\|_{L^2(\Omega)}^2 + \frac{1}{\sqrt{\nu}} \|q\|_{H^1(\Omega)}^2 \right],
\]
which is equivalent to
\[
(1 - k_0^2)\, \|q\|_{H^1(\Omega)}^4 + (1 - 2k_0^2)\, \frac{1}{\sqrt{\nu}}\, \|q\|_{L^2(\Omega)}^2 \|q\|_{H^1(\Omega)}^2 + (1 - k_0^2)\, \frac{1}{\nu}\, \|q\|_{L^2(\Omega)}^4 \ge 0.
\]
This is obviously the case for $k_0^2 = 3/4$, since
\[
\frac{1}{4} \|q\|_{H^1(\Omega)}^4 - \frac{1}{2} \frac{1}{\sqrt{\nu}} \|q\|_{L^2(\Omega)}^2 \|q\|_{H^1(\Omega)}^2 + \frac{1}{4} \frac{1}{\nu} \|q\|_{L^2(\Omega)}^4
= \frac{1}{4} \left( \|q\|_{H^1(\Omega)}^2 - \frac{1}{\sqrt{\nu}} \|q\|_{L^2(\Omega)}^2 \right)^2 \ge 0.
\]

By the theorem of Brezzi it now follows that the optimization problem is equivalent to the following mixed variational problem: Find $x \in H^1(\Omega) \times L^2(\Omega)$ and $p \in H^1(\Omega)$ such that
\[
\begin{aligned}
a(x, w) + b(w, p) &= \langle F, w \rangle && \text{for all } w \in H^1(\Omega) \times L^2(\Omega), \\
b(x, q) &= 0 && \text{for all } q \in H^1(\Omega).
\end{aligned}
\]
For the spaces $Y_h = U_h = Q_h$ we choose, as an example, the space of piecewise linear and continuous functions on a simplicial subdivision of $\Omega$. By introducing the standard nodal basis, we finally obtain the following saddle point problem in matrix-vector notation:
\[
A_h x_h + B_h^T p_h = f_h, \qquad B_h x_h = 0,
\]
with
\[
A_h = \begin{pmatrix} M_h & 0 \\ 0 & \nu M_h \end{pmatrix} \quad \text{and} \quad B_h = \begin{pmatrix} K_h & -M_h \end{pmatrix},
\]
where $M_h$ denotes the mass matrix representing the $L^2(\Omega)$ inner product on $Y_h$, and $K_h$ denotes the stiffness matrix representing the bilinear form of the state equation, here $(\nabla y, \nabla q)_{L^2(\Omega)} + (y, q)_{L^2(\Omega)}$, on $Y_h$. For the matrices $X_h$ and $Q_h$ representing the scalar products $(x, w)_X = (y, z)_Y + (u, v)_U$ and $(p, q)_Q$ on $X_h$ and $Q_h$ we obtain
\[
X_h = \begin{pmatrix} Y_h & 0 \\ 0 & \nu M_h \end{pmatrix} \quad \text{and} \quad Q_h = \frac{1}{\nu}\, Y_h
\]
with
\[
Y_h = \sqrt{\nu}\, K_h + M_h.
\]
Observe that $Y_h$ is the matrix representing the bilinear form $\sqrt{\nu}\, (\nabla y, \nabla q)_{L^2(\Omega)} + (\sqrt{\nu} + 1)\, (y, q)_{L^2(\Omega)}$ on $Y_h$. It is easy to see that Lemma 4.1 remains valid with the same constants if $Y$, $U$, $Q$ are replaced by the finite-dimensional spaces $Y_h$, $U_h$, $Q_h$, as long as $U_h = Q_h \subset Y_h$.

As discussed before, it is reasonable to use for $\hat A_h$ a (scaled) preconditioner for $X_h$, and for $\hat S_h$ a (scaled) preconditioner for $Q_h$. For $Y_h$, which appears in the first diagonal block of $X_h$ and in $Q_h$, we use, e.g., a standard multigrid preconditioner $\hat Y_h$ for the second-order elliptic differential operator represented by the bilinear form $\sqrt{\nu}\, (\nabla y, \nabla q)_{L^2(\Omega)} + (\sqrt{\nu} + 1)\, (y, q)_{L^2(\Omega)}$. For the well-conditioned matrix $M_h$, which appears in the second diagonal block of $X_h$, a simple preconditioner $\hat M_h$ is used, e.g. a few steps of a symmetric Gauss-Seidel iteration. So, eventually, we set
\[
\hat A_h = \frac{1}{\sigma}\, \hat X_h = \frac{1}{\sigma} \begin{pmatrix} \hat Y_h & 0 \\ 0 & \nu \hat M_h \end{pmatrix}
\quad \text{and} \quad
\hat S_h = \frac{\sigma}{\tau}\, \frac{1}{\nu}\, \hat Y_h \tag{4.1}
\]
with real parameters $\sigma > 0$ and $\tau > 0$.

So, in summary, the preconditioner
\[
\hat{\mathcal K}_h = \begin{pmatrix} \hat A_h & B_h^T \\ B_h & B_h \hat A_h^{-1} B_h^T - \hat S_h \end{pmatrix}
\quad \text{for the matrix} \quad
\mathcal K_h = \begin{pmatrix} A_h & B_h^T \\ B_h & 0 \end{pmatrix}
\]
is given by (4.1), where $\hat Y_h$ is a preconditioner for the second-order elliptic differential operator represented by the bilinear form $\sqrt{\nu}\, (\nabla y, \nabla q)_{L^2(\Omega)} + (\sqrt{\nu} + 1)\, (y, q)_{L^2(\Omega)}$, and $\hat M_h$ is a preconditioner for the mass matrix. It is reasonable to assume that
\[
(1 - q_X)\, (\hat Y_h y_h, y_h) \le (y_h, y_h)_Y \le (\hat Y_h y_h, y_h) \tag{4.2}
\]
and
\[
(1 - q_X)\, (\hat M_h u_h, u_h) \le (u_h, u_h)_{L^2(\Omega)} \le (\hat M_h u_h, u_h) \tag{4.3}
\]
for some small value $q_X \in (0, 1)$. The factor $q_X$ describes the quality of the preconditioners $\hat Y_h$ and $\hat M_h$.
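Concretely, the application of $\hat A_h^{-1}$ and $\hat S_h^{-1}$ from (4.1) only involves the two basic solvers. A Python sketch follows (solve_Yhat and solve_Mhat are assumed callables applying one multigrid cycle for $\hat Y_h$ and a few symmetric Gauss-Seidel steps for $\hat M_h$, respectively; the state/control splitting of vectors is schematic):

    import numpy as np

    def make_block_solvers(solve_Yhat, solve_Mhat, nu, sigma, tau, ny):
        """Solvers for Ahat = (1/sigma) diag(Yhat, nu Mhat) and
        Shat = sigma/(tau nu) Yhat, cf. (4.1); ny = number of state unknowns."""
        def solve_Ahat(s):
            y_part, u_part = s[:ny], s[ny:]
            return np.concatenate([sigma * solve_Yhat(y_part),
                                   (sigma / nu) * solve_Mhat(u_part)])
        def solve_Shat(r):
            return (tau * nu / sigma) * solve_Yhat(r)
        return solve_Ahat, solve_Shat

Combined with the three-solve procedure from Section 2, this yields one application of $\hat{\mathcal K}_h^{-1}$.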
Then the discussion of the last section shows that the conditions (2.7) and (2.8) are satisfied with
\[
\alpha = \sigma\, (1 - q_X)\, \frac{2}{3} \quad \text{and} \quad \beta = \tau
\]
for parameters $\sigma$ and $\tau$ satisfying
\[
\sigma < 1 \quad \text{and} \quad \tau > \frac{4}{3\, (1 - q_X)^2}.
\]
In particular, assuming that $q_X \approx 0$, we can expect $\alpha \approx 2/3$ and $\beta \approx 4/3$ for $\sigma \approx 1$ and $\tau \approx 4/3$, leading to a rough estimate of the condition number $\kappa \approx \kappa(2/3, 4/3) \approx 4$, which implies a convergence factor $q \approx 1/3$ for the conjugate gradient method.

Remark 1. Observe that the mass matrix resulting from the first term of the $L^2$-type cost functional is approximated by a preconditioner for a second-order differential operator, at first sight a very unusual approach, but completely natural in the context of the conditions of the theorem of Brezzi. The more straightforward approach, namely approximating the mass matrix by some lumped mass matrix, which is easy to invert, leads to a much more complicated inexact Schur complement, which can be interpreted as a discretized fourth-order differential operator and is much harder to approximate. With our choice of the preconditioner for the mass matrix, the inexact Schur complement remains a discretized second-order differential operator.

5. Numerical Experiments. We consider the optimal control problem from the previous section on the unit cube $\Omega = (0, 1)^3$ and with homogeneous data $y_d \equiv 0$. Starting from an initial mesh of 24 tetrahedra (starting level $l = 1$), we obtain a hierarchy of nested meshes by uniform refinement up to some final level $l = L$. On each tetrahedral mesh piecewise linear and continuous finite elements are used for $Y_h = U_h = Q_h$.

The discretized mixed problem is solved on the finest mesh (level $l = L$) by using the conjugate gradient method for the preconditioned system (2.9) with the scalar product (2.3), as described before. For the preconditioner we use the proposed symmetric block preconditioner, where $\hat Y_h$ is one V-cycle of the multigrid method with $m_1$ forward Gauss-Seidel steps for pre-smoothing and $m_1$ backward Gauss-Seidel steps for post-smoothing (in short: $V(m_1, m_1)$) for the second-order elliptic differential operator represented by the bilinear form $\sqrt{\nu}\, (\nabla y, \nabla q)_{L^2(\Omega)} + (\sqrt{\nu} + 1)\, (y, q)_{L^2(\Omega)}$. For $\hat M_h$ we use $m_2$ steps of the symmetric Gauss-Seidel method (in short: SGS($m_2$)).

Starting values $x_h^{(0)}$ and $p_h^{(0)}$ are generated randomly. The exact solution of the problem is the trivial solution $x_h = 0$ and $p_h = 0$. The quality of an approximation $(x_h^{(k)}, p_h^{(k)})$ is measured either by the energy norm $e^{(k)}$ of the error, which here is given by
\[
e^{(k)} = \left\| \begin{pmatrix} x_h^{(k)} \\ p_h^{(k)} \end{pmatrix} \right\|_{\mathcal D_h \hat{\mathcal K}_h^{-1} \mathcal K_h},
\]
or by the (Euclidean) norm $r^{(k)}$ of the residual:
\[
r^{(k)} = \left\| \mathcal K_h \begin{pmatrix} x_h^{(k)} \\ p_h^{(k)} \end{pmatrix} \right\|.
\]
Figure 5.1 shows a typical convergence history (number of iterations versus $e^{(k)}/e^{(0)}$ and $r^{(k)}/r^{(0)}$) for level $L = 5$ (number of unknowns $3 \times 17{,}985$) and regularization parameter $\nu = 1$, using a $V(3, 3)$-cycle for $\hat Y_h$, SGS(3) for $\hat M_h$, and parameters $\sigma = 0.9$ and $\tau = 1.1/k_0^2$.

Table 5.1 shows that the number of iterations does not depend on the level of refinement. Here $L$ denotes the level of refinement, $n$ the total number of all unknowns $y_h$, $u_h$ and $p_h$, and $k$ the number of iterations needed to satisfy the stopping rule
\[
r^{(k)} \le \varepsilon\, r^{(0)} \quad \text{with } \varepsilon = 10^{-8}.
\]
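The iteration itself is the standard conjugate gradient method, run in the scalar product (2.3). The following Python sketch shows one way to organize such an iteration (illustrative only; apply_K, apply_Khat and solve_Khat are assumed callables for $\mathcal K_h$, $\hat{\mathcal K}_h$ and $\hat{\mathcal K}_h^{-1}$ acting on full vectors $(x_h; p_h)$; a practical implementation would avoid the extra residual evaluation per step):

    import numpy as np

    def pcg_saddle(apply_K, apply_Khat, solve_Khat, b, x0, eps=1e-8, maxit=200):
        """CG for Khat^{-1} K x = Khat^{-1} b in the D-scalar product (2.3),
        (u, v)_D = ((Khat - K) u, v); stopping rule r(k) <= eps r(0)."""
        def inner_D(u, v):
            return np.dot(apply_Khat(u) - apply_K(u), v)
        x = x0.copy()
        r = solve_Khat(b - apply_K(x))          # preconditioned residual
        p = r.copy()
        rho = inner_D(r, r)
        res0 = np.linalg.norm(b - apply_K(x))
        for k in range(1, maxit + 1):
            Mp = solve_Khat(apply_K(p))         # Khat^{-1} K p
            gamma = rho / inner_D(p, Mp)
            x += gamma * p
            r -= gamma * Mp
            if np.linalg.norm(b - apply_K(x)) <= eps * res0:
                return x, k
            rho_new = inner_D(r, r)
            p = r + (rho_new / rho) * p
            rho = rho_new
        return x, maxit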
Table 5.1
Dependence of the number of iterations on the mesh size for fixed ν = 1.

    level L                 3       4       5        6          7
    number of unknowns n    1,107   7,395   53,955   412,035    3,200,227
    iterations k            14      15      15       16         15

Fig. 5.1. Convergence history: number of iterations versus relative accuracy (energy norms and residuals, plotted from 1 down to about 1e-16 over 40 iterations).

Table 5.2 shows that the number of iterations does not depend on the regularization parameter $\nu$. The results are given for refinement level $L = 5$.

Table 5.2
Dependence of the number of iterations on ν for fixed refinement level L = 5.

    ν               10^{-4}   10^{-2}   1    10^2   10^4
    iterations k    15        14        15   14     15

6. Concluding remarks. A class of symmetric and indefinite preconditioners has been analyzed for saddle point problems where $A$ is positive definite only on the kernel of $B$. The preconditioned system is symmetric and positive definite with respect to a suitable scalar product and can therefore be solved by a conjugate gradient method. The construction of the preconditioners is guided by the conditions of the theorem of Brezzi. In particular, for the considered optimal control problem with $L^2$-type cost functional and elliptic state equation with distributed control, the mass matrix for the state variable, originating from the $L^2$-type cost functional, is preconditioned by a preconditioner for a modified elliptic equation, at first sight a very unusual approach, but quite natural in the context of the theorem of Brezzi. The presented technique results in preconditioners robust in the mesh size for a large class of saddle point problems. For the considered model problem it is also robust with respect to the regularization parameter.

REFERENCES

[1] R. E. Bank, B. D. Welfert, and H. Yserentant. A class of iterative methods for solving saddle point problems. Numer. Math., 56:645-666, 1990.
[2] M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1-137, 2005.
[3] J. H. Bramble and J. E. Pasciak. A preconditioning technique for indefinite systems resulting from mixed approximations of elliptic problems. Math. Comp., 50:1-17, 1988.
[4] F. Brezzi and M. Fortin. Mixed and Hybrid Finite Element Methods. Springer-Verlag, 1991.
[5] M. Fortin and R. Glowinski. Augmented Lagrangian Methods: Application to the Numerical Solution of Boundary-Value Problems. North-Holland, Amsterdam, 1983.
[6] G. H. Golub and C. Greif. On solving block-structured indefinite linear systems. SIAM J. Sci. Comput., 24(6):2076-2092, 2003.
[7] G. H. Golub, C. Greif, and J. M. Varah. An algebraic analysis of a block diagonal preconditioner for saddle point problems. SIAM J. Matrix Anal. Appl., 27(3):779-792, 2006.
[8] T. Rusten and R. Winther. A preconditioned iterative method for saddle-point problems. SIAM J. Matrix Anal. Appl., 13:887-904, 1992.
[9] D. Silvester and A. Wathen. Fast iterative solution of stabilised Stokes systems. Part II: Using general block preconditioners. SIAM J. Numer. Anal., 31:1352-1367, 1994.
[10] W. Zulehner. Analysis of iterative methods for saddle point problems: a unified approach. Math. Comp., 71:479-505, 2002.