BROKEN SOBOLEV SPACE ITERATION FOR TOTAL
VARIATION REGULARIZED MINIMIZATION PROBLEMS
SÖREN BARTELS
Abstract. We devise an improved iterative scheme for the numerical solution
of total variation regularized minimization problems. The numerical method
realizes a primal-dual iteration with discrete metrics that allow large step sizes.
1. Introduction
Major progress has recently been made in the development of iterative schemes for
the approximate solution of infinite-dimensional, nonsmooth, convex optimization
problems, cf. [NN13, CP11]. By employing an equivalent formulation as a saddle-point problem, the convergence of certain primal-dual methods has been rigorously established. The schemes may be regarded as subdifferential flows of a Lagrange functional, and the employed metrics determine the speed of convergence to the stationary saddle point. Since the numerical schemes are semi-implicit
discretizations of continuous evolution problems the conditions on discretization
parameters that guarantee stability of the method crucially depend on the chosen
metric.
We consider here a model problem that arises in image processing and qualitatively
also in the modeling of damage. It consists in the minimization of the functional
\[
I(u) = \int_\Omega |Du| + \frac{\alpha}{2}\, \|u - g\|_{L^2(\Omega)}^2
\]
among $u \in BV(\Omega) \cap L^2(\Omega)$. Here, $\Omega \subset \mathbb{R}^d$, $d = 2,3$, is a bounded Lipschitz domain, $\alpha > 0$ is a given parameter, and $g \in L^2(\Omega)$ is a given noisy image. The first term on the right-hand side is the total variation of the distributional derivative of $u$, which coincides with the $W^{1,1}(\Omega)$ seminorm if $u$ belongs to this space. The Fenchel conjugate of the total variation functional with respect to the inner product in $L^2(\Omega;\mathbb{R}^d)$ is the indicator functional of the unit ball in $L^\infty(\Omega;\mathbb{R}^d)$. The saddle-point problem associated to the minimization of $I$ thus seeks a pair $(u,p) \in L^2(\Omega) \times H_N(\operatorname{div};\Omega)$ that is stationary for
\[
L(u,p) = -I_{K_1(0)}(p) - (u, \operatorname{div} p) + \frac{\alpha}{2}\, \|u - g\|^2,
\]
where $(\cdot,\cdot)$ and $\|\cdot\|$ denote the inner product and the norm in $L^2(\Omega)$, respectively, and $H_N(\operatorname{div};\Omega)$ is the space of vector fields in $L^2(\Omega;\mathbb{R}^d)$ with square-integrable distributional divergence and vanishing normal component on $\partial\Omega$. The structure of the functional $L$ as the sum of an indicator functional and a linear-quadratic functional makes primal-dual methods attractive.

Date: September 10, 2013.
1991 Mathematics Subject Classification. 65K15 (49M29).
Key words and phrases. Total variation, primal-dual method, finite elements, iterative solution.

With inner products $\langle\cdot,\cdot\rangle$ and $\langle\langle\cdot,\cdot\rangle\rangle$ on subspaces of $L^2(\Omega)$ and $L^2(\Omega;\mathbb{R}^d)$, respectively, the considered numerical
schemes are interpreted as discretizations of the evolution problem
\[
\langle u', v \rangle = -\partial_u L(u,p) = (v, \operatorname{div} p) - \alpha(u - g, v),
\]
\[
\langle\langle p', q \rangle\rangle = \partial_p L(u,p) = -\partial I_{K_1(0)}(p)[q] - (u, \operatorname{div} q),
\]
where $u'$ and $p'$ denote the generalized time-derivatives of $u$ and $p$. The choice of
the inner products has to be made in such a way that (i) the problems resulting from a discretization in space and time can be solved efficiently, (ii) the discrete time-derivatives are bounded in norms that are strong enough to absorb error terms, and (iii) the metrics do not limit relevant discontinuities of solutions.
In the choice of the metrics we exploit the observation that if the data function satisfies $g \in L^\infty(\Omega)$ then the minimizer $u \in BV(\Omega) \cap L^2(\Omega)$ of the functional $I$ satisfies $u \in L^\infty(\Omega)$ and hence belongs to the broken Sobolev space $H^{1/2}(\Omega)$. We employ a discrete version of the inner product in $H^{1/2}(\Omega)$, which enables us to improve the step-size restriction for discretizations to $\tau = O(h^{1/2})$, as opposed to the more restrictive condition $\tau = O(h)$ when the weaker inner product of $L^2(\Omega)$ is employed. The choice of the inner product of $L^2(\Omega;\mathbb{R}^d)$ for the evolution of the dual
variable p allows a pointwise explicit solution of the variational inequality defined
by the second equation. A drawback of the modified iteration is that a nontrivial
but sparse linear system of equations has to be solved in every step. For iterations
based on a lumped version of the inner product of $L^2(\Omega)$ the steps are fully explicit, provided that the term $(\alpha/2)\|u - g\|^2$ is discretized using numerical quadrature.
The outline of this article is as follows. In Section 2 we introduce the required notation and prove an auxiliary discrete interpolation estimate. The modified scheme
and its analysis are presented in Section 3. Numerical experiments that study the
influence of different inner products and choices of initial data on the performance
of the numerical scheme are presented in Section 4.
2. Preliminaries
2.1. Finite element discretization. Given a regular triangulation $\mathcal{T}_h$ of the domain $\Omega$ into triangles or tetrahedra we define
\[
S^1(\mathcal{T}_h) = \{ v_h \in C(\Omega) : v_h|_T \text{ affine for all } T \in \mathcal{T}_h \},
\]
\[
L^0(\mathcal{T}_h) = \{ r_h \in L^1(\Omega) : r_h|_T \text{ constant for all } T \in \mathcal{T}_h \}
\]
and let $I_h : C(\Omega) \to S^1(\mathcal{T}_h)$ denote the nodal interpolation operator on $\mathcal{T}_h$, which is also applied to bounded, piecewise continuous functions. With these spaces a nonconforming discretization of the saddle-point formulation is defined by the following lemma.
Lemma 2.1 (Existence of approximations). There exists a pair $(u_h, p_h) \in S^1(\mathcal{T}_h) \times L^0(\mathcal{T}_h)^d$ that is stationary for
\[
L_h(u_h, p_h) = -I_{K_1(0)}(p_h) + (\nabla u_h, p_h) + \frac{\alpha}{2}\, \|u_h - g\|^2,
\]
where $I_{K_1(0)}(p_h) = 0$ if $|p_h| \le 1$ almost everywhere in $\Omega$ and $I_{K_1(0)}(p_h) = +\infty$ otherwise. The pair $(u_h, p_h) \in S^1(\mathcal{T}_h) \times L^0(\mathcal{T}_h)^d$ satisfies $|p_h| \le 1$ almost everywhere in $\Omega$ and
\[
(\nabla u_h, q_h - p_h) \le 0, \qquad (p_h, \nabla v_h) + \alpha(u_h - g, v_h) = 0
\]
for all $(v_h, q_h) \in S^1(\mathcal{T}_h) \times L^0(\mathcal{T}_h)^d$ with $|q_h| \le 1$ almost everywhere in $\Omega$.
Proof. The result is an immediate consequence of standard assertions for saddle-point problems, cf., e.g., [ET99].
Remarks 2.1. (i) Convergence $u_h \to u$ in $L^2(\Omega)$ as $h \to 0$ is a consequence of the strong convexity of $I$, cf. [Bar12].
(ii) In general, a discrete maximum principle $\|u_h\|_{L^\infty(\Omega)} \le c\, \|g\|_{L^\infty(\Omega)}$ cannot be expected to hold with $c = 1$ as in the continuous situation.
2.2. Discrete interpolation. The following lemma is a discrete version of the fact that functions in $BV(\Omega) \cap L^\infty(\Omega)$ belong to the space $H^{1/2}(\Omega)$, cf., e.g., [Tar07, Lemma 38.1] for details. It implies that if $(u_h)_{h>0}$ is a sequence of finite element functions, associated to a shape-regular family of regular triangulations, that is bounded in $W^{1,1}(\Omega) \cap L^\infty(\Omega)$, then the weighted norm
\[
\|h_T^{1/2} \nabla u_h\|
\]
remains bounded as $h \to 0$, i.e., that $\|\nabla u_h\| \le c\, h_{\min}^{-1/2}$. Here, $h_T \in L^\infty(\Omega)$ is defined by $h_T|_T = h_T = \operatorname{diam}(T)$ for all $T \in \mathcal{T}_h$, and we denote $h_{\min} = \min_T h_T$ and $h = h_{\max} = \max_T h_T$.
Lemma 2.2 (Interpolation inequality). There exists $c > 0$ such that for all $v_h \in S^1(\mathcal{T}_h)$ we have
\[
\|h_T^{1/2} \nabla v_h\|^2 \le c\, \|\nabla v_h\|_{L^1(\Omega)}\, \|v_h\|_{L^\infty(\Omega)}.
\]
Proof. For $T \in \mathcal{T}_h$ an integration by parts on $T$ shows
\[
h_T \int_T |\nabla v_h|^2 \,dx = h_T \int_{\partial T} v_h\, (\nabla v_h \cdot \nu_T)\, ds \le \|v_h\|_{L^\infty(\Omega)}\, h_T\, \|\nabla v_h\|_{L^1(\partial T)}.
\]
Noting $h_T |\partial T| \le c\, |T|$ and summing over all $T \in \mathcal{T}_h$ implies the assertion.
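In one space dimension the constant can be taken as $c = 2$, since each element has two boundary points, so that $h_T |\partial T| = 2 h_T = 2 |T|$. The following script (an illustration added here, not part of the original text) checks the inequality with this constant for random P1 functions on random meshes:

```python
import numpy as np

rng = np.random.default_rng(3)
for _ in range(100):
    # random partition of (0, 1) and random nodal values of a P1 function
    x = np.unique(np.concatenate(([0.0, 1.0], rng.random(20))))
    v = rng.standard_normal(x.size)
    hT = np.diff(x)                      # element sizes h_T
    dv = np.diff(v) / hT                 # elementwise (constant) gradients
    lhs = np.sum(hT ** 2 * dv ** 2)      # ||h_T^{1/2} v'||^2 = sum_T h_T^2 |v'_T|^2
    tv = np.sum(np.abs(dv) * hT)         # ||v'||_{L^1} = sum_T h_T |v'_T|
    linf = np.max(np.abs(v))             # ||v||_{L^infty}
    assert lhs <= 2.0 * tv * linf + 1e-12
```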
For $s \in [0,1]$ and $u_h, v_h \in S^1(\mathcal{T}_h)$ we define the inner product
\[
(u_h, v_h)_{h,s} = \int_\Omega u_h v_h \,dx + \int_\Omega h_T^{(1-s)/s}\, \nabla u_h \cdot \nabla v_h \,dx
\]
and let $\|u_h\|_{h,s}$ denote the corresponding norm on $S^1(\mathcal{T}_h)$. We use the convention that $h_T^{(1-s)/s} = 0$ for $s = 0$. It then follows that $(\cdot,\cdot)_{h,s}$ coincides with the inner product in $L^2(\Omega)$ if $s = 0$ and with the inner product in $H^1(\Omega)$ if $s = 1$.
Lemma 2.3 (Inverse estimate). There exists $c > 0$ such that for all $v_h \in S^1(\mathcal{T}_h)$ we have
\[
\|\nabla v_h\| \le c\, h_{\min}^{-\min\{1,\,(1-s)/(2s)\}}\, \|v_h\|_{h,s}.
\]
Proof. For $s > 0$ we have
\[
\|\nabla v_h\| \le h_{\min}^{-(1-s)/(2s)}\, \|h_T^{(1-s)/(2s)} \nabla v_h\| \le h_{\min}^{-(1-s)/(2s)}\, \|v_h\|_{h,s}
\]
for all $v_h \in S^1(\mathcal{T}_h)$, while for $s \ge 0$ the inverse estimate $\|\nabla v_h\| \le c\, h_{\min}^{-1} \|v_h\|$ implies
\[
\|\nabla v_h\| \le c\, h_{\min}^{-1}\, \|v_h\|_{h,s}.
\]
The combination of the estimates implies the asserted inequality.
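For $s = 1/2$ the first chain of estimates in the proof holds with $c = 1$, since the weight is $h_T^{(1-s)/s} = h_T$. The following quick check (an illustration added here, not from the paper) verifies it for a random P1 function on a random one-dimensional mesh:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.unique(np.concatenate(([0.0, 1.0], rng.random(30))))
v = rng.standard_normal(x.size)
hT = np.diff(x)
dv = np.diff(v) / hT
grad_l2 = np.sqrt(np.sum(hT * dv ** 2))            # ||v'||_{L^2}
# ||v||_{h,1/2}^2 = int v^2 dx + int h_T |v'|^2 dx; the L^2 part of a P1
# function is integrated exactly via (h/3)(a^2 + a b + b^2) per element
l2sq = np.sum(hT * (v[:-1] ** 2 + v[:-1] * v[1:] + v[1:] ** 2) / 3)
norm_hs = np.sqrt(l2sq + np.sum(hT ** 2 * dv ** 2))
# ||grad v_h|| <= h_min^{-1/2} ||v_h||_{h,1/2}
assert grad_l2 <= hT.min() ** (-0.5) * norm_hs + 1e-12
```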
2.3. Difference quotients. For a step-size $\tau > 0$ we let $d_t$ denote the backward difference quotient defined by $d_t a_n = (a_n - a_{n-1})/\tau$ for a sequence $(a_n)_{n=0,\dots,N}$. We note that we have
\[
(a_n, d_t a_n) = \frac{d_t}{2}\, \|a_n\|^2 + \frac{\tau}{2}\, \|d_t a_n\|^2
\tag{2.1}
\]
for $n = 1,2,\dots,N$. For sequences $(a_n)_{n=0,\dots,N}$ and $(b_n)_{n=0,\dots,N}$ we have Leibniz' summation-by-parts formula
\[
\tau \sum_{n=1}^{N} (d_t a_n)\, b_n = -\tau \sum_{n=1}^{N} a_{n-1}\, (d_t b_n) + a_N b_N - a_0 b_0,
\tag{2.2}
\]
which follows from a summation of the discrete product rule $d_t(a_n b_n) = (d_t a_n) b_n + a_{n-1}(d_t b_n)$.
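As an aside, both (2.1) and the summation-by-parts formula (2.2) are elementary to verify numerically. The following short script (an illustration added here, not part of the original text) checks them for random scalar sequences, with the absolute value playing the role of the norm:

```python
import numpy as np

rng = np.random.default_rng(0)
N, tau = 7, 0.1
a = rng.standard_normal(N + 1)   # sequences (a_n), (b_n), n = 0, ..., N
b = rng.standard_normal(N + 1)

def dt(c, n):
    # backward difference quotient d_t c_n = (c_n - c_{n-1}) / tau
    return (c[n] - c[n - 1]) / tau

# identity (2.1): (a_n, d_t a_n) = (d_t/2)|a_n|^2 + (tau/2)|d_t a_n|^2
for n in range(1, N + 1):
    lhs21 = a[n] * dt(a, n)
    rhs21 = 0.5 * (a[n] ** 2 - a[n - 1] ** 2) / tau + 0.5 * tau * dt(a, n) ** 2
    assert abs(lhs21 - rhs21) < 1e-12

# summation-by-parts formula (2.2)
lhs = tau * sum(dt(a, n) * b[n] for n in range(1, N + 1))
rhs = -tau * sum(a[n - 1] * dt(b, n) for n in range(1, N + 1)) + a[N] * b[N] - a[0] * b[0]
assert abs(lhs - rhs) < 1e-12
```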
3. Modified primal-dual method
The following algorithm is a modified version of the algorithms used in [CP11,
Bar12]. It has the interpretation of a discretization of the evolution problem
\[
(u', v)_s = (\operatorname{div} p, v) - \alpha(u - g, v),
\]
\[
(-p', q - p) \le (u, \operatorname{div}(q - p)) + I_{K_1(0)}(q) - I_{K_1(0)}(p)
\]
with initial conditions u(0) = u0 and p(0) = p0 .
Algorithm 1 (Primal-dual iteration). Given $\tau > 0$ and $(u_h^0, p_h^0) \in S^1(\mathcal{T}_h) \times L^0(\mathcal{T}_h)^d$, set $d_t u_h^0 = 0 \in S^1(\mathcal{T}_h)$ and $n = 1$, and iterate the following steps:
(1) Set $\tilde u_h^n = u_h^{n-1} + \tau\, d_t u_h^{n-1}$.
(2) Compute $p_h^n \in L^0(\mathcal{T}_h)^d$ such that
\[
(-d_t p_h^n + \nabla \tilde u_h^n, q_h - p_h^n) + I_{K_1(0)}(p_h^n) \le I_{K_1(0)}(q_h)
\]
for all $q_h \in L^0(\mathcal{T}_h)^d$.
(3) Compute $u_h^n \in S^1(\mathcal{T}_h)$ such that
\[
(d_t u_h^n, v_h)_{h,s} + (p_h^n, \nabla v_h) = -\alpha(u_h^n - g, v_h)
\]
for all $v_h \in S^1(\mathcal{T}_h)$.
(4) Stop if
\[
\sup_{v_h \in S^1(\mathcal{T}_h)} \frac{(d_t u_h^n, v_h)_{h,s}}{\|v_h\|} + \|d_t p_h^n\| \le \varepsilon_{\rm stop}.
\]
Remarks 3.1. (i) The equation in Step (2) is equivalent to seeking $p_h^n \in L^0(\mathcal{T}_h)^d$ with $|p_h^n| \le 1$ almost everywhere in $\Omega$ and
\[
(-d_t p_h^n + \nabla \tilde u_h^n, q_h - p_h^n) \le 0
\]
for all $q_h \in L^0(\mathcal{T}_h)^d$ with $|q_h| \le 1$ almost everywhere in $\Omega$. This inequality characterizes the unique minimizer $p_h^n \in L^0(\mathcal{T}_h)^d$ of the mapping
\[
p_h \mapsto \frac{1}{2\tau}\, \|p_h - p_h^{n-1}\|^2 + I_{K_1(0)}(p_h) - (p_h, \nabla \tilde u_h^n).
\]
Owing to the choice of the inner product of $L^2(\Omega;\mathbb{R}^d)$ for the evolution of the dual variable we have
\[
p_h^n = \frac{p_h^{n-1} + \tau\, \nabla \tilde u_h^n}{\max\{1,\, |p_h^{n-1} + \tau\, \nabla \tilde u_h^n|\}},
\]
which can be evaluated elementwise.
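As an illustration (added here; the function and variable names are hypothetical), the closed-form dual update can be implemented in a few lines of NumPy for a piecewise constant vector field stored row-wise, one row of $d$ components per element:

```python
import numpy as np

def dual_update(p_prev, grad_u_tilde, tau):
    """Elementwise update p^n = (p^{n-1} + tau * grad(u_tilde)) / max(1, |...|);
    each row holds the constant value of the field on one element."""
    q = p_prev + tau * grad_u_tilde
    norm = np.maximum(1.0, np.linalg.norm(q, axis=-1, keepdims=True))
    return q / norm

# the update projects onto the unit ball, so |p| <= 1 holds elementwise
rng = np.random.default_rng(1)
p = dual_update(rng.standard_normal((10, 2)), rng.standard_normal((10, 2)), tau=0.5)
assert np.all(np.linalg.norm(p, axis=-1) <= 1.0 + 1e-12)
```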
(ii) The stopping criterion controls the residual of the equation for the primal variable $u$ with respect to the norm in $L^2(\Omega)$. For a stopping criterion of the form $\|d_t u_h^n\| \le \varepsilon_{\rm stop}$ the output $u_h^*$ of the algorithm would critically depend on the parameter $s$.
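To make the structure of Algorithm 1 concrete, the following self-contained sketch (added here, not from the paper) implements it in one space dimension with P1 elements for the primal and P0 elements for the dual variable, for the choice $s = 0$ with a lumped mass matrix, so that every step is explicit; all parameter values are illustrative:

```python
import numpy as np

def tv_primal_dual_1d(g, h, alpha=10.0, tau=None, n_iter=3000):
    """Sketch of Algorithm 1 on (0, 1): s = 0, lumped mass, u_h^0 = 0, p_h^0 = 0.
    g: nodal data values, h: uniform mesh size."""
    if tau is None:
        tau = h / 10.0                       # step-size restriction tau = O(h) for s = 0
    m = np.full(g.shape, h)                  # lumped mass weights ...
    m[0] = m[-1] = h / 2.0                   # ... halved at the boundary nodes
    u = np.zeros_like(g)                     # initialization (A)
    p = np.zeros(g.size - 1)                 # one dual value per element
    dtu = np.zeros_like(g)                   # d_t u_h^0 = 0
    for _ in range(n_iter):
        u_tilde = u + tau * dtu              # step (1): extrapolation
        q = p + tau * np.diff(u_tilde) / h   # step (2): explicit dual update ...
        p = q / np.maximum(1.0, np.abs(q))   # ... with pointwise projection
        # step (3): m_i d_t u_i + (p, phi_i') = -alpha m_i (u_i - g_i), where
        # (p, phi_i') = p_{i-1} - p_i with zero padding at the boundary
        flux = -np.diff(np.concatenate(([0.0], p, [0.0])))
        u_new = (u / tau - flux / m + alpha * g) / (1.0 / tau + alpha)
        dtu = (u_new - u) / tau
        u = u_new
    return u, p

h = 1.0 / 64
x = np.linspace(0.0, 1.0, 65)
g = np.where((x > 0.25) & (x < 0.75), 1.0, 0.0)    # clean step function; noise omitted
u, p = tv_primal_dual_1d(g, h)
assert np.max(np.abs(p)) <= 1.0 + 1e-12            # dual constraint |p| <= 1
# the iterate should have a lower discrete energy than the initial guess u = 0
energy = lambda v: np.sum(np.abs(np.diff(v))) + 0.5 * 10.0 * np.sum((v - g) ** 2) * h
assert energy(u) < energy(np.zeros_like(g))
```

For $s > 0$, step (3) instead requires the solution of a sparse linear system with the matrix of the inner product $(\cdot,\cdot)_{h,s}$, cf. Section 4.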
Proposition 3.1 (Convergence). Let $(u_h, p_h) \in S^1(\mathcal{T}_h) \times L^0(\mathcal{T}_h)^d$ be a solution of the discrete saddle-point problem defined through $L_h$ in Lemma 2.1. Provided that $\tau\, \Theta_h \le 1/2$ for
\[
\Theta_h = \sup_{v_h \in S^1(\mathcal{T}_h)} \frac{\|\nabla v_h\|}{\|v_h\|_{h,s}} \le c\, h_{\min}^{-\min\{1,\,(1-s)/(2s)\}},
\]
we have for the iterates $(u_h^n)_{n=0,\dots,N}$ and $(p_h^n)_{n=0,\dots,N}$ of Algorithm 1 that
\[
\tau \sum_{n=1}^{N} \Big( \alpha\, \|u_h^n - u_h\|^2 + \frac{\tau}{4}\, \|d_t u_h^n\|_{h,s}^2 + \frac{\tau}{4}\, \|d_t p_h^n\|^2 \Big)
\le \frac12\, \|u_h - u_h^0\|_{h,s}^2 + \frac12\, \|p_h - p_h^0\|^2.
\]
Proof. We denote $\delta u_n = u_h - u_h^n$ and $\delta p_n = p_h - p_h^n$. The formula (2.1) and the identities $d_t \delta u_n = -d_t u_h^n$ and $d_t \delta p_n = -d_t p_h^n$ show
\[
\Upsilon(n) = \frac{d_t}{2} \big( \|\delta u_n\|_{h,s}^2 + \|\delta p_n\|^2 \big) + \frac{\tau}{2} \big( \|d_t \delta u_n\|_{h,s}^2 + \|d_t \delta p_n\|^2 \big) + \alpha\, \|\delta u_n\|^2
\]
\[
= (d_t \delta u_n, \delta u_n)_{h,s} + (d_t \delta p_n, \delta p_n) + \alpha\, \|\delta u_n\|^2
= -(d_t u_h^n, \delta u_n)_{h,s} - (d_t p_h^n, \delta p_n) + \alpha\, \|\delta u_n\|^2.
\]
Using the equations for $d_t u_h^n$ and $d_t p_h^n$ of Algorithm 1 leads to
\[
\Upsilon(n) \le \alpha(u_h^n - g, \delta u_n) + (p_h^n, \nabla \delta u_n) - (\delta p_n, \nabla \tilde u_h^n) + \alpha\, \|\delta u_n\|^2.
\]
We note $\alpha \|\delta u_n\|^2 = \alpha(u_h, \delta u_n) - \alpha(u_h^n, \delta u_n)$ and deduce that
\[
\Upsilon(n) \le \alpha(u_h - g, \delta u_n) + (p_h^n, \nabla \delta u_n) - (\delta p_n, \nabla \tilde u_h^n).
\]
The identity satisfied by $u_h$ and a rearrangement yield
\[
\Upsilon(n) \le -(p_h, \nabla \delta u_n) + (p_h^n, \nabla \delta u_n) - (\delta p_n, \nabla \tilde u_h^n)
= (\delta p_n, \nabla(u_h^n - \tilde u_h^n)) - (\delta p_n, \nabla u_h).
\]
Since $|p_h^n| \le 1$ almost everywhere in $\Omega$ we have $(\nabla u_h, \delta p_n) \ge 0$, and by incorporating the identity $u_h^n - \tilde u_h^n = u_h^n - u_h^{n-1} - \tau\, d_t u_h^{n-1} = \tau^2 d_t^2 u_h^n$ we deduce
\[
\Upsilon(n) \le (\delta p_n, \nabla(u_h^n - \tilde u_h^n)) \le \tau^2 (\delta p_n, \nabla d_t^2 u_h^n).
\]
We sum this estimate over $n = 1,2,\dots,N$ and multiply by $\tau$ to verify
\[
\frac12\, \|\delta u_N\|_{h,s}^2 + \frac12\, \|\delta p_N\|^2 + \tau \sum_{n=1}^{N} \Big( \frac{\tau}{2}\, \|d_t \delta u_n\|_{h,s}^2 + \frac{\tau}{2}\, \|d_t \delta p_n\|^2 + \alpha\, \|\delta u_n\|^2 \Big)
\le \frac12\, \|\delta u_0\|_{h,s}^2 + \frac12\, \|\delta p_0\|^2 + \tau^3 \sum_{n=1}^{N} (\delta p_n, \nabla d_t^2 u_h^n).
\]
With Leibniz' formula (2.2), the identity $d_t u_h^{n-1} = -d_t \delta u_{n-1}$, and $d_t u_h^0 = 0$ we find
\[
\tau^3 \sum_{n=1}^{N} (\delta p_n, \nabla d_t^2 u_h^n) = \tau^3 \sum_{n=1}^{N} (d_t \delta p_n, \nabla d_t \delta u_{n-1}) + \tau^2 (\delta p_N, \nabla d_t u_h^N).
\]
Two applications of Young's inequality imply
\[
\tau^3 \sum_{n=1}^{N} (\delta p_n, \nabla d_t^2 u_h^n)
\le \tau^2 \sum_{n=1}^{N} \Big( \tau^2\, \|\nabla d_t \delta u_{n-1}\|^2 + \frac14\, \|d_t \delta p_n\|^2 \Big) + \frac14\, \|\delta p_N\|^2 + \tau^4\, \|\nabla d_t \delta u_N\|^2.
\]
The assumption on $\tau$, i.e., $\tau \|\nabla v_h\| \le \tau\, \Theta_h\, \|v_h\|_{h,s} \le (1/2)\|v_h\|_{h,s}$ for all $v_h \in S^1(\mathcal{T}_h)$, allows us to absorb all of these terms on the right-hand side and to deduce
\[
\tau \sum_{n=1}^{N} \Big( \alpha\, \|\delta u_n\|^2 + \frac{\tau}{2} \big( 1 - 2\tau^2 \Theta_h^2 \big) \|d_t \delta u_n\|_{h,s}^2 + \frac{\tau}{4}\, \|d_t \delta p_n\|^2 \Big)
\le \frac12\, \|\delta u_0\|_{h,s}^2 + \frac12\, \|\delta p_0\|^2,
\]
which, noting $1 - 2\tau^2 \Theta_h^2 \ge 1/2$, proves the asserted bound on the iterates. The estimate for $\Theta_h$ is a direct consequence of Lemma 2.3.
Remark 3.1. If the sequence $(u_h - u_h^0)_{h>0}$ is bounded uniformly in $L^\infty(\Omega) \cap W^{1,1}(\Omega)$, then according to Lemma 2.2 the right-hand side of the estimate of Proposition 3.1 is bounded $h$-independently if $0 \le s \le 1/2$. The largest step-size $\tau$ is thus possible for $s = 1/2$. No restriction on the step size is needed for $s = 1$, but in this case the right-hand side of the estimate will typically grow like $h_{\min}^{-1/2}$.
4. Numerical experiments
We tested Algorithm 1 with different choices of $s \in [0,1]$. For the practical implementation of the stopping criterion we employed the operator $A_s : S^1(\mathcal{T}_h) \to S^1(\mathcal{T}_h)$ defined by
\[
(A_s v_h, w_h) = (v_h, w_h)_{h,s}
\]
for all $w_h \in S^1(\mathcal{T}_h)$. The stopping criterion is then equivalent to $\|A_s d_t u_h^n\| \le \varepsilon_{\rm stop}$. If $M$ and $S$ are the mass and stiffness matrices related to the nodal basis of the finite element space $S^1(\mathcal{T}_h)$, then we have for the coefficient vector $d_t \widehat U^n$ of $d_t u_h^n$ that
\[
\widehat A_s\, d_t \widehat U^n = M^{-1} \big( M + h^{(1-s)/s} S \big)\, d_t \widehat U^n
\]
in the case of a uniform triangulation. Mass lumping can be employed to avoid an additional inversion of the mass matrix in every iteration. In the numerical experiments we used the following specification of the model problem.
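In one space dimension these matrices are tridiagonal and the identity $(A_s v_h, w_h) = (v_h, w_h)_{h,s}$ can be checked directly. The following sketch (an illustration added here; uniform mesh on $(0,1)$, hypothetical names) assembles $M$ and $S$ for P1 elements and verifies the relation:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, s = 32, 0.5
h = 1.0 / n
# P1 mass and stiffness matrices on a uniform mesh of (0, 1) with n elements
main = np.full(n + 1, 2 * h / 3); main[[0, -1]] = h / 3
M = sp.diags([np.full(n, h / 6), main, np.full(n, h / 6)], [-1, 0, 1], format="csc")
smain = np.full(n + 1, 2 / h); smain[[0, -1]] = 1 / h
S = sp.diags([np.full(n, -1 / h), smain, np.full(n, -1 / h)], [-1, 0, 1], format="csc")

gamma = h ** ((1 - s) / s)        # weight h^{(1-s)/s} on a uniform mesh
B = M + gamma * S                 # matrix representing the inner product (., .)_{h,s}

rng = np.random.default_rng(2)
v, w = rng.standard_normal(n + 1), rng.standard_normal(n + 1)
Asv = spla.spsolve(M, B @ v)      # coefficient vector of A_s v, i.e. M^{-1} B v
# check (A_s v, w) = (v, w)_{h,s} in matrix form: w^T M (M^{-1} B v) = w^T B v
assert abs(w @ (M @ Asv) - w @ (B @ v)) < 1e-10
```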
Example 4.1. Let $d = 2$, $\Omega = (-1,1)^2$, $\alpha = 10$, and $g = \chi_{B_{1/2}(0)} + \xi_h$, where $\xi_h$ is a mesh-dependent noise function.
Table 1 displays the numbers of iterations needed to satisfy the stopping criterion with $\varepsilon_{\rm stop} = 10^{-2}$ for $s = 0, 1/2, 1$, uniform triangulations consisting of halved squares with mesh-size $h = \sqrt{2}\, 2^{-\ell}$ for $\ell = 3,\dots,6$, and the following choices for $u_h^0$, i.e.,
(A) the smooth function $u_h^0 = 0$,
(B) the rough function $u_h^0 = I_h g$,
(C) the unique function $u_h^0 = Q_h g \in S^1(\mathcal{T}_h)$ satisfying
\[
(\nabla u_h^0, \nabla v_h) + \alpha(u_h^0 - I_h g, v_h) = 0
\]
for all $v_h \in S^1(\mathcal{T}_h)$.

              u_h^0 = 0              u_h^0 = I_h g            u_h^0 = Q_h g
   s        0    1/2      1        0    1/2      1         0    1/2      1
 ell = 3  298    279    725      298    289    788       299    274    695
 ell = 4  603    645   2533      602    698   2763       603    601   2307
 ell = 5 1575   1065   5903     1572   1238   6811      1569   1074   5604
 ell = 6 4249   1394   9986     4225   1778  12524      4099   1559   9754

Table 1. Iteration numbers of Algorithm 1 for $\varepsilon_{\rm stop} = 10^{-2}$, different metrics $(\cdot,\cdot)_{h,s}$ specified by the parameter $s$, uniform triangulations with mesh-size $h = \sqrt{2}\, 2^{-\ell}$, $\ell = 3,4,\dots,6$, step-size $\tau = h^{1-s}/10$, and initialization with different functions $u_h^0$ specified by (A), (B), and (C).
We always used $p_h^0 = 0$. The discrete noise function $\xi_h \in S^1(\mathcal{T}_h)$ was generated by defining its nodal values through normally distributed random numbers. The inner products $(g, v_h)$ for $v_h \in S^1(\mathcal{T}_h)$ were approximated by $(I_h g, v_h)$. The step-size was chosen according to
\[
\tau =
\begin{cases}
h/10 & \text{for } s = 0, \\
h^{1/2}/10 & \text{for } s = 1/2, \\
1/10 & \text{for } s = 1,
\end{cases}
\]
i.e., $\tau = h^{1-s}/10$. For nearly all choices of initial data and refinement levels we obtain smaller iteration numbers for the choice $s = 1/2$ than for $s = 0$ and $s = 1$.
This confirms the expected properties of the modified iteration. In the case $s = 1/2$ the rough initial function $u_h^0 = I_h g$ of case (B) leads to larger iteration numbers than the functions defined in (A) and (C), which is in agreement with Remark 3.1.
The iteration numbers grow superlinearly for s = 0, sublinearly for s = 1/2, and
approximately linearly for s = 1 in the experiment. The overall conclusion from the
experiments is that the choices s = 1/2 and u0h = 0 lead to the smallest iteration
numbers.
Remark 4.1. If the term $\|u - g\|^2$ is discretized using mass lumping and if $(v,w)_{h,0} = \int_\Omega I_h[vw] \,dx$ for $v, w \in C(\Omega)$ is the corresponding inner product, the solution of linear systems of equations can be avoided entirely in Algorithm 1. The reduced numerical effort in every iteration step might then compensate for the larger iteration numbers. Since the matrix representing the inner product $(\cdot,\cdot)_{h,s}$ in the nodal basis is sparse, the computational work in every iteration step can be optimized to linear complexity if $s > 0$, e.g., with an appropriate enumeration of the nodes and a precomputed sparse LU factorization.
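A minimal sketch of the lumping (added here; one-dimensional uniform mesh, illustrative only): replacing the consistent mass matrix by the diagonal matrix of its row sums turns the solution of a linear system into a componentwise division:

```python
import numpy as np

n = 16
h = 1.0 / n
# consistent P1 mass matrix on a uniform mesh of (0, 1)
M = (np.diag(np.full(n + 1, 2 * h / 3))
     + np.diag(np.full(n, h / 6), 1)
     + np.diag(np.full(n, h / 6), -1))
M[0, 0] = M[-1, -1] = h / 3
# mass lumping: diagonal matrix of the row sums
M_lumped = np.diag(M.sum(axis=1))
# interior lumped weights equal h, boundary weights h/2
assert np.allclose(np.diag(M_lumped)[1:-1], h)
assert np.allclose(np.diag(M_lumped)[[0, -1]], h / 2)
# solving a system with M_lumped is a componentwise division
b = np.ones(n + 1)
x = b / np.diag(M_lumped)
assert np.allclose(M_lumped @ x, b)
```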
References

[Bar12] Sören Bartels, Total variation minimization with finite elements: convergence and iterative solution, SIAM J. Numer. Anal. 50 (2012), no. 3, 1162–1180.
[CP11] Antonin Chambolle and Thomas Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vision 40 (2011), no. 1, 120–145.
[ET99] Ivar Ekeland and Roger Témam, Convex analysis and variational problems, English ed., Classics in Applied Mathematics, vol. 28, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999. Translated from the French.
[NN13] Yurii Nesterov and Arkadi Nemirovski, On first-order algorithms for $\ell_1$/nuclear norm minimization, Acta Numer. 22 (2013), 509–575.
[Tar07] Luc Tartar, An introduction to Sobolev spaces and interpolation spaces, Lecture Notes of the Unione Matematica Italiana, vol. 3, Springer, Berlin, 2007.
Department of Applied Mathematics, Mathematical Institute, University of Freiburg,
Hermann-Herder-Str 9, 79104 Freiburg i. Br., Germany
E-mail address: [email protected]