Introduction Total Variation Regularization Domain Decomposition The Algorithm Parallel Total Variation Minimization Jahn Müller [email protected] 27.01.2009 Jahn Müller Parallel Total Variation Minimization 1 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Outline 1 Introduction 2 Total Variation Regularization The ROF model Different Formulations 3 Domain Decomposition Overlapping Decomposition Schwarz Methods 4 The Algorithm Numerical Realization Results Jahn Müller Parallel Total Variation Minimization 2 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Introduction desirable to extend 2D algorithms to 3D currently used workstations reach their technical limitations expedient: divide original problem in subproblems solve independently on several CPU’s merge together to a solution of the complete problem data dependences may arise (neglecting leads to undesirable effects at the interfaces) Jahn Müller Parallel Total Variation Minimization 3 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Image Denoising function space for denoising Lebesgue space R Lp (Ω) := {u : Ω → R̄ | Ω |u|p dx < ∞}, p ≥ 1 contain oscillating images, in particular noise Sobolev space W 1,p (Ω) := {u ∈ Lp (Ω) | ∂u ∂xj ∈ Lp (Ω), j = 1, . . . , d}, p ≥ 1 to restrictiv, e.g. for piecewise constant images need for a proper space space of functions with bounded variation (BV) Jahn Müller Parallel Total Variation Minimization 4 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Total Variation Jahn Müller Parallel Total Variation Minimization 5 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Total Variation Denoising Jahn Müller Parallel Total Variation Minimization 6 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization Let f : Ω ⊂ Rd → R beRa noisy version of a given image u0 with noise variance given by Ω (u0 − f )2 dx ≤ σ 2 . The ROF model: (Rudin, Osher, Fatemi) Z λ û = arg min (u − f )2 dx + 2 u∈BV (Ω) } | Ω {z data fitting 0 ||ϕ||∞ ≤1 R Ω u∇ (1) regularization where |u|TV := supϕ∈C ∞ (Ω)d |u|TV | {z } · ϕ dx denotes the total variation of u BV (Ω) = u ∈ L1 (Ω) | |u|TV < ∞ is the space of functions with bounded total variation. Jahn Müller Parallel Total Variation Minimization 7 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization depending on exact definition of the supremum norm ||p||∞ := ess sup ||p(x)||l r x∈Ω family of equivalent seminorms: Z Z u∇ · ϕ dx |Du|l s = sup Ω with 1 s + 1 r ϕ∈C0∞ (Ω)d ||ϕ||∞ ≤1 Ω = 1 (Hölder conjugate) isotropic total variation (r = 2) anisotropic total variation (r = ∞) Jahn Müller Parallel Total Variation Minimization 8 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Isotropic vs Anistropic Total Variation Jahn Müller (a) Original image (b) Noisy image (c) Isotropic TV (d) Anisotropic TV Parallel Total Variation Minimization 9 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization The functional λ J(u) := 2 Z 2 (u − f ) dx + Ω Z |u|TV Ω is strict convex and attains a unique minimum in BV (lower semicontinuity and weak*- compactness of the sub-level sets). optimality condition in terms of subgradients (∂J(u) := {p ∈ U ∗ | J(w ) ≥ J(u) + hp, w − ui, ∀w ∈ U}): 0 ∈ ∂J(u) 0 ∈ λ(u − f ) + ∂|u|TV Jahn Müller Parallel Total Variation Minimization 10 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization Primal Formulation: λ J(u) = 2 Z 2 (u − f ) dx + Z |∇u| dx Ω Ω for u sufficiently smooth, particularly u ∈ W 1,1 (Ω). Euler Lagrange equation: λ(u − f ) − ∇ · ∇u |∇u| =0 perturbe norm to overcome issue with singularity at ∇u = 0: |∇u|β := p |∇u|2 + β solution methods: steepest descent (Rudin et al.), fixed point iteration (Vogel and Oman), Newton’s method (Chan et al.) Jahn Müller Parallel Total Variation Minimization 11 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization exact formulation: min sup u∈BV (Ω) kpk∞ ≤1 | λ 2 Z (u − f )2 dx + Ω {z Z u∇ · p dx Ω =:L(u,p) interchange min and sup (not trivial!): } min sup L(u, p) = max min L(u, p) u p p u Dual Formulation: 2 min λ1 ∇ · p − f 2 kpk∞ ≤1 (2) after solving (2) we obtain the solution for u from u = f − λ1 ∇ · p solution methods: projection algorithm (Chambolle) Jahn Müller Parallel Total Variation Minimization 12 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Total Variation Regularization Primal Dual Formulation: Z Z λ 2 inf sup u∇ · p dx (u − f ) dx + Ω u∈BV (Ω) kpk∞ ≤1 2 Ω | {z } =:L(u,p) For a saddle point we achieve the following optimality conditions ∂L = λ(u − f ) + ∇ · p = 0 ∂u and L(u, p) ≥ L(u, q) ∀q , kqk∞ ≤ 1, (3) (4) (4) implies ∇ · p ∈ ∂|u|TV and hence 0 ∈ ∂J(u) Jahn Müller Parallel Total Variation Minimization 13 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Penalty and Barrier Approximation approximate the constraint kpk ≤ 1 in (3) by using Barrier or Penalty methods Penalty approximation 1 Lε (u, p) = L(u, p) − F (kpk − 1) ε with ε > 0 small and a term F penalizing if kpk − 1 > 0. Typical example: F (s) = 1 max{s, 0}2 2 still allows violations of the constraint Jahn Müller Parallel Total Variation Minimization 14 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Penalty and Barrier Approximation Barrier approximation add a continuous barrier term to L such that G (s) = ∞, if the constraint is violated Lε (u, p) = L(u, p) − εG (kpk2 − 1) For example: G (s) = − log(−s) ensures that the constraint is not violated also called interior-point method choice of approximation effects the shape of the solution u Jahn Müller Parallel Total Variation Minimization 15 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Penalty and Barrier Approximation 1.2 1.2 1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 −0.2 −1 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 −0.2 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 h = 10−3 ε = 10−3 0 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 Penalty method Jahn Müller ε = 0.1 0 −0.8 1.2 −0.2 −1 h = 10−3 0.8 1 −0.2 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 Barrier method Parallel Total Variation Minimization 16 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Newton method with damping Optimality conditions (using penalty approximation) ∂Lε = λ(u − f ) + ∇ · p = 0 ∂u ∂Lε = −∇u − 2ε H(p) = 0 ∂p with H(p) being the derivative of F (kpk − 1). Linearize H(p) via first-order Taylor-approximation H(p k+1 ) ≈ H(p k ) + H ′ (p k )(p k+1 − p k ) Add damping term, with damping parameter τ τ k (p k+1 − p k ) Jahn Müller Parallel Total Variation Minimization 17 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm The ROF model Different Formulations Newton method with damping Solve in each step λ(u k+1 − f ) + ∇ · p k+1 = 0 −∇u k+1 − ε2 H ′ (p k )(p k+1 − p k ) − ε2 H(p k ) − τ k (p k+1 − p k ) = 0 linear system and easy to descretize choose ε → 0 during iteration, to obtain fast convergence start with small value τ and increase during iteration, to avoid oscillations ε and τ are chosen from experimental runs back Jahn Müller Parallel Total Variation Minimization 18 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Overlapping Decomposition Schwarz Methods Domain Decomposition Example Lu = f u = 0 in Ω on ∂Ω split the given domain Ω of the problem into subdomains Ωi , i = 1, . . . , S overlapping or non overlapping decompositions when all unknowns are coupled, a straight forward splitting leads to signifant errors at the interfaces transmission conditions at the interface avoid computing transmission conditions by using overlapping decompositions Jahn Müller Parallel Total Variation Minimization 19 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Overlapping Decomposition Schwarz Methods Overlapping Decomposition replacementsz Ω′2 Ω′1 z }| }| { { Ω2 Ω1 Γ2 Γ1 | {z } δ for a uniform lattice with stepsize h, δ = mh, with m ∈ N redundant degress of freedom achieve update for boundary data from this redundancy Jahn Müller Parallel Total Variation Minimization 20 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Overlapping Decomposition Schwarz Methods Additive Schwarz Method Let u (0) be an initial function, (k+1) = f, in Ω1 Lu1 (k+1) u = u (k) |Γ1 , 1(k+1) u1 = 0, on Γ1 on ∂Ω1 \ Γ1 and (k+1) = f, Lu2 (k+1) = u1 (k+1) = 0, u2 u2 (k) |Γ2 , in Ω2 on Γ2 on ∂Ω2 \ Γ2 . The next step is computed by (k+1) if x ∈ Ω \ Ω1 u2 (x), (k+1) (k+1) u (x), if x ∈ Ω \ Ω2 u (x) = 1 u1(k+1) (x)+u2(k+1) (x) 2 if x ∈ Ω1 ∩ Ω2 . related to the well-known Jacobi method direct application of parallelization Jahn Müller Parallel Total Variation Minimization 21 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Overlapping Decomposition Schwarz Methods Multiplicative Schwarz Method Let u (0) be an initial function, (k+1) = f, in Ω1 Lu1 (k+1) u1 (k+1) u1 (k) =u = 0, |Γ1 , on Γ1 on ∂Ω1 \ Γ1 and (k+1) = f, Lu2 (k+1) u2 (k+1) u2 = (k+1) u1 |Γ2 , = 0, in Ω2 on Γ2 on ∂Ω2 \ Γ2 . The next step is computed by (k+1) u (x), if x ∈ Ω2 (k+1) u (x) = 2(k+1) u1 (x), if x ∈ Ω \ Ω2 . related to the well-known Gauss-Seidel method seems not conventient for a parallel implementation need of coloring Jahn Müller Parallel Total Variation Minimization 22 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Overlapping Decomposition Schwarz Methods Multiplicative Schwarz Method - Coloring painting subdomins in several colors, such that same colored domains do not overlap solve domains of same color in parallel using the latest boundary conditions from the other “colors“ Jahn Müller Parallel Total Variation Minimization 23 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Numerical Realization •u11 ∗p1 11 •u21 ∗p1 21 .. . ∗p1 n−11 •un1 ∗p2 11 ∗p2 21 ∗p2 n1 •u12 ∗p1 12 •u22 ∗p1 22 .. . ∗p1 n−12 •un2 ∗p2 12 ∗p2 22 ∗p2 n2 ... ... ... ... .. . ... ... ∗p2 1n−1 ∗p2 2n−1 ∗p2 nn−1 •u1n ∗p1 1n •u2n ∗p1 2n .. . ∗p1 n−1n •unn Lay the degrees of freedom of the dual variable p in the center between the pixels of u compute the divergence of p effective as a value in each pixel (using a single-sided difference quotient) Jahn Müller Parallel Total Variation Minimization 24 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Numerical Realization applying anisotropic total variation slightly change penalty term F : F (p) = 1 1 max{|p1 | − 1, 0}2 + max{|p2 | − 1, 0}2 2 2 and hence H(p) = sgn(p1 ) · (|p1 | − 1) · 1{|p1 |≥1} sgn(p2 ) · (|p2 | − 1) · 1{|p2 |≥1} and its Hessian ′ H (p) = 1{|p1 |≥1} 0 0 1{|p2 |≥1} Newton with damping Jahn Müller Parallel Total Variation Minimization 25 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Numerical Realization write u, p 1 , p 2 column wise in vectors, to obtain a vector x = (~u , p~1 , p~2 ) construct a system matrix A and righthand side b to obtain a system Ax = b A= λ .. . D1 D2 D1t M1 0 D2t 0 M2 λ due to the shape of A some improvements to solve this system are applicable , e.g. Schur complement, or conjugate gradient methods. Jahn Müller Parallel Total Variation Minimization 26 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Parallel Implementation • | • | • | • | • Jahn Müller − • | − • | − • | − • | − • − • | − • | − • | − • | − • − • | − • | − • | − • | − • − • | − • | − • | − • | − • Processor #1 Processor #2 • − • − • − | | | • − • − • − | | | • − • − • − | | | − • − • − • | | | − • − • − • | | | − • − • − • | | | Processor #3 Processor #4 | | | • − • − • − | | | • − • − • − | | | • − • − • − | | | − • − • − • | | | − • − • − • | | | − • − • − • Parallel Total Variation Minimization 27 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Parallel Implementation “ghost cells“(red) have to be communicated after each iteration explicit communication needed communication realized via the Message Passing Interface (MPI) MatlabMPI makes MPI available in MATLAB (provided by the Lincoln Laboratory of the Massachusetts Institute Of Technology (MIT)) Jahn Müller Parallel Total Variation Minimization 28 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Additive Version Neumann boundary conditions for u turn into Dirichlet boundary conditions for p A u p p = b in Ω = 0 on ∂Ω. solve in each step (k+1) A|Ω1 u1 (k+1) p1 ! (k+1) p1 (k+1) p1 Jahn Müller (k+1) = b|Ω1 in Ω1 = p2 (k) on Γ1 = 0 on ∂Ω1 \ Γ1 A|Ω2 u2 (k+1) p2 ! = b|Ω2 p2 (k+1) = p1 (k+1) p2 = 0 Parallel Total Variation Minimization (k) in Ω2 on Γ2 on ∂Ω2 \ Γ2 . 29 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Multiplicative Version using m colors, the number of divisions would be m times higher than the number of CPUs alternatively coloring similar to a checkerboard overlapping of domains of same color (but only 1 pixel) Jahn Müller Parallel Total Variation Minimization 30 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 (a) Original 50 100 150 200 250 300 350 400 450 500 (b) Noisy using the following parameters: λ=2 ǫ = 10−2 Jahn Müller Parallel Total Variation Minimization 31 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results: Sequential Algorithm 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 50 (a) After 1 iteration 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 150 200 250 300 350 400 450 500 450 500 500 50 100 150 200 250 300 350 400 450 (c) After 3 iterations Jahn Müller 100 (b) After 2 iterations 500 50 100 150 200 250 300 350 400 450 500 (d) After 13 iterations Parallel Total Variation Minimization 32 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results: Multiplicative version on 2 CPUs (4 domains) 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 50 (a) After 1 iteration 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 150 200 250 300 350 400 450 500 450 500 500 50 100 150 200 250 300 350 400 450 (c) After 3 iterations Jahn Müller 100 (b) After 2 iterations 500 50 100 150 200 250 300 350 400 450 500 (d) After 13 iterations Parallel Total Variation Minimization 33 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results: Additive Version on 4 CPUs 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 50 (a) After 1 iteration 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 150 200 250 300 350 400 450 500 450 500 500 50 100 150 200 250 300 350 400 450 (c) After 3 iterations Jahn Müller 100 (b) After 2 iterations 500 50 100 150 200 250 300 350 400 450 500 (d) After 13 iterations Parallel Total Variation Minimization 34 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results: Multiplicative Version on 32 CPUs (64 domains) 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 50 (a) After 1 iteration 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 150 200 250 300 350 400 450 500 450 500 500 50 100 150 200 250 300 350 400 450 (c) After 3 iterations Jahn Müller 100 (b) After 2 iterations 500 50 100 150 200 250 300 350 400 450 500 (d) After 14 iterations Parallel Total Variation Minimization 35 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Results: Additive Version on 32 CPUs 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 450 500 500 50 100 150 200 250 300 350 400 450 500 50 (a) After 1 iteration 50 50 100 100 150 150 200 200 250 250 300 300 350 350 400 400 450 150 200 250 300 350 400 450 500 450 500 500 50 100 150 200 250 300 350 400 450 (c) After 3 iterations Jahn Müller 100 (b) After 2 iterations 500 50 100 150 200 250 300 350 400 450 500 (d) After 16 iterations Parallel Total Variation Minimization 36 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Computation Time Additive Multiplicative 90 Additive Multiplicative 400 80 350 70 300 Time s Time s 60 50 250 200 40 150 30 100 20 50 10 1 2 4 8 16 Processes 32 (a) Image size 512 × 512 1 2 4 8 16 Processes 32 (b) Image size 1024 × 1024 4 x 10 5000 Additive Multiplicative Additive Multiplicative 6 4500 4000 5 3500 4 Time s Time s 3000 2500 3 2000 2 1500 1000 1 500 1 2 4 8 16 Processes 32 (c) Image size 2048 × 2048 Jahn Müller 1 2 4 8 16 Processes 32 (d) Image size 4096 × 4096 Parallel Total Variation Minimization 37 / 38 Introduction Total Variation Regularization Domain Decomposition The Algorithm Numerical Realization Results Speedup Additive Multiplicative linear Speedup Additive Multiplicative linear Speedup 30 25 25 20 20 Speedup Speedup 30 15 10 15 10 5 5 1 2 4 8 16 Processes 32 1 2 (a) Image size 512 × 512 Additive Multiplicative linear Speedup 35 4 8 16 Processes 32 (b) Image size 1024 × 1024 Additive Multiplicative linear Speedup 55 50 45 30 40 25 Speedup Speedup 35 20 30 25 15 20 15 10 10 5 5 1 2 4 8 16 Processes 32 (c) Image size 2048 × 2048 Jahn Müller 1 2 4 8 16 Processes 32 (d) Image size 4096 × 4096 Parallel Total Variation Minimization 38 / 38
© Copyright 2026 Paperzz