A Gauss-Newton iteration for solving TLS problems
(Original title: Un metodo iterativo di tipo Gauss-Newton per la risoluzione del problema TLS)

Antonio Fazzi (Gran Sasso Science Institute), Dario Fasino (University of Udine)
Como, 17/02/2017


The Total Least Squares problem

Given A ∈ R^(m×n) with m > n and b ∈ R^m, the Total Least Squares (TLS) problem is defined as

    min_{E,f} ||(E | f)||_F^2   subject to   b + f ∈ Im(A + E),

where E ∈ R^(m×n) and f ∈ R^m. Once a minimizer (Ē | f̄) of minimal Frobenius norm is found, every x ∈ R^n satisfying (A + Ē)x = b + f̄ is a solution of the TLS problem.


Solution, existence and uniqueness

Define the matrix C = (A | b) and consider its SVD, C = UΣV^T. In the following we assume that the problem has a unique solution; this happens if σ'_n > σ_{n+1}, where σ'_n and σ_{n+1} are the smallest singular values of A and C, respectively. The solution of the TLS problem is the vector x_TLS such that

    v_{n+1} = −ζ (x_TLS, −1)^T,

where ζ is a normalization constant.


The function η

It is known that x_TLS can be characterized as the point of global minimum of the function

    η(x) = ||Ax − b||_2^2 / (1 + ||x||_2^2).

The function η(x) measures the backward error of the vector x as an approximate solution of the linear system Ax = b:

Lemma. For each vector x there exists a rank-one matrix (Ē | f̄) such that (A + Ē)x = b + f̄ and ||(Ē | f̄)||_F^2 = ||(Ē | f̄)||_2^2 = η(x). Moreover, for each matrix (E | f) such that (A + E)x = b + f, it holds that ||(E | f)||_F^2 ≥ ||(E | f)||_2^2 ≥ η(x).
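The SVD characterization of x_TLS and the identity η(x_TLS) = σ_{n+1}^2 (η is the Rayleigh quotient of C^T C at (x, −1)) can be checked numerically. The following is a small illustrative sketch in NumPy; the function names and the randomly generated data are ours, not from the talk:

```python
import numpy as np

def tls_svd(A, b):
    """Closed-form TLS solution via the SVD of C = (A | b).

    Per the characterization above, v_{n+1} = -zeta * (x_TLS, -1)^T,
    hence x_TLS = -v[:n] / v[n] (requires a unique solution,
    i.e. sigma'_n > sigma_{n+1}).
    """
    C = np.column_stack([A, b])
    _, _, Vt = np.linalg.svd(C)
    v = Vt[-1]                       # right singular vector of sigma_{n+1}
    return -v[:-1] / v[-1]

def eta(A, b, x):
    """Backward error eta(x) = ||Ax - b||^2 / (1 + ||x||^2)."""
    r = A @ x - b
    return (r @ r) / (1.0 + x @ x)

# Randomly generated data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 4))
b = rng.standard_normal(20)
x_tls = tls_svd(A, b)

# eta(x) is the Rayleigh quotient of C^T C at (x, -1), so its global
# minimum eta(x_TLS) equals sigma_{n+1}^2.
sigma = np.linalg.svd(np.column_stack([A, b]), compute_uv=False)
print(eta(A, b, x_tls), sigma[-1] ** 2)
```

In particular, η evaluated at the TLS solution is never larger than at the ordinary least squares solution, consistent with the lemma.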
The Gauss-Newton algorithm

The Gauss-Newton algorithm is a cheap optimization method for nonlinear least squares problems

    min_{x ∈ R^n} ||f(x)||_2^2,   f : R^n → R^m,   m ≥ n.

The basic idea is to linearize f(x) in a neighborhood of x; the step x → x + h is computed by replacing ||f(x + h)||_2^2 with ||f(x) + J(x)h||_2^2 and solving the corresponding ordinary LS problem.


Gauss-Newton applied to the function η

We set η(x) = ||f(x)||_2^2, where

    f(x) = (Ax − b) / (1 + x^T x)^{1/2}.

Hence min_x ||f(x)||_2 is attained at x_TLS. The Jacobian of f is

    J(x) = A / (1 + x^T x)^{1/2} − (Ax − b) x^T / (1 + x^T x)^{3/2}.


Outline of the algorithm: basic GN-TLS method

Input:  A, b (problem data); ε, maxit (stopping criteria)
Output: x̂ (approximation of x_TLS)

    Set k := 0
    Compute x_0 := argmin_x ||Ax − b||_2
    Compute f_0 := f(x_0) and J_0 := J(x_0)
    while ||J_k^T f_k||_2 ≥ ε and k < maxit
        Compute h_k := argmin_h ||J_k h + f_k||_2
        Set x_{k+1} := x_k + h_k
        Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
    end
    x̂ := x_k
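The basic GN-TLS outline can be sketched as a short NumPy routine. This is a minimal illustration under our own naming, with `lstsq` standing in for the factorization-based inner solver discussed later:

```python
import numpy as np

def gn_tls_basic(A, b, eps=1e-12, maxit=100):
    """Basic GN-TLS: Gauss-Newton on f(x) = (Ax - b)/sqrt(1 + x'x).

    Starts from the ordinary LS solution and repeatedly solves the
    linearized problem min_h ||J_k h + f_k||_2.
    """
    x = np.linalg.lstsq(A, b, rcond=None)[0]      # x_0: ordinary LS solution
    for _ in range(maxit):
        t = 1.0 + x @ x
        r = A @ x - b
        f = r / np.sqrt(t)                        # f(x) = (Ax - b)/sqrt(1 + x'x)
        J = A / np.sqrt(t) - np.outer(r, x) / t ** 1.5
        if np.linalg.norm(J.T @ f) < eps:         # stopping test ||J_k' f_k|| < eps
            break
        h = np.linalg.lstsq(J, -f, rcond=None)[0] # h_k = argmin_h ||J_k h + f_k||
        x = x + h
    return x
```

On a nearly consistent system (small residual), the iterates settle close to the SVD-based TLS solution within a few steps.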
Computational cost

At each step the Gauss-Newton method solves a least squares problem which, writing r_k = Ax_k − b, can be put in the form

    min_h ||J_k h + f_k||_2^2 = (1 + x_k^T x_k)^{−1} min_h || (A − r_k x_k^T / (1 + x_k^T x_k)) h + r_k ||_2^2.

We can compute the QR factorization of A only once, and then use a technique which updates the QR factorization under rank-one perturbations. Each update has only quadratic cost.


Geometry of the method

Proposition. Let f(x) = (Ax − b)/(1 + x^T x)^{1/2}. Then its image Im(f) ⊂ R^m is an open subset of the ellipsoid v^T X v = 1, where X = (CC^T)^+.

Figure: surface plot of Im(f). Blue star: f(x_TLS); red star: f(x_LS).


Improved variant of the basic GN-TLS

Motivation: ensure convergence and increase the convergence speed. The value f(x + h) comes from a linear combination of f(x) and f(x) + J(x)h, so it is not the retraction of the Gauss-Newton step!

Idea: introduce a step-length parameter α such that f(x + αh) = τ̂ (f(x) + J(x)h) for some scalar τ̂; this gives

    α = 1 / (1 + x^T h / (1 + x^T x)).
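Two of the facts above are easy to confirm numerically: the Gauss-Newton subproblem has the same minimizer as the unscaled problem with the rank-one-perturbed matrix A − r x^T/(1 + x^T x) (which is what makes QR updating applicable), and f(x) lies on the ellipsoid v^T (CC^T)^+ v = 1. A sketch with randomly generated data of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((15, 4))
b = rng.standard_normal(15)
x = rng.standard_normal(4)

t = 1.0 + x @ x
r = A @ x - b

# (1) J and f share the factor (1 + x'x)^(-1/2), so min_h ||J h + f||
# has the same minimizer as the unscaled rank-one-perturbed problem.
J = A / np.sqrt(t) - np.outer(r, x) / t ** 1.5
f = r / np.sqrt(t)
h1 = np.linalg.lstsq(J, -f, rcond=None)[0]
M = A - np.outer(r, x) / t            # rank-one perturbation of A
h2 = np.linalg.lstsq(M, -r, rcond=None)[0]
print(np.allclose(h1, h2))

# (2) f(x) lies on the ellipsoid v' X v = 1 with X = (C C')^+.
C = np.column_stack([A, b])
X = np.linalg.pinv(C @ C.T)
print(f @ X @ f)
```

The second check works because f(x) = Cz/||z|| with z = (x, −1), and C^T (CC^T)^+ C is the identity whenever C has full column rank.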
Example

Figure: example in dimension 1. Notice the difference between the two methods.


GN-TLS method with optimal step length

Input:  A, b (problem data); ε, maxit (stopping criteria)
Output: x̂ (approximation of x_TLS)

    Set k := 0
    Compute x_0 := argmin_x ||Ax − b||_2
    Compute f_0 := f(x_0) and J_0 := J(x_0)
    while ||J_k^T f_k||_2 ≥ ε and k < maxit
        Compute h_k := argmin_h ||J_k h + f_k||_2
        Set α_k := 1 / (1 + x_k^T h_k / (1 + x_k^T x_k))
        Set x_{k+1} := x_k + α_k h_k
        Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
    end
    x̂ := x_k


Equivalence with an inverse power iteration

The GN-TLS method with optimal step length is equivalent to an inverse power method involving the matrix C^T C. Indeed, let

    s_k = (x_k, −1)^T / (1 + x_k^T x_k)^{1/2}.

Then

    s_{k+1} = β_k (C^T C)^{−1} s_k,   where   β_k = 1 / ||(C^T C)^{−1} s_k||_2.

Meanwhile,

    f(x_{k+1}) = β_k (CC^T)^+ f(x_k).

Corollary. The GN-TLS method with optimal step length is convergent. Moreover,

    ||f(x_k) − f(x_TLS)|| = O((σ_{n+1}/σ_n)^{2k}),
    ||x_k − x_TLS|| = O((σ_{n+1}/σ_n)^{2k}),
    |η(x_k) − η(x_TLS)| = O((σ_{n+1}/σ_n)^{4k}).
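The damped iteration and its inverse-power reformulation can be run side by side. The sketch below is an illustration under our own naming (`gn_tls_damped`, `inverse_power_tls`) and random data; by the corollary, both sequences should approach the same x_TLS:

```python
import numpy as np

def gn_tls_damped(A, b, eps=1e-14, maxit=2000):
    """GN-TLS with the optimal step length alpha_k (sketch)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(maxit):
        t = 1.0 + x @ x
        r = A @ x - b
        f = r / np.sqrt(t)
        J = A / np.sqrt(t) - np.outer(r, x) / t ** 1.5
        if np.linalg.norm(J.T @ f) < eps:
            break
        h = np.linalg.lstsq(J, -f, rcond=None)[0]
        alpha = 1.0 / (1.0 + (x @ h) / t)        # optimal step length
        x = x + alpha * h
    return x

def inverse_power_tls(A, b, iters=2000):
    """Inverse power iteration on C^T C, started from the LS solution,
    using the normalized vectors s_k = (x_k, -1)/sqrt(1 + x_k'x_k)."""
    C = np.column_stack([A, b])
    M = C.T @ C
    s = np.append(np.linalg.lstsq(A, b, rcond=None)[0], -1.0)
    s /= np.linalg.norm(s)
    for _ in range(iters):
        s = np.linalg.solve(M, s)                # s_{k+1} = beta_k (C'C)^{-1} s_k
        s /= np.linalg.norm(s)
    return -s[:-1] / s[-1]

# Random data for illustration; both iterates approximate x_TLS.
rng = np.random.default_rng(3)
A = rng.standard_normal((25, 3))
b = rng.standard_normal(25)
print(np.linalg.norm(gn_tls_damped(A, b) - inverse_power_tls(A, b)))
```

Recovering x from s via x = −s[:n]/s[n+1] is insensitive to the sign flips of the normalized eigenvector iterates.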
Numerical experiments

Test problem by Björck, Heggernes, Matstoms (2000).

Figure: Left: log ||J_k^T f_k||. Center: errors log ||x_k − x_TLS|| (solid lines) and log |η(x_k) − η(x_TLS)| (dashed lines). Right: plot of α_k.


Conclusions

The method produces a sequence of approximations which converges with no restriction. The value η(x_k), available during the iterations, estimates the backward error in Ax ≈ b. The method avoids computing the SVD: at each step it only solves a least squares problem whose matrix is a rank-one perturbation of the data matrix. This can be useful in some circumstances: if A is large and sparse, we can use (transpose-free) Krylov methods, where the matrix is only involved in matrix-vector products; or if the QR factorization of A is known in advance.


Reference: D. Fasino, A. Fazzi. A Gauss-Newton iteration for Total Least Squares problems. arXiv:1608.01619 (2016). Submitted.

Thank you for your attention.