Selected Numerical Methods
Part 2: iterative methods for nonlinear equations

Roberto Ferretti

• General features of iterative methods
• Methods for scalar equations: fixed-point iterations
• Methods for scalar equations: Newton's method and its derivations

General features of iterative methods

In making an iterative method

$$x_{k+1} = T(x_k) \qquad (1)$$

work properly, we need several ingredients:

• Approximate location of the solution $\bar{x}$ – most iterative methods have only local convergence
• Suitable definition of the function $T(\cdot)$ – it must be a contraction
• Suitable definition of the stopping criterion – that is, a correct estimation of the error at a given step

Approximate location of the solution: especially in nonlinear cases, we do not expect the equation (or system) to have a unique solution, so the mapping $T(\cdot)$ cannot in general be a contraction on the whole of $\mathbb{R}^n$. Typically, it may occur that

• The definition of $T$ itself depends on the neighbourhood we choose
• The definition of $T$ does not depend on the neighbourhood, but its convergence does (as in Newton's method)

Definition of the iteration function $T(\cdot)$: this is the core of the numerical theory. In practice, the requirements are

• $T$ must be a contraction, at least in a neighbourhood of the solution
• The error should decrease quickly, so that the required accuracy is achieved with a low computational complexity
• The construction of $T$ should not require complex information (even the derivatives may not be explicitly known)

Error $\|x_k - \bar{x}\|$: it can be bounded in two different forms:

• If it is possible to give a bound on the initial error $\|x_0 - \bar{x}\|$, then
$$\|x_k - \bar{x}\| \le L_T \|x_{k-1} - \bar{x}\| \le \cdots \le L_T^k \|x_0 - \bar{x}\|$$
• If not, the error can be estimated from the update $\|x_k - x_{k-1}\|$ as
$$\|x_k - \bar{x}\| \le \|x_{k+1} - x_k\| + \|x_{k+2} - x_{k+1}\| + \cdots \le L_T \|x_k - x_{k-1}\| + L_T^2 \|x_k - x_{k-1}\| + \cdots \le \frac{L_T}{1 - L_T} \|x_k - x_{k-1}\|$$
• In defining a suitable stopping criterion for the iterations, it is also necessary to take into account the residual $|f(x)|$ of the equation

[Figure: two configurations of the graph of $f$ — one with $|x_k - \bar{x}|$ small but $|f(x)|$ large, the other with $|x_k - \bar{x}|$ large but $|f(x)|$ small]

Estimating the error at the $k$-th step as a function of the error at the $(k-1)$-th step,
$$\|x_k - \bar{x}\| \le L_T \|x_{k-1} - \bar{x}\| \le \cdots \le L_T^k \|x_0 - \bar{x}\|$$
shows that in a fixed-point iterative method the Lipschitz constant of $T$ should be kept as low as possible.

• Working in a neighbourhood of $\bar{x}$ is crucial for obtaining a small Lipschitz constant
• Convergence is (at least) exponential, but can still be very slow if $L_T \approx 1$

In some cases, this behaviour can be improved: we define the order of convergence of a method as the largest exponent $\gamma$ such that
$$\|x_k - \bar{x}\| \le C \|x_{k-1} - \bar{x}\|^\gamma$$

• In methods based on a contraction we typically have $C = L_T$ and $\gamma = 1$, but the interest is in methods for which $\gamma > 1$ (as a rule, a higher order implies a faster reduction of the error)
• The case $\gamma > 1$ corresponds to a convergence speed faster than exponential

Example: $\|x_0 - \bar{x}\| = 0.1$, errors for a linear and a quadratic method

iter.   γ = 1, C = 0.1   γ = 2, C = 1
0       0.1              0.1
1       0.01             0.01
2       0.001            0.0001
3       0.0001           0.00000001

The reduction of the error depends more strongly on the exponent $\gamma$ than on the constant $C$.
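As a quick check of the example above, the following Python sketch (not part of the original slides) iterates the error bound $e_k = C\, e_{k-1}^\gamma$ for the two sets of parameters and reproduces the table.

```python
# Sketch: evolution of the error bound e_k = C * e_{k-1}**gamma
# for a linear (gamma = 1) and a quadratic (gamma = 2) method,
# both starting from e_0 = 0.1 as in the example above.

def error_sequence(e0, C, gamma, steps):
    errors = [e0]
    for _ in range(steps):
        errors.append(C * errors[-1] ** gamma)
    return errors

linear = error_sequence(0.1, C=0.1, gamma=1, steps=3)
quadratic = error_sequence(0.1, C=1.0, gamma=2, steps=3)

for k, (el, eq) in enumerate(zip(linear, quadratic)):
    print(f"{k}   {el:.8f}   {eq:.8f}")
```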
This motivates the effort in constructing methods for which $\gamma > 1$: in particular, "superlinear methods" (secant, Muller) for $1 < \gamma < 2$ and "quadratic methods" (Newton) for $\gamma = 2$.

Methods for scalar equations: fixed-point iterations

In one dimension, we rewrite methods in the form (1) as

$$x_{k+1} = g(x_k) \qquad (g : \mathbb{R} \to \mathbb{R}) \qquad (2)$$

• The contractivity condition becomes $|g'(x)| \le L_g < 1$
• In one dimension it is possible to give a graphic interpretation of the construction and of the possible convergence of the sequence $x_k$, since the solution is the intersection of the graphs $y = x$ and $y = g(x)$

[Figure: graphical iterations of (2) in the convergent cases $0 < g'(x) < 1$ and $-1 < g'(x) < 0$, and in the divergent cases $g'(x) > 1$ and $g'(x) < -1$]

A standard way to set the scalar equation $f(x) = 0$ in fixed-point form is to rewrite it as

$$x = x + \alpha(x) f(x) \qquad (3)$$

• For this form to be equivalent to the original equation, the function $\alpha(x)$ must not have zeroes in the neighbourhood of $\bar{x}$
• Possibly, $\alpha(x) \equiv \alpha$ (a nonzero constant)
• Using (3) to define an iterative method, it is usually assumed that $\bar{x}$ is a simple root (in fact, if $\bar{x}$ is a multiple root, we have $g'(\bar{x}) = 1$ and therefore $g$ cannot be a contraction)

Starting with the case $\alpha(x) \equiv \alpha$, the ideal situation for convergence would be to have
$$g'(\bar{x}) = 1 + \alpha f'(\bar{x}) = 0$$
since in this case the contraction coefficient may be made arbitrarily small by restricting the iteration to a sufficiently small neighbourhood of $\bar{x}$.

• Since $\bar{x}$ is unknown (and the explicit expression of $f'$ might not be available either), the constant $\alpha$ should be an approximation of the optimal value $\bar{\alpha} = -1/f'(\bar{x})$
• A first possibility is to replace $f'(\bar{x})$ with the incremental ratio of $f$ computed on a (sufficiently small) interval $[a, b]$ containing $\bar{x}$; the resulting method is
$$x_{k+1} = x_k - \frac{b - a}{f(b) - f(a)}\, f(x_k)$$
• A second possibility is to replace $f'(\bar{x})$ with $f'(x_0)$, provided $x_0$ is close enough to $\bar{x}$:
$$x_{k+1} = x_k - \frac{1}{f'(x_0)}\, f(x_k)$$
• Theory confirms that both methods are convergent if $f \in C^1$, $f'(\bar{x}) \ne 0$ and $a$, $b$, $x_0$ are close enough to $\bar{x}$

The choice of $\alpha$ may be made rigorous by means of the following result, which gives the order of convergence of a method in the form (2):

• If $g \in C^{m+1}$ and $g'(\bar{x}) = \cdots = g^{(m)}(\bar{x}) = 0$, $g^{(m+1)}(\bar{x}) \ne 0$, then the method converges with order $m + 1$ if $x_0$ is sufficiently close to $\bar{x}$
• In particular, if $g(x)$ is in the form (3) with $\alpha(x) = \alpha$ (constant), then for $\alpha = -1/f'(\bar{x})$ convergence is quadratic (in general, such a structure of the method cannot ensure convergence of order greater than 2)

Methods for scalar equations: Newton's method and its derivations

Newton's method is obtained using the form (3), with $\alpha(x) = -1/f'(x)$:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$

• We must assume that the explicit expression of the derivative is known
• If $f \in C^2$ and $f'(\bar{x}) \ne 0$, then
$$g'(\bar{x}) = 1 - \frac{f'(\bar{x})^2 - f(\bar{x}) f''(\bar{x})}{f'(\bar{x})^2} = 0$$
and the method converges with quadratic order if $x_0$ is close enough to $\bar{x}$
• The approximation $x_{k+1}$ is the zero of the tangent to the graph of $f$ at $(x_k, f(x_k))$
• Vanishing of the derivative around $\bar{x}$ should be avoided (in such a case, the tangent would become parallel to the $x$ axis)
• It can be proved that Newton's method has monotone, global convergence in any of the following cases:
  – $x_0 > \bar{x}$, with $f(x)$ increasing and convex, or decreasing and concave, on $[\bar{x}, x_0]$
  – $x_0 < \bar{x}$, with $f(x)$ increasing and concave, or decreasing and convex, on $[x_0, \bar{x}]$
• In case of zeroes of multiplicity $m > 1$, the method can be corrected so as to preserve quadratic convergence
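A minimal Python sketch of Newton's method follows; the test function $f(x) = x^2 - 2$, the starting point and the tolerance are illustrative assumptions, not taken from the slides. The stopping criterion combines the size of the update with the residual $|f(x)|$, following the discussion on stopping criteria above.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{k+1} = x_k - f(x_k)/f'(x_k).

    Stops when both the update and the residual are below tol,
    as suggested by the discussion on stopping criteria.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        x_new = x - fx / fprime(x)  # zero of the tangent at (x, f(x))
        if abs(x_new - x) < tol and abs(fx) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton's method did not converge")

# Illustrative use: solve x**2 - 2 = 0 starting from x0 = 1.5
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.5)
print(root)  # ~1.414213562373095
```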
If the expression of the derivative is unknown, the secant method replaces the computation of $f'(x_k)$ with the incremental ratio of $f$ between the abscissas $x_{k-1}$ and $x_k$, thus obtaining the scheme

$$x_{k+1} = x_k - \frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}\, f(x_k)$$

• The explicit expression of the derivative is not required, and moreover $f$ is still computed only once per iteration
• If $f \in C^2$ and $f'(\bar{x}) \ne 0$, convergence is superlinear, with exponent
$$\gamma = \frac{1 + \sqrt{5}}{2} \approx 1.618$$
provided $x_0$ and $x_1$ are close enough to $\bar{x}$
• The approximation $x_{k+1}$ is the zero of the line through the points $(x_k, f(x_k))$ and $(x_{k-1}, f(x_{k-1}))$
• As in Newton's method, it is required that $\bar{x}$ is a simple zero of the function $f$
• Monotone convergence holds under the same conditions as in Newton's method

In Muller's method, the same idea of the secant method is applied by defining $x_{k+1}$ as a zero of the interpolating polynomial of $f$ of degree $n = 2$ passing through the points $(x_k, f(x_k))$, $(x_{k-1}, f(x_{k-1}))$ and $(x_{k-2}, f(x_{k-2}))$

• Suitable strategies allow one to choose between the two roots, and it is possible to converge to complex roots (for this reason, the method is often used for algebraic equations)
• The method converges with order $\gamma \approx 1.84$ if $f \in C^3$ and $x_0$, $x_1$ and $x_2$ are close enough to $\bar{x}$

Comparison among the various iterative methods:

• In order to apply Newton's method, the function and its derivative should be known in closed explicit form; with this limitation, the method is very efficient
• Whenever $f$ is not explicitly known, the most efficient methods are the secant and Muller methods, provided the function $f$ is sufficiently smooth
• Lacking regularity of $f$, other techniques must be applied (typically, bisection)

scheme        complexity               order      regularity
bisection     comp. of f(x)            γ = 1      C^0
fixed point   comp. of f(x)            γ = 1      C^1
secant        comp. of f(x)            γ ≈ 1.62   C^2
Muller        comp. of f(x)            γ ≈ 1.84   C^3
Newton        comp. of f(x), f'(x)     γ = 2      C^2
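To complement the comparison table above, here is a minimal Python sketch of the secant method; the test equation and the starting points are illustrative assumptions. Note that only one new evaluation of $f$ is needed per iteration, as remarked above.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant iteration:
    x_{k+1} = x_k - (x_k - x_{k-1}) / (f(x_k) - f(x_{k-1})) * f(x_k).
    """
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        x2 = x1 - (x1 - x0) / (f1 - f0) * f1  # zero of the secant line
        if abs(x2 - x1) < tol:
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)  # single new evaluation of f per step
    raise RuntimeError("secant method did not converge")

# Illustrative use: solve x**2 - 2 = 0 from the pair x0 = 1, x1 = 2
print(secant(lambda x: x**2 - 2, 1.0, 2.0))  # ~1.414213562373095
```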