Selected Numerical Methods
Part 1: some preliminary notions

Roberto Ferretti

• Numerical Analysis
• Analytical and computational issues
• Machine arithmetic
• The concepts of conditioning and stability
• General philosophy of iterative methods

Numerical Analysis

• In its usual setting, Numerical Analysis is the branch of Mathematics which studies the approximate solution of analytical problems, in particular when they are
  • Not explicitly solvable: e.g., nonlinear equations, integrals, ...
  • Overly complex: e.g., linear systems in high dimension, ...
• Convergence of numerical solutions to exact solutions is usually obtained as a limit

Mathematical formulation (study of well-posedness)
↓↑
Approximation scheme (convergence and stability of the scheme)
↓↑
Efficient implementation of the scheme

• In complex problems, the interaction between modeling, numerical approximation and implementation issues is very close, and choices made in one of these frameworks also affect the others
• The success of the operation crucially depends on a good interaction of all the components

Analytical and computational issues

The study of well-posedness of the problem should ensure:

• Existence and uniqueness of the solution: it does not make sense to approximate a problem with no solution, and if multiple solutions exist, we need to characterize the solution of interest
• Continuous dependence on the data: any approximation scheme introduces perturbations of the original problem, so the solution should be stable with respect to such perturbations

• Problems of large dimension, which are typical of Scientific Computing, require careful handling of the computational complexity, as well as of the memory requirements, of a given approximation algorithm
• Choices made at both the modeling and the numerical level may greatly affect the actual computability of the solution
• For implementation on parallel computers, algorithms usually have to be recast in a suitable form

Machine arithmetic

The typical objects on which numerical algorithms work are finite floating-point representations of real numbers, in normalized form (i.e., with d1 ≠ 0):

    x = ±0.d1 d2 d3 ... · B^p  →  flt(x) = ±0.d1 d2 ... d(t-1) d̄t · B^p

• In the chopping strategy one uses d̄t = dt, while in the more usual strategy of rounding one sets

    d̄t = dt       if d(t+1) < B/2
    d̄t = dt + 1   if d(t+1) ≥ B/2

IEEE standard for floating-point machine representations:

    precision   B   t    p               truncation
    float       2   23   [-126, 127]     rounding
    double      2   52   [-1022, 1023]   rounding

Machine arithmetic also includes the special symbols Inf ("infinity", e.g. as it results from dividing a nonzero number by zero) and NaN ("not a number", as it results from a non-admissible operation which cannot be treated as Inf, e.g. the logarithm of a negative number).

Besides the fact that this representation only covers a discrete set of numbers, it is also possible that a real number x falls out of the representable range:

• Overflow: the exponent p is greater than the maximum representable exponent, and x is represented as Inf (in single precision, |x| > 2^127 ≈ 10^38; in double precision, |x| > 2^1023 ≈ 10^308)
• Underflow: the exponent p is lower than the minimum representable exponent, and x is represented as ±0 (in single precision, |x| < 2^-126 ≈ 10^-38; in double precision, |x| < 2^-1022 ≈ 10^-308)

Rounding errors associated with a given floating-point machine representation may be naturally characterized in terms of the relative error.
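The chopping and rounding strategies, and the limits of the representable range, can be sketched in code. The helper `flt` below (an illustrative assumption, not code from the notes) truncates a number to t significant digits in base B = 10, while the last lines probe IEEE double precision directly:

```python
import math
import sys

# Chopping vs rounding to t significant digits in base B = 10
# (illustrative helper, not part of the lecture notes).
def flt(x, t, mode="round"):
    if x == 0.0:
        return 0.0
    p = math.floor(math.log10(abs(x))) + 1   # exponent so that x = 0.d1d2... * 10^p
    m = abs(x) / 10**p                       # normalized mantissa in [0.1, 1)
    scaled = m * 10**t
    kept = math.floor(scaled) if mode == "chop" else round(scaled)
    return math.copysign(kept / 10**t * 10**p, x)

print(flt(2/3, 7, "chop"))    # 0.6666666
print(flt(2/3, 7, "round"))   # 0.6666667

# Machine precision of an IEEE double is 2^-52
print(sys.float_info.epsilon == 2.0**-52)   # True

# Overflow and underflow in double precision
print(1e308 * 10)     # inf  (overflow: exponent too large)
print(2.0**-1080)     # 0.0  (underflow: exponent too small, flushed to zero)
```

In base 2 the same mechanism applies; `sys.float_info` exposes the parameters of the table above for the `double` format.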
We denote by machine precision the maximum relative error associated with the representation flt(x):

    εm = max over x of |x − flt(x)| / |x|

In the IEEE standard, the machine precision is therefore εm = 2^-23 ≈ 10^-7 in single precision and εm = 2^-52 ≈ 10^-16 in double precision (which correspond to about seven exact significant digits in the first case, fifteen/sixteen digits in the second).

The concepts of conditioning and stability (1)

Assuming that the functional relationship relating x to the problem data d is in the form

    x = F(d)                                                  (1)

the condition number of the problem is the ratio between the variation of x and the variation of d:

    cond = ‖F(d + δd) − F(d)‖ / ‖δd‖                          (2)

that is, a coefficient of sensitivity of the solution with respect to perturbations of the data.

Example: solution of the linear equation ax − b = 0 with respect to perturbations of the constant term b. Writing the perturbed solution as x + δx, one has

    a(x + δx) = b + δb

which gives

    |δx| / |δb| = 1 / |a|,

that is, the sensitivity of the solution to perturbations of b is higher for small values of the slope coefficient a.

[Figure: graphical solutions of 10x − 10 = 0 and of x − 10 = 0, both with |δb| ≤ 0.5; the smaller slope produces a much wider interval of perturbed solutions.]

• Even in well-posed problems, large condition numbers indicate a strong sensitivity to perturbations, and therefore an inherent difficulty in accurately approximating the problem
• Even in well-posed and well-conditioned problems, a further amplification of perturbations might be caused by the numerical scheme

The concepts of conditioning and stability (2)

Writing the approximate solution in the form

    x̂ = F̂(d)                                                 (3)

we denote by stability a "low" ratio between the variations of x̂ and d respectively, that is, a property of good conditioning of the function F̂ with respect to perturbations of the data:

• Data coming from measurements or previous approximations
• Data affected by rounding errors due to machine arithmetic (a permanent perturbation)

Example: incremental ratio of f(x) = x^(1/3) at x = 1, computed with seven significant digits (the correct limit value is f′(1) = 1/3).

    h                       10^-1      10^-3      10^-5      10^-6
    (1 + h)^(1/3)           1.032280   1.000333   1.000003   1.000000
    ((1 + h)^(1/3) − 1)/h   0.3228     0.333      0.3        0.0

Comment: an ill-conditioned computation – the perturbation on the result cannot be controlled.

Pathologies and paradoxes of machine arithmetic (for example, with seven significant digits):

• Operating with numbers of very different orders of magnitude:

    1 + 10^-8 = 0.1000000 · 10^1 + 0.00000000(1) · 10^1 = 1

• Obtaining a small number as the difference between two large numbers:

    0.1235678 · 10^1 − 0.1234567 · 10^1 = 0.1111000 · 10^-2

General philosophy of iterative methods

• In Numerical Analysis, a common technique for approximating solutions is to build iterative methods, in which the solution is sought by means of a sequence of recursively defined approximate solutions xk
• In the case of equations (or systems of equations), such a sequence is usually constructed by putting the system Ax = b, or the equation f(x) = 0, in fixed-point form:

    x = T(x)

Banach's fixed-point theorem ensures that if there exists a set U invariant under the transformation T(·), and if T(·) is a contraction on U, that is, for any x, y ∈ U:

    ‖T(x) − T(y)‖ ≤ LT ‖x − y‖

with Lipschitz constant LT < 1, then for x0 ∈ U the sequence

    xk+1 = T(xk)                                              (4)

converges, xk → x̄, with x̄ solving x = T(x).

• This result allows one to constructively define a sequence converging to x̄, on condition of having a contraction at the right-hand side of (4).
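The seven-digit incremental-ratio experiment above can be repeated in double precision (about sixteen significant digits), where the same breakdown simply appears at much smaller h. This is a sketch of my own, not code from the notes:

```python
# Forward incremental ratio of f(x) = x**(1/3) at x = 1; the exact limit is 1/3.
def incremental_ratio(h):
    return ((1 + h)**(1/3) - 1) / h

for h in [1e-1, 1e-5, 1e-9, 1e-13, 1e-16]:
    print(f"h = {h:.0e}   ratio = {incremental_ratio(h):.6f}")
# The numerator (1+h)**(1/3) - 1 is a difference of nearly equal numbers:
# as h shrinks, the ratio first approaches 1/3, then cancellation destroys
# the significant digits; at h = 1e-16, 1 + h rounds to 1 and the ratio is 0.
```

The mechanism is exactly that of the table: an ill-conditioned subtraction whose relative error grows without bound as h → 0.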
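The iteration (4) can be sketched in a few lines. The stopping rule and the example map T(x) = cos(x) are illustrative choices of mine (cos is a contraction near its fixed point x̄ ≈ 0.739, since |T′(x)| = |sin(x)| < 1 there), not material from the slides:

```python
import math

# Minimal fixed-point iteration x_{k+1} = T(x_k), stopping when successive
# iterates are closer than tol (a common practical surrogate for convergence).
def fixed_point(T, x0, tol=1e-12, max_iter=1000):
    x = x0
    for _ in range(max_iter):
        x_new = T(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

x_bar = fixed_point(math.cos, 1.0)
print(x_bar)                                  # ~0.7390851332151607
print(abs(math.cos(x_bar) - x_bar) < 1e-11)   # True: x̄ satisfies x = T(x)
```

Since LT = sin(x̄) ≈ 0.67 here, each step contracts the error by roughly a factor of 0.67, as the theorem predicts.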
Since approximately locating a solution is usually easier than finding an invariant set for the transformation T, it can be convenient to replace the assumption of invariance with the hypothesis that T is a contraction in a neighbourhood of the solution. Indeed, if this is the case, when xk is in a spherical set containing x̄ and LT is the Lipschitz constant of T on this set, then

    ‖xk+1 − x̄‖ = ‖T(xk) − T(x̄)‖ ≤ LT ‖xk − x̄‖ < ‖xk − x̄‖

and therefore xk+1 is also in the same set (which is thus invariant).

The Lipschitz constant of a transformation T : R^n → R^n may be determined as

    LT = sup over x of ‖JT(x)‖

where JT(x) is the Jacobian matrix

    JT = [ ∂T1/∂x1  ...  ∂T1/∂xn ]
         [   ...    ...    ...   ]
         [ ∂Tn/∂x1  ...  ∂Tn/∂xn ]

• If n = 1, the norm of the Jacobian is nothing but the magnitude of the derivative
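The characterization LT = sup‖JT(x)‖ can be sampled numerically. In the sketch below, the 2D map T and the sampling grid are illustrative choices of mine; the Jacobian is approximated by forward differences and its Frobenius norm (an upper bound for the 2-norm) is maximized over the grid. This gives a rough estimate, not a rigorous bound:

```python
import math

# Illustrative map T(x, y) = (0.3*cos(y), 0.3*sin(x)); its exact Jacobian is
# [[0, -0.3*sin(y)], [0.3*cos(x), 0]], so the true Lipschitz constant is < 1.
def T(v):
    x, y = v
    return (0.3 * math.cos(y), 0.3 * math.sin(x))

def jacobian_norm(v, h=1e-6):
    # Forward-difference Jacobian, then its Frobenius norm (>= the 2-norm).
    J = [[(T((v[0] + h, v[1]))[i] - T(v)[i]) / h,
          (T((v[0], v[1] + h))[i] - T(v)[i]) / h] for i in range(2)]
    return math.sqrt(sum(J[i][j]**2 for i in range(2) for j in range(2)))

# Sample sup_x ||J_T(x)|| on a grid over [-1, 1]^2.
L = max(jacobian_norm((0.1 * i, 0.1 * j))
        for i in range(-10, 11) for j in range(-10, 11))
print(L < 1)   # True: T is a contraction, so x_{k+1} = T(x_k) converges
```

For n = 1 the same computation reduces to sampling |T′(x)|, consistent with the last remark above.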