Lecture 1

Selected Numerical Methods
Part 1: some preliminary notions
Roberto Ferretti
• Numerical Analysis
• Analytical and computational issues
• Machine arithmetic
• The concepts of conditioning and stability
• General philosophy of iterative methods
Numerical Analysis
• In its usual setting, Numerical Analysis is a branch of Mathematics which studies the approximate solution of analytical problems, in particular when they are
• Not explicitly solvable: e.g., nonlinear equations, integrals, ...
• Overly complex: e.g., linear systems in high dimensions, ...
• Convergence of the numerical solution to the exact solution is usually obtained as a limit (e.g., as a discretization parameter vanishes, or as the number of iterations grows)
Mathematical formulation
(study of well-posedness)
↓↑
Approximation scheme
(convergence and stability of the scheme)
↓↑
Efficient implementation of the scheme
• In complex problems, the interaction between modeling, numerical approximation and implementation issues is very close, and choices made in one of these frameworks also affect the others
• The success of the operation crucially depends on a good interaction of all the components
Analytical and computational issues
The study of well-posedness of the problem should ensure:
• Existence and uniqueness of the solution: it does not make sense to approximate a problem with no solution, and if multiple solutions exist, we need to characterize the solution of interest
• Continuous dependence upon the data: any approximation scheme introduces perturbations of the original problem, so the solution should be stable with respect to such perturbations
Problems of large dimension, which are typical of Scientific Computing, need careful handling of the computational complexity, as well as of the memory requirements, of a given approximation algorithm
• Choices made at both the modeling and the numerical level may greatly affect the actual computability of the solution
• For implementation on parallel computers, algorithms usually need to be recast in a suitable form
Machine arithmetic
The typical objects on which numerical algorithms work are finite floating-point representations of real numbers, in normalized form (i.e., with d_1 ≠ 0):

x = ±0.d_1 d_2 d_3 … · B^p   →   flt(x) = ±0.d_1 d_2 d_3 … d_{t−1} d̄_t · B^p
• In the chopping strategy, one uses d̄_t = d_t, while in the more usual strategy of rounding one has

d̄_t = d_t        if d_{t+1} < B/2
d̄_t = d_t + 1    if d_{t+1} ≥ B/2
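For illustration, a minimal Python sketch of the two strategies (not from the original slides; extracting digits via logarithms is just one possible approach, and the sketch itself runs in double precision, so it only illustrates the idea):

```python
import math

def flt(x, t=7, B=10, mode="rounding"):
    """Reduce x to t significant base-B digits, by chopping or rounding."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    x = abs(x)
    # Normalized form x = 0.d1 d2 ... * B^p, with d1 != 0
    p = math.floor(math.log(x, B)) + 1
    mantissa = x / B**p              # lies in [1/B, 1)
    scaled = mantissa * B**t         # first t digits as the integer part
    if mode == "chopping":
        m = math.floor(scaled)       # drop digit t+1 onward
    else:
        m = math.floor(scaled + 0.5) # round according to digit t+1
    return sign * m * B**(p - t)

print(flt(math.pi, t=4, mode="chopping"))  # 3.141
print(flt(math.pi, t=4, mode="rounding"))  # 3.142
```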
IEEE standard for floating-point machine representations:

precision   B    t     p                truncation
float       2    23    [-126, 127]      rounding
double      2    52    [-1022, 1023]    rounding
Machine arithmetic also includes the special symbols Inf ("infinity", e.g. as it results from dividing a nonzero number by zero) and NaN ("not a number", as it results from a non-admissible operation which cannot be treated as Inf, e.g. the logarithm of a negative number)
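A quick way to observe these special values (a sketch; plain Python raises exceptions for operations like 1.0/0.0, so NumPy is used here, where IEEE semantics apply):

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    print(np.float64(1.0) / np.float64(0.0))   # inf: nonzero / zero
    print(np.float64(-1.0) / np.float64(0.0))  # -inf
    print(np.log(np.float64(-1.0)))            # nan: non-admissible operation
    print(np.float64(0.0) / np.float64(0.0))   # nan

print(np.inf > 1e308)    # True: Inf compares as larger than any finite number
print(np.nan == np.nan)  # False: NaN is not equal even to itself
```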
Besides the fact that this representation only covers a discrete set of numbers, it is also possible that a real number x falls outside the representable range:
• Overflow: the exponent p is greater than the maximum representable exponent, and x is represented as Inf (in single precision, |x| > 2^127 ∼ 10^38; in double precision, |x| > 2^1023 ∼ 10^308)
• Underflow: the exponent p is smaller than the minimum representable exponent, and x is represented as ±0 (in single precision, |x| < 2^−126 ∼ 10^−38; in double precision, |x| < 2^−1022 ∼ 10^−308)
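A possible check of both phenomena in single precision, using NumPy (the thresholds in the comments are approximate):

```python
import numpy as np

with np.errstate(over="ignore", under="ignore"):
    # Overflow: single precision cannot go much beyond 10^38
    print(np.float32(1e38))   # still finite
    print(np.float32(1e39))   # inf

    # Underflow: below roughly 10^-38 (10^-45 counting subnormals)
    print(np.float32(1e-38))  # still nonzero
    print(np.float32(1e-46))  # 0.0
```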
Rounding errors associated to a given floating-point machine representation may be naturally characterized in terms of relative error. We denote by machine precision the maximum relative error associated to the representation flt(x):

ε_m = max_x |x − flt(x)| / |x|

In the IEEE standard, the machine precision is therefore ε_m = 2^−23 ∼ 10^−7 in single precision and ε_m = 2^−52 ∼ 10^−16 in double precision (which correspond to about seven exact significant digits in the first case, fifteen/sixteen digits in the second)
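These values can be checked directly in NumPy, whose finfo(...).eps reports exactly the quantities above:

```python
import numpy as np

# Machine precision: the gap between 1 and the next representable number
print(np.finfo(np.float32).eps)  # 2**-23 ~ 1.19e-07
print(np.finfo(np.float64).eps)  # 2**-52 ~ 2.22e-16

# A relative perturbation below the machine precision is lost entirely
print(np.float64(1.0) + np.float64(1e-17) == np.float64(1.0))  # True
```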
The concepts of conditioning and stability (1)
Assuming that the functional relationship relating x to the problem data d is in the form

x = F(d)                                          (1)

the condition number of the problem is the ratio between the variation of x and the variation of d:

cond = ‖F(d + δd) − F(d)‖ / ‖δd‖                  (2)

that is, a coefficient of sensitivity of the solution with respect to data perturbations
Example: solution of the linear equation ax − b = 0 with respect to perturbations of the constant term b. Writing the perturbed solution as x + δx, one has:

a(x + δx) = b + δb

which gives

|δx| / |δb| = 1 / |a|

that is, the sensitivity of the solution to perturbations of b is higher for small values of the slope coefficient a
[Figure: roots of 10x − 10 = 0 and of x − 10 = 0 under a perturbation δb ≤ 0.5 of the constant term: the same δb displaces the root ten times more in the second case (a = 1) than in the first (a = 10)]
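The effect shown in the figure can be reproduced numerically; a minimal check (with the perturbation δb = 0.5 of the figure) confirming |δx|/|δb| = 1/|a|:

```python
def solve(a, b):
    """Exact solution of a*x - b = 0."""
    return b / a

db = 0.5  # perturbation on the constant term, as in the figure
for a, b in [(10.0, 10.0), (1.0, 10.0)]:
    x = solve(a, b)
    dx = solve(a, b + db) - x
    print(f"a = {a:5.1f}: x = {x}, |dx|/|db| = {abs(dx) / db}")  # = 1/|a|
```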
• Even in well-posed problems, large condition numbers show a strong sensitivity to perturbations, and therefore an inherent difficulty in the accurate approximation of the problem
• Even in well-posed and well-conditioned problems, a further amplification of perturbations might be caused by the numerical scheme
The concepts of conditioning and stability (2)
Writing the approximate solution in the form:

x̂ = F̂(d)                                         (3)

we denote by stability a "low" ratio between the variations of x̂ and of d, that is, a property of good conditioning of the function F̂ with respect to perturbations of the data:
• Data coming from measurements or previous approximations
• Data affected by rounding errors due to machine arithmetic (a permanent perturbation)
Example: the difference quotient of f(x) = x^{1/3} at x = 1, computed with seven significant digits (the correct limit value is f′(1) = 1/3).
h        (1 + h)^{1/3}    ((1 + h)^{1/3} − 1)/h
10^−1    1.032280         0.3228
10^−3    1.000333         0.333
10^−5    1.000003         0.3
10^−6    1.000000         0.0
Comment: an ill-conditioned computation; the perturbation of the result cannot be controlled
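The experiment can be repeated in single precision, which carries roughly seven significant digits; the individual values differ from the base-10 table above, but the same loss of accuracy occurs (a sketch using NumPy's float32):

```python
import numpy as np

one = np.float32(1.0)
for h in [1e-1, 1e-3, 1e-5, 1e-7]:
    h32 = np.float32(h)
    # Cancellation: (1+h)^(1/3) - 1 loses its leading significant digits
    ratio = (np.cbrt(one + h32) - one) / h32
    print(f"h = {h:.0e}: {ratio}")  # drifts away from 1/3, eventually 0.0
```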
Pathologies and paradoxes of machine arithmetic (for example, with seven significant digits):
• Operating with numbers of very different orders of magnitude:

1 + 10^−8 = 0.1 · 10^1 + 0.00000000(1) · 10^1 = 1

• Obtaining a small number as the difference between two large numbers:

0.1235678 · 10^1 − 0.1234567 · 10^1 = 0.1111000 · 10^−2
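Both pathologies are easy to reproduce in single precision (a sketch using NumPy's float32, which carries about seven significant decimal digits):

```python
import numpy as np

f = np.float32

# Absorption: the smaller operand disappears entirely
print(f(1.0) + f(1e-8) == f(1.0))  # True

# Cancellation: the difference of two close numbers keeps few exact digits
a, b = f(1.235678), f(1.234567)
print(a - b)  # ~1.111e-03, but the trailing printed digits are noise
```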
General philosophy of iterative methods
• In Numerical Analysis, a common technique for approximating solutions is to build iterative methods, in which the solution is sought by means of a sequence of recursively defined approximate solutions x_k
• In the case of equations (or systems of equations), such a sequence is usually constructed by putting the system Ax = b or the equation f(x) = 0 in fixed-point form:

x = T(x)
The Banach fixed point theorem ensures that if there exists a set U invariant for the transformation T(·), and if T(·) is a contraction on U, that is, for any x, y ∈ U:

‖T(x) − T(y)‖ ≤ L_T ‖x − y‖

with Lipschitz constant L_T < 1, then for x_0 ∈ U the sequence

x_{k+1} = T(x_k)                                  (4)

converges, x_k → x̄, with x̄ solving x = T(x).
• This result allows one to constructively define a sequence converging to x̄, provided that the right-hand side of (4) is a contraction.
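A minimal sketch of such an iteration (the example map x = cos x is not from the slides; it is a contraction near its fixed point, since |sin x| < 1 there):

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=200):
    """Iterate x_{k+1} = T(x_k) until successive iterates are within tol."""
    x = x0
    for k in range(max_iter):
        x_new = T(x)
        if abs(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# x = cos(x): a contraction near its fixed point x_bar ~ 0.739085
x_bar, iters = fixed_point(math.cos, x0=1.0)
print(x_bar, iters)
```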
Since approximately locating a solution is usually easier than finding an invariant set for the transformation T, it can be convenient to replace the assumption of invariance with the hypothesis that T is a contraction in a neighbourhood of the solution: indeed, if this is the case, when x_k lies in a ball centered at x̄ and L_T < 1 is the Lipschitz constant of T on this ball, then

‖x_{k+1} − x̄‖ = ‖T(x_k) − T(x̄)‖ ≤ L_T ‖x_k − x̄‖ < ‖x_k − x̄‖

and therefore x_{k+1} also lies in the same ball (which is thus invariant)
The Lipschitz constant of a transformation T: R^n → R^n may be determined as

L_T = sup_x ‖J_T(x)‖

where J_T(x) is the Jacobian matrix

        ( ∂T_1/∂x_1  ···  ∂T_1/∂x_n )
J_T  =  (     ⋮               ⋮     )
        ( ∂T_n/∂x_1  ···  ∂T_n/∂x_n )
• If n = 1, the norm of the Jacobian is nothing but the absolute value of the derivative
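In practice, sup_x ‖J_T(x)‖ can be estimated by sampling the Jacobian norm; a sketch for an assumed example map T(x, y) = (0.5 sin y, 0.5 cos x), whose spectral Jacobian norm is at most 0.5:

```python
import numpy as np

def lipschitz_estimate(T_jac, points):
    """Estimate L_T = sup_x ||J_T(x)|| by sampling the Jacobian norm."""
    return max(np.linalg.norm(T_jac(p), 2) for p in points)

# Jacobian of T(x, y) = (0.5*sin(y), 0.5*cos(x))
def T_jac(p):
    x, y = p
    return np.array([[0.0, 0.5 * np.cos(y)],
                     [-0.5 * np.sin(x), 0.0]])

samples = np.random.uniform(-1.0, 1.0, size=(1000, 2))
print(lipschitz_estimate(T_jac, samples))  # <= 0.5: T is a contraction
```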