Math 56
Homework 2
Michael Downs
1. (a) We should be able to square $10^{-8}$ relatively accurately, resulting in $10^{-16}$. Attempting to add this to 1, however, results in the $10^{-16}$ being absorbed because it's outside of 16 digits of relative accuracy. We should get:
$$y_{\text{approx}} = \sqrt{1 + (10^{-8})^2} - 1 = \sqrt{1 + 10^{-16}} - 1 \approx \sqrt{1} - 1 = 1 - 1 = 0.$$
Then $\epsilon_{\text{rel}} = \frac{|y - y_{\text{approx}}|}{|y|} = \frac{|y|}{|y|} = 1$, so we have 100% relative error.
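A one-line check (a minimal sketch; any IEEE double-precision environment behaves the same way) confirms the absorption:

% 1e-16 is below eps(1)/2, so 1 + 1e-16 rounds back to exactly 1
y_approx = sqrt(1 + (1e-8)^2) - 1   % prints 0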
(b) Using the binomial series to expand $f(x)$ about zero:
$$\sqrt{1+x^2} - 1 = \left(1 + \frac{1}{2}x^2 - \frac{1}{8}x^4 + \frac{1}{16}x^6 - \frac{5}{128}x^8 + O(x^{10})\right) - 1$$
$$= \frac{1}{2}x^2 - \frac{1}{8}x^4 + \frac{1}{16}x^6 - \frac{5}{128}x^8 + O(x^{10})$$
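As a quick check (a sketch assuming the Symbolic Math Toolbox is available), MATLAB's taylor command reproduces these coefficients:

syms x
% expand sqrt(1+x^2) - 1 about x = 0 through the x^8 term
taylor(sqrt(1 + x^2) - 1, x, 'Order', 10)
% gives x^2/2 - x^4/8 + x^6/16 - (5*x^8)/128 (term order may vary)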
(c) Keeping track of the errors in the naive implementation, we'd have two for rounding the two $x$s, one for the multiplication between the $x$s, one for the addition of the one inside the root, one for the square root, and a final one for the subtraction by one. Writing $\epsilon$ for each (distinct) rounding error, with $|\epsilon| \le \epsilon_{\text{mach}}$, it would look like:
$$\Big(\sqrt{\big(1 + x(1+\epsilon)\,x(1+\epsilon)(1+\epsilon)\big)(1+\epsilon)}\;(1+\epsilon) - 1\Big)(1+\epsilon) \tag{1}$$
From here we re-arrange the terms inside the root:
$$\Big(\sqrt{1 + x^2(1+\epsilon)^4 + \epsilon}\;(1+\epsilon) - 1\Big)(1+\epsilon) \tag{2}$$
From here we can use the Taylor expansion $\sqrt{1+y} \approx 1 + \frac{y}{2}$, which holds for small $y$ ($y = x^2(1+\epsilon)^4 + \epsilon$ is small because $x$ and $\epsilon$ are small). Absorbing the resulting $\epsilon/2$ into $\epsilon$ (we only track the bound $|\epsilon| \le \epsilon_{\text{mach}}$), we get:
$$\Big(\Big[1 + \frac{x^2}{2}(1+\epsilon)^4 + \epsilon\Big](1+\epsilon) - 1\Big)(1+\epsilon) \tag{3}$$
Since $x$ and $\epsilon$ are both small, $x\epsilon$ is very small, so we can throw away any $x\epsilon$ and $\epsilon^n$ terms for $n \ge 2$:
$$\Big(\Big[1 + \frac{x^2}{2}(1+\epsilon)^4 + \epsilon\Big](1+\epsilon) - 1\Big)(1+\epsilon) \approx \Big(\Big[1 + \frac{x^2}{2} + \epsilon\Big](1+\epsilon) - 1\Big)(1+\epsilon)$$
$$\approx \Big(1 + \frac{x^2}{2} + 2\epsilon - 1\Big)(1+\epsilon)$$
$$\approx \frac{x^2}{2} + 2\epsilon$$
So our relative error here is $\epsilon_{\text{rel}} = \frac{|2\epsilon|}{x^2/2} \le \frac{4\epsilon_{\text{mach}}}{x^2}$. This is 12 digits or better for all $x \ge 2 \cdot 10^{-2}$.
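This prediction can be tested numerically (a sketch; the three-term series from part (b) serves as the reference, and for these small $x$ its truncation error is negligible next to the naive formula's rounding error):

% measure the relative error of the naive formula against the series
xs = logspace(-8, -3, 6);
naive = sqrt(1 + xs.^2) - 1;
ref = xs.^2/2 - xs.^4/8 + xs.^6/16;   % essentially exact for x <= 1e-3
relerr = abs(naive - ref) ./ ref;
bound = 4*eps ./ xs.^2;               % the 4*eps_mach/x^2 bound above
[xs' relerr' bound']                  % observed error sits at or below the bound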
(d) We need three terms in the Taylor series for twelve digits of accuracy for $x \le 2 \cdot 10^{-2}$, so we would use the approximation $\sqrt{1+x^2} - 1 \approx \frac{1}{2}x^2 - \frac{1}{8}x^4 + \frac{1}{16}x^6$. The error is dominated by the first omitted term, $-\frac{5}{128}x^8$, so the relative error at the boundary $x = 2 \cdot 10^{-2}$ for three terms is $\frac{5 \cdot 2^9 \cdot 10^{-16}}{2^9 \cdot 10^{-4}} = 5 \cdot 10^{-12}$.
(e) Code:

% script to test the accuracy of the naive implementation and the
% taylor at the boundary
val = .02;

naive = sqrt(1 + val^2) - 1;
tayl = 1/2 * val^2 - 1/8 * val^4 + 1/16 * val^6;

d = naive - tayl;

abs(d/naive)
abs(d/tayl)
which outputs:
>> HW1Q1PE
ans =
5.4966e-12
ans =
5.4966e-12
2. $f(z) = z^3 - 1$
(a) Starting at $z_0 = i$ with Newton iteration via a MATLAB script I get:
ans =
  -0.49999999628903 + 0.866025398338587i
which I'll round to $\approx -0.5 + 0.86603i$. Starting at $z_0 = -i$ gives:
ans =
  -0.49999999628903 - 0.866025398338587i
which I'll round to $\approx -0.5 - 0.86603i$. Visualizing the three roots $r_1 = 1$, $r_2 = -0.5 + 0.86603i$, and $r_3 = -0.5 - 0.86603i$ in the complex plane:
[Figure: the three roots $r_1$, $r_2$, $r_3$ of $f(z) = z^3 - 1$ plotted in the complex plane; Re(z) and Im(z) axes run from $-2$ to $2$.]
In order for these numbers to be roots of $f(z)$, their cubes must be equal to 1, so the phase of each cube must be some multiple of $2\pi$. $\mathrm{Phase}(r_2) = \frac{2\pi}{3}$ and $\mathrm{Phase}(r_3) = \frac{-2\pi}{3}$. Angles are additive in complex multiplication, so cubing each of these numbers multiplies its angle by three, which puts it on the positive real line.
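The Newton script itself isn't reproduced above; a minimal sketch of what it could look like is:

% Newton iteration for f(z) = z^3 - 1, using f'(z) = 3z^2
z = 1i;                        % starting point (use -1i for the conjugate root)
for k = 1:50
    z = z - (z^3 - 1)/(3*z^2); % Newton step
end
format long
z                              % approx -0.5 + 0.8660i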
(b) Solutions to $z^5 = 2$ in the form $re^{i\theta}$ must all have $r = 2^{1/5}$, and their phases, when multiplied by 5, must be some multiple of $2\pi$. There are five complex numbers with this property that are unique when restricting phases to $(-\pi, \pi]$: $2^{1/5}$, $2^{1/5}e^{i\frac{2\pi}{5}}$, $2^{1/5}e^{i\frac{4\pi}{5}}$, $2^{1/5}e^{-i\frac{2\pi}{5}}$, and $2^{1/5}e^{-i\frac{4\pi}{5}}$.
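These can be checked numerically (a sketch):

% the five fifth roots of 2 from the phase argument above
k = -2:2;
analytic = 2^(1/5) * exp(1i*2*pi*k/5)
% compare with MATLAB's polynomial root finder applied to z^5 - 2
numeric = roots([1 0 0 0 0 -2])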
(c) Graphic of basins:
[Figure: "Fractal plot of basins for $f(z) = z^3 - 1$"; Re(z) and Im(z) axes run from $-2$ to $2$.]
Complex numbers in green converge to $r_1 = 1$, blue to $r_2 = -0.5 + 0.86603i$, and red to $r_3 = -0.5 - 0.86603i$. Here are some more images of the fractal nature of this diagram:
[Figures: successive zoomed views of the basin boundaries (roughly Re(z) from $-0.8$ to $0.1$, Im(z) from $-0.4$ to $0.5$, then further magnifications), showing the same structure repeating at smaller scales.]
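A sketch of how such a basin plot can be generated (the grid resolution and iteration count here are arbitrary choices):

% color each grid point by the root its Newton iteration converges to
[X, Y] = meshgrid(linspace(-2, 2, 400));
Z = X + 1i*Y;
for k = 1:40
    Z = Z - (Z.^3 - 1) ./ (3*Z.^2);      % vectorized Newton step
end
r = [1, -0.5 + 0.86603i, -0.5 - 0.86603i];
[~, basin] = min(abs(Z(:) - r), [], 2);  % index of the nearest root
image(reshape(basin, size(Z)))           % three colors = three basins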
3. The relative condition number $\kappa$ of a function $f(x)$ is defined as $\kappa = \frac{|f'(x)\,x|}{|f(x)|}$.
(a) $\kappa(\sin^{-1}(x)) = \frac{x}{\sqrt{1-x^2}\,\sin^{-1}(x)}$, and $\kappa(\sin^{-1}(0.999999)) \approx 450.5635$.
(b) $\kappa(\ln(x)) = \frac{1}{\ln(x)}$, which becomes huge when $x$ is close to 1.
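A quick numeric check of the value in (a) (a sketch):

% condition number of asin at x close to 1
x = 0.999999;
kappa = abs(x / (sqrt(1 - x^2) * asin(x)))   % approx 450.5635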
(c) We can use the quadratic formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ to find the roots of $x^2 - 2x + c$ with $a = 1$, $b = -2$, and $c = c$. Inputting these arguments, the quadratic formula simplifies to $1 \pm \sqrt{1-c}$. Then $f(c) = 1 \pm (1-c)^{1/2}$ and $f'(c) = \frac{\mp(1-c)^{-1/2}}{2}$ (this wouldn't be a function, but I'm expressing both roots as a function of $c$ in one), so $\kappa(f(c)) = \frac{\mp c}{2\sqrt{1-c}\,(1 \pm \sqrt{1-c})}$. If $c = 1 - 10^{-16}$ then the roots are $2 \cdot 10^{-8}$ apart and $\kappa(f(1 - 10^{-16})) = \frac{\mp(1 - 10^{-16})}{2 \cdot 10^{-8}(1 \pm 10^{-8})}$. $\mp(1 - 10^{-16}) \approx \mp 1$ and $1 \pm 10^{-8} \approx 1$, so $\kappa(f(1 - 10^{-16})) \approx \frac{10^8}{2}$.
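The ill-conditioning shows up as soon as the problem touches floating point (a sketch):

% exact roots of x^2 - 2x + c with c = 1 - 1e-16 are 1 +/- 1e-8
r = roots([1, -2, 1 - 1e-16])
% the computed roots land only near 1 +/- 1e-8: rounding c on input (and
% rounding inside roots) perturbs them at roughly the 1e-8 level, the
% amplification kappa ~ 10^8/2 predicts for ~1e-16 relative perturbations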
(d) Suppose we have a backwards-stable algorithm to compute $\sin(x)$, which we'll identify as $\widehat{\sin}(x)$. Then $\forall x$ $\exists\, \epsilon$ with $|\epsilon| \le C\epsilon_{\text{mach}}$ such that $\widehat{\sin}(x) = \sin(x(1+\epsilon))$. Then $\widehat{\sin}(10^{10}) = \sin(10^{10}(1+\epsilon))$ with $|\epsilon| \le 10^{-16}$ (we'll say $C = 1$). Then:
$$\sin(10^{10}(1+\epsilon)) = \sin(10^{10} + 10^{10}\epsilon) = \sin(10^{10})\cos(10^{10}\epsilon) + \cos(10^{10})\sin(10^{10}\epsilon)$$
Substituting $\epsilon = 10^{-16}$ into the above expression, this becomes:
$$\sin(10^{10})\cos(10^{-6}) + \cos(10^{10})\sin(10^{-6})$$
We can say that $\cos(10^{-6}) \approx 1$ since $\cos(x) = 1 - \frac{x^2}{2} + O(x^4)$, but $\sin(10^{-6}) \approx 10^{-6}$ since $\sin(x) \approx x$ for small $x$. We don't know what $\cos(10^{10})$ is, but its magnitude is at most 1. So we have the approximate expression $\sin(10^{10}) \pm 10^{-6}$. From here we have the relative error $\epsilon_{\text{rel}} = \frac{10^{-6}}{\sin(10^{10})}$. Thus we can't expect more than six digits of accuracy from such an algorithm for large $x$.
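The size of this effect is easy to observe directly (a sketch): one ulp of backward perturbation at $x = 10^{10}$ already moves $\sin(x)$ in its seventh digit.

% a relative perturbation of eps in the argument, as backward stability permits
x = 1e10;
sin(x*(1 + eps)) - sin(x)   % on the order of cos(1e10)*1e10*eps, i.e. ~1e-6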
4. This can be shown by expanding $f(x+h)$ and $f(x-h)$ about $x$ to five terms and capping the fifth term with Taylor's theorem:
$$f(x+h) = f(x) + f'(x)h + \frac{f''(x)h^2}{2!} + \frac{f^{(3)}(x)h^3}{3!} + \frac{f^{(4)}(q)h^4}{4!} \tag{4}$$
for some $q \in [x, x+h]$, and
$$f(x-h) = f(x) - f'(x)h + \frac{f''(x)h^2}{2!} - \frac{f^{(3)}(x)h^3}{3!} + \frac{f^{(4)}(p)h^4}{4!} \tag{5}$$
for some $p \in [x-h, x]$. Adding (4) and (5) we get:
$$f(x+h) + f(x-h) = 2f(x) + f''(x)h^2 + \frac{(f^{(4)}(q) + f^{(4)}(p))h^4}{4!} \tag{6}$$
Substituting this into $\frac{f(x+h) - 2f(x) + f(x-h)}{h^2}$ and simplifying:
$$\frac{f(x+h) - 2f(x) + f(x-h)}{h^2} = \frac{1}{h^2}\left(f''(x)h^2 + \frac{(f^{(4)}(q) + f^{(4)}(p))h^4}{4!}\right) = f''(x) + \frac{f^{(4)}(q) + f^{(4)}(p)}{4!}h^2$$
But $\frac{f^{(4)}(q) + f^{(4)}(p)}{4!}$ stays bounded by some constant $C$ as $h \to 0$, since $p$ and $q$ both become close to $x$ (assuming $f^{(4)}$ is continuous near $x$), so the second term is at most $Ch^2$. We conclude that the error for $\frac{f(x+h) - 2f(x) + f(x-h)}{h^2}$ is $O(h^2)$ as $h \to 0$. A log-log graph of the relative error vs $h$ shows something interesting:
[Figure: log-log plot of the relative error $\epsilon$ against $h$, with $h$ ranging from $10^{-16}$ to $10^0$ and $\epsilon$ from about $10^{-12}$ to $10^0$.]
The slope from $h = 10^{-16}$ to $h = 10^{-8}$ is approximately $-0.5407$. This is the region where the rounding error dominates the total error. The slope from $h = 10^{-4}$ to $h = 10^0$ is approximately $\frac{\ln(1) - \ln(10^{-8})}{\ln(1) - \ln(10^{-4})} = 2$. This is where the Taylor series error dominates the total error. From this we observe that, in the world of machine calculations, smaller $h$ does not necessarily mean smaller error. In order to obtain optimal precision, $h$ must be chosen so that the Taylor error is as small as possible without making the rounding error huge. If there were no machine error, the line with slope 2 would continue all the way.
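A sketch of a script that reproduces this experiment (the choice $f(x) = \sin(x)$ at $x = 1$ is arbitrary, since the write-up doesn't fix a test function):

% relative error of the central second difference as h varies
f = @(x) sin(x);
x = 1;
exact = -sin(x);                                  % f''(x) for f = sin
h = logspace(-16, 0, 200);
approx = (f(x + h) - 2*f(x) + f(x - h)) ./ h.^2;
relerr = abs((approx - exact) / exact);
loglog(h, relerr), xlabel('h'), ylabel('\epsilon')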
5. (a) The code should set x to zero, add 0.1 to x ten times, and then stop when x is one. The code actually runs forever because x is never exactly equal to 1: each addition is floating-point addition, which introduces a very small error, so the "1" computed by the loop is not actually 1, just a number very close to 1. To remedy this, we should only run the loop while x is less than some number slightly under 1:

x = 0;
while x <= .99
    x = x + 0.1
end
We also see that the 1 computed with the loop isn’t exactly equal to 1.
>> x - 1
ans =
-1.11022302462516e-16
(b) Assuming that we can use $1922^{12}$ as an approximation for $1782^{12} + 1841^{12}$, we have a relative error of $\frac{|1782^{12} + 1841^{12} - 1922^{12}|}{1782^{12} + 1841^{12}} \approx 2.75542 \cdot 10^{-10}$. In order to express such a large number exactly we would need $\log_2(1922^{12}) \approx 131$ binary digits.
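This can be verified exactly (a sketch, assuming the Symbolic Math Toolbox for exact integer arithmetic, since doubles carry only 53 mantissa bits):

% exact integers avoid rounding the ~131-bit quantities
lhs = sym(1782)^12 + sym(1841)^12;
rhs = sym(1922)^12;
relerr = double(abs(lhs - rhs)/lhs)   % approx 2.75542e-10
bits = ceil(12*log2(1922))            % 131 bits needed to store 1922^12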
(c) For the matrix multiplication $AB$, each entry $(AB)_{ij}$ is the dot product of row $i$ of $A$ and column $j$ of $B$. For such a dot product there are $n$ multiplications and $n-1$ additions, for a total of $2n-1$ flops per entry of $AB$. There are $n^2$ entries in $AB$, so we have a total of $2n^3 - n^2 = O(n^3)$ flops. Given a column vector $x$ of length $n$, the computation $ABx$ can be carried out two ways thanks to the associativity of matrix multiplication: $(AB)x$ or $A(Bx)$. The former requires $2n^3 - n^2$ operations for the initial matrix-matrix multiplication and then $2n^2 - n$ for the matrix-vector multiplication, for a total of $2n^3 + n^2 - n = O(n^3)$ flops. The latter requires $2n^2 - n$ for the initial matrix-vector product and another $2n^2 - n$ for the second, for a total of $4n^2 - 2n = O(n^2)$ flops. Thus $A(Bx)$ is preferable. Fewer flops also means less introduced rounding error.
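The flop-count difference is easy to observe in practice (a sketch; exact timings are machine-dependent):

% compare the two parenthesizations of A*B*x
n = 2000;
A = randn(n); B = randn(n); x = randn(n, 1);
tic; y1 = (A*B)*x; toc    % O(n^3) work
tic; y2 = A*(B*x); toc    % O(n^2) work, typically orders of magnitude faster
norm(y1 - y2)/norm(y2)    % both orderings agree to rounding error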