MATH 4250/CS 4210 Homework 1 Solutions

M. Hin & I. Papst
Fall 2016
Problem 2
(a) Since
\[
\frac{\pi}{4} = \arctan(1) = \sum_{j=0}^{\infty} \frac{(-1)^j (1)^{2j+1}}{2j+1} = \sum_{j=0}^{\infty} \frac{(-1)^j}{2j+1}, \tag{1}
\]
we can write
\[
\pi = 4 \cdot \sum_{j=0}^{\infty} \frac{(-1)^j}{2j+1}. \tag{2}
\]
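(b) Similarly,
\[
\frac{\pi}{6} = \arctan\!\left(\frac{1}{\sqrt{3}}\right) = \sum_{j=0}^{\infty} \frac{(-1)^j (1/\sqrt{3})^{2j+1}}{2j+1}, \tag{3}
\]
so we can write
\[
\pi = 6 \cdot \sum_{j=0}^{\infty} \frac{(-1)^j (1/\sqrt{3})^{2j+1}}{2j+1}. \tag{4}
\]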
(c) The following MATLAB program approximates π using parts a and b:
%% Problem 2(c)
%% Approximate pi using the series for arctan(1)
n = [11, 101, 1001, 10001, 100001];
pi_approx1 = zeros(size(n));
err1 = zeros(size(n));
for i = 1:length(n)
    pi4_approx = 0;
    for j = 0:n(i)
        pi4_approx = pi4_approx + (-1)^j / (2*j + 1);
    end
    pi_approx1(i) = 4 * pi4_approx;
    err1(i) = abs(pi_approx1(i) - pi);
end
disp(pi_approx1)
disp(err1)
%% Approximate pi using the series for arctan(1/sqrt(3))
pi_approx2 = zeros(size(n));
err2 = zeros(size(n));
for i = 1:length(n)
    pi6_approx = 0;
    for j = 0:n(i)
        pi6_approx = pi6_approx + (-1)^j / (sqrt(3)^(2*j + 1) * (2*j + 1));
    end
    pi_approx2(i) = 6 * pi6_approx;
    err2(i) = abs(pi_approx2(i) - pi);
end
disp(pi_approx2)
disp(err2)
%% Plot absolute error
loglog(n, err1, 'LineWidth', 3)
hold on
loglog(n, err2, 'r--', 'LineWidth', 3)
hold off
xlabel('n')
ylabel('Absolute Error')
title('Absolute error of both pi approximations')
legend('arctan(1)', 'arctan(1/sqrt(3))')
(d) The remainder term in a Taylor expansion truncated at n terms is proportional to
|x − a|^{n+1}, where a is the point about which the series is expanded. In this case,
a = 0, so the larger the distance between x and 0, the worse the Taylor approximation.
Therefore, x = 1/√3 gives a better approximation of π here than x = 1. Indeed, the
terms of the series at x = 1 decay only like 1/(2j + 1), while those at x = 1/√3 carry
an additional factor of 3^{-j}.
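To make the comparison in part (d) concrete, here is a minimal sketch that counts how many terms each series needs before its first omitted term drops below a tolerance. It relies on the alternating series bound (the truncation error is at most the first omitted term); the tolerance `tol` and the counters `j1` and `j2` are hypothetical names chosen for this illustration, not part of the original solution.

```matlab
% Hypothetical tolerance for illustration only.
tol = 1e-6;

% Series for pi = 4*arctan(1): the j-th term has magnitude 4/(2j+1).
j1 = 0;
while 4 / (2*j1 + 1) >= tol
    j1 = j1 + 1;
end

% Series for pi = 6*arctan(1/sqrt(3)): the j-th term has magnitude
% 6 / (sqrt(3)^(2j+1) * (2j+1)).
j2 = 0;
while 6 / (sqrt(3)^(2*j2 + 1) * (2*j2 + 1)) >= tol
    j2 = j2 + 1;
end

fprintf('arctan(1):         first term below tol at j = %d\n', j1);
fprintf('arctan(1/sqrt(3)): first term below tol at j = %d\n', j2);
```

For this tolerance the first series needs on the order of millions of terms, while the second needs about a dozen.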
(e) The following MATLAB program computes the AGM and plots its value after each
iteration:
%% Problem 2(e)
% Set constants
a = 1;
b = 1/sqrt(2);
t = 1/4;
j = 1;
tol = 1e-13;  % stopping tolerance (named tol to avoid shadowing the built-in eps)
% Initialize count and vectors to store results
i = 0;
AGM = [];
err = [];
% Calculate AGM values and error
while (a - b) >= tol
    % increment count
    i = i + 1;
    y = a;
    a = (a + b)/2;
    b = sqrt(b*y);
    t = t - j*(a - y)^2;
    j = 2*j;
    AGM(i) = a^2/t;
    err(i) = abs(AGM(i) - pi);
end
% Display results
disp(AGM)
disp(err)
% Plot
semilogy(1:length(err), err, 'LineWidth', 2)
title('Absolute Error for AGM Approximation of \pi')
xlabel('Iteration Number')
ylabel('Absolute Error')
Since the error curve is eventually linear on a semilogy plot (see footnote 1), we can
conclude that the convergence is exponential.
Problem 3
First, note that loss of significance occurs when 4ac is small relative to b^2, so that
√(b^2 − 4ac) ≈ |b| and the numerator subtracts two nearly equal quantities (a cancellation
error). There are two cases to consider:
(a) If b > 0, a cancellation error occurs in the root
\[
x_+ = \frac{-b + \sqrt{b^2 - 4ac}}{2a}. \tag{5}
\]
To avoid the cancellation error, multiply by the conjugate:
\begin{align}
x_+ &= \frac{-b + \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b - \sqrt{b^2 - 4ac}}{-b - \sqrt{b^2 - 4ac}} \tag{6} \\
    &= \frac{b^2 - (b^2 - 4ac)}{2a\left(-b - \sqrt{b^2 - 4ac}\right)} \tag{7} \\
    &= \frac{2c}{-b - \sqrt{b^2 - 4ac}}. \tag{8}
\end{align}
Since b > 0, there is no subtraction of two approximately equal quantities in the
denominator, and so there is no cancellation error in this expression.

[1] (Footnote to Problem 2(e).) Note that errors for low iteration numbers do not give
reliable information about convergence: the algorithm has to begin to converge to the
true value before we can infer anything about the convergence rate, and it won't do so
immediately.
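(b) If b < 0, a cancellation error occurs in the root
\[
x_- = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \tag{9}
\]
Again, to avoid the error, we multiply by the conjugate:
\begin{align}
x_- &= \frac{-b - \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b + \sqrt{b^2 - 4ac}}{-b + \sqrt{b^2 - 4ac}} \tag{10} \\
    &= \frac{b^2 - (b^2 - 4ac)}{2a\left(-b + \sqrt{b^2 - 4ac}\right)} \tag{11} \\
    &= \frac{2c}{-b + \sqrt{b^2 - 4ac}}. \tag{12}
\end{align}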
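The effect of the rearrangement can be seen in a minimal sketch for the case b > 0. The coefficients a = 1, b = 10^8, c = 1 are hypothetical values chosen here only so that 4ac is tiny relative to b^2; they do not come from the problem statement. The reference value `x_ref` uses the fact that the product of the roots is c/a and the large root is close to −b/a, so the small root should be close to −c/b.

```matlab
% Hypothetical coefficients with 4ac << b^2 and b > 0.
a = 1; b = 1e8; c = 1;

d = sqrt(b^2 - 4*a*c);

% Standard formula: -b + d subtracts two nearly equal quantities.
x_naive = (-b + d) / (2*a);

% Conjugate form: the denominator adds two quantities of the same sign.
x_stable = (2*c) / (-b - d);

% Approximate reference for the small root (assumption: x_+ is close to -c/b).
x_ref = -c/b;

fprintf('naive:  %.16e\n', x_naive);
fprintf('stable: %.16e\n', x_stable);
fprintf('-c/b:   %.16e\n', x_ref);
```

For b < 0 the roles swap: the same idea applies to x_− with the conjugate −b + √(b^2 − 4ac) in the denominator.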
Problem 4
(a) The following code generates the results tabulated in Table 1:
%% Problem 4, Part A
xvec = [ 0.5 ; -0.5 ; 30*pi ; -30*pi ];
n1 = 10;
n2 = 40;
sum_vec1 = ones(size(xvec));
sum_vec2 = ones(size(xvec));
for i = 1:n1
sum_vec1 = sum_vec1 + xvec.^i / factorial(i);
end
rel_err1 = abs((sum_vec1 - exp(xvec))./(exp(xvec)));
for i = 1:n2
sum_vec2 = sum_vec2 + xvec.^i / factorial(i);
end
rel_err2 = abs((sum_vec2 - exp(xvec))./(exp(xvec)));
disp(sum_vec1)
disp(rel_err1)
disp(sum_vec2)
disp(rel_err2)
Note that low-order Taylor polynomials of f(x) = e^x are only (relatively) accurate
near x = 0. The further x is from the origin, the more terms are needed in the Taylor
series to accurately approximate the exponential. In addition, negative values of x
also require more terms due to the alternating sign of the terms involved in the
Taylor series, which leads to cancellation between large terms.

              x = 0.5     x = −0.5    x = 30π     x = −30π
n = 10
  Approx      1.65e+00    6.07e-01    1.70e+13    1.38e+13
  Rel Err     7.74e-12    1.94e-11    1.00e+00    1.17e+54
n = 40
  Approx      1.65e+00    6.07e-01    1.97e+31    8.03e+30
  Rel Err     2.69e-16    1.83e-16    1.00e+00    6.85e+71

Table 1: nth-order Taylor polynomial approximations and relative errors of f(x) = e^x at
various x and n.
(b) Since the majority of our relative error arises in using the 10th order Taylor
polynomial to approximate e^x, we require a scheme to reduce the error induced by this
approximation. From part (a) we observed that the polynomial approximation is most
accurate for values of x close to zero. Hence we are motivated to perform our
most error-prone step only at small values of x. We can design an algorithm so this is
the case by relying on the properties that
\[
e^{a+b} = e^a \cdot e^b \quad \text{and} \quad e^{aN} = (e^a)^N.
\]
Thus, by choosing a small base unit ε, we can write x in terms of the divisor ε,
remainder 0 ≤ δ < ε, and integer N as
\[
x = N\epsilon + \delta.
\]
Hence, we can calculate the exponential as
\[
e^x = (e^{\epsilon})^N \cdot e^{\delta}.
\]
Therefore we need to select ε small enough that the relative error of the 10th order
Taylor approximation of e^ε is below our desired tolerance. By the Taylor Remainder
theorem, the 10th order Taylor approximation has a (relative) remainder of
\[
\frac{|R_{10}(\epsilon)|}{f(\epsilon)} = \frac{e^{c}\,|\epsilon|^{11}}{e^{\epsilon}\, 11!},
\]
for some 0 ≤ c ≤ ε. We can bound e^c ≤ e^ε, so if we prescribe a relative tolerance
of 10^{-13} for the Taylor approximation, we need to select an ε that satisfies
\[
\frac{\epsilon^{11}}{11!} < 10^{-13},
\]
or
\[
\epsilon < \left(\frac{11!}{10^{13}}\right)^{1/11} \approx 0.323.
\]
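The bound on ε can be checked numerically. The snippet below is a small sketch that evaluates the expression above; the variable names `eps_max` and `s_min` are introduced here for illustration, and `s_min` is simply the smallest integer s with 1/s below the bound, which accounts for truncation error only.

```matlab
n = 10;        % Taylor polynomial order
tol = 1e-13;   % prescribed relative tolerance

% Largest epsilon allowed by epsilon^(n+1)/(n+1)! < tol.
eps_max = (factorial(n + 1) * tol)^(1 / (n + 1));

% Smallest integer s such that epsilon = 1/s satisfies the bound.
s_min = ceil(1 / eps_max);

fprintf('eps_max = %.4f, s_min = %d\n', eps_max, s_min);
```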
Of course, we are ignoring the numerical error introduced by raising our
approximation of e^ε to the integer power N, so we will need to search for the ε that
yields the desired relative error for the overall calculation. One way to search is to
define ε = 1/s and look for a suitable integer value of s. Below is the code
used to generate approximations of exponentials at x = ±30π, selecting s = 17.
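%% Problem 4, Part B
x = 30*pi;
% x = -30*pi;
n = 10;                 % Taylor polynomial order
s = 17;
sinv = 1/s;             % base unit epsilon = 1/s
N2 = floor(x/sinv);     % integer N with x = N*epsilon + delta
xdec = x - N2*sinv;     % remainder delta, 0 <= delta < epsilon
expx2 = 1;              % Taylor approximation of exp(epsilon)
expxdec2 = 1;           % Taylor approximation of exp(delta)
for i = 1:n
    expx2 = expx2 + (sinv).^i / factorial(i);
    expxdec2 = expxdec2 + (xdec).^i / factorial(i);
end
expx2 = expx2^N2 * expxdec2;    % exp(x) = exp(epsilon)^N * exp(delta)
rerrx2 = abs(expx2 / exp(x) - 1);
disp(rerrx2)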
We find a relative error of 4.22 × 10^{-14} when evaluating at x = 30π and
3.85 × 10^{-14} when evaluating at x = −30π.
(c) The algorithm above can be leveraged to approximate cos x and sin x for values of x
far from zero. This can be done with Euler's formula:
\[
e^{ix} = \cos x + i \sin x.
\]
There are several ways to proceed. One way is to have MATLAB work in complex
numbers, so that we can apply our algorithm directly to the now complex summation
and recover the cosine or sine by taking the real or imaginary part, respectively.
Alternatively, we can restrict ourselves to real-valued calculations and separate
the real and imaginary terms of the Taylor approximation into individual Taylor
series, to which a tolerance-bounding procedure similar to that of part (b) can be applied.
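As a rough illustration of the first (complex-valued) option, the sketch below applies the part (b) decomposition to e^{ix}, taking the base unit ε = 1/s to be real. The choice s = 17 is carried over from part (b) as an assumption, the names `h`, `delta`, `eh`, `ed`, and `z` are hypothetical, and the sketch assumes x ≥ 0 (for negative x one could use cos(−x) = cos x and sin(−x) = −sin x).

```matlab
x = 30*pi;
n = 10;              % Taylor polynomial order, as in part (b)
s = 17;              % assumed base-unit choice, carried over from part (b)
h = 1/s;             % base unit epsilon
N = floor(x/h);      % x = N*h + delta
delta = x - N*h;     % remainder, 0 <= delta < h

% Taylor polynomials of exp(1i*h) and exp(1i*delta).
eh = 1;
ed = 1;
for k = 1:n
    eh = eh + (1i*h)^k / factorial(k);
    ed = ed + (1i*delta)^k / factorial(k);
end

% exp(1i*x) = exp(1i*h)^N * exp(1i*delta), built by repeated multiplication.
z = ed;
for k = 1:N
    z = z * eh;
end

cos_approx = real(z);
sin_approx = imag(z);
fprintf('cos error: %.3e\n', abs(cos_approx - cos(x)));
fprintf('sin error: %.3e\n', abs(sin_approx - sin(x)));
```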
Problem 5
(a) Let
\[
f(x) = 1 - \tan x \quad \text{and} \quad g(x) = \frac{\cos 2x}{\cos^2 x\,(1 + \tan x)}.
\]
Consider the difference, which we will show to be equal to zero:
\begin{align*}
f(x) - g(x) &= [1 - \tan x] - \frac{\cos 2x}{\cos^2 x\,(1 + \tan x)} \\
            &= [1 - \tan x] - \frac{\cos^2 x - \sin^2 x}{\cos^2 x\,(1 + \tan x)} \\
            &= [1 - \tan x] - \frac{1 - \tan^2 x}{1 + \tan x} \\
            &= [1 - \tan x] - \frac{(1 - \tan x)(1 + \tan x)}{1 + \tan x} \\
            &= [1 - \tan x] - [1 - \tan x] \\
            &= 0.
\end{align*}
Note that in the second equality we used the trigonometric identity cos 2x =
cos^2 x − sin^2 x, while in the fourth equality we factored the reducible quadratic in the
numerator. Hence f(x) = g(x).
(b) Observe that tan(π/4) = tan(5π/4) = 1. By the continuity of the tangent function,
tan x takes on values close to 1 for x near π/4 and 5π/4. Evaluating f there is
numerically problematic, since significant digits can be lost when computing the
difference of two quantities of nearly equal magnitude. As such, the function g
should be used.
(c) Observe that tan(3π/4) = tan(7π/4) = −1. By the continuity of the tangent function,
tan x takes on values close to −1 for x near 3π/4 and 7π/4. Evaluating g there is
numerically problematic, since significant digits can be lost in the factor 1 + tan x
when computing the sum of two quantities of nearly equal magnitude but opposite sign.
As such, the function f should be used.
Problem 6
Below we list the appropriate MATLAB commands for the prescribed operations. We also
mention any operations that are mathematically impossible and explain why.
(a) 2A + C^T = 2*A + C'
(b) C − 3B cannot be done mathematically due to dimension mismatch.
(c) 3B − 2D = 3*B - 2*D
(d) AD = A*D
(e) CA = C*A
(f) AC = A*C
(g) BD = B*D
(h) DB = D*B
(i) BC = B*C
(j) CB cannot be done mathematically due to dimension mismatch.
(k) AB = A*B
(l) 2D^T + B = 2*D' + B
(m) det(D) = det(D)
(n) det(A) cannot be done mathematically since A is not a square matrix.
(o) C^T D = C'*D
(p) BA^T = B*(A')
(q) −2A^T + 5C = -2*A' + 5*C
(r) B^T + D = B' + D
(s) (1/2)(B + B^T) = (B + B')/2
(t) (1/2)(B − B^T) = (B - B')/2
(u) AA^T = A*A'
(v) A^T A = A'*A
(w) det(AA^T) = det(A*A')
(x) det(A^T A) = det(A'*A)
(y) B(AD)^T = B*(A*D)'
(z) ADB^T = A*D*B'
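The dimension arguments above can be checked in MATLAB with a small sketch. The matrix sizes are not stated in these solutions, so the sizes below are hypothetical, chosen only to be consistent with which operations are listed as defined: A is m-by-n, C is n-by-m, and B and D are n-by-n, with m ≠ n. Random entries are used as placeholders.

```matlab
% Hypothetical sizes (assumption): A is 2x3, C is 3x2, B and D are 3x3.
m = 2; n = 3;
A = rand(m, n); B = rand(n); C = rand(n, m); D = rand(n);

% Defined operations return results of the expected size.
disp(size(2*A + C'))    % m-by-n
disp(size(B*(A*D)'))    % n-by-m

% Operations with mismatched dimensions raise an error.
try
    C*B;
    cb_defined = true;
catch
    cb_defined = false;
end
try
    C - 3*B;
    cmb_defined = true;
catch
    cmb_defined = false;
end
fprintf('C*B defined: %d, C - 3*B defined: %d\n', cb_defined, cmb_defined);
```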
Problem 7
(a) Equation (2) is numerically unacceptable when N = 10^8 since arctan(x) ≈ π/2 when x
is large; the arctan function approaches a horizontal asymptote of y = π/2 as x → ∞,
which means that it changes very slowly for large x (in fact, it is very flat).
Therefore, arctan(N + 1) − arctan(N) suffers from cancellation error when N is large
enough, since arctan(N + 1) ≈ arctan(N), so arctan(N + 1) − arctan(N) ≈ 0.
(b) We use the identity
\[
\arctan(x) - \arctan(y) = \arctan\!\left(\frac{x - y}{1 + xy}\right), \tag{13}
\]
which yields
\[
\arctan(N + 1) - \arctan(N) = \arctan\!\left(\frac{1}{1 + N + N^2}\right). \tag{14}
\]
Formula (14) avoids cancellation error since it no longer subtracts two nearly equal
quantities; moreover, the argument of arctan is now very close to zero (since N is large),
and arctan(x) can be computed precisely for x ≈ 0.
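A short sketch comparing the two expressions at N = 10^8 is given below; the variable names `direct` and `stable` are chosen here for illustration. Since the argument of arctan in (14) is tiny, the stable value should be very close to 1/(1 + N + N^2) itself.

```matlab
N = 1e8;

% Direct difference: both terms are within about 1e-8 of pi/2.
direct = atan(N + 1) - atan(N);

% Rewritten form (14): a single arctan of a very small argument.
stable = atan(1 / (1 + N + N^2));

fprintf('direct: %.6e\n', direct);
fprintf('stable: %.6e\n', stable);
```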