
Classroom Notes for Approximation Theory
Thomas Shores1
Contents
1. Operator Algebra for Approximation Theory
2. Pade Approximations
3. Trigonometric Polynomials
4. Complex Trigonometric Polynomials
5. Discrete Fourier Transforms (DFT)
6. The Fast Fourier Transform (FFT)
Description
Complexity
Implementation
7. All About Cubic Splines
General Equations for All Cubic Splines
Standard Form for the Defining Cubics
Extra Conditions to Uniquely Define the Interpolating Cubic Spline
8. A Review and Tour of Linear Algebra
Solving Systems
Matrix Arithmetic
The Basics
Inner Products
Eigenvectors and Eigenvalues
Condition Number of a Matrix
Singular Value Decomposition
9. A Tour of Matlab
What is Matlab?
A Tutorial Introduction to Matlab
References
1. Operator Algebra for Approximation Theory
Unless otherwise stated, we assume that A is a subset of the normed vector space B with
norm ∥·∥. Consult Section 8 for more linear algebra details if needed.
1 These are some classroom notes from a course in approximation theory (Math 441) that I taught in 2009 with a few corrections and updates that relate to Adam's NA seminar, along with a review of basic linear algebra and a Matlab tutorial (which applies to Octave except for a few Matlab-specific commands, mainly graphics). Last Rev.: 2/16/16.
Definition. The norm of a (not necessarily linear) operator X : B → A is the smallest number
∥X∥ such that for all f ∈ B,
∥X (f )∥ ≤ ∥X∥ · ∥f ∥ ,
if such an ∥X∥ exists. In this case the operator is said to be bounded and otherwise, unbounded.
Notice that in the case of a linear operator X, i.e., X (af + bg) = aX (f )+bX (g) for scalars
a, b and f, g ∈ B, this is equivalent to the definition that we gave in the background notes (see
page 25).
Definition. An operator X : B → A is said to be an approximation (projection) operator if
for all f ∈ B, we have
X (X (f )) = X (f ) .
In the case of an approximation operator X, the quantity ∥X∥ is often called the Lebesgue
constant of the operator.
A sufficient condition for an operator to be an approximation operator is the following
(stronger) condition: for all a ∈ A,
(3.2)   X(a) = a.
Most of the operators we deal with will have the property (3.2).
Next, we give a name to the distance from f to the subset A.
Definition. Given f ∈ B and approximant subset A, we define

d∗(f) = min_{a∈A} ∥f − a∥.
Here is a situation in which we can actually estimate d∗ in a useful way:
Theorem. (text 3.2) If c = π/2, B = C^k[a, b] and A = P_n with 1 ≤ k ≤ n, then for any f ∈ B,

d∗(f) ≤ ((n − k)!/n!) c^k ∥f^(k)∥_∞.
Here is a connection between two of the definitions just discussed:
Theorem. (text 3.1) Let A be a finite dimensional subspace of B and X : B → A a linear
operator satisfying (3.2). Then
∥f − X (f )∥ ≤ {1 + ∥X∥} d∗ (f ) .
OK, now let’s do a bit of experimenting with a simple linear approximation operator:
Example. Let B = C^1[0, 1], A = P_1, and X : B → A the interpolation operator that interpolates f(x) at x = x_0, x_1, with 0 ≤ x_0 < x_1 ≤ 1. Let's experiment with a reasonably interesting function first and see how Theorems 3.1 and 3.2 fare. Recall that we can find an interpolating polynomial p(x) = ax + b through (x_0, y_0) and (x_1, y_1) by way of the Vandermonde matrix system
ax0 + b = y0
ax1 + b = y1 .
Furthermore, this is a linear operator (we checked this in class), so that text Theorem 3.1
applies. The norm of X we calculated in class to be 1.
% start with f(x) = exp(sin(3*x))
fcn = inline('exp(sin(3*x))','x')
fcnp = inline('exp(sin(3*x)).*cos(3*x)*3','x')
xnodes = (0:.01:1)';
plot(xnodes,fcn(xnodes))
hold on, grid
% start with the endpoints for interpolation nodes
p = [0,1;1,1]\fcn([0;1])
plot(xnodes,polyval(p,xnodes))
dfXf = norm(polyval(p,xnodes)-fcn(xnodes),inf)
% not so hot, so let's look for a "best approximation"
nn = 20
astar = [1,nn];
dstar = dfXf;
xgrid = linspace(0,1,nn)';
for ii = 1:nn-1
  for jj = ii+1:nn
    p = [xgrid(ii),1;xgrid(jj),1]\fcn([xgrid(ii);xgrid(jj)]);
    d = norm(polyval(p,xnodes)-fcn(xnodes),inf);
    if (d < dstar)
      dstar = d;
      astar = [ii,jj];
    end
  end
end
pstar = [xgrid(astar(1)),1;xgrid(astar(2)),1]\fcn([xgrid(astar(1));xgrid(astar(2))]);
plot(xnodes,polyval(pstar,xnodes))
dstar
% compare results to Theorem 3.1 prediction
dfXf, (1+1)*dstar
% compare results to Theorem 3.2 prediction
c = pi/2
n = 1
k = 1
dstar, factorial(n-k)/factorial(n)*c*norm(fcnp(xnodes),inf)
Finally, here are some calculations regarding complexity of linear system solving that is
alluded to in BackgroundNotes.pdf.
n = 80; a = n*eye(n)+rand(n); b = rand(n,1);
tic; c = a\b; toc
% now repeat with n = 160, 320 and 640...how does time grow
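To see how the time grows, one might wrap the experiment in a loop (a minimal sketch of my own; the doubling sequence of sizes is just an illustration):
% time backslash solves for doubling sizes
for n = [80 160 320 640]
  a = n*eye(n) + rand(n); b = rand(n,1);
  tic; c = a\b; t = toc;
  fprintf('n = %4d   elapsed time = %g seconds\n', n, t)
end
% for an O(n^3) method, doubling n should roughly multiply the time by 8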
Let's do some more calculations:
x = (-5:0.1:5)';
fcn = inline('1./(1+x.^2)','x')
plot(x,fcn(x))
grid, hold on
n = 5
nodes = linspace(-5,5,n)';
fnodes = fcn(nodes);
p = vander(nodes)\fnodes; grid,
pnodes = polyval(p,x);
plot(x,pnodes); % what do you think?
norm(pnodes-fcn(x),inf)
% now start over and do it with n = 10, 20
% next, look for the culprit -- close the plot window
n = 5
nodes = linspace(-5,5,n)';
prd = ones(size(x));
for ii = 1:n
  prd = prd.*(x - nodes(ii));
end
plot(x,prd)
hold on, grid
% now repeat the loop with n = 10, 20
2. Pade Approximations
This is a rational function approximation whose basic idea is this: Let the approximating rational function to f ∈ C^{N+1}[a, b] be a rational function

r(x) = p(x)/q(x) = (p_0 + p_1 x + · · · + p_n x^n)/(1 + q_1 x + · · · + q_m x^m),

where 0 ∈ (a, b), N = m + n is the "total degree" of r(x), and require that

r^(j)(0) = f^(j)(0),   j = 0, 1, . . . , N,

so that r(x) is something like a Taylor series approximation. This gives us N + 1 conditions on the N + 1 coefficients p_0, . . . , p_n, q_1, . . . , q_m, so in principle we should be able to solve for all the coefficients (but there are no guarantees). Of course, we don't want to be in the business
of computing all those derivatives, so here's an alternate approach: Let g(x) = f(x) − r(x), so that the derivative condition is simply that all derivatives of g up to the N-th order vanish at x = 0. We have that

g(x) = (q(x) f(x) − p(x))/q(x)

also has N + 1 continuous derivatives defined at x = 0, so it has a Taylor series expansion

g(x) = g(0) + (g′(0)/1!) x + · · · + (g^(N)(0)/N!) x^N + (g^(N+1)(ξ)/(N + 1)!) x^{N+1}

for some ξ between 0 and x. So the derivative conditions are equivalent to requiring that x^{N+1} be a factor of g(x). However, the denominator has constant term 1, so x cannot be factored from it. Therefore, we must have that x^{N+1} is a factor of q(x) f(x) − p(x).
Thus our strategy is to write out an expansion for f(x), say

f(x) = a_0 + a_1 x + · · · ,

multiply terms and require that the first N + 1 coefficients of

(1 + q_1 x + · · · + q_m x^m)(a_0 + a_1 x + · · · ) − (p_0 + p_1 x + · · · + p_n x^n)

be zero. This won't require complicated derivatives, and can easily be done, once we have an expansion for f. In fact, the j-th term can be written out explicitly, so the conditions are

q_0 = 1,   0 = ∑_{k=0}^{j} q_k a_{j−k} − p_j,   j = 0, 1, . . . , N,

where we understand that q_k = 0 for k > m and p_j = 0 for j > n.
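As an illustration, here is a minimal Matlab sketch (my own, not from the course files) that computes the Pade coefficients from a given vector of Taylor coefficients a_0, . . . , a_N; the function name padecoef and its interface are choices made just for this example.
function [p,q] = padecoef(a,n,m)
% usage: [p,q] = padecoef(a,n,m)
% description: given Taylor coefficients a = [a0,a1,...,aN] with
% N = n+m, return coefficients p = [p0,...,pn], q = [1,q1,...,qm]
% of the Pade approximant p(x)/q(x), obtained by forcing the first
% N+1 coefficients of q(x)*f(x) - p(x) to vanish.
a = a(:); N = n + m;
% equations j = n+1..N involve only q1..qm:
% q0*a_j + sum_{k=1}^{m} q_k a_{j-k} = 0
A = zeros(m); rhs = zeros(m,1);
for r = 1:m
  j = n + r;
  for k = 1:m
    if j-k >= 0, A(r,k) = a(j-k+1); end
  end
  rhs(r) = -a(j+1);      % the q0*a_j term moved to the right side
end
q = [1; A\rhs];
% equations j = 0..n then give p_j = sum_{k=0}^{min(j,m)} q_k a_{j-k}
p = zeros(n+1,1);
for j = 0:n
  for k = 0:min(j,m)
    p(j+1) = p(j+1) + q(k+1)*a(j-k+1);
  end
end
For example, [p,q] = padecoef(1./factorial(0:4), 2, 2) should reproduce the familiar [2/2] Pade approximant of e^x, namely (1 + x/2 + x^2/12)/(1 − x/2 + x^2/12).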
3. Trigonometric Polynomials
Real-Valued Functions. We covered the discrete Fourier series operator in class. Now let's put it to work. A review: With the standard inner product in C[−π, π], we saw that the projection of f(x) ∈ C[−π, π] into the finite dimensional subspace

W_n = span{cos kx, sin kx | 0 ≤ k ≤ n}

is the trig polynomial

q(x) = a_0/2 + ∑_{k=1}^{n} (a_k cos kx + b_k sin kx),

where, for 0 ≤ k ≤ n,

a_k = (1/π) ∫_{−π}^{π} f(x) cos kx dx

and, for 1 ≤ k ≤ n,

b_k = (1/π) ∫_{−π}^{π} f(x) sin kx dx.

If we use the trapezoidal rule with N + 1 equally spaced nodes x_j = (2π/N) j on the interval [0, 2π] (remember that the location of the interval makes no difference – only that it should be of length 2π) to approximate the Fourier coefficients a_k, b_k with a′_k, b′_k, then the first and last values agree by periodicity and we obtain
a′_k = (2/N) ∑_{j=0}^{N−1} f(x_j) cos(2πjk/N)

b′_k = (2/N) ∑_{j=0}^{N−1} f(x_j) sin(2πjk/N)
Wait a minute, you say! That doesn't look like the trapezoidal rule that I know – the endpoints should have one half the coefficient of interior points. However, bear in mind that assuming periodicity, we have f(x_0) = f(x_N), so we combine the endpoints. As an aside, you might recall that the trapezoidal method applied to smooth functions has error O(h²), where h is the stepsize between the equally spaced nodes. As a matter of fact, the error of this method as applied to analytic periodic functions is spectacularly better: O(e^{−a/h}), for a suitable positive constant a. This is exponential decay of error. For details see the informative paper [3] by L. Trefethen and J. Weideman.
First, let’s create a function that actually evaluates a trig polynomial at vector argument
x. We understand that such a trig polynomial will be specified by the vectors of coefficients
a = (a1 , . . . , an , a0 )
b = (b1 , b2 , . . . , bn ) .
OK, here it is:
function retval = trigeval(a,b,x)
% usage: y = trigeval(a,b,x)
% description: given vectors of trig coefficients a, b
% of length n+1 and n, respectively, and argument vector
% x, this function returns the value of the trig series
% p(x) = a(n+1)/2 + sum(a(j)*cos(j*x)+b(j)*sin(j*x),j=1..n).
%
n = length(b);
if (length(a) ~= n+1)
  error('incorrect vector lengths');
end
retval = a(n+1)/2*ones(size(x));
for j = 1:n
  retval = retval + a(j)*cos(j*x)+b(j)*sin(j*x);
end
Now create a script that does the work of building a discrete Fourier trig polynomial from 'myfcn.m'. It would look something like this:
% script: fouriertest.m
% description: perform some approximation experiments with
% Fourier polynomials on the function myfcn defined externally.
n = 2; % account for highest frequency
N = 4; % sampling number on [0,2*pi]
a = zeros(1,n+1);
b = zeros(1,n);
% construct the coefficients for myfcn
xnodes = linspace(0,2*pi,N+1);
xnodes = xnodes(1:N); % trim off 2*pi
xnodes = xnodes - pi; % let's work on [-pi,pi]
f = myfcn(xnodes);
% dispose of 0th coefficient
a(n+1) = 2*sum(f)/N;
% now the rest
for j = 1:n
  a(j) = 2*f*cos(j*xnodes)'/N;
  b(j) = 2*f*sin(j*xnodes)'/N;
end
% display norm of node error
x = -pi:.01:pi;
disp('|| myfcn(x) - trigeval(a,b,x) ||:'), disp(norm(myfcn(x) - trigeval(a,b,x),inf))
% now try plotting
clf
plot(x,myfcn(x),x,trigeval(a,b,x))
Now for some experiments:
(1) Let's start with something really simple like f(x) = sin 2x, N = 3, 4 and n from the interpolation rate ⌊(N − 1)/2⌋ to N. Hmmm. Something's seriously wrong. Try N = 5, 7.
(2) Now try f(x) = sin 2x − 0.2 cos 5x, but start with N = 5.
What's going on here? Nyquist discovered that in order to recover a periodic signal by sampling, a sufficient condition is that we have a sample rate more than twice the highest frequency (in a sinusoidal component) of the signal. It is not a necessary condition. This number is called the Nyquist sampling rate. Recall that frequency is defined by f = 1/T, where T is the period of a function. In the first example, the highest frequency is 1/π, so our sampling rate must be greater than 2/π. Thus in an interval of width 2π, we could sample with N > 2π · (2/π) = 4. So N = 5 will work. What went wrong with smaller N? Consider this simple example: If we sample with N = 2, we cannot distinguish between sin x, sin 2x, sin 3x, etc. This phenomenon is called aliasing, and is well studied in the area of signal processing.
Now consider the second example. The highest frequency occurring is 5/(2π), so the Nyquist sampling rate gives that in an interval of width 2π a sufficient sampling number is N > 10.
Finally, let’s break the rules and change the function to a square wave: f (x) = sign (x).
Experiment with increasing sample sizes. This time, settle on a trig polynomial size and
increase sampling until we are satisfied. Draw a separate figure for n = 5, 11, 21. Conclusions?
Note that we saw several interesting features:
(1) The discrete Fourier series does seem to converge pointwise to f (x).
(2) At the discontinuity it appears to be converging to a point half way between the left
and right-hand limits.
(3) The "bumps" near the discontinuity appear to gradually contract in width, but their height does not go to zero. This is the so-called "Gibbs effect" that was observed around the end of the nineteenth century by J. Gibbs.
4. Complex Trigonometric Polynomials
Here the trig functions are replaced by the complex exponentials e^{ikx}, k = −n, . . . , 0, . . . , n,
which turn out to be an orthogonal set of functions in the space C [−π, π] of continuous
complex-valued functions with domain [−π, π]. One can define the complex function ez in
terms of the power series for the exponential function learned in Calculus, but for our purposes, it suffices to use this fact as definition: For real x,
eix = cos x + i sin x.
We use the standard complex inner product

⟨f, g⟩ = ∫_{−π}^{π} f(x) \overline{g(x)} dx

and induced norm ∥f∥_2 = ⟨f, f⟩^{1/2} in this setting. Remember that to integrate a complex function, you simply integrate its real and imaginary parts, that is, if f(x) = g(x) + i h(x), with g(x) and h(x) real-valued, then

∫_a^b f(x) dx = ∫_a^b g(x) dx + i ∫_a^b h(x) dx.
Let W_n be the finite dimensional subspace spanned by these exponentials and, as with the trig polynomials, we obtain that the standard projection formula for the best approximation to f(x) ∈ C[−π, π] from W_n is given by

q(x) = ∑_{k=−n}^{n} c_k e^{ikx},

where

c_k = ⟨f, e^{ikx}⟩ / ⟨e^{ikx}, e^{ikx}⟩ = (1/2π) ∫_{−π}^{π} f(x) e^{−ikx} dx.
The bridge between these coefficients and the real Fourier coefficients consists of the following identities for k ≥ 0, which are easily checked from the definitions:

a_k = c_k + c_{−k}
b_k = i (c_k − c_{−k})
c_k = (a_k − i b_k)/2
c_{−k} = (a_k + i b_k)/2.
Just as with the a_k and b_k we can approximate the c_k by trapezoidal integration using N + 1 equally spaced nodes x_j = (2π/N) j on the interval [0, 2π], as with real Fourier coefficients, and obtain the approximation c′_k to c_k as

c′_k = (1/N) ∑_{j=0}^{N−1} f(x_j) e^{−(2πi/N) jk} = (1/N) ∑_{j=0}^{N−1} y_j ω^{kj} = (1/N) ⟨y, w_k⟩ ≡ Y_k,

where

y_j = f(x_j),   Y_k = c′_k,   y = (y_0, y_1, . . . , y_{N−1}) ∈ C^N,   w_k = (ω_N^0, ω_N^k, . . . , ω_N^{k(N−1)}) ∈ C^N.

Here we use the notation ω_N = e^{2πi/N} and ω = ω_N^{−1} = \overline{ω_N} (check these equalities for yourself), and the inner product ⟨y, w⟩ = y^T \overline{w} on C^N. The vectors w_k have some very interesting properties. For one, one checks that the sequence {w_k} is periodic in that w_m = w_{m+N} for all integers m, since ω_N^N = 1. Moreover, for 0 ≤ m, n < N,
⟨w_m, w_n⟩ = ∑_{j=0}^{N−1} ω_N^{mj} ω_N^{−nj} = ∑_{j=0}^{N−1} (ω_N^{m−n})^j = { N, if m = n;  0, if m ≠ n }.

For the unequal case, check the identity

(1 + r + r² + · · · + r^{N−1})(1 − r) = 1 − r^N.
Thanks to periodicity of these vectors, any sequence of vectors w_k, k = m, . . . , m + N − 1, forms an orthogonal basis of C^N. In particular, for any vector y the standard projection formula gives

y = ∑_{k=0}^{N−1} (⟨y, w_k⟩/⟨w_k, w_k⟩) w_k = (1/N) ∑_{k=0}^{N−1} ⟨y, w_k⟩ w_k = ∑_{k=0}^{N−1} Y_k w_k.
Therefore, for 0 ≤ j ≤ N − 1, the jth coordinate of y is given by

y_j = ∑_{k=0}^{N−1} Y_k (w_k)_j = ∑_{k=0}^{N−1} Y_k e^{(2πi/N) kj}.

All of this suggests an N × N matrix of the form F = (1/N) [ e^{−(2πi/N) kj} ]_{k,j}, which enables us to express the relation between the y_j and Y_k in matrix form:

Y = F y,   y = N \overline{F} Y.
Let’s do some experiments with Matlab:
% construct some DFT matrices
N = 5
zeta = exp(-2*pi*i/N)
z = zeta.^(0:N-1)
% some random vector
y = rand(N,1)
% try help vander -- it doesn’t quite do what we want
F = fliplr(vander(z))/N
Finv = N*conj(F)
Y = F*y
norm(y - Finv*Y) % it works!
O.K. now we are going to do some experiments that will be better handled with a script.
So start Matlab and edit ’myfcn’. Change it to a square wave on [0, 2π] by
retval = (abs(x)>10*eps).*sign(pi-x);
Next, edit 'fouriertest', or whatever you called the file from the previous experiments. Here is what you need:
n = 4 % highest frequency in Fourier approximation
N = 2*n + 1 % account for Nyquist sampling rate rule
a = zeros(1,n+1);
b = zeros(1,n);
% construct the coefficients for myfcn
xnodes = linspace(0,2*pi,N+1);
xnodes = xnodes(1:N); % trim off 2*pi
f = myfcn(xnodes); % dispose of 0th coefficient
a(n+1) = 2*sum(f)/N;
% now the rest of real Fourier coefs
for k = 1:n
  a(k) = 2*f*cos(k*xnodes)'/N;
  b(k) = 2*f*sin(k*xnodes)'/N;
end
% next construct the complex coefficients
% with a suitable DFT matrix
zeta = exp(-2*pi*i/N)
z = zeta.^(-n:n)
F = fliplr(vander(z))/N;
c = F*f.';
% ok, let's compare
ac = zeros(1,n+1);
% a(n+1) is the exceptional Fourier a_0 of index 0
ac(n+1) = 2*c(n+1);
disp('| a_0 - 2*c_0 | :'), disp(abs(a(n+1) - ac(n+1)))
% the a(k) are Fourier a_k, k>0
ac(1:n) = c(n+2:N) + c(n:-1:1);
disp('|| a_k - (c_k + c_{-k}) || :'), disp(norm(a(1:n)-ac(1:n),inf))
% the b(k) are Fourier b_k, k>0
bc = i*(c(n+2:N) - c(n:-1:1)).';
disp('|| b_k - i*(c_k - c_{-k}) || :'), disp(norm(b - bc,inf))
Now save it as ’DFTtest’. Once you have run DFTtest, try plotting (or just add it to
DFTtest):
x = 0:.01:2*pi;
plot(x,myfcn(x),x,trigeval(a,b,x),x,trigeval(ac,bc,x));
norm(imag(ac),inf)
norm(imag(bc),inf)
5. Discrete Fourier Transforms (DFT)
As usual, ω_N = e^{(2π/N) i}, a primitive Nth root of unity. To recap, given periodic data [y_k] = {y_k}_{k=0}^{N−1}, we obtain transformed data [Y_k] = {Y_k}_{k=0}^{N−1} (we're replacing c′_k by Y_k), and conversely, by way of the formulas defining the discrete Fourier transform and its inverse, namely

Y_k = (1/N) ∑_{j=0}^{N−1} y_j ω_N^{−kj} ≡ F_N([y_k])

y_k = ∑_{j=0}^{N−1} Y_j ω_N^{kj} ≡ F_N^{−1}([Y_k]).
Next, let’s alter the definition of the data [yk ]. Since the data is periodic, we can interpret
it as a doubly infinite sequence
[yk ] = . . . , y−j , . . . , y−1 , y0 , y1 , . . . , yN −1 , yN , . . . , yj , . . .
where we understand that, in general, if j = qN + r, with integers j, q, r and 0 ≤ r < N, then y_j = y_r. (If you know congruences, you will recognize that j ≡ r (mod N).) We have the
same interpretation with transformed data [Yk ], which is also periodic, given that [yk ] is. Here
is a key idea:
Definition 1. Given periodic data sets [y_k] and [z_k], we define the convolution of these to be

[w_k] = [y_k] ∗ [z_k],

where

w_k = ∑_{j=0}^{N−1} y_j z_{k−j},   k ∈ Z.

The complexity of the convolution operator is 2N² flops, or simply O(N²).
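For concreteness, here is a minimal Matlab sketch (my own illustration, not part of the course files) of the convolution computed directly from this definition; the wraparound index k − j is handled with mod, and the doubly nested loop accounts for the O(N²) operation count just mentioned.
function w = pconv(y,z)
% usage: w = pconv(y,z)
% description: periodic (circular) convolution of two vectors of
% the same length N, computed directly from the definition
% w_k = sum_{j=0}^{N-1} y_j z_{k-j}, indices taken mod N.
y = y(:); z = z(:); N = length(y);
w = zeros(N,1);
for k = 0:N-1
  for j = 0:N-1
    w(k+1) = w(k+1) + y(j+1)*z(mod(k-j,N)+1);
  end
end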
To illustrate the power of the DFT, we next develop two useful applications of the DFT,
namely polynomial multiplication and numerical derivative calculation.
Polynomial multiplication: We can interpret polynomial multiplication as convolution in the following way: Given polynomials

p(x) = a_0 + a_1 x + · · · + a_m x^m
q(x) = b_0 + b_1 x + · · · + b_n x^n,

we know that

p(x) q(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0) x + · · · + a_m b_n x^{m+n}.

The coefficient of the kth term is actually

∑_{j+ℓ=k} a_j b_ℓ,

where 0 ≤ j ≤ m and 0 ≤ ℓ ≤ n. As for complexity, we see that it is about 2(m + 1)(n + 1) flops, so certainly O(mn).
Let N ≥ m + n + 1 and pad the vectors of coefficients with zeros to obtain vectors

a = (a_0, a_1, . . . , a_m, 0, . . . , 0)
b = (b_0, b_1, . . . , b_n, 0, . . . , 0)

in the space C^N. Next form the periodic data sets [a_k] and [b_k], form [c_k] = [a_k] ∗ [b_k] and observe that for 0 ≤ k ≤ m + n,

c_k = ∑_{j=0}^{N−1} a_j b_{k−j} = ∑_{j+ℓ=k} a_j b_ℓ.

Here the padded zeros ensure that 0 ≤ j ≤ m and 0 ≤ ℓ ≤ n. Moreover, if k > m + n, then the padded zeros will make the sum identically 0, since one factor in each term will always be zero. So this is exactly the same sum as for the coefficients of the product polynomial, padded with zeros beyond the (m + n + 1)th entry. We need one more key fact:
Theorem 2. If F_N([y_k]) = [Y_k] and F_N([z_k]) = [Z_k], then

F_N([y_k] ∗ [z_k]) = [N Y_k Z_k]   and   [y_k] ∗ [z_k] = F_N^{−1}([N Y_k Z_k]).
Proof. Let [w_k] = [y_k] ∗ [z_k] and calculate

W_k = (1/N) ∑_{j=0}^{N−1} ( ∑_{ℓ=0}^{N−1} y_ℓ z_{j−ℓ} ) ω_N^{−kj}
    = (1/N) ∑_{ℓ=0}^{N−1} y_ℓ ω_N^{−kℓ} ( ∑_{j=0}^{N−1} z_{j−ℓ} ω_N^{−k(j−ℓ)} )
    = (1/N) ∑_{ℓ=0}^{N−1} y_ℓ ω_N^{−kℓ} ( ∑_{m=−ℓ}^{N−1−ℓ} z_m ω_N^{−km} )
    = N ( (1/N) ∑_{ℓ=0}^{N−1} y_ℓ ω_N^{−kℓ} ) ( (1/N) ∑_{m=0}^{N−1} z_m ω_N^{−km} )
    = N Y_k Z_k,

where the next-to-last step uses the fact that z_m ω_N^{−km} is periodic in m with period N, so the sum over m = −ℓ, . . . , N − 1 − ℓ is the same as the sum over m = 0, . . . , N − 1. The last equality of the theorem is immediate.
All of this suggests an algorithm for computing convolutions: First, apply the DFT to each sequence, then take the coordinate-wise product of the transforms, and finally take the inverse transform of this product. Although we can say that the complexity is O(N²), clearly there is a factor of 10 or so that would seem to be too much work when contrasted with doing it directly. So what's the point? Enter the FFT, which will reduce the work of a transform (or inverse transform) to O(N log₂ N). Now this idea will become profitable in some situations.
Let's confirm the convolution theorem in a special case, namely for the multiplication of p(x) = (1 − x)² and q(x) = (1 − x)³ by means of the convolution trick.
N = 8
zeta = exp(-2*pi*i/N).^(0:N-1);
F = fliplr(vander(zeta))/N;
Finv = N*conj(F);
% represent the polynomial with lowest to highest degree vector
y = [1,-2,1,0,0,0,0,0]’
z = [1,-3,3,-1,0,0,0,0]’
Y = F*y;
Z = F*z;
W = Y.*Z;
w = Finv*(N*W)
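As a quick sanity check (my own addition), the same product coefficients can be obtained with Matlab's built-in polynomial convolution:
% ordinary (non-periodic) convolution of the coefficient vectors
conv([1,-2,1],[1,-3,3,-1])                    % should give 1 -5 10 -10 5 -1
norm(w(1:6).' - conv([1,-2,1],[1,-3,3,-1]))   % compare with the DFT result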
Numerical derivative calculations: If we believe that the standard projection formula for the best approximation q(x) to f(x) ∈ C[−π, π] from W_n is a very good approximation to f(x), then it is reasonable to consider the derivative of q(x) to be a good approximation to the derivative of f(x), that is,

f(x) ≈ q(x) = ∑_{k=−n}^{n} c_k e^{ikx},   and   f′(x) ≈ q′(x) = ∑_{k=−n}^{n} i k c_k e^{ikx}.
Let's try this out with the test function f(x) = sin 2x − 0.2 cos 5x, with f′(x) = 2 cos 2x + sin 5x. All we have to do is modify the complex Fourier coefficients, which we illustrate in this Matlab calculation (which you should save as the script DFTderivtest):
n = 5 % highest frequency in Fourier approximation
N = 2*n + 1 % account for Nyquist sampling rate
% construct the coefficients for myfcn, derivative myfcnp
xnodes = linspace(0,2*pi,N+1)';
xnodes = xnodes(1:N); % trim off 2*pi
f = myfcn(xnodes);
fp = myfcnp(xnodes);
% this time we'll do the calculation with a suitable DFT matrix
zeta = exp(-2*pi*i/N).^(-n:n);
F = fliplr(vander(zeta))/N;
c = F*f;
cp = F*fp;
cpapprox = i*(-n:n)'.*c;
% ok, let's compare coefficients
disp('|| cp - cpapprox || :'), disp(norm(cp - cpapprox,inf))
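For this experiment the two external function files might look like the following (a minimal sketch; the file names myfcn.m and myfcnp.m just follow the convention used above, one function per file):
function retval = myfcn(x)
% test function f(x) = sin(2x) - 0.2*cos(5x)
retval = sin(2*x) - 0.2*cos(5*x);

function retval = myfcnp(x)
% its derivative f'(x) = 2*cos(2x) + sin(5x)
retval = 2*cos(2*x) + sin(5*x);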
6. The Fast Fourier Transform (FFT)
Description. We'll do pretty much everything you need to know about the development of the FFT. OK, let's start with an update of the definition of the discrete Fourier transform (DFT): given a data sequence (y_k) of complex numbers, periodic of period N, we define the DFT of this data to be the periodic sequence (Y_k) of period N given by the formula

Y_k = (1/N) ( y_0 + y_1 ω_N^{−k} + y_2 ω_N^{−2k} + · · · + y_{N−1} ω_N^{−(N−1)k} ),   k = 0, 1, . . . , N − 1.
Since the Y_k are a periodic sequence, these are all the values we need to know. Symbolically, we write [Y_k] = F_N([y_k]). Now assume that N is even, say N = 2m; then we can split this sum into even and odd indexed terms and factor ω_N^{−k} from the odd terms to get that for k = 0, 1, . . . , N − 1,

Y_k = (1/2)( E_k + ω_N^{−k} O_k ),

where

E_k = (1/m)( y_0 + y_2 ω_N^{−2k} + y_4 ω_N^{−4k} + · · · + y_{N−2} ω_N^{−(N−2)k} )
O_k = (1/m)( y_1 + y_3 ω_N^{−2k} + y_5 ω_N^{−4k} + · · · + y_{N−1} ω_N^{−(N−2)k} ).
Here are some simple facts:

ω_N^{−(k+m)} = −ω_N^{−k},   E_{k+m} = E_k,   O_{k+m} = O_k.

It follows that for k = 0, 1, . . . , N/2 − 1, we have these DFT doubling formulas

Y_k     = (1/2)( E_k + ω_N^{−k} O_k )
Y_{k+m} = (1/2)( E_k − ω_N^{−k} O_k ).
Observe that this essentially cuts the work of computing the Y_k in half, requiring about N²/2 multiplications and additions instead of N². This observation lies at the heart of the FFT. However, the real power of this idea is realized when we team it up with recursion. To this end, we suppose that N is purely even, that is, N = 2^p for some integer p. Now make the key observation that calculation of the E_k and O_k's is itself a DFT. For we have that ω_N² = ω_{N/2}, from which it follows that

E_k = (1/m)( y_0 + y_2 ω_{N/2}^{−k} + y_4 ω_{N/2}^{−2k} + · · · + y_{N−2} ω_{N/2}^{−(m−1)k} )
O_k = (1/m)( y_1 + y_3 ω_{N/2}^{−k} + y_5 ω_{N/2}^{−2k} + · · · + y_{N−1} ω_{N/2}^{−(m−1)k} ).

In other words, the two quantities above are simply the DFTs of the two half sized data sets y_0, y_2, . . . , y_{N−2} and y_1, y_3, . . . , y_{N−1}. Thus, we see that we can halve these two data sets again and reduce the problem to computing four quarter sized DFTs. Recursing all the way down to the bottom, we reduce the original problem to computing N DFTs of singleton data sets. It takes us p of these steps to get down to the bottom, namely size N/2^0 to N/2^1, then N/2^1 to N/2^2, all the way down to N/2^{p−1} to N/2^p = 1. However, at the bottom, the DFT of a singleton data point is simply the data point itself. Are we done?

Well, not quite. This is only half the problem, because once we've reached the bottom, we have to reassemble the data a step at a time by way of the DFT doubling formulas to build the DFT at the top. Let's examine how we have to rearrange our original data set for this recursion in the case p = 3, that is, N = 8. Designate levels by ℓ.
ℓ = 3:   y0   y1   y2   y3   y4   y5   y6   y7
ℓ = 2:   y0   y2   y4   y6   y1   y3   y5   y7
ℓ = 1:   y0   y4   y2   y6   y1   y5   y3   y7
ℓ = 0:   y0   y4   y2   y6   y1   y5   y3   y7
Now we’re ready to work our way back up. Notice, BTW, ℓ = 1 really is the same as the
bottom.
Complexity. The recursive way of thinking is very handy. For one thing, it enables us to do a complexity analysis without much effort. We'll ignore the successive divisions by 2. We could just ignore them during the computations, then at the end divide our final result by 2^p = N. This only involves N real multiplications, which is negligible compared to the rest of the work. Also notice that the number of multiplications for a full reconstruction is about half the number of additions, so we'll focus on additions. (Bear in mind that we assume the values of ω_N are precomputed and available.) How much work is involved in going from the two data sets to the single one of length 2^p? Let's denote the total number of additions by A_p. We see from the formulas for Y_k and Y_{k+m} that

A_p = 2A_{p−1} + 2^p

and down at the bottom A_0 = 0. Recursion yields

A_p = 2A_{p−1} + 2^p
    = 2(2A_{p−2} + 2^{p−1}) + 2^p
    ⋮
    = 2^p A_0 + p·2^p = N log₂ N.

Similarly, one could count multiplications M_p recursively, with the recursion formula being

M_p = 2M_{p−1} + 2^{p−1} − 1,

not counting multiplication by ω_N^0 = 1 and with M_0 = 0, to obtain that M_p = (1/2) N (log₂ N − 2). For reasonably large N this is a very significant reduction; e.g., if N = 256 = 2^8, then N² = 65536 and N log₂ N = 2048.
Implementation. We might be tempted to use recursion to program the algorithm. From
a practical point of view, this is usually a bad idea, because recursion, the elegant darling of
computer science aficionados, is frequently inefficient as a programming paradigm.
Rather, let’s assume that we have decomposed that data set into the lowest level. That
process alone is interesting, so suppose we have a routine that takes an index vector 0 : N − 1
and rearranges it in such a way that the returned index vector is what is needed to work our
way back up the recursion chain of the FFT. The syntax of the routine should be
retval = FFTsort(p)

where N = 2^p.
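Such a routine is not written out in these notes; here is one possible minimal sketch (my own, based directly on the recursive even/odd splitting described above), which produces the required permutation, namely the bit-reversal ordering:
function retval = FFTsort(p)
% usage: retval = FFTsort(p)
% description: returns the permutation of 1:2^p that rearranges a
% data vector into the bottom level ordering of the FFT recursion
% (even-indexed entries first, then odd, applied recursively).
retval = 1;                       % ordering for a single data point
for l = 1:p
  % double the size: old indices give the even-indexed entries,
  % old indices shifted by one give the odd-indexed entries
  retval = [2*retval-1, 2*retval];
end
For example, FFTsort(3) returns 1 5 3 7 2 6 4 8, the 1-based version of the ℓ = 0 ordering y0, y4, y2, y6, y1, y5, y3, y7 displayed above.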
With the routine FFTsort in hand, let’s address the issue of practical coding of the FFT.
Rather than thinking from the top down (recursion), let’s think from the bottom up (induction). Of course, they’re equivalent. It suffices to understand how things work from the
m = N/2 to N. So let’s re-examine the formulas from the point of view of DFTs. Consider
this tabular form for the relevant data:

ℓ = p:       Y0   Y1   . . .   Y_{N−1}
ℓ = p − 1:   E0   E1   . . .   E_{m−1}   O0   O1   . . .   O_{m−1}
Notice that the first half of the second row is the DFT of the first half of the second
level of rearranged data (like ℓ = 2 in our N = 8 example above). Let's call the process of applying the doubling formulas to the second row to get the first row DFT doubling. Back to the general
case, imagine that we’re at the ℓth level in our reconstruction, where 0 ≤ ℓ < p. At this level
we have constructed 2p−ℓ consecutive DFTs of size 2ℓ , where each DFT is the DFT of the
corresponding data at the ℓth level of rearranged input data. Now we pair these DFTs up to
construct 2p−ℓ−1 consecutive DFTs of size 2ℓ+1 . Here’s a picture of what the DFT doublings
look like in our N = 8 example (start at the bottom):
ℓ = 3:   E = Y0   E = Y1   E = Y2   E = Y3   E = Y4   E = Y5   E = Y6   E = Y7
ℓ = 2:   E        E        E        E        O        O        O        O
ℓ = 1:   E        E        O        O        E        E        O        O
ℓ = 0:   y0 = E   y4 = O   y2 = E   y6 = O   y1 = E   y5 = O   y3 = E   y7 = O
Keep going until we get to the top. When we get there we have reconstructed the Y_k's.
One can envision two nested for loops that do the whole job. In fact, here’s a simple Matlab
code for the FFT. All divisions by 2 are deferred to the last line, as suggested by our earlier
discussion.
FFT Algorithm:
function Y = FFTransform(y)
% usage: Y = FFTransform(y)
% description: This function accepts input vector y of length
% 2^p (NB: assumed and not checked for) and outputs
% the DFT Y of y using the FFT algorithm.
y = y(:); % turn y into a column
N = length(y);
p = log2(N);
omegaN = exp(-pi*i/N).^(0:N-1).'; % all the omegas we need
Y = y(FFTsort(p)); % initial Y at bottom level
stride = 1;
for l=1:p
  omegal = omegaN(1:N/stride:N);
  stride = 2*stride;
  for j=1:stride:N
    evens = Y(j:j+stride/2-1);
    odds = omegal.*Y(j+stride/2:j+stride-1);
    Y(j:j+stride/2-1) = evens + odds;
    Y(j+stride/2:j+stride-1) = evens - odds;
  end
end
Y = Y/N;
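A quick way to test the routine (my own check, assuming FFTsort from the sketch above is on the path) is to compare it against Matlab's built-in fft, which computes the same sums without the 1/N factor:
y = rand(16,1) + i*rand(16,1);    % random complex data, N = 16 = 2^4
norm(FFTransform(y) - fft(y)/16)  % should be at roundoff level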
We might ask what one has to do about the inverse DFT. Recall that this transform is given by the equation

y_k = Y_0 + Y_1 ω_N^{k} + Y_2 ω_N^{2k} + · · · + Y_{N−1} ω_N^{(N−1)k},   k = 0, 1, . . . , N − 1.

Symbolically, we write (y_k) = F_N^{−1}((Y_k)). This operation does what it says, that is, it computes untransformed data from transformed data. The good news is that we need do nothing more than write a program for the FFT. The reason is that from the definition we can see that

\overline{y_k} = N · (1/N)( \overline{Y_0} + \overline{Y_1} ω_N^{−k} + \overline{Y_2} ω_N^{−2k} + · · · + \overline{Y_{N−1}} ω_N^{−(N−1)k} ),   k = 0, 1, . . . , N − 1,

from which it follows that

(\overline{y_k}) = N F_N( (\overline{Y_k}) ),  i.e.,  (y_k) = \overline{ N F_N( (\overline{Y_k}) ) }.

Therefore, it suffices to have a fast algorithm for the DFT and a fast algorithm for the inverse DFT is more or less free.
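In Matlab terms, a minimal sketch of this observation (my own, reusing FFTransform from above) is:
function y = IFFTransform(Y)
% usage: y = IFFTransform(Y)
% description: inverse DFT of Y (length a power of 2) computed by
% conjugating, applying the forward FFT, and conjugating back.
Y = Y(:);
N = length(Y);
y = conj(N*FFTransform(conj(Y)));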
There are other refinements to applications and construction of the DFT coming from signal
processing; we won’t go into these. Since these notes are all about approximation theory, there
is one more question: how exactly do these numbers relate to the approximation formula that
started this whole business, namely,

q(x) = ∑_{k=−n}^{n} c_k e^{ikx}.

The answer lies in a careful scrutiny of the computed approximate Fourier coefficients Y_k ≈ c_k. Let m = N/2 and recall from the periodicity of the Y_k's that Y_j = Y_{j−2m}. Now re-index these computed numbers as follows:

Y_0, Y_1, . . . , Y_{m−1}, Y_m, Y_{m+1},      Y_{m+2},      . . . , Y_{2m−1}
Y_0, Y_1, . . . , Y_{m−1}, Y_m, Y_{m+1−2m},   Y_{m+2−2m},   . . . , Y_{2m−1−2m}
Y_0, Y_1, . . . , Y_{m−1}, Y_m, Y_{−(m−1)},   Y_{−(m−2)},   . . . , Y_{−1}
Thus we are really calculating the approximation

q(x) = ∑_{k=0}^{m−1} Y_k e^{ikx} + Y_m e^{imx} + ∑_{k=m+1}^{N−1} Y_k e^{i(k−N)x}.

In other words, the correct sequence of exponents of e^{ix} implied by the Y_k ordering is

0, 1, . . . , m − 1, m, −(m − 1), −(m − 2), . . . , −1.
7. All About Cubic Splines
We study cubic splines with a finite number of (ordered) knots at x_0, x_1, . . . , x_n. The space of all such objects is S(x_0, x_1, . . . , x_n; 4), and such an object is a function s(x) ∈ C²[x_0, x_n] such that on each subinterval [x_j, x_{j+1}], j = 0, 1, . . . , n − 1, we have s(x) = s_j(x), a cubic polynomial defined for x ∈ [x_j, x_{j+1}].
General Equations for All Cubic Splines.
Here is the problem presented by knot interpolation alone.
Problem: Given a function f (x) ∈ C 2 [x0 , xn ], find all cubic splines that interpolate f (x)
at the knots x0 , x1 , . . . , xn , that is, s (xj ) = f (xj ), j = 0, 1, . . . , n.
Notation:
(1) s_j(x) ≡ a_j + b_j (x − x_j) + c_j (x − x_j)² + d_j (x − x_j)³, j = 0, 1, . . . , n − 1.
(2) yj = f (xj ), j = 0, 1, . . . , n.
(3) Mj = s′′ (xj ), j = 0, 1, . . . , n. (The Mj ’s are called the second moments of the spline.)
(4) hj = xj+1 − xj , j = 0, 1, . . . , n − 1.
Here is how we solve this problem: Start with s′′(x), which must be a piecewise linear continuous function. Since we know its endpoint values in the subinterval [x_j, x_{j+1}], we may use simple Lagrange interpolation to obtain

s_j′′(x) = [ (x_{j+1} − x) M_j + (x − x_j) M_{j+1} ] / h_j.

Integrate twice and we have to add in an arbitrary linear term which, for convenience, we write as follows:

s_j(x) = [ (x_{j+1} − x)³ M_j + (x − x_j)³ M_{j+1} ] / (6 h_j) + C_j (x_{j+1} − x) + D_j (x − x_j),

where C_j and D_j are to be determined. Now differentiate this expression and obtain

s_j′(x) = [ −(x_{j+1} − x)² M_j + (x − x_j)² M_{j+1} ] / (2 h_j) − C_j + D_j.

For later use we note that if we shift the indices down one, we obtain

s_{j−1}′(x) = [ −(x_j − x)² M_{j−1} + (x − x_{j−1})² M_j ] / (2 h_{j−1}) − C_{j−1} + D_{j−1}.

Our strategy is to reduce the determination of every unknown to the determination of the M_j's. Start with the interpolation conditions for s, which imply that for j = 0, 1, . . . , n − 1,

s_j(x_j) = y_j = (h_j²/6) M_j + C_j h_j,

so that

C_j = y_j/h_j − h_j M_j/6.
Similarly, from

s_j(x_{j+1}) = y_{j+1} = (h_j²/6) M_{j+1} + D_j h_j,

we have

D_j = y_{j+1}/h_j − h_j M_{j+1}/6.

Next, use the continuity of s′(x) to obtain that for j = 1, . . . , n − 1,

s_{j−1}′(x_j) = s_j′(x_j),

which translates into

(h_{j−1}/2) M_j − C_{j−1} + D_{j−1} = −(h_j/2) M_j − C_j + D_j,

that is,

(h_{j−1}/2) M_j − y_{j−1}/h_{j−1} + h_{j−1} M_{j−1}/6 + y_j/h_{j−1} − h_{j−1} M_j/6
    = −(h_j/2) M_j − y_j/h_j + h_j M_j/6 + y_{j+1}/h_j − h_j M_{j+1}/6.

After simplifying, we obtain

(h_{j−1}/6) M_{j−1} + ((h_{j−1} + h_j)/3) M_j + (h_j/6) M_{j+1} = (y_{j+1} − y_j)/h_j − (y_j − y_{j−1})/h_{j−1},

that is,

(h_{j−1}/6) M_{j−1} + ((h_{j−1} + h_j)/3) M_j + (h_j/6) M_{j+1} = f[x_j, x_{j+1}] − f[x_{j−1}, x_j].
Thus we are two equations short of what we need to determine the n+1 unknowns M0 , M1 , . . . , Mn .
Standard Form for the Defining Cubics.
Notice that if we express the cubic polynomials defining s(x) in standard form, then for j = 0, 1, . . . , n − 1, we have

s_j(x) = a_j + b_j (x − x_j) + c_j (x − x_j)² + d_j (x − x_j)³,
s_j′(x) = b_j + 2c_j (x − x_j) + 3d_j (x − x_j)²,
s_j′′(x) = 2c_j + 6d_j (x − x_j).

We can use these equations along with our earlier description of s_j(x) to obtain that for j = 0, 1, . . . , n − 1,

a_j = f(x_j)
b_j = f[x_j, x_{j+1}] − (h_j/6)(2M_j + M_{j+1})
c_j = M_j/2
d_j = (M_{j+1} − M_j)/(6 h_j).
These are the formulas that we need to put the spline in standard PP form.
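For instance, given the knots, the data y_j = f(x_j) and the moments M_j, a minimal Matlab sketch of these formulas (my own illustration; the actual course files CCSpp.m/NCSpp.m may use a different interface) is:
function coefs = moments2pp(knots,y,M)
% usage: coefs = moments2pp(knots,y,M)
% description: given knots x_0..x_n, values y_j = f(x_j) and second
% moments M_j = s''(x_j), return the n x 4 matrix whose jth row is
% [a_j, b_j, c_j, d_j] for s_j(x) = a_j + b_j(x-x_j) + c_j(x-x_j)^2 + d_j(x-x_j)^3.
knots = knots(:); y = y(:); M = M(:);
n = length(knots) - 1;
h = diff(knots);                      % h_j = x_{j+1} - x_j
fdd = diff(y)./h;                     % divided differences f[x_j,x_{j+1}]
a = y(1:n);
b = fdd - h.*(2*M(1:n) + M(2:n+1))/6;
c = M(1:n)/2;
d = (M(2:n+1) - M(1:n))./(6*h);
coefs = [a, b, c, d];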
Extra Conditions to Uniquely Define the Interpolating Cubic Spline.
Clamped (complete) cubic spline. Define the spline s(x) = s_c(x) by the two additional conditions

s′(x_0) = f′(x_0),   s′(x_n) = f′(x_n).

Use the formula for s′(x) to obtain the additional equations

(h_0/3) M_0 + (h_0/6) M_1 = f[x_0, x_1] − f′(x_0)
(h_{n−1}/6) M_{n−1} + (h_{n−1}/3) M_n = f′(x_n) − f[x_{n−1}, x_n].

The resulting system has a symmetric positive definite coefficient matrix, from which it follows that there is a unique clamped cubic spline interpolating f(x) at the knots of the spline. The system is (the convention is that blank entries are understood to have value zero)

[ h_0/3      h_0/6                                        ] [ M_0     ]     [ f[x_0,x_1] − f′(x_0)                ]
[ h_0/6   (h_0+h_1)/3    h_1/6                            ] [ M_1     ]     [ f[x_1,x_2] − f[x_0,x_1]             ]
[            ⋱              ⋱             ⋱               ] [   ⋮     ]  =  [              ⋮                      ]
[       h_{n−2}/6   (h_{n−2}+h_{n−1})/3    h_{n−1}/6      ] [ M_{n−1} ]     [ f[x_{n−1},x_n] − f[x_{n−2},x_{n−1}] ]
[                       h_{n−1}/6          h_{n−1}/3      ] [ M_n     ]     [ f′(x_n) − f[x_{n−1},x_n]            ]
Error analysis leads to this theorem:

Theorem 3. If f ∈ C⁴[a, b], knots are defined as a = x_0 < x_1 < · · · < x_n = b, h = max_{0≤j≤n−1}(x_{j+1} − x_j), and s_c(x) is the clamped cubic spline for f, then for x ∈ [a, b] and j = 0, 1, 2,

|f^(j)(x) − s_c^(j)(x)| ≤ c_j h^{4−j} ∥f^(4)∥_∞,

where (c_0, c_1, c_2) = (5/384, 1/24, 3/8).
Another striking fact is that cubic splines minimize "wiggles" in the following sense.

Theorem 4. Among all g(x) ∈ C²[a, b] interpolating f(x) ∈ C²[a, b] at the nodes a = x_0 < x_1 < · · · < x_n = b, and interpolating f′(x) at a and b, the choice g(x) = s_c(x) minimizes

∫_a^b ( g′′(x) )² dx.
Natural cubic spline. Define the spline s(x) = s_nat(x) by the two additional conditions

s′′(x_0) = 0,   s′′(x_n) = 0.

These give the additional equations

M_0 = 0,   M_n = 0.

The resulting system has a symmetric positive definite coefficient matrix, from which it follows that there is a unique natural cubic spline interpolating f(x) at the knots of the spline. The system is

[ 1                                                        ] [ M_0     ]     [ 0                                    ]
[ h_0/6   (h_0+h_1)/3    h_1/6                             ] [ M_1     ]     [ f[x_1,x_2] − f[x_0,x_1]             ]
[            ⋱              ⋱             ⋱                ] [   ⋮     ]  =  [              ⋮                      ]
[       h_{n−2}/6   (h_{n−2}+h_{n−1})/3    h_{n−1}/6       ] [ M_{n−1} ]     [ f[x_{n−1},x_n] − f[x_{n−2},x_{n−1}] ]
[                                              1           ] [ M_n     ]     [ 0                                    ]
One can obtain similar error bounds as with clamped cubic splines, but of course not as good.
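Here is a minimal Matlab sketch (my own illustration, not the course file NCSpp.m) that assembles and solves this system for the natural spline moments and then calls the conversion routine sketched earlier:
function coefs = naturalspline(knots,y)
% usage: coefs = naturalspline(knots,y)
% description: compute PP coefficients of the natural cubic spline
% interpolating the data (knots(j), y(j)) by solving the tridiagonal
% moment system above with M_0 = M_n = 0.
knots = knots(:); y = y(:);
n = length(knots) - 1;
h = diff(knots);
fdd = diff(y)./h;                         % f[x_j, x_{j+1}]
A = zeros(n+1); rhs = zeros(n+1,1);
A(1,1) = 1; A(n+1,n+1) = 1;               % M_0 = 0, M_n = 0
for j = 2:n
  A(j,j-1) = h(j-1)/6;
  A(j,j)   = (h(j-1) + h(j))/3;
  A(j,j+1) = h(j)/6;
  rhs(j)   = fdd(j) - fdd(j-1);
end
M = A\rhs;
coefs = moments2pp(knots,y,M);            % rows [a_j, b_j, c_j, d_j]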
"Not-a-knot" cubic spline. Define the spline s(x) = s_nk(x) to be the cubic interpolating spline determined by the two additional conditions

s(x_1) = f(x_1),   s(x_{n−1}) = f(x_{n−1}),

where x_1 and x_{n−1} serve as interpolation points but not as knots of the spline. One obtains two additional equations, which again give a linear system with a unique solution, by evaluating the expression for s(x) (in terms of the moments M_j) at the nodes x_1 and x_{n−1}. In the limit, as x_1 → x_0 and x_{n−1} → x_n, the not-a-knot cubic spline tends to the clamped cubic spline.
As a final example, estimate the error on the function

f(x) = ( (1/10) x³ − x² + x ) sin(x),   0 ≤ x ≤ 2π,

using the various cubic splines. Tools for constructing these three splines are the function files CCSpp.m, NCSpp.m and NKCSpp.m. These are located in the file PPfcns.m, but they are also unpacked along with all the other functions from PPfcns.m in the folder PPfcns to be found in Course Materials.
For example, here is how to construct the natural cubic spline for f(x) and plot it along with the plot of f(x). You might find it more helpful to plot the error. This could be helpful for deciding how to add knots or change their location. First edit 'myfcn.m' to the following:
retval = (0.1*x.^3 - x.^2 + x).*sin(x);
Then do the following in Matlab:
knots = linspace(0, 2*pi, 6);
pp = NCSpp(knots,myfcn(knots));
x = 0:0.001:2*pi;
plot(x, myfcn(x))
hold on, grid
plot(x, PPeval(pp,x))
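To plot the error instead, as suggested above, one might continue with something like the following (my own addition, reusing the same pp structure and the course function PPeval):
figure
plot(x, myfcn(x) - PPeval(pp,x))    % pointwise error of the natural spline
grid
norm(myfcn(x) - PPeval(pp,x), inf)  % max error on the plotting grid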
8. A Review and Tour of Linear Algebra
Note: This really is a brief tour. For the most part, it is material from Math 314. Since
it is less likely that you saw a variety of norms and a development of the singular value
decomposition, I’m including a little more detail concerning these topics.
Solving Systems. Solving linear systems is a fundamental activity, practiced from high
school forward. Recall that a linear equation is one of the form
a1 x1 + a2 x2 + · · · + an xn = b,
where x1 , x2 , . . . , xn are the variables and a1 , a2 , . . . , an , b are known constants. A linear system
is a collection of linear equations, and a solution to the linear system is a choice of the variables
that satisfies every equation in the system. What can we expect?
Example 6. Use geometry to answer the question with one or more equations in 2 unknowns. Ditto 3 unknowns.
Basic Fact (Number of Solutions): A linear system of m equations in n unknowns
is either inconsistent (no solutions) or consistent, in which case it either has a unique solution or infinitely many solutions. In the former case, unknowns cannot exceed equations. If
the number of equations exceeds the number of unknowns, the system is called overdetermined. If the number of unknowns exceeds the number of equations, the system is called
underdetermined.
While we’re on this topic, let’s review some basic linear algebra as well:
The general form for a linear system of m equations in the n unknowns x1 , x2 , . . . , xn is
a_{11} x_1 + a_{12} x_2 + · · · + a_{1j} x_j + · · · + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + · · · + a_{2j} x_j + · · · + a_{2n} x_n = b_2
    ⋮
a_{i1} x_1 + a_{i2} x_2 + · · · + a_{ij} x_j + · · · + a_{in} x_n = b_i
    ⋮
a_{m1} x_1 + a_{m2} x_2 + · · · + a_{mj} x_j + · · · + a_{mn} x_n = b_m
Notice how the coefficients are indexed: in the ith row the coefficient of the jth variable, xj ,
is the number aij , and the right hand side of the ith equation is bi .
The statement “A = [aij ]” means that A is a matrix (rectangular table of numbers) whose
(i, j)th entry, i.e., entry in the ith row and jth column, is denoted by aij . Generally, the size
of A will be clear from context. If we want to indicate that A is an m × n matrix, we write
A = [aij ]m,n .
Similarly, the statement “b = [bi ]” means that b is a column vector (matrix with exactly
one column) whose ith entry is denoted by bi , and “c = [cj ]” means that c is a row vector
(matrix with exactly one row) whose jth entry is denoted by cj . In case the type of the vector
(row or column) is not clear from context, the default is a column vector.
How can we describe the matrices of the general linear system described above? First, there is the m × n coefficient matrix

      [ a_{11}  a_{12}  · · ·  a_{1j}  · · ·  a_{1n} ]
      [ a_{21}  a_{22}  · · ·  a_{2j}  · · ·  a_{2n} ]
      [   ⋮       ⋮              ⋮              ⋮    ]
  A = [ a_{i1}  a_{i2}  · · ·  a_{ij}  · · ·  a_{in} ]
      [   ⋮       ⋮              ⋮              ⋮    ]
      [ a_{m1}  a_{m2}  · · ·  a_{mj}  · · ·  a_{mn} ]
Notice that the way we subscripted entries of this matrix is really very descriptive: the first
index indicates the row position of the entry and the second, the column position of the entry.
Next, there is the m × 1 right hand side vector of constants

      [ b_1 ]
      [ b_2 ]
      [  ⋮  ]
  b = [ b_i ]
      [  ⋮  ]
      [ b_m ]
Finally, stack this matrix and vector alongside each other (we use a vertical bar below to separate the two symbols) to obtain the m × (n + 1) augmented matrix

                [ a_{11}  a_{12}  · · ·  a_{1j}  · · ·  a_{1n} | b_1 ]
                [ a_{21}  a_{22}  · · ·  a_{2j}  · · ·  a_{2n} | b_2 ]
                [   ⋮       ⋮              ⋮              ⋮    |  ⋮  ]
  Ã = [A | b] = [ a_{i1}  a_{i2}  · · ·  a_{ij}  · · ·  a_{in} | b_i ]
                [   ⋮       ⋮              ⋮              ⋮    |  ⋮  ]
                [ a_{m1}  a_{m2}  · · ·  a_{mj}  · · ·  a_{mn} | b_m ]
Matrix Arithmetic. Given any two matrices (or vectors) A = [aij ] and B = [bij ] of the
same size, we may add them or multiply by a scalar according to the rules
A + B = [aij + bij ]
and
cA = [caij ] .
For each size m×n we can form a zero matrix of size m×n: just use zero for every entry. This
is a handy matrix for matrix arithmetic. Matlab knows all about these arithmetic operations,
as we saw in our Matlab introduction. Moreover, there is a multiplication of matrices A and
B provided that A is m × p and B is p × n. The result is an m × n matrix given by the formula

AB = [ ∑_{k=1}^{p} a_{ik} b_{kj} ].
Again, Matlab knows all about matrix multiplication.
Another simple matrix operation: The transpose of a matrix A = [ai,j ] is a matrix AT =
[aj,i ] in which the rows of A have been transposed into columns.
Matlab knows these as well: use the apostrophe as in A'.
The Basics. There are many algebra rules for matrix arithmetic. We won't review all of them here, but one noteworthy fact is that matrix multiplication is not commutative: AB ≠ BA in general. Another is that there is no general cancellation of matrices: if AB = 0, it does not follow that A = 0 or B = 0. Another is that there is an identity matrix for any square size n × n: I_n is the matrix with ones down the diagonal entries and zeros elsewhere. For example,

        [ 1  0  0 ]
  I_3 = [ 0  1  0 ]
        [ 0  0  1 ]

We normally just write I, and n should be clear from context. Algebra fact: if A is m × n and B is n × m, then AI = A and IB = B.
One of the main virtues of matrix multiplication is that it gives us a symbolic way of representing the general linear system given above; namely, we can put it in the simple looking form

Ax = b,

where A and b are given as above and

      [ x_1 ]
      [ x_2 ]
      [  ⋮  ]
  x = [ x_i ]
      [  ⋮  ]
      [ x_n ]

is the (column) vector of unknowns. Actually, this is putting the cart before the horse because this idea actually inspires the whole notion of matrix multiplication. This form suggests a way of thinking about the solution to the system, namely,

x = b/A.

The only problem is that this doesn't quite make sense in the matrix world. But we can make it sensible by a little matrix algebra: imagine that there were a multiplicative inverse matrix A^{−1} for A, that is, a matrix such that

A^{−1} A = I = A A^{−1}.

Then we could multiply both sides of Ax = b on the left by A^{−1} and obtain that

x = A^{−1} b.

Matlab represents this equation by

x = A\b

In general, a square matrix A may not have an inverse. One condition for an inverse to exist is that the matrix have nonzero determinant, where this is a number associated with a matrix such as one saw in high school algebra with Cramer's rule. We won't go into details here. Of course, Matlab knows all about inverses and we can compute the inverse of A by issuing the commands A^(-1) or inv(A).
Sets of all matrices or vectors of a given size form "number systems" with a scalar multiplication and addition operation that satisfy a list of basic laws. Such a system is a vector space. For example, for vectors from R³, we have the operations

  [ a_1 ]   [ b_1 ]   [ a_1 + b_1 ]            [ a_1 ]   [ c a_1 ]
  [ a_2 ] + [ b_2 ] = [ a_2 + b_2 ]   and    c [ a_2 ] = [ c a_2 ]
  [ a_3 ]   [ b_3 ]   [ a_3 + b_3 ]            [ a_3 ]   [ c a_3 ]

and similar operations for n-vectors of the vector space R^n or the vector space of n × n matrices R^{n×n}. Note that we can also deal with complex numbers in more or less the same fashion as reals, giving rise to the complex vector spaces C^n and C^{n×n}.
BTW, see what a nuisance it is to write out column vectors – a space hog. Here’s a remedy.
My Shorthand: a tuple like c = (c1 , c2 , . . . , cn ) is understood to represent a column vector
[c1 , c2 , . . . , cn ]T .
A more exotic example of a vector space (I’m not bothering to give the formal definition) is
C[a, b] = {f (x) : [a, b] → R | f (x) is continuous on [a, b]}
together with the vector space operations of addition and scalar multiplication given by the
formulas
(f + g) (x) = f (x) + g (x)
(cf ) (x) = c (f (x)) .
With vector spaces come a steady flow of ideas that we won’t elaborate on here – such as
subspaces, linear independence, spanning sets, bases and many more.
Some key vector spaces associated with the m × n matrix A:
• Null space (kernel) of A: N(A) = {x ∈ R^n | Ax = 0}, a subspace of the vector space R^n.
• Range (column space) of A: R(A) = { y ∈ R^m | y = ∑_{j=1}^{n} c_j A_j }, the subspace of R^m consisting of linear combinations of the columns A_j of the matrix A (a.k.a. the span of these columns).

A Simple But Key Idea: one checks easily that if x = (x_1, x_2, . . . , x_n) and A = [A_1, A_2, . . . , A_n], then

Ax = A_1 x_1 + · · · + A_n x_n.

From this we obtain a concise description of R(A), namely that

R(A) = {Ax | x ∈ R^n}.
Norms. Before we get started, there are some useful ideas that we need to explore. We already
know how to measure the size of a scalar (number) quantity x: use |x| as a measure of its size.
Thus, we have a way of thinking about things like the size of an error. Now suppose we are
dealing with vectors and matrices.
Question: How do we measure the size of more complex quantities like vectors or matrices?
Answer: We use some kind of yardstick called a norm that assigns to each vector x a
non-negative number ∥x∥ subject to the following norm laws for arbitrary vectors x, y and
scalar c:
• For x ̸= 0, ∥x∥ > 0 and for x = 0, ∥x∥ = 0.
• ∥cx∥ = |c| ∥x∥.
• ∥x + y∥ ≤ ∥x∥ + ∥y∥.
Examples: For the vector space R^n and vector x = (x_1, x_2, . . . , x_n),
• 1-norm: ∥x∥₁ = |x_1| + |x_2| + · · · + |x_n|
• 2-norm (the default norm): ∥x∥₂ = ( x_1² + x_2² + · · · + x_n² )^{1/2}
• ∞-norm: ∥x∥_∞ = max(|x_1|, |x_2|, . . . , |x_n|)
Let's do some examples in Matlab, which knows all of these norms:
>A=[1,-3,2,-1], x=[2;0;-1;2]
>norm(x)
>norm(x,1)
>norm(x,2)
>norm(x,inf)
>norm(A,1)
>norm(A,2)
>norm(A,inf)
>norm(-2*x),abs(-2)*norm(x)
>norm(A*x),norm(A)*norm(x)
We can abstract the norm idea to more exotic vector spaces like C[a, b] in many ways. Here is one that is analogous to the infinity norm on R^n:

∥f∥ ≡ max_{a≤x≤b} |f(x)|.
Matrix Norms: For every vector norm ∥x∥ of n-vectors there is a corresponding induced norm on the vector space of n × n matrices A given by the formula

∥A∥ = max_{x≠0} ∥Ax∥/∥x∥ = max_{∥x∥=1} ∥Ax∥.

These induced matrix norms, also called operator norms in reference to the operator of matrix multiplication defined by a matrix, satisfy all the basic norm laws and, in addition:

∥Ax∥ ≤ ∥A∥ ∥x∥ (compatible norms)
∥AB∥ ≤ ∥A∥ ∥B∥ (multiplicative norm)

One can show directly that if A = [a_{ij}], then
• ∥A∥₁ = max_{1≤j≤n} ∑_{i=1}^{n} |a_{ij}|
• ∥A∥₂ = ( ρ(A^T A) )^{1/2}, where ρ(B) is the largest eigenvalue of the square matrix B in absolute value
• ∥A∥_∞ = max_{1≤i≤n} ∑_{j=1}^{n} |a_{ij}|
There is a more general definition of matrix norm that requires that the basic norm laws be satisfied as well as one more, the so-called multiplicative property:

∥AB∥ ≤ ∥A∥ ∥B∥

for all matrices conformable for multiplication. One such standard example is the Frobenius norm: given that A = [a_{ij}],

∥A∥_F = ( ∑_{i=1}^{m} ∑_{j=1}^{n} |a_{ij}|² )^{1/2}.

One can check that all the matrix norm properties are satisfied.
Inner Products. The 2-norm is special. It can be derived from another operation on
vectors called the inner product (or dot product). Here is how it works.
Definition. The inner product (dot product) of vectors u, v in Rn is the scalar
u · v = uT v.
This operation satisfies a long list of properties, called the inner product properties:
• u · u ≥ 0 with equality if and only if u = 0.
• u · v = v · u
• u · (v + w) = u · v + u · w
• c (u · v) = (cu) · v

For complex numbers and complex vector spaces the only change in the above list of properties is that u · v = \overline{v · u}. For example, for u, v in C^n, we could define u · v = \overline{u}^T v or u · v = u^T \overline{v}, depending on where we want to put the conjugation sign.
What really makes this idea important are the following facts. Here we understand that ∥u∥ means the 2-norm ∥u∥₂.
• ∥u∥ = (u · u)^{1/2}
• If θ is an angle between the vectors represented in a suitable coordinate space, then u · v = ∥u∥ ∥v∥ cos θ.
• Once we have the idea of angle between vectors, we can speak of orthogonal (perpendicular) vectors, i.e., vectors for which the angle between them is π/2 (90°).
• Given any two vectors u, v, with v ≠ 0, the formula

  proj_v u = ((u · v)/(v · v)) v

  defines a vector parallel to v such that u − proj_v u is orthogonal to v.
• We can introduce the concept of an orthogonal set of vectors: An orthogonal set of vectors is one such that elements are pairwise orthogonal. It turns out that orthogonal sets of nonzero vectors are always linearly independent. The Gram-Schmidt algorithm gives us a way of turning any linearly independent set into an orthogonal set with the same span (a small sketch follows this list).
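Here is a minimal Matlab sketch of the Gram-Schmidt algorithm just mentioned (my own illustration), operating on the columns of a matrix with linearly independent columns:
function Q = gramschmidt(V)
% usage: Q = gramschmidt(V)
% description: given a matrix V with linearly independent columns,
% return Q whose columns are pairwise orthogonal and have the same
% span, using the projection formula proj_v u = ((u.v)/(v.v)) v.
[m,n] = size(V);
Q = zeros(m,n);
for k = 1:n
  q = V(:,k);
  for j = 1:k-1
    % subtract the projection of V(:,k) onto the jth orthogonal vector
    q = q - (Q(:,j)'*V(:,k))/(Q(:,j)'*Q(:,j))*Q(:,j);
  end
  Q(:,k) = q;
end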
We can abstract the idea of inner product to more exotic vector spaces. Consider, for example, C[a, b] with the inner product (which we now denote with angle brackets instead of the dot, which is reserved for the standard vector spaces)

⟨f, g⟩ = ∫_a^b f(x) g(x) dx.

Of course, this gives rise to an induced norm

∥f∥ = ( ∫_a^b f(x)² dx )^{1/2}.
Eigenvectors and Eigenvalues. An eigenvalue for a square n × n matrix A is a number
λ such that for some NONZERO vector x, called an eigenvector for λ,
Ax = λx.
Eigenvalue Facts:
(1) The eigenvalues of A are precisely the roots of the nth degree polynomial
p (λ) = |λI − A| ,
where | · | denotes determinant.
(2) An n × n matrix A has exactly n eigenvalues, counting multiplicities and complex
numbers.
(3) The eigenvalues of αI + βA are just α + βλ, where λ runs over the eigenvalues of A.
(4) More generally, if r(λ) = p(λ)/q(λ) is any rational function, i.e., p, q are polynomials,
then the eigenvalues of r(A) are r(λ), where λ runs over the eigenvalues of A, provided
both expressions make sense.
(5) The spectral radius of A is just

    ρ(A) = max{|λ| : λ is an eigenvalue of A}.

(6) Suppose that x_0 is an arbitrary initial vector, {b_k} is an arbitrary sequence of uniformly bounded vectors, and the sequence {x_k}, k = 0, 1, 2, . . . , is given by the formula

    x_{k+1} = A x_k + b_k.

If ρ(A) < 1, then {x_k} is a uniformly bounded sequence of vectors. Such a matrix A is called a stable matrix.
(7) If ρ(A) > 1, then {x_k} will not be uniformly bounded for some choices of x_0.
(8) The matrix A is said to be diagonalizable if there exists an invertible matrix P and a diagonal matrix D such that

    P^{−1} A P = D = diag(λ_1, λ_2, . . . , λ_n).

Here the diagonal entries of D are precisely the eigenvalues of A.

Let's do a few calculations with matrices to illustrate the idea of eigenvalues and eigenvectors in Matlab, which knows all about these quantities.
> A = [3 1 0;-1 3 1;0 3 -2]
> eig(A)
> [V,D]=eig(A)
> v = V(:,1),lam = D(1,1)
> A*v,lam*v
> x = [1;-2;3]
> b = [2;1;0]
> x = A*x+b % repeat this line
Now let's demonstrate boundedness of an arbitrary sequence as above. Try the same with a suitable A:
> A = A/4
> max(abs(eig(A)))
> x = [1;-2;3]
> x = A*x+b % repeat this line
Condition Number of a Matrix. Condition number is only defined for invertible square
matrices.
Definition. Let A be a square invertible matrix. The condition number of A with respect to a matrix norm ∥·∥ is the number

cond(A) = ∥A∥ ∥A^{−1}∥.
Roughly speaking, the condition number measures the degree to which changes in A lead to
changes in solutions of systems Ax = b. A large condition number means that small changes
in A may lead to large changes in x. Of course this quantity is norm dependent.
One of the more important applications of the idea of a matrix norm is the famous Banach Lemma. Essentially, it amounts to a matrix version of the familiar geometric series encountered in calculus.
Theorem 5. Let M be a square matrix such that ∥M∥ < 1 for some operator norm ∥·∥. Then the matrix I − M is invertible. Moreover,

(I − M)^{−1} = I + M + M² + · · · + M^k + · · ·

and ∥(I − M)^{−1}∥ ≤ 1/(1 − ∥M∥).

Proof. Form the familiar telescoping series

(I − M)(I + M + M² + · · · + M^k) = I − M^{k+1},

so that

I − (I − M)(I + M + M² + · · · + M^k) = M^{k+1}.

Now by the multiplicative property of matrix norms and the fact that ∥M∥ < 1,

∥M^{k+1}∥ ≤ ∥M∥^{k+1} → 0 as k → ∞.

It follows that the matrix lim_{k→∞} (I + M + M² + · · · + M^k) = B exists and that I − (I − M)B = 0, from which it follows that B = (I − M)^{−1}. Finally, note that

∥I + M + M² + · · · + M^k∥ ≤ ∥I∥ + ∥M∥ + ∥M∥² + · · · + ∥M∥^k
                           ≤ 1 + ∥M∥ + ∥M∥² + · · · + ∥M∥^k
                           ≤ 1/(1 − ∥M∥).

Now take the limit as k → ∞ to obtain the desired result.
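A quick numerical illustration of the lemma in Matlab (my own, with an arbitrary small random matrix):
M = 0.1*rand(5);                      % a matrix with norm well below 1
norm(M)                               % check that ||M|| < 1
norm(inv(eye(5) - M))                 % the actual norm of (I - M)^(-1)
1/(1 - norm(M))                       % the Banach lemma upper bound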
In the case of an operator norm, the Banach lemma has a nice application.

Corollary 6. If A = I + N, where ∥N∥ < 1, then

cond(A) ≤ (1 + ∥N∥)/(1 − ∥N∥).
We leave the proof as an exercise.
Here is a very fundamental result for numerical linear algebra. The scenario: suppose that we desire to solve the linear system Ax = b, where A is invertible. Due to arithmetic error or possibly input data error, we end up with a value x + δx which solves exactly a "nearby" system (A + δA)(x + δx) = b + δb. (It can be shown by using an idea called "backward error analysis" that this is really what happens when many algorithms are used to solve a linear system.) The question is, what is the size of the relative error ∥δx∥/∥x∥? As long as the perturbation matrix ∥δA∥ is reasonably small, there is a very elegant answer.

Theorem 7. Suppose that A is invertible, Ax = b, (A + δA)(x + δx) = b + δb and ∥A^{−1} δA∥ = c < 1 with respect to some operator norm. Then A + δA is invertible and

∥δx∥/∥x∥ ≤ (cond(A)/(1 − c)) [ ∥δA∥/∥A∥ + ∥δb∥/∥b∥ ].

Proof. That the matrix I + A^{−1} δA is invertible follows from the hypothesis and the Banach lemma. Expand the perturbed equation to obtain

(A + δA)(x + δx) = Ax + δA·x + A·δx + δA·δx = b + δb.

Now subtract the terms Ax = b from each side and solve for δx to obtain

(A + δA) δx = A(I + A^{−1} δA) δx = −δA·x + δb,
so that

δx = (I + A^{−1} δA)^{−1} A^{−1} [ −δA·x + δb ].

Now take norms and use the additive and multiplicative properties and the Banach lemma to obtain

∥δx∥ ≤ (∥A^{−1}∥/(1 − c)) [ ∥δA∥ ∥x∥ + ∥δb∥ ].

Next divide both sides by ∥x∥ to obtain

∥δx∥/∥x∥ ≤ (∥A^{−1}∥/(1 − c)) [ ∥δA∥ + ∥δb∥/∥x∥ ].

Finally, notice that ∥b∥ ≤ ∥A∥ ∥x∥. Therefore, 1/∥x∥ ≤ ∥A∥/∥b∥. Replace 1/∥x∥ in the right hand side by ∥A∥/∥b∥ and factor out ∥A∥ to obtain

∥δx∥/∥x∥ ≤ (∥A^{−1}∥ ∥A∥/(1 − c)) [ ∥δA∥/∥A∥ + ∥δb∥/∥b∥ ],

which completes the proof, since by definition, cond A = ∥A^{−1}∥ ∥A∥.
If we believe that the inequality in the perturbation theorem can be sharp (it can!), then
it becomes clear how the condition number of the matrix A is a direct factor in how relative
error in the solution vector is amplified by perturbations in the coefficient matrix.
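To see this amplification in Matlab, one can experiment with a notoriously ill-conditioned example, the Hilbert matrix (my own illustration):
n = 10; A = hilb(n);                  % the n x n Hilbert matrix
cond(A)                               % a very large condition number
x = ones(n,1); b = A*x;               % a system with known solution
xhat = A\b;                           % computed solution
norm(x - xhat)/norm(x)                % relative error, amplified by cond(A)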
Singular Value Decomposition. Here is the fundamental result:
Theorem 8. Let A be an m×n real matrix. Then there exist m×m orthogonal matrix U , n×n
orthogonal matrix V and m × n diagonal matrix Σ with diagonal entries σ1 ≥ σ2 ≥ . . . ≥ σp ,
with p = min{m, n}, such that U T AV = Σ. Moreover, the numbers σ1 , σ2 , . . . , σp are uniquely
determined by A.
Proof. There is no loss of generality in assuming that n = min{m, n}. For if this is not the
case, we can prove the theorem for AT and by transposing the resulting SVD for AT , obtain
a factorization for A. Form the n × n matrix B = AT A. This matrix is symmetric and its
eigenvalues are nonnegative (we leave these facts as exercises). Because they are nonnegative,
we can write the eigenvalues of B in decreasing order as the squares of nonnegative
real numbers, say as σ_1^2 ≥ σ_2^2 ≥ · · · ≥ σ_n^2. Now we know from the Principal Axes Theorem
that we can find an orthonormal set of eigenvectors corresponding to these eigenvalues, say
Bv_k = σ_k^2 v_k, k = 1, 2, . . . , n. Let V = [v_1, v_2, . . . , v_n]. Then V is an orthogonal n × n matrix.
Next, suppose that σ_{r+1}, σ_{r+2}, . . . , σ_n are zero, while σ_r ≠ 0.
Next set u_j = (1/σ_j) Av_j, j = 1, 2, . . . , r. These are orthonormal vectors in R^m since
u_j^T u_k = (1/(σ_j σ_k)) v_j^T A^T A v_k = (1/(σ_j σ_k)) v_j^T B v_k = (σ_k^2/(σ_j σ_k)) v_j^T v_k = { 0, if j ≠ k; 1, if j = k }.
Now expand this set to an orthonormal basis u_1, u_2, . . . , u_m of R^m. This is possible by
standard theorems in linear algebra. Now set U = [u_1, u_2, . . . , u_m]. This matrix is orthogonal
and we calculate that if k > r, then u_j^T A v_k = 0 since Av_k = 0, and if k ≤ r, then
u_j^T A v_k = σ_k u_j^T u_k = { 0, if j ≠ k; σ_k, if j = k }.
It follows that U^T AV = [u_j^T A v_k] = Σ.
Finally, if U, V are orthogonal matrices such that U^T AV = Σ, then A = U Σ V^T and
therefore
B = A^T A = V Σ^T U^T U Σ V^T = V Σ^T Σ V^T
so that the squares of the diagonal entries of Σ are the eigenvalues of B. It follows that the
numbers σ1 , σ2 , . . . , σn are uniquely determined by A.
□
The numbers σ_1, σ_2, . . . , σ_p are called the singular values of the matrix A, the columns of
U are the left singular vectors of A, and the columns of V are the right singular vectors
of A.
There is an interesting geometrical interpretation of this theorem from the perspective of
linear transformations and change of basis as developed in Section 3.7 of [1]. It can be
stated as follows.
Corollary 9. Let T : Rn → Rm be a linear transformation with matrix A with respect to the
standard bases. Then there exist orthonormal bases u1 , u2 , . . . , um and v1 , v2 , . . . , vn of Rm
and Rn , respectively, such that the matrix of T with these bases is diagonal with nonnegative
entries down the diagonal.
Proof. First observe that if U = [u1 , u2 , . . . , um ] and V = [v1 , v2 , . . . , vn ], then U and V
are the change of basis matrices from the standard bases to the bases u1 , u2 , . . . , um and
v1 , v2 , . . . , vn of Rm and Rn , respectively. Also, U −1 = U T . Now apply a change of basis
theorem and the result follows.
□
We leave the following fact as an exercise.
Corollary 10. Let U T AV = Σ be the SVD of A and suppose that σr ̸= 0 and σr+1 = 0. Then
(1) rank A = r
(2) N (A) = span{vr+1 , vr+2 , . . . , vn }
(3) range A = span{u1 , u2 , . . . , ur }
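A quick way to see Corollary 10 numerically (a sketch, not from the original notes) is to read off the rank, null space, and range from Matlab's built-in svd; the rank-2 test matrix is arbitrary.
% A sketch of Corollary 10 via the SVD.
A = [1 2 3; 2 4 6; 1 0 1];          % a rank-2 example (row 2 = 2*row 1)
[U,S,V] = svd(A);
r = nnz(diag(S) > 1e-12)            % numerical rank = number of nonzero sigmas
V(:, r+1:end)                       % columns span N(A)
U(:, 1:r)                           % columns span range(A)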
We have only scratched the surface of the many facets of the SVD. Like most good ideas,
it is rich in applications. We mention one more. It is based on the following fact, which can
be proved by examining the entries of A = U ΣV T : The matrix A of rank r can be written as
a sum of r rank one matrices, namely
A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + · · · + σ_r u_r v_r^T
where the σ_k, u_k, v_k are the singular values and the left and right singular vectors, respectively. In
fact, it can be shown that this representation is the most economical in the sense that the
partial sums
A_k = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + · · · + σ_k u_k v_k^T,   k = 1, 2, . . . , r,
give the closest rank-k approximation to A (in either the 2-norm or the Frobenius norm) among
all matrices of rank at most k.
Thus, if A has a small number of relatively large singular values, say k of them, then we can
approximate A pretty well by Ak .
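As an illustration (a sketch using Matlab's built-in svd; the test matrix below is just an example with rapidly decaying singular values), the 2-norm error of the partial sum A_k equals σ_{k+1}:
% A sketch of the rank-k approximation A_k and its error.
A = hilb(8);
[U,S,V] = svd(A);
k = 3;
Ak = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';   % A_k = sum of the first k rank-one terms
norm(A - Ak)                          % equals the (k+1)st singular value
S(k+1,k+1)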
9. A Tour of Matlab
What is Matlab? Matlab is
(1) An interactive system for numerical computation.
(2) A programmable system for numerical computation.
(3) A front-end for state-of-the-art libraries, currently LAPACK and optimized BLAS.
History:
• Late 70s: Cleve Moler writes Matlab under an NSF grant as a front-end for the then
state-of-the-art numerical linear algebra packages LINPACK and EISPACK.
• Early 80s: Moler founds the MathWorks and introduces a commercial version of Matlab
running on a PC and written in Fortran.
• Early 90s: Matlab is rewritten in C and “handle graphics” is added into Matlab.
• Late 90s: Matlab gains object-oriented programming features and moves to LAPACK and optimized BLAS.
An excellent reference for this material is the superb text (recommended but not required) by
Desmond Higham and Nicholas Higham, MATLAB Guide, SIAM, Philadelphia, 2000.
Note: Except for graphics, you can do most of the Matlab work for this course with Octave,
an open source program available under Linux – there is a Windows version, too, but it may
require some work to set it up.
A Tutorial Introduction to Matlab. Here we go.... Our procedure in all the lectures
will be to work our way through the commands that are listed in the narrative that follows.
Separately one should open a Matlab window for command execution. The prompt sign “>
” and typewriter text indicate a command to be typed in by the instructor or reader at the
Matlab command prompt. Comments will not be preceded by the prompt and will appear in
normal text. I will make comments on the commands that are executed as we read through the
lecture, and students are invited to ask questions as we proceed. In the body of the Matlab
command listing there will occasionally be a “%” sign. This is Matlab’s way of starting a
comment, and all text until the next end of line is ignored by the Matlab interpreter. It isn’t
necessary to type in these comments in order to work through this lesson in a Matlab session.
They are inserted to give a little more explanation to the meaning of the commands for those
who are reading this file and executing the commands themselves.
How to handle a Matlab assignment. Well, do it, of course. The real question is: how do I
turn it in? I suggest that you start up Matlab and do the work you need to do. Once you
actually know what you’re doing, redo it with an eye to recording your work and turning it in
to me. To keep a record of your work, issue the following command to Matlab
> diary ’myfile’
(Caution: make sure your current working directory is a place where you can write files – use
the icons above the workspace area to change working directories.) Matlab will then send a
copy of all your typed input and the output to a file called ’myfile.’ For clarity, use descriptive
names for your files, such as ’jsmithasgn1’ so that when I save the files that you email me, I
can tell what it’s about by the title. If at any point you want to stop the diary feature, issue
the command
> diary off
To resume the diary feature, simply type
> diary on
This will cause input to be appended to myfile. You can make comments in your homework
file by typing % at the command line and this too will be recorded. For example
> % This is a comment.
Be sure to start your file with the comments
> % Name: yourname
> % Email: your email address
When you end your session, the file will be closed and you can view it and even edit it with a
text editor. As a matter of fact, you can even edit and view it with the Matlab Editor. Just
type
> edit myfile
When you have finished the assignment, you can paste this text into another document.
Starting and stopping. Start up Matlab and quit after a few calculations and getting some
help.
> 1+1
> 1+1;
> pi
> % to get help on a command or built-in variable just type:
> help pi % get help on built-in variable pi
> help quit % get help on built-in command
> helpwin % or get help browser style
> quit
A calculator. Restart and do the following:
> x=0.3
> sin(x)
> exp(x)*sin(x)/log(x)
> 2^24
> (1+sqrt(5))/2
> x
> format long
> x
> format
> z = 2 -3*i % complex numbers are no problem either
> z^2
> exp(z)
Simple plotting. Matlab’s plotting, unlike a CAS like Maple, does not plot a function. Rather,
it does a dot-to-dot on vectors of x- and y-coordinates. Most Matlab built-in scalar functions
know how to handle a vector argument by applying the function to each coordinate.
(Correction: Matlab used to not plot functions directly. There is now a function called ’ezplot’ whose
expanded capabilities do just that.)
> x = 0: 0.1: 3;
> y = sin(x);
> plot(x,y);
> z = x - x.^2/2 + x.^3/6;
> plot(x,z)
> % close the plot window, then
> plot(x,y)
> hold on
> plot(x,z)
Functions and scripts. There are two kinds of text files that can be “run” under Matlab, both
of which have the suffix “.m”. The first kind is a script file. This file contains a list of commands
that one could execute at the command line. It is rather like a program in C or Fortran, but
not as capable. For one thing, there is no division of the program into a “main” section and various
subroutines. You cannot include function definitions in a script file. Functions, that is, value-returning
subroutines, are what function files, the second kind of executable text file, are good for.
Make sure a copy of chebyfcns.m is in your working directory and type
> edit chebyfcns
This file shows some of the things you can (and can’t) do in Matlab. Multiple functions are
a no-no, unless some are private, but this file would actually run just fine under Octave.
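For readers who want a self-contained example, here is a minimal function file (a hypothetical file named addsquares.m, not one of the course files); save it in the working directory:
function s = addsquares(x, y)
% ADDSQUARES  Return the sum of the squares of x and y (works elementwise).
s = x.^2 + y.^2;
Then, at the command prompt:
> addsquares(3,4)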
Numbers. Start Matlab. Matlab is a fine calculator. But as with any calculator, one has to
be careful:
> ((2/7+1000)-1000)-2/7
Multiple commands are possible
> x=sin(.3),y=cos(.3);x^2+y^2
And of course help is always available, specific or general:
> help sqrt
> helpwin
There is another handy facility that is a word search function:
> lookfor elliptic
One always has access to the last output:
> exp(-3)
> x=ans;
> disp(x)
And one can always load and save variables:
> save ’junkfile’ x y
> % Now clear all variables:
> clear
> who
> load ’junkfile’ x
> who
Arithmetic. Now a few words about arithmetic, which conforms to double precision IEEE
standard. Here eps is the distance from 1 to the next floating point number:
> eps
> 1+eps
> format long
> 1+eps
> format hex
> 1
> 1+eps
> 1+2*eps
> format
There are other kinds of numbers. Just for the record:
> realmax
> realmin
> realmin/2
> realmax/realmin
> 0/0 % indeterminate...see what Matlab reports
> cos(0)/sin(0)
Matrices. This is where Matlab really excels in a way that other computer packages don’t.
Building and manipulating matrices is quite easy with Matlab:
> A = [1 3 2; 2 1 1; 4 0 2] % create a matrix
> size(A)
> det(A)
> eye(3)
> inv(A)
> inv(A)*A,A*inv(A)
> A^(-1)
> A^2
> A + eye(3)
> A - 2*eye(3)
> (1:3) % create a vector using the colon operator
> b = (1:3)’
> x = A\b % solve system A*x = b
> x = A^(-1)*b % solve it using inverse of A
> (1:3) % create a vector using the colon operator
> v = (1:3)’
> eig(A) % calculate eigenvalues of A
> [V,lam]=eig(A) % eigenvectors and eigenvalues
> A*V(:,2)-lam(2,2)*V(:,2) % check eigen relation
> V^(-1)*A*V-lam % better check of eigen relation
> zeros(2) % create a 2x2 matrix of zeros
> zeros(2,1) % create a 2x1 matrix of zeros
> ones(3)
> ones(3,2)
> eye(3) % create a 3x3 identity matrix
> eye(2,3)
> rand(3)
> A=rand(3)-0.5 % create a 3x3 matrix of random numbers in [-0.5,0.5]
> size(A)
> x=ones(1,5)
> size(x)
> length(x)
Of course you can build and reshape matrices from the command line in many ways:
> a = [1 2 3
> 4, 5, 7
> 2,4, 6]
> a = [1 2 3; 4, 5, 7; 2,4, 6]
One accesses entries or changes them with a standard mathematical notation:
> a(1,3)
> a(1,3) = 10
> a(1,3)
One can access a vector (row or column) by a single coordinate:
> y = x’ % the prime sign ’ performs (Hermitian) transpose operation
> x(3)
> y(4) = 7
One can build matrices in blocks:
> b = [eye(3) a; a eye(3)]
> c = repmat(b,2,3)
The all-important colon notation gets used in two different ways. First as a separator in a
type of vector constructor:
> x = 1:5
> y = 1:2:5
> z = (1:0.5:5)’
The other principal use of the colon notation is as a wild card of sorts: a colon in the row
(or column) position means “all rows” (or “all columns”). For example:
> c = b(:, 3:5)
Matrix manipulations:
> triu(a)
> tril(a)
> diag(a)
> diag(diag(a))
> reshape(a,1,9)
Matrix constructions (there is a huge number of special matrices that can be constructed by
a special command.) A sampling:
> toeplitz((1:5),(1:5).^2)
> hilb(6)
> vander((1:6))
Multidimensional arrays: Matlab can deal with arrays that are more than two dimensional.
Try the following
> a = [1 2; 3 4]
> a(:,:,2) = [1 1;2 2]
> a
A final note on matrices: one can even use a convenient Array Editor to modify matrices.
Do this: if the Workspace window is not already visible, click on the View button, then check
Workspace. Now double-click on the variable ’a’, a matrix we created earlier. The Array
Editor will open up. Edit a few entries and close the Editor. Confirm your changes by typing
at the command line:
> a
Objects. Objects are instances of classes. Matlab has some OOP capabilities, but for now
we’re going to confine our attention to some fairly simple types of objects. Matlab has five
built-in classes of objects, one of which we’ve already seen:
• double: double-precision floating point numeric matrix or array
• sparse: two-dimensional real (or complex) sparse matrix
• char: character array
• struct: structure array
• cell: cell array
Various Matlab toolboxes provide additional class definitions.
Of course, we’ve seen lots of doubles. Here’s another very familiar sort of object, namely a
string object:
s = ’Hello world’
size(s)
s
disp(s)
The struct object is what it sounds like, a way to create structures and access them. There
are two ways to build a structure: command line assignments or the struct command. What
actually gets constructed is a structure array. Try the following
> record.name = ’John Doe’
> record.hwkscore = 372
> record.ssn = [111 22 3333]
> record
> record(2).name = ’Mary Doe’;
> record(2).hwkscore = 40;
> record(2).ssn = [222 33 4444];
> record
> record(1).hwkscore
Finally, a cell object is an array whose elements can be any other object. For example:
> A(1,1) = {[1 2; 3 4]};
> A(1,2) = {’John Smith’}
> A(2,1) = {249}
> A(2,2) = {[1;2;3;4]}
> A{1,2}
> A(1,2)
Basic Graphics. Start with some simple plotting. Matlab’s plotting, unlike a CAS like Maple,
does not plot a function. Rather, it does a dot-to-dot on vectors of x- and y-coordinates. Most
Matlab built-in scalar functions know how to handle a vector argument by applying the function
to each coordinate.
> x = 0: 0.1: 3;
> y = sin(x);
> plot(x,y);
> z = x - x.^2/2 + x.^3/6;
> plot(x,z)
> % close the plot window, then
> plot(x,y)
> hold on
> plot(x,z)
Next, we’ll get help to guide us along. The plot command has many options. We’ll hit the
highlights. We’ll also keep a help window open so we can peruse the help files as we go. After
we open it, we’ll move it to a corner.
helpwin
Now click on plot in the Help Window and examine the possibilities for the 1-3 character
string S.
x = 0:0.01:1; % create array of abscissas and suppress output
y1 = 4*x.*(1- x);
plot(x, y1, ’r+:’)
A plot with multiple curves is possible without doing a “hold on”:
y2 = sin(pi*x);
plot(x, y1, ’r+:’, x, y2, ’g’)
We can make it fancier. For example
xlabel(’x’)
ylabel(’y’)
title(’Comparison of 4x(1-x) and sin(\pi x).’)
As a matter of fact, you can massage your plot window quite a bit with the tools available in
the tool bar. For example, click on the arrow icon and then draw an arrow on the inside of
the curves pointing to the inner curve. Then click on the A icon (A for ascii) and click at the
base of the arrow. Then type in “4*x*(1-x)”. There are lots of other things one can do to a
graph from the graph window itself.
You can even save your figures to many different formats. Click on the File button, then
the menu choice Export. Select under “Save as type:” the choice Portable Document Format.
Then browse to the directory you want, edit the name of the file, say to “junk.pdf.” Then click
on Save. Now use the Windows Explorer to find the file you’ve created and double click on it.
Acrobat will now show you your picture.
There are a number of alternates to the plot command as well. Look in the Help Window.
There’s even an ezplot which is a Maple rip-off (or is it the other way around?). Try these:
ezplot(’4*x*(1-x)’)
polar(x,y2)
Interesting. Let’s stretch it out.
x = 0:0.01:60;
y2 = sin(pi*x);
polar(x,y2)
Axes and other controls. There are other ways to massage your graph. For example, start
with
ezplot(’4*x*(1-x)’)
hold on
ezplot(’sin(pi*x)’)
Not very good. Let’s fiddle it a bit:
axis equal
axis off
axis([0 1 0 1.2])
axis on
Well, let’s just do it again:
x = 0:.01:1;
y1 = 4*x.*(1-x);
y2 = sin(pi*x);
plot(x,y1,’--’, x, y2, ’r:’)
legend(’ 4x(1-x)’, ’sin(\pi x)’)
title(’\it Comparison of 4x(1-x) and sin(\pi x)’)
What do you notice about the graph?
Multiple plots. Here is the way to construct multiple plots, along with another variation on
plot:
subplot(2,2,1)
plot(x,y1)
subplot(222)
fplot(’sin(x)’, [0 1])
subplot(223)
fplot(’sin(round(2*pi*x))’, [0 1], ’r--’)
subplot(224), polar(x,y2)
A Bit of Approximation. OK, we’re ready to do something serious. First recall the Weierstrass theorem (or open up the file VectorSpaces.pdf and look at the last section). Let’s start
by viewing polynomial approximations to the function f(x) = sin x, −π ≤ x ≤ π. In Matlab,
we represent a polynomial by a vector of coefficients in descending order. Thus x3 − 2x + 3
would be represented as [1, 0, −2, 3]. Here we go:
x = (-pi:0.1:pi)’;
y = sin(x);
plot(x,y)
grid, hold on
tpoly = [1/120,0,-1/6,0,1,0] % start with a 5th degree Taylor about 0
ty = polyval(tpoly,x);
plot(x,ty);% not bad, sort of
n = 3;
nodes = linspace(-pi,pi,n+1)’;
fnodes = sin(nodes);
ipoly = vander(nodes)\fnodes;
iy = polyval(ipoly,x);
plot(x,iy);% what do you think?
% play fair and do this again with n = 5
% now start over with the function f(x) = 1/(1+x^2) on the interval [-5,5]
% use y = 1./(1+x.^2); in place of y = sin(x);
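Here is a sketch of the suggested follow-up experiment with Runge’s function (the same workflow as above; the choice n = 10 is just one value to try):
% A sketch of the Runge experiment on [-5,5]; try several values of n.
x = (-5:0.1:5)';
y = 1./(1+x.^2);
plot(x,y), grid, hold on
n = 10;                               % also try smaller and larger n
nodes = linspace(-5,5,n+1)';
fnodes = 1./(1+nodes.^2);
ipoly = vander(nodes)\fnodes;
plot(x, polyval(ipoly,x))             % note the oscillation near the endpoints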
References
[1] M.J.D. Powell, Approximation Theory and Methods, Cambridge University Press, Cambridge, 1981.
[2] T. Shores, Applied Linear Algebra and Matrix Analysis, Springer, New York, 2007.
[3] L. Trefethen and J. Weideman, The exponentially convergent trapezoidal rule, SIAM Rev., 56 (2014), pp.
385–458.