
LINEAR ALGEBRA NOTES FOR APPROXIMATION THEORY
MATH 441, FALL 2009
THOMAS SHORES
Last Rev.: 9/1/09
1. Vector Spaces
Here is the general definition of a vector space:

Definition 1. An (abstract) vector space is a nonempty set V of elements called vectors, together with operations of vector addition (+) and scalar multiplication (·), such that the following laws hold for all vectors u, v, w ∈ V and scalars a, b ∈ F:
(1): (Closure of vector addition) u + v ∈ V.
(2): (Commutativity of addition) u + v = v + u.
(3): (Associativity of addition) u + (v + w) = (u + v) + w.
(4): (Additive identity) There exists an element 0 ∈ V such that u + 0 = u = 0 + u.
(5): (Additive inverse) There exists an element −u ∈ V such that u + (−u) = 0 = (−u) + u.
(6): (Closure of scalar multiplication) a · u ∈ V.
(7): (Distributive law) a · (u + v) = a · u + a · v.
(8): (Distributive law) (a + b) · u = a · u + b · u.
(9): (Associative law) (ab) · u = a · (b · u).
(10): (Monoidal law) 1 · u = u.

About notation: just as in matrix arithmetic, for vectors u, v ∈ V, we understand that u − v = u + (−v). We also suppress the dot (·) of scalar multiplication and usually write au instead of a · u.
About scalars: the only scalars that we will use in 441 are the real numbers R.
Definition 2. Given a positive integer n, we define the standard vector space of dimension n over the reals to be the set of vectors

Rn = {(x1, x2, . . . , xn) | x1, x2, . . . , xn ∈ R}

together with the standard vector addition and scalar multiplication. (Recall that (x1, x2, . . . , xn) is shorthand for the column vector [x1, x2, . . . , xn]^T.)
We see immediately from the definition that the required closure properties of vector addition and scalar multiplication hold, so these really are vector spaces in the sense defined above. The standard real vector spaces are often called the real Euclidean vector spaces once the notion of a norm (a notion of length covered in the next chapter) is attached to them.
Example 3. Let C[a, b] denote the set of all real-valued functions that are continuous on the interval [a, b] and use the standard function addition and scalar multiplication for these functions. That is, for f(x), g(x) ∈ C[a, b] and real number c, we define the functions f + g and cf by

(f + g)(x) = f(x) + g(x)
(cf)(x) = c(f(x)).

Show that C[a, b] with the given operations is a vector space.

We set V = C[a, b] and check the vector space axioms for this V. For the rest of this example, we let f, g, h be arbitrary elements of V. We know from calculus that the sum of any two continuous functions is continuous and that any constant times a continuous function is also continuous. Therefore the closure of addition and that of scalar multiplication hold. Now for all x such that a ≤ x ≤ b, we have from the definition and the commutative law of real number addition that
(f + g)(x) = f (x) + g(x) = g(x) + f (x) = (g + f )(x).
Since this holds for all x, we conclude that f + g = g + f, which is the commutative law of vector addition. Similarly,
((f + g) + h)(x) = (f + g)(x) + h(x) = (f (x) + g(x)) + h(x)
= f (x) + (g(x) + h(x)) = (f + (g + h))(x).
Since this holds for all x, we conclude that (f + g) + h = f + (g + h), which is the associative law for addition of vectors.
Next, if 0 denotes the constant function with value 0, then for any f ∈ V we have that for all a ≤ x ≤ b,

(f + 0)(x) = f(x) + 0 = f(x).

(We don't write the zero element of this vector space in boldface because it's customary not to write functions in bold.) Since this is true for all x we have that f + 0 = f, which establishes the additive identity law. Also, we define (−f)(x) = −(f(x)) so that for all a ≤ x ≤ b,

(f + (−f))(x) = f(x) − f(x) = 0,

from which we see that f + (−f) = 0. The additive inverse law follows. For the distributive laws note that for real numbers c, d and continuous functions f, g ∈ V, we have that for all a ≤ x ≤ b,
c(f + g)(x) = c(f (x) + g(x)) = cf (x) + cg(x) = (cf + cg)(x),
which proves the first distributive law. For the second distributive law, note that for all a ≤ x ≤ b,
((c + d)g)(x) = (c + d)g(x) = cg(x) + dg(x) = (cg + dg)(x),
and the second distributive law follows. For the scalar associative law, observe that for all a ≤ x ≤ b,
((cd)f )(x) = (cd)f (x) = c(df (x)) = (c(df ))(x),
so that (cd)f = c(df), as required. Finally, we see that

(1f)(x) = 1f(x) = f(x),

from which we have the monoidal law 1f = f. Thus, C[a, b] with the prescribed operations is a vector space.
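The pointwise operations of Example 3 are easy to experiment with on a computer, since elements of C[a, b] are just ordinary functions. Here is a minimal Python sketch; the helper names add and scale are our own, not from any library:

    import math

    # Vectors in C[a, b] are functions; addition and scalar multiplication
    # are defined pointwise, exactly as in Example 3.
    def add(f, g):
        return lambda x: f(x) + g(x)

    def scale(c, f):
        return lambda x: c * f(x)

    f, g = math.sin, math.cos
    h = add(f, scale(2.0, g))                    # the "vector" sin + 2 cos
    print(h(0.5), math.sin(0.5) + 2 * math.cos(0.5))  # same value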
We could have abbreviated the work above by using the Subspace Test. Recall
from Math 314:
Definition 4. A subspace of the vector space V is a subset W of V such that W, together with the binary operations it inherits from V, forms a vector space (over the same field of scalars as V) in its own right.

Given a subset W of the vector space V, we can apply the definition of vector space directly to the subset W to obtain the following very useful test.
Theorem 5. Let W be a subset of the vector space V. Then W is a subspace of V
if and only if
(1): W contains the zero element of V.
(2): (Closure of addition) For all u, v ∈ W, u + v ∈ W.
(3): (Closure of scalar multiplication) For all u ∈ W and scalars c, cu ∈ W.
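The Subspace Test cannot be proved by computation, but its closure conditions can at least be spot-checked numerically for a candidate subspace. A hedged Python sketch, using our own example W = {(x, y, z) ∈ R³ | x + y + z = 0}:

    import numpy as np

    rng = np.random.default_rng(0)

    def in_W(v, tol=1e-12):
        # W = {(x, y, z) : x + y + z = 0}
        return abs(v.sum()) < tol

    def sample_W():
        # two free parameters; the third coordinate is forced by the equation
        x, y = rng.standard_normal(2)
        return np.array([x, y, -x - y])

    assert in_W(np.zeros(3))               # condition (1): 0 is in W
    for _ in range(1000):
        u, v, c = sample_W(), sample_W(), rng.standard_normal()
        assert in_W(u + v)                 # condition (2): closure of addition
        assert in_W(c * u)                 # condition (3): closure of scalar mult.
    print("all subspace-test spot checks passed")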
Finally, we recall some key ideas about linear combinations of vectors.
Definition 6. Let v1, v2, . . . , vn be vectors in the vector space V. The span of these vectors, denoted by span{v1, v2, . . . , vn}, is the subset of V consisting of all possible linear combinations of these vectors, i.e.,

span{v1, v2, . . . , vn} = {c1v1 + c2v2 + · · · + cnvn | c1, c2, . . . , cn are scalars}.
Recall that one can show that spans are always subspaces of the containing vector
space. A few more fundamental ideas:
Definition 7. The vectors v1, v2, . . . , vn are said to be linearly dependent if there exist scalars c1, c2, . . . , cn, not all zero, such that

(1.1) c1v1 + c2v2 + · · · + cnvn = 0.

Otherwise, the vectors are called linearly independent.

Definition 8. A basis for the vector space V is a spanning set of vectors v1, v2, . . . , vn that is a linearly independent set.
In particular, we have the following key facts about bases:
Theorem 9. Let v1 , v2 , . . . , vn be a basis of the vector space V . Then every v ∈ V
can be expressed uniquely as a linear combination of v1 , v2 , . . . , vn , up to order of
terms.
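For concreteness, in Rn the coefficients promised by Theorem 9 can be found by solving a linear system whose coefficient matrix has the basis vectors as columns. A minimal numerical sketch; the basis is our own example:

    import numpy as np

    # the columns of B form a basis of R^3
    B = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])
    v = np.array([2.0, 3.0, 4.0])

    c = np.linalg.solve(B, v)        # coordinates of v relative to this basis
    print(c, np.allclose(B @ c, v))  # the unique solution reproduces v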
Definition 10. The vector space V is called finite-dimensional if V has a finite spanning set.
It's easy to see that every finite-dimensional space has a basis: simply start with a spanning set and throw away elements until you have reduced the set to a minimal spanning set. Such a set will be linearly independent.
Theorem 11. Let w1 , w2 , . . . , wr be a linearly independent set in the space V and
let v1 , v2 , . . . , vn be a basis of V. Then r ≤ n and we may substitute all of the wi 's
for r of the vj 's in such a way that the resulting set of vectors is still a basis of V.
As a consequence of the Steinitz substitution principle above, we obtain this
central result:
Theorem 12. Let V be a nite-dimensional vector space. Then any two bases of
V have the same number of elements, which is called the dimension of the vector
space and denoted by dim V.
Exercises
Exercise 1. Show that the subset P of C[a, b] consisting of all polynomial functions is a subspace of C[a, b] and that the subset Pn consisting of all polynomials of degree at most n on the interval [a, b] is a subspace of P.

Exercise 2. Show that Pn is a finite dimensional vector space of dimension n + 1, but that P is not a finite dimensional space, that is, does not have a finite vector basis (linearly independent spanning set).
2. Norms
The basic definition:

Definition 13. A norm on the vector space V is a function ‖·‖ that assigns to each vector v ∈ V a real number ‖v‖ such that for c a scalar and u, v ∈ V the following hold:
(1): ‖u‖ ≥ 0 with equality if and only if u = 0.
(2): ‖cu‖ = |c| ‖u‖.
(3): (Triangle Inequality) ‖u + v‖ ≤ ‖u‖ + ‖v‖.

A vector space V, together with a norm ‖·‖ on the space V, is called a normed (linear) space. If u, v ∈ V, the distance between u and v is defined to be d(u, v) = ‖u − v‖.
Notice that if V is a normed space and W is any subspace of V, then W automatically becomes a normed space if we simply use the norm of V on elements of W. Obviously all the norm laws still hold, since they hold for elements of the bigger space V.
Of course, we have already studied some very important examples of normed spaces, namely the standard vector space Rn, or any subspace thereof, together with the standard norms given by

‖(z1, z2, . . . , zn)‖ = (|z1|² + |z2|² + · · · + |zn|²)^(1/2).

Since our vectors are real, we can drop the conjugate bars. This norm is actually one of a family of norms that are commonly used.
Definition 14. Let V be one of the standard spaces Rn and p ≥ 1 a real number. The p-norm of a vector in V is defined by the formula

‖(z1, z2, . . . , zn)‖p = (|z1|^p + |z2|^p + · · · + |zn|^p)^(1/p).

Notice that when p = 2 we have the familiar example of the standard norm. Another important case is that in which p = 1. The last important instance of a p-norm is one that isn't so obvious: p = ∞. It turns out that the value of this norm is the limit of p-norms as p → ∞. To keep matters simple, we'll supply a separate definition for this norm.
Definition 15. Let V be one of the standard spaces Rn or Cn. The ∞-norm of a vector in V is defined by the formula

‖(z1, z2, . . . , zn)‖∞ = max{|z1|, |z2|, . . . , |zn|}.
Example 16. Calculate ‖v‖p, where p = 1, 2, or ∞ and v = (1, −3, 2, −1) ∈ R4.

We calculate:

‖(1, −3, 2, −1)‖1 = |1| + |−3| + |2| + |−1| = 7
‖(1, −3, 2, −1)‖2 = (|1|² + |−3|² + |2|² + |−1|²)^(1/2) = √15
‖(1, −3, 2, −1)‖∞ = max{|1|, |−3|, |2|, |−1|} = 3.
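These hand computations are easy to reproduce with numpy, whose norm routine implements exactly the p-norms defined above:

    import numpy as np

    v = np.array([1.0, -3.0, 2.0, -1.0])
    print(np.linalg.norm(v, 1))        # 7.0
    print(np.linalg.norm(v, 2))        # sqrt(15) ~ 3.8730
    print(np.linalg.norm(v, np.inf))   # 3.0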
It may seem a bit odd at first to speak of the same vector as having different lengths. You should take the point of view that choosing a norm is a bit like choosing a measuring stick. If you choose a yard stick, you won't measure the same number as you would by using a meter stick on an object.
Example 17. Verify that the norm properties are satisfied for the p-norm in the case that p = ∞.

Let c be a scalar, and let u = (z1, z2, . . . , zn) and v = (w1, w2, . . . , wn) be two vectors. Any absolute value is nonnegative, and any vector whose largest component in absolute value is zero must have all components equal to zero. Property (1) follows. Next, we have that

‖cu‖∞ = ‖(cz1, cz2, . . . , czn)‖∞
= max{|cz1|, |cz2|, . . . , |czn|}
= |c| max{|z1|, |z2|, . . . , |zn|} = |c| ‖u‖∞,

which proves (2). For (3) we observe that

‖u + v‖∞ = max{|z1 + w1|, |z2 + w2|, . . . , |zn + wn|}
≤ max{|z1| + |w1|, |z2| + |w2|, . . . , |zn| + |wn|}
≤ max{|z1|, |z2|, . . . , |zn|} + max{|w1|, |w2|, . . . , |wn|}
= ‖u‖∞ + ‖v‖∞.
2.1. Unit Vectors. Sometimes it is convenient to deal with vectors whose length is one. Such a vector is called a unit vector. We saw in Chapter 3 that it is easy to concoct a unit vector u in the same direction as a given nonzero vector v when using the standard norm, namely take

(2.1) u = v / ‖v‖.

The same formula holds for any norm whatsoever because of norm property (2).
Example 18. Construct a unit vector in the direction of v = (1, −3, 2, −1), where the 1-norm, 2-norm, and ∞-norms are used to measure length.

We already calculated each of the norms of v. Use these numbers in equation (2.1) to obtain unit-length vectors

u1 = (1/7)(1, −3, 2, −1)
u2 = (1/√15)(1, −3, 2, −1)
u∞ = (1/3)(1, −3, 2, −1).
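The normalization formula (2.1) is the same for every norm; only the scaling constant changes. A short numpy check that each resulting vector has length one in its own norm:

    import numpy as np

    v = np.array([1.0, -3.0, 2.0, -1.0])
    for p in (1, 2, np.inf):
        u = v / np.linalg.norm(v, p)          # formula (2.1)
        print(p, u, np.linalg.norm(u, p))     # each u has norm 1 in its own norm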
From a geometric point of view there are certain sets of vectors in the vector space V that tell us a lot about distances. These are the so-called balls about a vector (or point) v0 of radius r, whose definition is as follows:

Definition 19. The (closed) ball of radius r centered at the vector v0 is the set of vectors

Br(v0) = {v ∈ V | ‖v − v0‖ ≤ r}.

The set of vectors

B°r(v0) = {v ∈ V | ‖v − v0‖ < r}

is the open ball of radius r centered at the vector v0.
Here is a situation very important to approximation theory in which these balls are helpful: imagine trying to find the distance from a given vector v0 to a closed (this means it contains all points on its boundary) set S of vectors that need not be a subspace. One way to accomplish this is to start with a ball centered at v0 so small that the ball avoids S. Then expand this ball by increasing its radius until you have found a least radius r such that the ball Br(v0) intersects S nontrivially. Then the distance from v0 to this set is this number r. Actually, this is a reasonable definition of the distance from v0 to the set S. One expects these balls, for a given norm, to have the same shape, so it is sufficient to look at the unit balls, that is, the case r = 1.
Example 20. Sketch the unit balls centered at the origin for the 1-norm, 2-norm, and ∞-norms in the space V = R².

In each case it's easiest to determine the boundary of the ball B1(0), i.e., the set of vectors v = (x, y) such that ‖v‖ = 1. These boundaries are sketched in Figure 2.1, and the ball consists of the boundaries plus the interior of each boundary. Let's start with the familiar 2-norm. Here the boundary consists of points (x, y) such that

1 = ‖(x, y)‖2 = √(x² + y²),

which is the familiar circle of radius 1 centered at the origin. Next, consider the 1-norm, in which case

1 = ‖(x, y)‖1 = |x| + |y|.

It's easier to examine this formula in each quadrant, where it becomes one of the four possibilities ±x ± y = 1. For example, in the first quadrant we get x + y = 1. These equations give lines that connect to form a square whose sides are diagonal lines. Finally, for the ∞-norm we have

1 = ‖(x, y)‖∞ = max{|x|, |y|},

which gives four horizontal and vertical lines x = ±1 and y = ±1. These intersect to form another square. Thus we see that the unit balls for the 1- and ∞-norms have corners, unlike the 2-norm. See Figure 2.1 for a picture of these balls.
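If you want to reproduce Figure 2.1 yourself, one way is to draw the level set ‖v‖ = 1 of each norm on a grid. A matplotlib sketch, assuming the usual numpy/matplotlib installation:

    import numpy as np
    import matplotlib.pyplot as plt

    xs = np.linspace(-1.5, 1.5, 400)
    X, Y = np.meshgrid(xs, xs)
    # values of the 1-, 2-, and inf-norms on the grid
    for N in (np.abs(X) + np.abs(Y),
              np.sqrt(X**2 + Y**2),
              np.maximum(np.abs(X), np.abs(Y))):
        plt.contour(X, Y, N, levels=[1.0])   # boundary of each unit ball
    plt.gca().set_aspect("equal")
    plt.title("Unit balls in the 1-, 2-, and inf-norms")
    plt.show()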
Figure 2.1. Boundaries of unit balls in various norms: the curves ‖v‖1 = 1, ‖v‖2 = 1, and ‖v‖∞ = 1.

One of the important applications of the norm concept is that it enables us to make sense out of the idea of limits and convergence of vectors. In a nutshell, limn→∞ vn = v was taken to mean that limn→∞ ‖vn − v‖ = 0. In this case we said that the sequence v1, v2, . . . converges to v. Will we have to have a different notion of limits for different norms? For finite-dimensional spaces, the somewhat surprising answer is no. The reason is that given any two norms ‖·‖a and ‖·‖b on a finite-dimensional vector space, it is always possible to find positive real constants c and d such that for any vector v,

‖v‖a ≤ c ‖v‖b and ‖v‖b ≤ d ‖v‖a.

Hence, if ‖vn − v‖ tends to 0 in one norm, it will tend to 0 in the other norm. For this reason, any two norms satisfying these inequalities are called equivalent. It can be shown that all norms on a finite-dimensional vector space are equivalent. Indeed, it can be shown that the condition that ‖vn − v‖ tends to 0 in any one norm is equivalent to the condition that each coordinate of vn converges to the corresponding coordinate of v. We will verify the limit fact in the following example.

Example 21. Verify that limn→∞ vn exists and is the same with respect to both the 1-norm and 2-norm, where

vn = ((1 − n)/n, e^(−n) + 1).

Which norm is easier to work with?
First we have to know what the limit will be. Let's examine the limit in each coordinate. We have

limn→∞ (1 − n)/n = limn→∞ (1/n − 1) = 0 − 1 = −1 and limn→∞ (e^(−n) + 1) = 0 + 1 = 1.

So we try to use v = (−1, 1) as the limiting vector. Now calculate

v − vn = (−1, 1) − ((1 − n)/n, e^(−n) + 1) = (−1/n, −e^(−n)),

so that

‖v − vn‖1 = 1/n + e^(−n) → 0 as n → ∞

and

‖v − vn‖2 = ((1/n)² + (e^(−n))²)^(1/2) → 0 as n → ∞,
which shows that the limits are the same in either norm. In this case the 1-norm appears to be easier to work with, since no squaring and square roots are involved.
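A quick numerical check of this example: both norms of v − vn shrink together, illustrating the equivalence of norms discussed above.

    import numpy as np

    v = np.array([-1.0, 1.0])
    for n in (1, 10, 100, 1000):
        vn = np.array([(1 - n) / n, np.exp(-n) + 1])
        d = v - vn
        print(n, np.linalg.norm(d, 1), np.linalg.norm(d, 2))
    # both columns tend to 0 as n grows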
Here are two examples of norms defined on nonstandard vector spaces:

Definition 22. The p-norm on C[a, b] is defined by

‖f‖p = ( ∫_a^b |f(x)|^p dx )^(1/p).

Although this is a common form of the definition, a better form that is often used is

‖f‖p = ( (1/(b − a)) ∫_a^b |f(x)|^p dx )^(1/p).

This form is better in the sense that it scales out the size of the interval.
Definition 23. The uniform (or infinity) norm on C[a, b] is defined by

‖f‖∞ = max_{a≤x≤b} |f(x)|.

This norm is well defined by the extreme value theorem, which guarantees that the maximum value of a continuous function on a closed interval exists. We leave verification of the norm laws as an exercise.
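Both function-space norms are straightforward to approximate numerically, say with scipy quadrature for the p-norms and a fine sample grid for the uniform norm. A sketch for f(x) = x on [0, 1]; the helper p_norm is our own:

    import numpy as np
    from scipy.integrate import quad

    a, b = 0.0, 1.0
    f = lambda x: x

    def p_norm(f, p, scaled=False):
        val, _ = quad(lambda x: abs(f(x))**p, a, b)
        if scaled:                      # the interval-scaled form of Definition 22
            val /= (b - a)
        return val**(1.0 / p)

    xs = np.linspace(a, b, 10001)
    sup_norm = np.abs(f(xs)).max()      # approximates the max over [a, b]

    print(p_norm(f, 1))   # integral of x over [0,1] is 1/2
    print(p_norm(f, 2))   # sqrt(1/3) ~ 0.5774
    print(sup_norm)       # 1.0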
Convexity. The basic idea is that if a set in a linear space is convex, then the line segment connecting any two points in the set should lie entirely inside the set. Here's how to say this in symbols:
Definition 24. A set S in the vector space V is convex if, for any vectors u, v ∈ S, all vectors of the form

λu + (1 − λ)v, 0 ≤ λ ≤ 1,

are also in S.
Definition 25. A set S in the normed linear space V is strictly convex if, for any vectors u, v ∈ S, all vectors of the form

w = λu + (1 − λ)v, 0 < λ < 1,

are in the interior of S, that is, for each w there exists a positive r such that the ball Br(w) is entirely contained in S.
Exercises
Exercise 3. Show that the uniform norm on C[a, b] satisfies the norm properties.
Exercise 4. Show that for positive r and v0 ∈ V, a normed linear space, the ball Br(v0) is a convex set. Show by example that it need not be strictly convex.
3. Inner Product Spaces
The dot product of calculus and Math 314 amounted to the standard inner product of two standard vectors. We now extend this idea to a setting that allows for abstract vector spaces.
Definition 26. An (abstract) inner product on the vector space V is a function ⟨·, ·⟩ that assigns to each pair of vectors u, v ∈ V a scalar ⟨u, v⟩ such that for c a scalar and u, v, w ∈ V the following hold:
(1): ⟨u, u⟩ ≥ 0 with ⟨u, u⟩ = 0 if and only if u = 0.
(2): ⟨u, v⟩ = ⟨v, u⟩∗, the star denoting complex conjugation.
(3): ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩.
(4): ⟨u, cv⟩ = c ⟨u, v⟩.
A vector space V, together with an inner product ⟨·, ·⟩ on the space V, is called an inner product space. Notice that in the case of the more common vector spaces over real scalars, property (2) becomes the commutative law: ⟨u, v⟩ = ⟨v, u⟩. Also observe that if V is an inner product space and W is any subspace of V, then W automatically becomes an inner product space if we simply use the inner product of V on elements of W. All the inner product laws still hold, since they hold for elements of the larger space V.
Of course, we have the standard examples of inner products, namely the dot products on Rn and Cn.

Example 27. For vectors u, v ∈ Rn, with u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn), define

u · v = u1v1 + u2v2 + · · · + unvn = uT v.

This is just the standard dot product, and one can verify that all the inner product laws are satisfied by application of the laws of matrix arithmetic.
Here is an example of a nonstandard inner product on a standard space that is
useful in certain engineering problems.
Example 28. For vectors u = (u1, u2) and v = (v1, v2) in V = R², define an inner product by the formula

⟨u, v⟩ = 2u1v1 + 3u2v2.

Show that this formula satisfies the inner product laws.

First we see that ⟨u, u⟩ = 2u1² + 3u2², so the only way for this sum to be 0 is for u1 = u2 = 0. Hence (1) holds. For (2) calculate

⟨u, v⟩ = 2u1v1 + 3u2v2 = 2v1u1 + 3v2u2 = ⟨v, u⟩ = ⟨v, u⟩∗,

since all scalars in question are real. For (3) let w = (w1, w2) and calculate

⟨u, v + w⟩ = 2u1(v1 + w1) + 3u2(v2 + w2)
= 2u1v1 + 3u2v2 + 2u1w1 + 3u2w2 = ⟨u, v⟩ + ⟨u, w⟩.

For the last property, check that for a scalar c,

⟨u, cv⟩ = 2u1cv1 + 3u2cv2 = c(2u1v1 + 3u2v2) = c ⟨u, v⟩.
It follows that this weighted inner product is indeed an inner product according to our definition. In fact, we can do a whole lot more with even less effort. Consider this example, of which the preceding is a special case.
Example 29. Let A be an n × n Hermitian matrix (A = A∗) and define the product ⟨u, v⟩ = u∗Av for all u, v ∈ V, where V is Rn or Cn. Show that this product satisfies inner product laws (2), (3), and (4) and that if, in addition, A is positive definite, then the product satisfies (1) and is an inner product.
As usual, let u, v, w ∈ V and let c be a scalar. For (2), remember that for a 1 × 1 scalar quantity q, q∗ is just the complex conjugate of q, so we calculate

⟨v, u⟩ = v∗Au = (u∗Av)∗ = ⟨u, v⟩∗.
For (3), we calculate

⟨u, v + w⟩ = u∗A(v + w) = u∗Av + u∗Aw = ⟨u, v⟩ + ⟨u, w⟩.

For (4), we have that

⟨u, cv⟩ = u∗Acv = cu∗Av = c ⟨u, v⟩.
Finally, if we suppose that A is also positive definite, then by definition,

⟨u, u⟩ = u∗Au > 0 for u ≠ 0,

which shows that inner product property (1) holds. Hence, this product defines an inner product.
We leave it to the reader to check that if we take

A = [ 2  0 ]
    [ 0  3 ],

we obtain the inner product of the first example above.
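A quick numerical confirmation that the choice A = diag(2, 3) recovers Example 28; the function name ip is ours:

    import numpy as np

    A = np.array([[2.0, 0.0],
                  [0.0, 3.0]])    # real symmetric (hence Hermitian), positive definite

    def ip(u, v):
        # <u, v> = u* A v; for real vectors the conjugate transpose is just u
        return u @ A @ v

    u = np.array([1.0, -1.0])
    v = np.array([1.0, 1.0])
    print(ip(u, v))                     # 2*1*1 + 3*(-1)*1 = -1
    print(ip(u, u) > 0, ip(v, v) > 0)   # positivity, property (1)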
Here is an example of an inner product space that is useful in approximation
theory:
Example 30. Let V = C[a, b], the space of continuous functions on the interval [a, b] with the usual function addition and scalar multiplication. Show that the formula

⟨f, g⟩ = ∫_a^b f(x)g(x) dx

defines an inner product on the space V.
Certainly ⟨f, g⟩ is a real number. Now if f(x) is a continuous function then f(x)² is nonnegative on [a, b] and therefore ∫_a^b f(x)² dx = ⟨f, f⟩ ≥ 0. Furthermore, if f(x) is nonzero, then the area under the curve y = f(x)² must also be positive since f(x)² will be positive and bounded away from 0 on some subinterval of [a, b]. This establishes property (1) of inner products.
Now let f(x), g(x), h(x) ∈ V. For property (2), notice that

⟨f, g⟩ = ∫_a^b f(x)g(x) dx = ∫_a^b g(x)f(x) dx = ⟨g, f⟩.

Also,

⟨f, g + h⟩ = ∫_a^b f(x)(g(x) + h(x)) dx
= ∫_a^b f(x)g(x) dx + ∫_a^b f(x)h(x) dx = ⟨f, g⟩ + ⟨f, h⟩,
which establishes property (3). Finally, we see that for a scalar c,

⟨f, cg⟩ = ∫_a^b f(x)cg(x) dx = c ∫_a^b f(x)g(x) dx = c ⟨f, g⟩,

which shows that property (4) holds.
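For polynomials this inner product can even be computed symbolically; a small sympy sketch (the helper ip is ours):

    import sympy as sp

    x = sp.symbols('x')
    a, b = 0, 1

    def ip(f, g):
        # the standard inner product on C[a, b]
        return sp.integrate(f * g, (x, a, b))

    print(ip(x, x))       # integral of x*x over [0, 1] = 1/3
    print(ip(1, x**2))    # integral of x^2 over [0, 1] = 1/3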
We shall refer to this inner product on a function space as the standard inner product on the function space C[a, b]. (Most of our examples and exercises involving function spaces will deal with polynomials, so we remind the reader of the integration formula

∫_a^b x^m dx = (b^(m+1) − a^(m+1))/(m + 1)

and its special case ∫_0^1 x^m dx = 1/(m + 1), for m ≥ 0.)
Following are a few simple facts about inner products that we will use frequently.
The proofs are left to the exercises.
Theorem 31. Let V be an inner product space with inner product ⟨·, ·⟩. Then we have that for all u, v, w ∈ V and scalars a,
(1): ⟨u, 0⟩ = 0 = ⟨0, u⟩,
(2): ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩,
(3): ⟨au, v⟩ = a ⟨u, v⟩.
Induced Norms and the CBS Inequality. It is a striking fact that we can accomplish all the goals we set for the standard inner product using general inner products: we can introduce the ideas of angles, orthogonality, projections, and so forth. We have already seen much of the work that has to be done, though it was stated in the context of the standard inner products. As a first step, we want to point out that every inner product has a natural norm associated with it.
Definition 32. Let V be an inner product space. For vectors u ∈ V, the norm defined by the equation

‖u‖ = √⟨u, u⟩

is called the norm induced by the inner product ⟨·, ·⟩ on V.
As a matter of fact, this idea is not really new. Recall that we introduced the standard inner product on V = Rn or Cn with an eye toward the standard norm. At the time it seemed like a nice convenience that the norm could be expressed in terms of the inner product. It is, and so much so that we have turned this cozy relationship into a definition. Just calling the induced norm a norm doesn't make it so.
Is the induced norm really a norm? We have some work to do. The first norm property is easy to verify for the induced norm: from property (1) of inner products we see that ⟨u, u⟩ ≥ 0, with equality if and only if u = 0. This confirms norm property (1). Norm property (2) isn't too hard either: let c be a scalar and check that

‖cu‖ = √⟨cu, cu⟩ = √(cc∗ ⟨u, u⟩) = √(|c|² ⟨u, u⟩) = |c| ‖u‖.

Norm property (3), the triangle inequality, remains. This one isn't easy to verify from first principles. We need a tool called the Cauchy–Bunyakovsky–Schwarz (CBS) inequality.
Theorem 33. (CBS Inequality) Let V be an inner product space. For u, v ∈ V, if we use the inner product of V and its induced norm, then

|⟨u, v⟩| ≤ ‖u‖ ‖v‖.
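The CBS inequality of course requires a proof, but it is reassuring to watch it hold numerically; a sanity check (not a proof) for the standard inner product on R⁴:

    import numpy as np

    rng = np.random.default_rng(1)
    for _ in range(10000):
        u, v = rng.standard_normal(4), rng.standard_normal(4)
        # |<u, v>| <= ||u|| ||v||, with a tiny tolerance for roundoff
        assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    print("CBS held in every trial")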
Henceforth, when the norm sign ‖·‖ is used in connection with a given inner product, it is understood that this norm is the induced norm of this inner product, unless otherwise stated.

Just as with the standard dot products, we can formulate the following definition thanks to the CBS inequality.
Definition 34. For vectors u, v ∈ V, a real inner product space, we define the angle between u and v to be any angle θ satisfying

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).

We know that |⟨u, v⟩| / (‖u‖ ‖v‖) ≤ 1, so that this formula for cos θ makes sense.

Example 35. Let u = (1, −1) and v = (1, 1) be vectors in R². Compute an angle between these two vectors using the inner product of Example 28. Compare this to the angle found when one uses the standard inner product in R².
Solution. According to Example 28 and the definition of angle, we have

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖) = (2 · 1 · 1 + 3 · (−1) · 1) / (√(2 · 1² + 3 · (−1)²) · √(2 · 1² + 3 · 1²)) = −1/5.

Hence the angle in radians is

θ = arccos(−1/5) ≈ 1.7722.

On the other hand, if we use the standard norm, then ⟨u, v⟩ = 1 · 1 + (−1) · 1 = 0, from which it follows that u and v are orthogonal and θ = π/2 ≈ 1.5708.
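The two computations side by side in Python; the helper angle is our own:

    import numpy as np
    from math import acos, sqrt

    def angle(u, v, ip):
        # Definition 34: cos(theta) = <u,v> / (||u|| ||v||), induced norms
        return acos(ip(u, v) / (sqrt(ip(u, u)) * sqrt(ip(v, v))))

    u, v = np.array([1.0, -1.0]), np.array([1.0, 1.0])
    weighted = lambda a, b: 2*a[0]*b[0] + 3*a[1]*b[1]   # Example 28
    standard = lambda a, b: float(a @ b)

    print(angle(u, v, weighted))   # arccos(-1/5) ~ 1.7722
    print(angle(u, v, standard))   # pi/2 ~ 1.5708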
In the previous example, it shouldn't be too surprising that we can arrive at two different values for the angle between two vectors. Using different inner products to measure angle is somewhat like measuring length with different norms. Next, we extend the perpendicularity idea to arbitrary inner product spaces.
Definition 36. Two vectors u and v in the same inner product space are orthogonal if ⟨u, v⟩ = 0.

Note that if ⟨u, v⟩ = 0, then ⟨v, u⟩ = ⟨u, v⟩∗ = 0. Also, this definition makes the zero vector orthogonal to every other vector. It also allows us to speak of things like orthogonal functions. One has to be careful with new ideas like this. Orthogonality in a function space is not something that can be as easily visualized as orthogonality of geometrical vectors. Inspecting the graphs of two functions may not be quite enough. If, however, graphical data is tempered with a little understanding of the particular inner product in use, orthogonality can be detected.
Example 37. Show that f(x) = x and g(x) = x − 2/3 are orthogonal elements of C[0, 1] with the inner product of Example 30 and provide graphical evidence of this fact.
Solution. According to the definition of the inner product in this space,

⟨f, g⟩ = ∫_0^1 f(x)g(x) dx = ∫_0^1 x(x − 2/3) dx = [x³/3 − x²/3]_0^1 = 0.

It follows that f and g are orthogonal to each other. For graphical evidence, sketch f(x), g(x), and f(x)g(x) on the interval [0, 1] as in Figure 3.1. The graphs of f and g are not especially enlightening; but we can see in the graph that the area below f · g and above the x-axis to the right of (2/3, 0) seems to be about equal to the area to the left of (2/3, 0) above f · g and below the x-axis. Therefore the integral of the product on the interval [0, 1] might be expected to be zero, which is indeed the case.
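The orthogonality claim is also a one-line numerical check with scipy quadrature:

    from scipy.integrate import quad

    f = lambda x: x
    g = lambda x: x - 2.0 / 3.0
    val, err = quad(lambda x: f(x) * g(x), 0.0, 1.0)
    print(val)   # ~ 0 up to quadrature error: f and g are orthogonal in C[0, 1]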
Figure 3.1. Graphs of f, g, and f · g on the interval [0, 1].
Some of the basic ideas from geometry that fuel our visual intuition extend very
elegantly to the inner product space setting.
One such example is the famous
Pythagorean theorem, which takes the following form in an inner product space.
Theorem 38. Let u, v be orthogonal vectors in an inner product space V. Then

‖u‖² + ‖v‖² = ‖u + v‖².

Proof. Compute

‖u + v‖² = ⟨u + v, u + v⟩
= ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
= ⟨u, u⟩ + ⟨v, v⟩ = ‖u‖² + ‖v‖².
Here is an example of another standard geometrical fact that fits well in the abstract setting. This is equivalent to the law of parallelograms, which says that the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of all four sides.
Example 39. Use properties of inner products to show that if we use the induced norm, then

‖u + v‖² + ‖u − v‖² = 2(‖u‖² + ‖v‖²).

The key to proving this fact is to relate the induced norm to the inner product. Specifically,

‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩,

while

‖u − v‖² = ⟨u − v, u − v⟩ = ⟨u, u⟩ − ⟨u, v⟩ − ⟨v, u⟩ + ⟨v, v⟩.

Now add these two equations and obtain by using the definition of induced norm again that

‖u + v‖² + ‖u − v‖² = 2 ⟨u, u⟩ + 2 ⟨v, v⟩ = 2(‖u‖² + ‖v‖²),

which is what was to be shown.
It would be nice to think that every norm on a vector space is induced from some
inner product. Unfortunately, this is not true, as the following example shows.
Example 40. Use the result of Example 39 to show that the infinity norm on V = R² is not induced by any inner product on V.

Solution. Suppose the infinity norm were induced by some inner product on V. Let u = (1, 0) and v = (0, 1/2). Then we have

‖u + v‖∞² + ‖u − v‖∞² = ‖(1, 1/2)‖∞² + ‖(1, −1/2)‖∞² = 2,

while

2(‖u‖∞² + ‖v‖∞²) = 2(1 + 1/4) = 5/2.

This contradicts Example 39, so that the infinity norm cannot be induced from an inner product.
Definition 41. The set of vectors v1, v2, . . . , vn in an inner product space is said to be an orthogonal set if ⟨vi, vj⟩ = 0 whenever i ≠ j. If, in addition, each vector has unit length, i.e., ⟨vi, vi⟩ = 1 for all i, then the set of vectors is said to be an orthonormal set of vectors.
The proof of the following key fact and its corollary are the same as those for standard dot products. All we have to do is replace dot products by inner products. The observations that followed the proof of this theorem are valid for general inner products as well. We omit the proofs.
Theorem 42. Let v1, v2, . . . , vn be an orthogonal set of nonzero vectors and suppose that v ∈ span{v1, v2, . . . , vn}. Then v can be expressed uniquely (up to order) as a linear combination of v1, v2, . . . , vn, namely

v = (⟨v1, v⟩/⟨v1, v1⟩) v1 + (⟨v2, v⟩/⟨v2, v2⟩) v2 + · · · + (⟨vn, v⟩/⟨vn, vn⟩) vn.
Corollary 43. Every orthogonal set of nonzero vectors is linearly independent.
Another useful corollary is the following fact about the length of a vector, whose
proof is left as an exercise.
Corollary 44. If v1, v2, . . . , vn is an orthogonal set of vectors and v = c1v1 + c2v2 + · · · + cnvn, then

‖v‖² = c1²‖v1‖² + c2²‖v2‖² + · · · + cn²‖vn‖².

Exercises

Exercise 5. Confirm that p1(x) = x and p2(x) = 3x² − 1 are orthogonal elements of C[−1, 1] with the standard inner product and determine whether the following polynomials belong to span{p1(x), p2(x)} using Theorem 42.
(a) x² (b) 1 + x − 3x² (c) 1 + 3x − 3x²
4. Linear Operators
Before giving the definition of a linear operator, let us recall some notation that is associated with functions in general. We identify a function f with the notation f : D → T, where D and T are the domain and target of the function, respectively. This means that for each x in the domain D, the value f(x) is a uniquely determined element in the target T. We want to emphasize at the outset that there is a difference here between the target of a function and its range. The range of the function f is defined as the subset of the target

range(f) = {y | y = f(x) for some x ∈ D},
which is just the set of all possible values of f(x). A function is said to be one-to-one if, whenever f(x) = f(y), then x = y. Also, a function is said to be onto if the range of f equals its target.

For example, we can define a function f : R → R by the formula f(x) = x². It follows from our specification of f that the target of f is understood to be R, while the range of f is the set of nonnegative real numbers. Therefore, f is not onto. Moreover, f(−1) = f(1) and −1 ≠ 1, so f is not one-to-one either.

A function that maps elements of one vector space into another, say f : V → W, is sometimes called an operator or transformation. One of the simplest mappings of a vector space V is the so-called identity function idV : V → V given by idV(v) = v, for all v ∈ V. Here domain, range, and target all agree. Of course, matters can become more complicated. For example, the operator f : R² → R³ might be given by the formula

f((x, y)) = (x², xy, y²).

Notice in this example that the target of f is R³, which is not the same as the range of f, since elements in the range have nonnegative first and third coordinates.
From the point of view of linear algebra, this function lacks the essential feature
that makes it really interesting, namely linearity.
Definition 45. A function T : V → W from the vector space V into the space W over the same field of scalars is called a linear operator (mapping, transformation) if for all vectors u, v ∈ V and scalars c, d, we have

T(cu + dv) = cT(u) + dT(v).
By taking c = d = 1 in the definition, we see that a linear function T is additive, that is, T(u + v) = T(u) + T(v). Also, by taking d = 0 in the definition, we see that a linear function is outative, that is, T(cu) = cT(u). As a matter of fact, these two conditions imply the linearity property, and so are equivalent to it. We leave this fact as an exercise.
If T : V → V is a linear operator, we simply say that T is a linear operator on V. A linear operator T : V → R is called a linear functional on V.
By repeated application of the linearity definition, we can extend the linearity property to any linear combination of vectors, not just two terms. This means that for any scalars c1, c2, . . . , cn and vectors v1, v2, . . . , vn, we have

T(c1v1 + c2v2 + · · · + cnvn) = c1T(v1) + c2T(v2) + · · · + cnT(vn).
Example 46. Determine whether T is a linear operator, where T is given by the formula (a) T((x, y)) = (x², xy, y²), a map from R² to R³, or (b) T((x, y)) = A(x, y)^T, a map from R² to R², where

A = [ 1   1 ]
    [ 0  −1 ].

If T is defined by (a) then we show by a simple example that T fails to be linear. Let us calculate

T((1, 0) + (0, 1)) = T((1, 1)) = (1, 1, 1),

while

T((1, 0)) + T((0, 1)) = (1, 0, 0) + (0, 0, 1) = (1, 0, 1).

These two are not equal, so T fails to satisfy the linearity property.
Next consider the operator T defined as in (b). Writing A for the 2 × 2 matrix in (b) and v = (x, y), we see that the action of T can be given as T(v) = Av. Now we have already seen that the operation of multiplication by a fixed matrix is a linear operator.
Recall that an operator f : V → W is said to be invertible if there is an operator g : W → V such that the composition of functions satisfies f ∘ g = idW and g ∘ f = idV. In other words, f(g(w)) = w and g(f(v)) = v for all w ∈ W and v ∈ V. We write g = f⁻¹ and call f⁻¹ the inverse of f. One can show that for any operator f, linear or not, being invertible is equivalent to being both one-to-one and onto.
Example 47. Show that if f : V → W is an invertible linear operator on vector spaces, then f⁻¹ is also a linear operator.

Solution. We need to show that for u, v ∈ W, the linearity property f⁻¹(cu + dv) = cf⁻¹(u) + df⁻¹(v) is valid. Let w = cf⁻¹(u) + df⁻¹(v). Apply the function f to both sides and use the linearity of f to obtain that

f(w) = f(cf⁻¹(u) + df⁻¹(v)) = cf(f⁻¹(u)) + df(f⁻¹(v)) = cu + dv.

Apply f⁻¹ to obtain that w = f⁻¹(f(w)) = f⁻¹(cu + dv), which proves the linearity property.
Abstraction gives us a nice framework for certain key properties of mathematical objects, some of which we have seen before. For example, in calculus we were taught that differentiation has the linearity property. Now we can express this assertion in a larger context: let V be the space of differentiable functions and define an operator T on V by the rule T(f(x)) = f′(x). Then T is a linear operator on the space V.

Operator Norms. The following idea lets us measure the size of an operator in terms of how much the operator is capable of stretching an argument. This is a very useful notion for approximation theory in the situation where our approximation to the element f can be described as applying an operator T to f to obtain the approximation T(f).

Definition 48. Let T : V → W be a linear operator between normed linear spaces. The operator norm of T is defined to be

‖T‖ = max_{0 ≠ v ∈ V} ‖T(v)‖/‖v‖ = max_{v ∈ V, ‖v‖ = 1} ‖T(v)‖,

provided the maximum value exists, in which case the operator is called bounded. Otherwise the operator is called unbounded.
This norm really is a norm on the appropriate space.
Theorem 49. Given normed linear spaces V, W, the set L(V, W) of all bounded linear operators from V to W is a vector space with the usual function addition and scalar multiplication. Moreover, the operator norm on L(V, W) is a vector norm, so that L(V, W) is also a normed linear space.
This theorem gives us an idea as to why operator norms are relevant to approximation theory. Suppose you want to approximate functions f in some function space by a method which can be described as applying a linear operator T to f (we'll see lots of examples of this in approximation theory). One rough measure of how good an approximation we have is that T(f) should be close to f. Put another way, ‖T(f)‖/‖f‖ should be close to one. So if ‖T‖ is much larger than one, we expect that for some functions f, T(f) will be a poor approximation to f.
Bounded linear operators on infinite-dimensional spaces form a large subject which is part of the area of mathematics called functional analysis. However, in the case of finite dimension, the situation is much simpler.
Theorem 50. Let T : V → W be a linear operator, where V is a finite dimensional space. Then T is bounded, so that L(V, W) consists of all linear operators from V to W.
Some notation: if both V and W are standard spaces with the same standard p-norm, then the operator norm of T is denoted by ‖T‖p. In some cases the operator norm is fairly simple to compute. Here is an example.
Example 51. Let TA : Rn → Rm be the linear operator given by matrix-vector multiplication, TA(v) = Av, where A = [aij] is an m × n matrix with (i, j)-th entry aij. Then TA is bounded by the previous theorem and moreover

‖TA‖∞ = max_{1≤i≤m} Σ_{j=1}^{n} |aij|.
Solution. To see this, observe that in order for a vector v ∈ V to have infinity norm 1, all coordinates must be at most 1 in absolute value and at least one must be equal to one. Now choose the row i that has the largest row sum of absolute values Σ_{j=1}^{n} |aij|, and let v be the vector whose j-th coordinate is just the signum of aij, so that |aij| = vj aij for all j. Then it is easily checked that this choice yields the largest possible value of ‖Av‖∞, namely the row sum above.
Exercises
Exercise 6. Show that differentiation is a linear operator on V = C∞(R), the space of functions defined on the real line that are infinitely differentiable.
Exercise 7. Show that the operator T : C[0, 1] → R given by T(f) = ∫_0^1 f(x) dx is a linear operator.
5. Metric Spaces and Analysis
We start with the definition of a metric space which, in general, is rather different from a vector space, although the main examples for us are in fact normed linear spaces and their subsets.
Definition 52. A metric space is a nonempty set X of objects called points, together with a function d(x, y), called a metric for X, mapping pairs of points x, y ∈ X to real numbers and satisfying for all points x, y, z:
(1) d(x, y) ≥ 0 with equality if and only if x = y.
(2) d(x, y) = d(y, x).
(3) d(x, z) ≤ d(x, y) + d(y, z).
Examples abound. This is a more general concept than norm, since metric spaces do not have to be vector spaces, so that every subset of a metric space is also a metric space with the inherited metric. There are many interesting examples that are not subsets of normed linear spaces. One such example can be found in Powell's text, Exercise 1.2. By the way, this example is important for problems of shape recognition.
Example 53. For a very nonstandard example, consider a finite network in which every pair x, y of nodes is connected by a path of edges, each of which has a positive cost associated with traversing it. Assume that there are no self-edges. Define the distance between any two nodes x and y to be the minimum cost of a path connecting x and y if x ≠ y; otherwise the distance is 0. This definition turns the network into a metric space.
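The minimum-cost distance of Example 53 is exactly what the classical Floyd-Warshall algorithm computes. A sketch on a toy network of our own, with np.inf marking missing edges:

    import numpy as np

    INF = np.inf
    # symmetric edge costs between 4 nodes; INF = no direct edge
    C = np.array([[0,   1,   4,   INF],
                  [1,   0,   2,   7  ],
                  [4,   2,   0,   3  ],
                  [INF, 7,   3,   0  ]], dtype=float)

    d = C.copy()
    n = len(d)
    for k in range(n):            # Floyd-Warshall: allow node k as an intermediate
        for i in range(n):
            for j in range(n):
                d[i, j] = min(d[i, j], d[i, k] + d[k, j])

    # d is symmetric and satisfies the triangle inequality by construction;
    # e.g. d[0, 3] = 1 + 2 + 3 = 6 via the path 0-1-2-3
    print(d)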
Next, we consider some topological ideas that are useful for metric spaces. In all of the following we assume that X is a metric space with metric d(x, y).

Definition 54. Given r > 0 and a point x0 in X, the (closed) ball of radius r centered at the point x0 is the set of points

Br(x0) = {y ∈ X | d(x0, y) ≤ r}.

The set of points

B°r(x0) = {y ∈ X | d(x0, y) < r}

is the open ball of radius r centered at the point x0.

Definition 55. A subset Y of the metric space X is open if for every point y ∈ Y there exists a ball Br(y), r > 0, contained entirely in Y. The set Y is closed if its complement X\Y in X is an open set.
(I'll finish this later.)
Definition 56. bounded set
Definition 57. boundary
Definition 58. compact, sequentially compact
Definition 59. continuous function
Example 60. Linear operators
The key theorems we need are
Theorem 61. For a metric space X , compactness and sequential compactness are
equivalent.
Theorem 62. Heine-Borel theorem
Theorem 63. EVT (the extreme value theorem)
Finally, we have the following famous theorem from analysis, which tells us that the polynomials are a dense subset of C[a, b] with the infinity norm, in the sense that every open ball in C[a, b] with respect to this norm contains a polynomial:

Theorem 64. (Weierstrass) Given any f ∈ C[a, b] and number ε > 0, there exists a polynomial p(x) such that ‖f − p‖∞ < ε.

Exercise 8. Show that every normed linear space V with norm ‖·‖ is a metric space with metric given by d(u, v) = ‖u − v‖.
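One constructive proof of Theorem 64 (not given in these notes) uses the Bernstein polynomials Bn(f)(x) = Σ_{k=0}^{n} f(k/n) C(n, k) x^k (1 − x)^(n−k), which converge uniformly to f on [0, 1]. A numerical sketch of that convergence:

    import numpy as np
    from math import comb

    def bernstein(f, n, x):
        # B_n(f)(x) = sum_k f(k/n) C(n, k) x^k (1 - x)^(n - k)
        return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))

    f = lambda x: abs(x - 0.5)          # continuous but not smooth
    xs = np.linspace(0.0, 1.0, 1001)
    for n in (4, 16, 64, 256):
        err = max(abs(bernstein(f, n, x) - f(x)) for x in xs)
        print(n, err)   # the uniform error ||f - B_n(f)||_inf decreases with n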
References
[1] M.J.D. Powell, Approximation Theory and Methods, Cambridge University Press, Cambridge, 1981.
[2] T. Shores, Applied Linear Algebra and Matrix Analysis, Springer, New York, 2007.