Chapter 6
Geometry in R^n
In Chapter 2, where we first started our study of vector spaces, a vector was
intuitively described as something possessing both magnitude and direction.
This led us to the idea of expressing vectors, at least in R^2, as pairs of real
numbers. One of the topics we discuss in this chapter is the relation between
these pairs of numbers and the length of a vector. We also show how the angle
between two vectors is related to their ordered pair representation.
6.1 Length and Dot Product
We first define the length of a vector in R^2 and then show how to compute the
angle between two vectors. Let a = (a_1, a_2) be any vector in R^2. By the length
of a we mean the distance from the point with coordinates (a_1, a_2) to the origin
(see Figure 6.1). The Pythagorean theorem tells us that this distance equals
(a_1^2 + a_2^2)^{1/2}. We therefore define the length of any vector in R^2 or R^n as follows:
[Figure 6.1: the vector (a_1, a_2), with components a_1 and a_2.]
Definition 6.1. Let x = (x_1, x_2, . . . , x_n) be any vector in R^n. The length or
norm of x, denoted by ‖x‖, is

    ‖x‖ = (x_1^2 + · · · + x_n^2)^{1/2} = (Σ_{j=1}^n x_j^2)^{1/2}    (6.1)
Example 1. Compute the lengths of the following vectors:
a. ‖(2, −3)‖ = (4 + 9)^{1/2} = 13^{1/2}
b. ‖(2, −1, 3)‖ = (4 + 1 + 9)^{1/2} = 14^{1/2}
c. ‖(7, −8, 1, 3, 6)‖ = (49 + 64 + 1 + 9 + 36)^{1/2} = 159^{1/2}
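As a quick numerical check of Definition 6.1, here is a small Python sketch (the helper name `norm` is ours, not the text's) that reproduces the three lengths computed in Example 1.

```python
import math

def norm(x):
    """Euclidean length of a vector given as a list of numbers (Definition 6.1)."""
    return math.sqrt(sum(c * c for c in x))

print(norm([2, -3]))           # sqrt(13)
print(norm([2, -1, 3]))        # sqrt(14)
print(norm([7, -8, 1, 3, 6]))  # sqrt(159)
```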
In part b of Example 1, we computed the length of a vector in R^3. Figure 6.2
shows that in this case we may also interpret the length of the vector (x_1, x_2, x_3)
as the distance from the point with coordinates (x_1, x_2, x_3) to the origin. The
following theorem lists a few useful properties of the norm, or length, of a vector.
[Figure 6.2: the vector (2, −1, 3) in R^3, with its projection (2, −1, 0) in the x_1x_2-plane.]
Theorem 6.1.
1. Let x be any vector in R^n. Then ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0.
2. ‖cx‖ = |c| ‖x‖ for any constant c and any vector x.
3. For any two vectors x and y, we have ‖x + y‖ ≤ ‖x‖ + ‖y‖.
The first two results are almost obvious, both geometrically and analytically,
as we will see in a few lines. The third, which is called the triangle inequality,
is not quite so obvious, but it does have a nice geometrical interpretation for x
and y in R3 .
Figure 6.3 shows the result of adding y to x. The three points A, B, and C
determine a triangle the lengths of whose sides are ‖x‖, ‖y‖, and ‖x + y‖. Since
the shortest distance between any two points is the straight line connecting
them, we see that inequality 3 must indeed be true.
[Figure 6.3: the triangle with vertices A, B, and C formed by x, y, and x + y.]
Proof of Theorem 6.1.
1. Let x = (x_1, . . . , x_n). Then ‖x‖ = (x_1^2 + · · · + x_n^2)^{1/2} ≥ 0. Moreover
‖x‖ = 0 if and only if x_j = 0 for each j, i.e., ‖x‖ = 0 if and only if x = 0.
2. Let c be any scalar and x any vector in R^n. Then

    ‖cx‖ = [Σ_{j=1}^n (cx_j)^2]^{1/2} = (c^2)^{1/2} [Σ_{j=1}^n x_j^2]^{1/2} = |c| ‖x‖

3. The reader is asked to prove this property in one of the problems at the
end of this section.
Property 2 is used when we wish to construct a vector that has a given direction
and length equal to 1. This is accomplished by taking any nonzero vector x that
has the desired direction and then dividing it by its length, for if u = x/‖x‖,
then ‖u‖ = 1.
Example 2.
a. ‖−2(1, 3)‖ = ‖(−2, −6)‖ = (4 + 36)^{1/2} = 2‖(1, 3)‖
b. Construct a unit vector that is parallel to the line going from the point
P = (−1, 2) to the point Q = (3, 4). See Figure 6.4. By a unit vector we
mean one whose length equals 1. A vector that has the desired direction
can be found by subtracting the coordinates of the point P from those of
Q. Thus x = (3, 4) − (−1, 2) = (4, 2) points in the desired direction, but
it may not have length 1. In fact ‖x‖ = (16 + 4)^{1/2} = 20^{1/2}. Thus, the
desired unit vector u equals (4, 2)/√20.
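Example 2b can be reproduced in a few lines of Python; the function name `unit_vector` below is a hypothetical helper, not notation from the text.

```python
import math

def unit_vector(p, q):
    """Unit vector parallel to the line from point p to point q (cf. Example 2b)."""
    x = [qi - pi for pi, qi in zip(p, q)]          # direction Q - P
    length = math.sqrt(sum(c * c for c in x))      # must be nonzero
    return [c / length for c in x]

u = unit_vector((-1, 2), (3, 4))   # (4, 2)/sqrt(20)
```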
We next wish to take two vectors A = (a_1, a_2) and B = (b_1, b_2) in R^2 and
derive a formula relating their coordinates and the cosine of the angle between
them. See Figure 6.5. If we picture the triangle formed by these two vectors,
[Figure 6.4: the points P(−1, 2) and Q(3, 4) and the vector joining them.]
the length of the side opposite the angle θ equals ‖A − B‖ = ‖(a_1 − b_1, a_2 − b_2)‖.
The law of cosines then implies

    ‖A − B‖^2 = ‖A‖^2 + ‖B‖^2 − 2‖A‖‖B‖ cos θ    (6.2)

Computing these lengths in terms of the coordinates, we have

    (a_1 − b_1)^2 + (a_2 − b_2)^2 = a_1^2 + a_2^2 + b_1^2 + b_2^2 − 2‖A‖‖B‖ cos θ

Squaring the terms in parentheses and then canceling gives us

    −2a_1b_1 − 2a_2b_2 = −2‖A‖‖B‖ cos θ

We finally arrive at the formula

    cos θ = (a_1b_1 + a_2b_2) / (‖A‖‖B‖)    (6.3)
where A = (a_1, a_2), B = (b_1, b_2), and θ is the smaller of the two angles determined by A and B. We may draw a similar diagram for two vectors
A = (a_1, a_2, a_3) and B = (b_1, b_2, b_3) in R^3. If the same calculations are performed, we have

    cos θ = (a_1b_1 + a_2b_2 + a_3b_3) / (‖A‖‖B‖)    (6.4)
[Figure 6.5: vectors A = (a_1, a_2) and B = (b_1, b_2), with angle θ between them.]
where again, θ is the smaller of the two angles between A and B. The numerator
in (6.3) and (6.4) appears so often in various formulas that it has been given a
special name.
Definition 6.2. Let x = (x_1, . . . , x_n) and y = (y_1, . . . , y_n) be any two vectors
in R^n. The dot, or scalar, or inner product, of these two vectors is defined to be

    ⟨x, y⟩ = x_1y_1 + · · · + x_ny_n = Σ_{j=1}^n x_jy_j    (6.5)
The phrase dot product arises because this operation is commonly denoted by a
dot, i.e., x·y. The term scalar is used because the operation produces a number
and not a vector; while the term inner distinguishes this product from the outer
product, xᵀy, an n × n matrix whose (j, k) entry is x_jy_k.
Example 3.
a. ⟨(1, 2), (6, 4)⟩ = 6 + 8 = 14
b. ⟨(1, 2), (−4, 2)⟩ = −4 + 4 = 0
c. ⟨(2, −3, 4), (1, 6, −2)⟩ = 2 − 18 − 8 = −24
d. ⟨(1, 0, 4, 6), (−2, 3, 2, 8)⟩ = −2 + 0 + 8 + 48 = 54
We now rewrite formulas (6.3) and (6.4) as

    ⟨A, B⟩ = ‖A‖‖B‖ cos θ    (6.6)
Example 4. Compute the cosine of the angle between the following pairs of
vectors:
a. (1, −2) and (4, 3). From (6.6) we have

    cos θ = ⟨(1, −2), (4, 3)⟩ / (√5 √25) = (4 − 6)/(5√5) = −2/(5√5)

b. (−2, 3, 4) and (3, −1, 8)

    cos θ = ⟨(−2, 3, 4), (3, −1, 8)⟩ / (√29 √74) = 23/(√29 √74)

c. (4, 2, −1) and (−3, 4, −4)

    cos θ = ⟨(4, 2, −1), (−3, 4, −4)⟩ / (√21 √41) = (−12 + 8 + 4)/(√21 √41) = 0

Since cos θ = 0 if and only if θ equals 90 degrees, we see that the two
vectors (4, 2, −1) and (−3, 4, −4) are perpendicular.
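Formulas (6.3) and (6.4) translate directly into code. The sketch below (the helper names `dot` and `cos_angle` are ours) reproduces parts a and c of Example 4.

```python
import math

def dot(x, y):
    """Inner product <x, y> of two vectors (formula 6.5)."""
    return sum(a * b for a, b in zip(x, y))

def cos_angle(x, y):
    """cos(theta) = <x, y> / (||x|| ||y||), as in (6.3) and (6.4)."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

print(cos_angle((1, -2), (4, 3)))    # -2/(5*sqrt(5))
print(cos_angle((4, 2, -1), (-3, 4, -4)))  # 0: the vectors are perpendicular
```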
Two comments are in order. First, formula (6.6) is only valid (at this point) in
R^2 or R^3. We later extend (6.6) to vectors in R^n by defining the cosine of the
angle between two vectors to be that number which satisfies (6.6). The second
comment, which we state as a theorem, has to do with when two vectors are
perpendicular.
Theorem 6.2. Let x and y be any two nonzero vectors in R^2 or R^3. Then x
and y are perpendicular if and only if ⟨x, y⟩ = 0.
Proof. Two nonzero vectors are perpendicular if and only if the angle between
them equals 90 degrees; equivalently, the cosine of that angle must equal zero.
Using formula (6.6), we see that this happens only when the inner product of the
two vectors equals zero.
The inner product ⟨x, y⟩ of two vectors satisfies many properties, some of which
are listed in the next theorem.
Theorem 6.3. Let x = (x_1, . . . , x_n) and y = (y_1, . . . , y_n) be any two vectors in
R^n. Then
a. ⟨x, x⟩ = ‖x‖^2
b. ⟨x, y⟩ = ⟨y, x⟩
c. Let a and b be any two scalars and let z be any vector in R^n; then

    ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩
Proof.
a. ⟨x, x⟩ = x_1x_1 + x_2x_2 + · · · + x_nx_n = ‖x‖^2
b. ⟨x, y⟩ = x_1y_1 + · · · + x_ny_n = y_1x_1 + · · · + y_nx_n = ⟨y, x⟩
c. ⟨ax + by, z⟩ = Σ_{j=1}^n (ax_j + by_j)z_j = Σ_{j=1}^n ax_jz_j + Σ_{j=1}^n by_jz_j = a⟨x, z⟩ + b⟨y, z⟩
Property a is referred to by saying that the inner product is positive definite, i.e.,
⟨x, x⟩ ≥ 0, and it equals zero only if x is the zero vector. The second property is
summarized by saying that the inner product is symmetric. The third property
is the statement that the dot product is linear in its first argument. Symmetry
immediately implies that

    ⟨z, ax + by⟩ = a⟨z, x⟩ + b⟨z, y⟩
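The three properties of Theorem 6.3 can be spot-checked numerically for concrete vectors. This is only a sanity check on sample inputs (our own choice of vectors and scalars), not a proof.

```python
def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

x, y, z = [1, 0, 4, 6], [-2, 3, 2, 8], [5, -1, 0, 2]
a, b = 3, -2

assert dot(x, x) >= 0                        # property a: positive definite
assert dot(x, y) == dot(y, x)                # property b: symmetry
axby = [a * xj + b * yj for xj, yj in zip(x, y)]
assert dot(axby, z) == a * dot(x, z) + b * dot(y, z)  # property c: linearity
```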
We want to extend formula (6.6) to dimensions higher than three. Before we
can do so, however, we need to know that the expression ⟨x, y⟩/(‖x‖‖y‖) can be
interpreted as the cosine of some angle between 0 and 180 degrees. Figure 6.6
shows the graph of cos θ for 0 ≤ θ ≤ π radians. Notice that cos θ is a number
that always lies between −1 and 1. Thus, we would like the absolute value of
⟨x, y⟩/(‖x‖‖y‖) to be no greater than 1. This is the content of the next theorem.
[Figure 6.6: graph of cos θ for 0 ≤ θ ≤ π, decreasing from 1 to −1 and crossing zero at π/2.]
Theorem 6.4 (Cauchy–Schwarz inequality). Let x and y be any two vectors in
R^n. Then

    |⟨x, y⟩| ≤ ‖x‖‖y‖    (6.7)
Proof. Define the following function of t:

    f(t) = ‖x + ty‖^2 = ⟨x + ty, x + ty⟩ = t^2‖y‖^2 + 2t⟨x, y⟩ + ‖x‖^2    (6.8)

f(t) is a quadratic function of t and is never negative. If y = 0, then (6.7) is
certainly true. Hence, we may assume that y is not the zero vector. Completing
the square, we rewrite (6.8) as

    f(t) = ‖y‖^2 (t + ⟨x, y⟩/‖y‖^2)^2 + ‖x‖^2 − ⟨x, y⟩^2/‖y‖^2    (6.9)

Regardless of the value of t we must have f(t) ≥ 0. We now pick t_0 in order to
obtain the minimum value of f(t). Setting t_0 = −⟨x, y⟩/‖y‖^2, we have

    0 ≤ f(t_0) = ‖x‖^2 − ⟨x, y⟩^2/‖y‖^2

Thus, we see that ⟨x, y⟩^2 ≤ (‖x‖‖y‖)^2, from which (6.7) immediately follows.
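A numerical illustration of (6.7): for a few sample pairs of vectors (our own choices), the inequality should hold; the small tolerance guards against floating-point rounding.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# |<x, y>| <= ||x|| ||y|| for each sample pair; equality for parallel vectors
pairs = [([1, -2], [4, 3]), ([2, -3, 4], [1, 6, -2]), ([1, 1, 1], [2, 2, 2])]
for x, y in pairs:
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12
```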
Example 5. Write the Cauchy–Schwarz inequality for any two vectors in R^2
in terms of their coordinates.
Solution. If x = (x_1, x_2) and y = (y_1, y_2), we have

    |⟨x, y⟩| = |x_1y_1 + x_2y_2| ≤ (x_1^2 + x_2^2)^{1/2} (y_1^2 + y_2^2)^{1/2}
Definition 6.3. Let x and y be two nonzero vectors in R^n. We define the angle
θ between x and y to be that angle which lies between 0 and 180 degrees and
satisfies

    cos θ = ⟨x, y⟩ / (‖x‖‖y‖)
Definition 6.4. We say that two nonzero vectors x and y are perpendicular if
⟨x, y⟩ equals zero.
Definitions 6.3 and 6.4 extend our usual notions of the angle between two vectors,
and the concept of perpendicularity, to Rn , for n greater than 3, in a consistent
manner.
Example 6. Determine which of the following pairs of vectors are perpendicular.
a. (1, 0) and (0, 1) are perpendicular since ⟨(1, 0), (0, 1)⟩ = 0 + 0 = 0.
b. (a, b) and (−b, a) are perpendicular since ⟨(a, b), (−b, a)⟩ = −ab + ba = 0.
c. (1, 6, 3) and (0, 1, 2) are not perpendicular since their inner product, which
equals 12, is not zero.
d. (1, 2, −6, 1) and (0, 1, 4, 3) are not perpendicular since their inner product
equals −19.
e. (−2, 8, 3, 4) and (6, 4, 4, −8) are perpendicular since their dot product
equals zero.
Example 7. Show that the diagonals of a rhombus must be perpendicular. A
rhombus is a four-sided polygon with all four sides having the same length.

[Figure: a rhombus with vertices O = (0, 0), x_1 = (x_1, y_1), x_3 = (x_3, y_3), and x_2 = (x_2, y_2).]

Solution. The accompanying figure has all four sides the same length. Thus,

    ‖x_1‖ = ‖x_2‖ = ‖x_3 − x_2‖ = ‖x_3 − x_1‖
Hence, we have by Theorem 6.3,

    0 = ‖x_3 − x_1‖^2 − ‖x_3 − x_2‖^2
      = ⟨x_3 − x_1, x_3 − x_1⟩ − ⟨x_3 − x_2, x_3 − x_2⟩
      = ⟨x_3, x_3⟩ − 2⟨x_3, x_1⟩ + ⟨x_1, x_1⟩ − {⟨x_3, x_3⟩ − 2⟨x_3, x_2⟩ + ⟨x_2, x_2⟩}
      = 2⟨x_3, x_2 − x_1⟩ + ‖x_1‖^2 − ‖x_2‖^2
      = 2⟨x_3, x_2 − x_1⟩

Thus, the vectors x_3 and x_2 − x_1 are perpendicular. Since these vectors are
parallel to the diagonals of the rhombus, the diagonals must be perpendicular
to each other.
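The rhombus argument can be illustrated with one concrete (hypothetical) rhombus: take two sides of equal length and let the far vertex be their sum. The diagonals then come out exactly perpendicular.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Two sides of equal length 5 from the origin; x3 = x1 + x2 closes the rhombus.
x1, x2 = (3, 4), (5, 0)
x3 = (x1[0] + x2[0], x1[1] + x2[1])

diag1 = x3                               # diagonal from O to x3
diag2 = (x2[0] - x1[0], x2[1] - x1[1])   # diagonal from x1 to x2
print(dot(diag1, diag2))                 # 0
```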
Problem Set 6.1
1. Calculate the lengths of the following vectors:
a. (1,2)
b. (−1, 3, 6)
c. (1,1,2,8)
2. Find all unit vectors that are parallel to the vector (1, 2, −4).
3. Compute the dot product of each of the following pairs of vectors:
a. (1,0), (0,1)
b. (a, b), (b, a)
c. (1,2,1), (3, −6, 2)
4. Sketch each of the following pairs of vectors. Compute their inner product
and determine the cosine of the angle between them.
a. (1, 0), (1, 0)
b. (1, 0), (1, 1)
c. (1, 0), (0, 1)
d. (1, 0), (−1, 1)
e. (1, 0), (−1, 0)
f. (1, 0), (−1, −1)
5. Find the cosine of the angle between each of the following pairs of vectors:
a. (1,2), (3, −1)
b. (1, 0, −4), (6,1,2)
c. (−2, 3, 0, 1), (1, 2, 8, −2)
6. Show that if x and y are any two vectors in R^n for which |⟨x, y⟩| = ‖x‖‖y‖,
then one of them must be a scalar multiple of the other. [Hint: Let f(t)
be the function defined in (6.8). Show that the above equality holds if and
only if f(t) = 0 for some value of t.]
7. Use the Cauchy–Schwarz inequality to prove the triangle inequality for
any two vectors. (Hint: Compute ‖x + y‖^2.)
8. Let V be the vector space C[0, 1], which consists of all real-valued continuous functions defined on the interval [0, 1]. For any two functions f(t) and
g(t) in V define

    ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt
a. Let f(t) = t^2 and g(t) = 1 − t. Compute ⟨f, g⟩.
b. Define the length of a vector f in V by

    ‖f‖^2 = ⟨f, f⟩ = ∫_0^1 f(t)f(t) dt

If f(t) = sin nπt, where n is an integer, compute its length.
c. Show that properties 1 and 2 of Theorem 6.1 are valid.
d. Show that ⟨ , ⟩ is an inner product in the sense that Theorem 6.3 is
valid.
e. Prove the Cauchy–Schwarz inequality for this inner product. (Hint:
Repeat the proof of Theorem 6.4.)
9. Let A = [a_jk] be any real m × n matrix. Define ‖A‖^2 = Σ_{j=1}^m Σ_{k=1}^n a_jk^2.
a. For any vector x in R^n show that ‖Ax‖ ≤ ‖A‖‖x‖.
b. Let A be the 2 × 2 matrix with rows (1, 2) and (3, −1). Compute ‖A‖
and verify directly that ‖Ax‖ ≤ ‖A‖‖x‖ for any x in R^2.
10. The distance between a vector x and a subspace W is defined to be the
smallest value of ‖x − w‖ as w varies throughout W. Let V equal R^2
and W = S[(1, 1)], the span of the vector (1, 1). For each of the following
vectors x compute the distance between x and W.
a. (2, 2)
b. (−1, 1)
c. (1, 2)
In each case determine the vector w_0 in W such that ‖x − w_0‖ equals the
distance between x and W. Then show that x − w_0 is perpendicular to
(1, 1) and hence to every vector in S[(1, 1)] = W.
11. Let x = (x_1, . . . , x_n) be any vector in R^n. Show that |x_j| ≤ ‖x‖ for any
j, and ‖x‖ ≤ |x_1| + |x_2| + · · · + |x_n|.
12. Let x_1, . . . , x_n, . . . be a sequence of vectors in R^m. We say that the
sequence x_n converges to x, lim_{n→∞} x_n = x, if and only if lim_{n→∞} ‖x_n − x‖ = 0.
a. Let x_n = (1 + (1/n), 0). Show that x_n converges to (1, 0).
b. Let x_n be a sequence of vectors in R^2. Show that lim_{n→∞} x_n = x
if and only if the components of x_n converge to the corresponding
components of x; cf. problem 11.
13. Let x_n = (1 + (1/n), 2 − (2/n), 3). Show that x_n converges to (1, 2, 3).
14. Show that the result of problem 12b is also true in Rn .
15. Let A = [a_jk] and A_p = [a_jk^p], p = 1, 2, . . . . Define the norm of m × n
matrices as in problem 9. Show that lim_{p→∞} ‖A − A_p‖ = 0 if and only if
lim_{p→∞} |a_jk − a_jk^p| = 0, for each j and k. Note: this is problem 14, where we
identify M_mn with R^mn.
16. Suppose that x is perpendicular to every vector in some set A. Show that
x must then be perpendicular to every vector in S[A].
17. Let A be any m × n matrix. Show that Ax = 0 if and only if x is
perpendicular to every row of A. Thus, x is in ker(A) if and only if x is
perpendicular to every vector in the row space of A.
18. Let V = R^2. For any y in V the mapping L[x] = ⟨x, y⟩ maps V to R.
a. Show that L is a linear transformation.
b. Using the standard bases in V and R, what is the matrix representation of L?
c. Repeat parts a and b where V is now R^n.
19. Let OAB be an isosceles triangle with equal angles at O and B.
Show that the line drawn from the vertex A to the midpoint of OB is
perpendicular to OB.
20. Show that the diagonals of a square bisect not only each other but also
each vertex angle.
6.2 Projections and Bases
For various reasons we sometimes wish to compute the component or projection
of a vector in some particular direction. Geometrically it’s easy to see how to do
this. Suppose x is some vector in R2 and we wish to compute its perpendicular
projection onto the direction indicated by the dashed line in Figure 6.7a.
[Figure 6.7: three configurations (a), (b), and (c) of a vector x and its projection Proj x onto a dashed line through the point C.]
We locate a point C on this line so that the line joining the tip of x to C is
perpendicular to the original line. The projection of x in the direction C is
CHAPTER 6. GEOMETRY IN RN
212
then given by the vector starting where x starts and ending at the point C.
Figure 6.7b and c show two other configurations. Notice that in Figure 6.7c, x
is perpendicular to the dashed line. Since this forces the point C to coincide
with the origin of x , the projection of x in this case is the zero vector.
In deriving a formula for the projection, we first start by assuming that the
direction is given by a unit vector u, that is, ‖u‖ = 1. Let θ denote the angle
between the vector x and u. Let d denote the length of the projection; cf.
Figure 6.8. Then cos θ = d/‖x‖ and

    d = ‖x‖ cos θ = ‖x‖ ⟨x, u⟩/(‖x‖‖u‖) = ⟨x, u⟩    (6.10)
Thus, d is merely the inner product of x with u. To get the projection of x onto
u, Proj_u x, we just multiply u by d. We note that if u were not a unit vector we
would have d = ⟨x, u⟩/‖u‖. In order to avoid having to carry along the factor
‖u‖^{−1} we insist that u be a vector of length 1.
Definition 6.5. Let x be any vector in R^n, and let u be any unit vector in R^n.
The projection of x onto u, Proj_u x, is defined to be

    Proj_u x = ⟨x, u⟩u    (6.11)
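Definition 6.5 is essentially one line of code. In the sketch below the helper name `proj` is ours; the two calls mirror the (1, 0) and (0, 1) cases of projecting (2, −3), worked out in the example that follows.

```python
def proj(x, u):
    """Proj_u x = <x, u> u for a unit vector u (formula 6.11)."""
    d = sum(a * b for a, b in zip(x, u))   # d = <x, u>, formula (6.10)
    return [d * c for c in u]

print(proj([2, -3], [1, 0]))  # [2, 0]
print(proj([2, -3], [0, 1]))  # [0, -3]
```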
Example 1. Let x = (2, −3). Compute Proj_u x for each of the following unit
vectors:
a. u = (1, 0): Proj_u(2, −3) = ⟨(2, −3), (1, 0)⟩(1, 0) = 2(1, 0) = (2, 0)
b. u = (1/√2, 1/√2): Proj_u(2, −3) = ⟨(2, −3), (1/√2, 1/√2)⟩(1/√2, 1/√2)
   = ((2 − 3)/√2)(1/√2, 1/√2) = (−1/2, −1/2)
c. u = (0, 1): Proj_u(2, −3) = ⟨(2, −3), (0, 1)⟩(0, 1) = −3(0, 1) = (0, −3)
[Figure 6.8: the vector x making angle θ with the dashed line; d is the length of its projection.]
Figure 6.9 illustrates this example. It is clear, geometrically, that our construction gives us the “perpendicular component” of x in the direction specified by
u . The following lemma shows that this is indeed true.
[Figure 6.9: the projections of (2, −3) onto (a) u = (1, 0), (b) u = (1/√2, 1/√2), and (c) u = (0, 1).]
Lemma 6.1. Let x be any vector in R^n, and let u be a unit vector. Then
x − Proj_u x is either the zero vector, or it is perpendicular to u.
Proof.

    ⟨x − Proj_u x, u⟩ = ⟨x, u⟩ − ⟨Proj_u x, u⟩
                      = ⟨x, u⟩ − ⟨⟨x, u⟩u, u⟩
                      = ⟨x, u⟩ − ⟨x, u⟩⟨u, u⟩
                      = ⟨x, u⟩ − ⟨x, u⟩ = 0

Remember that u is assumed to be a unit vector and therefore ⟨u, u⟩ = 1.
Thus, if x − Proj_u x is not the zero vector it must be perpendicular to u. See
Figure 6.10.
[Figure 6.10: x decomposed into Proj_u x along u and the perpendicular component x − Proj_u x.]
Lemma 6.1 tells us that, given any vector x and any unit vector u, we can write
x as the sum of two vectors, one parallel to u and the other perpendicular to u:

    x = Proj_u x + [x − Proj_u x]    (6.12)

Why is Proj_u x parallel to u?
Example 2. Write x = (4, 2) as the sum of two vectors, one of which is parallel
to the line joining the two points P = (−6, 4) and Q = (3, −5), while the second
vector is perpendicular to this line. See Figure 6.11. Let y = (−6 − 3, 4 − (−5)) =
(−9, 9). Since we found y by subtracting the coordinates of Q from those of P, y
is a vector parallel to the line through the two points P and Q. Thus a unit
vector parallel to this line is u = y/‖y‖ = (1/√2)(−1, 1).
A vector parallel to u is Proj_u(x) = ⟨(4, 2), (−1, 1)/√2⟩u = (1, −1). A
vector perpendicular to the line is x − Proj_u x = (4, 2) − (1, −1) = (3, 3). Thus,
x = (4, 2) = (1, −1) + (3, 3), where (1, −1) is parallel to the line and (3, 3) is
perpendicular to the line.
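The decomposition of Example 2 can be verified numerically (the variable names below are ours):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

x = [4, 2]
y = [-9, 9]                                 # P - Q, parallel to the line
u = [c / math.sqrt(dot(y, y)) for c in y]   # unit vector (-1, 1)/sqrt(2)

parallel = [dot(x, u) * c for c in u]       # Proj_u x, should be (1, -1)
perp = [a - b for a, b in zip(x, parallel)] # x - Proj_u x, should be (3, 3)
```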
We next want to relate these ideas to those involving the coordinates of a
vector. Referring to Example 1a and c, notice that the vectors u_1 = (1, 0) and
u_2 = (0, 1) are our standard basis. For x = (2, −3) we had Proj_{u_1} x = 2u_1
and Proj_{u_2} x = −3u_2. In other words, the coordinates of x with respect to the
standard basis can be found by taking the dot product of x with each of the
basis vectors. That may not happen for an arbitrary basis, as Example 3 shows.
Example 3. The pair u_1 = (1, 1)/√2 and u_2 = (1, 2)/√5 form a basis of
R^2. Moreover each of them has length 1. Find the coordinates of x = (2, −3)
with respect to this basis and also compute the inner product of x with u_1 and
u_2. An easy calculation shows that x = 7√2 u_1 − 5√5 u_2. Notice though
that ⟨x, u_1⟩ = −1/√2 and ⟨x, u_2⟩ = −4/√5. Thus, the inner products do not
equal the coordinates of x with respect to this basis. Actually, we shouldn't
have expected any such relationship because the two vectors u_1 and u_2 are
not perpendicular, and the inner product of x with u_1 gives us the size of the
perpendicular projection of x onto u_1.
[Figure 6.11: the points P(−6, 4) and Q(3, −5), the vector (4, 2), and its projection onto u = (−1/√2, 1/√2).]
With the above example in mind we might expect that if U = {u_1, u_2}
consists of two perpendicular unit vectors in R^2, then [x]_U = [⟨x, u_1⟩, ⟨x, u_2⟩].
Indeed, we will prove that

    x = ⟨x, u_1⟩u_1 + ⟨x, u_2⟩u_2    (6.13)
Thus, suppose that U = {u_1, u_2} is a basis of R^2 that consists of two perpendicular unit vectors. Then x = c_1u_1 + c_2u_2. Let's now take the inner product
of x with u_1 and then with u_2:

    ⟨x, u_1⟩ = ⟨c_1u_1 + c_2u_2, u_1⟩ = c_1⟨u_1, u_1⟩ + c_2⟨u_2, u_1⟩ = c_1

since ⟨u_1, u_1⟩ = 1 and ⟨u_1, u_2⟩ = 0. Similarly we have ⟨x, u_2⟩ = c_2. Thus, (6.13)
is valid, and as we shall see in a short while, it is also valid in R^n.
Definition 6.6. A set of nonzero vectors {u_j : j = 1, . . . , p} in R^n is said to
be orthogonal if the vectors are mutually perpendicular, i.e., ⟨u_j, u_k⟩ = 0 if j ≠ k.
Example 4. The set {(1, 1, 0, 0), (0, 0, 1, 1), (1, −1, 1, −1)} is orthogonal since
⟨(1, 1, 0, 0), (0, 0, 1, 1)⟩ = 0, ⟨(1, 1, 0, 0), (1, −1, 1, −1)⟩ = 0, and ⟨(0, 0, 1, 1),
(1, −1, 1, −1)⟩ = 0. However, the set {(1, 1, 1), (1, −1, 0), (1, 1, 2)} is not orthogonal since ⟨(1, 1, 1), (1, 1, 2)⟩ equals 4, not zero.
Lemma 6.2. Any set of orthogonal vectors must be linearly independent.
Proof. Let {u_k : k = 1, . . . , p} be an orthogonal set of vectors and suppose that
we have constants c_j such that

    0 = c_1u_1 + · · · + c_pu_p    (6.14)

Taking the inner product of (6.14) with u_1 we have

    0 = ⟨0, u_1⟩ = ⟨Σ_{j=1}^p c_ju_j, u_1⟩ = Σ_{j=1}^p c_j⟨u_j, u_1⟩ = c_1⟨u_1, u_1⟩

Since u_1 is not the zero vector, we know that ⟨u_1, u_1⟩ ≠ 0. Thus c_1 = 0.
By taking the inner product of (6.14) with any one of the u_k's, we similarly
see that c_k = 0 for each k. Thus, our orthogonal set of vectors is linearly
independent.
Definition 6.7. A set of vectors U = {u_1, . . . , u_n} is said to be an orthonormal
basis of R^n if it is a basis consisting of orthogonal unit vectors. That is, ⟨u_j, u_k⟩ =
δ_jk.
The following is probably the most useful idea in this section.
Theorem 6.5. Let U = {u_1, . . . , u_n} be an orthonormal basis of R^n. Then the
coordinates of any vector x can be found by taking the inner product of x with
each of the basis vectors u_k. That is,

    x = ⟨x, u_1⟩u_1 + · · · + ⟨x, u_n⟩u_n    (6.15)
Proof. Since U is a basis we know there are unique constants c_k, 1 ≤ k ≤ n,
such that

    x = c_1u_1 + · · · + c_nu_n    (6.16)

Taking the dot product of both sides of (6.16) with the kth basis vector u_k, we
have

    ⟨x, u_k⟩ = ⟨Σ_{j=1}^n c_ju_j, u_k⟩ = Σ_{j=1}^n c_j⟨u_j, u_k⟩ = Σ_{j=1}^n c_jδ_jk = c_k

Thus, each coordinate of x with respect to U is the inner product of x with
the corresponding basis vector in U. Another interpretation of (6.15) is that
a vector equals the sum of its projections onto the vectors of an orthonormal
basis.
Example 5. Verify that each of the following is an orthonormal basis of R^3,
and then compute the coordinates of the vector x = (6, −2, 1) with respect to
these bases.
a. U = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Clearly each of these vectors has length
1, and they are mutually perpendicular. Thus, (6.15) is applicable. Computing the inner product of x with each basis vector, we have

    ⟨(6, −2, 1), (1, 0, 0)⟩ = 6
    ⟨(6, −2, 1), (0, 1, 0)⟩ = −2
    ⟨(6, −2, 1), (0, 0, 1)⟩ = 1

Thus, x is equal to

    x = (6, −2, 1) = 6u_1 + (−2)u_2 + 1u_3 = 6(1, 0, 0) − 2(0, 1, 0) + (0, 0, 1)
b. U = {(1, 1, 1)/√3, (2, −1, −1)/√6, (0, 1, −1)/√2}. We first verify that U
is an orthonormal set.

    ‖u_1‖^2 = 1/3 + 1/3 + 1/3 = 1        ⟨u_1, u_2⟩ = (2 − 1 − 1)/√18 = 0
    ‖u_2‖^2 = 4/6 + 1/6 + 1/6 = 1        ⟨u_1, u_3⟩ = (1 − 1)/√6 = 0
    ‖u_3‖^2 = 1/2 + 1/2 = 1              ⟨u_2, u_3⟩ = (−1 + 1)/√12 = 0

Thus, U consists of mutually orthogonal unit vectors. Lemma 6.2 tells
us that U is linearly independent. Since dim(R^3) equals 3, U must be a
basis. Computing the inner products of x with each of the basis vectors,
we have

    ⟨(6, −2, 1), (1, 1, 1)/√3⟩ = (6 − 2 + 1)/√3 = 5/√3
    ⟨(6, −2, 1), (2, −1, −1)/√6⟩ = (12 + 2 − 1)/√6 = 13/√6
    ⟨(6, −2, 1), (0, 1, −1)/√2⟩ = (−2 − 1)/√2 = −3/√2

Thus, x = (5/√3)u_1 + (13/√6)u_2 − (3/√2)u_3.
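Example 5b can be checked in a few lines: compute the coordinates as inner products with the basis vectors, then reconstruct x from them as in (6.15). The variable names below are ours.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

s3, s6, s2 = math.sqrt(3), math.sqrt(6), math.sqrt(2)
U = [[1/s3, 1/s3, 1/s3], [2/s6, -1/s6, -1/s6], [0, 1/s2, -1/s2]]
x = [6, -2, 1]

# coordinates of x with respect to U: [5/sqrt(3), 13/sqrt(6), -3/sqrt(2)]
coords = [dot(x, u) for u in U]

# reconstruct x as the sum of its projections onto the basis vectors (6.15)
recon = [sum(c * u[i] for c, u in zip(coords, U)) for i in range(3)]
```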
In order to appreciate the convenience of an orthonormal basis, the reader should
compute the coordinates of x with respect to U without using Theorem 6.5.
Another fact, which is sometimes useful, is the relationship between the inner
product of two vectors and their coordinates with respect to an orthonormal
basis.
Theorem 6.6. Let U = {u_j : j = 1, . . . , n} be any orthonormal basis of R^n. Let
x and y be any two vectors whose coordinates with respect to the orthonormal
basis U are [x_1, . . . , x_n]_U and [y_1, . . . , y_n]_U, respectively. Then

    ⟨x, y⟩ = x_1y_1 + · · · + x_ny_n = Σ_{j=1}^n x_jy_j    (6.17a)

    ‖x‖^2 = Σ_{j=1}^n x_j^2    (6.17b)

Proof. By hypothesis x = Σ_{j=1}^n x_ju_j and y = Σ_{k=1}^n y_ku_k. Thus

    ⟨x, y⟩ = ⟨Σ_{j=1}^n x_ju_j, Σ_{k=1}^n y_ku_k⟩
           = Σ_{j=1}^n Σ_{k=1}^n x_jy_k⟨u_j, u_k⟩
           = Σ_{j=1}^n Σ_{k=1}^n x_jy_kδ_jk = Σ_{j=1}^n x_jy_j

To verify (6.17b) we only need to remember that

    ‖x‖^2 = ⟨x, x⟩ = Σ_{j=1}^n x_j^2
Example 6. U = {(1, 1)/√2, (−1, 1)/√2} is an orthonormal basis of R^2. Verify
(6.17) for the vectors (1, 1) and (2, 6).
Solution. The inner product of these two vectors is

    ⟨(1, 1), (2, 6)⟩ = 2 + 6 = 8

Using Theorem 6.5 to compute the coordinates of these vectors with respect to
U, we have

    [(1, 1)]_U = [⟨(1, 1), (1, 1)/√2⟩, ⟨(1, 1), (−1, 1)/√2⟩] = [√2, 0]
    [(2, 6)]_U = [⟨(2, 6), (1, 1)/√2⟩, ⟨(2, 6), (−1, 1)/√2⟩] = [8/√2, 4/√2]

Thus, x_1 = √2, x_2 = 0, y_1 = 8/√2, and y_2 = 4/√2, and we have

    x_1y_1 + x_2y_2 = 8 + 0 = ⟨x, y⟩

Using the coordinates of x and y with respect to U to compute their lengths,
we have

    ‖x‖^2 = (√2)^2 + 0 = 2        ‖y‖^2 = 64/2 + 16/2 = 40
Problem Set 6.2
1. Compute Proj_u x, where x = (7, −8), for each of the following unit vectors:
a. (1, −2)/5^{1/2}
b. (2, 3)/13^{1/2}
c. (1, 0)
2. Compute the projection of x = (−2, 3) in a direction parallel to the
straight line joining the points (1,7) and (−3, 8). There are two possible choices for a direction; take the one that points from (1,7) to (−3, 8).
3. Let x = (7, −5) and let U = {(1, 5)/√26, (−5, 1)/√26}.
a. Show that U is an orthonormal basis of R^2.
b. Find Proj_{u_j} x, where u_j is the jth unit vector in U.
c. Compute the coordinates of x with respect to U.
4. Let x = (1, −2). Show that any vector orthogonal to x is a scalar multiple
of (2,1).
5. Let A = {(1, 2, −1), (−2, 1, 0)}. Show that A is an orthogonal set of vectors, and if x is any vector orthogonal to both vectors in A, then x must
be a scalar multiple of (1,2,5).
6. Let V = {(2, −3, 1), (2, 3, 5), (−9, −4, 6)}.
a. Show that V is an orthogonal set of vectors.
b. Let x = (7, −3, 4). Compute the projection of x onto the direction
given by v j , where v j is the jth vector in the set V .
c. Compute the coordinates of (7, −3, 4) = x with respect to the basis
V . (Hint: It’s easy to construct an orthonormal basis from V .)
7. Find the angle between the following pairs of vectors:
a. (1, 1), (0, 1)
b. (1, 1, 1), (0, 1, 0)
c. (1, 1, 1, 1), (0, 1, 0, 0)
d. (6, 7, −2, 3), (−1, −2, 1, 1)
8. Let U = {(1, −1)/√2, (1, 1)/√2}. Use the fact that U is an orthonormal
basis to compute the coordinates of the following vectors with respect to
U:
a. (9, −2)
b. (6,4)
c. (1, −1)
d. (1,0)
9. Let U = {(2, −3, 1)/√14, (2, 3, 5)/√38, (−9, −4, 6)/√133}. Show that U
is an orthonormal basis (cf. problem 6) and then compute [x]_U for the
following vectors:
a. (1, 0, 0)
b. (−1, 6, 4)
c. (18, 2, −4)
10. Let u be an arbitrary unit vector in R^n.
a. If x is the zero vector, show that Proj_u x = 0.
b. If x and u are perpendicular, show that Proj_u x = 0.
c. Show that Proj_u x is a linear transformation from R^n to R^n.
d. What is the dimension of the kernel of this linear transformation?
11. Let V = P_3. Let f and g be any two polynomials in V. Define ⟨f, g⟩ =
∫_0^1 f(t)g(t) dt; cf. problem 8 in Section 6.1.
a. Find a unit vector u that points in the same direction as f(t) = t.
b. Find the projection of t^2 onto the vector u of part a.
c. Find the cosine of the angle between the vectors t^2 and t.
12. Let V = P_1. Define the inner product of two vectors as we did in problem 11. Show that {1, t − 1/2} is an orthogonal set of vectors. Find an
orthonormal basis for V.
13. Let V = P_2. Define the inner product as we did in problems 11 and 12.
Let f′ denote the derivative of f.
a. Find all polynomials in P_2 that are perpendicular to their derivatives.
b. For any two polynomials f and g in V, compute ⟨f, g′⟩ + ⟨f′, g⟩.
14. Let V = M_22, the vector space of 2 × 2 matrices. For A = [a_jk] and
B = [b_jk] in M_22 define ⟨A, B⟩ = Σ_{j=1}^2 Σ_{k=1}^2 a_jk b_jk.
a. Compute the norms of the 2 × 2 identity matrix and of the matrix
with rows (a, b) and (c, d).
b. Let E_jk, 1 ≤ j, k ≤ 2, be the standard basis of V. Do these four
matrices form an orthonormal basis?
15. In problem 10 we saw that for a fixed unit vector u in R^n, L[x] = Proj_u x
is a linear transformation from R^n to R^n. Let

    u = (1/√2)(e_1 − e_3),    n ≥ 3

a. What is the matrix representation of L with respect to the standard
basis?
b. Find an orthonormal basis for Rg(L).
c. Find an orthonormal basis for ker(L).
d. Show that the union of the two orthonormal sets in parts b and c is
an orthonormal basis of R^n.
e. What is the matrix representation of L with respect to the basis of
part d? (List the vectors from b and then the vectors from c.)
6.3 Construction of Orthonormal Bases
We indicated in Chapter 2 that every vector space has a basis. A natural
question now is whether or not every vector space has an orthonormal basis.
This of course makes sense only if the vector space has an inner product. Clearly,
the answer is yes for Rn since the standard basis {ee1 , . . . , e n } is orthonormal.
What about subspaces of Rn ? The answer is again yes. In fact there is a
technique for constructing an orthonormal basis from any given basis. This
technique goes by the name of Gram–Schmidt. We illustrate it with an example
before going into the details of the algorithm.
Example 1. Let f_1 = (0, 1, 1) and f_2 = (0, 2, 0). Let W = S[f_1, f_2]. Construct
an orthonormal basis for W.
Solution. Geometrically, W is a plane (a two-dimensional subspace of R^3). In fact
W is the plane x_1 = 0. Clearly e_2 and e_3 form an orthonormal basis for W.
What we wish to do, though, is to use the given basis for W in constructing our
orthonormal basis. We first set u_1 = f_1/‖f_1‖ = (0, 1, 1)/√2. The unit vector
u_1 will be the first vector in our basis. We now want a unit vector u_2 that is
perpendicular to u_1 and also lies in W. Such a vector is easy to construct by
using the fact that f_2 − Proj_{u_1} f_2 must be perpendicular to u_1. Moreover, since
this vector is a linear combination of f_1 and f_2 (u_1 is a scalar multiple of f_1),
it will lie in W . Thus, set
    v2 = f2 − Proju1 f2
       = (0, 2, 0) − ⟨(0, 2, 0), (0, 1, 1)/√2⟩ (0, 1, 1)/√2
       = (0, 2, 0) − (0, 1, 1) = (0, 1, −1)

    u2 = v2/‖v2‖ = (0, 1, −1)/√2

A quick computation shows that ⟨u1, u2⟩ = 0.
The only difficulty here is that the vector v2 might equal zero. But this
cannot happen, since the vectors f1 and f2 are linearly independent.
Let's assume now that {f1, . . . , fn} is a set of linearly independent vectors.
The Gram–Schmidt procedure given below provides us with an orthonormal
set of vectors {u1, . . . , un} such that S[u1, . . . , up] = S[f1, . . . , fp] for
p = 1, 2, . . . , n. Define the unit vectors uk inductively by

    u1 = f1/‖f1‖

    v2 = f2 − ⟨f2, u1⟩u1          u2 = v2/‖v2‖

    vk = fk − [⟨fk, u1⟩u1 + ⟨fk, u2⟩u2 + · · · + ⟨fk, uk−1⟩uk−1]

    uk = vk/‖vk‖,   k = 2, . . . , n                                  (6.18)
Theorem 6.7. Let {fk : k = 1, . . . , n} be a linearly independent set of vectors.
Define uk by (6.18). Then {uk : k = 1, . . . , n} is an orthonormal set of vectors
and S[u1, . . . , up] = S[f1, . . . , fp] for p = 1, . . . , n.
Proof. We prove this theorem by induction. For p = 1, we have u1 = f1/‖f1‖.
Clearly {u1} is an orthonormal set and S[u1] = S[f1]. Again we note that f1 ≠ 0
since the fj's are linearly independent. We now assume that the theorem is true
for p and deduce its truth for p + 1. From (6.18) we have

    vp+1 = fp+1 − [Proju1(fp+1) + · · · + Projup(fp+1)]               (6.19a)

    up+1 = vp+1/‖vp+1‖                                                (6.19b)

We have by assumption that S[u1, . . . , up] = S[f1, . . . , fp]. Thus, the vector
vp+1 cannot be the zero vector, since fp+1 is not in S[f1, . . . , fp] = S[u1, . . . , up].
Hence we may divide vp+1 by its length to get up+1, a unit vector. Since
(6.19a) can be solved for fp+1, the reader can easily show that S[u1, . . . , up+1] =
S[f1, . . . , fp+1]. It remains to show that up+1, or equivalently vp+1, is orthogonal
to each of the preceding uk. For k = 1, . . . , p we have

    ⟨vp+1, uk⟩ = ⟨fp+1 − Proju1 fp+1 − Proju2 fp+1 − · · · − Projup fp+1, uk⟩

               = ⟨ fp+1 − Σ_{j=1}^{p} ⟨fp+1, uj⟩uj , uk ⟩

               = ⟨fp+1, uk⟩ − Σ_{j=1}^{p} ⟨fp+1, uj⟩⟨uj, uk⟩

By assumption the set {uk : k = 1, . . . , p} is orthonormal. Thus ⟨uj, uk⟩ = δjk
and we have

    ⟨vp+1, uk⟩ = ⟨fp+1, uk⟩ − ⟨fp+1, uk⟩ = 0
Example 2. Construct an orthonormal basis for R3 from the vectors {(1, 0, 1),
(2, 1, 0), (1, 1, 1)} by using the Gram–Schmidt algorithm.

    u1 = (1, 0, 1)/√2

    v2 = (2, 1, 0) − ⟨(2, 1, 0), (1, 0, 1)/√2⟩ ((1, 0, 1)/√2)
       = (2, 1, 0) − (1, 0, 1) = (1, 1, −1)

    u2 = (1, 1, −1)/√3

    v3 = (1, 1, 1) − ⟨(1, 1, 1), (1, 0, 1)/√2⟩ ((1, 0, 1)/√2)
             − ⟨(1, 1, 1), (1, 1, −1)/√3⟩ ((1, 1, −1)/√3)
       = (1, 1, 1) − (1, 0, 1) − (1/3)(1, 1, −1) = (1/3)(−1, 2, 1)

    u3 = (−1, 2, 1)/√6
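The inductive formula (6.18) translates directly into a short routine. The sketch below (assuming NumPy is available; the function name gram_schmidt is ours, not the book's) reproduces the basis of Example 2.

```python
import numpy as np

def gram_schmidt(F):
    """Orthonormalize the rows of F by formula (6.18):
    v_k = f_k - sum_j <f_k, u_j> u_j, then u_k = v_k / ||v_k||."""
    U = []
    for f in np.asarray(F, dtype=float):
        v = f - sum((f @ u) * u for u in U)   # subtract projections onto u_1, ..., u_{k-1}
        U.append(v / np.linalg.norm(v))       # v != 0 because the f_k are independent
    return np.array(U)

# The vectors of Example 2
U = gram_schmidt([[1, 0, 1], [2, 1, 0], [1, 1, 1]])
print(np.allclose(U @ U.T, np.eye(3)))                       # True: the u_k are orthonormal
print(np.allclose(U[2], np.array([-1, 2, 1]) / np.sqrt(6)))  # True: u3 = (-1, 2, 1)/sqrt(6)
```

Note that the routine fails (division by zero) exactly when the input vectors are linearly dependent, which is the situation Theorem 6.7 rules out.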
In the preceding section we defined and showed how to calculate the projection of a vector onto a unit vector. We now wish to define the projection of
a vector onto a subspace, and we do so in terms of the distance between the
vector and the subspace.
Definition 6.8. Let x be any vector in Rn and let W be any subspace of Rn.
The projection of x onto W, ProjW(x), is defined to be that vector y in W which
minimizes ‖x − w‖ for w any vector in W.

It is not at all clear that there is such a vector; or perhaps there might be more
than one, and if so, which should we pick?
Theorem 6.8. Let W be any m-dimensional subspace of Rn. Let U = {u1, . . . ,
um} be any orthonormal basis of W. Then for any vector x in Rn there is a
unique y in W that minimizes ‖x − w‖ for w in W. Moreover,

    y = ProjW(x) = Σ_{k=1}^{m} ⟨x, uk⟩uk                              (6.20)
Proof. Let {u1, . . . , um, v1, . . . , vn−m} be any orthonormal basis of Rn
whose first m vectors are the given orthonormal basis of W. Theorem 6.7 tells
us how to construct such a basis, and from (6.17) we have

    x = Σ_{k=1}^{m} ⟨x, uk⟩uk + Σ_{k=1}^{n−m} ⟨x, vk⟩vk

Let w be any vector in W. Then w = Σ_{k=1}^{m} ⟨w, uk⟩uk, and

    ‖x − w‖² = Σ_{k=1}^{m} [⟨x − w, uk⟩]² + Σ_{k=1}^{n−m} [⟨x, vk⟩]²

Clearly, the second sum is constant regardless of the choice of w. However, the
first sum equals zero if and only if ⟨w, uk⟩ = ⟨x, uk⟩ for each k. In other words,
the minimum value of ‖x − w‖ occurs only when w = Σ_{k=1}^{m} ⟨x, uk⟩uk.
Moreover, it is clear from the above equation that this minimum length equals

    ‖x − ProjW x‖ = [ Σ_{k=1}^{n−m} ⟨x, vk⟩² ]^{1/2}

and that x − ProjW(x) is perpendicular to every vector in W.
Formula (6.20) also says that the projection of any vector x onto a subspace
W equals the sum of its projections onto the vectors of an orthonormal basis of
W . If x is an arbitrary vector in Rn , what is the projection of x onto Rn ?
Definition 6.9. We define the distance between a vector x and a subspace W
to equal ‖x − ProjW x‖.
Example 3. Let x = (1, 2, 3). Compute the projection of x onto each of the
following subspaces.

a. W is the x1, x2 plane in R3. An orthonormal basis for W is the set
{(1, 0, 0), (0, 1, 0)}. Thus,

    ProjW(1, 2, 3) = ⟨(1, 2, 3), (1, 0, 0)⟩(1, 0, 0) + ⟨(1, 2, 3), (0, 1, 0)⟩(0, 1, 0)
                   = (1, 0, 0) + 2(0, 1, 0) = (1, 2, 0)
[Figure: the vector x = (1, 2, 3) and its projection ProjW x = (1, 2, 0) onto the x1, x2 plane]
b. W = S[(1, 1, 0), (0, 1, −1)]. Since the vectors (1, 1, 0) and (0, 1, −1) are
linearly independent, we may use them to construct an orthonormal basis
for W. This basis is {(1, 1, 0)/√2, (−1, 1, −2)/√6} and we have

    ProjW(1, 2, 3) = ⟨(1, 2, 3), u1⟩u1 + ⟨(1, 2, 3), u2⟩u2
                   = (3/2)(1, 1, 0) − (5/6)(−1, 1, −2)
                   = (7/3, 2/3, 5/3)
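Formula (6.20) is equally mechanical to compute. The sketch below (assuming NumPy; np.linalg.qr is used as a stand-in for the Gram–Schmidt step, and the function name is ours) rechecks part a of Example 3 and verifies that the residual in part b is perpendicular to W, as Theorem 6.8 promises.

```python
import numpy as np

def proj_onto_subspace(x, spanning_vectors):
    """Compute Proj_W(x) by formula (6.20): orthonormalize a basis of W
    (here via QR, which performs Gram-Schmidt), then sum <x, u_k> u_k."""
    Q, _ = np.linalg.qr(np.array(spanning_vectors, dtype=float).T)
    return Q @ (Q.T @ x)   # sum_k <x, u_k> u_k, with u_k the columns of Q

x = np.array([1.0, 2.0, 3.0])

# Part a of Example 3: W is the x1,x2 plane
print(proj_onto_subspace(x, [[1, 0, 0], [0, 1, 0]]))   # [1. 2. 0.]

# Part b: W = S[(1,1,0), (0,1,-1)]; the residual x - Proj_W(x)
# must be perpendicular to every vector in W
p = proj_onto_subspace(x, [[1, 1, 0], [0, 1, -1]])
r = x - p
print(np.allclose([r @ np.array([1, 1, 0]), r @ np.array([0, 1, -1])], 0))  # True
```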
We conclude this section with a discussion of the properties of a change of
basis matrix P relating two orthonormal bases. Let U = {uk : k = 1, . . . , n}
and V = {vk : k = 1, . . . , n} be two orthonormal bases of Rn. Let P = [pjk] be
the matrix that gives the vectors uk as linear combinations of the vk. That is,

    uk = Σ_{j=1}^{n} pjk vj                                           (6.21a)

and if P⁻¹ = Q = [qjk], then

    vk = Σ_{j=1}^{n} qjk uj                                           (6.21b)

We remind the reader that the kth column of P consists of the coordinates of
the vector uk with respect to the basis V. A similar comment of course applies
to the columns of P⁻¹ = Q. However, these two bases are orthonormal. Thus,
Theorem 6.5 may be used to compute the coordinates:

    pjk = ⟨uk, vj⟩        qjk = ⟨vk, uj⟩

Since our inner product is symmetric, we have

    pjk = ⟨uk, vj⟩ = ⟨vj, uk⟩ = qkj                                   (6.22)
In other words the matrix Q = P −1 is the transpose of the matrix P , or P −1 =
P T , an extremely useful fact. We also note that the formulas
PPᵀ = PᵀP = I
imply that both the rows and columns of P form orthonormal sets of vectors.
Definition 6.10. A matrix P is said to be orthogonal if P T = P −1 .
Example 4. Let U = {(1, 0, 1)/√2, (1, 1, −1)/√3, (−1, 2, 1)/√6}. Find the
change of basis matrices P and P⁻¹ relating this basis to the standard basis.

Solution. We know from Example 2 that U is an orthonormal basis of R3. Thus
P⁻¹ = Pᵀ:

    P = [ 1/√2   1/√3  −1/√6 ]        P⁻¹ = Pᵀ = [  1/√2   0     1/√2 ]
        [  0     1/√3   2/√6 ]                    [  1/√3  1/√3 −1/√3 ]
        [ 1/√2  −1/√3   1/√6 ]                    [ −1/√6  2/√6  1/√6 ]
Clearly, when we deal with orthonormal bases, the amount of computational
work is considerably lessened. There is of course the initial labor involved in
constructing such a basis, but it is usually well worth the effort.
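A quick numerical check, assuming NumPy is available, confirms that the matrix P built from the basis of Example 4 is orthogonal in the sense of Definition 6.10:

```python
import numpy as np

r2, r3, r6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
# Columns of P are the orthonormal basis vectors of Example 4
P = np.array([[1/r2,  1/r3, -1/r6],
              [0.0,   1/r3,  2/r6],
              [1/r2, -1/r3,  1/r6]])

# Definition 6.10: P is orthogonal when P^T = P^(-1); equivalently,
# both P^T P and P P^T equal the identity, so both the rows and the
# columns of P form orthonormal sets
print(np.allclose(P.T @ P, np.eye(3)))   # True
print(np.allclose(P @ P.T, np.eye(3)))   # True
```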
Problem Set 6.3
1. Use the Gram–Schmidt procedure to construct an orthonormal basis for
each of the following subspaces of R3 :
a. W = {(x1 , x2 , x3 ) : x1 − x2 = 0}
b. W = S[(1, −1, 2), (6, 1, 1)]
2. Construct an orthonormal basis for R3 from the following basis, {(0, 5, 1),
(0, 1, −5), (1, −2, 3)}.
3. Let W be the subspace of R4 spanned by the vectors f 1 = (1, 1, 0, 1) and
f 2 = (3, 1, 4, 1). Compute the projection of x = (3, 0, 3, 3) onto W .
4. Find the distance from the point (1, −2, 3) to the plane 2x1 −3x2 +6x3 = 0.
5. Find the distance from the point (1, −2, 3) to the plane 2x1 −3x2 +6x3 = 2.
6. Show that U = {(1/√2, 1/√2), (−1/√2, 1/√2)} and V = {(1, 0), (0, 1)} are
both orthonormal bases of R2. Find a change of basis matrix P relating U and
V and verify that it is orthogonal.
7. Construct an orthonormal basis for R4 from the basis {(0, 1, 1, 1), (1, 0, 1, 1),
(1,1,0,1), (1, 1, 1, 0)}.
8. Show that {(x1, x2), (y1, y2)} is an orthonormal basis for R2 if and only if
the matrix

    [ x1  y1 ]
    [ x2  y2 ]

is orthogonal.
9. Find an orthonormal basis for the kernel of each of the following matrices:

    a. [ 1  2 ]      b. [ 1  −1  2 ]      c. [  1  0  −1  3 ]
       [ 3  6 ]         [ 4   6  3 ]         [ −3  1   0  1 ]

10. Find an orthonormal basis for the range of each of the matrices in problem 9.
11. We've seen that if U = {uj : j = 1, . . . , n} and V = {vj : j = 1, . . . , n}
are two orthonormal bases of Rn, then the matrix P = [pjk] relating
them is orthogonal. Conversely, show that if P is orthogonal and U is an
orthonormal basis, then V = {vj}, where the vectors in V are defined by

    vj = Σ_{k=1}^{n} pkj uk

is also an orthonormal basis.
12. Show that if W is any subspace of Rn and x is any vector in Rn , then
there is a unique unit vector w 0 in W such that the angle between x and
w 0 is minimized. That is, the angle between x and w for any vector w in
W is no smaller than that between x and w 0 .
13. Let V = P2. Define ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. The set B = {1, t, t²} is a
basis for V. Construct an orthonormal basis for V from B by using the
Gram–Schmidt procedure.
14. Let V = P2. Define ⟨f, g⟩ = f0g0 + f1g1 + f2g2, where f(t) = f0 + f1t + f2t²
and g(t) = g0 + g1t + g2t². Show that {1, t, t²} is an orthonormal basis if
we use this inner product, but not if we use the inner product of problem 13.
15. Let V = C[0, 1]. Define ⟨f, g⟩ as we did in problem 13.

    a. Compute the length of the vector sin πt.
    b. Show that the set {1, sin πt, cos πt, . . . , sin nπt, cos nπt, . . .} is orthogonal.
16. Let V = S[(1, 0, 1), (1, 1, 1)]. Show that (−1, 0, 1) is perpendicular to every
vector in V.

17. Let V and W be subspaces of Rn. We say that V and W are perpendicular
if ⟨x, y⟩ = 0 for every x in V and y in W.
    a. Suppose f1 and f2 are two perpendicular vectors. Show that S[f1] and
       S[f2] are perpendicular.
    b. Let {v1, . . . , vp} be an orthogonal set of vectors. Show that S[v1, . . . , vk]
       and S[vk+1, . . . , vp] are perpendicular.
18. Let V be any subspace of R3 . Show that dim(V ) + dim(V ⊥ ) = 3. Generalize this to Rn .
19. Let V be any subspace of R2. Show that any vector x in R2 can be written
uniquely in the form x = v + w, for some v in V and w in V⊥. Consider
the two special cases V = {0} and V = R2 first. Then consider the case
V = S[v] for some fixed vector v.
20. Let V be any subspace of Rn. Define V⊥ (V perp) by

    V⊥ = {x : ⟨x, y⟩ = 0 for every y in V}

    a. Show that V⊥ is a subspace of Rn.
    b. Show that V and V⊥ are perpendicular in the sense of problem 17.
    c. Show that {0}⊥ = Rn and (Rn)⊥ = {0}.
    d. Show that (V⊥)⊥ = V. Hint: What are the dimensions of the two spaces?
21. Use the result of problem 18 to show that for any subspace V of Rn the
following is true. Given any x in Rn, we can write x uniquely in the form
x = v + w for some v in V and w in V⊥.

22. Given any unit vector u, Proju x is a linear transformation from Rn to Rn;
cf. problem 10 in Section 6.2.

    a. Find a "nice" matrix representation for Proju.
    b. Describe geometrically the two subspaces ker(Proju) and Rg(Proju).
    c. What are the dimensions of the kernel and range of Proju?
23. Let A be a matrix representation of a linear transformation L : R2 → R2 ,
where L rotates the plane through some angle θ. Show that A is an
orthogonal matrix.
24. Let W be any subspace of Rn. Define L(x) = ProjW x.
a. Show L is a linear transformation.
b. Find the range and kernel of L.
c. Find a “nice” matrix representation for L.
6.4 Symmetric Matrices
Given a matrix A we defined the transpose of A seemingly for no special reason.
There is, however, an important relationship between A and AT that is not
apparent until we have an inner product. To demonstrate this relationship we
take the inner product of Ax with y. Thus suppose A = [ajk] is an m × n
matrix, x a vector in Rn, and y a vector in Rm. Then Aᵀ = [aᵀjk] is an n × m
matrix, Ax is in Rm, and Aᵀy is in Rn.

    ⟨Ax, y⟩ = ⟨( Σ_{k=1}^{n} a1k xk , . . . , Σ_{k=1}^{n} amk xk ), y⟩

            = Σ_{j=1}^{m} ( Σ_{k=1}^{n} ajk xk ) yj

            = Σ_{k=1}^{n} xk ( Σ_{j=1}^{m} ajk yj )

            = Σ_{k=1}^{n} xk ( Σ_{j=1}^{m} aᵀkj yj )

            = ⟨x, Aᵀy⟩

This is such a useful formula that we write it again:

    ⟨Ax, y⟩ = ⟨x, Aᵀy⟩                                                (6.23)

Notice that if A = Aᵀ, then (6.23) becomes ⟨Ax, y⟩ = ⟨x, Ay⟩.
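Formula (6.23) is easy to spot-check numerically. The sketch below (assuming NumPy is available) compares both sides for a randomly chosen 3 × 4 matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)   # an arbitrary m x n matrix
x = rng.integers(-5, 5, size=4).astype(float)        # x in R^n
y = rng.integers(-5, 5, size=3).astype(float)        # y in R^m

# Formula (6.23): <Ax, y> = <x, A^T y>
lhs = (A @ x) @ y
rhs = x @ (A.T @ y)
print(np.isclose(lhs, rhs))   # True
```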
In Chapter 5 we stated that every symmetric matrix is similar to a diagonal
matrix. Put another way, given any symmetric matrix A there is a basis of Rn
which consists of eigenvectors of A. It turns out that it is also possible to
construct this basis in such a manner that it is an orthonormal basis. This
useful feature of symmetric matrices is a consequence of the next lemma.

Lemma 6.3. Let A be a symmetric matrix. Let λ1 and λ2 be two distinct
eigenvalues of A. Then any pair of eigenvectors f1 and f2 corresponding to λ1
and λ2, respectively, must be perpendicular.

Proof.

    λ1⟨f1, f2⟩ = ⟨λ1 f1, f2⟩
               = ⟨Af1, f2⟩ = ⟨f1, Af2⟩
               = ⟨f1, λ2 f2⟩ = λ2⟨f1, f2⟩

Thus, we have (λ1 − λ2)⟨f1, f2⟩ = 0. Since λ1 − λ2 ≠ 0, we must have
⟨f1, f2⟩ = 0; i.e., the eigenvectors are perpendicular.
Example 1. Let

    A = [ 2  3 ]
        [ 3  4 ]

Find the eigenvectors of A and verify Lemma 6.3 for this symmetric matrix.
Solution. A quick calculation shows that the characteristic polynomial of A
is p(λ) = λ² − 6λ − 1. The eigenvalues are 3 ± √10 and their corresponding
eigenvectors are

    λ1 = 3 + √10,    f1 = (−1 + √10, 3)

    λ2 = 3 − √10,    f2 = (−1 − √10, 3)

Computing the inner product of the eigenvectors we have

    ⟨f1, f2⟩ = ⟨(−1 + √10, 3), (−1 − √10, 3)⟩
             = (−1 + √10)(−1 − √10) + 9 = 0
The procedure for constructing an orthonormal basis from the eigenvectors of a
symmetric matrix is now relatively easy. We first find the eigenvalues of A from
its characteristic polynomial

    p(λ) = (λ − λ1)^{m1} (λ − λ2)^{m2} · · · (λ − λp)^{mp}

We next find a basis for each of the eigenspaces ker(A − λjI). An orthonormal
basis for each of these eigenspaces is constructed by using the Gram–Schmidt
procedure. These orthonormal bases are then adjoined to form a basis of Rn.
It is Lemma 6.3 which guarantees that combining these individually orthonormal
sets produces an orthogonal set. Since each vector has length 1 to begin
with, they remain unit vectors. In other words, suppose that {f1, . . . , fm1}
and {g1, . . . , gm2} are orthonormal bases of ker(A − λ1I) and ker(A − λ2I),
respectively. Then since ⟨fj, gk⟩ = 0 for every j and k, we conclude that
{f1, . . . , fm1, g1, . . . , gm2} is also an orthonormal set.
Example 2. Find an orthogonal matrix P such that PᵀAP is a diagonal
matrix, where A is the matrix

    A = [  7  −2  −1 ]
        [ −2  10   2 ]
        [ −1   2   7 ]
Solution. Since A is symmetric, we know that there is an orthonormal basis
of R3 consisting of eigenvectors. If P is the matrix whose columns are these
eigenvectors, then P⁻¹ = Pᵀ and we have P⁻¹AP = PᵀAP, a diagonal matrix.
Computing the characteristic polynomial of A we have

    p(λ) = det(A − λI) = (6 − λ)²(12 − λ)

The eigenvalues of A are 6 with multiplicity 2 and 12 with multiplicity 1. We
list the corresponding eigenvectors:

    λ1 = 6:    f1 = (1, 0, 1),   f2 = (2, 1, 0)
    λ2 = 12:   f3 = (−1, 2, 1)
Since the eigenvectors f1 and f2 correspond to the eigenvalue 6, they are automatically perpendicular to the eigenvector f3. To construct our orthonormal
basis we use the Gram–Schmidt procedure on the first two eigenvectors and
then divide f3 by its length:

    u1 = (1, 0, 1)/√2

    v2 = f2 − ⟨f2, u1⟩u1 = (1, 1, −1)

    u2 = v2/‖v2‖ = (1, 1, −1)/√3

    u3 = f3/‖f3‖ = (−1, 2, 1)/√6
Thus,

    P = [ 1/√2   1/√3  −1/√6 ]
        [  0     1/√3   2/√6 ]
        [ 1/√2  −1/√3   1/√6 ]

and

    PᵀAP = [ 6  0   0 ]
           [ 0  6   0 ]
           [ 0  0  12 ]
We summarize this discussion in the following theorem.
Theorem 6.9. Let A be an n × n symmetric matrix (A = AT ). Then Rn has
an orthonormal basis of eigenvectors of A, and there is an orthogonal matrix P
such that P T AP = D is a diagonal matrix. The diagonal elements of D are the
eigenvalues of A and the columns of P are eigenvectors of A.
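Theorem 6.9 is precisely what numerical libraries implement for symmetric matrices. A sketch (assuming NumPy is available) applied to the matrix of Example 2:

```python
import numpy as np

# The symmetric matrix of Example 2
A = np.array([[ 7.0, -2.0, -1.0],
              [-2.0, 10.0,  2.0],
              [-1.0,  2.0,  7.0]])

# For symmetric input, eigh returns the eigenvalues in ascending order
# and an orthogonal matrix P whose columns are orthonormal eigenvectors
vals, P = np.linalg.eigh(A)
print(np.allclose(vals, [6, 6, 12]))             # True
print(np.allclose(P.T @ P, np.eye(3)))           # True: P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(vals)))   # True: P^T A P is diagonal
```

The eigenvectors returned for the repeated eigenvalue 6 need not match f1 and f2 above, but they span the same eigenspace.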
Another consequence of formula (6.23), which we use in the next section, is
the fact that if A is any m × n matrix, then AᵀAx = 0 if and only if Ax = 0.

Lemma 6.4. Let A be an m × n matrix. Then the following statements are
true:

    a. AᵀAx = 0 if and only if Ax = 0
    b. rank(AᵀA) = rank(A) = rank(Aᵀ)

Proof. Clearly, if Ax = 0, then AᵀAx = 0. Thus, to verify a it suffices to show
that if AᵀAx = 0, then Ax = 0. Assuming AᵀAx = 0 we have

    0 = ⟨AᵀAx, x⟩ = ⟨Ax, Ax⟩ = ‖Ax‖²
Thus we also have Ax = 0. To see that rank(AᵀA) equals rank(A) we note that
part a has shown that ker(AᵀA) equals ker(A). Thus

    rank(AᵀA) = n − dim(ker AᵀA) = n − dim(ker A) = rank(A)

That rank(A) = rank(Aᵀ) was proved in Chapter 3; cf. Theorem 3.5.
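Lemma 6.4b can also be spot-checked numerically; a minimal sketch, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 3, size=(5, 3)).astype(float)   # an arbitrary m x n matrix

# Lemma 6.4b: the n x n matrix A^T A has the same rank as A
# (true whatever the rank of A happens to be)
print(np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A))   # True
```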
Problem Set 6.4

1. For each of the following matrices verify formula (6.23):

    a. [ 1  2 ]      b. [ 3  6  8 ]      c. [ 2   4 ]
       [ 3  4 ]         [ 1  2  4 ]         [ 1  −1 ]
                                            [ 5   6 ]
2. Verify (6.23) for each of the following matrices:

    a. [ 1  3 ]      b. [ 0  1  0 ]      c. [ 6  0  0 ]
       [ 3  1 ]         [ 1  0  2 ]         [ 0  3  0 ]
                        [ 0  2  0 ]         [ 0  0  2 ]
3. An n × n symmetric matrix A is said to be positive definite if ⟨Ax, x⟩ is
positive for each nonzero vector x in Rn.

    a. Show that the matrix [ 3  1 ]
                            [ 1  2 ]  is positive definite.

    b. Show that the matrix [ a  b ]
                            [ b  d ]  is positive definite if and only if a > 0
       and ad − b² > 0.
c. Find a similar criterion for 3 × 3 symmetric matrices.
4. Show that a symmetric matrix is positive definite (⟨Ax, x⟩ > 0 if x ≠ 0)
if and only if each of its eigenvalues is positive. (Hint: If A is positive
definite, so is PᵀAP when P is a nonsingular matrix.)
5. Find an orthonormal basis of eigenvectors for each of the following matrices:

    a. [ 1   2 ]      b. [ 6  0 ]      c. [  3  −1 ]
       [ 2  −3 ]         [ 0  4 ]         [ −1   2 ]
6. Find an orthonormal basis of eigenvectors for each of the following matrices:

    a. [  8  −1  1 ]      b. [ −2  3  0 ]
       [ −1   8  1 ]         [  3  4  0 ]
       [  1   1  8 ]         [  0  0  2 ]
7. Find an orthonormal basis of eigenvectors for each matrix in problem 2.
8. For each of the matrices A of problem 5 find P and D such that P T AP =
D, where D is a diagonal matrix.
9. For each of the matrices A of problem 6 find P and D such that P T AP =
D, where D is a diagonal matrix.
10. Let A be any symmetric matrix. Show that P T AP is also a symmetric
matrix.
11. Let A be any n × n matrix. Show that A is symmetric if and only if there
is an orthonormal basis of Rn consisting of eigenvectors of A.
12. Let A be an m × n matrix. Show that if Ax = b has a solution, then b
must be perpendicular to ker(Aᵀ). [Hint: If Ax = b and y is in ker(Aᵀ),
we have ⟨b, y⟩ = ⟨Ax, y⟩ = ?]

13. Show that the converse of problem 12 is also true. That is, show that if b
is perpendicular to ker(Aᵀ), then the equation Ax = b has a solution.
14. Let V be a vector space with an inner product ⟨ , ⟩. A linear transformation L : V → V is said to be symmetric if ⟨Lx, y⟩ = ⟨x, Ly⟩ for
every pair of vectors x and y in V. Let V be P2 and for f, g in V define
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Decide which, if any, of the following linear transformations is symmetric.

    a. L[f] = f′
    b. L[f] = f″
    c. L[f] = tf′
15. Let L be a linear transformation from R2 to R2.

    a. Show that for each x in R2 there is a unique y in R2 such that
       ⟨x, L[z]⟩ = ⟨y, z⟩ for every z in R2. This vector y will be denoted by
       Lᵀ[x]. Thus, we have the formula ⟨x, L[z]⟩ = ⟨Lᵀ[x], z⟩.
    b. Show that if A is the matrix representation of L with respect to the
       standard basis of R2, then Aᵀ is the matrix representation of Lᵀ.

    Note: a linear transformation is said to be symmetric if L = Lᵀ. Thus, we
    have shown that L is symmetric if and only if its matrix representation,
    with respect to the standard basis, is a symmetric matrix.

16. Generalize problem 15 to Rn.
17. For each of the linear transformations in problem 14, find Lᵀ.

18. Let V = C[0, 1]. Define ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Let L : V → V be defined
as L[f](t) = ∫₀ᵗ f(s) ds. Define the transpose of L as in problem 15, and find
a formula for it.
6.5 Least Squares
The first problem we considered at the start of this text was that of solving
a system of linear equations. At that time, we noticed that there are systems
which have no solution. In the language of matrices and linear transformations,
this translates to the statement that Ax = b does not have a solution unless b
is in the range of A. What we can do in that case is find a "best" possible
solution. That is, we find the vector y in Rg(A) which is closest to b. Thus, if
we cannot solve Ax = b, we solve the equation

    Ax = ProjRg A(b)                                                  (6.24)

Since (6.24) will in general have many solutions, we restrict our discussion to
the case where A is one-to-one or, equivalently, ker(A) consists of just the zero
vector.
With the preceding in mind, let A = [ajk] be an m × n matrix with rank(A) =
n ≤ m. Note that if m < n, then A cannot be one-to-one. In terms of a system
of linear equations, the restriction n ≤ m says that there are at least as many
equations as unknowns.
Theorem 6.10. Let A be an m × n matrix with n ≤ m and rank(A) = n. Let
b be any vector in Rm. Then the solution to Ax = ProjRg A(b) is given by

    x = (AᵀA)⁻¹(Aᵀb)                                                  (6.25)
Proof. The trick in the proof is to pick a nice basis for Rn. We first note that
AᵀA is a symmetric n × n matrix. Moreover, by Lemma 6.4 we know that
rank(AᵀA) = rank(A) = n. Theorem 6.9 guarantees an orthonormal basis
U = {u1, . . . , un} of Rn such that AᵀAuk = dk uk. Moreover,

    ⟨Auj, Auk⟩ = ⟨AᵀAuj, uk⟩ = dj⟨uj, uk⟩ = dj δjk

Thus, the vectors Auj are mutually perpendicular, and since ker(A) is just the
zero vector, we have ‖Auj‖² = dj > 0. We therefore conclude that {Auj/√dj :
j = 1, . . . , n} is an orthonormal basis of Rg(A). By (6.20) we have

    ProjRg A(b) = Σ_{j=1}^{n} ⟨b, Auj/√dj⟩ (Auj/√dj)

                = Σ_{j=1}^{n} (1/dj) ⟨Aᵀb, uj⟩ Auj

                = A [ Σ_{j=1}^{n} (1/dj) ⟨Aᵀb, uj⟩ uj ]
Setting

    x = Σ_{j=1}^{n} (1/dj) ⟨Aᵀb, uj⟩ uj

we see, since A is one-to-one, that x is the unique solution to Ax = ProjRg A(b).
Moreover, since (AᵀA)uj = dj uj, we also have (AᵀA)⁻¹uj = (1/dj)uj. Thus,

    x = Σ_{j=1}^{n} ⟨Aᵀb, uj⟩ (AᵀA)⁻¹uj = (AᵀA)⁻¹ [ Σ_{j=1}^{n} ⟨Aᵀb, uj⟩ uj ]

But the set {uj} is an orthonormal basis of Rn; hence the term in brackets is
equal to Aᵀb. Thus, x = (AᵀA)⁻¹Aᵀb is that unique vector in Rn such that
Ax is closest to b.
This formula has an immediate application to curve fitting. Suppose we have a
set of data points (xj, yj), 1 ≤ j ≤ n, and we wish to find a straight line passing
through these points. If there are more than two data points, such a line is
usually nonexistent; see Figure 6.12. What is normally done in this situation
is to find the straight line y = mx + b such that the sum

    Σ_{j=1}^{n} ej² = Σ_{j=1}^{n} [yj − (mxj + b)]²

is minimized. The numbers ej equal the error in approximating yj by mxj + b.
Thus, in a certain sense, picking m and b in order to minimize the above sum
gives us the best straight-line approximation to our data. This line is often
referred to as the least squares fit.
[Figure 6.12: four data points (xj, yj) and the vertical errors ej between each point and the fitted line]
At this time the reader might find it advisable to review the discussion
immediately preceding Example 4 in Section 3.4.
Let A be the n × 2 matrix

    A = [ 1  x1 ]
        [ 1  x2 ]
        [ ⋮   ⋮ ]
        [ 1  xn ]
Think of R2 as pairs of numbers of the form (b, m)ᵀ, and A : R2 → Rn. We
wish to find a solution to the equation

    A (b, m)ᵀ = (y1, . . . , yn)ᵀ

The pair (b, m) is a solution to this equation if and only if the line y = mx + b
passes through each of the data points (xj, yj). Realizing that this is unlikely,
we look for that pair (b, m) such that A(b, m)ᵀ is closest to (y1, . . . , yn)ᵀ. Since
rank(A) is two (assuming at least two different xj's), we may apply Theorem 6.10. Thus, our approximate solution is

    (b, m)ᵀ = (AᵀA)⁻¹Aᵀ (y1, . . . , yn)ᵀ
One easily calculates that

    AᵀA = [ n      Σ xj  ]
          [ Σ xj   Σ xj² ]

and that

    Aᵀyᵀ = [ Σ yj    ]
           [ Σ xj yj ]
where y = (y1, . . . , yn) and all sums run from j = 1 to n. Using Cramer's rule
we have

    b = det [ Σ yj     Σ xj  ]   /   det [ n     Σ xj  ]
            [ Σ xj yj  Σ xj² ]           [ Σ xj  Σ xj² ]

    m = det [ n     Σ yj    ]   /   det [ n     Σ xj  ]
            [ Σ xj  Σ xj yj ]           [ Σ xj  Σ xj² ]               (6.26)
Example 1. Find the least squares fit to the following data:

    (1, −1), (2, 3), (3, 4), (7, 5)

Solution. We first construct the following table:

    xj     |  1   2   3   7        Σ xj    = 13
    xj²    |  1   4   9  49        Σ xj²   = 63
    yj     | −1   3   4   5        Σ yj    = 11
    xj yj  | −1   6  12  35        Σ xj yj = 52

A is a 4 × 2 matrix

    A = [ 1  1 ]
        [ 1  2 ]
        [ 1  3 ]
        [ 1  7 ]

and we have

    AᵀA = [  4  13 ]        Aᵀyᵀ = [ 11 ]
          [ 13  63 ]               [ 52 ]

Using (6.26),

    b = det [ 11  13 ] / det [  4  13 ] = 17/83
            [ 52  63 ]       [ 13  63 ]

    m = det [  4  11 ] / det [  4  13 ] = 65/83
            [ 13  52 ]       [ 13  63 ]

Thus, y = (65/83)x + (17/83) is the least squares straight-line approximation
to our data.
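Formula (6.25) applied to the data of Example 1 takes only a few lines (a sketch assuming NumPy; in practice np.linalg.lstsq is preferable to forming AᵀA explicitly, but the normal equations mirror the text):

```python
import numpy as np

# The data of Example 1
xs = np.array([1.0, 2.0, 3.0, 7.0])
ys = np.array([-1.0, 3.0, 4.0, 5.0])

# Build the n x 2 matrix A with rows (1, x_j) and solve the normal
# equations A^T A (b, m)^T = A^T y, i.e. formula (6.25)
A = np.column_stack([np.ones_like(xs), xs])
b, m = np.linalg.solve(A.T @ A, A.T @ ys)

print(np.isclose(b, 17/83), np.isclose(m, 65/83))   # True True
```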
There is no a priori reason why one should always insist upon fitting a
straight line to data. For example, we might wish to fit a parabola to the data.
That is, find a0 , a1 , and a2 such that y = a0 + a1 x + a2 x2 is the quadratic least
squares fit, cf. problem 5.
Problem Set 6.5

1. Let

       A = [ 2  1 ]
           [ 0  2 ]
           [ 3  4 ]

    a. Determine the range of A, and show that (1, 1, 0) is not in the range,
       i.e., the equation Ax = (1, 1, 0)ᵀ does not have a solution.
    b. Compute AᵀA and show that it is one-to-one.
    c. Solve the equation AᵀAx = Aᵀb, where b = (1, 1, 0).
    d. If x is your solution from part c, show that ‖Ax − b‖ is smaller than
       ‖w − b‖ for any other vector w in the range of A.
2. Determine the straight-line least squares fit for the following data: (1,1),
(2, −3), (4,0), (5,1), (10,3).
3. Determine the straight-line least squares fit for the following data:
a. (0,6), (3,0), (4, −2)
b. (−2, 4), (3,9), (4,7)
4. Consider the system of equations:

       3x1 + 4x2 + 8x3 = 0
        x1       −  x3 = 1
       2x1 +  x2 + 4x3 = 0
        x1 +  x2 +  x3 = 0

    a. This system is overdetermined (more equations than unknowns) and
       may not have a solution. Show that if there is a solution, it is unique.
    b. Show that this system does not have a solution, and then find x in
       R3 such that Ax is that vector in the range of A closest to (0, 1, 0, 0).
5. Determine the least squares quadratic fit for the following data; i.e., find
p(x) = a0 + a1x + a2x² such that Σ_{j=1}^{n} [p(xj) − yj]² is minimized, where
(xj, yj) are the given data. Remember, you will need to solve an equation
of the form AᵀAx = Aᵀb.

    a. (−2, 4), (3, 9), (4, 7)
    b. (1, 1), (2, −3), (4, 0), (5, 1), (10, 3)
Supplementary Problems
1. Define each of the following and give an example of each:
a. Length of a vector
b. Angle between two vectors
c. Orthonormal basis
d. Projection onto a subspace
e. Orthogonal matrix
2. Let x be a vector in Rn.

    a. Suppose ⟨x, y⟩ = 0 for every y in Rn. Show that x must be the zero
       vector.
    b. Suppose ⟨x, y⟩ = 0 for every vector y in some spanning set F of Rn.
       Show that x must be the zero vector.
3. Compute the inner product and the cosine of the angle between each of
the following pairs of vectors:
a. (−4, 5), (1,2)
b. (−2, 3, 7), (2, −4, 5)
c. (−1, −2, 3, 5), (1,1,0,8)
4. Let x0 = (1, −2, 6). Show that the vector (1, −2, 0) is that vector in the
subspace x3 = 0 which is closest to x0, by showing that

    f(x, y) = ‖(1, −2, 6) − x(1, 0, 0) − y(0, 1, 0)‖²

attains its minimum when x = 1 and y = −2. Repeat this for x0 = (a, b, c),
an arbitrary vector in R3.
5. Let V = M22. Define the inner product of two 2 × 2 matrices by

    ⟨A, B⟩ = Σ_{j=1}^{2} Σ_{k=1}^{2} ajk bjk

    a. Show that this inner product satisfies properties a through c of Theorem 6.3.
    b. Show that Theorem 6.4 is valid.
    c. Let

           A = [  c  6 ]        E1 = [ 1  0 ]
               [ −3  2 ]             [ 0  0 ]

       where c is a fixed constant. Let f(t) = ‖A − tE1‖². Find the value of
       t which minimizes f(t).
6. Given two vectors x = (x1, x2, x3) and y = (y1, y2, y3) in R3, define their
cross product by (cf. problem 8 in Supplementary Problems to Chapter 4)

    x × y = (x2y3 − x3y2, x3y1 − x1y3, x1y2 − x2y1)

    a. Show that the cross product of x and y is perpendicular to both x
       and y.
    b. Show that x × y = 0 if and only if x and y are linearly dependent.
    c. Show that i × j = k, j × k = i, and k × i = j.
    d. Verify that x × y = −y × x.
    e. Show that x × (y + z) = (x × y) + (x × z).
    f. Find three vectors x, y, and z for which x × (y × z) ≠ (x × y) × z.
       Hint: Use parts b and c.

    Thus, the cross product of two vectors is a noncommutative, nonassociative
    operation that produces a vector perpendicular to both of the original
    vectors.
7. Define x × y as in problem 6. Show that

    ‖x × y‖² + ⟨x, y⟩² = ‖x‖² ‖y‖²

    a. Deduce from the above formula that ‖x × y‖ = ‖x‖ ‖y‖ sin θ, where θ
       is the angle between the vectors x and y.
    b. If P is the parallelogram determined by x and y, show that area(P) =
       ‖x × y‖.
8. If x × y is defined as in problem 6, show that

    (x × y)′ = (x′ × y) + (x × y′)

where we assume that both x and y are vector-valued functions of a real
variable and ′ denotes differentiation.
9. Let P be an orthogonal n × n matrix. Let x and y be any two vectors in Rn.

    a. Show that ⟨x, y⟩ = ⟨Px, Py⟩. Deduce from this that the linear transformation L given by Lx = Px preserves the lengths of vectors and
       the angles between them.
    b. Conversely, show that if P is an n × n matrix for which ⟨x, y⟩ =
       ⟨Px, Py⟩ for every pair of vectors in Rn, then P is an orthogonal
       matrix.
10. A mapping T from Rn to Rn is said to be affine if Tx = Ax + a, where A is
an n × n matrix and a is a fixed vector in Rn. Clearly, if A is an orthogonal
matrix, then T is distance preserving, i.e., ‖Tx − Ty‖ = ‖x − y‖. Show
that the converse of this is also true. That is, if T is any mapping that
preserves distances, then T is affine and the matrix A is orthogonal.
11. Let x(t) and y(t) be two vector-valued functions from R to R2. If x(t) =
(x1(t), x2(t)), define x′(t) = (x1′(t), x2′(t)).

    a. Let c(t) be a real-valued differentiable function. Show that [cx]′ =
       c′x + cx′.
    b. Show that ⟨x, y⟩′ = ⟨x′, y⟩ + ⟨x, y′⟩.
    c. If x(t) is a vector-valued function with constant nonzero length, show
       that x and x′ are perpendicular if x′ is not the zero vector.
12. A linear transformation L from Rn to Rn is positive definite if ⟨Lx, x⟩ ≥ 0
for all vectors x and, whenever the inner product of x and Lx equals zero,
then x equals 0. Let A = [ajk] be the matrix representation of L with
respect to the standard basis. Assume that A is a symmetric matrix.

    a. If n = 2, show that L is positive definite if and only if a11 > 0 and
       det(A) > 0.
    b. If n = 3, show that L is positive definite if and only if a11 > 0,
       det(M33) > 0, and det(A) > 0, where M33 is the 2 × 2 matrix in the
       upper left-hand corner of A.
13. Let P be a linear transformation from Rn to Rn. Suppose P² = P.
That is, P(P(x)) = Px for all vectors x in Rn. Such mappings are called
projections.

    a. Show that (I − P)² = I − P. Thus, if P is a projection, so too is
       I − P.
    b. Show that the following are equivalent:
       (1) x is in the range of P.
       (2) Px = x.
       (3) (I − P)x = 0.
    c. Show that ker(P) = Rg(I − P).
    d. A projection is said to be orthogonal if ker(P) is orthogonal to Rg(P).
       Show that P is an orthogonal projection if and only if P = Pᵀ.
    e. Show that the projections defined in Section 6.2 are orthogonal projections in the sense of part d.
14. Let T : P2 → P3 be the linear transformation given by

    T[p](t) = ∫₀ᵗ p(s) ds

    Define an inner product on P2 by ⟨p, q⟩ = p0q0 + p1q1 + p2q2, where
    p(t) = p0 + p1t + p2t². Define an inner product on P3 in a similar manner.

    a. Show that p(t) ≡ 1 is not in the range of T.
    b. Show that T is one-to-one.
    c. Find the least squares solution to T(p) = 1. That is, find p in P2 such
       that T(p) is that vector in the range of T closest to the polynomial
       identically 1.
15. Let V = M22. Define the inner product of two matrices A = [ajk] and
B = [bjk] by

    ⟨A, B⟩ = Σ_{j=1}^{2} Σ_{k=1}^{2} ajk bjk

    Let F be the set consisting of the three matrices below:

    [  1  −1 ]      [  0  1 ]      [  1  2 ]
    [ −1   1 ]      [ −2  0 ]      [ −3  1 ]

    a. Show that F is a linearly independent set.
    b. Construct orthonormal bases for V and for S[F].
16. Define T : P2 → P3 by T[p](t) = ∫₀ᵗ p(s) ds − (t/2)p(t).

    a. Show that T is a linear transformation and determine its range and
       kernel.
    b. If inner products on P2 and P3 are defined as in problem 14, construct
       orthonormal bases for the range and kernel of T.