LINEAR ALGEBRA NOTES FOR APPROXIMATION THEORY
MATH 441, FALL 2009
THOMAS SHORES
Last Rev.: 9/1/09

1. Vector Spaces

Here is the general definition of a vector space:

Definition 1. An (abstract) vector space is a nonempty set V of elements called vectors, together with operations of vector addition (+) and scalar multiplication (·), such that the following laws hold for all vectors u, v, w ∈ V and scalars a, b ∈ F:
(1): (Closure of vector addition) u + v ∈ V.
(2): (Commutativity of addition) u + v = v + u.
(3): (Associativity of addition) u + (v + w) = (u + v) + w.
(4): (Additive identity) There exists an element 0 ∈ V such that u + 0 = u = 0 + u.
(5): (Additive inverse) There exists an element −u ∈ V such that u + (−u) = 0 = (−u) + u.
(6): (Closure of scalar multiplication) a · u ∈ V.
(7): (Distributive law) a · (u + v) = a · u + a · v.
(8): (Distributive law) (a + b) · u = a · u + b · u.
(9): (Associative law) (ab) · u = a · (b · u).
(10): (Monoidal law) 1 · u = u.

About notation: just as in matrix arithmetic, for vectors u, v ∈ V we understand that u − v = u + (−v). We also suppress the dot (·) of scalar multiplication and usually write au instead of a · u.

About scalars: the only scalars that we will use in 441 are the real numbers R.

Definition 2. Given a positive integer n, we define the standard vector space of dimension n over the reals to be the set of vectors
    R^n = {(x1, x2, . . . , xn) | x1, x2, . . . , xn ∈ R}
together with the standard vector addition and scalar multiplication. (Recall that (x1, x2, . . . , xn) is shorthand for the column vector [x1, x2, . . . , xn]^T.)

We see immediately from the definition that the required closure properties of vector addition and scalar multiplication hold, so these really are vector spaces in the sense defined above. The standard real vector spaces are often called the real Euclidean vector spaces once the notion of a norm (a notion of length covered in the next chapter) is attached to them.
Example 3. Let C[a, b] denote the set of all real-valued functions that are continuous on the interval [a, b], and use the standard function addition and scalar multiplication for these functions. That is, for f, g ∈ C[a, b] and real number c, we define the functions f + g and cf by
    (f + g)(x) = f(x) + g(x)
    (cf)(x) = c(f(x)).
Show that C[a, b] with the given operations is a vector space.

We set V = C[a, b] and check the vector space axioms for this V. For the rest of this example, we let f, g, h be arbitrary elements of V. We know from calculus that the sum of any two continuous functions is continuous and that any constant times a continuous function is also continuous. Therefore the closure of addition and that of scalar multiplication hold.

Now for all x such that a ≤ x ≤ b, we have from the definition and the commutative law of real number addition that
    (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
Since this holds for all x, we conclude that f + g = g + f, which is the commutative law of vector addition. Similarly,
    ((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)) = (f + (g + h))(x).
Since this holds for all x, we conclude that (f + g) + h = f + (g + h), which is the associative law for addition of vectors.

Next, if 0 denotes the constant function with value 0, then for any f ∈ V we have that for all a ≤ x ≤ b,
    (f + 0)(x) = f(x) + 0 = f(x).
(We don't write the zero element of this vector space in boldface because it's customary not to write functions in bold.) Since this is true for all x, we have that f + 0 = f, which establishes the additive identity law. Also, we define (−f)(x) = −(f(x)), so that for all a ≤ x ≤ b,
    (f + (−f))(x) = f(x) − f(x) = 0,
from which we see that f + (−f) = 0. The additive inverse law follows.

For the distributive laws, note that for real numbers c, d and continuous functions f, g ∈ V, we have that for all a ≤ x ≤ b,
    c(f + g)(x) = c(f(x) + g(x)) = cf(x) + cg(x) = (cf + cg)(x),
which proves the first distributive law. For the second distributive law, note that for all a ≤ x ≤ b,
    ((c + d)g)(x) = (c + d)g(x) = cg(x) + dg(x) = (cg + dg)(x),
and the second distributive law follows. For the scalar associative law, observe that for all a ≤ x ≤ b,
    ((cd)f)(x) = (cd)f(x) = c(df(x)) = (c(df))(x),
so that (cd)f = c(df), as required. Finally, we see that (1f)(x) = 1f(x) = f(x), from which we have the monoidal law 1f = f. Thus, C[a, b] with the prescribed operations is a vector space.
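The pointwise operations above are easy to make concrete on a computer. Below is a minimal Python sketch (my illustration, not part of the original notes) that represents elements of C[a, b] as ordinary Python functions and spot-checks the commutative and first distributive laws at a few sample points. Checking finitely many points illustrates the axioms rather than proving them; the proof is the calculation above.

```python
import math

def add(f, g):
    # Pointwise sum: (f + g)(x) = f(x) + g(x)
    return lambda x: f(x) + g(x)

def scale(c, f):
    # Pointwise scalar multiple: (c f)(x) = c * f(x)
    return lambda x: c * f(x)

f, g = math.sin, math.exp   # two elements of C[0, 1]

for x in (0.0, 0.25, 0.5, 1.0):
    # Commutativity: f + g = g + f
    assert math.isclose(add(f, g)(x), add(g, f)(x))
    # First distributive law: c(f + g) = cf + cg
    assert math.isclose(scale(2.0, add(f, g))(x),
                        add(scale(2.0, f), scale(2.0, g))(x))
print("axioms hold at the sampled points")
```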
We could have abbreviated the verification in Example 3 by using the Subspace Test. Recall from Math 314:

Definition 4. A subspace of the vector space V is a subset W of V such that W, together with the binary operations it inherits from V, forms a vector space (over the same field of scalars as V) in its own right.

Given a subset W of the vector space V, we can apply the definition of vector space directly to the subset W to obtain the following very useful test.

Theorem 5. Let W be a subset of the vector space V. Then W is a subspace of V if and only if
(1): W contains the zero element of V.
(2): (Closure of addition) For all u, v ∈ W, u + v ∈ W.
(3): (Closure of scalar multiplication) For all u ∈ W and scalars c, cu ∈ W.

Finally, we recall some key ideas about linear combinations of vectors.

Definition 6. Let v1, v2, . . . , vn be vectors in the vector space V. The span of these vectors, denoted by span{v1, v2, . . . , vn}, is the subset of V consisting of all possible linear combinations of these vectors, i.e.,
    span{v1, v2, . . . , vn} = {c1 v1 + c2 v2 + · · · + cn vn | c1, c2, . . . , cn are scalars}.
Recall that one can show that spans are always subspaces of the containing vector space.

A few more fundamental ideas:

Definition 7. The vectors v1, v2, . . . , vn are said to be linearly dependent if there exist scalars c1, c2, . . . , cn, not all zero, such that
(1.1)    c1 v1 + c2 v2 + · · · + cn vn = 0.
Otherwise, the vectors are called linearly independent.

Definition 8. A basis for the vector space V is a spanning set of vectors v1, v2, . . . , vn that is a linearly independent set.

In particular, we have the following key facts about bases:

Theorem 9. Let v1, v2, . . . , vn be a basis of the vector space V. Then every v ∈ V can be expressed uniquely, up to order of terms, as a linear combination of v1, v2, . . . , vn.

Definition 10. The vector space V is called finite-dimensional if V has a finite spanning set.

It's easy to see that every finite-dimensional space has a basis: simply start with a spanning set and throw away elements until you have reduced the set to a minimal spanning set. Such a set will be linearly independent.

Theorem 11. Let w1, w2, . . . , wr be a linearly independent set in the space V and let v1, v2, . . . , vn be a basis of V. Then r ≤ n, and we may substitute all of the wi's for r of the vj's in such a way that the resulting set of vectors is still a basis of V.

As a consequence of the Steinitz substitution principle above, we obtain this central result:

Theorem 12. Let V be a finite-dimensional vector space. Then any two bases of V have the same number of elements, which is called the dimension of the vector space and denoted by dim V.

Exercises

Exercise 1. Show that the subset P of C[a, b] consisting of all polynomial functions is a subspace of C[a, b], and that the subset Pn consisting of all polynomials of degree at most n on the interval [a, b] is a subspace of P.

Exercise 2. Show that Pn is a finite-dimensional vector space of dimension n + 1, but that P is not a finite-dimensional space, that is, P does not have a finite basis (linearly independent spanning set).
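Before leaving these basics, note that linear independence (Definition 7) is easy to test computationally in R^n: vectors v1, . . . , vk are independent exactly when the matrix with these vectors as columns has rank k. Here is a small numpy sketch of that test (my illustration, not from the notes); numerical rank is computed up to floating-point tolerance, so this is a practical check rather than an exact one.

```python
import numpy as np

def is_independent(vectors):
    # Stack the vectors as columns; independence <=> full column rank.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([1.0, 1.0, 0.0])
v3 = np.array([2.0, 1.0, 0.0])   # v3 = v1 + v2, a dependency

print(is_independent([v1, v2]))      # True
print(is_independent([v1, v2, v3]))  # False: equation (1.1) has a solution
```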
2. Norms

The basic definition:

Definition 13. A norm on the vector space V is a function ‖·‖ that assigns to each vector v ∈ V a real number ‖v‖ such that for c a scalar and u, v ∈ V the following hold:
(1): ‖u‖ ≥ 0, with equality if and only if u = 0.
(2): ‖cu‖ = |c| ‖u‖.
(3): (Triangle Inequality) ‖u + v‖ ≤ ‖u‖ + ‖v‖.

A vector space V, together with a norm ‖·‖ on the space V, is called a normed (linear) space. If u, v ∈ V, the distance between u and v is defined to be d(u, v) = ‖u − v‖.

Notice that if V is a normed space and W is any subspace of V, then W automatically becomes a normed space if we simply use the norm of V on elements of W. Obviously all the norm laws still hold, since they hold for elements of the bigger space V.

Of course, we have already studied some very important examples of normed spaces, namely the standard vector space R^n, or any subspace thereof, together with the standard norm given by
    ‖(z1, z2, . . . , zn)‖ = (|z1|^2 + |z2|^2 + · · · + |zn|^2)^{1/2}.
Since our vectors are real, we can drop the conjugate bars. This norm is actually one of a family of norms that are commonly used.

Definition 14. Let V be one of the standard spaces R^n and p ≥ 1 a real number. The p-norm of a vector in V is defined by the formula
    ‖(z1, z2, . . . , zn)‖_p = (|z1|^p + |z2|^p + · · · + |zn|^p)^{1/p}.

Notice that when p = 2 we have the familiar example of the standard norm. Another important case is that in which p = 1. The last important instance of a p-norm is one that isn't so obvious: p = ∞. It turns out that the value of this norm is the limit of p-norms as p → ∞. To keep matters simple, we'll supply a separate definition for this norm.

Definition 15. Let V be one of the standard spaces R^n or C^n. The ∞-norm of a vector in V is defined by the formula
    ‖(z1, z2, . . . , zn)‖_∞ = max{|z1|, |z2|, . . . , |zn|}.

Example 16. Calculate ‖v‖_p, where p = 1, 2, or ∞ and v = (1, −3, 2, −1) ∈ R^4.

We calculate:
    ‖(1, −3, 2, −1)‖_1 = |1| + |−3| + |2| + |−1| = 7
    ‖(1, −3, 2, −1)‖_2 = (|1|^2 + |−3|^2 + |2|^2 + |−1|^2)^{1/2} = √15
    ‖(1, −3, 2, −1)‖_∞ = max{|1|, |−3|, |2|, |−1|} = 3.

It may seem a bit odd at first to speak of the same vector as having different lengths. You should take the point of view that choosing a norm is a bit like choosing a measuring stick. If you choose a yard stick, you won't measure the same number as you would by using a meter stick on the same object.
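The computations of Example 16 can be confirmed with numpy, whose vector norm routine follows Definitions 14 and 15; the loop at the end illustrates the remark that the ∞-norm is the limit of the p-norms as p → ∞. A quick sketch:

```python
import numpy as np

v = np.array([1.0, -3.0, 2.0, -1.0])

print(np.linalg.norm(v, 1))       # 7.0
print(np.linalg.norm(v, 2))       # 3.8729... = sqrt(15)
print(np.linalg.norm(v, np.inf))  # 3.0

# The p-norms decrease toward the infinity norm as p grows.
for p in (1, 2, 4, 10, 100):
    print(p, np.linalg.norm(v, p))
```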
Example 17. Verify that the norm properties are satisfied for the p-norm in the case that p = ∞.

Let c be a scalar, and let u = (z1, z2, . . . , zn) and v = (w1, w2, . . . , wn) be two vectors. Any absolute value is nonnegative, and any vector whose largest component in absolute value is zero must have all components equal to zero. Property (1) follows. Next, we have that
    ‖cu‖_∞ = ‖(cz1, cz2, . . . , czn)‖_∞ = max{|cz1|, |cz2|, . . . , |czn|} = |c| max{|z1|, |z2|, . . . , |zn|} = |c| ‖u‖_∞,
which proves (2). For (3) we observe that
    ‖u + v‖_∞ = max{|z1 + w1|, |z2 + w2|, . . . , |zn + wn|}
              ≤ max{|z1| + |w1|, |z2| + |w2|, . . . , |zn| + |wn|}
              ≤ max{|z1|, |z2|, . . . , |zn|} + max{|w1|, |w2|, . . . , |wn|}
              = ‖u‖_∞ + ‖v‖_∞.

2.1. Unit Vectors. Sometimes it is convenient to deal with vectors whose length is one. Such a vector is called a unit vector. We saw in Chapter 3 that it is easy to concoct a unit vector u in the same direction as a given nonzero vector v when using the standard norm, namely take
(2.1)    u = v / ‖v‖.
The same formula holds for any norm whatsoever because of norm property (2).

Example 18. Construct a unit vector in the direction of v = (1, −3, 2, −1), where the 1-norm, 2-norm, and ∞-norm are used to measure length.

We already calculated each of the norms of v. Use these numbers in equation (2.1) to obtain the unit-length vectors
    u1 = (1/7)(1, −3, 2, −1)
    u2 = (1/√15)(1, −3, 2, −1)
    u∞ = (1/3)(1, −3, 2, −1).
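Formula (2.1) and Example 18 translate directly into code. In the sketch below, each normalized vector has length one when measured in the norm used to build it (though not, in general, in the other norms):

```python
import numpy as np

v = np.array([1.0, -3.0, 2.0, -1.0])

for p in (1, 2, np.inf):
    u = v / np.linalg.norm(v, p)        # formula (2.1)
    print(p, u, np.linalg.norm(u, p))   # the last entry is always 1.0
```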
From a geometric point of view there are certain sets of vectors in the vector space V that tell us a lot about distances. These are the so-called balls about a vector (or point) v0 of radius r, whose definition is as follows:

Definition 19. The (closed) ball of radius r centered at the vector v0 is the set of vectors
    Br(v0) = {v ∈ V | ‖v − v0‖ ≤ r}.
The set of vectors
    B°r(v0) = {v ∈ V | ‖v − v0‖ < r}
is the open ball of radius r centered at the vector v0.

Here is a situation very important to approximation theory in which these balls are helpful: imagine trying to find the distance from a given vector v0 to a closed set S of vectors (this means S contains all points on its boundary) that need not be a subspace. One way to accomplish this is to start with a ball centered at v0 so small that the ball avoids S. Then expand this ball by increasing its radius until you have found a least radius r such that the ball Br(v0) intersects S nontrivially. Then the distance from v0 to this set is this number r. Actually, this is a reasonable definition of the distance from v0 to the set S. One expects these balls, for a given norm, to have the same shape, so it is sufficient to look at the unit balls, that is, the case r = 1.

Example 20. Sketch the unit balls centered at the origin for the 1-norm, 2-norm, and ∞-norm in the space V = R^2.

In each case it's easiest to determine the boundary of the ball B1(0), i.e., the set of vectors v = (x, y) such that ‖v‖ = 1. These boundaries are sketched in Figure 2.1, and the ball consists of each boundary plus the interior it encloses. Let's start with the familiar 2-norm. Here the boundary consists of points (x, y) such that
    1 = ‖(x, y)‖_2 = (x^2 + y^2)^{1/2},
which is the familiar circle of radius 1 centered at the origin. Next, consider the 1-norm, in which case
    1 = ‖(x, y)‖_1 = |x| + |y|.
It's easier to examine this formula in each quadrant, where it becomes one of the four possibilities ±x ± y = 1. For example, in the first quadrant we get x + y = 1. These equations give lines that connect to form a square whose sides are diagonal lines. Finally, for the ∞-norm we have
    1 = ‖(x, y)‖_∞ = max{|x|, |y|},
which gives four horizontal and vertical lines x = ±1 and y = ±1. These intersect to form another square. Thus we see that the unit balls for the 1- and ∞-norms have corners, unlike the 2-norm. See Figure 2.1 for a picture of these balls.

[Figure 2.1. Boundaries of the unit balls ‖v‖_1 = 1, ‖v‖_2 = 1, and ‖v‖_∞ = 1 in R^2.]

One of the important applications of the norm concept is that it enables us to make sense out of the idea of limits and convergence of vectors. In a nutshell, lim_{n→∞} vn = v is taken to mean that lim_{n→∞} ‖vn − v‖ = 0. In this case we say that the sequence v1, v2, . . . converges to v. Will we have a different notion of limits for different norms? For finite-dimensional spaces, the somewhat surprising answer is no. The reason is that given any two norms ‖·‖_a and ‖·‖_b on a finite-dimensional vector space, it is always possible to find positive real constants c and d such that for any vector v,
    ‖v‖_a ≤ c ‖v‖_b  and  ‖v‖_b ≤ d ‖v‖_a.
Hence, if ‖vn − v‖ tends to 0 in one norm, it will tend to 0 in the other norm. For this reason, any two norms satisfying these inequalities are called equivalent. It can be shown that all norms on a finite-dimensional vector space are equivalent. Indeed, it can be shown that the condition that ‖vn − v‖ tends to 0 in any one norm is equivalent to the condition that each coordinate of vn converges to the corresponding coordinate of v. We will verify the limit fact in the following example.

Example 21. Verify that lim_{n→∞} vn exists and is the same with respect to both the 1-norm and 2-norm, where
    vn = ((1 − n)/n, e^{−n} + 1).
Which norm is easier to work with?

First we have to know what the limit will be. Let's examine the limit in each coordinate. We have
    lim_{n→∞} (1 − n)/n = lim_{n→∞} (1/n − 1) = 0 − 1 = −1  and  lim_{n→∞} (e^{−n} + 1) = 0 + 1 = 1.
So we try to use v = (−1, 1) as the limiting vector. Now calculate
    v − vn = (−1 − (1 − n)/n, 1 − (e^{−n} + 1)) = (−1/n, −e^{−n}),
so that
    ‖v − vn‖_1 = 1/n + e^{−n} → 0 as n → ∞
and
    ‖v − vn‖_2 = ((1/n)^2 + (e^{−n})^2)^{1/2} → 0 as n → ∞,
which shows that the limits are the same in either norm. In this case the 1-norm appears to be easier to work with, since no squaring and square roots are involved.
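Example 21 can also be watched numerically: compute ‖v − vn‖ in both norms for increasing n and observe both distances tending to zero. A short sketch:

```python
import numpy as np

v = np.array([-1.0, 1.0])                      # the conjectured limit
for n in (1, 10, 100, 1000):
    vn = np.array([(1 - n) / n, np.exp(-n) + 1])
    print(n, np.linalg.norm(v - vn, 1), np.linalg.norm(v - vn, 2))
# Both distance columns tend to 0: convergence in either norm.
```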
Here are two examples of norms defined on nonstandard vector spaces:

Definition 22. The p-norm on C[a, b] is defined by
    ‖f‖_p = ( ∫_a^b |f(x)|^p dx )^{1/p}.
Although this is a common form of the definition, a better form that is often used is
    ‖f‖_p = ( (1/(b − a)) ∫_a^b |f(x)|^p dx )^{1/p}.
This form is better in the sense that it scales for the size of the interval.

Definition 23. The uniform (or infinity) norm on C[a, b] is defined by
    ‖f‖_∞ = max_{a≤x≤b} |f(x)|.
This norm is well defined by the extreme value theorem, which guarantees that the maximum value of a continuous function on a closed interval exists. We leave verification of the norm laws as an exercise.
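In computations one usually approximates these function-space norms numerically: the integral in the scaled p-norm by a quadrature rule, and the maximum in the uniform norm by sampling on a fine grid. Here is a rough numpy sketch along those lines (the helper names are mine, and both values are approximations whose accuracy depends on the grid):

```python
import numpy as np

def p_norm(f, a, b, p, n=10_001):
    # Scaled p-norm ((1/(b-a)) * integral of |f|^p)^(1/p), trapezoid rule.
    x = np.linspace(a, b, n)
    y = np.abs(f(x)) ** p
    integral = (b - a) / (n - 1) * (y.sum() - 0.5 * (y[0] + y[-1]))
    return (integral / (b - a)) ** (1.0 / p)

def sup_norm(f, a, b, n=10_001):
    # Uniform norm, approximated by the maximum over a fine grid.
    x = np.linspace(a, b, n)
    return np.abs(f(x)).max()

print(p_norm(np.sin, 0.0, np.pi, 2))   # approx 1/sqrt(2) = 0.7071...
print(sup_norm(np.sin, 0.0, np.pi))    # approx 1.0
```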
Convexity. The basic idea is that if a set in a linear space is convex, then the line connecting any two points in the set should lie entirely inside the set. Here's how to say this in symbols:

Definition 24. A set S in the vector space V is convex if, for any vectors u, v ∈ S, all vectors of the form λu + (1 − λ)v, 0 ≤ λ ≤ 1, are also in S.

Definition 25. A set S in the normed linear space V is strictly convex if, for any vectors u, v ∈ S, all vectors of the form w = λu + (1 − λ)v, 0 < λ < 1, are in the interior of S, that is, for each w there exists a positive r such that the ball Br(w) is entirely contained in S.

Exercises

Exercise 3. Show that the uniform norm on C[a, b] satisfies the norm properties.

Exercise 4. Show that for positive r and v0 ∈ V, a normed linear space, the ball Br(v0) is a convex set. Show by example that it need not be strictly convex.

3. Inner Product Spaces

The dot product of calculus and Math 314 amounted to the standard inner product of two standard vectors. We now extend this idea to a setting that allows for abstract vector spaces.

Definition 26. An (abstract) inner product on the vector space V is a function ⟨·, ·⟩ that assigns to each pair of vectors u, v ∈ V a scalar ⟨u, v⟩ such that for c a scalar and u, v, w ∈ V the following hold:
(1): ⟨u, u⟩ ≥ 0, with ⟨u, u⟩ = 0 if and only if u = 0.
(2): ⟨u, v⟩ = ⟨v, u⟩*, where * denotes complex conjugation.
(3): ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩.
(4): ⟨u, cv⟩ = c ⟨u, v⟩.

A vector space V, together with an inner product ⟨·, ·⟩ on the space V, is called an inner product space. Notice that in the case of the more common vector spaces over real scalars, property (2) becomes the commutative law: ⟨u, v⟩ = ⟨v, u⟩. Also observe that if V is an inner product space and W is any subspace of V, then W automatically becomes an inner product space if we simply use the inner product of V on elements of W. For all the inner product laws still hold, since they hold for elements of the larger space V.

Of course, we have the standard examples of inner products, namely the dot products on R^n and C^n.

Example 27. For vectors u, v ∈ R^n, with u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn), define
    u · v = u1 v1 + u2 v2 + · · · + un vn = u^T v.
This is just the standard dot product, and one can verify that all the inner product laws are satisfied by application of the laws of matrix arithmetic.

Here is an example of a nonstandard inner product on a standard space that is useful in certain engineering problems.

Example 28. For vectors u = (u1, u2) and v = (v1, v2) in V = R^2, define an inner product by the formula
    ⟨u, v⟩ = 2 u1 v1 + 3 u2 v2.
Show that this formula satisfies the inner product laws.

First we see that ⟨u, u⟩ = 2u1^2 + 3u2^2, so the only way for this sum to be 0 is for u1 = u2 = 0. Hence (1) holds. For (2), calculate
    ⟨u, v⟩ = 2 u1 v1 + 3 u2 v2 = 2 v1 u1 + 3 v2 u2 = ⟨v, u⟩,
and this equals the conjugate of ⟨v, u⟩, since all scalars in question are real. For (3), let w = (w1, w2) and calculate
    ⟨u, v + w⟩ = 2 u1 (v1 + w1) + 3 u2 (v2 + w2) = 2 u1 v1 + 3 u2 v2 + 2 u1 w1 + 3 u2 w2 = ⟨u, v⟩ + ⟨u, w⟩.
For the last property, check that for a scalar c,
    ⟨u, cv⟩ = 2 u1 (c v1) + 3 u2 (c v2) = c (2 u1 v1 + 3 u2 v2) = c ⟨u, v⟩.
It follows that this weighted inner product is indeed an inner product according to our definition.

In fact, we can do a whole lot more with even less effort. Consider this example, of which the preceding is a special case.

Example 29. Let A be an n × n Hermitian matrix (A = A*) and define the product ⟨u, v⟩ = u* A v for all u, v ∈ V, where V is R^n or C^n. Show that this product satisfies inner product laws (2), (3), and (4), and that if, in addition, A is positive definite, then the product satisfies (1) and is an inner product.

As usual, let u, v, w ∈ V and let c be a scalar. For (2), remember that a 1 × 1 quantity q satisfies q* = q̄ (its complex conjugate), so we calculate
    ⟨v, u⟩ = v* A u = (u* A v)* = ⟨u, v⟩*.
For (3), we calculate
    ⟨u, v + w⟩ = u* A (v + w) = u* A v + u* A w = ⟨u, v⟩ + ⟨u, w⟩.
For (4), we have that
    ⟨u, cv⟩ = u* A (cv) = c u* A v = c ⟨u, v⟩.
Finally, if we suppose that A is also positive definite, then by definition ⟨u, u⟩ = u* A u > 0 for u ≠ 0, which shows that inner product property (1) holds. Hence, this product defines an inner product.

We leave it to the reader to check that if we take
    A = [ 2 0
          0 3 ],
we obtain the inner product of the first example above.
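Examples 28 and 29 connect as follows: with A = diag(2, 3), the matrix product u^T A v reproduces the weighted formula 2u1v1 + 3u2v2. A small numpy sketch of the real case (positive definiteness is only spot-checked on sample vectors here, not proved):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])        # symmetric (Hermitian) positive definite

def ip(u, v):
    # Real case of Example 29: <u, v> = u^T A v.
    return u @ A @ v

u = np.array([1.0, -1.0])
v = np.array([1.0, 1.0])

print(ip(u, v))                   # 2*1*1 + 3*(-1)*1 = -1.0 (Example 28)
print(ip(u, u), ip(v, v))         # both positive, as property (1) requires
```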
Here is an example of an inner product space that is useful in approximation theory:

Example 30. Let V = C[a, b], the space of continuous functions on the interval [a, b] with the usual function addition and scalar multiplication. Show that the formula
    ⟨f, g⟩ = ∫_a^b f(x) g(x) dx
defines an inner product on the space V.

Certainly ⟨f, g⟩ is a real number. Now if f(x) is a continuous function, then f(x)^2 is nonnegative on [a, b] and therefore ∫_a^b f(x)^2 dx = ⟨f, f⟩ ≥ 0. Furthermore, if f(x) is nonzero, then the area under the curve y = f(x)^2 must be positive, since f(x)^2 will be positive and bounded away from 0 on some subinterval of [a, b]. This establishes property (1) of inner products. Now let f(x), g(x), h(x) ∈ V. For property (2), notice that
    ⟨f, g⟩ = ∫_a^b f(x) g(x) dx = ∫_a^b g(x) f(x) dx = ⟨g, f⟩.
Also,
    ⟨f, g + h⟩ = ∫_a^b f(x)(g(x) + h(x)) dx = ∫_a^b f(x) g(x) dx + ∫_a^b f(x) h(x) dx = ⟨f, g⟩ + ⟨f, h⟩,
which establishes property (3). Finally, we see that for a scalar c,
    ⟨f, cg⟩ = ∫_a^b f(x) c g(x) dx = c ∫_a^b f(x) g(x) dx = c ⟨f, g⟩,
which shows that property (4) holds.

We shall refer to this inner product on a function space as the standard inner product on the function space C[a, b]. (Most of our examples and exercises involving function spaces will deal with polynomials, so we remind the reader of the integration formula ∫_a^b x^m dx = (b^{m+1} − a^{m+1})/(m + 1) and its special case ∫_0^1 x^m dx = 1/(m + 1), for m ≥ 0.)
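The standard inner product on C[a, b] is also easy to approximate numerically, again by a trapezoid rule; the integration formula in parentheses above gives exact values to compare against. A hedged sketch (the helper name is mine):

```python
import numpy as np

def ip(f, g, a=0.0, b=1.0, n=10_001):
    # <f, g> = integral of f(x) g(x) over [a, b], trapezoid rule.
    x = np.linspace(a, b, n)
    y = f(x) * g(x)
    return (b - a) / (n - 1) * (y.sum() - 0.5 * (y[0] + y[-1]))

# Compare with the integration formula: these are integrals of powers
# of x over [0, 1].
print(ip(lambda x: x, lambda x: x))        # approx 1/3 = integral of x^2
print(ip(lambda x: x, lambda x: x ** 2))   # approx 1/4 = integral of x^3
```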
Following are a few simple facts about inner products that we will use frequently. The proofs are left to the exercises.

Theorem 31. Let V be an inner product space with inner product ⟨·, ·⟩. Then we have that for all u, v, w ∈ V and scalars a,
(1): ⟨u, 0⟩ = 0 = ⟨0, u⟩,
(2): ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩,
(3): ⟨au, v⟩ = ā ⟨u, v⟩ (which is a ⟨u, v⟩ when the scalars are real).

Induced Norms and the CBS Inequality. It is a striking fact that we can accomplish all the goals we set for the standard inner product using general inner products: we can introduce the ideas of angles, orthogonality, projections, and so forth. We have already seen much of the work that has to be done, though it was stated in the context of the standard inner products. As a first step, we want to point out that every inner product has a natural norm associated with it.

Definition 32. Let V be an inner product space. For vectors u ∈ V, the norm defined by the equation
    ‖u‖ = ⟨u, u⟩^{1/2}
is called the norm induced by the inner product ⟨·, ·⟩ on V.

As a matter of fact, this idea is not really new. Recall that we introduced the standard inner product on V = R^n or C^n with an eye toward the standard norm. At the time it seemed like a nice convenience that the norm could be expressed in terms of the inner product. It is, and so much so that we have turned this cozy relationship into a definition. Just calling the induced norm a norm doesn't make it so. Is the induced norm really a norm? We have some work to do.

The first norm property is easy to verify for the induced norm: from property (1) of inner products we see that ⟨u, u⟩ ≥ 0, with equality if and only if u = 0. This confirms norm property (1). Norm property (2) isn't too hard either: let c be a scalar and check that
    ‖cu‖ = ⟨cu, cu⟩^{1/2} = (c̄ c ⟨u, u⟩)^{1/2} = (|c|^2 ⟨u, u⟩)^{1/2} = |c| ‖u‖.
Norm property (3), the triangle inequality, remains. This one isn't easy to verify from first principles. We need a tool called the Cauchy-Bunyakovsky-Schwarz (CBS) inequality.

Theorem 33. (CBS Inequality) Let V be an inner product space. For u, v ∈ V, if we use the inner product of V and its induced norm, then
    |⟨u, v⟩| ≤ ‖u‖ ‖v‖.

Henceforth, when the norm sign ‖·‖ is used in connection with a given inner product, it is understood that this norm is the induced norm of this inner product, unless otherwise stated. Just as with the standard dot products, we can formulate the following definition thanks to the CBS inequality.

Definition 34. For vectors u, v ∈ V, a real inner product space, we define the angle between u and v to be any angle θ satisfying
    cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).
We know that |⟨u, v⟩| / (‖u‖ ‖v‖) ≤ 1, so that this formula for cos θ makes sense.

Example 35. Let u = (1, −1) and v = (1, 1) be vectors in R^2. Compute an angle between these two vectors using the inner product of Example 28. Compare this to the angle found when one uses the standard inner product in R^2.

Solution. According to Example 28 and the definition of angle, we have
    cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖) = (2 · 1 · 1 + 3 · (−1) · 1) / ( (2 · 1^2 + 3 · (−1)^2)^{1/2} (2 · 1^2 + 3 · 1^2)^{1/2} ) = −1/5.
Hence the angle in radians is θ = arccos(−1/5) ≈ 1.7722. On the other hand, if we use the standard inner product, then ⟨u, v⟩ = 1 · 1 + (−1) · 1 = 0, from which it follows that u and v are orthogonal and θ = π/2 ≈ 1.5708.

In the previous example, it shouldn't be too surprising that we can arrive at two different values for the angle between two vectors. Using different inner products to measure angle is somewhat like measuring length with different norms.

Next, we extend the perpendicularity idea to arbitrary inner product spaces.

Definition 36. Two vectors u and v in the same inner product space are orthogonal if ⟨u, v⟩ = 0.

Note that if ⟨u, v⟩ = 0, then ⟨v, u⟩ = ⟨u, v⟩* = 0. Also, this definition makes the zero vector orthogonal to every other vector. It also allows us to speak of things like orthogonal functions. One has to be careful with new ideas like this. Orthogonality in a function space is not something that can be as easily visualized as orthogonality of geometrical vectors. Inspecting the graphs of two functions may not be quite enough. If, however, graphical data is tempered with a little understanding of the particular inner product in use, orthogonality can be detected.

Example 37. Show that f(x) = x and g(x) = x − 2/3 are orthogonal elements of C[0, 1] with the inner product of Example 30, and provide graphical evidence of this fact.

Solution. According to the definition of inner product in this space,
    ⟨f, g⟩ = ∫_0^1 f(x) g(x) dx = ∫_0^1 x (x − 2/3) dx = [x^3/3 − x^2/3]_0^1 = 0.
It follows that f and g are orthogonal to each other. For graphical evidence, sketch f(x), g(x), and f(x)g(x) on the interval [0, 1] as in Figure 3.1. The graphs of f and g are not especially enlightening; but we can see in the graph that the area below f · g and above the x-axis to the right of (2/3, 0) seems to be about equal to the area above f · g and below the x-axis to the left of (2/3, 0). Therefore the integral of the product on the interval [0, 1] might be expected to be zero, which is indeed the case.

[Figure 3.1. Graphs of f, g, and f · g on the interval [0, 1].]
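Both Example 35 and Example 37 can be checked numerically. The sketch below computes the weighted angle arccos(−1/5) and approximates ⟨f, g⟩ = ∫_0^1 x(x − 2/3) dx by a trapezoid rule; the tiny nonzero value in the second computation is quadrature error, not a failure of orthogonality.

```python
import numpy as np

# Example 35: angle under the weighted inner product of Example 28.
def wip(u, v):
    return 2 * u[0] * v[0] + 3 * u[1] * v[1]

u, v = np.array([1.0, -1.0]), np.array([1.0, 1.0])
cos_theta = wip(u, v) / np.sqrt(wip(u, u) * wip(v, v))
print(np.arccos(cos_theta))    # approx 1.7722; the dot product gives pi/2

# Example 37: <f, g> for f(x) = x, g(x) = x - 2/3 in C[0, 1].
x = np.linspace(0.0, 1.0, 10_001)
y = x * (x - 2.0 / 3.0)
print((y.sum() - 0.5 * (y[0] + y[-1])) / (len(x) - 1))   # approx 0
```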
Some of the basic ideas from geometry that fuel our visual intuition extend very elegantly to the inner product space setting. One such example is the famous Pythagorean theorem, which takes the following form in an inner product space.

Theorem 38. Let u, v be orthogonal vectors in an inner product space V. Then
    ‖u‖^2 + ‖v‖^2 = ‖u + v‖^2.

Proof. Compute
    ‖u + v‖^2 = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩ = ⟨u, u⟩ + ⟨v, v⟩ = ‖u‖^2 + ‖v‖^2.

Here is an example of another standard geometrical fact that fits well in the abstract setting. This is equivalent to the law of parallelograms, which says that the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of all four sides.

Example 39. Use properties of inner products to show that if we use the induced norm, then
    ‖u + v‖^2 + ‖u − v‖^2 = 2(‖u‖^2 + ‖v‖^2).

The key to proving this fact is to relate the induced norm to the inner product. Specifically,
    ‖u + v‖^2 = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩,
while
    ‖u − v‖^2 = ⟨u − v, u − v⟩ = ⟨u, u⟩ − ⟨u, v⟩ − ⟨v, u⟩ + ⟨v, v⟩.
Now add these two equations and obtain, by using the definition of induced norm again, that
    ‖u + v‖^2 + ‖u − v‖^2 = 2⟨u, u⟩ + 2⟨v, v⟩ = 2(‖u‖^2 + ‖v‖^2),
which is what was to be shown.

It would be nice to think that every norm on a vector space is induced from some inner product. Unfortunately, this is not true, as the following example shows.

Example 40. Use the result of Example 39 to show that the infinity norm on V = R^2 is not induced by any inner product on V.

Solution. Suppose the infinity norm were induced by some inner product on V. Let u = (1, 0) and v = (0, 1/2). Then we have
    ‖u + v‖_∞^2 + ‖u − v‖_∞^2 = ‖(1, 1/2)‖_∞^2 + ‖(1, −1/2)‖_∞^2 = 2,
while
    2(‖u‖_∞^2 + ‖v‖_∞^2) = 2(1 + 1/4) = 5/2.
This contradicts Example 39, so the infinity norm cannot be induced from an inner product.

Definition 41. The set of vectors v1, v2, . . . , vn in an inner product space is said to be an orthogonal set if ⟨vi, vj⟩ = 0 whenever i ≠ j. If, in addition, each vector has unit length, i.e., ⟨vi, vi⟩ = 1 for all i, then the set of vectors is said to be an orthonormal set of vectors.

The proofs of the following key fact and its corollaries are the same as those for standard dot products. All we have to do is replace dot products by inner products. The observations that followed the proof of this theorem are valid for general inner products as well. We omit the proofs.

Theorem 42. Let v1, v2, . . . , vn be an orthogonal set of nonzero vectors and suppose that v ∈ span{v1, v2, . . . , vn}. Then v can be expressed uniquely (up to order) as a linear combination of v1, v2, . . . , vn, namely
    v = (⟨v1, v⟩/⟨v1, v1⟩) v1 + (⟨v2, v⟩/⟨v2, v2⟩) v2 + · · · + (⟨vn, v⟩/⟨vn, vn⟩) vn.

Corollary 43. Every orthogonal set of nonzero vectors is linearly independent.

Another useful corollary is the following fact about the length of a vector, whose proof is left as an exercise.

Corollary 44. If v1, v2, . . . , vn is an orthogonal set of vectors and v = c1 v1 + c2 v2 + · · · + cn vn, then
    ‖v‖^2 = c1^2 ‖v1‖^2 + c2^2 ‖v2‖^2 + · · · + cn^2 ‖vn‖^2.

Exercises

Exercise 5. Confirm that p1(x) = x and p2(x) = 3x^2 − 1 are orthogonal elements of C[−1, 1] with the standard inner product, and determine whether the following polynomials belong to span{p1(x), p2(x)} using Theorem 42: (a) x^2, (b) 1 + x − 3x^2, (c) 1 + 3x − 3x^2.
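As a computational illustration of Theorem 42 and Corollary 44, the sketch below expands a vector of R^3 against an orthogonal (not orthonormal) set using the coefficient formula ⟨vi, v⟩/⟨vi, vi⟩, then reconstructs the vector and checks the length formula. The particular vectors are my own example.

```python
import numpy as np

# An orthogonal set in R^3 under the standard dot product.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])

v = np.array([3.0, 1.0, 4.0])

# Theorem 42: c_i = <v_i, v> / <v_i, v_i>.
coeffs = [(w @ v) / (w @ w) for w in (v1, v2, v3)]
recon = sum(c * w for c, w in zip(coeffs, (v1, v2, v3)))
print(coeffs)                      # [2.0, 1.0, 2.0]
print(recon)                       # [3. 1. 4.], recovering v

# Corollary 44: ||v||^2 equals the sum of c_i^2 ||v_i||^2.
lhs = v @ v
rhs = sum(c ** 2 * (w @ w) for c, w in zip(coeffs, (v1, v2, v3)))
print(lhs, rhs)                    # 26.0 26.0
```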
4. Linear Operators

Before giving the definition of linear operator, let us recall some notation that is associated with functions in general. We identify a function with the notation f : D → T, where D and T are the domain and target of the function, respectively. This means that for each x in the domain D, the value f(x) is a uniquely determined element in the target T. We want to emphasize at the outset that there is a difference here between the target of a function and its range. The range of the function f is defined as the subset of the target
    range(f) = {y | y = f(x) for some x ∈ D},
which is just the set of all possible values of f(x). A function is said to be one-to-one if, whenever f(x) = f(y), then x = y. Also, a function is said to be onto if the range of f equals its target.

For example, we can define a function f : R → R by the formula f(x) = x^2. It follows from our specification of f that the target of f is understood to be R, while the range of f is the set of nonnegative real numbers. Therefore, f is not onto. Moreover, f(−1) = f(1) and −1 ≠ 1, so f is not one-to-one either.

A function that maps elements of one vector space into another, say f : V → W, is sometimes called an operator or transformation. One of the simplest mappings of a vector space V is the so-called identity function idV : V → V given by idV(v) = v, for all v ∈ V. Here domain, range, and target all agree. Of course, matters can become more complicated. For example, the operator f : R^2 → R^3 might be given by the formula
    f((x, y)) = (x^2, xy, y^2).
Notice in this example that the target of f is R^3, which is not the same as the range of f, since elements in the range have nonnegative first and third coordinates. From the point of view of linear algebra, this function lacks the essential feature that makes it really interesting, namely linearity.

Definition 45. A function T : V → W from the vector space V into the space W over the same field of scalars is called a linear operator (mapping, transformation) if for all vectors u, v ∈ V and scalars c, d, we have
    T(cu + dv) = cT(u) + dT(v).

By taking c = d = 1 in the definition, we see that a linear function T is additive, that is, T(u + v) = T(u) + T(v). Also, by taking d = 0 in the definition, we see that a linear function is outative (scalars pass outside), that is, T(cu) = cT(u). As a matter of fact, these two conditions imply the linearity property, and so are equivalent to it. We leave this fact as an exercise.

If T : V → V is a linear operator, we simply say that T is a linear operator on V. A linear operator T : V → R is called a linear functional on V.

By repeated application of the linearity definition, we can extend the linearity property to any linear combination of vectors, not just two terms. This means that for any scalars c1, c2, . . . , cn and vectors v1, v2, . . . , vn, we have
    T(c1 v1 + c2 v2 + · · · + cn vn) = c1 T(v1) + c2 T(v2) + · · · + cn T(vn).

Example 46. Determine whether T is a linear operator, where (a) T : R^2 → R^3 is given by the formula T((x, y)) = (x^2, xy, y^2), or (b) T : R^2 → R^2 is given by the formula
    T((x, y)) = [ 1  1 ] [ x ]
                [ 0 −1 ] [ y ].

If T is defined by (a), then we show by a simple example that T fails to be linear. Let us calculate
    T((1, 0) + (0, 1)) = T((1, 1)) = (1, 1, 1),
while
    T((1, 0)) + T((0, 1)) = (1, 0, 0) + (0, 0, 1) = (1, 0, 1).
These two are not equal, so T fails to satisfy the linearity property.

Next consider the operator T defined as in (b). Write
    A = [ 1  1 ]   and   v = (x, y),
        [ 0 −1 ]
and we see that the action of T can be given as T(v) = Av. Now we have already seen that the operation of multiplication by a fixed matrix is a linear operator.

Recall that an operator f : V → W is said to be invertible if there is an operator g : W → V such that the composition of functions satisfies f ∘ g = idW and g ∘ f = idV. In other words, f(g(w)) = w and g(f(v)) = v for all w ∈ W and v ∈ V. We write g = f^{−1} and call f^{−1} the inverse of f. One can show that for any operator f, linear or not, being invertible is equivalent to being both one-to-one and onto.

Example 47. Show that if f : V → W is an invertible linear operator on vector spaces, then f^{−1} is also a linear operator.

Solution. We need to show that for u, v ∈ W, the linearity property f^{−1}(cu + dv) = c f^{−1}(u) + d f^{−1}(v) is valid. Let w = c f^{−1}(u) + d f^{−1}(v). Apply the function f to both sides and use the linearity of f to obtain that
    f(w) = f(c f^{−1}(u) + d f^{−1}(v)) = c f(f^{−1}(u)) + d f(f^{−1}(v)) = cu + dv.
Apply f^{−1} to obtain that w = f^{−1}(f(w)) = f^{−1}(cu + dv), which proves the linearity property.

Abstraction gives us a nice framework for certain key properties of mathematical objects, some of which we have seen before. For example, in calculus we were taught that differentiation has the linearity property. Now we can express this assertion in a larger context: let V be the space of differentiable functions and define an operator T on V by the rule T(f(x)) = f′(x). Then T is a linear operator on the space V.

Operator Norms. The following idea lets us measure the size of an operator in terms of how much the operator is capable of stretching an argument. This is a very useful notion for approximation theory in the situation where our approximation to the element f can be described as applying an operator T to f to obtain the approximation T(f).

Definition 48. Let T : V → W be a linear operator between normed linear spaces. The operator norm of T is defined to be
    ‖T‖ = max_{0 ≠ v ∈ V} ‖T(v)‖ / ‖v‖ = max_{v ∈ V, ‖v‖ = 1} ‖T(v)‖,
provided the maximum value exists, in which case the operator is called bounded. Otherwise the operator is called unbounded.

This norm really is a norm on the appropriate space.

Theorem 49. Given normed linear spaces V, W, the set L(V, W) of all bounded linear operators from V to W is a vector space with the usual function addition and scalar multiplication. Moreover, the operator norm on L(V, W) is a vector norm, so that L(V, W) is also a normed linear space.

This theorem gives us an idea as to why operator norms are relevant to approximation theory. Suppose you want to approximate functions f in some function space by a method which can be described as applying a linear operator T to f (we'll see lots of examples of this in approximation theory). One rough measure of how good an approximation we have is that T(f) should be close to f. Put another way, ‖T(f)‖ / ‖f‖ should be close to one. So if ‖T‖ is much larger than one, we expect that for some functions f, T(f) will be a poor approximation to f.

Bounded linear operators on infinite-dimensional spaces are a large subject which forms part of the area of mathematics called functional analysis. However, in the case of finite dimension, the situation is much simpler.

Theorem 50. Let T : V → W be a linear operator, where V is a finite-dimensional space. Then T is bounded, so that L(V, W) consists of all linear operators from V to W.

Some notation: if both V and W are standard spaces with the same standard p-norm, then the operator norm of T is denoted by ‖T‖_p. In some cases the operator norm is fairly simple to compute. Here is an example.

Example 51. Let TA : R^n → R^m be the linear operator given by matrix-vector multiplication, TA(v) = Av, where A = [aij] is an m × n matrix with (i, j)-th entry aij. Then TA is bounded by the previous theorem, and moreover
    ‖TA‖_∞ = max_{1≤i≤m} Σ_{j=1}^{n} |aij|.

Solution. To see this, observe that in order for a vector v ∈ V to have infinity norm 1, all coordinates must be at most 1 in absolute value and at least one must be equal to one. Now choose the row i that has the largest sum of absolute values Σ_{j=1}^{n} |aij|, and let v be the vector whose j-th coordinate is the signum of aij, so that |aij| = vj aij for all j. Then it is easily checked that this choice attains the largest possible value of ‖Av‖_∞.
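Example 51 is easy to confirm with numpy, whose matrix ∞-norm is exactly the maximal absolute row sum; the sketch also constructs the sign vector from the argument above and verifies that it attains the norm. The matrix is my own example.

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [4.0,  0.0, -1.0]])

row_sums = np.abs(A).sum(axis=1)
print(row_sums.max())               # 6.0, the largest absolute row sum
print(np.linalg.norm(A, np.inf))    # 6.0, numpy agrees

# The maximizer: v_j = signum of a_ij for the maximizing row i.
i = row_sums.argmax()
v = np.sign(A[i])
print(np.linalg.norm(v, np.inf))        # 1.0
print(np.linalg.norm(A @ v, np.inf))    # 6.0, the norm is attained
```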
Exercises

Exercise 6. Show that differentiation is a linear operator on V = C^∞(R), the space of functions defined on the real line that are infinitely differentiable.

Exercise 7. Show that the operator T : C[0, 1] → R given by T(f) = ∫_0^1 f(x) dx is a linear operator.

5. Metric Spaces and Analysis

We start with the definition of a metric space which, in general, is rather different from a vector space, although the main examples for us are in fact normed linear spaces and their subsets.

Definition 52. A metric space is a nonempty set X of objects called points, together with a function d(x, y), called a metric for X, mapping pairs of points x, y ∈ X to real numbers and satisfying, for all points x, y, z:
(1): d(x, y) ≥ 0, with equality if and only if x = y.
(2): d(x, y) = d(y, x).
(3): d(x, z) ≤ d(x, y) + d(y, z).

Examples abound. This is a more general concept than norm, since metric spaces do not have to be vector spaces, so that every subset of a metric space is also a metric space with the inherited metric. There are many interesting examples that are not subsets of normed linear spaces. One such example can be found in Powell's text [1], Exercise 1.2. BTW, this example is important for problems of shape recognition.

Example 53. For a very nonstandard example, consider a finite network in which every pair of nodes is connected by a path of edges, each of which has a positive cost associated with traversing it. Assume that there are no self-edges. Define the distance between any two nodes x and y to be the minimum cost of a path connecting x to y if x ≠ y; otherwise the distance is 0. This definition turns the network into a metric space.

Next, we consider some topological ideas that are useful for metric spaces. In all of the following we assume that X is a metric space with metric d(x, y).

Definition 54. Given r > 0 and a point x0 in X, the (closed) ball of radius r centered at the point x0 is the set of points
    Br(x0) = {y ∈ X | d(x0, y) ≤ r}.
The set of points
    B°r(x0) = {y ∈ X | d(x0, y) < r}
is the open ball of radius r centered at the point x0.

Definition 55. A subset Y of the metric space X is open if for every point y ∈ Y there exists a ball Br(y) contained entirely in Y. The set Y is closed if its complement X\Y is an open set.

(I'll finish this later.)

Definition 56. bounded set
Definition 57. boundary
Definition 58. compact, sequentially compact
Definition 59. continuous function
Example 60. Linear operators

The key theorems we need are:

Theorem 61. For a metric space X, compactness and sequential compactness are equivalent.
Theorem 62. Heine-Borel theorem
Theorem 63. EVT

Finally, we have the following famous theorem from analysis, which tells us that the polynomials are a dense subset of C[a, b] with the infinity norm, in the sense that every open ball in C[a, b] with respect to this norm contains a polynomial:

Theorem 64. (Weierstrass) Given any f ∈ C[a, b] and number ε > 0, there exists a polynomial p(x) such that ‖f − p‖_∞ < ε.

Exercise 8. Show that every normed linear space V with norm ‖·‖ is a metric space with metric given by d(u, v) = ‖u − v‖.

References

[1] M.J.D. Powell, Approximation Theory and Methods, Cambridge University Press, Cambridge, 1981.
[2] T. Shores, Applied Linear Algebra and Matrix Analysis, Springer, New York, 2007.