Optimization

Critical Points

In this section, we develop a method for finding the extrema (i.e., the maximum and minimum points) of a function of two variables. For reasons which will soon be apparent, this method is called the second derivative test.

To begin with, we say that a function $f(x,y)$ has a local maximum at a point $(p,q)$ if there is a circle of radius $R > 0$ centered at $(p,q)$ such that $f(x,y) \le f(p,q)$ for all $(x,y)$ in that circle. That is, $f(p,q)$ is the maximum height of some small patch of the surface, although it may not be the maximum overall.

It follows that if $|h| < R$, then $f(p+h,q) \le f(p,q)$ and thus
\[ f(p+h,q) - f(p,q) \le 0 \]
Dividing by $h$ when $0 < h < R$ and letting $h$ approach 0 from the right yields
\[ f_x(p,q) = \lim_{h \to 0^+} \frac{f(p+h,q) - f(p,q)}{h} \le 0 \]
Conversely, dividing by $h$ when $-R < h < 0$ (which reverses the inequality) and letting $h$ approach 0 from the left yields
\[ f_x(p,q) = \lim_{h \to 0^-} \frac{f(p+h,q) - f(p,q)}{h} \ge 0 \]
Consequently, assuming the partial derivative exists, it must follow that $f_x(p,q) = 0$. A similar argument shows that $f_y(p,q) = 0$. That is, the tangent plane to the graph of $f(x,y)$ is horizontal at a local maximum or a local minimum. Similar results hold if $f(x,y)$ has a local minimum at a point $(p,q)$, since this is equivalent to $-f(x,y)$ having a local maximum at $(p,q)$.

Definition 8.1: The critical points of a function $f(x,y)$ are those points $(p,q)$ for which
\[ f_x(p,q) = 0 \quad \text{and} \quad f_y(p,q) = 0 \]
By the discussion above, the local extrema of a differentiable function $f(x,y)$ must occur at its critical points.
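The argument above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function `f` is a hypothetical example chosen only for illustration (a downward-opening paraboloid whose peak is at (1, -2)), and the helper name `partials` is an assumption. It estimates the first partial derivatives with central differences and shows that both are essentially zero at the local maximum but not at an ordinary point.

```python
def partials(f, x, y, h=1e-5):
    """Central-difference estimates of (f_x, f_y) at the point (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def f(x, y):
    # Hypothetical example: paraboloid with a local maximum at (1, -2).
    return 5 - (x - 1) ** 2 - (y + 2) ** 2


# At the local maximum both partials vanish (up to rounding error),
# so (1, -2) is a critical point.
fx_peak, fy_peak = partials(f, 1.0, -2.0)

# At an ordinary point the partials are generally nonzero;
# here f_x = -2(x - 1) = 2 and f_y = -2(y + 2) = -4 at the origin.
fx_origin, fy_origin = partials(f, 0.0, 0.0)

print(fx_peak, fy_peak)
print(fx_origin, fy_origin)
```

A vanishing gradient is only a necessary condition: the same check also succeeds at saddle points, which is why a further test is needed.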
EXAMPLE 1 Find the critical point(s) of
\[ f(x,y) = x^3 - 3xy + y^3 \]
Solution: The first partial derivatives are
\[ f_x(x,y) = 3x^2 - 3y, \qquad f_y(x,y) = -3x + 3y^2 \]
Setting $f_x$ and $f_y$ equal to zero leads to 2 simultaneous equations:
\[ 3x^2 - 3y = 0, \qquad -3x + 3y^2 = 0 \]
Simplifying leads to $y = x^2$ and $x = y^2$, which implies that $x = x^4$. Since $x = x^4$ is the same as $x^4 - x = 0$, we obtain
\[ \begin{aligned} x\left(x^3 - 1\right) &= 0 \\ x\,(x - 1)\left(x^2 + x + 1\right) &= 0 \end{aligned} \]
which results in $x = 0$ and $x = 1$, because $x^2 + x + 1$ has no real roots. Since $y = x^2$, we have $x = 0$ implies $y = 0$, while $x = 1$ implies $y = 1$. Thus, the critical points are $(0,0)$ and $(1,1)$.

Check your Reading: Which of the critical points in example 1 does not correspond to a local extremum of $f(x,y)$?

The Second Derivative Test

Clearly, $f(x,y)$ has a local maximum at a critical point $(p,q)$ only if every vertical slice of $z = f(x,y)$ through $(p,q)$ has a maximum at $(p,q)$. Similarly, $f(x,y)$ has a local minimum at $(p,q)$ only if every vertical slice of $z = f(x,y)$ through $(p,q)$ has a minimum at $(p,q)$. However, it is possible for the surface to be concave up in one slice and concave down in another slice.
If this is the case, then we say that $f(x,y)$ has a saddle at $(p,q)$.

To determine if we get a maximum, a minimum, or a saddle at a critical point $(p,q)$, we consider the vertical slice
\[ z(t) = f(p + mt,\; q + nt) \]
Since $x = p + mt$ and $y = q + nt$ implies that $x'(t) = m$ and $y'(t) = n$, the first derivative of $z(t)$ is
\[ \frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = m f_x + n f_y \]
Moreover, $m$ and $n$ constant implies that
\[ \begin{aligned} z'' = \frac{d}{dt}\frac{dz}{dt} &= m\,\frac{df_x}{dt} + n\,\frac{df_y}{dt} \\ &= m\left(\frac{\partial f_x}{\partial x}\frac{dx}{dt} + \frac{\partial f_x}{\partial y}\frac{dy}{dt}\right) + n\left(\frac{\partial f_y}{\partial x}\frac{dx}{dt} + \frac{\partial f_y}{\partial y}\frac{dy}{dt}\right) \\ &= m\left(m f_{xx} + n f_{xy}\right) + n\left(m f_{yx} + n f_{yy}\right) \end{aligned} \]
Expanding and using the equality of the mixed partials then yields
\[ z''(0) = m^2 f_{xx}(p,q) + 2mn\, f_{xy}(p,q) + n^2 f_{yy}(p,q) \qquad (1) \]
If $f_{xx}(p,q) = 0$ and $f_{xy}(p,q) \ne 0$, then we can choose $m$ and $n$ so that $z''(0)$ is negative in some slices and positive in others, thus implying that $z = f(x,y)$ has a saddle at $(p,q)$. If $f_{xx}(p,q) \ne 0$, then completing the square in $m$ yields
\[ z''(0) = f_{xx}(p,q)\left(m + \frac{f_{xy}(p,q)}{f_{xx}(p,q)}\, n\right)^2 + \frac{D(p,q)}{f_{xx}(p,q)}\, n^2 \qquad (2) \]
where $D = f_{xx} f_{yy} - \left(f_{xy}\right)^2$ is called the discriminant of $f$ (i.e., expanding (2) will result in (1)).

If $D(p,q) > 0$, then $z''(0)$ has the same sign as $f_{xx}(p,q)$ in all directions $\mathbf{u} = \langle m, n \rangle$, thus implying a maximum if $f_{xx}(p,q) < 0$ and a minimum if $f_{xx}(p,q) > 0$. However, if $D(p,q) < 0$, then choosing $m = 1$ and $n = 0$ yields $z''(0) = f_{xx}(p,q)$, whereas choosing $n = 1$ and $m = -f_{xy}(p,q)/f_{xx}(p,q)$ yields $z''(0) = D(p,q)/f_{xx}(p,q)$, which has the opposite sign, thus implying a saddle. These observations lead to the following theorem:

Second Derivative Test: If $(p,q)$ is a critical point of a function $f(x,y)$ whose second derivatives exist at $(p,q)$, then

Discriminant        2nd derivative           Result
$D(p,q) > 0$        $f_{xx}(p,q) > 0$        $f(x,y)$ has a local minimum at $(p,q)$
$D(p,q) > 0$        $f_{xx}(p,q) < 0$        $f(x,y)$ has a local maximum at $(p,q)$
$D(p,q) < 0$                                 $f(x,y)$ has a saddle at $(p,q)$

However, if $D(p,q) = 0$, then no information about $f(x,y)$ is obtained.

EXAMPLE 2 Identify the extrema and saddle points of
\[ f(x,y) = x^2 - y^2 \]
Solution: Since $f_x = 2x$ and $f_y = -2y$, the only critical point is $(0,0)$. However, $f_{xx} = 2$, $f_{yy} = -2$, and $f_{xy} = 0$, so that the discriminant of $f$ is
\[ D = f_{xx} f_{yy} - \left(f_{xy}\right)^2 = (2)(-2) - 0^2 = -4 < 0 \]
Thus, $f(x,y) = x^2 - y^2$ has a saddle at $(0,0)$.

EXAMPLE 3 Find the extrema and saddle points of
\[ f(x,y) = x^3 - 3xy + y^3 \]
Solution: In example 1, we showed that the critical points of $f$ are $(0,0)$ and $(1,1)$. Since $f_x(x,y) = 3x^2 - 3y$ and $f_y(x,y) = -3x + 3y^2$, the second derivatives of $f(x,y)$ are
\[ f_{xx} = 6x, \qquad f_{xy} = -3, \qquad f_{yy} = 6y \]
Thus, the discriminant is
\[ D(x,y) = (6x)(6y) - (-3)^2 = 36xy - 9 \]
At $(0,0)$, we have $D(0,0) = 0 - 9 = -9 < 0$. Thus, $f$ has a saddle at $(0,0)$. At $(1,1)$, we have
\[ D(1,1) = 36 \cdot 1 \cdot 1 - 9 = 27 > 0 \]
Moreover, $f_{xx}(1,1) = 6 > 0$, so $f$ has a local minimum at $(1,1)$.

EXAMPLE 4 Find the local extrema and saddle points of
\[ f(x,y) = x \sin(xy) \]
Solution: The first partial derivatives are
\[ f_x = \sin(xy) + xy\cos(xy), \qquad f_y = x^2 \cos(xy) \]
Setting $f_y = 0$ yields either $x = 0$ or $\cos(xy) = 0$, the latter of which implies that
\[ xy = \frac{\pi}{2} + n\pi \]
for any integer $n$. At such points, we would have $f_x(x,y) = \sin(xy)$, which is either $1$ or $-1$ (but not 0). However, if $x = 0$, then $f_x(0,y) = \sin(0) + 0 = 0$, which implies that both $f_x = 0$ and $f_y = 0$ at $(0,0)$ (and, in fact, at every point $(0,y)$ on the $y$-axis). The second derivatives are
\[ \begin{aligned} f_{xx} &= 2y\cos(xy) - xy^2 \sin(xy) \\ f_{xy} = f_{yx} &= 2x\cos(xy) - x^2 y \sin(xy) \\ f_{yy} &= -x^3 \sin(xy) \end{aligned} \]
and $f_{xx}(0,0) = f_{xy}(0,0) = f_{yy}(0,0) = 0$. Thus, the discriminant is $D(0,0) = 0$, so the second derivative test provides no information about the extrema or saddles of $f(x,y) = x\sin(xy)$ at $(0,0)$. (The same is true at every other point $(0,y)$, since $f_{xy}$ and $f_{yy}$ both vanish when $x = 0$.)

Although it appears that there is a saddle at $(0,0)$ in example 4, there is no way of determining this using the second derivative test.
Indeed, $g(x,y) = x^4 + y^4$ is positive everywhere except for $g(0,0) = 0$, so clearly $g(x,y)$ has a minimum at $(0,0)$. But $g_{xx}(0,0) = g_{xy}(0,0) = g_{yy}(0,0) = 0$ implies that $D(0,0) = 0$, so the minimum cannot be identified using the second derivative test.

Check your reading: Does $p(x,y) = x^4 - y^4$ have any local extrema that can be identified using the second derivative test?

Linear Systems and Quadratic Extrema

Many applications involve quadratic functions, where a quadratic function is a function that is a second degree polynomial in each variable. When a quadratic function has a critical point, it must be the solution to a system of simultaneous linear equations (also known as a linear system) of the form
\[ \begin{aligned} ax + by &= r \\ cx + dy &= s \end{aligned} \]
One way of solving a linear system is to multiply the first equation by $-c$, multiply the second by $a$, and then add the two equations to eliminate $x$:
\[ \begin{aligned} -acx - bcy &= -rc \\ acx + ady &= sa \\ (ad - bc)\,y &= sa - rc \end{aligned} \]
After solving for $y$ (which requires $ad - bc \ne 0$), substitution can be used to determine $x$. Or any of a number of other variations may be used instead.

EXAMPLE 5 Find the point(s) on the plane $z = x + y - 3$ that are closest to the origin.

Solution: To begin with, we let $f$ denote the square of the distance from a point $(x,y,z)$ to the origin. Consequently,
\[ f = x^2 + y^2 + z^2 \]
Substituting $z = x + y - 3$ thus yields
\[ f(x,y) = x^2 + y^2 + (x + y - 3)^2 \]
Since $f_x = 4x + 2y - 6$ and $f_y = 2x + 4y - 6$, we must solve
\[ 4x + 2y = 6, \qquad 2x + 4y = 6 \]
Multiplying the second equation by $-2$ and adding it to the first yields
\[ \begin{aligned} 4x + 2y &= 6 \\ -4x - 8y &= -12 \\ 0x - 6y &= -6 \end{aligned} \]
so that $y = 1$. Similarly, we find that $x = 1$, so the critical point is $(1,1)$. Moreover, $f_{xx} = 4$, $f_{xy} = 2$, and $f_{yy} = 4$, so that the discriminant is
\[ D = f_{xx} f_{yy} - f_{xy}^2 = 16 - 4 = 12 > 0 \]
Since $f_{xx} = 4 > 0$, every "slice" is concave up and correspondingly, $f$ has a minimum at $(1,1)$. Substitution yields
\[ z = 1 + 1 - 3 = -1 \]
so that $(1,1,-1)$ is the point in the plane $z = x + y - 3$ that is closest to the origin.
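The elimination procedure for a linear system can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `solve_2x2` and its argument order are assumptions, and exact fractions are used merely to avoid rounding. It is applied to the system 4x + 2y = 6, 2x + 4y = 6 from example 5.

```python
from fractions import Fraction


def solve_2x2(a, b, r, c, d, s):
    """Solve a*x + b*y = r and c*x + d*y = s by elimination.

    Returns (x, y) as exact fractions and raises ValueError when
    ad - bc = 0, in which case there is no unique solution.
    """
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        raise ValueError("ad - bc = 0: the system has no unique solution")
    y = (Fraction(s) * a - Fraction(r) * c) / det  # (ad - bc) y = sa - rc
    x = (Fraction(r) * d - Fraction(s) * b) / det  # (ad - bc) x = rd - sb
    return x, y


# Critical point of f(x, y) = x^2 + y^2 + (x + y - 3)^2 from example 5.
x, y = solve_2x2(4, 2, 6, 2, 4, 6)
z = x + y - 3  # height of the corresponding point on the plane
print(x, y, z)
```

The same function applies to the critical-point equations of any quadratic function of two variables, as long as ad - bc is nonzero.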
One of the most important applications in statistics is finding the equation of the line that best fits a data set of the form
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]
where by best fit we mean the line which produces the least error. Specifically, the $j$th error or residual in approximating the data set with the line $y = mx + b$ is
\[ \varepsilon_j = m x_j + b - y_j \]
Thus, $\varepsilon_j^2$ is the square of the vertical distance from the point to the line. We then define the least squares line for the data set to be the line with the slope $m$ and the $y$-intercept $b$ that minimizes the total squared error
\[ E(m,b) = \sum_{j=1}^{n} \left(m x_j + b - y_j\right)^2 \]
That is, the least squares line minimizes the sum of the squares of the residuals.

EXAMPLE 6 Find the least squares line for the data set $(1,1)$, $(2,3)$, $(3,5)$, and $(4,4)$.

Solution: To find $E(m,b)$, we calculate the squares of the residuals for each of the data points and then compute their sum:
\[ \begin{aligned} \varepsilon_1^2 &: & (m \cdot 1 + b - 1)^2 &= m^2 + 2mb - 2m + b^2 - 2b + 1 \\ \varepsilon_2^2 &: & (m \cdot 2 + b - 3)^2 &= 4m^2 + 4mb - 12m + b^2 - 6b + 9 \\ \varepsilon_3^2 &: & (m \cdot 3 + b - 5)^2 &= 9m^2 + 6mb - 30m + b^2 - 10b + 25 \\ \varepsilon_4^2 &: & (m \cdot 4 + b - 4)^2 &= 16m^2 + 8mb - 32m + b^2 - 8b + 16 \\ & & E(m,b) &= 30m^2 + 20mb - 76m + 4b^2 - 26b + 51 \end{aligned} \]
The first partial derivatives of $E(m,b)$ are
\[ E_m(m,b) = 60m + 20b - 76 \quad \text{and} \quad E_b(m,b) = 20m + 8b - 26 \]
Thus, the critical points must satisfy
\[ \begin{aligned} 60m + 20b &= 76 \\ 20m + 8b &= 26 \end{aligned} \]
Multiplying the latter by $-3$ and adding it to the former yields
\[ \begin{aligned} 60m + 20b &= 76 \\ -60m - 24b &= -78 \\ 0m - 4b &= -2 \end{aligned} \]
Thus, $b = 0.5$ and likewise, we find that $m = 1.1$. The second derivatives of $E(m,b)$ are
\[ E_{mm} = 60, \qquad E_{mb} = 20, \qquad E_{bb} = 8 \]
and as a result, the discriminant is
\[ D = 60 \cdot 8 - (20)^2 = 80 > 0 \]
which, together with $E_{mm} = 60 > 0$, implies that $E(m,b)$ has a minimum at $m = 1.1$ and $b = 0.5$. Thus, the least squares line for the data set $(1,1)$, $(2,3)$, $(3,5)$, and $(4,4)$ is
\[ y = 1.1x + 0.5 \]
Typically, due to the size of the data sets involved, least squares problems are not solved by hand. Correspondingly, our investigation of least squares problems is treated with greater depth and more examples in the associated Maple worksheet.
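As a small illustration of how such a computation might be automated (in Python here rather than in the Maple worksheet the text refers to), the sketch below sets E_m = 0 and E_b = 0 for a general data set and solves the resulting 2 x 2 linear system. The names `least_squares_line` and `total_squared_error` are assumptions introduced for this example; the data are those of example 6.

```python
def least_squares_line(points):
    """Return (m, b) minimizing E(m, b) = sum of (m*x + b - y)^2."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    # E_m = 0 and E_b = 0 reduce (after dividing by 2) to
    #   sxx*m + sx*b = sxy
    #   sx*m  + n*b  = sy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x-values coincide: no unique least squares line")
    m = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b


def total_squared_error(m, b, points):
    """Evaluate E(m, b) directly from the residuals."""
    return sum((m * x + b - y) ** 2 for x, y in points)


data = [(1, 1), (2, 3), (3, 5), (4, 4)]  # data set of example 6
m, b = least_squares_line(data)
print(m, b, total_squared_error(m, b, data))
```

For the data of example 6 the sums are n = 4, sx = 10, sy = 13, sxx = 30 and sxy = 38, so the system solved here is a multiple of the one obtained by hand.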
Check your reading: Why did we use the square of the distance instead of the actual distance in example 5?

Positive Definite Matrices and the Hessian

The second derivative test can be generalized to any number of variables, but to do so requires that we reinterpret our results in section 2 in terms of the Hessian of $f(x,y)$. To begin with, let us notice that we obtained (1) in the form
\[ z'' = m^2 f_{xx} + 2mn\, f_{xy} + n^2 f_{yy} \]
by simplifying it from
\[ z'' = m\left(m f_{xx} + n f_{xy}\right) + n\left(m f_{yx} + n f_{yy}\right) \qquad (3) \]
However, (3) is the inner product of the vector $\mathbf{u} = \langle m, n \rangle$ with the Hessian applied to $\mathbf{u}$ as a column matrix:
\[ H_f\, \mathbf{u} = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} m f_{xx} + n f_{xy} \\ m f_{yx} + n f_{yy} \end{bmatrix} \]
That is, (1) in section 2 is in actuality given by
\[ z''(0) = \mathbf{u} \cdot H_f(p,q)\, \mathbf{u} \]
If $z''(0) > 0$ for every direction $\mathbf{u} \ne \mathbf{0}$, then all the vertical slices are concave up at $(p,q)$ and thus $f(x,y)$ has a local minimum at $(p,q)$. If $z''(0) < 0$ for every direction $\mathbf{u} \ne \mathbf{0}$, then $f(x,y)$ has a local maximum at $(p,q)$. But if $z''(0)$ is negative for some directions $\mathbf{u}$ and positive for other choices of $\mathbf{u}$, then $f(x,y)$ has a saddle at $(p,q)$. This motivates the following definition:

Definition 8.3: An $n \times n$ matrix $A$ is positive definite if
\[ \mathbf{u} \cdot A\mathbf{u} > 0 \]
for all $n$-dimensional vectors $\mathbf{u} \ne \mathbf{0}$. Correspondingly, if $-A$ is positive definite, then $A$ itself is said to be negative definite. If $\mathbf{u} \cdot A\mathbf{u}$ is negative for some vectors $\mathbf{u}$ and positive for others, then $A$ is not definite.

The second derivative test then corresponds to the definiteness (or lack thereof) of the Hessian of $f$ at a critical point $(p,q)$. Moreover, our discussion in section 2 that led to the definition of the discriminant can be restated as a theorem:

Theorem 8.4: Let $A$ be the symmetric matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]
Then $A$ is positive definite if and only if $a_{11} > 0$ and $\det(A) > 0$.

Indeed, notice that the discriminant $D = f_{xx} f_{yy} - f_{xy}^2$ is the determinant of the Hessian matrix.
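The relationship between the quadratic form and the criterion of Theorem 8.4 can be explored numerically. The Python sketch below is an illustration only: the function names are assumptions, and sampling finitely many directions is a sanity check rather than a proof. The two matrices are the Hessians of x^3 - 3xy + y^3 at (1, 1) and at (0, 0), computed from the second derivatives found in example 3.

```python
import math


def quadratic_form(H, u):
    """Return u . (H u) for a 2x2 matrix H and a vector u = (m, n)."""
    (a11, a12), (a21, a22) = H
    m, n = u
    return m * (a11 * m + a12 * n) + n * (a21 * m + a22 * n)


def is_positive_definite_2x2(H):
    """Criterion of Theorem 8.4 for a symmetric 2x2 matrix."""
    (a11, a12), (a21, a22) = H
    return a11 > 0 and a11 * a22 - a12 * a21 > 0


def sampled_signs(H, samples=360):
    """Signs (+1, 0, -1) of u . (H u) over unit vectors u = (cos t, sin t)."""
    signs = set()
    for k in range(samples):
        t = 2 * math.pi * k / samples
        q = quadratic_form(H, (math.cos(t), math.sin(t)))
        signs.add((q > 0) - (q < 0))
    return signs


H_min = ((6, -3), (-3, 6))     # Hessian at (1, 1): f_xx = 6, f_xy = -3, f_yy = 6
H_saddle = ((0, -3), (-3, 0))  # Hessian at (0, 0): f_xx = 0, f_xy = -3, f_yy = 0

print(is_positive_definite_2x2(H_min), sampled_signs(H_min))
print(is_positive_definite_2x2(H_saddle), sampled_signs(H_saddle))
```

For the first matrix every sampled slice is concave up, in agreement with the criterion; for the second, slices of both signs occur, which is the saddle behavior described above.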
Moreover, because a $2 \times 2$ matrix satisfies $\det(A) = \det(-A)$ (which is not true for $3 \times 3$ matrices!), a symmetric $2 \times 2$ matrix $A$ is negative definite if $a_{11} < 0$ and $\det(A) > 0$, thus allowing us to restate the second derivative test:

Second Derivative Test for 2 Variables: If $(p,q)$ is a critical point of a function $f(x,y)$ whose second derivatives exist at $(p,q)$, then

Discriminant      2nd derivative         Hessian $H_f(p,q)$      Result: $f(x,y)$ has a
$D(p,q) > 0$      $f_{xx}(p,q) > 0$      positive definite       local minimum at $(p,q)$
$D(p,q) > 0$      $f_{xx}(p,q) < 0$      negative definite       local maximum at $(p,q)$
$D(p,q) < 0$                             not definite            saddle at $(p,q)$

If $\det\left(H_f(p,q)\right) = 0$, then the Hessian says nothing about the extrema at $(p,q)$.

EXAMPLE 7 Use the second derivative test to find the extrema of
\[ f(x,y) = x^2 + y^3 - 3y \]
Solution: Since $f_x = 2x$ and $f_y = 3y^2 - 3$, setting $f_x = 0$ and $f_y = 0$ yields
\[ 2x = 0 \quad \text{and} \quad 3y^2 = 3 \]
Thus, $x = 0$ and $y = \pm 1$, so that the critical points are $(0,1)$ and $(0,-1)$. Since $f_{xx} = 2$, $f_{xy} = 0$, and $f_{yy} = 6y$, the Hessian matrix is
\[ H_f = \begin{bmatrix} 2 & 0 \\ 0 & 6y \end{bmatrix} \]
At $(0,1)$, we have
\[ H_f(0,1) = \begin{bmatrix} 2 & 0 \\ 0 & 6 \end{bmatrix} \]
Since $2 > 0$ and $\det\left(H_f(0,1)\right) = 12 > 0$, the Hessian $H_f(0,1)$ is positive definite. Thus, $f(x,y)$ has a minimum at $(0,1)$. However, at $(0,-1)$, the Hessian matrix is
\[ H_f(0,-1) = \begin{bmatrix} 2 & 0 \\ 0 & -6 \end{bmatrix} \]
and $\det\left(H_f(0,-1)\right) = -12 < 0$. Thus, $H_f(0,-1)$ is not definite and $f(x,y)$ has a saddle at $(0,-1)$.

The second derivative test for functions of 3 or more variables is essentially the same as for 2 variables, except that there is no discriminant for functions of 3 or more variables.
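Before turning to more variables, the two-variable test as it was applied in example 7 can be summarized in code. This is a minimal Python sketch under stated assumptions: the names `classify` and `hessian_entries` are hypothetical, and the second derivatives f_xx = 2, f_xy = 0, f_yy = 6y are entered by hand rather than computed symbolically.

```python
def classify(fxx, fxy, fyy):
    """Apply the second derivative test to Hessian entries at a critical point."""
    D = fxx * fyy - fxy ** 2  # discriminant = determinant of the Hessian
    if D > 0:
        # D > 0 forces fxx != 0, so the sign of fxx decides the result.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle"
    return "inconclusive"  # D = 0: the test gives no information


def hessian_entries(x, y):
    """Second derivatives of f(x, y) = x^2 + y^3 - 3y (example 7)."""
    return 2, 0, 6 * y


critical_points = [(0, 1), (0, -1)]
results = {pt: classify(*hessian_entries(*pt)) for pt in critical_points}
for pt, kind in results.items():
    print(pt, kind)
```

Returning "inconclusive" when D = 0, rather than guessing, mirrors the statement of the test: in that case the Hessian alone does not decide the matter.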
Second Derivative Test for n Variables: If $\mathbf{p} = (p_1, \ldots, p_n)$ is a critical point of a function $f(x_1, \ldots, x_n)$ that is well-approximated by its quadratic approximation near $\mathbf{p}$, then

If $H_f(p_1, \ldots, p_n)$ is      Then $f(x_1, \ldots, x_n)$ has a
positive definite                  local minimum at $(p_1, \ldots, p_n)$
negative definite                  local maximum at $(p_1, \ldots, p_n)$
not definite                       saddle at $(p_1, \ldots, p_n)$

The proof of the second derivative test in higher dimensions follows directly from the form of the quadratic approximation. Specifically, at a critical point $\mathbf{p} = (p_1, \ldots, p_n)$, the gradient satisfies $\nabla f(\mathbf{p}) = \mathbf{0}$, so that the quadratic approximation is of the form
\[ Q(\mathbf{x}) = f(\mathbf{p}) + \frac{1}{2}\,(\mathbf{x} - \mathbf{p}) \cdot H_f(\mathbf{p})\,(\mathbf{x} - \mathbf{p}) \]
If $H_f(\mathbf{p})$ is positive definite, then $Q(\mathbf{x}) > f(\mathbf{p})$ for all $\mathbf{x} \ne \mathbf{p}$ in some neighborhood of $\mathbf{p}$, and $f(\mathbf{x}) \approx Q(\mathbf{x})$ then implies that $f(\mathbf{x}) \ge f(\mathbf{p})$ on that neighborhood. Thus, $f$ has a local minimum at $\mathbf{p}$.

However, determining if a matrix is positive definite becomes increasingly difficult as the number of dimensions increases, as the next theorem illustrates:

Theorem 8.5 (Sylvester's Criterion): Let $A$ be the $n \times n$ real symmetric matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix} \]
Then $A$ is positive definite if and only if $a_{11} > 0$, the determinant of the upper left $2 \times 2$ matrix satisfies
\[ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} > 0 \]
the determinant of the upper left $3 \times 3$ matrix satisfies
\[ \det \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} > 0 \]
and in general all the upper left $j \times j$ matrices have a positive determinant for all $j = 1, \ldots, n$.

We further explore and also provide examples for the second derivative test for functions of more than 2 variables in the associated Maple worksheet.

Exercises:

Find the local extrema and saddle points of the following functions:
1. $f(x,y) = x^2 + 4y^2$
2. $f(x,y) = x^2 - 3y^2$
3. $f(x,y) = x^2 + xy + 3x + 2y$
4. $f(x,y) = y^2 + xy - 2x - 2y$
5. $f(x,y) = x^2 - 4xy + y^2 + 6y$
6. $f(x,y) = x^2 + 2xy - y^2 + 3x + 4$
7. $f(x,y) = 3x^2 + 6xy + 7y^2 - 2x + 4y$
8. $f(x,y) = 4x^2 - 6xy + 5y^2 - 20x + 26y$
9. $f(x,y) = x^3 - 3x^2 + y^2$
10. $f(x,y) = x^4 + y^4 - y^2$
11. $f(x,y) = x^3 + 3xy + y^3$
12. $f(x,y) = x^3 + 6xy + y^3$
13. $f(x,y) = 4xy - x^4 - y^4$
14. $f(x,y) = x^4 + y^4 + 4xy$
15. $f(x,y) = x^4 + 2x^2 y - 2y$
16. $f(x,y) = x^4 - 2x^2 y + 2y$
17. $f(x,y) = \sin(x) + \cos(y)$
18. $f(x,y) = x\ln(x) + y\ln(y)$
19. $f(x,y) = x\sin(y)$
20. $f(x,y) = e^{2x}\cos(y)$

Find the slope and y-intercept of the least squares line for each of the following data sets:

21. $(1,1)$, $(2,2)$, $(3,3)$
22. $(1,72)$, $(2,97)$, $(3,83)$
23. $(-1,1.2)$, $(-2,2.3)$, $(-3,3.4)$
24. $(1,1)$, $(2,1)$, $(3,1)$
25. $(1,75)$, $(2,79)$, $(3,85)$, $(4,81)$
26. $(1,75)$, $(2,79)$, $(3,81)$, $(4,85)$

27. Find the point in the plane $z = x + 1$ which is closest to the origin. (Hint: minimize the square of the distance from a point $(x, y, x+1)$ to the origin $(0,0,0)$.)

28. Find the point in the plane $z = x + 2y + 3$ which is closest to the origin.

29. Find the point in the plane $z = x + y$ closest to the point $(2,2,1)$.

30. Find the point(s) on the surface $z = (x-1)^2 + y^2$ closest to the origin.

31. What dimensions of a rectangular box with a surface area of 64 in$^2$ lead to a maximum volume?

32. What dimensions of a rectangular box with a volume of 64 in$^3$ lead to a minimum surface area?

33. Acme sporting goods collected the following set of data relating price charged for a racket, $x$, to the number of rackets per week sold at that price.

x = price           \$50    \$55    \$60    \$65
y = weekly sales     18      15      10       6

Fit this data to a linear demand function $y = mx + b$.

34. Repeat exercise 33 given the data set

x = price           \$50    \$55    \$60    \$65
y = weekly sales     24      21      18      15

35.
A house with width $x$, length $y$, and height $z$ is to have a roofline with a height of 25′. If the house is to have a total floor space of 2000 ft$^2$, what values of $x$, $y$, and $z$ minimize the sum of the area of the sides and the roof?

36. If the roof costs 3 times more than the sides of the house, then what values of $x$, $y$, $z$ in problem 35 minimize the cost of the house?

37. Suppose that $\mathbf{r}(t) = \mathbf{p} + t\mathbf{u}$ and $\mathbf{L}(s) = \mathbf{q} + s\mathbf{w}$ where $\mathbf{p}$, $\mathbf{u}$, $\mathbf{q}$, and $\mathbf{w}$ are constant vectors (i.e., $\mathbf{r}(t)$ and $\mathbf{L}(s)$ are straight lines). Let $E(s,t)$ denote the total squared distance between $\mathbf{r}(t)$ and $\mathbf{L}(s)$. Does $E(s,t)$ always have a minimum? What is significant about any extrema or saddle points of $E(s,t)$ when the two lines do not intersect?

38. A "ray" travels in a line from the point $(4,2)$ to the $x$-axis, is reflected in a straight line to the $y$-axis, and then is reflected again to the point $(2,3)$. What is the shortest possible path for the ray to travel in this manner from $(4,2)$ to $(2,3)$?

39. Discussion: Explain why every point on the unit circle is a critical point of
\[ f(x,y) = \left(x^2 + y^2 - 1\right)^2 \]
Does $f(x,y)$ have all saddle points on the unit circle, or does it have all minima on the unit circle? Explain.

40. Find the point(s) on the surface $z = 1 - x^2 - y^2$ closest to the origin. Why are there infinitely many of them?

41. Write to Learn: Write a short essay in which you revisit section 2 but instead assume that $f_{yy}(p,q)$ is nonzero and subsequently complete the square in $n$. What does this version of the second derivative test look like?

42. *Write to Learn: The key to the proof of the second derivative test is (2), which is
\[ z''(0) = f_{xx}(p,q)\left(m + \frac{f_{xy}(p,q)}{f_{xx}(p,q)}\, n\right)^2 + \frac{D(p,q)}{f_{xx}(p,q)}\, n^2 \]
Although we say there is "no information" when $D(p,q) = 0$, is that completely correct? If $f_{xx}(p,q) > 0$ but $D(p,q) = 0$, then what does this imply about the possibility of an extremum at $(p,q)$? How would we explore what happens for the one choice of $m$ for which $z''(0) = 0$?
Is this one case enough to derail the entire theorem? Write a short essay addressing these possibilities.

43. *Suppose a house with a height of 10 feet at each corner is to have a total floor space of 2000 ft$^2$, and suppose that $p_s$ is the cost per square foot of the sides and that $p_r$ is the cost per square foot of the roof. What width $x$, height $y$, and pitch of the roof minimize the cost of the house? What shape should the house have if $p_s$ $p_r$? What should the dimensions of the house be if $p_s = 2p_r$?