LECTURE 17 - LINEARIZATION

CHRIS JOHNSON

Abstract. In this lecture we'll apply tangent planes, the topic of the previous lecture, to show how to obtain a linear approximation of a multivariable function.

1. Motivation

Suppose that we're given a function of a single variable, f(x). If this function is differentiable, then we can use the tangent line to the graph of the function to get an approximation of the function. That is, the equation of the tangent line to the graph at some point (x0, f(x0)) is

y − f(x0) = f'(x0)(x − x0).

Moving the f(x0) to the right-hand side, we have a line

y = f'(x0)(x − x0) + f(x0),

which is the graph of the function

L(x) = f'(x0)(x − x0) + f(x0).

What's nice about this function L(x) is that it's something we can actually evaluate. If you think about the numerical operations you can actually perform – the things you can in principle sit down and work out with pencil and paper – you realize that there are basically only four operations: addition, subtraction, multiplication, and division. (These are the four arithmetic operations.) There are of course some more complicated things we know how to do (for example, cubing a number), but these are really built out of combinations of addition, subtraction, multiplication, and division (e.g., x^3 = x · x · x).

Using a computer, by the way, doesn't really let you do any more operations than what you can do with pencil and paper. Ultimately, computers also can only do arithmetic: they aren't magically able to perform things that people in principle cannot. In fact, in some ways computers are worse at these arithmetic operations than people. A computer has to represent numbers using a finite number of bits: values that can only be 1 or 0. Since any computer has only a finite number of these bits – even if it's a very large number! – "most" numbers can't be represented exactly on a computer.

Date: February 26, 2014.
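As a concrete illustration of the tangent-line idea (the function √x and the center x0 = 4 here are my own choice, not from the lecture), we can approximate √4.1 using only arithmetic:

```python
import math

# f(x) = sqrt(x), with derivative f'(x) = 1 / (2*sqrt(x))
def f(x):
    return math.sqrt(x)

def fprime(x):
    return 1 / (2 * math.sqrt(x))

x0 = 4.0  # center: we know sqrt(4) = 2 exactly

def L(x):
    """Tangent-line approximation L(x) = f'(x0)*(x - x0) + f(x0)."""
    return fprime(x0) * (x - x0) + f(x0)

print(L(4.1))  # 2.025 -- computable by hand with only arithmetic
print(f(4.1))  # about 2.02485 -- the "true" value
```

The approximation 2.025 is off by less than 0.001, and it required nothing beyond one multiplication and one addition.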
It turns out that "most" numbers would require an infinite number of bits to represent exactly, so the computer has to use approximations. You might be surprised to learn that a number as simple as 1/10 can't be represented exactly with a finite number of bits (at least not the way computers usually represent numbers)! A really simple way to demonstrate this is to ask the computer to add 0.1 + 0.2. If you do this in Python, for example, you'll get

>>> 0.1 + 0.2
0.30000000000000004

Usually the computer truncates – cuts off – the values before printing them, but internally represents the number to more decimal places. When you do lots of calculations with these values, the little errors start to add up!

Anyway, the four arithmetic operations are basically the only tools we have to do numerical computations. However, certain types of functions are not defined in terms of these arithmetic operations. The trig functions cos θ and sin θ, for example, are defined geometrically: they're the x- and y-coordinates of points on the unit circle. Yet somehow your computer is able to spit out a value if you enter cos(0.3245). If the computer can only do arithmetic, how is it able to determine this value?

The answer is that the computer uses calculus (or, rather, someone who knew calculus programmed the computer) to determine approximations of cos θ that can be evaluated using only arithmetic operations. These are of course the Taylor polynomials you learned about in your second-semester calculus class. The techniques of Taylor polynomials you learned before are very powerful, and are the basis for much of the technology we have today: breakthroughs in science and medicine are possible because people are able to use computers to do calculations and analyze very large amounts of data, and they're able to do this because we can use Taylor polynomials (and related ideas) to convert complicated calculations into arithmetic.
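To make the cos(0.3245) point concrete, here is a sketch of how a Taylor polynomial turns cosine into pure arithmetic. (The degree, 8, is my own choice; real math libraries use more refined variants of this idea, but the principle is the same.)

```python
import math

def cos_taylor(theta, degree=8):
    """Approximate cos(theta) by its Taylor polynomial
    1 - x^2/2! + x^4/4! - ... using only arithmetic operations."""
    total = 0.0
    for k in range(0, degree + 1, 2):
        total += (-1) ** (k // 2) * theta ** k / math.factorial(k)
    return total

print(cos_taylor(0.3245))  # arithmetic-only approximation
print(math.cos(0.3245))    # the library's value, for comparison
```

For inputs this close to 0, the degree-8 polynomial already agrees with the library value to many decimal places.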
However, the material you've learned before is only applicable to functions of a single variable. Our goal in this lecture is to start studying the comparable ideas for functions of several variables. To do this, we'll use tangent planes to get a linear approximation of a multivariable function.

2. Linearization

Recall from the last lecture that the tangent plane to the surface z = f(x, y) at the point (x0, y0) is given by

fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) − (z − f(x0, y0)) = 0.

Solving this for z we have

z = fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) + f(x0, y0).

Notice that this is a function of x and y, and the plane is the graph of the function

L(x, y) = fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) + f(x0, y0),

which we call the linearization of the function f(x, y) at the point (x0, y0). The reason we care about linearizations is that they're functions we (or a computer) can actually compute: we can get numerical answers using a linearization. We can thus use linearizations to approximate multivariable functions.

Example 2.1. Calculate the linearization of f(x, y) = x·e^y at the point (2, 0), and use the linearization to approximate f(2.01, −0.1).

To calculate the linearization, we first need to calculate the partial derivatives:

fx(x, y) = e^y
fy(x, y) = x·e^y

Now we evaluate these partial derivatives, and our original function, at the point (x0, y0) = (2, 0):

f(2, 0) = 2
fx(2, 0) = 1
fy(2, 0) = 2

We use these values to determine our linearization,

L(x, y) = (x − 2) + 2y + 2 = x + 2y.

We can now use this approximation to estimate the value f(2.01, −0.1):

f(2.01, −0.1) ≈ L(2.01, −0.1) = 2.01 + 2 · (−0.1) = 2.01 − 0.2 = 1.81

Let's take a moment to think about what we've just done in the example above. We used linearization, which is essentially just the equation of a tangent plane, to estimate the value 2.01 · e^(−0.1) and got an actual numerical value.
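Following Example 2.1, the linearization is easy to code up and check against the "true" value (which the computer itself only approximates, of course):

```python
import math

# f(x, y) = x * e^y, linearized at (x0, y0) = (2, 0):
# f(2,0) = 2, fx(2,0) = e^0 = 1, fy(2,0) = 2*e^0 = 2,
# so L(x, y) = (x - 2) + 2*y + 2 = x + 2*y.
def L(x, y):
    return x + 2 * y

def f(x, y):
    return x * math.exp(y)

print(L(2.01, -0.1))  # 1.81       -- doable by hand
print(f(2.01, -0.1))  # 1.8187...  -- the computer's own approximation
```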
This is an extremely important idea: we can take complicated functions and estimate them with things we actually know how to calculate – things we can sit down and really do with pencil and paper. That is, we were able to say

2.01 · (2.718281828...)^(−0.1) ≈ 1.81.

By the way, if you plug 2.01·e^(−0.1) into a calculator or computer, it will probably spit back the answer

2.01 · e^(−0.1) ≈ 1.81872321025.

This means two things: 1) our approximation above, which was super easy to calculate by hand, is a decent approximation; and 2) the computer uses a different type of approximation than the one we used. The computer is using a Taylor polynomial, probably of some high degree, to get its approximation. The idea of Taylor polynomials in one variable, you may recall, is really just an extension of the idea of linearization (using tangent lines as the approximation). We can also do Taylor polynomials in several variables, but won't work on that right now. For this lecture we'll focus on linearization, and may come back to multivariable Taylor polynomials at the end of the semester if we have extra time.

Example 2.2. Calculate the linearization of f(x, y) = x^3·y + y^2·x at the point (−1, 3), and use the linearization to approximate f(−0.93, 2.976).

First we calculate the partial derivatives,

fx(x, y) = 3x^2·y + y^2
fy(x, y) = x^3 + 2xy

Evaluating these partials, and the original function, at (−1, 3) we have

f(−1, 3) = (−1)^3 · 3 + 3^2 · (−1) = −3 − 9 = −12
fx(−1, 3) = 3 · (−1)^2 · 3 + 3^2 = 9 + 9 = 18
fy(−1, 3) = (−1)^3 + 2 · (−1) · 3 = −1 − 6 = −7

Hence the linearization is

L(x, y) = 18(x + 1) − 7(y − 3) − 12.

We now use this to get an approximation,

f(−0.93, 2.976) ≈ L(−0.93, 2.976)
= 18 · (−0.93 + 1) − 7 · (2.976 − 3) − 12
= 18 · (0.07) − 7 · (−0.024) − 12
= 1.26 + 0.168 − 12
= 1.428 − 12
= −10.572
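The same recipe works for Example 2.2. As a sketch, here is a small helper (my own wrapper, not something from the lecture) that builds L(x, y) from the center point and the values of the function and its partials there:

```python
def linearization(f0, fx0, fy0, x0, y0):
    """Return the function L(x, y) = fx0*(x - x0) + fy0*(y - y0) + f0."""
    def L(x, y):
        return fx0 * (x - x0) + fy0 * (y - y0) + f0
    return L

# Example 2.2: f(x, y) = x^3*y + y^2*x at (-1, 3), where
# f(-1,3) = -12, fx(-1,3) = 18, fy(-1,3) = -7.
L = linearization(-12, 18, -7, -1, 3)
print(L(-0.93, 2.976))  # approximately -10.572

# Compare with the true value:
f = lambda x, y: x**3 * y + y**2 * x
print(f(-0.93, 2.976))  # approximately -10.63
```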
3. Differentials

Notice that in both of the examples above we had to pick values (x0, y0), the "center" of our approximation, where we could actually calculate the true value of the function. This is always the case for these linearizations: we have to find a place to "anchor" our approximation; we need somewhere where we know definitively what the function equals.

Let's say we know the true value z0 = f(x0, y0). Calling L(x, y) = z, our linearization has the form

z = fx(x0, y0) · (x − x0) + fy(x0, y0) · (y − y0) + z0.

Moving the z0 to the other side we have

z − z0 = fx(x0, y0) · (x − x0) + fy(x0, y0) · (y − y0).

Notice that z − z0, x − x0, and y − y0 are just the changes in the values of z, x, and y when we change our inputs from x0 to x and from y0 to y, and the output correspondingly changes from z0 to z. That is, each of x − x0, y − y0, and z − z0 represents the change in x, y, and z. Let's write these changes as dx, dy, and dz. (We use the letter 'd' for "difference.") Our equation above then becomes

dz = fx(x0, y0)dx + fy(x0, y0)dy.

Thinking of dx and dy as variables (just saying how much we vary the original inputs x and y), we have that dz is a function of two variables. This function is called the differential of z = f(x, y).

The idea here is that differentials measure the change in our approximation. For example, in Example 2.2 above, we have

dz = 18dx − 7dy.

This means we can determine the change in the approximation, dz, by just plugging in the changes dx and dy in our variables x and y. In the example above we changed x from −1 to −0.93. This is a change of

dx = −0.93 − (−1) = 0.07.

We changed y from 3 to 2.976. This is a change of

dy = 2.976 − 3 = −0.024.

So the change in z from f(−1, 3) = −12 to our approximation L(−0.93, 2.976) is

dz = 18 · (0.07) − 7 · (−0.024) = 1.428.
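In code, the differential from this example is just as easy to evaluate as the linearization was:

```python
# Differential of f(x, y) = x^3*y + y^2*x at (-1, 3):
# dz = fx(-1,3)*dx + fy(-1,3)*dy = 18*dx - 7*dy.
def dz(dx, dy):
    return 18 * dx - 7 * dy

dx = -0.93 - (-1)  # 0.07: the change in x
dy = 2.976 - 3     # -0.024: the change in y
print(dz(dx, dy))        # approximately 1.428
print(-12 + dz(dx, dy))  # approximately -10.572: f(-1,3) plus the change
```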
This means our function f(x, y) changes by approximately 1.428 when we move the inputs of the function from (x0, y0) = (−1, 3) to (−0.93, 2.976); so the approximation is −12 + dz = −12 + 1.428 = −10.572, as we saw above.

Differentials and linearizations are two sides of the same coin: they're basically the same thing, just represented in different ways. More precisely, a differential is just a change in linearization. This means that

f(x, y) ≈ L(x, y) = f(x0, y0) + dz.

(Since we'll usually write z = f(x, y), we may sometimes write df for dz, and call this value the differential of f instead of the differential of z. These are the same thing, just different words.)

Example 3.1. Calculate the differential dz of z = sin(x + 3y^2):

dz = (∂f/∂x)dx + (∂f/∂y)dy
   = [∂/∂x sin(x + 3y^2)]dx + [∂/∂y sin(x + 3y^2)]dy
   = cos(x + 3y^2)dx + 6y·cos(x + 3y^2)dy

Example 3.2. Calculate the differential dz of z = f(x, y) = (x^3 − 2) · tan^(−1)(y), then use the differential to approximate f(−2.1, 0.22) and f(−1.99, 0.18).

The differential is

dz = 3x^2·tan^(−1)(y)dx + [(x^3 − 2)/(1 + y^2)]dy.

To use differentials (or linearizations), we need to find a point (x0, y0) to use as the "center" of our approximation; some value near the values we're trying to approximate, where we can exactly calculate the true value of the function. Let's use (x0, y0) = (−2, 0). Since tan^(−1)(0) = 0, we have fx(−2, 0) = 3 · (−2)^2 · tan^(−1)(0) = 0 and fy(−2, 0) = ((−2)^3 − 2)/(1 + 0^2) = −10, so our differential becomes

dz = −10dy

and the true value of the function is f(−2, 0) = ((−2)^3 − 2) · tan^(−1)(0) = 0.

For (−2.1, 0.22), we have dx = −0.1 and dy = 0.22, so

dz = −10 · (0.22) = −2.2

so our approximation for the function is

f(−2.1, 0.22) ≈ f(−2, 0) + dz = 0 − 2.2 = −2.2.

For (−1.99, 0.18), dx = 0.01 and dy = 0.18, so

dz = −10 · (0.18) = −1.8

so

f(−1.99, 0.18) ≈ 0 − 1.8 = −1.8.

We can describe differentials in any number of variables, by the way.
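A quick numerical check of the differential approximation for f(x, y) = (x^3 − 2)·tan^(−1)(y) centered at (−2, 0). Note that since tan^(−1)(0) = 0, the x-partial 3x^2·tan^(−1)(y) vanishes at the center, so only the dy term contributes there:

```python
import math

def f(x, y):
    return (x**3 - 2) * math.atan(y)

# At (x0, y0) = (-2, 0): atan(0) = 0, so
#   fx(-2,0) = 3*(-2)**2 * atan(0) = 0,
#   fy(-2,0) = ((-2)**3 - 2) / (1 + 0**2) = -10,
#   f(-2,0)  = 0.
def approx(x, y):
    dx, dy = x - (-2), y - 0
    return 0.0 + 0 * dx + (-10) * dy

print(approx(-2.1, 0.22), f(-2.1, 0.22))    # -2.2 vs. roughly -2.44
print(approx(-1.99, 0.18), f(-1.99, 0.18))  # -1.8 vs. roughly -1.76
```

The approximation is rougher at (−2.1, 0.22) than at (−1.99, 0.18) because that input is farther from the center (−2, 0), which is exactly what we should expect from a linear approximation.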
If z = f(x1, x2, ..., xn) is a function of n variables, the differential of z is defined to be

dz = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 + · · · + (∂f/∂xn)dxn.

4. Differentiability

In the case of functions of a single variable, saying a function is differentiable at x0 basically means that the tangent line is a "good" approximation of the function near x0. Differentiability in two variables is basically the same thing. Intuitively, a function f(x, y) is differentiable at (x0, y0) if the tangent plane of f(x, y) at the point (x0, y0) is a good approximation of the function near (x0, y0).

To be precise, writing Δx = x − x0 and Δy = y − y0, a function z = f(x, y) is differentiable at (x0, y0) if we can write the function as

f(x, y) = f(x0, y0) + fx(x0, y0)Δx + fy(x0, y0)Δy + ε1·Δx + ε2·Δy

where ε1 and ε2 are functions with

ε1 → 0 and ε2 → 0 as (Δx, Δy) → (0, 0).

Note that we can rewrite the above as

f(x, y) = L(x, y) + ε1·Δx + ε2·Δy.

Note we used = and not ≈! We're saying that f(x, y) is the linearization "plus a little bit more," where that "little bit more" gets very, very small as (x, y) gets close to (x0, y0). Again, differentiability means the linear approximation is good.

Because of the following theorem, most of the functions we deal with in this class will be differentiable.

Theorem 4.1. If the partial derivatives fx(x, y) and fy(x, y) exist and are continuous near (x0, y0), then f(x, y) is differentiable at (x0, y0).
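We can illustrate differentiability numerically (this check is my own sketch, not part of the lecture). For the function f(x, y) = x·e^y from Example 2.1, which is differentiable by Theorem 4.1, the error f − L should shrink faster than the distance to the center as (x, y) → (2, 0):

```python
import math

f = lambda x, y: x * math.exp(y)
L = lambda x, y: x + 2 * y  # linearization at (2, 0), from Example 2.1

for h in [0.1, 0.01, 0.001]:
    x, y = 2 + h, 0 + h
    err = abs(f(x, y) - L(x, y))          # error of the linear approximation
    dist = math.hypot(x - 2, y - 0)       # distance from the center (2, 0)
    print(h, err / dist)                  # this ratio tends to 0 as h shrinks
```

The ratio err/dist going to 0 is precisely the "plus a little bit more" clause: the error term ε1·Δx + ε2·Δy is small even compared to Δx and Δy themselves.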