Chain rules via multiplication

Bro. David E. Brown, BYU–Idaho Dept. of Mathematics. All rights reserved.

Version 0.44, of June 16, 2014. Answer to Exercise 2.1 corrected, minor edits made, and numbering of exercises mostly corrected on 2014-06-13.

Contents

1 Introduction
2 A fancier Calc I example
3 Chain rules for multivariate functions
  3.1 Additional chain rules?
  3.2 Isn't there a better way to write down chain rules?
    3.2.1 A more compact way of writing chain rules
    3.2.2 A more flexible way of writing chain rules
4 "The" Chain Rule
5 Additional exercises
6 Answers, etc.

1 Introduction

You learned a "chain rule" for differentiating compositions of functions in Calculus I. It probably looked something like $\frac{d}{dx} g(u(x)) = g'(u(x))\,u'(x)$, which is sometimes shortened to $\frac{d}{dx} g(u(x)) = \frac{dg}{du}\frac{du}{dx}$. The truth is slightly more complicated: The chain rule is really

$$\frac{d}{dx} g(u(x)) = \left.\frac{dg(u)}{du}\right|_{u=u(x)} \frac{du}{dx}, \tag{1}$$

but we're usually too lazy to write all this, so we use one of the abbreviated versions given above.

You know how this goes: For example, if $f(x) = \sin^3 x$, you can think of $g$ as being the cubing function (that is, $g(u) = u^3$), and $u$ as being the sine function (i.e., $u(x) = \sin x$). Then $g(u(x)) = (\sin x)^3 = \sin^3 x = f(x)$. The chain rule (Equation (1)) says

$$\frac{df}{dx} = \left.\frac{dg}{du}\right|_{u=\sin x}\frac{du}{dx} = \left.\frac{d}{du}u^3\right|_{u=\sin x}\frac{du}{dx} = \left.3u^2\right|_{u=\sin x}(\cos x) = 3\sin^2 x\cos x.$$

Back when my father learned calculus, this technique of differentiating "by substitution" was used quite commonly, much as we integrate by substitution. Differentiation by substitution has gone out of style, but it's a valid technique. Feel free to use it, if it helps you with the chain rule. We will make good use of it in this document.
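As a quick sanity check on Equation (1), here is a minimal sketch that evaluates the chain-rule derivative of $\sin^3 x$ and compares it with a finite-difference estimate. The function names, the sample point `x0`, and the step size `h` are illustrative choices, not part of the original notes.

```python
import math

def g_prime(u):
    # g(u) = u**3, so dg/du = 3*u**2
    return 3.0 * u ** 2

def u_prime(x):
    # u(x) = sin x, so du/dx = cos x
    return math.cos(x)

def chain_rule_derivative(x):
    # Equation (1): evaluate dg/du at u = u(x), then multiply by du/dx
    return g_prime(math.sin(x)) * u_prime(x)

def f(x):
    # The composite function f(x) = g(u(x)) = sin^3 x
    return math.sin(x) ** 3

def central_difference(func, x, h=1e-6):
    # Symmetric finite-difference estimate of func'(x); h is an assumed step size
    return (func(x + h) - func(x - h)) / (2.0 * h)

x0 = 0.7  # arbitrary sample point (assumption)
exact = chain_rule_derivative(x0)
approx = central_difference(f, x0)
print(exact, approx)
```

The two printed numbers should agree to several decimal places; the small residual comes from the finite-difference approximation, not from the chain rule.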
The question at hand is what the chain rule looks like when there are more variables than $x$ floating around. We'll sneak up on this question, by looking at a slightly more complicated Calc I example, to get a sense for how additional variables might be handled. Then we'll look at chain rules for partial derivatives for functions of more than one variable. I say "rules" instead of "rule" because there are many chain rules. Fortunately, we will combine them all into one mother-of-all-chain-rules. Along the way, we'll examine spiffy uses of symbols, to help keep the tedium down. Spiffy uses of symbols will include introducing matrices and matrix multiplication at some point.

And, as you have no doubt discerned, this little exposition is rather informal. We can tackle formalities some other time.

2 A fancier Calc I example

Careful examination of a suitable example can provide a stepping stone to chain rules for multivariate functions. Let's try differentiating $f(x) = (\ln x)^2 + \sin^3 x$. Rather than treating this mindlessly as a Calc I problem, let's examine it in some detail, deliberately introducing additional variables to increase its value as a bridge to the multivariate case.

So: $f$ has two terms: $(\ln x)^2$ and $\sin^3 x$. I want you to think of these two terms as $u^2$ and $v^3$, respectively, so that

$$f(x) = f(u, v) = u^2 + v^3. \tag{2}$$

This requires us to set $u = \ln x$ and $v = \sin x$. We're using a substitution, one that happens to have two parts to it. I call $u$ and $v$ intermediate variables because $f$ depends on $u$ and $v$, which in turn depend on $x$; this means they are between $f$ on the one hand and $x$ on the other.

To differentiate $f$ with respect to $x$, we can start with the $(\ln x)^2$ term, think of it as $u^2$, and use the Calc I chain rule, in a knee-jerk reaction sort of way:

$$\frac{d}{dx}(\ln x)^2 = (2\ln x)\frac{1}{x} = \frac{2\ln x}{x}.$$

Notice that the "$2\ln x$" bit is really $\left.\frac{\partial f}{\partial u}\right|_{u=\ln x}$ in disguise, and $\frac{1}{x}$ is actually $\frac{du}{dx}$. So $(2\ln x)\frac{1}{x}$ is really $\left.\frac{\partial f}{\partial u}\right|_{u=\ln x}\frac{du}{dx}$.
Put this together and find that

$$\frac{d}{dx}(\ln x)^2 = \left(\left.\frac{\partial f}{\partial u}\right|_{u=\ln x}\right)\frac{du}{dx}. \tag{3}$$

The differentiation of the other term is like unto it:

$$\frac{d}{dx}\sin^3 x = \left(\left.\frac{\partial f}{\partial v}\right|_{v=\sin x}\right)\frac{dv}{dx} = \left.3v^2\right|_{v=\sin x}(\cos x) = (3\sin^2 x)(\cos x) = 3\sin^2 x\cos x.$$

It's the left-most equality that interests me here:

$$\frac{d}{dx}\sin^3 x = \left(\left.\frac{\partial f}{\partial v}\right|_{v=\sin x}\right)\frac{dv}{dx}. \tag{4}$$

(Compare with Equation (3).)

Here's the punchline: We can get the entire derivative of $f = (\ln x)^2 + \sin^3 x$ by adding the results of (3) and (4) together:

$$\frac{df}{dx} = \left(\left.\frac{\partial f}{\partial u}\right|_{u=\ln x}\right)\frac{du}{dx} + \left(\left.\frac{\partial f}{\partial v}\right|_{v=\sin x}\right)\frac{dv}{dx} = \frac{2\ln x}{x} + 3\sin^2 x\cos x.$$

I want you to focus on the leftmost equality in the above, which is

$$\frac{df}{dx} = \left(\left.\frac{\partial f}{\partial u}\right|_{u=\ln x}\right)\frac{du}{dx} + \left(\left.\frac{\partial f}{\partial v}\right|_{v=\sin x}\right)\frac{dv}{dx}.$$

With your kind permission, I'll abbreviate it as

$$\frac{df}{dx} = \frac{\partial f}{\partial u}\frac{du}{dx} + \frac{\partial f}{\partial v}\frac{dv}{dx}, \tag{5}$$

always remembering to substitute in the $u = \ln x$ and $v = \sin x$, as needed. (This abbreviation is customary.)

Hmm... Equation (5) looks suspiciously like two instances of the chain rule added together. That's because that's exactly what it is: Differentiating the given $f$ actually requires you to use the Calc I chain rule twice, once for the $u^2$ term and once for the $v^3$ term. (Go back and look at Equation (2) to remember why I'm talking about $u^2$ and $v^3$.)

Exercise 2.0.1. Reproduce the logic above, so as to differentiate $f = \sinh^2 x - \ln(x^2)$ with respect to $x$. In the process, re-invent Equation (5). Hint: Think about "$u^2$" and "$\ln v$."

3 Chain rules for multivariate functions

So Equation (5) is a chain rule; it uses two intermediate variables ($u$ and $v$) and one independent variable ($x$). What if there are two independent variables? Let's find out.

Let $f(x, y) = \ln(2x + y) + \cos(3x - y)$, and think about calculating $\frac{\partial f}{\partial x}$. Here's how it goes if we indulge our Calc I knee-jerk reaction: Differentiate the $\ln(2x + y)$ term using the Calc I chain rule, but remember to hold $y$ constant during the differentiation:

$$\frac{\partial}{\partial x}\ln(2x + y) = \frac{1}{2x + y}\,\frac{\partial}{\partial x}(2x + y) = \frac{2}{2x + y}.$$

Then do the same for the cosine term:

$$\frac{\partial}{\partial x}\cos(3x - y) = -\sin(3x - y)\,\frac{\partial}{\partial x}(3x - y) = -3\sin(3x - y),$$

and combine the results to get

$$\frac{\partial f}{\partial x} = \frac{2}{2x + y} - 3\sin(3x - y). \tag{6}$$

Problem is, the knee-jerk reaction skips the intermediate steps, which (a) keeps you from seeing what's really happening and therefore also (b) puts you at risk for making mistakes later.

So what are the missing steps? To see them, try setting $u = 2x + y$ and $v = 3x - y$. Then $f = f(u, v) = \ln u + \cos v$, which makes the "$\frac{1}{2x+y}$" bit equal to $\left.\frac{\partial f}{\partial u}\right|_{u=2x+y}$, and the "$-\sin(3x - y)$" bit equal to $\left.\frac{\partial f}{\partial v}\right|_{v=3x-y}$. Likewise,

$$\frac{\partial u}{\partial x} = \frac{\partial}{\partial x}(2x + y) = 2 \quad\text{and}\quad \frac{\partial v}{\partial x} = \frac{\partial}{\partial x}(3x - y) = 3.$$

Then we can write Equation (6) as

$$\frac{\partial f}{\partial x} = \frac{2}{2x + y} - 3\sin(3x - y) = \frac{1}{2x + y}(2) + \bigl(-\sin(3x - y)\bigr)(3) = \left.\frac{\partial f}{\partial u}\right|_{u=2x+y}\frac{\partial u}{\partial x} + \left.\frac{\partial f}{\partial v}\right|_{v=3x-y}\frac{\partial v}{\partial x}.$$

Let's abbreviate this as

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x}. \tag{7}$$

This is just Equation (5), except we have to say "$\frac{\partial f}{\partial x}$" instead of "$\frac{df}{dx}$" because $x$ isn't the only independent variable anymore.

If you're interested, here's how it would look to differentiate $f(x, y) = \ln(2x + y) + \cos(3x - y)$ using Equation (7) without all the lead-up I've written above: You'd start by saying, "Hmm... Let's say $u = 2x + y$ and $v = 3x - y$ and calculate

$$\frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = \frac{1}{2x + y}(2) + \bigl(-\sin(3x - y)\bigr)(3) = \frac{2}{2x + y} - 3\sin(3x - y)."$$

Exercise 3.0.2. Use the ideas of this section to calculate $\frac{\partial f}{\partial y}$ for $f(x, y) = \ln(2x + y) + \cos(3x - y)$. In the process, you should come up with a chain rule similar to (but not the same as) Equation (7). Hint: Use the same $u$ and $v$ that I did.

3.1 Additional chain rules?

You might be wondering whether there is a different chain rule for every function. Fortunately, the answer is, "nope."

Exercise 3.1.1. Use Equation (7) to calculate $\frac{\partial f}{\partial x}$ for $f(x, y) = \sqrt{xy} + \sqrt{\frac{x}{y}}$. Hint: Choose $u$ and $v$ to be functions that are "inside" other functions.

Exercise 3.1.2. Use the chain rule you created for Exercise 3.0.2 to calculate $\frac{\partial f}{\partial y}$ for $f(x, y) = \sqrt{xy} + \sqrt{\frac{x}{y}}$.
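Equation (7) can also be checked numerically. The sketch below applies it to the section's example $f(x, y) = \ln(2x + y) + \cos(3x - y)$ with $u = 2x + y$ and $v = 3x - y$, and compares the result with a finite-difference estimate that holds $y$ fixed. The function names, the sample point, and the step size are assumptions made for illustration.

```python
import math

def f(x, y):
    # The example function from Section 3
    return math.log(2 * x + y) + math.cos(3 * x - y)

def df_dx_chain(x, y):
    # Intermediate variables
    u = 2 * x + y
    v = 3 * x - y
    # Partials of f(u, v) = ln u + cos v with respect to u and v
    f_u = 1.0 / u
    f_v = -math.sin(v)
    # Partials of u and v with respect to x (y held constant)
    u_x = 2.0
    v_x = 3.0
    # Equation (7): a sum of two products
    return f_u * u_x + f_v * v_x

def df_dx_numeric(x, y, h=1e-6):
    # Central difference in x only; y is held fixed (h is an assumed step size)
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

x0, y0 = 1.0, 0.5  # arbitrary point with 2x + y > 0 (assumption)
chain_value = df_dx_chain(x0, y0)
numeric_value = df_dx_numeric(x0, y0)
print(chain_value, numeric_value)
```

Swapping in `u_y = 1.0` and `v_y = -1.0` and differencing in `y` instead gives a way to check your answer to Exercise 3.0.2.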
So: A given chain rule may serve to differentiate more than one function; in fact, lots of functions. Nevertheless, there are lots of chain rules. For example, suppose you have to differentiate $f = \sinh(xz) + \cosh(yz) + \tanh(xyz)$ with respect to $z$. If we let $u = xz$, $v = yz$, and $w = xyz$, the chain rule for $\frac{\partial f}{\partial z}$ looks like this:

$$\frac{\partial f}{\partial z} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial z} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial z} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial z}. \tag{8}$$

Exercise 3.1.3. Explain why the previous sentence is true.

Exercise 3.1.4. Write down chain rules for $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ for the current example.

Exercise 3.1.5. Go ahead and calculate $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ for the current example, using the chain rules.

Exercise 3.1.6. Write down the chain rule for differentiating $f(x, y) = \sin(xy)\cos(xy)$ with respect to $x$ and use it to calculate $\frac{\partial f}{\partial x}$. Then do the same for differentiation with respect to $y$. (Hint: There's only one intermediate variable in this example.)

3.2 Isn't there a better way to write down chain rules?

Fortunately, there is a more flexible and compact way of writing down chain rules than what we have so far. To find out what it is, let's put a chain rule under the microscope. How about the chain rule you should have discovered while doing Exercise 3.0.2? It was

$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}.$$

Hmm... This chain rule is a sum of products... and the first product uses $u$ in both factors; the second uses $v$... hmm... $u$ is "first" and $v$ is "second," in some sense?... uh, two derivatives; two "components"... Is this chain rule a dot product of some vectors or other?

Yes, actually. If we think of $\frac{\partial f}{\partial u}$ and $\frac{\partial f}{\partial v}$ as being the first and second components of a vector, and if we think of $\frac{\partial u}{\partial y}$ and $\frac{\partial v}{\partial y}$ as the first and second components of some other vector, we can write the chain rule as the following dot product:

$$\frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} = \begin{bmatrix}\frac{\partial f}{\partial u}\\[2pt] \frac{\partial f}{\partial v}\end{bmatrix}\cdot\begin{bmatrix}\frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial y}\end{bmatrix}. \tag{9}$$

Clever, eh? (Wish I could take credit for it!)

Exercise 3.2.1. Write Equation (7) as a dot product of suitable vectors.
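The dot-product form of Equation (9) maps directly onto code. Here is a minimal sketch, again using $f(x, y) = \ln(2x + y) + \cos(3x - y)$ with $u = 2x + y$ and $v = 3x - y$; the helper names, sample point, and step size are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    # Ordinary dot product of two equal-length lists
    return sum(ai * bi for ai, bi in zip(a, b))

def f(x, y):
    return math.log(2 * x + y) + math.cos(3 * x - y)

def partials_of_f_wrt_uv(x, y):
    # First vector in Equation (9): [df/du, df/dv] for f = ln u + cos v
    u = 2 * x + y
    v = 3 * x - y
    return [1.0 / u, -math.sin(v)]

def partials_of_uv_wrt_y(x, y):
    # Second vector in Equation (9): [du/dy, dv/dy]
    return [1.0, -1.0]

def df_dy(x, y):
    # The chain rule for df/dy, written as a dot product
    return dot(partials_of_f_wrt_uv(x, y), partials_of_uv_wrt_y(x, y))

def df_dy_numeric(x, y, h=1e-6):
    # Central difference in y only; x is held fixed (h is an assumed step size)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

x0, y0 = 1.0, 0.5  # arbitrary point with 2x + y > 0 (assumption)
dot_value = df_dy(x0, y0)
numeric_value = df_dy_numeric(x0, y0)
print(dot_value, numeric_value)
```

Keeping the two vectors as separate functions mirrors the structure of the equation: one vector depends only on how $f$ is built from $u$ and $v$, the other only on how $u$ and $v$ are built from $y$.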
There is another way to write dot products. Some people (myself included) write Equation (9) like this:

$$\frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial y}\end{bmatrix}. \tag{10}$$

(Note the absence of the dot.) This means EXACTLY the same thing as Equation (9). Right now, it's just another way to write the dot product. Shortly, however, we will see that it's a more powerful and more flexible way of writing certain types of multiplication.

Exercise 3.2.2. Write Equation (7) in the same way as Equation (10).

Equation (10) has some advantages over Equation (9). One is that it has a nice, compact representation. Another is that it can be extended to more complicated situations than writing a chain rule for a partial derivative with respect to a single variable. Let's take these in turn.

3.2.1 A more compact way of writing chain rules

It is customary to use the symbol $\vec{\nabla} f$ to stand for the vector¹ $\begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}$. (The symbol $\vec{\nabla} f$ is pronounced "del $f$," "the gradient of $f$," "grad $f$," or even "nabla $f$.")

¹ The gradient, being a "row" instead of a "column," is not a vector, but a "covector." It is very common to call the gradient a vector, and at this point in your education, it's even safe. So I will strive to resist the temptation to be picky about this.

Also, some people write $\frac{\partial}{\partial y}\begin{bmatrix}u\\ v\end{bmatrix}$ for the column $\begin{bmatrix}\frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial y}\end{bmatrix}$. (Convince yourself that this makes sense. If you can't, then ask somebody.) I'm too lazy to write $\frac{\partial}{\partial y}\begin{bmatrix}u\\ v\end{bmatrix}$, so I'm just as likely to write

$$\frac{\partial(u, v)}{\partial y} \quad\text{for}\quad \begin{bmatrix}\frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial y}\end{bmatrix},$$

instead. (The symbol $\frac{\partial(u,v)}{\partial y}$ stands for the partial derivative of $\begin{bmatrix}u\\ v\end{bmatrix}$ with respect to $y$. This is sloppiness again, as it takes the column $\begin{bmatrix}u\\ v\end{bmatrix}$, turns it into the row $\begin{bmatrix}u & v\end{bmatrix}$, puts a comma between the $u$ and the $v$, and changes the square brackets into round parentheses, to get the $(u, v)$ in the "numerator" of the symbol $\frac{\partial(u,v)}{\partial y}$. Shameful, but customary.)
With sloppy abbreviations like the above in hand, we can write the multiplication

$$\begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial y}\end{bmatrix} \quad\text{as}\quad \vec{\nabla} f\,\frac{\partial(u, v)}{\partial y},$$

so that the chain rule of Equation (10) is now

$$\frac{\partial f}{\partial y} = \vec{\nabla} f\,\frac{\partial(u, v)}{\partial y}. \tag{11}$$

Exercise 3.2.3. Write the chain rule of Equation (7) in the same format as Equation (11). Then write out what it means in terms of a matrix multiplication and a dot product.

Exercise 3.2.4. Repeat the previous exercise for the chain rule of Equation (8).

3.2.2 A more flexible way of writing chain rules

The more compact method of writing chain rules is also more flexible. For example, the chain rule of Equation (8) can be written as

$$\frac{\partial f}{\partial z} = \vec{\nabla} f\,\frac{\partial(u, v, w)}{\partial z}.$$

Likewise, if we need to differentiate $f = x^2 + y^2 + z^2$ with respect to $y$, the chain rule is

$$\frac{\partial f}{\partial y} = \vec{\nabla} f\,\frac{\partial(u, v, w)}{\partial y}.$$

(I'm thinking of $f$ as being $u + v + w$, with $u = x^2$, $v = y^2$, and $w = z^2$.)

Exercise 3.2.5. Write out what "$\frac{\partial f}{\partial y} = \vec{\nabla} f\,\frac{\partial(u,v,w)}{\partial y}$" means and calculate it, assuming $f = x^2 + y^2 + z^2$, as above.

Exercise 3.2.6. Write down the chain rule for $\frac{\partial f}{\partial x}$, again assuming $f = x^2 + y^2 + z^2$; write out what this chain rule means, and calculate it.

Example 3.2.7. The density $\rho$ of the water at a point under the surface of the ocean depends on the temperature $T$, the depth $d$, and the salinity $s$ at that point. The temperature and the salinity both depend on the depth. Write down the chain rule for finding out how the density changes with depth, and interpret it.

Fine: $\rho$ depends on $T$, $d$, and $s$. We can express this fact as $\rho = \rho(d, s, T)$ (keeping everything in alphabetical order, for the sake of good bookkeeping). Likewise, $T = T(d)$ and $s = s(d)$. On the other hand, $d$ is just $d$. (You can say $d = d(d)$ if you like, but it seems pretty silly.) The chain rule for how density changes with depth is (by analogy with Equation (7))

$$\frac{\partial \rho}{\partial d} = \vec{\nabla}\rho\,\frac{\partial(d, s, T)}{\partial d}.$$

We can calculate this by realizing that

$$\vec{\nabla}\rho = \begin{bmatrix}\rho_d & \rho_s & \rho_T\end{bmatrix} \quad\text{and}\quad \frac{\partial(d, s, T)}{\partial d} = \begin{bmatrix}d_d\\ s_d\\ T_d\end{bmatrix} = \begin{bmatrix}1\\ s_d\\ T_d\end{bmatrix}.$$

Putting all this together, we get that

$$\frac{\partial \rho}{\partial d} = \vec{\nabla}\rho\,\frac{\partial(d, s, T)}{\partial d} = \begin{bmatrix}\rho_d & \rho_s & \rho_T\end{bmatrix}\begin{bmatrix}1\\ s_d\\ T_d\end{bmatrix} = \rho_d + \rho_s s_d + \rho_T T_d.$$

Notice that the rightmost expression clearly shows that density depends on depth (directly, without regard to salinity or temperature; that's the "$\rho_d$" part), but that density also depends on salinity and temperature, which in turn depend on depth. Of course, we could have said all that with words (just did!), but the statement

$$\frac{\partial \rho}{\partial d} = \rho_d + \rho_s s_d + \rho_T T_d$$

is cleaner, easier to read, and just plain more elegant. Moreover, this statement shows how the dependence of salinity and temperature on depth is incorporated into the dependence of density on depth. Heh! Try doing that in words!

The foregoing example points out a shortcoming in our notation. We have said elsewhere that $\frac{\partial \rho}{\partial d}$ and $\rho_d$ mean the same thing. Yet, in the example above, they don't! What's going on here?

Well, in the "$\rho_s s_d + \rho_T T_d$" part of our answer, we treated $s$ and $T$ like intermediate variables. But in the "$\rho_d$" part, we did not! In the "$\rho_d$" part, we held $s$ and $T$ fixed. It's as though we were saying that "$\frac{\partial \rho}{\partial d}$" means "the partial derivative of $\rho$ with respect to $d$, treating $s$ and $T$ as intermediate variables when it suits us," while saying that "$\rho_d$" means "the partial derivative of $\rho$ with respect to $d$, holding $s$ and $T$ constant."

The science and engineering community believe they have a cure for this problem, but in my excessively picky way, I don't believe their "cure" does the job.²

4 "The" Chain Rule

Our sloppy symbols are actually flexible enough to allow us to put all the partial derivatives of a function in one place. For example, Equation (11) and its cousin $\frac{\partial f}{\partial x} = \vec{\nabla} f\,\frac{\partial(u,v)}{\partial x}$ from Exercise 3.2.3 give us chain rules for the first partial derivatives of some function $f$ with respect to $y$ and $x$, respectively.
We can combine these two chain rules into one happy equation, like so:

$$\frac{\partial f}{\partial(x, y)} = \begin{bmatrix}f_u & f_v\end{bmatrix}\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix}, \tag{12}$$

that is, if we can agree on what the symbols mean. You'll recognize the row $\begin{bmatrix}f_u & f_v\end{bmatrix}$ as being the gradient of $f$, though it's lying down on the job. The boxy thing $\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix}$ is called the "Jacobian matrix" (here, of the intermediate variables $u$ and $v$ with respect to $x$ and $y$). The symbol for the Jacobian matrix is the hopefully unsurprising $\frac{\partial(u, v)}{\partial(x, y)}$. Using this symbol allows us to write our chain rules together as

$$\frac{\partial f}{\partial(x, y)} = \vec{\nabla} f\,\frac{\partial(u, v)}{\partial(x, y)}. \tag{13}$$

Fine, but how does this symbol stand for the two chain rules combined?

Think of the right-hand side of Equation (12) as a multiplication.³ To produce one of the chain rules properly, this multiplication has to include multiplying the gradient by $\frac{\partial(u,v)}{\partial x}$. Since $\frac{\partial(u,v)}{\partial x}$ is the left column of the Jacobian matrix, the multiplication required includes taking the dot product of the gradient with $\frac{\partial(u,v)}{\partial x}$. Likewise, we need the dot product of the gradient with $\frac{\partial(u,v)}{\partial y}$, to get the other chain rule. So the multiplication in Equation (12) is a pair of dot products. Specifically, it's

$$\begin{bmatrix}f_u & f_v\end{bmatrix}\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix} = \begin{bmatrix}f_u u_x + f_v v_x & f_u u_y + f_v v_y\end{bmatrix}.$$

² If you want to know what the "cure" is, take a look at pages 844–846 of the text.

³ There's no symbol between the gradient and the Jacobian matrix, and writing things next to each other with no symbol between has meant multiplication since 5th or 6th grade, yes? So, it's a multiplication.

If you like, you can write this as

$$\begin{bmatrix}f_u & f_v\end{bmatrix}\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix} = \begin{bmatrix}\vec{\nabla} f\,\frac{\partial(u,v)}{\partial x} & \vec{\nabla} f\,\frac{\partial(u,v)}{\partial y}\end{bmatrix}.$$

I prefer to write it as

$$\begin{bmatrix}f_u & f_v\end{bmatrix}\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix} = \vec{\nabla} f\,\frac{\partial(u, v)}{\partial(x, y)}.$$

Cleaner, yes? So now we can write

$$\frac{\partial f}{\partial(x, y)} = \vec{\nabla} f\,\frac{\partial(u, v)}{\partial(x, y)},$$

as in Equation (13). As a bonus, the symbol $\frac{\partial(u,v)}{\partial(x,y)}$ is consistent with symbols like the $\frac{\partial(u,v)}{\partial y}$ we used in Equation (11) and with $\frac{\partial f}{\partial(x,y)}$.

So: Equation (13) is really Equation (12), in disguise. We will call $\frac{\partial f}{\partial(x,y)}$ the total derivative of $f$.

Example 4.0.8.
Let's calculate the total derivative of $f = \sinh(x^2 - y^2)\cos(x^2 + y^2)$. To do so, I suggest letting $u = x^2 - y^2$ and $v = x^2 + y^2$. Then $f$ is $f = \sinh u\cos v$, which shows how $f$ depends on $u$ and $v$; bear in mind that these two intermediate variables depend on $x$ and $y$. Hmm... Sounds like a job for Equation (13):

$$\begin{aligned}
\frac{\partial f}{\partial(x, y)} &= \vec{\nabla} f\,\frac{\partial(u, v)}{\partial(x, y)} = \begin{bmatrix}f_u & f_v\end{bmatrix}\begin{bmatrix}u_x & u_y\\ v_x & v_y\end{bmatrix}\\
&= \begin{bmatrix}\cosh(x^2 - y^2)\cos(x^2 + y^2) & -\sinh(x^2 - y^2)\sin(x^2 + y^2)\end{bmatrix}\begin{bmatrix}2x & -2y\\ 2x & 2y\end{bmatrix}\\
&= \Bigl[\; 2x\cosh(x^2 - y^2)\cos(x^2 + y^2) - 2x\sinh(x^2 - y^2)\sin(x^2 + y^2)\\
&\qquad -2y\cosh(x^2 - y^2)\cos(x^2 + y^2) - 2y\sinh(x^2 - y^2)\sin(x^2 + y^2) \;\Bigr]
\end{aligned}$$

(The last expression is supposed to be a row, but it didn't fit, so I put the first entry on one line and the second entry on the following line.)

Exercise 4.0.9. Calculate the total derivative of $f = \sin(x + y)\cos(x - y)$.

Exercise 4.0.10. What would Equation (13) look like if $f = \sin(x^2 + y^2 + z^2)$?

We now finally arrive at "the" chain rule. I want a nice way to write it. To create a nice way, note first that all our chain rules include the symbol $\vec{\nabla} f$. But the symbol for the Jacobian is different from one context to another, depending on how many intermediate variables there are, and how many independent variables. I will get around this problem by using the symbol $J$ to stand for the Jacobian. Likewise, the symbol for the total derivative depends on how many independent variables there are. I will get around this problem by using the symbol $Df$ for the total derivative. Here is the long-awaited chain rule:

$$Df = \vec{\nabla} f\,J. \tag{14}$$

Not very dramatic, perhaps, but this one equation now includes all the chain rules there are in the universe, from Calc I on up.

You may be interested to know that suitable use of matrix multiplication can extend the chain rule to situations in which there are variables between the intermediate variables and the independent variables.
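To see Equation (14) as an actual row-times-matrix multiplication, here is a minimal sketch built on Example 4.0.8, with $u = x^2 - y^2$ and $v = x^2 + y^2$. The helper names, the sample point, and the finite-difference step are assumptions made for illustration; nothing here is a library routine.

```python
import math

def f(x, y):
    # Example 4.0.8: f = sinh(x^2 - y^2) * cos(x^2 + y^2)
    return math.sinh(x * x - y * y) * math.cos(x * x + y * y)

def gradient_wrt_uv(u, v):
    # Row [f_u, f_v] for f = sinh(u) * cos(v)
    return [math.cosh(u) * math.cos(v), -math.sinh(u) * math.sin(v)]

def jacobian(x, y):
    # J = [[u_x, u_y], [v_x, v_y]] for u = x^2 - y^2, v = x^2 + y^2
    return [[2 * x, -2 * y],
            [2 * x, 2 * y]]

def row_times_matrix(row, matrix):
    # Entry j of the result is the dot product of the row with column j
    n_cols = len(matrix[0])
    return [sum(row[k] * matrix[k][j] for k in range(len(row)))
            for j in range(n_cols)]

def total_derivative(x, y):
    # Equation (14): Df = (gradient of f) times J
    u = x * x - y * y
    v = x * x + y * y
    return row_times_matrix(gradient_wrt_uv(u, v), jacobian(x, y))

def numeric_total_derivative(x, y, h=1e-6):
    # Central differences in x and in y (h is an assumed step size)
    f_x = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    f_y = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return [f_x, f_y]

x0, y0 = 0.8, 0.3  # arbitrary sample point (assumption)
Df = total_derivative(x0, y0)
Df_numeric = numeric_total_derivative(x0, y0)
print(Df, Df_numeric)
```

Because `row_times_matrix` works for any compatible sizes, the same sketch covers three intermediate variables or three independent variables: only `gradient_wrt_uv` and `jacobian` would change.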
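I also note in passing that if you want your total derivative to be a genuine vector (as opposed to a row matrix), you can use

$$\begin{bmatrix}\frac{\partial u}{\partial x} & \frac{\partial v}{\partial x}\\[2pt] \frac{\partial u}{\partial y} & \frac{\partial v}{\partial y}\end{bmatrix}\begin{bmatrix}\frac{\partial f}{\partial u}\\[2pt] \frac{\partial f}{\partial v}\end{bmatrix} = J^{T}(\vec{\nabla} f)^{T};$$

this is what you get when you transpose the matrices in Equation (14) and reverse the order of multiplication.

5 Additional exercises

Exercise 5.0.11. Use Equation (7) and the chain rule you invented in Exercise 3.0.2 to calculate the first partial derivatives of $f(x, y) = xy + x/y$.

Until I can get some more exercises written, look in your Calculus text, in the section on "chain rules." They'll talk about "branch diagrams" or "tree diagrams" for helping you with the bookkeeping. That's fine, but try working your textbook's examples using the methods of this document, and see if you get the same answers as the book does. You'd better!

6 Answers, etc.

Exercise 2.0.1. Let $u = \sinh x$ and $v = x^2$. Then $f = u^2 - \ln v$, so that:

$$\left.\frac{\partial f}{\partial u}\right|_{u=\sinh x} = \left.2u\right|_{u=\sinh x} = 2\sinh x, \qquad \frac{du}{dx} = \cosh x,$$

$$\left.\frac{\partial f}{\partial v}\right|_{v=x^2} = \left.-\frac{1}{v}\right|_{v=x^2} = -\frac{1}{x^2}, \quad\text{and}\quad \frac{dv}{dx} = 2x.$$

The derivative of the $u^2$ term is therefore

$$\left.\frac{\partial f}{\partial u}\right|_{u=\sinh x}\frac{du}{dx} = (2\sinh x)(\cosh x) = 2\sinh x\cosh x,$$

and the derivative of the $-\ln v$ term is

$$\left.\frac{\partial f}{\partial v}\right|_{v=x^2}\frac{dv}{dx} = -\frac{1}{x^2}(2x) = -\frac{2}{x}.$$

Put the pieces together to get

$$\frac{df}{dx} = \left.\frac{\partial f}{\partial u}\right|_{u=\sinh x}\frac{du}{dx} + \left.\frac{\partial f}{\partial v}\right|_{v=x^2}\frac{dv}{dx} = 2\sinh x\cosh x - \frac{2}{x}.$$

Note: In practice, people usually think in terms of $\frac{\partial f}{\partial u}\frac{du}{dx} + \frac{\partial f}{\partial v}\frac{dv}{dx}$ and substitute in the $\sinh x$ and the $x^2$, as needed. Their work typically looks like this, on paper:

$$\frac{df}{dx} = (2\sinh x)(\cosh x) - \frac{1}{x^2}(2x) = 2\sinh x\cosh x - \frac{2}{x}.$$

Exercise 3.0.2. Knee-jerk reaction: Differentiate the $\ln(2x + y)$ term using the Calc I chain rule, but remember to hold $x$ constant during the differentiation:

$$\frac{\partial}{\partial y}\ln(2x + y) = \frac{1}{2x + y}\,\frac{\partial}{\partial y}(2x + y) = \frac{1}{2x + y}.$$

Then do the same for the cosine term:

$$\frac{\partial}{\partial y}\cos(3x - y) = -\sin(3x - y)\,\frac{\partial}{\partial y}(3x - y) = \sin(3x - y),$$

and combine the results to get

$$\frac{\partial f}{\partial y} = \frac{1}{2x + y} + \sin(3x - y).$$

What we've done here is $\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}$.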
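This is the chain rule similar to Equation (7) that I hoped you'd invent.

Exercise 3.1.1. Let $u = xy$ and $v = \frac{x}{y}$. Then $f = \sqrt{u} + \sqrt{v}$, so

$$\frac{\partial f}{\partial u} = \frac{1}{2\sqrt{u}} = \frac{1}{2\sqrt{xy}}, \qquad \frac{\partial f}{\partial v} = \frac{1}{2\sqrt{v}} = \frac{1}{2\sqrt{x/y}} = \sqrt{\frac{y}{4x}}, \qquad \frac{\partial u}{\partial x} = y, \quad\text{and}\quad \frac{\partial v}{\partial x} = \frac{1}{y}.$$

Equation (7) now says

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = \frac{y}{2\sqrt{xy}} + \frac{1}{y}\sqrt{\frac{y}{4x}} = \sqrt{\frac{y}{4x}} + \frac{1}{\sqrt{4xy}}.$$

Exercise 3.1.2. The chain rule you created for Exercise 3.0.2 should have been $\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}$. We can use

$$\frac{\partial f}{\partial u} = \frac{1}{2\sqrt{xy}} \quad\text{and}\quad \frac{\partial f}{\partial v} = \sqrt{\frac{y}{4x}}$$

from the previous exercise. But instead of $\frac{\partial u}{\partial x}$ and $\frac{\partial v}{\partial x}$, we need

$$\frac{\partial u}{\partial y} = x \quad\text{and}\quad \frac{\partial v}{\partial y} = -\frac{x}{y^2}.$$

Then

$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} = \frac{1}{2\sqrt{xy}}\,x + \sqrt{\frac{y}{4x}}\left(-\frac{x}{y^2}\right) = \sqrt{\frac{x}{4y}} - \sqrt{\frac{x}{4y^3}}.$$

Exercise 3.1.3. Well, $f$ depends on $u$, $v$, and $w$, all of which depend on $z$, but in different ways. So the contributions to $\frac{\partial f}{\partial z}$ that $u$, $v$, and $w$ all make have to be accounted for separately. The term $\frac{\partial f}{\partial u}\frac{\partial u}{\partial z}$ describes the dependence of $f$ on $z$, via $u$, and likewise for the terms $\frac{\partial f}{\partial v}\frac{\partial v}{\partial z}$ and $\frac{\partial f}{\partial w}\frac{\partial w}{\partial z}$. Adding the three terms together gives the total dependence of $f$ on $z$.

Exercise 3.1.4. $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial x}$. The term $\frac{\partial f}{\partial v}\frac{\partial v}{\partial x}$ is missing, because $v = yz$ does not depend on $x$. Also, $\frac{\partial f}{\partial y} = \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial y}$. The term $\frac{\partial f}{\partial u}\frac{\partial u}{\partial y}$ is missing, because $u = xz$ does not depend on $y$. (If you prefer, it's missing because $\frac{\partial u}{\partial y} = 0$.)

Exercise 3.1.5.

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial x} = z\cosh(xz) + yz\,\mathrm{sech}^2(xyz)$$

$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial y} = z\sinh(yz) + xz\,\mathrm{sech}^2(xyz)$$

$$\frac{\partial f}{\partial z} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial z} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial z} + \frac{\partial f}{\partial w}\frac{\partial w}{\partial z} = x\cosh(xz) + y\sinh(yz) + xy\,\mathrm{sech}^2(xyz)$$

Exercise 3.1.6. Let $u = xy$. Then $f = \sin u\cos u$, and

$$\frac{\partial f}{\partial x} = \frac{df}{du}\frac{\partial u}{\partial x} = -y\sin^2 u + y\cos^2 u = y(\cos^2 xy - \sin^2 xy) = y\cos(2xy).$$

Likewise,

$$\frac{\partial f}{\partial y} = \frac{df}{du}\frac{\partial u}{\partial y} = -x\sin^2 u + x\cos^2 u = x(\cos^2 xy - \sin^2 xy) = x\cos(2xy).$$

Exercise 3.2.1.

$$\frac{\partial f}{\partial x} = \begin{bmatrix}\frac{\partial f}{\partial u}\\[2pt] \frac{\partial f}{\partial v}\end{bmatrix}\cdot\begin{bmatrix}\frac{\partial u}{\partial x}\\[2pt] \frac{\partial v}{\partial x}\end{bmatrix}$$

Exercise 3.2.2.

$$\frac{\partial f}{\partial x} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial x}\\[2pt] \frac{\partial v}{\partial x}\end{bmatrix}$$

Exercise 3.2.3.

$$\frac{\partial f}{\partial x} = \vec{\nabla} f\,\frac{\partial(u,v)}{\partial x} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial x}\\[2pt] \frac{\partial v}{\partial x}\end{bmatrix} = \begin{bmatrix}\frac{\partial f}{\partial u}\\[2pt] \frac{\partial f}{\partial v}\end{bmatrix}\cdot\begin{bmatrix}\frac{\partial u}{\partial x}\\[2pt] \frac{\partial v}{\partial x}\end{bmatrix}.$$

Oh. This looks a lot like what we wrote for the previous two exercises! (Sorry about the repetition. I wanted to drive home the point that we're just writing the same thing in three different ways.)

Exercise 3.2.4. Following the same pattern for Equation (8):

$$\frac{\partial f}{\partial z} = \vec{\nabla} f\,\frac{\partial(u,v,w)}{\partial z} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} & \frac{\partial f}{\partial w}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial z}\\[2pt] \frac{\partial v}{\partial z}\\[2pt] \frac{\partial w}{\partial z}\end{bmatrix} = \begin{bmatrix}\frac{\partial f}{\partial u}\\[2pt] \frac{\partial f}{\partial v}\\[2pt] \frac{\partial f}{\partial w}\end{bmatrix}\cdot\begin{bmatrix}\frac{\partial u}{\partial z}\\[2pt] \frac{\partial v}{\partial z}\\[2pt] \frac{\partial w}{\partial z}\end{bmatrix}.$$

Exercise 3.2.5. $\frac{\partial f}{\partial y} = \vec{\nabla} f\,\frac{\partial(u,v,w)}{\partial y}$ means

$$\frac{\partial f}{\partial y} = \vec{\nabla} f\,\frac{\partial(u, v, w)}{\partial y} = \begin{bmatrix}f_u & f_v & f_w\end{bmatrix}\begin{bmatrix}u_y\\ v_y\\ w_y\end{bmatrix},$$

which we calculate as

$$\begin{bmatrix}1 & 1 & 1\end{bmatrix}\begin{bmatrix}0\\ 2y\\ 0\end{bmatrix} = 1(0) + 1(2y) + 1(0) = 2y.$$

Exercise 4.0.9. Let $u = x + y$ and $v = x - y$, so that $f = \sin u\cos v$. Then

$$\begin{aligned}
\frac{\partial f}{\partial(x, y)} &= \vec{\nabla} f\,\frac{\partial(u, v)}{\partial(x, y)} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y}\\[2pt] \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y}\end{bmatrix}\\
&= \begin{bmatrix}\cos u\cos v & -\sin u\sin v\end{bmatrix}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\\
&= \begin{bmatrix}\cos u\cos v - \sin u\sin v & \cos u\cos v + \sin u\sin v\end{bmatrix}.
\end{aligned}$$

Exercise 4.0.10. Let $u = x^2$, $v = y^2$, and $w = z^2$, so that $f = \sin(u + v + w)$. Then the total derivative of $f$ is

$$\begin{aligned}
\frac{\partial f}{\partial(x, y, z)} &= \vec{\nabla} f\,\frac{\partial(u, v, w)}{\partial(x, y, z)} = \begin{bmatrix}\frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} & \frac{\partial f}{\partial w}\end{bmatrix}\begin{bmatrix}\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} & \frac{\partial u}{\partial z}\\[2pt] \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} & \frac{\partial v}{\partial z}\\[2pt] \frac{\partial w}{\partial x} & \frac{\partial w}{\partial y} & \frac{\partial w}{\partial z}\end{bmatrix}\\
&= \begin{bmatrix}\cos(u + v + w) & \cos(u + v + w) & \cos(u + v + w)\end{bmatrix}\begin{bmatrix}2x & 0 & 0\\ 0 & 2y & 0\\ 0 & 0 & 2z\end{bmatrix}\\
&= \begin{bmatrix}2x\cos(u + v + w) & 2y\cos(u + v + w) & 2z\cos(u + v + w)\end{bmatrix}\\
&= \begin{bmatrix}2x\cos(x^2 + y^2 + z^2) & 2y\cos(x^2 + y^2 + z^2) & 2z\cos(x^2 + y^2 + z^2)\end{bmatrix}.
\end{aligned}$$

Exercise 5.0.11. Equation (7) says $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x}$. If we let $u = xy$ and $v = \frac{x}{y}$, then $f = u + v$, and

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = (1)(y) + (1)\frac{1}{y} = y + \frac{1}{y}.$$

Similarly, in Exercise 3.0.2, you should have found that $\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}$. This implies that

$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} = (1)(x) + (1)\left(-\frac{x}{y^2}\right) = x - \frac{x}{y^2}.$$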