The direct proof of the chain rule
Advanced Calculus II, Math 411
Summer Session 2010
1. Intro. In this note we’ll go over the proof of the chain rule as a corollary of the
Mean Value Lemma and Mean Value Proposition given in the book.
2. Theorem. Let $g : \mathbb{R}^n \to \mathbb{R}$ and $f_1, \dots, f_n : \mathbb{R}^m \to \mathbb{R}$ have continuous first-order
partials. Write $F = (f_1, \dots, f_n)$ and consider the composition, call it $k$,
$$k(x) = g(f_1(x), \dots, f_n(x)) = (g \circ F)(x).$$
Let $x \in \mathbb{R}^m$. To simplify notation write $y = (f_1(x), \dots, f_n(x))$, so that $k(x) = g(y)$.
Then $k$ has first-order partials, $D_i k$, and
$$D_i k(x) = \sum_{j=1}^{n} D_j g(y) \cdot D_i f_j(x).$$
3. Remark. Notice the use of the notation $D_j$. This is to avoid confusion about
what we are taking the derivative with respect to. You need to treat $g$ as a function
of $f_1, \dots, f_n$, so $D_j g$ means the partial of $g$ “with respect to” $f_j$.
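4. Example. One common example is the case when the $f_i$ give a change of coordinates. For example, write
$$x = r\cos\theta, \qquad y = r\sin\theta.$$
Here, $x$ and $y$ take the role of the $f_i$. Then
$$\frac{\partial x}{\partial r} = \cos\theta, \qquad \frac{\partial x}{\partial \theta} = -r\sin\theta, \qquad \frac{\partial y}{\partial r} = \sin\theta, \qquad \frac{\partial y}{\partial \theta} = r\cos\theta.$$
This allows us to compute the radial rate of change of a function given to us in
terms of $x$ and $y$.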
5. Example. The charge of a particle in $\mathbb{R}^2$ is given by the equation
$$C(x, y) = y e^{x}.$$
How fast is the charge changing as the particle moves away from the origin?
First note that
$$D_1 C(x, y) = y e^{x}, \qquad D_2 C(x, y) = e^{x}.$$
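Define $\hat{C}(r, \theta) = C(f_1(r, \theta), f_2(r, \theta))$, where $f_1(r, \theta) = r\cos\theta$ and $f_2(r, \theta) = r\sin\theta$. Fix
a specific point $(r, \theta) \mapsto (f_1(r, \theta), f_2(r, \theta))$. By the chain rule,
$$\begin{aligned}
D_1 \hat{C}(r, \theta) &= D_1 C(f_1(r, \theta), f_2(r, \theta)) \cdot D_1 f_1(r, \theta) + D_2 C(f_1(r, \theta), f_2(r, \theta)) \cdot D_1 f_2(r, \theta) \\
&= f_2(r, \theta)\, e^{f_1(r, \theta)} \cdot \cos\theta + e^{f_1(r, \theta)} \cdot \sin\theta \\
&= r\sin\theta\, e^{r\cos\theta} \cdot \cos\theta + e^{r\cos\theta} \cdot \sin\theta.
\end{aligned}$$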
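The radial derivative computed in this example can be sanity-checked numerically. The sketch below is not part of the original note: it compares the chain-rule formula against a central finite difference of the charge written in polar coordinates. The helper names (`C_hat`, `radial_derivative`, `central_difference`) and the sample point are illustrative choices.

```python
import math


def C(x, y):
    """Charge from the example: C(x, y) = y * e^x."""
    return y * math.exp(x)


def C_hat(r, theta):
    """The same charge expressed in polar coordinates."""
    return C(r * math.cos(theta), r * math.sin(theta))


def radial_derivative(r, theta):
    """Chain-rule formula D_1 C * D_1 f_1 + D_2 C * D_1 f_2 from the example."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return y * math.exp(x) * math.cos(theta) + math.exp(x) * math.sin(theta)


def central_difference(func, r, theta, step=1e-6):
    """Numerical estimate of the partial derivative in r (hypothetical helper)."""
    return (func(r + step, theta) - func(r - step, theta)) / (2 * step)


r, theta = 1.5, 0.7  # arbitrary sample point, chosen only for illustration
exact = radial_derivative(r, theta)
approx = central_difference(C_hat, r, theta)
print(f"chain rule: {exact:.8f}   finite difference: {approx:.8f}")
```

The two printed values should agree to several decimal places; a large gap would point to an error in one of the partials.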
6. Remark. In practice, the above notation is unwieldy and unnecessary. Instead,
we generally write
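$$\frac{\partial C}{\partial r} = \frac{\partial C}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial C}{\partial y}\frac{\partial y}{\partial r}.$$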
This notation can sometimes cause confusion because when you write ∂C/∂x and
∂C/∂y, you generally mean that they are functions of x and y. Here however, they
are thought of as functions of r and θ. Be careful.
Another way this notation is useful is that you can remember the chain rule
easily by using diagrams like the following
[Diagram: arrows run from each of r and θ to each of x and y, and from each of x and y to C. The arrows along the two paths r → x → C and r → y → C are bold.]
The bold arrows show all the paths we must take to get from r to C, from which
you can read off which partials you need to take.
7. Proof of Theorem. The goal is to examine the quotient
$$\frac{k(x + te_i) - k(x)}{t},$$
whose limit (as $t \to 0$) is $D_i k(x)$.
For the sake of argument, fix $t > 0$. By the (usual) Mean Value Theorem applied
to each $f_j$, there exists $\theta_j$ with $0 < \theta_j < t$ such that
$$f_j(x + te_i) - f_j(x) = t \cdot D_i f_j(x + \theta_j e_i).$$
Let $h = F(x + te_i) - F(x)$. By the Mean Value Proposition applied to $g$,
there exist $z_1, \dots, z_n$ with $\|z_j - y\| < \|h\|$ such that
$$g(y + h) - g(y) = \sum_{j=1}^{n} h_j \cdot D_j g(z_j).$$
But $k(x + te_i) - k(x) = g(y + h) - g(y)$ and $h_j = f_j(x + te_i) - f_j(x)$, so
$$k(x + te_i) - k(x) = \sum_{j=1}^{n} h_j \cdot D_j g(z_j) = \sum_{j=1}^{n} t \cdot D_i f_j(x + \theta_j e_i) \cdot D_j g(z_j).$$
Now divide through by $t$. By construction, as $t \to 0$ we have $\theta_j \to 0$, and since each $f_j$ is continuous, $h \to 0$, so $z_j \to y$. Therefore
$$\frac{k(x + te_i) - k(x)}{t} = \sum_{j=1}^{n} D_i f_j(x + \theta_j e_i) \cdot D_j g(z_j) \xrightarrow{\; t \to 0 \;} \sum_{j=1}^{n} D_i f_j(x) \cdot D_j g(y),$$
where in the last step we have used the fact that the $D_i f_j$ and $D_j g$ are all continuous
functions (as $z_j \to y$, $D_j g(z_j) \to D_j g(y)$, etc.).
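The limit in the proof can also be watched numerically. The following sketch is an illustration only: the functions `F` and `g`, the point `x`, and the helper names are hypothetical choices with $m = n = 2$, not taken from the note. It prints how far the difference quotient is from the chain-rule sum as $t$ shrinks.

```python
import math


def F(x):
    """Inner map F = (f_1, f_2), with f_1 = x1 * x2 and f_2 = x1 + x2^2."""
    x1, x2 = x
    return (x1 * x2, x1 + x2 ** 2)


def g(y):
    """Outer function g(y1, y2) = sin(y1) * y2."""
    y1, y2 = y
    return math.sin(y1) * y2


def k(x):
    """Composition k = g o F."""
    return g(F(x))


def chain_rule_partial(x, i):
    """Sum over j of D_j g(y) * D_i f_j(x), using hand-computed partials."""
    x1, x2 = x
    y1, y2 = F(x)
    grad_g = (math.cos(y1) * y2, math.sin(y1))
    # Row j holds (D_1 f_j, D_2 f_j).
    partials_f = ((x2, x1), (1.0, 2.0 * x2))
    return sum(grad_g[j] * partials_f[j][i] for j in range(2))


def difference_quotient(x, i, t):
    """(k(x + t e_i) - k(x)) / t, the quotient examined in the proof."""
    shifted = list(x)
    shifted[i] += t
    return (k(tuple(shifted)) - k(x)) / t


x = (0.4, 1.1)
i = 0  # zero-based index, so this is the first partial
limit = chain_rule_partial(x, i)
errors = [abs(difference_quotient(x, i, 10.0 ** -p) - limit) for p in range(1, 6)]
for p, err in zip(range(1, 6), errors):
    print(f"t = 1e-{p}: |quotient - chain rule sum| = {err:.2e}")
```

The gap should shrink roughly in proportion to $t$, which is the behaviour expected of a one-sided difference quotient of a smooth function.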
8. Corollary. Let $x \in \mathbb{R}^n$, and $f : O \to \mathbb{R}$ be a continuously differentiable function
on some open set $O \ni x$, and $p \in \mathbb{R}^n$ any vector. Then the partials form a basis
for the space of directional derivatives, specifically:
$$\frac{\partial f}{\partial p}(x) = \langle \nabla f(x), p \rangle.$$
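9. Proof. Recall that we can write the directional derivative as a one dimensional
derivative. Pick $\varepsilon > 0$ such that $x + tp \in O$ whenever $|t| < \varepsilon$ (possible because $O$ is open). Define $g : (-\varepsilon, \varepsilon) \to \mathbb{R}$ by
$$g(t) = f(x + tp).$$
Then basically by definition, $g'(0) = \frac{\partial f}{\partial p}(x)$. By the chain rule,
$$\frac{d}{dt} g(t) = \sum_{i=1}^{n} D_i f(x + tp) \cdot \frac{d}{dt}(x_i + t p_i) = \sum_{i=1}^{n} D_i f(x + tp) \cdot p_i.$$
Now just set $t = 0$ to get
$$g'(0) = \sum_{i=1}^{n} D_i f(x) \cdot p_i = \langle \nabla f(x), p \rangle.$$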
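As a quick check of the corollary, the sketch below compares the inner product of the gradient with $p$ against a numerical derivative of $g(t) = f(x + tp)$ at $t = 0$. The function `f`, the point `x`, the direction `p`, and the helper names are hypothetical examples chosen for illustration.

```python
import math


def f(x):
    """Sample continuously differentiable function on R^3."""
    x1, x2, x3 = x
    return x1 ** 2 * x2 + math.sin(x3)


def grad_f(x):
    """Hand-computed gradient (D_1 f, D_2 f, D_3 f) of the sample function."""
    x1, x2, x3 = x
    return (2 * x1 * x2, x1 ** 2, math.cos(x3))


def directional_derivative(func, x, p, step=1e-6):
    """Central-difference estimate of g'(0), where g(t) = func(x + t p)."""
    def g(t):
        return func(tuple(xi + t * pi for xi, pi in zip(x, p)))
    return (g(step) - g(-step)) / (2 * step)


x = (1.0, -2.0, 0.5)
p = (3.0, 1.0, -1.0)  # any vector; it need not have unit length
inner_product = sum(d * pi for d, pi in zip(grad_f(x), p))
numeric = directional_derivative(f, x, p)
print(f"<grad f(x), p> = {inner_product:.8f}   numerical g'(0) = {numeric:.8f}")
```

Note that `p` is deliberately not normalized, matching the corollary, which is stated for any vector $p$.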