4 Partial Differentiation

Many equations in engineering, physics and mathematics tie together more than two variables.
Examples are Ohm’s Law, V = IR, and the equation for an ideal gas, PV = nRT, which
gives the relationship between pressure (P), volume (V) and temperature (T). If we vary any
two of these then the behaviour of the third can be calculated:
$$ P = \frac{nRT}{V}, \qquad V = \frac{nRT}{P}, \qquad T = \frac{PV}{nR}. $$
How P varies as we change T and V is easy to see from the above, but we want to adapt the
tools of one-variable calculus to help us investigate functions of more than one variable.
For the most part we shall concentrate on functions of two variables such as $z = x^2 + y^2$
or $z = x\sin(y + e^x)$. Graphically z = f(x, y) describes a surface in 3D space; varying the
x- and y-coordinates gives the z-coordinate, producing the surface:
[Figure: the surface z = f(x, y) above the (x, y)-plane, showing a point (x0, y0) in the plane and its height z0 = f(x0, y0) on the surface.]
One task of interest will be to maximise or minimise such functions, where we may have
to take into account limitations on the domain of definition, i.e. those points (x, y) for which
we calculate z = f(x, y). Restrictions on the domain can come from both mathematical and
physical reasons. For example, the function $T = PV/(nR)$ above makes mathematical sense for
negative P or V, but, physically, negative pressure or volume has no obvious meaning.
Exercise 4.1. What is the domain of the function $z = f(x, y) = \sqrt{1 - x^2 - y^2}$?
Solution.
Consider the function z = ln(x + y). From its definition we see that it is defined only
when x + y > 0, that is only for points $(x, y) \in \mathbb{R}^2$ lying above the line y = −x. Moreover
on any line x + y = a for a > 0 we have z = ln a, that is z maintains a constant value, so we
have a contour line of the surface:
[Figure: the (x, y)-plane, showing the line x + y = 0 bounding the domain and a contour line x + y = a.]
Note also that on any line with equation y = mx + c for constants m > −1 and c, we have
$z = \ln\big((m+1)x + c\big) = \ln\big(x + c/(m+1)\big) + \ln(m+1)$ for all x > −c/(m + 1), so that we
get a copy of the graph of the logarithm curve when not travelling parallel to the lines of
constancy. (For m < −1 the factor m + 1 is negative, so z is instead defined for x < −c/(m + 1)
and the curve is reflected.) For example on y = x, z = ln(2x) = ln x + ln 2.
As another example, consider the function $z = x^2 + y^2$. If we choose a positive value for z,
for example z = 4, then the points (x, y) that can give rise to this value are those satisfying
$x^2 + y^2 = 4 = 2^2$, i.e. those on the circle centred on the origin of radius 2. On the other hand
if we fix a value for x, for example x = 0, then we have $z = y^2$. If we fix x = 1 then $z = y^2 + 1$.
Both of these are parabolas, and indeed fixing any value of x produces such a curve.
Symmetrically, fixing a value for y also produces a parabola, e.g. $z = x^2 + (-3)^2 = x^2 + 9$.
Note that at (x, y) = (0, 0), z = 0, but if $x \neq 0$ or $y \neq 0$, then $x^2 > 0$ or $y^2 > 0$, and it
follows that z > 0. Thus the minimum value taken by this function is z = 0, at the origin. This
contrasts with our earlier example z = ln(x + y) where if we move along y = x we have
z = ln(2x) = ln x + ln 2, which diverges to −∞ as x → 0 from above, and diverges to +∞ as
x → +∞. Thus there is no overall maximum or minimum value.
Unfortunately in general it is harder to picture what is happening with less simple
multivariable functions, such as $z = \sin(x^2 + y) + e^{xy}$. One useful technique illustrated above for
$z = x^2 + y^2$ is to hold either x or y constant. For example consider $z = x^2(1 - y) - xy^2 + y^3$.
Setting x = 0 gives
$$ z = y^3 \;\Rightarrow\; \frac{dz}{dy} = 3y^2, $$
and setting x = 1 gives
$$ z = y^3 - y^2 - y + 1 = (y - 1)^2(y + 1) \;\Rightarrow\; \frac{dz}{dy} = 3y^2 - 2y - 1 = (y - 1)(3y + 1). $$
On the other hand if y = −2 then
$$ z = 3x^2 - 4x - 8 \;\Rightarrow\; \frac{dz}{dx} = 6x - 4 = 2(3x - 2). $$
All of these slices through the surface give us an insight into the behaviour of the function:
[Figure: three slices through the surface: $z = y^3$ (x = 0), $z = y^3 - y^2 - y + 1$ (x = 1) and $z = 3x^2 - 4x - 8$ (y = −2).]
4.1 Definition of partial derivatives
Suppose that z = f(x, y) is a function of two variables. We define partial derivatives taken
with respect to x and with respect to y by:
$$ \frac{\partial z}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \qquad \frac{\partial z}{\partial y} = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k}, $$
whenever these limits exist. These definitions mirror those for the one variable case. For
$\partial z/\partial x$ we are holding the value of y fixed, altering x by a small amount h to get the
point (x + h, y), and calculating the slopes of straight line approximations to the tangent in
the x-direction. Similarly calculating $\partial z/\partial y$ involves holding x fixed and finding the limit of
approximations to the tangent in the y-direction.
Applying this definition to the function $z = x^2(1 - y) - xy^2 + y^3$ above we have
$$\begin{aligned}
\frac{\partial z}{\partial x} &= \lim_{h \to 0} \frac{\big[(x + h)^2(1 - y) - (x + h)y^2 + y^3\big] - \big[x^2(1 - y) - xy^2 + y^3\big]}{h} \\
&= \lim_{h \to 0} \frac{(2xh + h^2)(1 - y) - hy^2}{h} \\
&= \lim_{h \to 0} \big[(2x + h)(1 - y) - y^2\big] = 2x(1 - y) - y^2,
\end{aligned}$$
and this limit exists at all points (x, y) in the plane. A similar calculation shows that
$$ \frac{\partial z}{\partial y} = -x^2 - 2xy + 3y^2. $$
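The limit definition can also be explored numerically. The following is a minimal sketch, not part of the original notes: it approximates both partial derivatives of $z = x^2(1 - y) - xy^2 + y^3$ by difference quotients and compares them with the two formulas obtained from the definition. The helper names, the sample point and the step size are illustrative assumptions, and a symmetric difference quotient is used in place of the one-sided quotient in the definition because it is more accurate for a fixed small step.

```python
def f(x, y):
    """The example surface z = x^2 (1 - y) - x y^2 + y^3."""
    return x**2 * (1 - y) - x * y**2 + y**3

def partial_x(g, x, y, h=1e-6):
    """Symmetric difference quotient in x, holding y fixed."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, k=1e-6):
    """Symmetric difference quotient in y, holding x fixed."""
    return (g(x, y + k) - g(x, y - k)) / (2 * k)

def exact_partials(x, y):
    """The formulas obtained from the limit definition."""
    return 2 * x * (1 - y) - y**2, -x**2 - 2 * x * y + 3 * y**2

x0, y0 = 1.5, -2.0  # an arbitrary sample point
approx = (partial_x(f, x0, y0), partial_y(f, x0, y0))
exact = exact_partials(x0, y0)
print("numerical:", approx)
print("exact:    ", exact)
```

Shrinking the step further does not help indefinitely: once it is too small, rounding error in the subtraction dominates the quotient.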
However, in practice it is rarely necessary to go back to the definitions as we did above.
Indeed, since all we are doing is holding one variable fixed, we can treat this variable as a
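constant in our calculations. For example if $z = x^2 + xy^5 - 6x^3y + y^4$ then
$$\begin{aligned}
\frac{\partial z}{\partial x} &= \frac{d}{dx}(x^2) + y^5\frac{d}{dx}(x) - 6y\frac{d}{dx}(x^3) + y^4\frac{d}{dx}(1) \\
&= 2x + y^5 \times 1 - 6y \times 3x^2 + y^4 \times 0 = 2x + y^5 - 18x^2y.
\end{aligned}$$
Similarly,
$$\begin{aligned}
\frac{\partial z}{\partial y} &= x^2\frac{d}{dy}(1) + x\frac{d}{dy}(y^5) - 6x^3\frac{d}{dy}(y) + \frac{d}{dy}(y^4) \\
&= x^2 \times 0 + x \times 5y^4 - 6x^3 \times 1 + 4y^3 = 5xy^4 - 6x^3 + 4y^3.
\end{aligned}$$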
Using this technique we can make use of known results from one-variable theory such as
the product and quotient rules, and the chain rule if we are considering a function of one
variable. (The general chain rule for partial derivatives is a little more complicated.) For
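example if $z = \sin(xy)\,e^{x+y}$ then
$$\begin{aligned}
\frac{\partial z}{\partial x} &= \frac{\partial}{\partial x}\big[\sin(xy)\big] \times e^{x+y} + \sin(xy) \times \frac{\partial}{\partial x}\big[e^{x+y}\big] && \text{(product rule)} \\
&= \cos(xy)\frac{\partial}{\partial x}(xy) \times e^{x+y} + \sin(xy) \times e^{x+y}\frac{\partial}{\partial x}(x + y) && \text{(chain rule, twice)} \\
&= y\cos(xy)e^{x+y} + \sin(xy)e^{x+y}.
\end{aligned}$$
A similar calculation shows that
$$ \frac{\partial z}{\partial y} = x\cos(xy)e^{x+y} + \sin(xy)e^{x+y}. $$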
4.1.1 Alternative notations
If z = f(x, y), then $\partial z/\partial x$ is often written as $f_x(x, y)$, and $\partial z/\partial y$ as $f_y(x, y)$. A further
alternative is to write these as $f_1(x, y)$ and $f_2(x, y)$ respectively, where the number indicates
whether the derivative is being taken with respect to the first or second variable.
Exercise 4.2. For which points (x, y) is the function $z = \ln(y - x^2) + \sin(x + y^2)$ defined?
Sketch the domain of definition. Calculate $\partial z/\partial x$ and $\partial z/\partial y$.
Solution.
Exercise 4.3 (S03 6(a i)). Compute $\partial z/\partial x$ and $\partial z/\partial y$ when $z = x^2y + 3x\sin(x - 2y)$.
Solution.
4.1.2 Functions of more variables
We can extend the notion of partial derivatives to functions of any (finite) number of variables
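in a natural way. For example if $w = \sin(x + y) + z^2e^x$ then
$$\begin{aligned}
\frac{\partial w}{\partial x} &= \cos(x + y)\frac{\partial}{\partial x}(x + y) + z^2\frac{\partial}{\partial x}(e^x) = \cos(x + y) + z^2e^x, \\
\frac{\partial w}{\partial y} &= \cos(x + y)\frac{\partial}{\partial y}(x + y) + 0 = \cos(x + y), \\
\frac{\partial w}{\partial z} &= 0 + \frac{\partial}{\partial z}(z^2)\,e^x = 2ze^x.
\end{aligned}$$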
The geometrical significance of such functions is not so immediate as for functions of only
two variables.
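Although such functions are harder to picture, their partial derivatives are just as easy to estimate numerically. The sketch below is an illustration only (the helper `partial`, the sample point and the step size are assumptions, not part of the notes): it perturbs one coordinate at a time for $w = \sin(x + y) + z^2e^x$ and compares the result with the three formulas above.

```python
import math

def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative of f at point."""
    up = list(point)
    down = list(point)
    up[i] += h      # move a little in the i-th coordinate only
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def w(x, y, z):
    return math.sin(x + y) + z**2 * math.exp(x)

p = (0.3, 0.5, 2.0)  # an arbitrary sample point
numeric = [partial(w, p, i) for i in range(3)]

x, y, z = p
exact = [
    math.cos(x + y) + z**2 * math.exp(x),  # dw/dx
    math.cos(x + y),                       # dw/dy
    2 * z * math.exp(x),                   # dw/dz
]
print(numeric)
print(exact)
```

The same helper works unchanged for any number of variables, since it only ever changes one coordinate of the point.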
4.2 Tangent planes
By holding one variable fixed in the definition of partial derivatives, for example setting y = b,
we are taking a surface z = f (x, y), intersecting it with the plane y = b, and then taking
derivatives of the resulting curve in this plane:
[Figure: the curve cut from the surface by the plane y = b; at x = a its tangent line has slope $f_x(a, b)$.]
For one-variable calculus we are interested in approximating a function y = f (x) by finding
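a tangent line to the curve at some point (a, f(a)). For two variables the appropriate object
is the tangent plane.
[Figure: a surface with its tangent plane at a point, showing the tangent vectors $\mathbf{t}_x$ and $\mathbf{t}_y$ and the normal vector $\mathbf{n}$.]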
By taking partial derivatives in the orthogonal directions corresponding to the x- and
y-axes we can produce vectors that are in the direction of the tangent lines to the two curves
obtained by intersecting the surface with the planes x = a and y = b. In particular the
tangent plane at the point (a, b, f(a, b)) contains this point, and should contain the two
tangent lines through this point in the directions specified by the vectors obtained from these
partial derivatives.
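In the plane y = b a tangent vector is $\mathbf{t}_x = \big(1, 0, f_x(a, b)\big)$, and in the plane x = a a tangent
vector is $\mathbf{t}_y = \big(0, 1, f_y(a, b)\big)$. Both of these vectors lie in the tangent plane, hence their
vector product $\mathbf{n} = \mathbf{t}_x \times \mathbf{t}_y$ is a normal vector to the plane:
$$ \mathbf{n} = \mathbf{t}_x \times \mathbf{t}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & f_x(a, b) \\ 0 & 1 & f_y(a, b) \end{vmatrix} = \big(-f_x(a, b), -f_y(a, b), 1\big), $$
and so the tangent plane has equation
$$\begin{aligned}
& \big[(x, y, z) - (a, b, f(a, b))\big] \cdot \big(-f_x(a, b), -f_y(a, b), 1\big) = 0 \\
\Leftrightarrow\quad & -f_x(a, b)(x - a) - f_y(a, b)(y - b) + z - f(a, b) = 0 \\
\Leftrightarrow\quad & z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b).
\end{aligned}$$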
This function of x and y is also known as the linearisation of z = f (x, y) at the point (a, b).
Exercise 4.4 (A04 8(b)). Find the equation of the tangent plane and the normal line to the
surface $z = x^2y^3 - \sin\big(\pi x + \tfrac{\pi}{2}y\big)$ at the point (2, 1). Find the intersection of this normal line
with the (x, y)-plane.
Solution.
Exercise 4.5 (S03 6(b)). Find the equation of the tangent plane to the surface
$z = f(x, y) = \sqrt{8 - 3x^2 - y^2}$ at the point (1, 2, 1). Write down the linear approximation to
f(x, y) at (1, 2) and use it to find an approximate value for f(1.05, 1.95).
Solution.
Exercise 4.6. Find the tangent planes to the surface $z = f(x, y) = 3xy + x + y^2$ at (1, 0)
and at (−1, 2). Find the line of intersection of these two planes.
Solution.
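Example 4.7. Find the tangent plane to the surface
$$ z = e^{2xy} + \tan\left(\frac{\pi(x + y)}{16}\right) $$
at the point $(3, 1, e^6 + 1)$.
Solution. Since $z = e^{2xy} + \tan\big(\pi(x + y)/16\big)$, we have
$$ \frac{\partial z}{\partial x} = 2ye^{2xy} + \frac{\pi}{16}\sec^2\left(\frac{\pi(x + y)}{16}\right), \qquad \frac{\partial z}{\partial y} = 2xe^{2xy} + \frac{\pi}{16}\sec^2\left(\frac{\pi(x + y)}{16}\right). $$
So when x = 3 and y = 1, $z = e^6 + \tan\frac{\pi}{4} = e^6 + 1$ and
$$ \frac{\partial z}{\partial x} = 2e^6 + \frac{\pi}{16}\cdot\frac{1}{(1/\sqrt{2})^2} = 2e^6 + \frac{\pi}{8}, \qquad \frac{\partial z}{\partial y} = 6e^6 + \frac{\pi}{16}\cdot\frac{1}{(1/\sqrt{2})^2} = 6e^6 + \frac{\pi}{8}. $$
So the tangent plane has equation
$$ z = e^6 + 1 + \left(2e^6 + \frac{\pi}{8}\right)(x - 3) + \left(6e^6 + \frac{\pi}{8}\right)(y - 1). $$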
Example 4.8 (S05 8(c)). Find the equation of the tangent plane to the surface with equation
z = y cos(x − y) at the point (2, 2, 2).
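Solution. Taking partial derivatives of z we get
$$ \frac{\partial z}{\partial x} = -y\sin(x - y)\frac{\partial}{\partial x}(x - y) = -y\sin(x - y), $$
and
$$ \frac{\partial z}{\partial y} = \cos(x - y) - y\sin(x - y)\frac{\partial}{\partial y}(x - y) = \cos(x - y) + y\sin(x - y). $$
So at (2, 2, 2) we have $\partial z/\partial x = -2\sin 0 = 0$, and $\partial z/\partial y = \cos 0 + 2\sin 0 = 1$. Thus the tangent
plane is
$$ z = 2 + 0 \times (x - 2) + 1 \times (y - 2) \;\Rightarrow\; z = y. $$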
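As a sanity check on Example 4.8, the linearisation can be built numerically and compared with the surface near the point of tangency. This is a sketch under stated assumptions: the function `linearisation`, the step size and the test point (2.05, 1.95) are illustrative choices, and the partial derivatives are estimated by central differences rather than computed symbolically.

```python
import math

def f(x, y):
    """The surface of Example 4.8, z = y cos(x - y)."""
    return y * math.cos(x - y)

def linearisation(g, a, b, h=1e-6):
    """Return L(x, y) = g(a,b) + g_x(a,b)(x - a) + g_y(a,b)(y - b),
    with both partial derivatives estimated by central differences."""
    gx = (g(a + h, b) - g(a - h, b)) / (2 * h)
    gy = (g(a, b + h) - g(a, b - h)) / (2 * h)
    g0 = g(a, b)
    return lambda x, y: g0 + gx * (x - a) + gy * (y - b)

L = linearisation(f, 2.0, 2.0)

# The tangent plane found in the example is z = y, so L(x, y) should be close to y.
plane_value = L(2.05, 1.95)
surface_value = f(2.05, 1.95)
gap = abs(surface_value - plane_value)
print(plane_value, surface_value, gap)
```

The gap between plane and surface shrinks roughly quadratically as the test point approaches (2, 2), which is what makes the linearisation a useful approximation.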
4.3 Higher order derivatives
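Suppose $z = x\sin y + x^2y$. Then
$$ \frac{\partial z}{\partial x} = \sin y + 2xy \qquad \text{and} \qquad \frac{\partial z}{\partial y} = x\cos y + x^2. $$
Both of these partial derivatives are again functions of x and y, so we can differentiate both
of them, either with respect to x, or with respect to y. This gives us a total of four second
order partial derivatives:
$$\begin{aligned}
\frac{\partial^2 z}{\partial x^2} &= \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = 2y, &
\frac{\partial^2 z}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \cos y + 2x, \\
\frac{\partial^2 z}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \cos y + 2x, &
\frac{\partial^2 z}{\partial y^2} &= \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = -x\sin y.
\end{aligned}$$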
Remark. The mixed partial derivatives in this case are equal:
$$ \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial y}. $$
This is not something special about our particular example, but is true for all reasonably
well-behaved functions (for instance, whenever both mixed partial derivatives are continuous).
However it is possible to find functions for which this is not true (examples can be found in
calculus textbooks).
In the $\partial/\partial x$ notation, the order of taking the derivatives is given by reading the variables in
the denominator from right to left. For example $\dfrac{\partial^2 z}{\partial y\,\partial x}$ means calculate $\dfrac{\partial z}{\partial x}$, then differentiate
the result with respect to y. When using the $f_x$ or $f_1$ notations, the convention is the other
way: the subscripts are read left to right. That is, $f_{xy} = (f_x)_y$, the derivative with respect
to y of the derivative with respect to x. However, in light of the remark above, these conventions
can usually be ignored for most functions, since the results in either order will be the same.
For example, if $f(x, y) = x^2 + xy^2$, then
$$ f_x(x, y) = 2x + y^2, \qquad f_y(x, y) = 2xy, $$
and so
$$ f_{xx}(x, y) = 2, \qquad f_{xy}(x, y) = 2y = f_{yx}(x, y), \qquad f_{yy}(x, y) = 2x. $$
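The equality of the mixed partial derivatives can be observed numerically for the earlier example $z = x\sin y + x^2y$. The sketch below is illustrative only: the operator-style helpers `d_dx` and `d_dy`, the step size and the sample point are assumptions. Each helper turns a function of (x, y) into a central-difference approximation of one of its partial derivatives, so the two orders of differentiation are obtained by composing them in the two possible ways.

```python
import math

def z(x, y):
    return x * math.sin(y) + x**2 * y

def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

z_xy = d_dy(d_dx(z))  # subscript notation: differentiate in x first, then in y
z_yx = d_dx(d_dy(z))  # differentiate in y first, then in x

x0, y0 = 1.2, 0.7     # an arbitrary sample point
exact = math.cos(y0) + 2 * x0
print(z_xy(x0, y0), z_yx(x0, y0), exact)
```

A larger step is used here than for a first derivative, because nesting two difference quotients divides the rounding error by the step twice.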
Exercise 4.9. Compute all the second order partial derivatives of the function f (x, y) =
sin(x + xy).
Solution.
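Example 4.10. Compute $\dfrac{\partial z}{\partial x}$, $\dfrac{\partial z}{\partial y}$ and $\dfrac{\partial^2 z}{\partial x^2}$ when $z = x^3y + e^{x+y^2} + y\sin x$.
Solution. Since $z = x^3y + e^{x+y^2} + y\sin x$ then
$$ \frac{\partial z}{\partial x} = 3x^2y + e^{x+y^2} + y\cos x, \qquad \frac{\partial^2 z}{\partial x^2} = 6xy + e^{x+y^2} - y\sin x, $$
and
$$ \frac{\partial z}{\partial y} = x^3 + 2ye^{x+y^2} + \sin x. $$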
Example 4.11. Consider the following function of three variables:
$$ f(x, y, z) = x^2y^3 + \sin(x^2 + z) - \frac{xy}{z}. $$
Find the partial derivatives $f_x$, $f_y$, $f_{yx}$ and $f_{xyz}$.
Solution. Since $f(x, y, z) = x^2y^3 + \sin(x^2 + z) - \dfrac{xy}{z}$ then
$$ f_x = 2xy^3 + 2x\cos(x^2 + z) - \frac{y}{z}, \qquad f_y = 3x^2y^2 - \frac{x}{z}, $$
$$ f_{yx} = (f_y)_x = 6xy^2 - \frac{1}{z}, \qquad f_{xyz} = (f_{xy})_z = (f_{yx})_z = \frac{1}{z^2}. $$
4.4 Exercises
1. Find the domains of the following functions. Sketch these domains in parts (i) and (iii).
(i) $f(x, y) = \sqrt{8 - 3x^2 - y^2}$
(ii) $f(x, y) = \sin(x - y) + \dfrac{x^2 + 1}{x + y}$
(iii) $f(x, y) = \ln(1 - x^2 - y^2) + \dfrac{1}{x - y^2}$
(iv) $f(x, y) = \sin(x^2 + e^{xy})$
2. Find all the first order derivatives of the following functions:
(i) $f(x, y) = x^3 - 4xy^2 + y^4$
(ii) $f(x, y) = x^2e^y - 4y$
(iii) $f(x, y) = x^2\sin xy - 3y^2$
(iv) $f(x, y, z) = 3x\sin y + 4x^3y^2z$
3. Find the indicated partial derivatives:
(i) $f(x, y) = x^3 - 4xy^2 + 3y$: $f_{xx}$, $f_{yy}$, $f_{xy}$
(ii) $f(x, y) = x^4 - 3x^2y^3 + 5y$: $f_{xx}$, $f_{xy}$, $f_{xyy}$
(iii) $f(x, y, z) = e^{2xy} - \dfrac{z^2}{y} + xz\sin y$: $f_{xx}$, $f_{yy}$, $f_{yyzz}$
4. Find the equation of the tangent plane and normal line to the following surfaces at the
given point:
(i) $z = x^2 + y^2 - 1$ at (2, 1, 4)
(ii) $z = e^{-x^2 - y^2}$ at (0, 0, 1)
(iii) $z = \sin x\cos y$ at (0, π, 0)
(iv) $z = x^3 - 2xy$ at (−2, 3, 4)
(v) $z = \sqrt{x^2 + y^2}$ at (−3, 4, 5)
(vi) $z = \dfrac{4x}{y}$ at (1, 2, 2)
4.5 The Chain Rule
For one-variable calculus if y = f(x), i.e. y is a function of x, and if x = x(t), i.e. x is a
function of the variable t, then y can be viewed as a function of t, and has derivative
$$ \frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}. $$
Suppose instead that z = f(x, y) is a function of two variables, each of which is written
in terms of the single variable t. Then we think of z just as a function of t, and differentiate.
The derivative depends on the partial derivatives with respect to x and y through the following formula:
$$ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. $$
As an example, suppose $z = x^2 - xy$, and that $x = \sin t$, $y = t^2$. Then
$$\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \\
&= (2x - y)\frac{d}{dt}(\sin t) + (-x)\frac{d}{dt}(t^2) = (2\sin t - t^2)\cos t - 2t\sin t.
\end{aligned}$$
In this case we could avoid use of the chain rule, since direct substitution for x and y gives
$z = \sin^2 t - t^2\sin t$. But substitution may not always be convenient, or even available.
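The two routes to dz/dt in this example can be compared numerically. The following sketch is an illustration rather than part of the notes (the function names, the value of t and the step size are assumptions): one function assembles dz/dt from the chain rule, the other differentiates the substituted expression $z = \sin^2 t - t^2\sin t$ by a central difference.

```python
import math

def dz_dt_chain(t):
    """dz/dt from the chain rule for z = x^2 - x y, x = sin t, y = t^2."""
    x, y = math.sin(t), t**2
    dz_dx, dz_dy = 2 * x - y, -x          # partial derivatives of z
    dx_dt, dy_dt = math.cos(t), 2 * t     # ordinary derivatives of x and y
    return dz_dx * dx_dt + dz_dy * dy_dt

def z_of_t(t):
    """z after substituting x = sin t and y = t^2 directly."""
    return math.sin(t) ** 2 - t**2 * math.sin(t)

def dz_dt_numeric(t, h=1e-6):
    """Central-difference derivative of the substituted expression."""
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)

t0 = 0.8  # an arbitrary sample value of t
print(dz_dt_chain(t0), dz_dt_numeric(t0))
```

Agreement at a handful of sample values is not a proof, but it is a quick way to catch a sign or factor error in a hand calculation.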
Exercise 4.12. Suppose $z = x^2y + y^2$, where $x = \cos t$ and $y = 1/t$. Find $\dfrac{dz}{dt}$.
Solution.
Exercise 4.13. The pressure, volume and temperature of an ideal gas are related by the
equation PV = 8.31T. Find the rate at which the pressure is changing when the temperature
is 300 K and increasing at a rate of 0.1 K s$^{-1}$, and the volume is 100 l and increasing at a rate
of 0.2 l s$^{-1}$.
Solution.
Example 4.14. The volume of a right circular cylinder of base radius r and height h is
$V = \pi r^2h$. If the radius is decreasing at a rate of 3 cm s$^{-1}$ while the height is increasing at a
rate of 2 cm s$^{-1}$, what is the rate of change of V when r = 40 cm and h = 110 cm?
Solution. Since $V = \pi r^2h$, $\dfrac{\partial V}{\partial r} = 2\pi rh$ and $\dfrac{\partial V}{\partial h} = \pi r^2$. So thinking of V as a function of t
we have, by the chain rule,
$$ \frac{dV}{dt} = \frac{\partial V}{\partial r}\frac{dr}{dt} + \frac{\partial V}{\partial h}\frac{dh}{dt} = 2\pi rh\frac{dr}{dt} + \pi r^2\frac{dh}{dt}. $$
For the problem we have r = 40, h = 110, $\dfrac{dr}{dt} = -3$ and $\dfrac{dh}{dt} = 2$, and so
$$ \frac{dV}{dt} = 2\pi \times 40 \times 110 \times (-3) + \pi \times 40^2 \times 2 = -23200\pi \ \text{cm}^3\,\text{s}^{-1}. $$
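Example 4.15. The voltage V in an electrical circuit is slowly decreasing as the battery
wears out. The resistance is slowly increasing as the resistor heats up. Use Ohm’s Law
V = IR to find how the current I is changing at the moment when R = 400 Ω, I = 0.08 A,
$\dfrac{dV}{dt} = -0.01$ V s$^{-1}$ and $\dfrac{dR}{dt} = 0.03$ Ω s$^{-1}$.
Solution. From V = IR we get $I = \dfrac{V}{R}$, and so the chain rule gives
$$\begin{aligned}
\frac{dI}{dt} &= \frac{\partial I}{\partial V}\frac{dV}{dt} + \frac{\partial I}{\partial R}\frac{dR}{dt} \\
&= \frac{1}{R}\frac{dV}{dt} - \frac{V}{R^2}\frac{dR}{dt} = \frac{1}{R}\frac{dV}{dt} - \frac{I}{R}\frac{dR}{dt} \\
&= \frac{1}{400} \times (-0.01) - \frac{0.08}{400} \times 0.03 \\
&= \frac{1}{400}(-0.01 - 0.08 \times 0.03) = -0.000031 \ \text{A s}^{-1}.
\end{aligned}$$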
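Now suppose that z = f(x, y), a function of the two variables x and y, and that each of
these in turn depends on two variables s and t. Then, viewing z as a function of s and t, we
have the two partial derivatives
$$ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \qquad \text{and} \qquad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}. $$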
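As an example, suppose that $z = xy - y^2$, and that $x = e^{s+t}$ and $y = \dfrac{s}{t}$. Then
$$\begin{aligned}
\frac{\partial z}{\partial s} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = y\frac{\partial}{\partial s}(e^{s+t}) + (x - 2y)\frac{\partial}{\partial s}\left(\frac{s}{t}\right) \\
&= \frac{s}{t} \times e^{s+t} + \left(e^{s+t} - \frac{2s}{t}\right) \times \frac{1}{t} = \frac{1}{t^2}\big[(s + 1)te^{s+t} - 2s\big].
\end{aligned}$$
Similarly,
$$ \frac{\partial z}{\partial t} = \frac{s}{t^3}\big[t(t - 1)e^{s+t} + 2s\big]. $$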
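Both formulas in this example can be checked against difference quotients of the composite function. This is a minimal sketch under stated assumptions: the helper `numeric`, the sample point (s, t) = (0.5, 2) and the step size are illustrative choices, not part of the notes.

```python
import math

def z_of(s, t):
    """z = x y - y^2 with x = e^(s+t) and y = s / t substituted in."""
    x, y = math.exp(s + t), s / t
    return x * y - y**2

def dz_ds_formula(s, t):
    return ((s + 1) * t * math.exp(s + t) - 2 * s) / t**2

def dz_dt_formula(s, t):
    return s * (t * (t - 1) * math.exp(s + t) + 2 * s) / t**3

def numeric(g, s, t, wrt, h=1e-6):
    """Central-difference partial derivative of g with respect to 's' or 't'."""
    if wrt == "s":
        return (g(s + h, t) - g(s - h, t)) / (2 * h)
    return (g(s, t + h) - g(s, t - h)) / (2 * h)

s0, t0 = 0.5, 2.0  # an arbitrary sample point with t nonzero
print(dz_ds_formula(s0, t0), numeric(z_of, s0, t0, "s"))
print(dz_dt_formula(s0, t0), numeric(z_of, s0, t0, "t"))
```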
Exercise 4.16. If $z = e^x\sin y$ where $x = st^2$ and $y = s^2t$, find $\dfrac{\partial z}{\partial s}$.
Solution.
Exercise 4.17 (A04 8(c)). Suppose that z = f(u, v) and that the variables u and v depend
on x and y through $u = x^2y + y^2$ and $v = e^x\cos(\pi y)$. If $\dfrac{\partial z}{\partial u} = -4$ and $\dfrac{\partial z}{\partial v} = 3$ at the point
(u, v) = (4, 1), find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ at the corresponding point (x, y) = (0, 2).
Solution.
Exercise 4.18. If $g(s, t) = f(s^2 - t^2, t^2 - s^2)$, and if f has partial derivatives with respect
to both variables, show that g satisfies the equation
$$ t\frac{\partial g}{\partial s} + s\frac{\partial g}{\partial t} = 0. $$
Solution.
In general, if z is a function depending on the m variables $x_1, x_2, \ldots, x_m$, and each of
these is defined in terms of the n variables $y_1, y_2, \ldots, y_n$, then we can think of z as a
function of the $y_j$, and take n different partial derivatives, which are given by
$$ \frac{\partial z}{\partial y_j} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial y_j} + \frac{\partial z}{\partial x_2}\frac{\partial x_2}{\partial y_j} + \cdots + \frac{\partial z}{\partial x_m}\frac{\partial x_m}{\partial y_j}. $$
The previous two formulae given are special cases of this general version of the chain rule.
4.6 Directional derivatives; the gradient operator
Given a function z = f(x, y), we have defined the two partial derivatives $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ by
making small changes in one variable while holding the other fixed. This amounts to moving
a small distance in the (x, y)-plane parallel to one or other of the coordinate axes. This can
be generalised by moving in any direction in the (x, y)-plane as specified by a unit vector
$\mathbf{c} = (c, d)$ (so $c^2 + d^2 = 1$). The directional derivative of z = f(x, y) at the point $\mathbf{r} = (a, b)$
in the direction $\mathbf{c}$ is
$$ D_{\mathbf{c}}f(\mathbf{r}) = \lim_{h \to 0} \frac{f(a + hc, b + hd) - f(a, b)}{h}. $$
Particular examples are given by taking $\mathbf{c} = \mathbf{i} = (1, 0)$ or $\mathbf{c} = \mathbf{j} = (0, 1)$, the unit vectors in
the coordinate directions, since from these we just recover the partial derivatives:
$$ D_{\mathbf{i}}f(\mathbf{r}) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} = f_x(\mathbf{r}), \qquad D_{\mathbf{j}}f(\mathbf{r}) = f_y(\mathbf{r}). $$
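More generally, having fixed a unit vector $\mathbf{c}$ and a point $\mathbf{r}$, we are taking the one-variable
function $g(h) := f(\mathbf{r} + h\mathbf{c}) = f(a + hc, b + hd)$, differentiating it, and evaluating at h = 0.
Using the chain rule we get
$$\begin{aligned}
\frac{d}{dh}g(h) &= f_x(a + hc, b + hd)\frac{d}{dh}(a + hc) + f_y(a + hc, b + hd)\frac{d}{dh}(b + hd) \\
&= c\,f_x(\mathbf{r} + h\mathbf{c}) + d\,f_y(\mathbf{r} + h\mathbf{c})
\end{aligned}$$
for all $h \in \mathbb{R}$, and so setting h = 0 we get
$$ D_{\mathbf{c}}f(\mathbf{r}) = c\,f_x(\mathbf{r}) + d\,f_y(\mathbf{r}) = (c, d) \cdot \big(f_x(\mathbf{r}), f_y(\mathbf{r})\big). $$
That is, $D_{\mathbf{c}}f(\mathbf{r})$ can be calculated by finding the vector of partial derivatives, evaluating this
at the point $\mathbf{r}$, and taking the dot product of the result with the direction vector $\mathbf{c}$. The
vector $\big(f_x(\mathbf{r}), f_y(\mathbf{r})\big)$ is known as the gradient of f, and is denoted $\nabla f(\mathbf{r})$, so that
$$ D_{\mathbf{c}}f(\mathbf{r}) = \mathbf{c} \cdot \nabla f(\mathbf{r}). $$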
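For example, if we take $f(x, y) = 4xy - x^3y^2$, $\mathbf{c} = \left(\tfrac{3}{5}, \tfrac{4}{5}\right)$ and $\mathbf{r} = (2, -3)$ then
$$ f_x(x, y) = 4y - 3x^2y^2 \;\Rightarrow\; f_x(\mathbf{r}) = -120, \qquad f_y(x, y) = 4x - 2x^3y \;\Rightarrow\; f_y(\mathbf{r}) = 56, $$
and so
$$ D_{\mathbf{c}}f(\mathbf{r}) = \left(\frac{3}{5}, \frac{4}{5}\right) \cdot (-120, 56) = \frac{-360 + 224}{5} = -\frac{136}{5}. $$
Recall that for any two vectors $\mathbf{a}$ and $\mathbf{b}$, $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta$, where θ is the angle between
these two vectors. Applying this to our formula for directional derivatives, and noting that
$|\mathbf{c}| = 1$, we get
$$ D_{\mathbf{c}}f(\mathbf{r}) = |\nabla f(\mathbf{r})||\mathbf{c}|\cos\theta = |\nabla f(\mathbf{r})|\cos\theta. $$
But $-1 \le \cos\theta \le 1$, so the directional derivative is maximised if we take θ = 0, when
cos θ = 1. That is, if we take $\mathbf{c}$ in the same direction as $\nabla f(\mathbf{r})$. Alternatively, if we take $\mathbf{c}$ in
the opposite direction, so that θ = π, and hence cos θ = −1, then $D_{\mathbf{c}}f(\mathbf{r}) = -|\nabla f(\mathbf{r})|$.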
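The formula $D_{\mathbf{c}}f(\mathbf{r}) = \mathbf{c} \cdot \nabla f(\mathbf{r})$ translates directly into a few lines of code. The sketch below is illustrative only: `directional_derivative` is a hypothetical helper that normalises the direction it is given and takes the dot product with the gradient. It is applied to the worked example $f(x, y) = 4xy - x^3y^2$ at (2, −3), and the result is compared with a difference quotient along the direction, as in the limit definition.

```python
import math

def f(x, y):
    return 4 * x * y - x**3 * y**2

def grad_f(x, y):
    """Gradient of f, from the partial derivatives computed by hand."""
    return (4 * y - 3 * x**2 * y**2, 4 * x - 2 * x**3 * y)

def directional_derivative(grad, point, direction):
    """Dot product of the gradient at point with the unit vector along direction."""
    norm = math.hypot(*direction)
    unit = tuple(component / norm for component in direction)
    g = grad(*point)
    return sum(u * gi for u, gi in zip(unit, g))

point = (2.0, -3.0)
d = directional_derivative(grad_f, point, (3, 4))  # (3, 4) normalises to (3/5, 4/5)

# Cross-check against a symmetric difference quotient along the same direction.
h = 1e-6
quotient = (f(2 + 0.6 * h, -3 + 0.8 * h) - f(2 - 0.6 * h, -3 - 0.8 * h)) / (2 * h)

# Moving along the gradient itself gives the largest value, |grad f|.
steepest = directional_derivative(grad_f, point, grad_f(*point))
print(d, quotient, steepest)
```

Normalising inside the helper means a direction can be supplied as any nonzero vector, which avoids the common slip of forgetting to divide by its length.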
Exercise 4.19. Find the directional derivative of $f(x, y) = x^3 - 3xy + 4y^2$ in the direction
given by the unit vector $\mathbf{c}$ at an angle π/6 to the x-axis. What is $D_{\mathbf{c}}f(1, 2)$?
Solution.
Exercise 4.20. If $f(x, y) = xe^y$, find the rate of change at the point P with position vector
(2, 0) in the direction from P to the point Q with position vector $\left(\tfrac{1}{2}, 2\right)$. In which direction
does f have the maximum rate of change? What is this maximum rate of change?
Solution.
Example 4.21 (S05 8(b)). Find the directional derivative of the function
$$ f(x, y) = 5xy^2 - 4x^3y $$
at the point P = (1, 2) in the direction of the vector (5, 12).
What is the maximum rate of decrease of the function at P, and in which direction does
this occur?
Solution. Since $5^2 + 12^2 = 169 = 13^2$, a unit vector in the required direction is $\mathbf{c} = \tfrac{1}{13}(5, 12)$.
Also,
$$ f(x, y) = 5xy^2 - 4x^3y \;\Rightarrow\; \nabla f = (5y^2 - 12x^2y, \; 10xy - 4x^3) $$
and so the directional derivative in the direction of $\mathbf{c}$ at P is
$$ \frac{1}{13}(5, 12) \cdot \nabla f(1, 2) = \frac{1}{13}(5, 12) \cdot (20 - 24, 20 - 4) = \frac{1}{13}(-20 + 192) = \frac{172}{13}. $$
The maximum rate of decrease occurs in the direction of $-\nabla f(1, 2) = (4, -16)$, and is
$$ |\nabla f(1, 2)| = \sqrt{(-4)^2 + 16^2} = 4\sqrt{17}. $$
4.6.1 Tangent planes revisited
Consider the equation
$$ x^2 + y^2 + z^2 = 1. \qquad (S) $$
This can be written as $|\mathbf{r}|^2 = 1$, which is equivalent to $|\mathbf{r}| = 1$, where $\mathbf{r} = (x, y, z)$ is the
position of a general point in space. Thus a point satisfies the equation (S) precisely if it is
distance 1 from the origin, that is, if it is on the sphere of radius 1 whose centre is the origin.
It is impossible to rearrange this equation to get a single function z = f(x, y), since we have
the two possibilities:
$$ z = \sqrt{1 - x^2 - y^2} \qquad \text{and} \qquad z = -\sqrt{1 - x^2 - y^2}, $$
and note that these only make sense whenever $x^2 + y^2 \le 1$, that is when the point (x, y) lies
in the disc of radius 1 centred on (0, 0) in the (x, y)-plane.
Define a function $F : \mathbb{R}^3 \to \mathbb{R}$ by $F(\mathbf{r}) = F(x, y, z) = x^2 + y^2 + z^2$. The equation of
the sphere is given by $F(\mathbf{r}) = 1$, that is, setting F equal to a constant value. Surfaces
generated this way are called the level surfaces of the function F, and provide a more general
way of describing surfaces than equations of the form z = f(x, y). Note that if we define
$F(\mathbf{r}) := z - f(x, y)$ then our first type of surface is nothing but the level surface
$F(\mathbf{r}) = 0$.
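Consider now the problem of finding the equation of a tangent plane to such a surface
$F(\mathbf{r}) = k$ at the point $\mathbf{a}$. To do this imagine a point moving around on the surface. At time t
its position vector is $\mathbf{r}(t) = \big(x(t), y(t), z(t)\big)$, and suppose that at time t = 0 it passes through
$\mathbf{a}$. That is
$$ \mathbf{r}(0) = \big(x(0), y(0), z(0)\big) = \mathbf{a} \qquad \text{and} \qquad F\big(\mathbf{r}(t)\big) = k \ \text{for all } t, $$
the second equation following since the point is constrained to lie on the surface. Since the
function $t \mapsto F\big(\mathbf{r}(t)\big)$ is constant, its derivative is 0. But we can also evaluate this using the
chain rule, which gives
$$\begin{aligned}
0 &= F_x\big(\mathbf{r}(t)\big)\frac{dx(t)}{dt} + F_y\big(\mathbf{r}(t)\big)\frac{dy(t)}{dt} + F_z\big(\mathbf{r}(t)\big)\frac{dz(t)}{dt} \\
&= \Big(F_x\big(\mathbf{r}(t)\big), F_y\big(\mathbf{r}(t)\big), F_z\big(\mathbf{r}(t)\big)\Big) \cdot \left(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}, \frac{dz(t)}{dt}\right) \\
&= \nabla F\big(\mathbf{r}(t)\big) \cdot \frac{d\mathbf{r}(t)}{dt},
\end{aligned}$$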
where $\nabla F$ is the gradient vector of F made up of its partial derivatives, which is the 3D
analogue of $\nabla f$ considered above. Putting t = 0 in this equation shows that $\nabla F(\mathbf{a})$ is
orthogonal to $\dfrac{d\mathbf{r}}{dt}(0)$, which is the tangent vector to the path taken by the point when passing
through $\mathbf{a}$ (i.e. its velocity, when thinking in terms of its motion). Since this is true for any
path passing through $\mathbf{a}$, it follows that the vector $\nabla F(\mathbf{a})$ must be orthogonal to the surface
at this point, and so provides the normal vector for the tangent plane. Hence all points $\mathbf{r}$ on
the tangent plane at $\mathbf{a}$ satisfy the equation
$$ (\mathbf{r} - \mathbf{a}) \cdot \nabla F(\mathbf{a}) = 0. $$
[Figure: a path on the surface with its tangent vector $d\mathbf{r}(t)/dt$ and the normal vector $\nabla F$.]
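For an example consider $F(x, y, z) = x^2 + y^2 + z^2$ as above. We have
$$ \nabla F = \left(\frac{\partial}{\partial x}(x^2 + y^2 + z^2), \; \frac{\partial}{\partial y}(x^2 + y^2 + z^2), \; \frac{\partial}{\partial z}(x^2 + y^2 + z^2)\right) = 2(x, y, z), $$
a vector parallel to the position vector of the point $\mathbf{r}$, showing that the normal line to the
sphere at any point can be continued back to the origin. In particular at the point (1, 0, 0)
on the surface of the sphere, the tangent plane has equation
$$ \big[(x, y, z) - (1, 0, 0)\big] \cdot (2, 0, 0) = 0 \;\Leftrightarrow\; 2(x - 1) = 0 \;\Leftrightarrow\; x = 1. $$
This equation could not have been calculated with our earlier method, since the tangent plane
at (1, 0, 0) is vertical and so cannot be written in the form z = f(a, b) + ... used there.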
Finally, if we are given a surface by means of the equation z = f (x, y) then, rewriting
this as F (x, y, z) := z − f (x, y) = 0, we see that the normal vector at the point (x, y, z) =
(a, b, f (a, b)) is (−fx (a, b), −fy (a, b), 1), as shown previously.
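The recipe "normal vector = gradient of F" is straightforward to sketch in code. The following is an illustration under stated assumptions: `grad` estimates the gradient by central differences, `tangent_plane` is a hypothetical helper returning the plane in the form n · r = d, and the unit sphere at (1, 0, 0) is used as the test case.

```python
def grad(F, p, h=1e-6):
    """Central-difference estimate of the gradient of F at the point p."""
    g = []
    for i in range(len(p)):
        up = list(p)
        down = list(p)
        up[i] += h
        down[i] -= h
        g.append((F(*up) - F(*down)) / (2 * h))
    return g

def tangent_plane(F, a):
    """Return (n, d) such that the tangent plane at a is n . r = d, with n = grad F(a)."""
    n = grad(F, a)
    d = sum(ni * ai for ni, ai in zip(n, a))
    return n, d

def sphere(x, y, z):
    return x**2 + y**2 + z**2

n, d = tangent_plane(sphere, (1.0, 0.0, 0.0))
print(n, d)  # n is close to (2, 0, 0) and d to 2, i.e. the plane 2x = 2
```

Because the plane is stored as a normal and a constant rather than solved for z, vertical tangent planes such as x = 1 cause no difficulty.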
Exercise 4.22. Find the tangent plane to the hyperboloid $x^2 - y^2 + 2z^2 = 1$ at the point
(3, 4, −2). At which points is the normal line to the surface parallel to the line through the
points (3, −1, 0) and (5, 3, 6)?
Solution.
Exercise 4.23. Show that (0, −1, 2) lies on both of the following surfaces: $x^2 + 4y + z^2 = 0$
and $x^2 + y^2 + z^2 - 6z + 7 = 0$. Show, moreover, that the surfaces are tangent to one another
at this point.
Solution.
4.7 Critical/stationary points
A function z = f(x, y) has a local maximum at the point (a, b) if $f(a, b) \ge f(x, y)$ for all
(x, y) close to (a, b). More precisely, z = f(x, y) has a local maximum at the point (a, b) if
we can find some number r > 0 such that when we evaluate f(x, y) at any point in the disc
with centre (a, b) and radius r, we have $f(a, b) \ge f(x, y)$.
Similarly, z = f(x, y) has a local minimum at the point (c, d) if $f(c, d) \le f(x, y)$ for all points
(x, y) close to (c, d) in the same sense, i.e. all points in some disc of positive radius centred
on (c, d).
[Figure: the disc of radius r centred on (a, b) in the (x, y)-plane.]
We shall make use of partial derivatives to locate candidates for such points. Note that if
(a, b) is a local maximum or a local minimum, then the tangent plane at this point should be
horizontal, which is equivalent to saying that the normal vector $\mathbf{n}$ to the surface/tangent plane
should have 0 for its x- and y-components.
Recall that $\mathbf{n} = \big(-f_x(a, b), -f_y(a, b), 1\big)$. Consequently a point (a, b) in the domain of a
function z = f(x, y) is called a critical or stationary point if
$$ f_x(a, b) = f_y(a, b) = 0. $$
This is a necessary condition for there to be a local maximum or a local minimum at (a, b),
but is not a sufficient condition.
It tallies with the fact that if there was a local maximum in the surface at that point then
x = a would be a local maximum in the curve z = g(x) = f (x, b), and y = b would be a local
maximum in the curve z = h(y) = f (a, y) — the curves obtained by intersecting the surface
with the planes y = b and x = a respectively.
4.7.1 The second derivative test
Recall that for a function y = f(x), if $f'(a) = 0$ then we can check to see whether there is a
local maximum or local minimum at x = a by calculating $f''(a)$ (providing these derivatives
exist) and seeing if the result is positive (a local minimum) or negative (a local maximum).
Given a function z = f(x, y) of two variables, its discriminant is the function
$$ D_f(a, b) = \begin{vmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{vmatrix} = f_{xx}(a, b)f_{yy}(a, b) - f_{xy}(a, b)^2, $$
where we are assuming that $f_{xy}(a, b) = f_{yx}(a, b)$.
Theorem 4.24. Suppose that z = f (x, y) has partial derivatives up to second order, and that
(a, b) is a critical point of the function.
(i) If Df (a, b) > 0 and fxx (a, b) > 0 then f has a local minimum at (a, b).
(ii) If Df (a, b) > 0 and fxx (a, b) < 0 then f has a local maximum at (a, b).
(iii) If Df (a, b) < 0 then f has a saddle point at (a, b).
(iv) If Df (a, b) = 0 then no conclusion can be drawn.
Remark. The matrix whose determinant is taken to form the discriminant is symmetric, and
so, by an earlier theorem, we know that it has two real (possibly repeated) eigenvalues. The
inequality Df (a, b) > 0 is equivalent to saying that the eigenvalues are either both positive
or both negative. The inequality Df (a, b) < 0 is equivalent to saying that the eigenvalues are
of different sign. If Df (a, b) = 0 then at least one eigenvalue is 0, which leads to the lack of
conclusion.
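The theorem follows from the multidimensional version of Taylor’s Theorem which implies
that for points (x, y) close to (a, b)
$$ f(x, y) \approx f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) + \frac{1}{2}\begin{pmatrix} x - a & y - b \end{pmatrix}\begin{pmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{pmatrix}\begin{pmatrix} x - a \\ y - b \end{pmatrix}. $$
So if there is a critical point at (a, b) then
$$ f(x, y) \approx f(a, b) + \frac{1}{2}\begin{pmatrix} x - a & y - b \end{pmatrix}\begin{pmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{pmatrix}\begin{pmatrix} x - a \\ y - b \end{pmatrix}. $$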
Local maxima and minima are relatively easy to visualise, and the final possibility
($D_f(a, b) = 0$) being inconclusive can happen for reasons similar to the one variable case (e.g.
there is a point of inflection when looking at the curve passing through (a, b, f(a, b)) in some
direction). Case (iii) is a phenomenon that occurs for surfaces but not for curves. In one
direction the curve obtained by intersecting the surface with a plane has a local minimum, in
the orthogonal direction the corresponding curve has a local maximum.
[Figure 1: $z = x^2 + y^2$. Figure 2: $z = -x^2 - y^2$. Figure 3: $z = x^2 - y^2$.]
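For each of the above examples
$$ \frac{\partial z}{\partial x} = 2x \ \text{or} \ \frac{\partial z}{\partial x} = -2x, \qquad \text{and} \qquad \frac{\partial z}{\partial y} = 2y \ \text{or} \ \frac{\partial z}{\partial y} = -2y. $$
Consequently in all cases the only critical point is the origin (x, y) = (0, 0).
When $z = x^2 + y^2$,
$$ \frac{\partial^2 z}{\partial x^2} = 2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = 2 \;\Rightarrow\; D_f(0, 0) = 4. $$
Since $f_{xx}(0, 0) = 2 > 0$, the point (0, 0) is a local minimum for this function.
When $z = -x^2 - y^2$,
$$ \frac{\partial^2 z}{\partial x^2} = -2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = -2 \;\Rightarrow\; D_f(0, 0) = 4. $$
Since $f_{xx}(0, 0) = -2 < 0$, the point (0, 0) is a local maximum for this function.
When $z = x^2 - y^2$,
$$ \frac{\partial^2 z}{\partial x^2} = 2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = -2 \;\Rightarrow\; D_f(0, 0) = -4, $$
so the point (0, 0) is a saddle point for this function.
Perhaps more instructively one should consider the intersections of each surface with the
planes y = 0 and x = 0, which produce the curves $z = \pm x^2$ and $z = \pm y^2$.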
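The case analysis of Theorem 4.24 is mechanical enough to write as a small function. The sketch below is illustrative (the name `classify` and the returned strings are assumptions): it takes the three second-order partial derivatives at a critical point, assumes $f_{xy} = f_{yx}$ as in the definition of the discriminant, and applies it to the three model surfaces above.

```python
def classify(fxx, fyy, fxy):
    """Apply the second derivative test at a critical point.

    The arguments are the values of f_xx, f_yy and f_xy at that point,
    with f_xy = f_yx assumed.
    """
    D = fxx * fyy - fxy**2  # the discriminant D_f
    if D > 0:
        # D > 0 forces fxx to be nonzero, so exactly one branch applies.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

results = {
    "z = x^2 + y^2": classify(2, 2, 0),
    "z = -x^2 - y^2": classify(-2, -2, 0),
    "z = x^2 - y^2": classify(2, -2, 0),
}
for surface, kind in results.items():
    print(surface, "->", kind)
```

The function only classifies a point that is already known to be critical; locating the critical points by solving $f_x = f_y = 0$ remains a separate step.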
Exercise 4.25. Locate and classify the stationary points of $f(x, y) = x^3 - 2y^2 - 2y^4 + 3x^2y$.
Solution.
Exercise 4.26 (S04 8(c)). Locate and classify the stationary points of the function
$z = 2x^2 + y^3 - x^2y - 3y$.
Solution.
Example 4.27. Locate and classify all the critical points of the function z = x sin y.
Solution. z = x sin y, so $\dfrac{\partial z}{\partial x} = \sin y$ and $\dfrac{\partial z}{\partial y} = x\cos y$. So to have $\dfrac{\partial z}{\partial x} = 0$ we need sin y = 0,
that is y = nπ for n = 0, ±1, ±2, . . .
But note: $\cos n\pi = (-1)^n \neq 0$ for all n, and we also need $\dfrac{\partial z}{\partial y} = x\cos y = 0$, where
$\cos y \neq 0$. So we must have x = 0. Thus the critical points of z are (0, nπ) for n = 0, ±1, ±2, . . .
But now $\dfrac{\partial^2 z}{\partial x^2} = 0$ and $\dfrac{\partial^2 z}{\partial x\,\partial y} = \cos y$, so that
$$ \frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \left(\frac{\partial^2 z}{\partial x\,\partial y}\right)^2 = -\cos^2 y = -1 $$
for x = 0 and y = nπ. Thus each of the critical points is a saddle point.
4.8 Exercises
1. Show that for any k ≥ 0 the function z = sin kx cos kct satisfies the wave equation
$c^2\dfrac{\partial^2 z}{\partial x^2} = \dfrac{\partial^2 z}{\partial t^2}$. Show that if f(u) is any function of one variable that is twice differentiable
(i.e. $f'(u)$ and $f''(u)$ exist) then z = f(x − ct) is also a solution of this equation.
2. Use the chain rule to calculate the indicated derivatives:
(i) $g'(t)$ where $g(t) = f\big(x(t), y(t)\big)$, $f(x, y) = x^2y - \sin y$, $x = \sqrt{t^2 + 1}$, $y = e^t$.
(ii) $g'(t)$ where $g(t) = f\big(x(t), y(t)\big)$, $f(x, y) = \sqrt{x^2 + y^2}$, $x = \sin t$, $y = t^2 + 2$.
(iii) $\dfrac{\partial z}{\partial u}$ and $\dfrac{\partial z}{\partial v}$, where $z = \sin(xy)$, $x = u^2v$, $y = ve^u$.
(iv) $\dfrac{\partial z}{\partial u}$ and $\dfrac{\partial z}{\partial v}$, where $z = xy^3$, $x = e^{u^2}$, $y = \sqrt{v^2 + 1}\,\sin u$.
3. Find the directional derivatives of the given functions at the given point and in the given
direction:
(i) $f(x, y) = x\sin(xy)$ at $(1, \tfrac{\pi}{2})$ in the direction of $\mathbf{c} = \tfrac{1}{\sqrt{2}}(1, -1)$.
(ii) $f(x, y) = x^2 + \ln(x - y)$ at (2e, e) in the direction of the vector $\mathbf{a} = (4, -7)$.
(iii) $g(x, y) = y^2 - x^2 - x$ at (4, 5) in the direction of the vector from P = (3, 7) to
Q = (−1, 9).
(iv) $f(x, y, z) = x + y^2z - xz^3$ at (2, 0, 2) in the direction of the normal vector to the plane
x − 3y + 4z = 5.
4. Find the tangent plane to the following surfaces at the given point:
(i) $x^2 - y^2 + z^2 = 13$ at the point (4, 2, 1).
(ii) $x\sin(yz) = 2$ at the point $(2, 2, \tfrac{\pi}{4})$.
(iii) $xy - e^{x - yz} = 31$ at the point (16, 2, 8).
5. Locate and classify all of the critical points of the following functions:
(i) $z = e^{-x^2}(y^2 + 1)$
(ii) $z = 4xy - x^4 - y^4 + 4$
(iii) $z = y^2 + x^2y + x^2 - 2y$
(iv) $z = e^{-x^2 - y^2}$
(v) $z = x^2 - \dfrac{4xy}{y^2 + 1}$
(vi) $z = xye^{-x^2 - y^2}$