4.1 Partial Derivatives

4.1.1 Functions of several variables
Up until now, we have only met functions of a single variable. From now on
we will meet functions such as $z = f(x, y)$ and $w = f(x, y, z)$, which are
functions of two and three variables respectively. The domain of $z = f(x, y)$
is the set of all points $(x, y)$ at which $f$ is defined, and similarly the
domain of $w = f(x, y, z)$ is the set of all points $(x, y, z)$ at which $f$ is
defined. For example, let
$$ f(x, y, z) = x^2 + xy + y^2 - \sqrt{z} . \tag{1} $$
We will find its value at the point $(1, 3, 4)$. We get
$$ f(1, 3, 4) = 1^2 + (1)(3) + 3^2 - \sqrt{4} = 11 . \tag{2} $$
To find its domain, we notice that the first three terms are defined for all
real numbers; however, the last term $\sqrt{z}$ is only defined if $z \ge 0$.
Hence the domain is everything on or above the $xy$-plane. An easier function
to understand is
$$ f(x, y) = \sqrt{x^2 + y^2 - 4} . \tag{3} $$
The argument of the square root must be non-negative, so the condition is
$0 \le x^2 + y^2 - 4$. This means that we must have
$$ x^2 + y^2 \ge 4 . \tag{4} $$
Therefore, the domain of $f$ is all points that lie on or outside the circle of
radius 2 centred on the origin.
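The evaluation and the two domain conditions above can be checked with a minimal Python sketch. The names `f` and `in_domain_of_sqrt_example` are illustrative choices, not notation from these notes.

```python
import math

def f(x, y, z):
    """f(x, y, z) = x^2 + x*y + y^2 - sqrt(z), defined only for z >= 0."""
    if z < 0:
        raise ValueError("outside the domain: z must be non-negative")
    return x**2 + x * y + y**2 - math.sqrt(z)

def in_domain_of_sqrt_example(x, y):
    """Domain test for sqrt(x^2 + y^2 - 4): on or outside the circle of radius 2."""
    return x**2 + y**2 >= 4

value = f(1, 3, 4)
print(value)
print(in_domain_of_sqrt_example(2, 0), in_domain_of_sqrt_example(1, 1))
```

Raising an error for $z < 0$ is one possible design choice; returning a sentinel value would work equally well for this illustration.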
4.1.2 Graphs of functions of two variables
The graph of a function of two variables is the surface $z = f(x, y)$.
For example, let's sketch the graph of
$$ z = \sqrt{1 - x^2 - y^2} . \tag{5} $$
What is this shape? Squaring both sides, we can rewrite this as
$$ x^2 + y^2 + z^2 = 1 . \tag{6} $$
Written this way, it is clearly a hemisphere of radius 1, and is sketched
in Figure 1. It is only a hemisphere because the square root imposes the
restriction $z \ge 0$, since a square root is never negative.
Figure 1: $z = \sqrt{1 - x^2 - y^2}$
4.1.2.1 Level curves
If there is a surface $z = f(x, y)$, we define a level curve of height $k$ to
be the curve in which the surface intersects the plane $z = k$. If we project a
series of such curves onto the $xy$-plane we get a contour plot. This is
demonstrated in Figure 2. Here, the contour plot of $z = x^2 + 2y^2$ is shown
for values $z = 0, 1, 2, 3, 4$. This is equivalent to plotting the curves
$$ x^2 + 2y^2 = 0, \quad x^2 + 2y^2 = 1, \quad x^2 + 2y^2 = 2, \quad x^2 + 2y^2 = 3, \quad x^2 + 2y^2 = 4. \tag{7} $$
The first of these is a point, and the rest are ellipses, which we can represent
in the form $x^2 + 2y^2 = k$, which is equivalent to
$$ \frac{x^2}{k} + \frac{2y^2}{k} = 1 . \tag{8} $$
To draw the contour plot we simply draw the point and then the four ellipses
with k = 1, 2, 3, 4.
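As a sketch of how these level curves could be generated numerically, the ellipse $x^2 + 2y^2 = k$ has semi-axes $\sqrt{k}$ and $\sqrt{k/2}$, so it can be traced with an angle parameter. The function names and the number of sample points below are illustrative assumptions.

```python
import math

def surface(x, y):
    """The surface z = x^2 + 2*y^2 used in the contour plot."""
    return x**2 + 2 * y**2

def level_curve_points(k, n=8):
    """Sample n points on the level curve x^2 + 2*y^2 = k (for k >= 0)."""
    a = math.sqrt(k)        # semi-axis along x
    b = math.sqrt(k / 2)    # semi-axis along y
    return [(a * math.cos(2 * math.pi * i / n),
             b * math.sin(2 * math.pi * i / n)) for i in range(n)]

for k in range(5):
    heights = [surface(x, y) for x, y in level_curve_points(k)]
    print(k, max(abs(h - k) for h in heights))  # deviation from height k
```

For $k = 0$ both semi-axes vanish and every sample collapses to the single point $(0, 0)$.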
Figure 2: A three dimensional surface and its level curves (k = 0, 1, 2, 3, 4).
4.1.3 Limits and continuity in functions of several variables
With functions of a single variable, we can take the limit from either above
or below. These are denoted
$$ \lim_{x \to x_0^-} f(x) , \qquad \lim_{x \to x_0^+} f(x) . \tag{9} $$
When we have two or three variables, there are infinitely many ways to
approach a point (think of a point on the plane) and as a result we must
take a limit along a curve. To do this, we take a curve C such that the point
(x0 , y0 ) is on it in two dimensions and (x0 , y0 , z0 ) is on it in three dimensions.
If this curve is parameterised by t, then we have
$$ x = x(t) , \quad y = y(t) , \qquad \text{or} \qquad x = x(t) , \quad y = y(t) , \quad z = z(t) , \tag{10} $$
with $x_0 = x(t_0)$, $y_0 = y(t_0)$, $z_0 = z(t_0)$ and the limits along the curve are
defined as
$$ \lim_{(x,y) \to (x_0,y_0)} f(x, y) = \lim_{t \to t_0} f(x(t), y(t)) , \tag{11} $$
or
$$ \lim_{(x,y,z) \to (x_0,y_0,z_0)} f(x, y, z) = \lim_{t \to t_0} f(x(t), y(t), z(t)) . \tag{12} $$
In essence, what we have done is used the curve to define a direction along
which we take the limit. We then can use t to take the limit in analogy to
the case of a function of a single variable. Bear in mind that limits along
different curves do not have to match.
Example: Find the limit of the function
$$ f(x, y) = -\frac{xy}{x^2 + y^2} , $$
at $(0, 0)$ along (a) the $x$-axis (b) the line $y = x$ (c) the parabola $y = x^2$.
Solution:
(a) The $x$-axis can be parameterised by $x = t$, $y = 0$ and therefore, if we
substitute these values into $f(x, y)$ and take the limit $t \to 0$, we get
$$ \lim_{t \to 0} f(t, 0) = \lim_{t \to 0} -\frac{(t)(0)}{t^2 + 0^2} = \lim_{t \to 0} 0 = 0 . $$
(b) The line $y = x$ can be parameterised by $x = t$, $y = t$ and therefore, if
we substitute these values into $f(x, y)$ and take the limit $t \to 0$, we get
$$ \lim_{t \to 0} f(t, t) = \lim_{t \to 0} -\frac{(t)(t)}{t^2 + t^2} = \lim_{t \to 0} -\frac{1}{2} = -\frac{1}{2} . $$
(c) The parabola $y = x^2$ can be parameterised by $x = t$, $y = t^2$ and therefore, if
we substitute these values into $f(x, y)$ and take the limit $t \to 0$, we get
$$ \lim_{t \to 0} f(t, t^2) = \lim_{t \to 0} -\frac{(t)(t^2)}{t^2 + (t^2)^2} = \lim_{t \to 0} -\frac{t^3}{t^2 + t^4} = \lim_{t \to 0} -\frac{t}{1 + t^2} = 0 . $$
Note that the limit along the line $y = x$ differs from the limits along the other two curves.
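A numerical sketch can make the path dependence visible: evaluate $f$ at points that approach $(0, 0)$ along each curve. The dictionary of paths and the sample values of $t$ are illustrative assumptions, and a few sample values only suggest a limit rather than prove it.

```python
def f(x, y):
    """f(x, y) = -x*y / (x^2 + y^2), undefined at the origin."""
    return -x * y / (x**2 + y**2)

# Each path maps the parameter t to a point (x(t), y(t)) with (0, 0) at t = 0.
paths = {
    "x-axis": lambda t: (t, 0.0),
    "y = x": lambda t: (t, t),
    "y = x^2": lambda t: (t, t * t),
}

def approach(path, ts=(1e-1, 1e-3, 1e-6)):
    """Values of f along the path for shrinking t."""
    return [f(*path(t)) for t in ts]

for name, path in paths.items():
    print(name, approach(path))
```

Along $y = x$ the values stay at $-1/2$ for every $t$, while along the other two paths they shrink towards zero.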
In order to understand the general definition of limits of functions of several
variables, we first need to understand the concepts of open and closed sets.
An open set is a set of points that does not include any of its boundary.
As an example, an open disk of radius $a$ centred on
$(0, 0)$ is a disk bounded by the circle $x^2 + y^2 = a^2$ such that points on the
boundary are not included in the set. Similarly, an open ball of radius $a$
centred on $(0, 0, 0)$ is a ball bounded by the sphere $x^2 + y^2 + z^2 = a^2$ such that
points on the boundary are not included in the set. By contrast, a closed
set is a set of points that includes all of its boundary.
A closed disk of radius $a$ centred on $(0, 0)$ is a disk bounded by the
circle $x^2 + y^2 = a^2$ such that points on the boundary are included in the set.
Similarly, a closed ball of radius $a$ centred on $(0, 0, 0)$ is a ball bounded by
the sphere $x^2 + y^2 + z^2 = a^2$ such that points on the boundary are included
in the set. An open and a closed disk are shown in Figure 3.

Figure 3: An open disk (no boundary indicated) and a closed disk (boundary
indicated).

Using this knowledge, the following is a general definition of a limit of a function of two
variables (with a natural extension to three):
Let $f$ be a function of two variables defined on an open disk centred on
$(x_0, y_0)$. It is not necessary that $f(x_0, y_0)$ is well-defined. We will write the
limit
$$ \lim_{(x,y) \to (x_0,y_0)} f(x, y) = L , \tag{13} $$
if given any number $\epsilon > 0$, there exists a number $\delta > 0$ such that
$$ |f(x, y) - L| < \epsilon , \tag{14} $$
for all points in an open disk of radius $\delta$ centred on $(x_0, y_0)$, excluding the centre, i.e.
$$ 0 < \sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta . \tag{15} $$
This gives rise to the following theorem:
Theorem:
(a) If f (x, y) → L as (x, y) → (x0 , y0 ), then f (x, y) → L as (x, y) → (x0 , y0 )
along any smooth curve.
(b) If the limit of f (x, y) does not exist along a smooth curve as (x, y) →
(x0 , y0 ), or if different curves have different limits as (x, y) → (x0 , y0 ),
then the limit of $f(x, y)$ as $(x, y) \to (x_0, y_0)$ does not exist.
In short, this means that limits along a curve may exist, but the general limit
at points on the curve may not. In our example for $f(x, y) = -\frac{xy}{x^2 + y^2}$ the
limits differed on different curves and therefore the limit of the function does
not exist at $(0, 0)$.
4.1.3.1 Continuity
A function $f(x, y, z)$ is continuous at $(x_0, y_0, z_0)$ if $f(x_0, y_0, z_0)$ is defined and
$$ \lim_{(x,y,z) \to (x_0,y_0,z_0)} f(x, y, z) = f(x_0, y_0, z_0) . \tag{16} $$
To simplify the terminology, let’s go to two variables now. If f (x, y) is continuous at every point in a region D, then it is continuous on D, and if
it is continuous on the entire xy-plane, we say that f (x, y) is continuous
everywhere. Similarly for three variables, but it’s a bit harder to describe
accurately. The properties of continuity are
1. If g(x) is continuous at x0 and h(y) is continuous at y0 , then f (x, y) =
g(x)h(y) is continuous at (x0 , y0 ).
2. If $h(x, y)$ is continuous at $(x_0, y_0)$ and $g(u)$ is continuous at $u_0 = h(x_0, y_0)$, then $f(x, y) = g(h(x, y))$ is continuous at $(x_0, y_0)$, i.e. a composition of continuous functions is continuous.
3. Sums, differences and products of continuous functions are continuous.
4. Quotients of two continuous functions are continuous unless the denominator is zero.
What can we say about discontinuities? Well, obviously if the limit does not
exist at a point, then the function is discontinuous at that point. Sometimes
the lack of a limit is obvious. Clearly,
$$ \lim_{(x,y) \to (0,0)} \frac{1}{x^2 + y^2} = +\infty , \tag{17} $$
and therefore the function must be discontinuous there. But for the limit
$$ \lim_{(x,y) \to (0,0)} (x^2 + y^2) \ln(x^2 + y^2) , \tag{18} $$
it is not immediately clear since this is of the form $0 \cdot \infty$. Unfortunately, because there
are two variables l’Hôpital’s rule cannot be used in this form. However, a
change of variables can sometimes fix this. If we use polar coordinates in this
case, we have $x^2 + y^2 = r^2$ and hence
$$ \begin{aligned}
\lim_{(x,y) \to (0,0)} (x^2 + y^2) \ln(x^2 + y^2) &= \lim_{r \to 0^+} r^2 \ln r^2 \\
&= \lim_{r \to 0^+} \frac{2 \ln r}{1/r^2} \\
&= \lim_{r \to 0^+} \frac{2/r}{-2/r^3} \qquad \text{(l’Hôpital’s rule)} \\
&= \lim_{r \to 0^+} (-r^2) = 0 .
\end{aligned} \tag{19} $$
Therefore, the limit exists, and the function is continuous at $(0, 0)$ provided
we define its value there to be 0. Note we can apply l’Hôpital’s rule since
$\frac{2 \ln r}{1/r^2}$ has a limit of the type $\infty/\infty$.
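The polar-coordinate result can be sanity-checked numerically by evaluating $(x^2 + y^2)\ln(x^2 + y^2)$ at points that close in on the origin. The fixed approach angle of 1 radian and the list of radii are arbitrary illustrative choices.

```python
import math

def g(x, y):
    """(x^2 + y^2) * ln(x^2 + y^2), undefined at the origin itself."""
    s = x * x + y * y
    return s * math.log(s)

ANGLE = 1.0  # assumed direction of approach, in radians
radii = (1e-1, 1e-2, 1e-4, 1e-8)
samples = [g(r * math.cos(ANGLE), r * math.sin(ANGLE)) for r in radii]

for r, value in zip(radii, samples):
    print(r, value)  # magnitudes shrink as r decreases
```

Because the expression depends only on $r$, changing `ANGLE` should not alter the values beyond rounding.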
4.1.4 Partial derivatives
If we have a function that depends on two or more variables, how do we
treat derivatives? We might be interested in how the function changes with
respect to only one of these variables. For example, we might be interested
in how people’s blood pressure depends on age and on their career. If we
just take a large sample of random people, it would be hard to see a pattern.
But if we took the results of all accountants, we would get an idea how blood
pressure varies with age; likewise if we took the results of all people aged
forty, we would get an idea about how different careers affect blood pressure.
The key thing here is that we had to fix one of the variables to see how the
other changes. This is the idea behind partial derivatives.
Take, for instance, $z = f(x, y)$. Let us imagine that we can fix $y$ at some
value, say $y = y_0$. Then, the derivative of $f(x, y_0)$ with respect to $x$ is
$$ \frac{d}{dx} f(x, y_0) . \tag{20} $$
In other words, treat y as a constant. Similarly, we could fix x and take a
derivative in y. We define the partial derivatives as follows
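$$ \begin{aligned}
f_x(x, y) &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} , \\
f_y(x, y) &= \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y} .
\end{aligned} \tag{21} $$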
The notation fx (x, y) means the partial derivative of z = f (x, y) with respect
to x. Other notations are
$$ \frac{\partial z}{\partial x} , \qquad \frac{\partial f}{\partial x} . \tag{22} $$
Often, we will want to find the partial derivative at a given point, say (x0 , y0 ).
To do this, find the partial derivative and then substitute the values of the
point (x0 , y0 ). This will be denoted
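$$ \left. \frac{\partial f}{\partial y} \right|_{x = x_0, y = y_0} , \qquad \left. \frac{\partial f}{\partial y} \right|_{(x_0, y_0)} , \qquad \frac{\partial f}{\partial y}(x_0, y_0) . \tag{23} $$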
Let us look at an example.
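Example: Let $z = x^2 \sin y$, and find $\left. \frac{\partial z}{\partial x} \right|_{(\pi,\pi)}$ and $\left. \frac{\partial z}{\partial y} \right|_{(\pi,\pi)}$.
Solution:
$$ \frac{\partial z}{\partial x} = 2x \sin y \quad \Rightarrow \quad \left. \frac{\partial z}{\partial x} \right|_{(\pi,\pi)} = 2\pi \sin \pi = 0 . \tag{24} $$
Similarly,
$$ \frac{\partial z}{\partial y} = x^2 \cos y \quad \Rightarrow \quad \left. \frac{\partial z}{\partial y} \right|_{(\pi,\pi)} = \pi^2 \cos \pi = -\pi^2 . \tag{25} $$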
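The limit definition of a partial derivative suggests a simple numerical check: hold every variable fixed except one and take a small symmetric difference. The helper `partial` and its step size `h` below are illustrative assumptions (a central difference rather than the one-sided quotient in the definition).

```python
import math

def partial(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to the variable at position `index`, at `point`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def z(x, y):
    return x**2 * math.sin(y)

p = (math.pi, math.pi)
z_x = partial(z, p, 0)   # compare with 2*pi*sin(pi) = 0
z_y = partial(z, p, 1)   # compare with pi^2*cos(pi) = -pi^2
print(z_x, z_y, -math.pi**2)
```

The estimates agree with the exact values only up to truncation and rounding error, so comparisons should use a tolerance.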
4.1.4.1 Higher order partial derivatives
As with normal derivatives, we can of course have higher order derivatives,
but now there can be mixed partials. We will use the following notations
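$$ f_{xy}(x, y) = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial y \, \partial x} , \tag{26} $$
so in $f_{xy}(x, y)$ we differentiate in the variables from left to right: $x$ then $y$.
Similarly, we can have $f_{xx}(x, y) = \frac{\partial^2 f}{\partial x^2}$, and so on.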
Example: Find fxy (x, y) for f (x, y) = x2 (y 2 − y).
Solution:
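$$ \begin{aligned}
f_{xy}(x, y) &= \frac{\partial}{\partial y} \left[ \frac{\partial}{\partial x} \left( x^2 (y^2 - y) \right) \right] \\
&= \frac{\partial}{\partial y} \left[ 2x (y^2 - y) \right] \\
&= 2x (2y - 1) .
\end{aligned} \tag{27} $$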
4.1.4.2 Slope
You probably recall that the slope of a function of a single variable is given
by its derivative, slope $= \frac{df}{dx}$. If a function has three variables, i.e.
three independent directions, it has three slopes. Therefore the function
$f(x, y, z)$ has slope $\frac{\partial f}{\partial x}$ in the $x$-direction, slope
$\frac{\partial f}{\partial y}$ in the $y$-direction, and slope $\frac{\partial f}{\partial z}$
in the $z$-direction. We will revisit this later when we discuss the gradient.
4.1.4.3 One-dimensional wave equation
If a string is oscillating in one dimension (up and down), the position of
any point on the string depends on both a coordinate $x$ and time $t$ and can
be described by a function $u(x, t)$. Then, it can be shown that the wave
equation is
$$ \frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2} . \tag{28} $$
The constant c2 depends on the properties of the string. The wave equation
also appears in Hooke’s law and in a more general form in electromagnetic
radiation.
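As an illustration, a travelling wave $u(x, t) = \sin(x - ct)$ satisfies the wave equation, which can be checked with finite differences. The trial solution, the value of `C` and the sample point are assumptions made for this sketch only.

```python
import math

C = 2.0  # assumed wave speed, chosen only for illustration

def u(x, t):
    """Trial solution: a travelling wave u(x, t) = sin(x - C*t)."""
    return math.sin(x - C * t)

def second_difference(func, a, h=1e-4):
    """Central-difference estimate of the second derivative of func at a."""
    return (func(a + h) - 2 * func(a) + func(a - h)) / h**2

def wave_residual(x, t):
    """u_tt - C^2 * u_xx, which should be close to zero for a solution."""
    u_tt = second_difference(lambda s: u(x, s), t)
    u_xx = second_difference(lambda s: u(s, t), x)
    return u_tt - C**2 * u_xx

residual = wave_residual(0.7, 0.3)
print(residual)
```

The residual is not exactly zero because the second derivatives are only approximated.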
4.1.4.4 Laplace’s equation
In three dimensions, Laplace’s equation is
$$ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0 . \tag{29} $$
It appears in fluid dynamics and electrostatics for example.
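Example: Prove that $\phi = x^3 - 2xy^2 + xyz - xz^2$ satisfies Laplace’s equation.
Solution:
$$ \begin{aligned}
\frac{\partial \phi}{\partial x} &= 3x^2 - 2y^2 + yz - z^2 \quad \Rightarrow \quad \frac{\partial^2 \phi}{\partial x^2} = 6x , \\
\frac{\partial \phi}{\partial y} &= -4xy + xz \quad \Rightarrow \quad \frac{\partial^2 \phi}{\partial y^2} = -4x , \\
\frac{\partial \phi}{\partial z} &= xy - 2xz \quad \Rightarrow \quad \frac{\partial^2 \phi}{\partial z^2} = -2x .
\end{aligned} $$
Hence
$$ \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} = 6x - 4x - 2x = 0 , \tag{30} $$
and so $\phi$ satisfies Laplace’s equation.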
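The same conclusion can be checked numerically by approximating each second partial derivative of $\phi$ with a central difference and summing. The helper name `laplacian`, the step size and the sample point are illustrative assumptions.

```python
def phi(x, y, z):
    return x**3 - 2 * x * y**2 + x * y * z - x * z**2

def laplacian(f, x, y, z, h=1e-3):
    """Finite-difference estimate of f_xx + f_yy + f_zz at (x, y, z)."""
    centre = f(x, y, z)
    f_xx = (f(x + h, y, z) - 2 * centre + f(x - h, y, z)) / h**2
    f_yy = (f(x, y + h, z) - 2 * centre + f(x, y - h, z)) / h**2
    f_zz = (f(x, y, z + h) - 2 * centre + f(x, y, z - h)) / h**2
    return f_xx + f_yy + f_zz

value = laplacian(phi, 1.2, -0.7, 2.5)
print(value)  # close to zero, up to rounding error
```

A single sample point does not prove the identity; it only guards against an algebra slip in the derivatives.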
4.1.4.5 Total derivatives
Consider a function of three variables $w = f(x, y, z)$. Assuming that this can
be differentiated with respect to all three variables, we define the increment
of $f$, $\Delta f$, to be the amount $f$ changes if all the variables are simultaneously
varied, i.e. $\Delta f = f(x_0 + \Delta x, y_0 + \Delta y, z_0 + \Delta z) - f(x_0, y_0, z_0)$.
It satisfies the limit
$$ \lim_{(\Delta x, \Delta y, \Delta z) \to (0,0,0)} \frac{\Delta f - f_x(x_0, y_0, z_0) \Delta x - f_y(x_0, y_0, z_0) \Delta y - f_z(x_0, y_0, z_0) \Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2}} = 0 , \tag{31} $$
from which we define the total differential
$$ dw = df = f_x(x_0, y_0, z_0) \, dx + f_y(x_0, y_0, z_0) \, dy + f_z(x_0, y_0, z_0) \, dz . \tag{32} $$
The function f is differentiable if the total differential exists, and in this
situation it is also continuous.
4.1.4.6 Local Linear Approximation
If a function $f(x, y, z)$ is differentiable at a point, it can be approximated by a
linear function near that point. We consider the function at a point $(x_0, y_0, z_0)$, and consider
shifting away to a nearby point $(x = x_0 + \Delta x, y = y_0 + \Delta y, z = z_0 + \Delta z)$;
then we can approximate
$$ f(x, y, z) \approx f(x_0, y_0, z_0) + f_x(x_0, y_0, z_0) \Delta x + f_y(x_0, y_0, z_0) \Delta y + f_z(x_0, y_0, z_0) \Delta z , \tag{33} $$
and since $\Delta x = x - x_0$, $\Delta y = y - y_0$ and $\Delta z = z - z_0$ we define the local
linear approximation to be
$$ L(x, y, z) = f(x_0, y_0, z_0) + f_x(x_0, y_0, z_0)(x - x_0) + f_y(x_0, y_0, z_0)(y - y_0) + f_z(x_0, y_0, z_0)(z - z_0) . \tag{34} $$
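Example: Find the local linear approximation of $f(x, y) = x^\alpha y^\beta + \frac{y^\alpha}{x^\beta}$ at
$(1, 1)$.
Solution: We need
$$ L(x, y) = f(1, 1) + f_x(1, 1)(x - 1) + f_y(1, 1)(y - 1) , \tag{35} $$
and we have $f(1, 1) = 2$ and then the derivatives are
$$ \begin{aligned}
f_x(x, y) &= \alpha x^{\alpha - 1} y^\beta - \beta \frac{y^\alpha}{x^{\beta + 1}} \quad \Rightarrow \quad f_x(1, 1) = \alpha - \beta , \\
f_y(x, y) &= \beta x^\alpha y^{\beta - 1} + \alpha \frac{y^{\alpha - 1}}{x^\beta} \quad \Rightarrow \quad f_y(1, 1) = \alpha + \beta ,
\end{aligned} \tag{36} $$
which gives us
$$ L(x, y) = 2 + (\alpha - \beta)(x - 1) + (\alpha + \beta)(y - 1) . \tag{37} $$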
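To see how well such a linear approximation tracks the function, one can pick concrete exponents and compare $f$ with $L$ near $(1, 1)$. The values $\alpha = 2$, $\beta = 3$ and the two test points below are assumptions made purely for illustration.

```python
ALPHA, BETA = 2.0, 3.0  # assumed exponents, chosen only for illustration

def f(x, y):
    return x**ALPHA * y**BETA + y**ALPHA / x**BETA

def L(x, y):
    """Local linear approximation of f at (1, 1)."""
    return 2 + (ALPHA - BETA) * (x - 1) + (ALPHA + BETA) * (y - 1)

# Error at a nearby point, and at a point ten times closer to (1, 1).
err_near = abs(f(1.01, 0.98) - L(1.01, 0.98))
err_nearer = abs(f(1.001, 0.998) - L(1.001, 0.998))
print(err_near, err_nearer)
```

Shrinking the displacement by a factor of ten reduces the error by roughly a factor of one hundred, as expected when the leading neglected terms are quadratic.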
4.1.5 The Chain Rule
Suppose that a function $f(x, y, z)$ depends on a parameter $t$
via $f(x(t), y(t), z(t))$. Varying $t$ will change each of $x$, $y$ and $z$.
Recalling that the chain rule for a function $v(u(t))$ gives
$\frac{dv}{dt} = \frac{dv}{du} \frac{du}{dt}$, we
define the chain rule for derivatives as
$$ \frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt} . \tag{38} $$
We can take it a step further. If z = f (x, y) has variables that depend on
two parameters u and v, i.e. x(u, v) and y(u, v), we have the chain rule for
partial derivatives
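$$ \begin{aligned}
\frac{\partial f}{\partial u} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial u} , \\
\frac{\partial f}{\partial v} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial v} .
\end{aligned} \tag{39} $$
Example: Use the chain rule to find $\frac{\partial z}{\partial u}$ and $\frac{\partial z}{\partial v}$ for
$$ z = x^2 y ; \qquad x = 2u + v , \qquad y = u - v^2 . \tag{40} $$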
Solution:
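$$ \begin{aligned}
\frac{\partial z}{\partial u} &= \frac{\partial z}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial u} = (2xy)(2) + (x^2)(1) \\
&= x^2 + 4xy \\
&= (2u + v)^2 + 4(2u + v)(u - v^2) , \\
\frac{\partial z}{\partial v} &= \frac{\partial z}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial v} = (2xy)(1) + (x^2)(-2v) \\
&= -2v(2u + v)^2 + 2(2u + v)(u - v^2) .
\end{aligned} \tag{41} $$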
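A chain-rule result like this can be cross-checked by differentiating the composite function $z(u, v)$ numerically and comparing. The sample point $(u, v) = (1, 2)$ and the step size are illustrative assumptions.

```python
def z_of_uv(u, v):
    """z = x^2 * y with x = 2u + v and y = u - v^2 substituted in."""
    x = 2 * u + v
    y = u - v**2
    return x**2 * y

def chain_rule(u, v):
    """(dz/du, dz/dv) assembled from the chain rule, term by term."""
    x = 2 * u + v
    y = u - v**2
    dz_du = (2 * x * y) * 2 + (x**2) * 1
    dz_dv = (2 * x * y) * 1 + (x**2) * (-2 * v)
    return dz_du, dz_dv

def numeric(u, v, h=1e-6):
    """Central-difference estimates of the same two derivatives."""
    dz_du = (z_of_uv(u + h, v) - z_of_uv(u - h, v)) / (2 * h)
    dz_dv = (z_of_uv(u, v + h) - z_of_uv(u, v - h)) / (2 * h)
    return dz_du, dz_dv

print(chain_rule(1.0, 2.0))
print(numeric(1.0, 2.0))
```

The two printed pairs should agree to within the accuracy of the finite differences.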
4.1.6 Directional derivatives and the gradient
4.1.6.1 Directional derivatives
If we consider a function $f(x, y, z)$ at a given point, there are obviously many
different directions in which we could move away from the initial point. In
general, any linear combination which is a unit vector ($a^2 + b^2 + c^2 = 1$),
$$ \mathbf{u} = a \mathbf{i} + b \mathbf{j} + c \mathbf{k} , \tag{42} $$
defines a direction, if we fix the origin to be $(x_0, y_0, z_0)$. In terms of the
arc length parameter $s$, we express subsequent motion away from $(x_0, y_0, z_0)$
through the equations
$$ x = x_0 + a s , \qquad y = y_0 + b s , \qquad z = z_0 + c s . \tag{43} $$
When we take s to 0, we recover the initial point. Then, differentiation with
respect to s will give the slope in the direction of u when we set s = 0. In
other words, we use s to test how a small change affects the function f at
(x0 , y0 , z0 ). If we didn’t set s = 0 at the end, we would not find the derivative
at (x0 , y0 , z0 ), but at a point an arc length s away in the relevant directions.
As a result, we define the directional derivative of f in the direction of
u to be
$$ \begin{aligned}
D_{\mathbf{u}} f(x_0, y_0, z_0) &= \left. \frac{d}{ds} \left[ f(x_0 + a s, y_0 + b s, z_0 + c s) \right] \right|_{s=0} \\
&= f_x(x_0, y_0, z_0) a + f_y(x_0, y_0, z_0) b + f_z(x_0, y_0, z_0) c .
\end{aligned} \tag{44} $$
This can be regarded as the slope of the surface w = f (x, y, z) in the direction
u.
4.1.6.2 The gradient
Calculating directional derivatives is made easier using the gradient. It is
denoted by $\nabla f$, where the symbol $\nabla$ is called “nabla” but generally read as “del”, and is
given by
∇f (x, y, z) = fx (x, y, z)i + fy (x, y, z)j + fz (x, y, z)k .
(45)
Using this, we see that we can use it to express directional derivatives as
Du f (x, y, z) = ∇f (x, y, z) · u .
(46)
This is why it is called a gradient, because it can give the slope in any
direction if the dot product with a unit vector is taken. Properties of the
gradient are:
1. w = f (x, y, z) has its maximum slope in the direction of the gradient,
and the maximum slope is ||∇f (x, y, z)||.
2. w = f (x, y, z) has its minimum slope in the direction opposite to that
of the gradient, and the minimum slope is −||∇f (x, y, z)||.
3. If ∇f = 0 at a point, all directional derivatives are zero at that point.
4. Since level curves are curves of equal $z = f(x, y)$, the gradient is
normal to the level curves. Therefore, on level curves, $\nabla f \cdot \mathbf{T} = 0$, where $\mathbf{T}$ is a vector tangent to the level curve.
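Example: Find the unit vector in the direction in which $f(x, y) = 10 - 2x^2 - y^2$
increases most quickly at $P = (1, 1)$ and compute the rate of change
in that direction.
Solution: The direction in which $f$ increases most quickly is that of the gradient,
$$ \begin{aligned}
\left. \nabla f \right|_{(1,1)} &= \left. \left( \frac{\partial f}{\partial x} \mathbf{i} + \frac{\partial f}{\partial y} \mathbf{j} \right) \right|_{(1,1)} \\
&= \left. \left( -4x \mathbf{i} - 2y \mathbf{j} \right) \right|_{(1,1)} \\
&= -4 \mathbf{i} - 2 \mathbf{j} .
\end{aligned} \tag{47} $$
This has magnitude $\|\nabla f\| = \sqrt{4^2 + 2^2} = 2\sqrt{5}$, and so the unit vector is
$$ \mathbf{u} = -\frac{2}{\sqrt{5}} \mathbf{i} - \frac{1}{\sqrt{5}} \mathbf{j} . \tag{48} $$
Finally, the rate of change is $+\|\nabla f\| = +2\sqrt{5}$.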
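The relation between the gradient and the directional derivative can be illustrated numerically for this example: normalise the gradient, then estimate the slope of $f$ along that direction by a small step in the arc length $s$. The helper names and the step size are illustrative assumptions.

```python
import math

def f(x, y):
    return 10 - 2 * x**2 - y**2

def grad_f(x, y):
    """Gradient of f, from the partial derivatives -4x and -2y."""
    return (-4 * x, -2 * y)

def directional_derivative(func, point, direction, h=1e-5):
    """Central-difference estimate of d/ds func(point + s*direction) at s = 0."""
    (x0, y0), (a, b) = point, direction
    return (func(x0 + a * h, y0 + b * h) - func(x0 - a * h, y0 - b * h)) / (2 * h)

gx, gy = grad_f(1.0, 1.0)
norm = math.hypot(gx, gy)          # magnitude of the gradient
u = (gx / norm, gy / norm)         # unit vector of steepest increase
rate = directional_derivative(f, (1.0, 1.0), u)
print(u, norm, rate)
```

Reversing the sign of `u` gives the direction of steepest decrease, with slope `-norm`.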
4.1.7 Tangent planes and normal vectors
We want to consider how to find the tangent plane to a surface. A tangent
plane is intuitively the plane that contains the tangent lines at a point $P_0$
of all curves on the surface that pass through $P_0$. Consider a surface
$F(x, y, z) = c$ and a point $P_0 = (x_0, y_0, z_0)$ on it, so that
$c = F(x_0, y_0, z_0)$. We assume that $F$ and its partial derivatives are
continuous at $P_0$. We now consider a curve $C$ on the surface, parameterised by
$\mathbf{r}(t) = (x(t), y(t), z(t))$ with $\mathbf{r}(t_0) = P_0$. Since
$F(x(t), y(t), z(t)) = c$ along $C$, differentiating with respect to $t$ using
the chain rule gives, at the point $P_0$,
$$ 0 = F_x(x_0, y_0, z_0) x'(t_0) + F_y(x_0, y_0, z_0) y'(t_0) + F_z(x_0, y_0, z_0) z'(t_0) . \tag{49} $$
We note that the tangent line to $C$ runs parallel to $\mathbf{r}'(t) = (x'(t), y'(t), z'(t))$.
With this in mind we note that (49) may be rewritten as
$$ 0 = (F_x(x_0, y_0, z_0), F_y(x_0, y_0, z_0), F_z(x_0, y_0, z_0)) \cdot (x'(t_0), y'(t_0), z'(t_0)) , \tag{50} $$
which can be written as
$$ 0 = \nabla F(x_0, y_0, z_0) \cdot \mathbf{r}'(t_0) . \tag{51} $$
In other words, $\nabla F(x_0, y_0, z_0)$ is normal to the tangent line of the curve $C$
at $P_0$, and indeed to the tangent line of any such curve since $C$ was arbitrary. We
therefore define the tangent plane to be the plane with normal vector
$$ \mathbf{n} = \nabla F(x_0, y_0, z_0) = (F_x(x_0, y_0, z_0), F_y(x_0, y_0, z_0), F_z(x_0, y_0, z_0)) , \tag{52} $$
and the tangent plane is given by
$$ F_x(x_0, y_0, z_0)(x - x_0) + F_y(x_0, y_0, z_0)(y - y_0) + F_z(x_0, y_0, z_0)(z - z_0) = 0 , \tag{53} $$
since it is a plane that touches the surface $F(x, y, z) = c$ at the point $(x_0, y_0, z_0)$
in analogy to the tangent line. The normal line is the line that is parallel
to the normal vector and has parametric form ($\mathbf{r}(t) = \mathbf{r}_0 + \mathbf{n} t$)
$$ x = x_0 + F_x(x_0, y_0, z_0) t , \qquad y = y_0 + F_y(x_0, y_0, z_0) t , \qquad z = z_0 + F_z(x_0, y_0, z_0) t . \tag{54} $$
Figure 4: Tangent plane and normal line for $z = -(x^2 + y^2)$.

A more useful form of (53) comes from considering $z = f(x, y)$ at the point
$(x_0, y_0, f(x_0, y_0))$ and gives the tangent plane as
$$ z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) , \tag{55} $$
which is easier to understand as a generalisation of the tangent line. In this
form, the normal vector is
n = (−fx (x0 , y0 ), −fy (x0 , y0 ), 1) ,
(56)
since we would have F (x, y, z) = z − f (x, y). The normal line may be written
r(t) = r0 + t(−fx (x0 , y0 )i − fy (x0 , y0 )j + k) .
(57)
These are the forms of the tangent plane and normal line that we will use
for calculations. Notice that it is identical to the local linear approximation
given in equation (34) for the surface z = f (x, y), which would read
L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ) .
(58)
This means that the graph of the local linear approximation z = L(x, y) is
the tangent plane to the surface z = f (x, y) at the point (x0 , y0 ).
Example: Find the tangent plane and normal line of the surface z =
−(x2 + y 2 ) at the point (1, 1, −2).
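Solution: First, we notice that indeed the point is $(x_0, y_0, f(x_0, y_0))$ since
$f(x_0, y_0) = -2$. The first step is to find the derivatives
$$ \left. \frac{\partial f}{\partial x} \right|_{(1,1)} = -2 , \qquad \left. \frac{\partial f}{\partial y} \right|_{(1,1)} = -2 , \tag{59} $$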
which we then use to find the equation for the tangent plane using equation
(55)
z = −2 + (−2)(x − 1) + (−2)(y − 1)
z = 2(1 − x − y) .
(60)
The normal vector is given by equation (56) as n = (2, 2, 1) and therefore
the normal line is given from (57) as
r = (1, 1, −2) + t(2, 2, 1)
= (1 + 2t, 1 + 2t, t − 2) .
(61)
Alternatively, we could have written $F = z + x^2 + y^2$ and the normal vector
is given by equation (52)
$$ \nabla F = (F_x(1, 1, -2), F_y(1, 1, -2), F_z(1, 1, -2)) = (2, 2, 1) . \tag{62} $$
The tangent plane using equation (53) is
$$ 2(x - 1) + 2(y - 1) + 1(z - (-2)) = 0 \quad \Rightarrow \quad 2x + 2y + z - 2 = 0 . \tag{63} $$
We can rewrite this as
z = 2(1 − x − y) ,
(64)
which is the same as before. Finally, the normal line from equation (54) is
x = 1 + 2t ,
y = 1 + 2t ,
z = −2 + t ,
(65)
which is the same as before.
Important note: When finding the equation of the normal line, some
textbooks might give answers that have the opposite sign for t. However,
this doesn’t matter. A line parameterised by $(1 + t, 2 + t, t - 1)$ is equivalent
to a line parameterised by $(1 - t, 2 - t, -t - 1)$ as it corresponds to a change
of parameter $t \to -t$. In fact we could change parameter by $t \to a + bt$ for
constants $a$ and $b \neq 0$ and we would have the correct line, but it would be hard
to recognise. In short, don’t get too confused by the sign of t in the normal
line if you see answers in textbooks, but for this course use equation (54) or
(57).
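The tangent plane and normal line for the surface $z = -(x^2 + y^2)$ at $(1, 1, -2)$ can be assembled directly from equations (55) to (57). The sketch below uses the analytic partial derivatives; the function names are illustrative assumptions.

```python
def f(x, y):
    return -(x**2 + y**2)

def f_x(x, y):
    return -2 * x

def f_y(x, y):
    return -2 * y

x0, y0 = 1.0, 1.0
z0 = f(x0, y0)

def tangent_plane(x, y):
    """Height of the tangent plane at (x0, y0), following equation (55)."""
    return z0 + f_x(x0, y0) * (x - x0) + f_y(x0, y0) * (y - y0)

# Normal vector (-f_x, -f_y, 1), following equation (56).
normal = (-f_x(x0, y0), -f_y(x0, y0), 1.0)

def normal_line(t):
    """Point on the normal line r0 + t*n, following equation (57)."""
    return (x0 + normal[0] * t, y0 + normal[1] * t, z0 + normal[2] * t)

print(tangent_plane(0.0, 0.0), normal, normal_line(1.0), normal_line(-1.0))
```

Evaluating `normal_line` at `t` and `-t` gives points on opposite sides of $(1, 1, -2)$ on the same line, which is why the sign convention for $t$ does not change the line itself.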
4.1.8 Minima and maxima of functions of two variables
Consider a function of two variables, f (x, y). Obviously, it varies in the two
variables and just as for a function of a single variable we can define the concepts of minima and maxima. We state the definitions separately for clarity:
f has a relative (or local) maximum at $(x_0, y_0)$ if $f(x_0, y_0) \ge f(x, y)$ for
all points that lie in some disk centered on $(x_0, y_0)$. f has an absolute (or
global) maximum at $(x_0, y_0)$ if $f(x_0, y_0) \ge f(x, y)$ for all points for which f
is defined.
f has a relative (or local) minimum at $(x_0, y_0)$ if $f(x_0, y_0) \le f(x, y)$ for
all points that lie in some disk centered on $(x_0, y_0)$. f has an absolute (or
global) minimum at $(x_0, y_0)$ if $f(x_0, y_0) \le f(x, y)$ for all points for which f
is defined.
Both minima and maxima are types of extrema, i.e. points for which the
function takes an extreme value.
Recall that a set is bounded if there is a box that can be drawn around the
entire set of points. Also recall that a closed set contains its boundary, but
an open set does not. Therefore, a disk including its boundary is closed and
bounded, whereas an infinite line is unbounded because no box can enclose its
infinite length. However, the interior of a disk, i.e. the disk without its
boundary, is open but bounded.
Extreme-Value Theorem
If f (x, y) is continuous on a closed, bounded set, then it has an absolute
maximum and an absolute minimum in that set.
4.1.8.1 Finding extrema
A stationary point is a point at which the first partial
derivatives vanish. In other words, there is a stationary point at $(x_0, y_0)$
if $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$. In addition, a critical point is any point
which is either a stationary point (i.e. all first partial derivatives vanish) or where one
or more of the first partial derivatives doesn’t exist.
The Second Partial Derivative Test Let f (x, y) be a function with continuous second order partial derivatives in a disk centered around a critical
point (x0 , y0 ), and define
D = fxx (x0 , y0 )fyy (x0 , y0 ) − fxy (x0 , y0 )2 .
(66)
• If D > 0 and fxx (x0 , y0 ) > 0 then f (x, y) has a relative minimum at
(x0 , y0 ).
• If D > 0 and fxx (x0 , y0 ) < 0 then f (x, y) has a relative maximum at
(x0 , y0 ).
• If D < 0 then f (x, y) has a saddle point at (x0 , y0 ).
• If D = 0 then no conclusion can be drawn.
A saddle point is a stationary point that is not a relative or absolute extremum. An example is $f(x, y) = x^2 - y^2$ at the point $(0, 0)$, which is shown
in Figure 5.
Figure 5: Saddle point of $f(x, y) = x^2 - y^2$ at the point $(0, 0)$.
Example: Find the critical points of $f(x, y) = xy - x^3 - y^2$ and determine
whether they are maxima, minima or saddle points.
Solution: To find the critical points, we set fx (x, y) = 0 and fy (x, y) = 0.
This gives us
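$$ y - 3x^2 = 0 , \qquad x - 2y = 0 , \tag{67} $$
and therefore, using $y = x/2$ from the second equation, we can rewrite the first equation as
$$ \frac{x}{2} - 3x^2 = 0 \quad \Rightarrow \quad x(x - 1/6) = 0 \quad \Rightarrow \quad x = 0 \ \text{or} \ x = 1/6 . \tag{68} $$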
The corresponding y values are then x = 0 ⇒ y = 0 and x = 1/6 ⇒ y = 1/12
and so the critical points are at (0, 0) and (1/6, 1/12). Next we find the second
order partial derivatives
fxx (x, y) = −6x ,
fyy (x, y) = −2 ,
fxy (x, y) = 1 ,
(69)
and so
D = (−6x)(−2) − (1)2 = 12x − 1 .
(70)
At the point $(0, 0)$, $D = -1 < 0$ and therefore $(0, 0)$ is a saddle point. At
$(1/6, 1/12)$, $D = 12(1/6) - 1 = 1 > 0$ and therefore it is either a minimum or
maximum. We now check $f_{xx}(1/6, 1/12) = -6(1/6) = -1 < 0$ and therefore
$(1/6, 1/12)$ is a relative maximum. It is not a global maximum, since
$f(x, 0) = -x^3$ grows without bound as $x \to -\infty$.
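The second partial derivative test is mechanical enough to express as a short function. The sketch below applies it to $f(x, y) = xy - x^3 - y^2$ using the second partials $f_{xx} = -6x$, $f_{yy} = -2$, $f_{xy} = 1$; the function names and the returned labels are illustrative choices.

```python
def f(x, y):
    return x * y - x**3 - y**2

def second_partials(x, y):
    """(f_xx, f_yy, f_xy) for f(x, y) = x*y - x^3 - y^2."""
    return (-6 * x, -2.0, 1.0)

def classify(f_xx, f_yy, f_xy):
    """Apply the second partial derivative test at a stationary point."""
    D = f_xx * f_yy - f_xy**2
    if D > 0:
        return "relative minimum" if f_xx > 0 else "relative maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

critical_points = [(0.0, 0.0), (1 / 6, 1 / 12)]
results = {p: classify(*second_partials(*p)) for p in critical_points}
print(results)

# The test is local: far from the critical point f can take larger values.
print(f(1 / 6, 1 / 12), f(-10.0, 0.0))
```

The final line compares the value at the relative maximum with the value at a distant point, illustrating that the test says nothing about global behaviour.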
4.1.9 Lagrange Multipliers
Using Lagrange multipliers is another way of solving extremum problems
using constraints on the system. This can be very effective when the methods
we discussed previously are difficult or impossible to implement. The idea is
that if we have a function $f(x, y, z)$ subject to the constraint $g(x, y, z) = 0$,
then the constraint restricts the allowed points to the surface $g(x, y, z) = 0$.
At a constrained extremum, $f$ does not change to first order as we move along
this surface, so $\nabla f \cdot \mathbf{T} = 0$ for every direction $\mathbf{T}$ tangent to the
surface. However, since $g(x, y, z) = 0$ at every point on the surface,
$\nabla g \cdot \mathbf{T} = 0$ also, and therefore at an extremum
$\nabla f$ is parallel to $\nabla g$, which can be written as
∇f (x, y, z) = λ∇g(x, y, z) ,
(71)
where λ is a constant called a Lagrange multiplier. Solving this set of
equations (there are three equations in three dimensions and two in two
dimensions) together with the constraint equation g(x, y, z) = 0 will give the
extrema. In other words, we solve the set of equations
fx = λgx ,
fy = λgy ,
fz = λgz ,
g(x, y, z) = 0 .
(72)
Often this will be used to maximise or minimise a function subject to a given constraint. These extrema are known as constrained maxima/minima. When
solving these equations we will not be concerned with finding the value of λ:
it is simply a tool to make the calculation easier.
Example: Find the dimensions of the box with largest volume if the total
surface area is 64 cm$^2$.
Solution: Here, the volume is the function to be maximized, f (x, y, z) =
xyz, and the surface area is the constraint. A box with sides of length x, y, z
has six sides, which come in pairs of equal areas xy, yz, xz and therefore the
constraint is
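$$ 2xy + 2yz + 2xz = 64 \quad \Rightarrow \quad g(x, y, z) = xy + yz + xz - 32 = 0 . $$
The equations to be solved are then
$$ \begin{aligned}
f_x = \lambda g_x \quad &\Rightarrow \quad yz = \lambda (y + z) , \\
f_y = \lambda g_y \quad &\Rightarrow \quad xz = \lambda (x + z) , \\
f_z = \lambda g_z \quad &\Rightarrow \quad xy = \lambda (x + y) , \\
g(x, y, z) = 0 \quad &\Rightarrow \quad xy + yz + xz = 32 .
\end{aligned} $$
The first three are equivalent to
$$ xyz = x \lambda (y + z) , \qquad xyz = y \lambda (x + z) , \qquad xyz = z \lambda (x + y) , $$
and so equating the first two gives
$$ \lambda (xy + xz) = \lambda (xy + yz) \quad \Rightarrow \quad \lambda (xz - yz) = 0 . $$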
This means that either λ = 0 or xz = yz. λ = 0 is bad for two reasons.
Firstly, it implies yz = 0, which is impossible for a box. Secondly, and more
importantly, it says that the assumption (71) is wrong. Since we are dealing
with a closed bounded set (a box with fixed surface area) we must have a
maximum and minimum value by the Extreme Value Theorem, and so we
should find a non-zero λ. Now, let’s look at the equality xz = yz. z = 0
can’t be true because the volume would be zero, and in any case we actually
have a box so it must have non-zero dimensions. Therefore, we conclude that
x = y. Now using our second and third equations, we can write
$$ y \lambda (x + z) = z \lambda (x + y) \quad \Rightarrow \quad \lambda (xy - xz) = 0 , $$
which in analogy to the previous argument requires $y = z$. Therefore, we
have $x = y = z$, and if we return to our constraint, we get
$$ x^2 + x^2 + x^2 = 32 \quad \Rightarrow \quad 3x^2 = 32 \quad \Rightarrow \quad x = \sqrt{\frac{32}{3}} , $$
which is positive as it is a length. Therefore the box has sides of equal length
$\sqrt{32/3}$ cm.
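As a numerical sanity check of the box result, one can confirm that the cube satisfies the surface-area constraint and compare its volume with a few other boxes of the same surface area. The helper `third_side` and the list of comparison boxes are illustrative assumptions, and a handful of comparisons is evidence rather than proof.

```python
import math

def volume(x, y, z):
    return x * y * z

def third_side(x, y):
    """Solve the constraint xy + yz + xz = 32 for z, given x and y."""
    return (32 - x * y) / (x + y)

side = math.sqrt(32 / 3)
cube_volume = volume(side, side, side)

# Other boxes with the same total surface area of 64 (illustrative choices).
others = [volume(x, y, third_side(x, y))
          for x, y in [(2, 3), (3, 3), (4, 4), (1, 6)]]

print(side, cube_volume)
print(others)
```

Every comparison box listed has a smaller volume than the cube, consistent with $x = y = z$ being the constrained maximum.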