6 Differentiation

6.1 Definition of the derivative and first properties
Suppose we want to construct the tangent line at a point P on the curve y =
f (x). As a first attempt we could take a second point on the curve near the
point P and draw the chord through them. Intuitively, we will end up with the
tangent line when we make the second point nearer and nearer our target point.
Let’s formalize this a little more. Let P ≡ (x, f (x)). To get a point near P we
can tweak the x-co-ordinate by a non-zero small quantity, say h, giving us the
point Q ≡ (x + h, f (x + h)).
[Figure: the curve y = f(x).]
The slope of the chord PQ is
\[
\frac{y\text{-coordinate of } Q - y\text{-coordinate of } P}{x\text{-coordinate of } Q - x\text{-coordinate of } P}
= \frac{f(x+h) - f(x)}{(x+h) - x}
= \frac{f(x+h) - f(x)}{h},
\]
the ‘difference quotient’. To get the slope of the tangent line we need to make
h go to 0. When will we get a well defined slope?
Exercise 6.1. The graphs of three functions are sketched below. Which ones
might have a well defined ‘slope of tangent’ at x = 0?
Exercise 6.2. What is the slope of the tangent to the parabola $y = x^2$ at the
point P ≡ (1, 1)?
Consider a nearby point Q ≡ (1 + h, ......). Then
slope of chord PQ = ...... → ...... as h → 0.
More generally, the slope of the tangent at the point $(x, x^2)$ will be
\[
\lim_{h \to 0} \frac{(x+h)^2 - x^2}{h} = \dots\dots
\]
Definition 6.3. The derivative of the function f at a point x, denoted by $f'(x)$,
is defined to be the limit
\[
f'(x) := \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}.
\]
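To get a feel for the limit in Definition 6.3, the difference quotient can be evaluated numerically for shrinking h. The following is a minimal Python sketch and not part of the original notes; the names `difference_quotient` and `cube` are illustrative, and the test function $x^3$ is borrowed from Example 6.7, where the derivative is stated to be $3x^2$.

```python
def difference_quotient(f, x, h):
    """Slope of the chord through (x, f(x)) and (x + h, f(x + h)).

    h must be non-zero, exactly as in the construction of the chord PQ.
    """
    if h == 0:
        raise ValueError("h must be non-zero")
    return (f(x + h) - f(x)) / h


def cube(x):
    # Illustrative test function (assumption: taken from Example 6.7).
    return x ** 3


if __name__ == "__main__":
    # The printed slopes should settle near 3 * 2**2 = 12 as h shrinks.
    for h in (0.1, 0.01, 0.001, 0.0001):
        print(h, difference_quotient(cube, 2.0, h))
```

A computer can only sample finitely many values of h, so this illustrates the limit rather than proving it.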
Geometrically $f'(x)$ is the slope of the tangent to the graph given by y = f(x)
at the point (x, f(x)). In order to get a well defined slope at x, the graph should
look smooth. At the very least, there shouldn’t be a kink or a break in the graph
at x. The latter property can be precisely stated as follows.
Theorem 6.4. If f has a derivative at x then f is continuous at x.
Proof. Continuity at x means $\lim_{h \to 0} f(x+h) = f(x)$. Equivalently, we want
to verify that $\lim_{h \to 0} (f(x+h) - f(x)) = 0$. This is done as follows:
\[
\lim_{h \to 0} (f(x+h) - f(x)) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \cdot h
\]
Suppose that a function f has a local maximum at c i.e. f(c) ≥ f(x) for all
x in some open interval around c. Then the slope of a chord approximating
the tangent at (c, f(c)) from the right hand side is ...... while a chord
approximating the tangent from the left hand side will have slope ....... So
the tangent must have slope .......
Similarly, if the function f has a local minimum at c i.e. f(c) ≤ f(x) for all x
in some open interval around c, then $f'(c) =$ .......
Theorem 6.5. If a function f has a derivative at a local maximum or local minimum c,
then $f'(c) = 0$.
The above theorem is extremely useful in figuring out the maxima and minima
of functions. Of course, as we know, the vanishing of the derivative on its own
doesn’t mean the point is an extremum.
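As an illustration of this caveat, consider $x^3$ at the point 0 (an example chosen here, not taken from the notes): the difference quotient tends to 0, yet 0 is neither a local maximum nor a local minimum. The sketch below checks this numerically; the helper names and the sampling-based test `is_local_extremum` are assumptions made for illustration and give only a crude check.

```python
def difference_quotient(f, x, h=1e-6):
    """Numerical stand-in for f'(x), following Definition 6.3 with a small fixed h."""
    return (f(x + h) - f(x)) / h


def is_local_extremum(f, c, radius=1e-2, samples=100):
    """Crude sampled check: is f(c) >= f(x) for every sampled x near c,
    or f(c) <= f(x) for every sampled x near c?"""
    xs = [c + radius * (2 * i / samples - 1) for i in range(samples + 1)]
    is_max = all(f(c) >= f(x) for x in xs)
    is_min = all(f(c) <= f(x) for x in xs)
    return is_max or is_min


def cube(x):
    return x ** 3


def square(x):
    return x ** 2


if __name__ == "__main__":
    print(difference_quotient(cube, 0.0), is_local_extremum(cube, 0.0))
    print(difference_quotient(square, 0.0), is_local_extremum(square, 0.0))
```

Both functions have slope (numerically) 0 at the origin, but only $x^2$ has an extremum there.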
Let’s have a look at how we can use Definition 6.3 to calculate derivatives of
functions.
Example 6.6. The simplest function we can think of is the constant function
i.e. f(x) = c for all x ∈ R, where c is a fixed real number. As we all know, this
has derivative ....... Let’s prove this:
\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
\]
Example 6.7. Consider $f(x) = x^3$. We know $f'(x) = 3x^2$. Let’s show this.
\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
\]
Theorem 6.8. Let n be an integer and $f(x) = x^n$. Then $f'(x) = n x^{n-1}$.
We will prove the result when n ≥ 0 and leave the case when n is negative as
an exercise. We have
\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
= \lim_{h \to 0} \frac{(x+h)^n - x^n}{h}
= \lim_{h \to 0} \frac{x^n + n x^{n-1} h + \binom{n}{2} x^{n-2} h^2 + \dots + h^n - x^n}{h}
\]
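The binomial expansion used in the proof of Theorem 6.8 can be explored numerically. After cancelling $x^n$ and dividing by h, the quotient is a sum of terms $\binom{n}{k} x^{n-k} h^{k-1}$ for k = 1, ..., n. The sketch below (function names are illustrative assumptions, and it covers positive integers n only) lists those terms so one can see that every term except the first carries a factor of h.

```python
from math import comb


def quotient_terms(n, x, h):
    """Terms of ((x + h)**n - x**n) / h after binomial expansion.

    The k-th term is C(n, k) * x**(n - k) * h**(k - 1), for k = 1, ..., n.
    """
    return [comb(n, k) * x ** (n - k) * h ** (k - 1) for k in range(1, n + 1)]


def power_rule(n, x):
    """The value n * x**(n - 1) claimed by Theorem 6.8."""
    return n * x ** (n - 1)


if __name__ == "__main__":
    n, x = 5, 2.0
    for h in (0.1, 0.001, 0.00001):
        terms = quotient_terms(n, x, h)
        # terms[0] does not depend on h; the remaining terms shrink with h.
        print(h, terms[0], sum(terms[1:]))
    print("power rule:", power_rule(n, x))
```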
Theorem 6.9. The derivative of sin x is cos x. The derivative of cos x is − sin x.
Proof. We want to show $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \cos x$ where f(x) = sin x. Let’s
check:
\[
\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{\sin(x+h) - \sin(x)}{h} = \dots\dots
\]
The verification for cos x is an exercise in the problem booklet.
Example 6.10. Consider the function $f(x) = \sqrt{x}$, defined for positive x. It has
derivative $f'(x) =$ ....... Verification:
\[
\lim_{h \to 0} \frac{\sqrt{x+h} - \sqrt{x}}{h} = \dots\dots
\]
Other notation for the derivative. If y = f(x) is a function of the real
variable x, then we obtain a new function whose value at x is the derivative at
x i.e. $f'(x)$. This is, unsurprisingly, called the derivative of f and is denoted
by $f'$ or $y'$ or by any of the following:
\[
\frac{dy}{dx}; \qquad \frac{df}{dx}; \qquad \frac{d}{dx} f(x).
\]
The x in the denominator indicates that the function we are differentiating is
considered as a function of the variable x; we often highlight this by stating that
we are differentiating with respect to x.
Remark. The notation $\frac{dy}{dx}$ suggests the derivative as a ratio of the terms dy and
dx. The terms dy and dx are supposed to represent ‘instantaneous’ change
in y and x respectively.
To make this more plausible, note that h (in the definition of derivative) is
the change in x-co-ordinate ‘∆x’. Then ∆y, the change in y-co-ordinate, is
f(x + h) − f(x). So
\[
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.
\]
Note that ∆y, ∆x are actual numbers with ∆x ≠ 0. On the other hand dy, dx
have no independent existence for us, at least at the moment!
Having said that we can think of df, for a function f, as a qualitative measure of what an idealised small change ∆f looks like. If f is a function of
the variable x then this change will depend on dx, the instantaneous change for
x, according to the rule
\[
df = \frac{df}{dx}\, dx.
\]
For example if $f(x) = x^2$ then $df = 2x\,dx$. This is quite a handy trick to use
when we do integration by substitution.
Remark. We can write Definition 6.3 as saying
\[
f(x+h) = f(x) + h f'(x) + (\text{error term, depending on } h)
\]
with the error term going to 0 faster than h, i.e. (error term)/h → 0, when h → 0. Thus the derivative allows us to
find good approximations near a point: $f(x+h) \approx f(x) + h f'(x)$ for small h.
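The approximation f(x + h) ≈ f(x) + h f'(x) is easy to try out. Here is a small Python sketch under the assumption that both f and its derivative are supplied by hand; the function names are illustrative, and the cube function with derivative $3x^2$ comes from Example 6.7.

```python
def linear_approximation(f, df, x, h):
    """Approximate f(x + h) by f(x) + h * f'(x); df is the derivative of f."""
    return f(x) + h * df(x)


def cube(x):
    return x ** 3


def cube_derivative(x):
    # Derivative of x**3, as stated in Example 6.7.
    return 3 * x ** 2


if __name__ == "__main__":
    x = 2.0
    for h in (0.1, 0.01, 0.001):
        approx = linear_approximation(cube, cube_derivative, x, h)
        error = cube(x + h) - approx
        # The error shrinks faster than h itself: error / h also tends to 0.
        print(h, approx, error, error / h)
```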
6.2 The rules of differentiation
Clearly we do not want to be using the definition every time we are required to
find a derivative. We now prove and record some properties of differentiation
which we will then use to compute more complex derivatives.
Theorem 6.11. Let u(x), v(x) be differentiable functions.
(i) If y = cu where c is a constant then $\frac{dy}{dx} = c\,\frac{du}{dx}$.
(ii) If y = u + v then $\frac{dy}{dx} = \frac{du}{dx} + \frac{dv}{dx}$.
(iii) Product Rule (Leibniz). If y = uv then $\frac{dy}{dx} = u\,\frac{dv}{dx} + v\,\frac{du}{dx}$.
(iv) Quotient Rule. If $y = \frac{u}{v}$, v ≠ 0, then
\[
\frac{dy}{dx} = \frac{v\,\frac{du}{dx} - u\,\frac{dv}{dx}}{v^2}.
\]
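The product and quotient rules can be sanity-checked against the difference quotient from Definition 6.3. The following Python sketch does so for the illustrative choice $u = x^3$, $v = \sin x$ (derivatives taken from Example 6.7 and Theorem 6.9); all function names are assumptions made for this example, and a numerical comparison is of course not a proof.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Difference quotient with a small fixed h, as a stand-in for f'(x)."""
    return (f(x + h) - f(x)) / h


def product_rule(u, du, v, dv, x):
    """Value of u * dv/dx + v * du/dx at x."""
    return u(x) * dv(x) + v(x) * du(x)


def quotient_rule(u, du, v, dv, x):
    """Value of (v * du/dx - u * dv/dx) / v**2 at x; requires v(x) != 0."""
    return (v(x) * du(x) - u(x) * dv(x)) / v(x) ** 2


if __name__ == "__main__":
    u, du = (lambda x: x ** 3), (lambda x: 3 * x ** 2)
    v, dv = math.sin, math.cos
    x = 1.0
    print(product_rule(u, du, v, dv, x),
          numerical_derivative(lambda t: u(t) * v(t), x))
    print(quotient_rule(u, du, v, dv, x),
          numerical_derivative(lambda t: u(t) / v(t), x))
```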
Exercise 6.12. Use Theorem 6.9 and Theorem 6.11 to establish formulae for
derivatives of tan x, cot x, csc x and sec x.
Theorem 6.13. Chain Rule (Newton). If y = f(g(x)) then
\[
\frac{dy}{dx} = f'(g(x))\, g'(x).
\]
If we write y = f(u) where u = g(x) then the chain rule becomes
\[
\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}.
\]
Recall that to evaluate the composite function y = f(g(x)) at a point x we
need to
Step 1. Evaluate the function g at x.
Step 2. Evaluate the function f at g(x).
The chain rule tells us that to evaluate the derivative of y at x we need to
evaluate the derivative of g at x and the derivative of f at g(x) and multiply the
two.
The process generalises. For example, suppose y = f (g(h(x))) is the composite
of three functions. If we want to evaluate y at a point, the final operation will
involve f being evaluated at some point. So we can write y = f (u) where
u = g(h(x)). We can then write u = g(v) with v = h(x). Now use the chain
rule
\[
\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \frac{dy}{du}\,\frac{du}{dv}\,\frac{dv}{dx} = f'(g(h(x)))\, g'(h(x))\, h'(x).
\]
Here’s another way of remembering the chain rule. The composite function
f (g(x)) has g inside and f outside. The chain rule says the required derivative
is ‘differentiate the outside times differentiate the inside’.
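The ‘outside times inside’ recipe translates directly into code. The sketch below applies it to the illustrative composite $\sin(x^2)$ (chosen here, not taken from the notes) and compares the result with a difference quotient; the helper names are assumptions.

```python
import math


def chain_rule(df, g, dg, x):
    """Derivative of f(g(x)) at x: f'(g(x)) * g'(x)."""
    return df(g(x)) * dg(x)


def numerical_derivative(f, x, h=1e-6):
    """Difference quotient with a small fixed h, as a stand-in for f'(x)."""
    return (f(x + h) - f(x)) / h


if __name__ == "__main__":
    # Outside function f = sin (derivative cos, Theorem 6.9);
    # inside function g(x) = x**2 (derivative 2x, Theorem 6.8).
    g, dg = (lambda x: x ** 2), (lambda x: 2 * x)
    x = 1.5
    print(chain_rule(math.cos, g, dg, x))
    print(numerical_derivative(lambda t: math.sin(g(t)), x))
```

The two printed values should agree to several decimal places.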
Exercise 6.14. What is $\frac{d}{dx}(3x^2 + 1)^3$? What is $\frac{d}{dx}\sin(\cos(x^3))$?
More examples.
We will now look at various examples of how to differentiate functions using
the preceding rules, particularly the chain rule. Typically we will have a relation
F(x, y) = 0 from which we obtain a relation involving $\frac{dy}{dx}$, y and x.
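Example 6.15. Let’s show
\[
\frac{d}{dx}\tan^{-1} x = \frac{1}{1 + x^2}.
\]
Let $y = \tan^{-1} x$. Then x = tan y. Differentiate both sides w.r.t. x to obtain
\[
\frac{d}{dx} x = \frac{d}{dx}(\tan y).
\]
Using the chain rule, we obtain
\[
\frac{d}{dx}(\tan y) = \frac{d}{dy}(\tan y)\,\frac{dy}{dx}
= \sec^2 y\,\frac{dy}{dx}
= (1 + \tan^2 y)\,\frac{dy}{dx}
= (1 + x^2)\,\frac{dy}{dx}.
\]
The left hand side $\frac{d}{dx} x$ equals 1. Hence
\[
\frac{dy}{dx} = \frac{1}{1 + x^2}.
\]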
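The implicit argument of Example 6.15 can be mirrored numerically: compute $y = \tan^{-1} x$, form $1/(1 + \tan^2 y)$, and compare with both the closed form $1/(1 + x^2)$ and a difference quotient. This is only an illustrative sketch; the function names are assumptions.

```python
import math


def slope_via_implicit_relation(x):
    """dy/dx obtained from x = tan(y): 1 / sec(y)**2 = 1 / (1 + tan(y)**2)."""
    y = math.atan(x)
    return 1.0 / (1.0 + math.tan(y) ** 2)


def numerical_derivative(f, x, h=1e-6):
    """Difference quotient with a small fixed h, as a stand-in for f'(x)."""
    return (f(x + h) - f(x)) / h


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x,
              slope_via_implicit_relation(x),
              1.0 / (1.0 + x * x),
              numerical_derivative(math.atan, x))
```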
Exercise 6.16. Show that
\[
\frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1 - x^2}}.
\]
For more examples, we will need to assume formulae for derivatives of the exponential and logarithm functions. These will be verified when we discuss the
logarithm and exponential functions in more detail later on in the course.
Theorem 6.17.
\[
\frac{d}{dx} e^x = e^x \quad \text{and} \quad \frac{d}{dx} \ln x = \frac{1}{x}.
\]
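Example 6.18. Let a ∈ R and assume x > 0. Then
\[
\frac{d}{dx}(x^a) = a x^{a-1}.
\]
To see this, set $y = x^a$. Then
\[
\begin{aligned}
y = x^a &\implies \ln y = a \ln x && \text{(take logs)} \\
&\implies \frac{d}{dx} \ln y = \frac{d}{dx}\, a \ln x && \text{(diff. w.r.t. } x\text{)} \\
&\implies \dots\dots && \text{(use chain rule)} \\
&\implies \dots\dots
\end{aligned}
\]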
Exercise 6.19. Find the slope of the tangent to the graph of $(x^2 + y^2)^3 = 8x^2 y^2$
at the point (−1, 1).
Exercise 6.20. Find $\frac{dy}{dx}$ given that $\sin(2x + 2y) = \sin^2 x + \sin^2 y$.
6.3 Derivative as a rate of change and velocity vectors
Let r(t) := (x(t), y(t)) be the position vector at time t of a particle moving in
the plane. Its position just after or before t, say at time t + ∆t is r(t + ∆t) and
the position vector of the particle would have changed by
∆r = r(t + ∆t) − r(t) = (x(t + ∆t) − x(t), y(t + ∆t) − y(t)).
Note that |∆r| is the straight-line distance between the two positions, which approximates the distance covered over the time ∆t when ∆t is small.
If we now let ∆t → 0 we get
\[
\frac{dr}{dt} := \lim_{\Delta t \to 0} \frac{\Delta r}{\Delta t} = \left( \frac{d}{dt} x(t),\ \frac{d}{dt} y(t) \right).
\]
This is called the velocity vector. At each instant the velocity vector points in
the direction of the motion tangentially to the path. The length of the velocity
vector gives the speed at that instant.
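The quotient ∆r/∆t can be computed componentwise for a small ∆t to approximate the velocity vector. The sketch below does this for a hypothetical path r(t) = (t, t²), chosen purely for illustration; the function names and the step size `dt` are assumptions.

```python
import math


def velocity(r, t, dt=1e-6):
    """Approximate dr/dt by (r(t + dt) - r(t)) / dt, component by component."""
    x0, y0 = r(t)
    x1, y1 = r(t + dt)
    return ((x1 - x0) / dt, (y1 - y0) / dt)


def speed(r, t, dt=1e-6):
    """Length of the (approximate) velocity vector."""
    vx, vy = velocity(r, t, dt)
    return math.hypot(vx, vy)


def parabola_path(t):
    # Hypothetical path used only for illustration: the particle is at (t, t**2).
    return (t, t ** 2)


if __name__ == "__main__":
    # By Theorem 6.8 the components of the velocity are 1 and 2t,
    # so at t = 1 we expect roughly (1, 2).
    print(velocity(parabola_path, 1.0))
    print(speed(parabola_path, 1.0))
```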
Example 6.21. Let r(t) := (x(t), y(t)) be the position vector at time t of a
point particle moving anticlockwise on the unit circle at unit speed, with initial
position (1, 0) at time t = 0. Then (x(t), y(t)) = (cos t, sin t). Now the velocity
is tangential to the curve along the direction of motion and has magnitude 1.
Thus the velocity at time t is represented by the unit vector perpendicular to
(cos t, sin t) in the anticlockwise direction i.e. ....... This gives
\[
\frac{d}{dt} x(t) = \dots\dots \quad \text{and} \quad \frac{d}{dt} y(t) = \dots\dots
\]
6.4 L’Hôpital’s Rule
In some limit calculations we land in situations where the obvious limit substitutions
leave us with the indeterminate forms $\frac{0}{0}$ or $\frac{\infty}{\infty}$. These cases can often be
handled using L’Hôpital’s Rule.
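Theorem 6.22 (L’Hôpital’s Rule). If f(a) = g(a) = 0 then
\[
\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)},
\]
provided the limit on the right hand side exists.
We sketch a proof when $g'(a) \neq 0$. In this case, we are claiming $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}$.
We have
\[
\begin{aligned}
\lim_{x \to a} \frac{f(x)}{g(x)} &= \lim_{h \to 0} \frac{f(a+h)}{g(a+h)} \\
&= \lim_{h \to 0} \frac{f(a+h) - f(a)}{g(a+h) - g(a)} && \text{(since } f(a) = g(a) = 0\text{)} \\
&= \lim_{h \to 0} \frac{(f(a+h) - f(a))/h}{(g(a+h) - g(a))/h} = \frac{f'(a)}{g'(a)}.
\end{aligned}
\]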
Example 6.23. Find $\lim_{x \to 1} \frac{x^3 + x - 2}{x - 1}$.
Both numerator and denominator are 0 at x = 1. By L’Hôpital’s Rule:
\[
\lim_{x \to 1} \frac{x^3 + x - 2}{x - 1} = \left. \frac{3x^2 + 1}{1} \right|_{x=1} = 4.
\]
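The value 4 found in Example 6.23 can be cross-checked by evaluating the original quotient at points close to x = 1 and comparing with the quotient of derivatives at x = 1. A short Python sketch (the function names are illustrative assumptions):

```python
def numerator(x):
    return x ** 3 + x - 2


def denominator(x):
    return x - 1


def numerator_derivative(x):
    return 3 * x ** 2 + 1


def denominator_derivative(x):
    return 1.0


if __name__ == "__main__":
    # The quotient itself is undefined at x = 1, so sample nearby points.
    for x in (1.1, 1.01, 1.001, 0.999):
        print(x, numerator(x) / denominator(x))
    # Quotient of derivatives evaluated directly at x = 1.
    print(numerator_derivative(1.0) / denominator_derivative(1.0))
```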
Example 6.24. Find $\lim_{x \to -1} \frac{x^3 + x^2 - x - 1}{x^3 + 2x^2 + x}$.
We check that numerator and denominator are 0 at x = −1. As long as this
remains the case we can keep applying L’Hôpital’s Rule:
\[
\lim_{x \to -1} \frac{x^3 + x^2 - x - 1}{x^3 + 2x^2 + x}
= \lim_{x \to -1} \frac{3x^2 + 2x - 1}{3x^2 + 4x + 1}
= \lim_{x \to -1} \frac{6x + 2}{6x + 4}
= \frac{-4}{-2} = 2.
\]
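Theorem 6.25 (Variants of L’Hôpital’s Rule).
(I) If $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = \infty$, then
\[
\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.
\]
(II) If $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$, then
\[
\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}.
\]
(III) If $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty$, then
\[
\lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)}.
\]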
Note 6.26. There are also variants for one sided limits i.e. x → a+ and x → a−
which we will use without stating formally. (You are strongly encouraged to write
out the corresponding statements for one sided limits.)
Note 6.27. Remember to check that the limit you want to calculate is in the
form 0/0 or ∞/∞ before you apply L’Hôpital’s Rule. Otherwise you will get into
blatantly nonsensical statements: e.g.
\[
0 = \lim_{x \to 0} \frac{x}{x + 1} = \lim_{x \to 0} \frac{1}{1} = 1.
\]
Example 6.28. Find $\lim_{x \to \infty} \frac{3x - 7}{5x + 3}$.
Note that both numerator and denominator go to ∞ as x → ∞. So by
L’Hôpital’s Rule, $\lim_{x \to \infty} \frac{3x - 7}{5x + 3} = \lim_{x \to \infty} \frac{3}{5} = \frac{3}{5}$.
Theorem 6.29. Let k > 0 be fixed. Then
\[
\frac{\ln x}{x^k} \to 0 \quad \text{as } x \to \infty.
\]
Interpretation: the function ln x grows more slowly (in the long run) than any
positive power of x.
Proof. As x → ∞, ln x → ∞ and $x^k \to \infty$. By L’Hôpital’s Rule:
\[
\lim_{x \to \infty} \frac{\ln x}{x^k} = \dots\dots
\]
Here are some important limits that can be derived using L’Hôpital’s Rule.
Theorem 6.30. $\lim_{x \to \infty} x^{1/x} = 1$.
Proof. Taking logs (base e) it is equivalent to show that $\lim_{x \to \infty} \frac{1}{x} \ln x = 0$, but
this was proved in Theorem 6.29 (with k = 1 in that result).
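Theorem 6.31. $\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e$.
Proof. Taking logs, it is equivalent to show that $\lim_{x \to \infty} x \ln\left(1 + \frac{1}{x}\right) = 1$.
As x → ∞, $\ln\left(1 + \frac{1}{x}\right) \to \ln 1 = 0$ and $\frac{1}{x} \to 0$.
By L’Hôpital’s Rule,
\[
\lim_{x \to \infty} \frac{\ln\left(1 + \frac{1}{x}\right)}{1/x} = \dots\dots
\]
Note 6.32. Similarly, for any r, $\lim_{x \to \infty} \left(1 + \frac{r}{x}\right)^x = e^r$.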
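The limits in Theorem 6.31 and Note 6.32 converge slowly but visibly. As a rough numerical illustration (the function name `compound` is an assumption, and floating-point rounding limits how large x can usefully be):

```python
import math


def compound(r, x):
    """The expression (1 + r/x)**x from Note 6.32."""
    return (1 + r / x) ** x


if __name__ == "__main__":
    for x in (10.0, 1e3, 1e6):
        # Compare with e (r = 1) and e**2 (r = 2).
        print(x, compound(1, x), compound(2, x))
    print("e =", math.e, " e^2 =", math.exp(2))
```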
6.5 Properties of differentiable functions
We now briefly discuss properties of differentiable functions on closed intervals.
Theorem 6.33. Rolle’s Theorem. Assume that the function f : [a, b] → R
fulfils the following conditions.
• f is continuous on [a, b],
• f is differentiable in the open interval (a, b), and
• f (a) = f (b).
Then there is a point c ∈ (a, b) with derivative $f'(c) = 0$.
Proof. If f is the constant function i.e. f(x) = f(a) for all x ∈ [a, b] then the
result is clearly true as $f'(x) = 0$ for all x ∈ (a, b).
If f is not the constant function, then, being continuous on [a, b], it attains a maximum and a minimum there. Since f(a) = f(b), at least one of these is attained at
some c ∈ (a, b), and then $f'(c) = 0$ by Theorem 6.5.
Theorem 6.34. Lagrange’s Mean Value Theorem. Assume that the function f : [a, b] → R fulfils the following conditions.
• f is continuous on [a, b], and
• f is differentiable in the open interval (a, b).
Then there is a point c ∈ (a, b) such that
\[
f'(c) = \frac{f(b) - f(a)}{b - a}.
\]
Geometrical interpretation. Set P ≡ (a, f(a)) and Q ≡ (b, f(b)). Lagrange’s
Mean Value Theorem says that at some point c ∈ (a, b), the chord PQ and the
tangent at (c, f(c)) have the same slope i.e. $\frac{f(b) - f(a)}{b - a} = f'(c)$. In other
words, the tangent at (c, f(c)) is parallel to the chord PQ.
Proof of the Mean Value Theorem. Consider the function g : [a, b] → R given
by
\[
g(x) = f(x) - \frac{f(b) - f(a)}{b - a}\,(x - a).
\]
Then g is continuous on [a, b] and differentiable on (a, b) with derivative
\[
g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}.
\]
Now g(a) = g(b) = f(a). So by Rolle’s Theorem there is a c ∈ (a, b) where
$g'(c) = 0$ i.e. $f'(c) = \frac{f(b) - f(a)}{b - a}$.
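The Mean Value Theorem only asserts that a suitable c exists. For a concrete function one can hunt for it numerically. The sketch below uses bisection on $f'(x) - \frac{f(b) - f(a)}{b - a}$, which is a technique swapped in here for illustration and not the method of the proof; it assumes that the derivative is continuous and that this difference changes sign on [a, b]. The function names are hypothetical.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Find c with df(c) equal to the slope of the chord from a to b.

    Assumption (not guaranteed by the theorem): df is continuous and
    df(x) - slope has opposite signs at a and b, so bisection applies.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = df(lo) - slope
    if g_lo * (df(hi) - slope) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = df(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid          # sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # sign change lies in [mid, hi]
    return (lo + hi) / 2


def cube(x):
    return x ** 3


def cube_derivative(x):
    return 3 * x ** 2


if __name__ == "__main__":
    # Chord slope on [0, 3] is 27 / 3 = 9, so we expect 3 * c**2 = 9.
    c = mean_value_point(cube, cube_derivative, 0.0, 3.0)
    print(c, cube_derivative(c))
```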
Theorem 6.35. If f has positive derivative in an open interval then f is strictly
increasing. If f has negative derivative in an open interval then f is strictly
decreasing.
To see this, suppose f has positive derivative. For a < b in the given open
interval, the MVT gives a c between a and b such that $f(b) - f(a) = (b - a) f'(c)$.
Since b − a > 0 and $f'(c) > 0$, we obtain f(b) > f(a). So the function is strictly
increasing. (And similarly for the second statement.)