Understanding Calculus

STUDENT’S COMPANIONS IN BASIC MATH: THE FIFTH
Let $f$ be a (real-valued) function defined on an open interval $I$. The derivative
$f'(x)$ of $f$ at a point $x$ in $I$ is defined by
\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \tag{1} \]
provided that this limit exists. We may interpret $h$ as a small change in $x$, which results
in a small change of $f$, namely $f(x+h) - f(x)$. Their ratio $(f(x+h) - f(x))/h$ stands
for the relative change: change (or difference) in $f$ relative to change (or difference) in
$x$. Schematically,
\[ \frac{f(x+h) - f(x)}{h} = \frac{\text{change in } f}{\text{change in } x} \ \text{(relative change)} \equiv \frac{\text{difference in } f}{\text{difference in } x} \ \text{(difference quotient)} \]
and $f'(x)$ is the limit of the above expression as $h$ tends to zero. For a quick and easy
example, we show how to get the derivative of the function $f(x) = x^2$ according to the
definition given by (1):
\[ \frac{f(x+h) - f(x)}{h} = \frac{(x+h)^2 - x^2}{h} = \frac{x^2 + 2xh + h^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h, \]
which approaches $2x$ as $h \to 0$. Hence $f'(x) \equiv \frac{d}{dx} x^2 = 2x$.
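The limit in (1) can also be watched numerically. The following is a minimal sketch, not part of the original text; the function names `difference_quotient` and `square` and the sample point $x = 3$ are chosen here only for illustration. It evaluates the difference quotient of $f(x) = x^2$ for shrinking $h$ and compares it with the algebraic value $2x + h$.

```python
def difference_quotient(f, x, h):
    """Return (f(x + h) - f(x)) / h, the quantity whose limit defines f'(x)."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


x = 3.0  # illustrative sample point
for h in (1.0, 0.1, 0.01, 0.001):
    q = difference_quotient(square, x, h)
    # For f(x) = x^2 the quotient equals 2x + h up to rounding error.
    print(f"h = {h:<6} quotient = {q:.6f}   2x + h = {2 * x + h:.6f}")
```

As $h$ shrinks, both columns drift toward $2x = 6$, in agreement with the computation above.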
EXERCISE 1. Use the definition of derivative given by (1) to find $f'(x)$ for $f(x) = x^3$.
EXERCISE 2. Use the definition of derivative to find $f'(x)$ for $f(x) = 1/x$ ($x \neq 0$).
EXERCISE 3. Use the definition of derivative to find $f'(x)$ for $f(x) = \sqrt{x}$ ($x > 0$).
There is a general formula covering all cases considered in the above exercises:
\[ \frac{d}{dx} x^\alpha = \alpha x^{\alpha - 1}, \tag{2} \]
where $\alpha$ is any fixed real number. (Here we impose the restriction $x > 0$, if necessary.)
The main difficulty in establishing (2) is to find a proper definition of the expression $x^\alpha$,
which requires a solid background in mathematical analysis. We only check the special case
in which $\alpha$ is a positive integer and accept the general case on faith (however,
consult the TENTH COMPANION).
EXAMPLE 4. We verify (2) for a positive integer $\alpha$, say $\alpha = n$. Let us recall the following
well-known identity
\[ \frac{a^n - b^n}{a - b} = a^{n-1} + a^{n-2} b + \cdots + a b^{n-2} + b^{n-1} \]
(see (2) from the SECOND COMPANION). Putting $a = x + h$ and $b = x$, we obtain
\[ \frac{(x+h)^n - x^n}{h} = (x+h)^{n-1} + (x+h)^{n-2} x + \cdots + (x+h) x^{n-2} + x^{n-1}. \]
As $h$ tends to zero, each term on the right-hand side of the above identity approaches
the same limit $x^{n-1}$, and there are $n$ terms altogether. So
\[ \frac{d}{dx} x^n = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} = n x^{n-1}, \]
which gives (2) for $\alpha = n$.
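The identity used in Example 4 can be spot-checked numerically. This is only a sketch under assumed sample values ($x = 2$, $n = 5$); the helper names `quotient` and `expanded_sum` are made up for this illustration. It confirms that the difference quotient equals the $n$-term sum, and that the sum is close to $n x^{n-1}$ once $h$ is small.

```python
def quotient(x, h, n):
    """Left-hand side: ((x + h)^n - x^n) / h."""
    return ((x + h) ** n - x ** n) / h


def expanded_sum(x, h, n):
    """Right-hand side: (x+h)^(n-1) + (x+h)^(n-2) x + ... + x^(n-1), n terms in all."""
    return sum((x + h) ** (n - 1 - k) * x ** k for k in range(n))


x, n = 2.0, 5  # illustrative sample values
print(quotient(x, 0.5, n), expanded_sum(x, 0.5, n))  # the two sides agree
print(expanded_sum(x, 1e-6, n), n * x ** (n - 1))    # the sum is near n x^(n-1) for small h
```

The second line of output illustrates why the limit is $n x^{n-1}$: each of the $n$ terms is already close to $x^{n-1}$.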
For complicated functions, finding derivatives is made easier by the following basic
rules of differentiation.
Linearity $(af + bg)' = af' + bg'$ ($a$, $b$ are constants).
Product Rule $(fg)' = fg' + f'g$.
Quotient Rule $\left(\frac{f}{g}\right)' = \frac{g f' - f g'}{g^2}$.
Chain Rule $(f \circ g)'(x) = f'(g(x))\, g'(x)$.
We have to say something about the chain rule. The expression $f \circ g$ stands for the
composite of $f$ and $g$, and its value at $x$ is given by $f(g(x))$. In other words,
$(f \circ g)(x) = f(g(x))$. For example, if $f(x) = \sqrt{x}$ and $g(x) = x^2$, then
$(f \circ g)(x) = f(g(x)) = \sqrt{x^2} = |x|$. The above chain rule formula reads: the derivative of the composite
function $f \circ g$ is equal to the derivative of $f$ evaluated at $g(x)$, times the derivative
of $g$ at $x$. This identity can be rewritten as $(f \circ g)' = (f' \circ g)\, g'$. Many students find
the chain rule put this way mentally challenging. They prefer to introduce variables
$v = g(x)$ and $u = f(v)$, and write the formula for the chain rule as
\[ \frac{du}{dx} = \frac{du}{dv} \frac{dv}{dx}. \]
A good student can apply the chain rule mentally, without following any explicit formula.
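As a small worked illustration of the $u$, $v$ notation (the functions $f(v) = v^3$ and $g(x) = x^2 + 1$ are chosen here only as an example and do not come from the text), the chain rule can be checked against a direct expansion:

```latex
\[
v = g(x) = x^2 + 1, \qquad u = f(v) = v^3, \qquad
\frac{du}{dv} = 3v^2, \qquad \frac{dv}{dx} = 2x,
\]
\[
\frac{du}{dx} = \frac{du}{dv}\,\frac{dv}{dx} = 3(x^2+1)^2 \cdot 2x = 6x(x^2+1)^2 .
\]
% Check by expanding first and then using linearity and (2):
\[
(x^2+1)^3 = x^6 + 3x^4 + 3x^2 + 1, \qquad
\frac{d}{dx}\left(x^6 + 3x^4 + 3x^2 + 1\right) = 6x^5 + 12x^3 + 6x = 6x(x^2+1)^2 .
\]
```

Both routes give the same answer, but the chain rule avoids the expansion entirely.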
QUESTION 5. What is the result of applying the chain rule to find
\[ \frac{d}{dx} |x| \equiv \frac{d}{dx} \sqrt{x^2} \]
for $x \neq 0$? Use your answer to find the derivative of $\ln |x|$ for $x \neq 0$ (assuming you
know $(d/dx) \ln x = 1/x$ for $x > 0$).
EXERCISE 6. Find
\[ \frac{d}{dx} \sqrt{x + \sqrt{x + \sqrt{x}}}. \]
(This exercise shows that the derivative of a function could be quite unwieldy.)
EXERCISE 7. The logarithmic derivative $f^L$ of a function $f$ is defined to be $f'/f$.
Verify the following identities:
\[ (fg)^L = f^L + g^L, \qquad \left(\frac{f}{g}\right)^L = f^L - g^L, \qquad (f^\alpha)^L = \alpha f^L. \]
(Notice that, if $f > 0$, then $f^L = (\ln f)'$, the derivative of the natural log of $f$.)
Here are some basic formulas for derivatives of elementary transcendental functions:
\[ \frac{d}{dx} e^x = e^x, \qquad \frac{d}{dx} \ln x = \frac{1}{x}, \]
\[ \frac{d}{dx} \sin x = \cos x, \qquad \frac{d}{dx} \cos x = -\sin x, \qquad \frac{d}{dx} \tan x = \sec^2 x, \qquad \frac{d}{dx} \sec x = \tan x \sec x. \]
A more thorough list can be found in any standard calculus textbook.
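One way to gain confidence in such a table is to compare each entry with a numerical derivative. The sketch below is an illustration only: the helper `central_difference`, the step size $10^{-6}$ and the sample point $x = 0.7$ are assumptions made here, and a symmetric difference quotient is used in place of the one-sided quotient in (1) because it is usually more accurate.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sec(x):
    return 1.0 / math.cos(x)


# Each entry pairs a function with the derivative claimed in the table above.
table = {
    "exp": (math.exp, math.exp),
    "ln": (math.log, lambda x: 1.0 / x),
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "sec": (sec, lambda x: math.tan(x) * sec(x)),
}

x = 0.7  # illustrative sample point inside every domain
errors = {name: abs(central_difference(f, x) - df(x)) for name, (f, df) in table.items()}
for name, err in errors.items():
    print(f"{name:>3}: |numerical - formula| = {err:.2e}")
```

Small discrepancies are expected from rounding; a large one would signal a wrong formula.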
QUESTION 8. What is the logarithmic derivative of $\sec x + \tan x$? If you don’t like this
question, answer the following one: what is the derivative of $\ln |\sec x + \tan x|$?
EXERCISE 9. Suppose that the graph of $f$ intersects the $x$-axis at $a$ nontangentially, that
is, $f(a) = 0$ but $f'(a) \neq 0$. Let $g = f/f'$. Verify that $g'(a) = 1$.
The derivative $f'(x)$ can be considered as the rate of change of $f$ at $x$ if $x$
is interpreted as the time variable. Normally we prefer to use the letter $t$ for the time
variable. For example, if $x(t)$ stands for the position on the $x$-axis of a moving point at
time $t$, then $x'(t)$, the rate of change of $x$, is the velocity of the moving point at time
$t$. The second derivative $x''$, which is the derivative of $x'$, is the rate of change of the
velocity. It is called the acceleration.
QUESTION 10(a). What is wrong with the statement “to accelerate means to speed up;
so the acceleration is proportional to the speed”?
QUESTION 10(b). Air is pumped into a balloon at a constant rate. Why do we perceive
that the expansion of the balloon slows down? The volume of the balloon is $V = cx^3$,
where $x$ is its diameter and $c$ is a constant depending on its shape. By assumption,
$dV/dt$ is a positive constant. Check that $dx/dt > 0$ (which means that the balloon is expanding)
and $d^2x/dt^2 < 0$ (which means that the expansion is slowing down).
An important application of calculus is optimization: finding maxima and minima. If
a smooth function $f$ of a single variable attains its (relative) maximum or minimum at
an interior point $x_0$, then its derivative vanishes at this point: $f'(x_0) = 0$. (Nowadays
this is called Fermat’s theorem. Fermat is the world’s greatest amateur mathematician.
He discovered this theorem long before calculus was invented.) This fact tells us that
solutions to $f'(x) = 0$ (as well as the boundary points of the domain, if any) give us
all candidates for optimizing points. For each candidate, we need other methods (such as
the second derivative test) or just our common sense to find out whether it is a maximum
point, a minimum point, or neither.
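The candidate-checking procedure just described can be sketched in a few lines. Everything specific here is an assumption made for illustration: the function $f(x) = x^3 - 3x$, the interval $[-1.5, 2.5]$, and the names `classify`, `global_min` and `global_max`. The critical points $\pm 1$ are obtained by hand from $f'(x) = 3x^2 - 3 = 0$.

```python
def f(x):
    return x ** 3 - 3 * x


def f_second(x):
    # Second derivative of f, used for the second derivative test.
    return 6 * x


a, b = -1.5, 2.5               # illustrative closed interval
critical_points = [-1.0, 1.0]  # solutions of f'(x) = 3x^2 - 3 = 0, found by hand


def classify(x):
    """Second derivative test at a critical point x."""
    s = f_second(x)
    if s > 0:
        return "local minimum"
    if s < 0:
        return "local maximum"
    return "inconclusive"


# Candidates: the boundary points plus the interior critical points.
candidates = [a, b] + [x for x in critical_points if a < x < b]
global_min = min(candidates, key=f)
global_max = max(candidates, key=f)

for x in critical_points:
    print(f"x = {x:+.1f}: {classify(x)}, f(x) = {f(x):+.3f}")
print("minimum over [a, b] at x =", global_min, " maximum at x =", global_max)
```

In this example the minimum over the interval occurs at an interior critical point while the maximum occurs at a boundary point, which is why both kinds of candidates have to be examined.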
EXERCISE 11. (In this exercise, $a$, $c$ and $k$ are positive constants.) To prepare thoroughly
for an upcoming test, a student needs no more than $a$ hours of work. If the student
spends $t$ hours ($t \le a$) preparing for this test, the mental stress due to the work is
$M_1(t) = ct^2$ and the mental stress due to his worry about the test is $M_2(t) = k(a-t)^2$. The
student decides to study in a way that minimizes his total mental stress, which is
$M(t) = M_1(t) + M_2(t) = k(a-t)^2 + ct^2$. Find the number $T$ of hours he should study. Find
the limit of $T$ in the following two cases: $k \to 0$ and $k \to \infty$. Make sure that the answer
agrees with your experience.
EXERCISE 12 (continued). Now suppose that a high-strung student prepares for the
same test. His stress level is measured by an exponential function instead of a quadratic
function. Assume that $M_1(t) = c(e^{\lambda t} - 1)$ and $M_2(t) = k e^{\mu(a-t)}$, where $k$, $c$, $\mu$ and $\lambda$
are positive constants. Describe the behavior of this nervous student in the following two
cases in comparison with a normal student described in the previous exercise: $k$ small
(only worrying a little bit) and $k$ large (worrying a lot).
QUESTION 13. An engineer applied calculus to design a cargo ship with weight minimized
and capacity maximized. He tested his model in a tank. The model flipped over and sank.
In your opinion, what is the cause of this fiasco? Do you blame calculus for this?
One of the most important theorems in differential calculus is Lagrange’s mean
value theorem. It says that if $f$ is a smooth function of a single variable and if $[a, b]$ is
an interval contained in its domain, then there is a point $\xi$ in this interval such that
\[ f'(\xi) = \frac{f(b) - f(a)}{b - a}. \]
This is an existence theorem. It is important to our theory. We explain the geometric
meaning of this theorem as follows. Notice that $f'(\xi)$ is the slope of the tangent at the
point $(\xi, f(\xi))$ and $(f(b) - f(a))/(b - a)$ is the slope of the secant through $(a, f(a))$
and $(b, f(b))$. So geometrically Lagrange’s mean value theorem says that given any secant
to a smooth planar curve there is a tangent parallel to it. (The word “planar” here is
crucial. This statement is not true for spatial curves. For example, the helix described by
the parametric equations $x = \cos t$, $y = \sin t$ and $z = t$ has a vertical secant but no
vertical tangents.)
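The theorem only asserts that $\xi$ exists, but in concrete cases the point can be located numerically. The sketch below assumes the example $f(x) = x^3$ on $[0, 2]$ and a hypothetical helper `mean_value_point`. It uses bisection, which relies on the extra assumption (true in this example, but not part of the theorem) that $f'$ is continuous and that $f'(x)$ minus the secant slope changes sign between the endpoints.

```python
import math


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Locate xi in [a, b] with f'(xi) equal to the secant slope, by bisection.

    Assumes f' is continuous and f'(x) - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return f_prime(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


xi = mean_value_point(f, f_prime, 0.0, 2.0)
print("xi =", xi, " exact value 2/sqrt(3) =", 2 / math.sqrt(3))
```

Here the secant slope is $(8 - 0)/(2 - 0) = 4$, and solving $3\xi^2 = 4$ by hand gives $\xi = 2/\sqrt{3}$, which the bisection reproduces.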
EXAMPLE 14. Let $x_1(t)$ and $x_2(t)$ be the $x$-coordinates of the positions of two
cars. Assume that both of them start at the origin and car 2 is faster at any moment:
$x_1(0) = x_2(0) = 0$ and $x_2'(t) > x_1'(t)$ for all $t$. We are going to prove that car 2 is
always leading, that is, $x_2(t) > x_1(t)$ for all $t > 0$. Suppose, on the contrary, that at a certain time
$T > 0$ car 1 is not behind: $x_1(T) \ge x_2(T)$. Apply the mean value theorem to the function
$f(t) = x_2(t) - x_1(t)$: there exists some $t_0$ in $[0, T]$ such that $f'(t_0) = (f(T) - f(0))/(T - 0)$.
Replace $f$ by $x_2 - x_1$ to get $x_2'(t_0) - x_1'(t_0) = (x_2(T) - x_1(T))/T \le 0$, or $x_2'(t_0) \le x_1'(t_0)$,
contradicting our assumption that $x_2'(t) > x_1'(t)$ for all $t$.
Sometimes in differential calculus we use a more convenient tool for manipulation,
called differential forms. Their basic rules for calculation are very similar to those for
derivatives:
Linearity $d(au + bv) = a\,du + b\,dv$ ($a$, $b$ are constants).
Product Rule $d(uv) = u\,dv + v\,du$.
Quotient Rule $d\left(\frac{u}{v}\right) = \frac{v\,du - u\,dv}{v^2}$.
Chain Rule (special case) If $u$ is a function of $v$, then $du = \frac{du}{dv}\,dv$.
There are many advantages of using differential forms. For example, the above rules apply
equally well to functions of several variables.
EXAMPLE 15. Let
\[ r = \sqrt{x^2 + y^2 + z^2}. \]
We want to find $dr$ in terms of $x$, $y$ and $z$. Note that $r^2 = x^2 + y^2 + z^2$. So
$d(r^2) = d(x^2) + d(y^2) + d(z^2)$. The chain rule tells us that
\[ d(v^2) = \frac{d(v^2)}{dv}\,dv = 2v\,dv. \]
So we have $2r\,dr = 2x\,dx + 2y\,dy + 2z\,dz$, or
\[ dr = \frac{x\,dx + y\,dy + z\,dz}{r} = \frac{x\,dx + y\,dy + z\,dz}{\sqrt{x^2 + y^2 + z^2}}. \]
Done.
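The formula for $dr$ can be read as a recipe for estimating a small change in $r$. The following sketch uses sample values chosen here for illustration (the point $(2, 3, 6)$, where $r = 7$, and small increments), and the names `radius` and `dr` are likewise made up. It compares the first-order estimate $r + dr$ with a direct evaluation.

```python
import math


def radius(x, y, z):
    return math.sqrt(x * x + y * y + z * z)


def dr(x, y, z, dx, dy, dz):
    """First-order change in r: (x dx + y dy + z dz) / r."""
    return (x * dx + y * dy + z * dz) / radius(x, y, z)


x, y, z = 2.0, 3.0, 6.0        # illustrative base point, r = 7
dx, dy, dz = 0.01, -0.02, 0.03  # illustrative small changes

estimate = radius(x, y, z) + dr(x, y, z, dx, dy, dz)
exact = radius(x + dx, y + dy, z + dz)
print("estimate from dr :", estimate)
print("direct evaluation:", exact)
print("difference       :", abs(estimate - exact))
```

The difference is of second order in the increments, which is why the estimate is good when $dx$, $dy$, $dz$ are small.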
EXERCISE 16. Use the last identity to estimate $\sqrt{1.03^2 + 2.01^2 + 2.02^2}$ (without using a
calculator). Notice that $\sqrt{1^2 + 2^2 + 2^2} = 3$. Use $dx = 0.03$, $dy = 0.01$, $dz = 0.02$. Check
how good your answer is by comparing it with the answer obtained from a calculator.
EXERCISE 17. The relation between the Cartesian coordinates $x$, $y$ and polar coordinates $r$, $\theta$ is given by $x = r\cos\theta$ and $y = r\sin\theta$. Verify the following identity:
\[ d\theta = \frac{x\,dy - y\,dx}{x^2 + y^2}. \]
Using partial derivatives, the chain rule above can be refined:
Chain rule (full version) If $u$ is a function of $n$ variables $v_1, v_2, \ldots, v_n$, then
\[ du = \frac{\partial u}{\partial v_1}\,dv_1 + \frac{\partial u}{\partial v_2}\,dv_2 + \cdots + \frac{\partial u}{\partial v_n}\,dv_n. \]
An important route for understanding nature and applying our knowledge about nature is roughly as follows:
Physical Laws ⇒ DEs (differential equations)
⇒ qualitative studies or numerical solutions of DEs
⇒ engineering
Certainly this tells us that the theory of differential equations is an important subject! For
example, Newton’s law for one-dimensional motion of a point mass is given by
\[ m \frac{d^2 x}{dt^2} = F \equiv -V'(x) \]
($V$ is a potential function), which is a second-order ordinary differential equation.
EXAMPLE 18. You are sitting comfortably in a heated room playing a video game. Suddenly a supersonic aircraft appears nearby, with a thunderous boom interrupting your
game. What are the differential equations relevant to this situation? Well, the sound
accompanying the game and the shock wave from the aircraft are governed by the wave
equations. The heat keeping you warm is governed by the heat equation. Photons from
the screen arriving at your eyes are governed by Schrödinger’s equations. The electrons
in your computer are governed by Dirac’s equations. The flight of the aircraft is governed
by the Navier-Stokes equations. You are sitting in the room, which is on the earth, which
is a member of the solar system, which belongs to a galaxy called the Milky Way, which is
a part of a galaxy cluster, which is in the universe. The universe is governed by Einstein’s
equations. And you, your whole being, are probably governed by some very sophisticated
differential equations that only God knows.
EXAMPLE 19. Before a political debate, the room is at a low temperature $L$. Once the
debate starts, the room begins to heat up quickly from the hot air supplied by the panelists,
which is at a much higher temperature $H$, close to the boiling point. Let $T(t)$ be the
room temperature at time $t$, assuming the debate starts at $t = 0$. Then, according to
Newton’s law of cooling, $T(t)$ satisfies the differential equation
\[ \frac{dT}{dt} = c(H - T), \]
where $c$ is a constant depending on the size, shape and structure of the room. This
equation, together with the initial condition $T(0) = L$, determines a unique solution
$T(t)$. As you can see, the rate of change of the temperature $dT/dt$ is proportional to the
temperature difference $H - T$. At the beginning, this temperature difference is substantial.
The room is heated up quickly by the debate, and the audience is excited. But as the
debate drags on, the room temperature gets close to $H$. Since $H - T$ is then small,
so is $dT/dt$. The audience cannot tell if the room temperature is still rising. This may
contribute to the fact that most of the audience feel sleepy at that moment. [If, at that
moment, the temperature maintained the same rate of increase, the audience would think
that the house was on fire and everyone would leave immediately.]
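The slowing-down described in Example 19 can be seen in a short simulation. This is a sketch only: the numbers $L = 15$, $H = 95$, $c = 0.5$ are made up for illustration, and the step-by-step scheme is Euler's method, which is not discussed in the text. For comparison it uses the closed form $T(t) = H - (H - L)e^{-ct}$, which one can check satisfies both the equation and $T(0) = L$.

```python
import math


def exact_temperature(t, L, H, c):
    """Closed-form solution of dT/dt = c (H - T) with T(0) = L."""
    return H - (H - L) * math.exp(-c * t)


def euler_temperature(t_end, L, H, c, steps):
    """Approximate T(t_end) by stepping dT = c (H - T) dt (Euler's method)."""
    dt = t_end / steps
    T = L
    for _ in range(steps):
        T += dt * c * (H - T)
    return T


L_room, H_air, c = 15.0, 95.0, 0.5  # illustrative values, not from the text

for t in (0.0, 1.0, 2.0, 4.0, 8.0):
    T = exact_temperature(t, L_room, H_air, c)
    rate = c * (H_air - T)  # dT/dt at time t
    print(f"t = {t:4.1f}  T = {T:7.3f}  dT/dt = {rate:7.3f}")

print("Euler at t = 4:", euler_temperature(4.0, L_room, H_air, c, 100_000))
```

The printed rate $dT/dt$ shrinks as $T$ approaches $H$, which is the quantitative content of the story about the sleepy audience.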
A basic problem in studying a differential equation is to establish the existence and the
uniqueness of its solution in order to make sure that we are not talking nonsense.
One may think that the purpose of building a theory of integration is to provide an answer
to the problem concerning the existence and uniqueness of the solution to the following
differential equation
\[ \frac{du}{dx} = f(x) \tag{3} \]
satisfying the initial condition $u(a) = C$, where $a$, $C$ are given real numbers and $f$
is a given function. The theory tells us that under some mild condition on $f$, such as
continuity, a unique solution exists, and we write it in the form
\[ u(x) = \int_a^x f(t)\,dt + C. \tag{4} \]
(Notice that $t$ here is a dummy variable. It can be replaced by any letter. If we replace $t$
by $\xi$, we have $u(x) = \int_a^x f(\xi)\,d\xi + C$. The letter $x$ cannot be used to replace $t$ because
it appears somewhere else in this expression.)
You may rewrite (3) as $f(t) = \frac{du}{dt} \equiv u'(t)$ and substitute this into (4) to get
$u(x) = C + \int_a^x u'(t)\,dt$. Replace $C$ by $u(a)$. Then
\[ u(x) = u(a) + \int_a^x u'(t)\,dt. \tag{5} \]
This identity is the content of the so-called fundamental theorem of calculus. Also,
(4) gives $u'(x) = \frac{d}{dx} \int_a^x f(t)\,dt$. On the other hand, (3) says $u'(x) = f(x)$. So
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x). \tag{6} \]
Both (5) and (6) should be kept in mind.
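Both identities can be tested numerically on a concrete function. The sketch below takes $u = \sin$, so that $u' = \cos$, on an interval starting at $a = 0$. The helper `midpoint_integral`, the number of subintervals and the step $h$ are choices made here for illustration, and the midpoint rule is just one convenient way to approximate an integral.

```python
import math


def midpoint_integral(f, a, x, n=10_000):
    """Approximate the integral of f from a to x with n midpoint rectangles."""
    width = (x - a) / n
    return width * sum(f(a + (j + 0.5) * width) for j in range(n))


a, x = 0.0, 1.2  # illustrative endpoints

# Identity (5) with u = sin, so that u' = cos.
rhs_5 = math.sin(a) + midpoint_integral(math.cos, a, x)
error_5 = abs(math.sin(x) - rhs_5)

# Identity (6) with f = cos: differentiate the integral numerically in x.
h = 1e-4
derivative_of_integral = (
    midpoint_integral(math.cos, a, x + h) - midpoint_integral(math.cos, a, x - h)
) / (2 * h)
error_6 = abs(derivative_of_integral - math.cos(x))

print("error in (5):", error_5)
print("error in (6):", error_6)
```

Both errors come out tiny, limited only by the accuracy of the numerical integration and differentiation.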
QUESTION 20. $\frac{d}{dx} \int_2^{x^2} \frac{\ln t}{t}\,dt = {?}$
QUESTION 21. $\frac{d}{dx} \int_4^{x^2} \frac{\ln t}{t}\,dt = {?}$
QUESTION 22. $\int_1^x \frac{d}{dt}\left(\frac{\ln t}{t}\right) dt = {?}$
EXERCISE 23. Derive the following “integration by parts formula”:
\[ \int_a^b u(x) v'(x)\,dx = u(b)v(b) - u(a)v(a) - \int_a^b u'(x) v(x)\,dx. \]
Try to do this yourself, without copying from your textbook!
Now we give a geometric interpretation of the integral $\int_a^b f(t)\,dt$. Let us recall the
differential equation $du/dt = f(t)$. Letting $u' = f$ and $x = b$ in (5) above, we have
$u(b) - u(a) = \int_a^b f(t)\,dt$. Let $\varepsilon$ be an arbitrary positive number (as small as you like).
Now $du/dt = f(t)$ means that there is a positive number $\gamma = \gamma(t)$ depending on $t$, such
that, whenever $[r, s]$ is a closed interval containing $t$ and contained in $(t - \gamma, t + \gamma)$
(that is, $t - \gamma < r \le t \le s < t + \gamma$ with $r < s$), we have
\[ \left| \frac{u(s) - u(r)}{s - r} - u'(t) \right| < \varepsilon, \]
or, in view of $du/dt = f$, we have
\[ |u(s) - u(r) - f(t)(s - r)| < \varepsilon (s - r) \quad \text{if} \quad t - \gamma(t) < r \le t \le s < t + \gamma(t), \ r < s. \tag{7} \]
By a tagged partition of the interval $[a, b]$ we mean a subdivision of $[a, b]$ into smaller
intervals by points $x_0, x_1, \ldots, x_n$ such that
\[ a = x_0 < x_1 < \cdots < x_n = b, \]
together with points $t_1, t_2, \ldots, t_n$, called tags, such that the $j$th tag $t_j$ belongs to the $j$th
interval $[x_{j-1}, x_j]$. Such a tagged partition gives rise to an approximate sum
\[ S = \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \tag{8} \]
called a Riemann sum. How well does it approximate $\int_a^b f(t)\,dt$? Well, it depends on
how fine the tagged partition is. Assume that this tagged partition is $\gamma$-fine in the sense
that $[x_{j-1}, x_j]$ is contained in $(t_j - \gamma(t_j), t_j + \gamma(t_j))$ for $j = 1, \ldots, n$. By (7),
\begin{align*}
\left| \int_a^b f(t)\,dt - \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \right|
&= \left| u(b) - u(a) - \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \right| \\
&= \left| \sum_{j=1}^{n} \bigl(u(x_j) - u(x_{j-1})\bigr) - \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \right| \\
&\le \sum_{j=1}^{n} \left| u(x_j) - u(x_{j-1}) - f(t_j)(x_j - x_{j-1}) \right|
\le \sum_{j=1}^{n} \varepsilon (x_j - x_{j-1}) = \varepsilon (b - a).
\end{align*}
Since $\varepsilon > 0$ is arbitrary and $b - a$ is fixed, we can make $\varepsilon(b - a)$ as small as we like.
QUESTION 24. Why is $u(b) - u(a)$ equal to $\sum_{j=1}^{n} \bigl(u(x_j) - u(x_{j-1})\bigr)$?
Geometrically, the Riemann sum here gives an approximation of the area of the region in
the $xy$-plane enclosed by the graph of $f$, the $x$-axis and the vertical lines $x = a$ and $x = b$
(here we assume $f \ge 0$). [The argument here is suggested by a modern theory of Riemann-type
integration, called gauge integration or Kurzweil–Henstock integration.]
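To see a Riemann sum approach $u(b) - u(a)$ in practice, here is a minimal sketch under simplifying assumptions: $u(x) = x^3$ and $f = u' = 3x^2$ on $[0, 1]$, equally spaced division points, and left endpoints as tags. An equally spaced partition corresponds to a constant $\gamma$, which is only a special case of the $\gamma$-fine partitions above. The names `tagged_partition` and `riemann_sum` are made up for this illustration.

```python
def u(x):
    return x ** 3


def f(x):
    # f = u', so the integral of f over [a, b] equals u(b) - u(a).
    return 3 * x ** 2


def tagged_partition(a, b, n):
    """Equally spaced points a = x_0 < ... < x_n = b, tagged by left endpoints."""
    points = [a + (b - a) * j / n for j in range(n + 1)]
    tags = points[:-1]
    return points, tags


def riemann_sum(f, points, tags):
    """Sum of f(t_j) (x_j - x_{j-1}) over the tagged partition."""
    return sum(f(t) * (right - left)
               for left, right, t in zip(points, points[1:], tags))


a, b = 0.0, 1.0
errors = []
for n in (10, 100, 1000):
    points, tags = tagged_partition(a, b, n)
    error = abs((u(b) - u(a)) - riemann_sum(f, points, tags))
    errors.append(error)
    print(f"n = {n:5d}  |u(b) - u(a) - S| = {error:.6f}")
```

The error shrinks as the partition gets finer, in line with the bound $\varepsilon(b - a)$.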
The general initial value problem for ordinary differential equations is of the form
\[ \frac{du}{dx} = f(x, u), \quad \text{with } u(x_0) = y_0. \]
Some mild condition on $f$ (such as the Lipschitz condition) will give us the existence and
uniqueness of the solution to this problem.
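When no formula for the solution is available, an initial value problem of this form can still be solved approximately by stepping forward along the slope $f(x, u)$. The sketch below uses Euler's method, a standard scheme that the text itself does not discuss; the function name `euler` and the step count are choices made here. It is applied to $du/dx = u$ with $u(0) = 1$, whose value at $x = 1$ should be close to $e$.

```python
import math


def euler(f, x0, y0, x_end, steps):
    """Approximate u(x_end) for du/dx = f(x, u), u(x0) = y0, by Euler's method."""
    h = (x_end - x0) / steps
    x, u = x0, y0
    for _ in range(steps):
        u += h * f(x, u)  # follow the slope f(x, u) over a step of length h
        x += h
    return u


approx_e = euler(lambda x, u: u, 0.0, 1.0, 1.0, 100_000)
print("Euler approximation of u(1):", approx_e)
print("math.e                     :", math.e)
```

Increasing `steps` brings the approximation closer to $e$, at the cost of more arithmetic.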
PROBLEM 25. Assuming that $u = e^x$ is the unique solution to the initial value problem
$du/dx = u$ with $u(0) = 1$, derive the identity $e^{a+b} = e^a e^b$.