1.2 Prelude to Instantaneous Rates of Change

CHAPTER 1. RATES OF CHANGE AND THE DERIVATIVE
Let’s return now to the situation of a car, moving along a straight road, on which we’ve placed
a coordinate axis. Let p(t) denote the position of the car, in miles, at time t hours, from some
initial time. How do you determine the velocity of the car at time t = 1 hour?
In light of our discussions in Section 1.1, you may be asking “Do you mean the average
velocity or the instantaneous velocity?” However, we wrote “the velocity at time t = 1 hour”,
not the velocity between two times or on some time interval. When we ask for the rate of change
of one quantity p (or y, or any other variable name) with respect to another quantity t (or x,
or any other variable) at or when t equals a particular value, that means that we are asking
for the instantaneous rate of change, the IROC, whether the term “instantaneous” explicitly
appears or not. Why? Simply because we can’t mean the average rate of change, the AROC, if
we don’t specify two values, or an interval, for the independent variable.
Okay. So, how do you determine the instantaneous velocity of the car at time t = 1 hour?
From inside the car, it’s easy: look at the speedometer of the car when t = 1 hour (and also
note whether you are traveling in the positive or negative direction). Our real question is: how
does someone outside the car measure the instantaneous velocity of the car? Actually, our
real question is: what does instantaneous velocity even mean?
We have some intuitive concept of velocity, and we believe that the speedometer of the car is
measuring something. What is instantaneous velocity? We know what average velocity means;
it’s the average rate of change of the position, with respect to time. Can we use our definition
of average velocity to arrive at a definition of instantaneous velocity? Let’s think about it. Go
back to Example 1.1.1, where the car was at mile marker 37 at exactly noon, and was at mile
marker 38 at exactly 12:02 pm. Can we say what the velocity of the car was at noon?
Certainly not, at least not from this data. As we calculated in Example 1.1.1, the average
velocity between noon and 12:02 pm would be 30 mph, but maybe the car was moving more
slowly than this at noon and sped up by 12:02 pm, or was going faster at noon and slowed
down by 12:02 pm, or sped up and slowed down multiple times in the intervening 2 minutes.
The point is that 2 minutes of time is easily long enough for the velocity of the car to change
appreciably, so that the average velocity of the car over a 2-minute period need not even be a
good approximation of the actual (instantaneous) velocity at noon.
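The arithmetic behind the 30 mph figure can be sketched in a few lines of Python. The function name `average_velocity` and its parameter names are our own, chosen only for illustration; the mile markers and times are those recalled above from Example 1.1.1.

```python
def average_velocity(p_start, p_end, t_start, t_end):
    """Average rate of change of position with respect to time."""
    return (p_end - p_start) / (t_end - t_start)

# Mile marker 37 at noon, mile marker 38 at 12:02 pm.
# Times are measured in hours after noon, so 2 minutes is 2/60 of an hour.
v_avg = average_velocity(37, 38, 0, 2 / 60)
print(round(v_avg, 6))  # about 30 (mph), up to floating-point rounding
```

Nothing in this calculation uses any information about what the car did between the two times, which is exactly why it cannot tell us the velocity at noon.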
To get a good approximation of the velocity (as measured by the speedometer, and noting the
direction of travel) at noon, we would need to calculate the average velocity between noon and
some time that is so close to noon that we don’t believe that the car’s velocity could have changed
much in the tiny time interval. Suppose, instead of looking at the car’s position 2 minutes later,
we looked at the car’s position 10 seconds later, and calculated the average velocity between noon
and 12:00:10 pm. This still doesn’t seem like a small enough time interval to get a convincing,
decent approximation of the actual velocity at noon. Maybe the driver braked severely between
12:00:01 pm and 12:00:10 pm, so that the average velocity between noon and 12:00:10 pm was
far lower than the car’s actual velocity at noon.
Certainly, 1/10th of a second seems like so small an interval of time that the car’s velocity
would be essentially unaffected by slamming on the brakes or stomping on the accelerator.
Assuming that this is true, we could say that the instantaneous velocity of the car at noon is
approximately the average velocity of the car between noon and the time 1/10 of a second later.
However, using the average velocity over a 1/10th of a second interval to approximate the
instantaneous velocity should feel unsatisfying for at least two reasons. First, why stop at
1/10th of a second? Surely we’d get an even better approximation if we used smaller intervals,
like 1/100th of a second or 1/1000th of a second. Second, in order to decide that 1/10th of
a second was “good enough”, we had to know something about the physical properties of a car,
i.e., we needed to know that the car’s velocity could not be changed any significant amount in
1/10th of a second. But, what if we were discussing the velocity of other objects, like bullets,
atoms, or photons? How do we know, in all cases, when a time interval is small enough so that
the average velocity yields a reasonable approximation of the instantaneous velocity?
An answer that may occur to you is to simply use a time interval of zero. Unfortunately,
this doesn’t work. If we try it, we get that the average rate of change of the position function
p(t) between t = a and t = a would be
\[
\frac{p(a) - p(a)}{a - a} = \frac{0}{0},
\]
which is undefined. Maybe we could take the time interval between t equals some number and
the next biggest real number. Again, this doesn’t work; there is no “next biggest real number”.
Great. So, what do we do?
The answer is that we take the average velocity between times t = a and t = a + h, where h
is a variable, unequal to zero. We then see if this average velocity gets arbitrarily close to some
number v, if h is “close enough” to 0. If there is such a number v, then we call that number the
instantaneous velocity at t = a.
There is no reason to restrict ourselves here to velocity, which is the rate of change of position,
with respect to time. If y = f(x) is any function, we look at the average rate of change of f,
with respect to x, between x = a and x = a + h, where h ≠ 0, and we see if this AROC gets
arbitrarily close to some number L, as h gets close enough to 0. If there is such a number L, we
say that the instantaneous rate of change of f, with respect to x, at x = a exists and is equal
to L.
To make this precise, we shall need the notion of a limit, which we discuss in Section 1.3 and
in Section 1.A. However, in this section, we wish to give a number of preliminary examples. Note
that, in order to calculate the AROC of f between x = a and x = a + h, as h gets arbitrarily small,
we must know the values of f for an infinite number of values of the independent variable. This
means that a list or table of f values is not enough; we typically need a mathematical formula
for f.
In these examples, the AROC that we are considering will, of course, be a function of h. It is
very cumbersome to write over and over that some function of h, call it q(h), gets arbitrarily close
to some number L, as h gets close enough to 0. Thus, we will go ahead and adopt terminology
and notation that we will not carefully explain until Section 1.3.
Definition 1.2.1. (Preliminary “Definition” of Limit, IROC, and Derivative) If a function
q(h) gets arbitrarily close to some number L, as h gets close enough to 0, then we say
that the limit of q(h), as h approaches 0, exists and is equal to L, and we write
\[
\lim_{h \to 0} q(h) = L.
\]
Suppose we have y = f(x) and we let q(h) be the average rate of change of f, with respect
to x, between x = a and x = a + h, i.e.,
\[
q(h) = \frac{f(a + h) - f(a)}{(a + h) - a} = \frac{f(a + h) - f(a)}{h}.
\]
If, for this particular q(h), $\lim_{h \to 0} q(h) = L$, then we say that the instantaneous rate of
change of f, with respect to x, at x = a, exists and equals L.
This instantaneous rate of change of f, with respect to x, at x = a is also called the derivative of f at a and is denoted by f′(a).
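As an informal illustration of what “q(h) gets arbitrarily close to L” looks like numerically, here is a minimal Python sketch. The helper name `difference_quotient`, the choice of the squaring function for f, and the choice a = 1 are all our own assumptions, made only for the sake of the example; a table of values like this suggests a limit but does not prove that one exists.

```python
def difference_quotient(f, a, h):
    """AROC of f between x = a and x = a + h (h is assumed nonzero)."""
    return (f(a + h) - f(a)) / h

def square(x):
    # Illustrative choice of f; any formula could be used here.
    return x * x

# Both positive and negative values of h are allowed, as long as h != 0.
for h in [0.1, 0.01, 0.001, -0.001, -0.01, -0.1]:
    print(h, difference_quotient(square, 1.0, h))
```

For this particular f and a, algebra shows that q(h) = 2 + h, so the printed values cluster around 2 as h shrinks, from either side.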
It is tempting to look at Definition 1.2.1 and think “Ah - to calculate the limit as h approaches 0, and so to calculate the IROC, I simply have to plug in 0 for h.”
However, this is clearly not what we want to do; if we were to put in 0 for h in the
expression (f(a + h) − f(a))/h, then we would obtain the undefined quantity 0/0. We must
do some manipulations to somehow eliminate the division by h before we can “plug in”
h = 0 and, even then, to know that plugging in h = 0 agrees with the limit as h approaches
0, we must use that the function under consideration is continuous everywhere that it is
defined. We shall discuss this at length in Section 1.3.
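The warning above can be seen concretely in a short Python sketch. The function f (the squaring function), the point a = 1, and the names `q` and `q_simplified` are illustrative assumptions of ours, not part of the definition.

```python
def f(x):
    # Illustrative choice: the squaring function.
    return x * x

def q(h, a=1.0):
    """Raw difference quotient; no algebra has been done yet."""
    return (f(a + h) - f(a)) / h

def q_simplified(h, a=1.0):
    """For f(x) = x^2, expanding and cancelling h gives q(h) = 2a + h for h != 0."""
    return 2 * a + h

try:
    q(0.0)
except ZeroDivisionError:
    print("plugging h = 0 into the raw quotient gives 0/0, which is undefined")

# After the division by h has been eliminated, h = 0 can be plugged in.
print(q_simplified(0.0))  # 2.0
```

The two functions agree for every nonzero h; only the simplified form has a value at h = 0, and that value is the number that q(h) approaches.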
Example 1.2.2.
Let’s look again at Example 1.1.7, in which we had a widescreen television, which had area
A(d) = 144d²/337 in², where d is the diagonal length in inches. What is the instantaneous
rate of change, the IROC, of A, with respect to d, when d = 40 in?
As before, let us write c for the constant 144/337, simply to cut down on how much we have
to write. So, A = cd². We wish to calculate the AROC of A, between d = 40 and d = 40 + h,
where h ≠ 0, and then see what happens to this AROC as h gets close to 0.
The AROC of A with respect to d between d = 40 and d = 40 + h is
\[
\frac{c(40 + h)^2 - c(40)^2}{(40 + h) - 40} = \frac{c\left[\left((40)^2 + 80h + h^2\right) - (40)^2\right]}{h} = c(80 + h) \ \text{in}^2/\text{in}.
\]
Does the limit of this AROC, as h approaches 0, exist, i.e., does the IROC at d = 40 exist? Yes.
Namely,
\[
\lim_{h \to 0} c(80 + h) = c \cdot 80 \approx 34.18398 \ \text{in}^2/\text{in}.
\]
Of course, we have not, at this point, proved any results about limits. We are simply
appealing to your intuition that c(80 + h) gets as close to c · 80 as we want by taking h close
enough to 0. How close is “close enough”? As we shall see in Section 1.3 and Section 1.A, that
depends on how close we want c(80 + h) to be to c · 80.
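To see this intuition in numbers, the following Python sketch tabulates the AROC of A between d = 40 and d = 40 + h for a few shrinking values of h. The function names `area` and `aroc`, and the particular list of h values, are our own choices for illustration.

```python
c = 144 / 337  # the constant in A(d) = c * d**2

def area(d):
    """Screen area in square inches for diagonal length d in inches."""
    return c * d ** 2

def aroc(d, h):
    """Average rate of change of the area between d and d + h (h nonzero)."""
    return (area(d + h) - area(d)) / h

limit_value = c * 80  # the value found above, about 34.18398
for h in [1, 0.1, 0.01, 0.001, 0.0001]:
    gap = abs(aroc(40, h) - limit_value)
    print(f"h = {h:<7} AROC = {aroc(40, h):.5f}   gap = {gap:.5f}")
```

Since the AROC equals c(80 + h), the gap between the AROC and c · 80 is c · |h|, up to rounding in the computer arithmetic; making h smaller makes the gap as small as we like.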
What about the IROC at d = 52 inches? We do a similar calculation:
\[
\frac{c(52 + h)^2 - c(52)^2}{(52 + h) - 52} = \frac{c\left[\left((52)^2 + 104h + h^2\right) - (52)^2\right]}{h} = c(104 + h) \ \text{in}^2/\text{in}.
\]
Does c(104 + h) get arbitrarily close to some number as h approaches 0? Certainly. It approaches
c(104) = (144/337)(104) in²/in. Therefore, we say that the instantaneous rate of change of the
area of the television screen, with respect to the diagonal length, when d = 52 inches, exists and
is equal to this number of square inches per inch.
Notice that the algebra that we had to do in the two calculations above was essentially the
same in each case. We could have saved time and space, and calculated the IROC of A, with
respect to d, for every possible d value, by simply leaving d as a variable in the calculation. We
find that the IROC of A with respect to d, at each value of d, is
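\[
\begin{aligned}
A'(d) &= \lim_{h \to 0} \frac{A(d + h) - A(d)}{h} = \lim_{h \to 0} \frac{c(d + h)^2 - cd^2}{h} \\
&= \lim_{h \to 0} c \cdot \frac{(d^2 + 2dh + h^2) - d^2}{h} = \lim_{h \to 0} c(2d + h) = 2cd = \frac{288}{337}\, d \approx 0.85460\, d \ \text{in}^2/\text{in}.
\end{aligned}
\]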
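As a quick sanity check of the general formula, the sketch below compares A′(d) = 2cd with a difference quotient at several values of d. The step size h = 10⁻⁶ and the sample values of d are arbitrary choices of ours; the comparison is only a numerical illustration, not a proof.

```python
c = 144 / 337

def A(d):
    return c * d ** 2

def A_prime(d):
    """The formula derived in the text: A'(d) = 2cd = (288/337) d."""
    return 2 * c * d

h = 1e-6  # a small, nonzero step (an arbitrary choice)
for d in [20, 40, 52]:
    quotient = (A(d + h) - A(d)) / h
    print(d, round(quotient, 4), round(A_prime(d), 4))
```

At d = 40 and d = 52 the formula reproduces the two values c · 80 and c · 104 computed separately above, which is the time and space saving mentioned in the text.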
As we saw in the example above, it was convenient to discuss the IROC of A(d), with respect
to d, at arbitrary values of d, i.e., it was convenient to just leave d as a variable in the derivative.
Thus, we make the following definition, even before we have a rigorous definition of the limit.
We restate this definition, a bit more carefully, in Definition 1.4.2, after we have investigated
limits in Section 1.3. We should mention that it has become common practice to let the variable
h denote ∆x (or the change in whatever the independent variable is) in this definition.
Definition 1.2.3. Suppose we have y = f(x). Then, the new function f′, given by
\[
f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},
\]
is called the derivative of f, with respect to x, and is the instantaneous rate of change
of f, with respect to x, for any value of x for which the limit exists.
Remark 1.2.4. You may look at Definition 1.2.3 and think “what’s the difference between
what’s written for f′(x) in Definition 1.2.3 and the definition of f′(a) in Definition 1.2.1, other
than that Definition 1.2.3 has an x where f′(a) has an a?”.
It is true that, in Definition 1.2.1, we defined f′(a) by
\[
f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h},
\]
and a can be anything, just as x can be anything. So what’s the point of putting in an a instead
of x?
The point is that we frequently discuss functions in a convenient, but technically imprecise
way, and replacing the variable x with a different letter helps avoid confusion.
Consider, for instance, the function f(x) = x². The actual function is simply f, the squaring
function. The x is what’s referred to as a “dummy variable”; it’s simply there as a named
placeholder, but it doesn’t matter what the name is. The function given by f(t) = t² is the
same as the function given by f(x) = x². They are both the squaring function. The expression
f(x) is actually the value of the function at x; it is technically a real number, not a function.
And yet, we frequently write f(x) in place of the function f, or we write simply “the function
x²”, assuming that the reader will know that we mean the function f defined by f(x) = x²,
where x is just a dummy variable.
But if we’re going to use x² to denote the squaring function, then what do we write when
we want to indicate simply a single value of f, not the function f, after we’ve plugged in a
number that could be anything? The answer is that we write something like “consider the value
of x², when x = a”. This does exactly mean consider a², but the switch from our standard
variable names, like x and t, is supposed to let the reader know that a² really means the value
a², not the function f(a) = a², which would just be the squaring function again.
For more details on functions, and their domains and codomains, see Subsection 1.A.2.
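The distinction between a function and one of its values has a loose analogue in programming, sketched below in Python. The names `f`, `g`, `samples`, and `value` are our own, and checking a handful of sample inputs only illustrates, and does not prove, that two rules define the same function.

```python
def f(x):
    return x ** 2

def g(t):
    # Same rule as f; only the name of the placeholder has changed.
    return t ** 2

# f and g agree at every input we try: both are the squaring function.
samples = [-2, 0, 0.5, 3]
print(all(f(s) == g(s) for s in samples))  # True

a = 3
value = f(a)  # a single number: the value of the function at a
print(callable(f), callable(value))  # True False
```

Here `f` is the function itself, something that can be applied to inputs, while `f(a)` is just a number once a has been fixed.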
Now that we have finished with that technical discussion, let’s look at what happens to secant
lines (Definition 1.1.9) as we take limits. Is there a graphical way to see the instantaneous rate
of change, in addition to the average rates of change?
Example 1.2.5.
Let’s return to Example 1.2.2 above, where we considered the function A = A(d) = (144/337)d².
The red lines in Figure 1.4 are the secant lines of A for the pairs d = 20 and 55, d = 20 and
40, and d = 20 and 27. That is, we have fixed one d-value, d = 20, and let the second d-value
get closer and closer to d = 20. We cannot let the second d-value get too close to d = 20 and
continue to see changes on the graph. If you are viewing this electronically, and can view videos,
clicking on the graph in Figure 1.4 will produce an animation. Otherwise, you should be able
to imagine the red lines approaching the fixed blue line as the second d-value approaches 20.
Figure 1.4: Limits of secant lines. (Plot of A, from 0 to 1500, against d, from 0 to 60.)
The blue line appears to glance off of the graph at the point (20, A(20)). Its slope is the
limit of the slopes of the secant lines. But the slopes of the secant lines are the AROC’s of A
between d = 20 and d = 20 + h, for values of h other than 0. Hence, the slope of the blue line
is the limit of the AROC’s, i.e., the blue line, the tangent line to the graph of A, where d = 20,
is the unique line passing through the point (20, A(20)) with slope given by the instantaneous
rate of change of A, with respect to d, at d = 20.
We calculated in Example 1.2.2 that the IROC of A, with respect to d, for any value of d,
was given by (288/337)d. Therefore, the slope of the tangent line to the graph of A, where
d = 20, is (288/337)(20) ≈ 17.09199. In addition, A(20) = (144/337)(20)² ≈ 170.91988. Using
the point-slope form for an equation for a line, we find that the tangent line to the graph of A,
where d = 20, is given by the equation
A − (57600/337) = (5760/337)(d − 20),
which would be approximated very closely by the line with equation
A − 170.91988 = 17.09199(d − 20).
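The way the secant slopes settle toward the tangent slope can be checked numerically. In the Python sketch below, the second d-values 55, 40, and 27 are the ones used for the secant lines in this example; the further values 21, 20.1, and 20.001, and the function names, are our own additions for illustration.

```python
c = 144 / 337

def A(d):
    return c * d ** 2

def secant_slope(d1, d2):
    """Slope of the secant line of A through d = d1 and d = d2 (the AROC)."""
    return (A(d2) - A(d1)) / (d2 - d1)

# Fix d = 20 and move the second d-value toward 20.
for d2 in [55, 40, 27, 21, 20.1, 20.001]:
    print(d2, round(secant_slope(20, d2), 5))

tangent_slope = (288 / 337) * 20  # the IROC at d = 20, about 17.09199

def tangent_line(d):
    """Point-slope form: A - A(20) = tangent_slope * (d - 20)."""
    return A(20) + tangent_slope * (d - 20)

print(round(tangent_slope, 5), round(tangent_line(20), 5))
```

The printed secant slopes decrease toward the tangent slope as the second d-value approaches 20, which is the numerical counterpart of the red lines approaching the blue line.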
In the example below, we will look at a function whose derivative does not exist at a particular
point. However, you should keep in mind that our main focus in differential Calculus is on
functions whose derivatives exist.
How can the instantaneous rate of change fail to exist at a point where the function exists?
Consider the following example.
Example 1.2.6. Recall the lobster cost function C(w) from Example 1.1.14. In that example,
we calculated the AROC of C, with respect to w, on the interval [2, 2.1], and found that it was
large: $28/lb. In fact, we claim that C′(2), the instantaneous rate of change of C, with respect
to w, at w = 2, does not exist.
We need to look at
\[
C'(2) = \lim_{h \to 0} \frac{C(2 + h) - C(2)}{h}.
\]
Thus, we are interested in the values of C(w), when w is close to 2. Recall that C(w) = 7w if
1.5 < w ≤ 2, and C(w) = 8w if 2 < w ≤ 3. Therefore, if 0 < h < 1, then 2 < 2 + h < 3, and so
the average rate of change of C, with respect to w, on the interval [2, 2 + h] is
\[
\frac{C(2 + h) - C(2)}{h} = \frac{8(2 + h) - 7 \cdot 2}{h} = \frac{2 + 8h}{h} = \frac{2}{h} + 8 \ \text{\$/lb}. \tag{1.2}
\]
Note that, by plugging in h = 0.1, we recover our earlier result that the AROC of C on the
interval [2, 2.1] is $28/lb.
But what happens as we let h in Formula 1.2 approach 0 (but always choosing h > 0)? The
2/h portion gets arbitrarily large; we also say that it increases without bound. As a brief way of
expressing this, we say that 2/h approaches (positive) infinity as h approaches 0 from the right,
and write that 2/h → ∞ as h → 0⁺. The phrase “from the right” is used because we usually
picture the positive direction on a coordinate axis to be to the right of the origin. We shall make
all of this precise in Section 1.3.
Hence, the IROC of C, with respect to w, at w = 2 does not exist, for there is no real number
that is being approached by the AROC between w = 2 and w = 2 + h, as h approaches 0 from
the right.
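The unbounded growth of these average rates of change is easy to observe numerically. The Python sketch below encodes only the two pieces of C(w) recalled in this example; the function names and the list of h values are our own choices, and the error raised outside 1.5 < w ≤ 3 simply reflects that no other pieces of C are recalled here.

```python
def C(w):
    """Lobster cost in dollars, using only the two pieces recalled in this example."""
    if 1.5 < w <= 2:
        return 7 * w
    if 2 < w <= 3:
        return 8 * w
    raise ValueError("C(w) is only recalled here for 1.5 < w <= 3")

def aroc_from_2(h):
    """AROC of C on [2, 2 + h], for 0 < h <= 1."""
    return (C(2 + h) - C(2)) / h

# Compare the computed AROC with the formula 2/h + 8 as h shrinks toward 0.
for h in [0.1, 0.01, 0.001, 0.0001]:
    print(h, round(aroc_from_2(h), 2), round(2 / h + 8, 2))
```

Each tenfold decrease in h makes the AROC roughly ten times larger, so the values do not settle near any real number.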
Remark 1.2.7. In business and economics, derivatives are discussed all the time; the derivative
of a function f(x), with respect to x, is referred to as the marginal value of f (with respect to x).
The marginal cost, marginal revenue, and marginal profit are all very important in the business
world.
When we first gave Example 1.1.14, we had a price per pound for lobsters, and then the
total price for a lobster of weight w; in order to avoid confusion over which “price” we were
discussing, we referred to the total price as the “total cost”, and denoted this by C(w). In
business terminology, this is very bad. “Price” is what you sell something for, “cost” is how
much the item costs you, the seller, to produce or acquire. (The revenue, per item, would be
the same as the price, after the sale is made, assuming the sale is made at the given price.) So,
really, we should refer to C(w) in Example 1.1.14 and Example 1.2.6 as the price of a lobster
of weight w pounds and, using the “marginal” terminology, what we showed in Example 1.2.6
would be described by saying that the marginal price per pound of a lobster of weight 2 pounds is infinite. Such
infinite marginal quantities are not discussed often; as we mentioned above,
we typically deal with cases where the derivatives actually exist.
Before we look at two final examples in this section, we want to clearly define instantaneous
velocity, speed, and acceleration, using our definitions of the average velocity, speed, and acceleration, together with our preliminary definitions of limit and derivative in Definition 1.2.1. Of
course, while these definitions are the actual definitions, they won’t technically have rigorous
mathematical meaning until after the next section, in which we give the formal definition of the
limit.
Definition 1.2.8. The instantaneous velocity of an object is the instantaneous rate of
change of the position of the object, with respect to time, as defined in Definition 1.2.3.
Thus, if p(t) is the position of the object as a function of the time t, then the instantaneous
velocity of the object at time t is p′(t).
The instantaneous acceleration of an object is the instantaneous rate of change of the
velocity of the object, with respect to time. Hence, if v(t) is the velocity of the object as a
function of the time t, then the instantaneous acceleration of the object at time t is v′(t).
The instantaneous speed of an object is the instantaneous rate of change of the distance
traveled by the object, with respect to time. Therefore, if d(t) is the distance that the object
has traveled, as a function of the time t, then the instantaneous speed of the object at time
t is d′(t). This is equivalent to defining instantaneous speed to be the magnitude of the
instantaneous velocity; for motion in a straight line, this means that the instantaneous speed
is the absolute value of the instantaneous velocity.
Remark 1.1.18 was important enough that we repeat some of it here.
Remark 1.2.9. You must be careful. As Example 1.1.17 shows, and as we discussed in
Remark 1.1.18, even if an object is traveling in a straight line, if the direction of motion
changes, the average speed need not be the absolute value of the average velocity.
We will give two more examples of using our intuitive notion of limits to calculate IROC’s,
one fairly easy example and one hard one. The point of the hard example is to show you that
the algebra involved in calculating limits can be quite involved.
Example 1.2.10. Suppose that a particle is moving along a coordinate axis in such a way that
its position p(t), in meters, at time t seconds, is given by
\[
p(t) = 5t^2 - 40t.
\]
What is the (instantaneous) velocity of the particle at an arbitrary time t seconds? When does
the particle “stop” for an instant of time?
Solution:
The instantaneous velocity of the particle is the instantaneous rate of change of the position,
with respect to time. This is the limit of the average velocities of the particle, as the interval
of time approaches zero, i.e., the velocity v(t), in m/s, at time t seconds, provided it exists, is
given by
\[
\begin{aligned}
v(t) = p'(t) &= \lim_{h \to 0} \frac{p(t + h) - p(t)}{h} = \lim_{h \to 0} \frac{\left[5(t + h)^2 - 40(t + h)\right] - \left[5t^2 - 40t\right]}{h} \\
&= \lim_{h \to 0} \frac{5t^2 + 10th + 5h^2 - 40t - 40h - 5t^2 + 40t}{h} = \lim_{h \to 0} \frac{10th + 5h^2 - 40h}{h} \\
&= \lim_{h \to 0} \frac{(10t + 5h - 40)h}{h} = \lim_{h \to 0} (10t + 5h - 40) = 10t - 40 \ \text{m/s}.
\end{aligned}
\]
We have found that the velocity v(t) = 10t − 40 m/s. When does the particle stop? When
its velocity is zero. We set 10t − 40 equal to 0, and solve for t to find that the particle is stopped
at the instant t = 4 seconds.
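A numerical check of this result can be sketched as follows. The step h = 10⁻⁶, the sample times, and the function names are arbitrary choices of ours; the check only illustrates the agreement and is not a substitute for the limit calculation above.

```python
def p(t):
    """Position in meters at time t seconds."""
    return 5 * t ** 2 - 40 * t

def v(t):
    """Velocity in m/s found in the solution above."""
    return 10 * t - 40

def avg_velocity(t, h):
    """Average velocity between times t and t + h (h nonzero)."""
    return (p(t + h) - p(t)) / h

h = 1e-6
for t in [0, 2, 4, 6]:
    print(t, round(avg_velocity(t, h), 3), v(t))

# The particle stops when v(t) = 0, i.e. when 10t - 40 = 0.
t_stop = 40 / 10
print(t_stop, v(t_stop))
```

The sign of v(t) also records direction: the velocity is negative before t = 4 seconds and positive afterwards, so the particle reverses direction at the instant it stops.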
Example 1.2.11. Suppose that a particle is moving along a coordinate axis in such a way that
its position p(t), in meters, at time t seconds, where t > 0, is given by
\[
p(t) = \frac{1}{\sqrt{t^3 + 1}}.
\]
What is the (instantaneous) velocity of the particle at an arbitrary time t > 0 seconds?
Solution:
Note that the restriction that t > 0 is present just as a convenient way of avoiding the
division by 0 if we were to let t = −1.
As in the previous example, but now using our new p(t), we need to calculate
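\[
v(t) = p'(t) = \lim_{h \to 0} \frac{p(t + h) - p(t)}{h} = \lim_{h \to 0} \frac{\dfrac{1}{\sqrt{(t + h)^3 + 1}} - \dfrac{1}{\sqrt{t^3 + 1}}}{h} = \lim_{h \to 0} \frac{\sqrt{t^3 + 1} - \sqrt{(t + h)^3 + 1}}{h \sqrt{(t + h)^3 + 1}\, \sqrt{t^3 + 1}}, \tag{1.3}
\]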
where we shall omit the units of m/s until we write our final answer.
It is still unclear at this point that the ugly fraction in Formula 1.3 approaches anything as h
approaches 0; if we simply “plug in” h = 0, we get the meaningless 0/0. Our goal is to eliminate
the division by h, so that we can calculate the limit by just plugging in h = 0. (We remind
you that calculating limits simply by plugging in requires that the function being considered is
continuous. This is something that we shall just assume below. See Section 1.3.)
To eliminate the division by h, we multiply the numerator and denominator of Formula 1.3
by the “conjugate” of the numerator:
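\[
\sqrt{t^3 + 1} + \sqrt{(t + h)^3 + 1}
\]
to obtain that
\[
\begin{aligned}
v(t) &= \lim_{h \to 0} \frac{(t^3 + 1) - \left((t + h)^3 + 1\right)}{h \sqrt{(t + h)^3 + 1}\, \sqrt{t^3 + 1}\, \left[\sqrt{t^3 + 1} + \sqrt{(t + h)^3 + 1}\right]} \\
&= \lim_{h \to 0} \frac{(t^3 + 1) - (t^3 + 3t^2 h + 3th^2 + h^3 + 1)}{h \sqrt{(t + h)^3 + 1}\, \sqrt{t^3 + 1}\, \left[\sqrt{t^3 + 1} + \sqrt{(t + h)^3 + 1}\right]} \\
&= \lim_{h \to 0} \frac{-3t^2 h - 3th^2 - h^3}{h \sqrt{(t + h)^3 + 1}\, \sqrt{t^3 + 1}\, \left[\sqrt{t^3 + 1} + \sqrt{(t + h)^3 + 1}\right]}.
\end{aligned}
\]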
Now, at last, we can cancel the h factor in the denominator with an h factor in the numerator
to obtain
\[
v(t) = \lim_{h \to 0} \frac{-3t^2 - 3th - h^2}{\sqrt{(t + h)^3 + 1}\, \sqrt{t^3 + 1}\, \left[\sqrt{t^3 + 1} + \sqrt{(t + h)^3 + 1}\right]},
\]
and, finally, we can calculate this limit (of this continuous function) by letting h actually equal
0; we find that the velocity of the particle at any time t > 0 is
\[
v(t) = \frac{-3t^2}{\sqrt{t^3 + 1}\, \sqrt{t^3 + 1}\, \left[\sqrt{t^3 + 1} + \sqrt{t^3 + 1}\right]} = \frac{-3t^2}{2(t^3 + 1)^{3/2}} \ \text{m/s}.
\]
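After this much algebra, it is reassuring to compare the final formula with average velocities over a very short time interval. The Python sketch below does so at a few sample times; the step h = 10⁻⁶, the sample times, and the function names are our own arbitrary choices, and agreement at a few points is only a consistency check, not a proof.

```python
import math

def p(t):
    """Position in meters at time t > 0 seconds."""
    return 1 / math.sqrt(t ** 3 + 1)

def v_formula(t):
    """Velocity in m/s from the final formula: -3t^2 / (2 (t^3 + 1)^(3/2))."""
    return -3 * t ** 2 / (2 * (t ** 3 + 1) ** 1.5)

def avg_velocity(t, h):
    """Average velocity between times t and t + h (h nonzero)."""
    return (p(t + h) - p(t)) / h

for t in [0.5, 1.0, 2.0]:
    print(t, round(avg_velocity(t, 1e-6), 5), round(v_formula(t), 5))
```

The two printed columns agree to the displayed precision, and the negative values reflect that p(t) decreases as t increases.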
In order to make precise all that we have written in this section, we must give a real definition
of limit, a mathematically rigorous definition, and discuss theorems on limits. We do these things
in the next section, Section 1.3.
To calculate instantaneous rates of change, i.e., derivatives, we do not want to have to
perform horrendous algebra or trigonometric manipulations, such as in Example 1.2.11, over and
over again. Hence, we will prove “rules” for calculating derivatives in Chapter 2. Once you
memorize these relatively few rules for calculating derivatives, the calculation of instantaneous
rates of change becomes very simple and quick for many, many functions.
Later, when we have so many rules for calculating derivatives, you may feel that you can
safely forget the definition of the derivative as a limit of average rates of change.
Try to always remember: the rules that we shall develop will help us calculate derivatives
easily, but the reason that the derivative is something that we want to calculate
is because it is the instantaneous rate of change, and the reason it is the instantaneous rate of change is because it is the limit of the average rates of change,
as the change in the independent variable approaches 0.
The point is that, even after we have the rules for calculating derivatives, you should
never forget the definition of the derivative. The definition of the derivative is where all of
its applications come from.
1.2.1 Exercises
For the functions in Exercises 1 through 9, find the average rates of change on the
intervals [x, x + h], for x given in the problem, and h = 1, 0.1, 0.01, 0.001, and 0.0001.
Use this data to estimate the IROC at the point x.
1. f (x) = 7x + 3, at x = 9.
2. f (x) = 11, at x = 2.
3. f(x) = x² − 9, at x = 3.
4. g(x) = 1/√x, at x = 9.
5. k(x) = (2x − 4)³, at x = 1.
6. l(x) = 0 if −∞ < x ≤ 0.5, and l(x) = x + 5 if 0.5 < x < ∞, at x = 0.25.
7. z(x) = (x² − 1)/(x − 1), at x = 0.
8. w(x) = √(25 − x), at x = 16.