RATES OF CHANGE MODULE
GUERSHON HAREL, EVAN FULLER, JEFF RABIN, LAURA STEVENS
1. Introduction
This module is structured around the need to model reality. When seeking a function to
model a natural phenomenon, the data which are typically available to us consist of how
the phenomenon changes. Thus, one of the main purposes of examining rates of change is
to use some information about rate to gain information about a function, a purpose which
is often masked in traditional calculus courses. We have designed a sequence of problems
consistent with this purpose which necessitate ways of thinking critical to understanding
rates of change.
We begin with a set of problems on functions, in particular problems in which the objective
is to describe a physical situation; for example, at any time, what is the population? One
of our primary goals is that students understand functions as models of reality, and of
course that they develop a more sophisticated understanding of functions. In these problems,
attending to rate of change is necessary for determining a model. This, in turn, necessitates
an in-depth study of rates of change, in particular, an exploration of average rate of change
which leads naturally to an intuitive notion of instantaneous rate of change. The need for
communication – in this case the need to communicate to others a precise definition of
“approaches” and “arbitrarily close” – demands the formalization of our intuitive notion,
i.e. the definition of the derivative. With the definition of the derivative in hand, we may
prove properties of functions which follow from properties of their derivatives. Many of
these properties are intuitive, but the need for certainty – to know that something is true –
demands formal proof. Truth alone, however, is not our only aim, and we desire to educate
students to know why something is true – the cause that makes it true – a need we refer to
as the need for causality.
These needs manifest a crucial principle, called the necessity principle. It claims: For
students to learn what we intend to teach them, they must have a need for it, where ‘need’
refers to intellectual need, not only psychological need. Intellectual need has to do with
disciplinary knowledge being born out of people’s current knowledge through engagement in
problematic situations conceived as such by them. Psychological need, on the other hand,
has to do with people’s desire, volition, interest, self-determination, and the like. Indeed,
before one immerses oneself in a problem, one must be willing to engage in the problem
and persist in the engagement. Our focus in this module is on intellectual rather than
psychological needs. As the module unfolds, we urge the reader to contrast this Necessity
approach with the current standards-driven approach in high school teaching.
This module also emphasizes another crucial principle: the repeated reasoning principle.
It claims: Students must practice reasoning in order to internalize, organize, and retain the
mathematics they have learned. Repeated reasoning, not mere drill and practice of routine
problems, is essential to the process of internalization – a state where one is able to apply
knowledge autonomously and spontaneously. The sequence of problems must continually
call for reasoning through the situations and solutions and must respond to the students’
changing intellectual needs.
Lastly, our approach is to carefully attend to subject matter – definitions, theorems, proofs,
problems and their solutions, etc. – as well as to ways of thinking (WoT’s), for example,
problem solving approaches and proof schemes (empirical versus deductive). We will refer
to elements of subject matter as ways of understanding, so as to differentiate them from
ways of thinking. For example, the following are different ways of understanding the phrase
“derivative of a function f at a” or the symbol $f'(a)$: “the slope of a line tangent to the graph
of the function f at a” or “$\lim_{h \to 0} (f(a+h) - f(a))/h$” or “the instantaneous rate of change of
f at a” or “the slope of the best linear approximation to the function f near a”. Additional
examples of ways of understanding and ways of thinking will emerge as the module unfolds.
The module consists of four units. Each unit begins with a list of focus ways of understanding and ways of thinking and proceeds with classroom problems that attend to them.
A pedagogical discussion of these problems, including observations from our own classes,
then follows. The unit concludes with a set of practice problems.
An unavoidable difficulty in teaching this module is that the population of students we
are facing is familiar with the mechanics of calculus, i.e. formulas for differentiation, strategies for solving related rates problems, etc., even though most of these methods are void of
meaning for them. Thus, the instructor should expect the students to apply sophisticated
problem solving methods and should not try to deter the students from using these methods. Instead, the instructor should encourage the students to unpack these methods with
questions such as “Do we really know why we are doing this?” and “Is there a way to do
this without using calculus?” When possible, the instructor should also lead the students
to compare sophisticated approaches to the problems with more elementary, conceptual approaches, with the ultimate goal being that the elementary approaches lend insight to the
sophisticated approaches, thereby fostering a deeper and more meaningful understanding of
calculus concepts.
2. Unit One: Functions
Focus Ways of Thinking and Ways of Understanding
• PGA way of thinking: Attending to interrelationships between physical/perceptual,
geometric (including graphs), and algebraic realities.
• Thinking in terms of functions.
• Process conception of function [1]: Thinking of functions in terms of input and output.
• Functions as models of reality.
• Proportional reasoning (linearity).
• Piecewise defined functions, in particular the algebraic definition of the absolute value
function as well as its connection to the geometric definition.
[1] This is in contrast to the eventual goal of the object conception of function where we think of functions
as elements of a vector space.
Problem 2.1. Jack and Jill run 10 kilometers. They start at the same point, run 5
kilometers up a hill, and return to the starting point by the same route. Jack has a 10
minute head start and runs at the rates of 15 km/hr uphill and 20 km/hr downhill. Jill runs
16 km/hr uphill and 22 km/hr downhill. How far from the top of the hill are they when
they meet? What is the distance between Jack and Jill at any given moment from the time
Jill leaves until Jack arrives?
Pedagogical Considerations. Problem 2.1: Jack and Jill run 10 kilometers. They start
at the same point, run 5 kilometers up a hill, and return to the starting point by the same
route. Jack has a 10 minute head start and runs at the rates of 15 km/hr uphill and 20
km/hr downhill. Jill runs 16 km/hr uphill and 22 km/hr downhill. How far from the top of
the hill are they when they meet? What is the distance between Jack and Jill at any given
moment from the time Jill leaves until Jack arrives?
Notice that the second question of Problem 2.1 is not phrased as, “Find a formula for
the distance . . . ” The careful wording of these problems aims at intellectually necessitating
that the students conceive of a function as a process which transforms a collection of input
objects into output objects, rather than simply a formula, thereby encouraging attention to
meaning, especially for algebraic symbols, which we call referential symbolic reasoning.
One commonly used approach to the first part of the problem is geometric, namely attempting to solve it by sketching graphs of time versus total distance travelled for Jack and
Jill, respectively. Upon realizing that these two graphs have no intersection point (perturbation), students sketch the graphs of time versus position (distance from starting point).
Again, we see here opportunities for students to realize the need to attend to meaning, a
crucial habit of mind for the development of referential symbolic reasoning. Once the graphs
are sketched correctly, students see that the only point of intersection occurs in the time
interval in which Jack is going downhill and Jill is going uphill. From here the problem
can be solved algebraically, i.e. by finding equations for the lines. Two questions that the
instructor may raise, if the students do not, are: Why does finding the point of intersection
of the two graphs tell us when Jack and Jill meet? And why does constant velocity imply
that the graph of the position function is a line? In particular, how can this be justified
without using calculus?
The other most commonly used approach is to start with algebra, in particular to find
formulas for the positions of Jack and Jill. If t denotes the time in minutes from Jack’s
departure, then Jack’s position (km from starting point) is given by the function:
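\[
d_1(t) =
\begin{cases}
\frac{1}{4}t & \text{if } 0 \le t \le 20 \\
5 - \frac{1}{3}(t - 20) & \text{if } 20 \le t \le 35 \\
0 & \text{if } 35 \le t \le 42\tfrac{17}{44}
\end{cases}
\]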
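Whereas Jill’s position (km from starting point) is given by the function:
\[
d_2(t) =
\begin{cases}
0 & \text{if } 0 \le t \le 10 \\
\frac{4}{15}(t - 10) & \text{if } 10 \le t \le 28\tfrac{3}{4} \\
5 - \frac{11}{30}\left(t - \frac{115}{4}\right) & \text{if } 28\tfrac{3}{4} \le t \le 42\tfrac{17}{44}
\end{cases}
\]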
Note that these are both natural examples of functions defined by different formulas on different intervals. Many students believe that a function is the same thing as a single algebraic
formula, so this problem provides an opportunity for them to correct that misconception.
Given the above formulas, solving the first question necessitates understanding the meaning of solving an equation on a specified interval, and the students should be able to articulate
this. An additional instructional note regarding the above formulas is that the derivation of
the formulas offers rich opportunities for pedagogical discussion. The instructor can ask the
students whether there is any advantage in showing all of the work on the board. Amongst
other things, showing all of the algebra on the board tells the whole story behind the formulas, allows students to retrace the instructor’s steps and find errors if necessary, and also
has the potential to illustrate, for example, proportional reasoning, the distributive property,
and finding a common denominator efficiently by factoring integers as the product of primes.
Note also that the solution of the second question invites the concept of absolute value.
Some of the students may understand absolute value as the distance from zero, but many
of them lack a clear understanding of the algebraic definition. Thus, another goal of the
exercise is for the students to understand the need for an algebraic definition of absolute
value. Understanding and attending to definitions is part of the definitional reasoning way
of thinking: a mathematical definition captures a category of objects and only that category.
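The piecewise formulas for Jack and Jill, and the role of absolute value in the second question, can be checked with a short computation. The following is a minimal sketch in Python, not part of the classroom materials; the function names (jack, jill, gap, linear_root) are illustrative assumptions, t is in minutes from Jack's departure, and the formulas are meant only for 0 ≤ t ≤ 42 17/44. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction as F

TOP = F(115, 4)   # minute at which Jill reaches the top (28 3/4)


def jack(t):
    """Jack's distance (km) from the start, t minutes after he leaves."""
    if t <= 20:
        return F(1, 4) * t                # uphill at 15 km/hr = 1/4 km/min
    if t <= 35:
        return 5 - F(1, 3) * (t - 20)     # downhill at 20 km/hr = 1/3 km/min
    return F(0)                           # back at the start


def jill(t):
    """Jill's distance (km) from the start, t minutes after Jack leaves."""
    if t <= 10:
        return F(0)                       # has not left yet
    if t <= TOP:
        return F(4, 15) * (t - 10)        # uphill at 16 km/hr = 4/15 km/min
    return 5 - F(11, 30) * (t - TOP)      # downhill at 22 km/hr = 11/30 km/min


def gap(t):
    """Distance (km) between the runners: the absolute value of a difference."""
    return abs(jack(t) - jill(t))


def linear_root(f, lo, hi):
    """Zero of f on [lo, hi], assuming f is linear on that interval."""
    return lo - f(lo) * (hi - lo) / (f(hi) - f(lo))


# Jack is descending and Jill is climbing for 20 <= t <= 28 3/4,
# so the equation jack(t) = jill(t) is solved on that interval only.
t_meet = linear_root(lambda t: jack(t) - jill(t), F(20), TOP)
from_top = 5 - jack(t_meet)
print("meeting time (min):", t_meet)
print("distance from the top (km):", from_top)
```

The sketch solves the equation on one specified interval, which mirrors the point made above about articulating where each formula is valid; gap is the function asked for in the second question.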
Supplementary and Practice Problems.
Problem 2.2. Two siblings, Juan and Lola, decide to drive to their cousin’s house, which
is 200 miles away from their house. They have to take separate cars because Lola needs to
stay there for two days longer than Juan. Juan leaves the house at 5:00 PM and drives 60
mph. Lola leaves at 5:10 PM and drives 71 mph. Neither Juan nor Lola makes any stops
during the trip.
(1) How far apart are Juan’s and Lola’s cars at any given moment from the time Juan
leaves until the time they have both arrived at their cousin’s house?
(2) Will Lola pass Juan during the trip? If so, at what time and where?
Problem 2.3.
(1) Your answer to the first part of Problem 2.2 is a function, whereas
your answer to the second part of the problem is the solution to an equation. What
is a function? What is an equation?
(2) Suppose f and g are functions from the real numbers to the real numbers. What
does it mean for the graphs of f and g to have four intersection points? What does
it mean for the graphs of f and g to have no intersection points?
(3) Suppose f is a function from the real numbers to the real numbers. What do we
mean by the graph of f ?
Problem 2.4.
(1) A pharmacist is to prepare 15 milliliters of special eye drops for a
glaucoma patient. The eye-drop solution must contain a concentration of 2% active
ingredient, but the pharmacist only has a 10% concentrated solution and a 1% concentrated solution in stock (unlimited quantities of each). Can the pharmacist use
the solutions she has in stock to fill the prescription?
(2) The same pharmacist receives a large number of orders for special eye drops for glaucoma patients. The prescriptions vary in volume but each requires a concentration
of 2% active ingredient. Help the pharmacist find a convenient way to determine the
exact amounts of the 10% solution and the 1% solution needed for a given volume of
eye drops.
Problem 2.5.
(1) Trains between Los Angeles and San Diego leave each city every hour
on the hour, with no stops. Suppose the one-way trip in either direction takes 4 hours.
How many southbound trains will a northbound train pass during its trip?
(2) Amtrak also operates trains between City A and City B, again leaving each city every
hour on the hour. However, due to a difference in elevation, the trip from A to B
takes 5 hours, while the trip from B to A takes 6 hours. How many trains will a train
leaving A pass on its journey? How many trains will a train leaving B pass?
(3) Amtrak serves many other pairs of cities as well. Again, trains leave each city every
hour on the hour, and the trip in both directions takes the same time. Trip times
vary, for example: 3 hours, 8 hours, 11 hours, and so forth. Help Amtrak predict how
many trains each train will pass on its trip.
(4) Continuing with part 3, what about another pair of cities, where trains leave each
city every hour and a half, on the hour or half hour, and the trip time is 6 hours?
Problem 2.6. What is the angle (in degrees) between the minute hand and the hour hand
of a clock at any moment between 1:00 and 2:00? At what time do the hands meet during
this interval?
3. Unit Two: Functions and Rates of Change
Focus Ways of Thinking and Ways of Understanding
• Functions as models of reality.
• Object conception of function: thinking of functions as elements of a vector space; in
particular, a function can be an object to which one applies another operation, such
as differentiation or integration.
• Rate of change can be used to approximate a model.
• Thinking of functions in terms of rates of change.
• Sequence differences as measures of change.
• Constant versus non-constant rates of change.
• Meaning of “with respect to”.
• Attending to units.
• Referential symbolic way of thinking: Attending, when there is a need, to the meaning
of symbols and their manipulations.
• Deductive reasoning: Logical structure of proofs, i.e. what is given, what is proved,
what can be assumed or chosen freely. Distinguishing between a theorem and its
converse.
Problem 3.1.
(1) You would like to predict the population of your town twenty years
from now. How could you do this?
(2) The following table gives the population (in thousands) of a city for the years from
1987 to 2006. What do you expect the population of this city to be in 2026?
year   population   year   population
1987   67.38        1997   87.17
1988   69.13        1998   89.43
1989   70.93        1999   91.74
1990   72.78        2000   94.11
1991   74.68        2001   96.53
1992   76.63        2002   99.00
1993   78.63        2003   101.52
1994   80.69        2004   104.09
1995   82.80        2005   106.71
1996   84.96        2006   109.38
Problem 3.2.
(1) A spherical balloon is expanding. You want to determine the volume
of the balloon at any given instant from the moment it started to expand. What do
you do?
(2) Suppose that the balloon has already been partly blown up, and the time it took the
radius to grow each half centimeter was taken and recorded below.
Time (s)   Radius (cm)
0.00       0
0.06       0.5
0.50       1
1.69       1.5
4.00       2
7.81       2.5
13.50      3
21.44      3.5
32.00      4
45.56      4.5
62.50      5
Problem 3.3.
(1) A cylindrical storage tank is full of uranium hexafluoride (UF6). You
want to find the mass of the gas. What do you do?
(2) Suppose the cylinder has a radius of 3m and a height of 100m. When you sample
the density at 10m increments of height, you find the following:
Height (m)   Mass (mg) of 1 cm$^3$
0            16.09
10           15.87
20           15.65
30           15.43
40           15.22
50           15.01
60           14.80
70           14.60
80           14.40
90           14.20
100          14.00
Pedagogical Considerations. The contexts of these problems are chosen to be concrete,
and the students can easily build concept images. Again, the problems are carefully phrased
to necessitate that the students conceive of functions as models of reality; contrast the
wording of Problem 3.1, for example, with the more standard wording, “How would you find
a formula for the population as a function of time?”.
Problem 3.1:
(1) You would like to predict the population of your town twenty years from now. How
could you do this?
(2) The following table gives the population (in thousands) of a city for the years from
1987 to 2006. What do you expect the population of this city to be in 2026?
year   population   year   population
1987   67.38        1997   87.17
1988   69.13        1998   89.43
1989   70.93        1999   91.74
1990   72.78        2000   94.11
1991   74.68        2001   96.53
1992   76.63        2002   99.00
1993   78.63        2003   101.52
1994   80.69        2004   104.09
1995   82.80        2005   106.71
1996   84.96        2006   109.38
Here it is essential that the problem is presented in stages since the second part of the
question suggests the nature of the solution to the first part. Initially students may be
perplexed by part one because of its open-ended nature. Because the context is so familiar
to them, it is expected that they will resolve this perturbation fairly quickly and suggest
considering past population data and looking for patterns in population growth. At this
point, the instructor should present the students with the second part of the problem.
In the classroom, we have seen various approaches to the second question. One approach
is to plot the data points and try to find a best fit curve. Students who plot the data by hand
may try to fit a line, and the class can discuss whether or not a linear function would be
a good model for the population. This question is resolved by considering first differences,
which are not constant. Other students considered second differences, and the data have
been constructed so that the second differences are almost all equal to 0.05. Students may
then claim that a quadratic function would be a good model for the population, and in fact
one of the goals of this problem is for the students to justify this claim. Since some students
may expect the population to exhibit exponential growth, another common approach is to
assume an exponential model, namely $P(t) = P_0 e^{rt}$. An exponential model also fits the data
relatively well, so the class should compare it with the quadratic model; why do both models
“work” for these data?
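The first and second differences mentioned above can be tabulated directly. The following is a minimal sketch in Python, offered as one way an instructor or student might do the bookkeeping; the helper name differences is an illustrative assumption, and Decimal is used only so that the two-decimal table values subtract exactly.

```python
from decimal import Decimal

# Population in thousands, 1987 through 2006, as in the table for Problem 3.1.
population = [Decimal(s) for s in (
    "67.38 69.13 70.93 72.78 74.68 76.63 78.63 80.69 82.80 84.96 "
    "87.17 89.43 91.74 94.11 96.53 99.00 101.52 104.09 106.71 109.38"
).split()]


def differences(seq):
    """Return the sequence of consecutive differences of seq."""
    return [b - a for a, b in zip(seq, seq[1:])]


first = differences(population)    # year-over-year change
second = differences(first)        # change in the year-over-year change

print("first differences: ", [str(d) for d in first])
print("second differences:", [str(d) for d in second])
```

The first differences are not constant, which speaks against a linear model, while the second differences stay close to a single value, which is the observation that motivates the quadratic model.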
Discussion of this problem leads, in a natural way, to:
Theorem 3.1. Let $a_n$ be a sequence, and consider the sequences of differences $b_n = a_n - a_{n-1}$,
$c_n = b_n - b_{n-1}$. If $c_n = k$, then $a_n = An^2 + Bn + C$. Moreover, A, B, and C can be expressed
as functions of $k$, $a_1$, and $a_2$.
Proof of Theorem 3.1. We have
\[ k = c_n = b_n - b_{n-1} = (a_n - a_{n-1}) - (a_{n-1} - a_{n-2}) = a_n - 2a_{n-1} + a_{n-2}. \]
Assuming k is known, we are relating it to three unknowns, whereas we want to relate $a_n$ to
known quantities. Substituting values of n:
\begin{align*}
k &= a_3 - 2a_2 + a_1 \\
k &= a_4 - 2a_3 + a_2 \\
k &= a_5 - 2a_4 + a_3 \\
&\;\;\vdots \\
k &= a_{n-2} - 2a_{n-3} + a_{n-4} \\
k &= a_{n-1} - 2a_{n-2} + a_{n-3} \\
k &= a_n - 2a_{n-1} + a_{n-2}
\end{align*}
Adding the equations gives:
\[ (n-2)k = a_n - a_{n-1} + a_1 - a_2 . \]
It may not be obvious to students that any progress has been achieved. The instructor
should highlight here that we now have only two unknowns related to known quantities.
This reinforces the way of thinking of pausing and evaluating progress made towards one’s
goals. Once again, substituting values of n:
\begin{align*}
0 &= a_2 - a_1 + a_1 - a_2 \\
k &= a_3 - a_2 + a_1 - a_2 \\
2k &= a_4 - a_3 + a_1 - a_2 \\
3k &= a_5 - a_4 + a_1 - a_2 \\
&\;\;\vdots \\
(n-3)k &= a_{n-1} - a_{n-2} + a_1 - a_2 \\
(n-2)k &= a_n - a_{n-1} + a_1 - a_2
\end{align*}
Adding the equations gives:
\[ (1 + 2 + \cdots + (n-2))k = a_n + (n-2)a_1 - (n-1)a_2 , \]
that is,
\[ a_n = \frac{k}{2} n^2 + \left( a_2 - a_1 - \frac{3k}{2} \right) n - a_2 + 2a_1 + k, \]
so the sequence is quadratic, as desired.
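A numerical spot check of the closed form can complement the proof, though it does not replace it. The following is a minimal sketch in Python; the function names and the sample values of k, a_1, and a_2 are illustrative assumptions. It builds a sequence from the recurrence a_n = k + 2a_{n-1} - a_{n-2}, which restates c_n = k, and compares each term with the formula for a_n obtained at the end of the proof.

```python
from fractions import Fraction as F


def closed_form(n, k, a1, a2):
    """a_n from the end of the proof: (k/2)n^2 + (a2 - a1 - 3k/2)n - a2 + 2a1 + k."""
    A = F(k) / 2
    B = a2 - a1 - F(3 * k, 2)
    C = -a2 + 2 * a1 + k
    return A * n * n + B * n + C


def by_recurrence(n_terms, k, a1, a2):
    """First n_terms terms of the sequence whose second differences all equal k."""
    seq = [F(a1), F(a2)]
    while len(seq) < n_terms:
        seq.append(k + 2 * seq[-1] - seq[-2])   # a_n = k + 2a_{n-1} - a_{n-2}
    return seq


k, a1, a2 = 3, 5, -2                 # arbitrary sample values
terms = by_recurrence(12, k, a1, a2)
matches = all(closed_form(n, k, a1, a2) == terms[n - 1] for n in range(1, 13))
print("closed form agrees with the recurrence:", matches)
```

Agreement on finitely many terms is empirical evidence only, which makes the sketch a natural prompt for contrasting empirical and deductive proof schemes.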
Problem 3.2:
(1) A spherical balloon is expanding. You want to determine the volume of the balloon
at any given instant from the moment it started to expand. What do you do?
(2) Suppose that the balloon has already been partly blown up, and the time it took the
radius to grow each half centimeter was taken and recorded below.
Time (s)   Radius (cm)
0.00       0
0.06       0.5
0.50       1
1.69       1.5
4.00       2
7.81       2.5
13.50      3
21.44      3.5
32.00      4
45.56      4.5
62.50      5
As with Problem 3.1, it is crucial that the problem be presented in stages. Notice the
standard context of this open-ended problem, on the one hand, and its unusual presentation,
on the other hand. It is most likely that the volume formula will be suggested by the students:
$V(r) = (4/3)\pi r^3$. Since the problem is about volume as a function of time, the following
representation will be necessitated: $V(r(t)) = (4/3)\pi (r(t))^3$. In turn, this would necessitate
a focus on determining r(t). Either r(t) is given to us, or we have to find a way to determine
it. In each case, the need to determine a model (i.e. a function) is highlighted. If r(t) were
given to us, then we would be done. If we are not given r(t), then we could find a model
for r(t) by using data about the rate of change of the radius. When students ask for data
about the radius or its rate of change, we suggest giving them the table in part two of the
problem.
We note that, in the classroom, prior to presenting the data to the students, a student
suggested that one could find $V(r(t))$ using the formula
\[ V(r(t)) = \frac{4\pi}{3} \left( r_0 + \frac{\Delta r}{\Delta t}\, t \right)^3 . \]
This presented an opportunity to clarify the difference between t and $\Delta t$. After discussing the
above formula, some students conjectured that it holds only in the case that $\Delta r / \Delta t$ is constant.
Even if no students make this suggestion, the instructor might consider introducing the
conjecture because it is an excellent exercise for the students to examine their understanding.
Many students had a strong conviction that the conjecture is true, but they could not
articulate why.
Returning to part two of the problem, the way in which the data are presented is critical.
In fact, the relationship is quite simple: we assume the volume is growing linearly (someone
is blowing air in at a constant rate), so that the radius is proportional to the cube root
of time; specifically, $t(r) = 0.5r^3$. The data round off these values slightly. It is significant
here that students should conceive of time as a function of radius, opposite to the usual
presentation. Once they conceive of this, they may think of taking successive differences
(as in Problem 3.1), in which case they will find that the third differences are essentially
constant. The fact that the given values for radius are not all integers makes it difficult to
guess a cubic without finding differences. Thus, students are likely to have to examine rate
of change of time with respect to radius (implicitly at least). Moreover, the independent
variable appears on the right in the table, so that students do not “learn” to always look at
differences in the rightmost column.
In the classroom, the only approach we saw which did not involve finding time as a function
of the radius (or vice versa) was to explicitly calculate
\[ \frac{\Delta V}{\Delta t} = \frac{4\pi}{3} \left( \frac{r_i^3 - r_j^3}{t_i - t_j} \right) \]
for pairs of data points $(t_i, r_i)$. Observing the constant ratio $8\pi/3$, one concludes that the
given data suggest that $V(t) = (8\pi/3)t$. This could also be observed graphically if a student
plots V versus t.
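The ratio computation described above is easy to carry out for every pair of consecutive data points. The following is a minimal sketch in Python, assuming the table from part two of Problem 3.2; the names times, radii, volume, and ratios are illustrative. Each ratio is divided by π so that it can be compared with 8/3.

```python
import math

# Data from part two of Problem 3.2.
times = [0.00, 0.06, 0.50, 1.69, 4.00, 7.81, 13.50, 21.44, 32.00, 45.56, 62.50]  # s
radii = [0.5 * i for i in range(11)]                                            # cm


def volume(r):
    """Volume (cm^3) of a sphere of radius r (cm)."""
    return 4 / 3 * math.pi * r ** 3


points = list(zip(times, radii))

# Average rate of change of volume with respect to time over each
# consecutive pair of data points, expressed as a multiple of pi.
ratios = [
    (volume(r1) - volume(r0)) / (t1 - t0) / math.pi
    for (t0, r0), (t1, r1) in zip(points, points[1:])
]

for (t0, _), (t1, _), q in zip(points, points[1:], ratios):
    print(f"{t0:6.2f} s to {t1:6.2f} s: dV/dt is about {q:.3f} * pi")
```

Because the recorded times are rounded to two decimals, the ratios are only approximately equal; the earliest interval, where the rounding is largest relative to the elapsed time, deviates the most.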
After that, we saw three approaches to finding time as a function of the radius (or vice
versa) for the given data. One approach was to consider “easy” values of t such as 0, 0.5, 4, 32,
which led to the conjectured formula
\[ r(t) = (2t)^{1/3} . \]
The student then verified this function using the rest of the data.
The other two approaches began with the observation that the third differences of the
sequence of times are all approximately equal to 0.38. Given that $t(r) = Ar^3 + Br^2 + Cr + D$,
one approach is to substitute pairs of values $(t_i, r_i)$ to create a system of four equations in
four unknowns. Another approach is to replicate the ideas used in the proof of Theorem 3.1,
i.e. that constant (non-zero) second differences imply a quadratic sequence. Namely, one
considers the sequences of differences $b_n = t_n - t_{n-1}$, $c_n = b_n - b_{n-1}$, $d_n = c_n - c_{n-1}$ and
works backwards from $d_n = 0.38$. Although a bit arduous, this is an excellent opportunity
to revisit the techniques of the proof in a natural way, engaging in repeated reasoning.
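The third differences of the recorded times can be computed in a few lines. The following is a minimal sketch in Python, assuming the table from part two of Problem 3.2; the names differences, third, and model_error are illustrative. For a cubic with leading coefficient A sampled at equal steps h, the third difference is 6Ah^3; with h = 0.5 this is 0.75A, so a third difference near 0.375 is consistent with A = 0.5.

```python
from decimal import Decimal

# Times (s) at which the radius reached 0, 0.5, 1, ..., 5 cm.
times = [Decimal(s) for s in
         "0.00 0.06 0.50 1.69 4.00 7.81 13.50 21.44 32.00 45.56 62.50".split()]
radii = [Decimal(i) / 2 for i in range(11)]


def differences(seq):
    """Return the sequence of consecutive differences of seq."""
    return [b - a for a, b in zip(seq, seq[1:])]


third = differences(differences(differences(times)))
print("third differences:", [str(d) for d in third])

# Compare the recorded times with the model t(r) = 0.5 r^3.
model_error = [abs(t - r ** 3 / 2) for t, r in zip(times, radii)]
print("largest deviation from 0.5 r^3:", max(model_error))
```

The deviations stay within the rounding of the table, which supports the remark that the data round off the values of the cubic slightly.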
Problem 3.3:
(1) A cylindrical storage tank is full of uranium hexafluoride (UF6). You want to find
the mass of the gas. What do you do?
(2) Suppose the cylinder has a radius of 3m and a height of 100m. When you sample
the density at 10m increments of height, you find the following:
Height (m)   Mass (mg) of 1 cm$^3$
0            16.09
10           15.87
20           15.65
30           15.43
40           15.22
50           15.01
60           14.80
70           14.60
80           14.40
90           14.20
100          14.00
Once again, this problem should be presented in stages. It requires some knowledge of
physics that students may not have. After some confusion, we expect students to think of
sampling some volume of the UF6 to find its density, then multiplying that density by the
volume of the cylindrical tank. However, it is a fact that a gas is denser at the bottom of
its container. Thus, the implicit assumption of constant density in the above approach is
not quite correct. The instructor will probably have to inform students of this fact. This
will perturb them to find a more accurate method. Students may wish to know something
about the dependence of density on height, from which they could integrate to directly find
the overall mass. However, this approach should be discouraged as it uses tools that have
not been developed yet. It should be quite reasonable that this dependence is not known –
indeed, we do not expect any students to know it. The expected method is for students to
consider sampling the density at various heights by measuring the mass of small volumes of
gas at these heights. This makes the correct assumption that density is constant at a given
height. By taking the accumulation function of these pieces, the overall mass can be found.
Thus, students are using rate of change of mass in what is basically a Riemann sum.
Once students realize that they need to sample the density at different heights, they should
be given the second part of the problem.
In the classroom, as expected, students were initially quite confused by the first part of
the problem. The instructor responded by drawing a cylinder on the board. Given the
visual representation, one student suggested finding the volume, namely $V = \pi R^2 H$. Then
another student suggested sampling the density D and reminded the class of the formula
M = V D, where M is the mass. The teacher responded by asking the students if we were
done, i.e. if the problem was solved. Then one student asked, “Are we assuming that density
is uniform?” Another student responded by noting that the density will be higher at the
bottom of the tank since the pressure is higher there. Thus, the class collectively concluded
that density is not uniform. The instructor then asked the students what we should do. A
student suggested measuring the density at different levels, and at this point, the instructor
presented the data for the second part of the problem.
Most students perceived that the data they were given followed a piecewise linear pattern
(actually the density decreases exponentially), and they found an expression for density
(sometimes labeled mass) as a function of height. Some students used a Riemann sum (not
necessarily labeled as such) to estimate the overall mass. For instance, one student used the
function for density to find the density at centimeter increments and multiplied that by each
piece of volume (using a spreadsheet to perform the actual calculations). Other students
attempted to use calculus to find the exact mass. They graphed density as a function of
height from 0 to 100 meters and recognized that they were looking for the area under this
curve. Some set up an integral (not always correct) and solved. Others recognized that
the area could be found directly by using area formulas for familiar polygons (triangles,
rectangles, trapezoids). Once this area is found, it can be multiplied by the constant $\pi R^2$ to
give mass. Unit conversion presented difficulty for most students; they were uncertain what
units their answer would be in or how to move between different units. It was intentional that
the density data are given in mg/cm$^3$ (which is equivalent to kg/m$^3$), so that students would
have to confront the issue of units and figure out how to resolve it. The common strategy for
resolving units was to express all measurements in centimeters instead of meters, yielding
an answer in milligrams.
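The Riemann-sum estimate described above can be sketched as follows. This is a minimal illustration in Python, not the spreadsheet the student used: it works at the 10 m sampling increments of the table rather than centimeter increments, and it uses the equivalence of mg/cm$^3$ and kg/m$^3$ so that the answer comes out in kilograms. The names RADIUS_M, STEP_M, density, left, right, and trapezoid are illustrative assumptions.

```python
import math

RADIUS_M = 3.0    # tank radius (m)
STEP_M = 10.0     # spacing between density samples (m)

# Sampled density at heights 0, 10, ..., 100 m.
# The table gives mg per cm^3, which is numerically equal to kg per m^3.
density = [16.09, 15.87, 15.65, 15.43, 15.22, 15.01,
           14.80, 14.60, 14.40, 14.20, 14.00]

cross_section = math.pi * RADIUS_M ** 2      # m^2

# Each 10 m slab has volume cross_section * STEP_M; its mass is estimated
# with the density at the bottom (left sum) or at the top (right sum).
left = cross_section * STEP_M * sum(density[:-1])
right = cross_section * STEP_M * sum(density[1:])
trapezoid = (left + right) / 2               # average of the two estimates

print(f"left sum:  {left:.0f} kg")
print(f"right sum: {right:.0f} kg")
print(f"trapezoid: {trapezoid:.0f} kg")
```

Because the sampled density decreases with height, the left sum overestimates and the right sum underestimates the mass, so the two bracket the value that any finer sampling would give.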
Instructional note: Before moving on to the next unit, the class should reflect on what
has been done up to this point. Summarizing this unit, we see that our goal was to know
how to describe a physical situation; for example, at any time, what is the population? or
at any height, what is the density? To quantify these physical realities, we found rates of
change (the global necessity: rate of change helps us model physical reality). Thus we should
engage in (and have necessitated) an in-depth study of rate of change.
Supplementary and Practice Problems:
Problem 3.4. How might you estimate a person’s weight if all you know is their height?
For example, given the following data for the average weight of a 20 year old female, how
might you estimate the weight of a 20 year old female who is 5'7" tall?
height   weight
5'0"     108
5'1"     111.6
5'2"     115.3
5'3"     119.1
5'4"     123
5'5"     127
Problem 3.5. A cylindrical tank (oriented vertically) is full of water. A drain in the bottom
of the tank is opened.
(1) You want to know:
(a) The time it takes for the tank to empty completely.
(b) The amount of water remaining in the tank at any given time from the moment
the drain is opened.
What would you do?
(2) Do the following data help you answer the above questions?
Time (min)   Water remaining (gal)
0            600
1/2          14161/24
1            3481/6
3/2          4563/8
2            1682/3
5/2          13225/24
3            1083/2
7/2          12769/24
4            1568/3
9/2          4107/8
Problem 3.6. You are teaching a high school mathematics class, and one day three of your
students approach you with the following problem. One of their textbooks includes the
following table of data for the average weight of an eighteen-year-old male:
height   weight
5'0"     114.9
5'1"     120.05
5'2"     125.25
5'3"     130.5
5'4"     135.8
5'5"     141.15
5'6"     146.6
Several of the boys in the senior class are taller than 5'6" and would like to compare their
weights with the average weights, so the students would like to know if they can use the
above data to predict the average weights of eighteen-year-old males over 5'6".
(1) Student A says that he thinks that an accurate predictor of the average weights
should be a function of the form f (x) = Ax + B, where x is the number of inches
over five feet. Do you agree? Carefully explain your answer.
(2) Student B says that she thinks that an accurate predictor of the average weights
should be a function of the form $g(x) = ab^x$, where x is the number of inches over
five feet. Do you agree? Carefully explain your answer.
(3) Student C says that he thinks that an accurate predictor of the average weights
should be a function of the form $h(x) = Ax^2 + Bx + C$, where x is the number of
inches over five feet. Do you agree? Carefully explain your answer.
(4) Find a function to predict the average weights for eighteen-year-old males of different
heights.
Problem 3.7.
(1) Let $\{a_n\}_{n \ge 1}$ be a sequence, and consider the sequence of first differences
$b_n = a_n - a_{n-1}$, $n \ge 2$.
Prove: If $\{b_n\}$ is a constant sequence, then $\{a_n\}$ is a linear sequence (i.e. $a_n = An + B$
for some numbers A and B).
(2) Use the result from part (1) to prove: If the sequence of second differences of a
given sequence $\{a_n\}_{n \ge 1}$ is a constant sequence, then $\{a_n\}$ is a quadratic sequence
(i.e. $a_n = An^2 + Bn + C$ for some numbers A, B, C).
Problem 3.8. Prove that if {a_n}_{n≥1} is a quadratic sequence (i.e. a_n = An^2 + Bn + C for some
numbers A, B, C), then the sequence of second differences of {a_n} is a constant sequence.
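The statement of Problem 3.8 can be explored numerically before it is proved. The following is a minimal sketch, not a proof: the helper names and the coefficients A = 3, B = −2, C = 5 are assumptions chosen only for illustration.

```python
def differences(seq):
    """Return the consecutive differences b_n = a_n - a_(n-1) of a list."""
    return [later - earlier for earlier, later in zip(seq, seq[1:])]


def quadratic_sequence(A, B, C, count):
    """Return the terms a_n = A*n^2 + B*n + C for n = 1, 2, ..., count."""
    return [A * n * n + B * n + C for n in range(1, count + 1)]


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only for illustration.
    a = quadratic_sequence(A=3, B=-2, C=5, count=8)
    first = differences(a)       # first differences: not constant
    second = differences(first)  # second differences: constant in this example
    print("sequence:          ", a)
    print("first differences: ", first)
    print("second differences:", second)
```

Changing the coefficients and watching which quantity in the output stays fixed may suggest what the constant depends on, which is the structure a proof has to explain.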
4. Unit Three: Rates of Change
Focus WoT’s and WoU’s
• Object conception of function: thinking of functions as elements of a vector space; in
particular, a function can be an object to which one applies another operation, such
as differentiation or integration.
• Thinking of functions in terms of rates of change.
• Average rate of change.
• Average rate of change is a function of two variables (starting point and change from
starting point), and it is constant for linear functions.
• Instantaneous rate of change.
• Linear approximation for a function near a point.
• Reasoning from definitions.
• Process pattern generalization: accepting the existence of a pattern by understanding
the underlying structure that causes the pattern to continue (see footnote 2).
Problem 4.1. The average weight of the boys in a second grade class is 74.2 pounds. The
average weight of the girls in the same class is 68.3 pounds. There are 12 boys and 15 girls
in the class. What is the average weight of the students in the class?
Problem 4.2. A square has all sides expanding at the same constant rate. Let s be the side
length of this square at some time, and consider two increments. The first is when the side
length is between s and s + h; the second is when the side length is between s − h/2 and
s + h/2. Before doing any calculation, George and Isaac engaged in an argument about the
rate of change of the area over these increments. George argued that the average rates of
change of area with respect to side length over these two increments will be the same, while
Isaac was certain that the average rates of change would be different.
(1) If you were engaged in this debate, before doing any calculation, with whom would
you side?
[Footnote 2: This is in contrast to result pattern generalization: accepting the existence of a pattern based on a finite number of examples.]
(2) Whose prediction turned out to be correct?
(3) What about the rates of change of area with respect to time over these increments;
will these be the same?
(4) What about the average rates of change of the square’s area with respect to time
when the side length is between s and s + 5h?
(5) What is meant by “with respect to” in parts 3 and 4 of this problem?
Problem 4.3. At 11:00 am, a car driving north on Interstate 5 is 52 miles north of San
Diego, and the speedometer reads 72 miles per hour. What do you think the location of the
car will be at 11:01 am? What about the location at 11:05am? At 11:30 am? Do you think
your predictions are accurate?
Problem 4.4. Consider the following table of values of the function f(x) = x^2.
x        x^2
1        1
1.001    1.002001
1.002    1.004004
1.003    1.006009
1.004    1.008016
1.005    1.010025
(1) Observe two distinct patterns as you move up or down the right hand column. What
explains these patterns?
(2) How could you use these data to approximate the value of (1.0057)^2? What about
(1.3589)^2?
Problem 4.5. Consider the following table of values of the function f(x) = sin(x).
x        sin(x)
0        0
0.001    0.0009999998
0.002    0.0019999987
0.003    0.0029999955
0.004    0.0039999893
0.005    0.0049999792
(1) Observe any patterns as you move up or down the right hand column. What explains
these patterns?
(2) How could you use these data to approximate the value of sin 0.0059? What about
sin 0.25?
Problem 4.6. Approximate the value of √4.007 without a calculator.
Problem 4.7. As you stretch a flexible cylinder, its height h and radius r are changing in
such a way that its volume V remains constant. What is the rate of change of h with respect
to r?
Problem 4.8. Let f(x) = sin(100πx).
(1) What do you expect the value of (df/dx)(x_0) to be when x_0 = 0?
(2) What is (∆f/∆x)(x_0, ∆x) when x_0 = 0 and ∆x = .02?
(3) What is (∆f/∆x)(x_0, ∆x) when x_0 = 0 and ∆x = .01?
(4) What can you conclude about the rate of change of f at x_0 = 0?
Problem 4.9. Let f(x) = 100x^2.
(1) What do you expect the value of (df/dx)(x_0) to be when x_0 = .1?
(2) What is (∆f/∆x)(x_0, ∆x) when x_0 = .1 and ∆x = −.2?
(3) What can you conclude about the rate of change of f at x_0 = .1?
Problem 4.10. Let f(x) = x. Verify that our new definition for df/dx matches our more basic
understanding that the rate of change at any point should be 1.
Problem 4.11. Let f(x) = |x|. Examine possible candidates for (df/dx)(0). Does (df/dx)(0) exist?
A deeper look into rate of change.
Problem 4.12. A particle is moving on a number line. Suppose the velocity of the particle
at a particular time t0 is positive. Consider the relative positions of the particle during a
time interval around t0 (before and after t0 ). The claim is that there is a sufficiently small
interval of time around t0 during which:
• The particle’s positions after time t0 are to the right of its position at t0 .
• The particle’s position at time t0 is to the right of the positions of the particle before
t0 .
(1) Is the claim true? Why or why not?
(2) Formulate a corresponding assertion for the case in which the velocity of the particle
at time t0 is negative.
(3) Formulate a corresponding assertion for any (differentiable) function f : R → R.
Problem 4.13. Another velocity problem:
(1) An object leaves a particular location at time t = a, travels smoothly for some time
and returns to the same location at t = b. I say there must be a particular time
between a and b when the object’s velocity is zero. Am I right? Why?
(2) The claim from part 1 can be stated as follows: Let s : [a, b] → R. Suppose (ds/dt)(t)
exists for every t in the interval (a, b). If s(a) = s(b) = 0, then there is at least one
number c ∈ (a, b) for which (ds/dt)(c) = 0.
Problem 4.14. A car is driving on Interstate 5, which has a speed limit of 65 mph. One
police officer measures the car’s speed as 67 mph. 5 seconds and 581 feet later, another
police officer finds its speed to be 69 mph. The second police officer sees the person’s brake
lights on, so he is pretty sure the driver was going quite fast and gives him a ticket for 15
mph over the speed limit. The driver takes the case to court.
(1) If the police officers are smart, can they prove that the driver was going at least 15
mph over the speed limit at some point?
(2) Separate from what they think they can prove, the police officers strongly suspect
that this driver was going more than 15 mph over the speed limit. How much might
the driver be getting away with; i.e. what is the fastest his car could have been going
over this stretch of road? Assume that the car’s maximum acceleration (whether
speeding up or slowing down) is 9 mph/sec.
Problem 4.15. The following is known as the Mean Value Theorem: Let s : [a, b] → R.
Suppose (ds/dt)(t) exists for every t in the interval (a, b). Then there is at least one number
c ∈ (a, b) for which
\[ \frac{ds}{dt}(c) = \frac{s(b) - s(a)}{b - a}. \]
How can we prove this statement? Interpret the Mean Value Theorem in terms of a moving
object. In this context, would we expect it to be true?
Pedagogical Considerations. This collection of problems necessitates an understanding
of average rate of change as a function of two variables, t (starting point) and ∆t (change
from starting point); alternatively, t1 and t2 (starting and ending points). The concept of
instantaneous rate of change emerges as a way of understanding when average rate of change
is considered in this context, since in many cases we are interested in average rate of change
for ∆t very small. To simplify the complexities of dealing with two variables, we eliminate
one, so that instantaneous rate of change depends only on starting point. With the definition
of instantaneous rate of change in hand, we may prove properties of functions which follow
from properties of their derivatives.
Problem 4.1 (refer also to Supplementary Problem 4.16): The average weight of
the boys in a second grade class is 74.2 pounds. The average weight of the girls in the same
class is 68.3 pounds. There are 12 boys and 15 girls in the class. What is the average weight
of the students in the class?
The goal of both of these problems is that the students attend to the meaning of average
and average rate of change before moving on to instantaneous rate of change. We expect
that the students will have no difficulty performing the calculations required to solve these
problems, but the problems set the stage for a discussion in which the students should
“unpack” the meaning of average and average rate of change.
Problem 4.2 (refer also to Supplementary Problem 4.17): A square has all sides
expanding at the same constant rate. Let s be the side length of this square at some time,
and consider two increments. The first is when the side length is between s and s + h; the
second is when the side length is between s − h/2 and s + h/2. Before doing any calculation,
George and Isaac engaged in an argument about the rate of change of the area over these
increments. George argued that the average rates of change of area with respect to side
length over these two increments will be the same, while Isaac was certain that the average
rates of change would be different.
(1) If you were engaged in this debate, before doing any calculation, with whom would
you side?
(2) Whose prediction turned out to be correct?
(3) What about the rates of change of area with respect to time over these increments;
will these be the same?
(4) What about the average rates of change of the square’s area with respect to time
when the side length is between s and s + 5h?
(5) What is meant by “with respect to” in parts 3 and 4 of this problem?
Since Problem 4.17 incorporates the additional conceptual difficulty of dealing with a
function of two variables, we recommend that it be used as a follow-up problem to Problem
4.2 (repeated reasoning). The emphasis in both of the problems is the same, namely, on
average rate of change as a function of starting point, since change is held constant. We
expect the problems to play out similarly, so here we limit ourselves to a discussion of Problem
4.17. Since many of the students are still likely to rely heavily on empirical reasoning, it is
expected that they will begin the problem by trying some examples, namely, comparing the
areas of the rectangles for different choices of m, n and h. Note that only certain combinations
of m and n are possible, which some students should realize. Here is an opportunity to attend
to the meaning of average rate of change, since in this case the students are only considering
net change. The shift in attention to average rate of change will likely lead to perturbation if
they consider the area as a function of two variables (i.e. the side lengths), since they won’t
know which formula to use for average rate of change, although the presence of time as a
variable allows for a reasonable resolution of this perturbation. Calculating the changes over
these intervals:
∆A_1 = (m + h/2)(n + h/2) − (m − h/2)(n − h/2) = h(m + n),
whereas
∆A_2 = (m + h)(n + h) − mn = h(m + n) + h^2.
Students should see that these are quite close if h is chosen to be small. The next three
questions should be easily answered by working out the algebra; they are intended to reveal
different patterns of change and necessitate attention to exactly what we are examining the
change of, and with respect to what? As stated above, it is convenient to use time as the
independent variable for finding these rates. Part 5 is intended to start students thinking
about accumulation functions: how is the accumulation of area related to its rate of change?
However, we also note that most students are likely to follow the “easier” solution path of
explicitly writing down area as a function of t:
A(t) = m(t)n(t) = (7 + 2t)(15 + 2t).
This is thus a case where the accumulation function is easy to write down immediately.
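The two net changes in area computed in this discussion can be checked with exact arithmetic. This is a small sketch under stated assumptions: the function names are hypothetical, the side lengths m = 7 and n = 15 are taken from the starting values in the formula for A(t), and the values of h are chosen only for illustration.

```python
from fractions import Fraction


def area_change_centered(m, n, h):
    """Net change in area as the sides go from (m - h/2, n - h/2) to (m + h/2, n + h/2)."""
    half = Fraction(h) / 2
    return (m + half) * (n + half) - (m - half) * (n - half)


def area_change_forward(m, n, h):
    """Net change in area as the sides go from (m, n) to (m + h, n + h)."""
    return (m + h) * (n + h) - m * n


if __name__ == "__main__":
    m, n = Fraction(7), Fraction(15)  # starting side lengths from A(t) at t = 0
    for h in (Fraction(1), Fraction(1, 10), Fraction(1, 100)):  # illustrative increments
        centered = area_change_centered(m, n, h)
        forward = area_change_forward(m, n, h)
        # The gap between the two net changes is exactly h^2.
        print(f"h = {h}: centered = {centered}, forward = {forward}, gap = {forward - centered}")
```

The printed gap shrinks much faster than either net change as h decreases, which is the sense in which the two changes are "quite close" for small h.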
Problem 4.3: At 11:00 am, a car driving north on Interstate 5 is 52 miles north of San
Diego, and the speedometer reads 72 miles per hour. What do you think the location of the
car will be at 11:01 am? What about the location at 11:05am? At 11:30 am? Do you think
your predictions are accurate?
At this point, the students have seen many problems emphasizing that average rate of
change is a function of the starting point and the change. This problem is intended to
emphasize that when we consider average rate of change, the size of the change plays a
very important role.
We expect the students to solve the problem assuming an average speed of 72 miles per
hour for each of the given time periods. Thus they will conclude that the position of the car
after five minutes is 52 + 72(1/12) = 58 miles north of San Diego, and the position of the
car after thirty minutes is 52 + 72(1/2) = 88 miles north of San Diego.
The next issue is to focus on the accuracy of these predictions. Our students all agreed
that intuitively, the prediction for the location of the car at 11:30 was less accurate. The
instructor then posed the following question:
Suppose that your speed on the trip is given by v(t) = 100t^2 + 72, where t is measured in
hours. How would we calculate the distance traveled, given that velocity is always changing?
The students agreed that it was reasonable to calculate the distance traveled assuming
constant speed on subintervals. (It may also be expected that students suggest integrating
the velocity function, and the instructor can respond by asking why that works.) The instructor proceeded to consider assuming constant speed each minute. Under this assumption
and using a left hand sum, the approximate distance traveled between 11:00 am and 11:05
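am is:
\begin{align*}
&\frac{1}{60}(72) + \frac{1}{60}\left[72 + 100\left(\frac{1}{60}\right)^2\right] + \frac{1}{60}\left[72 + 100\left(\frac{2}{60}\right)^2\right] + \cdots + \frac{1}{60}\left[72 + 100\left(\frac{4}{60}\right)^2\right] \\
&= \frac{1}{60}\left[72 + 72 + 100\left(\frac{1}{60}\right)^2 + 72 + 100\left(\frac{2}{60}\right)^2 + 72 + 100\left(\frac{3}{60}\right)^2 + 72 + 100\left(\frac{4}{60}\right)^2\right] \\
&= \frac{1}{60}(5)(72) + 100\left(\frac{1}{60}\right)^3\left[0^2 + 1^2 + 2^2 + 3^2 + 4^2\right] \\
&= 6 + 100\left(\frac{1}{60}\right)^3 (30) \\
&= 6 + \frac{1}{72}
\end{align*}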
Note that the instructor wrote the calculations on the board as above. The students instead
began making calculations but saw the advantages of maintaining the general structure
when repeating the calculations to approximate the distance traveled between 11:00 and
12:00. Once again assuming constant speed each minute and using a left hand sum, the
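approximate distance traveled in the first hour is:
\begin{align*}
&\frac{1}{60}(72) + \frac{1}{60}\left[72 + 100\left(\frac{1}{60}\right)^2\right] + \frac{1}{60}\left[72 + 100\left(\frac{2}{60}\right)^2\right] + \cdots + \frac{1}{60}\left[72 + 100\left(\frac{59}{60}\right)^2\right] \\
&= \frac{1}{60}\left[72 + 72 + 100\left(\frac{1}{60}\right)^2 + 72 + 100\left(\frac{2}{60}\right)^2 + \cdots + 72 + 100\left(\frac{59}{60}\right)^2\right] \\
&= \frac{1}{60}(60)(72) + 100\left(\frac{1}{60}\right)^3\left[0^2 + 1^2 + 2^2 + \cdots + 59^2\right] \\
&= 72 + 100\left(\frac{1}{60}\right)^3\left[\frac{2(59)^3 + 3(59)^2 + 59}{6}\right] \\
&= 72 + \frac{7021}{216}
\end{align*}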
The instructor then suggested that we compare these approximate distances to the students’
answers in which they assumed the constant velocity of 72 miles per hour. Assuming constant
speed, the distance traveled in the first five minutes is 6 miles, whereas our approximation
gave 6 + 1/72 miles. Thus the error is 1/72 mile. Assuming constant speed, the distance traveled
in the first hour is 72 miles, whereas our approximation gave (after rounding) 72 + 32.5 miles, so
the error is 32.5 miles. Expressing each error as a percentage of the corresponding
constant-speed distance, in the first case we have an error of
\[ \left(\frac{1}{72}\right)\left(\frac{100}{6}\right) = \frac{100}{432} \approx 0.23\% \]
and in the second case we have an error of
\[ (32.5)\left(\frac{100}{72}\right) = \frac{3250}{72} \approx 45\%. \]
We note that ideally the error would be calculated with respect to the actual distance
traveled, whereas we used a comparison with respect to a Riemann sum, but this is the
best we could do in the context of this class at this point. We also note that here we
are illustrating the importance of small change in the context of an accumulation function
whereas Problems 4.4, 4.5, 4.6 illustrate the importance of small change in the context of a
linear approximation.
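The two left-hand sums and the two percentages from the discussion of Problem 4.3 can be reproduced with exact fractions. This is a minimal sketch: the function names are assumptions introduced for illustration, and the percentage is taken relative to the constant-speed distance, matching the computation shown for this problem.

```python
from fractions import Fraction


def speed(t):
    """Speed in miles per hour, t hours after 11:00 am: v(t) = 100 t^2 + 72."""
    return 100 * t * t + 72


def left_hand_distance(minutes):
    """Left-hand sum for distance, assuming constant speed during each minute."""
    dt = Fraction(1, 60)  # one minute, in hours
    return sum(speed(k * dt) * dt for k in range(minutes))


def percent_difference(minutes):
    """Difference between the left-hand sum and the 72 mph estimate, as a percent of the latter."""
    constant_speed_distance = 72 * Fraction(minutes, 60)
    difference = left_hand_distance(minutes) - constant_speed_distance
    return float(difference / constant_speed_distance * 100)


if __name__ == "__main__":
    for minutes in (5, 60):
        print(minutes, "minutes:", left_hand_distance(minutes),
              "miles,", round(percent_difference(minutes), 2), "percent")
```

Using `Fraction` keeps the sums in the same exact form as the board work (6 + 1/72 and 72 + 7021/216), so any mismatch would point to an arithmetic slip rather than rounding.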
Problem 4.4 (refer also to Problem 4.5): Consider the following table of values of
the function f(x) = x^2.
x        x^2
1        1
1.001    1.002001
1.002    1.004004
1.003    1.006009
1.004    1.008016
1.005    1.010025
(1) Observe two distinct patterns as you move up or down the right hand column. What
explains these patterns?
(2) How could you use these data to approximate the value of (1.0057)^2? What about
(1.3589)^2?
As noted above, these problems illustrate the importance of small change in the context
of approximations. Here we discuss Problem 4.4. The key observation is that for a change in
x of 0.001, f(x) changes by approximately 0.002. Thus, the average rate of change ∆f/∆x
is approximately 2, i.e.
\[ \frac{f(1 + \Delta x) - f(1)}{\Delta x} \approx 2, \]
which is equivalent to
\[ f(1 + \Delta x) \approx f(1) + 2\Delta x. \]
Taking ∆x = 0.0057, we get
\[ (1.0057)^2 \approx 1 + 2(0.0057) = 1.0114. \]
Since (1.0057)^2 = 1.01143249, the error in the approximation is 0.00003249. For the second
part of the problem, we have
\[ (1.3589)^2 \approx 1 + 2(0.3589) = 1.7178. \]
Since (1.3589)^2 = 1.84660921, the error in the approximation is 0.12880921. The students
should compare the relative errors in both cases. When ∆x = 0.0057, the relative error is
approximately 0.003%, and when ∆x = 0.3589, the relative error is approximately 7%. The
instructor should ask the students why the relative errors are so different. Again, the key
idea is that the approximation is much better when change is small.
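The comparison of relative errors described above can be sketched as follows. The function names are hypothetical, and `Decimal` is used only so that the squares come out exactly as written in the text.

```python
from decimal import Decimal


def linear_estimate_of_square(x):
    """Estimate x^2 from the pattern near 1: f(1 + dx) is about f(1) + 2*dx."""
    dx = x - Decimal(1)
    return Decimal(1) + 2 * dx


def relative_error_percent(x):
    """Error of the linear estimate, as a percentage of the exact value of x^2."""
    exact = x * x
    return (exact - linear_estimate_of_square(x)) / exact * 100


if __name__ == "__main__":
    for text in ("1.0057", "1.3589"):
        x = Decimal(text)
        print(text, "estimate:", linear_estimate_of_square(x),
              "exact:", x * x,
              "relative error (%):", round(relative_error_percent(x), 4))
```

Trying intermediate values of x between 1.0057 and 1.3589 gives students a way to see the relative error grow as the change from 1 grows.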
Problem 4.6 (refer also to Supplementary Problems 4.22 and 4.23): Approximate
the value of √4.007 without a calculator.
These problems also illustrate the importance of small change in the context of approximations. The problems
play out in a similar manner, so here we discuss only Problem 4.6.
Consider f(x) = √x. The average rate of change (ARC) on the interval [x_0, x_0 + ∆x] is:
\[ \mathrm{ARC} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \frac{\sqrt{x_0 + \Delta x} - \sqrt{x_0}}{\Delta x}. \]
Since we are interested in ARC for ∆x very small, we consider the ARC when ∆x = 0.
But in this case, the ARC is 0/0, so something needs to be done. In our class, the students
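suggested that we multiply both the numerator and the denominator by the expression
√(x_0 + ∆x) + √x_0. Thus we get
\begin{align*}
\mathrm{ARC} &= \frac{\left(\sqrt{x_0 + \Delta x} - \sqrt{x_0}\right)\left(\sqrt{x_0 + \Delta x} + \sqrt{x_0}\right)}{\Delta x\left(\sqrt{x_0 + \Delta x} + \sqrt{x_0}\right)} \\
&= \frac{x_0 + \Delta x - x_0}{\Delta x\left(\sqrt{x_0 + \Delta x} + \sqrt{x_0}\right)} \\
&= \frac{1}{\sqrt{x_0 + \Delta x} + \sqrt{x_0}}.
\end{align*}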
So for ∆x small, ARC is approximately 1/(2√x_0). Now we can approximate √4.007. We take
x_0 = 4 and ∆x = 0.007, so
\[ \sqrt{4.007} \approx \sqrt{4} + \frac{1}{2\sqrt{4}}(0.007) = 2 + \frac{0.007}{4}. \]
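Although the problem is meant to be done without a calculator, the estimate can be checked afterwards. The sketch below uses hypothetical function names; `math.sqrt` appears only to compare against the hand estimate and to show the ARC settling toward 1/(2√x_0).

```python
import math


def arc_sqrt(x0, dx):
    """ARC of the square root on [x0, x0 + dx], in the rationalized form derived above."""
    return 1.0 / (math.sqrt(x0 + dx) + math.sqrt(x0))


def sqrt_estimate(x0, dx):
    """Estimate sqrt(x0 + dx) as sqrt(x0) + dx / (2 * sqrt(x0))."""
    return math.sqrt(x0) + dx / (2.0 * math.sqrt(x0))


if __name__ == "__main__":
    estimate = sqrt_estimate(4.0, 0.007)
    print("hand estimate: ", estimate)
    print("library value: ", math.sqrt(4.007))
    print("difference:    ", estimate - math.sqrt(4.007))
    for dx in (1.0, 0.1, 0.001, 0.00001):  # illustrative step sizes
        print("dx =", dx, "ARC =", arc_sqrt(4.0, dx))
```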
Problem 4.7: As you stretch a flexible cylinder, its height h and radius r are changing
in such a way that its volume V remains constant. What is the rate of change of h with
respect to r?
Here we present both a traditional approach to the problem and a more naive, conceptual
approach. In the classroom, the instructor faces two possibilities. If the students use the
more sophisticated approach, the instructor can lead the students to consider the more
naive approach by asking whether or not we really understand what is happening in the
sophisticated approach. If, on the other hand, the students do not use the sophisticated
approach, the instructor can introduce it with a comment such as “You have probably
seen this approach before.” Again, the goal is that the students realize how the more naive
approach provides a conceptual basis for the traditional approach.
Traditional approach:
\[ \frac{dV}{dr} = \pi r^2 \frac{dh}{dr} + h \frac{d}{dr}\left(\pi r^2\right) = \pi r^2 \frac{dh}{dr} + 2\pi r h = 0, \]
so
\[ \frac{dh}{dr} = -\frac{2h}{r}. \]
The instructor should ask the students what this expression means (not just symbolically,
but conceptually). In particular, they should be able to articulate that if we assume that
r increases, then for V to remain constant, h must decrease 2h/r times as quickly as r
increases. This sophisticated solution uses:
(1) the product rule and implicit differentiation
(2) an indirect approach to the particular question: rather than attempting to deal with
the change in h, it attends to the change in V (which depends on the change in h).
Direct, naive approach (without using product rule and implicit differentiation): To make
sense of the problem (utilizing and enhancing the Process Pattern Generalization WoT), the
teacher can suggest that students consider “very small” changes. Suppose h = V/(πr^2). Suppose
we start at some point r0 , h0 . If V is held constant while r changes by some very small ∆r,
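then h will change by: ∆h = h(r_0 + ∆r) − h(r_0). Thus
\begin{align*}
\frac{\Delta h}{\Delta r} &= \frac{1}{\Delta r}\left[\frac{V}{\pi (r_0 + \Delta r)^2} - \frac{V}{\pi r_0^2}\right] \\
&= \frac{V}{\pi \Delta r}\left[\frac{1}{(r_0 + \Delta r)^2} - \frac{1}{r_0^2}\right] \\
&= \frac{V}{\pi \Delta r}\left[\frac{r_0^2 - (r_0 + \Delta r)^2}{r_0^2 (r_0 + \Delta r)^2}\right] \\
&= \frac{V}{\pi r_0^2 \Delta r}\left[\frac{r_0^2 - \left(r_0^2 + 2 r_0 \Delta r + (\Delta r)^2\right)}{r_0^2 + 2 r_0 \Delta r + (\Delta r)^2}\right] \\
&= \frac{V}{\pi r_0^2 \Delta r}\left[\frac{-2 r_0 \Delta r - (\Delta r)^2}{r_0^2 + 2 r_0 \Delta r + (\Delta r)^2}\right] \\
&= \frac{V}{\pi r_0^2}\left[\frac{-2 r_0 - \Delta r}{r_0^2 + 2 r_0 \Delta r + (\Delta r)^2}\right]
\end{align*}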
At this point, it makes sense to neglect the terms involving ∆r, since we are choosing ∆r to
be small. Thus, we get:
\[ \frac{\Delta h}{\Delta r} \approx \frac{V}{\pi r_0^2}\left(\frac{-2 r_0}{r_0^2}\right). \]
Recalling that V/(πr_0^2) = h_0 and canceling a factor of r_0, we can rewrite this as
∆h/∆r ≈ −2h_0/r_0. So, for a particular starting point r_0, h_0, the change in height over a
small change ∆r is approximately −2h_0/r_0 times that change. That is, if r is increasing, then
the height is changing at approximately the rate of −2h_0/r_0 per unit change in radius, i.e. it
is decreasing.
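The claim that the ARC of h settles near −2h_0/r_0 for small ∆r can be examined numerically. This is a sketch under stated assumptions: the volume V = 100 and starting radius r_0 = 2 are hypothetical values chosen only for illustration, as are the function names.

```python
import math


def height(volume, r):
    """Height of a cylinder with the given volume and radius: h = V / (pi r^2)."""
    return volume / (math.pi * r * r)


def arc_height(volume, r0, dr):
    """Average rate of change of height with respect to radius on [r0, r0 + dr]."""
    return (height(volume, r0 + dr) - height(volume, r0)) / dr


if __name__ == "__main__":
    volume, r0 = 100.0, 2.0  # hypothetical values, not from the text
    h0 = height(volume, r0)
    print("predicted rate -2*h0/r0:", -2.0 * h0 / r0)
    for dr in (1.0, 0.1, 0.001, 0.00001):  # illustrative changes in radius
        print("dr =", dr, "ARC =", arc_height(volume, r0, dr))
```

The ARC for ∆r = 1 is visibly different from the predicted rate, which supports the point that the approximation depends on the change being small.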
The goal of this direct approach is to have students remain with the concept of approximate
rate of change for a long time, but the emphasis is on two things:
(1) Average rate of change depends on starting point r0 .
(2) Rate of change is approximated by considering ARC over very small changes ∆r.
The crucial aspect of this approach is the repeated application of the observation that average
rate of change depends on starting point. The concept of instantaneous rate of change will
emerge as a WoU where average rate of change is considered in the context of two factors:
• dependence on starting point
• neglecting the value of the change (when added to any other terms)
The first factor is necessitated from the reality of the situations: some phenomena are
independent of location (linear), but “most” phenomena are not. The second factor is a
little more complicated. It is necessitated by a need for computation: attempts to make
sense of and simplify a complicated function expressing average rate of change. It can also
be necessitated by the fact that in many cases we are interested in average rate of change
over a very short interval. This factor embeds an expert WoT: when examining a quotient of
functions of some tiny quantity, only the lowest order terms in numerator and denominator
need to be considered.
In order to realize the connection between the elementary approach and the more traditional approach, one proceeds as in the elementary approach, but without solving for h as a
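function of r. Namely, V = πr^2 h, so
\[ V(r + \Delta r, h + \Delta h) = \pi (r + \Delta r)^2 (h + \Delta h) = \pi \left(r^2 + 2r\Delta r + (\Delta r)^2\right)(h + \Delta h). \]
Thus, if ∆r and ∆h are both very small,
\[ V(r + \Delta r, h + \Delta h) \approx \pi \left(r^2 + 2r\Delta r\right)(h + \Delta h) = \pi \left[r^2 h + r^2 \Delta h + 2hr\Delta r + 2r\Delta r\Delta h\right]. \]
Using the formula for V in terms of r and h leads to
\[ 0 = \Delta V \approx \pi \left[r^2 \Delta h + 2hr\Delta r + 2r\Delta r\Delta h\right]. \]
Since ∆r and ∆h are both very small, the last term is much smaller than the previous two,
so
\[ 0 = \Delta V \approx \pi \left[r^2 \Delta h + 2hr\Delta r\right]. \]
Note that this is a naive derivation of the product rule for implicit differentiation. We can
conclude
\[ \Delta h \approx -\frac{2h}{r}\,\Delta r. \]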
Instructional Note. Upon completion of Problem 4.7, it is crucial for the class to
summarize what we have achieved up to this point. Namely, why have we been talking
about average rate of change? The answer is that we are interested in average rate of change
in order to model reality using functions. We have seen that
• Average rate of change (ARC) depends on (1) starting point and (2) change from
starting point.
• We are always interested in ARC over very small change (in order to model reality
well). To this end, we will always try to estimate ARC in terms of the starting point
and not in terms of the change.
• When we succeed in estimating ARC only in terms of the starting point, we call the
new estimate rate of change (RC). So RC is derived from ARC, and RC depends on
the starting point, but not on the change.
• ARC of a function with respect to a variable x (e.g. ARC of distance with respect to
time) is denoted by ∆f/∆x and RC is denoted by df/dx. Note the fundamental difference
between these two objects: one is a fraction and the other is not. For very small
change ∆x, we have
\[ \frac{df}{dx} \approx \frac{\Delta f}{\Delta x}. \]
•
\[ \frac{df}{dx} \approx \frac{\Delta f}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
so
\[ f(x_0 + \Delta x) \approx f(x_0) + \frac{df}{dx}\,\Delta x. \]
In the example of a position function, if you were in position f(x_0) at time x_0, then
your new position at time x_0 + ∆x is approximately your starting position plus the
rate of change times the change.
Problem 4.8 (refer also to Problem 4.9): Let f(x) = sin(100πx).
(1) What do you expect the value of (df/dx)(x_0) to be when x_0 = 0?
(2) What is (∆f/∆x)(x_0, ∆x) when x_0 = 0 and ∆x = .02?
(3) What is (∆f/∆x)(x_0, ∆x) when x_0 = 0 and ∆x = .01?
(4) What can you conclude about the rate of change of f at x_0 = 0?
Upon completion of Problem 4.7, the next goal of the class should be to find a precise
relationship between the two quantities (∆f/∆x)(x_0, ∆x) and (df/dx)(x_0). In our classroom, the
instructor built up to the formal definition in stages: first formulating the definition intuitively, then advancing to a slightly more formal formulation, and finally by necessitating a
refinement of this formulation to arrive at the accurate definition.
Intuitive formulation: We have seen that (df/dx)(x_0) is a number with the property that
we can make (∆f/∆x)(x_0, ∆x) as close to it as we want by making an appropriate choice of ∆x.
First formalization of intuitive formulation: In other words, for any positive number
ε, we can find ∆x so that
\[ \left| \frac{\Delta f}{\Delta x}(x_0, \Delta x) - \frac{df}{dx}(x_0) \right| < \varepsilon. \]
Problems 4.8 and 4.9 are designed to demonstrate the deficiency in this first formulation.
In particular, these problems have the potential to demonstrate to students that there may
be coincidental values for ∆x that suggest a very different value for the rate of change. Thus,
merely finding one ∆x that makes (∆f/∆x)(x_0, ∆x) close to some value is insufficient. Intuitively,
we think that if some ∆x “works” (i.e. makes the ARC close to some value), then any smaller
(in absolute value) ∆x should also work. The instructor can then formalize this into the
usual epsilon-delta definition, which can be applied to the next problems.
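The "coincidental" values of ∆x in Problems 4.8 and 4.9 can be seen by computing the average rates of change directly. This is a minimal sketch with hypothetical function names; the small step 1e-7 is an assumption chosen only to contrast with the step sizes given in the problems.

```python
import math


def arc(f, x0, dx):
    """Average rate of change of f on the interval from x0 to x0 + dx."""
    return (f(x0 + dx) - f(x0)) / dx


def f_sine(x):
    """The function of Problem 4.8."""
    return math.sin(100 * math.pi * x)


def f_parabola(x):
    """The function of Problem 4.9."""
    return 100 * x * x


if __name__ == "__main__":
    # Step sizes from the problems: each ARC is zero up to floating-point rounding.
    print("sine, dx = 0.02:     ", arc(f_sine, 0.0, 0.02))
    print("sine, dx = 0.01:     ", arc(f_sine, 0.0, 0.01))
    print("parabola, dx = -0.2: ", arc(f_parabola, 0.1, -0.2))
    # A much smaller step (illustrative) suggests very different values.
    print("sine, dx = 1e-7:     ", arc(f_sine, 0.0, 1e-7))
    print("parabola, dx = 1e-7: ", arc(f_parabola, 0.1, 1e-7))
```

A single ∆x that makes the ARC close to 0 therefore says little about the rate of change at the point, which is the deficiency the refined definition is meant to repair.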
Problem 4.10 (refer also to Problem 4.11): Let f(x) = x. Verify that our new
definition for df/dx matches our more basic understanding that the rate of change at any point
should be 1.
Having used Problems 4.8 and 4.9 to necessitate the epsilon-delta definition of the derivative, these problems give students the opportunity to internalize the definition.
Problem 4.12: A particle is moving on a number line. Suppose the velocity of the particle
at a particular time t0 is positive. Consider the relative positions of the particle during a
time interval around t0 (before and after t0 ). The claim is that there is a sufficiently small
interval of time around t0 during which:
• The particle’s positions after time t0 are to the right of its position at t0 .
• The particle’s position at time t0 is to the right of the positions of the particle before
t0 .
(1) Is the claim true? Why or why not?
(2) Formulate a corresponding assertion for the case in which the velocity of the particle
at time t0 is negative.
(3) Formulate a corresponding assertion for any (differentiable) function f : R → R.
Students already have a very strong image for positive velocity, but the need for communication necessitates using the epsilon-delta definition to justify the claim. We expect that
most students will conclude that velocity is positive on an interval (without justification)
and thus that the position function is increasing, so the instructor will most likely need to
remind the class that we are only given v(t0 ) > 0 for some t0 . This was indeed the case in
our classroom, and students attempted to justify the claim that velocity must be positive on
an interval by sketching graphs of position functions and velocity functions. The instructor
responded by asking the class how we could be sure that these examples of graphs exhaust
all of the possibilities. He then prompted the students to recall the meaning of the symbols,
i.e. v(t0 ) > 0 gives us:
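\[ v(t_0) = \frac{ds}{dt}(t_0) = \lim_{\Delta t \to 0} \frac{s(t_0 + \Delta t) - s(t_0)}{\Delta t} > 0. \]
Together, the class discussed how we could use the above to demonstrate that we can find a
∆t > 0 so that
s(t_0 + ∆t) > s(t_0) and s(t_0 − ∆t) < s(t_0).
Using the definition, we have that for all ε > 0, there exists a δ > 0 so that if 0 < |∆t| < δ, then
\[ \left| \frac{s(t_0 + \Delta t) - s(t_0)}{\Delta t} - \frac{ds}{dt}(t_0) \right| < \varepsilon, \]
that is,
\[ -\varepsilon < \frac{s(t_0 + \Delta t) - s(t_0)}{\Delta t} - \frac{ds}{dt}(t_0) < \varepsilon. \]
So
\[ -\varepsilon + \frac{ds}{dt}(t_0) < \frac{s(t_0 + \Delta t) - s(t_0)}{\Delta t} < \varepsilon + \frac{ds}{dt}(t_0). \]
Since the above is true for all ε > 0, it is true in particular for ε = (ds/dt)(t_0). Thus we can find
a δ > 0 so that 0 < ∆t < δ implies that s(t_0 + ∆t) − s(t_0) > 0. For −δ < ∆t < 0, the same
inequality says the quotient is positive while its denominator is negative, so
s(t_0 + ∆t) − s(t_0) < 0, as required.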
Problem 4.13:
(1) An object leaves a particular location at time t = a, travels smoothly for some time
and returns to the same location at t = b. I say there must be a particular time
between a and b when the object’s velocity is zero. Am I right? Why?
(2) The claim from part 1 can be stated as follows: Let s : [a, b] → R. Suppose (ds/dt)(t)
exists for every t in the interval (a, b). If s(a) = s(b) = 0, then there is at least one
number c ∈ (a, b) for which (ds/dt)(c) = 0.
Here we anticipate that some students will ask what “smoothly” means in this problem. In
our classroom, the instructor responded to this question by suggesting that the students
imagine driving a car, so that in English smoothly would mean without abrupt changes. He
then said that the meaning in mathematical terms reflects this. The class then discussed
the veracity of the claim based solely on intuition. Students who argued that the claim is
true offered the example of tossing a ball in the air, stating that in this example, velocity
changes from positive to negative and equals zero when the ball is at its maximum height.
Students who argued that the claim is false offered the example of a car traveling around a
circular racetrack. The class then reached the following resolution: we can change the claim
so that the statement is true by specifying that velocity means rate of change of distance
from starting point with respect to time. This conflict was useful from a pedagogical point
of view since it illustrates that we must specify meaning and in particular the necessity for
accurate formulation (i.e. a formulation in terms which are independent of intuition). The
class then discussed a sketch of the formal justification of Rolle’s Theorem, and the instructor
gave the students the homework assignment of writing a clean proof. The sketch of the proof
was as follows.
If s = 0 on [a, b], we are done. So suppose s(t_0) ≠ 0 for some t_0 ∈ [a, b]. So either s(t_0) > 0
or s(t_0) < 0. Consider the case in which s(t_0) > 0. Since s(a) = s(b) = 0, there must be
a number c so that s(c) is a local maximum. That is, for some r > 0, s(c) ≥ s(c + t) and
s(c) ≥ s(c − t) for 0 < t < r. Now we wish to show that (ds/dt)(c) = 0. Suppose instead that
(ds/dt)(c) > 0. Then by Problem 4.12, applied at the time c, there exists a d > 0 so that
s(c − t) < s(c) < s(c + t) for 0 < t < d. Thus we have a contradiction. [Note that here there
should be some discussion as to why we do indeed have a contradiction since one of the
inequalities involves r while the other involves d.]
Problem 4.14: A car is driving on Interstate 5, which has a speed limit of 65 mph. One
police officer measures the car’s speed as 67 mph. 5 seconds and 581 feet later, another
police officer finds its speed to be 69 mph. The second police officer sees the person’s brake
lights on, so he is pretty sure the driver was going quite fast and gives him a ticket for 15
mph over the speed limit. The driver takes the case to court.
(1) If the police officers are smart, can they prove that the driver was going at least 15
mph over the speed limit at some point?
(2) Separate from what they think they can prove, the police officers strongly suspect
that this driver was going more than 15 mph over the speed limit. How much might
the driver be getting away with; i.e. what is the fastest his car could have been going
over this stretch of road? Assume that the car’s maximum acceleration (whether
speeding up or slowing down) is 9 mph/sec.
Initially, students are likely to be confused: the driver is speeding, but only by 4 mph;
how could it be more? However, by this point in the course, students should be somewhat
enculturated into the practice of carefully examining problem situations. Because of their
strong distance-rate-time schema, students should have some intuition about this particular
situation: namely, that the driver could be changing speed in some (non-random) fashion
between the two points where his speed was measured. They may calculate various quantities,
one of which would be average speed. This will be in feet/sec, but it is likely that some
students will convert to mph, and we expect students to be surprised by the result of this
calculation, since the average speed is considerably higher than the starting or ending speed.
This surprise can lead to disequilibrium and a desire to relate the result to the first question.
Up to this point, the instructor has only been clarifying the problem statement and asking
groups what they are thinking. Since students have seen the Mean Value Theorem in a
previous calculus course, it is likely that one of them will think to invoke it in this situation.
The instructor can facilitate communication of this idea to other groups, who are likely to
be intellectually ready for it and adopt a similar solution.
Students may have to engage in some discussion before the second question makes sense to
them. The second question can be viewed as requiring them to maximize a velocity function
satisfying certain constraints. It implicitly requires thinking about multiple possible paths
that would have the same average velocity. The optimal path involves accelerating at the
maximum rate for just over half the time, then decelerating at the maximum rate for the
remainder. This is the only ‘obvious’ path that would be possible, so its genesis will come
from the social aspects of the situation rather than the mathematics itself. We expect that
some students will suggest this solution without having checked that it satisfies the constraint
of total distance traveled. The numbers were carefully chosen so that this solution satisfies
this constraint, though a little rounding is still required. It is expected that students will
appreciate the need to check whether their solution satisfies this constraint, and that some
of them will be capable of doing so. The fact that the ‘obvious’ solution is valid shifts the
focus from finding a solution to justifying that solution. This shift can extend to the Mean Value
Theorem: students have used the Mean Value Theorem, so now they have a reason to prove
it (Problem 4.15).
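The two calculations discussed above can be checked numerically. The following Python sketch is only an illustration: the function and variable names are our own, and it assumes the piecewise-linear speed profile described in the text (maximum acceleration first, then maximum deceleration).

```python
FT_PER_MILE = 5280
SEC_PER_HOUR = 3600


def average_speed_mph(distance_ft, time_s):
    """Average speed over the stretch, converted from ft/sec to mph."""
    return distance_ft / time_s * SEC_PER_HOUR / FT_PER_MILE


def accelerate_then_brake(v_start, v_end, total_time, max_accel):
    """Profile that speeds up at max_accel (mph/sec) for t_up seconds and
    then slows down at max_accel for the remaining time.

    Returns (t_up in sec, peak speed in mph, distance covered in feet).
    """
    # Solve v_start + a*t_up - a*(total_time - t_up) = v_end for t_up.
    t_up = (v_end - v_start + max_accel * total_time) / (2 * max_accel)
    t_down = total_time - t_up
    peak = v_start + max_accel * t_up
    # Area under the two linear pieces of the speed graph, in mph*sec.
    area = (v_start + peak) / 2 * t_up + (peak + v_end) / 2 * t_down
    distance_ft = area * FT_PER_MILE / SEC_PER_HOUR
    return t_up, peak, distance_ft


avg = average_speed_mph(581, 5)
t_up, peak, dist = accelerate_then_brake(67, 69, 5, 9)
print(f"average speed: {avg:.2f} mph")
print(f"accelerate for {t_up:.3f} s, peak {peak:.1f} mph, distance {dist:.1f} ft")
```

Under these assumptions the average speed is a little over 79 mph, and the accelerate-then-brake profile covers very nearly 581 feet, which matches the remark that the numbers fit the distance constraint up to a little rounding.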
Problem 4.15: The following is known as the Mean Value Theorem: Let $s : [a, b] \to \mathbb{R}$.
Suppose $\frac{ds}{dt}(t)$ exists for every $t$ in the interval $(a, b)$. Then there is at least one number
$c \in (a, b)$ for which
\[ \frac{ds}{dt}(c) = \frac{s(b) - s(a)}{b - a}. \]
How can we prove this statement? Interpret the
Mean Value Theorem in terms of a moving object. In this context, would we expect it to be
true?
Likely interpretations of the Mean Value Theorem in context include: suppose we are
driving from town A to town B. Then there must be a time during the trip at which the
velocity is equal to the average velocity. In our classroom, as with the proof of Rolle’s
Theorem, the instructor suggested the idea of a proof to the class and gave the students
the assignment of writing a clean proof. He first drew the connection with Rolle’s Theorem,
by pointing out that if $f(a) = f(b) = 0$, then the claim is that $\frac{df}{dx}(c) = 0$. He then asked
the students if we can create a situation in which we can use Rolle’s Theorem, i.e. can we
cook up a function $T$ so that $T(a) = T(b) = 0$? Sketching a graph suggests that we take
$T(x) = f(x) - g(x)$, where the graph of $g$ is the line passing through the points $(a, f(a))$ and
$(b, f(b))$.
Supplementary and Practice Problems:
Problem 4.16. A car drives 120 miles from San Diego to Los Angeles with an average speed
of 62 miles per hour. On the return trip back to San Diego, the car’s average speed is 71
miles per hour. What is the average speed for the entire trip?
Problem 4.17. A rectangle ABCD starts with side lengths of 7 and 15, and all of the sides
are getting longer at the same constant rate (so the rectangle is maintaining a rectangular
shape). Let m and n be the lengths of the sides of the rectangle at a particular time, and
consider two increments. The first is when one side length is between m − h/2 and m + h/2,
while the other side length is between n − h/2 and n + h/2. The second is when one side
length is between m and m + h, while the other is between n and n + h.
(1) What are the rates of change of area with respect to time over these two increments?
(2) What about the rates of change of the perimeter with respect to time over these
increments?
(3) What about the rates of change of the length of the diagonal with respect to time
over these increments?
(4) What about the rates of change of the distance of the center of the rectangle from
one of the sides with respect to time over these increments?
(5) Suppose that the side lengths of the rectangle are increasing at a rate of 2 units per
minute. What will be the area of the rectangle at any instant?
Problem 4.18. Suppose that a rectangle starts with the side lengths of 7 and 15 units and
opposite sides get longer at the same constant rate in such a way that the ratio between
the side lengths of the non-congruent sides is maintained (thus forming rectangles similar to
the original rectangle). Suppose further that after 30 seconds, the side lengths will both be
double their original values: 14 and 30 units. Let m be the length of the shorter side of the
rectangle at a particular time, and consider two increments. The first is when the shorter
side length is between m − h/2 and m + h/2; the second is when that side length is between
m and m + h.
(1) What will be the longer side lengths at the starting and ending points of each increment?
(2) Will the rates of change of area with respect to time be different for the two increments? How does this compare to the previous problem? Explain your answer.
Problem 4.19. A ball is tossed into the air from a bridge, and its height y (in feet) above
the ground t seconds after it is thrown is given by
y(t) = −16t2 + 50t + 36.
(1) How high above the bridge is the ball when it reaches its maximum height?
(2) How long does it take the ball to reach its maximum height?
(3) How long does it take the ball to reach the ground?
(4) Find the average velocity of the ball “near” its maximum height; that is, during the
periods [T, T + ∆T ] and [T − ∆T, T ], where T is the answer to question 2. How do
you expect these results to compare when ∆T is small? Explain.
(5) There is a time T1 such that the average velocity of the ball over the interval [T1 , T1 +1]
is 2 ft/sec. What is T1 ?
(6) The average velocity of the ball is dependent on the time interval [t, t + ∆t] under
consideration. What exactly is this dependence? Any two values of average velocity,
t, and ∆t will determine the third. Replace each □ in the table below with a value of
your choice and then complete the table.
t      ∆t     average velocity
□      □
□             □
       □      □
Parts 1 and 2 may be solved by traditional calculus methods, and in this case the instructor
should ask students to justify why these calculus methods lead to the answer, e.g. why do we
set the derivative of the height function equal to zero? Alternatively students may rewrite the
height function in the standard vertex form to see that it is a downward pointing parabola
with vertex at $t = 25/16$. Thus, at its maximum height, the ball is $25^2/16 + 36 = 75\tfrac{1}{16}$ feet above
the ground, that is, $25^2/16 = 39\tfrac{1}{16}$ feet above the bridge (taking the bridge to be at the
launch height of 36 feet). Some students may solve part 4 by brute force calculations and others
by interpreting the average velocity as the slope of a secant line. The solution to part 5 is
T1 = 1, and students should be able to explain why this solution makes sense, but may be
perturbed as to why T1 is uniquely determined. This perturbation is resolved in Part 6 since
the average rate of change is linear in both t and ∆t, namely −32t − 16∆t + 50.
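The linear dependence noted above can be checked directly. The short Python sketch below is an illustration with names of our own choosing; it compares the difference quotient of the height function from Problem 4.19 with the closed form −32t − 16∆t + 50 and revisits parts 4 and 5.

```python
def height(t):
    """Height in feet above the ground, t seconds after the toss."""
    return -16 * t**2 + 50 * t + 36


def average_velocity(t, dt):
    """Average velocity over [t, t + dt], as a difference quotient."""
    return (height(t + dt) - height(t)) / dt


def average_velocity_formula(t, dt):
    """Closed form obtained by expanding the difference quotient."""
    return -32 * t - 16 * dt + 50


T = 25 / 16  # time of maximum height (vertex of the parabola)

# Part 4: average velocities just after and just before the maximum.
for dT in (0.5, 0.1, 0.01):
    after = average_velocity(T, dT)
    before = average_velocity(T - dT, dT)
    print(f"dT={dT}: after={after:.4f} ft/sec, before={before:.4f} ft/sec")

# Part 5: solve -32*T1 - 16*1 + 50 = 2 for T1.
T1 = (50 - 16 * 1 - 2) / 32
print("T1 =", T1, "average velocity on [T1, T1 + 1]:", average_velocity(T1, 1))
```

The two averages near the maximum are −16∆T and +16∆T, so they have equal magnitude and opposite sign and both shrink to zero with ∆T. Because the formula is linear in t once ∆t is fixed, exactly one T1 gives an average velocity of 2 ft/sec.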
Problem 4.20. Estimate the volume of the solid generated by rotating the graph of $y = x^2$,
$0 \leq y \leq 16$, around the y-axis.
Problems 4.20 and 4.21 are designed to set the stage for accumulation functions; refer also
to Problems 3.3 and 4.3.
Problem 4.21. A potter is making ceramic carafes, the insides of which are simple parabolas, so that the cross-sectional radius $r$ at some height $h$ is given by $h = r^2$ (measured in
cm). Approximately how tall should he make the inside of a carafe if it is to hold 650 cm$^3$?
Problem 4.22.
(1) Use average rate of change to estimate $(2.0017)^3$.
(2) Use average rate of change to estimate $(2.3918)^3$.
(3) How close are your estimates to the actual values?
Problem 4.23.
(1) Use average rate of change to estimate $\sqrt[3]{8.0156}$. (Hint: $a - b = (a^{1/3})^3 - (b^{1/3})^3$)
(2) Use average rate of change to estimate $\sqrt[3]{8.9735}$.
(3) How close are your estimates to the actual values?
Problem 4.24. Suppose $f$ and $g$ are functions. Can you find a relationship between the
average rate of change $\frac{\Delta(fg)}{\Delta x}$ and the average rates of change $\frac{\Delta f}{\Delta x}$ and $\frac{\Delta g}{\Delta x}$? (In other words,
find a justification for the product rule using average rates of change).
The sophisticated solution of Problem 4.7 uses the product rule, so the opportunity has
arisen for the students to justify the product rule. We advocate the following justification
using linear approximations:
\[ (f(x)g(x))' = \lim_{\Delta x \to 0} \frac{f(x + \Delta x)g(x + \Delta x) - f(x)g(x)}{\Delta x}. \]
Approximating the numerator:
\begin{align*}
f(x + \Delta x)g(x + \Delta x) - f(x)g(x) &\approx [f(x) + (\Delta x)f'(x)]\,[g(x) + (\Delta x)g'(x)] - f(x)g(x) \\
&= (\Delta x)[f'(x)g(x) + f(x)g'(x)] + (\Delta x)^2 f'(x)g'(x).
\end{align*}
Thus:
\begin{align*}
(f(x)g(x))' &= \lim_{\Delta x \to 0} \frac{(\Delta x)[f'(x)g(x) + f(x)g'(x)] + (\Delta x)^2 f'(x)g'(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} [f'(x)g(x) + f(x)g'(x) + (\Delta x)f'(x)g'(x)] \\
&= f'(x)g(x) + f(x)g'(x).
\end{align*}
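A numerical experiment can make the limit above concrete for students. The Python sketch below is an illustration only; the choice of f(x) = x² and g(x) = sin x is an arbitrary assumption of ours, as are the function names. It compares the difference quotient of the product with f′g + fg′ for shrinking ∆x.

```python
import math


def f(x):
    return x**2


def df(x):
    return 2 * x


def g(x):
    return math.sin(x)


def dg(x):
    return math.cos(x)


def product_difference_quotient(x, dx):
    """Average rate of change of f*g over [x, x + dx]."""
    return (f(x + dx) * g(x + dx) - f(x) * g(x)) / dx


def product_rule(x):
    """Value predicted by the product rule: f'(x)g(x) + f(x)g'(x)."""
    return df(x) * g(x) + f(x) * dg(x)


x = 1.0
gaps = []
for dx in (1e-1, 1e-2, 1e-3):
    gap = product_difference_quotient(x, dx) - product_rule(x)
    gaps.append(gap)
    print(f"dx={dx}: difference quotient minus product rule = {gap:.6f}")
```

In this example the gap shrinks roughly in proportion to ∆x, which is what one expects when the leftover terms carry at least one extra factor of ∆x.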
Problem 4.25. Let $f(x) = 3x^2 + 2x$.
(1) Sketch the graph of f.
(2) Use the definition of the derivative to show that $\frac{df}{dx}(0) = 2$.
(3) Use the definition of the derivative to show that $\frac{df}{dx}(-1) = -4$.
(4) Use the definition of the derivative to show that $\frac{df}{dx}(x_0) = 6x_0 + 2$.
(5) Do the results from parts 2, 3, and 4 “agree” with your graph?
Problem 4.26. Let
\[ f(x) = \begin{cases} 1 & x \leq 1 \\ x & x > 1 \end{cases} \]
(1) Sketch the graph of f.
(2) Use the definition of the derivative to show that $\frac{df}{dx}(0) = 0$.
(3) Use the definition of the derivative to show that $\frac{df}{dx}(x_0) = 0$ if $x_0 < 1$.
(4) Use the definition of the derivative to show that $\frac{df}{dx}(2) = 1$.
(5) Use the definition of the derivative to show that $\frac{df}{dx}(x_0) = 1$ if $x_0 > 1$.
(6) Use the definition of the derivative to show that $\frac{df}{dx}(1)$ does not exist, i.e. show that
$\frac{df}{dx}(1) \neq A$ for any real number A.
(7) Do the results from parts 3, 5 and 6 “agree” with your graph?
Problem 4.27. Use the ε − δ definition of the derivative to prove: If $f : \mathbb{R} \to \mathbb{R}$ is a
differentiable function and $f'(x_0) < 0$, then f is decreasing on some open interval containing
$x_0$.
Problem 4.28. A ball is tossed in the air from a bridge, and its height y (in feet) above the
ground t seconds after it is thrown (until it hits the ground) is given by $y(t) = -8t^2 + 40t + 48$.
(1) Find the average rates of change of the ball’s height over the intervals [0, 4] and [1, 3].
(2) Show that the result from part 1 is not a coincidence. Hint: One approach would be
to show that the average rate of change of a quadratic function over some interval is
equal to the rate of change of the function at the midpoint of the interval.
Problem 4.29. Watch Hands Problem: The minute hand on a watch is 8 mm long and the
hour hand is 4 mm long. How fast is the distance between the tips of the hands changing
at one o’clock? [J. Stewart, Calculus, Concepts and Contexts, Brooks/Cole 1998] At what
time(s) might this distance be changing the fastest, or the slowest?
This problem is nicely economical; it’s not necessary to provide the speeds of the watch
hands, because everyone knows their speeds, although students need to avoid tautological
statements such as, “The minute hand travels at one minute per minute.” The need to “tell
algebra” the speeds engages the PGA WoT. The fact that the moving system is a watch,
which serves the function of measuring time, may help students understand that the hands
have instantaneous positions and speeds that covary with the time.
Hopefully students have acquired the way of thinking of generalization, so that one should
solve the problem for arbitrary lengths of the hands and then specialize to the given values.
Let the hour hand and minute hand have lengths l and L, respectively, and let θ be the
angle measured clockwise from the minute hand to the hour hand; θ = π/6 at one o’clock.
The (positive) distance x between the tips of the hands is given by the law of cosines as
$x^2 = l^2 + L^2 - 2lL\cos\theta$. Then the chain rule gives
\[ 2x\,\frac{dx}{dt} = 2lL\sin\theta\,\frac{d\theta}{dt} \]
or
\[ \frac{dx}{dt} = \frac{lL}{x}\sin\theta\,\frac{d\theta}{dt}. \]
The speeds of the hour and minute hands, in radians per minute, are respectively $2\pi/(12 \cdot 60)$
and 2π/60. The motion of the hour hand increases θ while that of the minute hand decreases
it, so dθ/dt = −11π/360 ≈ −0.09599 rad/min. At one o’clock we find x ≈ 4.957 mm
and dx/dt ≈ −0.3098 mm per minute. Students have an opportunity to notice the chain
rule: omitting the factor dθ/dt from the formula for dx/dt gives instead dx/dθ, which is
approximately 3.228 mm per radian.
A naive, calculator-based approach to the problem might say that at one o’clock, x =
4.9572547, while one minute later we find that x = 4.6650272, so x is changing at a rate of
−0.2922275 mm per minute. This is an average rather than instantaneous rate of change,
and students may discuss why they differ.
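The instantaneous and average rates quoted above can be reproduced with a few lines of Python. This is a sketch for illustration; the function names are ours, and the hand lengths l = 4 mm and L = 8 mm are taken from the problem statement.

```python
import math


def tip_distance(theta, l=4.0, L=8.0):
    """Distance between the tips of the hands (law of cosines), in mm."""
    return math.sqrt(l**2 + L**2 - 2 * l * L * math.cos(theta))


def distance_rate(theta, dtheta_dt, l=4.0, L=8.0):
    """dx/dt = (lL/x) sin(theta) dtheta/dt, in mm per minute."""
    return l * L / tip_distance(theta, l, L) * math.sin(theta) * dtheta_dt


HOUR_HAND = 2 * math.pi / (12 * 60)   # rad/min
MINUTE_HAND = 2 * math.pi / 60        # rad/min
DTHETA_DT = HOUR_HAND - MINUTE_HAND   # equals -11*pi/360 rad/min

theta_one = math.pi / 6               # angle between the hands at one o'clock
x_now = tip_distance(theta_one)
x_later = tip_distance(theta_one + DTHETA_DT * 1.0)  # one minute later

instantaneous = distance_rate(theta_one, DTHETA_DT)
average = (x_later - x_now) / 1.0

print(f"x at 1:00 = {x_now:.7f} mm, x at 1:01 = {x_later:.7f} mm")
print(f"instantaneous rate = {instantaneous:.4f} mm/min")
print(f"average rate over one minute = {average:.7f} mm/min")
```

Shrinking the one-minute window in the last step (for example to one second) is a natural follow-up: the average rate then moves toward the instantaneous value.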
The second part of the problem asks when dx/dt might be largest or smallest. If we
ask for the smallest magnitude, it is zero, occurring when θ = 0 or θ = π: the hands are
aligned or anti-aligned, for example at noon or at 6:00, when the distance has a minimum
or maximum value. The largest magnitude, either positive or negative, is not obvious, so we
set the derivative of dx/dt equal to zero:
\[ \frac{d^2x}{dt^2} = \frac{lL}{x}\cos\theta\left(\frac{d\theta}{dt}\right)^2 - \frac{lL}{x^2}\sin\theta\,\frac{dx}{dt}\frac{d\theta}{dt}, \]
so
\[ \frac{d^2x}{dt^2} = \frac{lL}{x}\left(\frac{d\theta}{dt}\right)^2\left(\cos\theta - \frac{lL}{x^2}\sin^2\theta\right). \]
This vanishes when $x^2\cos\theta = lL\sin^2\theta$, which simplifies
to $\cos^2\theta - \eta\cos\theta + 1 = 0$, where
$\eta = L/l + l/L \geq 2$. The solution is $\cos\theta = (\eta - \sqrt{\eta^2 - 4})/2$, where the minus sign in the
quadratic formula must be used to ensure that $\cos\theta \leq 1$. With the numerical values in our
problem, η = 5/2 and we find, remarkably, that θ = ±π/3. The maximum and minimum
rates of change of distance have equal magnitudes, approximately 0.38397 mm per minute, and
occur for example at 10:00 and 2:00, when $x = 4\sqrt{3} \approx 6.9282$ mm. It is useful for students
to reflect on the qualitative properties of this solution. For example, the angle θ giving the
extremal values of dx/dt depends only on the ratio of lengths L/l, and it does not change
if the lengths of the minute and hour hands are interchanged. In the limit $L \gg l$, one finds
$\cos\theta \approx l/L$ and $\theta \approx \pm(\pi/2 - l/L)$; the extrema are now near 9:00 or 3:00. If instead l = L,
the extrema are at θ = 0, where x = 0. At this point our formula for dx/dt is indeterminate
but has the physically correct limiting value $\pm l\,\frac{d\theta}{dt}$. The maximum (minimum) occurs an
instant after (before) noon. It is also instructive to plot graphs of x or dx/dt versus t.
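The closed-form extremum can be cross-checked against a brute-force scan, which is also a convenient starting point for the suggested plots. The Python sketch below is illustrative; the names are ours, the hand lengths are those of the problem, and dθ/dt is treated as the constant −11π/360 rad/min.

```python
import math


def extremal_cos_theta(l, L):
    """Root of cos^2(theta) - eta*cos(theta) + 1 = 0 with cos(theta) <= 1."""
    eta = L / l + l / L
    return (eta - math.sqrt(eta**2 - 4)) / 2


def distance_rate(theta, dtheta_dt, l, L):
    """dx/dt = (lL/x) sin(theta) dtheta/dt, in mm per minute."""
    x = math.sqrt(l**2 + L**2 - 2 * l * L * math.cos(theta))
    return l * L / x * math.sin(theta) * dtheta_dt


l, L = 4.0, 8.0
dtheta_dt = -11 * math.pi / 360

# Closed-form answer from the quadratic in cos(theta).
theta_star = math.acos(extremal_cos_theta(l, L))
rate_star = distance_rate(theta_star, dtheta_dt, l, L)

# Brute-force scan of theta in (0, pi) for the largest |dx/dt|.
thetas = [k * math.pi / 10000 for k in range(1, 10000)]
theta_scan = max(thetas, key=lambda th: abs(distance_rate(th, dtheta_dt, l, L)))

print(f"closed form: theta = {theta_star:.5f} rad, |dx/dt| = {abs(rate_star):.5f} mm/min")
print(f"scan:        theta = {theta_scan:.5f} rad")
```

Changing l and L while keeping their ratio fixed leaves the extremal angle unchanged, which lets students test the qualitative remarks above.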
Problem 4.30. Suppose $f$ and $g$ are functions. Can you find a relationship between the
average rate of change $\frac{\Delta f(g)}{\Delta x}$ and the average rates of change $\frac{\Delta f}{\Delta x}$ and $\frac{\Delta g}{\Delta x}$? (In other words,
find a justification for the chain rule using average rates of change).
The solution of Problem 4.29 uses the chain rule, so the opportunity has arisen for the
students to justify the chain rule. We advocate the following justification using linear approximations:
\[ [f(g(x))]' = \lim_{\Delta x \to 0} \frac{f(g(x + \Delta x)) - f(g(x))}{\Delta x}. \]
Approximating the numerator:
\begin{align*}
f(g(x + \Delta x)) - f(g(x)) &\approx f(g(x) + (\Delta x)g'(x)) - f(g(x)) \\
&\approx f(g(x)) + (\Delta x)g'(x)f'(g(x)) - f(g(x)) \\
&= (\Delta x)g'(x)f'(g(x)).
\end{align*}
Thus:
\begin{align*}
[f(g(x))]' &= \lim_{\Delta x \to 0} \frac{(\Delta x)g'(x)f'(g(x))}{\Delta x} \\
&= \lim_{\Delta x \to 0} g'(x)f'(g(x)) \\
&= f'(g(x))g'(x).
\end{align*}
Problem 4.31.
(1) A police officer is sipping coffee at a roadside speed trap when a
speeder zooms past at 60 mph (it’s a 35 mph zone). It takes the officer one minute
to put down the coffee and start her motorcycle, but then she is off at 80 mph in
pursuit. When and where does she catch the speeder?
(2) It is unrealistic to assume that the officer can accelerate from 0 to 80 mph instantaneously, but we do that in algebra because motion at a constant speed is all we
have been taught to handle. Let’s learn to do better. This time the officer starts
her motorcycle the moment that the speeder passes, but she accelerates gradually,
at a “uniform” rate. With the aid of a stopwatch and roadside mileage markers,
we measure her average speed during one-minute time intervals. Her average speed
during the first minute is 20 mph, then 60 mph during the second minute, 100 mph
during the third minute, and so on. (The numbers in this problem are chosen for
convenience and plausible order of magnitude, but not literal realism.) When and
where does she catch the speeder? When and where would the speeder have been
caught if his speed had been 40 mph? 50 mph?
Problem 4.32.
(1) Find the intersection points of each of the following pairs of graphs:
(a) $y = 5x - 5$ and $y = 2x + 1$
(b) $y = x^2$ and $y = 4x - 3$
(c) $y = x^2$ and $y = 4x - 4$
(d) $y = x^2$ and $y = 4x - 5$
(2) Now suppose that each pair of graphs represents the motion of a pair of cars driving
along a straight road, so that x is time (measured from some chosen moment) and y
is the position along the road (measured from some chosen starting point, in a chosen
direction). Describe the motion in each of the four cases. What are the cars doing
near the positions and times of the graphs’ intersections? Which car in each pair is
moving faster?
(3) Find the equation of the line tangent to the graph of $y = x^2$ at the point (3, 9).
What is the speed of the car whose motion is described by this graph at time x = 3?
Answer the same question for an arbitrary point (p, q) on the graph of $y = x^2$ and
the arbitrary time x = p.
(4) Explain how to find the equation of the tangent line to the graph of any quadratic
function $y = ax^2 + bx + c$ at any chosen point (p, q) and the speed of the corresponding
car.
(5) Extend these ideas to find tangent lines to graphs of arbitrary polynomial equations.
5. Unit Four: Integration
Focus WoT’s and WoU’s
• Object conception of function: thinking of functions as elements of a vector space; in
particular, a function can be an object to which one applies another operation, such
as differentiation or integration. For example, the object conception of function is
needed to understand the expression $\int_1^x f(t)\,dt$.
• Riemann sum as an approximation to the accumulation function.
• Accumulation functions (approximated) as functions of the starting and ending points
and the number of intervals in the partition.
• Fundamental Theorem of Calculus.
• Comparing functions by comparing their rates of change.
Problem 5.1. The rate at which the world’s oil is being consumed is continuously increasing.
The rate of the world’s oil consumption in 1994 reached about 41 billion barrels per year
from 32 billion barrels per year in 1990. About how much oil was consumed during the
period from the beginning of 1990 until the end of 1994?
Problem 5.2. A company develops two strains of bacteria that grow at unthinkable rates.
Both start with a single organism. The first colony has $e^{t^2}$ organisms at any time t (in
minutes). The second colony grows at the rate of $(3t + 5)e^{t^2}$ individuals per minute at any
time t (in minutes). Can you predict which colony will be larger after five minutes?
Problem 5.3. The weight of a cubic shipping box, in pounds, can be modeled by
\[ w(t) = \frac{1}{400}\int_0^{t^2} \left(1 + x^{1/2}\right)^{1/4} dx, \]
where t is the length of a side of the box, measured in centimeters. You need to ship a cubic
box with a side length of 20 cm. About a month ago, you shipped a cubic box with a side
length of 15 cm, and it weighed 1.012 pounds. Approximately how much will the 20×20×20
box weigh?
Pedagogical Considerations. Problem 5.1: The rate at which the world’s oil is being
consumed is continuously increasing. The rate of the world’s oil consumption in 1994 reached
about 41 billion barrels per year from 32 billion barrels per year in 1990. About how much
oil was consumed during the period from the beginning of 1990 until the end of 1994?
Here students are expected to assume that the rate of change of oil consumption is linear.
In this case, the concept image is that the increase in oil consumption from 32 billion barrels
per year to 41 billion barrels per year is distributed evenly across the years from 1990 to
1994. Under these assumptions, oil consumption is increasing at a rate of 9/5 billion barrels
per year per year. Thus an approximation for the total amount of oil consumed during the
five year period would be 32 + 33.8 + 35.6 + 37.4 + 39.2 = 178 billion barrels. Here the
students should notice that this estimation has the structure of a Riemann sum.
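The estimate above is a left-endpoint Riemann sum with one-year subintervals. The Python sketch below reproduces it; the function names are ours, and the linear rate model is the simplifying assumption made in the text, not data.

```python
def left_riemann_sum(rate, start, end, n):
    """Left-endpoint Riemann sum of rate over [start, end] with n equal pieces."""
    width = (end - start) / n
    return sum(rate(start + i * width) * width for i in range(n))


def consumption_rate(years_since_1990):
    """Assumed linear model: 32 billion barrels/year at the start of 1990,
    increasing by 9/5 billion barrels per year each year."""
    return 32 + 9 / 5 * years_since_1990


yearly = left_riemann_sum(consumption_rate, 0, 5, 5)
monthly = left_riemann_sum(consumption_rate, 0, 5, 60)
print(f"one-year subintervals:  {yearly:.3f} billion barrels")
print(f"one-month subintervals: {monthly:.3f} billion barrels")
```

With one-year pieces the sum is 178 billion barrels, as in the text. Refining the partition raises the estimate toward 182.5, the area under the assumed linear rate, which gives students a first look at how the number of subintervals affects an accumulation estimate.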
Problem 5.2: A company develops two strains of bacteria that grow at unthinkable
rates. Both start with a single organism. The first colony has $e^{t^2}$ organisms at any time t
(in minutes). The second colony grows at the rate of $(3t + 5)e^{t^2}$ individuals per minute at
any time t (in minutes). Can you predict which colony will be larger after five minutes?
This problem is designed to necessitate finding the rate of change of an accumulation
function. One likely approach is to attempt to evaluate the integral
\[ \int_0^5 (3t + 5)e^{t^2}\,dt, \]
and the instructor should take the opportunity to ask the students why they are taking this
approach. Once they realize the difficulty in finding an antiderivative for the integrand, they
may try to approximate the value of the definite integral using a Riemann sum. Another
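approach is to compare the rates of change of the functions
\[ C_1(t) = e^{t^2} \quad \text{and} \quad C_2(t) = 1 + \int_0^t (3x + 5)e^{x^2}\,dx, \]
where the constant 1 accounts for the single organism with which the second colony starts.
Since $C_1(0) = C_2(0) = 1$ and $C_1'(t) < C_2'(t)$ for all $t > 0$, $C_2(5) > C_1(5)$. Thus this problem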
can necessitate an important way of understanding: if two functions have the same initial
value and one of the functions is always increasing at a faster rate, then that function will
always have a larger value.
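Both approaches described above can be tried numerically. The Python sketch below is an illustration with names of our own; it assumes, as the problem states, that each colony starts with one organism, and it uses a left-endpoint Riemann sum, which underestimates the integral here because the growth rate is increasing on [0, 5].

```python
import math


def growth_rate_c1(t):
    """Rate of change of the first colony, d/dt e^(t^2) = 2t e^(t^2)."""
    return 2 * t * math.exp(t**2)


def growth_rate_c2(t):
    """Given rate of change of the second colony."""
    return (3 * t + 5) * math.exp(t**2)


def left_riemann_sum(rate, start, end, n):
    """Left-endpoint Riemann sum of rate over [start, end] with n equal pieces."""
    width = (end - start) / n
    return sum(rate(start + i * width) * width for i in range(n))


# First approach: a lower bound for the second colony after five minutes.
c1_at_5 = math.exp(25)
c2_lower_bound = 1 + left_riemann_sum(growth_rate_c2, 0, 5, 1000)

# Second approach: compare the rates at sample times in (0, 5].
sample_times = [k / 100 for k in range(1, 501)]
rates_ordered = all(growth_rate_c1(t) < growth_rate_c2(t) for t in sample_times)

print(f"C1(5)           = {c1_at_5:.3e}")
print(f"C2(5) is at least {c2_lower_bound:.3e}")
print("C1' < C2' at every sampled time:", rates_ordered)
```

The sampled comparison of rates is only suggestive; the algebraic inequality 2t < 3t + 5 for t > 0 is what actually justifies the conclusion.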
In the classroom, students used both of the above approaches. The instructor built on
the first approach to necessitate a proof of the Fundamental Theorem of Calculus which we
describe below. If we subdivide the interval [0, 5] into n subintervals of equal length and
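assume the rate of change is constant on each subinterval, then we have:
\begin{align*}
C_2(5) - C_2(0) &\approx \frac{dC_2}{dt}(0)\,\frac{5}{n} + \frac{dC_2}{dt}\left(1 \cdot \frac{5}{n}\right)\frac{5}{n} + \cdots + \frac{dC_2}{dt}\left((n - 1) \cdot \frac{5}{n}\right)\frac{5}{n} \\
&= \sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n}.
\end{align*}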
The class responded by saying that
\[ C_2(5) - C_2(0) = \int_0^5 \frac{dC_2}{dt}(t)\,dt. \]
The instructor asked the class what the symbol on the right hand side means, but they
had difficulty answering this question. A few of the students said that the symbol means
C2 (5) − C2 (0). The instructor then asked the class how the symbol on the right hand side is
related to our sum. One student responded by saying that
\[ \int_0^5 \frac{dC_2}{dt}(t)\,dt \approx \sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n} \]
with the approximation becoming better as n gets larger, and the rest of the class agreed.
The instructor summarized the students’ comments as follows: So $\int_0^5 \frac{dC_2}{dt}(t)\,dt$ is a number
with the property that we can make the sum $\sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n}$ as close to it as we want
provided that n is large enough. More formally, for any small number ε > 0, we can find n
large enough so that:
\[ \left| \int_0^5 \frac{dC_2}{dt}(t)\,dt - \sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n} \right| < \varepsilon. \]
The instructor then noted that since we do not want to restate this repeatedly, we should
replace this long statement with the symbol:
\[ \lim_{n \to \infty} \sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n} = \int_0^5 \frac{dC_2}{dt}(t)\,dt. \]
The instructor then noted the connections to previous discussion about the relationship
between $\frac{\Delta f}{\Delta x}$ and $\frac{df}{dx}$. The instructor then posed the following question: Why does the
approximation become exact in the limit as n approaches ∞? The instructor pointed out that
this question is nontrivial because even if we suppose that we can bring the limit inside the
sum (which may be problematic in and of itself), the limits of each of the individual terms in
the sum may be very different; for example, one of these limits may be 0, another may be 5,
and another may be infinite. Thus, a legitimate problem is to demonstrate that
\[ \lim_{n \to \infty} \sum_{i=0}^{n-1} \frac{dC_2}{dt}\left(i \cdot \frac{5}{n}\right)\frac{5}{n} = C_2(5) - C_2(0), \]
i.e. our problem is to show that
\[ \int_0^5 \frac{dC_2}{dt}(t)\,dt = C_2(5) - C_2(0). \]
The class agreed that there is nothing special about the number 5; we could replace 5 by
any number. So actually we expect that for any number x,
\[ \int_0^x \frac{dC_2}{dt}(t)\,dt = C_2(x) - C_2(0). \]
The class observed that on the left hand side we have a function, and on the right hand
side we also have a function. So if we call the function on the left hand side f (x), then our
claim is that C2 (x) − f (x) = C2 (0), i.e. the functions C2 and f differ by a constant. When
asked how to demonstrate that the difference between the functions C2 and f is constant,
one student suggested showing that the rates of change of C2 and f are the same. The rest
of the class agreed, so the goal of the class was now to show that
\[ \frac{df}{dx}(x) = \frac{dC_2}{dx}(x). \]
The instructor then presented the following proof, noting details that we would need to prove
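later by writing the letter W (for warning) over equalities that were not fully justified.
\begin{align*}
\frac{df}{dx}(x) &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\int_0^{x + \Delta x} \frac{dC_2}{dt}(t)\,dt - \int_0^x \frac{dC_2}{dt}(t)\,dt}{\Delta x} \\
&\overset{W}{=} \lim_{\Delta x \to 0} \frac{\int_x^{x + \Delta x} \frac{dC_2}{dt}(t)\,dt}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\lim_{n \to \infty} \sum_{k=0}^{n-1} \frac{dC_2}{dt}\left(x + \frac{k\Delta x}{n}\right)\frac{\Delta x}{n}}{\Delta x} \\
&= \lim_{\Delta x \to 0} \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \frac{dC_2}{dt}\left(x + \frac{k\Delta x}{n}\right) \\
&\overset{W}{=} \lim_{n \to \infty} \lim_{\Delta x \to 0} \frac{1}{n} \sum_{k=0}^{n-1} \frac{dC_2}{dt}\left(x + \frac{k\Delta x}{n}\right) \\
&= \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \frac{dC_2}{dt}(x) \\
&= \frac{dC_2}{dt}(x) \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} 1 \\
&= \frac{dC_2}{dt}(x) \lim_{n \to \infty} \frac{n}{n} \\
&= \frac{dC_2}{dt}(x).
\end{align*}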
Thus we have shown that C2 (x) − f (x) = K, and it remains only to show that K = C2 (0).
But K = C2 (0) − f (0) = C2 (0) − 0, so we are done.
This proof was well received by the students. As a follow-up assignment, the instructor
challenged the students to “make the proof their own” by stating the Fundamental Theorem
of Calculus in terms of functions other than the ones used in the above proof and then
reprove the theorem using their own notation.
Problem 5.3: The weight of a cubic shipping box, in pounds, can be modeled by
\[ w(t) = \frac{1}{400}\int_0^{t^2} \left(1 + x^{1/2}\right)^{1/4} dx, \]
where t is the length of a side of the box, measured in centimeters. You need to ship a cubic
box with a side length of 20 cm. About a month ago, you shipped a cubic box with a side
length of 15 cm, and it weighed 1.012 pounds. Approximately how much will the 20×20×20
box weigh?
This problem should be assigned after the discussion of the Fundamental Theorem of
Calculus. The goal is two-fold: internalizing and applying the FTC and using derivatives
to find a linear approximation. Here students are expected to struggle with determining an
appropriate problem solving approach. Some may write down the integral
\[ w(20) = \frac{1}{400}\int_0^{400} \left(1 + x^{1/2}\right)^{1/4} dx, \]
and discuss possible methods of approximating/evaluating this integral, but after some discussion, it is likely that they will seek an alternative approach. (Although a few of our
students did evaluate the integral using a sequence of substitutions.) The need to avoid
computation necessitates that the students find a linear approximation for w; and in doing
so, they must employ the FTC to find $w'(15)$ in the following expression:
\[ w(20) \approx w(15) + w'(15)(20 - 15). \]
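For reference, the linear approximation can be carried out in a few lines. In the Python sketch below the function names are ours, and the derivative is the one obtained from the FTC together with the chain rule, w′(t) = (1/400)(1 + (t²)^{1/2})^{1/4} · 2t, which simplifies to (t/200)(1 + t)^{1/4} for t ≥ 0.

```python
def weight_rate(t):
    """w'(t) from the FTC and the chain rule, valid for t >= 0:
    (1/400) * (1 + sqrt(t^2))**(1/4) * 2t = (t/200) * (1 + t)**(1/4)."""
    return (1 + t) ** 0.25 * 2 * t / 400


def linear_estimate(w_known, t_known, t_new):
    """Tangent-line estimate w(t_new) ~ w(t_known) + w'(t_known)(t_new - t_known)."""
    return w_known + weight_rate(t_known) * (t_new - t_known)


slope = weight_rate(15)                    # pounds per centimeter of side length
estimate = linear_estimate(1.012, 15, 20)  # uses the known weight w(15) = 1.012
print(f"w'(15) = {slope:.3f} lb/cm")
print(f"w(20) is approximately {estimate:.3f} lb")
```

The side length 15 makes 1 + t equal to 16, whose fourth root is 2, so w′(15) = 0.15 and the estimate is 1.012 + 0.15 · 5 = 1.762 pounds.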
Theorems. Proofs of both the Mean Value Theorem and Fundamental Theorem of Calculus
are presented in the pedagogical discussions of the problems which necessitated them (refer
to Problems 4.14, 4.15, and 5.2); here we present an alternative proof of the Mean Value
Theorem as well as a proof of the Fundamental Theorem of Calculus outside the context of
the bacterial colonies problem.
Theorem 5.1. Mean Value Theorem: If f is continuous on [a, b] and differentiable on (a, b),
then there is a number c in (a, b) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
Remarks on Proof of Mean Value Theorem. The traditional proof of the Mean Value
Theorem involves checking several conditions that are not very meaningful for the students.
We suggest a different approach, and we also suggest not worrying about the open and
closed intervals, since functions considered in this unit will be differentiable everywhere we
are considering. The proof involves an intuitive lemma that was developed in Problem 5.2.
Intuitively, the statement is that if one function starts at least as large as another function
and increases at least as fast, then it will always stay larger. Formally, we can express this
as:
Lemma. If $f(a) \geq g(a)$ and $f'(x) \geq g'(x)$ (respectively $f'(x) > g'(x)$) for all $x \geq a$, then
$f(x) \geq g(x)$ for all $x \geq a$ (respectively $f(x) > g(x)$ for all $x > a$).
This lemma allows a quick indirect proof. Consider some differentiable function f and let
g be the linear function whose graph is the line through the points (a, f(a)) and (b, f(b)).
Suppose there is no $x \in [a, b]$ where $f'(x) = g'(x)$. Then we either have $f'(x) > g'(x)$
or $f'(x) < g'(x)$ for all $x \in [a, b]$ (assuming, that is, that the derivatives of f and g are
continuous, or else using the nontrivial fact that derivatives take on intermediate values
even when they aren’t continuous). Without loss of generality, assume that $f'(x) > g'(x)$.
Then, since f(a) = g(a), we can conclude by the lemma that f(b) > g(b). But f(b) = g(b),
so this is a contradiction.
This proof should be presented after students have used the idea in a couple of problems.
Students can be asked if they remember or can create a proof. Presumably, they will not
remember a complete proof, in which case the instructor can ask what they think would go
wrong if the theorem failed– setting up an indirect proof as above. If some students are able
to put together a different proof, the instructor can also present the proof above and have
students compare the proofs.
Theorem 5.2. Fundamental Theorem of Calculus (Part One): If f is continuous on [a, b],
then the function g defined by
\[ g(x) = \int_a^x f(t)\,dt, \qquad a \leq x \leq b \]
is continuous on [a, b] and differentiable on (a, b), and $g'(x) = f(x)$.
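Remarks on Proof of FTC1. If x and x + h are in (a, b), then
\begin{align*}
g(x + h) - g(x) &= \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \\
&\overset{W}{=} \int_x^{x+h} f(t)\,dt \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} f(x + ih/n)\,h/n.
\end{align*}
Thus
\begin{align*}
g'(x) &= \lim_{h \to 0} \frac{\lim_{n \to \infty} \sum_{i=1}^{n} f(x + ih/n)\,h/n}{h} \\
&= \lim_{h \to 0} \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x + ih/n) \\
&\overset{W}{=} \lim_{n \to \infty} \lim_{h \to 0} \frac{1}{n} \sum_{i=1}^{n} f(x + ih/n) \\
&= \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x) \\
&= f(x).
\end{align*}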
Note that the above proof shows that g is differentiable, so it follows that g is continuous,
as claimed.
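The key quantity in the proof, the average (1/n) Σ f(x + ih/n), can be explored numerically before the limits are discussed. The Python sketch below is an illustration only; the choice f = cos and the point x = 1 are arbitrary assumptions of ours, as is the function name.

```python
import math


def average_of_samples(f, x, h, n):
    """(1/n) * sum of f(x + i*h/n) for i = 1..n.

    This is the Riemann sum for the integral of f over [x, x + h], divided by h.
    """
    return sum(f(x + i * h / n) for i in range(1, n + 1)) / n


f = math.cos
x = 1.0
for h in (1.0, 0.1, 0.01, 0.001):
    avg = average_of_samples(f, x, h, 1000)
    print(f"h={h}: average = {avg:.6f}, f(x) = {f(x):.6f}")
```

As h shrinks, every sample point x + ih/n is squeezed toward x, so for a continuous f the average settles on f(x). This is the intuition behind taking the h-limit first.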
As with the Mean Value Theorem, we should not focus too much on the intervals and
inequalities involved, since these will not be a problem in the situations considered. Overall, the intellectual need for using the Fundamental Theorem of Calculus should have been
built up through the problem sequence. The intellectual need for proving can come from
a focused discussion. Students should intuitively see that the accumulation of velocity is
given by a difference in position, or, equivalently, the rate of change of accumulated distance
is velocity. However, these intuitive notions use only average velocity and a discrete accumulation function. It is clear that distance travelled over some interval is given by average
velocity over that interval times duration, but we need to check that these same ideas will
hold with our careful definitions of instantaneous change (derivative) and continuous accumulation (integral). To do this, we need to express our intuitive ideas mathematically using
these definitions. Doing so leads to the starting point for the first proof above. Although
the expression may look ugly, it is simply a mathematical expression of what we expect to
happen. The question then becomes how we can transform one expression into the other.
There is a need to deal with two limits. The instructor suggests that it will be easier if
we first take the h-limit; we do not expect students to object to interchanging limits since,
intuitively, the limits operate simultaneously. Once we do so, the algebra works out nicely
to give the expected result, and our intuition is confirmed.
Theorem 5.3. Fundamental Theorem of Calculus (Part Two): If $f$ is continuous on $[a, b]$,
and $F' = f$ on $(a, b)$, then
\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]
Remarks on Proof of FTC2. We use the FTC1 to prove the FTC2. Based on the concept
image and prior experience using the theorem, the students should expect the result. We
first prove it for some particular antiderivative, then show that any two antiderivatives will
differ by a constant. Define $F(t) = \int_a^t f(x)\,dx$. By FTC1, $F'(t) = f(t)$. Then we have:
\[
F(b) - F(a) = \int_a^b f(x)\,dx - \int_a^a f(x)\,dx = \int_a^b f(x)\,dx.
\]
Now, suppose $F$ and $G$ are both antiderivatives of $f$, i.e. $F'(t) = G'(t) = f(t)$. Then
$(F - G)'(t) = 0$ for all $t$. By the Mean Value Theorem, the only function that never changes
is a constant, so $F - G$ must be constant; i.e. $F$ and $G$ differ by a constant. Note that,
from a mathematical perspective, the first proof on its own would raise the issue of whether
any antiderivative of $f$ can be expressed as $\int_c^t f(x)\,dx$ for some $c$. Although this issue is very
interesting to explore (it isn't true that every antiderivative can be expressed in this way!),
it is peripheral to the focus of the FTC2, so we prefer the simple proof given.
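For instructors who want a concrete instance of the parenthetical remark, one possible example (our choice of function, not taken from the problem sequence) uses $f(x) = \cos x$. Every function of the form $\int_c^x \cos t\,dt$ is $\sin x$ shifted by a constant between $-1$ and $1$, so an antiderivative shifted by more than that cannot be written this way.

```latex
\[
\int_c^x \cos t \, dt = \sin x - \sin c, \qquad -1 \le -\sin c \le 1 \quad \text{for every } c,
\]
\[
G(x) = \sin x + 2 \ \text{ satisfies } \ G'(x) = \cos x, \ \text{ but } \
G(x) \neq \int_c^x \cos t \, dt \ \text{ for any choice of } c .
\]
```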
Supplementary and Practice Problems.
Problem 5.4. I’m driving my truck up a hill at 50 mph. I speed up smoothly to 60 mph,
and it takes me one minute to do so. About how far did I go in that minute?
Problem 5.5. A cup of coffee at 90°C is put in a 20°C room at 12:00 pm. The coffee's
temperature is changing gradually. At noon, the temperature of the coffee is decreasing by
3.8°C per minute, whereas at 1:00 pm, the temperature of the coffee is decreasing by 0.15°C
per minute. Approximately what is the temperature of the coffee at 1:00 pm?
Problem 5.6. Your friend, a magazine salesman, can make about $\frac{10}{u^2+2}$ ($u \ge 0$) dollars/mile
after walking $u$ miles. One day, you decide to outsell him by starting earlier and
walking twice as fast so that when he has walked $u$ miles, you have walked $2u + 5$ miles.
(1) Write an expression for the total amount of money your friend can expect to earn
after walking $x$ miles. Write an expression for how much you can expect to earn
during this same period.
(2) What is the rate of change of your earnings with respect to the distance your friend
has travelled?
As with Problem 5.3, one of the main goals of this problem is for the students to internalize
and apply the Fundamental Theorem of Calculus. The most likely difficulty for the students
will be to model the situation mathematically. The solution to the first part of the problem
is the accumulation functions
\[
\int_0^{x} \frac{10}{u^2+2}\,du
\qquad\text{and}\qquad
\int_0^{2x+5} \frac{10}{u^2+2}\,du.
\]
The solution of the second part of the problem is obtained by differentiating the latter
accumulation function, which yields
\[
\frac{20}{(2x+5)^2+2}.
\]
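Students who doubt the factor of 2 coming from the chain rule can check the result numerically. The following is a minimal sketch, assuming a simple midpoint rule for the integral and a central difference for the derivative; the function names and the test point $x = 1.5$ are hypothetical choices for illustration.

```python
def rate(u):
    """Friend's earning rate in dollars per mile after walking u miles."""
    return 10.0 / (u * u + 2.0)

def integrate(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width

def your_earnings(x):
    """Your accumulated earnings when your friend has walked x miles."""
    return integrate(rate, 0.0, 2.0 * x + 5.0)

def claimed_derivative(x):
    """Rate of change predicted by FTC1 together with the chain rule."""
    return 20.0 / ((2.0 * x + 5.0) ** 2 + 2.0)

x, dx = 1.5, 1e-4
# Central difference estimate of d/dx of the accumulation function.
numeric = (your_earnings(x + dx) - your_earnings(x - dx)) / (2.0 * dx)
print(f"numeric estimate: {numeric:.6f}")
print(f"claimed formula:  {claimed_derivative(x):.6f}")
```

The two printed values should agree to several decimal places; dropping the chain-rule factor would give a value half as large.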
Problem 5.7. The cost of heating a house for the first $t$ days in the period from February
1 until May 1 is approximately $\int_0^{t^{3/2}} \frac{4}{\sqrt{x+1}}\,dx$ dollars. Approximately how much are the
heating costs per day at the beginning of April?
This problem again involves reasoning with FTC1. We expect that students will find the
general rate of change at day $t$, then simply plug in 60 (first day of April). We have
\[
\frac{d}{dt} \int_0^{t^{3/2}} \frac{4}{\sqrt{x+1}}\,dx = \frac{6\sqrt{t}}{\sqrt{t^{3/2}+1}}.
\]
Students could simply rewrite this expression with 60 substituted for t, or they could use a
calculator to evaluate it.
Problem 5.8. Consider the function $S$ defined by $S(x) = \int_0^{x/2} \frac{\sin t}{t}\,dt$. Given that $S(\pi) = 1.37$,
approximate $S(4)$.
Problem 5.9. Verify the Mean Value Theorem for the function $e^x$ on the interval $[0, 1]$.
Illustrate your answer with a sketch to demonstrate what is happening geometrically.
This problem should be straightforward for students, but it purposely does not specify
exactly what they should do to verify the Mean Value Theorem (as most textbooks do).
Thus, students have to reason about what the Mean Value Theorem would say in this specific
situation. The resulting equation they have to solve, finding an $x$ for which $e^x = e - 1$, is
easy but reinforces the way of thinking of leaving expressions unsimplified because they have
to resist the temptation to simplify $\ln(e - 1)$.
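After students have the exact answer, a quick numerical check can confirm that it lies in the interval and that the tangent slope there matches the secant slope. This is only a sketch for classroom use; the variable names are our own.

```python
import math

# Secant slope of e^x over [0, 1]: (e^1 - e^0) / (1 - 0) = e - 1.
secant_slope = (math.exp(1.0) - math.exp(0.0)) / (1.0 - 0.0)

# The Mean Value Theorem asks for c with e^c = e - 1, i.e. c = ln(e - 1).
c = math.log(math.e - 1.0)

# The derivative of e^x is e^x, so the tangent slope at c is e^c.
tangent_slope = math.exp(c)

print(f"c = ln(e - 1) = {c:.4f}")
print(f"secant slope  = {secant_slope:.6f}")
print(f"tangent slope = {tangent_slope:.6f}")
```

The value of `c` falls strictly between 0 and 1, which is what the sketch of the tangent line parallel to the secant line should show.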
Problem 5.10. The charge per unit length on a rod at a point $x$ meters from the left end
is given by $2xe^{x^2}$. How much charge is contained in the portion of the rod between 1 and 2
meters from the left end?
Problem 5.11. Money Problem:
(1) Jack puts $2,000 in a savings account that earns 3% interest, compounded annually.
Jill puts $1,000 in a savings account that earns 6% annually.
(a) Whose account is worth more in the long run? Are the account balances ever
equal?
(b) How do things change if the interest is compounded monthly? Daily? Continuously?
(2) Jack and Jill find out their friend has died. This friend leaves Jill a one-time lump
sum of $100,000 and leaves Jack $7,000 now and every year until he dies.
(a) Who do you think gets a better inheritance?
(b) How does this change if Jack's $7,000 is given to him as a continuous income
stream; that is, if he gets $7,000 evenly spread over each year?
This sequence of problems addresses many issues, so it is expected to occupy a large
chunk of time and will require significant instructor interventions to clarify and scaffold the
tasks. Students are intended to use graphing calculators to aid in their calculations and
allow testing of their reasoning.
Problem (1)(a) is meant to cause puzzlement. Since the initial instantaneous rates of
change are the same, but one person starts with more money, it seems logical that this
person would have more money at the end. However, the actual answer contradicts this
reasoning (if one allows sufficient time).
The problems necessitate comparing functions in sophisticated ways based on their starting amounts and rates of change. Problem (1)(b) is purposely vague, forcing students to
formulate a more precise question. Answering this question requires coordinating not only
the two functions constructed and how they change individually with respect to a parameter
(how interest is compounded), but how they change with respect to each other as that parameter varies. This sophisticated reasoning should be scaffolded by the fact that students
should remember that “compounding more often is better”, so it is not too much of a leap
to expect this effect to be more pronounced for higher interest rates.
Question (2)(b) necessitates use of calculus, and it may help to answer question (2)(a). If
students have considered the case of continuous interest, it should be a natural extension to
look at obtaining the inheritance itself continuously. Students’ previous calculations could
be considered approximations for this question, which they are now finding exactly using
integration.
It is likely that most students will approach the first question similarly: writing explicit
formulas and looking at their values for large times or setting them equal. However, the
means they use to do this may vary. Let $P(t)$ and $Q(t)$ denote the values of Jack's and Jill's
accounts, respectively. Then $P(t) = 2000(1 + .03)^t$ and $Q(t) = 1000(1 + .06)^t$. Setting these
equal, we see that their accounts will have the same amount at
\[
t_1 = \ln 2 \left( \ln \frac{1.06}{1.03} \right)^{-1} \approx 24.14 \text{ years.}
\]
However, a debate could arise as to whether this means they will be equal or
not. One could take the question literally and argue that, if the interest is only deposited
exactly at the one-year mark, then the accounts never have the same balance (similarly for
the next two values). Changing the compounding yields:
\[
t_2 = \frac{\ln 2}{12} \left( \ln \frac{1 + .06/12}{1 + .03/12} \right)^{-1} \approx 23.19 \text{ years,}
\]
\[
t_3 = \frac{\ln 2}{365} \left( \ln \frac{1 + .06/365}{1 + .03/365} \right)^{-1} \approx 23.108 \text{ years,}
\qquad
t_4 = \frac{\ln 2}{.03} \approx 23.105 \text{ years.}
\]
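Since students are expected to use calculators here, the four crossing times can also be tabulated with a short script. The sketch below solves $2000(1 + .03/n)^{nt} = 1000(1 + .06/n)^{nt}$ for $t$ when interest is compounded $n$ times per year; the function names are hypothetical and chosen only for this illustration.

```python
import math

def crossing_time(n):
    """Years until Jack's and Jill's balances match with n compoundings per year.

    Solves 2000 * (1 + 0.03/n)**(n*t) == 1000 * (1 + 0.06/n)**(n*t) for t.
    """
    return math.log(2) / (n * math.log((1 + 0.06 / n) / (1 + 0.03 / n)))

def crossing_time_continuous():
    """Years until 2000 * e**(0.03 t) == 1000 * e**(0.06 t)."""
    return math.log(2) / (0.06 - 0.03)

for label, n in (("annually", 1), ("monthly", 12), ("daily", 365)):
    print(f"{label:>12}: {crossing_time(n):.3f} years")
print(f"{'continuously':>12}: {crossing_time_continuous():.3f} years")
```

The printed times decrease toward the continuous value as $n$ grows, which supports the expectation that more frequent compounding helps the higher-rate account more.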
For the second question, an initial period of discussion of what exactly is being asked
and how to model the situation should precede actual calculation. However, students are
expected to experiment with the numbers more before moving to a closed formula. It makes
sense to consider both parties investing at the same yield r. The answer is very sensitive to
this yield; 7% is a cutoff value at which Jack’s investment can never catch up to Jill’s. The
previous problem will have prepared students to attend to how the interest is compounded.
If yearly, Jack's investment will then be worth $7000\left(1 + (1 + r) + (1 + r)^2 + \cdots + (1 + r)^t\right)$
after $t$ years. After experimentation, some students may realize that the yearly increases
form a geometric sequence, from which it can be found that (redefining the functions used
earlier)
\[
P(t) = 7000\,\frac{(1 + r)^{t+1} - 1}{r} \qquad\text{and}\qquad Q(t) = 100{,}000(1 + r)^t,
\]
so we could assume a yield of 6% (a likely student idea due to the previous problem), and
find that the investments will be equal at
\[
t_1 = -\frac{\ln(1.06 - 6/7)}{\ln(1.06)} \approx 27.4 \text{ years.}
\]
On the other hand, with continuously compounded interest, we would have
\[
P(t) = 7000\,\frac{1 - e^{r(t+1)}}{1 - e^{r}} \qquad\text{and}\qquad Q(t) = 100{,}000\,e^{rt},
\]
so that
\[
t_2 = \frac{\ln\left( \frac{100}{7}\left(1 - e^{.06}\right) + e^{.06} \right)}{-.06} \approx 28.7 \text{ years.}
\]
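The two break-even times can be checked by solving $P(t) = Q(t)$ in closed form and then substituting back. The sketch below assumes, as in the discussion above, a 6% yield for both parties and a first $7,000 payment at time 0; the function names are illustrative only.

```python
import math

R = 0.06  # assumed common yield for both investments

def jack_yearly(t, r=R):
    """Jack's balance after t years: $7000 now and each year, compounded yearly."""
    return 7000.0 * ((1 + r) ** (t + 1) - 1) / r

def jill_yearly(t, r=R):
    """Jill's $100,000 lump sum compounded yearly."""
    return 100000.0 * (1 + r) ** t

def jack_continuous(t, r=R):
    """Jack's balance with yearly payments and continuously compounded interest."""
    return 7000.0 * (1 - math.exp(r * (t + 1))) / (1 - math.exp(r))

def jill_continuous(t, r=R):
    """Jill's lump sum with continuously compounded interest."""
    return 100000.0 * math.exp(r * t)

# Solving jack_yearly(t) = jill_yearly(t) gives (1+r)^t * (1 + r - 6/7) = 1.
t1 = -math.log(1 + R - 6.0 / 7.0) / math.log(1 + R)

# Solving jack_continuous(t) = jill_continuous(t) for t.
t2 = math.log((100.0 / 7.0) * (1 - math.exp(R)) + math.exp(R)) / (-R)

print(f"yearly compounding:     t1 = {t1:.2f} years")
print(f"continuous compounding: t2 = {t2:.2f} years")
```

Evaluating both balance functions at `t1` and `t2` gives matching values, which is a useful habit for students who are unsure whether they solved the equation correctly.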
For (2)(b), we can use one of several formulas for a continuous income stream, depending
on whether one wants to compare the present or future values of the investments. The easiest
solution is to find the present value of Jack's income stream with $T$ as a parameter:
\[
\int_{t=0}^{T} 7000e^{-.06t}\,dt = 7000\left( \frac{e^{-.06T}}{-.06} - \frac{1}{-.06} \right).
\]
Setting this equal to 100,000 we get
\[
T = \frac{\ln(1 - 6/7)}{-.06} \approx 32.4 \text{ years.}
\]
However, it is expected that students will not be familiar with any formula for value of an
income stream, so one possible student approach is to consider Jack as getting $\$7000/n$ $n$ times
per year, then take the limit as $n \to \infty$ to obtain a formula for the value of his investment.
Taking the limit formally turns out to be quite difficult, which might lead students to calculate an approximation by choosing some large value of $n$ and using a graphing calculator.
For instance, if we let $n = 10^6$ and also compound $n$ times per year, the investments turn
out to be equal at $T_2 \approx 32$ years. Students using this approach should be unhappy with
only having obtained an approximation, so they should feel intellectual need for the present
value formula, which the instructor can present at the end of the problem.
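The discrete approximation described above can be sketched as follows. We assume (this is our modeling choice, not stated in the problem) that each payment of $7000/n$ arrives at the end of its period and that both investments earn 6% compounded $n$ times per year; the function name and parameters are hypothetical.

```python
import math

def equal_time(n, r=0.06, payment=7000.0, lump=100000.0):
    """Years until n-times-per-year payments catch up with the lump sum.

    Jack deposits payment/n at the end of each of the n periods per year,
    and both balances grow at rate r/n per period. Setting
        (payment/r) * ((1+i)**N - 1) = lump * (1+i)**N,   i = r/n,
    gives (1+i)**(-N) = 1 - lump*r/payment, and the time is T = N/n.
    """
    i = r / n
    periods = -math.log(1.0 - lump * r / payment) / math.log1p(i)
    return periods / n

for n in (1, 12, 365, 10**6):
    print(f"n = {n:>7}: T = {equal_time(n):.3f} years")

# Value from the present-value formula for a continuous stream: ln(7) / 0.06.
print(f"continuous: T = {math.log(7) / 0.06:.3f} years")
```

As $n$ grows the computed times approach the value obtained from the present value formula, which gives students a way to see the limit numerically before meeting the formula itself.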
Alternatively, students might consider differentials and instantaneous rate of change: in
some small interval dt, Jack gets $7000dt from the trust and .06P dt from interest. Thus,
they obtain the differential equation dP/dt = 7000 + .06P , which some students should
recognize as solvable by separation of variables. It is expected that many students will make
some mistakes while solving, such as leaving off the integration constant or not attending
to the correct initial condition, but they should be able to check whether their answers are
reasonable and understand errors when these issues are probed. The particular solution with
initial condition $P(0) = 0$ is given by: $P(t) = \frac{7000}{.06}\left(e^{.06t} - 1\right)$. Setting this equal to
$Q(t) = 100{,}000e^{.06t}$, students obtain the equation $e^{.06t} - 1 = \frac{6}{7}e^{.06t}$, which yields the
value $T$ from above as the solution.
Overall, if Jack and Jill are young and the yield is significantly below 7% (both likely
student assumptions), then Jack's inheritance is better.