Chapter 1
Introduction to the Real Numbers

1.1 Introduction
Most students feel that they have an understanding of the real numbers and/or
the real line. Throughout most of your education you have drawn a line—
sometimes called a number line—having a positive (and hence negative) direction and with the integers indicated. From this you could approximately
designate any other real number.
You used two of these as axes when you graphed functions. When you
learned how to take limits as x → a, you used a number line as the x-axis when
you (or the instructor or the book) explained the meaning of the concept of a
limit. When you were introduced to integrals, an interval on the real line was
subdivided to aid in the definition of the integral.
Unless you were given a non-standard introduction to these concepts, you
really didn't know enough about the real line to know what's missing. You
surely don't know enough about the real line to be able to prove the important
calculus theorems.
This isn't as bad as it may sound. When Isaac Newton and Gottfried Wilhelm Leibniz invented calculus in the late 1600s, they used a very intuitive
approach. After a while some people started pointing out that there were inconsistencies in their approaches. Over the next 200 years many of the great
mathematicians worked on rigorizing calculus. In 1754 Jean le Rond d'Alembert
decided that it was necessary to give a rigorous treatment of limits. Joseph-Louis
Lagrange published his first paper rigorizing calculus in 1797. As a part of his
work on hypergeometric series, in 1812 Carl Friedrich Gauss gave a rigorous discussion of the convergence of an infinite series. And finally, Augustin-Louis
Cauchy in 1821 answered d'Alembert's call and introduced a theory of limits. In 1874 Karl Weierstrass gave an example of an everywhere continuous,
nowhere differentiable function. This example illustrated that geometric intuition was not an adequate tool for analytic studies. Weierstrass realized that
to perform rigorous analysis there must be an understanding of the real number system. He instigated a program known as the arithmetization of
analysis which, through the work of Weierstrass and his followers, established
the rigorous treatment of the real number system as a foundation for classical
analysis. Weierstrass died in 1897.
Summarizing the situation, it took mathematicians 200 years from when the
ideas of calculus were first introduced until they arrived at an understanding of how
and why calculus really works. So it's not too bad that you may have started
learning calculus two or three years ago and some of the important essentials
were skipped.
This chapter serves as an introduction to the set of real numbers. There are
at least three common approaches to introducing the set of real numbers. For
many people the approaches using either Dedekind cuts (where the real numbers
are represented by sets of rational numbers) or Cauchy sequences (where the real
numbers are represented by equivalence classes of Cauchy sequences of rational
numbers) are more satisfying in that you actually construct the set of reals.
In either case, however, neither the rationals nor the reals look like numbers
that you are accustomed to using. Instead of using either of these approaches,
we shall describe the set of real numbers by a suitable set of postulates. One
advantage of this approach is that it is the fastest (and we don't want to spend
too much time on it). In addition, the set of postulates gives us a very explicit
list of the most basic properties satisfied by real numbers.
In addition to introducing the real numbers in this chapter, we will discuss
certain aspects of proofs. We feel that this is necessary, or at least advantageous,
to help the reader understand the proofs given in this text.
Before we get started we will review some things that we know (or at least
we think we know), introduce some notation and discuss some useful results and
ideas. We begin by defining the sets
• the set of natural numbers N = {1, 2, 3, · · · },
• the set of integers Z = {· · · , −3, −2, −1, 0, 1, 2, 3, · · · },
• the set of rational numbers Q = {m/n : m, n ∈ Z, n ≠ 0}.
Of course N ⊂ Z ⊂ Q. The description of Q is not ideal. It's not wrong,
but it doesn't take care of the fact that there are multiple descriptions of each
rational number, e.g. 1/2 = 3/6 = 123/246, etc. One complicated way around
this is to define a rational number as an equivalence class of fractions
that are equal. It's a bit easier to just always consider the rational number
in "reduced form", where common factors of the integers in the numerator
and the denominator have been divided out. We will take this latter approach.
Hence the "rational numbers" 1/2, 3/6 and 123/246 are all represented by 1/2.
One way to introduce the need for the real number system is to start with the
set of rational numbers and decide that something important is missing. If you
graph the function f(x) = 2 − x² really carefully using only the rational numbers
on the x-axis, you see that the graph passes through the x-axis without hitting
the axis—we don't want that, but then you really can't graph it that carefully.
Many of you have seen the common proof that √2 is not rational, which
goes as follows. (Read the proof carefully. It's not terribly important to be
able to reproduce the proof, but it is important that you can follow the proof.)
Assume false, i.e. that √2 is rational. Then √2 = m/n where m and n are in
Z, n ≠ 0 and m/n is in reduced form. If we square both sides we get m² = 2n².
Since m² is a multiple of 2, m² is even. This implies that m is even. (If m
is not even, i.e. m is odd, then m = 2k + 1 for some integer k. But then
m² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1 is odd. This is a contradiction.)
If m is even, then m can be written as m = 2k. But then the facts that m = 2k
and m² = 2n² imply that m² = 4k² = 2n² or n² = 2k². Thus n must also
be even. This is a contradiction to the fact that we assumed that m/n was in
reduced form.
Thus we see that √2 is not rational. But do we really care? Do we need
a √2 in our lives? With a little bit of thought about the diagonal of the unit
square or the graph of the function f(x) = 2 − x², it's reasonably clear that we
want to have √2 in our number system, i.e. Q is not enough. In general, we do
not want to work on domains that have holes in them like Q has at √2.
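Although nothing replaces the proof, the hole in Q at √2 can also be glimpsed numerically. The following Python sketch is our own illustration (the search bound 200 is an arbitrary choice): it examines every reduced fraction m/n with 1 ≤ m, n ≤ 200 and confirms that none of them squares to 2, even though some come quite close.

```python
from fractions import Fraction
from math import gcd

# Search all reduced fractions m/n with 1 <= m, n <= 200 for one whose
# square is exactly 2, tracking how close the squares get to 2.
exact_hit = False
best_gap = Fraction(10)
for n in range(1, 201):
    for m in range(1, 201):
        if gcd(m, n) != 1:
            continue  # skip fractions not in reduced form
        q = Fraction(m, n)
        gap = abs(q * q - 2)
        if gap == 0:
            exact_hit = True
        best_gap = min(best_gap, gap)

print(exact_hit)        # False: no rational in the range squares to 2
print(float(best_gap))  # small but nonzero
```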
It should be pretty clear that there are a lot of rational numbers (there are
a lot of natural numbers, and there are clearly a lot more rational numbers than
there are natural numbers). A little thought will convince you that there are
also a lot of numbers on the number line that are not rational. A proof similar
to the one given above will show that √3 and √7 are not rational. It's also easy
to prove that q√2 (Example 1.2.2) and q + √2 (HW1.2.2–(a)) are not rational
for any q ∈ Q (q ≠ 0 in the first case). Since there are a lot of what we think
of as numbers that are not rational, there are many holes in Q.
What we would like to do is figure out how to fill in the holes in the
number line at the numbers that are not rational. This is close to what the
approaches to building the real numbers using either Dedekind cuts or Cauchy
sequences do. We will really approach this from the other direction. We will
define the set of real numbers and then show that this set is what we want.
Before we go on, we would like to include a topic related to our work with
the rational numbers. The following result makes it easy to show that a given
number is not rational—and it shows more. Consider the following proposition.
Proposition 1.1.1 Consider the polynomial equation

    a₀xⁿ + a₁xⁿ⁻¹ + · · · + aₙ₋₁x + aₙ = 0                (1.1.1)

where a₀, a₁, · · · , aₙ are integers, a₀ ≠ 0, aₙ ≠ 0 and n ≥ 1. Let r ∈ Q be a root
of equation (1.1.1) where r = p/q expressed in reduced form. Then q divides a₀
and p divides aₙ.
Proof: If you observe carefully, you will see that the proof of this proposition
is really very similar to the way that we proved that √2 was not rational.
If r = p/q is a root of equation (1.1.1), then

    a₀(p/q)ⁿ + a₁(p/q)ⁿ⁻¹ + · · · + aₙ₋₁(p/q) + aₙ = 0.

Multiplying by qⁿ we get

    a₀pⁿ + a₁pⁿ⁻¹q + · · · + aₙ₋₁pqⁿ⁻¹ + aₙqⁿ = 0.        (1.1.2)

Solving for a₀pⁿ allows us to rewrite equation (1.1.2) as

    a₀pⁿ = −q[a₁pⁿ⁻¹ + a₂pⁿ⁻²q + · · · + aₙqⁿ⁻¹].

Since everything inside of the brackets is an integer, q must divide a₀pⁿ. Since
p/q is in reduced form, no factors of q divide out with any part of pⁿ (see
HW1.1.1). Thus, q divides a₀.
Likewise, we rewrite equation (1.1.2) as

    aₙqⁿ = −p[a₀pⁿ⁻¹ + a₁pⁿ⁻²q + · · · + aₙ₋₁qⁿ⁻¹].

We use the same argument as before. Because p must divide aₙqⁿ and no factors
of p can divide out any factors of qⁿ, p must divide aₙ.
Before we show you how nice this result is in relation to our work with
rational numbers, let us remind you that you have probably used this result
before. A while after you learned how to factor polynomials in your algebra
classes, you were faced with factoring polynomials of degree greater than or
equal to three. You were given a problem like "factor 2x³ + 3x² − 8x + 3." You
were taught to try to divide by x ± 3, x ± 3/2, x ± 1/2 and x ± 1. These potential
roots were formed by trying all rationals p/q where p is a factor of aₙ = 3 and
q is a factor of a₀ = 2, i.e. by applying Proposition 1.1.1. If and when you were
lucky enough to divide 2x³ + 3x² − 8x + 3 by x − 1 you got

    (2x³ + 3x² − 8x + 3)/(x − 1) = 2x² + 5x − 3.

You then factored the quadratic term, which gives you the complete factorization

    2x³ + 3x² − 8x + 3 = (x − 1)(2x − 1)(x + 3).

If none of the potential roots satisfies the equation, Proposition 1.1.1 implies
that there are no rational roots. In your algebra class you usually didn't have to
worry about that since they were trying to teach you how to factor—one of the
potential roots always satisfied the equation.
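The candidate lists above can be generated mechanically from Proposition 1.1.1. Here is a Python sketch (the helper `divisors` and the hard-coded coefficient list are our own illustrative choices) that enumerates all candidates ±p/q for 2x³ + 3x² − 8x + 3 and tests each one:

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of |n|."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

coeffs = [2, 3, -8, 3]  # 2x^3 + 3x^2 - 8x + 3, highest power first

def poly(x):
    """Evaluate the polynomial at x by Horner's method."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

# Proposition 1.1.1: a rational root p/q (reduced) must have
# p dividing the constant term and q dividing the leading coefficient.
candidates = {sign * Fraction(p, q)
              for p in divisors(coeffs[-1])
              for q in divisors(coeffs[0])
              for sign in (1, -1)}
roots = sorted(r for r in candidates if poly(r) == 0)
print(roots)  # the rational roots are -3, 1/2 and 1
```

Running the same enumeration with coeffs = [1, 0, -2] (i.e. x² − 2) produces an empty root list, which is exactly how the equation x² − 2 = 0 is handled next.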
Our application of Proposition 1.1.1 goes as follows. Consider the equation
x² − 2 = 0. By Proposition 1.1.1 we know that if there are going to be rational
roots to this equation, they will be either ±2 or ±1. It is easy to try these
four potential roots and see that none of them satisfies the equation x² − 2 = 0.
Therefore the equation has no rational roots. Solving for x, we know that
x = ±√2 represents the solutions to this equation. Therefore ±√2 must not be
rational.
This same approach can be used to produce many numbers that are not
rational. Many, such as √13, are as easy as √2. For some it is more difficult
to find the appropriate algebraic equation associated with the number, but the
method still works. For example, consider the number ∛((4 − √2)/3). Set
x = ∛((4 − √2)/3). Then x³ = (4 − √2)/3, 3x³ = 4 − √2, 3x³ − 4 = −√2 and
(3x³ − 4)² = 2. Expand this last expression and apply Proposition 1.1.1 to the
resulting polynomial (with integer coefficients). Surely ∛((4 − √2)/3) is a root
of this polynomial.
HW 1.1.1 Assume that p and q have the prime factorizations p = p₁ · · · pⱼ and
q = q₁ · · · qₖ, respectively, and that p/q is in reduced form. Prove that if q divides
a₀pⁿ (where a₀ is an integer), then q divides a₀.
HW 1.1.2 Prove that √13 is not rational.
HW 1.1.3 Prove that 3/2 + √13 is not rational.
1.2 Introduction to Proofs
Before we proceed with the next step of defining the set of real numbers, we
pause to include a short discussion of proofs. This topic is surely a bit of a
detour but it may prove to be helpful. In a text such as this, proofs are very
important. It is a time in your mathematical career when you see why things
are true. It is a time when you learn to write a proof that convinces the reader
that what you claim is indeed true. Probably most importantly, you learn to
read mathematics (specifically mathematical proofs) critically and be able to
evaluate whether the writer's argument is sound and whether what the writer
claims is true is in fact true.
Two types of proofs are important in mathematics: the direct proof and the
indirect proof. We will first discuss the simplest case, direct proof.
Direct Proofs: A direct proof is a valid argument with true premises. In
our case the true premises are usually axioms and definitions that have been
given, or previously proved results. If the statement to be proved is in the
form p implies q (which our statements will often be), then we can include the
statement p as one of the true premises. A valid argument should be defined and
studied in a logic class—but not many logic classes exist anymore. We will try
to show you what a valid argument is through a series of examples—just about
all (hopefully all) of the proofs in this text give examples of valid proofs. The
valid argument is a series of logical implications relating known facts, resulting in
the desired conclusion. Consider the following example.
Example 1.2.1
Prove that r₁, r₂ ∈ Q implies r₁ + r₂ ∈ Q.
Solution: The list of known facts mentioned as a part of the proof includes
the statement r₁, r₂ ∈ Q, the definition of Q and all known properties of
arithmetic for rational numbers. (We know more. That means that there are
more potential hypotheses, but these would not tend to be relevant here.) The
argument to prove this statement can be given as follows:
r₁, r₂ ∈ Q implies that r₁ = m₁/n₁ and r₂ = m₂/n₂ for m₁, m₂, n₁, n₂ ∈ Z
(by the definition of Q). Then

    r₁ + r₂ = m₁/n₁ + m₂/n₂ = (m₁n₂ + n₁m₂)/(n₁n₂)

(by known arithmetic for integers). m₁, m₂, n₁, n₂ ∈ Z implies that m₁n₂ + n₁m₂, n₁n₂ ∈ Z
(because Z is closed with respect to addition and multiplication). Therefore
r₁ + r₂ is the ratio of two integers, i.e. r₁ + r₂ is rational (by the definition of Q).
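The arithmetic steps in this argument can be mirrored exactly with Python's `fractions.Fraction` type, which stores a rational as the reduced m/n the proof manipulates. The sample values 3/4 and 5/6 below are our own arbitrary choices:

```python
from fractions import Fraction

m1, n1 = 3, 4  # r1 = 3/4
m2, n2 = 5, 6  # r2 = 5/6

# The formula from the proof: r1 + r2 = (m1*n2 + n1*m2) / (n1*n2)
num = m1 * n2 + n1 * m2  # 3*6 + 4*5 = 38, an integer
den = n1 * n2            # 4*6 = 24, a nonzero integer

# The ratio of the two integers agrees with Fraction's own addition.
assert Fraction(num, den) == Fraction(m1, n1) + Fraction(m2, n2)
print(Fraction(num, den))  # prints 19/12, the reduced form of 38/24
```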
This is a very easy proof, but it is hoped that it shows explicitly what the
"true hypotheses" are and how these hypotheses fit together with the valid
argument to construct the proof. We will have more difficult direct proofs, but
more difficult direct proofs will just be more difficult analogs of this proof. We
should realize that the statement p implies q can also be written as: if p, then q;
p is a sufficient condition for q; p only if q; and q is a necessary condition for p.
Depending on the author you may see all of these different expressions.
And finally we discuss again what we mean by true premises. It is difficult to move from the "do as I say" world of mathematics to the "prove it"
world of mathematics. At this time you "know" a lot of things that have been
told to you—things that have not been based on a firm mathematical foundation. Students sometimes have trouble knowing what they can assume as true
premises. It is clear that you can assume anything that we have given you as
postulates or definitions, and anything that you or we have proved. We did cheat a
bit when we told you that you know about the integers, the arithmetic for
integers and consequently the arithmetic for rationals. Actually, the facts that
you know for the integers include a small set of postulates and results proved
from those postulates. Because we had to start somewhere, we assume that you
know those. When it is necessary, we will include some of the properties of the
integers—postulated and/or proved. Just about every other true premise that
you will have to use or we will use will be included in this text. If we cheat, we
will try to remember to tell you that we are cheating.
Indirect Proofs: Indirect proofs are very common in analysis. There are
certain results that are very difficult to prove directly yet can be easily proved
using an indirect proof. The indirect proofs are based on the logical concepts
of the contrapositive and the contradiction. We discuss first the use of the
contrapositive in proof.
The Contrapositive: When the statement we wish to prove is if r, then s,
a common approach to proving the statement is to consider the contrapositive
of the statement. For this short discussion we will write the implication as
r → s and read it as r implies s. The contrapositive of the statement r → s
is the statement (∼s) → (∼r), where ∼s means "not s". We refer to ∼s
as the negation of s. An example of an implication that we proved earlier is:
n² is even implies that n is even (where the context implied that n ∈ Z). The
contrapositive of this statement is: n is not even implies n² is not even, or n is
odd implies n² is odd. It is not clear or easy to see that the statements n² is even
implies that n is even and n is odd implies n² is odd are equivalent. The easiest
way is to construct a simple truth table including r → s and (∼s) → (∼r), as
in Table 1.2.1.

    r   s   r → s   ∼s   ∼r   (∼s) → (∼r)
    T   T     T      F    F        T
    T   F     F      T    F        F
    F   T     T      F    T        T
    F   F     T      T    T        T

    Table 1.2.1: Truth table for the contrapositive.
The first two columns of Table 1.2.1 list all combinations of truth values
of the statements r and s; each of r and s can be true or false. Column 3 can
be thought of as the definition of the truth value of the implication. The point is
that the implication is only bad (i.e. false) if a true hypothesis implies a false
conclusion (row 2). Otherwise the implication is true. Columns 4 and 5 give the
truth values of the statements ∼s and ∼r (opposite of those of s and r). And
finally, column 6 gives the truth values of the statement (∼s) → (∼r) based
on the definition of the truth values of the implication (i.e. false only when a
true statement implies a false statement) and the truth values of ∼s and
∼r. We note that the truth values of r → s and (∼s) → (∼r) are the same.
That means that the statements r → s and (∼s) → (∼r) are equivalent.
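Table 1.2.1 can also be generated and checked mechanically. A short Python sketch of our own that enumerates the four rows and confirms that columns 3 and 6 always agree:

```python
from itertools import product

def implies(a, b):
    # An implication is false exactly when a true hypothesis
    # meets a false conclusion.
    return (not a) or b

rows = []
for r, s in product([True, False], repeat=2):
    direct = implies(r, s)          # r -> s
    contra = implies(not s, not r)  # (~s) -> (~r)
    rows.append((r, s, direct, contra))
    print(r, s, direct, contra)

# The two columns agree in every row, so the statements are equivalent.
assert all(direct == contra for _, _, direct, contra in rows)
```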
The result of the argument is that proving the statement r implies s is equivalent
to proving the statement not s implies not r, or, using our example, proving the
statement n² is even implies n is even is equivalent to proving the statement
n is odd implies n² is odd. This is good because the latter statement is very
easy to prove directly. The first statement is very difficult (if not impossible)
to prove directly. This proof was given in the previous section as a part of the
proof that √2 is not rational (before you knew that you were doing a proof by
proving the contrapositive of a statement). Therefore we use the easy argument
to prove that n is odd implies n² is odd. Then this will also imply that n² is
even implies n is even.
Contradiction: The second type of indirect proof is based on the logical
concept of a contradiction. A contradiction is a statement that is false for all
combinations of truth values. A proof by contradiction (or, when you are
really pleased with your proof, a proof by reductio ad absurdum) is based on
the fact that if p1 is a statement, then it is impossible for p1 and ∼p1 to both
be true. Recall that a proof is a valid argument with true premises. If we lump
everything that we know to be true (including anything that we can prove based
on what we know to be true) into one statement called pK (where, if the statement
we wish to prove looks like r → s, we also include the r in pK), a proof is of the
form pK → q where q is true whenever pK is true.
When we prove a statement s or r → s by contradiction, we begin by assuming that the statement s is false. We then proceed to use this information
to prove that some statement p1 is false, where p1 is one of the statements included in pK or r, in the case that the statement that we want to prove is of
the form r → s. In either case we will have assumed initially that p1 is true
and have proved that p1 is false, which is a contradiction both by Webster's
definition and by a mathematical definition—because then p1 ∧ (∼p1) is surely
always false. Thus our original assumption that the statement s was false must
be erroneous—thus the statement s must be true.
In Section 1.1 we proved that √2 was not rational. We did it at that
time—before we discussed proofs—because we wanted to convince you that
there were a lot of numbers other than the rational numbers. We gave a proof
by contradiction. Our statement s was that √2 is not rational. We assumed that this was
false, i.e. that √2 is rational. What we included in pK (everything that we know
to be true) at that time was very nebulous, but we did emphasize that a part
of the definition of the rational numbers was that p ∈ Q implies that p = m/n
where m/n is in reduced form. For our proof, p1 is the statement that p ∈ Q
implies that p = m/n where m/n is in reduced form. We then proceeded from
the assumption that s was false, i.e. that √2 is rational, to prove that the form
assumed, m/n, was not in reduced form, i.e. we proved that p1 was false. Thus
we had our contradiction. And of course we should add that as a part of the
proof that m/n was not in reduced form we included a small contrapositive proof.
We next illustrate a proof by contradiction by considering an easier proof.
Example 1.2.2
Prove that q√2 is not rational for any nonzero q ∈ Q.
Solution: We first note that this statement can be reworded as: q ∈ Q, q ≠ 0,
implies q√2 is not rational. As we mentioned in the discussion of direct proofs
above, we could assume that q is rational, recall "everything else that we know is
true", and devise an argument that shows that q√2 is not rational. This would
be very difficult. Instead we assume that the desired result is false, i.e. that
q√2 is rational (along with the true hypothesis and everything we know to be
true). Then we know that we can write q√2 as q√2 = m/n where m, n ∈ Z, or
√2 = m/(qn). It is not difficult to show that m/(qn) is rational when m, n ∈ Z and
q ∈ Q. Therefore √2 is rational. However, part of "everything we know to
be true" is that √2 is not rational (we proved it). Thus we have that √2 is
rational and √2 is not rational. This is clearly a contradiction. Therefore
q√2 is not rational.
There are clearly some similarities between proofs using the contrapositive
and contradiction. If the statement that we want to prove is of the form if r then
s, then we know that we can prove this statement by proving the contrapositive,
if ∼s then ∼r. As we mentioned earlier, if we were to try to prove this statement
by contradiction we would include the statement r as a part of pK—the things
that we know to be true. We then proceed by assuming that s is false, and
we will complete the proof if we prove some statement p1 is false where p1
is part of pK—something we know to be true. If as a part of our proof by
contradiction the statement we prove false is p1 = r, this is a perfectly good
proof by contradiction—because r was thrown in with the other things that we
knew were true. However, we should realize that we have then proved if ∼s
then ∼r—i.e. we have in effect proved the contrapositive. Anything that we
can prove by proving the contrapositive can be proved by contradiction—using
the same proof.
We will be proving statements using direct and indirect proofs throughout
the rest of this text. There must be an explicit reason for every step of a proof.
To emphasize this fact, in the beginning we will try to explicitly give a reason
for each step. After a while we will revert to the approach that is generally used
in mathematics where we might give an explicit reason for some of the more
difficult steps but will assume that the reader can see the reasons for the other
steps (the reasons we maintain are "clear"). However, every step is taken for a
reason. If you do not understand why some particular step is done, ask.
HW 1.2.1 (a) Prove that p, q ∈ Q, q ≠ 0, implies that p/q ∈ Q.
(b) Prove that if p is rational, then p + 17/3 is rational.

HW 1.2.2 (a) If q ∈ Q, prove that q + √2 is not rational.
(b) If q ∈ Q and x is not rational, prove that q + x is not rational.
1.3 Some Preliminaries to the Definition of the Real Numbers
It is now time to introduce the real numbers, R. In this section we give the
easy part of the definition, the structures of a field and an order. We give the
appropriate arithmetic properties by defining a field. We then add the order
properties by defining the order structure. You have probably been introduced
to the field properties before, and maybe the order relation. You at least have
used all of these properties often in your previous mathematics work.
Before we define a field, we thought it might be good to be careful about
equality. We will use equality as a part of our definition of a field (and of about
everything else). Everyone knows what equality means—sort of. An acceptable
notion of equality on a set Q must satisfy the following properties: (i) For a ∈ Q,
a = a (reflexive law). (ii) If a, b ∈ Q and a = b, then b = a (symmetric law).
(iii) If a, b, c ∈ Q, a = b and b = c, then a = c (transitive law). There are times
when one of the steps in a proof is technically a result of one of these properties
of equality. We want to make sure that you realize that there are reasons for
all steps—and some of these reasons are due to a precise definition of equality
given in (i)–(iii) above.
We are now ready to start with our definition of a field.
Definition 1.3.1 Let Q be a set on which two operations, + and ·, called
addition and multiplication, are defined between any two elements a, b ∈ Q.
We assume that Q is closed with respect to addition and multiplication, i.e. if
a, b ∈ Q, then a + b ∈ Q and a · b ∈ Q. The set Q is said to be a field if addition
and multiplication in Q satisfy the following properties.
a1. For any a, b ∈ Q, a + b = b + a (addition is commutative).
a2. For any a, b, c ∈ Q, a + (b + c) = (a + b) + c (addition is associative).
a3. For any a ∈ Q there exists an element of Q, θ, such that a + θ = a (existence
of an additive identity).
a4. For any a ∈ Q there exists an element of Q, −a, such that a + (−a) = θ
(existence of an additive inverse).
m1. For any a, b ∈ Q, a · b = b · a (multiplication is commutative).
m2. For any a, b, c ∈ Q, a · (b · c) = (a · b) · c (multiplication is associative).
m3. For any a ∈ Q there exists an element of Q, 1, such that a · 1 = a (existence
of a multiplicative identity).
m4. For any a ∈ Q such that a ≠ θ there exists an element of Q, a⁻¹, such that
a · a⁻¹ = 1 (existence of a multiplicative inverse).
d1. For any a, b, c ∈ Q, a · (b + c) = a · b + a · c (multiplication is distributive over
addition).
The set Q is said to be an integral domain if Q, + and · satisfy properties a1,
a2, a3, a4, m1, m2, m3 and d1, along with the following property: if a, b, c ∈ Q,
c ≠ θ and c · a = c · b, then a = b (cancellation law).
You see that the field properties consist of the very basic properties satisfied
by the addition and multiplication that you have used since grade school. When
you were working in N, Z, Q or R, you, your teachers and your books probably
wrote a · b as ab, θ as 0, 1 as 1 and a⁻¹ as 1/a. We will stick with the more
formal notation at this time. After we "have the reals" we will revert to the
usual notation of ab, 0, etc.
It should be easy to see that N is neither a field nor an integral domain because
it does not contain additive inverses; Z is not a field because it does not
contain multiplicative inverses (but it is an integral domain); and Q is a field
(and an integral domain). There are many other fields that are very important
in mathematics.
As a part of our definition of a field above, we assumed that we have the operations addition and multiplication defined on the set. We emphasize that we
want to assume that these operations are uniquely defined. This is a trivial
idea, but it is important. That is, if we have a + b = a + c and b = b′, then
we also have a + b′ = a + c. For the obvious reason we will sometimes refer to
this as the substitution law—and after a while we will not refer to it, we will
just do it. Of course we have the analogous substitution law associated with
multiplication.
As a part of our definition, if Q is a field, it possesses the basic properties
that are generally familiar to us. However, there are many more properties
associated with a field that are also familiar to us. The point is that there are
many very useful properties in a field that follow from the field axioms. We
include the following proposition that will give us some of these properties.
Proposition 1.3.2 Suppose that Q is a field. Then the following properties are
satisfied.
(i) If a, b, c ∈ Q and a + c = b + c, then a = b.
(ii) If a, b, c ∈ Q, c ≠ θ and a · c = b · c, then a = b.
(iii) If a ∈ Q, then a · θ = θ.
(iv) If a, b ∈ Q, then (−a) · b = −(a · b).
(v) If a, b ∈ Q, then (−a) · (−b) = a · b.
(vi) If a, b ∈ Q and a · b = θ, then a = θ or b = θ. This also shows that if Q is
a field, then Q is an integral domain.
Proof: (i) c ∈ Q implies there exists −c ∈ Q such that c + (−c) = θ (a4). By
the reflexive law of equality, (a + c) + (−c) = (a + c) + (−c). Since a + c = b + c,
the substitution law implies that (a + c) + (−c) = (b + c) + (−c). Then using
a2 twice we have a + (c + (−c)) = b + (c + (−c)). By a4 (twice) this becomes
a + θ = b + θ which implies (by a3 twice) that a = b.
Note that if we applied HW1.3.2–(b), we could have begun this proof with
a + c = b + c implies that (a + c) + (−c) = (b + c) + (−c) and then proceeded
as above. However, the proof of HW1.3.2 uses the reflexive law of equality and
the substitution law.
(ii) It should be clear that this proof is analogous to the proof given for (i)—
properties (i) and (ii) are essentially the same, (i) with respect to
addition and (ii) with respect to multiplication. Because we have the hypothesis
that c ≠ θ, by m4 there exists c⁻¹ ∈ Q such that c · c⁻¹ = 1. By the
multiplication analog of HW1.3.2–(b) we see that a · c = b · c implies that
(a · c) · c⁻¹ = (b · c) · c⁻¹. Then by m2, m4 and m3, a = b.
(iii) Properties a3 and a1 imply that for a · θ ∈ Q (the closure with respect to
multiplication implies that if a ∈ Q, then a · θ ∈ Q), (a · θ) + θ = a · θ and
θ + (a · θ) = a · θ. Then by a3 (applied to θ), substitution and
d1, we have θ + a · θ = a · θ = a · (θ + θ) = a · θ + a · θ. Using part (i) of this
proposition we have θ = a · θ.
(iv) The element −(a · b) ∈ Q is the element that satisfies (a · b) + (−(a · b)) = θ.
Thus if we can show that (a · b) + ((−a) · b) = θ, we will be done. We have
a · b + ((−a) · b) = b · (a + (−a)) (using m1 twice and then d1) = b · θ (by a4 and
substitution) = θ (by part (iii) of this proposition). Therefore −(a · b) = (−a) · b.
(v) It is easy to use part (iv) of this proposition along with m1 to show that
(−a) · (−b) = −(−(a · b)). Then HW1.3.2–(a) implies that −(−(a · b)) = a · b.
(vi) If a and b both equal θ, then we are done. If b ≠ θ, then there exists b⁻¹
such that b · b⁻¹ = 1 (by m4). Then a · b = θ implies that (a · b) · b⁻¹ = θ · b⁻¹
by substitution. The right-hand side equals θ by m1 and part (iii) of this
proposition. Then θ = (a · b) · b⁻¹ = a · (b · b⁻¹) = a · 1 = a by m2, m4 and m3.
Therefore a = θ.
The proof of the case a ≠ θ is clearly the same (with b replaced by a).
Of course there are more properties. The purpose of the above proposition
is to illustrate to you how some of the other properties that you know can be
proved from the field axioms.
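To see the axioms at work on a structure other than Q, the following Python sketch (our own, not part of the text's development) verifies a1–a4, m1–m4 and d1 for arithmetic modulo 5 on the set {0, 1, 2, 3, 4}, with θ = 0 and the multiplicative identity 1:

```python
# Z_5: the set {0, 1, 2, 3, 4} with addition and multiplication mod 5.
P = 5
elems = list(range(P))

def add(a, b):
    return (a + b) % P

def mul(a, b):
    return (a * b) % P

for a in elems:
    assert add(a, 0) == a  # a3: theta = 0 is the additive identity
    assert mul(a, 1) == a  # m3: 1 is the multiplicative identity
    assert any(add(a, x) == 0 for x in elems)      # a4: -a exists
    if a != 0:
        assert any(mul(a, x) == 1 for x in elems)  # m4: a^-1 exists
    for b in elems:
        assert add(a, b) == add(b, a)  # a1: + is commutative
        assert mul(a, b) == mul(b, a)  # m1: . is commutative
        for c in elems:
            assert add(a, add(b, c)) == add(add(a, b), c)  # a2
            assert mul(a, mul(b, c)) == mul(mul(a, b), c)  # m2
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))  # d1

print("Z_5 satisfies all nine field axioms")
```

Replacing 5 by a composite number such as 6 makes the m4 check fail (2 has no multiplicative inverse mod 6), matching the fact that arithmetic mod 6 does not give a field.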
We want to emphasize here that in the proofs above we used only the axioms
and properties that we had previously proved. It’s not terribly important that
you can prove these properties. It would be nice if you’re capable of proving
some reasonably easy properties using the axioms and previous results. It is very
important that you are able to read these proofs and verify that they are correct
(which we hope they are). In the proofs given we tried to be very complete,
giving each step and giving a reason for each step. As we move along we will
ease up on some of the completeness, assuming that the reader understands the
reasons for some of the "simple" steps. When we are done with this section, we
will assume that you know and/or have proved all basic arithmetic properties
of a field. You will have seen proofs of some of these properties such as those
given in Proposition 1.3.2 and HW1.3.2. There are countless other little facts
concerning fields (the rational numbers and the reals—when we really know
what the rational numbers and the real numbers are) that we will need to use.
So that proofs of these facts do not slow down our subsequent work, we will not
fill in every detail and we will assume that you have proved all of these facts or
could prove them if someone wanted a proof.
We next would like to extend our definition to that of an ordered field. As
with the equality, an order must satisfy certain properties. A necessary part of
defining an order and an ordered field Q is to identify a set P ⊂ Q of positive
elements. We will use the notation that if a ∈ P we will write a > θ. We now
proceed to define an ordered field. We define the ordered field with respect to
the order >.
Definition 1.3.3 Suppose that Q is a field in which we identify a set of positive
elements P ⊂ Q. The set Q along with > is said to be an ordered field if they
satisfy the following properties.
o1. The sum of two positive elements is positive, i.e. a, b ∈ P implies that
a + b ∈ P.
o2. The product of two positive elements is positive, i.e. a, b ∈ P implies that
a · b ∈ P.
o3. For a ∈ Q, one and only one of the following alternatives hold: either a is
positive, a = θ, or −a is positive, i.e. a > θ, a = θ or −a > θ.
You should recognize these three properties as being common facts that you
have used in the past when dealing with inequalities. One of the pertinent facts
is that these three axioms are all you need to get everything you know and/or
need to know about inequalities. Of course we need—want—inequalities defined
on the entire set Q and the other inequalities that you know exist defined, <,
≥ and ≤. We make the following definition.
Definition 1.3.4 Suppose Q is an ordered field. If a, b ∈ Q, we say that b > a
if b − a > θ. Also, we say that
(i) b < a if and only if a > b,
(ii) b ≥ a if and only if b > a or b = a, and
(iii) b ≤ a if and only if b < a or b = a
It should then be reasonably easy to see that Z is not an ordered field (Z is not
a field) and that Q is an ordered field (use P = {m/n : m, n ∈ Z and mn > 0}).
As with the arithmetic properties of the field, the axioms above are then
used to prove a variety of properties concerning ordered fields. We state some
of these properties in the following proposition where we include some of the
very basic results that follow directly from Definition 1.3.3.
Proposition 1.3.5 Let Q along with the operations +, · and > be an ordered
field. Then the following properties hold.
(i) If a, b, c ∈ Q, a > b and b > c, then a > c (transitive law).
(ii) If a, b, c ∈ Q and a > b, then a + c > b + c.
(iii) If a, b, c ∈ Q, a > b and c > θ, then a · c > b · c.
(iv) If a, b ∈ Q and a > b, then −b > −a.
(v) If a ∈ Q and a 6= θ, then a2 > θ.
Proof: (i) Since a > b and b > c, we have a − b > θ and b − c > θ. Then by
property o1 of Definition 1.3.3 we know that (a − b) + (b − c) > θ—which using
Definition 1.3.1 a2 (a couple times) and a4 yields a − c > θ or a > c.
(ii) a > b implies that a − b > θ. Then using a3, a4, a2, a1, etc., we get
a − b = (a − b) + θ = (a − b) + (c + (−c)) = ((a + c) − b) + (−c) = (a + c) + ((−b) + (−c)) = (a + c) − (b + c),
or a + c > b + c.
(iii) a > b implies that a − b > θ. Applying o2 of Definition 1.3.3 to a − b and
c > θ gives (a − b) · c > θ. Then d1 implies that a · c − b · c > θ or a · c > b · c.
(iv) If a > b, then by part (ii) of this proposition (taking c = −b) we have
a + (−b) > b + (−b). By a4 and a3 this becomes a + (−b) > θ. We next use a1
to rewrite the left hand side as (−b) + a, apply part (ii) again (this time with
c = −a), and then clean it all up with a2, a4 and a3 to get (−b) > (−a).
(v) If a ∈ Q, then by o3 a > θ or −a > θ (and we assumed that a 6= θ).
The case when a > θ follows immediately from o2. If −a > θ, by o2 we have
(−a) · (−a) > θ. Then part (v) of Proposition 1.3.2 gives us our desired result.
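As with the field identities, the order properties just proved can be checked exhaustively on a small sample of rationals. The sketch below is our own illustration (Python, exact arithmetic; the sample set `qs` is an arbitrary choice) of parts (i)–(v) of Proposition 1.3.5.

```python
from fractions import Fraction
import itertools

# A small sample of elements of the ordered field Q (theta = 0 here).
qs = [Fraction(-3, 2), Fraction(-1), Fraction(0), Fraction(2, 3), Fraction(5)]

for a, b, c in itertools.product(qs, repeat=3):
    if a > b and b > c:
        assert a > c               # (i) transitive law
    if a > b:
        assert a + c > b + c       # (ii)
        assert -b > -a             # (iv)
        if c > 0:
            assert a * c > b * c   # (iii)
for a in qs:
    if a != 0:
        assert a * a > 0           # (v) a² > θ whenever a ≠ θ
```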
There are a lot of different properties of ordered real fields. In the next
proposition we include three more very important results.
Proposition 1.3.6 Let Q be an ordered field. Then the following properties
hold.
(i) 1 > θ
(ii) If a ∈ Q and a > θ, then a−1 > θ.
(iii) If a, b ∈ Q and b > a > θ, then a−1 > b−1 > θ.
Proof: (i) We begin by noticing that by Proposition 1.3.5-(v) we get 1² > θ.
By m3, 1² = 1 · 1 = 1 so we have 1 > θ.
(ii) Suppose false, i.e. suppose that a > θ and a⁻¹ is not greater than θ, i.e.
a⁻¹ = θ or −a⁻¹ > θ. If a⁻¹ = θ, then a · a⁻¹ = a · θ = θ. This is a contradiction
to the fact that a · a⁻¹ = 1.
We next consider the case when −a⁻¹ > θ. By o2 we see that a · (−a⁻¹) > θ.
But a · (−a⁻¹) = −(a · a⁻¹) (by Proposition 1.3.2-(iv)) = −1 (by m4), so −1 > θ.
By o3 this contradicts part (i) of this proposition. Therefore a⁻¹ > θ.
(iii) Since b > a > θ, by part (ii) of this proposition we see that a⁻¹ > θ
and b⁻¹ > θ. Then since b > a, by m4 and Proposition 1.3.5-(iii) we see that
1 = b · b⁻¹ > a · b⁻¹. Then by Proposition 1.3.5-(iii), m1, m3 and m4 we get
a⁻¹ > a · b⁻¹ · a⁻¹ = b⁻¹—which along with the fact that b⁻¹ > θ gives us
the desired result.
Again, as we saw with our field properties, we use the additional properties
of our order structure to prove a variety of additional properties of the ordered
field. As you will see, as we start applying properties of our field and later our
order structure, we will need some additional properties of ordered fields. We
do not want to try to prove all of these results, so we either have to pause and
prove these properties when we need them, assign them as homework so we can
assume that they have been proved, or cheat a bit and assume that everyone
can prove them if there is a need for a proof—which in reality is the approach
that we will usually use.
All of the work concerning orders done above was done with respect to
the order >. You know that we have defined other order relations, <, ≥ and
≤. These other order relations will satisfy properties analogous to those found
above for >. Most of the results that we want for <, ≥ and ≤ will follow
from Definitions 1.3.3 and 1.3.4 and Propositions 1.3.5 and 1.3.6—along with a
careful consideration of results following from the fact that a = θ. We cannot
give all of the possible results—even all of the results directly like the previous
theorems—for all flavors of inequalities. We include some of these properties
without proof in the following proposition.
Proposition 1.3.7 Let Q be an ordered field. Then the following properties
hold.
(i) If a, b, c ∈ Q, a < b and b < c, then a < c.
(ii) If a, b, c ∈ Q and a < b, then a + c < b + c.
(iii) If a, b, c ∈ Q, a < b and c > θ, then a · c < b · c.
(iv) If a, b, c ∈ Q, a > b and c < θ, then a · c < b · c.
(v) If a, b, c ∈ Q, a < b and c < θ, then a · c > b · c.
(vi) If a, b, c ∈ Q, a ≥ b and b ≥ c, then a ≥ c.
(vii) If a, b, c ∈ Q, a ≥ b, then a + c ≥ b + c.
(viii) If a, b, c ∈ Q, a > b and c ≥ θ, then a · c ≥ b · c.
(ix) If a, b, c ∈ Q, a ≥ b and c > θ, then a · c ≥ b · c.
(x) If a, b, c ∈ Q and a ≤ b, then a + c ≤ b + c.
We quit. Of course there are more ≤ results and other results—we thought
that ten was enough. We assume that you are generally aware of the correct
results and hope that based on the propositions proved in this section you could
prove all reasonable true results.
With the definition of the ordered field and the properties we have proved,
we have the algebraic structure of the reals. As we have seen, the set of rationals, Q, is an
ordered field and we have claimed that the set of rationals is not good enough.
We need more which we will add in the next section.
HW 1.3.1 (True or False and why)
(a) If Q is a field and θ is the additive identity in Q, then θ · θ = θ.
(b) If Q is a field and θ is the additive identity in Q, then −θ = θ.
(c) Suppose that Q is a field and 1 and θ are the multiplicative and additive
identities, respectively. Then 1 and θ are unique.
(d) Suppose that Q is a field and a · x = b. Then x = a−1 · b.
(e) If Q is an ordered field, a ∈ Q and a < θ, then a⁻¹ < θ.
(f) If Q is an ordered field, a, b ∈ Q and θ ≤ a < b, then a² < ab < b².
HW 1.3.2 (a) Prove that if Q is a field and a ∈ Q, then −(−a) = a.
(b) Prove that if Q is a field, a, b, c ∈ Q and a = b, then a + c = b + c.
(c) Suppose that Q is an ordered field and a, b, c, d ∈ Q are such that a > b and
c > d. Prove that a + c > b + d.
(d) Suppose Q is an ordered field and a, b ∈ Q are such that ab > θ. Prove that
either a > θ and b > θ or a < θ and b < θ.
HW 1.3.3 Suppose that Q is an ordered field. (a) Prove that if a, b ∈ Q and
θ ≤ a ≤ b, then a² ≤ b². (Note: Essentially the same proof will prove that for
a, b ∈ Q and θ ≤ a < b, then a² < b².)
(b) For the moment, for a ∈ Q and a ≥ 0 define √a to be that number such that
(√a)² = a (if such a number exists—see Section **********.** and something
else). Prove that if a, b ∈ Q and θ ≤ a ≤ b, then √a ≤ √b. Hint: Use the
contrapositive or contradiction.
HW 1.3.4 (a) Define Q1 to be the set of all 2 × 2 matrices along with the
traditional matrix addition and multiplication. Prove or disprove that Q1 is a
field.
(b) Define Q2 to be the set of all 2 × 2 invertible matrices along with the
traditional matrix addition and multiplication. Prove or disprove that Q2 is a
field.
HW 1.3.5 Let Q be an ordered field. Prove the following statements.
(a) If a, b, c ∈ Q, a < b and c < θ, then a · c > b · c. (Proposition 1.3.7-(v))
(b) If a, b, c ∈ Q, a > b and c ≥ θ, then a · c ≥ b · c. (Proposition 1.3.7-(viii))
(c) If a, b, c ∈ Q, a < b and c ≤ θ, then a · c ≥ b · c.
1.4
Definition of the Real Numbers
We have one more step in the definition of the real numbers—the difficult step.
Before we proceed we need a few easy definitions.
Definition 1.4.1 Let Q be an ordered field and let S be a nonempty subset of
Q.
(i) If M ∈ Q is such that s ≤ M for all s ∈ S, then M is said to be an upper
bound of S.
(ii) If m ∈ Q is such that s ≥ m for all s ∈ S, then m is said to be a lower
bound of S.
(iii) If a nonempty subset of Q has an upper bound, it is said to be bounded
above. If a nonempty subset of Q has a lower bound, it is said to be bounded
below. If a nonempty subset of Q has both an upper and lower bound, then it
is said to be bounded. If a set does not have an upper bound or a lower bound,
then the set is said to be unbounded.
It is easy to see that in the set of rational numbers, Q, (an ordered field) 7 is
an upper bound of the set S1 = {−3, −2, −1, 3, 4} and −5 is a lower bound of S1 ,
there is no upper bound of the set S2 = {−17, −3/2, −1/2, 0, 2, 8/3, 4, 32/5, 32/3, 128/7, · · ·}
(the elements of the set continue to increase without bound) and −23.1 is
a lower bound of S2 , 7 is an upper bound of the set S3 = {r ∈ Q : r =
7 − 1/n for some n ∈ N} and 6 is a lower bound of S3 , and −1 is an upper bound
of the set S4 = {· · · , −4, −3, −2, −1, −3/2, −5/4, −9/8, −17/16, · · ·} and S4 has
no lower bound. Also, both 4 and 4.00001 are upper bounds and −3 and −3.1
are lower bounds of the set S5 = {r ∈ Q : −3 < r ≤ 4}. Note that 3.9999 is
not an upper bound of S5 . It is also the case that −17 is also a lower bound
of the set S2 , 0 is also an upper bound of the set S4 , and 14 and −10, 9 and 5,
and −10 and 10 are upper and lower bounds of S1 , S3 and S5 , respectively.
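For a finite set such as S1 these claims can be checked mechanically: an upper bound is just a value at least as large as every element. The sketch below is our own illustration (Python; the helper names `is_upper_bound` and `is_lower_bound` are ours, not the text's) of several of the bounds listed above.

```python
# Upper and lower bounds, checked directly from the definition.
def is_upper_bound(M, S):
    """M is an upper bound of S when s <= M for every s in S."""
    return all(s <= M for s in S)

def is_lower_bound(m, S):
    """m is a lower bound of S when s >= m for every s in S."""
    return all(s >= m for s in S)

S1 = [-3, -2, -1, 3, 4]
assert is_upper_bound(7, S1) and is_lower_bound(-5, S1)
assert is_upper_bound(4, S1) and is_lower_bound(-3, S1)   # bounds may lie in S1
assert not is_upper_bound(3.9999, S1)                     # 4 is in S1, so 3.9999 fails
```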
We note that upper and lower bounds of a set may be elements of the set
(for example −1 ∈ S4 and 6 ∈ S3 ). And of course by the other examples, we
see that upper and lower bounds need not be elements of the set.
Definition 1.4.2 Let Q be an ordered field and let S be a nonempty subset of
Q.
(i) If M∗ ∈ Q is such that M∗ is an upper bound of S, and for any upper bound
M of S, M∗ ≤ M, then M∗ is said to be the least upper bound of S. We denote
the least upper bound of S by M∗ = lub(S). Another term that is used for the
least upper bound of S is the supremum of S, written as sup(S).
(ii) If m∗ ∈ Q is such that m∗ is a lower bound of S, and for any lower bound
m of S, m∗ ≥ m, then m∗ is said to be the greatest lower bound of S. We denote
the greatest lower bound of S by m∗ = glb(S). Another term that is used for
the greatest lower bound of S is the infimum of S, written as inf(S).
Let us emphasize the fact that the least upper bound must be an upper bound.
Hence, if the set does not have an upper bound, the least upper bound of the set
does not exist. Likewise, if a set does not have a lower bound, then the greatest
lower bound of the set does not exist.
It should be easy to see that for the five sets S1 , S2 , S3 , S4 and S5 , glb(S1 ) =
−3 and lub(S1 ) = 4, glb(S2 ) = −17 and lub(S2 ) does not exist, glb(S3 ) = 6 and
lub(S3 ) = 7, glb(S4 ) does not exist and lub(S4 ) = −1, and glb(S5 ) = −3 and
lub(S5 ) = 4 (where the facts that lub(S4 ) = −1 and glb(S5 ) = −3 are the two
that should be considered carefully).
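The behavior of S3 = {7 − 1/n : n ∈ N} is worth seeing concretely: its smallest element is attained (at n = 1), while its elements only approach 7 from below. The sketch below is our own illustration (Python with exact rationals; the truncation to the first N elements is our device, since a program can only inspect finitely many).

```python
from fractions import Fraction

# Finite truncation of S3 = { 7 - 1/n : n in N }.  Its largest element
# climbs toward lub(S3) = 7 but never reaches it; glb(S3) = 6 is attained.
def S3_truncated(N):
    return [7 - Fraction(1, n) for n in range(1, N + 1)]

assert min(S3_truncated(10)) == 6                    # glb attained at n = 1
assert max(S3_truncated(10)) == 7 - Fraction(1, 10)  # largest element so far
assert all(s < 7 for s in S3_truncated(1000))        # 7 is never attained
```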
Least upper bounds (as in S1 , S4 and S5 ) and greatest lower bounds (as
in S1 , S2 and S3 ) may be elements of the set, but that is not a requirement
(as 7 ∉ S3 and −3 ∉ S5 ). You should note that the upper and lower bounds
need not be close to the set (as 1000 is an upper bound of S5 ), whereas the
least upper bound and greatest lower bound must be close to the set (close to
at least one element of the set). We note that if the set is finite (meaning the
set has a finite number of elements), the least upper bound and the greatest
lower bound will always be the largest and the smallest elements of the set (as
with the set S1 ). That need not be the case for sets with an infinite number of
elements as can be seen by lub(S3 ) and glb(S5 ).
It is not difficult to prove any of the above claims. For example, if we were
forced to prove that −5 is a lower bound of the set S1 , we would only have to
list the elements of the set, noting that −5 ≤ −3, −5 ≤ −2, −5 ≤ −1, −5 ≤ 3
and −5 ≤ 4. Therefore, −5 ≤ s for all s ∈ S1 so −5 is a lower bound of S1 .
If you wanted to prove that −3.0001 is a lower bound of the set S5 , you
would only have to show that if s ∈ S5 , then −3.0001 < −3 < s. Therefore, if
s ∈ S5 , then s ≥ −3.0001 so −3.0001 is a lower bound of the set S5 .
To prove that a given value is the greatest lower bound of a set or the least
upper bound of a set is a bit more difficult. To prove that glb(S5 ) = −3 we must
first prove that −3 is a lower bound of S5 —but this proof is almost identical to
the proof given above that −3.0001 is a lower bound of S5 .
We next must prove that −3 is the greatest lower bound of S5 . The way
to prove this is by contradiction. Assume that m∗ = glb(S5 ) and m∗ > −3.
Then we can find a number r = (−3 + m∗ )/2 that will be in S5 (because
r = (−3 + m∗ )/2 > (−3 + (−3))/2 = −3) but m∗ ≰ r (r = (−3 + m∗ )/2 <
(m∗ + m∗ )/2 = m∗ ), so m∗ is not a lower bound of S5 . This contradicts the fact
that the greatest lower bound of a set must also be a lower bound of the set.
Therefore there cannot be a greatest lower bound of S5 that is greater than −3.
Since −3 is a lower bound of S5 , it must be the greatest lower bound of S5 .
To prove that lub(S3 ) = 7 is more difficult. It's easy to show that 7 is an
upper bound of S3 . If we assume that r = lub(S3 ) < 7, all we have to do is to
show that there is some element of S3 that is greater than r, i.e. we must show
that there is some n0 ∈ N such that r < 7 − 1/n0 . This is what we call intuitively
clear—but not proved. In HW1.5.2 we will use Corollary 1.5.5–(b) to complete
the proof that lub(S3 ) = 7.
Before we proceed we want one more example. Consider the set S6 = {x ∈
Q : x² < 2}. It is not hard to show that 2 is an upper bound of S6 —if x ∈ S6
and x > 2, then x² > 4 which is a contradiction (and −2 is a lower bound).
It's not obvious that S6 doesn't have a least upper bound. Since S6 is defined
as a subset of the rational numbers, if S6 has a least upper bound, it has to be
a rational. If we cheat and claim that we know that √2 has a non-repeating,
non-terminating decimal expansion (though at this time we really don't know
about decimal expansions) and that rational numbers have decimal expansions
that terminate or repeat from some point on, we can do some calculations and
figure out that no rational number can be the least upper bound of S6 . We
prove this fact in the following example.
Example 1.4.1 Show that S6 = {x ∈ Q : x² < 2} does not have a least upper
bound (in Q).
Solution: We will prove this statement by contradiction. We assume that L = lub(S6 )
exists and L ∈ Q. Since 1² < 2 and 2² > 2, 1 ≤ L ≤ 2. We shall show that L² = 2 (which we
know is impossible for L ∈ Q by the proof given in Section 1.1).
First suppose that L² < 2. Choose α = (2 − L² )/5. Note that α ∈ Q. Also note that
α > 1/5 implies that L² < 1 which contradicts the fact that 1 ≤ L. Thus 0 < α ≤ 1/5 < 1.
We see that
(L + α)² = L² + 2αL + α² < L² + 5α = L² + 5(2 − L² )/5 = 2 (since L ≤ 2 and α² < α).
Thus L + α ∈ S6 , L is not an upper bound of S6 and we have a contradiction. Therefore
L² ≮ 2.
Next suppose that L² > 2 and choose α = (L² − 2)/4. If α > 1/2, then L² > 4 which
contradicts the fact that L ≤ 2. Thus 0 < α ≤ 1/2 < 1. We see that
(L − α)² = L² − 2αL + α² > L² − 2αL ≥ L² − 4α = L² − 4(L² − 2)/4 = 2 (since L ≤ 2).
Thus L − α is an upper bound of S6 and L − α < L so that L is not the least upper bound
of S6 . This is a contradiction. Therefore L² ≯ 2.
The only choice left is that L² = 2 but we know this is impossible. Therefore lub(S6 )
does not exist.
does not exist.
We emphasize that in the last example the point is that the lub(S6 ) does not
exist in Q. That is important.
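The second half of the argument in Example 1.4.1 can be watched in action: starting from any rational upper bound L with L² > 2, the choice α = (L² − 2)/4 produces a strictly smaller rational upper bound, so no rational upper bound can be least. The sketch below is our own illustration (Python with exact rationals; the function name and starting value are ours).

```python
from fractions import Fraction

# From a rational upper bound L of S6 with 1 <= L <= 2 and L^2 > 2,
# the proof's alpha = (L^2 - 2)/4 yields a smaller rational upper bound.
def smaller_upper_bound(L):
    alpha = (L * L - 2) / 4
    assert alpha > 0
    return L - alpha

L = Fraction(3, 2)                 # (3/2)^2 = 9/4 > 2, so L is an upper bound
for _ in range(5):
    M = smaller_upper_bound(L)
    assert M < L and M * M > 2     # still an upper bound of S6, but smaller
    L = M                          # ... and we can repeat forever
```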
We are finally ready to define the set of real numbers.
Definition 1.4.3 An ordered field Q is complete if and only if every nonempty
subset of Q that is bounded above has a least upper bound.
We refer to Definition 1.4.3 as the completeness axiom.
Definition 1.4.4 The set of real numbers, R, is a complete ordered field.
We see that the set of real numbers is quite nice. The set is an ordered field
so that we get all of the properties of arithmetic and inequalities that we have
known and used since childhood. In addition, we get the completeness axiom.
We will see that the fact that the reals are complete will be extremely important
to almost every aspect of our work (it is the concept that delayed the rigor of
calculus for 200 years). There will be many times when you are working on
some proof and want to use the largest element in a set. However, there
are many nice sets that are bounded above and do not have a largest element,
such as S3 . The least upper bound of S3 is not the largest element in S3 —it’s
not even in the set. However, 7 is very close to all of the large elements in
the set. We will be able to use the least upper bound approximately where we
wanted to use the largest element.
However, we have a problem. We know what we want the set of real numbers
to be: the numbers that are plotted on the real line, the numbers that come
up on our calculator screens, etc. Our definition is a very abstract definition.
To begin with we would have to prove the existence of a complete
ordered field. It would surely be embarrassing to define the set of reals to be
a complete ordered field and have someone else come back and prove that no
such set existed. After that, we have to worry about the fact that if we were
using some complete ordered field as our set of reals, someone else may be using
some other complete ordered field—so we might get different results (when you
bought a new calculator, you might have to decide which complete ordered field
you wanted it to be based on). This is not really a problem. However, we are
only going to clear up this situation by stating the following theorem that we
give without proof.
Theorem 1.4.5 There exists one and (except for isomorphic fields) only one
complete ordered field.
The proof of this theorem would take us too far out of our way to be useful at
this time. The fact that there are isomorphic complete ordered fields is not a
problem. Two fields are isomorphic if there is a one-to-one (which we will define
later) mapping between the fields that preserves the arithmetic operations. For
our purposes isomorphic complete ordered fields can be considered the same.
When we work with the set of real numbers, we still want R to contain
N, Z and Q. This is not a problem. We won't do it but it is not difficult
to use 1 = 1, 2 = 1 + 1, · · · to define N within any field, or N = {x ∈ R :
x = 1 or x = k + 1 for some k ∈ N}. Using the approach that we have by defining
the set of real numbers by a set of postulates, the above description is the
definition of the set of natural numbers. Likewise, we can use the additive
inverses and the multiplicative inverses to get Z and Q, respectively. Thus any
complete ordered field will contain N, Z and Q along with their properties.
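The device of building N inside the field from 1 and the successor operation k → k + 1 can be sketched directly. The illustration below is ours (Python; the function name is our choice), with the field's multiplicative identity standing in for 1.

```python
from fractions import Fraction

# Building N inside a field: start from the multiplicative identity 1
# and close under the successor operation k -> k + 1.
one = Fraction(1)

def naturals(count):
    k = one
    out = [k]
    for _ in range(count - 1):
        k = k + one          # the successor of k
        out.append(k)
    return out

assert naturals(5) == [1, 2, 3, 4, 5]
```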
When the natural numbers are developed without first defining the set of real
numbers, there are several sets of axioms that are used to define the natural
numbers. Since we defined the set of real numbers and defined the natural
numbers as a particular subset of the reals, we don’t use any of these axioms.
Of course we want our set of natural numbers to satisfy all of these axioms—
otherwise either something is wrong with the sets of axioms or something is
wrong with our definition of N. In our setting all of these axioms can and in
some situations need to be proved as theorems. We will not prove all of these
results though when we need them, we will use them. We do want to give you
one of the common sets of axioms, called the Peano Postulates:
• PP1: 1 is a natural number.
• PP2: For each natural number k there exists exactly one natural number,
called the successor of k, which we denote by k+1 .
• PP3: 1 is not the successor of any natural number.
• PP4: If k+1 = j+1 , then k = j.
• PP5: Let M be a set of natural numbers such that (i) M contains 1
and (ii) M contains x+1 whenever it contains x. Then M contains all the
natural numbers.
It should be clear that the Peano Postulates are a long way from the real
numbers–in other words, if you use this approach, you have a lot of work to
do before you get to R. Also based on our definition of R, some of the properties that we have proved in R and the definition of N, it is easy to see that
PP1, PP4 and PP5 are true. It is not too difficult to see that PP3 can be proved
using PP5; and PP2 follows from the result that for k ∈ N there are no natural
numbers between k and k + 1 which follows from PP3. We prove PP3 in Example 1.6.4 as one of our examples of the application of proof by mathematical
induction. You will see in Section 1.6 that postulate PP5 is a very important
property of the natural numbers.
We know from our work in Section 1.1 that there exist real numbers that
are not rational. We define
• the set of irrational numbers, I = {x ∈ R : x 6∈ Q} = R − Q.
Obviously by the definition of I, Q ∩ I = ∅ and R = Q ∪ I. In Section 1.1 we
showed that there were a lot of real numbers that are not rational, i.e. that are
irrational. Hence not only do we know that I is not empty, I ≠ ∅, but I is large.
And finally, to this point we have tried to be careful to use the formal notation
of · for multiplication, ⁻¹ for the multiplicative inverse, θ for the additive identity
and 1 for the multiplicative identity. Now that we have defined R and made the
argument (some of it not proved) that R is the set of reals that we have always
used, we will change to a more traditional notation. We will write a · b as ab,
a · b⁻¹ as ab⁻¹ or a/b, θ as 0 and 1 as 1.
HW 1.4.1 (True or False and why)
(a) If S is a set of real numbers and M and m are upper and lower bounds of
S, respectively, then M and m are unique.
(b) If S is a bounded set of real numbers and M ∗ and m∗ are least upper and
greatest lower bounds of S, respectively, then M ∗ and m∗ are unique.
(c) If S is a bounded set of real numbers and S ∗ ⊂ S, then lub(S) ≥ lub(S ∗ )
and glb(S) ≤ glb(S ∗ ).
HW 1.4.2 Suppose that S ⊂ R is nonempty and contains only a finite number
of elements (that is, S is a finite set). Prove that M∗ = lub(S) exists and
M∗ ∈ S.
1.5
Some Properties of the Real Numbers
We want to emphasize that the important addition to our knowledge base in
the last section is the fact that the set of real numbers must be complete. This
section is very much a continuation of the last section. We begin by stating
and proving a series of very important results related to the completeness. The
first result shows that we also get greatest lower bounds if our sets are bounded
below.
Proposition 1.5.1 If S is a nonempty subset of R that is bounded below, then
S has a greatest lower bound.
Proof: Let S ′ = {−x ∈ R : x ∈ S}. If m is a lower bound of S, then m ≤ s for
all s ∈ S. This is the same as −m ≥ −s for all −s ∈ S ′ . Therefore, −m is an
upper bound of S ′ . By the completeness of R, S ′ has a least upper bound, say
M ∗ = lub(S ′ ). Our claim is that m∗ = −M ∗ = glb(S). M ∗ is an upper bound
of S ′ so for all −s ∈ S ′ , M ∗ ≥ −s. Then m∗ = −M ∗ ≤ s for all s ∈ S and m∗
is a lower bound of S.
If g is a lower bound of S and g > m∗ , then −g will be an upper bound of
S ′ and −g < −m∗ = M ∗ . Thus if m∗ is not the greatest lower bound of S, then
M ∗ is not the least upper bound of S ′ . Therefore by reductio ad absurdum (or
contradiction) m∗ = glb(S).
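The reflection device in the proof of Proposition 1.5.1 (negate the set, take a least upper bound, negate back) can be seen concretely on a finite set, where the least upper bound is simply the largest element. The sketch below is our own illustration (Python; the helper names are ours, and finiteness is what lets `max` stand in for lub).

```python
from fractions import Fraction

# Proposition 1.5.1's device on a finite sample: glb(S) = -lub(-S).
def lub_finite(S):
    return max(S)            # for a finite set, the lub is the largest element

def glb_finite(S):
    return -lub_finite([-s for s in S])

S = [Fraction(-3), Fraction(1, 2), Fraction(4)]
assert glb_finite(S) == -3
assert glb_finite(S) == min(S)   # agrees with the direct greatest lower bound
```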
If we return to the sets S1 –S5 described earlier and consider these sets as
subsets of the reals, R (S1 –S5 were previously considered as subsets of Q and
Q ⊂ R), then the least upper bounds and greatest lower bounds of these sets are
the same as before. If proofs were needed, the proofs would be essentially the
same as before. If we consider a different set S5′ = {x ∈ R : −3 < x ≤ 4} = (−3, 4]
(where S5′ contains all of the real numbers between −3 and 4 (including 4),
whereas S5 contained only the rational numbers in that range), it is easy to see
(again using essentially the same arguments as before) that lub(S5′ ) = 4 and
glb(S5′ ) = −3.
The case of special interest is the set S6 . Recall in Example 1.4.1 that when
S6 was considered a subset of Q, we found that the least upper bound of S6 did
not exist. We now know that this example proves the following result.
Proposition 1.5.2 The set of rationals, Q, is not complete.
Now consider S6 as a subset of R and define S6′ = {x ∈ R : x² < 2}. As we
showed before with S6 considered as a subset of Q, S6 and S6′ are both bounded
above and below in R by 2 and −2, respectively. By the definition of R we know
that both lub(S6 ) and lub(S6′ ) exist. We don't explicitly know what value these
least upper bounds assume (even though deep in our hearts we know they are
both √2—the number such that when squared gives 2). Consider the following
example.
Example 1.5.1 Let L = lub(S6′ ). Show that L² = 2.
Solution: Other than the fact that in this case we know (because R is complete) that
L = lub(S6′ ) exists, the proof of this result is the same as part of the proof given in Example
1.4.1.
We assume that L² < 2, define α = (2 − L² )/5, show that L + α ∈ S6′ and obtain a
contradiction to the fact that L is an upper bound of S6′ .
We then assume that L² > 2, define α = (L² − 2)/4, show that L − α is an upper bound
of S6′ and contradict the fact that L is the least upper bound of S6′ . (One large difference
between these proofs is the fact that in this case L and α may be—will be—irrational. This
makes no difference in the necessary computations.) Therefore we know that L² = 2.
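Numerically, lub(S6′ ) can be squeezed between a point of the set and an upper bound, and the squeeze collapses onto the number whose square is 2. The bisection sketch below is our own illustration (Python floating point, so it is an approximation, not the exact least upper bound the completeness axiom provides).

```python
# Bisection: trap lub({x : x^2 < 2}) between an element of the set (lo)
# and an upper bound (hi); 1^2 < 2 < 2^2 gives the starting bracket.
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if mid * mid < 2:
        lo = mid             # mid is in the set, so the lub is at least mid
    else:
        hi = mid             # mid is an upper bound of the set
assert abs(hi * hi - 2) < 1e-9
```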
The completeness of the reals allows us to define √2 as √2 = lub(S6′ ) and we
know that √2 satisfies (√2)² = 2. This approach also allows us to define square
roots of all positive real numbers.
This is a big deal. When I was a young student, I was told that we let
√2 be the number such that when squared gives 2—and I would guess most of
you were given the same introduction. No one questioned whether or not such a
number might exist—I surely never questioned it. It wasn't until I had to define
√2 for students that I started wondering why we never discuss the existence.
You now know that √2 exists and why.
As we stated earlier the completeness axiom is a very important and essential
part of the definition of the set of real numbers. To better describe this property
of R and to make this property easier to use, we next include several useful
results that follow from the completeness axiom. These results are very important
in that often when we need to use the completeness of the reals, we will use
Proposition 1.5.3–Corollary 1.5.5 rather than the definition of completeness.
We begin with a result that illustrates our earlier claim that the least upper
bound and the greatest lower bound take the place of the largest and smallest
elements of the set when it’s impossible to specify the largest and smallest
elements.
Proposition 1.5.3 (a) Suppose S ⊂ R is bounded above. Let the least upper
bound of S be given by M ∗ . Then for every ǫ > 0 there exists x0 ∈ S such that
M ∗ − x0 < ǫ.
(b) Suppose S ⊂ R is bounded below. Let the greatest lower bound of S be given
by m∗ . Then for every ǫ > 0 there exists x0 ∈ S such that x0 − m∗ < ǫ.
Proof: (a) Suppose false, i.e. suppose that for some ǫ0 > 0 there is no
element x ∈ S greater than M ∗ − ǫ0 . Then M ∗ − ǫ0 will be an upper bound for
the set S—for all x ∈ S, x ≤ M ∗ − ǫ0 . But this is a contradiction to the fact
that M ∗ is the least upper bound of S.
(b) The proof of (b) is similar—but be careful with the inequalities.
Remember that m∗ and M ∗ may or may not be in the set. Though we
cannot choose the smallest or largest element in the set, we can always find an
element in the set that is arbitrarily close to m∗ and M ∗ . Often when we are
using an argument where we would like to use the smallest or largest element
in a set (and can’t make the claim that there is such an element), we can use
the elements provided by the above proposition that are arbitrarily close to
the greatest lower bound and least upper bound of the set.
We might also make special note of the argument used in the proof of (a)
above. The proof is not difficult. For many students it is difficult to negate
the original statement. The statement is that "for every ǫ > 0 there is an x0
that satisfies an inequality." To negate that statement, you need "some ǫ > 0
for which there is no x0 that will satisfy the inequality," or "some ǫ > 0 for
which every x ∈ S fails to satisfy the inequality." Analysis results often
involve convoluted statements. It is often difficult to negate these convoluted
statements.
We next obtain a very important corollary known as the Archimedean property (and of course, it doesn't really deserve to be a corollary).
Corollary 1.5.4 For any positive real numbers a and b, there is an n ∈ N such
that na > b.
Proof: Suppose false, i.e. suppose that na ≤ b for all n ∈ N. Set S = {na :
n ∈ N}. Since we are assuming that na ≤ b for all n ∈ N, the set S is bounded
above by b. The completeness axiom implies that S has a least upper bound,
let M ∗ = lub(S).
By Proposition 1.5.3–(a) there exists an element of S, n0 a, n0 ∈ N, such
that M ∗ − n0 a < a. (The statement must be true for any ǫ > 0. We’re applying
the proposition with ǫ = a.) Then we have M ∗ < (n0 + 1)a for n0 + 1 ∈ N so
M ∗ is not an upper bound of S. This is a contradiction so there must be an
n ∈ N such that na > b.
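The Archimedean property only asserts that some n with na > b exists; in a concrete setting one such n is easy to produce. The sketch below is our own illustration (Python with exact rationals; the choice n = ⌊b/a⌋ + 1 is ours, not part of the proof above, which argues by contradiction instead).

```python
import math
from fractions import Fraction

# An effective witness for the Archimedean property: for a, b > 0,
# n = floor(b/a) + 1 satisfies n*a > b.
def archimedean_n(a, b):
    n = math.floor(Fraction(b) / Fraction(a)) + 1
    assert n * a > b
    return n

assert archimedean_n(Fraction(1, 100), 7) == 701   # 701 * (1/100) > 7
assert archimedean_n(3, 2) == 1                    # already 1 * 3 > 2
```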
In the next result we give two special cases of the Archimedean property that
are basic and seem obvious (or as we used to say when we were graduate students,
"intuitively clear to the casual observer"). The result helps make it clear that
the completeness axiom is very important if without it we could not make these
seemingly obvious claims. By first choosing b = c and a = 1, and then choosing
a = ǫ and b = 1, we obtain the following corollary.
Corollary 1.5.5 (a) For any positive real number c there is an n ∈ N such that
n > c.
(b) For any ε > 0 there is an n ∈ N such that 1/n < ε.
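To make the corollary concrete, here is a small numerical sketch in Python (the function names are our own invention, not part of the text): for positive a and b, the integer n = floor(b/a) + 1 satisfies na > b, and part (b) is just this with a = ε and b = 1.

```python
import math

def archimedean_n(a: float, b: float) -> int:
    """Return an n in N with n*a > b, for positive reals a and b.

    The Archimedean property guarantees such an n exists;
    floor(b/a) + 1 always works (up to floating-point rounding).
    """
    assert a > 0 and b > 0
    return math.floor(b / a) + 1

def small_reciprocal_n(eps: float) -> int:
    """Part (b): an n in N with 1/n < eps, for eps > 0."""
    return archimedean_n(eps, 1.0)   # n*eps > 1 is equivalent to 1/n < eps

n = archimedean_n(0.3, 100.0)
print(n, n * 0.3 > 100.0)            # 334 True
m = small_reciprocal_n(0.004)
print(m, 1.0 / m < 0.004)            # 251 True
```

Of course this computation leans on the floor function, whose existence itself rests on the Archimedean property; the code is a sanity check, not a proof.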
We next include two properties of the set of reals that both help explain the
complexity of R and, at times, make the complexity of R difficult to comprehend.
Proposition 1.5.6 Let a, b ∈ R such that a < b.
(a) There exists r ∈ Q such that a < r < b.
(b) There exists x ∈ I such that a < x < b.
Proof: (a) By Corollary 1.5.5–(b) we can choose n ∈ N so that 1/n < b − a. Let
m be the smallest integer such that m > na (or m/n > a). Then m − 1 ≤ na
and

m/n = (m − 1)/n + 1/n ≤ a + 1/n < a + (b − a) = b.

Therefore r = m/n satisfies a < r < b.
(b) By part (a) there exists r ∈ Q such that a < r < b. By Corollary 1.5.5–(b)
there exists n ∈ N such that 1/n < (b − r)/√2, or r + √2/n < b. Then
a < r < r + √2/n < b. By Example 1.2.2 and HW1.2.2–(b), r + √2/n ∈ I.
Note that part of the proof of (a) included "let m be the smallest integer such
that m > na." This type of "obvious" statement is commonly assumed to be true
and nothing more is said about it. However, when requested, you must be able
to justify the statement. This statement is called the Well-Ordering Principle
and states that every non-empty subset of N contains a smallest natural number,
or if M ⊂ N is non-empty, then glb(M) ∈ M. We will prove this statement in
Example 1.6.5 in Section 1.6 but want to emphasize that though we prove this
later, we are not using any sort of circular argument.
Before we leave this section we include a definition of the absolute value of a
real number and some properties of absolute value that don’t necessarily fit in
this section but that we will surely need soon.
Definition 1.5.7 The absolute value of x ∈ R is defined as |x| = x if x ≥ 0,
and |x| = −x if x < 0.
Proposition 1.5.8 (i) |x| ≥ 0 for all x ∈ R. |x| = 0 if and only if x = 0.
(ii) |xy| = |x||y| for all x, y ∈ R.
(iii) −|x| ≤ x ≤ |x| for all x ∈ R.
(iv) For a ∈ R with a ≥ 0, |x| ≤ a if and only if −a ≤ x ≤ a.
(v) |x + y| ≤ |x| + |y| for all x, y ∈ R.
(vi) |x − y| ≥ ||x| − |y|| ≥ |x| − |y| for all x, y ∈ R.
Proof: We will claim that the proofs of (i) and (iii) are trivial (they follow
directly from the definition). Likewise we would like to think that property (ii)
is clear, and/or claim that property (ii) is very clear if you consider the four
cases x ≥ 0, y ≥ 0; x ≥ 0, y < 0; x < 0, y ≥ 0; and x < 0, y < 0.
(iv) We have not discussed an "if and only if" statement before, so we want to
emphasize that it means that we have implications going in each direction. Often
to prove an "if and only if" you prove both directions separately. Sometimes, as is
the case here, you can prove both directions at the same time.
We consider the statement |x| ≤ a. If we only consider x values greater
than or equal to zero, this statement becomes |x| = x ≤ a, so the statement
is equivalent to 0 ≤ x ≤ a. If we only consider x values less than zero, this
statement becomes |x| = −x ≤ a, or 0 > x ≥ −a. Since x is either greater than or
equal to zero or less than zero, the statement |x| ≤ a is equivalent to 0 ≤ x ≤ a
or 0 > x ≥ −a. If we consider this set of x values carefully, we see that it is the
same as −a ≤ x ≤ a.
(v) Property (v) is well known as the triangular inequality and is an important
property of the absolute value. We will use it often. Having proved properties
(iii) and (iv), property (v) is easy to prove. Using property (iii) twice, for any
x, y ∈ R we have −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. Adding these inequalities
gives −|x| − |y| ≤ x + y ≤ |x| + |y| (consider carefully why it is permissible
to add these inequalities). By property (iv) this last inequality implies that
|x + y| ≤ |x| + |y|.
(vi) Property (vi) is another useful property of the absolute value. We will refer
to property (vi) as the backwards triangular inequality. The proof of property
(vi) is a trick. Consider the following two computations.
|x| = |(x − y) + y| ≤ |x − y| + |y| (triangular inequality) so |x| − |y| ≤ |x − y|
and
|y| = |(y−x)+x| ≤ |y−x|+|x| (triangular inequality) so −(|x|−|y|) = |y|−|x| ≤ |y−x| = |x−y|.
Then since ||x| − |y|| = |x| − |y| or −(|x| − |y|), ||x| − |y|| ≤ |x − y|.
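Properties (v) and (vi) are easy to spot-check numerically. The following Python sketch (our own illustration, not part of the text) tests both inequalities on random pairs:

```python
import random

random.seed(0)
for _ in range(10_000):
    x = random.uniform(-100.0, 100.0)
    y = random.uniform(-100.0, 100.0)
    # (v) the triangular inequality
    assert abs(x + y) <= abs(x) + abs(y)
    # (vi) the backwards triangular inequality
    assert abs(abs(x) - abs(y)) <= abs(x - y)
print("both inequalities held on 10,000 random pairs")
```

A check like this can catch a mis-stated inequality, but of course testing finitely many pairs proves nothing about all of R; that is what the proofs above are for.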
Before we give the next properties, we define the following notation (some
of which we have already used, but at least in this way we know that you
understand the notation). For a ≤ b we define the closed interval [a, b] as
[a, b] = {x ∈ R : a ≤ x ≤ b}. For a < b we define the open interval (a, b) as
(a, b) = {x ∈ R : a < x < b}. And we use the obvious combinations of the
notation for the half open–half closed intervals (a, b] and [a, b).
Proposition 1.5.9 For r ∈ R and r > 0 the following three statements are
equivalent: |x − a| < r, a − r < x < a + r and x ∈ (a − r, a + r).
Proof: If we do as we did with property (iv) in Proposition 1.5.8 and
consider two cases, x − a ≥ 0 and x − a < 0, it is easy to see that the first two
expressions are equivalent. The equivalence of the third statement comes from
the second statement and the definition of the open interval.
Infinity:
To include a discussion about infinity in this section is a bit odd since we
want to make it very clear that ±∞ ∉ R, i.e. ∞ and −∞ are not real numbers.
But we will do it anyway. Often the extended reals are defined to include R and
±∞. Plus and minus infinity do fit into our order system in that ±∞ are such
that for x ∈ R, −∞ < x < ∞, i.e. ∞ is larger than any real number and −∞
is smaller than any real number. Above we defined [a, b], (a, b), etc. for a, b ∈ R.
We can logically extend these definitions to the unbounded intervals (a, ∞) =
{x ∈ R : a < x < ∞}, [a, ∞) = {x ∈ R : a ≤ x < ∞}, (−∞, a] = {x ∈ R :
−∞ < x ≤ a}, (−∞, a) = {x ∈ R : −∞ < x < a}, and even (−∞, ∞) = R.
Notice very clearly that ±∞ was not included in any of these sets.
At times we will have to do some arithmetic with infinities, so we define, for
a ∈ R, a + ∞ = ∞, a − ∞ = −∞, a∞ = ∞ for a > 0 and a∞ = −∞ for a < 0.
We emphasize that ∞ − ∞ and 0∞ are not defined (we don’t know what order
of "large" the infinity represents). And finally, since N ⊂ R, for all n ∈ N we
have 1 ≤ n < ∞.
HW 1.5.1 (True, False and show why)
(i) Suppose that S ⊂ R is such that x ∈ S implies x ≥ 0. Then glb(S) ≥ 0.
HW 1.5.2 In Section 1.4 we considered S3 = {r ∈ Q : r = 7 − 1/n for some n ∈ N}
as a subset of the rationals Q. We claimed that lub(S3) = 7. Prove it.
1.6
Principle of Mathematical Induction
In this section we consider the topic of proof by mathematical induction. Mathematical induction is a very important form of proof in mathematics. It would
be easy to say that the topic of math induction should not be included in a
chapter titled An Introduction to the Real Numbers. Because it is a convenient
time and place for this topic, we include it here.
Recall the fifth Peano Postulate, PP5: If M is a set of natural numbers
such that (i) M contains 1 and (ii) M contains x + 1 whenever it contains x, then
M contains all the natural numbers. From this postulate—which in our setting
followed immediately from the definition of the set of natural numbers—we
obtain the following theorem.
Theorem 1.6.1 Let P (n) be a proposition that is defined for every n ∈ N.
Suppose that P (1) is true, and that P (k + 1) is true whenever P (k) is true.
Then P (n) is true for all n ∈ N.
This theorem is referred to as the Principle of Mathematical Induction and
follows easily from the fifth Peano Postulate by setting M = {n ∈ N : P(n) is true}.
It is important for us to be able to use the Principle of Mathematical Induction, Theorem 1.6.1, as a method of proof: proof by mathematical induction.
We shall introduce proofs by math induction (short for mathematical induction
or the principle of mathematical induction) by a variety of examples. In each
example we will use a common template—which in order to avoid confusion, we
suggest that you follow.
Example 1.6.1
Prove that for r ≠ 1

∑_{j=0}^{n} r^j = (1 − r^(n+1))/(1 − r).

Solution: We want to use the principle of mathematical induction. For this
problem the proposition P is the equation ∑_{j=0}^{n} r^j = (1 − r^(n+1))/(1 − r).
Step 1: Prove that P(1) is true.

∑_{j=0}^{1} r^j = 1 + r and (1 − r^(1+1))/(1 − r) = (1 − r^2)/(1 − r) = 1 + r.

Therefore the proposition is true when n = 1.
Step 2: Assume that P(k) is true, i.e. ∑_{j=0}^{k} r^j = (1 − r^(k+1))/(1 − r).
Step 3: Prove that P(k + 1) is true, i.e. ∑_{j=0}^{k+1} r^j = (1 − r^((k+1)+1))/(1 − r).
We compute

∑_{j=0}^{k+1} r^j = ∑_{j=0}^{k} r^j + r^(k+1) = (1 − r^(k+1))/(1 − r) + r^(k+1)  by the assumption in Step 2  (1.6.1)
= (1 − r^(k+2))/(1 − r).  (1.6.2)

(Notice that in the first step of (1.6.1) we take the last term of the summation
∑_{j=0}^{k+1} r^j out of the summation, changing the upper limit of the summation to k
and including the last term separately.) Therefore P is true for n = k + 1.
By the principle of mathematical induction P is true for all n, i.e.
∑_{j=0}^{n} r^j = (1 − r^(n+1))/(1 − r).
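The formula of Example 1.6.1 is easy to spot-check numerically; the following Python snippet (a sanity check of our own, not part of the text) compares a direct summation against the closed form:

```python
def geometric_sum(r: float, n: int) -> float:
    """The left hand side: r^0 + r^1 + ... + r^n, summed directly."""
    return sum(r**j for j in range(n + 1))

r, n = 0.5, 10
closed_form = (1 - r**(n + 1)) / (1 - r)    # the formula (valid for r != 1)
print(abs(geometric_sum(r, n) - closed_form) < 1e-12)   # True
```

Agreement at a handful of values of r and n is no substitute for the induction, but it is a cheap way to catch a mis-remembered exponent.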
We want to emphasize that all proofs by math induction follow the above
template. In Step 1 you prove that the proposition is true for n = 1. In Step 2
you assume that the proposition is true for n = k—this assumption is referred
to as the inductive assumption. In Step 3 you prove that the proposition is true
for n = k + 1, using the inductive assumption as a part of the proof. If you
are able to prove that the proposition is true for n = k + 1 without using the
inductive assumption, you would have a direct proof of the proposition—math
induction would not be necessary.
You should recognize the formula in Example 1.6.1 as the formula for the
sum of a geometric series. A common proof of this formula is to write
S = 1 + r + r^2 + · · · + r^(n−1) + r^n  (1.6.3)

and note that

rS = r + r^2 + r^3 + · · · + r^n + r^(n+1).  (1.6.4)

Subtracting equation (1.6.4) from equation (1.6.3) gives S − rS = (1 − r)S =
1 − r^(n+1) (the rest of the terms add out) or S = (1 − r^(n+1))/(1 − r). The point that we
want to make is that this is a nice derivation but it is not a direct proof of the
formula. To be able to write rS as r + r^2 + r^3 + · · · + r^n + r^(n+1) we are applying
a "rule" r ∑_{j=1}^{n} a_j = ∑_{j=1}^{n} r a_j—which is an extension of the distributive property
c(a + b) = ca + cb. If we want to be picky (and at times we do), this formula
should be proved true. This formula is proved true by math induction. Likewise
when we are computing S − rS by subtracting the right hand side of equation
(1.6.4) from the right hand side of equation (1.6.3), we are using an extension
of the associative property of addition which can be proved by math induction.
Hence, this nice derivation (that didn’t seem to use mathematical induction)
involved several steps that could be or should be proved by math induction.
In general, when you do algebra involving expressions that include three
dots, you are probably doing an easy math induction proof. Another common
form of an easy math induction proof is when you write your desired result after
you’ve written several terms of the result and added the abbreviation "etc."
It’s perfectly OK to do easy results this way—we all do them—but you should
at least realize that they’re true by the principle of mathematical induction.
Example 1.6.2
Prove that ∑_{j=1}^{n} j = n(n + 1)/2.

Solution: Step 1: Prove true for n = 1. ∑_{j=1}^{1} j = 1 = 1(1 + 1)/2. Therefore the
proposition is true for n = 1.
Step 2: Assume true for n = k, i.e. ∑_{j=1}^{k} j = k(k + 1)/2.
Step 3: Prove true for n = k + 1, i.e. prove that ∑_{j=1}^{k+1} j = (k + 1)(k + 2)/2.
We compute

∑_{j=1}^{k+1} j = ∑_{j=1}^{k} j + (k + 1) = k(k + 1)/2 + (k + 1)  by the assumption in Step 2  (1.6.5)
= (k + 1)(k/2 + 1) = (k + 1)(k + 2)/2.  (1.6.6)

Therefore the proposition is true for n = k + 1.
By the principle of mathematical induction the proposition is true for all n.
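The closed form n(n + 1)/2 can be spot-checked by brute force (a quick Python sanity check of our own, not a proof):

```python
# Compare the direct sum 1 + 2 + ... + n against n(n+1)/2 for a range of n.
for n in range(1, 200):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
print("formula verified for n = 1, ..., 199")
```

The integer division // is exact here because one of n and n + 1 is always even.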
There are many of these summation formulas that can be and are proved by
math induction. You should note that, except for details, the proofs are very
similar.
We next include a proof by math induction that is somewhat different from
the preceding two.
Example 1.6.3
If m, n ∈ N and a ∈ R, then a^m a^n = a^(m+n).
Solution: Before we begin we should note that the definition of a^m is what we
call an inductive definition: define a^1 = a and for any k ∈ N, define a^(k+1) as
a^(k+1) = a^k a. We now begin our proof by fixing m.
Step 1: Prove that the proposition is true for n = 1. Since a^m a^1 = a^(m+1) by
the definition above, the proposition is true for n = 1.
Step 2: Assume that the proposition is true for n = k, i.e. assume that a^m a^k =
a^(m+k).
Step 3: Prove that the proposition is true for n = k + 1, i.e. prove that a^m a^(k+1) =
a^(m+k+1). We compute

a^m a^(k+1) = a^m a^k a^1 = a^m a^k a = a^(m+k) a  by the inductive hypothesis  (1.6.7)
= a^(m+k+1)  by the definition of a^m given above.  (1.6.8)
Therefore the proposition is true for n = k + 1 and by the principle of
mathematical induction the proposition is true for all n, i.e. a^m a^n = a^(m+n).
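The inductive definition of a^n translates directly into a recursive function, which makes the product rule easy to check numerically (a Python illustration of our own, with exactly representable values chosen so the comparison is exact):

```python
def power(a: float, n: int) -> float:
    """The inductive definition: a^1 = a and a^(k+1) = a^k * a, for n in N."""
    if n == 1:
        return a
    return power(a, n - 1) * a

a, m, n = 3.0, 4, 5
# a^m * a^n == a^(m+n), exact here since all values fit in a float exactly
print(power(a, m) * power(a, n) == power(a, m + n))   # True
```

Note how the recursion mirrors the induction: the base case is Step 1 and the recursive call is the inductive hypothesis.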
We show how another basic property of the integers can be proved in the
following example.
Example 1.6.4
1 ≤ n for all n ∈ N.
Solution: Step 1: Prove true for n = 1. Clearly 1 ≤ 1 so the proposition is
true for n = 1.
Step 2: Assume true for n = k, i.e. 1 ≤ k.
Step 3: Prove true for n = k + 1, i.e. 1 ≤ k + 1.
By adding 1 to both sides of the inequality 1 ≤ k (using the inductive
hypothesis and (x) of Proposition 1.3.7) we get 2 ≤ k + 1. By Proposition
1.3.6–(i) we have 1 > 0, which we know implies that 0 < 1. Adding 1 to both
sides gives 1 < 2. We then have 1 < 2 ≤ k + 1 or 1 < k + 1. This implies that
1 ≤ k + 1.
In the last example of the application of mathematical induction we prove
a very important property of the natural numbers, the Well-Ordering Principle.
As we will see, we will prove the Well-Ordering Principle by contradiction, using
mathematical induction to arrive at the contradiction.
Example 1.6.5
Suppose that M ⊂ N and M 6= ∅. Then glb(M ) ∈ M .
Solution: Before we proceed we wish to emphasize that this statement can be
reworded as follows. If M is a nonempty subset of the natural numbers, then
M contains a smallest natural number.
We begin this proof by supposing that the statement is false, i.e. there
exists a nonempty subset of the natural numbers M that does not contain a smallest
natural number. Since by Example 1.6.4 we know that 1 is the smallest natural
number, we know that 1 ∉ M. Let T = {k ∈ N : k < m for all m ∈ M}. By
the definition of T it is clear that M ∩ T = ∅. We will use math induction to
prove that T = N.
Step 1: Because 1 ∉ M (since 1 is the smallest natural number, if 1 ∈ M then 1
would be the smallest natural number in M) and 1 ≤ n for all n ∈ N, 1 ∈ T.
Step 2: Suppose that k ∈ T , i.e. k < m for all m ∈ M .
Step 3: Prove that k + 1 ∈ T .
Let h ∈ N be such that h < k + 1. Then h ≤ k because there cannot be a
natural number between k and k + 1. By the definition of T, h ∈ T (h ≤ k < m
for all m ∈ M) and h ∉ M (h ≤ k and k < m for all m ∈ M). Then if k + 1 ∈ M,
k + 1 would be the smallest element of M—but, of course, M does not have a smallest
element, so k + 1 ∉ M. Since k < m for all m ∈ M, there is no natural number between k and
k + 1, and k + 1 ∉ M, we have k + 1 < m for all m ∈ M. Therefore k + 1 ∈ T.
By induction T = N—but since M ∩ T = ∅ this is a contradiction, because we know that
M ≠ ∅. Therefore, glb(M) ∈ M.
If you are not inclined to just believe that for k ∈ N there are no natural
numbers between k and k + 1—and we surely hope you wouldn’t accept it without proof—
consider the following short proof. (Sometimes it makes life tough, but you
must be careful about what you believe to be obvious.) For α ∈ N suppose that there
exists a natural number β such that α < β < α + 1. Then β − α > 0 and
α + 1 − β > 0. But since these are natural numbers and 1 is the smallest natural
number, this implies that β − α ≥ 1 and α + 1 − β ≥ 1. Then we see that
(β − α) + (α + 1 − β) = 1 ≥ 2. This is a contradiction so for α ∈ N there are no
natural numbers between α and α + 1 by reductio ad absurdum.
HW 1.6.1 Prove that ∑_{j=1}^{n} j^2 = n(n + 1)(2n + 1)/6.
HW 1.6.2 Prove that if m, n ∈ N and a ∈ R, then (a^n)^m = a^(nm).
HW 1.6.3 For n ∈ N and a, b ∈ R prove that (a + b)^n = ∑_{k=0}^{n} C(n, k) a^(n−k) b^k,
where the binomial coefficient C(n, k) = n!/((n − k)! k!).
HW 1.6.4 For n ∈ N and a, b ∈ R prove that a^n − b^n = (a − b) ∑_{j=0}^{n−1} a^(n−1−j) b^j.
HW 1.6.5 Suppose that Q is an ordered field (or the reals) and suppose that
a, b ∈ Q and 0 ≤ a ≤ b. Then for n ∈ N we have a^n ≤ b^n.
HW 1.6.6 For 0 < c < 1 prove that 0 < c^n < 1 for all n ∈ N.
Chapter 2
Some Topology of R
2.1
Some Introductory Set Theory
As a part of introducing some topology of the real numbers we will include
some basic set theory. It would be easy to say that it’s a bit late to include set
theory—we have already used sets and set notation. However we felt that to
be able to discuss some of the properties of R that we now want to introduce,
we want to be sure that you know what we are talking about—and it is not the
case that we didn’t care if you didn’t know what we were talking about earlier.
Some of this material might be review but bear with us.
Definition 2.1.1 (a) We say that A is a subset of B and write A ⊂ B (or
B ⊃ A) if x ∈ A implies that x ∈ B.
(b) If A ⊂ B and there exists an x ∈ B such that x ∉ A, then A is said to be a
proper subset of B.
(c) If A ⊂ B and B ⊂ A, we say that A equals B and write A = B.
(d) We call the set that does not contain any elements the empty set and write
the empty set as ∅.
The sets in which we will be interested will almost always be subsets of the
real numbers—but none of the general definitions require that to be the case.
We have already seen that N ⊂ Z ⊂ Q ⊂ R—all of the subsets being proper
subsets. We also have I ⊂ R. We note that for any set A, A ⊂ A (clearly x ∈ A
implies that x ∈ A) and ∅ ⊂ A (if x ∈ ∅, then x ∈ A because there are no x’s in
∅).
We will often want to combine two or more sets in various ways. We make
the following definition.
Definition 2.1.2 Suppose that S is a set and there exists a family of sets associated with S, in that for any α ∈ S there exists the set Eα.
(a) We define the union of the sets Eα, α ∈ S, to be the set E such that x ∈ E
if and only if x ∈ Eα for some α ∈ S. We write E = ∪_{α∈S} Eα. If we have only
two sets, E1 and E2, we write E = E1 ∪ E2. If S = {1, 2, · · · , n} (we have n
sets, E1, · · · , En), we write E = ∪_{k=1}^{n} Ek or E = E1 ∪ E2 ∪ · · · ∪ En. If S = N,
then we write E = ∪_{k=1}^{∞} Ek.
(b) We define the intersection of the sets Eα, α ∈ S, to be the set E such that
x ∈ E if and only if x ∈ Eα for all α ∈ S. We write E = ∩_{α∈S} Eα. If we have
only two sets, E1 and E2, we write E = E1 ∩ E2. If S = {1, 2, · · · , n} (we have
n sets, E1, · · · , En), we write E = ∩_{k=1}^{n} Ek or E = E1 ∩ E2 ∩ · · · ∩ En. If S = N,
then we write E = ∩_{k=1}^{∞} Ek.
We note that the union contains all of the points that are in any of the sets
under consideration while the intersection contains the points that are in all of
the sets under consideration. It is easy to see that
• {1, 2, 3, 4, 5, 6, 7} ∪ {5, 6, 7, 8} = {1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7} ∩
{5, 6, 7, 8} = {5, 6, 7}
• Q ∪ I = R, Q ∩ I = ∅
• (1, 10)∪{1, 10} = [1, 10], [1, 10]∪[10, 20] = [1, 20], [1, 10)∪[10, 20] = [1, 20],
[1, 10] ∩ [10, 20] = {10}, [1, 10) ∩ [10, 20] = ∅
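The first bullet translates directly into Python, whose built-in set type supports exactly these operations (a small illustration of our own):

```python
A = {1, 2, 3, 4, 5, 6, 7}
B = {5, 6, 7, 8}

print(A | B)    # the union A ∪ B: {1, 2, 3, 4, 5, 6, 7, 8}
print(A & B)    # the intersection A ∩ B: {5, 6, 7}
```

Of course finite sets are all a computer can enumerate; the interval examples above involve infinite sets and must be handled by the definitions.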
We can immediately obtain an assortment of properties pertaining to unions
and intersections, which we include in the following proposition.
Proposition 2.1.3 For the sets A, B and C we obtain the following properties.
(a) A ⊂ A ∪ B
(b) A ∩ B ⊂ A
(c) A ∪ ∅ = A
(d) A ∩ ∅ = ∅
(e) If A ⊂ B, then A ∪ B = B and A ∩ B = A.
(f) A ∪ B = B ∪ A, A ∩ B = B ∩ A (Commutative Laws)
(g) (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C) (Associative Laws)
(h) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) (Distributive Law)
Proof: We will not prove all of these—hopefully most of these are very easy for
you. We will prove three of the properties to illustrate some methods of proofs
of set properties.
(b) To prove the set containment in property (b) we begin with an x ∈ A ∩ B.
This implies that x ∈ A and x ∈ B. Therefore x ∈ A and we are done.
(h) To prove property (b) we applied the definition of set containment and
proved that if x ∈ A ∩ B, then x ∈ A. To prove property (h) we must apply
the definition of equality of sets, Definition 2.1.1–(c), and prove containment
both directions, i.e. we must prove that A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C) and
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C): We suppose that x ∈ A ∩ (B ∪ C). Then
we know that x ∈ A, and x ∈ B or x ∈ C. If x ∈ B, then x ∈ A ∩ B. If
x ∈ C, then x ∈ A ∩ C. Thus we know that x ∈ A ∩ B or x ∈ A ∩ C. Therefore
x ∈ (A ∩ B) ∪ (A ∩ C) and A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C).
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C): We now suppose that x ∈ (A ∩ B) ∪ (A ∩ C).
Then we know that x ∈ A ∩ B or x ∈ A ∩ C, i.e. we know that x ∈ A and
x ∈ B, or x ∈ A and x ∈ C. Thus in either case we know that x ∈ A. We also
know that x must be in either B or C (or both, but we don’t care much about
this possibility). Thus x ∈ A and x ∈ B ∪ C, or x ∈ A ∩ (B ∪ C). Therefore
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
By Definition 2.1.1 we have that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
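The distributive law just proved can be spot-checked on concrete finite sets with Python's set operators (a sanity check of our own, not a substitute for the containment proof):

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}

left = A & (B | C)               # A ∩ (B ∪ C)
right = (A & B) | (A & C)        # (A ∩ B) ∪ (A ∩ C)
print(left == right, left)       # True {3, 4}
```

Trying several randomly generated triples of sets is a good way to convince yourself of a set identity before attempting the two-directional containment proof.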
(g) We will prove both properties given in (g) using Venn Diagrams. It’s not
clear what sort of proof Venn Diagrams provide, but they are very nicely illustrative.
Note that in Figure 2.1.1 below, in the left box we draw three supposedly
arbitrary sets, A, B and C. We cross-hatch A with vertical lines, B with horizontal
lines and C with slanted lines. It is clear that A ∪ B is the set cross-hatched
with either vertical or horizontal lines. Then (A ∪ B) ∪ C is the set that is
cross-hatched with vertical or horizontal lines, or slanted lines, i.e. the set that
is cross-hatched in any manner.
We then proceed to the second box. We cross-hatch A, B and C as we did
in the box on the left. Then the set (B ∪ C) is the set that is cross-hatched
with either horizontal lines or slanted lines, and A ∪ (B ∪ C) is the set that is
cross-hatched with vertical lines, or horizontal lines or slanted lines, i.e. the set
that is cross-hatched in any manner. It is clear that the region denoting the set
(A ∪ B) ∪ C is the same as the region A ∪ (B ∪ C), so the sets are equal.
To prove the property (A ∩ B) ∩ C = A ∩ (B ∩ C), we note on the left that
(A ∩ B) is the set cross-hatched with vertical and horizontal lines. We then note
that the set (A ∩ B) ∩ C is the set cross-hatched with vertical and horizontal
lines, and slanted lines, i.e. the region cross-hatched with all three lines. We
then note on the right that the region (B ∩ C) is the region cross-hatched with
horizontal lines and slanted lines, so the region A ∩ (B ∩ C) will be the region
cross-hatched with vertical, and horizontal and slanted lines, i.e. the region
cross-hatched with all three lines. It is clear that these regions are equal so we
know that (A ∩ B) ∩ C = A ∩ (B ∩ C).
As we stated earlier it is not clear how rigorous the Venn Diagram proof is,
but hopefully it is a helpful method—because it’s so visual. We will not prove
the remaining properties. The proofs of the rest are very similar to the proofs
given above—and are all easier than the proof of (h).
We next define the complement of a set. To discuss the complement it is
necessary to have a universe. The entirety of the set of elements under consideration
is called the universal set or the universe. Generally for us the universe
will be either R or a subset of R. When it is not emphasized with respect to
what we are taking the complement, assume it is with respect to R. At the
[Figure 2.1.1: Venn Diagram proofs that (A ∪ B) ∪ C = A ∪ (B ∪ C) and
(A ∩ B) ∩ C = A ∩ (B ∩ C). In both plots A, B and C are cross-hatched with
vertical, horizontal and slanted lines, respectively. The sets (A ∪ B) ∪ C and
A ∪ (B ∪ C) will be cross-hatched as any of the three; the sets (A ∩ B) ∩ C and
A ∩ (B ∩ C) will be cross-hatched with all three.]
same time we define a concept that is strongly related to the complement, the
difference of two sets.
Definition 2.1.4 (a) For two sets A and B, we define the difference of A and
B (or the complement of B with respect to A) as A − B = {x ∈ A : x ∉ B}.
(b) The complement of the set A with respect to the universe U is the set Ac =
{x ∈ U : x ∉ A}, or Ac = U − A.
If A1 = (−∞, 4), A2 = (2, 5) and A3 = (−∞, 5], it is easy to see that A1c =
[4, ∞), A2c = (−∞, 2] ∪ [5, ∞) and A3c = (5, ∞). If we wanted the complement
of A1 with respect to A3, then A3 − A1 = [4, 5].
We next state the very basic but important result concerning complements
of sets.
Proposition 2.1.5 (Ac )c = A
It should be very easy to see that the above result is true. Probably the easiest
way is to draw the very simple Venn Diagram representing the left hand side of
the equality.
We next prove a very important result related to complements referred to
as DeMorgan’s Laws.
Proposition 2.1.6 Consider the set A, and the family of sets Eα associated with S.
(a) A − ∪_{α∈S} Eα = ∩_{α∈S} (A − Eα)
(b) A − ∩_{α∈S} Eα = ∪_{α∈S} (A − Eα)
(c) (∪_{α∈S} Eα)c = ∩_{α∈S} Eαc
(d) (∩_{α∈S} Eα)c = ∪_{α∈S} Eαc
Proof: (a) The proof of property (a) follows by carefully applying the definition
of set equality. We begin by assuming that x ∈ A − ∪_{α∈S} Eα. Then we know
that x ∈ A and x ∉ ∪_{α∈S} Eα. The statement that x ∉ ∪_{α∈S} Eα is a very strong
statement. This means that x ∉ Eα for any α ∈ S—if x ∈ Eα0 for some α0 ∈ S,
then x ∈ ∪_{α∈S} Eα. Thus x ∈ A and x ∉ Eα so x ∈ A − Eα—and this holds for
any α ∈ S. Therefore x ∈ ∩_{α∈S} (A − Eα), and A − ∪_{α∈S} Eα ⊂ ∩_{α∈S} (A − Eα).
We next assume that x ∈ ∩_{α∈S} (A − Eα). This implies that x ∈ A − Eα for
every α ∈ S. Therefore, x ∈ A and for every α ∈ S, x ∉ Eα. This implies that
x ∉ ∪_{α∈S} Eα—because if x ∈ Eα0 for some α0 ∈ S, then x ∉ A − Eα0. Since
x ∈ A and x ∉ ∪_{α∈S} Eα, then x ∈ A − ∪_{α∈S} Eα, or ∩_{α∈S} (A − Eα) ⊂ A − ∪_{α∈S} Eα.
Therefore A − ∪_{α∈S} Eα = ∩_{α∈S} (A − Eα).
(b) The proof of property (b) is very similar to that of property (a). We assume
that x ∈ A − ∩_{α∈S} Eα. Then x ∈ A and x ∉ ∩_{α∈S} Eα. The statement x ∉ ∩_{α∈S} Eα
implies that x ∉ Eα0 for some (at least one) α0 ∈ S. But then x ∈ A − Eα0 so
x ∈ ∪_{α∈S} (A − Eα) and A − ∩_{α∈S} Eα ⊂ ∪_{α∈S} (A − Eα).
If x ∈ ∪_{α∈S} (A − Eα), then x ∈ A − Eα0 for some (again, at least one) α0 ∈ S.
This implies that x ∈ A and x ∉ Eα0. But if x ∉ Eα0, then x ∉ ∩_{α∈S} Eα
(to be in there, it must be in all of them). Therefore x ∈ A − ∩_{α∈S} Eα and
∪_{α∈S} (A − Eα) ⊂ A − ∩_{α∈S} Eα.
We then have A − ∩_{α∈S} Eα = ∪_{α∈S} (A − Eα).
(c) and (d) Properties (c) and (d) follow from properties (a) and (b),
respectively, by letting A = U, the universal set.
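DeMorgan's Laws are also easy to check on finite families of sets, which can build intuition before (or after) wrestling with the element-chasing proof. A Python sketch of our own, using randomly generated subsets of a finite "universe":

```python
import random

random.seed(1)
U = set(range(50))                                  # a finite stand-in for the universe
A = {x for x in U if random.random() < 0.5}
family = [{x for x in U if random.random() < 0.5} for _ in range(5)]

union = set().union(*family)                        # the union of the E_alpha
inter = set(U).intersection(*family)                # the intersection of the E_alpha

# Properties (a) and (b) of Proposition 2.1.6:
assert A - union == set.intersection(*(A - E for E in family))
assert A - inter == set().union(*(A - E for E in family))
print("DeMorgan's laws held for these random finite sets")
```

The index set S here is finite; the proposition itself holds for an arbitrary index set, which no finite computation can exercise.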
HW 2.1.1 (True or False and why)
(a) A ⊂ A ∩ B
(b) B − A = B ∩ Ac
(c) For Ek = (−1/k, 1/k), k ∈ N, E = ∪_{k=1}^{∞} Ek = (−1, 1).
(d) For Ek = (−k, k), k ∈ N, E = ∩_{k=1}^{∞} Ek = R.
(e) A ∪ B = [A − (A ∩ B)] ∪ B
HW 2.1.2 Give set containment proofs of parts (c) and (g) of Proposition
2.1.3.
HW 2.1.3 Give Venn diagram proofs of part (h) of Proposition 2.1.3 and part
(c) of Proposition 2.1.6.
2.2
Basic Topology
Topology provides a general set with basic structures and results that allow
one to study some basic topics in analysis on the topological space. We do not
want that much generality. We are going to study calculus on R, so we want some of
the relevant concepts of the topology of R that will help us. The title of the chapter and
the title of this section are very appropriate. We do not claim to be giving you
the topology of R. As the titles imply, we are going to give you some of the
basic topology of R—the topology that we want to use. In this section we will
introduce some of the most basic topology of the reals. In later sections, when
appropriate, we will add more topological results.
We begin by defining several ideas related to subsets of R.
Definition 2.2.1 Suppose x0 ∈ R and E is a subset of R.
(a) A neighborhood of a point x0 is the set Nr(x0) = {x ∈ R : |x − x0| < r} for some r > 0. The
number r is called the radius of the neighborhood.
(b) A point x0 is a limit point (an accumulation point) of a set E if every
neighborhood of x0 contains a point x ≠ x0 such that x ∈ E. We call the set of
limit points of E the derived set of E and denote it by E′.
(c) If x0 ∈ E and x0 is not a limit point of E, then x0 is said to be an isolated
point of E.
(d) The set E is closed if every limit point of E is in E, i.e. E ′ ⊂ E.
(e) A point x0 ∈ E is an interior point of E if there is a neighborhood N of x0
such that N ⊂ E. We call the set of interior points of E the interior of E and
denote it by E o .
(f ) The set E is open if every point of E is an interior point (E ⊂ E o so then
E = E o ).
(g) The set E is dense in R if x ∈ R implies that x is a limit point of E or
x ∈ E.
It should be easy to see from Proposition 1.5.9 that neighborhoods of a point
x0 are intervals, (x0 − r, x0 + r) where r > 0. Thus the intervals (.9, 1.1), (.5, 1.5)
and (.995, 1.005) are all neighborhoods of the point x0 = 1 with radii .1, .5, and
.005, respectively.
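The three equivalent conditions of Proposition 1.5.9 suggest a one-line membership test; a small Python sketch of our own (subject to the usual floating-point caveats):

```python
def in_neighborhood(x: float, x0: float, r: float) -> bool:
    """x in N_r(x0)  iff  |x - x0| < r  iff  x in (x0 - r, x0 + r)."""
    return abs(x - x0) < r

print(in_neighborhood(1.05, 1.0, 0.1))   # True: 1.05 lies in (0.9, 1.1)
print(in_neighborhood(1.5, 1.0, 0.1))    # False: 1.5 is outside (0.9, 1.1)
```

Note that the inequality is strict, so the endpoints x0 ± r are not in the neighborhood.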
Example 2.2.1 Define the following sets: E1 = [0, 1], E2 = (0, 1), E3 = {1, 1/2, 1/3, · · · }.
(a) Show that E1′ = [0, 1] = E1 .
(b) Show that E2′ = [0, 1].
(c) Show that E3′ = {0}.
Solution: (a) We begin by considering any point x0 ∈ E1, x0 ≠ 1, and let Nr(x0) be any
neighborhood of x0. We note that the point x1 = min{x0 + r/2, 1} is in E1 and Nr(x0),
and x1 ≠ x0, so the point x0 is a limit point of E1. Note that the point that we used,
x1 = min{x0 + r/2, 1}, is not a very nice looking point, but we needed to be careful to choose
a point that would be in both E1 and Nr(x0)—the choice being 1 when r is too large.
If x0 = 1 and Nr(x0) is an arbitrary neighborhood of x0—for any r—then the point
x1 = max{x0 − r/2, 0} is in E1 and Nr(x0), and x1 ≠ x0, so the point x0 = 1 is a limit point
of E1 . Thus every point in [0, 1] is a limit point of E1 = [0, 1].
If x0 6∈ [0, 1], say x0 > 1, then if r = (x0 − 1)/2, the neighborhood of x0 is such that
Nr (x0 ) ∩ E1 = ∅. Thus x0 is not a limit point of E1 . A similar argument shows that any x0
such that x0 < 0 is not a limit point of E1 .
Thus only the points in [0, 1] are limit points of E1 = [0, 1], i.e. E1′ = [0, 1] = E1 .
We would like to emphasize that by the definition of a limit point, for the point x0 to be
a limit point of E1, it must be shown that every neighborhood of x0 contains an element of E1
different from x0. To show that a point x0 is not a limit point of E1, we only have to show
that there exists one neighborhood of x0 that does not contain any elements of E1 other than
x0.
(b) If we consider the set E2 = (0, 1) and let x0 be an arbitrary point of E2, then for any
neighborhood of x0, Nr(x0), the point x1 = min{x0 + r/2, (1 + x0)/2} will be in both E2
and Nr(x0), and not equal to x0. (Again we emphasize that we need the nasty looking point
x1 because we use x0 + r/2 when r is sufficiently small and use (1 + x0)/2 when r is large.)
Thus every point in (0, 1) is a limit point of E2.
Since for any r > 0 the neighborhoods Nr(0) = (−r, r) and Nr(1) = (1 − r, 1 + r) contain
the points x0 = min{r/2, 1/2} and x1 = max{1 − r/2, 1/2}, respectively—both points in E2
and surely x0 ≠ 0 and x1 ≠ 1—the points x = 0 and x = 1 are both limit points of E2.
The same argument used for the set E1 can be used to show that all points x 6∈ [0, 1] are
not limit points. Thus only the points in [0, 1] are limit points of E2 = (0, 1) or E2′ = [0, 1].
(c) To find E3′ for the set E3 = {1, 1/2, 1/3, · · · } is more difficult. The easiest way is to
first determine some facts concerning E3 . It is very intuitive that given some element of E3
other than 1, say 1/m, m ∈ N , then the elements of E3 that are closest to 1/m in value are
1/(m − 1) (the next larger element in the set) and 1/(m + 1) (the next smaller element in the
set). Of course we must be able to prove these statements—if someone asked. The easiest
way to prove these is to use the second Peano Postulate, which can be stated as: there are no
natural numbers between m − 1 and m, or between m and m + 1. If there is some element of E3 , say
1/k, such that 1/m < 1/k < 1/(m − 1), then we have k < m and k > m − 1, which contradicts
PP2.
If we proceed and choose any specific element of E3 , say x0 = 1/1004, it is not difficult
to see that the neighborhood Nr (1/1004) where r = 0.0000001 will not contain any elements of
E3 other than x0 (because we can compute the value of the elements of E3 that are closest
to x0 ). This same argument will work for all of the elements of E3 —the radius r = (1/2)(1/m − 1/(m + 1)) will
always work. For x0 = 1 the neighborhood N = (0.99, 1.01) will be such that N contains no
points of E3 other than x0 = 1. Thus no points in E3 are limit points of E3 .
If we consider the point x0 > 1, then the neighborhood Nr (x0 ) with r = (x0 − 1)/2 will
not contain any elements of E3 . Thus x0 > 1 is not a limit point of E3 .
If we consider the point x0 < 0, then the neighborhood Nr (x0 ) with r = −x0 /2 will not
contain any elements of E3 . Thus x0 < 0 is not a limit point of E3 .
We now consider a point x0 with x0 ∉ E3 and 0 < x0 < 1. We know that there must be two
elements of E3 , say x1 = 1/(m − 1) and x2 = 1/m, such that x2 < x0 < x1 —choose m by
setting x2 = 1/m = lub{y ∈ E3 : y < x0 } (you must prove that this least upper bound will
be in E3 ) and let x1 = 1/(m − 1) be the next larger element in the set. We can
then set r = min{(x0 − x2 )/2, (x1 − x0 )/2} and note that Nr (x0 ) ∩ E3 = ∅. Therefore, the
points x0 such that x0 ∉ E3 and 0 < x0 < 1 are not limit points of E3 .
The last point that we have to consider is the point x0 = 0. We let Nr (0) denote any
neighborhood of x0 , i.e. consider any r. Then by Corollary 1.5.5–(b) with ε = r we see that
there exists an n such that 1/n < r or 1/n ∈ Nr (0). Thus x0 = 0 is a limit point of E3 . Thus
we see that the only limit point of the set E3 is x0 = 0, i.e. E3′ = {0}.
It should be clear that all of the points in E3 are isolated points. Likewise,
if we consider the set N, since none of the points of N are limit points (for any
k ∈ N, the neighborhood N1/2 (k) does not contain any elements of N), all of
the points in N are isolated. It should also be easy to see that no points of E1
or E2 are isolated points—all of the points in both E1 and E2 are limit points.
38
2. Topology
Example 2.2.2
Let E1 , E2 and E3 be as in Example 2.2.1.
(a) Show that E1 is a closed set.
(b) Show that E2 is not a closed set.
(c) Show that E3 is not a closed set.
Solution: These proofs are very easy based on the work done in Example 2.2.1. Since we
saw that E1′ = [0, 1] = E1 , then clearly all of the limit points of E1 are contained in E1 and
the set E1 is closed.
Since we found that E2′ = [0, 1], we see that the limit points 0 and 1 do not belong to E2
so the set E2 is not closed. Likewise, since we saw that the only limit point of E3 is the point
0 and 0 ∉ E3 , then the set E3 is not closed.
We should note that if we consider E4 = E3 ∪ {0}, then E4 will surely be closed—
almost any time you define a new set by adjoining the limit points of a set, the
resulting set will be closed. (Can you give an example where that is not the case?) Also,
since we saw that N has no limit points, the set N is surely closed—any set
that has no limit points is closed. This statement would also imply that
the empty set, ∅, is closed. It should be really easy to see that the set R′ = R
and hence that R is closed—for x0 ∈ R, the whole neighborhood Nr (x0 ) ⊂ R
for any r.
Example 2.2.3
Let E1 , E2 and E3 be as in Example 2.2.1.
(a) Show that E1o = (0, 1).
(b) Show that E2o = (0, 1) = E2 .
(c) Show that E3o = ∅.
Solution: (a) For x0 ∈ (0, 1) it should be easy to see that Nr (x0 ) ⊂ E1 if r = min{x0 /2, (1 −
x0 )/2}. Thus x0 ∈ (0, 1) implies x0 ∈ E1o . It should be even easier to see that there is no r such
that Nr (0) or Nr (1) will be contained in E1 —in the first case −r/2 ∉ E1 and in the second
case 1 + r/2 ∉ E1 . And since a point must be an element of the set to be an interior point,
we do not have to consider any other points. Therefore E1o = (0, 1).
(b) For points x0 ∈ (0, 1) exactly the same argument used in part (a) will show that x0 is an
interior point of E2 . All other points are not in E2 so E2o = (0, 1).
(c) In Example 2.2.1–(c) we showed that for x0 ∈ E3 there was a neighborhood
Nr (x0 ) that did not contain any points of E3 —other than x0 . So surely Nr (x0 ) ⊄ E3 .
Clearly, any neighborhood Nr1 (x0 ) with r1 < r will also not contain any points of E3
other than x0 . Thus Nr1 (x0 ) ⊄ E3 .
And though neighborhoods Nr1 (x0 ) for r1 > r may contain some elements of E3 , such
neighborhoods will also always contain Nr (x0 )—which contains a large number of points that
are not in E3 . Thus there is no neighborhood of x0 that is contained in E3 .
Since only points of E3 need be considered, E3o = ∅.
It should be clear that the set N does not contain any interior points and every
point in R is an interior point.
It should now be easy to see that since 0 ∉ E1o (1 would work too), the set
E1 is not open. Since E2o = (0, 1) = E2 (i.e. every point in E2 is an interior
point), the set E2 is an open set. Clearly since 1/120012 ∉ E3o (and of course
any element of E3 would work here), the set E3 is not open. And finally, it
should be easy to see that N is not open and R is open.
The question of whether a set is dense in R is more difficult, but we do not
want to consider many examples. Hopefully it is clear that sets like E1 , E2 , E3
and N are not dense in R—you must have much bigger sets than these
to be dense in R. It should be clear that E = R is trivially dense in R. The
two important examples were already considered in Proposition 1.5.6. Consider
the following example.
Example 2.2.4
(a) Show that Q is dense in R.
(b) Show that I is dense in R.
Solution: (a) By the definition of I, a point x0 ∈ R is either in Q or in I. Let x0 be an
arbitrary point of R. We must show that x0 is in Q or x0 is a limit point of Q. Thus if x0 ∈ Q,
we are done. Suppose that x0 ∉ Q (so x0 ∈ I). Consider Nr (x0 ) for any r, i.e. the interval
(x0 − r, x0 + r). Then by Proposition 1.5.6–(a) (with a chosen to be x0 − r and b chosen to
be x0 + r) there exists a rational r1 such that x0 − r < r1 < x0 + r. Therefore x0 is a limit
point of Q.
Since any point of R is either in Q or a limit point of Q, Q is dense in R.
(b) The proof of part (b) follows the same pattern as the proof of part (a) except that we use
part (b) of Proposition 1.5.6 instead of part (a).
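The construction in part (a)—find n with 1/n < b − a, then step to the first multiple of 1/n past a—is concrete enough to compute. Here is a Python sketch of the standard Archimedean construction (our own illustration, not part of the text; it uses exact `Fraction` arithmetic so the strict inequalities of the proof carry over):

```python
from fractions import Fraction
from math import floor

def rational_between(a, b):
    """For rationals a < b, return the rational k/n with a < k/n < b produced
    by the Archimedean argument: pick n with 1/n < b - a, then k = floor(a*n) + 1."""
    a, b = Fraction(a), Fraction(b)
    assert a < b
    n = floor(1 / (b - a)) + 1        # guarantees 1/n < b - a
    k = floor(a * n) + 1              # smallest integer with k/n > a
    q = Fraction(k, n)
    assert a < q < b                  # k/n <= a + 1/n < a + (b - a) = b
    return q

print(rational_between(Fraction(1, 3), Fraction(1, 2)))  # prints 3/7
```

Exact rationals matter here: with floating point, a·n can round to an integer and push k/n onto an endpoint, silently breaking the strict inequalities that the real-number argument guarantees.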
Hopefully the above examples give us an understanding of the ideas presented in Definition 2.2.1. We now proceed to prove some important properties
concerning limit points, and open and closed sets.
Proposition 2.2.2 A neighborhood is an open set.
Proof: You should note that this proof is, and should be, very similar to the
proof that E2o = E2 = (0, 1) and that E2 is open in Example 2.2.3–(b) and the
statements following that example. We write the neighborhood N as N = (x0 −
r, x0 + r). If we choose any point y0 ∈ N , then it is clear that the neighborhood of
y0 , Nr1 (y0 ) = (y0 − r1 , y0 + r1 ) where r1 = (1/2) min{r − (x0 − y0 ), r − (y0 − x0 )}, is
in N (draw a picture to help see that this is true). Thus y0 is an interior point
of N so N is open (N o = N ).
Proposition 2.2.3 If x0 is a limit point of E ⊂ R and N is any neighborhood
of x0 , then N contains infinitely many points of E.
Proof: Since N is a neighborhood of x0 and x0 is a limit point of E, we know
that there exists a point of E in N different from x0 . Call this point x1 . Then consider the
neighborhood of x0 , N1 = (x0 − d1 , x0 + d1 ) where d1 = max{x0 − x1 , x1 − x0 }
(we’re just choosing the positive distance from x0 to x1 ). It should be clear by
the construction of N1 that N1 ⊂ N .
Then since x0 is still a limit point of E and N1 is a neighborhood of x0 ,
there exists x2 ∈ N1 such that x2 ∈ E and x2 ≠ x0 . Since x1 ∉ N1 , x2 ≠ x1 .
Let N2 = (x0 − d2 , x0 + d2 ) where d2 = max{x0 − x2 , x2 − x0 }. Then N2 is
another neighborhood of x0 and by construction N2 ⊂ N1 ⊂ N . Since x0 is a
limit point of E, there exists x3 ∈ N2 such that x3 ∈ E and x3 ≠ x0 .
Inductively, we define a set of points {x1 , x2 , · · · } and neighborhoods
N1 , N2 , · · · such that for any n, xn+1 ∈ Nn and xn+1 ∈ E (because Nn is a
neighborhood of x0 and x0 is a limit point of E), where Nn+1 = (x0 − dn+1 , x0 + dn+1 )
with dn+1 = max{x0 − xn+1 , xn+1 − x0 }, so that Nn+1 is a neighborhood of x0 and Nn+1 ⊂ Nn ⊂ N .
Thus we see that the points of the infinite set {x1 , x2 , · · · } are all in both N
and E.
From this result we obtain the following useful corollary.
Corollary 2.2.4 If E is a finite set, then E has no limit points.
Proposition 2.2.5 The set E ⊂ R is open if and only if E c is closed.
Proof: (⇐) We suppose that E c is closed and that x0 is an arbitrary point of
E. Then x0 ∉ E c (definition of E c ) and x0 is not a limit point of E c (because
E c is closed, it contains all of its limit points). Since x0 is not a limit point of
E c , we know that there exists a neighborhood of x0 , N , such that N ∩ E c = ∅,
i.e. N ⊂ E. Therefore x0 is an interior point of E, E ⊂ E o , so E is open.
(⇒) Now suppose that E is open and that x0 is a limit point of E c . Then
every neighborhood of x0 contains a point of E c , i.e. no neighborhood of x0 is
contained in E. Therefore, x0 is not an interior point of E. Since E is open,
this implies that x0 ∉ E, i.e. x0 ∈ E c . Thus E c is closed.
We then obtain the following corollary to the above result.
Corollary 2.2.6 The set F ⊂ R is closed if and only if F c is open.
HW 2.2.1 (True or False and why)
(a) The set E = {x ∈ [0, 1] : x ∈ Q} = [0, 1] ∩ Q is open.
(b) The set E = {x ∈ [0, 1] : x ∈ I} = [0, 1] ∩ I is closed.
(c) The set E = [0, 1] ∪ {x : x = 1 + 1/n, n ∈ N} is closed.
(d) If E = [0, 1] ∩ Q, then E o = (0, 1).
(e) A neighborhood is closed.
(f) If E is a finite set, E is closed.
HW 2.2.2 Determine the limit points of the set {x ∈ R : x = 1/n + 1/m, n, m ∈ N}.
HW 2.2.3 (a) Suppose E1 , E2 ⊂ R are open. Prove that E1 ∪ E2 is open.
(b) Suppose E1 , E2 ⊂ R are open. Prove that E1 ∩ E2 is open.
(c) Suppose E1 , E2 ⊂ R are closed. Prove that E1 ∪ E2 is closed.
(d) Suppose E1 , E2 ⊂ R are closed. Prove that E1 ∩ E2 is closed.
HW 2.2.4 (a) Suppose E1 , E2 , · · · ⊂ R are open. Prove that ∪_{k=1}^∞ Ek is open.
(b) Suppose E1 , E2 , · · · ⊂ R are closed. Prove that ∩_{k=1}^∞ Ek is closed.
(c) Suppose E1 , E2 , · · · ⊂ R are open. Show that ∩_{k=1}^∞ Ek need not be open.
(d) Suppose E1 , E2 , · · · ⊂ R are closed. Show that ∪_{k=1}^∞ Ek need not be closed.
2.3
Compactness in R
The concept of compactness of sets is very important in analysis. We will use
compactness results in later chapters and you will probably use them throughout
your mathematical career. We make the following two definitions.
Definition 2.3.1 The collection {Gα }α∈S of open subsets of R is an open cover
of the set E ⊂ R if E ⊂ ∪_{α∈S} Gα .
Definition 2.3.2 A set K ⊂ R is said to be compact if every open cover of K
contains a finite subcover.
The concept of compactness is an abstract one. We will give several
examples of compact and non-compact sets but, as you will see, this is difficult.
If we consider the collection of sets Gk = (k − 1/2, k + 1/2), k ∈ N, it should
be clear that {Gk } is an open cover of the set N. It should also be clear that
we cannot choose a finite subcover. Thus the set N is not compact.
Also consider the set (0, 1] and the collection of sets Gk = (1/k, 2), k =
1, 2, · · · . For any x ∈ (0, 1] there exists k ∈ N such that 1/k < x (Corollary 1.5.5–(b)), i.e. x ∈ (1/k, 2). Thus the collection {Gk } covers (0, 1]. Let Gα1 , · · · , Gαn
be any finite sub-collection of {Gk }. One of these sets will be associated with
the largest k value—the smallest 1/k (the lub{αk }), say k = k0 . Then the point
(1/2)(1/k0 ) is not included in ∪_{k=1}^n Gαk , i.e. the set (0, 1] is not compact.
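The bookkeeping in this argument—take the largest index k0 appearing in the finite subcollection and exhibit the uncovered point (1/2)(1/k0 )—can be mimicked directly. A Python sketch (our own illustration, not part of the text):

```python
def uncovered_point(indices):
    """Given a finite set of indices k for the cover G_k = (1/k, 2) of (0, 1],
    return a point of (0, 1] missed by every chosen G_k."""
    k0 = max(indices)                 # the largest k gives the smallest 1/k
    x = 1 / (2 * k0)                  # the point (1/2)(1/k0)
    assert 0 < x <= 1                 # x is in (0, 1]
    assert all(not (1 / k < x < 2) for k in indices)  # x is in no chosen G_k
    return x

print(uncovered_point({1, 3, 7, 50}))  # prints 0.01
```

No matter how the finite subcollection is chosen, the function finds a point of (0, 1] it misses, which is exactly why no finite subcover exists.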
It would be nice to have an example of a compact set. Suppose that E is
a finite set, say E = {a1 , · · · , aK }, and {Gα }α∈S is any open cover of E, i.e.
E ⊂ ∪_{α∈S} Gα . Then for each aj ∈ E there must exist some Gαj in the collection
{Gα } such that aj ∈ Gαj —aj may be in a lot of Gα 's but who cares. Then
{Gαj }_{j=1}^K is a finite subcover of E, E ⊂ ∪_{j=1}^K Gαj , so the set E is compact.
We understand that the set E in this last example is a trivial example
whereas N and (0, 1] are more interesting sets. The truth of the matter is that
in general it is much easier to prove that a set is not compact (if it is not
compact)—you only have to find one cover that has no finite subcover—than it
is to prove that a set is compact—you have to consider all open covers. Later
we will use some of our theorems to produce other sets that are compact (and
some that are not compact).
We need two different types of results concerning compactness. We first need
some general methods that help us determine when and if a given set is compact.
In addition we need some results that give us some of the useful properties of
compact sets—this is why we need and want the concept of compactness. We
begin with the following result.
Proposition 2.3.3 If K ⊂ R is compact, then K is closed.
Proof: We will prove this result by showing that K c is open (and then apply
Propositions 2.2.5 and 2.1.5).
Suppose that x ∈ K c . Then for any point y ∈ K, we can choose neighborhoods Vy and Wy of the points x and y, respectively, of radius r = |x − y|/4 (and
since x ≠ y, r > 0). The collection of sets {Wy }, y ∈ K, will surely define
an open cover of the set K—y ∈ K implies y ∈ Wy . Since the set K is compact, we choose a finite number of sets Wy1 , Wy2 , · · · , Wyn that covers K, i.e.
K ⊂ W = ∪_{k=1}^n Wyk .
n
Let V = ∩ Vyk . Since each Vyk is a neighborhood of the point x and we
k=1
are considering only a finite number of such neighborhoods, V will also be a
neighborhood of the point x—of radius min{|x − y1 |/4, · · · , |x − yn |/4}. (Note
that the sets Vyk , k = 1, · · · , n form a nested set of neighborhoods all about the
point x—we don’t know what order. The set V will be the smallest of those
neighborhoods.) Since Vyk ∩ Wyk = ∅ for k = 1, · · · , n, V ∩ W = ∅. Since
K ⊂ W , V ⊂ K c —draw a picture, it’s easy. Therefore V is a neighborhood of
x ∈ K c such that V ⊂ K c , so x is an interior point of the set K c . Since x was
an arbitrary point of K c , the set K c is open, and then K = (K c )c is closed.
Since we know that the set N is closed but not compact, we know that
we cannot obtain the converse of the above result. We can however prove the
following "partial converse."
Proposition 2.3.4 If the set K ⊂ R is compact and F ⊂ K is closed, then F
is compact.
Proof: Let the collection of sets {Vα } be an open cover of F . Since F ⊂ K,
the sets {Vα } will cover part of K. Since F is closed, we know that F c is
open. Since the set F c will cover the part of K that the collection {Vα } did not
cover, the collection of sets {Vα } plus F c will cover K. Since K is compact we
can choose a finite subcover, Vα1 , · · · , Vαn plus maybe F c . Since F ⊂ K, this
subcover must cover F also.
If F c was included in the subcover, then we can throw it out and Vα1 , · · · , Vαn
will cover F —because F c didn’t cover any part of F .
Otherwise, if F c was not included in the subcover, then clearly Vα1 , · · · , Vαn
covers F .
In either case we have found a subcover of the collection of sets {Vα } which
covers F . Therefore F is compact.
You should understand that the above proof is especially abstract since you
start with a given open cover about which you know nothing. You still
must find a finite subcover—and we do. That can be considered a tough job.
We next give a result that will be very important to us later. We will often
need sets to have limit points. This result guarantees that we get a limit point any
time the set is compact and infinite.
Proposition 2.3.5 If K ⊂ R is compact, and the set E is an infinite subset of
K, then E has a limit point in K.
Proof: Suppose the result is false, i.e. suppose that K is compact and E ⊂ K
is infinite and E has no limit points in K. Then for any x ∈ K (which would
not be a limit point of E) there exists a neighborhood of x, Nx , such that if
x ∈ E (it need not be), then Nx ∩ E = {x} and if x ∉ E, then Nx ∩ E = ∅.
(Since x is not a limit point of E, it is not the case that every neighborhood of
x contains a point of E other than maybe x. Or there is some neighborhood of
x that does not contain any point of E other than maybe x.)
The collection of all such neighborhoods, Nx , x ∈ K (for which Nx ∩ E ⊂ {x}),
is surely an open cover of K. Clearly no finite subcover of this collection of sets
can cover E—each set Nx contains one or no points of E and the set E is
infinite. If no finite subcover of this collection can cover E, no finite subcover
of this collection can cover K since E ⊂ K. This contradicts the fact that the
set K is compact. Therefore the set E has at least one limit point in K.
When you think about the next statement it probably seems clear. We need
this result proved because it will be very important for us.
Proposition 2.3.6 Suppose that {In }_{n=1}^∞ is a collection of closed intervals in
R such that In+1 ⊂ In for all n = 1, 2, · · · . Then ∩_{n=1}^∞ In is not empty.
Proof: Write the intervals as In = [an , bn ]. Let E = {an : n = 1, 2, · · · } and
x = lub(E). Then it is clear that x ≥ an for all n.
We note that In+1 ⊂ In implies that an ≤ an+1 ≤ bn+1 ≤ bn . We claim
that for any natural numbers n and m, an ≤ an+m ≤ bn+m ≤ bn . This can be
proved by fixing n and using induction on m.
Step 1: We know from the hypothesis of the proposition (and the interpretation
of the hypothesis given at beginning of the proof) that the statement is true for
m = 1.
Step 2: We assume that the statement is true for m = k, i.e. an ≤ an+k ≤
bn+k ≤ bn .
Step 3: We will now prove that the statement is true for m = k + 1, i.e. an ≤
an+(k+1) ≤ bn+(k+1) ≤ bn . We know from the hypothesis of the proposition
that I(n+k)+1 ⊂ In+k . This implies that an+k ≤ a(n+k)+1 ≤ b(n+k)+1 ≤ bn+k .
This along with the inductive hypothesis implies that an ≤ an+k ≤ a(n+k)+1 ≤
b(n+k)+1 ≤ bn+k ≤ bn , i.e. an ≤ an+(k+1) ≤ bn+(k+1) ≤ bn which is what we
were to prove.
Therefore by the Principle of Mathematical Induction,
an ≤ an+m ≤ bn+m ≤ bn for all m.
(2.3.1)
This result will also show that
am ≤ an+m ≤ bn+m ≤ bm .
(2.3.2)
Using the first three inequalities of (2.3.1) and the last inequality of (2.3.2)
we see that an ≤ an+m ≤ bn+m ≤ bm . Thus for any m, bm is an upper bound
of E. Therefore x = lub(E) ≤ bm for all m. Thus, since am ≤ x ≤ bm for all m,
x ∈ Im for all m, i.e. x ∈ ∩_{n=1}^∞ In . Therefore ∩_{n=1}^∞ In ≠ ∅.
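For finitely many nested intervals, the proof's recipe—take the least upper bound of the left endpoints—becomes a one-line computation. A Python sketch (our own illustration; it handles only finitely many intervals, whereas the proposition covers infinitely many):

```python
def common_point(intervals):
    """Nested closed intervals [a_n, b_n], each containing the next:
    the proof's point x = lub{a_n} lies in every interval.  For a finite
    list of intervals the lub of the left endpoints is just the max."""
    x = max(a for a, b in intervals)
    assert all(a <= x <= b for a, b in intervals)   # x is in every [a_n, b_n]
    return x

# I_n = [1 - 1/n, 1 + 1/n] for n = 1, ..., 99, nested and shrinking toward 1
nested = [(1 - 1/n, 1 + 1/n) for n in range(1, 100)]
x = common_point(nested)
```

The key inequality in the proof—every bm is an upper bound of the set of left endpoints—is exactly what makes the `assert` succeed.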
We next start proving some theorems that will give us a better idea of what
compact sets might look like. We begin with the first, very basic result.
Proposition 2.3.7 For a, b ∈ R with a < b the set [a, b] is compact.
Proof: We begin by setting a0 = a, b0 = b, denoting the interval [a0 , b0 ] by I0
and assuming that the set I0 is not compact, i.e. there exists an open cover of
I0 , {Gα }, which contains no finite subcover.
We next consider the intervals [a0 , c0 ] and [c0 , b0 ] where c0 = (a0 + b0 )/2. At
least one of these two intervals cannot be covered by any finite subcollection of
{Gα }—if both subintervals could be covered by a finite subcollection of {Gα },
so could their union, which is I0 . Denote whichever subinterval cannot be
covered by a finite subcover of {Gα } by I1 and denote the end points of this interval
by a1 and b1 (if neither subinterval can be covered by a finite subcover, choose
either).
We next consider the intervals [a1 , c1 ] and [c1 , b1 ] where c1 = (a1 + b1 )/2.
Again at least one of these two intervals cannot be covered by any finite subcollection of {Gα }—denote this subinterval by I2 .
We inductively define a collection of closed intervals that satisfy the following
properties: (i) In+1 ⊂ In for n = 0, 1, 2, · · · , (ii) In is not covered by any finite
subcollection of {Gα }, and (iii) the length of the interval In is (b − a)/2n .
We now apply Proposition 2.3.6 to the collection of closed intervals
to get x0 such that x0 ∈ ∩_{n=0}^∞ In , i.e. x0 ∈ In for all n. Since x0 ∈ I0 (and all
the others) and the collection {Gα } covers I0 , x0 ∈ Gα0 for some α0 . Since Gα0 is
open, there exists a neighborhood of x0 , say Nr (x0 ) for some r > 0, such that
x0 ∈ Nr (x0 ) and Nr (x0 ) ⊂ Gα0 . If we choose n0 such that (b − a)/2n0 < r/2,
then In0 will be contained in Nr (x0 )—remember x0 ∈ In for all n. But then
In0 ⊂ Nr (x0 ) ⊂ Gα0 , so the single set Gα0 is a finite subcover of In0 . This contradicts (ii) above.
Therefore there is no open cover of [a, b] that does not have a finite subcover
and the set [a, b] is compact.
We should be a little careful above where we chose n0 such that (b − a)/2^{n0} <
r/2. However, we can do this. By Corollary 1.5.4 (the Archimedean property) we
can choose n0 such that n0 > 2(b − a)/r (letting "a" = 1 and "b" = 2(b − a)/r
in Corollary 1.5.4—where "a" and "b" are the a and b of the Archimedean
property). It is then easy to use Mathematical Induction to prove that 2^n > n
for all n ∈ N, so that (b − a)/2^{n0} ≤ (b − a)/n0 < r/2. We don't really want to stop
and prove everything like this but we must realize that we must be ready and able to do so if asked.
We next prove a very important theorem that gives a characterization of
compact sets. This result is known as the Heine-Borel Theorem.
Theorem 2.3.8 (Heine-Borel Theorem) A set E ⊂ R is compact if and
only if E is closed and bounded.
Proof: (⇒) We begin by assuming that the set E is compact but is not
bounded. If E is not bounded we know that the set E either does not have
an upper bound or does not have a lower bound. Let’s suppose that E does not
have an upper bound. Then there exist points xn ∈ E such that xn > n for
n = 1, 2, · · · . Clearly the set E1 = {x1 , x2 , · · · } is an infinite subset of E that
does not have a limit point in E (the set E1 doesn’t even have a limit point in
R). This contradicts Proposition 2.3.5. Therefore the set E must be bounded.
We now suppose that E is not closed. This implies that there is a limit
point of E, x0 , such that x0 6∈ E. From Proposition 2.2.3 we know that every
neighborhood of x0 contains infinitely many points of E. We will use a construction similar to that used in Proposition 2.2.3. Since x0 is a limit point of E,
there exists a point x1 ∈ E such that x1 ∈ N1 (x0 ) (neighborhood of radius 1).
Likewise, there is a point x2 ∈ E such that x2 ∈ N1/2 (x0 ). In general (or inductively), there exists a point xn ∈ E such that xn ∈ N1/n (x0 ) for n = 1, 2, · · · .
Set E1 = {x1 , x2 , · · · }. E1 is a subset of E and is an infinite set. (If E1 were
finite, an infinite number of the xj 's would have to be equal to some common value;
since xn ∈ N1/n (x0 ) ∩ E for each n, that common value would have to be x0 , but x0 ∉ E.)
We want to show that E1 does not have a limit point in E. Since x0 ∉ E,
we know that it’s not x0 . We will next show that nothing else will be a limit
point of E1 . For any y0 ∈ R, y0 ≠ x0 we have

|xn − y0 | = |(x0 − y0 ) − (x0 − xn )| ≥ |x0 − y0 | − |x0 − xn |  by Prop 1.5.8–(vi)
          ≥ |x0 − y0 | − 1/n,                                  (2.3.3)

since the point xn ∈ N1/n (x0 ). If we choose n0 so that 1/n0 < (1/2)|x0 − y0 |
(which is possible by Corollary 1.5.5–(b)), then for all n ≥ n0 , 1/n < (1/2)|x0 − y0 |.
Then by (2.3.3) we have

|xn − y0 | ≥ |x0 − y0 | − (1/2)|x0 − y0 | = (1/2)|x0 − y0 |.

Since a neighborhood of y0 of radius less than (1/2)|x0 − y0 | can include only a
finite number of elements of E1 , y0 cannot be a limit point of E1 , i.e. no
y0 ∈ R, y0 ≠ x0 , can be a limit point of E1 . Thus E1 is an infinite subset of
the compact set E which has no limit point in E. This contradicts Proposition
2.3.5. Therefore the set E is closed.
(⇐) Since E is bounded, there exist a, b ∈ R such that E ⊂ [a, b].
Since [a, b] is compact (by Proposition 2.3.7) and E is closed, E is compact by
Proposition 2.3.4, which is what we were to prove.
Proposition 2.3.7 gives us a lot of compact sets. Theorem 2.3.8 makes it
easier yet to determine whether certain sets are compact. For example we
know that the sets (0, 1), [0, 1] ∩ Q and [0, ∞) are not compact. And the sets
{0, 1, 1/2, 1/3, · · ·} and [0, 10] ∪ {3/2} ∪ [2, 3] are compact. The next result helps
us use the compact sets that we have to build more.
Proposition 2.3.9 (a) If E1 , E2 ⊂ R are compact, then E1 ∪ E2 is compact.
(b) If E1 , E2 ⊂ R are compact, then E1 ∩ E2 is compact.
Proof: (a) Suppose that {Gα } is an open cover of E1 ∪ E2 . Then {Gα } is an
open cover of E1 (so we can find a finite subcover) and E2 (so we can find a
finite subcover). If we include all of the sets in these two subcovers, we will get
a finite subcover of E1 ∪ E2 .
(b) Since E1 , E2 ⊂ R are compact, we know from Theorem 2.3.8 that E1 and
E2 are both closed and bounded. By HW2.2.3-(d) we know that E1 ∩ E2 is
closed. It should be easy to see that it also follows that E1 ∩ E2 is bounded.
Hence E1 ∩ E2 is compact.
We next prove the converse of Proposition 2.3.5. We should realize that this
next result along with Proposition 2.3.5 provides an alternative to the definition
of compactness.
Proposition 2.3.10 If K ⊂ R is such that any infinite subset of K has a limit
point in K, then K is compact.
Proof: Consider the following statement.
Result**: If K ⊂ R is such that every infinite subset E of K has a limit point
in K, then K is closed and bounded.
It should be clear that if we can prove the above result, we can apply Theorem 2.3.8 to get the desired result.
In effect Result** has already been proved—in disguise. In the ⇒ direction
of the proof of Theorem 2.3.8 we supposed that the set was not closed and
bounded (doing one at a time) and showed that we had an infinite subset of K
that did not have a limit point in K (which would contradict the hypothesis of
Result**)—and hence contradicted Proposition 2.3.5. This same proof (without
the contradiction of Proposition 2.3.5) will prove Result**. Then as we said, we
apply Theorem 2.3.8 and get the desired result.
We close with a result that will be very useful to use later.
Theorem 2.3.11 Every bounded infinite subset of R has a limit point in R.
Proof: Let E be a bounded infinite subset of R. Since E is bounded, E ⊂ [a, b] for some a and b in R. Since [a, b] is
compact, by Proposition 2.3.5 E has a limit point in [a, b], i.e. E has a limit
point in R.
HW 2.3.1 (True or False and why)
(a) The set [a, 2] ∪ [3, 4] is compact.
(b) If E ⊂ R is bounded, E is compact.
(c) If E1 , E2 ⊂ R and E1 ∪ E2 is compact, then E1 and E2 are compact.
(d) The set [0, 1] ∩ I is compact.
(e) If E is open and bounded, then E c is compact.
HW 2.3.2 (a) Prove that if E1 , · · · , En ⊂ R are compact, then ∪_{j=1}^n Ej is compact.
(b) Show that if E1 , E2 , · · · ⊂ R are compact, it is not necessarily the case that
∪_{j=1}^∞ Ej is compact.
HW 2.3.3 (a) Give an open cover of the set (0, 1) that does not have a finite
subcover.
(b) Give an open cover of the set [1, ∞) that does not have a finite subcover.
Chapter 3
Limits of Sequences
3.1
Definition of Sequential Limit
Hopefully we now have enough of an understanding of some of the background
material so that we can start considering some of the traditional topics of calculus. The first topics that we will study are sequences and limits of sequences.
It is highly likely that your first calculus course did not start with limits of
sequences, but we think it is the most insightful place to start. We assume that
you have worked with functions and know what a function is but to make the
material as concrete as possible we begin with the definition of a function.
Definition 3.1.1 Suppose that D and R are subsets of R. If f is a rule that
assigns one, and only one, element y ∈ R to each x ∈ D, then f is said to be
a function from D to R. The set D is referred to as the domain of f and R is
the range of f . We write f : D → R and often denote the element y ∈ R as
y = f (x). f is also called a map from D into R.
We note that f (x) must be defined for each element x ∈ D. We also note that it
is not necessary that each element of R be associated with some element of D.
For D1 ⊂ D we define the set f (D1 ) = {y ∈ R : y = f (x) for some x ∈ D1 }.
f (D1 ) is called the image of D1 . Obviously, f (D) ⊂ R and f (D1 ) ⊂ R. Generally f (D) need not be equal to R. If f (D) = R, f is said to be onto and we say
f maps D onto R. When working with functions, the domain and range can be
any sort of sets. In our work the domain and the range will not only be subsets
of the set of real numbers, but (except for the definition of a sequence) will
most often be intervals of R or all of R. We will not dwell on these definitions
now—we will try to make a point to explicitly define the domain, range, etc. in
examples later.
Sequences: Definition and Examples As we see in the next definition, a
sequence is just a special function.
Definition 3.1.2 A sequence is a function with domain
{n ∈ Z : n ≥ m} for some m ∈ Z.
We note that usually m = 0 or 1. We did not specify the range of the function
in the definition of a sequence. The range can really be any set–but in our work
it will almost always be a subset of the reals. Using this definition we could
define a sequence by defining D = N and f (n) = 1/(n2 + 1) for each n ∈ D.
But this is not how it’s usually done. Because the potential domains can easily
be listed in order (especially N), we would usually write the above sequence as
1/2, 1/5, 1/10, · · · ,
where we assume that the reader can figure out the rest of the terms. If we
think that there’s a good chance that the reader will not be able to recognize
the general term, we might write
1/2, 1/5, 1/10, · · · , 1/(n2 + 1), · · · ,
or just {1/(n2 + 1)}_{n=1}^∞ . Often the sequence will be listed without a specific
description of the domain such as
1, 2, 5, 10, · · ·
or
3/4, 8/9, 15/16, · · · ,
where while you are figuring out the formula that generates the sequences, you
will be expected to also come up with the domain. You should realize that
the domain and formula are not unique–but they had better be equivalent. For
example, the last sequence could be expressed as 1−1/n2 for n = 2, 3, · · · , or you
could write the same sequence as (n2 + 2n)/(n + 1)2 for n = 1, 2, 3, · · · . When
we are discussing a general sequence, instead of using the function notation we
will write the sequence as a1 , a2 , · · · , as {an } for n = 1, 2, · · · , or as {an }_{n=1}^∞ .
If we return to our discussion of plus and minus infinity from Section 1.5,
recall that we specifically included the property that for all n ∈ N we have
1 ≤ n < ∞. This allows us to consider the natural numbers sequentially, starting
at 1 and approaching infinity. Likewise, using the set N as our indexing set, we
can consider a sequence {an } starting at a1 and continuing with increasing n
as n approaches infinity. We are interested in which value, if there is such a
value, an approaches (gets close to) as n approaches infinity. We will write this
limiting value as lim an as n → ∞ or lim_{n→∞} an . Just how we treat n approaching
∞ hopefully will be made clear below, i.e. how we treat "large n" in Definition
3.1.3 below. It should not be hard to see that the limits of the three sequences
given above are 0, ∞ and 1. Of course, we must make the claim in a very
precise manner. We need a definition so that anyone using the definition will
get the same results. Anyone using the idea of the limit of a sequence will know
precisely what they mean.
When you are just beginning, it is not easy or clear to imagine how the limit
of a sequence should be defined. We make the following definition.
Definition 3.1.3 Consider a real sequence {an } and L ∈ R. We say lim_{n→∞} an = L if
for every ε > 0 there exists an N ∈ R such that n > N implies that |an − L| < ε.
If lim_{n→∞} an = L, we say that the sequence {an } converges to L, and sometimes
write an → L as n → ∞—read "an approaches L as n approaches ∞"—or just
an → L, assuming that the reader knows that n will be going to ∞.
An explanation of the definition of a limit that we like to use is "for every
measure of closeness to L" (that's what ε measures) there exists "a measure of
closeness to ∞" (that's what N measures) so that whenever "n is close to ∞,"
"a_n is close to L." Recall the statements preceding the definition where we
discussed the sequence a_n as "n gets large" and "n approaches infinity." These
concepts have been rigorized by the requirement that there exists an N such
that for all n > N, something happens. Thus we have taken an idea or concept
of "n approaching ∞" and rigorized the notion so that it is possible to use it in a
mathematical context. Of course when we prove theorems and/or prove limits,
we use the definition, not the assortments of words that we have used to try to
give an understanding of the idea of a limit. When mathematics is done, we
must be precise and use the definition.
Figure 3.1.1: Plot of a sequence and the y = L ± ε corridor.
Sequential Limit: Graphical Description Another description of the limit
of a sequence that is useful for some is to consider the sequence graphically. In
Figure 3.1.1 we have plotted a fictitious sequence {a_n}. We have plotted the
point (0, L) and horizontal dashed lines coming out of the points (0, L ± ε). The
corridor within the dashed lines represents y-coordinate values that are close
to L (for a given ε). The definition of lim_{n→∞} a_n = L requires that for any ε (no
matter how large or how small you make the corridor around L; the corridor
being small is usually the problem) there must be a value of n, call it N, so that
from that point on, all of the points are within the given corridor. In general,
when the corridor is smaller (the ε is smaller), the N must get larger.
Comments: Sequential Limits (i) Given a sequence {a_n}, the definition
does not help you decide what L should be, but of course it is necessary to
have this L to apply the definition. One way to think about it is that you have
to "guess the L" and then try to prove that the limit is L. Really we know that
we have methods for determining L from our basic calculus course (the methods
were not rigorously proved but surely would be sufficient to guess what L should
be). We will repeat these results (rigorously) in later sections.
(ii) We emphasize that the value N can be any real number. By notation
(an N usually represents an integer) we imply that N is an integer. It always
can be chosen as an integer but need not be. It is sometimes more convenient to
choose N as a particular real number rather than go through the song and dance
of taking the smallest integer greater than some particular real number; we
will make this clear later.
(iii) We want to emphasize (and we will beat this to death) that when we
apply Definition 3.1.3 we will always follow two steps. Step 1: For a given
ε, define N. How we find N is immaterial; we will develop methods for finding
N. Step 2: Show that the defined N works, i.e. that n > N implies that
|a_n − L| < ε. We will repeat and emphasize this procedure often.
(iv) We note that the definition of a limit can be given in terms of the
neighborhoods introduced in Section 2.2. The statements "for every ε > 0" and
"|a_n − L| < ε" (and their use) can be replaced by "for every neighborhood
of L, N_ε(L)" and "a_n ∈ N_ε(L)". We can also define a neighborhood of infinity
(even though infinity is not in R) as follows.
Definition 3.1.4 A neighborhood of infinity, ∞, is the set N_R(∞) = {x ∈ R :
x > R} for some R > 0. A neighborhood of minus infinity, −∞, is the set
N_R(−∞) = {x ∈ R : x < −R} for some R > 0.
We can then write Definition 3.1.3 as follows: lim_{n→∞} a_n = L if for every
neighborhood of L, N_ε(L), there exists a neighborhood of infinity, N_N(∞), such that
n ∈ N_N(∞) ⇒ a_n ∈ N_ε(L). Other than notation there is no difference between
this version of the definition and the one given in Definition 3.1.3.
(v) In addition to being able to write Definition 3.1.3 in terms of neighborhoods we get results connecting limit points of sets and limits of sequences.
Proposition 3.1.5 Suppose {a_n} is a real sequence and lim_{n→∞} a_n = L.
(a) Any neighborhood of L, N_r(L), contains infinitely many points of the sequence {a_n}.
(b) If we consider E = {a_1, a_2, a_3, ⋯} as a subset of R (instead of as a sequence)
and E is an infinite set (the a_n's do not all equal L from some point on), then
L is a limit point of E.
Proof: The proofs of both parts are very easy. (a) We know by Definition 3.1.3
that for any neighborhood of L, N_r(L), there exists an N such that all of the points
a_n for n > N are in that neighborhood; since there are infinitely many n with
n > N, the neighborhood contains infinitely many points of the sequence. Note
that for all we know all of these sequence values could be the same, say if the
sequence were a constant sequence.
(b) If we consider any neighborhood of L, Nr (L), and apply the definition of
the limit of a sequence to the sequence {an }, then the neighborhood Nr (L) will
surely contain at least one point of E. Since we have assumed that the set E is
infinite, we can find a point in Nr (L) ∩ E that is different from L.
Note that it is important for part (b) to assume that the set E is infinite.
For the sequence {a_n} where a_n = 1 for all n, we have a_n → 1, but 1 is not a
limit point of the set {a_1, a_2, ⋯} = {1}.
When you think about the definition of the limit of a sequence—or the graph
of a sequence—the above result is not surprising. The other direction—and it’s
not really a converse—is a bit more of a surprise.
Proposition 3.1.6 Suppose that E ⊂ R and the point x0 is a limit point of E.
Then there exists a sequence of points {xn } ⊂ E such that xn → x0 .
Proof: We have essentially already proved this result. In the proof of Theorem
2.3.8 we considered the sequence of neighborhoods of x_0, N_{1/n}(x_0), n = 1, 2, ⋯,
and chose a point from each neighborhood, x_n ∈ N_{1/n}(x_0). Clearly |x_n − x_0| <
1/n for all n. Then for any ε > 0 we can use Corollary 1.5.5–(b) to obtain
N ∈ N such that 1/N < ε. (Step 1: Define N.) Then for n > N we have
|x_n − x_0| < 1/n < 1/N < ε. (Step 2: Show that the defined N works.) Thus
lim_{n→∞} x_n = x_0.
Thus we see that when E is a bigger set (has many points), if x_0 is a limit
point of E, we can always find a sequence of points of E converging to x_0. For
example, if E = [0, 2), then 3/2 is a limit point of E and x_n = 3/2 + 1/(4n) → 3/2. Also
1 is a limit point of E and x_n = 1 + 1/(2n) → 1. And 2 is a limit point of E and
2 − 1/n → 2.
Note that the sequence produced by the proof of Theorem 2.3.8 is such that
x_n ≠ x_0 for any n. This is not necessary for this proof.
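As an aside (ours, not the text's), the construction in this proof can be mirrored numerically. We take E = [0, 2) with limit point x_0 = 2 as a concrete illustration, pick x_n = 2 − 1/n ∈ N_{1/n}(x_0) ∩ E, and run the usual two-step argument with N = 1/ε:

```python
# Sketch of Proposition 3.1.6 for E = [0, 2) with limit point x0 = 2.
# From each neighborhood N_{1/n}(x0) we pick the point x_n = 2 - 1/n of E.
x0 = 2.0
def x(n):
    return 2.0 - 1.0 / n

for eps in (0.5, 0.1, 0.01):
    N = 1.0 / eps                 # Step 1: define N
    n = int(N) + 1                # any integer n > N
    assert 1.0 / n < eps          # since n > N = 1/eps
    assert abs(x(n) - x0) < eps   # Step 2: N works, |x_n - x0| <= 1/n < eps
```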
Of course we need some experience with proving limits—finding the N and
showing that it works. We will do that in the next section.
3.2
Applications of the Definition of a Sequential Limit
In the last section we introduced the definition of a sequential limit. In this
section we will learn how to apply the definition to particular sequences. Remember,
we will always be following the two steps, Step 1: Define N and Step 2:
Show that the N works. Let us begin with the following example. It is probably
the second easiest example possible but it is very important.
Example 3.2.1 Prove that lim_{n→∞} 1/n = 0.
Solution: We will first do this problem graphically. Generally this is not the way to
prove limits; it only works for easy problems. However we want to illustrate how it works
with the picture and eventually to make a point. We note that in Figure 3.2.1 the
sequence {1/n} is plotted and the dashed lines y = 0 ± ε = ±ε are drawn. We notice that
the sequence decreases as n gets larger (it's easy to see that for n > m, 1/n < 1/m). After
Figure 3.2.1: Plot of a sequence {1/n} and the y = L ± ε corridor.
a while (exaggerated on this plot) the points representing the plot of the sequence enter the
corridor formed by y = ±ε and never leave it. The points will never leave the corridor because
they are positive and decreasing. It is of interest to see if we can compute when the points will
cross the line y = ε. We set 1/n = ε and solve for n as n = 1/ε. Of course ε would have to be
special for 1/ε to be an integer. However, it should be clear that if we set N = 1/ε (this is the
definition of N required by Step 1; we did it graphically, but we don't really care how we got
it), then if n > N, we note in our plot that the plotted values of 1/n have entered into the ±ε
corridor and, because 1/n < ε (because n > N = 1/ε) and 1/n > 0, these plotted values will
never leave the ±ε corridor, i.e. |1/n| < ε. (This shows that the N defined as N = 1/ε works:
Step 2.) Hence, we have defined an N such that if n > N = 1/ε, then |1/n − 0| < ε. Therefore
by the definition of limit, Definition 3.1.3, lim_{n→∞} 1/n = 0. Likewise, we could complete Step 2
by noting that if n > N = 1/ε, then |1/n − 0| = 1/n < 1/N = ε. (This also shows that the N works:
Step 2.)
The second way we will do this problem is the way that limit proofs are most often done.
We suppose that ε > 0 is given. We need N so that n > N implies that |1/n − 0| = 1/n < ε.
This last inequality is equivalent to n > 1/ε. Therefore if we choose N = 1/ε (definition of
N: Step 1), then n > N = 1/ε implies that |1/n − 0| = 1/n < 1/N = ε (Step 2: the defined N
works). Therefore 1/n → 0 as n → ∞.
Notice that in each method, the first graphically and the second algebraically,
we define N and then show that this N works—satisfies the definition of the
limit. This is always the way limit proofs are done when we are applying the
definition of the limit. The first method illustrates that it really makes no
difference how you find N . If we can show rigorously that a particular N
works—even if we only guessed it—we are done.
And finally, we note that the N that we found, N = 1/ε, depended on ε.
This is perfectly permissible. The statement in the definition is that "for every
ε > 0 there exists an N." The same N surely does not have to work for all
ε. It is logical that N would generally have to depend on ε (though this is not a
requirement), and with this dependence, we can still satisfy the definition. We
should understand that generally N will depend on ε in a way that as ε gets
smaller, N will get larger, as with N = 1/ε.
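As a numerical aside (an illustration, not a substitute for the proof), the two steps of Example 3.2.1 can be mirrored in code: for each ε define N = 1/ε and spot-check that several n > N satisfy |1/n − 0| < ε.

```python
# Example 3.2.1 numerically: a_n = 1/n, L = 0, N(eps) = 1/eps.
def a(n):
    return 1.0 / n

for eps in (1.0, 0.25, 1e-3):
    N = 1.0 / eps                         # Step 1: define N
    # Step 2 (spot check): several n beyond N all satisfy |a_n - 0| < eps.
    for n in (int(N) + 1, int(N) + 10, int(N) + 1000):
        assert abs(a(n) - 0.0) < eps
```

Note how N grows as ε shrinks, exactly as discussed above.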
Example 3.2.2 Prove that lim_{n→∞} (1 − 1/n^2) = 1.
Solution: Again, we assume that we are given ε > 0. We want an N such that n > N
implies |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < ε. This last inequality is equivalent to n^2 > 1/ε or
n > √(1/ε) = 1/√ε (because n > 0). Thus if we choose N = 1/√ε (or 1/N^2 = ε) (Define N:
Step 1) and let n > N, we have |(1 − 1/n^2) − 1| = |−1/n^2| = 1/n^2 < 1/N^2 = ε (Step 2: N works),
or lim_{n→∞} (1 − 1/n^2) = 1.
You should note that the proof of a limit involves the first step, where we set
|a_n − L| < ε and solve this inequality for n. This shows us how we should define
N: by setting down the inequality we want to be satisfied, we are able to see
what we need to make it true. This is sort of a "mathematical limits mating
dance" and is technically not a part of the proof. It is a very common
approach to help find N. We then show that this N works. And we emphasize:
after we perform the first step and define N, we always must show that N
satisfies the definition of a limit. If you understand all of the parts of the dance,
this is often easy.
Also, as a part of the analysis above we first had an inequality n^2 > 1/ε
and took the square root of both sides. As a part of solving the inequality for
n, we often have to perform operations on both sides of an inequality. The
question is why you can take the square root of both sides (or perform some
other operation on both sides of an inequality). In the case of the square root,
we proved that it was permissible in HW1.3.3–(b). To help us in general we say
that a function g defined on an interval of the real line is said to be increasing
if x < y implies that g(x) < g(y). We know that g(x) = √x is increasing, if
not because of HW1.3.3–(b), then because we know what the graph of y = √x looks
like. So if x < y, then √x < √y, i.e. you can take the square root of both
sides of an inequality. Later we had n > N = 1/√ε and squared both sides.
This is possible because g(x) = x^2 is an increasing function also (or we can use
HW1.3.3–(a)).
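As a quick aside (ours), the monotonicity facts used here can be spot-checked numerically for the two operations we applied to inequalities, the square root and squaring on [0, ∞):

```python
import math
import random

random.seed(0)
# If g is increasing, x < y implies g(x) < g(y); spot-check this for sqrt and
# for squaring on [0, inf), the two operations used in the examples above.
for _ in range(1000):
    x, y = sorted(random.uniform(0.0, 100.0) for _ in range(2))
    if x < y:
        assert math.sqrt(x) < math.sqrt(y)   # take sqrt of both sides
        assert x**2 < y**2                   # square both sides (x, y >= 0)
```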
Example 3.2.3 Prove that lim_{n→∞} (2n + 3)/(5n + 7) = 2/5.
Solution: Suppose that ε > 0 is given. We want to find an N such that n > N implies that
|(2n + 3)/(5n + 7) − 2/5| < ε, or
|(5(2n + 3) − 2(5n + 7))/(5(5n + 7))| = |1/(5(5n + 7))| = (1/5) · 1/(5n + 7) < ε.
This inequality is the same as 5n + 7 > 1/(5ε), or n > (1/5)(1/(5ε) − 7), i.e. we have done the
dance. Define N = (1/5)(1/(5ε) − 7) (Step 1: Define N). Then if n > N = (1/5)(1/(5ε) − 7), we get
5n + 7 > 1/(5ε), or (1/5) · 1/(5n + 7) < ε. Therefore n > N implies |(2n + 3)/(5n + 7) − 2/5| < ε (Step 2: N
works), and (2n + 3)/(5n + 7) → 2/5 as n → ∞.
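As an aside (a numerical spot-check, not part of the proof), the N found in Example 3.2.3 can be exercised in code: with N = (1/(5ε) − 7)/5, sampled n > N keep (2n + 3)/(5n + 7) within ε of 2/5.

```python
# Example 3.2.3 numerically: a_n = (2n + 3)/(5n + 7), L = 2/5.
def a(n):
    return (2.0 * n + 3.0) / (5.0 * n + 7.0)

for eps in (0.1, 1e-3, 1e-6):
    N = (1.0 / (5.0 * eps) - 7.0) / 5.0      # Step 1: define N
    start = max(int(N) + 1, 1)               # any n > N (and n >= 1)
    for n in (start, start + 7, start + 500):
        assert abs(a(n) - 0.4) < eps         # Step 2: N works
```

Note that for large ε the formula gives a negative N, which is harmless: every n ≥ 1 already works, so the sketch simply starts at 1.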
Before we move on to limits that do not exist, we prove one more limit.
We emphasize that we are cheating: as a part of the next example we will
use the natural logarithm, ln, and the exponential function, exp. We will define these
functions and prove properties of the ln and exp functions in Chapter 5 (or 6).
We could wait on this example until then, but we are probably better served doing
it now; there are no circular arguments involved.
Example 3.2.4 Prove that lim_{n→∞} 1/2^n = 0.
Solution: We proceed as we did in the last example. We suppose that we are given ε > 0
and we need to find N such that n > N implies that |1/2^n − 0| = 1/2^n < ε. This last inequality
is equivalent to 2^n > 1/ε. Taking the logarithm base e of both sides gives (because ln is an
increasing function) ln 2^n = n ln 2 > ln(1/ε) = −ln ε, or n > −ln ε/ln 2.
Thus we see that if we choose N = −ln ε/ln 2 (Step 1: Define N) and consider n > N,
then we have n > N = −ln ε/ln 2 or n ln 2 = ln 2^n > −ln ε = ln(1/ε). Taking the exponential
of both sides, we get 2^n > 1/ε, 1/2^n < ε or |1/2^n − 0| < ε (N works: Step 2), and therefore 1/2^n
approaches 0 as n approaches ∞.
We note that we can take the ln and exp of both sides of the inequality
because they are both increasing functions (cheating again, but think of the
graphs of these functions and it will be clear that they are increasing).
You should also note that we have used the fact that ln 2 ≈ 0.69 > 0, allowing
us to divide both sides of the inequality by ln 2 while keeping the direction
of the inequality the same. Also note that we write the definition of N as
N = −ln ε/ln 2. This is a logical way to write it because for ε small (less than
1 but positive), ln ε < 0. But do note that nothing we have done is illegal if ε is
not small, say ε ≥ 1. The definition must be satisfied for all ε > 0.
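As a numerical aside (ours), the choice N = −ln ε/ln 2 can be spot-checked; as just noted, for ε ≥ 1 this N is negative, which is harmless since any n ≥ 1 already works, so the sketch clamps the starting index at 1.

```python
import math

# Example 3.2.4 numerically: a_n = 1/2^n, L = 0, N(eps) = -ln(eps)/ln(2).
for eps in (0.3, 1e-2, 1e-9):
    N = -math.log(eps) / math.log(2.0)   # Step 1: define N
    n = max(int(N) + 1, 1)               # any integer n > N
    assert abs(1.0 / 2**n - 0.0) < eps   # Step 2: N works
```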
Sequences that don’t converge It should not surprise you that there are
sequences that do not have limits. Here we want to discuss how lim an does
n→∞
not exist for some sequences and what we have to do to prove that a limit does
not exist. If you think about it, you should realize that it might be difficult to
show that the limit does not exist. You have to show that it is impossible to
satisfy the definition no matter what real L you choose, i.e. we have to show
that for any L ∈ R there exists an ǫ for which no N can be found (no N such
that n > N implies that |an − L| < ǫ). There are generally two ways that
the limit does not exist. We have put the requirement in Definition 3.1.3 that
L ∈ R and we know that ±∞ 6∈ R. Therefore, the sequences that want
∞ to
approach ∞ (or −∞) such as the sequence given in Section 3.1, n2 + 1 n=0 ,
will not satisfy Definition 3.1.3. (We will give a definition later for what we
mean when the limit is infinite.) The other case where the limit does not exists
is when it oscillates back and forth between two distinct numbers–or close to
two distinct numbers, or three. The literature seems to be confusing on how
they refer to the two situations of non-convergence. At the moment we will say
that if a sequence does not satisfy Definition 3.1.3 for any L ∈ R, the sequence
does not converge (that’s the only convergence definition we have at this time).
Some of the literature will refer to non-convergence as divergence. We will save
divergence for limits of ±∞ which we will introduce when we introduce infinite
limits (but at this time do not exist). Consider the following example.
Example 3.2.5 Prove that lim_{n→∞} (n^2 + 1) does not exist.
Solution: This is really a fairly easy case. Let L be any element of R and choose ε = 1. Note
that from what we said above, if no N can be found for this situation (any L ∈ R and one ε),
then the limit will not exist. If we were to be able to satisfy the definition, we would have to satisfy the
following inequality: |n^2 + 1 − L| < 1. This inequality is the same as L − 1 < n^2 + 1 < L + 1.
Hopefully, it is clear that it is the right inequality that will not be satisfied for large n; you
should notice that for some value of n, n^2 + 1 will get larger than L + 1 and stay larger for
all of the rest of the n's. Rewrite the right inequality as n^2 < L or n < √L (allowable since
n > 0).
By Corollary 1.5.5–(a) we know that for L ∈ R there exists n_0 ∈ N such that n_0 > √L.
Of course, if this inequality is satisfied for some particular n_0 ∈ N, it will also be satisfied for
all n ≥ n_0. Since it is impossible to find an N such that n > N implies n < √L (because for
all n ≥ n_0, n > √L), it is impossible to find an N such that n > N implies |n^2 + 1 − L| < 1
for any L ∈ R. Since Definition 3.1.3 cannot be satisfied for any L ∈ R, lim_{n→∞} (n^2 + 1) does not
exist.
Note that when we wrote √L, we were assuming that L ≥ 0. If someone is silly enough
to guess that the limit might be L where L < 0, it is easy to see that n^2 + 1 ≮ L + 1 (the right
side of the original inequality) for all n ∈ N, n ≥ 1. Therefore the limit can't be negative.
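As an aside (a spot-check, not the proof), we can watch the corridor argument fail in code: once n passes √L, the term n^2 + 1 exceeds L + 1 and stays larger.

```python
import math

# Example 3.2.5 numerically: a_n = n^2 + 1 escapes the corridor (L-1, L+1)
# for every L: once n > sqrt(max(L, 0)) we have n^2 + 1 > L + 1, and the
# terms stay larger than L + 1 forever after that.
for L in (-5.0, 0.0, 10.0, 1e6):
    n0 = int(math.sqrt(max(L, 0.0))) + 1   # first index past sqrt(L)
    for n in (n0, n0 + 1, n0 + 100):
        assert n * n + 1 > L + 1           # right inequality fails from n0 on
```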
The above example is a reasonably easy sequence to consider; however, all
more complicated sequences that approach ∞ (or −∞) are handled in the same
way with more difficult algebra.
We next consider an example that is a classic case of nonexistence. All other
examples of nonexistence where the sequence oscillates between two or more
different values can be done in a similar fashion.
Example 3.2.6 Prove that lim_{n→∞} (−1)^n does not exist.
Solution: As in the last example we must show that for any L ∈ R, there is an ε > 0 for
which no N exists (no N such that n > N implies that |a_n − L| < ε). For whatever real
number L we think might be the limiting value, we must satisfy |(−1)^n − L| < ε or
L − ε < (−1)^n < L + ε.    (3.2.1)
And if the sequence is to have a limit, we must find an N so that the last inequality is satisfied
for all n > N.
If you were trying to guess the limit and were naive enough to think that the limit would
exist, you might guess that the limit is 1 or you might guess that it is −1. After all, these are
the values that are assumed often in this sequence.
If we choose L = 1 (as our first guess) and set ε = 1, then we would have to satisfy the
inequality (3.2.1), 1 − 1 = 0 < (−1)^n < 1 + 1 = 2. It should be clear that this inequality
cannot be satisfied for any odd value of n, when (−1)^n = −1. Therefore it is impossible to
find an N so that n > N implies 0 < (−1)^n < 2.
Likewise, if we were to choose L = −1 and ε = 1, the inequality −2 < (−1)^n < 0 would
not be satisfied for even values of n, so no appropriate N can be defined. Therefore lim_{n→∞} (−1)^n
does not equal 1 or −1.
Finally, consider some L such that L ≠ 1 and L ≠ −1. Choose ε = min{|L − 1|/2, |L −
(−1)|/2} (where by min{|L − 1|/2, |L − (−1)|/2} we mean the minimum of the two values).
You might want to draw a picture describing this choice. It should be clear that this ε has
been chosen so that |(−1)^n − L| is never less than ε for any n: when n is even |(−1)^n − L| =
|1 − L| > |1 − L|/2 ≥ ε, and when n is odd |(−1)^n − L| = |−1 − L| > |1 + L|/2 ≥ ε. Therefore
lim_{n→∞} (−1)^n does not equal L (where L is anything but 1 or −1).
Therefore, since Definition 3.1.3 cannot be satisfied with any L ∈ R, lim_{n→∞} (−1)^n does not
exist.
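As a numerical aside (ours), the choice of ε in the last part of the solution can be spot-checked: for L ≠ ±1 and ε = min{|L − 1|, |L + 1|}/2, no term (−1)^n ever lands inside N_ε(L).

```python
# Example 3.2.6 numerically: for L not equal to 1 or -1, the choice
# eps = min(|L - 1|, |L + 1|) / 2 keeps every term (-1)^n outside N_eps(L).
for L in (0.0, 0.5, -3.0, 2.0, 0.999):
    eps = min(abs(L - 1.0), abs(L + 1.0)) / 2.0
    for n in range(1, 200):
        term = (-1.0) ** n
        assert abs(term - L) >= eps    # never inside the corridor
```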
We should note that both of these cases of nonexistence of limits can be illustrated
graphically, but you should be very careful before you claim that the picture
gives you a proof. You will be asked to graphically illustrate the nonexistence of
several limits of sequences in HW3.2.2. In Figure 3.2.2 we draw a picture much
like we did in Figure 3.1.1: choose some L and ε, plot the point (0, L) and draw
the lines y = L ± ε. To illustrate the non-existence in Example 3.2.5 we choose
an arbitrary L and let ε = 1. We then plot some sequence values a_n = n^2 + 1.
We note that sooner or later the sequence points go outside of the y = L ± ε
corridor (actually above the corridor) and stay out of there forever; we did not
get to plot many points in Figure 3.2.2 because n^2 + 1 grows large quickly. Since
the sequence clearly leaves the L ± ε corridor and never comes back, the limit
is surely not equal to L.
To illustrate the non-existence in Example 3.2.6 we would draw two plots
similar to that in Figure 3.1.1. For the first plot, Figure 3.2.3, we would choose
L = 1 and ε = 1/2, and note that every other point of the sequence ((−1)^n for
n odd) would be outside of the y = 1 ± 1/2 corridor, forever. We could draw
a similar plot for L = −1 and make a similar argument. In Figure 3.2.4 we
draw a plot for an arbitrary L not equal to 1 or −1 and choose ε to be smaller
Figure 3.2.2: Plot of a sequence {n^2 + 1} and the y = L ± 1 corridor.
Figure 3.2.3: Plot of a sequence {(−1)^n} and the y = 1 ± 1/2 corridor.
Figure 3.2.4: Plot of a sequence {(−1)^n} and the y = L ± ε corridor where
ε < min{|L − 1|, |L − (−1)|}.
than the distances from L to 1 or −1. We note that the sequence values would
never be in the y = L ± ε corridor.
As stated earlier, be very careful about claiming that the above arguments
are proofs. They can be made to be a part of a proof if done carefully and if
they include some of the arguments and reasons given in Examples 3.2.5 and
3.2.6. The pictures alone are at best "bad proofs."
We see from the work in this section that it is not trivial to apply the
definition of the limit of a sequence. You might wonder if it's because we
have the wrong definition. So that we have the definition here for comparison
purposes, we recall that lim_{n→∞} a_n = L
if for every ε > 0 there exists N such that n > N implies |a_n − L| < ε.
One might ask if it is necessary to apply the definition so that it works for every
ε > 0. We might try the following candidate for the definition.
F1: if for some ε > 0 there exists N such that n > N implies |a_n − L| < ε.
If this were the definition, life would be much easier. Consider a_n = 1/n and
choose ε = 0.1. If we choose N = 13, then n > N = 13 implies that
|1/n − 0| = 1/n < 1/N = 1/13 < 0.1.
If F1 were the definition, the above computation would imply that lim_{n→∞} 1/n = 0.
This is the same result that we got in Example 3.2.1 (which we hope our intuition
tells us is the correct limit). We see that F1 is easy to apply. However, using
the same ε = 0.1 we see that if we choose N = 100, then n > N = 100 implies
that
|1/n − 0.001| ≤ 1/n + 0.001 < 0.01 + 0.001 < 0.1.
So the same ε would imply that lim_{n→∞} 1/n = 0.001. Further calculations using
F1 would give us a large assortment of answers for lim_{n→∞} 1/n (this makes it very
difficult to grade homework). And different choices of ε would give us more
values of the limit. Clearly F1 is a bad choice; it is not a strong enough criterion
to serve as our definition.
If we instead tried
F2: if for every ε > 0 and all N ∈ N, n > N implies |a_n − L| < ε.
This proposal is in big trouble. Here we are claiming that for any ε > 0, the
implication in the definition must be true for all N. That is just too strong of a
requirement. If we return to a_n = 1/n and choose a small ε, say ε = 0.1, there is
no way that n > N = 1 will imply that |1/n − 0| < 0.1, i.e. for n = 2 > N = 1,
|1/n − 0| = 1/2 > 0.1. So either lim_{n→∞} 1/n ≠ 0 or F2 will not make it as a definition
replacement.
There are a few other candidates (bad candidates) that we could discuss,
but by now you surely get the point: we had better live with Definition 3.1.3.
You will see in the next section that using Definition 3.1.3 we will always get a
unique result, unlike with F1. We have seen that it is possible to apply Definition
3.1.3 to prove limits that are intuitively correct, unlike with F2. So make sure that
you now forget F1 and F2: the F stood for "false".
And finally, we want to emphasize that it is the tail end of the sequence
that determines whether or not the sequence converges; and since n → ∞, no
matter at which number you start the tail, it will be a very long tail. In the first
example of this section we showed that 1/n → 0 as n → ∞. That makes the
sequence {1/n} a nice sequence. In the last example of this section we showed
that lim_{n→∞} (−1)^n does not exist, i.e. the sequence {(−1)^n} is not a nice sequence.
Now let us construct a rather strange sequence by defining a_n = (−1)^n for
n = 1, ⋯, 100,000 and a_n = 1/n for n > 100,000. The tail end of the sequence
will be nice so that the sequence will converge, again to 0. Specifically, we saw
in Example 3.2.1 that as a part of the proof of the convergence of {1/n} to
0, we defined N = 1/ε. To prove that the sequence {a_n} converges to 0, we
define N = max{1/ε, 100,000}. This way the proof never knows that we were
working with a strange sequence. If we define a sequence {b_n} by b_n = 1/n
for n = 1, ⋯, 100,000,000 and b_n = (−1)^n for n > 100,000,000, then lim_{n→∞} b_n
does not exist (after a long time the sequence values will start bouncing back
and forth between 1 and −1 and do that forever). If the tail end of a sequence
is bad, the sequence will be bad.
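As an aside (a sketch of the "strange sequence" argument, with our own helper names), the definition N = max{1/ε, 100,000} can be exercised in code:

```python
# Sketch of the "strange" sequence: a_n = (-1)^n for n <= 100000 and
# a_n = 1/n afterwards; N = max(1/eps, 100000) makes the usual proof work
# because every index past N lands in the nice 1/n tail.
def a(n):
    return (-1.0) ** n if n <= 100_000 else 1.0 / n

for eps in (0.5, 1e-3):
    N = max(1.0 / eps, 100_000.0)     # Step 1: define N
    for n in (int(N) + 1, int(N) + 50):
        assert abs(a(n) - 0.0) < eps  # Step 2: the tail is the 1/n tail
```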
HW 3.2.1 (True or false and why.)
(a) Suppose that the sequence {a_n} is such that |a_n − 7| < 1/n for all n ∈ N.
Then lim_{n→∞} a_n = 7.
(b) Consider the sequence {a_n}, n = 1, 2, ⋯, where a_n = c ∈ R for all n (i.e.
the sequence c, c, c, ⋯). Then lim_{n→∞} a_n = c.
HW 3.2.2 (a) Consider the sequence {(2n^2 + 4)/(n + 3)}_{n=1}^∞. Illustrate graphically
that lim_{n→∞} (2n^2 + 4)/(n + 3) does not exist.
(b) Consider the sequence {(−1)^n + 1/n}_{n=1}^∞. Illustrate graphically that
lim_{n→∞} ((−1)^n + 1/n) does not exist.
(c) Prove that lim_{n→∞} ((−1)^n + 1/n) does not exist (i.e. use the definition).
HW 3.2.3 (a) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + 1) = 2/3 (use the definition).
(b) Prove that the limit lim_{n→∞} (2n^2 + 4)/(3n^2 + n + 1) = 2/3 (use the definition).
HW 3.2.4 Suppose that {a_n} and {b_n} are sequences such that lim_{n→∞} a_n =
lim_{n→∞} b_n = 0. Prove that lim_{n→∞} a_n b_n = 0.
3.3
Some Sequential Limit Theorems
We want to be able to compute limits and know what we have computed is the
correct result but we do not want to have to apply the definition every time.
Hence we now want to move on to the propositions, theorems and corollaries
(all referred to collectively as theorems) concerning limits of sequences. You
probably already know most of these theorems from your basic calculus course.
Most of these theorems are the building blocks that allow you to compute the
limit of a sequence without using the definition—but as you will see, the definition of a limit is the core of the proof of all of these theorems. The limits that
we compute using the limit theorems will be as rigorous as the limits that we
have proved using the definition because all of the results that we use will have
been rigorously proved.
We begin with a discussion of one of the common hypotheses of the theorems.
We note that in Proposition 3.3.1 we assume that lim_{n→∞} a_n exists. This is a common
assumption for most of the propositions in this section and the next section.
What does this mean? More to the point, what do we get to use from this assumption?
This is very easy, and very, very important, if we return to Definition 3.1.3.
In the first place the hypothesis ensures us that there is some L such that
lim_{n→∞} a_n = L. For the sequence {a_n} and this L, the hypothesis ensures us that
for any ε > 0 we can find an N such that n > N implies that |a_n − L| < ε.
The hypothesis doesn't tell us what N and L are or how to find them. It just
guarantees that there are such an N and L, and that's all we need. As we said
above, just about every proposition in this section and the next will have this
type of hypothesis. Think clearly each time just how it's being used.
We begin with a result that is probably unlike results you have seen before
but is basic. From our basic calculus class we know how to compute some limits.
From the last section, for a given sequence {a_n} and L we know how to apply
the definition of a sequential limit to show that the sequence {a_n} and L
satisfy the definition. But what if, after you read Example 3.2.3 and gain an
understanding of how Definition 3.1.3 was used to show that lim_{n→∞} (2n + 3)/(5n + 7) = 2/5,
one of your classmates says that the limit is really 5/2 and claims that she can
apply Definition 3.1.3 to prove it? Could the text (and your reading of the text)
and your classmate both be right? Has it been made clear that a sequence can't
have two distinct limits that satisfy the definition? We answer these questions
with the following proposition.
Proposition 3.3.1 Suppose {a_n} is a real sequence and lim_{n→∞} a_n exists. The
limit is unique.
Proof: A common way to prove uniqueness is to use contradiction. Thus
we assume that the above statement is false, i.e. we assume that there exist
L_1, L_2 ∈ R such that L_1 ≠ L_2, lim_{n→∞} a_n = L_1 and lim_{n→∞} a_n = L_2. By these
assumptions we know that for any ε_1 > 0 there exists an N_1 such that n > N_1
implies |a_n − L_1| < ε_1, and for any ε_2 > 0 there exists an N_2 such that n > N_2
implies |a_n − L_2| < ε_2. Another way to write this is: for n > N_1, L_1 − ε_1 < a_n <
L_1 + ε_1, and for n > N_2, L_2 − ε_2 < a_n < L_2 + ε_2.
To help see why this assumption is clearly false, we have included the plot
in Figure 3.3.1. Since L_1 ≠ L_2, we have chosen ε_1 and ε_2 sufficiently small so
that the L_1 ± ε_1 and L_2 ± ε_2 corridors do not intersect. Yet for all n > N_1 all of
the values a_n must be in the L_1 ± ε_1 corridor and for n > N_2 all of the values a_n
must be in the L_2 ± ε_2 corridor; we try to illustrate this in the plot, but it isn't
going to happen: the question marks signify the fact that we can't put them in
both places.
For the proof we set ε_1 = ε_2 = |L_1 − L_2|/2. For convenience and without
loss of generality, we assume that L_1 > L_2, so that ε_1 = ε_2 = (L_1 − L_2)/2 (one of
the two values must be larger than the other). Define N = max{N_1, N_2}. Then
for all n > N (so that n > N_1 and n > N_2) we will have
L_1 − ε_1 = (L_1 + L_2)/2 < a_n < L_1 + ε_1 = (3L_1 − L_2)/2    (3.3.1)
and
L_2 − ε_2 = (3L_2 − L_1)/2 < a_n < L_2 + ε_2 = (L_1 + L_2)/2.    (3.3.2)
Figure 3.3.1: Plot of the y = L_1 ± ε and L_2 ± ε corridors, and some sequence
points, trying to be in both corridors.
The leftmost inequality in statement (3.3.1) and the rightmost inequality in
statement (3.3.2) give us, for all n > N = max{N_1, N_2},
(L_1 + L_2)/2 < a_n < (L_1 + L_2)/2.    (3.3.3)
This is surely a contradiction. Therefore no such L_1 and L_2 exist, and the
limit is unique.
Hence we see that if such a claim about the results of Example 3.2.3 is
made, either the text or your classmate must be wrong (and we're betting on
the classmate).
We next state and prove a proposition that includes several of our basic
sequential limit theorems.
Proposition 3.3.2 Suppose that {a_n} and {b_n} are real sequences, lim_{n→∞} a_n =
L_1, lim_{n→∞} b_n = L_2 and c ∈ R. We then have the following results.
(a) lim_{n→∞} (a_n + b_n) = L_1 + L_2.
(b) lim_{n→∞} c a_n = c lim_{n→∞} a_n = c L_1.
(c) There exists a K ∈ R such that |a_n| ≤ K for all n.
(d) lim_{n→∞} (a_n b_n) = L_1 L_2.
Proof: (a) Suppose ǫ > 0 is given. The first two hypotheses give us that

for any ǫ1 > 0 there exists an N1 such that n > N1 implies that |an − L1| < ǫ1,    (3.3.4)

and

for any ǫ2 > 0 there exists an N2 such that n > N2 implies that |bn − L2| < ǫ2.    (3.3.5)
We note that

    |(an + bn) − (L1 + L2)| = |(an − L1) + (bn − L2)| ≤ |an − L1| + |bn − L2|,    (3.3.6)

where the last inequality follows from the triangular inequality, Proposition
1.5.8–(v). Define ǫ1 = ǫ2 = ǫ/2 (the hypotheses presented in (3.3.4) and (3.3.5)
allow for any ǫ1 and ǫ2), N = max{N1, N2} (Step 1: Define N) and let n > N
(so that the last inequalities in both (3.3.4) and (3.3.5) will hold true). Then
from inequality (3.3.6) we get

    |(an + bn) − (L1 + L2)| ≤ |an − L1| + |bn − L2| < ǫ/2 + ǫ/2 = ǫ

(Step 2: Show that the defined N works). Then by Definition 3.1.3 an + bn →
L1 + L2 as n → ∞.
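The proof of part (a) is constructive: given ǫ, find the N1 and N2 that work for ǫ/2 in each sequence and take N = max{N1, N2}. A minimal numerical sketch of this recipe, using the illustrative sequences an = 1/n and bn = 1 + 1/n^2 (our own choices, with limits 0 and 1):

```python
# Constructive sum rule: combine the cutoffs that work for eps/2 in each
# sequence into N = max{N1, N2} for the sum.

def find_N(seq, L, eps, search_limit=10**6):
    """First index past which |seq(n) - L| < eps (valid here because the
    error of each illustrative sequence decreases monotonically)."""
    for n in range(1, search_limit):
        if abs(seq(n) - L) < eps:
            return n - 1
    raise ValueError("no N found in search range")

a = lambda n: 1.0 / n                # converges to L1 = 0
b = lambda n: 1.0 + 1.0 / n**2      # converges to L2 = 1
L1, L2 = 0.0, 1.0

eps = 0.01
N1 = find_N(a, L1, eps / 2)          # works for a_n with eps/2
N2 = find_N(b, L2, eps / 2)          # works for b_n with eps/2
N = max(N1, N2)

# Step 2 of the proof: the defined N works for the sum
assert all(abs((a(n) + b(n)) - (L1 + L2)) < eps for n in range(N + 1, N + 1000))
print(N1, N2, N)
```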
(b) We begin by noting that if c = 0 the result is very easy because the sequence
{can} will be the zero sequence and c lim_{n→∞} an will be zero—the result then
follows from HW3.2.1-(b).
Hence, we assume that c ≠ 0. We suppose that we are given an ǫ > 0. The
hypothesis that lim_{n→∞} an = L1 gives us that for any ǫ1 > 0 there exists an N1 ∈ R
such that n > N1 implies |an − L1| < ǫ1. We need an N such that n > N implies
that |can − cL1| < ǫ. By Proposition 1.5.8–(ii) this last inequality is the same as
|c| |an − L1| < ǫ. Thus we see that if we apply our hypothesis with ǫ1 = ǫ/|c| and
N = N1, for n > N we have |can − cL1| = |c| |an − L1| < |c|ǫ1 = |c|(ǫ/|c|) = ǫ.
Therefore by Definition 3.1.3 lim_{n→∞} can = cL1.
(c) This statement is clearly different from parts (a), (b), and (d). As we shall
see, this result is both a very important tool and generally a very important
property of convergent sequences. We begin by choosing ǫ1 = 1 and apply the
hypothesis that lim_{n→∞} an = L1 to get an N1 ∈ R such that n > N1 implies that
|an − L1| < ǫ1 = 1. Then by this last inequality and the backwards triangular
inequality, Proposition 1.5.8–(vi), we have for n > N1

    |an| − |L1| ≤ |an − L1| < ǫ1 = 1,

or |an| < |L1| + 1. This inequality bounds most of the sequence {an}. If we let
N0 = [N1], where the bracket function [x] is the largest integer less
than or equal to x, then the inequality |an| < |L1| + 1 for n > N1 bounds |an| for
n = N0 + 1, N0 + 2, · · · . Thus we set K = max{|a1|, |a2|, · · · , |aN0|, |L1| + 1}
and we have our desired result.
We note that in the above proof it would have been convenient if we had
always defined N to be a natural number (so that we didn't have to use the
[N]). However, we have had many instances when it has been convenient for
us to only require that N ∈ R. We should also note that it is the completeness
axiom that assures us that such an integer N0 exists. We are defining [N] to be
the least upper bound of the set {n ∈ N : n ≤ N}—which surely exists because
the set is bounded above by N.
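The bound in part (c) is completely explicit: taking ǫ1 = 1 gives |an| < |L1| + 1 past some N1, and the finitely many earlier terms are absorbed with a max. A sketch with the illustrative sequence an = 5 + (−1)^n/n (our own choice; here N1 = 1 works):

```python
# Part (c) made concrete: K = max{|a_1|, ..., |a_N0|, |L| + 1} bounds the
# whole sequence, where N0 comes from choosing eps_1 = 1.

def a(n):
    return 5.0 + (-1) ** n / n       # converges to L = 5

L = 5.0
# with eps_1 = 1 we have |a_n - L| = 1/n < 1 for all n >= 2, so N1 = 1 works
N0 = 1
K = max([abs(a(n)) for n in range(1, N0 + 1)] + [abs(L) + 1.0])

# sanity check the bound on a long initial stretch
assert all(abs(a(n)) <= K for n in range(1, 10_000))
print(K)
```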
(d) Suppose ǫ > 0 is given. The first two hypotheses give us that
for any ǫ1 > 0 there exists an N1 such that n > N1 implies that |an − L1 | < ǫ1 ,
(3.3.7)
and
for any ǫ2 > 0 there exists an N2 such that n > N2 implies that |bn − L2 | < ǫ2 .
(3.3.8)
We must find an N such that n > N implies that |an bn − L1 L2 | < ǫ. It should
not surprise you that the proof will be similar to that given in part (a)—with a
different dance. We note that
    |an bn − L1 L2| = |an(bn − L2) + L2(an − L1)| ≤ |an(bn − L2)| + |L2(an − L1)|
                    = |an| |bn − L2| + |L2| |an − L1|.    (3.3.9)
(To verify the first step just multiply out the second expression. The first
inequality is due to the triangular inequality, Proposition 1.5.8–(v)—we will use
this often. The last step just uses |xy| = |x||y|, Proposition 1.5.8–(ii).) Then
starting with expression (3.3.9) and using (3.3.7), n > N1 , (3.3.8), n > N2 and
part (c) of this proposition, we get
    |an bn − L1 L2| ≤ |an| |bn − L2| + |L2| |an − L1| < Kǫ2 + |L2| ǫ1.    (3.3.10)

Thus we see that if we choose ǫ2 = ǫ/(2K), ǫ1 = ǫ/(2|L2|) and N = max{N1, N2}
(so that both of the inequalities in (3.3.7) and (3.3.8) are satisfied), we have
|an bn − L1 L2| < ǫ whenever n > N. (If L2 = 0 the term |L2| ǫ1 vanishes and
any choice of ǫ1 works.) Therefore lim_{n→∞} an bn = L1 L2.
We should note that when many mathematicians are doing proofs such as
this one, they will often let ǫ1 = ǫ2 = ǫ, obtain expression (3.3.10) (with ǫ
replacing ǫ1 and ǫ2) and claim that they are done. And they are. The last term
of expression (3.3.10) would be (K + |L2|)ǫ. Because of the ǫ we are able to make
|an bn − L1 L2| arbitrarily small—which is really our goal. However, we don't
technically satisfy Definition 3.1.3. But it should also be clear at this time that
the (K + |L2|)ǫ term can be fixed up so as to give the desired result. Textbooks
will generally fix it up so that they always end with just an ǫ at the end of
the inequality—it's just a bit cleaner. But don't be surprised if you see this
"sloppier" (but correct) approach in classes and talks.
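The ǫ-bookkeeping at the end of part (d) can be replayed numerically. In the sketch below the sequences an = 2 + 1/n and bn = 3 − 1/n are illustrative choices; we use the bound K = 3, set ǫ2 = ǫ/(2K) and ǫ1 = ǫ/(2|L2|), and confirm that N = max{N1, N2} works for the product.

```python
# The product rule's epsilon bookkeeping: with K bounding |a_n|,
# eps_2 = eps/(2K) and eps_1 = eps/(2|L2|) make K*eps_2 + |L2|*eps_1 = eps.

a = lambda n: 2.0 + 1.0 / n          # L1 = 2
b = lambda n: 3.0 - 1.0 / n          # L2 = 3
L1, L2 = 2.0, 3.0
K = 3.0                               # |a_n| <= a_1 = 3 for all n

eps = 0.01
eps2 = eps / (2 * K)
eps1 = eps / (2 * abs(L2))
assert abs(K * eps2 + abs(L2) * eps1 - eps) < 1e-15   # the two halves sum to eps

# cutoffs for eps1, eps2 (here |a_n - L1| = |b_n - L2| = 1/n), then N = max
N1 = next(n for n in range(1, 10**6) if 1.0 / n < eps1)
N2 = next(n for n in range(1, 10**6) if 1.0 / n < eps2)
N = max(N1, N2)
assert all(abs(a(n) * b(n) - L1 * L2) < eps for n in range(N + 1, N + 1000))
print(N)
```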
We now have some of the basic results that let us compute easy limits. We
know from Example 3.2.1 that lim_{n→∞} 1/n = 0. It should be easy to see that
we can use part (d) of the above theorem to get lim_{n→∞} 1/n^2 = 0 and another
application of part (d) will give lim_{n→∞} 1/n^3 = 0. We are able to obtain the
following more general result.

Example 3.3.1 Prove that lim_{n→∞} 1/n^k = 0 for any k ∈ N.

Solution: We hope that you realize that this result is a natural for mathematical induction.
Step 1: Prove true for k = 1. Example 3.2.1 shows that it is true for k = 1.
Step 2: Assume true for k = j, i.e. lim_{n→∞} 1/n^j = 0.
Step 3: Prove true for k = j + 1, i.e. prove that lim_{n→∞} 1/n^{j+1} = 0. This proof is an easy
application of part (d) of Proposition 3.3.2. We write 1/n^{j+1} as (1/n^j)(1/n). We know from
Example 3.2.1 that lim_{n→∞} 1/n = 0. We know from the inductive assumption, Step 2, that
lim_{n→∞} 1/n^j = 0. Then by part (d) of Proposition 3.3.2 we have

    lim_{n→∞} 1/n^{j+1} = (lim_{n→∞} 1/n^j)(lim_{n→∞} 1/n) = 0 · 0 = 0.

Therefore the proposition is true for k = j + 1.
By the principle of mathematical induction the proposition is true for all k ∈ N.
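A quick numerical companion to Example 3.3.1 (an illustration only; the induction above is the proof): for each k, any n > (1/ǫ)^{1/k} satisfies 1/n^k < ǫ, so the tail sits in the ǫ-corridor about 0.

```python
# Check lim 1/n^k = 0 numerically for a few k: past N = (1/eps)^(1/k),
# every term 1/n^k lies inside the eps-corridor around 0.

import math

eps = 1e-6
for k in (1, 2, 3, 5):
    N = math.ceil(eps ** (-1.0 / k))      # n > N  =>  n^k > 1/eps
    tail = [1.0 / n**k for n in range(N + 1, N + 200)]
    assert all(abs(t - 0.0) < eps for t in tail), k

print("corridor check passed for k = 1, 2, 3, 5")
```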
Note that you must be careful not to get confused: usually our statements
to be proved by mathematical induction were given in terms of n and we
used k as our dummy index. In this case, since n was already in use, our
statement is given in terms of k and we used j as our dummy index. It can be
confusing but is only a matter of notation.
Also, we might be inclined to want to prove the above result using part (d)
of Proposition 3.3.2 (k − 1 times) and Example 3.2.1 to show that

    lim_{n→∞} 1/n^k = (lim_{n→∞} 1/n) · · · (lim_{n→∞} 1/n)  (repeated k times)  = 0.

This is a perfectly good approach. Hopefully you realize when you include the
"three dots" you are including a math induction proof in disguise—and hopefully
an easy one. The result needed here is the extension of part (d) of Proposition
3.3.2 that can be stated as follows. Let {a_{jn}}_{n=1}^∞, j = 1, · · · , k denote k
real sequences such that lim_{n→∞} a_{jn} = L_j for j = 1, · · · , k. Then
lim_{n→∞} a_{1n} · · · a_{kn} = L_1 · · · L_k. It is hoped that you realize that this
statement can be proved reasonably easily by mathematical induction—like the
proof given above.
We next note that we can use parts (a) and (b) of Proposition 3.3.2, and
Example 3.3.1 to show that

    lim_{n→∞} (1 − 1/n^2) = lim_{n→∞} 1 + lim_{n→∞} (−1)(1/n^2) = 1 + (−1) lim_{n→∞} 1/n^2 = 1 + (−1)0 = 1,

which is the same result we got in Example 3.2.2. We note that once we have
proved Proposition 3.3.2, HW3.2.1-(b) and Example 3.3.1, the proof of the
limit given here is every bit as rigorous as the proof given in Example 3.2.2.
In the next example we prove a more general result that can be useful. Let p
denote a k-th degree polynomial p(x) = a0 x^k + a1 x^{k−1} + · · · + ak−1 x + ak where
a0, a1, · · · , ak are real.

Example 3.3.2 Prove that

    lim_{n→∞} p(1/n) = lim_{n→∞} [a0 (1/n^k) + a1 (1/n^{k−1}) + · · · + ak−1 (1/n) + ak] = ak.
Solution: Again it should be clear that this proof could be done by induction. Instead, we
will prove this result using the extension of part (a) of Proposition 3.3.2 along with part (b)
and Example 3.2.1 to see that

    lim_{n→∞} p(1/n) = lim_{n→∞} [a0 (1/n^k) + a1 (1/n^{k−1}) + · · · + ak−1 (1/n) + ak]
        = lim_{n→∞} [a0 (1/n^k)] + lim_{n→∞} [a1 (1/n^{k−1})] + · · · + lim_{n→∞} [ak−1 (1/n)] + lim_{n→∞} [ak]
        = a0 lim_{n→∞} 1/n^k + a1 lim_{n→∞} 1/n^{k−1} + · · · + ak−1 lim_{n→∞} 1/n + ak
        = ak.

This is a nice straightforward proof of the desired result but it does depend strongly on
the use of the extension of part (a) of Proposition 3.3.2—which you should be confident follows
easily from part (a) or you should prove it.
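Example 3.3.2 says p(1/n) → ak because every non-constant term carries a positive power of 1/n. A numerical sketch with an illustrative cubic (the coefficients are our own choice):

```python
# p(1/n) -> a_k (the constant coefficient): every other term carries a
# positive power of 1/n and vanishes in the limit.

def p(x, coeffs):
    """Evaluate a0*x^k + a1*x^(k-1) + ... + ak via Horner's rule
    (coeffs listed a0, a1, ..., ak)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

coeffs = [4.0, -7.0, 2.0, 9.0]        # p(x) = 4x^3 - 7x^2 + 2x + 9, so a_k = 9
a_k = coeffs[-1]

errors = [abs(p(1.0 / n, coeffs) - a_k) for n in (10, 100, 1000, 10000)]
assert errors == sorted(errors, reverse=True)   # the error shrinks as n grows
print(errors)
```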
HW 3.3.1 (True or False and why) (a) If lim_{n→∞} |an| exists, then lim_{n→∞} an exists.
(b) If lim_{n→∞} an/bn and lim_{n→∞} bn exist, then lim_{n→∞} an exists.
(c) If lim_{n→∞} an exists, then lim_{n→∞} (an)^3 exists.
(d) If lim_{n→∞} (an)^3 exists, then lim_{n→∞} an exists.
(e) If lim_{n→∞} an bn exists, then lim_{n→∞} an and lim_{n→∞} bn exist.

HW 3.3.2 Prove that if lim_{n→∞} an = 0, then lim_{n→∞} |an| = 0.

HW 3.3.3 Prove that if lim_{n→∞} an = L, then lim_{n→∞} |an| = |L|.

HW 3.3.4 Consider the sequence {an}.
(a) Prove that lim_{n→∞} an = L if and only if lim_{n→∞} [an − L] = 0.
(b) Prove that lim_{n→∞} an = L implies that lim_{n→∞} |an − L| = 0.
(c) Show that lim_{n→∞} |an| = |L| does not imply that lim_{n→∞} an = L.

3.4 More Sequential Limit Theorems
As the section title indicates there are more results that we need concerning
sequential limits. The first result is a very basic result that you probably already
know. Hopefully sometime during your elementary calculus class you found
limits such as lim_{n→∞} (2n^2 + n − 3)/(3n^2 + 3n + 3). You wrote

    lim_{n→∞} (2n^2 + n − 3)/(3n^2 + 3n + 3) = lim_{n→∞} (2 + 1/n − 3/n^2)/(3 + 3/n + 3/n^2)
        = [lim_{n→∞} (2 + 1/n − 3/n^2)] / [lim_{n→∞} (3 + 3/n + 3/n^2)] = 2/3.

The first step above follows from the fact that the expressions inside of the
limit are exactly the same for the first two terms—the second term is found by
multiplying the first by (1/n^2)/(1/n^2). In the second step of the calculation we would be
using part (b) of Proposition 3.4.1 given below and two applications of Example
3.3.2 (or in place of Example 3.3.2 you can use parts (a), (b) of Proposition
3.3.2 and Example 3.3.1). Thus we next include the quotient rule for sequential
limits.
Proposition 3.4.1 Suppose that {an} and {bn} are real sequences, lim_{n→∞} an =
L1 and lim_{n→∞} bn = L2. Then we have the following results.
(a) If L2 ≠ 0 then there exist an M > 0 and an N3 ∈ R such that |bn| ≥ M
for all n > N3.
(b) If L2 ≠ 0, then lim_{n→∞} an/bn = L1/L2.
Proof: (a) As in the other proofs, the hypothesis for this result implies that
for every ǫ2 > 0 there exists an N2 ∈ R so that for n > N2, |bn − L2| < ǫ2.
We are also given the fact that L2 ≠ 0. This is another result for which it is
convenient to draw a picture. In Figure 3.4.1 we have used the fact that L2 ≠ 0
to choose an ǫ2 so that the L2 ± ǫ2 corridor forces the sequence values to be
away from zero for all n greater than some N3, i.e. we have chosen an ǫ2 so
that y = 0 is not in the L2 ± ǫ2 corridor.
The easiest way to accomplish this is to choose ǫ2 = |L2|/2. Then the
hypothesis implies that there exists an N2 ∈ R such that n > N2 implies that
|bn − L2| < |L2|/2 or |L2 − bn| = |bn − L2| < |L2|/2. Then by the backwards
triangular inequality, Proposition 1.5.8–(vi), for n > N3 = N2 we get

    |L2| − |bn| ≤ |L2 − bn| < |L2|/2

or |bn| > |L2|/2.
[Figure 3.4.1: Plot of a sequence and the y = L2 ± ǫ2 corridor.]
(b) Before we proceed we note that

    an/bn − L1/L2 = (L2 an − L1 bn)/(L2 bn) = [L2(an − L1) − L1(bn − L2)]/(L2 bn).    (3.4.1)

We note that we like the an − L1 and bn − L2 terms in the numerator because we
can make them small. The L2 and L1 are also good in the numerator because
they're constants. The minus sign between the two terms does not cause us
trouble because when we use the triangular inequality to separate the two terms,
we can use the fact that the triangular inequality will give us |x − y| = |x + (−y)| ≤
|x| + |−y| = |x| + |y|—so it's as if the minus sign isn't really there. The bn term
in the denominator is the term that might cause us most problems but we have
part (a) of this proposition. So let's begin.
Suppose ǫ > 0 is given. The first two hypotheses give us that
for any ǫ1 > 0 there exists an N1 such that n > N1 implies that |an − L1 | < ǫ1 ,
(3.4.2)
and
for any ǫ2 > 0 there exists an N2 such that n > N2 implies that |bn − L2 | < ǫ2 .
(3.4.3)
Since L2 ≠ 0, we can apply part (a) of this proposition to get an M and N3
such that n > N3 implies |bn| ≥ M. If we choose n so that n > N1, n > N2 and
n > N3, i.e. choose n > N = max{N1, N2, N3}, we can return to equation
(3.4.1) to see that

    |an/bn − L1/L2| = |L2(an − L1) − L1(bn − L2)| / |L2 bn|
                    ≤ [|L2| |an − L1| + |L1| |bn − L2|] / (|L2| |bn|)    (3.4.4)
                    < (|L2| ǫ1 + |L1| ǫ2) / (|L2| M).    (3.4.5)

Thus if we choose ǫ1 = M ǫ/2 and ǫ2 = M |L2| ǫ/(2 |L1|), we can apply inequalities
(3.4.4)–(3.4.5) to get

    |an/bn − L1/L2| < [|L2|(M ǫ/2) + |L1| (M |L2| ǫ/(2 |L1|))] / (|L2| M) = ǫ.

Therefore an/bn → L1/L2 as n → ∞.
We note that the above argument applies only if L1 ≠ 0. If L1 = 0, we
can consider inequality (3.4.4) and note that it will have only one term in the
numerator. The proof then follows as above with the same definition of ǫ1—ǫ2
is not necessary in this case.
In the introduction to Proposition 3.4.1 we included a calculation using the
quotient rule. To obtain a more general result we let p and q denote k-th
and m-th degree polynomials p(x) = a0 xk + a1 xk−1 + · · · + ak−1 x + ak and
q(x) = b0 xm + b1 xm−1 + · · · + bm−1 x + bm , respectively, where a0 , a1 , · · · , ak
and b0 , b1 , · · · , bm are real. We obtain the following result.
Example 3.4.1
(a) If k = m and b0 ≠ 0, then

    lim_{n→∞} p(n)/q(n) = a0/b0.    (3.4.6)

(b) If k < m and b0 ≠ 0, then

    lim_{n→∞} p(n)/q(n) = 0.    (3.4.7)

(c) If k > m and bj ≠ 0 for some j = 1, · · · , m, then lim_{n→∞} p(n)/q(n) does not exist.
Solution: We want to make you very aware that in the last example we used the polynomial
p and calculated a limit of p(1/n). In this example we have p(n) and q(n) in our limit
statements. As you will see we will juggle things so that we can still use the results of the last
example—but that's why we are getting a0 and b0 in our answer and not ak and bm.
(a) (k = m) Part (a) of this result is quite easy—it can be proved precisely the way you
computed these limits in your first calculus classes (except this time we will be using a quotient
result that we have proved). We note that

    lim_{n→∞} p(n)/q(n) = lim_{n→∞} (a0 n^k + a1 n^{k−1} + · · · + ak−1 n + ak)/(b0 n^k + b1 n^{k−1} + · · · + bk−1 n + bk)
        = lim_{n→∞} (a0 + a1/n + · · · + ak−1/n^{k−1} + ak/n^k)/(b0 + b1/n + · · · + bk−1/n^{k−1} + bk/n^k)
              (multiply top and bottom by 1/n^k)
        = [lim_{n→∞} (a0 + a1/n + · · · + ak/n^k)] / [lim_{n→∞} (b0 + b1/n + · · · + bk/n^k)]
              (part (b), Proposition 3.4.1)
        = a0/b0    (by applying Example 3.3.2 twice).
(b) (k < m) For this case we proceed similarly to the way that we proceeded in part (a)—we
will multiply the top and bottom by 1/n^m. We get

    lim_{n→∞} p(n)/q(n) = lim_{n→∞} (a0 n^k + a1 n^{k−1} + · · · + ak−1 n + ak)/(b0 n^m + b1 n^{m−1} + · · · + bm−1 n + bm)
        = lim_{n→∞} (a0/n^{m−k} + a1/n^{m−k+1} + · · · + ak/n^m)/(b0 + b1/n + · · · + bm/n^m)
              (multiply top and bottom by 1/n^m)
        = [lim_{n→∞} (a0/n^{m−k} + · · · + ak/n^m)] / [lim_{n→∞} (b0 + b1/n + · · · + bm/n^m)]
              (part (b), Proposition 3.4.1)
        = 0/b0 = 0    (by applying Example 3.3.2 twice).

(c) (k > m) This statement is true but the proof is too ugly to include here.
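The three cases of Example 3.4.1 can be watched numerically (illustration only; the polynomials below are our own choices): equal degrees give a ratio near a0/b0, a smaller numerator degree drains to 0, and a larger one grows without bound.

```python
# The three cases of Example 3.4.1, observed at a single large n.

def poly(coeffs):
    """Return a function evaluating the polynomial with coefficients
    a0, a1, ..., ak (highest degree first) via Horner's rule."""
    def f(x):
        r = 0.0
        for c in coeffs:
            r = r * x + c
        return r
    return f

q    = poly([3.0, 3.0, 3.0])          # 3n^2 + 3n + 3   (m = 2, b0 = 3)
p_eq = poly([2.0, 1.0, -3.0])         # 2n^2 + n - 3    (k = m = 2)
p_lo = poly([5.0, 1.0])               # 5n + 1          (k = 1 < m)
p_hi = poly([1.0, 0.0, 0.0, 0.0])     # n^3             (k = 3 > m)

n = 10**6
assert abs(p_eq(n) / q(n) - 2.0 / 3.0) < 1e-6    # case (a): near a0/b0
assert abs(p_lo(n) / q(n)) < 1e-5                # case (b): near 0
assert p_hi(n) / q(n) > 10**5                    # case (c): blowing up
print("cases (a), (b), (c) behave as claimed")
```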
There are other results that we need or would like concerning limits of
sequences. We can use the definition to prove that lim_{n→∞} (−1)^n/n converges
to zero. However a tool that can be used to prove the convergence of this limit
and many others is the following proposition, referred to as the Sandwich Theorem.

Proposition 3.4.2 Suppose that {an}, {bn} and {cn} are real sequences for
which lim_{n→∞} an = lim_{n→∞} cn = L and an ≤ bn ≤ cn for all n greater than
some N1. Then lim_{n→∞} bn = L.
Proof: We suppose that we are given an ǫ > 0. The two limit hypotheses give
us that there exists an N2 such that n > N2 implies that |an − L| < ǫ, or
L − ǫ < an < L + ǫ,
(3.4.8)
and there exists an N3 such that n > N3 implies that |cn − L| < ǫ, or
L − ǫ < cn < L + ǫ.
(3.4.9)
Then if we use the left inequality of (3.4.8), the right inequality of (3.4.9) and
the hypothesis that an ≤ bn ≤ cn , we find that if n > N = max{N1 , N2 , N3 },
then
L − ǫ < an ≤ bn ≤ cn < L + ǫ,
or |bn − L| < ǫ. Therefore lim_{n→∞} bn = L.
It should then be easy to see that we can use the inequality −|an| ≤
(−1)^n an ≤ |an| (see HW3.3.2) and Proposition 3.4.2 to obtain the following
result.

Corollary 3.4.3 If lim_{n→∞} an = 0, then lim_{n→∞} (−1)^n an = 0.
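The Sandwich Theorem in action on lim (−1)^n/n = 0 (a numerical illustration; the finite check below is our own device, not a proof): bn = (−1)^n/n is squeezed between an = −1/n and cn = 1/n, and an N that works for the outer sequences works for bn.

```python
# Sandwich Theorem on b_n = (-1)^n / n: squeezed between -1/n and 1/n.

a = lambda n: -1.0 / n
b = lambda n: (-1) ** n / n
c = lambda n: 1.0 / n

# the sandwich inequality a_n <= b_n <= c_n holds for every tested n
assert all(a(n) <= b(n) <= c(n) for n in range(1, 10_000))

# an N that works for both outer sequences works for the middle one
eps = 1e-3
N = next(n for n in range(1, 10**6) if 1.0 / n < eps)
assert all(abs(b(n) - 0.0) < eps for n in range(N + 1, N + 10_000))
print(N)
```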
In Chapter 2 we worked with limit points of sets. Though the concepts are
very different, in Propositions 3.1.5 and 3.1.6 we saw that there was a connection
between limits of sequences and limit points of sets. We next include some
additional topological results involving sequences.

Proposition 3.4.4 Suppose that E ⊂ R is closed and {xn} is a sequence
contained in E that converges to x0. Then x0 ∈ E.
Proof: A sequence {xn} can converge to x0 in two ways: either xn = x0 for
all n > N for some N (in which case x0 ∈ E because {xn} ⊂ E) or the set of
points E1 = {x1, x2, · · · } is infinite (in which case by Proposition 3.1.5-(b) x0
is a limit point of E1, and since E1 ⊂ E, x0 is a limit point of E—and since
E is closed, x0 ∈ E). In either case, x0 ∈ E.
Subsequences Though we saw that the sequence {(−1)^n} does not converge,
we can choose the subsequence of even terms {a2n} and note that a2n = 1 → 1.
Thus we see that even when a sequence does not converge, it is possible that a
subsequence might converge. We begin with the following definition.
Definition 3.4.5 Consider the real sequence {an} and the sequence {nk} ⊂ N
such that n1 < n2 < n3 < · · · . The sequence {a_{n_k}}_{k=1}^∞ is called a
subsequence of {an}.
If lim_{k→∞} a_{n_k} exists, the limit is called a subsequential limit.
We then have the following result.

Proposition 3.4.6 Suppose that {an} is a real sequence and L ∈ R. Then
lim_{n→∞} an = L if and only if every subsequence of {an} converges to L.
Proof: (⇒) If lim_{n→∞} an = L then for every ǫ > 0 there exists an N ∈ R such
that n > N implies that |an − L| < ǫ. Consider any subsequence {a_{n_k}} of {an}.
Clearly if nk > N, then |a_{n_k} − L| < ǫ. Let K be such that nK ≤ N < nK+1
(nK can be defined to be lub{nk : nk ≤ N}). Then k > K implies that nk > N
and |a_{n_k} − L| < ǫ. Therefore lim_{k→∞} a_{n_k} = L.
(⇐) Suppose false, i.e. every subsequence of {an} converges to L but lim_{n→∞} an ≠
L (either the limit doesn't exist or it exists and does not equal L). lim_{n→∞} an ≠ L
if for some ǫ > 0 and every N ∈ R there exists an n > N for which |an − L| ≥ ǫ.
Let N = 1 and denote by n1 the value (of n) such that |a_{n_1} − L| ≥ ǫ.
Then let N = n1 and denote by n2 the element of N such that n2 > N = n1
and |a_{n_2} − L| ≥ ǫ.
Continue in this fashion and get a sequence of natural numbers {nk} such that
n1 < n2 < n3 < · · · and |a_{n_k} − L| ≥ ǫ for all k. Thus the subsequence {a_{n_k}}
does not converge to L. This is a contradiction so lim_{n→∞} an = L.
The next result is an important result for later work known as the Bolzano–
Weierstrass Theorem.
Theorem 3.4.7 (Bolzano–Weierstrass Theorem) If the set E ⊂ R is bounded,
then every sequence in E has a convergent subsequence.
Proof: Let {xn} be a sequence in E. If the set E1 = {xn : n ∈ N} is a finite
set, then at least one value, say a ∈ E1, must be repeated infinitely often in the
sequence {xn}. If we consider the subsequence {x_{n_j}}_{j=1}^∞ where x_{n_j} = a for all
j, then the subsequence is clearly convergent.
If the set E1 is infinite, we proceed with a construction much the same as
we used in Proposition 2.3.7. Since E1 is bounded, there is a closed interval
I1 = [a1, b1], a1 < b1, such that E1 ⊂ I1.
Let c1 = (a1 + b1)/2 and consider the closed intervals [a1, c1] and [c1, b1]. One of
these intervals must contain infinitely many points of E1 (the sequence {xn}),
call this interval I2 and write this closed interval as I2 = [a2, b2].
Let c2 = (a2 + b2)/2 and consider the closed intervals [a2, c2] and [c2, b2]. One of
these intervals must contain infinitely many points of E1 (the sequence {xn}),
call this interval I3 and write this closed interval as I3 = [a3, b3].
In general suppose that we have defined I1 ⊃ I2 ⊃ · · · ⊃ In, where Ij = [aj, bj],
j = 1, · · · , n. Let cn = (an + bn)/2 and consider the closed intervals [an, cn]
and [cn, bn]. One of these intervals must contain infinitely many points of E1
(the sequence {xn}), call this interval In+1 and write this closed interval as
In+1 = [an+1, bn+1].
We have the nested sequence of closed intervals {In} (In ⊃ In+1) such that
each interval In contains infinitely many points of E1 and the length of the
interval In is (b1 − a1)/2^{n−1}. By Proposition 2.3.6 we know that ∩_{n=1}^∞ In is not
empty. Let x0 be such that x0 ∈ ∩_{n=1}^∞ In. (Because the length of the intervals
goes to zero, there is really only one point in the intersection—but we don't
care—finding one point is enough.)
Now choose the subsequence {x_{n_j}} as follows:
Choose x_{n_1} as one of the terms of the sequence {xn} such that x_{n_1} ∈ I1 (since
I1 contains infinitely many elements of E1, this is surely possible).
Choose x_{n_2} ∈ I2 from the terms of the remaining part of the original sequence
(i.e. such that n2 > n1)—I2 contains infinitely many elements of E1 so there
are still enough to choose from.
In general choose x_{n_j} ∈ Ij so that nj > nj−1—since Ij contains infinitely
many elements of E1 there are still plenty of elements to choose from. Do so for
all j ∈ N.
Since x0 ∈ ∩_{n=1}^∞ In, x0 ∈ Ij for all j. Since x_{n_j} ∈ Ij also, |x_{n_j} − x0| ≤
(b1 − a1)/2^{j−1} → 0 as j → ∞ and the subsequence {x_{n_j}} converges to x0.
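The bisection in the proof can be sketched directly as a computation. Since code cannot test "infinitely many", the sketch below approximates it by counting hits from a long finite stretch of the bounded, non-convergent illustrative sequence xn = (−1)^n (1 + 1/n) (our own choice); the construction homes in on a subsequential limit, here near −1.

```python
# Sketch of the Bolzano-Weierstrass bisection.  "Contains infinitely many
# terms" is approximated by counting hits from a long finite stretch of
# the sequence; the half with more hits is kept at each step.

def x(n):
    return (-1) ** n * (1.0 + 1.0 / n)    # bounded, not convergent

TERMS = [x(n) for n in range(1, 50_001)]

a, b = -2.0, 2.0                          # I_1 = [a1, b1] contains every x_n
for _ in range(40):                       # 40 halvings: width (b1 - a1)/2^40
    c = (a + b) / 2
    left = sum(1 for t in TERMS if a <= t <= c)
    right = sum(1 for t in TERMS if c <= t <= b)
    if left >= right:                     # keep a half still rich in terms
        b = c
    else:
        a = c

x0 = (a + b) / 2                          # the point pinned down by the nesting
print(x0)                                 # lands close to a subsequential limit
```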
One very easy result that we obtain from the Bolzano–Weierstrass Theorem
is the following.
Corollary 3.4.8 If the set K ⊂ R is compact, then every sequence in K has a
convergent subsequence that converges to a point in K.
Proof: From the Heine–Borel Theorem, Theorem 2.3.8, we know that K is
bounded. By the Bolzano–Weierstrass Theorem, Theorem 3.4.7, we know that
if {xn } is a sequence in K, then the sequence {xn } has a convergent subsequence
{x_{n_j}}. Again by the Heine–Borel Theorem, since K is compact, K is closed.
Then by Proposition 3.4.4 the subsequence {x_{n_j}} converges to some x0 ∈ K.
Cauchy Sequences and the Cauchy Criterion There is an idea strongly
related to convergence in the reals that at times can be very helpful when
discussing convergence of sequences. At first look it doesn’t appear that this
should be the right place to include this result. We include this definition
and proposition here because the proof depends on the Bolzano–Weierstrass
Theorem, 3.4.7. We begin with the following definition—notice how similar it
is to that of convergence of a sequence.
Definition 3.4.9 Consider a real sequence {an }. The sequence is said to be a
Cauchy sequence if for every ǫ > 0 there exists an N ∈ R such that n, m ∈ N
and n, m > N implies that |an − am | < ǫ.
Thus we see that whereas {an} converges to L if for all large n's, the an's get
close to L, the sequence is a Cauchy sequence if for all large n's and m's, the
terms an and am get close to each other. It is easy to see that the sequence
{1, 1, · · · } is a Cauchy sequence—choose N = 1. It is also easy to see that a
sequence {an} where an = 1/n is a Cauchy sequence. (If N is chosen so that
N = 2/ǫ, then n, m > N implies that |1/n − 1/m| ≤* 1/n + 1/m < 2/N = ǫ where
the step labeled ≤* is true because of the triangular inequality, Proposition
1.5.8-(v).) And finally a sequence such as {n} is not a Cauchy sequence. If we
choose ǫ = 1, then for any N and n > N, we can find an m > N, say m = n + 5,
such that |an − am| = |n − m| = 5 > ǫ.
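Both examples above can be spot-checked by machine over finitely many indices (an illustration, not a proof, since a computer can only test finitely many pairs n, m):

```python
# Finite spot-check: a_n = 1/n passes the Cauchy test with N = 2/eps,
# while a_n = n fails it with eps = 1 and m = n + 5.

eps = 0.01
N = 2 / eps                               # = 200, the N chosen in the text

cauchy_ok = all(abs(1.0 / n - 1.0 / m) < eps
                for n in range(201, 400) for m in range(201, 400))
assert cauchy_ok

# {n} is not Cauchy: however large N is, the pair (n, n + 5) fails for eps = 1
for big_N in (10, 1000, 10**5):
    n = big_N + 1
    m = n + 5
    assert abs(n - m) == 5 and 5 >= 1     # |a_n - a_m| = 5 >= eps = 1
print("1/n passes; {n} fails")
```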
We next include a lemma that is really part of the proof of the Cauchy
criterion. We are separating it out because it may be useful in its own right.
Lemma 3.4.10 If the sequence {an } is a Cauchy sequence, then the sequence
is bounded.
Proof: Suppose that {an} is a Cauchy sequence. Choose ǫ = 1 and let N ∈ R
be such that n, m > N implies that |an − am| < 1. Choose a fixed M ∈ N such
that M > N. Then for n > M > N we have |an − aM| < 1. Then by the
backwards triangular inequality, Proposition 1.5.8-(vi), we get |an| − |aM| ≤
|an − aM| < 1, or if n > M, |an| < |aM| + 1. Then the sequence {an} is bounded
by max{|a1|, · · · , |aM−1|, |aM| + 1}.
We now proceed with a very important theorem. Note that the proof in one
direction is difficult—read it carefully.
Proposition 3.4.11 Cauchy Criterion for Convergence Consider a real
sequence {an }. The sequence {an } is convergent if and only if the sequence
{an } is a Cauchy sequence.
Proof: (⇒) Begin by supposing that an → L. We know that for any ǫ1 > 0
there exists N ∈ R such that n > N implies that |an − L| < ǫ1 . Now suppose
that we are given an ǫ > 0, choose ǫ1 = ǫ/2 and let N ∈ R be the value
promised us by the convergence of the sequence {an }. Then if n, m > N ,
|an − am | = |(an − L) + (L − am )| ≤∗ |an − L| + |L − am | < 2ǫ1 = ǫ where
the step labeled ≤∗ is due to the triangular inequality, Proposition 1.5.8-(v).
Therefore {an } is a Cauchy sequence.
(⇐) Suppose that {an} is a Cauchy sequence. Let E be the set of points
{a1, a2, · · · }. By Lemma 3.4.10 the sequence {an}, and hence the set E, is
bounded. Then we know by the Bolzano–Weierstrass Theorem, Theorem 3.4.7,
that the sequence {an} has a convergent subsequence, say {a_{n_k}}. Let L be such
that a_{n_k} → L as k → ∞.
We will now proceed to prove that {an} converges to L. Suppose ǫ > 0 is
given and let N ∈ R be such that n, m ∈ N and n, m > N implies |an − am| < ǫ/2
(because {an} is a Cauchy sequence). Let N2 ∈ R be such that k ∈ N and k > N2
implies that |a_{n_k} − L| < ǫ/2 (because a_{n_k} → L). Let n_K (where n_K is one of
the subscripts from the subsequence, n1, n2, · · · ) be a fixed integer such that
K > N2. In addition require that n_K > N—if we have found an appropriate
n_K, we can always use a larger value. Hence, we have that |a_{n_K} − L| < ǫ/2 and
if n > N we have that |a_{n_K} − an| < ǫ/2. Thus for n > N,

    |an − L| = |(an − a_{n_K}) + (a_{n_K} − L)| ≤* |an − a_{n_K}| + |a_{n_K} − L| < ǫ/2 + ǫ/2 = ǫ

where the step labeled ≤* follows by the triangular inequality. Therefore an → L
as n → ∞.
So we see that the Cauchy criterion provides us with an alternative approach
to proving convergence. Because we showed that the sequence {an} where
an = 1/n is a Cauchy sequence, we know that {an} converges (which we already
knew). However, using this approach we do not know or need to know what the
sequence converges to. That is the magic of the Cauchy criterion. If you look
more closely at how we proved that the sequence {1/n} was Cauchy, it should
be pretty clear that the approach is very similar to the approach for showing
that a sequence converges—except that we have two of the terms an in the
absolute value and we do not have the limit. As we will see when we consider
series in Chapter 8, the Cauchy criterion for convergence can be very useful—
because we often do not know the sum of our series.
HW 3.4.1 (True or False and why) (a) If lim_{n→∞} 1/an exists, then lim_{n→∞} an exists.
(b) Consider the sequence {an}. If the subsequences {a2n} and {a2n+1} both
converge, then the sequence {an} converges.
(c) If {an} is a sequence of rationals in [0, 1], then {an} has a subsequence that
converges to a rational in [0, 1].
(d) If an < 0 for all n > N for some N ∈ R and lim_{n→∞} an exists, then lim_{n→∞} an ≤ 0.
(e) The sequence {1/n^2} is a Cauchy sequence.

HW 3.4.2 Prove that there exists a subsequence {nk} of N such that {cos nk}
converges.

HW 3.4.3 Use the definition, Definition 3.4.9, to prove that {1/n^3} is a Cauchy
sequence.
3.5 The Monotone Convergence Theorem

At this time the methods we have to prove convergence of sequences are (i)
to use the definition (if we know what the limit is) and (ii) to use the limit
theorems to reduce our limit to one or more known limits (usually getting back
eventually to lim_{n→∞} c = c or lim_{n→∞} 1/n = 0). In this section we will include a
third approach for proving the convergence of sequences. We will discuss the
convergence of monotone sequences. Monotone sequences are a very important
class of sequences. We begin with the following definition.
Definition 3.5.1 (a) The sequence {an} is said to be monotonically increasing
if an+1 ≥ an for all n ∈ N.
If {an} is such that an+1 > an for all n ∈ N, the sequence is said to be strictly
increasing.
(b) The sequence {an} is said to be monotonically decreasing if an+1 ≤ an for
all n ∈ N.
If {an} is such that an+1 < an for all n ∈ N, the sequence is said to be strictly
decreasing.
A sequence {an} is said to be monotone if it is either monotonically increasing
or decreasing.
It should not be hard to see that the sequences {−1/n}, {−1/n^2}, {1 − 1/n^2} and
{3^n} are monotonically increasing (they're strictly increasing too) and that the
sequences {1/n}, {1/n^2}, {1 + 1/n^2} and {(1/2)^n} are monotonically decreasing
(and strictly decreasing). Likewise, it should be clear that the sequences
{(−1)^n}, {(−1)^n (1/n)} and {1 + (−1)^n (1/n)} are not monotonic sequences. The
easiest approach to demonstrate that a sequence such as {1 − 1/n^2} is monotone
increasing is by setting an+1 ≥ an, i.e.

    1 − 1/(n + 1)^2 ≥? 1 − 1/n^2

—which at this time we do not know is true (we have placed a question mark, ?,
on the inequality to indicate that you don't know that it is true)—and then
simplifying the inequality with reversible steps until you arrive at an inequality
that you know is true or that you know is false. In this case we see that

    1 − 1/(n + 1)^2 ≥? 1 − 1/n^2    is the same as
    1/(n + 1)^2 ≤? 1/n^2    (subtract 1 from both sides and multiply both
        sides by −1) is the same as
    n^2 ≤? (n + 1)^2 = n^2 + 2n + 1    (multiply both sides by n^2 and
        (n + 1)^2 and simplify) is the same as
    0 ≤? 2n + 1    (subtract n^2 from both sides).

We know that 0 ≤ 2n + 1 is true for all n ∈ N. Then it should be clear that we
can trace the steps used above backwards (add n^2 to both sides, write n^2 + 2n + 1
as (n + 1)^2, divide both sides by n^2 and (n + 1)^2, multiply both sides by −1 and
add 1 to both sides) to actually prove that an+1 = 1 − 1/(n + 1)^2 ≥ 1 − 1/n^2 = an
for all n ∈ N. You will see that most people do the first calculation and do
not do the second—and it shouldn't be necessary amongst friends. You should
realize (or verify) that the two groups of monotonic sequences given above can be
proved to be such by the same method.
It is much easier to prove that a sequence is not monotonic. Consider the
sequence {1 + (−1)^n (1/n)}. If we write out the first three terms in a row (the
first three make the arithmetic easier), 1 − 1 = 0, 1 + (1/2) = 3/2, 1 − 1/3 = 2/3,
we see that since a1 = 0 < a2 = 3/2 the sequence is not monotonically decreasing
(but it may be monotonically increasing) and since a2 = 3/2 > a3 = 2/3 the
sequence is not monotonically increasing. Therefore the sequence is not monotonic.
The sequences above are among the easiest for which to decide whether or not they are monotonic. There are sequences where it is more difficult to show that they are monotonic. Sometimes the algebra required to perform the computations analogous to those done above is next to impossible. One approach (which is cheating at this time but will be perfectly OK soon) is to use the fact that if the derivative of a function is positive (or negative), then the function is increasing (or decreasing). For example consider the sequence a_n = (n^2 + 3n + 1)/(2n + 3). Because

    d/dx [ (x^2 + 3x + 1)/(2x + 3) ] = (2x^2 + 6x + 7)/(2x + 3)^2 > 0 for x ≥ 1,

the function f(x) = (x^2 + 3x + 1)/(2x + 3) is increasing. Then for n ∈ N we see that n < n + 1 implies that

    a_n = f(n) < f(n + 1) = a_{n+1},

or that the sequence {a_n} is monotonically increasing (and strictly increasing).
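A small Python sketch (our own check; it is not part of the argument) can lend confidence in the quoted derivative: we compare the formula for f′(x) against a forward difference quotient and confirm that the sequence values are strictly increasing:

```python
# f(x) = (x^2 + 3x + 1)/(2x + 3) and the derivative formula
# f'(x) = (2x^2 + 6x + 7)/(2x + 3)^2 used in the text.

def f(x):
    return (x**2 + 3*x + 1) / (2*x + 3)

def fprime(x):
    return (2*x**2 + 6*x + 7) / (2*x + 3)**2

h = 1e-6
formula_ok = all(abs(fprime(x) - (f(x + h) - f(x)) / h) < 1e-4
                 for x in range(1, 20))
strictly_increasing = all(f(n) < f(n + 1) for n in range(1, 1000))
print(formula_ok, strictly_increasing)  # True True
```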
One last comment before we proceed to the Monotone Convergence Theorem. We notice that we have always proved that our sequences were strictly increasing or decreasing and only claimed that they were monotonically increasing
or decreasing. This was done this way because for the Monotone Convergence
Theorem, we only need that the sequences are monotonic. When it is important
to have the strict monotonicity, it is not difficult to shift gears, get it and use
it.
Theorem 3.5.2 Monotone Convergence Theorem
(a) If the sequence {an } is monotonically increasing and bounded above, the
sequence converges, and converges to lub{an : n ∈ N}.
(b) If the sequence {an } is monotonically decreasing and bounded below, the
sequence converges, and converges to glb{an : n ∈ N}.
(c) If a monotonic sequence is not bounded, then it does not converge.
Proof: This is a very important theorem that is especially nice because the
proof is really easy—in fact, when you think about it, it’s obvious. Consider
part (a). If the sequence is monotonically increasing, then it surely cannot be
the type of sequence that does not converge because it oscillates back and forth
between two distinct numbers. If the sequence is bounded, the sequence cannot
be the type of sequence that does not converge because it goes off to infinity.
There’s really nothing left.
We begin the proof as usual by supposing that we are given ǫ > 0. For convenience let S = {a_n : n ∈ N} and L = lub(S). Recall that from Proposition 1.5.3–(a) we know that for any ǫ > 0 there exists some a_{n_0} ∈ S such that L − a_{n_0} < ǫ. Then for all n > N = n_0 (Step 1: Define N), by the fact that the sequence {a_n} is monotonically increasing, a_n ≥ a_{n_0} > L − ǫ. Also for all n > N = n_0 (really for all n), because L = lub(S) is an upper bound of S, a_n ≤ L < L + ǫ. Therefore, for n > N we have

    L − ǫ < a_n < L + ǫ, or |a_n − L| < ǫ,

so lim_{n→∞} a_n = L.
(b) We will not include the proof of part (b). You should make sure that you
understand that part (b) follows from Proposition 1.5.3–(b) in the same way
that (a) followed from Proposition 1.5.3–(a).
(c) This statement was only included in the theorem for completeness. The contrapositive of the statement is that if the sequence is convergent, it is bounded; but we already know that to be true for any sequence (monotone or not) by Proposition 3.3.2–(c).
The Monotone Convergence Theorem has many applications and is an important theorem. At this time we will use it to prove a very useful limit.
Example 3.5.1 Prove that if |c| < 1 then lim_{n→∞} c^n = 0.

Solution: (You should be aware that if c = 1, the limit is one. If c > 1, the limit does not exist, or as we will soon show, the limit is infinity. If c ≤ −1, the limit does not exist. We will not prove these now.)

Case 1: Suppose we make it easy and assume that 0 < c < 1. By two induction proofs (see HW1.6.6) we see that 0 < c^n < 1 for all n ∈ N. If a_n = c^n, then by the fact that c < 1 and Proposition 1.3.7–(iii) we have a_{n+1} = c^{n+1} = c^n · c < c^n · 1 = a_n. Thus the sequence {a_n = c^n} is monotonically decreasing. Also since a_n = c^n > 0, the sequence is bounded below. Thus by Theorem 3.5.2–(b) we know that lim_{n→∞} c^n exists and equals L = glb(S) where S = {c^n : n ∈ N}.

Notice that since 0 < c^n for all n ∈ N, by HW1.5.1–(i) L ≥ 0. To show that L = 0 we suppose false, i.e. suppose that L > 0. Since L is a lower bound of S, L ≤ c^m for any m ∈ N. Specifically, if n ∈ N, then L ≤ c^{n+1} also. Then c^n = c^{n+1}/c ≥ L/c, or L/c is a lower bound of S. But since c < 1, L/c > L. This contradicts the fact that L = glb(S). Therefore L = 0 and lim_{n→∞} c^n = 0.

Cases 2 & 3: If c = 0, then c^n = 0 for all n so the result follows from HW3.2.1–(b). If −1 < c < 0, then we can write c^n = (−|c|)^n = (−1)^n |c|^n and the result follows from Case 1 and Corollary 3.4.3 (−1 < c < 0 implies that 0 < |c| < 1, so Case 1 implies that |c|^n → 0).
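Numerically the conclusion of Case 1 is easy to watch. In the sketch below (c = 0.9 is our own sample value) the terms c^n decrease monotonically, stay positive, and fall below any small tolerance:

```python
# The sequence c^n for 0 < c < 1: monotonically decreasing, bounded
# below by 0, and converging to 0.
c = 0.9
terms = [c**n for n in range(1, 400)]
decreasing = all(s > t for s, t in zip(terms, terms[1:]))
bounded_below = all(t > 0 for t in terms)
print(decreasing, bounded_below)  # True True
print(terms[-1])                  # a very small positive number
```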
Note that this example includes the limit proved in Example 3.2.4. Hopefully you realize that the limit considered above could also be proved using the same approach as we used in Example 3.2.4.
We next use Example 3.5.1, Proposition 3.4.2 and Corollary 3.4.3 to prove the convergence of another important limit.

Example 3.5.2 Prove that lim_{n→∞} a^n/n! = 0 for any a ∈ R.

Solution: Before we proceed, recognize that this is a strong result. No matter how large a is (and if a is large, a^n will get really large), eventually n! gets to be big enough to dominate the a^n term. (If you are interested, set a = 100 and compute a^n/n! for n = 150, 151, 152. You'll see they are getting smaller but they have a long way to go.) If you look at this proof carefully, you will see exactly how and why this happens.

To make the solution a bit easier we consider Case 1: a > 0. We begin by choosing M ∈ N such that M > a (we can do this by Corollary 1.5.4). Then for n > M, we see that

    a^n/n! = a^n/(M!(M+1)(M+2)···n) ≤ a^n/(M! M^{n−M}) = (M^M/M!)(a/M)^n    (3.5.1)

(there are n − M factors in the product (M+1)(M+2)···n, each greater than M). Since M is fixed, 0 < a^n/n! ≤ (M^M/M!)(a/M)^n, and (a/M)^n → 0 (because a/M < 1), we apply Proposition 3.4.2 to see that the sequence {a^n/n!} converges to 0.

As in the last example Cases 2 & 3 are easy. When a = 0 we have the trivial zero sequence. When a < 0, we see that a^n/n! = (−1)^n |a|^n/n!, so the result follows from Case 1 and Corollary 3.4.3.
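The computation suggested in the solution is easy to carry out. In the Python sketch below (M = 101 is our own choice of an integer exceeding a = 100), we also spot-check the bound (3.5.1) at n = 150:

```python
import math

a = 100
# a^n/n! at n = 150, 151, 152: already decreasing, though enormous
r = [a**n / math.factorial(n) for n in (150, 151, 152)]
shrinking = r[0] > r[1] > r[2]

# the bound (3.5.1): for n > M > a, a^n/n! <= (M^M/M!) (a/M)^n
M, n = 101, 150
bound_holds = a**n / math.factorial(n) <= (M**M / math.factorial(M)) * (a / M)**n
print(shrinking, bound_holds)  # True True
```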
You might note that the limit proved in Example 3.5.2 can be proved using the Monotone Convergence Theorem directly. It is an interesting application of the Monotone Convergence Theorem in that the sequence is not monotonically decreasing: set a = 100 and compute a_1, a_2, a_3, a_{150}, a_{151} and a_{152}. To apply the Monotone Convergence Theorem you apply it to the tail sequence {a^n/n!}_{n=M}^{∞} (where M is as in the last example). Again we begin with Case 1 where a > 0. Since n + 1 > M and M > a, we see that

    a^{n+1}/(n+1)! = (a^n/n!)(a/(n+1)) < (a^n/n!)(a/M) < a^n/n!,

so the sequence is monotonically decreasing. The sequence is bounded below by zero, so the limit exists and equals L = glb(S) where S = {a^n/n! : n ∈ N, n ≥ M}. As in Example 3.5.1 we assume that L > 0 and note that a^{n+1}/(n+1)! ≥ L for any n, and

    a^n/n! = (a^{n+1}/(n+1)!) / (a/(n+1)) ≥ L/(a/M) = ML/a.

Since this is true for any n ≥ M, ML/a is also a lower bound of S, and ML/a > L (because M > a), so L cannot be the greatest lower bound. Therefore L = 0.

We see that the tail end of the given sequence converges, hence the sequence converges; recall the discussion of tail ends of sequences at the end of Section 3.2.
HW 3.5.1 (True or False and why) (a) The sequence {sin(1/n)} is monotone.
(b) The sequence {n + (−1)^n/n} is monotone.
(c) The sequence {n/2^n} is monotone.
(d) The sequence {(n + 1)/(n + 2)} is monotonically decreasing.
HW 3.5.2 Suppose S ⊂ R is bounded above and not empty, and set s = lub(S).
Prove that there exists a monotonically increasing sequence {an } ⊂ S such that
s = limn→∞ an .
3.6
Infinite Limits
As we stated earlier we do want to have the concept of infinite limits. The fact that lim_{n→∞} (n^2 + 1) does not exist is not on an equal footing with the fact that lim_{n→∞} (−1)^n does not exist.

When we introduced the limit of a sequence we gave you the following explanation (that we told you we liked): "for every measure of closeness to L" there exists "a measure of closeness to ∞" so that whenever "n is close to ∞", "a_n is close to L." To be able to define when lim_{n→∞} a_n = ∞ it should be clear that we want a definition that will satisfy "for every measure of closeness to ∞" there exists "a measure of closeness to ∞" so that whenever "n is close to ∞", "a_n is close to ∞." We use the same type of measure of closeness of a_n to ∞ as we do for the measure of closeness of n to ∞. We obtain the following definition.
Definition 3.6.1 Consider a real sequence {a_n}.
(a) lim_{n→∞} a_n = ∞ if for every M > 0 there exists an N ∈ R such that n > N implies that a_n > M.
(b) lim_{n→∞} a_n = −∞ if for every M < 0 there exists an N ∈ R such that n > N implies that a_n < M.
We will say either that a_n converges to ∞ (or −∞) or a_n diverges to ∞ (or −∞). From this point on we will no longer claim that a limit such as lim_{n→∞} (n^2 + 1) does not exist. We will say that lim_{n→∞} (1/n) exists and equals 0, lim_{n→∞} (n^2 + 1) exists and equals ∞, and lim_{n→∞} (−1)^n does not exist.

Since we have made the claim that lim_{n→∞} (n^2 + 1) = ∞, we had better prove it.
Example 3.6.1 Prove that lim_{n→∞} (n^2 + 1) = ∞.

Solution: As you will see, the proofs of infinite limits are very much like the proofs of finite limits, maybe easier. We still have two basic steps. Step 1: Define N, and Step 2: Show that N works. We suppose that we are given an M > 0. We want an N so that n > N implies that n^2 + 1 > M. As we did in the case of finite limits, we solve this inequality for n, i.e.

    n^2 + 1 > M is the same as n^2 > M − 1 is the same as n > √(M − 1).

Therefore we want to define N = √(M − 1) (Step 1: Define N). Then if n > N = √(M − 1), n^2 > M − 1 and n^2 + 1 > M (Step 2: N works).

Before we say that we are done we should note that what we have done above is not quite correct. The definition must hold for any M > 0, and if 0 < M < 1, M − 1 < 0 so we cannot take the square root of M − 1 (but using an M between 0 and 1 to measure whether a sequence is going to infinity is not the smartest thing to do anyway). However, we must satisfy the definition (this technicality is analogous to large ǫ's when we are considering finite limits). The approach is to take two cases, 0 < M < 1 and M ≥ 1.

Case 1: (0 < M < 1) Choose N = 1. Then n > N = 1 implies that n^2 + 1 > M (this is assuming that the sequence starts at either n = 0 or n = 1).

Case 2: (M ≥ 1) Proceed as we did originally; now √(M − 1) makes sense.
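The two-case choice of N can be exercised numerically; in this Python sketch (the sample values of M are our own) every integer n > N yields n^2 + 1 > M:

```python
import math

def N_for(M):
    # Case 1 (0 < M < 1): N = 1 suffices; Case 2 (M >= 1): N = sqrt(M - 1).
    return 1.0 if M < 1 else math.sqrt(M - 1)

ok = all(
    all(n**2 + 1 > M for n in range(int(N_for(M)) + 1, int(N_for(M)) + 101))
    for M in (0.5, 1.0, 10.0, 1e6)
)
print(ok)  # True
```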
We include one more infinite limit example because we hinted at the result in the last section, but we warn you that, as was the case with Example 3.2.4, we will again cheat in that we will use the logarithm and exponential functions.

Example 3.6.2 Prove that if c > 1 then lim_{n→∞} c^n = ∞.

Proof: As before we assume that we are given M > 0. We want N so that n > N implies that c^n > M. We solve the last inequality for n by taking the logarithm of both sides to get ln c^n = n ln c > ln M, or n > ln M/ln c. We choose N = ln M/ln c (Step 1: Define N). Then n > N = ln M/ln c implies that n ln c > ln M or ln c^n > ln M. Taking the exponential of both sides (the exponential function is also increasing) gives c^n > M (Step 2: N works). Therefore c^n → ∞ as n → ∞.

We should note that some of the reasons that make the above steps correct include the following facts. The logarithm and exponential functions are increasing, so the inequalities stay in the same direction when these functions are applied. We were given that c > 1 so that ln c > 0, so the inequalities stay in the same direction when we divide by or multiply by ln c. And if 0 < M < 1, ln M < 0, but it is permissible to have a negative N because if we assume that the sequence starts at either n = 0 or n = 1, then for all n ≥ 0 > N = ln M/ln c we have c^n > M.
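The same kind of spot-check works here (c = 1.01 and the values of M, including one below 1, are our own choices):

```python
import math

c = 1.01
results = []
for M in (0.5, 10.0, 1e6):
    N = math.log(M) / math.log(c)   # the proof's choice N = ln M / ln c
    n0 = max(0, int(N) + 1)         # first index beyond N (N may be negative)
    results.append(all(c**n > M for n in range(n0, n0 + 50)))
print(all(results))  # True
```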
We should include the last few cases here. If c = 1, the sequence is the trivial sequence of all ones so c^n → 1. If c = −1, the sequence is the sequence that we considered in Example 3.2.6, so lim_{n→∞} c^n does not exist. And if c < −1 the sequence oscillates between values that are large in magnitude but are positive for even n and negative for odd n. We consider the three potential limits, L ∈ R, ∞ and −∞. Since c^n grows infinitely large for n even, the limit could not be any finite L or −∞. Since c^n becomes infinitely negative when n is odd, the limit could not be ∞. Therefore lim_{n→∞} c^n does not exist. (Of course, all of these statements would have to be proved.)
We want to emphasize the point that the limit theorems stated and proved
in Sections 3.3 and 3.4 do not apply to infinite limits—we always had the assumption that the limits were L, L1 or L2 and they were in R. It should not
surprise you that there are limit theorems for infinite limits—and that they are
not as nice as the theorems for finite limits. We include some of the results
without proof. The proofs of these results are easy. You should know that there
are more results available.
Proposition 3.6.2 Suppose that {a_n} and {b_n} are real sequences. We have the following results.
(a) If lim_{n→∞} a_n = ∞ and lim_{n→∞} b_n = L_2 where L_2 ∈ R or is ∞, then a_n + b_n → ∞ as n → ∞.
(b) If lim_{n→∞} a_n = −∞ and lim_{n→∞} b_n = L_2 where L_2 ∈ R or is −∞, then a_n + b_n → −∞ as n → ∞.
(c) If lim_{n→∞} a_n = ∞ and c ∈ R is such that c > 0, then c a_n → ∞.
(d) If lim_{n→∞} a_n = −∞ and c ∈ R is such that c > 0, then c a_n → −∞.
HW 3.6.1 (True or False and why)
(a) lim_{n→∞} (2n^2 − 3n^3) = ∞ − ∞ = 0.
(b) lim_{n→∞} (2n^2 − 3n^3) does not exist.
(c) lim_{n→∞} (2n^2 − 3n^3) = −∞.
(d) lim_{n→∞} 2n^2/(3n^3 + 1) = (lim_{n→∞} 2n^2)/(lim_{n→∞} (3n^3 + 1)) = ∞/∞ = 1.
(e) lim_{n→∞} 2n^2/(3n^3 + 1) = 0.

HW 3.6.2 Prove that lim_{n→∞} n^2/(n + 1) = ∞.
Chapter 4
Limits of Functions
4.1
Definition of the Limit of a Function
In a way the title of this chapter is bad or misleading. We saw that a sequence is a function (a function that has N as its domain) and we defined a limit of a sequence. The difference is that in this chapter we will define limits of functions defined on the reals or subsets of the reals that are generally much larger than N. Whereas in the case of the limit of a sequence we considered the limit of f(n) as n approaches infinity, we will now consider the limit of f(x) as x approaches x_0 for some x_0 ∈ R. The limit that we will consider in this chapter is the limit that you studied so hard in your basic calculus course and used to define the derivative.

We begin by considering f : D → R where the domain and range of f, D and R, are subsets of R. We suppose that x_0 ∈ R but very importantly do not require that x_0 ∈ D. We do, however, require that x_0 be a limit point of D, i.e. every neighborhood of x_0 must contain some x ∈ D, x ≠ x_0. Thus x_0 need not be in D but it must be close to D. Of course, we will write the limit of f(x) as x approaches x_0 equals L as lim_{x→x_0} f(x) = L.
The limit considered in this chapter will be analogous to the sequential limit, so we must be able to characterize the limit by "for every measure of closeness to L" there exists "a measure of closeness to x_0" so that whenever "x is close to x_0", "f(x) is close to L." It should not surprise us that we can handle "f(x) is close to L" very much as we did for the sequential limit. The difference is that instead of n being close to ∞, we must now have the concept that x is close to x_0, but that idea should not be too difficult to comprehend. We make the following definition.
Definition 4.1.1 For f : D → R, D, R ⊂ R, x_0, L ∈ R and x_0 a limit point of D, we say that lim_{x→x_0} f(x) = L if for every ǫ > 0 there exists a real δ > 0 such that x ∈ D, 0 < |x − x_0| < δ implies that |f(x) − L| < ǫ.
If lim_{x→x_0} f(x) = L we say that f(x) converges to L as x goes to x_0, or sometimes write f(x) → L as x → x_0.

We see that the biggest difference between the definition of a sequential limit and Definition 4.1.1 is that the statement "there exists an N such that n > N implies that |a_n − L| < ǫ" is replaced by "there exists a δ such that 0 < |x − x_0| < δ implies that |f(x) − L| < ǫ." The measure of closeness to infinity is all of the n's greater than some N, whereas the measure of closeness to x_0 is all of the x's within some δ distance from x_0.
There are two pieces of the above definition of which we want to make special note. The first is the "0 <" part of the requirement that we want to consider x's such that 0 < |x − x_0| < δ. We want (need?) the limit to be applicable to derivatives where the function under consideration is of the form

    (f(x) − f(x_0))/(x − x_0),

i.e. we eventually want to use limits to define derivatives. This function is not defined at x = x_0, so if we want to take a limit of this function as x approaches x_0, we surely do not want to require that x ever equals x_0. We will soon see when it is important to allow for x_0 ∉ D and when it is not, and how to handle it when it is important.
Another point of the definition is that we only consider x's such that 0 < |x − x_0| < δ and x ∈ D. Of course we don't want to consider any x's that are not in D, because then it would be stupid to write f(x) if x ∉ D. However, the requirement that x_0 is a limit point of D ensures us that there are points in the domain D that are arbitrarily close to x_0, i.e. for any δ > 0 there are some points x ∈ D such that 0 < |x − x_0| < δ. Otherwise the limit definition is nonsensical at x_0.
Graphical description of the definition of a limit There is a graphical
description of Definition 4.1.1. In Figure 4.1.1 we first plotted a function, chose
a point x0 , and then projected that point up to the curve and across to the
y-axis to define L. Thus that part of the plot gives us the function, f , the point
at which we want the limit, x0 , and the limiting point, L. We are given an
ǫ > 0 so we plot the points L ± ǫ. We then project these two points across to
the curve and down to the x-axis. We denote these two points as x0 − δ2 and
x0 + δ1 . This notation is really defining the size of δ1 and δ2 .
We note that whenever the curve is nonlinear, δ_1 ≠ δ_2. In this case δ_1 < δ_2.
More importantly you should realize that for any x between x0 − δ2 and x0 + δ1 ,
f (x) will be between L − ǫ and L + ǫ—you choose any such x, project the point
vertically to the curve and then horizontally to the y-axis. We want to find a δ so that whenever 0 < |x_0 − x| < δ (or x_0 − δ < x < x_0 + δ, x ≠ x_0), then f(x) will satisfy |f(x) − L| < ǫ (or L − ǫ < f(x) < L + ǫ). Hopefully you realize that what we have in the picture and what we want are close. If you choose δ = min{δ_1, δ_2} (Step 1: Define δ), the point x_0 + δ will be at x_0 + δ_1 (because we claimed that δ_1 < δ_2, so in this case δ = min{δ_1, δ_2} = δ_1) and x_0 − δ will be inside of x_0 − δ_2. Hence by Figure 4.1.1 it should be clear that whenever 0 < |x_0 − x| < δ, |f(x) − L| < ǫ (Step 2: δ works).
You should realize that anytime you have an acceptable candidate for the
δ, i.e. one that works, you can always choose a smaller δ. For example, it
Figure 4.1.1: Plot of a function, the y = L ± ǫ corridor and the x_0 − δ_2 to x_0 + δ_1 corridor.
is clear from the picture that everything between x0 − δ1 (remembering that
δ1 < δ2 ) and x0 + δ1 will get mapped into the region (L − ǫ, L + ǫ). So it
should be clear that if we chose δ = δ1 /13, then all points in the interval
(x0 − δ, x0 + δ) = (x0 − δ1 /13, x0 + δ1 /13) would also get mapped into the region
(L − ǫ, L + ǫ). And, of course there is nothing special about 13 (except that it
is a very nice integer). In this case any δ such that 0 < δ < δ1 will work.
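The projections in Figure 4.1.1 can be imitated numerically. In the sketch below the concrete choices f(x) = x^2, x_0 = 2, L = 4 and ǫ = 1/2 are our own (the book's figure is generic); for this convex curve δ_1 < δ_2, and every sampled x inside the punctured δ-interval lands in the ǫ-corridor:

```python
import math

x0, L, eps = 2.0, 4.0, 0.5
delta1 = math.sqrt(L + eps) - x0   # projection of L + eps down to the x-axis
delta2 = x0 - math.sqrt(L - eps)   # projection of L - eps down to the x-axis
delta = min(delta1, delta2)

# sample x strictly inside the punctured interval (x0 - delta, x0 + delta)
xs = [x0 + 0.99 * delta * (k - 500) / 500 for k in range(1001)]
inside_corridor = all(abs(x**2 - L) < eps for x in xs if x != x0)
print(delta1 < delta2, inside_corridor)  # True True
```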
The second note that we should make about this example is that we have not
done anything to eliminate the point ”x0 ” from our deliberations, i.e. we have
not done anything to allow for the ”0 <” part of the requirement 0 < |x−x0 | < δ.
The reason is that in this case the function is sufficiently nice that we don’t have
to. In this case it is clear that f (x0 ) = L so that when |x − x0 | is actually zero,
i.e. when x = x_0, then |f(x) − L| = |f(x_0) − L| = 0 < ǫ. The point is that once we have the δ, we only need to satisfy the following: if x is such that 0 < |x − x_0| < δ, then |f(x) − L| < ǫ. If whenever x is such that |x − x_0| < δ we have |f(x) − L| < ǫ, the above statement will be satisfied (plus nice info at one extra point that we didn't need). This happens because f is a nice function. We will see that this
is not always the case.
It should not surprise you to hear that we can also rewrite Definition 4.1.1
in terms of neighborhoods. We define a punctured neighborhood of a point
x0 to be the set (x0 − r, x0 + r) − {x0 } = (x0 − r, x0 ) ∪ (x0 , x0 + r) for some
r > 0, i.e. the same as a neighborhood of x0 except that we eliminate the point
x_0. We denote a punctured neighborhood of x_0 by N̂_r(x_0). We can then restate Definition 4.1.1 as follows: lim_{x→x_0} f(x) = L if for every neighborhood of L, N_ǫ(L), there exists a punctured neighborhood of x_0, N̂_δ(x_0), such that x ∈ N̂_δ(x_0) ∩ D implies that f(x) ∈ N_ǫ(L). Again there is only a difference of notation between
this version of the definition and Definition 4.1.1.
Two limit theorems Before we proceed to apply the definitions to some specific examples, we are going to prove two propositions. The first is the analog
to Proposition 3.3.1. It would be best—in fact it is imperative—that when we
do have a value of L satisfying Definition 4.1.1, there isn’t some other L1 that
would also satisfy the definition. We have the following proposition.
Proposition 4.1.2 Suppose that f : D → R, D, R ⊂ R, x_0 ∈ R and x_0 is a limit point of D. If lim_{x→x_0} f(x) exists, it is unique.
Proof: The proof of this proposition is very similar to that of Proposition 3.3.1. We suppose the proposition is false and that there are at least two limits, lim_{x→x_0} f(x) = L_1 and lim_{x→x_0} f(x) = L_2, L_1 ≠ L_2. For convenience let us suppose that L_1 > L_2 (one or the other must be larger). Choose ǫ = |L_1 − L_2|/2 = (L_1 − L_2)/2. Since lim_{x→x_0} f(x) = L_1, we know that for the ǫ given there exists a δ_1 such that 0 < |x − x_0| < δ_1 implies that |f(x) − L_1| < ǫ. This inequality can be rewritten as

    −ǫ + L_1 < f(x) < ǫ + L_1, or (L_1 + L_2)/2 < f(x) < (3L_1 − L_2)/2.    (4.1.1)

Likewise since lim_{x→x_0} f(x) = L_2, we know that for the ǫ given above there exists a δ_2 such that 0 < |x − x_0| < δ_2 implies that |f(x) − L_2| < ǫ. This inequality can be rewritten as

    −ǫ + L_2 < f(x) < ǫ + L_2, or (3L_2 − L_1)/2 < f(x) < (L_1 + L_2)/2.    (4.1.2)

Let δ = min{δ_1, δ_2} and consider x such that 0 < |x − x_0| < δ. Then both inequalities (4.1.1) and (4.1.2) will be satisfied. If we take the leftmost part of inequality (4.1.1) and the rightmost part of inequality (4.1.2) we get

    (L_1 + L_2)/2 < f(x) < (L_1 + L_2)/2.
Of course this is impossible so we have a contradiction (and because x0 is a limit
point of D we know that there are some values of x at which this contradiction
actually occurs), two such L’s do not exist and our limit is unique.
We note that the hypothesis that "x_0 is a limit point of D" is a very important hypothesis for this result. If x_0 is not required to be a limit point of D, the limit would not be unique at x_0; the limit could be anything at such points.

Our next result will be very important to us. It is logical to try to relate the limit of Definition 4.1.1 to that of the sequential limits. We do this with the following proposition.
Proposition 4.1.3 Suppose that f : D → R, D ⊂ R, x_0, L ∈ R and x_0 is a limit point of D. Then lim_{x→x_0} f(x) = L if and only if for any sequence {a_n} such that a_n ∈ D for all n, a_n ≠ x_0 for any n, and lim_{n→∞} a_n = x_0, we have lim_{n→∞} f(a_n) = L.
Proof of Proposition 4.1.3 (⇒) We begin by assuming the hypothesis that lim_{x→x_0} f(x) = L and suppose that we are given a sequence {a_n} with a_n ∈ D for all n, a_n ≠ x_0 for any n and a_n → x_0. We also suppose that we are given some ǫ > 0. We must find an N such that n > N implies that |f(a_n) − L| < ǫ.

Because lim_{x→x_0} f(x) = L, we get a δ such that

    if 0 < |x − x_0| < δ, then |f(x) − L| < ǫ.    (4.1.3)

We apply the definition of the fact that a_n → x_0 with the "traditional ǫ replaced by δ" to get an N ∈ R such that

    n > N implies that |a_n − x_0| < δ.    (4.1.4)

(Step 1: Define N.)

Now suppose that n > N. We first apply statement (4.1.4) above to see that |a_n − x_0| < δ. By the fact that we assumed that the sequence {a_n} satisfies a_n ≠ x_0 for all n, we know that 0 < |a_n − x_0|, i.e. for n > N we have 0 < |a_n − x_0| < δ. We then apply statement (4.1.3) (with x replaced by a_n) to see that |f(a_n) − L| < ǫ (Step 2: N works). Therefore lim_{n→∞} f(a_n) = L.
(⇐) We now assume that if {a_n} is any sequence such that a_n ∈ D for all n, a_n ≠ x_0 for any n and a_n → x_0, then lim_{n→∞} f(a_n) = L. We assume that the proposition is false, i.e. that lim_{x→x_0} f(x) does not converge to L. This means that there is some ǫ > 0 such that for any δ there exists an x-value, x_δ ∈ D, such that 0 < |x_δ − x_0| < δ and |f(x_δ) − L| ≥ ǫ, i.e. for any δ there is at least one bad value x_δ.

The emphasis is that the above last statement is true for any δ.

Let δ = 1: Then there exists an x_δ value, call it a_1, such that 0 < |a_1 − x_0| < 1 and |f(a_1) − L| ≥ ǫ.

Let δ = 1/2: Then there exists an x_δ value, call it a_2, such that 0 < |a_2 − x_0| < 1/2 and |f(a_2) − L| ≥ ǫ. (It happens for any δ.)

We could go next to 1/3, then 1/4, etc., except that it gets old. We'll jump to a general n.

Let δ = 1/n: Then there exists an x_δ value, call it a_n, such that 0 < |a_n − x_0| < 1/n and |f(a_n) − L| ≥ ǫ.

And of course this works for all n ∈ N. We have a sequence {a_n} such that a_n ≠ x_0 for all n (true because of the "0 <" part of the restriction). We also have |a_n − x_0| < 1/n for all n. This implies that a_n → x_0 (see HW3.2.1). Then by our hypothesis we know that f(a_n) → L, i.e. for any ǫ > 0 (including specifically the ǫ given to us above) there exists an N such that n > N implies |f(a_n) − L| < ǫ. But for this sequence we have that |f(a_n) − L| ≥ ǫ for all n ∈ N. This is a contradiction; therefore the assumption that "lim_{x→x_0} f(x) does not converge to L" is false, and lim_{x→x_0} f(x) = L.
Comments concerning Proposition 4.1.3 (i) We first note that since we
have an ”if and only if” result with the definition on one side, this gives us a
statement equivalent to our definition. It is the case that the right side of the above proposition is used as the definition of a limit in some textbooks. Our definition is surely the more traditional one. Once we have Proposition 4.1.3, who cares: we can use either the definition or Proposition 4.1.3, whichever best suits us at the time.
(ii) It should seem that the restrictions on the sequences {a_n} are not especially nice in that we always have to assume that a_n ≠ x_0 for any n. However, it is fairly obvious that this is necessary; it's necessary because f may not be defined at x = x_0. A lot of the sequences that converge to x_0 take on the value x_0 once or many times, for example the very nice sequence {x_0, x_0, · · · }. Such sequences are not allowed, because we do have and want the "0 <" as a part of our definition of a limit. But in the end, as long as you remember that the restriction is necessary, it doesn't seem to cause undue difficulties.
(iii) And finally, it might seem that it would be very hard to apply the ⇐ direction of Proposition 4.1.3 because you have to consider a lot of sequences: all of the sequences such that a_n ∈ D for all n, a_n ≠ x_0 for any n and a_n → x_0. But often this is not a terrible burden if you can just consider a general sequence.

Application of Proposition 4.1.3 is an especially nice way to show that a given limit does not exist or is not L. If we can find one sequence {a_n} such that a_n ≠ x_0 and a_n → x_0 but f(a_n) does not converge to L, then we know that lim_{x→x_0} f(x) ≠ L (f(a_n) must approach L for all such sequences). If we can find one sequence {a_n} such that a_n ≠ x_0 and a_n → x_0 but lim_{n→∞} f(a_n) does not exist, then lim_{x→x_0} f(x) does not exist (lim_{n→∞} f(a_n) must exist and equal L for all such sequences).
HW 4.1.1 (True or False and why)
(a) Suppose D = [0, 1] ∪ {2} and define f : D → R by f(x) = x^2. For any ǫ > 0, any x such that 0 < |x − 2| < 1, x ∈ D implies |f(x) − 4| < ǫ. Then lim_{x→2} f(x) = 4.
(b) Suppose D = [0, 1) and define f : D → R by f(x) = x^2. Let {a_n} be any sequence such that a_n ∈ [0, 1), a_n ≠ 0, and a_n → 1 as n → ∞. By Proposition 3.3.2–(d) lim_{n→∞} f(a_n) = lim_{n→∞} a_n^2 = 1 · 1 = 1. Then lim_{x→1} f(x) = 1.
(c) Suppose f : D → R, D ⊂ R, x_0 ∈ D is such that for any sequence {a_n} such that a_n ∈ D and a_n → x_0, we have f(a_n) → f(x_0). Then lim_{x→x_0} f(x) = f(x_0).
(d) Suppose f : D → R, D ⊂ R, x_0, L ∈ R and x_0 is a limit point of D. The negation of the statement lim_{x→x_0} f(x) = L is "for some ǫ > 0 there exists a δ such that 0 < |x − x_0| < δ implies |f(x) − L| ≥ ǫ."
(e) lim_{x→2} (3x + 2) = 8.

HW 4.1.2 Prove that lim_{x→0} |x| = 0 (Hint: Consider HW3.3.2).

HW 4.1.3 Prove that lim_{x→1} (2x + 3) = 5 (Hint: Consider Proposition 3.3.2).
HW 4.1.4 Suppose that F(y) < 0 for all y in some punctured neighborhood of y_0, and suppose that lim_{y→y_0} F(y) exists. Prove that lim_{y→y_0} F(y) ≤ 0.
4.2
Applications of the Definition of the Limit
In the last section we introduced the definition of a limit of a function. In
this section we will learn how to apply the definition to particular functions and
points. Again we want to emphasize that when we are applying Definition 4.1.1,
we will always follow the two steps, Step 1: Define δ, and Step 2: Show that
the δ works.
In addition to introducing the definition of a limit in the last section we
also proved Proposition 4.1.3—which gave an alternative equivalent definition
of the limit of a function. All of these examples can be done both using the
definition and using Proposition 4.1.3. As you will see, most often it is easier
to apply Proposition 4.1.3 (we have already done most of the work in Sections
3.3 and 3.4). However, we do want you to be familiar with the definition. For
this reason we will do each of these examples twice: using Definition 4.1.1 and
using Proposition 4.1.3.
We now consider several examples.
Example 4.2.1 Prove that lim_{x→3} (2x + 3) = 9.

Using Definition 4.1.1: We suppose that we are given an ǫ > 0. We must find the δ (Step 1) that will satisfy the definition. This is an easy example, and it is easy to use the graphical approach to find the δ. In HW4.2.2 you will be given the problem of proving this limit graphically. At this time we will introduce the method that is the most common approach because it works for a wider class of problems; the method is very close to the method used for proving sequential limits.

We need x to satisfy |f(x) − 9| = |(2x + 3) − 9| = |2x − 6| < ǫ. This last inequality is the same as |2(x − 3)| = 2|x − 3| < ǫ, or |x − 3| < ǫ/2. According to Definition 4.1.1, we must find a δ such that 0 < |x − 3| < δ implies that |f(x) − 9| < ǫ. But the above calculation shows that |f(x) − 9| < ǫ is equivalent to |x − 3| < ǫ/2. Thus if we choose δ = ǫ/2 and require that x satisfy |x − 3| < δ = ǫ/2, we can multiply by 2 to get 2|x − 3| < ǫ, or |2(x − 3)| = |(2x + 3) − 9| = |f(x) − 9| < ǫ.

Thus, if we choose δ = ǫ/2 (Step 1: Define δ), the above calculation shows that this delta works, i.e. |x − 3| < δ implies that |f(x) − 9| < ǫ (Step 2: The δ works). Therefore lim_{x→3} (2x + 3) = 9.

Note again that, as in the case with the example given in Figure 4.1.1, we have shown that |x − 3| < δ implies that |f(x) − 9| < ǫ where we are only required to show that 0 < |x − 3| < δ implies that |f(x) − 9| < ǫ. It is always permissible to show something stronger than what we need to show. Again this is possible for this example because it is just about the second easiest example possible.

Using Proposition 4.1.3: We suppose that we are given a sequence {a_n} such that a_n ≠ 3 for any n and a_n → 3 (any such sequence). Then we know by Proposition 3.3.2 parts (a) and (b), and HW3.2.1–(b) that

    lim_{n→∞} (2a_n + 3) = lim_{n→∞} (2a_n) + lim_{n→∞} 3 = 2 lim_{n→∞} a_n + 3 = 2(3) + 3 = 9.

Therefore lim_{x→3} (2x + 3) = 9. Admittedly much easier.
Note in the above example when we applied the definition, we started with
the inequality that we need to be satisfied, |f (x) − 9| < ǫ. We then proceeded
to manipulate this inequality until we were able to isolate a term of the form
|x − 3|. This led to an easy definition of δ. The algebra of inequalities will not
always be as easy but it will always be possible to isolate the term |x − x0 |.
Observe this occurrence as we proceed.
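The δ = ǫ/2 found above can also be checked numerically. The following Python sketch (the names f and check_delta are ours, introduced purely for illustration; no numerical experiment replaces the proof) samples points x with 0 < |x − 3| < δ and confirms |f(x) − 9| < ǫ:

```python
# Numerical sanity check of Example 4.2.1: f(x) = 2x + 3 with delta = eps/2.
# This illustrates the proof; it does not replace it.

def f(x):
    return 2 * x + 3

def check_delta(eps, n_samples=1000):
    """Sample points with 0 < |x - 3| < delta and test |f(x) - 9| < eps."""
    delta = eps / 2                        # the delta chosen in the proof
    for k in range(1, n_samples + 1):
        h = delta * k / (n_samples + 1)    # so that 0 < h < delta
        for x in (3 - h, 3 + h):           # approach x0 = 3 from both sides
            if not abs(f(x) - 9) < eps:
                return False
    return True

print(all(check_delta(eps) for eps in (1.0, 0.1, 1e-6)))
```

The script prints True for each sampled ǫ; only the algebraic argument above covers all ǫ and all x at once.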
4. Limits of Functions
Example 4.2.2
Prove that lim_{x→2} x² = 4.
Solution: Using Definition 4.1.1: We begin as we did in the last problem. We suppose that ǫ > 0 is given. We must find δ. Eventually we must satisfy the inequality |f(x) − L| = |x² − 4| = |(x − 2)(x + 2)| = |x − 2||x + 2| < ǫ. Notice that the |x − 2| term is included in the second to the last term of the inequality—as we promised it would be. The next step is tougher. We cannot divide by |x + 2|, get |x − 2| < ǫ/|x + 2| and define δ = ǫ/|x + 2|—even though this would be analogous to what we did in Example 4.2.1. The δ that we find can depend on ǫ (like the N's almost always depended on ǫ). If we are taking a limit as x approaches a general point x0, the δ can depend on x0—it's a fixed value. δ cannot depend on x.
We used the bold face above because we did want to make that point extremely clear—otherwise (and maybe in spite of it) someone would make that mistake. The last couple of sentences of the above paragraph are very important.
We return to the inequality that we want satisfied, |x − 2||x + 2| < ǫ. The technique we use is to bound the |x + 2| term. We do this by choosing a temporary fixed δ1, say δ1 = 1, and assuming that |x − 2| < δ1 = 1. Then −1 < x − 2 < 1, 1 < x < 3 and 3 < x + 2 < 5. The last inequality implies that |x + 2| < 5. Could it be less for some x? Of course it could. However it could be very close to 5—and never bigger. Therefore if we assume that x satisfies |x − 2| < δ1 = 1, then |x − 2||x + 2| < 5|x − 2|. If we then set 5|x − 2| < ǫ, we see that |x − 2| < ǫ/5, so it's logical to define δ = ǫ/5. But this alone is wrong. If we review this paragraph carefully, we see that |x − 2||x + 2| < 5|x − 2| < 5δ = 5(ǫ/5) = ǫ only if x satisfies both |x − 2| < δ1 = 1 and |x − 2| < δ = ǫ/5.
Therefore the way to do it is to forget our earlier definition of δ and define δ to be δ = min{1, ǫ/5} (Step 1: Define δ). Then if x satisfies |x − 2| < δ, x will satisfy both |x − 2| < 1 and |x − 2| < ǫ/5. Then

    |x² − 4| = |x − 2||x + 2| <∗ 5|x − 2| <∗∗ 5(ǫ/5) = ǫ

(Step 2: Show that the defined δ works), where inequality "<∗" is satisfied because |x − 2| < 1 (δ = min{1, ǫ/5} ≤ 1) and inequality "<∗∗" is satisfied because |x − 2| < ǫ/5 (δ = min{1, ǫ/5} ≤ ǫ/5). Therefore lim_{x→2} x² = 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an} such that an ≠ 2 for any n and an → 2. By Proposition 3.3.2–(d) we see that

    lim_{n→∞} an² = (lim_{n→∞} an)(lim_{n→∞} an) = 2 · 2 = 4.

Therefore lim_{x→2} x² = 4.
For the application of the definition, if you compare Examples 4.2.1 and
4.2.2, you realize that the difference is that the function in Example 4.2.1 is
linear and that is why it is so easy to apply the definition. For most functions (at
least all nonlinear functions) you will have to apply some version of the method
(trick?) used in Example 4.2.2. Of course application of Proposition 4.1.3 lets
us skip these difficulties. Recall that in the proof of Proposition 3.3.2–(d), we
used part (c) of the proposition—the result that guaranteed the boundedness
of a convergent sequence. Thus Proposition 4.1.3 also must use some sort of
boundedness result—albeit, very indirectly.
We might note that in the application of the definition we used δ1 = 1 (we
call it δ1 because it’s sort of the first approximation of our δ) because 1 is a really
nice number. If we had used δ1 = 1/2, we would have gotten 7/2 < x + 2 < 9/2.
In this case we see that |x + 2| < 9/2, so |x − 2||x + 2| < (9/2)|x − 2| and we
would define δ to be δ = min{1/2, ǫ/(9/2)}. If instead we had used δ1 = 2, then
we would find that |x + 2| < 6 and would eventually define δ = min{2, ǫ/6}.
Any of these choices would give you a correct result. As we see in the next
example it is sometimes important to be careful how we choose δ1 .
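These alternatives are easy to check numerically as well. In the Python sketch below (the names delta_for and works are ours, for illustration only), each choice of δ1 gives the bound B = 4 + δ1 on |x + 2|, hence δ = min{δ1, ǫ/B}, and every resulting δ passes a sampled test:

```python
# Example 4.2.2 with different temporary bounds delta1: each yields a bound
# B = 4 + delta1 on |x + 2| and hence delta = min(delta1, eps/B).

def delta_for(eps, delta1):
    B = 4 + delta1          # |x - 2| < delta1  implies  |x + 2| < 4 + delta1
    return min(delta1, eps / B)

def works(eps, delta, n=500):
    """Sample 0 < |x - 2| < delta and test |x**2 - 4| < eps."""
    for k in range(1, n + 1):
        h = delta * k / (n + 1)      # so that 0 < h < delta
        for x in (2 - h, 2 + h):     # both sides of x0 = 2
            if not abs(x * x - 4) < eps:
                return False
    return True

eps = 0.01
print(all(works(eps, delta_for(eps, d1)) for d1 in (0.5, 1.0, 2.0)))
```

Any of the three δ1's produces a working δ, exactly as claimed in the text.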
4.2 Applications of the Definition
Example 4.2.3
Prove that lim_{x→−2} (x − 2)/(x + 3) = −4.
Solution: Using Definition 4.1.1: For this problem we proceed as we have before and assume that ǫ > 0 is given. We want a δ so that when 0 < |x − (−2)| = |x + 2| < δ, |(x − 2)/(x + 3) − (−4)| < ǫ. We see that

    |(x − 2)/(x + 3) − (−4)| = |5(x + 2)/(x + 3)| = 5|x + 2|/|x + 3|.    (4.2.1)
We note that the |x + 2| term is there—as we promised it would be. So as in Example 4.2.2
we must bound the rest. But in this case we must be more careful. If we chose δ1 = 1 as
we did before (and 1 is such a nice number), then 5/|x + 3| would be unbounded on the set
of x such that |x + 2| < δ1 = 1. (|x + 2| < 1 implies that −3 < x < −1. 5/|x + 3| goes to
infinity as x goes to −3.) Hence we must be a little bit more careful and choose δ1 = 1/2. If
x is such that |x + 2| < 1/2, then −5/2 < x < −3/2 and 1/2 < x + 3 < 3/2. Thus we see
that if |x + 2| < 1/2, |x + 3| > 1/2 (and it’s only the bad luck of the numbers that the two
1/2’s appear) and 5/|x + 3| < 5/(1/2) = 10. Thus we return to equation (4.2.1) and see that
if |x + 2| < 1/2, then
    |(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2|.    (4.2.2)
Thus define δ = min{ǫ/10, 1/2} (Step 1: Define δ). Then if 0 < |x + 2| < δ, 5/|x + 3| < 10 and 10|x + 2| < 10(ǫ/10) = ǫ. Therefore if 0 < |x + 2| < δ,

    |(x − 2)/(x + 3) − (−4)| = 5|x + 2|/|x + 3| < 10|x + 2| < ǫ

(Step 2: δ works.), and lim_{x→−2} (x − 2)/(x + 3) = −4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an} such that an ∈ D for all n (which in this case means that an ≠ −3 for any n), an ≠ −2 for any n and an → −2. By Proposition 3.4.1–(b), Proposition 3.3.2–(a) and HW3.2.1-(b) we see that

    lim_{n→∞} (an − 2)/(an + 3) = (−4)/1 = −4.

Therefore lim_{x→−2} (x − 2)/(x + 3) = −4.
Note that again in this problem the "0 <" part of the restriction on x is not
important. The function is well behaved at x = −2 (and equals −4). However,
in this problem since −3 is not in the domain of f , we must be careful to restrict
the δ in the application of the definition and the sequence {an } in the application
of Proposition 4.1.3.
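A numerical illustration of why δ must keep x away from −3 (the names f and check are ours; this is a sketch, not part of the proof): with δ = min{1/2, ǫ/10}, sampled points satisfy both |x + 3| > 1/2 and |f(x) − (−4)| < ǫ.

```python
# Example 4.2.3 check: delta = min(1/2, eps/10) keeps x away from the
# singularity at x = -3 and forces |(x - 2)/(x + 3) - (-4)| < eps.

def f(x):
    return (x - 2) / (x + 3)

def check(eps, n=500):
    delta = min(0.5, eps / 10)
    for k in range(1, n + 1):
        h = delta * k / (n + 1)        # 0 < h < delta
        for x in (-2 - h, -2 + h):     # both sides of x0 = -2
            if not (abs(x + 3) > 0.5 and abs(f(x) - (-4)) < eps):
                return False
    return True

print(all(check(eps) for eps in (1.0, 0.05)))
```

Had we used δ1 = 1 instead, sampled points could land arbitrarily close to −3, where f blows up.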
The next example that we consider is an important problem. The limit considered is an example of a limit used to compute a derivative—a very important
use of limits in calculus.
Example 4.2.4
Prove that lim_{x→4} (x³ − 64)/(x − 4) = 48.
Solution: Using Definition 4.1.1: For convenience define f(x) = (x³ − 64)/(x − 4). Note that f(4) is not defined. When you try to evaluate f at x = 4, you get zero over zero. This does not mean that we will not be able to evaluate the limit given above—and hopefully you realize this if you remember your limit work related to derivatives.
We proceed as usual: assume that we are given an ǫ > 0 and want to find a δ so that 0 < |x − 4| < δ will imply that |f(x) − 48| = |(x³ − 64)/(x − 4) − 48| < ǫ. We start with the expression f(x) − 48 and note that

    (x³ − 64)/(x − 4) − 48 = (x − 4)(x² + 4x + 16)/(x − 4) − 48    (4.2.3)
(if you don't believe the factoring, multiply the expression on the right to see that you get x³ − 64 back). We have an x − 4 factor in both the numerator and the denominator of the first term on the right. We want to divide them out. In general you have to be careful in doing this but in this case it is completely permissible. The requirement on x will be 0 < |x − 4| < δ. The meaning of the part of the inequality 0 < |x − 4| is that x − 4 ≠ 0. And if x − 4 ≠ 0, we can divide them out. Hence returning to equation (4.2.3) we get

    (x³ − 64)/(x − 4) − 48 = (x − 4)(x² + 4x + 16)/(x − 4) − 48 = (x² + 4x + 16) − 48 = x² + 4x − 32 = (x − 4)(x + 8).
We promised you that there would always be an x − 4 factor in the simplified version of f(x) − L. Thus

    |(x³ − 64)/(x − 4) − 48| = |(x − 4)(x + 8)| = |x − 4||x + 8|.    (4.2.4)
The |x − 4| term will be made less than δ as it has been in Examples 4.2.1–4.2.3. The |x + 8| term must be bounded as we bounded |x + 2| in Example 4.2.2 and 5/|x + 3| in Example 4.2.3. Hence we require that x satisfy |x − 4| < δ1 = 1 and notice that this gives us the following: |x − 4| < 1 ⇒ −1 < x − 4 < 1 ⇒ 3 < x < 5 ⇒ 11 < x + 8 < 13. Therefore if |x − 4| < δ1 = 1, then |x + 8| < 13. Returning to equation (4.2.4) we see that if we require that x satisfy |x − 4| < δ1 = 1, then

    |(x³ − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4|.    (4.2.5)
And finally, if we define δ = min{1, ǫ/13} (Step 1: Define δ) and require that 0 < |x − 4| < δ (so that 0 < |x − 4| < 1 and 0 < |x − 4| < ǫ/13), we continue with equation (4.2.5) to get

    |(x³ − 64)/(x − 4) − 48| = |x − 4||x + 8| < 13|x − 4| < 13(ǫ/13) = ǫ

(Step 2: δ works). Therefore (x³ − 64)/(x − 4) → 48 as x → 4.
Using Proposition 4.1.3: We suppose that we are given a sequence {an} such that an ∈ D for all n (i.e. an ≠ 4 for any n) and an → 4. Then

    lim_{n→∞} (an³ − 64)/(an − 4) = lim_{n→∞} (an − 4)(an² + 4an + 16)/(an − 4)    (4.2.6)
                                  = lim_{n→∞} (an² + 4an + 16) = 4² + 4·4 + 16 = 48.    (4.2.7)

We note that it is permissible to divide out the an − 4 term between steps (4.2.6) and (4.2.7) because we have assumed that an ≠ 4 for any n. Therefore (x³ − 64)/(x − 4) → 48 as x → 4.
We should emphasize that it would be wrong to apply Proposition 3.4.1–(b) after step (4.2.6) and then try some sort of division.
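As a numerical companion to this example (an illustration with our own names, not part of the proof): f is undefined at x = 4, yet with δ = min{1, ǫ/13} every sampled x with 0 < |x − 4| < δ satisfies |f(x) − 48| < ǫ.

```python
# Example 4.2.4 check: f(x) = (x**3 - 64)/(x - 4) is undefined at x = 4
# (0/0), but with delta = min(1, eps/13) the limit inequality holds.

def f(x):
    return (x**3 - 64) / (x - 4)

def check(eps, n=500):
    delta = min(1.0, eps / 13)
    for k in range(1, n + 1):
        h = delta * k / (n + 1)       # 0 < h < delta
        for x in (4 - h, 4 + h):      # both sides of x0 = 4, never x = 4
            if not abs(f(x) - 48) < eps:
                return False
    return True

print(all(check(eps) for eps in (1.0, 0.01)))
```

Note the sampling never touches x = 4 itself; that is the numerical counterpart of the "0 <" restriction.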
You might have noticed that when we wrote the limits in the preceding
examples, we did not usually explicitly define the function and the domain. We
wrote the expression for the function as a part of the limit statement (as you
did in your basic calculus course) and assumed that you knew the domain. This
is common. How do we know the domain? We assume that the domain is chosen as
the largest set on which the expression can be defined—in the case of Examples
4.2.1 and 4.2.2, D = R, in Example 4.2.3 D = R − {−3}, in Example 4.2.4
D = R − {4}, etc. Of course in these cases the requirement that x0 is a limit
point of D was always satisfied.
Notice that as a part of the solution using the definition, we were able to
factor an x − 4 term out of x³ − 64 in equation (4.2.3). This was not luck;
if the limit is to exist, it will always be there. Remember when we tried to
evaluate f(4) we got 0/0. The zero in the numerator implies that there's an x − 4
factor in there—somewhere, sometimes it’s hard to see. For all of the problems
that result in applying the definition of a derivative (and you do not need to
know what that is yet—except it’s related to the x−4 in the denominator—wait
until Chapter 6) you will always have the x − 4 term in the numerator that will
divide out with the x − 4 term in the denominator (except it won’t always be a
4 and it may be very difficult to see that the term is there). But remember it
was the ”0 <” part of the restriction on x that allowed us to divide
out the x − 4 terms. This was essential. Likewise, this is an example that if
you choose to prove the limit using Proposition 4.1.3, the hypothesis ”an 6= x0
for any n” becomes important. In the application of Proposition 4.1.3, it is this
assumption on the sequence {an } that allows us to divide out the an − 4 terms.
There are other problems, beyond the limits involved in computing derivatives, that require the "0 <" restriction on x (or the an ≠ x0 assumption) and a division. You could make up a function that when factored looks like (x − 2)²(x² + x + 1)/(x − 2)² (you can multiply it out if you'd like to make it look like a real example) and try to calculate the limit of that function as x → 2. The limit would be 7. You would use the "0 <" restriction to divide out the (x − 2)² terms and then would have

    (x − 2)²(x² + x + 1)/(x − 2)² − 7 = (x² + x + 1) − 7 = x² + x − 6 = (x + 3)(x − 2).

Notice that it contains the x − 2 term (we promised) and if we were applying the definition, we would proceed by bounding |x + 3| the way that we have done before.
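Numerically (an illustration with our own name g): although g is undefined at x = 2, its values approach 7 as x → 2.

```python
# The made-up function discussed above: undefined at x = 2, limit 7 there.

def g(x):
    return ((x - 2)**2 * (x**2 + x + 1)) / (x - 2)**2

# g(2) would raise ZeroDivisionError; nearby values approach 7 = 2**2 + 2 + 1.
samples = [(10**-k, g(2 + 10**-k)) for k in range(1, 8)]
print(all(abs(v - 7) < 10 * h for h, v in samples))
```

The error |g(2 + h) − 7| is about 5h, matching the (x + 3)(x − 2) factorization above.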
If the function is of the form f(x) = (x + 2)²h(x)/(x + 2) where h(−2) ≠ 0 (and we want the limit of f as x → −2), then only one x + 2 term will divide out (you only have one in the denominator—what else could you do?) and the limit would be 0 because of the x + 2 term that is left in the numerator. If the function is of the form f(x) = (x − 3)²h(x)/(x − 3)³ where h(3) ≠ 0—emphasizing the fact that the degree of the term in the denominator is larger than that in the numerator—then you could divide out only two of the x − 3 terms and the x − 3 term that was left in the denominator would cause the limit to not exist.
Let us emphasize again: all of these slight variations of the problem given in Example 4.2.4 work because of the "0 <" part of the restriction on x in Definition 4.1.1. We see that it does not come into play for easy limits but is important for the class of limits associated with derivatives—and similar problems.
Nonconvergence of Limits: Of course if we have a definition of convergence
of limits and some examples of application of the definition, we must have some
examples where the function doesn’t converge to a limit. As in the case of
nonconvergence of sequential limits, proving that a limit does not exist using
the definition is often difficult. You must show that for some ǫ > 0 there does not exist any δ such that 0 < |x − x0| < δ implies that |f(x) − L| < ǫ (using the notation as given in Definition 4.1.1), i.e. for some ǫ > 0 and any δ, there exists an xδ such that 0 < |xδ − x0| < δ and |f(xδ) − L| ≥ ǫ.
In general, it is usually much easier and more natural to use Proposition 4.1.3 to show that a limit does not exist. Again we do want you to see that you can use the definition in these arguments and how to use it. The application of Proposition 4.1.3 is a bit different from before. Consider the ⇒ direction of the proposition: if lim_{x→x0} f(x) = L, then for any sequence {an} such that an ∈ D for all n, an ≠ x0 for any n and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = L. The contrapositive of this statement reads something like the following: if it is not the case that for any sequence {an} such that an ∈ D for all n, an ≠ x0 for any n and lim_{n→∞} an = x0 we have lim_{n→∞} f(an) = L, then lim_{x→x0} f(x) ≠ L. How does one satisfy the statement "it is not the case that for any sequence {an} such that an ∈ D for all n, an ≠ x0 for any n and lim_{n→∞} an = x0 we have lim_{n→∞} f(an) = L"? It is easy. One way is to find a sequence that satisfies the properties an ∈ D for all n, an ≠ x0 for any n and an → x0, but for which the limit lim_{n→∞} f(an) does not exist. That implies that not only is the limit not some particular L but that the limit does not exist. Another way is to find two sequences {an} and {bn} such that an, bn ∈ D for all n, an ≠ x0 and bn ≠ x0 for any n, an → x0 and bn → x0 as n → ∞, and lim_{n→∞} f(an) ≠ lim_{n→∞} f(bn). Not only will this imply that the original limit is not L but it will also imply that it can't be anything else either (because we will always get at least two nonequal candidates), i.e. the limit does not exist.
We will include three examples of nonconvergence. The first will not satisfy
Definition 4.1.1 because it wants to have an infinite limit and Definition 4.1.1
requires that L ∈ R (and as in the case with sequential limits, we will later
define what it means to have an infinite limit). The second will be analogous to
the sequential limit example given in Example 3.2.6—there will be two logical
limits, so neither (and nothing else) will satisfy the definition. And the last—
probably the most interesting—will not have a limit just because it is a really
nasty function. As we mentioned earlier we will show nonconvergence by the
definition because we want you to see how it is done. Because we feel that the
natural approach is to apply Proposition 4.1.3, we will give that approach first.
Example 4.2.5
Prove that lim_{x→0} 1/x² does not exist.
Solution: (Using Proposition 4.1.3) Consider the sequence {1/n}. This sequence satisfies the properties 1/n ≠ 0 for any n and 1/n → 0. Thus if the limit were to exist, the sequence {1/(1/n)²} = {n²} would have to converge to some L—the resulting limit. Clearly this is not the case in that 1/(1/n)² = n² → ∞ as n → ∞. Therefore lim_{x→0} 1/x² does not exist.
(Using Definition 4.1.1) If you evaluate the function 1/x² near zero, we would hope that
you figure out what is happening. You consider ǫ = 1 (remember, we only have to show that
it’s bad for one particular ǫ) and suppose that the limit is some L ∈ R where for the moment
we assume that L > −1. We must show that for any δ it is not the case that 0 < |x| < δ implies that |1/x² − L| < ǫ = 1 (i.e. L − 1 < 1/x² < L + 1); that is, we must show that for any δ there exists an xδ such that 0 < |xδ| < δ and |1/xδ² − L| ≥ ǫ = 1.
Choose xδ = min{δ/2, 1/(2√(L + 1))}. Then xδ satisfies 0 < |xδ| < δ and 0 < xδ < 1/√(L + 1). We note that 0 < xδ < 1/√(L + 1) implies that 1/xδ² > L + 1, or 1/xδ² − L > 1, so |1/xδ² − L| ≥ ǫ = 1. Thus lim_{x→0} 1/x² cannot equal L. How did we choose this xδ that worked so well? Of course we worked backwards—knowing that we could choose xδ small enough so that 1/xδ² would be greater than L + 1 (and δ/2 would guarantee that it is between 0 and δ).
If L ≤ −1 (and note that this implies that −L ≥ 1), then for any δ we choose xδ = δ/2 and note that 1/xδ² − L = 4/δ² − L ≥ 4/δ² + 1 > 1, so |1/xδ² − L| ≥ 1. Thus lim_{x→0} 1/x² cannot equal L.
And of course, if the limit cannot equal L for L > −1 and cannot equal L for L ≤ −1, then the limit cannot exist.
Note that even though this limit does not exist, it is handy here having the
"0 <" part of the restriction on x in the definition of a limit so that 1/x² need
not be defined at x = 0—otherwise we would have been done long ago.
Example 4.2.6
For f defined as f(x) = 1 if x ≥ 0 and f(x) = 0 if x < 0, prove that lim_{x→0} f(x) does not exist.
Solution: (Using Proposition 4.1.3) The approach we use for this example is to use two sequences—remember that Proposition 4.1.3 must hold for all sequences {an} such that an ≠ x0 for any n and an → x0. We first consider the sequence {1/n}. We know that 1/n ≠ 0 for any n and 1/n → 0. It is easy to see that f(1/n) → 1 (since all of the 1/n's are positive, f(1/n) = 1 for all n—so this is just an application of HW3.2.1-(b)). Then we consider the sequence {−1/n}. Again we notice that the sequence satisfies the hypotheses of Proposition 4.1.3, but this time f(−1/n) → 0. Therefore lim_{x→0} f(x) does not exist.
(Using Definition 4.1.1) We approach this proof similar to the way that we proved that lim_{n→∞} (−1)ⁿ did not exist in Example 3.2.6. Case 1: We first guess that maybe the limit is 1. We choose ǫ = 1/2 (remember that we only have to show that for some ǫ > 0 there is no appropriate δ). To show that the limit is not 1, we must show that for any δ there is an xδ such that 0 < |xδ| < δ (or −δ < xδ < δ, xδ ≠ 0) and |f(xδ) − 1| ≥ ǫ = 1/2. Now consider any δ and choose xδ = −δ/2. Then xδ satisfies |xδ| < δ and xδ ≠ 0. Since f(xδ) = 0 (xδ < 0), |f(xδ) − 1| = 1 ≥ ǫ = 1/2. Thus we know that lim_{x→0} f(x) ≠ 1.
Case 2: We next guess that the limit might be 0. We again choose ǫ = 1/2 and consider any δ. This time choose xδ = δ/2. Then since |f(xδ) − 0| = |1 − 0| = 1 ≥ ǫ = 1/2, lim_{x→0} f(x) ≠ 0.
Case 3: We next consider the most difficult case, and assume that lim_{x→0} f(x) = L where L is any real number other than 1 or 0 (the two cases that we have already considered). Then choose ǫ = min{|L|/2, |L − 1|/2}. For any δ choose xδ = δ/2. Then |f(xδ) − L| = |1 − L| > |L − 1|/2 ≥ ǫ. Thus lim_{x→0} f(x) ≠ L for any L ≠ 1, L ≠ 0. (We could just as well have chosen xδ = −δ/2.)
Since we have exhausted all possible limits in R, lim_{x→0} f(x) does not exist.
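The two-sequence argument is easy to see in a Python sketch (names ours, illustration only): along 1/n the function values are all 1, along −1/n all 0.

```python
# Example 4.2.6 via sequences: f(1/n) = 1 for all n while f(-1/n) = 0,
# so no single limit at 0 can exist.

def f(x):
    return 1 if x >= 0 else 0

along_pos = {f(1 / n) for n in range(1, 100)}    # values along 1/n -> 0
along_neg = {f(-1 / n) for n in range(1, 100)}   # values along -1/n -> 0
print(along_pos, along_neg)                      # two different candidates
```

Two convergent sequences of inputs, two different limiting values of outputs: exactly the second nonexistence criterion described above.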
In the next example we will use the sine function—and we have never defined it (but we used it earlier). We assume that your trigonometry course gave a sufficiently rigorous definition of these functions. We now proceed with our last case of non-existence.
Example 4.2.7
Define the function f : R → R by f(x) = sin(1/x) if x ≠ 0 and f(0) = 0. Prove that lim_{x→0} f(x) does not exist.
It is especially instructive for this example to get a plot of the function. We see on the plot below that, like the sine function, −1 ≤ f(x) ≤ 1. But as x nears zero, 1/x goes through odd multiples of π/2 (giving values ±1), multiples of π (giving values of 0) and everything else in between—many times.
[Plot omitted: the graph oscillates between −1 and 1 ever faster as x → 0.]
Figure 4.2.1: Plot of the function f(x) = sin(1/x) for x ≠ 0 and f(0) = 0.
Solution: (Using Proposition 4.1.3) Again we choose two sequences converging to zero. We choose the sequence {an} where an = 1/(nπ) and the sequence {bn} where bn = 2/[(4n + 1)π]. Both of these sequences will clearly never equal 0 and both will converge to zero. It is easy to see that f(an) = 0 for all n and f(bn) = 1 for all n. Therefore, f(an) → 0, f(bn) → 1 and lim_{x→0} f(x) does not exist.
(Using Definition 4.1.1) Case 1: L ≠ 0. For any δ we can find an xδ satisfying 0 < |xδ| < δ such that xδ = 1/(n0π) for some n0 ∈ N—this follows from Corollary 1.5.5–(b) (there are many such n0's). Then if we suppose that the limit exists and is some L other than 0, we choose ǫ = |L|/2 and note that |f(xδ) − L| = |0 − L| = |L| > |L|/2, so it is impossible to satisfy Definition 4.1.1 for any δ (so lim_{x→0} f(x) ≠ L for L ≠ 0).
Case 2: L = 0. We next suppose that the limit is 0 (it is the only value left). We choose ǫ = 1/2. Then for any δ, we can find an xδ such that 0 < |xδ| < δ and xδ = 2/[(2n0 + 1)π] for some n0 ∈ N (one over an odd multiple of π/2). For this value of xδ, |f(xδ) − 0| = |±1| = 1 > 1/2 = ǫ. Thus again it is impossible to satisfy Definition 4.1.1 for any δ (so lim_{x→0} f(x) ≠ 0).
Therefore, lim_{x→0} f(x) does not exist.
Note that while f defined in Example 4.2.7 is a terribly nasty function—especially near 0—for any x0 ≠ 0 (even very near 0), lim_{x→x0} f(x) exists and equals sin(1/x0).
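Here is a sequence-based sketch matching the first solution (names ours; it uses the standard-library sine, so the computed values carry small rounding errors): f(an) stays near 0 while f(bn) stays near 1, so no limit can exist at 0.

```python
# Example 4.2.7 via sequences: a_n = 1/(n*pi) gives sin(1/a_n) = sin(n*pi) = 0,
# b_n = 2/((4n + 1)*pi) gives sin(1/b_n) = sin(2n*pi + pi/2) = 1.
import math

def f(x):
    return math.sin(1 / x) if x != 0 else 0.0

a_vals = [f(1 / (n * math.pi)) for n in range(1, 30)]
b_vals = [f(2 / ((4 * n + 1) * math.pi)) for n in range(1, 30)]
print(max(abs(v) for v in a_vals), min(b_vals))   # near 0 and near 1
```

Both input sequences tend to 0, yet the output values cluster at two different numbers.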
HW 4.2.1 (True or False and Why)
(a) lim_{x→0} |x| = 0
(b) lim_{x→−2} |x| = 2
(c) Suppose f : D → R, D ⊂ R, x0 ∈ R. If f(x0) is defined (x0 ∈ D), then lim_{x→x0} f(x) = f(x0).
(d) lim_{x→2} x²/(2x − 5) = −4
(e) Consider the function defined by f(x) = 1 if x < 0, f(0) = 0, and f(x) = −1 if x > 0. Then lim_{x→0} f(x) = 0.
HW 4.2.2 Use the graphical approach to show that lim_{x→3} (2x + 3) = 9. Specifically, find the δ1 and δ2 (of Figure 4.1.1), determine δ and show that it works. Explain why δ1 = δ2 in this example.
HW 4.2.3 (a) Prove that lim_{x→4} 7 = 7. Show this using the graphical approach and then prove it twice—first using Definition 4.1.1 and then using Proposition 4.1.3.
(b) Prove that for any x0, c ∈ R, lim_{x→x0} c = c.
HW 4.2.4 Define the function f : R → R by f(x) = x² + x + 1 if x ≠ 2 and f(2) = 12. Prove that lim_{x→2} f(x) = 7—prove it twice, first using Definition 4.1.1 and then using Proposition 4.1.3.
HW 4.2.5 Prove that lim_{x→3} x²/(x − 4) = −9—prove it twice, first using Definition 4.1.1 and then using Proposition 4.1.3.
4.3 Limit Theorems
We don't want to have to apply Definition 4.1.1 or Proposition 4.1.3 every time we have a limit. As was the case with sequential limits, we shall develop limit theorems that allow us to compute a large number of limits. You already know most of these limit theorems from your elementary calculus class. Of course we will now include the proofs of these theorems. And it should not be a surprise to you that the limit theorems will look very much like the limit theorems that we proved for limits of sequences. As with the proofs of convergence of the specific limits done in Section 4.2, parts (a), (b), (d) and (f) of Proposition 4.3.1 given below can be proved either by using the definition or by using Proposition 4.1.3. Again we feel that you should see both approaches. For that reason we will include both proofs for these parts. Since the proofs applying Definition 4.1.1 are very similar to the analogous proofs for limits of sequences and the proofs applying Proposition 4.1.3 are pretty easy, we will give reasonably abbreviated versions of these proofs.
Proposition 4.3.1 Consider the functions f, g : D → R where D ⊂ R, suppose that c, x0 ∈ R and x0 is a limit point of D. Suppose lim_{x→x0} f(x) = L1 and lim_{x→x0} g(x) = L2. We then have the following results.
(a) lim_{x→x0} (f(x) + g(x)) = lim_{x→x0} f(x) + lim_{x→x0} g(x) = L1 + L2.
(b) lim_{x→x0} cf(x) = c lim_{x→x0} f(x) = cL1.
(c) There exist δ3, K ∈ R such that for x ∈ D and 0 < |x − x0| < δ3, |f(x)| < K.
(d) lim_{x→x0} f(x)g(x) = (lim_{x→x0} f(x))(lim_{x→x0} g(x)) = L1L2.
(e) If L2 ≠ 0, then there exist δ4, M ∈ R such that if x ∈ D and 0 < |x − x0| < δ4, then |g(x)| > M.
(f) If L2 ≠ 0, then lim_{x→x0} f(x)/g(x) = (lim_{x→x0} f(x))/(lim_{x→x0} g(x)) = L1/L2.
Proof: So that we don't have to repeat it every time, throughout this proof let {an} be any sequence such that an ∈ D for all n, an ≠ x0 for any n, and an → x0.
(a) (Using Definition 4.1.1) We suppose that we are given an ǫ > 0. We apply the hypothesis lim_{x→x0} f(x) = L1 with ǫ1 = ǫ/2 to get a δ1 such that x ∈ D, 0 < |x − x0| < δ1 implies that |f(x) − L1| < ǫ1 = ǫ/2, and the hypothesis lim_{x→x0} g(x) = L2 with ǫ2 = ǫ/2 to get a δ2 such that x ∈ D, 0 < |x − x0| < δ2 implies that |g(x) − L2| < ǫ2 = ǫ/2. Then if we let δ = min{δ1, δ2} and require that x ∈ D and 0 < |x − x0| < δ, we have

    |(f(x) + g(x)) − (L1 + L2)| = |(f(x) − L1) + (g(x) − L2)|
                                ≤ |f(x) − L1| + |g(x) − L2| < ǫ/2 + ǫ/2 = ǫ.

Therefore lim_{x→x0} (f(x) + g(x)) = L1 + L2.
(Using Proposition 4.1.3) We note that by Proposition 3.3.2–(a)

    lim_{n→∞} (f(an) + g(an)) = lim_{n→∞} f(an) + lim_{n→∞} g(an) = L1 + L2.

Since this holds true for any such sequence {an}, by Proposition 4.1.3 we get lim_{x→x0} (f(x) + g(x)) = L1 + L2.
(b) (Using Definition 4.1.1) If c ≠ 0, we apply the hypothesis lim_{x→x0} f(x) = L1 with ǫ1 = ǫ/|c|. Then setting δ = δ1 will give the desired result.
If c = 0, the result is trivial since cf(x) = 0 for all x ∈ D—so it follows from HW4.2.3-(b).
(Using Proposition 4.1.3) Since lim_{n→∞} cf(an) = c lim_{n→∞} f(an) by Proposition 3.3.2–(b), the result follows.
(c) (We do not give a proof of this result based on Proposition 4.1.3—it is possible but it would not be very insightful.) Using the hypothesis lim_{x→x0} f(x) = L1 with ǫ1 = 1, we get a δ3 such that x ∈ D and 0 < |x − x0| < δ3 implies that |f(x) − L1| < ǫ1 = 1. Then by the backwards triangular inequality, Proposition 1.5.8–(vi), we see that for all x such that x ∈ D and 0 < |x − x0| < δ3,

    |f(x)| − |L1| ≤ |f(x) − L1| < 1,

or |f(x)| < 1 + |L1|. If we set K = 1 + |L1|, we are done.
(d) (Using Definition 4.1.1) We suppose that we are given an ǫ > 0 and first take L2 ≠ 0. We apply the hypothesis lim_{x→x0} f(x) = L1 with ǫ1 = ǫ/(2|L2|) to get a δ1 such that x ∈ D, 0 < |x − x0| < δ1 implies that |f(x) − L1| < ǫ1 = ǫ/(2|L2|), and the hypothesis lim_{x→x0} g(x) = L2 with ǫ2 = ǫ/(2K) to get a δ2 such that x ∈ D, 0 < |x − x0| < δ2 implies that |g(x) − L2| < ǫ2 = ǫ/(2K). We set δ = min{δ1, δ2, δ3} (where δ3 follows from part (c) of this proposition) and note that if x is such that x ∈ D and 0 < |x − x0| < δ (x satisfies all three restrictions),

    |f(x)g(x) − L1L2| = |f(x)(g(x) − L2) + L2(f(x) − L1)|
                      ≤ |f(x)||g(x) − L2| + |L2||f(x) − L1| < Kǫ2 + |L2|ǫ1.

Then using the fact that ǫ1 = ǫ/(2|L2|) and ǫ2 = ǫ/(2K), the result follows. Note that K = 1 + |L1| > 0, so ǫ2 is defined. If L2 = 0, the result follows by choosing ǫ2 = ǫ/K and δ = min{δ2, δ3}, and noting that

    |f(x)g(x) − 0| = |f(x)||g(x)| < Kǫ2 = ǫ

whenever x ∈ D and 0 < |x − x0| < δ—we only use the hypothesis lim_{x→x0} f(x) = L1 to get K.
(Using Proposition 4.1.3) The result follows from Proposition 3.3.2–(d).
(e) (Again we do not include a proof of this result based on Proposition 4.1.3.) Since L2 is assumed to be nonzero, we use the hypothesis lim_{x→x0} g(x) = L2 with ǫ2 = |L2|/2 and obtain a δ4 such that x ∈ D and 0 < |x − x0| < δ4 implies that |g(x) − L2| < ǫ2 = |L2|/2. We have

    |L2| − |g(x)| ≤ |L2 − g(x)| = |g(x) − L2| < ǫ2 = |L2|/2

by the backwards triangular inequality, Proposition 1.5.8–(vi).
Thus when x ∈ D and 0 < |x − x0| < δ4, |g(x)| > |L2| − |L2|/2 = |L2|/2. If we set M = |L2|/2, we are done.
(f) (Using Definition 4.1.1) We suppose that we are given an ǫ > 0 and apply the hypothesis lim_{x→x0} f(x) = L1 with respect to ǫ1 to get δ1 such that x ∈ D and 0 < |x − x0| < δ1 implies |f(x) − L1| < ǫ1; the hypothesis lim_{x→x0} g(x) = L2 with respect to ǫ2 to get δ2 such that x ∈ D and 0 < |x − x0| < δ2 implies |g(x) − L2| < ǫ2; and L2 ≠ 0 and part (e) of this proposition to get δ4 such that x ∈ D and 0 < |x − x0| < δ4 implies that |g(x)| > M. Then we set δ = min{δ1, δ2, δ4}, require x to satisfy x ∈ D and 0 < |x − x0| < δ, and note that

    |f(x)/g(x) − L1/L2| = |(f(x)L2 − L1g(x))/(L2g(x))|
                        = |((f(x) − L1)L2 + L1(L2 − g(x)))/(L2g(x))|
                        ≤ (|f(x) − L1||L2| + |L1||L2 − g(x)|)/(|g(x)||L2|)
                        < (ǫ1|L2| + |L1|ǫ2)/(M|L2|).

Thus we see that if we choose ǫ1 = Mǫ/2 and ǫ2 = M|L2|ǫ/(2|L1|), then |f(x)/g(x) − L1/L2| < ǫ and lim_{x→x0} f(x)/g(x) = L1/L2 (if L1 = 0, the result follows by choosing δ = min{δ1, δ4} and ǫ1 = Mǫ).
(Using Proposition 4.1.3) Since by Proposition 3.4.1–(b)

    lim_{n→∞} f(an)/g(an) = (lim_{n→∞} f(an))/(lim_{n→∞} g(an)) = L1/L2

for any such sequence {an}, the result follows from Proposition 4.1.3.
Parts (a), (b), (d) and (f) of the above proposition are basic tools used in
the calculation of limits. However, to use these tools—which are always used to
simplify a given limit to a set of easier limits—we need some easier limits. In
the next proposition we provide one of the easy limits that we need.
Proposition 4.3.2 (a) For any x0 ∈ R, lim_{x→x0} x = x0.
(b) Consider f1, ..., fn : D → R where D ⊂ R, x0, L1, ..., Ln ∈ R and x0 is a limit point of D. Suppose that lim_{x→x0} fj(x) = Lj, j = 1, ..., n. Then lim_{x→x0} f1(x)·f2(x)···fn(x) = L1·L2···Ln.
(c) Suppose x0 ∈ R and k ∈ N. Then lim_{x→x0} xᵏ = x0ᵏ.
Proof: The proofs of these results are very easy. Result (a) follows by choosing δ = ǫ in Definition 4.1.1. Property (b) is an elementary application of mathematical induction using part (d) of Proposition 4.3.1. And then the result given in part (c) follows from parts (a) and (b) of this proposition (or by applying Proposition 4.3.1–(d) k − 1 times along with part (a) of this proposition).
We next include an inductive version of Proposition 4.3.1–(a) and use this result—along with the other parts of Proposition 4.3.1—to compute a large class of limits. Let p and q denote mth and nth degree polynomials, respectively, p(x) = a0xᵐ + a1xᵐ⁻¹ + ... + am−1x + am and q(x) = b0xⁿ + b1xⁿ⁻¹ + ... + bn−1x + bn.
Proposition 4.3.3 (a) Consider f1, ..., fn : D → R where D ⊂ R, x0, L1, ..., Ln ∈ R and x0 is a limit point of D. Suppose that lim_{x→x0} fj(x) = Lj, j = 1, ..., n. Then lim_{x→x0} (f1(x) + f2(x) + ... + fn(x)) = L1 + L2 + ... + Ln.
(b) For all x0 ∈ R, lim_{x→x0} p(x) = p(x0) = a0x0ᵐ + a1x0ᵐ⁻¹ + ... + am−1x0 + am.
(c) If x0 ∈ R and q(x0) ≠ 0, then

    lim_{x→x0} p(x)/q(x) = p(x0)/q(x0) = (a0x0ᵐ + a1x0ᵐ⁻¹ + ... + am−1x0 + am)/(b0x0ⁿ + b1x0ⁿ⁻¹ + ... + bn−1x0 + bn).
Proof: As was the case with Proposition 4.3.2 the proof of this proposition
is also very easy. Part (a) can be proved by applying mathematical induction
along with part (a) of Proposition 4.3.1. The result given in part (b) then
follows from part (a) of this result, Proposition 4.3.1–(b) and Proposition 4.3.2–
(c). And finally, to prove part (c) we apply the quotient rule from Proposition
4.3.1–(f) along with part (b) of this proposition.
We are now able to compute a large class of limits. We have intentionally skipped functions involving irrational exponents and functions of the form a^x (which are two other very basic "easy limits" that we use along with Proposition 4.3.1 to compute limits) because we will give a rigorous mathematical introduction to these functions in Chapters 5 and 6—so discussing their limits at this time would be cheating. If we returned to the examples considered in Section 4.2, we would now be able to compute the limits considered in Examples 4.2.1–4.2.3 very easily. To compute a limit such as that considered in Example 4.2.4, we proceed much the way we did in our elementary course and compute as follows.
    lim_{x→2} (x^4 − 16)/(x − 2) = lim_{x→2} (x − 2)(x + 2)(x^2 + 4)/(x − 2)
        = lim_{x→2} (x + 2)(x^2 + 4)   because we know that x − 2 ≠ 0
        = 32   by Proposition 4.3.3–(b).
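The computed value can also be checked numerically. The following Python sketch (our illustration, not part of the text; the function names are our own) samples (x^4 − 16)/(x − 2) at points approaching 2 and compares against the factored form (x + 2)(x^2 + 4):

```python
# Numerical sanity check of lim_{x->2} (x^4 - 16)/(x - 2) = 32.
# The quotient is undefined at x = 2, so we only sample x != 2.

def f(x):
    return (x**4 - 16) / (x - 2)

def g(x):
    # Factored form, valid for x != 2 and also defined at x = 2.
    return (x + 2) * (x**2 + 4)

for h in [1e-1, 1e-3, 1e-6]:
    # Approach 2 from both sides; both values should be near g(2) = 32.
    print(f(2 + h), f(2 - h))

print(g(2))  # 32
```

Of course this only illustrates the limit; the proof is the algebraic computation above together with Proposition 4.3.3–(b).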
Before we leave this section we include one more limit result—the Sandwich Theorem, analogous to the sequential Sandwich Theorem, Proposition 3.4.2.
Proposition 4.3.4 Consider the functions f, g, h : D → R where D ⊂ R, suppose that x0 ∈ R and x0 is a limit point of D. Suppose lim_{x→x0} f(x) = lim_{x→x0} g(x) = L and there exists a δ1 such that f(x) ≤ h(x) ≤ g(x) for x ∈ D and 0 < |x − x0| < δ1. Then lim_{x→x0} h(x) = L.
Proof: Suppose ǫ > 0 is given. Let δ2 and δ3 be such that 0 < |x − x0| < δ2 implies |f(x) − L| < ǫ, i.e. L − ǫ < f(x) < L + ǫ, and 0 < |x − x0| < δ3 implies |g(x) − L| < ǫ, i.e. L − ǫ < g(x) < L + ǫ. Let δ = min{δ1, δ2, δ3} (Step 1: Define δ) and suppose that x satisfies x ∈ D and 0 < |x − x0| < δ. Then

    L − ǫ < f(x) ≤ h(x) ≤ g(x) < L + ǫ,

or L − ǫ < h(x) < L + ǫ (Step 2: δ works). Thus |h(x) − L| < ǫ, so lim_{x→x0} h(x) = L.
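As a concrete numerical illustration of the Sandwich Theorem (our addition, not the text's), take f(x) = −x^2, g(x) = x^2 and h(x) = x^2 sin(1/x) for x ≠ 0. Then f ≤ h ≤ g away from 0 and both bounds tend to 0, so Proposition 4.3.4 forces h(x) → 0 as x → 0:

```python
import math

def f(x): return -x**2                      # lower bound
def g(x): return x**2                       # upper bound
def h(x): return x**2 * math.sin(1.0 / x)   # squeezed function (x != 0)

# f(x) <= h(x) <= g(x) for every x != 0 since |sin| <= 1, and f, g -> 0
# as x -> 0, so Proposition 4.3.4 forces h(x) -> 0 as well.
for x in [0.1, 0.01, 0.001, -0.001]:
    assert f(x) <= h(x) <= g(x)
    print(x, h(x))
```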
HW 4.3.1 (True or False and why) (a) Suppose f : D → R, D ⊂ R, x0, L ∈ R, x0 is a limit point of D and lim_{x→x0} f(x) = L. Then lim_{x→x0} |f(x)| = |L|.
(b) Suppose f : D → R, D ⊂ R, x0, L ∈ R, x0 is a limit point of D, lim_{x→x0} f(x) = L and L > 0. Then there exists a neighborhood of x0, N(x0), such that f(x) > 0 for all x ∈ N(x0).
(c) Consider f, g : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. Suppose further that lim_{x→x0} (f(x) + g(x)) and lim_{x→x0} g(x) exist. Then lim_{x→x0} f(x) exists.
(d) Suppose f : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. Suppose further that lim_{x→x0} f(x) exists and there exists a punctured neighborhood of x0, N̂δ(x0), such that f(x) > 0 for x ∈ N̂δ(x0). It may be the case that lim_{x→x0} f(x) = 0.
(e) Suppose f, g, h : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. Suppose also that lim_{x→x0} f(x) = lim_{x→x0} g(x) = L and there exists a δ such that f(x) < h(x) < g(x) for x satisfying 0 < |x − x0| < δ. In this situation it is not necessarily the case that lim_{x→x0} h(x) = L.

HW 4.3.2 Prove that lim_{x→2} (x − 2)/(√x − √2) = 2√2.

HW 4.3.3 Suppose f, g : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. Suppose further that f(x) ≤ g(x) on some punctured neighborhood of x0, N̂δ(x0), and both limits lim_{x→x0} f(x) and lim_{x→x0} g(x) exist. Prove that lim_{x→x0} f(x) ≤ lim_{x→x0} g(x).
4.4 Limits at Infinity, Infinite Limits and One-sided Limits
Limits at Infinity For all of the limits considered so far in this chapter, both x0 and L must be real. In this section we want to introduce limits where x0 and/or L are ±∞. Of course we need definitions to extend the limit concept to these situations. If you think about it a bit, you should realize that we will want lim_{x→∞} f(x) to be very much like the sequential limit—except the definition will now have to allow for x values in some interval (N, ∞) rather than the discrete points of N. Note that for convenience we define the limits at infinity for functions defined on intervals such as (−∞, a) or (a, ∞) for some a. We could give these definitions for domains smaller than these intervals, but we would have to fix up the domains so that we guaranteed that ±∞ was a limit point of D—which we haven't defined and don't really want to define. We begin with the following definition.
Definition 4.4.1 For f : (a, ∞) → R for some a ∈ R and L ∈ R, we say that lim_{x→∞} f(x) = L if for every ǫ > 0 there exists an N ∈ R such that x > N implies that |f(x) − L| < ǫ.
Likewise, if f is defined on (−∞, a) for some a, lim_{x→−∞} f(x) = L if for every ǫ > 0 there exists an N ∈ R such that x < N implies that |f(x) − L| < ǫ.
You probably computed some limits of this sort in your basic calculus class. One of the common applications of limits at ±∞ is to determine asymptotes to curves. Methods for computing limits at infinity are similar to the methods for computing sequential limits. For example, the approach used to calculate a limit such as lim_{x→∞} (2x^2 + x − 3)/(3x^2 + 3x + 3) is to perform the following computation.

    lim_{x→∞} (2x^2 + x − 3)/(3x^2 + 3x + 3) = lim_{x→∞} (2 + 1/x − 3/x^2)/(3 + 3/x + 3/x^2)
        = [lim_{x→∞} (2 + 1/x − 3/x^2)] / [lim_{x→∞} (3 + 3/x + 3/x^2)] = 2/3.
(Compare this result with the limit evaluated at the beginning of Section 3.4.) To perform the above computation we first multiplied the numerator and denominator by 1/x^2 and then used "limit of a quotient is the quotient of the limits", "limit of a sum is the sum of the limits", "limit of a constant times a function is the constant times the limit", "limit of a constant is that constant" and "the limit of 1/x^k as x goes to infinity is zero" (k ∈ N). Clearly, at present we do not have these results, and hopefully equally clearly, these results are completely analogous to the results proved for sequences in Propositions 3.3.2, 3.4.1, Example 3.3.1 and HW3.2.1–(b). We include these results in the following two propositions.
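The computation above can be illustrated numerically by sampling ever larger x. This sketch is our addition (the name `r` is our own) and is an illustration only, not a proof:

```python
# Numerically illustrate lim_{x->infinity} (2x^2 + x - 3)/(3x^2 + 3x + 3) = 2/3
# by sampling ever larger x.

def r(x):
    return (2 * x**2 + x - 3) / (3 * x**2 + 3 * x + 3)

for x in [10.0, 1e3, 1e6]:
    print(x, r(x))   # values approach 2/3 = 0.666...
```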
Proposition 4.4.2 Suppose that f, g : (a, ∞) → R for some a ∈ R, lim_{x→∞} f(x) = L1, lim_{x→∞} g(x) = L2 and c ∈ R. We have the following results.
(a) lim_{x→∞} (f(x) + g(x)) = L1 + L2.
(b) lim_{x→∞} cf(x) = c lim_{x→∞} f(x) = cL1.
(c) There exist N, K ∈ R such that for x ∈ (a, ∞) and x > N, |f(x)| ≤ K.
(d) lim_{x→∞} f(x)g(x) = L1 L2.
(e) If L2 ≠ 0, then there exist N, M ∈ R such that |g(x)| ≥ M for all x > N.
(f) If L2 ≠ 0, then lim_{x→∞} f(x)/g(x) = L1/L2.
Proposition 4.4.3 (a) For c ∈ R, lim_{x→∞} c = c.
(b) lim_{x→∞} (1/x) = 0.
(c) For k ∈ N, lim_{x→∞} (1/x^k) = 0.
We are not going to prove these two propositions. Their proofs are just copies of the analogous sequential results. Likewise, we could also take some particular examples such as lim_{x→∞} (2x + 3)/(5x + 7) = 2/5 and use Definition 4.4.1 to prove this statement. We will not do that because such a proof would be almost identical to the proof given in Example 3.2.3 (when we did the analogous result for sequences). Also we should add that there are versions of Propositions 4.4.2 and 4.4.3 for the case when x approaches −∞. And finally note that we have not mentioned a result analogous to Example 3.5.1. It is possible to prove that for 0 < c < 1, lim_{x→∞} c^x = 0. However, since we will wait until Chapter 6 to define c^x, we do not consider this limit at this time.

Infinite Limits In Example 4.2.5 we showed that lim_{x→0} (1/x^2) does not exist. We
mentioned as a part of the proof that the limit wanted to go to infinity—so since
according to Definition 4.1.1 it is necessary that L ∈ R, the limit cannot exist.
We want to be able to show that the limit above does not exist in a much nicer
way than the nonexistence of the limits considered in Examples 4.2.6 and 4.2.7.
Just as we included an alternative definition for sequences converging to infinity
in Section 3.6, we want the same concept for limits of functions. Consider the
following definition.
Definition 4.4.4 (a) Suppose that f : D → R, D ⊂ R, x0 ∈ R and x0 is a limit point of D. We say that lim_{x→x0} f(x) = ∞ if for every M > 0 there exists a δ such that x ∈ D, 0 < |x − x0| < δ implies that f(x) > M.
(b) lim_{x→x0} f(x) = −∞ if for every M < 0 there exists a δ such that x ∈ D, 0 < |x − x0| < δ implies that f(x) < M.
We now return to the consideration of the example given in Example 4.2.5.

Example 4.4.1 Prove that lim_{x→0} (1/x^2) = ∞.
Solution: We suppose that we are given an M > 0. We must find a δ so that 0 < |x − 0| < δ implies that f(x) > M, i.e. we need 1/x^2 > M. This last inequality is equivalent to x^2 < 1/M. This inequality is satisfied if |x| < 1/√M. Thus we define δ = 1/√M (Step 1: Define δ) and suppose that 0 < |x| < δ = 1/√M. Then for x ≠ 0, |x|^2 = x^2 < 1/M and 1/x^2 > M, which is what we had to prove (Step 2: δ works). Therefore lim_{x→0} 1/x^2 = ∞.
Note that we did not consider x = 0 at all. This is a place where the "0 <" part of the requirement is important.
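The choice δ = 1/√M in Example 4.4.1 can be spot-checked numerically. The sketch below (ours, not the author's; `delta_for` is a name we invented) verifies the defining inequality f(x) > M at a few sample points inside the punctured interval:

```python
import math

# For lim_{x->0} 1/x^2 = infinity, Example 4.4.1 chooses delta = 1/sqrt(M).
# Spot-check: sampled x with 0 < |x| < delta satisfy 1/x^2 > M.

def delta_for(M):
    return 1.0 / math.sqrt(M)

M = 1.0e6
d = delta_for(M)            # 0.001 for this M
for x in [d / 2, -d / 2, d / 10]:
    assert x != 0 and abs(x) < d
    assert 1.0 / x**2 > M   # the defining inequality f(x) > M holds
print(d)
```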
If we wanted to, we could now prove some theorems pertaining to infinite limits. This surely would be overkill. However, we should be aware that it is possible to obtain all of the results analogous to those in Proposition 3.6.2. And finally, we should realize that we could also define infinite limits as x approaches either positive or negative infinity, e.g. lim_{x→−∞} (x^2 + 1) = ∞. Hopefully you are now capable of piecing together the definitions given above to obtain the necessary definition for such limits—if they are needed. Limits such as this last one are not common.
One-sided Limits If you feel that this topic does not fit particularly well with the other two topics in the section, you are right—we had to find a place to put it. Quite literally there are times when, instead of approaching the limiting point from either side, we want to consider only points to the right or left of x0. We did this when we considered limits at ±∞, but in that case there were no points "on the other side." We have three easy approaches to this idea—we shall do all three.
We begin by considering f : D → R where D ⊂ R and x0 ∈ R (where we
always keep in mind the most common case where D = [a, b]). We define two
new functions f + : D+ → R where D+ = D ∩ (x0 , ∞) and f − : D− → R where
D− = D ∩ (−∞, x0 ). We make the following definition.
Definition 4.4.5 (a) If x0 is a limit point of D+, we define lim_{x→x0+} f(x) = lim_{x→x0} f+(x). We refer to this limit as the limit of f as x approaches x0 from the right or the right-hand limit of f at x0.
(b) If x0 is a limit point of D−, we define lim_{x→x0−} f(x) = lim_{x→x0} f−(x). We refer to this limit as the limit of f as x approaches x0 from the left or the left-hand limit of f at x0.
We should note that the functions f + and f − are just copies of f to the right
and the left of x0 , respectively—hence using f + and f − we get the right and left
hand limits of f , respectively. The fact that Definition 4.1.1 is a very general
definition of a limit allows us to easily define the right and left hand limits. Also
notice that it is still a requirement that x0 is a limit point of D+ and D− —this
is to guarantee that we have enough points of D on either side of x0 to allow us
to apply Definition 4.1.1. Note that if x0 is a limit point of either D+ or D− ,
then x0 will also be a limit point of D.
Before we look at some results concerning right and left hand limits, we
include the more common definition in the following result.
Proposition 4.4.6 Suppose that f : D → R where D ⊂ R and x0, L ∈ R.
(a) Suppose that x0 is a limit point of D ∩ (x0, ∞). Then lim_{x→x0+} f(x) = L if and only if for every ǫ > 0 there exists a δ such that x ∈ D and 0 < x − x0 < δ implies that |f(x) − L| < ǫ.
(b) Suppose that x0 is a limit point of D ∩ (−∞, x0). Then lim_{x→x0−} f(x) = L if and only if for every ǫ > 0 there exists a δ such that x ∈ D and 0 < x0 − x < δ implies that |f(x) − L| < ǫ.
Proof: (a) (⇒) We begin by assuming that lim_{x→x0+} f(x) = L, i.e. lim_{x→x0} f+(x) = L. This means that for every ǫ > 0 there exists a δ such that x ∈ D+ and 0 < |x − x0| < δ implies that |f+(x) − L| < ǫ. We note that if x ∈ D+ = D ∩ (x0, ∞), then |x − x0| = x − x0, so 0 < |x − x0| < δ is equivalent to 0 < x − x0 < δ. Also, note that if x ∈ D+, then x ∈ D also. And finally, for x ∈ D+, f+(x) = f(x). Thus for the δ given we see that x ∈ D and 0 < |x − x0| = x − x0 < δ implies |f(x) − L| < ǫ.
(⇐) We will skip the proof of this direction because it is so similar to the proof given for part (b).
(b) (⇒) We will skip the proof of this direction because it is so similar to the proof given for part (a).
(⇐) We suppose that for every ǫ > 0 there exists a δ so that x ∈ D and 0 < x0 − x < δ implies |f(x) − L| < ǫ. Note that x ∈ D and 0 < x0 − x < δ is equivalent to x ∈ D− and 0 < |x − x0| < δ. Also, if x ∈ D and 0 < x0 − x < δ, then f(x) = f−(x). Thus for x ∈ D− and 0 < |x − x0| < δ we have |f−(x) − L| < ǫ, so lim_{x→x0} f−(x) = L, i.e. lim_{x→x0−} f(x) = L.
The way that we apply Proposition 4.4.6 in a one-sided limit proof is very
similar to the way that we applied Definition 4.1.1—except that we now only
need to consider points on one side of x0 .
The third characterization of one-sided limits should be very familiar to us.
In Proposition 4.1.3 we gave a sequential characterization of limits—we can do
the same thing for one-sided limits. We state the following proposition.
Proposition 4.4.7 Suppose that f : D → R, D ⊂ R, x0, L ∈ R.
(a) Suppose that x0 is a limit point of D ∩ (x0, ∞). Then lim_{x→x0+} f(x) = L if and only if for any sequence {an} such that an ∈ D for all n, an > x0 for all n, and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = L.
(b) Suppose that x0 is a limit point of D ∩ (−∞, x0). Then lim_{x→x0−} f(x) = L if and only if for any sequence {an} such that an ∈ D for all n, an < x0 for all n, and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = L.
Proof: We will skip this proof because it is so much like the proof of Proposition 4.1.3. For the (⇒) direction of part (a), for a given ǫ > 0 the right-hand limit hypothesis gives a δ; this δ, used as the "ǫ" in the assumption that an → x0 (and it works because we have assumed that an > x0), gives an N which is the N that we need to prove that f(an) → L as n → ∞. The (⇒) direction of part (b) is essentially the same.
To prove the (⇐) directions we again assume false and use this assumption to create a sequence {an} that contradicts our hypothesis—because the one-sided limit is used in our contradiction assumption, the terms of the sequence will be either greater than or less than x0.
We emphasize that when we want to prove things about one-sided limits, we will generally use Propositions 4.4.6 and 4.4.7. We gave Definition 4.4.5 as we did to emphasize the "one-sidedness" of the functions when we consider one-sided limits.
In Example 4.2.2 we proved that lim_{x→2} x^2 = 4. It is very easy to show that lim_{x→2+} x^2 = 4 and lim_{x→2−} x^2 = 4. If we use Proposition 4.4.6 we can choose δ = min{1, ǫ/5} (the same δ used in Example 4.2.2). If we apply Proposition 4.4.7 to prove that lim_{x→2+} x^2 = 4, we use a sequence an → 2 with an > 2 and the fact that lim_{n→∞} an^2 = (lim_{n→∞} an)(lim_{n→∞} an) = 2 · 2 = 4. And of course the application of Proposition 4.4.7 to lim_{x→2−} x^2 is similar except that this time we assume that the sequence satisfies an < 2. We do not try to apply Definition 4.4.5 directly—either of Propositions 4.4.6 and 4.4.7 is a much cleaner way to prove one-sided limits.
If f(x) = 1 for x ≥ 0 and f(x) = 0 for x < 0, we showed in Example 4.2.6 that lim_{x→0} f(x) does not exist. It is very easy to show that lim_{x→0+} f(x) = 1 and lim_{x→0−} f(x) = 0. If we were to apply Proposition 4.4.6, we can choose δ = 1 (or anything else) for both of them. If we apply Proposition 4.4.7, the results follow because f(an) = 1 if an > 0 and f(an) = 0 if an < 0.
In Example 4.2.7 we showed that lim_{x→0} f(x) does not exist when f(x) = sin(1/x) for x ≠ 0 and f(0) = 0. The easiest way to show that lim_{x→0+} f(x) does not exist is to use Proposition 4.4.7 with two different sequences an = 1/(nπ) and bn = 2/((4n + 1)π)—where just as we did in Example 4.2.7 we find that f(an) → 0 and f(bn) → 1. To show that lim_{x→0−} f(x) does not exist we again apply Proposition 4.4.7, this time with the sequences {−an} and {−bn}. The fact that these one-sided limits do not exist should be clear by looking at Figure 4.2.1.
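The two sequences used for sin(1/x) can be evaluated directly. The following sketch (our numerical illustration, not part of the text) shows f(an) hugging 0 while f(bn) hugs 1, so no single right-hand limit can exist:

```python
import math

def f(x):
    return math.sin(1.0 / x) if x != 0 else 0.0

# Two sequences approaching 0 from the right:
#   a_n = 1/(n*pi)      -> f(a_n) = sin(n*pi)        = 0
#   b_n = 2/((4n+1)*pi) -> f(b_n) = sin((4n+1)*pi/2) = 1
a = [1.0 / (n * math.pi) for n in range(1, 50)]
b = [2.0 / ((4 * n + 1) * math.pi) for n in range(1, 50)]

print(f(a[-1]), f(b[-1]))  # near 0 and near 1: two different candidate limits
```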
We do not define one-sided infinite limits here, but we should be clear that there are such limits. In Example 4.2.5 we showed that lim_{x→0} 1/x^2 does not exist in R, and then in Example 4.4.1 we showed that lim_{x→0} 1/x^2 = ∞. If we are given M > 0 and choose δ = 1/√M—just as we did in Example 4.4.1—we can show that 0 < x < δ = 1/√M implies that 1/x^2 > M. Thus lim_{x→0+} 1/x^2 = ∞. The same δ can be used to prove that lim_{x→0−} 1/x^2 = ∞ also.
The limit lim_{x→0} 1/x is slightly different. This limit does not exist either by Definition 4.1.1 (i.e. not in R) or by Definition 4.4.4. However it should be clear that lim_{x→0+} 1/x = ∞ and lim_{x→0−} 1/x = −∞. Proof of the right-hand limit is as follows. Suppose M > 0 is given. Choose δ = 1/M. Then if 0 < x < δ = 1/M, 1/x > M. Therefore lim_{x→0+} 1/x = ∞. The proof of the left-hand limit is similar.
Before we leave this topic we want to include one important result. We notice that when we considered the left- and right-hand limits of f(x) = x^2 at x0 = 2, we got 4—the same value as lim_{x→2} x^2. When we considered the functions f(x) = 1 for x ≥ 0, 0 for x < 0, and f(x) = sin(1/x) for x ≠ 0, 0 for x = 0, for both of which the limit lim_{x→0} f(x) did not exist, we saw that in one case both one-sided limits exist but are different and in the other case neither of the one-sided limits exists. These examples pretty much illustrate all possibilities of the following theorem.
4. Limits of Functions
Proposition 4.4.8 Consider the function f : D → R where D ⊂ R, suppose that x0 ∈ R and x0 is a limit point of both D ∩ (x0, ∞) and D ∩ (−∞, x0). Then lim_{x→x0} f(x) exists if and only if both lim_{x→x0+} f(x) and lim_{x→x0−} f(x) exist and are equal.
Proof: (⇒) We assume that lim_{x→x0} f(x) exists, i.e. for some L ∈ R, for every ǫ > 0 there exists a δ such that x ∈ D and 0 < |x − x0| < δ implies that |f(x) − L| < ǫ. Note that 0 < |x − x0| < δ implies that 0 < x − x0 < δ or 0 < x0 − x < δ. Thus x0 is a limit point of D ∩ (x0, ∞) and we have a δ such that

    x ∈ D and 0 < x − x0 < δ implies that |f(x) − L| < ǫ,

so by Proposition 4.4.6–(a), lim_{x→x0+} f(x) = L. And x0 is a limit point of D ∩ (−∞, x0) and we have a δ such that

    x ∈ D and 0 < x0 − x < δ implies that |f(x) − L| < ǫ,

so by Proposition 4.4.6–(b), lim_{x→x0−} f(x) = L. This is what we were to prove.
(⇐) Suppose ǫ > 0 is given. We suppose that lim_{x→x0+} f(x) = L and lim_{x→x0−} f(x) = L. Then there exist δ1 and δ2 such that if x ∈ D satisfies either 0 < x − x0 < δ1 or 0 < x0 − x < δ2, then |f(x) − L| < ǫ. Let δ = min{δ1, δ2} (Step 1: Define δ). Then if x satisfies 0 < |x − x0| < δ, x satisfies 0 < x − x0 < δ or 0 < x0 − x < δ. So x satisfies 0 < x − x0 < δ ≤ δ1 or 0 < x0 − x < δ ≤ δ2. Thus |f(x) − L| < ǫ (Step 2: δ works), so lim_{x→x0} f(x) = L.
There are times, when we want to prove a particular limit result, that the above theorem is very useful. We can handle the function on each side of x0 separately, obtain the same one-sided limits, and hence prove our limit result.
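Proposition 4.4.8 can be illustrated numerically by probing a function from each side of x0. The sketch below is our own (the probe `one_sided` is a crude numerical stand-in, not a proof technique from the text): for x^2 at 2 the two probes agree, while for the step function at 0 they disagree, matching the two cases of the theorem:

```python
# Proposition 4.4.8 in action (numerically): a limit exists at x0 exactly when
# the left- and right-hand limits exist and agree.

def step(x):
    return 1.0 if x >= 0 else 0.0

def square(x):
    return x * x

def one_sided(f, x0, side, h=1e-8):
    # crude numerical probe of the limit from one side ('+' or '-')
    return f(x0 + h) if side == '+' else f(x0 - h)

# square: both probes near 4 -> the two-sided limit exists (= 4)
print(one_sided(square, 2.0, '+'), one_sided(square, 2.0, '-'))

# step: probes disagree (1 vs 0) -> by Proposition 4.4.8 no two-sided limit at 0
print(one_sided(step, 0.0, '+'), one_sided(step, 0.0, '-'))
```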
HW 4.4.1 (True or False and why) (a) If x0 is a limit point of D ⊂ R, then x0 is a limit point of D+ = D ∩ (x0, ∞) and D− = D ∩ (−∞, x0).
(b) Suppose f : [0, 1) → R and that lim_{x→1−} f(x) exists. Then lim_{x→1+} f(x) exists also.
(c) Suppose f : [0, 1] → R and that lim_{x→1−} f(x) exists. Then lim_{x→1+} f(x) exists also.
(d) Suppose f : [0, 1) → R and that lim_{x→1−} f(x) exists. Then lim_{x→1} f(x) exists.
(e) Suppose that f(x) = 1/x^4 and g(x) = x. We know that lim_{x→0} f(x) = ∞ and lim_{x→0} g(x) = 0. Then lim_{x→0} f(x)g(x) = 0 · ∞ = 0.
HW 4.4.2 (a) Prove that lim_{x→0} 1/x^4 = ∞.
(b) Prove that lim_{x→∞} 1/x^4 = 0.
(c) Prove that lim_{x→∞} sin x does not exist.
(d) Prove that lim_{x→∞} x/(2x − 1) = 1/2.
Chapter 5
Continuity
5.1 An Introduction to Continuity
In the preceding chapters we have been building basics and tools. In this
chapter we introduce the concept of a continuous function and results related
to continuous functions. The class of continuous functions is a very important
set of functions in many areas of mathematics. Also there are a lot of very nice
and useful properties of continuous functions. We begin with the definition of
continuity.
Definition 5.1.1 Consider a function f : D → R where D ⊂ R and a point x0 ∈ D. The function f is continuous at x0 if for every ǫ > 0 there exists a δ such that x ∈ D and |x − x0| < δ implies |f(x) − f(x0)| < ǫ.
If the function f is continuous at x for all x ∈ D, then f is said to be continuous on D.
If the function f is not continuous at a point x = x0, then f is said to be discontinuous at x = x0.
In the last chapter we studied what it meant for a function to have a limit at a point. Often the definition of continuity is given in terms of limits—especially in elementary calculus texts. We state the following proposition.

Proposition 5.1.2 Consider a function f : D → R where D ⊂ R and a point x0 ∈ D. Suppose that x0 is a limit point of the set D. If lim_{x→x0} f(x) = f(x0), then the function f is continuous at x = x0.
If x0 ∈ D and x0 is a limit point of D but lim_{x→x0} f(x) does not exist, or exists but does not equal f(x0), then f is not continuous at x = x0.
Proof: This proof is very easy. Let ǫ > 0 be given. If we apply the definition of lim_{x→x0} f(x) = f(x0) we get a δ such that 0 < |x − x0| < δ implies that |f(x) − f(x0)| < ǫ. But this is almost what we need to satisfy Definition 5.1.1. We need to get rid of the "0 <" requirement. But when x = x0, we know that |f(x) − f(x0)| = 0 < ǫ, so the "0 <" part of the restriction on x is completely unnecessary. Therefore f is continuous at x = x0.
If x0 ∈ D is a limit point of D and it is not the case that lim_{x→x0} f(x) = f(x0) (either the limit does not exist or it exists but equals something other than f(x0)), then there is some ǫ so that for any δ there is an xδ ∈ D such that 0 < |xδ − x0| < δ and |f(xδ) − f(x0)| ≥ ǫ. This also negates Definition 5.1.1, so f is not continuous at x = x0.
We want to emphasize that this result is not an "if and only if" result, i.e. it does not provide us with a statement that is equivalent to the definition of continuity. However, some texts use this as their definition—usually only basic calculus texts. The function that we considered in HW4.1.1 shows that the hypotheses given in Proposition 5.1.2 are not equivalent to Definition 5.1.1. In HW4.1.1–(a) we considered the domain D = [0, 1] ∪ {2} and the function f : D → R, f(x) = x^2. The True-False question was whether lim_{x→2} f(x) exists and equals 4. Of course the answer is False because for that limit to exist, 2 must be a limit point of D. However, f is continuous at x = 2—if we set δ = 1/2, Definition 5.1.1 is satisfied. It is not a requirement for continuity at x = x0 that lim_{x→x0} f(x) exist. A function will be continuous at isolated points of its domain—the limit of the function will not exist at those points. This is not a terribly important distinction for our work. Let us emphasize that Proposition 5.1.2 can be used to prove continuity at points of the domain that are limit points of the domain—which in this level of a text is most of them.
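The isolated-point situation can be made concrete. In this sketch (ours, with a finite grid standing in for [0, 1]), the choice δ = 1/2 isolates 2 from the rest of D, which is exactly why Definition 5.1.1 is satisfied there vacuously:

```python
# Continuity at an isolated point: D = [0,1] U {2}, f(x) = x^2.
# With delta = 1/2, the only point of D within delta of 2 is 2 itself,
# so |f(x) - f(2)| = 0 < eps holds trivially for all qualifying x --
# f is continuous at x = 2 even though lim_{x->2} f(x) does not exist
# (2 is not a limit point of D).

# A fine grid standing in for [0,1], plus the isolated point 2.
D = [i / 1000 for i in range(1001)] + [2.0]

delta = 0.5
near_2 = [x for x in D if abs(x - 2.0) < delta]
print(near_2)  # [2.0] -- 2 is isolated in D
```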
In Example 4.2.1 we showed that lim_{x→3} (2x + 3) = 9. If we set f1(x) = 2x + 3 and choose the domain to be R (a reasonable domain for that function), we note that f1 is defined at x = 3 and f1(3) = 9. Thus by Proposition 5.1.2 the function f1 is continuous at x = 3. It would be equally easy to mimic the work done in Example 4.2.1 (omitting the "0 <" part) to show that f1 satisfies Definition 5.1.1 at x = 3. (Choose δ = ǫ/2. Then for |x − 3| < δ, |f1(x) − 9| = |(2x + 3) − 9| = 2|x − 3| < 2δ = ǫ.) It is equally easy to see—using either Definition 5.1.1 or Proposition 5.1.2 (or we could use Proposition 4.3.3–(b) along with Proposition 5.1.2)—that the function f1 is continuous at x = x0 for any x0 ∈ R. Thus f1 is continuous on R.
Likewise, we showed in Example 4.2.2 that the function f2(x) = x^2 is continuous at x = 2 (given that we define f2 on some reasonable domain such as D = R) and in Example 4.2.3 that the function f3(x) = (x − 2)/(x + 3) is continuous at x = −2. Note that in the case of the function f3, the largest (most logical) domain that we can choose is the set D3 = {x : x ∈ R and x ≠ −3}. Recall that in the work done in Examples 4.2.1–4.2.3, the "0 <" part of the definition of a limit was not relevant. We noted that in each case |f(x) − L| < ǫ when x = x0 also—in fact in each of these cases |f(x) − L| = 0 when x = x0. This is exactly why f1, f2 and f3 are continuous at x = 3, x = 2 and x = −2, respectively.
If we consider f2 and f3 at arbitrary points of D2 and D3, respectively, we can use either Definition 5.1.1, or Proposition 4.3.3 and Proposition 5.1.2, to show that f2 is continuous on D2 and f3 is continuous on D3. Hopefully it is obvious that f3 is not continuous at x = −3. A function cannot be continuous at a point that is not in the domain of the function.
In Example 4.2.4 we showed that lim_{x→4} (x^3 − 64)/(x − 4) = 48. However, if we define f4(x) = (x^3 − 64)/(x − 4) and allow the domain to be what is sort of f4's natural domain, D4 = {x : x ∈ R and x ≠ 4}, then f4 is surely not continuous at x = 4 since f4 is not defined at x = 4. If we were to define a new function f8 so that f8(x) = f4(x) for all x ∈ R, x ≠ 4, and define f8(4) = 48, then the domain of f8 would be D8 = R, f8 would be continuous at x = 4 and f8 would be continuous on all of R—use Proposition 5.1.2.
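The continuous extension f8 can be written down directly. This sketch (our numerical illustration, not part of the text) shows f4's values approaching the filled-in value 48 as x approaches 4:

```python
# The removable discontinuity of f4(x) = (x^3 - 64)/(x - 4):
# filling in the value 48 at x = 4 yields a function continuous there,
# since x^3 - 64 = (x - 4)(x^2 + 4x + 16) and 16 + 16 + 16 = 48.

def f4(x):
    return (x**3 - 64) / (x - 4)          # undefined at x = 4

def f8(x):
    return f4(x) if x != 4 else 48.0      # continuous extension

for h in [1e-2, 1e-5, -1e-5]:
    print(f8(4 + h))   # values approach f8(4) = 48
print(f8(4))
```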
And finally we showed in Examples 4.2.5, 4.2.6 and 4.2.7 that the functions f5(x) = 1/x^2, f6(x) = 1 for x ≥ 0, 0 for x < 0, and f7(x) = sin(1/x) for x ≠ 0, 0 for x = 0, are not continuous at x = 0. This can be seen by the last part of Proposition 5.1.2 because 0 is a limit point of the domain of each of these functions and the limit as x approaches 0 of each of these functions does not exist. In the case of f5 it is even easier yet. If a function isn't defined at a particular point, there is no way that the function can be continuous at that point.
A Graphical Example In Figure 5.1.1 we plot a function where the domain is assumed to be approximately the set above which there is a graph plotted (except for the points xE and xF at which the function f is not defined). We make the following claims.

Figure 5.1.1: Plot of a function continuous at xA, discontinuous at xB–xF. A small open circle denotes a point at which the function is not defined. A small filled circle denotes one point of definition.
• At point xA, though the graph has a "corner" at that point, the function is continuous at that point. (Generally, a function is continuous at well-defined corners.)
• At point xB, the function is defined at x = xB but will not be continuous at xB. Though lim_{x→xB} f(x) exists, there is no way that lim_{x→xB} f(x) will equal f(xB). When x is near xB, f(x) is not near f(xB).
• The function is not continuous at points xC, xD and xE. These points are similar to the point x = 0 considered in Example 4.2.6, and the proof that the limit does not exist at points xC and xD would be very similar to the argument used in Example 4.2.6. We want to emphasize that each of these points represents a jump in the function. The points xC and xD are points where the function is defined on the left and right side of the jump, respectively. At the point xE the function is not continuous because it has a jump at that point and it is not defined at the jump point. A function cannot be continuous at a point at which it is not defined.
• The function is not continuous at point xF. Even though the function is nicely behaved on both sides of the point xF, the function must be defined at a point to be continuous at that point. Note that lim_{x→xF} f(x) exists, and if we were to define f at the point xF to be lim_{x→xF} f(x), then the function f would be continuous at x = xF.
Before we leave this section we include one of the basic continuity theorems. We see that this result is the continuity analog of the limit result, Proposition 4.1.3.

Proposition 5.1.3 Suppose that f : D → R, D ⊂ R and x0 ∈ D. Then f is continuous at x = x0 if and only if for any sequence {an} such that an ∈ D for all n and lim_{n→∞} an = x0, we have lim_{n→∞} f(an) = f(x0).
Proof: Before we begin, let's look at some of the differences between the above statement and that given in Proposition 4.1.3. Because we now assume that x0 ∈ D and because we no longer have the "0 <" restriction on the range of x, we now do not require that an ≠ x0. In addition, in the above proposition statement we no longer require that x0 is a limit point of D. We know that when we consider the continuity of a function, it is permissible to have isolated points in D and the function will always be continuous at those isolated points. Despite these differences the proof of this result is essentially identical to that of Proposition 4.1.3.
(⇒) We are assuming that f is continuous at x0 ∈ D. We consider any sequence {an} where an ∈ D and an → x0. We suppose that we are given an ǫ > 0. The continuity of f at x = x0 implies that there exists a δ such that |x − x0| < δ implies that |f(x) − f(x0)| < ǫ. If we apply the definition of the fact that an → x0 with the traditional "ǫ" replaced by δ, we get an N such that n > N implies that |an − x0| < δ. Then the continuity statement implies that for n > N, |f(an) − f(x0)| < ǫ. Thus f(an) → f(x0).
(⇐) We suppose that f is not continuous at x0, i.e. for some ǫ0 > 0 and for any δ there exists an xδ ∈ D such that |xδ − x0| < δ and |f(xδ) − f(x0)| ≥ ǫ0.
Let δ = 1: we get an x1 ∈ D such that |x1 − x0| < 1 and |f(x1) − f(x0)| ≥ ǫ0.
Let δ = 1/2: we get an x2 ∈ D such that |x2 − x0| < 1/2 and |f(x2) − f(x0)| ≥ ǫ0.
And in general, let δ = 1/n: we get an xn ∈ D such that |xn − x0| < 1/n and |f(xn) − f(x0)| ≥ ǫ0.
Thus we have a sequence {xn} such that xn → x0 and f(xn) ↛ f(x0). This is a contradiction, so f is continuous at x = x0.
We should be mildly concerned that the proofs of Propositions 4.1.3 and 5.1.3 are the same—we pointed out the differences between the two results. The fact that we no longer require that an ≠ x0 is no problem because f is now defined at x0 and the restriction 0 < |x − x0| < δ is replaced by |x − x0| < δ. The fact that we do not require that x0 be a limit point of D is taken care of by the fact that if x0 is an isolated point of D (not a limit point) we can consider the sequence {x0, x0, x0, · · · }—in fact the tail ends of all of the sequences contained in D that converge to x0 look like this if x0 is an isolated point of D.
There are some texts that use the right-hand side of Proposition 5.1.3 as the definition of continuity—since the proposition is an "if and only if" result, this is completely permissible. The definition of continuity given in Definition 5.1.1 is the most common definition. Of course there are many times when the sequential characterization of continuity is very useful. We feel that you must be comfortable using both Definition 5.1.1 and Proposition 5.1.3 (just as we tried to force you to work with both Definition 4.1.1 and Proposition 4.1.3). Specifically, as was the case with limits, when we want to show that a function is not continuous at a given point, it is usually easier to apply Proposition 5.1.3—providing either one sequence {an} for which {f(an)} does not converge, or two sequences {an} and {bn} for which {f(an)} and {f(bn)} converge to different values. When push comes to shove, we will use whichever characterization is best at the time.
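The sequential route to disproving continuity can be tried out directly. In this sketch (our addition; the sequence names are our own), a single sequence from the left already witnesses the discontinuity of the step function at 0 via Proposition 5.1.3:

```python
# Proposition 5.1.3 used to *disprove* continuity: for the step function
# f(x) = 1 (x >= 0), 0 (x < 0), the sequence b_n = -1/n converges to 0
# but f(b_n) = 0 does not converge to f(0) = 1.

def f(x):
    return 1.0 if x >= 0 else 0.0

a = [1.0 / n for n in range(1, 100)]    # from the right: f(a_n) -> 1 = f(0)
b = [-1.0 / n for n in range(1, 100)]   # from the left:  f(b_n) -> 0 != f(0)

print([f(x) for x in a[-3:]], [f(x) for x in b[-3:]])
```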
HW 5.1.1 (True or False and why) (a) Suppose f : [0, 1) → R is defined as f(x) = x^2. We know that lim_{x→1} f(x) = 1. Then the function f is continuous at x = 1.
(b) Suppose f : N → R defines a sequence, i.e. f(n) = an. The function f is continuous at all n ∈ N.
(c) Suppose f : D → R, D ⊂ R and x0 ∈ D. Suppose for every ǫ > 0 there exists δ such that x ∈ D, 0 < |x − x0| < δ implies |f(x) − f(x0)| < ǫ. Then f is continuous at x = x0.
5.2
Some Examples of Continuity Proofs
In this section we include an assortment of proofs of continuity. Hopefully after
our work with limits, you are getting to be pretty good at these. We felt that
you should see some. In general we can use Definition 5.1.1 or Propositions 5.1.2
and 5.1.3—whichever appears to be best at the time. In this section we will
use a variety of methods so that you get a taste of each of the above results.
We begin with an example where we use the definition for the specific case
(much easier) and Proposition 5.1.2 for the general case (much more difficult).
Example 5.2.1 Consider the function f(x) = (x^2 − 1)/(x + 3) on the domain D = {x ∈ R : x ≠ −3}.
(a) Prove that f is continuous at x = 3.
(b) Prove that f is continuous at x = x0 for x0 ∈ D.
Solution: (a) We begin by assuming that we are given an ǫ > 0. We must find a δ so
that |x − 3| < δ implies that

|f(x) − f(3)| = |(x^2 − 1)/(x + 3) − 4/3| = |3x^2 − 4x − 15|/(3|x + 3|) = |3x + 5||x − 3|/(3|x + 3|) < ǫ.

Note the |x − 3| term in the numerator—we promised you that it would always be there. As
we did with the limit proofs, we must bound the |3x + 5|/(3|x + 3|) term. We begin as we did with the
limit problems and choose δ1 = 1 and restrict x so that |x − 3| < δ1 = 1. Then

|x − 3| < 1 ⇒ 2 < x < 4 ⇒ 6 < 3x < 12 ⇒ 11 < 3x + 5 < 17, so |3x + 5| < 17.

Likewise

|x − 3| < 1 ⇒ 2 < x < 4 ⇒ 5 < x + 3 < 7 ⇒ |x + 3| > 5, so 3|x + 3| > 15.

Therefore |3x + 5|/(3|x + 3|) < 17/15, and if |x − 3| < δ1 = 1, then

|f(x) − f(3)| = |3x + 5||x − 3|/(3|x + 3|) < (17/15)|x − 3|.

Then: if we define δ = min{1, (15/17)ǫ} and require that x satisfy |x − 3| < δ, then

|f(x) − f(3)| = |3x + 5||x − 3|/(3|x + 3|) <* (17/15)|x − 3| <** (17/15)(15/17)ǫ = ǫ,

where the "<*" inequality is due to the fact that |x − 3| < δ ≤ 1 and the "<**" inequality is
due to the fact that |x − 3| < δ ≤ (15/17)ǫ. Therefore the function f is continuous at x = 3.
This result could have been proved using either Proposition 5.1.2 or 5.1.3—and using
either of these would be easier than the above proof.
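The δ found in part (a) can also be spot-checked numerically. The Python sketch below (the sampling grid is ours; this is a sanity check, not a proof) samples points inside (3 − δ, 3 + δ) for several values of ǫ and confirms that |f(x) − f(3)| < ǫ at every sampled point.

```python
# Numerical spot check of Example 5.2.1(a):
# f(x) = (x^2 - 1)/(x + 3), continuity at x = 3 with delta = min{1, (15/17) eps}.
def f(x):
    return (x**2 - 1) / (x + 3)

for eps in (1.0, 0.1, 0.001):
    delta = min(1.0, (15 / 17) * eps)
    # sample x values strictly inside (3 - delta, 3 + delta)
    xs = [3 + delta * t / 1000 for t in range(-999, 1000)]
    assert all(abs(f(x) - f(3)) < eps for x in xs)
print("delta = min{1, (15/17) eps} works at every sampled point")
```

Note that f(3) = 8/6 = 4/3, matching the value used in the proof.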
(b) Originally (in the preparation of the text) we used Definition 5.1.1 to prove the continuity
of f at x0 ∈ D. It was good because it showed that it could be done but it was brutal—so
we took it out. Probably the easiest way to prove continuity at x0 ∈ D is to use Proposition
5.1.2. It should not be hard to see that any x0 ∈ D is a limit point of D. Then by Proposition
4.3.1, parts (a), (b), (d) and (f), we see that

lim_{x→x0} f(x) = lim_{x→x0} (x^2 − 1)/(x + 3) = (x0^2 − 1)/(x0 + 3) = f(x0).

Therefore f is continuous at any x0 ∈ D.
We next consider the absolute value function at x = 0. Recall that the graph
of the absolute value function has a corner at x = 0 (like point xA on the graph
of f in Figure 5.2.1). Functions are continuous at corners of the graph.
Example 5.2.2 Show that the function f (x) = |x| is continuous at x = 0.
Solution: Note that f is defined on R (which we will assume to be the domain of f). Clearly
x = 0 is a limit point of the domain. Given ǫ > 0, choose δ = ǫ; then |x − 0| < δ implies
||x| − |0|| = |x| < ǫ. Hence lim_{x→0} |x| = 0 = |0|, and by Proposition 5.1.2 the function | · | is continuous at x = 0.
It should be clear that the absolute value function is also continuous at all other points
of R.
We next prove the continuity of the sine and cosine functions—first at θ = 0
and then for general θ. The continuity of the sine and cosine functions can then
be used to prove the continuity of the remaining trigonometric functions at the
points where these functions are defined.
Example 5.2.3
(a) Show that for sufficiently small θ
−|θ| ≤ sin θ ≤ |θ| and −|θ| ≤ 1 − cos θ ≤ |θ|.
(b) Prove that the sine function is continuous at θ = 0.
(c) Prove that the cosine function is continuous at θ = 0.
(d) Show that the sine and cosine functions are continuous on R.
Solution: (a) We consider the picture given in Figure 5.2.1. We begin by noting that most
of this argument will be true for more than "sufficiently small" θ. However, as we shall see,
we only need the result for small θ and do not want to have to worry about what happens
when θ gets larger than π/2 or smaller than −π/2.
[Figure 5.2.1: Figure used to prove part (a) of Example 5.2.3—the unit circle with center O, the point A = (1, 0), the point P on the circle at angle θ, and Q the foot of the perpendicular from P to OA.]
We begin by noting that |OQ| = cos θ and |PQ| = |sin θ|. We also see that |AP| ≤ |θ|
(equality when θ = 0), where of course AP is the line segment from A to P, |AP| is the length of that
segment, and θ is the arc length from A to P—note that the absolute value signs are included to allow
for a negative angle θ. From triangle △OQP we see that |QP| = |sin θ| and |OQ| = cos θ—
which gives |AQ| = 1 − cos θ. Then applying the Pythagorean Theorem to triangle △AQP we
see that

sin^2 θ + (1 − cos θ)^2 = |AP|^2 ≤ θ^2.

Therefore sin^2 θ ≤ θ^2 and (1 − cos θ)^2 ≤ θ^2. If we take the square roots of both inequalities
we get |sin θ| ≤ |θ| and |1 − cos θ| ≤ |θ|. (Notice that this is one of the times that you must
be very careful to note that √(a^2) = |a|—not √(a^2) = a.)
(b) and (c) By Example 5.2.2 we know that lim_{θ→0} (±|θ|) = 0. Then by part (a) of this example
and Proposition 4.3.4, lim_{θ→0} sin θ = 0 = sin 0. Since θ = 0 is a limit point of R we can apply
Proposition 5.1.2 to see that the sine function is continuous at θ = 0.
It should be easy to see that the proof that lim_{θ→0} (1 − cos θ) = 0 is the same as the proof
given above for the sine function. From this we see that lim_{θ→0} cos θ = 1 = cos 0 and that the
cosine function is continuous at θ = 0.
Once we have the inequalities from part (a), it is also easy to prove the continuity of sine
and cosine using Definition 5.1.1. (If ǫ > 0 is given, then by choosing δ = ǫ, we see that
|θ − 0| < δ implies that −ǫ = −δ < −|θ| ≤ 1 − cos θ ≤ |θ| < δ = ǫ. So cos is continuous at
θ = 0.)
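The two inequalities from part (a) are easy to check numerically on a grid in (−π/2, π/2). The Python sketch below (the grid is ours; a spot check of the geometry, not a proof) verifies both bounds.

```python
import math

# Check |sin t| <= |t| and |1 - cos t| <= |t| on a grid in (-pi/2, pi/2),
# the range for which the geometric argument of part (a) was carried out.
ts = [(-math.pi / 2) + math.pi * k / 2000 for k in range(1, 2000)]
assert all(abs(math.sin(t)) <= abs(t) for t in ts)
assert all(abs(1 - math.cos(t)) <= abs(t) for t in ts)
print("both inequalities hold at all", len(ts), "sample points")
```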
(d) Let θ0 ∈ R. We consider lim_{θ→θ0} sin θ. Note that

sin θ = sin[θ0 + (θ − θ0)] = sin θ0 cos(θ − θ0) + cos θ0 sin(θ − θ0).   (5.2.1)

By part (b) lim_{h→0} sin h = 0, hence for any ǫ > 0 there exists a δ such that |h| < δ implies
that |sin h| < ǫ. Then: if |θ − θ0| < δ, we have |sin(θ − θ0)| < ǫ. Therefore lim_{θ→θ0} sin(θ − θ0) = 0.
In part (c) we found that lim_{θ→0} cos θ = 1. Hence lim_{θ→θ0} cos(θ − θ0) = 1. Thus by these
limits, equation (5.2.1) and parts (a) and (b) of Proposition 4.3.1 we see that lim_{θ→θ0} sin θ =
sin θ0 · 1 + cos θ0 · 0 = sin θ0. Therefore the sine function is continuous at θ = θ0 (for any
θ0 ∈ R).
To prove the continuity of the cosine function at θ = θ0 we use the identity
cos θ = cos [θ0 + (θ − θ0 )] = cos θ0 cos(θ − θ0 ) − sin θ0 sin(θ − θ0 )
and proceed as we did in the proof of the continuity of the sine function.
The next example is a fun example. Before we get to work, notice the function f defined in Example 5.2.4. Recall that in Example 4.2.7 we considered a
similar function (without the x term multiplying the sine term) that was not
continuous at x = 0. As we did in Example 4.2.7 it is useful here to look at the
plot of f. In Figure 5.2.2 we see that the plot of f squeezes down to zero when
x is near zero. This is the attribute of this function that makes it continuous at
x = 0, whereas the function given in Example 4.2.7 was not continuous at x = 0.
[Figure 5.2.2: Plot of the function f(x) = x sin(1/x) for x ≠ 0 and f(0) = 0.]
Example 5.2.4 Define the function f : R → R by

f(x) = x sin(1/x) if x ≠ 0, and f(0) = 0.

Show that f is continuous at x = 0.
Proof: It is easy to use Definition 5.1.1 to prove that f is continuous at x = 0. Let ǫ > 0 be
given, define δ = ǫ and consider x values that satisfy |x| < δ. Then

|x sin(1/x) − 0| ≤ |x| < δ = ǫ.
Therefore f is continuous at x = 0.
It should be clear that f is also continuous for all other points in R.
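The squeeze |x sin(1/x)| ≤ |x| that drives this proof can be spot-checked numerically. The Python sketch below (sampling grid ours) verifies the estimate on a grid around 0 and watches f(1/n) shrink.

```python
import math

# f(x) = x*sin(1/x) for x != 0, f(0) = 0 (the function of Example 5.2.4).
def f(x):
    return x * math.sin(1 / x) if x != 0 else 0.0

# The proof's key estimate: |f(x) - f(0)| <= |x|, so delta = eps works.
xs = [(-0.5) + k / 1000 for k in range(1001) if k != 500]  # grid avoiding x = 0
assert all(abs(f(x) - f(0)) <= abs(x) for x in xs)

# And f(x_n) -> 0 along x_n = 1/n, as continuity at 0 requires.
print(max(abs(f(1 / n)) for n in range(100, 200)))  # small: at most 1/100
```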
We include one more example (that is also a fun example) that introduces
a useful, interesting function.
Example 5.2.5 Define the function f : D = [0, 1] → R by

f(x) = 1 if x ∈ Q, and f(x) = 0 if x ∈ I.

Show that f is discontinuous at all points x ∈ D = [0, 1].
Solution: First consider x0 ∈ [0, 1]∩I. Let ǫ = 1/2. Consider any δ. We know by Proposition
1.5.6-(a) that there exists rδ ∈ Q such that rδ ∈ (x0 − δ, x0 + δ), i.e. rδ satisfies |rδ − x0 | < δ
and |f (rδ ) − f (x0 )| = |1 − 0| = 1 > ǫ = 1/2. Therefore f is not continuous at x0 .
Likewise consider x0 ∈ [0, 1] ∩ Q. Let ǫ = 1/2. Consider any δ. We know by Proposition
1.5.6-(b) that there exists iδ ∈ I such that iδ ∈ (x0 − δ, x0 + δ), i.e. iδ satisfies |iδ − x0 | < δ
and |f (iδ ) − f (x0 )| = |0 − 1| = 1 > ǫ = 1/2. Therefore f is not continuous at x0 .
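The first half of the argument can be mirrored numerically: decimal truncations of an irrational point are rational numbers arbitrarily close to it. The Python sketch below (an illustration in the spirit of Proposition 1.5.6, up to floating-point precision; not a substitute for the proof) builds such rationals near x0 = √2/2.

```python
from fractions import Fraction
import math

# For the function of Example 5.2.5: f = 1 on rationals, 0 on irrationals.
x0 = math.sqrt(2) / 2            # irrational target (to float precision)

# Rational points r within 10^-n of x0: truncated decimal expansions.
for n in range(1, 10):
    r = Fraction(int(x0 * 10**n), 10**n)   # a genuine rational number
    assert abs(float(r) - x0) < 10**-n     # r lies within 10^-n of x0
    # f(r) = 1 while f(x0) = 0, so |f(r) - f(x0)| = 1 > 1/2 = eps.
print("rationals arbitrarily close to x0, each with |f(r) - f(x0)| = 1")
```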
HW 5.2.1 (True or False and why) (a) Set D = [0, 1] ∪ {3} and define f on D
by f(x) = 1 for x ∈ D ∩ Q and f(x) = 0 for x ∈ D ∩ I. Then f is discontinuous
at all points of D.
(b) Suppose f : [−1, 1] → R is defined as follows: for x ∈ [−1, 1] ∩ Q, f(x) = x^2
and for x ∈ [−1, 1] ∩ I, f(x) = −x^2. Then the function f is continuous at x = 0.
(c) The function f defined in part (b) is discontinuous for all x ∈ [−1, 1], x ≠ 0.
(d) If we consider the function f defined in Example 5.2.5, the sequence {√2/2 + 1/n}_{n=5}^∞
can be used to show that the function f is discontinuous at x = √2/2.
HW 5.2.2 (a) Prove that f (x) = |x − 3| is continuous at x = 3.
(b) Prove that f is continuous at x = 2.
(c) Prove that f is continuous on R.
HW 5.2.3 (a) Consider the functions f1 defined on R as f1(x) = x^3 for x ≥ 0
and f1(x) = 3x for x < 0; f2 defined on R as f2(x) = x^3 for x ≥ 0 and
f2(x) = 3x − 1 for x < 0; and f3 defined on [−1, 1] as f3(x) = x^3 for
x ∈ [−1, 1] ∩ Q and f3(x) = −x^3 for x ∈ [−1, 1] ∩ I. At which points
are f1, f2 and f3 continuous? Show why.
HW 5.2.4 Prove that any polynomial is continuous on R.
HW 5.2.5 Prove that any rational function is continuous at all points where
the denominator is nonzero.
5.3
Basic Continuity Theorems
There are a lot of important continuity theorems. We will begin with the most
basic of these theorems.
Proposition 5.3.1 Consider f, g : D → R for D ⊂ R, x0 ∈ D, c ∈ R, and
suppose that f and g are continuous at x = x0 . We then have the following
results.
(a) cf is continuous at x = x0 .
(b) f ± g is continuous at x = x0 .
(c) f g is continuous at x = x0 .
(d) If g(x0) ≠ 0, then f/g is continuous at x = x0.
Proof: The proofs of (a)-(d) follow from Proposition 5.1.3 along with Propositions 3.3.2 and 3.4.1. We consider any sequence {an } such that an ∈ D for all n
and an → x0 . Then by the continuity hypothesis and Proposition 5.1.3 we know
that f (an ) → f (x0 ) and g(an ) → g(x0 ). Then by Proposition 3.3.2 we know that
cf (an ) → cf (x0 ), (f + g)(an ) = f (an ) + g(an ) → f (x0 ) + g(x0 ) = (f + g)(x0 )
and (f g)(an ) = f (an )g(an ) → f (x0 )g(x0 ) = (f g)(x0 ). Then by Proposition
5.1.3 cf , f + g and f g are continuous at x = x0 .
Likewise, for any sequence {an } such that an ∈ D for all n and an → x0 ,
the continuity of f at x0 implies that f (an ) → f (x0 ) and g(an ) → g(x0 ).
Since g(x0) ≠ 0, Proposition 3.4.1 implies that (f/g)(an) = f(an)/g(an) →
f (x0 )/g(x0 ) = (f /g)(x0 ). Then by Proposition 5.1.3, f /g is continuous at
x = x0 .
We must realize that the above results can also be proved based on Definition
5.1.1—similar to the proof of Proposition 4.3.1 given using Definition 4.1.1.
Also, we want to emphasize that Proposition 5.3.1 implies that if f and g are
continuous on D ⊂ R, then cf , f ± g, f g are continuous on D. And, f /g is
continuous on {x ∈ D : g(x) 6= 0}—which is the natural domain of f /g.
In addition to the results given above, we also have the results analogous
to parts (c) and (e) of Proposition 4.3.1. The result is a useful tool in the study
of continuity. We state the following proposition.
Proposition 5.3.2 Consider f : D → R for D ⊂ R, x0 ∈ D, and suppose that
f is continuous at x = x0 .
(a) There exists a K ∈ R and δ > 0 such that |x − x0 | < δ implies |f (x)| ≤ K.
(b) If f(x0) ≠ 0, there exists M > 0 and δ > 0 such that |x − x0| < δ implies
|f(x)| ≥ M.
We don’t prove the above result—the proof is the same as those of parts (c)
and (e) of Proposition 4.3.1.
The next result could be pieced together by multiple applications of parts
of Proposition 5.3.1—but we don't have to work that hard. We have already done
the work in Section 4.3.
Example 5.3.1
(a) For n ∈ N the function f (x) = xn is continuous on R.
(b) All polynomials are continuous on R.
(c) All rational functions are continuous at all points at which the denominator is not zero.
Solution: All of the points under consideration are limit points of the domains. Then part
(a) follows from Proposition 4.3.2-(c) along with Proposition 5.1.2. Parts (b) and (c) follow
from parts (b) and (c) of Proposition 4.3.3.
There is a series of basic continuity theorems that we must consider. We
include the following result.
Proposition 5.3.3 Consider f, g : D → R for D ⊂ R, x0 ∈ D, c ∈ R, and
suppose that f and g are continuous at x = x0 . We then have the following
results.
(a) The function F (x) = max{f (x), g(x)} is continuous at x = x0 .
(b) The function G(x) = min{f (x), g(x)} is continuous at x = x0 .
Proof: (a) Suppose ǫ > 0 is given. Then there exists δ1 and δ2 such that
|x − x0 | < δ1 implies that |f (x) − f (x0 )| < ǫ or f (x0 ) − ǫ < f (x) < f (x0 ) + ǫ
(5.3.1)
and
|x − x0 | < δ2 implies that |g(x) − g(x0 )| < ǫ or g(x0 ) − ǫ < g(x) < g(x0 ) + ǫ.
(5.3.2)
Let δ = min{δ1 , δ2 }. Then for x satisfying |x − x0 | < δ
max{f (x0 ), g(x0 )} − ǫ = max{f (x0 ) − ǫ, g(x0 ) − ǫ} < max{f (x), g(x)} (5.3.3)
and
max{f (x), g(x)} < max{f (x0 ) + ǫ, g(x0 ) + ǫ} = max{f (x0 ), g(x0 )} + ǫ (5.3.4)
or F (x0 ) − ǫ < F (x) < F (x0 ) + ǫ. Thus we have |F (x) − F (x0 )| < ǫ so F is
continuous at x = x0 . Look at the computation given in (5.3.3) and (5.3.4)
carefully. It’s easy but looks difficult. You start with max{f (x), g(x)} and
replace each of them by the inequalities given by statements (5.3.1) and (5.3.2).
(b) Of course the proof of part (b) will be the same. We again consider statements (5.3.1) and (5.3.2). This time taking the minimums, we get G(x0 ) − ǫ <
G(x) < G(x0 ) + ǫ or |G(x) − G(x0 )| < ǫ. Thus G is continuous at x = x0 .
Next we want to give a result that will expand the number of functions that
we know are continuous. Before we give the result we include the definition of
the composite function.
Definition 5.3.4 For D ⊂ R consider f : D → R and g : U → R where
f (D) ⊂ U . Then the composition of f and g, g ◦ f : D → R is defined as
g ◦ f (x) = g(f (x)) for all x ∈ D.
We use the composition to define some more interesting functions: (i) f(x) = x^2 + 1
and g(y) = √y implies that g ◦ f(x) = √(x^2 + 1); (ii) f(θ) = θ − π/2
and g(y) = sin y implies that g ◦ f(θ) = sin(θ − π/2); etc. We then have the
following basic result concerning continuity of the composite function.
Proposition 5.3.5 Suppose that f : D → R, g : U → R, f(D) ⊂ U, f is continuous at x0 and g is continuous at f(x0). Then g ◦ f is continuous at x = x0.
Proof: We suppose that ǫ > 0 is given. g continuous at f (x0 ) implies that
there exists a δ1 such that |y − f (x0 )| < δ1 implies that |g(y) − g(f (x0 ))| < ǫ. f
continuous at x0 (applying the definition of the continuity of f at x0 using δ1 in
place of the traditional ”ǫ”) implies that there exists a δ such that |x − x0 | < δ
implies that |f (x) − f (x0 )| < δ1 .
Then for |x − x0| < δ we have |f(x) − f(x0)| < δ1, which implies that |g(f(x)) −
g(f(x0))| < ǫ, i.e. g ◦ f is continuous at x = x0.
We next define the maxima and minima that you worked with a great deal in
your basic course.
Definition 5.3.6 Consider the function f : D → R where D ⊂ R and x0 ∈ D.
(a) The point (x0 , f (x0 )) is said to be a maximum (or local maximum) of f if
there exists a neighborhood of x0 , N , such that f (x) ≤ f (x0 ) for all x ∈ N ∩ D.
(b) The point (x0 , f (x0 )) is said to be an absolute maximum of f on D if f (x) ≤
f (x0 ) for all x ∈ D.
(c) The point (x0 , f (x0 )) is said to be a minimum (or local minimum) of f if
there exists a neighborhood of x0 , N , such that f (x) ≥ f (x0 ) for all x ∈ N ∩ D.
(d) The point (x0 , f (x0 )) is said to be an absolute minimum of f on D if f (x) ≥
f (x0 ) for all x ∈ D.
It is easy to see that the function f(x) = −x^2 defined on [−1, 1] has a maximum at (0, 0)—it is an absolute maximum. This function has minimums at both
points (−1, −1) and (1, −1)—which are both absolute minimums. Note that
this means that the absolute maximum or minimum need not be unique. Note
that the same function defined on the set (−1, 1) does not have any minimums—
and then surely does not have an absolute minimum. We also note that if we
define a function f : D → R, D = (−2, −1) ∪ {0} ∪ (1, 2), by f(x) = x^2, then by
the definition (0, f(0)) is both a maximum and a minimum—not very satisfying
but acceptable.
We next prove a useful lemma and a very important theorem concerning
continuous functions.
Lemma 5.3.7 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
there exists an M ∈ R such that f (x) ≤ M for all x ∈ [a, b].
Proof: Suppose not; i.e., suppose there is no M such that f(x) ≤ M for all x ∈ [a, b].
For M = 1 there exists an x1 ∈ [a, b] such that f(x1) > 1 (otherwise M = 1
would work).
For M = 2 there exists an x2 ∈ [a, b] such that f (x2 ) > 2.
And, in general, for each n ∈ N there exists an xn ∈ [a, b] such that f (xn ) > n.
{xn } is a sequence in [a, b]. Since [a, b] is compact, by Corollary 3.4.8 we know
that the sequence {xn } has a subsequence, {xnj } and there is an x0 ∈ [a, b],
such that xnj → x0 as j → ∞. Then by the continuity of f on [a, b] and
Proposition 5.1.3 we know that f(xnj) → f(x0). Since the sequence {f(xnj)}_{j=1}^∞
is convergent, we know by Proposition 3.3.2-(c) that the sequence is bounded.
This contradicts the fact that f(xnj) > nj ≥ j for all j. Therefore the set
f ([a, b]) is bounded above.
Theorem 5.3.8 Suppose that f : [a, b] → R and f is continuous on [a, b]. Then
f has an absolute maximum and an absolute minimum on [a, b].
Proof: Let S = f ([a, b]). By Lemma 5.3.7 S is bounded above. Thus by the
completeness axiom, Definition 1.4.3, M ∗ = lub(S) exists. Note that to find an
absolute maximum of f on [a, b], we must find an x0 such that f (x0 ) = M ∗ .
Recall that by Proposition 1.5.3–(a) for every ǫ > 0 there exists an s ∈ S
such that M ∗ −s < ǫ. In our case Proposition 1.5.3–(a) gives that for every ǫ > 0
there exists an x ∈ [a, b] (and an associated f (x)) such that M ∗ − f (x) < ǫ—all
points in S look like f (x)—and are associated with an x ∈ [a, b].
Let ǫ = 1. We get x1 ∈ [a, b] such that M ∗ − f (x1 ) < 1.
Let ǫ = 1/2. We get x2 ∈ [a, b] such that M ∗ − f (x2 ) < 1/2.
In general, let ǫ = 1/n for n ∈ N. We get xn ∈ [a, b] such that M ∗ −f (xn ) < 1/n.
Hence we have M ∗ − 1/n < f (xn ) ≤ M ∗ for all n ∈ N (because M ∗ is an upper
bound) so f (xn ) → M ∗ .
All of xn ’s are in [a, b]. Since [a, b] is compact, by Corollary 3.4.8 we know
that there exists a subsequence of {xn }, {xnj }, such that xnj → x0 for some
x0 ∈ [a, b]. Thus f (xnj ) → f (x0 ).
By Proposition 3.4.6 we know that f (xnj ) → M ∗ . By Proposition 3.3.1 we
know that the limit must be unique. Thus M ∗ = f (x0 ) and (x0 , f (x0 )) is an
absolute maximum.
To show that f has an absolute minimum on [a, b] we consider the function
g = −f . If f is continuous on [a, b], the function g will be continuous on [a, b].
The absolute maximum of g on [a, b] will be the absolute minimum of f on [a, b].
From the above theorem we obtain the following useful corollary.
Corollary 5.3.9 If f : [a, b] → R is continuous on [a, b], then f is bounded on
[a, b].
HW 5.3.1 (True or False and why) (a) Suppose f, g : D → R, D ⊂ R, x0 ∈ R.
If f and g are continuous at x = x0 , then either f or g is continuous at x = x0 .
(b) Suppose f : [0, 1] → R is such that f^2 is continuous on [0, 1]. Then f is
continuous on [0, 1].
(c) Suppose f, g : D → R, D ⊂ R. Then max{f(x), g(x) : x ∈ D} = max{f(x) :
x ∈ D} max{g(x) : x ∈ D}.
(d) Consider f(x) = √x defined on [0, ∞) and g(x) = x − 1 defined on R. Then
f ◦ g is continuous on [1, ∞).
(e) Consider f : [0, 1] → R. Then f has a maximum on [0, 1].
HW 5.3.2 Suppose that f : [0, 1] → R is continuous at the point x = x0 and
f (x0 ) > 0. Prove that there exists an n ∈ N such that f (x) > 0 for all x in the
neighborhood N1/n (x0 ).
HW 5.3.3 (a) Suppose f : D → R, D ⊂ R, x0 ∈ D, is continuous at x = x0 .
Prove that |f | (defined by |f |(x) = |f (x)|) is continuous at x = x0 .
(b) Prove that for x ∈ D, min{f(x), g(x)} = (1/2)[f(x) + g(x)] − (1/2)|f(x) − g(x)|.
(c) If f and g are continuous at x = x0 , prove that G(x) = min{f (x), g(x)}
is continuous at x = x0 (give a proof different from that given in Proposition
5.3.3).
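The identity min{a, b} = (a + b)/2 − |a − b|/2 that appears in HW 5.3.3-(b) is easy to test numerically before attempting the proof (the proof itself goes by cases on a ≤ b; the sample pairs below are ours).

```python
# Spot check of the identity min{a, b} = (a + b)/2 - |a - b|/2.
def min_formula(a, b):
    return 0.5 * (a + b) - 0.5 * abs(a - b)

pairs = [(1.0, 2.0), (2.0, 1.0), (-3.5, -3.5), (0.0, -7.25), (4.25, 4.0)]
assert all(min_formula(a, b) == min(a, b) for a, b in pairs)
print("identity holds on all sampled pairs")
```

The corresponding identity for max replaces the minus sign before the absolute value with a plus sign.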
5.4
More Continuity Theorems
We next prove a very important basic theorem concerning continuous functions—
the result yields an approximate characterization of continuity.
Theorem 5.4.1 (Intermediate Value Theorem: IVT) Suppose that f : [a, b] →
R and f is continuous on [a, b]. Let c ∈ R be between f (a) and f (b). Then there
exists x0 ∈ (a, b) such that f (x0 ) = c.
Proof: We have two cases, f (a) < c < f (b) and f (b) < c < f (a). We will
consider the first case—the second case will follow in the same way.
This will be a constructive proof. Let a1 = a and b1 = b.
Let m1 = (a1 + b1 )/2. If f (m1 ) ≤ c, define a2 = m1 and b2 = b1 . If f (m1 ) > c,
define a2 = a1 and b2 = m1 . Note that this construction divides the interval
in half, and chooses the half so that f (a2 ) ≤ c < f (b2 )—specifically we have
a = a1 ≤ a2 < b2 ≤ b1 = b and f (a2 ) ≤ c < f (b2 ).
Let m2 = (a2 + b2 )/2. If f (m2 ) ≤ c, define a3 = m2 and b3 = b2 . If f (m2 ) > c,
define a3 = a2 and b3 = m2 . We have a = a1 ≤ a2 ≤ a3 < b3 ≤ b2 ≤ b1 = b and
f (a3 ) ≤ c < f (b3 ).
We continue in this fashion and inductively obtain an and bn , n = 1, 2, · · ·
such that
a = a1 ≤ a2 ≤ a3 ≤ · · · ≤ an < bn ≤ bn−1 ≤ · · · ≤ b1 = b
and f (an ) ≤ c < f (bn ).
We have a sequence of closed intervals [an, bn] such that f(an) ≤ c < f(bn)
and

bn − an = (1/2)[bn−1 − an−1] = · · · = (1/2^{n−1})[b1 − a1] = (1/2^{n−1})[b − a].
Clearly {an } is a monotonically increasing sequence that is bounded above
by b. Therefore by the Monotone Convergence Theorem, Theorem 3.5.2, there
exists α ≤ b such that an → α. Likewise the sequence {bn } is a monotonically
decreasing sequence bounded below by a. Thus by the Monotone Convergence
Theorem there exists a β ≥ a such that bn → β.
We see that lim_{n→∞}[bn − an] = lim_{n→∞} (1/2^{n−1})[b − a] = 0 and lim_{n→∞}[bn −
an] = β − α. Thus α = β; call the common value x0.
We have f(an) ≤ c < f(bn) and lim_{n→∞} f(an) = lim_{n→∞} f(bn) = f(x0).
By the Sandwich Theorem, Proposition 3.4.2 (where the center sequence will
be the constant sequence {c, c, · · · }), we have f(x0) = c.
We might note that one of the nice applications of the IVT is to prove the
existence of a solution of an equation of the form f(x) = 0. The approach is to
find an a and a b in the domain of the function such that f(a) < 0, f(b) > 0 and f
is continuous on [a, b]. The IVT then implies that there exists an x0 ∈ [a, b] such
that f(x0) = 0. For example, consider the function f(x) = x^5 + x + 1. We note
that f(−1) = −1, f(1) = 3 and f is surely continuous on the interval [−1, 1].
Therefore by the Intermediate Value Theorem there exists an x0 ∈ [−1, 1] such
that f(x0) = 0. Can you find such an x0? Can you approximate it? (Use your
calculator.)
A slight variation of the above application of the IVT and essentially the
process that we used for the proof of the IVT gives us an excellent method
for finding an approximation to the solution to an equation f (x) = 0: called
the Bisection Method. Suppose we know that f (a) < 0, f (b) > 0 and f is
continuous on [a, b]. We then use a construction that we have used in the proof
of the IVT. We set a1 = a and b1 = b.
We set c1 = (a1 + b1 )/2 and evaluate f (c1 ). If f (c1 ) < 0, we set a2 = c1 and
b2 = b1 . If f (c1 ) > 0, we set a2 = a1 and b2 = c1 . (If f (c1 ) ≈ 0, quit.)
We set c2 = (a2 + b2 )/2 and evaluate f (c2 ). If f (c2 ) < 0, we set a3 = c2 and
b3 = b2 . If f (c2 ) > 0, we set a3 = a2 and b3 = c2 . (If f (c2 ) ≈ 0, quit.)
We continue in this fashion until for some cn , f (cn ) is sufficiently small.
We use that value of cn as an approximation of the solution of f (x) = 0. We
note that the proof of the IVT proves that the sequence {an } converges to the
solution of f (x) = 0. (It’s easy to see that the sequence {cn } will also converge
to the solution of f (x) = 0.) We really don’t know how fast this convergence
is taking place (the Bisection method is not the fastest method) but we can
get an excellent approximation to solutions of equations using this method. For
example if we again consider f (x) = x5 + x + 1, set a = −1 and b = 1, and
perform the iteration, we get c1 = 0.0, c2 = −0.5, c3 = −0.75, c4 = −0.875,
c5 = −0.8125, c6 = −0.7813, c7 = −0.7656, c8 = −0.7578. We see that
f (−0.7578) = −0.0077 and we stopped because we chose ”sufficiently small”
to be 0.01. And of course, if you wanted to find the solutions to x5 + x + 1 = 7
instead, you could consider f (x) = (x5 + x + 1) − 7.
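The iteration just described is easy to implement. The Python sketch below (function and variable names are ours) reproduces the run reported above for f(x) = x^5 + x + 1 on [−1, 1], stopping when |f(c)| < 0.01.

```python
# Bisection method as described in the text: assumes f(a) < 0 < f(b) and
# f continuous on [a, b]; stops when |f(c)| is "sufficiently small".
def bisect(f, a, b, tol=0.01):
    cs = []                      # record of the midpoints c_1, c_2, ...
    while True:
        c = (a + b) / 2
        cs.append(c)
        if abs(f(c)) < tol:      # "sufficiently small" -- quit
            return c, cs
        if f(c) < 0:             # root lies in [c, b]
            a = c
        else:                    # root lies in [a, c]
            b = c

f = lambda x: x**5 + x + 1
c, cs = bisect(f, -1.0, 1.0)
print(cs)     # [0.0, -0.5, -0.75, -0.875, -0.8125, -0.78125, -0.765625, -0.7578125]
print(f(c))   # about -0.0077, within the 0.01 tolerance
```

The midpoints match the values c1 through c8 reported in the text.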
We next include a corollary to Theorem 5.4.1 that will be useful to us in
the next section. Before we proceed we want to emphasize that by interval, we
mean any of the different types of intervals we have introduced—closed, open,
part open and part closed, unbounded, etc. We state the following result.
Corollary 5.4.2 Suppose f : I → R where I ⊂ R is an interval and f is
continuous on I. Then f (I) is an interval.
Proof: If f(I) is not an interval, there must be f(a), f(b) ∈ f(I) and a
c ∈ R such that c is between f(a) and f(b) but c ∉ f(I). This would contradict
Theorem 5.4.1 applied to f on [a, b] (where for convenience we assume that
a < b).
Often we are interested in when and where the functions are increasing and
decreasing—if you recall, you probably used these ideas in your basic class when
you used calculus to plot the graphs of some functions. We will use these ideas
in a very powerful way to help us study the inverse of functions. Before we
proceed we make the following definitions.
Definition 5.4.3 Consider the function f : D → R where D ⊂ R.
(a) f is said to be increasing on D if for x, y ∈ D such that x < y, then
f (x) ≤ f (y).
(b) f is said to be decreasing on D if for x, y ∈ D such that x < y, then
f (x) ≥ f (y).
(c) f is said to be strictly increasing on D if for x, y ∈ D such that x < y, then
f (x) < f (y).
(d) f is said to be strictly decreasing on D if for x, y ∈ D such that x < y, then
f (x) > f (y).
If the function f is either increasing or decreasing, we say that f is monotone. If f is either strictly increasing or strictly decreasing, then we say that f is strictly
monotone.
We note that f2 (x) = x2 is not monotone on R. We also note that f2
is strictly increasing on [0, ∞) and strictly decreasing on (−∞, 0]. We also
note that f3 (x) = x3 is strictly increasing on R—these all can be seen by
graphing the functions—these all can be proved by using methods similar
to those used in HW 1.3.3-(a). A more complicated function is given by

f7(x) = x − 4 if x < 0, and f7(x) = 2x + 3 if x ≥ 0

—graph it and it should be clear that f7 is increasing on R. If we define

f8(x) = −x + 4 if x < 0, f8(x) = 2 if 0 ≤ x ≤ 1, and f8(x) = −4x if x > 1,

and graph it, it is easy to see that f8 is decreasing but not strictly decreasing.
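The claims about f7 and f8 can be checked on a grid (a numerical illustration of Definition 5.4.3; the sampling is ours).

```python
# f7 is increasing on R (with a jump at 0); f8 is decreasing but not strictly.
def f7(x):
    return x - 4 if x < 0 else 2 * x + 3

def f8(x):
    if x < 0:
        return -x + 4
    return 2 if x <= 1 else -4 * x

xs = [-3 + k / 100 for k in range(601)]                  # grid on [-3, 3]
assert all(f7(x) <= f7(y) for x, y in zip(xs, xs[1:]))   # f7 increasing
assert all(f8(x) >= f8(y) for x, y in zip(xs, xs[1:]))   # f8 decreasing
assert f8(0.25) == f8(0.75)  # constant on [0, 1]: not strictly decreasing
print("f7 increasing, f8 decreasing but not strictly")
```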
We next prove a result we think is surprising in that we get continuity with
a rather strange hypothesis. Read this proof carefully—it is a very technical
proof.
Proposition 5.4.4 Consider f : D → R where D ⊂ R. Assume that f is
monotone on D. If f (D) is an interval, then f is continuous on D.
Proof: Consider the case when f is increasing. The case of f decreasing will
be the same. Let x0 ∈ D and suppose that ǫ > 0 is given. We must find a δ so
that |x − x0 | < δ implies that |f (x) − f (x0 )| < ǫ, i.e.
f (x0 ) − ǫ < f (x) < f (x0 ) + ǫ.
(5.4.1)
Since we know that f is increasing to the right of x0 , we limit δ so that for
x ∈ (x0 , x0 + δ) f cannot grow too much (not more than to f (x0 ) + ǫ). We then
do the same thing to the left of x0 .
Consider the right most part of inequality (5.4.1): f (x) < f (x0 ) + ǫ.
If f (x) ≤ f (x0 ) for all x ∈ D, the desired inequality is satisfied and we can
choose δ1 = 1.
Otherwise, let x∗ ∈ D be such that f (x∗ ) > f (x0 ). Then x0 < x∗ (because
f is increasing) and the interval [f (x0 ), f (x∗ )] is contained in f (D) (f (D) is
assumed to be an interval). Let y ∗∗ = min{f (x0 ) + ǫ/2, f (x∗ )}. Then the
interval [f (x0 ), y ∗∗ ] is contained in f (D). Thus there exists an x∗∗ ∈ D such
that f (x∗∗ ) = y ∗∗ . Then x0 < x < x∗∗ implies that f (x0 ) ≤ f (x) ≤ f (x∗∗ ) =
y ∗∗ < f (x0 ) + ǫ. Let δ2 = x∗∗ − x0 .
Now consider the left most part of inequality (5.4.1): f (x0 ) − ǫ < f (x).
If f(x) ≥ f(x0) for all x ∈ D, the desired inequality is satisfied and we can choose δ3 = 1.
Otherwise, let x∗ be such that f(x∗) < f(x0). Then x∗ < x0 and the interval
[f(x∗), f(x0)] ⊂ f(D). Let y∗∗ = max{f(x0) − ǫ/2, f(x∗)}. Then [y∗∗, f(x0)] ⊂
f(D) and there exists x∗∗ ∈ D such that f(x∗∗) = y∗∗. Then x∗∗ < x < x0 implies that
f(x0) − ǫ < y∗∗ = f(x∗∗) ≤ f(x) ≤ f(x0). Let δ4 = x0 − x∗∗.
Thus we see that if we define δ = min{δ1, δ2, δ3, δ4} and require
that |x − x0| < δ, then |f(x) − f(x0)| < ǫ.
Notice that the functions f7 and f8 considered earlier are both monotone but
are not continuous—neither f7(R) nor f8(R) is an interval. Check it out.
We next state a result that is a bit strange because we already have this result—it
is a combination of Corollary 5.4.2 and Proposition 5.4.4. We do so because we
want to emphasize the result in this form.
Corollary 5.4.5 Consider f : I → R where I ⊂ R is an interval. Assume
that f is monotone on I. Then f is continuous on I if and only if f (I) is an
interval.
There are times when we are given a function and it is very important to
know that the function has an inverse and to be able to determine properties
of that inverse. You have been using inverse functions for a long time (it is a
very common task to be given some y = f(x) and want to solve for x)—sometimes
you might have been aware that you were using an inverse, and other times you
might not have been aware. We begin with the following definition.
Definition 5.4.6 Consider f : D → R where D ⊂ R. The function f is said
to be one-to-one (often written 1-1) if f (x) = f (y) implies that x = y.
In your basic calculus course when you studied one-to-one functions you used
what is called the horizontal line test—that is, draw an arbitrary horizontal line
on the graph of the function, the function is one-to-one if the line intersects
that graph at only one point. It should be clear that this description of the
horizontal line test is equivalent to Definition
5.4.6—though less rigorous.
To prove that the function f1(x) = √x defined on [0, ∞) is one-to-one, we note that f1(x) =
f1(y) is the same as √x = √y. If we then square both sides, we find that
x = y—which is what we must prove. Graph f1 to see how the horizontal line
test works. The function f2(x) = x^2 is surely not one-to-one on R. Again, plot
the function and draw the horizontal line. If we consider f2 on [0, ∞) instead,
then f2 is one-to-one.
A statement that is equivalent to Definition 5.4.6 is as follows: the function
f is said to be one-to-one if for each element y ∈ f(D) there exists one and only
one element x ∈ D such that f(x) = y. The definition of one-to-one allows us
to make the following definition.
Definition 5.4.7 Consider f : D → R where D ⊂ R. Assume that the function
f is one-to-one. We define the function f −1 : f (D) → D by f −1 (y) = x if
f (x) = y. The function f −1 is called the inverse of f . When f −1 exists, f is
said to be invertible.
Note that the definition that f is one-to-one is exactly what is needed to make
f⁻¹ a function, i.e. for each y ∈ f(D) there exists one and only one x ∈ D such
that f⁻¹(y) = x. We also note that by rewriting the statement in Definition
5.4.7 we see that f and f⁻¹ satisfy f⁻¹(f(x)) = x for x ∈ D and f(f⁻¹(y)) = y
for y ∈ f(D).
If we consider f2(x) = x² on [0, ∞), let y = x² and solve for x, we get
x = ±√y. Since x must be greater than or equal to zero, f2⁻¹(y) = √y. Note
that since f2([0, ∞)) = [0, ∞), the domain of f2⁻¹ is also [0, ∞). If we next
consider f3(x) = x³ on R, we note that f3(R) = R and f3⁻¹(y) = ∛y for all
y ∈ R.
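The two inverse-function identities for these examples can be sanity-checked numerically. This is only a sketch: `math.sqrt` plays the role of f2⁻¹, and the `copysign` trick gives a real cube root for negative inputs (in Python, `(-8) ** (1/3)` alone would produce a complex number).

```python
import math

# f2(x) = x^2 on [0, inf) with inverse sqrt; f3(x) = x^3 on R with inverse cbrt.
f2, f2_inv = lambda x: x * x, math.sqrt
f3 = lambda x: x ** 3
f3_inv = lambda y: math.copysign(abs(y) ** (1 / 3), y)  # real cube root on R

for x in [0.0, 0.5, 2.0, 7.25]:
    assert abs(f2_inv(f2(x)) - x) < 1e-12  # f2^{-1}(f2(x)) = x on [0, inf)
    assert abs(f2(f2_inv(x)) - x) < 1e-12  # f2(f2^{-1}(y)) = y

for x in [-8.0, -1.5, 0.0, 2.0]:
    assert abs(f3_inv(f3(x)) - x) < 1e-9   # works for negative x as well
```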
We obtain the following important but very easy result.
Proposition 5.4.8 Consider f : D → R where D ⊂ R. Assume that f is
strictly monotone on D. Then f is one-to-one on D.
Proof: Suppose that f is strictly increasing—the proof for f strictly decreasing
will be the same—and suppose that f is not one-to-one. Then there exist x ≠ y
such that f(x) = f(y). If x ≠ y, then either x < y or y < x—either case is a
contradiction to the fact that f is strictly increasing.
It’s clear that the converse of Proposition 5.4.8 is not true when we consider
a function like f (x) = 1/x defined on R − {0}. However we are able to obtain
the following result.
Proposition 5.4.9 Consider the function f : I → R where I ⊂ R is an interval.
Assume that f is one-to-one and continuous on I. Then f is strictly
monotone on I.
5.4 More Continuity Theorems
125
Proof: We begin by choosing arbitrary a, b ∈ I. For convenience assume that
a < b. Since f is one-to-one, we know that f(a) ≠ f(b)—so either f(a) <
f(b) or f(a) > f(b). Consider the case f(a) < f(b). If f is to be strictly
monotone, in this situation it must be the case that f is strictly increasing on
[a, b]. Assume this is false, i.e. assume that f is not strictly increasing on [a, b],
so there exist x1, x2 ∈ [a, b] such that x1 < x2 and f(x1) ≥ f(x2)—
since f is one-to-one, we would really have f(x1) > f(x2).
We have two cases. For each case it might help to draw a picture of
the situation—give it a try. Case 1: f(a) < f(x1). We then choose c =
max{(f(x1) + f(x2))/2, (f(a) + f(x1))/2}. Since f is continuous on I, f is
continuous on [a, x1]. Also c is between f(a) and f(x1). Therefore by the IVT,
Theorem 5.4.1, we know that there exists y1 ∈ [a, x1] such that f(y1) = c.
Also f is continuous on [x1, x2] and c is between f(x1) and f(x2). Thus
again by the IVT we know that there exists y2 ∈ [x1, x2] such that f(y2) = c.
Since c ≠ f(x1), the points y1 and y2 are distinct, yet f(y1) = f(y2) = c.
This is a contradiction to the fact that f is one-to-one.
Case 2: f(a) > f(x1). In this case f(b) > f(a) > f(x1) > f(x2). Similar
to the last case, we set c = min{(f(x1) + f(x2))/2, (f(x2) + f(b))/2}, apply the
IVT with respect to c on [x1, x2] and [x2, b], and arrive at a contradiction to the
fact that f is one-to-one on I.
Therefore f is strictly increasing on I.
The case when f (a) > f (b) is essentially the same.
Proposition 5.4.10 Suppose that f : D → R and D ⊂ R. (a) If f is strictly
increasing on D, then f −1 : f (D) → D is strictly increasing on f (D).
(b) If f is strictly decreasing, then f −1 : f (D) → D is strictly decreasing on
f (D).
Proof: (a) We assume that f⁻¹ is not strictly increasing, i.e. suppose
that there are u < v in f(D) with f⁻¹(u) ≥ f⁻¹(v). Let x = f⁻¹(u) and
y = f⁻¹(v), so that f(x) = u, f(y) = v, and x ≥ y.
This contradicts the fact that f is strictly increasing: when f is increasing,
x > y implies f(x) > f(y) and x = y implies f(x) = f(y), i.e. x ≥ y
gives f(x) ≥ f(y), which says u ≥ v—contradicting u < v.
(b) The proof when f is strictly decreasing is very similar to that given in part
(a).
The next two results are the ultimate results relating the continuity properties of f −1 to those of f .
Proposition 5.4.11 Suppose that I ⊂ R is an interval and f : I → R is strictly
monotone on I. Then f −1 : f (I) → I is continuous.
Proof: Suppose that f is strictly increasing—the proof of the case for f strictly
decreasing is the same. We know then by Proposition 5.4.10–(a) that f −1 :
f (I) → I is strictly increasing. Then since f −1 (f (I)) = I is an interval (and
f(I) is the domain of f⁻¹), by Corollary 5.4.5 we see that f⁻¹ is continuous
on f(I).
Proposition 5.4.12 Suppose that f : I → R where I ⊂ R. Assume that f is
one-to-one and continuous on I. Then f −1 : f (I) → I is continuous.
Proof: This is an easy combination of Propositions 5.4.9 and 5.4.11. From
Proposition 5.4.9 we see that f is strictly monotone on I. Then by Proposition
5.4.11 we have that f −1 is continuous on f (I).
The result of the two propositions—f⁻¹ is continuous—is the same. The
difference between the two propositions is in the hypotheses. The fact that
we assume f continuous (and one-to-one) in Proposition 5.4.12 is a stronger
hypothesis than assuming strict monotonicity as we do in Proposition 5.4.11.
However, there are times when it is preferable to be able to assume one-to-one
and continuous rather than monotone—and it's not a terrible assumption to assume that f is
continuous. The real point is that we have both results. Whichever we want to
use in the end, we will have.
HW 5.4.1 (True or False and why) (a) There is at least one solution to the
equation x⁴ − 3x³ + 2x² − x − 1 = 0.
(b) Consider the function f : R → R defined by f (x) = sin x. We know that
f (R) = [−1, 1] is an interval. Then f is invertible on [−1, 1].
(c) Suppose f : D → R, D ⊂ R, is one-to-one and continuous on D. Then f is
strictly monotone.
(d) Suppose f : D → R, D ⊂ R, is monotone. Then f is invertible.
(e) Suppose f : D → R, D ⊂ R, is continuous on D. Then f (D) is an interval.
HW 5.4.2 Prove that the equation x⁶ + x⁴ − 3x³ − x + 1 = 0 has at least one
solution. Find an approximation of a solution to the equation.
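A root guaranteed by the IVT can be approximated by repeatedly halving a bracketing interval. Here is a minimal bisection sketch; the function and interval are our own illustrative choices (we approximate √2 as the root of x² − 2 on [1, 2]), not part of the exercise.

```python
# Bisection: approximate a root of f on [a, b] when f(a) and f(b) have
# opposite signs, so the IVT guarantees a root in between.
def bisect(f, a, b, tol=1e-8):
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:
            b = m          # sign change in [a, m]
        else:
            a, fa = m, fm  # sign change in [m, b]
    return (a + b) / 2

root = bisect(lambda x: x * x - 2, 1.0, 2.0)
print(root)  # approximately 1.41421356, the square root of 2
```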
HW 5.4.3 Suppose that the function f : [0, 1] → R is continuous and satisfies
f([0, 1]) ⊂ Q. Prove that f is a constant function.
HW 5.4.4 Define the function
f(x) = 3x − 2 if x < 0, and f(x) = 2x + 1 if x ≥ 0.
(a) Show that the function f is strictly increasing.
(b) Determine f(R).
(c) Show that f⁻¹ exists.
(d) Prove that f⁻¹ is continuous at x = −2.
(e) Determine where f⁻¹ is continuous.
5.5
Uniform Continuity
The set of continuous functions on some domain D is an important set of functions. There is another level of smoothness that we attach to functions that
yields another important class of functions: uniformly continuous functions. As
we shall see, the idea of uniform continuity is truly tied to the set D, whereas
continuity was defined pointwise and a function was then considered continuous on the set D
if it was continuous at each individual point of D. We begin with the definition.
Definition 5.5.1 Consider the function f : D → R where D ⊂ R. f is said
to be uniformly continuous on D if for every ε > 0 there exists a δ > 0 such that
x, y ∈ D and |x − y| < δ implies that |f(x) − f(y)| < ε.
This definition should be observed carefully and contrasted with Definition 5.1.1.
If we consider the function f(x) = x² defined on D = (0, 1), we hope that it is
clear that f is continuous on (0, 1). However, when we proceed to show that it
is continuous at each point in (0, 1), we might begin by considering x0 = 0.1.
Then given ε > 0 we write |x² − (0.1)²| = |x − 0.1||x + 0.1| and realize that this
is one of those applications of the definition of continuity where we must restrict
the values of x and bound the term |x + 0.1|. Suppose we let δ1 = 0.1 (no specific
tie to the fact that x0 = 0.1 except for the fact that both were chosen because
0.1 is a nice small number) and restrict x so that |x − 0.1| < δ1 = 0.1. Then
|x + 0.1| < 3/10, so we set δ0.1 = min{0.1, 10ε/3}, suppose that |x − 0.1| < δ0.1
and continue with our previous calculation to get

|x² − (0.1)²| = |x − 0.1||x + 0.1| <* (3/10)|x − 0.1| <** (3/10)(10ε/3) = ε

where the "<*" inequality is true because δ0.1 ≤ 0.1 and the "<**" inequality
is true because δ0.1 ≤ 10ε/3. Therefore f(x) = x² is continuous at x = 0.1.
If we continue by next choosing x0 = 0.9, we can do some work to note
that we can choose δ0.9 = min{0.1, 10ε/19}, suppose that |x − 0.9| < δ0.9 and
note that

|x² − (0.9)²| = |x − 0.9||x + 0.9| < (19/10)|x − 0.9| < (19/10)(10ε/19) = ε.

Therefore f is continuous at x = 0.9. (Do the necessary calculation.)
To continue with showing that f is continuous on D = (0, 1) we have many
more points to consider. But let us consider the two points we have already
treated. We first admit that we could have bounded our x values differently
(chosen δ1 larger than 0.1) and gotten different δ's—but that wouldn't change
our point. The end result is that we proved continuity at these two points using
radically different δ's. For example, if ε = 0.001, then δ0.1 = (0.001)(10/3) and
δ0.9 = (0.001)(10/19). Not only do we get different δ's, in this case we get radically
different δ's.
When we want to prove that f(x) = x² is uniformly continuous on D = (0, 1),
for a given ε > 0 we must find a δ so that x, y ∈ (0, 1) and |x − y| < δ
implies that |x² − y²| < ε. That means it must work when we choose y = 0.1
and it must work when we choose y = 0.9. It seems as if (we don't really know
for sure) δ0.1 will not work everywhere because it is a lot bigger than δ0.9.
Since δ0.9 < δ0.1, δ0.9 would work at both y = 0.1 and y = 0.9. Is there any
reason to believe that it would work everywhere? If a function f is continuous
on a set D and we are given an ε > 0, it is perfectly permissible for the δ to be
different at every point in D.
Thus the question is: can we choose a δ that will work everywhere (if we
can't, then f is not uniformly continuous on (0, 1)), and how do we do it? If
we return to Figure 4.1.1 and consider what determines the size of δ (in the
case of Figure 4.1.1, the δ1 and the δ2), it should be clear that the steepness
of the graph at and near the point is what determines the quantity needed for
δ—the steeper the curve, the smaller the δ. Hence, we want to choose the point
of D = (0, 1) that requires the smallest δ, i.e. the point at which the graph is
the steepest, construct a continuity proof at that point to determine the δ, and show
that this δ will work for all x, y ∈ D.
Hopefully we know what the graph of f(x) = x² looks like on (0, 1). It should
be clear that there is not a point in (0, 1) at which the graph is the steepest, but
it is also clear that the closer we get to x = 1, the steeper the curve gets. We
consider the point x0 = 1. But this is ridiculous because x0 = 1 ∉ D = (0, 1).
Who cares? If we can determine a δ that works, we will be done. We know that
f(x) = x² could have just as well been defined on all of [0, 1], so a continuity
proof at x0 = 1 will make sense. We again consider |x² − (1)²| = |x − 1||x + 1|.
If we restrict x so that |x − 1| < δ1 = 0.1, i.e. 0.9 < x < 1.1, then for x ∈ [0, 1],
|x + 1| < 2.1. We then set δ = min{0.1, ε/2.1}, suppose that x satisfies x ∈ [0, 1]
and |x − 1| < δ, and continue with our previous calculation to get
|x² − 1| = |x − 1||x + 1| < 2.1|x − 1| < 2.1(ε/2.1) = ε.

Thus, if we considered f(x) = x² defined on [0, 1], we would know that f is
continuous at x0 = 1.
We now have a δ = min{0.1, ε/2.1} that our earlier argument indicates
might work to show that f is uniformly continuous on (0, 1). We suppose that
x, y ∈ D = (0, 1) satisfy |x − y| < δ and consider |f(x) − f(y)| = |x² − y²| =
|x − y||x + y|. Clearly for x, y ∈ (0, 1), |x + y| < 2. Thus we have

|f(x) − f(y)| = |x² − y²| = |x − y||x + y| < 2|x − y| < 2δ ≤ 2ε/2.1 < ε.

Therefore f is uniformly continuous on D = (0, 1).
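The point of uniform continuity—one δ serving every pair of points—can be probed numerically. The sketch below (helper names ours) grids (0, 1) and checks that the single δ = min{0.1, ε/2.1} handles every close pair at once; a finite grid is evidence, not a proof.

```python
# Empirical check: one delta = min(0.1, eps/2.1) should work at EVERY
# pair x, y in (0, 1), not just near one chosen point.
def delta_for(eps):
    return min(0.1, eps / 2.1)

def check(eps, n=200):
    d = delta_for(eps)
    xs = [i / n for i in range(1, n)]  # grid inside (0, 1)
    for x in xs:
        for y in xs:
            if abs(x - y) < d:
                assert abs(x * x - y * y) < eps
    return True

print(check(0.05), check(0.1))  # True True
```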
In summary we see that when we prove that a function is continuous on a set,
the derived δ may be different at each point of the set. To prove uniform continuity we must find a δ that works uniformly throughout the entire domain.
We saw that if a function is going to be uniformly continuous, one way to find
the correct δ is to consider continuity at the steepest point of the graph of the
function. Moreover, in the example considered above we first determined the correct δ
by considering continuity on the larger domain [0, 1]—and then used this information to prove uniform continuity on
(0, 1). In effect, we used the following proposition, which is trivial to prove.
Proposition 5.5.2 Suppose the function f : D → R, D ⊂ R, is uniformly
continuous on D. If D1 ⊂ D, then f is uniformly continuous on D1.
One very important but easy result is the following.
Proposition 5.5.3 Suppose the function f : D → R, D ⊂ R, is uniformly continuous on D. If x0 is any point in D, then f is continuous at x = x0.
It should be pretty clear how we find a function that is not uniformly
continuous—a function that doesn't have a steepest point, whose graph keeps getting steeper and steeper. Consider the function f : D = (0, 1) → R defined by
f(x) = 1/x. It is not difficult to show that f is continuous on D. It's more difficult to show that f is not uniformly
continuous. We suppose that f is uniformly continuous on D = (0, 1). Then
for a given ε > 0 there must exist a δ such that
|x − y| < δ implies |1/x − 1/y| < ε. Consider any δ > 0. Clearly the graph gets steep
near x = 0, so that's where we have to work. Consider the points xn = 1/n and
yn = 1/2n. Then |xn − yn| = 1/2n and |f(xn) − f(yn)| = |n − 2n| = n. Clearly
we can find an n such that 1/2n < δ (and this will hold for all larger n) and such
that n > ε. For this value of n we have |xn − yn| < δ and |1/xn − 1/yn| = n > ε.
Thus f is not uniformly continuous on (0, 1).
There is a result that makes this last proof a bit easier. Since it is an "if and
only if" result, the following proposition provides an alternative definition
of uniform continuity.
Proposition 5.5.4 Suppose the function f : D → R, D ⊂ R. The function f
is uniformly continuous on D if and only if for all sequences {un}, {vn} in D such
that lim_{n→∞}[un − vn] = 0 we have lim_{n→∞}[f(un) − f(vn)] = 0.
Proof: (⇒) Let ε > 0 be given. Since f is uniformly continuous on D, there
exists δ such that x, y ∈ D and |x − y| < δ implies that |f(x) − f(y)| < ε.
Suppose that {un} and {vn} are two sequences in D such that lim_{n→∞}[un − vn] = 0.
Apply the definition of the limit of a sequence to this statement. Let ε1 = δ
(where we write ε1 because we have already used ε). Then there exists an
N ∈ R such that n > N implies that |un − vn| < δ. Then for n > N we have
|un − vn| < δ, so we can apply the definition of uniform continuity given above
to get |f(un) − f(vn)| < ε, and of course this holds for all n > N. Therefore
lim_{n→∞}[f(un) − f(vn)] = 0.
(⇐) Suppose f is not uniformly continuous on D, i.e. suppose that for some
ε0 > 0 and any δ there exist x, y ∈ D such that |x − y| < δ and |f(x) − f(y)| ≥ ε0.
We inductively define two sequences in the following manner.
Set δ = 1. Then there exist x1, y1 ∈ D such that |x1 − y1| < δ and |f(x1) −
f(y1)| ≥ ε0.
Set δ = 1/2. Then there exist x2, y2 ∈ D such that |x2 − y2| < δ and |f(x2) −
f(y2)| ≥ ε0.
In general, set δ = 1/n. Then there exist xn, yn ∈ D such that |xn − yn| < 1/n
and |f(xn) − f(yn)| ≥ ε0 for all n ∈ N.
We then have two sequences {xn}, {yn} such that xn − yn → 0 as n → ∞ and
f(xn) − f(yn) ↛ 0. This is a contradiction to the hypothesis.
We feel that when the above statement is used as the definition, it is a rather
odd definition. However, Proposition 5.5.4 gives us an excellent approach to
show that a function is not uniformly continuous. For the example considered
earlier, f(x) = 1/x defined on D = (0, 1), we define un = 1/n and vn = 1/2n.
Then

lim_{n→∞}[un − vn] = lim_{n→∞}[1/n − 1/2n] = lim_{n→∞} 1/2n = 0

and

lim_{n→∞}[f(un) − f(vn)] = lim_{n→∞}(n − 2n) = −lim_{n→∞} n = −∞ (not zero).

Thus by Proposition 5.5.4 f is not uniformly continuous on D = (0, 1). Note
that this is essentially what we did earlier—but now we have a proposition that
we can easily apply.
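These witness sequences are easy to tabulate. The sketch below uses powers of 2 for n so the reciprocals are exact in floating point; the input gap 1/2n shrinks while the output gap f(un) − f(vn) = −n grows without bound.

```python
# Witness sequences for Proposition 5.5.4 with f(x) = 1/x on (0, 1):
# u_n - v_n -> 0 while f(u_n) - f(v_n) = n - 2n = -n does not -> 0.
f = lambda x: 1 / x
for n in [8, 64, 512, 4096]:
    u, v = 1 / n, 1 / (2 * n)
    print(n, u - v, f(u) - f(v))  # input gap shrinks, output gap grows
```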
We next include a logical and necessary result—we had analogous results for limits and continuity.
Proposition 5.5.5 Suppose that f, g : D → R, D ⊂ R, are uniformly continuous on D. If c1, c2 ∈ R, then c1f + c2g is uniformly continuous on D.
Proof: Let ε > 0 be given. Since f and g are uniformly continuous on D,
for ε1 > 0, ε2 > 0 there exist δ1, δ2 such that x, y ∈ D, |x − y| < δ1 implies
|f(x) − f(y)| < ε1 and x, y ∈ D, |x − y| < δ2 implies |g(x) − g(y)| < ε2. Let
δ = min{δ1, δ2} and assume |x − y| < δ. Then

|(c1f(x) + c2g(x)) − (c1f(y) + c2g(y))| ≤ |c1||f(x) − f(y)| + |c2||g(x) − g(y)| < |c1|ε1 + |c2|ε2.

Then if we choose ε1 = ε/(2|c1|) and ε2 = ε/(2|c2|) (when c1 or c2 is zero the
corresponding term vanishes and no choice is needed), we have |(c1f(x) + c2g(x)) −
(c1f(y) + c2g(y))| < ε/2 + ε/2 = ε, so c1f + c2g is uniformly continuous on D.
Notice that we have not included results for products and quotients of uniformly continuous functions. See HW 5.5.3. We have one more important result
relating continuity and uniform continuity.
Proposition 5.5.6 Suppose that f : [a, b] → R is continuous on [a, b]. Then f
is uniformly continuous on [a, b].
Proof: Suppose ε > 0 is given. For x0 ∈ [a, b], since f is continuous at x0,
there exists a δ_{x0} such that |x − x0| < δ_{x0} implies |f(x) − f(x0)| < ε/2. (We
can find a δ for any ε, so we can find one for ε/2.) This can be done for every
x0 ∈ [a, b], i.e. for each x0 ∈ [a, b] we get a δ_{x0}. This construction produces
an open cover of the set [a, b], {G_{x0}}_{x0∈[a,b]}, where G_{x0} = (x0 − (1/2)δ_{x0}, x0 + (1/2)δ_{x0}).
The sets G_{x0} are clearly open because they're open intervals. We get
[a, b] ⊂ ∪_{x0∈[a,b]} G_{x0} because there is an open interval around each point of [a, b],
i.e. the collection of open sets {G_{x0}}_{x0∈[a,b]} is an open cover of [a, b].

Then by the fact that [a, b] is compact—Proposition 2.3.7 and the definition
of compactness, Definition 2.3.1—there exists a finite subcover of [a, b], i.e. there
exist a finite number of these open intervals G_{x1}, ..., G_{xn} such that [a, b] ⊂
∪_{j=1}^{n} G_{xj}. Remember that these open sets are intervals with radius (1/2)δ_{xj},
j = 1, ..., n. Let δ = (1/2) min{δ_{x1}, ..., δ_{xn}}.

Now consider x, y ∈ [a, b] such that |x − y| < δ. Since {G_{x1}, ..., G_{xn}} covers
[a, b] and x ∈ [a, b], there exists G_{xi0} such that x ∈ G_{xi0}. Then |x − xi0| < (1/2)δ_{xi0} < δ_{xi0}
and |f(x) − f(xi0)| < ε/2. Also

|y − xi0| = |(y − x) + (x − xi0)| ≤* |y − x| + |x − xi0| < δ + (1/2)δ_{xi0} ≤** (1/2)δ_{xi0} + (1/2)δ_{xi0} = δ_{xi0}

where the "≤*" inequality is due to the triangle inequality, Proposition 1.5.8-(v), and the "≤**" inequality is due to the definition of δ = (1/2) min{δ_{x1}, ..., δ_{xn}}.
Thus we have |f(y) − f(xi0)| < ε/2. Then

|f(x) − f(y)| = |(f(x) − f(xi0)) + (f(xi0) − f(y))| ≤* |f(x) − f(xi0)| + |f(xi0) − f(y)| < ε/2 + ε/2 = ε,

where the "≤*" inequality is due to the triangle inequality, Proposition 1.5.8-(v). Therefore f is uniformly continuous on [a, b].
We next state the more general result—the proof of which is exactly the
same as that of Proposition 5.5.6.
Proposition 5.5.7 Suppose that f : K → R, K ⊂ R compact, is continuous
on K. Then f is uniformly continuous on K.
HW 5.5.1 (True or False and why) (a) If f is uniformly continuous on (0, 1),
then f is uniformly continuous on [0, 1].
(b) If f is uniformly continuous on (0, 1) and continuous at points x = 0 and
x = 1, then f is uniformly continuous on [0, 1].
(c) If the domain of f is all of R, then f cannot be uniformly continuous.
(d) If D is the domain of f and f (D) = R, then f cannot be uniformly continuous
on D.
(e) The set D = [0, 1] ∩ Q is not compact. If D is the domain of f , then f
cannot be uniformly continuous on D.
HW 5.5.2 (a) Show that the function f : (0, 1) → R defined by f (x) = 3x2 + 1
is uniformly continuous.
(b) Show that f : (2, ∞) → R defined by f (x) = 1/x2 is uniformly continuous.
(c) Show that the function f : R → R defined by f (x) = x3 is not uniformly
continuous on R.
HW 5.5.3 (a) Suppose f, g : D → R, D ⊂ R, are both uniformly continuous
on D. Show that f g need not be uniformly continuous on D.
(b) Suppose f, g : D → R, D ⊂ R, are both uniformly continuous and bounded
on D. Prove that f g is uniformly continuous on D.
5.6
Rational Exponents
In Example 1.5.1 in Section 1.5 we used the completeness of R to define √2.
We mentioned at that time that the same approach could be used to define
square roots of the rest of the positive reals, i.e. we could define the function
f(x) = √x. It would be possible to proceed in this fashion to define the functions
x^{1/n} for n ∈ N, n ≥ 2. After these definitions we could consider limits of these
functions, continuity of these functions and any other operations that we might
want to apply to functions.

We decided not to proceed in this fashion. We have not used rational exponents (except for our work with √2) until this time. We will now give an
alternative, slick approach to defining rational exponents. We use Proposition
5.4.8 to define the functions x^{1/n} (when they should exist) and Proposition
5.4.11 to show that these functions are continuous. We begin by considering the
function √x.
Example 5.6.1
Consider the function f (x) = x2 on D = [0, ∞). Show that f is invertible
and that f −1 is continuous on [0, ∞).
Solution: We first note that f (D) = [0, ∞) and as we saw in Section 5.4, f is strictly
increasing on D. By Proposition 5.4.8 we know that f is one-to-one, i.e. f is invertible. As
usual denote the inverse of f by f −1 . Of course f −1 : f (D) = [0, ∞) → D = [0, ∞).
In addition since D = [0, ∞) is an interval, by Proposition 5.4.11 we know that f −1 is
continuous on f (D) = [0, ∞).
In addition we note that by Proposition 5.4.10 we know that f⁻¹ is strictly
increasing.
Also, recall that f and f⁻¹ satisfy f⁻¹(f(x)) = x for all x ∈ D = [0, ∞)
and f(f⁻¹(y)) = y for all y ∈ f(D) = [0, ∞). Since f(x) = x², these identities
imply that f⁻¹(x²) = x for x ∈ [0, ∞) and (f⁻¹(y))² = y for all y ∈ [0, ∞).
The last identity suggests that we make the following definition.
Definition 5.6.1 For y ∈ [0, ∞) define √y = y^{1/2} = f⁻¹(y). √y is referred to
as the square root of y.

As you will see, this definition will be usurped by Definition 5.6.2 given below.
We included the definition of √y for emphasis. You should realize that at this
time the only properties we have are √(x²) = x for x ∈ [0, ∞) and (√y)² = y for
all y ∈ [0, ∞)—the two identities associated with the definition of an inverse
function.
We next consider the function f(x) = x^n for n ∈ N defined on D = [0, ∞).
We see that f(D) = [0, ∞). Using induction along with a calculation similar
to that used to show that the function g(x) = x² is strictly increasing, we see
that f is strictly increasing on D (see HW 5.6.2). Again by Proposition 5.4.8 we
know that f is one-to-one, i.e. f is invertible. Denote the inverse of f by f⁻¹.
Then f⁻¹ : f(D) = [0, ∞) → D = [0, ∞) and since D = [0, ∞) is an interval,
by Proposition 5.4.11 we know that f⁻¹ is continuous on f(D) = [0, ∞).
As always f and f⁻¹ must satisfy the identities

f⁻¹(f(x)) = x for all x ∈ D = [0, ∞), or specifically f⁻¹(x^n) = x,   (5.6.1)

and

f(f⁻¹(y)) = y for all y ∈ f(D) = [0, ∞), or specifically (f⁻¹(y))^n = y for all y ∈ [0, ∞).   (5.6.2)
We make the following definition.
Definition 5.6.2 For y ∈ [0, ∞) and n ∈ N define ⁿ√y = y^{1/n} = f⁻¹(y). ⁿ√y
is referred to as the nth root of y.

Hence the nth root of y is defined as the inverse of the function
f(x) = x^n. With this definition and the identities given in (5.6.1) and (5.6.2)
we get the following identities.

(a) (x^n)^{1/n} = x for x ∈ [0, ∞) and (b) (y^{1/n})^n = y for all y ∈ [0, ∞)   (5.6.3)
Now that the nth roots are defined we have to decide what we want to do
with these definitions. To begin with we make the following extensions of the
above definition.

Definition 5.6.3 (a) For n a negative integer we define y^{1/n} = 1/y^{−1/n} for y ∈ (0, ∞).
(b) For r ∈ Q, r = m/n, we define y^r = (y^{1/n})^m for y ∈ [0, ∞).
Now that we have x^r defined we have work to do. We noted in Example
1.6.3 and HW 1.6.2 that for m, n ∈ N and a > 0 we have a^m a^n = a^{m+n} and
(a^m)^n = a^{mn}, respectively. Of course we would like these properties to be true
for rationals also. But before we prove these arithmetic properties we must
prove that x^r is well defined. The problem is that r = m/n and r = mk/nk are equal
rationals. We need to know that x^{m/n} = x^{mk/nk}.
Proposition 5.6.4 x^r is well defined.

Proof: We note that

(x^m)^k = x^{km} =* ((x^{1/kn})^{kn})^{km} =** ((x^{1/kn})^{km})^{kn} =*** (x^{km/kn})^{kn}   (5.6.4)

and

(x^m)^k =* (((x^{1/n})^n)^m)^k =** ((x^{1/n})^m)^{kn} =*** (x^{m/n})^{kn},   (5.6.5)

where in both cases the *-equalities are due to (5.6.3)-(b), the **-equalities are
due to integer algebra and the ***-equalities are due to Definition 5.6.3-(b)—
the definition of y^r. Thus we have (x^{km/kn})^{kn} = (x^{m/n})^{kn}. Then because
h(u) = u^{kn} is one-to-one on [0, ∞), we get x^{km/kn} = x^{m/n}, so x^r is well defined.
We should note that in the last step, where we used the fact that h is one-to-one, we could equally well have said that we were taking the kn-th root of both sides
of the equality—but you might recall that the kn-th root exists precisely
because h is one-to-one.
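A floating-point spot check of well-definedness (not a proof—roundoff is exactly why we compare with a tolerance rather than for equality): computing x^{2/3} and x^{4/6} through Definition 5.6.3-(b), i.e. via integer powers of an nth root, should agree.

```python
# Evaluate x^{m/n} as (x^{1/n})^m, following Definition 5.6.3-(b).
def rational_power(x, m, n):
    return (x ** (1 / n)) ** m

x = 5.0
a = rational_power(x, 2, 3)  # x^{2/3}
b = rational_power(x, 4, 6)  # x^{4/6}, an equal rational exponent
print(abs(a - b) < 1e-12)  # True
```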
Now that we know that x^r is well defined it's time to start developing the
necessary arithmetic properties. We begin with the fractional parts of our arithmetic properties.
Proposition 5.6.5 Suppose m, n ∈ N. Then we get the following results.
(a) (x^m)^{1/n} = (x^{1/n})^m = x^{m/n}
(b) x^{1/n} x^{1/m} = x^{1/n + 1/m}
(c) (x^{1/n})^{1/m} = x^{1/mn}

Proof: (a) We note that by (5.6.3)-(b), x^m = ((x^m)^{1/n})^n, and by (5.6.3)-(b)
and integer algebra x^m = ((x^{1/n})^n)^m = ((x^{1/n})^m)^n. Then since h(u) = u^n is
one-to-one, we get (x^m)^{1/n} = (x^{1/n})^m. And this last expression is the definition
of x^{m/n}.

(b) We note that

x^{m+n} = ((x^{1/mn})^{mn})^{m+n} = ((x^{1/mn})^{m+n})^{mn} = (x^{(m+n)/mn})^{mn} = (x^{1/n + 1/m})^{mn}

and

x^{m+n} = x^m x^n = ((x^{1/n})^n)^m ((x^{1/m})^m)^n = (x^{1/n})^{nm} (x^{1/m})^{mn} = (x^{1/n} x^{1/m})^{nm}

—where again the steps are due to (5.6.3)-(b), integer algebra and Definition
5.6.3-(b). Thus we have (x^{1/n + 1/m})^{mn} = (x^{1/n} x^{1/m})^{nm}. Since h(u) = u^{nm} is
one-to-one, we get x^{1/n} x^{1/m} = x^{1/n + 1/m}.

(c) Since x = (x^{1/nm})^{nm}, x = (x^{1/n})^n = (((x^{1/n})^{1/m})^m)^n = ((x^{1/n})^{1/m})^{mn},
and h(u) = u^{mn} is one-to-one, we see that (x^{1/n})^{1/m} = x^{1/mn}—again the reasons for the
steps are the f–f⁻¹ identity (5.6.3)-(b) and integer algebra.
We now proceed to derive the final results for rational exponents. By now
we will stop giving the reasons for each of the steps—if you’ve read the proofs
of Propositions 5.6.4 and 5.6.5, you know the reasons.
Proposition 5.6.6 Suppose that r = m/n, s = p/q ∈ Q where m, n, p, q ∈ N.
Then we have the following.
(a) x^r x^s = x^{r+s}
(b) (x^r)^s = x^{rs}
Proof: (a) We see that

(x^{r+s})^{nq} = (x^{(mq+np)/nq})^{nq} = ((x^{1/nq})^{mq+np})^{nq} = x^{mq+np} = x^{mq} x^{np} = ((x^{1/n})^m)^{nq} ((x^{1/q})^p)^{nq} = (x^r x^s)^{nq}.   (5.6.6)

Then since h(u) = u^{nq} is one-to-one, we have x^{r+s} = x^r x^s.

(b) Since

x^{mp} = ((x^{1/nq})^{nq})^{mp} = ((x^{1/nq})^{mp})^{nq} = (x^{mp/nq})^{nq} = (x^{rs})^{nq}

and

x^{mp} = (x^m)^p = (((x^{1/n})^n)^m)^p = ((x^{1/n})^m)^{np} = (x^r)^{np} = (((x^r)^{1/q})^q)^{np} = (((x^r)^{1/q})^p)^{nq} = ((x^r)^s)^{nq},   (5.6.7)

and h(u) = u^{nq} is one-to-one, we get x^{rs} = (x^r)^s.
Thus we now have the arithmetic properties for rational exponents that we
have all known for a long time. The proofs given above are a bit gross but we
hope that you realize that you now have a rigorous treatment of these definitions
and properties.
We notice that the above definitions and analysis are all done for x ∈ [0, ∞).
This was necessary because x^n is not one-to-one on R for n even—therefore it's
not invertible. It should be clear that it's possible to define y^{1/n} on R for n
odd. For n odd we could repeat the construction given earlier for f(x) = x^n
and arrive at the definition of y^{1/n} as y^{1/n} = f⁻¹(y)—which is good because
we have all taken the cube root of −27 sometime in our careers and got −3.
You can do most of the arithmetic that we developed for roots, etc. defined on
[0, ∞). However you do have to be careful. For example we know that 1/3 and
2/6 are two representations of the same rational number. But (−27)^{1/3} = −3,
((−27)²)^{1/6} = 3 and (−27)^{1/6} is not defined. This is not good, i.e. you must
be careful when you start taking roots of negative numbers.
And finally we remember that we are in a chapter entitled Continuity. This
has been a very nice application of some of our continuity results but we will
now return to continuity. We obtain the following result.
Proposition 5.6.7 Suppose that r ∈ Q and define f : [0, ∞) → R by f(x) = x^r.
Then f is continuous on [0, ∞).

Proof: We write r as r = m/n, and define g(x) = x^{1/n} and h(x) = x^m. Then
f(x) = x^{m/n} = (x^{1/n})^m = h∘g(x). We know that h is continuous everywhere (it
is an easy polynomial). We found earlier by Proposition 5.4.11 that g(y) = y^{1/n}
is continuous on [0, ∞) because g = F⁻¹ where F(x) = x^n. Then by Proposition
5.3.5 we see that f is continuous on [0, ∞).
HW 5.6.1 (True or False and why) (a) If we consider f(x) = x³ defined on R,
then f⁻¹ is defined and continuous on R.
HW 5.6.2 (a) Prove that f(x) = x³ for x ∈ R is strictly increasing.
(b) Prove that the function f(x) = x^n, x ∈ [0, ∞), n ∈ N, is strictly increasing.
Chapter 6
Differentiation
6.1
An Introduction to Differentiation
In your first course in calculus you learned about the derivative and a variety of
applications of differentiation. You found the slopes of tangent lines to curves,
velocities and accelerations of particles, maxima and minima, an assortment of
different rates of change and more. The importance of the concept of
a derivative should be clear. We begin with the definition.
Definition 6.1.1 Suppose that the function f : [a, b] → R. If x0 ∈ [a, b], then
f is said to be differentiable at x = x0 if lim_{x→x0} (f(x) − f(x0))/(x − x0) exists. The limit
is the derivative of f at x0 and is denoted by f′(x0). If E ⊂ [a, b] and f is
differentiable at each point of E, then f is said to be differentiable on E. The
function f′ : E → R defined to be the derivative at each point of E is called the
derivative function. A common notation for the derivative function is to write
the function as y = f(x) and denote the derivative of f by dy/dx—at a particular
point x0 write either (dy/dx)(x0) or dy/dx|_{x=x0}. We also denote f′(x) by (d/dx)f(x).
There is an important alternative form of the limit given in the definition above.
It should be clear that if we replace the x in the limit lim_{x→x0} (f(x) − f(x0))/(x − x0) by
x0 + h, then x → x0 is the same as h → 0. Thus an alternative definition
of the derivative is given by lim_{h→0} (f(x0 + h) − f(x0))/h. There are times when this
particular limit is preferable to the limit given in Definition 6.1.1 above.
In the above definition the derivative is defined at x = a and x = b, and
the derivatives at these points will in reality be right and left hand derivatives,
respectively. We can also define right and left hand derivatives at interior points
of [a, b] by using right and left hand limits, i.e. the right hand derivative of f at
x = x0 ∈ (a, b) is defined by f′(x0+) = lim_{x→x0+} (f(x) − f(x0))/(x − x0), and the left hand
derivative of f at x = x0 ∈ (a, b) is defined by f′(x0−) = lim_{x→x0−} (f(x) − f(x0))/(x − x0).
We will not do much with one sided derivatives. Generally the results that you
need for one sided derivatives are not difficult.

Since hopefully we are good at taking limits, it is not difficult to apply
Definition 6.1.1. In Example 4.2.4 we showed that lim_{x→4} (x³ − 64)/(x − 4) = 48, i.e. if
f(x) = x³ we showed that f′(4) = 48. We can just as easily show that

f′(x0) = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (x³ − x0³)/(x − x0)
= lim_{x→x0} (x − x0)(x² + xx0 + x0²)/(x − x0) = lim_{x→x0} (x² + xx0 + x0²) = 3x0².
We next include an extremely nice result that is necessary for us to be able to proceed.
Proposition 6.1.2 Consider f : [a, b] → R and x0 ∈ [a, b]. If f is differentiable
at x = x0 , then f is continuous at x = x0 .
Proof: Note that f(x) = ((f(x) − f(x0))/(x − x0))(x − x0) + f(x0). Then we see that

lim_{x→x0} f(x) = lim_{x→x0} [ ((f(x) − f(x0))/(x − x0))(x − x0) + f(x0) ]
  = lim_{x→x0} (f(x) − f(x0))/(x − x0) · lim_{x→x0} (x − x0) + lim_{x→x0} f(x0)
  = f′(x0) · 0 + f(x0) = f(x0)

where we can apply the appropriate limit theorems because all of the individual limits exist. Since x0 ∈ [a, b], x0 is a limit point of [a, b]. Therefore by Proposition 5.1.2 f is continuous at x = x0.
The above result shows that there is a hierarchy of properties of functions. Continuous functions may be nice but differentiable functions are nicer. It is easy to see by considering the absolute value function at the origin—which we will do soon—that the converse of this result is surely not true.
In your basic calculus course the very important tools that you used constantly to compute derivatives were "the derivative of the sum is the sum of the derivatives, the derivative of a constant times a function is the constant times the derivative, the product rule and the quotient rule." We now include these results.
Proposition 6.1.3 Suppose that f, g : [a, b] → R, x0 ∈ [a, b], c ∈ R, and
f ′ (x0 ) and g ′ (x0 ) exist. Then we have the following results.
(a) (cf )′ (x0 ) = cf ′ (x0 )
(b) (f + g)′ (x0 ) = f ′ (x0 ) + g ′ (x0 )
(c) (f g)′ (x0 ) = f ′ (x0 )g(x0 ) + f (x0 )g ′ (x0 )
(d) If g(x0) ≠ 0, then (f/g)′(x0) = (f′(x0)g(x0) − f(x0)g′(x0)) / [g(x0)]².
Proof: (a) & (b) The proofs of (a) and (b) are direct applications of Proposition 4.3.1 parts (b) and (a).
(c) We note that

((fg)(x) − (fg)(x0))/(x − x0) = (f(x)g(x) − f(x0)g(x0))/(x − x0)
  = f(x) · (g(x) − g(x0))/(x − x0) + g(x0) · (f(x) − f(x0))/(x − x0).

(We added and subtracted terms to go from step 2 to step 3—if you simplify the last expression, you will see that it is the same as step 2.) Then

lim_{x→x0} ((fg)(x) − (fg)(x0))/(x − x0) = lim_{x→x0} [ f(x) · (g(x) − g(x0))/(x − x0) + g(x0) · (f(x) − f(x0))/(x − x0) ]
  = lim_{x→x0} f(x) · lim_{x→x0} (g(x) − g(x0))/(x − x0) + g(x0) · lim_{x→x0} (f(x) − f(x0))/(x − x0)   (6.1.1)
    by Proposition 4.3.1-(a), (b) & (d)
  = f(x0)g′(x0) + g(x0)f′(x0).   (6.1.2)

(To allow us to take the limits that get us from (6.1.1) to (6.1.2) we use the fact that if f is differentiable at x0, then f is continuous at x0, Proposition 6.1.2, and of course Definition 6.1.1.)

Therefore we get the product rule, (fg)′(x0) = f(x0)g′(x0) + f′(x0)g(x0).
(d) We attack the quotient rule in a similar way. We note that

((f/g)(x) − (f/g)(x0))/(x − x0) = ( f(x)/g(x) − f(x0)/g(x0) )/(x − x0)
  = (f(x)g(x0) − g(x)f(x0)) / (g(x)g(x0)(x − x0))
  = (1/(g(x)g(x0))) [ g(x0) · (f(x) − f(x0))/(x − x0) − f(x0) · (g(x) − g(x0))/(x − x0) ].

(To get from the second term to the last term we have added and subtracted things again. You can simplify the last expression to see that it is equal to the second to the last expression.) Then

lim_{x→x0} ((f/g)(x) − (f/g)(x0))/(x − x0)
  = lim_{x→x0} (1/(g(x)g(x0))) [ g(x0) · (f(x) − f(x0))/(x − x0) − f(x0) · (g(x) − g(x0))/(x − x0) ]   (6.1.3)
  = (1/[g(x0)]²) [g(x0)f′(x0) − f(x0)g′(x0)].   (6.1.4)

(Note that to get to (6.1.4) from (6.1.3) we have used parts (a), (b), (d) and (f) of Proposition 4.3.1 along with Definition 6.1.1. Again it is very important that by Proposition 6.1.2 since g is differentiable at x0, then g is continuous at x0—and nonzero—so that we can take the limit in the denominator.)

Thus we have the quotient rule, (f/g)′(x0) = (g(x0)f′(x0) − f(x0)g′(x0)) / [g(x0)]².
One of the very basic and useful theorems that you learned and used often in your Calc I course was the Chain Rule. We state the following result.
Proposition 6.1.4 Consider the functions f : [a, b] → R, g : [c, d] → R where
f ([a, b]) ⊂ [c, d] and x0 ∈ [a, b]. Suppose that f is differentiable at x = x0 ∈ [a, b]
and g is differentiable at y = f (x0 ) ∈ [c, d]. Then g ◦f is differentiable at x = x0
and (g ◦ f )′ (x0 ) = g ′ (f (x0 ))f ′ (x0 ).
Proof: You should realize that this is a difficult result to prove. The proof given here doesn't look difficult, but it's tricky. Read it carefully—otherwise before you know what we're doing, we'll be done.
Define h : [c, d] → R by

h(y) = (g(y) − g(f(x0)))/(y − f(x0))  if y ≠ f(x0),
h(y) = g′(f(x0))                      if y = f(x0).

Since g is differentiable at y = f(x0), h is continuous at y = f(x0)—clearly

lim_{y→f(x0)} h(y) = lim_{y→f(x0)} (g(y) − g(f(x0)))/(y − f(x0)) = g′(f(x0)) = h(f(x0)).
Note that g(y) − g(f (x0 )) = h(y)(y − f (x0 )) for all y ∈ [c, d]. We let y = f (x)
and get g(f (x)) − g(f (x0 )) = h(f (x))(f (x) − f (x0 )).
Thus

(g ◦ f)′(x0) = lim_{x→x0} (g ◦ f(x) − g ◦ f(x0))/(x − x0) = lim_{x→x0} h(f(x)) · (f(x) − f(x0))/(x − x0).   (6.1.5)

Since f is differentiable at x = x0, (f(x) − f(x0))/(x − x0) → f′(x0). Also, since f is differentiable at x = x0, then f is continuous at x = x0. And finally, since f is continuous at x = x0 and h is continuous at y = f(x0), then h ◦ f is continuous at x = x0. Returning to (6.1.5) we get

(g ◦ f)′(x0) = lim_{x→x0} h(f(x)) · (f(x) − f(x0))/(x − x0) = h(f(x0))f′(x0),

or (g ◦ f)′(x0) = g′(f(x0))f′(x0).
Often in texts of a variety of levels the justification of the chain rule is given approximately as follows. We note that

(g(f(x)) − g(f(x0)))/(x − x0) = (g(f(x)) − g(f(x0)))/(f(x) − f(x0)) · (f(x) − f(x0))/(x − x0)
  = (g(y) − g(f(x0)))/(y − f(x0)) · (f(x) − f(x0))/(x − x0),   (6.1.6)

where we have set y = f(x). The argument made is as x → x0, y → f(x0) so (6.1.6) implies (g ◦ f)′(x0) = g′(f(x0))f′(x0). Most often if you read the texts
carefully, they do not claim that it’s a proof. But you have to read it carefully.
The difference is between the statements

lim_{y→f(x0)} (g(y) − g(f(x0)))/(y − f(x0))   (6.1.7)

and

lim_{x→x0} (g(f(x)) − g(f(x0)))/(f(x) − f(x0)).   (6.1.8)

Expression (6.1.8) is what we really have and we replaced it by (6.1.7). They are not the same. Clearly the limit in (6.1.7) is g′(f(x0)). The problem with (6.1.8) is that the function f may be such that f(x) − f(x0) has zeros in every neighborhood of x = x0. In that case it should be clear that for any given ε we cannot find a δ such that 0 < |x − x0| < δ implies |(g(f(x)) − g(f(x0)))/(f(x) − f(x0)) − L| < ε—for any L (including L = g′(f(x0)))—because no matter which δ is chosen, we get zeros in the denominator. Thus our proof given above dances around this difficulty.
The "non-proof" given in the last paragraph is useful if given honestly. It is a good indication that the Chain Rule is true. If you add the hypothesis that "for some δ1 the function f satisfies f(x) ≠ f(x0) when 0 < |x − x0| < δ1," then it's a proof. And lastly, the type of function that could cause the problems described in the last paragraph is the function f3 defined in Example 6.2.4—so as you will see, it has to get fairly ugly.
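The trouble described above can be made concrete in a few lines of Python: for f3(x) = x² sin(1/x) with f3(0) = 0 (the function of Example 6.2.4), f3 vanishes at x = 1/(kπ) for every nonzero integer k, so every neighborhood of 0 contains points where the denominator in (6.1.8) is zero. The sketch below simply exhibits those zeros (the sample points are illustrative):

```python
import math

def f3(x):
    # f3(x) = x^2 sin(1/x) for x != 0, f3(0) = 0
    return x**2 * math.sin(1 / x) if x != 0 else 0.0

# The points x_k = 1/(k*pi) march into 0, and f3(x_k) = 0 at each of them
# (numerically, up to floating-point noise).
xs = [1 / (k * math.pi) for k in range(1, 6)]
print([f3(x) for x in xs])
```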
HW 6.1.1 (True or False and why) (a) Suppose f : [a, b] → R, x0 ∈ [a, b], is such that f² is differentiable at x = x0. Then f is differentiable at x = x0.
(b) Suppose f : [a, b] → R, x0 ∈ [a, b], is such that f is differentiable at x = x0 .
Then f 2 is differentiable at x = x0 .
(c) Suppose f : [a, b] → R, x0 ∈ [a, b], is such that f is continuous at x = x0 .
Then f is differentiable at x = x0 .
(d) Suppose f, g : [a, b] → R, x0 ∈ [a, b], are such that f + g is differentiable at
x = x0 . Then f and g are differentiable at x = x0 .
(e) Suppose f, g : [a, b] → R, x0 ∈ [a, b], are such that f + g and f are differentiable at x = x0 . Then g is differentiable at x = x0 .
HW 6.1.2 Suppose that f1 , · · · , fn : [a, b] → R, x0 ∈ [a, b], are all differentiable
at x = x0 . Then prove that f1 + · · · + fn is differentiable at x = x0 .
HW 6.1.3 Suppose f : [a, b] → R, g : [c, d] → R, h : [e1 , e2 ] → R are such
that f ([a, b]) ⊂ [c, d], g([c, d]) ⊂ [e1 , e2 ], f is differentiable at x = x0 , g is
differentiable at y = f (x0 ) and h is differentiable at z = g ◦ f (x0 ). Prove that
(h ◦ g ◦ f )′ (x0 ) = h′ (g ◦ f (x0 ))g ′ (f (x0 ))f ′ (x0 ).
6.2
Computation of Some Derivatives
Before we can proceed we must compute some derivatives. Definition 6.1.1,
Proposition 6.1.3 and Proposition 6.1.4 give us tools that allow us to compute
some derivatives and reduce a problem involving a difficult expression to several
easier problems—that’s how we used these results in our basic course. We begin
with the derivatives of a few of the basic functions.
Example 6.2.1 Show that
(a) d/dx c = 0 where c ∈ R.
(b) d/dx x = 1.
(c) d/dx xⁿ = n x^{n−1} for n ∈ Z.
Solution: (a) We note that

d/dx c = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (c − c)/(x − x0) = 0.

(b) We see that

d/dx x = lim_{x→x0} (f(x) − f(x0))/(x − x0) = lim_{x→x0} (x − x0)/(x − x0) = 1.

Remember that we can divide out the x − x0 terms because of the "0 <" part of the definition of a limit.
(c) For n = 0 the statement is true by part (a). We next prove the formula for n ∈ N. We prove this statement by mathematical induction, i.e. d/dx xⁿ = n x^{n−1} for n ∈ N.
Step 1: Show true for n = 1: The statement is true for n = 1 by part (b) of this example, i.e. d/dx x¹ = 1 · x⁰ = 1.
Step 2: Assume true for n = k, i.e. assume that d/dx x^k = k x^{k−1}.
Step 3: Prove true for n = k + 1, i.e. prove that d/dx x^{k+1} = (k + 1)x^k. We note that

d/dx x^{k+1} = d/dx (x · x^k) =* x · d/dx x^k + x^k · d/dx x = x · (k x^{k−1}) + x^k · 1 = k x^k + x^k = (k + 1)x^k

where step "=*" is due to Proposition 6.1.3-(c)—the product rule.
By mathematical induction the statement is true for all n ∈ N, i.e. d/dx xⁿ = n x^{n−1}.
And finally we consider n ∈ Z, n < 0. Then we have

d/dx xⁿ = d/dx (1/x^{−n})   where now we should note that −n > 0
  = (0 · x^{−n} − 1 · (−n)x^{−n−1}) / [x^{−n}]²   by Proposition 6.1.3-(d)—the quotient rule
  = n x^{−n−1+2n} = n x^{n−1}.

Thus for all n ∈ Z we have d/dx xⁿ = n x^{n−1}.
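A quick numerical spot-check of the formula, for positive, zero and negative integer exponents, using a central difference quotient (the sample point and exponents are illustrative choices, not from the text):

```python
def num_deriv(f, x0, h=1e-6):
    # Central difference approximation to f'(x0)
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0 = 1.5
for n in [-3, -1, 0, 2, 5]:
    approx = num_deriv(lambda x: x**n, x0)
    exact = n * x0**(n - 1)
    print(n, approx, exact)  # the two columns agree closely
```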
Note that a common approach to Example 6.2.1-(c) for n > 0 is to apply the definition and note that

lim_{x→x0} (xⁿ − x0ⁿ)/(x − x0) = lim_{x→x0} (x − x0)(x^{n−1} + x^{n−2}x0 + · · · + x x0^{n−2} + x0^{n−1})/(x − x0)   (6.2.1)
  = lim_{x→x0} (x^{n−1} + x^{n−2}x0 + · · · + x x0^{n−2} + x0^{n−1}) = n x0^{n−1}.

We must realize that this proof has an "obvious" mathematical induction proof hidden in the middle—the · · ·.
If we apply parts (a), (b) and (c) of Proposition 6.1.3 along with Example
6.2.1, we see that any polynomial is differentiable and (a0 xm + a1 xm−1 + · · · +
am−1 x+am )′ = ma0 xm−1 +(m−1)a1 xm−2 +· · ·+am−1 . Likewise, if in addition
we apply part (d) of Proposition 6.1.3 we find that any rational function is differentiable at all points where the denominator is not zero and

(p(x)/q(x))′ = (p′(x)q(x) − p(x)q′(x)) / [q(x)]²

where p and q are polynomials.
We now include several more examples where we compute the derivative of
a function or show that the function is not differentiable.
Example 6.2.2 Show that for x ∈ (0, ∞), d/dx √x = 1/(2√x).

Solution: We apply the definition and note that

d/dx √x = lim_{x→x0} (√x − √x0)/(x − x0) = lim_{x→x0} (√x − √x0)/(x − x0) · (√x + √x0)/(√x + √x0)
  = lim_{x→x0} (x − x0)/((x − x0)(√x + √x0)) = 1/(2√x0).

Again we get to divide out the x − x0 term because in the definition of a limit, we only consider x − x0 ≠ 0.
We note that since √x is not defined for x < 0, we know that we cannot worry about the derivative there. If we consider x0 = 0, we see that

lim_{x→0+} (√x − 0)/(x − 0) = lim_{x→0+} 1/√x = ∞.

(We have used a one-sided limit to emphasize the fact that we cannot consider x < 0. Also, we don't really know that this limit is ∞ (even though we hope we do know that). Hopefully we could use the methods in Section 4.4 to prove that this limit is ∞.) Since this limit does not exist in R, the derivative of √x does not exist at x = 0. However, this computation is useful when we use the derivative to give the slope of the tangent to the curve. The above computation shows that at x = 0 the tangent line is vertical—that's surely better information than just telling us that there is no tangent at that point.
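The vertical tangent is visible numerically: the one-sided difference quotient at 0 is (√h − 0)/h = 1/√h, which blows up as h → 0+. A tiny Python check (the step sizes are arbitrary):

```python
import math

# (sqrt(h) - sqrt(0)) / (h - 0) = 1/sqrt(h) grows without bound as h -> 0+
for h in [1e-2, 1e-4, 1e-6]:
    print(h, (math.sqrt(h) - 0.0) / h)
```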
If we think about the approach used for the above example, we can use the analogous approach to show that d/dx x^{1/3} = 1/(3 x^{2/3}) for x ∈ R, x ≠ 0, that d/dx x^{1/4} = 1/(4 x^{3/4}) for x ∈ (0, ∞), etc.
We next include an example with an interesting limit and the very important application of that limit: the derivatives of the trig functions.
Example 6.2.3 (a) Prove that lim_{θ→0} sin θ/θ = 1.
(b) Prove that lim_{θ→0} (1 − cos θ)/θ = 0.
(c) Show that d/dx sin x = cos x.

Solution: (a) We notice that given that angle ∠POA is θ, then |OB| = cos θ, |BP| = sin θ and |AQ| = tan θ. We also note that the area of triangle △OAP is (1/2) sin θ, the area of the sector OAP is (1/2)θ and the area of triangle △OAQ is (1/2) tan θ (remember that |OA| = 1). Also, the area of triangle △OAP is less than the area of sector OAP, which is less than the area of triangle △OAQ, i.e. we have that

(1/2) sin θ < (1/2)θ < (1/2) tan θ  or  1 < θ/sin θ < 1/cos θ.

Inverting these inequalities very carefully gives us that cos θ < sin θ/θ < 1. Since lim_{θ→0} cos θ = 1 (Example 5.2.3-(c)) and lim_{θ→0} 1 = 1, we can apply the Sandwich Theorem, Proposition 4.3.4, to see that lim_{θ→0} sin θ/θ = 1.
Figure 6.2.1: Figure used to prove part (a): the unit circle with O the origin, A = (1, 0), B the foot of P so that |OB| = cos θ and |BP| = sin θ, and Q above A with |AQ| = tan θ.
(b) Given part (a), part (b) is easy. We see that

(1 − cos θ)/θ = (1 − cos θ)/θ · (1 + cos θ)/(1 + cos θ) = (1 − cos² θ)/(θ(1 + cos θ)) = sin² θ/(θ(1 + cos θ)) = (sin θ/θ) · (sin θ/(1 + cos θ)).

Then from part (a) and the fact that lim_{θ→0} sin θ/(1 + cos θ) = 0 (we know both sin and cos are continuous at 0), we apply Proposition 4.3.1 to obtain the desired result.
(c) In order to apply the definition of the derivative to f(x) = sin x we note that

(f(x + h) − f(x))/h = (sin(x + h) − sin x)/h = (sin x cos h + sin h cos x − sin x)/h
  = sin x · (cos h − 1)/h + cos x · (sin h)/h.   (6.2.2)

Then using parts (a) and (b) we get

f′(x) = lim_{h→0} (f(x + h) − f(x))/h = lim_{h→0} [ sin x · (cos h − 1)/h + cos x · (sin h)/h ] = cos x.

Of course the derivatives of the rest of the trig functions follow from the derivative of the sine function.
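The two limits in parts (a) and (b) are easy to watch numerically; the sample values below are arbitrary choices, not from the text:

```python
import math

# sin(t)/t -> 1 and (1 - cos t)/t -> 0 as t -> 0
for t in [0.1, 0.01, 0.001]:
    print(t, math.sin(t) / t, (1 - math.cos(t)) / t)
```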
In Example 4.2.7 we showed that the limit of the function

f1(x) = sin(1/x)  if x ≠ 0,
f1(x) = 0         if x = 0

does not exist at x = 0—hence we know that f1 is not continuous at x = 0—so it's surely not differentiable at x = 0. In Example 5.2.4 we showed that the function

f2(x) = x sin(1/x)  if x ≠ 0,
f2(x) = 0           if x = 0

is continuous at x = 0—thus it's at least a candidate for differentiability. We next include the following example.
Example 6.2.4 (a) Show that f2 is not differentiable at x = 0.
(b) Show that the function f3 : R → R defined by

f3(x) = x² sin(1/x)  if x ≠ 0,
f3(x) = 0            if x = 0

is differentiable at x = 0.

Solution: (a) We note that

lim_{x→0} (f2(x) − f2(0))/(x − 0) = lim_{x→0} sin(1/x).

We know that this last limit does not exist—by the same approach that we used to show that f1 was not continuous at x = 0. Therefore the derivative of f2 does not exist at x = 0.
(b) We start the same way that we started with part (a) and note that

lim_{x→0} (f3(x) − f3(0))/(x − 0) = lim_{x→0} x sin(1/x).

We showed in Example 5.2.4 that this last limit exists and equals zero. Therefore f3 is differentiable at x = 0 and f3′(0) = 0.
You should view the functions f1, f2, and f3 as a series of functions, admittedly not especially nice functions, that have obvious similarities. The function f2 is smoothed enough so that it is continuous at x = 0 (where f1 was not) but not differentiable. The function f3 is smoothed more—it is differentiable at x = 0 and hence also continuous there. In addition, we should realize that all of the functions f1, f2 and f3 are differentiable when x ≠ 0—if we knew how to differentiate the sine function.
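The difference between f2 and f3 at 0 shows up clearly if we evaluate their difference quotients at 0 along a sequence tending to 0. A short Python sketch (the sample points x_k = 1/((k + 1/2)π) are chosen to make sin(1/x) hit ±1; they are illustrative, not from the text):

```python
import math

def q2(x):  # (f2(x) - f2(0)) / (x - 0) for f2(x) = x sin(1/x)
    return math.sin(1 / x)

def q3(x):  # (f3(x) - f3(0)) / (x - 0) for f3(x) = x^2 sin(1/x)
    return x * math.sin(1 / x)

xs = [1 / ((k + 0.5) * math.pi) for k in range(1, 7)]
print([round(q2(x), 3) for x in xs])  # keeps oscillating between -1 and 1
print([round(q3(x), 3) for x in xs])  # squeezed toward 0
```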
HW 6.2.1 Consider the function f (x) = x3 − 2x2 + x − 1 defined on R. (a)
Use Definition 6.1.1 to compute f ′ (2).
(b) Compute f ′ (x).
HW 6.2.2 Compute d/dx x^{1/3}.
HW 6.2.3 Compute lim_{θ→0} sin 3θ / sin 5θ.
HW 6.2.4 Consider the function

f(x) = x²   if x ∈ [−1, 1] ∩ Q,
f(x) = −x²  if x ∈ [−1, 1] ∩ I.

(a) Is f differentiable at x ∈ [−1, 1], x ≠ 0? If so, compute f′(x).
(b) Is f differentiable at x = 0? If so, compute f′(0).
HW 6.2.5 We saw in Example 6.2.4 that f3 is differentiable at x = 0. (a)
Show that f3 is differentiable for x 6= 0.
(b) Determine where f3′ is continuous.
6.3
Some Differentiation Theorems
Now that we have some of the basic properties of the concept of the derivative, it is time to develop some additional applications of differentiation. There are many very important applications of differentiation and you have surely seen some of these in your basic course.
We begin with a very important result—but one we want now more as a
lemma. Recall that in Section 5.3 we defined maximums and minimums as the
maximum and minimum in some neighborhood about a point—Definition 5.3.6
(a) and (c). We begin with the very important result.
Proposition 6.3.1 Suppose the function f is such that f : (a, b) → R for some
a, b ∈ R, a < b and suppose that f is differentiable at x0 ∈ (a, b). If f has a
maximum or minimum at the point (x0 , f (x0 )), then f ′ (x0 ) = 0.
Proof: Consider the case where (x0, f(x0)) is a maximum. Let N ⊂ (a, b) be a neighborhood of x0 so that f(x) ≤ f(x0) for all x ∈ N. (The definition of a maximum gives us a neighborhood. If we then intersect that neighborhood with (a, b), we get N. The neighborhood N will still be an interval and x0 ∈ N.) Since lim_{x→x0} (f(x) − f(x0))/(x − x0) exists (f is differentiable at x0), by Proposition 4.4.8 both lim_{x→x0−} (f(x) − f(x0))/(x − x0) and lim_{x→x0+} (f(x) − f(x0))/(x − x0) exist and are equal. If x ∈ N and x < x0, then x − x0 < 0, f(x) − f(x0) ≤ 0 and hence (f(x) − f(x0))/(x − x0) ≥ 0 (the slope is positive going up hill).
Claim 1: lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0: We prove this claim by contradiction. We assume that lim_{x→x0−} (f(x) − f(x0))/(x − x0) < 0, i.e. there exists L < 0 such that for every ε > 0 there exists δ such that 0 < x0 − x < δ implies |(f(x) − f(x0))/(x − x0) − L| < ε. Choose ε = |L|/2. Then there exists a δ such that 0 < x0 − x < δ implies that |(f(x) − f(x0))/(x − x0) − L| < |L|/2, or −|L|/2 + L < (f(x) − f(x0))/(x − x0) < |L|/2 + L = −|L|/2 < 0. This contradicts the fact that (f(x) − f(x0))/(x − x0) ≥ 0 for x ∈ (x0 − δ, x0) ∩ N, so we know that lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0.
Claim 2: lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0: This proof is very much like the last case. We show that if x ∈ N and x > x0, then (f(x) − f(x0))/(x − x0) ≤ 0 (the slope is negative going down hill). We then assume that lim_{x→x0+} (f(x) − f(x0))/(x − x0) = L > 0, apply the definition of the right hand limit with ε = L/2 and arrive at a contradiction. Therefore lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0.
Since lim_{x→x0−} (f(x) − f(x0))/(x − x0) ≥ 0 and lim_{x→x0+} (f(x) − f(x0))/(x − x0) ≤ 0, and they are equal, they both must be zero. Therefore the limit lim_{x→x0} (f(x) − f(x0))/(x − x0) = 0, or f′(x0) = 0.

We can prove the analogous result for a minimum using a completely similar argument or by considering the function −f—if f has a minimum at x0, then −f will have a maximum at x0.
Of course we know that Proposition 6.3.1 gives us a powerful tool for finding local maximums and minimums. We set f′(x) = 0—we called these points critical points in our basic course. This gives us all of the maximums and minimums at points at which f is differentiable and probably a few extras. We then use methods (which we will develop later) to determine which of these critical points is actually a maximum or minimum. Then if we consider any points at which f is not differentiable and maybe some end points, we have the maximums and minimums.
However, at this time we wanted Proposition 6.3.1 to help us prove the
following theorem commonly referred to as Rolle’s Theorem.
Theorem 6.3.2 (Rolle’s Theorem) Suppose that f : [a, b] → R is continuous
on [a, b], differentiable on (a, b) and such that f (a) = f (b). Then there exists a
ξ ∈ (a, b) such that f ′ (ξ) = 0.
Proof: We know from Theorem 5.3.8 that there exist x0, y0 ∈ [a, b] such that f(x0) = min{f(x) : x ∈ [a, b]} and f(y0) = max{f(x) : x ∈ [a, b]}. If both x0 and y0 are endpoints of [a, b] (and f(a) = f(b)), then f is constant on [a, b], f′(x) = 0 for x ∈ (a, b) so we can choose ξ = (a + b)/2. Otherwise, either x0 or y0 is in (a, b), say x0 ∈ (a, b). Then by Proposition 6.3.1 we know that f′(x0) = 0, i.e. we can set ξ = x0.
The real reason that we want Rolle’s Theorem is to help us prove the Mean
Value Theorem, abbreviated by MVT, which is a very important result.
Theorem 6.3.3 (Mean Value Theorem (MVT)) Suppose that f : [a, b] → R is continuous on [a, b] and differentiable on (a, b). Then there exists a ξ ∈ (a, b) such that f′(ξ) = (f(b) − f(a))/(b − a).
Proof: This result is very easy to prove as long as we know the "trick." We set h(x) = f(x) − f(a) − ((f(b) − f(a))/(b − a))(x − a). Then h(a) = h(b) = 0, h is continuous on [a, b] and h is differentiable on (a, b). Thus by Rolle's Theorem, Theorem 6.3.2, there exists ξ ∈ (a, b) such that h′(ξ) = 0, i.e. h′(ξ) = f′(ξ) − (f(b) − f(a))/(b − a) = 0, which is what we were to prove.
Often the results of the Mean Value Theorem will be given in the form
f (b) − f (a) = f ′ (ξ)(b − a).
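In this form the MVT promises a point ξ where the instantaneous rate equals the average rate, and for a concrete f we can locate it. A sketch for f(x) = x³ on [0, 2], where 3ξ² = (f(2) − f(0))/2 = 4 gives ξ = 2/√3 (the bisection search and the choice of f are illustrative devices, not from the text):

```python
# f(x) = x^3 on [0, 2]: find xi with f'(xi) = (f(2) - f(0))/(2 - 0) = 4.
def fprime(x):
    return 3 * x**2

slope = (2.0**3 - 0.0**3) / (2.0 - 0.0)

# Bisection on g(x) = f'(x) - slope, which is increasing with g(0) < 0 < g(2)
lo, hi = 0.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) - slope < 0:
        lo = mid
    else:
        hi = mid
print(lo)  # ~ 2/sqrt(3) ~ 1.1547
```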
The Mean Value Theorem is important in the form given in Theorem 6.3.3 but is most important for some of the very important corollaries that follow easily from the theorem. We begin with two results that are related to integration—even though at this time we don't know what integration is.
Corollary 6.3.4 Suppose f : (a, b) → R is differentiable on (a, b) and such that
f ′ (x) = 0 for x ∈ (a, b). Then f is constant on (a, b).
Proof: Choose any two values x0, y0 ∈ (a, b) where x0 ≠ y0, say x0 < y0. Since f is differentiable on (a, b) we know that f is differentiable on (x0, y0) and by Proposition 6.1.2 f is continuous on [x0, y0]. We can then apply the MVT and get f(y0) − f(x0) = f′(ξ)(y0 − x0) for some ξ ∈ (x0, y0). Since f′(x) = 0 for all x ∈ (a, b), f′(ξ) = 0 and we have that f(x0) = f(y0). Since this is true for any x0, y0 ∈ (a, b), f must be constant on (a, b).
Corollary 6.3.5 Suppose f, g : (a, b) → R are such that f and g are differentiable on (a, b) and f ′ (x) = g ′ (x) for all x ∈ (a, b). Then there exists a constant
C ∈ R such that f (x) = g(x) + C for all x ∈ (a, b).
Proof: Define h by h(x) = f (x) − g(x). If we then apply Corollary 6.3.4 to the
function h, we see that h is constant on (a, b). This is what we wanted to prove.
As we stated earlier, you should recall that you used both of the above results often in your basic course. We next give the results that relate increasing and decreasing functions to their derivatives. Recall that in Definition 5.4.3 we defined increasing and decreasing, and strictly increasing and strictly decreasing functions. We state the following corollary.
Corollary 6.3.6 Suppose f : (a, b) → R is differentiable on (a, b). We then
have the following results.
(a) If f ′ (x) > 0 for all x ∈ (a, b), then f is strictly increasing on (a, b).
(b) If f ′ (x) ≥ 0 for all x ∈ (a, b), then f is increasing on (a, b).
(c) If f ′ (x) < 0 for all x ∈ (a, b), then f is strictly decreasing on (a, b).
(d) If f ′ (x) ≤ 0 for all x ∈ (a, b), then f is decreasing on (a, b).
Proof: (a) Suppose x, y ∈ (a, b) are such that x < y. Then f is differentiable
on (x, y) and continuous on [x, y] so we can apply the MVT. We get f (y) −
f (x) = f ′ (ξ)(y − x) for ξ ∈ (x, y). Since f ′ (ξ) > 0 and y − x > 0, we get that
f (y) − f (x) > 0, or f (x) < f (y), or f is strictly increasing.
The proofs of (b), (c) and (d) follow from the MVT in exactly the same way.
We should realize that the application of Corollary 6.3.6 along with Proposition 6.3.1 gives us a method for categorizing the maximums and minimums of a function. We use Proposition 6.3.1 (along with listing the points where the derivative does not exist) to find the potential maximums and minimums, the critical points. We handle points at which the function is not defined separately. We then evaluate f′ at one point in the interval between each pair of adjacent critical points to determine whether f is strictly increasing or decreasing in that interval—if we have all critical points listed, the sign of f′ cannot change in the interval. We then classify a critical point as a maximum if the function is increasing to the left of the critical point and decreasing to the right of it, and as a minimum if the function is decreasing to the left of the critical point and increasing to the right of it.
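The classification procedure just described is mechanical enough to sketch in code. For the illustrative function f(x) = x³ − 3x, with critical points x = ±1, sampling f′ once per interval determines the sign pattern (the function and sample points are my own choices, not from the text):

```python
def fp(x):
    # f(x) = x^3 - 3x has f'(x) = 3x^2 - 3, zero at x = -1 and x = 1
    return 3 * x**2 - 3

# One sample point inside each interval between critical points
signs = [fp(x) > 0 for x in (-2.0, 0.0, 2.0)]
print(signs)  # [True, False, True]: increasing, decreasing, increasing,
              # so x = -1 is a maximum and x = 1 is a minimum
```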
We also get a very useful result from Corollary 6.3.6. In Proposition 5.4.8 we saw that if a function f is strictly monotone on its domain, then f is one-to-one. Then from Corollary 6.3.6 and Proposition 5.4.8 we obtain the following useful result.
Corollary 6.3.7 Suppose f : (a, b) → R is differentiable on (a, b). We then
have the following results.
(a) If f ′ (x) > 0 for all x ∈ (a, b), then f is one-to-one on (a, b).
(b) If f ′ (x) < 0 for all x ∈ (a, b), then f is one-to-one on (a, b).
We next return to the situation of inverse functions. In Section 5.4, Proposition 5.4.11 we proved that if I is an interval and f is either strictly monotone
on I or one-to-one and continuous on I, then f −1 is continuous on f (I). We
next give the result that gives a differentiability condition for f −1 .
Proposition 6.3.8 Suppose that f : I → R where I ⊂ R is an interval. Assume that f is one-to-one and continuous on I. If x0 ∈ I is not an end point of I, f is differentiable at x = x0 and f′(x0) ≠ 0, then f⁻¹ : f(I) → R is differentiable at y0 = f(x0) and

(f⁻¹)′(y0) = 1/f′(x0).   (6.3.1)
Proof: Read this proof carefully. It is a very technical proof. As discussed
earlier we already know that f −1 is continuous on f (I).
We know that lim_{x→x0} (f(x) − f(x0))/(x − x0) = f′(x0) ≠ 0. Hence

1/f′(x0) = lim_{x→x0} 1/((f(x) − f(x0))/(x − x0)) = lim_{x→x0} (x − x0)/(f(x) − f(x0)).

Thus for every ε > 0 there exists δ such that 0 < |x − x0| < δ implies

|(x − x0)/(f(x) − f(x0)) − 1/f′(x0)| < ε.   (6.3.2)
Let g = f⁻¹. Since g is continuous at y0, for every ε1 > 0 there exists δ1 such that 0 < |y − y0| < δ1 implies |g(y) − g(y0)| < ε1. Apply this continuity argument with ε1 = δ (from the preceding paragraph) and call the resulting δ1 η, i.e. we have 0 < |y − y0| < η implies |g(y) − g(y0)| < δ.
Then 0 < |y − y0| < η implies |g(y) − g(y0)| < δ, or |g(y) − x0| < δ. Then (6.3.2) implies that |(g(y) − x0)/(f(g(y)) − f(x0)) − 1/f′(x0)| < ε (where the last expression follows by replacing x in (6.3.2) by g(y)).
Note that because x0 = g(y0), f(g(y)) = y and f(x0) = y0,

(g(y) − x0)/(f(g(y)) − f(x0)) − 1/f′(x0) = (g(y) − g(y0))/(y − y0) − 1/f′(x0).
f ′ (x0 ) Therefore
we have that for every ǫ > 0 there exists an η such that 0 < |y−y0 | < η
g(y) − g(y0 )
1 1
g(y) − g(y0 )
− ′
= ′
—which is
implies < ǫ or lim
y→y0
y − y0
f (x0 ) y − y0
f (x0 )
what we were to prove.
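Formula (6.3.1) is easy to test numerically. For the illustrative choice f(x) = x² on (0, ∞) with x0 = 2, y0 = 4, both sides of (f⁻¹)′(y0) = 1/f′(x0) come out to 1/4 (the sample point is my own choice):

```python
import math

def num_deriv(f, x0, h=1e-6):
    # Central difference approximation to f'(x0)
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0 = 2.0
y0 = x0**2
lhs = num_deriv(math.sqrt, y0)            # (f^{-1})'(y0) with f^{-1} = sqrt
rhs = 1 / num_deriv(lambda x: x**2, x0)   # 1 / f'(x0)
print(lhs, rhs)  # both ~ 0.25
```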
We next give a nice application of Proposition 6.3.8. In Section 5.6 we used Proposition 5.4.8 to define y^{1/n}, we used Proposition 5.4.11 to show that the function y^{1/n} is continuous on [0, ∞), and then used the composition of y^{1/n} and x^m to define x^r, r ∈ Q, and show that x^r is continuous. We now want to extend these results to show that x^r is differentiable. To do so in the most pleasant way we return to Example 5.6.1 and consider √y. (Recall that in Example 6.2.2 we proved that d/dy √y = 1/(2√y). We did so using more elementary methods and methods that did not extend as nicely to y^{1/n}—and methods that weren't as nice as these.)
Example 6.3.1 Consider the function f(x) = x² on D = [0, ∞). Show that f⁻¹(y) = √y is differentiable on (0, ∞) and that d/dy √y = 1/(2√y) = (1/2) y^{−1/2}.

Solution: We should recall that we already know that f is invertible, that f⁻¹ is continuous on [0, ∞) and that f⁻¹(y) = √y. We see that it is very easy to apply Proposition 6.3.8 to f—we know that f : I = [0, ∞) → R, I is surely an interval and f is one-to-one and continuous on I. We let y0 be an arbitrary element of (0, ∞) and let x0 ∈ (0, ∞) be such that y0 = f(x0) = x0², i.e. x0 = √y0. Then we know from Proposition 6.3.8 that f⁻¹ is differentiable at y0 and

d/dy f⁻¹(y0) = 1/f′(x0) = 1/(2x0) = 1/(2√y0).

This is what we were to prove.
Of course we next extend the above result to the function y 1/n .
Example 6.3.2 Consider the function f(x) = xⁿ on D = [0, ∞). Show that f⁻¹(y) = y^{1/n} = ⁿ√y is differentiable on (0, ∞) and that d/dy y^{1/n} = (1/n) y^{1/n − 1}.

Solution: We proceed exactly as we did in the previous example. We know from Section 5.6 that f is invertible, that f⁻¹ is continuous on [0, ∞) and f⁻¹(y) = y^{1/n}. It is also easy to see that f satisfies the hypotheses of Proposition 6.3.8. Again let x0 ∈ (0, ∞) and y0 ∈ (0, ∞) be such that y0 = f(x0) = x0ⁿ, or x0 = y0^{1/n}. Then by Proposition 6.3.8 we know that f⁻¹ is differentiable at any such y0 and

d/dy f⁻¹(y0) = 1/f′(x0) = 1/(n x0^{n−1}) = 1/(n y0^{1−1/n}) = (1/n) y0^{1/n − 1}

which is what we were to prove.
Of course the next step is to apply the Chain Rule, Proposition 6.1.4, to prove that for r ∈ Q, r = m/n,

d/dx x^r = d/dx (x^{1/n})^m = m (x^{1/n})^{m−1} · (1/n) x^{1/n − 1} = (m/n) x^{(m−1)/n + 1/n − 1} = r x^{r−1}.

This is important enough that we state this result in the form of a proposition.
This is important enough that we state this result in the form of a proposition.
Proposition 6.3.9 Consider r = m
n ∈ Q and the function g : [0, ∞) → R
defined by g(x) = xr . The function g is continuous and differentiable on [0, ∞)
and for any x0 ∈ [0, ∞) we have g ′ (x0 ) = rxr−1
.
0
The inverse trig functions provide us with another nice application of Propositions 5.4.11 and 6.3.8. In our basic course, some time after defining the trig functions we defined the inverse trig functions—for the sine function, probably very intuitively, as θ = sin⁻¹ x, the "angle whose sine is x." The interesting and sometimes tough part is that the sine function is not one-to-one. We can get around this problem easily. Suppose for the moment we write sin x to denote the sine function defined on R and define the restriction of sin to [−π/2, π/2] by Sin x—this is a temporary, uncommon notation used to make the point. It should be reasonably clear that though sin is not one-to-one, Sin is one-to-one—we have restricted the domain, just as you did in your basic course, so as to make the restriction one-to-one. We then have the following result.
Example 6.3.3 Consider the function f = Sin : D = [−π/2, π/2] → R where f(x) = Sin x = sin x. Show that f⁻¹ exists on f(D) = [−1, 1], f⁻¹ is continuous on [−1, 1], f⁻¹ is differentiable on (−1, 1) and (f⁻¹)′(x0) = 1/√(1 − x0²) for x0 ∈ (−1, 1).

Solution: Above we stated that it was "reasonably clear" that Sin is one-to-one. We also need to know that the Sin function is monotone. Since we know that d/dx sin x = cos x and cos x > 0 on (−π/2, π/2), by Corollaries 6.3.6 and 6.3.7 we know that the Sin function is strictly increasing and one-to-one on the interval (−π/2, π/2). Since we also know that the sine function does not equal ±1 on the open interval (−π/2, π/2), we can include the end points to see that the function Sin is strictly increasing and one-to-one on [−π/2, π/2].

Since f is monotone, we can apply Proposition 5.4.11 to see that f⁻¹ is continuous on f(D) = [−1, 1]. We know from Example 5.2.3-(d) that the sine function is continuous on R—hence f(x) = Sin x is continuous on D = [−π/2, π/2]. Then since we already know that f is one-to-one, by Proposition 6.3.8 f⁻¹ = sin⁻¹ is differentiable on (−1, 1), and for x ∈ (−1, 1) and θ ∈ (−π/2, π/2) such that sin θ = x, we have

d/dx sin⁻¹ x = 1/f′(θ) = 1/cos θ.

We know that cos θ = ±√(1 − sin² θ). Because for θ ∈ [−π/2, π/2] we know that cos θ ≥ 0, we have cos θ = √(1 − sin² θ). Also sin θ = x, so we have

d/dx sin⁻¹ x = 1/√(1 − x²)

—the formula you learned in your basic course.
Of course we know that f⁻¹ is usually written as sin⁻¹. You should be careful with this notation. In some texts—usually old ones—they will write sin⁻¹ as the inverse of sin (not a function, since sin is not one-to-one) and Arcsin as the inverse of Sin. This is nice notation because it emphasizes the fact that sin is not one-to-one, but it doesn't seem to be used much anymore. Just use the notation sin⁻¹ to denote the inverse of the Sin function and never talk about the inverse of the sin function—because that inverse isn't even a function. But be careful.
Also we should realize that we could next consider the cosine, tangent, secant, etc. functions. Just like the sine function, none of these functions is one-to-one. Thus we restrict the domain as you did in your basic class (sometimes different from the domain used to define sin⁻¹) and proceed as we did in Example 6.3.3. We emphasize that having different domains for these functions can make things difficult when we have more than one of them interacting with each other. Also we must be careful when using a calculator.
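As a small numerical illustration (ours, not from the text) of the derivative formula derived in Example 6.3.3, we can compare a difference quotient of the library arcsine against 1/√(1 − x²); the sample point is arbitrary.

```python
import math

# Check d/dx arcsin(x) = 1 / sqrt(1 - x^2) on (-1, 1) numerically.
x0 = 0.4
h = 1e-6
approx = (math.asin(x0 + h) - math.asin(x0 - h)) / (2 * h)  # difference quotient
exact = 1.0 / math.sqrt(1.0 - x0 ** 2)
assert abs(approx - exact) < 1e-8
```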
HW 6.3.1 (True or False and why) (a) Suppose f, g : (a, b) → R are such that
f and g are differentiable on (a, b), f ′ (x) = g ′ (x) for x ∈ (a, b) and f ((a+b)/2) =
g((a + b)/2). Then f (x) = g(x) for all x ∈ (a, b).
(b) Suppose f : R → R is strictly increasing on R. Then f ′ (x) > 0 for x ∈ R.
(c) Suppose f : [−1, 1] → R has a maximum at x = 0. Then f ′ (0) = 0.
(d) The function f (x) = x + sin x is strictly monotone on [0, ∞).
(e) Suppose f : (−3, 3) → R is differentiable on (−3, 3) and such that |f ′ (x)| > 0
on (−3, 3). Then f is strictly monotone on (−3, 3).
HW 6.3.2 Consider the function f (x) = |x| for x ∈ R. Show that if a < 0 < b,
then there is no ξ ∈ (a, b) such that f (b) − f (a) = f ′ (ξ)(b − a). Why does this
not contradict the Mean Value Theorem?
HW 6.3.3 Consider the function f : R → R defined by f (x) = 2x − cos x. (a)
Prove that f is invertible and f −1 is continuous on R.
(b) Prove that f⁻¹ is differentiable and find d(f⁻¹)/dx at x = −π/2.
HW 6.3.4 Suppose f : D → R, D ⊂ R, is differentiable on D and for some M satisfies |f′(x)| ≤ M on D. Prove that f is uniformly continuous on D.
HW 6.3.5 Suppose f : [a, b] → R is continuous on [a, b] and differentiable on
(a, b). Prove that if f′(x) ≠ 0 for all x ∈ (a, b), then f is one-to-one.
6.4 L'Hospital's Rule
In Proposition 4.3.1-(f) we found that if f → L1, g → L2 and L2 ≠ 0, then f/g → L1/L2. In Example 4.2.4 we found that lim_{x→4} (x³ − 64)/(x − 4) = 48, i.e. we found the limit of a quotient when the limit in the denominator is zero. We did this by dividing out the x − 4 term in the numerator and the denominator. And finally in Example 6.2.3 we proved that lim_{θ→0} (sin θ)/θ = 1—another example where the limit exists while the limit in the denominator is zero. This time we were unable to divide out the θ term, so we had to work much harder to find the limit and prove that 1 is the correct limit.
In this section we introduce L’Hospital’s Rule—a method for finding certain
limits of quotients. We begin by introducing the easiest version—a version that
will satisfy many of our needs.
Proposition 6.4.1 (L'Hospital's Rule) Suppose f, g : I → R where I is an interval, x0 ∈ I, f(x0) = g(x0) = 0, f and g are differentiable at x0 and g′(x0) ≠ 0. Then

lim_{x→x0} f(x)/g(x) = f′(x0)/g′(x0).
Proof: We note that for x ∈ I, x ≠ x0, we have

f(x)/g(x) = (f(x) − f(x0))/(g(x) − g(x0)) = [(f(x) − f(x0))/(x − x0)] / [(g(x) − g(x0))/(x − x0)].

Then

lim_{x→x0} f(x)/g(x) = [lim_{x→x0} (f(x) − f(x0))/(x − x0)] / [lim_{x→x0} (g(x) − g(x0))/(x − x0)] = f′(x0)/g′(x0),

which is what we were to prove.
We note that this version of L'Hospital's Rule is enough to find both of the singular limits that we have considered in the past: lim_{x→4} (x³ − 64)/(x − 4), Example 4.2.4, and lim_{θ→0} (sin θ)/θ, Example 6.2.3-(a). Also note that if you consider a simple limit such as lim_{x→1} (3x + 4)/(2x − 3) = −7, and blindly apply L'Hospital's Rule, we would get that the limit is 3/2—the wrong answer. As is often the case you must be careful that the functions involved satisfy the hypotheses. The functions f(x) = 3x + 4 and g(x) = 2x − 3 do not satisfy the hypotheses f(1) = 0 and g(1) = 0. And finally, note that if I is a closed interval and x0 is an endpoint, then we have a version of L'Hospital's Rule for one-sided derivatives.
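A quick numerical look (our own illustration) at the three limits just discussed: the two 0/0 limits agree with the values L'Hospital's Rule predicts, while the third quotient simply takes the value −7 at x = 1, not the 3/2 that blind differentiation would suggest.

```python
import math

# lim_{x->4} (x**3 - 64)/(x - 4): Proposition 6.4.1 gives f'(4)/g'(4) = 48.
x = 4 + 1e-7
assert abs((x**3 - 64) / (x - 4) - 48) < 1e-4

# lim_{t->0} sin(t)/t: the rule gives cos(0)/1 = 1.
t = 1e-7
assert abs(math.sin(t) / t - 1) < 1e-8

# (3x + 4)/(2x - 3) at x near 1 is near -7; f'(x)/g'(x) = 3/2 is wrong
# here because f(1) = 7 and g(1) = -1 are not 0.
x = 1 + 1e-9
assert abs((3 * x + 4) / (2 * x - 3) + 7) < 1e-6
```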
There are more difficult versions of L'Hospital's Rule—and at times there is a need for these more difficult versions. We will include several of these different versions of L'Hospital's Theorem, proving several of the results and stating several without proof. These proofs will depend strongly on the Cauchy Mean Value Theorem—a generalization of the Mean Value Theorem, Theorem 6.3.3.
Proposition 6.4.2 (Cauchy Mean Value Theorem (CMVT)) Suppose f, g : [a, b] → R are continuous on [a, b], differentiable on (a, b) and g is such that g′(x) ≠ 0 on (a, b). Then there exists ξ ∈ (a, b) such that

(f(b) − f(a))/(g(b) − g(a)) = f′(ξ)/g′(ξ).
Proof: We prove this theorem very much as we proved the Mean Value Theorem—we use a trick and Rolle's Theorem, Theorem 6.3.2. We define a function h by h(x) = f(x) − m g(x) where m = (f(b) − f(a))/(g(b) − g(a)). Then it is easy to see that h(a) = h(b), h is continuous on [a, b] and h is differentiable on (a, b). Thus by Rolle's Theorem, Theorem 6.3.2, there exists ξ ∈ (a, b) such that h′(ξ) = 0. Since h′(x) = f′(x) − m g′(x), we have the desired result.
We should note that if we wanted to pretend that we were discovering the
proof, we could set h(x) = f (x) − mg(x) (without telling what m is) and choose
m so that h(a) = h(b). It’s still a trick—but a nice trick. Also we should notice
that if we set g(x) = x, we get the Mean Value Theorem, Theorem 6.3.3. For
this reason sometimes Proposition 6.4.2 will be referred to as the Generalized
Mean Value Theorem.
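A numerical illustration of the CMVT (our own sketch; the functions and interval are arbitrary choices): for f(x) = x³ and g(x) = eˣ on [0, 1], we can locate a ξ satisfying the conclusion by running bisection on the derivative of the proof's auxiliary function h.

```python
import math

# Find the xi promised by the CMVT for f(x) = x**3, g(x) = exp(x) on [0, 1].
# With m = (f(b)-f(a))/(g(b)-g(a)), the auxiliary function h(x) = f(x) - m*g(x)
# from the proof satisfies h'(xi) = 0; we find xi by bisection on h'.
f = lambda x: x**3
g = math.exp
fp = lambda x: 3 * x**2          # f'
gp = math.exp                    # g'
a, b = 0.0, 1.0
m = (f(b) - f(a)) / (g(b) - g(a))
hp = lambda x: fp(x) - m * gp(x)   # h'(x); here h'(a) < 0 < h'(b)

lo, hi = a, b
for _ in range(60):                # bisection on the sign change of h'
    mid = (lo + hi) / 2
    if hp(mid) < 0:
        lo = mid
    else:
        hi = mid
xi = (lo + hi) / 2
assert a < xi < b
# xi satisfies the CMVT conclusion f'(xi)/g'(xi) = (f(b)-f(a))/(g(b)-g(a))
assert abs(fp(xi) / gp(xi) - m) < 1e-9
```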
Before we move on to some of the versions of L'Hospital's Rules, we want to discuss a few ideas that we will use often. The first is that though we will use the CMVT in each of our results, we will always use a contorted version of the proposition. We will always work with an x and y in our domain where y < x. Then for example in part (a) of Proposition 6.4.3, we use the fact that

(f(x) − f(y))/(g(x) − g(y)) = [f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)]   (6.4.1)

and apply the CMVT to the left hand side. Equation (6.4.1) is easily seen to be true by simplifying the right hand side.
We should also note that in our applications of the CMVT, it is always the case that g(x) ≠ g(y)—because we will always assume that g′(x) ≠ 0 on our interval I, the Mean Value Theorem, Theorem 6.3.3, implies that g(x) − g(y) = g′(ξ)(x − y) for some ξ ∈ (y, x). Thus g(x) − g(y) ≠ 0.
And finally, another operation we will use often is the following. Again in part (a) of Proposition 6.4.3 we will have

|[f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)] − A| < ε

where x is fixed and f and g approach zero as y → c+. We let y → c+ and get

|[f(x)/g(x) − 0] / [1 − 0] − A| ≤ ε.

We should realize that this follows from HW 4.1.4.
Since some flavor of each of the above statements will appear in each proof,
we thought that we’d belabor the idea once and in the proofs just proceed as if
we know what we’re doing. Thus we begin with the following result where we
consider several of the possibilities when x is approaching a real number from
the right hand side.
Proposition 6.4.3 (L'Hospital's Rule) Suppose that f, g : I = (c, a) → R, c, a ∈ R, where f, g are differentiable on I and g is assumed to satisfy g(x) ≠ 0, g′(x) ≠ 0 on I. We then have the following results.
(a) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f′(x)/g′(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(b) If lim_{x→c+} f(x) = lim_{x→c+} g(x) = 0 and lim_{x→c+} f′(x)/g′(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
(c) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f′(x)/g′(x) = A ∈ R, then lim_{x→c+} f(x)/g(x) = A.
(d) If lim_{x→c+} g(x) = ∞ and lim_{x→c+} f′(x)/g′(x) = ∞, then lim_{x→c+} f(x)/g(x) = ∞.
Proof: (a) Suppose ε > 0 is given and let ε1 = ε/2. Since lim_{x→c+} f′(x)/g′(x) = A, for ε1 given we know that there exists a δ such that ξ ∈ (c, c + δ) implies that |f′(ξ)/g′(ξ) − A| < ε1. Choose x, y ∈ (c, c + δ) such that y < x. By the CMVT there exists ξ_y ∈ (y, x) such that (f(x) − f(y))/(g(x) − g(y)) = f′(ξ_y)/g′(ξ_y). Note that ξ_y is also in (c, c + δ). Then

|[f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)] − A| = |(f(x) − f(y))/(g(x) − g(y)) − A| = |f′(ξ_y)/g′(ξ_y) − A| < ε1.

We then let y → c+, noting that ξ_y ∈ (c, c + δ) for all of the y's, and get |f(x)/g(x) − A| ≤ ε1 = ε/2 < ε for any x ∈ (c, c + δ). Therefore lim_{x→c+} f(x)/g(x) = A.
You might notice that this proof isn’t too different from the proof of Proposition 6.4.1, except in this case since we cannot evaluate f or g at x = c, we use
x and y and then let y → c+. Otherwise the proofs are really very similar.
(b) Suppose K > 0 is given—this time we are proving that a limit is infinite so we begin with K in place of the traditional ε. Let K1 = 2K. Since lim_{x→c+} f′(x)/g′(x) = ∞, there exists a δ such that ξ ∈ (c, c + δ) implies that f′(ξ)/g′(ξ) > K1. Choose x, y ∈ (c, c + δ) with y < x. By the CMVT there exists ξ_y ∈ (y, x) such that (f(x) − f(y))/(g(x) − g(y)) = f′(ξ_y)/g′(ξ_y). Then we have

[f(x)/g(x) − f(y)/g(x)] / [1 − g(y)/g(x)] = (f(x) − f(y))/(g(x) − g(y)) = f′(ξ_y)/g′(ξ_y) > K1

—since ξ_y ∈ (y, x) ⊂ (c, c + δ)—and it's true for any y and ξ_y as long as y < x. Let y → c+ and get f(x)/g(x) ≥ K1 = 2K > K for all x ∈ (c, c + δ). Thus lim_{x→c+} f(x)/g(x) = ∞.
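A concrete instance of part (b) (our own choice of functions, not from the text): with c = 0, f(x) = x and g(x) = x², both vanish as x → 0+, f′(x)/g′(x) = 1/(2x) → ∞, and f(x)/g(x) = 1/x indeed grows without bound.

```python
# f(x) = x, g(x) = x**2 on (0, 1); both f/g and f'/g' blow up as x -> 0+.
xs = [10.0 ** (-k) for k in range(2, 8)]
fg = [x / x**2 for x in xs]          # f(x)/g(x) = 1/x
fpgp = [1 / (2 * x) for x in xs]     # f'(x)/g'(x) = 1/(2x)
assert all(b > a for a, b in zip(fg, fg[1:]))  # grows as x decreases
assert fg[-1] > 1e6 and fpgp[-1] > 1e6         # exceeds any fixed K eventually
```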
(c) The proof of statement (c) is difficult. We feel that it is important for you
to know that there is a rigorous proof. We also feel that it is important that
you are able to read and understand such a proof—even when it is tough. We
proceed.
Suppose ε > 0 is given. Let ε1 = min{1, ε/2}. Since lim_{x→c+} f′(x)/g′(x) = A, for ε1 > 0 given, there exists δ1 such that ξ ∈ (c, c + δ1) implies |f′(ξ)/g′(ξ) − A| < ε1. Choose x, y ∈ I such that y < x < c + δ1. Then, with ξ_y ∈ (y, x) the point given by the CMVT, since ξ_y ∈ (y, x) ⊂ (c, c + δ1),

|[f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] − A| = |(f(y) − f(x))/(g(y) − g(x)) − A| = |f′(ξ_y)/g′(ξ_y) − A| < ε1.   (6.4.2)
Since lim_{y→c+} g(y) = ∞, for a fixed x there exists δ4 such that y ∈ (c, c + δ4) implies that 1 − g(x)/g(y) > 0. Set ε2 = ε1/(2[2 + |A|]). Again using the fact that lim_{y→c+} g(y) = ∞, there exists δ5 such that y ∈ (c, c + δ5) implies that g(y) > |f(x)|/ε2 and there exists δ6 such that y ∈ (c, c + δ6) implies that g(y) > |g(x)|/ε2. Let δx = min{δ4, δ5, δ6}. Then y ∈ (c, c + δx) implies that |g(x)|/g(y) < ε2 and |f(x)|/g(y) < ε2. Now from (6.4.2) we have
−ε1 + A < [f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] < A + ε1

or

(−ε1 + A)(1 − g(x)/g(y)) + f(x)/g(y) < f(y)/g(y) < (ε1 + A)(1 − g(x)/g(y)) + f(x)/g(y).   (6.4.3)
Note that |g(x)|/g(y) < ε2 implies that −ε2 < g(x)/g(y) < ε2. From this we see that |g(x)|/g(y) < ε2 implies that 1 − g(x)/g(y) > 1 − ε2 and 1 − g(x)/g(y) < 1 + ε2. Also, |f(x)|/g(y) < ε2 implies that −ε2 < f(x)/g(y) < ε2. Then inequality (6.4.3) gives

(−ε1 + A)(1 − ε2) − ε2 < f(y)/g(y) < (ε1 + A)(1 + ε2) + ε2

or

−ε1 + A − ε2(1 + A − ε1) < f(y)/g(y) < ε1 + A + ε2(1 + A + ε1).   (6.4.4)
Using the fact that ε2 = ε1/(2[2 + |A|]) and the fact that ε1 ≤ 1, the extra term on the right hand side of (6.4.4) becomes

ε2(1 + A + ε1) ≤ ε2(2 + A) ≤ ε2[2 + |A|] = ε1/2 < ε1.

The extra term on the left side of (6.4.4) (without the minus sign) becomes

ε2(1 + A − ε1) < ε2[1 + A] ≤ ε2[1 + |A|] = (ε1/2) · (1 + |A|)/(2 + |A|) ≤ ε1/2 < ε1.

These allow us to write inequality (6.4.4) as

−ε + A < −2ε1 + A < f(y)/g(y) < 2ε1 + A < ε + A,

or |f(y)/g(y) − A| < ε for all y ∈ (c, c + δ) where δ = min{δ1, δx}. Thus lim_{y→c+} f(y)/g(y) = A.
(d) Let K > 0 be given. Let K1 = 2K + 1. Since lim_{x→c+} f′(x)/g′(x) = ∞, there exists a δ1 such that ξ ∈ (c, c + δ1) implies that f′(ξ)/g′(ξ) > K1. Choose x, y ∈ (c, c + δ1) such that y < x. Then by the CMVT there exists ξ_y ∈ (y, x) such that

[f(y)/g(y) − f(x)/g(y)] / [1 − g(x)/g(y)] = (f(x) − f(y))/(g(x) − g(y)) = f′(ξ_y)/g′(ξ_y) > K1.   (6.4.5)

As in the proof of part (c), choose δ7 (using δ5 and δ6) so that y ∈ (c, c + δ7) implies that |g(x)|/g(y) < 1/2 (which implies that 1 − g(x)/g(y) > 1/2) and |f(x)|/g(y) < 1/2 (which implies that −1/2 < f(x)/g(y) < 1/2). Then inequality (6.4.5) gives

f(y)/g(y) > K1 (1 − g(x)/g(y)) + f(x)/g(y) > (1/2) K1 − 1/2 = (1/2)(2K + 1) − 1/2 = K

for all y ∈ (c, c + δ7). Therefore lim_{y→c+} f(y)/g(y) = ∞.
We next state but do not prove a version of L'Hospital's Rule when x approaches −∞. Originally we had included the proofs in the text but then decided that it did no good to include these proofs if no one reads them. They are tough. If you are interested in the proofs see Advanced Calculus, Robert C. James.
Proposition 6.4.4 (L'Hospital's Rule) Suppose that f, g : I = (−∞, a) → R, a ∈ R, where f, g are differentiable on I and g is assumed to satisfy g(x) ≠ 0, g′(x) ≠ 0 on I. We then have the following results.
(a) If lim_{x→−∞} f(x) = lim_{x→−∞} g(x) = 0 and lim_{x→−∞} f′(x)/g′(x) = A ∈ R, then lim_{x→−∞} f(x)/g(x) = A.
(b) If lim_{x→−∞} f(x) = lim_{x→−∞} g(x) = 0 and lim_{x→−∞} f′(x)/g′(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
(c) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f′(x)/g′(x) = A ∈ R, then lim_{x→−∞} f(x)/g(x) = A.
(d) If lim_{x→−∞} g(x) = ∞ and lim_{x→−∞} f′(x)/g′(x) = ∞, then lim_{x→−∞} f(x)/g(x) = ∞.
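A numerical instance of part (c) of Proposition 6.4.4 (the functions are our own choice): take f(x) = 3x² + x and g(x) = x², so that g(x) → ∞ as x → −∞ and f′(x)/g′(x) = (6x + 1)/(2x) → 3; the proposition then says f(x)/g(x) → 3 as well.

```python
# As x -> -infinity, both quotients approach A = 3.
for x in (-1e3, -1e6, -1e9):
    assert abs((3 * x**2 + x) / x**2 - 3) < 1e-2       # f/g
    assert abs((6 * x + 1) / (2 * x) - 3) < 1e-2       # f'/g'
```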
Chapter 7
Integration
7.1 An Introduction to Integration: Upper and Lower Sums
The concept of the integral is very important. An integral is an abstract way to perform a summation. We know of its application to areas, work, distance-velocity-acceleration, and much, much more. Generally the treatment of integration given in the basic course is less than adequate—integration is more difficult than differentiation and continuity. In this chapter we introduce the concept of the integral and develop basic results concerning integration. Specifically, in this section we will lay the groundwork for the definition. When we feel that we want motivation for the integral, we will use the fact that we want the integral to represent the area under a given curve—hence in this section we will give upper and lower approximations of the area in terms of sums.
Consider an interval [a, b] where a < b, and for n ∈ N consider P =
{x0 , · · · , xn } where a = x0 < x1 < · · · < xn−1 < xn = b. P is called a partition
of [a, b]. The points xi , i = 0, · · · , n, are called partition points. The intervals
[xi−1 , xi ], i = 1, · · · , n, are called partition intervals. Note that P = {a, b} is
the most trivial partition of [a, b]. When we write a partition of [a, b], we will write
the partition as P = {x0 , x1 , · · · , xn }, assuming that we all know that x0 = a,
xn = b and xi−1 < xi , for i = 1, · · · , n.
Definition 7.1.1 Consider the function f : [a, b] → R where f is bounded on
[a, b], and P is a partition of [a, b], P = {x0 , x1 , · · · , xn }.
(a) For each i, i = 1, · · · , n, define mi = glb{f (x) : x ∈ [xi−1 , xi ]} and Mi =
lub{f (x) : x ∈ [xi−1 , xi ]}. (Note that these glb’s and lub’s exist because f is
bounded on [a, b].)
(b) Define L(f, P) = Σ_{i=1}^{n} m_i(x_i − x_{i−1}) and U(f, P) = Σ_{i=1}^{n} M_i(x_i − x_{i−1}).
The values L(f, P ) and U (f, P ) are called the lower and upper Darboux sums
of f based on P , respectively, or just the lower and upper sums.
We notice in Figure 7.1.1 that for the given partition P , L(f, P ) (represented by the area under the thin horizontal lines) gives a lower approximation
for the area under the curve y = f (x), a ≤ x ≤ b, and U (f, P ) (represented by
the area under the thick horizontal lines) gives an upper approximation for the
area under the curve y = f(x), a ≤ x ≤ b. Note that when we use the word "approximation", we do not mean that it necessarily provides an accurate approximation. You should realize that if we include more points in the partition, the values L(f, P) and U(f, P) will provide better approximations of the area compared to the partition pictured—add a point to the partition in the figure, draw the new version of the segments indicating the new upper and lower sums, and note that the new upper and lower sums give a value closer to the area under the curve y = f(x).
Figure 7.1.1: Plot of the function y = f(x) on [a, b], a partition indicated on [a, b] and the step function representing the upper and lower sums, U(f, P) and L(f, P), respectively.
Consider the following examples.
Example 7.1.1
Let f1 denote the constant function f1(x) = k for x ∈ [a, b]. Let P = {x0, x1, · · · , xn} be a partition of [a, b]. Compute L(f1, P) and U(f1, P).
Solution: Let [x_{i−1}, x_i] denote one of the partition intervals associated with partition P. It should be clear that m_i = glb{f1(x) : x ∈ [x_{i−1}, x_i]} = k and M_i = k. Then

L(f1, P) = Σ_{i=1}^{n} m_i(x_i − x_{i−1}) = k Σ_{i=1}^{n} (x_i − x_{i−1}) = k(b − a).

Also

U(f1, P) = Σ_{i=1}^{n} M_i(x_i − x_{i−1}) = k Σ_{i=1}^{n} (x_i − x_{i−1}) = k(b − a).
Example 7.1.2
Consider the function f2 : [0, 1] → R defined by f2(x) = 1 if x ∈ Q ∩ [0, 1] and f2(x) = 0 if x ∈ I ∩ [0, 1], and let P = {x0, · · · , xn} be a partition of [0, 1]. Compute L(f2, P) and U(f2, P).
Solution: Recall that f2 is the same function considered in Example 5.2.5. Let [x_{i−1}, x_i] denote a partition interval of partition P and assume that x_{i−1} < x_i. Technically we could allow the points x_{i−1} and x_i to be two points of a partition with x_{i−1} = x_i—but the partition interval [x_{i−1}, x_i] would contribute nothing to either L(f2, P) or U(f2, P)—so why include such a point. Since by Proposition 1.5.6-(a) there exists a q ∈ Q such that q ∈ (x_{i−1}, x_i), we see that M_i = 1. Also since by Proposition 1.5.6-(b) there exists p ∈ (x_{i−1}, x_i) such that p ∈ I, we see that m_i = 0. This is true for every i, i = 1, · · · , n. Thus

L(f2, P) = Σ_{i=1}^{n} m_i(x_i − x_{i−1}) = Σ_{i=1}^{n} 0 · (x_i − x_{i−1}) = 0

and

U(f2, P) = Σ_{i=1}^{n} M_i(x_i − x_{i−1}) = Σ_{i=1}^{n} 1 · (x_i − x_{i−1}) = 1.
Example 7.1.3
Consider f3 : [0, 1] → R defined by f3(x) = 2x + 3 and the partition of [0, 1], P = {x0, · · · , xn}. Compute L(f3, P) and U(f3, P).
Solution: Since f3 is increasing, it is easy to see that on the partition interval [x_{i−1}, x_i], m_i = 2x_{i−1} + 3 and M_i = 2x_i + 3, i = 1, · · · , n. Thus

L(f3, P) = Σ_{i=1}^{n} m_i(x_i − x_{i−1}) = Σ_{i=1}^{n} (2x_{i−1} + 3)(x_i − x_{i−1}) = 2 Σ_{i=1}^{n} x_i x_{i−1} − 2 Σ_{i=1}^{n} x_{i−1}² + 3,

and

U(f3, P) = Σ_{i=1}^{n} M_i(x_i − x_{i−1}) = Σ_{i=1}^{n} (2x_i + 3)(x_i − x_{i−1}) = 2 Σ_{i=1}^{n} x_i² − 2 Σ_{i=1}^{n} x_i x_{i−1} + 3.

This is not very nice—and there's no way to make it nice.
If instead we choose the partition P1 = {0, 1/n, 2/n, · · · , n/n = 1}, we get

L(f3, P1) = Σ_{i=1}^{n} m_i(x_i − x_{i−1}) = Σ_{i=1}^{n} [2(i−1)/n + 3][i/n − (i−1)/n] = (2/n²) Σ_{i=1}^{n} (i − 1) + 3 = (2/n²) · n(n−1)/2 + 3

and

U(f3, P1) = Σ_{i=1}^{n} M_i(x_i − x_{i−1}) = Σ_{i=1}^{n} [2i/n + 3][i/n − (i−1)/n] = (2/n²) Σ_{i=1}^{n} i + 3 = (2/n²) · n(n+1)/2 + 3.

These are much nicer expressions. You'd think they'd be useful for something—but remember that this is a very nice and specific partition.
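The uniform-partition formulas in Example 7.1.3 are easy to check by direct computation. The sketch below (ours, not part of the text; the helper name is our own) uses the fact that f3 is increasing, so m_i = f3(x_{i−1}) and M_i = f3(x_i) on each partition interval.

```python
# Darboux sums of an increasing function: m_i and M_i are the values of f
# at the left and right endpoints of each partition interval.
def darboux_sums_increasing(f, pts):
    lower = sum(f(pts[i - 1]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    upper = sum(f(pts[i]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    return lower, upper

f3 = lambda x: 2 * x + 3
n = 100
P1 = [i / n for i in range(n + 1)]           # the uniform partition
L, U = darboux_sums_increasing(f3, P1)
# compare against the closed forms from Example 7.1.3
assert abs(L - ((2 / n**2) * (n * (n - 1) / 2) + 3)) < 1e-12
assert abs(U - ((2 / n**2) * (n * (n + 1) / 2) + 3)) < 1e-12
assert L <= U                                 # Proposition 7.1.2
```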
We now proceed to develop some results comparing and relating the upper
and lower sums. Because it is clear that mi ≤ Mi , i = 1, · · · , n, we obtain the
following result.
Proposition 7.1.2 Suppose f : [a, b] → R where f is bounded on [a, b] and P
is a partition of [a, b]. Then L(f, P ) ≤ U (f, P ).
Remember that we want ∫_a^b f to give us the area under the curve. If we look at Figure 7.1.1 again, we note that for that to be true we at least must define ∫_a^b f so that for any partition P, L(f, P) ≤ ∫_a^b f ≤ U(f, P), i.e. we must squeeze ∫_a^b f between the two sides of the inequality given in Proposition 7.1.2.
We next state and prove an easy but necessary lemma for our later work.
Lemma 7.1.3 Suppose that f : [a, b] → R is bounded such that m ≤ f (x) ≤ M
for all x ∈ [a, b]. Then for any partition P of [a, b], m(b − a) ≤ L(f, P ) and
U (f, P ) ≤ M (b − a).
Proof: Let the partition P be given by P = {x0, · · · , xn}. Since m ≤ f(x) for all x ∈ [a, b], surely m ≤ m_i = glb{f(x) : x ∈ [x_{i−1}, x_i]} for all i, i = 1, · · · , n. Then

m(b − a) = m Σ_{i=1}^{n} (x_i − x_{i−1}) = Σ_{i=1}^{n} m(x_i − x_{i−1}) ≤ Σ_{i=1}^{n} m_i(x_i − x_{i−1}) = L(f, P),

which is one of the inequalities that we were to prove. The other inequality follows in the same manner.
One of the problems that we have is that we have defined the upper and lower Darboux sums for a particular partition. We have already discussed the fact that if we make the partition finer, the upper and lower sums give us better approximations of the area—the value that we want for ∫_a^b f. We must have ways to connect the sums in some way for different partitions. The next few definitions and propositions do this job for us. We begin with the following definition.
Definition 7.1.4 Let P and P ∗ be partitions of [a, b] given by P = {x0 , · · · , xn }
and P ∗ = {y0 , · · · , ym }. If P ⊂ P ∗ , then P ∗ is said to be a refinement of P .
Note that since the partitions P and P ∗ (and any other partitions that we may
define) are given as a set of points in [a, b], the set containment definition above
makes sense.
We should note that the easiest way to get a simple refinement of the partition P is to add one point, i.e. if P = {x0 , · · · , xn }, choose a point yI ∈ [xI−1 , xI ]
and let P ∗ = {x0 , · · · , xI−1 , yI , xI , · · · , xn }. We should then realize that if P ∗
is a refinement of the partition P , it is possible to consider a series of one-point
refinements P0 , · · · , Pk such that P0 = P , Pk = P ∗ and Pj is a one-point refinement of Pj−1 for j = 1, · · · , k. The construction consists of choosing P0 = P
and adding one of the points of P ∗ − P = {x : x ∈ P ∗ and x 6∈ P } at each step.
This observation makes several of the proofs given below much easier.
We next prove two lemmas, the first that relates the upper and lower sums
on a partition and refinements of that partition and the second that relates the
lower sum and upper sums with respect to different partitions.
Lemma 7.1.5 Suppose f : [a, b] → R where f is bounded on [a, b], P is a
partition of [a, b] and P ∗ is a refinement of P . Then L(f, P ) ≤ L(f, P ∗ ) and
U (f, P ∗ ) ≤ U (f, P ).
Proof: Let P # be a one-point refinement of the partition P where P =
{x0 , · · · , xn } and the extra point in P # , yi is such that xi−1 ≤ yi ≤ xi . Then
to compare L(f, P ) and L(f, P # ) we only need to compare the contributions to
both of these values from the interval [xi−1 , xi ]. The contribution of this interval
to L(f, P ) is the value mi (xi −xi−1 ) where mi = glb{f (x) : x ∈ [xi−1 , xi ]}. The
contribution of this same interval to L(f, P # ) is the value m1 (yi −xi−1 )+m2 (xi −
yi ) where m1 = glb{f (x) : x ∈ [xi−1 , yi ]} and m2 = glb{f (x) : x ∈ [yi , xi ]}. Of
course we note that xi − xi−1 = (yi − xi−1 ) + (xi − yi ).
We make the claim that m1 ≥ mi and m2 ≥ mi . To see that this is true
we note that since {f (x) : x ∈ [xi−1 , yi ]} ⊂ {f (x) : x ∈ [xi−1 , xi ]}, mi =
glb{f (x) : x ∈ [xi−1 , xi ]} will be a lower bound of {f (x) : x ∈ [xi−1 , yi ]}. Hence
mi ≤ f (x) for all x ∈ [xi−1 , yi ] and thus mi ≤ m1 = glb{f (x) : x ∈ [xi−1 , yi ]},
which also follows from HW1.4.1-(c). The fact that m2 ≥ mi follows in the
same manner.
Thus
mi (xi − xi−1 ) = mi [(yi − xi−1 ) + (xi − yi )]
= mi (yi − xi−1 ) + mi (xi − yi ) ≤ m1 (yi − xi−1 ) + m2 (xi − yi ),
so L(f, P ) ≤ L(f, P # ).
Now if P ∗ is any refinement of the partition P , from the discussion given preceding this lemma we know that there exist one-point refinements P0 , · · · , Pk
such that P0 = P , Pk = P ∗ and Pj is a one-point refinement of Pj−1 for
j = 1, · · · , k. Thus by taking k steps involving the one-point refinement argument given above (really an easy proof by induction) we see that L(f, P ) =
L(f, P0 ) ≤ L(f, P1 ) ≤ · · · ≤ L(f, Pk ) = L(f, P ∗ ), which is what we were to prove.
The proof that U (f, P ∗ ) ≤ U (f, P ) is very similar. Using the one-point refinement of P , P # , used above, the key step in the one-point refinement argument for U is that
Mi = lub{f (x) : x ∈ [xi−1 , xi ]} ≥ M 1 = lub{f (x) : x ∈ [xi−1 , yi ]}
and
Mi = lub{f (x) : x ∈ [xi−1 , xi ]} ≥ M 2 = lub{f (x) : x ∈ [yi , xi ]},
which follow from HW1.4.1-(c). The desired result then follows.
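A numerical instance of Lemmas 7.1.5 and 7.1.6 (our own sketch; the function and partitions are arbitrary choices, and the helper exploits monotonicity to get the exact glb's and lub's): refining a partition can only raise the lower sum and lower the upper sum.

```python
# For increasing f, the glb/lub on [x_{i-1}, x_i] are the endpoint values.
def darboux_sums_increasing(f, pts):
    lower = sum(f(pts[i - 1]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    upper = sum(f(pts[i]) * (pts[i] - pts[i - 1]) for i in range(1, len(pts)))
    return lower, upper

f = lambda x: x**2
P = [0.0, 0.5, 1.0]
P_star = [0.0, 0.25, 0.5, 0.75, 1.0]      # refinement: P is a subset of P_star
L_P, U_P = darboux_sums_increasing(f, P)
L_star, U_star = darboux_sums_increasing(f, P_star)
assert L_P <= L_star and U_star <= U_P    # Lemma 7.1.5
assert L_P <= U_star and L_star <= U_P    # consistent with Lemma 7.1.6
```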
Lemma 7.1.6 Suppose that f : [a, b] → R where f is bounded on [a, b], and
suppose that P1 and P2 are partitions of [a, b]. Then L(f, P1 ) ≤ U (f, P2 ).
Proof: Let P ∗ be any common refinement of P1 and P2 , i.e. P ∗ is a refinement
of P1 and P ∗ is a refinement of P2 . The smallest such common refinement is
found by setting P ∗ = P1 ∪ P2 . Then by Lemma 7.1.5 we have L(f, P1 ) ≤
L(f, P ∗ ) and U (f, P ∗ ) ≤ U (f, P2 ). By Proposition 7.1.2 we have L(f, P ∗ ) ≤
U (f, P ∗ ). Thus we have
L(f, P1 ) ≤ L(f, P ∗ ) ≤ U (f, P ∗ ) ≤ U (f, P2 ).
If we return to Examples 7.1.1, 7.1.2 and 7.1.3, it's really easy to see that f1 and f2 satisfy L(fj , P ) ≤ U (fj , P ) for j = 1, 2. It's more difficult to see that
Proposition 7.1.2 is satisfied for f3 —but it is true because xi−1 (xi − xi−1 ) ≤
xi (xi − xi−1 ) (sum both sides of the inequality and add 3). Because the upper
and lower sums for the functions f1 and f2 are so trivial, it is easy to see that
f1 and f2 will satisfy Lemma 7.1.5 for any refinement. Again because the upper
and lower sums for f3 are so difficult, it is difficult to see that f3 will satisfy
Lemma 7.1.5—except by approximately reproducing the proof of Lemma 7.1.5 with respect to the function f3 and some particular refined partition. And finally,
while it is again easy to see that functions f1 and f2 will satisfy Lemma 7.1.6
with respect to any two partitions, it is very difficult to see that the upper and
lower sums of the function f3 with respect to the partitions P and P1 will satisfy
Lemma 7.1.6. It should be clear that if we were to consider upper and lower
sums for more complex functions, it would be next to impossible to compare
these upper and lower sums—especially with respect to very complex different
partitions. It is for that reason that the above lemmas are so important.
HW 7.1.1 Consider the function f(x) = x and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}. (a) Compute L(f, Pn) and U(f, Pn).
(b) Compute U (f, Pn ) − L(f, Pn ).
(c) Show that U (f, Pn ) − L(f, Pn ) > 0.
(d) Compute lim [U (f, Pn ) − L(f, Pn )].
n→∞
HW 7.1.2 Consider the function f(x) = x if x ∈ Q ∩ [0, 1], f(x) = 0 if x ∈ I ∩ [0, 1], and the partition of [0, 1], Pn = {0, 1/n, 2/n, · · · , 1}. (a) Compute L(f, Pn) and U(f, Pn).
(b) Compute U(f, Pn) − L(f, Pn) and lim_{n→∞} [U(f, Pn) − L(f, Pn)].
7.2 The Darboux Integral
In the last section we computed upper and lower Darboux sums that gave us
approximations of the area under the curve from above and below, respectively.
If these sums weren’t so terrible to work with, we could live with them—that’s
approximately what they do numerically. However, the upper and lower sums
are not a very nice analytic tool—and we can do better—much better. In the
following definition we use the upper and lower sums to get a step nearer to the
area under the curve.
Definition 7.2.1 Consider f : [a, b] → R where f is bounded on [a, b].
(a) We define the lower integral of f on [a, b] to be

∫̲_a^b f = lub{L(f, P) : P is a partition of [a, b]}.

(b) We define the upper integral of f on [a, b] to be

∫̄_a^b f = glb{U(f, P) : P is a partition of [a, b]}.
We first note that if P ∗ is any fixed partition of [a, b], then by Lemma 7.1.6
L(f, P ) ≤ U (f, P ∗ ) for any partition P . Because U (f, P ∗ ) is an upper bound
for the set {L(f, P ) : P is a partition of [a, b]} we know that lub{L(f, P ) :
P is a partition of [a, b]} exists. Likewise, L(f, P ∗ ) is a lower bound of the set
{U (f, P ) : P is a partition of [a, b]} so that glb{U (f, P ) : P is a partition of [a, b]}
exists.
If we again return to Example 7.1.1, we see that L(f1, P) = U(f1, P) = k(b − a) for any partition P. Thus ∫̲_a^b f1 = ∫̄_a^b f1 = k(b − a). If we consider the function f2 introduced in Example 7.1.2, we see that since L(f2, P) = 0 for any partition P, ∫̲_0^1 f2 = 0, and since U(f2, P) = 1 for any partition P, ∫̄_0^1 f2 = 1. And finally, it should be reasonably easy to see that since L(f3, P) and U(f3, P) from Example 7.1.3 are so complex, it is too difficult to try to use these expressions to determine ∫̲_0^1 f3 or ∫̄_0^1 f3. Even though L(f3, P1) and U(f3, P1) found in Example 7.1.3 are much nicer, the lub and the glb in Definition 7.2.1 above must be taken over all partitions of [0, 1], not just a few nice ones—so knowing L(f3, P1) and U(f3, P1) does not help us determine ∫̲_0^1 f3 or ∫̄_0^1 f3 (at least at this time).
We see by Lemma 7.1.5 that as we add points to a partition, the upper sums get smaller. We define ∫̄_a^b f to be the glb of these upper sums. Likewise we know that as we add points to a partition, the lower sums get larger. The lower integral ∫̲_a^b f is defined to be the lub of these lower sums. Hence, ∫̄_a^b f and ∫̲_a^b f squeeze in to provide a better upper and lower approximation of the area under the curve y = f(x) from a to b, respectively.
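The squeezing just described can be watched numerically (our own sketch, with f(x) = x² on [0, 1] as an arbitrary increasing example): over uniform partitions the lower and upper sums close in on the common value 1/3.

```python
# Lower and upper sums of f(x) = x**2 on [0, 1] for the uniform n-partition.
f = lambda x: x**2

def sums(n):
    pts = [i / n for i in range(n + 1)]
    lower = sum(f(pts[i - 1]) * (1 / n) for i in range(1, n + 1))
    upper = sum(f(pts[i]) * (1 / n) for i in range(1, n + 1))
    return lower, upper

L10, U10 = sums(10)
L1000, U1000 = sums(1000)
# refining squeezes the sums toward 1/3 from below and above
assert L10 <= L1000 <= 1/3 <= U1000 <= U10
assert U1000 - L1000 < 2e-3
```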
We next prove a result that our intuition should tell us is obvious.
Proposition 7.2.2 Suppose that f : [a, b] → R, f is bounded on [a, b] and let P be any partition of [a, b]. Then

L(f, P) ≤ ∫̲_a^b f ≤ ∫̄_a^b f ≤ U(f, P).
Proof: The first inequality and the last inequality follow from the fact that the lub and the glb must be an upper bound and a lower bound of the set of all possible sums, respectively.
Let P1 and P2 be any two partitions of [a, b]. By Lemma 7.1.6 we know that L(f, P2) ≤ U(f, P1). Hence U(f, P1) is an upper bound of the set {L(f, P2) : P2 is a partition of [a, b]}. Therefore

∫̲_a^b f = lub{L(f, P2) : P2 is a partition of [a, b]} ≤ U(f, P1).

Then ∫̲_a^b f is a lower bound of the set {U(f, P1) : P1 is a partition of [a, b]}. Therefore

∫̲_a^b f ≤ glb{U(f, P1) : P1 is a partition of [a, b]} = ∫̄_a^b f,

which is what we were to prove.
The upper sums U(f, P) and the upper integral ∫̄_a^b f approximate the area under the curve y = f(x) from above, and the lower sums L(f, P) and the lower integral ∫̲_a^b f approximate the area under the curve y = f(x) from below. Since we want the integral ∫_a^b f to give the area under the curve y = f(x), it is reasonably logical to make the following definition.
Definition 7.2.3 Suppose f : [a, b] → R and f is bounded on [a, b]. We say that f is integrable on [a, b] if ∫̲_a^b f = ∫̄_a^b f. If f is integrable on [a, b], we write ∫_a^b f = ∫̲_a^b f or ∫_a^b f = ∫̄_a^b f.
We call ∫_a^b f the Darboux integral of f from a to b. We will actually drop the Darboux from here on and refer to ∫_a^b f as the integral of f from a to
b. We want to make it clear that this is the same integral that you studied
in your basic class. We tacked on the ”Darboux” to differentiate it from the
”Riemann” integral that we define in Section 7.6—at which time we immediately
prove that the Riemann and the Darboux integrals are the same. We use the
Darboux definition because it makes some of the proofs easier and because we
feel that it is very intuitive.
Before we discuss the integral we want to emphasize that while we denote the integral of f from a to b by ∫_a^b f, the most common notation (especially in the basic course) is to denote the integral by ∫_a^b f(x) dx. There are some advantages to this latter notation. The "dx" sort of reminds us that there is an x_i − x_{i−1} in the definition of the upper and lower sums. Also, later when we want to make a change of variables, the "dx" term is very useful for reminding us what we want to do. The notation ∫_a^b f(x) dx can also be difficult to understand when we study differentials. In our basic course we had a "dx" in the integral and a "dx" as a part of differentials, with apparently no connection. In any case we will generally use the notation ∫_a^b f to denote the integral of f from a to b—though whenever it seems convenient or more clear to use the ∫_a^b f(x) dx notation, we will use it.
We return to the statement made just before we gave the definition of the integral, "it is reasonably logical to make the following definition." It wouldn't be a good definition if it were always the case that ∫̲_a^b f < ∫̄_a^b f—then no function would be integrable. It is only a good definition—and then our logic is affirmed—if for a large set of nice functions we can in fact get equality, and for some functions we do not get equality. For a first glimpse of what we have, we return to our Examples 7.1.1, 7.1.2 and 7.1.3 from Section 7.1. For the function f1(x) = k defined on the interval [a, b] introduced in Example 7.1.1, we see that ∫_a^b f1 = k(b − a). If we consider the function f2 introduced in Example 7.1.2—and the subsequent work on f2—we see that since ∫̲_0^1 f2 = 0 < 1 = ∫̄_0^1 f2, the function f2 is not integrable on [0, 1]. And of course, we can't say much about the function f3.
We want to emphasize that the integral of f on the interval [a, b] is defined
only for functions that are bounded on [a, b]. We saw above that in the case of
the function f2, a function can be bounded and still not integrable. But we should also realize that f2 is not a nice function; that is, a bounded function that is pretty nasty may fail to be integrable.
To be able to show that more functions are integrable (in practice, other
than for theoretical purposes, we don’t care too much about the function f2 ),
we need some methods and results. We begin with a very powerful and useful
theorem, the Archimedes-Riemann Theorem.
Theorem 7.2.4 (The Archimedes-Riemann Theorem (A-R Theorem))
Consider f : [a, b] → R where f is bounded on [a, b]. The function f is integrable on [a, b] if and only if there exists a sequence of partitions of [a, b], {Pn}, n = 1, · · · , such that

lim_{n→∞} [U(f, Pn) − L(f, Pn)] = 0.    (7.2.1)

If there exists such a sequence of partitions, then

lim_{n→∞} L(f, Pn) = lim_{n→∞} U(f, Pn) = ∫_a^b f.    (7.2.2)
Proof: (⇐) Let Pn, n = 1, · · · , be a sequence of partitions of [a, b] such that lim_{n→∞} [U(f, Pn) − L(f, Pn)] = 0. By Proposition 7.2.2 we know that for all n

0 ≤ ∫̄_a^b f − ∫̲_a^b f ≤ U(f, Pn) − L(f, Pn).
Then by the Sandwich Theorem, Proposition 3.4.2, we know that

0 ≤ ∫̄_a^b f − ∫̲_a^b f ≤ lim_{n→∞} [U(f, Pn) − L(f, Pn)] = 0

(notice that the two sequences on the left are constant sequences), so ∫̲_a^b f = ∫̄_a^b f and f is integrable on [a, b].

(⇒) We now assume that f is integrable on [a, b]. Then ∫_a^b f = ∫̲_a^b f = ∫̄_a^b f.
Since ∫̲_a^b f = lub{L(f, P) : P is a partition of [a, b]}, for every n ∈ N there exists a partition of [a, b], Pn*, such that ∫̲_a^b f − L(f, Pn*) < 1/n. (Recall that by Proposition 1.5.3-(a) for any ǫ > 0 there exists a partition of [a, b], Pǫ, such that ∫̲_a^b f − L(f, Pǫ) < ǫ.) Likewise (using Proposition 1.5.3-(b)) there exists a partition of [a, b], Pn#, such that U(f, Pn#) − ∫̄_a^b f < 1/n. Let Pn be the common refinement of Pn* and Pn#, Pn = Pn* ∪ Pn#. Doing this construction for each n ∈ N gives us a sequence of partitions of [a, b], {Pn}.

We note that L(f, Pn*) ≤ L(f, Pn) and U(f, Pn) ≤ U(f, Pn#). Thus we have

L(f, Pn) ≥ L(f, Pn*) > ∫̲_a^b f − 1/n    (7.2.3)

and

U(f, Pn) ≤ U(f, Pn#) < ∫̄_a^b f + 1/n.    (7.2.4)

Subtracting (7.2.3) from (7.2.4) gives

0 ≤ U(f, Pn) − L(f, Pn) < [∫̄_a^b f + 1/n] − [∫̲_a^b f − 1/n] =* 2/n
where "=*" is true because of our hypothesis that f is integrable on [a, b] (in which case ∫̲_a^b f = ∫̄_a^b f). Therefore by Proposition 3.4.2 we have
0 ≤ lim_{n→∞} [U(f, Pn) − L(f, Pn)] ≤ lim_{n→∞} 2/n = 0,

or lim_{n→∞} [U(f, Pn) − L(f, Pn)] = 0, which is what we wanted to prove.
Now let {Pn} be such a sequence of partitions satisfying the above "if and only if" statement. If we use (7.2.3), the fact that ∫̲_a^b f = ∫_a^b f (if there is such a sequence of partitions, then f is integrable), and the fact that ∫_a^b f ≥ L(f, Pn) for all n—which implies that 0 ≤ ∫_a^b f − L(f, Pn) < 1/n for all n—we get lim_{n→∞} L(f, Pn) = ∫_a^b f. A similar argument using (7.2.4) can be used to prove that lim_{n→∞} U(f, Pn) = ∫_a^b f.
We claimed before we stated this theorem that it was "powerful and useful." If we consider an integral using Definitions 7.2.1 and 7.2.3, we must consider all partitions of [a, b]—this is difficult because there are a lot of partitions. To consider a particular integral using Theorem 7.2.4, we can use only a sequence of partitions—and we can choose a very nice sequence of partitions. For example, when we considered the function f3 defined by f3(x) = 2x + 3 in Example 7.1.3 we found that for a general partition the upper and lower sums are not very nice. However, when we considered the partition P1 = {0, 1/n, 2/n, · · · , n/n = 1}, we found that L(f3, P1) = n(n−1)/n² + 3 and U(f3, P1) = n(n+1)/n² + 3. Thus if we define the sequence of partitions {Pn} to be Pn = {0, 1/n, 2/n, · · · , n/n = 1} (the same as P1), we see that U(f3, Pn) − L(f3, Pn) = 2/n. Thus it is clear that lim_{n→∞} [U(f3, Pn) − L(f3, Pn)] = 0 and by Theorem 7.2.4 we see that

∫_0^1 f3 = lim_{n→∞} L(f3, Pn) = lim_{n→∞} [n(n−1)/n² + 3] = 4.
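The closed forms for L(f3, Pn) and U(f3, Pn) can be checked numerically. Because f3(x) = 2x + 3 is increasing, the glb and lub on each subinterval occur at its endpoints, so the sums below are exact (the helper is our own check, not part of the text):

```python
def sums_f3(n):
    # f3(x) = 2x + 3 is increasing, so on [x_{i-1}, x_i] we have
    # m_i = f3(x_{i-1}) and M_i = f3(x_i); every subinterval has width 1/n.
    f = lambda x: 2 * x + 3
    xs = [i / n for i in range(n + 1)]
    lower = sum(f(xs[i - 1]) * (1 / n) for i in range(1, n + 1))
    upper = sum(f(xs[i]) * (1 / n) for i in range(1, n + 1))
    return lower, upper

# Matches L(f3, Pn) = n(n-1)/n^2 + 3 and U(f3, Pn) = n(n+1)/n^2 + 3,
# so U - L = 2/n and both sums converge to the integral, 4.
```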
We should note that since Theorem 7.2.4 is an "if and only if" result with "f is integrable" on one side, the theorem gives us a result that is equivalent to the definition. In this case that is very important because the result given by Theorem 7.2.4 is easier to use than the definition. To make this result a bit easier to discuss we make the following definition.
Definition 7.2.5 Suppose that f : [a, b] → R is such that f is bounded on [a, b].
Let {Pn } be a sequence of partitions of [a, b]. The sequence {Pn } is said to be
an Archimedian sequence of partitions for f on [a, b] if U (f, Pn ) − L(f, Pn ) → 0
as n → ∞.
Of course we can then reword Theorem 7.2.4 as follows: Consider f : [a, b] →
R such that f is bounded on [a, b]. The function f is integrable on [a, b] if
and only if there exists an Archimedian sequence of partitions for f on [a, b].
Also, if there exists an Archimedian sequence of partitions for f on [a, b], then

∫_a^b f = lim_{n→∞} U(f, Pn) = lim_{n→∞} L(f, Pn).
Before we leave this section we include another theorem that is only a slight
variation of Theorem 7.2.4 but is sometimes useful—and gives us another characterization of integrability.
Theorem 7.2.6 (Riemann Theorem) Suppose f : [a, b] → R is bounded on
[a, b]. Then f is integrable on [a, b] if and only if for every ǫ > 0 there exists a
partition of [a, b], P , such that 0 ≤ U (f, P ) − L(f, P ) < ǫ.
Proof: (⇒) If f is integrable on [a, b], we know from the A-R Theorem, Theorem
7.2.4, that there exists an Archimedian sequence of partitions of [a, b], {Pn}, so
that U (f, Pn ) − L(f, Pn ) → 0 as n → ∞, i.e. for every ǫ > 0 there exists N ∈ R
such that n ≥ N implies that |U (f, Pn ) − L(f, Pn )| < ǫ. Let n0 ∈ N be such
that n0 > N . Then the partition Pn0 is such that 0 ≤ U (f, Pn0 ) − L(f, Pn0 ) =
|U (f, Pn0 ) − L(f, Pn0 )| < ǫ which is what we were to prove.
(⇐) We are given that for any ǫ > 0 there exists a partition of [a, b], P , such
that 0 ≤ U (f, P ) − L(f, P ) < ǫ. Then for each n ∈ N there exists a partition of
[a, b], Pn , such that U (f, Pn ) − L(f, Pn ) < 1/n (letting ǫ = 1/n). Then
0 ≤ U (f, Pn ) − L(f, Pn ) < 1/n → 0 as n → ∞.
Therefore by the A-R Theorem, Theorem 7.2.4, f is integrable on [a, b].
It’s obvious from the proof that the Riemann Theorem is only a slight variation of the A-R Theorem. There are times when it is more convenient to
only have to produce one partition to prove integrability. For those times the
Riemann Theorem is convenient.
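Riemann's criterion asks for just one partition per ǫ. For the increasing function f(x) = x on [0, 1] (the function of HW 7.2.2), a uniform partition with more than 1/ǫ pieces works, since there U − L = 1/n. The sketch below is our own; the helper names are invented for illustration.

```python
import math

def riemann_partition_for_identity(eps):
    """A partition P of [0, 1] with U(f, P) - L(f, P) < eps for f(x) = x.

    On a uniform n-piece partition the increasing f(x) = x gives
    U - L = sum of (x_i - x_{i-1})^2 = 1/n, so any n > 1/eps suffices.
    """
    n = math.ceil(1 / eps) + 1
    return [i / n for i in range(n + 1)]

def gap_for_identity(partition):
    # exact U - L for f(x) = x: the endpoint values give M_i and m_i
    return sum((r - l) ** 2 for l, r in zip(partition, partition[1:]))
```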
HW 7.2.1 (True or False and why) (a) Suppose f : [0, 1] → R is integrable on [0, 1] and ∫_0^1 f = 0. Then for any partition of [0, 1], P, U(f, P) = L(f, P) = 0.
(b) Suppose f : [0, 1] → R is integrable on [0, 1] and ∫_0^1 f = 0. Then f(x) = 0 for all x ∈ [0, 1].
(c) Suppose f : [0, 1] → R is integrable on [0, 1]. Then f is continuous on [0, 1].
(d) Suppose f : [0, 1] → R is integrable on [0, 1] and f(x) > 0 for all x ∈ [0, 1]. It is possible that ∫_0^1 f < 0.
(e) Consider f : [0, 1] → R defined by f(x) = 1/(x − 1/2)², x ≠ 1/2, and f(1/2) = 0. The function f is integrable on [0, 1].
HW 7.2.2 Consider the function f(x) = x on [0, 1] (and recall HW 7.1.1).
(a) Show that ∫_0^1 f exists and equals 1/2.
(b) Find a partition of [0, 1] that satisfies Riemann's Theorem, Theorem 7.2.6, i.e. for any ǫ > 0 there exists a partition P such that U(f, P) − L(f, P) < ǫ.
HW 7.2.3 Consider the function

f(x) = { x if x ∈ Q ∩ [0, 1]
       { 0 if x ∈ I ∩ [0, 1]

(and recall HW 7.1.2). (a) Explain why the results of HW 7.1.2 do not directly show that ∫_0^1 f does not exist.
(b) Show that for any partition of [0, 1], P, U(f, P) > 1/2.
(c) Prove that ∫_0^1 f does not exist.
HW 7.2.4 Prove that f(x) = |x| is integrable on [−1, 1]. Find ∫_{−1}^1 f.

7.3 Some Topics in Integration
Now that we have a definition of the integral and Theorem 7.2.4 to help us, we could calculate some integrals by applying the A-R Theorem directly, as we did with the function f3 in Section 7.2. Instead we will use Theorems 7.2.4 and 7.2.6 to find some large classes of integrable functions. We begin by showing that monotonic functions are integrable.
Proposition 7.3.1 Suppose that f : [a, b] → R is a bounded, monotonic function. Then f is integrable on [a, b].
Proof: Before we start, let us emphasize the approach that we shall use. By
Theorem 7.2.4 if we can find an Archimedian sequence of partitions for f on
[a, b], then we know that the function f is integrable on [a, b]. Thus we work to
find the appropriate sequence of partitions.
Consider the sequence of partitions {Pn} defined for each n ∈ N by

Pn = {x0 = a, x1 = a + (1/n)(b−a), x2 = a + (2/n)(b−a), · · · , xn−1 = a + ((n−1)/n)(b−a), xn = b}.    (7.3.1)
We will use this sequence of partitions often. Notice that it is very regular in that the partition points are equally spaced throughout the interval.
Let us assume that f is monotonically increasing—we could prove the case for f monotonically decreasing in a similar way or consider the negative of the function and apply this result. Note that on any partition interval [xi−1, xi], the fact that f is monotonically increasing implies that mi = f(xi−1) and Mi = f(xi), i.e. xi−1 ≤ x ≤ xi implies that f(xi−1) ≤ f(x) ≤ f(xi).
Therefore

U(f, Pn) − L(f, Pn) = Σ_{i=1}^n Mi(xi − xi−1) − Σ_{i=1}^n mi(xi − xi−1)
    = Σ_{i=1}^n (Mi − mi)(xi − xi−1) = (1/n)(b − a) Σ_{i=1}^n [f(xi) − f(xi−1)]
    = (1/n)(b − a)[f(b) − f(a)] → 0
as n → ∞. Thus {Pn } is an Archimedian sequence of partitions for f on [a, b]
so we know that f is integrable on [a, b].
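The telescoping at the end of the proof says that for an increasing f and the uniform partition (7.3.1), U(f, Pn) − L(f, Pn) = (b − a)(f(b) − f(a))/n exactly. A quick check (our own, with f(x) = x³ on [0, 2] as the sample function):

```python
def gap_uniform_increasing(f, a, b, n):
    """Exact U(f, Pn) - L(f, Pn) for an increasing f on the uniform
    n-piece partition of [a, b]: on each subinterval m_i = f(x_{i-1})
    and M_i = f(x_i), exactly as in the proof of Proposition 7.3.1."""
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    return sum((f(xs[i]) - f(xs[i - 1])) * (xs[i] - xs[i - 1])
               for i in range(1, n + 1))
```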
As you will see, often the difficulty in the proof is to define the correct sequence of partitions. Since the expression "Archimedian sequence of partitions for f on [a, b]" is a tedious statement, from this time on we will just state that the sequence of partitions is "Archimedian" and assume that you know that it is for some f (the right f) and on some interval (the right interval).
We next prove the integrability of a very large and important class of functions, the continuous functions.
Proposition 7.3.2 If f : [a, b] → R is continuous on [a, b], then f is integrable
on [a, b].
Proof: Let ǫ > 0 be given. Clearly since f is continuous on [a, b], by Corollary
5.3.9 we know that f is bounded on [a, b]—thus it makes sense to consider
whether f is integrable on [a, b]. Consider the partition Pn = {x0 , x1 , · · · , xn }
where xi = a + i(b − a)/n, i = 0, · · · , n. Recall that by Proposition 5.5.6, f
continuous on [a, b] implies that f is uniformly continuous on [a, b]. Thus for any
ǫ/(b − a) > 0 there exists a δ such that |x − y| < δ implies that |f (x) − f (y)| <
ǫ/(b − a). Choose n so that (b − a)/n < δ. (Recall that we know we can find
such an n by Corollary 1.5.5-(b).)
Consider the partition interval [xi−1, xi]. By Theorem 5.3.8 we know that f assumes its absolute minimum and absolute maximum on [xi−1, xi]. Let si, ti ∈ [xi−1, xi] be points at which f assumes its absolute minimum and absolute maximum, respectively; note that mi = f(si) and Mi = f(ti). Because f is uniformly continuous, n was chosen so that (b − a)/n < δ, and since |ti − si| ≤ xi − xi−1 = (b − a)/n < δ, we have Mi − mi = f(ti) − f(si) < ǫ/(b − a). Then

U(f, Pn) − L(f, Pn) = Σ_{i=1}^n (Mi − mi)(xi − xi−1) < (ǫ/(b − a)) Σ_{i=1}^n (xi − xi−1) = ǫ.
Therefore by Theorem 7.2.6 f is integrable on [a, b].
We have several large classes of integrable functions. We next provide results
that let us expand our cache of integrable functions, allow us to manipulate the
integrals and compute some of the integrals. There are many integration results
that we could include. We will try to include the theorems that you have
probably already seen and used in your basic course—this usually implies that
they are important theorems—and some theorems that are useful for further
analysis results. We will surely miss some nice results but we will assume that
you will now be able to read and/or develop proofs for these results that we do
not include. We begin with the following proposition.
Proposition 7.3.3 Suppose f : [a, b] → R is integrable on [a, b]. Let c ∈ (a, b). Then f is integrable on [a, c] and [c, b], and ∫_a^b f = ∫_a^c f + ∫_c^b f.
Proof: Since f is integrable on [a, b], we know from Theorem 7.2.4 that there exists an Archimedian sequence of partitions of [a, b], {Pn}, so that U(f, Pn) − L(f, Pn) → 0. Suppose that Pn = {x0, x1, · · · , xn}. Then there will be some k so that xk ≤ c < xk+1. Define the three partitions Pn′ = Pn ∪ {c}, Pn^[a,c] = {x0, · · · , xk, c} and Pn^[c,b] = {c, xk+1, · · · , xn}—where of course, if xk = c, no new point is really added, and if xk < c, the one new point c is added. Of course these constructions are valid for each n.
We note that since Pn′ is a refinement of the partition Pn , by Proposition
7.1.2 and Lemma 7.1.5 0 ≤ U (f, Pn′ ) − L(f, Pn′ ) ≤ U (f, Pn ) − L(f, Pn ). Then by
the Sandwich Theorem, Proposition 3.4.2, U (f, Pn′ ) − L(f, Pn′ ) → 0 as n → ∞.
By the definition of Pn′, Pn^[a,c] and Pn^[c,b], we see that U(f, Pn′) = U(f, Pn^[a,c]) + U(f, Pn^[c,b]) and L(f, Pn′) = L(f, Pn^[a,c]) + L(f, Pn^[c,b]). Then because

0 ≤ U(f, Pn^[a,c]) − L(f, Pn^[a,c]) ≤ [U(f, Pn^[a,c]) − L(f, Pn^[a,c])] + [U(f, Pn^[c,b]) − L(f, Pn^[c,b])] = U(f, Pn′) − L(f, Pn′)

and

0 ≤ U(f, Pn^[c,b]) − L(f, Pn^[c,b]) ≤ [U(f, Pn^[c,b]) − L(f, Pn^[c,b])] + [U(f, Pn^[a,c]) − L(f, Pn^[a,c])] = U(f, Pn′) − L(f, Pn′),

by two applications of the Sandwich Theorem, Proposition 3.4.2, we get U(f, Pn^[a,c]) − L(f, Pn^[a,c]) → 0 and U(f, Pn^[c,b]) − L(f, Pn^[c,b]) → 0 as n → ∞. Therefore f is integrable on [a, c] and on [c, b].
And finally, since f is integrable on [a, b], [a, c] and [c, b]; L(f, Pn′) = L(f, Pn^[a,c]) + L(f, Pn^[c,b]); L(f, Pn′) → ∫_a^b f; L(f, Pn^[a,c]) → ∫_a^c f; and L(f, Pn^[c,b]) → ∫_c^b f, we see that

∫_a^b f = lim_{n→∞} L(f, Pn′) = lim_{n→∞} [L(f, Pn^[a,c]) + L(f, Pn^[c,b])] = ∫_a^c f + ∫_c^b f.
We next include several very basic, important results for computing integrals.
Proposition 7.3.4 Suppose that f, g : [a, b] → R are integrable on [a, b] and suppose that c, c1, c2 ∈ R. Then we have the following results.
(a) cf is integrable on [a, b] and ∫_a^b cf = c ∫_a^b f.
(b) f + g is integrable on [a, b] and ∫_a^b (f + g) = ∫_a^b f + ∫_a^b g.
(c) c1 f + c2 g is integrable on [a, b] and ∫_a^b (c1 f + c2 g) = c1 ∫_a^b f + c2 ∫_a^b g.
Proof: Since f and g are both integrable, there exists an Archimedian sequence for each of f and g. Let {Pn} be the common refinement of these two Archimedian sequences (refining a partition can only shrink U − L, so U(f, Pn) − L(f, Pn) → 0 and U(g, Pn) − L(g, Pn) → 0 as n → ∞). Suppose Pn = {x0, · · · , xn}.
(a) This is a very simple property. As you will see, though, the proof is tough. Hold on and read carefully. Let us begin by defining the following useful notation:
Mi^f = lub{f(x) : x ∈ [xi−1, xi]}, mi^f = glb{f(x) : x ∈ [xi−1, xi]}, Mi^cf = lub{cf(x) : x ∈ [xi−1, xi]} and mi^cf = glb{cf(x) : x ∈ [xi−1, xi]}. To prove part (a) we need a relationship between U(f, Pn) and U(cf, Pn), and between L(f, Pn) and L(cf, Pn).
Case 1: c ≥ 0. We begin by showing that Mi^cf = cMi^f for any i. If c = 0, then it is very easy since Mi^cf = mi^cf = cMi^f = cmi^f = 0. So we consider c > 0. Since Mi^f is an upper bound of {f(x) : x ∈ [xi−1, xi]}, surely cMi^f is an upper bound of {cf(x) : x ∈ [xi−1, xi]} and Mi^cf ≤ cMi^f (because Mi^cf is the least upper bound of the set). Likewise, since Mi^cf is an upper bound of {cf(x) : x ∈ [xi−1, xi]}, it is clear that Mi^cf/c is an upper bound of the set {f(x) : x ∈ [xi−1, xi]}. Thus Mi^f ≤ Mi^cf/c (since Mi^f is the least upper bound of the set) or cMi^f ≤ Mi^cf. Therefore cMi^f = Mi^cf. The proof that cmi^f = mi^cf is very similar—with glb's replacing lub's.

We then have U(cf, Pn) = cU(f, Pn), L(cf, Pn) = cL(f, Pn) and

U(cf, Pn) − L(cf, Pn) = c [U(f, Pn) − L(f, Pn)] → 0 as n → ∞.

Thus {Pn} is an Archimedian sequence for cf, cf is integrable and

∫_a^b cf = lim_{n→∞} L(cf, Pn) = c lim_{n→∞} L(f, Pn) = c ∫_a^b f.
Case 2: c < 0. When c < 0, the proofs that Mi^cf = cmi^f and mi^cf = cMi^f are very similar to the proofs given in Case 1, except that c < 0 reverses the inequalities, which interchanges the Mi's and the mi's. For example, since Mi^f = lub{f(x) : x ∈ [xi−1, xi]}, Mi^f ≥ f(x) for all x ∈ [xi−1, xi]. Thus cMi^f ≤ cf(x) (remember c < 0) for all x ∈ [xi−1, xi] and cMi^f ≤ mi^cf because mi^cf is the glb of the set {cf(x) : x ∈ [xi−1, xi]}. Also, since mi^cf ≤ cf(x) for all x ∈ [xi−1, xi], mi^cf/c ≥ f(x) for all x ∈ [xi−1, xi], i.e. mi^cf/c ≥ Mi^f (because Mi^f is the least upper bound of the set {f(x) : x ∈ [xi−1, xi]}) or mi^cf ≤ cMi^f. Thus mi^cf = cMi^f.

We then have

U(cf, Pn) = Σ_{i=1}^n Mi^cf (xi − xi−1) = Σ_{i=1}^n cmi^f (xi − xi−1) = cL(f, Pn),

L(cf, Pn) = Σ_{i=1}^n mi^cf (xi − xi−1) = Σ_{i=1}^n cMi^f (xi − xi−1) = cU(f, Pn)

and

U(cf, Pn) − L(cf, Pn) = cL(f, Pn) − cU(f, Pn) = (−c) [U(f, Pn) − L(f, Pn)] → 0.

Thus {Pn} is an Archimedian sequence for cf, cf is integrable and

∫_a^b cf = lim_{n→∞} L(cf, Pn) = c lim_{n→∞} U(f, Pn) = c ∫_a^b f.
(b) Note that for this proof it is important that we made {Pn} the common refinement of the Archimedian sequences for both f and g. We didn't need it for part (a) but we need it here. For this proof we define Mi^f, mi^f as we did in part (a), and Mi^g, mi^g, Mi^{f+g} and mi^{f+g} analogously. The technical inequalities that we need, L(f, Pn) + L(g, Pn) ≤ L(f + g, Pn) and U(f + g, Pn) ≤ U(f, Pn) + U(g, Pn), follow easily from the inequalities mi^f + mi^g ≤ mi^{f+g} and Mi^{f+g} ≤ Mi^f + Mi^g. For example, for x ∈ [xi−1, xi] we see that mi^f + mi^g ≤ f(x) + g(x). Thus mi^f + mi^g ≤ mi^{f+g} (because mi^{f+g} is the greatest lower bound of the set), which is one of the inequalities that we wanted to prove. The other inequality follows in the same manner.
Thus since both U (f, Pn ) − L(f, Pn ) → 0 and U (g, Pn ) − L(g, Pn ) → 0, we
have
U (f + g, Pn ) − L(f + g, Pn ) ≤ U (f, Pn ) + U (g, Pn ) − [L(f, Pn ) + L(g, Pn )]
= [U (f, Pn ) − L(f, Pn )] + [U (g, Pn ) − L(g, Pn )]
→ 0 as n → ∞.
Therefore {Pn} is an Archimedian sequence for f + g on [a, b], so f + g is integrable on [a, b], and lim_{n→∞} L(f + g, Pn) = lim_{n→∞} U(f + g, Pn) = ∫_a^b (f + g).
Since we have L(f, Pn )+L(g, Pn ) ≤ L(f +g, Pn ) ≤ U (f +g, Pn ) ≤ U (f, Pn )+
U (g, Pn ), taking the limit of all parts of the inequality as n → ∞ gives
∫_a^b f + ∫_a^b g ≤ ∫_a^b (f + g) ≤ ∫_a^b (f + g) ≤ ∫_a^b f + ∫_a^b g

(we don't really need the repeated ∫_a^b (f + g) in the inequality). Thus by the Sandwich Theorem, Proposition 3.4.2, ∫_a^b (f + g) = ∫_a^b f + ∫_a^b g.
(c) Part (c) follows from parts (a) and (b).
The above proof was not necessarily difficult but very technical. However,
we must realize that it is the proof of an important theorem.
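The scaling and additivity relations in the proof can be observed on sampled stand-ins for the sums (our own helper; the true sums use glb/lub, but the same identities hold for a min/max over fixed sample points):

```python
def sampled_sums(f, partition, samples=50):
    # Stand-ins for L(f, P) and U(f, P): min/max over a fixed grid on
    # each subinterval instead of the true glb/lub.
    lower = upper = 0.0
    for l, r in zip(partition, partition[1:]):
        ys = [f(l + (r - l) * j / samples) for j in range(samples + 1)]
        lower += min(ys) * (r - l)
        upper += max(ys) * (r - l)
    return lower, upper

P = [0.0, 0.3, 1.0, 1.5, 2.0]
f = lambda x: x * x - 1
g = lambda x: 3 - x

Lf, Uf = sampled_sums(f, P)
Lg, Ug = sampled_sums(g, P)
```

For c < 0 the roles of the upper and lower sums swap, exactly the bookkeeping of Case 2, and for f + g the sampled sums obey the same one-sided inequalities used in part (b).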
HW 7.3.1 (True or False and why) (a) If f : [a, b] → R is integrable on [a, b],
then |f | is integrable on [a, b].
(b) Suppose f : [a, b] → R. If |f | is integrable on [a, b], then f is integrable on
[a, b].
(c) If f, g : [0, 1] → R are not integrable on [0, 1], then f + g is not integrable on [0, 1].
(d) If f : [0, 1] → R is not integrable on [0, 1], then cf is not integrable on [0, 1] for c ∈ R.
(e) If f, g : [0, 1] → R is such that f is continuous on [0, 1] and g is strictly
increasing on [0, 1], then f + 2g is integrable on [0, 1].
HW 7.3.2 Define f : [0, 2] → R by

f(x) = { x if x ∈ [0, 1]
       { 3 if x ∈ (1, 2].
(a) Find an Archimedian sequence of partitions that shows that f is integrable on [0, 2]. Find ∫_0^2 f.
(b) Use the theorems of this section to prove that f is integrable on [0, 2].
(c) Prove that f is integrable on [1, 2].
HW 7.3.3 Suppose f, g : [a, b] → R are such that f is continuous on [a, b]
and g(x) = f (x) on [a, b] except for a finite number of points. Show that g is
integrable on [a, b] and ∫_a^b g = ∫_a^b f.
HW 7.3.4 Consider the function f : [−2, 2] → R defined by

f(x) = { −1 if x ∈ [−2, −1]
       { x  if x ∈ (−1, 1)
       { 1  if x ∈ [1, 2].

Prove that f is integrable on [−2, 2].
7.4 More Topics in Integration
There are many basic, important properties of integration—too many to fit into
one section. Thus this section is actually a continuation of the last section. The
next proposition that we include may at first seem very general and probably unfamiliar. The proof is tough, so pay attention. As you will see, the proposition will provide some very useful corollaries.
Proposition 7.4.1 Suppose f : [a, b] → R is integrable on [a, b] and φ : [c, d] →
R is continuous on [c, d] where f ([a, b]) ⊂ [c, d]. Then the function φ◦f : [a, b] →
R is integrable on [a, b].
Proof: We will work to find a partition of [a, b] that allows us to apply Riemann’s Theorem, Theorem 7.2.6. Let ǫ > 0 be given, let K = lub{|φ(y)| :
y ∈ [c, d]} and set ǫ1 = ǫ/(b − a + 2K) (we’ll see why this is the correct choice
of ǫ1 later). Since φ is continuous on [c, d], φ is uniformly continuous on [c, d].
So given ǫ1 > 0 there exists δ such that y1 , y2 ∈ [c, d] and |y1 − y2 | < δ implies
that |φ(y1 ) − φ(y2 )| < ǫ1 . Choose δ < ǫ1 —we can always make our δ smaller.
Since f is integrable on [a, b], by the Riemann Theorem, Theorem 7.2.6, there
exists a partition P such that U (f, P ) − L(f, P ) < δ 2 . (Theorem 7.2.6 said that
we could find such a partition P for any ǫ > 0—we’re using δ 2 in place of the ǫ
in the theorem.) Suppose P is given by P = {x0 , · · · , xn }. The partition P is
the partition we want to use in our application of Theorem 7.2.6 to show that
φ ◦ f is integrable on [a, b], i.e. we must show that U (φ ◦ f, P ) − L(φ ◦ f, P ) < ǫ.
We know that

U(f, P) − L(f, P) = Σ_{i=1}^n (Mi^f − mi^f)(xi − xi−1) < δ²    (7.4.1)
where we will use the notation of the last theorem: for any F, Mi^F = lub{F(x) : x ∈ [xi−1, xi]} and mi^F = glb{F(x) : x ∈ [xi−1, xi]}.
Since U(f, P) − L(f, P) must be "small", and since Mi^f − mi^f and xi − xi−1 are both greater than or equal to zero, each term (Mi^f − mi^f)(xi − xi−1) must be "small". There are two ways the terms (Mi^f − mi^f)(xi − xi−1) are made "small"—either Mi^f − mi^f is "small" or xi − xi−1 is "small".

Let S1 be the set of indices for which Mi^f − mi^f is "small", i.e. S1 = {i : 1 ≤ i ≤ n and Mi^f − mi^f < δ}. Let S2 = {i : 1 ≤ i ≤ n and Mi^f − mi^f ≥ δ}. Note that we have now partially defined what we mean by "small", and though we have defined the set S2 to be the set of indices on which Mi^f − mi^f is not "small", it is for these partition intervals that we had better have xi − xi−1 be small—because we have decided that Mi^f − mi^f is not.
We note that

U(φ ◦ f, P) − L(φ ◦ f, P) = Σ_{i=1}^n (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1)
    = Σ_{i∈S1} (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1) + Σ_{i∈S2} (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1).    (7.4.2)
We will handle the two sums separately.
For i ∈ S1: We note that for i ∈ S1 we have Mi^f − mi^f < δ. We are interested in Mi^{φ◦f} − mi^{φ◦f}. To aid us we prove the following two claims.

Claim 1: Mi^f − mi^f = lub{f(x) − f(y) : x, y ∈ [xi−1, xi]}.
Claim 2: Mi^{φ◦f} − mi^{φ◦f} = lub{φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}.

Proof of Claim 2: Recall that Mi^{φ◦f} = lub{φ ◦ f(x) : x ∈ [xi−1, xi]} and mi^{φ◦f} = glb{φ ◦ f(y) : y ∈ [xi−1, xi]}. By Proposition 1.5.3-(a), for every ǫ3 > 0 there exists φ ◦ f(x*) ∈ {φ ◦ f(x) : x ∈ [xi−1, xi]} such that Mi^{φ◦f} − φ ◦ f(x*) < ǫ3/2. Also by Proposition 1.5.3-(b) there exists φ ◦ f(y*) ∈ {φ ◦ f(y) : y ∈ [xi−1, xi]} such that φ ◦ f(y*) − mi^{φ◦f} < ǫ3/2. Then Mi^{φ◦f} − mi^{φ◦f} − [φ(f(x*)) − φ(f(y*))] < ǫ3, i.e. for ǫ3 > 0 there exists

φ ◦ f(x*) − φ ◦ f(y*) ∈ {φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}

so that Mi^{φ◦f} − mi^{φ◦f} − [φ(f(x*)) − φ(f(y*))] < ǫ3. Since Mi^{φ◦f} − mi^{φ◦f} is also an upper bound of this set, Proposition 1.5.3-(a) gives Mi^{φ◦f} − mi^{φ◦f} = lub{φ(f(x)) − φ(f(y)) : x, y ∈ [xi−1, xi]}.

The proof of Claim 1 is essentially the same—a bit easier.
By Claim 1 and the fact that Mi^f − mi^f < δ we see that for any x, y ∈ [xi−1, xi] we have |f(x) − f(y)| < δ. Then by the uniform continuity of φ, we have for any x, y ∈ [xi−1, xi] that |φ(f(x)) − φ(f(y))| < ǫ1. Therefore by Claim 2, Mi^{φ◦f} − mi^{φ◦f} ≤ ǫ1. Therefore we have

Σ_{i∈S1} (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1) ≤ ǫ1 Σ_{i∈S1} (xi − xi−1) ≤ ǫ1(b − a).    (7.4.3)
For i ∈ S2: We note that for i ∈ S2 we have Mi^f − mi^f ≥ δ. Note that Mi^{φ◦f} = lub{φ ◦ f(x) : x ∈ [xi−1, xi]} ≤ K (x ∈ [xi−1, xi] implies φ(f(x)) ≤ |φ(f(x))| ≤ K) and mi^{φ◦f} = glb{φ ◦ f(x) : x ∈ [xi−1, xi]} ≥ −K (x ∈ [xi−1, xi] implies φ(f(x)) ≥ −|φ(f(x))| ≥ −K). Then Mi^{φ◦f} − mi^{φ◦f} ≤ 2K.
i
If for each i ∈ S2 we multiply both sides of the inequality Mif − mfi ≥ δ by
P
f
f
(xP
i − xi−1 ) (all positive) and sum over S2 , we get
i∈S2 (Mi − mi )(xi − xi−1 ) ≥
δ i∈S2 (xi − xi−1 ) or
δ
X
i∈S2
(xi − xi−1 ) ≤
(by (7.4.1)) or
P
X
i∈S2
(Mif − mfi )(xi − xi−1 ) ≤
i∈S2 (xi
+
i∈S2
i=1
(Mif − mfi )(xi − xi−1 ) < δ 2
− xi−1 ) < δ. Thus
U(φ ◦ f, P) − L(φ ◦ f, P) = Σ_{i∈S1} (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1) + Σ_{i∈S2} (Mi^{φ◦f} − mi^{φ◦f})(xi − xi−1)
    ≤ ǫ1(b − a) + 2K Σ_{i∈S2} (xi − xi−1) < ǫ1(b − a) + 2Kδ < ǫ1(b − a + 2K) = ǫ.

Therefore by Theorem 7.2.6 the function φ ◦ f is integrable on [a, b].
That too was a very technical proof. As we said earlier Proposition 7.4.1 is
useful for its corollaries. We begin with the following.
Corollary 7.4.2 Suppose f, g : [a, b] → R are integrable on [a, b]. Then
(a) f² is integrable on [a, b],
(b) f^n is integrable on [a, b] for any n ∈ N,
(c) |f| is integrable on [a, b], and
(d) fg is integrable on [a, b].
Proof: The proof of part (a) consists of letting φ(y) = y²—which is surely continuous everywhere—and applying Proposition 7.4.1. The proof of part (b) is a reasonably nice induction proof using part (a). To obtain part (c) we again use Proposition 7.4.1, this time with φ(y) = |y|. And part (d) follows by noting that fg = (1/4)[(f + g)² − (f − g)²] and using Proposition 7.3.4 along with part (a) of this corollary.
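The identity behind part (d) is easy to sanity-check numerically (our own snippet, nothing more than the algebra fg = (1/4)[(f + g)² − (f − g)²]):

```python
import math

def product_via_squares(f, g):
    """The identity fg = (1/4)[(f + g)^2 - (f - g)^2] used in the proof
    of Corollary 7.4.2-(d): it reduces a product to sums and squares,
    which parts (a) and (b) of the corollary already handle."""
    return lambda x: 0.25 * ((f(x) + g(x)) ** 2 - (f(x) - g(x)) ** 2)

h = product_via_squares(math.sin, math.cos)
```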
We should realize that we can let φ be any continuous function on [c, d]—
which will give us a lot of different integrable composite functions. For example,
if we let φ(x) = cx we see that the function cf is integrable if f is integrable—part (a) of Proposition 7.3.4. We next prove two reasonably easy propositions that
will give us a lot of interesting, integrable functions.
Proposition 7.4.3 (a) Suppose that f : [a, b] → R is integrable on [a, c] and [c, b] for c ∈ (a, b). Then f is integrable on [a, b] and ∫_a^b f = ∫_a^c f + ∫_c^b f.
(b) Suppose that f : [c0, ck+1] → R, c1, · · · , ck satisfy c0 < c1 < · · · < ck < ck+1 and f is integrable on [cj−1, cj] for j = 1, · · · , k + 1. Then f is integrable on [c0, ck+1] and ∫_{c0}^{ck+1} f = Σ_{j=1}^{k+1} ∫_{cj−1}^{cj} f.
Proof: (a) Since f is integrable on [a, c] and [c, b], by Theorem 7.2.6 for any ǫ > 0 there exists a partition P1 of [a, c] and a partition P2 of [c, b] such that U(f, P1) − L(f, P1) < ǫ/2 and U(f, P2) − L(f, P2) < ǫ/2. Let P be the partition P = P1 ∪ P2. Clearly P is a partition of [a, b] and U(f, P) − L(f, P) = [U(f, P1) − L(f, P1)] + [U(f, P2) − L(f, P2)] < ǫ. Therefore by Theorem 7.2.6 the function f is integrable on [a, b]. We can then apply Proposition 7.3.3 to see that ∫_a^b f = ∫_a^c f + ∫_c^b f.
(b) Part (b) follows from part (a) using mathematical induction.
Proposition 7.4.4 Suppose f : [a, b] → R is bounded on [a, b] and continuous
on (a, b). Then f is integrable on [a, b].
Proof: Since f is bounded on [a, b], there exists some K such that |f (x)| ≤ K
for x ∈ [a, b]. Let x0 = a, x1 = a + (b − a)/n (for some n, yet to be determined),
xn−1 = b − (b − a)/n and xn = b. Note that lub{f (x) : x ∈ [x0 , x1 ]} ≤ K
(for x ∈ [x0 , x1 ], f (x) ≤ |f (x)| ≤ K), glb{f (x) : x ∈ [x0 , x1 ]} ≥ −K (for
x ∈ [x0 , x1 ], f (x) ≥ −|f (x)| ≥ −K), lub{f (x) : x ∈ [xn−1 , xn ]} ≤ K, and
glb{f (x) : x ∈ [xn−1 , xn ]} ≥ −K. Thus M1 − m1 ≤ 2K and Mn − mn ≤ 2K.
Now suppose we are given ǫ > 0. Choose n above so that 4K(b − a)/n < ǫ/2.
Since f is continuous on (a, b), we know that f is continuous on [x1 , xn−1 ]. Then
by Riemann’s Theorem, Theorem 7.2.6, we know that there exists a partition
P ∗ of [x1 , xn−1 ] such that U (f, P ∗ ) − L(f, P ∗ ) < ǫ/2. Write the partition
P ∗ as P ∗ = {x1 , x2 , · · · , xn−1 }. Let P = {x0 , x1 , · · · , xn−1 , xn }, i.e. P =
P ∗ ∪ {x0 , xn }. Then clearly P is a partition of [a, b] and
U (f, P ) − L(f, P ) = (M1 − m1 )(x1 − a) + U (f, P ∗ ) − L(f, P ∗ ) + (Mn − mn )(b − xn−1 )
< 2K(b − a)/n + ǫ/2 + 2K(b − a)/n < ǫ.
Then using Theorem 7.2.6 again we get that f is integrable on [a, b].
It might not be clear how large a range of integrable functions this result produces.
If a < c1 < · · · < ck < b, and f is continuous on (a, c1 ), (ck , b) and (cj−1 , cj ) for
j = 2, · · · , k, then we say that f is piecewise continuous on [a, b]. If we assume
that f is defined at a, c1 ,· · · ,ck ,b (which forces f to be bounded on [a, b]), then
by Propositions 7.4.3 and 7.4.4 we see that f is integrable on [a, b] and
∫_a^b f = ∫_a^{c1} f + Σ_{j=2}^k ∫_{cj−1}^{cj} f + ∫_{ck}^b f.
Also if we have the same setting a < c1 < · · · < ck < b and S is defined to be constant on each open interval (a, c1), (ck, b) and (cj−1, cj), j = 2, · · · , k, then S is said to be a step function. If, in addition, S is defined at the points a, c1, · · · , ck, b, then S is piecewise continuous on [a, b] and is integrable on [a, b]. Thus we have a lot of integrable functions, a bit more interesting than just continuous functions. In fact, we note that the nasty function

f(x) = { sin(1/x) if x ≠ 0
       { 0        if x = 0

is integrable on [0, 1]. Note also that f is integrable on [−1, 1].
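A step function's integral reduces, via Propositions 7.4.3 and 7.4.4, to a finite sum of constants times interval lengths. A minimal sketch (our own helper; the list layout is an assumption, not notation from the text):

```python
def integrate_step(breakpoints, values):
    """Integral of a step function S over [c0, c_{k+1}].

    breakpoints = [c0, c1, ..., c_{k+1}] with c0 < c1 < ... < c_{k+1};
    values[j] is the constant S takes on (c_j, c_{j+1}).  The values of
    S at the finitely many breakpoints do not affect the integral.
    """
    assert len(values) == len(breakpoints) - 1
    return sum(v * (r - l)
               for v, l, r in zip(values, breakpoints, breakpoints[1:]))
```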
We next prove an intuitive result that, as we will see later, is a useful tool.
Proposition 7.4.5 Suppose that f, g : [a, b] → R are integrable on [a, b] and satisfy f(x) ≤ g(x) for all x ∈ [a, b]. Then ∫_a^b f ≤ ∫_a^b g.
Proof: Since f and g are integrable, they both have Archimedean sequences of partitions on [a, b]. Let {Pn} be the common refinement of both sequences, i.e. {Pn} satisfies U(f, Pn) → ∫_a^b f and U(g, Pn) → ∫_a^b g as n → ∞. Since f(x) ≤ g(x) on [a, b], f(x) ≤ g(x) on each of the partition intervals [xi−1, xi], so we get Mi^f ≤ Mi^g (using notation that we defined in the proof of part (b) of Proposition 7.3.4) and hence U(f, Pn) ≤ U(g, Pn). If we then let n → ∞, we get ∫_a^b f ≤ ∫_a^b g.
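The monotonicity used in this proof is easy to observe numerically. The following sketch is our own illustration, not part of the formal development: the helper `darboux_sums` approximates U(f, P) and L(f, P) on a uniform partition by sampling each partition interval densely, an assumption that is adequate for the smooth functions used here.

```python
# Approximate Darboux sums on a uniform partition by dense sampling.
# Illustrates Proposition 7.4.5: f <= g forces U(f, P) <= U(g, P).
def darboux_sums(f, a, b, n, samples=200):
    """Approximate (U(f, P), L(f, P)) for the uniform n-interval partition."""
    width = (b - a) / n
    upper = lower = 0.0
    for i in range(n):
        left = a + i * width
        vals = [f(left + j * width / samples) for j in range(samples + 1)]
        upper += max(vals) * width   # approximates M_i (the lub on the interval)
        lower += min(vals) * width   # approximates m_i (the glb on the interval)
    return upper, lower

f = lambda x: x * x
g = lambda x: x * x + 0.25          # g(x) >= f(x) on [0, 1]
Uf, Lf = darboux_sums(f, 0.0, 1.0, 100)
Ug, Lg = darboux_sums(g, 0.0, 1.0, 100)
assert Uf <= Ug and Lf <= Lg        # the comparisons used in the proof
```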
We next combine Propositions 7.3.4 and 7.4.5 to obtain the following important result.
Proposition 7.4.6 Suppose that f : [a, b] → R is integrable on [a, b].
(a) Then |∫_a^b f| ≤ ∫_a^b |f|.
(b) If |f(x)| ≤ M for all x ∈ [a, b], then |∫_a^b f| ≤ M(b − a).

Proof: (a) We note that by the definition of absolute value we have −|f(x)| ≤ f(x) ≤ |f(x)|. By Corollary 7.4.2-(c) we know that f integrable implies that |f| is integrable. Then by Proposition 7.4.5 we have ∫_a^b (−|f|) ≤ ∫_a^b f ≤ ∫_a^b |f|. Applying Proposition 7.3.4 gives −∫_a^b |f| ≤ ∫_a^b f ≤ ∫_a^b |f|, which gives us |∫_a^b f| ≤ ∫_a^b |f|.
(b) Before we start, let us emphasize that |f(x)| ≤ M is not really an additional hypothesis. Because f is already assumed to be integrable, we know f must be bounded. We are now just saying that it is bounded by M.
If we apply part (a) of this proposition and Proposition 7.4.5, we get |∫_a^b f| ≤ ∫_a^b |f| ≤ ∫_a^b M = (b − a)M, which is what we were to prove.
Everything we have done with respect to integrals has been over a range a
to b where a < b. It is convenient and necessary to have integral results for
integrals from d to c with d ≥ c—you probably have already used such integrals
in your basic course. To allow for this we make the following definition.
Definition 7.4.7 Suppose f : [a, b] → R is integrable on [a, b]. Let c, d ∈ [a, b] be such that c < d. We define
(a) ∫_c^c f = 0, and
(b) ∫_d^c f = −∫_c^d f.
We then have a variety of results that "fix up" our previous results, now allowing for the integrals defined in Definition 7.4.7. The results generally follow previous analogous results for integrals ∫_c^d f with c < d. We include some of these results in the following proposition.
Proposition 7.4.8 Suppose f, g : [a, b] → R are integrable on [a, b]. We then have the following results.
(a) For x1, x2, x3 ∈ [a, b], ∫_{x1}^{x3} f = ∫_{x1}^{x2} f + ∫_{x2}^{x3} f.
(b) If x1, x2 ∈ [a, b] and f(x) ≤ g(x) for x ∈ [a, b], then ∫_{x1}^{x2} f ≤ ∫_{x1}^{x2} g if x1 ≤ x2, and ∫_{x1}^{x2} f ≥ ∫_{x1}^{x2} g if x1 > x2.
(c) If x1, x2 ∈ [a, b], then |∫_{x1}^{x2} f| ≤ |∫_{x1}^{x2} |f||.
(d) If x1, x2 ∈ [a, b] and |f(x)| ≤ M for x ∈ [a, b], then |∫_{x1}^{x2} f| ≤ M|x2 − x1|.
Proof: (a) We note that if x1 < x2 < x3, this result is the same as Proposition 7.3.3. Suppose instead that x3 < x1 < x2. Then

∫_{x3}^{x2} f = ∫_{x3}^{x1} f + ∫_{x1}^{x2} f, or −∫_{x2}^{x3} f = −∫_{x1}^{x3} f + ∫_{x1}^{x2} f, or ∫_{x1}^{x3} f = ∫_{x1}^{x2} f + ∫_{x2}^{x3} f,

which is what we wanted to prove. The results for the other orderings of x1, x2 and x3 follow in the same manner.

(b) The first part of (b) is the same as Proposition 7.4.5. If x2 < x1, using Proposition 7.4.5 again gives ∫_{x2}^{x1} f ≤ ∫_{x2}^{x1} g. Using Definition 7.4.7 this can be rewritten as −∫_{x1}^{x2} f ≤ −∫_{x1}^{x2} g, which is equivalent to the inequality that we must prove.

(c) First note that if x1 < x2, the inequality |∫_{x1}^{x2} f| ≤ |∫_{x1}^{x2} |f|| follows from Proposition 7.4.6-(a)—and the outer set of absolute value signs is not necessary. If x1 = x2, the inequality is trivial because the values on both sides of the inequality are zero. And finally, if x1 > x2, then

|∫_{x1}^{x2} f| = |−∫_{x2}^{x1} f| = |∫_{x2}^{x1} f| ≤ ∫_{x2}^{x1} |f|

by Proposition 7.4.6-(a). Then since ∫_{x2}^{x1} |f| = −∫_{x1}^{x2} |f| = |∫_{x1}^{x2} |f||, we have the desired result.

(d) If x1 < x2, this result follows from Proposition 7.4.6-(b). If x1 = x2, both sides of the inequality are zero—so the result is true. If x1 > x2, we apply Proposition 7.4.6-(b) to get |∫_{x2}^{x1} f| ≤ M(x1 − x2). Then

|∫_{x1}^{x2} f| = |−∫_{x2}^{x1} f| = |∫_{x2}^{x1} f| ≤ M(x1 − x2) = M|x2 − x1|.
There may be some other integration results that must be adjusted to allow for arbitrary limits of integration; from this point on we will assume that you are able to make those adjustments.
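As one quick illustration, the sign convention of Definition 7.4.7 is easy to mechanize. The sketch below is our own toy, not the text's construction: `integral` is a simple midpoint-rule approximation with the orientation rules bolted on, and we check Proposition 7.4.8-(a) with the three points deliberately out of order.

```python
# A signed integral respecting Definition 7.4.7, built on a midpoint rule.
def integral(f, c, d, n=10_000):
    if c == d:
        return 0.0                    # Definition 7.4.7-(a)
    if c > d:
        return -integral(f, d, c, n)  # Definition 7.4.7-(b)
    h = (d - c) / n
    return sum(f(c + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x * x
# Proposition 7.4.8-(a) with the points out of order: x3 < x1 < x2.
x1, x2, x3 = 0.5, 2.0, -1.0
lhs = integral(f, x1, x3)
rhs = integral(f, x1, x2) + integral(f, x2, x3)
assert abs(lhs - rhs) < 1e-6
```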
HW 7.4.1 (True or False and why) (a) Suppose f : [a, b] → R is integrable on [a, b]. Then the function g defined by g(x) = sin f(x), x ∈ [a, b], is integrable on [a, b].
(b) Suppose f : [a, b] → R is integrable on [a, b], and x1, x2 ∈ [a, b]. Then |∫_{x1}^{x2} f| ≤ ∫_{x1}^{x2} |f|.
(c) Suppose f : [a, b] → R is such that f(x) > 0 for x ∈ [a, b]. Then ∫_a^b f > 0.
(d) Suppose f : [a, b] → R is integrable on [a, b] and such that f(x) > 0 for x ∈ [a, b]. Then ∫_a^b f > 0.
(e) Suppose f : [a, b] → R is integrable on [a, b] and there exists a c > 0 such that f(x) ≥ c for all x ∈ [a, b]. Then 1/f is integrable on [a, b].
HW 7.4.2 Recall that we earlier defined the functions

f1(x) = { sin(1/x) if x ≠ 0
        { 0        if x = 0,

f2(x) = { x sin(1/x) if x ≠ 0
        { 0          if x = 0,

and

f3(x) = { x² sin(1/x) if x ≠ 0
        { 0           if x = 0.

Which of the functions f1, f2 and f3 are integrable on [−1, 1]? Prove it.
HW 7.4.3 Suppose f : [0, 1] → R is continuous on [0, 1] and that ∫_0^1 f > 0. Prove that there exists an interval (α, β) ⊂ [0, 1] such that f(x) > 0 for x ∈ (α, β).
HW 7.4.4 Suppose f, g : [a, b] → R are integrable on [a, b]. Prove that ∫_a^b |f + g| ≤ ∫_a^b |f| + ∫_a^b |g|.
HW 7.4.5 Suppose f : [a, b] → R, g : [c, d] → R are such that f([a, b]) ⊂ [c, d], f is continuous on [a, b] and g is integrable on [c, d]. Prove or disprove that g ∘ f is integrable on [a, b].
7.5 The Fundamental Theorem of Calculus
So far we have defined the integral and derived properties of integrals. There are many applications of integration, and if we were not able to compute integrals, these applications would not be very useful—at least not until numerical integration became routine. We all know that there are methods for computing integrals. In this section we state and prove the Fundamental Theorem of Calculus and some related results—the theorems that will allow us to compute integrals. One might guess that a theorem with a name like "the Fundamental Theorem" of anything might be important. So read carefully.
We consider a function f : [a, b] → R that is integrable on [a, b]. We note that by Proposition 7.3.3, f is also integrable on [a, x] for any x, a < x ≤ b. Define F : [a, b] → R by F(x) = ∫_a^x f. We will use this notation throughout this section. We begin with our first result, which gives us a very basic property of F.
Proposition 7.5.1 If f : [a, b] → R is integrable on [a, b], then F is uniformly
continuous on [a, b].
Proof: Suppose x, y ∈ [a, b]. Then

F(y) − F(x) = ∫_a^y f − ∫_a^x f = ∫_a^x f + ∫_x^y f − ∫_a^x f (by Prop. 7.4.8-(a)) = ∫_x^y f.

Since f is integrable, we know that f is bounded, i.e. there exists K ∈ R such that |f(x)| ≤ K for x ∈ [a, b]. Then by Proposition 7.4.8-(d), |F(y) − F(x)| = |∫_x^y f| ≤ K|y − x|. Thus, given any ε > 0 we can choose δ = ε/K and see that |y − x| < δ implies that |F(y) − F(x)| < ε. Therefore F is uniformly continuous on [a, b].
We next add a hypothesis that makes f a bit nicer, and we see that it makes
F nicer. You could say that this result shows how integration is the reverse
operation of differentiation.
Proposition 7.5.2 Suppose f : [a, b] → R is integrable on [a, b]. If f is continuous at x = c ∈ [a, b], then F′(c) exists and F′(c) = f(c).
Proof: We want to proceed in the obvious way and consider (F(x) − F(c))/(x − c). We begin by noting that F(x) − F(c) = ∫_a^x f − ∫_a^c f = ∫_c^x f. Also we note that since f(c) is a constant (c is a fixed point in [a, b]), we have f(c) = f(c) · (1/(x − c)) ∫_c^x 1 = (1/(x − c)) ∫_c^x f(c) for x ≠ c. Then for x ≠ c we have

|(F(x) − F(c))/(x − c) − f(c)| = |(1/(x − c)) ∫_c^x f − (1/(x − c)) ∫_c^x f(c)|
                               = |(1/(x − c)) ∫_c^x (f − f(c))| = (1/|x − c|) |∫_c^x [f − f(c)]|.   (7.5.1)

We assume that we have an ε > 0 given. By the continuity of f at x = c we get a δ so that |x − c| < δ and x ∈ [a, b] imply that |f(x) − f(c)| < ε. Then if we consider x satisfying 0 < |x − c| < δ and x ∈ [a, b], return to equation (7.5.1), and apply Proposition 7.4.8-(d), we see that

|(F(x) − F(c))/(x − c) − f(c)| = (1/|x − c|) |∫_c^x [f − f(c)]| ≤ (1/|x − c|) ε|x − c| = ε.   (7.5.2)

Thus for a given ε > 0 we have a δ such that 0 < |x − c| < δ and x ∈ [a, b] imply that |(F(x) − F(c))/(x − c) − f(c)| ≤ ε. Therefore lim_{x→c} (F(x) − F(c))/(x − c) = f(c), i.e. F′(c) = f(c).

Note that continuity gave us |f(x) − f(c)| < ε for all x satisfying |x − c| < δ. When we used this inequality in equation (7.5.2) we only used it for 0 < |x − c| < δ. This is all we can use in the definition of a derivative (a limit), and it is fine to use less than what we have.
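Proposition 7.5.2 can be watched in action numerically: approximate F by quadrature and check that its difference quotients settle on f(c). The quadrature helper below is an assumption of this sketch, not the text's machinery.

```python
import math

# F(x) = int_a^x f, approximated by a midpoint rule.
def F(f, a, x, n=20_000):
    h = (x - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f, a, c = math.cos, 0.0, 1.0
for dx in (1e-2, 1e-3):
    quotient = (F(f, a, c + dx) - F(f, a, c)) / dx
    # the difference quotient approaches f(c), i.e. F'(c) = f(c)
    assert abs(quotient - f(c)) < 10 * dx
```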
We now return to our basic calculus course and define the antiderivative of
a function.
Definition 7.5.3 Consider some interval I ⊂ R and f : I → R. If the function
F is such that F ′ (x) = f (x) for all x ∈ I, then F is said to be the antiderivative
of f on I.
We then have the following theorem, the Fundamental Theorem of Calculus.

Theorem 7.5.4 Suppose f : [a, b] → R is continuous on [a, b]. Then ℱ : [a, b] → R satisfies ℱ(x) − ℱ(a) = ∫_a^x f if and only if ℱ is the antiderivative of f on [a, b].

Proof: (⇒) We assume that there is a function ℱ such that ℱ(x) − ℱ(a) = ∫_a^x f. Using the notation of this section we can rewrite this expression as ℱ(x) − ℱ(a) = F(x), and this expression holds for all x ∈ [a, b]. Then since by Proposition 7.5.2 F is differentiable on [a, b], we know that ℱ is differentiable on [a, b] (ℱ(a) is a constant). Also, by Proposition 7.5.2 and the fact that the derivative of a constant is zero, we see that ℱ′(x) = F′(x) = f(x). Thus ℱ is the antiderivative of f.

(⇐) If ℱ : [a, b] → R is such that ℱ′(x) = f(x) for x ∈ [a, b], we know that ℱ′(x) = F′(x). By Corollary 6.3.5 there exists a C ∈ R such that ℱ(x) = F(x) + C. Since F(a) = 0, we evaluate the last expression at x = a to see that ℱ(a) = F(a) + C = C. Thus we have ℱ(x) = F(x) + ℱ(a), or ℱ(x) − ℱ(a) = F(x) = ∫_a^x f, which is what we were to prove.
If we let x = b, we get the following corollary that looks more like the result
we applied so often in our basic course.
Corollary 7.5.5 Suppose f : [a, b] → R is continuous on [a, b] and ℱ is the antiderivative of f on [a, b]. Then ℱ(b) − ℱ(a) = ∫_a^b f.
Thus, as we have done so often before, we evaluate

∫_0^1 (x² + x + 1) = [x³/3 + x²/2 + x]_0^1 = (1/3 + 1/2 + 1) − 0 = 11/6 = ℱ(1) − ℱ(0),

where ℱ(x) = x³/3 + x²/2 + x.
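The arithmetic above is easy to double-check with Riemann sums; the midpoint helper here is our own hypothetical scaffolding, not anything from the text.

```python
# Check that Riemann sums of x^2 + x + 1 over [0, 1] approach 11/6.
def midpoint(f, a, b, n):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

approx = midpoint(lambda x: x * x + x + 1, 0.0, 1.0, 1000)
assert abs(approx - 11 / 6) < 1e-6
```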
We next include several very nice results, the first two of which are by-products of our previous results. We begin with integration by parts. Integration by parts is usually presented as a technique for evaluating integrals—integrals that we cannot evaluate using easier methods. However, with the advent of computer and calculator algebra systems, integration by parts is not as necessary as an integration technique as it was in the past. It is, however, an important tool in analysis, as we shall see in the next chapter. We proceed with the following result.
Proposition 7.5.6 (Integration by Parts) Suppose f, g : [a, b] → R are differentiable on [a, b] and are such that f′, g′ : [a, b] → R are continuous on [a, b] (f and g are continuously differentiable on [a, b]). Then

∫_a^b f′g = [f(b)g(b) − f(a)g(a)] − ∫_a^b fg′.
Proof: The proof of the integration by parts formula is very easy. We all remember that the product rule for differentiation gives us (d/dx)(fg) = f′g + fg′. We integrate both sides of this equality, use Corollary 7.5.5, and get

∫_a^b (d/dx)(fg) = f(b)g(b) − f(a)g(a) = ∫_a^b f′g + ∫_a^b fg′.

Rearranged, this is the formula for integration by parts.
Thus we are now able to evaluate such integrals as ∫ x sin x, ∫ x arctan x, etc. We cannot use integration by parts on ∫ x e^x or ∫ x ln x yet, because we have still not introduced the exponential and logarithm functions—but we will be able to do those soon.
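The integration by parts formula itself can be sanity-checked numerically. The sketch below uses f(x) = x and g(x) = −cos x on [0, π], so that fg′ = x sin x, one of the integrals mentioned above; the quadrature helper is an assumption of the sketch.

```python
import math

def midpoint(f, a, b, n=50_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, math.pi
f  = lambda x: x
fp = lambda x: 1.0               # f'
g  = lambda x: -math.cos(x)
gp = lambda x: math.sin(x)       # g'

lhs = midpoint(lambda x: fp(x) * g(x), a, b)
rhs = (f(b) * g(b) - f(a) * g(a)) - midpoint(lambda x: f(x) * gp(x), a, b)
assert abs(lhs - rhs) < 1e-6     # both sides of Proposition 7.5.6 agree
assert abs(midpoint(lambda x: x * math.sin(x), a, b) - math.pi) < 1e-6
```

In particular the last line confirms the classic evaluation of ∫_0^π x sin x as π.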
Another technique that is a common tool for the evaluation of integrals is substitution. Substitution is a very important result—for the evaluation of integrals and for the general manipulation of integrals in all sorts of applications. We include the following result.
Proposition 7.5.7 (Substitution) Suppose that φ : [a, b] → R is continuously differentiable on [a, b] and that f : φ([a, b]) → R is continuous on φ([a, b]). Then

∫_a^b (f ∘ φ) φ′ = ∫_{φ(a)}^{φ(b)} f.
Proof: Before we start our work we might note that we could write the result of the above proposition as ∫_a^b f(φ(t))φ′(t) dt = ∫_{φ(a)}^{φ(b)} f(x) dx. Substitution is one of the places where this latter notation is very nice—it reminds you that you are making the substitution x = φ(t) in the integral ∫_{φ(a)}^{φ(b)} f(x) dx.

We begin by noting that since φ is continuous on the interval [a, b], φ attains both its maximum and minimum on [a, b] (Theorem 5.3.8), i.e. there exist xM, xm ∈ [a, b] such that φ(xM) is the maximum value of φ on [a, b] and φ(xm) is the minimum value of φ on [a, b]. Suppose for convenience that xm < xM. For any y0 ∈ [φ(xm), φ(xM)], by the IVT, Theorem 5.4.1, we know that there is an x0 ∈ [xm, xM] such that φ(x0) = y0. Thus φ([a, b]) is an interval.

Now let c = φ(a) and d = φ(b). We note that since φ([a, b]) is an interval, φ([a, b]) will contain the interval with endpoints c and d—either [c, d] or [d, c]. Define F : φ([a, b]) → R by F(x) = ∫_c^x f, and define h : [a, b] → R by h = F ∘ φ. Then by the Chain Rule, Proposition 6.1.4, and Proposition 7.5.2, we see that h′(x) = F′(φ(x))φ′(x) = f(φ(x))φ′(x). If we integrate both sides of this last expression and apply Corollary 7.5.5, we get

∫_a^b h′ = h(b) − h(a) = ∫_a^b f(φ(x))φ′(x) dx.   (7.5.3)

Since h(a) = F(φ(a)) = F(c) = 0 and h(b) = F(φ(b)) = F(d) = ∫_c^d f, equation (7.5.3) becomes ∫_c^d f = ∫_a^b f(φ(x))φ′(x) dx, or ∫_{φ(a)}^{φ(b)} f = ∫_a^b (f ∘ φ) φ′, which is what we were to prove.
Thus we can now consider an integral such as ∫_0^{1/2} 1/√(1 − x²). We choose φ(θ) = sin θ. Then φ(0) = 0 and φ(π/6) = 1/2. Thus

∫_{φ(0)}^{φ(π/6)} 1/√(1 − x²) = ∫_0^{π/6} (1/√(1 − φ(θ)²)) φ′(θ) = ∫_0^{π/6} cos θ/√(1 − sin² θ) = ∫_0^{π/6} 1 = π/6.
We next include a theorem that may not be familiar to you. We will find it useful in the next chapter, and it can be used in a variety of interesting ways. The theorem is called the Mean Value Theorem for Integrals.
Theorem 7.5.8 (Mean Value Theorem for Integrals) Suppose that f : [a, b] → R is continuous on [a, b], and p : [a, b] → R is integrable on [a, b] and such that p(x) ≥ 0 for x ∈ [a, b]. Then there exists c ∈ [a, b] such that

∫_a^b fp = f(c) ∫_a^b p.   (7.5.4)
Proof: We know from Corollary 7.4.2-(d) that since f and p are integrable on [a, b], fp is integrable on [a, b]. We also know from Theorem 5.3.8 that f assumes its maximum and minimum on [a, b], i.e. there exist m, M ∈ R such that m ≤ f(x) ≤ M for x ∈ [a, b], and there exist xm, xM ∈ [a, b] such that f(xm) = m and f(xM) = M. Because p(x) ≥ 0 we also know that mp(x) ≤ f(x)p(x) ≤ Mp(x) for all x ∈ [a, b]. Therefore

m ∫_a^b p ≤ ∫_a^b fp ≤ M ∫_a^b p.   (7.5.5)

If ∫_a^b p = 0, then ∫_a^b fp = 0 and we can choose any c ∈ [a, b] to satisfy equation (7.5.4). Otherwise we rewrite inequality (7.5.5) as

f(xm) = m ≤ (∫_a^b fp)/(∫_a^b p) ≤ M = f(xM).

Then by the Intermediate Value Theorem, Theorem 5.4.1 (applied on either [xm, xM] or [xM, xm], depending on whether xm ≤ xM or xM < xm), there exists c between xm and xM such that f(c) = (∫_a^b fp)/(∫_a^b p), which is the same as (7.5.4).
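A numerical illustration of Theorem 7.5.8 (our own sketch; the helper and the choices f(x) = x², p(x) = x are assumptions): the weighted mean of f lies between its minimum and maximum, so the continuous f attains it at some c.

```python
def midpoint(f, a, b, n=20_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x * x          # continuous on [0, 1]
p = lambda x: x              # integrable weight with p >= 0

mean = midpoint(lambda x: f(x) * p(x), 0.0, 1.0) / midpoint(p, 0.0, 1.0)
assert 0.0 <= mean <= 1.0    # between min f = 0 and max f = 1 on [0, 1]
c = mean ** 0.5              # for f(x) = x^2 we can solve f(c) = mean directly
assert 0.0 <= c <= 1.0 and abs(f(c) - mean) < 1e-12
```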
HW 7.5.1 (True or False and why) (a) Suppose f : [a, b] → R is continuous on [a, b]. Then (d/dx)[∫_x^b f] = −f(x).
(b) Suppose f : [a, b] → R is continuous on [a, b]. Then (d/dx)[∫_x^a f] = −f(x).
(c) Consider f : [−2, 2] → R defined by

f(x) = { −2 if x ∈ [−2, −1]
       { x  if x ∈ (−1, 1)
       { 3  if x ∈ [1, 2].

Then the function F(x) = ∫_{−2}^x f is continuous at points x ∈ [−2, −1) ∪ (−1, 1) ∪ (1, 2] and discontinuous at x = −1 and x = 1.
(d) The function F defined in part (c) is differentiable for all x ∈ [−2, 2].
(e) Suppose f : [a, b] → R is continuous on [a, b]. Then there exists c ∈ [a, b] such that ∫_a^b f = f(c)(b − a).
HW 7.5.2 Consider the function f defined in HW 7.5.1-(c). Compute F and plot F.
HW 7.5.3 Calculate the following three integrals, verifying all steps.
(a) ∫_{−1}^2 x³  (b) ∫_{−1}^3 x cos x  (c) ∫_1^2 1/√(2x − 1).
HW 7.5.4 Suppose f : [a, b] → R is integrable. Show that there may not be a c ∈ [a, b] such that ∫_a^b f = f(c)(b − a).
HW 7.5.5 Suppose f, φ, ψ : [a, b] → R are such that f is continuous on [a, b] and φ and ψ are differentiable on (a, b). Show that for x ∈ (a, b),

(d/dx) ∫_{ψ(x)}^{φ(x)} f = f(φ(x))φ′(x) − f(ψ(x))ψ′(x).
7.6 The Riemann Integral
The integral studied in the basic calculus course is most often referred to as the Riemann integral. We called the integral that we defined the Darboux integral, or just the integral, to differentiate it from the integral defined in this section. As we will see in Theorem 7.6.3, the name is not important because the two integrals are always equal. It is still worthwhile to introduce the definition given below, because it is the most common definition introduced in basic calculus courses. We begin with some definitions.
For a partition P = {x0, x1, · · · , xn−1, xn} of [a, b], we define the gap of P to be gap(P) = max{xi − xi−1 : i = 1, · · · , n}. Thus gap(P) is the length of the largest partition interval. We then make the following definition.
Definition 7.6.1 Consider the function f : [a, b] → R where f is bounded on [a, b], and let P = {x0, x1, · · · , xn} be a partition of [a, b].
(a) A Riemann sum of f with respect to the partition P is a sum

Sn(f, P) = Σ_{i=1}^n f(ξi)(xi − xi−1),   (7.6.1)

where ξi is any point in [xi−1, xi].
(b) The function f is said to be Riemann integrable on [a, b] if there exists a real number (R)∫_a^b f so that for every ε > 0 there exists a δ such that |(R)∫_a^b f − Sn(f, P)| < ε for all partitions P with gap(P) < δ and all different choices of Sn(f, P).
Before we move on, let us emphasize some important points here. The Riemann sums are awkward in that for a given partition there are many different sums—you get a different value for each choice of ξi ∈ [xi−1, xi] for each i = 1, · · · , n. That is why we included the statement "all different choices of Sn(f, P)" in the definition of the Riemann integral—it's usually not there. The fact that we must be able to show that |(R)∫_a^b f − Sn(f, P)| < ε for arbitrary ξi ∈ [xi−1, xi] (besides all partitions P for which gap(P) < δ) can make working with this definition difficult.
The definition given above is the most common definition used in elementary
textbooks. This is probably because it is the definition that can be given as
quickly as possible. In most texts this definition is given before they consider
limits of sequences—let alone limits of partial sums.
We are going to do very little with Definition 7.6.1. As we stated earlier, the main result will be Theorem 7.6.3, where we prove that the Riemann integral and the Darboux integral are the same. Before we do this, we state the following easy result.
Proposition 7.6.2 Consider f : [a, b] → R where f is bounded on [a, b]. If f is Riemann integrable on [a, b], then there exists a sequence {Pn} of partitions of [a, b] such that Sn(f, Pn) → (R)∫_a^b f as n → ∞ for all choices of ξi ∈ [xi−1, xi], i = 1, · · · , n, i.e. lim_{n→∞} Sn(f, Pn) = (R)∫_a^b f for arbitrary ξi.
Proof: We begin by setting εn = 1/n, n = 1, 2, · · · , and applying Definition 7.6.1. For each n we obtain a δn such that any partition Pn* of [a, b] with gap(Pn*) < δn satisfies |Sn(f, Pn*) − (R)∫_a^b f| < 1/n. For each n choose one such partition and call it Pn. Then we have a sequence of partitions of [a, b] such that Sn(f, Pn) → (R)∫_a^b f as n → ∞.
Thus we see that the Riemann integral can be evaluated by a sequence of
the Riemann sums over a sequence of partitions—much like the result of the
Archimedes-Riemann Theorem, Theorem 7.2.4. The real result that we want is
that the Riemann integral defined by Definition 7.6.1 is the same as the Darboux
integral defined by Definition 7.2.3.
Theorem 7.6.3 Consider f : [a, b] → R where f is bounded on [a, b]. Then f
is Riemann integrable if and only if f is Darboux integrable, and in either case
the integrals are equal.
Proof: (⇒) We'll do the easier direction first. Suppose ε > 0 is given. Since f is Riemann integrable there exists a δ so that |(R)∫_a^b f − Sn(f, P)| < ε/3 for all partitions P with gap(P) < δ, i.e.

(R)∫_a^b f − ε/3 < Sn(f, P) < (R)∫_a^b f + ε/3,

and this must hold for all choices ξi ∈ [xi−1, xi], i = 1, · · · , n.

Choose one such partition P and consider the left half of the inequality, (R)∫_a^b f − ε/3 < Sn(f, P). Since this inequality must hold for any choice of ξi ∈ [xi−1, xi], i = 1, · · · , n, we can take the greatest lower bound of both sides over all such possible choices of ξi to get

(R)∫_a^b f − ε/3 ≤ glb{Sn(f, P) : ξi ∈ [xi−1, xi], i = 1, · · · , n}.

But the term on the right is just L(f, P), so we have

(R)∫_a^b f − ε/3 ≤ L(f, P).   (7.6.2)

Repeat this process with the inequality Sn(f, P) < (R)∫_a^b f + ε/3, this time taking the least upper bound of both sides, to get

U(f, P) ≤ (R)∫_a^b f + ε/3.   (7.6.3)

If we combine inequalities (7.6.2) and (7.6.3), we get U(f, P) − L(f, P) ≤ 2ε/3 < ε. By Riemann's Theorem, Theorem 7.2.6, we know that f is integrable (Darboux integrable).

If we then use the fact that for any partition P we have L(f, P) ≤ ∫_a^b f ≤ U(f, P), along with inequalities (7.6.2) and (7.6.3), we get

(R)∫_a^b f − ε/3 ≤ L(f, P) ≤ ∫_a^b f ≤ U(f, P) ≤ (R)∫_a^b f + ε/3,

or −ε/3 ≤ ∫_a^b f − (R)∫_a^b f ≤ ε/3. Since ε is arbitrary, we have ∫_a^b f = (R)∫_a^b f.
(⇐) This is a tough proof, but also an interesting, good proof. It is not as hard as it looks—we just do it very carefully. If f is integrable on [a, b], then for ε > 0, by Definitions 7.2.1 and 7.2.3 there exists a partition P′ = {x0, · · · , xn} of [a, b] such that

U(f, P′) − ∫_a^b f < ε/4.   (7.6.4)

Since f is bounded on [a, b], there exists M such that |f(x)| ≤ M for all x ∈ [a, b]. Set δ1 = ε/(16Mn) and let P = {y0, y1, · · · , ym} be any partition of [a, b] such that gap(P) < δ1. Let P* be the common refinement of P′ and P, P* = P′ ∪ P. By Lemma 7.1.5, U(f, P*) ≤ U(f, P′). Then from (7.6.4) we get

U(f, P*) − ∫_a^b f < ε/4.   (7.6.5)

We next want to transfer the information from inequality (7.6.5) to the partition P. To do this we want to compare U(f, P*) and U(f, P), and we will do this by looking at 0 ≤ U(f, P) − U(f, P*). Write P* as P* = {z0, z1, · · · , zp}, where p will be less than or equal to m + (n − 1)—but we don't care about this. Define the notation Mi^{P*} = lub{f(x) : x ∈ [zi−1, zi]}, i = 1, · · · , p, and Mj^P = lub{f(x) : x ∈ [yj−1, yj]}, j = 1, · · · , m. We note the following facts.

• If a partition interval [yj−1, yj] of P contains no points of P′ in its interior, then this partition interval is the same as one of the partition intervals of P*, say [zi−1, zi]; then Mj^P = Mi^{P*} and the contribution of this interval to U(f, P) − U(f, P*) is zero.

• If a partition interval [yj−1, yj] of P contains one point of P′ in its interior, then this partition interval is the same as two adjacent partition intervals of P*, say [zi−1, zi] and [zi, zi+1], and the contribution of this interval to U(f, P) − U(f, P*) is

Mj^P(yj − yj−1) − [Mi^{P*}(zi − zi−1) + Mi+1^{P*}(zi+1 − zi)]
= Mj^P[(zi+1 − zi) + (zi − zi−1)] − [Mi^{P*}(zi − zi−1) + Mi+1^{P*}(zi+1 − zi)]
= (Mj^P − Mi^{P*})(zi − zi−1) + (Mj^P − Mi+1^{P*})(zi+1 − zi).

Since either Mj^P = Mi^{P*} or Mj^P = Mi+1^{P*}, at least one of these two terms will be zero, so the contribution to U(f, P) − U(f, P*) will be (Mj^P − Mi^{P*})(zi − zi−1) or (Mj^P − Mi+1^{P*})(zi+1 − zi), and in either case the contribution will be less than or equal to 2Mδ1 (for example, |(Mj^P − Mi^{P*})(zi − zi−1)| ≤ |Mj^P − Mi^{P*}|δ1 ≤ (|Mj^P| + |Mi^{P*}|)δ1 ≤ 2Mδ1).

• If a partition interval [yj−1, yj] of P contains two points of P′ in its interior, then this partition interval is the same as three adjacent partition intervals of P*. We play the same game—this time with three intervals of P*—and find that the contribution to U(f, P) − U(f, P*) is less than or equal to 2 · 2Mδ1, where the first factor 2 indicates that there will be two terms contributing—still only one term drops out.

• Etc. If a partition interval [yj−1, yj] of P contains k points of P′ in its interior, then this partition interval is the same as k + 1 adjacent partition intervals of P* and will contribute less than or equal to k · 2Mδ1 to U(f, P) − U(f, P*).

Thus we see that because each interior point of P′ contributes less than or equal to 2Mδ1 to U(f, P) − U(f, P*),

0 ≤ U(f, P) − U(f, P*) = |U(f, P) − U(f, P*)| ≤ (n − 1)2Mδ1,

or

U(f, P) ≤ U(f, P*) + (n − 1)2Mδ1 < U(f, P*) + (n − 1)2M ε/(16Mn) < U(f, P*) + ε/8.

If we combine this inequality with inequality (7.6.5), we get

U(f, P) − ∫_a^b f < 3ε/8.   (7.6.6)

We have derived inequality (7.6.6) very carefully. In a like manner we can show that there exists a δ2 such that if gap(P) < δ2, we get

∫_a^b f − L(f, P) < 3ε/8.   (7.6.7)

(To show that we are right—in that we claim that "we can show"—you might try deriving inequality (7.6.7).)

Take δ = min{δ1, δ2} and suppose we are given a partition P of [a, b] such that gap(P) < δ—we then get both inequalities (7.6.6) and (7.6.7).

Because on any partition interval we have mi ≤ f(ξi) ≤ Mi, we have L(f, P) ≤ Sn(f, P) ≤ U(f, P). The right half of this inequality along with inequality (7.6.6) gives Sn(f, P) < ∫_a^b f + 3ε/8, and the left half along with inequality (7.6.7) gives ∫_a^b f − Sn(f, P) < 3ε/8; i.e. |∫_a^b f − Sn(f, P)| < 3ε/8 < ε. Thus by Definition 7.6.1, f is Riemann integrable and (R)∫_a^b f = ∫_a^b f.
As we promised, the above proof is difficult. However, it is especially neat because we are given the partition P′ and an inequality with respect to P′ by the hypothesis, and then we are given another partition P and want essentially the same inequality with respect to P. We do this by defining P* and using P* to pass the inequality from P′ to P.
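The equivalence can also be watched numerically: Riemann sums with arbitrarily (here randomly) chosen tags ξi squeeze between L(f, P) and U(f, P), and for a monotone f on a uniform partition U(f, P) − L(f, P) = gap(P) · (f(b) − f(a)). The sketch below, an illustration under those assumptions, checks this for f(x) = x² on [0, 1].

```python
import random

random.seed(0)
f = lambda x: x * x

def riemann_sum(f, a, b, n):
    """Uniform partition with n intervals and a random tag in each interval."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        xi = a + (i + random.random()) * h   # arbitrary xi_i in [x_{i-1}, x_i]
        total += f(xi) * h
    return total

for n in (10, 100, 1000):
    # L(f,P) <= S_n(f,P) <= U(f,P), and U - L = 1/n for this monotone f
    assert abs(riemann_sum(f, 0.0, 1.0, n) - 1 / 3) <= 2 / n
```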
HW 7.6.1 Suppose f : [0, 1] → R is integrable. Prove that

lim_{n→∞} Σ_{i=1}^n f(i/n)(1/n) = ∫_0^1 f.

Note also that lim_{n→∞} Σ_{i=0}^{n−1} f(i/n)(1/n) and lim_{n→∞} Σ_{i=1}^n f((2i − 1)/(2n))(1/n) are also equal to ∫_0^1 f.
HW 7.6.2 (a) Suppose f : [0, 1] → R and suppose lim_{n→∞} Σ_{i=1}^n f(i/n)(1/n) exists. Show that f is not necessarily integrable on [0, 1].
(b) Show also that neither of the other limits of sums considered in HW 7.6.1 will imply the integrability of f either.
7.7 Logarithm and Exponential Functions

We have been reasonably careful not to use logarithms yet, because one of the very logical ways to define a logarithm is to use the integral as a part of the definition (at least we haven't used them often—surely we haven't used them for anything important). We've also stayed away from exponentials where the exponent is anything other than a rational number, i.e. we have not allowed irrational exponents—and we want them and need them. Approximately half of the basic calculus books use this approach to define the logarithm and exponential functions—the books that are not referred to as "early transcendentals". We make the following definition.
Definition 7.7.1 For x > 0 we define ln x = ∫_1^x (1/t) dt, which we call the logarithm of x.
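Definition 7.7.1 is directly computable. The sketch below (our own helper `ln_approx`, using a midpoint rule plus the orientation convention of Definition 7.4.7-(b) for x < 1, both assumptions of the sketch) reproduces the library logarithm to high accuracy.

```python
import math

def ln_approx(x, n=100_000):
    """Midpoint-rule approximation of ln x = int_1^x (1/t) dt, x > 0."""
    a, b, sign = (1.0, x, 1.0) if x >= 1.0 else (x, 1.0, -1.0)  # Def. 7.4.7-(b)
    h = (b - a) / n
    return sign * sum(1.0 / (a + (i + 0.5) * h) for i in range(n)) * h

for x in (0.5, 2.0, 10.0):
    assert abs(ln_approx(x) - math.log(x)) < 1e-8
```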
We immediately get the following proposition, which includes some of the basic results concerning the logarithm function. Since the function f(t) = 1/t is continuous for t > 0, we can apply Proposition 7.5.2 to obtain the following result.
Proposition 7.7.2 (a) The function ln : (0, ∞) → R is continuous on (0, ∞).
(b) The function ln is differentiable on (0, ∞), and (d/dx) ln x = 1/x.
(c) The function ln is strictly increasing.
(d) ln 1 = 0.
Proof: We are going to apply Propositions 7.5.1 and 7.5.2. Both of these propositions considered a function f defined on a closed interval [a, b]. For this result we must consider the function 1/t defined on (0, ∞). However, for any x0 ∈ (0, ∞) we can consider [x0/2, 2x0] and apply Propositions 7.5.1 and 7.5.2 to see that f(x) = ln x is both continuous and differentiable at x = x0, and (d/dx) ln x evaluated at x = x0 equals 1/x0. Thus for any x ∈ (0, ∞), (d/dx) ln x = 1/x.

Since (d/dx) ln x = 1/x > 0 on (0, ∞), the function ln is strictly increasing by Corollary 6.3.6-(a). (Notice that Corollary 6.3.6-(a) was proved for open intervals. For any x1, x2 ∈ (0, ∞) such that x1 < x2, (d/dx) ln x = 1/x > 0 on I = (x1/2, x2 + 1). Then by Corollary 6.3.6-(a) ln is strictly increasing on I, so ln x1 < ln x2. Thus the ln function is strictly increasing on (0, ∞). Notice also that it would have been easier to just say that ln is strictly increasing by Corollary 6.3.6-(b).)

And by Definition 7.4.7-(a) we see that ln 1 = ∫_1^1 (1/t) dt = 0.
We next need to show that the logarithm function just defined satisfies the basic properties that we all know logarithm functions are supposed to satisfy.
Proposition 7.7.3 For a, x ∈ (0, ∞) and r ∈ Q:
(a) ln(ax) = ln a + ln x,
(b) ln(a/x) = ln a − ln x,
(c) ln x^r = r ln x.
Proof: Notice that the derivative found in part (b) of Proposition 7.7.2, along with the Chain Rule, Proposition 6.1.4, gives (d/dx) ln f(x) = f′(x)/f(x).

(a) We consider the expression ln(ax) where a ∈ (0, ∞) is some constant. Then (d/dx) ln(ax) = a/(ax) = 1/x. Also (d/dx)(ln a + ln x) = 0 + 1/x = 1/x. Since (d/dx) ln(ax) = (d/dx)(ln a + ln x) = 1/x, by Corollary 6.3.5 we know that ln(ax) = ln a + ln x + C where C is some constant. This last equality must be true for all x ∈ (0, ∞). We set x = 1 to see that ln a = ln a + ln 1 + C = ln a + C, or C = 0. Thus ln(ax) = ln a + ln x.

(b) We note that (d/dx) ln(a/x) = (1/(a/x))(−a/x²) = −1/x and (d/dx)(ln a − ln x) = 0 − 1/x = −1/x. Thus ln(a/x) = ln a − ln x + C. If we set x = 1, we get ln a = ln a − ln 1 + C = ln a + C, or C = 0. Thus ln(a/x) = ln a − ln x.

(c) Since (d/dx) ln x^r = (1/x^r) r x^{r−1} = r/x and (d/dx)(r ln x) = r/x, ln x^r = r ln x + C. If we let x = 1, we see that C = 0 and hence ln x^r = r ln x. Note that we have only considered part (c) for r ∈ Q. This is because we have not defined x^r for r ∈ I—so surely we could not decide how to differentiate x^r for r ∈ I.
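Proposition 7.7.3 can be spot-checked directly from Definition 7.7.1. The quadrature helper below is an assumption of this sketch, and r is kept rational as in part (c).

```python
def ln_approx(x, n=100_000):
    """Midpoint-rule approximation of ln x = int_1^x (1/t) dt, x > 0."""
    a, b, sign = (1.0, x, 1.0) if x >= 1.0 else (x, 1.0, -1.0)
    h = (b - a) / n
    return sign * sum(1.0 / (a + (i + 0.5) * h) for i in range(n)) * h

a, x, r = 3.0, 2.5, 4
assert abs(ln_approx(a * x) - (ln_approx(a) + ln_approx(x))) < 1e-6   # (a)
assert abs(ln_approx(a / x) - (ln_approx(a) - ln_approx(x))) < 1e-6   # (b)
assert abs(ln_approx(x ** r) - r * ln_approx(x)) < 1e-6               # (c)
```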
We next consider ln 2—which, by using a calculator, we know is approximately equal to 0.69, but we can't use that. We note that ln 2 = ∫_1^2 (1/t) dt. We also note that on [1, 2] we have 1/t ≥ 1/2 (look at the graph of 1/t). Thus we know that ln 2 = ∫_1^2 (1/t) dt ≥ ∫_1^2 (1/2) dt = 1/2—and it is true that 0.69 ≥ 1/2.

Using this inequality we see that ln 2^n = n ln 2 ≥ n/2 and lim_{n→∞} ln 2^n = ∞. Then because the ln function is increasing, we know that lim_{x→∞} ln x = ∞. (Because lim_{n→∞} ln 2^n = ∞, for every R > 0 there exists an N ∈ R so that for any n > N, ln 2^n > R. Then for any R > 0 we can choose K = 2^N. Then because the ln function is increasing, for any x > K, ln x > R.)
Likewise we want to show that lim_{x→0+} ln x = −∞. We first note that ln 2^{−n} = −n ln 2 ≤ −n/2. Part (b) below then follows using an argument similar to the one used for part (a). We then have the following result, which will allow us to understand the plot of the ln function.
Proposition 7.7.4 (a) lim_{x→∞} ln x = ∞.
(b) lim_{x→0+} ln x = −∞.
If you look at a plot of the ln function—use your calculator—you see that there is some x0 ∈ (0, ∞) so that ln x0 = 1. This can be proved by first noting that ln 1 = 0 and ln 2³ = 3 ln 2 ≥ 3(1/2) = 3/2 > 1. Then by the Intermediate Value Theorem, Theorem 5.4.1, we know that there exists x0 ∈ (1, 8) such that ln x0 = 1. We make the following definition.
Definition 7.7.5 The real number e is defined to be that value such that ln e = 1.
It should be reasonably clear that the same argument used above can be used to prove that for any y ∈ (0, ∞) there exists an x ∈ (1, ∞) such that ln x = y—for any y choose some n so that ln 2^n > y and then apply the IVT with ln on (1, 2^n). This implies that ln(1, ∞) = (0, ∞).
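The IVT argument is constructive enough to mimic numerically: bisection on [1, 8] locates the point where ln takes the value 1. A small sketch, using the library logarithm only as a stand-in for the integral-defined ln:

```python
import math

def bisect_increasing(f, target, lo, hi, tol=1e-12):
    """Bisection for an increasing continuous f with f(lo) < target < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# ln 1 = 0 < 1 and ln 8 = 3 ln 2 >= 3/2 > 1, so the IVT applies on (1, 8).
x0 = bisect_increasing(math.log, 1.0, 1.0, 8.0)
print(abs(x0 - math.e) < 1e-9)  # True: the located point is e
```

The same routine with a different `target` illustrates the general claim that ln takes every value in (0, ∞).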
Likewise, we can apply the same approach to show that ln(0, 1] = (−∞, 0]—remember ln 1 = 0. Specifically, consider y_0 = −11. We note that ln 2^{−24} = −24 ln 2 ≤ −24(1/2) = −12 < −11 and ln 1 = 0 > −11, so we can apply the IVT to imply that there exists some x_0 ∈ (2^{−24}, 1) such that ln x_0 = y_0 = −11. Or more generally, if you consider any y_0 ∈ (−∞, 0), we can choose an n such that ln 2^{−n} = −n ln 2 ≤ −n/2 < y_0 (and we do have ln 1 = 0 > y_0). We can apply the IVT to imply that there exists x_0 ∈ (2^{−n}, 1) such that ln x_0 = y_0. This implies that ln(0, 1] = (−∞, 0]. Thus we have the following result.
Proposition 7.7.6 ln(0, ∞) is an interval—specifically ln(0, ∞) = (−∞, ∞).
At this time we assume that we know almost everything that we want to know about the logarithm function. We are now ready to move on to define
the exponential function. Because we know by Proposition 7.7.2-(c) that the ln
function is strictly increasing, we know that the ln function is one-to-one. Thus
we know that the inverse of the ln function exists on ln(0, ∞) = (−∞, ∞) so we
can make the following definition.
Definition 7.7.7 Define the exponential function, exp : (−∞, ∞) → (0, ∞), as
exp(x) = ln−1 x.
We want to make it very clear that at this time there is no special relationship between the exponential function defined above and anything of the form a^x—we still don't know what the latter expression means. However, we do have tools to help us look at the exp function. We can use either Proposition 5.4.11 or 5.4.12 to prove the following result.
Proposition 7.7.8 The function exp : (−∞, ∞) → (0, ∞) is continuous on R.
The next property we would like to investigate regarding the exponential function is differentiability. Hopefully we remember that in Section 6.3 we
developed everything we need in Proposition 6.3.8. We have the following result.
Proposition 7.7.9 The function exp : (−∞, ∞) → (0, ∞) is differentiable at y_0 = ln x_0 for any y_0 ∈ (−∞, ∞), and

d/dy exp(y)|_{y=y_0} = 1 / (d/dx ln x|_{x=x_0}) = exp(y_0).   (7.7.1)
196
7. Integration
Proof: The domain of the function ln is an interval—I = (0, ∞). The function ln is one-to-one and continuous on I, x_0 is not an end point of I, ln is differentiable at x = x_0 and d/dx ln x|_{x=x_0} = 1/x_0 ≠ 0 for any x_0 ∈ (0, ∞). Note that since y_0 = ln x_0, then x_0 = exp(y_0). Thus by Proposition 6.3.8 we get

d/dy exp(y)|_{y=y_0} = 1 / (d/dx ln x|_{x=x_0}) = 1/(1/x_0) = x_0 = exp(y_0).
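As a quick sanity check of (7.7.1), a symmetric difference quotient of exp at a few points y_0 should reproduce exp(y_0). This is an illustration, not a proof:

```python
import math

def dexp(y0, h=1e-6):
    """Symmetric difference quotient approximating the derivative of exp at y0."""
    return (math.exp(y0 + h) - math.exp(y0 - h)) / (2.0 * h)

for y0 in (-1.0, 0.0, 2.0):
    print(abs(dexp(y0) - math.exp(y0)) < 1e-5)  # True for each y0: exp' = exp
```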
The exponential function will inherit other more basic properties from the
logarithm function. Some of these properties are included in the following proposition.
Proposition 7.7.10 (a) exp(0) = 1
(b) exp(1) = e
(c) For r ∈ Q, exp(r) = e^r.
Proof: Parts (a) and (b) follow since ln 1 = 0 and ln e = 1, respectively. Remember that for r rational, e^r has been defined earlier. Part (c) follows from ln e^r = r ln e = r. Thus e^r = ln^{−1}(r) = exp(r).
We want to define some sort of exponential a^x that makes sense for all x ∈ R—specifically here e^x. Above we see that on the rationals e^r and exp(r) are the same—so we're close. Thus we define the following.
Definition 7.7.11 For x ∈ R define e^x = exp(x).
We should emphasize that we are really only defining e^x for x ∈ I since it is already defined on Q. It's acceptable to state it the way we do because by Proposition 7.7.10-(c), we know that for r ∈ Q they are the same anyway. We should also emphasize that by Definition 7.7.11 and Proposition 7.7.9 we have d/dx e^x = e^x.
One of the very important results follows immediately because of the function-inverse function basic identity, Definition 5.4.7.
Proposition 7.7.12 (a) e^{ln x} = x for x > 0
(b) ln e^x = x for x ∈ R.
There are, of course, some very basic properties that we want exponentials to satisfy. In Section 5.4, Proposition 5.6.6 we showed that for r, s ∈ Q, we have x^r x^s = x^{r+s} and (x^r)^s = x^{rs}. We want and need e^{x_1} e^{x_2} = e^{x_1+x_2} and (e^{x_1})^{x_2} = e^{x_1 x_2}. We have the following.
Proposition 7.7.13 For x_1, x_2 ∈ R we have
(a) e^{x_1} e^{x_2} = e^{x_1+x_2} and
(b) (e^{x_1})^{x_2} = e^{x_1 x_2}.
Proof: This proposition could be proved using the same approach that we used to prove Proposition 7.7.3. Instead of using that approach we will show how these properties follow from results proved in Proposition 7.7.3.
(a) Suppose y_1 and y_2 are such that y_1 = e^{x_1} and y_2 = e^{x_2}—then also x_1 = ln y_1 and x_2 = ln y_2. Then x_1 + x_2 = ln y_1 + ln y_2 = ln(y_1 y_2) by Proposition 7.7.3-(a). Then taking the exponential of both sides gives e^{x_1+x_2} = e^{ln(y_1 y_2)} = y_1 y_2 = e^{x_1} e^{x_2}.
(b) In a similar way we note that x_1 x_2 = x_2 ln e^{x_1} = ln (e^{x_1})^{x_2}. Then taking the exponential of both sides yields e^{x_1 x_2} = (e^{x_1})^{x_2}.
We do want and need more general exponentials. To accomplish this we
make the following definition.
Definition 7.7.14 For a > 0 and x ∈ R we define a^x = e^{x ln a}.
We next would have to state and prove all of the relevant properties related to the function a^x. We want at least the following properties: a^0 = 1, a^1 = a, a^{x_1} a^{x_2} = a^{x_1+x_2}, (a^{x_1})^{x_2} = a^{x_1 x_2} and d/dx a^x = a^x ln a. We will not prove these properties but you should be able to see that they follow easily from the analogous properties for the exponential e^x.
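These claimed properties of a^x = e^{x ln a} are easy to spot-check numerically. The sketch below uses a = 3 (an arbitrary choice for illustration) and a difference quotient for the derivative:

```python
import math

a = 3.0  # arbitrary base chosen for illustration

def power(x):
    """a^x computed via Definition 7.7.14: a^x = e^(x ln a)."""
    return math.exp(x * math.log(a))

print(abs(power(0.0) - 1.0) < 1e-12)                     # a^0 = 1
print(abs(power(1.0) - a) < 1e-12)                       # a^1 = a
print(abs(power(2.0) * power(0.5) - power(2.5)) < 1e-9)  # a^x1 a^x2 = a^(x1+x2)

h, x = 1e-6, 1.3
deriv = (power(x + h) - power(x - h)) / (2.0 * h)
print(abs(deriv - power(x) * math.log(a)) < 1e-4)        # d/dx a^x = a^x ln a
```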
And finally we want one last very important function defined, x^r for r ∈ R and x ∈ (0, ∞). The function x^r is already defined, x^r = e^{r ln x}. Of course for certain values of r (many values of r) we can actually define x^r for any x ∈ R (x^2, x^3, x^{2/3}, etc.)—but for many values of r (at least r = 1/2, r = π, etc.) x^r just doesn't make any sense for x < 0. And clearly the definition x^r = e^{r ln x} only makes sense for positive x.
The most basic properties of x^r follow from the properties of the exponential and logarithm. We note that because ln x^r = ln e^{r ln x} = r ln x, we see that now (essentially by definition) ln x^r satisfies the property given in Proposition 7.7.3-(c)—this time for any r ∈ R (instead of only r ∈ Q).
The property that we need badly is the derivative property. We already have that d/dx x^r = r x^{r−1} for r ∈ Q. We also have the extension of this result.
Proposition 7.7.15 For r ∈ R and x ∈ (0, ∞), d/dx x^r = r x^{r−1}.
Proof: We note that d/dx x^r = d/dx e^{r ln x} = e^{r ln x} · (r/x) = x^r · (r/x) = r x^{r−1}, which is the desired result.
HW 7.7.1 (True or False and why) (a) ln 2^n ≥ n/2 implies that lim_{n→∞} ln 2^n = ∞.
(b) If ln 2x + 3 ln 8x = 1, then x = 1/10.
(c) For x ∈ R, d/dx x^x = (1 + ln x) x^x.
(d) The function sin x is defined only if x ∈ [0, 2π].
(e) exp(ln x) = x for x > 0.
HW 7.7.2 Let f(θ) = cos θ. (a) Show that f is not one-to-one. Restrict f in a way so that the restriction, f_r, is one-to-one.
(b) Prove the existence of f_r^{−1}, prove the continuity of f_r^{−1} and compute d/dx f_r^{−1}(x), i.e. compute d/dx cos^{−1} x.
7.8
Improper Integrals
Two important assumptions made as a part of the definition of the integral
were that the functions were bounded and the interval was finite—and it’s easy
to see that for many of the integration results proved, these were important
assumptions. However, there are many times that we want or need some sort of integral of an unbounded function or some sort of integral over an infinite interval. In this section we introduce an extension of the integral to the improper
integral—an integral that allows for unbounded functions and infinite intervals.
We want to emphasize that the integral considered in this section is not the
Darboux-Riemann integral considered in the rest of the chapter.
We want a definition for integrals ∫_a^b f where f may be unbounded at a, b or at c ∈ (a, b). Likewise we want integrals of the form ∫_a^b f where b is infinity, a is minus infinity or both. We really do this by considering each possibility separately. To do this we make the following definition.
Definition 7.8.1 (a) Suppose that f : (a, b] → R is such that f is integrable on [c, b] for any c ∈ (a, b]. Suppose further that lim_{c→a+} ∫_c^b f exists. Then we define ∫_a^b f = lim_{c→a+} ∫_c^b f.
(b) Suppose that f : [a, b) → R is such that f is integrable on [a, c] for any c ∈ [a, b). Suppose further that lim_{c→b−} ∫_a^c f exists. Then we define ∫_a^b f = lim_{c→b−} ∫_a^c f.
(c) Suppose that f : [a, ∞) → R is such that f is integrable on [a, c] for any c ∈ [a, ∞). Suppose further that lim_{c→∞} ∫_a^c f exists. Then we define ∫_a^∞ f = lim_{c→∞} ∫_a^c f.
(d) Suppose that f : (−∞, b] → R is such that f is integrable on [c, b] for any c ∈ (−∞, b]. Suppose further that lim_{c→−∞} ∫_c^b f exists. Then we define ∫_{−∞}^b f = lim_{c→−∞} ∫_c^b f.
(e) Suppose that f : [a, c) ∪ (c, b] → R is such that ∫_a^c f and ∫_c^b f exist and are finite. Then we define ∫_a^b f = ∫_a^c f + ∫_c^b f.
(f) Suppose that f : R → R is such that f is integrable on [−c_1, c_1] for every c_1 > 0. Then we define ∫_{−∞}^∞ f = ∫_{−∞}^c f + ∫_c^∞ f for any c ∈ R.
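Definition 7.8.1-(a) can be illustrated with f(t) = t^{−1/2} on (0, 1], which is unbounded at 0. Each inner integral ∫_c^1 t^{−1/2} dt is an ordinary integral, evaluated here in closed form as 2 − 2√c (an elementary antiderivative computation), and the improper integral is the limit as c → 0+:

```python
import math

def inner(c):
    """The ordinary integral of t^(-1/2) over [c, 1]: 2 - 2*sqrt(c), 0 < c <= 1."""
    return 2.0 - 2.0 * math.sqrt(c)

for c in (0.1, 0.01, 1e-6, 1e-12):
    print(c, inner(c))                 # values approach 2 as c → 0+

print(abs(inner(1e-12) - 2.0) < 1e-5)  # True: the improper integral equals 2
```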
Chapter 8
Sequences and Series
8.1
Approximation by Taylor Polynomials
The functions e^x, sin x, 1/√(1 − x^2), etc. are nice functions—especially when you are using a calculator or computer—but they are not as nice as polynomials. Specifically, polynomials can be evaluated completely based on multiplication, subtraction and addition. Thus when you build your computer, if you teach it how to multiply, subtract and add, your computer can also evaluate polynomials. These other functions are not that simple (even division creates problems). The way that most computers evaluate the more complex functions is to approximate them by polynomials.
There are many other applications where it is useful to have a polynomial approximation to a function. Generally, polynomials are just easier to use. In this section we will show one way to obtain a polynomial approximation of a function. The approximation will include the error term, which is extremely important since we must know that our approximation is a sufficiently good approximation—how good depends on our application. The main tool that we will use is integration by parts, Proposition 7.5.6. We will use integration by parts in the form ∫_a^b F′(t)G(t) dt = [F G]_a^b − ∫_a^b F(t)G′(t) dt, where we see that it is convenient to include the variable of integration specifically because we will have two variables in our formulas.
We consider a function f and desire to find a polynomial approximation of f near x = a. At this time we will not worry about the necessary assumptions on our function—they will be included when we state our proposition. We begin by noting that by the Fundamental Theorem of Calculus, Theorem 7.5.4, ∫_a^x f′(t) dt = f(x) − f(a) or

f(x) = f(a) + ∫_a^x f′(t) dt.   (8.1.1)

We write expression (8.1.1) as f(x) = T_0(x) + R_0(x) where T_0(x) = f(a) and
R_0(x) = ∫_a^x f′(t) dt. T_0 is referred to as the zero order Taylor polynomial of the function f about x = a and R_0 is the zero order error term—of course the trivial case—and generally T_0 would not be a very good approximation of f.
We obtain the next order of approximation by integrating ∫_a^x f′(t) dt by parts. We let G(t) = f′(t) and F′(t) = 1. Then G′(t) = f′′(t) and F(t) = t − c. You should take note of the last step carefully. The dummy variable in the integral ∫_a^x f′(t) dt is t. Hence, if you were to integrate by parts without being especially clever (or even sneaky), you would say that F = t. However, there is no special reason that you could not use F = t + 1 or F = t + π instead. The only requirement is that the derivative of F with respect to t must be 1. Since the integration (and hence, the differentiation) is with respect to t, x is a constant with respect to this operation (no different from 1, π or c). Since we want it, it is perfectly ok to set F(t) = t − x. Then application of integration by parts gives
by parts gives
Z
Z x
x
F (t)G′ (t) dt
F ′ (t)G(t) dt = [F G]a −
a
a
Z x
Z x
′
′′
′
= 0 − (a − x)f (a) −
(t − x)f (t) dt = (x − a)f (a) −
(t − x)f ′′ (t) dt.
x
f ′ (t) dt =
a
Z
x
a
a
If we plug this result into (8.1.1), we get

f(x) = f(a) + (x − a)f′(a) − ∫_a^x (t − x)f′′(t) dt.   (8.1.2)

Expression (8.1.2) can be written as f(x) = T_1(x) + R_1(x) where T_1(x) = f(a) + (x − a)f′(a) is the first order Taylor polynomial of f at x = a and R_1(x) = ∫_a^x (x − t)f′′(t) dt is the first order error term. At this time it is not clear that T_1 is a better approximation of f than T_0. You must be patient.
If we continue in the same fashion, we obtain the following result.
Proposition 8.1.1 Suppose f : I → R where I is some open interval containing a and f is n + 1 times continuously differentiable on I. Then for x ∈ I, f can be written as

f(x) = T_n(x) + R_n(x)   (8.1.3)

where

T_n(x) = Σ_{k=0}^{n} (1/k!) f^{(k)}(a)(x − a)^k   (8.1.4)

and

R_n(x) = (1/n!) ∫_a^x (x − t)^n f^{(n+1)}(t) dt.   (8.1.5)
The polynomial Tn is called the nth order Taylor polynomial of f about x = a
and Rn is called the nth order Taylor error term.
Proof: We apply mathematical induction.
Step 1: Equations (8.1.3)–(8.1.5) are true for n = 1 (by the derivation preceding this proposition).
Step 2: Assume that equations (8.1.3)–(8.1.5) are true for n = m, i.e. assume that f can be written as f(x) = T_m(x) + R_m(x) where T_m(x) = Σ_{k=0}^{m} (1/k!) f^{(k)}(a)(x − a)^k and R_m(x) = (1/m!) ∫_a^x (x − t)^m f^{(m+1)}(t) dt.
Step 3: We now prove that equations (8.1.3)–(8.1.5) are true for n = m + 1. We integrate the expression R_m by parts, letting G(t) = f^{(m+1)}(t) and F′(t) = (1/m!)(x − t)^m, and get

(1/m!) ∫_a^x (x − t)^m f^{(m+1)}(t) dt
= [−(1/(m + 1)!)(x − t)^{m+1} f^{(m+1)}(t)]_{t=a}^{t=x} − (−(1/(m + 1)!)) ∫_a^x (x − t)^{m+1} f^{(m+2)}(t) dt
= (1/(m + 1)!)(x − a)^{m+1} f^{(m+1)}(a) + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^{(m+2)}(t) dt.

Thus

R_m(x) = (1/(m + 1)!)(x − a)^{m+1} f^{(m+1)}(a) + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^{(m+2)}(t) dt

and we can write

f(x) = T_m(x) + R_m(x) = T_m(x) + (1/(m + 1)!)(x − a)^{m+1} f^{(m+1)}(a) + (1/(m + 1)!) ∫_a^x (x − t)^{m+1} f^{(m+2)}(t) dt

or f(x) = T_{m+1}(x) + R_{m+1}(x). Therefore equations (8.1.3)–(8.1.5) are true for n = m + 1.
Therefore equations (8.1.3)–(8.1.5) are true for all n by mathematical induction.
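The identity f = T_n + R_n of (8.1.3)–(8.1.5) can be checked numerically. The sketch below does so for f = sin about a = 0 with n = 2, evaluating the remainder integral with a composite Simpson rule (the choice of quadrature is incidental):

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule for g on [a, b]; n must be even."""
    h = (b - a) / n
    s = g(a) + g(b)
    s += 4.0 * sum(g(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2.0 * sum(g(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

x = 1.2
T2 = x  # T2 for sin about a = 0: sin 0 + (cos 0) x - (sin 0) x^2 / 2 = x
# f'''(t) = -cos t, so R2(x) = (1/2!) ∫_0^x (x - t)^2 (-cos t) dt
R2 = simpson(lambda t: (x - t) ** 2 * (-math.cos(t)), 0.0, x) / math.factorial(2)
print(abs((T2 + R2) - math.sin(x)) < 1e-10)  # True: f = T_n + R_n
```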
We note that if we choose a = 0 we obtain the following special case which
is very common.
Proposition 8.1.2 Suppose f : I → R where I is some open interval containing 0 and f is n + 1 times continuously differentiable on I. Then for x ∈ I, f can be written as

f(x) = T_n(x) + R_n(x)   (8.1.6)
where

T_n(x) = Σ_{k=0}^{n} (1/k!) f^{(k)}(0) x^k   (8.1.7)

and

R_n(x) = (1/n!) ∫_0^x (x − t)^n f^{(n+1)}(t) dt.   (8.1.8)
We can consider the function f(x) = e^x and can easily obtain expressions for the Taylor polynomial for f about x = 0.
Example 8.1.1 Obtain the Taylor polynomial and error term for f(x) = e^x about x = 0.
Solution: It is easy to see that for any n, f^{(n)}(0) = 1. Then we can write T_n(x) = Σ_{k=0}^{n} (1/k!) x^k and R_n(x) = (1/n!) ∫_0^x (x − t)^n e^t dt.
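A few lines of code make the quality of these polynomials visible. The sketch below evaluates T_n at x = 1 for several n and watches the error against e^x shrink:

```python
import math

def T(n, x):
    """Taylor polynomial of e^x about 0 from Example 8.1.1."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

x = 1.0
for n in (2, 5, 10):
    print(n, abs(T(n, x) - math.exp(x)))  # the error shrinks rapidly with n

print(abs(T(10, 1.0) - math.e) < 1e-7)    # True
```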
Example 8.1.2 Consider the function f(x) = 1/(x + 1). Compute Taylor polynomials and error terms for f about x = 2 for n = 4 and for general n.
Solution: We begin by making a table for derivatives of f at x = 2.

  n      f^{(n)}(x)                            f^{(n)}(2)
  0      (x + 1)^{−1}                          3^{−1}
  1      −(x + 1)^{−2}                         −3^{−2}
  2      2!(x + 1)^{−3}                        2! · 3^{−3}
  3      −3!(x + 1)^{−4}                       −3! · 3^{−4}
  4      4!(x + 1)^{−5}                        4! · 3^{−5}
  5      −5!(x + 1)^{−6}                       −5! · 3^{−6}
  n      (−1)^n n!(x + 1)^{−(n+1)}             (−1)^n n! · 3^{−(n+1)}
  n+1    (−1)^{n+1}(n + 1)!(x + 1)^{−(n+2)}

It is then easy to see that T_4(x) = 1/3 − (1/9)(x − 2) + (1/27)(x − 2)^2 − (1/81)(x − 2)^3 + (1/243)(x − 2)^4 and R_4(x) = −5 ∫_2^x (x − t)^4 (t + 1)^{−6} dt; and T_n(x) = Σ_{k=0}^{n} (−1)^k (1/3^{k+1})(x − 2)^k and R_n(x) = (−1)^{n+1}(n + 1) ∫_2^x (x − t)^n (t + 1)^{−(n+2)} dt.
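The coefficients (−1)^k/3^{k+1} computed in the table are easy to sanity-check: near x = 2 the polynomial T_4 should track f(x) = 1/(x + 1) closely. A minimal sketch:

```python
def T4(x):
    """T4 about a = 2 for f(x) = 1/(x + 1), coefficients (-1)^k / 3^(k+1)."""
    return sum((-1) ** k / 3.0 ** (k + 1) * (x - 2) ** k for k in range(5))

f = lambda x: 1.0 / (x + 1)
x = 2.2
print(abs(T4(x) - f(x)) < 1e-4)  # True: T4 tracks f closely near x = 2
```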
The title of this section was Approximation by Taylor Polynomials. The function T_n does an especially good job of approximating f at x = a since T_n and the first n derivatives of T_n evaluated at x = a give f(a) and the first n derivatives of f evaluated at x = a. For T_n to provide an approximation of f for values of x other than x = a, it is clear that R_n will have to be small. If we think about what a polynomial looks like, it is clear that a polynomial cannot approximate a general function everywhere. To see how well T_n approximates f, you might plot some of the Taylor polynomials found in Examples 8.1.1 and 8.1.2 along with the given functions. The best that we can hope for is that T_n approximates f near x = a—which we show with the following result, which we refer to as the Taylor Inequality.
Proposition 8.1.3 (Taylor Inequality) Suppose f : I = [a − r, a + r] → R for some r > 0 where f is n + 1 times continuously differentiable on I. Suppose further that there exists M such that |f^{(n+1)}(x)| ≤ M for x ∈ I. Then

|R_n(x)| ≤ (M/(n + 1)!) |x − a|^{n+1} for x ∈ I   (8.1.9)

or |R_n(x)| ≤ (M/(n + 1)!) r^{n+1}.
Proof: We note that by Proposition 7.4.8-(c) we get

|R_n(x)| = |(1/n!) ∫_a^x (x − t)^n f^{(n+1)}(t) dt| ≤ (1/n!) |∫_a^x |(x − t)^n f^{(n+1)}(t)| dt|.

Using our hypothesis on f and Proposition 7.4.8-(b) we get

|R_n(x)| ≤ (M/n!) |∫_a^x |(x − t)^n| dt|.

(To obtain this last result we must be careful. When x ≥ a, everything is positive and the statement is true without the outside absolute value signs. When x < a, by Proposition 7.4.8-(b) we get ∫_a^x |(x − t)^n f^{(n+1)}(t)| dt ≥ M ∫_a^x |(x − t)^n| dt. Because these two integrals are negative, we get |∫_a^x |(x − t)^n f^{(n+1)}(t)| dt| ≤ M |∫_a^x |(x − t)^n| dt|.)

Next we must compute |∫_a^x |(x − t)^n| dt|—carefully. Probably the easiest way is to consider x ≥ a and show that

∫_a^x |(x − t)^n| dt = ∫_a^x (x − t)^n dt = (x − a)^{n+1}/(n + 1) = |x − a|^{n+1}/(n + 1).

Then consider x < a and show that

∫_a^x |(x − t)^n| dt = ∫_a^x (t − x)^n dt = −(a − x)^{n+1}/(n + 1) = −|x − a|^{n+1}/(n + 1).

In either case |∫_a^x |(x − t)^n| dt| = |x − a|^{n+1}/(n + 1), and we get

|R_n(x)| ≤ (M/(n + 1)!) |x − a|^{n+1} ≤ (M/(n + 1)!) r^{n+1}.
We should note that the result of Proposition 8.1.3, equation (8.1.9), can also be expressed as |f(x) − T_n(x)| ≤ (M/(n + 1)!) r^{n+1} for x ∈ [a − r, a + r]. This expression makes it extremely clear how T_n approximates f.
The (n + 1)! in the denominator is one part of the above result that makes the error small on [a − r, a + r]. Also, if r is small, then r^{n+1} makes R_n small. Consider the following examples that are based on Example 8.1.1.
Example 8.1.3 Return to Example 8.1.1.
(a) Find the Taylor polynomial approximation of f(x) = e^x associated with n = 3. Apply the Taylor inequality, Proposition 8.1.3, with r = 3 to obtain an error bound on [−3, 3] for this approximation.
(b) Repeat part (a) with r = 0.1.
(c) Repeat part (a) with n = 27 and r = 3.
Solution: (a) We see that if we choose r = 3 and n = 3, then M = e^3 ≈ 20.09, T_3(x) = 1 + x + (1/2)x^2 + (1/6)x^3 and by Proposition 8.1.3, |R_3(x)| = |e^x − T_3(x)| ≤ (e^3/24) 3^4 ≈ 67.79 on [−3, 3]. This is not very good.
(b) If instead we choose r = 0.1 and n = 3, then M = e^{0.1} ≈ 1.11, T_3 is the same and |R_3(x)| = |e^x − T_3(x)| ≤ (e^{0.1}/24) 0.1^4 ≈ 4.60 · 10^{−6} on [−0.1, 0.1]. These are very good results.
(c) If we want r = 3, we can choose n = 27 (or some other insanely large n), not write out T_27 and see that |R_27(x)| ≤ (e^3/28!) 3^{28} ≈ 1.51 · 10^{−15}. So if we especially want a large interval, it is possible to find a sufficiently high order Taylor polynomial that will approximate f(x) = e^x.
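Part (b) of the example can be verified directly: sample x over [−0.1, 0.1] and confirm that the actual error |e^x − T_3(x)| never exceeds the bound (e^{0.1}/4!) 0.1^4. A sketch:

```python
import math

def T3(x):
    """Third order Taylor polynomial of e^x about 0."""
    return 1.0 + x + x ** 2 / 2.0 + x ** 3 / 6.0

bound = math.exp(0.1) / math.factorial(4) * 0.1 ** 4   # the Taylor Inequality bound
worst = max(abs(math.exp(k / 1000.0) - T3(k / 1000.0))
            for k in range(-100, 101))                 # x sampled in [-0.1, 0.1]
print(worst <= bound)  # True: the actual error respects the bound
```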
Thus we see that we can approximate e^x well with a small order Taylor polynomial on a small interval (with r small). It may not be very nice, but we also see that if for some reason we want or need a large interval, we can use a Taylor polynomial (a high ordered Taylor polynomial) to approximate e^x on the large interval.
Likewise we can revisit the example considered in Example 8.1.2, f(x) = 1/(x + 1); we clearly have to choose r so that −1 ∉ [2 − r, 2 + r]. If we choose r = 1 and n = 4, then M = 5! · 1^{−6} = 120 and |R_4(x)| ≤ (5! · 1^{−6}/5!) 1^5 = 1 on the interval [1, 3]—again not very good. If we instead choose r = 0.5 and n = 4, then M = 5! · 1.5^{−6} ≈ 10.53 and |R_4(x)| ≤ (5! · 1.5^{−6}/5!) 0.5^5 ≈ 2.74 · 10^{−3} on the interval [1.5, 2.5]. This is a much better result.
We see that in this case if r is a bit larger (1 or larger), M gets large—large enough so that the (n + 1)! in the denominator of (8.1.9) doesn't help make R_4 small. And of course, if r ≥ 1, the r^{n+1} term doesn't help make R_4 small either.
HW 8.1.1 (True or False and why) (a) If n is sufficiently large and r is sufficiently small (but > 0), then T_n(x) = f(x) on [a − r, a + r].
(b) On any interval [a − r, a + r] the derivative f^{(n)}(x) gets small as n gets larger.
(c) A sufficient hypothesis for Proposition 8.1.1 is that each of the functions f^{(k)} be integrable, k = 1, · · · , n + 1.
(d) If f is a fourth degree polynomial, then T_4(x) = f(x) for all x ∈ R and R_4(x) = 0 for all x ∈ R.
(e) If R_n(x) = 0 for all x ∈ R and some n ∈ N, then f is a polynomial.
HW 8.1.2 Begin with f expressed as f (x) = T1 (x) + R1 (x) as in equation
(8.1.2). Derive T2 (x) and R2 (x)—of course such that f (x) = T2 (x) + R2 (x).
HW 8.1.3 Consider the function f (x) = sin x. (a) Compute the Taylor polynomial and error term about x = 0 for n = 4 and for a general n.
(b) Apply the Taylor inequality, Proposition 8.1.3, on [−1, 1] to determine a
bound on the error for both cases.
(c) Use the result from part (b) for general n to determine an n0 such that
| sin x − Tn (x)| ≤ 1.0 · 10−10 for all x ∈ [−1, 1].
8.2
Sequences and Series
Convergence of sequences of functions. In Proposition 8.1.3 we see that if the function f is defined on [a − r, a + r], then |f(x) − T_n(x)| ≤ (M/(n + 1)!) r^{n+1} where we worked to find M, n and r so that (M/(n + 1)!) r^{n+1} is small. How small? It depends on how accurately we want to approximate f.
Hopefully that inequality above reminds you of convergence of sequences. If we return to the function f(x) = e^x and the sequence of Taylor polynomials found in Example 8.1.1, choose r = 3 and plot f along with a bunch of the T_n's for different n's, it is clear that T_n converges to f by the English definition of "converges". For a fixed x ∈ [−3, 3], since by the Taylor inequality −(e^3/(n + 1)!) 3^{n+1} ≤ e^x − T_n(x) ≤ (e^3/(n + 1)!) 3^{n+1} and lim_{n→∞} 3^{n+1}/(n + 1)! = 0 (Example 3.5.2), by the Sandwich Theorem, Proposition 3.4.2, we know that lim_{n→∞} [e^x − T_n(x)] = 0 or T_n(x) → f(x) = e^x—for fixed x ∈ [−3, 3].
We formalize the concept of a sequence of functions converging to a given function with the following definitions.
Definition 8.2.1 Suppose f, f_n : D → R for D ⊂ R, n = 1, 2, · · · . If for each x ∈ D, lim_{n→∞} f_n(x) exists and equals f(x), then we say that the sequence {f_n} converges pointwise to f on D. We write f_n → f.
We have defined pointwise convergence of a sequence of functions. There are other types of convergence—we will include uniform convergence later. When there is no doubt that the convergence is pointwise, the "pointwise" will often be omitted.
There are an abundant number of easy, important sequences of functions.
Consider the following examples.
Example 8.2.1 Define f_{1n}, f_1 : D = [0, 1] → R for n ∈ N by f_{1n}(x) = x^n and f_1(x) = 0 for 0 ≤ x < 1, f_1(x) = 1 for x = 1. Show that f_{1n} → f_1 pointwise.
Solution: We note that
• since f_{1n}(0) = 0 for all n, then f_{1n}(0) → 0 = f_1(0),
• for 0 < x < 1, since lim_{n→∞} x^n = 0 by Example 3.5.1, f_{1n}(x) → 0 = f_1(x), and
• since f_{1n}(1) = 1 for all n, then f_{1n}(1) → 1 = f_1(1).
Thus f_{1n} → f_1 pointwise on [0, 1].
Example 8.2.2 Define f_{2n}, f_2 : [0, 1] → R for n ∈ N by f_{2n}(x) = x^n/n and f_2(x) = 0 for x ∈ [0, 1]. Show that f_{2n} → f_2 pointwise on [0, 1].
Solution: For any x ∈ [0, 1], lim_{n→∞} x^n/n = 0—thus f_{2n} → f_2 pointwise on [0, 1].
Example 8.2.3 Define f_{3n}, f_3 : [0, 1] → R for n ∈ N by f_{3n}(x) = nx/(1 + n^2 x^2) and f_3(x) = 0 for x ∈ [0, 1]. Show that f_{3n} → f_3 pointwise on [0, 1].
Solution: Since f_{3n}(0) = 0 for all n, then f_{3n}(0) → 0 = f_3(0). For x satisfying 0 < x ≤ 1, lim_{n→∞} nx/(1 + n^2 x^2) = 0 = f_3(x). Thus f_{3n} → f_3 pointwise on [0, 1].
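A sketch of the pointwise convergence in Example 8.2.3: for each fixed x the values f_{3n}(x) tend to 0 as n grows. Evaluating at x = 1/n gives f_{3n}(1/n) = 1/2 for every n—a hint, relevant when uniform convergence appears later, that the rate of convergence depends on x:

```python
def f3(n, x):
    return n * x / (1.0 + n ** 2 * x ** 2)

for x in (0.0, 0.01, 0.5, 1.0):
    print(x, f3(10, x), f3(10 ** 6, x))   # → 0 for each fixed x

print(all(abs(f3(10 ** 6, x)) < 1e-3 for x in (0.0, 0.01, 0.5, 1.0)))  # True
print(abs(f3(10, 0.1) - 0.5) < 1e-12)  # but f3(n, 1/n) = 1/2 for every n
```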
Series. We started this discussion talking about in which manner the Taylor polynomial associated with f, T_n, converges to f. Specifically, let T_n denote the Taylor polynomial associated with f(x) = e^x and consider the domain D = [−3, 3]. Earlier we found that lim_{n→∞} T_n(x) = e^x for x ∈ [−3, 3]. Thus the sequence {T_n} converges to e^x pointwise on D = [−3, 3].
It should be clear that {T_n} is a different sort of sequence from {f_{1n}}, {f_{2n}} and {f_{3n}} defined above. Recall that the sequence of Taylor polynomials T_n associated with f(x) = e^x is given by T_n(x) = Σ_{k=0}^{n} (1/k!) x^k. All Taylor polynomials look similar—given as a sum of n + 1 terms. When we take the limit as n approaches ∞, we are computing an infinite sum. We want to understand what we mean by Σ_{k=0}^{∞} (1/k!) x^k. Sequences such as these are referred to as a series of functions. To provide a logical setting to discuss series of functions we first introduce series of real numbers.
For a sequence {a_1, a_2, · · · } where a_i ∈ R for all i = 1, 2, · · · , we want to discuss what we mean by Σ_{i=1}^{∞} a_i, the sum of an infinite number of real numbers. We define the partial sums of {a_i} by s_n = Σ_{i=1}^{n} a_i for n ∈ N and consider the sequence of partial sums, {s_n}.
Definition 8.2.2 Consider the real sequence {a_i} and the associated sequence of partial sums {s_n}, s_n = Σ_{i=1}^{n} a_i. If the sequence {s_n} is convergent, say to s, we say that the series Σ_{i=1}^{∞} a_i converges and we define Σ_{i=1}^{∞} a_i = s = lim_{n→∞} s_n. We refer to Σ_{i=1}^{∞} a_i as an infinite series, or just a series. If the sequence {s_n} does not converge, we say that the series Σ_{i=1}^{∞} a_i does not converge. If s_n → ±∞, we say that the series Σ_{i=1}^{∞} a_i diverges to ±∞, respectively—but make sure you understand that a series that diverges to ±∞ does not converge in R.
Consider the following example.
Example 8.2.4 Consider the real series Σ_{i=1}^{∞} a_i where a_i = r^i for some r ∈ R. Then the series Σ_{i=1}^{∞} a_i converges if and only if |r| < 1.
Solution: Recall that in Example 1.6.1 we showed that the formula for the sum of a finite geometric series was given by Σ_{j=0}^{n} R^j = (1 − R^{n+1})/(1 − R) (where r was changed to R for convenience). Applying this formula to the series given above gives the formula for the partial sum s_n = Σ_{i=1}^{n} r^i = r(1 − r^n)/(1 − r). If r = 1, we use the fact that s_n = Σ_{i=1}^{n} r^i = Σ_{i=1}^{n} 1 = n to see that the sequence {s_n} diverges to infinity. When r ≠ 1, lim_{n→∞} s_n exists if and only if lim_{n→∞} r^n exists. By Examples 3.2.6, 3.5.1, 3.6.2 and the discussion following Example 3.6.2 we know that lim_{n→∞} r^n exists if and only if |r| < 1.
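The partial sums in Example 8.2.4 can be computed directly: for |r| < 1 they settle at r/(1 − r), while for r = 1 they grow without bound. A sketch with r = 0.5:

```python
def s(n, r):
    """Partial sum s_n = sum_{i=1}^n r^i."""
    return sum(r ** i for i in range(1, n + 1))

r = 0.5
print(abs(s(100, r) - r / (1.0 - r)) < 1e-12)  # True: s_n → r/(1 - r)
print(s(100, 1.0) == 100.0)                    # True: for r = 1, s_n = n → ∞
```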
The geometric series is very nice, but this is almost the only series for which we can write out and work explicitly with the sequence of partial sums (telescoping series give one more example).
When we consider the convergence of a series, it is sometimes useful to realize that when we are showing that s_n → s, where s is to be the sum of the series Σ_{i=1}^{∞} a_i, we must consider s − s_n—as in |s − s_n| < ǫ. And s − s_n = Σ_{i=n+1}^{∞} a_i. Thus to show that a series converges, we must show that the sum of the "tail end" of the series is arbitrarily small.
And finally, one other approach that is extremely useful when working with the convergence of series is to use the Cauchy criterion for the convergence of the sequence {s_n} introduced in Section 3.4. Recall that when we discussed the Cauchy criterion, we noted that it was a case where we did not need to know the limit of the sequence. This is especially convenient when we are working with series in that we hardly ever know or can guess the sum of the series. We include the application of the Cauchy criterion to the convergence of series in the following proposition.
Proposition 8.2.3 Consider the real sequence {a_i}. The series Σ_{i=1}^{∞} a_i converges if and only if for every ǫ > 0 there exists N ∈ R such that m, n ∈ N, m ≥ n and m, n > N implies that |Σ_{i=n}^{m} a_i| < ǫ.
Proof: This result follows from Proposition 3.4.11 in that {sn } is convergent
if and only if the sequence {sn } is a Cauchy sequence. The sequence {sn } is a
Cauchy sequence if for every ǫ > 0 there exists an N ∈ R such that n, m ∈ N
and n, m > N implies |sm − sn | < ǫ. This can easily be adjusted by setting
N ∗ = N + 1 and requiring m, n > N ∗ which implies that |sm − sn−1 | < epsilon.
208
8. Sequences and Series
P
If we take m ≥ n (one of the two must be larger), then sm − sn−1 = m
i=n ai .
The result follows.
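Proposition 8.2.3 in action, with a_i = 1/i^2 (a series we will later see converges): the blocks |Σ_{i=n}^m a_i| become small once n is large, and the computation never needs the value of the sum. A sketch:

```python
def block(n, m):
    """|sum_{i=n}^m a_i| with a_i = 1/i^2."""
    return abs(sum(1.0 / i ** 2 for i in range(n, m + 1)))

print(block(2, 10))               # still noticeable
print(block(1000, 5000))          # small
print(block(1000, 5000) < 1e-3)   # True: the blocks shrink as n grows
```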
We used the example of convergence of Taylor polynomials to motivate the convergence of series. We now realize that the convergence of Taylor polynomials is really the convergence of a series, a Taylor series. For that reason (and the fact that it is an important concept) we now define what we mean by the pointwise convergence of a series of functions.
Definition 8.2.4 Consider the sequence of functions {f_i(x)} where for each i, f_i : D → R, D ⊂ R. If for each x ∈ D the real series Σ_{i=1}^{∞} f_i(x) is convergent, say to s(x), then we say that the series of functions Σ_{i=1}^{∞} f_i(x) converges pointwise to s(x).
Begin by noting that the notation used above is not very good. At the function level it would be better to say that the series of functions Σ_{i=1}^{∞} f_i converges pointwise to s—but the above notation is reasonably common.
We should note that we can also consider the sequence of partial sums of functions, s_n(x) = Σ_{i=1}^{n} f_i(x), and say that if the sequence {s_n(x)} converges pointwise, say to s(x), then the series of functions Σ_{i=1}^{∞} f_i(x) is said to converge pointwise and is defined to be equal to s(x).
In our consideration of the convergence of sequences of Taylor polynomials we have already given a very common example of a series of functions. Since T_n(x) was really a partial sum, when we considered the convergence of the Taylor polynomials of f(x) = e^x on [−3, 3], we were proving the pointwise convergence of the series of functions Σ_{i=0}^{∞} (1/i!) x^i (and we hope that you realize that it is not important that we considered general series starting at i = 1 while the Taylor series started with i = 0). Because we expanded f(x) = e^x about x = 0, the series given above is the Maclaurin series of f. In general we make the following definition.
Definition 8.2.5 Let I be a neighborhood of x = a and suppose f : I → R has derivatives of all orders at x = a. Then Σ_{k=0}^{∞} (f^{(k)}(a)/k!)(x − a)^k is called the Taylor series expansion of f about x = a. When a = 0, the Taylor series is most often referred to as the Maclaurin series.
HW 8.2.1 Prove that the sequence of functions {f_n}, where f_n : [0, 1] → R is defined by f_n(x) = nx(1 − x^2)^n, converges to f where f(x) = 0 for all x ∈ [0, 1].
HW 8.2.2 Consider the sequence of functions {f_n} where f_n : R → R is defined by f_n(x) = x^2/(1 + x^2)^n. Show that the series Σ_{n=0}^{∞} f_n(x) = Σ_{n=0}^{∞} x^2/(1 + x^2)^n converges pointwise and determine the limiting function.
HW 8.2.3 Determine the Taylor series of the function f(x) = 1/(x + 1) about x = 2.
8.3
Tests for Convergence
As a part of our discussion of the pointwise convergence of the Taylor polynomials, we also considered real series for each fixed x—the sequence of Taylor
polynomials for a fixed x. For these Taylor series we were able to prove convergence by the use of Taylor’s Inequality, Proposition 8.1.3. For general series (and
hence series of functions) we do not have a result as nice as Taylor’s Inequality—
and they are surely not all as nice as a geometric series. For this reason we need
and will develop a set of tools that can be used to prove convergence of series.
We begin with an obvious result of Definition 8.2.2 and Proposition 3.3.2, parts
(a) and (b).
Proposition 8.3.1 Suppose Σ_{i=1}^{∞} a_i and Σ_{i=1}^{∞} b_i are two convergent real series and c ∈ R. Then
(a) Σ_{i=1}^{∞} (a_i + b_i) converges and Σ_{i=1}^{∞} (a_i + b_i) = Σ_{i=1}^{∞} a_i + Σ_{i=1}^{∞} b_i, and
(b) Σ_{i=1}^{∞} c a_i converges and Σ_{i=1}^{∞} c a_i = c Σ_{i=1}^{∞} a_i.
When you look back to Proposition 3.3.2, you might ask "what about part (d)?" We don't know it yet, but we will find later that Σ_{i=1}^{∞} (−1)^i/√i is convergent (twice) but Σ_{i=1}^{∞} ((−1)^i/√i)((−1)^i/√i) = Σ_{i=1}^{∞} 1/i is not. Hence there is no nice result that gives convergence for a series resulting from a term-by-term product of two convergent series.
The next result is very easy but was a very important tool in your basic course.
Proposition 8.3.2 If the series $\sum_{i=1}^{\infty} a_i$ converges, then $\lim_{i\to\infty} a_i = 0$.

Proof: If $s_n$ represents the partial sum associated with the convergent series
$\sum_{i=1}^{\infty} a_i$, we know that both limits $\lim_{n\to\infty} s_n$ and $\lim_{n\to\infty} s_{n-1}$ exist and equal
$s = \sum_{i=1}^{\infty} a_i$. Then $a_n = s_n - s_{n-1} \to s - s = 0$.
As we said earlier, the result given in Proposition 8.3.2 is very important—
but not in the form given there. For this reason we state the
contrapositive of Proposition 8.3.2 as the following corollary—called the "test
for divergence" in the basic course.
Corollary 8.3.3 (Test for Divergence) Consider the series $\sum_{i=1}^{\infty} a_i$. If $\lim_{i\to\infty} a_i \ne 0$, then the series $\sum_{i=1}^{\infty} a_i$ does not converge.
One thing that we want to emphasize is that the statement "$\lim_{i\to\infty} a_i \ne 0$" can
be satisfied if either the limit does not exist, or the limit exists and is not equal
to zero. Of course this corollary can be used to show that the series $\sum_{i=1}^{\infty} (-1)^i$,
$\sum_{i=1}^{\infty} 2^i$ and $\sum_{i=1}^{\infty} \sin(i)$ do not converge. We do not know it yet but the series $\sum_{i=1}^{\infty} \frac{1}{i}$
does not converge. For this series $a_i = 1/i \to 0$. Hence we emphasize that the
converse of Proposition 8.3.2 is not true.
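The harmonic series is worth seeing numerically. The following sketch (plain Python, an illustration rather than part of the text's development) shows both sides of the point: the terms $1/i$ tend to 0, yet the partial sums keep growing, roughly like $\ln n$, so $a_i \to 0$ alone cannot guarantee convergence.

```python
import math

def partial_sum(term, n):
    """Return term(1) + term(2) + ... + term(n)."""
    return sum(term(i) for i in range(1, n + 1))

# The terms of the harmonic series tend to 0 ...
print(1 / 10**6)

# ... but the partial sums grow without bound, roughly like ln n:
for n in (10, 1000, 100000):
    print(n, partial_sum(lambda i: 1 / i, n), math.log(n))
```

The gap between the partial sum and $\ln n$ settles near Euler's constant $\gamma \approx 0.5772$, which makes the logarithmic growth easy to spot in the output.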
We next include a concept that will be very important to us later. We begin
by including the definition of absolute and conditional convergence.
Definition 8.3.4 Suppose $\{a_i\}$ is a real sequence. We say that the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent if the series $\sum_{i=1}^{\infty} |a_i|$ is convergent. If the series $\sum_{i=1}^{\infty} a_i$ is convergent but not absolutely convergent, then the series is said to be conditionally
convergent.
We then state and prove the following result.
Proposition 8.3.5 Suppose $\{a_i\}$ is a real sequence. If the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent, then it is convergent.
Proof: This is one of the results where it is very convenient to consider the
Cauchy criterion for the convergence of a series given in Proposition 8.2.3. We
suppose that we are given an $\epsilon > 0$. Since $\sum_{i=1}^{\infty} a_i$ is absolutely convergent, we
know that there exists $N \in \mathbb{R}$ such that $n, m \in \mathbb{N}$, $n, m > N$ and $m > n$ (for
convenience) implies $\left| \sum_{i=n}^{m} |a_i| \right| < \epsilon$ (where the outer absolute value signs are not
really needed). Then $m, n \in \mathbb{N}$, $m, n > N$ and $m > n$ implies—by multiple
applications of the triangular inequality, Proposition 1.5.8-(v), or an easy math
induction proof—$\left| \sum_{i=n}^{m} a_i \right| \le \sum_{i=n}^{m} |a_i| < \epsilon$. Thus by the Cauchy criterion for
convergence of series the series $\sum_{i=1}^{\infty} a_i$ converges.
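A quick numerical illustration of Proposition 8.3.5 (a sketch, not part of the formal development): the series $\sum_{i=1}^{\infty} (-1)^i/2^i$ is absolutely convergent because the series of absolute values is geometric with $r = 1/2$, and its partial sums indeed settle at the geometric value $(-1/2)/(1 - (-1/2)) = -1/3$.

```python
def signed_partial(n):
    """Partial sums of sum_{i>=1} (-1)^i / 2^i."""
    return sum((-1) ** i / 2 ** i for i in range(1, n + 1))

def abs_partial(n):
    """Partial sums of the series of absolute values, sum_{i>=1} 1/2^i."""
    return sum(1 / 2 ** i for i in range(1, n + 1))

# The absolute series converges (geometric, r = 1/2), so the signed series
# must converge as well; its limit is (-1/2)/(1 - (-1/2)) = -1/3.
print(abs_partial(40), signed_partial(40))
```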
We see that if we consider the series $\sum_{i=1}^{\infty} \frac{1}{2^i}$, which we know is convergent
because it is a geometric series associated with $r = 1/2 < 1$, then we immediately
know that the series $\sum_{i=1}^{\infty} (-1)^i \frac{1}{2^i}$ is convergent—or any other series like this
where some of the terms are negative. We will apply this result often in a
similar way.

The series results given so far do not directly help us decide whether or
not series converge. When we worked with sequences, we had many methods
that helped find limits. We next state and prove a series of results that help
determine whether or not a series is convergent. We begin with the integral test
(recall that we considered improper integrals in Section 7.8).
Proposition 8.3.6 (Integral Test) Suppose that $\sum_{i=1}^{\infty} a_i$ is a real series and
suppose $f : [1, \infty) \to \mathbb{R}$ is a positive, decreasing continuous function for which $f(i) = a_i$ for $i \in \mathbb{N}$. Then $\sum_{i=1}^{\infty} a_i$ converges if and only if the integral $\int_1^{\infty} f$ exists.
Proof: Before we proceed we emphasize that we are assuming that $\int_1^{\infty} f$
exists in $\mathbb{R}$ (we do not include convergence to $\infty$ for this assumption). Since $f$
is decreasing on the interval $[i-1, i]$, we know that $f(i-1) \ge f(t) \ge f(i)$ for
any $t \in [i-1, i]$. Hence by Proposition 7.4.5, $f(i-1) = \int_{i-1}^{i} f(i-1)\,dt \ge \int_{i-1}^{i} f \ge \int_{i-1}^{i} f(i)\,dt = f(i)$, or $a_{i-1} \ge \int_{i-1}^{i} f \ge a_i$. If we sum from $i = 2$ to $i = n$ and
apply Proposition 7.4.3, we get
$$s_{n-1} \ge \int_1^n f \ge s_n - a_1. \qquad (8.3.1)$$

($\Rightarrow$) We assume $s = \sum_{i=1}^{\infty} a_i$ converges. Since $a_i \ge 0$ for all $i$, the left side of
inequality (8.3.1) yields $\int_1^n f \le s_{n-1} \le \sum_{i=1}^{\infty} a_i = s$. Therefore, the sequence
$b_n = \int_1^n f$ is a bounded, monotone increasing sequence. By the Monotone
Convergence Theorem, Theorem 3.5.2-(a), the limit $\lim_{n\to\infty} b_n = \lim_{n\to\infty} \int_1^n f$ exists
and equals $L = \mathrm{lub}\{b_n : n \in \mathbb{N}\}$. Because $L \ge b_n$ for any $n$, the convergence
of the sequence $\{b_n\}$ can be expressed as follows: for every $\epsilon > 0$ there exists
$N \in \mathbb{R}$ such that $n > N$ implies that $|b_n - L| = L - b_n < \epsilon$.
The above limit is not enough to show that the improper integral $\int_1^{\infty} f$
exists. We must show that $\lim_{R\to\infty} \int_1^R f$ exists. We claim that this limit does in
fact exist and will equal $L$. Suppose that $\epsilon > 0$ is given. Choose the $N$ based
on the convergence of the sequence $\{b_n\}$. Let $N_1 = N + 1$ and suppose that
$R > N_1$. Note that since $f(x) \ge 0$, we can use Propositions 7.3.3 and 7.4.5
to show that $\int_1^R f \ge \int_1^{[R]} f$ where $[R]$ is the greatest integer function. Then
$[R] > N$ and
$$\left| L - \int_1^R f \right| = L - \int_1^R f \le L - \int_1^{[R]} f = L - b_{[R]} < \epsilon.$$
Therefore $\lim_{R\to\infty} \int_1^R f$ exists and the improper integral $\int_1^{\infty} f$ exists.
($\Leftarrow$) We now assume that the integral $\int_1^{\infty} f$ exists. By the right side of inequality (8.3.1), the fact that $f$ is positive and the fact that $\int_1^{\infty} f$ exists, we
see that $s_n - a_1 \le \int_1^n f \le \int_1^{\infty} f$—the sequence $\{s_n - a_1\}$ is bounded. Since
$a_i \ge 0$ for all $i$, the sequence $\{s_n\}$ is increasing—so the sequence $\{s_n - a_1\}$ is also
increasing. Thus by the Monotone Convergence Theorem, Theorem 3.5.2-(a),
the sequence $\{s_n - a_1\}$ is convergent. Thus the sequence $\{s_n\}$ is also convergent
(using Proposition 3.3.2-(a)) and the series $\sum_{i=1}^{\infty} a_i$ converges.
The form of the integral test given in Proposition 8.3.6 is not in the form
that we are accustomed to using. We rewrite Proposition 8.3.6 in the following
corollary where we include one of the implications from Proposition 8.3.6 and
the contrapositive of the other implication from Proposition 8.3.6.
Corollary 8.3.7 (Integral Test) Suppose that $\sum_{i=1}^{\infty} a_i$ is a real series and suppose $f : [1,\infty) \to \mathbb{R}$ is a positive, decreasing continuous function for which $f(i) = a_i$
for $i \in \mathbb{N}$.
(a) If the improper integral $\int_1^{\infty} f$ exists, then the series $\sum_{i=1}^{\infty} a_i$ is convergent.
(b) If the improper integral $\int_1^{\infty} f$ does not exist, then the series $\sum_{i=1}^{\infty} a_i$ is not
convergent.
The integral test is an especially good result because it gives us a large number
of convergent series easily. It is easy to see that $\int_1^{\infty} \frac{1}{x^p}$ exists for $p > 1$ and
does not exist for $p \le 1$. Thus we get the p-series: $\sum_{i=1}^{\infty} \frac{1}{i^p}$ converges for $p > 1$
and diverges if $p \le 1$. There are some other series on which the integral test can
be used but not many important ones. We also note at this time that if we apply
the idea of absolute convergence along with some of these p-series, we obtain
more convergent series. We see that since $\sum_{i=1}^{\infty} \frac{1}{i^3}$ converges, then $\sum_{i=1}^{\infty} (-1)^i \frac{1}{i^3}$
also converges. Of course we can find many more convergent series using this
method.
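Inequality (8.3.1) can be checked numerically for p-series. The sketch below (an illustration only, with $a_1 = 1$ for every p-series) evaluates the closed form of $\int_1^n x^{-p}\,dx$ and confirms that it is squeezed between $s_{n-1}$ and $s_n - a_1$, for a convergent case ($p = 2$) and two divergent ones ($p = 1$, $p = 1/2$).

```python
import math

def p_partial(p, n):
    """s_n = sum_{i=1}^{n} 1/i^p."""
    return sum(1 / i ** p for i in range(1, n + 1))

def p_integral(p, n):
    """Closed form of the integral of x^(-p) from 1 to n."""
    if p == 1:
        return math.log(n)
    return (n ** (1 - p) - 1) / (1 - p)

# Inequality (8.3.1): s_{n-1} >= integral_1^n f >= s_n - a_1, with a_1 = 1.
n = 10000
for p in (2.0, 1.0, 0.5):
    print(p, p_partial(p, n - 1), p_integral(p, n), p_partial(p, n) - 1)
```

For $p = 2$ the integral stays below 1, so the partial sums are bounded; for $p \le 1$ the integral grows without bound and drags the partial sums with it.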
The next result, the comparison test, is important but is often difficult to
use.
Proposition 8.3.8 (Comparison Test) Suppose that $\{a_i\}$ and $\{b_i\}$ are real,
positive sequences and suppose that for some $N_1 \in \mathbb{N}$, $a_i \le b_i$ for all $i \ge N_1$. If
the series $\sum_{i=1}^{\infty} b_i$ converges, then the series $\sum_{i=1}^{\infty} a_i$ converges.
Proof: Since $\sum_{i=1}^{\infty} b_i$ converges, we know from Proposition 8.2.3 that for every
$\epsilon > 0$ there exists an $N_2 \in \mathbb{R}$ such that $n, m \in \mathbb{N}$, $n, m > N_2$ and $m > n$
implies that $\sum_{i=n}^{m} b_i < \epsilon$ (where no absolute value signs are needed since $\{b_i\}$ was
assumed to be a positive sequence). If we then let $N = \max\{N_1, N_2\}$, we know
that for $n, m \in \mathbb{N}$, $n, m > N$ and $m > n$ we have $\sum_{i=n}^{m} a_i \le \sum_{i=n}^{m} b_i < \epsilon$. Therefore
again by Proposition 8.2.3 we know that $\sum_{i=1}^{\infty} a_i$ converges.
We mentioned earlier that the comparison test is often difficult to use. If we
consider a series such as $\sum_{i=1}^{\infty} \frac{1}{i^2+i+1}$, it is easy to see that $\frac{1}{i^2+i+1} \le \frac{1}{i^2}$.
The series $\sum_{i=1}^{\infty} \frac{1}{i^2}$ converges because it is a p-series with $p = 2$. Hence, by the comparison
test, $\sum_{i=1}^{\infty} \frac{1}{i^2+i+1}$ converges.
If we instead consider a series such as $\sum_{i=1}^{\infty} \frac{1}{i^2-i+1}$ we have to be more
clever. We note that $i^2 - i + 1 \ge i^2 - 2i + 1 = (i-1)^2$, so $\frac{1}{(i-1)^2} \ge \frac{1}{i^2-i+1}$
for $i \ge 2$. The series $\sum_{i=2}^{\infty} \frac{1}{(i-1)^2}$ is a p-series with $p = 2$ so it is convergent—
it is not exactly in the form of a p-series but it should be clear that with a
change of variable $j = i - 1$ it is exactly in the form of a p-series.
Then by the Comparison Test, $\sum_{i=1}^{\infty} \frac{1}{i^2-i+1}$ is convergent. We note that the
series in Proposition 8.3.8 both start at $i = 1$ where in this example the series
$\sum_{i=2}^{\infty} \frac{1}{(i-1)^2}$ starts at $i = 2$. This is no problem. We could add an $i = 1$ term,
say $b_1 = 13$. The series $13 + \sum_{i=2}^{\infty} \frac{1}{(i-1)^2}$ will still be convergent and Proposition
8.3.8 will apply with $N_1 = 2$.
Just as we did following the integral test, we can apply the comparison test
in conjunction with absolute convergence. Using the comparison test and the
fact that $\frac{|\sin i|}{i^2} \le \frac{1}{i^2}$, it is easy to see that the series $\sum_{i=1}^{\infty} \frac{|\sin i|}{i^2}$ is convergent.
Then using Proposition 8.3.5 we know that $\sum_{i=1}^{\infty} \frac{\sin i}{i^2}$ converges.
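The term-by-term domination is easy to watch numerically. The sketch below (an illustration, not a proof) checks that the partial sums of $\sum |\sin i|/i^2$ stay below those of the dominating p-series, which in turn stay below $\pi^2/6$ (the known value of $\sum 1/i^2$, quoted here only as a convenient bound).

```python
import math

n = 5000
abs_partial = sum(abs(math.sin(i)) / i ** 2 for i in range(1, n + 1))
dominating = sum(1 / i ** 2 for i in range(1, n + 1))
signed = sum(math.sin(i) / i ** 2 for i in range(1, n + 1))

# |sin i|/i^2 <= 1/i^2 term by term, and the p-series partial sums stay
# below pi^2/6, so both sequences of partial sums are bounded.
print(abs_partial, dominating, math.pi ** 2 / 6)
print(signed)  # the signed partial sums settle as well
```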
The next convergence test is an extremely nice result that takes care of most
of the difficulties associated with the comparison test.
Proposition 8.3.9 (Limit Comparison Test) Suppose that $\{a_i\}$ and $\{b_i\}$
are positive, real sequences.
(a) If $\lim_{i\to\infty} \frac{a_i}{b_i} \ne 0$, then the series $\sum_{i=1}^{\infty} a_i$ is convergent if and only if the series
$\sum_{i=1}^{\infty} b_i$ is convergent. Note that (a) can be worded as follows:
(a$_1$) If $\lim_{i\to\infty} \frac{a_i}{b_i} \ne 0$ and $\sum_{i=1}^{\infty} b_i$ converges, then $\sum_{i=1}^{\infty} a_i$ converges, and
(a$_2$) If $\lim_{i\to\infty} \frac{a_i}{b_i} \ne 0$ and $\sum_{i=1}^{\infty} b_i$ does not converge, then $\sum_{i=1}^{\infty} a_i$ does not converge.
(b) If $\lim_{i\to\infty} \frac{a_i}{b_i} = 0$ and $\sum_{i=1}^{\infty} b_i$ converges, then $\sum_{i=1}^{\infty} a_i$ converges.
Proof: The statement of the proposition above really consists of parts (a) and
(b). Parts (a$_1$) and (a$_2$) are rewordings of part (a)—one implication and the
contrapositive of the other implication. Statements (a$_1$) and (a$_2$) are in a form
much easier to apply than that of (a).
(a) ($\Rightarrow$) We assume that $\lim_{i\to\infty} \frac{a_i}{b_i} = r \ne 0$ and the series $\sum_{i=1}^{\infty} a_i$ is convergent.
Since $a_i$ and $b_i$ are positive, $r > 0$. Because $\lim_{i\to\infty} \frac{a_i}{b_i} = r > 0$, for every $\epsilon > 0$
there exists $N \in \mathbb{R}$ such that $i > N$ implies that $\left| \frac{a_i}{b_i} - r \right| < \epsilon$ or
$$r - \epsilon < \frac{a_i}{b_i} < r + \epsilon. \qquad (8.3.2)$$
Since the sequence $\{b_i\}$ is assumed positive, inequality (8.3.2) can be rewritten
as
$$(r - \epsilon)b_i < a_i < (r + \epsilon)b_i. \qquad (8.3.3)$$
Choose $\epsilon = r/2$. Then for $i > N$ we have $(r/2)b_i < a_i$. By the comparison test,
Proposition 8.3.8, since $\sum_{i=1}^{\infty} a_i$ converges, $\sum_{i=1}^{\infty} (r/2)b_i$ converges. By Proposition
8.3.1-(b) this implies that $\sum_{i=1}^{\infty} b_i$ is also convergent.
i=1
($\Leftarrow$) The proof of this direction is almost identical to the previous proof. The
difference is that this time, the right hand half of inequality (8.3.3) is used
along with the comparison test to show that if $\sum_{i=1}^{\infty} b_i$ converges, then $\sum_{i=1}^{\infty} a_i$ is
also convergent—try it.
(b) If $\lim_{i\to\infty} \frac{a_i}{b_i} = 0$, then for $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that $i > N$ implies
that $\frac{a_i}{b_i} < \epsilon$ (no absolute value signs are necessary because both sequences are
positive). Thus for $i > N$ we have
$$a_i < \epsilon b_i. \qquad (8.3.4)$$
Thus by Proposition 8.3.1-(b) and the comparison test, the convergence of
$\sum_{i=1}^{\infty} b_i$ implies the convergence of $\sum_{i=1}^{\infty} a_i$.
Hopefully you remember from your basic course that you can easily prove the
convergence of $\sum_{i=1}^{\infty} \frac{1}{i^2+i+1}$ by setting $a_i = \frac{1}{i^2+i+1}$, $b_i = \frac{1}{i^2}$ and applying
part (a$_1$) of the limit comparison test (realizing that $\sum_{i=1}^{\infty} b_i$ converges because it
is a p-series with $p = 2$). This is much easier than applying the comparison test.
To show that $\sum_{i=1}^{\infty} \frac{i^2+i+1}{i^3+i^2+i+1}$ does not converge, we set $a_i = \frac{i^2+i+1}{i^3+i^2+i+1}$,
$b_i = \frac{1}{i}$, show that $\frac{a_i}{b_i} \to 1$ as $i \to \infty$, and apply part (a$_2$) of the limit comparison
test to see that $\sum_{i=1}^{\infty} \frac{i^2+i+1}{i^3+i^2+i+1}$ does not converge (recall that $\sum_{i=1}^{\infty} \frac{1}{i}$ diverges
since it is a p-series with $p = 1$).

Generally, the limit comparison test allows you to prove the convergence or
divergence of a series by "comparing" the series with a known, much nicer series
that is similar to the original series—similar in that the limit $a_i/b_i \to r$ exists.
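The limit in the example above is easy to confirm numerically. This sketch (an illustration only) computes $a_i/b_i$ for the series just discussed and shows the ratio approaching 1, which is what licenses the comparison with $\sum 1/i$.

```python
def a(i):
    """Terms of the series sum (i^2 + i + 1)/(i^3 + i^2 + i + 1)."""
    return (i * i + i + 1) / (i ** 3 + i ** 2 + i + 1)

def b(i):
    """Terms of the comparison series sum 1/i."""
    return 1 / i

# The ratio a_i/b_i tends to 1, so the two series diverge together.
for i in (10, 10**3, 10**6):
    print(i, a(i) / b(i))
```

Algebraically $a_i/b_i = 1 - 1/(i^3+i^2+i+1)$, so the ratio approaches 1 from below.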
We next introduce the convergence test that might be the most important
test of them all. The ratio test is applicable to series that are almost geometric
series—as we shall see from the proof and the examples that follow. Of course the
ratio test will work on a geometric series but we don't need it there.
Proposition 8.3.10 (Ratio Test) Consider a real sequence of non-zero elements $\{a_i\}$.
(a) Suppose that there exists $r$, $0 < r < 1$, and an $N \in \mathbb{N}$ such that $i \ge N$
implies that $\left| \frac{a_{i+1}}{a_i} \right| \le r$. Then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.
(b) Suppose that there exists $r$, $r \ge 1$, and an $N \in \mathbb{N}$ such that $i \ge N$ implies
that $\left| \frac{a_{i+1}}{a_i} \right| \ge r$. Then the series $\sum_{i=1}^{\infty} a_i$ does not converge.
Proof: (a) Suppose that there exists $r$, $0 < r < 1$, and an $N \in \mathbb{N}$ such that
$i \ge N$ implies that $\left| \frac{a_{i+1}}{a_i} \right| \le r$. Note the following.
• $\left| \frac{a_{N+1}}{a_N} \right| \le r$ implies that $|a_{N+1}| \le r|a_N|$
• $\left| \frac{a_{N+2}}{a_{N+1}} \right| \le r$ implies that $|a_{N+2}| \le r|a_{N+1}| \le r^2|a_N|$
• Claim: $m \ge 1$ implies that $|a_{N+m}| \le r^m|a_N|$
Proof by mathematical induction:
Step 1: True for $m = 1$, given above.
Step 2: Assume true for $m = k$, i.e. assume that $|a_{N+k}| \le r^k|a_N|$.
Step 3: Since $\left| \frac{a_{N+k+1}}{a_{N+k}} \right| \le r$, $|a_{N+k+1}| \le r|a_{N+k}| \le r \cdot r^k|a_N|$ (by the
inductive hypothesis). Thus $|a_{N+k+1}| \le r^{k+1}|a_N|$, i.e. it is true for $m = k+1$.
Therefore by math induction the statement is true for all $m$, i.e. for $m \ge 1$,
$|a_{N+m}| \le r^m|a_N|$.
We know that the series $\sum_{m=1}^{\infty} r^m|a_N|$ is convergent since it is a geometric
series. By the comparison test we then know that $\sum_{m=1}^{\infty} |a_{N+m}|$ is convergent, i.e.
$\sum_{m=1}^{\infty} a_{N+m}$ is absolutely convergent. And finally, since $\sum_{i=1}^{N} a_i$ is a finite sum, we
then know that $\sum_{i=1}^{N} a_i + \sum_{m=1}^{\infty} a_{N+m} = \sum_{i=1}^{\infty} a_i$ is absolutely convergent.
(b) Suppose that there exists $r$, $r \ge 1$, and an $N \in \mathbb{N}$ such that $i \ge N$ implies
that $\left| \frac{a_{i+1}}{a_i} \right| \ge r$. By a mathematical induction proof similar to that used in
part (a) we can show that $m \ge 1$ implies that $|a_{N+m}| \ge r^m|a_N|$. Since $r \ge 1$,
it is clear that $|a_{N+m}| \not\to 0$ as $m \to \infty$. Thus it is impossible that $a_{N+m} \to 0$
(because if $a_{N+m} \to 0$, then $|a_{N+m}| \to 0$). And if $a_{N+m} \not\to 0$, it should be clear
that $a_i \not\to 0$. Thus by Corollary 8.3.3 we know that the series $\sum_{i=1}^{\infty} a_i$ does not
converge.
We should note that the above version of the ratio test is not the version
usually included in the basic calculus texts. We include the following version of
the ratio test.
Corollary 8.3.11 (Ratio Test) Consider a real sequence of non-zero elements
$\{a_i\}$. Suppose that $\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = r$. Then
(a) if $r < 1$, then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.
(b) if $r > 1$, then the series $\sum_{i=1}^{\infty} a_i$ is not convergent.
(c) if r = 1, then no prediction can be made.
Proof: (a) If $\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = r < 1$, for every $\epsilon > 0$ there exists $N \in \mathbb{R}$ such that
$i > N$ implies that $\left| \left| \frac{a_{i+1}}{a_i} \right| - r \right| < \epsilon$. Choose $\epsilon = (1-r)/2$ and set $N_1 = [N] + 1$.
Then for $i \ge N_1$ we have $0 < \left| \frac{a_{i+1}}{a_i} \right| < r + \frac{1-r}{2} = \frac{r+1}{2} < 1$. Thus by part
(a) of Proposition 8.3.10 the series $\sum_{i=1}^{\infty} a_i$ converges absolutely.
(b) Again we know that for every $\epsilon > 0$ there exists $N \in \mathbb{R}$ such that $i > N$
implies that $\left| \left| \frac{a_{i+1}}{a_i} \right| - r \right| < \epsilon$—but now $r > 1$. Choose $\epsilon = (r-1)/2$ and set
$N_1 = [N] + 1$. Then for $i \ge N_1$ we have $1 < \frac{r+1}{2} = r - \frac{r-1}{2} < \left| \frac{a_{i+1}}{a_i} \right|$. Thus
we can use part (b) of Proposition 8.3.10 to see that the series $\sum_{i=1}^{\infty} a_i$ does not
converge.
(c) If we consider $\sum_{i=1}^{\infty} \frac{1}{i}$, we see that $\lim_{i\to\infty} \frac{1/(i+1)}{1/i} = 1$ and we know that the
series $\sum_{i=1}^{\infty} \frac{1}{i}$ does not converge. If we consider $\sum_{i=1}^{\infty} \frac{1}{i^2}$, we see that $\lim_{i\to\infty} \frac{1/(i+1)^2}{1/i^2} =
1$ and we know that the series $\sum_{i=1}^{\infty} \frac{1}{i^2}$ converges. Hence the condition that $r = 1$
does not determine whether or not a series converges.
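The behavior in parts (a) and (c) is easy to observe numerically. The sketch below (an illustration, not part of the proof) computes the ratios $|a_{i+1}/a_i|$ for three series: for $\sum 1/i!$ the ratios drive toward 0, while for the divergent $\sum 1/i$ and the convergent $\sum 1/i^2$ the ratios both approach 1, so the ratio test is silent in both of those cases.

```python
import math

def ratio(term, i):
    """|a_{i+1}/a_i| for the series with terms a_i = term(i)."""
    return abs(term(i + 1) / term(i))

# For sum 1/i! the ratios are 1/(i+1) -> 0 < 1: absolute convergence.
print([ratio(lambda i: 1 / math.factorial(i), i) for i in (4, 9, 49)])

# For sum 1/i (divergent) and sum 1/i^2 (convergent) the ratios both tend
# to 1, so the ratio test gives no information in either case.
print(ratio(lambda i: 1 / i, 10**6), ratio(lambda i: 1 / i ** 2, 10**6))
```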
Notice that as a part of the proofs of parts (a) and (b) we had to do a little dance
to handle the point that we only require the "N" in the definition of a limit of
a sequence to be in $\mathbb{R}$—using $N_1 = [N] + 1$ when we needed an integer. There
were times when it was convenient to allow $N$ to be any real number that works.
In the above proof we had to pay for that earlier convenience.
Another well known test is the root test. At times the root test is clearly the
natural test to use. In most other cases the root test is very difficult to apply.
We state the following result.
Proposition 8.3.12 (Root Test) Consider a real sequence $\{a_i\}$.
(a) Suppose that there exists $r$, $0 < r < 1$, and an $N \in \mathbb{N}$ such that $i \ge N$
implies that $|a_i|^{1/i} \le r$. Then the series $\sum_{i=1}^{\infty} a_i$ is absolutely convergent.
(b) Suppose that there exists $r$, $r \ge 1$, and an $N \in \mathbb{N}$ such that $i \ge N$ implies
that $|a_i|^{1/i} \ge r$. Then the series $\sum_{i=1}^{\infty} a_i$ does not converge.
Proof: (a) If $|a_i|^{1/i} \le r$ for $i \ge N$, then $|a_i| \le r^i$. Then since $\sum_{i=1}^{\infty} r^i$ converges
(it is a geometric series and $r < 1$), the series $\sum_{i=1}^{\infty} |a_i|$ converges—of course we
really get that the tail end of the series converges (from $N$ on), which implies
that the whole series converges.
(b) Since $|a_i|^{1/i} \ge r$, we have $|a_i| \ge r^i$. Then since $r \ge 1$, we know that
$|a_i| \not\to 0$, which implies that $a_i \not\to 0$—which by Corollary 8.3.3 implies that the
series $\sum_{i=1}^{\infty} a_i$ does not converge.
We should note that like the ratio test, the statement of Proposition 8.3.12
is not the version that is usually given in basic calculus texts. And as was the
case with the ratio test, the more traditional root test will follow from Proposition
8.3.12.
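The root test is natural when the $i$-th term is itself an $i$-th power. As a hedged illustration (the series $\sum (i/(2i+1))^i$ is my own example, not one from the text), the $i$-th root strips the exponent directly and exposes the limit $1/2 < 1$, so Proposition 8.3.12-(a) applies.

```python
def a(i):
    """Terms a_i = (i/(2i+1))^i, built to suit the root test."""
    return (i / (2 * i + 1)) ** i

def root(i):
    """|a_i|^(1/i), which here simplifies to i/(2i+1) -> 1/2."""
    return abs(a(i)) ** (1 / i)

for i in (10, 100, 1000):
    print(i, root(i))
```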
There is one more important test for convergence that we must consider.
We should realize that until this time all of our tests were for positive series
or they gave absolute convergence (ratio and root tests). The only way that we
proved convergence of a series that was not positive was to use Proposition 8.3.5—
absolute convergence implies convergence. We next consider a class of series that
are not positive, called alternating series. We now include the following definition
and the associated convergence theorem.
Definition 8.3.13 Consider a real sequence of positive elements $\{a_i\}$. The
series $\sum_{i=1}^{\infty} (-1)^{i+1} a_i$ is said to be an alternating series.
Note that we set the exponent on the −1 term to be i + 1 just so that the first
term would be positive—that seems a bit neater. This is not important. It is
still an alternating series if it starts out negative and the result given below is
equally true for alternating series that start with a negative term.
Proposition 8.3.14 (Alternating Series Test) Consider the real, positive,
decreasing sequence of elements $\{a_i\}$ and suppose that $\lim_{i\to\infty} a_i = 0$. Then the
alternating series $\sum_{i=1}^{\infty} (-1)^{i+1} a_i$ converges.
Proof: Consider the sequence of partial sums $s_{2n} = (a_1 - a_2) + (a_3 - a_4) +
\cdots + (a_{2n-1} - a_{2n})$. Since $a_k - a_{k+1} \ge 0$, the sequence is increasing. Also, since
$a_k - a_{k+1} \ge 0$ for $k = 2, \cdots, 2n-2$, and $a_{2n} > 0$,
$$s_{2n} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n} \le a_1,$$
i.e. the sequence $\{s_{2n}\}$ is bounded above. Then by the Monotone Convergence
Theorem, Theorem 3.5.2-(a), the sequence $\{s_{2n}\}$ converges—say to $s$. Then
since $s_{2n+1} = s_{2n} + a_{2n+1}$ and the fact that $a_{2n+1} \to 0$, we see that $\{s_{2n+1}\}$
also converges to $s$.
Claim: The sequence $\{s_n\}$ converges to $s$. Let $\epsilon > 0$ be given and let
$N_1$ be such that $n > N_1$ implies that $|s_{2n} - s| < \epsilon$, i.e. if $2n > 2N_1$, then
$|s - s_{2n}| < \epsilon$. Let $N_2$ be such that $n > N_2$ implies that $|s - s_{2n+1}| < \epsilon$, i.e. if
$2n + 1 > 2N_2 + 1$, then $|s - s_{2n+1}| < \epsilon$.
Then if we define $N = \max\{2N_1, 2N_2 + 1\}$, $n > N$ implies that $|s - s_n| < \epsilon$,
so $\lim_{n\to\infty} s_n = s$, i.e. the series $\sum_{i=1}^{\infty} (-1)^{i+1} a_i$ converges—to $s$.
We emphasize here that to apply the alternating series test we must show
that (i) the sequence is decreasing, (ii) $a_i \to 0$ and (iii) the series is alternating. We used the fact earlier that the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges. It is easy
to see that (i) $a_{i+1} = \frac{1}{\sqrt{i+1}} < \frac{1}{\sqrt{i}} = a_i$ (which is the same as $\sqrt{i} < \sqrt{i+1}$ or
$i < i + 1$), and (ii) $\lim_{i\to\infty} \frac{1}{\sqrt{i}} = 0$. Surely the series is alternating. Hence by the
alternating series test the series $\sum_{i=1}^{\infty} \frac{(-1)^i}{\sqrt{i}}$ converges.
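The squeezing in the proof—even partial sums increasing, odd ones decreasing, with the limit trapped between them—can be watched directly. The sketch below uses the alternating harmonic series $\sum (-1)^{i+1}/i$; its limit is the standard fact $\ln 2$, quoted here only as a reference value (it is not established in this chapter).

```python
import math

def alt_partial(n):
    """Partial sums of the alternating harmonic series sum (-1)^(i+1)/i."""
    return sum((-1) ** (i + 1) / i for i in range(1, n + 1))

# An even partial sum sits below the limit, the next odd one above it,
# and the gap between them is the next term 1/(n+1).
print(alt_partial(100), math.log(2), alt_partial(101))
```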
HW 8.3.1 (True or False and why)
(a) If $\lim_{n\to\infty} a_n = 0$, then $\sum_{n=0}^{\infty} a_n$ converges.
(b) Since the series $\sum_{n=1}^{\infty} \frac{1}{n}$ does not converge, the series $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n}$ does not
converge.
(c) The integral test implies that the series $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$ converges.
(d) If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} |a_n|$ converges.
(e) Suppose $a_n \le b_n$ for $n = 1, 2, \cdots$ and $\sum_{n=1}^{\infty} b_n$ converges. Then $\sum_{n=1}^{\infty} a_n$ converges.
HW 8.3.2 Tell which test (if any) will determine whether the following series
are convergent or not.
(a) $\sum_{n=1}^{\infty} \frac{n}{2^n+1}$ (b) $\sum_{n=1}^{\infty} \frac{1}{2^n-1}$ (c) $\sum_{n=1}^{\infty} \frac{1}{2^n}$ (d) $\sum_{n=2}^{\infty} \frac{1}{n^n}$
8.4
Power Series
In Section 8.2 we showed that the Maclaurin series of $f(x) = e^x$ about $x = 0$
converges pointwise to $f(x) = e^x$ on $[-3, 3]$.
From Section 8.3 we now have other methods for proving convergence of
series. For example if we consider that same Maclaurin series, $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$, and
apply the ratio test, we see that
$$\left| \frac{a_{k+1}}{a_k} \right| = \left| \frac{\frac{1}{(k+1)!} x^{k+1}}{\frac{1}{k!} x^k} \right| = \frac{|x|}{k+1} \to 0$$
as $k \to \infty$ for all $x \in \mathbb{R}$. Thus we have just shown that the series that is the
natural infinite series associated with $f(x) = e^x$ converges on all of $\mathbb{R}$—a much
better result than that proved in Section 8.2, where we proved it converged on
$[-3, 3]$.
But note several important things. Here we proved that the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$
converges for all $x \in \mathbb{R}$ but we did not prove that it converges to $f(x) = e^x$. We
should also note that if we apply the same approach using the Taylor inequality
as we did in Section 8.2 on a larger interval, say $[-88, 88]$, we still get convergence, i.e. $f^{(n+1)}(x) = e^x$ for all $n$ so $M = e^{88}$, $|T_n(x) - e^x| \le \frac{e^{88}}{(n+1)!} 88^{n+1}$,
and $\frac{88^{n+1}}{(n+1)!} \to 0$ as $n \to \infty$ (Example 3.5.2) implies that $T_n(x) \to e^x$ for all
$x \in [-88, 88]$. And since this argument will work for any interval $[-R, R] \subset \mathbb{R}$,
the series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ converges to $f(x) = e^x$ for all $x \in \mathbb{R}$. Then we can write
$f(x) = \sum_{k=0}^{\infty} \frac{1}{k!} x^k$ on $\mathbb{R}$.
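The convergence of the partial sums $T_n(x)$ to $e^x$ is easy to see in floating point. The sketch below (a numerical illustration only) compares the Maclaurin partial sums with `math.exp` at a few points, including one well outside $[-3, 3]$.

```python
import math

def exp_partial(x, n):
    """T_n(x) = sum_{k=0}^{n} x^k / k!, the n-th Maclaurin partial sum of e^x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

# Even at x = 10, far outside [-3, 3], enough terms pin down e^x.
for x in (-2.0, 1.0, 10.0):
    print(x, exp_partial(x, 60), math.exp(x))
```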
We can find an assortment of Taylor-Maclaurin series expansions and are
able to prove convergence of the series to the given function—as we see in the
following result.
Proposition 8.4.1 Let $I$ be a neighborhood of $x = a$ and suppose $f : I \to \mathbb{R}$ has
derivatives of all orders at $x = a$. Suppose further that there exist $r$ and $M$ such
that the interval $[a-r, a+r] \subset I$ and $|f^{(n)}(x)| \le M$ for all $x \in [a-r, a+r]$ and
all $n \in \mathbb{N}$. Then the Taylor series of $f$ converges and $f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k$,
i.e. the Taylor series converges and is equal to the function $f(x)$ on $[a-r, a+r]$.
Proof: Let $T_n$ be the Taylor polynomial approximation of $f$ about $x = a$,
$T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k$, the partial sum of the series $\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k$.
By Proposition 8.1.1 we know that $f(x) - T_n(x) = R_n(x)$ where $R_n(x) =
\frac{1}{n!} \int_a^x (x-t)^n f^{(n+1)}(t)\,dt$. Then by the Taylor inequality, Proposition 8.1.3
(since $|f^{(n)}(x)| \le M$ for all $x \in [a-r, a+r]$ and all $n \in \mathbb{N}$, then $|f^{(n+1)}| \le M$
for $x \in [a-r, a+r]$),
$$|f(x) - T_n(x)| \le |R_n(x)| \le \frac{M}{(n+1)!} r^{n+1}.$$
By Example 3.5.2, $M \frac{r^{n+1}}{(n+1)!} \to 0$ as $n \to \infty$. This implies that for any $x \in
[a-r, a+r]$, $|f(x) - T_n(x)| \to 0$ as $n \to \infty$, or $T_n(x) \to f(x)$ as $n \to \infty$, or $\{T_n\}$
converges to $f$ pointwise on $[a-r, a+r]$. Therefore the series $\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x-a)^k$
converges pointwise to $f$ on $[a-r, a+r]$.
To apply this proposition we must understand that the choice of $r$ may be
very important. When we considered $f(x) = e^x$ and its associated Maclaurin
series, the choice of $r$ was not important—we could find a bound $M$ for any
$[-R, R]$ or any $[a-r, a+r]$. This does not always work. Consider the following
example.
Example 8.4.1 Find the Maclaurin series expansion of $f(x) = \ln(x+1)$ and analyze the
convergence of this series.
Solution: It is easy to see that $f^{(k)}(x) = (-1)^{k+1}(k-1)!(x+1)^{-k}$ for $k = 1, 2, \cdots$.
Thus since $f^{(k)}(0) = (-1)^{k+1}(k-1)!$ for $k = 1, 2, \cdots$ (and of course $f^{(0)}(0) = 0$), we see that
the Maclaurin series expansion of $f(x) = \ln(x+1)$ is given by $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$.
Recall that the Taylor polynomial and the associated error term of $f(x) = \ln(x+1)$ about
$x = 0$ are given by $T_n(x) = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k} x^k$ and $R_n(x) = (-1)^n \int_0^x (x-t)^n (t+1)^{-(n+1)}\,dt$.
Notice that we do not have the $n!$ term in the denominator to help us make $R_n$ small. Also
note that we surely do not want to have $r \ge 1$—because then $[a-r, a+r]$ would become
$[-r, r]$ and $f^{(n)}(t)$ is not defined at $t = -1$. And finally if we consider $R_n$ on $[-r, r]$ for $r < 1$,
the maximum of the term $(t+1)^{-(n+1)}$ occurs at $t = -r$ and is given by $(1-r)^{-(n+1)}$. This
term cannot be bounded as a function of $n$ (try $r = 1/2$—then it is really easy to see). And
of course it will be impossible to find an $M$ that bounds $f^{(n)}$ on $[-r, r]$.
Thus we cannot use Proposition 8.4.1. Moreover, the expression for $R_n$ doesn't look good—
the term $(t+1)^{-(n+1)}$ will be large with respect to $n$. We do have methods to consider the
convergence of the series. If we let $b_k = \frac{(-1)^{k+1}}{k} x^k$, we see by the ratio test that $\left| \frac{b_{k+1}}{b_k} \right| =
\frac{k}{k+1} |x| \to |x|$, so that the series converges for $|x| < 1$ (and does not converge for $|x| > 1$). If we
let $x = -1$, we see that the series does not converge because it is the negative of the p-series,
$p = 1$. And if we let $x = 1$, it is easy to see that the series converges by the alternating series
test. Thus the series converges for $x \in (-1, 1]$ and does not converge elsewhere.
However, when the series converges, we do not know that it converges to $\ln(x+1)$. We
will return to this example later in Example 8.6.1.
Thus we see that though Proposition 8.4.1 is a very nice result, there are times
(lots of times) that it does not apply.
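Although the text has not yet proved that the series converges to $\ln(x+1)$ (that waits for Example 8.6.1), a numerical peek is suggestive. The sketch below (evidence only, not a proof) compares partial sums with `math.log`, and also shows how slow the convergence is at the endpoint $x = 1$, where only the alternating series test applies.

```python
import math

def log_partial(x, n):
    """Partial sums of sum_{k>=1} (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

print(log_partial(0.5, 60), math.log(1.5))     # rapid agreement for |x| < 1
print(log_partial(1.0, 10**5), math.log(2.0))  # very slow agreement at x = 1
```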
Power Series In our discussion of Taylor and Maclaurin series above we
started with a function f, used that function to generate a series and then
proved that the series converged to f on some interval. There are times that
we want to go approximately in the other direction. We begin with a series of
functions, prove that the series is convergent and define a function to be the
result of the convergent series. We begin with the following definition.
Definition 8.4.2 Consider the real sequence $\{a_k\}_{k=0}^{\infty}$. The series $\sum_{k=0}^{\infty} a_k(x-a)^k$
is said to be a power series about $x = a$.
We first note that we started the power series at $k = 0$. A power series can
equally well start at $k = 1$ or any other particular value. It is very traditional
to start power series at $k = 0$—and that's ok. There is a slight problem starting
the power series at $k = 0$. The first term is then $a_0 x^0$, and we do want the power
series to be well-defined at $x = 0$. And of course $x^0$ is not defined at $x = 0$.
Thus we want to emphasize that we write $\sum_{k=0}^{\infty} a_k x^k$ and mean $a_0 + \sum_{k=1}^{\infty} a_k x^k$.
We should also note that power series are commonly defined for complex
sequences of numbers. We restricted our power series to real coefficients because
here we are interested in real functions and real series. Everything that we do
can be generalized to complex power series. And finally, we will work with
power series about $x = 0$—everything that we do can be translated to results
about $x = a$.
Of course we see that a Taylor series expansion gives us a power series.
There are power series where it is not clear that they come from a Taylor series.
Power series appear in a variety of applications. One of the common reasons
for generating power series is when we find power series solutions to ordinary
differential equations—including the resulting power series that define Bessel
functions, hypergeometric functions and others.
Consider the following examples of power series.
Example 8.4.2 Discuss the convergence of the following power series: (a) $\sum_{k=0}^{\infty} k! x^k$
(b) $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ (c) $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ (d) $\sum_{k=0}^{\infty} x^k$
Solution: (a) Let $b_k = k! x^k$. Applying the ratio test to the power series (a) we see that
$\left| \frac{b_{k+1}}{b_k} \right| = (k+1)|x| \to \infty$ as $k \to \infty$ if $x \ne 0$. Thus $\sum_{k=0}^{\infty} k! x^k$ does not converge for any
$x \ne 0$. The series converges to $a_0$ for $x = 0$, i.e. series (a) converges on the set $\{0\}$.
(b) Let $b_k = x^k/k!$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{|x|}{k+1} \to 0$ as $k \to \infty$. Thus the series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges
absolutely for all $x \in \mathbb{R}$.
(c) Let $b_k = (-1)^k x^k/k$. Then $\left| \frac{b_{k+1}}{b_k} \right| = \frac{k}{k+1} |x| \to |x|$ as $k \to \infty$. Thus by the ratio test the
series $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ converges absolutely on $\{x \in \mathbb{R} : |x| < 1\} = (-1, 1)$ and does not converge
on $\{x \in \mathbb{R} : |x| > 1\} = (-\infty, -1) \cup (1, \infty)$.
The ratio test tells us nothing about the convergence at $x = -1$ and $x = 1$. At $x = -1$
the series becomes $\sum_{k=1}^{\infty} \frac{1}{k}$, which we know diverges—a p-series with $p = 1$. At $x = 1$ the series
becomes $\sum_{k=1}^{\infty} (-1)^k \frac{1}{k}$, which converges by the alternating series test.
Therefore the series $\sum_{k=1}^{\infty} (-1)^k \frac{x^k}{k}$ converges on $(-1, 1]$ and does not converge on $(-\infty, -1] \cup
(1, \infty)$.
(d) Let $b_k = x^k$. Then $\left| \frac{b_{k+1}}{b_k} \right| = |x| \to |x|$ as $k \to \infty$. Thus by the ratio test the series $\sum_{k=0}^{\infty} x^k$
converges on $\{x \in \mathbb{R} : |x| < 1\} = (-1, 1)$ and does not converge on $\{x \in \mathbb{R} : |x| > 1\} =
(-\infty, -1) \cup (1, \infty)$. As in (c), the ratio test tells us nothing about the convergence at $x = \pm 1$.
At $x = -1$ and $x = 1$ the series become $\sum_{k=0}^{\infty} (-1)^k$ and $\sum_{k=0}^{\infty} 1$, respectively. Both of these series
do not converge by the test for divergence, Corollary 8.3.3.
Therefore the series $\sum_{k=0}^{\infty} x^k$ converges on $(-1, 1)$ and does not converge on $(-\infty, -1] \cup
[1, \infty)$.
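Part (d) can be watched numerically. In the sketch below (an illustration only), inside $(-1, 1)$ the partial sums of $\sum x^k$ settle toward the geometric value $1/(1-x)$, while at $x = 1$ they run off and at $x = -1$ they oscillate between two values.

```python
def geom_partial(x, n):
    """Partial sums of the power series sum_{k>=0} x^k."""
    return sum(x ** k for k in range(n + 1))

# Inside (-1, 1) the partial sums approach 1/(1-x); at the endpoints
# x = 1 and x = -1 they grow without bound or oscillate, as in part (d).
for x in (0.5, -0.9, 1.0, -1.0):
    print(x, geom_partial(x, 200), geom_partial(x, 201))
```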
We see from the above example that the ratio test is a powerful tool that
can be used to determine the convergence of power series. Also, we see that we
get all of the different possibilities—convergence at one point, convergence on all of $\mathbb{R}$,
convergence on an interval, including end points of the interval or not including
end points.
The first result concerning power series is a proposition that describes the
convergence of a power series—results that we sort of see from the previous
example.
Proposition 8.4.3 (a) If the power series $\sum_{k=0}^{\infty} a_k x^k$ converges for $x = x_0$ and
$z$ is such that $|z| < |x_0|$, then $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely for $x = z$.
(b) If the power series $\sum_{k=0}^{\infty} a_k x^k$ does not converge at $x = x_0$ and $z$ is such that
$|z| > |x_0|$, then $\sum_{k=0}^{\infty} a_k z^k$ does not converge.
Proof: (a) If $\sum_{k=0}^{\infty} a_k x_0^k$ converges, then $a_k x_0^k \to 0$ as $k \to \infty$. Then choosing
$\epsilon = 1$ we know that there exists $N \in \mathbb{R}$ such that $k > N$ implies that $|a_k x_0^k| < 1$.
If $z$ is such that $|z| < |x_0|$, then
$$|a_k z^k| = |a_k x_0^k| \left| \frac{z}{x_0} \right|^k < \left| \frac{z}{x_0} \right|^k$$
for $k > N$. Since $\left| \frac{z}{x_0} \right| < 1$, the series $\sum_{k=0}^{\infty} \left| \frac{z}{x_0} \right|^k$ is a convergent geometric series.
By the comparison test, Proposition 8.3.8, the series $\sum_{k=0}^{\infty} |a_k z^k|$ is convergent,
i.e. the series $\sum_{k=0}^{\infty} a_k z^k$ is absolutely convergent.
(b) Suppose the statement is false, i.e. suppose the series $\sum_{k=0}^{\infty} a_k z^k$ converges.
Then since $|x_0| < |z|$, by part (a) of this result we know that $\sum_{k=0}^{\infty} a_k x_0^k$ converges absolutely. This is a contradiction to the hypothesis. Therefore the
series $\sum_{k=0}^{\infty} a_k z^k$ does not converge.
We want to be able to describe (somewhat) the set of convergence of a power
series as we found in Example 8.4.2. We make the following definition.
Definition 8.4.4 Define the radius of convergence $R$ of a power series $\sum_{k=0}^{\infty} a_k x^k$
as
$$R = \mathrm{lub}\left\{ y \in \mathbb{R} : \sum_{k=0}^{\infty} a_k y^k \text{ is absolutely convergent} \right\}.$$
We then obtain the following result.
Proposition 8.4.5 If $R$ is the radius of convergence of the power series $\sum_{k=0}^{\infty} a_k x^k$,
then the series converges absolutely for $|x| < R$ and does not converge for
$|x| > R$.
If $|x| = R$, the series may converge, may converge absolutely or may not
converge.
Proof: Suppose that |x| < R. By the definition of R we know that there exists
an x_0 such that |x| < x_0 < R and ∑_{k=0}^∞ a_k x_0^k is absolutely convergent. Then by
Proposition 8.4.3-(a) the series ∑_{k=0}^∞ a_k x^k is absolutely convergent.
Suppose now that |x| > R and suppose that the result is false, i.e. suppose that
∑_{k=0}^∞ a_k x^k converges. By Proposition 8.4.3-(a) ∑_{k=0}^∞ a_k z^k will converge absolutely
for all z such that |z| < |x|. Hence R ≥ |x|. This contradicts the assumption
that |x| > R. Therefore the series ∑_{k=0}^∞ a_k x^k does not converge.
We see that for a given power series, the series will converge (absolutely) for
|x| < R, not converge for |x| > R and may, or may not, converge for |x| = R.
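The radius of convergence can also be estimated numerically via the root test (the Cauchy–Hadamard formula R = 1/limsup |a_k|^{1/k}, which this text does not prove). A small sketch with an illustrative series of our own choosing:

```python
# Root-test estimate of the radius of convergence: R = 1/limsup |a_k|^(1/k)
# (the Cauchy-Hadamard formula; an assumption here, not proved in the text).
# Illustrative series: sum x^k / 2^k, whose radius is R = 2.
ks = range(1, 60)
roots = [(1 / 2 ** k) ** (1.0 / k) for k in ks]   # |a_k|^(1/k) -> 1/2
R_est = 1 / roots[-1]
print(R_est)
```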
When the series converges, we want to use the power series to define a function
on the domain {x ∈ R : |x| < R}—or maybe a bit more, we may want to include x = ±R. It’s not hard to see that wherever the series converges, we can
define a function f(x) = ∑_{k=0}^∞ a_k x^k. As we always do in calculus once we have a
new function, we ask the question of whether the function is continuous, differentiable and/or integrable. The fact is that in our present setting, we cannot
answer these questions. Pointwise convergence is not enough. We mentioned
earlier that there were other kinds of convergence. In the next section we will
give ourselves the necessary structure to answer these questions.
8.5 Uniform Convergence of Sequences
We mentioned in the last section that we need more—we need a stronger form
of convergence than pointwise convergence to give us the results that we want
and need. As should be obvious from the title of this section, we will introduce
uniform convergence. Remember that convergence of series was defined in terms
of the convergence of sequences. For that reason we shall return to consideration
of uniform convergence of sequences of functions.
We begin with three traditional examples that every grade school child should
know. Recall in Example 8.2.1 that we defined a sequence of functions {f_1n}
on [0, 1] by f_1n(x) = x^n and a function f_1 by

f_1(x) = { 0 for 0 ≤ x < 1; 1 for x = 1 },

and showed that f_1n → f_1 pointwise on [0, 1]. Consider the following example.
Example 8.5.1 Suppose fn , f : D → R, D ⊂ R, and fn → f pointwise on D. Suppose
that fn is continuous on D for all n ∈ N. Show that it need not be the case that f is
continuous.
Solution: Consider the sequence of functions defined above, {f_1n}, and the limiting
function f_1. We saw that f_1n → f_1 pointwise on [0, 1]. We know that f_1n is continuous on
[0, 1] for each n ∈ N. It should be clear that f_1 is not continuous at x = 1. Therefore we have
an example of a pointwise convergent sequence of continuous functions that converges to a
discontinuous function.
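The pointwise behavior of f_1n(x) = x^n is easy to see numerically; a quick sketch (sample points and exponents are our own choices):

```python
# Quick numerical sketch of f_1n(x) = x^n on [0, 1]: for x < 1 the values
# die off to 0, while at x = 1 they stay at 1, so the pointwise limit f_1
# is discontinuous at x = 1.
def f1n(x, n):
    return x ** n

for x in (0.5, 0.9, 0.99, 1.0):
    print(x, [f1n(x, n) for n in (10, 100, 1000)])
```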
In Example 8.2.2 we considered another example of a pointwise convergent
sequence. We defined f_2n, f_2 : [0, 1] → R for n ∈ N by f_2n(x) = x^n/n and
f_2(x) = 0. We showed that f_2n → f_2 pointwise on [0, 1]. Consider the related
example.
Example 8.5.2 Suppose fn , f : D → R, D ⊂ R, and fn → f pointwise on D. Suppose
that fn and f are differentiable on D for all n ∈ N. Suppose also that the sequence of
derivatives {fn′ } converges pointwise on D to f ∗ . Show that it need not be the case that
f ′ = f ∗ , i.e. show that the derivative of the limit need not be the limit of the derivatives.
Solution: Obviously we want to consider the sequence {f_2n} and limiting function f_2. Clearly
each function f_2n and f_2 is differentiable, with f_2n′(x) = x^{n−1} and f_2′(x) = 0 for all x ∈ [0, 1].
We know from Example 8.2.1 that f_2n′ → f_1 pointwise on [0, 1] where f_1 is as was defined in
Examples 8.2.1 and 8.5.1. And clearly, f_1(1) = 1 ≠ 0 = f_2′(1). Therefore, the limit of the
derivatives need not be equal to the derivative of the limit.
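The failure at x = 1 can be seen numerically; a small sketch (the choice of n is ours):

```python
# Numerical sketch of Example 8.5.2 with f_2n(x) = x^n / n: the functions
# converge to 0 at x = 1, but the derivatives f_2n'(x) = x^(n-1) converge
# to 1 at x = 1, not to f_2'(1) = 0.
def f2n(x, n):
    return x ** n / n

def f2n_prime(x, n):
    return x ** (n - 1)

n = 10 ** 6
print(f2n(1.0, n))        # tends to 0
print(f2n_prime(1.0, n))  # stays at 1
```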
Before we get to work we include one more traditional example that shows
the inadequacy of pointwise convergence. Consider the following example.
Example 8.5.3 Define f_4n, f_4 : [0, 2] → R by f_4(x) = 0 for x ∈ [0, 2] and

f_4n(x) = { n^2 x for x ∈ [0, 1/n]; n − n^2(x − 1/n) for x ∈ [1/n, 2/n]; 0 elsewhere in [0, 2] }

(the Teepee function that goes from (0, 0), to (1/n, n), to (2/n, 0)). Show that
f_4n(x) → f_4(x) = 0 for all x ∈ [0, 2] and that lim_{n→∞} ∫_0^2 f_4n ≠ ∫_0^2 f_4,
i.e. the limit of the integrals is not equal to the integral of the limit.
Solution: If we choose any x ∈ (0, 2] it is easy to see that there is an N so that n > N
implies f_4n(x) = 0 (choose N such that N > 2/x). Thus f_4n(x) → 0 = f_4(x) as n → ∞. By
definition f_4n(0) = 0 for all n. Thus f_4n(0) → 0 = f_4(0) as n → ∞. Therefore f_4n → f_4
pointwise.
Since ∫_0^2 f_4n is the area under the Teepee, ∫_0^2 f_4n = (1/2)(2/n)n = 1 for all n. And clearly
∫_0^2 f_4 = 0. Thus

lim_{n→∞} ∫_0^2 f_4n = lim_{n→∞} 1 ≠ ∫_0^2 f_4 = 0.
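A numerical sketch of the Teepee computation (the trapezoid rule and grid size are our own illustrative choices):

```python
# Numerical sketch of Example 8.5.3: every f_4n has integral 1 over [0, 2],
# while f_4n(x) -> 0 at each fixed x.  The trapezoid rule and grid size are
# illustrative choices, not part of the text.
def f4n(x, n):
    if x <= 1 / n:
        return n * n * x
    if x <= 2 / n:
        return n - n * n * (x - 1 / n)
    return 0.0

def trapezoid(f, a, b, m=200000):
    h = (b - a) / m
    total = 0.5 * (f(a) + f(b))
    for i in range(1, m):
        total += f(a + i * h)
    return total * h

for n in (5, 50, 500):
    print(n, f4n(1.0, n), trapezoid(lambda x: f4n(x, n), 0.0, 2.0))
```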
Thus we see that if we want such properties as (1) the limit of a sequence of
continuous functions is continuous, (2) the limit of the derivatives of a sequence
of functions is equal to the derivative of the limit of the sequence of functions
and (3) the integral of the limit of a sequence of functions is equal to the limit
of the integral of the sequence of functions, we need something stronger than
pointwise convergence. For this reason we make the following definition of the
uniform convergence of a sequence of functions.
Definition 8.5.1 Consider the sequence of functions {fn }, fn : D → R and
function f : D → R for D ⊂ R. The sequence {fn } is said to converge uniformly
to f on D if for every ǫ > 0 there exists an N ∈ R such that n > N implies that
|fn (x) − f (x)| < ǫ for all x ∈ D.
The emphasis in the above definition is that the N that is provided must
work for all x ∈ D. We see in Figure 8.5.1 that we have drawn an ǫ neighborhood
about the function f . The definition of uniform convergence requires that for
n > N , all functions fn must be entirely within the ǫ-tube around f . Consider
the following examples.
Example 8.5.4 Consider {f2n }, f2 defined just prior to Example 8.5.2 and also in Example 8.2.2. Prove that f2n → f2 uniformly on [0, 1].
Solution: We suppose that we are given an ǫ > 0. We must find N that must work for all
x ∈ [0, 1]. If you know what the plots of the various f_2n look like—or if you plot a few of
these—you realize that the sequence {f_2n} converges to f_2 most slowly at x = 1, i.e. it is
the worst point. Thus we consider the convergence of the sequence {f_2n(1)} to 0. As we did
in our study of sequences, we need |1/n − 0| = 1/n < ǫ. Thus we see that if we choose N = 1/ǫ,
then n > N implies |1/n − 0| = 1/n < 1/N < ǫ. Therefore lim_{n→∞} f_2n(1) = lim_{n→∞} 1/n = 0.
But more importantly, we now consider the sequence {f_2n} and f_2. If n > N, then

|f_2n(x) − f_2(x)| = |x^n/n − 0| = x^n/n ≤ 1/n < 1/N = ǫ.

Notice that this sequence of inequalities holds for all x ∈ [0, 1]. Therefore f_2n → f_2 uniformly.
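The sup-norm computation behind the example can be sketched numerically (the grid size is an illustrative choice):

```python
# Sup-norm check for Example 8.5.4 (the grid size is an illustrative
# choice): max over [0, 1] of |f_2n(x) - f_2(x)| = max x^n / n = 1/n -> 0,
# so the convergence is uniform.
def sup_diff(n, grid=10000):
    return max((i / grid) ** n / n for i in range(grid + 1))

print([sup_diff(n) for n in (1, 10, 100)])  # the values 1/n: 1.0, 0.1, 0.01
```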
We found the N in the above example by choosing the N associated with
x = 1 because it was the "worst point". Another way to describe the approach
we used was to compute the maximum of |f_2n(x) − f_2(x)|. (This was a very nice
example because the maximum occurred at x = 1 for all n.) Since this maximum
approaches zero, it seems clear that for large n, f_2n will eventually be in the
ǫ-tube about f_2. This is a common approach to proving uniform convergence.
There are three basic results concerning uniform convergence of interest to
us at this time. As you will see the results are directly related to Examples
8.5.1, 8.5.2 and 8.5.3.
Proposition 8.5.2 Consider the function f and sequence of continuous functions {fn }, f, fn : D → R for D ⊂ R. If fn → f uniformly on D, then f is
continuous.
Proof: Consider some x0 ∈ D and suppose that we are given an ǫ > 0. (We
must find a δ such that |x − x0 | < δ implies that |f (x) − f (x0 )| < ǫ.)
Figure 8.5.1: Plot of a function and an ǫ neighborhood (ǫ-tube) of the function.
Since fn → f uniformly in D, we know that there exists N ∈ R such that
n > N implies |f (y) − fn (y)| < ǫ/3 for all y ∈ D (so it would hold for x0 ∈ D
and any x ∈ D also). Choose some particular n0 > N . Then we know that fn0
is continuous on D so there exists a δ such that |x − x0 | < δ and x ∈ D implies
that |fn0 (x) − fn0 (x0 )| < ǫ/3. Then |x − x0 | < δ and x ∈ D implies that
|f (x) − f (x0 )| = |(f (x) − fn0 (x)) + (fn0 (x) − fn0 (x0 )) + (fn0 (x0 ) − f (x0 ))|
≤∗ |f (x) − fn0 (x)| + |fn0 (x) − fn0 (x0 )| + |fn0 (x0 ) − f (x0 )|
< ǫ/3 + ǫ/3 + ǫ/3 = ǫ
where inequality ”≤∗ ” follows from two applications of the triangular inequality,
Proposition 1.5.8-(v). Therefore f is continuous at x0 —for any x0 ∈ D, so f is
continuous on D.
If we return to Example 8.5.4, we see that since the functions f_2n are continuous
for all n and the sequence {f_2n} converges uniformly to f_2, by Proposition 8.5.2
we know that the function f_2 is continuous—but that's pretty easy since we know
that f_2(x) = 0 for x ∈ [0, 1].
Next consider the sequence of functions {f_1n} and the function f_1 used in
Example 8.5.1 (and also in Example 8.2.1). By Proposition 8.5.2 and the fact
that f_1 is not continuous, we then know that the sequence of functions {f_1n}
does not converge uniformly.
The next result that we consider is the interaction of uniform convergence
and integration—see Example 8.5.3.
Proposition 8.5.3 Consider the functions f, f_n : [a, b] → R, a < b and n ∈ N,
where the functions f_n are continuous on [a, b] for all n and the sequence {f_n}
converges uniformly to f on [a, b].
(a) If we define F_n, F : [a, b] → R by F_n(x) = ∫_a^x f_n and F(x) = ∫_a^x f, then
F_n → F uniformly on [a, b].
(b) If we define F_n, F : [a, b] → R by F_n(x) = ∫_x^b f_n and F(x) = ∫_x^b f, then
F_n → F uniformly on [a, b].
(c) lim_{n→∞} ∫_a^b f_n = ∫_a^b f.
Proof: (a) Suppose ǫ > 0 is given. Since {f_n} converges uniformly to f on
[a, b], we know by Proposition 8.5.2 that the function f is continuous—thus it
is integrable. Define ǫ_1 = ǫ/(b − a). Since the sequence {f_n} converges to f
uniformly, there exists N ∈ R such that n ≥ N implies that |f_n(t) − f(t)| < ǫ_1
for all t ∈ [a, b]. Then for any x ∈ [a, b]

|F_n(x) − F(x)| = |∫_a^x f_n − ∫_a^x f| = |∫_a^x (f_n − f)| ≤* ∫_a^x |f_n(t) − f(t)| dt <# ∫_a^x ǫ_1 ≤ (b − a)ǫ_1 = ǫ   (8.5.1)

where inequality "≤*" follows from Proposition 7.4.6-(a) and inequality "<#"
follows from Proposition 7.4.6-(b). Hence, F_n → F uniformly on [a, b].
(b) The proof of part (b) is almost identical to the proof of part (a).
(c) If we apply the convergence of {F_n} to F at x = b given in part (a) of this
proposition, we see that lim_{n→∞} ∫_a^b f_n = ∫_a^b f.
Of course one of the fast and easy results we get from Proposition 8.5.3 is that
the sequence {f_4n} considered in Example 8.5.3—the Teepee functions—does
not converge uniformly to f_4 on [0, 2]—otherwise ∫_0^2 f_4n = 1 would converge to
∫_0^2 f_4 = 0.
And finally, we consider our last result involving uniform convergence. Suppose that fn → f —some sort of convergence. There are many times that we
would like to be able to obtain the derivative of f by taking the limit of the
sequence of derivatives {fn′ }. We state the following proposition.
Proposition 8.5.4 Consider the sequence of functions, {fn }, fn : [a, b] → R,
a < b and n ∈ N, where each fn is continuously differentiable on [a, b]. Suppose
there exists some x0 ∈ [a, b] such that {fn (x0 )} converges and the sequence of
functions {fn′ } converge uniformly on [a, b]. Then the sequence of functions
{fn } converges uniformly on [a, b], say to the function f , the function f is
differentiable on [a, b] and f′(x) = lim_{n→∞} f_n′(x) for all x ∈ [a, b].
Proof: Let g be such that f_n′ → g uniformly. Consider the sequence {f_n′} on
[x_0, b]. Clearly the sequence {f_n′} converges uniformly to g on [x_0, b]. Then by
Proposition 8.5.3-(a)

lim_{n→∞} ∫_{x_0}^x f_n′ = ∫_{x_0}^x g,   (8.5.2)
and the convergence of {∫_{x_0}^x f_n′} to ∫_{x_0}^x g is uniform on [x_0, b]. Also by the
Fundamental Theorem, Theorem 7.5.4,

∫_{x_0}^x f_n′ = f_n(x) − f_n(x_0).   (8.5.3)
Combining equations (8.5.2) and (8.5.3) gives lim_{n→∞} [f_n(x) − f_n(x_0)] = ∫_{x_0}^x g.
Since we know that lim_{n→∞} f_n(x_0) exists, we can add lim_{n→∞} f_n(x_0) to the last
expression and get

lim_{n→∞} f_n(x) = ∫_{x_0}^x g + lim_{n→∞} f_n(x_0).   (8.5.4)

Thus the sequence {f_n(x)} converges for each x ∈ [x_0, b]. Because the convergence
of {∫_{x_0}^x f_n′} to ∫_{x_0}^x g is uniform, the convergence of {f_n(x)} is uniform on
[x_0, b]. Denote this limit by f, i.e. f(x) = ∫_{x_0}^x g + lim_{n→∞} f_n(x_0).
By Proposition 7.5.2 we see that f is differentiable and f′(x) = g(x) (the
derivative of the limit term is zero), i.e. f′(x) = lim_{n→∞} f_n′(x) for x ∈ [x_0, b].
If we essentially repeat the above proof, this time applying Proposition 8.5.3-(b)
instead of part (a) (when we got equation (8.5.2)), we find that f′(x) =
lim_{n→∞} f_n′(x) for x ∈ [a, x_0]. If we combine these results, we get the desired result
on [a, b].
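The hypothesis that {f_n′} converge uniformly cannot be dropped. A classic sketch (our own illustration, not from the text): f_n(x) = sin(nx)/n converges uniformly to 0 since |f_n| ≤ 1/n, yet the derivatives f_n′(x) = cos(nx) fail to converge at most points.

```python
import math

# Our own illustration of why {f_n'} must converge uniformly in
# Proposition 8.5.4: f_n(x) = sin(nx)/n -> 0 uniformly on R, but
# f_n'(x) = cos(nx) oscillates, e.g. cos(n*pi) = (-1)^n at x = pi.
def fn(x, n):
    return math.sin(n * x) / n

def fn_prime(x, n):
    return math.cos(n * x)

print(max(abs(fn(0.7, n)) for n in (10, 100, 1000)))   # bounded by 1/10
print([fn_prime(math.pi, n) for n in (1, 2, 3, 4)])    # alternates near -1, 1
```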
Thus we see by Propositions 8.5.2-8.5.4 that if we want the limit of a sequence of
functions to inherit certain properties of the sequence, we need uniform convergence.
Earlier we used Proposition 8.5.2 to show that the sequence {f1n } does not
converge uniformly and Proposition 8.5.3 to show that the sequence of functions
{f4n } does not converge uniformly. The proofs are completely rigorous but it’s
sort of cheating.
Of course the fact that these sequences do not converge uniformly can be
proved using the definition of uniform convergence, Definition 8.5.1. To prove
that a sequence does not converge uniformly we must show that there is at least
one ǫ so that for all N ∈ R there will be an n > N and at least one x_0 ∈ D for
which |f_n(x_0) − f(x_0)| ≥ ǫ.
For example consider {f4n } and choose ǫ = 1/2. The maximum value of
|f4n (x) − f4 (x)| occurs at x = 1/n and equals n for every n. For every N ∈ R
and any n > N , x0 = 1/n ∈ [0, 2] is the point such that |f4n (x0 ) − f4 (x0 )| =
n ≥ 1 > ǫ. Therefore the sequence {f4n } does not converge uniformly to f4 .
Likewise, if we next consider the sequence {f1n } and choose ǫ = 1/2, for any
N ∈ R we must find an n > N and x0 ∈ [0, 1] so that |f1n (x0 ) − f1 (x0 )| ≥ ǫ. If
you plot the function y = xn for a few n’s, you will see that the point is going
to occur near x = 1 (but surely not at x = 1). Let N ∈ R (any such N ) and
8.5 Uniform Convergence
231
suppose n is any integer greater than N. We need to find x_0 < 1 such that
|x_0^n − 0| = x_0^n ≥ 1/2, or taking the n-th root of both sides (realizing that the
n-th root function is increasing) gives x_0 ≥ (1/2)^{1/n}. So we could surely choose
x_0 = (1 + (1/2)^{1/n})/2 and see that f_1n ̸→ f_1 uniformly on [0, 1].
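The witness chosen above is easy to verify numerically; a quick sketch:

```python
# Verify the witness used above: with ǫ = 1/2 and x0 = (1 + (1/2)^(1/n))/2,
# we have x0 < 1 and f_1n(x0) = x0^n >= 1/2 for every n, so no single N
# can force |f_1n(x) - f_1(x)| < 1/2 for all x in [0, 1].
for n in (2, 10, 100, 1000):
    x0 = (1 + 0.5 ** (1 / n)) / 2
    print(n, x0, x0 ** n)
    assert x0 < 1 and x0 ** n >= 0.5
```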
We notice that proving that a sequence of functions does not converge uniformly is not easy—but generally showing that any type of limit does not exist
is not easy.
Before we leave we include one more approach to proving uniform convergence. Recall that when we studied convergence of sequences, we included the
Cauchy criterion for convergence of a sequence, Definition 3.4.9 and Proposition
3.4.11. We begin with the following definition of a uniform Cauchy criterion.
Definition 8.5.5 Consider a sequence of functions {fn }, fn : D → R for D ⊂
R. The sequence {fn } is said to be a uniform Cauchy sequence if for every
ǫ > 0 there exists an N ∈ R such that n, m ∈ N and n, m > N implies that
|fn (x) − fm (x)| < ǫ for all x ∈ D.
Hopefully it is clear that, as with the Cauchy criterion for sequences, the
advantage of the uniform Cauchy criterion is when you really don't know the
limiting function. Also as was the case with the Cauchy criterion for sequences,
our major application of the uniform Cauchy criterion will be when we use it
to show uniform convergence of series. We do need the convergence result
analogous to Proposition 3.4.11.
Proposition 8.5.6 Consider a sequence of functions {f_n}, f_n : D → R for
D ⊂ R. The sequence {f_n} converges uniformly on D to some function f,
f : D → R if and only if the sequence is a uniform Cauchy sequence.
Proof: (⇒) We suppose that the sequence {fn } converges uniformly to f on
D and suppose that ǫ > 0 is given. Then we know that there exists N ∈ R such
that n > N implies |fn (x) − f (x)| < ǫ/2 for all x ∈ D. Then m, n > N implies
that
|f_n(x) − f_m(x)| = |(f_n(x) − f(x)) + (f(x) − f_m(x))| ≤ |f_n(x) − f(x)|
+ |f(x) − f_m(x)| < ǫ/2 + ǫ/2 = ǫ
for all x ∈ D. Thus the sequence {fn } is a uniform Cauchy sequence.
(⇐) If the sequence {fn } is a uniform Cauchy sequence on D, then for each
x ∈ D the sequence {fn (x)} is a Cauchy sequence. Then by Proposition 3.4.11
the sequence {fn (x)} converges—call this limit f (x). Let ǫ > 0 be given. Then
there exists an N ∈ R such that n, m > N implies |fn (x) − fm (x)| < ǫ/2 for all
x ∈ D. If we let m → ∞, then we have |f_n(x) − f(x)| ≤ ǫ/2 < ǫ for all x ∈ D.
Therefore the sequence {fn } converges uniformly to f .
This section gives only a brief introduction to uniform convergence of sequences.
There are other versions of the basic theorems, Propositions 8.5.2–8.5.4,
with weaker hypotheses (and more difficult proofs). Uniform convergence
is an important enough concept to deserve more space and work—but not in
this text. We have tried to give you enough so that in the next section we can go
on and discuss the uniform convergence of power series and the resulting power
series results.
HW 8.5.1 (True or False and why)(a) Suppose f, fn : D → R, D ⊂ R, are
such that fn is continuous on D for all n, f is continuous on D and fn → f
pointwise on D. Then fn → f uniformly on D.
(b) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly on D, and
c ∈ R. Then cfn → cf uniformly on D.
(c) Suppose f, fn : D → R, D ⊂ R, are such that fn → f . It may be the case
that {fn } does not converge to f pointwise on D.
(d) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly on D.
Then |fn | → |f | uniformly on D.
(e) Suppose the sequence fn : D → R, D ⊂ R, is uniformly Cauchy on D and
each function fn is differentiable on D. Then the sequence {fn′ } is uniformly
Cauchy on D.
HW 8.5.2 Consider the following sequences of functions. Find the pointwise
limit of the sequences on their domains and determine whether the convergence
is uniform.
(a) f_n(x) = x/(1 + n^2 x^2), x ∈ [0, 1]   (b) f_n(x) = nx/(nx + 1), x ∈ [0, 1]
(c) f_n(x) = e^{−nx^2}, x ∈ R   (d) f_n(x) = x/(1 + nx^2), x ∈ R
HW 8.5.3 Suppose f, f_n, g, g_n : D → R, D ⊂ R, are such that f_n → f and
g_n → g uniformly on D. Show that for α, β ∈ R, αf_n + βg_n → αf + βg uniformly
on D.
HW 8.5.4 Consider f, f_n : [0, 1] → R defined by f_n(x) = (1 − x^2)^n and f(x) =
0 for x ∈ [0, 1]. Compute ∫_0^1 f_n and ∫_0^1 f. Discuss these results.
HW 8.5.5 (a) Suppose f, fn : D → R, D ⊂ R, are such that fn → f uniformly
on D and each function fn is bounded on D. Prove that f is bounded on D.
(b) Find a sequence of functions {fn } and function f all with domain D such
that fn → f pointwise on D, each function fn is bounded on D, but f is not
bounded on D.
8.6 Uniform Convergence of Series
We now want to return to series of functions of the form ∑_{i=1}^∞ f_i(x). Hopefully it
should be reasonably clear by now how uniform convergence of series of functions
will look. In spite of this we set the partial sum of the series to be
s_n(x) = ∑_{i=1}^n f_i(x) and make the following definition.
Definition 8.6.1 Consider the sequence of functions {f_i(x)} where for each
i, f_i : D → R, D ⊂ R. If the sequence of partial sums, {s_n(x)}, converges
uniformly on D, say to s(x), then the series ∑_{i=1}^∞ f_i(x) converges uniformly to
s(x).
Earlier we saw that methods for proving convergence of sequences were not
especially useful for proving convergence of series. Likewise, the methods of
proving uniform convergence of sequences of functions aren't very useful for
proving the uniform convergence of a series of functions. There is one excellent
result that we will use, the Weierstrass test for uniform convergence. The Weierstrass
test is to uniform convergence of series of functions what the comparison
test is to convergence of real series. For that reason, before we state and prove
the Weierstrass Theorem, we prove the following proposition which also defines
the uniform Cauchy criterion for a series.
Proposition 8.6.2 (Cauchy Criterion for Series) Consider the sequence of
functions {f_i}, f_i : D → R, D ⊂ R and i ∈ N. The series ∑_{i=1}^∞ f_i(x) converges
uniformly on D if and only if for every ǫ > 0 there exists N ∈ R such that
m, n ∈ N, m ≥ n and m, n > N implies that |∑_{i=n}^m f_i(x)| < ǫ for all x ∈ D.
Proof: This result follows directly from Proposition 8.5.6. Let s_n(x) =
∑_{i=1}^n f_i(x). The series ∑_{i=1}^∞ f_i(x) converges uniformly if and only if the sequence
{s_n} converges uniformly. Also s_m(x) − s_{n−1}(x) = ∑_{i=n}^m f_i(x). Thus the sequence
{s_n} is a uniform Cauchy sequence if and only if the sequence {f_i} satisfies: for
every ǫ > 0 there exists N ∈ R such that m, n ∈ N, m ≥ n and m, n > N
implies that |∑_{i=n}^m f_i(x)| < ǫ for all x ∈ D.
The result then follows from Proposition 8.5.6. (Again you should realize
that replacing n by n − 1 in the Cauchy criterion for {s_n} does not cause any
problem.)
We now proceed with the following theorem.
Theorem 8.6.3 (Weierstrass Theorem-Weierstrass Test) Suppose that
∑_{i=1}^∞ M_i is a convergent series of nonnegative numbers. Suppose further that
{f_k} is a sequence of functions, f_k : D → R, D ⊂ R and k ∈ N, such that
|f_k(x)| ≤ M_k for x ∈ D and k ∈ N. Then ∑_{i=1}^∞ f_i(x) converges absolutely for
each x ∈ D and converges uniformly on D.
Proof: Since |f_k(x)| ≤ M_k for each x ∈ D and k ∈ N, the series ∑_{i=1}^∞ |f_i(x)|
converges for each x ∈ D by the comparison test, Proposition 8.3.8. Thus the
series ∑_{i=1}^∞ f_i(x) converges absolutely for each x ∈ D.
Define s_n and m_n by s_n(x) = ∑_{i=1}^n f_i(x), s(x) = ∑_{i=1}^∞ f_i(x) and m_n = ∑_{i=1}^n M_i.
Let ǫ > 0 be given. Since the series ∑_{i=1}^∞ M_i converges, we know by Proposition
8.2.3 that the series satisfies the Cauchy criterion, i.e. there exists N ∈ R such
that m, n ∈ N, m ≥ n and m, n > N implies that ∑_{i=n}^m M_i < ǫ.
Suppose m, n > N and m ≥ n. Then

|s_m(x) − s_{n−1}(x)| = |∑_{i=n}^m f_i(x)| ≤* ∑_{i=n}^m |f_i(x)| ≤ ∑_{i=n}^m M_i < ǫ

for all x ∈ D where inequality "≤*" is due to a bunch of applications of the
triangular inequality. Therefore the series ∑_{i=1}^∞ f_i(x) satisfies the uniform Cauchy
criterion on D and hence converges uniformly on D.
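A numerical sketch of the Weierstrass test (our own illustration, not from the text): for ∑ sin(kx)/k^2 we may take M_k = 1/k^2, so the tail of the series is small with a bound that does not depend on x.

```python
import math

# Weierstrass test sketch (our own illustration): |sin(kx)/k^2| <= 1/k^2
# =: M_k and sum M_k converges, so sum sin(kx)/k^2 converges uniformly.
# Numerically, tails are bounded by sum_{k>2000} 1/k^2 < 1/2000 at every x.
def partial(x, n):
    return sum(math.sin(k * x) / k ** 2 for k in range(1, n + 1))

for x in (0.3, 1.7, 3.0):
    tail = abs(partial(x, 4000) - partial(x, 2000))
    print(x, tail)
```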
Applying the Weierstrass test to power series we obtain the following result.
Proposition 8.6.4 Suppose that the radius of convergence of the power series
∑_{i=0}^∞ a_i x^i is R and R_0 is any value such that 0 < R_0 < R. Then the power series
converges uniformly on [−R_0, R_0].
Proof: Let r be some value such that R_0 < r < R. Then the power series
∑_{i=0}^∞ a_i r^i converges absolutely. For any x ∈ [−R_0, R_0], |a_k x^k| ≤ |a_k r^k|. By the
Weierstrass Theorem the power series ∑_{i=0}^∞ a_i x^i converges uniformly on [−R_0, R_0].
You should realize that the above result shows us that power series are very
nice. They just about always converge uniformly—except possibly at ±R. Since
we know that the series may not converge at ±R, we cannot make a stronger
statement. That is surely nicer than most sequences and series.
We are now ready to apply the fact that power series converge uniformly to
obtain the properties we developed for sequences in the last section. The first
is very easy. Since each of the terms in the power series is continuous and the
convergence is uniform on any closed interval contained in (−R, R), we obtain
continuity on the interval [−R_0, R_0] for any 0 < R_0 < R.
Proposition 8.6.5 Consider the power series ∑_{i=0}^∞ a_i x^i with radius of convergence
R. The function f : (−R, R) → R defined by f(x) = ∑_{i=0}^∞ a_i x^i is continuous
on (−R, R).
We next want to differentiate a power series term by term. To see when
and if this is possible, we return to Proposition 8.5.4. We see that we easily
satisfy the hypothesis that the sequence of partial sums {s_n} converges at a
point—we'll only consider differentiating the series in (−R, R) and the series
converges on all of (−R, R). We also easily satisfy the hypothesis that each s_n
is continuous. The difficulty is satisfying the hypothesis that the sequence of
derivatives of partial sums, {s_n′}, converges uniformly. We want the derivative
of the series to be ∑_{i=1}^∞ i a_i x^{i−1}. Thus we must show that this "derivative" power
series converges uniformly.
Proposition 8.6.6 Consider the power series ∑_{i=0}^∞ a_i x^i with radius of convergence
R. The function f : (−R, R) → R defined by f(x) = ∑_{i=0}^∞ a_i x^i is differentiable
on (−R, R), f′(x) = ∑_{i=1}^∞ i a_i x^{i−1} and the radius of convergence of the
series for f′ is R.
Proof: As we said in our introduction to this proposition, we wish to apply
Proposition 8.5.4 to the sequence of functions {s_n} where s_n(x) = ∑_{i=0}^n a_i x^i. Let
R_0 ∈ R be such that 0 < R_0 < R. We discussed how we easily satisfy the
hypotheses that s_n must be continuous for each n and that there must exist
one point x_0 ∈ [−R_0, R_0] for which {s_n(x_0)} converges—we know that {s_n}
converges absolutely on (−R, R).
Claim: ∑_{i=1}^∞ i a_i x^{i−1} converges absolutely for all x ∈ (−R, R). For r such that
0 < r < R we know that ∑_{i=0}^∞ a_i r^i is convergent. Consider any x such that |x| < r,
set A_i = i a_i x^{i−1} and B_i = a_i r^i, i = 1, · · · . We apply the limit comparison test,
Proposition 8.3.9-(b). Note that

lim_{i→∞} |A_i/B_i| = lim_{i→∞} |i a_i x^{i−1}/(a_i r^i)| = lim_{i→∞} (i/r) |x/r|^{i−1} = 0.

(To see that this last limit is zero let ρ be such that 0 < ρ < 1 and consider the
limit lim_{y→∞} yρ^y. Write this limit as lim_{y→∞} y/ρ^{−y} and apply L'Hospital's rule.)
Thus by the limit comparison test the series ∑_{i=1}^∞ i a_i x^{i−1} converges absolutely
for |x| < r for any r, 0 < r < R—and since this holds true for any r < R, the
radius of convergence of ∑_{i=1}^∞ i a_i x^{i−1} is at least R.
Therefore we know that the series ∑_{i=1}^∞ i a_i x^{i−1} converges uniformly on [−R_0, R_0],
and by Proposition 8.5.4 we know that the function f is differentiable and
f′(x) = ∑_{i=1}^∞ i a_i x^{i−1}. And since this is true for any R_0, 0 < R_0 < R, the function
f is differentiable on (−R, R).
Once we know that we can always differentiate a power series and that the
derivative series converges too, we know that we can do it again. We obtain the
following corollary.
Corollary 8.6.7 Consider the power series ∑_{i=0}^∞ a_i x^i with radius of convergence
R. The function f : (−R, R) → R defined by f(x) = ∑_{i=0}^∞ a_i x^i has derivatives of
all orders on (−R, R) and a_i = f^{(i)}(0)/i!.
Of course the above result follows from applying Proposition 8.6.6 many times
and evaluating that result at x = 0 (which we can do by Proposition 8.6.5). Using
the result we also obtain the following corollary.
the result we also obtain the following corollary.
Corollary 8.6.8 Consider the power series ∑_{i=0}^∞ a_i x^i and ∑_{i=0}^∞ b_i x^i, both of which
converge on (−r, r) for some r, 0 < r. If ∑_{i=0}^∞ a_i x^i = ∑_{i=0}^∞ b_i x^i for x ∈ (−r, r),
then a_k = b_k for k = 0, 1, 2, · · · .
These are very nice results when it comes to differentiating power series. In
the next result we see that we obtain an analogous result concerning integration
of power series.
Proposition 8.6.9 Consider the power series ∑_{i=0}^∞ a_i x^i with radius of convergence
R. The function f : (−R, R) → R defined by f(x) = ∑_{i=0}^∞ a_i x^i is integrable
on any closed interval contained in (−R, R), for any x ∈ (−R, R)

∫_0^x f = ∑_{i=0}^∞ (a_i/(i+1)) x^{i+1}

and the radius of convergence of the series for ∫_0^x f is R.
Proof: Let x be such that x ∈ (−R, R) and suppose that |x| < R_0 < R. If we
use the fact that the power series ∑_{i=0}^∞ a_i x^i converges uniformly on [−R_0, R_0],
Proposition 8.5.3 implies that

∫_0^x f = lim_{n→∞} ∫_0^x ∑_{i=0}^n a_i t^i dt = lim_{n→∞} ∑_{i=0}^n (a_i/(i+1)) x^{i+1},

which gives the desired result. We note that since this is true for any x ∈ (−R, R),
the radius of convergence of ∑_{i=0}^∞ (a_i/(i+1)) x^{i+1} is at least R.
Note that we integrated the power series from 0 to x—we did so because
the result looks nicer that way. We could have integrated the function f on
any interval [a, b] ⊂ (−R, R). Also, notice that the radius of convergence of the
differentiated and integrated series is the same as that of the original series. We
mention specifically however that the convergence of these series may differ at
the points x = ±R—you might want to find examples that will illustrate this.
There are many applications of Propositions 8.6.5-8.6.9. An example that
we alluded to earlier is when power series are used to find the solutions to
differential equations. This is not a very popular approach anymore but is still
important. The power series solution is found by inserting a general power series
into the differential equation and combining like terms. After the power series
solution is computed it is really nice to know that the series is in fact a solution
in that it can be differentiated the appropriate number of times, etc.
Other, more fun, applications of Propositions 8.6.6 and 8.6.9 are when we use
differentiation and integration of known power series to find other power series.
For example, we know that 1/(x^2 + 1) = 1 − x^2 + x^4 − x^6 + · · · (it's a geometric
series) and the radius of convergence is R = 1. Hence by integrating both
sides from 0 to x we see that tan^{−1} x = x − x^3/3 + x^5/5 − x^7/7 + · · ·—the radius of
convergence of this resulting series is also R = 1.
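The integrated series can be checked numerically inside the radius of convergence; a quick sketch (sample points and truncation are our own choices):

```python
import math

# Check the integrated series numerically: inside the radius R = 1 the
# partial sums of x - x^3/3 + x^5/5 - ... should match atan(x).
def atan_series(x, terms=200):
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

for x in (0.2, 0.5, 0.9):
    print(x, atan_series(x), math.atan(x))
```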
And finally, we now return to the Maclaurin series considered in Example
8.4.1 where we considered the convergence of the Maclaurin series of the function
f(x) = ln(x + 1). Sad to say, the results we have obtained since that time do
not make it easier to prove this result. We can prove that the series converges
to ln(x + 1). However the methods we shall use, though basic, are not clearly
applicable to other functions and series.
Example 8.6.1 Prove that the series ∑_{k=1}^∞ ((−1)^{k+1}/k) x^k converges to f(x) = ln(x + 1) on
(−1, 1].
Solution: We know that we can write f(x) = T_n(x) + R_n(x) where T_n(x) = ∑_{k=1}^n ((−1)^{k+1}/k) x^k
and R_n(x) = (−1)^n ∫_0^x (x − t)^n/(1 + t)^{n+1} dt. Clearly, if we can show that R_n(x) → 0 for x ∈ (−1, 1]
as n → ∞, then we will have proved our result.
Case 1: 0 ≤ x ≤ 1: We see that

|R_n(x)| = ∫_0^x (x − t)^n/(1 + t)^{n+1} dt ≤* ∫_0^x (x − t)^n dt = x^{n+1}/(n + 1) → 0

where inequality "≤*" follows from Proposition 7.4.5.
Case 2: −1 < x < 0: For −1 < x < 0 we have

|R_n(x)| = |∫_0^x (x − t)^n/(1 + t)^{n+1} dt| = ∫_x^0 ((t − x)/(1 + t))^n (1/(1 + t)) dt.

Since ((t − x)/(1 + t))^n (1/(1 + t)) ≥ 0, we can apply the Mean Value Theorem for Integrals, Theorem
7.5.8, to get

|R_n(x)| = ∫_x^0 ((t − x)/(1 + t))^n (1/(1 + t)) dt = ((t_n − x)/(1 + t_n))^n (1/(1 + t_n)) (−x)

where t_n ∈ [x, 0]—we emphasize that t_n does depend on n.
Since x ≤ t_n ≤ 0 implies that 1 + x ≤ 1 + t_n ≤ 1 and 0 < 1 + x, and |x| = −x, we see that

((t_n − x)/(1 + t_n))^n (1/(1 + t_n)) (−x) ≤ ((t_n + |x|)/(1 + t_n))^n (|x|/(1 + x)) ≤ ((t_n |x| + |x|)/(1 + t_n))^n (|x|/(1 + x)) = |x|^{n+1}/(1 + x).

Thus |R_n(x)| ≤ |x|^{n+1}/(1 + x) → 0 as n → ∞.
Therefore in both cases R_n(x) → 0 as n → ∞, so f(x) = ln(x + 1) = ∑_{k=1}^∞ ((−1)^{k+1}/k) x^k for
x ∈ (−1, 1].
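The convergence proved in Example 8.6.1 is easy to observe numerically; a sketch (sample points and truncation are our own choices):

```python
import math

# Numerical sketch of Example 8.6.1: the partial sums T_n(x) of
# sum (-1)^(k+1) x^k / k approach ln(1 + x) on (-1, 1], including the
# slowly converging endpoint x = 1 (the alternating harmonic series).
def tn(x, n):
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

for x in (-0.9, 0.5, 1.0):
    print(x, abs(tn(x, 5000) - math.log(1 + x)))
```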
One important fact that we should emphasize is that the series converges
to ln(x + 1) on only (−1, 1]—we found as a part of Example 8.4.1 that the
radius of convergence of the series ∑_{k=1}^∞ ((−1)^{k+1}/k) x^k is R = 1, i.e. the series diverges
for |x| > 1. We notice that even though the function ln(x + 1) is defined for
x ∈ (−1, ∞), the Maclaurin series doesn't converge on the (1, ∞) part of the
domain. This is just a fact of power series—and a Maclaurin series is a power
series—that they always converge only on a symmetric interval (−R, R)—and
maybe the endpoints. We cannot do better.
In Proposition 8.4.1 we found a tool for proving that Taylor and Maclaurin series converge to the functions that generated them. This result works well for a variety of functions: exp, sine, cosine, etc. We saw in Example 8.4.1 that Proposition 8.4.1 will not work for all functions, specifically not for $f(x) = \ln(x+1)$.
8.6 Uniform Convergence
We were able to prove that the Maclaurin series for $\ln(x+1)$ converges to $\ln(x+1)$ on $(-1, 1]$, but we really used ad hoc methods, methods that will not necessarily carry over to other examples. There are no methods that will work for all Taylor and Maclaurin series. To illustrate how bad it can really be, consider the following very important example.
Example 8.6.2 Consider the function
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases}$$
Find the Maclaurin series of $f$ (if it exists) and determine for which values of $x$ the series converges, and for which values of $x$ the series converges to $f(x)$.
Solution: To determine the Maclaurin series of $f$ we begin by computing the derivatives at $x = 0$. Note that each of the equalities $\overset{*}{=}$ follows by L'Hospital's rule.
$$f'(0) = \lim_{h\to 0} \frac{f(h)-f(0)}{h-0} = \lim_{h\to 0} \frac{e^{-1/h^2}}{h} = \lim_{h\to 0} \frac{h^{-1}}{e^{1/h^2}} \overset{*}{=} \lim_{h\to 0} \frac{-h^{-2}}{-2h^{-3} e^{1/h^2}} = \frac{1}{2} \lim_{h\to 0} \frac{h}{e^{1/h^2}} = 0$$

$$f''(0) = \lim_{h\to 0} \frac{f'(h)-f'(0)}{h-0} = \lim_{h\to 0} \frac{2h^{-3} e^{-1/h^2}}{h} = 2 \lim_{h\to 0} \frac{h^{-4}}{e^{1/h^2}} \overset{*}{=} 2 \lim_{h\to 0} \frac{-4h^{-5}}{-2h^{-3} e^{1/h^2}} = 4 \lim_{h\to 0} \frac{h^{-2}}{e^{1/h^2}} \overset{*}{=} 4 \lim_{h\to 0} \frac{-2h^{-3}}{-2h^{-3} e^{1/h^2}} = 4 \lim_{h\to 0} \frac{1}{e^{1/h^2}} = 0$$

$$f'''(0) = \lim_{h\to 0} \left( -6\, \frac{e^{-1/h^2}}{h^5} + 4\, \frac{e^{-1/h^2}}{h^7} \right) = \cdots = 0$$
We don't know how you feel about it, but we're tired of these computations by now. It should be reasonably clear that all derivatives of $f$ evaluated at $x = 0$ will involve one or more limits of the form $\lim_{h\to 0} h^{-k} e^{-1/h^2}$. Hopefully the above computations convince you that all of these limits can be computed and are equal to zero. How would we prove this? To prove that the particular limits $\lim_{h\to 0} h^{-k} e^{-1/h^2}$ are zero we must use mathematical induction, for even and odd $k$ separately, but we don't really want to do that here.
Thus we see that $f^{(k)}(0) = 0$ for $k = 0, 1, 2, \dots$. Hence the Maclaurin series expansion $\sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ exists and is the identically zero series, and we see that $f(x) \neq \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ for all $x \neq 0$.
The function used in this example is clearly a non-standard function. Plot it in various neighborhoods of $x = 0$ to see what it looks like. However, the example does show that if you compute a Maclaurin or Taylor series expansion, you do not necessarily get the original function back again.
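A quick numerical look (our own sketch, not from the text) makes the point concrete: every Maclaurin partial sum of this $f$ is identically zero, so the "error" $|f(x) - 0|$ is just $f(x)$ itself, tiny near $0$ but strictly positive for every $x \neq 0$.

```python
import math

def f(x):
    # The flat function from Example 8.6.2.
    return math.exp(-1.0 / x ** 2) if x != 0 else 0.0

# The Maclaurin series of f is the zero series, so it agrees with f
# only at x = 0; everywhere else f(x) > 0.
for x in (0.5, 0.25, 0.1):
    print(x, f(x))
```

Already at $x = 0.1$ the value $e^{-100}$ is far below anything a plot (or a floating-point eyeball test) would distinguish from zero, which is exactly why the function fools its own Maclaurin series.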
HW 8.6.1 (a) Prove that the series $\sum_{n=0}^{\infty} x^n$ converges uniformly to $\frac{1}{1-x}$ on $[-R, R]$ for $0 < R < 1$.
(b) Discuss whether or not the series $\sum_{n=0}^{\infty} x^n$ converges uniformly to $\frac{1}{1-x}$ on $(-1, 1)$.
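For HW 8.6.1 a numerical experiment can suggest the right answers (our own sketch; the function names are ours and sampling the supremum proves nothing). It estimates the worst-case error $\sup |1/(1-x) - \sum_{k=0}^{n} x^k|$: on $[-R, R]$ with $R < 1$ fixed, the error shrinks as $n$ grows, while pushing $R$ toward $1$ at a fixed $n$ makes the error blow up.

```python
def geom_partial(x, n):
    # sum_{k=0}^{n} x^k
    return sum(x ** k for k in range(n + 1))

def approx_sup_error(R, n, samples=1000):
    # Sample-based estimate of sup over [-R, R] of |1/(1-x) - partial sum|.
    worst = 0.0
    for i in range(samples + 1):
        x = -R + 2.0 * R * i / samples
        worst = max(worst, abs(1.0 / (1.0 - x) - geom_partial(x, n)))
    return worst

# Fixed R < 1: the sup error goes to 0 as n grows (uniform convergence).
print([approx_sup_error(0.5, n) for n in (5, 10, 20)])
# Fixed n, R -> 1: the sup error blows up, hinting at non-uniformity on (-1, 1).
print([approx_sup_error(R, 20) for R in (0.5, 0.9, 0.99)])
```

The behavior matches the exact tail formula $\sup_{[-R,R]} |1/(1-x) - \sum_{k=0}^{n} x^k| = R^{n+1}/(1-R)$, which tends to $0$ for fixed $R < 1$ but is unbounded over $(-1, 1)$.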
HW 8.6.2 (a) Prove that the series $\sum_{n=0}^{\infty} (-1)^n x^n$ converges uniformly to $\frac{1}{1+x}$ on $[-R, R]$ for $0 < R < 1$.
(b) Discuss whether or not the series $\sum_{n=0}^{\infty} (-1)^n x^n$ converges uniformly to $\frac{1}{1+x}$ on $(-1, 1)$.
HW 8.6.3 (a) Show that $\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}$ has a radius of convergence of $R = 1$.
(b) Use part (a) to determine the power series expansion of $\tan^{-1} x$, and the radius of convergence of the series. Justify your results.
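As a numerical check for HW 8.6.3(b) (our own sketch; the expansion $\tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n x^{2n+1}/(2n+1)$ is the classical one obtained by integrating part (a) term by term, and justifying that step is the point of the exercise):

```python
import math

def arctan_partial(x, n):
    # sum_{k=0}^{n} (-1)^k x^(2k+1) / (2k+1), from integrating the
    # geometric series for 1/(1+x^2) term by term.
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

for x in (0.3, 0.9):
    print(x, arctan_partial(x, 50), math.atan(x))
```

As with the logarithm series, convergence is fast well inside $(-1, 1)$ and slows down as $|x|$ approaches the radius of convergence.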
Index
p-series, 215
A-R Theorem, 169
{a_n}_{n=1}^∞, 48
a^m, 28
a^x, 199
[x], 64
f ◦ g, 119
A = B, 31
e^x, 198
exp(x), 197
f : D → R, 47
glb(S), 16
inf(S), 16
−∞, 25
∞, 25
E°, 36
∩_{x∈S} E_α, 32
lim_{n→∞} a_n, 49
lim_{x→x_0} f(x) = L, 83
E′, 36
ln x, 195
lub(S), 16
min, 56
N, 2
p implies q, 6
if p, then q, 6
p is a sufficient condition for q, 6
p only if q, 6
q is a necessary condition for p, 6
Q, 2
R, 9
A ⊂ B, 31
sup(S), 16
∪_{x∈S} E_α, 32
Z, 2
absolute convergence, 212
absolute maximum, 120
absolute minimum, 120
absolute value, 24
of a function, 180
accumulation point, 36
addition, 9
additive identity, 15, 20
alternating series, 221
Archimedes-Riemann Theorem, 169
Archimedean property, 22, 23
arithmetization of analysis, 2
associative
addition, 10
multiplication, 10
Associative Laws
set theory, 32
backwards triangular inequality, 24
bisection method, 123
Bolzano-Weierstrass Theorem, 72
bounded, 16
bounded above, 16, 18, 22, 23, 77
bounded below, 16, 21, 22, 77
bracket function, 64
cancellation law, 10
Cauchy criterion
sequences, 74
sequences of functions, 233
series, 209
Cauchy sequence, 2, 73
Cauchy sequences, 3
Cauchy, Augustin-Louis, 1
chain rule, 142
closed
with respect to addition, 9
with respect to multiplication, 9
closed interval, 25
closed set, 36, 40–42
common refinement, 165
commutative
addition, 9
multiplication, 10
Commutative Laws
set theory, 32
compact, 73
compact set, 40–42
comparison test, 215
complement, 34
complete, 18, 20, 22
complete ordered field, 19
completeness axiom, 18, 22, 23, 64
composite function, 119, 120
continuity, 120
differentiation, 142
conditional convergence, 212
continuity, 109, 113, 118, 122
continuous
at a point
definition, 109
contradiction, 6–8
contrapositive, 6, 8
converge
infinity, 80
convergence
Cauchy criterion, 74
sequence of functions
pointwise, 207
series, 208
series of functions
pointwise, 210
converges, 49
critical point, 149
d’Alembert, Jean-le-Rond, 1
decreasing function, 124
Dedekind cuts, 2, 3
DeMorgan’s Laws, 34
dense, 36
derivative, 139
left hand, 140
right hand, 140
derivative function, 139
derived set, 36
difference
sets, 34
differentiable
at a point, 139
on a set, 139
differentiable function, 1
direct proof, 5, 6, 9
discontinuous, 109
distributive, 10
diverge
infinity, 80
divergence, 55
series, 208
domain, 47
e, 197
empty set, 31
equality, 9
equivalence class, 2
equivalent statements, 7
exponential function, 197
extended reals, 25
field, 9–11, 15
finite subcover, 41
function, 47
Gauss, Carl Friedrich, 1
greatest lower bound, 16, 17, 21, 22
horizontal line test, 126
hypergeometric series, 1
identity
addition, 10
multiplication, 10
if and only if, 24
image, 47
implication, 6, 7
increasing
function, 77
increasing function, 54, 124
indirect proof, 5, 6, 9
inductive assumption, 27
inductive hypothesis, 28
infimum, 16
infinite limit
sequence, 80
infinite limits, 55
sequences, 79
infinite series, 1
infinity, 25, 48
integer, 4, 6
integers, 2, 3, 19
integral, 1
integral domain, 10, 11
integral test, 213, 215
interior of a set, 36
interior point, 36
Intermediate Value Theorem, 122
intersection, 32
interval, 25, 123
inverse
addition, 10
multiplication, 10
inverse function, 126
derivative, 151
invertible, 126
isolated point, 36
isomorphic, 19
IVT, 122
sequences, 47, 67
limits of sequences, 47
local maximum, 120
local minimum, 120
logarithm, 195
lower bound, 16, 17, 21
lower Darboux sums, 161
lower integral, 168
lower sums, 161, 168
L’Hospital’s Rule, 155
Lagrange, Joseph Louis, 1
least upper bound, 16–18, 21–23
left hand derivative, 140
left hand limit, 105
Leibniz, Gottfried Wilhelm, 1
limit, 1, 109
from the left, 105
from the right, 105
function, 83
one-sided, 104
sequence, 61, 83
limit comparison test, 216
limit point, 39, 40, 42, 83
set, 36
limits
infinite
sequences, 79
natural number, 29
natural numbers, 2, 19
negation, 6
neighborhood, 36, 39, 50
infinity, 50
Newton, Isaac, 1
non-convergence, 55
nonexistence, 56
map, 47
mathematical induction, 25
maximum, 120, 148
Mean Value Theorem, 149, 150
minimum, 56, 120, 148
monotone, 76
Monotone Convergence Theorem, 75, 77
monotone sequence, 75
monotonic function, 173
monotonic sequences, 76
monotonically decreasing, 75–79
monotonically increasing, 75–77
multiplication, 9, 20
multiplicative identity, 20
multiplicative inverse, 20
MVT, 150
one-sided limits, 104
one-to-one, 19
one-to-one function, 126
onto, 47
open cover, 41, 42
open interval, 25
open set, 36, 40, 41
order, 9, 12
order structure, 9
ordered field, 12, 16, 18
partial sums, 208
partition, 161
partition interval, 161
Peano Postulate, 26
Peano Postulates, 19, 20
piecewise continuous, 181
polynomial, 5
polynomial equation, 3
polynomials
continuity, 119
power series, 224
premise, 5
Principle of Mathematical Induction, 25, 26, 43
principle of mathematical induction, 27, 28
product rule
differentiation, 140
proof, 5
mathematical induction, 25
proper subset, 31
quotient rule
differentiation, 141
sequential limits, 68
radius
neighborhood, 36
radius of convergence, 227
range, 47
ratio test, 218, 219
rational, 3, 4
rational functions
continuity, 119
rational number, 2, 5
rational numbers, 2, 3, 16, 19
rational roots, 4
real line, 1
real number, 2
real number system, 2
real numbers, 1, 2, 9, 15, 18, 20, 22
reduced form, 2–4
reductio ad absurdum, 7, 21, 29
refinement
common, 165
partition, 164
reflexive law
equality, 9, 11
Riemann Theorem, 172
right hand derivative, 140
right hand limit, 105
Rolle’s Theorem, 149
root, 3, 4
Sandwich Theorem, 177
sequences, 70
sequence, 47–49
sequential limit, 47, 51, 61
series, 208
functions, 208
set
closed, 36
compact, 40, 41
derived, 36
interior, 36
open, 36
set containment, 32
set theory, 31
sine, 96
step function, 182
strictly decreasing, 75–77
strictly decreasing function, 124
strictly increasing, 75–77
strictly increasing function, 124
subsequence, 71
subset, 31
substitution law, 10, 11
successor, 19
supremum, 16
symmetric law
equality, 9
theory of limits, 1
topological space, 36
topology, 31, 36
transitive law
equality, 9
triangular inequality, 24, 25, 64, 65
trigonometric functions, 96
truth table, 7
truth value, 7
unbounded, 16
uniform Cauchy criterion, 233
series of functions, 235
uniform convergence, 228, 229
functions, 229
sequences, 228
series, 235
union, 32
universal set, 33, 35
universe, 33
upper bound, 16, 17, 21, 77
upper Darboux sums, 161
upper integral, 168
upper sums, 161, 168
valid argument, 5, 7
Venn Diagram, 33
Weierstrass Test, 235
Weierstrass Theorem, 235
Weierstrass, Karl, 1
Well-Ordered Principle, 29
Well-Ordering principle, 23