Lecture notes - Department of Mathematical Sciences

Calculus
Paul Sutcliffe
Office: CM212a.
www.maths.dur.ac.uk/~dma0pms/calc/calc.html
Books
Calculus: One and Several Variables, Salas, Hille & Etgen.
Calculus, Spivak.
Mathematical methods in the physical sciences, Boas.
Contents

1 Functions
   1.1 Functions, domain and range
   1.2 The graph of a function
   1.3 Graphs and transformations
   1.4 Even and odd functions
   1.5 Piecewise functions
   1.6 Operations with functions
   1.7 Inverse functions
   1.8 Elementary functions

2 Limits and continuity
   2.1 Definitions
   2.2 Results about limits and continuity
   2.3 Limits as x → ∞
   2.4 Classification of discontinuities
   2.5 The intermediate value theorem

3 Differentiation
   3.1 Derivative as a limit
   3.2 Rules of differentiation
   3.3 Derivatives of higher order
   3.4 The chain rule
   3.5 L'Hopital's rule
   3.6 Implicit differentiation
   3.7 Boundedness and monotonicity
   3.8 Critical points
   3.9 Min-Max problems
   3.10 Rolle's theorem
   3.11 The mean value theorem
   3.12 The inverse function rule
   3.13 Partial derivatives

4 Integration
   4.1 Indefinite integrals
   4.2 Definite integrals
   4.3 The fundamental theorem of calculus
   4.4 The logarithm and exponential function revisited
   4.5 Standard integrals
   4.6 Integration by parts
   4.7 Integration by substitution
   4.8 Integration of rational functions using partial fractions
   4.9 Integration using a recurrence relation
   4.10 Definite integrals using even and odd functions

5 Differential equations
   5.1 First order separable ODEs
   5.2 First order homogeneous ODEs
   5.3 First order linear ODEs
   5.4 First order exact ODEs
   5.5 Bernoulli equations
   5.6 Second order linear constant coefficient ODEs
      5.6.1 The homogeneous case
      5.6.2 The inhomogeneous case
      5.6.3 Initial and boundary value problems
   5.7 Systems of first order linear ODEs

6 Taylor series
   6.1 Taylor's theorem and Taylor series
   6.2 Lagrange form for the remainder
   6.3 Calculating limits using Taylor series

7 Fourier series
   7.1 Calculating the Fourier coefficients
   7.2 Parseval's theorem
   7.3 Half range Fourier series
   7.4 Complex Fourier series

8 Functions of several variables
   8.1 Taylor's theorem for functions of two variables
   8.2 Chain rule
   8.3 Implicit differentiation
   8.4 Double integrals
      8.4.1 Rectangular regions
      8.4.2 Beyond rectangular regions
      8.4.3 Polar coordinates
      8.4.4 Change of variables and the Jacobian
      8.4.5 The Gaussian integral
1 Functions
Notation:

∀     for every
∃     there exists
s.t.  such that
iff   if and only if
Open interval, (a, b) = {x : a < x < b}.
Closed interval, [a, b] = {x : a ≤ x ≤ b}.
Half-open interval, (a, b] = {x : a < x ≤ b}, etc.
Semi-infinite interval, (a, ∞) = {x : x > a}, etc.
The set of real numbers, (−∞, ∞), is denoted by R.
For two sets A and B, A ∪ B is the union, A ∩ B is the intersection and
A\B is the set of all elements in A that are not in B.
1.1 Functions, domain and range
Defn: A function f is a correspondence between two sets D (called the
domain) and C (called the codomain), that assigns to each element of D
one and only one element of C. The notation to indicate the domain and
codomain is f : D → C.
For x ∈ D we write f(x) to denote the assigned element in C, and call this
the value of f at x, or the image of x under f, where x is called the argument
of the function. Extending the above notation we write
f : D → C : x ↦ f(x)
ie. "f is a function from D to C that associates x in D to f(x) in C".
The set of all images is called the range of f, and our notation for the domain
and range is
Dom f and Ran f = {f(x) : x ∈ Dom f}.
Figure 1: Illustration of the domain, codomain and range of a function.
In (almost all of) this course we shall only deal with real-valued functions
of a real variable, meaning that our functions assign real numbers to real
numbers ie. both the domain and codomain are subsets of R.
There are various ways of representing a function, but perhaps the most familiar is through an explicit expression.
Eg. f : R → R given by f(x) = x², ∀x ∈ R.
In this case Dom f = R and Ran f = [0, ∞).
We say that f maps the real line onto [0, ∞).
Eg. f(x) = √(2x + 4), x ∈ [0, 6]. Here Dom f = [0, 6] and Ran f = [2, 4].
If the domain of a function f is not explicitly given, then it is taken to be
the maximal set of real numbers x for which f (x) is a real number.
Eg. f(x) = √(2x + 4). Here Dom f = [−2, ∞) and Ran f = [0, ∞).
Eg. f (x) = 1/(1 − x). Here Dom f = R\{1} = (−∞, 1) ∪ (1, ∞) and
Ran f = R\{0}.
In addition to explicit expressions, a function f may be represented by other
means, for example, as a set of ordered pairs (x, y), where x ∈ Dom f and
y ∈ Ran f.
Note: y(x) is another common notation for a function. The element x in the
domain is called the independent variable and the element y in the range is
called the dependent variable. This notation is often used if the function
is defined by an equation in two variables, including differential equations
(see later).
The function f(x) = x² is defined by the equation y = x², as we have simply
written y = f(x). However, not all equations will define a function.
Eg. The equation y² = x does NOT define a function y(x), because for x < 0
there are no real solutions for y, and for x > 0 there are two solutions for
y, whereas a function must assign only one element of the range for each
element of the domain.
1.2 The graph of a function
Defn: The graph of a function f is the set of all points (x, y) in the xy-plane
with x ∈ Dom f and y = f (x), ie.
graph f = {(x, y) : x ∈ Dom f and y = f (x)}.
The graph of a function over the interval [a, b] is the portion of the graph
where the argument is restricted to this interval.
Note: If you are asked to graph a function, but no interval is given, then try
to choose an appropriate interval that includes all the interesting behaviour
eg. turning points.
The graph of a function is a curve in the plane, but not every curve is the
graph of a function. The following simple test determines whether a curve is
the graph of a function.
Figure 2: The graph of f (x) = cos x over [−2π, 2π].
The vertical line test.
If any vertical line intersects the curve more than once then the curve is not
the graph of a function, otherwise it is.
The proof is obvious, given the defining property of a function that only one
element in the range is associated with an element in the domain.
Figure 3: The vertical line test applied to a cubic curve and a circle.
In the above figure the first curve is the graph of a function; in fact the
cubic function f(x) = x³ − x² − 1. The second curve is not the graph of a
function, as the vertical line shown intersects the curve twice. In fact the
curve is given by the equation y² = 4 − (x − 1)², ie. the circle of radius 2
with centre (x, y) = (1, 0). Note that even when a curve is not the graph of
a function, some sections of the curve may be graphs of functions. For the
above circle example the section given by y ≥ 0 is the graph of the function
f(x) = √(4 − (x − 1)²) with Dom f = [−1, 3].
1.3 Graphs and transformations
The graph of a function f : R → R can be shifted, reflected and rescaled by
applying some simple transformations to the function and its argument, as
follows:
• translation:
f (x) + a shifts the graph of f (x) up by a units.
f (x + b) shifts the graph of f (x) to the left by b units.
• reflection:
−f (x) reflects the graph of f (x) about the x-axis.
f (−x) reflects the graph of f (x) about the y-axis.
|f (x)| reflects the portions of the graph of f (x) that are below the x-axis
about the x-axis.
• scaling:
λf (x) is a vertical stretch of the graph of f (x) if λ > 1 and a vertical contraction if 0 < λ < 1.
f (µx) is a horizontal contraction of the graph of f (x) if µ > 1 and a horizontal stretch if 0 < µ < 1.
Figure 4: Graph of f(x) = x² − 2x + 2, together with −f(x) and f(−x).
Combining these transformations in turn is a way to create graphs from the
graph of a simpler starting function. As an example, the figure shows how
Figure 5: Graph of f (x) = sin x, together with 2f (x) and f (2x).
to obtain the graph of the function f(x) = 2 + 2|(x − 1)² − 1| over [−1, 3] by
starting from the function g(x) = x².
Figure 6: Graphs of x², (x−1)², (x−1)²−1, |(x−1)²−1|, 2|(x−1)²−1|, 2+2|(x−1)²−1|.
Here are some tips for graphing the reciprocal 1/f (x) from the graph of f (x).
If f (x) is increasing (decreasing) then 1/f (x) is decreasing (increasing).
The graphs of f (x) and 1/f (x) intersect iff f (x) = ±1.
1.4 Even and odd functions
Defn. A function f is even if f (x) = f (−x) ∀ ± x ∈ Dom f.
Defn. A function f is odd if f (x) = −f (−x) ∀ ± x ∈ Dom f.
The graph of an even function is symmetric under a reflection about the
y-axis.
The graph of an odd function is symmetric under a rotation by 180° about
the origin (equivalent to a reflection in the y-axis followed by a reflection in
the x-axis).
Figure 7: Graph of f(x) = 1 + x², together with 1/f(x).
Figure 8: Graphs of the even function x² and the odd function x³.
All functions f : R → R can be written as the sum of an even and an odd
function
f(x) = f_even(x) + f_odd(x),
where f_even(x) = (f(x) + f(−x))/2 and f_odd(x) = (f(x) − f(−x))/2.
This decomposition is often useful, as is the ability to spot an even or odd
function as it can simplify some calculations.
Eg. f(x) = (1 + x) sin x with f_even(x) = x sin x and f_odd(x) = sin x.
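The decomposition is easy to check numerically. The following Python snippet (illustrative only, not part of the original notes) verifies the even/odd split for f(x) = (1 + x) sin x:

```python
import math

def f(x):
    return (1 + x) * math.sin(x)

def f_even(x):
    # f_even(x) = (f(x) + f(-x)) / 2
    return 0.5 * (f(x) + f(-x))

def f_odd(x):
    # f_odd(x) = (f(x) - f(-x)) / 2
    return 0.5 * (f(x) - f(-x))

for x in [-2.0, -0.5, 0.0, 1.3, 3.7]:
    assert abs(f_even(x) - x * math.sin(x)) < 1e-12   # matches x sin x
    assert abs(f_odd(x) - math.sin(x)) < 1e-12        # matches sin x
    assert abs(f(x) - (f_even(x) + f_odd(x))) < 1e-12 # parts sum to f
```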
1.5 Piecewise functions
Some functions are defined piecewise ie. different expressions are given for
different intervals in the domain.
Eg. The absolute value (or modulus) function
f(x) = |x| = x if x ≥ 0, and −x if x < 0.
Figure 9: The graph of |x| over [−1, 1].
Eg.
f(x) = x if 0 ≤ x < 1, and x − 1 if 1 ≤ x ≤ 2.
In this example Dom f = [0, 2] but the function has a discontinuity at x = 1
(more about this later).
Figure 10: The graph of a piecewise function.
A step function is a piecewise function which is constant on each piece. An
example is the Heaviside step function H(x) defined by
H(x) = 0 if x < 0, and 1 if x > 0.
Note that with this definition Dom H = R\{0}. It is sometimes convenient
to extend the domain to R by defining the value of H(0) (some obvious
candidates are 0, 1/2, 1).
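A piecewise definition like this translates directly into code. The sketch below (illustrative Python, not part of the notes) implements the Heaviside step function, choosing the convention H(0) = 1/2 from the candidates mentioned above:

```python
def H(x):
    """Heaviside step function, extended to all of R with H(0) = 1/2
    (one of the common conventions; 0 and 1 are also used)."""
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return 0.5  # chosen value at the discontinuity

assert H(-3.0) == 0.0
assert H(2.0) == 1.0
assert H(0.0) == 0.5
```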
Figure 11: The Heaviside step function.
1.6 Operations with functions
Given two functions f and g we can define the following:
• the sum is (f + g)(x) = f (x) + g(x), with domain Dom f ∩ Dom g.
• the difference is (f − g)(x) = f (x) − g(x), with domain Dom f ∩ Dom g.
• the product is (f g)(x) = f (x)g(x), with domain Dom f ∩ Dom g.
• the ratio is (f /g)(x) = f (x)/g(x), with domain (Dom f ∩ Dom g)\{x :
g(x) = 0}.
• the composition is (f ◦ g)(x) = f (g(x)), with domain {x ∈ Dom g : g(x) ∈
Dom f }.
Note that f ◦ g and g ◦ f are usually different functions.
Eg. f(x) = sin x, g(x) = x² then (f ◦ g)(x) = sin(x²) but (g ◦ f)(x) = sin²x.
1.7 Inverse functions
Defn. A function f : D → C is surjective (or onto) if Ran f = C,
ie. if ∀ y ∈ C ∃ x ∈ D s.t. f (x) = y.
Eg. f : R → R given by f(x) = 2x + 1 is surjective.
Eg. f : R → R given by f(x) = x² is not surjective, since any negative
number in the codomain is not the image of an element in the domain.
Defn. A function f : D → C is injective (or one-to-one) if ∀ x₁, x₂ ∈ D
with x₁ ≠ x₂ then f(x₁) ≠ f(x₂).
Eg. f(x) = 2x + 1 is injective since f(x₁) = f(x₂) implies x₁ = x₂.
Eg. f(x) = x² is not injective since f(x) = f(−x) and x ≠ −x if x ≠ 0.
The following simple test can be applied to see if a function is injective from
its graph.
The horizontal line test.
If no horizontal line intersects the graph of f more than once then f is injective, otherwise it is not.
Eg. The function f(x) = x³ − 3x is surjective as Ran f = R but it is not
injective as the horizontal line y = 1 intersects the graph of f at 3 points (see
Figure 12).
Figure 12: Graph of f(x) = x³ − 3x and the horizontal line y = 1.
Defn. A function f : D → C is bijective if it is both surjective and injective.
Eg. From above we have seen that the function f (x) = 2x + 1 is bijective.
Theorem of inverse functions.
A bijective function f admits a unique inverse, denoted f⁻¹, such that
(f⁻¹ ◦ f)(x) = x = (f ◦ f⁻¹)(x).
It is clear from the definition that Dom f⁻¹ = Ran f and Ran f⁻¹ = Dom f.
As the inverse undoes the effect of the function an equivalent definition is
f(x) = y iff f⁻¹(y) = x.
Eg. We have seen that f(x) = 2x + 1 is bijective, so let's find its inverse. To
simplify notation first write y = f⁻¹(x). Using the property x = f(f⁻¹(x)),
we have x = f(y) = 2y + 1 and hence y = (x − 1)/2 = f⁻¹(x).
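A quick numerical round-trip check of this inverse (illustrative Python, not part of the notes):

```python
def f(x):
    return 2 * x + 1

def f_inv(x):
    # derived above: f_inv(x) = (x - 1) / 2
    return (x - 1) / 2

# f_inv undoes f and vice versa, at several sample points
for x in [-5.0, 0.0, 0.25, 7.0]:
    assert f_inv(f(x)) == x
    assert f(f_inv(x)) == x
```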
Note that given an injective function we can make a bijective function by
taking the codomain equal to the range, and then an inverse exists.
Eg. We have seen that f(x) = x² is not an injective function if Dom f = R,
but it is injective if we take Dom f = [0, ∞). With this choice we can take
the codomain equal to Ran f = [0, ∞) and f is now a bijective function. The
inverse is f⁻¹(x) = √x with Dom f⁻¹ = Ran f = [0, ∞).
Warning: Don't confuse the notation for the inverse with the same notation
for the reciprocal.
1.8 Elementary functions
Polynomials.
Linear functions are polynomials of degree one and take the general form
f (x) = ax + b. Here Dom f = Ran f = R. The graph is a straight line with
slope a and y-intercept (0, b), which is the point on the graph that intersects
the y-axis.
Quadratic functions are polynomials of degree two and take the general
form f(x) = ax² + bx + c. Here Dom f = R, but the range is restricted. The
graph is an upward parabola if a > 0 and a downward parabola if a < 0.
By completing the square we can write f(x) = a(x + b/(2a))² + c − b²/(4a)
and see that the vertex of the parabola is at (−b/(2a), c − b²/(4a)). Hence,
if a > 0 then the range is given by Ran f = [c − b²/(4a), ∞) and if a < 0
then Ran f = (−∞, c − b²/(4a)].
Eg. f(x) = −2x² + 8x + 5 = −2(x − 2)² + 13 so Ran f = (−∞, 13].
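The completed-square form can be spot-checked numerically. The Python sketch below (illustrative only, not part of the notes) confirms that the two expressions agree and that 13 is indeed the maximum value:

```python
def f(x):
    return -2 * x**2 + 8 * x + 5

def completed(x):
    # completed-square form derived above
    return -2 * (x - 2) ** 2 + 13

# both forms agree at sample points (exact in floating point here)
for x in [-3.0, 0.0, 1.5, 2.0, 6.0]:
    assert f(x) == completed(x)

# the vertex value 13 is the maximum, consistent with Ran f = (-inf, 13]
assert f(2.0) == 13
assert all(f(x) <= 13 for x in [-10, -1, 0, 2, 3, 10])
```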
Cubic functions are polynomials of degree three and take the general form
f(x) = ax³ + bx² + cx + d, with Dom f = Ran f = R. The graph of a typical
example is shown in Figure 12 for the cubic function f(x) = x³ − 3x. A cubic
has at most two turning points (where the graph changes direction) and in
the example these are at x = ±1.
From the above examples it should be clear that for a polynomial function
of arbitrary degree n, the domain is always R, and the range is also R if n is
odd, but is restricted if n is even.
Square root functions.
These take the form f(x) = √(g(x)) for some function g(x). By definition the
square root is a non-negative number, so this can yield restrictions on the
range. There can also be restrictions on the domain, since the square root is
only defined if the argument g(x) is non-negative.
Eg. f(x) = −√(−x) has Dom f = Ran f = (−∞, 0].
Rational functions.
A rational function is a ratio of two polynomials f (x) = p(x)/q(x).
The domain is given by Dom f = R\{x : q(x) = 0}.
Eg. f(x) = 2x/(x² − 1) has Dom f = R\{±1}.
The simplified form of a rational function is obtained by removing any
common factors in the numerator and denominator.
Eg.
f(x) = (2x − 4)/(x² − 4) = 2(x − 2)/((x + 2)(x − 2)) = 2/(x + 2) if x ≠ 2.
Note that we had to exclude the point x = 2 in the last equality to avoid
a division by zero. Hence in this example the original rational function and
its simplified form 2/(x + 2) are not quite the same function as they have
different domains, being R\{±2} and R\{−2} respectively.
If, in simplified form, q(a) = 0 then x = a is a vertical asymptote of the
graph of f, which is a vertical line given by the equation x = a.
Figure 13: Graph of f(x) = 2x/(x² − 1) and the vertical asymptotes x = ±1.
There can also be horizontal asymptotes, though a correct treatment requires the concept of a limit, which we have not yet studied. Roughly, a
horizontal asymptote involves considering the behaviour of the function when
|x| is large. For rational functions a horizontal asymptote exists only if the
degree of the numerator is not larger than the degree of the denominator. In
this case we can write
f(x) = (p₀ + p₁x + … + pₙxⁿ)/(q₀ + q₁x + … + qₙxⁿ)
and the horizontal asymptote is the line y = pₙ/qₙ. In the example of Figure 13 we have n = 2 and p₂ = 0, q₂ = 1, so the horizontal asymptote is the
line y = 0, which the graph approaches when |x| is large.
A rational function is called a proper rational function if the degree of the
numerator is less than the degree of the denominator. A rational function can
always be written as the sum of a polynomial and a proper rational function
by performing long division.
Figure 14: Graph of f(x) = 3x²/(2x² − 50), showing the vertical asymptotes x = ±5 and
the horizontal asymptote y = 3/2.
Eg. f(x) = (1 + x⁶)/(x³ − x²) is not a proper rational function. Long
division of x⁶ + 1 by x³ − x² gives quotient x³ + x² + x + 1 with remainder
x² + 1. This yields the required decomposition
(1 + x⁶)/(x³ − x²) = x³ + x² + x + 1 + (x² + 1)/(x³ − x²).
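The decomposition can be verified numerically by evaluating both sides at a few points away from the zeros of the denominator (illustrative Python, not part of the notes):

```python
import math

def f(x):
    # the original improper rational function
    return (1 + x**6) / (x**3 - x**2)

def decomposed(x):
    # polynomial part plus proper rational part, from the long division
    return x**3 + x**2 + x + 1 + (x**2 + 1) / (x**3 - x**2)

# avoid x = 0 and x = 1, where the denominator vanishes
for x in [-2.0, 0.5, 3.0, 10.0]:
    assert math.isclose(f(x), decomposed(x), rel_tol=1e-9)
```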
Exponential and logarithm functions.
An exponential function is any function of the form f(x) = bˣ where b
(called the base) is a positive real number not equal to 1.
For all values of the base Dom f = R and Ran f = (0, ∞) and the graph is
continuous and passes through the point (0, 1). However, there are two types
of behaviour:
• If b > 1 the graph is increasing and has a horizontal asymptote y = 0 along
the negative x-axis.
• If 0 < b < 1 the graph is decreasing and has a horizontal asymptote y = 0
along the positive x-axis.
Figure 15: Graphs of f(x) = 2ˣ and g(x) = (1/2)ˣ = 1/2ˣ = 1/f(x).
Two important properties of exponential functions are
bˣbʸ = bˣ⁺ʸ and (bˣ)ʸ = bˣʸ.
There is a special value of the base (called the natural base) b = e = 2.718...,
where e is called Euler’s number and is transcendental (like π) ie. it is not the
root of a polynomial equation with rational coefficients. One way to calculate
e is via the sum
e = 1 + 1/1! + 1/2! + 1/3! + 1/4! + …
The function f(x) = eˣ is usually referred to as the exponential function.
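The series for e converges very rapidly; the short Python sketch below (illustrative only, not part of the notes) sums the first few terms and compares against the library value:

```python
import math

def e_partial(n):
    """Partial sum 1/0! + 1/1! + ... + 1/n! of the series for e."""
    total = 0.0
    term = 1.0  # term = 1/k!, updated incrementally
    for k in range(n + 1):
        if k > 0:
            term /= k
        total += term
    return total

# fifteen terms already agree with e to better than 1e-12
assert abs(e_partial(15) - math.e) < 1e-12
# and the first five terms match the sum written above
assert abs(e_partial(4) - (1 + 1 + 0.5 + 1/6 + 1/24)) < 1e-15
```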
An exponential function f(x) = bˣ is injective and so admits an inverse. To
determine this write y = f⁻¹(x) so that f(y) = x, ie. bʸ = x. This is the
defining relation of a logarithm function g(x) = log_b x = f⁻¹(x), which is
simply the name given to the inverse of an exponential function f(x) = bˣ.
Explicitly, y = log_b x iff bʸ = x.
The domain of a logarithm function is the range of an exponential function
and hence is (0, ∞). The range of a logarithm function is the domain of an
exponential function and is thus R.
Notation: I shall use the notation log x as the shorthand for the logarithm
log_e x, which is known as the natural logarithm.
Warning: This notation is not universal, as some people use ln x to denote
the natural logarithm and instead use log x to denote log_10 x.
A logarithm grows very slowly for x large and positive (more slowly than a
linear function) and has a vertical asymptote at x = 0 (see Figure 16).
Figure 16: The natural logarithm log x and the vertical asymptote at x = 0.
Some important properties of logarithms are:
• log_b 1 = 0, from b⁰ = 1.
• log_b bˣ = x, from the definition as the inverse of an exponential.
• log_b(αβ) = log_b α + log_b β, from b^α b^β = b^(α+β).
• log_b(α/β) = log_b α − log_b β, from b^α / b^β = b^(α−β).
• log_b(xʳ) = r log_b x, from (bˣ)ʳ = bˣʳ.
Trigonometric functions.
Defn. A function f (x) is periodic if ∃ p > 0 s.t. f (x + p) = f (x) ∀ x.
The smallest such value of p is called the period of f.
The simplest periodic functions are the trigonometric functions sin x and cos x
with period 2π. In both cases the domain is R and the range is [−1, 1].
The function tan x = sin x / cos x has period π and the domain is
R\{x = (1/2 + n)π for integer n}, with associated vertical asymptotes
x = (1/2 + n)π. The range is R.
Figure 17: The trigonometric functions sin x, cos x and tan x.
It is expected that you know the values of the trigonometric functions at the
particular values of the argument shown in the table, plus those related by
the symmetries apparent in the plots shown in Figure 17.
radians   0    π/6    π/4    π/3    π/2
degrees   0    30     45     60     90
sin       0    1/2    1/√2   √3/2   1
cos       1    √3/2   1/√2   1/2    0
It is also expected that you know the important trigonometric identities:
• sin²x + cos²x = 1.
• addition formulae
sin(x ± y) = sin x cos y ± cos x sin y
cos(x ± y) = cos x cos y ∓ sin x sin y
• double angle formulae
sin 2x = 2 sin x cos x
cos 2x = 2 cos²x − 1 = 1 − 2 sin²x
• half-angle formulae
sin²x = 1/2 − (1/2) cos 2x
cos²x = 1/2 + (1/2) cos 2x
The reciprocals of the trigonometric functions have their own notation:
csc x = 1/ sin x, sec x = 1/ cos x, cot x = 1/ tan x = cos x/ sin x.
The above formulae lead to related formulae for these functions
eg. sec²x = 1 + tan²x.
The trigonometric functions are not injective over their full domains, but are
injective over suitably restricted domains (given in the table), so that an inverse exists.
f(x)     Dom f          Ran f     f⁻¹(x)               Dom f⁻¹   Ran f⁻¹
sin x    [−π/2, π/2]    [−1, 1]   sin⁻¹x or arcsin x   [−1, 1]   [−π/2, π/2]
cos x    [0, π]         [−1, 1]   cos⁻¹x or arccos x   [−1, 1]   [0, π]
tan x    (−π/2, π/2)    R         tan⁻¹x or arctan x   R         (−π/2, π/2)
Figure 18: Graphs of the functions sin⁻¹x, cos⁻¹x and tan⁻¹x.
Hyperbolic functions.
Hyperbolic functions, with domain R, are defined in terms of the exponential function via
sinh x = (1/2)(eˣ − e⁻ˣ), cosh x = (1/2)(eˣ + e⁻ˣ), tanh x = sinh x / cosh x.
Figure 19: Graphs of the functions sinh x, cosh x and tanh x.
From the properties of the exponential function it is easy to show (exercise)
that some basic properties include
• sinh 0 = 0 and cosh 0 = 1.
• If x > 0 then 0 < sinh x < cosh x.
• sinh x is an odd function and cosh x is an even function.
• Addition formulae
sinh(x ± y) = sinh x cosh y ± cosh x sinh y
cosh(x ± y) = cosh x cosh y ± sinh x sinh y.
• Double argument formulae
sinh(2x) = 2 sinh x cosh x
cosh(2x) = sinh²x + cosh²x.
• cosh²x − sinh²x = 1.
The reciprocals of hyperbolic functions have their own notation:
cosech x = 1/sinh x, sech x = 1/cosh x, coth x = 1/tanh x.
The above formulae lead to related formulae for these functions
eg. sech²x = 1 − tanh²x.
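These identities are easy to spot-check numerically (illustrative Python, not part of the notes):

```python
import math

for x in [-1.5, 0.0, 0.7, 2.0]:
    s, c = math.sinh(x), math.cosh(x)
    # cosh^2 x - sinh^2 x = 1
    assert abs(c**2 - s**2 - 1) < 1e-9
    # double argument formulae
    assert abs(math.sinh(2 * x) - 2 * s * c) < 1e-9
    assert abs(math.cosh(2 * x) - (s**2 + c**2)) < 1e-9
    # sech^2 x = 1 - tanh^2 x
    assert abs(1 / c**2 - (1 - math.tanh(x)**2)) < 1e-12
```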
2 Limits and continuity
2.1 Definitions
Rough idea #1:
A function f (x) has a limit L at x = a if f (x) is close to L whenever x is
close to a.
Rough idea #2: Let's think about the function f (x) near x = a.
If you approximate f (x) by a constant L then you will make an error given
by |f (x) − L|. Suppose ε > 0 is the εrror at which you become unhappy
with your approximation, that is, you are happy as long as |f (x) − L| < ε.
The important issue for a limit is whether I can keep you happy simply by
making sure that x stays within a δistance δ > 0 of a, that is by restricting
to 0 < |x − a| < δ. Note that we don't even care about what happens exactly
at x = a. If I can always find a δistance δ that keeps you happy, no matter
how small you choose your εrror ε, then we say that f (x) has a limit L as x
tends to a.
Figure 20: The idea of a limit.
Defn: f(x) has a limit L as x tends to a if
∀ ε > 0 ∃ δ > 0 s.t. |f(x) − L| < ε when 0 < |x − a| < δ.
We then write
limx→a f(x) = L or equivalently f(x) → L as x → a.
If there is no such L then we say that no limit exists.
The limit does not require that f (a) is equal to L or even that f (a) exists.
This is because of the inequality 0 < |x − a| in the definition.
There is a lot more of this ε and δ stuff in Analysis I, where proofs concerning
limits are considered. In this course we shall be less formal and deal only
with methods to calculate limits or see that no limit exists.
Rough idea #3:
If f (a) exists then it would seem that this is a good candidate for the limit
L. This naive view turns out to be correct only if the graph of f (x) is a single
unbroken curve with no holes or jumps, at least in a small region around
x = a.
Defn. A function f (x) is continuous at the point x = a if the following
three properties all hold:
• f (a) exists.
• limx→a f (x) exists.
• limx→a f (x) is equal to f (a).
Defn. A function f (x) is continuous on a subset S of its domain if it is
continuous at every point in S.
Defn. A function f (x) is continuous if it is continuous at every point in its domain.
Eg. f(x) = x² is continuous and limx→2 x² = f(2) = 4.
Eg. The following function has a discontinuity at x = 0:
f(x) = x² if −1 ≤ x < 0, 1 if x = 0, and x² if 0 < x ≤ 1.
To indicate on a graph exactly which points are included we use a closed
circle to denote an included point, so if it is at the end of a curve segment
this denotes that this end is a closed interval. We use an open circle to
denote an excluded point associated with an open interval. This notation is
demonstrated in Figure 21 where the graph of the above function is presented.
Figure 21: A graph illustrating the use of open and closed circles. The limit exists at x = 0.
This function is not continuous at x = 0, but the limit exists at this point
and limx→0 f(x) = 0 ≠ 1 = f(0).
Eg. The following function also has a discontinuity at x = 0 (see Figure 22
for its graph):
f(x) = 1 if x ≤ 0, and x² if x > 0.
In this case no limit exists at x = 0. This is clear because if we consider x < 0
then to get arbitrarily close to the function for x arbitrarily close to 0 would
require the limit to be 1. However, if we consider x > 0 then to get arbitrarily
close to the function for x arbitrarily close to 0 would require the limit to
be 0. These two requirements are incompatible and so no limit exists at x = 0.
Figure 22: A graph illustrating the use of open and closed circles. No limit exists at x = 0.
Eg. For the function f (x) = sin(1/x) no limit exists at x = 0 because as
x approaches zero this function oscillates between 1 and -1 over smaller and
smaller intervals. In particular, in any given interval 0 < x < δ it is always
possible to find values x1 and x2 s.t. f (x1 ) = 1 and f (x2 ) = −1, thus f (x)
cannot remain close to any given constant L.
Figure 23: A graph of the function sin(1/x).
2.2 Results about limits and continuity
There are some simple facts about limits that follow from the definition:
• The limit is unique.
• If f (x) = g(x) (except possibly at x = a) on some open interval containing
a then limx→a f (x) = limx→a g(x).
• If f (x) ≥ K on either an interval (a, b) or an interval (c, a)
and if limx→a f (x) = L then L ≥ K.
(A similar result holds by replacing all ≥ with ≤).
• If limx→a f (x) = L and limx→a g(x) = M then
(i) limx→a (f (x) + g(x)) = L + M
(ii) limx→a (f (x)g(x)) = LM
(iii) if M ≠ 0 then limx→a (f (x)/g(x)) = L/M.
Note: In Durham the results (i),(ii),(iii) are sometimes known as the
Calculus Of Limits Theorem (COLT). However, this seems to be a rather
grand name for these results and you will not find this name in any books.
• Pinching Theorem. If g(x) ≤ f(x) ≤ h(x) for all x ≠ a in some open interval containing a and limx→a g(x) = limx→a h(x) = L then limx→a f(x) = L.
There are also some simple facts about continuous functions that follow
from the definition and the above results:
• If f (x) and g(x) are continuous functions then so are f (x) + g(x), f (x)g(x)
and f (x)/g(x).
• All polynomial, rational, trigonometric and hyperbolic functions are continuous.
• If f (x) is continuous then so is |f (x)|.
Eg. All the following functions are continuous:
2x³ + x + 7, 3x/(x − 1), |(1 + x²)/sin x|, tan x.
Note: Although tan x is continuous, it is not continuous on the interval [0, π],
because this interval includes the point x = π/2 which is not in the domain
of tan x.
Eg. limx→π/2 x² sin x = (π/2)² sin(π/2) = π²/4.
Eg. limx→0 1/x does not exist. The value of f(x) can be made greater than
any finite constant by taking x sufficiently close to zero, so no limit exists.
Eg. Calculate limx→3 (2x² − 18)/(x − 3).
(2x² − 18)/(x − 3) = 2(x + 3)(x − 3)/(x − 3) = 2(x + 3) if x ≠ 3.
The value of the function at x = 3 is irrelevant in defining the limit as x → 3, so
limx→3 (2x² − 18)/(x − 3) = limx→3 2(x + 3) = 12.
Eg. Calculate limx→25 (√x − 5)/(x − 25).
(√x − 5)/(x − 25) = (√x − 5)(√x + 5)/((x − 25)(√x + 5)) = (x − 25)/((x − 25)(√x + 5)) = 1/(√x + 5) if x ≠ 25.
Hence
limx→25 (√x − 5)/(x − 25) = limx→25 1/(√x + 5) = 1/10.
Eg. Calculate limx→0 x² sin(1/x).
As −1 ≤ sin(1/x) ≤ 1 then −x² ≤ x² sin(1/x) ≤ x². Also limx→0 (−x²) = 0 = limx→0 x².
Hence by the pinching theorem limx→0 x² sin(1/x) = 0.
Figure 24: Graphs of f(x) = x² sin(1/x) and the bounding functions g(x) = −x² and h(x) = x².
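A numerical illustration of the pinching bounds (Python, not part of the original notes): however wildly sin(1/x) oscillates, x² sin(1/x) stays trapped between −x² and x², and so is forced towards 0 with x.

```python
import math

# x^2 sin(1/x) is pinched between -x^2 and x^2
for x in [0.1, 0.01, 0.001, 1e-4]:
    value = x**2 * math.sin(1 / x)
    assert -x**2 <= value <= x**2  # the pinching bounds hold
    assert abs(value) <= x**2      # so |value| shrinks like x^2
```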
Two important trigonometric limits that you may assume to be true and
use are:
limx→0 (sin x)/x = 1 and limx→0 (1 − cos x)/x = 0.
The graphs in Figure 25 provide some evidence to support these results. They
can be proved by applying the pinching theorem but it requires a bit of work.
For example, the first result requires the fact that sin x < x < tan x for
0 < x < π/2.
Figure 25: Graphs of (sin x)/x and (1 − cos x)/x. As x = 0 is not in the domain of either
function there should really be open circles at x = 0 on both graphs.
Eg. Calculate limx→0 sin(2x)/x.
limx→0 sin(2x)/x = limx→0 2 sin(2x)/(2x) = limu→0 2 sin(u)/u = 2 · 1 = 2.
The above used the change of variable u = 2x and then the first of the two
important trigonometric limits given above.
Eg. Calculate limx→0 (tan x)/x.
limx→0 (tan x)/x = limx→0 (sin x)/(x cos x) = (limx→0 (sin x)/x)(limx→0 1/cos x) = 1 · 1 = 1.
Eg. Calculate limx→0 (1 − cos x)/x².
For −π < x < π,
(1 − cos x)/x² = (1 − cos x)(1 + cos x)/(x²(1 + cos x)) = ((sin x)/x)² · 1/(1 + cos x).
Hence by using the first important trigonometric limit we have that
limx→0 (1 − cos x)/x² = limx→0 ((sin x)/x)² · 1/(1 + cos x) = (1)² · 1/(1 + 1) = 1/2.
2.3 Limits as x → ∞.
So far we have only been concerned with the limit of a function f (x) as x
approaches a finite point a. However, it is also possible to define a limit as
x → ∞.
Rough idea #4:
A function f (x) has a limit L as x → ∞ if f (x) can be kept arbitrarily close
to L by making x sufficiently large.
Defn: f (x) has a limit L as x tends to ∞ if
∀ ε > 0 ∃ S > 0 s.t |f (x) − L| < ε when x > S.
We then write
lim f (x) = L or equivalently f (x) → L as x → ∞.
x→∞
If there is no such L then we say that no limit exists.
The corresponding definition associated with limx→−∞ f (x) = L should be
obvious and is given by
Defn: f (x) has a limit L as x tends to −∞ if
∀ ε > 0 ∃ S < 0 s.t |f (x) − L| < ε when x < S.
Eg. limx→∞ 1/x = 0. This should be clear because 1/x can be made as close
to zero as required by making x sufficiently large. However, a better way
to see this result is to first make the substitution u = 1/x as the limit
x → ∞ then becomes the limit u → 0 that we are already familiar with ie.
limx→∞ 1/x = limu→0 u = 0.
Eg. Calculate limx→∞ (x cos(1/x) + 2)/x.
By using the substitution u = 1/x we have that
limx→∞ (x cos(1/x) + 2)/x = limu→0 ((1/u) cos u + 2)/(1/u) = limu→0 (cos u + 2u) = cos(0) + 0 = 1.
Our earlier discussion of a horizontal asymptote can now be made more precise, in that the graph of a function f (x) will have a horizontal asymptote
y = L if limx→∞ f (x) = L.
Eg.
limx→∞ (2x + 3)/(x + 5) = limx→∞ (2 + 3/x)/(1 + 5/x) = (2 + 0)/(1 + 0) = 2.
Here we have used the result that limx→∞ 1/x = 0, together with the earlier
facts about limits. We see that calculating this limit shows that the graph of
this rational function has the horizontal asymptote y = 2, reproducing our
earlier observations about the horizontal asymptotes of rational functions.
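The approach to the asymptote can be seen numerically; an illustrative sketch, not from the notes:

```python
# (2x + 3)/(x + 5) = 2 - 7/(x + 5), so the gap to the asymptote y = 2
# shrinks like 7/x for large x.
def f(x):
    return (2 * x + 3) / (x + 5)

for x in [10, 100, 10000]:
    print(x, f(x))
```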
2.4 Classification of discontinuities
If a function f (x) is not continuous at a point x = a then we say it has a
discontinuity at x = a. There are different types of discontinuity and these
are best classified by considering how the function behaves on each side of the
point x = a. This motivates the definition of the following one-sided limits.
Defn: f (x) has a right-sided limit L+ = limx→a+ f (x) as x tends to a from
above if
∀ ε > 0 ∃ δ > 0 s.t |f (x) − L+ | < ε when 0 < x − a < δ.
Note that the difference between this definition and the definition of the limit
is the removal of the modulus sign in |x−a|. This means that we only need to
worry about points to the right of a when requiring the function to be close
to L+ .
Similarly, we have the definition
Defn: f (x) has a left-sided limit L− = limx→a− f (x) as x tends to a from
below if
∀ ε > 0 ∃ δ > 0 s.t |f (x) − L− | < ε when 0 < a − x < δ.
Here we only need to worry about points to the left of a when requiring the
function to be close to L− .
It should be obvious that L = limx→a f(x) exists iff L+ and L− both exist
and are equal, in which case L = L+ = L−.
There are 3 types of discontinuity:
(i) Removable discontinuity.
In this case L exists but f(a) ≠ L.
The discontinuity can be removed to make the continuous function
g(x) = { f(x) if x ≠ a;  L if x = a }.
Eg. The following function (Figure 26i) has a removable discontinuity at x = 0,
f(x) = { x² if x ≠ 0;  1 if x = 0 }.
Removing this discontinuity yields the continuous function g(x) = x².
Figure 26: 3 types of discontinuities at x = 0 : (i) removable; (ii) jump; (iii) infinite.
(ii) Jump discontinuity.
In this case both L+ and L− exist but L+ ≠ L−.
Eg. The following function (Figure 26ii) has a jump discontinuity at x = 0,
f(x) = { 1 if x ≤ 0;  x² if x > 0 }.
In this example L+ = 0 ≠ 1 = L−.
Eg. The signum function
sgn(x) = { −1 if x < 0;  0 if x = 0;  1 if x > 0 }
has a jump discontinuity at x = 0. In this example L+ = 1 ≠ −1 = L−.
(iii) Infinite discontinuity.
In this case at least one of L+ or L− does not exist.
Eg. The function f (x) = 1/x (Figure 26iii) has an infinite discontinuity at x = 0.
In this example neither L+ nor L− exist.
2.5 The intermediate value theorem
The intermediate value theorem states that if f (x) is continuous on [a, b]
and u is any number between f (a) and f (b) (ie. either f (a) < u < f (b) or
f (b) < u < f (a)) then ∃ (at least one) c ∈ (a, b) s.t. f (c) = u.
Figure 27: An illustration of the intermediate value theorem.
Eg. f(x) = sin x is continuous on [0, π/2] and f(0) = 0 < 1/2 < 1 = f(π/2) so
by the intermediate value theorem there is (at least) one point x in (0, π/2)
s.t. sin x = 1/2.
It is important that f (x) is continuous throughout the interval, otherwise the
theorem does not apply.
Eg. f(x) = sgn(x)/(1 + x²) has f(−1) = −1/2 < 1/5 < 1/2 = f(1) but there is no x in
(−1, 1) s.t. f(x) = 1/5. The intermediate value theorem does not apply because f(x) is not continuous at x = 0. There is a jump discontinuity at x = 0
with limx→0+ f(x) = 1 ≠ −1 = limx→0− f(x).
An application of the intermediate value theorem is locating the zeros of a
function. If the function f (x) is continuous on [a, b] and we know that either
f (a) < 0 < f (b) or f (b) < 0 < f (a) then by the intermediate value theorem
the equation f (x) = 0 has at least one root between a and b.
Eg. The function f(x) = x² − 2 is continuous on [1, 2] with f(1) = −1 < 0
and f(2) = 2 > 0, so there is at least one root of the equation x² − 2 = 0 in
(1, 2) (of course the root is x = √2 = 1.4142...).
Repeated application of this approach gives the bisection method,
which can be used to locate the roots of a wide variety of equations to any
desired accuracy.
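The bisection method is easy to implement; the following is an illustrative sketch (the notes give no code), applied to the example f(x) = x² − 2 on [1, 2]. The tolerance is an arbitrary choice.

```python
def bisect(f, a, b, tol=1e-10):
    """Locate a root of f in [a, b], assuming f(a) and f(b) have
    opposite signs; halve the bracketing interval until it is
    shorter than tol."""
    fa = f(a)
    assert fa * f(b) < 0, "f must change sign on [a, b]"
    while b - a > tol:
        m = (a + b) / 2
        if fa * f(m) <= 0:   # the root lies in [a, m]
            b = m
        else:                # the root lies in [m, b]
            a, fa = m, f(m)
    return (a + b) / 2

print(bisect(lambda x: x**2 - 2, 1, 2))  # close to sqrt(2) = 1.4142...
```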
3 Differentiation
3.1 Derivative as a limit
Geometrically the derivative f′(a) of a function f(x) at x = a is equal to the
slope of the tangent to the graph of f(x) at x = a.
Figure 28: The graph of a function f (x), the tangent at x = a and the secant through the
points on the graph at x = a and x = a + h.
A secant is a line that intersects a curve at two points (the part of the secant
that is between the two intersection points is called a chord).
Consider a secant that intersects the graph of f (x) at the two (x, y) points
(a, f (a)) and (a+h, f (a+h)). The slope of this secant is given by the difference
quotient (f (a + h) − f (a))/h. As the two points approach each other ie. as
|h| decreases, the secant approaches the tangent at x = a. The tangent is
obtained in the limit as h → 0 and the derivative at a is the slope of the
secant in this limit ie.
f′(a) = limh→0 (f(a + h) − f(a))/h
providing this limit exists.
If this limit exists then we say that f (x) is differentiable at x = a.
Eg. Use the limit definition of the derivative to calculate f′(a) for f(x) = x².
f′(a) = limh→0 (f(a + h) − f(a))/h = limh→0 ((a + h)² − a²)/h = limh→0 (2ah + h²)/h = limh→0 (2a + h) = 2a.
This is, of course, the expected result from the knowledge that f′(x) = 2x.
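The convergence of the difference quotient can be seen numerically; an illustrative sketch, not from the notes, using a = 3 so that the limit should be 2a = 6:

```python
# (f(a + h) - f(a))/h for f(x) = x^2 equals 2a + h exactly,
# so it approaches f'(a) = 2a as h shrinks.
def diff_quot(f, a, h):
    return (f(a + h) - f(a)) / h

for h in [0.1, 0.01, 0.001]:
    print(h, diff_quot(lambda x: x**2, 3, h))
```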
Eg. Use the limit definition of the derivative to calculate f′(π) for f(x) = sin x.
f′(π) = limh→0 (f(π + h) − f(π))/h = limh→0 (sin(π + h) − sin π)/h = limh→0 (sin π cos h + cos π sin h)/h = limh→0 (−sin h)/h = −1
where the final equality uses the earlier important trigonometric limit result.
This is, of course, the expected answer from the knowledge that f′(x) = cos x
giving f′(π) = cos π = −1.
If f′(a) exists for all a in Dom f we say that f(x) is differentiable and then
f′(a) defines a function f′(x) called the derivative.
Eg. Use the limit definition of the derivative to calculate the derivative of
the function f(x) = x cos x.
f′(x) = limh→0 (f(x + h) − f(x))/h = limh→0 ((x + h) cos(x + h) − x cos x)/h
= limh→0 ((x + h)(cos x cos h − sin x sin h) − x cos x)/h
= limh→0 [x cos x (cos h − 1)/h − x sin x (sin h)/h + cos x cos h − sin x sin h] = −x sin x + cos x
where we have used both of the earlier important trigonometric limit results.
The fact that the derivative is equal to the slope of the tangent of the graph
allows us to determine a Cartesian equation for the tangent by calculating
the derivative. Explicitly, the tangent line at the point (a, f(a)) is given by
y = f(a) + f′(a)(x − a).
Eg. For the function f(x) = 4x³ − x, find a Cartesian equation for the tangent
to the graph at the point (1, 3).
f′(x) = 12x² − 1 so f′(1) = 11. The tangent line is given by
y = f(1) + f′(1)(x − 1) = 3 + 11(x − 1) = 11x − 8.
A necessary condition for a function to be differentiable at a point is that it
is continuous at that point, but this is not a sufficient condition.
Geometrically, there are two ways that a function can fail to be differentiable
at a point where it is continuous:
(i). The tangent line is vertical at that point.
Eg. The function f(x) = x^(1/3) is continuous in R but it is not differentiable
at x = 0. The difference quotient at x = 0 is
(f(0 + h) − f(0))/h = h^(1/3)/h = 1/h^(2/3)
and no limit exists as h → 0 because the above grows without bound.
(ii). There is no tangent line at that point.
Eg. The function f(x) = |x| is continuous in R but it is not differentiable at
x = 0. The difference quotient at x = 0 is
(f(0 + h) − f(0))/h = |h|/h = { −1 if h < 0;  1 if h > 0 }.
The left and right limits therefore both exist, limh→0+ (f(0 + h) − f(0))/h = 1 and
limh→0− (f(0 + h) − f(0))/h = −1, but these are not equal, so limh→0 (f(0 + h) − f(0))/h does
not exist.
3.2 Rules of differentiation
The following rules of differentiation are easy to prove by considering the
appropriate difference quotients.
If f (x) and g(x) are differentiable at x then so are the following:
• sum rule:
f(x) + g(x) with derivative f′(x) + g′(x).
• product rule:
f(x)g(x) with derivative f′(x)g(x) + f(x)g′(x).
and providing g(x) ≠ 0 then so are the following:
• reciprocal rule:
1/g(x) with derivative −g′(x)/(g(x))².
• quotient rule:
f(x)/g(x) with derivative (f′(x)g(x) − f(x)g′(x))/(g(x))².
Eg. Differentiate u(x) = (3x² − 1)/(x⁴ + 3x + 1).
u′(x) = ((3x² − 1)′(x⁴ + 3x + 1) − (3x² − 1)(x⁴ + 3x + 1)′)/(x⁴ + 3x + 1)²
= (6x(x⁴ + 3x + 1) − (3x² − 1)(4x³ + 3))/(x⁴ + 3x + 1)²
= (6x⁵ + 18x² + 6x − (12x⁵ + 9x² − 4x³ − 3))/(x⁴ + 3x + 1)²
= (−6x⁵ + 4x³ + 9x² + 6x + 3)/(x⁴ + 3x + 1)².
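The result can be checked against a central difference quotient; an illustrative sketch, not from the notes, with the test point x = 0.7 and step h chosen arbitrarily:

```python
def u(x):
    return (3 * x**2 - 1) / (x**4 + 3 * x + 1)

def u_prime(x):
    # the quotient-rule result derived above
    return (-6 * x**5 + 4 * x**3 + 9 * x**2 + 6 * x + 3) / (x**4 + 3 * x + 1)**2

x, h = 0.7, 1e-6
numeric = (u(x + h) - u(x - h)) / (2 * h)   # central difference
print(numeric, u_prime(x))
```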
3.3 Derivatives of higher order
If f(x) is differentiable then its derivative f′(x) may also be differentiable, in
which case we denote its derivative by f″(x) ie. the second derivative of
f(x). Similarly the third derivative is f‴(x) and in general the nth derivative
is f⁽ⁿ⁾(x) so that f⁽³⁾(x) = f‴(x). This notation is due to Lagrange (1736-1813),
but other notations are also common.
f′(x) = df/dx, f″(x) = d²f/dx², f⁽ⁿ⁾(x) = dⁿf/dxⁿ is due to Leibniz (1646-1716).
f′(x) = Df(x), f″(x) = D²f(x), f⁽ⁿ⁾(x) = Dⁿf(x) is due to Euler (1707-1783).
Finally, if the independent variable represents time, say f(t), then the notation f′(t) = ḟ
and f″(t) = f̈ is due to Newton (1642-1727).
As we have seen, the derivative df/dx is the slope of the tangent to the graph of
f(x). This means that the derivative df/dx measures the rate of change of the
dependent variable f with respect to changes in the independent variable x.
In the case that the independent variable is time t, the derivative measures
the rate of change with time, so if the dependent variable, say X(t), represents the position of an object confined to a line then Ẋ is the velocity of the
object, with |Ẋ| the speed, and Ẍ is the acceleration of the object.
The product rule for differentiation is just the first case of the more general
Leibniz rule:
If f(x) and g(x) are both differentiable n times then so is the product f(x)g(x) with
Dⁿ(fg) = Σₖ₌₀ⁿ C(n, k) (Dᵏf)(Dⁿ⁻ᵏg)
where C(n, k) = n!/(k!(n − k)!) is the binomial coefficient familiar from Pascal’s triangle
Figure 29: The binomial coefficients arranged into Pascal’s triangle.
Eg. Calculate D³(x² sin x).
D³(fg) = f(D³g) + 3(Df)(D²g) + 3(D²f)(Dg) + (D³f)g.
f = x², Df = 2x, D²f = 2, D³f = 0.
g = sin x, Dg = cos x, D²g = −sin x, D³g = −cos x.
D³(x² sin x) = −x² cos x − (3)2x sin x + (3)2 cos x + 0 · sin x = (6 − x²) cos x − 6x sin x.
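This can be checked against a finite-difference approximation of the third derivative; an illustrative sketch, not from the notes (the step size, test point, and stencil are ad hoc choices):

```python
import math

def f(x):
    return x**2 * math.sin(x)

def d3_exact(x):
    # the Leibniz-rule result derived above
    return (6 - x**2) * math.cos(x) - 6 * x * math.sin(x)

def d3_numeric(x, h=1e-3):
    # central finite-difference stencil for the third derivative
    return (f(x + 2*h) - 2*f(x + h) + 2*f(x - h) - f(x - 2*h)) / (2 * h**3)

x = 1.3
print(d3_exact(x), d3_numeric(x))
```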
3.4 The chain rule
To handle the differentiation of composite functions (f ◦ g)(x) = f (g(x)) we
turn to the following theorem:
The chain rule theorem states that if g(x) is differentiable at x and f(x)
is differentiable at g(x) then the composition (f ◦ g)(x) is differentiable at x
with
(f ◦ g)′(x) = f′(g(x))g′(x).
Recall that prime denotes differentiation with respect to the argument so in
Leibniz notation the above formula may be written more crudely as
d/dx f(g(x)) = (df/dg)(dg/dx)
where we need to be aware that on the right hand side the argument of the
first factor is g(x) and the argument of the second factor is x.
This form is useful as a mnemonic (think of ’cancelling’ the dg factor) but it
is no more than this.
Eg. Calculate d/dx (x + 1/x)⁻³.
In terms of the above notation g(x) = x + 1/x and f(x) = x⁻³
giving g′(x) = 1 − 1/x² and f′(x) = −3x⁻⁴ thus
d/dx (x + 1/x)⁻³ = f′(g(x))g′(x) = −3(x + 1/x)⁻⁴(1 − 1/x²).
Eg. d/dt cos(t²) = −2t sin(t²).
3.5 L’Hopital’s rule
Recall that the calculus of limits theorem states that if limx→a f(x) = L and
limx→a g(x) = M ≠ 0 then limx→a f(x)/g(x) = L/M. Clearly this result is not applicable to the situation where L = M = 0, which is called an indeterminate
form. We have already seen how to deal with situations like this directly, but
an alternative (and often easier) approach is sometimes available by making
use of the following.
L’Hopital’s rule
Let f(x) and g(x) be differentiable on I = (a − h, a) ∪ (a, a + h) for some
h > 0, with limx→a f(x) = limx→a g(x) = 0.
If limx→a f′(x)/g′(x) exists and g′(x) ≠ 0 ∀ x ∈ I then
limx→a f(x)/g(x) = limx→a f′(x)/g′(x).
Proof: We shall only consider the proof for the slightly easier situation in
which f(x) and g(x) are both differentiable at x = a and g′(a) ≠ 0.
In this case
limx→a f(x)/g(x) = limx→a (f(x) − f(a))/(g(x) − g(a)) = limx→a [(f(x) − f(a))/(x − a)] / [(g(x) − g(a))/(x − a)]
= [limx→a (f(x) − f(a))/(x − a)] / [limx→a (g(x) − g(a))/(x − a)] = f′(a)/g′(a) = limx→a f′(x)/g′(x).
We can apply L’Hopital’s rule to calculate the two important trigonometric
limits from earlier.
Eg. Calculate limx→0 sin x/x.
f(x) = sin x and g(x) = x are both differentiable. Also limx→0 sin x = sin 0 =
0 and limx→0 x = 0. Furthermore, f′(x) = cos x and g′(x) = 1 ≠ 0. Thus
L’Hopital’s rule applies to give
limx→0 sin x/x = limx→0 cos x/1 = cos 0 = 1.
Eg. Calculate limx→0 (1 − cos x)/x.
f(x) = 1 − cos x and g(x) = x are both differentiable. Also limx→0 (1 − cos x) =
1 − cos 0 = 0 and limx→0 x = 0. Furthermore, f′(x) = sin x and g′(x) = 1 ≠ 0.
Thus by L’Hopital’s rule
limx→0 (1 − cos x)/x = limx→0 sin x/1 = sin 0 = 0.
If applying L’Hopital’s rule yields another indeterminate form then L’Hopital’s
rule can be reapplied to this form and so on.
Eg. Calculate limx→0 (2 sin x − sin(2x))/(x − sin x).
limx→0 (2 sin x − sin(2x))/(x − sin x) = limx→0 (2 cos x − 2 cos(2x))/(1 − cos x) = limx→0 (−2 sin x + 4 sin(2x))/sin x
= limx→0 (−2 cos x + 8 cos(2x))/cos x = (−2 + 8)/1 = 6.
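The value 6 can be checked numerically; an illustrative sketch, not from the notes:

```python
import math

# (2 sin x - sin 2x)/(x - sin x) behaves like x^3/(x^3/6) = 6 near 0.
def ratio(x):
    return (2 * math.sin(x) - math.sin(2 * x)) / (x - math.sin(x))

for x in [0.1, 0.01, 0.001]:
    print(x, ratio(x))
```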
3.6 Implicit differentiation
So far we have been differentiating functions defined explicitly in terms of an
independent variable. Even if an explicit expression is not available, we
may still be able to obtain an expression for the derivative if we know that
a differentiable function satisfies a particular equation. This is the process
of implicit differentiation, which involves differentiating an equation and
using the chain rule.
Eg. Consider a function y(x) that satisfies the equation x² + y² = 1. Differentiating this equation with respect to x gives 2x + 2y dy/dx = 0 and hence
dy/dx = −x/y. In particular, the function y = √(1 − x²) satisfies the given equation and the above result yields dy/dx = −x/√(1 − x²), which we could have
obtained directly by differentiating the explicit expression.
Eg. Assume that x(t) is a differentiable function that satisfies the equation
cos(t − x) = (2t + 1)³x. Express dx/dt in terms of x and t.
Differentiating both sides of the equation with respect to t gives
−sin(t − x)(1 − dx/dt) = 6x(2t + 1)² + (2t + 1)³ dx/dt and hence
dx/dt = (6x(2t + 1)² + sin(t − x))/(sin(t − x) − (2t + 1)³).
3.7 Boundedness and monotonicity
The following definitions apply to a function f (x) defined in some interval I.
Defn: If ∃ a constant k1 s.t. f (x) ≤ k1 ∀ x in I we say that f (x) is
bounded above in I and we call k1 an upper bound of f (x) in I.
Furthermore, if ∃ a point x1 in I s.t. f (x1 ) = k1 we say that the upper bound
k1 is attained and we call k1 the global maximum value of f (x) in I.
Similarly, we have the following.
Defn: If ∃ a constant k2 s.t. f (x) ≥ k2 ∀ x in I we say that f (x) is
bounded below in I and we call k2 a lower bound of f (x) in I.
Furthermore, if ∃ a point x2 in I s.t. f (x2 ) = k2 we say that the lower bound
k2 is attained and we call k2 the global minimum value of f (x) in I.
Defn: f (x) is bounded in I if it is both bounded above and bounded below
in I ie. if ∃ a constant k s.t. |f (x)| ≤ k ∀ x in I.
If no interval I is specified then it is taken to be the domain of the function.
Eg. cos x is bounded (in R) because | cos x| ≤ 1 ∀x ∈ R. Both these upper and
lower bounds are attained as cos 0 = 1 and cos π = −1. The global maximum
value in R is 1 and the global minimum value in R is −1.
Eg. Consider f(x) = sgn(x)/(1 + x²) for −1 ≤ x ≤ 1. Then f(x) is bounded in [−1, 1]
because |f(x)| ≤ 1 for all x in [−1, 1] but neither of the bounds is attained,
so it has no global maximum value and no global minimum value in [−1, 1].
Eg. On [0, π/2) the function tan x is bounded below but not bounded above.
A condition that guarantees the existence of both a global maximum and
minimum value (called extreme values) is provided by the following:
The extreme value theorem states that if f is a continuous function on
a closed interval [a, b] then it is bounded on that interval and has upper and
lower bounds that are attained ie. ∃ points x1 and x2 in [a, b] s.t.
f (x2 ) ≤ f (x) ≤ f (x1 ) ∀ x ∈ [a, b].
Defn: f (x) is monotonic increasing in [a, b] if f (x1 ) ≤ f (x2 ) for all pairs
x1 , x2 with a ≤ x1 < x2 ≤ b.
Defn: f (x) is strictly monotonic increasing in [a, b] if f (x1 ) < f (x2 ) for
all pairs x1 , x2 with a ≤ x1 < x2 ≤ b.
Obvious definitions apply with increasing replaced by decreasing.
Eg. sgn(x) is monotonic increasing in [−1, 1] but not strictly.
Eg. x2 is strictly monotonic increasing in [0, b] for any b > 0.
3.8 Critical points
Defn: We say that f (x) has a local maximum at x = a if ∃ h > 0 s.t.
f (a) ≥ f (x) ∀ x ∈ (a − h, a + h).
Remark: If f(x) has a local maximum at x = a and is differentiable at this
point then f′(a) = 0.
Proof: As f(x) is differentiable at a then
limx→a− (f(x) − f(a))/(x − a) = limx→a+ (f(x) − f(a))/(x − a) = f′(a).
f(x) has a local maximum at a implies
(f(x) − f(a))/(x − a) ≥ 0 for x ∈ (a − h, a)
hence f′(a) = limx→a− (f(x) − f(a))/(x − a) ≥ 0.
Similarly, f(x) has a local maximum at a also implies
(f(x) − f(a))/(x − a) ≤ 0 for x ∈ (a, a + h)
hence f′(a) = limx→a+ (f(x) − f(a))/(x − a) ≤ 0.
Putting these two together gives f′(a) ≤ 0 ≤ f′(a) and thus f′(a) = 0.
Defn: We say that f (x) has a local minimum at x = a if ∃ h > 0 s.t.
f (x) ≥ f (a) ∀ x ∈ (a − h, a + h).
Remark: If f(x) has a local minimum at x = a and is differentiable at this
point then f′(a) = 0.
Defn: f(x) has a stationary point at x = a if it is differentiable at x = a
with f′(a) = 0.
Defn: An interior point x = a of the domain of f(x) is called a critical
point if either f′(a) = 0 or f′(a) does not exist.
Every local maximum and local minimum of a differentiable function is a
stationary point but there may be other stationary points too.
Eg. f(x) = x³ has f′(0) = 0 so x = 0 is a stationary point but it is neither a
local maximum nor a local minimum because f(x) > f(0) for x ∈ (0, h) but
f(x) < f(0) for x ∈ (−h, 0).
Defn: If f(x) is twice differentiable in an open interval around x = a with
f″(a) = 0 and if f″(x) changes sign at x = a then we say that x = a is a
point of inflection.
Figure 30: The graph of f(x) = x³ indicating the point of inflection at x = 0.
Eg. f(x) = x³ has f″(x) = 6x so f″(0) = 0. As f″(x) < 0 for x < 0 and
f″(x) > 0 for x > 0 then f″(x) changes sign at x = 0 so it is a point of
inflection.
The first derivative test
Suppose f(x) is continuous at a critical point x = a.
(i). If ∃ h > 0 s.t. f′(x) < 0 ∀ x ∈ (a − h, a) and f′(x) > 0 ∀ x ∈ (a, a + h)
then x = a is a local minimum.
(ii). If ∃ h > 0 s.t. f′(x) > 0 ∀ x ∈ (a − h, a) and f′(x) < 0 ∀ x ∈ (a, a + h)
then x = a is a local maximum.
(iii). If ∃ h > 0 s.t. f′(x) has a constant sign ∀ x ≠ a in (a − h, a + h) then
x = a is not a local extreme value (minimum/maximum).
Eg. f(x) = |x| is continuous but f′(x) ≠ 0 so there are no stationary points.
The derivative does not exist at x = 0 so this is a critical point. f′(x) < 0 for
x ∈ (−1, 0) and f′(x) > 0 for x ∈ (0, 1) so by the first derivative test x = 0
is a local minimum. In fact f(0) = 0 is a global minimum as |x| ≥ 0.
Eg. f(x) = x⁴ − 2x³ has f′(x) = 4x³ − 6x² = 2x²(2x − 3). The only critical
points are x = 0, 3/2. Now f′(x) < 0 for x < 3/2 and f′(x) > 0 for x > 3/2.
Thus f′(x) has constant sign on both (sufficiently close) sides of x = 0 so
this is not a local extreme value. However, f′(x) changes sign from negative
to positive as x passes through x = 3/2 so this is a local minimum.
The second derivative test
Suppose f(x) is twice differentiable at x = a with f′(a) = 0.
(i). If f″(a) > 0 then x = a is a local minimum.
(ii). If f″(a) < 0 then x = a is a local maximum.
Note that the theorem doesn’t say anything about the case f″(a) = 0.
Eg. f(x) = 2x³ − 9x² + 12x has f′(x) = 6x² − 18x + 12 = 6(x² − 3x + 2) =
6(x − 1)(x − 2) giving stationary points at x = 1 and x = 2. Furthermore,
f″(x) = 6(2x − 3) giving f″(1) = −6 < 0 and f″(2) = 6 > 0 so by the second
derivative test x = 1 is a local maximum and x = 2 is a local minimum.
Note that the first derivative test is more general than the second derivative
test as it does not require the function to be differentiable at the critical point.
Determining all critical points and asymptotes is the key to sketching the
graph of a function.
Eg. Sketch the graph of the function f(x) = (4x − 5)/(x² − 1).
limx→±∞ (4x − 5)/(x² − 1) = limx→±∞ (1/x) · (4 − 5/x)/(1 − 1/x²) = 0 · (4 − 0)/(1 − 0) = 0
so y = 0 is a horizontal asymptote along both the positive and negative
x-axis. There are vertical asymptotes at x = ±1.
f′(x) = (4(x² − 1) − 2x(4x − 5))/(x² − 1)² = (−4x² + 10x − 4)/(x² − 1)² = −(4x − 2)(x − 2)/(x² − 1)²
so there are stationary points at x = 1/2, 2 with f(1/2) = 4 and f(2) = 1.
The sign of f′(x) on the intervals between the points x = −1, 1/2, 1, 2 is:
x:      (−∞, −1)  (−1, 1/2)  (1/2, 1)  (1, 2)  (2, ∞)
f′(x):     −          −          +        +       −
As x passes through 1/2 then f′(x) changes sign from negative to positive so
this is a local minimum. As x passes through 2 then f′(x) changes sign from
positive to negative so this is a local maximum.
Figure 31: Graph of f(x) = (4x − 5)/(x² − 1).
If a function is defined on an interval then extreme values can occur at the
endpoints of the interval. Here an endpoint x = c is a point at which the
function is defined but where the function is undefined either to the left or
right of this point. For example, if the domain of a function is [a, b] then a
and b are both endpoints.
Defn: If c is an endpoint of f (x) then f has an endpoint maximum at
x = c if f (x) ≤ f (c) for x sufficiently close to c.
The obvious similar definition applies for an endpoint minimum.
If a function is differentiable sufficiently close to an endpoint then examining
the sign of the derivative can determine if there is an endpoint maximum or
minimum.
If f (x) is continuous on an interval [a, b] then the global extreme values in
this interval are attained at either critical points or endpoints, so we can
determine them by examining all the possibilities.
Eg. Find the global extreme values of f(x) = 1 + 4x² − (1/2)x⁴ for x ∈ [−1, 3].
Figure 32: Graph of f(x) = 1 + 4x² − (1/2)x⁴.
f′(x) = 8x − 2x³ = 2x(4 − x²) which is zero at x = 0, 2; recall we only consider
x ∈ [−1, 3].
Critical points: f(0) = 1 and f(2) = 9.
Endpoints: f(−1) = 9/2 and f(3) = −7/2.
The smallest of these four values is −7/2, which is the global minimum, whereas
the largest is 9, which is the global maximum.
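The candidate comparison can be written out in a few lines; an illustrative sketch for this example, not from the notes:

```python
# Compare f at the interior critical points (x = 0, 2) and the
# endpoints (x = -1, 3) of [-1, 3]; the extreme values must occur there.
def f(x):
    return 1 + 4 * x**2 - 0.5 * x**4

candidates = [0, 2, -1, 3]   # critical points, then endpoints
values = [f(x) for x in candidates]
print(dict(zip(candidates, values)))
print("global min:", min(values), "global max:", max(values))
```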
Eg. Find the global extreme values of
f(x) = { x² + 2x + 2 if −1/2 ≤ x < 0;  x² − 2x + 2 if 0 ≤ x ≤ 2 }.
f′(x) = 2x + 2 for x ∈ [−1/2, 0) so no critical points here.
f′(x) = 2x − 2 for x ∈ (0, 2] so a critical point at x = 1 with f(1) = 1.
limx→0+ f′(x) = −2 ≠ 2 = limx→0− f′(x) hence f′(x) does not exist at x = 0,
making this a critical point with f(0) = 2.
Endpoints: f(−1/2) = 5/4 and f(2) = 2.
Thus the global minimum is 1 and the global maximum is 2.
Figure 33: Graph of a piecewise function with two quadratic pieces.
3.9 Min-Max problems
Determining extreme values (minima and maxima) is the subject of optimization, which has many practical applications. We can use the ideas we have
studied so far if we can express the quantity to be optimized as a function of
one variable.
Eg. Determine the ratio of the side lengths of a rectangle of fixed perimeter,
so that its area is as large as possible.
Let the side lengths be x and y. The fixed perimeter is P = 2x + 2y and the
area is A = xy. By substituting y = P/2 − x into A we get A(x) = Px/2 − x²
as a function of one variable. The physically allowed values are 0 ≤ x ≤ P/2
and A(x) vanishes at both end points. A′(x) = P/2 − 2x = 0 if x = P/4. As
A″(x) = −2 < 0 this is a local maximum with A(P/4) = P²/16 and is also the
global maximum since there are no other critical points. With x = P/4 then
y = P/2 − x = P/4 so the ratio of side lengths is 1 ie. the optimal rectangle is
a square.
Eg. What length of fencing is required to create two adjacent rectangular
fields of the same width and total area 15,000 m²?
Use length units of m and area units of m². Let x be the common width
and y the total height. The amount of fencing is F = 3x + 2y, from the
perimeter plus the extra dividing fence. The total area is 15000 = xy.
Thus F(x) = 3x + 30000/x is the function to be minimized for x > 0.
F′(x) = 3 − 30000/x² = 0 if x = 100. Then F(100) = 300 + 300 = 600. Also
F″(x) = 60000/x³ > 0 and the stationary point is a global minimum. The
required length of fencing is 600 m.
3.10 Rolle’s theorem
Rolle’s theorem states that if f is differentiable on the open interval (a, b)
and continuous on the closed interval [a, b], with f(a) = f(b), then there is
at least one c ∈ (a, b) for which f′(c) = 0.
Proof: By the extreme value theorem ∃ x1 and x2 in [a, b] s.t.
f(x1) ≤ f(x) ≤ f(x2) ∀ x ∈ [a, b].
If x1 ∈ (a, b) then x1 is a local minimum and f′(x1) = 0 so we are done.
If x2 ∈ (a, b) then x2 is a local maximum and f′(x2) = 0 so we are done.
The only case that is left is if both x1 and x2 are endpoints, a, b. But since
f(a) = f(b) then in this case f(x1) = f(x2) = f(a) so the above bound
becomes f(a) ≤ f(x) ≤ f(a) ∀ x ∈ [a, b]. Thus f(x) = f(a) is constant in
the interval and f′(x) = 0 ∀ x ∈ (a, b).
Figure 34: Illustration of Rolle’s theorem.
An obvious corollary of Rolle’s theorem is that if f(x) is differentiable at
every point of an open interval I then each pair of zeros of the function in
I is separated by at least one zero of f′(x). This is simply the special case
of Rolle’s theorem with a and b taken to be the zeros of f(x) so that
f(a) = f(b) = 0.
An application of the above result is to bound the number of distinct real
zeros of a function.
Eg. Show that f(x) = (1/7)x⁷ − (1/5)x⁶ + x³ − 3x² − x has no more than 3 distinct
real roots.
Figure 35: Graph of f(x) = (1/7)x⁷ − (1/5)x⁶ + x³ − 3x² − x showing 3 zeros.
Suppose the required result is false and there are (at least) 4 distinct real roots
of f(x). Then by Rolle’s theorem there are (at least) 3 distinct real roots of
f′(x). Then by applying Rolle’s theorem to f′(x) there are (at least) 2 distinct
real roots of f″(x). However, f″(x) = 6x⁵ − 6x⁴ + 6x − 6 = 6(x − 1)(x⁴ + 1)
has only one real root (at x = 1). This contradiction proves the required
result.
3.11 The mean value theorem
The mean value theorem states that if f is differentiable on the open
interval (a, b) and continuous on the closed interval [a, b], then there is at
least one c ∈ (a, b) for which
f′(c) = (f(b) − f(a))/(b − a).
Figure 36: Illustration of the mean value theorem.
Geometrically this theorem states that there is at least one point in the interval at which the tangent to the curve is parallel to the line joining the two
points (b, f (b)) and (a, f (a)) at the ends of the interval. The geometry makes
the result clear but here is the proof.
Proof: Define the function
g(x) = (b − a)(f(x) − f(a)) − (x − a)(f(b) − f(a)).
Then g(a) = g(b) = 0 and as g(x) satisfies the requirements of Rolle’s theorem
then ∃ c ∈ (a, b) for which g′(c) = 0. As g′(x) = (b − a)f′(x) − (f(b) − f(a))
then setting x = c yields the required result f′(c) = (f(b) − f(a))/(b − a).
Note that Rolle’s theorem is just a special case of the mean value theorem
with f (a) = f (b).
The mean value theorem can be used to prove some obvious results relating
monotonicity to the sign of the derivative. Here is an example:
Suppose f(x) is continuous in [a, b] and differentiable in (a, b) with f′(x) ≥ 0
throughout this interval. Then f(x) is monotonic increasing in (a, b).
Proof: If a ≤ x1 < x2 ≤ b then by the mean value theorem ∃ c ∈ (x1, x2) s.t.
f′(c) = (f(x2) − f(x1))/(x2 − x1). However, since f′(x) ≥ 0 in (a, b) then f′(c) ≥ 0 which
implies that f(x2) ≥ f(x1) ie. f is monotonic increasing.
Similar obvious results and proofs follow for the cases of monotonic decreasing and strictly monotonic increasing/decreasing.
3.12 The inverse function rule
The derivative of the inverse of a function can be obtained from the following:
The inverse function rule states that if f(x) is continuous in [a, b] and
differentiable in (a, b) with f′(x) > 0 throughout this interval then its inverse
function g(y) (recall g(f(x)) = x) is differentiable for all f(a) < y < f(b)
with g′(y) = 1/f′(g(y)).
Note: A similar theorem exists for the case f′(x) < 0.
Eg. f(x) = sin x has f′(x) = cos x > 0 for x ∈ (−π/2, π/2).
Its inverse function g(y) = sin⁻¹ y is therefore differentiable in (−1, 1) with
g′(y) = 1/f′(g(y)) = 1/cos g(y) = 1/√(1 − sin² g(y))
but y = f(g(y)) = sin(g(y)) hence g′(y) = 1/√(1 − y²). Thus we have shown that
d/dx sin⁻¹ x = 1/√(1 − x²).
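A quick numerical check of this formula, illustrative and not from the notes; the test point and step are arbitrary:

```python
import math

x, h = 0.5, 1e-6
numeric = (math.asin(x + h) - math.asin(x - h)) / (2 * h)  # central difference
exact = 1 / math.sqrt(1 - x**2)
print(numeric, exact)
```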
Eg. f(x) = tan x has f′(x) = sec² x > 0 for x ∈ (−π/2, π/2).
With this domain then Ran f = R hence the domain of the inverse function
g(y) = tan⁻¹(y) is R.
By the inverse function rule
g′(y) = 1/f′(g(y)) = 1/sec²(g(y)) = 1/(1 + tan²(g(y))) = 1/(1 + y²)
where we have used y = f(g(y)) = tan(g(y)).
Thus we have shown that
d/dx tan⁻¹ x = 1/(1 + x²).
3.13 Partial derivatives
So far we have only considered functions of a single variable eg. f(x). Later
we shall study functions of two variables eg. f(x, y). For this situation we
shall need the concept of a partial derivative. The partial derivative of f
with respect to x is written as ∂f/∂x and is obtained by differentiating f with
respect to x, keeping y fixed ie. y may be treated as a constant in performing
the partial differentiation with respect to x.
Eg. f(x, y) = x²y³ − sin x cos y + y has
∂f/∂x = 2xy³ − cos x cos y.
Similarly, the partial derivative of f with respect to y is written as ∂f/∂y and is
obtained by differentiating f with respect to y, keeping x fixed.
Eg. f(x, y) = x²y³ − sin x cos y + y has
∂f/∂y = 3x²y² + sin x sin y + 1.
Partial derivatives are defined in terms of the following limits (which we
assume exist)
∂f/∂x (x, y) = limh→0 (f(x + h, y) − f(x, y))/h,
∂f/∂y (x, y) = limh→0 (f(x, y + h) − f(x, y))/h.
An alternative notation for partial derivatives is to write fx for
∂f
∂x
etc.
By applying partial differentiation to these partial derivatives we may obtain the second order partial derivatives

∂²f/∂x² = ∂/∂x (∂f/∂x),   ∂²f/∂y² = ∂/∂y (∂f/∂y),   ∂²f/∂y∂x = ∂/∂y (∂f/∂x),   ∂²f/∂x∂y = ∂/∂x (∂f/∂y).

For the earlier example f(x, y) = x²y³ − sin x cos y + y we obtain

∂²f/∂x² = 2y³ + sin x cos y,
∂²f/∂y² = 6x²y + sin x cos y,
∂²f/∂y∂x = 6xy² + cos x sin y,
∂²f/∂x∂y = 6xy² + cos x sin y.
Note the equality, in this example, of the two mixed partial derivatives

∂²f/∂x∂y = ∂²f/∂y∂x.

It can be shown that this result is true in general, if f and all first and second order partial derivatives are continuous.
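The equality of the mixed partials can be checked numerically. The sketch below (an illustration, not part of the notes; the sample point and step size h are arbitrary choices) approximates ∂²f/∂y∂x and ∂²f/∂x∂y for the example above by nested central differences and compares them with the exact value 6xy² + cos x sin y.

```python
import math

def f(x, y):
    # The example from the notes: f(x, y) = x^2 y^3 - sin x cos y + y
    return x**2 * y**3 - math.sin(x) * math.cos(y) + y

def d2_yx(f, x, y, h=1e-4):
    # d/dy of df/dx, each derivative taken by a central difference
    fx = lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (fx(x, y + h) - fx(x, y - h)) / (2 * h)

def d2_xy(f, x, y, h=1e-4):
    # d/dx of df/dy, each derivative taken by a central difference
    fy = lambda x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (fy(x + h, y) - fy(x - h, y)) / (2 * h)

x0, y0 = 0.7, 1.2
mixed_yx = d2_yx(f, x0, y0)
mixed_xy = d2_xy(f, x0, y0)
exact = 6 * x0 * y0**2 + math.cos(x0) * math.sin(y0)
```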
4 Integration
4.1 Indefinite integrals
Defn: A function F(x) is called an indefinite integral or antiderivative of a function f(x) in the interval (a, b) if F(x) is differentiable with F'(x) = f(x) throughout (a, b). We then write F(x) = ∫ f(x) dx.
Eg. ∫ cos x dx = sin x in R.
Eg. ∫ (1/x²) dx = −1/x in R\{0}.
Eg. ∫ sgn x dx = |x| in R\{0}.
Note1: If F(x) is an indefinite integral of f(x) in (a, b) then so is F(x) + c for any constant c. In applications of integration it is important to include this arbitrary constant. We therefore write eg. ∫ cos x dx = sin x + c.
Note2: If F1 (x) and F2 (x) are both indefinite integrals of f (x) in (a, b) then
F1 (x) − F2 (x) = c for some constant c.
Defn: We say that f (x) is integrable in (a, b) if it has an indefinite integral
F (x) in (a, b) that is continuous in [a, b].
Eg. cos x is integrable in any finite interval (a, b) because sin x is continuous
in R.
Eg. 1/x² is not integrable in (0, 1) because its indefinite integral −1/x is not continuous at x = 0.
Eg. sgn x is integrable in (0, 1) because its indefinite integral |x| is continuous
in [0, 1].
4.2 Definite integrals
Defn: A subdivision S of [a, b] is a partition into a finite number of subintervals [a, x1 ], [x1 , x2 ], ..., [xn−1 , b] where a = x0 < x1 < x2 < ... < xn = b.
The norm |S| of the subdivision is the maximum of the subinterval lengths
|a − x1 |, |x1 − x2 |, .., |xn−1 − b|. (Thus a small value of |S| means that the interval [a, b] has been chopped up into small pieces.) The numbers z1 , z2 , ..., zn
form a set of sample points from S if zj ∈ [xj−1 , xj ] for j = 1, .., n.
Defn: Suppose that f(x) is a function defined for x ∈ [a, b]. The Riemann sum is

R = Σ_{j=1}^{n} (xj − xj−1) f(zj).
The Riemann sum is equal to the sum of the (signed) areas of rectangles of height f(zj) and width xj − xj−1.

Figure 37: An illustration of a Riemann sum.

Here signed means that the areas of rectangles below the x-axis are counted negatively. If f(x) is continuous in [a, b] and |S| is small then we expect R to be a good approximation to the (signed) area under the graph of f(x) above the interval [a, b]. This turns out to be correct and the error in the approximation can be reduced to zero by taking the limit in which |S| tends to zero. This leads to the definition of the definite integral as

∫_a^b f(x) dx = lim_{|S|→0} R

and it can be shown that this limit exists if f(x) is continuous in [a, b].
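To illustrate, the sketch below (not part of the notes; the integrand and subdivision sizes are arbitrary choices) approximates ∫_0^{π/2} cos x dx = 1 by Riemann sums on uniform subdivisions, taking the sample points zj at the midpoints of the subintervals; refining the subdivision (shrinking |S|) reduces the error, as the definition suggests.

```python
import math

def riemann_sum(f, a, b, n):
    # Uniform subdivision with n subintervals of width h = (b - a)/n,
    # sample points z_j chosen at the midpoints of the subintervals.
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) * h for j in range(n))

coarse = riemann_sum(math.cos, 0.0, math.pi / 2, 10)
fine = riemann_sum(math.cos, 0.0, math.pi / 2, 1000)
```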
By definition we set ∫_b^a f(x) dx = −∫_a^b f(x) dx and therefore ∫_a^a f(x) dx = 0.

Figure 38: The definite integral is the signed area under the curve: ∫_a^b f(x) dx = A1 − A2 + A3.
There are a number of fairly obvious properties of the definite integral that are not too hard to prove. In the following let f(x) and g(x) both be integrable in (a, b).
(i) Linearity
If λ, µ are any constants then λf(x) + µg(x) is integrable in (a, b) with

∫_a^b (λf(x) + µg(x)) dx = λ ∫_a^b f(x) dx + µ ∫_a^b g(x) dx.

(ii) If c ∈ [a, b] then ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.
(iii) If f(x) ≥ g(x) ∀ x ∈ (a, b) then ∫_a^b f(x) dx ≥ ∫_a^b g(x) dx.
(iv) If m ≤ f(x) ≤ M ∀ x ∈ [a, b] then

m(b − a) ≤ ∫_a^b f(x) dx ≤ M(b − a).
4.3 The fundamental theorem of calculus
So far we don't have any connection between indefinite and definite integrals. This is provided by the following.
The fundamental theorem of calculus
If f(x) is continuous on [a, b] then the function

F(x) = ∫_a^x f(t) dt

defined for x ∈ [a, b] is continuous on [a, b] and differentiable on (a, b) and is an indefinite integral of f(x) on (a, b) ie.

F'(x) = d/dx ∫_a^x f(t) dt = f(x)

throughout (a, b). Furthermore if F̃(x) is any indefinite integral of f(x) on [a, b] then

∫_a^b f(t) dt = F̃(b) − F̃(a) = [F̃(x)]_a^b.
We shall sketch the important points of the proof.
For a ≤ x < x + h < b we have that

F(x + h) − F(x) = ∫_a^(x+h) f(t) dt − ∫_a^x f(t) dt = ∫_x^(x+h) f(t) dt

where we have used property (ii).
Let m(h) and M(h) denote the minimum and maximum values of f(x) on the interval [x, x + h]. Then by property (iv) we have that

m(h)h ≤ ∫_x^(x+h) f(t) dt ≤ M(h)h.

Thus

m(h) ≤ [F(x + h) − F(x)]/h ≤ M(h).

Since f(x) is continuous on [x, x + h] we have that lim_{h→0+} m(h) = lim_{h→0+} M(h) = f(x) and so by the pinching theorem

lim_{h→0+} [F(x + h) − F(x)]/h = f(x).
A similar argument applies to the limit from below and together they give

F'(x) = lim_{h→0} [F(x + h) − F(x)]/h = f(x)
which proves that F (x) is an indefinite integral of f (x). The proof of the first
part of the theorem is completed by showing that F (x) is continuous from
the right at x = a and from the left at x = b. Both these follow by a simple
consideration of the limit of the relevant quotient. Finally, to prove the last
part of the theorem the key observation is that any indefinite integral Fe(x)
is related to F (x) by the addition of a constant.
The fundamental theorem of calculus provides a simple rule for differentiating a definite integral with respect to its limits.

Eg. d/dx ∫_0^x 1/(1 + sin²t) dt = 1/(1 + sin²x).

We can combine this result with the chain rule if the limit is a more complicated expression

Eg. d/dx ∫_0^(x²) 1/(1 + e^t) dt = [1/(1 + e^(x²))] · d/dx(x²) = 2x/(1 + e^(x²)).
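The first example can be sanity-checked numerically (an illustration, not part of the notes; the quadrature grid and difference step are arbitrary choices): approximate F(x) = ∫_0^x dt/(1 + sin²t) by the midpoint rule and differentiate it by a central difference; the result should be close to 1/(1 + sin²x).

```python
import math

def midpoint_integral(f, a, b, n=2000):
    # Midpoint-rule approximation of the definite integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) * h for j in range(n))

def F(x):
    # F(x) = integral from 0 to x of 1/(1 + sin^2 t) dt, approximately
    return midpoint_integral(lambda t: 1.0 / (1.0 + math.sin(t)**2), 0.0, x)

x0, h = 0.8, 1e-3
deriv = (F(x0 + h) - F(x0 - h)) / (2 * h)   # numerical F'(x0)
expected = 1.0 / (1.0 + math.sin(x0)**2)    # f(x0), as the theorem predicts
```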
4.4 The logarithm and exponential function revisited
We have already seen one definition of the logarithm function (as the inverse of the exponential function) but here is an alternative equivalent definition.
The function 1/x is continuous for x ∈ (0, ∞) so we can define

log t = ∫_1^t (1/x) dx  for t ∈ (0, ∞).

By the fundamental theorem of calculus, this is differentiable for t ∈ (0, ∞) with d/dt log t = 1/t. Thus log t is strictly monotonic increasing in (0, ∞) and it follows that log t is positive when t > 1, zero when t = 1, and negative when 0 < t < 1.
Figure 39: The logarithm function in terms of areas under the curve y = 1/x.
In Figure 39 we illustrate the logarithm function in terms of areas under the curve y = 1/x. For t1 > 1 then log t1 is the area A1 and for 0 < t2 < 1 then log t2 is minus the area A2.
As we have seen, the indefinite integral ∫ (1/x) dx = log x is only valid for x > 0 because this is the domain of the function log x. However, we can extend this relation for the indefinite integral to all x ≠ 0 as ∫ (1/x) dx = log |x|. It is easy to prove this result by considering the derivative of log(−x) for x < 0.
As we have already seen, the inverse of the function f(x) = log x is the exponential function g(y) = e^y = exp y. We can therefore use the inverse function rule to calculate the derivative of the exponential function

d/dy e^y = g'(y) = 1/f'(g(y)) = 1/(1/g(y)) = g(y) = e^y.

Thus we have shown that

d/dx e^x = e^x  ∀ x ∈ R.
If n is a positive integer then x^n has the obvious definition of x multiplied by itself n times. Also, x^(−n) = 1/x^n.
We define x^(1/n) as the inverse function of y^n in the interval (0, ∞), which has x^(1/n) > 0 and (x^(1/n))^n = x ∀ x ∈ (0, ∞).
If q is an integer then we define x^(q/n) = (x^(1/n))^q for x > 0. From this definition we have that

x^(q/n) = exp(log (x^(1/n))^q) = exp(q log x^(1/n)) = exp((q/n) log (x^(1/n))^n) = exp((q/n) log x).

Thus for any rational number r = q/n we have that for x > 0

x^r = exp(r log x).

If a is irrational (eg. a = √2) then we take this formula as a definition of x^a,

x^a = exp(a log x).
There are some important theorems concerning the logarithm and exponential functions and limits as x → ∞.
Theorem 1
For any constant a > 0

lim_{x→∞} (log x)/x^a = 0.

Roughly speaking this says that log x grows more slowly than any power of x. The proof involves first showing that if x > 0 then (log x)/x^a ≤ 1/(ae) and then applying the pinching theorem.
Theorem 2
For any constant a > 0

lim_{x→∞} x^a/e^x = 0.

Roughly speaking this says that any power of x grows more slowly than e^x. The proof involves first showing that if x > 0 then x^a/e^x ≤ a^a/e^a and then applying the pinching theorem.
Theorem 3
For any constant a

lim_{x→∞} (1 + a/x)^x = e^a.

Proof (for the case a > 0):
Recall the definition of the derivative of a function f(x) at x = b.

f'(b) = lim_{h→0} [f(b + h) − f(b)]/h.

Apply this to f(x) = log x so that f'(x) = 1/x and take b = 1.

1 = lim_{h→0} [log(1 + h) − log(1)]/h = lim_{h→0} log(1 + h)/h.

With the change of variable h = a/x this becomes

1 = lim_{x→∞} log(1 + a/x)/(a/x)  ie. a = lim_{x→∞} x log(1 + a/x).

Since e^x is continuous at a we have that

e^a = lim_{x→∞} exp(x log(1 + a/x)) = lim_{x→∞} (1 + a/x)^x.
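Theorem 3 is easy to watch converge numerically; the sketch below (not part of the notes; a = 2 and the sample values of x are arbitrary choices) shows the error |(1 + a/x)^x − e^a| shrinking as x grows.

```python
import math

a = 2.0
xs = (10, 100, 1000, 10000, 100000)
# Error of (1 + a/x)^x against the limiting value e^a, for growing x.
errors = [abs((1 + a / x) ** x - math.exp(a)) for x in xs]
```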
4.5 Standard integrals
A closed form expression is a function obtained from elementary functions by a finite sequence of elementary operations.
Eg. (e^(x²) + sin(1 + x²))/(x⁴ + e^(cos x)) is a closed form expression.
A basic problem in integration is given a closed form expression f(x) can we find a closed form expression for its indefinite integral F(x) = ∫ f(x) dx? In general this is not possible, for example, ∫ e^(−x²) dx does not have a closed form expression.
The game is to try and reduce ∫ f(x) dx to some standard integrals by using a number of rules of integration eg. integration by parts, substitution,...
Some standard integrals that you are expected to know include (we have seen some of these already but it is an exercise to prove them all by differentiating the right hand side in each case)

∫ x^a dx = x^(a+1)/(a + 1) + c, for constant a ≠ −1
∫ (1/x) dx = log |x| + c
∫ e^x dx = e^x + c
∫ sin x dx = −cos x + c
∫ cos x dx = sin x + c
∫ sec² x dx = tan x + c
∫ cosec² x dx = −cot x + c
∫ 1/(1 + x²) dx = tan⁻¹ x + c
∫ 1/√(1 − x²) dx = sin⁻¹ x + c
∫ 1/√(1 + x²) dx = sinh⁻¹ x + c

4.6 Integration by parts

∫ f(x) g'(x) dx = f(x) g(x) − ∫ f'(x) g(x) dx.
Eg. ∫ x²e^(3x) dx = (1/3)x²e^(3x) − (2/3)∫ xe^(3x) dx = (1/3)x²e^(3x) − (2/9)xe^(3x) + (2/9)∫ e^(3x) dx
= (1/3)x²e^(3x) − (2/9)xe^(3x) + (2/27)e^(3x) + c.
Eg. ∫ x cos x dx = x sin x − ∫ sin x dx = x sin x + cos x + c.
Eg. ∫ log x dx = ∫ (log x)·1 dx = (log x)x − ∫ (1/x)·x dx = x log x − x + c = x(log x − 1) + c.
Eg. ∫ e^x cos x dx = e^x sin x − ∫ e^x sin x dx = e^x sin x + e^x cos x − ∫ e^x cos x dx.
Hence ∫ e^x cos x dx = (1/2)e^x(sin x + cos x) + c.
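As a quick check of the last example, the antiderivative (1/2)e^x(sin x + cos x) can be compared against a midpoint-rule quadrature (an illustration, not part of the notes; the interval [0, 1] and grid size are arbitrary choices).

```python
import math

def midpoint_integral(f, a, b, n=4000):
    # Midpoint-rule approximation of the definite integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) * h for j in range(n))

# Antiderivative obtained above by integrating by parts twice.
F = lambda x: 0.5 * math.exp(x) * (math.sin(x) + math.cos(x))

numeric = midpoint_integral(lambda x: math.exp(x) * math.cos(x), 0.0, 1.0)
exact = F(1.0) - F(0.0)
```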
4.7 Integration by substitution

If F(x) = ∫ f(x) dx then F(u(x)) = ∫ f(u(x)) u'(x) dx.
In more compact notation, by making the substitution from x to u(x) we have

∫ f(x) dx = ∫ f(u) du, where du = u'(x) dx.
Eg. Calculate ∫ x/(2 + 3x⁴) dx.
Make the substitution u = √(3/2) x² then du = √6 x dx and

∫ x/(2 + 3x⁴) dx = (1/√6) ∫ 1/(2 + 2u²) du = (1/(2√6)) ∫ 1/(1 + u²) du = (1/(2√6)) tan⁻¹ u + c
= (1/(2√6)) tan⁻¹(√(3/2) x²) + c.
Eg. Calculate ∫ tan x dx.
Put u = cos x then du = −sin x dx and

∫ tan x dx = ∫ (sin x/cos x) dx = −∫ (1/u) du = −log |u| + c = −log |cos x| + c.
Eg. Calculate ∫_0^(π/2) cos³x dx.
Put u = sin x then du = cos x dx.
When x = 0 then u = 0 and when x = π/2 then u = 1. Therefore

∫_0^(π/2) cos³x dx = ∫_0^(π/2) cos x (1 − sin²x) dx = ∫_0^1 (1 − u²) du = [u − u³/3]_0^1 = 1 − 1/3 = 2/3.
Eg. Calculate ∫ 1/(x² + 4)² dx.
Put x = 2 tan u then dx = 2 sec²u du and

∫ 1/(x² + 4)² dx = ∫ 2 sec²u/(4 tan²u + 4)² du = ∫ (tan²u + 1)/(8(tan²u + 1)²) du = ∫ 1/(8(tan²u + 1)) du
= (1/8) ∫ cos²u du = (1/16) ∫ (1 + cos 2u) du = (1/16)(u + (1/2) sin 2u) = (1/16)(u + sin u cos u)
= (1/16)(u + tan u/sec²u) = (1/16)(u + tan u/(1 + tan²u)) = (1/16)(tan⁻¹(x/2) + (x/2)/(1 + x²/4)).

Thus ∫ 1/(x² + 4)² dx = (1/16)(tan⁻¹(x/2) + 2x/(4 + x²)) + c.
For a general function f(x) the formula

∫ f'(x)/f(x) dx = log |f(x)| + c

is a simple application of integration by substitution.
Simply put u = f(x) so that du = f'(x) dx. Then

∫ f'(x)/f(x) dx = ∫ (1/u) du = log |u| + c = log |f(x)| + c.

Eg. ∫ (x³ + 1)/(x⁴ + 4x + 7) dx = (1/4) log |x⁴ + 4x + 7| + c.
4.8 Integration of rational functions using partial fractions
Given a rational function, first of all decompose it into a polynomial and a proper rational function. As the integration of a polynomial is trivial we can now restrict attention to the integration of proper rational functions.
Every proper rational function can be written as a sum of partial fractions which take the form

A/(x − α)^k  and  (Bx + C)/(x² + βx + γ)^l

where the quadratic that appears in the above has no real roots.
The number and type of partial fractions that appear depends on the factors in the denominator of the rational function.
Each factor in the denominator of the form (x − α)^k generates a partial fraction of the form

A1/(x − α) + A2/(x − α)² + ... + Ak/(x − α)^k.

Each factor in the denominator of the form (x² + βx + γ)^l generates a partial fraction of the form

(B1 x + C1)/(x² + βx + γ) + (B2 x + C2)/(x² + βx + γ)² + ... + (Bl x + Cl)/(x² + βx + γ)^l.
The coefficients Ak , Bl , Cl can be found directly by comparison with the rational function either by the substitution of specific values of x or by comparing
coefficients of x. The integration is then completed by integrating the partial
fraction.
Eg. Calculate ∫ 2x/(x² − x − 2) dx.

2x/(x² − x − 2) = 2x/((x − 2)(x + 1)) = a/(x − 2) + b/(x + 1).

To determine a and b multiply the above by (x − 2)(x + 1) to give 2x = a(x + 1) + b(x − 2). Putting x = 2 yields 4 = 3a and putting x = −1 yields −2 = −3b hence

2x/(x² − x − 2) = (4/3)/(x − 2) + (2/3)/(x + 1).

Integrating this expression gives the required integral

∫ 2x/(x² − x − 2) dx = ∫ [(4/3)/(x − 2) + (2/3)/(x + 1)] dx = (4/3) log |x − 2| + (2/3) log |x + 1| + c.
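A decomposition like this can be sanity-checked by evaluating both sides at a few points away from the poles x = 2 and x = −1 (an illustration, not part of the notes; the sample points are arbitrary choices).

```python
# Left and right hand sides of the partial fraction decomposition above.
lhs = lambda x: 2 * x / (x**2 - x - 2)
rhs = lambda x: (4 / 3) / (x - 2) + (2 / 3) / (x + 1)

# Sample points chosen away from the poles at x = 2 and x = -1.
diffs = [abs(lhs(x) - rhs(x)) for x in (-3.0, 0.5, 4.0, 10.0)]
```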
Eg. Calculate ∫ (2x² + 3)/(x³ − 2x² + x) dx.

(2x² + 3)/(x³ − 2x² + x) = (2x² + 3)/(x(x − 1)²) = a/x + b/(x − 1) + c/(x − 1)².

As 2x² + 3 = a(x − 1)² + bx(x − 1) + cx then putting x = 0 gives a = 3, while x = 1 gives 5 = c. Comparing coefficients of x² gives 2 = a + b thus b = −1.

∫ (2x² + 3)/(x³ − 2x² + x) dx = ∫ [3/x − 1/(x − 1) + 5/(x − 1)²] dx = 3 log |x| − log |x − 1| − 5/(x − 1) + C.
Eg. Calculate ∫ (3x⁴ + x³ + 20x² + 3x + 31)/((x + 1)(x² + 4)²) dx.

(3x⁴ + x³ + 20x² + 3x + 31)/((x + 1)(x² + 4)²) = a/(x + 1) + (bx + c)/(x² + 4) + (dx + f)/(x² + 4)².

It is an exercise to show that a = 2, b = 1, c = 0, d = 0, f = −1.

∫ (3x⁴ + x³ + 20x² + 3x + 31)/((x + 1)(x² + 4)²) dx = ∫ [2/(x + 1) + x/(x² + 4) − 1/(x² + 4)²] dx
= 2 log |1 + x| + (1/2) log(x² + 4) − (1/16) tan⁻¹(x/2) − (1/8) x/(x² + 4) + C.
4.9 Integration using a recurrence relation
Eg. Calculate ∫_0^1 x³e^x dx.
Define I_n = ∫_0^1 x^n e^x dx for all integer n ≥ 0.

I_{n+1} = ∫_0^1 x^(n+1) e^x dx = [x^(n+1) e^x]_0^1 − ∫_0^1 (n + 1)x^n e^x dx = e − (n + 1)I_n.

The recurrence relation I_{n+1} = e − (n + 1)I_n can be used to calculate I_n for any positive integer n from the starting value I_0 = ∫_0^1 e^x dx = [e^x]_0^1 = e − 1.
I_1 = e − I_0 = e − (e − 1) = 1, I_2 = e − 2I_1 = e − 2(1) = e − 2,
I_3 = e − 3I_2 = e − 3(e − 2) = 6 − 2e is the required integral.
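The recurrence is easy to run by machine; the sketch below (not part of the notes; the quadrature grid size is an arbitrary choice) computes I_3 from I_0 = e − 1 and compares it both with the closed form 6 − 2e and with a direct midpoint-rule quadrature.

```python
import math

def I(n):
    # I_n = integral_0^1 x^n e^x dx via the recurrence I_{n+1} = e - (n+1) I_n,
    # starting from I_0 = e - 1.
    val = math.e - 1.0
    for k in range(n):
        val = math.e - (k + 1) * val
    return val

def midpoint_integral(f, a, b, n=4000):
    # Midpoint-rule approximation of the definite integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) * h for j in range(n))

i3 = I(3)
direct = midpoint_integral(lambda x: x**3 * math.exp(x), 0.0, 1.0)
```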
Eg. Calculate ∫ tan⁴x dx.
Define F_n(x) = ∫ tan^n x dx for all integer n ≥ 0.

F_{n+2}(x) + F_n(x) = ∫ tan^n x (1 + tan²x) dx = ∫ tan^n x sec²x dx.

Put u = tan x then du = sec²x dx

F_{n+2}(x) + F_n(x) = ∫ u^n du = u^(n+1)/(n + 1) = tan^(n+1) x/(n + 1).

The recurrence relation F_{n+2}(x) = tan^(n+1)(x)/(n + 1) − F_n(x) can be used to calculate the required integral for n even from the starting value F_0(x) = ∫ 1 dx = x and for n odd from the starting value F_1(x) = −log |cos x|.
In particular, F_2(x) = tan x − x and F_4(x) = (1/3)tan³x − tan x + x thus ∫ tan⁴x dx = (1/3)tan³x − tan x + x + c.
4.10 Definite integrals using even and odd functions
The definite integral of an odd function over a symmetric interval is zero.
If f_odd(x) is an integrable odd function on [−a, a] then

∫_{−a}^{a} f_odd(x) dx = 0.
Proof: Splitting the range and substituting u = −x in the integral over [−a, 0],

∫_{−a}^{a} f_odd(x) dx = ∫_{−a}^{0} f_odd(u) du + ∫_{0}^{a} f_odd(x) dx = ∫_{0}^{a} f_odd(−x) dx + ∫_{0}^{a} f_odd(x) dx
= ∫_{0}^{a} (f_odd(−x) + f_odd(x)) dx = 0.

A similar argument shows that if f_even(x) is an integrable even function on [−a, a] then

∫_{−a}^{a} f_even(x) dx = 2 ∫_{0}^{a} f_even(x) dx.
Recall that any function f(x) can be decomposed as a sum of even and odd functions f(x) = f_even(x) + f_odd(x). This can be a useful technique to calculate definite integrals over symmetric intervals as

∫_{−a}^{a} f(x) dx = ∫_{−a}^{a} (f_even(x) + f_odd(x)) dx = 2 ∫_{0}^{a} f_even(x) dx.
Eg. Calculate ∫_{−1}^{1} (e^x x⁴/cosh x) dx.
For f(x) = e^x x⁴/cosh x we have f_even(x) = (1/2)f(x) + (1/2)f(−x) = (1/2) x⁴(e^x + e^(−x))/cosh x = x⁴. Thus

∫_{−1}^{1} (e^x x⁴/cosh x) dx = 2 ∫_{0}^{1} x⁴ dx = 2 [x⁵/5]_0^1 = 2/5.

Eg. Calculate ∫_{−π/4}^{π/4} ((1 + 2x)³ cos x/(1 + 12x²)) dx.
For f(x) = (1 + 2x)³ cos x/(1 + 12x²) we have

f_even(x) = (1/2)f(x) + (1/2)f(−x) = (cos x){(1 + 2x)³ + (1 − 2x)³}/(2(1 + 12x²)) = (cos x)(2 + 24x²)/(2(1 + 12x²)) = cos x.

Thus

∫_{−π/4}^{π/4} ((1 + 2x)³ cos x/(1 + 12x²)) dx = 2 ∫_{0}^{π/4} cos x dx = 2 [sin x]_0^(π/4) = 2/√2 = √2.
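Both examples can be verified numerically; the sketch below (not part of the notes; the quadrature grid size is an arbitrary choice) checks the first one, confirming that the even part of e^x x⁴/cosh x is x⁴ and that the symmetric integral equals 2/5.

```python
import math

def midpoint_integral(f, a, b, n=4000):
    # Midpoint-rule approximation of the definite integral of f over [a, b].
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) * h for j in range(n))

f = lambda x: math.exp(x) * x**4 / math.cosh(x)
f_even = lambda x: 0.5 * (f(x) + f(-x))   # works out to x^4 exactly

full = midpoint_integral(f, -1.0, 1.0)
even_part = 2 * midpoint_integral(f_even, 0.0, 1.0)
```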
5 Differential equations
A differential equation is an equation that relates an unknown function
(the dependent variable) to one or more of its derivatives. The order of a
differential equation is the order of the highest derivative that appears in
the equation. A function that satisfies the differential equation is called a
solution and the process of finding a solution is called solving the differential equation. If the dependent variable depends on only a single unknown
variable (the independent variable) then the differential equation is called an
ordinary differential equation (ODE). We shall deal only with ODE’s.
Differential equations for functions with several independent variables and involving partial derivatives are called partial differential equations (PDE’s).
These are generally much harder to solve and won't be discussed in this course.
We shall usually discuss ODE’s using y to denote the dependent variable and
x for the independent variable, so solving the ODE involves finding y(x).
The generic first order ODE may be written in the form

dy/dx = f(x, y)    (⋆)
Generically this will have a one-parameter family of solutions ie. the general
solution will contain an arbitrary constant. Assigning a particular value to
this constant gives a particular solution. A particular solution will be
singled out by the assignment of an initial value ie. requiring y(x0 ) = y0 ,
for given x0 and y0 . In this case the ODE is called an initial value problem (IVP). To solve an IVP, first solve the ODE to find the general solution
and then determine the value of the constant so that the IVP is also satisfied.
Depending upon the form of the function f (x, y) there are various methods
that can be used to solve certain kinds of ODE’s. We shall discuss these in
turn.
5.1 First order separable ODE's
The ODE (⋆) is separable if f(x, y) = X(x)Y(y).
It can be solved by direct integration as

∫ 1/Y(y) dy = ∫ X(x) dx.
Eg. Solve the IVP dy/dx = x e^(y−x) with y(0) = 0.

y' = x e^(−x) e^y,  ∫ e^(−y) dy = ∫ x e^(−x) dx,  −e^(−y) = −xe^(−x) + ∫ e^(−x) dx = −xe^(−x) − e^(−x) + c,

y = −log(e^(−x)(1 + x) − c).  y(0) = 0 = −log(1 − c), so c = 0.

y = −log(e^(−x)(1 + x)) = x − log(1 + x).
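It is worth checking the solution back in the ODE; the sketch below (not part of the notes; the sample point and step size are arbitrary choices) does this numerically, approximating y' by a central difference.

```python
import math

# Candidate solution of the IVP y' = x e^(y - x), y(0) = 0, found above:
y = lambda x: x - math.log(1 + x)

x0, h = 0.5, 1e-6
lhs = (y(x0 + h) - y(x0 - h)) / (2 * h)   # numerical y'(x0)
rhs = x0 * math.exp(y(x0) - x0)           # right hand side of the ODE at x0
```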
Eg. Solve y' = 2x(1 + y²)/(1 + x²)².

∫ 1/(1 + y²) dy = ∫ 2x/(1 + x²)² dx,  tan⁻¹ y = −1/(1 + x²) − c,  y = −tan(1/(1 + x²) + c).
Eg. Solve the IVP y' = (x²y − y)/(1 + y), y(3) = 1.

∫ (1 + y)/y dy = ∫ (x² − 1) dx,  y + log |y| = x³/3 − x + c.

Put x = 3 and y = 1: 1 = 6 + c, c = −5, hence y + log |y| = x³/3 − x − 5.
In this case the final solution can only be written in implicit form ie. it cannot be written in an explicit form y = ... where the right hand side is a function of x only.
5.2 First order homogeneous ODE's
The ODE (⋆) is homogeneous if f(tx, ty) = f(x, y) ∀ t ∈ R. In this case the substitution y = xv produces a separable ODE for v(x).
Eg. Solve y' = (y² − x²)/(xy).

f(tx, ty) = (t²y² − t²x²)/(tx·ty) = (y² − x²)/(xy) = f(x, y) hence homogeneous.

Put y = xv then y' = v + xv' and the ODE becomes

v + xv' = (x²v² − x²)/(x²v) = v − 1/v,  xv' = −1/v,  ∫ v dv = −∫ (1/x) dx,  v²/2 = −log |x| + log c,

v = ±√(2 log(c/x)),  y = xv = ±x√(2 log(c/x)).

5.3 First order linear ODE's
The ODE (⋆) is linear if f(x, y) = −p(x)y + q(x).
In this case the ODE can be solved in terms of the integrating factor

I(x) = e^(∫ p(x) dx).

The solution is given by

y = (1/I(x)) ∫ I(x)q(x) dx.

Proof: We need to show that y satisfies the ODE y' + py = q.
The first step is to observe that I' = e^(∫ p dx) p = Ip.
With y = (1/I) ∫ Iq dx we have

y' = −(I'/I²) ∫ Iq dx + (1/I)Iq = −(I'/I)y + q = −(Ip/I)y + q = −py + q.
Eg. Solve the IVP y' − (2/x)y = 3x³, y(−1) = 2.
In the above notation p = −2/x and q = 3x³.

I = exp(∫ p dx) = exp(∫ −(2/x) dx) = exp(−2 log x) = 1/x².

y = (1/I) ∫ Iq dx = x² ∫ (1/x²)·3x³ dx = x² ∫ 3x dx = x²((3/2)x² + c) = (3/2)x⁴ + cx².

Using y(−1) = 2 we have 2 = 3/2 + c hence c = 1/2 giving y = (3/2)x⁴ + (1/2)x².
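A residual check of this solution (an illustration, not part of the notes; numerical differentiation at an arbitrary point away from x = 0, where the coefficient 2/x is singular):

```python
# Solution of y' - (2/x) y = 3x^3, y(-1) = 2, found above via the integrating factor.
y = lambda x: 1.5 * x**4 + 0.5 * x**2

x0, h = -2.0, 1e-5
dy = (y(x0 + h) - y(x0 - h)) / (2 * h)          # numerical y'(x0)
residual = dy - (2 / x0) * y(x0) - 3 * x0**3    # should be close to zero
```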
5.4 First order exact ODE's
An alternative form in which to write a first order ODE is

M(x, y) dx + N(x, y) dy = 0.    (⋆⋆)

Rearranging gives

dy/dx = −M(x, y)/N(x, y)

so this agrees with the form (⋆) with f(x, y) = −M(x, y)/N(x, y).
Note that a given f(x, y) does not correspond to a unique choice of M(x, y) and N(x, y) as only their ratio determines the ODE, so there is a freedom to multiply both M and N by the same arbitrary function.
For any function g(x, y) the total differential dg is defined to be

dg = (∂g/∂x) dx + (∂g/∂y) dy.

You may think of dg as the small change in the function g that arises in moving from the point (x, y) to (x + dx, y + dy) where dx and dy are both small. In particular, if dg is identically zero then g is a constant.
The ODE (⋆⋆) is called exact if there exists a function g(x, y) s.t. the left hand side of (⋆⋆) is equal to the total differential dg ie.

M = ∂g/∂x  and  N = ∂g/∂y.

In this case the ODE says that dg = 0 hence g = constant and this yields the solution of the ODE. The equality of the mixed partial derivatives ∂²g/∂y∂x = ∂²g/∂x∂y requires that an exact equation satisfies ∂M/∂y = ∂N/∂x.
This results in the following test for exactness
The ODE M(x, y) dx + N(x, y) dy = 0 is exact iff

∂M/∂y = ∂N/∂x.
Eg. Show that (3e^(3x) y + e^x) dx + (e^(3x) + 1) dy = 0 is an exact equation and hence solve it.
In this example M = 3e^(3x) y + e^x and N = e^(3x) + 1.
∂M/∂y = 3e^(3x) = ∂N/∂x hence this equation is exact.
From M = ∂g/∂x we have ∂g/∂x = 3e^(3x) y + e^x giving g = e^(3x) y + e^x + φ(y)
where the usual constant of integration is replaced by an arbitrary function of integration φ(y) because of the partial differentiation. To determine this function we use ∂g/∂y = N,

∂g/∂y = e^(3x) + 1 = e^(3x) + φ',  so φ' = 1, giving φ = y.

Finally we have that g = e^(3x) y + e^x + y = constant = c.
In this case we can write the solution in explicit form y = (c − e^x)/(e^(3x) + 1).
As an exercise check that this does indeed solve the starting ODE y' = −(3e^(3x) y + e^x)/(e^(3x) + 1). Note that this example is also a linear ODE so could also have been solved using an integrating factor.
Consider an ODE M dx + N dy = 0 that is not exact ie. ∂M/∂y ≠ ∂N/∂x.
By multiplying the ODE by a function I(x, y) we can get an equivalent representation of the ODE as m dx + n dy = 0 where m = M I and n = N I.
If this equation is exact ie. ∂m/∂y = ∂n/∂x then I is called an integrating factor for the original ODE.
Eg. Show that x is an integrating factor for (3xy − y²) dx + x(x − y) dy = 0.
M = 3xy − y² so ∂M/∂y = 3x − 2y, but N = x² − xy so ∂N/∂x = 2x − y.
Therefore ∂M/∂y ≠ ∂N/∂x and the ODE is not exact.
However, with the integrating factor I = x we get m = M I = 3x²y − y²x and n = N I = x³ − x²y so now ∂m/∂y = 3x² − 2yx = ∂n/∂x which is exact.
We can now solve this exact equation using the above method.
∂g/∂x = m = 3x²y − y²x giving g = x³y − (1/2)y²x² + φ(y).
∂g/∂y = n = x³ − x²y = x³ − x²y + φ' so φ' = 0 and we can take φ = 0.
This gives the final solution g = constant = c ie. x³y − (1/2)y²x² = c.
5.5 Bernoulli equations
A Bernoulli equation is a nonlinear ODE of the form

y' + p(x)y = q(x)y^n

where n ≠ 0, 1, otherwise the equation is simply linear.
These ODE's can be solved by first performing the substitution v = y^(1−n), which converts the equation to a linear ODE for v(x).
Eg. Solve y' − 2y/x = −x²y².
This is a Bernoulli equation with n = 2 hence put v = 1/y, which gives v' = −y'/y². Dividing the ODE by y² yields y'/y² − 2/(xy) = −x² which in terms of v is −v' − 2v/x = −x² ie the linear equation v' + (2/x)v = x². We solve this with an integrating factor I = exp(∫ (2/x) dx) = exp(2 log x) = x² as

v = (1/x²) ∫ x²·x² dx = (1/x²)((1/5)x⁵ + c/5).

So finally y = 1/v = 5x²/(x⁵ + c).
5.6 Second order linear constant coefficient ODE's
The general form of a second order linear constant coefficient ODE is

α2 y'' + α1 y' + α0 y = φ(x)    (†)

where α2 ≠ 0, α1, α0 are constants (the constant coefficients) and φ(x) is an arbitrary function of x. The ODE is still second order and linear if these constants are replaced by functions of x, but then the ODE is not so easy to solve.
5.6.1 The homogeneous case
We first restrict to the case φ(x) = 0 ie.

α2 y'' + α1 y' + α0 y = 0    (††)

in which case the second order ODE is called homogeneous.
The first point to note is that if y1 and y2 are any two solutions of the homogeneous ODE (††) then so is any arbitrary linear combination y = Ay1 + By2, where A and B are constants.
Proof: y = Ay1 + By2 so y' = Ay1' + By2' and y'' = Ay1'' + By2''. Hence

α2 y'' + α1 y' + α0 y = α2(Ay1'' + By2'') + α1(Ay1' + By2') + α0(Ay1 + By2)
= A(α2 y1'' + α1 y1' + α0 y1) + B(α2 y2'' + α1 y2' + α0 y2) = 0 + 0 = 0.

Note that the linearity of the ODE is crucial for this result.
The general solution of the ODE (††) contains two arbitrary constants (because the ODE is second order). If y1 and y2 are any two independent particular solutions then the general solution is given by y = Ay1 + By2, with A and B the two arbitrary constants. The task is therefore to find two independent particular solutions.
To find a particular solution, look for one in the form y = e^(λx).
Putting this into (††) yields the characteristic (or auxiliary) equation

α2 λ² + α1 λ + α0 = 0.    (Char)
There are 3 cases to consider, depending upon the type of roots of (Char).
(i). Distinct real roots.
If (Char) has real roots λ1 ≠ λ2 then y1 = e^(λ1 x) and y2 = e^(λ2 x) are the two required particular solutions and the general solution is

y = Ae^(λ1 x) + Be^(λ2 x).

Eg. Solve y'' − y' − 2y = 0. Then (Char) is λ² − λ − 2 = 0 = (λ − 2)(λ + 1) with roots λ = 2, −1. The general solution is y = Ae^(2x) + Be^(−x).
(ii). Repeated real root.
(Char) has a repeated real root if α1² = 4α0 α2 ie. when α0 = α1²/(4α2).
(Char) then reduces to

α2 λ² + α1 λ + α1²/(4α2) = 0 = α2 (λ + α1/(2α2))²  with λ1 = −α1/(2α2) the real double root.

Thus (Char) has produced only one solution y1 = e^(λ1 x).
However, as we now show, y = xe^(λ1 x) is also a solution in this case.
The original ODE (††) now takes the form

α2 y'' + α1 y' + (α1²/(4α2)) y = 0 = α2 (y'' − 2λ1 y' + λ1² y).

With y = xe^(λ1 x) we have y' = e^(λ1 x) + λ1 xe^(λ1 x) and y'' = 2λ1 e^(λ1 x) + λ1² xe^(λ1 x). Hence

y'' − 2λ1 y' + λ1² y = 2λ1 e^(λ1 x) + λ1² xe^(λ1 x) − 2λ1 (e^(λ1 x) + λ1 xe^(λ1 x)) + λ1² xe^(λ1 x) = 0.

y1 = e^(λ1 x) and y2 = xe^(λ1 x) are the two required particular solutions and the general solution is

y = Ae^(λ1 x) + Bxe^(λ1 x).

Eg. Solve 2y'' − 12y' + 18y = 0. Then (Char) is 2λ² − 12λ + 18 = 0 = 2(λ − 3)² with double root λ = 3. The general solution is y = Ae^(3x) + Bxe^(3x).
(iii). Complex roots.
If the roots of (Char) are complex then they come in a complex conjugate pair ie. λ1 = α + iβ and λ2 = α − iβ. We therefore have the two solutions y1 = e^(λ1 x) = e^(αx) e^(iβx) and y2 = e^(λ2 x) = e^(αx) e^(−iβx), but these are not the solutions we are looking for, since they are complex and we want real solutions. However, we have seen that we can take any linear combination of these solutions (even with complex coefficients) and it will still be a solution. In particular the combinations

Y1 = (1/2)(y1 + y2) = (1/2) e^(αx)(e^(iβx) + e^(−iβx)) = e^(αx) cos(βx)
Y2 = (1/2i)(y1 − y2) = (1/2i) e^(αx)(e^(iβx) − e^(−iβx)) = e^(αx) sin(βx)

are both real and provide the two particular solutions we want.
The general solution is an arbitrary real linear combination of Y1 and Y2 ie.

y = Ae^(αx) cos(βx) + Be^(αx) sin(βx) = e^(αx)(A cos(βx) + B sin(βx))

with A and B real. Recall that α and β are both real and are the real and imaginary parts of the complex root of (Char).
Eg. Solve y'' − 6y' + 10y = 0. Then (Char) is λ² − 6λ + 10 = 0 with roots λ = (6 ± √(36 − 40))/2 = (6 ± 2i)/2 = 3 ± i = α ± iβ. Hence α = 3 and β = 1 and the general solution is y = e^(3x)(A cos x + B sin x).
5.6.2 The inhomogeneous case
We now return to the general inhomogeneous ODE

α2 y'' + α1 y' + α0 y = φ(x)    (†)

where φ(x) is a function that is not identically zero.
The general solution to (†) is obtained as a sum of two parts

y = yCF + yPI

where yCF is called the complementary function and is the general solution of the homogeneous version of the ODE ie. (††) which is obtained by removing the φ(x) term. This part of the solution contains the two arbitrary constants and we have already seen how to find this.
yPI is called the particular integral and is any particular solution of (†). We haven't yet seen any methods to find this. However, before we look at a method, let us first show that the combination y = yCF + yPI is indeed a solution of (†). We have that y' = yCF' + yPI' and y'' = yCF'' + yPI'' so put these into the left hand side of (†) to get

α2 y'' + α1 y' + α0 y = α2(yCF'' + yPI'') + α1(yCF' + yPI') + α0(yCF + yPI)
= [α2 yCF'' + α1 yCF' + α0 yCF] + [α2 yPI'' + α1 yPI' + α0 yPI] = 0 + φ(x) = φ(x),

where the first bracket vanishes by (††) and the second equals φ(x) by (†).
To find a particular integral we shall use the method of undetermined coefficients, which can be applied if φ(x) takes certain forms such as polynomials, exponentials or trigonometric functions, including sums and products of these. The idea is to try an appropriate form of the solution that contains some unknown constant coefficients which are then determined by explicit calculation and the requirement that the try yields a solution. The table below lists the kind of terms that can be dealt with if they appear in φ(x) (they can come with any constant coefficients in front) together with the form to try for yPI, containing undetermined constant coefficients ai.

Term in φ(x)   Form to try for yPI
e^(γx)         a1 e^(γx)
x^n            a0 + a1 x + ... + an x^n
cos(γx)        a1 cos(γx) + a2 sin(γx)
sin(γx)        a1 cos(γx) + a2 sin(γx)

The motivation for the above is that the form of the try, when differentiated zero, one or two times, should have a similar functional form to φ(x), as there is then a chance that the left hand side of (†) could equal the right hand side. If φ(x) contains sums/products of the kind of terms above then try sums/products of the listed forms for the try.
Special case rule: If the suggested form of the try for yPI is contained in yCF then we know that this form won't work as the left hand side of (†) will then be identically zero and so can't equal φ(x). In this case first multiply the suggested form of the try by x to get the correct try. In some special cases this rule may need to be applied twice as x times the suggested try may also be contained in yCF.
Eg. Solve y'' − y' − 2y = 7 − 2x².
From earlier we already know that yCF = Ae^(2x) + Be^(−x).
To find yPI we try the form y = a0 + a1 x + a2 x² (to simplify the notation for the calculation we have dropped the PI subscript here).
Now y' = a1 + 2a2 x and y'' = 2a2, so putting these into the ODE gives

2a2 − (a1 + 2a2 x) − 2(a0 + a1 x + a2 x²) = 7 − 2x²
= 2a2 − a1 − 2a0 + x(−2a2 − 2a1) − 2a2 x².

Comparing coefficients of x², x¹, x⁰ gives −2a2 = −2 (a2 = 1), −2a2 − 2a1 = 0 (a1 = −1), 2a2 − a1 − 2a0 = 7 (a0 = −2).
We have found yPI = −2 − x + x² and the general solution is y = yCF + yPI ie.
y = Ae^(2x) + Be^(−x) − 2 − x + x².
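Substituting the particular integral back into the ODE gives a zero residual at every point; a short machine check (an illustration, not part of the notes; the sample points are arbitrary choices):

```python
# Check y_PI = -2 - x + x^2 against y'' - y' - 2y = 7 - 2x^2.
y = lambda x: -2 - x + x**2    # the particular integral found above
dy = lambda x: -1 + 2 * x      # its first derivative
d2y = lambda x: 2.0            # its second derivative

residuals = [d2y(x) - dy(x) - 2 * y(x) - (7 - 2 * x**2)
             for x in (-1.0, 0.0, 0.5, 3.0)]
```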
Eg. Solve y'' + 4y' + 4y = 5e^(3x).
First find yCF by solving y'' + 4y' + 4y = 0.
(Char) is λ² + 4λ + 4 = 0 = (λ + 2)² with λ = −2 repeated, hence yCF = e^(−2x)(A + Bx).
For yPI try y = a1 e^(3x) so y' = 3a1 e^(3x) and y'' = 9a1 e^(3x). Put these into the ODE:
9a1 e^(3x) + 12a1 e^(3x) + 4a1 e^(3x) = 25a1 e^(3x) = 5e^(3x) so a1 = 1/5 giving yPI = (1/5)e^(3x).
The general solution is y = yCF + yPI ie y = e^(−2x)(A + Bx) + (1/5)e^(3x).
Eg. Solve y'' − 2y' + 2y = 10 cos(2x).
First find y_CF by solving y'' − 2y' + 2y = 0.
(Char) is λ^2 − 2λ + 2 = 0 with λ = 1 ± i. Hence y_CF = e^x(A cos x + B sin x).
For y_PI try y = a_1 cos(2x) + a_2 sin(2x), so y' = −2a_1 sin(2x) + 2a_2 cos(2x) and
y'' = −4a_1 cos(2x) − 4a_2 sin(2x). Put these into the ODE:

y'' − 2y' + 2y = −4a_1 cos(2x) − 4a_2 sin(2x) − 2(−2a_1 sin(2x) + 2a_2 cos(2x)) + 2(a_1 cos(2x) + a_2 sin(2x))
= (−2a_1 − 4a_2) cos(2x) + (−2a_2 + 4a_1) sin(2x) = 10 cos(2x).

Comparing coefficients of sin(2x) and cos(2x) gives −2a_2 + 4a_1 = 0 (a_2 = 2a_1),
−2a_1 − 4a_2 = 10, so −10a_1 = 10 (a_1 = −1, a_2 = −2).
Thus y_PI = − cos(2x) − 2 sin(2x).
The general solution is y = e^x(A cos x + B sin x) − cos(2x) − 2 sin(2x).
Eg. Solve y'' − y' − 2y = 6e^{−x}.
From earlier we already know that y_CF = Ae^{2x} + Be^{−x}.
To find y_PI we first think that we should try the form y = a_1 e^{−x}, but note
that this is contained in y_CF so instead we try y = a_1 x e^{−x}. Then
y' = a_1 e^{−x} − a_1 x e^{−x} and y'' = −2a_1 e^{−x} + a_1 x e^{−x}. Put into the ODE:

−2a_1 e^{−x} + a_1 x e^{−x} − (a_1 e^{−x} − a_1 x e^{−x}) − 2a_1 x e^{−x} = −3a_1 e^{−x} = 6e^{−x}

so a_1 = −2 and y_PI = −2x e^{−x}.
The general solution is y = Ae^{2x} + Be^{−x} − 2x e^{−x}.
Eg. Solve y'' − y' − 2y = e^x (8 sin(3x) − 14 cos(3x)).
From earlier we already know that y_CF = Ae^{2x} + Be^{−x}.
For y_PI try y = e^x (a_1 cos(3x) + a_2 sin(3x)).
It is an exercise to show that a_1 = 1, a_2 = −1.
The general solution is then y = Ae^{2x} + Be^{−x} + e^x (cos(3x) − sin(3x)).

Eg. Solve y'' − y' − 2y = −5 + 9x − 2x^3 + 4 cos x + 2 sin x.
From earlier we already know that y_CF = Ae^{2x} + Be^{−x}.
For y_PI try y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 cos x + a_5 sin x.
It is an exercise to show that
a_0 = 1, a_1 = 0, a_2 = −3/2, a_3 = 1, a_4 = −1, a_5 = −1.
The general solution is then y = Ae^{2x} + Be^{−x} + 1 − (3/2)x^2 + x^3 − cos x − sin x.
5.6.3 Initial and boundary value problems
The values of the two arbitrary constants in the general solution of a second
order ODE are fixed by specifying two extra requirements on the solution
and/or its derivative.
An initial value problem (IVP) is when we require y(x_0) = y_0 and y'(x_0) = δ
for given constants x_0, y_0, δ. In this case the two constraints are given at
the same value of the independent variable.
A boundary value problem (BVP) is when we require y(x_0) = y_0 and
y(x_1) = y_1 for given constants x_0 ≠ x_1, y_0, y_1. In this case the two
constraints are given at two different values of the independent variable.
The way to solve an IVP or BVP is to first find the general solution of the
ODE and then determine the values of the two constants in this solution so
that the extra conditions are satisfied.
Eg. Solve the IVP y'' − y' − 2y = 7 − 2x^2, y(0) = 5, y'(0) = 1.
From earlier we already know that the general solution of the ODE is
y(x) = Ae^{2x} + Be^{−x} − 2 − x + x^2.
Therefore y(0) = A + B − 2 = 5, so A + B = 7.
Now y'(x) = 2Ae^{2x} − Be^{−x} − 1 + 2x, giving y'(0) = 2A − B − 1 = 1, so
2A − B = 2.
The solution of these two equations for A and B is A = 3, B = 4.
Therefore the solution of the IVP is y = 3e^{2x} + 4e^{−x} − 2 − x + x^2.
Eg. Solve the BVP 4y'' + y = 0, y(0) = 1, y(π) = 2.
(Char) is 4λ^2 + 1 = 0 with roots λ = ±i/2.
The general solution is y = A cos(x/2) + B sin(x/2).
y(0) = A = 1 and y(π) = B = 2.
Hence the solution of the BVP is y = cos(x/2) + 2 sin(x/2).
5.7 Systems of first order linear ODE's
A system of n coupled first order linear ODE’s for n dependent variables can
be written as a single nth order linear ODE for a single dependent variable
by eliminating the other dependent variables. An associated IVP is obtained
if all the values of the dependent variables are specified at a single value of
the independent variable.
Eg. Solve the pair of first order linear ODE's y' = −y + z, z' = −8y + 5z
for y(x), z(x) by first finding a second order ODE for y(x).
From the first equation z = y' + y, hence z' = y'' + y'.
Substituting these expressions into the second equation gives
y'' + y' = −8y + 5(y' + y), ie. the second order ODE y'' − 4y' + 3y = 0.
We can now solve this ODE using our earlier methods.
(Char) is λ^2 − 4λ + 3 = 0 = (λ − 3)(λ − 1) with roots λ = 1, 3.
y = Ae^x + Be^{3x} and then z = y' + y = 2Ae^x + 4Be^{3x}.
Hence the general solution is y = Ae^x + Be^{3x}, z = 2Ae^x + 4Be^{3x}.
We can extend the problem to an IVP by requiring that y(0) = 1, z(0) = −2.
From the general solution y(0) = A + B = 1 and z(0) = 2A + 4B = −2
with solution A = 3, B = −2.
Hence the solution of the IVP is y = 3e^x − 2e^{3x}, z = 6e^x − 8e^{3x}.
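The IVP solution above can be checked numerically (a Python sketch under the assumption that exact derivatives of the candidate solution are coded by hand; this is not part of the notes' method):

```python
import math

# Sketch: check that y = 3e^x - 2e^{3x}, z = 6e^x - 8e^{3x} satisfies the
# system y' = -y + z, z' = -8y + 5z together with y(0) = 1, z(0) = -2.
def check_ivp_system():
    y = lambda x: 3*math.exp(x) - 2*math.exp(3*x)
    z = lambda x: 6*math.exp(x) - 8*math.exp(3*x)
    yp = lambda x: 3*math.exp(x) - 6*math.exp(3*x)    # exact y'
    zp = lambda x: 6*math.exp(x) - 24*math.exp(3*x)   # exact z'
    if abs(y(0) - 1) > 1e-12 or abs(z(0) + 2) > 1e-12:
        return False
    for i in range(11):
        x = i / 10.0
        if abs(yp(x) - (-y(x) + z(x))) > 1e-9:
            return False
        if abs(zp(x) - (-8*y(x) + 5*z(x))) > 1e-9:
            return False
    return True
```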
6 Taylor series

6.1 Taylor's theorem and Taylor series
Taylor's theorem states that if f(x) has n + 1 continuous derivatives in an
open interval I that contains the point x = a, then ∀ x ∈ I

f(x) = Σ_{k=0}^{n} (f^{(k)}(a)/k!) (x − a)^k + R_n(x)

where R_n(x) = (1/n!) ∫_a^x (x − t)^n f^{(n+1)}(t) dt is called the remainder.
Proof:
Fix x ∈ I, then ∫_a^x f'(t) dt = f(x) − f(a), so

f(x) = f(a) + ∫_a^x f'(t) dt.

If we think of the integrand in the above as f'(t) · 1 then we can perform an
integration by parts where we choose the constant of integration to be −x, so
that ∫ 1 dt = (t − x). This gives

f(x) = f(a) + [f'(t)(t − x)]_a^x − ∫_a^x f''(t)(t − x) dt
     = f(a) + f'(a)(x − a) + ∫_a^x f''(t)(x − t) dt.

Note that this proves the theorem for n = 1. To prove the theorem for n = 2
we perform another integration by parts on this equation,

f(x) = f(a) + f'(a)(x − a) + ∫_a^x f''(t)(x − t) dt
     = f(a) + f'(a)(x − a) − [f''(t)(x − t)^2/2]_a^x + ∫_a^x f'''(t)(x − t)^2/2 dt
     = f(a) + f'(a)(x − a) + (f''(a)/2)(x − a)^2 + ∫_a^x f'''(t)(x − t)^2/2 dt.

This proves the theorem for n = 2. Continuing in this way the theorem follows
after performing an integration by parts a total of n times. More formally,
this can be written in the form of a proof by induction.
The combination P_n(x) = f(x) − R_n(x) is a polynomial in x of degree n called
the nth order Taylor polynomial of f(x) about x = a

P_n(x) = f(a) + f'(a)(x − a) + (f''(a)/2!)(x − a)^2 + ... + (f^{(n)}(a)/n!)(x − a)^n.

If the term 'about x = a' is not included when referring to the nth order
Taylor polynomial of f(x), then by default this is taken to mean the choice
a = 0, as this is the most common case.
The Taylor polynomial P_n(x) is an approximation to the function f(x).
Generically, it is a good approximation if x is close to a and the
approximation improves with increasing order n. The remainder provides an
exact expression for the error in the approximation.
Eg. Calculate the nth order Taylor polynomial of e^x.
f(x) = e^x, f'(x) = e^x, ... , f^{(k)}(x) = e^x so f^{(k)}(0) = 1, giving

P_n(x) = 1 + x + x^2/2! + ... + x^n/n! = Σ_{k=0}^{n} x^k/k!.

Figure 40 shows the graph of e^x and its Taylor polynomials of order 0 to 3.
Figure 40: The function e^x and its Taylor polynomials of order 0 to 3.
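The polynomial above is easy to evaluate term by term. A minimal Python sketch (not from the notes), which builds x^{k+1}/(k+1)! from x^k/k! rather than recomputing factorials:

```python
import math

# Sketch: the nth order Taylor polynomial of e^x about 0,
# P_n(x) = sum_{k=0}^{n} x^k / k!, built up term by term.
def taylor_exp(x, n):
    total, term = 0.0, 1.0
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)   # turn x^k/k! into x^{k+1}/(k+1)!
    return total
```

At x = 1 the order 3 polynomial gives 1 + 1 + 1/2 + 1/6 ≈ 2.667 against e ≈ 2.718, and increasing n closes the gap, matching the behaviour seen in Figure 40.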
If f(x) is infinitely differentiable on an open interval I that contains the point
x = a and in addition for each x ∈ I we have that lim_{n→∞} R_n(x) = 0, then
we say that f(x) can be expanded as a Taylor series about x = a and we
write

f(x) = Σ_{k=0}^{∞} (f^{(k)}(a)/k!) (x − a)^k.
Infinite series and discussions/tests concerning their convergence are covered
in the Analysis I course.
Note that the nth order Taylor polynomial is obtained by taking just the first
(n + 1) terms of the Taylor series (where we count terms even if they happen
to be zero).
Eg. From our earlier calculation we have that the Taylor series of e^x is

e^x = Σ_{k=0}^{∞} x^k/k! = 1 + x + x^2/2! + x^3/3! + ···

where ··· denotes an infinite number of terms with increasing powers of x.
Note that if we set x = 1 in the above Taylor series then we recover the
formula presented earlier for Euler's number

e = Σ_{k=0}^{∞} 1/k! = 1 + 1/1! + 1/2! + 1/3! + ···
Eg. Calculate the Taylor series of sin x.
f(x) = sin x, f(0) = 0, f'(x) = cos x, f'(0) = 1,
f''(x) = − sin x, f''(0) = 0, f'''(x) = − cos x, f'''(0) = −1,
f^{(4)}(x) = sin x, so now the pattern repeats and we see that for all non-negative
integers k we have that f^{(2k)}(0) = 0 and f^{(2k+1)}(0) = (−1)^k.
The Taylor series of sin x is therefore given by

sin x = Σ_{k=0}^{∞} ((−1)^k/(2k + 1)!) x^{2k+1} = x − x^3/3! + x^5/5! − x^7/7! + ···

Observe that only odd powers of x appear in the Taylor series of sin x, as this
is an odd function.
A similar calculation (exercise) yields the Taylor series of cos x

cos x = Σ_{k=0}^{∞} ((−1)^k/(2k)!) x^{2k} = 1 − x^2/2! + x^4/4! − x^6/6! + ···

in which only even powers of x appear, as cos x is an even function.
To calculate the Taylor series of sinh x observe that for all non-negative
integers k we have that f(x) = sinh x satisfies

f^{(k)}(x) = sinh x if k is even, and cosh x if k is odd,

thus f^{(2k)}(0) = 0 and f^{(2k+1)}(0) = 1, giving

sinh x = Σ_{k=0}^{∞} x^{2k+1}/(2k + 1)! = x + x^3/3! + x^5/5! + x^7/7! + ···

A similar calculation (exercise) yields the Taylor series of cosh x

cosh x = Σ_{k=0}^{∞} x^{2k}/(2k)! = 1 + x^2/2! + x^4/4! + x^6/6! + ···

These two results can also be obtained directly from the Taylor series of e^x
by noting that cosh x and sinh x are the even and odd parts of e^x.
As log x is not defined at x = 0 it does not make sense to consider the Taylor
series of log x about x = 0. Instead we consider the Taylor series of log x
about x = 1.
f(x) = log x, f(1) = 0, f'(x) = 1/x, f'(1) = 1,
f''(x) = −1/x^2, f''(1) = −1, f'''(x) = 2/x^3, f'''(1) = 2,
f^{(4)}(x) = −(3·2)/x^4, f^{(4)}(1) = −3·2, and in general for k a positive integer

f^{(k)}(x) = (−1)^{k−1} (k − 1)!/x^k,   f^{(k)}(1) = (−1)^{k−1} (k − 1)!

Thus the Taylor series of log x about x = 1 is

log x = Σ_{k=1}^{∞} ((−1)^{k−1} (k − 1)!/k!) (x − 1)^k = Σ_{k=1}^{∞} ((−1)^{k−1}/k) (x − 1)^k

This result is more often expressed in terms of the variable X = x − 1, when
it becomes log(1 + X) = Σ_{k=1}^{∞} ((−1)^{k−1}/k) X^k.
If we now rename the variable X as x we get

log(1 + x) = Σ_{k=1}^{∞} ((−1)^{k−1}/k) x^k = x − x^2/2 + x^3/3 − x^4/4 + ···

This result is not valid at x = −1, and this is to be expected because the
logarithm is not defined when its argument is zero. A careful examination of
the requirement lim_{n→∞} R_n(x) = 0 reveals that x must satisfy −1 < x ≤ 1
for the above Taylor series to be valid (ie. for the series to converge).
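At the endpoint x = 1 the series converges, slowly, to log 2. A short Python sketch (not from the notes) illustrates this; by the alternating series test the error after N terms is at most the first omitted term, 1/(N + 1):

```python
import math

# Sketch: partial sums of log 2 = 1 - 1/2 + 1/3 - 1/4 + ...,
# i.e. the series for log(1 + x) evaluated at x = 1.
def log2_partial(n_terms):
    return sum((-1.0)**(k - 1) / k for k in range(1, n_terms + 1))
```

So around a thousand terms are needed just for three decimal places, in contrast to the very fast convergence of the e^x series.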
6.2 Lagrange form for the remainder
There is a more convenient expression for the remainder term in Taylor's
theorem. The Lagrange form for the remainder is

R_n(x) = (f^{(n+1)}(c)/(n + 1)!) (x − a)^{n+1},

for some c ∈ (a, x).
To prove this expression for the remainder we will first need to prove the
following lemma:
Lemma
Let h(t) be differentiable n + 1 times on [a, x] with h^{(k)}(a) = 0 for 0 ≤ k ≤ n
and h(x) = 0. Then ∃ c ∈ (a, x) s.t. h^{(n+1)}(c) = 0.
Proof:
h(a) = 0 = h(x) so by Rolle's theorem ∃ c_1 ∈ (a, x) s.t. h'(c_1) = 0.
h'(a) = 0 = h'(c_1) so by Rolle's theorem ∃ c_2 ∈ (a, c_1) s.t. h''(c_2) = 0.
Repeating this argument a total of n times we arrive at
h^{(n)}(a) = 0 = h^{(n)}(c_n) so by Rolle's theorem ∃ c_{n+1} ∈ (a, c_n) s.t. h^{(n+1)}(c_{n+1}) = 0.
This proves the lemma with c = c_{n+1} ∈ (a, x).
Proof of the Lagrange form of the remainder:
Consider the function

h(t) = f(t) − P_n(t) − ((f(x) − P_n(x))/(x − a)^{n+1}) (t − a)^{n+1}.

By construction h(x) = 0.
Also d^k/dt^k (t − a)^{n+1} is zero when evaluated at t = a for 0 ≤ k ≤ n. Furthermore,
by definition of the Taylor polynomial P_n(t) we have that f^{(k)}(a) = P_n^{(k)}(a)
for 0 ≤ k ≤ n. Hence h^{(k)}(a) = 0 for 0 ≤ k ≤ n.
h(t) therefore satisfies the conditions of the lemma and we have that
∃ c ∈ (a, x) s.t. h^{(n+1)}(c) = 0.
As P_n(t) is a polynomial of degree n then P_n^{(n+1)}(t) = 0.
Also, d^{n+1}/dt^{n+1} (t − a)^{n+1} = (n + 1)!, hence

0 = h^{(n+1)}(c) = f^{(n+1)}(c) − (n + 1)! (f(x) − P_n(x))/(x − a)^{n+1}.

Rearranging this expression gives the required result

f(x) = P_n(x) + (f^{(n+1)}(c)/(n + 1)!) (x − a)^{n+1}.
One use of the Lagrange form of the remainder is to provide an upper bound
on the error of a Taylor polynomial approximation to a function.
Suppose that |f^{(n+1)}(t)| ≤ M ∀ t in the closed interval between a and x.
Then |R_n(x)| ≤ M |x − a|^{n+1}/(n + 1)! provides a bound on the error.
Eg. Show that the error in approximating e^x by its 6th order Taylor
polynomial is always less than 0.0006 throughout the interval [0, 1].
In this example f(x) = e^x and n = 6 so we first require an upper bound on
|f^{(7)}(t)| = |e^t| for t ∈ [0, 1]. As e^t is monotonic increasing and positive then
|e^t| ≤ e^1 = e < 3. Thus, in the above notation, we may take M = 3.
As a = 0 we now have that |R_6(x)| < 3|x|^7/7! ≤ 3/7! for x ∈ [0, 1].
Evaluating 3/7! = 1/1680 < 0.0006 and the required result has been shown.
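The bound can be compared with the actual error (a Python sketch, not part of the notes). The measured worst error on [0, 1] is attained at x = 1 and is about 2.3e-4, comfortably below the theoretical bound 3/7! ≈ 5.95e-4:

```python
import math

# Sketch: measure the actual worst error of the 6th order Taylor polynomial
# of e^x on [0, 1] and compare it with the bound 3/7! = 1/1680.
def p6(x):
    return sum(x**k / math.factorial(k) for k in range(7))

def max_error_p6(samples=1000):
    return max(abs(math.exp(i / samples) - p6(i / samples))
               for i in range(samples + 1))
```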
6.3 Calculating limits using Taylor series

Defn: Let n be a positive integer.
We say that f(x) = o(x^n) (as x → 0) if lim_{x→0} f(x)/x^n = 0.
In particular, if α is any non-zero constant then αx^m = o(x^n) iff m > n.
Egs. 3x^4 = o(x^3), 3x^4 = o(x^2), 3x^8 − x^6 = o(x^5), o(x^8) + o(x^5) = o(x^5).
If we know that a Taylor series is valid in a suitable open interval, then it
may be useful in calculating certain limits.
It can be shown that the Taylor series we have already seen for sin x and
cos x are valid for all x ∈ R. Although we have not proved this result you
may assume that it is true and use it in calculating limits. As examples, we
shall now calculate the two important trigonometric limits that we saw earlier.
Eg. Use Taylor series to calculate lim_{x→0} sin x / x.
From the Taylor series of sin x we have that sin x = x − x^3/3! + o(x^4). Hence

lim_{x→0} sin x / x = lim_{x→0} (x − x^3/3! + o(x^4))/x = lim_{x→0} (1 − x^2/3! + o(x^3)) = lim_{x→0} (1 + o(x)) = 1.
Eg. Use Taylor series to calculate lim_{x→0} (1 − cos x)/x.
From the Taylor series of cos x we have that cos x = 1 − x^2/2 + o(x^3). Hence

lim_{x→0} (1 − cos x)/x = lim_{x→0} (1 − 1 + x^2/2 + o(x^3))/x = lim_{x→0} (x/2 + o(x^2)) = 0.
7 Fourier series
In our study of Taylor series and Taylor polynomials we have seen how to
write and approximate functions in terms of polynomials. If the function we
are interested in is periodic, then it is more appropriate to use trigonometric
functions rather than polynomials. This is the topic of Fourier series, which
we shall describe in this chapter. Fourier series are important in a wide range
of areas and applications. In particular, they play a vital role in the solution
of certain partial differential equations. Examples of this application will be
studied in the Dynamics course.
Consider a function f(x) of period 2L which is given on the interval (−L, L).
The functions cos(nπx/L) and sin(nπx/L), with n any positive integer, are also
periodic with period 2L. It therefore seems reasonable to try and write f(x) in
terms of these trigonometric functions. Note that a constant is trivially a
periodic function (for any period) so we can also include a constant term.
We therefore aim to write

f(x) = a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))    (FS)

where a_0, a_n, b_n are constants labelled by the positive integer n, and are called
the Fourier coefficients of f(x). (It is just a convenient notation to call the
constant term a_0/2.) If these Fourier coefficients are such that this series
converges then it is called the Fourier series of f(x).
Eg. The function f(x) = 2 cos^2(πx/L) + 3 sin(πx/L) has period 2L.
Its Fourier series contains only a finite number of terms as

2 cos^2(πx/L) + 3 sin(πx/L) = 1 + cos(2πx/L) + 3 sin(πx/L).

Therefore a_0 = 2, a_2 = 1, b_1 = 3, and all the other Fourier coefficients are zero.
In general an infinite number of Fourier coefficients may be non-zero and we
need a method to determine these from the given function f(x).
7.1 Calculating the Fourier coefficients
In order to derive formulae for the Fourier coefficients we first need to prove
some identities for integrals of trigonometric functions.
In the following let m and n be positive integers.

(i) ∫_{−L}^{L} cos(nπx/L) dx = 0.

(ii) ∫_{−L}^{L} sin(nπx/L) dx = 0.

(iii) ∫_{−L}^{L} sin(mπx/L) cos(nπx/L) dx = 0.

(iv) (1/L) ∫_{−L}^{L} cos(mπx/L) cos(nπx/L) dx = 0 if m ≠ n, and 1 if m = n.

(v) (1/L) ∫_{−L}^{L} sin(mπx/L) sin(nπx/L) dx = 0 if m ≠ n, and 1 if m = n.

The expression that occurs on the right hand side of the last two formulae
arises so often that it is useful to introduce a shorthand notation for this.
The Kronecker delta δ_{mn} is defined to be

δ_{mn} = 0 if m ≠ n, and 1 if m = n.

So, for example, we may write (1/L) ∫_{−L}^{L} sin(mπx/L) sin(nπx/L) dx = δ_{mn}.

It is trivial to prove (i):

∫_{−L}^{L} cos(nπx/L) dx = [(L/(nπ)) sin(nπx/L)]_{−L}^{L} = (2L/(nπ)) sin(nπ) = 0.
(ii) and (iii) are true by inspection, as they are both integrals of odd functions
over a symmetric interval.
We can prove (iv) and (v) by using cos(x ± y) = cos x cos y ∓ sin x sin y.
First assume m ≠ n, then

(1/L) ∫_{−L}^{L} cos(mπx/L) cos(nπx/L) dx
= (1/(2L)) ∫_{−L}^{L} [cos((m − n)πx/L) + cos((m + n)πx/L)] dx
= (1/(2π)) [(1/(m − n)) sin((m − n)πx/L) + (1/(m + n)) sin((m + n)πx/L)]_{−L}^{L} = 0.

Now if m = n,

(1/L) ∫_{−L}^{L} cos(mπx/L) cos(nπx/L) dx = (1/(2L)) ∫_{−L}^{L} [1 + cos(2nπx/L)] dx
= (1/(2L)) [x + (L/(2nπ)) sin(2nπx/L)]_{−L}^{L} = 1.

This proves (iv) and a similar calculation (exercise) proves (v).
We say that the set of functions {1, cos(nπx/L), sin(nπx/L)} form an orthogonal set
on [−L, L] because if φ, ψ are any two different functions from this set then
(1/L) ∫_{−L}^{L} φψ dx = 0.
We now use the identities (i), ..., (v) to derive expressions for the Fourier
coefficients in (FS).
First of all, by integrating (FS) we have

(1/L) ∫_{−L}^{L} f(x) dx = (1/L) ∫_{−L}^{L} [a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))] dx = a_0

using (i) and (ii). Thus we have derived an expression for a_0

a_0 = (1/L) ∫_{−L}^{L} f(x) dx.

Let m be a positive integer and multiply (FS) by cos(mπx/L) and integrate to give

(1/L) ∫_{−L}^{L} cos(mπx/L) f(x) dx
= (1/L) ∫_{−L}^{L} cos(mπx/L) [a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))] dx
= Σ_{n=1}^{∞} a_n δ_{mn} = a_m,

where we have used the identities (i), (iii), (iv).
Thus we have derived an expression for a_n

a_n = (1/L) ∫_{−L}^{L} cos(nπx/L) f(x) dx.
Note that if we extend this relation to n = 0 then it reproduces the correct
expression given above for a_0.
Similarly, multiply (FS) by sin(mπx/L) and integrate to give

(1/L) ∫_{−L}^{L} sin(mπx/L) f(x) dx
= (1/L) ∫_{−L}^{L} sin(mπx/L) [a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))] dx
= Σ_{n=1}^{∞} b_n δ_{mn} = b_m,

where we have used the identities (ii), (iii), (v).
Thus we have derived an expression for b_n

b_n = (1/L) ∫_{−L}^{L} sin(nπx/L) f(x) dx.
Eg. Calculate the Fourier series of the function f(x) = |x| for −1 < x < 1.
It is to be understood here that f(x) is a periodic function with period 2. Its
graph is shown in Figure 41.
Figure 41: The graph of f(x) = |x| for −1 < x < 1, with period 2.
We apply the above formulae with L = 1.

a_0 = ∫_{−1}^{1} |x| dx = 2 ∫_0^1 x dx = [x^2]_0^1 = 1.
For n > 0

a_n = ∫_{−1}^{1} |x| cos(nπx) dx = 2 ∫_0^1 x cos(nπx) dx
= 2 [(x/(nπ)) sin(nπx)]_0^1 − (2/(nπ)) ∫_0^1 sin(nπx) dx = 2 [(1/(n^2π^2)) cos(nπx)]_0^1
= (2/(n^2π^2)) (cos(nπ) − 1) = (2/(n^2π^2)) ((−1)^n − 1).
Furthermore, for n > 0

b_n = ∫_{−1}^{1} |x| sin(nπx) dx = 0

because this is the integral of an odd function over a symmetric interval.
This demonstrates a general point that if f(x) is an even function on the
interval (−L, L) then all b_n = 0 and the Fourier series contains only cosine
terms (plus a constant term). This is called a cosine series.
Putting all this together we have the Fourier (cosine) series

f(x) = 1/2 + Σ_{n=1}^{∞} (2/(n^2π^2)) ((−1)^n − 1) cos(nπx)
= 1/2 − (4/π^2) (cos(πx) + (1/9) cos(3πx) + (1/25) cos(5πx) + ···)

Note that in this example a_{2n} = 0 and a_{2n−1} = −4/(π^2 (2n − 1)^2), so this
Fourier (cosine) series could also be written as

f(x) = 1/2 − (4/π^2) Σ_{n=1}^{∞} cos((2n − 1)πx)/(2n − 1)^2.
To see how the Fourier series approaches the function f(x) define the
partial sum

S_m(x) = a_0/2 + Σ_{n=1}^{m} (a_n cos(nπx/L) + b_n sin(nπx/L)).

In this example, S_1(x) = 1/2 − (4/π^2) cos(πx),
S_3(x) = 1/2 − (4/π^2) (cos(πx) + (1/9) cos(3πx)),
S_5(x) = 1/2 − (4/π^2) (cos(πx) + (1/9) cos(3πx) + (1/25) cos(5πx)), etc.
In Figure 42 we plot the graph of f (x) together with the partial sums
S1 (x), S5 (x), S11 (x).
This helps to demonstrate that, in this case, limm→∞ Sm (x) = f (x).
Figure 42: The graph of f (x) = |x| for −1 < x < 1, together with the partial sums
S1 (x), S5 (x), S11 (x).
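The coefficient formula can also be checked by direct numerical integration (a Python sketch, not from the notes, using a simple midpoint rule):

```python
import math

# Sketch: approximate a_n = integral_{-1}^{1} |x| cos(nπx) dx by a midpoint
# rule and compare with the closed form 2((-1)^n - 1)/(n^2 π^2) found above.
def a_n_numeric(n, steps=20000):
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += abs(x) * math.cos(n * math.pi * x) * h
    return total

def a_n_exact(n):
    return 2 * ((-1)**n - 1) / (n**2 * math.pi**2)
```

The even-n coefficients come out as zero and the odd-n ones as −4/(n²π²), as derived above.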
Eg. Calculate the Fourier series of the function f(x) = x for −π < x < π.
It is to be understood here that f(x) is a periodic function with period 2π.
Its graph is shown in Figure 43. Note that the function is not continuous at
the points x = (2p + 1)π, p ∈ Z, where there are jump discontinuities.
Figure 43: The graph of f(x) = x for −π < x < π, with period 2π.
We apply the above formulae with L = π. For n ≥ 0

a_n = (1/π) ∫_{−π}^{π} x cos(nx) dx = 0

because this is the integral of an odd function over a symmetric interval.
This demonstrates a general point that if f (x) is an odd function on the
interval (−L, L) then all an = 0 and the Fourier series contains only sine
functions. This is called a sine series.
For n > 0

b_n = (1/π) ∫_{−π}^{π} x sin(nx) dx
= (1/π) ([−(x/n) cos(nx)]_{−π}^{π} + (1/n) ∫_{−π}^{π} cos(nx) dx)
= (1/π) (−(2π/n) cos(nπ) + [(1/n^2) sin(nx)]_{−π}^{π})
= −(2/n) cos(nπ) = −(2/n)(−1)^n = (2/n)(−1)^{n+1}.

Putting all this together we have the Fourier (sine) series

f(x) = Σ_{n=1}^{∞} (2/n)(−1)^{n+1} sin(nx) = 2 (sin x − (1/2) sin(2x) + (1/3) sin(3x) − (1/4) sin(4x) + ···)
In Figure 44 we plot the graph of f(x) = x for −π < x < π, together with
the graphs of the partial sums S_1(x), S_6(x), S_50(x).
Figure 44: The graph of f(x) = x for −π < x < π, together with the graphs of the partial
sums S_1(x), S_6(x), S_50(x).
These graphs demonstrate that as more terms of the Fourier series are
included it becomes an increasingly accurate approximation to f(x) inside the
interval x ∈ (−π, π). However, notice what happens at the points x = ±π,
where f(x) is not continuous. At these points the Fourier series converges to
0, which is the midpoint of the jump. Note the oscillations around the point
of discontinuity, where the Fourier series under/overshoots. This is called the
Gibbs phenomenon and the amount of under/overshoot tends to a constant
(in fact about 9%) rather than dying away as more terms are included.
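The Gibbs overshoot can be observed directly (a Python sketch, not from the notes; the grid resolution and the choice m = 100 are illustrative):

```python
import math

# Sketch: partial sums S_m(x) = sum_{n=1}^{m} (2/n)(-1)^{n+1} sin(nx) of the
# sawtooth series. Scanning a fine grid just inside the jump at x = π shows
# an overshoot above f(π⁻) = π that does not die away as m grows: it tends
# to a constant of roughly 9% of the jump 2π (about 0.56).
def S(m, x):
    return sum((2.0 / n) * (-1)**(n + 1) * math.sin(n * x) for n in range(1, m + 1))

def overshoot(m, samples=2000):
    peak = max(S(m, math.pi * i / samples) for i in range(samples + 1))
    return peak - math.pi   # how far the partial sum exceeds the limit π
```

Away from the jump, e.g. at x = 1, the partial sums do converge to f(x) = x, just slowly (like 1/m).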
In general, we would like to know what happens to the partial sum

S_m(x) = a_0/2 + Σ_{n=1}^{m} (a_n cos(nπx/L) + b_n sin(nπx/L))

for all values of x in the limit as m → ∞, and how this is related to f(x).
The following theorem provides an answer.
Dirichlet’s theorem
Let f(x) be a periodic function, with period 2L, such that on the interval
(−L, L) it has a finite number of extreme values, a finite number of jump
discontinuities and |f(x)| is integrable on (−L, L). Then its Fourier series
converges for all values of x. Furthermore, it converges to f(x) at all points
where f(x) is continuous, and if x = a is a jump discontinuity then it
converges to (1/2) lim_{x→a−} f(x) + (1/2) lim_{x→a+} f(x).
7.2 Parseval's theorem
We shall now derive a relation between the average of the square of a function
and its Fourier coefficients.
Parseval's theorem
If f(x) is a function of period 2L with Fourier coefficients a_n, b_n then

(1/(2L)) ∫_{−L}^{L} (f(x))^2 dx = a_0^2/4 + (1/2) Σ_{n=1}^{∞} (a_n^2 + b_n^2).

Proof:

(1/L) ∫_{−L}^{L} (f(x))^2 dx
= (1/L) ∫_{−L}^{L} [a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))] [a_0/2 + Σ_{m=1}^{∞} (a_m cos(mπx/L) + b_m sin(mπx/L))] dx
= a_0^2/2 + Σ_{n=1}^{∞} Σ_{m=1}^{∞} (a_n a_m δ_{mn} + b_n b_m δ_{mn}) = a_0^2/2 + Σ_{n=1}^{∞} (a_n^2 + b_n^2),

using the identities (i)-(v). Dividing both sides by 2 gives the theorem as stated.
Parseval's theorem is useful in several contexts. One application is in finding
the sum of certain infinite series.

Eg. Calculate Σ_{n=1}^{∞} 1/n^2 by applying Parseval's theorem to the Fourier series
of f(x) = x for −π < x < π. From earlier we calculated that the Fourier
coefficients are given by a_n = 0 and b_n = (2/n)(−1)^{n+1}. Hence by Parseval's
theorem

(1/(2π)) ∫_{−π}^{π} x^2 dx = (1/(2π)) [x^3/3]_{−π}^{π} = π^2/3 = (1/2) Σ_{n=1}^{∞} 4/n^2.

Thus Σ_{n=1}^{∞} 1/n^2 = π^2/6.
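A quick numerical confirmation of this sum (a Python sketch, not from the notes; the tail of the sum beyond N terms is smaller than 1/N, by comparison with the integral of 1/x²):

```python
import math

# Sketch: partial sums of the Basel series sum 1/n^2, which Parseval's
# theorem shows equals π²/6.
def basel_partial(n_terms):
    return sum(1.0 / n**2 for n in range(1, n_terms + 1))
```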
Certain infinite sums can also be obtained by evaluating a Fourier series at
a particular value of x.

Eg. Calculate Σ_{n=1}^{∞} 1/(2n − 1)^2 by evaluating the Fourier series of
f(x) = |x| for −1 < x < 1, at the value x = 0.
From earlier we calculated the Fourier (cosine) series to be

f(x) = 1/2 − (4/π^2) Σ_{n=1}^{∞} cos((2n − 1)πx)/(2n − 1)^2.

As f(x) = |x| satisfies the conditions of Dirichlet's theorem and is continuous
for all x then, by Dirichlet's theorem, evaluating the Fourier series at x = 0
gives f(0) = 0. Hence

0 = 1/2 − (4/π^2) Σ_{n=1}^{∞} 1/(2n − 1)^2.   Thus Σ_{n=1}^{∞} 1/(2n − 1)^2 = π^2/8.

7.3 Half range Fourier series

A half range Fourier series is a Fourier series defined on an interval (0, L)
rather than the interval (−L, L).
Given a function f(x), defined on the interval (0, L), we obtain its half range
sine series by calculating the Fourier sine series of its odd extension

f_o(x) = f(x) if 0 < x < L,   −f(−x) if −L < x < 0.

The Fourier coefficients of f_o(x) are a_n = 0 and

b_n = (1/L) ∫_{−L}^{L} f_o(x) sin(nπx/L) dx = (2/L) ∫_0^L f(x) sin(nπx/L) dx.

This gives the half range sine series for f(x) on (0, L) as f(x) = Σ_{n=1}^{∞} b_n sin(nπx/L).

Given a function f(x), defined on the interval (0, L), we obtain its half range
cosine series by calculating the Fourier cosine series of its even extension

f_e(x) = f(x) if 0 < x < L,   f(−x) if −L < x < 0.

The Fourier coefficients of f_e(x) are b_n = 0 and

a_n = (1/L) ∫_{−L}^{L} f_e(x) cos(nπx/L) dx = (2/L) ∫_0^L f(x) cos(nπx/L) dx.

This gives the half range cosine series for f(x) on (0, L) as f(x) = a_0/2 + Σ_{n=1}^{∞} a_n cos(nπx/L).

For a given (physical) problem on an interval (0, L) it is usually the boundary
conditions that determine whether an odd extension and a half range sine
series, or an even extension and a half range cosine series, is more appropriate.
7.4 Complex Fourier series

Fourier series take a much simpler form if written in terms of complex
variables. Starting with the Fourier series

f(x) = a_0/2 + Σ_{n=1}^{∞} (a_n cos(nπx/L) + b_n sin(nπx/L))

we use the relations cos(nπx/L) = (1/2)(e^{inπx/L} + e^{−inπx/L}) and
sin(nπx/L) = −(i/2)(e^{inπx/L} − e^{−inπx/L}) to rewrite this as

f(x) = a_0/2 + (1/2) Σ_{n=1}^{∞} [(a_n − ib_n) e^{inπx/L} + (a_n + ib_n) e^{−inπx/L}] = Σ_{n=−∞}^{∞} c_n e^{inπx/L}

where we have defined

c_0 = a_0/2 = (1/(2L)) ∫_{−L}^{L} f(x) dx,

for n > 0

c_n = (1/2)(a_n − ib_n) = (1/(2L)) ∫_{−L}^{L} f(x) (cos(nπx/L) − i sin(nπx/L)) dx = (1/(2L)) ∫_{−L}^{L} f(x) e^{−inπx/L} dx,

and for n < 0

c_n = (1/2)(a_{−n} + ib_{−n}) = (1/(2L)) ∫_{−L}^{L} f(x) (cos(−nπx/L) + i sin(−nπx/L)) dx = (1/(2L)) ∫_{−L}^{L} f(x) e^{−inπx/L} dx.

Note that all 3 cases (n = 0, n > 0, n < 0) can be written as a single compact
formula

c_n = (1/(2L)) ∫_{−L}^{L} f(x) e^{−inπx/L} dx,   with the Fourier series   f(x) = Σ_{n=−∞}^{∞} c_n e^{inπx/L}.
8 Functions of several variables
In this chapter we shall consider functions of two variables, say f(x, y).
The extension to functions of more than two variables should be obvious.
We have already seen partial derivatives, defined in terms of the limits

∂f/∂x (x, y) = lim_{h→0} (f(x + h, y) − f(x, y))/h,
∂f/∂y (x, y) = lim_{h→0} (f(x, y + h) − f(x, y))/h.

We have also introduced the total differential df as

df = (∂f/∂x) dx + (∂f/∂y) dy.
The gradient of f(x, y) is defined to be the vector

∇f = (∂f/∂x, ∂f/∂y),

where the symbol ∇ is called nabla. ∇f points in the direction of the greatest
rate of increase of the function f(x, y). A point (x, y) at which ∇f vanishes,
that is ∇f = (0, 0), is called a stationary point.
Eg. f = (1/2)(1 + x^2 + y^2).
∇f = (x, y) points along the radial half-line from the origin through the point
(x, y). It is easy to see that moving out along a radial line is the direction of
the greatest rate of increase of f since f = (1/2)(1 + r^2), in terms of polar
coordinates x = r cos θ, y = r sin θ. The only stationary point is (x, y) = (0, 0),
and in fact f(x, y) attains its global minimum at this point since f(0, 0) = 1/2
and clearly f(x, y) ≥ 1/2.
Below we shall see how to extend some of the concepts we have seen for
functions of one variable to the two variable case.
8.1 Taylor's theorem for functions of two variables
Suppose f(x, y) and all its partial derivatives up to and including order n + 1
are continuous throughout an open rectangular region centred at the point
(a, b). Then for all (x, y) in this region

f(x, y) = f(a, b) + (x − a) ∂f/∂x (a, b) + (y − b) ∂f/∂y (a, b)
+ (1/2) [(x − a)^2 ∂^2f/∂x^2 (a, b) + 2(x − a)(y − b) ∂^2f/∂x∂y (a, b) + (y − b)^2 ∂^2f/∂y^2 (a, b)] + ···
+ (1/n!) Σ_{k=0}^{n} C(n, k) (x − a)^k (y − b)^{n−k} ∂^n f/∂x^k ∂y^{n−k} (a, b) + R_n(x, y)

where C(n, k) is the binomial coefficient and R_n(x, y) is the Lagrange form of
the remainder given by

R_n(x, y) = (1/(n + 1)!) Σ_{k=0}^{n+1} C(n + 1, k) (x − a)^k (y − b)^{n+1−k} ∂^{n+1} f/∂x^k ∂y^{n+1−k} (α, β)

with (α, β) a point on the line segment joining (a, b) to (x, y).
If lim_{n→∞} R_n(x, y) = 0 for all (x, y) in this region then the n → ∞ limit of
the above expression is the Taylor series of f(x, y) about (a, b) in this region.
Note that the nth order Taylor expansion is a polynomial in powers of (x − a)
and (y − b) up to order n.
Eg. Taylor expand f(x, y) = e^x log(1 + y) about the origin (0, 0) up to and
including terms of quadratic order and give an expression for the remainder
term.

∂f/∂x = e^x log(1 + y) = 0 at (x, y) = (0, 0).
∂f/∂y = e^x/(1 + y) = 1 at (x, y) = (0, 0).
∂^2f/∂x^2 = e^x log(1 + y) = 0 at (x, y) = (0, 0).
∂^2f/∂y^2 = −e^x/(1 + y)^2 = −1 at (x, y) = (0, 0).
∂^2f/∂x∂y = e^x/(1 + y) = 1 at (x, y) = (0, 0).
∂^3f/∂x^3 = e^x log(1 + y),  ∂^3f/∂y^3 = 2e^x/(1 + y)^3,
∂^3f/∂x^2∂y = e^x/(1 + y),  ∂^3f/∂x∂y^2 = −e^x/(1 + y)^2.

Hence f(x, y) = y + (1/2)(2xy − y^2) + R_2(x, y) with

R_2(x, y) = (1/6) [x^3 e^α log(1 + β) + 3x^2 y e^α/(1 + β) − 3xy^2 e^α/(1 + β)^2 + y^3 2e^α/(1 + β)^3].

Here (α, β) is a point on the line segment joining (0, 0) to (x, y).
8.2 Chain rule
Suppose f(x, y) but in addition x and y are both functions of the independent
variable t. By substituting the t dependence of x(t) and y(t) directly into
f(x(t), y(t)) we obtain a function of t, with a derivative df/dt. We may ask how
this derivative is related to the partial derivatives ∂f/∂x and ∂f/∂y. The answer
is obtained by taking the expression for the total differential df = (∂f/∂x) dx + (∂f/∂y) dy
and dividing by dt to obtain the chain rule

df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt).
Eg. f(x, y) = x^2 y and x = cos t, y = sin t. Obtain df/dt as a function of t by
using the chain rule and check the result by expressing f in terms of t and
differentiating directly.
By the chain rule

df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) = 2xy (dx/dt) + x^2 (dy/dt)
= −2xy sin t + x^2 cos t = −2 cos t sin^2 t + cos^3 t.

By direct substitution f = cos^2 t sin t and df/dt = −2 cos t sin^2 t + cos^3 t.
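The chain rule result can also be cross-checked against a finite-difference derivative (a Python sketch, not from the notes; the step size h is an illustrative choice):

```python
import math

# Sketch: verify df/dt = -2 cos t sin^2 t + cos^3 t for f = x^2 y with
# x = cos t, y = sin t, by comparing against a central finite difference
# of the directly substituted function f(t) = cos^2 t sin t.
def f(t):
    return math.cos(t)**2 * math.sin(t)

def chain_rule_value(t):
    return -2*math.cos(t)*math.sin(t)**2 + math.cos(t)**3

def finite_difference(t, h=1e-6):
    return (f(t + h) - f(t - h)) / (2*h)
```

The two agree to within the O(h²) truncation error of the central difference.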
Suppose f(x, y) but this time x and y are both functions of two variables
u and v, ie. x(u, v) and y(u, v). Then by substituting these expressions into
f(x(u, v), y(u, v)) we obtain f as a function of u and v, and we may ask how
the partial derivatives ∂f/∂u and ∂f/∂v are related to the partial derivatives
∂f/∂x and ∂f/∂y. The answer is again provided by the chain rule, which in this
case reads

∂f/∂u = (∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u),   and   ∂f/∂v = (∂f/∂x)(∂x/∂v) + (∂f/∂y)(∂y/∂v).
∂u and ∂v in
terms of u and v by using the chain rule and check the result by expressing
f directly in terms of u, v and then calculating the partial differentials.
By the chain rule
∂f
∂f ∂x ∂f ∂y
=
+
= 2x cos(x2 − y 2 )(1) − 2y cos(x2 − y 2 )(1)
∂u
∂x ∂u ∂y ∂u
= 2(u + v) cos(4uv) − 2(u − v) cos(4uv) = 4v cos(4uv).
∂f ∂x ∂f ∂y
∂f
=
+
= 2x cos(x2 − y 2 )(1) − 2y cos(x2 − y 2 )(−1)
∂v
∂x ∂v ∂y ∂v
= 2(u + v) cos(4uv) + 2(u − v) cos(4uv) = 4u cos(4uv).
By direct substitution f = sin(4uv) thus
∂f
∂u
= 4v cos(4uv) and
∂f
∂v
= 4u cos(4uv).
In the above examples it turned out to be easier to calculate the required
derivatives by direct substitution. However, there are situations in which
this is not always possible and then the chain rule comes into its own.
8.3 Implicit differentiation
Suppose f (x, y) = x−y and x, y are functions of t given implicitly by x2 +y 2 =
t2 and x sin t = yey . In this case it is not possible to explicitly solve for x(t)
and y(t) so the chain rule needs to be used to obtain an expression for df
dt .
df
∂f dx ∂f dy
dx dy
=
+
=
− .
dt
∂x dt
∂y dt
dt
dt
To obtain the required terms we differentiate the given relations with respect to t to get

2x dx/dt + 2y dy/dt = 2t,

and

(dx/dt) sin t + x cos t = e^y (1 + y) dy/dt.

These two equations can now be solved (exercise) for dx/dt and dy/dt and put into the above chain rule formula to get an expression for df/dt in terms of x, y, t.
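The "exercise" step can be sketched with sympy (assumed available), treating dx/dt and dy/dt as the unknowns of a linear system; the names xd and yd below are stand-ins introduced for illustration:

```python
# Sketch: solve the differentiated relations for dx/dt and dy/dt (assumes sympy).
import sympy as sp

x, y, t = sp.symbols('x y t')
xd, yd = sp.symbols('xd yd')  # stand-ins for dx/dt and dy/dt

eq1 = sp.Eq(2*x*xd + 2*y*yd, 2*t)                              # from x^2 + y^2 = t^2
eq2 = sp.Eq(xd*sp.sin(t) + x*sp.cos(t), sp.exp(y)*(1 + y)*yd)  # from x sin t = y e^y

sol = sp.solve([eq1, eq2], [xd, yd])
df_dt = sp.simplify(sol[xd] - sol[yd])  # df/dt = dx/dt - dy/dt in terms of x, y, t
print(df_dt)
```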
Given an implicit expression for a function, the value of its partial derivatives
can be calculated using the chain rule.
Eg. Suppose the function z(x, y) satisfies the equation 4x²y + z⁵x − 3yz − 2xy = 0. Calculate the value of ∂z/∂x at (x, y, z) = (1, 1, 1).

First of all, it is easy to check that (x, y, z) = (1, 1, 1) indeed satisfies the given equation. Now differentiate the equation with respect to x keeping y fixed to get

8xy + z⁵ + 5xz⁴ ∂z/∂x − 3y ∂z/∂x − 2y = 0.

Setting x = y = z = 1 this becomes 7 + 2 ∂z/∂x = 0, thus at (1, 1, 1) we find that ∂z/∂x = −7/2.
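The same value can be checked with the implicit-function-theorem formula ∂z/∂x = −F_x/F_z, which is equivalent to the differentiation carried out above (a sympy sketch, not part of the original notes):

```python
# Sketch: compute dz/dx at (1, 1, 1) for F(x, y, z) = 0 via dz/dx = -F_x / F_z
# (equivalent to differentiating the relation as above; assumes sympy).
import sympy as sp

x, y, z = sp.symbols('x y z')
F = 4*x**2*y + z**5*x - 3*y*z - 2*x*y

dz_dx = -sp.diff(F, x) / sp.diff(F, z)
val = dz_dx.subs({x: 1, y: 1, z: 1})
print(val)  # -7/2
```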
8.4 Double integrals

8.4.1 Rectangular regions
Recall that for a function f(x) the definite integral ∫_a^b f(x) dx is the signed area under the curve y = f(x) between x = a and x = b. This interpretation of the definite integral can be generalized for a function of two variables to a volume under a surface.
Given a function f(x, y) and a region D in the (x, y) plane, the double integral

∫∫_D f(x, y) dxdy

is the signed volume of the region between the surface z = f(x, y) and the region D in the z = 0 plane (see Figure 45).
In a similar manner to our earlier construction of the definite integral, in
terms of the limit of a Riemann sum of areas of rectangles, we can define the
double integral as the limit of a Riemann sum of volumes of cuboids.
For simplicity, consider the case of a rectangular region D = [a_0, a_1] × [b_0, b_1]. We construct a subdivision S of D by defining the points of a two-dimensional lattice as a_0 = x_0 < x_1 < ... < x_n = a_1 and b_0 = y_0 < y_1 < ... < y_m = b_1. The edge lengths of the lattice are equal to dx_i = x_i − x_{i−1} for i = 1, ..., n and dy_j = y_j − y_{j−1} for j = 1, ..., m. The norm of the subdivision |S| is given by the maximum of all the dx_i and dy_j.
Figure 45: The double integral is the signed volume of the region between the surface
z = f (x, y) and the region D in the z = 0 plane. It is obtained as the limit of a Riemann
sum of cuboid volumes.
We introduce sample points (p_i, q_j) ∈ [x_{i−1}, x_i] × [y_{j−1}, y_j] inside each rectangle of the lattice. The area of the rectangle is dx_i dy_j and we associate to this rectangle a cuboid of height f(p_i, q_j). The volume of all the cuboids is the Riemann sum

R = Σ_{i=1}^n Σ_{j=1}^m f(p_i, q_j) dx_i dy_j

and the double integral is the limit

∫∫_D f(x, y) dxdy = lim_{|S|→0} R.
In the special case that f(x, y) is a constant, say f(x, y) = c, then ∫∫_D f(x, y) dxdy = c × area(D).
For a rectangular region D = [a_0, a_1] × [b_0, b_1] a practical way to compute the double integral is to use the following iterated integral result (valid provided f is continuous on D):

∫∫_D f(x, y) dxdy = ∫_{a_0}^{a_1} ( ∫_{b_0}^{b_1} f(x, y) dy ) dx = ∫_{b_0}^{b_1} ( ∫_{a_0}^{a_1} f(x, y) dx ) dy.
Eg. Given the region D = [−2, 1] × [0, 1], calculate ∫∫_D (x² + y²) dxdy.

∫∫_D (x² + y²) dxdy = ∫_{−2}^{1} ( ∫_0^1 (x² + y²) dy ) dx = ∫_{−2}^{1} [ yx² + y³/3 ]_{y=0}^{y=1} dx

= ∫_{−2}^{1} (x² + 1/3) dx = [ x³/3 + x/3 ]_{−2}^{1} = (1/3)(1 + 1 + 8 + 2) = 4.
As an exercise, check that the same result is obtained if the integration is
performed in the other order.
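The iterated-integral result, and the fact that both orders agree, can be checked with sympy (assumed available; a sketch, not part of the original notes):

```python
# Sketch: evaluate the double integral over D = [-2, 1] x [0, 1] in both orders
# (assumes sympy is installed).
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2

y_first = sp.integrate(sp.integrate(f, (y, 0, 1)), (x, -2, 1))
x_first = sp.integrate(sp.integrate(f, (x, -2, 1)), (y, 0, 1))
print(y_first, x_first)  # both give 4
```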
8.4.2 Beyond rectangular regions
To compute double integrals beyond rectangular regions we need to consider
the following definition.
Defn: A region D of the plane is called y-simple if every line that is parallel to the y-axis and intersects D does so in a single line segment (or a single point if this is on the boundary of D).
Figure 46: A y-simple region.
See Figure 46 for an example of a y-simple region. The boundaries of this
region are given by the curves y = φ1 (x) and y = φ2 (x) for a0 ≤ x ≤ a1 .
The double integral can be calculated as an iterated integral by integrating over y first:

∫∫_D f(x, y) dxdy = ∫_{a_0}^{a_1} ( ∫_{φ_1(x)}^{φ_2(x)} f(x, y) dy ) dx.
Eg. Sketch the region D that lies between the curves y = x and y = x² for 0 ≤ x ≤ 1 and calculate ∫∫_D 6xy dxdy.
Figure 47: A sketch of the region D in the example.
As the region D is y-simple then

∫∫_D 6xy dxdy = ∫_0^1 ( ∫_{x²}^{x} 6xy dy ) dx = ∫_0^1 [ 3xy² ]_{y=x²}^{y=x} dx = 3 ∫_0^1 (x³ − x⁵) dx

= 3 [ x⁴/4 − x⁶/6 ]_0^1 = 3 ( 1/4 − 1/6 ) = 1/4.
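Variable inner limits are handled the same way in sympy (assumed available); a sketch of the example above:

```python
# Sketch: y-simple iterated integral with variable inner limits y in [x^2, x]
# (assumes sympy is installed).
import sympy as sp

x, y = sp.symbols('x y')
result = sp.integrate(sp.integrate(6*x*y, (y, x**2, x)), (x, 0, 1))
print(result)  # 1/4
```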
There is a similar definition of an x-simple region.

Defn: A region D of the plane is called x-simple if every line that is parallel to the x-axis and intersects D does so in a single line segment (or a single point if this is on the boundary of D).
Figure 48 shows an example of an x-simple region that is not y-simple. The
boundaries of this region are given by the curves x = ψ1 (y) and x = ψ2 (y)
for b0 ≤ y ≤ b1 .
Figure 48: A region that is x-simple but is not y-simple.
The double integral can be calculated as an iterated integral by integrating over x first:

∫∫_D f(x, y) dxdy = ∫_{b_0}^{b_1} ( ∫_{ψ_1(y)}^{ψ_2(y)} f(x, y) dx ) dy.
A region can be both x-simple and y-simple, in which case the double integral
can be calculated as an iterated integral by integrating over x first or y first.
The region in the previous example is x-simple, so we can recalculate this
double integral using this property.
Same Eg. D is the region between the curves y = x and y = x² for 0 ≤ x ≤ 1. Use the fact that this region is x-simple to calculate ∫∫_D 6xy dxdy.
The boundaries of D are given by the curves x = y and x = √y for 0 ≤ y ≤ 1. Note from Figure 47 that moving along a line (that passes through D) parallel to the x-axis with x increasing, we intersect the curve x = y before the curve x = √y. Thus y is the lower limit in the x integration and √y is the upper limit.
∫∫_D 6xy dxdy = ∫_0^1 ( ∫_y^{√y} 6xy dx ) dy = ∫_0^1 [ 3x²y ]_{x=y}^{x=√y} dy = 3 ∫_0^1 (y² − y³) dy

= 3 [ y³/3 − y⁴/4 ]_0^1 = 3 ( 1/3 − 1/4 ) = 1/4.
Sometimes a region may be both x-simple and y-simple but we can only perform the integration explicitly for one of the ordering choices, i.e. either integrating over x first or y first.

Eg. The boundaries of D are given by the curves y = x and y = √x for 0 ≤ x ≤ 1. Calculate ∫∫_D (e^y / y) dxdy.
If we use the fact that D is y-simple then we have

∫∫_D (e^y / y) dxdy = ∫_0^1 ( ∫_x^{√x} (e^y / y) dy ) dx,

but we can't do the integration. However, if we use the fact that D is x-simple and the bounding curves are x = y and x = y² then

∫∫_D (e^y / y) dxdy = ∫_0^1 ( ∫_{y²}^{y} (e^y / y) dx ) dy = ∫_0^1 [ x e^y / y ]_{x=y²}^{x=y} dy = ∫_0^1 (e^y − y e^y) dy

= [e^y]_0^1 − [y e^y]_0^1 + ∫_0^1 e^y dy = e − 2.
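The order-dependence is easy to see with sympy (assumed available): the x-first order evaluates cleanly, as in the hand computation above.

```python
# Sketch: the x-simple order makes the e^y / y integral elementary (assumes sympy).
import sympy as sp

y = sp.symbols('y', positive=True)
x = sp.symbols('x')

inner = sp.integrate(sp.exp(y)/y, (x, y**2, y))  # = e^y - y e^y
result = sp.integrate(inner, (y, 0, 1))
print(sp.simplify(result))
```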
8.4.3 Polar coordinates
Sometimes it may be useful to describe the region of integration D in terms
of polar coordinates r and θ, where x = r cos θ and y = r sin θ. To convert
a double integral into polar coordinates we need to know what to do with
the area element dA = dxdy. Recall that this area element came from the
definition of the integral in terms of a division of D into small rectangles.
The area of a small rectangle with side lengths ∆x and ∆y is ∆A = ∆x∆y
which becomes dxdy in the limit that defines the integral.
We therefore need to know the area ∆A of a small region obtained by taking the point with polar coordinates (r, θ) and extending r by ∆r and θ by ∆θ, as shown in Figure 49. We see that the area is approximately a rectangle with area ∆A ≈ r∆θ∆r, and this approximation improves as the area decreases. In the limit that defines the integral the result is dA = dxdy = r dθ dr.
We now know how to convert a double integral to polar coordinates: we replace dxdy with r dθ dr. Note the crucial factor of r here. We can then calculate the double integral as an iterated integral if the region D is θ-simple or r-simple (with the obvious definitions of these terms).
Figure 49: The area element in polar coordinates.
Eg. Let D be the region between the curves x² + y² = 1 and x² + y² = 4 satisfying x ≥ 0 and y ≥ 0. Calculate ∫∫_D xy dxdy.

In polar coordinates the region D is given by 1 ≤ r ≤ 2 and 0 ≤ θ ≤ π/2. The area element is dxdy = r dθ dr and the integrand is xy = r² cos θ sin θ.
∫∫_D xy dxdy = ∫∫_D r² sin θ cos θ · r dr dθ = ∫_1^2 ( ∫_0^{π/2} r³ sin θ cos θ dθ ) dr

= ∫_1^2 ( ∫_0^{π/2} (1/2) sin(2θ) dθ ) r³ dr = ∫_1^2 [ −(1/4) cos(2θ) ]_{θ=0}^{θ=π/2} r³ dr = ∫_1^2 (1/2) r³ dr

= [ r⁴/8 ]_1^2 = 15/8.
As an exercise, you can check that you obtain the same result if you perform
the calculation without changing to polar coordinates.
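The polar computation can be reproduced with sympy (assumed available), remembering to include the Jacobian factor r in the integrand:

```python
# Sketch: the polar-coordinates version of the integral, including the factor r
# from the area element (assumes sympy is installed).
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
integrand = (r*sp.cos(theta)) * (r*sp.sin(theta)) * r  # xy times the Jacobian r

result = sp.integrate(sp.integrate(integrand, (theta, 0, sp.pi/2)), (r, 1, 2))
print(result)  # 15/8
```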
8.4.4 Change of variables and the Jacobian
Using polar coordinates r, θ rather than Cartesian coordinates x, y is a particular example of a change of variables. In general we might want to change variables from x, y to new variables u, v given by relations of the form x = g(u, v) and y = h(u, v) for some functions g and h. Not only do we need to know what the integration region D looks like in the new variables, but we also need the area element dxdy in the new variables. The following definition plays the central role in this issue.
Defn. The Jacobian of the transformation from the variables x, y to u, v is

J = ∂(x, y)/∂(u, v) = | ∂x/∂u  ∂x/∂v |
                      | ∂y/∂u  ∂y/∂v |

i.e. it is the determinant of the 2 × 2 matrix of partial derivatives.
To obtain the area element dxdy in terms of the new variables we first compute the Jacobian J and then use the result that
dxdy = |J|dudv
where |J| is the absolute value of the Jacobian.
As an example, we can rederive the area element in polar coordinates. We
have x = r cos θ and y = r sin θ, so in this case r, θ are the new variables,
playing the role of u, v in the general setting.
J = ∂(x, y)/∂(r, θ) = | ∂x/∂r  ∂x/∂θ | = | cos θ  −r sin θ | = r cos²θ + r sin²θ = r,
                      | ∂y/∂r  ∂y/∂θ |   | sin θ   r cos θ |

giving dxdy = |J| drdθ = r drdθ as before.
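The determinant computation can be checked mechanically (a sympy sketch, assuming the library is available):

```python
# Sketch: the polar-coordinates Jacobian as a 2x2 determinant (assumes sympy).
import sympy as sp

r, theta = sp.symbols('r theta')
x = r*sp.cos(theta)
y = r*sp.sin(theta)

J = sp.Matrix([[sp.diff(x, r), sp.diff(x, theta)],
               [sp.diff(y, r), sp.diff(y, theta)]]).det()
print(sp.simplify(J))  # r
```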
Eg. Let D be the square in the (x, y) plane with vertices (0, 0), (1, 1), (2, 0), (1, −1). Calculate ∫∫_D (x + y) dxdy by making the change of variables x = u + v and y = u − v.
First we need to determine the region D in terms of the variables [u, v].
Using u = (x + y)/2 and v = (x − y)/2 we find that the vertices map to
the [u, v] coordinates [0, 0], [1, 0], [1, 1], [0, 1]. The edges of the square lie along
the lines y = x, y = −x, y = 2 − x, y = x − 2 which map to the lines
v = 0, u = 0, u = 1, v = 1. Therefore in the [u, v] plane D is again a square,
with the vertices given above.
The Jacobian of the transformation is

J = ∂(x, y)/∂(u, v) = | ∂x/∂u  ∂x/∂v | = | 1   1 | = −2.
                      | ∂y/∂u  ∂y/∂v |   | 1  −1 |

Thus dxdy = |J| dudv = 2 dudv.
Finally, the integrand is x + y = 2u. Putting all this together we get

∫∫_D (x + y) dxdy = ∫∫_D 4u dudv = ∫_0^1 ( ∫_0^1 4u dv ) du = ∫_0^1 [ 4uv ]_{v=0}^{v=1} du = ∫_0^1 4u du = [ 2u² ]_0^1 = 2.

8.4.5 The Gaussian integral
An important integral that appears in a wide range of mathematical contexts is the Gaussian integral

∫_{−∞}^{∞} e^{−ax²} dx = √(π/a)

where a is a positive constant.

We can derive this result using a trick that involves calculating a double integral, as follows. We define I to be the required integral

I = ∫_{−∞}^{∞} e^{−ax²} dx
and observe that

I² = ( ∫_{−∞}^{∞} e^{−ax²} dx ) ( ∫_{−∞}^{∞} e^{−ay²} dy ) = ∫∫_{R²} e^{−a(x²+y²)} dxdy.
We now use polar coordinates to evaluate this double integral:

I² = ∫∫_{R²} e^{−ar²} r drdθ = ∫_0^∞ ( ∫_0^{2π} r e^{−ar²} dθ ) dr = 2π ∫_0^∞ r e^{−ar²} dr

= 2π [ −e^{−ar²}/(2a) ]_0^∞ = π/a.

Hence I = √(π/a), which is the required result.
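sympy (assumed available) reproduces the Gaussian integral directly, provided a is declared positive so that the improper integral converges:

```python
# Sketch: the Gaussian integral via sympy, with a > 0 declared so the
# improper integral converges (assumes sympy is installed).
import sympy as sp

x = sp.symbols('x')
a = sp.symbols('a', positive=True)

I = sp.integrate(sp.exp(-a*x**2), (x, -sp.oo, sp.oo))
print(I)  # equals sqrt(pi/a)
```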