Calculus: A Modern Approach

Horst R. Beyer
Louisiana State University (LSU)
Center for Computation and Technology (CCT)
328 Johnston Hall
Baton Rouge, LA 70803, USA
Dedicated to the Holy Spirit
Contents
1 Introduction
  1.1 Short Introduction
  1.2 Background
  1.3 The General Approach of the Text
    1.3.1 Motivational Parts
    1.3.2 Core Theoretical Parts
    1.3.3 Parts Containing Examples and Problems
  1.4 Miscellaneous Aspects of the Approach
  1.5 Requirements of Applications
  1.6 Remarks on the Role of Abstraction in Natural Sciences
2 Calculus I
  2.1 A Sketch of the Development of Rigor in Calculus and Analysis
  2.2 Basics
    2.2.1 Elementary Mathematical Logic
    2.2.2 Sets
    2.2.3 Maps
  2.3 Limits and Continuous Functions
    2.3.1 Limits of Sequences of Real Numbers
    2.3.2 Continuous Functions
  2.4 Differentiation
  2.5 Applications of Differentiation
  2.6 Riemann Integration
3 Calculus II
  3.1 Techniques of Integration
    3.1.1 Change of Variables
    3.1.2 Integration by Parts
    3.1.3 Partial Fractions
    3.1.4 Approximate Numerical Calculation of Integrals
  3.2 Improper Integrals
  3.3 Series of Real Numbers
  3.4 Series of Functions
  3.5 Analytical Geometry and Elementary Vector Calculus
    3.5.1 Metric Spaces
    3.5.2 Vector Spaces
    3.5.3 Conic Sections
    3.5.4 Polar Coordinates
    3.5.5 Quadric Surfaces
    3.5.6 Cylindrical and Spherical Coordinates
    3.5.7 Limits in R^n
    3.5.8 Paths in R^n
4 Calculus III
  4.1 Vector-valued Functions of Several Variables
  4.2 Derivatives of Vector-valued Functions of Several Variables
  4.3 Applications of Differentiation
  4.4 Integration of Functions of Several Variables
  4.5 Vector Calculus
  4.6 Generalizations of the Fundamental Theorem of Calculus
    4.6.1 Green’s Theorem
    4.6.2 Stokes’ Theorem
    4.6.3 Gauss’ Theorem
5 Appendix
  5.1 Construction of the Real Number System
  5.2 Lebesgue’s Criterion for Riemann-integrability
  5.3 Properties of the Determinant
  5.4 The Inverse Mapping Theorem
References
Index of Notation
Index of Terminology
1 Introduction
1.1 Short Introduction
This text is an enlargement of lecture notes written for Calculus I, II and III
courses given at the Department of Mathematics of Louisiana State University in Baton Rouge. It follows syllabi for these courses at LSU. Mainly,
it is devised for teaching standard entry level university calculus courses,
but can also be used for teaching courses in advanced calculus or undergraduate analysis, oriented towards calculations and applications, and also
for self-study. The reasons for devising a text of such a threefold nature are
explained in Section 1.3. This text is also unique in its special attention
to the needs of applications and due to its unusually elaborate motivations
coming from the history of mathematics and applications. As a result, the
text introduces early on basic material that is needed in applied sciences,
in particular from the area of differential equations. Its motivations follow
Otto Toeplitz’ famous ‘genetic’ method, [96].
1.2 Background
Currently, the content coverage and approach in standard calculus texts appear static. Indeed, such courses teach to a large extent views of the 18th
century. On the other hand, the demand for analysis skills of increasing
sophistication and abstraction in applications is still unbroken.
As pointed out in Section 1.6, the need for a higher level of mathematical sophistication in the discipline which is most fundamental for applications, physics, was a byproduct, in particular, of the study of atomic systems. In particular, the mathematics education of physicists needs to go
beyond calculus. A study of functional analysis, especially that of the spectral theorems of self-adjoint linear operators in Hilbert spaces, considerably
enhances the understanding of quantum theory beyond that given in standard quantum mechanics texts. Such knowledge is extremely helpful in the
study of the more advanced quantum theory of fields and, very likely, also
for the formulation and understanding of more advanced unified quantum
field theories that are still to come.
Also in the engineering sciences, the need for higher mathematical sophistication is visible, in particular, in connection with the solution of partial
differential equations (PDEs). PDEs dominate current applications, and functional analysis also provides the basis for their treatment. A good example
of the application of functional analytic methods is the method of finite elements, which is widely used in engineering sciences for the solution of
boundary value problems for elliptic differential equations. Also, questions
about the relation of approximate solutions, provided by numerical methods,
to the solutions of the original PDE gain importance and hence lead into the
area of functional analysis.
The mathematical thinking taught in current standard calculus courses provides no proper basis for more advanced courses in the area of analysis,
in particular, courses in advanced calculus or undergraduate analysis.¹ As
a consequence, the latter do not build on any previous knowledge of calculus, but start completely anew.² Frequently, students from natural sciences
and engineering, who form a major part of classes, do not attend such advanced courses, mostly for reasons of time. As a consequence,
standard calculus courses frequently lead these students into a dead end. In today’s
time, where the speed of development of all parts of society rapidly increases, such a procedure no longer appears appropriate.
Since a major raise of the mathematical level of standard calculus courses
does not appear feasible without losing the bulk of students, the result is
a dilemma. The goal of providing a basic calculus education to a large
mass of students that is at the same time suitable as a basis for more advanced analysis courses and also for increased demands for analysis skills
of higher mathematical sophistication in applications seems unreachable.
Visibly, current standard calculus courses pursue only the first part of this goal.
¹ This is not surprising since precisely that thinking led calculus into a serious crisis in the
beginning of the 19th century. Only after that crisis was overcome was the development of
more advanced mathematical fields possible.
² Of course, this is not very efficient. Also, significantly, students of mathematics often face substantial problems in the first decisive parts of such courses that demand a
considerably higher level of abstraction. Usually, this problem cannot be avoided by
offering honors calculus courses, since most often there are insufficient numbers
of students to fill such courses. Also, the latter are not always taught on a significantly
higher level than standard calculus courses.
1.3 The General Approach of the Text
The text tries to reach the whole goal, instead. As is suitable for calculus
courses, it has a strong orientation towards calculations, but consistently uses mathematical methods of the 20th century, in particular, the basic
concepts of sets and maps, for the development of calculus. It is mainly the
use of these efficient concepts that distinguishes 20th century mathematics
from older mathematics. In addition, special care was taken to include material that is needed early on in applied sciences, in particular from the area
of differential equations. Details on the latter are given in Section 1.5.
As a consequence, the text rests on Chapter 2, Basics, of Calculus I that
introduces the concepts of sets and maps. Due to their inherent simplicity,
these concepts can be understood by the majority of students.
This introduction is preceded by a short subsection on elementary mathematical logic to explain the meaning of the notion of a proof. This takes
into account the experience that a large number of students have difficulties in understanding that meaning. It is also hoped that this subsection
convinces some students, if still necessary, that they are capable of understanding proofs.
Therefore, Chapter 2 should be covered in detail in class. Its thorough study
will provide the student with the basic tools that are essential for the understanding of modern mathematics. A student who has mastered this chapter will
realize in the following that a main step in the solution of a problem is its
reformulation in terms of the ‘language’ provided in Chapter 2. After that,
the solution of a large number of problems is obvious. As a consequence,
he or she will gradually realize that the seemingly ‘challenging’ nature of
many standard calculus problems is due to an inadequate formulation. In
this way, the student will learn to appreciate the power of the provided ‘language’ which will guide him or her through the rest of the course.
Mostly, chapters consist of three parts: an introductory motivational part,
a core theoretical part and a part containing examples and problems.
1.3.1 Motivational Parts
Those parts consider historical mathematical problems or problems from
applications that lead to the development of the mathematics in the theoretical part of the chapter. Such problems often have a certain ‘directness’
which is suitable to catch students’ attention and should help every student
to get an idea ‘why’ certain mathematics was developed and ‘what mathematics is good for’. In the author’s experience, practically all students
have a high interest in such parts and, if these are given, are more inclined to follow
subsequent more theoretical investigations. Also, motivations of this type
are largely missing in standard calculus texts known to the author.
In this, the text follows Otto Toeplitz’ ‘genetic’ method, suggested in 1926
and realized in his ‘Die Entwicklung der Infinitesimalrechnung, Bd. I.’
from 1949 [96]. To the knowledge of the author, the present text is the first
that implements Toeplitz’ method to a large extent and at the same time is
capable of covering a three-semester course in calculus. On the other hand,
unlike Toeplitz, the text does not follow the historical order of the
mathematical development because, from today’s perspective, that development was not very efficient. Also, the formal approach to mathematics,
with Hilbert as its main proponent, made clear that ‘understanding’ in mathematics is ‘structural understanding’. The last is an achievement of the 20th
century. Presenting the material in the historical order would obstruct the
path towards such understanding and be contrary to the intentions of the
text.
Also, wherever possible, motivation is taken from applications. This is suitable, in particular, for students from natural sciences and engineering. This
includes introductions to sections like that on improper integrals, which uses
motivation from the mechanics of periodic motion, where improper integrals occur naturally in the analysis. Also, a large number of examples and
problems consider basic problems related to theoretical mechanics, general
relativity and quantum mechanics. In this, it pays off that the author is a
mathematical physicist who has first-hand research knowledge of these
areas. As a consequence, those problems are realistic.
In cases where prototypical problems seemed unavailable, pure historical
sketches of the development were used for the purpose of motivation. For
instance, such an approach was used in the introduction to the section on set
theory. That introduction points out the fact that the original object of study
of set theory was the concept of the infinite and that initial resistance against
the theory had its roots in ancient Greek philosophical views of the infinite
that were still not completely overcome at the time.
The motivational introductions should be accessible to every student and
be gone through in detail in class.
1.3.2 Core Theoretical Parts
Those parts give a rigorous development of essential parts of the machinery of analysis. Essentially, they are on the level of a standard undergraduate analysis or advanced calculus text, like Lang’s ‘Undergraduate
Analysis’, [63], but proofs are intentionally more detailed and have been
simplified as far as possible. For this purpose, also current mathematical
literature, in particular, the American Mathematical Monthly and the Mathematics Magazine, has been systematically searched. For instance, this led
to the adoption of E. J. McShane’s proof of Lagrange’s multiplier rule [76]
which does not use the implicit mapping theorem. Also simplifications suggested by [4], [25], [26], [32], [33], [40], [44], [61], [90] and [97] have been
used. As a consequence, the text can also be used to teach undergraduate
analysis or advanced calculus courses oriented towards calculations and applications.
In class, the statements of the most important theorems should appear on
the blackboard to teach students to work with these statements, even if the
corresponding proofs are not fully understood or skipped. On the other
hand, for reasons of time, it is to be expected that a number of proofs have
to be omitted or can only be indicated in class.
On the other hand, students from mathematics and also from natural sciences and engineering are advised to go through proofs that have been omitted or only indicated in class in self-study. To facilitate such deeper study,
this text gives students the chance to look up the full proofs without the
necessity for a time-consuming study of a large number of other sources.¹
The last is no easy task for a beginner and, usually, lacks efficiency. For this
reason, the text is also devised for unguided self-study and is very explicit.
In particular, it tries to give also elementary steps in calculations to such an
extent that they become evident. As a consequence, large parts of the text
should not even need paper and pencil.
¹ In particular, such study is complicated by different choices of notation. Of course, the
author would not discourage students from such study if there is sufficient time, but,
generally, a dense undergraduate curriculum should not leave much time for that.
1.3.3 Parts Containing Examples and Problems
The majority of problems and examples are of a type and level occurring in
standard university calculus texts in the US, but consistently reformulated
in modern terms.
The problems are mostly calculational in nature, as is appropriate for calculus courses also suitable for students of applied sciences. According
to experience, the mastery of the study of applied sciences needs, at the
minimum, technical mathematical skills. Sometimes, the opinion is uttered
that the advent of mathematical software tools, like Mathematica, Maple,
and Matlab, has made such skills redundant. In fact, this is not the case since the
use of such software led to the consideration of problems whose complexity would have prevented an attack in the past. For instance, viewed from
the perspective of algebraic manipulation associated to such problems, this
complexity is reflected in the output of such programs. Simplification algorithms cannot possibly know what the user’s intentions are. Hence the
user has to guide the software to a useful answer without knowledge of that
answer. This process needs a lot of mathematical experience and skills. As
a consequence, efficient use of such programs presupposes technical mathematical skills and experience and even a form of structural understanding
of mathematical manipulations. In addition, it is well-known that such programs are not completely free of errors. Particular examples are given on
pages 263 and 292 of the text. Therefore, users need to perform routine
checks of the results of such programs, which also requires mathematical
skills.¹
The examples appear throughout in the form of fully worked problems. As
a consequence, these do not only exemplify the theory, but at the same time
teach problem solving and prepare for exams. This procedure is particularly helpful for beginners. Wherever possible, the results of examples
have been checked with Mathematica 5.1.
Also, a large number of examples and problems consider basic problems
from applications, in particular, from theoretical mechanics, general relativity and quantum mechanics. In this, it pays off that the author is a mathematical physicist who has first-hand research knowledge of these areas.
As a consequence, those problems are realistic.
Every calculus student needs to solve those problems and be able to understand those examples. In particular, in class, the examples should be covered in detail.
¹ Compared to these requirements, the effort for learning the correct syntax of such programs is relatively low.
1.4 Miscellaneous Aspects of the Approach
(i) The text tries to introduce only essential mathematical structures and
terminology and only in places where they are of direct subsequent
use. In particular, mathematical notions are developed only to the
level needed in the sequel of the text, thereby stressing their tool character.
(ii) Material which is used in the text, but whose development would
cause a major disruption of the course, like the proof of Lebesgue’s characterization of Riemann integrability, is deferred to the appendix to
make it accessible to interested students. In addition, the appendix
contains a complete version of Cantor’s construction of the real numbers as equivalence classes of Cauchy sequences of rational numbers.
Today, it is well-known that the whole of analysis and calculus rests
on a construction of the real number system. Therefore, mainly for
students of mathematics, such a construction has been included. The
frequently used introduction of the real number system by a complicated set of axioms, for example, as in [63], has been avoided since
such an introduction should appear implausible, in particular, to such students.
(iii) The basic limit notion of the text is that of limits of sequences. Continuous limits are introduced as a derived concept, but their use is
usually avoided. In particular, the definition of the continuity of functions proceeds by means of the conceptually simpler notion of ‘sequential continuity’, instead of the equivalent classical ε, δ-approach.
Generally, the last approach is often problematic for beginners.
(iv) The text contains 210 diagrams whose role is to assist intuition, but
not to create the illusion of being able to replace any argument inside
a proof. Mistakenly, the last is sometimes assumed by students. For
this reason, it is explained in the introduction of the section on the development of rigor in calculus and analysis why geometric intuition
is no longer regarded as a valid tool in mathematical proofs. Still, good
diagrams can be useful for the formulation of conjectures.
(v) In general, theorems contain their full set of assumptions, so that a
study of their environment is not necessary for their understanding.
For the same reason, occasionally, shorter definitions appear as part
of theorems, and theorems as well as definitions contain also material
that would normally appear only in subsequent remarks.
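The contrast drawn in item (iii) can be made concrete by writing both formulations side by side. The following is only a sketch in standard notation; the symbols $f$, $D$, $x$, $x_n$, $\varepsilon$ and $\delta$ are chosen here for illustration and need not agree with the notation used later in the text.

```latex
% Sequential continuity of f : D -> R at a point x in D
% (illustrative notation, not taken from the text):
\lim_{n \to \infty} f(x_n) = f(x)
\quad \text{for every sequence } x_1, x_2, \dots \text{ in } D
\text{ with } \lim_{n \to \infty} x_n = x .

% Equivalent classical formulation:
\forall\, \varepsilon > 0 \;\; \exists\, \delta > 0 \;\;
\forall\, y \in D : \quad
|y - x| < \delta \;\Longrightarrow\; |f(y) - f(x)| < \varepsilon .
```

The first statement involves only the limit notion for sequences, whereas the second nests three quantifiers, which is one plausible reason why it is harder for beginners.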
1.5 Requirements of Applications
The bulk of material needed early on in applied sciences is from the area of
differential equations. In the case of physics, this has been the case since the advent of Newtonian mechanics in the 17th century. The advent of quantum
theory made it necessary, in particular, to go beyond differential equations
on to abstract evolution equations, see, e.g., [8]. Of course, the treatment
of differential equations cannot be comprehensive in calculus courses, but
a number of important cases can already be treated with methods from calculus. Such cases have been included in this text as examples of calculus
applications and in problem sections. For instance, second order differential equations with constant coefficients are already treated in the section
on applications of differentiation in Calculus I. The uniqueness of the solutions of such an equation can be proved with the help of an energy inequality.
The solutions are found with the help of a simple transformation that eliminates
the first order derivative of the unknown function. A two-parametric family
of solutions of the resulting equation is easily found. Within the sections on
Riemann integration and its applications, separable first order differential
equations are solved with the help of integration. The solutions of the equation of
motion for a simple pendulum are considered in the introduction to the section on improper integrals in Calculus II. Solutions of Bessel’s differential
equation are derived by the method of power series in the section on series of functions. The derivation of solutions of the hypergeometric and the
confluent hypergeometric differential equations is part of the subsequent
problem section. Connected to differential equations are special functions,
in particular, the Gamma and the Beta function. The last are defined and
studied within the section on improper Riemann integrals. That section also
derives well-known values of certain exponential integrals used in quantum
theory and probability theory and a standard integral representation for Riemann’s zeta function.
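The transformation mentioned above for second order equations with constant coefficients can be illustrated as follows. This is a sketch under the assumption that the equation is written as $y'' + a y' + b y = 0$ with real constants $a$ and $b$; the symbols $a$, $b$, $u$, $\omega$, $\kappa$, $c_1$ and $c_2$ are chosen here for illustration, and the treatment in the text itself may differ in detail.

```latex
% Homogeneous equation with constant real coefficients a, b (assumed form):
y''(x) + a\, y'(x) + b\, y(x) = 0 .

% Substitution that removes the first order derivative:
y(x) = e^{-a x / 2}\, u(x) ,
\qquad
y' = e^{-a x / 2} \Bigl( u' - \tfrac{a}{2}\, u \Bigr) ,
\qquad
y'' = e^{-a x / 2} \Bigl( u'' - a\, u' + \tfrac{a^2}{4}\, u \Bigr) .

% Inserting into the equation and dividing by e^{-a x / 2}:
u''(x) + \Bigl( b - \tfrac{a^2}{4} \Bigr) u(x) = 0 .

% Two-parameter families of solutions, with real constants c_1, c_2:
u(x) =
\begin{cases}
c_1 \cos(\omega x) + c_2 \sin(\omega x) , & b - a^2/4 = \omega^2 > 0 , \\
c_1 + c_2\, x , & b - a^2/4 = 0 , \\
c_1 e^{\kappa x} + c_2 e^{-\kappa x} , & b - a^2/4 = -\kappa^2 < 0 .
\end{cases}
```

That these families contain all solutions is exactly what a uniqueness statement, such as the one obtained from an energy inequality, is needed for.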
In addition, in applications the need often arises to integrate discontinuous functions as well as functions over unbounded domains. Usually, those
needs are due to idealizations that make problems accessible to direct analytical calculation. Such ‘model systems’ are still the main source for the
development of an intuitive understanding of natural phenomena.¹ For this
reason, applications need an integration theory which is capable of integrating a large class of functions. Lebesgue’s integration theory is well
suited for this purpose. Still, for reasons of practicability, the text develops Riemann’s integration theory, though close to its limits. In particular, Lebesgue’s characterization of Riemann integrability is given inside the
text, but its proof is deferred to the appendix. For integration of functions
in several variables, we use Serge Lang’s approach to Riemann integration
from [63]. This approach is capable of integrating bounded functions, defined on closed bounded intervals, that are continuous, except at points
of a ‘negligible’ set. Negligible sets can be covered by a finite number of intervals with an associated sum of volumes which can be made smaller than
every preassigned real number > 0. Hence negligible sets are particular
bounded sets of Lebesgue measure zero.
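The notion of a negligible set described above can be written out as follows. This is a sketch in notation chosen here for illustration ($N$, $I_j$, $k$, $\varepsilon$ and $\operatorname{vol}$ are not taken from the text or from [63]); the precise definition used later may differ in detail.

```latex
% N \subset \mathbb{R}^n is negligible if (illustrative notation):
\forall\, \varepsilon > 0 \;\; \exists\, \text{finitely many bounded intervals }
I_1, \dots, I_k \subset \mathbb{R}^n : \quad
N \subset \bigcup_{j=1}^{k} I_j
\quad \text{and} \quad
\sum_{j=1}^{k} \operatorname{vol}(I_j) < \varepsilon .

% Lebesgue measure zero allows countably many intervals instead:
N \subset \bigcup_{j=1}^{\infty} I_j
\quad \text{and} \quad
\sum_{j=1}^{\infty} \operatorname{vol}(I_j) < \varepsilon .
```

For example, every finite subset of $\mathbb{R}^n$ is negligible, whereas the set of rational numbers in $[0,1]$ has Lebesgue measure zero but is not negligible, since finitely many intervals covering it must have total length at least 1.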
¹ The rising importance of numerical investigations has not, and likely cannot, change that.
1.6 Remarks on the Role of Abstraction in Natural Sciences
Examples of the fact that the most fundamental of natural sciences, physics,
always operated on a level of abstraction similar to that of mathematics are
easy to find. A first example comes from Newtonian mechanics whose
development was intertwined with that of calculus. The former theory describes strict point particles, that is, particles without any spatial extension.
Of course, experimentally such point particles have never been observed
and therefore constitute an abstraction that has its roots in ancient Greek
geometry. They have always been regarded as an idealization of a much
more complicated reality. Still, the assumption of Newtonian point particles led to predictions that were in excellent agreement with observations
and measurement until the advent of quantum theory in the first quarter of
the 20th century. Einstein’s theory of special relativity has been the cause
of another abstraction to enter physics, namely the unification of time and
space into a four dimensional space-time. Such unification led to a remarkable simplification of that theory. Since it is the belief of most physicists
that the ‘simplicity’ of a description, that is consistent with the experimental facts and that predicts new phenomena that are subsequently observed,
at least partially, reflects an objective reality, nowadays this unification is
a commonly used abstraction. A further abstraction is due to Einstein’s
theory of general relativity that absorbed the gravitational field into the geometry of the four dimensional space-time. Subsequently, quantum theory
led to the description of matter by elements of abstract Hilbert spaces with
corresponding physical observables being spectral measures of self-adjoint
operators in this space. In the algebraic quantum theory of fields, observables are elements of a von Neumann algebra, and physical states of the
field are positive linear forms on the algebra.
The above indicates that the development of physics towards the understanding of deeper aspects of nature was paralleled by the application of
mathematical methods of increasing sophistication. In order to avoid the
occurrence of errors, the last also necessitated an increasing stress on mathematical rigor in physics. Current physics is as abstract as mathematics
since it studies practically exclusively phenomena that cannot be perceived
by human senses, but only indirectly by help of highly sophisticated experimental equipment. Hence, similar to mathematics, in physics visual
intuition is no longer of much help in the analysis of phenomena. In contrast, the development of physics supports the view that theories based on
direct human perception inevitably contain extrapolations on the nature of
things which ultimately turn out to be seriously flawed. Finally, in current
speculative, i.e., without experimental evidence, physical theories there is
currently nothing else available than mathematical consistency and rigor to
give such theories credibility. Those can only try to ‘replace’ experiment,
temporarily, by mathematical consistency and rigor, although ultimately
only the outcome of experiments decides on the ‘truth’ of a physical theory.
Viewed from this perspective, it is quite obvious that calculus courses need
to go into the direction of increased mathematical sophistication in order to
narrow a widening gap to contemporary applications. In this connection,
it needs to be remembered that after the advent of quantum theory, it has
been recognized that the laws of quantum theory provide also the basis for
the laws of chemistry. Therefore, it is to be expected that the other natural
sciences and the engineering sciences follow the development of physics towards the use of more subtle mathematical methods. Such a trend is already
obvious.
Acknowledgments
I am indebted to Kostas Kokkotas, Tübingen, for suggesting the inclusion
of a number of valuable examples in the text.
2 Calculus I
2.1 A Sketch of the Development of Rigor in Calculus and Analysis
It is evident that a science that leads to contradictory statements loses its
value. Therefore, the occurrence of such an event sends a shock wave
through the scientific community. The immediate response is an analysis of the validity of the reasoning that leads to the contradiction. In case
that reasoning appears to be ‘valid’, i.e., if the contradiction can be derived
by generally accepted rules of inference (‘logic’) from assumptions that
are generally believed to be true (‘axioms’), the field is in a crisis because
those assumptions and/or rules need to be revised until the contradiction is
resolved. If this succeeds, it has to be determined whether all previously
obtained results of the science are derivable from the revised basis. Potentially, a large number of results could be lost in this way.
Probably the first example of a serious crisis in mathematics is the discovery in ancient Greece around 450 B.C. that the length, $\sqrt{2}$, of a diagonal
of a square with sides of length 1 is no rational number, a fact that will be
proved in Example 2.2.15 below. Tradition attributes this discovery to a
member of the Pythagorean school of thought. The fundamental assumption of that school was that the essence of everything is expressible in terms
of whole numbers and their ratios, i.e., of quantities which are discrete in
character. As a consequence of the discovery, that line of thought lost its
basis. As a result, Plato’s school of thought completely reorganized the
mathematical knowledge of the time by giving it an exclusively geometric
basis. In this, the product of two lengths is not another length, but an area,
for instance, that of a rectangle. Hence the equation
$$ x^2 = 2 $$
can be solved geometrically, for instance, by constructing a square with
edge $x$ whose area is equal to the area of a rectangle with sides 2 and 1.
As a consequence, algebraic equations were solved in terms of geometric
quantities. On the other hand, viewed from today’s perspective, that approach bypassed the problem of irrational quantities, rather than solving it,
and can be seen as a prime reason for a major delay of the development of
mathematical calculus / analysis. The last was developed as late as in the
17th century in Western Europe.
The crisis gave important reasons for the development of the axiomatic
method in mathematics in ancient Greece, i.e., proof by deduction from
explicitly stated postulates. Without doubt, this method is the single most
important contribution of ancient Greece to mathematics which is the basis
of mathematics until today. In style, modern mathematics texts, including
the present text, mirror that of the epoch-making thirteen books of Euclid’s
Elements written around 300 B.C. [37]. Previous Egyptian and Babylonian
mathematics made no distinction between exact and approximate results
nor were there indications of logical proofs or derivations. On the other
hand, the Egyptians and Babylonians had already quite accurate approximations for π and square roots that were needed in land survey. For instance, the Egyptians
determined the value of π within an error of $2 \cdot 10^{-2}$
and the value of $\sqrt{2}$ within an error of $10^{-4}$. The Babylonians were already
familiar with the so called Pythagorean theorem and determined the value
of π within an error of $10^{-7}$ and the value of $\sqrt{2}$ within an error of $10^{-6}$.
In order to be considered as properly established in ancient Greece, a theorem had to be given a geometric meaning. This tradition continued in the
Middle Ages and the Renaissance in the West. The geometric intuition was
more trusted than insight into the nature of numbers. In the early phases of
the development of calculus / analysis in the 17th and 18th centuries, and also
in the views of its founding fathers Isaac Newton and Gottfried Wilhelm
Leibniz, geometric intuition was of major importance, but it was subsequently
replaced, step by step, by arithmetic.
A major factor in this process was the construction of non-Euclidean geometries by Nikolai Lobachevsky (1829) [72], János Bolyai (1831) [11]
and earlier, but unpublished, by Gauss. In his ‘Elements’, Euclid bases
geometry on five postulates that are assumed to be valid. Generally, only
the first four of them were considered geometrically intuitive, whereas the
fifth, the so-called parallel postulate, was expected to be a consequence of
the other postulates. For about 2000 years, an enormous effort went into
the investigation of this question. The construction of non-Euclidean geometries which satisfy the first four, but not the fifth, of Euclid’s postulates
proved the independence of the parallel postulate from the other postulates.
This result stripped Euclidean geometry of the central role it had held for
about 2000 years.
The final removal of geometric intuition as a means of mathematical proof
was caused by a number of geometrically non-intuitive results of calculus
/ analysis, in particular, the demonstration of the existence of a continuous
nowhere differentiable function by Karl Weierstrass in 1872 [99], see Example 3.4.13, and the construction of a plane-filling continuous curve by
Giuseppe Peano in 1890 [84], see Example 3.4.14. Weierstrass conceived
and in large part carried out a program known as the arithmetization of
analysis, under which analysis is based on a rigorous development of the
real number system. This remains the common approach today. For this
reason, Weierstrass is often considered the father of modern analysis. A
common rigorous development of the real number system by use of Cauchy
sequences is given in Appendix 5.1.
Today, reference to geometric intuition is not considered a valid argument
in the proof of a theorem. Of course, such intuition might give hints on how
to carry out such a proof, but the means of the proof itself are purely formal. This situation is similar to that of blindfold chess, i.e., the playing
of a game of chess without seeing the board. That formal approach has
been suggested by David Hilbert for the foundation of mathematics and
has become the standard of most working mathematicians. It culminated
in the collective works of a group of mathematicians publishing under the
pseudonym ‘Bourbaki’. The series comprises 40 monographs that became
a standard reference on the fundamental aspects of modern mathematics.
2.2 Basics
2.2.1 Elementary Mathematical Logic
In the 17th century Leibniz suggested the construction of a universal language for the whole of mathematics that allows the formalization of proofs.
In 1671, he constructed a mechanical calculator, the step reckoner, that was
capable of performing multiplication, division and the calculation of square
roots. Also in view of his involvement in the construction of other mechanical devices, like pumps, hydraulic presses, windmills, lamps, submarines,
clocks, it is likely that he envisioned machines that ultimately could perform proofs. The first scientific work on algebraization of Aristotelian logic
appeared in 1847 [10], 1858 [81] by George Boole and Augustus De Morgan, respectively. The formation of mathematical logic as an independent
mathematical discipline is linked with Hilbert’s program mentioned in Section 2.1 on formal axiomatic systems that resulted from the recognition of
the unreliability of geometrical intuition. That program called for a formalization of all of mathematics in axiomatic form, together with a proof that
it is free from contradictions, i.e., that it is what is called ‘consistent’. The
consistency proof itself was to be carried out using only what Hilbert called
‘finitary’ methods. In the sequel, neither Leibniz’s nor Hilbert’s vision has
been achieved.
However, what has been achieved is sufficient for most working mathematicians today. In the following, we present only the very basics of symbolic
logic and display some basic types of methods of proof in simple cases. Despite its brevity, this chapter is very important because the given logical
rules for correct mathematical reasoning will be in constant use throughout
the book (as well as throughout the whole of mathematics) without explicit
mention. Therefore, its careful study is advised to the reader. The reader
should also fill in additional steps in proofs whenever he/she feels
the need for this. This should become a routine operation also for
the rest of the book. In the experience of the author, this is necessary to
fathom the material.
Definition 2.2.1. (Statements) A statement (or proposition) is an assertion
that can be determined as true or false.
Often abstract letters like A, B, C, . . . are used to represent statements.
Example 2.2.2. The following are statements:
(i) The president George Washington was the first president of the United
States ,
(ii) 2 + 2 = 27 ,
(iii) There are no positive integers a, b, c and n with $n > 2$ such that
$a^n + b^n = c^n$. (Fermat’s conjecture)
The following are not statements:
(iv) Which way to the Union Station? ,
(v) Go jump into the lake!
Definition 2.2.3. (Truth values) The truth value of a statement is denoted
by ‘T’ if it is true and by ‘F’ if it is false.
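Example 2.2.4. For example, the statement
$$9 + 16 = 25 \qquad (2.2.1)$$
is true and therefore has truth value ‘T’, whereas the statement
$$9 + 16 = 26$$
is false and therefore has truth value ‘F’. Also, the statement Example 2.2.2 (i)
is true and the statement Example 2.2.2 (ii) is false. The statement Example 2.2.2 (iii)
is also true, but this remained unknown for more than 350 years, until a proof by
Andrew Wiles was published in 1995. In particular, an assertion is a statement even
when its truth value is not yet known.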
Definition 2.2.5. (Connectives) Connectives like ‘and’, ‘or’, ‘not’, . . .
stand for operations on statements.
Connective                       Symbol               Name
‘not’                            $\neg$               Negation
‘and’                            $\wedge$             Conjunction
‘or’                             $\vee$               Disjunction
‘if . . . then’                  $\Rightarrow$        Conditional
‘. . . if and only if . . . ’    $\Leftrightarrow$    Bi-conditional
Example 2.2.6. For example, the statement
‘It is not the case that $9 + 16 = 25$’
is the negation of (2.2.1). It can be stated more simply
as
$$9 + 16 \neq 25 .$$
Other examples are compounds like the following
Example 2.2.7.
(i) Tigers are cats and alligators are reptiles ,
(ii) Tigers are cats or (tigers are) reptiles ,
(iii) If some tigers are cats, and some cats are black, then some tigers are
black ,
(iv) $9 + 16 = 25$ if and only if $8 + 15 = 23$ .
Definition 2.2.8. (Truth tables) A truth table is a pictorial representation
of all possible outcomes of the truth value of a compound sentence. The
connectives are defined by the following truth tables for all statements A
and B.
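$$\begin{array}{cc|ccccc}
A & B & \neg A & A \wedge B & A \vee B & A \Rightarrow B & A \Leftrightarrow B \\ \hline
T & T & F & T & T & T & T \\
T & F & F & F & T & F & F \\
F & T & T & F & T & T & F \\
F & F & T & F & F & T & T
\end{array} .$$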
Note that the compound $A \vee B$ is true if at least one of the statements A
and B is true. This is different from the normal usage of ‘or’ in English.
It can be described as ‘and/or’. Therefore, the statement 2.2.7 (ii) is true.
Also, the statements 2.2.7 (i) and 2.2.7 (iv) are true.
Also, note that from a true statement A there cannot follow a false statement B, i.e., in that case the truth value of $A \Rightarrow B$ is false. This can be
used to identify invalid arguments and also provides the logical basis for so-called
indirect proofs.
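The defining truth tables of the connectives can also be generated mechanically. The following is a minimal sketch in Python, which is not part of the original text; the function names `NOT`, `AND`, `OR`, `IMPLIES`, `IFF`, `tv` and `connective_table` are illustrative choices, and truth values are modeled by Python booleans.

```python
from itertools import product

# Each connective as a function on Python booleans (names are illustrative).
def NOT(a):
    return not a

def AND(a, b):
    return a and b

def OR(a, b):
    return a or b              # inclusive 'or', i.e. 'and/or'

def IMPLIES(a, b):
    return (not a) or b        # false only if a is true and b is false

def IFF(a, b):
    return a == b

def tv(value):
    """Render a boolean as the truth value 'T' or 'F'."""
    return "T" if value else "F"

def connective_table():
    """Rows (A, B, not A, A and B, A or B, A => B, A <=> B) for TT, TF, FT, FF."""
    rows = []
    for a, b in product([True, False], repeat=2):
        values = (a, b, NOT(a), AND(a, b), OR(a, b), IMPLIES(a, b), IFF(a, b))
        rows.append(tuple(tv(v) for v in values))
    return rows

for row in connective_table():
    print(" ".join(row))
```

Each printed row lists the truth values of A, B and of the five compounds in the order of Definition 2.2.8.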
Note that valid rules of inference do not only come from logic, but also
from the field (Arithmetic, Number Theory, Set Theory, ...) with which the statement
is associated. For instance, the equivalence 2.2.7 (iv) is concluded by
arithmetic rules, not by logic. Such rules could turn out to be inconsistent
with logic in that they allow one to conclude a false statement from a true statement. Such rules would have to be abandoned. An example of this is given
by the statement 2.2.7 (iii). Although the first two statements are true, the
whole statement is false because there are no black tigers. In the following,
the occurrence of such a contradiction is indicated by the symbol . Note
that the rule of inference in 2.2.7 (iii) is false even if there were black tigers.
Example 2.2.9. (Inconsistent rules) Assume that the real numbers are part
of a larger collection of ‘ideal numbers’ for which there is a multiplication
which reduces to the usual multiplication if the factors are real. Further,
assume that for every ideal number z there is a square root $\sqrt{z}$, i.e., such
that
$$\left(\sqrt{z}\right)^2 = z ,$$
which is identical to the positive square root if z is real and positive. Finally,
assume that for all ideal numbers $z_1$, $z_2$, it holds that
$$\sqrt{z_1 z_2} = \sqrt{z_1} \, \sqrt{z_2} .$$
Note that the last rule is correct if $z_1$ and $z_2$ are both real and positive. Then
we arrive at the following contradiction:
$$-1 = \left(\sqrt{-1}\right)^2 = \sqrt{-1} \, \sqrt{-1} = \sqrt{(-1)(-1)} = \sqrt{1} = 1 .$$
Hence an extension of the real numbers with all these properties does not
exist.
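As a concrete illustration, one may take the complex numbers with the principal square root as a candidate for such ‘ideal numbers’. This choice is an assumption made here for illustration and is not part of the example. The sketch below, using Python's standard `cmath` module, shows that the product rule for square roots then fails for $z_1 = z_2 = -1$, in line with the conclusion of the example.

```python
import cmath

# Assumption: complex numbers with the principal square root play the role
# of the 'ideal numbers'; z1 and z2 are the values used in the contradiction.
z1 = z2 = -1

lhs = cmath.sqrt(z1 * z2)               # principal square root of 1
rhs = cmath.sqrt(z1) * cmath.sqrt(z2)   # i * i

print(lhs, rhs)
product_rule_holds = cmath.isclose(lhs, rhs)
print(product_rule_holds)
```

The two sides differ, so at least one of the assumed properties has to be given up; for the complex numbers it is the product rule.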
A simple example for an indirect proof is the following.
Example 2.2.10. (Indirect proof) Prove that there are no integers m and
n such that
$$2m + 4n = 45 . \qquad (2.2.2)$$
Proof. The proof is indirect. Assume the opposite, i.e., that there are integers m and n such that (2.2.2) is true. Then the left hand side of the
equation, $2m + 4n = 2(m + 2n)$, is divisible by 2 without remainder, whereas the right hand side is not.
This is a contradiction. Hence the opposite of the assumption is true. This is what we wanted to
prove.
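A finite search cannot replace the proof above, but it can serve as a sanity check. The following sketch tests all integer pairs in a bounded range; the function name `has_solution` and the bound are hypothetical choices made for this illustration.

```python
def has_solution(bound):
    """Search m, n in [-bound, bound] for a solution of 2m + 4n = 45."""
    return any(
        2 * m + 4 * n == 45
        for m in range(-bound, bound + 1)
        for n in range(-bound, bound + 1)
    )

def left_side_always_even(bound):
    """Check that 2m + 4n is even for all m, n in [-bound, bound]."""
    return all(
        (2 * m + 4 * n) % 2 == 0
        for m in range(-bound, bound + 1)
        for n in range(-bound, bound + 1)
    )

print(has_solution(50), left_side_always_even(50))
```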
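Example 2.2.11. Calculate the truth table of the statements
$$[(A \Rightarrow B) \wedge (B \Rightarrow C)] \Rightarrow (A \Rightarrow C) \quad \text{(Transitivity)} ,$$
$$[(A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C)] \Rightarrow C \quad \text{(Proof by cases)} ,$$
$$(\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B) \quad \text{(Contraposition)} . \qquad (2.2.3)$$
Solution:
$$\begin{array}{ccc|ccccc}
A & B & C & A \Rightarrow B & B \Rightarrow C & (A \Rightarrow B) \wedge (B \Rightarrow C) & A \Rightarrow C & [(A \Rightarrow B) \wedge (B \Rightarrow C)] \Rightarrow (A \Rightarrow C) \\ \hline
T & T & T & T & T & T & T & T \\
T & T & F & T & F & F & F & T \\
T & F & T & F & T & F & T & T \\
T & F & F & F & T & F & F & T \\
F & T & T & T & T & T & T & T \\
F & T & F & T & F & F & T & T \\
F & F & T & T & T & T & T & T \\
F & F & F & T & T & T & T & T
\end{array}$$
$$\begin{array}{ccc|cccccc}
A & B & C & A \vee B & A \Rightarrow C & B \Rightarrow C & (A \Rightarrow C) \wedge (B \Rightarrow C) & (A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C) & [(A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C)] \Rightarrow C \\ \hline
T & T & T & T & T & T & T & T & T \\
T & T & F & T & F & F & F & F & T \\
T & F & T & T & T & T & T & T & T \\
T & F & F & T & F & T & F & F & T \\
F & T & T & T & T & T & T & T & T \\
F & T & F & T & T & F & F & F & T \\
F & F & T & F & T & T & T & F & T \\
F & F & F & F & T & T & T & F & T
\end{array}$$
$$\begin{array}{cccc|ccc}
A & \neg A & B & \neg B & \neg B \Rightarrow \neg A & A \Rightarrow B & (\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B) \\ \hline
T & F & T & F & T & T & T \\
T & F & F & T & F & F & T \\
F & T & T & F & T & T & T \\
F & T & F & T & T & T & T
\end{array}$$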
The members of (2.2.3) are so-called tautologies, i.e., statements that are
true independent of the truth values of their variables. At the same time they
are frequently used rules of inference in mathematics, i.e., for all statements
A, B and C, the truth of the right hand side of each relation can be concluded
from the truth of the corresponding left hand side (in large brackets).
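Whether a compound statement is a tautology can be checked by evaluating it for all combinations of truth values, which is exactly what a truth table does. The following is a minimal sketch; the helper names `implies` and `is_tautology` and the lambda names are illustrative and not taken from the text.

```python
from itertools import product

def implies(a, b):
    """Truth value of 'a => b'."""
    return (not a) or b

def is_tautology(formula, arity):
    """True if formula is true for every assignment of truth values."""
    return all(formula(*values) for values in product([True, False], repeat=arity))

# The three rules of inference, written as Python functions.
transitivity = lambda a, b, c: implies(implies(a, b) and implies(b, c), implies(a, c))
proof_by_cases = lambda a, b, c: implies((a or b) and implies(a, c) and implies(b, c), c)
contraposition = lambda a, b: implies(not b, not a) == implies(a, b)

# A non-example: 'A => B' is not equivalent to its converse 'B => A'.
converse = lambda a, b: implies(a, b) == implies(b, a)

print(is_tautology(transitivity, 3), is_tautology(proof_by_cases, 3),
      is_tautology(contraposition, 2), is_tautology(converse, 2))
```

The last formula is included only as a contrast: it fails for A true and B false, so it is not a valid rule of inference.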
Example 2.2.12. (Transitivity) Consider the statements
(i) If Mike is a tiger, then he is a cat,
(ii) If Mike is a cat, then he is a mammal,
(iii) If Mike is a tiger, then he is a mammal.
Statements (i), (ii) are both true. Hence the truth of (iii) follows by the transitivity of $\Rightarrow$
(and since ‘Mike’, the tiger of LSU, is indeed a tiger,
he is also a mammal).
Example 2.2.13. (Proof by cases) Prove that
$$n + |n - 1| \geq 1 \qquad (2.2.4)$$
for all integers n.
Proof. For this, let n be some integer. We consider the cases $n \leq 1$ and
$n \geq 1$. If n is an integer such that $n \leq 1$, then $n - 1 \leq 0$ and therefore
$$n + |n - 1| = n + 1 - n = 1 \geq 1 .$$
If n is an integer such that $n \geq 1$, then $n - 1 \geq 0$ and therefore
$$n + |n - 1| = n + n - 1 = 2n - 1 \geq 2 - 1 = 1 .$$
Hence in both cases (2.2.4) is true. The statement follows since any integer
is $\leq 1$ and/or $\geq 1$.
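The case distinction of this proof can be mirrored in a short numerical check. The sketch below compares the expression $n + |n - 1|$ with the values obtained in the two cases over a finite range of integers; the function names and the range are illustrative assumptions, and the check does not replace the proof.

```python
def left_side(n):
    """The expression n + |n - 1|."""
    return n + abs(n - 1)

def value_by_cases(n):
    """The same expression, evaluated as in the proof by cases."""
    if n <= 1:
        return n + (1 - n)      # |n - 1| = 1 - n, so the value is 1
    return n + (n - 1)          # |n - 1| = n - 1, so the value is 2n - 1

checked = all(
    left_side(n) == value_by_cases(n) and left_side(n) >= 1
    for n in range(-1000, 1001)
)
print(checked)
```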
Example 2.2.14. (Contraposition) Prove that if the square of an integer is
even, then the integer itself is even.
Proof. We define statements A, B as
‘The square of the integer (in question) is even’
and
‘The integer (in question) is even’ ,
respectively. Hence $\neg B$ corresponds to the statement
‘The integer (in question) is odd’ ,
and $\neg A$ corresponds to the statement
‘The square of the integer (in question) is odd’ .
Hence the statement follows by contraposition if we can prove that the
square of any odd integer is odd. For this, let n be some odd integer. Then
there is an integer m such that $n = 2m + 1$. Hence
$$n^2 = (2m + 1)^2 = 4m^2 + 4m + 1 = 2(2m^2 + 2m) + 1$$
is an odd integer and the statement follows.
Based on the result in the previous example, we can prove now the result
mentioned in Section 2.1 that there is no rational number whose square is
equal to 2.
Example 2.2.15. (Indirect proof) Prove that there is no rational number
whose square is 2.
Proof. The proof is indirect. Assume on the contrary that there is such a
number r. Without restriction, we can assume that $r = p/q$ where p, q
are integers without common divisor different from 1 and that $q \neq 0$. By
definition,
$$r^2 = \left(\frac{p}{q}\right)^2 = \frac{p^2}{q^2} = 2 .$$
Hence it follows that
$$p^2 = 2q^2$$
and therefore by the previous example that 2 is a divisor of p. Hence there is
an integer $\bar{p}$ such that $p = 2\bar{p}$. Substitution of this identity into the previous
equation gives
$$2\bar{p}^2 = q^2 .$$
Hence it follows again by the previous example that 2 is also a divisor of q.
As a consequence, p, q have 2 as a common divisor, which is in contradiction
to the assumption. Hence there is no rational number whose square is
equal to 2.
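The key equation $p^2 = 2q^2$ of this proof can be explored numerically. The following sketch searches for positive integer solutions up to a bound, using exact integer arithmetic; the function name `rational_sqrt2_candidates` and the bound are hypothetical, and an empty search result is only consistent with the proof, not a substitute for it.

```python
from math import isqrt

def rational_sqrt2_candidates(max_q):
    """Return all pairs (p, q) with 1 <= q <= max_q and p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        p = isqrt(2 * q * q)        # largest integer p with p*p <= 2*q*q
        if p * p == 2 * q * q:
            hits.append((p, q))
    return hits

print(rational_sqrt2_candidates(10000))
```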
Problems
1) Decide which of the following are statements.
a) Did you solve the problem?
b) Solve the problem!
c) The solution is correct.
d) Maria has green eyes.
e) Soccer is the national sport in many countries.
f) Soccer is the national sport in Germany.
g) During the last year, soccer had the most spectators among all
sports in Germany.
h) Explain your solution!
i) Can you explain your solution?
k) Indeed, the solution is correct, but can you explain it?
l) The solution is correct; please, demonstrate it on the blackboard.
2) Translate the following composite sentences into symbolic notation
using letters for basic statements which contain no connectives.
a) Either John is taller than Henry, or I am subject to an optical
illusion.
b) If John’s car breaks down, then he either has to come by bus or
by taxi.
c) Fred will stay in Europe, and he or George will visit Rome.
d) Fred will stay in Europe and visit Rome, or George will visit
Rome.
e) I will travel by train or by plane.
f) Neither Newton nor Einstein created quantum theory.
g) If and only if the sun is shining, I will go swimming today; in
case I go swimming, I will have an ice cream.
h) If students are tired or distracted, then they don’t study well.
i) If students focus on learning, their knowledge will increase; and
if they don’t focus on learning, their knowledge will remain
unchanged.
3) Denote by M, T, W the statements ‘Today is Monday’, ‘Today is
Tuesday’ and ‘Today is Wednesday’, respectively. Further, denote
by S the statement ‘Yesterday was Sunday’. Translate the following
statements into proper English.
a) $M \rightarrow (T \vee W)$ ,
b) $S \leftrightarrow M$ ,
c) $S \wedge (M \vee T)$ ,
d) $(S \rightarrow T) \vee M$ ,
e) $M \leftrightarrow (T \wedge (\neg W)) \vee S$ ,
f) $(M \leftrightarrow T) \wedge ((\neg W) \vee S)$ .
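4) By use of truth tables, prove that
a) $\neg(\neg A) \Leftrightarrow A$ ,
b) $(A \wedge B) \Leftrightarrow (B \wedge A)$ ,
c) $(A \vee B) \Leftrightarrow (B \vee A)$ ,
d) $(A \Leftrightarrow B) \Leftrightarrow (B \Leftrightarrow A)$ ,
e) $\neg(A \wedge B) \Leftrightarrow (\neg A) \vee (\neg B)$ ,
f) $\neg(A \vee B) \Leftrightarrow (\neg A) \wedge (\neg B)$ ,
g) $(A \rightarrow B) \Leftrightarrow (\neg A) \vee B$ ,
h) $A \wedge (B \wedge C) \Leftrightarrow (A \wedge B) \wedge C$ ,
i) $A \vee (B \vee C) \Leftrightarrow (A \vee B) \vee C$ ,
k) $A \vee (B \wedge C) \Leftrightarrow (A \vee B) \wedge (A \vee C)$ ,
l) $A \wedge (B \vee C) \Leftrightarrow (A \wedge B) \vee (A \wedge C)$
for arbitrary statements A, B and C.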
5) Assume that
$$a \cdot (b + c) = a \cdot b + c$$
for all real a, b and c is a valid arithmetic rule of inference. Derive
from this a contradiction to the valid arithmetic statement that $0 \neq 1$.
Therefore, conclude that the enlargement of the field of arithmetic by
addition of the above rule would lead to an inconsistent field.
6) Prove indirectly that $3n + 2$ is odd if n is an odd integer.
7) Prove indirectly that there are no integers $m > 0$ and $n > 0$ such that
$$m^2 - n^2 = 1 .$$
8) If a, b and c are odd integers, then there is no rational number x such
that $ax^2 + bx + c = 0$. [Hint: Assume that there is such a rational
number $x = r/s$ where r, $s \neq 0$ are integers without common divisor. Show that this implies the equation $r(ar + bs) = -cs^2$ which is
contradictory.]
9) Prove that there is an infinite number of prime numbers, i.e., of natural numbers $\geq 2$ that are divisible without remainder only by 1 and
by that number itself. [Hint: Assume the opposite and construct a
number which is larger than the largest prime number, but not divisible without remainder by any of the prime numbers.]
10) Prove by cases that
$$|x - 1| - |x + 2| \leq 3$$
for all real x.
11) Prove by cases that
$$|x - 1| + |x + 2| \geq 3$$
for all real x.
12) Prove by cases that
$$\left|\frac{a}{b}\right| = \frac{|a|}{|b|}$$
for all real numbers a, b such that $b \neq 0$.
13) Prove by cases that if n is an integer, then $n^3$ is of the form $9k + r$
where k is some integer and r is equal to $-1$, 0 or 1.
14) Prove that if n is an integer, then $n^5 - n$ is divisible by 5. [Hint:
Factor the polynomial $n^5 - n$ as far as possible. Then consider the
cases that n is of the form $n = 5q + r$ where q is an integer and r is
equal to 0, 1, 2, 3 or 4.]
2.2.2 Sets
Set theory was created by Georg Cantor between the years 1874 and 1897.
Its development was triggered by the general effort to develop a rigorous
basis for calculus / analysis in the 19th century. As we shall see later, for
this it is necessary to treat infinite collections of real numbers. Since antiquity, most mathematicians did not consider collections of infinitely
many objects as valid objects of thinking. This is likely due to the influence
of ancient Greek philosophy, in particular that of Aristotle (384-322 B.C.),
which dominated thinking in the West up to the 18th century. According to
Aristotle, the infinite is imperfect, unfinished and therefore
unthinkable; it is formless and confused. Hence it had to be excluded from
consideration. Set theory considers precisely such infinite collections. For this
reason, Cantor’s work initially received much criticism and was accused
of dealing with fictions. Once its use for calculus / analysis was understood,
attitudes began to change, and by the beginning of the 20th century, set
theory was recognized as a distinct branch of mathematics. Finally, it even
provided the basis for the whole of mathematics in the work of Bourbaki
mentioned in Section 2.1. Today, the notions of set theory seem so natural
that the in part fierce debates at the time of its creation are hard to understand.
In the following, only the very basics of Cantor’s original formulation of
set theory are given, which is sufficient for the purposes of the book. Today,
that approach is called ‘naive’ set theory because it uses a definition of sets
which is too broad and leads to contradictions if its full generality is exploited. One such contradiction, the so-called Zermelo-Russell paradox,
is described at the end of this section. So a more restrictive definition of
sets is needed to avoid such contradictions. For this, we refer to books on
axiomatic set theory. In the following, such paradoxes will not play a role
because calculus / analysis naturally deals with a far reduced class of sets
which satisfy the more restrictive definition of axiomatic set theory.
Like the previous section, this section is very important because the given
notions of set theory will be in consistent use throughout the book as an
efficient unifying language, but without going as far as Bourbaki’s work.
Therefore, its careful study is advised to the reader. As with the material of
the previous section, its apparent simplicity should not lead to an underestimation of its importance. Precisely the achievement of such simplicity
is the ultimate goal of the whole of mathematics because it signals a full
understanding of the studied object. Complexity just signals a deficient understanding. In addition, from a practical point of view, such simplicity
drastically reduces the chance of the occurrence of errors.
In the following we adopt the naive definition of sets given by Cantor.
Definition 2.2.16. (Sets) A set is an aggregation of definite, different objects of our intuition or of our thinking, to be conceived as a whole. Those
objects are called the elements of the set.
This implies that for a given set A and any given object a it follows that
either a is an element of A or it is not. The first is denoted by $a \in A$,
and the second is denoted by $a \notin A$. The set without any elements, the so-called
‘empty set’, is denoted by φ.
Example 2.2.17. Examples of sets are
the set of all cats ,
the set of the lowercase letters of the Latin alphabet ,
the set of odd integers .
Definition 2.2.18. (Elements) For a set A, the following statements have
the same meaning
a is in A ,
a is an element of A ,
a is a member of A ,
$a \in A$ .
Given some not necessarily different objects $x_1, x_2, \dots$, the set containing
these objects is denoted by
$$\{x_1, x_2, \dots\} .$$
In particular, we define the set of natural numbers $\mathbb{N}$, the set of natural
numbers $\mathbb{N}^*$ without 0, the set of integers $\mathbb{Z}$ and the set of integers $\mathbb{Z}^*$
without 0 by
Definition 2.2.19. (Natural numbers, integers)
$$\mathbb{N} := \{0, 1, 2, 3, \dots\} ,$$
$$\mathbb{N}^* := \{1, 2, 3, \dots\} ,$$
$$\mathbb{Z} := \{0, 1, -1, 2, -2, 3, -3, \dots\} ,$$
$$\mathbb{Z}^* := \{1, -1, 2, -2, 3, -3, \dots\} .$$
Another way of defining a set is by a property characterizing its elements,
i.e., by a property which is shared by all its elements, but not by any other
object:
$$\{x : x \text{ has the property } P(x)\} .$$
It is read as: ‘The set of all x such that $P(x)$’. In this, the symbol ‘:’ is read
as ‘such that’. In particular, we define the set of rational numbers $\mathbb{Q}$, the set
of rational numbers $\mathbb{Q}^*$ without 0, the set of real numbers $\mathbb{R}$ and the set of
real numbers $\mathbb{R}^*$ without 0 by
Definition 2.2.20. (Rational and real numbers)
$$\mathbb{Q} := \{p/q : p \in \mathbb{Z} \wedge q \in \mathbb{N} \wedge q \neq 0\} ,$$
$$\mathbb{Q}^* := \{p/q : p \in \mathbb{Z}^* \wedge q \in \mathbb{N} \wedge q \neq 0\} ,$$
$$\mathbb{R} := \{x : x \text{ is a real number}\} ,$$
$$\mathbb{R}^* := \{x : x \text{ is a non-zero real number}\} .$$
Definition 2.2.21. (Subsets, equality of sets) For all sets A and B, we
define
$$A \subset B \ :\Leftrightarrow \ \text{Every element of } A \text{ is also an element of } B$$
and say that ‘A is a subset of B’, ‘A is contained in B’, ‘A is included in B’
or ‘A is part of B’. Finally, we define
$$A = B \ :\Leftrightarrow \ A \subset B \wedge B \subset A \ \Leftrightarrow \ A \text{ and } B \text{ contain the same elements} .$$
Here and in the following, wherever meaningful, the symbol ‘:’ in front of
other symbols means and is read as ‘per definition’.
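Example 2.2.22. For instance,
$$\{1, 1, 2, 3, 5\} \subset \{1, 1, 2, 3, 5, 8, 13\} ,$$
$$\{1, 1, 2, 3, 5\} \subset \{[(1 + \sqrt{5})^n - (1 - \sqrt{5})^n]/(2^n \sqrt{5}) : n \in \mathbb{N}^*\} ,$$
$$\{1, 2, 3, 3, 5, 1\} = \{1, 2, 3, 5\} ,$$
$$\{1, 1, 2, 3, 5, \dots\} = \{[(1 + \sqrt{5})^n - (1 - \sqrt{5})^n]/(2^n \sqrt{5}) : n \in \mathbb{N}^*\} .$$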
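The closed-form expression in this example is the well-known formula for the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, . . . The following sketch compares it with the recursively computed sequence, under the assumption that n ranges over the positive integers. It uses floating point arithmetic and rounding, so it is only reliable for moderate n; the function names are illustrative.

```python
from math import sqrt

def closed_form(n):
    """Value of [(1 + sqrt 5)^n - (1 - sqrt 5)^n] / (2^n sqrt 5) as a float."""
    s = sqrt(5.0)
    return ((1 + s) ** n - (1 - s) ** n) / (2 ** n * s)

def fibonacci(count):
    """The first `count` members of the sequence 1, 1, 2, 3, 5, ..."""
    values, a, b = [], 1, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values

# Round the floating point values to the nearest integer for n = 1, ..., 20.
from_formula = [round(closed_form(n)) for n in range(1, 21)]
print(from_formula == fibonacci(20))

# Repeated elements and the order of listing do not matter for sets.
print({1, 2, 3, 3, 5, 1} == {1, 2, 3, 5})
```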
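In particular, we define subsets of $\mathbb{R}$, so-called intervals, by
Definition 2.2.23.
$$[a, b] := \{x \in \mathbb{R} : a \leq x \leq b\} , \quad (a, b) := \{x \in \mathbb{R} : a < x < b\} ,$$
$$[a, b) := \{x \in \mathbb{R} : a \leq x < b\} , \quad (a, b] := \{x \in \mathbb{R} : a < x \leq b\} ,$$
$$[c, \infty) := \{x \in \mathbb{R} : x \geq c\} , \quad (c, \infty) := \{x \in \mathbb{R} : x > c\} ,$$
$$(-\infty, d) := \{x \in \mathbb{R} : x < d\} , \quad (-\infty, d] := \{x \in \mathbb{R} : x \leq d\}$$
for all $a, b \in \mathbb{R}$ such that $a \leq b$ and $c, d \in \mathbb{R}$.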
We define the following operations on sets.
Definition 2.2.24. (Operations on sets, I) For all sets A and B, we define
(i) their union $A \cup B$, read: ‘A union B’, by
$$A \cup B := \{x : x \in A \vee x \in B\} ,$$
(ii) their intersection $A \cap B$, read: ‘A intersection B’, by
$$A \cap B := \{x : x \in A \wedge x \in B\} .$$
If $A \cap B = \phi$, we say that A and B are disjoint.
(iii) the relative complement of B in A, $A \setminus B$, read: ‘A without B’ or ‘A
minus B’, by
$$A \setminus B := \{x : x \in A \wedge x \notin B\} ,$$
(iv) their cross (or Cartesian / direct) product $A \times B$, read: ‘A cross B’,
by
$$A \times B := \{(x, y) : x \in A \wedge y \in B\}$$
where ordered pairs $(x_1, y_1)$, $(x_2, y_2)$ are defined equal,
$$(x_1, y_1) = (x_2, y_2) ,$$
if and only if $x_1 = x_2$ and $y_1 = y_2$. We also use the notation $A^2$ for
$A \times A$. More generally, we define for $n \in \mathbb{N}$ such that $n \geq 3$ and sets
$A_1, \dots, A_n$ the corresponding Cartesian product
$$A_1 \times \dots \times A_n \qquad (2.2.5)$$
to consist of all ordered n-tuples $(x_1, \dots, x_n)$ of elements $x_1 \in A_1$,
. . . , $x_n \in A_n$. Also in this case, we define such ordered n-tuples $(x_1, \dots, x_n)$
and $(y_1, \dots, y_n)$ to be equal if and only if all their components
are equal, i.e., if and only if $x_1 = y_1$, . . . , $x_n = y_n$. We also use the
notation
$$\bigtimes_{i=1}^{n} A_i$$
for (2.2.5) and, in the case that $A_1, \dots, A_n$ are all equal to some set
A, the notation $A^n$. Finally, we define $\mathbb{R}^1 := \mathbb{R}$.
Fig. 1: Two subsets A and B of the plane.
Fig. 2: Union and intersection of A and B. The last is given by the blue domain.
Fig. 3: The relative complement of B in A.
Fig. 4: Subsets A of the real line and B of the plane and their cross product.
Example 2.2.25.
$$\{1, 2, 3, 5, 8, 13\} \cup \{1, 3, 4, 7, 11, 18\} = \{1, 2, 3, 4, 5, 7, 8, 11, 13, 18\} ,$$
$$\{1, 2, 3, 5, 8, 13\} \cap \{1, 3, 4, 7, 11, 18\} = \{1, 3\} ,$$
$$\{1, 2, 3, 5, 8, 13\} \setminus \{1, 2, 3, 5\} = \{8, 13\} ,$$
$$\{1, 2, 3, 5, 1\} \setminus \{1\} = \{2, 3, 5\} ,$$
$$\{1, 2\} \times \{1, 3, 4\} = \{(1, 1), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4)\} .$$
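For finite sets, the operations of this example correspond directly to operations on Python's built-in `set` type, with the cross product available through `itertools.product`. The following sketch recomputes the example; the variable names are illustrative.

```python
from itertools import product

A = {1, 2, 3, 5, 8, 13}
B = {1, 3, 4, 7, 11, 18}

union = A | B                                  # elements in A or in B
intersection = A & B                           # elements in A and in B
difference = A - {1, 2, 3, 5}                  # relative complement
difference_2 = {1, 2, 3, 5, 1} - {1}           # repeated elements are ignored
cross = set(product({1, 2}, {1, 3, 4}))        # set of ordered pairs

print(sorted(union), sorted(intersection), sorted(difference),
      sorted(difference_2), sorted(cross))
```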
We also define unions and intersections of arbitrary families of sets.
Definition 2.2.26. (Operations on sets, II) Let I be some non-empty set
and for every $i \in I$ the corresponding $A_i$ an associated set. Then we define
$$\bigcup_{i \in I} A_i := \{x : x \in A_i \text{ for some } i \in I\} ,$$
$$\bigcap_{i \in I} A_i := \{x : x \in A_i \text{ for all } i \in I\} .$$
Example 2.2.27. Determine
$$\bigcup_{n \in \mathbb{N}^*} [1/n, 1] , \quad \bigcap_{n \in \mathbb{N}^*} [0, 1/n] .$$
Solution: By definition
$$S_1 := \bigcup_{n \in \mathbb{N}^*} [1/n, 1] = \{x : x \in [1/n, 1] \text{ for some } n \in \mathbb{N}^*\} .$$
Any $x \in \mathbb{R}$ such that $x > 1$ or $x \leq 0$ is not contained in any of the sets
$[1/n, 1]$, $n \in \mathbb{N}^*$, and hence also not contained in their union $S_1$. On the
other hand, if $x \in \mathbb{R}$ is such that $0 < x \leq 1$, then
$$\frac{1}{n} \leq x \leq 1$$
if $n \in \mathbb{N}^*$ is such that $n \geq 1/x$. Hence for such n, $x \in [1/n, 1]$ and hence
$x \in S_1$. As a consequence,
$$\bigcup_{n \in \mathbb{N}^*} [1/n, 1] = (0, 1] .$$
Further, by definition
$$S_2 := \bigcap_{n \in \mathbb{N}^*} [0, 1/n] = \{x : x \in [0, 1/n] \text{ for all } n \in \mathbb{N}^*\} .$$
No $x \in \mathbb{R}$ such that $x < 0$ is contained in any of the $[0, 1/n]$, $n \in \mathbb{N}^*$, and
hence also not contained in $S_2$. 0 is contained in all of these sets and hence
also contained in $S_2$. If $x \in \mathbb{R}$ is such that $x > 0$, then
$$\frac{1}{n} < x$$
for $n \in \mathbb{N}^*$ such that $n > 1/x$. Hence for such n, $x \notin [0, 1/n]$ and therefore
$x \notin S_2$. As a consequence,
$$\bigcap_{n \in \mathbb{N}^*} [0, 1/n] = \{0\} .$$
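The argument of this solution is constructive: for a given x it names an index n that witnesses membership in the union or non-membership in the intersection. The following sketch makes these choices explicit, assuming that n ranges over the positive integers and using exact rational arithmetic; the function names `union_witness` and `intersection_excluder` are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

def union_witness(x):
    """Return some n >= 1 with 1/n <= x <= 1, or None if x is outside (0, 1]."""
    if not (0 < x <= 1):
        return None
    return ceil(1 / x)              # any integer n >= 1/x works

def intersection_excluder(x):
    """Return some n >= 1 with x not in [0, 1/n], or None if x == 0."""
    if x < 0:
        return 1                    # a negative x lies in no interval [0, 1/n]
    if x == 0:
        return None                 # 0 lies in every interval [0, 1/n]
    return floor(1 / x) + 1         # n > 1/x, hence 1/n < x

x = Fraction(3, 7)
n = union_witness(x)
print(n, Fraction(1, n) <= x <= 1)

m = intersection_excluder(x)
print(m, Fraction(1, m) < x)
```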
The naive Definition 2.2.16 of sets leads to paradoxes like the one of Zermelo-Russell (1903):
Assume that there is a set of all sets that don’t contain themselves as an element:
$$S := \{x : x \text{ is a set} \wedge x \notin x\} .$$
Since S is assumed to be a set, either $S \in S$ or $S \notin S$. From the assumption
that $S \in S$, it follows by the definition of S that $S \notin S$, which is a contradiction. Hence it follows
that $S \notin S$. From $S \notin S$, it follows by the definition of S that $S \in S$, which is again a contradiction.
Hence there is no such set.
Bertrand Russell also used a statement about a barber to illustrate this principle. If a barber cuts the hair of exactly those who do not cut their own
hair, does the barber cut his own hair?
So a more restrictive definition of sets is needed to avoid such contradictions. For this, we refer to books on axiomatic set theory. In the following,
such paradoxes will not play a role because we don’t use the full generality
of Definition 2.2.16. Calculus / analysis naturally deals with a far reduced
class of sets which satisfy the more restrictive definition of axiomatic set
theory.
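The barber illustration can be spelled out as a small consistency check. In the sketch below, the only unknown is whether the barber cuts his own hair; the rule that he cuts the hair of exactly those who do not cut their own hair is then applied to the barber himself. The function name `consistent` is an illustrative choice.

```python
def consistent(barber_cuts_own_hair):
    """Check the barber's rule when it is applied to the barber himself.

    Rule: the barber cuts the hair of a person exactly if that person does
    not cut his own hair. For the barber, 'the barber cuts his hair' and
    'he cuts his own hair' are the same statement.
    """
    return barber_cuts_own_hair == (not barber_cuts_own_hair)

# Try both possible truth values.
consistent_answers = [value for value in (True, False) if consistent(value)]
print(consistent_answers)
```

Neither truth value satisfies the rule, which mirrors the conclusion that there is no such set S.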
Problems
1) For each pair of the following sets, decide whether or not they are equal:
$A := \{2, 3\}$, $B := \{3, 2\} \cup \phi$, $C := \{2, 3\} \cup \{\phi\}$,
$D := \{x \in \mathbb{R} : x^2 - x - 6 = 0\}$, $E := \{\phi, 2, 3\}$, $F := \{2, 3, 2\}$,
$G := \{2, \phi, \phi, 3\}$ .
2) Simplify
$$\{2, 3\} \cup \{\{2\}, \{3\}\} \cup \{2, \{3\}\} \cup \{\{2\}, 3\} .$$
3) Decide whether
$$\{1, 3\} \in \{1, 3, \{1, 7\}, \{1, 3, 7\}\} .$$
Justify your answer.
4) Let $A := \{\phi, \{1\}, \{1, 3\}, \{3, 4\}\}$. Determine for each of the following
statements whether it is true or false.
a) $1 \in A$ ,
b) $\{1\} \subset A$ ,
c) $\{1\} \in A$ ,
d) $\{1, 3\} \subset A$ ,
e) $\{\{1, 3\}\} \in A$ ,
f) $\phi \in A$ ,
g) $\phi \subset A$ ,
h) $\{\phi\} \subset A$ .
5) Give an example of sets A, B, C such that $A \in B$ and $B \in C$, but
$A \notin C$.
6) Sketch the following sets
A : tpx, y q P R2 : x
B : tpx, y q P R : 2x
2
C : tpx, y q P R : x
1 0u ,
y
3y
5 0u ,
1u, D : tp0, 1qu ,
E : tp1, 1qu, F : tp0, 1qu, G : tp1, 0qu, H : tp2, 3qu ,
I : tp4, 1qu, J : tx P R : 4 ¤ x ¤ 2u, K : t0u, L : t1u
into an xy-diagram and calculate $A \cap B$, $A \cap C$, $(A \cap B) \cap C$, $A \cap (B \cap C)$,
$B \cap (J \times L)$, $C \cap (J \times K)$, $A \setminus B$, $B \setminus A$, $B \cup E$, $(C \cup F) \cup G$.
2
2
y
2
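7) Let A, B and C be sets. Show that
a) If $A \subset B$ and $B \subset C$, then $A \subset C$ ,
b) $A \cup B = B \cup A$ ,
c) $A \cap B = B \cap A$ ,
d) $A \cup (B \cup C) = (A \cup B) \cup C$ ,
e) $A \cap (B \cap C) = (A \cap B) \cap C$ ,
f) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ ,
g) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ ,
h) $C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$ ,
i) $C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$ .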
2.2.3 Maps
The development of the concept of a function and its generalization, i.e., the
concept of a map (or ‘mapping’), are further major achievements of Western culture that have no counterpart in ancient Greek mathematics. The first
concept underwent considerable changes until it reached its current meaning.
The principal objects of study of the calculus in the 17th century were
geometric objects, in particular curves, but not functions in their current
meaning. Also the variables associated with those objects had a geometrical meaning, like abscissas, ordinates and tangents. The term function
appeared first in the works of Leibniz. In particular, he asserts that a tangent is a function of a curve. This only very roughly matches the modern
notion of a function. Newton’s method of ‘fluxions’ applies to ‘fluents’ not
to functions. For Newton, a curve is generated by a continuous motion of a
point he called ‘fluent’ because he thought of it as a flowing quantity. The
‘fluxion’ or rate at which it flowed, was the point’s velocity.
Under the influence of analytic geometry, in the first half of the 18th century, the geometric concept of variables was replaced by the concept of
a function as an equation or analytic expression composed of variables
and numbers. Admissible analytic expressions were those that involved
the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. Subsequently, as a consequence
of the study of the solutions of the wave equation in one space dimension
(‘the Vibrating-String Problem’), the concept of a function was enlarged to
include those that are piecewise defined on intervals by several analytic expressions, as well as functions (in the sense of curves) drawn ‘free-hand’ and
possibly not expressible by any combination of analytic expressions.
The final step in the evolution of the function concept was made by Gustav Lejeune Dirichlet in 1829 [30] in a paper which gave a precise meaning to Fourier’s work from 1822 [41] on heat conduction. In that work,
Fourier claimed that ‘any’ function defined over an interval $(-l, l)$ can be
represented by his series over this interval. Fourier’s statement and proof were insufficient,
and not only by modern standards. Moreover, a proof or disproof
of that statement presupposed a clear definition of the concept of a function. For Dirichlet, y is a function of a variable x, defined on the interval
$a < x < b$, if to every value of the variable x in this interval there corresponds a definite value of the variable y. Also, it is irrelevant in what
way this correspondence is established. Already in 1887 [31], Richard Dedekind
generalized the concept of a function to that of a mapping:
‘By a mapping of a system S a law is understood, in accordance with which to each determinate element s of S there is
associated a determinate object, which is called the image of s
and is denoted by $\varphi(s)$; we say too, that $\varphi(s)$ corresponds to
the element s, that $\varphi(s)$ is caused or generated by the mapping
$\varphi$ out of s, that s is transformed by the mapping $\varphi$ into $\varphi(s)$.’
This definition practically coincides with the modern definition of maps
given below.
Fourier’s claim pinpointed a major weakness in the mathematics of the 18th
century. On the one hand, the insufficiency of Fourier’s ‘proof’ was obvious to the mathematical community at the time. On the other hand, the
notion of a function was too nebulously defined for it to be
convincingly claimed that his result was false. This clearly signaled that
those mathematical notions (or the ‘mathematical language’) were too imprecise to deal with such questions and that more precise notions had to be
developed. This makes clear the size of Dirichlet’s achievement. He had
to solve simultaneously two intertwined problems, namely the giving of a
precise mathematical meaning to Fourier’s result and the development of
a mathematical framework where this is possible. In particular, it was not
clear whether such a thing was possible at all. To this day, such problems are
common in mathematics related to applications.
A careful study of this section, too, is advised. It introduces the current notion of maps and gives efficient means for their description which will be used throughout the book. Whenever reference is made to a function or to a map in the following, imagining a picture similar to Fig. 5 should be helpful to the reader. Mathematically, it is possible to identify a map with a set, namely its graph, see Definition 2.2.33. Doing so exclusively is not advisable, since the graph often provides no visual help, in particular when it is a subset of a space of more than 3 dimensions, which is frequently the case in applications. In addition, it often hinders intuition, since maps are frequently used to describe transformations. It is more advisable to consider the graph of a map as one of the options to describe or visualize the latter. Indeed, this option will frequently be used in Calculus I and II. Other such options, becoming relevant in Calculus III, are, for instance, contour and density maps. With increasing complexity of the considered problems, also in applications, the options for a meaningful visualization of the involved maps rapidly decrease and an abstract view of maps becomes essential.

Fig. 5: Points in the set $A$ and their images in the set $B$ under the map $f$ are connected by arrows. Compare Definition 2.2.28.
Definition 2.2.28. (Maps) Let $A$ and $B$ be non-empty sets.
(i) A map (or mapping) $f$ from $A$ into $B$, denoted by $f : A \to B$, is an association which associates to every element of $A$ a corresponding element of $B$. If $B$ is a subset of the real numbers, we call $f$ a function. We call $A$ the domain of $f$. If $f$ is given, we also use the short notation $D(f)$ for the domain of $f$.
(ii) For every $x \in A$, we call $f(x)$ the value of $f$ at $x$ or the image of $x$ under $f$.
(iii) For any subset $A'$ of $A$, we call the set $f(A')$ containing all the images of its elements under $f$,
$$f(A') := \{f(x) : x \in A'\} , \qquad (2.2.6)$$
the image of $A'$ under $f$. In particular, we call $f(A)$ the range or image of $f$. If $f$ is given, we also use the short notation
$$\mathrm{Ran}(f) := f(A) = \{f(x) : x \in A\}$$
for the range or image of $f$.
(iv) For any subset $B'$ of $B$, we call the subset $f^{-1}(B')$ of $A$ containing all those elements which are mapped into $B'$,
$$f^{-1}(B') := \{x \in A : f(x) \in B'\} ,$$
the inverse image of $B'$ under $f$. In particular, if $f$ is a function, we call
$$f^{-1}(\{0\}) = \{x \in A : f(x) = 0\}$$
the set of zeros of $f$ or the zero set of $f$.
(v) For any subset $A'$ of $A$, we define the restriction of $f$ to $A'$ as the map $f|_{A'} : A' \to B$ defined by
$$f|_{A'}(x) := f(x)$$
for all $x \in A'$.
Remark 2.2.29. (Variables) We will not introduce a precise notion of ‘variables’ in the following because it would be redundant. Still, a residue of this historic notion is present in the commonly used characterization of functions as functions of one variable, several variables or $n$ variables, where $n \in \mathbb{N}$ is such that $n \ge 2$. Also in this text, we will refer to a function whose domain is a subset of $\mathbb{R}$ as a function of one variable and to a function whose domain is a subset of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, as a function of several variables or a function of $n$ variables.
Remark 2.2.30. In the following, we make the general assumption of basic knowledge of integer powers and $n$-th roots, where $n \in \mathbb{N}^*$, as well as of the functions
$$\sin : \mathbb{R} \to \mathbb{R} , \quad \arcsin : [-1, 1] \to [-\pi/2, \pi/2] ,$$
$$\cos : \mathbb{R} \to \mathbb{R} , \quad \arccos : [-1, 1] \to [0, \pi] ,$$
$$\tan : (-\pi/2, \pi/2) \to \mathbb{R} , \quad \arctan : \mathbb{R} \to (-\pi/2, \pi/2) ,$$
$$\exp : \mathbb{R} \to \mathbb{R} , \quad \ln : (0, \infty) \to \mathbb{R}$$
as provided by high school mathematics. Still, we give definitions of some of these functions later on to exemplify methods of calculus.
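Example 2.2.31. Define $f : \mathbb{Z} \to \mathbb{Z}$ by
$$f(n) := n^2$$
for all $n \in \mathbb{Z}$. Moreover, let $g$ be the restriction of $f$ to $\mathbb{N}$. Calculate
$$f(\mathbb{Z}) , \ f(\{-2, -1, 0, 1, 2\}) , \ f^{-1}(\{-1, 0, 1\}) , \ f^{-1}(\{6\}) , \ g^{-1}(\{-1, 0, 1\}) .$$
Solution:
$$f(\mathbb{Z}) = \{n^2 : n \in \mathbb{N}\} , \quad f(\{-2, -1, 0, 1, 2\}) = \{0, 1, 4\} ,$$
$$f^{-1}(\{-1, 0, 1\}) = \{-1, 0, 1\} , \quad f^{-1}(\{6\}) = \emptyset , \quad g^{-1}(\{-1, 0, 1\}) = \{0, 1\} .$$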
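The images and inverse images of Example 2.2.31 can be recomputed mechanically. The following is only a minimal sketch: the helper names `image` and `inverse_image` and the finite window of integers are assumptions made for illustration, since $\mathbb{Z}$ itself cannot be enumerated. For the sets considered here, the window contains all relevant elements.

```python
# Finite window of the integers, used only because Z itself is infinite.
WINDOW = range(-10, 11)


def f(n):
    """f(n) = n^2, the map from Example 2.2.31."""
    return n * n


def image(func, subset):
    """Image of a subset: {func(x) : x in subset}."""
    return {func(x) for x in subset}


def inverse_image(func, domain, targets):
    """Inverse image: {x in domain : func(x) in targets}."""
    return {x for x in domain if func(x) in targets}


# Domain of g, the restriction of f to the natural numbers (within the window).
NATURALS = [n for n in WINDOW if n >= 0]

print(sorted(image(f, {-2, -1, 0, 1, 2})))             # [0, 1, 4]
print(sorted(inverse_image(f, WINDOW, {-1, 0, 1})))    # [-1, 0, 1]
print(sorted(inverse_image(f, WINDOW, {6})))           # []
print(sorted(inverse_image(f, NATURALS, {-1, 0, 1})))  # [0, 1]
```

Note that the inverse image is computed without any inverse map: $f$ is not injective, yet $f^{-1}(B')$ is defined for every subset $B'$.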
Example 2.2.32. Define $f : D_f \to \mathbb{R}$ and $g : D_g \to \mathbb{R}$ such that
(a) $f(x) = \sqrt{x + 2}$ for all $x \in D_f$,
(b) $g(x) = 1/(x^2 - x)$ for all $x \in D_g$
and such that $D_f$ and $D_g$ are maximal. Find the domains $D_f$ and $D_g$. Give explanations. Solution: In case (a) the inequality
$$x + 2 \ge 0 \quad (\Leftrightarrow x \ge -2)$$
has to be satisfied in order that the square root is defined. Hence
$$D_f := \{x \in \mathbb{R} : x \ge -2\}$$
and $f : D_f \to \mathbb{R}$ is defined by $f(x) := \sqrt{x + 2}$ for all $x \in D_f$. In case (b) the denominator has to be different from zero in order that the quotient is defined. Because of
$$x^2 - x = x(x - 1) = 0 \Leftrightarrow x \in \{0, 1\} ,$$
we conclude that
$$D_g := \{x \in \mathbb{R} : x \ne 0 \wedge x \ne 1\}$$
and that $g : D_g \to \mathbb{R}$ is defined by $g(x) := 1/(x^2 - x)$ for all $x \in D_g$.
Definition 2.2.33. (Graph of a map) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. Then we define the graph of $f$ by:
$$G(f) := \{(x, f(x)) \in A \times B : x \in A\} .$$
Example 2.2.34. Sketch the graphs of the functions f and g from Example 2.2.32. Solution: See Fig. 6 and Fig. 7.
Example 2.2.35. Find the ranges of the functions in Example 2.2.32. Solution: Since the square root assumes only non-negative values, we conclude that
$$f(D_f) \subset \{y : y \ge 0\} .$$
Further, for every $y \in [0, \infty)$, it follows that
$$\sqrt{(y^2 - 2) + 2} = y ,$$
i.e., that $f(y^2 - 2) = y$, and hence that
$$\{y : y \ge 0\} \subset f(D_f)$$
and, finally, that $f(D_f) = \{y : y \ge 0\}$. Further, for $x < 0$ or $x > 1$, it follows that $x(x - 1) > 0$ and hence that $g(x) > 0$. For $0 < x < 1$, it follows that
$$-\frac{1}{4} \le \left(x - \frac{1}{2}\right)^2 - \frac{1}{4} = x(x - 1) < 0$$
and hence that $g(x) \le -4$. Hence it follows that
$$g(D_g) \subset \{y : y > 0\} \cup \{y : y \le -4\} .$$
Finally, for any real $y$ such that $(y > 0) \vee (y \le -4)$, it follows that
$$g\left(\frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}}\right) = y$$
and hence that $\{y : y > 0\} \cup \{y : y \le -4\} \subset g(D_g)$. Therefore $g(D_g) = \{y : y > 0\} \cup \{y : y \le -4\}$.
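The two halves of the argument for the range of $g$ can be checked numerically. The sketch below is an illustration only, not part of the proof: the name `preimage_point` and the sample values are assumptions. It verifies that the point $\frac{1}{2} + \sqrt{1/y + 1/4}$ is mapped to $y$ for sample values $y > 0$ and $y \le -4$, and that $g(x) \le -4$ on a grid in $(0, 1)$.

```python
import math


def g(x):
    """g(x) = 1 / (x^2 - x) from Example 2.2.32, defined for x not in {0, 1}."""
    return 1.0 / (x * x - x)


def preimage_point(y):
    """The point 1/2 + sqrt(1/y + 1/4), defined for y > 0 or y <= -4."""
    return 0.5 + math.sqrt(1.0 / y + 0.25)


# Every sample y with y > 0 or y <= -4 is attained by g.
samples = [0.5, 3.0, 100.0, -4.0, -10.0, -1000.0]
attained = all(math.isclose(g(preimage_point(y)), y, rel_tol=1e-9) for y in samples)

# On a grid inside (0, 1) the values of g never exceed -4.
bounded = all(g(k / 1000.0) <= -4.0 for k in range(1, 1000))

print(attained, bounded)
```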
A map is called injective (or one-to-one) if no two points from its domain
are mapped onto the same point. A map into a set B is called surjective
(or onto) if every element from B is the image of some element from its
domain. Finally, a map is called bijective (or one-to-one and onto) if it is
injective and surjective.
Definition 2.2.36. (Injectivity, surjectivity, bijectivity) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. We define
(i) $f$ is injective (or one-to-one) if different elements of $A$ are mapped into different elements of $B$, or equivalently if
$$f(x) = f(y) \Rightarrow x = y$$
for all $x, y \in A$. In this case, we define the inverse map $f^{-1}$ as the map from $f(A)$ into $A$ which associates to every $y \in f(A)$ the element $x \in A$ such that $f(x) = y$.
(ii) $f$ is surjective (or onto) if every element of $B$ is the image of some element(s) of $A$:
$$f(A) = B .$$
(iii) $f$ is bijective (or one-to-one and onto) if it is both injective and surjective. In this case, the domain of the inverse map is the whole of $B$.

Fig. 6: $G(f)$ from Example 2.2.32.
Fig. 7: $G(g)$ from Example 2.2.32.
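For maps between finite sets, the three properties of Definition 2.2.36 can be tested by direct enumeration. The following sketch is an illustration under that finiteness assumption; the function names and the sample sets are hypothetical and chosen only for this example.

```python
def is_injective(func, domain):
    """True if no two elements of the domain share an image."""
    seen = set()
    for x in domain:
        y = func(x)
        if y in seen:
            return False
        seen.add(y)
    return True


def is_surjective(func, domain, codomain):
    """True if the image func(domain) equals the codomain."""
    return {func(x) for x in domain} == set(codomain)


def inverse_map(func, domain):
    """Inverse of an injective map, as a dict from func(domain) to domain."""
    if not is_injective(func, domain):
        raise ValueError("map is not injective, so it has no inverse")
    return {func(x): x for x in domain}


def square(n):
    return n * n


def shift(n):
    """Cyclic shift on {0, ..., 4}, a bijection of that set onto itself."""
    return (n + 1) % 5


print(is_injective(square, range(-2, 3)))           # False, since (-1)^2 = 1^2
print(is_surjective(square, range(-2, 3), range(5)))  # False, since 2 is not a square
print(inverse_map(square, range(3)))                # {0: 0, 1: 1, 4: 2}
print(is_injective(shift, range(5)) and is_surjective(shift, range(5), range(5)))  # True
```

The inverse is stored on the image $f(A)$ only, in line with part (i) of the definition.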
Example 2.2.37. Let $f$ and $g$ be as in Example 2.2.31. In addition, define $h : \mathbb{Z} \to \mathbb{Z}$ by $h(n) := n + 1$ for all $n \in \mathbb{Z}$. Decide whether $f$, $g$ and $h$ are injective, surjective or bijective. If existent, give the corresponding inverse function(s).
Solution: $f$ is neither injective (and hence also not bijective) nor surjective, for instance, because of
$$f(-1) = f(1) = 1 , \quad 2 \notin f(\mathbb{Z}) .$$
$g$ is injective because if $m$ and $n$ are some natural numbers such that $g(m) = g(n)$, then it follows that
$$0 = m^2 - n^2 = (m - n)(m + n)$$
and hence that
$$m = n \ \vee \ m = -n$$
and therefore, since $g$ has as its domain the natural numbers, that $m = n$. The inverse $g^{-1} : g(\mathbb{N}) \to \mathbb{N}$ is given by $g^{-1}(l) = \sqrt{l}$ for all $l \in g(\mathbb{N})$. $g$ is not surjective (and hence also not bijective), for instance, since $2 \notin g(\mathbb{N})$. $h$ is injective because if $m$ and $n$ are some integers such that $h(m) = h(n)$, then it follows that
$$0 = m + 1 - (n + 1) = m - n$$
and hence that $m = n$. $h$ is surjective (and hence as a whole bijective) because for any integer $n$ we have $h(n - 1) = n$. The inverse function $h^{-1} : \mathbb{Z} \to \mathbb{Z}$ is given by $h^{-1}(n) = n - 1$ for all $n \in \mathbb{Z}$.
The following theorem characterizes the injectivity, surjectivity and bijectivity of a map in terms of its graph. In the special case of functions defined on subsets of the real numbers, the theorem can be stated as follows. Such a function is injective if and only if every parallel to the x-axis intersects its graph in at most one point. If such a function maps into the set $B$, then it is surjective if and only if every parallel to the x-axis at a height $y \in B$ intersects its graph in at least one point, and it is bijective if and only if every such parallel intersects its graph in precisely one point.
Theorem 2.2.38. Let $A$ and $B$ be sets and $f : A \to B$ be a map. Further, define for every $y \in B$ the corresponding intersection $G_{fy}$ by
$$G_{fy} := G(f) \cap \{(x, y) : x \in A\} .$$
Then
(i) $f$ is injective if and only if $G_{fy}$ contains at most one point for all $y \in B$.
(ii) $f$ is surjective if and only if $G_{fy}$ is non-empty for all $y \in B$.
(iii) $f$ is bijective if and only if $G_{fy}$ contains exactly one point for all $y \in B$.
Proof. (i) The proof is indirect. Assume that there is $y \in B$ such that $G_{fy}$ contains two different points $(x_1, y)$ and $(x_2, y)$. Then, since $G_{fy}$ is part of $G(f)$, it follows that $y = f(x_1) = f(x_2)$ and hence, since by assumption $x_1 \ne x_2$, that $f$ is not injective. Conversely, assume that $f$ is not injective. Then there are different $x_1, x_2 \in A$ such that $f(x_1) = f(x_2)$. Hence $G_{f f(x_1)}$ contains two different points $(x_1, f(x_1))$ and $(x_2, f(x_1))$. (ii) If $f$ is surjective, then for any $y \in B$ there is some $x \in A$ such that $y = f(x)$ and hence $(x, y) \in G_{fy}$. On the other hand, if $G_{fy}$ is non-empty for all $y \in B$, then for every $y \in B$ there is some $x \in A$ such that $(x, y) \in G_{fy}$ and hence, since $G_{fy}$ is part of $G(f)$, that $y = f(x)$. Hence $f$ is surjective. (iii) is an obvious consequence of (i) and (ii).

Fig. 8: $G(f)$ from Example 2.2.32 and parallels to the x-axis.
Fig. 9: $G(g)$ from Example 2.2.32 and parallels to the x-axis.
Example 2.2.39. Apply Theorem 2.2.38 to investigate the injectivity of f
and g from Example 2.2.32. Solution: Fig. 8, Fig. 9 suggest that f is
injective, but not surjective and that g is neither injective nor surjective.
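Example 2.2.40. Show that $f$ and the restriction of $g$ to
$$\{x : x \ge 1/2 \wedge x \ne 1\} ,$$
where $f$ and $g$ are from Example 2.2.32, are injective and calculate their inverses. Solution: If $x_1, x_2$ are any real numbers $\ge -2$ such that $f(x_1) = f(x_2)$, then
$$\sqrt{x_1 + 2} = \sqrt{x_2 + 2}$$
and hence
$$x_1 + 2 = x_2 + 2$$
and $x_1 = x_2$. Hence $f$ is injective. Further, for every $y$ in the range of $f$ there is $x \ge -2$ such that
$$y = \sqrt{x + 2}$$
and hence
$$x = y^2 - 2 .$$
Therefore
$$f^{-1}(y) = y^2 - 2$$
for all $y$ from the range of $f$. Further, if $x_1$ and $x_2$ are some real numbers $\ge 1/2$ different from $1$ and such that
$$\frac{1}{x_1^2 - x_1} = \frac{1}{x_2^2 - x_2} ,$$
then
$$x_1^2 - x_1 - (x_2^2 - x_2) = (x_1 - x_2)(x_1 + x_2 - 1) = 0$$
and hence $x_1 = x_2$, since $x_1 + x_2 - 1 = 0$ is possible only for $x_1 = x_2 = 1/2$. Hence the restriction of $g$ to $\{x : x \ge 1/2 \wedge x \ne 1\}$ is injective. Finally, if $y$ is some real number in the range of this restriction, then $y$ is in particular different from zero and
$$y = \frac{1}{x^2 - x} ,$$
hence
$$x^2 - x = \frac{1}{y}$$
and
$$\left(x - \frac{1}{2}\right)^2 = \frac{1}{y} + \frac{1}{4} .$$
Therefore, since $x \ge 1/2$, the inverse of that restriction of $g$ assigns to $y$ the value
$$\frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}}$$
for all $y$ from the range of that restriction of $g$.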
The next definition introduces the composition of maps, which corresponds to the application of maps in sequence.
Definition 2.2.41. (Composition) Let $A$, $B$, $C$ and $D$ be sets. Further, let $f : A \to B$ and $g : C \to D$ be maps. We define the composition $g \circ f : f^{-1}(B \cap C) \to D$ (read: ‘$g$ after $f$’) by
$$(g \circ f)(x) := g(f(x))$$
for all $x \in f^{-1}(B \cap C)$. Note that $g \circ f$ is trivial, i.e., with an empty domain, for instance, if $B \cap C = \emptyset$. Also note that $f^{-1}(B \cap C) = A$ if $B \subset C$.
Example 2.2.42. Calculate $f \circ f$, $h \circ h$, $f \circ h$ and $h \circ f$ where $f$, $h$ are defined as in Example 2.2.31, Example 2.2.37, respectively.
Solution: Obviously, all these maps map $\mathbb{Z}$ into itself. Moreover, for every $n \in \mathbb{Z}$:
$$(f \circ f)(n) = f(f(n)) = f(n^2) = (n^2)^2 = n^4 ,$$
$$(h \circ h)(n) = h(h(n)) = h(n + 1) = (n + 1) + 1 = n + 2 ,$$
$$(h \circ f)(n) = h(f(n)) = h(n^2) = n^2 + 1 ,$$
$$(f \circ h)(n) = f(h(n)) = f(n + 1) = (n + 1)^2 = n^2 + 2n + 1 .$$
Note in particular that $h \circ f \ne f \circ h$.
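The four compositions of Example 2.2.42 can be checked on a range of integers. The sketch below is an illustration only; the helper `compose` is a hypothetical name, and it covers just the case where the range of the inner map lies in the domain of the outer map, which holds here.

```python
def compose(outer, inner):
    """Return the composition 'outer after inner', i.e. x -> outer(inner(x))."""
    def composed(x):
        return outer(inner(x))
    return composed


def f(n):
    """f(n) = n^2 from Example 2.2.31."""
    return n * n


def h(n):
    """h(n) = n + 1 from Example 2.2.37."""
    return n + 1


f_f, h_h = compose(f, f), compose(h, h)
h_f, f_h = compose(h, f), compose(f, h)

for n in range(-5, 6):
    assert f_f(n) == n ** 4
    assert h_h(n) == n + 2
    assert h_f(n) == n * n + 1
    assert f_h(n) == n * n + 2 * n + 1

print(h_f(1), f_h(1))  # 2 4, so composition is not commutative
```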
Example 2.2.43. Let $A$ and $B$ be sets. Moreover, let $f : A \to B$ be some injective map. Calculate $f^{-1} \circ f$. Assume that $f$ is also surjective (and hence as a whole bijective) and calculate also $f \circ f^{-1}$ for this case.
Solution: To every $y \in f(A)$, the map $f^{-1}$ associates the corresponding $x \in A$ which satisfies $f(x) = y$. In particular, it associates to $f(x)$ the element $x$ for all $x \in A$. Hence
$$f^{-1} \circ f = \mathrm{id}_A , \quad f \circ f^{-1} = \mathrm{id}_{f(A)}$$
where for every set $C$ the corresponding map $\mathrm{id}_C : C \to C$ is defined by
$$\mathrm{id}_C(x) := x$$
for all $x \in C$. Further, if $f$ is bijective, $f(A) = B$ and hence
$$f \circ f^{-1} = \mathrm{id}_B .$$
The following theorem gives a relation between the graph of an injective map and the graph of its inverse. In the special case of functions defined on subsets of the real numbers, the theorem characterizes the graph of the inverse of such a function as the reflection of the graph of that function about the line $\{(x, x) \in \mathbb{R}^2 : x \in \mathbb{R}\}$.
Theorem 2.2.44. (Graphs of inverses of maps) Let $A$ and $B$ be sets and $f : A \to B$ be an injective map. Moreover, define $R : A \times B \to B \times A$ by
$$R(x, y) := (y, x)$$
for all $x \in A$ and $y \in B$. Then the graph of the inverse map is given by
$$G(f^{-1}) = R(G(f)) .$$
Proof. ‘$\subset$’: Let $(y, f^{-1}(y))$ be an element of $G(f^{-1})$. Then $y \in f(A)$ and $f^{-1}(y) \in A$ is such that $f(f^{-1}(y)) = y$. Therefore $(f^{-1}(y), y) \in G(f)$ and
$$(y, f^{-1}(y)) = R(f^{-1}(y), y) \in R(G(f)) .$$
‘$\supset$’: Let $(f(x), x)$ be some element of $R(G(f))$. Then $f^{-1}(f(x)) = x$ and hence
$$(f(x), x) = (f(x), f^{-1}(f(x))) \in G(f^{-1}) .$$
Example 2.2.45. Apply Theorem 2.2.44 to the graph of the function $f$ from Example 2.2.32 to draw the graph of its inverse. (See Example 2.2.40.)
Solution: See Fig. 10.

Fig. 10: $G(f)$, $G(f^{-1})$ from Example 2.2.32 and the reflection axis.
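Theorem 2.2.44 can be illustrated on sampled points: swapping the coordinates of points on $G(f)$ yields points on the graph of the inverse. The sketch below uses $f(x) = \sqrt{x + 2}$ and the inverse $y \mapsto y^2 - 2$ from Example 2.2.40; the sample grid and the tolerance are assumptions chosen for illustration.

```python
import math


def f(x):
    """f(x) = sqrt(x + 2) from Example 2.2.32, defined for x >= -2."""
    return math.sqrt(x + 2.0)


def f_inv(y):
    """Inverse of f from Example 2.2.40: y -> y^2 - 2, defined for y >= 0."""
    return y * y - 2.0


# Sample points x = -2, -1.5, ..., 2 and the corresponding points of G(f).
xs = [-2.0 + 0.5 * k for k in range(9)]
graph_f = [(x, f(x)) for x in xs]

# The reflection R(x, y) = (y, x) applied to the sampled graph.
reflected = [(y, x) for (x, y) in graph_f]

# Each reflected point (y, x) lies on the graph of the inverse: f_inv(y) = x.
on_inverse_graph = all(math.isclose(f_inv(y), x, abs_tol=1e-12) for (y, x) in reflected)
print(on_inverse_graph)
```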
Problems
1) Find
$$f([0, \pi/2]) , \ f^{-1}(\{1\}) , \ f^{-1}(\{3\}) , \ f^{-1}([0, 2]) .$$
In addition, find the maximal domain $D \subset \mathbb{R}$ that contains the point $\pi/8$ and is such that $f|_D$ is injective. Finally, calculate the inverse of the map $h : D \to f(D)$ defined by $h(x) := f(x)$ for all $x \in D$.
a) $f(x) := 2 \sin(3x)$, $x \in \mathbb{R}$,
b) $f(x) := 3 \cos(2x)$, $x \in \mathbb{R}$,
c) $f(x) := \tan(x/2)/3$, $x \in \bigcup \{((2k - 1)\pi, (2k + 1)\pi) : k \in \mathbb{Z}\}$.

Fig. 11: Subsets of $\mathbb{R}^2$. Which is the graph of a function?
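2) Define $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \setminus \{1\} \to \mathbb{R}$ by $f(x) := x - 1$ for $x \in \mathbb{R}$ and $g(x) := (x - 1)^2/(x - 1)$ for $x \in \mathbb{R} \setminus \{1\}$. Is $f = g$?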
3) Let f : Df Ñ R be defined such that the given equation below is
satisfied for all x P Df and such that Df € R is a maximal. In each
of the cases, find the corresponding Df , the range of f , and draw the
graph of f :
a)
b)
c)
d)
e)
f)
g)
h)
f pxq x2 3 ,
?
f pxq 1{ x ,
f pxq 1{p1 xq ,
f pxq x2 |x| ,
f pxq x{|x| ,
f pxq |x|1{3 ,
f pxq |x2 1| ,
a
f pxq sinpxq .
4) Which of the subsets of R2 in Fig. 11 is the graph of a function? Give
reasons.
5) Find the function whose graph is given by
a)
b)
c)
px, yq P R2 : x2 y x
px, yq P R2 : x y{py
px, yq P R2 : y2 6xy
(
10 ,
(
1q ,
9x2
(
0
.
6) In each of the following cases, find a bijective function that has domain D and range R and calculate its inverse.
tx P R : 1 ¤ x ¤ 2u, R tx P R : 3 ¤ x ¤ 7u ,
tx P R : 1 ¤ x ¤ 1u, R tx P R : x ¥ 3u .
Define f : Df Ñ R and g : Dg Ñ R such that
a
x1
f pxq :
x2 9
, g pxq : 2
x3
for all x P Df , x P Dg , respectively, and such Df and Dg are
a) D
b) D
7)
a)
maximal. Find the domains and ranges of the functions f and
g. Give explanations.
b) If possible, calculate pf g qp5q and pg f qp5q. Give explanations.
c) f is injective (= ‘one to one’). Calculate its inverse.
8) Is there a function which is identical to its inverse? Is there more than one such function?
9) Define f : R Ñ R, g : R Ñ R and h : R Ñ R by
f pxq : 1
x , g pxq : 1
x
x2 , hpxq : 1 x
for every x P R. Calculate
10)
pf f qpxq , pf gqpxq , pg f qpxq , pg gqpxq ,
pf hqpxq , ph f qpxq , pg hqpxq , ph gqpxq ,
ph hqpxq , rf pg hqspxq , rpf gq hspxq
for every x P R.
Define f : R Ñ R, g : R Ñ R and h : tx P R : x ¡ 0u Ñ R by
f pxq : x a , g pxq : ax , hpxq : xa
for every x in the corresponding domain where a P R. For each of
these functions and every n P N , determine the n-fold composition
with itself.
11) Define f : R Ñ R by
f pxq : r 1
p2 xq1{3 s1{7 , gpxq : cosp2xq
for every x P R. Express f and g as a composition of four functions,
none of which is the identity function. In addition, in the case of g,
the sine function should be among those functions.
12) Let $A$ and $B$ be sets, $f : A \to B$ and $B_1$, $B_2$ be subsets of $B$. Show that
$$f^{-1}(B_1 \cup B_2) = f^{-1}(B_1) \cup f^{-1}(B_2) ,$$
$$f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2) .$$
13) Express the area of an equilateral triangle as a function of the length of a side.
14) Express the surface area of a sphere of radius $r > 0$ as a function of its volume.
15) Consider a circle $S^1_r$ of radius $r > 0$ around the origin of an xy-diagram. Express the length of its intersections with parallels to the y-axis as a function of their distance from the y-axis. Determine the domain and range of that function.
16) From each corner of a rectangular cardboard of side lengths $a > 0$ and $b > 0$, a square of side length $x \ge 0$ is removed, and the edges are turned up to form an open box. Express the volume of the box as a function of $x$ and determine the domain of that function.
17) Consider a body in the earth’s gravitational field which is at rest at time $t = 0$ and at height $s_0 > 0$ above the surface. Its height $s$ and speed $v$ as a function of time $t$ are given by
$$s(t) = s_0 - \frac{1}{2} g t^2 , \quad v(t) = g t$$
where $g$ is approximately $9.81\,\mathrm{m/s^2}$. Determine the domain and range of the functions $s$ and $v$. In addition, express $s$ as a function of the speed and determine domain and range.
Fig. 12: Hexagons inscribed in and circumscribed about the unit circle.
2.3 Limits and Continuous Functions
2.3.1 Limits of Sequences of Real Numbers
To motivate infinite processes, we consider one of their early examples, namely Archimedes’ measurement of the circle. Archimedes considered regular polygons of 6, 12, 24, . . . sides inscribed in and circumscribed about the unit circle in order to achieve rational estimates of its circumference of increasing accuracy. Since trigonometric functions were not known at his time, he used, in contrast to the reasoning below, elementary geometric methods to derive the relation (2.3.1) below. Such a derivation is given as an exercise. See Problem 6 below.
For every $n = 6, 12, 24, \dots$, we define a corresponding $s_n$ as the circumference of the regular polygon of $n$ sides inscribed in the unit circle. Since geometric intuition suggests that the shortest connection of two points in the plane is a straight line, we expect $s_n$ to give a lower bound of the circumference of the unit circle, i.e., of $2\pi$. For the same reason, we expect, see Fig. 13, that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing. The proof of this is given as an exercise. See Problem 7 below.

Fig. 13: Depiction to Archimedes’ measurement of the circle. The dots in the corners C and D indicate right angles.

In particular,
$$s_n = n \, l_n$$
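where $l_n$ is the length of the side of the polygon. From Fig. 13, we conclude that
$$\sin\left(\frac{\pi}{n}\right) = \frac{l_n}{2} , \quad \sin\left(\frac{\pi}{2n}\right) = \frac{l_{2n}}{2} .$$
Further, it follows that
$$\sin\left(\frac{\pi}{n}\right) = \sin\left(2 \cdot \frac{\pi}{2n}\right) = 2 \sin\left(\frac{\pi}{2n}\right) \cos\left(\frac{\pi}{2n}\right) = 2 \sin\left(\frac{\pi}{2n}\right) \sqrt{1 - \sin^2\left(\frac{\pi}{2n}\right)}$$
and hence that
$$\left[\sin^2\left(\frac{\pi}{2n}\right)\right]^2 - \sin^2\left(\frac{\pi}{2n}\right) + \frac{1}{4} \sin^2\left(\frac{\pi}{n}\right) = 0 .$$
The last implies that
$$\sin^2\left(\frac{\pi}{2n}\right) = \frac{1}{2} \left[1 - \sqrt{1 - \sin^2\left(\frac{\pi}{n}\right)}\right]$$
and hence that
$$l_{2n}^2 = 4 \sin^2\left(\frac{\pi}{2n}\right) = 2 \left[1 - \sqrt{1 - \frac{l_n^2}{4}}\right] = \frac{l_n^2/2}{1 + \sqrt{1 - \frac{l_n^2}{4}}} .$$
Finally, we arrive at the recursion relation
$$l_{2n}^2 = \frac{l_n^2}{2 + \sqrt{4 - l_n^2}} \qquad (2.3.1)$$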
which Archimedes used to obtain the length of the sides of the $2n$-gon from that of the $n$-gon. He started from $l_6 = 1$ to obtain
$$l_{12}^2 = \frac{1}{2 + \sqrt{3}} = 2 - \sqrt{3} .$$
In the next step, he used the approximation
$$\sqrt{3} \approx \frac{1351}{780}$$
to obtain a lower bound for $s_{12}$. Continuing in this fashion up to the 96-gon, he arrived at the approximation
$$s_{96} \approx 6 \tfrac{20}{71}$$
which gives the circumference of the circle, i.e., $2\pi$, within an error of $2 \cdot 10^{-3}$. Note that far better approximations to $2\pi$ were already known to the ancient Babylonians. More important is the fact that this method could be used to calculate $2\pi$ to arbitrary precision, i.e., within an error less than an arbitrarily small preassigned error bound $\varepsilon > 0$.
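The recursion (2.3.1) is easy to iterate numerically. The following sketch uses floating-point square roots rather than Archimedes’ rational bounds, so it illustrates the recursion and is not a reconstruction of his calculation; the function name is an assumption. It starts from $l_6 = 1$ and doubles the number of sides four times to reach the 96-gon.

```python
import math


def archimedes_circumferences(doublings):
    """Circumferences s_n = n * l_n of inscribed regular n-gons, n = 6, 12, 24, ...

    Uses the recursion l_{2n}^2 = l_n^2 / (2 + sqrt(4 - l_n^2)) with l_6 = 1.
    Returns a list of pairs (n, s_n).
    """
    n, l_squared = 6, 1.0
    results = [(n, n * math.sqrt(l_squared))]
    for _ in range(doublings):
        l_squared = l_squared / (2.0 + math.sqrt(4.0 - l_squared))
        n *= 2
        results.append((n, n * math.sqrt(l_squared)))
    return results


polygons = archimedes_circumferences(4)  # n = 6, 12, 24, 48, 96
for n, s_n in polygons:
    print(n, s_n, 2.0 * math.pi - s_n)
```

In this run the values $s_n$ increase and stay below $2\pi$, in line with the geometric expectation stated above.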
Given such an error bound $\varepsilon > 0$, and taking into account that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing, we expect that there is some corresponding natural number $N$ such that
$$2\pi - s_{2n} < \varepsilon$$
for all natural numbers $n$ such that $n \ge N$. Indeed, this expectation turns out to be correct later. Since
$$2\pi - s_{2n} = |s_{2n} - 2\pi|$$
for all $n \in \mathbb{N}$, $n \ge 6$, we note that our expectation is equivalent to the statement that for every arbitrary preassigned error bound $\varepsilon > 0$, there is some corresponding natural number $N$ such that
$$|s_{2n} - 2\pi| < \varepsilon$$
for all natural numbers $n$ such that $n \ge N$. The last is also used to define the limit of a sequence of real numbers in general.

Fig. 14: Dodecagon inscribed in a unit circle.
Definition 2.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$ and $x \in \mathbb{R}$. Then we define
$$\lim_{n \to \infty} x_n = x$$
if for every $\varepsilon > 0$, there is a corresponding $n_0$ such that for all $n \ge n_0$
$$|x_n - x| < \varepsilon ,$$
i.e., from the $n_0$-th member on, all remaining members of the sequence are within a distance from $x$ which is less than $\varepsilon$.¹ In this case, we say that the sequence $x_1, x_2, \dots$ is convergent to $x$. Note that this implies that for every $\varepsilon > 0$
$$|x_n| = |x_n - x + x| \le |x_n - x| + |x| \le \varepsilon + |x|$$
for all $n \in \mathbb{N}^*$, apart from finitely many members of the sequence, and hence that $x_1, x_2, \dots$ is bounded, i.e., that there is $M \ge 0$ such that $|x_n| \le M$ for all $n \in \mathbb{N}^*$. If the sequence is not convergent to any real number, we call the sequence divergent.
¹ As a consequence, only finitely many members have distance $\ge \varepsilon$ from $x$.

Fig. 15: $(n, (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.
Example 2.3.2. Let $a$ be some real number and $x_n := a$ for all $n \in \mathbb{N}^*$. Then
$$\lim_{n \to \infty} x_n = a .$$
Indeed, if $\varepsilon > 0$ is given, then
$$|x_n - a| = |a - a| = 0$$
for all $n \in \mathbb{N}^*$. Hence we can choose $N = 1$. Note that in this simple case, the chosen $N$ works for every $\varepsilon > 0$. In general this will be impossible.

Fig. 16: $(n, (-1)^n (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.
Fig. 17: $(n, (n^2 + 1)/n)$ for $n = 1$ to $n = 50$ and an asymptote.
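Example 2.3.3. Investigate whether the following limits exist.
(i)
$$\lim_{n \to \infty} \frac{n + 1}{n} , \qquad (2.3.2)$$
(ii)
$$\lim_{n \to \infty} (-1)^n \, \frac{n + 1}{n} , \qquad (2.3.3)$$
(iii)
$$\lim_{n \to \infty} \frac{n^2 + 1}{n} . \qquad (2.3.4)$$
Solution: Fig. 15, Fig. 16 and Fig. 17 suggest that the limit (2.3.2) is 1, whereas the limits (2.3.3), (2.3.4) do not exist. Indeed
$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 . \qquad (2.3.5)$$
For the proof, let $\varepsilon$ be some real number $> 0$. Further, let $n_0$ be some natural number $> 1/\varepsilon$. Then it follows for every $n \in \mathbb{N}$ such that $n \ge n_0$:
$$\left|\frac{n + 1}{n} - 1\right| = \frac{1}{n} \le \frac{1}{n_0} < \varepsilon$$
and hence the statement (2.3.5). The proof that (2.3.3) does not exist proceeds indirectly. Assume on the contrary that there is some $x \in \mathbb{R}$ such that
$$\lim_{n \to \infty} (-1)^n \, \frac{n + 1}{n} = x .$$
Then there is some $n_0 \in \mathbb{N}$ such that
$$\left|(-1)^n \, \frac{n + 1}{n} - x\right| < \frac{1}{4}$$
for all $n \in \mathbb{N}$ such that $n \ge n_0$. Without restriction of generality, we can assume that $n_0 \ge 4$. Then it follows for any even $n \in \mathbb{N}$ such that $n \ge n_0$:
$$|x - 1| = \left|\frac{n + 1}{n} - x - \frac{1}{n}\right| \le \left|\frac{n + 1}{n} - x\right| + \frac{1}{n} \le \frac{1}{4} + \frac{1}{n_0} \le \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$$
and for any odd $n \in \mathbb{N}$ such that $n \ge n_0$:
$$|x + 1| = \left|-\frac{n + 1}{n} - x + \frac{1}{n}\right| \le \left|-\frac{n + 1}{n} - x\right| + \frac{1}{n} \le \frac{1}{4} + \frac{1}{n_0} \le \frac{1}{4} + \frac{1}{4} = \frac{1}{2} ,$$
and hence we arrive at the contradiction that
$$2 = |x - 1 - (x + 1)| \le |x - 1| + |x + 1| \le \frac{1}{2} + \frac{1}{2} = 1 .$$
Hence our assumption that (2.3.3) exists is false. The proof that (2.3.4) does not exist proceeds indirectly, too. Assume on the contrary that there is some $x \in \mathbb{R}$ such that
$$\lim_{n \to \infty} \frac{n^2 + 1}{n} = x .$$
Further, let $\varepsilon$ be some real number $> 0$. Finally, let $n_0$ be some natural number $\ge |x| + \varepsilon$. Then it follows for $n \ge n_0$ that
$$\left|\frac{n^2 + 1}{n} - x\right| = \left|n - x + \frac{1}{n}\right| = n - x + \frac{1}{n} > n - x \ge |x| + \varepsilon - x \ge \varepsilon .$$
Hence there is an infinite number of members of the sequence that have a distance from $x$ which is greater than $\varepsilon$. This contradicts the existence of a limit of (2.3.4). Hence such a limit does not exist.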
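The choice of $n_0$ in the proof of (2.3.5) can be made concrete. The sketch below uses exact rational arithmetic and picks the smallest natural number greater than $1/\varepsilon$; the function names and the finite number of checked indices are assumptions for illustration, so the check supports the estimate but does not replace the proof.

```python
import math
from fractions import Fraction


def x(n):
    """Member (n + 1) / n of the sequence from Example 2.3.3 (i), as an exact fraction."""
    return Fraction(n + 1, n)


def n0_for(eps):
    """Smallest natural number strictly greater than 1 / eps."""
    return math.floor(1 / eps) + 1


eps = Fraction(1, 100)
n0 = n0_for(eps)
print(n0)  # 101

# From n0 on, every checked member is closer to the limit 1 than eps.
within = all(abs(x(n) - 1) < eps for n in range(n0, n0 + 1000))
print(within)  # True
```

A smaller $\varepsilon$ forces a larger $n_0$, which is the typical situation mentioned at the end of Example 2.3.2.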
The alert reader might have noticed that Definition 2.3.1 might turn out to be inconsistent with logic, and then would have to be abandoned, if it turned out that some sequence has more than one limit point. Part (i) of the following Theorem 2.3.4 says that this is impossible.
In particular, this theorem says that a sequence in $\mathbb{R}$ can have at most one limit point (in part (i)), that the sequence consisting of the sums of the members of convergent sequences in $\mathbb{R}$ is convergent to the sum of their limits (in part (ii)), that the sequence consisting of the products of the members of convergent sequences in $\mathbb{R}$ is convergent to the product of their limits (in part (iii)) and that the sequence consisting of the inverses of the members of a sequence convergent to a non-zero real number is convergent to the inverse of that number (in part (iv)).
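Theorem 2.3.4. (Limit Laws) Let $x_1, x_2, \dots$; $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ and $x, \bar{x}, y \in \mathbb{R}$.
(i) If
$$\lim_{n \to \infty} x_n = x \ \text{ and } \ \lim_{n \to \infty} x_n = \bar{x} ,$$
then $\bar{x} = x$.
(ii) If
$$\lim_{n \to \infty} x_n = x \ \text{ and } \ \lim_{n \to \infty} y_n = y ,$$
then
$$\lim_{n \to \infty} (x_n + y_n) = x + y .$$
(iii) If
$$\lim_{n \to \infty} x_n = x \ \text{ and } \ \lim_{n \to \infty} y_n = y ,$$
then
$$\lim_{n \to \infty} x_n y_n = x y .$$
(iv) If
$$\lim_{n \to \infty} x_n = x \ \text{ and } \ x \ne 0 ,$$
then
$$\lim_{n \to \infty} \frac{1}{x_n} = \frac{1}{x} .$$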
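Proof. ‘(i)’: The proof is indirect. Assume that the assumption in (i) is true and that $x \ne \bar{x}$. Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ satisfying $n \ge n_0$:
$$|x_n - x| < \frac{1}{2} |\bar{x} - x| \ \text{ and } \ |x_n - \bar{x}| < \frac{1}{2} |\bar{x} - x| .$$
This yields the contradiction that
$$|\bar{x} - x| = |\bar{x} - x_n + x_n - x| \le |\bar{x} - x_n| + |x_n - x| < |\bar{x} - x| .$$
Hence it follows that $\bar{x} = x$. ‘(ii)’: Assume that the assumption in (ii) is true. Further, let $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ with $n \ge n_0$:
$$|x_n - x| < \frac{\varepsilon}{2} \ \text{ and } \ |y_n - y| < \frac{\varepsilon}{2}$$
and hence
$$|x_n + y_n - (x + y)| \le |x_n - x| + |y_n - y| < \varepsilon .$$
‘(iii)’: Assume that the assumption in (iii) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta (\delta + |x| + |y|) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ with $n \ge n_0$:
$$|x_n - x| < \frac{\delta}{2} \ \text{ and } \ |y_n - y| < \frac{\delta}{2} .$$
Then
$$|x_n y_n - x y| = |x_n y_n - x_n y + x_n y - x y| \le |x_n| \, |y_n - y| + |x_n - x| \, |y| \le |x_n - x| \, |y_n - y| + |x| \, |y_n - y| + |x_n - x| \, |y| < \varepsilon .$$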
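‘(iv)’: Assume that the assumption in (iv) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta < |x|$ and $\delta / (|x| (|x| - \delta)) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ satisfying $n \ge n_0$:
$$| \, |x_n| - |x| \, | \le |x_n - x| < \delta ,$$
and hence also
$$|x_n| > |x| - \delta > 0$$
and
$$\left|\frac{1}{x_n} - \frac{1}{x}\right| = \frac{|x_n - x|}{|x_n| \, |x|} < \frac{\delta}{(|x| - \delta) \, |x|} < \varepsilon .$$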
Remark 2.3.5. The previous theorem is of fundamental importance in the
investigation of sequences. Usually, it is applied as follows. First, a given
sequence of real numbers is decomposed into combinations of sums, products, quotients of sequences whose convergence is already known. Then the
application of the theorem proves the convergence of the sequence and allows the calculation of its limit if the limits of those constituents are known.
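Example 2.3.6. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n}$$
for all $n \in \mathbb{N}^*$. Solution: In Example 2.3.3, we proved that
$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 .$$
Since
$$\frac{1}{n} = \frac{n + 1}{n} + (-1)$$
for every $n \in \mathbb{N}^*$, Theorem 2.3.4 and Example 2.3.2 imply the existence of
$$\lim_{n \to \infty} \frac{1}{n}$$
and that
$$\lim_{n \to \infty} \frac{1}{n} = \lim_{n \to \infty} \left[\frac{n + 1}{n} + (-1)\right] = \lim_{n \to \infty} \frac{n + 1}{n} + \lim_{n \to \infty} (-1) = 1 + (-1) = 0 .$$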
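Example 2.3.7. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n + a}$$
for all $n \in \mathbb{N}^*$ and $a \ge 0$. Solution: First, we notice that
$$x_n = \frac{1}{n + a} = \frac{1}{n} \cdot \frac{1}{1 + \frac{a}{n}} \qquad (2.3.6)$$
for every $n \in \mathbb{N}^*$. Further, Theorem 2.3.4, Example 2.3.2 and Example 2.3.6 imply the existence of
$$\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)$$
and that
$$\lim_{n \to \infty} \left(1 + \frac{a}{n}\right) = \lim_{n \to \infty} 1 + \lim_{n \to \infty} a \cdot \lim_{n \to \infty} \frac{1}{n} = 1 + a \cdot 0 = 1 .$$
Since the last is different from 0, it follows by Theorem 2.3.4 that
$$\lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} = \frac{1}{\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)} = \frac{1}{1} = 1 .$$
Finally, again by application of Theorem 2.3.4, this and Example 2.3.6 imply the convergence of $x_1, x_2, \dots$ and that
$$\lim_{n \to \infty} x_n = \lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} \cdot \lim_{n \to \infty} \frac{1}{n} = 1 \cdot 0 = 0 .$$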
Remark 2.3.8. Note that the result in the last Example is unchanged if $a$ is some arbitrary real number. Only if $a$ is some integer $< 0$, the term $x_{-a}$ has to be excluded from the sequence because it is undefined.
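Example 2.3.9. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{3n + 2}{2n + 1}$$
for all $n \in \mathbb{N}^*$. Solution: First, we notice that
$$x_n = \frac{3n + 2}{2n + 1} = \frac{\frac{3}{2} (2n + 1) + \frac{1}{2}}{2n + 1} = \frac{3}{2} + \frac{1}{2} \cdot \frac{1}{2n + 1} = \frac{3}{2} + \frac{1}{4} \cdot \frac{1}{n + \frac{1}{2}} .$$
Hence Theorem 2.3.4, Example 2.3.2 and Example 2.3.7 imply the convergence of $x_1, x_2, \dots$ and that
$$\lim_{n \to \infty} x_n = \lim_{n \to \infty} \frac{3}{2} + \lim_{n \to \infty} \frac{1}{4} \cdot \lim_{n \to \infty} \frac{1}{n + \frac{1}{2}} = \frac{3}{2} + \frac{1}{4} \cdot 0 = \frac{3}{2} .$$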
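The decomposition used in Example 2.3.9 can be verified with exact rational arithmetic. The sketch below is an illustration only; the function names and the tested index range are assumptions. It compares $(3n + 2)/(2n + 1)$ with $\frac{3}{2} + \frac{1}{4} \cdot \frac{1}{n + 1/2}$ and looks at the distance to the limit $3/2$.

```python
from fractions import Fraction


def x(n):
    """Member (3n + 2) / (2n + 1) of the sequence, as an exact fraction."""
    return Fraction(3 * n + 2, 2 * n + 1)


def decomposed(n):
    """The same member written as 3/2 + (1/4) * 1 / (n + 1/2)."""
    return Fraction(3, 2) + Fraction(1, 4) / (n + Fraction(1, 2))


# The two expressions agree for every tested index.
identity_holds = all(x(n) == decomposed(n) for n in range(1, 201))

# The distance to 3/2 equals 1 / (4n + 2) and therefore shrinks with n.
distances = [x(n) - Fraction(3, 2) for n in (1, 10, 100, 10 ** 6)]
print(identity_holds, [str(d) for d in distances])
```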
The following is a comparison theorem that allows us to conclude the convergence of one sequence from the convergence of another.
Theorem 2.3.10. Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of real numbers such that
$$|x_n| \le y_n$$
for all $n \in \mathbb{N}^*$. Further, let
$$\lim_{n \to \infty} y_n = 0 .$$
Then
$$\lim_{n \to \infty} x_n = 0 .$$
Proof. Let $\varepsilon > 0$. Since $y_1, y_2, \dots$ is convergent to 0, there is $n_0 \in \mathbb{N}$ such that
$$|x_n| \le y_n = |y_n| < \varepsilon$$
for all $n \ge n_0$. Hence it follows that $x_1, x_2, \dots$ is convergent to 0.
Example 2.3.11. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n^2 + a^2}$$
for all $n \in \mathbb{N}^*$ and $a \in \mathbb{R}$. Solution: We note that for every $n \in \mathbb{N}^*$
$$\frac{1}{n^2 + a^2} \le \frac{1}{n} .$$
Hence it follows by Theorem 2.3.10 and Example 2.3.6 that
$$\lim_{n \to \infty} x_n = 0 .$$
The following theorem is often used in the analysis of convergent sequences whose limits cannot readily be determined. In this way, by approximation of the members of the sequence, estimates of its limit can frequently be derived.
Theorem 2.3.12. (Limits preserve inequalities) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ converging to $x, y \in \mathbb{R}$, respectively. Further, let $x_n \le y_n$ for all $n \in \mathbb{N}^*$. Then also $x \le y$.
Proof. The proof is indirect. Assume on the contrary that $x > y$. Then it follows the existence of an $n \in \mathbb{N}^*$ such that both
$$x - x_n \le |x_n - x| < \frac{1}{2} (x - y) , \quad y_n - y \le |y_n - y| < \frac{1}{2} (x - y)$$
and hence the contradiction
$$x - y \le x - y + y_n - x_n < x - y .$$
Hence $x \le y$.
Example 2.3.13. Define the sequence $x_1, x_2, \dots$ recursively by
$$x_{n+1} := \frac{1}{2} \left(x_n + \frac{a}{x_n}\right)$$
for all $n \in \mathbb{N}^*$ where $x_1 > 0$ and $a \ge 0$. Show that
$$\lim_{n \to \infty} x_n \ge \sqrt{a} \qquad (2.3.7)$$
if $x_1, x_2, \dots$ converges. Solution: For every $x > 0$, it follows that
$$0 \le (x - \sqrt{a})^2 = x^2 - 2 \sqrt{a} \, x + a$$
and hence that
$$\frac{1}{2} \left(x + \frac{a}{x}\right) = \frac{1}{2x} (x^2 + a) \ge \frac{2 \sqrt{a} \, x}{2x} = \sqrt{a} .$$
Therefore, since $x_1 > 0$, it follows inductively that $x_n > 0$ for all $n \in \mathbb{N}^*$ and hence that
$$x_n \ge \sqrt{a}$$
for all $n \in \mathbb{N}^* \setminus \{1\}$. Hence if $x_1, x_2, \dots$ is convergent, the validity of (2.3.7) follows from Theorem 2.3.12.
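The inequality $x_n \ge \sqrt{a}$ for $n \ge 2$ in Example 2.3.13 can be observed on a concrete run of the recursion. The sketch below uses exact fractions so that the comparison $x_n^2 \ge a$ is free of rounding effects; the starting value, the value of $a$ and the number of steps are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def recursive_sequence(a, x1, steps):
    """Members x_1, ..., x_{steps + 1} of x_{n+1} = (x_n + a / x_n) / 2."""
    xs = [x1]
    for _ in range(steps):
        xs.append((xs[-1] + a / xs[-1]) / 2)
    return xs


a = Fraction(2)
xs = recursive_sequence(a, Fraction(1, 10), 10)

# x_1 may lie below sqrt(a); every later member satisfies x_n^2 >= a.
print(xs[0] ** 2 >= a)                  # False
print(all(x * x >= a for x in xs[1:]))  # True
print(float(xs[-1]), math.sqrt(2.0))
```

The last line only compares the final member with a floating-point value of $\sqrt{2}$. That the sequence actually converges is not established by the example above.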
In many cases, in particular in those related to applications where sequences are often defined recursively, it is not obvious how to decide whether a given sequence is convergent or divergent. Then one usually tries first to establish the existence of a limit by application of a very general theorem, i.e., a theorem that is applicable to a very large class of sequences that have only few specific properties. If the sequence is found to be convergent, the determination of its limit or the derivation of estimates of that limit is performed in subsequent steps. The derivation of such general theorems is the goal in the following.
For this, we notice that Definition 2.3.1 is not of much use for deciding the convergence of a given sequence if there is no obvious candidate for its limit. Therefore it is natural to ask whether there is a general way to decide that convergence without reference to a limit. Indeed, this is possible by means of the so-called Cauchy criterion. For its formulation, we need the notion of Cauchy sequences. Roughly speaking, a sequence $x_1, x_2, \dots$ of real numbers is called a Cauchy sequence if for every arbitrary preassigned error bound $\varepsilon > 0$, after omission of finitely many terms of the sequence, the distance between every two members of the remaining sequence is smaller than $\varepsilon$.
Definition 2.3.14. (Cauchy sequences) We call a sequence $x_1, x_2, \dots$ of real numbers a Cauchy sequence if for every $\varepsilon > 0$ there is a corresponding $n_0 \in \mathbb{N}$ such that
$$|x_m - x_n| < \varepsilon$$
for all $m, n \in \mathbb{N}$ satisfying $m \ge n_0$ and $n \ge n_0$.
Example 2.3.15. Define $x_1 := 0$, $x_2 := 1$ and
\[ x_{n+2} := \frac{1}{2} \, (x_n + x_{n+1}) \]
for all $n \in \mathbb{N}^*$. Show that $x_1, x_2, \dots$ is a Cauchy sequence. Solution: First, it follows for every $n \in \mathbb{N}^*$ that $x_{n+2}$ is the midpoint of the interval $I_n$ between $x_n$ and $x_{n+1}$ given by $I_n := [x_n, x_{n+1}]$ if $x_n \leq x_{n+1}$ and $I_n := [x_{n+1}, x_n]$ if $x_n > x_{n+1}$. Further,
\[ x_{n+2} - x_{n+1} = \frac{1}{2} \, (x_n + x_{n+1}) - x_{n+1} = - \frac{1}{2} \, (x_{n+1} - x_n) . \]
Hence it follows by the method of induction that $I_1 \supset I_2 \supset I_3 \supset \dots$ and that
\[ x_{n+1} - x_n = \frac{(-1)^{n-1}}{2^{n-1}} . \]
As a consequence, if $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ is such that $1/2^{n_0 - 1} < \varepsilon$, then it follows for $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$ that $x_m, x_n \in I_{n_0}$ and therefore that
\[ |x_m - x_n| \leq \frac{1}{2^{n_0 - 1}} < \varepsilon . \]
Hence $x_1, x_2, \dots$ is a Cauchy sequence. See Fig. 18.
Fig. 18: $(n, x_n)$ from Example 2.3.15 for $n = 1$ to $n = 50$.
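The estimates of Example 2.3.15 can be checked on the first 50 terms of the sequence. The following is a minimal sketch in Python using exact rational arithmetic; the function names `midpoint_sequence` and `max_spread` are illustrative and not part of the text, and a finite check of course does not replace the induction argument.

```python
from fractions import Fraction

def midpoint_sequence(count):
    """Return [x_1, ..., x_count] with x_1 = 0, x_2 = 1, x_{n+2} = (x_n + x_{n+1}) / 2."""
    xs = [Fraction(0), Fraction(1)]
    while len(xs) < count:
        xs.append((xs[-2] + xs[-1]) / 2)
    return xs[:count]

def max_spread(xs, n0):
    """Largest distance |x_m - x_n| among the stored terms with m, n >= n0 (1-based)."""
    tail = xs[n0 - 1:]
    return max(tail) - min(tail)

xs = midpoint_sequence(50)

# Consecutive differences: x_{n+1} - x_n = (-1)^(n-1) / 2^(n-1).
# Note that xs[n] stores x_{n+1} because Python lists are 0-based.
diffs_ok = all(
    xs[n] - xs[n - 1] == Fraction((-1) ** (n - 1), 2 ** (n - 1))
    for n in range(1, 50)
)

# Cauchy estimate: all stored terms from index n0 on differ by at most 1 / 2^(n0 - 1).
spread_ok = all(
    max_spread(xs, n0) <= Fraction(1, 2 ** (n0 - 1))
    for n0 in range(1, 50)
)

print(diffs_ok, spread_ok, float(xs[-1]))
```

Exact fractions are used so that the equality test for the differences is not disturbed by rounding.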
The following is easy to show.
Theorem 2.3.16. Every convergent sequence of real numbers is a Cauchy sequence.
Proof. For this, let $x_1, x_2, \dots$ be a sequence of real numbers converging to some $x \in \mathbb{R}$ and $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}^*$ such that
\[ |x_n - x| < \varepsilon/2 \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. The last implies that
\[ |x_m - x_n| = |x_m - x - (x_n - x)| \leq |x_m - x| + |x_n - x| < \varepsilon \]
for all $n, m \in \mathbb{N}^*$ satisfying $n \geq n_0$ and $m \geq n_0$. Hence $x_1, x_2, \dots$ is a Cauchy sequence.
The converse statement, that every Cauchy sequence of real numbers is convergent, is not obvious; it is a deep property of the real number system. It is proved in the Appendix, see the proof of Theorem 5.1.11, in the framework of Cantor’s construction of the real number system by completion of the rational numbers using Cauchy sequences. The most important parts of calculus / analysis are based on the following theorem or, equivalently, on the Bolzano-Weierstrass theorem below.
Theorem 2.3.17. (Completeness of the real numbers) Every Cauchy
sequence of real numbers is convergent.
Proof. See the proof of Theorem 5.1.11 in the Appendix.
In the following, we derive far reaching consequences of the completeness
of the real numbers.
Theorem 2.3.18. (Bolzano-Weierstrass) For every bounded sequence $x_1, x_2, \dots$ of real numbers there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent.
Proof. For this let $x_1, x_2, \dots$ be a bounded sequence of real numbers. Then we define
\[ S := \{ x_1, x_2, \dots \} . \]
In case that $S$ is finite, at least one value is assumed by infinitely many members of the sequence, so there is a subsequence $x_{n_1}, x_{n_2}, \dots$ which is constant and hence convergent. In case that $S$ is infinite, we choose some element $x_{n_1}$ of the sequence. Since $S$ is bounded, there is $a > 0$ such that $S \subset I_1 := [-a/4, a/4]$. At least one of the intervals $[-a/4, 0]$, $[0, a/4]$ contains infinitely many elements of $S$. We choose such an interval $I_2$ and $x_{n_2} \in I_2$ such that $n_2 > n_1$. In particular $I_2 \subset I_1$. Bisecting $I_2$ into two intervals, we can choose a subinterval $I_3 \subset I_2$ containing infinitely many elements of $S$ and $x_{n_3} \in I_3$ such that $n_3 > n_2$. Continuing this process, we arrive at a sequence of intervals $I_1, I_2, \dots$ such that $I_1 \supset I_2 \supset \dots$ and such that the length of $I_k$ is $a/2^k$ for every $k \in \mathbb{N}^*$. Also, we arrive at a subsequence $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ such that $x_{n_k} \in I_k$ for every $k \in \mathbb{N}^*$. For given $\varepsilon > 0$, there is $k_0 \in \mathbb{N}^*$ such that $a/2^{k_0} < \varepsilon$. Further, let $k, l \in \mathbb{N}^*$ be such that $k \geq k_0$ and $l \geq k_0$. Then it follows that $x_{n_k} \in I_{k_0}$, $x_{n_l} \in I_{k_0}$ and therefore that
\[ |x_{n_k} - x_{n_l}| \leq a/2^{k_0} < \varepsilon . \]
Hence $x_{n_1}, x_{n_2}, \dots$ is a Cauchy sequence and therefore convergent according to Theorem 2.3.17.
For the following, the Bolzano-Weierstrass theorem will be fundamental.
It will be applied in the proofs of a number of important theorems, for
instance, Theorem 2.3.33, Theorem 2.3.44 and Theorem 3.5.59. Also the
following theorem is an important and frequently applied consequence of
Bolzano-Weierstrass’ theorem. Until the beginning of the 19th century its
statement must have been considered geometrically obvious because it
was used without mention. For instance, in Augustin-Louis Cauchy’s
textbook ‘Cours d’analyse’ from 1821 [22], it is implicitly used in the proof
of the intermediate value theorem, see Theorem 2.3.37 below, but without
proof. From today’s perspective, it is clear that such geometric intuition
was based on an illusion.
Theorem 2.3.19. Let $x_1, x_2, \dots$ be an increasing sequence of real numbers, i.e., such that $x_n \leq x_{n+1}$ for all $n \in \mathbb{N}$, which is also bounded from above, i.e., for which there is $M \geq 0$ such that $x_n \leq M$ for all $n \in \mathbb{N}$. Then $x_1, x_2, \dots$ is convergent.
Proof. Since $x_1, x_2, \dots$ is increasing and bounded from above, it follows that this sequence is also bounded. Hence according to the previous theorem, there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent. We denote the limit of this subsequence by $x$. Then
\[ x_n \leq x \]
for all $n \in \mathbb{N}^*$. Otherwise, there is $m \in \mathbb{N}^*$ such that $x_m > x$. If $k_0 \in \mathbb{N}^*$ is such that $n_{k_0} \geq m$, then
\[ x_{n_k} \geq x_m > x \]
for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. This implies that
\[ \lim_{k \to \infty} x_{n_k} \geq x_m > x , \]
which contradicts the definition of $x$. Further, for $\varepsilon > 0$, there is $k_0$ such that
\[ |x_{n_k} - x| < \varepsilon \]
for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. Hence it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq n_{k_0}$ that
\[ |x_n - x| = x - x_n \leq x - x_{n_{k_0}} = |x_{n_{k_0}} - x| < \varepsilon . \]
Therefore, $x_1, x_2, \dots$ is convergent to $x$.
Corollary 2.3.20. Let $x_1, x_2, \dots$ be a decreasing sequence of real numbers, i.e., such that $x_{n+1} \leq x_n$ for all $n \in \mathbb{N}$, which is also bounded from below, i.e., for which there is a real $M \geq 0$ such that $x_n \geq -M$ for all $n \in \mathbb{N}$. Then $x_1, x_2, \dots$ is convergent.
Proof. The sequence $-x_1, -x_2, \dots$ is increasing, bounded from above and therefore convergent to a real number $x$ by the previous theorem. Hence $x_1, x_2, \dots$ is convergent to $-x$.
Example 2.3.21. Show that the sequence $x_1, x_2, \dots$ defined by $x_1 := 1/2$ and
\[ x_n := \frac{1 \cdot 3 \cdots (2n-1)}{2 \cdot 4 \cdots (2n)} \]
for all $n \in \mathbb{N}^* \setminus \{1\}$ is convergent. Solution: The sequence $x_1, x_2, \dots$ is bounded from below by $0$. In addition,
\[ x_{n+1} = \frac{2n+1}{2 (n+1)} \, x_n \leq x_n \]
for all $n \in \mathbb{N}^*$ and hence $x_1, x_2, \dots$ is decreasing. Hence $x_1, x_2, \dots$ is convergent according to Corollary 2.3.20. See Fig. 19.
Fig. 19: $(n, x_n)$ from Example 2.3.21 for $n = 1$ to $n = 50$.
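The two hypotheses of Corollary 2.3.20 used in Example 2.3.21 (bounded from below by 0, decreasing) can be checked on the first terms. A minimal sketch in Python with exact fractions follows; the name `odd_even_ratio` is illustrative, and the check covers only the first 50 terms.

```python
from fractions import Fraction

def odd_even_ratio(n):
    """x_n = (1 * 3 * ... * (2n - 1)) / (2 * 4 * ... * (2n)) as an exact fraction."""
    value = Fraction(1)
    for k in range(1, n + 1):
        value *= Fraction(2 * k - 1, 2 * k)
    return value

terms = [odd_even_ratio(n) for n in range(1, 51)]   # terms[0] is x_1

# Bounded from below by 0.
positive = all(t > 0 for t in terms)

# Decreasing: x_{n+1} <= x_n.
decreasing = all(later <= earlier for earlier, later in zip(terms, terms[1:]))

# Recursion used in the solution: x_{n+1} = (2n + 1) / (2 (n + 1)) * x_n.
recursion_ok = all(
    terms[n] == Fraction(2 * n + 1, 2 * (n + 1)) * terms[n - 1]
    for n in range(1, 50)
)

print(positive, decreasing, recursion_ok)
```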
Definition 2.3.22. Let $S$ be a non-empty subset of $\mathbb{R}$. We say that $S$ is bounded from above (bounded from below) if there is $M \in \mathbb{R}$ such that $x \leq M$ ($x \geq M$) for all $x \in S$.
The following theorem can be considered as a variation of Theorem 2.3.19
which is also in frequent use. Its power will be demonstrated in the subsequent example.
Theorem 2.3.23. Let S be a non-empty subset of R which is bounded from
above (bounded from below). Then there is a least upper bound (largest
lower bound) of S which will be called the supremum of S (infimum of S)
and denoted by sup S (inf S).
Proof. First, we consider the case that $S$ is bounded from above. For this, we define the subsets $A$, $B$ of $\mathbb{R}$ consisting of all real numbers that are no upper bounds of $S$ and of all upper bounds of $S$, respectively,
\[ A := \{ a \in \mathbb{R} : \text{There is } x \in S \text{ such that } x > a \} , \quad B := \{ b \in \mathbb{R} : x \leq b \text{ for all } x \in S \} . \]
Since $S$ is non-empty and bounded from above, these sets are non-empty. In addition, for every $a \in A$ and every $b \in B$, it follows that $a < b$. Let $a_1 \in A$ and $b_1 \in B$. Recursively, we construct an increasing sequence $a_1, a_2, \dots$ in $A$ and a decreasing sequence $b_1, b_2, \dots$ in $B$ by
\[ a_{n+1} := \begin{cases} (a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in A \\ a_n & \text{if } (a_n + b_n)/2 \in B \end{cases} , \quad b_{n+1} := \begin{cases} b_n & \text{if } (a_n + b_n)/2 \in A \\ (a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in B \end{cases} \]
for every $n \in \mathbb{N}^*$. According to Theorem 2.3.19 and Corollary 2.3.20, both sequences are convergent to real numbers $a$ and $b$, respectively. Since
\[ b_n - a_n = (b_1 - a_1)/2^{n-1} \]
for all $n \in \mathbb{N}^*$, it follows that $a = b$. In the following, we show that $b = \sup S$. For every $x \in S$, it follows that $x \leq b_n$ for all $n \in \mathbb{N}^*$ and hence that $x \leq b$. Hence $b$ is an upper bound of $S$. Let $\bar{b}$ be an upper bound of $S$ such that $\bar{b} < b$. Then, since $a_1, a_2, \dots$ converges to $a = b$, there is $n \in \mathbb{N}^*$ such that $\bar{b} < a_n$. Since $a_n$ is no upper bound for $S$, the same is also true for $\bar{b}$, which is a contradiction. Therefore, $b$ is the smallest upper bound of $S$, i.e., $b = \sup S$. Finally, we consider the case that $S$ is bounded from below. Then $-S := \{ -x : x \in S \}$ is bounded from above. Obviously, a real number $a$ is a lower bound of $S$ if and only if $-a$ is an upper bound of $-S$. Hence $-\sup(-S)$ is the largest lower bound of $S$, i.e., $\inf S$ exists and equals $-\sup(-S)$.
Example 2.3.24. Prove that there is a real number $x$ such that $x^2 = 2$. Solution: For this, we define
\[ S := \{ y \in \mathbb{R} : 0 \leq y^2 \leq 2 \} . \]
Since $0 \in S$, $S$ is non-empty. Further, $S$ does not contain real numbers $y \geq 2$ since the last inequality implies that
\[ y^2 - 2 = (y - 2)(y + 2) + 2 \geq 2 . \]
Hence $S$ is bounded from above. We define $x := \sup S$. Note that $x \geq 1$ since $1 \in S$. In the following, we prove that $x^2 = 2$ by excluding that $x^2 < 2$ and that $x^2 > 2$. First, we assume that $x^2 < 2$. Then it follows for $n \in \mathbb{N}^*$ that
\[ \left( x + \frac{1}{n} \right)^2 - 2 = x^2 - 2 + \frac{2x}{n} + \frac{1}{n^2} \leq x^2 - 2 + \frac{2x}{n} + \frac{1}{n} = x^2 - 2 + \frac{2x + 1}{n} . \]
Hence if $n \geq (2x + 1)/(2 - x^2)$ it follows that
\[ \left( x + \frac{1}{n} \right)^2 \leq 2 \]
and therefore that $x + (1/n) \in S$. As a consequence, $x$ is no upper bound for $S$. Second, we assume that $x^2 > 2$. Then it follows for $\varepsilon > 0$ that
\[ (x - \varepsilon)^2 - 2 = x^2 - 2 - 2 \varepsilon x + \varepsilon^2 \geq x^2 - 2 - 2 \varepsilon x . \]
Hence if $\varepsilon < (x^2 - 2)/(2x)$, it follows that
\[ (x - \varepsilon)^2 > 2 . \]
As a consequence, $x$ is not the smallest upper bound for $S$. Finally, it follows that $x^2 = 2$. Note that according to Example 2.2.15, $x$ is no rational number.
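The bisection construction from the proof of Theorem 2.3.23 can be tried numerically on the set $S$ of Example 2.3.24. The following is a minimal sketch in Python under stated assumptions: the names `approximate_supremum` and `bounds_S` are illustrative, the test "is an upper bound of $S$" is supplied by hand for this particular $S$, and floating-point numbers only approximate the real numbers of the proof.

```python
def approximate_supremum(is_upper_bound, a, b, steps=60):
    """Bisection as in the proof of Theorem 2.3.23.

    a must not be an upper bound of S (a in A), b must be one (b in B).
    is_upper_bound is a hypothetical helper deciding membership in B.
    Returns the last pair (a_n, b_n); the supremum lies between them.
    """
    for _ in range(steps):
        mid = (a + b) / 2
        if is_upper_bound(mid):
            b = mid    # mid lies in B: keep a, lower b
        else:
            a = mid    # mid lies in A: raise a, keep b
    return a, b

def bounds_S(c):
    """For S = {y : 0 <= y^2 <= 2}: c is an upper bound exactly when c >= 0 and c^2 >= 2."""
    return c >= 0 and c * c >= 2

# 0 is no upper bound (1 is in S), 2 is an upper bound (see Example 2.3.24).
a, b = approximate_supremum(bounds_S, 0.0, 2.0)
print(a, b, b * b)
```

Both returned numbers approximate $\sup S$, whose square equals 2 according to the example.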
Below, we define the exponential function as a limit of sequences. This
function is of fundamental importance for applications. It appears in a natural way in the description of physical systems throughout the whole of
physics. One prominent example is the description of radioactive decay. Its
discovery is often attributed to Jacob Bernoulli, who became familiar with
calculus through a correspondence with Leibniz, resulting from his study of
the problem of continuous compound interest. For motivation, we briefly
sketch the problem in the following.
For this, we assume that a bank account contains $a > 0$ Dollars and pays $100 \cdot x$ percent interest per year where $x$ is some real number. Of course, in practice $x \geq 0$. If the interest is paid once at the end of the year, the account contains
\[ a_1 := a + x \cdot a = a \, (1 + x) \]
Dollars at the end of the year. If the interest is paid semiannually, after $1/2$ years the account contains
\[ a + \frac{x}{2} \, a = a \left( 1 + \frac{x}{2} \right) \]
Dollars and after one year
\[ a_2 := a \left( 1 + \frac{x}{2} \right) + \frac{x}{2} \, a \left( 1 + \frac{x}{2} \right) = a \left( 1 + \frac{x}{2} \right)^2 \geq a_1 \]
Dollars. Analogously, if the interest is paid $n$ times per year where $n \in \mathbb{N}^*$, the account contains
\[ a_n := a \left( 1 + \frac{x}{n} \right)^n \]
Dollars after one year. Bernoulli investigated the question whether this amount would grow indefinitely with the increase of $n$ or whether it would stay bounded. Indeed, as we shall see below, the sequence $a_1, a_2, \dots$ is converging to a real number which is denoted by $a e^x$ or $a \exp(x)$. For simplicity, below we restrict $n$ to powers of $2$. This is an approach of Otto Dunkel, 1917 [33] which avoids the use of Bernoulli’s inequality. This restriction can be removed later, for instance, with the help of L’Hospital’s theorem, Theorem 2.5.38.
Fig. 20: $(n, x_n)$, $(n, y_n)$ from Lemma 2.3.25 and $(n, e)$ for $n = 1$ to $n = 15$.
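Bernoulli's question can be explored numerically before it is answered rigorously. The sketch below, in Python, evaluates the balance $a (1 + x/n)^n$ for a few payment frequencies; the starting amount of 100 Dollars, the rate of 5 percent and the chosen values of $n$ are illustrative assumptions, and `math.exp` from the standard library stands in for the exponential function defined further below.

```python
import math

def balance_after_one_year(a, x, n):
    """Balance a * (1 + x/n)^n after one year with n interest payments per year."""
    return a * (1 + x / n) ** n

a, x = 100.0, 0.05                      # illustrative: 100 Dollars at 5 percent
frequencies = (1, 2, 4, 12, 365)        # yearly, semiannually, quarterly, monthly, daily
balances = [balance_after_one_year(a, x, n) for n in frequencies]
limit = a * math.exp(x)                 # the value a * e^x approached for large n

for n, value in zip(frequencies, balances):
    print(n, round(value, 4))
print("a * exp(x) =", round(limit, 4))
```

In this example the balances increase with $n$ but stay below $a e^x$, in line with the boundedness that Bernoulli asked about.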
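Lemma 2.3.25. Let $x \in \mathbb{R}$. Define
\[ x_n := \left( 1 + \frac{x}{2^n} \right)^{(2^n)} , \quad y_n := \left( 1 - \frac{x}{2^n} \right)^{-(2^n)} \]
for all $n \in \mathbb{Z}$. Then for all $n \in \mathbb{N}^*$ such that $2^{(n-1)} > |x|$:
\[ 0 < x_{n-1} \leq x_n \leq y_n \leq y_{n-1} \tag{2.3.8} \]
and
\[ 0 \leq 1 - \frac{x_n}{y_n} \leq \frac{x^2}{2^n} . \tag{2.3.9} \]
Proof. For this let $n \in \mathbb{N}^*$ be such that $m := 2^{(n-1)} > |x|$. Then
\[ \left( 1 + \frac{x}{2m} \right)^2 = 1 + \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 + \frac{x}{m} > 0 , \quad \left( 1 - \frac{x}{2m} \right)^2 = 1 - \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 - \frac{x}{m} > 0 \]
and hence
\[ 0 < x_{n-1} \leq x_n \text{ and } 0 < y_n \leq y_{n-1} . \]
Finally, it follows that
\[ y_n - x_n = \left( 1 - \frac{x}{2m} \right)^{-2m} \left\{ 1 - \left( 1 + \frac{x}{2m} \right)^{2m} \left( 1 - \frac{x}{2m} \right)^{2m} \right\} = \left( 1 - \frac{x}{2m} \right)^{-2m} \left[ 1 - \left( 1 - \frac{x^2}{4m^2} \right)^{2m} \right] \]
\[ = y_n \, \frac{x^2}{4m^2} \left[ 1 + \left( 1 - \frac{x^2}{4m^2} \right) + \dots + \left( 1 - \frac{x^2}{4m^2} \right)^{2m-1} \right] . \]
Since $0 < 1 - x^2/(4m^2) \leq 1$, each of the $2m$ summands in the last bracket lies in $(0, 1]$, so that $0 \leq y_n - x_n \leq y_n \, x^2/(2m)$, and hence $x_n \leq y_n$ and (2.3.9).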
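Note that the sequence $y_1, y_2, \dots$ in Lemma 2.3.25 is decreasing for all sufficiently large $n$ and bounded from below by $0$ and hence convergent according to Corollary 2.3.20. Hence we can define the following:
Definition 2.3.26. We define the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ by
\[ \exp(x) := e^x := \lim_{n \to \infty} \left( 1 - \frac{x}{2^n} \right)^{-(2^n)} \]
for all $x \in \mathbb{R}$.
Then we conclude
Theorem 2.3.27.
(i)
\[ e^x = \lim_{n \to \infty} \left( 1 + \frac{x}{2^n} \right)^{(2^n)} \]
and $e^x > 0$ for all $x \in \mathbb{R}$.
(ii)
\[ 1 + x \leq \left( 1 + \frac{x}{2^n} \right)^{(2^n)} \leq e^x \leq \left( 1 - \frac{x}{2^n} \right)^{-(2^n)} \leq \frac{1}{1 - x} \tag{2.3.10} \]
for all $x \in \mathbb{R}$ such that $|x| < 1$ and all $n \in \mathbb{N}$.
(iii)
\[ e^{x + y} = e^x e^y \]
for all $x, y \in \mathbb{R}$.
Proof. From (2.3.9), it follows for every $x \in \mathbb{R}$:
\[ \lim_{n \to \infty} \frac{x_n}{y_n} = 1 \]
and hence by the limit laws, Theorem 2.3.4, that
\[ e^x = \lim_{n \to \infty} y_n = \lim_{n \to \infty} y_n \cdot \lim_{n \to \infty} \frac{x_n}{y_n} = \lim_{n \to \infty} x_n \]
and by (2.3.8) and Theorem 2.3.12 that $e^x > 0$ for all $x \in \mathbb{R}$. Further, if $|x| < 1$, the estimates (2.3.10) follow from (2.3.8) and Theorem 2.3.12. Finally, if $y \in \mathbb{R}$ and $n \in \mathbb{N}$ is such that $m := 2^n > \max\{ 4|x|, 4|y|, 2|x||y| \}$, then
\[ \frac{ \left( 1 + \frac{x}{m} \right)^m \left( 1 + \frac{y}{m} \right)^m }{ \left( 1 + \frac{x + y}{m} \right)^m } = \left( 1 + \frac{h_m}{m} \right)^m \]
where
\[ h_m := \frac{xy}{m + x + y} \]
is such that $|h_m| < 1$. Hence by (2.3.10)
\[ 1 + h_m \leq \left( 1 + \frac{h_m}{m} \right)^m \leq \frac{1}{1 - h_m} , \]
and it follows by Theorem 2.3.4 and Theorem 2.3.12 that
\[ \frac{e^x e^y}{e^{x + y}} = \lim_{n \to \infty} \left( 1 + \frac{h_m}{m} \right)^m = 1 . \]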
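The monotonicity chain (2.3.8) and the resulting enclosure of $e^x$ between $x_n$ and $y_n$ can be checked numerically. A minimal sketch in Python with exact fractions follows, under stated assumptions: $x_n$ and $y_n$ are read as $(1 + x/2^n)^{(2^n)}$ and $(1 - x/2^n)^{-(2^n)}$, the value $x = 3/2$ and the range of $n$ are illustrative, the names `x_term` and `y_term` are not from the text, and `math.exp` serves as the reference value.

```python
from fractions import Fraction
import math

def x_term(x, n):
    """x_n = (1 + x / 2^n)^(2^n), evaluated exactly."""
    return (1 + Fraction(x) / 2 ** n) ** (2 ** n)

def y_term(x, n):
    """y_n = (1 - x / 2^n)^(-(2^n)), evaluated exactly."""
    return (1 - Fraction(x) / 2 ** n) ** (-(2 ** n))

x = Fraction(3, 2)

# The condition 2^(n-1) > |x| holds for n >= 2 when x = 3/2.
chain_ok = all(
    0 < x_term(x, n - 1) <= x_term(x, n) <= y_term(x, n) <= y_term(x, n - 1)
    for n in range(2, 9)
)

# x_8 and y_8 enclose e^(3/2).
lower, upper = float(x_term(x, 8)), float(y_term(x, 8))
print(chain_ok, lower, math.exp(1.5), upper)
```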
Problems
1) Below are given the first 8 terms of a sequence $x_1, x_2, \dots$. For each find a representation $x_n = f(n)$, $n = 1, \dots, 8$, where $f$ is an appropriate function.
a) 2, 4, 6, 8, 10, 12, 14, 16,
b) 2, 4, 8, 16, 32, 64, 128, 256,
c) 1, 1, 1, 1, 1, 1, 1, 1,
d) 1, 3, 6, 10, 15, 21, 28, 36,
e) 1, 3/4, 5/7, 7/10, 9/13, 11/16, 13/19, 15/22,
f) 2, 0, 2, 0, 2, 0, 2, 0,
g) 5/7, 0, 7/9, 0, 9/11, 0, 11/13, 0,
h) 1, 1, 4/6, 8/24, 16/120, 32/720, 64/5040, 128/40320,
i) 0, 1, 0, 1, 0, 1, 0, 1,
j) 0, 1, 0, 1, 0, 1, 0, 1,
k) 0, 1, 0, 0, 0, 1, 0, 1, [0, 0, 0, 1,]
l) 0, 1, 0, 0, 0, 1, 0, 1, [0, 0, 0, 1].
2) Prove the convergence of the sequence and calculate its limit. For this use only the limit laws, the fact that a constant sequence converges to that respective constant and the fact that
\[ \lim_{n \to \infty} (1/n) = 0 . \]
Give details.
a)
b)
c)
d)
e)
f)
g)
xn
xn
xn
xn
xn
xn
xn
: 1 p1{nq, n P N ,
: 5 p2q p1{nq 3 p1{nq2 , n P N ,
: r1 p4q p1{nqs{r2 3 p1{nq2 s, n P N ,
: 3{n2 , n P N ,
: p2n 1q{pn 3q, n P N ,
: p3n2 6n 10q{p7n2 3n 5q, n P N ,
: p3n2 6n 10q{p7n3 3n 5q, n P N .
3) Determine in each case whether the given sequence is convergent or
divergent. Give reasons. If it is convergent, calculate the limit.
a)
xn :
n 1
n
1
b)
xn :
d)
xn :
p1qn
n
p1qn
c)
xn : p1qn 1 e)
xn : sinpnπ q
f)
xn : sin
g)
xn :
h)
xn :
n2
n2 1
i)
xn :
j)
xn :
n2 n
n3 1
n
n
n2 1
n3
n2 1
for every n P N .
1
n
nπ
2
cospnπ q
4) The table displays pairs $(n, s_n)$, $n = 1, \dots, 10$, where $s_n$ is the measured height in meters of a free falling body over the ground after $n/10$ seconds and at rest at initial height 4m.
(1, 3.951) (2, 3.804) (3, 3.559) (4, 3.216)
(5, 2.775) (6, 2.236) (7, 1.599) (8, 0.864)
Draw these points into an xy-diagram where the values of $n$ appear on the x-axis and the values of $s_n$ on the y-axis. Find a representation $s_n = f(n/10)$, $n = 1, \dots, 10$, where $f$ is an appropriate function, and predict the time when the body hits the ground.
5) The table displays pairs $(2n/10, L_n)$, $n = 1, \dots, 8$, where $2n/10$ is the pressure in atmospheres (atm) of an ideal gas (at a constant temperature of 20 degrees Celsius) confined to a volume which is proportional to the length $L_n$. The last is measured in millimeters (mm).
(0.2, 672) (0.4, 336) (0.6, 224) (0.8, 168)
(1.0, 134.4) (1.2, 112) (1.4, 96) (1.6, 84)
Draw these points into an xy-diagram where the values of $n$ appear on the x-axis and the values of $L_n$ on the y-axis. Find a representation $L_n = f(2n/10)$, $n = 1, \dots, 8$, where $f$ is an appropriate function, and predict $L_{10}$.
6) Like Archimedes, derive the recursion relation (2.3.1) by elementary
geometric reasoning without the use of trigonometric functions.
7) Reconsider Archimedes’ measurement of the circle and calculate the
recursion relation for the sequence of circumferences s6 , s12 , s24 , . . .
that corresponds to (2.3.1). In addition, prove that this sequence is
increasing as well as bounded from above and hence convergent.
2.3.2 Continuous Functions
This section starts the investigation of properties of functions defined on
subsets of the real numbers.
Alongside the notion of a function, the notion of the continuity of a function underwent considerable changes until it reached its current meaning.
In his textbook ‘Introductio ad analysin infinitorum’ from 1748 [38], Leonhard Euler defines a function as an equation or analytic expression composed of variables and numbers. Admissible analytic expressions were
those that involved the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. This common
property of functions was also called ‘continuity in form’. The study of
the solutions of the wave equation in one space dimension (‘the Vibrating-String Problem’) made necessary the consideration of compounds of such
functions. Such were called ‘discontinuous’ functions by Euler. This included functions (in the sense of curves) that are traced by the free motion
of the hand and therefore not subject to any law of continuity in form. Unlike modern definitions of continuity of a function, continuity in the sense
of Euler included the differentiability of the function in the modern sense.
The last concept will be defined in Section 2.4. Hence the term continuous
was used to indicate a kind of regularity of the function. The same is true
today.
The modern definition of continuity goes back to a publication of Bernhard Bolzano from 1817 [12]. The literal translation of the (German) title
is
‘Purely analytical proof of the theorem, that between each two
values which guarantee an opposing result, at least one real
root of the equation lies.’
The phrase ‘opposing result’ means an opposite sign, and the theorem in
question is the intermediate value theorem, see Theorem 2.3.37 below. In
this paper, he criticizes that the known proofs of that theorem still make reference to geometric intuition although such arguments were already considered inadequate in pure mathematics at the time. He argues that the concept
of continuity should be understood in the following sense. A function $f(x)$ varies according to the law of continuity for all values of $x$ which lie inside or outside certain limits if for every such $x$ the value of the difference
\[ f(x + \omega) - f(x) \]
can be made smaller than any given quantity if $\omega$ can be assumed as small as
one wishes. Essentially the same formulation can also be found in Cauchy’s
textbook ‘Cours d’analyse’ from 1821 [22]. This formulation practically
coincides with a modern definition.
It is important to note that, on first sight and unlike Bolzano, Cauchy’s
definition makes reference to infinitesimal quantities. The use of such
quantities, which have their roots in ancient Greek philosophy, was quite
common at that time. Among others, Johannes Kepler, Newton, Leibniz,
Jacob Bernoulli, Euler and Cauchy, previously to the writing of his ‘Cours
d’analyse’, made use of them. Jean le Rond d’Alembert, Joseph Louis Lagrange, Bolzano and others distrusted that concept and tried to avoid it. On
the other hand, Cauchy replaces the concept of fixed infinitesimally small
quantities by a definition of infinitesimals in terms of an essentially modern
concept of limits. In this way, he ‘reconciles rigor with infinitesimals’ and
became an important and influential promoter of rigor in calculus / analysis.
In modern calculus / analysis, infinitesimals are not part of the real number system. Following Cauchy, their role has been replaced by the rigorous
concept of limits.
The assumption of continuity of the involved function is sufficient to prove
the intermediate value theorem, although neither Bolzano nor Cauchy could
give a completely satisfactory proof according to modern standards because
a rigorous foundation of the real number system was still missing. An additional important property of continuous functions, defined on closed intervals of R, is that they assume a maximum and also a minimum value. See
Theorem 2.3.33 below.
Below, we define the continuity of a function as the property to ‘preserve
limits’. This form of the definition goes back to Heinrich Eduard Heine and
is called ‘sequential continuity’ in more general situations (than functions
defined on subsets of the real numbers).
Definition 2.3.28. (Continuity) Let $f : D \to \mathbb{R}$ be a function and $x \in D$. Then we say $f$ is continuous in $x$ if for every sequence $x_1, x_2, \dots$ of elements in $D$ from
\[ \lim_{\nu \to \infty} x_\nu = x \]
it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f\left( \lim_{\nu \to \infty} x_\nu \right) \ [ \, = f(x) \, ] . \]
If $f$ is not continuous in $x$, we say $f$ is discontinuous in $x$. Also we say $f$ is continuous if $f$ is continuous in all points of its domain $D$.
Example 2.3.29. (Basic examples for continuous functions.) Let $a, b$ be real numbers and $f : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := ax + b \]
for all $x \in \mathbb{R}$. Then $f$ is continuous.
Proof. Let $x$ be some real number and $x_1, x_2, \dots$ be a sequence of real numbers converging to $x$. Then for any given $\varepsilon > 0$, there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ with $n \geq n_0$:
\[ |a| \cdot |x_n - x| < \varepsilon \]
and hence also that
\[ |f(x_n) - f(x)| = |a x_n + b - (a x + b)| = |a x_n - a x| = |a| \cdot |x_n - x| < \varepsilon \]
and
\[ \lim_{n \to \infty} f(x_n) = f(x) . \]
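An example for a function which is discontinuous in one point.
Example 2.3.30. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \frac{x}{|x|} \]
for $x \neq 0$ and $f(0) := 1$. Then
\[ \lim_{n \to \infty} \frac{1}{n} = 0 \text{ and } \lim_{n \to \infty} \left( - \frac{1}{n} \right) = 0 , \]
but
\[ \lim_{n \to \infty} f\left( \frac{1}{n} \right) = 1 \text{ and } \lim_{n \to \infty} f\left( - \frac{1}{n} \right) = -1 . \]
Hence $f$ is discontinuous at the point $0$. See Fig. 21. Such a discontinuity is called a ‘jump discontinuity’.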
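The sequential definition of continuity can be illustrated on this example. The sketch below, in Python, assumes that the function is $x/|x|$ away from the origin with the value $1$ at the origin, so that the jump sits at $0$; it evaluates the function along the two sequences $1/n$ and $-1/n$, which both converge to $0$. The list names are illustrative.

```python
def f(x):
    """f(x) = x / |x| for x != 0 and f(0) = 1 (assumed reading of Example 2.3.30)."""
    return x / abs(x) if x != 0 else 1.0

# Two sequences converging to 0, one from the right and one from the left.
from_right = [f(1 / n) for n in range(1, 1001)]
from_left = [f(-1 / n) for n in range(1, 1001)]

# The image sequences are constant, with different limits 1 and -1.
print(set(from_right), set(from_left), f(0))
```

Since the limit along $-1/n$ differs from the function value at $0$, the function does not preserve limits there.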
The following gives an example of a function that is discontinuous in every point of its domain and is known as Dirichlet’s function. It was given
in Dirichlet’s 1829 paper [30] which gave a precise meaning to Fourier’s
work from 1822 [41] on heat conduction. As described in the beginning
of Section 2.2.3, that paper also gave the first modern definition of functions. His example clearly demonstrates that he moved considerably past
his time with his concept of functions since such type of function had not
been considered before.
Example 2.3.31. (Dirichlet’s function, a function which is nowhere continuous) Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := \begin{cases} 1 & \text{if $x$ is rational} \\ 0 & \text{if $x$ is irrational} \end{cases} \]
for every $x \in \mathbb{R}$. For the proof that $f$ is everywhere discontinuous, let $x \in \mathbb{R}$. Then $x$ is either rational or irrational. If $x$ is rational, then $x_n := x + \sqrt{2}/n$ for every $n \in \mathbb{N}^*$ is irrational. (Otherwise, $\sqrt{2} = n (x_n - x)$ is a rational number.) Hence
\[ \lim_{n \to \infty} f(x_n) = 0 \neq 1 = f(x) , \]
and $f$ is discontinuous in $x$. If $x$ is irrational, by construction of the real number system, see Theorem 5.1.11 (i) in the Appendix, there is a sequence of rational numbers $x_1, x_2, \dots$ that is convergent to $x$. Hence
\[ \lim_{n \to \infty} f(x_n) = 1 \neq 0 = f(x) , \]
and $f$ is discontinuous in $x$ also in this case.
In the following, we define ‘continuous’ limits of the form
\[ \lim_{x \to a} f(x) \]
where $f$ is some function and $a$ some real number or $-\infty$, $\infty$. In classical (= ‘pre-modern’) understanding, the symbol was understood as the variable $x$ approaching $a$ in a ‘continuous’ way, an understanding that was heavily dependent on geometric intuition. Nowadays, there are good reasons to distrust such an intuition resulting from Cantor’s classification of infinite sets. That classification separates infinite sets into those that are countable and those that are not. The last are called ‘uncountable’. A countable set is a set which is the image of an injective map with domain $\mathbb{N}$. It can be shown that the sets $\mathbb{Z}$ and $\mathbb{Q}$ are countable, but that $\mathbb{R}$ and also any interval of $\mathbb{R}$ containing more than one point is uncountable. Therefore, the geometric intuition of the variable $x$ approaching $a$ in a continuous way would involve the visualization of an uncountable set which can be considered humanly impossible. For this reason, it can very well be said that a large part of classical calculus / analysis used arguments that were based on illusions, even if one excludes its frequent use of infinitesimal quantities from the consideration.
The following definition introduces notation which is in frequent use in
other textbooks of calculus / analysis. We will use it only occasionally.
Definition 2.3.32. (Continuous limits) Let $f$ be a function defined on a subset of $\mathbb{R}$, $a \in \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$ and $b \in \mathbb{R}$.
(i) We say that a sequence $x_1, x_2, \dots$ of real numbers converges to $\infty$ or $-\infty$ if for every $n \in \mathbb{N}$ there are only finitely many members that are $\leq n$ or $\geq -n$, respectively.
(ii) If there is a sequence $x_1, x_2, x_3, \dots$ in the domain of $f$ that converges to $a$, we define
\[ \lim_{x \to a} f(x) = b , \]
if for every such sequence it follows that
\[ \lim_{n \to \infty} f(x_n) = b . \]
An important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume a maximum value and a minimum value. The corresponding theorem is a direct consequence of the Bolzano-Weierstrass theorem, Theorem 2.3.18.
Theorem 2.3.33. (Existence of maxima and minima of continuous functions on compact intervals) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then there is $x_0 \in [a, b]$ such that
\[ f(x_0) \geq f(x) \quad ( \, f(x_0) \leq f(x) \, ) \]
for all $x \in [a, b]$.
Fig. 21: Graph of $f$ from Example 2.3.30.
Fig. 22: Graph of $f$ from Example 2.3.36.
Proof. For this, in a first step, we show that $f$ is bounded and hence that $\sup f([a, b])$ exists. In the final step, we show that there is $c \in [a, b]$ such that $f(c) = \sup f([a, b])$. For this, we use the Bolzano-Weierstrass theorem. The proof that $f$ is bounded is indirect. Assume on the contrary that $f$ is unbounded. Then there is a sequence $x_1, x_2, \dots$ in $[a, b]$ such that
\[ |f(x_n)| > n \tag{2.3.11} \]
for all $n \in \mathbb{N}$. Hence according to Theorem 2.3.18, there is a subsequence $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $c \in [a, b]$. Note that the corresponding sequence $f(x_{n_1}), f(x_{n_2}), \dots$ is not converging as a consequence of (2.3.11). But, since $f$ is continuous, it follows that
\[ f(c) = \lim_{k \to \infty} f(x_{n_k}) , \]
which is a contradiction. Hence $f$ is bounded. Therefore let $M := \sup f([a, b])$. Then for every $n \in \mathbb{N}^*$ there is a corresponding $c_n \in [a, b]$ such that
\[ |f(c_n) - M| < \frac{1}{n} . \tag{2.3.12} \]
Again, according to Theorem 2.3.18, there is a subsequence $c_{k_1}, c_{k_2}, \dots$ of $c_1, c_2, \dots$ converging to some element $c \in [a, b]$. Also, as a consequence of (2.3.12), the corresponding sequence $f(c_{k_1}), f(c_{k_2}), \dots$ is converging to $M$ and by continuity of $f$ to $f(c)$. Hence $f(c) = M$ and by the definition of $M$:
\[ f(c) = M \geq f(x) \]
for all $x \in [a, b]$. By applying the previous reasoning to the continuous function $-f$, the existence of a $c' \in [a, b]$ follows such that
\[ -f(c') \geq -f(x) \]
and hence also
\[ f(c') \leq f(x) \]
for all $x \in [a, b]$.
As a by-product of the proof of the previous theorem, we proved that every continuous function defined on a bounded closed interval of $\mathbb{R}$ is bounded in the following sense.
Definition 2.3.34. (Boundedness of functions) We call a function $f$ bounded if there is $M > 0$ such that
\[ |f(x)| \leq M \]
for all $x$ from its domain.
An example for an unbounded function defined on a bounded closed interval of R is given by the function f from Example 2.3.36 below.
Corollary 2.3.35. Every continuous function defined on a bounded closed
interval of R is bounded.
A simple example of a function which is discontinuous in one point and
does not assume a maximal value is:
Example 2.3.36. Define f : r0, 1s Ñ R by
f pxq :
"
if 0 ¤ x 1{2
if 1{2 ¤ x ¤ 1 .
1 x2
px 1q2
See Fig. 22.
Another important property of continuous functions, defined on closed intervals of R, is that they assume all values between those at the interval
ends.
Theorem 2.3.37. (Intermediate value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Further, let $f(a) < f(b)$ and $\gamma \in (f(a), f(b))$. Then there is $x \in (a, b)$ such that
\[ f(x) = \gamma . \]
Proof. Define
\[ S := \{ x \in [a, b] : f(x) \leq \gamma \} . \]
Then $S$ is non-empty, since $a \in S$, and bounded from above by $b$. Hence $c := \sup S$ exists and is contained in $[a, b]$. Further, there is a sequence $x_1, x_2, \dots$ in $S$ such that
\[ |x_n - c| \leq \frac{1}{n} \tag{2.3.13} \]
for all $n \in \mathbb{N}^*$. Hence $x_1, x_2, \dots$ is converging to $c$, and it follows by the continuity of $f$ that
\[ \lim_{n \to \infty} f(x_n) = f(c) . \]
Moreover, since $f(x_n) \leq \gamma$ for all $n \in \mathbb{N}^*$, it follows that $f(c) \leq \gamma$. As a consequence, $c \neq b$. Now for every $x \in (c, b]$, it follows that $f(x) > \gamma$ because otherwise $c$ is not an upper bound of $S$. Since $c < b$, there exists a sequence $y_1, y_2, \dots$ in $(c, b]$ which is converging to $c$. Further, because of the continuity of $f$
\[ \lim_{n \to \infty} f(y_n) = f(c) \]
and hence $f(c) \geq \gamma$. Finally, it follows that $f(c) = \gamma$ and therefore also that $c \neq a$ and $c \neq b$.
The following corollary displays a main application of the intermediate value theorem: If $f$ is a continuous function defined on a closed interval of $\mathbb{R}$ whose values at the interval ends have a different relative sign, i.e., one of those is $< 0$ and the other one is $> 0$, then there is $x$ in the domain of $f$ such that
\[ f(x) = 0 . \]
Corollary 2.3.38. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Moreover, let $f(a) < 0$ and $f(b) > 0$. Then there is $x \in (a, b)$ such that $f(x) = 0$.
Example 2.3.39. Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := x^3 + x + 1 \]
for all $x \in \mathbb{R}$. Then by Theorems 2.3.46, 2.3.48 below, $f$ is continuous. Also, it follows that
\[ f(-1) = -1 < 0 \text{ and } f(0) = 1 > 0 \]
and hence by Corollary 2.3.38 that $f$ has a zero in $(-1, 0)$. See Fig. 23.
Fig. 23: Graph of $f$ from Example 2.3.39.
Remark 2.3.40. Note in the previous example that the value ($0.375$) of $f$ in the mid point $-0.5$ of $[-1, 0]$ is $> 0$. Hence it follows by Corollary 2.3.38 that there is a zero in the interval $[-1, -0.5]$. The iteration of this process is called the ‘bisection method’. It is used to approximate zeros of continuous functions.
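The iteration described in the remark can be written out as a short program. The following is a minimal sketch in Python, assuming a continuous function with $f(a) < 0 < f(b)$ as in Corollary 2.3.38; the name `bisect_zero` and the tolerance `tol` are illustrative choices, and the polynomial $x^3 + x + 1$ on $[-1, 0]$ is the one of Example 2.3.39.

```python
def bisect_zero(f, a, b, tol=1e-10):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid
        if value > 0:
            b = mid    # sign change, and hence a zero, in [a, mid]
        else:
            a = mid    # sign change, and hence a zero, in [mid, b]
    return (a + b) / 2

def f(x):
    return x ** 3 + x + 1

root = bisect_zero(f, -1.0, 0.0)
print(root, f(root))
```

The first step of the loop evaluates $f$ at the mid point $-0.5$ and continues with $[-1, -0.5]$, exactly as in the remark. Each step halves the length of the interval that is known to contain a zero.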
Polynomial functions, defined on the whole of $\mathbb{R}$, of an odd order necessarily assume the value $0$ since they assume values of different relative sign for large negative and large positive arguments. That the same is not true in general for polynomial functions of even order can be seen from the fact that, for instance, the polynomial function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := 1 + x^2$ for all $x \in \mathbb{R}$ does not assume the value zero.
Theorem 2.3.41. Let $n$ be a natural number and $a_0, a_1, \dots, a_{2n}$ be real numbers. Define the polynomial $p : \mathbb{R} \to \mathbb{R}$ by
\[ p(x) := a_0 + a_1 x + \dots + a_{2n} x^{2n} + x^{2n+1} \]
for all $x \in \mathbb{R}$. Then there is some $x \in \mathbb{R}$ such that $p(x) = 0$.
Proof. Below in Example 2.3.49, it is proved that $p$ is continuous. Further, define
\[ x_0 := 1 + \max\{ |a_0|, |a_1|, \dots, |a_{2n}| \} . \]
Then
\[ - a_0 - a_1 x_0 - \dots - a_{2n} x_0^{2n} \leq |a_0| + |a_1| \, |x_0| + \dots + |a_{2n}| \, |x_0|^{2n} \leq (x_0 - 1)(1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < x_0^{2n+1} \]
and hence $p(x_0) > 0$. Also
\[ a_0 + a_1 (-x_0) + \dots + a_{2n} (-x_0)^{2n} \leq |a_0| + |a_1| \, |x_0| + \dots + |a_{2n}| \, |x_0|^{2n} \leq (x_0 - 1)(1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < -(-x_0)^{2n+1} \]
and hence $p(-x_0) < 0$. Hence according to Theorem 2.3.37, there is $x \in [-x_0, x_0]$ such that $p(x) = 0$.
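The choice of $x_0$ in the proof can be tested on a concrete polynomial of odd order. The sketch below, in Python, builds the polynomial from its lower coefficients with leading coefficient $1$ and checks the signs at $x_0$ and $-x_0$; the helper name `odd_polynomial` and the coefficients $4, -7, 2.5$ are illustrative assumptions.

```python
def odd_polynomial(coeffs):
    """Return p with p(x) = a_0 + a_1 x + ... + a_{2n} x^{2n} + x^{2n+1}.

    coeffs = [a_0, ..., a_{2n}] must contain an odd number of entries.
    """
    if len(coeffs) % 2 == 0:
        raise ValueError("expected an odd number of coefficients a_0, ..., a_{2n}")

    def p(x):
        result = 1.0                   # leading coefficient of x^{2n+1}
        for a in reversed(coeffs):     # Horner's scheme, highest power first
            result = result * x + a
        return result

    return p

coeffs = [4.0, -7.0, 2.5]                  # illustrative a_0, a_1, a_2 (n = 1)
p = odd_polynomial(coeffs)                 # p(x) = 4 - 7 x + 2.5 x^2 + x^3
x0 = 1 + max(abs(a) for a in coeffs)       # x_0 from the proof

print(x0, p(x0), p(-x0))
```

With the opposite signs at $-x_0$ and $x_0$ established, a zero in between can then be approximated by bisection.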
The ‘converse’ of Theorem 2.3.37 is not true, i.e., a function that assumes
all values between those at its interval ends is not necessarily continuous on
that interval. This can be seen, for instance, from the following Example.
Example 2.3.42. Define $f : [0, 2/\pi] \to \mathbb{R}$ by
\[ f(x) := \sin(1/x) \]
for $0 < x \leq 2/\pi$ and $f(0) := 0$. Then $f$ is not continuous (in $0$), but assumes all values in the interval $[f(0), f(2/\pi)] = [0, 1]$. Note also that $f$ has an infinite number of zeros, located at $1/(n\pi)$ for $n \in \mathbb{N}^*$.
A useful property of continuous functions for theoretical investigations
such as Theorem 2.3.44 below is that they map intervals of R that are contained in their domain on intervals of R.
Fig. 24: Graph of $f$ from Example 2.3.42.
Theorem 2.3.43. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then the range of $f$ is given by
\[ f([a, b]) = [\alpha, \beta] \tag{2.3.14} \]
for some $\alpha, \beta \in \mathbb{R}$ such that $\alpha \leq \beta$.
Proof. Denote by $\alpha$, $\beta$ the minimum value and the maximum value of $f$, respectively, which exist according to Theorem 2.3.33. Then for every $x \in [a, b]$
\[ \alpha \leq f(x) \leq \beta . \]
Further, let $x_m, x_M \in [a, b]$ be such that $f(x_m) = \alpha$ and $f(x_M) = \beta$, respectively. Finally denote by $I$ the interval $[x_m, x_M]$ if $x_m \leq x_M$ and $[x_M, x_m]$ if $x_M < x_m$. Then the restriction $f|_I$ of $f$ to $I$ is continuous and, according to Theorem 2.3.37 (applied to the function $-f|_I$ if $x_M < x_m$), every value of $[\alpha, \beta]$ is in its range.
Intuitively, for instance, as a consequence of Theorem 2.2.44, it is to be expected that the inverse of an injective continuous function is itself continuous. Indeed, this is true.
Theorem 2.3.44. Let f : ra, bs Ñ R, where a, b P R are such that a b,
be continuous and strictly increasing, i.e., for all x1 , x2 P ra, bs such that
x1 x2 it follows that f px1 q f px2 q. Then the inverse function f 1 is
continuous, too.
Proof. From the property that $f$ is strictly increasing, it follows that $f$ is also injective. Further, from Theorem 2.3.43 it follows the existence of $\alpha, \beta \in \mathbb{R}$ such that the range of $f$ is given by $[\alpha, \beta]$ and hence that
\[ f^{-1} : [\alpha, \beta] \to [a, b] . \]
Now let $y$ be some element of $[\alpha, \beta]$ and $y_1, y_2, \dots$ be some sequence of elements of $[\alpha, \beta]$ that is converging to $y$, but such that $f^{-1}(y_1), f^{-1}(y_2), \dots$ is not converging to $f^{-1}(y)$. Then there is an $\varepsilon > 0$ along with a subsequence $y_{n_1}, y_{n_2}, \dots$ of $y_1, y_2, \dots$ such that
\[ |f^{-1}(y_{n_k}) - f^{-1}(y)| \ge \varepsilon \tag{2.3.15} \]
for all $k \in \mathbb{N}^*$. According to the Bolzano-Weierstrass Theorem 2.3.18, there is a subsequence $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ of $y_{n_1}, y_{n_2}, \dots$ such that
\[ \lim_{l \to \infty} f^{-1}(y_{n_{k_l}}) = x \tag{2.3.16} \]
for some $x \in [a, b]$. Hence it follows by the continuity of $f$ that
\[ \lim_{l \to \infty} y_{n_{k_l}} = f(x) \]
and $y = f(x)$, since $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ is also convergent to $y$, but from (2.3.15) it follows by (2.3.16) that
\[ x \ne f^{-1}(y) \]
which, since $f$ is injective, leads to the contradiction that
\[ y \ne f(x) . \]
Hence such $y$ and sequence $y_1, y_2, \dots$ don't exist and $f^{-1}$ is continuous.
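For a continuous and strictly increasing function on a closed interval, values of the inverse can be approximated by repeated halving of the interval, using only the intermediate value property. The following is a minimal sketch under that assumption; the function name `inverse_by_bisection` and the tolerance are illustrative choices and not part of the text.

```python
def inverse_by_bisection(f, a, b, y, tol=1e-12):
    """Approximate the inverse of f at y for continuous, strictly increasing f on [a, b]."""
    if not (f(a) <= y <= f(b)):
        raise ValueError("y must lie in the range [f(a), f(b)]")
    lo, hi = a, b
    # Invariant: f(lo) <= y <= f(hi); the interval is halved in every step.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    g = lambda x: x**3 + x  # strictly increasing on [0, 2], with g(1) = 2
    for y in (1.9, 1.99, 2.0, 2.01, 2.1):
        print(y, inverse_by_bisection(g, 0.0, 2.0, y))
```

In the usage example, values of $y$ close to 2 give approximations close to 1, as is to be expected from the continuity of the inverse.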
In the case of sequences, the limit laws, see Theorem 2.3.4, stated that sums,
products and quotients (if defined) of convergent sequences are convergent
to the corresponding sum, product, quotient (if defined) of their limits. A
typical application of these limit laws consisted in the decomposition of
a given sequence into sums, products, quotients of sequences whose convergence is already known. Then the application of the limit laws proved
the convergence of the sequence and allowed the calculation of its limit if
the limits of those constituents are known. Theorems similar in structure
to that of the limit laws for sequences hold for continuous functions and
are given below. Sums, products, quotients (wherever defined) and compositions of continuous functions are continuous. Indeed, this is a simple
consequence of the limit laws, Theorem 2.3.4, and the definition of continuity. According to Theorem 2.3.44 the same is true for the inverse of
an injective continuous function. A typical application of the thus obtained
theorems consists in the decomposition of a given function into sums, products, quotients, compositions and inverses of functions whose continuity is
already known. Then the application of those theorems proves the continuity of that function. In this way, the proof of continuity of a given function
is greatly simplified and, usually, obvious. Therefore, in such obvious cases
in future, the continuity of the function will be just stated, but not explicitly
proved.
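Definition 2.3.45. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Moreover, let $a \in \mathbb{R}$. Then we define $(f_1 + f_2) : D_1 \cap D_2 \to \mathbb{R}$ (read: '$f_1$ plus $f_2$') and $a \cdot f_1 : D_1 \to \mathbb{R}$ (read: '$a$ times $f_1$') by
\[ (f_1 + f_2)(x) := f_1(x) + f_2(x) \]
for all $x \in D_1 \cap D_2$ and
\[ (a \cdot f_1)(x) := a \cdot f_1(x) \]
for all $x \in D_1$.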
Theorem 2.3.46. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Moreover let $a \in \mathbb{R}$. Then it follows by Theorem 2.3.4 that
(i) if $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 + f_2$ is continuous in $x$, too,
(ii) if $f_1$ is continuous in $x \in D_1$, then $a \cdot f_1$ is continuous in $x$, too.
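Definition 2.3.47. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Then we define $f_1 \cdot f_2 : D_1 \cap D_2 \to \mathbb{R}$ (read: '$f_1$ times $f_2$') by
\[ (f_1 \cdot f_2)(x) := f_1(x) \cdot f_2(x) \]
for all $x \in D_1 \cap D_2$. If moreover $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$, then we define $1/f_1 : D_1 \to \mathbb{R}$ (read: '1 over $f_1$') by
\[ (1/f_1)(x) := 1/f_1(x) \]
for all $x \in D_1$.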
Theorem 2.3.48. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$.
(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 \cdot f_2$ is continuous in $x$, too.
(ii) If $f_1$ is such that $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$ as well as continuous in $x \in D_1$, then $1/f_1$ is continuous in $x$, too.
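Proof. For the proof of (i), let $x_1, x_2, \dots$ be some sequence in $D_1 \cap D_2$ which converges to $x$. Then for any $\nu \in \mathbb{N}^*$
\begin{align*}
|(f_1 \cdot f_2)(x_\nu) - (f_1 \cdot f_2)(x)| &= |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x)| \\
&= |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x_\nu) + f_1(x) f_2(x_\nu) - f_1(x) f_2(x)| \\
&\le |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| \\
&\le |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| + |f_1(x_\nu) - f_1(x)| \cdot |f_2(x)| \\
&\quad + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)|
\end{align*}
and hence, obviously,
\[ \lim_{\nu \to \infty} (f_1 \cdot f_2)(x_\nu) = (f_1 \cdot f_2)(x) . \]
For the proof of (ii), let $x_1, x_2, \dots$ be some sequence in $D_1$ which converges to $x$. Then for any $\nu \in \mathbb{N}^*$
\[ |(1/f_1)(x_\nu) - (1/f_1)(x)| = |1/f_1(x_\nu) - 1/f_1(x)| = \frac{|f_1(x_\nu) - f_1(x)|}{|f_1(x_\nu)| \cdot |f_1(x)|} \]
and hence, obviously,
\[ \lim_{\nu \to \infty} (1/f_1)(x_\nu) = (1/f_1)(x) . \]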
In the following, we give two examples for the application of Theorem 2.3.46
and Theorem 2.3.48.
Example 2.3.49. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers. Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$ defined by
\[ p(x) := a_0 + a_1 x + \dots + a_n x^n \]
for all $x \in \mathbb{R}$, is continuous.
Proof. The proof is a simple consequence of Example 2.3.29, Theorem 2.3.46
and Theorem 2.3.48.
Example 2.3.50. Explain why the function
\[ f(x) := \frac{x^3 + 2x^2 + x + 1}{x^2 - 3x + 2} \tag{2.3.17} \]
is continuous at every number in its domain. State that domain. Solution: The domain $D$ is given by those real numbers for which the denominator of the expression (2.3.17) is different from 0. Hence it is given by
\[ D = \mathbb{R} \setminus \{1, 2\} . \]
Further, as a consequence of Example 2.3.49, the polynomials $p_1 : \mathbb{R} \to \mathbb{R}$, $p_2 : D \to \mathbb{R}$ defined by
\[ p_1(x) := x^3 + 2x^2 + x + 1 , \quad p_2(x) := x^2 - 3x + 2 \]
for all $x \in \mathbb{R}$ and $x \in D$, respectively, are continuous. Since $p_2(D) \subset \mathbb{R}^*$, it follows by Theorem 2.3.48 that the function $1/p_2$ is continuous. Finally from this, it follows by Theorem 2.3.48 that $p_1/p_2$ is continuous.
Theorem 2.3.51. Let $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}$ be functions and $D_g$ be a subset of $\mathbb{R}$. Moreover let $x \in D_f$, $f(x) \in D_g$, $f$ be continuous in $x$ and $g$ be continuous in $f(x)$. Then $g \circ f$ is continuous in $x$.
Proof. For this, let $x_1, x_2, \dots$ be a sequence in $D(g \circ f)$ converging to $x$. Then $f(x_1), f(x_2), \dots$ is a sequence in $D_g$. Moreover since $f$ is continuous in $x$, it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f(x) . \]
Finally, since $g$ is continuous in $f(x)$ it follows that
\[ \lim_{\nu \to \infty} (g \circ f)(x_\nu) = \lim_{\nu \to \infty} g(f(x_\nu)) = g(f(x)) = (g \circ f)(x) . \]
Example 2.3.52. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := |x| \]
for all $x \in \mathbb{R}$, is continuous. Solution: Define the polynomial $p_2 : \mathbb{R} \to \mathbb{R}$ by $p_2(x) := x^2$ for every $x \in \mathbb{R}$. According to Example 2.3.49, $p_2$ is continuous. Then $f = s_2 \circ p_2$, where $s_2$ denotes the square-root function on $[0, \infty)$, which, by Theorem 2.3.44, is continuous as inverse of the strictly increasing restriction of $p_2$ to $[0, \infty)$. Hence $f$ is continuous by Theorem 2.3.51.
Example 2.3.53. The functions $\sin : \mathbb{R} \to \mathbb{R}$ and $\exp : \mathbb{R} \to \mathbb{R}$ are continuous. Show that $\arcsin : [-1, 1] \to [-\pi/2, \pi/2]$, $\cos : \mathbb{R} \to \mathbb{R}$, $\arccos : [-1, 1] \to [0, \pi]$, $\tan : (-\pi/2, \pi/2) \to \mathbb{R}$, $\arctan : \mathbb{R} \to (-\pi/2, \pi/2)$ and the natural logarithm function $\ln : (0, \infty) \to \mathbb{R}$ are continuous. Solution: Since the restriction of $\sin$ to $[-\pi/2, \pi/2]$ and $\exp$ are in particular strictly increasing, their inverses $\arcsin$ and $\ln$ are continuous according to Theorem 2.3.44. Further, since
\[ \cos(x) = \sin\left(x + \frac{\pi}{2}\right) \]
for all $x \in \mathbb{R}$, the cosine function is continuous as composition of continuous functions according to Theorem 2.3.51. Further, the restriction of $\cos$ to $[0, \pi]$ is strictly decreasing and hence, by Theorem 2.3.44 applied to the strictly increasing restriction of $-\cos$, its inverse $\arccos$ is continuous. Also, $\tan : \mathbb{R} \setminus \{k\pi + (\pi/2) : k \in \mathbb{Z}\} \to \mathbb{R}$ defined by
\[ \tan(x) := \frac{\sin(x)}{\cos(x)} \]
for every $x \in \mathbb{R} \setminus \{k\pi + (\pi/2) : k \in \mathbb{Z}\}$ is continuous according to Theorem 2.3.48 as quotient of continuous functions. Finally, the restriction of $\tan$ to $(-\pi/2, \pi/2)$ is in particular strictly increasing and hence its inverse $\arctan$ is continuous according to Theorem 2.3.44.

Fig. 25: Graph of sin, arcsin and asymptotes.
Fig. 26: Graph of cos, arccos and asymptotes.
Fig. 27: Graph of tan, arctan and asymptotes.
Fig. 28: Graph of exp, ln.
Fig. 29: Sketch for Example 2.3.54. The dots in the corners B and F indicate right angles.
It is not uncommon that, in a first step, in the definition of a continuous function $f$ certain real numbers have to be excluded from the domain since the expression used for the definition is not defined in those points. Such points are called singularities of $f$, although not part of the domain of $f$. Most frequent is the case that the definition in a point would involve division by 0. Since this division is not defined, that point has to be excluded from the domain of $f$. In particular in applications, singularities of functions are points of interest. For instance, in physics they often signal the breakdown of theories at such locations. In case that there is a continuous function $\hat{f}$ whose restriction to the domain of $f$ coincides with $f$ and whose domain, in addition, contains a singularity of $f$, then that singularity is called removable and $\hat{f}$ a continuous extension of $f$. If $x_s \in \mathbb{R}$ is a singularity of $f$ and if there is a sequence $x_1, x_2, \dots$ in the domain of $f$ that is convergent to $x_s$, then it follows by the assumed continuity of $\hat{f}$ that
\[ \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \hat{f}(x_n) = \hat{f}(x_s) \]
and hence that every continuous extension of $f$ containing $x_s$ in its domain assumes the same value in $x_s$. Continuous functions with singularities that are not removable are easy to construct. For instance, $f : \mathbb{R}^* \to \mathbb{R}$ defined by $f(x) := 1/x$ has a singularity at $x = 0$ and the sequence
\[ f(1/1), f(1/2), f(1/3), \dots \]
diverges. Since
\[ \lim_{n \to \infty} \frac{1}{n} = 0 , \]
it follows that there is no continuous extension of $f$ whose domain contains 0. The following is an often appearing case of a removable singularity.

Fig. 30: Graphs of f (red) and h (blue) from Example 2.3.54.
Example 2.3.54. (Removable singularities) Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := \frac{\sin(x)}{x} \]
for every $x \in \mathbb{R}^*$ and $f(0) := 1$. Then $f$ is continuous. Proof: By Theorem 2.3.48, the continuity of $\sin$ and the linear function $p : \mathbb{R} \to \mathbb{R}$, defined by $p(x) := x$, $x \in \mathbb{R}$, see Example 2.3.29, it follows the continuity of $f$ in all points of $\mathbb{R}^*$. The proof that $f$ is also continuous in $x = 0$ follows from the following inequality (compare Fig 30):
\[ \left| \frac{\sin(x)}{x} - 1 \right| \le \frac{1}{\cos(x)} - 1 , \tag{2.3.18} \]
for all $x \in (-\pi/2, \pi/2) \setminus \{0\}$. For its derivation and in a first step, we assume that $0 < x < \pi/2$ and consider the triangle $ADF$ in Fig 29, in particular the areas $A(ABF)$, $A(ACF)$ and $A(ADF)$ of the triangle $ABF$, the circular sector $ACF$ and the triangle $ADF$, respectively. Then we have the following relation:
\[ A(ABF) \le A(ACF) \le A(ADF) \]
and hence
\[ \frac{1}{2} \sin(x) \cos(x) \le \frac{x}{2} \le \frac{\tan(x)}{2} \]
and
\[ \cos(x) \le \frac{\sin(x)}{x} \le \frac{1}{\cos(x)} . \]
From this follows, by the symmetries of $\sin$, $\cos$ under sign change of the argument, the same inequality for $-\pi/2 < x < 0$. Hence for $x \in (-\pi/2, \pi/2) \setminus \{0\}$:
\[ \cos(x) - 1 \le \frac{\sin(x)}{x} - 1 \le \frac{1}{\cos(x)} - 1 \]
and
\[ 1 - \frac{\sin(x)}{x} \le 1 - \cos(x) \le \frac{1}{\cos(x)} - 1 \]
and hence finally (2.3.18). Now since $h : (-\pi/2, \pi/2) \to \mathbb{R}$ defined by
\[ h(x) := \frac{1}{\cos(x)} - 1 , \]
for all $x \in (-\pi/2, \pi/2)$ is continuous, it follows by (2.3.18) and Theorem 2.3.10 the continuity of $f$ also in $x = 0$.
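The inequality (2.3.18) can be spot-checked numerically on a grid of points. The following is a small sketch, not part of the original text and no substitute for the proof; the names `lhs`, `rhs` and `check_inequality` as well as the grid size are illustrative assumptions.

```python
import math


def lhs(x):
    """Left hand side of (2.3.18): |sin(x)/x - 1| for x != 0."""
    return abs(math.sin(x) / x - 1.0)


def rhs(x):
    """Right hand side of (2.3.18): 1/cos(x) - 1."""
    return 1.0 / math.cos(x) - 1.0


def check_inequality(samples=1000):
    """Test (2.3.18) on a grid of nonzero points in (-pi/2, pi/2)."""
    for k in range(1, samples):
        x = (math.pi / 2) * k / samples
        for point in (x, -x):
            # Small slack for floating point rounding.
            if lhs(point) > rhs(point) + 1e-15:
                return False
    return True


if __name__ == "__main__":
    print("inequality holds on the grid:", check_inequality())
    for x in (0.5, 0.1, 0.01):
        print(x, lhs(x), rhs(x))
```

Both sides tend to 0 for arguments approaching 0, which is the property used in the final step of the proof.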
Remark 2.3.55. The alert reader might have noticed that geometric intuition was used in the derivation of the inequality (2.3.18) that is also used
further on, although such intuition is no longer admitted in proofs. Indeed, this could be avoided by introducing the sine and cosine functions
by their power series expansions, see Example 3.4.27 from Calculus II, but
this would take us too far off course.
Often, in particular in applications, functions occur that are defined on unbounded intervals of the real numbers. For instance, such appear in the description of the frequently occurring physical systems of infinite extension, like the motion of planets and comets around the sun. In such cases the behavior of the function near $+\infty$ and/or $-\infty$ is of interest. Such study would be much simplified if $+\infty$ and $-\infty$ were part of the real numbers, which is not the case. But there is a simple method to reduce the discussion of the behavior of a function near $+\infty$ and/or $-\infty$ to that of a related function near 0 which is based on the fact that the auxiliary function $h := (\mathbb{R}^* \to \mathbb{R}, x \mapsto 1/x)$ maps large positive real numbers to small positive numbers and large negative real numbers to small negative numbers. Hence the behavior of a function $f$ near $\pm\infty$ is completely determined by the behavior of the function $\bar{f} := f \circ h$ near 0. This fact provides a simple method for the calculation of limits at infinity.
Theorem 2.3.56. (Limits at infinity) Let $a > 0$ and $L$ be some real number.
(i) If $f : [a, \infty) \to \mathbb{R}$ is continuous, then
\[ \lim_{x \to \infty} f(x) = L \]
if and only if the transformed function $\bar{f} : [0, 1/a] \to \mathbb{R}$ defined by
\[ \bar{f}(x) := f(1/x) \]
for all $x \in (0, 1/a]$ and $\bar{f}(0) := L$ is continuous in 0. In this case, we call the parallel to the x-axis through $(0, L)$ a 'horizontal asymptote of $G(f)$ for large positive $x$'.
(ii) If $f : (-\infty, -a] \to \mathbb{R}$ is continuous, then
\[ \lim_{x \to -\infty} f(x) = L \]
if and only if the transformed function $\bar{f} : [-1/a, 0] \to \mathbb{R}$ defined by
\[ \bar{f}(x) := f(1/x) \]
for all $x \in [-1/a, 0)$ and $\bar{f}(0) := L$ is continuous in 0. In this case, we call the parallel to the x-axis through $(0, L)$ a 'horizontal asymptote of $G(f)$ for large negative $x$'.
Proof. “(i)”: If
\[ \lim_{x \to \infty} f(x) = L , \tag{2.3.19} \]
we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $(0, 1/a]$ that is convergent to 0. As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that
\[ x_n = |x_n| \le \frac{1}{m + 1} \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$. This implies that
\[ \frac{1}{x_n} \ge m + 1 \ge m \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$. Hence it follows from (2.3.19) that
\[ L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar{f}(x_n) . \]
Obviously, this also implies that
\[ \lim_{n \to \infty} \bar{f}(x_n) = L \]
for sequences $x_1, x_2, \dots$ in $[0, 1/a]$ that are convergent to 0 and hence that $\bar{f}$ is continuous in 0. On the other hand, if $\bar{f}$ is continuous in 0, we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $[a, \infty)$ which contains only finitely many members that are $\le m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that $x_n > m + 1$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. This also implies that
\[ \left| \frac{1}{x_n} \right| = \frac{1}{x_n} < \frac{1}{m + 1} \]
for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that
\[ \lim_{n \to \infty} \frac{1}{x_n} = 0 \]
and hence by the continuity of $\bar{f}$ in 0 that
\[ L = \lim_{n \to \infty} \bar{f}\left(\frac{1}{x_n}\right) = \lim_{n \to \infty} f(x_n) . \]
Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.19) follows.
“(ii)”: The proof is analogous to that of (i). If
\[ \lim_{x \to -\infty} f(x) = L , \tag{2.3.20} \]
we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $[-1/a, 0)$ that is convergent to 0. As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that
\[ x_n = -|x_n| \ge -\frac{1}{m + 1} \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$. This implies that
\[ \frac{1}{x_n} \le -(m + 1) \le -m \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$. Hence it follows from (2.3.20) that
\[ L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar{f}(x_n) . \]
Obviously, this also implies that
\[ \lim_{n \to \infty} \bar{f}(x_n) = L \]
for sequences $x_1, x_2, \dots$ in $[-1/a, 0]$ that are convergent to 0 and hence that $\bar{f}$ is continuous in 0. On the other hand, if $\bar{f}$ is continuous in 0, we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $(-\infty, -a]$ which contains only finitely many members that are $\ge -m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that $x_n < -(m + 1)$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. This also implies that
\[ \left| \frac{1}{x_n} \right| = -\frac{1}{x_n} < \frac{1}{m + 1} \]
for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that
\[ \lim_{n \to \infty} \frac{1}{x_n} = 0 \]
and hence by the continuity of $\bar{f}$ in 0 that
\[ L = \lim_{n \to \infty} \bar{f}\left(\frac{1}{x_n}\right) = \lim_{n \to \infty} f(x_n) . \]
Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.20) follows.
Fig. 31: G(f) and asymptote for Example 2.3.57.
Example 2.3.57. Consider the function $f : [1, \infty) \to \mathbb{R}$ defined by
\[ f(x) := \frac{x^2 - 1}{x^2 + 1} \]
for all $x \in [1, \infty)$. See Fig. 31. Then the transformed function $\bar{f} : [0, 1] \to \mathbb{R}$, defined by
\[ \bar{f}(x) := \frac{1 - x^2}{1 + x^2} \]
for all $x \in [0, 1]$, is continuous and hence since
\[ \bar{f}(x) = f(1/x) \]
for all $x \in (0, 1]$, it follows that
\[ \lim_{x \to \infty} f(x) = \bar{f}(0) = 1 . \]
Hence $y = 1$ is a horizontal asymptote of $G(f)$ for large positive $x$. See Fig. 31.
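The substitution $x \mapsto 1/x$ of Theorem 2.3.56 can be tried out numerically for the function of Example 2.3.57. The following is a minimal sketch, not part of the original text; the Python names `f` and `f_bar` simply mirror the symbols of the example, and the sample points are arbitrary.

```python
def f(x):
    """Function of Example 2.3.57 on [1, infinity)."""
    return (x * x - 1.0) / (x * x + 1.0)


def f_bar(x):
    """Transformed function on [0, 1]; its value at 0 is the limit of f at infinity."""
    return (1.0 - x * x) / (1.0 + x * x)


if __name__ == "__main__":
    # f_bar(x) agrees with f(1/x) for x in (0, 1].
    for x in (1.0, 0.5, 0.1, 0.01):
        print(x, f_bar(x), f(1.0 / x))
    # Continuity of f_bar in 0 gives the limit of f at infinity.
    print("f_bar(0) =", f_bar(0.0))
    print("f(1e6)   =", f(1e6))
```

The value of the transformed function at 0 is read off directly, whereas the original function has to be evaluated at ever larger arguments.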
Fig. 32: G(f) and asymptotes for Example 2.3.58.
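Example 2.3.58. Find the limits
\[ \lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} . \]
Solution: Define $f : \{x \in \mathbb{R} : x \ne 5/3\} \to \mathbb{R}$ by
\[ f(x) := \frac{\sqrt{2x^2 + 1}}{3x - 5} \]
for all $x \in \mathbb{R}$ such that $x \ne 5/3$. Then
\[ f(1/x) = \frac{x}{|x|} \cdot \frac{\sqrt{2 + x^2}}{3 - 5x} \]
for all $x \ne 0$ such that $x \ne 3/5$. Hence the transformed functions $\bar{f}$ corresponding to the restrictions of $f$ to $[2, \infty)$ and $(-\infty, -1]$ are given by the continuous functions
\[ \bar{f}(x) := \frac{\sqrt{2 + x^2}}{3 - 5x} \]
for all $x \in [0, 1/2]$ and
\[ \bar{f}(x) := -\frac{\sqrt{2 + x^2}}{3 - 5x} \]
for all $x \in [-1, 0]$, respectively, and hence
\[ \lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = \frac{\sqrt{2}}{3} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = -\frac{\sqrt{2}}{3} . \]
See Fig. 32.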
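The two limits of Example 2.3.58 can be cross-checked by evaluating the function at arguments of large absolute value. This is only a numerical sketch, not part of the original text; the sample arguments are arbitrary choices.

```python
import math


def f(x):
    """Function of Example 2.3.58, defined for x != 5/3."""
    return math.sqrt(2.0 * x * x + 1.0) / (3.0 * x - 5.0)


if __name__ == "__main__":
    target = math.sqrt(2.0) / 3.0
    for x in (1e2, 1e4, 1e8):
        # Compare with sqrt(2)/3 for large positive and with -sqrt(2)/3 for large negative x.
        print(x, f(x) - target, f(-x) + target)
```

The printed differences shrink as the arguments grow, in agreement with the limits $\sqrt{2}/3$ and $-\sqrt{2}/3$.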
Problems
1) Show the continuity of the function f . For this, use only Theorems
2.3.46, 2.3.48, 2.3.51 on sums, products/quotients, compositions of
continuous functions, and the continuity of constant functions/the
identity function idR on R.
a)
b)
c)
d)
e)
f pxq : x 7 , x P R ,
f pxq : x2 , x P R ,
f pxq : 3{x , x P R ,
f pxq : px 3q{px 8q , x P R zt8u ,
f pxq : px2 3x 2q{px2 2x 2q , x P R .
2) Assume that f and g are continuous functions in x
f p0q 2 and
lim r2f pxq 3g pxqs 1 .
Calculate g p1q.
0 such that
Ñ0
x
In the following, it can be assumed that rational functions, i.e., quotients
of polynomial functions, are continuous on their domain of definition.
In addition, it can be assumed that the exponential function, the natural
logarithm function, the general power function, the sine and cosine function and the tangent function are continuous.
3) For arbitrary c, d P R, define fc,d : R Ñ R by
fc,d pxq :
$
2
'
&1 x
{p 1q if x P p8, 1q
cx d
if x P r1, 1s
'
%?
4x 5
if x P p1, 8q
for all x P R. Determine c, d such that the corresponding fc,d is
everywhere continuous. Give reasons.
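4) For arbitrary $c \in \mathbb{R}$, define $f_c : [0, \infty) \to \mathbb{R}$ by
\[ f_c(x) := \begin{cases} x \sin(1/x) & \text{if } x \in (0, \infty) \\ c & \text{if } x = 0 \end{cases} \]
for all $x \in [0, \infty)$. Determine $c$ such that the corresponding $f_c$ is everywhere continuous. Explain your answer.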
5) For every k
P R, define fk : r1{3, 8q zt1u Ñ R by
fk pxq :
#?
?
if x P r1{3, 8q zt1u
.
if x 1
3x 1
2x 2
x 1
k
For what value of k is fk continuous? Give explanations.
6) Define the function f : R Ñ R by f pxq : x4 10x 15 for all
real x. Use your calculator to find an interval of length 1{100 which
contains a zero of f (i.e, some real x such that f pxq 0). Give
explanations.
7) Determine in each case whether the given sequence has a limit. If
there is one, calculate that limit. Otherwise, give arguments why
there is no limit.
xn :
a)
?
n
?
1 n
,
xn :
b)
?
?n
n
1
for all n P N.
8) Find the limits.
a) lim e1{n
n
Ñ8
cospnq
c) lim
nÑ8
n
lnpnq
d) lim
nÑ8
n
1{n
e) lim n
n
Ñ8 f) lim
n
a
Ñ8 n2
,
b) lim cos
n
Ñ8
1
n
π
,
?
,
Hint: Use that lnpnq ¤ 2 n for n ¥ 1 .
,
Hint: Use d)
6n n
,
1{3
1{3
Ñ8 n pn 1q
g) lim
n
n
Hint: Use that a b a2
,
a3 b3
for all a b
ab b2
,
h) lim
Ñ0
p1
h
hq2{3 1
h
,
Hint: Use the hint in g) .
9) Calculate the limits.
17n 4
a) lim sin
, b) lim tan
nÑ8
xÑ2
n 5
2
x 8x 15
c) lim
.
xÑ5
x5
3x
5x
2
7
,
In each case, give explanations.
10) Find the limits
a)
b)
c)
d)
e)
f)
g)
h)
i)
limxÑ8 rx{px 1qs ,
limxÑ8 rx{px 1qs ,
limxÑ8 rpsin xq{xs ,
limxÑ8 rp3x3 2x2 5x 4q { p2x3 x2
?
limxÑ8 p x{ 1 x2 q ,
?
limxÑ8 p x{ 1 x2 q ,
?
?
limxÑ8 p 3x2 2x { 2x2 5 q ,
?2
?2
limxÑ8 p x
? 2 3 x ?1 q ,2
limxÑ8 p x
4x 5 x
2q.
11) Define f : R Ñ R by
f pxq :
x
5qs ,
#
x if x is rational
0 if x is irrational
for every x P R. Find the points of discontinuity of f .
12) Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := 0$ if $x = 0$ or if $x$ is irrational and $f(m/n) := 1/n$ if $m \in \mathbb{Z}^*$ and $n \in \mathbb{N}^*$ have no common divisor greater than 1. Find the points of discontinuity of $f$.
13) Let $f$ and $g$ be continuous functions from $\mathbb{R}$ to $\mathbb{R}$ whose restrictions to $\mathbb{Q}$ coincide. Show that $f = g$.
14) Let $D \subset \mathbb{R}$, $f : D \to \mathbb{R}$ be continuous in some $x \in D$ and $f(x) > 0$. By an indirect proof, show that there is $\varepsilon > 0$ such that $f(y) > 0$ for all $y \in D \cap (x - \varepsilon, x + \varepsilon)$.
15) Let $a, b \in \mathbb{R}$ such that $a < b$ and $f : [a, b] \to [a, b]$ be continuous. By use of the intermediate value theorem, show that $f$ has a fixed point, i.e., that there is $x \in [a, b]$ such that $f(x) = x$.
16) Use the intermediate value theorem to prove that for every $a \ge 0$ there is a uniquely determined $x \ge 0$ such that $x^2 = a$. That $x$ is denoted by $\sqrt{a}$.
2.4 Differentiation

Fig. 33: Graph of A and its point with maximum ordinate.
Possibly, the first mathematician to use the derivative concept in some implicit form is Pierre de Fermat in his calculation of maximum / minimum
ordinate values of curves in Cartesian coordinate systems and in his way of
determination of tangents at the points of curves.
The first may be due to the observation that the ordinate values of a curve
near a maximum (or a minimum) change very little near the abscissa of its
location, differently to other points of the curve. It is not clear whether this
was his real motivation because he never published his method, but only described it in communications to other mathematicians from 1637 onwards.
Also in these instances, he did not explain its logical basis so that its general
validity was quickly questioned. On the other hand, his procedure suggests
that observation as the basis of the method. For display of the method,
he considers the problem of finding the maximal area of a rectangle with perimeter $2b$ where $b > 0$. If $x \ge 0$ denotes the width of such a rectangle, the corresponding area is given by
\[ A(x) := x (b - x) , \]
see Fig 33. If $(x_0, A(x_0))$ is the point of $G(A)$ with maximal ordinate, then
\begin{align*}
A(x_0 + h) &= (x_0 + h)(b - x_0 - h) = x_0 (b - x_0) + h (b - 2x_0 - h) \\
&= A(x_0) + h (b - 2x_0 - h) \approx A(x_0) \tag{2.4.1}
\end{align*}
for $h$ such that $x_0 + h \in D(A) = [0, b]$ and of small absolute value where $\approx$ means 'approximately'. Hence if $h \ne 0$,
\[ b - 2x_0 - h \approx 0 . \]
By neglecting the term $-h$ on the left hand side of the last relation, he arrives at the equation
\[ b - 2x_0 = 0 \tag{2.4.2} \]
and hence at $x_0 = b/2$ which gives $A(x_0) = b^2/4$. Indeed, the rectangle with perimeter $2b$ of maximal area is given by a square with sides $b/2$. We note that from (2.4.1) it follows that
\[ \frac{A(x_0 + h) - A(x_0)}{h} = b - 2x_0 - h \]
if $h \ne 0$. Hence the equation (2.4.2) is equivalent to the demand that
\[ \lim_{h \to 0, h \ne 0} \frac{A(x_0 + h) - A(x_0)}{h} = 0 \]
where the addition of $h \ne 0$ in the limit symbol indicates that only sequences with non-vanishing members are admitted. In modern calculus / analysis, the limit on the left of the last equation is called the derivative of $A$ in $x_0$ and is denoted by $A'(x_0)$. Hence in modern terms, Fermat demands that $A'(x_0) = 0$. Indeed, the vanishing of the derivative in a point is necessary, but not sufficient, for a (differentiable) function to assume an extremum, i.e., a minimum or maximum value, in that point, see Theorem 2.5.1.
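Fermat's computation for the rectangle of perimeter $2b$ can be replayed numerically. The following is a minimal sketch, not part of the original text; the value $b = 4$ and the function names `area` and `difference_quotient` are illustrative assumptions. It evaluates the quotient of the change of the area by $h$, which equals $b - 2x_0 - h$ by (2.4.1).

```python
def area(x, b):
    """Area A(x) = x (b - x) of a rectangle with width x and perimeter 2b."""
    return x * (b - x)


def difference_quotient(x0, h, b):
    """(A(x0 + h) - A(x0)) / h for h != 0, which equals b - 2*x0 - h."""
    return (area(x0 + h, b) - area(x0, b)) / h


if __name__ == "__main__":
    b = 4.0  # illustrative choice of half the perimeter
    x0 = b / 2
    for h in (0.1, 0.01, 0.001):
        # At x0 = b/2 the quotient equals -h and tends to 0 with h.
        print(h, difference_quotient(x0, h, b))
    print("maximal area:", area(x0, b), "=", b * b / 4)
```

At any other width the quotient tends to $b - 2x_0 \ne 0$, which is why only $x_0 = b/2$ satisfies Fermat's condition.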
Fig. 34: Depiction to Fermat's method of determination of tangents.

Fermat uses a similar method for the determination of tangent lines to curves. To a greater extent, such were not studied until the middle of the 17th century. Apart from Archimedes' construction of tangent lines to his spiral, in ancient Greece, tangents were constructed only in few simple cases, namely for ellipses, parabolas and hyperbolas where they were defined as lines that touch the curve in only one point. In general, this definition is too imprecise. In particular, the concept of differentiation also gives a precise meaning to tangent lines to curves. For the description of Fermat's method, we consider Fig 34 which displays the graph of a function $f$ together with its tangent at the point $(a, f(a))$ and the normal to the tangent in this point. By definition, the tangent goes through the point $(a, f(a))$ and hence is determined once we know the location of its intersection $(a - c, 0)$ with the x-axis where $c$ is the unknown. For the determination of $c$, Fermat considers the triangles with corners $(a - c, 0)$, $(a, 0)$, $(a, f(a))$ and $(a - c, 0)$, $(a + h, 0)$, $(a + h, f(a + h))$ to be approximately similar in the case of a tangent. These triangles would be similar only if the point $(a + h, f(a + h))$ lay on the tangent. In general, the error of the approximation is becoming smaller with smaller $h$. The approximation gives the relation
\[ \frac{f(a + h)}{f(a)} \approx \frac{c + h}{c} \]
or
\[ (c + h) f(a) \approx c f(a + h) . \]
The last gives
\[ c \approx \frac{h f(a)}{f(a + h) - f(a)} . \]
If $f$ is explicitly given, Fermat proceeds further by performing the division and neglecting $h$ as in his previous method. For instance if $f(x) = x^2$ for all $x \in \mathbb{R}$, then
\[ \frac{h f(a)}{f(a + h) - f(a)} = \frac{h a^2}{(a + h)^2 - a^2} = \frac{h a^2}{2ah + h^2} = \frac{a^2}{2a + h} \]
which leads to $c = a/2$. Indeed, this is the correct result. Also, using modern notation and assuming that
\[ f'(a) := \lim_{h \to 0, h \ne 0} \frac{f(a + h) - f(a)}{h} \ne 0 , \]
the following
\[ c = \lim_{h \to 0, h \ne 0} \frac{h f(a)}{f(a + h) - f(a)} = \frac{f(a)}{f'(a)} \]
gives the correct result. As a side remark, in older literature, the directed line segment $(a, f(a))$, $(a - c, 0)$ is called the tangent line in $(a, f(a))$ and its projection onto the x-axis the corresponding subtangent. In addition, the directed line segment $(a, f(a))$, $(a + d, 0)$ is called the normal in $(a, f(a))$ and its projection onto the x-axis the corresponding subnormal. When Fermat's method was reported to René Descartes by Marin Mersenne in 1638, Descartes attacked it as not generally valid. He proposed as a challenge the curve
\[ C := \{(x, y) \in \mathbb{R}^2 : x^3 + y^3 = 3axy\} , \tag{2.4.3} \]
$a \in \mathbb{R}$, which since then is known as 'Folium of Descartes'. Indeed, Fermat's 'method' produced the right results, and ultimately Descartes conceded its validity.
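Fermat's approximation of the subtangent, the quotient of $h f(a)$ by $f(a + h) - f(a)$, can be evaluated for decreasing $h$. The following is a small sketch, not part of the original text; the name `subtangent_estimate` and the choice $a = 3$ are illustrative assumptions. For the square function the values approach $a/2$.

```python
def subtangent_estimate(f, a, h):
    """Fermat's approximation h*f(a) / (f(a + h) - f(a)) of the subtangent c, for h != 0."""
    return h * f(a) / (f(a + h) - f(a))


if __name__ == "__main__":
    square = lambda x: x * x
    a = 3.0  # illustrative point of tangency
    for h in (1.0, 0.1, 0.001, 1e-6):
        # For f(x) = x^2 the estimate equals a^2 / (2a + h).
        print(h, subtangent_estimate(square, a, h))
    print("a/2 =", a / 2)
```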
Fig. 35: Folium of Descartes for the case a = 1, compare (2.4.3).

A further candidate for the first mathematician to use the derivative concept in some implicit form is Galileo Galilei. In 1589, using inclined planes, Galileo discovered experimentally that in vacuum all bodies, regardless of their weight, shape, or composition, are uniformly accelerated in exactly the same way, and that the fallen distance $s$ is proportional to the square of the elapsed time $t$:
\[ s(t) = \frac{1}{2} g t^2 \tag{2.4.4} \]
for all $t \in \mathbb{R}$ where $g \approx 9.8\,\mathrm{m/sec^2}$ is the gravitational acceleration. This result was in contradiction to the generally accepted traditional theory of Aristotle that assumed that heavier objects fall faster than lighter ones. On the Third Day of his 'Discorsi' from 1638 [42], he discusses uniform and naturally accelerated motion. The idea that the velocity is the same as a derivative can be read between the lines. Even a recognition of the fundamental theorem of calculus, see Theorem 2.6.19, is visible in this special case. A modern way of deduction would proceed, for instance, as follows.
For this, we consider the average speed of a falling body described by (2.4.4), i.e., the traveled distance divided by the elapsed time, during the time interval $[t, t + h]$, if $h \ge 0$, and $[t + h, t]$, if $h < 0$, respectively, for some $t, h \in \mathbb{R}$. Then
\[ \frac{s(t + h) - s(t)}{t + h - t} = \frac{g}{2h} \left[ (t + h)^2 - t^2 \right] = \frac{g}{2h} (2ht + h^2) = g \left( t + \frac{h}{2} \right) . \]
Hence it follows by Example 2.3.29 that
\[ \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} = gt , \]
which suggests itself as (and indeed is the) definition of the instantaneous speed $v(t)$ of the body at time $t$:
\[ v(t) := s'(t) := \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} = gt . \]
For a geometrical interpretation of the limit
\[ \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} , \]
also in more general situations where $s$ is not necessarily given by (2.4.4), note that the quotient
\[ \frac{s(t + h) - s(t)}{t + h - t} \]
gives the slope of the line segment ('secant') between the points $(t, s(t))$ and $(t + h, s(t + h))$ on the graph of $s$ for every $h \ne 0$. In the limit $h \to 0$ that slope approaches the slope of the tangent to $G(s)$ in the point $(t, s(t))$. Hence in particular, a geometrical interpretation of $s'(t) = v(t)$ is the slope of the tangent to $G(s)$ at the point $(t, s(t))$, see Fig. 36.
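The passage from average to instantaneous speed can be illustrated with a few numbers. The following is a minimal sketch, not part of the original text; the constant `G = 9.8` and the time 0.4 are illustrative values. It compares the secant slope with $g(t + h/2)$ and with the limit $gt$.

```python
G = 9.8  # gravitational acceleration in m/sec^2 (illustrative value)


def s(t):
    """Fallen distance s(t) = g t^2 / 2 as in (2.4.4)."""
    return 0.5 * G * t * t


def average_speed(t, h):
    """Secant slope (s(t + h) - s(t)) / h for h != 0."""
    return (s(t + h) - s(t)) / h


if __name__ == "__main__":
    t = 0.4
    for h in (0.4, 0.1, 0.001, -0.001):
        # The secant slope equals g (t + h/2) and approaches g t as h tends to 0.
        print(h, average_speed(t, h), G * (t + h / 2))
    print("instantaneous speed g t =", G * t)
```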
Fig. 36: G(s), secant line and tangent at (0.4, s(0.4)).

As for the definition of the continuity of functions, Cauchy, in his textbook 'Cours d'analyse' from 1821 [22] and by using Lagrange's notation, terminology and Lagrange's characterization of the derivative in terms of inequalities, was the first to give a definition of the derivative of a function based on limits which is very near to the modern definition. Still, his understanding of limits was different from the modern understanding. This was not without consequences. During the early 19th century, it resulted in the general belief that every continuous function is everywhere differentiable, except perhaps at finitely many points. Even several 'proofs' of this 'fact' appeared during that time. Therefore, it came as a shock when in 1872 [99] Weierstrass proved the existence of a continuous function which is nowhere differentiable, see Example 3.4.13. For the first time, this result signaled the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.
Definition 2.4.1. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ such that $a < b$. Further, let $x \in (a, b)$ and $c \in \mathbb{R}$. We say $f$ is differentiable in $x$ with derivative $c$ if for all sequences $x_0, x_1, \dots$ in $(a, b) \setminus \{x\}$ which are convergent to $x$ it follows that
\[ \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} = c . \]
In this case, we define the derivative $f'(x)$ of $f$ in $x$ by
\[ f'(x) := c . \]
Further, we say $f$ is differentiable if $f$ is differentiable in all points of its domain $(a, b)$. In that case, we call the function $f' : (a, b) \to \mathbb{R}$ associating to every $x \in (a, b)$ the corresponding $f'(x)$ the derivative of $f$. Higher order derivatives of $f$ are defined recursively. If $f^{(k)}$ is differentiable for $k \in \mathbb{N}^*$, we define the derivative $f^{(k+1)}$ of order $k + 1$ of $f$ by $f^{(k+1)} := (f^{(k)})'$, where we set $f^{(1)} := f'$. In that case, $f$ will be referred to as $(k+1)$-times differentiable. Frequently, we also use the notation $f'' := f^{(2)}$ and $f''' := f^{(3)}$.
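Definition 2.4.1 demands the same limit of the difference quotients along every admissible sequence. The following is a minimal sketch, not part of the original text; the helper name `difference_quotients` and the sample sequences are illustrative assumptions. It evaluates the quotients along two sequences converging to the point from the right and from the left.

```python
def difference_quotients(f, x, sequence):
    """Quotients (f(x_n) - f(x)) / (x_n - x) along a sequence with members different from x."""
    return [(f(xn) - f(x)) / (xn - x) for xn in sequence]


if __name__ == "__main__":
    from_right = [1.0 / n for n in range(1, 6)]
    from_left = [-1.0 / n for n in range(1, 6)]

    square = lambda t: t * t
    # For the square function at x = 0, both lists tend to 0.
    print(difference_quotients(square, 0.0, from_right))
    print(difference_quotients(square, 0.0, from_left))

    # For the modulus function at x = 0, the quotients are 1 from the right
    # and -1 from the left, so no common limit exists.
    print(difference_quotients(abs, 0.0, from_right))
    print(difference_quotients(abs, 0.0, from_left))
```

A finite list of quotients can of course only suggest a limit; for the modulus function, however, the two constant lists already show that the condition of the definition fails in 0.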
The differentiability of a function in a point of its domain implies also its
continuity in that point. This is a simple consequence of the definition of
differentiability and the limit laws Theorem 2.3.4. That the opposite is not
true in general, can be seen from Example 2.4.6 or Example 2.4.7. Moreover in Calculus II, we give an example of a continuous function which is
not differentiable in any point of its domain, see Example 3.4.13.
Theorem 2.4.2. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ such that $a < b$. Further, let $f$ be differentiable in $x \in (a, b)$. Then $f$ is also continuous in $x$.
Proof. Let $x_0, x_1, \dots$ be a sequence in $(a, b)$ which is convergent to $x$. Obviously, it is sufficient to assume that $x_0, x_1, \dots$ is a sequence in $(a, b) \setminus \{x\}$. Then it follows by the limit laws Theorem 2.3.4 that
\[ \lim_{n \to \infty} (f(x_n) - f(x)) = \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} \cdot \lim_{n \to \infty} (x_n - x) = 0 \]
and hence that
\[ \lim_{n \to \infty} f(x_n) = f(x) . \]
Similar to the case of continuous functions, we shall see later on, see Theorems 2.4.8, 2.4.10, that sums, products, quotients (wherever defined) and
compositions of differentiable functions are differentiable. Indeed, this is
another simple consequence of the limit laws, Theorem 2.3.4, and the
definition of differentiability. As usual, a typical application of those theorems consists in the decomposition of a given function into sums, products,
quotients and compositions of functions whose differentiability is already
known. Then the application of those theorems proves the differentiability of that function and allows the calculation of its derivative. To provide
a basis for the application of those theorems, in the following, we prove
the differentiability of some elementary functions, powers, the exponential function and the sine function, from the definition of differentiability
and by use of their special properties. In this process, we also explicitly
calculate the derivatives.
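Example 2.4.3. Let $c \in \mathbb{R}$, $n \in \mathbb{N}^*$ and $f, g : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := c , \quad g(x) := x^n \]
for all $x \in \mathbb{R}$. Then $f$, $g$ are differentiable and
\[ f'(x) = 0 , \quad g'(x) = n x^{n-1} \]
for all $x \in \mathbb{R}$.
Proof. Let $x \in \mathbb{R}$ and $x_0, x_1, \dots$ be a sequence of numbers in $\mathbb{R} \setminus \{x\}$ which is convergent to $x \in \mathbb{R}$. Then:
\[ \lim_{\nu \to \infty} \frac{f(x_\nu) - f(x)}{x_\nu - x} = \lim_{\nu \to \infty} 0 = 0 . \]
Further, for any $\nu \in \mathbb{N}$:
\[ \frac{g(x_\nu) - g(x)}{x_\nu - x} = \frac{(x_\nu)^n - x^n}{x_\nu - x} = (x_\nu)^{n-1} + (x_\nu)^{n-2} x + \dots + x_\nu x^{n-2} + x^{n-1} \]
and hence by Example 2.3.49:
\[ \lim_{\nu \to \infty} \frac{g(x_\nu) - g(x)}{x_\nu - x} = x^{n-1} + x^{n-2} x + \dots + x x^{n-2} + x^{n-1} = n x^{n-1} . \]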
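The algebraic identity behind the derivative of the $n$-th power can be checked with a few numbers. The following is a small sketch, not part of the original text; the function name `factored_quotient` and the sample values are illustrative assumptions.

```python
def factored_quotient(x_nu, x, n):
    """Sum x_nu^(n-1) + x_nu^(n-2) x + ... + x^(n-1), which equals (x_nu^n - x^n)/(x_nu - x)."""
    return sum(x_nu ** (n - 1 - k) * x ** k for k in range(n))


if __name__ == "__main__":
    x, n = 1.5, 4
    for x_nu in (2.0, 1.6, 1.51, 1.5001):
        direct = (x_nu ** n - x ** n) / (x_nu - x)
        print(x_nu, direct, factored_quotient(x_nu, x, n))
    # Setting x_nu = x in the sum gives n * x^(n-1), the derivative of the n-th power.
    print(factored_quotient(x, x, n), n * x ** (n - 1))
```

The sum has the advantage that it can be evaluated also for coinciding arguments, where the original quotient is not defined.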
In the next example, we show that the derivative of the exponential function
is given by that function itself. As we shall see later, this fact along with the
fact that expp0q 1 can be used to characterize the exponential function,
see Example 2.5.8.
Example 2.4.4. The exponential function is differentiable with
\[ \exp'(x) = \exp(x) \]
for all $x \in \mathbb{R}$.
Proof. First, we prove that $\exp$ is differentiable in 0 with derivative $e^0 = 1$. For this, let $h_1, h_2, \dots$ be some sequence in $\mathbb{R} \setminus \{0\}$ which is convergent to 0. Moreover, let $n_0 \in \mathbb{N}$ be such that $|h_n| < 1$ for $n \ge n_0$. Then for any such $n$:
\[ \frac{e^{h_n} - e^0}{h_n - 0} - e^0 = \frac{e^{h_n} - (1 + h_n)}{h_n} . \]
We consider the cases $h_n > 0$ and $h_n < 0$. In the first case, it follows by (2.3.10) and some calculation that
0¤
ehn
p1
hn q
hn
3 hn
¤ h4n 1
hn
2
2
¤ 124 hn 3hn .
Analogously, it follows in the second case that
h
¤ 1 hnh ¤ e hp1
n
hn
n
n
Hence it follows in both cases that
h
e n
hn q
p1
hn
hn q 130
¤ 3|hn|
¤ h4n
.
and therefore by Theorem 2.3.10 that
lim
Ñ8
n
ehn
p1
hn q
hn
0.
Now let x P R and x1 , x2 , . . . be some sequence in R ztxu which is convergent to x. Then
exn ex
xn x
xn x
ex ex e
r1 pxn xqs ,
xn x
and hence it follows by Theorem 2.3.4 and the previous result that
exn ex
lim
nÑ8 x x
n
ex
and therefore the statement of this Theorem.
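The central estimate of this proof, $|(e^{h} - (1 + h))/h| \le 3|h|$ for $0 < |h| < 1$, can be spot-checked numerically. This is a hedged sketch only; the helper names and the sample points are assumptions made for illustration, and a finite sample does not replace the proof.

```python
import math


def exp_remainder_quotient(h):
    """(e^h - (1 + h)) / h for h != 0."""
    return (math.exp(h) - (1.0 + h)) / h


def bound_holds(h):
    """Check |(e^h - (1 + h)) / h| <= 3 |h| at a single point h."""
    return abs(exp_remainder_quotient(h)) <= 3.0 * abs(h)


if __name__ == "__main__":
    # Sample points with 0 < |h| < 1 (illustrative choice).
    samples = [s * 10.0 ** -k for s in (1.0, -1.0) for k in range(1, 5)]
    samples += [0.9, -0.9]
    for h in samples:
        print(f"h = {h:+.4f}: quotient = {exp_remainder_quotient(h):+.6e}, "
              f"bound holds: {bound_holds(h)}")
```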
Example 2.4.5. The sine function is differentiable with
$$\sin'(x) = \cos(x)$$
for all $x \in \mathbb{R}$.
Proof. Let $x \in \mathbb{R}$ and $x_1, x_2, \dots$ be some sequence in $\mathbb{R} \setminus \{x\}$, which is convergent to $x$. Further define $h_n := x_n - x$, $n \in \mathbb{N}$. Then it follows by the addition theorems for the trigonometric functions
$$\frac{\sin(x_n) - \sin(x)}{x_n - x} = \frac{\sin(x + h_n) - \sin(x)}{h_n} = \sin(x) \, \frac{\cos(h_n) - 1}{h_n} + \cos(x) \, \frac{\sin(h_n)}{h_n}$$
$$= - \sin(x) \, \frac{h_n}{2} \left[ \frac{\sin(h_n / 2)}{h_n / 2} \right]^2 + \cos(x) \, \frac{\sin(h_n)}{h_n}$$
and hence by Example 2.3.54 and Theorem 2.3.4 that
$$\lim_{n \to \infty} \frac{\sin(x_n) - \sin(x)}{x_n - x} = \cos(x) .$$
Fig. 37: Graph of the modulus function. See Example 2.4.6.
We give two examples of continuous functions that are not differentiable
in points of their domains. In the first case, this is due to the presence of a
‘corner’ in the graph of the function. In such a point no tangent to the graph
exists and hence the function is not differentiable in the corresponding point
of its domain. In the second case, the non-differentiability is due to the fact that
there is a vertical tangent to the graph. Since the derivative of a function $f$
in a point $p$ of its domain gives the slope of the tangent to its graph at the
point $(p, f(p))$, the derivative in $p$ would have to be infinite in order
to account for a vertical tangent, but infinity is not a real number. Therefore,
a function is not differentiable in such a point $p$.
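Example 2.4.6. The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := |x|$$
for all $x \in \mathbb{R}$, is not differentiable in 0, because
$$\lim_{n \to \infty} \frac{\left| \frac{1}{n} \right| - |0|}{\frac{1}{n} - 0} = 1 \neq -1 = \lim_{n \to \infty} \frac{\left| - \frac{1}{n} \right| - |0|}{- \frac{1}{n} - 0} .$$
See Fig. 37.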
Example 2.4.7. The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := x^{1/3}$$
for all $x \in \mathbb{R}$, is not differentiable in 0, because the sequence
$$\frac{\left( \frac{1}{n} \right)^{1/3} - 0^{1/3}}{\frac{1}{n} - 0} = n^{2/3}$$
has no limit for $n \to \infty$. See Fig. 38.
Fig. 38: Graph of $f$ from Example 2.4.7.
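Both failures of differentiability can be made visible with a few difference quotients. The sketch below is illustrative only; the helper names are assumptions, and the real cube root is implemented by hand because a negative base with a fractional exponent does not yield a real number in Python.

```python
import math


def one_sided_quotients(f, p, n):
    """Difference quotients of f at p along the points p + 1/n and p - 1/n."""
    h = 1.0 / n
    right = (f(p + h) - f(p)) / h
    left = (f(p - h) - f(p)) / (-h)
    return right, left


def cube_root(x):
    """Real cube root, defined for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, "modulus:", one_sided_quotients(abs, 0.0, n),
              "cube root:", one_sided_quotients(cube_root, 0.0, n))
```

For the modulus function the two one-sided quotients stay at 1 and -1, whereas for the cube root they grow like $n^{2/3}$.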
As mentioned above, similar to the case of continuous functions, sums,
products, quotients (wherever defined) and compositions of differentiable
functions are differentiable. This is a simple consequence of the limit laws,
Theorem 2.3.4, and the definition of differentiability. A typical application of the thus obtained theorems consists in the decomposition of a given
function into sums, products, quotients, compositions of functions whose
differentiability is already known. Then the application of those theorems
proves the differentiability of that function and allows the calculation of
its derivative from the derivatives of the constituents of decomposition. In
this way, the proof of differentiability of a given function is greatly simplified and, usually, obvious. Also, the calculation of its derivative is reduced
to a simple mechanical procedure if the derivatives of the constituents of
decomposition are known. Therefore, in such obvious cases in future, the
differentiability of the function will be just stated and its derivative will be
given without explicit proof.
Theorem 2.4.8. (Sum rule, product rules and quotient rule) Let $f, g$ be two differentiable functions from some open interval $I$ into $\mathbb{R}$ and $a \in \mathbb{R}$.
(i) Then $f + g$, $a \cdot f$ and $f \cdot g$ are differentiable with
$$(f + g)'(x) = f'(x) + g'(x) , \quad (a \cdot f)'(x) = a \cdot f'(x) ,$$
$$(f \cdot g)'(x) = f(x) \cdot g'(x) + g(x) \cdot f'(x)$$
for all $x \in I$.
(ii) If $f$ is non-vanishing for all $x \in I$, then $1/f$ is differentiable and
$$\left( \frac{1}{f} \right)'(x) = - \frac{f'(x)}{[f(x)]^2}$$
for all $x \in I$.
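Proof. For this let $x \in I$ and $x_1, x_2, \dots$ be some sequence in $I \setminus \{x\}$ which is convergent to $x$. Then:
$$\frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} \le \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|}$$
and
$$\frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = |a| \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|}$$
and hence
$$\lim_{\nu \to \infty} \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} = 0$$
and
$$\lim_{\nu \to \infty} \frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = 0 .$$
Further, it follows that
$$\frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|}$$
$$\le \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \cdot |g(x)| + |f(x)| \cdot \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \cdot |g(x_\nu) - g(x)|$$
and hence that
$$\lim_{\nu \to \infty} \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 .$$
If $f$ does not vanish in any point of its domain $I$, it follows that
$$\frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} \, f'(x)(x_\nu - x) \right|}{|x_\nu - x|} \le \frac{1}{|f(x)|^2} \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|^2}{|f(x_\nu)| \cdot |f(x)|^2 \cdot |x_\nu - x|}$$
and hence that
$$\lim_{\nu \to \infty} \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} \, f'(x)(x_\nu - x) \right|}{|x_\nu - x|} = 0 .$$
Finally, since $x_1, x_2, \dots$ and $x \in I$ were otherwise arbitrary, the theorem follows.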
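The product rule and the rule for $1/f$ can be compared against numerical derivatives. The following sketch uses central difference quotients, which are a choice made here for accuracy and are not the sequences used in the proof; all names and the sample functions are illustrative assumptions.

```python
import math


def central_difference(F, x, h=1e-6):
    """Symmetric difference quotient of F at x (a numerical stand-in for F'(x))."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


def check_rules(f, df, g, dg, x):
    """Return (product-rule error, reciprocal-rule error) at the point x."""
    product = central_difference(lambda t: f(t) * g(t), x)
    reciprocal = central_difference(lambda t: 1.0 / f(t), x)
    err_product = abs(product - (f(x) * dg(x) + g(x) * df(x)))
    err_reciprocal = abs(reciprocal - (-df(x) / f(x) ** 2))
    return err_product, err_reciprocal


if __name__ == "__main__":
    # f = exp never vanishes, so 1/f is defined everywhere.
    print(check_rules(math.exp, math.exp, math.sin, math.cos, 0.7))
```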
As a simple application of Theorem 2.4.8, we prove the differentiability of
polynomial functions and calculate their derivatives.
Example 2.4.9. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers. Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$, defined by
$$p(x) := a_0 + a_1 x + \dots + a_n x^n$$
for all $x \in \mathbb{R}$, is differentiable and
$$p'(x) = a_1 + \dots + n a_n x^{n-1}$$
for all $x \in \mathbb{R}$.
Proof. The proof is a simple consequence of Example 2.4.3 and Theorem 2.4.8.
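Since the derivative of a polynomial is again a polynomial, the rule of Example 2.4.9 can be stated as an operation on coefficient lists. The sketch below is one possible formulation; the function names and the use of Horner's scheme for evaluation are assumptions of this illustration and do not come from the text.

```python
def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's scheme."""
    value = 0.0
    for a in reversed(coeffs):
        value = value * x + a
    return value


def poly_derivative(coeffs):
    """Coefficients a_1, 2 a_2, ..., n a_n of the derivative polynomial."""
    return [k * a for k, a in enumerate(coeffs)][1:]


if __name__ == "__main__":
    p = [1, -3, 0, 2]          # p(x) = 1 - 3x + 2x^3 (illustrative)
    dp = poly_derivative(p)    # p'(x) = -3 + 6x^2
    print(dp, poly_eval(dp, 2.0))
```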
Theorem 2.4.10. (Chain rule) Let $f : I \to \mathbb{R}$, $g : J \to \mathbb{R}$ be differentiable functions defined on some open intervals $I, J$ of $\mathbb{R}$ and such that the domain of the composition $g \circ f$ is not empty. Then $g \circ f$ is differentiable with
$$(g \circ f)'(x) = g'(f(x)) \cdot f'(x)$$
for all $x \in D(g \circ f)$.
Proof. For this let $x \in D(g \circ f)$ and $x_1, x_2, \dots$ be some sequence in $D(g \circ f) \setminus \{x\}$ which is convergent to $x$. Then:
$$\frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} \le \frac{|g(f(x_\nu)) - g(f(x)) - g'(f(x))(f(x_\nu) - f(x))|}{|x_\nu - x|} + \frac{|g'(f(x))(f(x_\nu) - f(x) - f'(x)(x_\nu - x))|}{|x_\nu - x|}$$
and hence, obviously,
$$\lim_{\nu \to \infty} \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 .$$
Finally, since $x_1, x_2, \dots$ and $x \in D(g \circ f)$ were otherwise arbitrary, the theorem follows.
A typical application of the chain rule is given in the following example.
The cosine function is equal to the composition of the sine function and
the translation $(\mathbb{R} \to \mathbb{R}, x \mapsto x + (\pi/2))$. Since both of these functions are
differentiable, by Theorem 2.4.10, the same is true for their composition. In
addition, by knowledge of the derivatives of these functions, the derivative
of their composition, i.e., the cosine function, can be calculated by use of
the same theorem. In preparation of the calculation of the derivative of the
inverse tangent function, we also show the differentiability of the
tangent function and calculate its derivative from the derivatives of the sine
and the cosine with the help of Theorem 2.4.8.
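Example 2.4.11. The cosine and the tangent function are differentiable with
$$\cos'(x) = - \sin(x)$$
for all $x \in \mathbb{R}$ and
$$\tan'(x) = \frac{1}{\cos^2(x)} = 1 + \tan^2(x)$$
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.
Proof. Since
$$\cos(x) = \sin\left( x + \frac{\pi}{2} \right)$$
for all $x \in \mathbb{R}$, it follows by Examples 2.4.5, 2.4.3 and Theorem 2.4.8 (i.e., the ‘sum rule’) and Theorem 2.4.10 (i.e., the ‘chain rule’) that cos is differentiable with derivative
$$\cos'(x) = \cos\left( x + \frac{\pi}{2} \right) = - \sin(x)$$
for all $x \in \mathbb{R}$. Further, because of
$$\tan(x) = \frac{\sin(x)}{\cos(x)}$$
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$, it follows by Example 2.4.5 and Theorem 2.4.8 (i.e., the ‘quotient rule’) that tan is differentiable with derivative
$$\tan'(x) = \frac{\cos(x) \cos(x) - \sin(x) (- \sin(x))}{\cos^2(x)} = \frac{1}{\cos^2(x)} = 1 + \tan^2(x)$$
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.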
Functions from applications frequently depend on several variables, i.e.,
are defined on subsets of $\mathbb{R}^n$ for some $n \in \mathbb{N}$ such that $n \ge 2$. For such
functions, the concept of differentiation will be formulated in Calculus III.
The calculation of the corresponding derivatives can be reduced to the calculation of derivatives of functions in one variable by help of the concept of
partial derivatives. The last was developed soon after that of differentiation
because of applications. The historic view of the partial derivative was that
of treating all variables of an analytic expression as constant, apart from
one. In this way, there is achieved an analytic expression in one variable
that can be differentiated in the usual way. The result was called a partial derivative of the original expression. The modern definition of partial
derivatives is very similar. To define the partial derivative of a function f
in several variables, we consider an auxiliary partial function which results
from f by restricting its domain to those points whose components are all
given constants, apart from one of the components. The result is a function
defined on a subset of R. In general, this function depends on the above
constants. The derivative of the auxiliary function in some point $p$ of its domain, if it exists, is called the partial derivative of $f$ in the point whose
components are the given constants apart from the remaining component
which is given by p.
Definition 2.4.12. Let $f : U \to \mathbb{R}$ be a function of several variables where $U$ is a subset of $\mathbb{R}^n$, $n \in \mathbb{N} \setminus \{0, 1\}$. In particular, let $i \in \{1, \dots, n\}$, $x \in U$ be such that the corresponding function
$$f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)$$
is differentiable at $x_i$. In this case, we say that $f$ is partially differentiable at $x$ in the $i$-th coordinate direction, and we define:
$$\frac{\partial f}{\partial x_i}(x) := [f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) .$$
If $f$ is partially differentiable in the $i$-th coordinate direction at every point of its domain, we call $f$ partially differentiable in the $i$-th coordinate direction and denote by $\partial f / \partial x_i$ the map which associates to every $x \in U$ the corresponding $(\partial f / \partial x_i)(x)$. Partial derivatives of $f$ of higher order are defined recursively. If $\partial f / \partial x_i$ is partially differentiable in the $j$-th coordinate direction, where $j \in \{1, \dots, n\}$, we denote the partial derivative of $\partial f / \partial x_i$ in the $j$-th coordinate direction by
$$\frac{\partial^2 f}{\partial x_j \partial x_i} .$$
Such is called a partial derivative of $f$ of second order. In the case $j = i$, we set
$$\frac{\partial^2 f}{\partial x_i^2} := \frac{\partial^2 f}{\partial x_i \partial x_i} .$$
Partial derivatives of $f$ of higher order than 2 are defined accordingly.
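Example 2.4.13. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$f(x, y) := x^3 + x^2 y^3 - 2 y^2$$
for all $x, y \in \mathbb{R}$. Find
$$\frac{\partial f}{\partial x}(2, 1) \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) .$$
Solution: We have
$$f(x, 1) = x^3 + x^2 - 2 \quad \text{and} \quad f(2, y) = 8 + 4 y^3 - 2 y^2$$
for all $x, y \in \mathbb{R}$. Hence it follows that
$$\frac{\partial f}{\partial x}(x, 1) = 3 x^2 + 2 x , \quad \frac{\partial f}{\partial y}(2, y) = 12 y^2 - 4 y ,$$
$x, y \in \mathbb{R}$, and, finally, that
$$\frac{\partial f}{\partial x}(2, 1) = 16 \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) = 8 .$$
Example 2.4.14. Define $f : \mathbb{R}^3 \to \mathbb{R}$ by
$$f(x, y, z) := x^2 y^3 z + 3x + 4y + 6z + 5$$
for all $x, y, z \in \mathbb{R}$. Find
$$\frac{\partial f}{\partial x}(x, y, z) , \quad \frac{\partial f}{\partial y}(x, y, z) \quad \text{and} \quad \frac{\partial f}{\partial z}(x, y, z)$$
for all $x, y, z \in \mathbb{R}$. Solution: Since in partial differentiating with respect to one variable all other variables are held constant, we conclude that
$$\frac{\partial f}{\partial x}(x, y, z) = 2 x y^3 z + 3 , \quad \frac{\partial f}{\partial y}(x, y, z) = 3 x^2 y^2 z + 4 , \quad \frac{\partial f}{\partial z}(x, y, z) = x^2 y^3 + 6$$
for all $x, y, z \in \mathbb{R}$.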
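The recipe "hold all variables but one constant, then differentiate" translates directly into a numerical check. The sketch below approximates partial derivatives by central difference quotients; the helper name `partial_derivative` and the step size are assumptions of this illustration, and the function `f` is taken to be the polynomial of Example 2.4.13.

```python
def partial_derivative(f, point, i, h=1e-6):
    """Central difference quotient of f in the i-th coordinate (0-based) at point."""
    forward = list(point)
    backward = list(point)
    forward[i] += h    # vary only the i-th component,
    backward[i] -= h   # all other components stay fixed
    return (f(*forward) - f(*backward)) / (2.0 * h)


def f(x, y):
    """The function of Example 2.4.13."""
    return x ** 3 + x ** 2 * y ** 3 - 2.0 * y ** 2


if __name__ == "__main__":
    print("df/dx(2, 1) ~", partial_derivative(f, (2.0, 1.0), 0))
    print("df/dy(2, 1) ~", partial_derivative(f, (2.0, 1.0), 1))
```

The two printed values should be close to 16 and 8, the results obtained in the example.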
Problems
1) By the basic definition of derivatives, calculate the derivative of the
function f .
a) $f(x) := 1/x$ , $x \in (0, \infty)$ ,
b) f pxq : px 1q{px 1q , x P R zt1u ,
c) $f(x) := \sqrt{x}$ , $x \in (0, \infty)$ .
2) Calculate the slope of the tangent to $G(f)$ at the point $(1, f(1))$ and its intersection with the x-axis.
a) f pxq : x2 3x 1 , x P R ,
b) f pxq : p3x 2q{p4x 5q , x P R zt5{4u ,
c) f pxq : e3x , x P R .
3) Calculate the derivatives of the functions f1 , . . . , f8 with maximal
domains in R defined by
a) f1 pxq : 5x8 2x5
b) f3 ptq : p1
3t
2
6 , f2 pθq : 3 sinpθq
5t
4
qpt
2
8q ,
rsinpxq 6 cospxqs ,
5 cospϕq
3t 2t 5
, f6 pϕq :
c) f5 ptq :
t3 8
tanpϕq
2
4 sinp7tq
d) f7 pxq : sinp3{x q , f8 ptq : e
.
f4 pxq : 3e
4 cospθq ,
x
4
,
4) A differentiable function $f$ satisfies the given equation for all $x$ from its domain. Calculate the slope of the tangent to $G(f)$ in the specified point $P$ without solving the equations for $f(x)$.
a) $x^2 + (f(x))^2 = 1$ , $P = (1/\sqrt{2} , 1/\sqrt{2})$ ,
b) $(x - 1)^2 \, [\, x^2 + (f(x))^2 \,] - 4 x^2 = 0$ , $P = (1 + \sqrt{2} , 1 + \sqrt{2})$ ,
c) $x + \sqrt{f(x) [2 - f(x)]} = \arccos(1 - f(x))$ , $P = ((\pi/4) - (1/\sqrt{2}) , 1 - (1/\sqrt{2}))$ .
Remark: The curve in c) is a cycloid which is the trajectory of a point of a circle rolling along a straight line. The curve in b) is named after Nicomedes (3rd century B.C.), who used it to solve the problem of trisecting an angle.
5) Give a function f : R Ñ R such that
a) f 1 ptq 1 for all t P R and such that f p0q 2 ,
b) f 1 ptq 2f ptq for all t P R and such that f p0q 1 ,
c) f 1 ptq 2f ptq 3 for all t P R and such that f p0q 1 .
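6) Let $I$ be a non-empty open interval in $\mathbb{R}$ and $p, q \in \mathbb{R}$. Further, let $f : I \to \mathbb{R}$. Show that
$$f''(x) + p f'(x) + q f(x) = 0$$
for all $x \in I$ if and only if
$$\bar{f}''(x) - \left( \frac{p^2}{4} - q \right) \bar{f}(x) = 0$$
for all $x \in I$ where $\bar{f} : I \to \mathbb{R}$ is defined by
$$\bar{f}(x) := e^{p x / 2} f(x)$$
for all $x \in I$.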
7) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line is given by
$$m f''(t) = F(f(t)) \qquad (2.4.5)$$
for all $t$ from some time interval $I \subset \mathbb{R}$, where $f(t)$ is the position of the particle at time $t$, and $F(x)$ is the external force at the point $x$. For the specified force, give a solution function $f : \mathbb{R} \to \mathbb{R}$ of (2.4.5) that contains 2 free real parameters.
a) $F(x) = F_0$ , $x \in \mathbb{R}$ where $F_0$ is some real parameter ,
b) F pxq kx , x P R where k is some real parameter .
8) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line under the influence of a viscous friction is given by
$$m f''(t) = - \lambda f'(t) \qquad (2.4.6)$$
for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$, and $\lambda \in [0, \infty)$ is a parameter describing the strength of the friction. Give a solution function $f$ of (2.4.6) that contains 2 free real parameters.
9) For all px, y q from the domain, calculate the partial derivatives
pBf {Bxqpx, yq, pBf {Byqpx, yq of the given function f .
a) f px, y q : x4 2x2 y 2
3x 4y
1 , px, y q P R2 ,
b) f px, y q : 3x2 2x 1 , px, y q P R2 ,
c) f px, y q : sinpxy q , px, y q P R2 .
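10) Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be twice differentiable functions. Define $u(t, x) := f(x - t) + g(x + t)$ for all $(t, x) \in \mathbb{R}^2$. Calculate
$$\frac{\partial u}{\partial t}(t, x) , \quad \frac{\partial u}{\partial x}(t, x) , \quad \frac{\partial^2 u}{\partial t^2}(t, x) , \quad \frac{\partial^2 u}{\partial x^2}(t, x)$$
for all $(t, x) \in \mathbb{R}^2$. Conclude that $u$ satisfies
$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0$$
which is called the wave equation in one space dimension (for a function $u$ which is to be determined).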
2.5 Applications of Differentiation
The applications of differentiation are manifold. We start with the application to the finding of maxima and minima of functions. For motivation, we
consider a continuous function $f$ defined on a closed interval $[a, b]$ where $a, b \in \mathbb{R}$ are such that $a \le b$. According to Theorem 2.3.33, $f$ assumes a maximum and minimum value, i.e., there are $x_M, x_m \in [a, b]$ such that
$$f(x_M) \ge f(x) , \quad f(x_m) \le f(x)$$
for all $x \in [a, b]$. The values $f(x_M)$, $f(x_m)$ are called the maximum and minimum value of $f$, respectively. These values are uniquely determined because if $\bar{x}_M, \bar{x}_m \in [a, b]$ are such that
$$f(\bar{x}_M) \ge f(x) , \quad f(\bar{x}_m) \le f(x)$$
for all $x \in [a, b]$, it follows by definition of $x_M, \bar{x}_M, x_m, \bar{x}_m$ that
$$f(x_M) \ge f(\bar{x}_M) , \quad f(x_m) \le f(\bar{x}_m)$$
as well as that
$$f(\bar{x}_M) \ge f(x_M) , \quad f(\bar{x}_m) \le f(x_m)$$
and hence that
$$f(x_M) = f(\bar{x}_M) , \quad f(x_m) = f(\bar{x}_m) .$$
On the other hand, a function can assume its maximum value and/or its minimum value in more than one point. For instance, the function $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$ assumes its maximum value 3 and its minimum value 1 in the points $\pi/2, 5\pi/2$ and $3\pi/2, 7\pi/2$, respectively, see Fig. 39.
Fig. 39: Graph and segments of tangents of a function, $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$, that assumes both its maximum and its minimum value in several points of its domain. Note that the tangents in those points are horizontal corresponding to a vanishing derivative in those points.
After this digression, we continue with the discussion of the maximum and minimum values of $f$. Each of them can be assumed either at a boundary point $a$ or $b$ of the interval or in a point of the open interval $(a, b)$. In the latter case, if the function is differentiable on $(a, b)$, differentiation can be used to determine the position(s) where they are assumed. We recall that the function $A$ from Fermat’s example at the beginning of Section 2.4 assumed its maximum value in the midpoint of its domain and that his way of finding its position was equivalent to the demand of a vanishing derivative at the position of a maximum value. Indeed, this is also true for a minimum value. With precise definitions of limits and derivatives at hand, both follow from very simple observations. By definition of $x_M$, it follows that
$$f(x) - f(x_M) \le 0$$
for all $x \in D(f)$. As a consequence, we conclude that
$$\frac{f(x) - f(x_M)}{x - x_M} \le 0$$
if $b > x > x_M$ and
$$\frac{f(x) - f(x_M)}{x - x_M} \ge 0$$
if $a < x < x_M$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_M, b)$, $(a, x_M)$ that converges to $x_M$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_M) \le 0$ and $f'(x_M) \ge 0$, respectively, and hence that $f'(x_M) = 0$. Also, by definition of $x_m$, it follows that
$$f(x) - f(x_m) \ge 0$$
for all $x \in D(f)$. As a consequence, we conclude that
$$\frac{f(x) - f(x_m)}{x - x_m} \ge 0$$
if $b > x > x_m$ and
$$\frac{f(x) - f(x_m)}{x - x_m} \le 0$$
if $a < x < x_m$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_m, b)$, $(a, x_m)$ that converges to $x_m$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_m) \ge 0$ and $f'(x_m) \le 0$, respectively, and hence that $f'(x_m) = 0$.
Hence in case that the restriction of f to pa, bq is differentiable, the standard
procedure of finding the maximum and minimum values of f proceeds by
finding the zeros of the derivative of the restriction, subsequent calculation
of the corresponding function values of f in those zeros and comparison of
the obtained values with the function values of f at a and b. The maximum,
minimum value of these function values is the maximum and minimum
value of f , respectively.
Theorem 2.5.1. (Necessary condition for the existence of a local minimum/maximum) Let $f$ be a differentiable real-valued function on some open interval $I$ of $\mathbb{R}$. Further, let $f$ have a local minimum / maximum at some $x_0 \in I$, i.e., let
$$f(x_0) \le f(x) \quad / \quad f(x_0) \ge f(x)$$
for all $x$ such that $x_0 - \varepsilon < x < x_0 + \varepsilon$, for some $\varepsilon > 0$. Then
$$f'(x_0) = 0 ,$$
i.e., $x_0$ is a so called ‘critical point’ for $f$.
Proof. If $f$ has a local minimum/maximum at $x_0 \in I$, then it follows for sufficiently small nonzero $h \in \mathbb{R}$ that
$$\frac{1}{h} \, [f(x_0 + h) - f(x_0)]$$
is $\ge (\le) \, 0$ and $\le (\ge) \, 0$, for $h > 0$ and $h < 0$, respectively. Therefore, it follows by Theorem 2.3.12 that $f'(x_0)$ is at the same time $\ge 0$ and $\le 0$ and hence, finally, equal to 0.
Fig. 40: $G(f)$ from Example 2.5.2.
Example 2.5.2. Find the critical points of $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := x^4 + x^3 + 1$$
for all $x \in \mathbb{R}$. Solution: The critical points of $f$ are the solutions of the equation
$$0 = f'(x) = 4 x^3 + 3 x^2 = x^2 (4x + 3)$$
and hence given by $x = 0$ and $x = -3/4$. See Fig. 40. Note that $f$ has a local extremum at $x = -3/4$, but not at $x = 0$. Hence the condition in Theorem 2.5.1 is necessary, but not sufficient for the existence of a local extremum.
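The difference between the two critical points of Example 2.5.2 can be seen from the sign of $f'$ on either side of them. The following sketch is an illustration only; checking for a sign change of the derivative is a heuristic chosen here and is not a criterion established in the text up to this point, and the function names and step size are assumptions.

```python
def f(x):
    """The function of Example 2.5.2."""
    return x ** 4 + x ** 3 + 1.0


def df(x):
    """Its derivative 4x^3 + 3x^2 = x^2 (4x + 3)."""
    return 4.0 * x ** 3 + 3.0 * x ** 2


def derivative_changes_sign(x0, h=1e-3):
    """True if f' has opposite signs at x0 - h and x0 + h."""
    return df(x0 - h) * df(x0 + h) < 0.0


if __name__ == "__main__":
    for x0 in (-0.75, 0.0):
        print(f"x0 = {x0}: f'(x0) = {df(x0)}, "
              f"sign change: {derivative_changes_sign(x0)}")
```

At $x = -3/4$ the derivative changes sign, which is consistent with a local extremum there; at $x = 0$ it does not.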
Fig. 41: $G(f)$ from Example 2.5.3.
Example 2.5.3. Find the maximum and minimum values of $f : [-\pi, \pi] \to \mathbb{R}$ defined by
$$f(x) := x - 2 \cos(x)$$
for all $x \in [-\pi, \pi]$. Solution: Since $f$ is continuous, such values exist according to Theorem 2.3.33. Those points, where these values are assumed, can be either on the boundary of the domain, i.e., in the points $-\pi$ or $\pi$, where $f$ assumes the values $2 - \pi$ and $2 + \pi$, respectively, or inside the interval, i.e., in the open interval $(-\pi, \pi)$. In the last case, according to Theorem 2.5.1 those are critical points of the restriction of $f$ to this interval. The last are given by
$$x = - \frac{5\pi}{6} , \; - \frac{\pi}{6}$$
since
$$f'(x) = 1 + 2 \sin(x)$$
for all $x \in (-\pi, \pi)$. Now
$$f\left( - \frac{\pi}{6} \right) = - \frac{\pi}{6} - \sqrt{3} , \quad f\left( - \frac{5\pi}{6} \right) = - \frac{5\pi}{6} + \sqrt{3}$$
and hence the minimum value of $f$ is $-(\pi/6) - \sqrt{3}$ (assumed inside the interval) and its maximum value is $\pi + 2$ (assumed at the right boundary of the interval). See Fig. 41.
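The standard procedure used in this example (evaluate the function at the end points and at the zeros of the derivative, then compare) can be written down in a few lines. This is a hedged sketch; it assumes the function $f(x) = x - 2\cos(x)$ on $[-\pi, \pi]$ with critical points $-5\pi/6$ and $-\pi/6$, and the helper name `extreme_values` is hypothetical.

```python
import math


def f(x):
    """The function of Example 2.5.3 (assumed form x - 2 cos(x))."""
    return x - 2.0 * math.cos(x)


def extreme_values(func, candidates):
    """Return (minimum, maximum) of func over the finite list of candidate points."""
    values = [func(x) for x in candidates]
    return min(values), max(values)


# End points of [-pi, pi] and the zeros of f'(x) = 1 + 2 sin(x) in (-pi, pi).
candidates = [-math.pi, -5.0 * math.pi / 6.0, -math.pi / 6.0, math.pi]
minimum, maximum = extreme_values(f, candidates)

if __name__ == "__main__":
    print("minimum:", minimum, "expected:", -math.pi / 6.0 - math.sqrt(3.0))
    print("maximum:", maximum, "expected:", math.pi + 2.0)
```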
The following is a theorem of Michel Rolle, published in 1691, which he
used in his method of cascades devised to find intervals around zeros of
polynomial functions that contain no other roots. In this connection, the
subsequent theorem gives that an open interval $I$ that is contained in the
domain of a continuous function and that has two consecutive roots of that
function as end points, contains at least one zero of the derivative of the
restriction of that function to $I$ if that restriction is differentiable.
Theorem 2.5.4. (Rolle’s theorem) Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and $f(a) = f(b)$. Then there is $c \in (a, b)$ such that $f'(c) = 0$.
Proof. Since $f$ is continuous, according to Theorem 2.3.33 $f$ assumes its minimum and maximum value in some points $x_0 \in [a, b]$ and $x_1 \in [a, b]$, respectively. Now if one of these points is contained in the open interval $(a, b)$, the derivative of $f$ in that point vanishes by Theorem 2.5.1. Otherwise, if both of those points are at the interval ends $a, b$ it follows that
$$f(a) \le f(x) \le f(b) = f(a)$$
for all $x \in [a, b]$. Hence in this case, $f$ is a constant function, and it follows by Example 2.4.3 that $f'(c) = 0$ for every $c \in (a, b)$. Hence in both cases the statement of the theorem follows.
The following example provides a typical application of Rolle’s theorem.
Example 2.5.5. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := x^3 + x + 1$$
for all $x \in \mathbb{R}$, has exactly one zero. (Compare Example 2.3.39.)
Proof. $f$ is continuous and, because of
$$f(-1) = -1 < 0 \quad \text{and} \quad f(0) = 1 > 0$$
and Corollary 2.3.38, has a zero $x_0$ in $(-1, 0)$. See Fig. 23. Further, $f$ is differentiable with
$$f'(x) = 3 x^2 + 1 > 0$$
for all $x \in \mathbb{R}$. Now assume that there is another zero $x_1$. Then the existence of a zero of $f'$ in the interval with endpoints $x_0$ and $x_1$ follows by Theorem 2.5.4, which contradicts the positivity of $f'$. Hence $f$ has exactly one zero.
Fig. 42: Illustration of the statement of the mean value theorem 2.5.6.
The mean value theorem is a simple generalization of Rolle’s theorem
which will be frequently used in the following. Its use as a central theoretical tool in calculus / analysis was initiated by Cauchy. Its proof proceeds
by construction of an appropriate auxiliary function which allows the application of Rolle’s theorem. For a simple geometrical interpretation of the
statement of the mean value theorem, we consider a continuous function f
defined on a closed interval of $\mathbb{R}$ with left end point $a$ and right end point
$b$, where $a < b$, which is differentiable on $(a, b)$. Then according to the
theorem, there is a tangent to the graph of the restriction of $f$ to $(a, b)$ with
slope identical to the slope of the line segment (‘secant’) from $(a, f(a))$ to
$(b, f(b))$, see Fig. 42.
Theorem 2.5.6. (Mean value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$. Then there is $c \in (a, b)$ such that
$$\frac{f(b) - f(a)}{b - a} = f'(c) .$$
Proof. Define the auxiliary function $h : [a, b] \to \mathbb{R}$ by
$$h(x) := f(x) - \frac{f(b) - f(a)}{b - a} \, (x - a) - f(a)$$
for all $x \in [a, b]$. Then $h$ is continuous as well as differentiable on $(a, b)$ with
$$h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$$
for all $x \in (a, b)$ and $h(a) = h(b) = 0$. Hence by Theorem 2.5.4 there is $c \in (a, b)$ such that
$$h'(c) = f'(c) - \frac{f(b) - f(a)}{b - a} = 0 .$$
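The theorem asserts only the existence of the point $c$; for a concrete function it can be located numerically as a zero of $h'$. The sketch below uses bisection, which additionally assumes that $h'$ is continuous and has opposite signs at the two end points (this is not guaranteed by the theorem); the function names and the sample $f(x) = x^3$ on $[0, 2]$ are illustrative assumptions.

```python
def bisect(func, lo, hi, tol=1e-12):
    """Zero of func in [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    flo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0.0) == (flo > 0.0):
            lo, flo = mid, fmid   # zero lies in the right half
        else:
            hi = mid              # zero lies in the left half
    return 0.5 * (lo + hi)


def mean_value_point(f, df, a, b):
    """A point c in (a, b) with f'(c) equal to the slope of the secant."""
    slope = (f(b) - f(a)) / (b - a)
    return bisect(lambda x: df(x) - slope, a, b)


if __name__ == "__main__":
    c = mean_value_point(lambda x: x ** 3, lambda x: 3.0 * x * x, 0.0, 2.0)
    print("c =", c, " exact value 2/sqrt(3) =", 2.0 / 3.0 ** 0.5)
```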
Intuitively, it should be expected that every function which is defined on
an open interval of R and has a vanishing derivative is a constant function.
Indeed, this can be seen as a first important consequence of the mean value
theorem.
Theorem 2.5.7. Let $f : (a, b) \to \mathbb{R}$ be differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f'(x) = 0$ for all $x \in (a, b)$. Then $f$ is a constant function.
Proof. The proof is indirect. Assume that $f$ is not a constant function. Then there are $x_1, x_2 \in (a, b)$ satisfying $x_1 < x_2$ and $f(x_1) \neq f(x_2)$. Hence the existence of $c \in (x_1, x_2)$ such that
$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0$$
follows by Theorem 2.5.6, and hence $f(x_1) = f(x_2)$, which is a contradiction. Hence $f$ is a constant function.
Typically, the previous theorem is applied in proofs of uniqueness of solutions of differential equations and in the derivation of so called ‘conserved
quantities’ of physical systems as in the subsequent examples.
Example 2.5.8. (A characterization of the exponential function) Let $a, b \in \mathbb{R}$ be such that $a < 0$ and $b > 0$. Find all solutions $f : (a, b) \to \mathbb{R}$ of the differential equation
$$f'(x) = f(x)$$
for all $x \in (a, b)$ that satisfy $f(0) = 1$. Solution: We know that
$$f(x) := \exp(x)$$
for every $x \in (a, b)$ satisfies all these demands. Indeed, it follows by help of the previous theorem, Theorem 2.5.7, that there is no other solution. This can be seen as follows. For this, let $f$ be some function that satisfies these requirements. Then we define the auxiliary function $h : (a, b) \to \mathbb{R}$ by
$$h(x) := \exp(-x) f(x)$$
for all $x \in (a, b)$. As a consequence, $h$ is differentiable with a derivative $h'$ satisfying
$$h'(x) = - \exp(-x) f(x) + \exp(-x) f'(x) = - \exp(-x) f(x) + \exp(-x) f(x) = 0$$
for all $x \in (a, b)$. Hence it follows by Theorem 2.5.7 that $h$ is a constant function of value $h(0) = f(0) = 1$ which has the consequence that $f(x) = \exp(x)$ for all $x \in (a, b)$.
Example 2.5.9. (Energy conservation) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line is given by
$$m f''(t) = F(f(t)) \qquad (2.5.1)$$
for all $t$ from some non-empty open time interval $I \subset \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $F(x)$ is the external force at the point $x$. Assume that $F = -V'$ where $V$ is a differentiable function from an open interval $J \supset \mathrm{Ran}(f)$. Show that $E : I \to \mathbb{R}$ defined by
$$E(t) := \frac{m}{2} \, (f'(t))^2 + V(f(t)) \qquad (2.5.2)$$
for all $t \in I$ is a constant function. Solution: It follows by Theorem 2.4.8, Theorem 2.4.10 and (2.5.1) that $E$ is differentiable with derivative
$$E'(t) = m f'(t) f''(t) + V'(f(t)) f'(t) = f'(t) \, [m f''(t) - F(f(t))] = 0$$
for all $t \in I$. Hence according to Theorem 2.5.7, $E$ is a constant function. In physics, its value is called the total energy of the particle. As a consequence, the finding of the solutions of (2.5.1), which is second order in the derivatives, is reduced to the solution of (2.5.2), which is only first order in the derivatives, for an assumed value of the total energy.
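For a concrete force the conservation of $E$ can be checked along an explicitly known solution. The sketch below assumes, purely for illustration, $m = 1$, $F(x) = -x$ and hence $V(x) = x^2/2$, for which $f(t) = a \cos(t) + b \sin(t)$ solves (2.5.1); the constants and function names are hypothetical.

```python
import math

M = 1.0  # assumed mass, chosen for illustration


def position(t, a=1.0, b=0.5):
    """A solution of m f'' = -f for m = 1."""
    return a * math.cos(t) + b * math.sin(t)


def velocity(t, a=1.0, b=0.5):
    """Derivative of position with respect to t."""
    return -a * math.sin(t) + b * math.cos(t)


def potential(x):
    """V with F = -V', i.e. F(x) = -x."""
    return 0.5 * x * x


def energy(t):
    """E(t) = (m/2) f'(t)^2 + V(f(t)) as in (2.5.2)."""
    return 0.5 * M * velocity(t) ** 2 + potential(position(t))


if __name__ == "__main__":
    for k in range(5):
        print(f"t = {k}: E = {energy(float(k)):.15f}")
```

Up to rounding, every printed value equals $(a^2 + b^2)/2$.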
Utilizing the interpretation of the values of the derivative of a function as
providing the slopes of tangents at its graph, it is to be expected that a differentiable function is increasing (decreasing) on intervals where its derivative
assumes positive values (negative values), i.e., values that are $\ge 0$ ($\le 0$).
That this intuition is correct is displayed by the following theorem. Its
statement can be regarded as another important consequence of the mean
value theorem.
Theorem 2.5.10. Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and such that $f'(x) > 0$ ( $f'(x) \ge 0$ ) for every $x \in (a, b)$. Then $f$ is strictly increasing ( increasing ) on $[a, b]$, i.e.,
$$f(x) < f(y) \quad ( \, f(x) \le f(y) \, )$$
for all $x, y \in [a, b]$ that satisfy $x < y$.
Proof. Let $x$ and $y$ be some elements of $[a, b]$ such that $x < y$. Then the restriction of $f$ to the interval $[x, y]$ satisfies the assumptions of Theorem 2.5.6, and hence there is $c \in (x, y)$ such that
$$f(y) = f(x) + f'(c)(y - x) > f(x) \quad ( \, \ge f(x) \, ) .$$
Fig. 43: Graphs of exp and approximations. See Example 2.5.12.
Typically, the previous theorem is used in the derivation of lower and upper
bounds for the values of functions or more generally in the comparison of
functions and, in particular, in the proof of injectivity of functions. The
subsequent examples provide such applications.
Example 2.5.11. Show that the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is
strictly increasing. Solution: By Example 2.4.4 and Theorem 2.3.27 it
follows that $\exp'(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. Hence it follows by
Theorem 2.5.10 that exp is strictly increasing. Hence there is an inverse
function to exp which is called the natural logarithm and is denoted by ln.
See Fig. 28.
Example 2.5.12. Show that
(i)
$$e^x > 1 \qquad (2.5.3)$$
for all $x \in (0, \infty)$.
(ii)
$$e^x > x + 1 \qquad (2.5.4)$$
for all $x \in (0, \infty)$.
(Compare Theorem 2.3.27.)
Proof. Define the continuous function $f : [0, \infty) \to \mathbb{R}$ by $f(x) := e^x - 1$ for all $x \in [0, \infty)$. Then $f$ is differentiable on $(0, \infty)$ with $f'(x) = e^x > 0$ for all $x \in (0, \infty)$. Hence $f$ is strictly increasing according to Theorem 2.5.10, and (2.5.3) follows since $f(0) = e^0 - 1 = 0$. Further, define the continuous function $g : [0, \infty) \to \mathbb{R}$ by $g(x) := e^x - 1 - x$ for all $x \in [0, \infty)$. Then $g$ is differentiable on $(0, \infty)$ with $g'(x) = e^x - 1 > 0$ for all $x \in (0, \infty)$ where (2.5.3) has been applied. Hence (2.5.4) follows by Theorem 2.5.10 since $g(0) = e^0 - 0 - 1 = 0$.
From Example 2.5.11 and (2.5.4), it follows by the intermediate value theorem, Theorem 2.3.37, that
$$\exp( \, [0, \infty) \, ) = [1, \infty)$$
and hence by part (iii) of Theorem 2.3.27 that the range of exp is given by
$(0, \infty)$ which therefore is also the domain of its inverse function ln. As a
consequence, exp is a strictly increasing bijective map from $\mathbb{R}$ onto $(0, \infty)$.
See Fig. 28.
Example 2.5.13. Show that
$$\ln(a \cdot b) = \ln(a) + \ln(b)$$
for all $a, b > 0$. Solution: For $a, b > 0$, it follows by Theorem 2.3.27 that
$$\ln(a \cdot b) = \ln\left( e^{\ln(a)} \cdot e^{\ln(b)} \right) = \ln\left( e^{\ln(a) + \ln(b)} \right) = \ln(a) + \ln(b) .$$
In Example 2.5.9, we derived a conserved quantity for the solutions of a
differential equation, a special case of Newton’s equation of motion. Ignoring the physical dimensions of the involved quantities in that example,
in the special case that $m = 2$, $F(x) = -2x$ for all $x \in \mathbb{R}$, the function $E : I \to \mathbb{R}$, defined by
$$E(t) := (f(t))^2 + (f'(t))^2$$
for all $t \in I$ and a solution $f$ of the differential equation
$$f''(t) + f(t) = 0$$
for all $t \in I$, was found to be a constant function. The value of the corresponding constant is called the total energy that is associated to $f$. An
important feature of that quantity is its positivity. In the subsequent theorem, we show that estimates on the growth of the same function E defined
for solutions of the related differential equation (2.5.5) can be used to show
the uniqueness of the solutions of that differential equation. The key for this
is the following lemma whose proof provides a further application of Theorem 2.5.10. Differential equations of the form (2.5.5) appear frequently
in applications, for instance, in the description of the amplitudes of oscillations of damped harmonic oscillators in mechanics and in the description
of the current as a function of time in simple electric circuits in electrodynamics.
Lemma 2.5.14. (An ‘energy’ inequality for solutions of a differential equation) Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $f : I \to \mathbb{R}$ satisfy the differential equation
$$f''(x) + p f'(x) + q f(x) = 0 \qquad (2.5.5)$$
for all $x \in I$. Finally, define
$$E(x) := (f(x))^2 + (f'(x))^2$$
for all $x \in I$. Then for all $x \in I$
$$0 \le E(x) \le E(x_0) \, e^{k |x - x_0|}$$
where
$$k := 1 + 2|p| + |q| .$$
Proof. Since $f$ is twice differentiable, $E$ is differentiable such that
$$E'(x) = 2 f(x) f'(x) + 2 f'(x) f''(x) = 2 f(x) f'(x) - 2 \, [\, p f'(x) + q f(x) \,] \, f'(x) = 2 (1 - q) f(x) f'(x) - 2 p \, (f'(x))^2$$
for all $x \in I$. Hence $E'$ is continuous and satisfies
$$|E'(x)| \le 2 (1 + |q|) \, |f'(x)| \cdot |f(x)| + 2 |p| \, (f'(x))^2 \le (1 + |q|) \, [\, (f(x))^2 + (f'(x))^2 \,] + 2 |p| \, (f'(x))^2 \le k E(x)$$
for all $x \in I$ where it has been used that
$$2 \, |f'(x)| \cdot |f(x)| \le (f(x))^2 + (f'(x))^2 .$$
As a consequence,
$$- k E(x) \le E'(x) \le k E(x)$$
for all $x \in I$. We continue analyzing the consequences of these inequalities. For this, we define auxiliary functions $E_r, E_l$ by
$$E_r(x) := e^{-kx} E(x) , \quad E_l(x) := e^{kx} E(x)$$
for all $x \in I$. Then
$$E_r'(x) = e^{-kx} (E'(x) - k E(x)) \le 0 , \quad E_l'(x) = e^{kx} (E'(x) + k E(x)) \ge 0$$
for all $x \in I$. Hence $E_r$ is decreasing, which is equivalent to the increasing of $-E_r$, and $E_l$ is increasing. Hence it follows by Theorem 2.5.10 that
$$E(x) \le E(x_0) \, e^{k (x - x_0)} = E(x_0) \, e^{k |x - x_0|}$$
for $x \ge x_0$ and that
$$E(x) \le E(x_0) \, e^{k (x_0 - x)} = E(x_0) \, e^{k |x - x_0|}$$
for $x \le x_0$.
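The inequality of the lemma can be spot-checked for an explicitly known solution. The sketch below assumes, for illustration only, $p = 2$ and $q = 1$, for which $f(x) = x e^{-x}$ satisfies $f'' + 2 f' + f = 0$; the names and the sample grid are assumptions and the check does not replace the proof.

```python
import math

P, Q = 2.0, 1.0                        # illustrative coefficients
K = 1.0 + 2.0 * abs(P) + abs(Q)        # k := 1 + 2|p| + |q|


def f(x):
    """A solution of f'' + 2 f' + f = 0."""
    return x * math.exp(-x)


def df(x):
    """Its derivative (1 - x) e^{-x}."""
    return (1.0 - x) * math.exp(-x)


def energy(x):
    """E(x) = f(x)^2 + f'(x)^2."""
    return f(x) ** 2 + df(x) ** 2


def inequality_holds(x, x0=0.0):
    """Check 0 <= E(x) <= E(x0) e^{k |x - x0|} at a single point x."""
    return 0.0 <= energy(x) <= energy(x0) * math.exp(K * abs(x - x0))


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, energy(x), inequality_holds(x))
```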
The unique dependence of the solutions of (2.5.6) on ‘initial data’, $f(x_0)$
and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$ is a simple consequence of the preceding
lemma.
Theorem 2.5.15. Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $y_0, y_0' \in \mathbb{R}$. Then there is at most one function $f : I \to \mathbb{R}$ such that
\[ f''(x) + p f'(x) + q f(x) = 0 \tag{2.5.6} \]
for all $x \in I$ and at the same time such that
\[ f(x_0) = y_0 , \quad f'(x_0) = y_0' . \]
Proof. For this, let $f, \bar{f} : I \to \mathbb{R}$ be such that
\[ f''(x) + p f'(x) + q f(x) = \bar{f}''(x) + p \bar{f}'(x) + q \bar{f}(x) = 0 \]
for all $x \in I$ and
\[ f(x_0) = \bar{f}(x_0) = y_0 , \quad f'(x_0) = \bar{f}'(x_0) = y_0' . \]
Then $u := f - \bar{f}$ satisfies
\[ u''(x) + p u'(x) + q u(x) = 0 \]
for all $x \in I$ and
\[ u(x_0) = u'(x_0) = 0 . \]
Hence it follows by Lemma 2.5.14 that $u(x) = 0$ for all $x \in I$ and hence that $f = \bar{f}$.
Of course, of main interest for applications are the solutions of (2.5.6). These are obtained by reducing the solution of this equation to the solution of the special cases corresponding to $p = 0$. The solutions of the last are obvious. Their representation is simplified by use of hyperbolic functions which are introduced next.
Definition 2.5.16. We define the hyperbolic sine function sinh, the hyperbolic cosine function cosh and the hyperbolic tangent function tanh by
\[ \sinh(x) := \frac{1}{2} \left( e^{x} - e^{-x} \right) , \quad \cosh(x) := \frac{1}{2} \left( e^{x} + e^{-x} \right) , \quad \tanh(x) := \frac{\sinh(x)}{\cosh(x)} , \]
for all $x \in \mathbb{R}$.
Fig. 44: Graphs of the hyperbolic sine and cosine function.
Fig. 45: Graphs of the hyperbolic tangent function and asymptotes given by the graphs of the constant functions on $\mathbb{R}$ of values $-1$ and $1$.
Obviously, sinh, tanh are antisymmetric and cosh is symmetric, i.e.,
\[ \sinh(-x) = -\sinh(x) , \quad \cosh(-x) = \cosh(x) , \quad \tanh(-x) = -\tanh(x) \]
for all $x \in \mathbb{R}$. Also these functions are differentiable and, in particular,
\[ \sinh' = \cosh , \quad \cosh' = \sinh \]
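similarly to the sine and cosine functions. Another resemblance to these functions is the relation
\[ \cosh^2(x) - \sinh^2(x) = \frac{1}{4} \left( e^{x} + e^{-x} \right)^2 - \frac{1}{4} \left( e^{x} - e^{-x} \right)^2 = \frac{1}{4} \left[ \left( e^{x} + e^{-x} \right) + \left( e^{x} - e^{-x} \right) \right] \left[ \left( e^{x} + e^{-x} \right) - \left( e^{x} - e^{-x} \right) \right] = \frac{1}{4} \cdot 2 e^{x} \cdot 2 e^{-x} = 1 \]
for all $x \in \mathbb{R}$. In particular, this implies that
\[ \tanh'(x) = \frac{\cosh^2(x) - \sinh^2(x)}{\cosh^2(x)} = 1 - \tanh^2(x) = \frac{1}{\cosh^2(x)} \]
for all $x \in \mathbb{R}$.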
The solutions of (2.5.6) corresponding to ‘initial data’, $f(x_0)$ and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$, are obtained in the proof of the following theorem by considering a function that is related to $f$. As a consequence of (2.5.6), that function is a solution of the differential equation of the form (2.5.6) with $p = 0$. The solutions of these special equations are obvious.
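Theorem 2.5.17. Let $p, q \in \mathbb{R}$, $D := (p^2/4) - q$ and $x_0, y_0, y_0' \in \mathbb{R}$. Then the unique solution to
\[ f''(x) + p f'(x) + q f(x) = 0 \]
satisfying $f(x_0) = y_0$ and $f'(x_0) = y_0'$ is given by
\[ f(x) = y_0\, e^{-p(x - x_0)/2} \cosh\!\left( D^{1/2} (x - x_0) \right) + D^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{-p(x - x_0)/2} \sinh\!\left( D^{1/2} (x - x_0) \right) \]
for $x \in \mathbb{R}$ if $D > 0$,
\[ f(x) = e^{-p(x - x_0)/2} \left[ y_0 + \left( \frac{p y_0}{2} + y_0' \right) (x - x_0) \right] \]
for $x \in \mathbb{R}$ if $D = 0$ and
\[ f(x) = y_0\, e^{-p(x - x_0)/2} \cos\!\left( |D|^{1/2} (x - x_0) \right) + |D|^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{-p(x - x_0)/2} \sin\!\left( |D|^{1/2} (x - x_0) \right) \]
for $x \in \mathbb{R}$ if $D < 0$.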
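Proof. For this, we first notice that a function $h : \mathbb{R} \to \mathbb{R}$ satisfies
\[ h''(x) + p h'(x) + q h(x) = 0 \tag{2.5.7} \]
for all $x \in \mathbb{R}$ if and only if
\[ \bar{h}''(x) - \left( \frac{p^2}{4} - q \right) \bar{h}(x) = 0 \tag{2.5.8} \]
for all $x \in \mathbb{R}$ where $\bar{h} : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \bar{h}(x) := e^{px/2} h(x) \tag{2.5.9} \]
for all $x \in \mathbb{R}$. Indeed, it follows by Theorem 2.4.8 that $\bar{h}$ is twice differentiable if and only if $h$ is twice differentiable and in this case that
\[ \bar{h}'(x) = e^{px/2} \left[ h'(x) + \frac{p}{2}\, h(x) \right] , \quad \bar{h}''(x) = e^{px/2} \left[ h''(x) + p\, h'(x) + \frac{p^2}{4}\, h(x) \right] \]
for all $x \in \mathbb{R}$. The last implies that
\[ \bar{h}''(x) - \left( \frac{p^2}{4} - q \right) \bar{h}(x) = e^{px/2} \left[ h''(x) + p\, h'(x) + \frac{p^2}{4}\, h(x) \right] - e^{px/2} \left( \frac{p^2}{4} - q \right) h(x) = e^{px/2} \left( h''(x) + p h'(x) + q h(x) \right) = 0 \]
for all $x \in \mathbb{R}$ if and only if (2.5.7) is satisfied for all $x \in \mathbb{R}$. In addition, $h(x_0) = y_0$ and $h'(x_0) = y_0'$ if and only if
\[ \bar{h}(x_0) = y_0\, e^{p x_0/2} , \quad \bar{h}'(x_0) = \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} . \tag{2.5.10} \]
For the solution of (2.5.8) and (2.5.10), we consider three cases. If $D := (p^2/4) - q > 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0\, e^{p x_0/2} \cosh\!\left( D^{1/2} (x - x_0) \right) + D^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} \sinh\!\left( D^{1/2} (x - x_0) \right) \]
for $x \in \mathbb{R}$. If $D = 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0\, e^{p x_0/2} + \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} (x - x_0) \]
for $x \in \mathbb{R}$. If $D < 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0\, e^{p x_0/2} \cos\!\left( |D|^{1/2} (x - x_0) \right) + |D|^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} \sin\!\left( |D|^{1/2} (x - x_0) \right) \]
for $x \in \mathbb{R}$. Hence, finally, the statement of this theorem follows by (2.5.9) and by Theorem 2.5.15.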
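The three cases of Theorem 2.5.17 can be checked numerically. The following is a minimal sketch, not part of the original text: the names `closed_form_solution` and `ode_residual` are hypothetical, and the residual of $f'' + p f' + q f$ is only estimated by central differences.

```python
import math

def closed_form_solution(p, q, x0, y0, dy0):
    """Return f solving f'' + p f' + q f = 0 with f(x0) = y0, f'(x0) = dy0.

    Follows the three cases D > 0, D = 0, D < 0 with D = p^2/4 - q.
    """
    D = p * p / 4.0 - q
    c = p * y0 / 2.0 + dy0  # the coefficient (p*y0/2 + y0') common to all cases

    def f(x):
        t = x - x0
        damp = math.exp(-p * t / 2.0)
        if D > 0:
            w = math.sqrt(D)
            return damp * (y0 * math.cosh(w * t) + c / w * math.sinh(w * t))
        if D == 0:
            return damp * (y0 + c * t)
        w = math.sqrt(-D)
        return damp * (y0 * math.cos(w * t) + c / w * math.sin(w * t))

    return f

def ode_residual(f, p, q, x, h=1e-4):
    """Central-difference estimate of f''(x) + p f'(x) + q f(x)."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 + p * d1 + q * f(x)

if __name__ == "__main__":
    for p, q in [(1.0, -2.0), (2.0, 1.0), (0.5, 3.0)]:  # D > 0, D = 0, D < 0
        f = closed_form_solution(p, q, 0.3, 1.2, -0.7)
        print(p, q, f(0.3), ode_residual(f, p, q, 1.0))
```

The residuals printed are small but not zero, because the derivatives are approximated by finite differences.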
According to Theorem 2.3.44, the inverse $f^{-1}$ of a strictly increasing continuous function $f$ defined on a closed interval $[a, b]$ of $\mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, is continuous, too. If the restriction of $f$ to $(a, b)$ is in addition differentiable with strictly positive derivative, then the restriction of $f^{-1}$ to $(f(a), f(b))$ is also differentiable. Moreover, the following theorem gives an often used representation of the derivative of the last in terms of the derivative of $f$.
Theorem 2.5.18. (Derivatives of inverse functions) Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and such that $f'(x) > 0$ for every $x \in (a, b)$. Then the inverse function $f^{-1}$ is defined on $[f(a), f(b)]$ as well as differentiable on $(f(a), f(b))$ with
\[ \left( f^{-1} \right)'(y) = \frac{1}{f'(f^{-1}(y))} \tag{2.5.11} \]
for all $y \in (f(a), f(b))$.
Proof. By Theorem 2.5.10, it follows that $f$ is strictly increasing and hence that there is an inverse function $f^{-1}$ for $f$. Further, by Theorem 2.3.44 $f^{-1}$ is continuous, and by Theorem 2.3.43 it follows that $f([a, b]) = [f(a), f(b)]$ and hence that $f^{-1}$ is defined on $[f(a), f(b)]$. Now let $y \in (f(a), f(b))$ and $y_1, y_2, \dots$ be a sequence in $(f(a), f(b)) \setminus \{y\}$ which is convergent to $y$. Then $f^{-1}(y_1), f^{-1}(y_2), \dots$ is a sequence in $(a, b) \setminus \{f^{-1}(y)\}$ which, by the continuity of $f^{-1}$, converges to $f^{-1}(y)$. Hence it follows for $n \in \mathbb{N}$ that
\[ \frac{f^{-1}(y_n) - f^{-1}(y)}{y_n - y} = \left[ \frac{f(f^{-1}(y_n)) - f(f^{-1}(y))}{f^{-1}(y_n) - f^{-1}(y)} \right]^{-1} \]
and hence by the differentiability of $f$ in $f^{-1}(y)$, by $f'(f^{-1}(y)) > 0$ and by Theorem 2.3.4 the statement (2.5.11).
The following examples give two applications of the previous theorem. The second example is from the field of General Relativity.
Example 2.5.19. Calculate the derivative of ln, arcsin, arccos and arctan. Solution: By Theorem 2.5.18, it follows that
\[ \ln'(x) = \frac{1}{\exp'(\ln(x))} = \frac{1}{\exp(\ln(x))} = \frac{1}{x} \]
for every $x \in (0, \infty)$,
\[ \arcsin'(x) = \frac{1}{\sin'(\arcsin(x))} = \frac{1}{\cos(\arcsin(x))} = \frac{1}{\sqrt{1 - \sin^2(\arcsin(x))}} = \frac{1}{\sqrt{1 - x^2}} , \]
\[ \arccos'(x) = \frac{1}{\cos'(\arccos(x))} = -\frac{1}{\sin(\arccos(x))} = -\frac{1}{\sqrt{1 - \cos^2(\arccos(x))}} = -\frac{1}{\sqrt{1 - x^2}} \]
for all $x \in (-1, 1)$ and
\[ \arctan'(x) = \frac{1}{\tan'(\arctan(x))} = \frac{1}{(1 + \tan^2)(\arctan(x))} = \frac{1}{1 + x^2} \]
for every $x \in \mathbb{R}$.
Fig. 46: Graph of the auxiliary function h from Example 2.5.20.
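Example 2.5.20. In terms of Kruskal coordinates, the radial coordinate projection $r : \Omega \to (0, \infty)$ of the Schwarzschild solution of Einstein’s field equation is given by
\[ r(u, v) = h^{-1}(u^2 - v^2) \]
for all $(v, u) \in \Omega$ where $h : (0, \infty) \to (-1, \infty)$ is defined by
\[ h(x) := \left( \frac{x}{2M} - 1 \right) e^{x/(2M)} \]
for all $x \in (0, \infty)$. Here
\[ \Omega := \{ (v, u) \in \mathbb{R}^2 : u^2 - v^2 > -1 \} , \]
and $M > 0$ is the mass of the black hole. In addition, geometrical units are used where the speed of light and the gravitational constant have the value 1. Finally, $h$ is bijective and $h^{-1}$ is differentiable. Calculate
\[ \frac{\partial r}{\partial v} , \quad \frac{\partial r}{\partial u} \]
for all $(v, u) \in \Omega$. Solution: For this, let $(v, u) \in \Omega$. In a first step, we conclude by Theorem 2.5.18 that
\[ \frac{\partial r}{\partial v}(u, v) = -2v\, (h^{-1})'(u^2 - v^2) = -2v\, [\, h'(h^{-1}(u^2 - v^2)) \,]^{-1} = -2v\, [\, h'(r(u, v)) \,]^{-1} , \]
\[ \frac{\partial r}{\partial u}(u, v) = 2u\, (h^{-1})'(u^2 - v^2) = 2u\, [\, h'(h^{-1}(u^2 - v^2)) \,]^{-1} = 2u\, [\, h'(r(u, v)) \,]^{-1} . \]
Since
\[ h'(x) = \frac{1}{2M}\, e^{x/(2M)} + \frac{1}{2M} \left( \frac{x}{2M} - 1 \right) e^{x/(2M)} = \frac{x}{4M^2}\, e^{x/(2M)} \]
for every $x > 0$, this implies that
\[ \frac{\partial r}{\partial v}(u, v) = -\left( 8M^2 v\, \frac{e^{-r/(2M)}}{r} \right)(u, v) , \quad \frac{\partial r}{\partial u}(u, v) = \left( 8M^2 u\, \frac{e^{-r/(2M)}}{r} \right)(u, v) . \]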
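The formulas of Example 2.5.20 can be compared with finite differences. The sketch below is an illustration under stated assumptions: the mass value `M = 1.0` is chosen arbitrarily, and `h_inverse` inverts $h$ by bisection, which is one possible method and not taken from the text. All function names are hypothetical.

```python
import math

M = 1.0  # assumed mass in geometrical units, chosen only for illustration

def h(x):
    """h(x) = (x/(2M) - 1) * exp(x/(2M)), strictly increasing on (0, inf)."""
    return (x / (2 * M) - 1.0) * math.exp(x / (2 * M))

def h_inverse(s, tol=1e-13):
    """Invert h for s > -1 by bisection (h(0+) = -1, h(2M) = 0)."""
    lo, hi = 0.0, 2 * M
    while h(hi) < s:          # enlarge the bracket until h(hi) >= s
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if h(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def r(u, v):
    """Radial coordinate r(u, v) = h^{-1}(u^2 - v^2)."""
    return h_inverse(u * u - v * v)

def dr_du(u, v):
    """Closed form 8 M^2 u exp(-r/(2M)) / r from the example."""
    rr = r(u, v)
    return 8 * M * M * u * math.exp(-rr / (2 * M)) / rr

def dr_dv(u, v):
    """Closed form -8 M^2 v exp(-r/(2M)) / r from the example."""
    rr = r(u, v)
    return -8 * M * M * v * math.exp(-rr / (2 * M)) / rr

if __name__ == "__main__":
    u, v, eps = 1.5, 0.5, 1e-5
    print(dr_du(u, v), (r(u + eps, v) - r(u - eps, v)) / (2 * eps))
    print(dr_dv(u, v), (r(u, v + eps) - r(u, v - eps)) / (2 * eps))
```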
Fig. 47: Graphs of power functions corresponding to positive ($\ge 0$) and negative ($\le 0$) $a$, respectively. See Definition 2.5.21.
The following defines general powers of strictly positive ($> 0$) real numbers in terms of the exponential function and its inverse, the natural logarithm function.
Definition 2.5.21. (General powers) For every $a \in \mathbb{R}$, we define the corresponding power function by
\[ x^a := e^{a \ln x} \]
for all $x > 0$.
By Theorem 2.4.10, the power function $((0, \infty) \to \mathbb{R}, x \mapsto x^a)$ is differentiable with derivative
\[ \frac{a}{x}\, e^{a \ln x} = a\, e^{-\ln x}\, e^{a \ln x} = a\, e^{(a - 1) \ln x} = a\, x^{a - 1} \]
in $x > 0$. Also the following calculational rules are simple consequences of the definition of general powers and basic properties of the exponential function and its inverse.
Example 2.5.22. Show that
\[ x^0 = 1 , \quad x^a y^a = (xy)^a , \quad x^a x^b = x^{a + b} , \quad (x^a)^b = x^{ab} \]
for all $x, y > 0$ and $a, b \in \mathbb{R}$. Solution: By Definition 2.5.21, it follows for such $x$, $y$, $a$ and $b$ that
\[ x^0 = e^{0 \cdot \ln x} = e^0 = 1 , \]
\[ x^a y^a = e^{a \ln x}\, e^{a \ln y} = e^{a \ln x + a \ln y} = e^{a (\ln x + \ln y)} = e^{a \ln(xy)} = (xy)^a , \]
\[ x^a x^b = e^{a \ln x}\, e^{b \ln x} = e^{a \ln x + b \ln x} = e^{(a + b) \ln x} = x^{a + b} , \]
\[ (x^a)^b = e^{b \ln x^a} = e^{b \ln e^{a \ln x}} = e^{b\, a \ln x} = e^{ab \ln x} = x^{ab} . \]
Fig. 48: Graphs of ln and polynomial approximations corresponding to $a = 1/2$, $1$ and $2$. See Example 2.5.23.
The following derives frequently used polynomial approximations of the natural logarithm function as a further example for the application of Theorem 2.5.10. A verbalization of the estimate (2.5.12) is that the natural logarithm ‘$\ln(x)$ is growing more slowly than any positive power of $x$ for large $x$’.
Example 2.5.23. Show that for every $a > 0$
\[ \ln(x) < \frac{1}{a} \left( x^a - 1 \right) \tag{2.5.12} \]
for all $x > 1$. (See Exercise 2.3.2 for an application of the case $a = 1/2$.) Solution: Define the continuous function $f : [1, \infty) \to \mathbb{R}$ by
\[ f(x) := \frac{1}{a} \left( x^a - 1 \right) - \ln(x) \]
for all $x \ge 1$. Then $f$ is differentiable on $(1, \infty)$ with
\[ f'(x) = \frac{1}{a} \cdot \frac{a}{x}\, e^{a \ln x} - \frac{1}{x} = \frac{e^{a \ln x} - 1}{x} > 0 \]
for $x > 1$ and $f(1) = 0$. Hence (2.5.12) follows by Theorem 2.5.10.
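As a quick numerical illustration of (2.5.12), the following sketch evaluates both sides for a few sample values. The function name `log_upper_bound` and the sample points are assumptions made for this illustration only.

```python
import math

def log_upper_bound(x, a):
    """Right-hand side (x^a - 1)/a of the estimate (2.5.12)."""
    return (x ** a - 1.0) / a

if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0):
        for x in (1.001, 1.5, 10.0, 1e6):
            # ln(x) should be strictly smaller than the bound for x > 1
            print(a, x, math.log(x), log_upper_bound(x, a))
```

For small $a$ the bound stays close to $\ln(x)$ over a longer range, which is the reason the case $a = 1/2$ is useful in estimates.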
Another important consequence of Theorem 2.5.6 is given by Taylor’s theorem which is frequently employed in applications. For its formulation, we
need to introduce some additional terminology.
Definition 2.5.24. If $m, n \in \mathbb{N}$ are such that $m \le n$ and $a_m, \dots, a_n \in \mathbb{R}$, we define
\[ \sum_{k=m}^{n} a_k := a_m + a_{m+1} + \dots + a_n . \]
Note that, as a consequence of the associative law for addition, it is not necessary to indicate the order in which the summation is to be performed. Further, obviously,
\[ \sum_{k=m}^{n} (a_k + b_k) = \sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k \]
and
\[ \lambda \sum_{k=m}^{n} a_k = \sum_{k=m}^{n} \lambda\, a_k \]
for every $\lambda \in \mathbb{R}$ and $b_m, \dots, b_n \in \mathbb{R}$. In addition, we define for every $n \in \mathbb{N}$ the corresponding factorial $n!$ recursively by
\[ 0! := 1 , \quad (k + 1)! := (k + 1)\, k! \]
for every $k \in \mathbb{N}$. Hence in particular, $1! = 1$, $2! = 2$, $3! = 6$, $4! = 24$, $5! = 120$ and so forth.
For the motivation of Taylor’s theorem, we consider a twice continuously differentiable function $f$ defined on an open subinterval $(a, b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a, b)$. According to the mean value theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that
\[ \frac{f(x) - f(x_0)}{x - x_0} = f'(\xi) . \]
This implies that
\[ f(x) = f(x_0) + (x - x_0) f'(\xi) . \]
Further, by the same reasoning, it follows the existence of $\zeta$ in the open interval between $x_0$ and $\xi$ such that
\[ f'(\xi) = f'(x_0) + (\xi - x_0) f''(\zeta) . \]
Hence we conclude that
\[ f(x) = f(x_0) + (x - x_0) f'(\xi) = f(x_0) + (x - x_0)\, [\, f'(x_0) + (\xi - x_0) f''(\zeta) \,] = f(x_0) + (x - x_0) f'(x_0) + (x - x_0)(\xi - x_0) f''(\zeta) \]
and
\[ |f(x) - f(x_0) - (x - x_0) f'(x_0)| \le |x - x_0|^2\, |f''(\zeta)| . \tag{2.5.13} \]
Since $f''$ is continuous, we conclude that for every arbitrary preassigned error bound $\varepsilon > 0$ there is an interval $I$ around $x_0$ such that
\[ |f(x) - f(x_0) - (x - x_0) f'(x_0)| \le \varepsilon \]
for every $x \in I$. Hence the restriction of $f$ to $I$ can be approximated within an error $\varepsilon$ by the restriction of the linear polynomial function
\[ p_1(x) := f(x_0) + (x - x_0) f'(x_0) \]
for all $x \in \mathbb{R}$ to $I$. This polynomial is called the linearization of $f$ around the point $x_0$. Note that
\[ p_1(x_0) = f(x_0) , \quad p_1'(x_0) = f'(x_0) . \]
Therefore, $p_1$ is the uniquely determined linear, i.e. of order $\le 1$, polynomial that assumes the value $f(x_0)$ in $x_0$ and whose derivative assumes the value $f'(x_0)$ in $x_0$. In particular, its graph coincides with the tangent to the graph of $f$ in $x_0$. In applications, functions are frequently replaced by their linearizations around appropriate points to simplify subsequent reasoning. Often, this is done without performing an error estimate like (2.5.13) in the hope that the error introduced by the replacement is in some sense ‘small’.
If $f$ is sufficiently often differentiable, it is to be expected that $f$ can be described with higher precision near $x_0$ by polynomials of higher order than 1. Indeed, this is true and Taylor’s theorem provides such so-called Taylor polynomials $p_n$ for $n \in \mathbb{N}$ with $n > 1$. It is tempting to speculate that $p_n$ is the uniquely determined polynomial of order $\le n$ such that
\[ p_n(x_0) = f(x_0) , \quad p_n^{(k)}(x_0) = f^{(k)}(x_0) \]
for $k = 1, \dots, n$. In that case, $p_n$ is easily determined to be of the form
\[ p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k \]
for all $x \in \mathbb{R}$ where we set $f^{(0)} := f$ and $f$ is assumed to be $(n + 1)$-times continuously differentiable. Indeed, this speculation turns out to be correct.
We first give Taylor’s theorem in a form which resembles that of the mean
value theorem. Its proof proceeds by application of the last to a skillfully
constructed auxiliary function.
Theorem 2.5.25. (Taylor’s theorem) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $a$ and $b$ be two different elements from $I$. Then there is $c$ in the open interval between $a$ and $b$ such that
\[ f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\, (b - a)^k + \frac{f^{(n)}(c)}{n!}\, (b - a)^n \tag{2.5.14} \]
where $f^{(0)} := f$ and $(b - a)^0 := 1$.
Proof. Define the auxiliary function $g : I \to \mathbb{R}$ by
\[ g(x) := f(b) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}\, (b - x)^k \]
for all $x \in I$. Then it follows that $g(b) = 0$ and moreover that $g$ is differentiable with
\[ g'(x) = -\sum_{k=0}^{n-1} \frac{f^{(k+1)}(x)}{k!}\, (b - x)^k + \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{(k - 1)!}\, (b - x)^{k-1} = -\frac{f^{(n)}(x)}{(n - 1)!}\, (b - x)^{n-1} \]
for all $x \in I$. Define a further auxiliary function $h : I \to \mathbb{R}$ by
\[ h(x) := g(x) - \left( \frac{b - x}{b - a} \right)^{n} g(a) \]
for all $x \in I$. Then it follows that $h(a) = h(b) = 0$ and that $h$ is differentiable with
\[ h'(x) = -\frac{f^{(n)}(x)}{(n - 1)!}\, (b - x)^{n-1} + n\, \frac{(b - x)^{n-1}}{(b - a)^n}\, g(a) \]
for all $x \in I$. Hence according to Theorem 2.5.4, there is $c$ in the open interval between $a$ and $b$ such that
\[ 0 = h'(c) = -\frac{f^{(n)}(c)}{(n - 1)!}\, (b - c)^{n-1} + n\, \frac{(b - c)^{n-1}}{(b - a)^n}\, g(a) \]
which implies (2.5.14).
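Formula (2.5.14) can be illustrated for $f = \exp$ and $a = 0$, where every derivative is again $\exp$. Since $c$ lies strictly between $0$ and $b > 0$, the remainder term must lie strictly between $b^n/n!$ and $e^b\, b^n/n!$. The sketch below checks this; the names `taylor_sum_exp` and `remainder_range` are hypothetical and the choice of $f$ is an assumption made for the illustration.

```python
import math

def taylor_sum_exp(b, n):
    """Sum_{k=0}^{n-1} b^k / k!, i.e. the sum in (2.5.14) for f = exp, a = 0."""
    return sum(b ** k / math.factorial(k) for k in range(n))

def remainder_range(b, n):
    """Range of exp(c) * b^n / n! for c in (0, b), assuming b > 0."""
    scale = b ** n / math.factorial(n)
    return scale, scale * math.exp(b)

if __name__ == "__main__":
    b = 1.0
    for n in range(1, 8):
        remainder = math.exp(b) - taylor_sum_exp(b, n)
        low, high = remainder_range(b, n)
        print(n, low, remainder, high)
```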
Taylor’s Theorem 2.5.25 is usually applied in the following form.
Corollary 2.5.26. (Taylor’s formula) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval of length $L$ and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $x_0 \in I$ and $C \ge 0$ be such that
\[ |f^{(n)}(x)| \le C \]
for all $x \in I$. Then
\[ \left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k \right| \le \frac{C L^n}{n!} \]
for all $x \in I$.
Remark 2.5.27. The polynomial
\[ p_{n-1}(x) := \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k \]
for all $x \in \mathbb{R}$ in Corollary 2.5.26 is called ‘the $(n - 1)$-degree polynomial of $f$ centered at $x_0$’. In particular, it follows (for the case $n = 2$) that
\[ p_1(x) = f(x_0) + f'(x_0)\, (x - x_0) \]
for all $x \in \mathbb{R}$, which is also called the ‘linearization or linear approximation of $f$ at $x_0$’, and
\[ |f(x) - p_1(x)| \le \frac{C L^2}{2} \]
if $C \ge 0$ is such that
\[ |f''(x)| \le C \]
for all $x \in I$. In applications, one often meets the notation
\[ f(x) \approx p_1(x) \]
saying that $f$ and $p_1$ are approximately the same near $x_0$. If the error can be seen to be ‘negligible’ for the application, this often leads to a replacement of $f$ by its linearization.
Fig. 49: Graphs of $f$ and $p_1$ from Example 2.5.28.
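Example 2.5.28. Calculate the linearization $p_1$ of $f : [-1, \infty) \to \mathbb{R}$ defined by
\[ f(x) := \sqrt{1 + x} \]
for all $x \in [-1, \infty)$ at $x = 0$, and estimate its error on the interval $[0, 1/2]$. Solution: $f$ is twice differentiable on $(-1, \infty)$ with
\[ f'(x) = \frac{1}{2}\, (1 + x)^{-1/2} , \quad f''(x) = -\frac{1}{4}\, (1 + x)^{-3/2} . \]
Hence $p_1$ is given by
\[ p_1(x) = 1 + \frac{1}{2}\, x \]
for all $x \in \mathbb{R}$. Because of
\[ \left| -\frac{1}{4}\, (1 + x)^{-3/2} \right| \le \frac{1}{4} \]
for all $x \in [0, 1/2]$, it follows from Remark 2.5.27 (with $C = 1/4$ and $L = 1/2$) and from $|f(x)| \ge 1$ that the absolute value of the relative error satisfies
\[ \frac{|p_1(x) - f(x)|}{|f(x)|} \le \frac{1}{32} \]
for all $x \in [0, 1/2]$.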
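The bound $1/32$ of Example 2.5.28 can be compared with the actual error on a grid. This is a minimal sketch; the grid of 1001 points and the names `f`, `p1`, `ERROR_BOUND` are choices made for this illustration.

```python
import math

def f(x):
    """f(x) = sqrt(1 + x)."""
    return math.sqrt(1.0 + x)

def p1(x):
    """Linearization of f at x0 = 0."""
    return 1.0 + x / 2.0

C, L = 0.25, 0.5              # bound on |f''| and length of the interval
ERROR_BOUND = C * L ** 2 / 2  # C L^2 / 2 from Remark 2.5.27

grid = [0.5 * i / 1000 for i in range(1001)]  # sample points in [0, 1/2]
worst_error = max(abs(f(x) - p1(x)) for x in grid)
worst_relative_error = max(abs(f(x) - p1(x)) / abs(f(x)) for x in grid)

if __name__ == "__main__":
    print(ERROR_BOUND, worst_error, worst_relative_error)
```

On this grid the observed error stays below the bound, with the largest deviation at the right end point $x = 1/2$.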
We know that the first derivative of a function $f$ in a point $p$ of its domain provides the slope of the tangent at the graph of the function in the point $(p, f(p))$. Hence it is natural to ask whether there is a geometrical interpretation of the second derivative. Indeed, such an interpretation can be given in terms of the way in which the graph of the function ‘bends’. This can be seen by help of Taylor’s theorem. For this, we consider a three times continuously differentiable function $f$ defined on an open subinterval $(a, b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a, b)$. According to Taylor’s theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}\, (x - x_0)^2 + \frac{f'''(\xi)}{6}\, (x - x_0)^3 . \]
If $f''(x_0) \ne 0$, since $f'''$ is continuous, it follows for $x \ne x_0$ sufficiently near to $x_0$ that
\[ \frac{|f''(x_0)|}{2}\, (x - x_0)^2 > \frac{|f'''(\xi)|}{6}\, |x - x_0|^3 \]
and hence that
\[ f(x) > f(x_0) + f'(x_0)(x - x_0) \]
if $f''(x_0) > 0$ and
\[ f(x) < f(x_0) + f'(x_0)(x - x_0) \]
if $f''(x_0) < 0$. Hence if $f''(x_0) > 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ exceeds the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies above the tangent at $x_0$. In this case, we say that $f$ is locally convex at $x_0$. If $f''(x_0) < 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ is smaller than the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies below the tangent at $x_0$. In this case, we say that $f$ is locally concave at $x_0$.
Definition 2.5.29. (Convexity / concavity of a differentiable function) Let $f : (a, b) \to \mathbb{R}$ be differentiable where $a, b \in \mathbb{R}$ are such that $a < b$. We call $f$ convex (concave) if
\[ f(x) > f(x_0) + f'(x_0)(x - x_0) \quad \left(\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,\right) \]
for all $x_0, x \in (a, b)$ such that $x_0 \ne x$.
The following theorem proves the convexity / concavity of a function under
less restrictive assumptions than our motivational analysis above.
Theorem 2.5.30. Let $f : (a, b) \to \mathbb{R}$ be twice differentiable on $(a, b)$, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in (a, b)$. Then
\[ f(x) > f(x_0) + f'(x_0)(x - x_0) \quad \left(\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,\right) \]
for all $x_0, x \in (a, b)$ such that $x_0 \ne x$, i.e., ‘$f$ is convex’ (‘$f$ is concave’).
Proof. First, we consider the case that $f''(x) > 0$ for all $x \in (a, b)$. For this, let $x_0 \in (a, b)$ and $x \in (a, b)$ be such that $x > x_0$. According to Theorem 2.5.6, there is $c \in (x_0, x)$ such that
\[ \frac{f(x) - f(x_0)}{x - x_0} = f'(c) . \]
By Theorem 2.5.10, it follows that $f'$ is strictly increasing on $[x_0, x]$ and hence that
\[ \frac{f(x) - f(x_0)}{x - x_0} = f'(c) > f'(x_0) \]
and that
\[ f(x) > f(x_0) + f'(x_0)(x - x_0) . \tag{2.5.15} \]
Analogously for $x \in (a, b)$ such that $x < x_0$, it follows that there is $c \in (x, x_0)$ such that
\[ \frac{f(x_0) - f(x)}{x_0 - x} = f'(c) \]
and that $f'$ is strictly increasing on $[x, x_0]$ and hence that
\[ \frac{f(x_0) - f(x)}{x_0 - x} = f'(c) < f'(x_0) \]
which implies (2.5.15). In the remaining case that $f''(x) < 0$ for all $x \in (a, b)$, application of the previous to $-f$ gives
\[ -f(x) > -f(x_0) - f'(x_0)(x - x_0) \]
and hence
\[ f(x) < f(x_0) + f'(x_0)(x - x_0) \]
for all $x_0 \in (a, b)$ and $x \in (a, b) \setminus \{x_0\}$.
Fig. 50: Graphs of exp along with linearizations around $x = 1$, $2$ and $3$.
Example 2.5.31. The exponential function exp is convex because of $\exp''(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. See Fig. 50.
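Example 2.5.32. Find the intervals of convexity and concavity of $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := x^4 + x^3 - 2x^2 + 1 \]
for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with
\[ f'(x) = 4x^3 + 3x^2 - 4x , \quad f''(x) = 12x^2 + 6x - 4 = 12 \left( x^2 + \frac{1}{2}\, x - \frac{1}{3} \right) = 12 \left( x + \frac{1}{4} - \sqrt{\frac{19}{48}} \right) \left( x + \frac{1}{4} + \sqrt{\frac{19}{48}} \right) \]
for all $x \in \mathbb{R}$. Hence $f$ is convex on the intervals
\[ \left( -\infty , -\frac{1}{4} - \sqrt{\frac{19}{48}} \right) , \quad \left( -\frac{1}{4} + \sqrt{\frac{19}{48}} , \infty \right) \]
and concave on the interval
\[ \left( -\frac{1}{4} - \sqrt{\frac{19}{48}} , -\frac{1}{4} + \sqrt{\frac{19}{48}} \right) . \]
Fig. 51: Graph of $f$ from Example 2.5.32 and parallels to the $y$ axis through its inflection points.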
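The inflection points of Example 2.5.32 can be verified with a few lines of code. This sketch assumes the polynomial $f(x) = x^4 + x^3 - 2x^2 + 1$ as read from the derivatives given in the example; the names `second_derivative`, `LEFT` and `RIGHT` are hypothetical.

```python
import math

def f(x):
    """Polynomial of the example (constant term as assumed in the lead-in)."""
    return x ** 4 + x ** 3 - 2 * x ** 2 + 1

def second_derivative(x):
    """f''(x) = 12 x^2 + 6 x - 4."""
    return 12 * x ** 2 + 6 * x - 4

# Zeros of f'': x = -1/4 -/+ sqrt(19/48)
LEFT = -0.25 - math.sqrt(19 / 48)
RIGHT = -0.25 + math.sqrt(19 / 48)

def is_convex_at(x):
    """True where f'' > 0, i.e. on the intervals of convexity."""
    return second_derivative(x) > 0

if __name__ == "__main__":
    print(LEFT, RIGHT)
    for x in (-2.0, 0.0, 1.0):
        print(x, is_convex_at(x))
```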
The following theorem gives another useful characterization of the convexity of a differentiable function $f$ defined on an open interval $I$ of $\mathbb{R}$. Such a function is convex if and only if for every $x, y \in I$ such that $x < y$ the graph of $f|_{(x, y)}$ lies below the straight line (‘secant’) between $(x, f(x))$ and $(y, f(y))$.
Theorem 2.5.33. Let $f : (a, b) \to \mathbb{R}$ be differentiable on $(a, b)$ where $a, b \in \mathbb{R}$ are such that $a < b$. Then $f$ is convex if and only if
\[ f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x} = f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x} \]
for all $x, y, z \in (a, b)$ such that $x < z < y$.
Fig. 52: Graph of a convex function (black) and secant (blue). Compare Theorem 2.5.33.
Proof. If $f$ is convex, we conclude as follows. For the first step, let $x, y \in (a, b)$ be such that $x < y$. As a consequence of the convexity of $f$, it follows that
\[ f(y) > f(x) + f'(x)(y - x) , \quad f(x) > f(y) + f'(y)(x - y) \]
and hence that
\[ f'(x) < \frac{f(y) - f(x)}{y - x} < f'(y) . \]
This is true for all $x, y \in (a, b)$ such that $x < y$. Note that this implies that $f'$ is strictly increasing. For the second step, let $x, y, z \in (a, b)$ be such that $x < z < y$. By the mean value theorem, Theorem 2.5.6, it follows the existence of $\xi \in (x, y)$ such that
\[ \frac{f(y) - f(x)}{y - x} = f'(\xi) . \]
In the case that $\xi \le z$, it follows by help of the first step that
\[ \frac{f(y) - f(x)}{y - x} = f'(\xi) \le f'(z) < \frac{f(y) - f(z)}{y - z} \]
and hence that
\[ f(z) < f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x} . \]
In the case that $z \le \xi$, it follows by help of the first step that
\[ \frac{f(y) - f(x)}{y - x} = f'(\xi) \ge f'(z) > \frac{f(z) - f(x)}{z - x} \]
and hence that
\[ f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x} . \]
On the other hand, if
\[ f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x} = f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x} \]
for all $x, y, z \in (a, b)$ such that $x < z < y$, we conclude as follows. For this, note that the previous implies that
\[ \frac{f(z) - f(x)}{z - x} < \frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} . \]
In the following, let $x, y, z, \xi \in (a, b)$ be such that $x < z < \xi < y$. It follows from the assumption that
\[ \frac{f(z) - f(x)}{z - x} < \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x} . \]
From this, it follows by taking the limit $z \to x$ that
\[ f'(x) \le \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x} . \]
This implies that
\[ f'(x) < \frac{f(y) - f(x)}{y - x} \]
and therefore that
\[ f(y) > f(x) + f'(x)(y - x) . \tag{2.5.16} \]
Also, it follows from the assumption that
\[ \frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} < \frac{f(y) - f(\xi)}{y - \xi} . \]
From this, it follows by taking the limit $\xi \to y$ that
\[ \frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} \le f'(y) . \]
This implies that
\[ \frac{f(y) - f(x)}{y - x} < f'(y) \]
and therefore that
\[ f(x) > f(y) + f'(y)(x - y) . \tag{2.5.17} \]
Since (2.5.16) is true for all $x, y \in (a, b)$ such that $x < y$, we conclude the following for $x, y \in (a, b)$ such that $y < x$
\[ f(x) > f(y) + f'(y)(x - y) . \]
Finally, from this and (2.5.17), it follows that
\[ f(x) > f(y) + f'(y)(x - y) \]
for all $x, y \in (a, b)$ such that $x \ne y$.
A typical example for the application of Theorems 2.5.30, 2.5.33 is given in the following example which derives an occasionally used lower bound for the sine function.
Fig. 53: Graph of sine function (black) and secant (blue). Compare Example 2.5.34.
Example 2.5.34. Show that
\[ \sin(x) \ge 2x/\pi \tag{2.5.18} \]
for all $x \in [0, \pi/2]$. Solution: By application of Theorem 2.5.30, it follows that the restriction of $-\sin$ to $(0, \pi)$ is convex. According to Theorem 2.5.33, this implies that
\[ -\sin(x) \le -\sin(1/n) + [\, x - (1/n) \,]\, \frac{\sin(1/n) - 1}{(\pi/2) - (1/n)} \]
for all $x \in [1/n, \pi/2]$ where $n \in \mathbb{N}^*$. By taking the limit $n \to \infty$, this leads to
\[ -\sin(x) \le -2x/\pi \]
for all $x \in (0, \pi/2]$. From the last and the fact that (2.5.18) is trivially satisfied for $x = 0$, it follows the validity of (2.5.18) for all $x \in [0, \pi/2]$.
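The lower bound (2.5.18) is easily checked on a grid. This is a sketch for illustration only; the grid size and the helper name `secant_gap` are assumptions, and a small tolerance is used in comparisons to absorb floating-point rounding at the end points.

```python
import math

def secant_gap(x):
    """sin(x) - 2x/pi, which (2.5.18) claims to be >= 0 on [0, pi/2]."""
    return math.sin(x) - 2.0 * x / math.pi

if __name__ == "__main__":
    grid = [math.pi / 2 * i / 1000 for i in range(1001)]
    smallest = min(secant_gap(x) for x in grid)
    largest = max(secant_gap(x) for x in grid)
    # The gap vanishes at both end points and is positive in between.
    print(smallest, largest)
```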
We know that the vanishing of the first derivative in a point $x$ of the domain is a necessary, but in general not sufficient, condition for a differentiable function $f$ to assume a local maximum or minimum in $x$. In that case, the tangent to the graph of $f$ in the point $(x, f(x))$ is horizontal; if $f$ is in addition twice continuously differentiable such that $f''(x) < 0$ ($f''(x) > 0$), then it follows by the continuity of $f''$ that the restriction of $f''$ to a sufficiently small interval around $x$ assumes strictly negative (strictly positive) values and hence that the restriction of $f$ to that interval is concave (convex) and therefore that $x$ marks the position of a local maximum (minimum) of $f$.
Fig. 54: Graph of $f$ from Example 2.5.36.
Theorem 2.5.35. (Sufficient condition for the existence of a local minimum/maximum) Let $f$ be a twice continuously differentiable real-valued function on some open interval $I$ of $\mathbb{R}$. Further, let $x_0 \in I$ be a critical point of $f$ such that $f''(x_0) > 0$ ($f''(x_0) < 0$). Then $f$ has a local minimum (maximum) at $x_0$.
Proof. Since $f''$ is continuous with $f''(x_0) > 0$ ($f''(x_0) < 0$), there is an open interval $J$ around $x_0$ such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in J$. (Otherwise there is for every $n \in \mathbb{N}^*$ some $y_n \in I$ such that $|y_n - x_0| < 1/n$ and $f''(y_n) \le 0$ ($f''(y_n) \ge 0$). In particular, this implies that $\lim_{n \to \infty} y_n = x_0$ and by the continuity of $f''$ also that $\lim_{n \to \infty} f''(y_n) = f''(x_0)$. Hence it follows by Theorem 2.3.12 that $f''(x_0) \le 0$ ($f''(x_0) \ge 0$), which is a contradiction.) Hence it follows by Theorem 2.5.30 and $f'(x_0) = 0$ that $f(x) > f(x_0)$ ($f(x) < f(x_0)$) for all $x \in J \setminus \{x_0\}$.
Example 2.5.36. Find the values of the local maxima and minima of
\[ f(x) := \ln\!\left( \frac{5}{4} + \sin^2(x) \right) \]
for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with
\[ f'(x) = \frac{\sin(2x)}{\frac{5}{4} + \sin^2(x)} , \quad f''(x) = \frac{2 \cos(2x)}{\frac{5}{4} + \sin^2(x)} - \frac{\sin^2(2x)}{\left( \frac{5}{4} + \sin^2(x) \right)^2} \]
for all $x \in \mathbb{R}$. Hence the critical points of $f$ are at $x_k := k\pi/2$, $k \in \mathbb{Z}$, and for each $k \in \mathbb{Z}$:
\[ f''(x_k) = \frac{2 (-1)^k}{\frac{5}{4} + \sin^2(x_k)} . \]
Hence it follows by Theorem 2.5.35 that, for each $k \in \mathbb{Z}$, $f$ has a local minimum of value $\ln(5/4)$ at $x_{2k}$ and a local maximum of value $\ln(9/4)$ at $x_{2k+1}$.
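The classification of the critical points in Example 2.5.36 can be reproduced numerically. The sketch below evaluates $f$ and the stated formula for $f''$ at $x_k = k\pi/2$; the helper names are hypothetical.

```python
import math

def f(x):
    """f(x) = ln(5/4 + sin^2(x))."""
    return math.log(1.25 + math.sin(x) ** 2)

def second_derivative(x):
    """f''(x) as given in the example."""
    s = 1.25 + math.sin(x) ** 2
    return 2 * math.cos(2 * x) / s - math.sin(2 * x) ** 2 / s ** 2

def classify(k):
    """Return 'min' or 'max' for the critical point x_k = k*pi/2."""
    return "min" if second_derivative(k * math.pi / 2) > 0 else "max"

if __name__ == "__main__":
    for k in range(-3, 4):
        x_k = k * math.pi / 2
        print(k, classify(k), f(x_k))
```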
Another important consequence of Theorem 2.5.6 (or its equivalent, Rolle’s
theorem) is given by Cauchy’s extended mean value theorem which is the
basis for the proof of L’Hospital’s rule, Theorem 2.5.38, for the calculation
of indeterminate forms. The proof of the extended mean value theorem
proceeds by application of Rolle’s theorem to a skillfully devised auxiliary
function.
Theorem 2.5.37. (Cauchy’s extended mean value theorem) Let $f, g : [a, b] \to \mathbb{R}$ be continuous functions where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f, g$ be continuously differentiable on $(a, b)$ and such that $g'(x) \ne 0$ for all $x \in (a, b)$. Then there is $c \in (a, b)$ such that
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)} . \tag{2.5.19} \]
Proof. Since $g'$ is continuous with $g'(x) \ne 0$ for all $x \in (a, b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a, b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a, b)$. Since $g$ is continuous, from this also follows that $g(b) \ne g(a)$. Define the auxiliary function $h : [a, b] \to \mathbb{R}$ by
\[ h(x) := f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\, (g(x) - g(a)) \]
for all $x \in [a, b]$. Then $h$ is continuous as well as differentiable on $(a, b)$ such that
\[ h'(x) = f'(x) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(x) \]
for all $x \in (a, b)$ and $h(a) = h(b) = 0$. Hence according to Theorem 2.5.4, there is $c \in (a, b)$ such that
\[ h'(c) = f'(c) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(c) = 0 \]
which implies (2.5.19).
L’Hospital’s rule goes back to Johann Bernoulli who instructed the young
French marquis Guillaume Francois Antoine de L’Hospital in 1692 in the
new Leibnizian discipline of calculus during a visit in Paris. Johann signed
a contract under which in return for a regular salary, he agreed to send
L’Hospital his discoveries in mathematics, to be used as the marquis might
wish. The result was that one of Johann’s chief contributions to calculus
from 1694 has ever since been known as L’Hospital’s rule on indeterminate
forms after its publication in L’Hospital’s book ‘Analyse des infiniment petits’ in 1696 [69]. L’Hospital’s book was the first textbook on calculus and
was met with great success.
We already met an indeterminate form in Example 2.3.54 where it was proved that
\[ \lim_{x \to 0, x \ne 0} \frac{\sin(x)}{x} = 1 . \tag{2.5.20} \]
Formally, that limit is of the ‘indeterminate’ type
\[ \frac{0}{0} \]
where the last formal expression is obtained by replacing $\sin(x)$ and $x$ in the quotient $\sin(x)/x$ by
\[ \lim_{x \to 0, x \ne 0} \sin(x) \quad \text{and} \quad \lim_{x \to 0, x \ne 0} x , \]
respectively. Since sin and the identical function on $\mathbb{R}$ are continuous and according to the limit laws, this expression would give the correct result for the limit (2.5.20) if it involved division by a non-zero number. But, since division by zero is not defined, that expression is not defined and hence ‘indeterminate’. The following theorem treats also indeterminate limits of the type
\[ \frac{\infty}{\infty} . \]
The calculation of limits of other indeterminate types can usually be reduced to the calculation of limits of these two types.
L’Hospital’s rule is a simple consequence of Cauchy’s extended mean value theorem.
Theorem 2.5.38. (Indeterminate forms/L’Hospital’s rule) Let $f : (a, b) \to \mathbb{R}$ and $g : (a, b) \to \mathbb{R}$ be continuously differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $g'(x) \ne 0$ for all $x \in (a, b)$. Further, let
\[ \lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \tag{2.5.21} \]
or let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a, b)$ as well as
\[ \lim_{x \to a} \frac{1}{|f(x)|} = \lim_{x \to a} \frac{1}{|g(x)|} = 0 . \tag{2.5.22} \]
Finally, let
\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} \]
exist. Then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} . \tag{2.5.23} \]
Proof. Since $g'$ is continuous with $g'(x) \ne 0$ for all $x \in (a, b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a, b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a, b)$. First, we consider the case (2.5.21). Then $f$ and $g$ can be extended to continuous functions on $[a, b)$ assuming the value 0 in $a$. Now, let $x_0, x_1, \dots$ be a sequence of elements of $(a, b)$ converging to $a$. Then by Theorem 2.5.37 for every $n \in \mathbb{N}$ there is a corresponding $c_n \in (a, x_n)$ such that
\[ \frac{f(x_n)}{g(x_n)} = \frac{f'(c_n)}{g'(c_n)} . \]
Obviously, the sequence $c_0, c_1, \dots$ is converging to $a$, and hence it follows that
\[ \lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(c_n)}{g'(c_n)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \tag{2.5.24} \]
and hence, finally, (2.5.23). Finally, we consider the second case. So let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a, b)$, and in addition let (2.5.22) be satisfied. Further, let $x_0, x_1, \dots$ be some sequence of elements of $(a, b)$ converging to $a$ and let $b' \in (a, b)$. Because of (2.5.22), there is $n_0 \in \mathbb{N}$ such that
\[ \left| \frac{f(b')}{f(x_n)} \right| < 1 \quad \text{and} \quad \left| \frac{g(b')}{g(x_n)} \right| < 1 \]
for all $n \in \mathbb{N}$ such that $n \ge n_0$. Then according to Theorem 2.5.37 for any such $n$, there is a corresponding $c_n \in (x_n, b')$ such that
\[ \frac{f(x_n) - f(b')}{g(x_n) - g(b')} = \frac{f'(c_n)}{g'(c_n)} . \]
Hence it follows that
\[ \frac{f(x_n)}{g(x_n)} = \frac{1 - \dfrac{g(b')}{g(x_n)}}{1 - \dfrac{f(b')}{f(x_n)}} \cdot \frac{f'(c_n)}{g'(c_n)} \]
and since $c_0, c_1, \dots$ is converging to $a$ by (2.5.22) and Theorem 2.3.4, it follows the relation (2.5.24) and hence, finally, (2.5.23).
Example 2.5.39. Find
\[ \lim_{x \to 0} x \ln(x) . \]
Solution: Define $f(x) := \ln(x)$ and $g(x) := 1/x$ for all $x \in (0, 1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(x) = -1/x^2 \ne 0$ for all $x \in (0, 1)$. Further, $|f(x)| = |\ln(x)| > 0$, $|g(x)| = |1/x| = 1/|x| > 0$ for all $x \in (0, 1)$. Finally, (2.5.22) is satisfied and
\[ \lim_{x \to 0} \frac{f'(x)}{g'(x)} = \lim_{x \to 0} (-x) = 0 . \]
Hence according to Theorem 2.5.38:
\[ \lim_{x \to 0} x \ln(x) = 0 . \]
Example 2.5.40. Determine
\[ \lim_{x \to \infty} x e^{-x} . \]
Solution: Define $f(y) := 1/y$ and $g(y) := \exp(1/y)$ for all $y \in (0, 1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(y) = -y^{-2} \exp(1/y) \ne 0$ for all $y \in (0, 1)$. Further, $|f(y)| = 1/|y| > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0, 1)$. Finally, (2.5.22) is satisfied and
\[ \lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0} \frac{1}{e^{1/y}} = 0 . \]
Hence according to Theorem 2.5.38:
\[ \lim_{x \to \infty} x e^{-x} = \lim_{y \to 0} \frac{1/y}{e^{1/y}} = 0 . \]
Example 2.5.41. Calculate
\[ \lim_{x \to \infty} x^2 e^{-x} . \]
Solution: Define $f(y) := 1/y^2$ and $g(y) := \exp(1/y)$ for all $y \in (0, 1)$. Then $f$ and $g$ are continuously differentiable as well as such that $g'(y) = -\exp(1/y)/y^2 \ne 0$ for all $y \in (0, 1)$. Further, $|f(y)| = 1/y^2 > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0, 1)$. Finally, (2.5.22) is satisfied and by Example 2.5.40
\[ \lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0} \frac{2/y}{e^{1/y}} = 0 . \]
Hence according to Theorem 2.5.38:
\[ \lim_{x \to \infty} x^2 e^{-x} = \lim_{y \to 0} \frac{1/y^2}{e^{1/y}} = 0 . \]
Remark 2.5.42. Recursively in this way, it can be shown that
\[ \lim_{x \to \infty} x^n e^{-x} = 0 \]
for all $n \in \mathbb{N}$.
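The decay stated in Remark 2.5.42 can be observed numerically. The following sketch tabulates $x^n e^{-x}$ for growing $x$; the function name `power_times_decay` and the sample points are assumptions made for this illustration and do not replace the proof.

```python
import math

def power_times_decay(n, x):
    """x^n * exp(-x), the expression from Remark 2.5.42."""
    return x ** n * math.exp(-x)

SAMPLE_POINTS = (50.0, 100.0, 200.0, 400.0)

if __name__ == "__main__":
    for n in range(6):
        values = [power_times_decay(n, x) for x in SAMPLE_POINTS]
        # The values shrink rapidly although x^n alone grows without bound.
        print(n, values)
```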
That the condition that $g'(x) \ne 0$ for all $x \in (a, b)$ in Theorem 2.5.38 is not redundant can be seen from the following example.
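Example 2.5.43. For this define
$$f(x) := \frac{2}{x} + \sin\left(\frac{2}{x}\right) , \quad g(x) := \left[ \frac{2}{x} + \sin\left(\frac{2}{x}\right) \right] e^{\sin(1/x)}$$
for all $x \in (0, 2/5)$. Then $f$ and $g$ are continuously differentiable and satisfy
$$\lim_{x \to 0} \frac{1}{|f(x)|} = \lim_{x \to 0} \frac{1}{|g(x)|} = 0 .$$
Since
$$\frac{f(x)}{g(x)} = e^{-\sin(1/x)}$$
for all $x \in (0, 2/5)$, $(f/g)(x)$ does not have a limit value for $x \to 0$. Further, it follows that
$$f'(x) = -\frac{2}{x^2} \left[ 1 + \cos\left(\frac{2}{x}\right) \right] = -\frac{4}{x^2} \cos^2\left(\frac{1}{x}\right) ,$$
$$g'(x) = -\frac{1}{x^2} \cos\left(\frac{1}{x}\right) e^{\sin(1/x)} \left[ \frac{2}{x} + \sin\left(\frac{2}{x}\right) + 4 \cos\left(\frac{1}{x}\right) \right]$$
and hence that
$$\frac{f'(x)}{g'(x)} = \frac{4x \cos(1/x) \, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)}$$
for all
$$x \in (0, 2/5) \setminus \left\{ \frac{2}{(2k+1)\pi} : k \in \mathbb{N} \right\} .$$
We notice that
$$\lim_{x \to 0} \frac{4x \cos(1/x) \, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)} = 0 .$$
This does not contradict Theorem 2.5.38 since $g'$ has zeros of the form $2/((2k+1)\pi)$, $k \in \mathbb{N}$. Hence there is no $b > 0$ such that the restrictions of $f$ and $g$ to $(0,b)$ would satisfy the assumptions in Theorem 2.5.38.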
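The behavior described in Example 2.5.43 can be checked numerically. The following is a minimal sketch, assuming the reading $g(x) = f(x) e^{\sin(1/x)}$ of the example; the function names `f`, `g` and `derivative_ratio` are illustrative and not part of the text. It evaluates $f/g$ along two sequences tending to $0$ and the closed form of $f'/g'$ at a small argument.

```python
import math

def f(x):
    return 2.0 / x + math.sin(2.0 / x)

def g(x):
    # Assumed reading of the example: g(x) = f(x) * exp(sin(1/x)).
    return f(x) * math.exp(math.sin(1.0 / x))

def derivative_ratio(x):
    # Closed form of f'(x)/g'(x), valid where cos(1/x) != 0.
    c = math.cos(1.0 / x)
    s = math.sin(1.0 / x)
    return 4.0 * x * c * math.exp(-s) / (2.0 + x * math.sin(2.0 / x) + 4.0 * x * c)

# Two points close to 0 at which sin(1/x) equals +1 and -1, respectively.
k = 50
x_plus = 1.0 / (math.pi / 2 + 2 * math.pi * k)
x_minus = 1.0 / (3 * math.pi / 2 + 2 * math.pi * k)

print(f(x_plus) / g(x_plus))    # close to exp(-1)
print(f(x_minus) / g(x_minus))  # close to exp(+1)
print(derivative_ratio(1e-4))   # small in absolute value
```

The two different values of $f/g$ arbitrarily close to $0$ illustrate that $f/g$ has no limit, while the quotient of the derivatives becomes small.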
The following example shows that in general the existence of
$$\lim_{x \to a} \frac{f(x)}{g(x)}$$
does not imply the existence of
$$\lim_{x \to a} \frac{f'(x)}{g'(x)} .$$
Example 2.5.44. For this, define
$$f(x) := x \sin(1/x^2) \exp(-1/x) , \quad g(x) := \exp(-1/x)$$
for all $x > 0$. Then
$$\lim_{x \to 0} f(x) = 0 , \quad \lim_{x \to 0} g(x) = 0 , \quad \lim_{x \to 0} \frac{f(x)}{g(x)} = 0 .$$
Further,
$$f'(x) = \frac{1}{x^2} \left[ x(x+1) \sin(1/x^2) - 2 \cos(1/x^2) \right] \exp(-1/x) , \quad g'(x) = \frac{1}{x^2} \exp(-1/x) ,$$
$$\frac{f'(x)}{g'(x)} = x(x+1) \sin(1/x^2) - 2 \cos(1/x^2)$$
for all $x > 0$. Hence $f'/g'$ does not have a limit value for $x \to 0$.
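For the motivation of the following contraction mapping lemma, we consider a method of calculating square roots of numbers which can be traced back to ancient Greek times, but there are indications that this method was already known in ancient Babylonia. For this, we consider the problem of approximation of $\sqrt{N}$ by fractions where $N$ is some non-zero natural number. If $q$ is some non-zero positive rational number such that
$$q^2 < N ,$$
then it follows that
$$q < \sqrt{N} , \quad \frac{N}{q} > \frac{N}{\sqrt{N}} = \sqrt{N}$$
and hence that
$$q < \sqrt{N} < \frac{N}{q} .$$
Hence, the arithmetic mean
$$\bar{q} := \frac{1}{2} \left( q + \frac{N}{q} \right)$$
of $q$ and $N/q$, which is the midpoint of the interval $[q, N/q]$, might be a better approximation to $\sqrt{N}$ than $q$. Indeed, a little calculation gives that
$$\bar{q}^2 - N = \frac{1}{4} \left( q + \frac{N}{q} \right)^2 - N = \frac{1}{4} \left( q^2 + 2N + \frac{N^2}{q^2} \right) - N = \frac{1}{4} \left( q^2 - 2N + \frac{N^2}{q^2} \right) = \frac{1}{4} \left( q - \frac{N}{q} \right)^2$$
$$= \frac{(N - q^2)^2}{4 q^2} = \frac{1}{4} \left( \frac{N}{q^2} - 1 \right) (N - q^2) > 0$$
and hence that
$$\bar{q}^2 - N < N - q^2$$
if
$$\frac{1}{4} \left( \frac{N}{q^2} - 1 \right) < 1 ,$$
which is equivalent to $N < 5 q^2$. Hence if
$$q^2 < N < 5 q^2 ,$$
then $\bar{q}$ is a better approximation to $\sqrt{N}$ than $q$, in the sense that $|\bar{q}^2 - N| < |q^2 - N|$. Note that $\bar{q}$ does not satisfy the same inequalities since $\bar{q}^2 > N$. On the other hand,
$$\bar{\bar{q}} := \frac{1}{2} \left( \bar{q} + \frac{N}{\bar{q}} \right)$$
satisfies
$$\bar{\bar{q}}^2 - N = \frac{1}{4} \left( \frac{N}{\bar{q}^2} - 1 \right) (N - \bar{q}^2) = \frac{1}{4} \left( 1 - \frac{N}{\bar{q}^2} \right) (\bar{q}^2 - N) = \frac{(\bar{q}^2 - N)^2}{4 \bar{q}^2} > 0$$
and hence is a better approximation to $\sqrt{N}$ than $\bar{q}$ since
$$\frac{1}{4} \left( 1 - \frac{N}{\bar{q}^2} \right) < 1 .$$
Hence by continuing this process, we arrive at rational approximations to $\sqrt{N}$ whose accuracy increases in every step.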
For instance for $N = 2$ and $q = 1$ (note that $q^2 < N < 5 q^2$ since $1 < 2 < 5$), we arrive at the following rational approximations to $\sqrt{2}$:
$$\frac{3}{2} , \quad \frac{17}{12} , \quad \frac{577}{408} , \quad \frac{665857}{470832} , \quad \frac{886731088897}{627013566048} .$$
The value $17/12$, which gives $\sqrt{2}$ within an error of $3 \cdot 10^{-3}$, was used as a common rough approximation of $\sqrt{2}$ by the Babylonians. Starting from $q = 17/12$, Babylonian arithmetic leads to the fraction
$$1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} = \frac{30547}{21600}$$
which was found on the Babylonian tablet YBC 7289 and gives $\sqrt{2}$ within an error of $6 \cdot 10^{-7}$.

Fig. 55: Graph of $T$ for the case $N = 2$ and auxiliary curves.
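The averaging process described above can be reproduced in exact rational arithmetic. The following is a minimal sketch using Python's standard `fractions` module; the helper names `babylonian_step` and `babylonian_sequence` are illustrative and not taken from the text.

```python
from fractions import Fraction

def babylonian_step(q, n):
    """One averaging step q -> (q + n/q)/2 in exact rational arithmetic."""
    return (q + Fraction(n) / q) / 2

def babylonian_sequence(q0, n, steps):
    """Return the first `steps` iterates obtained from the starting value q0."""
    q = Fraction(q0)
    iterates = []
    for _ in range(steps):
        q = babylonian_step(q, n)
        iterates.append(q)
    return iterates

approximations = babylonian_sequence(1, 2, 5)
for q in approximations:
    # q*q - 2 is the exact defect of the square; it shrinks in every step.
    print(q, float(q * q - 2))
```

Working with `Fraction` rather than floating point numbers keeps the numerators and denominators exact, so the printed fractions can be compared directly with the list given above.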
A modern interpretation of the process in terms of maps is that
$$q , \quad T(q) , \quad T(T(q)) = (T \circ T)(q) , \quad T(T(T(q))) = (T \circ (T \circ T))(q) , \quad \dots ,$$
where $T : (0, \infty) \to (0, \infty)$ is defined by
$$T(x) := \frac{1}{2} \left( x + \frac{N}{x} \right)$$
for every $x > 0$, gives a sequence of approximations to $\sqrt{N}$ of increasing accuracy. We expect that
$$\lim_{n \to \infty} T^n(q) = \sqrt{N}$$
where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0,\infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.

Fig. 56: $(n, x_n)$ for $x = 1$ and $n = 1$ to $n = 20$.

Indeed, if $x_0, x_1, \dots$ converges to some element $x^* \in (0, \infty)$, where $x > 0$ and
$$x_k := T^k(x)$$
for all $k \in \mathbb{N}$, then
$$x_{k+1} = T^{k+1}(x) = T(T^k(x)) = T(x_k) = \frac{1}{2} \left( x_k + \frac{N}{x_k} \right) ,$$
and hence it follows by the limit laws that
$$x^* = \lim_{k \to \infty} x_{k+1} = \lim_{k \to \infty} \frac{1}{2} \left( x_k + \frac{N}{x_k} \right) = \frac{1}{2} \left( x^* + \frac{N}{x^*} \right) .$$
As a consequence, in this case, $x^*$ satisfies the equation
$$\frac{1}{2} \left( x^* - \frac{N}{x^*} \right) = 0$$
or equivalently $(x^*)^2 = N$ which implies that
$$x^* = \sqrt{N}$$
since it was assumed that $x^* > 0$. It is natural to ask in what sense $\sqrt{N}$ is a particular point for the map $T$. For this, we notice that
$$T(\sqrt{N}) = \frac{1}{2} \left( \sqrt{N} + \frac{N}{\sqrt{N}} \right) = \sqrt{N} ,$$
that is, $T$ maps $\sqrt{N}$ onto itself, i.e., $\sqrt{N}$ is a so called ‘fixed point’ of the map $T$. Also, every fixed point $x$ of $T$ satisfies the equation
$$x = \frac{1}{2} \left( x + \frac{N}{x} \right)$$
which is equivalent to $x = \sqrt{N}$, i.e., there is no other fixed point of $T$.
Finally, it is natural to ask whether there is a special property of the map that leads to the convergence of $x_0, x_1, \dots$. For this, we notice that for $x \geq \sqrt{N}$ and $y \geq \sqrt{N}$, it follows that
$$\frac{N}{xy} \leq 1$$
and hence that
$$|T(x) - T(y)| = \frac{1}{2} \left| x + \frac{N}{x} - y - \frac{N}{y} \right| = \frac{1}{2} \left| x - y + \frac{N}{xy} (y - x) \right| = \frac{1}{2} \left| 1 - \frac{N}{xy} \right| |x - y| \leq \frac{1}{2} |x - y| .$$
This leads to
$$|T(T(x)) - T(T(y))| \leq \frac{1}{2} |T(x) - T(y)| \leq \frac{1}{4} |x - y|$$
and inductively to
$$|T^k(x) - T^k(y)| \leq \frac{1}{2^k} |x - y|$$
for all $k \in \mathbb{N}$. Since $\sqrt{N}$ is a fixed point of $T$, this implies that
$$|T^k(x) - \sqrt{N}| \leq \frac{1}{2^k} |x - \sqrt{N}|$$
and hence that
$$\lim_{k \to \infty} T^k(x) = \sqrt{N} . \tag{2.5.25}$$
For $0 < x < \sqrt{N}$, as already observed above, it follows that
$$(T(x))^2 - N = \frac{(x^2 - N)^2}{4 x^2} > 0$$
and hence that
$$T(x) > \sqrt{N} .$$
Therefore, we conclude that (2.5.25) holds for all $x > 0$. In addition, we notice that the fact that $N \in \mathbb{N}$ was nowhere used in the previous discussion. As a consequence, summarizing that discussion, we proved the following result.
Theorem 2.5.45. (Babylonian method of approximating roots of real numbers, I) Let $a > 0$ and $T : (0, \infty) \to (0, \infty)$ be defined by
$$T(x) := \frac{1}{2} \left( x + \frac{a}{x} \right)$$
for every $x > 0$, then
$$\lim_{k \to \infty} T^k(x) = \sqrt{a}$$
where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0,\infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.
Functions $T$ satisfying
$$|T(x) - T(y)| \leq \alpha |x - y|$$
for some $0 \leq \alpha < 1$ and all $x, y$ of their domain are called contractions. We notice from the previous discussion that if such a function $T$ has a fixed point $x^*$ and maps its domain into that domain, then it follows as above that
$$\lim_{k \to \infty} T^k(x) = x^*$$
for all $x \in D(T)$. On the other hand, in many cases the existence of such a fixed point is not obvious, but it can be shown with the help of Theorem 2.3.33 if the domain of $T$ is a closed interval of $\mathbb{R}$. This is the additional point that is treated in Lemma 2.5.46.
Lemma 2.5.46. (Contraction mapping lemma on the real line) Let $T : [a,b] \to \mathbb{R}$ be such that $T([a,b]) \subset [a,b]$ where $a, b \in \mathbb{R}$ are such that $a < b$. In addition, let $T$ be a contraction, i.e., let there exist $\alpha \in [0,1)$ such that
$$|T(x) - T(y)| \leq \alpha \, |x - y| \tag{2.5.26}$$
for all $x, y \in [a,b]$. Then $T$ has a unique fixed point, i.e., a unique $x^* \in [a,b]$ such that
$$T(x^*) = x^* .$$
Further,
$$|x - x^*| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.27}$$
and
$$\lim_{n \to \infty} T^n(x) = x^* \tag{2.5.28}$$
for every $x \in [a,b]$ where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{[a,b]}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.
Proof. Note that (2.5.26) implies that $T$ is continuous. Further, define the hence continuous function $f : [a,b] \to \mathbb{R}$ by
$$f(x) := |x - T(x)|$$
for all $x \in [a,b]$. Note that $x \in [a,b]$ is a fixed point of $T$ if and only if it is a zero of $f$. By Theorem 2.3.33 $f$ assumes its minimum in some point $x^* \in [a,b]$. Hence
$$0 \leq f(x^*) \leq f(T(x^*)) = |T(x^*) - T(T(x^*))| \leq \alpha \, |x^* - T(x^*)| = \alpha f(x^*)$$
and therefore $f(x^*) = 0$ since the assumption $f(x^*) \neq 0$ leads to the contradiction that $1 \leq \alpha$. If $\bar{x} \in [a,b]$ is a fixed point of $T$, then
$$|x^* - \bar{x}| = |T(x^*) - T(\bar{x})| \leq \alpha \, |x^* - \bar{x}|$$
and hence $\bar{x} = x^*$ since the assumption $\bar{x} \neq x^*$ leads to the contradiction that $1 \leq \alpha$. Finally, let $x \in [a,b]$. Then
$$|x - x^*| = |x - T(x^*)| = |x - T(x) + T(x) - T(x^*)| \leq |x - T(x)| + |T(x) - T(x^*)| \leq f(x) + \alpha \, |x - x^*|$$
and hence (2.5.27). Further from (2.5.27) and
$$|T^n(x) - x^*| = |T^n(x) - T^n(x^*)| \leq \alpha^n |x - x^*| \leq \frac{\alpha^n}{1 - \alpha} f(x) ,$$
it follows (2.5.28) since $\lim_{n \to \infty} \alpha^n = 0$.
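The error estimate (2.5.27) can serve as a stopping criterion for the iteration. The following is a minimal sketch under stated assumptions: the function name `fixed_point`, its parameters, and the choice $T = \cos$ on $[0,1]$ are illustrations and not part of the text. On $[0,1]$ the cosine maps the interval into itself and satisfies $|\cos(x) - \cos(y)| \leq \sin(1) |x - y|$, so $\alpha = \sin(1)$ is an admissible contraction constant.

```python
import math

def fixed_point(T, x, alpha, tol=1e-10, max_iter=1000):
    """Iterate x -> T(x) until the bound |x - T(x)| / (1 - alpha),
    compare (2.5.27), drops below tol. Returns the iterate and the bound."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    for _ in range(max_iter):
        bound = abs(x - T(x)) / (1.0 - alpha)
        if bound < tol:
            return x, bound
        x = T(x)
    raise RuntimeError("no convergence within max_iter iterations")

# Illustration (an assumption, not from the text): T = cos on [0, 1].
x_star, bound = fixed_point(math.cos, 1.0, alpha=math.sin(1.0))
print(x_star, bound)
```

The returned bound is an a posteriori estimate: it uses only the current iterate and the contraction constant, not the unknown fixed point.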
The following example applies the previous lemma to the Babylonian method of approximating roots of real numbers. It uses more widely applicable methods in the proof of invariance of the domain of the function $T$ and in the proof that $T$ is a contraction.
Example 2.5.47. (Babylonian method of approximating roots of real numbers, II) Let $a > 0$ and $N \in \mathbb{N}$ be such that $N^2 > a$. Finally, define $T : [\sqrt{a}, N] \to \mathbb{R}$ by
$$T(x) := \frac{1}{2} \left( x + \frac{a}{x} \right)$$
for all $x \in [\sqrt{a}, N]$. Then
$$\lim_{n \to \infty} T^n(N) = \sqrt{a} . \tag{2.5.29}$$
Proof. First, we note that
$$T(\sqrt{a}) = \sqrt{a} , \quad T(N) = \frac{1}{2} \left( N + \frac{a}{N} \right) < N$$
and hence that $\sqrt{a}$ is a fixed point of $T$. Further, $T$ is twice differentiable on $(\sqrt{a}, N)$ with derivatives
$$T'(x) = \frac{1}{2} \left( 1 - \frac{a}{x^2} \right) = \frac{1}{2 x^2} (x^2 - a) > 0 , \quad T''(x) = \frac{a}{x^3} > 0$$
for all $x \in (\sqrt{a}, N)$. Hence $T$, $T'$ are strictly increasing according to Theorem 2.3.44, $T([\sqrt{a}, N]) \subset [\sqrt{a}, N]$ and
$$0 \leq T'(x) \leq \frac{1}{2} \left( 1 - \frac{a}{N^2} \right) < \frac{1}{2} .$$
In particular, it follows by Theorem 2.5.6 that
$$|T(x) - T(y)| \leq \frac{1}{2} |x - y|$$
for all $x, y \in [\sqrt{a}, N]$. By Lemma 2.5.46, it follows that $T$ has a unique fixed point, which hence is given by $\sqrt{a}$, and in particular (2.5.29).

Fig. 57: Graph of $(\mathbb{R} \to \mathbb{R}, x \mapsto x^3 - 2x - 5)$.
For instance for $N = 2$ and $q = 1$, we get in this way the first five approximating fractions
$$\frac{3}{2} , \quad \frac{17}{12} , \quad \frac{577}{408} , \quad \frac{665857}{470832} , \quad \frac{886731088897}{627013566048}$$
with corresponding errors (according to (2.5.27)) equal or smaller than
$$\frac{1}{6} , \quad \frac{1}{204} , \quad \frac{1}{235416} , \quad \frac{1}{313506783024} , \quad \frac{1}{555992422174934068969056} .$$
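The error bounds listed above can be recomputed from (2.5.27) with the contraction constant $\alpha = 1/2$ of Example 2.5.47. The following is a minimal sketch in exact rational arithmetic, assuming $a = 2$ and the starting value $2$ (whose first iterate is $3/2$); the variable names are illustrative.

```python
from fractions import Fraction

def T(x, a=2):
    """Babylonian map x -> (x + a/x)/2 in exact rational arithmetic."""
    return (x + Fraction(a) / x) / 2

alpha = Fraction(1, 2)  # contraction constant from Example 2.5.47
x = Fraction(2)         # assumed starting value; T(2) = 3/2
bounds = []
for _ in range(5):
    x = T(x)
    # A posteriori bound |x - T(x)| / (1 - alpha) for the current iterate.
    bounds.append(abs(x - T(x)) / (1 - alpha))
print(bounds)
```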
In 1669, Newton submitted a paper with title ‘De analysi per aequationes numero terminorum infinitas’ to the Royal Society. This paper was published only much later in 1712 [82]. Among others, Newton introduces by example an iterative method for the approximation of zeros of differentiable functions which is now named after him. For this, he considers the equation
$$x^3 - 2x - 5 = 0 . \tag{2.5.30}$$
As a first approximation to the solution in the interval $[2,3]$, compare Fig. 57, he uses $x = 2$. Substitution of $x = 2 + p$ into (2.5.30) gives
$$0 = x^3 - 2x - 5 = (2 + p)^3 - 2(2 + p) - 5 = 8 + 12p + 6p^2 + p^3 - 4 - 2p - 5 = -1 + 10p + 6p^2 + p^3 .$$
Neglecting higher order terms in $p$ than first order, i.e., effectively replacing the last polynomial in $p$ by its linearization around $p = 0$, he arrives at the equation
$$-1 + 10p = 0$$
and hence at $p = 1/10$. In this way, he arrives at $x = 2.1$ as a second approximation to the solution. He then substitutes $x = 2.1 + q$ into (2.5.30) to obtain
$$0 = x^3 - 2x - 5 = (2.1 + q)^3 - 2(2.1 + q) - 5 = 9.261 + 13.23q + 6.3q^2 + q^3 - 4.2 - 2q - 5 = 0.061 + 11.23q + 6.3q^2 + q^3 .$$
Again, neglecting higher order terms in $q$ than first order, i.e., effectively replacing the last polynomial in $q$ by its linearization around $q = 0$, he arrives at the equation
$$0.061 + 11.23q = 0$$
and hence at $q = -0.0054$ where only the first leading digits are retained. In this way, he arrives at the rounded result $x = 2.0946$ as a third approximation to the solution which approximates that solution within an error of $5 \cdot 10^{-5}$.
It has to be taken into account that Newton’s paper does not contain references to his fluxions or fluents. On the other hand, in spirit, his procedure matches today’s version of the method. The only difference is that today’s method does not involve substitutions. It proceeds as follows. We define $f := (\mathbb{R} \to \mathbb{R}, x \mapsto x^3 - 2x - 5)$. Starting from the first approximation $x_0 = 2$ of its zero, we calculate the linearization $p_{10}$ of $f$ around $x_0$. Since
$$f'(x) = 3x^2 - 2$$
for all $x \in \mathbb{R}$, we arrive at
$$p_{10}(x) = f(x_0) + f'(x_0)(x - x_0) = -1 + 10(x - 2) = -21 + 10x$$
for all $x \in \mathbb{R}$. Effectively replacing the function $f$ by its linearization $p_{10}$, we arrive at the equation
$$-21 + 10x = 0$$
and hence, as Newton, at the first approximation $x_1 = 2.1$. In the second step, we calculate the linearization $p_{11}$ of $f$ around $x_1$. It is given by
$$p_{11}(x) = f(x_1) + f'(x_1)(x - x_1) = 0.061 + 11.23(x - 2.1) = -23.522 + 11.23x$$
for all $x \in \mathbb{R}$. Again, effectively replacing the function $f$ by its linearization $p_{11}$, we arrive at the equation
$$-23.522 + 11.23x = 0$$
and hence, as Newton, at the second approximation $x_2 = 2.0946$ by repeating Newton’s way of rounding the result.
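The two steps carried out above can be sketched in a few lines. This is only an illustration in floating point arithmetic; the helper name `newton_step` is not from the text, and no rounding in Newton’s manner is applied between the steps.

```python
def newton_step(f, df, x):
    """Return the zero of the linearization of f around x."""
    return x - f(x) / df(x)

def f(x):
    return x**3 - 2.0 * x - 5.0

def df(x):
    return 3.0 * x**2 - 2.0

x0 = 2.0
x1 = newton_step(f, df, x0)  # zero of -21 + 10 x
x2 = newton_step(f, df, x1)  # zero of -23.522 + 11.23 x, up to rounding
print(x1, x2)
```

Each call solves the linear equation obtained by replacing $f$ with its linearization, which is exactly the step performed by hand in the text.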
From today’s perspective, Newton’s method can be viewed as a particular application of the contraction mapping lemma. This is also used below to prove the convergence of the method and to provide an error estimate. The method is iterative and used to approximate solutions of the equation $f(x) = 0$ where $f : I \to \mathbb{R}$ is a differentiable function on a non-trivial open interval $I$ of $\mathbb{R}$. Starting from an approximation $x_n \in I$ to such a solution, the correction $x_{n+1}$ is given by the zero of the linearization around $x_n$,
$$f(x_n) + f'(x_n)(x - x_n) ,$$
$x \in \mathbb{R}$, and hence by
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{2.5.31}$$
assuming $f'(x_n) \neq 0$, thereby essentially replacing the function $f$ by its linearization around $x_n$.
It is instructive to analyze the recursion (2.5.31) in a little more detail where we assume that $f'$ is in addition continuous. For this, let’s assume that $f(x_n) > 0$. If $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be located to the left (= towards smaller values) of $x_n$ and, indeed, in this case, $x_{n+1}$ is to the left of $x_n$. If $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect the solution to be located to the right (= towards larger values) of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. If $f(x_n) < 0$ and $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be to the right of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. Finally, if $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect that the solution is to the left of $x_n$ and also $x_{n+1}$ is to the left of $x_n$. Hence the recursion (2.5.31) shows a very intuitive behavior. On the other hand, for this reasoning to make sense, the solution should be very near to $x_n$. In particular in cases that $x_n$ is near to a critical point of $f$, the method usually fails because it leads to corrections of a much too large size.
Finally, since the graph of the linearization of $f$ around $x_n$ gives the tangent to the graph of $f$ in the point $(x_n, f(x_n))$, $x_{n+1}$ gives the abscissa of the intersection of that tangent with the x-axis. This fact gives a geometric interpretation to Newton’s method.

Fig. 58: Graph of $f$ from Example 2.5.48 ($a = 2$) and Newton steps starting from $x_0 = 4$.
The following example shows that the Babylonian method of approximating roots of real numbers can be seen as a particular case of Newton’s method.
Example 2.5.48. Let $a > 0$. Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := x^2 - a$$
for all $x \in \mathbb{R}$. Then
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - a}{2 x_n} = \frac{1}{2} \left( x_n + \frac{a}{x_n} \right)$$
for $x_n \neq 0$ which is the iteration used in Theorem 2.5.45.
Theorem 2.5.49. (Newton’s method) Let $f$ be a twice differentiable real-valued function on a non-trivial open interval $I$ of $\mathbb{R}$. Further, let $I$ contain a zero $x_0$ of $f$ and be such that $f'(x) \neq 0$ for all $x \in I$ and in particular such that
$$\left| \frac{f(x) f''(x)}{f'^2(x)} \right| \leq \alpha$$
for all $x \in I$ and some $\alpha \in \mathbb{R}$ satisfying $0 \leq \alpha < 1$. Then
$$\lim_{n \to \infty} T^n(x) = x_0 \tag{2.5.32}$$
for all $x \in I$ where
$$T(x) := x - \frac{f(x)}{f'(x)} .$$
Finally,
$$|x - x_0| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.33}$$
for all $x \in I$.
Proof. First, it follows that $T$ is differentiable with derivative
$$T'(x) = \frac{f(x) f''(x)}{f'^2(x)}$$
for all $x \in I$ and that $x_0$ is a fixed point of $T$. By Theorem 2.5.6 it follows that
$$\left| \frac{T(x) - T(x_0)}{x - x_0} \right| = \left| \frac{T(x) - x_0}{x - x_0} \right| < 1$$
for all $x \in I$ different from $x_0$ and hence that
$$|T(x) - x_0| \leq |x - x_0| \tag{2.5.34}$$
for all $x \in I$. Now let $[a,b]$, where $a, b \in \mathbb{R}$ such that $a < b$, be some closed subinterval of $I$ containing $x_0$. Then it follows by (2.5.34) that $T([a,b]) \subset [a,b]$ and by Theorem 2.5.6 that
$$\left| \frac{T(x) - T(y)}{x - y} \right| \leq \alpha$$
for all $x, y \in [a,b]$ satisfying $x \neq y$ and hence that
$$|T(x) - T(y)| \leq \alpha |x - y|$$
for all $x, y \in [a,b]$. Hence by Lemma 2.5.46, the relations (2.5.32) and (2.5.33) follow for all $x \in [a,b]$.

Fig. 59: Zero of $f$ from Example 2.5.50 given by the x-coordinate of the intersection of two graphs.
The following example gives an application of Newton’s method to a standard problem from quantum theory.
Example 2.5.50. Find an approximation $x_1$ to the solution of
$$x_0 = \cos(x_0)$$
such that $|x_0 - x_1| < 10^{-6}$. Solution: Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := x - \cos(x)$$
for all $x \in \mathbb{R}$. Then $f$ is infinitely often differentiable with
$$f'(x) = 1 + \sin(x) , \quad f''(x) = \cos(x) , \quad \frac{f(x) f''(x)}{f'^2(x)} = \frac{\cos(x) \, (x - \cos(x))}{(1 + \sin(x))^2}$$
where only in the last identity it has to be assumed that $x$ is different from $-\pi/2 + 2k\pi$ for all $k \in \mathbb{Z}$. Further
$$f\left(\frac{\pi}{6}\right) = \frac{\pi}{6} - \frac{\sqrt{3}}{2} < 0 , \quad f\left(\frac{\pi}{4}\right) = \frac{\pi}{4} - \frac{1}{\sqrt{2}} > 0 ,$$
and hence according to Theorem 2.3.37, $f$ has a zero in the open interval $I := (\pi/6, \pi/4)$. Also
$$f'(x) = 1 + \sin(x) > 1 + \sin(\pi/6) = 3/2 > 0$$
for all $x \in I$. Further,
$$\left( \frac{f f''}{f'^2} \right)'(x) = \frac{3 \cos(x) + x \sin(x) - 2x}{(1 + \sin(x))^2}$$
and
$$3 \cos(x) + x \sin(x) - 2x \geq 3 \cos\left(\frac{\pi}{4}\right) + \frac{\pi}{6} \sin\left(\frac{\pi}{6}\right) - 2 \cdot \frac{\pi}{4} = \frac{3}{\sqrt{2}} - \frac{5\pi}{12} > 0$$
and hence $f f''/f'^2$ is strictly increasing on $[\pi/6, \pi/4]$ as a consequence of Theorem 2.5.10. Therefore,
$$-\frac{1}{27} \left( 9 - \sqrt{3}\, \pi \right) = \frac{\cos(\pi/6) \left( \pi/6 - \cos(\pi/6) \right)}{(1 + \sin(\pi/6))^2} \leq \frac{f(x) f''(x)}{f'^2(x)} \leq \frac{\cos(\pi/4) \left( \pi/4 - \cos(\pi/4) \right)}{(1 + \sin(\pi/4))^2} = \frac{\pi - 2\sqrt{2}}{8 + 6\sqrt{2}}$$
and
$$\left| \frac{f(x) f''(x)}{f'^2(x)} \right| \leq \frac{1}{27} \left( 9 - \sqrt{3}\, \pi \right) < \alpha := \frac{1}{3} < 1$$
for all $x \in I$. Starting the iteration from $0.7$ gives to six decimal places
$$0.739436 , \quad 0.739085$$
with the corresponding errors
$$0.000527006 , \quad 4.08749 \cdot 10^{-8} .$$
Hence the zero $x_0$ of $f$ in the interval $I$ agrees with
$$x_1 = 0.739085$$
to six decimal places. That there is no further zero of $f$ can be concluded as follows. Since the derivative of $f$ does not vanish in the interval $(-\pi/2, \pi/2)$, it follows by Theorem 2.5.4 that there are no other zeros in this interval. Further, for $|x| \geq \pi/2 \, (> 1)$ there is no zero of $f$ because $|\cos(x)| \leq 1$ for all $x \in \mathbb{R}$. The quantity
$$U_2 + (U_1 - U_2) \, x_0^2$$
is the ground state energy of a particle in a finite square well potential with $U_3 = U_1$, $\gamma = 0$, $KL = 2$. See [79].
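The iterates and error bounds quoted in Example 2.5.50 can be reproduced numerically. The following is a minimal sketch, assuming the contraction constant $\alpha = 1/3$ and the starting value $0.7$ from the example; the error bounds are computed from (2.5.33), and the variable names are illustrative.

```python
import math

def T(x):
    """Newton map for f(x) = x - cos(x): T(x) = x - f(x) / f'(x)."""
    return x - (x - math.cos(x)) / (1.0 + math.sin(x))

alpha = 1.0 / 3.0  # assumed bound for |f f'' / f'^2| on I
x = 0.7
iterates, bounds = [], []
for _ in range(2):
    x = T(x)
    iterates.append(x)
    # A posteriori bound |x - T(x)| / (1 - alpha), compare (2.5.33).
    bounds.append(abs(x - T(x)) / (1.0 - alpha))
print(iterates, bounds)
```

The second bound is already far below the required accuracy of $10^{-6}$, which is why two Newton steps suffice here.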
Problems
1) Give the maximum and minimum values of f and the points where
they are assumed.
a)
b)
c)
d)
e)
f)
g)
f pxq : x2 5x 7 , x P r5, 0s ,
f ptq : t3 6t2 9t 14 , t P r5, 0s ,
f psq : s4 p8{3qs3 6s2 1 , s P r5, 5s ,
f ptq : 4pt 3q2 pt2 1q , t P r1, 4s ,
f pxq : p9x 12q{p3x2 4q , x P r1, 0s ,
f pxq : px2 x 1q exppxq , x P r0.3, 1.5s ,
?
f pxq : exppx{ 3 q cospxq , x P r0, 8q .
2) Consider a projectile that is shot into the atmosphere. If $v \geq 0$ is the component of its speed at initial time $0$ in the vertical direction, its height $z(t)$ above ground at time $t \geq 0$ is given by $z(t) = vt - gt^2/2$ where $g = 9.81\,\mathrm{m/s^2}$ is the acceleration due to gravity and it is assumed that $z(0) = 0$. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground.
3) Reconsider the situation from the previous problem, but now with inclusion of a viscous frictional force opposing the motion of the projectile. Then $z(t) = \alpha \, [(v + \alpha g)(1 - \exp(-t/\alpha)) - gt]$ where it is again assumed that $z(0) = 0$. Here $\alpha = m/\lambda$ where $m > 0$ is the mass of the projectile and $\lambda > 0$ is a parameter describing the strength of the friction. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground.
4) Let $a > 0$ and $b > 0$. Find an equation for the straight line through the point $(a, b)$ that cuts from the first quadrant a triangle of minimum area. State that area.
5) Let $a > 0$ and $b > 0$. Find an equation for the straight line through the point $(a, b)$ whose intersection with the first quadrant is shortest. State the length of that intersection.
6) Find the maximal volume of a cylinder of given surface area $A > 0$.
7) From each corner of a rectangular cardboard of side lengths $a > 0$ and $b > 0$, a square of side length $x \geq 0$ is removed, and the edges are turned up to form an open box. Find the value of $x$ for which the volume of that box is maximal.
8) A rectangular movie screen on a wall is $h_1$ meters above the floor and $h_2$ meters high. Imagine yourself sitting in front of the screen and looking into the direction of its center. Measured in this direction, what distance $x$ from the wall will give you the largest viewing angle $\theta$ of the movie screen? [This is the angle between the straight lines that connect your eyes to the lowest and the highest points on the screen.] Assume that the height of your eyes above the floor is $h_s$ meters where $h_s < h_1$.
9) Imagine that the upper half-plane $H_+ := \mathbb{R} \times (0, \infty)$ and the lower half-plane $H_- := \mathbb{R} \times (-\infty, 0)$ of $\mathbb{R}^2$ are filled with different ‘physical media’ with the x-axis being the interface $I$. Further, let $(x_1, y_1) \in H_+$, $(x_2, y_2) \in H_-$. Light rays in both media proceed along straight lines and at constant speeds $v_1$ and $v_2$, respectively. According to Fermat’s principle, a ray connecting $(x_1, y_1)$ and $(x_2, y_2)$ chooses the path that takes the least time. Show that that path satisfies Snell’s law, i.e., $\sin(\theta_1)/\sin(\theta_2) = v_1/v_2$ where $\theta_1$ ($\theta_2$) is the angle of the part of the ray in $H_+ \cup I$ ($H_- \cup I$) with the normal to the x-axis originating from its intersection with $I$.
10) For the following functions find the intervals of increase and decrease, the local maximum and minimum values and their locations
and the intervals of convexity and concavity and the inflection points.
Use the gathered information to sketch the graph of the function. If
available, check your result with a graphing device.
a) f psq : 7s4 3s2 1 , s P R ,
b) f ptq : t4 p8{3qt3 6t2 3 , t P R ,
c) f pxq : 4px 3q2 px2 1q , x P R .
11) For the following functions find vertical and horizontal asymptotes,
the intervals of increase and decrease, the local maximum and minimum values and their locations and the intervals of convexity and
concavity and the inflection points. Use the gathered information to
sketch the graph of the function. If available, check your result with
a graphing device.
a) f pxq : x{p1 x2 q , x P R ,
?
b) f pxq : x2 1 x , x P R ,
? ?
c) f pxq : p9x 12q{p3x2 4q , x P R zt2{ 3, 2{ 3u .
12) Calculate the linearization of f around the given point.
a)
b)
c)
d)
e)
f)
g)
f pxq : p1 xqn , x ¡ 1 , around x 0 where n P R ,
f pxq : lnpxq , x ¡ 0 , around x 1 ,
f pϕq : sinpϕq , ϕ P R , around ϕ 0 ,
f pϕq : tanpϕq , ϕ P pπ {2, π {2q , around ϕ 0 ,
f pxq : sinhpxq : pex ex q{2 , x P R , around x 0 ,
f pϕq : lnrp5{4q cosp3ϕqs , ϕ P R , around ϕ0 3π {4 ,
f pxq : p3x2 x 5q{p5x2 6x 3q , x P R ztx P R :
5x2 6x 3 0u , around x 1 .
13) Show that
a) $(1 + x)^n > 1 + nx$ for all $x > 0$ and $n > 1$ ,
b) $(1 + x)^n < 1 + nx$ for all $x > 0$ and $0 < n < 1$ ,
c) $\ln x \leq x - 1$ for all $x > 0$ ,
d) $\sin(\varphi) < \varphi$ for all $\varphi > 0$ ,
e) $\tan(\varphi) > \varphi$ for all $\varphi \in (0, \pi/2)$ ,
f) $\sinh(x) := (e^x - e^{-x})/2 > x$ for all $x > 0$ ,
g) $\ln x \geq (x - 1)/x$ for all $x > 0$ .
14) Calculate
a x
a x
, b) lim 1 ,
xÑ8
xÑ8
x
x
x tanpxq
tanpxq
, d) lim
,
c) lim
xÑ0 1 cospxq
xÑ0
x
sinpxq
1
1
e) lim ?
, f) lim
,
xÑ0
xÑ0 x
x
sinpxq
lnpxq
r lnpxq sn 2 ,
g) lim
, h) lim
xÑ8
xÑ8
x
x
a) lim
1
lnpxq
Ñ1 tanpπxq
i) lim
x
l)
n)
lim r sinpxq s
Ñ0
x
, j)
lim xx , k)
Ñ0
lim xa{ lnpxq ,
Ñ0
p q , m) lim x1{x ,
xÑ8
x
x
tan x
lim xsinpxq , o) lim r cosp1{xq sx ,
Ñ0
x
Ñ8
x
cosp3xq cosp2xq
p) lim
, q)
xÑ0
x2
where n P N, a P R.
1
cospπxq
Ñ1 x2 2x 1
lim
x
15) Explain why Newton’s method fails to find the zero(s) of f in the
following cases.
a) f pxq : x2 x6 , x P R , with initial approximation x 1{2 ,
b) f pxq : x1{3 , x P R .
16) A circular arch of length $L > 0$ and height $h > 0$ is to be constructed where $L/h > \pi$.
a) Show that $x := L/(2r)$, where $r > 0$ is the radius of the corresponding circle, satisfies the transcendental equation
$$\cos(x) = 1 - \frac{2h}{L} \, x .$$
b) Assume that $L/h = 7$. By Newton’s method, find an approximation $x_0$ to $x$ such that $|x_0 - x| < 10^{-6}$.
17) The characteristic frequencies of the transverse oscillations of a string
of length L ¡ 0 with fixed left end and right end subject to the boundary condition v 1 pLq hv pLq 0, where v : r0, Ls Ñ R is the amplitude of deflection of the string and h P R, is given by ω x{L
where
x
tan x (2.5.35)
hL
[20]. Assume hL 1{3, and find by Newton’s method an approximation x0 to the smallest solution x ¡ 0 of (2.5.35) such that
|x0 x| 106 .
18) The characteristic frequencies of the transverse vibrations of a homogeneous beam of length $L > 0$ with fixed ends are given by $\omega = [EJ/(\rho S)]^{1/2} (x/L)^2$ where
$$\cosh(x) \cos(x) = 1 , \tag{2.5.36}$$
$E$ is Young’s modulus, $J$ is the moment of inertia of a transverse section, $S$ is the area of the section, $\rho$ is the density of the material of the beam, and $\cosh(y) := (e^y + e^{-y})/2$ for all $y \in \mathbb{R}$ [65]. By Newton’s method, find an approximation $x_0$ to the smallest solution $x > 0$ of (2.5.36) such that $|x_0 - x| < 10^{-6}$.
19) (Binomial theorem) Let $n \in \mathbb{N}^*$. Define $f : (-1, \infty) \to \mathbb{R}$ by
$$f(x) := \sum_{k=0}^{n} \binom{n}{k} x^k$$
for all $x \in (-1, \infty)$ where the so called ‘binomial coefficients’ are defined by
$$\binom{n}{0} := 1 , \quad \binom{n}{k} := \frac{1}{k!} \, n (n-1) \cdots (n - (k-1))$$
for every $k \in \mathbb{N}^*$.
a) Show that
$$(1 + x) f'(x) = n f(x)$$
for all $x \in (-1, \infty)$.
b) Conclude from part a) that
$$f(x) = (1 + x)^n$$
for all $x \in (-1, \infty)$.
c) Show the binomial theorem, i.e., that
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$$
for all $x, y \in \mathbb{R}$.

Fig. 60: The yellow area $A$ enclosed by the graph of $f := ([0,1] \to \mathbb{R}, x \mapsto 1 - x^2)$ and the coordinate axes is determined by Archimedes’ method.
2.6 Riemann Integration

An early example of integration is given by Archimedes’ quadrature of the segment of the parabola. For this, he presents two proofs. Here, we display his first proof because it anticipates the definition of the Riemann integral. The second proof will be given at the beginning of Section 3.3 on series of real numbers. We use his method to calculate the area $A$ of the parabolic segment
$$\{ (x, y) \in \mathbb{R}^2 : x \in [0,1] \wedge 0 \leq y \leq 1 - x^2 \}$$
that is contained in the rectangle $[0,1] \times [0,1]$, see Fig. 60. He approximates $A$ by what would be called upper and lower sums today, but the construction of those sums was geometrically motivated. We slightly alter that construction, but otherwise closely follow his method. For this, we divide the x-axis into intervals of equal lengths, for instance, into four intervals
$$[0, 1/4] , \quad [1/4, 2/4] , \quad [2/4, 3/4] , \quad [3/4, 4/4]$$
of equal lengths $1/4$. Then the sum $U_4$ of the areas of the two-dimensional intervals
$$[0, 1/4] \times [0, 1 - (0/4)^2] , \quad [1/4, 2/4] \times [0, 1 - (1/4)^2] ,$$
$$[2/4, 3/4] \times [0, 1 - (2/4)^2] , \quad [3/4, 4/4] \times [0, 1 - (3/4)^2]$$
given by
$$U_4 = \frac{1}{4} \sum_{k=0}^{3} \left( 1 - \frac{k^2}{4^2} \right)$$
exceeds $A$, and the sum $L_4$ of the areas of the two-dimensional intervals
$$[0, 1/4] \times [0, 1 - (1/4)^2] , \quad [1/4, 2/4] \times [0, 1 - (2/4)^2] ,$$
$$[2/4, 3/4] \times [0, 1 - (3/4)^2] , \quad [3/4, 4/4] \times [0, 1 - (4/4)^2]$$
given by
$$L_4 = \frac{1}{4} \sum_{k=1}^{4} \left( 1 - \frac{k^2}{4^2} \right)$$
is smaller than $A$,
$$L_4 \leq A \leq U_4 .$$

Fig. 61: The yellow area gives the upper bound $U_4$ for $A$, compare text.

Fig. 62: The yellow area gives the lower bound $L_4$ for $A$, compare text.

In the same way, by division of the x-domain into intervals of equal lengths $1/n$, where $n \in \mathbb{N}^*$, we arrive at
$$U_n = \frac{1}{n} \sum_{k=0}^{n-1} \left( 1 - \frac{k^2}{n^2} \right) , \quad L_n = \frac{1}{n} \sum_{k=1}^{n} \left( 1 - \frac{k^2}{n^2} \right)$$
and the inequalities
$$L_n \leq A \leq U_n .$$
Since
$$U_n - L_n = \frac{1}{n} \cdot \frac{n^2}{n^2} = \frac{1}{n} ,$$
we conclude that
$$L_n \leq A \leq L_n + \frac{1}{n} .$$
Ln
1
n
1
1
3
ņ
k2
n
n2
k1
1
1
n
1
1
1
2n
1 ņ 2
k
n3 k1
1 pn
1qp2n
6n2
1q
where it has been used that
ņ
k2
k 1
61 npn
1qp2n
1q .
The last formula was known to Archimedes. He proved it in his treatise on
spirals [37]. Of course, it is tempting (and correct) to take the limit n Ñ 8
to conclude that
A ¥ nlim
Ñ8 Ln
and hence that
2
, A ¤ nlim
Ñ8 Ln
3
1
n
2
nlim
Ñ8 Ln 3
A
2
.
(2.6.1)
3
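The upper and lower sums above can be evaluated exactly for any $n$. The following is a minimal sketch using Python's standard `fractions` module; the function names `upper_sum` and `lower_sum` are illustrative and not taken from the text.

```python
from fractions import Fraction

def upper_sum(n):
    """U_n = (1/n) * sum_{k=0}^{n-1} (1 - k^2/n^2), computed exactly."""
    return Fraction(1, n) * sum(1 - Fraction(k * k, n * n) for k in range(0, n))

def lower_sum(n):
    """L_n = (1/n) * sum_{k=1}^{n} (1 - k^2/n^2), computed exactly."""
    return Fraction(1, n) * sum(1 - Fraction(k * k, n * n) for k in range(1, n + 1))

for n in (4, 10, 100, 1000):
    L, U = lower_sum(n), upper_sum(n)
    # Both columns approach 2/3; their difference is exactly 1/n.
    print(n, float(L), float(U))
```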
Below, the Riemann integral of $f : [0,1] \to \mathbb{R}$ defined by $f(x) := 1 - x^2$ for every $x \in [0,1]$, will be defined essentially as the common limit of the sequences $L_1, L_2, \dots$ and $U_1, U_2, \dots$, which give the area enclosed by the graph of $f$ and the coordinate axes, and denoted by
$$\int_0^1 f(x) \, dx$$
where Leibniz’s sign $\int$ is a stylized S and is intended to remind of the summation involved in the definition of the integral. Hence the previous reasoning shows that
$$\int_0^1 f(x) \, dx = \frac{2}{3} .$$
Note that (2.6.1) presupposes an intuitive geometric notion of the area A.
Today, the limits would be used for the definition of A. As derivatives
of functions are used to define tangents at curves, integrals of functions
are used to define areas (or volumes in Calculus III). Also, note that the
whole calculation, including the limit value, uses only rational numbers and
therefore does not pose a problem to ancient Greek mathematics. In other
cases where the quadrature failed, like the quadrature of the circle, that
area was not describable by a rational number. Finally, instead of (2.6.1),
Archimedes showed an equivalent result that expressed A in terms of a rational multiple of the area of a triangle inscribed into the parabolic segment.
For the last result, we refer to the beginning of Section 3.3 in Calculus II
on series of real numbers.
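We return to the question of showing that $A = 2/3$. Since there was no limit concept at the time, this proof had to be performed by a so called ‘double reductio ad absurdum’, i.e., by leading both assumptions that $A < 2/3$ and that $A > 2/3$ to a contradiction which leaves only the option that $A = 2/3$. Since
$$\frac{2}{3} - \frac{1}{n} \leq \frac{2}{3} - \frac{3n + 1}{6 n^2} = L_n \leq A \leq U_n = \frac{2}{3} + \frac{3n - 1}{6 n^2} \leq \frac{2}{3} + \frac{1}{n} ,$$
this can be done as follows. For this, we assume that $A = (2/3) + \varepsilon$ for some $\varepsilon > 0$. Then, it follows for $n > 1/\varepsilon$ the contradiction that
$$\frac{2}{3} + \varepsilon = A \leq \frac{2}{3} + \frac{1}{n} < \frac{2}{3} + \varepsilon .$$
On the other hand, if $A = (2/3) - \varepsilon$ for some $\varepsilon > 0$, it follows for $n > 1/\varepsilon$ the contradiction that
$$\frac{2}{3} - \varepsilon = A \geq \frac{2}{3} - \frac{1}{n} > \frac{2}{3} - \varepsilon .$$
Hence the only remaining possibility is that $A = 2/3$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis.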
A generalization of Archimedes’ result to natural powers of $x$ was made only in the 17th century by Descartes and Fermat, whose results remained unpublished, and in 1647 by Bonaventura Cavalieri [24]. The next decisive step was the discovery of the fundamental theorem of calculus independently by Newton [83] and Leibniz [68], see Theorems 2.6.19, 2.6.21, i.e., the realization that differentiation and integration are inverse processes.

Fig. 63: $S_6(1.2)$ is given by the yellow area under $G(v)$ (axes: $t$ [sec], $v(t)$ [m/sec]).
For motivation of that theorem, we go back to the start of Section 2.4 to the discussion of Galileo’s results on bodies in free fall near the surface of the earth. Starting from the fallen distance $s(t)$ at time $t$,
$$s(t) = \frac{1}{2} \, g t^2 \tag{2.6.2}$$
for all $t \geq 0$, we determined the instantaneous speed $v(t)$ of the body at time $t$ as the derivative
$$v(t) = s'(t) = gt$$
where $g = 9.81\,\mathrm{m/sec^2}$ is the acceleration of the earth’s gravitational field. We now investigate the reverse question, how to calculate $s(t)$ from the instantaneous speeds between times $0$ and $t$. There are two main approaches to this problem.
The first uses that $v(t) = s'(t)$ for every $t > 0$ and concludes that $s$ is the ‘anti-derivative’ of $v$ such that $s(0) = 0$ and hence (by application of Theorem 2.5.7) is given by (2.6.2).
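A second approach leading to integration uses the following relation between $s$ and $v$. For every $t > 0$ and $n \in \mathbb{N}^*$, it follows that
$$s(t) - s(0) = \sum_{k=0}^{n-1} \left[ s\left( \frac{k+1}{n} \, t \right) - s\left( \frac{k}{n} \, t \right) \right] = \sum_{k=0}^{n-1} \frac{s\left( \frac{k+1}{n} \, t \right) - s\left( \frac{k}{n} \, t \right)}{\frac{k+1}{n} \, t - \frac{k}{n} \, t} \cdot \frac{t}{n} .$$
For $k \in \{0, \dots, n-1\}$,
$$\frac{s\left( \frac{k+1}{n} \, t \right) - s\left( \frac{k}{n} \, t \right)}{\frac{k+1}{n} \, t - \frac{k}{n} \, t}$$
is the average speed in the time interval
$$\left[ \frac{k}{n} \, t , \frac{k+1}{n} \, t \right] .$$
In this case, it is given by
$$\frac{s\left( \frac{k+1}{n} \, t \right) - s\left( \frac{k}{n} \, t \right)}{\frac{k+1}{n} \, t - \frac{k}{n} \, t} = \frac{n}{t} \cdot \frac{g t^2}{2} \left[ \left( \frac{k+1}{n} \right)^2 - \left( \frac{k}{n} \right)^2 \right] = g t \left( \frac{k}{n} + \frac{1}{2n} \right) = v\left( \frac{k}{n} \, t \right) + \frac{g t}{2n}$$
in terms of the instantaneous speed $v$ at the beginning of the time interval. Hence, we conclude that
$$s(t) - s(0) = \sum_{k=0}^{n-1} v\left( \frac{k}{n} \, t \right) \frac{t}{n} + \frac{g t^2}{2n} = S_n(t) + \frac{g t^2}{2n}$$
where
$$S_n(t) := \sum_{k=0}^{n-1} v\left( \frac{k}{n} \, t \right) \frac{t}{n} .$$
This leads to
$$s(t) - s(0) = \lim_{n \to \infty} \sum_{k=0}^{n-1} v\left( \frac{k}{n} \, t \right) \frac{t}{n} .$$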
Note that the sum $S_n(t)$ has the geometrical interpretation of an area under $G(v)$, see Fig. 63. Below the limit
$$\lim_{n \to \infty} \sum_{k=0}^{n-1} v\left( \frac{k}{n} \, t \right) \frac{t}{n}$$
will coincide with the integral of the function $v$ over the interval $[0, t]$ which is denoted by
$$\int_0^t v(\tau) \, d\tau .$$
Hence
$$s(t) - s(0) = \int_0^t v(\tau) \, d\tau$$
gives the relation between instantaneous speed and the distance traveled between times $0$ and $t$. It is satisfied for the motion in one dimension in general. The last relation gives the connection between the integral of $v$ over the interval $[0, t]$, $t > 0$, and its anti-derivative $s$. It constitutes a special case of the fundamental theorem of calculus and is valid for a wide class of functions $v$. From the knowledge of an anti-derivative $s$ of $v$, i.e., some function $s$ such that $s'(\tau) = v(\tau)$ for all $\tau \in [0, t]$, this relation allows the calculation of the integral of $v$ over the interval $[0, t]$.
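The relation between $s(t)$ and the sums $S_n(t)$ can be checked numerically for the free fall. The following is a minimal sketch; the constant name `G` and the function names `v`, `s` and `left_sum` are illustrative, and the value $t = 1.2$ is taken from Fig. 63.

```python
G = 9.81  # acceleration of the earth's gravitational field in m/sec^2

def v(t):
    """Instantaneous speed v(t) = g t."""
    return G * t

def s(t):
    """Fallen distance s(t) = g t^2 / 2."""
    return 0.5 * G * t * t

def left_sum(t, n):
    """S_n(t): sum of v(k t / n) * (t / n) for k = 0, ..., n - 1."""
    h = t / n
    return sum(v(k * h) * h for k in range(n))

t = 1.2
for n in (6, 60, 600):
    gap = s(t) - left_sum(t, n)
    # The gap equals g t^2 / (2 n) up to rounding and shrinks like 1/n.
    print(n, left_sum(t, n), gap, G * t * t / (2 * n))
```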
As a consequence of the discovery of the fundamental theorem of calculus,
during the 18th century, the integral was generally regarded as the inverse
of the derivative, i.e., the statement of the fundamental theorem of calculus
was used to define the integral. Only in cases where an anti-derivative could
not be found, definitions of the integral as a limit of some sort of sums or
an area under a curve were used to derive approximations. In particular, the
notion of area was still considered intuitive such that no precise definition
was needed.
At the beginning of the 19th century, the work of Fourier made it necessary
to define integrals also of discontinuous functions. Cauchy was the first to
give a definition for continuous functions. Still, it contained an unnatural
element in a preference of function values assumed at left ends of intervals
used to subdivide the domain of such a function. The first fully satisfactory
definition, applicable to a large class of discontinuous functions, was given
by Bernhard Riemann in 1854 in his habilitation thesis [87]. The equivalent
definition used in this text is due to Jean-Gaston Darboux.
After this introduction, we start with natural definitions of the length of
intervals, partitions of intervals and corresponding lower and upper sums
of bounded functions. Such sums already appeared in the previous calculation of the area of the parabolic segment and in the motivation of the
fundamental theorem of calculus. They corresponded to partitions of intervals into subintervals of equal length. In the limit of vanishing length,
we arrived at the area A as well as at integrals of v. Below, the size of a
partition generalizes that length. On the other hand, we will allow for much
more general partitions of intervals in the definition of the integral. As a consequence, those partitions cannot be characterized by a single parameter, and
hence a definition of the integral in the form of a simple limit is not possible.
Such a limit is replaced by the supremum of lower sums and the infimum of
upper sums, which are required to coincide for integrable functions.
Definition 2.6.1.
(i) Let $a, b \in \mathbb{R}$ be such that $a \le b$. We define the lengths of the corresponding intervals $(a, b)$, $(a, b]$, $[a, b)$, $[a, b]$ by
\[
l((a, b)) = l((a, b]) = l([a, b)) = l([a, b]) := b - a .
\]
A partition $P$ of $[a, b]$ is an ordered sequence $(a_0, \dots, a_\nu)$ of elements
of $[a, b]$ such that
\[
a = a_0 \le a_1 \le \dots \le a_\nu = b
\]
where $\nu$ is an element of $\mathbb{N}^*$. Since $(a, b)$ is such a partition of $[a, b]$,
the set of all partitions of that interval is non-empty. A partition $P'$
of $[a, b]$ is called a refinement of $P$ if $P$ is a subsequence of $P'$.
(ii) A partition $P = (a_0, \dots, a_\nu)$ of a bounded closed interval $I$ of $\mathbb{R}$
induces a division of $I$ into, in general non-disjoint, subintervals
\[
I = \bigcup_{j=0}^{\nu-1} I_j , \quad I_j := [a_j, a_{j+1}] , \quad j = 0, \dots, \nu - 1 .
\]
The size of $P$ is defined as the maximum of the lengths of these
subintervals. In addition, we define for every bounded function $f$ on
$I$ the lower sum $L(f, P)$ and upper sum $U(f, P)$ corresponding to $P$
by:
\[
L(f, P) := \sum_{j=0}^{\nu-1} \inf\{f(x) : x \in I_j\} \, l(I_j) ,
\]
\[
U(f, P) := \sum_{j=0}^{\nu-1} \sup\{f(x) : x \in I_j\} \, l(I_j) .
\]
Note that if $K > 0$ is such that $|f(x)| \le K$ for all $x \in I$, it follows
that
\[
-K \le \inf\{f(x) : x \in J\} \le \sup\{f(x) : x \in J\} \le K
\]
for every non-empty subset $J$ of $I$ and hence that
\[
|L(f, P)| \le \sum_{j=0}^{\nu-1} |\inf\{f(x) : x \in I_j\}| \, l(I_j) \le K \sum_{j=0}^{\nu-1} l(I_j) = K \, l(I) ,
\]
\[
|U(f, P)| \le \sum_{j=0}^{\nu-1} |\sup\{f(x) : x \in I_j\}| \, l(I_j) \le K \sum_{j=0}^{\nu-1} l(I_j) = K \, l(I) .
\]
As a consequence, the sets
\[
\{L(f, P) : P \in \mathcal{P}\} , \quad \{U(f, P) : P \in \mathcal{P}\}
\]
are bounded where $\mathcal{P}$ denotes the set of all partitions of $I$.
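Example 2.6.2. Consider the interval $I := [0, 1]$ and the continuous function $f : I \to \mathbb{R}$ defined by $f(x) := x$ for all $x \in I$. Then
\[
P_0 := (0, 1) , \quad P_1 := (0, 1/2, 1)
\]
are partitions of $I$. The size of $P_0$ is $1$, whereas the size of $P_1$ is $1/2$. Also,
$P_1$ is a refinement of $P_0$. Finally,
\[
L(f, P_0) = 0 \cdot 1 = 0 , \quad U(f, P_0) = 1 \cdot 1 = 1 ,
\]
\[
L(f, P_1) = 0 \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} ,
\]
\[
U(f, P_1) = \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{3}{4}
\]
and hence
\[
L(f, P_0) \le L(f, P_1) \le U(f, P_1) \le U(f, P_0) .
\]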
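The sums of this example can be reproduced with exact rational arithmetic. The sketch below is only an illustration: it relies on the assumption that $f$ is monotonically increasing, so that the infimum and supremum on each subinterval are attained at its left and right end, respectively. The helper name `lower_upper_sums` and the additional partition `P2` are hypothetical.

```python
from fractions import Fraction


def lower_upper_sums(f, partition):
    """Lower and upper sums of a monotonically increasing function f.

    For increasing f, the infimum on [a_j, a_{j+1}] is f(a_j) and the
    supremum is f(a_{j+1}). This shortcut is NOT valid for general f.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper


def f(x):
    return x


P0 = [Fraction(0), Fraction(1)]
P1 = [Fraction(0), Fraction(1, 2), Fraction(1)]
# P2 is an additional refinement of P1, chosen for illustration.
P2 = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]

for P in (P0, P1, P2):
    lower, upper = lower_upper_sums(f, P)
    print(lower, upper)
```

Each refinement raises the lower sum and reduces the upper sum, while every lower sum stays below every upper sum.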
Intuitively, it is to be expected that a refinement of a partition of an interval
leads to a decrease of corresponding upper sums and an increase of corresponding lower sums, as has also been found in the special case in the
previous example. Indeed, this intuition is correct.
Lemma 2.6.3. Let $f$ be a bounded real-valued function on a closed interval
$I$ of $\mathbb{R}$. Further, let $P, P'$ be partitions of $I$, and in particular let $P'$ be a
refinement of $P$. Then
\[
L(f, P) \le L(f, P') \le U(f, P') \le U(f, P) . \qquad (2.6.3)
\]
Proof. The middle inequality is obvious from the definition of lower and
upper sums given in Def 2.6.1(ii). Further, let $P = (a_0, \dots, a_\nu)$ be a partition of $I = [a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$. Obviously, for the proof
of the remaining inequalities it is sufficient (by the method of induction) to
assume that $P' = (a_0, a_1', a_1, \dots, a_\nu)$ where $a_1' \in I$ is such that
\[
a_0 \le a_1' \le a_1
\]
and where the additional point was placed in the first subinterval to keep the notation simple. Then
\[
\begin{aligned}
L(f, P') - L(f, P) &= \inf\{f(x) : x \in [a_0, a_1']\} \, l([a_0, a_1']) + \inf\{f(x) : x \in [a_1', a_1]\} \, l([a_1', a_1]) \\
&\quad - \inf\{f(x) : x \in [a_0, a_1]\} \, l([a_0, a_1]) \\
&\ge \inf\{f(x) : x \in [a_0, a_1]\} \, \{ l([a_0, a_1']) + l([a_1', a_1]) - l([a_0, a_1]) \} = 0 .
\end{aligned}
\]
Analogously, it follows that
\[
U(f, P') - U(f, P) \le 0
\]
and hence, finally, (2.6.3).
As a consequence of their definition, lower sums are smaller than upper
sums. It is not difficult to show that the same is true for the supremum of
the lower sums and the infimum of the upper sums.
Theorem 2.6.4. Let $f$ be a bounded real-valued function on the interval
$[a, b]$ of $\mathbb{R}$ and $\mathcal{P}$ be the set of all partitions of $[a, b]$ where $a$ and $b$ are some
elements of $\mathbb{R}$ such that $a \le b$. Then
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) \le \inf(\{U(f, P) : P \in \mathcal{P}\}) . \qquad (2.6.4)
\]
Proof. By Lemma 2.6.3, it follows for all $P_1, P_2 \in \mathcal{P}$ that
\[
L(f, P_1) \le L(f, P) \le U(f, P) \le U(f, P_2) ,
\]
where $P \in \mathcal{P}$ is some corresponding common refinement, and hence that
\[
\sup(\{L(f, P_1) : P_1 \in \mathcal{P}\}) \le U(f, P_2)
\]
and (2.6.4).
As a consequence of Lemma 2.6.3 and since every partition P of some
interval of R is a refinement of the trivial partition containing only its initial
and endpoints, we can make the following definition.
Definition 2.6.5. (The Riemann integral) Let $f$ be a bounded real-valued
function on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$
such that $a \le b$. Denote by $\mathcal{P}$ the set consisting of all partitions of $[a, b]$.
We say that $f$ is Riemann-integrable on $[a, b]$ if
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) .
\]
In that case, we define the integral of $f$ on $[a, b]$ by
\[
\int_a^b f(x) \, dx := \sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) .
\]
In particular if $f(x) \ge 0$ for all $x \in [a, b]$, we define the area $A$ under the
graph of $f$ by
\[
A := \int_a^b f(x) \, dx .
\]
Example 2.6.6. Let $f$ be a constant function of value $c \in \mathbb{R}$ on some interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$.
In particular, $f$ is bounded. Further, let $P = (a_0, \dots, a_\nu)$ be a partition of
$[a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$. Then
\[
L(f, P) = U(f, P) = \sum_{k=0}^{\nu-1} c \, l([a_k, a_{k+1}]) = \sum_{k=0}^{\nu-1} c \, (a_{k+1} - a_k) = c \sum_{k=0}^{\nu-1} (a_{k+1} - a_k) = c \, (b - a) .
\]
Hence all lower and upper sums are equal to $c \, (b - a)$. As a consequence,
$f$ is Riemann-integrable and
\[
\int_a^b f(x) \, dx = c \, (b - a) .
\]
Note that, for $c = 1$, this result can be restated as saying that
\[
\int_a^b dx
\]
is given by the difference of the values of the antiderivative $([a, b] \to \mathbb{R}, x \mapsto x)$
of the integrand at $b$ and $a$. That this is not just accidental will be seen
later on. The same is also true in more general cases as specified in the
version Theorem 2.6.21 of the fundamental theorem of calculus.
Note that according to the previous example, the integral of every function
defined on an interval containing precisely one point is zero. The value
of the function in this point does not affect the value of the integral. This
observation will lead further down to the definition of so-called zero sets.
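Example 2.6.7. Consider the function $f : [a, b] \to \mathbb{R}$ defined by
\[
f(x) := x ,
\]
for all $x \in [a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$.
Since $|f(x)| = |x| \le \max\{|a|, |b|\}$ for every $x \in [a, b]$, $f$ is bounded. For
every $n \in \mathbb{N}^*$, define the partition $P_n$ of $[a, b]$ by
\[
P_n := \Big( a , \, a + \frac{b - a}{n} , \, \dots , \, a + \frac{n (b - a)}{n} \Big) .
\]
Calculate $L(f, P_n)$ and $U(f, P_n)$ for all $n \in \mathbb{N}^*$. Show that $f$ is Riemann-integrable over $[a, b]$ and calculate the value of
\[
\int_a^b f(x) \, dx .
\]
Solution: We have:
\[
I = \bigcup_{j=0}^{n-1} \Big[ a + \frac{j (b - a)}{n} , \, a + \frac{(j + 1)(b - a)}{n} \Big]
\]
and, since $f$ is increasing,
\[
L(f, P_n) = \sum_{j=0}^{n-1} \Big( a + \frac{j (b - a)}{n} \Big) \frac{b - a}{n} = a (b - a) + \frac{(b - a)^2}{n^2} \sum_{j=0}^{n-1} j = a (b - a) + \frac{(b - a)^2}{n^2} \, \frac{n (n - 1)}{2} = a (b - a) + \frac{(b - a)^2}{2} \Big( 1 - \frac{1}{n} \Big) ,
\]
\[
U(f, P_n) = \sum_{j=0}^{n-1} \Big( a + \frac{(j + 1)(b - a)}{n} \Big) \frac{b - a}{n} = a (b - a) + \frac{(b - a)^2}{n^2} \sum_{j=0}^{n-1} (j + 1) = a (b - a) + \frac{(b - a)^2}{n^2} \, \frac{n (n + 1)}{2} = a (b - a) + \frac{(b - a)^2}{2} \Big( 1 + \frac{1}{n} \Big) .
\]
Hence
\[
\lim_{n \to \infty} L(f, P_n) = \lim_{n \to \infty} U(f, P_n) = a (b - a) + \frac{(b - a)^2}{2} = \frac{1}{2} (b^2 - a^2) .
\]
As a consequence, it follows that
\[
\frac{1}{2} (b^2 - a^2) \le \sup(\{L(f, P) : P \in \mathcal{P}\})
\]
and that
\[
\inf(\{U(f, P) : P \in \mathcal{P}\}) \le \frac{1}{2} (b^2 - a^2)
\]
and hence by Theorem 2.6.4 that
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) = \frac{1}{2} (b^2 - a^2)
\]
where $\mathcal{P}$ denotes the set of partitions of $[a, b]$. Hence $f$ is Riemann-integrable
and
\[
\int_a^b x \, dx = \frac{1}{2} (b^2 - a^2) .
\]
Note that the last result can be restated as saying that
\[
\int_a^b x \, dx
\]
is given by the difference of the values of the antiderivative $([a, b] \to \mathbb{R}, x \mapsto x^2/2)$
of the integrand at $b$ and $a$. That this is not just accidental will be
seen later on. The same is also true in more general cases as specified in
the version Theorem 2.6.21 of the fundamental theorem of calculus.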
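The closed forms for the lower and upper sums in Example 2.6.7 can be checked with exact rational arithmetic. The sketch below is an illustration only; the endpoints $a = -1$, $b = 3$ and the function name `sums_for_identity` are assumptions made for the example.

```python
from fractions import Fraction


def sums_for_identity(a, b, n):
    """Lower and upper sums of f(x) = x for the equidistant partition P_n."""
    h = (b - a) / n
    # f is increasing: the infimum sits at the left end of each
    # subinterval and the supremum at the right end.
    lower = sum((a + j * h) * h for j in range(n))
    upper = sum((a + (j + 1) * h) * h for j in range(n))
    return lower, upper


a, b = Fraction(-1), Fraction(3)   # assumed endpoints
integral = (b**2 - a**2) / 2

for n in (1, 10, 100):
    lower, upper = sums_for_identity(a, b, n)
    # The integral is enclosed by the two sums; their gap is (b - a)^2 / n.
    print(n, lower, integral, upper, upper - lower)
```

The gap between the two sums equals $(b - a)^2 / n$, so both converge to $(b^2 - a^2)/2$.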
In the past, we have seen that the property of convergence of a sequence as
well as of the continuity and differentiability of functions is automatically
‘transferred’ to sums, products and quotients, see Theorems 2.3.4, 2.3.46, 2.3.48
and 2.4.8. This fact considerably simplified the decision whether a given sequence is convergent or given functions are continuous or differentiable. In many cases, this is an obvious consequence
of the convergence of elementary sequences as well as of the continuity
or differentiability of elementary functions. For these reasons, it is natural
to ask whether multiples, sums, products and quotients of integrable functions are integrable as well. Indeed, this is the case for multiples, sums and
products. In the case of quotients, this is the case if the divisor is in addition nowhere vanishing, and if the quotient is bounded. The corresponding
proof is relatively simple in the case of multiples and sums of integrable
functions and is part of the following theorem. In the case of products
and quotients, the statement is a consequence of Lebesgue’s criterion for
Riemann-integrability, Theorem 2.6.13, which is proved in the appendix.
Within the definition of Riemann-integrability above, we also defined the
area under the graph of a positive integrable function in terms of its integral.
This is reasonable in view of applications only if that integral is positive.
This positivity is a simple consequence of the positivity of the lower sums.
Theorem 2.6.8. (Linearity and positivity of the integral) Let $f, g$ be
bounded and Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and
$b$ are elements of $\mathbb{R}$ such that $a \le b$ and $c \in \mathbb{R}$. Then $f + g$ and $cf$ are
Riemann-integrable on $[a, b]$ and
\[
\int_a^b (f(x) + g(x)) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx ,
\]
\[
\int_a^b c f(x) \, dx = c \int_a^b f(x) \, dx .
\]
If $f$ is in addition positive, then
\[
\int_a^b f(x) \, dx \ge 0 .
\]
Proof. In the following, we denote by $\mathcal{P}$ the set of all partitions of $[a, b]$.
First, if $M_1 > 0$ and $M_2 > 0$ are such that $|f(x)| \le M_1$ and $|g(x)| \le M_2$,
then
\[
|(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 ,
\]
\[
|(cf)(x)| = |c f(x)| = |c| \, |f(x)| \le |c| M_1
\]
for all $x \in [a, b]$ and hence $f + g$ and $cf$ are bounded for every $c \in \mathbb{R}$.
Second, it follows for every subinterval $J$ of $I := [a, b]$ that
\[
\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} \le f(x) + g(x) = (f + g)(x) ,
\]
\[
(f + g)(x) = f(x) + g(x) \le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\}
\]
for all $x \in J$ and hence that
\[
\begin{aligned}
\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\}
&\le \inf\{(f + g)(x) : x \in J\} \le \sup\{(f + g)(x) : x \in J\} \\
&\le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\} .
\end{aligned}
\]
Hence it follows for every partition $P$ of $I$ that
\[
L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P) .
\]
If $n \in \mathbb{N}^*$, by refining partitions, we can construct $P_n \in \mathcal{P}$ such that
\[
\int_a^b f(x) \, dx - \frac{1}{2n} \le L(f, P_n) , \quad \int_a^b g(x) \, dx - \frac{1}{2n} \le L(g, P_n) ,
\]
\[
U(f, P_n) \le \int_a^b f(x) \, dx + \frac{1}{2n} , \quad U(g, P_n) \le \int_a^b g(x) \, dx + \frac{1}{2n} .
\]
Hence
\[
\int_a^b f(x) \, dx + \int_a^b g(x) \, dx - \frac{1}{n} \le L(f + g, P_n) \le U(f + g, P_n) \le \int_a^b f(x) \, dx + \int_a^b g(x) \, dx + \frac{1}{n}
\]
and
\[
\begin{aligned}
\int_a^b f(x) \, dx + \int_a^b g(x) \, dx - \frac{1}{n}
&\le \sup\{L(f + g, P) : P \in \mathcal{P}\} \\
&\le \inf\{U(f + g, P) : P \in \mathcal{P}\} \le \int_a^b f(x) \, dx + \int_a^b g(x) \, dx + \frac{1}{n} .
\end{aligned}
\]
Since the last is true for every $n \in \mathbb{N}^*$, we conclude that
\[
\sup\{L(f + g, P) : P \in \mathcal{P}\} = \inf\{U(f + g, P) : P \in \mathcal{P}\} = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx .
\]
Hence $f + g$ is Riemann-integrable and
\[
\int_a^b (f(x) + g(x)) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx .
\]
Further, if $c \ge 0$, it follows for every subinterval $J$ of $I$ that
\[
\inf\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\} , \quad
\sup\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\}
\]
and hence that
\[
L(cf, P) = c \, L(f, P) , \quad U(cf, P) = c \, U(f, P)
\]
for every partition $P$ of $I$. The last implies that
\[
\sup\{L(cf, P) : P \in \mathcal{P}\} = c \, \sup\{L(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx ,
\]
\[
\inf\{U(cf, P) : P \in \mathcal{P}\} = c \, \inf\{U(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx .
\]
If $c \le 0$, it follows for every subinterval $J$ of $I$ that
\[
\inf\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\} , \quad
\sup\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\}
\]
and hence that
\[
L(cf, P) = c \, U(f, P) , \quad U(cf, P) = c \, L(f, P)
\]
for every partition $P$ of $I$. The last implies that
\[
\sup\{L(cf, P) : P \in \mathcal{P}\} = c \, \inf\{U(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx ,
\]
\[
\inf\{U(cf, P) : P \in \mathcal{P}\} = c \, \sup\{L(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx .
\]
Hence it follows in both cases that
\[
\int_a^b c f(x) \, dx = c \int_a^b f(x) \, dx .
\]
Finally, if $f$ is such that $f(x) \ge 0$ for all $x \in I$, then
\[
\inf\{f(x) : x \in J\} \ge 0
\]
for all subintervals $J$ of $I$ and hence
\[
L(f, P) \ge 0
\]
for every partition $P$ of $I$. As a consequence,
\[
\int_a^b f(x) \, dx = \sup\{L(f, P) : P \in \mathcal{P}\} \ge 0 .
\]
The Riemann integral can be viewed as a map into the real numbers with
domain given by the set of bounded Riemann-integrable functions over
some bounded closed interval $I$ of $\mathbb{R}$. According to the previous theorem, that map is ‘linear’, i.e., the integral of the sum of such functions
is equal to the sum of their corresponding integrals and the integral of a
scalar multiple of such a function is given by that multiple of the integral
of that function. In addition, it is positive, in the sense that it maps such
functions which are in addition positive, i.e., which assume only positive
($\ge 0$) values, into a positive real number. It is easy to see that the linearity
and positivity of the map also imply its monotony, i.e., if such functions $f$
and $g$ satisfy $f \le g$, defined by $f(x) \le g(x)$ for all $x \in I$, then the integral
of $f$ is less than or equal to the integral of $g$.
Corollary 2.6.9. (Monotony of the integral) Let $f, g$ be bounded and
Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are elements
of $\mathbb{R}$ such that $a \le b$, and in addition let $f(x) \le g(x)$ for all $x \in [a, b]$.
Then
\[
\int_a^b f(x) \, dx \le \int_a^b g(x) \, dx .
\]
Proof. For this, we define the auxiliary function $h : [a, b] \to \mathbb{R}$ by $h(x) :=
g(x) - f(x)$ for all $x \in [a, b]$. According to Theorem 2.6.8, $h$ is bounded
and Riemann-integrable. Finally, since $f(x) \le g(x)$ for all $x \in [a, b]$, it
follows that $h(x) \ge 0$ for all $x \in [a, b]$. Hence it follows by the linearity
and positivity of the integral that
\[
0 \le \int_a^b h(x) \, dx = \int_a^b g(x) \, dx + \int_a^b [-f(x)] \, dx = \int_a^b g(x) \, dx - \int_a^b f(x) \, dx
\]
and hence that
\[
\int_a^b f(x) \, dx \le \int_a^b g(x) \, dx .
\]
The reader might have wondered why we did not define divisions of intervals induced by partitions in such a way that they contain only pairwise
disjoint intervals, although that would have been possible. In our definition,
subsequent intervals in a division contain a common point. Hence, in a certain sense, associated upper and lower sums count the values of the function
in such points twice. The reason for our definition is that it is technically
simpler than one which uses pairwise disjoint intervals and that the use of
a definition of the latter type would have led to the same integral. The latter
is reflected by the fact that values of functions in individual points do not
influence the value of the integral. For this, note that by Example 2.6.6
the integral of any function defined on an interval containing
only one point is zero. The value of the function in this point does not affect
the value of the integral. The reason behind this behavior is, of course, the
fact that we defined the length of intervals as the difference between their
right and left boundary. Hence the length of an interval containing only
one point is zero. Such intervals are examples of so-called zero sets. The
values assumed by a function on a zero set do not influence the value of the
integral. Several definitions of zero sets are possible. The following
common definition uses the intuition that they should be, in some sense, of
vanishing length.
Definition 2.6.10. (Sets of measure zero) A subset $S$ of $\mathbb{R}$ is said to have
measure zero if for every $\varepsilon > 0$ there is a corresponding sequence $I_0, I_1, \dots$
of open subintervals of $\mathbb{R}$ such that the union of those intervals contains $S$
and at the same time such that
\[
\lim_{n \to \infty} \sum_{k=0}^{n} l(I_k) < \varepsilon .
\]
Remark 2.6.11. Note that any finite subset of R and also any subset of a
set of measure zero has measure zero.
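Theorem 2.6.12. Every countable subset $S$ of $\mathbb{R}$ is a set of measure zero.
Proof. Since $S$ is countable, there is a bijection $\varphi : \mathbb{N} \to S$. Let $\varepsilon >
0$ and define for each $n \in \mathbb{N}$ the corresponding interval $I_n := (\varphi(n) -
\varepsilon/2^{n+3} , \varphi(n) + \varepsilon/2^{n+3})$. Then for each $N \in \mathbb{N}$:
\[
\sum_{n=0}^{N} l(I_n) = \frac{\varepsilon}{4} \sum_{n=0}^{N} \Big( \frac{1}{2} \Big)^n = \frac{\varepsilon}{4} \cdot \frac{1 - \big( \frac{1}{2} \big)^{N+1}}{1 - \frac{1}{2}} = \frac{\varepsilon}{2} \Big[ 1 - \Big( \frac{1}{2} \Big)^{N+1} \Big]
\]
and hence
\[
\lim_{N \to \infty} \sum_{k=0}^{N} l(I_k) = \frac{\varepsilon}{2} < \varepsilon .
\]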
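The total length of the covering intervals used in this proof can be computed exactly. The following sketch is only an illustration of the geometric sum; the value of `eps` and the function name `cover_length` are arbitrary choices.

```python
from fractions import Fraction


def cover_length(eps, N):
    """Total length of I_0, ..., I_N, where I_n has length 2*eps/2**(n+3)."""
    return sum(2 * eps / 2 ** (n + 3) for n in range(N + 1))


eps = Fraction(1, 10)  # assumed value of epsilon

for N in (0, 5, 50):
    total = cover_length(eps, N)
    # Each partial sum stays strictly below eps/2.
    print(N, float(total), total < eps / 2)
```

The partial sums increase towards $\varepsilon/2$ without reaching it, which is the bound used in the proof.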
So far, we proved the existence of the integral only in a few simple cases. The
following celebrated theorem due to Henri Lebesgue changes this. It gives
a characterization of Riemann-integrability. Because of its technical character, the proof is deferred to the Appendix.
Theorem 2.6.13. (Lebesgue’s criterion for Riemann-integrability) Let
$f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such
that $a < b$. Further, let $D$ be the set of discontinuities of $f$. Then $f$ is
Riemann-integrable if and only if $D$ is a set of measure zero.
Proof. See the proof of Theorem 5.2.6 in the Appendix.
Remark 2.6.14. A property is said to hold almost everywhere on a subset
S of R if it holds everywhere on S except for a set of measure zero. Thus,
Theorem 2.6.13 states that a bounded function on a non-trivial bounded
and closed interval of R is Riemann-integrable if and only if the function is
almost everywhere continuous.
Since
\[
|f(x)| = \sqrt{[f(x)]^2}
\]
for every $x \in [a, b]$, if $f$ is bounded and Riemann-integrable on the interval
$[a, b]$ of $\mathbb{R}$, where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$, we conclude by
application of the previous theorem that also $|f|$ is bounded and Riemann-integrable. Since
\[
-|f(x)| \le f(x) \le |f(x)|
\]
for all $x \in [a, b]$, it follows by the monotony of the Riemann integral, Corollary 2.6.9, that
\[
-\int_a^b |f(x)| \, dx \le \int_a^b f(x) \, dx \le \int_a^b |f(x)| \, dx
\]
Fig. 64: Graph of $J_0$.
and hence that
\[
\Big| \int_a^b f(x) \, dx \Big| \le \int_a^b |f(x)| \, dx .
\]
The last estimate is frequently applied. For a first application, see Example 2.6.16. As a consequence, we proved the following theorem.
Theorem 2.6.15. Let $f$ be bounded and Riemann-integrable on the interval
$[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$. Then $|f|$ is
bounded and Riemann-integrable and
\[
\Big| \int_a^b f(x) \, dx \Big| \le \int_a^b |f(x)| \, dx .
\]
Example 2.6.16. For many functions that are important for applications,
there are integral representations which are often crucial for the derivation
of their properties. For instance, for every $n \in \mathbb{Z}$, the corresponding Bessel
function of the first kind $J_n$ satisfies
\[
J_n(x) = \frac{1}{\pi} \int_0^\pi \cos(x \sin\theta - n\theta) \, d\theta
\]
for all $x \in \mathbb{R}$ and is a solution of the differential equation
\[
x^2 f''(x) + x f'(x) + (x^2 - n^2) f(x) = 0 ,
\]
for all $x \in \mathbb{R}$. By Theorem 2.6.15 and Corollary 2.6.9, the simple estimate
\[
|J_n(x)| \le \frac{1}{\pi} \int_0^\pi |\cos(x \sin\theta - n\theta)| \, d\theta \le \frac{1}{\pi} \int_0^\pi d\theta = 1
\]
follows for all $x \in \mathbb{R}$ and hence $J_n$ is a bounded function. Bessel functions
occur frequently in the description of physical systems that are ‘axially
symmetric’, i.e., symmetric with respect to rotations around an axis.
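The integral representation also lends itself to a direct numerical evaluation. The sketch below approximates it by the midpoint rule, which is a choice made here for simplicity and not a method taken from the text; the function name `bessel_j` and the number of subintervals are assumptions.

```python
import math


def bessel_j(n, x, steps=2000):
    """Approximate J_n(x) = (1/pi) * int_0^pi cos(x sin(theta) - n theta) d theta
    by the midpoint rule with `steps` subintervals (an illustrative choice)."""
    h = math.pi / steps
    total = sum(
        math.cos(x * math.sin((k + 0.5) * h) - n * (k + 0.5) * h)
        for k in range(steps)
    )
    return total * h / math.pi


# The approximations respect the bound |J_n(x)| <= 1 derived above.
for n in (0, 1, 2):
    values = [bessel_j(n, 0.5 * i) for i in range(-40, 41)]
    print(n, max(abs(value) for value in values))
```

Since every approximation is an average of cosine values, the bound $|J_n(x)| \le 1$ is visible in the discretization as well.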
Within the definition of Riemann-integrability above, we also defined the
area under the graph of a bounded integrable function that assumes only
positive ($\ge 0$) values in terms of its integral. Geometric intuition suggests
that areas are additive, that is, if $A$ is the set under the graph of a bounded
integrable function and $A$ is the disjoint union of two such sets $B$ and $C$,
we expect that the area of $A$ is equal to the sum of the areas of $B$ and $C$.
Indeed, in the following it will be shown that this intuition is reflected in
the additivity of the integral.
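Theorem 2.6.17. (Additivity of upper and lower integrals) Let $f : [a, b] \to
\mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$ and
$c \in [a, b]$. Then
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) = \sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) + \sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) ,
\]
\[
\inf(\{U(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) + \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\})
\]
where $\mathcal{P}$, $\mathcal{P}_{[a,c]}$, $\mathcal{P}_{[c,b]}$ denote the sets consisting of all partitions of $[a, b]$,
$[a, c]$ and $[c, b]$, respectively.
Proof. For this, let $P_1 = (a_0, \dots, a_\nu) \in \mathcal{P}_{[a,c]}$ and $P_2 = (a_{\nu+1}, \dots, a_{\nu+\mu}) \in
\mathcal{P}_{[c,b]}$, where $\nu, \mu$ are some elements of $\mathbb{N}^*$, and
\[
P := (a_0, \dots, a_\nu, a_{\nu+1}, \dots, a_{\nu+\mu})
\]
the corresponding element of $\mathcal{P}$. Then
\[
L(f, P) = L(f|_{[a,c]}, P_1) + L(f|_{[c,b]}, P_2) ,
\]
\[
U(f, P) = U(f|_{[a,c]}, P_1) + U(f|_{[c,b]}, P_2) .
\]
Now let $\varepsilon > 0$. Obviously because of Lemma 2.6.3, we can assume without
restriction that $P$ is such that
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) - L(f, P) \le \frac{\varepsilon}{3} ,
\]
\[
\sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - L(f|_{[a,c]}, P_1) \le \frac{\varepsilon}{3} ,
\]
\[
\sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) - L(f|_{[c,b]}, P_2) \le \frac{\varepsilon}{3} .
\]
Then also
\[
\big| \sup(\{L(f, P) : P \in \mathcal{P}\}) - \sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - \sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \big| \le \varepsilon .
\]
Analogously because of Lemma 2.6.3, we can also assume without restriction that $P$ is such that
\[
U(f, P) - \inf(\{U(f, P) : P \in \mathcal{P}\}) \le \frac{\varepsilon}{3} ,
\]
\[
U(f|_{[a,c]}, P_1) - \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) \le \frac{\varepsilon}{3} ,
\]
\[
U(f|_{[c,b]}, P_2) - \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \le \frac{\varepsilon}{3} .
\]
Then also
\[
\big| \inf(\{U(f, P) : P \in \mathcal{P}\}) - \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \big| \le \varepsilon .
\]
Since $\varepsilon > 0$ is arbitrary, both equalities follow.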
Corollary 2.6.18. (Additivity of the Riemann Integral) Let $f : [a, b] \to
\mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of
$\mathbb{R}$ such that $a \le b$, and $c \in [a, b]$. Then
\[
\int_a^b f(x) \, dx = \int_a^c f(x) \, dx + \int_c^b f(x) \, dx .
\]
Proof. The statement is a simple consequence of Theorems 2.6.13 and
2.6.17.
So far, we calculated the value of the integral only in some simple cases and
from its definition. At the moment, with the help of the linearity of the integral
and the results in these cases, we can calculate integrals of linear functions
over bounded closed intervals of $\mathbb{R}$ only. The next fundamental theorem
will give us a powerful tool for such calculations. Below, that fundamental
theorem will be given in two variations. Both are direct consequences of the
additivity. The first displays that integration and differentiation are inverse
processes. The second is a consequence of the first. For a certain class of
integrands, it allows the calculation of the integral from the knowledge of
the values of an antiderivative of the integrand at the ends of the interval of
integration.
Theorem 2.6.19. Let $f : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable
where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then $F : [a, b] \to \mathbb{R}$
defined by
\[
F(x) := \int_a^x f(t) \, dt
\]
for every $x \in [a, b]$ is continuous. Furthermore, if $f$ is continuous in some
point $x \in (a, b)$, then $F$ is differentiable in $x$ and
\[
F'(x) = f(x) .
\]
Proof. For $x, y \in [a, b]$, it follows by Corollaries 2.6.18 and 2.6.9 that
\[
|F(y) - F(x)| = \Big| \int_x^y f(t) \, dt \Big| \le M |y - x|
\]
if $y \ge x$ as well as that
\[
|F(y) - F(x)| = \Big| \int_y^x f(t) \, dt \Big| \le M |y - x|
\]
if $y < x$, where $M \ge 0$ is such that $|f(t)| \le M$ for all $t \in [a, b]$, and hence
the continuity of $F$. Further, let $f$ be continuous in some point $x \in (a, b)$.
Hence given $\varepsilon > 0$, there is $\delta > 0$ such that
\[
|f(t) - f(x)| < \varepsilon
\]
for all $t \in [a, b]$ such that $|t - x| < \delta$. (Otherwise, there is some $\varepsilon > 0$
along with a sequence $t_1, t_2, \dots$ in $[a, b]$ such that $|f(t_n) - f(x)| \ge \varepsilon$ and
$|t_n - x| < 1/n$ for all $n \in \mathbb{N}^*$. Then $t_1, t_2, \dots$ is converging to $x$, but
$f(t_1), f(t_2), \dots$ is not convergent to $f(x)$.) Now let $h \in \mathbb{R}$ be such that
$|h| < \delta$ and small enough such that $x + h \in (a, b)$. We consider the cases
$h > 0$ and $h < 0$. In the first case, it follows by Theorem 2.6.13 and
Corollaries 2.6.18, 2.6.9 that
\[
\begin{aligned}
\Big| \frac{F(x + h) - F(x)}{h} - f(x) \Big|
&= \Big| \frac{1}{h} \Big[ \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \Big] - f(x) \Big| \\
&= \Big| \frac{1}{h} \Big[ \int_a^x f(t) \, dt + \int_x^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \Big] - f(x) \Big| \\
&= \Big| \frac{1}{h} \int_x^{x+h} [f(t) - f(x)] \, dt \Big| \le \frac{1}{h} \int_x^{x+h} |f(t) - f(x)| \, dt \le \varepsilon .
\end{aligned}
\]
Analogously, in the second case it follows that
\[
\begin{aligned}
\Big| \frac{F(x + h) - F(x)}{h} - f(x) \Big|
&= \Big| \frac{1}{h} \Big[ \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \Big] - f(x) \Big| \\
&= \Big| \frac{1}{h} \Big[ \int_a^{x+h} f(t) \, dt - \int_a^{x+h} f(t) \, dt - \int_{x+h}^x f(t) \, dt \Big] - f(x) \Big| \\
&= \Big| -\frac{1}{h} \int_{x+h}^x [f(t) - f(x)] \, dt \Big| \le \frac{1}{|h|} \int_{x+h}^x |f(t) - f(x)| \, dt \le \varepsilon .
\end{aligned}
\]
Hence it follows that
\[
\lim_{h \to 0, h \ne 0} \frac{F(x + h) - F(x)}{h} = f(x)
\]
and that $F$ is differentiable in $x$ with derivative $f(x)$.
Remark 2.6.20. Note that because of Theorem 2.6.13, the function $F$ in
Theorem 2.6.19 is differentiable with derivative $f(x)$ for almost all $x \in
(a, b)$.
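The statement $F'(x) = f(x)$ of Theorem 2.6.19 can be illustrated numerically. In the following sketch, the integrand $f = \cos$, the lower limit $a = 0$, the midpoint rule used to approximate $F$ and the step `h` are all assumptions made for the illustration.

```python
import math


def integral(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width


def F(x):
    """F(x) = integral of cos over [0, x]; a = 0 is an illustrative choice."""
    return integral(math.cos, 0.0, x)


x, h = 1.0, 1e-4
quotient = (F(x + h) - F(x)) / h

# The difference quotient of F approximates the integrand at x.
print(quotient, math.cos(x))
```

The difference quotient agrees with $f(x) = \cos(1)$ up to an error of the order of $h$.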
Theorem 2.6.21. (Fundamental Theorem of Calculus) Let $f : [a, b] \to
\mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of
$\mathbb{R}$ such that $a < b$. Further, let $F$ be a continuous function on $[a, b]$ as well
as differentiable on $(a, b)$ such that $F'(x) = f(x)$, for all $x \in (a, b)$. Then
\[
\int_a^b f(x) \, dx = F(b) - F(a) .
\]
In calculations, we sometimes use the notation
\[
[F(x)] \big|_a^b := F(b) - F(a) .
\]
Proof. Let $P = (a_0, \dots, a_\nu)$ be a partition of $[a, b]$ where $\nu$ is
an element of $\mathbb{N}^*$. By Theorem 2.5.6 for every $j \in \{0, 1, \dots, \nu - 1\}$, there
is a corresponding $c_j \in [a_j, a_{j+1}]$ such that
\[
F(a_{j+1}) - F(a_j) = F'(c_j)(a_{j+1} - a_j)
\]
where we define $F'(a) := f(a)$ and $F'(b) := f(b)$. Hence
\[
F(b) - F(a) = \sum_{j=0}^{\nu-1} [F(a_{j+1}) - F(a_j)] = \sum_{j=0}^{\nu-1} f(c_j)(a_{j+1} - a_j)
\]
and
\[
L(f, P) \le F(b) - F(a) \le U(f, P) .
\]
Hence
\[
\sup(\{L(f, P) : P \in \mathcal{P}\}) \le F(b) - F(a) \le \inf(\{U(f, P) : P \in \mathcal{P}\}) = \sup(\{L(f, P) : P \in \mathcal{P}\}) .
\]
Example 2.6.22. Calculate
\[
\int_0^\pi 7 \sin\Big( \frac{x}{3} \Big) \, dx .
\]
Solution: By
\[
f(x) := 7 \sin\Big( \frac{x}{3} \Big)
\]
for all $x \in [0, \pi]$, there is defined a continuous and hence Riemann-integrable
function on $[0, \pi]$. Further by
\[
F(x) := -21 \cos\Big( \frac{x}{3} \Big)
\]
for all $x \in [0, \pi]$, there is defined a continuous function on $[0, \pi]$ which is
differentiable on $(0, \pi)$ such that $F'(x) = f(x)$ for all $x \in (0, \pi)$. Hence by
Theorem 2.6.21
\[
\int_0^\pi 7 \sin\Big( \frac{x}{3} \Big) \, dx = -21 \cos\Big( \frac{\pi}{3} \Big) + 21 \cos(0) = -\frac{21}{2} + 21 = \frac{21}{2} .
\]
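The inequality $L(f, P) \le F(b) - F(a) \le U(f, P)$ from the proof of Theorem 2.6.21 can be observed for this example. The sketch below uses the fact that $x \mapsto 7 \sin(x/3)$ is increasing on $[0, \pi]$, so that infima and suprema are attained at the ends of the subintervals; the function name `darboux_sums_increasing` and the equidistant partition are assumptions of the illustration.

```python
import math


def darboux_sums_increasing(f, a, b, n):
    """Lower and upper sums of an increasing f for n equal subintervals."""
    h = (b - a) / n
    points = [a + j * h for j in range(n + 1)]
    lower = sum(f(p) for p in points[:-1]) * h  # infimum at left ends
    upper = sum(f(p) for p in points[1:]) * h   # supremum at right ends
    return lower, upper


def f(x):
    return 7 * math.sin(x / 3)


def F(x):
    return -21 * math.cos(x / 3)


value = F(math.pi) - F(0.0)  # equals 21/2 up to rounding
lower, upper = darboux_sums_increasing(f, 0.0, math.pi, 1000)
print(lower, value, upper)
```

Both sums enclose the value $21/2$, and their difference shrinks proportionally to $1/n$.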
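Example 2.6.23. A simple number theoretic function is the greatest integer
or floor function defined by $[x] := n$ for all $x \in [n, n + 1)$ and $n \in \mathbb{Z}$.
Calculate
\[
\int_0^x [y] \, dy , \quad \int_x^0 [y] \, dy
\]
for all $x \ge 0$ and $x < 0$, respectively. Solution: Note that the greatest
integer function is almost everywhere continuous and hence according to
Theorem 2.6.13 also Riemann-integrable on any bounded closed interval of $\mathbb{R}$. For
every $n \in \mathbb{N}$ and every $x \in [n, n + 1)$, it follows by Corollary 2.6.18 and
Theorem 2.6.21 that
\[
\begin{aligned}
\int_0^x [y] \, dy &= \int_0^n [y] \, dy + \int_n^x [y] \, dy = \sum_{k=0}^{n-1} \int_k^{k+1} [y] \, dy + \int_n^x n \, dy \\
&= \sum_{k=0}^{n-1} \int_k^{k+1} k \, dy + n (x - n) = \sum_{k=0}^{n-1} k + n (x - n) \\
&= \frac{n (n - 1)}{2} + n (x - n) = n \Big( x - \frac{n + 1}{2} \Big) = [x] \Big( x - \frac{[x] + 1}{2} \Big) .
\end{aligned}
\]
Analogously, it follows for every $n \in \mathbb{Z}$ such that $n \le -1$ and every $x \in
[n, n + 1)$ that
\[
\begin{aligned}
\int_x^0 [y] \, dy &= \int_x^{n+1} [y] \, dy + \int_{n+1}^0 [y] \, dy = \int_x^{n+1} n \, dy + \sum_{k=n+1}^{-1} \int_k^{k+1} [y] \, dy \\
&= n (n + 1 - x) + \sum_{k=n+1}^{-1} \int_k^{k+1} k \, dy = n (n + 1 - x) + \sum_{k=n+1}^{-1} k \\
&= n (n + 1 - x) - \frac{n (n + 1)}{2} = -n \Big( x - \frac{n + 1}{2} \Big) = -[x] \Big( x - \frac{[x] + 1}{2} \Big) .
\end{aligned}
\]
See Fig. 65.
Fig. 65: Graph of the greatest integer function and anti-derivative.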
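The closed form of this example can be compared with a piece-by-piece summation over the unit intervals on which the floor function is constant. The following sketch is an illustration; the function names and the sample points are arbitrary, and for $x < 0$ the integral from $0$ to $x$ is taken as the negative of the integral from $x$ to $0$.

```python
import math


def floor_integral(x):
    """Integral of the floor function from 0 to x, summed piece by piece."""
    n = math.floor(x)
    if x >= 0:
        # Full unit intervals [k, k+1] contribute k, the rest contributes n*(x-n).
        return sum(range(n)) + n * (x - n)
    # x < 0: minus the integral over [x, 0].
    return -(n * (n + 1 - x) + sum(range(n + 1, 0)))


def closed_form(x):
    """[x] * (x - ([x] + 1) / 2), valid for x >= 0 and for x < 0."""
    n = math.floor(x)
    return n * (x - (n + 1) / 2)


for x in (-3.25, -1.5, -1.0, 0.0, 0.75, 2.5, 4.0):
    print(x, floor_integral(x), closed_form(x))
```

Both columns agree, which shows that a single expression describes the anti-derivative on the whole real line.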
A basic method for the evaluation of integrals with trigonometric integrands
consists in the application of the addition theorems for sine and cosine.
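Example 2.6.24. Calculate
\[
\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta
\]
where $m, n \in \mathbb{N}^*$. Solution: With the help of the addition theorem for the cosine
function, it follows that
\[
\cos((m + n)\theta) = \cos(m\theta) \cos(n\theta) - \sin(m\theta) \sin(n\theta) ,
\]
\[
\cos((m - n)\theta) = \cos(m\theta) \cos(n\theta) + \sin(m\theta) \sin(n\theta) ,
\]
and hence that
\[
\sin(m\theta) \sin(n\theta) = \frac{1}{2} \, [\, \cos((m - n)\theta) - \cos((m + n)\theta) \,]
\]
for all $\theta \in \mathbb{R}$. This leads to
\[
\begin{aligned}
\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta &= \frac{1}{2} \int_0^\pi [\, \cos((m - n)\theta) - \cos((m + n)\theta) \,] \, d\theta \\
&= \frac{1}{2(m - n)} \, [\, \sin((m - n)\theta) \,] \big|_0^\pi - \frac{1}{2(m + n)} \, [\, \sin((m + n)\theta) \,] \big|_0^\pi = 0
\end{aligned}
\]
if $m \ne n$ and
\[
\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta = \frac{1}{2} \int_0^\pi [\, 1 - \cos((m + n)\theta) \,] \, d\theta = \frac{\pi}{2} - \frac{1}{2(m + n)} \, [\, \sin((m + n)\theta) \,] \big|_0^\pi = \frac{\pi}{2}
\]
if $m = n$.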
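These orthogonality relations are easy to confirm numerically. The sketch below approximates the integral by the midpoint rule, which is an arbitrary choice made for the illustration; the function name `sine_product_integral` is hypothetical.

```python
import math


def sine_product_integral(m, n, steps=20_000):
    """Midpoint-rule approximation of int_0^pi sin(m t) sin(n t) dt."""
    h = math.pi / steps
    return sum(
        math.sin(m * (k + 0.5) * h) * math.sin(n * (k + 0.5) * h)
        for k in range(steps)
    ) * h


for m in (1, 2, 3):
    for n in (1, 2, 3):
        # Expect pi/2 on the diagonal m = n and 0 elsewhere.
        print(m, n, round(sine_product_integral(m, n), 9))
```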
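Example 2.6.25. Find the solutions of the following (‘differential’) equation for $f : \mathbb{R} \to \mathbb{R}$:
\[
f'(x) = e^{2x} + \sin(3x) \qquad (2.6.5)
\]
for all $x \in \mathbb{R}$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Hence it follows by Theorem 2.6.21 that
\[
\begin{aligned}
f(x) - f(x_0) &= \int_{x_0}^x f'(y) \, dy = \int_{x_0}^x (e^{2y} + \sin(3y)) \, dy = \Big[ \frac{1}{2} e^{2y} - \frac{1}{3} \cos(3y) \Big] \Big|_{x_0}^x \\
&= \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) - \frac{1}{2} e^{2x_0} + \frac{1}{3} \cos(3x_0)
\end{aligned}
\]
where $x_0 \in \mathbb{R}$ and $x > x_0$. Hence
\[
f(x) = \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c ,
\]
for all $x \in \mathbb{R}$ where
\[
c = f(0) - \frac{1}{2} + \frac{1}{3} = f(0) - \frac{1}{6} .
\]
On the other hand if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$ is defined by
\[
f_c(x) := \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c
\]
for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.5)
for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given by
the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f_c(0) - (1/6)$. Hence for
every $y_0 \in \mathbb{R}$, there is precisely one solution of the differential equation with
‘initial value’ $f(0) = y_0$, namely $f_c$ with $c = y_0 - (1/6)$. The same is true for initial values given in any
other point of $\mathbb{R}$.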
Example 2.6.26. Find the solutions of the following differential equation
for $f : \mathbb{R} \to \mathbb{R}$:
\[
f'(x) + a f(x) = 3 \qquad (2.6.6)
\]
for all $x \in \mathbb{R}$ where $a \in \mathbb{R}$, $a \ne 0$. Solution: If $f$ is such a function, it follows that
$f$ is continuously differentiable. Further, by using the auxiliary function
$h : \mathbb{R} \to \mathbb{R}$ defined by
\[
h(x) := e^{ax}
\]
for every $x \in \mathbb{R}$, it follows that
\[
(hf)'(x) = h(x) f'(x) + h'(x) f(x) = e^{ax} f'(x) + a e^{ax} f(x) = e^{ax} [f'(x) + a f(x)] = 3 e^{ax}
\]
for all $x \in \mathbb{R}$. Hence it follows by Theorem 2.6.21 that
\[
(hf)(x) - (hf)(x_0) = \int_{x_0}^x 3 e^{ay} \, dy = \frac{3}{a} e^{ax} - \frac{3}{a} e^{ax_0}
\]
and therefore that
\[
(hf)(x) = (hf)(x_0) + \frac{3}{a} (1 - e^{ax_0}) + \frac{3}{a} (e^{ax} - 1)
\]
for $x > x_0$ where $x_0 \in \mathbb{R}$. From this, we conclude that
\[
(hf)(x) = \frac{3}{a} (e^{ax} - 1) + c
\]
and
\[
f(x) = \frac{3}{a} (1 - e^{-ax}) + c \, e^{-ax}
\]
for all $x \in \mathbb{R}$ where $c = f(0)$. On the other hand if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$
is defined by
\[
f_c(x) := \frac{3}{a} (1 - e^{-ax}) + c \, e^{-ax}
\]
for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.6)
for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given
by the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f_c(0)$. Hence for
every $c \in \mathbb{R}$, there is precisely one solution of the differential equation with
‘initial value’ $f(0) = c$. The same is true for initial values given in any
other point of $\mathbb{R}$.
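The ‘direct calculation’ for the family $f_c$ of Example 2.6.26 can be mimicked numerically by inserting $f_c$ into the left hand side of (2.6.6). In the sketch below, the derivative is replaced by a central difference quotient, which is an approximation chosen for the illustration; the sample values of $a$, $c$ and $x$ are arbitrary, and $a \ne 0$ is assumed.

```python
import math


def f_c(x, a, c):
    """Candidate solution f_c(x) = (3/a) * (1 - exp(-a x)) + c * exp(-a x), a != 0."""
    return 3 / a * (1 - math.exp(-a * x)) + c * math.exp(-a * x)


def residual(x, a, c, h=1e-6):
    """Approximate f_c'(x) + a * f_c(x) - 3 using a central difference."""
    derivative = (f_c(x + h, a, c) - f_c(x - h, a, c)) / (2 * h)
    return derivative + a * f_c(x, a, c) - 3


for a in (0.5, -2.0):
    for c in (-1.0, 0.0, 4.0):
        # The residual should vanish up to discretization and rounding errors.
        print(a, c, f_c(0.0, a, c), max(abs(residual(x, a, c)) for x in (-1.0, 0.0, 2.0)))
```

The residual is zero within numerical accuracy, and the printed value $f_c(0)$ reproduces the parameter $c$.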
Problems
1) Calculate
»3
px
2
a)
2
»2
c)
»2
e)
1
»1
g)
»
x
0
π 2
{ 1
3
0
»π
j)
l)
{
π 2
»3
1
1
,
x
| 5x 3 |2
2
0
» 2π
q)
0
» 2π
s)
0
?
3x
f)
3x
x
h)
| sinpx{2q| dx
k)
1
4|x
|3x 4| dx
,
1|
dx
p)
,
2 | dx
» π{6
sinpmθq sinpnθq dθ
,
,
1| |x
,
dx
,
,
»3
dx
5
x3
,
,
2 dx
1
2
sinpx{2q cospx{2q dx dx
5
|x 1| |x
4
o)
,
2
2
» 3π
3
»5
dx
sinpπxq
dx
π
»4
dx
1
|x 1|
3
»5
n)
d)
,
0
»2
,
9 t2{3 dt
b)
4
sinpxq cos2 pxq dx
5
»2
m)
,
?4x 3x
2x2
i)
7q dx
5x
pe2x 3xq dx
1
»2
π{6
| cosp3xq| dx
» 2π
,
,
r)
0
,
sinpmθq cospnθq dθ
,
cospmθq cospnθq dθ
where m, n P N .
2) Define $f : \mathbb{R} \to \mathbb{R}$ by
\[
f(x) :=
\begin{cases}
\sqrt{3} \, (1 + x) & \text{if } x \le -1/2 \\
\sqrt{3}/2 & \text{if } -1/2 < x < 1/2 \\
\sqrt{3} \, (1 - x) & \text{if } 1/2 \le x
\end{cases}
\]
for every $x \in \mathbb{R}$. Calculate the area in $\mathbb{R}^2$ that is enclosed by the graph
of $f$ and the x-axis. Verify your result using facts from elementary
geometry. Use the result to calculate the area enclosed by a hexagon
of side length 1.
3) Calculate the area in R2 that is enclosed by the graphs of the polynomials
p1 pxq : 1
p7{2q x x2 , p2 pxq : 4 p7{2q x
where x P R.
x2
4) Calculate the area in $\mathbb{R}^2$ that is enclosed by the curve
\[
C := \{ (x, y) \in \mathbb{R}^2 : y^2 - 4x^2 + 4x^4 = 0 \} .
\]
5) Show that
\[
\cos(x) \le 1 - \frac{x^2}{\pi}
\]
for all $x \in [0, \pi/2]$.
6) Find the solutions to the differential equation for f : R Ñ R.
a) f 1 pxq 3f pxq x{2 , x P R ,
b) f 1 pxq 3f pxq ex{4 , x P R ,
c) 2f 1 pxq f pxq 3 ex , x P R .
7) Consider the following differential equation for f : R Ñ R.
f 2 pxq 3x
for all x P R.
4
a) Find the solutions of this equation.
b) Find that solution which satisfies f p0q 1 and f 1 p0q 2.
c) Find that solution which satisfies f p0q 2 and f p1q 3.
8) Calculate
\[
a_0 := \frac{1}{2\pi} \int_0^{2\pi} f(x) \, dx , \quad
a_k := \frac{1}{\pi} \int_0^{2\pi} \cos(kx) f(x) \, dx , \quad
b_k := \frac{1}{\pi} \int_0^{2\pi} \sin(kx) f(x) \, dx
\]
for all $k \in \mathbb{N}^*$.
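a)
\[
f(x) :=
\begin{cases}
1 & \text{if } x \in [0, \pi] \\
-1 & \text{if } x \in (\pi, 2\pi]
\end{cases} ,
\]
b) $f(x) := x$ for all $x \in [0, 2\pi]$ ,
c)
\[
f(x) :=
\begin{cases}
x & \text{if } x \in [0, \pi] \\
2\pi - x & \text{if } x \in (\pi, 2\pi]
\end{cases} .
\]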
Remark: These are the coefficients of the Fourier expansion of $f$.
The representation
\[
f(x) = \lim_{n \to \infty} \Big\{ a_0 + \sum_{k=1}^{n} [\, a_k \cos(kx) + b_k \sin(kx) \,] \Big\}
\]
is valid for every point $x \in [0, 2\pi]$ of continuity of $f$.
9) Calculate the area in $\mathbb{R}^2$ that is enclosed by the ellipse
\[
C := \Big\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \Big\}
\]
where $a, b > 0$.
10) Calculate the area in R2 that is enclosed by the branches of hyperbolas
"
*
y2
x2
C1 : px, y q P R : y ¥ 0 ^ 2 2 1 ,
a
b
"
*
p
y cq2
x2
2
C2 : px, y q P R : y ¤ c ^
b2 1
a2
2
where a, b ¡ 0 and c ¡ a.
11) Let $a, b \in \mathbb{R}$ be such that $a < b$. Further, let $f : [a, b] \to \mathbb{R}$ be
positive, i.e., such that $f(x) \ge 0$ for all $x \in [a, b]$, and assume a
value $> 0$ in some point of $[a, b]$. Show that
\[ \int_a^b f(x)\,dx > 0 . \]
12) Let $a, b \in \mathbb{R}$ be such that $a < b$. Further, let $f : [a, b] \to \mathbb{R}$ and
$g : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable. Show the following
Cauchy-Schwarz inequality for integrals:
\[
\Big( \int_a^b f(x)\, g(x)\,dx \Big)^2 \le \Big( \int_a^b f^2(x)\,dx \Big) \Big( \int_a^b g^2(x)\,dx \Big) .
\]
In addition, show that equality holds if and only if there are $\alpha, \beta \in \mathbb{R}$
satisfying that $\alpha^2 + \beta^2 \ne 0$ and such that $\alpha f + \beta g = 0$. Hint:
Consider
\[ \int_a^b [\, f(x) + \lambda\, g(x) \,]^2\,dx \]
as a function of $\lambda \in \mathbb{R}$.
13) Newton’s equation of motion for a point particle of mass $m \ge 0$
moving on a straight line is given by
\[ m f''(t) = F(f(t)) \qquad (2.6.7) \]
for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$
and $F(x)$ is the external force at the point $x$. For the specified
force, calculate the solution function $f$ of (2.6.7) with initial position $f(0) = x_0$ and initial speed $f'(0) = v_0$ where $x_0, v_0 \in \mathbb{R}$.
a) $F(x) = 0$, $x \in \mathbb{R}$,
b) $F(x) = F_0$, $x \in \mathbb{R}$, where $F_0$ is some real parameter.
14) Newton’s equation of motion for a point particle of mass $m \ge 0$
moving on a straight line under the influence of a viscous friction is
given by
\[ m f''(t) = -\lambda f'(t) \qquad (2.6.8) \]
for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and
$\lambda > 0$ is a parameter describing the strength of the friction. Calculate
the solution function $f$ of (2.6.8) with initial position $f(0) = x_0$ and
initial speed $f'(0) = v_0$ where $x_0, v_0 \in \mathbb{R}$. Investigate whether $f$
has a limit value for $t \to \infty$.
15) Newton’s equation of motion for a point particle of mass m ¥ 0
moving on a straight line under the influence of low viscous friction,
for instance friction exerted by air, is given by
mf 2 ptq λ pf 1 ptqq2
(2.6.9)
for all t P R where f ptq is the position of the particle at time t and
λ ¡ 0 is a parameter describing the strength of the friction. Find
solutions f of (2.6.9) with initial position f p0q x0 and initial speed
f 1 p0q v0 where x0 , v0 P R.
16) Consider a projectile that is shot into the atmosphere. According to
Newton’s equation of motion, the height f ptq above ground at time
t P R satisfies the equation
mf 2 ptq g λ pf 1 ptqq2
(2.6.10)
for all t P R where g 9.81m{s2 is the acceleration due to gravity
and λ ¡ 0 is a parameter describing the strength of the friction. Find
solutions f of (2.6.10) with initial height f p0q z0 and initial speed
component f 1 p0q v0 where z0 , v0 P R.
3 Calculus II
3.1 Techniques of Integration
This section studies standard techniques of integration, namely the methods of change of variables (also referred to as ‘integration by substitution’),
integration by parts, integration by decomposition of rational integrands
into partial fractions and, finally, approximate numerical calculation of integrals.
3.1.1 Change of Variables
The method of change of variables (also referred to as ‘integration by substitution’) is based on the chain rule for differentiation. For motivation, we
consider a continuously differentiable and increasing function g defined on
a non-trivial open interval I of R and a continuously differentiable function
$F$ that is defined on an open interval containing $\mathrm{Ran}(g)$. Further, let $c, d \in I$
be such that $c < d$.
Then it follows by the chain rule for differentiation that $F \circ g : I \to \mathbb{R}$
is continuously differentiable with derivative given by
\[ (F \circ g)'(u) = F'(g(u))\, g'(u) \]
for all $u \in I$. Further, it follows by the fundamental theorem of calculus,
Theorem 2.6.21, that
\[
\int_{g(c)}^{g(d)} F'(x)\,dx = F(g(d)) - F(g(c)) = (F \circ g)(d) - (F \circ g)(c)
= \int_c^d (F \circ g)'(u)\,du = \int_c^d F'(g(u))\, g'(u)\,du .
\]
Hence by defining $f := F'$, we arrive at the formula for the change of
variables
\[ \int_{g(c)}^{g(d)} f(x)\,dx = \int_c^d f(g(u))\, g'(u)\,du \]
for $f$. We note that the previous reasoning proves the validity of this equation if, in addition to the assumptions above on $g$, $c$ and $d$, $f$ is a continuous
function that is defined on an open interval containing $\mathrm{Ran}(g)$ and for which
there is an antiderivative $F$, i.e., for which there is a differentiable function
$F : D(f) \to \mathbb{R}$ such that
\[ F'(x) = f(x) \]
for all $x \in D(f)$. In the proof of the following theorem, the existence of such an antiderivative is concluded from the continuity of the function $f$ and the fundamental theorem
of calculus in the form of Theorem 2.6.19.
Theorem 3.1.1. (Change of variables) Let $c, d \in \mathbb{R}$ be such that $c < d$.
Further, let $g : [c, d] \to \mathbb{R}$ be continuous, such that $g(c) \le g(d)$ and continuously differentiable on $(c, d)$ with a derivative which can be extended to a
continuous function on $[c, d]$. Finally, let $I$ be an open interval of
$\mathbb{R}$ containing $g([c, d])$ and $f : I \to \mathbb{R}$ be continuous. Then
\[ \int_{g(c)}^{g(d)} f(x)\,dx = \int_c^d f(g(u))\, g'(u)\,du . \qquad (3.1.1) \]
Proof. In the special case that $g$ is a constant function, the statement of the
theorem is obviously true. In the remainder of this proof, we consider the
case of a non-constant $g$. We denote by $g'$ the extension of the derivative of
$g|_{(c,d)}$ to a continuous function on $[c, d]$ and define $G : [c, d] \to \mathbb{R}$ by
\[ G(u) := \int_c^u f(g(\bar{u}))\, g'(\bar{u})\,d\bar{u} \]
for all $u \in [c, d]$. By Theorem 2.6.19 it follows that $G$ is continuous as well
as differentiable on $(c, d)$ with
\[ G'(u) = f(g(u))\, g'(u) \]
for all $u \in (c, d)$. Further, we define $F : [x_0, x_1] \to \mathbb{R}$ by
\[ F(x) := \int_{x_0}^{x} f(x')\,dx' - \int_{x_0}^{g(c)} f(x')\,dx' \]
for all $x \in [x_0, x_1]$ where $x_0, x_1 \in I$ are such that $x_0$ is smaller than the minimum value of $g$ and $x_1$ is larger than the maximum value of $g$, respectively.
By Theorem 2.6.19 it follows that $F$ is continuous as well as differentiable
on $(x_0, x_1)$ with
\[ F'(x) = f(x) \]
for all $x \in (x_0, x_1)$. Hence it follows by Theorems 2.3.51, 2.4.10 that $F \circ g$
is continuous as well as differentiable on $(c, d)$ with
\[ (F \circ g)'(u) = f(g(u))\, g'(u) = G'(u) \]
for all $u \in (c, d)$. From Theorem 2.5.7 and $F(g(c)) = G(c) = 0$, it follows
that $F \circ g = G$ and hence by Corollary 2.6.18 also (3.1.1).
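Example 3.1.2. Calculate
\[ \int_1^3 x \sqrt{x - 1}\,dx . \]
Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := u + 1$ for all $u \in \mathbb{R}$. Then $g$ is increasing and continuously differentiable with a derivative
function constant of value $1$. In particular, $g(0) = 1$ and $g(2) = 3$. Further,
we define the continuous function $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := x \sqrt{|x - 1|}$ for
all $x \in \mathbb{R}$. Hence it follows by Theorem 3.1.1 that
\[
\int_1^3 x \sqrt{x - 1}\,dx = \int_{g(0)}^{g(2)} f(x)\,dx = \int_0^2 f(g(u))\, g'(u)\,du
= \int_0^2 (u + 1) \sqrt{u}\,du = \int_0^2 \big( u^{3/2} + u^{1/2} \big)\,du
\]
\[
= \Big[ \frac{2}{5}\, u^{5/2} + \frac{2}{3}\, u^{3/2} \Big]_0^2
= \Big[ \frac{2}{15}\, (3u + 5)\, u^{3/2} \Big]_0^2
= \frac{22}{15}\, 2^{3/2} = \frac{44}{15} \sqrt{2} .
\]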
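Note that we could have achieved this result also by the following simpler
reasoning.
\[
\int_1^3 x \sqrt{x - 1}\,dx = \int_1^3 (x - 1 + 1) \sqrt{x - 1}\,dx
= \int_1^3 \big[ (x - 1)^{3/2} + (x - 1)^{1/2} \big]\,dx
\]
\[
= \Big[ \frac{2}{5}\, (x - 1)^{5/2} + \frac{2}{3}\, (x - 1)^{3/2} \Big]_1^3
= \Big[ \frac{2}{15}\, (3x + 2)\, (x - 1)^{3/2} \Big]_1^3
= \frac{22}{15}\, 2^{3/2} = \frac{44}{15} \sqrt{2} .
\]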
Simple substitutions can often be avoided by application of such simple
‘tricks’. Below, we will give some examples where this is not the case.
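The change of variables in Example 3.1.2 can be checked numerically. The following Python sketch is only an illustration and not part of the original text; the helper `simpson` and the names `f`, `g`, `dg` are chosen here for that purpose. It approximates both sides of (3.1.1) for $g(u) = u + 1$ by a composite Simpson rule and compares them with the value $44\sqrt{2}/15$.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def f(x):
    # Integrand of Example 3.1.2, extended to all of R via |x - 1|.
    return x * math.sqrt(abs(x - 1))

def g(u):
    # Substitution used in the example.
    return u + 1

def dg(u):
    # Derivative of g.
    return 1.0

lhs = simpson(f, g(0), g(2))                      # integral over [g(0), g(2)]
rhs = simpson(lambda u: f(g(u)) * dg(u), 0, 2)    # transformed integral
exact = 44 * math.sqrt(2) / 15

print(lhs, rhs, exact)
```

Because the integrand has an unbounded derivative at the left endpoint, the Simpson rule converges more slowly than usual here, so only a moderate agreement should be expected.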
Example 3.1.3. Calculate
\[ \int_1^2 \frac{\sin(\ln x)}{x}\,dx . \]
Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := e^u$ for all $u \in \mathbb{R}$. Then
$g$ is increasing and continuously differentiable with derivative $g'(u) = e^u$
for all $u \in \mathbb{R}$. In particular, $g(0) = 1$ and $g(\ln 2) = \exp(\ln 2) = 2$. Further,
we define the continuous function $f : (0, \infty) \to \mathbb{R}$ by $f(x) := \sin(\ln x)/x$
for all $x > 0$. Then, it follows by Theorem 3.1.1 that
\[
\int_1^2 \frac{\sin(\ln x)}{x}\,dx = \int_{g(0)}^{g(\ln 2)} f(x)\,dx = \int_0^{\ln 2} f(g(u))\, g'(u)\,du
= \int_0^{\ln 2} \frac{\sin(\ln e^u)}{e^u}\, e^u\,du = \int_0^{\ln 2} \sin u\,du
\]
\[
= [\, -\cos u \,]_0^{\ln 2} = 1 - \cos(\ln 2)
= 1 - \cos\Big( 2\, \frac{\ln 2}{2} \Big)
= 1 - \Big[ \cos^2\Big( \frac{\ln 2}{2} \Big) - \sin^2\Big( \frac{\ln 2}{2} \Big) \Big]
= 2 \sin^2\Big( \frac{\ln 2}{2} \Big)
\]
where, in particular, the addition theorem for the cosine was applied.
The further simplification of the result $1 - \cos(\ln 2)$ is
motivated by applications. Usually in applications, a calculation of the previous type is only a small step in a sequence of steps toward a final result.
Hence, typically, such result would be needed as input for the next step.
Therefore, it is useful to reduce results in their ‘size’ in order to avoid a
final result of even larger ‘size’. Usually, the implications of results of relatively large ‘size’ are less obvious. Note also that, the final expression
makes obvious the positivity of the integral which is due to the positivity
of the integrand in the interval of integration. The last can be seen from the
inequality
\[ 0 \le \ln x \le x - 1 \le \pi \]
for all $x \in [1, 2]$ where the inequality (2.5.12) for the case $a = 1$ was applied. Quite generally, such a consistency check of the signs of results can
avoid errors.
Also in this case, the application of change of variables could have been
avoided. Usually, for a successful application of the method of change of
variables, the presence of an ‘inner function’ in the integrand is needed.
The function g in Theorem 3.1.1 is then defined in such a way that that inner function is simplified. In many simple cases, the derivative of that inner
function is also present in the integrand. Often, this can be used to ‘guess’
an antiderivative F of the integrand. For instance in this case, an obvious
candidate for an inner function is the natural logarithm function ln. Since
$\ln'(x) = 1/x$ for all $x > 0$, we see that its derivative is also present in the
integrand. Hence a first guess (incorrect) for such $F$ might be
\[ F(x) := \sin(\ln x) \]
for all $x > 0$. Then it would follow by the chain rule for differentiation that
\[ F'(x) = \cos(\ln x) \cdot \frac{1}{x} = \frac{\cos(\ln x)}{x} \]
for all $x > 0$. $F'$ does not coincide with the integrand on the interval $[1, 2]$
because of the presence of the cosine function instead of the sine function.
Of course, there is a simple remedy for this. A second (correct) guess for
such $F$ would be
\[ F(x) := -\cos(\ln x) \]
for all $x > 0$.
As a consequence of the chain rule for differentiation, this
gives
\[ F'(x) = \sin(\ln x) \cdot \frac{1}{x} = \frac{\sin(\ln x)}{x} \]
and hence that $F$ is an antiderivative of the integrand. Hence, we conclude
by the fundamental theorem of calculus that
\[
\int_1^2 \frac{\sin(\ln x)}{x}\,dx = [\, -\cos(\ln x) \,]_1^2 = 1 - \cos(\ln 2) = 2 \sin^2\Big( \frac{\ln 2}{2} \Big) .
\]
We give now in succession four examples of more serious applications of
change of variables. The first three give standard trigonometric substitutions whose goal is the removal of square roots in integrands. The fourth
example gives a standard substitution that is used to transform rational expression in sine and cosine functions of the same argument into rational
expressions of the new variable.
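Example 3.1.4. Calculate
\[ \int_0^x \frac{dy}{\sqrt{y^2 + a^2}} \]
for every $x > 0$ where $a > 0$. Solution: Define $g : (-\pi/2, \pi/2) \to \mathbb{R}$ by
\[ g(\theta) := a \tan\theta \]
for all $\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously
differentiable such that
\[ g'(\theta) = a\, (1 + \tan^2\theta) \]
for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by
\[ g^{-1}(x) := \arctan\Big( \frac{x}{a} \Big) \]
for all $x \in \mathbb{R}$. By Theorem 3.1.1
\[
\int_0^x \frac{dy}{\sqrt{y^2 + a^2}} = \int_{g(0)}^{g(g^{-1}(x))} \frac{dy}{\sqrt{y^2 + a^2}}
= \int_0^{g^{-1}(x)} \frac{g'(\theta)\,d\theta}{\sqrt{(g(\theta))^2 + a^2}}
= \int_0^{g^{-1}(x)} \frac{d\theta}{\cos\theta}
\]
\[
= \ln\Big( \frac{1 + \sin(g^{-1}(x))}{\cos(g^{-1}(x))} \Big)
= \ln\Big( \frac{x}{a} + \sqrt{1 + \frac{x^2}{a^2}} \Big) .
\]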
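Example 3.1.5. Calculate
\[ \int_0^2 \sqrt{9 - x^2}\,dx . \]
Solution: Define $g : (-\pi/2, \pi/2) \to (-3, 3)$ by $g(\theta) := 3 \sin\theta$ for all
$\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously differentiable
such that
\[ g'(\theta) = 3 \cos\theta \]
for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by
\[ g^{-1}(x) = \arcsin\Big( \frac{x}{3} \Big) \]
for all $x \in (-3, 3)$. By Theorem 3.1.1
\[
\int_0^2 \sqrt{9 - x^2}\,dx = \int_{g(0)}^{g(\arcsin(2/3))} \sqrt{9 - x^2}\,dx
= \int_0^{\arcsin(2/3)} \sqrt{9 - (g(\theta))^2}\; g'(\theta)\,d\theta
= 9 \int_0^{\arcsin(2/3)} \cos^2\theta\,d\theta
\]
\[
= \frac{9}{2} \int_0^{\arcsin(2/3)} (1 + \cos(2\theta))\,d\theta
= \frac{9}{2} \Big[ \arcsin\Big( \frac{2}{3} \Big) + \frac{1}{2} \sin(2 \arcsin(2/3)) \Big]
\]
\[
= \frac{9}{2} \Big[ \arcsin\Big( \frac{2}{3} \Big) + \frac{2}{3} \cos(\arcsin(2/3)) \Big]
= \frac{9}{2} \arcsin\Big( \frac{2}{3} \Big) + \sqrt{5} .
\]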
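As a plausibility check of the trigonometric substitution in Example 3.1.5, the following Python sketch (an illustration added here, with a hypothetical helper `simpson`) compares the original integral, the transformed integral $9 \int_0^{\arcsin(2/3)} \cos^2\theta\,d\theta$ and the closed form $(9/2) \arcsin(2/3) + \sqrt{5}$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

alpha = math.asin(2 / 3)  # upper limit after the substitution x = 3 sin(theta)

direct = simpson(lambda x: math.sqrt(9 - x * x), 0.0, 2.0)
substituted = simpson(lambda t: 9 * math.cos(t) ** 2, 0.0, alpha)
closed_form = 4.5 * alpha + math.sqrt(5)

print(direct, substituted, closed_form)
```

All three numbers should agree up to the discretization error of the quadrature rule.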
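Example 3.1.6. Calculate
\[ \int_4^5 x^{-4} \sqrt{x^2 - 9}\,dx . \]
Solution: Define $g : (0, \pi/2) \to (3, \infty)$ by $g(\theta) := 3/\cos\theta$ for all $\theta \in
(0, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such
that
\[ g'(\theta) = \frac{3 \sin\theta}{\cos^2\theta} \]
for all $\theta \in (0, \pi/2)$. The inverse $g^{-1}$ is given by
\[ g^{-1}(x) = \arccos\Big( \frac{3}{x} \Big) \]
for all $x \in (3, \infty)$. By Theorem 3.1.1
\[
\int_4^5 x^{-4} \sqrt{x^2 - 9}\,dx = \int_{g(\arccos(3/4))}^{g(\arccos(3/5))} x^{-4} \sqrt{x^2 - 9}\,dx
= \int_{\arccos(3/4)}^{\arccos(3/5)} (g(\theta))^{-4} \sqrt{(g(\theta))^2 - 9}\; g'(\theta)\,d\theta
\]
\[
= \frac{1}{9} \int_{\arccos(3/4)}^{\arccos(3/5)} \cos\theta\, \sin^2\theta\,d\theta
= \frac{1}{27} \big[ \sin^3(\arccos(3/5)) - \sin^3(\arccos(3/4)) \big]
= \frac{1}{27} \big[ (4/5)^3 - (\sqrt{7}/4)^3 \big] .
\]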
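Example 3.1.7. Calculate
\[ \int_0^{\pi/2} \frac{d\theta}{5 + 4 \cos\theta} . \]
Solution: Define $g : \mathbb{R} \to (-\pi, \pi)$ by
\[ g(x) := 2 \arctan x \]
for all $x \in \mathbb{R}$. This is a standard substitution to transform a rational integrand in sin and cos into a rational integrand. Then $g$ is bijective as well as
continuously differentiable such that
\[ g'(x) = \frac{2}{1 + x^2} \]
for all $x \in \mathbb{R}$. The inverse $g^{-1}$ is given by
\[ g^{-1}(\theta) := \tan(\theta/2) \]
for all $\theta \in (-\pi, \pi)$. By Theorem 3.1.1
\[
\int_0^{\pi/2} \frac{d\theta}{5 + 4 \cos\theta} = \int_{g(0)}^{g(1)} \frac{d\theta}{5 + 4 \cos\theta}
= \int_0^1 \frac{g'(x)\,dx}{5 + 4 \cos(g(x))}
= \int_0^1 \frac{g'(x)\,dx}{5 + 4 \big[ 2 \cos^2\big( \frac{g(x)}{2} \big) - 1 \big]}
\]
\[
= \int_0^1 \frac{g'(x)\,dx}{5 + 4 \Big[ \frac{2}{1 + \tan^2( g(x)/2 )} - 1 \Big]}
= \int_0^1 \frac{\frac{2}{1 + x^2}\,dx}{5 + 4 \big[ \frac{2}{1 + x^2} - 1 \big]}
= 2 \int_0^1 \frac{dx}{9 + x^2}
= \frac{2}{3} \arctan\Big( \frac{1}{3} \Big) .
\]
Note that
\[ \cos(g(x)) = \frac{1 - x^2}{1 + x^2} , \quad \sin(g(x)) = \frac{2x}{1 + x^2} \]
for all $x \in \mathbb{R}$.
Fig. 66: Graphs of solutions of the differential equation (3.1.2) in the case that $a = 1$ with
initial values $-\pi$, $-\pi/2$, $\pi/2$ and $\pi$ at $x = 0$. Compare Example 3.1.8.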
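The substitution $\theta = 2 \arctan x$ of Example 3.1.7 can be checked numerically. The Python sketch below is an added illustration (the helper `simpson` is hypothetical); it compares the trigonometric integral, the rational integral $2 \int_0^1 dx/(9 + x^2)$ obtained after the substitution, and the closed form $(2/3) \arctan(1/3)$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Original integral in theta over [0, pi/2].
trig = simpson(lambda t: 1.0 / (5 + 4 * math.cos(t)), 0.0, math.pi / 2)

# Integral in x over [0, 1] after the substitution theta = 2 arctan(x).
rational = simpson(lambda x: 2.0 / (9 + x * x), 0.0, 1.0)

closed_form = (2 / 3) * math.atan(1 / 3)

print(trig, rational, closed_form)
```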
The following example gives a typical application of change of variables
to the solution of (‘separable’) ordinary differential equations of the first
order.
Example 3.1.8. Find solutions of the following differential equation for
$f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.
\[ f'(x) = a \sin(f(x)) \qquad (3.1.2) \]
for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0, \pi)$. Solution: If $f$ is such a function, it
follows that $f$ is continuously differentiable. Since $f(0) \in (0, \pi)$, the continuity of $f$ implies the existence of $c, d \in \mathbb{R}$ such that
$c < 0 < d$ and such that $f([c, d]) \subset (0, \pi)$. Since $a > 0$ and the sine
function is $\ge 0$ on the interval $[0, \pi]$, it follows from (3.1.2) that
\[
f(x_1) = f(x_1) - f(x_0) + f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x)\,dx
= f(x_0) + \int_{x_0}^{x_1} a \sin(f(x))\,dx \ge f(x_0)
\]
for all $x_0, x_1 \in [c, d]$ such that $x_0 \le x_1$. In addition, the restriction of $f$ to
$[c, d]$ is non-constant since the sine function has no zeros on $(0, \pi)$. Hence
we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c, d]$ that
\[
a (x - c) = \int_c^x a\,du = \int_c^x \frac{f'(u)}{\sin(f(u))}\,du = \int_{f(c)}^{f(x)} \frac{d\theta}{\sin(\theta)} .
\]
Further, it follows by use of the transformation $g$ from the previous Example 3.1.7 and Theorem 3.1.1 that
\[
\int_{f(c)}^{f(x)} \frac{d\theta}{\sin(\theta)}
= \int_{g(\tan(f(c)/2))}^{g(\tan(f(x)/2))} \frac{d\theta}{\sin(\theta)}
= \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{g'(x)}{\sin(g(x))}\,dx
= \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{dx}{x}
= \ln\Big( \frac{\tan(f(x)/2)}{\tan(f(c)/2)} \Big) .
\]
Hence it follows that
\[ a (x - c) = \ln\Big( \frac{\tan(f(x)/2)}{\tan(f(c)/2)} \Big) \qquad (3.1.3) \]
which leads to
\[ f(x) = 2 \arctan\Big( \tan\Big( \frac{f(c)}{2} \Big)\, e^{a(x - c)} \Big) . \qquad (3.1.4) \]
From (3.1.3) for $x = 0$, we conclude that
\[ \tan\Big( \frac{f(c)}{2} \Big) = e^{ac} \tan\Big( \frac{f(0)}{2} \Big) . \]
Substituting this identity into (3.1.4) gives
\[ f(x) = 2 \arctan\Big( \tan\Big( \frac{f(0)}{2} \Big)\, e^{ax} \Big) . \]
On the other hand, for every $c \in (-\pi, \pi)$, it follows by elementary calculation that $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := 2 \arctan\Big( \tan\Big( \frac{c}{2} \Big)\, e^{ax} \Big) \]
for all $x \in \mathbb{R}$ satisfies (3.1.2) and $f(0) = c$. As a side remark, note that for
every $k \in \mathbb{Z}$ the constant function of value $k\pi$ is a solution of (3.1.2). In
addition, if $f$ is a solution of (3.1.2), then for every $k \in \mathbb{Z}$ also $f_k : \mathbb{R} \to \mathbb{R}$
defined by $f_k(x) := f(x) + 2\pi k$ for every $x \in \mathbb{R}$ is a solution of (3.1.2).
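The solution formula of Example 3.1.8 can be tested numerically. The following Python sketch is an added illustration; the function names, the sample values $a = 1$, $f(0) = \pi/2$ and the step size of the difference quotient are assumptions made here. It approximates $f'$ by a central difference quotient and measures how well $f'(x) = a \sin(f(x))$ is satisfied on a grid.

```python
import math

def solution(x, f0, a):
    """Candidate solution f(x) = 2 arctan(tan(f0 / 2) * exp(a x))."""
    return 2 * math.atan(math.tan(f0 / 2) * math.exp(a * x))

def residual(x, f0, a, h=1e-5):
    """Difference between a central difference quotient of f and a sin(f(x))."""
    derivative = (solution(x + h, f0, a) - solution(x - h, f0, a)) / (2 * h)
    return derivative - a * math.sin(solution(x, f0, a))

a, f0 = 1.0, math.pi / 2          # sample parameter and initial value
grid = [0.1 * k for k in range(-30, 31)]
max_residual = max(abs(residual(x, f0, a)) for x in grid)
initial_value = solution(0.0, f0, a)

print(max_residual, initial_value)
```

A small residual supports, but of course does not prove, that the formula solves (3.1.2) with the prescribed initial value.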
For the motivation of the following theorem, we consider the map $R :
\mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ R(x, y) := (-x, y) \]
for all $(x, y) \in \mathbb{R}^2$. A geometrical interpretation of $R$ is that of a reflection
in the $y$-axis. This can be seen as follows. For this, let $(x, y)$ be some point
in $\mathbb{R}^2$. Then the line segment from $(x, y)$ to $R(x, y) = (-x, y)$, at the intersection $(0, y)$ with the $y$-axis, is at a right angle with the $y$-axis and both
points $(x, y)$ and $R(x, y)$ are at a distance $|x|$ from the $y$-axis. Therefore, $R$
meets the geometrical definition of the reflection in the $y$-axis.
Intuitively (according to elementary geometry), we would not expect that
such a reflection changes areas, i.e., if $S$ is some subset of $\mathbb{R}^2$ of area $A$, then
we expect that the set $R(S)$ has the same area. For instance, a rectangle
\[ [a, b] \times [c, d] \]
in $\mathbb{R}^2$, where $a \le b$ and $c \le d$, is mapped by $R$ into the rectangle
\[ R( [a, b] \times [c, d] ) = [-b, -a] \times [c, d] . \]
Both rectangles have the same area $(b - a)(d - c)$.
Fig. 67: The line segment from $(1, 3)$ to $R(1, 3) = (-1, 3)$ intersects the $y$-axis at a
right angle and is halved by that axis. The yellow rectangles are mapped onto each other
by $R$. Compare text.
Within the definition of Riemann-integrability above, we defined the area
under the graph of a bounded integrable $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are
such that $a < b$, that assumes only positive ($\ge 0$) values by
\[ \int_a^b f(x)\,dx . \]
We consider the associated function $\bar{f} : [-b, -a] \to \mathbb{R}$ defined by $\bar{f}(x) :=
f(-x)$ for all $x \in [-b, -a]$. We claim that the graph of $\bar{f}$ is the image of
the graph of $f$ under $R$, i.e.,
\[ G(\bar{f}) = R(G(f)) . \]
Indeed, if $x \in [-b, -a]$, then $-x \in [a, b]$ and
\[ (x, \bar{f}(x)) = (x, f(-x)) = R(-x, f(-x)) \in R(G(f)) . \]
Also, if $x \in [a, b]$, then $-x \in [-b, -a]$ and
\[ R(x, f(x)) = (-x, f(x)) = (-x, f(-(-x))) = (-x, \bar{f}(-x)) \in G(\bar{f}) . \]
Therefore, we expect that $\bar{f}$ is bounded, integrable and that the area under
the graph of $\bar{f}$ is equal to the area under the graph of $f$, i.e., that
\[ \int_a^b f(x)\,dx = \int_{-b}^{-a} f(-x)\,dx . \qquad (3.1.5) \]
Indeed, it is shown within the proof of the following theorem that this is
the case. Note that we can view this result as a kind of change of variables.
For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := -x$. Then $g$ is decreasing and
continuously differentiable with a derivative function which is constant of
value $-1$. Hence $g$ does not satisfy the assumptions of Theorem 3.1.1. In
particular, $g(-a) = a$ and $g(-b) = b$. A formal application of the change
of variable formula (3.1.1) would give
\[ \int_a^b f(x)\,dx = \int_{-a}^{-b} f(g(u))\, g'(u)\,du \quad \text{(incorrect)} \]
which does not make sense according to our definitions because $-a >
-b$. The correct formula (3.1.5) can be ‘obtained’ from this formula by
exchange of the integration limits.
Theorem 3.1.9. Let $f$ be a bounded Riemann-integrable function on $[a, b]$
where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then
\[ \int_a^b f(x)\,dx = \int_{-b}^{-a} f(-x)\,dx . \]
Fig. 68: The graphs of $([-2, -1] \to \mathbb{R}, x \mapsto (-x)^2)$ and $([1, 2] \to \mathbb{R}, x \mapsto x^2)$ are
reflection symmetric with respect to the $y$-axis. Compare text.
Proof. Define $\bar{f} : [-b, -a] \to \mathbb{R}$ by $\bar{f}(x) := f(-x)$ for all $x \in [-b, -a]$.
Then $\bar{f}$ is bounded, and for any partition $P = (a_0, \dots, a_\nu)$ of $[a, b]$ where
$\nu \in \mathbb{N}^*$, $a_0, \dots, a_\nu \in [a, b]$, $\bar{P} := (-a_\nu, \dots, -a_0)$ is a partition of
$[-b, -a]$, and in particular $L(f, P) = L(\bar{f}, \bar{P})$, $U(f, P) = U(\bar{f}, \bar{P})$.
Analogously, for any partition $P = (a_0, \dots, a_\nu)$ of $[-b, -a]$ where $\nu \in \mathbb{N}^*$,
$a_0, \dots, a_\nu \in [-b, -a]$, $\bar{P} := (-a_\nu, \dots, -a_0)$ is a partition of $[a, b]$, and
in particular $L(\bar{f}, P) = L(f, \bar{P})$, $U(\bar{f}, P) = U(f, \bar{P})$. Hence the set
consisting of the lower sums of $\bar{f}$ is equal to the set of lower sums of $f$
and the set consisting of the upper sums of $\bar{f}$ is equal to the corresponding
set of upper sums of $f$.
The following example displays a typical application of the previous theorem to functions $f$ that are defined on intervals that are symmetric to the
origin, i.e., of the form $[-a, a]$, where $a \ge 0$, as well as bounded, integrable
and antisymmetric, i.e., such that $f(-x) = -f(x)$ for all $x \in [-a, a]$. Their
integrals vanish.
Example 3.1.10. Calculate
\[ \int_{-1}^{1} 3 \sin(2x)\,dx . \]
Solution: By Theorem 3.1.9, it follows that
\[ \int_{-1}^{1} 3 \sin(2x)\,dx = \int_{-1}^{1} 3 \sin(-2x)\,dx = - \int_{-1}^{1} 3 \sin(2x)\,dx \]
and hence that
\[ \int_{-1}^{1} 3 \sin(2x)\,dx = 0 . \]
A variation of the previous reasoning is displayed in the next example.
Example 3.1.11. Calculate
\[ \int_0^{\pi} x \sin^2(x)\,dx . \]
Solution: First by Theorem 3.1.9, it follows that
\[ \int_0^{\pi} x \sin^2(x)\,dx = \int_{-\pi}^{0} (-x) \sin^2(-x)\,dx = - \int_{-\pi}^{0} x \sin^2(x)\,dx . \]
Further, it follows by Theorem 3.1.1 and Example 2.6.24 that
\[
\int_{-\pi}^{0} x \sin^2(x)\,dx = \int_0^{\pi} (y - \pi) \sin^2(y - \pi)\,dy
= \int_0^{\pi} y \sin^2(y)\,dy - \pi \int_0^{\pi} \sin^2(y)\,dy
= \int_0^{\pi} y \sin^2(y)\,dy - \frac{\pi^2}{2}
\]
and, finally, that
\[ \int_0^{\pi} x \sin^2(x)\,dx = \frac{\pi^2}{4} . \]
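Both symmetry arguments can be checked numerically. The following Python sketch is an added illustration with a hypothetical helper `simpson`; it approximates the integral of the antisymmetric integrand $3 \sin(2x)$ over $[-1, 1]$ and the integral of $x \sin^2(x)$ over $[0, \pi]$, which should be close to $0$ and $\pi^2/4$, respectively.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Antisymmetric integrand on an interval symmetric to the origin.
odd_integral = simpson(lambda x: 3 * math.sin(2 * x), -1.0, 1.0)

# Integral treated by reflection and translation in the text.
weighted = simpson(lambda x: x * math.sin(x) ** 2, 0.0, math.pi)

print(odd_integral, weighted, math.pi ** 2 / 4)
```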
Another typical application of Theorem 3.1.9 applies to functions $f$ defined
on intervals that are symmetric to the origin, i.e., of the form $[-a, a]$, where
$a \ge 0$, that are bounded, integrable and symmetric, i.e., such that $f(-x) = f(x)$ for all $x \in [-a, a]$. The value of the integral of such a function is
twice the value of the corresponding integral of its restriction to $[0, a]$.
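Example 3.1.12. Show that
\[ \int_{-\pi}^{\pi} \frac{\sin(x)}{x}\,dx = 2 \int_0^{\pi} \frac{\sin(x)}{x}\,dx . \]
Solution: By Corollary 2.6.18 and Theorem 3.1.9, it follows that
\[
\int_{-\pi}^{\pi} \frac{\sin(x)}{x}\,dx = \int_{-\pi}^{0} \frac{\sin(x)}{x}\,dx + \int_0^{\pi} \frac{\sin(x)}{x}\,dx
= \int_0^{\pi} \frac{\sin(-x)}{-x}\,dx + \int_0^{\pi} \frac{\sin(x)}{x}\,dx
= 2 \int_0^{\pi} \frac{\sin(x)}{x}\,dx .
\]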
Remark 3.1.13. The solution of the following problem n) from 1) illustrates
the general rule that one should never blindly rely on computer programs. In Mathematica 5.1, the command
Integraterpx^ 2 2x
4q^ t3{2u, tx, 1, 2us
gives the output
1
p68
16
27 Logr3sq
which is incorrect.
Problems
1) Calculate the value of the integral. For this, if the antiderivative of
the integrand is not obvious, use a suitable substitution.
»1
p2x 1q { dx , b)
»1
1 2
a)
0
264
0
u p2u
1q1{2 du
,
»1
c)
x p3x
1
g)
2
»2
1
x
0
x
0
tanpθq dθ
»4b
3
k)
3
»6c
b
»
3
7
o)
3
»2
q)
1
»π
2
?dx2
2
x 4
dx
?
x2 5 x2
x
a
1
0
»π 2
{
π{2
»2
,
x1{3
1
?x dx
,
x2
1
»π 2
{
r)
0
sinp2θq dθ
,
»2
sinpθq
0
?x
x
2
u 12
du
dx
,
4q3{2
,
2
,
dx
x
9
dθ
sinp3θq 2
» π{2
,
4u
u2
px2 2x
?dx2
t)
cos4 pθq
dθ
sin4 pθq cos4 pθq
dx
4
1
p)
,
2
n)
»3
,
j)
x1{2
l)
ds
,
»4
x P r0, π {2q ,
?x dx
3
m)
,
3
2
2
»
u)
s
2s2
px 2q2 sin x x 2 dx , f)
3
?
»1
»3
sinp xq
? dx , h)
u eu {2 du
i)
s)
?
1 2
»5
e)
»3
1q { dx , d)
2
,
,
dθ
2 cospθq
,
.
2) Let $a \in \mathbb{R}$, $f : [-a, a] \to \mathbb{R}$ be Riemann-integrable and $g : \mathbb{R} \to \mathbb{R}$
be Riemann-integrable over every interval $[b, c]$ where $b, c \in \mathbb{R}$ are
such that $b \le c$. Show that
a)
\[ \int_{-a}^{a} f(x)\,dx = 0 \]
if $f$ is antisymmetric, i.e., if $f(-x) = -f(x)$ for all $x \in [-a, a]$.
b)
\[ \int_{-a}^{a} f(x)\,dx = 2 \int_0^{a} f(x)\,dx \]
if $f$ is symmetric, i.e., if $f(-x) = f(x)$ for all $x \in [-a, a]$.
c)
\[ \int_b^c g(x)\,dx = \int_{b + \tau}^{c + \tau} g(x)\,dx , \]
if $b, c \in \mathbb{R}$ are such that $b \le c$ and $g$ is periodic with period
$\tau \ge 0$, i.e., if $g(x + \tau) = g(x)$ for all $x \in \mathbb{R}$.
3) Calculate the area in ( 8, 0 ]2 that is enclosed by the strophoid
C :
px, yq P R2 : pa xq y2 pa
xq x2
(
0
where a ¡ 0.
4) Find solutions of the following differential equation for f : R
with the specified initial values.
f 1 pxq 2 cospf pxqq
ÑR
3
for all x P R, f p0q P [ π, π q.
3.1.2 Integration by Parts
The method of integration by parts is based on the product rule for differentiation. For motivation, we consider continuous functions $F : [a, b] \to \mathbb{R}$
and $G : [a, b] \to \mathbb{R}$ whose restrictions to the open interval $(a, b)$ are differentiable with derivatives which can be extended to bounded Riemann-integrable functions $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$, respectively. Then
it follows by the fundamental theorem of calculus and the product rule for
differentiation that
\[
F(b) G(b) - F(a) G(a) = \int_a^b (F G)'(x)\,dx = \int_a^b [\, F'(x) G(x) + F(x) G'(x) \,]\,dx
\]
\[
= \int_a^b F'(x) G(x)\,dx + \int_a^b F(x) G'(x)\,dx
= \int_a^b f(x) G(x)\,dx + \int_a^b F(x) g(x)\,dx
\]
and hence that
\[ \int_a^b F(x) g(x)\,dx = F(b) G(b) - F(a) G(a) - \int_a^b f(x) G(x)\,dx . \]
We note the sign change and how antiderivatives, denoted by capital letters,
switch positions inside the integrals.
A typical application of the last formula consists in the following steps.
The integrand of a given integral needs to be represented as a product of
two functions. The first factor, $F$, will be differentiated in the process, and its derivative $f$ will appear as the first factor in the
transformed integrand. For the second factor, $g$, an antiderivative $G$ should be
available. That antiderivative will appear as the second factor in the transformed integrand. The final result is obtained in the form of a difference. The
minuend is given by the difference of the value of the product $F G$ at the upper limit of integration and the value of that product at the lower limit of integration. The
subtrahend is given by the integral over the original interval of integration
with the transformed integrand.
Theorem 3.1.14. (Integration by parts) Let $f$, $g$ be bounded Riemann-integrable functions on $[a, b]$ where $a$ and $b$ are elements of $\mathbb{R}$ such that
$a < b$. Further, let $F, G$ be continuous functions on $[a, b]$ which are differentiable on $(a, b)$ and such that $F'(x) = f(x)$ and $G'(x) = g(x)$ for all
$x \in (a, b)$. Then
\[ \int_a^b F(x) g(x)\,dx = F(b) G(b) - F(a) G(a) - \int_a^b f(x) G(x)\,dx . \]
Proof. First as a consequence of Theorem 2.6.13, $f G$ and $F g$ are both
Riemann-integrable as products of Riemann-integrable functions. Moreover, $F G$ is continuous and differentiable such that $(F G)'(x) = f(x) G(x) +
F(x) g(x)$ for all $x \in (a, b)$, and $f G + F g$ is Riemann-integrable by Theorem 2.6.8 as a sum of Riemann-integrable functions. Hence by Theorem 2.6.21
\[
\int_a^b f(x) G(x)\,dx + \int_a^b F(x) g(x)\,dx = \int_a^b [\, f(x) G(x) + F(x) g(x) \,]\,dx
= F(b) G(b) - F(a) G(a) .
\]
The first example gives a typical application of integration by parts where
the occurrence of the derivative of the first factor in the transformed integrand is used to lower the order of a polynomial appearing in the original
integral.
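Example 3.1.15. Calculate
\[ \int_0^{\pi} x \cos(3x)\,dx . \]
Solution: Define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by
\[ F(x) := x , \quad g(x) := \cos(3x) , \quad f(x) := 1 , \quad G(x) := \frac{1}{3} \sin(3x) \]
for all $x \in [0, \pi]$. Hence by Theorems 3.1.14, 2.6.21:
\[
\int_0^{\pi} x \cos(3x)\,dx = - \frac{1}{3} \int_0^{\pi} \sin(3x)\,dx
= \frac{1}{9} \cos(3\pi) - \frac{1}{9} \cos(0) = - \frac{2}{9} .
\]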
Another typical application consists in a repeated use of integration by parts
until the original integral reappears, but multiplied by a factor which is
different from 1. In such a case the resulting equation can be solved for the
original integral.
Example 3.1.16. Calculate
\[ \int_0^{\pi} e^x \sin(2x)\,dx . \]
Solution: Define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by
\[ F(x) := e^x , \quad g(x) := \sin(2x) , \quad f(x) := e^x , \quad G(x) := - \frac{1}{2} \cos(2x) \]
for all $x \in [0, \pi]$. Then by Theorem 3.1.14,
\[
\int_0^{\pi} e^x \sin(2x)\,dx = \frac{1}{2}\, (1 - e^{\pi}) + \frac{1}{2} \int_0^{\pi} e^x \cos(2x)\,dx . \qquad (3.1.6)
\]
To determine the last integral, define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by
\[ F(x) := e^x , \quad g(x) := \cos(2x) , \quad f(x) := e^x , \quad G(x) := \frac{1}{2} \sin(2x) \]
for all $x \in [0, \pi]$. Then by Theorem 3.1.14,
\[
\frac{1}{2} \int_0^{\pi} e^x \cos(2x)\,dx = - \frac{1}{4} \int_0^{\pi} e^x \sin(2x)\,dx \qquad (3.1.7)
\]
and hence by (3.1.6), (3.1.7) finally:
\[ \int_0^{\pi} e^x \sin(2x)\,dx = - \frac{2}{5}\, (e^{\pi} - 1) . \]
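The formula of Theorem 3.1.14 lends itself to a direct numerical check. The following Python sketch is an added illustration; the helpers `simpson` and `by_parts` are hypothetical names introduced here. `by_parts` evaluates the right-hand side $F(b)G(b) - F(a)G(a) - \int_a^b f(x) G(x)\,dx$, which is then compared with the values $-2/9$ and $-(2/5)(e^{\pi} - 1)$ found in Examples 3.1.15 and 3.1.16.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def by_parts(F, G, f, a, b):
    """Right-hand side of the integration by parts formula."""
    boundary = F(b) * G(b) - F(a) * G(a)
    return boundary - simpson(lambda x: f(x) * G(x), a, b)

# Example 3.1.15: F(x) = x, G(x) = sin(3x)/3, f(x) = 1.
ex15 = by_parts(lambda x: x, lambda x: math.sin(3 * x) / 3,
                lambda x: 1.0, 0.0, math.pi)

# Example 3.1.16, first step: F(x) = e^x, G(x) = -cos(2x)/2, f(x) = e^x.
ex16 = by_parts(math.exp, lambda x: -math.cos(2 * x) / 2,
                math.exp, 0.0, math.pi)

closed15 = -2 / 9
closed16 = -0.4 * (math.exp(math.pi) - 1)

print(ex15, closed15)
print(ex16, closed16)
```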
Of course, every integrand can be represented by its product with the constant function of value 1. Such a representation can sometimes lead to a
successful application of the method of partial integration as in the following example.
Example 3.1.17. Calculate
\[ \int_1^{e} \ln(4x)\,dx . \]
Solution: Define $F, G, f, g : [1, e] \to \mathbb{R}$ by
\[ F(x) := \ln(4x) , \quad g(x) := 1 , \quad f(x) := \frac{1}{x} , \quad G(x) := x \]
for all $x \in [1, e]$. Then by Theorem 3.1.14,
\[
\int_1^{e} \ln(4x)\,dx = (1 + \ln 4)\, e - \ln 4 - \int_1^{e} dx = (e - 1) \ln 4 + 1 .
\]
Often, the method of partial integration can be used to derive a recursion
relation for an integral containing a parameter. Such a case is considered in
the following example. In particular, its result will lead to the subsequent
Wallis’ product representation of π.
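Example 3.1.18. Calculate
\[ I_n := \int_0^{\pi} \sin^n(x)\,dx \]
for $n \in \mathbb{N}^*$. Solution: For $n = 1, 2$, we conclude that
\[
\int_0^{\pi} \sin(x)\,dx = [\, -\cos(x) \,]_0^{\pi} = 2 , \quad
\int_0^{\pi} \sin^2(x)\,dx = \frac{1}{2} \int_0^{\pi} [\, 1 - \cos(2x) \,]\,dx
= \frac{1}{2} \Big[ x - \frac{1}{2} \sin(2x) \Big]_0^{\pi} = \frac{\pi}{2} .
\]
For $n \ge 3$, we conclude by partial integration that
\[
I_n = \int_0^{\pi} \sin^n(x)\,dx = \int_0^{\pi} \sin^{n-1}(x) \sin(x)\,dx
= \big[ \sin^{n-1}(x)\, ( -\cos(x) ) \big]_0^{\pi} - \int_0^{\pi} (n - 1) \sin^{n-2}(x) \cos(x)\, [\, -\cos(x) \,]\,dx
\]
\[
= (n - 1) \int_0^{\pi} \sin^{n-2}(x) \cos^2(x)\,dx
= (n - 1) \int_0^{\pi} \sin^{n-2}(x)\, [\, 1 - \sin^2(x) \,]\,dx = (n - 1) (I_{n-2} - I_n)
\]
and hence that
\[ I_n = \frac{n - 1}{n}\, I_{n-2} . \]
Hence we conclude by induction that
\[
I_{2k+1} = 2 \cdot \frac{2}{3} \cdots \frac{2k}{2k + 1} , \quad
I_{2k} = \pi \cdot \frac{1}{2} \cdots \frac{2k - 1}{2k}
\]
for all $k \in \mathbb{N} \setminus \{0, 1\}$.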
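The recursion $I_n = \frac{n-1}{n} I_{n-2}$ with $I_1 = 2$ and $I_2 = \pi/2$ can be compared with a direct numerical evaluation of the integrals. The Python sketch below is an added illustration; `simpson` and `wallis_integral` are names introduced here and not taken from the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def wallis_integral(n):
    """I_n from the recursion I_n = (n - 1)/n * I_{n-2}, I_1 = 2, I_2 = pi/2."""
    value = 2.0 if n % 2 else math.pi / 2
    start = 3 if n % 2 else 4
    for m in range(start, n + 1, 2):
        value *= (m - 1) / m
    return value

# Largest deviation between recursion and quadrature for n = 1, ..., 10.
max_deviation = max(
    abs(wallis_integral(n) - simpson(lambda x, n=n: math.sin(x) ** n, 0.0, math.pi))
    for n in range(1, 11)
)

print(wallis_integral(3), wallis_integral(4), max_deviation)
```

For instance, the recursion gives $I_3 = 4/3$ and $I_4 = 3\pi/8$.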
The result from the previous example leads to John Wallis’ product representation of π which will be used in the subsequent derivation of Stirling’s
formula and in the calculation of Gaussian integrals.
Fig. 69: Sequences $a_1, a_2, \dots$ and $b_1, b_2, \dots$ from the proof of Wallis’ product representation for π, Theorem 3.1.19, that converge to π from below and above, respectively.
Theorem 3.1.19. (Wallis’ product representation of π, 1656, [98])
\[
\lim_{k \to \infty} 4 (k + 1) \Big[ \frac{2}{3} \cdots \frac{2k}{2k + 1} \Big]^2 = \pi .
\]
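Proof. In this, we are using the notation from the previous example. Since
$0 \le \sin(x) \le 1$ for all $x \in [0, \pi]$, it follows that
\[ \sin^{n+1}(x) = \sin(x) \sin^n(x) \le \sin^n(x) \]
for all $x \in [0, \pi]$ and hence that
\[ I_{n+1} = \int_0^{\pi} \sin^{n+1}(x)\,dx \le \int_0^{\pi} \sin^n(x)\,dx = I_n \]
for all $n \in \mathbb{N}^*$. As a consequence,
\[
2 \cdot \frac{2}{3} \cdots \frac{2k}{2k + 1} = I_{2k+1} \le I_{2k} = \pi \cdot \frac{1}{2} \cdots \frac{2k - 1}{2k}
\le I_{2k-1} = 2 \cdot \frac{2}{3} \cdots \frac{2(k - 1)}{2k - 1}
\]
and
\[
a_k := (4k + 2) \Big[ \frac{2}{3} \cdots \frac{2k}{2k + 1} \Big]^2
= 2 \cdot \frac{2}{3} \cdots \frac{2k}{2k + 1} \cdot \frac{2}{1} \cdot \frac{4}{3} \cdots \frac{2k}{2k - 1} \le \pi
\]
\[
\le 2 \cdot \frac{2}{3} \cdots \frac{2(k - 1)}{2k - 1} \cdot \frac{2}{1} \cdot \frac{4}{3} \cdots \frac{2(k - 1)}{2k - 3} \cdot \frac{2k}{2k - 1}
= 4k \Big[ \frac{2}{3} \cdots \frac{2(k - 1)}{2k - 1} \Big]^2 =: b_k
\]
for $k \in \mathbb{N} \setminus \{0, 1, 2\}$. Further,
\[
\frac{a_{k+1}}{a_k} = \frac{4k + 6}{4k + 2} \Big( \frac{2(k + 1)}{2k + 3} \Big)^2
= \frac{8 (k + 1)^2}{(4k + 2)(2k + 3)} = \frac{8k^2 + 16k + 8}{8k^2 + 16k + 6} > 1 ,
\]
\[
\frac{b_{k+1}}{b_k} = \frac{4 (k + 1)}{4k} \Big( \frac{2k}{2k + 1} \Big)^2 = \frac{4k^2 + 4k}{4k^2 + 4k + 1} < 1 ,
\]
\[
\frac{b_k}{a_k} = \frac{4k}{4k + 2} \Big( \frac{2k + 1}{2k} \Big)^2 = \frac{2k + 1}{2k} = 1 + \frac{1}{2k}
\]
for all $k \in \mathbb{N} \setminus \{0, 1, 2\}$. Hence the sequences $a_3, a_4, \dots$ and $b_3, b_4, \dots$ are convergent,
as an increasing sequence that is bounded from above by π and a decreasing
sequence that is bounded from below by π, respectively, and, since $b_k / a_k$ converges to $1$, converge to
the same limit π.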
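The two-sided bounds from this proof can be observed numerically. The following Python sketch is an added illustration that assumes the sequences $a_k := (4k + 2) [\frac{2}{3} \cdots \frac{2k}{2k+1}]^2$ and $b_k := 4k [\frac{2}{3} \cdots \frac{2(k-1)}{2k-1}]^2$; the function name `wallis_bounds` is hypothetical.

```python
import math

def wallis_bounds(k):
    """Return the pair (a_k, b_k) enclosing pi from below and above."""
    p = 1.0
    # p = 2/3 * 4/5 * ... * 2(k-1)/(2k-1)
    for j in range(1, k):
        p *= 2 * j / (2 * j + 1)
    b_k = 4 * k * p * p
    # Extend the product by the factor 2k/(2k+1).
    p *= 2 * k / (2 * k + 1)
    a_k = (4 * k + 2) * p * p
    return a_k, b_k

for k in (3, 10, 100, 1000):
    a_k, b_k = wallis_bounds(k)
    print(k, a_k, b_k)

lower, upper = wallis_bounds(1000)
```

Since $b_k / a_k = 1 + 1/(2k)$, the enclosure of π shrinks only like $1/k$, so the product is of theoretical rather than computational interest.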
Essentially as an application of Wallis’ product formula, we prove Stirling’s
asymptotic formula for factorials which is often used in applications .
Theorem 3.1.20. (Stirling’s formula, 1730, [92])
\[
\lim_{n \to \infty} \frac{n!}{\sqrt{n}} \Big( \frac{e}{n} \Big)^n = \sqrt{2\pi} . \qquad (3.1.8)
\]
Fig. 70: Graph of $( (0, \infty) \to \mathbb{R}, x \mapsto \Gamma(x + 1) / [\, (x/e)^x \sqrt{2\pi x} \,] )$. Note that $\Gamma(n + 1) = n!$
for every $n \in \mathbb{N}$. See Theorem 3.1.20.
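Proof. First, we notice that ln is concave since $\ln''(x) = -1/x^2 < 0$ for all
$x > 0$. Hence it follows by Theorem 2.5.33 that
\[
\int_x^{x+1} \ln(y)\,dy \ge \int_x^{x+1} \Big[ \ln(x) + (y - x) \ln\Big( \frac{x + 1}{x} \Big) \Big]\,dy
= \ln(x) + \frac{1}{2} \ln\Big( \frac{x + 1}{x} \Big) = \frac{1}{2}\, [\, \ln(x) + \ln(x + 1) \,]
\]
for all $x > 0$. In addition, it follows from the Definition 2.5.29 of the
concavity of a differentiable function that
\[
\ln(y) \le \ln(x) + \frac{y - x}{x} , \quad \ln(y) \le \ln(x + 1) + \frac{y - (x + 1)}{x + 1} ,
\]
where $x > 0$ and $y > 0$, and hence that
\[
\int_x^{x+1} \ln(y)\,dy \le \ln(x) + \frac{1}{2x} , \quad
\int_x^{x+1} \ln(y)\,dy \le \ln(x + 1) - \frac{1}{2 (x + 1)}
\]
and
\[
\int_x^{x+1} \ln(y)\,dy \le \frac{1}{2}\, [\, \ln(x) + \ln(x + 1) \,] + \frac{1}{4} \Big( \frac{1}{x} - \frac{1}{x + 1} \Big) .
\]
Hence it follows that
\[
0 \le \int_x^{x+1} \ln(y)\,dy - \frac{1}{2}\, [\, \ln(x) + \ln(x + 1) \,] \le \frac{1}{4} \Big( \frac{1}{x} - \frac{1}{x + 1} \Big)
\]
for all $x > 0$ and hence that
\[
0 \le \int_1^{n} \ln(y)\,dy - \frac{1}{2} \sum_{k=1}^{n-1} [\, \ln(k) + \ln(k + 1) \,]
\le \frac{1}{4} \sum_{k=1}^{n-1} \Big( \frac{1}{k} - \frac{1}{k + 1} \Big) = \frac{1}{4} \Big( 1 - \frac{1}{n} \Big) \le \frac{1}{4} .
\]
Therefore, we conclude that the sequence $S_1, S_2, \dots$, where
\[
S_n := \int_1^{n} \ln(y)\,dy - \frac{1}{2} \sum_{k=1}^{n-1} [\, \ln(k) + \ln(k + 1) \,]
= \int_1^{n} \ln(y)\,dy - \ln(n!) + \frac{\ln(n)}{2}
\]
\[
= [\, y \ln(y) - y \,]_1^{n} - \ln(n!) + \frac{\ln(n)}{2}
= n \ln(n) - (n - 1) - \ln(n!) + \frac{\ln(n)}{2}
= 1 - \ln\Big[ \frac{n!}{\sqrt{n}} \Big( \frac{e}{n} \Big)^n \Big]
\]
for every $n \in \mathbb{N}^*$, is increasing as well as bounded from above and therefore
convergent to an element of the closed interval from $0$ to $1/4$. Hence it
follows also the existence of
\[ \lim_{n \to \infty} \frac{n!}{\sqrt{n}} \Big( \frac{e}{n} \Big)^n \]
which will be denoted by $a$ in the following. For the determination of its
value, we use Wallis’ product. According to Theorem 3.1.19
\[
\frac{\pi}{2} = \lim_{k \to \infty} 2 (k + 1) \Big[ \frac{2}{3} \cdots \frac{2k}{2k + 1} \Big]^2
= \lim_{k \to \infty} \frac{2 (k + 1)}{2k + 1} \cdot \frac{(2^k k!)^4}{(2k + 1)\, [(2k)!]^2}
= \lim_{k \to \infty} \frac{(2^k k!)^4}{(2k + 1)\, [(2k)!]^2}
\]
\[
= \lim_{k \to \infty} \frac{k}{2 (2k + 1)} \Big[ \frac{k!}{\sqrt{k}} \Big( \frac{e}{k} \Big)^k \Big]^4
\Big[ \frac{\sqrt{2k}}{(2k)!} \Big( \frac{2k}{e} \Big)^{2k} \Big]^2 = \frac{a^2}{4} .
\]
Hence it follows that $a = \sqrt{2\pi}$ and, finally, (3.1.8).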
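The convergence stated in (3.1.8) can be observed numerically. The following Python sketch is an added illustration; `stirling_ratio` is a name introduced here. To avoid overflow of $n!$ for large $n$, it evaluates the logarithm of the quotient with the help of `math.lgamma`, using that $\Gamma(n + 1) = n!$.

```python
import math

def stirling_ratio(n):
    """Return n! / (sqrt(n) * (n / e)**n), computed via logarithms."""
    log_ratio = math.lgamma(n + 1) - 0.5 * math.log(n) - n * math.log(n) + n
    return math.exp(log_ratio)

limit = math.sqrt(2 * math.pi)
for n in (1, 10, 100, 10**4, 10**6):
    print(n, stirling_ratio(n), limit)
```

For $n = 1$ the quotient equals $e$, and the printed values decrease towards $\sqrt{2\pi} \approx 2.5066$.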
The example below gives another application of the method of partial integration to an integrand containing a parameter which leads to Euler’s
famous product representations of the sine and the cosine. These representations will be used later on in the proof of the reflection formula for the
gamma function. For the formulation of these representations, we need to
introduce the product symbol.
Definition 3.1.21. (Product symbol) If $I$ is some non-empty finite index
set and $a_i \in \mathbb{R}$ for every $i \in I$, the symbol
\[ \prod_{i \in I} a_i \]
denotes the product of all $a_i$ where $i$ runs through the elements of $I$. Note
that, as a consequence of the commutativity and associativity of multiplication, the order in which the products are performed is inessential.
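Example 3.1.22. (Euler’s product representation of the sine and cosine,
1748, [38]) Show that for every $x \in \mathbb{R}$
\[
\sin\Big( \frac{\pi x}{2} \Big) = \frac{\pi x}{2} \lim_{n \to \infty} \prod_{k=1}^{n} \Big( 1 - \frac{x^2}{4k^2} \Big) , \quad
\cos\Big( \frac{\pi x}{2} \Big) = \lim_{n \to \infty} \prod_{k=0}^{n} \Big( 1 - \frac{x^2}{(2k + 1)^2} \Big) . \qquad (3.1.9)
\]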
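Solution: For this, we define for every $n \in \mathbb{N}$ a corresponding $I_n : \mathbb{R} \to \mathbb{R}$ by
$$I_n(x) := \int_0^{\pi/2} \cos(xt) \cos^n(t)\, dt$$
for every $x \in \mathbb{R}$. In particular, this implies that
$$I_0(x) = \frac{\pi}{2} \begin{cases} 1 & \text{if } x = 0 \\ \sin(\pi x/2)/(\pi x/2) & \text{if } x \neq 0 \end{cases} ,$$
$$I_1(x) = \int_0^{\pi/2} \cos(xt) \cos(t)\, dt = \frac{1}{2} \int_0^{\pi/2} [\cos((x+1)t) + \cos((x-1)t)]\, dt = \frac{1}{2} \left[ \frac{\sin((x+1)t)}{x+1} + \frac{\sin((x-1)t)}{x-1} \right]_0^{\pi/2} = \frac{\cos(\pi x/2)}{1 - x^2}$$
for $x \notin \{-1, 1\}$,
$$I_1(x) = \int_0^{\pi/2} \cos^2(t)\, dt = \frac{1}{2} \int_0^{\pi/2} [1 + \cos(2t)]\, dt = \frac{1}{2} \left[ t + \frac{1}{2} \sin(2t) \right]_0^{\pi/2} = \frac{\pi}{4}$$
for $x \in \{-1, 1\}$ and hence
$$I_1(x) = \begin{cases} \pi/4 & \text{if } x \in \{-1, 1\} \\ \cos(\pi x/2)/(1 - x^2) & \text{if } x \notin \{-1, 1\} \end{cases} .$$
In the following, let $x \in \mathbb{R}$. For $n \in \mathbb{N} \setminus \{0, 1\}$, we conclude by partial integration that
$$x^2 I_n(x) = x \int_0^{\pi/2} \cos^n(t)\, x \cos(xt)\, dt = \left[ x \cos^n(t) \sin(xt) \right]_0^{\pi/2} + n \int_0^{\pi/2} \sin(t) \cos^{n-1}(t)\, x \sin(xt)\, dt$$
$$= 0 - \left[ n \sin(t) \cos^{n-1}(t) \cos(xt) \right]_0^{\pi/2} + n \int_0^{\pi/2} \left[ \cos^n(t) - (n-1) \sin^2(t) \cos^{n-2}(t) \right] \cos(xt)\, dt$$
$$= n \int_0^{\pi/2} \left[ n \cos^n(t) - (n-1) \cos^{n-2}(t) \right] \cos(xt)\, dt = n^2 I_n(x) - n(n-1) I_{n-2}(x) .$$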
Therefore, it follows that
$$I_{n-2}(x) = \frac{n^2 - x^2}{n(n-1)}\, I_n(x)$$
and hence that
$$\frac{I_{n-2}(x)}{I_{n-2}(0)} = \left(1 - \frac{x^2}{n^2}\right) \frac{I_n(x)}{I_n(0)} .$$
From this, it follows by induction that
$$\frac{I_0(x)}{I_0(0)} = \frac{I_{2n}(x)}{I_{2n}(0)} \prod_{k=1}^{n} \left(1 - \frac{x^2}{(2k)^2}\right) , \quad \frac{I_1(x)}{I_1(0)} = \frac{I_{2n+1}(x)}{I_{2n+1}(0)} \prod_{k=1}^{n} \left(1 - \frac{x^2}{(2k+1)^2}\right)$$
for every $n \in \mathbb{N}^*$. In the following, we show that
$$\lim_{n \to \infty} \frac{I_n(x)}{I_n(0)} = 1 . \tag{3.1.10}$$
For this, we note that
$$|\cos(xt) - 1| = |\cos(|x|t) - 1| = \left| \int_0^{|x|t} \sin(s)\, ds \right| \le |x|\, t$$
for $t \ge 0$. Hence it follows for every $n \in \mathbb{N}^*$ that
$$|I_n(x) - I_n(0)| = \left| \int_0^{\pi/2} [\cos(xt) - 1] \cos^n(t)\, dt \right| \le |x| \int_0^{\pi/2} t \cos(t) \cos^{n-1}(t)\, dt \le |x| \int_0^{\pi/2} \sin(t) \cos^{n-1}(t)\, dt = \frac{|x|}{n}$$
where it has been used that
$$t \cos(t) \le \sin(t)$$
for $0 \le t \le \pi/2$. Hence it follows (3.1.10) and, finally, (3.1.9). For this, note that the second relation in (3.1.9) is trivially satisfied for $x \in \{-1, 1\}$.
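As a numerical illustration of (3.1.9), the partial products can be compared with the values of the sine and cosine. This is a sketch only; the function names `sine_product` and `cosine_product` and the sample point are illustrative choices. The convergence is slow, with a remainder of roughly order $x^2/(4n)$.

```python
import math

def sine_product(x: float, n: int) -> float:
    """(pi*x/2) * prod_{k=1}^{n} (1 - x^2/(4 k^2))."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= 1.0 - x * x / (4.0 * k * k)
    return 0.5 * math.pi * x * prod

def cosine_product(x: float, n: int) -> float:
    """prod_{k=0}^{n} (1 - x^2/(2k+1)^2)."""
    prod = 1.0
    for k in range(0, n + 1):
        prod *= 1.0 - x * x / float((2 * k + 1) ** 2)
    return prod

x = 0.7  # sample point, chosen arbitrarily
for n in (10, 100, 1000):
    print(n, sine_product(x, n) - math.sin(math.pi * x / 2),
          cosine_product(x, n) - math.cos(math.pi * x / 2))
```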
The following application of the method of partial integration to an integrand containing a parameter leads to a recursion formula that will be used in the method of integration of rational expressions by decomposition into partial fractions displayed in the next section.
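Example 3.1.23. Let $m$ be some natural number $\ge 1$, $a \in \mathbb{R}$ and $c > 0$. Define $F, G, f, g : \mathbb{R} \to \mathbb{R}$ by
$$F(y) := (y^2 + c^2)^{-m} , \quad G(y) := y , \quad g(y) := 1 , \quad f(y) := -2my\,(y^2 + c^2)^{-(m+1)}$$
for all $y \in \mathbb{R}$. Then by Theorem 3.1.14 for every $x > a$
$$\int_a^x \frac{dy}{(y^2 + c^2)^m} = \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m \int_a^x \frac{y^2\, dy}{(y^2 + c^2)^{m+1}}$$
$$= \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m \int_a^x \frac{dy}{(y^2 + c^2)^m} - 2mc^2 \int_a^x \frac{dy}{(y^2 + c^2)^{m+1}}$$
and hence it follows the recursion (or 'reduction') formula
$$\int_a^x \frac{dy}{(y^2 + c^2)^{m+1}} = \frac{1}{2mc^2} \left[ \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} \right] + \frac{2m - 1}{2mc^2} \int_a^x \frac{dy}{(y^2 + c^2)^m} ,$$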
which is used in the method of integration by decomposition into partial
fractions below.
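The reduction formula lends itself to a short numerical sketch. The names `integral_power` and `simpson` and the sample values of $a$, $x$, $c$ below are illustrative assumptions; the base case $m = 1$ uses the antiderivative $\arctan(y/c)/c$, and a composite Simpson sum serves only as an independent cross-check.

```python
import math

def integral_power(m, a, x, c):
    """Integral of (y^2 + c^2)^(-m) over [a, x] via the reduction formula (m >= 1, c > 0)."""
    value = (math.atan(x / c) - math.atan(a / c)) / c  # base case m = 1
    for k in range(1, m):
        boundary = x / (x * x + c * c) ** k - a / (a * a + c * c) ** k
        # one step of the recursion: exponent k -> exponent k + 1
        value = boundary / (2 * k * c * c) + (2 * k - 1) / (2 * k * c * c) * value
    return value

def simpson(f, a, b, n=2000):
    """Composite Simpson sum with an even number n of steps (cross-check only)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, x, c = 0.5, 2.0, 1.5  # sample values, chosen arbitrarily
for m in (1, 2, 3, 4):
    print(m, integral_power(m, a, x, c),
          simpson(lambda y: (y * y + c * c) ** (-m), a, x))
```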
The following final example gives another typical application of the method of partial integration. In this case, too, the integrand contains a parameter. The method is used to derive an estimate for a special function, a Bessel function, defined in terms of an integral. It is a remarkable fact that estimates even of elementary functions are often easier to achieve with the help of integral representations.
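Example 3.1.24. Show that
$$|J_n(x)| \le \frac{2}{\pi}\, \frac{x}{n^2 - x^2}$$
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$ such that $0 \le x < n$. Solution: Define
$$F(\theta) := \frac{1}{x \cos\theta - n} , \quad g(\theta) := (x \cos\theta - n) \cos(x \sin\theta - n\theta) ,$$
$$f(\theta) := \frac{x \sin(\theta)}{(x \cos\theta - n)^2} , \quad G(\theta) := \sin(x \sin\theta - n\theta)$$
for all $\theta \in [0, \pi]$. Then by Theorem 3.1.14,
$$J_n(x) = \frac{1}{\pi} \int_0^\pi \cos(x \sin\theta - n\theta)\, d\theta = - \frac{1}{\pi} \int_0^\pi \frac{x \sin(\theta)}{(x \cos\theta - n)^2}\, \sin(x \sin\theta - n\theta)\, d\theta ,$$
and hence
$$|J_n(x)| \le \frac{1}{\pi} \int_0^\pi \frac{x \sin(\theta)}{(x \cos\theta - n)^2}\, d\theta = \frac{2}{\pi}\, \frac{x}{n^2 - x^2} .$$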
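The estimate can be checked numerically from the integral representation of $J_n$ used above. The sketch below evaluates that integral with a composite trapezoid sum; the names `bessel_j` and `bound` and the number of steps are illustrative assumptions.

```python
import math

def bessel_j(n, x, steps=400):
    """J_n(x) = (1/pi) * integral over [0, pi] of cos(x sin(t) - n t), trapezoid sum."""
    h = math.pi / steps
    # endpoint values: cos(0) at t = 0 and cos(-n*pi) at t = pi, each with weight 1/2
    total = 0.5 * (1.0 + math.cos(-n * math.pi))
    for i in range(1, steps):
        t = i * h
        total += math.cos(x * math.sin(t) - n * t)
    return total * h / math.pi

def bound(n, x):
    """Right hand side (2/pi) * x / (n^2 - x^2) of the estimate, for 0 <= x < n."""
    return (2.0 / math.pi) * x / (n * n - x * x)

for n in (1, 2, 3):
    for x in (0.0, 0.25 * n, 0.5 * n, 0.9 * n):
        print(n, x, abs(bessel_j(n, x)), bound(n, x))
```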
Problems
1) Calculate the value of the integral. In this, where applicable, n P N .
»3
4t e5t dt , b)
2
a)
0
» π{2
0
ϕ r sinp2ϕq
3 cosp7ϕq s dϕ
,
»π
c)
0
»1
e)
0
»π
g)
0
»2
eϕ cosp2ϕq dϕ , d)
x2 arctanp3xq dx
x sinpnxq dx
,
» 1{?2
1
f)
»3
,
lnp2xq
dx
x2
h)
1
0
,
lnp2x2
xn lnpxq dx
1q dx
,
.
2) Derive a reduction formula where the integral is expressed in terms
of the same integral with a smaller n. In this n P N , a P R, x ¥ a
and, where applicable, m P N, b, c P R .
»x
a)
»ax
y n eby dy
c)
»ax
e)
»ax
g)
a
»x
sinn py q dy
,
b)
»x
,
d)
a
y n cospby q dy
,
a
cosn py q dy
y n sinpby q dy
»x
f)
a
ecy sinpby q dy
,
,
,
y m rlnpy qsn dy
»x
h)
a
,
ecy cospby q dy
3) Let $I$ be some non-empty open interval of $\mathbb{R}$, $h : I \to \mathbb{R}$ a map and $a, b \in I$ be such that $a < b$.
a) If $h$ is twice differentiable on $I$ and such that $h(a) = h(b) = 0$, show that
$$\int_a^b h(x)\, dx = \frac{1}{2} \int_a^b (x - b)(x - a)\, h''(x)\, dx .$$
b) If $h$ is four times differentiable on $I$ and such that $h(a) = h(b) = h'(a) = h'(b) = 0$, show that
$$\int_a^b h(x)\, dx = \frac{1}{24} \int_a^b (x - b)^2 (x - a)^2\, h^{(iv)}(x)\, dx .$$
[Remark: Note that if $h = f - p$ where $f : I \to \mathbb{R}$ is twice and four times differentiable, respectively, and $p : I \to \mathbb{R}$ is a polynomial function of the order 1, 3, respectively, then $h'' = f''$, $h^{(iv)} = f^{(iv)}$, respectively. In connection with the above formulas, this fact is used in the estimation of the errors for the Trapezoid Rule / Simpson Rule for the numerical approximation of integrals. See Section 3.1.4.]
4) Let $a, b \in \mathbb{R}$ be such that $a < b$ and $f, g : [a, b] \to \mathbb{R}$ be restrictions to $[a, b]$ of twice continuously differentiable functions defined on open intervals of $\mathbb{R}$ containing $[a, b]$. In addition, let $f(a) = f(b) = 0$ and $g(a) = g(b) = 0$.
a) Show that
$$\int_a^b g(x) f''(x)\, dx = \int_a^b g''(x) f(x)\, dx .$$
b) In addition, assume that f and g solve the differential equations
f 2 pxq U pxq f pxq λ f pxq ,
g 2 pxq U pxq gpxq µ gpxq
where U : ra, bs Ñ R is continuous and λ, µ P R are such that
λ µ. Show that
»b
f pxqg pxq dx 0 .
a
3.1.3 Partial Fractions
The method of integration of rational expressions by decomposition into partial fractions is suggested by the following simple observation. For this, let $a_1, a_2, A_1, A_2 \in \mathbb{R}$. Then
$$\frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} = \frac{A_1 (x - a_2) + A_2 (x - a_1)}{(x - a_1)(x - a_2)} = \frac{(A_1 + A_2)\, x - (A_1 a_2 + A_2 a_1)}{x^2 - (a_1 + a_2)\, x + a_1 a_2}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. Note that for the left hand side of the last equation, as a function of $x$, there is an antiderivative which is given by
$$A_1 \ln(|x - a_1|) + A_2 \ln(|x - a_2|)$$
for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$.
On the other hand, for a given quotient $p/q$ of polynomials $p$ of first order and $q$ of second order, an antiderivative is usually not obvious. Here we exclude the case that the quotient can be reduced to the quotient of a zero order polynomial and a first order polynomial. Also, we assume that the coefficient of the leading order of $q$ is equal to 1, which can always be achieved by appropriate definition of $p$ and $q$. Therefore, for the purpose of integration, it is natural to try to represent such a quotient $p/q$ in the form
$$\frac{p(x)}{q(x)} = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} \tag{3.1.11}$$
for all $x \in \mathbb{R} \setminus [\, \{a_1, a_2\} \cup q^{-1}(\{0\}) \,]$ and for some $a_1, a_2 \in \mathbb{R}$, $A_1, A_2 \in \mathbb{R}^*$ such that $a_1 \neq a_2$. In this, we notice that the vanishing of one of the coefficients $A_1, A_2$ or $a_1 = a_2$ would lead to the excluded case that the quotient can be reduced to a quotient of a zero order polynomial and a first order polynomial.
In the following, we will determine $A_1, A_2, a_1$ and $a_2$. We immediately note from the singular behavior of the right hand side of equation (3.1.11) near $a_1$ and $a_2$ that $q$ needs to vanish in the points $a_1$ and $a_2$. This can also be shown as follows. The equation (3.1.11) implies that
$$[A_1 (x - a_2) + A_2 (x - a_1)]\, q(x) = p(x)(x - a_1)(x - a_2)$$
for all $x \in \mathbb{R} \setminus [\, \{a_1, a_2\} \cup q^{-1}(\{0\}) \,]$. Hence
$$A_1 (a_1 - a_2)\, q(a_1) = \lim_{x \to a_1} [A_1 (x - a_2) + A_2 (x - a_1)]\, q(x) = \lim_{x \to a_1} p(x)(x - a_1)(x - a_2) = 0 ,$$
$$A_2 (a_2 - a_1)\, q(a_2) = \lim_{x \to a_2} [A_1 (x - a_2) + A_2 (x - a_1)]\, q(x) = \lim_{x \to a_2} p(x)(x - a_1)(x - a_2) = 0 .$$
Since $A_1 \neq 0$, $A_2 \neq 0$ and $a_1 \neq a_2$, this implies that
$$q(a_1) = q(a_2) = 0 .$$
Hence $q$ has the two different zeros $a_1, a_2$ and
$$q(x) = (x - a_1)(x - a_2)$$
for all $x \in \mathbb{R}$. Then (3.1.11) implies that
$$p(x) = A_1 (x - a_2) + A_2 (x - a_1)$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ and therefore that
$$p(a_1) = \lim_{x \to a_1} p(x) = A_1 (a_1 - a_2) , \quad p(a_2) = \lim_{x \to a_2} p(x) = A_2 (a_2 - a_1) .$$
The last system gives
$$A_1 = \frac{p(a_1)}{a_1 - a_2} , \quad A_2 = \frac{p(a_2)}{a_2 - a_1} .$$
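Indeed, if $p$ is a polynomial of first order and $a_1, a_2 \in \mathbb{R}$ are such that $a_1 \neq a_2$, then
$$\frac{p(a_1)}{a_1 - a_2}\, \frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1}\, \frac{1}{x - a_2} = \frac{1}{a_2 - a_1} \cdot \frac{- p(a_1)(x - a_2) + p(a_2)(x - a_1)}{(x - a_1)(x - a_2)}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. In addition,
$$\frac{1}{a_2 - a_1}\, [ - p(a_1)(a_1 - a_2) + p(a_2)(a_1 - a_1) ] = p(a_1) ,$$
$$\frac{1}{a_2 - a_1}\, [ - p(a_1)(a_2 - a_2) + p(a_2)(a_2 - a_1) ] = p(a_2) .$$
Hence
$$\frac{1}{a_2 - a_1}\, [ - p(a_1)(x - a_2) + p(a_2)(x - a_1) ] = p(x)$$
for all $x \in \mathbb{R}$, since two polynomials of first order that coincide in two different points are equal, and
$$\frac{p(x)}{(x - a_1)(x - a_2)} = \frac{p(a_1)}{a_1 - a_2}\, \frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1}\, \frac{1}{x - a_2}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ gives a decomposition as required. In particular, an antiderivative of
$$\left( \mathbb{R} \setminus \{a_1, a_2\} \to \mathbb{R} , \; x \mapsto \frac{p(x)}{(x - a_1)(x - a_2)} \right)$$
is given by
$$\frac{p(a_1)}{a_1 - a_2} \ln(|x - a_1|) + \frac{p(a_2)}{a_2 - a_1} \ln(|x - a_2|)$$
for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$.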
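The formulas for $A_1$ and $A_2$ in the case of two different zeros can be tried out with exact rational arithmetic. The following is a minimal sketch; the function name `decompose` and the sample polynomial $p(x) = 3x + 1$ with zeros $2$ and $-1$ are assumptions made only for illustration.

```python
from fractions import Fraction

def decompose(p, a1, a2):
    """Return (A1, A2) with p(x)/((x-a1)(x-a2)) = A1/(x-a1) + A2/(x-a2)."""
    if a1 == a2:
        raise ValueError("the zeros a1 and a2 must be different")
    return Fraction(p(a1)) / (a1 - a2), Fraction(p(a2)) / (a2 - a1)

def p(x):
    # sample first-order polynomial (illustrative assumption)
    return 3 * x + 1

a1, a2 = Fraction(2), Fraction(-1)
A1, A2 = decompose(p, a1, a2)
print(A1, A2)

# compare both sides of the decomposition at a few sample points
for x in (Fraction(0), Fraction(5, 3), Fraction(-7)):
    lhs = p(x) / ((x - a1) * (x - a2))
    rhs = A1 / (x - a1) + A2 / (x - a2)
    assert lhs == rhs
```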
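As noticed above, a decomposition of the type (3.1.11) is impossible if the polynomial $q$ has a double zero or no real zero. For this reason, we try to find a similar decomposition also for these cases. If $q$ has a double zero $a \in \mathbb{R}$, then $p/q$ is given by
$$\frac{p(x)}{(x - a)^2}$$
for all $x \in \mathbb{R} \setminus \{a\}$. Then
$$\frac{p(x)}{(x - a)^2} = \frac{p'(a)(x - a) + p(a)}{(x - a)^2} = \frac{p'(a)}{x - a} + \frac{p(a)}{(x - a)^2}$$
for all $x \in \mathbb{R} \setminus \{a\}$. Hence an antiderivative of
$$\left( \mathbb{R} \setminus \{a\} \to \mathbb{R} , \; x \mapsto \frac{p(x)}{(x - a)^2} \right)$$
is given by
$$p'(a) \ln(|x - a|) - \frac{p(a)}{x - a}$$
for every $x \in \mathbb{R} \setminus \{a\}$.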
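Finally, if $q$ has no real zero, then
$$q(x) = x^2 + cx + d = \left( x + \frac{c}{2} \right)^2 + d - \frac{c^2}{4}$$
for all $x \in \mathbb{R}$ where $c, d \in \mathbb{R}$ are such that
$$d > \frac{c^2}{4} .$$
Further, $p$ is given by $p(x) = ax + b$ for all $x \in \mathbb{R}$ and some $a, b \in \mathbb{R}$. Then
$$\frac{p(x)}{q(x)} = \frac{ax + b}{x^2 + cx + d} = \frac{ax + \frac{ac}{2}}{x^2 + cx + d} + \frac{b - \frac{ac}{2}}{x^2 + cx + d} = \frac{a}{2}\, \frac{2x + c}{x^2 + cx + d} + \left( b - \frac{ac}{2} \right) \frac{1}{\left( x + \frac{c}{2} \right)^2 + d - \frac{c^2}{4}}$$
$$= \frac{a}{2}\, \frac{2x + c}{x^2 + cx + d} + \frac{b - \frac{ac}{2}}{d - \frac{c^2}{4}} \cdot \frac{1}{1 + \left( \dfrac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)^2}$$
for all $x \in \mathbb{R}$. The first summand on the right hand side of the last equation, as a function of $x$, has an antiderivative given by
$$\frac{a}{2} \ln(x^2 + cx + d)$$
for all $x \in \mathbb{R}$. Hence it remains to find an antiderivative for the second summand. Since we know from Calculus I that
$$\arctan'(x) = \frac{1}{1 + x^2}$$
for all $x \in \mathbb{R}$, such is given by
$$\frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}}\, \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)$$
for all $x \in \mathbb{R}$. Note that in the last step, we could also have employed change of variables, but the procedure here is more direct. Hence in the case that
$$d > \frac{c^2}{4} ,$$
an antiderivative of $p/q$, given by
$$\frac{ax + b}{x^2 + cx + d}$$
for every $x \in \mathbb{R}$, is given by
$$\frac{a}{2} \ln(x^2 + cx + d) + \frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}}\, \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)$$
for all $x \in \mathbb{R}$.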
The previous analysis can be generalized to quotients of the form $p/q$ where $p, q$ are polynomials of order $m$ and $n$, respectively, such that $m < n$. The result is given below without proof. The proof can be found in texts on function theory, that is, the theory of functions of one complex variable. For readers that already know complex numbers, we just indicate how their introduction might be helpful in this respect. For this, we consider the case that $p(x) = 1$ and $q(x) = x^2 + 1$ for all $x \in \mathbb{R}$. The polynomial $q$ has no real zero, but if we extend $q$ to the complex plane by $\bar{q}(z) := z^2 + 1$ for every complex number $z$, then $\bar{q}$ has the roots $i$ and $-i$, where $i$ denotes the imaginary unit, since
$$\bar{q}(i) = i^2 + 1 = -1 + 1 = 0 , \quad \bar{q}(-i) = (-i)^2 + 1 = -1 + 1 = 0 .$$
In particular,
$$\frac{1}{\bar{q}(z)} = \frac{i}{2} \left( \frac{1}{z + i} - \frac{1}{z - i} \right)$$
for every complex $z$ different from $i$ and $-i$. As reflected in this example, the introduction of complex numbers allows in every case the decomposition of the extension of $p/q$ to complex numbers into sums of multiples of functions that assume the values
$$\frac{1}{z - a} , \; \dots , \; \frac{1}{(z - a)^{\mu(a)}}$$
in every complex $z$ not among the zeros of that extension of $q$, where $a$ runs through the zeros of that extension and for every such $a$ the symbol $\mu(a)$ denotes the corresponding multiplicity. This fact simplifies the discussion significantly.
Lemma 3.1.25. Let $p, q : \mathbb{R} \to \mathbb{R}$ be polynomials of degree $m, n \in \mathbb{N}^*$, respectively, where $m < n$. Finally, let $a_1, \dots, a_r$ be the (possibly empty) sequence of pairwise different real roots of $q$, where $r \in \mathbb{N}$, and let $m_1, \dots, m_r$ be the sequence in $\mathbb{N}^*$ consisting of the corresponding multiplicities.
(i) There are $s \in \mathbb{N}$ along with (possibly empty and, apart from reordering, unique) sequences $(b_{r+1}, c_{r+1}), \dots, (b_{r+s}, c_{r+s})$ of pairwise different elements of $\mathbb{R} \times (0, \infty)$ and $m_{r+1}, \dots, m_{r+s}$ in $\mathbb{N}^*$ such that
$$q(x) = q_n\, (x - a_1)^{m_1} \cdots (x - a_r)^{m_r}\, [ (x - b_{r+1})^2 + c_{r+1}^2 ]^{m_{r+1}} \cdots [ (x - b_{r+s})^2 + c_{r+s}^2 ]^{m_{r+s}}$$
for all $x \in \mathbb{R}$ where $q_n$ is the coefficient of the $n$th order of $q$.
(ii) There are unique sequences of real numbers $A_{11}, \dots, A_{1 m_1}, \dots, A_{r1}, \dots, A_{r m_r}$ and pairs of real numbers $(B_{r+1,1}, C_{r+1,1}), \dots, (B_{r+1,m_{r+1}}, C_{r+1,m_{r+1}}), \dots, (B_{r+s,1}, C_{r+s,1}), \dots, (B_{r+s,m_{r+s}}, C_{r+s,m_{r+s}})$, respectively, such that
$$\frac{p(x)}{q(x)} = \frac{A_{11}}{x - a_1} + \dots + \frac{A_{1 m_1}}{(x - a_1)^{m_1}} + \dots + \frac{A_{r1}}{x - a_r} + \dots + \frac{A_{r m_r}}{(x - a_r)^{m_r}}$$
$$+ \frac{B_{r+1,1}\, x + C_{r+1,1}}{(x - b_{r+1})^2 + c_{r+1}^2} + \dots + \frac{B_{r+1,m_{r+1}}\, x + C_{r+1,m_{r+1}}}{[ (x - b_{r+1})^2 + c_{r+1}^2 ]^{m_{r+1}}} + \dots$$
$$+ \frac{B_{r+s,1}\, x + C_{r+s,1}}{(x - b_{r+s})^2 + c_{r+s}^2} + \dots + \frac{B_{r+s,m_{r+s}}\, x + C_{r+s,m_{r+s}}}{[ (x - b_{r+s})^2 + c_{r+s}^2 ]^{m_{r+s}}}$$
for all $x \in \mathbb{R} \setminus \{a_1, \dots, a_r\}$.
Proof. See Function Theory.
Corollary 3.1.26. Let p, q, m, n; a1 , . . . ak , m1 , . . . , mr , pb1 , c1 q . . . ,
pbnk , cnk q, mr 1, . . . , mr s, A11, . . . , A1m1 , . . . , Ar1, . . . , Armr and
1
pBr
pBr
1,1 , Cr 1,1
s,mr
s , Cr
q, . . . , pBr 1,m , Cr 1,m q, . . . , pBr
q as in Lemma 3.1.25. Then by
s,m
r 1
r 1
s,1 , Cr s,1
q, . . . ,
r s
F pxq : A11 lnp|x a1 |q A1m
1 px a qm 1
1
m1
Ar1 lnp|x ar |q 1
1
1
Arm
1 px a qm 1
1
mr
r
r
r
...
...
Br 1,1
lnrpx br 1 q2 cr 1 s
2
x br 1
br 1 Br 1,1 Cr 1,1
arctan
...
cr 1
cr 1
Br 1,1
1
2p1 mr 1 q rpx br 1 q2 cr 1 smr 1 1
pbr 1Br 1,1 Cr 1,1q Fr 1pxq . . .
Br s,1
lnrpx br s q2 cr s s
2
br s Br s,1 Cr s,1
x br s
arctan
...
cr s
cr s
1
Br s,1
2p1 mr s q rpx br s q2 cr s smr s 1
pbr sBr s,1 Cr s,1q Fr spxq
for all x P R zta1 , . . . , ak u, there is defined an anti-derivative F of p{q.
Here Fr 1 , . . . , Fr s : R Ñ R denote anti-derivatives satisfying
Fr1
l
pxq rpx b q21
r l
cr
l
sm
r l
for all x P R and l 2, . . . s. Note that such functions can be calculated by
the recursion formula from Example 3.1.23.
In the following, we give five examples of typical applications of the previous lemma and its corollary. The fifth example gives such an application to the solution of a ('separable') first order differential equation.
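Example 3.1.27. Calculate
$$\int_0^2 \frac{4}{x^2 - 9}\, dx .$$
Solution:
$$\int_0^2 \frac{4}{x^2 - 9}\, dx = \int_0^2 \frac{4}{(x - 3)(x + 3)}\, dx = \frac{2}{3} \int_0^2 \left( \frac{1}{x - 3} - \frac{1}{x + 3} \right) dx$$
$$= \frac{2}{3}\, ( \ln(|2 - 3|) - \ln(|2 + 3|) ) - \frac{2}{3}\, ( \ln(|-3|) - \ln(|3|) ) = - \frac{2}{3} \ln(5) ,$$
where it has been used that for every function $f$
$$\frac{1}{(f(x) + a)(f(x) + b)} = \frac{1}{b - a} \left[ \frac{1}{f(x) + a} - \frac{1}{f(x) + b} \right] , \tag{3.1.12}$$
where $a, b \in \mathbb{R}$ are such that $a \neq b$ and $x \in D(f)$ is such that $f(x) \notin \{-a, -b\}$. The previous identity is also of use in applications of the method of integration by partial fractions to more complicated situations.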
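Example 3.1.28. Calculate
$$\int_0^3 \frac{3x + 4}{x^2 + 2x + 2}\, dx .$$
Solution:
$$\int_0^3 \frac{3x + 4}{x^2 + 2x + 2}\, dx = \int_0^3 \frac{3}{2}\, \frac{2x + 2}{x^2 + 2x + 2}\, dx + \int_0^3 \frac{1}{(x + 1)^2 + 1}\, dx$$
$$= \frac{3}{2} \ln(3^2 + 2 \cdot 3 + 2) + \arctan(3 + 1) - \frac{3}{2} \ln(2) - \arctan(1) = \frac{3}{2} \ln\left( \frac{17}{2} \right) + \arctan(4) - \frac{\pi}{4} .$$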
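Example 3.1.29. Calculate
$$\int_1^2 \frac{1}{x^2 (x^2 + 1)^2}\, dx .$$
Solution: Since the integrand is a restriction of the composition of the maps $( \mathbb{R} \setminus \{-1, 0\} \to \mathbb{R}, x \mapsto 1/[x (x + 1)^2] )$ and $( \mathbb{R} \to \mathbb{R}, x \mapsto x^2 )$, by Lemma 3.1.25 there are $A, B, C \in \mathbb{R}$ such that
$$\frac{1}{x^2 (x^2 + 1)^2} = \frac{A}{x^2} + \frac{B}{x^2 + 1} + \frac{C}{(x^2 + 1)^2} \tag{3.1.13}$$
for all $x \neq 0$. Hence for all $x \in \mathbb{R}$
$$1 = A (x^2 + 1)^2 + B x^2 (x^2 + 1) + C x^2 = (A + B)\, x^4 + (2A + B + C)\, x^2 + A$$
and hence $A = 1$, $B = -1$ and $C = -1$. Hence it follows by the recursion formula from Example 3.1.23 that
$$\int_1^2 \frac{dx}{x^2 (x^2 + 1)^2} = \int_1^2 \frac{dx}{x^2} - \int_1^2 \frac{dx}{x^2 + 1} - \int_1^2 \frac{dx}{(x^2 + 1)^2} = \frac{1}{2} + \frac{\pi}{4} - \arctan(2) - \int_1^2 \frac{dx}{(x^2 + 1)^2}$$
$$= \frac{1}{2} + \frac{\pi}{4} - \arctan(2) - \frac{1}{2} \left( \frac{2}{5} - \frac{1}{2} \right) - \frac{1}{2} \int_1^2 \frac{dx}{x^2 + 1} = \frac{11}{20} + \frac{3}{2} \left( \frac{\pi}{4} - \arctan(2) \right) .$$
Another way of arriving at the decomposition (3.1.13) is with the help of the identity (3.1.12), which leads to
$$\frac{1}{x^2 (x^2 + 1)^2} = \frac{1}{x^2 + 1} \cdot \frac{1}{x^2 (x^2 + 1)} = \frac{1}{x^2 + 1} \left( \frac{1}{x^2} - \frac{1}{x^2 + 1} \right) = \frac{1}{x^2 (x^2 + 1)} - \frac{1}{(x^2 + 1)^2} = \frac{1}{x^2} - \frac{1}{x^2 + 1} - \frac{1}{(x^2 + 1)^2}$$
for all $x \in \mathbb{R}^*$.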
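A quick numerical cross-check of the value of the integral in Example 3.1.29 is sketched below. The composite Simpson sum `simpson` is an illustrative helper; the closed form it is compared with, $11/20 + (3/2)(\pi/4 - \arctan 2) \approx 0.0674$, is what the decomposition with $A = 1$, $B = C = -1$ together with the reduction formula of Example 3.1.23 gives. Since the integrand is positive on $[1, 2]$, the value has to be positive.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson sum with an even number n of steps."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return 1.0 / (x * x * (x * x + 1.0) ** 2)

numeric = simpson(integrand, 1.0, 2.0)
closed_form = 11.0 / 20.0 + 1.5 * (math.pi / 4.0 - math.atan(2.0))
print(numeric, closed_form)
```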
Fig. 71: Graph of the antiderivative $F$ of $f(x) := 1/(1 + x^4)$, $x \in \mathbb{R}$, satisfying $F(0) = 0$. Compare Example 3.1.30.
Example 3.1.30. Calculate
$$\int_a^x \frac{dy}{1 + y^4}$$
where $a \in \mathbb{R}$ and $x \ge a$. Solution: Since $x^4 + 1 > 0$ for all $x \in \mathbb{R}$, according to Lemma 3.1.25 there are $b, c, d, e \in \mathbb{R}$ such that
$$y^4 + 1 = (y^2 + by + c)(y^2 + dy + e) \tag{3.1.14}$$
$$= y^4 + d y^3 + e y^2 + b y^3 + bd\, y^2 + be\, y + c y^2 + cd\, y + ce = y^4 + (b + d)\, y^3 + (c + e + bd)\, y^2 + (be + cd)\, y + ce$$
for all $y \in \mathbb{R}$. This equation is satisfied if and only if
$$b + d = 0 , \quad c + e + bd = 0 , \quad be + cd = 0 , \quad ce = 1 .$$
From the first equation, we conclude that $d = -b$ which leads to the equivalent reduced system
$$d = -b , \quad e + c = b^2 , \quad b (e - c) = 0 , \quad ce = 1 .$$
The assumption that $b = 0$ leads to $e = -c$ and $1 = ce = -c^2$, which is impossible for real $c$. Hence it follows that $b \neq 0$. Therefore, the third equation of the last system leads to the equivalent reduced system
$$d = -b , \quad b^2 = 2c , \quad e = c , \quad c^2 = 1$$
which has the solution $c = e = 1$ and $b = \sqrt{2}$, $d = -\sqrt{2}$. (The other remaining solution $c = e = 1$ and $b = -\sqrt{2}$, $d = \sqrt{2}$ results in a reordering of the factors in (3.1.14)). Hence it follows that
$$y^4 + 1 = (y^2 + \sqrt{2}\, y + 1)(y^2 - \sqrt{2}\, y + 1)$$
for all $y \in \mathbb{R}$. Note that the last could also have been derived more simply as follows:
$$1 + y^4 = 1 + 2y^2 + y^4 - 2y^2 = (y^2 + 1)^2 - (\sqrt{2}\, y)^2 = (y^2 + \sqrt{2}\, y + 1)(y^2 - \sqrt{2}\, y + 1)$$
valid for all $y \in \mathbb{R}$. Further, according to Corollary 3.1.26 there are uniquely determined $A, B, C, D \in \mathbb{R}$ such that
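$$\frac{1}{1 + y^4} = \frac{Ay + B}{y^2 + \sqrt{2}\, y + 1} + \frac{Cy + D}{y^2 - \sqrt{2}\, y + 1} \tag{3.1.15}$$
for all $y \in \mathbb{R}$. In particular, this implies that
$$\frac{1}{1 + y^4} = \frac{1}{1 + (-y)^4} = \frac{-Ay + B}{y^2 - \sqrt{2}\, y + 1} + \frac{-Cy + D}{y^2 + \sqrt{2}\, y + 1} = \frac{-Cy + D}{y^2 + \sqrt{2}\, y + 1} + \frac{-Ay + B}{y^2 - \sqrt{2}\, y + 1}$$
for all $y \in \mathbb{R}$. Since $A, B, C$ and $D$ are uniquely determined by the equations (3.1.15) for every $y \in \mathbb{R}$, it follows that $C = -A$ and $D = B$. Hence we conclude that there are uniquely determined $A, B \in \mathbb{R}$ such that
$$\frac{1}{1 + y^4} = \frac{Ay + B}{y^2 + \sqrt{2}\, y + 1} + \frac{-Ay + B}{y^2 - \sqrt{2}\, y + 1}$$
for all $y \in \mathbb{R}$. In particular,
$$1 = \frac{1}{1 + 0^4} = 2B$$
and hence $B = 1/2$. Also
$$\frac{1}{2} = \frac{1}{1 + 1^4} = \frac{A + (1/2)}{2 + \sqrt{2}} + \frac{-A + (1/2)}{2 - \sqrt{2}} = \frac{2 - \sqrt{2}}{2} \left( A + \frac{1}{2} \right) + \frac{2 + \sqrt{2}}{2} \left( -A + \frac{1}{2} \right) = 1 - \sqrt{2}\, A$$
and hence $A = \sqrt{2}/4$. We conclude that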
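$$\frac{1}{1 + y^4} = \frac{1}{4} \left[ \frac{\sqrt{2}\, y + 2}{y^2 + \sqrt{2}\, y + 1} - \frac{\sqrt{2}\, y - 2}{y^2 - \sqrt{2}\, y + 1} \right]$$
$$= \frac{\sqrt{2}}{8} \left[ \frac{2y + \sqrt{2}}{y^2 + \sqrt{2}\, y + 1} - \frac{2y - \sqrt{2}}{y^2 - \sqrt{2}\, y + 1} \right] + \frac{1}{2} \left[ \frac{1}{(\sqrt{2}\, y + 1)^2 + 1} + \frac{1}{(\sqrt{2}\, y - 1)^2 + 1} \right]$$
for all $y \in \mathbb{R}$. Hence it follows that
$$\int_a^x \frac{dy}{1 + y^4} = \frac{\sqrt{2}}{8} \left[ \ln\left( \frac{x^2 + \sqrt{2}\, x + 1}{a^2 + \sqrt{2}\, a + 1} \right) - \ln\left( \frac{x^2 - \sqrt{2}\, x + 1}{a^2 - \sqrt{2}\, a + 1} \right) \right]$$
$$+ \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2}\, x + 1) - \arctan(\sqrt{2}\, a + 1) \right] + \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2}\, x - 1) - \arctan(\sqrt{2}\, a - 1) \right] .$$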
Fig. 72: Graphs of the solutions $f_0$, $f_{1/4}$, $f_{1/2}$, $f_{3/4}$ and $f_1$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.
Remark 3.1.31. The previous example gives another illustration of the general rule that one should never blindly rely on computer programs. In Mathematica 5.1, the command
Integrate[1/(1 + x^4), x]
gives the output
$$\frac{1}{4 \sqrt{2}} \left( -2\, \mathrm{ArcTan}[1 - \sqrt{2}\, x] + 2\, \mathrm{ArcTan}[1 + \sqrt{2}\, x] - \mathrm{Log}[-1 + \sqrt{2}\, x - x^2] + \mathrm{Log}[1 + \sqrt{2}\, x + x^2] \right)$$
which is incorrect. A first inspection of the last formula reveals that the argument of the first natural logarithm function becomes negative for large $x$, so that the logarithm is not defined. This gives a first indication that the expression is incorrect. Comparison with the result from Example 3.1.30 shows that the sign of that argument has to be reversed.
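The antiderivative obtained in Example 3.1.30 (with $a = 0$, so that $F(0) = 0$) can be checked by numerical differentiation. The sketch below uses a central difference quotient; the names `F` and `derivative` and the step size are illustrative choices. It also evaluates the expression $-1 + \sqrt{2}\,x - x^2 = -[(x - \sqrt{2}/2)^2 + 1/2]$, which is negative for every real $x$.

```python
import math

SQRT2 = math.sqrt(2.0)

def F(x):
    """Antiderivative of 1/(1 + x^4) with F(0) = 0, as derived in Example 3.1.30."""
    logs = math.log(x * x + SQRT2 * x + 1.0) - math.log(x * x - SQRT2 * x + 1.0)
    arcs = math.atan(SQRT2 * x + 1.0) + math.atan(SQRT2 * x - 1.0)
    return SQRT2 / 8.0 * logs + SQRT2 / 4.0 * arcs

def derivative(g, x, h=1e-5):
    """Central difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

for x in (-3.0, -0.5, 0.0, 1.0, 10.0):
    print(x, derivative(F, x), 1.0 / (1.0 + x ** 4), -1.0 + SQRT2 * x - x * x)
```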
Fig. 73: Graphs of the solutions $f_2$ and $f_4$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.
Example 3.1.32. Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.
$$f'(x) = a f(x)(1 - f(x)) \tag{3.1.16}$$
for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0, 1)$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Since $f(0) \in (0, 1)$, it follows by the continuity of $f$ the existence of $c, d \in \mathbb{R}$ such that $c < 0 < d$ and such that $f([c, d]) \subset (0, 1)$. Since $a > 0$ and the function $a f (1 - f)$ is $> 0$ on the interval $[c, d]$, it follows from (3.1.2) that
$$f(x_1) = f(x_1) - f(x_0) + f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x)\, dx = f(x_0) + \int_{x_0}^{x_1} a f(x)(1 - f(x))\, dx \ge f(x_0)$$
for all $x_0, x_1 \in [c, d]$ such that $x_0 \le x_1$. In addition, the restriction of $f$ to $[c, d]$ is non-constant since the function $(\mathbb{R} \to \mathbb{R}, x \mapsto a x (1 - x))$ has no zeros on $(0, 1)$. Hence we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c, d]$ that
$$a (x - c) = \int_c^x a\, dy = \int_c^x \frac{f'(y)}{f(y)(1 - f(y))}\, dy = \int_{f(c)}^{f(x)} \frac{du}{u (1 - u)} = \int_{f(c)}^{f(x)} \left( \frac{1}{u} + \frac{1}{1 - u} \right) du = \ln\left( \frac{1 - f(c)}{f(c)} \cdot \frac{f(x)}{1 - f(x)} \right)$$
and hence that
$$\frac{f(c)}{1 - f(c)}\, e^{-ac}\, e^{ax} = \frac{f(x)}{1 - f(x)} = \frac{f(x) - 1 + 1}{1 - f(x)} = \frac{1}{1 - f(x)} - 1 .$$
This implies that
$$\frac{f(c)}{1 - f(c)}\, e^{-ac} = \frac{f(0)}{1 - f(0)}$$
and hence that
$$\frac{f(0)}{1 - f(0)}\, e^{ax} = \frac{1}{1 - f(x)} - 1 .$$
Finally, this leads to
$$f(x) = 1 - \frac{1}{1 + \frac{f(0)}{1 - f(0)}\, e^{ax}} = \frac{e^{ax}}{e^{ax} + \frac{1 - f(0)}{f(0)}} .$$
On the other hand, for every $c \in \mathbb{R}^*$, it follows by elementary calculation that the function $f_c$ defined by
$$f_c(x) := \begin{cases} \dfrac{e^{ax}}{e^{ax} + \frac{1 - c}{c}} & \text{for } x \in \mathbb{R} \text{ if } 0 < c \le 1 \\[2ex] \dfrac{e^{ax}}{e^{ax} + \frac{1 - c}{c}} & \text{for } x \in \mathbb{R} \setminus \{ a^{-1} \ln((c - 1)/c) \} \text{ if } c > 1 \text{ or } c < 0 \end{cases}$$
satisfies (3.1.16). Note also that $f_0$, defined as the constant function of value zero on $\mathbb{R}$, is a further solution of (3.1.16) such that $f_0(0) = 0$.
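That the functions $f_c$ satisfy (3.1.16) with $f_c(0) = c$ can also be checked numerically. The following sketch compares a central difference quotient of $f_c$ with $a f_c (1 - f_c)$; the name `f_c`, the helper `residual` and the sample values of $a$ and $c$ are illustrative assumptions.

```python
import math

def f_c(x, c, a=1.0):
    """f_c(x) = e^{ax} / (e^{ax} + (1 - c)/c), for c != 0 and away from the pole."""
    e = math.exp(a * x)
    return e / (e + (1.0 - c) / c)

def residual(x, c, a, h=1e-6):
    """Central difference of f_c minus a * f_c * (1 - f_c); should be close to zero."""
    d = (f_c(x + h, c, a) - f_c(x - h, c, a)) / (2.0 * h)
    y = f_c(x, c, a)
    return d - a * y * (1.0 - y)

a = 2.0  # sample value of the parameter a > 0
for c in (0.25, 0.5, 1.0):
    print(c, f_c(0.0, c, a), [residual(x, c, a) for x in (-1.0, 0.0, 1.5)])
```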
Problems
1) Calculate the integral.
»2
a)
2
»4
c)
3
»3
e)
2
»1
g)
3u
u2
»4
0
3
x3
x2
6x2
du
1{2
,
d)
0
3
12x
3x
4x2
x3 4x
x4 2x2
8
4
x
4
5
dx
1
x2
0
6x
»3
,
f)
1
»3
,
h)
3
»4
j)
2
,
9
»1
l)
,
,
x2 1
dx
x4 4x2 4
,
,
7
dx
,
x3 3x2 1
15x2 10x 24 dx
,
4x3
x4
dx
x2 3x 1
dx
x3 2x2 7x 4
3x
x4
0
1
3x 1
dx
x3 7x 6
dx
dx
,
2x
b)
»1
,
3x 5
dx
4
x
4x2 3
» 1{2
k)
du
u2
1 x3
i)
u 12
u3
x2
»2
2
6x2
4x
1
»0
m)
2 x4
»1
n)
0
»3
o)
0
»2
p)
1
3.1.4
2x2 1
dx
x3 9x2 11x 4
x3 x 1
x4 3x3 3x2 7x
x2
6
1
x4
2x3
3x2
4x
2
x4
2x3 x2
3x3 5x2
4
9x
6
,
dx
,
dx
,
dx
.
3.1.4 Approximate Numerical Calculation of Integrals
Usually, in cases where an evaluation of a given integral in terms of known functions appears to be impossible, one resorts to approximation methods. Basic numerical methods for this, the midpoint rule, the trapezoid rule and Simpson's rule, are given within this section. Each of them uses approximations of integrands analogous to those leading to upper and lower sums in the definition of the Riemann integral. For this, partitions of the interval of integration $I$ are used which induce divisions into subintervals of equal length. Generally, the decrease of that length leads to better approximations. On each subinterval, the corresponding restriction of the integrand $f : I \to \mathbb{R}$ is replaced by a certain polynomial approximation characteristic for each method. The integral of $f$ over $I$ is then approximated by the sum of the integrals of the approximating polynomials over the subintervals. The midpoint rule uses on each subinterval the constant polynomial whose value coincides with the value of $f$ at the midpoint of that interval. This is equivalent to the approximation of $f$ by its linearization around the midpoint of the subinterval, since the integral of the non-constant part of that linear polynomial over the interval vanishes. This is the reason why the midpoint rule leads to results which are similar in accuracy to those of the trapezoid rule. The trapezoid rule approximates $f$ on each subinterval by the linear polynomial that interpolates between the values of $f$ at the interval ends, i.e., by that linear polynomial that assumes the same values as $f$ at both ends of the subinterval. Finally, Simpson's rule approximates $f$ on each subinterval by the quadratic polynomial that interpolates between the values of $f$ at the end points and at the midpoint of that interval.
From this description, it might be expected that among those methods, Simpson's rule is the most accurate, followed by the trapezoid rule and the midpoint rule. Indeed, Simpson's rule is the most accurate, which is also reflected in the fact that its error bound is proportional to $n^{-4}$ where $n$ is the number of subintervals of the division. On the other hand, the error bound of both the midpoint and the trapezoid rule is proportional to $n^{-2}$. Often, the trapezoid rule gives better approximations than the midpoint rule, but there are also cases known where the opposite is true. For instance, in the examples below this is the case. All these methods can lead to poor results in the case of an oscillating $f$ as long as the length of the subintervals is comparable to the average distance of subsequent minima and maxima of $f$. Such cases are depicted in the figures below.
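The three rules described above can be sketched in a few lines. The function names and the test integrand $\sin$ on $[0, \pi]$ (with exact integral 2) are illustrative assumptions. Simpson's rule is written here as two-thirds of the midpoint sum plus one-third of the trapezoid sum, which is the same as integrating the interpolating quadratic on each subinterval.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint rule with n subintervals of equal length."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoid rule with n subintervals of equal length."""
    h = (b - a) / n
    return h * sum(0.5 * (f(a + i * h) + f(a + (i + 1) * h)) for i in range(n))

def simpson(f, a, b, n):
    """Simpson's rule on n subintervals: (h/6) [f(left) + 4 f(mid) + f(right)] on each."""
    return (2.0 * midpoint(f, a, b, n) + trapezoid(f, a, b, n)) / 3.0

exact = 2.0  # integral of sin over [0, pi]
for n in (4, 8, 16, 32):
    errors = [abs(rule(math.sin, 0.0, math.pi, n) - exact)
              for rule in (midpoint, trapezoid, simpson)]
    print(n, errors)
```

Doubling $n$ should reduce the midpoint and trapezoid errors by a factor of about 4 and the Simpson error by a factor of about 16.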
The key for the following derivation of an error estimate for the midpoint
rule is the observation that the associated integrals over the subintervals coincide with those of the linearization of the integrand around the midpoints.
As a consequence, the remainder estimate of Corollary 2.5.26 to Taylor’s
theorem can be applied.
Theorem 3.1.33. (Midpoint Rule) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a, b] \to \mathbb{R}$ be bounded and twice differentiable on $(a, b)$ such that $|f''(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Then
(i)
$$\left| \int_a^b f(x)\, dx - f\left( \frac{a + b}{2} \right) (b - a) \right| \le \frac{K}{24}\, (b - a)^3 .$$
(ii) In addition, let $n \in \mathbb{N}^*$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$. Then
$$\left| \int_a^b f(x)\, dx - h \sum_{i=0}^{n-1} f\left( \frac{a_i + a_{i+1}}{2} \right) \right| \le \frac{K (b - a)^3}{24\, n^2} .$$
Fig. 74: Midpoint approximation.
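Proof. (i) By the Corollary 2.5.26 to Taylor's theorem, it follows that
$$|f(x) - p_1(x)| \le \frac{K}{2} \left( x - \frac{a + b}{2} \right)^2$$
for all $x \in (a, b)$ where
$$p_1(x) := f\left( \frac{a + b}{2} \right) + f'\left( \frac{a + b}{2} \right) \left( x - \frac{a + b}{2} \right)$$
for all $x \in \mathbb{R}$ is the first-degree Taylor polynomial of $f$ centered around $(a + b)/2$. Further,
$$\int_a^b p_1(x)\, dx = f\left( \frac{a + b}{2} \right) (b - a) + f'\left( \frac{a + b}{2} \right) \int_a^b \left( x - \frac{a + b}{2} \right) dx$$
$$= f\left( \frac{a + b}{2} \right) (b - a) + \frac{1}{2}\, f'\left( \frac{a + b}{2} \right) \left[ \left( x - \frac{a + b}{2} \right)^2 \right]_a^b = f\left( \frac{a + b}{2} \right) (b - a) .$$
In addition,
$$\left| \int_a^b f(x)\, dx - \int_a^b p_1(x)\, dx \right| \le \int_a^b |f(x) - p_1(x)|\, dx \le \frac{K}{2} \int_a^b \left( x - \frac{a + b}{2} \right)^2 dx = \frac{K}{6} \left[ \left( x - \frac{a + b}{2} \right)^3 \right]_a^b = \frac{K}{24}\, (b - a)^3 .$$
(ii) is a simple consequence of (i).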
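Fig. 75: Trapezoid approximation.
Example 3.1.34. We use the midpoint rule to approximate the value of
$$\ln(2) = \int_1^2 \frac{dx}{x} .$$
For this, we use the partition
$$(a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4)$$
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
$$h \sum_{i=0}^{3} f\left( \frac{a_i + a_{i+1}}{2} \right) = \frac{1}{2} \sum_{i=0}^{3} \frac{1}{a_i + a_{i+1}} = \frac{1}{2} \left( \frac{1}{\frac{4}{4} + \frac{5}{4}} + \frac{1}{\frac{5}{4} + \frac{6}{4}} + \frac{1}{\frac{6}{4} + \frac{7}{4}} + \frac{1}{\frac{7}{4} + \frac{8}{4}} \right)$$
$$= 2 \left( \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} \right) = \frac{4448}{6435} \approx 0.691$$
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
$$\ln(2) \approx 0.693 .$$
The result of this application of the midpoint rule gives $\ln(2)$ within an error of $2 \cdot 10^{-3}$. Since
$$|f''(x)| = 2/x^3 \le 2$$
for all $x \in (1, 2)$, Theorem 3.1.33 (ii) leads to the error bound
$$\left| \frac{4448}{6435} - \ln(2) \right| \le \frac{1}{24} \cdot \frac{2}{16} = \frac{1}{192} < 6 \cdot 10^{-3} .$$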
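The arithmetic of Example 3.1.34 can be reproduced exactly with rational numbers. This is a sketch; the helper name `midpoint_sum` is an illustrative choice.

```python
from fractions import Fraction
import math

def midpoint_sum(f, a, b, n):
    """Midpoint rule with n equal subintervals, in exact rational arithmetic."""
    h = Fraction(b - a, n)
    return h * sum(f(a + (i + Fraction(1, 2)) * h) for i in range(n))

approx = midpoint_sum(lambda x: 1 / x, 1, 2, 4)
error = abs(float(approx) - math.log(2))
bound = Fraction(2, 24 * 4 ** 2)  # K (b - a)^3 / (24 n^2) with K = 2, b - a = 1, n = 4
print(approx, float(approx), error, bound)
```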
The following derivation of an error estimate for the trapezoid rule exploits the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the interval ends. By partial integration, the integral of such a difference can be transformed into an integral containing the second order derivative of the difference instead. Since the approximating polynomial is only of first order, the latter coincides with the second order derivative of the restriction of the integrand. This leads to an error estimate in terms of a bound on the second derivative of $f$.
Theorem 3.1.35. (Trapezoid Rule) Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be twice continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f''(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Then
(i)
$$\left| \int_a^b f(x)\, dx - \frac{f(a) + f(b)}{2}\, (b - a) \right| \le \frac{K}{12}\, (b - a)^3 .$$
(ii) In addition let $n \in \mathbb{N}^*$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$. Then
$$\left| \int_a^b f(x)\, dx - h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} \right| \le \frac{K (b - a)^3}{12\, n^2} .$$
Proof. Define
$$p(x) := f(a) + \frac{f(b) - f(a)}{b - a}\, (x - a)$$
for all $x \in \mathbb{R}$ and $h := f - p$. In particular, it follows that $h(a) = h(b) = 0$ and
$$\int_a^b p(x)\, dx = \frac{f(a) + f(b)}{2}\, (b - a) .$$
By partial integration, it follows that
$$\int_a^b h(x)\, dx = \frac{1}{2} \int_a^b (x - b)(x - a)\, h''(x)\, dx = \frac{1}{2} \int_a^b (x - b)(x - a)\, f''(x)\, dx$$
and hence that
$$\left| \int_a^b h(x)\, dx \right| \le \frac{1}{2} \int_a^b (b - x)(x - a)\, |f''(x)|\, dx \le \frac{K}{2} \int_a^b (b - x)(x - a)\, dx = \frac{K}{12}\, (b - a)^3 .$$
(ii) is a simple consequence of (i).
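Example 3.1.36. As before with the midpoint rule, we use the trapezoid rule to approximate the value of
$$\ln(2) = \int_1^2 \frac{dx}{x} .$$
Again, we use the partition
$$(a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4)$$
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
$$h \sum_{i=0}^{3} \frac{f(a_i) + f(a_{i+1})}{2} = \frac{1}{8} \sum_{i=0}^{3} \left( \frac{1}{a_i} + \frac{1}{a_{i+1}} \right) = \frac{1}{8} \left( \frac{4}{4} + \frac{4}{5} + \frac{4}{5} + \frac{4}{6} + \frac{4}{6} + \frac{4}{7} + \frac{4}{7} + \frac{4}{8} \right)$$
$$= \frac{1}{2} \left( \frac{1}{4} + \frac{2}{5} + \frac{2}{6} + \frac{2}{7} + \frac{1}{8} \right) = \frac{1171}{1680} \approx 0.697$$
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
$$\ln(2) \approx 0.693 .$$
The result of this application of the trapezoid rule gives $\ln(2)$ within an error of $4 \cdot 10^{-3}$. Since
$$|f''(x)| = 2/x^3 \le 2$$
for all $x \in (1, 2)$, Theorem 3.1.35 (ii) leads to the error bound
$$\left| \frac{1171}{1680} - \ln(2) \right| \le \frac{1}{12} \cdot \frac{2}{16} = \frac{1}{96} < 11 \cdot 10^{-3} .$$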
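The trapezoid sum of Example 3.1.36 can likewise be reproduced exactly. Again this is only a sketch, with `trapezoid_sum` as an illustrative helper name.

```python
from fractions import Fraction
import math

def trapezoid_sum(f, a, b, n):
    """Trapezoid rule with n equal subintervals, in exact rational arithmetic."""
    h = Fraction(b - a, n)
    return h * sum((f(a + i * h) + f(a + (i + 1) * h)) / 2 for i in range(n))

approx = trapezoid_sum(lambda x: 1 / x, 1, 2, 4)
error = abs(float(approx) - math.log(2))
bound = Fraction(2, 12 * 4 ** 2)  # K (b - a)^3 / (12 n^2) with K = 2, b - a = 1, n = 4
print(approx, float(approx), error, bound)
```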
The following derivation of an error estimate for Simpson's rule is similar to that for the trapezoid rule. Again, it uses partial integration to exploit the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the endpoints and also in the middle of the interval. This leads to an error estimate in terms of a bound on the fourth derivative of the integrand.
Theorem 3.1.37. (Simpson's Rule) Let $h > 0$, $I$ be some open interval of $\mathbb{R}$ containing $[-h, h]$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $|f^{(iv)}(x)| \le K$ for all $x \in (-h, h)$ and some $K \ge 0$. Then
$$\left| \int_{-h}^{h} f(x)\, dx - \frac{1}{3}\, [ f(-h) + 4 f(0) + f(h) ]\, h \right| \le \frac{K}{90}\, h^5 .$$
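Proof. Define
$$p(x) := \left\{ \frac{1}{2}\, [ f(-h) + f(h) ] - f(0) \right\} \frac{x^2}{h^2} + \frac{1}{2}\, [ f(h) - f(-h) ]\, \frac{x}{h} + f(0)$$
for all $x \in \mathbb{R}$ and $g := f - p$. Then $g(-h) = g(0) = g(h) = 0$ and
$$\int_{-h}^{h} p(x)\, dx = \left\{ \frac{1}{2}\, [ f(-h) + f(h) ] - f(0) \right\} \frac{2h}{3} + f(0)\, 2h = \frac{1}{3}\, [ f(-h) + 4 f(0) + f(h) ]\, h .$$
By partial integration, it follows that
$$\int_{-h}^{0} (x + h)^3 (3x - h)\, g^{(iv)}(x)\, dx + \int_0^h (x - h)^3 (3x + h)\, g^{(iv)}(x)\, dx = \int_0^h (x - h)^3 (3x + h)\, [ g^{(iv)}(x) + g^{(iv)}(-x) ]\, dx = 72 \int_{-h}^{h} g(x)\, dx$$
and hence, since $g^{(iv)} = f^{(iv)}$, that
$$\left| \int_{-h}^{h} g(x)\, dx \right| \le \frac{K}{36} \int_0^h (h - x)^3 (3x + h)\, dx = \frac{K}{36} \left[ \frac{3}{5}\, (h - x)^5 - h\, (h - x)^4 \right]_0^h = \frac{K}{90}\, h^5 .$$
Fig. 76: Simpson's approximation.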
Corollary 3.1.38. Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f^{(iv)}(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Finally, let $n \in \mathbb{N}^*$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$. Then
$$\left| \int_a^b f(x)\, dx - \frac{h}{6} \sum_{i=0}^{n-1} \left[ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) \right] \right| \le \frac{K (b - a)^5}{2880\, n^4} .$$
Note that
$$\frac{h}{6} \sum_{i=0}^{n-1} \left[ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) \right] = \frac{2}{3}\, h \sum_{i=0}^{n-1} f((a_i + a_{i+1})/2) + \frac{1}{3}\, h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2}$$
hence equals the sum of two-thirds of the corresponding sum for the midpoint rule and one-third of the corresponding sum for the trapezoid rule.
Proof. The corollary is a simple consequence of Theorem 3.1.37.
Example 3.1.39. As before with the midpoint and trapezoid rules, we use Simpson's rule to approximate the value of
$$\ln(2) = \int_1^2 \frac{dx}{x} .$$
Again, we use the partition
$$(a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4)$$
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
$$\frac{h}{6} \sum_{i=0}^{3} \left[ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) \right] = \frac{2}{3} \cdot \frac{4448}{6435} + \frac{1}{3} \cdot \frac{1171}{1680} = \frac{1498711}{2162160} \approx 0.693155$$
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to six decimal places. Also, the corresponding sums for the midpoint rule and the trapezoid rule have been used. To six decimal places, $\ln(2)$ is given by
$$\ln(2) \approx 0.693147 .$$
The result of this application of Simpson's rule gives $\ln(2)$ within an error of $8 \cdot 10^{-6}$. Since
$$|f^{(iv)}(x)| = 24/x^5 \le 24$$
for all $x \in (1, 2)$, Corollary 3.1.38 leads to the error bound
$$\left| \frac{1498711}{2162160} - \ln(2) \right| \le \frac{24}{2880 \cdot 256} = \frac{1}{30720} < 4 \cdot 10^{-5} .$$
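The Simpson sum of Example 3.1.39 can be computed directly from the formula of Corollary 3.1.38 in exact rational arithmetic. The sketch below (with the illustrative helper name `simpson_sum`) also confirms that it coincides with two-thirds of the midpoint value plus one-third of the trapezoid value.

```python
from fractions import Fraction
import math

def simpson_sum(f, a, b, n):
    """(h/6) * sum of f(a_i) + 4 f(midpoint) + f(a_{i+1}) over n equal subintervals."""
    h = Fraction(b - a, n)
    total = Fraction(0)
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        total += f(left) + 4 * f((left + right) / 2) + f(right)
    return h / 6 * total

approx = simpson_sum(lambda x: 1 / x, 1, 2, 4)
combined = Fraction(2, 3) * Fraction(4448, 6435) + Fraction(1, 3) * Fraction(1171, 1680)
error = abs(float(approx) - math.log(2))
bound = Fraction(24, 2880 * 4 ** 4)  # K (b - a)^5 / (2880 n^4) with K = 24, n = 4
print(approx, float(approx), error, bound)
```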
Problems
1) Calculate the integral. In addition, evaluate the integral approximately, using the midpoint rule, the trapezoidal rule and Simpson’s
rule. In this, subdivide the interval of integration into 4 intervals of
equal length. Compare the approximation to the exact result.
»1
a)
0
»1
c)
0
»1
du
p1 uq2
,
b)
0
3u2
p1 u3 q2 du
p1
2x
dx
x2 q2
,
.
2) By using Simpson’s rule, approximate the area in R2 that is enclosed
by the Cartesian leaf
?
C : tpx, y q P R2 : 3 2 py 2 x2 q
2 x px2
3y 2 q 0u
where a ¡ 0. In this, subdivide the interval of integration into 4
intervals of equal length. Compare the approximation to the exact
result which is given by 1.5.
3) The time for one complete swing (‘period’) T of a pendulum with
length L ¡ 0 is given by
a
L{g
π k 2 I pk q
1 k2
2
T ?
where
I pk q »1
?
1 1 k2 u2
? 2
? 1 2u ? 2 2 du ,
1k
1k u
θ0 P pπ {2, π {2q is the initial angle of elongation from the position of rest of the pendulum, k : | sinpθ0 {2q|, and where g is the
acceleration of the Earth’s gravitational field. By using Simpson’s
rule, approximate T for θ0 π {4. In this, subdivide the interval of
integration into 4 intervals of equal length.
3.2 Improper Integrals
A large number of integrals in applications are 'improper' in the sense that they are not Riemann integrals of functions over bounded closed intervals of $\mathbb{R}$. For instance in physics, integrals over unbounded sets occur naturally in the description of systems of infinite extension which are basic for physics. Another important source of improper integrals is the theory of special functions, where the majority of integral representations are in the form of improper Riemann integrals (or, alternatively, Lebesgue integrals). Special functions also have important applications. The majority appear as solutions of differential equations from applications, like Bessel functions, hypergeometric functions, confluent hypergeometric functions or elliptic functions. Others, like the gamma function or the beta function, appear naturally in the definitions of the former.
For this reason, in this section we also introduce basic special functions,
the gamma function and the beta function, by help of such integral representations and derive their basic properties. In particular, Legendre’s duplication formula, Euler’s reflection formula and Gauss’ representation for
the gamma function are proved in this section. In applications, these results
are often needed also for complex arguments. As is known, these follow
from those for real arguments by help of the principle of analytic continuation. In addition, elementary properties of Gaussian integrals are derived
that are frequently used in quantum theory and in probability theory. Original proofs of some of these results used improper double integrals. In the
meantime, more elementary proofs have been found that allow their derivation already at an early stage in a calculus course. In particular, we use
results from [26] and [61].
For motivation, in the following we consider the problem of calculating the period of a simple pendulum in Earth’s gravitational field, which
leads in a natural way to an improper Riemann integral. A simple pendulum is defined as a particle of mass $m > 0$ suspended from a point $O$
by a string of length $L > 0$ and of negligible mass. During the time of
development of calculus in the 17th century, such motion was considered
in 1673 by the inventor of the pendulum clock, Christian Huygens [56]. In
the analysis below, we use Newton’s equation of motion, which was not
known to Huygens at that time.

Fig. 77: A simple pendulum. The dashed line marks the rest position. Compare text.

Newton’s equation of motion gives the following differential equation for
the angle of elongation $\theta$ from the rest position of the pendulum as a function of time,
\[ \theta'' + \frac{g}{L}\,\sin\theta = 0 \tag{3.2.1} \]
where $g$ is the acceleration of Earth’s gravitational field. The general solution of this equation is not expressible in terms of elementary functions, but
only in terms of special functions called ‘elliptic functions’. In the following, instead of finding the solutions of (3.2.1), we pursue the goal of finding
the time $\tau$ for the pendulum to reach the angle $0$ after release from rest at
initial time $0$, i.e., $\theta'(0) = 0$, and with initial elongation $\theta_0 \in (0, \pi/2)$.
The time $\tau$ corresponds to one-fourth of the time necessary for completion of one complete swing, i.e., to one-fourth of the period of the pendulum. For this, we assume that there is a unique solution $\theta : \mathbb{R} \to \mathbb{R}$
of (3.2.1) such that $\theta(0) = \theta_0$, $\theta'(0) = 0$, $0 \in \operatorname{Ran}(\theta)$, and we define
$\tau := \min \theta^{-1}(\{0\})$. In the following, we consider only this solution of (3.2.1), whose existence and
uniqueness can be proved. Note that these
assumptions imply that $\theta$ is twice differentiable and, as a particular consequence of (3.2.1), that $\theta''$ is continuous.
In a first step, we use the conserved energy for the solutions of (3.2.1), see
Example 2.5.9, to derive a differential equation for $\theta$ that contains no
derivatives of $\theta$ of order higher than one. Multiplication of (3.2.1) by $\theta'$ gives
\[ 0 = \theta'\theta'' + \frac{g}{L}\,\theta'\sin\theta = \left[ \frac{1}{2}\,\theta'^{\,2} - \frac{g}{L}\cos\theta \right]' . \]
Hence it follows by Theorem 2.5.7 that the function inside the brackets is
constant and therefore that
\[ \frac{1}{2}\,(\theta'(t))^2 - \frac{g}{L}\cos\theta(t) = \frac{1}{2}\,(\theta'(0))^2 - \frac{g}{L}\cos\theta(0) = -\frac{g}{L}\cos\theta_0 \]
which leads to
\[ (\theta'(t))^2 = \frac{2g}{L}\,[\cos\theta(t) - \cos\theta_0] \]
for every $t \in \mathbb{R}$. The solution of the last equation for $\theta'(t)$ for some $t \in \mathbb{R}$
requires the knowledge of the sign of $\theta'(t)$. By the fundamental theorem of
calculus, it follows from (3.2.1) that
\[ \theta'(t) = \theta'(t) - \theta'(0) = \int_0^t \theta''(s)\,ds = -\frac{g}{L}\int_0^t \sin\theta(s)\,ds \leq 0 \]
for all $t \in [0,\tau]$ where it has been used that $\theta(\tau) = 0$ and $\theta(t) \in [0,\theta_0] \subset [0,\pi/2)$. Both follow from the definition of $\tau$. Hence, we conclude that
\[ \theta'(t) = -\sqrt{\frac{2g}{L}}\,\sqrt{\cos\theta(t) - \cos\theta_0} \]
for all $t \in [0,\tau]$.
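Since $\theta'(t) < 0$ for all $t \in (0,\tau)$, it follows by Theorems 2.3.44, 2.5.10
and 2.5.18 that for the restriction of $\theta$ to the interval $[0,\tau]$ there is a strictly
decreasing continuous inverse function $\theta^{-1} : [0,\theta_0] \to \mathbb{R}$ whose restriction
to $(0,\theta_0)$ is differentiable such that
\[ (\theta^{-1})'(\varphi) = \frac{1}{\theta'(\theta^{-1}(\varphi))} = -\sqrt{\frac{L}{2g}}\,\frac{1}{\sqrt{\cos\theta(\theta^{-1}(\varphi)) - \cos\theta_0}} = -\sqrt{\frac{L}{2g}}\,\frac{1}{\sqrt{\cos\varphi - \cos\theta_0}} = -\frac{1}{2}\sqrt{\frac{L}{g}}\,\frac{1}{\sqrt{\sin^2\!\left(\frac{\theta_0}{2}\right) - \sin^2\!\left(\frac{\varphi}{2}\right)}} \]
for all $\varphi \in (0,\theta_0)$ where the addition theorem for the cosine has been used
to conclude that
\[ \cos\alpha = \cos\!\left(2\cdot\frac{\alpha}{2}\right) = \cos^2\!\left(\frac{\alpha}{2}\right) - \sin^2\!\left(\frac{\alpha}{2}\right) = 1 - 2\sin^2\!\left(\frac{\alpha}{2}\right) \]
for every $\alpha \in \mathbb{R}$. Hence it follows by the fundamental theorem of calculus, since $\theta^{-1}(0) = \tau$,
that
\[ \tau - \theta^{-1}(\varphi) = -[\theta^{-1}(\varphi) - \theta^{-1}(0)] = -\int_0^{\varphi} (\theta^{-1})'(\bar\varphi)\,d\bar\varphi = \frac{1}{2}\sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\bar\varphi}{\sqrt{\sin^2\!\left(\frac{\theta_0}{2}\right) - \sin^2\!\left(\frac{\bar\varphi}{2}\right)}} = \frac{1}{2k}\sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\bar\varphi}{\sqrt{1 - \frac{1}{k^2}\sin^2\!\left(\frac{\bar\varphi}{2}\right)}} \]
for every $\varphi \in [0,\theta_0)$ where $k \in (0,1)$ is defined by
\[ k := \sin\!\left(\frac{\theta_0}{2}\right) . \]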
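By use of the substitution $g : [0, \sin(\varphi/2)/k] \to \mathbb{R}$ defined by
\[ g(u) := 2\arcsin(ku) \]
for every $u \in [0, \sin(\varphi/2)/k]$, we arrive at
\[ \tau - \theta^{-1}(\varphi) = \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k}\sin(\varphi/2)} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} . \]
Finally, since $\theta^{-1} : [0,\theta_0] \to \mathbb{R}$ is continuous, we conclude that
\[ \tau = \lim_{\varphi\to\theta_0} \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k}\sin(\varphi/2)} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} = \lim_{u\to 1} \sqrt{\frac{L}{g}} \int_0^{u} \frac{d\bar u}{\sqrt{(1-\bar u^2)(1-k^2\bar u^2)}} . \]
It would be natural to indicate the last by
\[ \tau = \sqrt{\frac{L}{g}} \int_0^{1} \frac{d\bar u}{\sqrt{(1-\bar u^2)(1-k^2\bar u^2)}} , \]
but the integrand of the last ‘integral’ is not defined at $\bar u = 1$ and its restriction to the interval $[0,1)$ is an unbounded function. Hence the last ‘integral’
is not a Riemann integral. The definitions below turn it into an improper Riemann integral defined by
\[ \sqrt{\frac{L}{g}} \int_0^{1} \frac{d\bar u}{\sqrt{(1-\bar u^2)(1-k^2\bar u^2)}} := \lim_{u\to 1} \sqrt{\frac{L}{g}} \int_0^{u} \frac{d\bar u}{\sqrt{(1-\bar u^2)(1-k^2\bar u^2)}} . \]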
As a side remark, we mention that
\[ \int_0^{u} \frac{d\bar u}{\sqrt{(1-\bar u^2)(1-k^2\bar u^2)}} \]
for $0 \leq u \leq 1$ is called an elliptic integral of the first kind (in Jacobian
form) and is denoted by the symbol $F(u|k)$.
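
The quarter period $\tau = \sqrt{L/g}\,F(1|k)$ derived above can be evaluated numerically. The sketch below is an illustration under stated assumptions, not part of the text: it uses Gauss' arithmetic-geometric mean identity $F(1|k) = \pi / (2\,\mathrm{AGM}(1, \sqrt{1-k^2}))$, a classical result that is not proved here, and cross-checks it with a midpoint rule applied after the substitution $u = \sin\phi$, which removes the singularity at $u = 1$. The function names and the default value of `g` are hypothetical choices for this sketch.

```python
import math


def complete_elliptic_k(k, iterations=40):
    """F(1|k) via the arithmetic-geometric mean (assumed classical identity)."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def complete_elliptic_k_midpoint(k, n=2000):
    """Same quantity as a proper integral over [0, pi/2] after u = sin(phi)."""
    h = (math.pi / 2.0) / n
    return h * sum(
        1.0 / math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2)
        for i in range(n)
    )


def quarter_period(theta0, length, g=9.81):
    """tau = sqrt(L/g) * F(1|k) with k = sin(theta0 / 2), 0 < theta0 < pi/2."""
    k = math.sin(theta0 / 2.0)
    return math.sqrt(length / g) * complete_elliptic_k(k)


if __name__ == "__main__":
    # Quarter period for a 1 m pendulum released at 45 degrees.
    print(quarter_period(math.pi / 4.0, 1.0))
    # Small-angle value (pi/2) * sqrt(L/g) for comparison.
    print(0.5 * math.pi * math.sqrt(1.0 / 9.81))
```

For small $\theta_0$ the result should approach the familiar small-angle value $(\pi/2)\sqrt{L/g}$, and it increases with $\theta_0$.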
Definition 3.2.1. (Improper Riemann integrals)
(i) Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and $f : [a,b) \to \mathbb{R}$
be almost everywhere continuous. Then $F : [a,b) \to \mathbb{R}$, defined by
\[ F(x) := \int_a^x f(y)\,dy \]
for every $x \in [a,b)$, is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there
is $L \in \mathbb{R}$ such that
\[ \lim_{x\to b} F(x) = \lim_{x\to b} \int_a^x f(y)\,dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y)\,dy := \lim_{x\to b} \int_a^x f(y)\,dy . \]
(ii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R}$ be such that $a < b$ if $a \neq -\infty$ and $f : (a,b] \to \mathbb{R}$ be almost everywhere continuous. Then $F : (a,b] \to \mathbb{R}$
defined by
\[ F(x) := \int_x^b f(y)\,dy \]
for every $x \in (a,b]$ is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there
is some $L \in \mathbb{R}$ such that
\[ \lim_{x\to a} F(x) = \lim_{x\to a} \int_x^b f(y)\,dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y)\,dy := \lim_{x\to a} \int_x^b f(y)\,dy . \]
(iii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $a \neq -\infty$ and
$b \neq \infty$. Further, let $f : (a,b) \to \mathbb{R}$ be almost everywhere continuous.
We say that $f$ is improper Riemann-integrable if both $f|_{(a,c]}$ and
$f|_{[c,b)}$ are improper Riemann-integrable for some $c \in (a,b)$. In this
case, we define
\[ \int_a^b f(x)\,dx := \int_a^c f(x)\,dx + \int_c^b f(x)\,dx . \]
That this definition is indeed independent of $c$ is a consequence of
the additivity of the Riemann integral, Theorem 2.6.18. The proof of
this is given in Remark 3.2.3 below.
Remark 3.2.2. Note that according to the previous definition, the restrictions to $(a,b]$, $[a,b)$ or $(a,b)$ of a continuous function defined on a bounded
closed interval $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a < b$, are improper
Riemann-integrable, and that the values of the associated improper integrals all coincide with the Riemann integral of that function.
Remark 3.2.3. In the following, we use the notation from Definition 3.2.1.
That Definition 3.2.1 (iii) is independent of $c \in (a,b)$ can be seen as follows.
For this, let $d \in (c,b)$ and let $a_1, a_2, \dots$ in $(a,d]$, $b_1, b_2, \dots$ in $[d,b)$ be sequences
that are convergent to $a$ and $b$, respectively. Then it follows by the additivity
of the Riemann integral, Theorem 2.6.18, for sufficiently large $k \in \mathbb{N}$ that
\[ \int_{a_k}^{c} f(x)\,dx + \int_{c}^{d} f(x)\,dx = \int_{a_k}^{d} f(x)\,dx , \qquad \int_{c}^{d} f(x)\,dx + \int_{d}^{b_k} f(x)\,dx = \int_{c}^{b_k} f(x)\,dx . \]
Hence it follows by the limit laws that
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{d} f(x)\,dx = \lim_{k\to\infty} \int_{a_k}^{c} f(x)\,dx + \int_{c}^{d} f(x)\,dx = \lim_{k\to\infty} \int_{a_k}^{d} f(x)\,dx , \]
\[ \int_{c}^{b} f(x)\,dx - \int_{c}^{d} f(x)\,dx = \lim_{k\to\infty} \int_{c}^{b_k} f(x)\,dx - \int_{c}^{d} f(x)\,dx = \lim_{k\to\infty} \int_{d}^{b_k} f(x)\,dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{d} f(x)\,dx = \int_{a}^{d} f(x)\,dx , \qquad \int_{c}^{b} f(x)\,dx - \int_{c}^{d} f(x)\,dx = \int_{d}^{b} f(x)\,dx . \]
The last implies that
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{b} f(x)\,dx = \int_{a}^{d} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]
The case that $d \in (a,c)$ is analogous. If $a_1, a_2, \dots$ in $(a,d]$, $b_1, b_2, \dots$
in $[d,b)$ are convergent to $a$ and $b$, respectively, then it follows by the
additivity of the Riemann integral, Theorem 2.6.18, for sufficiently large
$k \in \mathbb{N}$ that
\[ \int_{a_k}^{d} f(x)\,dx + \int_{d}^{c} f(x)\,dx = \int_{a_k}^{c} f(x)\,dx , \qquad \int_{d}^{c} f(x)\,dx + \int_{c}^{b_k} f(x)\,dx = \int_{d}^{b_k} f(x)\,dx . \]
Hence it follows by the limit laws that
\[ \int_{a}^{c} f(x)\,dx - \int_{d}^{c} f(x)\,dx = \lim_{k\to\infty} \int_{a_k}^{c} f(x)\,dx - \int_{d}^{c} f(x)\,dx = \lim_{k\to\infty} \int_{a_k}^{d} f(x)\,dx , \]
\[ \int_{c}^{b} f(x)\,dx + \int_{d}^{c} f(x)\,dx = \lim_{k\to\infty} \int_{c}^{b_k} f(x)\,dx + \int_{d}^{c} f(x)\,dx = \lim_{k\to\infty} \int_{d}^{b_k} f(x)\,dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_{a}^{c} f(x)\,dx - \int_{d}^{c} f(x)\,dx = \int_{a}^{d} f(x)\,dx , \qquad \int_{c}^{b} f(x)\,dx + \int_{d}^{c} f(x)\,dx = \int_{d}^{b} f(x)\,dx . \]
The last implies that
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{b} f(x)\,dx = \int_{a}^{d} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]
In the following, we give two prime examples of improper integrals whose
integrands are restrictions of powers of the identity function on $(0,\infty)$. The
first example shows that such are improper Riemann-integrable over an interval $(0,a]$,
where $a > 0$, if and only if that power is greater than $-1$. The second
example shows that such are improper Riemann-integrable over an interval $[a,\infty)$,
where $a > 0$, if and only if that power is smaller than $-1$.
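Example 3.2.4. Define $f_\alpha := ((0,a] \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$
where $a > 0$. Show that
(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha < 1$ and
\[ \int_0^a \frac{dx}{x^\alpha} = \frac{a^{1-\alpha}}{1-\alpha} . \]
(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \geq 1$.
Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $\varepsilon \in (0,a)$, it follows that
\[ \int_\varepsilon^a \frac{dx}{x^\alpha} = \left[ \frac{x^{1-\alpha}}{1-\alpha} \right]_\varepsilon^a = \frac{a^{1-\alpha} - \varepsilon^{1-\alpha}}{1-\alpha} \]
and for $\alpha = 1$ that
\[ \int_\varepsilon^a \frac{dx}{x} = [\ln(x)]_\varepsilon^a = \ln(a) - \ln(\varepsilon) \]
and hence the statements.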
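Example 3.2.5. Define $f_\alpha := ([a,\infty) \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$
where $a > 0$. Show that
(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha > 1$ and
\[ \int_a^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1}\,\frac{1}{a^{\alpha-1}} . \]
(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \leq 1$.
Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $x > a$, it follows that
\[ \int_a^x \frac{dy}{y^\alpha} = \left[ \frac{y^{1-\alpha}}{1-\alpha} \right]_a^x = \frac{x^{1-\alpha} - a^{1-\alpha}}{1-\alpha} \]
and for $\alpha = 1$
\[ \int_a^x \frac{dy}{y} = [\ln(y)]_a^x = \ln(x) - \ln(a) \]
and hence the statements.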
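
The dichotomy in the two preceding examples can be watched numerically by evaluating the truncated integrals from the solutions for shrinking $\varepsilon$ and growing $x$. The following is a small illustrative sketch; the function names are hypothetical and the closed forms are exactly the antiderivative expressions used in the solutions above.

```python
import math


def integral_near_zero(alpha, a, eps):
    """Value of the integral of x**(-alpha) over [eps, a], 0 < eps < a."""
    if alpha == 1:
        return math.log(a) - math.log(eps)
    return (a ** (1 - alpha) - eps ** (1 - alpha)) / (1 - alpha)


def integral_to_infinity(alpha, a, x):
    """Value of the integral of y**(-alpha) over [a, x], 0 < a < x."""
    if alpha == 1:
        return math.log(x) - math.log(a)
    return (x ** (1 - alpha) - a ** (1 - alpha)) / (1 - alpha)


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-8):
        # alpha = 0.5 settles near 2, alpha = 1.5 grows without bound.
        print(eps, integral_near_zero(0.5, 1.0, eps), integral_near_zero(1.5, 1.0, eps))
    for x in (1e2, 1e4, 1e8):
        # alpha = 2 settles near 1, alpha = 0.5 grows without bound.
        print(x, integral_to_infinity(2.0, 1.0, x), integral_to_infinity(0.5, 1.0, x))
```

The printed columns make the role of the threshold $\alpha = 1$ visible: on one side the truncated integrals stabilize, on the other they keep growing.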
The following gives an important criterion for improper Riemann integrability. It is based on the fact that for every bounded continuous and increasing function $F : [a,b) \to \mathbb{R}$, where $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ are such that $a < b$
if $b \neq \infty$,
\[ \lim_{x\to b} F(x) \]
exists. Within the following theorem, $F$ is an antiderivative of the absolute
value of an almost everywhere continuous integrand. In this connection,
the theorem is applied by showing that this absolute value has an improper
Riemann-integrable majorant.
Theorem 3.2.6. Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and
$f : [a,b) \to \mathbb{R}$ be almost everywhere continuous. Finally, let $G : [a,b) \to \mathbb{R}$ be defined by
\[ G(x) := \int_a^x |f(y)|\,dy \]
for all $x \in [a,b)$ and be bounded. Then, $f$ and $|f|$ are improper Riemann-integrable.
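Proof. Let $(b_n)_{n\in\mathbb{N}}$ be a sequence in $[a,b)$ which is convergent to $b$. Since $G$
is bounded and increasing, $\sup \operatorname{Ran} G$ exists. As a consequence, for given
$\varepsilon > 0$, there is some $c \in [a,b)$ such that $\sup \operatorname{Ran} G - G(c) < \varepsilon$. Hence it
also follows that $\sup \operatorname{Ran} G - G(x) < \varepsilon$ for all $x \in [c,b)$. Then there is
$n_0 \in \mathbb{N}$ such that $b_n \geq c$ for all $n \in \mathbb{N}$ satisfying $n \geq n_0$. Therefore it also
follows that $|\sup \operatorname{Ran} G - G(b_n)| < \varepsilon$ for $n \in \mathbb{N}$ satisfying $n \geq n_0$ and
hence finally
\[ \lim_{n\to\infty} G(b_n) = \sup \operatorname{Ran} G . \]
Hence $|f|$ is improper Riemann-integrable and
\[ \int_a^b |f(x)|\,dx = \sup \operatorname{Ran} G . \]
Further, for every $x \in [a,b)$, it follows that
\[ \int_a^x (|f(y)| - f(y))\,dy \leq 2 \int_a^x |f(y)|\,dy \leq 2 \sup \operatorname{Ran} G . \]
Since $|f| - f \geq 0$, the first part of the proof applies to $|f| - f$. Hence $|f| - f$ and therefore also $f$ is improper Riemann-integrable.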
As an application of the previous theorem, the next example defines the
gamma function. The last extends the factorial function $(\mathbb{N} \to \mathbb{N}, n \mapsto n!)$
to a function with domain given by all real numbers which are not negative
integers and such that the functional relationship of the factorial,
$(n+1)! = (n+1)\,n!$ for all $n \in \mathbb{N}$, is preserved. The proof of the last will be
given within the example that is next to the following example. A main
reason for the importance of the gamma function for applications is the fact
that it appears naturally in the definition of many special functions that are
solutions of differential equations from applications.
Example 3.2.7. Show that $f_y := ((0,\infty) \to \mathbb{R}, x \mapsto e^{-x} x^{y-1})$ is improper
Riemann-integrable for every $y > 0$. Hence we can define the gamma
function $\Gamma : (0,\infty) \to \mathbb{R}$ by
\[ \Gamma(y) := \int_0^\infty e^{-x} x^{y-1}\,dx \]
for all $y \in (0,\infty)$.

Fig. 78: Graphs of the gamma function $\Gamma$ (left) and $1/\Gamma$.

Solution: Let $y > 0$. For every $\varepsilon \in (0,1)$, it follows that
\[ \int_\varepsilon^1 e^{-x} x^{y-1}\,dx \leq \int_\varepsilon^1 x^{y-1}\,dx \leq \frac{1}{y} \]
and hence by Theorem 3.2.6 that $f_y|_{(0,1]}$ is improper Riemann-integrable
and that
\[ \int_0^1 e^{-x} x^{y-1}\,dx \leq \frac{1}{y} . \]
Further, $h_y := ([1,\infty) \to \mathbb{R}, x \mapsto e^{-x/2} x^{y-1})$ has a maximum at $x_0 :=
\max\{1, 2(y-1)\}$. Hence it follows for every $R \geq 1$ that
\[ \int_1^R e^{-x} x^{y-1}\,dx \leq h_y(x_0) \int_1^R e^{-x/2}\,dx \leq 2\,h_y(x_0)\,e^{-1/2} \]
and by Theorem 3.2.6 that $f_y|_{[1,\infty)}$ is improper Riemann-integrable. Note
for later use that
\[ \Gamma(1) = \int_0^1 e^{-x}\,dx + \lim_{R\to\infty} \int_1^R e^{-x}\,dx = \lim_{R\to\infty} \int_0^R e^{-x}\,dx = \lim_{R\to\infty} (1 - e^{-R}) = 1 . \]
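Example 3.2.8. Show that
\[ \Gamma(y+1) = y\,\Gamma(y) \tag{3.2.2} \]
for all $y > 0$ and hence that
\[ \Gamma(n+1) = n! \tag{3.2.3} \]
for all $n \in \mathbb{N}$. Solution: By partial integration it follows for every $y > 0$,
$\varepsilon \in (0,1)$ and $R \in (1,\infty)$ that
\[ \int_\varepsilon^R e^{-x} x^{y}\,dx = e^{-\varepsilon} \varepsilon^{y} - e^{-R} R^{y} + y \int_\varepsilon^R e^{-x} x^{y-1}\,dx \]
and hence (3.2.2). Since $\Gamma(1) = 1 = 0!$, from this follows (3.2.3) by
induction.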
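
As a numerical sanity check of the defining integral and of (3.2.2), one can truncate the integral at a large upper limit and apply a midpoint rule. This is only a rough sketch under stated assumptions: the helper name `gamma_midpoint`, the cutoff `upper = 60` and the step count are choices made for illustration, the sketch is meant for $y \geq 1$ (where the integrand is bounded), and Python's built-in `math.gamma` serves as the reference value.

```python
import math


def gamma_midpoint(y, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the gamma integral on [0, upper], y >= 1."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(-x) * x ** (y - 1.0)
    return h * total


if __name__ == "__main__":
    # Compare the truncated integral with the library value.
    print(gamma_midpoint(2.5), math.gamma(2.5))
    # Functional equation (3.2.2): Gamma(y + 1) = y * Gamma(y).
    print(gamma_midpoint(3.5), 2.5 * gamma_midpoint(2.5))
    # (3.2.3): Gamma(n + 1) = n!.
    print(math.gamma(6.0), math.factorial(5))
```

The cutoff is harmless here because the integrand decays like $e^{-x}$; for $0 < y < 1$ the singularity at $0$ would call for a different treatment.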
As another example of an application of Theorem 3.2.6, the next example
defines Gaussian integrals. Such integrals appear in quantum theory in the
study of the quantization of the harmonic oscillator which is of fundamental
importance for physics. In addition, they appear naturally in the study of the
normal distribution in probability theory. The last distribution is frequently
used for the description of error progression due to random errors occurring
in measurements of physical quantities.
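Example 3.2.9. ( Gaussian integrals, I ) Show that $f_{m,n} := ([0,\infty) \to \mathbb{R}, x \mapsto x^m e^{-nx^2/2})$ is improper Riemann-integrable for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$.
In particular, show that $I : \mathbb{N} \times \mathbb{N}^* \to \mathbb{R}$ defined by
\[ I(m,n) := \int_0^\infty x^m e^{-nx^2/2}\,dx \]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ satisfies
\[ I(m+2,n) = \frac{m+1}{n}\,I(m,n) \tag{3.2.4} \]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and, in particular,
\[ I(2k+1,n) = \frac{2^k\,k!}{n^{k+1}} , \qquad I(2(k+1),n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\,\frac{1}{\sqrt{n}}\,I(0,1) \tag{3.2.5} \]
for all $k \in \mathbb{N}$. Solution: First, it follows for $n \in \mathbb{N}^*$ and $x \geq 1$ that
\[ \int_0^x y\,e^{-ny^2/2}\,dy = -\frac{1}{n} \int_0^x (-ny)\,e^{-ny^2/2}\,dy = -\frac{1}{n} \left[ e^{-ny^2/2} \right]_0^x = \frac{1}{n} \left( 1 - e^{-nx^2/2} \right) , \]
\[ \int_0^x e^{-ny^2/2}\,dy = \int_0^1 e^{-ny^2/2}\,dy + \int_1^x e^{-ny^2/2}\,dy \leq \int_0^1 e^{-ny^2/2}\,dy + \int_0^x y\,e^{-ny^2/2}\,dy \]
and hence by Theorem 3.2.6 that $f_{0,n}$ and $f_{1,n}$ are improper Riemann-integrable as well as that
\[ I(1,n) = \int_0^\infty y\,e^{-ny^2/2}\,dy = \frac{1}{n} . \tag{3.2.6} \]
Further, according to Example 2.5.12, $e^x \geq 1$ and $e^x \geq x$ for all $x \geq 0$.
Hence it follows that
\[ e^x \geq e^x - 1 = \int_0^x e^y\,dy \geq \int_0^x y\,dy = \frac{x^2}{2} \]
for all $x \geq 0$ and in this way inductively that
\[ e^x \geq \frac{x^m}{m!} \]
for all $x \geq 0$ and $m \in \mathbb{N}$. In addition, it follows for $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and
$x \geq 0$ that
\[ \int_0^x y^{m+2} e^{-ny^2/2}\,dy = -\frac{1}{n} \int_0^x y^{m+1} (-ny)\,e^{-ny^2/2}\,dy = -\frac{1}{n} \left[ y^{m+1} e^{-ny^2/2} \right]_0^x + \frac{1}{n} \int_0^x (m+1)\,y^m e^{-ny^2/2}\,dy \]
\[ = -\frac{1}{n}\,x^{m+1} e^{-nx^2/2} + \frac{m+1}{n} \int_0^x y^m e^{-ny^2/2}\,dy . \tag{3.2.7} \]
Since
\[ \frac{1}{n}\,x^{m+1} e^{-nx^2/2} \leq \frac{(m+1)!}{n}\,e^{x - nx^2/2} , \]
we notice that
\[ \lim_{x\to\infty} \frac{1}{n}\,x^{m+1} e^{-nx^2/2} = 0 . \]
Hence the improper Riemann-integrability of $f_{m,n}$ for all $m \in \mathbb{N}$ and $n \in \mathbb{N}^*$ follows inductively from (3.2.7), as well as the validity of (3.2.4) for all
$m \in \mathbb{N}$ and $n \in \mathbb{N}^*$. Further, it follows from (3.2.6) and (3.2.7) by induction
that
\[ I(2k+1,n) = \frac{2^k\,k!}{n^{k+1}} , \qquad I(2(k+1),n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\,I(0,n) \]
for all $k \in \mathbb{N}$ and, finally, as a consequence of
\[ \int_0^x e^{-ny^2/2}\,dy = \frac{1}{\sqrt{n}} \int_0^{\sqrt{n}\,x} e^{-u^2/2}\,du \]
that
\[ I(2(k+1),n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\,\frac{1}{\sqrt{n}}\,I(0,1) . \]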
Equation (3.2.5) reduces the calculation of the Gaussian integrals $I(m,1)$
for even $m \in \mathbb{N}$ to the calculation of
\[ \int_0^\infty e^{-x^2/2}\,dx . \]
The determination of the last is the object of the following example. As an
application of the result, the value of $\Gamma(1/2)$ is calculated in the subsequent
example.
Example 3.2.10. ( Gaussian integrals, II ) Together with Wallis’ product
representation of $\pi$ from Theorem 3.1.19, the application of the results of
Example 3.2.9 allows the calculation of $I(0,1)$ as follows. Employing the
notation of Example 3.2.9, in a first step, we conclude for $m \in \mathbb{N}$, $n \in \mathbb{N}^*$, $t \in \mathbb{R}$
that
\[ 0 \leq \int_0^x y^m (y+t)^2 e^{-ny^2/2}\,dy = \int_0^x y^{m+2} e^{-ny^2/2}\,dy + 2t \int_0^x y^{m+1} e^{-ny^2/2}\,dy + t^2 \int_0^x y^{m} e^{-ny^2/2}\,dy \]
and hence, since this quadratic polynomial in $t$ has no sign change, that
\[ \left( \int_0^x y^{m+1} e^{-ny^2/2}\,dy \right)^{2} \leq \int_0^x y^{m} e^{-ny^2/2}\,dy \, \int_0^x y^{m+2} e^{-ny^2/2}\,dy \]
for all $x \geq 0$ and, finally,
\[ I(m+1,n) \leq \sqrt{I(m,n)\,I(m+2,n)} . \]
In particular, since according to (3.2.4)
\[ I(n+1,n) = I(n-1,n) , \qquad I(n+2,n) = \frac{n+1}{n}\,I(n,n) , \]
we conclude that
\[ I(n+1,n) \leq \sqrt{I(n,n)\,I(n+2,n)} = \sqrt{\frac{n}{n+1}}\;I(n+2,n) , \qquad I(n,n) \leq \sqrt{I(n-1,n)\,I(n+1,n)} = I(n+1,n) \]
and hence that
\[ I(n,n) \leq I(n+1,n) \leq I(n+2,n) . \]
In particular, the case $n = 2k+1$, where $k \in \mathbb{N}$, leads to
\[ \frac{2^k\,k!}{(2k+1)^{k+1}} = I(2k+1,2k+1) \leq I(2k+2,2k+1) = \frac{1 \cdot 3 \cdots (2k+1)}{(2k+1)^{k+1}}\,\frac{1}{\sqrt{2k+1}}\,I(0,1) \leq I(2k+3,2k+1) = \frac{2^{k+1}\,(k+1)!}{(2k+1)^{k+2}} \]
and hence to
\[ \frac{2 \cdot 4 \cdots 2k}{1 \cdot 3 \cdots (2k-1)}\,\frac{1}{\sqrt{2k+1}} \leq I(0,1) \leq \frac{2k+2}{2k+1}\,\frac{2 \cdot 4 \cdots 2k}{1 \cdot 3 \cdots (2k-1)}\,\frac{1}{\sqrt{2k+1}} . \]
Finally, taking the limit $k \to \infty$ in the last expression and applying Wallis’
product representation of $\pi$ (3.2.5) leads to
\[ I(0,1) = \int_0^\infty e^{-y^2/2}\,dy = \sqrt{\frac{\pi}{2}} . \tag{3.2.8} \]
Example 3.2.11. Show that
\[ \Gamma(1/2) = \sqrt{\pi} . \]
Solution: For this, let $\varepsilon, R > 0$. By change of variables, it follows that
\[ \int_\varepsilon^R e^{-y^2/2}\,dy = \frac{1}{\sqrt{2}} \int_{\varepsilon^2/2}^{R^2/2} x^{-1/2} e^{-x}\,dx \]
and hence by taking the limits that
\[ \sqrt{\frac{\pi}{2}} = \frac{1}{\sqrt{2}}\,\Gamma(1/2) . \]

Fig. 79: Graph of the Beta function.
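
The Gaussian integrals $I(m,n) = \int_0^\infty x^m e^{-nx^2/2}\,dx$ and the value $\Gamma(1/2) = \sqrt{\pi}$ can be checked numerically. The sketch below is illustrative only: `gaussian_moment` is a hypothetical helper, and the truncation point $12/\sqrt{n}$ and the step count are assumptions chosen so that the neglected tail is far below floating-point resolution.

```python
import math


def gaussian_moment(m, n, steps=20_000):
    """Midpoint-rule approximation of I(m, n) on the truncated range [0, 12/sqrt(n)]."""
    upper = 12.0 / math.sqrt(n)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** m * math.exp(-n * x * x / 2.0)
    return h * total


if __name__ == "__main__":
    # I(0, 1) against sqrt(pi / 2), see (3.2.8).
    print(gaussian_moment(0, 1), math.sqrt(math.pi / 2.0))
    # I(1, n) = 1 / n, see (3.2.6).
    print(gaussian_moment(1, 3), 1.0 / 3.0)
    # Recursion (3.2.4) with m = 2, n = 2: I(4, 2) = (3 / 2) I(2, 2).
    print(gaussian_moment(4, 2), 1.5 * gaussian_moment(2, 2))
    # Gamma(1/2) = sqrt(2) * I(0, 1) = sqrt(pi).
    print(math.sqrt(2.0) * gaussian_moment(0, 1), math.gamma(0.5))
```

The midpoint rule is used because it never evaluates the integrand at the endpoints and converges very quickly for such smooth, rapidly decaying integrands.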
As another example of an application of Theorem 3.2.6, the next example
defines Euler’s beta function.
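Example 3.2.12. ( Beta function, I ) Show that $f_{x,y} := ((0,1) \to \mathbb{R}, t \mapsto
t^{x-1} (1-t)^{y-1})$ is improper Riemann-integrable for all $x > 0$, $y > 0$. Hence
we can define the Beta function $B : (0,\infty)^2 \to \mathbb{R}$ by
\[ B(x,y) := \int_0^1 t^{x-1} (1-t)^{y-1}\,dt \]
for all $x > 0$, $y > 0$. Solution: For this, let $x > 0$, $y > 0$. In addition, let
$\varepsilon, \delta \in (0,1/2)$. Then
\[ \int_\varepsilon^{1/2} t^{x-1} (1-t)^{y-1}\,dt \leq 2 \int_\varepsilon^{1/2} t^{x-1}\,dt = \left[ \frac{2\,t^{x}}{x} \right]_\varepsilon^{1/2} = \frac{2}{x} \left[ \left( \frac{1}{2} \right)^{x} - \varepsilon^{x} \right] \leq \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} , \]
\[ \int_{1/2}^{1-\delta} t^{x-1} (1-t)^{y-1}\,dt \leq 2 \int_{1/2}^{1-\delta} (1-t)^{y-1}\,dt = \left[ -\frac{2\,(1-t)^{y}}{y} \right]_{1/2}^{1-\delta} = \frac{2}{y} \left[ \left( \frac{1}{2} \right)^{y} - \delta^{y} \right] \leq \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]
Hence it follows by Theorem 3.2.6 that $f_{x,y}|_{(0,1/2]}$ and $f_{x,y}|_{[1/2,1)}$ are improper Riemann-integrable and that
\[ \int_0^{1/2} t^{x-1} (1-t)^{y-1}\,dt \leq \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} , \qquad \int_{1/2}^{1} t^{x-1} (1-t)^{y-1}\,dt \leq \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]
As a consequence, $f_{x,y}$ is improper Riemann-integrable and satisfies
\[ \int_0^{1} t^{x-1} (1-t)^{y-1}\,dt \leq \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} + \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]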
The next example represents the Gamma function essentially as a limit of
the beta function.
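Example 3.2.13. ( Beta function, II ) Show that
\[ \lim_{y\to\infty} y^{x} B(x,y) = \Gamma(x) \tag{3.2.9} \]
for all $x > 0$. Solution: For this, let $x > 0$, $y > 2$. In addition, let
$\varepsilon, \delta \in (0,1/2)$. Then
\[ \int_\varepsilon^{1-\delta} t^{x-1} (1-t)^{y-1}\,dt = \frac{1}{y-1} \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} \left( \frac{s}{y-1} \right)^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds = \frac{1}{(y-1)^{x}} \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds . \]
Further,
\[ \left| \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds - \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s}\,ds \right| \leq \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s} \left| 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \right| ds . \]
We consider the auxiliary function $h : [0,y-1] \to \mathbb{R}$ defined by
\[ h(s) := 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \]
for all $s \in [0,y-1]$. Then $h(0) = 0$, $h$ is continuous and differentiable on
$(0,y-1)$ with derivative
\[ h'(s) = \frac{s}{y-1} \left( 1 - \frac{s}{y-1} \right)^{y-2} e^{s} > 0 \]
for $0 < s < y-1$. Hence it follows for $s \in [0,y-1]$ that
\[ |h(s)| = h(s) = \int_0^s \frac{u}{y-1} \left( 1 - \frac{u}{y-1} \right)^{y-2} e^{u}\,du = \int_0^s \frac{u}{y-1} \exp\!\left[ u + (y-2) \ln\!\left( 1 - \frac{u}{y-1} \right) \right] du \]
\[ \leq \int_0^s \frac{u}{y-1} \exp\!\left[ u - (y-2)\,\frac{u}{y-1} \right] du = \int_0^s \frac{u}{y-1} \exp\!\left( \frac{u}{y-1} \right) du \leq \frac{e\,s^2}{2(y-1)} \]
where the case $a = 1$ of (2.5.12) has been used. Hence it follows further
that
\[ \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s} \left| 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \right| ds \leq \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s}\,\frac{e\,s^2}{2(y-1)}\,ds \leq \frac{e}{2(y-1)}\,\Gamma(x+2) . \]
From the previous, we conclude that
\[ |(y-1)^{x} B(x,y) - \Gamma(x)| \leq \frac{e}{2(y-1)}\,\Gamma(x+2) \]
and hence that
\[ \lim_{y\to\infty} y^{x} B(x,y) = \Gamma(x) . \]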
The following example expresses the beta function in terms of the gamma
function.
Example 3.2.14. ( Beta function, III ) Show that
\[ B(x,y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)} \tag{3.2.10} \]
for all $x, y > 0$. Solution: For this, let $x, y > 0$. In the first step, we show
by use of partial integration that
\[ B(x,y+1) = \frac{y}{x}\,B(x+1,y) . \tag{3.2.11} \]
For this, let $\varepsilon, \delta \in (0,1/2)$. Then
\[ \int_\varepsilon^{1-\delta} t^{x-1} (1-t)^{y}\,dt = \left[ \frac{1}{x}\,t^{x} (1-t)^{y} \right]_\varepsilon^{1-\delta} + \frac{y}{x} \int_\varepsilon^{1-\delta} t^{x} (1-t)^{y-1}\,dt = \frac{1}{x} \left[ (1-\delta)^{x} \delta^{y} - \varepsilon^{x} (1-\varepsilon)^{y} \right] + \frac{y}{x} \int_\varepsilon^{1-\delta} t^{x} (1-t)^{y-1}\,dt \]
which implies (3.2.11). Further, it follows that
\[ \int_\varepsilon^{1-\delta} t^{x-1} (1-t)^{y}\,dt + \int_\varepsilon^{1-\delta} t^{x} (1-t)^{y-1}\,dt = \int_\varepsilon^{1-\delta} (1 - t + t)\,t^{x-1} (1-t)^{y-1}\,dt = \int_\varepsilon^{1-\delta} t^{x-1} (1-t)^{y-1}\,dt \]
and hence that
\[ B(x,y+1) + B(x+1,y) = B(x,y) . \]
As a consequence, we obtain from (3.2.11) the equation
\[ B(x,y) = B(x,y+1) + B(x+1,y) = \frac{x+y}{y}\,B(x,y+1) \]
which results in
\[ B(x,y+1) = \frac{y}{x+y}\,B(x,y) . \tag{3.2.12} \]
By induction, we conclude from (3.2.12) that
\[ B(x,y+n) = \frac{y\,(y+1) \cdots (y+n-1)}{(x+y)\,(x+y+1) \cdots (x+y+n-1)}\,B(x,y) \]
for every $n \in \mathbb{N}^*$. In particular,
\[ B(y,n) = \frac{1 \cdot 2 \cdots (n-1)}{y\,(y+1) \cdots (y+n-1)} , \qquad B(x+y,n) = \frac{1 \cdot 2 \cdots (n-1)}{(x+y)\,(x+y+1) \cdots (x+y+n-1)} \tag{3.2.13} \]
where it has been used that
\[ B(z,1) = \int_0^1 t^{z-1}\,dt = \frac{1}{z} \]
for every $z > 0$. Hence it follows that
\[ B(x,y) = \frac{B(y,n)\,B(x,y+n)}{B(x+y,n)} = \left( \frac{n}{y+n} \right)^{x} \frac{n^{y} B(y,n)\;(y+n)^{x} B(x,y+n)}{n^{x+y} B(x+y,n)} . \]
From this follows (3.2.10) by taking the limit $n \to \infty$ and applying (3.2.9).
Note that, as a consequence of (3.2.9) and the first identity of (3.2.13), we
arrive at Gauss’ representation of the gamma function
\[ \Gamma(x) = \lim_{n\to\infty} (n+1)^{x} B(x,n+1) = \lim_{n\to\infty} n^{x} B(x,n+1) = \lim_{n\to\infty} \frac{n^{x}\,n!}{x\,(x+1) \cdots (x+n)} \]
for every $x > 0$.
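
The relation $B(x,y) = \Gamma(x)\Gamma(y)/\Gamma(x+y)$ and the recursion $B(x,y+1) = \frac{y}{x+y}B(x,y)$ can be spot-checked numerically. The following sketch is an illustration, not part of the text: `beta_via_gamma` and `beta_midpoint` are hypothetical helper names, and the midpoint rule is applied only for $x, y \geq 1$, where the integrand is bounded.

```python
import math


def beta_via_gamma(x, y):
    """B(x, y) computed from the gamma function, cf. (3.2.10)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def beta_midpoint(x, y, steps=100_000):
    """Midpoint-rule approximation of the beta integral over (0, 1), x, y >= 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (x - 1.0) * (1.0 - t) ** (y - 1.0)
    return h * total


if __name__ == "__main__":
    # B(2, 3) = 1/12 exactly.
    print(beta_midpoint(2.0, 3.0), beta_via_gamma(2.0, 3.0), 1.0 / 12.0)
    # Non-integer arguments.
    print(beta_midpoint(2.5, 1.5), beta_via_gamma(2.5, 1.5))
    # Recursion (3.2.12).
    x, y = 0.7, 1.9
    print(beta_via_gamma(x, y + 1.0), y / (x + y) * beta_via_gamma(x, y))
```

For $0 < x < 1$ or $0 < y < 1$ the integrand is unbounded at an endpoint, so a plain midpoint rule converges slowly there; the gamma-function formula is the more practical route in that range.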
Theorem 3.2.15. (Gauss’ representation of the gamma function) For
every $x > 0$
\[ \Gamma(x) = \lim_{n\to\infty} \frac{n^{x}\,n!}{x\,(x+1) \cdots (x+n)} . \tag{3.2.14} \]
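
Gauss' representation converges rather slowly, which a short experiment makes visible. The sketch below is illustrative only: `gauss_gamma` is a hypothetical helper that evaluates $n^x\,n!/(x(x+1)\cdots(x+n))$ through logarithms to avoid overflow of $n!$, and `math.gamma` is used as the reference.

```python
import math


def gauss_gamma(x, n):
    """n**x * n! / (x (x+1) ... (x+n)) for x > 0, accumulated in logarithms."""
    log_value = x * math.log(n) - math.log(x)
    for k in range(1, n + 1):
        log_value += math.log(k) - math.log(x + k)
    return math.exp(log_value)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        approx = gauss_gamma(0.5, n)
        print(n, approx, abs(approx - math.gamma(0.5)))
```

In this experiment the error appears to shrink roughly in proportion to $1/n$, so the representation is mainly of theoretical interest, as in the proof of the reflection formula.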
As an application of Gauss’ representation of the gamma function and the
product representation of the sine, (3.1.9), we prove the reflection formula
for Γ.
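Theorem 3.2.16. (Euler’s reflection formula for the gamma function)
The equation
\[ \Gamma(x)\,\Gamma(1-x) = \frac{\pi}{\sin(\pi x)} \tag{3.2.15} \]
holds for all $0 < x < 1$.
Proof. For this, let $0 < x < 1$. Then it follows by (3.1.9) that
\[ \Gamma(x)\,\Gamma(1-x) = \lim_{n\to\infty} \frac{n^{x}\,n!}{x\,(x+1) \cdots (x+n)} \cdot \frac{n^{1-x}\,n!}{(1-x)\,(2-x) \cdots [(n+1)-x]} = \frac{1}{x} \lim_{n\to\infty} \frac{n\,(n!)^2}{(1-x^2) \cdots (n^2-x^2)\,[(n+1)-x]} \]
\[ = \frac{1}{x} \lim_{n\to\infty} \frac{1}{\left( 1 - \frac{x^2}{1^2} \right) \cdots \left( 1 - \frac{x^2}{n^2} \right)} \cdot \frac{1}{1 + \frac{1-x}{n}} = \frac{1}{x}\,\frac{\pi x}{\sin(\pi x)} = \frac{\pi}{\sin(\pi x)} . \]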
Remark 3.2.17. Note that the reflection formula (3.2.15) can and is used to
extend the gamma function to negative values of its argument. See Fig. 80.
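
The extension mentioned in Remark 3.2.17 can be sketched directly from (3.2.15). The helper `gamma_reflected` below is a hypothetical name for this illustration; it assumes that for non-integer $x < 0$ one defines $\Gamma(x) := \pi / (\sin(\pi x)\,\Gamma(1-x))$, and it compares the result with Python's `math.gamma`, which also accepts negative non-integer arguments.

```python
import math


def gamma_reflected(x):
    """Gamma(x) for non-integer x < 0, obtained from the reflection formula."""
    if x >= 0 or x == math.floor(x):
        raise ValueError("x must be negative and not an integer")
    return math.pi / (math.sin(math.pi * x) * math.gamma(1.0 - x))


if __name__ == "__main__":
    # Reflection formula on (0, 1).
    for x in (0.1, 0.25, 0.5, 0.9):
        print(x, math.gamma(x) * math.gamma(1.0 - x), math.pi / math.sin(math.pi * x))
    # Extension to negative arguments compared with the library value.
    for x in (-0.5, -1.5, -2.25):
        print(x, gamma_reflected(x), math.gamma(x))
```

The sign of the extended function alternates between consecutive intervals $(-n-1,-n)$, which matches the shape of the graph in Fig. 80.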
As final examples for the application of improper integrals, Legendre’s duplication formula for the gamma function is proved, and an occasionally
occurring integral is evaluated in terms of the gamma function.
Fig. 80: Graphs of the extensions of the gamma function $\Gamma$ (left) and $1/\Gamma$ to negative
values of the argument.
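Example 3.2.18. Show Legendre’s duplication formula for the gamma function
\[ \Gamma(2x) = \frac{1}{\sqrt{\pi}}\,2^{2x-1}\,\Gamma(x)\,\Gamma(x + (1/2)) \tag{3.2.16} \]
for all $x > 0$. Solution: For this, let $x > 0$ and $\varepsilon, \delta \in (0,1/2)$. Then
\[ \frac{\Gamma(x)\,\Gamma(x)}{\Gamma(2x)} = B(x,x) = \int_0^1 [t(1-t)]^{x-1}\,dt . \]
Further, it follows by change of variables that
\[ \int_\varepsilon^{1-\delta} [t(1-t)]^{x-1}\,dt = \int_\varepsilon^{1-\delta} \left\{ \frac{1}{4} - \left( t - \frac{1}{2} \right)^{2} \right\}^{x-1} dt = \int_{\varepsilon - (1/2)}^{(1/2) - \delta} \left( \frac{1}{4} - u^2 \right)^{x-1} du = 2^{2-2x} \int_{\varepsilon - (1/2)}^{(1/2) - \delta} \left[ 1 - (2u)^2 \right]^{x-1} du \]
\[ = 2^{1-2x} \int_{2\varepsilon - 1}^{1 - 2\delta} (1 - v^2)^{x-1}\,dv = 2^{1-2x} \left[ \int_{0}^{1-2\delta} (1 - v^2)^{x-1}\,dv + \int_{2\varepsilon - 1}^{0} (1 - v^2)^{x-1}\,dv \right] = 2^{1-2x} \left[ \int_{0}^{1-2\delta} (1 - v^2)^{x-1}\,dv + \int_{0}^{1-2\varepsilon} (1 - v^2)^{x-1}\,dv \right] . \tag{3.2.17} \]
Further, it follows by change of variables that
\[ \int_a^b (1 - v^2)^{x-1}\,dv = \frac{1}{2} \int_{a^2}^{b^2} y^{-1/2} (1-y)^{x-1}\,dy , \]
where $0 < a < b$, and hence by taking the limit $a \to 0$ that
\[ \int_0^b (1 - v^2)^{x-1}\,dv = \frac{1}{2} \int_{0}^{b^2} y^{-1/2} (1-y)^{x-1}\,dy . \]
Hence it follows from (3.2.17) that
\[ \int_\varepsilon^{1-\delta} [t(1-t)]^{x-1}\,dt = 2^{-2x} \left[ \int_0^{(1-2\delta)^2} y^{-1/2} (1-y)^{x-1}\,dy + \int_0^{(1-2\varepsilon)^2} y^{-1/2} (1-y)^{x-1}\,dy \right] \]
and by taking the limits that
\[ \frac{\Gamma(x)\,\Gamma(x)}{\Gamma(2x)} = 2^{1-2x} B(1/2,x) = 2^{1-2x}\,\frac{\Gamma(1/2)\,\Gamma(x)}{\Gamma(x + (1/2))} . \]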
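Example 3.2.19. Show that
\[ \int_0^{\pi/2} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\,d\theta = \frac{\Gamma\!\left( \frac{\mu+1}{2} \right) \Gamma\!\left( \frac{\nu+1}{2} \right)}{2\,\Gamma\!\left( \frac{\mu+\nu}{2} + 1 \right)} \tag{3.2.18} \]
for all $\mu, \nu > -1/2$. Solution: For this, let $\mu, \nu > -1/2$ and $\varepsilon, \delta \in (0,1/2)$.
Then it follows by change of variables that
\[ \int_\varepsilon^{1-\delta} t^{(\mu-1)/2} (1-t)^{(\nu-1)/2}\,dt = \int_{\arcsin(\sqrt{\varepsilon})}^{\arcsin(\sqrt{1-\delta})} 2 \sin(\theta) \cos(\theta)\,[\sin^2(\theta)]^{(\mu-1)/2}\,[1 - \sin^2(\theta)]^{(\nu-1)/2}\,d\theta = 2 \int_{\arcsin(\sqrt{\varepsilon})}^{\arcsin(\sqrt{1-\delta})} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\,d\theta \]
and hence by taking the limits that
\[ \frac{\Gamma\!\left( \frac{\mu+1}{2} \right) \Gamma\!\left( \frac{\nu+1}{2} \right)}{\Gamma\!\left( \frac{\mu+\nu}{2} + 1 \right)} = B\!\left( \frac{\mu+1}{2}, \frac{\nu+1}{2} \right) = 2 \int_0^{\pi/2} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\,d\theta . \]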
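
Formula (3.2.18) lends itself to a quick numerical comparison. The sketch below is illustrative: both helper names are hypothetical, the midpoint rule is used only for exponents $\mu, \nu \geq 0$ so that the integrand stays bounded, and the exact value $2/15$ for $\mu = 2$, $\nu = 3$ follows from the elementary substitution $s = \sin\theta$.

```python
import math


def trig_integral_gamma(mu, nu):
    """Right-hand side of (3.2.18) evaluated with math.gamma."""
    return (
        math.gamma((mu + 1.0) / 2.0)
        * math.gamma((nu + 1.0) / 2.0)
        / (2.0 * math.gamma((mu + nu) / 2.0 + 1.0))
    )


def trig_integral_midpoint(mu, nu, steps=20_000):
    """Midpoint rule for the integral of sin^mu * cos^nu over [0, pi/2], mu, nu >= 0."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.sin(theta) ** mu * math.cos(theta) ** nu
    return h * total


if __name__ == "__main__":
    print(trig_integral_midpoint(2.0, 3.0), trig_integral_gamma(2.0, 3.0), 2.0 / 15.0)
    print(trig_integral_midpoint(1.5, 0.5), trig_integral_gamma(1.5, 0.5))
```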
Problems
1) Show the existence in the sense of an improper Riemann integral and
calculate the value. In this, if applicable, $s, a > 0$.
»1
a)
0
»1
lnpxq dx
,
esx dx
,
»8
c)
b)
0
x lnpxq dx ,
»8
d)
0
0
»8
esx sinpaxq dx ,
e)
esx cospaxq dx , f)
0
» 8 ?x
»8
e
g)
0
?x
dx
,
h)
8
»8
?x
e
dx ,
0
x expp x2 q dx ,
8 dx
dx
,
i)
,
j)
4
x3
3
0 x
0 1
»8
»8
dx
dx
, l)
,
k)
x
x
2
e
e
x
a2
8»
8
»8
8
dx
dx
m)
,
n)
2
3
2
5x 6
2x
3x
0 x
0 x
»8
»
6
.
2) The radial part of the ‘wave function’ of an electron in a bound state
around a proton is given by $R_{nl} : (0,\infty) \to \mathbb{R}$ where $n \in \mathbb{N}^*$, $l \in
\{0, \dots, n-1\}$ are the principal quantum number and the azimuthal
quantum number, respectively [62]. Calculate the expectation value
$\langle r \rangle$ of the radial position of the electron in the corresponding state
given by
\[ \langle r \rangle = a\,\frac{\int_0^\infty r^3 R_{nl}^2(r)\,dr}{\int_0^\infty r^2 R_{nl}^2(r)\,dr} \]
where $a = 0.529 \cdot 10^{-8}$ cm is the Bohr radius.
a) $R_{10}(r) = 2\,e^{-r}$ ,
b) $R_{20}(r) = \frac{\sqrt{2}}{2} \left( 1 - \frac{r}{2} \right) e^{-r/2}$ ,
c) $R_{21}(r) = \frac{\sqrt{6}}{12}\,r\,e^{-r/2}$ ,
d) $R_{30}(r) = \frac{2\sqrt{3}}{9} \left( 1 - \frac{2}{3}\,r + \frac{2}{27}\,r^2 \right) e^{-r/3}$ ,
e) $R_{31}(r) = \frac{8}{27\sqrt{6}}\,r \left( 1 - \frac{1}{6}\,r \right) e^{-r/3}$ ,
f) $R_{32}(r) = \frac{4}{81\sqrt{30}}\,r^2 e^{-r/3}$
for all $r > 0$.

Fig. 81: Graphs of $((0,\infty) \to \mathbb{R}, r \mapsto r^2 R_{nl}^2(r))$ corresponding to a) to f). Compare
Problem 2.

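3) The ‘wave function’ of a ‘harmonic oscillator’, i.e., the ‘wave function’ of a point particle of mass $m > 0$ under the influence of a linear
restoring force, is given by $\psi_n : \mathbb{R} \to \mathbb{R}$ where $n \in \mathbb{N}$ is the principal quantum number [3]. Calculate the expectation value $\langle x \rangle$ of the
position of the mass point in the corresponding state given by
\[ \langle x \rangle = \frac{\int_{-\infty}^{\infty} x\,\psi_n^2(x)\,dx}{\int_{-\infty}^{\infty} \psi_n^2(x)\,dx} . \]
[In this, $a = (m\omega/\hbar)^{1/2}$, $\omega = (k/m)^{1/2}$, $k > 0$ is the spring’s
constant and $\hbar$ is the reduced Planck’s constant.]
a) $\psi_0(x) = \left( \frac{a}{\sqrt{\pi}} \right)^{1/2} e^{-a^2 x^2/2}$ ,
b) $\psi_1(x) = \left( \frac{a}{2\sqrt{\pi}} \right)^{1/2} 2ax\,e^{-a^2 x^2/2}$ ,
c) $\psi_2(x) = \left( \frac{a}{8\sqrt{\pi}} \right)^{1/2} (4a^2x^2 - 2)\,e^{-a^2 x^2/2}$ ,
d) $\psi_3(x) = \left( \frac{a}{48\sqrt{\pi}} \right)^{1/2} (8a^3x^3 - 12ax)\,e^{-a^2 x^2/2}$
for all $x \in \mathbb{R}$.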
Fig. 82: Squares of the wave functions of a harmonic oscillator. Compare Problem 3.

4) The time for one complete swing (‘period’) $T$ of a pendulum with
length $L > 0$ is given by
\[ T = 2 \sqrt{\frac{L}{g}} \int_{-1}^{1} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} \]
where $\theta_0 \in (-\pi/2, \pi/2)$ is the initial angle of elongation from the
position of rest of the pendulum, $k := |\sin(\theta_0/2)|$, and where $g$
is the acceleration of the Earth’s gravitational field. Show that the
corresponding integral exists in the improper Riemann sense. Split
the integrand into a Riemann-integrable and an improper Riemann-integrable
part, where the latter leads to an integral that can easily be
calculated. In this way, give another representation of $T$ that
involves only a proper Riemann integral.
3.3 Series of Real Numbers

Fig. 83: Archimedes’ construction in the quadrature of the parabola. Refer to text.

In this section, we start the study of series of real numbers. A special case of
an important series, the geometric series, already appeared in Archimedes’
second proof of his quadrature of the parabola. For motivation, this second
proof is considered in the following.
For this, we consider a parabola along with a line segment AE between
two points A and E on that parabola and the point C on the parabola of largest distance
from AE. See Fig 83. Archimedes proved that the area of the parabolic
segment ACE is $4/3$ of the area of the inscribed triangle with corners A, C
and E. He did this by dissecting the parabolic segment iteratively by triangles constructed from line segments between points on the parabola as
follows. In the first step, two triangles with corners A, B, C and C, D, E
are constructed in the same way from the line segments AC and CE, respectively, as the triangle with corners A, C, E was constructed from the
line segment AE, i.e., the points B and D are the points on the parabola of maximal distance from AC and CE, respectively. Then the same process is continued
with the line segments AB, BC, CD, DE, leading to four new triangles, and
so forth.
At the time Archimedes wrote his quadrature of the parabola, the
following facts were known to be true for every line segment AE on a
parabola. See Fig 84.

Fig. 84: Auxiliary diagram for the description of results on parabolic segments used in
Archimedes’ proof. Refer to text.

(i) The tangent at the point C on the parabola of largest distance from
AE is parallel to AE.
(ii) The parallel to the axis of the parabola through C halves every line
segment BD between two points B and D on the parabola that is
parallel to AE.
(iii) If I, G are the points of intersection of the parallel to the axis through
C with BD and AE, respectively, then
\[ \frac{CI}{CG} = \frac{(BI)^2}{(AG)^2} . \tag{3.3.1} \]
Note that they imply that
\[ A_T \leq A_P \leq 2 A_T \tag{3.3.2} \]
where $A_P$ denotes the area of the parabolic segment ACE and $A_T$ denotes
the area of the inscribed triangle with corners A, C and E. See Fig 85.

Fig. 85: The double of the area of the triangle ACE gives an upper bound for the area of
the parabolic segment ACE. Refer to text.

Fig. 86: Archimedes’ construction of quadrature of the parabola including auxiliary lines
(dashed) and points. Refer to text.

Archimedes did not prove these facts, but referred for such proofs to earlier
works on conics by Euclid and Aristaeus. We will give proofs in Example 3.5.26 below using methods from analytical geometry. By help of this
knowledge, Archimedes concluded that the areas of the triangles ABC,
CDE are 1{4 of the areas of the triangles ACG and GCE, respectively. See
Fig 86.
This can be seen as follows. We denote by I the intersection of the parallel
to AE through B with the parallel to the axis through C. Note that Fig 86
suggests that its prolongation goes through the point D, but this will not be
used in the following. That this is indeed the case will be a side result of the
proof. Further, we denote by G the intersection of AE with the parallel to
the axis through C. Finally, we denote by J, H the intersections of the parallel to the axis of the parabola through B with AC and AE, respectively.
Since this parallel halves AC, and BH and CG as well as BI and HG are parallel,
we conclude that
\[ AH = HG = \frac{1}{2}\,AG , \qquad BI = HG , \qquad BH = IG \]
and hence by (3.3.1) that
\[ \frac{CI}{CG} = \frac{(BI)^2}{(AG)^2} = \frac{(HG)^2}{(2\,HG)^2} = \frac{1}{4} . \tag{3.3.3} \]
Further, the triangles with corners AJH and ACG are similar. Hence
\[ \frac{JH}{CG} = \frac{AH}{AG} = \frac{1}{2} . \]
In particular, with the help of the last and (3.3.3), it follows that
\[ BJ = BH - JH = IG - JH = CG - CI - JH = \frac{3}{4}\,CG - JH = \frac{3}{2}\,JH - JH = \frac{1}{2}\,JH . \]
Hence the triangles ABC and ACH have the side AC in common and the
corresponding height of the triangle ABC ( = distance from AC to B) is
half of that corresponding height of the triangle ACH ( = distance from AC
to H). Hence the area of the triangle ABC is half the area of the triangle
ACH. Now also the triangles ACH and ACG have the side AC in common
and the corresponding height of the triangle ACH ( = distance from AC to
H) is half of that corresponding height of the triangle ACG ( = distance
from AC to G). Hence it follows that the area of the triangle ABC is $1/4$
of the area of the triangle ACG.
The reasoning is analogous for the areas of the triangles CDE and GCE,
respectively. See Fig 86. For this, we denote by I the intersection of the
parallel to AE through D with the parallel to the axis through C. Note that
this definition of the point I could conflict with its previous definition. But
only the last definition will be used in the following, and a by product of
the proof is that these points indeed coincide. As before, we denote by G
the intersection of AE with the parallel to the axis through C. Finally, we
denote by K, F the intersections of the parallel to the axis of the parabola
through D with CE and AE, respectively. Since this parallel halves CE, and since CG and DF as well as ID and GF are parallel, we conclude that
\[
GF = FE = \frac{1}{2}\, GE , \quad ID = GF , \quad DF = IG
\]
and hence by (3.3.1) that
\[
\frac{CI}{CG} = \frac{(ID)^2}{(GE)^2} = \frac{(GF)^2}{(2GF)^2} = \frac{1}{4} . \tag{3.3.4}
\]
Note that this implies that both previous definitions of I coincide. Further,
the triangles with corners FKE and GCE are similar. Hence
\[
\frac{KF}{CG} = \frac{FE}{GE} = \frac{1}{2} .
\]
In particular, by help of the last and (3.3.4), it follows that
\[
DK = DF - KF = IG - KF = \frac{3}{4}\, CG - KF = \frac{3}{2}\, KF - KF = \frac{1}{2}\, KF .
\]
Hence the triangles CDE and CEF have the side CE in common and the
corresponding height of the triangle CDE (= distance from CE to D) is
half of that corresponding height of the triangle CEF (= distance from CE
to F). Hence the area of the triangle CDE is half the area of the triangle
CEF. Now also the triangles CEF and GCE have the side CE in common
and the corresponding height of the triangle CEF (= distance from CE to
F) is half of that corresponding height of the triangle GCE (= distance
from CE to G). Hence it follows that the area of the triangle CDE is 1/4 of
the area of the triangle GCE.
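As a consequence, the sum of the areas of the triangles ABC and CDE
is 1/4 of the area of the triangle ACE. Hence it follows that
\[
\frac{A_T}{4} \le A_P - A_T \le 2\, \frac{A_T}{4}
\]
and inductively that
\[
\frac{A_T}{4^{n+1}} \le A_P - A_T \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k} \le 2\, \frac{A_T}{4^{n+1}} \tag{3.3.5}
\]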
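for every $n \in \mathbb{N}$. At this point, one observes that
\[
\frac{1}{3} \cdot \frac{1}{4^{n+1}} + \frac{1}{4^{n+1}} = \frac{4}{3} \cdot \frac{1}{4^{n+1}} = \frac{1}{3} \cdot \frac{1}{4^{n}}
\]
for every $n \in \mathbb{N}$, which leads to
\[
\frac{1}{3} \cdot \frac{1}{4^{n+1}} + \sum_{k=0}^{n+1} \left(\frac{1}{4}\right)^{k}
= \frac{1}{3} \cdot \frac{1}{4^{n+1}} + \frac{1}{4^{n+1}} + \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k}
= \frac{1}{3} \cdot \frac{1}{4^{n}} + \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k}
\]
for every $n \in \mathbb{N}$. Hence it follows that
\[
\frac{1}{3} \cdot \frac{1}{4^{n}} + \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k}
= \frac{1}{3} \cdot \frac{1}{4^{0}} + \sum_{k=0}^{0} \left(\frac{1}{4}\right)^{k} = \frac{4}{3}
\]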
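for every $n \in \mathbb{N}$. For every $n \in \mathbb{N}$, this leads to
\[
\frac{A_T}{4^{n+1}} \le A_P - A_T \left( \frac{4}{3} - \frac{1}{3} \cdot \frac{1}{4^{n}} \right) \le 2\, \frac{A_T}{4^{n+1}}
\]
which is equivalent to
\[
- \frac{1}{3} \cdot \frac{A_T}{4^{n+1}} = \frac{A_T}{4^{n+1}} - \frac{1}{3} \cdot \frac{A_T}{4^{n}}
\le A_P - \frac{4}{3}\, A_T \le
2\, \frac{A_T}{4^{n+1}} - \frac{1}{3} \cdot \frac{A_T}{4^{n}} = \frac{2}{3} \cdot \frac{A_T}{4^{n+1}} . \tag{3.3.6}
\]
Differently to Archimedes, we can conclude from this by help of Theorem 2.3.12 directly that
\[
A_P = \frac{4}{3}\, A_T .
\]
Since the limit concept was not developed at that time, Archimedes had
to employ a usual ‘double reductio ad absurdum’ argument for this, i.e.,
to lead both assumptions that $A_P < 4A_T/3$ and that $A_P > 4A_T/3$ to a
contradiction, which leaves only the option that $A_P = 4A_T/3$. This can be
done as follows. First, we assume that $A_P = 4A_T/3 - \varepsilon$ for some $\varepsilon > 0$. Then,
it follows for $n \in \mathbb{N}$ satisfying
\[
n > \frac{1}{\ln 4}\, \ln\!\left( \frac{A_T}{3 \varepsilon} \right)
\]
that
\[
A_P - \frac{4}{3}\, A_T = - \varepsilon < - \frac{1}{3} \cdot \frac{A_T}{4^{n}} < - \frac{1}{3} \cdot \frac{A_T}{4^{n+1}}
\]
which contradicts (3.3.6). Second, we assume that $A_P = 4A_T/3 + \varepsilon$ for some $\varepsilon > 0$. Then,
it follows for $n \in \mathbb{N}$ satisfying
\[
n > \frac{1}{\ln 4}\, \ln\!\left( \frac{2 A_T}{3 \varepsilon} \right)
\]
that
\[
A_P - \frac{4}{3}\, A_T = \varepsilon > \frac{2}{3} \cdot \frac{A_T}{4^{n}} > \frac{2}{3} \cdot \frac{A_T}{4^{n+1}}
\]
which contradicts (3.3.6). Hence the only remaining possibility is that
$A_P = 4A_T/3$. Of course, in ancient Greece only rational $\varepsilon$ were considered
in such analysis.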
A modern way of stating Archimedes’ result can be given as follows. Since
it follows from (3.3.5) that
\[
\frac{1}{A_T} \left( A_P - 2\, \frac{A_T}{4^{n+1}} \right) \le \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k} \le \frac{1}{A_T} \left( A_P - \frac{A_T}{4^{n+1}} \right) \le \frac{A_P}{A_T} \tag{3.3.7}
\]
for every $n \in \mathbb{N}$, the sequence $S_0, S_1, \dots$, defined by
\[
S_n := \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k}
\]
for every $n \in \mathbb{N}$, is increasing and bounded from above by $A_P/A_T$ and
hence convergent. In particular, it follows from (3.3.7) by Theorem 2.3.12
that
\[
A_P = A_T \lim_{n \to \infty} \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k} .
\]
In the following, the natural notation
\[
\sum_{k=0}^{\infty} \left(\frac{1}{4}\right)^{k} := \lim_{n \to \infty} \sum_{k=0}^{n} \left(\frac{1}{4}\right)^{k}
\]
will be used and referenced as the ‘sum of the sequence $x_0, x_1, \dots$’, defined
by
\[
x_k := \left(\frac{1}{4}\right)^{k}
\]
for every $k \in \mathbb{N}$. In addition, the sequence $S_0, S_1, \dots$ will be called ‘the
sequence of partial sums of $x_0, x_1, \dots$’. Sequences of partial sums are also
called ‘series’. In this sense, Archimedes calculates the sum of the sequence $1, q, q^2, \dots$ for the case $q = 1/4$ which is given by $4/3$. The series
corresponding to the sequences $1, q, q^2, \dots$ where $q$ runs through all real
numbers are called ‘geometric series’.
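The identity used above, namely that the partial sum plus one third of its last term always equals 4/3, can be checked with exact rational arithmetic. The following is a minimal sketch; the function names `partial_sum` and `archimedes_invariant` are our own and do not appear in the text.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """Return S_n = sum_{k=0}^{n} (1/4)^k as an exact fraction."""
    return sum((Fraction(1, 4) ** k for k in range(n + 1)), Fraction(0))


def archimedes_invariant(n: int) -> Fraction:
    """Return S_n + (1/3) * (1/4)^n, which the text shows to be 4/3 for all n."""
    return partial_sum(n) + Fraction(1, 3) * Fraction(1, 4) ** n


if __name__ == "__main__":
    for n in range(6):
        # The second column increases towards 4/3, the third one stays constant.
        print(n, partial_sum(n), archimedes_invariant(n))
```

Exact fractions are used instead of floating point numbers so that the invariant can be compared with 4/3 by equality.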
Definition 3.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$. We say
that $x_1, x_2, \dots$ is summable if the corresponding sequence of partial sums
$S_1, S_2, \dots$, defined by
\[
S_n := \sum_{k=1}^{n} x_k \tag{3.3.8}
\]
for every $n \in \mathbb{N}^*$, is convergent to some real number. In this case, the sum
of $x_1, x_2, \dots$ is denoted by
\[
\sum_{k=1}^{\infty} x_k . \tag{3.3.9}
\]
Otherwise, we say that $x_1, x_2, \dots$ is not summable. The sequence in (3.3.8)
is also called a series and in case of its convergence a convergent series with
its sum denoted by (3.3.9). In case of its divergence, that series is called
divergent.
In the following, we give two examples of series that play an important
role in the analysis of the convergence of series, geometric series and the
harmonic series. The former contain a real parameter. If and only if the
absolute value of that parameter is smaller than 1, the corresponding geometric series converges. The harmonic series is divergent.
Example 3.3.2. (Geometric series) Let $x \in \mathbb{R}$. In the following, we use
the convention that $x^0 := 1$. Show that the so called geometric series
$S_0, S_1, \dots$, defined by
\[
S_n := \sum_{k=0}^{n} x^k
\]
for every $n \in \mathbb{N}$, is convergent if and only if $|x| < 1$. In the last case, show
that
\[
\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x} .
\]
Solution: Note that in the case $x = 1$, it follows that $S_n = n + 1$ and hence the
divergence of the corresponding geometric series. For $x \neq 1$, it follows
that
\[
x\, S_n = \sum_{k=0}^{n} x^{k+1} = \sum_{k=1}^{n+1} x^k = S_n - 1 + x^{n+1}
\]
and hence that
\[
S_n = \frac{1 - x^{n+1}}{1 - x} .
\]
As a consequence, the sequence of partial sums is convergent if and only if
$|x| < 1$, and in this case
\[
\sum_{k=0}^{\infty} x^k = \lim_{n \to \infty} S_n = \frac{1}{1 - x} .
\]
Fig. 87: Partial sums of the harmonic series and graphs of $\ln$ and $\frac{1}{2} + [2 \ln(2)]^{-1} \ln$.
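The closed form for the partial sums of the geometric series can be compared numerically with a direct summation. The following is a small sketch in floating point arithmetic; the helper names are hypothetical.

```python
def geometric_partial_sum(x: float, n: int) -> float:
    """Return S_n = sum_{k=0}^{n} x**k, using the convention x**0 = 1."""
    total, power = 0.0, 1.0
    for _ in range(n + 1):
        total += power
        power *= x
    return total


def closed_form(x: float, n: int) -> float:
    """Return (1 - x**(n+1)) / (1 - x); only meaningful for x != 1."""
    return (1.0 - x ** (n + 1)) / (1.0 - x)


if __name__ == "__main__":
    for x in (0.5, -0.9, 1.1):
        # For |x| < 1 both columns approach 1 / (1 - x); for x = 1.1 they grow.
        print(x, geometric_partial_sum(x, 50), closed_form(x, 50))
    # The case x = 1 has to be treated separately: S_n = n + 1.
    print(1.0, geometric_partial_sum(1.0, 50))
```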
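Example 3.3.3. (Harmonic series) Show that the harmonic series, defined
by
\[
S_n := \sum_{k=1}^{n} \frac{1}{k}
\]
for every $n \in \mathbb{N}^*$, is divergent. Solution: For every $n \in \mathbb{N} \setminus \{0, 1\}$, it follows
that
\[
\begin{aligned}
\sum_{k=1}^{2^n} \frac{1}{k}
&\ge \sum_{k=1}^{2^0} \frac{1}{k} + \sum_{k=2^1+1}^{2^2} \frac{1}{k} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{k} \\
&\ge \sum_{k=1}^{2^0} \frac{1}{2^0} + \sum_{k=2^1+1}^{2^2} \frac{1}{2^2} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{2^n} \\
&\ge 1 + (2^2 - 2^1)\, \frac{1}{2^2} + \dots + (2^n - 2^{n-1})\, \frac{1}{2^n} \\
&= 1 + \frac{n-1}{2} = \frac{n+1}{2} = \frac{\ln(2^n) + \ln(2)}{2 \ln(2)}
\end{aligned}
\]
and hence the divergence of the harmonic series.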
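The lower bound $(n+1)/2$ for the partial sum with $2^n$ terms can be observed numerically. The following is a minimal sketch; `harmonic` and `lower_bound` are names chosen here for illustration.

```python
import math


def harmonic(n: int) -> float:
    """Return the n-th partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def lower_bound(n: int) -> float:
    """Return (n + 1) / 2, the bound from Example 3.3.3 for the sum up to 2**n."""
    return (n + 1) / 2.0


if __name__ == "__main__":
    for n in range(2, 16):
        # The partial sums grow without bound, but only logarithmically.
        print(n, 2 ** n, harmonic(2 ** n), lower_bound(n))
```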
Remark 3.3.4. Note that because series are sequences of partial sums, we
can apply the limit laws of Theorem 2.3.4 to series.
Often, a given series consists of the partial sums corresponding to a sequence of the form $f(1), f(2), \dots$ where $f : [1, \infty) \to \mathbb{R}$ is some function.
For instance, in the case of a geometric series corresponding to $q > 0$, such a
function is given by
\[
f(x) := e^{(x-1) \ln q}
\]
for $x \ge 1$, and in the case of the harmonic series, such a function is given by
\[
f(x) := \frac{1}{x}
\]
for $x \ge 1$. We note that in such a case, the sequence of partial sums of $f(1), f(2), \dots$, defined by
\[
\sum_{k=1}^{n} f(k)
\]
for every $n \in \mathbb{N}^*$, has the form of a Riemann sum, i.e., the form of sums
used in the definition of the Riemann integral, corresponding to a decomposition of $[1, \infty)$ into the intervals $[1, 2], [2, 3], \dots$ of length 1. Hence, we would
expect that there is a relationship between the existence of the improper
Riemann integral of $f$ and the convergence of the series. Indeed, this is
true for a particular class of functions $f$.
Theorem 3.3.5. (Integral test) Let $f : [1, \infty) \to \mathbb{R}$ be positive decreasing
and almost everywhere continuous. Then $f(1), f(2), \dots$ is summable if
and only if $f$ is improper Riemann-integrable. In this case,
\[
\int_1^{\infty} f(x)\, dx \le \sum_{k=1}^{\infty} f(k) \le f(1) + \int_1^{\infty} f(x)\, dx \tag{3.3.10}
\]
as well as
\[
\int_{m+1}^{\infty} f(x)\, dx \le \sum_{k=m+1}^{\infty} f(k) \le \int_m^{\infty} f(x)\, dx \tag{3.3.11}
\]
for every $m \in \mathbb{N}^*$.
Proof. For this, we define the auxiliary function $g : [1, \infty) \to \mathbb{R}$ by $g(1) := f(2)$ as well as $g(x) := f(k+1)$ for all $x \in (k, k+1]$ and $k \in \mathbb{N}^*$. Then
\[
\int_m^{n+1} g(x)\, dx = \sum_{k=m}^{n} \int_k^{k+1} g(x)\, dx = \sum_{k=m+1}^{n+1} f(k)
\]
for every $m, n \in \mathbb{N}^*$ such that $m \le n$. If $f$ is improper Riemann-integrable,
it follows because of $|g| \le f$ and by Theorem 3.2.6 that $g$ is improper
Riemann-integrable and hence that $f(1), f(2), \dots$ is summable and
\[
\sum_{k=m+1}^{\infty} f(k) \le \int_m^{\infty} f(x)\, dx
\]
for every $m \in \mathbb{N}^*$. If on the other hand $f(1), f(2), \dots$ is summable, we
define the auxiliary function $h : [1, \infty) \to \mathbb{R}$ by $h(x) := f(k)$ for all
$x \in [k, k+1)$ and $k \in \mathbb{N}^*$. Then
\[
\int_m^{x} f(y)\, dy \le \int_m^{x} h(y)\, dy \le \sum_{k=m}^{\infty} f(k)
\]
for every $m \in \mathbb{N}^*$ and $x \in [m, \infty)$. Hence it follows by Theorem 3.2.6 that
$f$ is improper Riemann-integrable and that
\[
\sum_{k=m}^{\infty} f(k) \ge \int_m^{\infty} f(y)\, dy
\]
for every $m \in \mathbb{N}^*$.
Remark 3.3.6. Note that (3.3.11) can be used to estimate remainder terms
of the sequence.
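As an illustration of such a remainder estimate, the following sketch applies (3.3.11) to $f(x) = 1/x^2$, for which the integral from $m$ to $\infty$ equals $1/m$. The function name `zeta2_bounds` is hypothetical, and the comparison value $\pi^2/6$ for the full sum is a known result that is not derived in this section.

```python
import math


def zeta2_bounds(m: int) -> tuple[float, float]:
    """Enclose sum_{k>=1} 1/k**2 using the first m terms and (3.3.11).

    For f(x) = 1/x**2 the remainder sum_{k>m} f(k) lies between
    1/(m+1) (integral from m+1) and 1/m (integral from m).
    """
    s_m = math.fsum(1.0 / k ** 2 for k in range(1, m + 1))
    return s_m + 1.0 / (m + 1), s_m + 1.0 / m


if __name__ == "__main__":
    for m in (10, 100, 1000):
        lower, upper = zeta2_bounds(m)
        # The width of the enclosure is 1/m - 1/(m+1), roughly 1/m**2.
        print(m, lower, upper, upper - lower)
```

The enclosure is much tighter than the plain truncation error, which is of order $1/m$.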
The following two examples give applications of the integral test to further
series that play an important role in the analysis of the convergence of series. In particular, the following example defines Riemann’s zeta function
which has important applications in the description of the distribution of
the prime numbers. Further applications are in quantum statistical physics
and quantum field theory. Finally, there is a famous problem concerning the
zeros of the extension of Riemann’s zeta function to complex numbers. All
even integers that are smaller than 0 are zeros of that extension. Riemann’s
conjecture from 1859 claims that all other zeros have the real part 1/2. It
is not yet known whether this is true. The solution to this problem would
have profound consequences in the theory of numbers.
Example 3.3.7. (Riemann’s Zeta function) Show that by
\[
\zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s}
\]
for every $s \in (1, \infty)$ there is defined a function $\zeta : (1, \infty) \to \mathbb{R}$. This function is called Riemann’s zeta function. Solution: For every $s \in (1, \infty)$ the
corresponding function $f_s : [1, \infty) \to \mathbb{R}$ defined by $f_s(x) := 1/x^s$ for every
$x \ge 1$ is positive decreasing and continuous and by Example 3.2.5 improper
Riemann-integrable. Hence the statement follows from Theorem 3.3.5. In
addition, it follows by (3.3.10) that
\[
\frac{1}{s-1} = \int_1^{\infty} f_s(x)\, dx \le \zeta(s) \le 1 + \int_1^{\infty} f_s(x)\, dx = \frac{s}{s-1} .
\]
Fig. 88: Graphs of $\zeta$ (black), $1/(s-1)$ (blue) and $s/(s-1)$ (red).
Fig. 89: Partial sums of the series from Example 3.3.8 for the case $p = 1$.
Example 3.3.8. Let $p \ge 1$. Determine whether the sequence $a_2, a_3, \dots$
defined by
\[
a_n := \sum_{k=2}^{n} \frac{1}{k\, (\ln(k))^p}
\]
for every $n \in \mathbb{N} \setminus \{0, 1\}$ is convergent or divergent. Solution: For this, we
define the auxiliary function $h : [2, \infty) \to \mathbb{R}$ by $h(x) := x\, (\ln(x))^p$ for
every $x \ge 2$. Then $h$ is strictly positive, strictly increasing and continuous
and hence $f : [1, \infty) \to \mathbb{R}$ defined by $f(x) := 1/[(x+1)(\ln(x+1))^p]$
for every $x \ge 1$ is positive, strictly decreasing and continuous. Further for
$p = 1$:
\[
\int_1^{n} \frac{dx}{(x+1) \ln(x+1)} = \ln(\ln(n+1)) - \ln(\ln(2))
\]
for every $n \in \mathbb{N}^*$. Using that $\ln(\ln(2^m)) = \ln(m \ln(2))$ for every $m \in \mathbb{N}^*$,
it follows that $f$ is not improper Riemann-integrable. Hence it follows by
Theorem 3.3.5 the divergence of the corresponding sequence $a_2, a_3, \dots$.
For $p > 1$ it follows that
\[
\int_1^{x} \frac{dy}{(y+1)(\ln(y+1))^p} = \frac{1}{1-p} \left[ (\ln(x+1))^{1-p} - (\ln(2))^{1-p} \right]
\]
for every $x \ge 1$ and hence the improper Riemann-integrability of $f$ and by
Theorem 3.3.5 the convergence of the corresponding sequence $a_2, a_3, \dots$.
The following comparison test is often applied to decide the convergence of
a given series. For motivation, we investigate the convergence of the series
$S_1, S_2, \dots$ defined by
\[
S_n := \sum_{k=1}^{n} \frac{1}{k^2 + 2}
\]
for all $n \in \mathbb{N}^*$. A basic strategy in the solution of any problem is to investigate whether that problem has a peculiarity that prevents its immediate
solution. Indeed, without the addition of 2 in the denominator of the summands, $S_1, S_2, \dots$ would coincide with the zeta series corresponding to
$s = 2$ which was shown to converge. In such cases, it is often possible to
reduce, in some sense, the solution of the given problem to the solution of
the simpler problem. For instance in this case, we notice that
\[
S_n = \sum_{k=1}^{n} \frac{1}{k^2 + 2} \le \sum_{k=1}^{n} \frac{1}{k^2} \le \sum_{k=1}^{\infty} \frac{1}{k^2} = \zeta(2)
\]
for every $n \in \mathbb{N}^*$. Hence $S_1, S_2, \dots$ is an increasing sequence that is
bounded from above and therefore convergent (with a sum that is smaller
than $\zeta(2)$). The following theorem generalizes this method of comparison
of series.
Theorem 3.3.9. (Comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of positive real numbers. Further, let $x_n \le c\, y_n$ for all $n \in \{N, N+1, \dots\}$ where $c \ge 0$ and $N$ is some element of $\mathbb{N}^*$. If $y_1, y_2, \dots$ is
summable, then $x_1, x_2, \dots$ is summable, too.
Proof. If $y_1, y_2, \dots$ is summable, it follows that
\[
\sum_{k=N}^{n} x_k \le \sum_{k=N}^{n} c\, y_k = c \sum_{k=N}^{n} y_k \le c \sum_{k=1}^{\infty} y_k
\]
for every $n \in \mathbb{N}^*$ satisfying $n \ge N$. Hence the sequence of partial sums of $x_1, x_2, \dots$ is
increasing (since $x_k \ge 0$ for all $k \in \mathbb{N}^*$) and bounded from above by
$\sum_{k=1}^{N-1} x_k + c \sum_{k=1}^{\infty} y_k$, and
therefore convergent.
In Example 3.3.3, we proved that the harmonic series is divergent by showing that
\[
\sum_{k=1}^{2^n} \frac{1}{k} \ge \frac{\ln(2^n) + \ln(2)}{2 \ln(2)}
\]
for every $n \in \mathbb{N}$. In addition, Fig 87 supported the validity of the more
general estimate
\[
\sum_{k=1}^{n} \frac{1}{k} \ge \frac{\ln(n) + \ln(2)}{2 \ln(2)}
\]
for every $n \in \mathbb{N}^*$. The last could indicate a logarithmic increase of the partial sums of the harmonic series with the number of summands. Indeed, as
another application of the previous theorem, the following example proves
the more precise statement that
\[
\lim_{n \to \infty} \left[ \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right] = \gamma
\]
where $\gamma$ is a real number in the interval $[0, 1]$ called Euler’s constant. To
seven decimal places, $\gamma$ is given by 0.5772156.
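Example 3.3.10. Show that the sequence $a_1, a_2, \dots$ defined by
\[
a_n := \sum_{k=1}^{n} \frac{1}{k} - \ln(n)
\]
for all $n \in \mathbb{N}^*$ is convergent. See Fig. 87. Solution: For this, we define an
auxiliary sequence $b_1, b_2, \dots$ by
\[
b_n := \sum_{k=1}^{n} \frac{1}{k} - \ln(n+1) = a_n + \ln(n) - \ln(n+1) = a_n - \ln\!\left( \frac{n+1}{n} \right)
\]
for all $n \in \mathbb{N}^*$. Then
\[
b_{n+1} - b_n = \frac{1}{n+1} - \ln\!\left( \frac{n+2}{n+1} \right)
= \int_0^1 \frac{1}{n+1}\, dx - \int_0^1 \frac{dx}{x + n + 1}
= \frac{1}{n+1} \int_0^1 \frac{x}{x + n + 1}\, dx
\]
and hence
\[
0 \le b_{n+1} - b_n \le \frac{1}{(n+1)^2}
\]
for all $n \in \mathbb{N}^*$. Therefore $b_1, b_2, \dots$ is increasing and bounded from above
since
\[
b_n = b_1 + \sum_{k=1}^{n-1} (b_{k+1} - b_k) \le b_1 + \sum_{k=1}^{\infty} \frac{1}{k^2}
\]
for all $n \in \mathbb{N} \setminus \{0, 1\}$. Hence $b_1, b_2, \dots$ and $a_1, a_2, \dots$ are convergent. The
constant
\[
\gamma := \lim_{n \to \infty} \left[ \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right]
\]
is known as Euler constant. Presently, it is not yet known whether it is
rational or irrational. Since
\[
\ln(n+1) = \int_1^{n+1} \frac{dx}{x} = \sum_{k=1}^{n} \int_k^{k+1} \frac{dx}{x}
\le \sum_{k=1}^{n} \frac{1}{k} = 1 + \sum_{k=1}^{n-1} \frac{1}{k+1}
\le 1 + \sum_{k=1}^{n-1} \int_k^{k+1} \frac{dx}{x} = 1 + \int_1^{n} \frac{dx}{x} = 1 + \ln(n) ,
\]
it follows that
\[
0 \le \ln\!\left( \frac{n+1}{n} \right) \le a_n \le 1
\]
for every $n \in \mathbb{N} \setminus \{0, 1\}$ and hence that $0 \le \gamma \le 1$. To seven decimal places
it is given by 0.5772156.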
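The two sequences of this example enclose Euler’s constant, which can be seen numerically. The following is a minimal sketch with names chosen here for illustration: `a(n)` is the harmonic partial sum minus $\ln(n)$ and `b(n)` is the same sum minus $\ln(n+1)$.

```python
import math


def harmonic(n: int) -> float:
    """Return 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def a(n: int) -> float:
    """Return a_n = H_n - ln(n) from Example 3.3.10."""
    return harmonic(n) - math.log(n)


def b(n: int) -> float:
    """Return the auxiliary b_n = H_n - ln(n + 1), which is increasing."""
    return harmonic(n) - math.log(n + 1)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # b_n approaches the limit from below, a_n from above.
        print(n, b(n), a(n))
```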
The following example derives Weierstrass’ representation of the gamma
function as a simple consequence of the previous result and Gauss’ representation (3.2.14) of the gamma function.
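Example 3.3.11. (Weierstrass’ representation of the gamma function)
Show Weierstrass’ representation of the gamma function
\[
\frac{1}{\Gamma(x)} = x\, e^{\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} \left( 1 + \frac{x}{k} \right) e^{-x/k} \tag{3.3.12}
\]
for every $x > 0$ where $\gamma$ is Euler’s constant. Solution: For this, let $x > 0$.
According to (3.2.14), $\Gamma(x)$ is given by
\[
\Gamma(x) = \lim_{n \to \infty} \frac{n^x\, n!}{x\, (x+1) \cdots (x+n)} = x^{-1} \lim_{n \to \infty} n^x \prod_{k=1}^{n} \left( 1 + \frac{x}{k} \right)^{-1} .
\]
Further,
\[
n^x = \exp(x \ln(n)) = \exp\!\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \exp\!\left( \sum_{k=1}^{n} \frac{x}{k} \right)
= \exp\!\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} e^{x/k} .
\]
Hence it follows that
\[
\Gamma(x) = x^{-1} \lim_{n \to \infty} \exp\!\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} e^{x/k} \left( 1 + \frac{x}{k} \right)^{-1}
= x^{-1} e^{-\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} e^{x/k} \left( 1 + \frac{x}{k} \right)^{-1}
\]
which implies (3.3.12).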
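A truncated version of the product in (3.3.12) can be compared with a library implementation of the gamma function. The following is a sketch under stated assumptions: the constant `EULER_GAMMA` is the known numerical value of Euler’s constant to double precision (the text quotes 0.5772156), the name `inv_gamma_weierstrass` is hypothetical, and the product is evaluated through the sum of its logarithms for numerical stability.

```python
import math

# Known double precision value of Euler's constant; an assumption of this sketch.
EULER_GAMMA = 0.5772156649015329


def inv_gamma_weierstrass(x: float, n: int) -> float:
    """Approximate 1/Gamma(x), x > 0, by the first n factors of (3.3.12)."""
    # log of prod_{k=1}^{n} (1 + x/k) * exp(-x/k)
    log_product = math.fsum(math.log1p(x / k) - x / k for k in range(1, n + 1))
    return x * math.exp(EULER_GAMMA * x + log_product)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.5):
        approx = inv_gamma_weierstrass(x, 100_000)
        # math.gamma is the standard library gamma function.
        print(x, approx, 1.0 / math.gamma(x))
```

The convergence is slow: the relative error after $n$ factors behaves roughly like $x^2/(2n)$.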
The following comparison test is a simple consequence of the comparison
test from Theorem 3.3.9.
Theorem 3.3.12. (Limit comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$
be sequences of positive real numbers. Further, let
\[
\lim_{n \to \infty} \frac{x_n}{y_n} = 1 .
\]
(Note that this implies that $y_n > 0$ for all $n \in \{N, N+1, \dots\}$ and some $N \in \mathbb{N}^*$.) Then $x_1, x_2, \dots$ is summable if and only if $y_1, y_2, \dots$ is summable.
Proof. Since $\lim_{k \to \infty} (x_k / y_k) = 1$, there is $N \in \mathbb{N}^*$ satisfying $N \ge 2$ and
such that
\[
\frac{1}{2} \le \frac{x_k}{y_k} \le \frac{3}{2}
\]
for all $k \in \mathbb{N}^*$ such that $k \ge N$. In particular, this implies that
\[
0 \le y_k \le 2 x_k , \quad 0 \le x_k \le \frac{3 y_k}{2}
\]
for all $k \in \mathbb{N}^*$ such that $k \ge N$. Hence it follows by help of Theorem 3.3.9
that the sequence $x_N, x_{N+1}, \dots$ is summable if and only if $y_N, y_{N+1}, \dots$ is
summable. Since
\[
\sum_{k=1}^{n} x_k = \sum_{k=1}^{N-1} x_k + \sum_{k=N}^{n} x_k , \quad
\sum_{k=1}^{n} y_k = \sum_{k=1}^{N-1} y_k + \sum_{k=N}^{n} y_k
\]
for every $n \in \mathbb{N}^*$ satisfying $n \ge N$, the last also implies the statement of
the theorem.
In the following, we give three typical applications of the previous comparison test. In particular, the two subsequent examples study series which are
frequently used in the analysis of the convergence of given series. Furthermore, the following example, along with the fact that the sequence
whose members are all equal to 1 is not summable, shows that the sequence
$1, 1/2^s, 1/3^s, \dots$ is not summable for $s \le 1$ and hence that $\zeta(s)$ cannot be
defined for $s \le 1$ in the same way as for $s > 1$.
Example 3.3.13. Let $p < 1$. Determine whether the sequence $a_1, a_2, \dots$
defined by
\[
a_n := \frac{1}{n^p}
\]
for all $n \in \mathbb{N}^*$, is summable. Solution: Since $p < 1$, it follows that
\[
\frac{1}{n^p} \ge \frac{1}{n}
\]
for all $n \in \mathbb{N}^*$ and hence by Theorem 3.3.9 and Example 3.3.3 that $a_1, a_2, \dots$ is not summable.
Example 3.3.14. Let $p < 1$. Determine whether the sequence $a_2, a_3, \dots$
defined by
\[
a_n := \sum_{k=2}^{n} \frac{1}{k\, (\ln(k))^p}
\]
for every $n \in \mathbb{N} \setminus \{0, 1\}$ is convergent or divergent. Solution: Since $p < 1$,
it follows for every $k \ge 3$ that
\[
(\ln(k))^{1-p} \ge 1^{1-p} = 1
\]
and hence that
\[
\frac{1}{k\, (\ln(k))^p} \ge \frac{1}{k \ln(k)} .
\]
Hence it follows by Theorem 3.3.9 and Example 3.3.8 that the sequence
$a_2, a_3, \dots$ is divergent.
Example 3.3.15. Determine whether the sequence $a_1, a_2, \dots$ defined by
\[
a_n := \frac{3n^2 + n + 1}{(n^5 + 2)^{1/2}}
\]
for all $n \in \mathbb{N}^*$ is summable. Solution: Define
\[
b_n := \frac{3}{n^{1/2}}
\]
for all $n \in \mathbb{N}^*$. Then $b_1, b_2, \dots$ is not summable according to Example 3.3.13. In addition, it follows that
\[
\lim_{n \to \infty} \frac{a_n}{b_n} = 1
\]
and hence by Theorem 3.3.12 also that $a_1, a_2, \dots$ is not summable.
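The limit of the quotient used in the limit comparison test can be inspected numerically. The following is a small sketch that assumes the reading $a_n = (3n^2 + n + 1)/(n^5 + 2)^{1/2}$ and $b_n = 3/n^{1/2}$ of the example; the function names are our own.

```python
import math


def a(n: int) -> float:
    """Return a_n = (3 n^2 + n + 1) / sqrt(n^5 + 2) (assumed form of the example)."""
    return (3 * n ** 2 + n + 1) / math.sqrt(n ** 5 + 2)


def b(n: int) -> float:
    """Return the comparison term b_n = 3 / sqrt(n)."""
    return 3.0 / math.sqrt(n)


def ratio(n: int) -> float:
    """Return a_n / b_n, which should tend to 1."""
    return a(n) / b(n)


if __name__ == "__main__":
    for n in (10, 1000, 10 ** 6):
        print(n, ratio(n))
```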
In the 17th and 18th century, it was generally assumed that the reordering
of the members of a sequence leads to a sequence which is summable if and
only if the same is true for the original sequence and in that case that the
sums of both sequences coincide. Indeed, we will see in the following that
this is true for absolutely summable sequences that include sequences of
positive ($\ge 0$) real numbers. On the other hand, we will also see that the
above statement is false in more general cases. This false belief led to contradictions which plagued the calculus in those centuries. We present one
example from that time [60] of a too naive handling of series resulting from
a reordering of a sequence whose members alternate in sign.
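Since 1668 [78], it was known that
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \ln(2) ,
\]
a fact that will be proved in Example 3.4.19. On the other hand, it was
argued that therefore
\[
\begin{aligned}
\ln(2) &= 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots \\
&= \left( 1 + \frac{1}{3} + \frac{1}{5} + \dots \right) + \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) - 2 \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) \\
&= \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots \right) - \frac{2}{2} \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots \right) = 0 .
\end{aligned}
\]
It should be noted that already the second line in the above ‘derivation’
cannot be concluded by the limit laws because all three series inside the
brackets diverge. Hence the above can also be viewed as a classic example
of the false treatment of $\infty$ as a real number which was quite common at
that time. The discovery of such apparent contradictions contributed essentially to a re-examination and rigorous founding of the theory of infinite
series.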
A simple example for the fact that the reordering of a sequence can affect
its sum is the following. For this, we consider a reordering of the sequence
$a_1, a_2, \dots$ defined by
\[
a_k := \frac{(-1)^{k+1}}{k}
\]
for every $k \in \mathbb{N}^*$. The partial sums of this sequence are called the alternating harmonic series whose sum was also considered in the above ‘derivation’.
Example 3.3.16. (A rearrangement of the alternating harmonic series)
For this, we define the sequence $a_1, a_2, \dots$ by
\[
a_k := \frac{(-1)^{k+1}}{k}
\]
for every $k \in \mathbb{N}^*$, and the sequence $b_1, b_2, \dots$ by
\[
\begin{aligned}
b_{3k-2} &:= \frac{1}{4k-3} = \frac{(-1)^{4k-2}}{4k-3} = a_{4k-3} , \\
b_{3k-1} &:= \frac{1}{4k-1} = \frac{(-1)^{4k}}{4k-1} = a_{4k-1} , \\
b_{3k} &:= - \frac{1}{2k} = \frac{(-1)^{2k+1}}{2k} = a_{2k}
\end{aligned}
\]
for every $k \in \mathbb{N}^*$. From the last, we conclude that the sequence $b_1, b_2, \dots$
contains only members of the sequence $a_1, a_2, \dots$. The fact that it contains
all of them can be seen as follows. For this, let $k \in \mathbb{N}^*$. If $k$ is even, then
\[
b_{3 (k/2)} = a_{2 (k/2)} = a_k .
\]
If $k$ is odd, then there is $l \in \mathbb{N}^*$ such that $k = 2l - 1$. If $l$ is even, then
\[
b_{3 (l/2) - 1} = a_{4 (l/2) - 1} = a_{2l-1} = a_k .
\]
Finally, if $l$ is odd, then
\[
b_{3 ((l+1)/2) - 2} = a_{4 ((l+1)/2) - 3} = a_{2l-1} = a_k .
\]
Hence, $b_1, b_2, \dots$ is a reordering of $a_1, a_2, \dots$. The ninth partial sum corresponding to $a_1, a_2, \dots$ is given by
\[
1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} ,
\]
whereas the ninth partial sum corresponding to $b_1, b_2, \dots$ is given by
\[
1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} .
\]
Assuming the convergence of the alternating harmonic series which is proved
in Example 3.3.19, it follows that
\[
\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} < 1 - \frac{1}{2} + \frac{1}{3} = \frac{5}{6} .
\]
Further, because of
\[
b_{3k-2} + b_{3k-1} + b_{3k} = \frac{8k - 3}{2k\, (4k-3)(4k-1)} > 0
\]
for every $k \in \mathbb{N}^*$, it follows that
\[
\sum_{k=1}^{3n} b_k \ge \frac{5}{6}
\]
for every $n \in \mathbb{N}^*$, with strict inequality for $n \ge 2$. Therefore, either $b_1, b_2, \dots$ is not summable (!), or
\[
\sum_{k=1}^{\infty} b_k > \frac{5}{6} > \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} . \quad (!)
\]
Fig. 90: Partial sums of the alternating harmonic series and its rearrangement from Example 3.3.16.
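The effect of the rearrangement can be observed numerically. The following sketch implements the two sequences of this example; the helper `partial_sum` is hypothetical. Numerically, the rearranged partial sums settle near 1.04, which is compatible with the known value $(3/2) \ln(2)$ for this particular rearrangement, a fact that is not derived in the text.

```python
import math


def a(k: int) -> float:
    """Return a_k = (-1)**(k+1) / k, the alternating harmonic sequence."""
    return (-1) ** (k + 1) / k


def b(j: int) -> float:
    """Return b_j: two positive terms of a followed by one negative term."""
    k = (j + 2) // 3          # block index, so that j is 3k-2, 3k-1 or 3k
    r = j - 3 * (k - 1)       # position 1, 2 or 3 inside the block
    if r == 1:
        return 1.0 / (4 * k - 3)
    if r == 2:
        return 1.0 / (4 * k - 1)
    return -1.0 / (2 * k)


def partial_sum(term, n: int) -> float:
    """Return term(1) + ... + term(n)."""
    return math.fsum(term(i) for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (9, 90, 900, 9000):
        print(n, partial_sum(a, n), partial_sum(b, n))
```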
In the following, we continue the study of series with view on sums of
alternating sequences. Any sequence $y_1, y_2, \dots$ of real numbers can be
represented in the equivalent form $x_1 |y_1|, x_2 |y_2|, \dots$ where the sequence
$x_1, x_2, \dots$ assumes values in $\{-1, 1\}$. In this sense, $y_1, y_2, \dots$ is always a
product of a bounded sequence that describes sign changes and a sequence
of positive numbers. In the case that the partial sums of $x_1, x_2, \dots$ stay
bounded, as is the case for alternating $y_1, y_2, \dots$, consideration of this product structure is helpful in the analysis of the convergence of the series that
corresponds to $y_1, y_2, \dots$. The basis for such analysis is provided by the
following summation by parts formula which resembles the formula for
partial integration.
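Theorem 3.3.17. (Summation by parts) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be
sequences of real numbers and $S_1, S_2, \dots$ be the sequence of partial sums
of $x_1, x_2, \dots$. Then
\[
\sum_{k=m}^{n} x_k y_k = (S_n + c)\, y_{n+1} - (S_{m-1} + c)\, y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k)
\]
for all $m, n \in \mathbb{N}^*$ such that $n \ge m$ and all $c \in \mathbb{R}$ where we define $S_0 := 0$.
Proof. It follows for all $m, n \in \mathbb{N}^*$ such that $n \ge m$, that
\[
\begin{aligned}
\sum_{k=m}^{n} x_k y_k
&= \sum_{k=m}^{n} (S_k - S_{k-1})\, y_k = \sum_{k=m}^{n} S_k y_k - \sum_{k=m-1}^{n-1} S_k y_{k+1} \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} S_k (y_{k+1} - y_k) \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) + c \sum_{k=m}^{n} (y_{k+1} - y_k) \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) + c\, y_{n+1} - c\, y_m \\
&= (S_n + c)\, y_{n+1} - (S_{m-1} + c)\, y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) .
\end{aligned}
\]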
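Since the summation by parts formula is a purely algebraic identity, it can be checked exactly on integer test data. The following is a minimal sketch; the function name and the sample sequences are chosen here for illustration only.

```python
def summation_by_parts_sides(x, y, m, n, c):
    """Return both sides of the summation by parts formula.

    x holds x_1, ..., x_N and y holds y_1, ..., y_{N+1};
    the indices must satisfy 1 <= m <= n <= N.
    """
    xs = [None] + list(x)      # shift to 1-based indexing
    ys = [None] + list(y)
    s = [0]                    # partial sums with S_0 := 0
    for value in x:
        s.append(s[-1] + value)
    lhs = sum(xs[k] * ys[k] for k in range(m, n + 1))
    rhs = (
        (s[n] + c) * ys[n + 1]
        - (s[m - 1] + c) * ys[m]
        - sum((s[k] + c) * (ys[k + 1] - ys[k]) for k in range(m, n + 1))
    )
    return lhs, rhs


if __name__ == "__main__":
    x = [1, -2, 3, 5, -7]
    y = [2, 3, 5, 7, 11, 13]
    print(summation_by_parts_sides(x, y, 2, 4, c=-4))
```

The free constant $c$ is what allows the choice $c = -M_1$ in the proofs of Dirichlet’s and Abel’s tests.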
The following Dirichlet’s test is mainly a consequence of the summation
by parts formula. This test is frequently used in connection with the summation of alternating sequences also because it provides a very simple estimate of the error resulting from the truncation of the series after finitely
many terms.
Theorem 3.3.18. (Dirichlet’s test) Let $x_1, x_2, \dots$ be a sequence of real
numbers such that its partial sums form a bounded sequence and $y_1, y_2, \dots$
be a decreasing sequence of real numbers such that $\lim_{k \to \infty} y_k = 0$. Then
the sequence $x_1 y_1, x_2 y_2, \dots$ is summable,
\[
\sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \tag{3.3.13}
\]
and for every $n \in \mathbb{N}^*$
\[
\left| \sum_{k=n+1}^{\infty} x_k y_k \right| \le (M_2 - M_1)\, y_{n+1} \tag{3.3.14}
\]
where $M_1, M_2 \in \mathbb{R}$ are a lower bound and upper bound, respectively, of the
partial sums of $x_1, x_2, \dots$.
Proof. For this let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$
and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Then by Theorem 3.3.17
\[
\sum_{k=1}^{n} x_k y_k = (S_n - M_1)\, y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1})
\]
as well as
\[
0 \le \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \le \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1})
= (M_2 - M_1)(y_1 - y_{n+1}) \le (M_2 - M_1)\, y_1
\]
for all $n \in \mathbb{N}^*$. Therefore the sequence
\[
\sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}) , \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}) , \ \dots
\]
is increasing as well as bounded from above and hence convergent. Therefore, since $\lim_{k \to \infty} y_k = 0$, it follows the summability of $x_1 y_1, x_2 y_2, \dots$
and (3.3.13). Finally, it follows for every $n \in \mathbb{N}^*$ that
\[
\sum_{k=n+1}^{\infty} x_k y_k = - (S_n - M_1)\, y_{n+1} + \sum_{k=n+1}^{\infty} (S_k - M_1)(y_k - y_{k+1})
\]
and hence
\[
\begin{aligned}
\sum_{k=n+1}^{\infty} x_k y_k &\ge - (S_n - M_1)\, y_{n+1} \ge - (M_2 - M_1)\, y_{n+1} , \\
\sum_{k=n+1}^{\infty} x_k y_k &\le - (S_n - M_1)\, y_{n+1} + \sum_{k=n+1}^{\infty} (M_2 - M_1)(y_k - y_{k+1})
= (M_2 - S_n)\, y_{n+1} \le (M_2 - M_1)\, y_{n+1}
\end{aligned}
\]
and (3.3.14).
Fig. 91: Graph of an extended Riemann’s zeta function $\zeta$.
Example 3.3.19. Let $s > 0$. Determine whether the sequence $a_1, a_2, \dots$
defined by
\[
a_n := \frac{(-1)^{n-1}}{n^s}
\]
for all $n \in \mathbb{N}^*$ is summable. Solution: Define
\[
x_n := (-1)^{n-1} , \quad y_n := \frac{1}{n^s}
\]
for all $n \in \mathbb{N}^*$. Then the partial sums $S_1, S_2, \dots$ of $x_1, x_2, \dots$ oscillate
between 0 and 1 and $y_1, y_2, \dots$ is decreasing as well as convergent to 0.
Hence by Theorem 3.3.18 $a_1, a_2, \dots$ is summable and
\[
\sum_{k=1}^{\infty} a_k = \sum_{k=0}^{\infty} \left[ \frac{1}{(2k+1)^s} - \frac{1}{(2k+2)^s} \right]
= \sum_{k=0}^{\infty} \left[ \frac{1}{(2k+1)^s} + \frac{1}{(2k+2)^s} \right] - 2^{1-s} \zeta(s) = (1 - 2^{1-s})\, \zeta(s)
\]
if, in addition, $s > 1$. Note that the last formula can and is used to define $\zeta$
on $(0, 1)$. See Fig. 91.
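For this example, the partial sums of $x_1, x_2, \dots$ have the bounds $M_1 = 0$ and $M_2 = 1$, so that (3.3.14) bounds the truncation error after $n$ terms by $1/(n+1)^s$. The following sketch checks this numerically; the function names are hypothetical, the limit $\ln(2)$ for $s = 1$ is the value quoted earlier in this section, and the comparison value $\pi^2/12$ for $s = 2$ relies on the known value $\zeta(2) = \pi^2/6$, which is not derived here.

```python
import math


def alternating_partial_sum(s: float, n: int) -> float:
    """Return sum_{k=1}^{n} (-1)**(k-1) / k**s."""
    return math.fsum((-1) ** (k - 1) / k ** s for k in range(1, n + 1))


def truncation_bound(s: float, n: int) -> float:
    """Return (M2 - M1) * y_{n+1} = 1 / (n+1)**s with M1 = 0 and M2 = 1."""
    return 1.0 / (n + 1) ** s


if __name__ == "__main__":
    for n in (10, 100, 1000):
        error = abs(alternating_partial_sum(1.0, n) - math.log(2))
        # The observed error stays below the bound from Dirichlet's test.
        print(n, error, truncation_bound(1.0, n))
```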
In some cases where Dirichlet’s test cannot be applied, Abel’s test is of
use. Abel’s test, too, is mainly a consequence of the summation by parts
formula.
Theorem 3.3.20. (Abel’s test) Let $x_1, x_2, \dots$ be a summable sequence of
real numbers and $y_1, y_2, \dots$ a decreasing convergent sequence of real numbers. Then the sequence $x_1 y_1, x_2 y_2, \dots$ is summable and
\[
\sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \left( \sum_{k=1}^{\infty} x_k - M_1 \right) \lim_{k \to \infty} y_k + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \tag{3.3.15}
\]
where $M_1 \in \mathbb{R}$ is a lower bound of the partial sums of $x_1, x_2, \dots$.
Proof. For this, let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$
and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Further, let
$M_3, M_4 \in \mathbb{R}$ be lower and upper bounds, respectively, of $y_1, y_2, \dots$. Then
by Theorem 3.3.17
\[
\sum_{k=1}^{n} x_k y_k = (S_n - M_1)\, y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1})
\]
as well as
\[
0 \le \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \le \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1})
= (M_2 - M_1)(y_1 - y_{n+1}) \le (M_2 - M_1)(M_4 - M_3)
\]
for all $n \in \mathbb{N}^*$. Therefore, the sequence
\[
\sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}) , \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}) , \ \dots
\]
is increasing as well as bounded from above and hence convergent. Finally,
it follows the summability of $x_1 y_1, x_2 y_2, \dots$ and (3.3.15) by the limit laws
for sequences.
The following example gives an application of Abel’s test.
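Example 3.3.21. Show that the sequence $a_1, a_2, \dots$ defined by
\[
a_{2n-1} := \frac{n-1}{n^2} , \quad a_{2n} := - \frac{1}{n+1}
\]
for every $n \in \mathbb{N}^*$ is summable. Solution: We note that
\[
\begin{aligned}
|a_{2n}| - |a_{2n-1}| &= \frac{1}{n+1} - \frac{n-1}{n^2} = \frac{1}{n^2 (n+1)} > 0 , \\
|a_{2n}| - |a_{2n+1}| &= \frac{1}{n+1} - \frac{n}{(n+1)^2} = \frac{1}{(n+1)^2} > 0
\end{aligned}
\]
for all $n \in \mathbb{N}^*$ and hence that the sequence $|a_1|, |a_2|, \dots$ is neither decreasing nor increasing. Hence Dirichlet’s test cannot be directly applied. On
the other hand, Abel’s test can be applied successfully as follows. For this,
we define
\[
\begin{aligned}
x_1 &:= 1 , & x_{2n+1} &:= \frac{1}{n+1} , & x_{2n} &:= - \frac{1}{n} , \\
y_1 &:= 0 , & y_{2n+1} &:= - \frac{n}{n+1} , & y_{2n} &:= - \frac{n}{n+1}
\end{aligned}
\]
for every $n \in \mathbb{N}^*$. Then
\[
\begin{aligned}
x_1 y_1 &= 0 = - a_1 , \\
x_{2n}\, y_{2n} &= \left( - \frac{1}{n} \right) \left( - \frac{n}{n+1} \right) = \frac{1}{n+1} = - a_{2n} , \\
x_{2n+1}\, y_{2n+1} &= \frac{1}{n+1} \left( - \frac{n}{n+1} \right) = - \frac{n}{(n+1)^2} = - a_{2n+1}
\end{aligned}
\]
for all $n \in \mathbb{N}^*$. The partial sums of the sequence $x_1, x_2, \dots$ are given by
\[
\begin{aligned}
\sum_{k=1}^{2n-1} x_k &= \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n-1} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n-1} \frac{1}{k} = \frac{1}{n} , \\
\sum_{k=1}^{2n} x_k &= \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k} = 0
\end{aligned}
\]
for every $n \in \mathbb{N}^*$ such that $n \ge 2$, and hence $x_1, x_2, \dots$ is summable. Further, $y_1, y_2, \dots$ is decreasing and convergent to $-1$. Hence it follows
by Abel’s test that $-a_1, -a_2, \dots$ is summable. Therefore, $a_1, a_2, \dots$ is
summable, too.
Fig. 92: Sequences of absolute values and partial sums of the sequence from Example 3.3.21.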
In the following, we define and study absolutely summable sequences. Any
reordering of such a sequence leads to a convergent series whose sum coincides with the sum of the original series. In applications mainly absolutely
summable sequences occur. Exceptions are rare. One such exception is
described in [9].
Definition 3.3.22. (Absolute summability) A sequence x1 , x2 , . . . of real
numbers is said to be absolutely summable if the corresponding sequence
|x1|, |x2|, . . . is summable. It is called conditionally summable if it is
summable, but |x1 |, |x2 |, . . . is not.
Of course, the previous definition is reasonable only if any absolutely summable
sequence is summable, too. The last is easy to prove.
Theorem 3.3.23. Any absolutely summable sequence of real numbers is
summable.
Proof. For this, let $x_1, x_2, \dots$ be some absolutely summable sequence of
real numbers. Then $x_1 + |x_1|, x_2 + |x_2|, \dots$ is a sequence of positive real
numbers and
\[
\sum_{k=1}^{n} (x_k + |x_k|) \le 2 \sum_{k=1}^{n} |x_k| \le 2 \sum_{k=1}^{\infty} |x_k|
\]
for all $n \in \mathbb{N}^*$. Hence the sequence of partial sums corresponding to the sequence $x_1 + |x_1|, x_2 + |x_2|, \dots$ is increasing as well as bounded from above
and hence convergent. Therefore, $x_1 + |x_1|, x_2 + |x_2|, \dots$ is summable.
Hence it follows by the limit laws that $x_1, x_2, \dots$ is summable, too.
Remark 3.3.24. Note that the previous definition and theorem reduce the
decision whether a given sequence is absolutely summable (and therefore
also summable) to the decision whether a corresponding sequence of positive real numbers is summable. Usually, the decision of the last is relatively
easy, and we already developed a number of tools for this. For this reason,
the second step in the analysis is often the inspection whether the sequence
is absolutely summable. Usually, the first step inspects whether the summability of the sequence can be concluded by help of the limit laws from the
already known summability of certain sequences, or whether there are obvious reasons why the sequence is not summable. If this fails, absolute
summability is investigated. If this also fails, the applicability of Dirichlet’s test or Abel’s test is investigated next.
Example 3.3.25. In Example 3.3.2, we have seen that the geometric series,
defined by
\[
S_n := \sum_{k=0}^{n} x^k
\]
for every $n \in \mathbb{N}$, is convergent if and only if $|x| < 1$ where $x^0 := 1$. In the
last case, this also implies that the geometric series defined by
\[
\bar{S}_n := \sum_{k=0}^{n} |x^k| = \sum_{k=0}^{n} |x|^k ,
\]
where $|x|^0 := 1$, is convergent and hence that $1, x, x^2, \dots$ is absolutely
summable.
Example 3.3.26. Determine whether the sequence
\[
\frac{\sin(1)}{1^2} , \ \frac{\sin(2)}{2^2} , \ \frac{\sin(3)}{3^2} , \ \dots \tag{3.3.16}
\]
is absolutely summable. Solution: For every $k \in \mathbb{N}^*$, it follows that
\[
\left| \frac{\sin(k)}{k^2} \right| \le \frac{1}{k^2} .
\]
Hence it follows by Example 3.3.7 and Theorem 3.3.9 that the sequence
(3.3.16) is absolutely summable.
Example 3.3.27. The examples of the harmonic series, Example 3.3.3, and
the alternating harmonic series, i.e., the case $s = 1$ in Example 3.3.19, show
that not every summable sequence is absolutely summable.
The following characterization of summability is sometimes useful in the
analysis of sequences and will be used later on. It is a simple consequence
of the definition of summability of a sequence and the completeness of the
real number system in the form of Theorem 2.3.17.
Theorem 3.3.28. (Cauchy’s characterization of summable sequences)
A sequence $x_1, x_2, \dots$ of real numbers is summable if and only if the corresponding sequence of partial sums is a Cauchy sequence, i.e., if and only
if for every $\varepsilon > 0$, there is some $N \in \mathbb{N}^*$ such that
\[
\left| \sum_{k=m}^{n} x_k \right| \le \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ satisfying $n \ge m \ge N$.
Proof. First, if $x_1, x_2, \dots$ is a sequence of real numbers whose corresponding sequence of partial sums is a Cauchy sequence, then it follows by Theorem 2.3.17 that the last sequence is convergent and hence that $x_1, x_2, \dots$ is
summable. If $x_1, x_2, \dots$ is a summable sequence of real numbers, then
the corresponding sequence of partial sums is convergent and hence also a
Cauchy sequence according to Theorem 2.3.17. The last can also be proved
directly as follows. For this, let $\varepsilon > 0$. Since $x_1, x_2, \dots$ is summable, there
is $N \in \mathbb{N}^*$ such that
\[
\left| \sum_{k=1}^{m} x_k - \sum_{k=1}^{\infty} x_k \right| \le \frac{\varepsilon}{2}
\]
for all $m \in \mathbb{N}^*$ satisfying $m \ge N$. Hence it follows for all $m, n \in \mathbb{N}^*$ such
that $n \ge m \ge N + 1$ that
\[
\left| \sum_{k=m}^{n} x_k \right| = \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{m-1} x_k \right|
\le \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{\infty} x_k \right| + \left| \sum_{k=1}^{\infty} x_k - \sum_{k=1}^{m-1} x_k \right| \le \varepsilon .
\]
The following corollary is often used to show that a given sequence is not
summable.
Corollary 3.3.29. Let $x_1, x_2, \dots$ be a summable sequence of real numbers.
Then
\[
\lim_{n \to \infty} x_n = 0 .
\]
Example 3.3.30. We consider the sequence $x_1, x_2, \dots$ defined by
\[
x_n := (-1)^n\, \frac{n+1}{n}
\]
for every $n \in \mathbb{N}^*$. If $x_1, x_2, \dots$ were convergent to 0, every one of its
subsequences would also converge to 0. On the other hand,
\[
\lim_{n \to \infty} x_{2n} = \lim_{n \to \infty} \frac{2n+1}{2n} = 1 .
\]
Hence $x_1, x_2, \dots$ is not convergent to 0 and therefore also not summable.
In the following, we give the two most important tests, the ratio test and the
root test, for the decision whether a given sequence is absolutely summable
or not. Both tests compare, by application of Theorem 3.3.9, the corresponding series to geometric series. Usually, the structure of the members
of the sequence decides which of the tests is applied. The ratio test uses for
this the ratio of the absolute values of subsequent members and the root test
the n-th root of the absolute value of the n-th member. Since the structure
of the last is often more complicated than that of the ratio, the ratio test
is more frequently applied.
Theorem 3.3.31. (Ratio test) Let $x_1, x_2, \dots$ be a sequence of real numbers.
(i) If there are $q \in (0,1)$ and $N \in \mathbb{N}^*$ such that
\[ \left| \frac{x_{n+1}}{x_n} \right| \le q \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$, then $x_1, x_2, \dots$ is absolutely summable. Note that this can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.
(ii) If there is $N \in \mathbb{N}^*$ such that
\[ \left| \frac{x_{n+1}}{x_n} \right| \ge 1 \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$, then $x_1, x_2, \dots$ is not summable. Also this can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.
Proof. ‘(i)’: For this, let $q \in (0,1)$ and $N \in \mathbb{N}^*$ be such that $|x_{n+1}| \le q \, |x_n|$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Then it follows by induction that
\[ |x_n| \le |x_N| \, q^{\,n-N} \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$. Hence the absolute summability of $x_1, x_2, \dots$ follows by Example 3.3.2 and Theorem 3.3.9. ‘(ii)’: For this, let $N \in \mathbb{N}^*$ be such that $|x_{n+1}| / |x_n| \ge 1$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Then it follows by induction that
\[ |x_n| \ge |x_N| \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$ and hence, since $x_N \ne 0$, that $x_1, x_2, \dots$ is not converging to $0$. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.
Example 3.3.32. Find all real values $x$ for which the sequence
\[ \frac{x^0}{0!}, \frac{x^1}{1!}, \frac{x^2}{2!}, \dots \]
is summable. Solution: For $x = 0$, the corresponding sequence is obviously absolutely summable. For $x \in \mathbb{R} \setminus \{0\}$, it follows that
\[ \lim_{n \to \infty} \left| \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} \right| = \lim_{n \to \infty} \frac{|x|}{n+1} = 0 \]
and hence by Theorem 3.3.31 the absolute summability of the corresponding sequence.
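The ratio computation in Example 3.3.32 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample value `x = 3.0` are hypothetical, and the comparison with `math.exp` is only an added sanity check that relies on the well-known value of this series.

```python
import math

def exp_series_terms(x, count):
    """Return the terms x**n / n! for n = 0, ..., count - 1."""
    terms = []
    term = 1.0  # x**0 / 0!
    for n in range(count):
        terms.append(term)
        term *= x / (n + 1)  # next term: x**(n+1) / (n+1)!
    return terms

def successive_ratios(terms):
    """Return |x_{n+1} / x_n| for consecutive members."""
    return [abs(b / a) for a, b in zip(terms, terms[1:])]

x = 3.0  # hypothetical sample value
terms = exp_series_terms(x, 60)
ratios = successive_ratios(terms)
partial_sum = sum(terms)

print(ratios[:3], ratios[-1])    # the ratios equal |x| / (n + 1) and shrink towards 0
print(partial_sum, math.exp(x))  # the partial sum is close to exp(x)
```

From some index on, the ratios stay below any fixed $q \in (0,1)$, which is the hypothesis of part (i) of the ratio test.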
Theorem 3.3.33. (Root test) Let $x_1, x_2, \dots$ be a sequence of real numbers.
(i) If there are $q \in [0,1)$ and $N \in \mathbb{N}^*$ such that
\[ |x_n|^{1/n} \le q \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$, then $x_1, x_2, \dots$ is absolutely summable.
(ii) If there is $N \in \mathbb{N}^*$ such that
\[ |x_n|^{1/n} \ge 1 \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$, then $x_1, x_2, \dots$ is not summable.
Proof. ‘(i)’: For this, let $q \in [0,1)$ and $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \le q$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Then it follows that
\[ |x_n| \le q^n \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$ and hence by Example 3.3.2 and Theorem 3.3.9 the absolute summability of $x_1, x_2, \dots$. ‘(ii)’: For this, let $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \ge 1$ for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Then it follows that
\[ |x_n| \ge 1 \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$ and hence that $x_1, x_2, \dots$ is not converging to $0$. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.
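Example 3.3.34. Determine whether the sequence
\[ (-1)^2 \frac{1}{(\ln 2)^2}, \; (-1)^3 \frac{1}{(\ln 3)^3}, \; (-1)^4 \frac{1}{(\ln 4)^4}, \; \dots \]
is summable. Solution: With $n$ ranging over $\mathbb{N} \setminus \{0, 1\}$, it follows that
\[ \lim_{n \to \infty} \left| (-1)^n \frac{1}{(\ln n)^n} \right|^{1/n} = \lim_{n \to \infty} \frac{1}{\ln n} = 0 \]
and hence by Theorem 3.3.33 the absolute summability of the sequence.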
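Example 3.3.35. Note that in the case of the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \frac{1}{n^s} \]
for all $n \in \mathbb{N}^*$, where $s > 0$, neither the ratio test nor the root test can be applied, since
\[ \lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^{-s} = 1 , \]
\[ \lim_{n \to \infty} |a_n|^{1/n} = \lim_{n \to \infty} n^{-s/n} = \lim_{n \to \infty} e^{-s \ln(n)/n} = e^0 = 1 . \]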
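The two limits of Example 3.3.35 can be observed numerically. This is a small illustrative sketch; the helper names and the sample exponent are hypothetical choices. Both quantities approach 1 from below, so no fixed $q < 1$ bounds them, and they also never reach 1, which is why neither part of the two tests applies.

```python
def ratio(n, s):
    """|a_{n+1} / a_n| for a_n = 1 / n**s."""
    return (n / (n + 1)) ** s

def nth_root(n, s):
    """|a_n| ** (1/n) for a_n = 1 / n**s."""
    return n ** (-s / n)

s = 2.0  # hypothetical sample exponent
for n in (10, 1000, 10**6):
    print(n, ratio(n, s), nth_root(n, s))  # both columns tend to 1
```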
Finally, by application of Cauchy’s characterization of summable sequences,
Theorem 3.3.28, we prove that every reordering of an absolutely summable
sequence leads to a convergent series whose sum coincides with the sum of
the original series.
Theorem 3.3.36. (Rearrangements of absolutely convergent series) Let $x_1, x_2, \dots$ be an absolutely summable sequence of real numbers. Further, let $f : \mathbb{N}^* \to \mathbb{N}^*$ be bijective. Then the sequence $x_{f(1)}, x_{f(2)}, \dots$ is also absolutely summable and
\[ \sum_{k=1}^{\infty} x_k = \sum_{k=1}^{\infty} x_{f(k)} . \tag{3.3.17} \]
Proof. First, it follows that the sequence of partial sums of $|x_{f(1)}|, |x_{f(2)}|, \dots$ is increasing with upper bound $\sum_{k=1}^{\infty} |x_k|$ and hence convergent. Hence $x_{f(1)}, x_{f(2)}, \dots$ is absolutely summable. Further, let $\varepsilon > 0$. By Theorem 3.3.28, there is $N \in \mathbb{N}^*$ such that for all $n, m \in \mathbb{N}^*$ satisfying $n \ge m \ge N$, it follows that
\[ \sum_{k=m}^{n} |x_k| \le \varepsilon . \]
Since $f$ is bijective, there is $N_f \in \mathbb{N}^*$ such that
\[ \{1, \dots, N\} \subset \{f(1), \dots, f(N_f)\} . \]
Hence it follows for every $n \in \mathbb{N}^*$ satisfying $n \ge \max\{N, N_f\}$:
\[ \left| \sum_{k=1}^{n} x_{f(k)} - \sum_{k=1}^{n} x_k \right| \le \varepsilon . \]
Hence it follows also (3.3.17).
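A numerical illustration of Theorem 3.3.36, given only as a sketch: the sequence $x_k = (-1)^{k+1}/k^2$ and the rearrangement (two odd-indexed members followed by one even-indexed member) are hypothetical choices made for this example and do not come from the text.

```python
def original_partial_sum(n_terms):
    """Partial sum of x_k = (-1)**(k+1) / k**2 in the natural order."""
    return sum((-1) ** (k + 1) / k**2 for k in range(1, n_terms + 1))

def rearranged_partial_sum(blocks):
    """Partial sum of the rearrangement 'two odd-indexed members, then one
    even-indexed member', repeated `blocks` times (3 * blocks members)."""
    total, odd, even = 0.0, 1, 2
    for _ in range(blocks):
        total += 1 / odd**2 + 1 / (odd + 2) ** 2  # x_odd and x_(odd+2) are positive
        odd += 4
        total -= 1 / even**2                      # x_even is negative
        even += 2
    return total

blocks = 100_000
a = original_partial_sum(3 * blocks)
b = rearranged_partial_sum(blocks)
print(a, b, abs(a - b))  # the two partial sums nearly coincide
```

Every index is used exactly once by the rearrangement, so it corresponds to a bijection of $\mathbb{N}^*$; the small remaining gap comes from the tails that the two partial sums have not yet covered.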
Problems
1) Express the periodic decimal expansion as a fraction.
,
a) 0.9
b) 0.3
,
c) 0.377
.
2) Calculate
8̧ 1 1
8̧ 3p1 p1qn q
1
, b)
,
a)
2n
4 n n 4
n1
n1
8̧
8̧
1
1
npn 3q
c)
,
npn 1qpn 3q
d)
n 1
.
n 1
3) Determine whether the sequence a1 , a2 , . . . is absolutely summable,
conditionally summable or not summable.
a) ak
a) ak
c) ak
k1 k
4
5
,
plnpk 1 1qqk
arctanpkq ,
k 4{3
1
p3q pk!1
g)
ak
1{k
p1qk ke2
b)
2
d)
ak
,
f)
,
h)
31{k
,
p2k2 1q! ,
?
?
k 3 k
ak k
ak
2
p1qk k2k k 3 2 , j)
p2kq! , l) a kk ,
ak k
p3kq!
k!
i) ak
k)
k
2k 1
k2 q
k
e) ak
,
51{k ,
p1qk
a ak
b)
p1qk k2 k
3
ak
e3kk{2
,
3
,
,
ak
k)
where k
1
1
k
k
P N .
4) Determine the values q
a3 , a4 , . . .
rlnkpk2qs
3
,
l) ak
¥ 1 for which the corresponding sequence
ak :
1
,
k lnpk q rlnplnpk qqsq
P t3, 4, . . . u, is summable. Give reasons for your answer.
Define a4k : a4k 1 : 1, a4k 2 : a4k 3 : 1 for every k P N.
k
5)
Determine whether the sequence
?ak ,
3 k 7
k P N, is absolutely summable, conditionally summable or not summable.
Give reasons for your answer.
6) Estimate the error if the sum of the first N terms is used as an approximation of the series.
a)
a)
8̧ 1
, N 3 , b)
n2
n1
8̧ p1qn 1
n 1
n
, N
7
8̧
1
n
r
ln
pnqs2 , N
n2
,
b)
8̧
p1qn
n2
n 1
9
1
, N
,
14
.
7) Calculate the sum correct to 3 decimal places
8̧ 1
8̧
1
,
b)
,
4
5 lnpnq
n
n
n1
n2
8̧
8̧
p1qn p2n1 q! , b)
p1qn
a)
n1
n1
a)
1
n!
n2n
.
8) A rubber ball falls from an initial height of 3 m. Whenever it hits the ground, it bounces up to 3/4 of the previous height. What total distance is covered by the ball before it comes to rest?
9) If $a_1, a_2, \dots$ is a sequence of real numbers such that
\[ \lim_{n \to \infty} a_n = 0 , \]
does this imply the summability of the sequence? Give reasons for your answer.
10) Give an example for a convergent sequence of real numbers a1 , a2 , . . .
and a divergent sequence of real numbers b1 , b2 , . . . satisfying
lim
n
an
Ñ8 an
1
1,
lim
n
bn
Ñ8 bn
1
1.
11) Give an example for a convergent sequence of real numbers $a_1, a_2, \dots$ and a divergent sequence of real numbers $b_1, b_2, \dots$ satisfying
\[ \lim_{n \to \infty} (a_n)^{1/n} = 1 , \qquad \lim_{n \to \infty} (b_n)^{1/n} = 1 . \]
12) Assume that $a_1, a_2, \dots$ is a summable sequence of positive real numbers. Show that the sequence $a_1^2, a_2^2, \dots$ is also summable. Is the last also generally true if members of $a_1, a_2, \dots$ can be negative?
3.4 Series of Functions
One main application of series is in the form of series of functions, i.e., series containing one or more parameters varying in a certain domain.
For motivation, we consider Leibniz’s ‘arithmetical quadrature of the circle’ from 1673. By ‘arithmetical’ quadrature, Leibniz meant a representation of an area as a sum of an infinite sequence of rational numbers. He arrived at this ‘Leibniz series’ for the area of the circle through application of his ‘transmutation theorem’, by which he could also derive essentially all plane quadrature results known at the time. From today’s perspective, the statement of that theorem is a consequence of the method of integration by parts and the change of variable formula. For this reason, Leibniz’s transmutation theorem will not be used in the following. For instance, see [36] for information on that theorem. As a consequence, the first steps in the derivation below will not look very natural. Leibniz uses the following representation
\[ S := \{ (x, y) \in \mathbb{R}^2 : (x-1)^2 + y^2 = 1 \} \]
of a circle of radius $1$ and center $(1, 0)$. Since $(x, y) \in S$ if and only if
\[ 0 = (x-1)^2 + y^2 - 1 = x^2 - 2x + 1 + y^2 - 1 = y^2 - (2x - x^2) , \]
the area of the upper left quarter of the circle is given by the area below the graph of $([0,1] \to \mathbb{R}, x \mapsto \sqrt{2x - x^2})$, see Fig 93. Hence
\[ \frac{\pi}{4} = \int_0^1 \sqrt{2x - x^2} \, dx . \]
Instead of applying Leibniz’s ‘transmutation theorem’, we proceed as follows. First, it follows by partial integration that
\[ \int_0^1 \sqrt{2x - x^2} \, dx = \left[ x \sqrt{2x - x^2} \right]_0^1 - \int_0^1 x \, \frac{2 - 2x}{2 \sqrt{2x - x^2}} \, dx = 1 - \int_0^1 \frac{(2x - x^2) - x}{\sqrt{2x - x^2}} \, dx \]
and hence that
\[ \int_0^1 \sqrt{2x - x^2} \, dx = 1 - \int_0^1 \sqrt{2x - x^2} \, dx + \int_0^1 \sqrt{\frac{x}{2 - x}} \, dx = \frac{1}{2} \left[ 1 + \int_0^1 \sqrt{\frac{x}{2 - x}} \, dx \right] . \]
Fig. 93: The Leibniz series gives a representation of the area of a quarter of a circle of radius 1 in terms of a sum of an infinite sequence of rational numbers.
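Second, we use change of variables. For this, we define
\[ g(u) := \frac{2u^2}{1 + u^2} \]
for every $u \in \mathbb{R}$. In particular, $g$ is continuously differentiable with derivative
\[ g'(u) = \frac{4u \, (1 + u^2) - 2u^2 \cdot 2u}{(1 + u^2)^2} = \frac{4u}{(1 + u^2)^2} \]
for all $u \in \mathbb{R}$ and hence $g$ is strictly increasing on $[0, \infty)$ and $g(0) = 0$, $g(1) = 1$. Also,
\[ \frac{g(u)}{2 - g(u)} = \frac{\dfrac{2u^2}{1 + u^2}}{2 - \dfrac{2u^2}{1 + u^2}} = \frac{\dfrac{2u^2}{1 + u^2}}{\dfrac{2}{1 + u^2}} = u^2 \]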
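for all $u \in \mathbb{R}$. Hence it follows by change of variables and partial integration that
\[ \frac{1}{2} \left[ 1 + \int_0^1 \sqrt{\frac{x}{2 - x}} \, dx \right] = \frac{1}{2} \left[ 1 + \int_{g(0)}^{g(1)} \sqrt{\frac{x}{2 - x}} \, dx \right] = \frac{1}{2} \left[ 1 + \int_0^1 \sqrt{\frac{g(u)}{2 - g(u)}} \; g'(u) \, du \right] \]
\[ = \frac{1}{2} \left[ 1 + \int_0^1 u \, g'(u) \, du \right] = \frac{1}{2} \left[ 1 + \left[ u \, g(u) \right]_0^1 - \int_0^1 g(u) \, du \right] = \frac{1}{2} \left[ 2 - \int_0^1 g(u) \, du \right] = 1 - \int_0^1 \frac{u^2}{1 + u^2} \, du . \]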
In this way, as did Leibniz by use of his ‘transmutation method’, we arrive at the equation
\[ \frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2} \, du . \]
In the next step, Leibniz uses that $1/(1 + u^2)$ is the limit of a geometric series
\[ \frac{1}{1 + u^2} = \frac{1}{1 - (-u^2)} = \sum_{k=0}^{\infty} (-u^2)^k = \sum_{k=0}^{\infty} (-1)^k u^{2k} \tag{3.4.1} \]
for every $u \in \mathbb{R}$ satisfying $|u| < 1$, where we use the usual convention that $x^0 := 1$ for all $x \in \mathbb{R}$, and concludes that
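\[ \frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2} \, du = 1 - \int_0^1 \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} \, du = 1 - \sum_{k=0}^{\infty} (-1)^k \int_0^1 u^{2(k+1)} \, du \]
\[ = 1 - \sum_{k=0}^{\infty} (-1)^k \frac{1}{2k + 3} = 1 + \sum_{k=0}^{\infty} (-1)^{k+1} \frac{1}{2(k+1) + 1} = 1 + \sum_{n=1}^{\infty} (-1)^n \frac{1}{2n + 1} \]
to arrive at the Leibniz series
\[ \frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n + 1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots . \tag{3.4.2} \]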
Fig. 94: Members of Leibniz’ series and their limit value, $\pi/4$.
Note that the previous derivation proceeds without any reference to trigonometric functions or their inverses.
Indeed, as is proved in Example 3.4.20 later on, (3.4.2) turns out to be correct, but the exchange of integration and summation in the last part of the derivation needs justification. In particular, it has to be taken into account that the sum in (3.4.1) diverges at the right end $u = 1$ of the interval of integration.
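The slow convergence of the Leibniz series (3.4.2) can be seen from its partial sums. The sketch below is illustrative only; the function name is hypothetical, and the error bound used in the comparison is the standard estimate for alternating series with decreasing members, which is an assumption brought in from outside this passage.

```python
import math

def leibniz_partial_sum(n):
    """Sum of (-1)**k / (2k + 1) for k = 0, ..., n."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n + 1))

for n in (10, 100, 1000):
    s = leibniz_partial_sum(n)
    # error compared with the first omitted member 1 / (2n + 3)
    print(n, s, abs(s - math.pi / 4), 1 / (2 * n + 3))
```

The error decays only like $1/n$, so the series is of conceptual rather than computational interest for obtaining digits of $\pi$.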
In the following, we consider the situation in the last part of Leibniz’s derivation in a little more detail. For this, we define for every $n \in \mathbb{N}$ the corresponding function $f_n : [0,1] \to \mathbb{R}$ by
\[ f_n(u) := \sum_{k=0}^{n} (-1)^k u^{2(k+1)} \]
for every $u \in [0,1]$. In particular, $f_n$ is continuous and hence Riemann-integrable for every $n \in \mathbb{N}$ and
\[ \lim_{n \to \infty} f_n(u) = \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} = \frac{u^2}{1 + u^2} \]
for every $u \in [0,1)$. Also, the ‘limit function’ $f : [0,1] \to \mathbb{R}$ of the sequence of functions $f_0, f_1, \dots$, defined by
\[ f(u) := \frac{u^2}{1 + u^2} \]
for every $u \in [0,1]$, is continuous and hence Riemann-integrable. In this situation, Leibniz uses that
\[ \int_0^1 f(u) \, du = \int_0^1 \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} \, du = \sum_{k=0}^{\infty} \int_0^1 (-1)^k u^{2(k+1)} \, du = \lim_{n \to \infty} \int_0^1 f_n(u) \, du . \]
In this special case, this exchange of integration and summation turns out to be correct. On the other hand, it is not difficult to find cases where such an exchange leads to incorrect results, see Example 3.4.2 (iii). It is important to note that such exchanges of integration, differentiation and other limit operations were quite common from the 17th century up to the first quarter of the 19th century. For instance, a mathematician like Euler would not have hesitated to perform such an exchange without seeing a necessity for justification.
A main goal of this section is to derive general conditions on a sequence of functions $f_0, f_1, \dots$ and the limit function $f$ such that
\[ \int_a^b f(u) \, du = \lim_{n \to \infty} \int_a^b f_n(u) \, du \]
where $a, b \in \mathbb{R}$ are such that $a \le b$, $f_n : [a,b] \to \mathbb{R}$ is continuous for every $n \in \mathbb{N}$, $f_0(x), f_1(x), \dots$ is convergent for every $x \in [a,b]$ and $f : [a,b] \to \mathbb{R}$ is defined by
\[ f(x) := \lim_{n \to \infty} f_n(x) \]
for every $x \in [a,b]$. Further equally important goals are the derivation of analogous conditions for situations where integration is replaced by differentiation and other limit operations.
For the last, there is an important historic example. In his textbook ‘Cours
d’analyse’ from 1821 [22], Cauchy gives a false theorem, along with an incorrect proof, that would imply that from the continuity of the members of
such a sequence f0 , f1 , . . . it would follow the continuity of the limit function f . Indeed, the last, and therefore also Cauchy’s ‘theorem’, is incorrect,
see Example 3.4.2(i). Apparently, this has been noticed first by Niels Henrik Abel in 1826 [1] who gave a counterexample.
After the previous introduction, we start the following with a collection
of counterexamples. For this, we make the following definition.
Definition 3.4.1. (The pointwise limit of a sequence of functions) Let $f_1, f_2, \dots$ be a sequence of functions defined on subsets of $\mathbb{R}$. We define the pointwise limit of $f_1, f_2, \dots$ as the function $f$ defined by
\[ f(x) := \lim_{n \to \infty} f_n(x) \]
for all $x$ in the intersection of the domains of $f_1, f_2, \dots$ which are such that the corresponding sequence $f_1(x), f_2(x), \dots$ is convergent. Note that $f$ has an empty domain in case that no such $x$ exists.
Applying the previous terminology, the following Example shows the following facts. Part (i) shows that the pointwise limit of a sequence of continuous functions is not always a continuous function. Part (ii) shows that
the sequence of derivatives, associated to a sequence of differentiable functions that converge pointwise to a differentiable function f , does not always
converge pointwise to the derivative of f . Finally, Part (iii) shows that the
sequence of integrals, associated to a sequence of Riemann-integrable functions that converge pointwise to a Riemann-integrable function f , does not
always converge to the integral of f .
Fig. 95: Graphs of the first five functions of the series from Example 3.4.2(i).
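Example 3.4.2. (Examples of limits of sequences of functions)
(i) Define the sequence of infinitely often differentiable functions $f_1, f_2, \dots$ by
\[ f_n(x) := \frac{x^{2n}}{1 + x^{2n}} \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$. Then
\[ \left[ \lim_{n \to \infty} f_n \right](x) := \lim_{n \to \infty} f_n(x) = \begin{cases} 0 & \text{if } |x| < 1 \\ \frac{1}{2} & \text{if } x \in \{-1, 1\} \\ 1 & \text{if } |x| > 1 \end{cases} \]
and hence $\lim_{n \to \infty} f_n$ is not a continuous function.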
(ii) Define the sequence of differentiable functions $g_1, g_2, \dots$ by
\[ g_n(x) := \frac{\sin(nx)}{n} \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$. Then
\[ \left[ \lim_{n \to \infty} g_n \right](x) := \lim_{n \to \infty} g_n(x) = 0 \]
for all $x \in \mathbb{R}$ and hence $\lim_{n \to \infty} g_n$ is a differentiable function. On the other hand, because of
\[ g_n'(x) = \cos(nx) \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$, the limit of $g_1'(x), g_2'(x), \dots$ does not exist for any $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$ and
\[ 1 = \lim_{n \to \infty} g_n'(0) \ne \left[ \lim_{n \to \infty} g_n \right]'(0) = 0 . \]
Hence the sequence $g_1', g_2', \dots$ of derivatives of $g_1, g_2, \dots$ does not converge pointwise to the derivative of $\lim_{n \to \infty} g_n$.
Fig. 96: Graphs of the first 4 functions of the series from Example 3.4.2(ii).
Fig. 97: Graphs of the derivatives of the first 4 functions of the series from Example 3.4.2(ii).
Fig. 98: Graphs of the first 10 functions of the series from Example 3.4.2(iii).
(iii) Define the sequence of continuous functions $h_1, h_2, \dots$ by
\[ h_n(x) := n x \, (1 - x^2)^n \]
for all $n \in \mathbb{N}^*$ and $x \in [0,1]$. Then
\[ \left[ \lim_{n \to \infty} h_n \right](x) := \lim_{n \to \infty} h_n(x) = 0 \]
for all $x \in [0,1]$ defines a continuous function. On the other hand,
\[ \int_0^1 h_n(x) \, dx = \frac{n}{2(n+1)} \]
for every $n \in \mathbb{N}^*$ and hence
\[ \frac{1}{2} = \lim_{n \to \infty} \int_0^1 h_n(x) \, dx \ne \int_0^1 \left[ \lim_{n \to \infty} h_n \right](x) \, dx = 0 , \]
i.e., the limit of the integrals of $h_1, h_2, \dots$ over $[0,1]$ is different from the integral of its limit function over $[0,1]$.
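Part (iii) of the example lends itself to a quick numerical check. The following is a sketch under stated assumptions: the midpoint rule and the step count are hypothetical choices for this illustration, not a method used in the text.

```python
def h(n, x):
    """h_n(x) = n * x * (1 - x**2)**n on [0, 1]."""
    return n * x * (1 - x * x) ** n

def midpoint_rule(func, a, b, steps=20000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

for n in (1, 10, 100):
    numeric = midpoint_rule(lambda x: h(n, x), 0.0, 1.0)
    exact = n / (2 * (n + 1))   # closed form stated in the example
    print(n, numeric, exact)    # the integrals approach 1/2

print(h(1000, 0.5))             # pointwise values tend to 0
```

The integrals approach $1/2$ although every pointwise value tends to $0$: the mass of $h_n$ concentrates in an ever narrower and higher bump near $x = 0$.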
The following defines the notion of uniform convergence of a sequence of functions. It is more restrictive than that of pointwise convergence and, if present, will turn out to allow the exchange of the operations of integration, differentiation and the taking of limits. Its definition is likely due to Christoph Gudermann, the teacher of Weierstrass. From 1841 on, Weierstrass used it routinely in his work on power series. Through Weierstrass’ lectures on analysis at Berlin in 1859 and 1860, the mathematical world slowly became aware of the importance of the concept.
Definition 3.4.3. (Uniform convergence) A sequence $f_1, f_2, \dots$ of functions on some non-empty subset $T$ of $\mathbb{R}$ is said to be uniformly convergent to some function $f : T \to \mathbb{R}$, if for every $\varepsilon > 0$ there is some $N \in \mathbb{N}^*$ such that
\[ |f_n(x) - f(x)| < \varepsilon \]
for all $x \in T$ and all $n \in \mathbb{N}^*$ such that $n \ge N$. Note that the last is equivalent to the requirement that the graph of $f_n$ is contained in the ‘uniform neighborhood of size $\varepsilon$’
\[ \{ (x, y) \in T \times \mathbb{R} : y \in (f(x) - \varepsilon, f(x) + \varepsilon) \} \]
around the graph of $f$ for all $n \in \mathbb{N}^*$ such that $n \ge N$, see Fig 99.
Fig. 99: A uniform neighborhood of size $1/4$ around $\sin(2x)$.
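The difference between pointwise and uniform convergence can be made visible by estimating $\sup_x |f_n(x) - f(x)|$ on a grid. This is only a rough sketch: a finite grid gives a lower estimate of the supremum, and the grid on $[-2, 2]$ as well as all function names are hypothetical choices. The two sequences are those of Example 3.4.2 (i) and (ii).

```python
import math

def sup_distance(fn, f, grid):
    """Largest value of |fn(x) - f(x)| over the grid points."""
    return max(abs(fn(x) - f(x)) for x in grid)

grid = [(i - 2000) / 1000 for i in range(4001)]   # points -2.000, -1.999, ..., 2.000

def g(n):
    return lambda x: math.sin(n * x) / n           # Example 3.4.2(ii)

def f(n):
    return lambda x: x ** (2 * n) / (1 + x ** (2 * n))  # Example 3.4.2(i)

def f_limit(x):
    """Pointwise limit of the functions f(n)."""
    if abs(x) < 1:
        return 0.0
    return 0.5 if abs(x) == 1 else 1.0

for n in (5, 50):
    print(n, sup_distance(g(n), lambda x: 0.0, grid), sup_distance(f(n), f_limit, grid))
```

The first column of distances behaves like $1/n$, in line with uniform convergence of $g_1, g_2, \dots$ to $0$; the second column stays near $1/2$, so the convergence in part (i) is pointwise but not uniform.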
Subsequently, we prove three statements on the validity of the exchange of
the operations of integration, differentiation and the performing of (other)
limits.
Theorem 3.4.4. (Uniform limits of continuous functions are continuous) Let $f_1, f_2, \dots$ be a sequence of functions on some non-empty subset $T$ of $\mathbb{R}$ which is uniformly convergent to some function $f : T \to \mathbb{R}$. Further, let all $f_1, f_2, \dots$ be continuous at some point $x_0 \in T$. Then $f$ is continuous at $x_0$, too, i.e.,
\[ \lim_{n \to \infty} \lim_{x \to x_0} f_n(x) = \lim_{x \to x_0} \lim_{n \to \infty} f_n(x) . \]
Proof. For this, let $a_1, a_2, \dots$ be some sequence in $T$ which is convergent to $x_0$ and $\varepsilon > 0$. Then for every $m, n \in \mathbb{N}^*$
\[ |f(a_n) - f(x_0)| = |f(a_n) - f_m(a_n) + f_m(a_n) - f_m(x_0) + f_m(x_0) - f(x_0)| \]
\[ \le |f(a_n) - f_m(a_n)| + |f_m(a_n) - f_m(x_0)| + |f_m(x_0) - f(x_0)| . \tag{3.4.3} \]
Since $f_1, f_2, \dots$ is uniformly convergent to $f$, there is $m_0 \in \mathbb{N}^*$ such that
\[ |f_{m_0}(x) - f(x)| \le \frac{\varepsilon}{3} \]
for all $x \in T$. Further, since $f_{m_0}$ is continuous in $x_0$, there is $N \in \mathbb{N}^*$ such that
\[ |f_{m_0}(a_n) - f_{m_0}(x_0)| \le \frac{\varepsilon}{3} \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Hence it follows by (3.4.3) that
\[ |f(a_n) - f(x_0)| \le \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \ge N$.
Theorem 3.4.5. (A simple limit theorem for Riemann integrals) Let $f_1, f_2, \dots$ be a sequence of almost everywhere continuous functions on $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a \le b$, which is uniformly convergent to some almost everywhere continuous function $f : [a,b] \to \mathbb{R}$. Then
\[ \lim_{n \to \infty} \int_a^b f_n(x) \, dx = \int_a^b \left[ \lim_{n \to \infty} f_n \right](x) \, dx = \int_a^b f(x) \, dx . \]
Proof. For this, let $\varepsilon > 0$. Since $f_1, f_2, \dots$ is uniformly convergent to $f$, there is $N \in \mathbb{N}^*$ such that
\[ |f_n(x) - f(x)| \le \varepsilon \]
for all $x \in [a,b]$ and all $n \in \mathbb{N}^*$ satisfying $n \ge N$. Hence for such $n$
\[ \left| \int_a^b f_n(x) \, dx - \int_a^b f(x) \, dx \right| \le \int_a^b |f_n(x) - f(x)| \, dx \le \int_a^b \varepsilon \, dx = (b - a) \, \varepsilon . \]
Theorem 3.4.6. Let $f_1, f_2, \dots$ be a sequence of continuous functions on $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a \le b$, such that $f_1(x_0), f_2(x_0), \dots$ is convergent for some $x_0 \in [a,b]$. Further, let the restriction of each $f_n$, $n \in \mathbb{N}^*$, to $(a,b)$ be differentiable with a derivative that can be extended to a continuous function $f_n'$ on $[a,b]$. Finally, let the sequence $f_1', f_2', \dots$ be uniformly convergent to some continuous function $g : [a,b] \to \mathbb{R}$. Then $f_1, f_2, \dots$ is uniformly convergent to a continuous function $f : [a,b] \to \mathbb{R}$ whose restriction to $(a,b)$ is differentiable with derivative given by $g|_{(a,b)}$. Hence in particular,
\[ \lim_{n \to \infty} f_n'(x) = \left[ \lim_{n \to \infty} f_n \right]'(x) \tag{3.4.4} \]
for all $x \in (a,b)$.
Proof. By Theorem 2.6.21, it follows that
\[ f_n(x) = \int_a^x f_n'(y) \, dy + f_n(a) \]
for all $n \in \mathbb{N}^*$ and $x \in [a,b]$. Further, from this, the convergence of $f_1(x_0), f_2(x_0), \dots$, the uniform convergence of $f_1', f_2', \dots$ to $g$ and Theorem 3.4.5, it follows the pointwise convergence of $f_1(x), f_2(x), \dots$ to some $f : [a,b] \to \mathbb{R}$ and
\[ f(x) = \int_a^x g(y) \, dy + \lim_{n \to \infty} f_n(a) \]
for all $x \in [a,b]$. Further, from this follows by Theorem 2.6.19 and the continuity of $g$ that $f$ is continuous with its restriction to $(a,b)$ being differentiable with derivative given by $g|_{(a,b)}$. Finally, from
\[ |f_n(x) - f(x)| \le \int_a^x |f_n'(y) - g(y)| \, dy + \left| f_n(a) - \lim_{n \to \infty} f_n(a) \right| , \]
valid for every $n \in \mathbb{N}^*$ and $x \in [a,b]$, the uniform convergence of $f_1', f_2', \dots$ to $g$ and the convergence of $f_1(a), f_2(a), \dots$ to $\lim_{n \to \infty} f_n(a)$, it follows the uniform convergence of $f_1, f_2, \dots$ to $f$.
Remark 3.4.7. Note that Example 3.4.2(ii) shows that only the assumption
of a uniform convergence of the sequence f1 , f2 , . . . alone in Theorem 3.4.6
does not guarantee the validity of (3.4.4) in general.
The previous three theorems demonstrate the usefulness of the notion of
uniform convergence of a sequence of functions. Therefore, it is important
to derive criteria that indicate the presence of such a convergence. In this
connection, we give only the most simple and at the same time most useful
criterion due to Weierstrass.
Theorem 3.4.8. (Weierstrass test) Let $f_1, f_2, \dots$ be a sequence of functions on some non-empty subset $T$ of $\mathbb{R}$ for which there is a summable sequence $M_1, M_2, \dots$ of positive real numbers such that
\[ |f_n(x)| \le M_n \]
for all $x \in T$ and $n \in \mathbb{N}^*$. Then the series $S_1, S_2, \dots$ defined by
\[ S_n := \sum_{k=1}^{n} f_k \]
for all $n \in \mathbb{N}^*$ is uniformly convergent to a function $S : T \to \mathbb{R}$ on $T$. Also, the series $S_1(x), S_2(x), \dots$ is absolutely convergent to $S(x)$ for all $x \in T$.
Proof. For this, let $x \in T$. Then by Theorem 3.3.9, it follows the absolute summability of $f_1(x), f_2(x), \dots$. Hence we can define $S : T \to \mathbb{R}$ by $S(x) := \sum_{k=1}^{\infty} f_k(x)$ for all $x \in T$. Further, for given $\varepsilon > 0$, there is $N \in \mathbb{N}^*$ such that
\[ \left| \sum_{k=1}^{n} M_k - \sum_{k=1}^{\infty} M_k \right| \le \varepsilon \]
for all $n \in \mathbb{N}^*$ satisfying $n \ge N$. For such $n$, it follows that
\[ \left| \sum_{k=1}^{n} f_k(x) - S(x) \right| \le \sum_{k=n+1}^{\infty} |f_k(x)| \le \sum_{k=n+1}^{\infty} M_k \le \varepsilon \]
for all $x \in T$ and hence the uniform convergence of $S_1, S_2, \dots$ to $S$.
The following two examples give standard applications of Theorem 3.4.5
and Theorem 3.4.6, respectively.
Example 3.4.9. Calculate
\[ \sum_{n=1}^{\infty} \frac{x^n}{n} \]
for all $x \in (-1, 1)$. Solution: We notice that by formal ‘termwise differentiation’ of this sum with respect to $x$ we arrive at a geometric series. This fact will be exploited in the following. For this, we define
\[ f_n(x) := x^n \]
for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$. Then
\[ |f_n(x)| \le \left( \max\{|a|, |b|\} \right)^n \]
for all $n \in \mathbb{N}$ and $x \in [a,b]$ where $a, b \in \mathbb{R}$ are such that $-1 < a \le b < 1$. Since $\max\{|a|, |b|\} < 1$, Theorem 3.4.8 implies the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[a,b]} \]
for $n \to \infty$ as well as the absolute summability of $f_0(x), f_1(x), \dots$ for all $x \in (-1, 1)$. Further, it follows by Theorem 3.4.5 that
\[ \sum_{k=0}^{\infty} \frac{b^{k+1} - a^{k+1}}{k+1} = \lim_{n \to \infty} \int_a^b S_{fn}(x) \, dx = \int_a^b \lim_{n \to \infty} S_{fn}(x) \, dx = \int_a^b \frac{dx}{1 - x} = \ln\!\left( \frac{1 - a}{1 - b} \right) \]
and hence, finally, that
\[ \sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} = -\ln(1 - x) \]
for all $x \in (-1, 1)$.
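The closed form obtained in Example 3.4.9 can be compared with truncated sums. A minimal sketch; the function name, the number of terms and the sample points are hypothetical choices.

```python
import math

def log_series(x, terms=2000):
    """Truncated sum of x**n / n for n = 1, ..., terms."""
    return sum(x**n / n for n in range(1, terms + 1))

for x in (-0.9, -0.5, 0.3, 0.9):
    print(x, log_series(x), -math.log(1 - x))  # both columns agree closely
```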
Example 3.4.10. Calculate
\[ \sum_{n=1}^{\infty} n x^n \]
for all $x \in (-1, 1)$.
Solution: We note that
\[ \sum_{n=1}^{\infty} n x^n = \sum_{n=1}^{\infty} \left[ (n+1) \, x^n - x^n \right] = \sum_{n=1}^{\infty} (n+1) \, x^n - \frac{x}{1 - x} \]
for all $x \in (-1, 1)$ and that by formal integration of the last sum with respect to $x$ we arrive at a geometric series. This fact will be exploited in the following. For this, we define
\[ f_n(x) := x^n , \qquad g_n(x) := n x^{n-1} \]
for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$. Then
\[ |f_n(x)| \le \left( \max\{|a|, |b|\} \right)^n , \qquad |g_n(x)| \le n \left( \max\{|a|, |b|\} \right)^{n-1} \]
for all $n \in \mathbb{N}$ and $x \in [a,b]$ where $a, b \in \mathbb{R}$ are such that $-1 < a \le b < 1$. Since $\max\{|a|, |b|\} < 1$, Theorem 3.4.8 implies the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[a,b]} , \qquad S_{gn} := \sum_{k=0}^{n} g_k|_{[a,b]} \]
for $n \to \infty$ as well as the absolute summability of $f_0(x), f_1(x), \dots$ and $g_0(x), g_1(x), \dots$ for all $x \in (-1, 1)$. Hence it follows by Theorem 3.4.6 that
\[ \sum_{k=0}^{\infty} k x^{k-1} = \lim_{n \to \infty} S_{gn}(x) = \lim_{n \to \infty} S_{fn}'(x) = \left[ \lim_{n \to \infty} S_{fn} \right]'(x) = \frac{1}{(1 - x)^2} \]
for all $x \in (a, b)$. Hence, finally, it follows that
\[ \sum_{n=1}^{\infty} n x^n = \frac{x}{(1 - x)^2} \]
for all $x \in (-1, 1)$.
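As a plausibility check of the result of Example 3.4.10, truncated sums can be compared with the closed form. This is a sketch only; the function name, the truncation length and the sample points are hypothetical.

```python
def weighted_geometric_sum(x, terms=2000):
    """Truncated sum of n * x**n for n = 1, ..., terms."""
    return sum(n * x**n for n in range(1, terms + 1))

for x in (-0.9, -0.5, 0.3, 0.9):
    print(x, weighted_geometric_sum(x), x / (1 - x) ** 2)  # both columns agree closely
```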
The following example gives a less simple application of Theorem 3.4.5 to
an improper integral from a well-known representation of the Riemann zeta
function. In such applications, Theorem 3.4.5 is applied to every member
of a sequence of Riemann integrals whose limit coincides with the improper
integral.
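Example 3.4.11. Show that
\[ \zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \tag{3.4.5} \]
for all $s > 1$. Solution: For this, let $s > 1$. According to (2.5.4)
\[ e^x - 1 > x \]
for all $x \in (0, \infty)$ and hence
\[ \frac{x^{s-1}}{e^x - 1} < x^{s-2} \]
for all $x \in (0, 1]$. Further, an easy calculation shows that
\[ e^x - 1 > e^{x/2} \]
for all $x \ge 1$ and hence that
\[ \frac{x^{s-1}}{e^x - 1} < x^{s-1} e^{-x/2} \]
for all $x \ge 1$. Hence by Examples 3.2.4, 3.2.7 and Theorem 3.2.6, it follows the improper Riemann integrability of $f := ((0, \infty) \to \mathbb{R}, x \mapsto x^{s-1}/(e^x - 1))$. Further, define for $k \in \mathbb{N}$
\[ f_k(x) := x^{s-1} e^{-(k+1) x} \]
for every $x \in (0, \infty)$. Then it follows for $\varepsilon, R \in \mathbb{R}$ such that $0 < \varepsilon < R$ and any $x \in [\varepsilon, R]$
\[ |f_k(x)| \le R^{s-1} e^{-(k+1) \varepsilon} . \]
Hence by Theorem 3.4.8, it follows the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[\varepsilon, R]} \]
for $n \to \infty$ to $f|_{[\varepsilon, R]}$ as well as the absolute summability of $f_0(x), f_1(x), \dots$ for all $x \in (0, \infty)$. Hence it follows by Theorem 3.4.5 that
\[ \int_{\varepsilon}^{R} \frac{x^{s-1}}{e^x - 1} \, dx = \sum_{k=0}^{\infty} \int_{\varepsilon}^{R} x^{s-1} e^{-(k+1) x} \, dx = \sum_{k=1}^{\infty} \frac{1}{k^s} \int_{k \varepsilon}^{k R} y^{s-1} e^{-y} \, dy \le \Gamma(s) \, \zeta(s) \]
and hence that
\[ \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \le \Gamma(s) \, \zeta(s) . \]
On the other hand,
\[ \sum_{k=1}^{n} \frac{1}{k^s} \int_{k \varepsilon}^{k R} y^{s-1} e^{-y} \, dy \le \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \]
for every $n \in \mathbb{N}^*$ and hence
\[ \Gamma(s) \sum_{k=1}^{n} \frac{1}{k^s} \le \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \]
and, finally,
\[ \Gamma(s) \, \zeta(s) \le \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx . \]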
We note that (3.4.5) gives that
\[ \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \frac{1}{\Gamma(2)} \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \zeta(2) . \]
The first integral has applications in statistical mechanics / quantum field theory. Therefore, the knowledge of the value of $\zeta(2)$ is useful. Indeed, there are quite a number of elementary proofs for the well-known fact that
\[ \zeta(2) = \frac{\pi^2}{6} . \tag{3.4.6} \]
We use for this the approach from [44]. From this result, we conclude that
\[ \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \frac{\pi^2}{6} . \]
More generally, a representation similar to (3.4.6) is also known for $\zeta(2n)$ for $n \in \mathbb{N}^* \setminus \{1\}$.
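The identity between the integral, $\zeta(2)$ and $\pi^2/6$ can be checked numerically. The sketch below makes several assumptions that are not in the text: the improper integral is truncated at the hypothetical cutoff $60$ (the integrand decays like $x e^{-x}$), a plain midpoint rule is used, and $\zeta(2)$ is replaced by a long partial sum.

```python
import math

def integrand(x):
    """x / (e**x - 1), extended continuously by the value 1 at x = 0."""
    return x / math.expm1(x) if x != 0.0 else 1.0

def midpoint_rule(func, a, b, steps):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

integral = midpoint_rule(integrand, 0.0, 60.0, 200_000)
zeta2_partial = sum(1.0 / k**2 for k in range(1, 200_001))

print(integral, zeta2_partial, math.pi**2 / 6)  # three close values
```

The partial sum of $\zeta(2)$ converges slowly (its error is of the order of the reciprocal of the number of terms), which is why it agrees with the other two values only to about five digits.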
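Example 3.4.12. There is a fairly elementary way to show that
\[ \zeta(2) = \frac{\pi^2}{6} . \tag{3.4.7} \]
For this, we define for every $n \in \mathbb{N}^*$ a corresponding $S_n : \mathbb{R} \to \mathbb{R}$ by
\[ S_n(x) := \frac{1}{2} + \sum_{k=1}^{n} \cos(kx) \]
for every $x \in \mathbb{R}$. In a first step, it follows that
\[ S_n(x) = \frac{\sin[(2n+1) \, x/2]}{2 \sin(x/2)} \]
for all $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$ and every $n \in \mathbb{N}^*$. The proof proceeds by induction on $n \in \mathbb{N}^*$. First, it follows by the addition theorem for the sine function that
\[ 2 \sin\!\left(\frac{x}{2}\right) S_1(x) = 2 \sin\!\left(\frac{x}{2}\right) \left[ \frac{1}{2} + \cos(x) \right] = \sin\!\left(\frac{x}{2}\right) + 2 \sin\!\left(\frac{x}{2}\right) \cos(x) \]
\[ = \sin(x) \cos\!\left(\frac{x}{2}\right) - \cos(x) \sin\!\left(\frac{x}{2}\right) + 2 \sin\!\left(\frac{x}{2}\right) \cos(x) = \sin\!\left(\frac{x}{2}\right) \cos(x) + \cos\!\left(\frac{x}{2}\right) \sin(x) = \sin\!\left(\frac{3x}{2}\right) \]
for every $x \in \mathbb{R}$ and hence the validity of the statement in the case $n = 1$. Further, if the statement is true for some $n \in \mathbb{N}^*$, then it follows by the addition theorem for the sine function that
\[ 2 \sin\!\left(\frac{x}{2}\right) S_{n+1}(x) = 2 \sin\!\left(\frac{x}{2}\right) S_n(x) + 2 \sin\!\left(\frac{x}{2}\right) \cos[(n+1) x] = \sin\!\left(\frac{(2n+1) \, x}{2}\right) + 2 \sin\!\left(\frac{x}{2}\right) \cos[(n+1) x] \]
\[ = \cos\!\left(\frac{x}{2}\right) \sin[(n+1) x] - \sin\!\left(\frac{x}{2}\right) \cos[(n+1) x] + 2 \sin\!\left(\frac{x}{2}\right) \cos[(n+1) x] \]
\[ = \sin\!\left(\frac{x}{2}\right) \cos[(n+1) x] + \cos\!\left(\frac{x}{2}\right) \sin[(n+1) x] = \sin\!\left(\frac{(2n+3) \, x}{2}\right) \]
for every $x \in \mathbb{R}$ and hence the validity of the statement where $n$ is replaced by $n + 1$. Hence the statement holds for all $n \in \mathbb{N}^*$. In the following let $n \in \mathbb{N}^*$. Then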
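\[ \int_0^{\pi} x \, S_n(x) \, dx = \int_0^{\pi} \frac{x}{2} \, dx + \sum_{k=1}^{n} \int_0^{\pi} x \cos(kx) \, dx = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left\{ \left[ \frac{x \sin(kx)}{k} \right]_0^{\pi} - \frac{1}{k} \int_0^{\pi} \sin(kx) \, dx \right\} \]
\[ = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left[ \frac{\cos(kx)}{k^2} \right]_0^{\pi} = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left[ \frac{(-1)^k}{k^2} - \frac{1}{k^2} \right] . \]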
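Further, by integration by parts, it follows that
\[ \int_0^{\pi} x \, S_n(x) \, dx = \int_0^{\pi} x \, \frac{\sin[(2n+1) \, x/2]}{2 \sin(x/2)} \, dx = \int_0^{\pi} \frac{x/2}{\sin(x/2)} \, \sin[(2n+1) \, x/2] \, dx \]
\[ = \left[ - \frac{2}{2n+1} \cos[(2n+1) \, x/2] \, \frac{x/2}{\sin(x/2)} \right]_0^{\pi} + \frac{1}{2n+1} \int_0^{\pi} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \, \cos[(2n+1) \, x/2] \, dx . \]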
Note that in this derivation it has been used that
\[ \lim_{x \to 0} \frac{x}{\sin(x)} = 1 , \qquad \lim_{x \to 0} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} = 0 \]
which follows from an application of L’Hospital’s theorem, Theorem 2.5.38. As a consequence, the function $((0, 2\pi) \to \mathbb{R}, x \mapsto (x/2)/\sin(x/2))$ and its derivative have uniquely determined extensions to continuous functions on $[0, 2\pi)$. Further, by using the last fact, it follows that
\[ \left| \int_0^{\pi} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \, \cos[(2n+1) \, x/2] \, dx \right| \le \int_0^{\pi} \left| \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \right| dx \]
and hence that
\[ \lim_{n \to \infty} \int_0^{\pi} x \, S_n(x) \, dx = 0 . \]
Hence it follows that
\[ \frac{\pi^2}{4} + \sum_{k=1}^{\infty} \left[ \frac{(-1)^k}{k^2} - \frac{1}{k^2} \right] = 0 . \]
The last implies that
\[ \zeta(2) = \frac{\pi^2}{4} + \sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} = \frac{\pi^2}{4} - \zeta(2) + 2 \sum_{k=1}^{\infty} \frac{1}{(2k)^2} = \frac{\pi^2}{4} - \frac{\zeta(2)}{2} \]
and hence (3.4.7).
Cauchy, in his textbook ‘Cours d’analyse’ from 1821 [22], was the first to give nearly modern definitions of the continuity and differentiability of functions based on limits. Still, his understanding of limits was different from the modern understanding. During the early 19th century, it resulted in the general belief that every continuous function is everywhere differentiable, except perhaps at finitely many points. Even several ‘proofs’ of this ‘fact’ appeared during that time. One such ‘proof’ is due to André-Marie Ampère. Therefore, it came as a shock when in 1872 [99] Weierstrass proved the existence of a continuous function which is nowhere differentiable. For the first time, this result signaled the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.
As an application of uniform convergence of series of functions, the following gives such an example of a continuous function which is nowhere differentiable. It differs from Weierstrass’ original example, but the construction
of the function and the subsequent reasoning are analogous. Weierstrass’
key idea is the construction of a continuous function f which is highly oscillating in the neighborhood of every point x of its domain in such a way
that for every $M > 0$ and in every neighborhood of $x$, there is $\bar{x} \in D(f)$ such that the corresponding absolute value of the slope of the secant between $(x, f(x))$ and $(\bar{x}, f(\bar{x}))$ satisfies
\[ \left| \frac{f(\bar{x}) - f(x)}{\bar{x} - x} \right| \ge M . \]
Hence, there is a sequence $x_1, x_2, \dots$ in $D(f) \setminus \{x\}$ which is convergent to $x$ and such that the corresponding sequence
\[ \left| \frac{f(x_1) - f(x)}{x_1 - x} \right| , \; \left| \frac{f(x_2) - f(x)}{x_2 - x} \right| , \; \dots \]
is unbounded. As a consequence, the sequence
\[ \frac{f(x_1) - f(x)}{x_1 - x} , \; \frac{f(x_2) - f(x)}{x_2 - x} , \; \dots \]
cannot be convergent and hence f is not differentiable in x. It is this key
idea, which signals for the first time the complete mastery of the concepts of
derivative and limit which is characteristic for modern calculus / analysis.
Fig. 100: Graph of the auxiliary function $h$ in Example 3.4.13.
Further, Weierstrass’ method of construction is suitable for the construction
of a whole class of continuous functions that are nowhere differentiable.
Hence, it cannot be said that such examples are in any sense isolated or
pathological. The method supports more the view that such functions are
generic.
Example 3.4.13. (A continuous nowhere differentiable function) In the first step, we define an auxiliary function $h : \mathbb{R} \to \mathbb{R}$ by
\[ h(x + 2k) := |x| \]
for all $-1 \le x < 1$ and $k \in \mathbb{Z}$. This implies that
\[ |h(x) - h(y)| \le |x - y| \]
for all $x, y \in \mathbb{R}$. For the proof, let $x, y \in \mathbb{R}$ and $n \in \mathbb{Z}$ such that $n \le \min\{x, y\}$. Then
\[ h(v) = h(n) + \int_n^v g(u) \, du \]
for all $v \in \mathbb{R}$ where $g : \mathbb{R} \to \mathbb{R}$ is defined by
\[ g(y + 2k) := \begin{cases} -1 & \text{if } -1 \le y < 0 \\ 1 & \text{if } 0 \le y < 1 \end{cases} \]
for all $-1 \le y < 1$ and $k \in \mathbb{Z}$. Hence if $x \le y$
\[ |h(x) - h(y)| = \left| \int_x^y g(u) \, du \right| \le \int_x^y |g(u)| \, du = y - x = |x - y| \]
and if $y \le x$
\[ |h(x) - h(y)| = \left| \int_y^x g(u) \, du \right| \le \int_y^x |g(u)| \, du = x - y = |x - y| . \]
Fig. 101: Graphs of $(\mathbb{R} \to \mathbb{R}, x \mapsto \sum_{k=1}^{n} (3/4)^k h(4^k x))$ for $n = 1, 2, 3$ and $10$. Compare Example 3.4.13.
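In the next step, we define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n h(4^n x) \]
for all $x \in \mathbb{R}$. Since
\[ \left| \left( \frac{3}{4} \right)^n h(4^n x) \right| \le \left( \frac{3}{4} \right)^n \]
for all $x \in \mathbb{R}$, the summability of the sequence $(3/4), (3/4)^2, \dots$ and the continuity of $(\mathbb{R} \to \mathbb{R}, x \mapsto (3/4)^n h(4^n x))$ for every $n \in \mathbb{N}$, it follows by Theorems 3.4.4, 3.4.8 that $f$ is continuous. In the following, we show that $f$ is nowhere differentiable. For this, let $x \in \mathbb{R}$ and $m \in \mathbb{N}^*$. Define
\[ \delta_m := \begin{cases} 4^{-m}/2 & \text{if } (4^m x, \, 4^m x + (1/2)) \text{ contains no integer} \\ -4^{-m}/2 & \text{if } (4^m x - (1/2), \, 4^m x) \text{ contains no integer} \end{cases} \]
and
\[ \gamma_{mn} := \frac{h(4^n (x + \delta_m)) - h(4^n x)}{\delta_m} \]
for every $n \in \mathbb{N}$. When $n > m$, $4^n |\delta_m| = 4^{n-m}/2$ is an even integer which implies that $\gamma_{mn} = 0$. When $n \le m$, it follows that
\[ |\gamma_{mn}| = \frac{|h(4^n (x + \delta_m)) - h(4^n x)|}{|\delta_m|} \le \frac{|4^n (x + \delta_m) - 4^n x|}{|\delta_m|} = 4^n . \]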
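We conclude that
\[ \left| \frac{f(x + \delta_m) - f(x)}{\delta_m} \right| = \left| \frac{\sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n h(4^n (x + \delta_m)) - \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n h(4^n x)}{\delta_m} \right| = \left| \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n \gamma_{mn} \right| = \left| \sum_{n=0}^{m} \left( \frac{3}{4} \right)^n \gamma_{mn} \right| \]
\[ \ge \left( \frac{3}{4} \right)^m |\gamma_{mm}| - \sum_{n=0}^{m-1} \left( \frac{3}{4} \right)^n |\gamma_{mn}| \ge \left( \frac{3}{4} \right)^m 4^m - \sum_{n=0}^{m-1} \left( \frac{3}{4} \right)^n 4^n = 3^m - \sum_{n=0}^{m-1} 3^n = \frac{1}{2} \left( 3^m + 1 \right) . \]
Here it has been used that $|\gamma_{mm}| = 4^m$, since by the choice of $\delta_m$ there is no integer between $4^m x$ and $4^m (x + \delta_m)$, so that $h$ has constant slope $\pm 1$ between these two points. Hence the sequence
\[ \frac{f(x + \delta_1) - f(x)}{\delta_1} , \; \frac{f(x + \delta_2) - f(x)}{\delta_2} , \; \dots \]
is unbounded and hence not convergent, but
\[ \lim_{m \to \infty} \delta_m = 0 . \]
Therefore, f is not differentiable in x.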
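The growth of the difference quotients in Example 3.4.13 can be reproduced exactly with rational arithmetic. The following is an illustrative sketch: the function names and the sample point $x = 1/3$ are hypothetical, and the sum is cut off at $n = m$, which loses nothing because the members with $n > m$ cancel exactly in the difference $f(x + \delta_m) - f(x)$.

```python
from fractions import Fraction
from math import floor

def h(t):
    """2-periodic function with h(t) = |t| on [-1, 1)."""
    return abs((t + 1) % 2 - 1)

def delta(x, m):
    """Pick delta_m = +-4**(-m)/2 such that no integer lies strictly
    between 4**m * x and 4**m * (x + delta_m)."""
    a = 4**m * x
    half = Fraction(1, 2)
    next_integer = floor(a) + 1       # smallest integer strictly greater than a
    if next_integer >= a + half:      # (a, a + 1/2) contains no integer
        return half / 4**m
    return -half / 4**m               # then (a - 1/2, a) contains no integer

def difference_quotient(x, m):
    """(f(x + delta_m) - f(x)) / delta_m, computed exactly."""
    d = delta(x, m)
    diff = sum(Fraction(3, 4) ** n * (h(4**n * (x + d)) - h(4**n * x))
               for n in range(m + 1))
    return diff / d

x = Fraction(1, 3)  # hypothetical sample point
for m in range(1, 9):
    q = difference_quotient(x, m)
    print(m, q, Fraction(3**m + 1, 2))  # |q| is at least (3**m + 1) / 2
```

Using `Fraction` instead of floating point numbers avoids rounding problems in the arguments $4^n x$, which grow quickly with $n$.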
As another application of uniform convergence of series of functions, we construct a continuous plane-filling curve, i.e., continuous functions $f_1$ and $f_2$ from $[0,1]$ to $[0,1]$ such that the corresponding map $f : [0,1] \to [0,1]^2$, defined by $f(t) := (f_1(t), f_2(t))$ for every $t \in [0,1]$, is surjective. The first construction of such a curve, by Peano in 1890, shocked the mathematical community. The map $f$ can be viewed as a parametrization of its range. Since we experience the domain of $f$ as ‘one-dimensional’, intuition expects the range of $f$ to be ‘one-dimensional’ as well, i.e., to be a ‘curve’. But in this special case that curve is the ‘two-dimensional’ interval $[0,1]^2$. In addition, $f$ is continuous in the sense that the corresponding projections $f_1, f_2$ on the coordinate axes are continuous. This seems to contradict common sense. Also here, the method of construction is sufficiently general to exclude that the result could be called isolated or pathological.
Fig. 102: Curve from Example 3.4.14 resulting from truncating the sums after $k = 4$.
Fig. 103: Graph of truncated $f_1$ from Example 3.4.14 corresponding to truncation of the sum after $k = 1, 2, 3$ and $4$.

Example 3.4.14. (A plane-filling curve) Let $h : [0, 2] \to \mathbb{R}$ be some
continuous function such that $h(t) = 0$ for all $t \in [0, 1/3] \cup [5/3, 2]$ and
$h(t) = 1$ for all $t \in [2/3, 4/3]$. We consider the $2$-periodic continuous
extension of this function to the whole of $\mathbb{R}$, which will also be denoted by
$h$. Then we define $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ by
$$f_1(t) := \sum_{k=1}^{\infty} \frac{h(3^{2k-2} t)}{2^k} , \qquad f_2(t) := \sum_{k=1}^{\infty} \frac{h(3^{2k-1} t)}{2^k}$$
for all $t \in \mathbb{R}$. By Theorem 3.4.8, it follows that both series converge pointwise absolutely as well as uniformly on $\mathbb{R}$ and hence by Theorem 3.4.4 that
$f_1$ and $f_2$ are continuous. In the following, we show that $f([0, 1]) \supset [0, 1]^2$
where $f : \mathbb{R} \to \mathbb{R}^2$ is the continuous curve defined by $f(t) := (f_1(t), f_2(t))$
for all $t \in \mathbb{R}$. For this, let $x, y \in [0, 1]$ and
$$x = \sum_{k=1}^{\infty} \frac{x_k}{2^k} , \qquad y = \sum_{k=1}^{\infty} \frac{y_k}{2^k}$$
be their binary representations where $x_1, x_2, \dots$ and $y_1, y_2, \dots$ are in $\{0, 1\}$. Then
we define $t \in [0, 1]$ by
$$t := 2 \sum_{k=1}^{\infty} \frac{t_k}{3^k}$$
where $t_{2k-1} := x_k$ and $t_{2k} := y_k$ for all $k \in \mathbb{N}^*$. Then it follows for every
$n \in \mathbb{N}$ that
$$h(3^n t) = h\!\left( 2 \sum_{k=1}^{\infty} \frac{t_k}{3^{k-n}} \right) = h\!\left( 2 \sum_{k=n+1}^{\infty} \frac{t_k}{3^{k-n}} \right) = h\!\left( 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^{k}} \right) .$$
In case that $t_{n+1} = 0$,
$$0 \le 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^k} \le 2 \sum_{k=2}^{\infty} \frac{1}{3^k} = \frac{1}{3}$$
and hence $h(3^n t) = 0 = t_{n+1}$, and in case that $t_{n+1} = 1$,
$$\frac{2}{3} \le 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^k} \le 2 \sum_{k=1}^{\infty} \frac{1}{3^k} = 1$$
and hence $h(3^n t) = 1 = t_{n+1}$. As a consequence,
$$f_1(t) = \sum_{k=1}^{\infty} \frac{h(3^{2k-2} t)}{2^k} = \sum_{k=1}^{\infty} \frac{t_{2k-1}}{2^k} = \sum_{k=1}^{\infty} \frac{x_k}{2^k} = x ,$$
$$f_2(t) = \sum_{k=1}^{\infty} \frac{h(3^{2k-1} t)}{2^k} = \sum_{k=1}^{\infty} \frac{t_{2k}}{2^k} = \sum_{k=1}^{\infty} \frac{y_k}{2^k} = y .$$
Note in particular that $f(0) = (0, 0)$ and $f(1) = (1, 1)$.
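The construction of the preimage $t$ from the binary digits of $x$ and $y$ can be traced in a few lines of Python. This is a minimal sketch under assumptions: the example only requires $h$ to be continuous with the stated values, so the piecewise linear `h` below is one admissible choice, and the helper names `preimage`, `from_bits` and `f` are hypothetical. For points with finitely many binary digits, all sums are finite and exact in rational arithmetic.

```python
from fractions import Fraction

def h(t):
    """One admissible h (an assumption): piecewise linear and 2-periodic."""
    t = Fraction(t) % 2                      # reduce to [0, 2)
    if t <= Fraction(1, 3) or t >= Fraction(5, 3):
        return Fraction(0)
    if Fraction(2, 3) <= t <= Fraction(4, 3):
        return Fraction(1)
    if t < Fraction(2, 3):
        return 3 * t - 1                     # rises from 0 to 1 on [1/3, 2/3]
    return 5 - 3 * t                         # falls from 1 to 0 on [4/3, 5/3]

def from_bits(bits):
    """Value of the binary expansion 0.b1 b2 b3 ... (finitely many digits)."""
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(bits))

def preimage(x_bits, y_bits):
    """t = 2 * sum t_k / 3^k with t_(2k-1) = x_k and t_(2k) = y_k."""
    digits = [bit for pair in zip(x_bits, y_bits) for bit in pair]
    return 2 * sum(Fraction(d, 3 ** (k + 1)) for k, d in enumerate(digits))

def f(t, terms):
    """Truncations of f1 and f2 after the given number of terms."""
    f1 = sum(h(3 ** (2 * k - 2) * t) / 2 ** k for k in range(1, terms + 1))
    f2 = sum(h(3 ** (2 * k - 1) * t) / 2 ** k for k in range(1, terms + 1))
    return f1, f2

x_bits, y_bits = [1, 0, 1, 1], [0, 1, 1, 0]
t = preimage(x_bits, y_bits)
print(t, f(t, 4), (from_bits(x_bits), from_bits(y_bits)))
```

Since all digits of $t$ beyond the given ones vanish, the terms of both series beyond `terms = 4` are zero here, so the truncated sums already reproduce $(x, y)$.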
Fig. 104: Graph of truncated $f_2$ from Example 3.4.14 corresponding to truncation of the sum after $k = 1, 2, 3$ and $4$.

In the remainder of this section, we study power series, which are sequences
of polynomials $p_0, p_1, \dots$ that are associated to a sequence of coefficients
$a_0, a_1, \dots$ of real numbers and an ‘expansion point’ $x_0 \in \mathbb{R}$, and are defined by
$$p_n(x) := \sum_{k=0}^{n} a_k (x - x_0)^k$$
for every $x \in \mathbb{R}$ and $n \in \mathbb{N}$ where $(x - x_0)^0 := 1$ for every $x \in \mathbb{R}$. Particular
examples are the Taylor polynomials corresponding to a function $f$ defined
as well as infinitely often differentiable on a non-trivial open interval $I$.
According to Taylor’s theorem, Theorem 2.5.25, for $x_0 \in I$, $x \in I$ and
$n \in \mathbb{N}$, there is $\xi_n$ in the closed interval between $x_0$ and $x$ such that
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + \frac{f^{(n+1)}(\xi_n)}{(n+1)!} (x - x_0)^{n+1}$$
where $f^{(0)} := f$ and $(x - x_0)^0 := 1$. The Taylor series of $f$ around $x_0$ is
defined as the power series of Taylor polynomials $p_0, p_1, \dots$ corresponding
to the coefficients $f^{(k)}(x_0)/k!$, $k \in \mathbb{N}$, and the expansion point $x_0$ where
$$p_n(x) := \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k$$
for every $x \in \mathbb{R}$ and $n \in \mathbb{N}$.
First, we investigate for which values of $x$ a given power series converges.
If the sequence of coefficients has only finitely many non-zero members, the
series trivially converges for every $x \in \mathbb{R}$, so this case is not considered
further. The following lemma shows that a sequence of coefficients that grows
too fast leads to a power series that converges only at the expansion point.
In such a case, we say that the series has convergence radius $0$.
Lemma 3.4.15. Let $a_0, a_1, \dots$ be a sequence of real numbers such that the
set
$$\{ |a_1|, |a_2|^{1/2}, |a_3|^{1/3}, \dots \}$$
is unbounded. Then the sequence $a_0, a_1 x, a_2 x^2, \dots$ is not summable for
any non-zero real $x$.
Proof. For this, let $x$ be some non-zero real number. Then also
$$\{ |a_1 x|, |a_2 x^2|^{1/2}, |a_3 x^3|^{1/3}, \dots \}$$
is unbounded and hence there exists for every $N \in \mathbb{N}$ some $n \in \mathbb{N}^*$ such that
$$|a_n x^n| \ge N^n .$$
Hence the sequence $a_n x^n$, $n \in \mathbb{N}$, does not converge to zero and therefore
is also not summable by Corollary 3.3.29.
The following fundamental theorem gives important insight into the convergence properties of power series whose coefficients satisfy a certain growth
condition. By application of the root test, Theorem 3.3.33, it shows that
such a series converges on a symmetric open interval $(x_0 - r, x_0 + r)$ around
the expansion point $x_0$ where $r > 0$ is the so called radius of convergence
and is defined in terms of the coefficients. That radius $r$ is maximal since
the power series diverges for all $x$ in the complement of the closed interval
$[x_0 - r, x_0 + r]$. Further, the series converges uniformly on every closed
interval of $\mathbb{R}$ that is contained in $(x_0 - r, x_0 + r)$. Finally, the power series
obtained from the given series by termwise differentiation has the same radius of convergence as the original series.
Theorem 3.4.16. (Power series) Let $x_0 \in \mathbb{R}$, $a_0, a_1, \dots$ be a sequence of
real numbers which contains infinitely many non-zero members and is such
that the set
$$\{ |a_1|, |a_2|^{1/2}, |a_3|^{1/3}, \dots \}$$
is bounded from above. Finally, let $N \in \mathbb{N}^*$ and
$$r_N := \left[ \sup \left\{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \right\} \right]^{-1} .$$
(i) Then the sequence $a_0, a_1 (x - x_0), a_2 (x - x_0)^2, \dots$ is absolutely
summable for every $x \in (x_0 - r_N, x_0 + r_N)$, and the series of polynomials defined by
$$S_n(x) := \sum_{k=0}^{n} a_k (x - x_0)^k$$
for every $x \in \mathbb{R}$ and $n \in \mathbb{N}$ is uniformly convergent on $[x_0 - r', x_0 + r']$ for
every $r'$ satisfying $0 \le r' < r_N$.
(ii) The number
$$r := \left[ \lim_{N \to \infty} \sup \left\{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \right\} \right]^{-1} ,$$
where we set $1/0 := \infty$, is called the radius of convergence of the
‘power series’ $S_0, S_1, \dots$. For any $x \in \mathbb{R}$ such that $|x - x_0| > r$,
$S_0(x), S_1(x), \dots$ is divergent.
(iii) The power series $S_0, S_1, \dots$ and $S_0', S_1', \dots$ have the same radius of
convergence.
Proof. ‘(i)’: First, it follows by the definition of $r_N$ that
$$|a_n|^{1/n} \le \frac{1}{r_N}$$
for all $n \in \mathbb{N}$ satisfying $n \ge N$. Further let $x \in (x_0 - r_N, x_0 + r_N)$, then
$$|a_n (x - x_0)^n|^{1/n} = |a_n|^{1/n} |x - x_0| \le \frac{|x - x_0|}{r_N} < 1$$
for $n \in \mathbb{N}$ satisfying $n \ge N$, and hence it follows by Theorem 3.3.33 that
the sequence $a_0, a_1 (x - x_0), a_2 (x - x_0)^2, \dots$ is absolutely summable.
Further, let $r'$ be such that $0 \le r' < r_N$. Then it follows for every $k \in \{N, N+1, \dots\}$
and every $x \in [x_0 - r', x_0 + r']$ that
$$|a_k (x - x_0)^k| \le |a_k|\, r'^{\,k} \le \left( \frac{r'}{r_N} \right)^{k}$$
and hence by Theorem 3.4.8 the uniform convergence of $S_0, S_1, \dots$ on
$[x_0 - r', x_0 + r']$. ‘(ii)’: First, it follows that $r$ is well-defined
since $1/r_1, 1/r_2, \dots$ is decreasing and bounded from below by $0$ and hence
convergent to some non-negative real number. In the case that this number is
different from zero, it follows that
$$\lim_{N \to \infty} \sup \left\{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \right\} = \frac{1}{r} .$$
Hence, if $x \in \mathbb{R}$ is such that $|x - x_0| > r$, there is an infinite number of
$n \in \mathbb{N}$ such that
$$|a_n|^{1/n} > \frac{1}{|x - x_0|}$$
and hence the sequence $a_n (x - x_0)^n$, $n \in \mathbb{N}$, does not converge to zero and
therefore is also not summable by Corollary 3.3.29. ‘(iii)’: First, we note
that the series $S_0', S_1', \dots$ is convergent for some $x \in \mathbb{R}$ if and only if $(\mathrm{id}_{\mathbb{R}} - x_0)\, S_0', (\mathrm{id}_{\mathbb{R}} - x_0)\, S_1', \dots$
is convergent in $x$ and hence that both power series
have the same radius of convergence. Further for $k \in \{3, 4, \dots\}$, it follows
by (2.5.12) that
$$|a_k|^{1/k} \le e^{\ln(k)/k}\, |a_k|^{1/k} = (k\, |a_k|)^{1/k} \le e\, |a_k|^{1/k}$$
and that $\exp(\ln(3)/3), \exp(\ln(4)/4), \dots$ is decreasing and convergent to
$1$. Hence it follows that the convergence radii of $(\mathrm{id}_{\mathbb{R}} - x_0)\, S_0', (\mathrm{id}_{\mathbb{R}} - x_0)\, S_1', \dots$ and $S_0, S_1, \dots$ are the same.
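The quantities $r_N$ from Theorem 3.4.16 can be approximated numerically. The sketch below is an illustration only: the supremum over the infinite tail is replaced by a maximum over a finite window, which is an assumption that is harmless only when $|a_n|^{1/n}$ is eventually monotone or constant, and the names `nth_root_abs` and `r_N` are hypothetical. Roots are computed through logarithms so that large integer coefficients do not overflow.

```python
import math

def nth_root_abs(a, n):
    """|a|^(1/n), computed via logarithms; returns 0.0 for a == 0."""
    if a == 0:
        return 0.0
    return math.exp(math.log(abs(a)) / n)

def r_N(coeff, N, window=200):
    """Approximate r_N = 1 / sup{|a_n|^(1/n) : n >= N} using n in [N, N + window).

    Assumes at least one coefficient in the window is non-zero.
    """
    s = max(nth_root_abs(coeff(n), n) for n in range(N, N + window))
    return 1.0 / s

# a_n = 2^n: every r_N equals 1/2, so the radius of convergence is 1/2.
print(r_N(lambda n: 2 ** n, 1))
# a_n = n: r_N increases towards the radius of convergence 1.
print(r_N(lambda n: n, 50), r_N(lambda n: n, 500))
```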
By application of Theorems 3.4.6, 3.4.5, we immediately conclude the following important corollary.
Corollary 3.4.17. Let $x_0$, $a_0, a_1, \dots$ and $r$ be as in Theorem 3.4.16. Then
$f : (x_0 - r, x_0 + r) \to \mathbb{R}$ defined by
$$f(x) := \sum_{k=0}^{\infty} a_k (x - x_0)^k$$
for all $x \in (x_0 - r, x_0 + r)$ is infinitely often differentiable with derivative
$$f^{(n)}(x) = \sum_{k=n}^{\infty} \frac{k!}{(k-n)!}\, a_k (x - x_0)^{k-n}$$
and in particular
$$a_n = \frac{f^{(n)}(x_0)}{n!}$$
for every $n \in \mathbb{N}$ and $x \in (x_0 - r, x_0 + r)$. Further for every $a, b \in \mathbb{R}$ such
that $a \le b$ and $[a, b] \subset (x_0 - r, x_0 + r)$,
$$\int_a^b f(x)\, dx = \sum_{k=0}^{\infty} a_k \int_a^b (x - x_0)^k\, dx = \sum_{k=0}^{\infty} \frac{a_k}{k+1} \left[ (b - x_0)^{k+1} - (a - x_0)^{k+1} \right] .$$
Proof. The statement is a simple consequence of Theorems 3.4.16, 3.4.6
and 3.4.5.
There remains the question of the convergence of a power series with
convergence radius $r > 0$ and expansion point $x_0$ at the points $x_0 - r$ and
$x_0 + r$. From examples, we will see later that the series can be divergent at
both points, convergent at one of them or convergent at both. On the other
hand, if the series is convergent at such a point, then the sum of the series
as a function of $x \in (x_0 - r, x_0 + r)$ can be extended to a continuous function
on the interval resulting from $(x_0 - r, x_0 + r)$ by addition of that point.
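Theorem 3.4.18. (Abel’s theorem) Let $x_0 \in \mathbb{R}$, $a_0, a_1, \dots$, $r$ be as in Theorem 3.4.16 and $f : (x_0 - r, x_0 + r) \to \mathbb{R}$ be defined by
$$f(x) := \sum_{k=0}^{\infty} a_k (x - x_0)^k$$
for all $x \in (x_0 - r, x_0 + r)$. Further, let $a_0 r^0, a_1 r^1, \dots$ be summable. Then
$$\lim_{x \to x_0 + r} f(x) = \sum_{k=0}^{\infty} a_k r^k .$$
Proof. For this, let $S_{-1}, S_0, S_1, \dots$ be the sequence of partial sums of $a_0 r^0$,
$a_1 r^1, \dots$, where $S_{-1} := 0$, and
$$S := \sum_{k=0}^{\infty} a_k r^k .$$
Then it follows that
$$\sum_{k=0}^{n} a_k (x - x_0)^k = \sum_{k=0}^{n} a_k r^k \left( \frac{x - x_0}{r} \right)^{k} = \sum_{k=0}^{n} (S_k - S_{k-1}) \left( \frac{x - x_0}{r} \right)^{k}$$
$$= S_n \left( \frac{x - x_0}{r} \right)^{n} + \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{n-1} S_k \left( \frac{x - x_0}{r} \right)^{k}$$
for every $x \in (x_0 - r, x_0 + r)$, $n \in \mathbb{N}$ and hence by Theorem 3.3.9
$$f(x) = \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} S_k \left( \frac{x - x_0}{r} \right)^{k} .$$
Further if $M > 0$ is some upper bound for $|S_0|, |S_1|, \dots$, it follows for given
$\varepsilon > 0$, $n_0 \in \mathbb{N}^*$ such that $|S_n - S| \le \varepsilon/2$ for all $n \in \{n_0, n_0 + 1, \dots\}$ and
$x \in (x_0 - r, x_0 + r)$ satisfying
$$x_0 + r - x < \min \left\{ r, \frac{r\, \varepsilon}{2 n_0 (M + |S|)} \right\}$$
that
$$|f(x) - S| = \left| \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} (S_k - S) \left( \frac{x - x_0}{r} \right)^{k} \right| \le \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} |S_k - S| \left( \frac{|x - x_0|}{r} \right)^{k}$$
$$\le n_0 (M + |S|) \left( 1 - \frac{x - x_0}{r} \right) + \frac{\varepsilon}{2} \le \varepsilon .$$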
Example 3.4.19. By Examples 3.3.19, 3.4.9 and the previous theorem, it
follows that the sum of the alternating harmonic series is given by
$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n+1} = \ln(2) .$$

Fig. 105: Partial sums of the alternating harmonic series and graph of the constant function of value $\ln(2)$.
Example 3.4.20. As another application of Theorem 3.4.18, we prove Leibniz’s result that
$$\frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n+1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots .$$
We use as basis that
$$\frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2}\, du$$
which was shown in the introduction to this section. Since
$$\frac{1}{1 + u^2} = \frac{1}{1 - (-u^2)} = \sum_{k=0}^{\infty} (-u^2)^k = \sum_{k=0}^{\infty} (-1)^k u^{2k}$$
for every $u \in \mathbb{R}$ satisfying $|u| < 1$, it follows that
$$\frac{u^2}{1 + u^2} = \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} = \sum_{l=1}^{\infty} (-1)^{l-1} u^{2l} \tag{3.4.8}$$
for every $u \in \mathbb{R}$ satisfying $|u| < 1$ where we use the usual convention
that $x^0 := 1$ for all $x \in \mathbb{R}$. Hence, taking into account that the sequence $0, 1, -1, 1, -1, \dots$ diverges, the power series corresponding to the
sequence $0, 1, -1, 1, -1, \dots$ has convergence radius $1$. Hence it follows by
Corollary 3.4.17 for every $0 \le x < 1$ that
$$1 - \int_0^x \frac{u^2}{1 + u^2}\, du = 1 - \int_0^x \sum_{l=1}^{\infty} (-1)^{l-1} u^{2l}\, du = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \int_0^x u^{2l}\, du$$
$$= 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \left[ \frac{u^{2l+1}}{2l+1} \right]_0^x = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \frac{x^{2l+1}}{2l+1} .$$
Further, by Dirichlet’s test, it follows that the sequence $0, 1/3, -1/5, 1/7, \dots$
is summable. Since the function that associates to every $x \in [0, 1]$ the value
$$1 - \int_0^x \frac{u^2}{1 + u^2}\, du$$
is in particular continuous, it follows by Theorem 3.4.18 that
$$1 - \int_0^1 \frac{u^2}{1 + u^2}\, du = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \frac{1}{2l+1}$$
and hence Leibniz’s result.
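Both series converge slowly, which a short numerical experiment makes visible. The following Python sketch is an illustration only; the function names are hypothetical. It uses the standard fact that for an alternating series with terms decreasing in absolute value, the error of a partial sum is at most the absolute value of the first omitted term.

```python
import math

def leibniz_partial(n):
    """Partial sum of (-1)^k / (2k + 1) for k = 0, ..., n."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n + 1))

def alternating_harmonic_partial(n):
    """Partial sum of (-1)^k / (k + 1) for k = 0, ..., n."""
    return sum((-1) ** k / (k + 1) for k in range(n + 1))

for n in (10, 100, 1000):
    # Each line shows the actual error next to the first omitted term.
    print(n,
          abs(leibniz_partial(n) - math.pi / 4), 1 / (2 * n + 3),
          abs(alternating_harmonic_partial(n) - math.log(2)), 1 / (n + 2))
```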
The following example gives a standard application of power series expansions to the solution of differential equations. Here it is assumed that the
solution can be expanded into a power series around some expansion point.
Usually, that expansion point is chosen to be a point where additional information on the solution is available. In the next step, the function in the
differential equation is replaced by the power series. As a result of a subsequent calculation, a power series is obtained whose sum, as a function of the
variable, vanishes at every point of its still unknown domain. As a consequence of Corollary 3.4.17, all coefficients of that power series need to
vanish. Usually, this leads to a recursion relation for the coefficients of the
power series for the solution. If this recursion relation can be solved, one
tries to determine the radius of convergence of the corresponding series. If
that radius is greater than zero, it follows that the obtained power series is
indeed a solution of the differential equation. Precisely in this way, the majority of special functions of mathematics have been found. The associated
differential equations had their roots in applications. This is also true for
the following differential equation, which is related to Bessel’s differential
equation by a simple transformation.

Fig. 106: Graphs of $J_0$, $J_1$ and $J_2$.
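Example 3.4.21. Let $\nu \in [0, \infty)$. Find a solution $f_\nu : \mathbb{R} \to \mathbb{R}$ of the
differential equation
$$x f_\nu''(x) + (2\nu + 1)\, f_\nu'(x) + x f_\nu(x) = 0 \tag{3.4.9}$$
for all $x \in \mathbb{R}$ and such that $f_\nu(0) = 1/\Gamma(\nu + 1)$. Solution: We assume that
$f_\nu$ has a representation as a power series around $0$
$$f_\nu(x) = \sum_{k=0}^{\infty} a_k x^k \tag{3.4.10}$$
for all $x \in (-r, r)$ where $a_0, a_1, \dots$ is some sequence of real numbers with
corresponding convergence radius $r > 0$ which are to be determined. Then
it follows by Corollary 3.4.17 that
$$0 = x f_\nu''(x) + (2\nu + 1)\, f_\nu'(x) + x f_\nu(x)$$
$$= \sum_{k=0}^{\infty} k (k - 1)\, a_k x^{k-1} + (2\nu + 1) \sum_{k=0}^{\infty} k\, a_k x^{k-1} + \sum_{k=0}^{\infty} a_k x^{k+1}$$
$$= \sum_{k=1}^{\infty} k (k + 2\nu)\, a_k x^{k-1} + \sum_{k=0}^{\infty} a_k x^{k+1}$$
$$= (2\nu + 1)\, a_1 + \sum_{k=0}^{\infty} \left[ (k + 2)(k + 2\nu + 2)\, a_{k+2} + a_k \right] x^{k+1}$$
which is satisfied for all $x \in (-r, r)$ if
$$a_0 = \frac{1}{\Gamma(\nu + 1)} , \qquad a_1 = 0 , \qquad a_{k+2} = - \frac{a_k}{(k + 2)(k + 2\nu + 2)}$$
for every $k \in \mathbb{N}$ or explicitly if
$$a_{2k} = \frac{(-1/4)^k}{k!\, \Gamma(\nu + k + 1)} , \qquad a_{2k+1} = 0$$
for all $k \in \mathbb{N}$. Since
$$\lim_{k \to \infty} \left| \frac{a_{2(k+1)}\, x^{2(k+1)}}{a_{2k}\, x^{2k}} \right| = \lim_{k \to \infty} \frac{x^2}{4 (k + 1)(k + \nu + 1)} = 0$$
for every $x \in \mathbb{R}$, it follows by Theorem 3.3.31 that the convergence radius
of the corresponding power series is infinite and hence that $f_\nu : \mathbb{R} \to \mathbb{R}$
defined by (3.4.10) has the required properties. In terms of $f_\nu$, the so called
Bessel function $J_\nu$ of the first kind and of order $\nu$ is given by
$$J_\nu(x) := \left( \frac{x}{2} \right)^{\nu} f_\nu(x) = \left( \frac{x}{2} \right)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\, \Gamma(\nu + k + 1)} \left( \frac{x^2}{4} \right)^{k} \tag{3.4.11}$$
for all $x \in (0, \infty)$. By (3.4.11), (3.4.9), it follows that $J_\nu$ satisfies the
differential relation
$$x^2 J_\nu''(x) + x J_\nu'(x) + (x^2 - \nu^2)\, J_\nu(x) = 0$$
for all $x \in (0, \infty)$.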
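The recursion for the coefficients can be run directly and compared with the explicit formula. The following Python sketch is an illustration under assumptions: the series is truncated after `count` coefficients, floating point arithmetic is used, and the function names are hypothetical. As an independent check it uses the well-known closed form $J_{1/2}(x) = \sqrt{2/(\pi x)}\, \sin(x)$.

```python
import math

def coefficients_by_recursion(nu, count):
    """a_0 = 1/Gamma(nu + 1), a_1 = 0, a_(k+2) = -a_k / ((k + 2)(k + 2 nu + 2)).

    Requires count >= 2.
    """
    a = [0.0] * count
    a[0] = 1.0 / math.gamma(nu + 1.0)
    for k in range(count - 2):
        a[k + 2] = -a[k] / ((k + 2) * (k + 2 * nu + 2))
    return a

def closed_form(nu, k):
    """Explicit value of a_(2k) = (-1/4)^k / (k! Gamma(nu + k + 1))."""
    return (-0.25) ** k / (math.factorial(k) * math.gamma(nu + k + 1.0))

def bessel_j(x, nu, count=80):
    """Truncated series for J_nu(x) = (x/2)^nu * f_nu(x), for x > 0."""
    a = coefficients_by_recursion(nu, count)
    return (x / 2.0) ** nu * sum(c * x ** k for k, c in enumerate(a))

print(bessel_j(2.0, 0.5), math.sqrt(2 / (math.pi * 2.0)) * math.sin(2.0))
```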
As a consequence of the absence of a clear notion of limits, in the 17th
and 18th centuries, power series were generally treated like polynomials. Of
course, the product of two polynomials is another polynomial. The standard
way to show this is to use the distributive law to write the product as a
combination of powers of the variable and then to collect like powers of the
variable. From the result, the coefficients of the resulting polynomial can be
read off. For the above reason, the same was done for power series, which
led to the definition of the product of power series. If in that definition the
value $1$ is substituted for the variable, we arrive at the so called Cauchy
product of series. Indeed, the following shows that the Cauchy product
of an absolutely summable and a summable sequence is summable with
corresponding sum given by the product of the sums of the factors.
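Theorem 3.4.22. (Cauchy product of series) Let $a_0, a_1, \dots$ and $b_0, b_1, \dots$ be
absolutely summable and summable, respectively, sequences of real numbers. We define
$$c_n := \sum_{k=0}^{n} a_k b_{n-k} \tag{3.4.12}$$
for all $n \in \mathbb{N}$. Then $c_0, c_1, \dots$ is summable and
$$\sum_{k=0}^{\infty} c_k = \left( \sum_{k=0}^{\infty} a_k \right) \left( \sum_{k=0}^{\infty} b_k \right) .$$
Proof. For this, let $A_0, A_1, \dots$, $B_0, B_1, \dots$ and $C_0, C_1, \dots$ be the sequences
of partial sums of $a_0, a_1, \dots$, $b_0, b_1, \dots$ and $c_0, c_1, \dots$, respectively. Further,
let $\beta_n := B_n - B$ for every $n \in \mathbb{N}$ where
$$B := \sum_{k=0}^{\infty} b_k .$$
In a first step, it follows by induction that
$$C_n = \sum_{k=0}^{n} a_k B_{n-k}$$
and hence that
$$C_n = \sum_{k=0}^{n} a_k (B + \beta_{n-k}) = A_n B + \sum_{k=0}^{n} a_k \beta_{n-k} = A_n B + \sum_{k=0}^{n} a_{n-k} \beta_k$$
for every $n \in \mathbb{N}$. Since $\lim_{n \to \infty} \beta_n = 0$, for given $\varepsilon > 0$ there is $N \in \mathbb{N}$
such that $|\beta_n| \le \varepsilon$ for all $n \in \{N, N + 1, \dots\}$. Hence for such $n$
$$\left| \sum_{k=0}^{n} a_{n-k} \beta_k \right| \le \sum_{k=0}^{n} |a_{n-k}|\, |\beta_k| = \sum_{k=0}^{N} |a_{n-k}|\, |\beta_k| + \sum_{k=N+1}^{n} |a_{n-k}|\, |\beta_k| \le M \sum_{k=n-N}^{n} |a_k| + \varepsilon \sum_{k=0}^{\infty} |a_k|$$
where $M > 0$ is such that $|\beta_k| \le M$ for all $k \in \mathbb{N}$. Further,
by Theorem 3.3.28, there is $N' \in \mathbb{N}$ such that
$$\sum_{k=n-N}^{n} |a_k| \le \varepsilon$$
for all $n \in \{N + N', N + N' + 1, \dots\}$. Hence for such $n$
$$\left| \sum_{k=0}^{n} a_{n-k} \beta_k \right| \le \left( M + \sum_{k=0}^{\infty} |a_k| \right) \varepsilon .$$
Since $\varepsilon$ was arbitrary, the sums $\sum_{k=0}^{n} a_{n-k} \beta_k$ converge to zero for $n \to \infty$, and hence $C_0, C_1, \dots$ converges to the product of the sum of $a_0, a_1, \dots$ and $B$.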
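The definition (3.4.12) translates directly into code. The following Python sketch is an illustration only and works with finitely many leading terms of both sequences; the function name `cauchy_product` is hypothetical. As a check it uses $a_k = x^k/k!$ and $b_k = y^k/k!$, for which the binomial theorem gives $c_n = (x + y)^n/n!$.

```python
import math
from fractions import Fraction

def cauchy_product(a, b):
    """First min(len(a), len(b)) terms c_n = sum_{k=0}^{n} a_k * b_(n-k)."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

x, y = 0.3, 0.5
a = [x ** k / math.factorial(k) for k in range(30)]
b = [y ** k / math.factorial(k) for k in range(30)]
c = cauchy_product(a, b)
print(sum(c), math.exp(x) * math.exp(y))

# Exact example: the geometric sequence 1, 1/2, 1/4, ... multiplied with itself
# has the terms c_n = (n + 1) / 2^n.
g = [Fraction(1, 2 ** k) for k in range(6)]
print(cauchy_product(g, g))
```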
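The following example shows that the Cauchy product of two conditionally
summable sequences is not necessarily summable.
Example 3.4.23. Define
$$a_n := b_n := \frac{(-1)^n}{\sqrt{n + 1}}$$
and
$$c_n := \sum_{k=0}^{n} a_k b_{n-k}$$
for all $n \in \mathbb{N}$. Then $a_0, a_1, \dots$ and $b_0, b_1, \dots$ are conditionally summable.
Further,
$$c_n = \sum_{k=0}^{n} \frac{(-1)^k}{\sqrt{k + 1}} \cdot \frac{(-1)^{n-k}}{\sqrt{n - k + 1}} = (-1)^n \sum_{k=0}^{n} \frac{1}{\sqrt{(k + 1)(n - k + 1)}} = (-1)^n \sum_{k=0}^{n} \frac{1}{\sqrt{\left( \frac{n}{2} + 1 \right)^2 - \left( \frac{n}{2} - k \right)^2}}$$
and hence
$$|c_n| \ge \frac{2 (n + 1)}{n + 2}$$
for all $n \in \mathbb{N}$. As a consequence, $\lim_{n \to \infty} |c_n| \ne 0$ and therefore $c_0, c_1, \dots$
is not summable.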
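The lower bound for $|c_n|$ is easy to confirm numerically. The following Python sketch is an illustration only; the function name `c` is hypothetical and floating point arithmetic is used.

```python
import math

def c(n):
    """Cauchy product term c_n for a_k = b_k = (-1)^k / sqrt(k + 1)."""
    return (-1) ** n * sum(1.0 / math.sqrt((k + 1) * (n - k + 1)) for k in range(n + 1))

for n in (0, 1, 10, 100, 1000):
    # |c_n| stays above 2(n + 1)/(n + 2), so c_n does not tend to zero.
    print(n, abs(c(n)), 2 * (n + 1) / (n + 2))
```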
The following example gives an application of the Cauchy product of series
to the summation of arithmetic series. As a result, we will obtain a systematic method for the derivation of sums of arithmetic series. The key idea
comes from the observation that the coefficients of the Cauchy product, see
(3.4.12), are given by the partial sums of the first series if $b_k = 1$ for all
$k \in \mathbb{N}$.
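Example 3.4.24. (Summation of arithmetic series) Show that
$$\sum_{k=1}^{n} [\, k (k + 1) \cdots (k + m - 1) \,] = \frac{1}{m + 1}\, n (n + 1) \cdots (n + m) \tag{3.4.13}$$
holds for all $m, n \in \mathbb{N}^*$. Solution: For this, let $n \in \mathbb{N}$. In a first step, it
follows for all $|x| < 1$ that
$$\frac{n!}{(1 - x)^{n+1}} = \sum_{k=0}^{\infty} \frac{(k + n)!}{k!}\, x^k ,$$
including the absolute summability of the series. The proof proceeds by
induction over $n \in \mathbb{N}$. For the case $n = 0$, this follows from Example 3.3.2.
If the statement is true for some $n \in \mathbb{N}$, we conclude by Theorem 3.4.16
and Corollary 3.4.17 that
$$\frac{(n + 1)!}{(1 - x)^{n+2}} = \sum_{k=1}^{\infty} k\, \frac{(k + n)!}{k!}\, x^{k-1} = \sum_{k=0}^{\infty} (k + 1)\, \frac{(k + n + 1)!}{(k + 1)!}\, x^{k} = \sum_{k=0}^{\infty} \frac{(k + n + 1)!}{k!}\, x^k$$
for all $|x| < 1$, including the absolute summability of the last series, and
hence the validity of the statement where $n$ is replaced by $n + 1$. Further,
it follows by Theorem 3.4.22 that
$$\frac{1}{1 - x} \cdot \frac{n!}{(1 - x)^{n+1}} = \left( \sum_{k=0}^{\infty} x^k \right) \left( \sum_{k=0}^{\infty} \frac{(k + n)!}{k!}\, x^k \right) = \sum_{k=0}^{\infty} \left[ \sum_{l=0}^{k} \frac{(l + n)!}{l!} \right] x^k$$
for $|x| < 1$. Since
$$\frac{1}{1 - x} \cdot \frac{n!}{(1 - x)^{n+1}} = \frac{1}{n + 1} \cdot \frac{(n + 1)!}{(1 - x)^{n+2}} = \sum_{k=0}^{\infty} \frac{(k + n + 1)!}{(n + 1)\, k!}\, x^k$$
for $|x| < 1$, it follows by Corollary 3.4.17 that
$$\sum_{l=1}^{k+1} l (l + 1) \cdots (l + n - 1) = \sum_{l=0}^{k} (l + 1)(l + 2) \cdots (l + n) = \sum_{l=0}^{k} \frac{(l + n)!}{l!}$$
$$= \frac{1}{n + 1} \cdot \frac{(k + n + 1)!}{k!} = \frac{1}{n + 1}\, (k + 1)(k + 2) \cdots (k + n + 1)$$
for all $k \in \mathbb{N}$ and hence (3.4.13) for all $m, n \in \mathbb{N}^*$. From (3.4.13), we can
iteratively determine the sum of the arithmetic series
$$S_m(n) := \sum_{k=1}^{n} k^m$$
for all $m, n \in \mathbb{N}^*$. We carry the procedure through for $m = 1$ to $4$. For
$m = 1$, according to (3.4.13)
$$S_1(n) = \sum_{k=1}^{n} k = \frac{1}{2}\, n (n + 1)$$
for all $n \in \mathbb{N}^*$. For $m = 2$, according to (3.4.13)
$$S_2(n) + S_1(n) = \sum_{k=1}^{n} [\, k (k + 1) \,] = \frac{1}{3}\, n (n + 1)(n + 2)$$
and hence
$$S_2(n) = \frac{1}{3}\, n (n + 1)(n + 2) - \frac{1}{2}\, n (n + 1) = \frac{1}{6}\, n (n + 1)(2n + 1)$$
for all $n \in \mathbb{N}^*$. For $m = 3$, according to (3.4.13)
$$S_3(n) + 3 S_2(n) + 2 S_1(n) = \sum_{k=1}^{n} [\, k (k + 1)(k + 2) \,] = \frac{1}{4}\, n (n + 1)(n + 2)(n + 3)$$
and hence
$$S_3(n) = \frac{1}{4}\, n (n + 1)(n + 2)(n + 3) - \frac{1}{2}\, n (n + 1)(2n + 1) - n (n + 1)$$
$$= \frac{1}{4}\, n (n + 1)(n^2 + 5n + 6 - 4n - 2 - 4) = \frac{1}{4}\, n^2 (n + 1)^2$$
for all $n \in \mathbb{N}^*$. For $m = 4$, according to (3.4.13)
$$S_4(n) + 6 S_3(n) + 11 S_2(n) + 6 S_1(n) = \sum_{k=1}^{n} [\, k (k + 1)(k + 2)(k + 3) \,] = \frac{1}{5}\, n (n + 1)(n + 2)(n + 3)(n + 4)$$
and hence
$$S_4(n) = \frac{1}{5}\, n (n + 1)(n + 2)(n + 3)(n + 4) - \frac{3}{2}\, n^2 (n + 1)^2 - \frac{11}{6}\, n (n + 1)(2n + 1) - 3 n (n + 1)$$
$$= \frac{1}{30}\, n (n + 1) \left[\, 6 (n + 2)(n + 3)(n + 4) - 45 n (n + 1) - 55 (2n + 1) - 90 \,\right]$$
$$= \frac{1}{30}\, n (n + 1) \left[\, 3 (2n^3 + 18n^2 + 52n + 48 - 15n^2 - 15n - 30) - 55 (2n + 1) \,\right]$$
$$= \frac{1}{30}\, n (n + 1) \left[\, 3 (2n^3 + 3n^2 + 37n + 18) - 55 (2n + 1) \,\right]$$
$$= \frac{1}{30}\, n (n + 1) \left[\, 3 (n^2 + n + 18)(2n + 1) - 55 (2n + 1) \,\right]$$
$$= \frac{1}{30}\, n (n + 1)(2n + 1)(3n^2 + 3n - 1)$$
for all $n \in \mathbb{N}^*$. As a consequence, we arrive at the following results.
$$S_1(n) = \frac{1}{2}\, n (n + 1) , \qquad S_2(n) = \frac{1}{6}\, n (n + 1)(2n + 1) ,$$
$$S_3(n) = \frac{1}{4}\, n^2 (n + 1)^2 , \qquad S_4(n) = \frac{1}{30}\, n (n + 1)(2n + 1)(3n^2 + 3n - 1)$$
for all $n \in \mathbb{N}^*$.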
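Identity (3.4.13) and the resulting closed forms can be verified with exact integer arithmetic. The following Python sketch is an illustration only; the function names are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def rising_sum(n, m):
    """Left-hand side of (3.4.13): sum over k = 1..n of k (k + 1) ... (k + m - 1)."""
    return sum(prod(range(k, k + m)) for k in range(1, n + 1))

def rising_closed_form(n, m):
    """Right-hand side of (3.4.13): n (n + 1) ... (n + m) / (m + 1)."""
    return prod(range(n, n + m + 1)) // (m + 1)

def power_sum(n, m):
    """S_m(n) = 1^m + 2^m + ... + n^m, computed directly."""
    return sum(k ** m for k in range(1, n + 1))

def s4_closed_form(n):
    """S_4(n) = n (n + 1)(2n + 1)(3n^2 + 3n - 1) / 30."""
    return n * (n + 1) * (2 * n + 1) * (3 * n * n + 3 * n - 1) // 30

print(rising_sum(10, 3), rising_closed_form(10, 3))
print(power_sum(10, 4), s4_closed_form(10))
```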
The following gives a simple and useful criterion for the convergence of the
Taylor series of a function. It is a simple consequence of Taylor’s theorem,
Theorem 2.5.25, and the fact that $n!$ grows faster with $n \in \mathbb{N}$ than $a^n$
where $a$ is any positive real number.
Theorem 3.4.25. (Taylor expansions) Let $I$ be a non-trivial open interval,
$a \in I$ and $f : I \to \mathbb{R}$ be infinitely often differentiable and such that there are
$M \ge 0$ and $N \in \mathbb{N}$ such that
$$|f^{(n)}(x)| \le M^n$$
for all $n \in \{N, N + 1, \dots\}$ and $x \in I$. Then
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}\, (x - a)^k \tag{3.4.14}$$
for all $x \in I$.
Proof. By Theorem 2.5.25 for every $x \in I$ and $n \in \{N, N + 1, \dots\}$, there
is some $c_{x,n}$ in the closed interval between $a$ and $x$ such that
$$\left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\, (x - a)^k \right| = \frac{|f^{(n)}(c_{x,n})|}{n!}\, |x - a|^n \le \frac{(M |x - a|)^n}{n!}$$
and hence
$$\lim_{n \to \infty} \left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\, (x - a)^k \right| = 0 .$$
Corollary 3.4.26. Let $I$, $a \in I$, $f : I \to \mathbb{R}$ and $M \ge 0$ be as in Theorem
3.4.25. Then
$$f'(x) = \sum_{k=0}^{\infty} \frac{f^{(k+1)}(a)}{k!}\, (x - a)^k$$
for all $x \in I$ and $F : I \to \mathbb{R}$, defined by
$$F(x) := \sum_{k=1}^{\infty} \frac{f^{(k-1)}(a)}{k!}\, (x - a)^k$$
for all $x \in I$, is an anti-derivative of $f$.
Proof. Without restriction, we can assume that $M, N \ge 1$. Further, let $g := f'$. Then $g$ is infinitely often differentiable and
$$|g^{(n)}(x)| = |f^{(n+1)}(x)| \le M^{n+1} \le M^{2n} = (M^2)^n$$
for all $n \in \{N, N + 1, \dots\}$ and $x \in I$. Hence it follows by Theorem 3.4.25
that
$$g(x) = \sum_{k=0}^{\infty} \frac{f^{(k+1)}(a)}{k!}\, (x - a)^k \tag{3.4.15}$$
for all $x \in I$. Further, let $c, d \in I$ be such that $c < a < d$. Then we define
$F : (c, d) \to \mathbb{R}$ by
$$F(x) := \int_c^x f(y)\, dy - \int_c^a f(y)\, dy$$
for every $x \in (c, d)$. Then $F$ is infinitely often differentiable with its first
derivative given by the restriction of $f$ to the interval $(c, d)$. Further, $F(a) = 0$ and
$$|F^{(n)}(x)| = |f^{(n-1)}(x)| \le M^{n-1} \le M^n$$
for all $n \in \{N + 1, N + 2, \dots\}$ and $x \in (c, d)$. Hence it follows by Theorem 3.4.25 that
$$F(x) = \sum_{k=1}^{\infty} \frac{f^{(k-1)}(a)}{k!}\, (x - a)^k \tag{3.4.16}$$
for all $x \in (c, d)$.
Example 3.4.27. By Theorem 3.4.25 it follows that
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} , \qquad \cos(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} , \qquad \sin(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}$$
for all $x \in \mathbb{R}$.

Fig. 107: Graphs of the exponential function and corresponding Taylor polynomials of orders $0$, $1$ and $2$ around $0$.

Fig. 108: Graphs of the cosine function and corresponding Taylor polynomials of orders $0, 2, 4$ around $0$.

Fig. 109: Graphs of the sine function and corresponding Taylor polynomials of orders $1, 3, 5$ around $0$.

Fig. 110: Graphs of the error function, an associated asymptote and corresponding Taylor polynomials of orders $1, 3, 5$ around $0$.
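Example 3.4.28. Find the power series expansion around zero of the error
function defined by
$$\operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2}\, dy$$
for all $x \ge 0$. By Example 3.4.27, it follows that
$$e^{-y^2} = \sum_{k=0}^{\infty} (-1)^k \frac{y^{2k}}{k!}$$
for all $y \in \mathbb{R}$ including the absolute summability of this sum and also the
uniform convergence of the sequence of functions $S_0, S_1, \dots$ defined by
$$S_n(y) := \sum_{k=0}^{n} (-1)^k \frac{y^{2k}}{k!}$$
for every $n \in \mathbb{N}$ and $y \in \mathbb{R}$ on every closed subinterval of $\mathbb{R}$. Hence it
follows by Theorem 3.4.5
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2}\, dy = \frac{2}{\sqrt{\pi}} \int_0^x \sum_{k=0}^{\infty} (-1)^k \frac{y^{2k}}{k!}\, dy = \frac{2}{\sqrt{\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \int_0^x y^{2k}\, dy = \frac{2}{\sqrt{\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k\, x^{2k+1}}{(2k + 1)\, k!}$$
for all $x \ge 0$.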
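The series can be compared with the error function from the Python standard library. The following sketch is an illustration only; the truncation after `terms` summands and the function name `erf_series` are assumptions. For moderate $x$ a few dozen terms suffice, while for large $x$ the alternating terms grow before they decay and cancellation limits the attainable accuracy.

```python
import math

def erf_series(x, terms=40):
    """Truncated series (2/sqrt(pi)) * sum (-1)^k x^(2k+1) / ((2k + 1) k!)."""
    total = sum((-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(k))
                for k in range(terms))
    return 2.0 / math.sqrt(math.pi) * total

for x in (0.1, 0.5, 1.0, 2.0):
    print(x, erf_series(x), math.erf(x))
```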
Example 3.4.29. Find the first three terms in the Taylor expansion of
$$f(x) := e^{ax} \cos(x)$$
for all $x \in \mathbb{R}$ where $a \in \mathbb{R}$. Solution: By Theorem 3.4.22 and Example 3.4.27, it follows
that the first three terms in the Taylor expansion of $f$ are given by
$$c_0 + c_1 x + c_2 x^2 ,$$
where
$$c_0 = a_0 b_0 , \qquad c_1 = a_0 b_1 + a_1 b_0 , \qquad c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0 ,$$
$a_0 = 1$, $a_1 = a$, $a_2 = a^2/2$, $b_0 = 1$, $b_1 = 0$, $b_2 = -1/2$, and hence by
$$1 + a x + \frac{a^2 - 1}{2}\, x^2$$
for all $x \in \mathbb{R}$.
The binomial series was discovered by Newton in 1665, inspired by Wallis’
paper ‘Arithmetica infinitorum’ [98]. He never published this result, but
described its derivation in letters from 1676 to Leibniz. The corresponding
Taylor polynomials are routinely used in applications for the purpose of
approximation. Theorem 3.4.25 is not strong enough for its derivation,
but there is a simpler method of proof by consideration of an associated
differential equation. That method is employed in the following.
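Example 3.4.30. (Binomial series) Let $\nu \in \mathbb{R}$. Show that
$$(1 + x)^{\nu} = \sum_{n=0}^{\infty} \binom{\nu}{n} x^n$$
for all $x \in (-1, 1)$ if $\nu \notin \mathbb{N}$ and all $x \in (-1, \infty)$ if $\nu \in \mathbb{N}$. The coefficients
in the series are called ‘binomial coefficients’. They are defined by
$$\binom{\nu}{0} := 1 , \qquad \binom{\nu}{n} := \frac{1}{n!}\, \nu (\nu - 1) \cdots (\nu - (n - 1))$$
for every $n \in \mathbb{N}^*$. Note that
$$\binom{\nu}{n} = \frac{\Gamma(\nu + 1)}{\Gamma(n + 1)\, \Gamma(\nu - n + 1)}$$
for all $n \in \mathbb{N}$ satisfying $n < \nu + 1$. The series is called a binomial series.
Note that in case that $\nu \in \mathbb{N}$, the series terminates since
$$\binom{\nu}{\nu + k} = 0$$
for all $k \in \mathbb{N}^*$. In this case, the series coincides with a finite sum. Solution:
First, we notice that there is $n_0 \in \mathbb{N}$ such that
$$\binom{\nu}{n} = 0$$
for $n \in \mathbb{N}$ satisfying $n \ge n_0$ if and only if $\nu \in \mathbb{N}$. In this case, the power
series coincides with a finite sum and its convergence radius is therefore
infinite. In the case that $\nu \notin \mathbb{N}$, it follows that
$$\left| \frac{\binom{\nu}{n+1} x^{n+1}}{\binom{\nu}{n} x^n} \right| = \frac{n!}{(n + 1)!} \cdot \frac{|\nu (\nu - 1) \cdots (\nu - n)|}{|\nu (\nu - 1) \cdots (\nu - (n - 1))|}\, |x| = \frac{|\nu - n|}{n + 1}\, |x|$$
for all non-zero $x \in \mathbb{R}$ and $n \in \mathbb{N}$, and the right-hand side converges to $|x|$ for $n \to \infty$. Hence it follows by the
ratio test, Theorem 3.3.31, that the series is absolutely summable for every
$x \in (-1, 1)$ and not summable for every $|x| > 1$. As a consequence, in
this case, the convergence radius of the power series is equal to $1$. In the
following, we define $I := (-1, \infty)$ if $\nu \in \mathbb{N}$, $I := (-1, 1)$ if $\nu \notin \mathbb{N}$ and
$f : I \to \mathbb{R}$ by
$$f(x) := \sum_{n=0}^{\infty} \binom{\nu}{n} x^n$$
for all $x \in I$. Then,
$$(1 + x) f'(x) = (1 + x) \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n-1} = \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n-1} + \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n}$$
$$= \sum_{n=0}^{\infty} (n + 1) \binom{\nu}{n + 1} x^{n} + \sum_{n=0}^{\infty} n \binom{\nu}{n} x^{n} = \sum_{n=0}^{\infty} \left[ (n + 1) \binom{\nu}{n + 1} + n \binom{\nu}{n} \right] x^n$$
for every $x \in I$. In particular,
$$(0 + 1) \binom{\nu}{1} + 0 \cdot \binom{\nu}{0} = \nu = \nu \binom{\nu}{0}$$
and
$$(n + 1) \binom{\nu}{n + 1} + n \binom{\nu}{n} = (n + 1)\, \frac{1}{(n + 1)!}\, \nu (\nu - 1) \cdots (\nu - n) + n\, \frac{1}{n!}\, \nu (\nu - 1) \cdots (\nu - (n - 1))$$
$$= [\, (\nu - n) + n \,]\, \frac{1}{n!}\, \nu (\nu - 1) \cdots (\nu - (n - 1)) = \nu \binom{\nu}{n}$$
for every $n \in \mathbb{N}^*$. Hence it follows that
$$(1 + x) f'(x) = \nu f(x)$$
for all $x \in I$. The last is equivalent to
$$\left[ (1 + \mathrm{id}_I)^{-\nu} f \right]'(x) = 0 \tag{3.4.17}$$
for all $x \in I$. Then it follows by Theorem 2.5.7 that $(1 + \mathrm{id}_I)^{-\nu} f$ is a
constant function and, since $f(0) = 1$, that $f(x) = (1 + x)^{\nu}$ for all $x \in I$.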
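The binomial series can be summed numerically with the recursion $\binom{\nu}{n+1} = \binom{\nu}{n}\, (\nu - n)/(n + 1)$, which follows directly from the definition of the coefficients. The following Python sketch is an illustration only; the truncation after `terms` summands and the function name `binomial_series` are assumptions.

```python
def binomial_series(x, nu, terms=200):
    """Truncated binomial series: sum over n < terms of binom(nu, n) * x^n."""
    total, coeff, power = 0.0, 1.0, 1.0
    for n in range(terms):
        total += coeff * power
        coeff *= (nu - n) / (n + 1)   # binom(nu, n + 1) from binom(nu, n)
        power *= x
    return total

print(binomial_series(0.3, 0.5), 1.3 ** 0.5)        # nu not a natural number, |x| < 1
print(binomial_series(-0.4, -1.5), 0.6 ** -1.5)     # negative exponent
print(binomial_series(2.0, 3), 3.0 ** 3)            # nu in N: the series terminates
```

For $\nu \in \mathbb{N}$ the coefficient becomes zero after finitely many steps, so the sum is finite and may be evaluated for $|x| \ge 1$ as well.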
The following gives a standard example for the derivation of power series
for functions defined in terms of integrals. The proved integral representation for Bessel functions of the first kind is of frequent use in applications
[2].
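Example 3.4.31. Show that
$$J_\nu(x) = \frac{1}{\sqrt{\pi}\, \Gamma\!\left( \nu + \frac{1}{2} \right)} \left( \frac{x}{2} \right)^{\nu} \int_0^{\pi} \cos(x \cos\theta)\, \sin^{2\nu}\theta\, d\theta$$
for all $x > 0$ and $\nu \ge 0$. Solution: For this, let $\nu \ge 0$. Then it follows by
use of the power series expansion of $\cos$ and Theorem 3.4.5 that
$$\int_0^{\pi} \cos(x \cos\theta)\, \sin^{2\nu}\theta\, d\theta = \int_0^{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k} \cos^{2k}\theta\, \sin^{2\nu}\theta\, d\theta = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k} \int_0^{\pi} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta .$$
Further, it follows by change of variables and (3.2.18) that
$$\int_0^{\pi} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta = \int_0^{\pi/2} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta + \int_{\pi/2}^{\pi} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta$$
$$= \int_0^{\pi/2} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta + \int_0^{\pi/2} \sin^{2\nu}\!\left( \bar\theta + \frac{\pi}{2} \right) \cos^{2k}\!\left( \bar\theta + \frac{\pi}{2} \right) d\bar\theta$$
$$= \int_0^{\pi/2} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta + \int_0^{\pi/2} \cos^{2\nu}\bar\theta\, \sin^{2k}\bar\theta\, d\bar\theta = \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma\!\left( k + \frac{1}{2} \right)}{\Gamma(k + \nu + 1)} .$$
For $k \ne 0$, it follows by Legendre’s duplication formula (3.2.16) for $\Gamma$ that
$$\frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma\!\left( k + \frac{1}{2} \right)}{\Gamma(k + \nu + 1)} = \sqrt{\pi}\, 2^{1 - 2k}\, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma(2k)}{\Gamma(k + \nu + 1)\, \Gamma(k)} = \sqrt{\pi}\, 2^{1 - 2k}\, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) 2k\, \Gamma(2k)}{\Gamma(k + \nu + 1)\, 2k\, \Gamma(k)} = \sqrt{\pi}\, 2^{-2k}\, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) (2k)!}{\Gamma(k + \nu + 1)\, k!}$$
and hence that
$$\int_0^{\pi} \sin^{2\nu}\theta\, \cos^{2k}\theta\, d\theta = \sqrt{\pi}\, 2^{-2k}\, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) (2k)!}{\Gamma(k + \nu + 1)\, k!}$$
for all $k \in \mathbb{N}$ where as usual $0! := 1$. This leads to
$$\int_0^{\pi} \cos(x \cos\theta)\, \sin^{2\nu}\theta\, d\theta = \sqrt{\pi}\, \Gamma\!\left( \nu + \frac{1}{2} \right) \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\, \Gamma(k + \nu + 1)} \left( \frac{x}{2} \right)^{2k}$$
and hence to
$$\frac{1}{\sqrt{\pi}\, \Gamma\!\left( \nu + \frac{1}{2} \right)} \left( \frac{x}{2} \right)^{\nu} \int_0^{\pi} \cos(x \cos\theta)\, \sin^{2\nu}\theta\, d\theta = \left( \frac{x}{2} \right)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\, \Gamma(k + \nu + 1)} \left( \frac{x}{2} \right)^{2k} = J_\nu(x)$$
where the last equality is a consequence of (3.4.11).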
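The integral representation can be checked against the series (3.4.11) by numerical quadrature. The following Python sketch is an illustration under assumptions: it uses the composite Simpson rule, which is not part of the text, restricts the comparison to integer orders where the integrand is smooth on $[0, \pi]$, and the function names are hypothetical.

```python
import math

def bessel_series(x, nu, terms=60):
    """Truncated series (3.4.11) for J_nu(x), x > 0."""
    total = sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
                * (x / 2.0) ** (2 * k) for k in range(terms))
    return (x / 2.0) ** nu * total

def bessel_integral(x, nu, panels=2000):
    """Integral representation of J_nu(x) via the composite Simpson rule.

    The number of panels must be even.
    """
    def g(theta):
        # abs() guards against a tiny negative sine caused by rounding near pi.
        return math.cos(x * math.cos(theta)) * abs(math.sin(theta)) ** (2.0 * nu)

    step = math.pi / panels
    total = g(0.0) + g(math.pi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(i * step)
    integral = total * step / 3.0
    return (x / 2.0) ** nu * integral / (math.sqrt(math.pi) * math.gamma(nu + 0.5))

for nu in (0, 1):
    for x in (1.0, 3.5):
        print(nu, x, bessel_integral(x, nu), bessel_series(x, nu))
```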
Problems
1) Find the interval of convergence of the given series
8̧
a)
xn
8̧
,
b)
xn
1 n2
n0
1 n
8̧
xn
c)
, d)
3n pn 1q
n0
8̧ p3x 2qn
n 0
e)
8̧
n 0
g)
5n
xn
lnpn 2q
n0
,
f)
,
h)
,
8̧ px 1qn
?
,
n 1
n0
8̧ n!
10n
n0
xn
,
8̧
xn
plnpn 2qqn
n0
for real x.
2) Find the Taylor series of f around x0 and the corresponding convergence radius and interval of convergence.
a) f pxq : 4x{p1
b)
c)
d)
e)
f)
g)
h)
i)
j)
k)
l)
2x 3x2 q , x P R zt1{3, 1u ; x0
f pxq : sinpxq , x P R ; x0
π {4 ,
f pxq : sin pxq , x P R ; x0 π {4 ,
f pxq : lnp1 xq , x 1 ; x0 0 ,
f pxq : sinhpxq , x P R ; x0 0 ,
f pxq : coshpxq , x P R ; x0 0 ,
f pxq : x3 3x 7 , x P R ; x0 3
f pxq : 3x , x P R ; x0 0 ,
f pxq : 1{p1 x x2 q , x P R ; x0 0
f pxq : 1{p1 x3 q , x P R ; x0 0 ,
f pxq : ex {2 , x P R ; x0 0 ,
f pxq : lnp1 x2 q , x P R ; x0 0 .
2
2
,
,
0
,
3) Find the Maclaurin series of
$$f(x) := \frac{1}{1 + x^2} ,$$
$x \in \mathbb{R}$, and use the result to determine the Maclaurin series of arctan.
Finally, show that
$$\frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n + 1} .$$
4) The Maclaurin series for f : p1, 1q Ñ R defined by
for all x P
too slowly.
f pxq : lnp1 xq
p8, 1q is not useful for computation since converging
a) Find the Maclaurin series for g : p1, 1q Ñ R defined by
g pxq : ln
1 x
1x
for all x P p1, 1q. Also, find the convergence radius and interval of convergence of the series.
b) Show that the error of truncating the series after n P N terms
is equal or smaller than
2
2n
2n 1
x
1 1 x2
for x P p1, 1q.
c) Compute lnp2q to four decimal places by using the series obtained in a). Show the accuracy of your result by using the
estimate from bq.
5) By use of the Cauchy product of series, determine the Maclaurin
series of f .
a) f pxq : p1 xq2 , x 1
b) f pxq : lnp1
c) f pxq : rlnp1
xq{p1
,
xq , x ¡ 1
xqs , x ¡ 1
2
6) By use of the Cauchy product of series show that
$$\exp(x) \exp(y) = \exp(x + y)$$
for all $x, y \in \mathbb{R}$.
Fig. 111: Graphs of $f$ from Problem 11 and the constant function of value $1$ which is an asymptote for large positive values of the argument.
7) Calculate the first k nonzero terms in the Taylor expansion of f
around x0 .
a)
b)
c)
f pxq : ecospxq , x P R ; x0
0, k3 ,
f pxq : cospxq{p1 xq , |x| 1 ; x0 0 , k 6
?
f pxq : expp x q , x P r0, 8q ; x0 1 , k 3 ,
,
8) Use the Taylor series of the sine function around π {4 to approximate
its value at 470 degrees correctly to five decimal places.
9) Evaluate
» 1{2
0
1
dx
x6
to four-decimal-place accuracy by using a suitable power series expansion. Give reasons for the validity of your calculation.
10) Calculate the leading first four digits of
»1
0
cospeu q du
by using a suitable power series expansion. Give reasons for the
validity of your calculation.
11) Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := \begin{cases} 0 & \text{if } x \le 0 \\ \exp(-1/x) & \text{if } x > 0 \end{cases} .$$
a) Show that $f$ is infinitely often differentiable.
b) Calculate the Maclaurin series of $f$ and show that it does not
converge to $f(x)$ for any $x > 0$.
12) By a power series expansion around x 0, find a solution of the
differential equation satisfying the given boundary conditions. Determine the convergence radius of the series.
a)
for all x P R;
b)
for all x 1;
c)
for all x P R;
d)
f 2 pxq
2f 1 pxq
f pxq 0
f p0q 0 , f 1 p0q 1 .
p1 xq2 f 2 pxq 2f pxq 0
f p0q f 1 p0q 1 .
xf 2 pxq
2f 1 pxq
f p0q 1 .
xf 2 pxq f 1 pxq
for all x P R;
13) Let a, b ¡ 0 and c ¡ a
xf pxq 0
p3 xqf pxq 0
f 2 p0q 2 .
b.
a) Find the convergence radius $r$ of the Gauss hypergeometric series
$$\frac{\Gamma(c)}{\Gamma(a)\, \Gamma(b)} \sum_{n=0}^{\infty} \frac{\Gamma(a + n)\, \Gamma(b + n)}{\Gamma(c + n)}\, \frac{x^n}{n!}$$
for $x \in \mathbb{R}$.
b) Show that the corresponding hypergeometric function $f$, which
is generally denoted by the symbol ‘$F(a, b; c; \cdot)$’ in the literature, satisfies the hypergeometric differential equation
$$x (1 - x) f''(x) + [\, c - (a + b + 1) x \,]\, f'(x) - a b\, f(x) = 0 ,$$
$x \in (-r, r)$.
c) Show that
$$F(1/2, 1; 3/2, -x^2) = \frac{\arctan(x)}{x} ,$$
$$F(1/2, 1; 3/2, x^2) = \frac{1}{2x} \ln\!\left( \frac{1 + x}{1 - x} \right) ,$$
$$F(1, 1; 2, -x) = \frac{\ln(1 + x)}{x} ,$$
for $0 < |x| < r^{1/2}$, $0 < |x| < r$, respectively.
14) Let $a, b > 0$.
a) Find the radius $r$ of convergence of the confluent hypergeometric series
$$\frac{\Gamma(b)}{\Gamma(a)} \sum_{n=0}^{\infty} \frac{\Gamma(a + n)}{\Gamma(b + n)}\, \frac{x^n}{n!}$$
for $x \in \mathbb{R}$.
b) Show that the corresponding confluent hypergeometric function $f$, which is generally denoted by the symbol ‘$M(a, b, \cdot)$’
in the literature, satisfies the confluent hypergeometric differential equation
$$x f''(x) + (b - x) f'(x) - a f(x) = 0 ,$$
$x \in (-r, r)$.
c) Show that
$$M(a, a, x) = e^x , \qquad M(1, 2, 2x) = e^x\, \frac{\sinh(x)}{x}$$
for all $x \in (-r, r)$, $0 < |x| < r$, respectively.
15) Let $n \in \mathbb{N}$. By a power series expansion around $x = 0$, find a solution
$H_n : \mathbb{R} \to \mathbb{R}$ of Hermite’s differential equation
$$f''(x) - 2x f'(x) + 2n f(x) = 0 ,$$
$x \in \mathbb{R}$, satisfying
$$H_n(0) = (-1)^{n/2}\, \frac{n!}{(n/2)!} , \qquad H_n'(0) = 0$$
if $n$ is even and
$$H_n(0) = 0 , \qquad H_n'(0) = 2\, (-1)^{(n-1)/2}\, \frac{n!}{[(n - 1)/2]!}$$
if $n$ is odd. Determine the convergence radius of the associated power
series around $x = 0$.

Fig. 112: Graphs of Hermite polynomials $H_0$, $H_1$, $H_2$, $H_3$.
16) Let $\nu \in \mathbb{R}$. By a power series expansion around $x = 1$, find a solution
of Legendre’s differential equation
$$(1 - x^2) f''(x) - 2x f'(x) + \nu (\nu + 1) f(x) = 0 ,$$
$x \in (-1, 3)$, satisfying
$$f(1) = 1 .$$
Determine the convergence radius of the associated power series around
$x = 1$. What happens if $\nu \in \mathbb{N}$?

Fig. 113: Graphs of Legendre polynomials $P_0$, $P_1$, $P_2$, $P_3$.
3.5 Analytical Geometry and Elementary Vector Calculus
The invention of analytical geometry was another important mathematical development of the 17th century. Usually, this invention is attributed to the works of François Viète, Fermat and Descartes. Indeed, those works put more stress on the use of algebraic reasoning within proofs of geometric statements. On the other hand, they also conformed in large part to ancient Greek mathematical traditions. Sometimes, probably influenced by the fact that his philosophy was almost revolutionary in its break with ancient Greek philosophy, Descartes is claimed to be the prime inventor of analytical geometry. This claim is also reflected in the name of the ‘Cartesian’ coordinates. On the other hand, his work in this area was much less radical. Also, it appears nowadays that he is not the inventor of the Cartesian coordinates [15]. The coordinates used by Nicholas Oresme (1320-1382) for a graphical representation of functions are much closer to their modern use than Descartes’. Here it also has to be taken into account that, like the lawyer Fermat, Descartes was no professional mathematician. From today’s point of view, his main interest was in philosophy. Apparently, the invention of analytical geometry was a gradual process that started already in ancient Greece in the work of Pappus and received its main impulses much later from the development of calculus.
From today’s perspective, the goal of analytical geometry is the replacement of intuition in the solution of geometric problems by algebraic calculations. This goal is contrary to ancient Greek mathematics, which gave meaning to the solutions of algebraic equations through geometrical constructions. Today, unlike geometric intuition, algebraic arguments are considered a valid tool in mathematical proofs. To accomplish its goal, analytical geometry introduces a purely auxiliary Cartesian coordinate system, that is, a coordinate system that is not related in any essential way to the nature of the geometrical problem at hand. This system allows a unique identification of points in the plane and in space by a pair or triple, respectively, of real numbers called the Cartesian coordinates of the point. For a simple prototypical example of the analytic geometric approach, see Example 3.5.5, which investigates the elementary geometric bisection of line segments in the framework of analytic geometry. For a more complicated example, see Example 3.5.26, which proves ancient Greek knowledge on the properties of line segments of parabolas that Archimedes used in his quadrature of the parabola. The latter transcends analytic geometry somewhat, since it involves not just algebra in the analysis, but also methods from calculus, and therefore belongs to the area of differential geometry.
3.5.1 Metric Spaces
Basic to geometry is the notion of the length of line segments or the distance of points. Later on, we will give a definition of the Euclidean distance between points in $\mathbb{R}^2$ and $\mathbb{R}^3$ which is motivated by the Pythagorean law of elementary geometry. For $n \in \mathbb{N}$ such that $n \geq 4$, that definition is generalized in a straightforward manner to points of $\mathbb{R}^n$. In many cases notions of distance have been found that share certain properties of the Euclidean distance. That observation led to the definition of a metric space which allows the formulation of statements that are valid in all those cases.
Definition 3.5.1. A metric space is a pair $(M, d)$ consisting of a non-empty set $M$, whose elements we shall call points, and a (‘distance’- or ‘metric’-) function $d : M \times M \to \mathbb{R}$ such that for all $p, q, r \in M$
(i) $d(p, q) \geq 0$ and $d(p, q) = 0$ if and only if $p = q$ (Positive definiteness),
(ii) $d(p, q) = d(q, p)$ (Symmetry),
(iii) $d(p, q) \leq d(p, r) + d(r, q)$ (Triangle inequality).
The following inequality will be used later on in the proof that $\mathbb{R}^n$, $n \in \mathbb{N}^*$, equipped with the Euclidean distance function is indeed a metric space. It is also frequently used in other connections.
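Lemma 3.5.2. (Cauchy-Schwarz inequality) Let $n \in \mathbb{N}^*$ and $(a_1, \dots, a_n), (b_1, \dots, b_n) \in \mathbb{R}^n$. Then
$$\left| \sum_{j=1}^{n} a_j b_j \right| \leq \left( \sum_{j=1}^{n} a_j^2 \right)^{1/2} \left( \sum_{j=1}^{n} b_j^2 \right)^{1/2} . \tag{3.5.1}$$
In addition, if $b_j \neq 0$ for some $j \in \{1, \dots, n\}$, then equality holds in (3.5.1) if and only if $a_j = (C/B)\, b_j$ for all $j = 1, \dots, n$ where
$$B := \sum_{j=1}^{n} b_j^2 , \quad C := \sum_{j=1}^{n} a_j b_j .$$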
Proof. In addition, define
$$A := \sum_{j=1}^{n} a_j^2 .$$
Then it follows that
$$0 \leq \sum_{j=1}^{n} (B a_j - C b_j)^2 = \sum_{j=1}^{n} \left( B^2 a_j^2 - 2BC\, a_j b_j + C^2 b_j^2 \right) = AB^2 - 2BC^2 + C^2 B = B \,(AB - C^2)$$
and hence (3.5.1) in case $B \neq 0$. In the remaining case $B = 0$, it follows that $b_1 = \dots = b_n = 0$ and hence also (3.5.1). Further, if $b_j \neq 0$ for some $j \in \{1, \dots, n\}$, then $B > 0$, and equality holds in (3.5.1) if and only if the sum of squares above vanishes, i.e., if and only if $a_j = (C/B)\, b_j$ for all $j = 1, \dots, n$.
Fig. 114: The square of the distance between two points $p$, $q$ in the plane is given by the sum of squares of the distances between $p$, $r$ and between $r$, $q$.
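As a numerical illustration of Lemma 3.5.2, the following minimal Python sketch evaluates the difference between the right and left hand sides of (3.5.1). The function name `cauchy_schwarz_gap` and the sample tuples are assumptions made only for this illustration; they do not appear in the text.

```python
import math

def cauchy_schwarz_gap(a, b):
    """Right hand side minus left hand side of (3.5.1); non-negative by the lemma."""
    A = sum(x * x for x in a)               # A = sum of a_j^2
    B = sum(y * y for y in b)               # B = sum of b_j^2
    C = sum(x * y for x, y in zip(a, b))    # C = sum of a_j b_j
    return math.sqrt(A) * math.sqrt(B) - abs(C)

# Generic tuples give a strictly positive gap.
gap_generic = cauchy_schwarz_gap([1.0, 2.0, 3.0], [4.0, -5.0, 6.0])
# Here a = 2 b, i.e. a_j = (C/B) b_j, so the gap vanishes up to rounding.
gap_parallel = cauchy_schwarz_gap([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The second call is an instance of the equality case of the lemma, with $C/B = 2$.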
The definition of the Euclidean distance in $\mathbb{R}^2$ and $\mathbb{R}^3$ is motivated by the Pythagorean law of elementary geometry.
For such motivation, we consider first the case $\mathbb{R}^2$, see Fig. 114. For this, let $p = (p_1, p_2)$ and $q = (q_1, q_2)$ be points in $\mathbb{R}^2$. Then we introduce the auxiliary point $r = (q_1, p_2)$. Since $p$ and $r$ are on the same height, i.e., share the same $y$-coordinate, the length of the line segment $pr$ is given by the length $|p_1 - q_1|$ of its orthogonal projection onto the $x$-axis. Also, since $r$ and $q$ share the same $x$-coordinate, the length of the line segment $rq$ is given by the length $|p_2 - q_2|$ of its orthogonal projection onto the $y$-axis. Since the triangle $prq$ has a right angle in the corner $r$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $pq$ is given by
$$\sqrt{|p_1 - q_1|^2 + |p_2 - q_2|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2} .$$
Fig. 115: The square of the distance between two points $p$, $q$ in space is given by the sum of squares of the distances between $p$, $r$, between $r$, $s$ and between $s$, $q$.
The situation in $\mathbb{R}^3$ is similar, see Fig. 115. For this, let $p = (p_1, p_2, p_3)$ and $q = (q_1, q_2, q_3)$ be points in $\mathbb{R}^3$. We introduce two auxiliary points $r = (q_1, p_2, p_3)$ and $s = (q_1, q_2, p_3)$. Since $p$ and $r$ share the same $y$- and $z$-coordinates, the length of the line segment $pr$ is given by the length $|p_1 - q_1|$ of its orthogonal projection onto the $x$-axis. Also, since $r$ and $s$ share the same $x$- and $z$-coordinates, the length of the line segment $rs$ is given by the length $|p_2 - q_2|$ of its orthogonal projection onto the $y$-axis. Since the triangle $prs$ has a right angle in the corner $r$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $ps$ is given by
$$\sqrt{|p_1 - q_1|^2 + |p_2 - q_2|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2} .$$
Further, since $s$ and $q$ share the same $x$- and $y$-coordinates, the length of the line segment $sq$ is given by the length $|p_3 - q_3|$ of its orthogonal projection onto the $z$-axis. Since the triangle $psq$ has a right angle in the corner $s$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $pq$ is given by
$$\sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + |p_3 - q_3|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2} .$$
Fig. 116: According to the triangle inequality, the distance between the points $p$, $q$ is smaller than the sum of the distances between $p$, $r$ and between $r$, $q$.
Example 3.5.3. Let $n \in \mathbb{N}^*$. Show that $(\mathbb{R}^n, e_n)$, where $e_n : \mathbb{R}^n \times \mathbb{R}^n \to [0, \infty)$ is the usual Euclidean distance function defined by
$$e_n(x, y) := \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2}$$
for all $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$, is a metric space. Solution: The positive definiteness and symmetry of $e_n$ are obvious. Further, it follows by Lemma 3.5.2 that
$$(e_n(x, y))^2 = \sum_{j=1}^{n} (x_j - y_j)^2 = \sum_{j=1}^{n} (x_j - z_j + z_j - y_j)^2 = \sum_{j=1}^{n} (x_j - z_j)^2 + 2 \sum_{j=1}^{n} (x_j - z_j)(z_j - y_j) + \sum_{j=1}^{n} (z_j - y_j)^2$$
$$\leq (e_n(x, z))^2 + 2 \left| \sum_{j=1}^{n} (x_j - z_j)(z_j - y_j) \right| + (e_n(z, y))^2 \leq (e_n(x, z))^2 + 2\, e_n(x, z)\, e_n(z, y) + (e_n(z, y))^2 = (e_n(x, z) + e_n(z, y))^2$$
and hence that
$$e_n(x, y) \leq e_n(x, z) + e_n(z, y)$$
for all $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ and $z = (z_1, \dots, z_n) \in \mathbb{R}^n$.
Fig. 117: Circle of radius 2 and center $(-1, 1)$.
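The Euclidean distance function of Example 3.5.3 translates directly into code. The following is a minimal Python sketch; the name `euclidean_distance` and the sample points are assumptions chosen for this illustration only.

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance e_n(x, y) between two points given as equal-length tuples."""
    return math.sqrt(sum((xj - yj) ** 2 for xj, yj in zip(x, y)))

x, y, z = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 1.0, 1.0)

direct = euclidean_distance(x, y)                             # distance from x to y
detour = euclidean_distance(x, z) + euclidean_distance(z, y)  # distance via z
```

For these sample points, `direct` does not exceed `detour`, in accordance with the triangle inequality shown in the example.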
Example 3.5.4. Let $n \in \mathbb{N}^*$ and $a = (a_1, \dots, a_n) \in \mathbb{R}^n$. Find a function $f$ whose zero set is given by the sphere $S_r^n(a)$ of radius $r > 0$ with center $a$.
Fig. 118: Sphere of radius 3 centered at the point $(1, 2, -1)$.
Solution: A sphere of radius $r > 0$ and center $a$ contains precisely those points $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ which have Euclidean distance $r$ from $a$. Hence $S_r^n(a)$ is given by
$$S_r^n(a) = \left\{ (x_1, \dots, x_n) \in \mathbb{R}^n : \sum_{j=1}^{n} (x_j - a_j)^2 = r^2 \right\} .$$
In particular, $S_r^2(a)$ is called a circle of radius $r$ around $a$. Hence such a function $f$ is given by $f : \mathbb{R}^n \to \mathbb{R}$ defined by
$$f(x_1, \dots, x_n) := \sum_{j=1}^{n} (x_j - a_j)^2 - r^2$$
for all $(x_1, \dots, x_n) \in \mathbb{R}^n$.
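The function found in Example 3.5.4 can be sketched in Python as follows. The helper name `sphere_function` and the chosen center and radius are hypothetical and serve only as an illustration.

```python
def sphere_function(a, r):
    """Return f with f(x) = sum_j (x_j - a_j)^2 - r^2; its zero set is the sphere around a."""
    def f(x):
        return sum((xj - aj) ** 2 for xj, aj in zip(x, a)) - r ** 2
    return f

# Circle of radius 2 around the (assumed) center (1, 2) in the plane.
f = sphere_function((1.0, 2.0), 2.0)

on_circle = f((3.0, 2.0))   # point at distance 2 from the center
at_center = f((1.0, 2.0))   # the center itself lies inside the circle
```

Points on the sphere give the value zero, points inside give negative values and points outside give positive values.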
We exemplify the goal of analytic geometry, i.e., the replacement of intuition in the solution of geometric problems by algebraic calculations, in a
simple example which proves the correctness of the elementary geometric
construction of the bisection of a line segment.
Fig. 119: Elementary geometric bisection of a line segment. See Example 3.5.5.
Example 3.5.5. (Bisection of a line segment) Prove the elementary geometric construction of the bisection of a line segment. Solution: For this, let $p$ and $q$ be two different points in the plane. We introduce a Cartesian coordinate system in the following way. The point $p$ is chosen as the origin of the system, and the direction of the $x$-axis is chosen to coincide with the direction of the oriented line segment from $p$ to $q$. Hence, $p = (0, 0)$ and $q = (a, 0)$ where $a > 0$ is the distance between $p$ and $q$. The elementary geometric construction of the bisection of the line segment $pq$ involves drawing circles of radius $r > a/2$ around $p$ and $q$. The line segment between the intersection points of the circles halves the line segment $pq$. A point $(x, y) \in \mathbb{R}^2$ is an intersection point of the circles if and only if its coordinates satisfy the following equations
$$x^2 + y^2 = r^2 , \quad (x - a)^2 + y^2 = r^2 .$$
Subtraction of these equations gives
$$0 = x^2 + y^2 - (x - a)^2 - y^2 = x^2 - x^2 + 2ax - a^2 .$$
The last equation is equivalent to the equation $x = a/2$. Hence $(x, y) \in \mathbb{R}^2$ is an intersection point if and only if $x = a/2$ and
$$r^2 = x^2 + y^2 = \frac{a^2}{4} + y^2 .$$
As a consequence, the intersection points are given by
$$\left( \frac{a}{2} , \, -\frac{1}{2} \sqrt{4r^2 - a^2} \right) , \quad \left( \frac{a}{2} , \, \frac{1}{2} \sqrt{4r^2 - a^2} \right) .$$
The line segment between these points, given by
$$L := \left\{ \left( \frac{a}{2} , \, \frac{2t - 1}{2} \sqrt{4r^2 - a^2} \right) : 0 \leq t \leq 1 \right\} ,$$
indeed intersects $pq$ in its midpoint $(a/2, 0)$. It might be argued that the above choice of the Cartesian coordinate system involved geometric intuition. On the other hand, a similar, but more complicated, calculation can also be performed for the case of a completely arbitrary Cartesian coordinate system. The whole calculation and reasoning can be performed without any geometric intuition once the geometric problem is translated into a set of algebraic equations. This is the spirit of the analytic geometric approach.
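The calculation of Example 3.5.5 can be retraced numerically. The following minimal Python sketch uses the coordinate system of the example, with $p$ at the origin and $q$ on the positive $x$-axis at distance $a$; the function name `circle_intersections` and the sample values $a = 6$, $r = 5$ are assumptions for illustration.

```python
import math

def circle_intersections(a, r):
    """Intersection points of the circles of radius r around (0, 0) and (a, 0)."""
    if a <= 0 or r <= a / 2:
        raise ValueError("need a > 0 and r > a/2 for two intersection points")
    h = 0.5 * math.sqrt(4 * r * r - a * a)
    return (a / 2, -h), (a / 2, h)

lower, upper = circle_intersections(6.0, 5.0)

# Midpoint of the segment between the intersection points (parameter t = 1/2 in L).
midpoint = ((lower[0] + upper[0]) / 2, (lower[1] + upper[1]) / 2)
```

For the sample values, `midpoint` is the point $(a/2, 0)$, i.e., the midpoint of the line segment between $p$ and $q$.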
Problems
1) Which of the following sets are circles? Find center and radius where
this is the case.
a)
b)
c)
d)
e)
f)
g)
tpx, yq P R2 : x2
tpx, yq P R2 : x2
tpx, yq P R2 : x2
tpx, yq P R2 : x2
tpx, yq P R2 : x2
tpx, yq P R2 : x2
tpx, yq P R2 : 4x2
y2
y2
y
x 2y
2y 3x
2
3x
2
6x
y
0u
2y
,
1 0u
4 0u
1 0u
,
,
,
5y 4 0u ,
y 12x 8y 43 0u
4y 2 2x 8y 1 0u
y
2
2
,
.
2) Find a function whose zero set is a circle with center $(3, 2)$ passing through $(2, 1)$.
3) Find a function whose zero set is a circle with center $(a, a)$ passing through the origin where $a \in \mathbb{R}$.
4) Find a function whose zero set is a circle passing through all three points $(1, 2)$, $(1, 0)$ and $(3, 2)$.
5) Decide whether the points $(0, 3)$, $(2, 0)$ and $(4, 1)$ lie on a circle. If this is the case, find its center and radius.
6) Find functions whose zero sets are circles with center $(1, 2)$ that touch a) the $x$-axis, b) the $y$-axis.
7) Find the intersection S1 X S2 where
tpx, yq P R2 : 3 3x 2x2
S2 tpx, y q P R2 : 3 2x 3x2
S1
4y
0u ,
0u .
2y 2
4y 3y
2
8) Which of the following sets are spheres? Where this is the case, find center and radius.
a) tpx, y, zq P R3 : x2 y2 z2 2x 3y z 0u ,
b) tpx, y, z q P R3 : x2 y 2 z 2 4x 5 0u ,
c) tpx, y, z q P R3 : x2 y 2 z 2 6x y z 1 0u ,
d) tpx, y, z q P R3 : x2 y 2 z 2 x y z 1 0u ,
e) tpx, y, z q P R3 : x2 y 2 z 2 3x y 2z 21u ,
f) tpx, y, z q P R3 : x2 y 2 z 2 3x y 2z u ,
g) tpx, y, z q P R3 : 3px2 y 2 z 2 q 2x z 4u .
9) Find a function whose zero set is a sphere with center $(1, 1, 2)$ and radius 3. What is the intersection of the sphere and the $xz$-plane?
10) Find a function whose zero set is a sphere that passes through the point $(2, 3, 1)$ and is centered in $(3, 1, 1)$.
11) Find functions whose zero sets are spheres with center $(1, 3, 2)$ that touch a) the $xy$-plane, b) the $yz$-plane, c) the $xz$-plane.
12) Show that the spheres
tpx, y, zq P R3 : 49px2
S2 tpx, y, z q P R3 : 49px2
S1
y2
y
2
z 2 q 32x
z
2
q 20x
8y
26z
26y 10z
have only one point in common, and find its coordinates.
8 0u
2 0u
,
3.5.2 Vector Spaces
In the following, the notion of vectors will be introduced. In applications, a quantity which has magnitude, direction and a point of application is called vectorial. Physical examples are force, velocity, acceleration, momentum, angular momentum, torque and so forth. The definition of vectors below does not take into account a point of application. Therefore, a vectorial quantity in applications is a pair consisting of a point (of application) and a vector in the sense below. The same is also true for tangent vectors to surfaces defined in differential geometry, which are likewise attached to points of the surface. But, to simplify notation in applications, the point of application is often not indicated if it is clear from the context. This is often confusing for the beginner. An additional complication arises from the fact that vectors are often denoted by tuples of real numbers. In such cases, it can be concluded only from the context whether a given tuple refers to the coordinates of a point or to the components of a vector.
The following defines a vector as a set of parallel oriented line segments in $\mathbb{R}^n$ where $n \in \mathbb{N}^*$. Two points $p, q \in \mathbb{R}^n$ define a line segment $pq$ between $p$ and $q$. In addition, we can give such a line segment an orientation by saying it is originating in $p$ and ending in $q$. We denote the resulting oriented line segment by $\overrightarrow{pq}$. Note that $\overrightarrow{qp}$ is different from $\overrightarrow{pq}$, although the underlying line segments (which are subsets of $\mathbb{R}^n$) are identical. Alternatively, $\overrightarrow{pq}$ can be interpreted as the pair $(p, q) \in (\mathbb{R}^n)^2$. For the analytic description of parallelism, we use translations, $T_a : \mathbb{R}^n \to \mathbb{R}^n$, $a = (a_1, \dots, a_n) \in \mathbb{R}^n$, defined by
$$T_a(x) := (x_1 + a_1, \dots, x_n + a_n)$$
for every $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Note that $T_a$ ‘translates’ the coordinates of every point in $\mathbb{R}^n$ in the same way. Therefore $T_a$ also preserves the Euclidean distance between points:
$$e_n(x, y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2} = \sqrt{\sum_{j=1}^{n} (x_j + a_j - (y_j + a_j))^2} = e_n(T_a(x), T_a(y)) \tag{3.5.2}$$
for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ and $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Oriented line segments $\overrightarrow{pq}$ and $\overrightarrow{rs}$, where $p, q, r, s \in \mathbb{R}^n$, will be called equivalent if $r = T_a(p)$ and $s = T_a(q)$ for some $a \in \mathbb{R}^n$. If this is the case, we indicate this by $\overrightarrow{pq} \sim \overrightarrow{rs}$. Finally, below it will be shown that the relation $\sim$ has similar properties to ‘$=$’, and a vector will be defined as a set of equivalent oriented line segments.
Fig. 120: Line segment between the points $(1, 1)$ and $(2, 1)$ in the plane and images under the translations $T_a$ where $a = (2, 0), (0, 1), (2, 1)$.
Definition 3.5.6. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then we define
$$S := \{ \text{all oriented line segments } \overrightarrow{pq} \text{ from } p \text{ to } q : p, q \in \mathbb{R}^n \} .$$
Further, we define on $S$ the relation $\sim$ by
$$\overrightarrow{pq} \sim \overrightarrow{rs}$$
for $p, q, r, s \in \mathbb{R}^n$ if
$$r = T_a(p) \text{ and } s = T_a(q)$$
for some $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ where the translation $T_a : \mathbb{R}^n \to \mathbb{R}^n$ is defined by
$$T_a(x) := (x_1 + a_1, \dots, x_n + a_n)$$
for every $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Note that such $a$ is unique and, in particular, that
$$s = T_{(q_1 - p_1, \dots, q_n - p_n)}(r) .$$
Fig. 121: Equivalent oriented line segments. See Definition 3.5.6.
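Theorem 3.5.7. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then $\sim$ is an equivalence relation, i.e., it follows for all $p, q, r, s, t, u \in \mathbb{R}^n$ that
$$\overrightarrow{pq} \sim \overrightarrow{pq} \quad (\sim \text{ is reflexive}) ,$$
$$( \overrightarrow{pq} \sim \overrightarrow{rs} ) \Rightarrow ( \overrightarrow{rs} \sim \overrightarrow{pq} ) \quad (\sim \text{ is symmetric}) ,$$
$$( \overrightarrow{pq} \sim \overrightarrow{rs} ) \wedge ( \overrightarrow{rs} \sim \overrightarrow{tu} ) \Rightarrow ( \overrightarrow{pq} \sim \overrightarrow{tu} ) \quad (\sim \text{ is transitive}) .$$
Proof. $\sim$ is reflexive: For this, let $p, q \in \mathbb{R}^n$. Then $p = T_{(0, \dots, 0)}(p)$ and $q = T_{(0, \dots, 0)}(q)$ and hence $\overrightarrow{pq} \sim \overrightarrow{pq}$. $\sim$ is symmetric: For this, let $p, q, r, s \in \mathbb{R}^n$ and $\overrightarrow{pq} \sim \overrightarrow{rs}$. Then there is $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ such that
$$r = T_a(p) \text{ and } s = T_a(q) .$$
Hence
$$p = T_{(-a_1, \dots, -a_n)}(r) \text{ and } q = T_{(-a_1, \dots, -a_n)}(s)$$
and $\overrightarrow{rs} \sim \overrightarrow{pq}$. $\sim$ is transitive: For this, let $p, q, r, s, t, u \in \mathbb{R}^n$ and $\overrightarrow{pq} \sim \overrightarrow{rs}$ and $\overrightarrow{rs} \sim \overrightarrow{tu}$. Then there are $a = (a_1, \dots, a_n)$, $b = (b_1, \dots, b_n) \in \mathbb{R}^n$ such that
$$r = T_a(p) \text{ and } s = T_a(q)$$
as well as such that
$$t = T_b(r) \text{ and } u = T_b(s) .$$
Hence
$$t = T_{(a_1 + b_1, \dots, a_n + b_n)}(p) \text{ and } u = T_{(a_1 + b_1, \dots, a_n + b_n)}(q)$$
and $\overrightarrow{pq} \sim \overrightarrow{tu}$.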
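The equivalence of oriented line segments from Definition 3.5.6 can be tested mechanically, because the translation vector, if it exists, is determined by the initial points. The following Python sketch represents an oriented line segment as a pair of coordinate tuples; the names `translate` and `equivalent` are assumptions for this illustration.

```python
def translate(a, x):
    """The translation T_a applied to the point x."""
    return tuple(xj + aj for xj, aj in zip(x, a))

def equivalent(pq, rs):
    """True if the oriented segments pq = (p, q) and rs = (r, s) are equivalent."""
    (p, q), (r, s) = pq, rs
    a = tuple(rj - pj for rj, pj in zip(r, p))  # the only candidate with r = T_a(p)
    return translate(a, q) == s

pq = ((1, 1), (2, 1))
rs = ((3, 2), (4, 2))   # image of pq under T_a with a = (2, 1)
qp = ((2, 1), (1, 1))   # same underlying segment, opposite orientation
```

With integer coordinates the comparison is exact; for floating point coordinates a tolerance would be needed.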
In a first step, the following defines a vector as a set of equivalent oriented line segments. Every element of such a set is called a representative of the vector. Subsequently the addition, scalar multiplication, length and scalar product of vectors are defined. The addition of two vectors is defined as follows. First, we choose a representative of the first vector. Next, we choose the representative of the second vector whose initial point coincides with the end point of the first representative. Then the sum of the vectors is defined as the vector corresponding to the oriented line segment from the initial point of the first representative to the end point of the second representative. Scalar multiples of a vector are defined similarly. First, we choose the representative of the vector whose initial point coincides with the origin of the coordinate system. For $\lambda \in \mathbb{R}$, we define the $\lambda$-fold of the vector as the vector corresponding to the oriented line segment from the origin to the point which results from the endpoint of the representative by multiplication of each of its coordinates by $\lambda$. Subsequently, the length of a vector is defined as the Euclidean distance of the initial point and the endpoint of a representative. Geometrically, the scalar product of two vectors can be interpreted as the product of the length of the orthogonal projection of the representative of the first vector onto the representative of the second vector with the length of the second representative. In this, it is assumed that both representatives have the same initial point. Below, we use another equivalent and more convenient definition in terms of the law of cosines, namely as one half of the difference between the sum of the squares of the lengths of the two vectors and the square of the length of their difference. These definitions of vectors and of operations on vectors are very geometrical in nature, but lead to notations that are inconvenient for use in calculations. Fortunately, a more convenient notation, suitable for calculation, can be obtained from the observation that there is a natural bijection $\iota$ between $\mathbb{R}^n$ and the set of vectors. Indeed, every vector has a natural representative which has the origin as its starting point. By abuse of language, we will call such representatives position vectors. Hence, by associating to every point of $\mathbb{R}^n$ the vector whose position vector ends in that point, we achieve a bijection $\iota$ from $\mathbb{R}^n$ onto the set of vectors in part (vii) of the following definition. Subsequently, we define with the help of $\iota$ operations on $\mathbb{R}^n$ which correspond to the operations defined on vectors. In this way, the elements of $\mathbb{R}^n$ will in future also be regarded as position vectors, whereas so far the elements of $\mathbb{R}^n$ were only interpreted as $n$-tuples of coordinates that are associated to points with the help of a Cartesian coordinate system. It can be concluded only from the context of the problem whether a given tuple refers to the coordinates of a point or to the components of a position vector.
Fig. 122: Vector addition. See Definition 3.5.8 (iii).
Fig. 123: Scalar multiplication. See Definition 3.5.8 (iv).
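Definition 3.5.8. Let $n \in \mathbb{N} \setminus \{0, 1\}$. We define:
(i) For arbitrary $p, q \in \mathbb{R}^n$, the equivalence class $[\overrightarrow{pq}]$ corresponding to $\overrightarrow{pq}$ by
$$[\overrightarrow{pq}] := \{ \overrightarrow{rs} : \overrightarrow{rs} \sim \overrightarrow{pq} , \; r, s \in \mathbb{R}^n \} .$$
Every such equivalence class is called a vector.
(ii) The set of all vectors by
$$S/\!\sim \; := \{ [\overrightarrow{pq}] : p, q \in \mathbb{R}^n \} .$$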
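(iii) For arbitrary $p, q, r, s \in \mathbb{R}^n$, the sum $[\overrightarrow{pq}] + [\overrightarrow{rs}]$ of $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then there is a unique $v \in \mathbb{R}^n$ such that $\overrightarrow{uv} \in [\overrightarrow{rs}]$, and we define
$$[\overrightarrow{pq}] + [\overrightarrow{rs}] := [\overrightarrow{tv}] .$$
Note that every element of $[\overrightarrow{pq}]$ can be represented as
$$\overrightarrow{T_a(t)\, T_a(u)} \tag{3.5.3}$$
for some $a \in \mathbb{R}^n$. Then
$$\overrightarrow{T_a(u)\, T_a(v)} \in [\overrightarrow{rs}]$$
and
$$\overrightarrow{T_a(t)\, T_a(v)} \in [\overrightarrow{tv}] .$$
This shows that $[\overrightarrow{pq}] + [\overrightarrow{rs}]$ is well-defined.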
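(iv) For every $p, q \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$, the scalar multiple $\lambda . [\overrightarrow{pq}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then
$$\lambda . [\overrightarrow{pq}] := [\, \overrightarrow{t \, (t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n))} \,] .$$
Since every element of $[\overrightarrow{pq}]$ can be represented in the form of (3.5.3) for some $a \in \mathbb{R}^n$, it follows that
$$\overrightarrow{t_a \, (t_{a1} + \lambda (u_{a1} - t_{a1}), \dots, t_{an} + \lambda (u_{an} - t_{an}))} = \overrightarrow{T_a(t) \, T_a((t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n)))} \in [\, \overrightarrow{t \, (t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n))} \,]$$
where $t_a := T_a(t)$, $u_a := T_a(u)$. This shows that $\lambda . [\overrightarrow{pq}]$ is well-defined.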
(v) For arbitrary $p, q \in \mathbb{R}^n$, the length $|[\overrightarrow{pq}]|$ of $[\overrightarrow{pq}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then
$$|[\overrightarrow{pq}]| := \sqrt{(u_1 - t_1)^2 + \dots + (u_n - t_n)^2} ,$$
i.e., as the Euclidean distance of the points $t$ and $u$. Since every element of $[\overrightarrow{pq}]$ can be represented as (3.5.3) for some $a \in \mathbb{R}^n$, it follows that
$$\sqrt{(u_{a1} - t_{a1})^2 + \dots + (u_{an} - t_{an})^2} = \sqrt{(u_1 - t_1)^2 + \dots + (u_n - t_n)^2}$$
where $t_a := T_a(t)$, $u_a := T_a(u)$. This shows that $|[\overrightarrow{pq}]|$ is well-defined. Vectors of length one are called unit vectors.
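(vi) For arbitrary $p, q, r, s \in \mathbb{R}^n$ the scalar product (or dot product) $[\overrightarrow{pq}] \cdot [\overrightarrow{rs}]$ of $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ by
$$[\overrightarrow{pq}] \cdot [\overrightarrow{rs}] := \frac{1}{2} \left( |[\overrightarrow{pq}]|^2 + |[\overrightarrow{rs}]|^2 - |[\overrightarrow{pq}] - [\overrightarrow{rs}]|^2 \right) .$$
Note that according to the law of cosines
$$[\overrightarrow{pq}] \cdot [\overrightarrow{rs}] = |[\overrightarrow{pq}]| \, |[\overrightarrow{rs}]| \, \cos\!\left( \angle([\overrightarrow{pq}], [\overrightarrow{rs}]) \right) ,$$
if $|[\overrightarrow{pq}]| \neq 0$, $|[\overrightarrow{rs}]| \neq 0$ and where the angle $\angle([\overrightarrow{pq}], [\overrightarrow{rs}]) \in [0, \pi]$ between $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ is defined by the angle between representatives of both equivalence classes originating from the same point. Vectors with a vanishing scalar product are called orthogonal to each other.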
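(vii) The bijective map $\iota : \mathbb{R}^n \to S/\!\sim$ by
$$\iota(x) := [\overrightarrow{Ox}]$$
for all $x \in \mathbb{R}^n$ where $O$ denotes the origin defined by $O := (0, \dots, 0) \in \mathbb{R}^n$.
(viii) For arbitrary $x, y \in \mathbb{R}^n$ and arbitrary $\lambda \in \mathbb{R}$,
$$x + y := \iota^{-1}(\iota(x) + \iota(y)) , \quad \lambda . x := \iota^{-1}(\lambda . \iota(x)) , \quad |x| := |\iota(x)| , \quad x \cdot y := \iota(x) \cdot \iota(y) .$$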
The following theorem derives the properties of the operations of addition and scalar multiplication that are induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding operations for vectors.
Theorem 3.5.9. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then
(i)
$$x + y = (x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n)$$
and
$$a . x = a . (x_1, \dots, x_n) = (a x_1, \dots, a x_n)$$
for all $x, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$.
(ii) $(\mathbb{R}^n, +, .)$ is a real vector space with $0 := (0, \dots, 0)$ as neutral element and for each $x \in \mathbb{R}^n$ with $-x := (-x_1, \dots, -x_n)$ as corresponding inverse element, i.e., the following holds:
$$x + y = y + x \quad \text{(Addition is commutative)},$$
$$(x + y) + z = x + (y + z) \quad \text{(Addition is associative)},$$
$$x + 0 = x \quad \text{(0 is a neutral element)},$$
$$x + (-x) = 0 \quad \text{($-x$ is inverse to $x$)}$$
as well as
$$1 . x = x , \quad (ab) . x = a . (b . x) , \quad (a + b) . x = a . x + b . x , \quad a . (x + y) = a . x + a . y$$
for all $x, y, z \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$.
(iii) The sequence of $n$ vectors $e_1, \dots, e_n$, defined by
$$e_1 := (1, 0, \dots, 0) , \; \dots , \; e_n := (0, \dots, 0, 1) ,$$
is a basis of $\mathbb{R}^n$, i.e., for every $x \in \mathbb{R}^n$, we have
$$x = x_1 . e_1 + \dots + x_n . e_n$$
and the coefficients $x_1, \dots, x_n$ in this representation are uniquely determined. I.e., if
$$x = \bar{x}_1 . e_1 + \dots + \bar{x}_n . e_n$$
for some $\bar{x}_1, \dots, \bar{x}_n \in \mathbb{R}$, then
$$\bar{x}_1 = x_1 , \; \dots , \; \bar{x}_n = x_n .$$
The sequence $e_1, \dots, e_n$ is called the canonical basis of $\mathbb{R}^n$.
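Proof. ‘(i)’: For this, let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$. Then
$$x + y := \iota^{-1}(\iota(x) + \iota(y)) = \iota^{-1}\!\left( [\overrightarrow{Ox}] + [\overrightarrow{Oy}] \right) = \iota^{-1}\!\left( [\overrightarrow{Ox}] + [\overrightarrow{T_x(O)\, T_x(y)}] \right) = \iota^{-1}\!\left( [\overrightarrow{O\, T_x(y)}] \right) = \iota^{-1} \iota((x_1 + y_1, \dots, x_n + y_n)) = (x_1 + y_1, \dots, x_n + y_n)$$
and
$$\lambda . x := \iota^{-1}(\lambda . \iota(x)) = \iota^{-1}\!\left( \lambda . [\overrightarrow{Ox}] \right) = \iota^{-1}\!\left( [\overrightarrow{O\, (\lambda x_1, \dots, \lambda x_n)}] \right) = \iota^{-1} \iota((\lambda x_1, \dots, \lambda x_n)) = (\lambda x_1, \dots, \lambda x_n) .$$
(ii) and (iii) are trivial consequences of the definitions and the algebraic properties of the real numbers.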
The following theorem derives the properties of the notion of length that is induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding notion for vectors.
Theorem 3.5.10. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then
(i) for every $x \in \mathbb{R}^n$
$$|x| = \sqrt{x_1^2 + \dots + x_n^2} .$$
(ii) The absolute value satisfies the defining properties of a norm on $\mathbb{R}^n$, i.e.,
$$|x| \geq 0 \text{ and } |x| = 0 \text{ if and only if } x = 0 \quad \text{(Positive definiteness)},$$
$$|a . x| = |a| \, |x| \quad \text{(Homogeneity)},$$
$$|x + y| \leq |x| + |y| \quad \text{(Triangle inequality)}$$
for all $x, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$.
Proof. ‘(i)’: For this, let $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then
$$|x| := |\iota(x)| = |[\overrightarrow{Ox}]| = \sqrt{x_1^2 + \dots + x_n^2} .$$
‘(ii)’: The positive definiteness and homogeneity of the absolute value are straightforward consequences of the definitions. The triangle inequality follows from the corresponding property of the metric $e_n$. For this, let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then
$$|x + y| = e_n(O, x + y) \leq e_n(O, x) + e_n(x, x + y) = |x| + |y| .$$
The following theorem derives the properties of the scalar product that is induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding product for vectors.
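Theorem 3.5.11. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then
(i) for all $x, y \in \mathbb{R}^n$:
$$x \cdot y = \sum_{k=1}^{n} x_k y_k .$$
(ii) This product satisfies the defining properties of a scalar product on a real vector space, i.e.,
$$x \cdot y = y \cdot x \quad \text{(Symmetry)},$$
$$(x + y) \cdot z = x \cdot z + y \cdot z \quad \text{(Additivity in the first variable)},$$
$$(a . x) \cdot y = a \, (x \cdot y) \quad \text{(Homogeneity in the first variable)},$$
$$x \cdot x \geq 0 \text{ and } x \cdot x = 0 \text{ if and only if } x = 0 \quad \text{(Positive definiteness)}$$
for all $x, y, z \in \mathbb{R}^n$ and $a \in \mathbb{R}$. As a consequence, it satisfies the important Cauchy-Schwarz inequality
$$|x \cdot y| \leq |x| \, |y| \tag{3.5.4}$$
for all $x, y \in \mathbb{R}^n$. In particular, in the case that $y \neq 0$, equality holds in (3.5.4) if and only if
$$x = \frac{x \cdot y}{|y|^2} . y . \tag{3.5.5}$$
Proof. ‘(i)’: For this, let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then
$$x \cdot y := \iota(x) \cdot \iota(y) = \frac{1}{2} \left( |[\overrightarrow{Ox}]|^2 + |[\overrightarrow{Oy}]|^2 - |[\overrightarrow{Ox}] - [\overrightarrow{Oy}]|^2 \right) = \frac{1}{2} \left( |x|^2 + |y|^2 - |x - y|^2 \right) = \frac{1}{2} \sum_{k=1}^{n} \left[ x_k^2 + y_k^2 - (x_k - y_k)^2 \right] = \sum_{k=1}^{n} x_k y_k .$$
‘(ii)’: The symmetry, additivity, homogeneity in the first variable, and positive definiteness of the dot product are obvious. Let $a := |y|^2$ and $b := x \cdot y$. For the case $y = 0$, inequality (3.5.4) is trivially satisfied. If $y \neq 0$, then $|y| > 0$ and
$$0 \leq (a . x - b . y) \cdot (a . x - b . y) = a^2 |x|^2 - 2ab \, (x \cdot y) + b^2 |y|^2 = |y|^4 |x|^2 - 2 |y|^2 (x \cdot y)^2 + (x \cdot y)^2 |y|^2 = |y|^2 \left[ \, |x|^2 |y|^2 - (x \cdot y)^2 \, \right] ,$$
and hence (3.5.4) follows. In particular, equality holds in (3.5.4) if and only if (3.5.5) is true.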
Note that the basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ is in particular orthonormal with respect to the Euclidean scalar product, i.e., it satisfies
$$e_i \cdot e_j = 0 \text{ if } i \neq j \quad \text{(Orthogonality)}$$
and
$$e_i \cdot e_j = 1 \text{ if } i = j \quad \text{(Normalization)} \tag{3.5.6}$$
for all $i, j \in \{1, \dots, n\}$.
We continue the section with applications of vectors. Part (i) of the following theorem rephrases the Pythagorean theorem in terms of vectors. Part (ii) shows that the minimal distance of a given point $y \in \mathbb{R}^n$ from
$$\{ \lambda . x : \lambda \in \mathbb{R} \} ,$$
where $x$ is some non-zero vector, is attained at the orthogonal projection of the position vector $y$ onto the direction of $x$.
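Theorem 3.5.12. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then
(i)
$$|x + y|^2 = |x|^2 + |y|^2$$
for all orthogonal $x, y \in \mathbb{R}^n$.
(ii) For every $x \in \mathbb{R}^n \setminus \{0\}$ and every $y \in \mathbb{R}^n$, there is a unique vector $P_x(y) \in \mathbb{R}^n$ such that
$$|y - P_x(y)| = \min \{ |y - \lambda . x| : \lambda \in \mathbb{R} \} .$$
$P_x(y)$ is called the orthogonal projection of $y$ onto the direction of $x$. In particular, it is given by
$$P_x(y) := \frac{y \cdot x}{|x|^2} . x ,$$
$y - P_x(y)$ is orthogonal to $x$ and
$$|y - P_x(y)| = \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - (x \cdot y)^2} .$$
Fig. 124: Orthogonal projections $y'$, $z'$ of $y$ and $z$, respectively, onto the direction of $x$.
Proof. ‘(i)’: Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ be orthogonal. Then $x \cdot y = 0$ and hence
$$|x + y|^2 = \sum_{k=1}^{n} (x_k + y_k)^2 = \sum_{k=1}^{n} x_k^2 + \sum_{k=1}^{n} y_k^2 + 2 \sum_{k=1}^{n} x_k y_k = |x|^2 + |y|^2 .$$
‘(ii)’: First, it follows that
$$\left( y - \frac{y \cdot x}{|x|^2} . x \right) \cdot x = y \cdot x - \frac{(y \cdot x)(x \cdot x)}{|x|^2} = 0$$
and hence by (i) for every $\lambda \in \mathbb{R}$ that
$$|y - \lambda . x|^2 = \left| \, y - \frac{y \cdot x}{|x|^2} . x + \left( \frac{y \cdot x}{|x|^2} - \lambda \right) . x \, \right|^2 = \left| \, y - \frac{y \cdot x}{|x|^2} . x \, \right|^2 + \left( \frac{y \cdot x}{|x|^2} - \lambda \right)^2 |x|^2 .$$
Hence
$$|y - \lambda . x| \geq \left| \, y - \frac{y \cdot x}{|x|^2} . x \, \right|$$
for all $\lambda \in \mathbb{R}$ and
$$|y - \lambda . x| = \left| \, y - \frac{y \cdot x}{|x|^2} . x \, \right|$$
if and only if
$$\lambda = \frac{y \cdot x}{|x|^2} .$$
Finally,
$$\left| \, y - \frac{y \cdot x}{|x|^2} . x \, \right| = \sqrt{|y|^2 - \frac{(x \cdot y)^2}{|x|^2}} = \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - (x \cdot y)^2} .$$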
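The orthogonal projection of Theorem 3.5.12 (ii) can be computed with a few lines of Python. The following sketch is an illustration only; the names `dot`, `project` and `distance_to_line` and the sample vectors are assumptions.

```python
import math

def dot(x, y):
    """Scalar product of two tuples of equal length."""
    return sum(xk * yk for xk, yk in zip(x, y))

def project(y, x):
    """Orthogonal projection P_x(y) of y onto the direction of a non-zero x."""
    c = dot(y, x) / dot(x, x)
    return tuple(c * xk for xk in x)

def distance_to_line(y, x):
    """Minimal distance of y from the set of all multiples of x, by the closed formula."""
    return math.sqrt(dot(x, x) * dot(y, y) - dot(x, y) ** 2) / math.sqrt(dot(x, x))

x, y = (2.0, 0.0), (3.0, 4.0)
p = project(y, x)
residual = tuple(yk - pk for yk, pk in zip(y, p))   # y - P_x(y)
```

For the sample vectors, `residual` is orthogonal to `x`, and its length agrees with the value returned by `distance_to_line`.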
From elementary geometry, it is known that the area of a plane parallelogram is given by the product of the length of one of its sides and the length
of the corresponding height. The following application of the vector methods allows an often simpler calculation of that area if the location of its
corners is known with respect to a Cartesian coordinate system.
Example 3.5.13. Let $n \in \mathbb{N} \setminus \{0, 1\}$ and $Opqr$ be a parallelogram where $p, q, r$ are points in $\mathbb{R}^n$. Show that its area $A$ is given by
$$A = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = |a| \, |b| \sin(\alpha)$$
where $a = [\overrightarrow{Op}]$ and $b = [\overrightarrow{Or}]$ and $\alpha := \angle(a, b) \in [0, \pi]$. Solution: See Fig. 125. If $a$ and $b$ are multiples of one another, $A$ vanishes. This is consistent with the above formula since in that case $|a \cdot b| = |a| \, |b|$ according to Theorem 3.5.11. In the remaining cases let $P(b)$ denote the orthogonal projection of $b$ onto the direction of $a$. The areas of the triangles $Orr'$ and $pqq'$ are identical. Hence the area of $Opqr$ is given by
$$A = |a| \, |b - P(b)| = |a| \left| \, b - \frac{a \cdot b}{|a|^2} . a \, \right| = |a| \, \frac{1}{|a|} \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} .$$
Note that $|a|^2 |b|^2 - (a \cdot b)^2 > 0$ according to the Cauchy-Schwarz inequality of Theorem 3.5.11. Further, since
$$a \cdot b = |a| \, |b| \cos(\alpha) ,$$
it follows that
$$A = \sqrt{|a|^2 |b|^2 (1 - \cos^2(\alpha))} = |a| \, |b| \sin(\alpha) .$$
Fig. 125: Calculation of the area of the parallelogram $Opqr$. See Example 3.5.13.
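The area formula of Example 3.5.13 needs only scalar products of the side vectors and is therefore easy to evaluate in any dimension. A minimal Python sketch follows; the function name `parallelogram_area` and the sample vectors are hypothetical.

```python
import math

def dot(x, y):
    """Scalar product of two tuples of equal length."""
    return sum(xk * yk for xk, yk in zip(x, y))

def parallelogram_area(a, b):
    """Area of the parallelogram spanned by a and b: sqrt(|a|^2 |b|^2 - (a.b)^2)."""
    # max(..., 0.0) guards against tiny negative values caused by rounding.
    return math.sqrt(max(dot(a, a) * dot(b, b) - dot(a, b) ** 2, 0.0))

# Base of length 2 along the first axis and height 3 give the area 6.
area = parallelogram_area((2.0, 0.0, 0.0), (1.0, 3.0, 0.0))
# Multiples of one another span a degenerate parallelogram of area 0.
degenerate = parallelogram_area((1.0, 2.0), (2.0, 4.0))
```

The first call reproduces the elementary rule "base times height" for a case where both are evident from the coordinates.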
On $\mathbb{R}^3$, it is possible to define a product that associates to every pair of vectors another vector and that shares some of the properties of multiplication of real numbers. That product is also important for applications, e.g., in electrodynamics.
In the following, we motivate the definition of that vector product $a \times b$ for $a = (a_1, a_2, a_3)$, $b = (b_1, b_2, b_3) \in \mathbb{R}^3$ which are assumed not to be multiples of each other. Natural candidates for the definition of $a \times b$ are vectors that are at the same time orthogonal to $a$ and $b$. Orthogonal vectors to $a$ are given by
$$\alpha . (-a_2, a_1, 0) + \beta . (0, -a_3, a_2)$$
where $\alpha, \beta \in \mathbb{R}$. In this, we assume in addition that $a_2 \neq 0$ which excludes that $(-a_2, a_1, 0)$ and $(0, -a_3, a_2)$ are multiples of each other. The condition
$$b \cdot [\, \alpha . (-a_2, a_1, 0) + \beta . (0, -a_3, a_2) \,] = 0$$
leads to
$$(a_1 b_2 - a_2 b_1) \, \alpha + (a_2 b_3 - a_3 b_2) \, \beta = 0$$
which is satisfied if
$$\alpha = -\frac{\gamma}{a_2} \, (a_2 b_3 - a_3 b_2) , \quad \beta = \frac{\gamma}{a_2} \, (a_1 b_2 - a_2 b_1)$$
for some $\gamma \in \mathbb{R}$. Then
$$\alpha . (-a_2, a_1, 0) + \beta . (0, -a_3, a_2) = \gamma . (a_2 b_3 - a_3 b_2 , \, a_3 b_1 - a_1 b_3 , \, a_1 b_2 - a_2 b_1) .$$
To restrict the final parameter $\gamma$, we calculate the square of the norm of this vector for $\gamma = 1$. In this, we drop all the additional restricting assumptions on $a, b \in \mathbb{R}^3$ made above. This gives
$$(a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 = a_2^2 b_3^2 + a_3^2 b_2^2 - 2 a_2 b_2 a_3 b_3 + a_3^2 b_1^2 + a_1^2 b_3^2 - 2 a_1 b_1 a_3 b_3 + a_1^2 b_2^2 + a_2^2 b_1^2 - 2 a_1 b_1 a_2 b_2$$
$$= |a|^2 |b|^2 - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2 = |a|^2 |b|^2 - (a \cdot b)^2$$
which according to Example 3.5.13 is the square of the area of the parallelogram determined by $a, b$ if $a, b \in \mathbb{R}^3 \setminus \{0\}$. This suggests the following definition.
Fig. 126: Vector product of two vectors $a$ and $b$.
Definition 3.5.14. For all $a, b \in \mathbb{R}^3$, we define the corresponding product $a \times b \in \mathbb{R}^3$ by
$$a \times b := (a_2 b_3 - a_3 b_2 , \, a_3 b_1 - a_1 b_3 , \, a_1 b_2 - a_2 b_1) .$$
A simple calculation shows that
$$a \cdot (a \times b) = b \cdot (a \times b) = 0 ,$$
i.e., that $a$ and $b$ are both orthogonal to $a \times b$, and by the foregoing that
$$|a \times b| = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = |a| \, |b| \sin(\alpha) ,$$
where $\alpha := \angle(a, b) \in [0, \pi]$, which according to Example 3.5.13 is the area of the parallelogram determined by $a, b$.
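The componentwise formula of Definition 3.5.14 can be written down directly in Python. The following sketch checks the orthogonality relations and the formula for the length on one pair of sample vectors; the names `dot` and `cross` and the sample vectors are assumptions for illustration, and note that tuple indices start at 0 while the text counts components from 1.

```python
def dot(a, b):
    """Scalar product of two tuples of equal length."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product a x b of two vectors in R^3, following Definition 3.5.14."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b = (1, 2, 3), (4, 5, 6)
n = cross(a, b)

# Square of the length of a x b and the expression |a|^2 |b|^2 - (a.b)^2.
length_squared = dot(n, n)
area_squared = dot(a, a) * dot(b, b) - dot(a, b) ** 2
```

With integer components all comparisons are exact, so the two squared quantities can be compared without a tolerance.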
Remark 3.5.15. Let $a, b \in \mathbb{R}^3 \setminus \{0\}$. Then it follows by Example 3.5.13 that $a$ and $b$ are parallel if and only if
$$a \times b = 0 .$$
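The vector product satisfies the following rules that are frequently applied, e.g., in electrodynamics.
Theorem 3.5.16. Let $a, b, c, d \in \mathbb{R}^3$ and $\lambda \in \mathbb{R}$. Then
(i) $e_1 \times e_2 = e_3$, $e_3 \times e_1 = e_2$, $e_2 \times e_3 = e_1$,
(ii) $a \times b = -\, b \times a$,
(iii) $(\lambda . a) \times b = \lambda . (a \times b)$,
(iv) $(a + b) \times c = a \times c + b \times c$,
(v) $a \cdot (b \times c) = c \cdot (a \times b)$,
(vi) $a \times (b \times c) = (a \cdot c) . b - (a \cdot b) . c$,
(vii) $(a \times b) \cdot (c \times d) = (a \cdot c)(b \cdot d) - (a \cdot d)(b \cdot c)$.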
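Proof. The relations (i) to (iv) are obvious. ‘(v)’:
\begin{align*}
a \cdot (b \times c) &= a \cdot (b_2 c_3 - b_3 c_2, b_3 c_1 - b_1 c_3, b_1 c_2 - b_2 c_1) \\
&= a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1 \\
&= c_1 (a_2 b_3 - a_3 b_2) + c_2 (a_3 b_1 - a_1 b_3) + c_3 (a_1 b_2 - a_2 b_1) \\
&= c \cdot (a \times b) .
\end{align*}
‘(vi)’:
\begin{align*}
a \times (b \times c) &= (a_2 (b \times c)_3 - a_3 (b \times c)_2, \, a_3 (b \times c)_1 - a_1 (b \times c)_3, \, a_1 (b \times c)_2 - a_2 (b \times c)_1) \\
&= (a_2 (b_1 c_2 - b_2 c_1) - a_3 (b_3 c_1 - b_1 c_3), \, a_3 (b_2 c_3 - b_3 c_2) - a_1 (b_1 c_2 - b_2 c_1), \\
&\qquad a_1 (b_3 c_1 - b_1 c_3) - a_2 (b_2 c_3 - b_3 c_2)) \\
&= (a_2 c_2 b_1 + a_3 c_3 b_1 - a_2 b_2 c_1 - a_3 b_3 c_1, \, a_3 c_3 b_2 + a_1 c_1 b_2 - a_3 b_3 c_2 - a_1 b_1 c_2, \\
&\qquad a_1 c_1 b_3 + a_2 c_2 b_3 - a_1 b_1 c_3 - a_2 b_2 c_3) \\
&= (a_1 c_1 b_1 + a_2 c_2 b_1 + a_3 c_3 b_1 - a_1 b_1 c_1 - a_2 b_2 c_1 - a_3 b_3 c_1, \\
&\qquad a_3 c_3 b_2 + a_1 c_1 b_2 + a_2 c_2 b_2 - a_2 b_2 c_2 - a_3 b_3 c_2 - a_1 b_1 c_2, \\
&\qquad a_1 c_1 b_3 + a_2 c_2 b_3 + a_3 c_3 b_3 - a_3 b_3 c_3 - a_1 b_1 c_3 - a_2 b_2 c_3) \\
&= (a \cdot c) . b - (a \cdot b) . c .
\end{align*}
‘(vii)’:
\begin{align*}
(a \times b) \cdot (c \times d) &= (a_2 b_3 - a_3 b_2, a_3 b_1 - a_1 b_3, a_1 b_2 - a_2 b_1) \cdot (c_2 d_3 - c_3 d_2, c_3 d_1 - c_1 d_3, c_1 d_2 - c_2 d_1) \\
&= a_2 b_3 c_2 d_3 + a_3 b_2 c_3 d_2 - a_2 b_3 c_3 d_2 - a_3 b_2 c_2 d_3 \\
&\quad + a_3 b_1 c_3 d_1 + a_1 b_3 c_1 d_3 - a_3 b_1 c_1 d_3 - a_1 b_3 c_3 d_1 \\
&\quad + a_1 b_2 c_1 d_2 + a_2 b_1 c_2 d_1 - a_1 b_2 c_2 d_1 - a_2 b_1 c_1 d_2 \\
&= a_2 c_2 b_3 d_3 + a_3 c_3 b_2 d_2 + a_3 c_3 b_1 d_1 + a_1 c_1 b_3 d_3 + a_1 c_1 b_2 d_2 + a_2 c_2 b_1 d_1 \\
&\quad - (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \\
&= (a \cdot c)(b \cdot d) - (a_1 c_1 b_1 d_1 + a_2 c_2 b_2 d_2 + a_3 c_3 b_3 d_3) \\
&\quad - (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \\
&= (a \cdot c)(b \cdot d) - (a_1 d_1 b_1 c_1 + a_2 d_2 b_2 c_2 + a_3 d_3 b_3 c_3) \\
&\quad - (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \\
&= (a \cdot c)(b \cdot d) - (a \cdot d)(b \cdot c) .
\end{align*}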
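The rules (v) to (vii) of Theorem 3.5.16 can be spot-checked with exact integer arithmetic. The sketch below is only a numerical sanity check for one choice of vectors and does not replace the proof. The function `check_rules` and the helpers are hypothetical names introduced for illustration.

```python
def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(lam, a):
    """Scalar multiple lam.a."""
    return tuple(lam * x for x in a)

def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def check_rules(a, b, c, d):
    """Return the truth values of rules (v), (vi) and (vii) for the given vectors."""
    rule_v = dot(a, cross(b, c)) == dot(c, cross(a, b))
    rule_vi = cross(a, cross(b, c)) == sub(scale(dot(a, c), b), scale(dot(a, b), c))
    rule_vii = (dot(cross(a, b), cross(c, d))
                == dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c))
    return rule_v, rule_vi, rule_vii

# Integer sample vectors keep all comparisons exact.
print(check_rules((1, 2, 3), (-1, 0, 4), (2, -5, 1), (3, 1, -2)))
```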
From elementary geometry, it is known that the volume of a parallelepiped in
space is given by the product of the area of one of its bases and the length of
the corresponding height. As another application of the vector product, the
following allows an often simpler calculation of that volume if the locations
of its corners are known with respect to a Cartesian coordinate system.
Example 3.5.17. Show that the volume $V$ of the parallelepiped with sides $a, b, c \in \mathbb{R}^3$ is given by the absolute value of the scalar triple product $a \cdot (b \times c)$:
\[ V = | a \cdot (b \times c) | \quad ( \, = | c \cdot (a \times b) | = | b \cdot (c \times a) | \, ) . \]
Fig. 127: The volume of a parallelepiped with sides a, b and c is given by $|a \cdot (b \times c)|$.
Solution: The volume is equal to the length of the orthogonal projection $P(c)$ of $c$ onto the direction of $a \times b$ times the area of the parallelogram with sides $a, b$. Compare Fig 127. Hence
\[ V = \left| \, c \cdot \frac{a \times b}{|a \times b|} \, \right| \cdot |a \times b| = | c \cdot (a \times b) | \]
where it is assumed that $a$ and $b$ are not scalar multiples of each other. Otherwise, $V$ vanishes, which is consistent with the above formula since in that case also $a \times b = 0$.
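As a small numerical illustration of Example 3.5.17, the following Python sketch computes the volume as the absolute value of the scalar triple product. The function name `volume` and the sample parallelepiped (base area 2, height 3) are assumptions made for this illustration.

```python
def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def volume(a, b, c):
    """Volume of the parallelepiped with sides a, b, c as |a . (b x c)|."""
    return abs(dot(a, cross(b, c)))

# Base spanned by a and b lies in the plane z = 0 and has area 2; c reaches height 3.
a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)
print(volume(a, b, c))                      # base area times height
print(volume(c, a, b), volume(b, c, a))     # the cyclic variants agree
```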
Determinants appear naturally in the representation of solutions of systems
of linear equations, i.e., systems of equations in which the unknowns occur only
to the first power, see Problem 14. This was recognized as early as
1693 by Leibniz in a letter to L’Hospital where such systems were studied
that had parametric coefficients instead of explicitly given numbers. Today,
the corresponding rule of solving such systems in terms of determinants is
called Cramer’s rule after Gabriel Cramer who published this rule in 1750 in
a textbook [27]. The same rule was also published posthumously in 1748 in
[74] two years after Colin Maclaurin’s death. Among others, determinants
provide a simple way to determine areas of parallelograms in $\mathbb{R}^2$
and volumes of parallelepipeds in $\mathbb{R}^3$ if the locations of their corners are
known with respect to a Cartesian coordinate system. In addition, they
generalize the notions of areas and volumes to higher dimensions.
Definition 3.5.18. (The determinant function) Let $n \in \mathbb{N}^*$.
(i) For every $(k_1, \dots, k_n) \in \mathbb{Z}^n$, we define
\[ s(k_1, \dots, k_n) := \prod_{i,j = 1, \, i < j}^{n} \operatorname{sgn}(k_j - k_i) \]
where the signum function $\operatorname{sgn} : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \operatorname{sgn}(x) := \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 . \end{cases} \]
In this, we use the convention that the empty product is equal to 1. Note that this definition implies that $s(k_1, \dots, k_n) = 0$ if the ordered sequence $(k_1, \dots, k_n)$ contains two equal integers. Also, note for the case of pairwise different integers $k_1, \dots, k_n$ that $s(k_1, \dots, k_n) = 1$ if the number of pairs $(i, j) \in \{1, \dots, n\}^2$ such that $i < j$ and $k_j < k_i$ is even, whereas $s(k_1, \dots, k_n) = -1$ if that number is odd.
(ii) For every ordered $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, we define a corresponding determinant
\[ \det(a_1, \dots, a_n) := \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} := \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1 k_1} \cdots a_{n k_n} . \]
Note that according to the remark in (i), the sum in this definition has only to be taken over $n$-tuples $k_1, \dots, k_n$ which are permutations of $1, \dots, n$.
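Further, note that this definition leads in the case $n = 1$ to
\[ \det(a_1) = a_1 \]
and in the case $n = 2$ to
\[ \det(a, b) = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1 \]
for every $(a, b)$ such that $a, b \in \mathbb{R}^2$. Note that in the last case
\[ [ \det(a, b) ]^2 = |a|^2 |b|^2 - (a \cdot b)^2 . \]
Hence the area $A$ of the parallelogram with sides $a$ and $b$ is given by
\[ A = | \det(a, b) | . \]
Finally, in the case $n = 3$, this definition leads to
\begin{align*}
\det(a, b, c) &= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \tag{3.5.7} \\
&= a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1 = a \cdot (b \times c) \\
&= a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - (a_3 b_2 c_1 + a_1 b_3 c_2 + a_2 b_1 c_3)
\end{align*}
for every $(a, b, c)$ such that $a, b, c \in \mathbb{R}^3$. Note that
\[ \det(a, b, c) = a \cdot (b \times c) . \]
Hence the volume $V$ of the parallelepiped with sides $a$, $b$ and $c$ is given by
\[ V = | \det(a, b, c) | . \]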
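The sum in Definition 3.5.18 can be evaluated literally for small $n$. The Python sketch below does so, summing only over permutations because all other index tuples contribute zero. It uses 0-based indices instead of the indices $1, \dots, n$ of the text, and the names `sgn`, `s` and `det` are chosen for this illustration. The cost grows like $n!$, so this is a check of the definition and not a practical algorithm.

```python
from itertools import permutations
from math import prod

def sgn(x):
    """Signum function: 1, 0 or -1."""
    return (x > 0) - (x < 0)

def s(ks):
    """Product of sgn(k_j - k_i) over all index pairs i < j (empty product = 1)."""
    n = len(ks)
    return prod(sgn(ks[j] - ks[i]) for i in range(n) for j in range(i + 1, n))

def det(rows):
    """Determinant of the n-tuple `rows` of n-tuples, summed over permutations."""
    n = len(rows)
    total = 0
    for ks in permutations(range(n)):
        term = s(ks)
        for i, k in enumerate(ks):
            term *= rows[i][k]
        total += term
    return total

print(det([(5,)]))                               # case n = 1
print(det([(1, 2), (3, 4)]))                     # a1*b2 - a2*b1
print(det([(1, 0, 0), (1, 2, 0), (1, 1, 3)]))    # equals a . (b x c)
```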
Fig. 128: Line corresponding to x0 and v, see Definition 3.5.20.
Remark 3.5.19. Formally, for ease of remembrance, we can write
\[ a \times b = \begin{vmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]
thereby utilizing the representation of the determinant in terms of minors given in (3.5.7).
The use of the interpretation of the elements of $\mathbb{R}^n$, $n \in \mathbb{N} \setminus \{0, 1\}$, as position vectors starting from the origin of a Cartesian coordinate system allows the following transparent definitions of lines and planes.
Definition 3.5.20. (Lines and Planes) Let $n \in \mathbb{N} \setminus \{0, 1\}$ and $x_0, v \in \mathbb{R}^n$.
(i) We define a corresponding line by
\[ \{ x_0 + t . v \in \mathbb{R}^n : t \in \mathbb{R} \} . \]
(ii) In addition, let $n = 3$, $x_0 = (x_{01}, x_{02}, x_{03})$ and $w \in \mathbb{R}^n$ be such that $v, w$ are not multiples of one another. Then we define a plane corresponding to $x_0, v, w$ by
\[ \{ x_0 + t . v + s . w \in \mathbb{R}^n : t, s \in \mathbb{R} \} . \]
Note that this set is equal to
\begin{align*}
& \{ x \in \mathbb{R}^n : n \cdot (x - x_0) = 0 \} \\
& = \{ x \in \mathbb{R}^n : n_1 x_1 + n_2 x_2 + n_3 x_3 - (n_1 x_{01} + n_2 x_{02} + n_3 x_{03}) = 0 \}
\end{align*}
where $n = (n_1, n_2, n_3)$ is some normal vector to $v$ and $w$, i.e., some non-trivial multiple of $v \times w$.
Fig. 129: Plane corresponding to x0, v and w, see Definition 3.5.20.
Example 3.5.21. Calculate the distance $d$ between the two lines
\[ L_1 := \{ (4, 3, 1) + t . (1, 1, 2) \in \mathbb{R}^3 : t \in \mathbb{R} \} \]
and
\[ L_2 := \{ (1, 0, -3) + t . (1, 1, 2) \in \mathbb{R}^3 : t \in \mathbb{R} \} . \]
Solution: $L_1$ and $L_2$ are parallel. Therefore $d$ is given by the distance of some point on $L_1$, like $(4, 3, 1)$, from $L_2$, which by Theorem 3.5.12 is given by the length of (note that $(4, 3, 1) - (1, 0, -3) = (3, 3, 4)$)
\[ (3, 3, 4) - \frac{1}{6} \, [ (3, 3, 4) \cdot (1, 1, 2) ] . (1, 1, 2) = \frac{2}{3} . (1, 1, -1) . \]
Hence $d = 2 \sqrt{3} / 3$.
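The projection argument of Example 3.5.21 can be reproduced numerically. The sketch below assumes the reading $L_1 = \{(4,3,1) + t.(1,1,2)\}$ and $L_2 = \{(1,0,-3) + t.(1,1,2)\}$, for which the difference of the base points is $(3, 3, 4)$ as stated in the example. The function name `dist_point_line` is an assumption made for illustration.

```python
import math

def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def dist_point_line(p, x0, v):
    """Distance of the point p from the line {x0 + t.v : t real}, v nonzero."""
    u = tuple(pi - xi for pi, xi in zip(p, x0))
    t = dot(u, v) / dot(v, v)                     # coefficient of the projection onto v
    r = tuple(ui - t * vi for ui, vi in zip(u, v))  # component of u orthogonal to v
    return math.sqrt(dot(r, r))

# Point on L1, base point and direction of L2 (assumed reading of the example).
d = dist_point_line((4, 3, 1), (1, 0, -3), (1, 1, 2))
print(d, 2 * math.sqrt(3) / 3)
```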
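Example 3.5.22. Let $x_0, v, w \in \mathbb{R}^3$ such that $v, w$ are not multiples of one another. Finally, let $E$ be the corresponding plane and $u \in \mathbb{R}^3$. Show that the distance $d(u, E)$ of $u$ from $E$ is given by
\[ d(u, E) = \frac{| (u - x_0) \cdot (v \times w) |}{| v \times w |} . \]
Solution: For this, let
\[ n := \frac{1}{| v \times w |} . (v \times w) , \quad u_0 := u - x_0 . \]
Then it follows for every $t, s \in \mathbb{R}$ that
\begin{align*}
| u_0 - t . v - s . w | &= | (u_0 \cdot n) . n + u_0 - (u_0 \cdot n) . n - t . v - s . w | \\
&\geq | (u_0 \cdot n) . n | = | u_0 \cdot n | .
\end{align*}
Also because $u_0 - (u_0 \cdot n) . n$ is normal to $n$, it follows that there are $t, s \in \mathbb{R}$ such that $t . v + s . w = u_0 - (u_0 \cdot n) . n$. Hence
\[ d(u, E) := \min \{ | u - x | : x \in E \} = | u_0 \cdot n | . \]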
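The distance formula of Example 3.5.22 is easy to evaluate. The following Python sketch implements it for a plane given by a base point and two spanning vectors. The function name `dist_point_plane` and the test plane (the horizontal plane at height 1) are assumptions made for this illustration.

```python
import math

def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dist_point_plane(u, x0, v, w):
    """Distance of u from the plane {x0 + t.v + s.w}; v, w must not be multiples."""
    n = cross(v, w)
    norm = math.sqrt(dot(n, n))
    if norm == 0.0:
        raise ValueError("v and w are multiples of one another")
    u0 = tuple(ui - xi for ui, xi in zip(u, x0))
    return abs(dot(u0, n)) / norm

# Plane z = 1, spanned by the first two unit vectors; the point has height 5.
print(dist_point_plane((2, 3, 5), (0, 0, 1), (1, 0, 0), (0, 1, 0)))
```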
Example 3.5.23. Find the distance d between the planes x y 3z 1
and 2x 2y 6z 0.5. Solution: Since the normals n1 p1, 1, 3q
and n2 p2, 2, 6q are multiples of each other, these planes are parallel.
Hence d is given by the distance of some point on the first plane, like (0, 0, 1/3), from
the second plane. Therefore
d
?1 p1, 1, 3q rp0, 0, 1{3q p0, 0, 1{12qs 5
44
11
?
11 .
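Example 3.5.24. Let $x_{01}, x_{02}, v, w \in \mathbb{R}^3$ and in particular $v, w$ be no multiples of one another. Calculate the distance $d$ of the (‘skew’) lines
\[ L_1 := \{ x_{01} + t . v \in \mathbb{R}^3 : t \in \mathbb{R} \} \]
and
\[ L_2 := \{ x_{02} + s . w \in \mathbb{R}^3 : s \in \mathbb{R} \} . \]
Solution: For all $t, s \in \mathbb{R}$,
\[ | x_{01} + t . v - (x_{02} + s . w) | = | x_{01} - x_{02} + t . v + (-s) . w | . \]
Hence $d$ is equal to the distance of the plane
\[ \{ x_{01} - x_{02} + t . v + s . w \in \mathbb{R}^3 : t, s \in \mathbb{R} \} \]
from the origin, i.e., by
\[ d = \frac{| (x_{01} - x_{02}) \cdot (v \times w) |}{| v \times w |} . \]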
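The formula for the distance of two skew lines from Example 3.5.24 can be sketched in a few lines of Python. The function name `dist_skew_lines` and the sample lines (the x-axis and a line parallel to the y-axis at height 2) are assumptions for this illustration.

```python
import math

def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dist_skew_lines(x01, v, x02, w):
    """Distance of {x01 + t.v} and {x02 + s.w}; v, w must not be multiples."""
    n = cross(v, w)
    norm = math.sqrt(dot(n, n))
    if norm == 0.0:
        raise ValueError("directions are multiples of one another")
    diff = tuple(p - q for p, q in zip(x01, x02))
    return abs(dot(diff, n)) / norm

# x-axis versus the line through (0, 0, 2) in direction of the y-axis.
print(dist_skew_lines((0, 0, 0), (1, 0, 0), (0, 0, 2), (0, 1, 0)))
```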
Problems
1) Determine the representative, i.e., $\iota^{-1}([\overrightarrow{pq}])$, of the vector that corresponds to the oriented line segment $\overrightarrow{pq}$ between the points p and q given by a list of their coordinates with respect to a Cartesian coordinate system. In addition, calculate the distance between p and q.
a)
b)
c)
d)
e)
f)
g)
h)
p p1, 2q , q
p4, 7q ,
p p1, 2q , q p3, 4q ,
p p1, 3q , q p1, 2q ,
p p2, 4q , q p3, 4q ,
p p1, 3, 2q , q p3, 4, 7q ,
p p1, 2, 2q , q p3, 1, 4q ,
p p3, 1, 2q , q p1, 5, 2q ,
p p7, 1, 4q , q p3, 9, 4q .
3b, 3a 4b, a b, the angle θ between a and b,
2) Calculate |a|, 2a
a vector of length one in the direction of a, the orthogonal projection
of a onto the direction of b and, if at all possible, a b of the vectors
a and b.
a) a p3, 5q , b p1, 3q ,
a p7, 1q , q
b)
p1, 3q
,
c) p p6, 9q , q p2, 4q ,
d) p p1, 5q , q p2, 6q ,
e) p p2, 3, 1q , q
p3, 7, 7q ,
p p1, 3, 1q , q p4, 2, 3q ,
p p5, 2, 2q , q p2, 6, 3q
p p9, 2, 3q , q p5, 8, 2q .
f)
g)
h)
,
3) Calculate detpa, bq, detpa, b, cq, respectively.
a) a p4, 0q , b p7, 0q ,
b) a p8, 3q , b p2, 6q ,
c) a p8, 9q , b p5, 2q ,
d) a p4, 0, 1q , b p2, 0, 1q , c p3, 0, 9q ,
e) a p3, 1, 4q , b p1, 9, 1q , c p2, 3, 1q ,
f)
a p4, 2, 3q , b p3, 5, 2q , c p4, 4, 1q .
4) Find suitable vectors x0 and v such that
L tx0
tv : t P Ru
where
a) L tpx, y q P R2 : 3x
b)
L tpx, y q P R : 4y
2
c) L tpx, y q P R : 7x
4y
3u
2
d)
3y
L tpx, y, z q P R : x
3
e) L tpx, y, z q P R : 3z
3
f)
L tpx, y, z q P R : 5x
5u
,
0u
3y
4z
7 ^ 9x
3
,
6y
.
5 ^ 3x
2y
z
3 0 ^ 8y
9y
0u
6z
12z
,
0u
5) Find suitable vectors x0 , v and w such that
P
tx0
tv
sw
P R3 : t, s P Ru
where
a)
P
tpx, y, zq P R3 : x 3y
2z
1u
1u
,
.
,
tpx, y, zq P R3 : 3x 5u
P tpx, y, z q P R3 : 2x 9y
b) P
,
c)
z
0u
.
6) Calculate the distance between the lines L1 and L2 .
tpx, y, zq P R3 : 12x 3y 2z 4 ^ x y z 0u ,
L2 tpx, y, z q P R3 : 4x y z 5 ^ 9x 3x 4z 7u ,
L1 tpx, y, z q P R3 : x 3y z 1 ^ y z 0u ,
L2 tpx, y, z q P R3 : 2x y 18z 4 ^ 5x 3y 1u ,
L1 tpx, y, z q P R3 : 9x 7y 2z 8 ^ x y z 0u ,
L2 tpx, y, z q P R3 : 22x 4y 14z 19 ^ 28x 7z 12u
a) L1
b)
c)
7) Calculate the distance of the point p from the plane P .
a) p p5, 12, 3q , P
b)
c)
tpx, yq P R3 : x 2y 3z 9u ,
p p0, 0, 0q , P tpx, y q P R3 : x y z 1u ,
p p6, 7, 2q , P tpx, y q P R3 : 18x y 9z 3u
.
8) Let A, B and C be the vertices of a triangle. Find the representative
#
#
#
of [ AB ] [ BC ] [ CA ].
9) Show that the line joining the midpoints of two sides of a triangle is
parallel to and one-half the length of the third side.
10) Show that the medians of a triangle intersect in one point which is
called the centroid.
11) Show that the perpendicular bisectors of a plane triangle intersect in
one point which is called the circumcenter.
12) Show that the altitudes of a triangle, i.e., the straight lines through
the vertices which are perpendicular to the opposite sides, intersect
in one point which is called the orthocenter.
13) a) Show that vectors $a, b \in \mathbb{R}^2$ are linearly dependent, i.e., such that there are $\alpha, \beta \in \mathbb{R}$ such that $\alpha . a + \beta . b = 0$ and $\alpha^2 + \beta^2 \neq 0$, if and only if $\det(a, b) = 0$.
b) Show that vectors $a, b, c \in \mathbb{R}^3$ are linearly dependent, i.e., such that there are $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\alpha . a + \beta . b + \gamma . c = 0$ and $\alpha^2 + \beta^2 + \gamma^2 \neq 0$, if and only if $\det(a, b, c) = 0$.
14) (Cramer’s rule in two and three dimensions)
a) Let $a, b, c \in \mathbb{R}^2$. Show that the system of equations
\[ (a \cdot x, b \cdot x) = c \]
has a unique solution $x \in \mathbb{R}^2$ if and only if $\det(a, b) \neq 0$. For that case express the solution $x$ only in terms of $\det(c, b)$, $\det(a, c)$ and $\det(a, b)$. The result is called ‘Cramer’s rule’ in two dimensions.
b) Let $a, b, c, d \in \mathbb{R}^3$. Show that the system of equations
\[ (a \cdot x, b \cdot x, c \cdot x) = d \]
has a unique solution $x \in \mathbb{R}^3$ if and only if $\det(a, b, c) \neq 0$. For that case express the solution $x$ only in terms of $\det(d, b, c)$, $\det(a, d, c)$, $\det(a, b, d)$ and $\det(a, b, c)$. The result is called ‘Cramer’s rule’ in three dimensions.
3.5.3 Conic Sections
Conic sections were already known in ancient Greece. They were found
by Menaechmus, a student of Eudoxus, in the search for curves that were
suitable for the solution of the Delian problem. This problem consists in constructing, from the edge of a cube and using compass
and straightedge alone, the edge of a second cube of double volume. From
today’s perspective, this reduces essentially to the construction of a line
segment of length $2^{1/3}$, using compass and straightedge alone. In his search,
Menaechmus found ellipses, parabolas and hyperbolas as intersections of
right circular cones with planes. For this, see Problem 7 in Section 3.5.5
on quadrics. From the point of view of analytic geometry, ellipses, parabolas
and hyperbolas are zero sets of quadratic polynomials in the coordinates of
Cartesian coordinate systems in the plane. In the following, for each case,
such a polynomial will be derived starting from a geometric definition of
the curve.
A parabola is a subset of the plane consisting of those points that are
equidistant from a given line, called its directrix, and a given point, called
its focus. The point bisecting the distance of the focus and the directrix is
called its vertex; it is on the parabola. The infinite line through the vertex
which is perpendicular to the directrix is called the axis of the parabola.
Fig. 130: Parabola and corresponding directrix and focus. Compare Example 3.5.25.
Example 3.5.25. (Parabolas) Find a function whose zero set is a parabola $P$ with vertex in the origin, focus in the upper half-plane at $(0, p)$ and axis given by the $y$-axis of a Cartesian coordinate system. Solution: As a consequence of the assumptions the directrix is given by $y = -p$. Hence $(x, y) \in P$ if and only if
\begin{align*}
& \sqrt{x^2 + (y - p)^2} = | y + p | \\
\Leftrightarrow \ & x^2 + (y - p)^2 = (y + p)^2 \\
\Leftrightarrow \ & x^2 = 4 p y
\end{align*}
and therefore
\[ x = 0 \]
if $p = 0$, i.e., in this case the parabola is given by the $y$-axis, and
\[ y = \frac{x^2}{4p} \]
if $p \neq 0$. Hence
\[ P = \{ (x, y) \in \mathbb{R}^2 : x = 0 \} \]
if $p = 0$ and
\[ P = \left\{ (x, y) \in \mathbb{R}^2 : y - \frac{x^2}{4p} = 0 \right\} \]
if $p \neq 0$.
Fig. 131: Auxiliary diagram for the description of ancient Greek knowledge on parabolic segments. See Example 3.5.26.
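The defining property of the parabola from Example 3.5.25 can be checked numerically: points on the graph of $y = x^2/(4p)$ have the same distance from the focus $(0, p)$ and from the directrix $y = -p$. In the Python sketch below, the function names and the sample value of $p$ are assumptions made for illustration.

```python
import math

def parabola_point(x, p):
    """Point on the parabola y = x^2 / (4p), p > 0."""
    return x, x * x / (4.0 * p)

def dist_to_focus(pt, p):
    """Distance of pt from the focus (0, p)."""
    return math.hypot(pt[0], pt[1] - p)

def dist_to_directrix(pt, p):
    """Distance of pt from the directrix y = -p."""
    return abs(pt[1] + p)

p = 1.5
for x in (-4.0, -1.0, 0.0, 0.5, 3.0):
    pt = parabola_point(x, p)
    # Both distances should agree up to rounding.
    print(x, dist_to_focus(pt, p), dist_to_directrix(pt, p))
```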
Below, we give another typical example for the approach of analytical geometry, i.e., the replacement of intuition in the solution of geometric problems by algebraic calculations based on the introduction of an auxiliary
Cartesian coordinate system. Parts of the example transcend analytic geometry since they also apply methods from calculus. Hence as a whole, the
example belongs to the area of differential geometry.
Example 3.5.26. (Ancient Greek knowledge on parabolic segments) As
an application of the previous example, we prove the following facts for
line segments AE of parabolas which provided the basis of Archimedes’
quadrature of the parabola. See Fig 131.
(i) The tangent to the parabola at the point C on the parabola of largest distance from
AE is parallel to AE.
(ii) The parallel to the axis of the parabola through C halves every line
segment BD between two points B and D on the parabola that is
parallel to AE.
(iii) If I, G are the points of intersection of the parallel to the axis through
C with BD and AE, respectively, then
\[ \frac{CI}{CG} = \frac{(BI)^2}{(AG)^2} . \tag{3.5.8} \]
For the proofs, we consider the parabola $P$ given by the graph of $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \frac{x^2}{4p} \]
for every $x \in \mathbb{R}$ where $p > 0$. In addition, let $(x_1, y_1)$ and $(x_2, y_2)$ be two different points of $P$. Note that this implies that $x_2 \neq x_1$. Without loss of generality, we can assume that $x_1 < x_2$. Then the line segment $L$ between $(x_1, y_1)$ and $(x_2, y_2)$ is given by the graph of the map $h : [x_1, x_2] \to \mathbb{R}$ defined by
\[ h(x) := a (x - x_1) + y_1 \]
where
\[ a := \frac{y_2 - y_1}{x_2 - x_1} = \frac{1}{x_2 - x_1} \left( \frac{x_2^2}{4p} - \frac{x_1^2}{4p} \right) = \frac{1}{4p} \, \frac{x_2^2 - x_1^2}{x_2 - x_1} = \frac{1}{4p} \, (x_1 + x_2) . \]
First, we establish (i) in the following. For every $(x, y) \in \mathbb{R}^2$, we have the following decomposition
\[ (x, y) = \frac{x + a y}{1 + a^2} . (1, a) + \frac{y - a x}{1 + a^2} . (-a, 1) \tag{3.5.9} \]
where the vectors p1, aq and pa, 1q are orthogonal with respect to the Euclidean scalar product. In particular, the last gives for x P rx1 , x2 s
px, hpxqq x 1
ahpxq
.p1, aq
a2
hpxq ax
.pa, 1q
1 a2
x
a2 px x1 q ay1
apx x1 q y1 ax
.p1, aq
.pa, 1q
2
1 a
1 a2
p1 a2q x apy1 ax1q .p1, aq y1 ax1 .pa, 1q .
1 a2
1 a2
Further, let px0 , y0 q P P such that x0
px0, y0q x10
P rx1, x2s. Then (3.5.9) gives
ay0
y0 ax0
.p1, aq
.pa, 1q .
2
a
1 a2
Hence the square of the distance of the points px, hpxqq where x
and px0 , y0 q is given by
1
p
a2 q x
1
x10
p1
apy1 ax1 q
.p1, aq
a2
P rx1, x2s
y1 ax1
.pa, 1q
1 a2
2
ay0
y0 ax0
.
p
1,
a
q
.
p
a,
1
q
a2
1 a2
a2 q x apy1 y0 ax1 q x0
.p1, aq
1 a2
2
y1 y0 apx1 x0 q
.
p
a,
1
q
1 a2
rp1 a2q x apy1 y0 ax1q x0s2
1 a2
ry1 y0 apx1 x0qs2
.
The zero of the first summand in the nominator of the last expression is
given by
x0 apy1 y0 ax1 q
x0 x1
1 a2
x1 x0 x11 aap2y0 y1q .
x
x1 apy1 y0 ax1 q
1 a2
As a function of x0 , it is increasing, and it assumes at x0
x x1 and at x0 x2 the value
x x1
x2 x1 apy2 y1 q
1 a2
x1
x2 x1
a
1
x1 the value
x22
4p
a2
x21
4p
x 1 px 2 x 1 q
a x14px2
1
a2
1
x2 .
Hence it follows that
x0 apy1 y0 ax1 q
1 a2
P rx1, x2s
for all x0 P rx1 , x2 s and therefore that the minimal distance dppx0 , y0 q, Lq
of px0 , y0 q from the line segment L is given by
dppx0 , y0 q, Lq |y1 y0? apx1 x0q|
1
a2
and is assumed in precisely one point on L with abscissa
x0 x1 apy0 y1 q
.
1 a2
x1
Further, the distance function D : rx1 , x2 s Ñ R defined by
Dpx0 q : dppx0 , f px0 qq, Lq
P rx1, x2s continuous and hence assumes a maximum value.
|y1 y2? apx1 x2q| 0 ,
Dpx q 0 , Dpx q for every x0
Since
1
2
1 a2
that maximum is assumed in the open interval between x1 and x2 . Since
pr0, 8q Ñ R, x ÞÑ Rq is strictly increasing, D and D2 assume maxima in
the same points. Since D2 is differentiable on the open interval between x1
and x2 , we conclude that the derivative of D2 vanishes in such a point x in
the open interval between x1 and x2 , i.e., that
0
y1 y0 a x1
2 1 a2
x0 a2 D x0 a
.
2p 2 1
x
D
2
?
pq
1
p x0q a x0 p q 483
2p Hence it follows that x0 2pa. As as consequence, that maximal distance
dppx0 , f px0 qq, Lq is assumed in precisely one point with coordinates
px0, f px0qq p 2pa, pa2 q .
(3.5.10)
Finally, the slope of the tangent to the graph of f in this point is given by
f 1 p2paq 2pa
2p
a
and hence equal to the slope of the line segment L. Hence, indeed, the tangent to the graph of f in this point is parallel to the line segment L.
Finally, we establish (ii) and (iii) in the following. For this, let px3 , y3 q,
px4, y4q P P be such that x3 x4 and such that
y4 y3
x4 x3
a.
Then the intersection of the parallel to the axis through the point (3.5.10)
with the line segment between px3 , y3 q, px4 , y4 q is given by
Since a px3
px, yq p2pa, ap2pa x3q
x4 q{p4pq, we conclude that
x 2pa x3
x4
2
y3 q .
,
that
ap2pa x3q y3 x3 4p x4 x3 2 x4 x3 y3
2
2
x3 x4 x4 x3 y x4 x3 y y4 y3
y
4p
and hence that
2
3
px, yq 3
8p
x
3
2
x4 y3
,
2
y4 2
.
y3
y3 2 y4
Hence, the distances of px3 , y3 q and px4 , y4 q from px, y q are indeed the
same:
c
x3 x3
c
x4 2
2
x3
y3 x4 2
y 4 2
y3
2
y3
c
x
3
x 4 2
y
y4 2
3
2
y 4 2
2
y4 .
2
2
We notice that that distance dppx3 , y3 q, px, y qq equals
x4 ?
p2pa x3q2 a2p2pa x3q2 1 a2 |2pa x3|
?
?
1 a2 x3 2 x4 x3 21 1 a2 px4 x3q .
Further, the distance dppx0 , y0 q, px, y qq between px0 , y0 q and px, y q is given
a
by
ap2pa x3 q
y3 pa2
x3q2
px4 16p
.
Finally, it follows that
dppx0 , y0 q, px, y qq
r dppx3, y3q, px, yqq s2
appa x3q
x3 q px4 16p
p1
2
a2
x23
4p
x3 q
p2pa4p
4
qpx4 x3q2
2
4p p11
a2 q
.
An ellipse is a subset of the plane consisting of those points for which the
sum of the distances from two points, called foci, is constant. Because of
the triangle inequality for the Euclidean distance, that constant is greater
than or equal to the distance of the foci. If the constant is non-zero, the ratio
between the distance of the foci and the constant is called the eccentricity
of the ellipse. The line connecting the foci of an ellipse is called eccentric
line and its midpoint the center of the ellipse.
Example 3.5.27. (Ellipses) Find a function whose zero set is an ellipse $E$ with foci at $(-c, 0)$ and $(c, 0)$ and constant $2a$ where $a \geq c \geq 0$. Solution: $(x, y) \in E$ if and only if
\[ d_1 + d_2 = 2a \tag{3.5.11} \]
where $d_1 := \sqrt{(x + c)^2 + y^2}$ and $d_2 := \sqrt{(x - c)^2 + y^2}$. In case that $a = c = 0$, this is equivalent to $x = y = 0$, i.e., the ellipse is given by the origin. In the following, let $a > 0$. Then
\[ 2a \, (d_1 - d_2) = d_1^2 - d_2^2 = (x + c)^2 + y^2 - [ \, (x - c)^2 + y^2 \, ] = 4cx \]
and
\[ d_1 - d_2 = \frac{2c}{a} \, x . \tag{3.5.12} \]
Hence (3.5.11) is equivalent to (3.5.11), (3.5.12) and therefore to
\[ d_1 = a + \frac{c}{a} \, x , \quad d_2 = a - \frac{c}{a} \, x . \tag{3.5.13} \]
We consider two cases. In case that $a = c$, equations (3.5.13) are equivalent to $|x| \leq c$ and $y = 0$, i.e., in this case, the ellipse is given by the line segment $[-c, c] \times \{0\}$. In case that $a > c$, equations (3.5.13) are equivalent to
\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \tag{3.5.14} \]
and the condition
\[ |x| \leq \frac{a^2}{c} . \tag{3.5.15} \]
Now the assumption $|x| > a^2 / c$ and (3.5.14) leads to the contradiction that
\[ 0 > 1 - \frac{a^2}{c^2} > \frac{y^2}{a^2 - c^2} . \]
Hence (3.5.14) implies (3.5.15), and (3.5.13) is equivalent to (3.5.14). Hence
\[ E = \{ (0, 0) \} \]
if $a = c = 0$,
\[ E = \{ (x, 0) \in \mathbb{R}^2 : -c \leq x \leq c \} \]
if $a > 0$, $c = a$ and
\[ E = \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \right\} \]
if $a > 0$, $a > c$. Note that $E$ is a circle of radius $a$ if $c = 0$.
Fig. 132: Ellipse and corresponding foci. Compare Example 3.5.27.
Fig. 133: Hyperbolas, corresponding foci and asymptotes (dashed). Compare Example 3.5.28.
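The result of Example 3.5.27 can be checked numerically: for $a > c \geq 0$, points satisfying $x^2/a^2 + y^2/(a^2 - c^2) = 1$ should have distances from the foci $(-c, 0)$ and $(c, 0)$ that add up to $2a$. The Python sketch below uses hypothetical helper names and the sample values $a = 5$, $c = 3$.

```python
import math

def focal_distance_sum(x, y, c):
    """d1 + d2 for the foci (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)

def ellipse_point(x, a, c):
    """Upper point (x, y) of the ellipse for |x| <= a and a > c >= 0."""
    # max(...) guards against a tiny negative value caused by rounding.
    y = math.sqrt(max(0.0, (a * a - c * c) * (1.0 - x * x / (a * a))))
    return x, y

a, c = 5.0, 3.0
for x in (-5.0, -2.0, 0.0, 1.5, 4.0):
    px, py = ellipse_point(x, a, c)
    # Each sum should be close to 2a.
    print(x, focal_distance_sum(px, py, c))
```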
A hyperbola is a subset of the plane consisting of those points for which the
difference of the distances from two points, called foci, is constant. Because
of the triangle inequality for the Euclidean distance, the absolute value of
that constant is less than or equal to the distance of the foci. If the constant
is non-zero, the ratio between the distance of the foci and its absolute value
is called the eccentricity of the hyperbola.
Example 3.5.28. (Hyperbolas) Find a function whose zero set is a hyperbola $H$ with foci at $(-c, 0)$ and $(c, 0)$, where $c \geq 0$, and constant $2a$ such that $|a| \leq c$. Solution: $(x, y) \in H$ if and only if
\[ d_1 - d_2 = 2a \tag{3.5.16} \]
where $d_1 := \sqrt{(x + c)^2 + y^2}$ and $d_2 := \sqrt{(x - c)^2 + y^2}$. We consider several cases. In case that $c = a = 0$, equation (3.5.16) is satisfied by all $(x, y) \in \mathbb{R}^2$, i.e., the hyperbola is given by the whole plane. In case that $c > 0$, $a = 0$, (3.5.16) is equivalent to $x = 0$ and $y \in \mathbb{R}$, i.e., the hyperbola is given by the $y$-axis. In case that $c > 0$, $a \neq 0$, it follows that
\[ 2a \, (d_1 + d_2) = d_1^2 - d_2^2 = (x + c)^2 + y^2 - [ \, (x - c)^2 + y^2 \, ] = 4cx \]
and that
\[ d_1 + d_2 = \frac{2c}{a} \, x . \tag{3.5.17} \]
Hence (3.5.16) is equivalent to (3.5.16), (3.5.17) and therefore to
\[ d_1 = \frac{c}{a} \, x + a , \quad d_2 = \frac{c}{a} \, x - a . \tag{3.5.18} \]
Equations (3.5.18) are equivalent to
\[ \frac{c^2 - a^2}{a^2} \, x^2 - y^2 = c^2 - a^2 \tag{3.5.19} \]
and the condition
\[ \frac{x}{a} \geq \frac{|a|}{c} . \tag{3.5.20} \]
In case that $a \in \{-c, c\}$, (3.5.19) and (3.5.20) are equivalent to $x \leq -c$ and $y = 0$, $x \geq c$ and $y = 0$, respectively, i.e., the hyperbola is given by the respective half-lines. In case that $|a| < c$, equations (3.5.19) and (3.5.20) are equivalent to
\[ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \tag{3.5.21} \]
and
\[ x \geq \frac{a^2}{c} \]
if $a > 0$ and
\[ x \leq - \frac{a^2}{c} \]
if $a < 0$, respectively. The assumption $0 \leq x < a^2 / c$ or $- a^2 / c < x \leq 0$ leads together with (3.5.21) to the contradiction
\[ \frac{y^2}{c^2 - a^2} = \frac{x^2}{a^2} - 1 < \frac{a^2}{c^2} - 1 \leq 0 . \]
Hence (3.5.19) and (3.5.20) are equivalent to (3.5.21) and $x > 0$ if $a > 0$ and to (3.5.21) and $x < 0$ if $a < 0$. We conclude that
\[ H = \mathbb{R}^2 \]
if $c = a = 0$,
\[ H = \{ (0, y) \in \mathbb{R}^2 : y \in \mathbb{R} \} \]
if $c > 0$, $a = 0$,
\[ H = \{ (x, 0) \in \mathbb{R}^2 : x \leq -c \} \]
if $c > 0$, $a = -c$,
\[ H = \{ (x, 0) \in \mathbb{R}^2 : x \geq c \} \]
if $c > 0$, $a = c$,
\[ H = \left\{ (x, y) \in \mathbb{R}^2 : x = a \sqrt{1 + \frac{y^2}{c^2 - a^2}} \, \right\} \]
if $c > a > 0$ and
\[ H = \left\{ (x, y) \in \mathbb{R}^2 : x = a \sqrt{1 + \frac{y^2}{c^2 - a^2}} \, \right\} \]
if $c > -a > 0$.
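The description of the non-degenerate hyperbola in Example 3.5.28 can be verified numerically: points with $x = a \sqrt{1 + y^2/(c^2 - a^2)}$, where $0 < |a| < c$, should satisfy $d_1 - d_2 = 2a$ for the foci $(-c, 0)$ and $(c, 0)$. The sign of $a$ selects the branch. Function names and sample values in this Python sketch are assumptions for illustration.

```python
import math

def focal_distance_difference(x, y, c):
    """d1 - d2 for the foci (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) - math.hypot(x - c, y)

def hyperbola_point(y, a, c):
    """Point (x, y) with x = a * sqrt(1 + y^2 / (c^2 - a^2)), 0 < |a| < c."""
    return a * math.sqrt(1.0 + y * y / (c * c - a * a)), y

c = 2.0
for a in (1.0, -1.0):          # right branch for a > 0, left branch for a < 0
    for y in (-3.0, 0.0, 0.5, 4.0):
        px, py = hyperbola_point(y, a, c)
        # Each difference should be close to 2a.
        print(a, y, focal_distance_difference(px, py, c))
```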
Remark 3.5.29. Note that from an analytic algebraic point of view conics
are zero sets of second order polynomials in the coordinates of Cartesian
coordinate systems in the plane. Later on in Section 3.5.5, quadrics will be
defined as corresponding sets in three-dimensional space.
Problems
1) Find the vertex, focus and the directrix of the parabola.
a)
c)
d)
f)
tpx, yq P R2 : y2 3x 0u , b) tpx, yq P R2 : y2 x{2u
tpx, yq P R2 : y 5 2px 3q2 u ,
tpx, yq P R2 : y x2 xu , e) tpx, yq P R2 : y 2 x2 u
tpx, yq P R2 : y x2 3x 1u .
2) Find a function whose zero set coincides with P .
a) P is the parabola with focus p1, 3q and directrix tpx, y q P R2 :
x 2y 1 0u,
b) P is the parabola with focus p4, 3q and directrix tpx, y q P R2 :
x 2y 1u,
c) P is the parabola with focus p1, 2q and directrix tpx, y q P
R2 : 2x y 3u.
3) Find the location of the foci and the eccentricity of the ellipse.
a)
b)
c)
d)
e)
f)
tpx, yq P R2 : x2 2y2 6u ,
tpx, yq P R2 : 5x2 11y2 10u ,
tpx, yq P R2 : 2x2 4y2 5u ,
tpx, yq P R2 : 3x2 2y2 1u ,
tpx, yq P R2 : 6x2 7y2 4u ,
tpx, yq P R2 : y px2 {3q py2 {4q p1{2qu
.
4) The lines connecting the foci of the following ellipses are parallel to
the x-axis. Find the location of their foci and their eccentricities.
a)
b)
c)
tpx, yq P R2 : 3x2
tpx, yq P R2 : x2
tpx, yq P R2 : 4x2
2y 2 3x p4y {3q p1{36qu ,
3y 2 4x
2y
2
12u ,
8x 12y 21 0u
12y
.
5) Find a function whose zero set is an ellipse of eccentricity 2 and foci
at p1, 1q, p2, 2q.
6) Find the location of the foci and the eccentricity of the hyperbola.
a)
tpx, yq P R2 : 2x2 y2 5u
,
,
b)
c)
d)
e)
f)
tpx, yq P R2 : 7x2 9y2 9u ,
tpx, yq P R2 : 3x2 5y2 4u ,
tpx, yq P R2 : 4x2 y2 2u ,
tpx, yq P R2 : 7x2 4y2 1u ,
tpx, yq P R2 : y px2 {4q py2 {2q p1{4qu
.
7) The lines connecting the foci of the following hyperbolas are parallel
to the x-axis. Find the location of their foci and their eccentricities.
a)
b)
c)
tpx, yq P R2 : 2x2 4y2 p2x{3q 4y p35{18qu
tpx, yq P R2 : 3x2 y2 12x 4y 7u ,
tpx, yq P R2 : 2x2 3y2 4x 18y 30u .
,
8) Find a function whose zero set is a hyperbola of eccentricity 4 and
foci at p1, 1q, p2, 2q.
9) Show that the given set is an ellipse
"
2t
1 t2
,b
1 t2
1 t2
P R2 : t P R
(
pa cospθq, b sinpθqq : θ P R
b)
where a ¡ 0 and b ¡ 0.
a)
a
*
,
10) Show that the given set is a hyperbola.
"
a)
b)
pa coshptq, b sinhptqq : t P R
where a ¡ 0 and b ¡ 0.
c)
*
2t
1 t2
,b
a
P R 2 : 1 t 1 ,
1 t2
1 t2
*
"
a
, b tanpθq : θ P pπ {2, π {2q
,
cospθq
(
3.5.4 Polar Coordinates
In addition to Cartesian coordinate systems, there are other options to coordinatize the points in the plane. Most important in this respect are polar coordinate systems, see Fig 134. In physics applications, they are generally applied if the system is, in a certain sense, symmetric under rotations around a particular point. In such cases, that point is chosen as the origin of the polar coordinate system. In these situations, polar coordinates considerably simplify the analysis of the system compared to Cartesian coordinate systems.
Polar coordinate systems use as coordinates the distance $r$ of a point $p$ from an origin $O$ and the angle $\varphi$ between the line segment $Op$ and a given half-line originating from $O$. For example in Fig 134, the latter is given by the positive $x$-axis of a Cartesian coordinate system. We immediately notice two problems here. First, the origin does not correspond to a unique pair of coordinates $r$ and $\varphi$ and hence has to be excluded. Second, there are various ways to measure the angle from the positive $x$-axis. For instance, if we let $\varphi$ run in the interval $[0, 2\pi]$, then the points on the half-line
\[ H_{+} := \{ (x, 0) \in \mathbb{R}^2 : x > 0 \} \]
do not correspond to unique pairs of coordinates $r$ and $\varphi$. Hence in this case, we need to exclude one of the angles $0$ or $2\pi$. But then $\varphi$ ‘jumps’ for points on $H_{+}$ depending on whether we approach such a point from below $H_{+}$ or from above. Such behavior along $H_{+}$ is usually undesirable for applications. For this reason, below $\varphi$ runs in the interval $(-\pi, \pi]$. Then the jump occurs only on
\[ H_{-} := \{ (x, 0) \in \mathbb{R}^2 : x < 0 \} \]
which is usually acceptable for applications. Of course, we could also have chosen $[-\pi, \pi)$ for that purpose. That would have led to a different coordinatization of the points on $H_{-}$ only. On the other hand, we will see later in Calculus III that, in certain applications, $H_{-}$ needs to be excluded from coordinatization because the transformation $g$ below, from polar coordinates to Cartesian coordinates, is not everywhere differentiable. Hence usually, the convention used for the coordinatization of $H_{-}$ has no important consequences.
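Example 3.5.30. (Polar coordinates) Define
\[ g : (0, \infty) \times (-\pi, \pi] \to \mathbb{R}^2 \setminus \{ (0, 0) \} \]
by
\[ g(r, \varphi) := (r \cos \varphi, r \sin \varphi) \]
for all $r \in (0, \infty)$, $\varphi \in (-\pi, \pi]$. Then $g$ is bijective with the inverse
\[ g^{-1} : \mathbb{R}^2 \setminus \{ (0, 0) \} \to (0, \infty) \times (-\pi, \pi] \]
given by
\[ g^{-1}(x, y) = \begin{cases} \left( \sqrt{x^2 + y^2} \, , \ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \geq 0 \\ \left( \sqrt{x^2 + y^2} \, , \ - \arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{ (0, 0) \}$.
Fig. 134: Polar coordinates r, ϕ of a point p in the plane. r is the Euclidean distance of O and p. Compare Example 3.5.30.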
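Example 3.5.31. Find a parametrization of the ellipse
\[ E := \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\} , \]
i.e., a bijective map whose range coincides with $E$ where $a, b > 0$. Solution: Define the scale transformation $f : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ f(x, y) := (ax, by) \]
for all $x, y \in \mathbb{R}$. Then
\[ f(S^1(0)) = E . \]
Employing polar coordinates for parametrization, the unit circle $S^1(0)$ is given by
\[ S^1(0) = \{ (\cos \varphi, \sin \varphi) \in \mathbb{R}^2 : -\pi < \varphi \leq \pi \} . \]
Hence
\[ E = \{ (a \cos \varphi, b \sin \varphi) \in \mathbb{R}^2 : -\pi < \varphi \leq \pi \} , \]
and a parametrization of $E$ is given by $h : (-\pi, \pi] \to \mathbb{R}^2$ defined by
\[ h(\varphi) := (a \cos \varphi, b \sin \varphi) \]
for all $\varphi \in (-\pi, \pi]$.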
The following three examples use polar coordinates for the parametrization
of ellipses, parabolas and hyperbolas that have foci in the origin of a Cartesian coordinate system. The results are frequently applied in astronomy, in
the description of the motion of objects in the gravitational field of a central
object. Such motion proceeds on ellipses, parabolas or hyperbolas with the
position of the central object as a focus.
Example 3.5.32. (Polar representation of parabola with focus in the origin) Let $p > 0$. Show that
\[ P_p = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r \, (1 + \cos \varphi) = 2p \} \tag{3.5.22} \]
is a parabola with focus at the origin and directrix given by the parallel to the $y$-axis through the point $(2p, 0)$. Solution: For this, denote by $P_p$ the parabola with focus at the origin and directrix given by the parallel to the $y$-axis through the point $(2p, 0)$. Then $(x, y) \in P_p$ if and only if
\[ \sqrt{x^2 + y^2} = \sqrt{(x - 2p)^2} = | x - 2p | . \tag{3.5.23} \]
The equation
\[ \sqrt{x^2 + y^2} = x - 2p \]
implies that
\[ x - 2p = \sqrt{x^2 + y^2} \geq x \]
and hence that $-2p \geq 0$ which is in contradiction to the assumptions. Hence this equation has no solution in $\mathbb{R}^2$. Therefore (3.5.23) is equivalent to
\[ \sqrt{x^2 + y^2} + x = 2p \]
and
\[ P_p = \{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + x = 2p \} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin P_p$, we conclude (3.5.22). Note that as a consequence of the foregoing, (3.5.23) is equivalent to
\[ x^2 + y^2 = (2p - x)^2 = x^2 - 4px + 4p^2 \]
and hence to
\[ |y| = 2 \sqrt{p \, (p - x)} . \]
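The polar representation (3.5.22) can be tested numerically: points with $r = 2p / (1 + \cos \varphi)$ should be equidistant from the origin and from the line $x = 2p$, and should satisfy $y^2 = 4p \, (p - x)$. In the Python sketch below, the function name `parabola_point` and the sample angles are assumptions. The angle $\varphi = \pi$ is excluded because $1 + \cos \varphi$ vanishes there.

```python
import math

def parabola_point(phi, p):
    """Point with polar coordinates r = 2p / (1 + cos(phi)), -pi < phi < pi."""
    r = 2.0 * p / (1.0 + math.cos(phi))
    return r * math.cos(phi), r * math.sin(phi)

p = 1.0
for phi in (-2.5, -1.0, 0.0, 0.7, 3.0):
    x, y = parabola_point(phi, p)
    focus_distance = math.hypot(x, y)          # distance from the focus (0, 0)
    directrix_distance = abs(x - 2.0 * p)      # distance from the line x = 2p
    print(phi, math.isclose(focus_distance, directrix_distance, rel_tol=1e-9))
```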
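Example 3.5.33. (Polar representation of ellipses with focus in the origin) Define for $a > 0$ and $0 \leq \varepsilon < 1$ the corresponding ellipse $E_{a,\varepsilon}$ with center $(-a\varepsilon, 0)$, foci at $(-2a\varepsilon, 0)$, $(0, 0)$ and eccentricity $\varepsilon$ by
\[ E_{a,\varepsilon} := \left\{ (x, y) \in \mathbb{R}^2 : \frac{(x + a\varepsilon)^2}{a^2} + \frac{y^2}{a^2 (1 - \varepsilon^2)} = 1 \right\} . \]
Show that
\[ E_{a,\varepsilon} = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r \, (1 + \varepsilon \cos \varphi) = a (1 - \varepsilon^2) \} . \tag{3.5.24} \]
Solution: In Example 3.5.27, we showed for $a > 0$ and $0 \leq c < a$ that the following equations are equivalent for $(x, y) \in \mathbb{R}^2$
\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \]
and
\[ \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a . \]
Hence if $a > 0$ and $0 \leq \varepsilon < 1$, then also the equations
\[ \frac{(x + a\varepsilon)^2}{a^2} + \frac{y^2}{a^2 (1 - \varepsilon^2)} = 1 \]
and
\[ \sqrt{(x + 2a\varepsilon)^2 + y^2} = 2a - \sqrt{x^2 + y^2} \]
are equivalent for $(x, y) \in \mathbb{R}^2$. The last equation is equivalent to
\[ (x + 2a\varepsilon)^2 + y^2 = \left( 2a - \sqrt{x^2 + y^2} \right)^2 \tag{3.5.25} \]
since the equation
\[ \sqrt{(x + 2a\varepsilon)^2 + y^2} = \sqrt{x^2 + y^2} - 2a \]
leads by use of the triangle inequality to
\[ | (x, y) + (2a\varepsilon, 0) | = | (x, y) | - 2a \leq | (x, y) + (2a\varepsilon, 0) | - 2a (1 - \varepsilon) \]
and therefore has no solution in $\mathbb{R}^2$ since $a > 0$ and $\varepsilon < 1$. Further, (3.5.25) is equivalent to
\[ x^2 + y^2 + 4a^2 \varepsilon^2 + 4a\varepsilon x = x^2 + y^2 - 4a \sqrt{x^2 + y^2} + 4a^2 \]
which is equivalent to
\[ \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) . \]
As a consequence, we arrive at the representation
\[ E_{a,\varepsilon} = \left\{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) \right\} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin E_{a,\varepsilon}$, we conclude (3.5.24).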
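The representation (3.5.24) can be checked numerically. The Python sketch below generates points with $r = a (1 - \varepsilon^2) / (1 + \varepsilon \cos \varphi)$ and evaluates the left-hand side of the defining equation of the ellipse with center $(-a\varepsilon, 0)$. The function names and the sample values $a = 2$, $\varepsilon = 1/2$ are assumptions made for illustration.

```python
import math

def polar_ellipse_point(phi, a, eps):
    """Point with r (1 + eps cos(phi)) = a (1 - eps^2), where a > 0 and 0 <= eps < 1."""
    r = a * (1.0 - eps * eps) / (1.0 + eps * math.cos(phi))
    return r * math.cos(phi), r * math.sin(phi)

def ellipse_lhs(x, y, a, eps):
    """Left-hand side (x + a eps)^2 / a^2 + y^2 / (a^2 (1 - eps^2))."""
    return (x + a * eps) ** 2 / (a * a) + y * y / (a * a * (1.0 - eps * eps))

a, eps = 2.0, 0.5
for phi in (-3.0, -1.2, 0.0, 0.9, math.pi):
    x, y = polar_ellipse_point(phi, a, eps)
    # Each value should be close to 1.
    print(phi, ellipse_lhs(x, y, a, eps))
```

For $\varphi = 0$ the point is the vertex $(a (1 - \varepsilon), 0)$ next to the focus at the origin, which is a convenient spot check.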
Fig. 135: Parabola, ellipse, hyperbola and asymptotes corresponding to the parameters $p = 1$, $\varepsilon = 1/2$ and $a (1 - \varepsilon^2) = 1$, $\varepsilon = 3/2$ and $a (1 - \varepsilon^2) = 1$, respectively. Compare Examples 3.5.32, 3.5.33 and 3.5.34.
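Example 3.5.34. (Polar representation of hyperbola with focus in the origin) Define for $a < 0$ and $\varepsilon > 1$ the corresponding hyperbola $H_{a,\varepsilon}$ with center $(-a\varepsilon, 0)$, foci at $(0, 0)$, $(-2a\varepsilon, 0)$ and eccentricity $\varepsilon$ by
\[ H_{a,\varepsilon} := \left\{ (x, y) \in \mathbb{R}^2 : x \leq - \frac{a}{\varepsilon} \, (\varepsilon^2 - 1) \wedge \frac{(x + a\varepsilon)^2}{a^2} - \frac{y^2}{a^2 (\varepsilon^2 - 1)} = 1 \right\} . \]
Show that
\[ H_{a,\varepsilon} = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r \, (1 + \varepsilon \cos \varphi) = a (1 - \varepsilon^2) \} . \tag{3.5.26} \]
Solution: In Example 3.5.28, we showed for $a < 0$ and $c > -a$ that the following equations are equivalent for $(x, y) \in \mathbb{R}^2$
\[ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \]
together with the condition that
\[ x \leq - \frac{a^2}{c} \]
and
\[ \sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = 2a . \]
Therefore also the equations
\[ \frac{(x - c)^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \]
together with the condition that
\[ x \leq \frac{1}{c} \, (c^2 - a^2) \]
and
\[ \sqrt{x^2 + y^2} - \sqrt{(x - 2c)^2 + y^2} = 2a \]
are equivalent. Hence if $a < 0$ and $\varepsilon > 1$, the equations
\[ \frac{(x + a\varepsilon)^2}{a^2} - \frac{y^2}{a^2 (\varepsilon^2 - 1)} = 1 \]
together with the condition that
\[ x \leq - \frac{a}{\varepsilon} \, (\varepsilon^2 - 1) \]
and
\[ \sqrt{x^2 + y^2} - 2a = \sqrt{(x + 2a\varepsilon)^2 + y^2} \]
are equivalent. The last equation is equivalent to
\[ (x + 2a\varepsilon)^2 + y^2 = \left( \sqrt{x^2 + y^2} - 2a \right)^2 \tag{3.5.27} \]
since the equation
\[ 2a - \sqrt{x^2 + y^2} = \sqrt{(x + 2a\varepsilon)^2 + y^2} \]
has no solution in $\mathbb{R}^2$ since $a < 0$. Further, (3.5.27) is equivalent to
\[ x^2 + y^2 + 4a^2 \varepsilon^2 + 4a\varepsilon x = x^2 + y^2 - 4a \sqrt{x^2 + y^2} + 4a^2 \]
which is equivalent to
\[ \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) . \]
As a consequence, we arrive at the representation
\[ H_{a,\varepsilon} = \left\{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) \right\} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin H_{a,\varepsilon}$, we conclude (3.5.26).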
Problems
1) Sketch the image of the set under g from Example 3.5.30 on polar
coordinates.
tpr, ϕq P Dpgq : r 3u ,
b) tpr, ϕq P Dpg q : r ¡ 2u ,
c) tpr, ϕq P Dpg q : π {2 ¤ ϕ ¤ π {2u ,
d) tpr, ϕq P Dpg q : 3π {4 ¤ ϕ ¤ 5π {6u ,
e) tpr, ϕq P Dpg q : 3π {4 ¤ ϕ ¤ π {4u ,
f) tpr, ϕq P Dpg q : 1 r 2 ^ π {6 ¤ ϕ ¤ π {3u ,
g) tpr, ϕq P Dpg q : 0 r 1 ^ π {3 ¤ ϕ ¤ π {6u .
Find a function whose zero set coincides with g pC q where g is the
a)
2)
transformation from Example 3.5.30 on polar coordinates. In addition, sketch g pC q.
tpr, ϕq P Dpgq : r 4u ,
C tpr, ϕq P Dpg q : 1 r r 2 cospϕq sinpϕq s 0u
C tpr, ϕq P Dpg q : r 2 cospϕqu ,
C tpr, ϕq P Dpg q : r 1{ r 2 cospϕq su ,
C tpr, ϕq P Dpg q : r sin2 pϕqu ,
C tpr, ϕq P Dpg q : r2 sinp2ϕq 1u ,
C tpr, ϕq P Dpg q : r2 2 sinpϕqu .
a) C
b)
c)
d)
e)
f)
g)
,
3) Find a function whose zero set coincides with g 1 pC q where g is the
transformation from Example 3.5.30 on polar coordinates.
tpx, yq P R2 : x 3u ,
C tpx, y q P R2 : 3x 2y 7u ,
C tpx, y q P R2 : 3x2 y 2 9u ,
C tpx, y q P R2 : y 4x2 1 0u ,
C tpx, y q P R2 : 8x2 4y 2 1 0u
C tpx, y q P R2 : 3xy 4u ,
C tpx, y q P R2 : 2x2 3x y 2 1u
a) C
b)
c)
d)
e)
f)
g)
,
.
4) Show that g from Example 3.5.30 on polar coordinates is bijective by
verifying that
g 1 pg pr, ϕqq pr, ϕq
for all pr, ϕq P Dpg q and
g pg 1 px, y qq px, y q
for all px, y q P R2 z tp0, 0qu.
3.5.5 Quadric Surfaces
Quadric surfaces are zero sets of second order polynomials in the coordinates of Cartesian coordinate systems in space. All quadrics are unique
only up to rigid transformations, i.e., compositions of translations and rotations, in space. In the following, we give a brief discussion of the most
important normal forms of quadrics. A detailed study of their geometric
properties is the subject of courses in differential geometry.
Example 3.5.35. For every plane curve $C$, the set $C \times \mathbb{R}$ is called a cylinder,
where we identify the pair $((x, y), z)$ and the triple $(x, y, z)$ for all $x, y, z \in \mathbb{R}$. Examples are:
(i) The parabolic cylinder
$$Z_P := \left\{ (x, y, z) \in \mathbb{R}^3 : x^2 - 4py = 0 \right\}$$
where $p > 0$. The intersection of $Z_P$ with every plane parallel to the
xy-plane is a parabola.
(ii) The elliptic cylinder
$$Z_E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\}$$
where $a, b > 0$. The intersection of $Z_E$ with every plane parallel to
the xy-plane is an ellipse.
(iii) The hyperbolic cylinder
$$Z_H := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \right\}$$
where $a, b > 0$. The intersection of $Z_H$ with every plane parallel to
the xy-plane is a hyperbola.
Fig. 136: Example of a parabolic cylinder. Compare Example 3.5.35.
Fig. 137: Example of an elliptic cylinder. Compare Example 3.5.35.
Fig. 138: Example of a hyperbolic cylinder. Compare Example 3.5.35.
Example 3.5.36. The surface
$$E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} ,$$
where $a, b, c > 0$, is an ellipsoid with half-axes $a$, $b$ and $c$. The intersection
of $E$ with a plane parallel to a coordinate plane is an ellipse, a point, or
the empty set. $E$ may be viewed as a ‘deformed’ sphere, because it is the
image of $S^2(0)$ under the scale transformation $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
$$f(x, y, z) := (ax, by, cz)$$
for all $(x, y, z) \in \mathbb{R}^3$.
Fig. 139: Example of an ellipsoid. Compare Example 3.5.36.
Example 3.5.37. The surface
$$EP := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z}{c} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \right\} ,$$
where $a, b, c > 0$, is called an elliptic paraboloid. The intersection of $EP$
with a plane parallel to the xy-plane is an ellipse, a point or the empty set. The
intersection of $EP$ with a plane containing the z-axis is a parabola.
Fig. 140: Example of an elliptic paraboloid. Compare Example 3.5.37.
Fig. 141: Example of a hyperbolic paraboloid. Compare Example 3.5.38.
Fig. 142: Example of an elliptic cone. Compare Example 3.5.39.
Example 3.5.38. The surface
$$HP := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z}{c} = \frac{x^2}{a^2} - \frac{y^2}{b^2} \right\} ,$$
where $a, b, c > 0$, is called a hyperbolic paraboloid. The intersection of
$HP$ with a plane parallel to the xy-plane is a hyperbola or, for the xy-plane
itself, a pair of intersecting straight lines. The intersection of $HP$
with a plane containing the z-axis is a parabola or, for the two planes given
by $y = \pm (b/a)\,x$, a straight line. The surface looks similar to a ‘saddle’ and is
therefore often called a ‘saddle surface’.
Example 3.5.39. The surface
$$EC := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \right\} ,$$
where $a, b, c > 0$, is called an elliptic cone. The intersection of $EC$ with a
plane parallel to the xy-plane is an ellipse, with midpoint given by its intersection
with the z-axis, or a point called its vertex. The intersection of $EC$ with a
plane containing the z-axis consists of two straight lines crossing in the vertex.
Fig. 143: Example of a hyperboloid of one sheet. Compare Example 3.5.40.
Example 3.5.40. The surface
$$H_1 := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 \right\} ,$$
where $a, b, c > 0$, is called a hyperboloid of one sheet. The intersection of
$H_1$ with a plane parallel to the xy-plane is an ellipse with midpoint given by its
intersection with the z-axis. The intersection with a plane containing the
z-axis is a hyperbola, consisting of two branches.
Example 3.5.41. The surface
$$H_2 := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} + 1 \right\} ,$$
where $a, b, c > 0$, is called a hyperboloid of two sheets. The intersection of
$H_2$ with a plane parallel to the xy-plane is an ellipse with midpoint given by its
intersection with the z-axis, a point or the empty set. The intersection with
a plane containing the z-axis is a hyperbola, consisting of two branches.
Fig. 144: Example of a hyperboloid of two sheets. Compare Example 3.5.41.
Problems
1) Describe and sketch the surface.
a)
b)
c)
d)
e)
tpx, y, zq P R3 : 2y2 3z2 9u ,
tpx, y, zq P R3 : z 6x2 1u ,
tpx, y, zq P R3 : x 2y2 3u ,
tpx, y, zq P R3 : xz 12u ,
tpx, y, zq P R3 : 3x2 5y2 7u .
2) Find the intersections of the surface with the coordinate planes. In
this way, identify the surface and sketch it.
a)
b)
c)
d)
e)
f)
tpx, y, zq P R3 : 2x2 3y2 z2 4u ,
tpx, y, zq P R3 : 9x2 3y2 5z2 12u
tpx, y, zq P R3 : x2 2y2 4z2 3u ,
tpx, y, zq P R3 : y2 6x2 4z2 u ,
tpx, y, zq P R3 : 4x2 3z2 2yu ,
tpx, y, zq P R3 : 4z2 3x2 2y 0u ,
,
g)
tpx, y, zq P R3 : x2
1 0u
.
tpx, y, zq P R3 : z2 4u ,
tpx, y, zq P R3 : 2y2 3z2 0u ,
tpx, y, zq P R3 : x2 4xy 4y2 2u ,
tpx, y, zq P R3 : px 2yq2 2px zq2 u
.
4y 2
3z 2
3) Identify the surfaces.
a)
b)
c)
d)
4) Find a function whose zero set consists of all points that are equidistant from p0, 0, 1q and the coordinate plane
tpx, y, zq P R3 : z 1u .
Identify the surface.
5) Find a function whose zero set consists of all points whose distance
from the z-axis is 3-times the distance from the xy-plane. Identify
the surface.
6) Show that through every point of each of the following surfaces pass two straight lines
that are contained in that surface.
a) The elliptic cone EC ,
b) the hyperbolic paraboloid HP ,
c) the hyperboloid of one sheet H1 .
7) (Conic sections) Let $\alpha \in [0, \pi/2]$ and $C$ be the circular cone defined
by
$$C := \left\{ (x, y, z) \in \mathbb{R}^3 : z^2 = x^2 + y^2 \right\} .$$
a) Find a function whose zero set coincides with the cone $C_\alpha$ resulting from $C$ by clockwise rotation in the yz-plane around
$(0, 0, 1)$ and about the angle $\alpha$. The symmetry axis of that cone
is given by
$$\{ (0, t \sin(\alpha), 1 + t \cos(\alpha)) : t \in \mathbb{R} \}$$
and its vertex by
$$(0, -\sin(\alpha), 1 - \cos(\alpha)) .$$
b) Find the intersection of $C_\alpha$ with the plane parallel to
the xy-plane through $(0, 0, 1)$. Classify those curves.
3.5.6 Cylindrical and Spherical Coordinates
Cartesian coordinate systems are not the only option for coordinatizing the points in space. Most important in this respect are two derivatives
of polar coordinates in the plane: cylindrical and spherical coordinates.
In physics applications, cylindrical coordinates are generally applied if the
system is, in a certain sense, symmetric under rotations around an axis.
In such a case, cylindrical coordinates are defined with respect to a Cartesian coordinate system whose z-axis coincides with that symmetry axis.
Subsequently, see Fig. 145, the cylindrical coordinate system uses the z-coordinate of a point and the polar coordinates $r$ and $\varphi$ of the end point of
the orthogonal projection of its position vector onto the xy-plane as coordinates. Since in this way every point on the z-axis is projected onto
the origin, the points on the z-axis are not covered by this coordinatization.
Usually in such situations, cylindrical coordinates considerably simplify
the analysis of the system compared to the use of a Cartesian coordinate
system.
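Example 3.5.42. (Cylindrical coordinates) Define
$$g : (0, \infty) \times (-\pi, \pi] \times \mathbb{R} \to \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$$
by
$$g(r, \varphi, z) := (r \cos\varphi, r \sin\varphi, z)$$
for all $(r, \varphi, z) \in (0, \infty) \times (-\pi, \pi] \times \mathbb{R}$.
Then $g$ is bijective with inverse
$$g^{-1} : \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \to (0, \infty) \times (-\pi, \pi] \times \mathbb{R}$$
given by
$$g^{-1}(x, y, z) = \begin{cases} \left( \sqrt{x^2 + y^2} ,\ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) ,\ z \right) & \text{if } y \ge 0 \\ \left( \sqrt{x^2 + y^2} ,\ -\arccos\!\left( x / \sqrt{x^2 + y^2} \right) ,\ z \right) & \text{if } y < 0 \end{cases}$$
for all $(x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$.
Fig. 145: Cylindrical coordinates $r, \varphi, z$ of a point $p$ in space. $q$ is the orthogonal projection
of $p$ onto the xy-plane, $r$ is the Euclidean distance of $O$ and $q$, $\varphi$ the angle of the line from
$O$ to $q$ with the x-axis. Compare Example 3.5.42.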
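The map $g$ and its inverse can be written down directly. The following Python sketch is an illustration only; the names `from_cylindrical` and `to_cylindrical` are hypothetical. It follows the case distinction $y \ge 0$, $y < 0$ from the example and rejects points on the z-axis, which are not covered by the coordinatization.

```python
import math

def from_cylindrical(r, phi, z):
    """The map g: (r, phi, z) -> (r cos(phi), r sin(phi), z)."""
    return (r * math.cos(phi), r * math.sin(phi), z)

def to_cylindrical(x, y, z):
    """The inverse of g for points that are not on the z-axis."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        raise ValueError("points on the z-axis are not covered")
    # Clamp to [-1, 1] to guard against rounding errors before calling acos.
    c = max(-1.0, min(1.0, x / rho))
    phi = math.acos(c) if y >= 0.0 else -math.acos(c)
    return (rho, phi, z)
```

Applying `to_cylindrical` after `from_cylindrical` should reproduce $(r, \varphi, z)$ up to rounding errors for $r > 0$ and $-\pi < \varphi \le \pi$.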
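Example 3.5.43. Find a parametrization of the cylinder
$$Z_1 := \left\{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = 1 \right\} ,$$
i.e., an injective map whose range coincides with $Z_1$. Solution: Employing
cylindrical coordinates, $Z_1$ is given by
$$Z_1 = \{ (\cos\varphi, \sin\varphi, z) : \varphi \in (-\pi, \pi], z \in \mathbb{R} \} ,$$
and a parametrization of $Z_1$ is given by $h : (-\pi, \pi] \times \mathbb{R} \to \mathbb{R}^3$ defined by
$$h(\varphi, z) := (\cos\varphi, \sin\varphi, z)$$
for all $\varphi \in (-\pi, \pi]$, $z \in \mathbb{R}$.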
In physics applications, spherical coordinates are generally applied if the
system is, in a certain sense, symmetric under rotations around a point. In
such a case, spherical coordinates are defined with respect to a Cartesian
coordinate system whose origin $O$ coincides with that point. Subsequently,
see Fig. 146, spherical coordinates use the distance $r$ of a point $p$ from $O$,
the angle $\theta$ of the line segment $Op$ from the positive z-axis and the polar
angle $\varphi$ of the end point of the orthogonal projection onto the xy-plane of
the position vector corresponding to $p$. Since in this way every point on
the z-axis is projected onto the origin, also here the points on the z-axis are
not covered by the coordinatization. Usually in such situations, spherical
coordinates considerably simplify the analysis of the system compared to
the use of a Cartesian coordinate system.
Fig. 146: Spherical coordinates $r, \theta, \varphi$ of a point $p$ in space. $r$ is the Euclidean distance
of $O$ and $p$, $\theta$ the angle between the line from $O$ to $p$ and the z-axis, $q$ the orthogonal
projection of $p$ onto the xy-plane, $\varphi$ the angle of the line from $O$ to $q$ with the x-axis.
Compare Example 3.5.44.
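Example 3.5.44. (Spherical coordinates) Define
$$g : (0, \infty) \times (0, \pi) \times (-\pi, \pi] \to \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$$
by
$$g(r, \theta, \varphi) := (r \sin\theta \cos\varphi, r \sin\theta \sin\varphi, r \cos\theta)$$
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi) \times (-\pi, \pi]$.
Then $g$ is bijective with inverse
$$g^{-1} : \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \to (0, \infty) \times (0, \pi) \times (-\pi, \pi]$$
given by
$$g^{-1}(\mathbf{r}) = \begin{cases} \left( |\mathbf{r}| ,\ \arccos(z / |\mathbf{r}|) ,\ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \ge 0 \\ \left( |\mathbf{r}| ,\ \arccos(z / |\mathbf{r}|) ,\ -\arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases}$$
for all $\mathbf{r} = (x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$. In analogy with the situation on the
globe, for $(x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$ the second and third component
of $g^{-1}((x, y, z))$ can be called the co-latitude and longitude, respectively, of
$(x, y, z)$. Note that for a point on the northern hemisphere $\pi/2$ minus its co-latitude
gives its latitude, whereas for a point on the southern hemisphere
the latitude is given by the difference of its co-latitude and $\pi/2$.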
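As for cylindrical coordinates, the transformation and its inverse translate directly into code. The following Python sketch uses the hypothetical names `from_spherical` and `to_spherical`; it is meant as an illustration of the formulas above, not as a robust library routine.

```python
import math

def from_spherical(r, theta, phi):
    """The map g: (r, theta, phi) -> (r sin(theta) cos(phi), r sin(theta) sin(phi), r cos(theta))."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def to_spherical(x, y, z):
    """The inverse of g for points that are not on the z-axis."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        raise ValueError("points on the z-axis are not covered")
    norm = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / norm)            # co-latitude in (0, pi)
    c = max(-1.0, min(1.0, x / rho))       # guard against rounding errors
    phi = math.acos(c) if y >= 0.0 else -math.acos(c)  # longitude in (-pi, pi]
    return (norm, theta, phi)
```

A round trip such as `to_spherical(*from_spherical(3.0, 1.0, -2.0))` should return the original triple up to rounding errors.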
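Example 3.5.45. Find a parametrization of
$$E_0 := E \setminus \{ (0, 0, -c), (0, 0, c) \} ,$$
where $E$ is the ellipsoid defined by
$$E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\}$$
and $a, b, c > 0$, i.e., find an injective map whose range coincides with $E_0$.
Solution: $E$ is the image of $S^2(0)$ under the scale transformation $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
$$f(x, y, z) := (ax, by, cz)$$
for all $(x, y, z) \in \mathbb{R}^3$. Employing spherical coordinates,
$$S^2(0) \setminus \{ (0, 0, -1), (0, 0, 1) \} = \left\{ (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta) \in \mathbb{R}^3 : \theta \in (0, \pi), \varphi \in (-\pi, \pi] \right\} .$$
Hence
$$E_0 = \left\{ (a \sin\theta \cos\varphi, b \sin\theta \sin\varphi, c \cos\theta) \in \mathbb{R}^3 : \theta \in (0, \pi), \varphi \in (-\pi, \pi] \right\} ,$$
and a parametrization of $E_0$ is given by $h : (0, \pi) \times (-\pi, \pi] \to \mathbb{R}^3$ defined
by
$$h(\theta, \varphi) := (a \sin\theta \cos\varphi, b \sin\theta \sin\varphi, c \cos\theta)$$
for all $\theta \in (0, \pi)$, $\varphi \in (-\pi, \pi]$.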
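That the values of such a parametrization lie on the ellipsoid can be verified numerically. The short Python sketch below is an illustration under the stated formulas; the function names are hypothetical.

```python
import math

def ellipsoid_point(a, b, c, theta, phi):
    """Value h(theta, phi) = (a sin(theta) cos(phi), b sin(theta) sin(phi), c cos(theta))."""
    return (a * math.sin(theta) * math.cos(phi),
            b * math.sin(theta) * math.sin(phi),
            c * math.cos(theta))

def ellipsoid_defect(a, b, c, point):
    """Value of x^2/a^2 + y^2/b^2 + z^2/c^2 - 1, which vanishes on the ellipsoid."""
    x, y, z = point
    return x * x / a ** 2 + y * y / b ** 2 + z * z / c ** 2 - 1.0
```

For any admissible $\theta$ and $\varphi$, the defect of `ellipsoid_point(a, b, c, theta, phi)` should vanish up to rounding errors.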
Problems
1) Describe the image of the set under g from Example 3.5.42 on cylindrical coordinates.
a)
b)
c)
d)
e)
f)
g)
tpr, ϕ, zq P Dpgq : r 3 ^ 1 z 1u ,
tpr, ϕ, zq P Dpgq : r ¡ 2 ^ 0 z 3u ,
tpr, ϕ, zq P Dpgq : 1 r 2u ,
tpr, ϕ, zq P Dpgq : π{2 ¤ ϕ ¤ π{2 ^ z 1u ,
tpr, ϕ, zq P Dpgq : 3π{4 ¤ ϕ ¤ 5π{6 ^ z ¤ 0u ,
tpr, ϕ, zq P Dpgq : 3π{4 ¤ ϕ ¤ π{4 ^ z ¥ 1u ,
tpr, ϕ, zq P Dpgq : 0 r 1 ^ π{3 ¤ ϕ ¤ π{6u
.
2) Find a function whose zero set f : U Ñ R coincides with g pS q
where g is the transformation from Example 3.5.42 on cylindrical
coordinates. In addition, sketch g pS q.
tpr, ϕ, zq P Dpgq : r 3u ,
S tpr, ϕ, z q P Dpg q : z 2ru ,
S tpr, ϕq P Dpg q : z 3r sinpϕq 12u ,
S tpr, ϕq P Dpg q : z 2 4r2 1u ,
S tpr, ϕq P Dpg q : 6z 2r2 3 0u ,
S tpr, ϕq P Dpg q : z 2 3 5r2 u ,
S tpr, ϕq P Dpg q : r 2 cospϕqu .
a) S
b)
c)
d)
e)
f)
g)
3) Find a function whose zero set coincides with g 1 pS q where g is the
transformation from Example 3.5.42 on cylindrical coordinates.
a) S
tpx, y, zq P R2 : 2x 6y
z
1u
,
tpx, y, zq P R2 : x2 y2 4u ,
S tpx, y, z q P R2 : x2 y 2 3z 2 2u ,
S tpx, y, z q P R2 : 2x2 2y 2 9z u ,
S tpx, y, z q P R2 : x2 y 2 2z 2 5u ,
C tpx, y q P R2 : 4x2 4y 2 z 2 1 0u
C tpx, y q P R2 : 8x2 y 2 3z 2 0u .
S
b)
c)
d)
e)
f)
g)
,
4) Describe the image of the set under g from Example 3.5.44 on spherical coordinates.
a)
b)
c)
d)
e)
f)
tpr, ϕ, θq P Dpgq : r 3u ,
tpr, ϕ, θq P Dpgq : r ¡ 2u ,
tpr, ϕ, θq P Dpgq : 1 r 8u ,
tpr, ϕ, θq P Dpgq : 0 ¤ θ ¤ π{4u ,
tpr, ϕ, θq P Dpgq : π{6 ¤ θ ¤ π{4u ,
tpr, ϕ, θq P Dpgq : r P r1, 2s ^ θ P rπ{6, π{3s
^ ϕ P rπ{6, π{3su .
5) Find a function whose zero set coincides with g pS q with g from Example 3.5.44 on spherical coordinates. In addition, sketch g pS q.
tpr, ϕ, zq P Dpgq : r 5 0u ,
S tpr, ϕ, z q P Dpg q : ϕ π {6u ,
S tpr, ϕ, z q P Dpg q : θ π {4u ,
S tpr, ϕ, z q P Dpg q : r 6 cospθqu ,
S tpr, ϕ, z q P Dpg q : r sinpθq 4u ,
S tpr, ϕ, z q P Dpg q : r cospθq 2u ,
S tpr, ϕ, z q P Dpg q : r2 cosp2ϕq sin2 pθq 1u
a) S
b)
c)
d)
e)
f)
g)
.
6) Find a function whose zero set coincides with g 1 pS q where g is the
transformation from Example 3.5.44 on spherical coordinates.
tpx, y, zq P R2 : x2
S tpx, y, z q P R2 : x2
S tpx, y, z q P R2 : px2
a) S
b)
c)
y2
z 2 2y
0u
,
3u ,
y q 4z 2 px2 y 2 qu
y
2
2 2
.
7) Show that g from Example 3.5.42 on cylindrical coordinates is bijective by verifying that
g 1 pg pr, ϕ, z qq pr, ϕ, z q
for all pr, ϕ, z q P Dpg q and
g pg 1 px, y, z qq px, y, z q
for all px, y, z q P R3 z pt0u t0u Rq.
8) Show that g from Example 3.5.44 on spherical coordinates is bijective by verifying that
g 1 pg pr, ϕ, θqq pr, ϕ, θq
for all pr, ϕ, θq P Dpg q and
g pg 1 px, y, z qq px, y, z q
for all px, y, z q P R3 ztp0, 0, 0qu.
3.5.7 Limits in $\mathbb{R}^n$
Within this section, we assume that $n \in \mathbb{N} \setminus \{0, 1\}$.
The concept of limits of sequences of real numbers has been fundamental
for our development of Calculus I. The same will be true for Calculus III
which develops in particular the calculus for functions defined on subsets
of $\mathbb{R}^n$. The following definition is analogous to the corresponding definition in Calculus I. The main difference is the replacement of the modulus
function by the Euclidean distance in $\mathbb{R}^n$.
Definition 3.5.46. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}^n$ and
$x \in \mathbb{R}^n$. We define
$$\lim_{m \to \infty} x_m = x$$
if for every $\varepsilon > 0$ there is a corresponding $m_0$ such that for all $m \ge m_0$:
$$[\, e_n(x_m, x) = \,]\ |x_m - x| < \varepsilon .$$
Fig. 147: A sequence in space is convergent if and only if all its coordinate projections
converge. Compare Theorem 3.5.47.
The following theorem states that a sequence of elements in Rn is converging to some x P Rn if and only if for every i P t1, . . . , nu the corresponding
sequence of its i-th components converges in R to the i-th component of x.
In this way, the question of convergence or non-convergence of a sequence
in Rn is reduced to the question of convergence or non-convergence of sequences of real numbers.
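Theorem 3.5.47. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}^n$ where
$n \in \mathbb{N} \setminus \{0, 1\}$ and $x \in \mathbb{R}^n$. Then
$$\lim_{m \to \infty} x_m = x$$
if and only if
$$\lim_{m \to \infty} x_{mj} = x_j$$
for all $j = 1, \dots, n$.
Proof. First, we note that
$$\max_{j = 1, \dots, n} |y_j| \le |y| \le |y_1| + \dots + |y_n|$$
for all $y \in \mathbb{R}^n$.
Hence if $\lim_{m \to \infty} x_m = x$, $\varepsilon > 0$ is given and $m_0$ is such that for all
$m \ge m_0$
$$|x_m - x| < \varepsilon ,$$
then also for every $j \in \{1, \dots, n\}$ and every $m \ge m_0$
$$|x_{mj} - x_j| < \varepsilon$$
and hence also
$$\lim_{m \to \infty} x_{mj} = x_j .$$
On the other hand, if for every $j \in \{1, \dots, n\}$
$$\lim_{m \to \infty} x_{mj} = x_j ,$$
$\varepsilon > 0$ is given and for every $j \in \{1, \dots, n\}$ the corresponding $m_{0j}$ is such
that for every $m \ge m_{0j}$
$$|x_{mj} - x_j| < \frac{\varepsilon}{n} ,$$
then it follows for every $m \ge \max\{m_{01}, m_{02}, \dots, m_{0n}\}$ that
$$|x_m - x| < \varepsilon ,$$
and hence that
$$\lim_{m \to \infty} x_m = x .$$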
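Example 3.5.48. Calculate
$$\lim_{n \to \infty} \left( \frac{\sin(n)}{n} , \frac{n^2}{n^2 + 1} , \frac{2}{n} \right) .$$
Solution:
$$\lim_{n \to \infty} \left( \frac{\sin(n)}{n} , \frac{n^2}{n^2 + 1} , \frac{2}{n} \right) = \left( \lim_{n \to \infty} \frac{\sin(n)}{n} , \lim_{n \to \infty} \frac{n^2}{n^2 + 1} , \lim_{n \to \infty} \frac{2}{n} \right) = (0, 1, 0) .$$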
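The inequality from the proof of Theorem 3.5.47, which bounds the Euclidean distance from below by the largest and from above by the sum of the component errors, can be observed numerically for the sequence of Example 3.5.48. The following Python sketch is an illustration only; all names in it are hypothetical.

```python
import math

LIMIT = (0.0, 1.0, 0.0)

def member(n):
    """n-th member (sin(n)/n, n^2/(n^2 + 1), 2/n) of the sequence, for n >= 1."""
    return (math.sin(n) / n, n * n / (n * n + 1.0), 2.0 / n)

def component_errors(n):
    """Absolute errors of the components with respect to the limit (0, 1, 0)."""
    return [abs(a - b) for a, b in zip(member(n), LIMIT)]

def distance_to_limit(n):
    """Euclidean distance of the n-th member from the limit."""
    return math.sqrt(sum(e * e for e in component_errors(n)))
```

For growing `n`, `distance_to_limit(n)` should decrease towards 0 and always lie between `max(component_errors(n))` and `sum(component_errors(n))`.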
As a corollary, we obtain from the limit laws for sequences of real numbers
limit laws valid for sequences in $\mathbb{R}^n$. In particular, part (i) states that a
sequence in $\mathbb{R}^n$ can have at most one limit, part (ii) states that the
sequence consisting of the sums of the members of two convergent sequences
in $\mathbb{R}^n$ converges to the sum of their limits, and part (iii) states that
the sequence consisting of scalar multiples of the members of a convergent
sequence in $\mathbb{R}^n$ converges to that scalar multiple of its limit.
Corollary 3.5.49. Let $x_1, x_2, \dots$; $y_1, y_2, \dots$ be sequences of elements of
$\mathbb{R}^n$; $x, \bar{x}, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$.
(i) If
$$\lim_{m \to \infty} x_m = x \quad \text{and} \quad \lim_{m \to \infty} x_m = \bar{x} ,$$
then $\bar{x} = x$.
(ii) If
$$\lim_{m \to \infty} x_m = x \quad \text{and} \quad \lim_{m \to \infty} y_m = y ,$$
then
$$\lim_{m \to \infty} (x_m + y_m) = x + y .$$
(iii) If
$$\lim_{m \to \infty} x_m = x ,$$
then
$$\lim_{m \to \infty} a . x_m = a . x .$$
Problems
1) If existent, calculate the limit of the sequence pxn , yn , f pxn , yn qq,
n P N . Otherwise, show non-existence of the limit. Where applicable, a P R.
a) xn :
1
a
2xy 2
, yn : , f px, y q : 2
, px, y q P R2 z t0u
n
n
x
y2
1
a
2xy 2
, px, y q P R2 z t0u
, yn : , f px, y q : 2
n
n
x
y4
1
1
2xy 2
xn : 2 , yn : , f px, y q : 2
, px, y q P R2 z t0u
n
n
x
y4
1
a
xy
xn : , yn : , f px, y q : 2
, px, y q P R2 z t0u
n
n
x
y2
1
a
x y
xn : , yn : , f px, y q : 2
, px, y q P R2 z t0u
n
n
x
y2
a
x2
1
, px, y q P R2 z t0u
xn : , yn : , f px, y q : 2
n
n
x
y2
1
a
x
xn : , yn : , f px, y q : 2
, px, y q P R2 z t0u
n
n
x
y2
1
a
x2 y 2
,
xn : , yn : , f px, y q : 3
n
n
x
y3
px, yq P R2 z tpx, xq : x P Ru
b) xn :
c)
d)
e)
f)
g)
h)
1
e1{n
x2 y 2
,
, yn : , f px, y q : 3
n
n
x
y3
px, yq P R2 z tpx, xq : x P Ru
i) xn :
1
a
x3
, yn : , f px, y q : 2
n
n
x
2
2
px, yq P R z tpx, x q : x P Ru
j) xn :
k)
y3
,
y
x3
1
e1{n
, yn : 2 , f px, y q : 2
n
n
x
2
2
px, yq P R z tpx, x q : x P Ru
xn :
a
x3
1
, yn : , f px, y q :
n
n
x
2
px, yq P R z tpx, xq : x P Ru
l) xn :
y3
,
y
y3
,
y
1
a
x2 y 2
, yn : , f px, y q : 4
,
n
n
x
y4
px, yq P R2 z t0u
m) xn :
a
x2 y 2
1
, yn : , f px, y q : 2
,
n
n
x
y2
px, yq P R2 z t0u .
n) xn :
2) Prove Corollary 3.5.49.
3.5.8 Paths in $\mathbb{R}^n$
As simple examples of functions assuming values in $\mathbb{R}^n$, $n \in \mathbb{N}^* \setminus \{1\}$, the
following section considers paths. These have intervals of $\mathbb{R}$ as their domains.
Paths occur frequently in applications, e.g., in the description of the motion of a point particle in Newtonian mechanics. In such applications,
the domain of a path is a time interval and its range is a curve in space.
To every time $t$ from the domain, the path associates the corresponding
position of the particle in space. In particular, such a path needs to satisfy
Newton’s differential equations of motion, given later on, in order to describe a possible motion of a point particle in nature.
The following defines the continuity and differentiability of a path in terms
of the corresponding properties of its component functions. This should
not be surprising in view of Theorem 3.5.47. In Calculus III, we give more
general definitions for the continuity and differentiability of vector-valued
functions of several variables. In the special case of paths, those definitions
are equivalent to the definitions below.
In the definition of the derivative of paths, we meet tangent vectors for the first time.
These have not only magnitude and direction, but also a point
of attack. If $u$ is a path and $s$ is an element of the domain of $u$, the value
of the derivative of $u$ in $s$, $u'(s)$, if existent, is a (tangent) vector that has
as point of attack (or ‘is attached to’) the point $u(s)$. This point of attack is
not indicated in the notation, which is often confusing for the beginner but
is standard practice in calculus / analysis courses and in applications. The
present text also follows this convention. This does not lead to any serious
complications for the problems considered in this text, but the reader should
keep this fact in mind when interpreting the results. Hence, as is usual in
other calculus texts, tangent vectors will be treated as position vectors. As a
consequence, the derivative of a path will be a path, too. A proper definition
of tangent vectors is given in most courses in differential geometry.
Definition 3.5.50. Let $n \in \mathbb{N}^*$.
(i) A path is a map $u : I \to \mathbb{R}^n$ from some non-empty subinterval $I$ of
$\mathbb{R}$ into $\mathbb{R}^n$. The range of a path is frequently called a curve.
(ii) A path $u : I \to \mathbb{R}^n$ is called continuous if all corresponding component functions $u_i : I \to \mathbb{R}$ that associate to every $t \in I$ the $i$-th
component of $u(t)$, $i \in \{1, \dots, n\}$, are continuous.
(iii) A path $u : I \to \mathbb{R}^n$ is said to be differentiable in some inner point
$t_0 \in I$, i.e., some point $t_0 \in I$ for which there is some $\varepsilon > 0$ such
that $(t_0 - \varepsilon, t_0 + \varepsilon) \subset I$, if all corresponding component functions
$u_i : I \to \mathbb{R}$, $i \in \{1, \dots, n\}$, are differentiable in $t_0$. In this case, we
define its derivative in $t_0$ by
$$u'(t_0) := (u_1'(t_0), \dots, u_n'(t_0)) .$$
The latter will also be called the tangent vector to $u$ in $u(t_0)$.
Example 3.5.51. Calculate the derivative of the path $u : \mathbb{R} \to \mathbb{R}^3$ defined
by
$$u(t) := (\cos t, \sin t, t)$$
for all $t \in \mathbb{R}$. Solution: $u$ is differentiable since all its component functions
are differentiable in the sense of Calculus I. Hence
$$u'(t) = (-\sin t, \cos t, 1)$$
for all $t \in \mathbb{R}$. See Fig. 148.
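Since the derivative of a path is defined componentwise, it can be approximated componentwise by difference quotients. The following Python sketch compares the derivative of the helix with a central difference quotient; the function names and the step size `h` are assumptions made for this illustration.

```python
import math

def helix(t):
    """The path u(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def helix_derivative(t):
    """The derivative u'(t) = (-sin t, cos t, 1)."""
    return (-math.sin(t), math.cos(t), 1.0)

def central_difference(path, t, h=1e-6):
    """Componentwise central difference quotient (path(t + h) - path(t - h)) / (2h)."""
    forward, backward = path(t + h), path(t - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(forward, backward))
```

For a fixed `t`, the components of `central_difference(helix, t)` should agree with those of `helix_derivative(t)` up to a small discretization error.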
In applications, frequently derivatives of paths need to be calculated that are
composed of other paths. Rules for the differentiation of such frequently
occurring ‘compositions’ are given in a subsequent theorem and are simple
consequences of the following theorem.
Fig. 148: Tangent vector $v$ at a point of a helix. Compare Example 3.5.51.
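Theorem 3.5.52. Let $l, m, n \in \mathbb{N}^*$, $\lambda : \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}^n$ be a bilinear map,
i.e., such that
$$\lambda(\alpha . x + \beta . y, z) = \alpha . \lambda(x, z) + \beta . \lambda(y, z) ,$$
$$\lambda(x, \alpha . z + \beta . w) = \alpha . \lambda(x, z) + \beta . \lambda(x, w)$$
for all $x, y \in \mathbb{R}^l$, $z, w \in \mathbb{R}^m$ and $\alpha, \beta \in \mathbb{R}$. Further, let $I$ be a non-void open
interval of $\mathbb{R}$ and $u : I \to \mathbb{R}^l$, $v : I \to \mathbb{R}^m$ be differentiable paths. Then the
path $\lambda(u, v) : I \to \mathbb{R}^n$ defined by
$$[\lambda(u, v)](t) := \lambda(u(t), v(t))$$
for all $t \in I$ is differentiable, and
$$[\lambda(u, v)]'(t) = \lambda(u'(t), v(t)) + \lambda(u(t), v'(t))$$
for all $t \in I$.
Proof. For this, let $e^l_1, \dots, e^l_l$ and $e^m_1, \dots, e^m_m$ be the canonical bases of $\mathbb{R}^l$ and
$\mathbb{R}^m$, respectively. It follows by the bilinearity of $\lambda$ that
$$\lambda(x, z) = \sum_{j=1}^{l} \sum_{k=1}^{m} (x_j z_k) . \lambda(e^l_j, e^m_k)$$
for all $x \in \mathbb{R}^l$, $z \in \mathbb{R}^m$. Hence
$$[\lambda(u, v)]_i = \sum_{j=1}^{l} \sum_{k=1}^{m} [\lambda(e^l_j, e^m_k)]_i \, u_j v_k$$
is differentiable by Theorem 2.4.8 with derivative
$$[\lambda(u, v)]_i'(t) = \sum_{j=1}^{l} \sum_{k=1}^{m} [\lambda(e^l_j, e^m_k)]_i \, \big( u_j'(t) v_k(t) + u_j(t) v_k'(t) \big) = [\lambda(u'(t), v(t)) + \lambda(u(t), v'(t))]_i$$
for all $t \in I$ and $i \in \{1, \dots, n\}$.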
Theorem 3.5.53. Let $n \in \mathbb{N}^*$, $I$, $J$ be non-void open intervals of $\mathbb{R}$, $u, v :
I \to \mathbb{R}^n$ differentiable paths, $f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$ be differentiable.
Then
(i) $u + v : I \to \mathbb{R}^n$, defined by
$$(u + v)(t) := u(t) + v(t)$$
for every $t \in I$, is differentiable and
$$(u + v)'(t) = u'(t) + v'(t)$$
for all $t \in I$.
(ii) $f . u : I \to \mathbb{R}^n$, defined by
$$(f . u)(t) := f(t) . u(t)$$
for every $t \in I$, is differentiable and
$$(f . u)'(t) = f'(t) . u(t) + f(t) . u'(t)$$
for all $t \in I$.
(iii) $u \cdot v : I \to \mathbb{R}$, defined by
$$(u \cdot v)(t) := u(t) \cdot v(t)$$
for every $t \in I$, is differentiable and
$$(u \cdot v)'(t) = u'(t) \cdot v(t) + u(t) \cdot v'(t)$$
for all $t \in I$.
(iv) if $n = 3$, then $u \times v : I \to \mathbb{R}^3$, defined by
$$(u \times v)(t) := u(t) \times v(t)$$
for every $t \in I$, is differentiable and
$$(u \times v)'(t) = u'(t) \times v(t) + u(t) \times v'(t)$$
for all $t \in I$.
(v) if $\operatorname{Ran} g \subset I$, then $u \circ g : J \to \mathbb{R}^n$ is differentiable and
$$(u \circ g)'(t) = g'(t) . u'(g(t))$$
for all $t \in J$.
Proof. ‘(i)-(iv)’ are consequences of Theorem 3.5.52. ‘(v)’: It follows by
Theorem 2.4.10 that
$$(u \circ g)_i = u_i \circ g$$
is differentiable with derivative
$$[(u \circ g)_i]'(t) = u_i'(g(t)) \, g'(t) = [g'(t) . u'(g(t))]_i$$
for all $t \in J$, $i \in \{1, \dots, n\}$.
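The product rule for the vector product, part (iv), can be tested numerically on concrete paths. In the following Python sketch the paths `u` and `v`, their derivatives and all function names are chosen freely for illustration; the derivative of $u \times v$ is approximated by a central difference quotient and compared with $u' \times v + u \times v'$.

```python
import math

def cross(a, b):
    """Vector product of two elements of R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def u(t):
    return (math.cos(t), math.sin(t), t)

def du(t):
    return (-math.sin(t), math.cos(t), 1.0)

def v(t):
    return (t * t, 1.0, math.exp(t))

def dv(t):
    return (2.0 * t, 0.0, math.exp(t))

def product_rule_value(t):
    """Right-hand side u'(t) x v(t) + u(t) x v'(t) of the product rule."""
    first, second = cross(du(t), v(t)), cross(u(t), dv(t))
    return tuple(a + b for a, b in zip(first, second))

def numerical_derivative(t, h=1e-6):
    """Central difference quotient of the path u x v at t."""
    forward, backward = cross(u(t + h), v(t + h)), cross(u(t - h), v(t - h))
    return tuple((f - b) / (2.0 * h) for f, b in zip(forward, backward))

def max_deviation(t):
    """Largest componentwise deviation between both sides."""
    return max(abs(a - b) for a, b in zip(product_rule_value(t), numerical_derivative(t)))
```

The value of `max_deviation(t)` should be small for every `t`, limited only by the discretization and rounding errors of the difference quotient.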
Example 3.5.54. Let $r$ be a twice differentiable path (the trajectory of a
point particle parametrized by time) from some non-void open interval $I$ of
$\mathbb{R}$ into $\mathbb{R}^3$ and satisfying
$$m . r''(t) = 0$$
for all $t \in I$ (Newton’s equation of motion without external forces) where
$m > 0$ (the mass of the particle). Then
$$\left( \frac{m}{2}\, v^2 \right)'(t) = m \, r'(t) \cdot r''(t) = 0$$
for all $t \in I$ where $v := r'$ (the velocity field of the particle) and $v^2 := v \cdot v$.
Hence it follows by Theorem 2.5.7 that the function
$$\frac{m}{2}\, v^2$$
(the kinetic energy of the particle) is constant (‘is a constant of motion’).
Example 3.5.55. (Kepler problem, I) Let $r$ be a twice differentiable path
(the trajectory of a point particle parametrized by time) from some non-void open interval $I$ of $\mathbb{R}$ into $\mathbb{R}^3 \setminus \{0\}$ satisfying
$$m . r''(t) = - \frac{\gamma m M}{|r(t)|^3} . r(t) \tag{3.5.28}$$
for all $t \in I$ (Newton’s equation of motion for a point particle under the
influence of the gravitational field of a point mass located at the origin)
where $m, M, \gamma > 0$ (the mass of the particle, the mass of the gravitational
source, the gravitational constant). Show that the total energy $E : I \to \mathbb{R}$
of the system, the angular momentum $\mathbf{L} : I \to \mathbb{R}^3$ and the Lenz vector
$\mathbf{A} : I \to \mathbb{R}^3$ defined by
$$E(t) := \frac{m}{2}\, r'(t) \cdot r'(t) - \frac{\gamma m M}{|r(t)|} ,$$
$$\mathbf{L}(t) := r(t) \times [m . r'(t)] = m . r(t) \times r'(t) ,$$
$$\mathbf{A}(t) := m . \left[ r'(t) \times \mathbf{L}(t) - \frac{\gamma m M}{|r(t)|} . r(t) \right]$$
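for every $t \in I$ are constant. Solution: It follows by Theorem 3.5.53,
(3.5.28) and Theorem 3.5.16 (vi) that
$$E'(t) = m \, r'(t) \cdot r''(t) + \gamma m M \, \frac{r(t) \cdot r'(t)}{|r(t)|^3} = 0 ,$$
$$\mathbf{L}'(t) = m . r'(t) \times r'(t) + m . r(t) \times r''(t) = 0$$
for all $t \in I$ and
$$a'(t) = r''(t) \times \mathbf{L}(t) + \gamma m M \, \frac{r(t) \cdot r'(t)}{|r(t)|^3} . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t)$$
$$= r''(t) \times \big( r(t) \times [m . r'(t)] \big) - m . [r'(t) \cdot r''(t)] . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t)$$
$$= m . [r'(t) \cdot r''(t)] . r(t) - m . [r(t) \cdot r''(t)] . r'(t) - m . [r'(t) \cdot r''(t)] . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t) = 0$$
for all $t \in I$ where $a : I \to \mathbb{R}^3$ is defined by
$$a(t) := m^{-1} . \mathbf{A}(t)$$
for all $t \in I$. Hence it follows by Theorem 2.5.7 that $E$, $\mathbf{L}$ and $\mathbf{A}$ are constant. In the following, we derive necessary consequences of these conservation laws. In this, we denote by $E$, $\mathbf{L}$ and $\mathbf{A}$ the corresponding constants
and $L := |\mathbf{L}|$, $A := |\mathbf{A}|$. In particular, we assume that $A \ne 0$, $L \ne 0$ and
denote for $t \in I$ by $\theta(t) \in (-\pi, \pi]$ the ‘polar’ angle between $\mathbf{A}$ and $r(t)$.
Then it follows by Theorem 3.5.16 (v) that
$$A \, |r(t)| \cos(\theta(t)) = \mathbf{A} \cdot r(t) = m \, r(t) \cdot \left[ r'(t) \times \mathbf{L} - \frac{\gamma m M}{|r(t)|} . r(t) \right]$$
$$= m \, r(t) \cdot [\, r'(t) \times \mathbf{L} \,] - \gamma m^2 M \, |r(t)| = m \, \mathbf{L} \cdot [\, r(t) \times r'(t) \,] - \gamma m^2 M \, |r(t)| = L^2 - \gamma m^2 M \, |r(t)|$$
and hence that
$$|r(t)| \, [\, 1 + \varepsilon \cos(\theta(t)) \,] = p$$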
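for every $t \in I$ where
$$\varepsilon := \frac{A}{\gamma m^2 M} , \qquad p := \frac{L^2}{\gamma m^2 M} .$$
In addition, it follows by Theorem 3.5.16 (v), (vii) for $t \in I$ that
$$\frac{A^2}{m^2} = \left[ r'(t) \times \mathbf{L} - \frac{\gamma m M}{|r(t)|} . r(t) \right] \cdot \left[ r'(t) \times \mathbf{L} - \frac{\gamma m M}{|r(t)|} . r(t) \right]$$
$$= (r'(t) \times \mathbf{L}) \cdot (r'(t) \times \mathbf{L}) - \frac{2 \gamma m M}{|r(t)|} \, r(t) \cdot (r'(t) \times \mathbf{L}) + \gamma^2 m^2 M^2$$
$$= L^2 |r'(t)|^2 - [\, \mathbf{L} \cdot r'(t) \,]^2 - \frac{2 \gamma m M}{|r(t)|} \, \mathbf{L} \cdot (r(t) \times r'(t)) + \gamma^2 m^2 M^2$$
$$= L^2 \left( \frac{2E}{m} + \frac{2 \gamma M}{|r(t)|} \right) - L^2 \, \frac{2 \gamma M}{|r(t)|} + \gamma^2 m^2 M^2 = \frac{2 E L^2}{m} + \gamma^2 m^2 M^2$$
and hence that
$$\varepsilon = \sqrt{ 1 + \frac{2 E L^2}{\gamma^2 m^3 M^2} } .$$
As a consequence,
$$\varepsilon \ \begin{cases} < 1 & \text{if } E < 0 \\ = 1 & \text{if } E = 0 \\ > 1 & \text{if } E > 0 . \end{cases}$$
By its definition, $\mathbf{L}$ is orthogonal to $r'(t)$ for every $t \in I$. Hence it follows
for every $t_0, t_1 \in I$ satisfying $t_0 < t_1$ that
$$\mathbf{L} \cdot [\, r(t_1) - r(t_0) \,] = \int_{t_0}^{t_1} \mathbf{L} \cdot r'(t) \, dt = 0 .$$
Hence the motion of the particle proceeds in a plane $S$ with normal vector
$$n_3 := L^{-1} . \mathbf{L} .$$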
In the following, we make the natural assumption that $\theta(t)$ assumes all
values in $(-\pi, \pi]$. Then $S$ contains the origin. This can be seen as follows.
By assumption, there are $t_0, t_1 \in I$ such that
$$\theta(t_0) = -\frac{\pi}{2} , \qquad \theta(t_1) = \frac{\pi}{2}$$
and hence
$$\mathbf{L} \cdot r(t_0) = \mathbf{L} \cdot r(t_1) = 0 , \quad \mathbf{A} \cdot r(t_0) = \mathbf{A} \cdot r(t_1) = 0 , \quad |r(t_0)| = |r(t_1)| .$$
Therefore, we conclude, by noting that $\mathbf{L} \cdot \mathbf{A} = 0$, that
$$r(t_0) = - r(t_1)$$
and hence that
$$\frac{1}{2} \, (r(t_1) + r(t_0)) = 0 \in S .$$
Therefore, it follows from Examples 3.5.32, 3.5.33, 3.5.34 that $\operatorname{Ran}(r)$ is part of a
conic in $S$ with one focus in the origin. In particular, the conic is an ellipse,
parabola or hyperbola if $E < 0$, $E = 0$ or $E > 0$, respectively. Note that
the previous was derived from the assumption of the existence of a solution
of (3.5.28) with the prescribed properties. Indeed, that existence can be
proved, and this is done, for instance, in courses in theoretical mechanics.
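The relation between the eccentricity, the energy and the angular momentum is an algebraic identity in the position and the velocity of the particle, so it can be checked for an arbitrary state without solving the equation of motion. The following Python sketch does this; all function names and the sample values are hypothetical, and the Lenz vector is taken as $m . [\, r' \times \mathbf{L} - (\gamma m M / |r|) . r \,]$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def kepler_invariants(r, v, m, M, gamma):
    """Energy E, angular momentum L and Lenz vector A for position r and velocity v."""
    dist = math.sqrt(dot(r, r))
    E = 0.5 * m * dot(v, v) - gamma * m * M / dist
    L = tuple(m * c for c in cross(r, v))
    vxL = cross(v, L)
    A = tuple(m * (c - gamma * m * M / dist * x) for c, x in zip(vxL, r))
    return E, L, A

def eccentricity_from_lenz(A, m, M, gamma):
    """epsilon = |A| / (gamma m^2 M)."""
    return math.sqrt(dot(A, A)) / (gamma * m * m * M)

def eccentricity_from_energy(E, L, m, M, gamma):
    """epsilon = sqrt(1 + 2 E L^2 / (gamma^2 m^3 M^2))."""
    return math.sqrt(1.0 + 2.0 * E * dot(L, L) / (gamma ** 2 * m ** 3 * M ** 2))
```

Both ways of computing the eccentricity should agree up to rounding errors, and the computed $\mathbf{L}$ should be orthogonal to both $r$ and $\mathbf{A}$.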
Example 3.5.56. (Kepler problem, II, Levi-Civita’s transformation) We
continue the discussion from the previous example and present Tullio Levi-Civita’s ingenious method to transform (3.5.28) into a form whose solutions
are obvious. His key idea is the ansatz (3.5.31) which transforms ellipses
that have a focus in the origin into ellipses with centers in the origin. In
the first step, we introduce a new time variable. For this, let $t_0 \in I$ and
$I_0 := (t_0, \infty) \cap I$. We define a time function $\tau : I_0 \to \mathbb{R}$ by
$$\tau(t) := \int_{t_0}^{t} \frac{dt'}{|r(t')|}$$
for all $t \in I_0$. Then $\tau$ is strictly increasing, and hence according to Theorem 2.5.18, the restriction in its image on its range, given by an open interval $J_0$, has a differentiable inverse which will be denoted by the symbol
$\tau^{-1}$ in the following. In particular, we define
$$\xi := r \circ \tau^{-1} .$$
Then
$$(\tau^{-1})' = \frac{1}{\tau' \circ \tau^{-1}} = |\xi| , \qquad \xi' = |\xi| \, . \, (r' \circ \tau^{-1}) , \qquad \xi'' = |\xi|' \, . \, (r' \circ \tau^{-1}) + |\xi|^2 \, . \, (r'' \circ \tau^{-1})$$
and hence
$$r' \circ \tau^{-1} = \frac{1}{|\xi|} \, . \, \xi' , \qquad r'' \circ \tau^{-1} = \frac{1}{|\xi|^2} \, . \, \xi'' - \frac{|\xi|'}{|\xi|^3} \, . \, \xi' .$$
Hence it follows from (3.5.28) that
$$|\xi| \, . \, \xi'' - |\xi|' \, . \, \xi' + \gamma M \, . \, \xi = 0 \tag{3.5.29}$$
and
$$E = \left( \frac{m}{2} \, |r'|^2 - \frac{\gamma m M}{|r|} \right) \circ \tau^{-1} = \frac{m}{2} \, \frac{|\xi'|^2}{|\xi|^2} - \frac{\gamma m M}{|\xi|} . \tag{3.5.30}$$
For the next step, we assume that $r$ and hence also $\xi$ assume values
in the xy-plane only. Note that the discussion in the previous example
indicates that it is reasonable to search for such solutions. For the solutions
of (3.5.29), we make the ansatz
$$\xi_1 = u^2 - v^2 , \qquad \xi_2 = 2uv \tag{3.5.31}$$
where $u : J_0 \to \mathbb{R}$, $v : J_0 \to \mathbb{R}$ are twice differentiable functions that are to
be found. Then
$$|\xi| = u^2 + v^2 , \quad |\xi|' = 2uu' + 2vv' , \quad \xi_1' = 2uu' - 2vv' , \quad \xi_2' = 2u'v + 2uv' ,$$
$$|\xi'|^2 = (2uu' - 2vv')^2 + (2u'v + 2uv')^2 = 4 (u^2 + v^2)(u'^2 + v'^2) ,$$
$$\xi_1'' = 2uu'' - 2vv'' + 2u'^2 - 2v'^2 , \quad \xi_2'' = 2u''v + 4u'v' + 2uv'' .$$
Substitution of the ansatz into (3.5.29) leads to
$$(u^2 + v^2)(2uu'' - 2vv'' + 2u'^2 - 2v'^2) - (2uu' + 2vv')(2uu' - 2vv') + \gamma M (u^2 - v^2)$$
$$= 2(u^2 + v^2)(uu'' - vv'') + 2(u^2 + v^2)(u'^2 - v'^2) - 4(u^2 u'^2 - v^2 v'^2) + \gamma M (u^2 - v^2)$$
$$= 2(u^2 + v^2)(uu'' - vv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] (u^2 - v^2) = 0 ,$$
$$2(u^2 + v^2)(u''v + 2u'v' + uv'') - 4(uu' + vv')(u'v + uv') + 2 \gamma M uv$$
$$= 2(u^2 + v^2)(u''v + uv'') + 4 \left[ (u^2 + v^2) u'v' - (uu' + vv')(u'v + uv') \right] + 2 \gamma M uv$$
$$= 2(u^2 + v^2)(u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2uv = 0$$
and hence to
$$2(u^2 + v^2) u (uu'' - vv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] u (u^2 - v^2) + 2(u^2 + v^2) v (u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2uv^2$$
$$= (u^2 + v^2) \left\{ 2(u^2 + v^2) u'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] u \right\} = 0 ,$$
$$2(u^2 + v^2) u (u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2u^2 v - 2(u^2 + v^2) v (uu'' - vv'') - \left[ \gamma M - 2(u'^2 + v'^2) \right] v (u^2 - v^2)$$
$$= (u^2 + v^2) \left\{ 2(u^2 + v^2) v'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] v \right\} = 0$$
and hence to
$$2(u^2 + v^2) u'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] u = 0 , \qquad 2(u^2 + v^2) v'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] v = 0 .$$
Substitution of the ansatz into (3.5.30) leads to
$$\frac{E}{2m} = \frac{u'^2 + v'^2}{u^2 + v^2} - \frac{\gamma M}{2(u^2 + v^2)}$$
which leads to the system of equations
$$u'' - \frac{E}{2m} \, u = 0 , \qquad v'' - \frac{E}{2m} \, v = 0 .$$
The solutions of the last equations are given by Theorem 2.5.17.
In the following we define the arc length of a curve as a limit of the lengths
of inscribed polygons. The length of any such polygon should be smaller
than the length of the path, since intuitively we expect straight lines to be
the shortest connection between two points. This suggests the following
definition.
Definition 3.5.57. Let $n \in \mathbb{N}^*$ and $u : I \to \mathbb{R}^n$ be a path, where $I$ is some non-empty closed subinterval of $\mathbb{R}$. We say that $u$ is rectifiable or non-rectifiable if the set
$$\left\{ \sum_{\mu=0}^{\nu-1} |u(t_\mu) - u(t_{\mu+1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\}$$
is bounded or unbounded, respectively. In case $u$ is rectifiable, we define its length $L(u)$ by
$$L(u) := \sup \left\{ \sum_{\mu=0}^{\nu-1} |u(t_\mu) - u(t_{\mu+1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\} .$$
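As an illustration of the definition, the following Python sketch computes lengths of inscribed polygons for a sample path. The path (a parabolic arc), the uniform partitions and all function names are assumptions made for this illustration. Since each partition used below refines the previous one, the polygon lengths cannot decrease, and each of them is a lower bound for the supremum in the definition.

```python
import math


def polygon_length(u, partition):
    """Sum of |u(t_mu) - u(t_{mu+1})| over consecutive partition points."""
    points = [u(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def uniform_partition(a, b, nu):
    """Partition (t_0, ..., t_nu) of [a, b] into nu intervals of equal length."""
    return [a + (b - a) * j / nu for j in range(nu + 1)]


def parabola(t):
    """Hypothetical sample path u(t) = (t, t^2)."""
    return (t, t * t)


# Doubling nu refines the partition, so the lengths are non-decreasing.
lengths = [polygon_length(parabola, uniform_partition(0.0, 1.0, nu))
           for nu in (1, 2, 4, 8, 16)]
print(lengths)
```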
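Example 3.5.58. (A non-rectifiable continuous path) Define $u : [0,1] \to \mathbb{R}^2$ by
$$u(t) := \left( t , \, (1-t) \cos\left( \frac{\pi}{2(1-t)} \right) \right)$$
for all $t \in [0,1)$ and $u(1) := (1,0)$. Then $u$ is a continuous path. For every $n \in \mathbb{N}^*$, we define a partition $(t_0, \dots, t_{2n+1})$ of $[0,1]$ by
$$t_0 := 0 , \quad t_{2k-1} := 1 - \frac{1}{2(2k-1)} , \quad t_{2k} := 1 - \frac{1}{4k}$$
for $k = 1, \dots, n$ and
$$t_{2n+1} := 1 .$$
Fig. 149: Graph of the non-rectifiable continuous path u from Example 3.5.58.
Then
$$\sum_{\mu=0}^{2n} |u(t_\mu) - u(t_{\mu+1})| \geq \sum_{k=1}^{n} |u(t_{2k-1}) - u(t_{2k})| \geq \sum_{k=1}^{n} \left[ \frac{1}{2(2k-1)} + \frac{1}{4k} \right] \geq \frac{1}{2} \sum_{k=1}^{n} \frac{1}{k} .$$
Since the harmonic series is divergent, the set
$$\left\{ \sum_{\mu=0}^{\nu-1} |u(t_\mu) - u(t_{\mu+1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\}$$
is unbounded, and $u$ is non-rectifiable.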
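The growth of the polygon lengths can be observed numerically. The following Python sketch is an illustration under the assumption that the path is read as $u(t) = (t, (1-t) \cos(\pi / (2(1-t))))$ for $t < 1$ and $u(1) = (1,0)$, with the partition $t_{2k-1} = 1 - 1/(2(2k-1))$, $t_{2k} = 1 - 1/(4k)$. The function names are chosen for this illustration only.

```python
import math


def u(t):
    """Path of the example; the value at t = 1 is set separately."""
    if t >= 1.0:
        return (1.0, 0.0)
    return (t, (1.0 - t) * math.cos(math.pi / (2.0 * (1.0 - t))))


def example_partition(n):
    """Partition (t_0, ..., t_{2n+1}) of [0, 1] used in the example."""
    ts = [0.0]
    for k in range(1, n + 1):
        ts.append(1.0 - 1.0 / (2.0 * (2 * k - 1)))
        ts.append(1.0 - 1.0 / (4.0 * k))
    ts.append(1.0)
    return ts


def polygon_length(path, partition):
    points = [path(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def half_harmonic(n):
    """Lower bound (1/2) * (1 + 1/2 + ... + 1/n) from the example."""
    return 0.5 * sum(1.0 / k for k in range(1, n + 1))


for n in (1, 10, 100, 1000):
    print(n, polygon_length(u, example_partition(n)), half_harmonic(n))
```

The printed polygon lengths keep growing with $n$ and stay above the partial sums of the harmonic series divided by 2.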
Below, we will give a formula for the calculation of the length of $C^1$-paths. The proof of that formula uses the following simple consequence of Bolzano-Weierstrass' theorem.
Theorem 3.5.59. (Uniform continuity) Let $f : [a,b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, be continuous. Then $f$ is uniformly continuous, i.e., for every $\varepsilon > 0$ there is some $\delta > 0$ such that for all $x, y \in [a,b]$ it follows from $|x - y| \leq \delta$ that $|f(x) - f(y)| \leq \varepsilon$.
Proof. The proof is indirect. Assuming the opposite, there is some $\varepsilon > 0$ for which the statement is not true. Hence for every $n \in \mathbb{N}^*$, there are $x_n, y_n \in [a,b]$ such that $|x_n - y_n| \leq 1/n$ and at the same time such that $|f(x_n) - f(y_n)| > \varepsilon$. According to the Bolzano-Weierstrass Theorem 2.3.18, there are subsequences $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $x \in [a,b]$ and $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ of $y_{n_1}, y_{n_2}, \dots$ converging to some element $y \in [a,b]$. Hence it follows by the continuity of the modulus function (see Example 2.3.52), the continuity of $f$, Theorem 2.3.4 and Theorem 2.3.12 that $x = y$ and $|f(x) - f(y)| \geq \varepsilon$, which is a contradiction.
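Theorem 3.5.60. Let $n \in \mathbb{N}^*$, $a, b \in \mathbb{R}$ be such that $a \leq b$, and let $u : [a,b] \to \mathbb{R}^n$ be continuous and differentiable on $(a,b)$ such that its derivative on $(a,b)$ can be extended to a continuous path $u'$ on $[a,b]$. Such a path will be called a $C^1$-path in the following. Then $u$ is rectifiable and
$$L(u) = \int_a^b |u'(t)| \, dt .$$
Proof. For this, let $\nu \in \mathbb{N}^*$, $(t_0, \dots, t_\nu) \in \mathcal{P}$ and $\mu \in \{0, \dots, \nu-1\}$. Then by Theorem 2.6.21,
$$|u(t_\mu) - u(t_{\mu+1})|^2 = \sum_{k=1}^{n} |u_k(t_\mu) - u_k(t_{\mu+1})|^2 = \sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu+1}} u_k'(t) \, dt \right)^2 .$$
By Theorem 3.5.2,
$$\sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu+1}} u_k'(t) \, dt \right)^2 = \int_{t_\mu}^{t_{\mu+1}} \sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu+1}} u_k'(s) \, ds \right) u_k'(t) \, dt \leq \left[ \sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu+1}} u_k'(t) \, dt \right)^2 \right]^{1/2} \int_{t_\mu}^{t_{\mu+1}} |u'(t)| \, dt .$$
Hence
$$\left[ \sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu+1}} u_k'(t) \, dt \right)^2 \right]^{1/2} \leq \int_{t_\mu}^{t_{\mu+1}} |u'(t)| \, dt$$
and
$$|u(t_\mu) - u(t_{\mu+1})| \leq \int_{t_\mu}^{t_{\mu+1}} |u'(t)| \, dt$$
as well as
$$\sum_{\mu=0}^{\nu-1} |u(t_\mu) - u(t_{\mu+1})| \leq \int_a^b |u'(t)| \, dt .$$
Hence $u$ is rectifiable and
$$L(u) \leq \int_a^b |u'(t)| \, dt .$$
For the proof of the opposite inequality, let $\varepsilon > 0$. Since $u'$ is continuous, it follows by application of Theorem 3.5.59 to its component functions the existence of $\delta > 0$ such that for all $s, t \in [a,b]$
$$|u'(s) - u'(t)| \leq \varepsilon \quad \text{if} \quad |s - t| \leq \delta .$$
Let $\nu \in \mathbb{N}^*$, $(t_0, \dots, t_\nu) \in \mathcal{P}$ of size $\leq \delta$, $\mu \in \{0, \dots, \nu-1\}$. Then
$$|u'(t)| = |u'(t) - u'(t_\mu) + u'(t_\mu)| \leq |u'(t_\mu)| + \varepsilon$$
for all $t \in [t_\mu, t_{\mu+1}]$. Hence
$$\int_{t_\mu}^{t_{\mu+1}} |u'(t)| \, dt \leq ( |u'(t_\mu)| + \varepsilon ) \, l([t_\mu, t_{\mu+1}]) = \left| \int_{t_\mu}^{t_{\mu+1}} u'(t_\mu) \, dt \right| + \varepsilon \, l([t_\mu, t_{\mu+1}])$$
$$= \left| \int_{t_\mu}^{t_{\mu+1}} [ u'(t) + u'(t_\mu) - u'(t) ] \, dt \right| + \varepsilon \, l([t_\mu, t_{\mu+1}])$$
$$\leq \left| \int_{t_\mu}^{t_{\mu+1}} u'(t) \, dt \right| + \left| \int_{t_\mu}^{t_{\mu+1}} [ u'(t_\mu) - u'(t) ] \, dt \right| + \varepsilon \, l([t_\mu, t_{\mu+1}])$$
$$\leq |u(t_\mu) - u(t_{\mu+1})| + (1 + \sqrt{n}) \, l([t_\mu, t_{\mu+1}]) \, \varepsilon ,$$
where integration of vector-valued functions is defined component-wise. Since the lengths of the intervals of the partition add up to $b - a$, it follows that
$$\int_a^b |u'(t)| \, dt \leq \sum_{\mu=0}^{\nu-1} |u(t_\mu) - u(t_{\mu+1})| + (1 + \sqrt{n}) (b-a) \, \varepsilon \leq L(u) + (1 + \sqrt{n}) (b-a) \, \varepsilon$$
and, finally, since $\varepsilon > 0$ is arbitrary, that
$$\int_a^b |u'(t)| \, dt \leq L(u) .$$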
Usually in applications, the length of curves, i.e., ranges of paths, is of more interest. The length of a curve should not depend on the parametrization / path. Below, we prove the invariance of the length of paths under reparametrization. As a consequence, we will define the length of a curve as the length of an injective $C^1$-path whose range coincides with the curve, if such a path exists.
Theorem 3.5.61. (Invariance of the length of paths under reparametrizations) Let $n \in \mathbb{N}^*$, $a, b \in \mathbb{R}$ be such that $a \leq b$ and $u : [a,b] \to \mathbb{R}^n$ be a $C^1$-path. Further, let $c, d \in \mathbb{R}$ be such that $c \leq d$, and let $g : [c,d] \to [a,b]$ be continuous, increasing (not necessarily strictly), such that $g(c) = a$, $g(d) = b$, and differentiable on $(c,d)$ with its derivative on $(c,d)$ being extendable to a continuous function on $[c,d]$. Then $u \circ g$ is a $C^1$-path and
$$L(u \circ g) = L(u) .$$
For this reason, we define the length $L(\mathrm{Ran}\, u)$ of the curve $\mathrm{Ran}\, u$ by
$$L(\mathrm{Ran}\, u) := L(u)$$
if $u$ is in addition injective.
Proof. First, $u \circ g$ is continuous and differentiable on $(c,d)$ with its derivative on $(c,d)$ having the continuous extension $g'.(u' \circ g)$. Hence $u \circ g$ is a $C^1$-path, and it follows by Theorem 3.1.1 that
$$L(u) = \int_{g(c)}^{g(d)} |u'(t)| \, dt = \int_c^d |(u' \circ g)(s)| \, g'(s) \, ds = \int_c^d |g'(s).(u' \circ g)(s)| \, ds = \int_c^d |(u \circ g)'(s)| \, ds = L(u \circ g) .$$
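The invariance can be checked numerically for a concrete case. The following Python sketch is an illustration only. It assumes the sample path $u(t) = (t, t^2)$ on $[0,1]$ and the reparametrization $g(s) = s^2$ on $[0,1]$, and it approximates both integrals by the midpoint rule. The function names are chosen for this illustration.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def speed_u(t):
    """|u'(t)| for the sample path u(t) = (t, t^2)."""
    return math.hypot(1.0, 2.0 * t)


def g(s):
    """Sample reparametrization, increasing on [0, 1] with g(0) = 0, g(1) = 1."""
    return s * s


def dg(s):
    return 2.0 * s


def speed_u_after_g(s):
    """|(u o g)'(s)| = g'(s) |u'(g(s))|, using g' >= 0."""
    return dg(s) * speed_u(g(s))


length_u = midpoint_integral(speed_u, 0.0, 1.0)
length_u_after_g = midpoint_integral(speed_u_after_g, 0.0, 1.0)
print(length_u, length_u_after_g)
```

Both values agree up to the quadrature error, as the theorem predicts.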
Example 3.5.62. Calculate the length of the circle $S_r^1(0)$ of radius $r > 0$ around the origin. Solution: An injective parametrization of the part of $S_r^1(0)$ in the upper half-plane is given by the $C^1$-path $u : [0, \pi] \to \mathbb{R}^2$ defined by
$$u(\varphi) := (r \cos \varphi , r \sin \varphi)$$
for every $\varphi \in [0, \pi]$. Since
$$L(u) = \int_0^\pi |u'(\varphi)| \, d\varphi = \int_0^\pi |(-r \sin \varphi , r \cos \varphi)| \, d\varphi = r \int_0^\pi d\varphi = \pi r ,$$
it follows that
$$L( S_r^1(0) ) = 2 \pi r .$$
Example 3.5.63. (Length of plane paths given in polar coordinates) Let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a,b]$, and let $r : I \to \mathbb{R}$ and $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a,b)$ with derivatives that can be extended to continuous functions on $I$. Then by
$$u(t) := ( r(t) \cos \varphi(t) , r(t) \sin \varphi(t) )$$
for every $t \in I$, there is defined a $C^1$-path. Note that for $t \in I$, $r(t)$ and $\varphi(t)$ can be interpreted as polar coordinates of $u(t)$ if $r(t) > 0$ and $\varphi(t) \in (-\pi, \pi)$. In particular, for $t \in (a,b)$,
$$u'(t) = ( r'(t) \cos \varphi(t) - r(t) \varphi'(t) \sin \varphi(t) , \; r'(t) \sin \varphi(t) + r(t) \varphi'(t) \cos \varphi(t) )$$
and hence
$$|u'(t)|^2 = [ r'(t) \cos \varphi(t) - r(t) \varphi'(t) \sin \varphi(t) ]^2 + [ r'(t) \sin \varphi(t) + r(t) \varphi'(t) \cos \varphi(t) ]^2 = r'^2(t) + r^2(t) \varphi'^2(t) .$$
As a consequence, the length of $u$ is given by
$$L(u) = \int_a^b \sqrt{ r'^2(t) + r^2(t) \varphi'^2(t) } \, dt .$$
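The formula can be tried out numerically. The following Python sketch is an illustration only. It assumes the sample data $r(t) = t$, $\varphi(t) = t$ on $[0, 2\pi]$ (one turn of an Archimedean spiral), approximates the integral of $\sqrt{r'^2 + r^2 \varphi'^2}$ by the midpoint rule and compares the result with the elementary antiderivative of $\sqrt{1 + t^2}$. The function names are chosen for this illustration.

```python
import math


def polar_path_length(r, dr, dphi, a, b, n=20000):
    """Midpoint-rule approximation of the integral of sqrt(r'^2 + r^2 phi'^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * h
        total += math.sqrt(dr(t) ** 2 + (r(t) * dphi(t)) ** 2)
    return h * total


# Sample data: r(t) = t, phi(t) = t, hence r' = 1 and phi' = 1.
spiral_length = polar_path_length(
    r=lambda t: t,
    dr=lambda t: 1.0,
    dphi=lambda t: 1.0,
    a=0.0,
    b=2.0 * math.pi,
)

# Closed form of the integral of sqrt(1 + t^2) from 0 to 2 pi.
T = 2.0 * math.pi
spiral_exact = 0.5 * (T * math.sqrt(1.0 + T * T) + math.asinh(T))
print(spiral_length, spiral_exact)
```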
Problems
1) Calculate the length of the path $u : I \to \mathbb{R}^n$.
a) upxq : px, 4x 7 q , x P I : r0, 1s ,
b) uptq : p2t3 , 3t2 q , t P I : r2, 5s ,
c) upxq : px, 2x4 p16x2 q1 q , x P I : r1, 2s ,
d) upxq : px, x2{3 q , x P I : r2, 3s ,
e) upxq : px, 128px5 {15q p8x3 q1 q , x P I : r1, 3s ,
f) upθq : pθ, ln cos θq , θ P I : rπ {8, π {4s ,
g) upsq : ps, cosh sq , s P I : r1, 8s ,
h) upθq : p2θ, cosp3θq, sinp3θqq , θ P I : r0, π s ,
i) ? uptq : pt2 {2, 2 t, lnptqq , t P I : r2, 7s ,
j) uptq : pt, cosh t, sinh tq , t P I : r0, 4s ,
k) upv q : p2v 3 , cos v v sin v, v cos v sin v q , v P I : r0, π {2s ,
l) uptq : p2et , et sin t, et cos tq , t P I : r3, 4s .
2) Calculate the length of the curve C.
a) C : tpx, y q P R2 : x2{3 y 2{3 9 ^ 1 ¤ x ¤ 3u ,
b) C : tpx, y q P R2 : x 2y 3{2 3 ^ 0 ¤ y ¤ 2u ,
c) C : tpx, y q P R2 : y 3 4x2 0 ^ 3 ¤ x ¤ 4u ,
d) C : tpx, y q P R2 : 1 px4 {3q xy 0 ^ 2 ¤ x ¤ 3u ,
e) C : tpx, y q P R2 : 8y 2 9px 1q2 ^ x ¥ 1 , 0 ¤ y ¤ 1u ,
f) ? C : tpx, y q P R2 : y x p3 2xq 0 ^ 0 ¤ x ¤ 4u ,
g) C : tpx, y q P R2 : xy px4 {2q 24 ^ 1 ¤ x ¤ 5u .
Fig. 150: A cycloid.
3) A cycloid is the trajectory of a point of a circle rolling along a straight line. Calculate the length of the part of the cycloid
$$\{ ( a (t - \sin t) , a (1 - \cos t) ) : t \in \mathbb{R} \}$$
between the points $(0,0)$ and $(2 \pi a, 0)$, where $a > 0$.
4) An astroid is the trajectory of a point on a circle of radius $R/4$ rolling on the inside of a circle of radius $R > 0$. Calculate the length of the part of the astroid
$$\{ ( R \cos^3 t , R \sin^3 t ) : t \in \mathbb{R} \}$$
between the points $(R, 0)$ and $(0, R)$.
5) A cardioid is the trajectory of a point on a circle rolling on the outside of a circle of the same radius. Calculate the length of the cardioid
$$\{ ( a \cos \varphi \, (1 + \cos \varphi) , a \sin \varphi \, (1 + \cos \varphi) ) : \varphi \in [0, 2\pi) \}$$
where $a > 0$.
6) Consider all real-valued functions on [0,1] such that f p0q 1, f p1q 1 and that are continuously differentiable on the interval p0, 1q with a
derivative that has a continuous extension to [0,1]. Find that function
whose Graph is shortest. Give reasons for your answer.
7) Let $b > a > 0$. Consider all $C^1$-paths $u : [0,1] \to \mathbb{R}^2$ such that $u(0) = (a, 0)$, $u(1) = (b, 0)$ and such that
$$u(t) = ( r(t) \cos \varphi(t) , r(t) \sin \varphi(t) )$$
for every $t \in [0,1]$, where $r : [0,1] \to \mathbb{R}$ and $\varphi : [0,1] \to \mathbb{R}$ are continuous, continuously differentiable on the interval $(0,1)$ with derivatives that have continuous extensions to $[0,1]$. Characterize the shortest paths. What is the common range of all these paths?
Fig. 151: An astroid.
Fig. 152: A cardioid.
8) Let $a, b, c, d \in \mathbb{R}$ be such that $a^2 + b^2 = 1$ and $c^2 + d^2 = 1$. Consider all $C^1$-paths $u : [0,1] \to \mathbb{R}^3$ on the sphere of radius 1 around the origin such that $u(0) = (a, 0, b)$, $u(1) = (c, 0, d)$ and such that
$$u(t) = ( \sin(\theta(t)) \cos(\varphi(t)) , \sin(\theta(t)) \sin(\varphi(t)) , \cos(\theta(t)) )$$
for every $t \in [0,1]$, where $\theta : [0,1] \to \mathbb{R}$ and $\varphi : [0,1] \to \mathbb{R}$ are continuous, continuously differentiable on the interval $(0,1)$ with derivatives that have continuous extensions to $[0,1]$. Characterize the shortest paths. What is the common range of all these paths?
9) (Length of space paths given in spherical coordinates) Let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a,b]$, and let $r : I \to \mathbb{R}$, $\theta : I \to \mathbb{R}$, $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a,b)$ with derivatives that can be extended to continuous functions on $I$. Define
$$u(t) := ( r(t) \sin \theta(t) \cos \varphi(t) , r(t) \sin \theta(t) \sin \varphi(t) , r(t) \cos \theta(t) )$$
for every $t \in I$. Note that for $t \in I$, $r(t)$, $\theta(t)$ and $\varphi(t)$ can be interpreted as spherical coordinates of $u(t)$ if $r(t) > 0$, $\theta(t) \in (0, \pi)$ and $\varphi(t) \in (-\pi, \pi)$. Show that $u$ is a $C^1$-path of length
$$L(u) = \int_a^b \sqrt{ r'^2(t) + r^2(t) \, [ \theta'^2(t) + \sin^2 \theta(t) \, \varphi'^2(t) ] } \, dt .$$
4 Calculus III
4.1 Vector-valued Functions of Several Variables
This section starts the investigation of maps with domains in $\mathbb{R}^n$ and ranges in $\mathbb{R}^m$, where $m, n \in \mathbb{N}^*$ are such that at least one of them is greater than 1, i.e., such that $n^2 + m^2 > 2$. For brevity, we will call such maps vector-valued functions of several variables. Today, the vast majority of applications lead to the consideration of such maps.
Here it has to be remembered that we identify points in $\mathbb{R}^k$, where $k \in \mathbb{N}^*$ is such that $k \geq 2$, with position vectors, see the remarks preceding Definition 3.5.8. In addition, as was explained in the beginning of Section 3.5.8, we also identify tangent vectors that are associated to points in space with position vectors. In applications, the nature of the involved quantities can be concluded only from the context of the problem. But, at least in the experience of the author, most maps considered in applications are ‘physical fields’, i.e., maps that have as domain a set of points and as range a set of real numbers or a set of tangent vectors. In the last case, such maps associate to every point from the domain a tangent vector that is ‘attached’ to that point. Notable exceptions are ‘transformations’, which map points into points. For the most part of this course, mathematically, the precise nature of the objects will not play a role. Only in parts of the subsequent section on applications of differentiation and in later sections on vector analysis will that nature play a role in the interpretation of the results.
Although the case that $m = n = 1$ is not the main object of investigation in the following, a guiding principle of Calculus III is the generalization of main results of Calculus I to the case of vector-valued functions of several variables. In this way, results are achieved that reduce in the case $m = n = 1$ to familiar results from Calculus I. Often it is already clear from the structure of the last results and their proofs whether they are likely to allow generalization or not. This kind of structural thinking can be viewed as an outflow of the formal approach to mathematics suggested by Hilbert. It has been very fruitful in the 20th century. In particular, it resulted in a restructuring of previous mathematical knowledge in a very efficient and aesthetic way. The structuring of the whole course, Calculus I - III, can be viewed as an outgrowth of this formal approach.
Definition 3.5.46 from Calculus II gives a simple example of the above guiding principle. This definition simply replaces the modulus function in the corresponding definition for sequences of real numbers by the Euclidean distance function in order to arrive at a definition of the convergence of sequences in $\mathbb{R}^k$, where $k \in \mathbb{N}^*$ is such that $k \geq 2$. From a notational point of view, both definitions are practically identical. Subsequently, we proved that a sequence of elements in $\mathbb{R}^k$ is converging to some $x \in \mathbb{R}^k$ if and only if for every $i \in \{1, \dots, k\}$ the corresponding sequence of $i$-th components converges in $\mathbb{R}$ to the $i$-th component of $x$. In this way, the question of convergence or non-convergence of a sequence in $\mathbb{R}^k$ was reduced to the question of convergence or non-convergence of sequences of real numbers.
The last, i.e., reduction to results of Calculus I, is another guiding principle in Calculus III. For instance, Taylor's theorem, Theorem 4.3.6, for functions in several variables is a direct consequence of the corresponding theorem, Theorem 2.5.25, for functions in one variable.
Definition 4.1.1. (Vector-valued functions of several variables) A vector-valued function is a map from a non-trivial subset of $\mathbb{R}^n$ into $\mathbb{R}^m$, where $n \in \mathbb{N}^*$ and $m \in \mathbb{N}^* \setminus \{1\}$. A function of several variables is a map $f$ from a non-trivial subset $D$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ for some $n \in \mathbb{N}^* \setminus \{1\}$ and $m \in \mathbb{N}^*$. A vector-valued function of several variables is a vector-valued function and / or a function of several variables.
Definition 4.1.2. In accordance with Definitions 2.2.28, 2.2.33, for such a function, we define
(i) the domain of $f$ by
$$D(f) := D ,$$
(ii) the range of $f$ by
$$\mathrm{Ran}(f) := \{ f(x) : x \in D \} ,$$
(iii) the Graph of $f$ by
$$G(f) := \{ (x, f(x)) : x \in D \} ,$$
(iv) the level set (or contour) of $f$ corresponding to some $c \in \mathbb{R}^m$ by
$$f^{-1}(c) := \{ x \in D : f(x) = c \} .$$
Fig. 153: Range of $\gamma_1$.
The following are examples of vector-valued functions of several variables.
Example 4.1.3.
(i) $\gamma_1 : \mathbb{R} \to \mathbb{R}^2$ defined by
$$\gamma_1(t) := (\cos(t), \sin(t))$$
for every $t \in \mathbb{R}$,
(ii) $\gamma_2 : \mathbb{R} \to \mathbb{R}^3$ defined by
$$\gamma_2(t) := (\cos(t), \sin(t), t)$$
for every $t \in \mathbb{R}$,
(iii) $f_3 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ defined by
$$f_3(x) := 1/|x|$$
for every $x \in \mathbb{R}^2 \setminus \{0\}$,
(iv) $f_4 : \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}$ defined by
$$f_4(x) := 1/|x|$$
for every $x \in \mathbb{R}^3 \setminus \{0\}$.
Fig. 154: Range of $\gamma_2$.
Example 4.1.4.
(i) Find the maximal domain $D(g)$ of $g$ such that
$$g(x,y) = \sqrt{36 - 9x^2 - 4y^2}$$
for all $(x,y) \in D(g)$. Solution: The domain of $g$ is the subset of $\mathbb{R}^2$ consisting of all those $(x,y) \in \mathbb{R}^2$ for which
$$\sqrt{36 - 9x^2 - 4y^2}$$
is defined. Hence it is given by
$$\{ (x,y) \in \mathbb{R}^2 : (x/2)^2 + (y/3)^2 \leq 1 \} .$$
Geometrically, this set consists of the ellipse centered at the origin with half axes 2 and 3, together with the region enclosed by it.
Fig. 155: Truncated graph of $f_3$.
Fig. 156: Contour map of $f_3$. Darker colors correspond to lower values of $f_3$.
Fig. 157: $D(g)$.
(ii) Find the range of $g$. Solution: $\mathrm{Ran}(g) = [0,6]$. (Proof: For every $(x,y) \in D(g)$, it follows that
$$0 \leq 36 - 9x^2 - 4y^2 \leq 36$$
and hence also that
$$0 \leq \sqrt{36 - 9x^2 - 4y^2} \leq 6 .$$
Therefore, $\mathrm{Ran}(g) \subset [0,6]$. In addition, for every $z \in [0,6]$,
$$g\left( \frac{1}{3} \sqrt{36 - z^2} , 0 \right) = z .$$
Hence it follows also that $\mathrm{Ran}(g) \supset [0,6]$ and, finally, that $\mathrm{Ran}(g) = [0,6]$.)
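The two steps of the solution can be retraced numerically. The following Python sketch is an illustration only, assuming $g(x,y) = \sqrt{36 - 9x^2 - 4y^2}$. The helper names in_domain and preimage_point are hypothetical and chosen for this illustration. The second helper returns the point $(\sqrt{36 - z^2}/3, 0)$ used in the proof.

```python
import math


def in_domain(x, y):
    """True if 36 - 9 x^2 - 4 y^2 >= 0, i.e. (x/2)^2 + (y/3)^2 <= 1."""
    return 36.0 - 9.0 * x * x - 4.0 * y * y >= 0.0


def g(x, y):
    if not in_domain(x, y):
        raise ValueError("point outside the maximal domain of g")
    return math.sqrt(36.0 - 9.0 * x * x - 4.0 * y * y)


def preimage_point(z):
    """Point of the domain at which g assumes the value z, for z in [0, 6]."""
    return (math.sqrt(36.0 - z * z) / 3.0, 0.0)


for z in (0.0, 1.5, 3.0, 6.0):
    x, y = preimage_point(z)
    print(z, (x, y), g(x, y))
```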
Fig. 158: Graph of f from Example 4.1.6.
Analogous to the corresponding definition in Calculus I, the next definition defines continuity of a vector-valued function of several variables at a point by its property to commute with limits taken at that point.
Definition 4.1.5. Let $f : D \to \mathbb{R}^m$ be a vector-valued function of several variables and $x \in D$. We say that $f$ is continuous in $x$ if for every sequence $x_1, x_2, \dots$ of elements in $D$ from
$$\lim_{\nu \to \infty} x_\nu = x$$
it follows that
$$\lim_{\nu \to \infty} f(x_\nu) = f\left( \lim_{\nu \to \infty} x_\nu \right) \; [ \, = f(x) \, ] .$$
Otherwise, we say that $f$ is discontinuous in $x$. Moreover, we say that $f$ is continuous if $f$ is continuous in all points of its domain $D$. Otherwise, we say that $f$ is discontinuous.
As a reminder of discontinuity of functions defined on subsets of R, we
give the following example.
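Example 4.1.6. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := \frac{x}{|x|}$$
for every $x \in \mathbb{R} \setminus \{0\}$ and $f(0) := 1$. Then
$$\lim_{n \to \infty} \frac{1}{n} = 0 \quad \text{and} \quad \lim_{n \to \infty} \left( - \frac{1}{n} \right) = 0 ,$$
but
$$\lim_{n \to \infty} f\left( \frac{1}{n} \right) = 1 \quad \text{and} \quad \lim_{n \to \infty} f\left( - \frac{1}{n} \right) = -1 .$$
Hence $f$ is discontinuous at the point 0. See Fig. 158.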
Example 4.1.7. Consider the function of several variables $f_5 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ defined by
$$f_5(x) := \frac{x^2 - y^2}{x^2 + y^2}$$
for all $x = (x, y) \in \mathbb{R}^2 \setminus \{0\}$. Then
$$f_5(x, 0) = 1 , \quad f_5(0, y) = -1$$
for all $x, y \in \mathbb{R} \setminus \{0\}$, and hence there is no extension of $f_5$ to a continuous function defined on $\mathbb{R}^2$. Note that for every real $a$
$$f_5(x, ax) = \frac{1 - a^2}{1 + a^2}$$
for all $x \in \mathbb{R} \setminus \{0\}$. Hence for every $b \in (-1, 1]$, there is a real number $a$ such that
$$\lim_{x \to 0, x \neq 0} f_5(x, ax) = b .$$
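The dependence of the limit on the direction of approach can be made concrete. The following Python sketch is an illustration only, assuming $f_5(x,y) = (x^2 - y^2)/(x^2 + y^2)$. The helper slope_for_limit is hypothetical: it solves $(1 - a^2)/(1 + a^2) = b$ for a non-negative $a$, which is possible for $-1 < b \leq 1$.

```python
import math


def f5(x, y):
    return (x * x - y * y) / (x * x + y * y)


def slope_for_limit(b):
    """Non-negative a with (1 - a^2) / (1 + a^2) = b, for -1 < b <= 1."""
    return math.sqrt((1.0 - b) / (1.0 + b))


for b in (1.0, 0.5, 0.0, -0.5):
    a = slope_for_limit(b)
    # f5 is constant along the punctured line y = a x, so the limit is b.
    print(b, [f5(x, a * x) for x in (1e-1, 1e-3, 1e-6)])
```

Along each line through the origin the function is constant, but the constant changes with the slope, which is why no continuous extension to the origin exists.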
Example 4.1.8. (Basic examples of continuous functions.) Let $n \in \mathbb{N}^*$.
(i) Constant vector-valued functions on $\mathbb{R}^n$ are continuous as a consequence of Theorem 3.5.47.
(ii) For $i \in \{1, \dots, n\}$, define the projection $p_i : \mathbb{R}^n \to \mathbb{R}$ of $\mathbb{R}^n$ onto the $i$-th component by
$$p_i(x) := x_i$$
for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then $p_i$ is continuous as a consequence of Theorem 3.5.47.
Fig. 159: Graph and contour map of $f_5$. In the last, darker colors correspond to lower values of $f_5$.
In the case of functions of one real variable, one main application of continuity came from the fact that continuous functions defined on bounded and
closed intervals assume a maximum value and a minimum value. The same
is true also for continuous functions of several variables. Such functions assume a maximum value and a minimum value on so called ‘compact’ subsets, defined below, of their domain. Again, as in the case of functions of
one real variable, this property is a consequence of the Bolzano-Weierstrass
theorem for sequences in Rn . The last theorem will be given next. It is a
simple consequence of its counterpart Theorem 2.3.18 for sequences of real
numbers.
Theorem 4.1.9. (Bolzano-Weierstrass) Let $n \in \mathbb{N}^*$ and $x_1, x_2, \dots$ be a bounded sequence in $\mathbb{R}^n$, i.e., for which there is $M > 0$ such that $|x_k| \leq M$ for all $k \in \mathbb{N}^*$. Then there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent in $\mathbb{R}^n$.
Proof. Since $x_1, x_2, \dots$ is bounded, the corresponding sequences of components $x_{1k}, x_{2k}, \dots$, $k = 1, \dots, n$, are also bounded. Therefore, as a consequence of an $n$-fold application of Theorem 2.3.18 and an application of Theorem 3.5.47, it follows the existence of a subsequence $x_{n_1}, x_{n_2}, \dots$, where $n_1, n_2, \dots$ is a strictly increasing sequence of non-zero natural numbers, which is convergent in $\mathbb{R}^n$.
For $n \in \mathbb{N}^*$ such that $n \geq 2$, subsets of $\mathbb{R}^n$ show more variety than subsets of the real numbers. In the following, we define subclasses of such sets that play a particular role in calculus. In this, open subsets will play a role which is similar to open intervals of $\mathbb{R}$. Differentiability will be defined only for functions defined on such sets because differences of neighboring function values need to be considered, and for every point $x$ in such a set, there is $\varepsilon > 0$ such that $x + h$ is also contained in that set for all $h$ satisfying $|h| < \varepsilon$. Compact subsets generalize aspects of bounded closed intervals of $\mathbb{R}$. In particular, we will see below that continuous functions of several variables assume a maximum value and a minimum value on compact subsets of their domains.
Definition 4.1.10. (Open, closed and compact subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$.
(i) A subset $U$ of $\mathbb{R}^n$ is called open if for every $x \in U$ there is an ‘open ball of some radius $\varepsilon > 0$ around $x$’
$$U_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| < \varepsilon \}$$
which is contained in $U$. In particular, $\emptyset$, $\mathbb{R}^n$, and every open ball of radius $\varepsilon > 0$ around $x \in \mathbb{R}^n$ are open. Obviously, arbitrary unions of open subsets of $\mathbb{R}^n$ and intersections of finitely many open subsets of $\mathbb{R}^n$ are open.
(ii) A subset $A$ of $\mathbb{R}^n$ is called closed if its complement $\mathbb{R}^n \setminus A$ is open. In particular, the so-called ‘closed ball of radius $\varepsilon > 0$ around $x \in \mathbb{R}^n$’
$$B_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| \leq \varepsilon \}$$
and the sphere of radius $\varepsilon$ centered at $x$
$$S_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| = \varepsilon \}$$
are closed. As a consequence of the last remark in (i), arbitrary intersections of closed subsets of $\mathbb{R}^n$ and unions of finitely many closed subsets of $\mathbb{R}^n$ are closed. In particular, we define for every subset $S$ of $\mathbb{R}^n$ its corresponding closure $\bar{S}$ as the intersection of all closed subsets of $\mathbb{R}^n$ that contain $S$. Hence $\bar{S}$ is the smallest closed subset of $\mathbb{R}^n$ that contains $S$.
(iii) A subset $K$ of $\mathbb{R}^n$ is called compact if it is closed and bounded, i.e., if it is closed and contained in some open ball $U_R(0)$ of some radius $R > 0$ around the origin.
Among others, the following example shows that the notions of openness
and closedness of intervals of R used in Calculus I coincide with those
notions from the previous definition.
Example 4.1.11. Let $a, b, c \in \mathbb{R}$ be such that $a \leq b$.
(i) The interval $(a,b)$ is bounded and open. This can be seen as follows. First, $(a,b)$ is bounded since $(a,b) \subset (-M, M)$ where $M := \max\{|a|, |b|\}$. Second, $(a,b)$ is open since it follows for every $x \in (a,b)$ that $(x - \varepsilon, x + \varepsilon) \subset (a,b)$ where $\varepsilon := \min\{x - a, b - x\}$.
(ii) Since
$$(-\infty, c) = \bigcup_{n=0}^{\infty} (-n + c, c) , \quad (c, \infty) = \bigcup_{n=0}^{\infty} (c, c + n) ,$$
(i) also implies that $(-\infty, c)$ and $(c, \infty)$ are open.
(iii) Since
$$(-\infty, c] = \mathbb{R} \setminus (c, \infty) , \quad [c, \infty) = \mathbb{R} \setminus (-\infty, c) ,$$
(ii) implies that $(-\infty, c]$ and $[c, \infty)$ are closed.
(iv) The interval $[a,b]$ is compact. This can be seen as follows. First, $[a,b]$ is bounded since $[a,b] \subset (-M, M)$ for $M > \max\{|a|, |b|\}$. Second, $[a,b]$ is closed since
$$\mathbb{R} \setminus [a,b] = (-\infty, a) \cup (b, \infty)$$
is open as a union of open subsets of $\mathbb{R}$.
(v) The closure of $(a,b)$ coincides with $[a,b]$. This can be seen as follows. First, $[a,b]$ is a closed subset of $\mathbb{R}$ that contains $(a,b)$. Since, by definition, the closure of $(a,b)$ is the smallest closed subset of $\mathbb{R}$ that contains $(a,b)$, the closure of $(a,b)$ is a subset of $[a,b]$. We show now indirectly that $b$ is contained in every closed subset $C$ of $\mathbb{R}$ that contains $(a,b)$. Otherwise, there is such $C$ for which this is not the case. Hence $b$ is contained in $\mathbb{R} \setminus C$. Since the last set is open, there is $\varepsilon > 0$ such that $(b - \varepsilon, b + \varepsilon) \subset \mathbb{R} \setminus C$. But the intersection of $(b - \varepsilon, b + \varepsilon)$ with $(a,b)$ is non-empty. Hence $C$ does not contain $(a,b)$. Analogously, it follows that $a$ is contained in every closed subset $C$ of $\mathbb{R}$ that contains $(a,b)$. Otherwise, there is such $C$ for which this is not the case. Hence $a$ is contained in $\mathbb{R} \setminus C$. Since the last set is open, there is $\varepsilon > 0$ such that $(a - \varepsilon, a + \varepsilon) \subset \mathbb{R} \setminus C$. But the intersection of $(a - \varepsilon, a + \varepsilon)$ with $(a,b)$ is non-empty. Hence $C$ does not contain $(a,b)$. Hence every closed subset of $\mathbb{R}$ that contains $(a,b)$ also contains $[a,b]$. Therefore, the closure of $(a,b)$ also contains $[a,b]$ and hence coincides with $[a,b]$.
The closure S̄ of a subset S of Rn , n P N , was defined as the intersection
of all closed subsets that contain S. It will turn out to be useful to have a
characterization of S̄ that makes reference only to the set S, but not to any
other set. Such characterization is given below.
Theorem 4.1.12. Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$. Then the closure $\bar{S}$ of $S$ consists of all $x \in \mathbb{R}^n$ for which there is a sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$.
Proof. If $x \in \bar{S}$, there are two cases. In case that $x \in S$, the constant sequence $x, x, \dots$ is a sequence in $S$ that is converging to $x$. If $x \notin S$ and $U$ is some open subset of $\mathbb{R}^n$ that contains $x$, it follows that $U$ also contains a point of $S$. Otherwise, it follows that
$$\bar{S} \setminus U = \bar{S} \cap ( \mathbb{R}^n \setminus U )$$
is a closed subset of $\mathbb{R}^n$ that contains $S$ and hence that
$$\bar{S} \subset \bar{S} \setminus U ,$$
which implies that $x \notin \bar{S}$. In particular, by applying the previous to $U_{1/\nu}(x)$ for every $\nu \in \mathbb{N}^*$, we obtain a sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$ by construction. On the other hand, if $x \in \mathbb{R}^n$ is such that there is a sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$ and $A$ is a closed subset of $\mathbb{R}^n$ that contains $S$, it follows that $x \in A$. Otherwise, $x$ is contained in the open set $\mathbb{R}^n \setminus A$ and there is $\varepsilon > 0$ such that $U_\varepsilon(x) \subset \mathbb{R}^n \setminus A$. Hence if $\nu \in \mathbb{N}^*$ is such that $|x_\nu - x| < \varepsilon$, it follows that $x_\nu \in \mathbb{R}^n \setminus A$ and hence that $x_\nu \notin S$, which is a contradiction. As a consequence, $x$ is also contained in $\bar{S}$, which is the intersection of all closed subsets $A$ of $\mathbb{R}^n$ that contain $S$.
Example 4.1.13. Let $a, b \in \mathbb{R}$ be such that $a \leq b$. Then the closure of the intervals $(a,b)$, $(a,b]$, $[a,b)$ and $[a,b]$ is given by $[a,b]$.
Note that the statement of the previous theorem implies that convergent sequences in Rn , n P N , whose members are elements of a closed subset of
Rn converge to an element of that subset. This is the additional fact that is
necessary for the proof that continuous functions defined on compact subsets of Rn assume a maximum value and a minimum value. The procedure
of that proof itself is analogous to the proof of the similar statement from
Calculus I.
Theorem 4.1.14. (Existence of maxima and minima of continuous functions on compact subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$, $K \subset \mathbb{R}^n$ be a non-empty compact subset and $f : K \to \mathbb{R}$ be continuous. Then there are (not necessarily uniquely determined) $x_{\min} \in K$ and $x_{\max} \in K$ such that
$$f(x_{\max}) \geq f(x) , \quad f(x_{\min}) \leq f(x)$$
for all $x \in K$.
Proof. For this, in a first step, we show that $f$ is bounded and hence that $\sup f(K)$ exists. In the final step, we show that there is $c \in K$ such that $f(c) = \sup f(K)$. For both, we use the Bolzano-Weierstrass theorem. The proof that $f$ is bounded is indirect. Assume on the contrary that $f$ is unbounded. Then there is a sequence $x_1, x_2, \dots$ in $K$ such that
$$f(x_n) > n \qquad (4.1.1)$$
for all $n \in \mathbb{N}$. Hence according to Theorems 4.1.9, 4.1.12, there is a subsequence $x_{k_1}, x_{k_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $c \in K$. Note that the corresponding sequence $f(x_{k_1}), f(x_{k_2}), \dots$ is not converging as a consequence of (4.1.1). But, since $f$ is continuous, it follows that
$$f(c) = \lim_{\nu \to \infty} f(x_{k_\nu}) ,$$
which is a contradiction. Hence $f$ is bounded. Therefore, let $M := \sup f(K)$. Then for every $n \in \mathbb{N}^*$ there is a corresponding $c_n \in K$ such that
$$|f(c_n) - M| < \frac{1}{n} . \qquad (4.1.2)$$
Again, according to Theorems 4.1.9, 4.1.12, there is a subsequence $c_{k_1}, c_{k_2}, \dots$ of $c_1, c_2, \dots$ converging to some element $c \in K$. Also, as a consequence of (4.1.2), the corresponding sequence $f(c_{k_1}), f(c_{k_2}), \dots$ is converging to $M$. Hence it follows by the continuity of $f$ that $f(c) = M$ and by the definition of $M$ that
$$f(c) = M \geq f(x)$$
for all $x \in K$. By applying the previous reasoning to the continuous function $-f$, it follows the existence of a $c' \in K$ such that
$$-f(c') \geq -f(x)$$
for all $x \in K$. Hence it follows that
$$f(c') \leq f(x)$$
for all $x \in K$.
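The theorem guarantees that the extreme values exist, but it does not say how to find them. The following Python sketch is a rough illustration only. It assumes the sample function $f(x,y) = x + y$ on the closed unit disc, which is compact, and approximates the extreme values by sampling. Sampling in general gives only approximations from inside the range of $f$. For this particular $f$ the exact values are $\pm \sqrt{2}$. The function names are chosen for this illustration.

```python
import math


def f(x, y):
    """Hypothetical continuous sample function."""
    return x + y


def sample_closed_unit_disc(n_radii=50, n_angles=3600):
    """Yield points of the closed unit disc on a polar grid, boundary included."""
    for i in range(n_radii + 1):
        rho = i / n_radii
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            yield (rho * math.cos(theta), rho * math.sin(theta))


values = [f(x, y) for x, y in sample_closed_unit_disc()]
approx_max = max(values)
approx_min = min(values)
print(approx_max, approx_min)
```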
In the case of functions of one real variable, it was shown that a continuous function $f : [a,b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, which is differentiable on the open interval $(a,b)$, assumes its extrema either in a critical point in $(a,b)$ or in the boundary points $a$ or $b$ of $[a,b]$. The same is true for functions of several variables. For this purpose, we define the notion of inner points and boundary points of subsets of $\mathbb{R}^n$, $n \in \mathbb{N}^*$.
Definition 4.1.15. (Inner points and boundary points of subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$.
(i) We call $x \in S$ an inner point of $S$ if there is $\varepsilon > 0$ such that $U_\varepsilon(x) \subset S$. In particular, we call the set of inner points of $S$ the interior of $S$ and denote this set by $S^\circ$. Obviously, $S^\circ$ is the largest open set that is contained in $S$.
(ii) We call $x \in \mathbb{R}^n$ a boundary point of $S$ if for every $\varepsilon > 0$ the corresponding $U_\varepsilon(x)$ contains a point from $S$ and a point from $\mathbb{R}^n \setminus S$. Hence a boundary point of $S$ cannot be an inner point of $S$. We call the set of boundary points of $S$ the boundary of $S$ and denote this set by $\partial S$.
Example 4.1.16. Let $a, b \in \mathbb{R}$ be such that $a \leq b$. Then the interior of the intervals $(a,b)$ and $[a,b]$ is given by $(a,b)$. The boundary of $(a,b)$ and $[a,b]$ is given by $\{a, b\}$. We note that the closure of $(a,b)$, i.e., $[a,b]$, is the union of the interior of $(a,b)$, i.e., $(a,b)$, and the boundary of $(a,b)$, i.e., $\{a, b\}$. The last is true for every subset of $\mathbb{R}^n$, $n \in \mathbb{N}^*$.
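Theorem 4.1.17. (Decomposition of the closure of subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$. Then
$$\bar{S} = S^\circ \cup \partial S .$$
Proof. First, we note that $S^\circ \subset S \subset \bar{S}$. If $x \in \partial S$, then for every $\nu \in \mathbb{N}^*$ there is $x_\nu \in S$ such that $|x_\nu - x| < 1/\nu$. Hence
$$\lim_{\nu \to \infty} x_\nu = x$$
and $x \in \bar{S}$. Hence it follows that $\bar{S} \supset S^\circ \cup \partial S$. If $x \in \bar{S}$, then either $x \in S^\circ$ or $x \in \partial S$. Otherwise, there is $\varepsilon > 0$ such that $U_\varepsilon(x)$ is contained in $\mathbb{R}^n \setminus S$. In this case, there is no sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$ and hence $x \notin \bar{S}$. Hence it follows that $\bar{S} \subset S^\circ \cup \partial S$ and finally that $\bar{S} = S^\circ \cup \partial S$.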
If defined, sums, scalar multiples, products, quotients and compositions of
continuous vector-valued functions of several variables are continuous, as
is also the case for functions in one real variable. This is a simple consequence of the limit laws, Theorems 3.5.49, 2.3.4, and the definition of
continuity. The associated proofs are analogous to those of the corresponding statements for functions in one real variable in Calculus I. As usual,
a typical application of the thus obtained theorems consists in the decomposition of a given vector-valued function of several variables into sums,
scalar multiples, products, quotients, and compositions of vector-valued
functions of several variables whose continuity is already known. Then the
application of those theorems proves the continuity of that function. In this
way, the proof of continuity of a given vector-valued function of several
variables is greatly simplified and, usually, obvious. Therefore, in such obvious cases in future, the continuity of such a function will be just stated,
but not explicitly proved.
Definition 4.1.18. Let $f_1 : D_1 \to \mathbb{R}^m$, $f_2 : D_2 \to \mathbb{R}^m$ be vector-valued functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. Moreover, let $a \in \mathbb{R}$. We define $(f_1 + f_2) : D_1 \cap D_2 \to \mathbb{R}^m$ and $a.f_1 : D_1 \to \mathbb{R}^m$ by
$$(f_1 + f_2)(x) := f_1(x) + f_2(x)$$
for all $x \in D_1 \cap D_2$ and
$$(a.f_1)(x) := a.f_1(x)$$
for all $x \in D_1$.
Theorem 4.1.19. Let $f_1 : D_1 \to \mathbb{R}^m$, $f_2 : D_2 \to \mathbb{R}^m$ be vector-valued functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. Moreover, let $a \in \mathbb{R}$. Then by Corollary 3.5.49:
(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 + f_2$ is continuous in $x$, too.
(ii) If $f_1$ is continuous in $x \in D_1$, then $a.f_1$ is continuous in $x$, too.
Definition 4.1.20. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. We define $f_1 \cdot f_2 : D_1 \cap D_2 \to \mathbb{R}$ by
$$(f_1 \cdot f_2)(x) := f_1(x) \cdot f_2(x)$$
for all $x \in D_1 \cap D_2$. If moreover $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$, we define $1/f_1 : D_1 \to \mathbb{R}$ by
$$(1/f_1)(x) := 1/f_1(x)$$
for all $x \in D_1$.
Theorem 4.1.21. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions of several variables such that $D_1 \cap D_2 \neq \emptyset$.
(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 \cdot f_2$ is continuous in $x$, too.
(ii) If $f_1$ is such that $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$ as well as continuous in $x \in D_1$, then $1/f_1$ is continuous in $x$, too.
Proof. For the proof of (i), let $x_1, x_2, \dots$ be some sequence in $D_1 \cap D_2$ which converges to $x$. Then it follows for every $\nu \in \mathbb{N}^*$ that
$$|(f_1 \cdot f_2)(x_\nu) - (f_1 \cdot f_2)(x)| = |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x)|$$
$$= |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x_\nu) + f_1(x) f_2(x_\nu) - f_1(x) f_2(x)|$$
$$\leq |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)|$$
$$\leq |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| + |f_1(x_\nu) - f_1(x)| \cdot |f_2(x)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)|$$
and hence, obviously, that
$$\lim_{\nu \to \infty} (f_1 \cdot f_2)(x_\nu) = (f_1 \cdot f_2)(x) .$$
For the proof of (ii), let $x_1, x_2, \dots$ be some sequence in $D_1$ which converges to $x$. Then it follows for every $\nu \in \mathbb{N}^*$ that
$$|(1/f_1)(x_\nu) - (1/f_1)(x)| = |1/f_1(x_\nu) - 1/f_1(x)| = |f_1(x_\nu) - f_1(x)| / [ \, |f_1(x_\nu)| \cdot |f_1(x)| \, ]$$
and hence, obviously, that
$$\lim_{\nu \to \infty} (1/f_1)(x_\nu) = (1/f_1)(x) .$$
Definition 4.1.22. Let $f : D_f \to \mathbb{R}^m$ and $g : D_g \to \mathbb{R}^p$ be vector-valued
functions of several variables and $D_g$ be a subset of $\mathbb{R}^m$. We define $g \circ f :
D(g \circ f) \to \mathbb{R}^p$ by
\[ D(g \circ f) := \{ x \in D_f : f(x) \in D(g) \} \]
and
\[ (g \circ f)(x) := g(f(x)) \]
for all $x \in D(g \circ f)$.
Theorem 4.1.23. Let $f : D_f \to \mathbb{R}^m$, $g : D_g \to \mathbb{R}^p$ be vector-valued
functions of several variables and $D_g$ be a subset of $\mathbb{R}^m$. Moreover, let
$x \in D_f$, $f(x) \in D_g$, f be continuous in x and g be continuous in $f(x)$.
Then $g \circ f$ is continuous in x.
Proof. For this, let $x_1, x_2, \dots$ be a sequence in $D(g \circ f)$ converging to x.
Then $f(x_1), f(x_2), \dots$ is a sequence in $D_g$. Moreover, since f is continuous in x, it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f(x) . \]
Finally, since g is continuous in $f(x)$, it follows that
\[ \lim_{\nu \to \infty} (g \circ f)(x_\nu) = \lim_{\nu \to \infty} g(f(x_\nu)) = g(f(x)) = (g \circ f)(x) . \]
Example 4.1.24. In the following, we conclude that $f_5 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$
from Example 4.1.7, defined by
\[ f_5(\mathbf{x}) := \frac{x^2 - y^2}{x^2 + y^2} \]
for all $\mathbf{x} = (x, y) \in \mathbb{R}^2 \setminus \{0\}$, is continuous.
For this, we define for every $i \in \{1, 2\}$ the corresponding projection $p_i :
\mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ of $\mathbb{R}^2 \setminus \{0\}$ onto the i-th component by
\[ p_i(x) := x_i \]
for every $x = (x_1, x_2) \in \mathbb{R}^2 \setminus \{0\}$. By Theorem 3.5.47, $p_i$ is continuous,
i.e., continuous in every point of its domain $\mathbb{R}^2 \setminus \{0\}$.
We arrive at the following representation of $f_5$:
\[ f_5 = [\, p_1 \cdot p_1 + ((-1).p_2) \cdot p_2 \,] \cdot (\, 1 / [\, p_1 \cdot p_1 + p_2 \cdot p_2 \,] \,) . \]
Hence the continuity of $f_5$ follows by application of Theorems 4.1.19, 4.1.21.
Note that another way of concluding the continuity of the second factor
\[ 1 / [\, p_1 \cdot p_1 + p_2 \cdot p_2 \,] \]
is by means of Theorems 4.1.19(i), 4.1.21(i) and 4.1.23, using the continuity of the function $(\mathbb{R} \setminus \{0\} \to \mathbb{R}, x \mapsto 1/x)$ known from Calculus I.
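The representation of $f_5$ in Example 4.1.24 can be mirrored in a short program. The following is only a sketch: the helper names `add`, `scale`, `mul` and `recip` are hypothetical stand-ins for the operations of Definitions 4.1.18 and 4.1.20, and it is assumed that $f_5(x, y) = (x^2 - y^2)/(x^2 + y^2)$, as the representation in the example indicates.

```python
# Hypothetical helpers mirroring Definitions 4.1.18 and 4.1.20.

def add(f, g):
    """Pointwise sum f + g."""
    return lambda x: f(x) + g(x)

def scale(a, f):
    """Scalar multiple a.f."""
    return lambda x: a * f(x)

def mul(f, g):
    """Pointwise product of two real-valued functions."""
    return lambda x: f(x) * g(x)

def recip(f):
    """Pointwise reciprocal 1/f; assumes f has no zeros on its domain."""
    return lambda x: 1.0 / f(x)

# Projections of R^2 \ {0} onto the first and second component.
p1 = lambda x: x[0]
p2 = lambda x: x[1]

# f5 = [p1*p1 + ((-1).p2)*p2] * (1 / [p1*p1 + p2*p2])
numerator = add(mul(p1, p1), mul(scale(-1.0, p2), p2))
denominator = add(mul(p1, p1), mul(p2, p2))
f5 = mul(numerator, recip(denominator))

def f5_direct(x):
    """Direct formula (x^2 - y^2) / (x^2 + y^2), assumed form of f5."""
    return (x[0] ** 2 - x[1] ** 2) / (x[0] ** 2 + x[1] ** 2)
```

Since every building block is a continuous function and each combinator corresponds to one of the Theorems 4.1.19 and 4.1.21, the way `f5` is assembled follows the continuity argument step by step.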
Problems
1) Find the maximal domain Dpf q of f such that
a) f px, y q f px, y q 5
1 2x2 y 2 ,
a
p3x 2yq2 ,
c) f px, y q p1{px 1qq p1{y 2 q ,
d) f px, y q lnpx 3y q ,
e) f px, y q arccosp2xq lnpxy q ,
a
f) f px, y q px2 y 2 3qp1 x2 y 2 q ,
a
g) f px, y q p x y 2 q1{2 ,
a
?
?
h) f px, y, z q x 3
y 2 z 1 ,
i) f px, y, z q arccospxq arccospy q arccosp1 z q ,
j) f px, y, z q lnpxyz q ,
a
k) f px, y, z q 1 x2 2y 2 4z 2 ,
l) f px, y, z q arcsinpx 3y 6z q
for all px, y q P Dpf q or px, y, z q P Dpf q.
Find the maximal domain Dpg q, level sets and range of g such that
a) g px, y q x 2y ,
b) g px, y q 2x3 3y ,
c) g px, y q y {x2 ,
d) g px, y q 3x2 5y 2 7 ,
e) g px, y q 4x2 2y 2 1 ,
f) g px, y q px{3q 2y 2 6 ,
b)
2)
a
g px, y q 3{rpx
g)
g px, y q h)
a
i) g px, y q px
2
1qpy 1qs ,
x 3y ,
3q{py 1q ,
j) g px, y, z q x 5y
g px, y, z q 6x
2
k)
2z 3 ,
2y
l) g px, y, z q x2 y 2
2
z2 3 ,
4z 2
9
for all px, y q P Dpg q or px, y, z q P Dpg q. In addition, for the cases a)
- i), draw a contour map showing several curves and sketch Gpg q.
3) Where is the function h : D Ñ R continuous and why? In particular,
decide whether h is continuous or discontinuous at the origin p0, 0q.
Give reasons.
x2 xy 2
for px, y q P D : R2 z t0u ,
a) hpx, y q : 2
x
y2
hp0q : 1 ,
3xy 2
for px, y q P D : R2 z t0u ,
x2 y 2
hp0q : 0 ,
b) hpx, y q :
c) hpx, y q :
xy 2
for px, y q P D : R2 z t0u ,
y4
x2
hp0q : 0 ,
d) hpx, y q :
xy
x2
hp0q : 0 ,
x
e) hpx, y q : 2
x
hp0q : 1 ,
y2
y
for px, y q P D : R2 z t0u ,
y2
f) hpx, y q :
y2
hpx, y q :
y
x2
hp0q : 1 ,
g)
h)
x2
hp0q : 0 ,
hpx, y q :
for px, y q P D : R2 z t0u ,
y2
for px, y q P D : R2 z t0u ,
y2
for px, y q P D : R2 z t0u ,
x2 y 2
for px, y q P D : R2 z tpx, xq : x P Ru ,
y3
x3
hp0q : 0 ,
x3
x2
hp0q : 0 ,
y3
for px, y q P D : R2 z tpx, x2 q : x P Ru ,
y
x3
x
hp0q : 0 ,
y3
for px, y q P D : R2 z tpx, xq : x P Ru ,
y
i) hpx, y q :
j) hpx, y q :
k) hpx, y q :
x2 y 2
for px, y q P D : R2 z t0u ,
y4
x4
hp0q : 0 ,
x2 y 2
for px, y q P D : R2 z t0u ,
x2 y 2
hp0q : 0 ,
xy xz yz
m) hpx, y, z q : a
for px, y, z q P D : R3 z t0u ,
x2 y 2 z 2
l) hpx, y q :
hp0q : 0 ,
n)
hpx, y, z q :
hp0q : 0 ,
o)
hpx, y, z q :
hp0q : 1{2 .
x2
xyz
for px, y, z q P D : R3 z t0u ,
y2 z2
xy
x2
yz
y2
z2
for px, y, z q P D : R3 z t0u ,
4) For every $n \in \mathbb{N}^*$, show that $(\, |\ | : \mathbb{R}^n \to \mathbb{R}, x \mapsto |x| \,)$ is continuous.
P R. Further, define f : R2 z t0u Ñ R by
|x|p |y|q for px, yq P D : R2 z t0u ,
f px, y q : 2
x xy y 2
f p0q : 0 .
5) Let p, q
Find necessary and sufficient conditions on p and q such that f is
continuous.
6) Sketch the subsets of $\mathbb{R}^n$ and determine whether they are bounded,
unbounded, open, closed and compact. In addition, determine their
interior, closure and boundary.
a) The intervals
I1 : p3, 4q , I2 : r1, 2q , I3 : p1, 3s , I4 : r1, 3s ,
I5 : p1, 8q , I6 : r0, 8q , I7 : p8, 3s , I8 : p8, 1s ,
b) the 2-dimensional intervals
I9 : tpx, y q : 1 x 4 ^ 0 y
1u ,
I10 : tpx, y q : 0 ¤ x ¤ 4 ^ 3 ¤ y ¤ 1u ,
I11 : tpx, y q : 0 x ¤ 4 ^ 3 ¤ y ¤ 1u ,
I12 : tpx, y q : x ¡ 0 ^ y 3u ,
I13 : tpx, y q : x ¥ 1 ^ y ¥ 4u ,
I14 : tpx, y q : x 1 ^ y ¥ 2u ,
c) the sets
S1 : tpx, y q P R2 : xy
S2 : tpx, y q P R : 9x
¡ 1u ,
36u ,
S3 : tpx, y q P R : x y ¤ 1u ,
S4 : tpx, y q P R2 : 3x2 y 2 ¡ 3u ,
S5 : tpx, y q P R2 : x2 y 2 ¤ 5u ,
S6 : tpx, y q P R2 : 2px 1q2 y 2 ¤ 3u ,
S7 : tpx, y q P R2 : x y 2 ¤ 2u ,
S8 : tpx, y, z q P R3 : x2 2y 2 z 2 ¤ 4u ,
S9 : tpx, y, z q P R3 : x2 3y 2 2z 2 ¤ 1u ,
S10 : tpx, y, z q P R3 : 4x2 y 2 z 2 ¡ 2u ,
S11 : tpx, y, z q P R3 : 9x2 3y 2 4z 2 ¥ 4u ,
S12 : tpx, y, z q P R3 : x2 4y 2 z 2 9u .
2
2
2
2
4y 2
2
7) Let $n \in \mathbb{N}^*$.
a) Show that the union of any number of open subsets of Rn and
the intersection of a finite number of open subsets of Rn are
open.
b) Show that the intersection of any number of closed subsets of
Rn and the union of a finite number of closed subsets of Rn are
closed.
c) Give an example of an intersection of non-empty open subsets
of R which is non-empty and closed.
d) Give an example of a union of non-empty closed subsets of R
which is open.
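8) Let $S, T \subset \mathbb{R}^n$ where $n \in \mathbb{N}^*$.
a) Show that $\partial S$ is closed.
b) Show that S is closed if and only if $\partial S \subset S$.
c) Show that $\partial S = \partial(\mathbb{R}^n \setminus S)$.
d) Show that S̄¯ T .
e) Show that $\overline{S \cup T} = \bar{S} \cup \bar{T}$.
f) Show that $\overline{S \cap T} \subset \bar{S} \cap \bar{T}$.
g) Give an example that shows that in general $\overline{S \cap T} \neq \bar{S} \cap \bar{T}$.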
9) Let $n, m \in \mathbb{N}^*$ and $f : \mathbb{R}^n \to \mathbb{R}^m$. Show that f is continuous if and
only if $f^{-1}(U)$ is open for every open subset U of $\mathbb{R}^m$.
4.2 Derivatives of Vector-valued Functions of Several Variables
In the following, we define derivatives of such functions as linear maps. For
the motivation of that definition, we use the guiding principle mentioned in
the introduction to Calculus III, i.e., we try to generalize the corresponding
definition from Calculus I to vector-valued functions of several variables.
In Calculus I, we defined the following. A function $f : (a, b) \to \mathbb{R}$, where
$a, b \in \mathbb{R}$ such that $a < b$, is differentiable in $x \in (a, b)$ with derivative $c \in \mathbb{R}$
if for all sequences $x_0, x_1, \dots$ in $(a, b) \setminus \{x\}$ which are convergent to x it
follows that
\[ \lim_{\nu \to \infty} \frac{f(x_\nu) - f(x)}{x_\nu - x} = c . \tag{4.2.1} \]
If the last is the case, we defined the derivative $f'(x)$ of f in x by
\[ f'(x) := c . \]
In the next step, we investigate whether the defining equation (4.2.1)
could also be used in the case of a vector-valued function of several variables f. In general, in that case $x_0, x_1, \dots$ is a sequence in $\mathbb{R}^n$, $n \in \mathbb{N}^*$,
and $f(x_0) - f(x), f(x_1) - f(x), \dots$ is a sequence in $\mathbb{R}^m$, $m \in \mathbb{N}^*$. We
immediately notice therefore that (4.2.1) cannot be directly applied to this
situation since division by elements of $\mathbb{R}^n$ is not defined. We try to remedy
that by going back to the situation from Calculus I with the goal of rewriting (4.2.1) in an equivalent way such that generalization to the situation of
a vector-valued function of several variables is possible. Indeed, (4.2.1) is
equivalent to
\[ \lim_{\nu \to \infty} \left[ \frac{f(x_\nu) - f(x)}{x_\nu - x} - c \right] = 0 . \]
Further, since
\[ \left| \frac{f(x_\nu) - f(x)}{x_\nu - x} - c \right| = \left| \frac{f(x_\nu) - f(x) - c \cdot (x_\nu - x)}{x_\nu - x} \right| = \frac{|f(x_\nu) - f(x) - c \cdot (x_\nu - x)|}{|x_\nu - x|} \]
for every $\nu \in \mathbb{N}$, (4.2.1) is equivalent to
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - c \cdot (x_\nu - x)|}{|x_\nu - x|} = 0 . \]
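Going back to the situation of a vector-valued function of several variables,
we see that there is only one obstacle left for generalization of the last,
namely the interpretation of c. In this situation, c cannot correspond to a
real number in general since $f(x_\nu) - f(x) \in \mathbb{R}^m$ and $x_\nu - x \in \mathbb{R}^n$ for $\nu \in \mathbb{N}$,
and in general $n \neq m$. Hence c needs to be a map from $\mathbb{R}^n$ to $\mathbb{R}^m$. In the
case of a function in one variable, this map is given by the linear function
$\mu_c : \mathbb{R} \to \mathbb{R}$ defined by
\[ \mu_c(x) := c \cdot x \]
for every $x \in \mathbb{R}$. The map $\mu_c$ has the following simple properties
\begin{align*}
\mu_c(x + y) &= c \cdot (x + y) = c \cdot x + c \cdot y = \mu_c(x) + \mu_c(y) , \\
\mu_c(\alpha \cdot x) &= c \cdot (\alpha \cdot x) = \alpha \cdot (c \cdot x) = \alpha \cdot \mu_c(x)
\end{align*}
for all $x, y \in \mathbb{R}$ and $\alpha \in \mathbb{R}$. If on the other hand $\lambda : \mathbb{R} \to \mathbb{R}$ is such that
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha \cdot x) = \alpha \cdot \lambda(x) \tag{4.2.2} \]
for all $x, y \in \mathbb{R}$ and $\alpha \in \mathbb{R}$, then
\[ \lambda(x) = \lambda(x \cdot 1) = x \cdot \lambda(1) = \lambda(1) \cdot x \]
for every $x \in \mathbb{R}$ and hence
\[ \lambda = \mu_{\lambda(1)} . \]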
As a consequence, there is a one-to-one correspondence between functions $\lambda$
on $\mathbb{R}$ with the property (4.2.2) and real numbers. In the following, maps
$\lambda : \mathbb{R}^n \to \mathbb{R}^m$, where $n, m \in \mathbb{N}^*$, with the property that
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha.x) = \alpha.\lambda(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ are called ‘linear’. Such maps are considered
next. Subsequently, a map f from some open subset U of $\mathbb{R}^n$ into $\mathbb{R}^m$ will
be said to be differentiable in $x \in U$ if there is a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$
such that for all sequences $x_1, x_2, \dots$ in $U \setminus \{x\}$ which are convergent to x
it follows that
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
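Definition 4.2.1. (Linear maps) Let $n, m \in \mathbb{N}^*$ and $\lambda : \mathbb{R}^n \to \mathbb{R}^m$. We
say that $\lambda$ is linear if
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha x) = \alpha \lambda(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Since in that case
\[ \lambda(x) = \lambda\Big( \sum_{j=1}^{n} x_j e^n_j \Big) = \sum_{j=1}^{n} x_j \lambda(e^n_j) = \sum_{i=1}^{m} \sum_{j=1}^{n} \Lambda_{ij} x_j e^m_i \]
where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical basis of $\mathbb{R}^n$ and $\mathbb{R}^m$,
respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$, $\Lambda_{ij}$ denotes the
component of $\lambda(e^n_j)$ in the direction of $e^m_i$, such $\lambda$ is determined by its
values on the canonical basis of $\mathbb{R}^n$. On the other hand, obviously, if
\[ (\Lambda_{ij})_{(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}} \]
is a given family of real numbers, then by
\[ \lambda(x) := \sum_{i=1}^{m} \sum_{j=1}^{n} \Lambda_{ij} x_j e^m_i \]
for all $x \in \mathbb{R}^n$, there is defined a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$. Interpreting the
elements of $\mathbb{R}^n$ and $\mathbb{R}^m$ as column vectors and defining the $m \times n$ matrix
$\Lambda$ by
\[ \Lambda := \begin{pmatrix} \Lambda_{11} & \cdots & \Lambda_{1n} \\ \vdots & & \vdots \\ \Lambda_{m1} & \cdots & \Lambda_{mn} \end{pmatrix} , \]
the last is equivalent to
\[ \lambda(x) := \Lambda \cdot x = \begin{pmatrix} \Lambda_{11} & \cdots & \Lambda_{1n} \\ \vdots & & \vdots \\ \Lambda_{m1} & \cdots & \Lambda_{mn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \]
where the multiplication sign denotes a particular case of matrix multiplication defined by
\[ (\Lambda \cdot x)_i := \sum_{j=1}^{n} \Lambda_{ij} x_j \]
for every $x \in \mathbb{R}^n$ and $i = 1, \dots, m$. In this case, we call $\Lambda$ the representation
matrix of $\lambda$ with respect to the bases $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$.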
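The statement that a linear map is determined by its values on the canonical basis can be tried out numerically. The following sketch uses hypothetical helper names (`representation_matrix`, `apply_matrix`) and a hypothetical linear map `lam` from $\mathbb{R}^3$ to $\mathbb{R}^2$ chosen only for illustration; it builds the representation matrix column by column from the values $\lambda(e^n_j)$.

```python
def representation_matrix(lam, n):
    """Return the m x n matrix (list of rows) whose j-th column is lam(e_j)."""
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]  # j-th canonical basis vector
        columns.append(list(lam(e_j)))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply_matrix(matrix, x):
    """Compute (Lambda . x)_i = sum_j Lambda_ij x_j for every row i."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

# Hypothetical linear map from R^3 to R^2, used only as an illustration.
def lam(x):
    return [2.0 * x[0] - x[2], x[1] + 3.0 * x[2]]

Lambda = representation_matrix(lam, 3)
```

Applying `apply_matrix(Lambda, x)` to any vector `x` should then agree with `lam(x)`, which is the content of the equivalence stated in the definition.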
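Definition 4.2.2. (Differentiability) A vector-valued function of several
variables f from some open subset U of $\mathbb{R}^n$ into $\mathbb{R}^m$ is said to be differentiable in $x \in U$ if there is a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$ such that for all
sequences $x_1, x_2, \dots$ in $U \setminus \{x\}$ which are convergent to x:
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Since in that case, it follows that
\begin{align*}
|f(x_\nu) - f(x)| &= \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \, |x_\nu - x| \\
&= \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x) + \lambda(x_\nu - x)|}{|x_\nu - x|} \, |x_\nu - x| \\
&\leq \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} \, |x_\nu - x| + |\lambda(x_\nu - x)| \\
&\leq \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} \, |x_\nu - x| + \sum_{i=1}^{m} \sum_{j=1}^{n} |\Lambda_{ij}| \, |x_\nu - x|
\end{align*}
for every $\nu \in \mathbb{N}^*$, where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical
basis of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$,
$\Lambda_{ij}$ denotes the component of $\lambda(e^n_j)$ in the direction of $e^m_i$, the differentiability of f in x also implies the continuity of f in x.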
Example 4.2.3. Define $f_6 : \mathbb{R}^2 \to \mathbb{R}$ by:
\[ f_6(x, y) := 2x^2 + y^2 \]
for all $x, y \in \mathbb{R}$. Then $f_6$ is differentiable, in particular, at the point $(1, 1)$.
This can be seen as follows: For $x, y \in \mathbb{R} \setminus \{1\}$, we calculate:
\begin{align*}
f_6(x, y) &= 2x^2 + y^2 = f_6(1, 1) + 2x^2 + y^2 - 3 \\
&= f_6(1, 1) + 2[(x - 1)^2 + 2(x - 1)] + (y - 1)^2 + 2(y - 1) \\
&= f_6(1, 1) + 4(x - 1) + 2(y - 1) + 2(x - 1)^2 + (y - 1)^2 .
\end{align*}
Hence
\[ f_6(x, y) - f_6(1, 1) - 4(x - 1) - 2(y - 1) = 2(x - 1)^2 + (y - 1)^2 \]
and
\[ \frac{|f_6(x, y) - f_6(1, 1) - 4(x - 1) - 2(y - 1)|}{|(x, y) - (1, 1)|} = \frac{2(x - 1)^2 + (y - 1)^2}{\sqrt{(x - 1)^2 + (y - 1)^2}} \leq 2|x - 1| + |y - 1| . \]
Hence for every sequence $\mathbf{x}_1, \mathbf{x}_2, \dots$, $\mathbf{x}_\nu = (x_\nu, y_\nu)$, in $\mathbb{R}^2 \setminus \{(1, 1)\}$ which is convergent to
$(1, 1)$:
\[ \lim_{\nu \to \infty} \frac{|f_6(\mathbf{x}_\nu) - f_6(1, 1) - 4(x_\nu - 1) - 2(y_\nu - 1)|}{|\mathbf{x}_\nu - (1, 1)|} = 0 . \]
As a consequence, a linear map $\lambda : \mathbb{R}^2 \to \mathbb{R}$ satisfying the conditions of
Definition 4.2.2 is given by
\[ \lambda(\mathbf{x}) := 4x + 2y \]
for all $\mathbf{x} = (x, y) \in \mathbb{R}^2$. The plane (see Fig. 160)
\[ z = f_6(1, 1) + \lambda(x - 1, y - 1) = 4x + 2y - 3 , \]
$x, y \in \mathbb{R}$, is called the tangent plane of the graph of $f_6$ at the point $(1, 1)$.
Fig. 160: Graph of $f_6$ together with its tangent plane at (1,1).
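The estimate in Example 4.2.3 can be observed numerically. The sketch below assumes $f_6(x, y) = 2x^2 + y^2$ and the candidate linear map $(u, v) \mapsto 4u + 2v$ from the example; the particular sequence approaching $(1, 1)$ is a hypothetical choice made only for illustration.

```python
import math

def f6(x, y):
    return 2.0 * x ** 2 + y ** 2

def lam(u, v):
    """Candidate derivative of f6 at (1, 1): the linear map (u, v) -> 4u + 2v."""
    return 4.0 * u + 2.0 * v

def quotient(x, y):
    """|f6(x, y) - f6(1, 1) - lam(x - 1, y - 1)| / |(x, y) - (1, 1)|."""
    u, v = x - 1.0, y - 1.0
    return abs(f6(x, y) - f6(1.0, 1.0) - lam(u, v)) / math.hypot(u, v)

# Hypothetical sequence approaching (1, 1) with distance 10^(-nu) and varying direction.
quotients = []
for nu in range(1, 7):
    r = 10.0 ** (-nu)
    x, y = 1.0 + r * math.cos(nu), 1.0 + r * math.sin(nu)
    quotients.append(quotient(x, y))
```

By the inequality of the example, each entry of `quotients` is bounded by $2|x - 1| + |y - 1|$, and therefore by three times the distance to $(1, 1)$, so the list shrinks towards zero.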
In Calculus I, we already defined partial derivatives of functions in several
variables. The following gives a natural generalization to vector-valued
functions of several variables.
Definition 4.2.4. (Partial differentiability) A vector-valued function of
several variables f from some open subset U of $\mathbb{R}^n$ into $\mathbb{R}^m$ is said to be
partially differentiable in the i-th coordinate, where $i \in \{1, \dots, n\}$, at some
$x \in U$ if for all $j \in \{1, \dots, m\}$ the corresponding real-valued function of
one real variable
\[ f_j(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n) \]
is differentiable at $x_i$ in the sense of Calculus I. In that case, we define:
\[ \frac{\partial f}{\partial x_i}(x) := \big( [f_1(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i), \dots, [f_m(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) \big) . \]
If f is partially differentiable in the i-th coordinate direction at every
$x \in U$, we call f partially differentiable in the i-th coordinate direction
and denote by $\partial f / \partial x_i$ the map which associates to every $x \in U$ the corresponding $(\partial f / \partial x_i)(x)$. Partial derivatives of f of higher order are defined
recursively. If $\partial f / \partial x_i$ is partially differentiable in the j-th coordinate direction, where $j \in \{1, \dots, n\}$, we denote the partial derivative of $\partial f / \partial x_i$ in
the j-th coordinate direction by
\[ \frac{\partial^2 f}{\partial x_j \partial x_i} . \]
Such is called a partial derivative of f of second order. In the case $j = i$,
we set
\[ \frac{\partial^2 f}{\partial x_i^2} := \frac{\partial^2 f}{\partial x_i \partial x_i} . \]
Partial derivatives of f of higher order than two are defined accordingly.
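Example 4.2.5. Define $f_7 : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f_7(x, y) := x^3 + x^2 y^3 - 2y^2 \]
for all $x, y \in \mathbb{R}$. Find
\[ \frac{\partial f_7}{\partial x}(2, 1) \quad \text{and} \quad \frac{\partial f_7}{\partial y}(2, 1) . \]
Solution: We have
\[ f_7(x, 1) = x^3 + x^2 - 2 \quad \text{and} \quad f_7(2, y) = 8 + 4y^3 - 2y^2 \]
for all $x, y \in \mathbb{R}$. Hence it follows that
\[ \frac{\partial f_7}{\partial x}(x, 1) = 3x^2 + 2x , \quad \frac{\partial f_7}{\partial y}(2, y) = 12y^2 - 4y , \]
$x \in \mathbb{R}$, $y \in \mathbb{R}$, and, finally, that
\[ \frac{\partial f_7}{\partial x}(2, 1) = 16 \quad \text{and} \quad \frac{\partial f_7}{\partial y}(2, 1) = 8 . \]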
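The values found in Example 4.2.5 can be cross-checked by difference quotients. The following sketch assumes $f_7(x, y) = x^3 + x^2 y^3 - 2y^2$; the helper `partial` is a hypothetical name and uses a central difference, which is an approximation and not the definition of the partial derivative.

```python
def f7(x, y):
    return x ** 3 + x ** 2 * y ** 3 - 2.0 * y ** 2

def partial(f, point, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative of f at point."""
    forward = list(point)
    backward = list(point)
    forward[i] += h   # vary only the i-th coordinate,
    backward[i] -= h  # all other coordinates are held constant
    return (f(*forward) - f(*backward)) / (2.0 * h)

df_dx = partial(f7, (2.0, 1.0), 0)  # approximates the partial derivative with respect to x at (2, 1)
df_dy = partial(f7, (2.0, 1.0), 1)  # approximates the partial derivative with respect to y at (2, 1)
```

Up to discretization and rounding errors, `df_dx` and `df_dy` should agree with the values 16 and 8 obtained in the solution.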
Example 4.2.6. Define f : R3 Ñ R by
f px, y, z q : x2 y 3 z 3x 4y 6z 5
for all x, y, z P R. Find
Bf px, y, zq , Bf px, y, zq and Bf px, y, zq
Bx
By
Bz
for all x, y, z P R. Solution: Since in partial differentiating with respect to
y
one variable all other variables are held constant, we conclude that
Bf px, y, zq 2xy3z
Bx
Bf px, y, zq x2y3
Bz
for all x, y, z
3,
Bf px, y, zq 3x2y2z
By
4,
6,
P R.
The following example shows that, in contrast to a function that is differentiable in a point of its domain, a function that is partially differentiable in a
point of its domain is not necessarily continuous in that point.
Example 4.2.7. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} xy/(x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
Then
\[ \lim_{n \to \infty} f(1/n, a/n) = \lim_{n \to \infty} \frac{a/n^2}{(1 + a^2)/n^2} = \frac{a}{1 + a^2} \]
for every $a \in \mathbb{R}$ and hence f is discontinuous in the origin. On the other
hand,
\[ \lim_{h \to 0, h \neq 0} \frac{f(h, 0) - f(0, 0)}{h} = 0 , \quad \lim_{h \to 0, h \neq 0} \frac{f(0, h) - f(0, 0)}{h} = 0 \]
and hence
\[ \frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = 0 . \]
Fig. 161: Graph and contour map of f from Example 4.2.7. In the last, darker colors
correspond to lower values of f.
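The behavior described in Example 4.2.7 can be made visible with a few function evaluations. This is only a numerical illustration under the assumption $f(x, y) = xy/(x^2 + y^2)$ away from the origin; the helper `along_line` and the chosen slopes are hypothetical.

```python
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x ** 2 + y ** 2)

def along_line(a, n):
    """Value of f at the point (1/n, a/n) on the line through the origin with slope a."""
    return f(1.0 / n, a / n)

# Values close to the origin along lines with different slopes a.
values = {a: along_line(a, 10 ** 6) for a in (0.0, 1.0, 2.0, -1.0)}

# Difference quotients along the coordinate axes at the origin.
h = 1e-8
dq_x = (f(h, 0.0) - f(0.0, 0.0)) / h
dq_y = (f(0.0, h) - f(0.0, 0.0)) / h
```

The entries of `values` depend on the slope a in the way the limit $a/(1 + a^2)$ predicts, which reflects the discontinuity at the origin, whereas both axis difference quotients vanish.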
There are two loose ends here. First, we have not yet shown that the linear
map occurring in the definition of differentiability of vector-valued functions in several variables is unique. Second, the relation of the notions of
differentiability and partial differentiability of such functions is still unclear. Both issues are resolved by the next theorem.
Theorem 4.2.8. Let f be a vector-valued function of several variables from
some open subset U of $\mathbb{R}^n$ into $\mathbb{R}^m$. Furthermore, let f be differentiable
in $x \in U$ and $\lambda : \mathbb{R}^n \to \mathbb{R}^m$ be some linear map such that for all sequences
$x_1, x_2, \dots$ in $U \setminus \{x\}$ which are convergent to x:
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Then f is partially differentiable at x in the i-th coordinate with
\[ \lambda(e_i) = \frac{\partial f}{\partial x_i}(x) \]
for all $i \in \{1, \dots, n\}$. In particular, it follows that
\[ \lambda(y) = y_1 \frac{\partial f}{\partial x_1}(x) + \dots + y_n \frac{\partial f}{\partial x_n}(x) , \]
for all $y \in \mathbb{R}^n$.
Proof. Let $i \in \{1, \dots, n\}$ and $t_1, t_2, \dots$ be some null sequence in $\mathbb{R}^*$. Then
the sequence $x_1, x_2, \dots$, defined by
\[ x_\nu := x + t_\nu.e_i \]
for all $\nu \in \mathbb{N}^*$, converges to x. Also its members are contained in U for
large enough $\nu$. Hence it follows that
\begin{align*}
\lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|}
&= \lim_{\nu \to \infty} \frac{|f(x_1, \dots, x_{i-1}, x_i + t_\nu, x_{i+1}, \dots, x_n) - f(x) - t_\nu.\lambda(e_i)|}{|t_\nu|} \\
&= \lim_{\nu \to \infty} \left| \frac{f(x_1, \dots, x_{i-1}, x_i + t_\nu, x_{i+1}, \dots, x_n) - f(x)}{t_\nu} - \lambda(e_i) \right| = 0 .
\end{align*}
Hence, since this is true for all null sequences $t_1, t_2, \dots$ in $\mathbb{R}^*$, the statements of the theorem follow.
As a result of the previous theorem, we can now define the following.
Definition 4.2.9. (Derivatives of vector-valued functions in several variables) Let f be a vector-valued function of several variables from some
open subset U of $\mathbb{R}^n$ into $\mathbb{R}^m$. In addition, let f be differentiable in $x \in U$,
and let $\lambda$ be as in Definition 4.2.2. According to Theorem 4.2.8, $\lambda$ is
uniquely defined by the properties stated in Definition 4.2.2.
(i) We define the derivative $f'(x)$ of f at x by
\[ f'(x) := \lambda . \]
According to Theorem 4.2.8, $\lambda$ is given by:
\[ f'(x)(y) = y_1 \frac{\partial f}{\partial x_1}(x) + \dots + y_n \frac{\partial f}{\partial x_n}(x) , \]
for all $y \in \mathbb{R}^n$. Note that if the elements of $\mathbb{R}^n$, $\mathbb{R}^m$ are interpreted
as column vectors, the representation matrix of $f'(x)$ with respect to
the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$ is given by
\[ f'(x) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x) \end{pmatrix} \]
where $f_1, \dots, f_m$ are the component functions of f.
(ii) We call the function $p_1 : \mathbb{R}^n \to \mathbb{R}^m$ defined by
\begin{align*}
p_1(y) &:= f(x) + f'(x)(y - x) \\
&= f(x) + (y_1 - x_1) \frac{\partial f}{\partial x_1}(x) + \dots + (y_n - x_n) \frac{\partial f}{\partial x_n}(x)
\end{align*}
for all $y \in \mathbb{R}^n$, the Taylor polynomial of f of total degree $\leq 1$ at x.
(iii) If f is in addition real-valued, we call the graph of $p_1$ the tangent
plane to $G(f)$ in the point $(x, f(x))$.
Important special cases of the previous definition are given in the following
example.
Example 4.2.10.
(i) Let f be a vector-valued function of several variables from some open
interval I in $\mathbb{R}$ into $\mathbb{R}^m$ which is differentiable at some $t \in I$. Then
\[ [f'(t)](1) = ((f_1)'(t), \dots, (f_m)'(t)) \]
where the derivatives on the right hand side are in the sense of Calculus I.
(ii) Let f be a function of several variables from some open subset U of
$\mathbb{R}^n$ into $\mathbb{R}$ which is differentiable at some point $x \in U$. Then
\[ [f'(x)](y) = [(y \cdot \nabla) f](x) \]
for all $y \in \mathbb{R}^n$ where the gradient of f in x, $(\nabla f)(x)$, is defined by
\[ (\nabla f)(x) := \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right) \]
and
\[ [(y \cdot \nabla) f](x) := y \cdot (\nabla f)(x) \]
for every $y \in \mathbb{R}^n$.
Example 4.2.11. (Basic examples of differentiable functions) Let $n, m \in \mathbb{N}^*$.
(i) Constant vector-valued functions on Rn are differentiable with zero
derivative.
(ii) Any linear map from Rn into Rm is differentiable and its derivative
is given by the same linear map at any x P Rn .
The following criterion for differentiability is usually sufficient for applications.
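Theorem 4.2.12. (A sufficient criterion for differentiability) Let f be a
function of several variables from some open subset U of $\mathbb{R}^n$ into $\mathbb{R}$. Moreover let f be partially differentiable in all coordinates, and let those partial
derivatives define continuous functions on U. Then f is differentiable.
Proof. For $x \in U$ and $y \in U \setminus \{x\}$, it follows by the mean value theorem
for functions of one real variable, Theorem 2.5.6, that
\begin{align*}
f(y) - f(x) &= f(y_1, y_2, \dots, y_n) - f(x_1, y_2, \dots, y_n) \\
&\quad + f(x_1, y_2, \dots, y_n) - f(x_1, x_2, \dots, y_n) + \dots \\
&\quad + f(x_1, x_2, \dots, x_{n-1}, y_n) - f(x_1, x_2, \dots, x_n) \\
&= \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) \cdot (y_1 - x_1) + \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) \cdot (y_2 - x_2) + \dots \\
&\quad + \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) \cdot (y_n - x_n)
\end{align*}
where for each $i \in \{1, \dots, n\}$ the corresponding $c_i$ is some element of the
closed interval between $x_i$ and $y_i$. Hence
\begin{align*}
& f(y) - f(x) - (y_1 - x_1) \cdot \frac{\partial f}{\partial x_1}(x) - \dots - (y_n - x_n) \cdot \frac{\partial f}{\partial x_n}(x) \\
&= \left[ \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) - \frac{\partial f}{\partial x_1}(x) \right] \cdot (y_1 - x_1)
+ \left[ \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) - \frac{\partial f}{\partial x_2}(x) \right] \cdot (y_2 - x_2) + \dots \\
&\quad + \left[ \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) - \frac{\partial f}{\partial x_n}(x) \right] \cdot (y_n - x_n)
\end{align*}
and
\begin{align*}
& \frac{\left| f(y) - f(x) - (y_1 - x_1) \cdot \frac{\partial f}{\partial x_1}(x) - \dots - (y_n - x_n) \cdot \frac{\partial f}{\partial x_n}(x) \right|}{|y - x|} \\
&\leq \left| \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) - \frac{\partial f}{\partial x_1}(x) \right|
+ \left| \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) - \frac{\partial f}{\partial x_2}(x) \right| + \dots \\
&\quad + \left| \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) - \frac{\partial f}{\partial x_n}(x) \right| .
\end{align*}
Hence, obviously, by the continuity of the partial derivatives of f, the
differentiability of f in x follows, and
\[ f'(x)(y) = y_1 \frac{\partial f}{\partial x_1}(x) + \dots + y_n \frac{\partial f}{\partial x_n}(x) , \]
for all $y \in \mathbb{R}^n$. Finally, since x was otherwise arbitrary, the
differentiability of f on U follows.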
Example 4.2.13. (A continuous and partially differentiable function
that is not differentiable) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} (3x^2 y - y^3)/(x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
As a consequence of Theorem 4.2.12, the restriction of f to $\mathbb{R}^2 \setminus \{0\}$ is
differentiable. In addition, since
\[ |3x^2 y - y^3| = |y| \, |3x^2 - y^2| \leq 3|y|(x^2 + y^2) \]
for every $(x, y) \in \mathbb{R}^2 \setminus \{0\}$, it follows that f is continuous at the origin.
Further,
\[ \lim_{h \to 0, h \neq 0} \frac{f(h, 0) - f(0, 0)}{h} = 0 , \quad \lim_{h \to 0, h \neq 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h \to 0, h \neq 0} \frac{-h}{h} = -1 \]
and hence
\[ \frac{\partial f}{\partial x}(0, 0) = 0 , \quad \frac{\partial f}{\partial y}(0, 0) = -1 . \]
Fig. 162: Graph and contour map of f from Example 4.2.13. In the last, darker colors
correspond to lower values of f.
We lead the assumption that f is differentiable in the origin to a contradiction. If f is differentiable in the origin, it follows by Theorem 4.2.8 that
\[ f'(0)(h) = h_1 \frac{\partial f}{\partial x}(0) + h_2 \frac{\partial f}{\partial y}(0) = -h_2 \]
for every $h = (h_1, h_2) \in \mathbb{R}^2$. Hence it follows for every sequence $h_1 =
(h_{11}, h_{21}), h_2 = (h_{12}, h_{22}), \dots$ in $\mathbb{R}^2 \setminus \{0\}$ that is convergent to 0 that
\[ \lim_{\nu \to \infty} \frac{|f(h_\nu) + h_{2\nu}|}{|h_\nu|} = \lim_{\nu \to \infty} \frac{1}{|h_\nu|} \left| \frac{3 h_{1\nu}^2 h_{2\nu} - h_{2\nu}^3}{|h_\nu|^2} + h_{2\nu} \right| = \lim_{\nu \to \infty} \frac{4 h_{1\nu}^2 |h_{2\nu}|}{|h_\nu|^3} = 0 . \]
But in the case that $h_{1\nu} := h_{2\nu} := 1/\nu$ for all $\nu \in \mathbb{N}^*$, it follows that
\[ \lim_{\nu \to \infty} \frac{4 h_{1\nu}^2 |h_{2\nu}|}{|h_\nu|^3} = \sqrt{2} . \]
Hence f is not differentiable in the origin. Compare Fig. 162, which indicates that there is no tangent plane to $G(f)$ in the origin. Note that
\[ \frac{\partial f}{\partial x}(x, y) = \frac{8xy^3}{(x^2 + y^2)^2} \]
for $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Hence $\partial f / \partial x$ is discontinuous in the origin, and
Theorem 4.2.12 is not applicable to f.
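The contradiction in Example 4.2.13 can be reproduced numerically. The sketch below assumes $f(x, y) = (3x^2 y - y^3)/(x^2 + y^2)$ away from the origin and uses the only possible candidate $f'(0)(h) = -h_2$ that Theorem 4.2.8 allows; the function name `remainder_quotient` is hypothetical.

```python
import math

def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return (3.0 * x ** 2 * y - y ** 3) / (x ** 2 + y ** 2)

def remainder_quotient(h1, h2):
    """|f(h) - f(0) - f'(0)(h)| / |h| for the candidate f'(0)(h) = -h2."""
    return abs(f(h1, h2) + h2) / math.hypot(h1, h2)

# Along the diagonal h = (1/nu, 1/nu) the quotient does not tend to zero.
diagonal = [remainder_quotient(1.0 / nu, 1.0 / nu) for nu in (1, 10, 100, 1000)]

# Along the y-axis the quotient vanishes, so the failure is direction dependent.
axis = [remainder_quotient(0.0, 1.0 / nu) for nu in (1, 10, 100, 1000)]
```

The entries of `diagonal` stay near $\sqrt{2}$ instead of approaching zero, in agreement with the limit computed in the example.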
The following classes of functions appear frequently in applications.
Definition 4.2.14. We say that a real-valued function defined on some open
subset U of $\mathbb{R}^n$ is of class $C^p$ for some $p \in \mathbb{N}^*$, if it is partially differentiable
to all orders up to p, inclusively, and if all those partial derivatives define
continuous functions on U. This includes partial derivatives of the order
zero, i.e., that function itself is continuous. A real-valued function defined
on some open subset U of $\mathbb{R}^n$ is said to be $C^\infty$ if it is of class $C^p$ for all
$p \in \mathbb{N}^*$.
Remark 4.2.15. As a consequence of the previous definition, Theorem 4.2.12 can be restated as saying that every real-valued function which
is defined on some open subset of $\mathbb{R}^n$ and which is of class $C^1$ is also differentiable.
Definition 4.2.16. (Gradient operator) Let $n \in \mathbb{N}^*$. We define for every
real-valued function f which is defined as well as partially differentiable in
all coordinate directions on some open subset U of $\mathbb{R}^n$
\[ (\nabla f)(x) := \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right) \tag{4.2.3} \]
for all x from its domain. We call the map $\nabla$ which associates to every
such f the corresponding $\nabla f$, the gradient operator.
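Example 4.2.17. Define $f_8 : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f_8(x, y) := x^3 + x^2 y^3 - 2y^2 \]
for $x, y \in \mathbb{R}$. Find the second partial derivatives of $f_8$.
Solution: We have for all $x, y \in \mathbb{R}$:
\begin{align*}
\frac{\partial f_8}{\partial x}(x, y) &= 3x^2 + 2xy^3 , & \frac{\partial f_8}{\partial y}(x, y) &= 3x^2 y^2 - 4y , \\
\frac{\partial^2 f_8}{\partial x^2}(x, y) &= 6x + 2y^3 , & \frac{\partial^2 f_8}{\partial y^2}(x, y) &= 6x^2 y - 4 , \\
\frac{\partial^2 f_8}{\partial x \partial y}(x, y) &= 6xy^2 , & \frac{\partial^2 f_8}{\partial y \partial x}(x, y) &= 6xy^2 .
\end{align*}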
We notice that the second mixed partial derivatives in the last example were
identical. This is true for a large class of functions.
Theorem 4.2.18. (H. A. Schwarz, 1843 - 1921) Let f be some real-valued
function on some open subset U of $\mathbb{R}^n$ which is of class $C^2$. Then
\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \tag{4.2.4} \]
for $i, j \in \{1, \dots, n\}$.
Proof. If $i = j$ the statement is trivially satisfied. For $i \neq j$, $x \in U$
and sufficiently small $h_i, h_j \neq 0$, it follows by the mean value theorem
for functions of one variable, Theorem 2.5.6, that there are $s_i, t_i$ in the open
interval between 0 and $h_i$ and $s_j, t_j$ in the open interval between 0
and $h_j$ such that
\begin{align*}
& f(x + h_j.e_j + h_i.e_i) - f(x + h_i.e_i) - f(x + h_j.e_j) + f(x) \\
&= h_i \left[ \frac{\partial f}{\partial x_i}(x + s_i.e_i + h_j.e_j) - \frac{\partial f}{\partial x_i}(x + s_i.e_i) \right] \\
&= h_i h_j \frac{\partial^2 f}{\partial x_j \partial x_i}(x + s_i.e_i + s_j.e_j) , \\
& f(x + h_i.e_i + h_j.e_j) - f(x + h_j.e_j) - f(x + h_i.e_i) + f(x) \\
&= h_j \left[ \frac{\partial f}{\partial x_j}(x + t_j.e_j + h_i.e_i) - \frac{\partial f}{\partial x_j}(x + t_j.e_j) \right] \\
&= h_i h_j \frac{\partial^2 f}{\partial x_i \partial x_j}(x + t_i.e_i + t_j.e_j) .
\end{align*}
Since $h_i, h_j$ are otherwise arbitrary, from this and the continuity of
\[ \frac{\partial^2 f}{\partial x_i \partial x_j} , \quad \frac{\partial^2 f}{\partial x_j \partial x_i} \]
follows (4.2.4) and hence, finally, the theorem.
Fig. 163: Graph of f from Example 4.2.19.
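Example 4.2.19. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} xy(x^2 - y^2)/(x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
Then
\begin{align*}
\frac{\partial f}{\partial x}(0, y) &= \lim_{h \to 0, h \neq 0} \frac{f(h, y) - f(0, y)}{h} = \lim_{h \to 0, h \neq 0} y \, \frac{h^2 - y^2}{h^2 + y^2} = -y , \\
\frac{\partial f}{\partial y}(x, 0) &= \lim_{h \to 0, h \neq 0} \frac{f(x, h) - f(x, 0)}{h} = \lim_{h \to 0, h \neq 0} x \, \frac{x^2 - h^2}{x^2 + h^2} = x
\end{align*}
for all $x, y \in \mathbb{R}$. Further,
\begin{align*}
\frac{\partial^2 f}{\partial x \partial y}(0, 0) &= \lim_{h \to 0, h \neq 0} \frac{1}{h} \left[ \frac{\partial f}{\partial y}(h, 0) - \frac{\partial f}{\partial y}(0, 0) \right] = 1 , \\
\frac{\partial^2 f}{\partial y \partial x}(0, 0) &= \lim_{h \to 0, h \neq 0} \frac{1}{h} \left[ \frac{\partial f}{\partial x}(0, h) - \frac{\partial f}{\partial x}(0, 0) \right] = -1
\end{align*}
and hence
\[ \frac{\partial^2 f}{\partial x \partial y}(0, 0) \neq \frac{\partial^2 f}{\partial y \partial x}(0, 0) . \]
Note that
\[ \frac{\partial^2 f}{\partial x^2}(x, y) = - \frac{4xy^3 (x^2 - 3y^2)}{(x^2 + y^2)^3} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. The function that associates the right hand side
of the last equation to every $(x, y) \in \mathbb{R}^2 \setminus \{0\}$ cannot be extended to a
continuous function on $\mathbb{R}^2$. Hence f is not of class $C^2$ and Theorem 4.2.18
is not applicable to f.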
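The unequal mixed partial derivatives of Example 4.2.19 can also be seen with nested difference quotients. This is a rough numerical sketch, assuming $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ away from the origin; the step sizes `h` and `k` are hypothetical choices, with the inner step much smaller than the outer one.

```python
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x ** 2 - y ** 2) / (x ** 2 + y ** 2)

def df_dx(x, y, h=1e-6):
    """Central-difference approximation of the partial derivative with respect to x."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def df_dy(x, y, h=1e-6):
    """Central-difference approximation of the partial derivative with respect to y."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

k = 1e-3  # outer step, much larger than the inner step h

# Derivative in x-direction of df/dy, and in y-direction of df/dx, both at the origin.
d2f_dxdy = (df_dy(k, 0.0) - df_dy(-k, 0.0)) / (2.0 * k)
d2f_dydx = (df_dx(0.0, k) - df_dx(0.0, -k)) / (2.0 * k)
```

The two approximations come out close to 1 and -1, respectively, so the order of differentiation matters at the origin for this function.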
The following Laplace operator appears frequently in partial differential
equations from applications.
Definition 4.2.20. (Laplace operator, Laplace equation) Let $n \in \mathbb{N}^*$. We
define for every real-valued function f which is defined on some open subset U of $\mathbb{R}^n$ and twice partially differentiable in every coordinate direction
\[ \triangle f := \sum_{i=1}^{n} \frac{\partial^2 f}{\partial x_i^2} . \tag{4.2.5} \]
We call the map $\triangle$ which associates to every such f the corresponding
$\triangle f$ the Laplace operator. In particular, if such f is mapped into the zero
function defined on the domain of f, f is called a solution of the Laplace
equation
\[ \triangle f = 0 . \]
Note that the zero on the right hand side denotes the function of value zero
defined on the domain of f.
Fig. 164: Contour maps of $\partial f / \partial x$ and $\partial f / \partial y$ from Example 4.2.19. Darker colors correspond to lower values of the functions.
Fig. 165: Graph and contour map of f from Example 4.2.21. In the last, darker colors
correspond to lower values of f.
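Example 4.2.21. (A solution of Laplace’s equation) Define $f : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ by
\[ f(x, y) := \frac{x}{x^2 + y^2} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Then
\begin{align*}
\frac{\partial f}{\partial x}(x, y) &= \frac{x^2 + y^2 - 2x^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2} , \quad \frac{\partial f}{\partial y}(x, y) = - \frac{2xy}{(x^2 + y^2)^2} , \\
\frac{\partial^2 f}{\partial x^2}(x, y) &= \frac{-2x(x^2 + y^2)^2 - 4x(y^2 - x^2)(x^2 + y^2)}{(x^2 + y^2)^4} \\
&= \frac{-2x(x^2 + y^2 + 2y^2 - 2x^2)}{(x^2 + y^2)^3} = - 2x \, \frac{3y^2 - x^2}{(x^2 + y^2)^3} , \\
\frac{\partial^2 f}{\partial y^2}(x, y) &= \frac{-2x(x^2 + y^2)^2 + 8xy^2(x^2 + y^2)}{(x^2 + y^2)^4} \\
&= \frac{-2x(x^2 + y^2) + 8xy^2}{(x^2 + y^2)^3} = - 2x \, \frac{x^2 - 3y^2}{(x^2 + y^2)^3} = - \frac{\partial^2 f}{\partial x^2}(x, y)
\end{align*}
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Hence f is a solution of Laplace’s equation.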
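The conclusion of Example 4.2.21 can be checked approximately with a finite-difference Laplacian. The sketch below assumes $f(x, y) = x/(x^2 + y^2)$; the five-point formula, the helper name `laplacian` and the sample points are hypothetical choices and only approximate $\triangle f$.

```python
def f(x, y):
    return x / (x ** 2 + y ** 2)

def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of func at (x, y)."""
    return (func(x + h, y) + func(x - h, y)
            + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h ** 2

# Hypothetical sample points away from the origin.
samples = [(1.0, 2.0), (-0.5, 1.5), (2.0, -1.0)]
residuals = [laplacian(f, x, y) for (x, y) in samples]
```

All entries of `residuals` are small compared with the individual second partial derivatives, as expected for a solution of Laplace's equation.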
The differentiation of vector-valued functions of several variables follows
rules analogous to the case known from Calculus I. So there is a sum rule,
a rule for scalar multiples, a product rule, a quotient rule and a chain rule.
The corresponding proofs are analogous to those from Calculus I.
Theorem 4.2.22. (Rules of differentiation) Let f, g be two differentiable
vector-valued functions of several variables from some open subset U of $\mathbb{R}^n$
into $\mathbb{R}^m$ and $a \in \mathbb{R}$.
(i) Then $f + g$ and $a.f$ are differentiable and
\[ (f + g)'(x) = f'(x) + g'(x) , \quad (a.f)'(x) = a.f'(x) \]
for all $x \in U$.
(ii) If f, g are both real-valued, then $f \cdot g$ is differentiable and
\[ (f \cdot g)'(x) = f(x).g'(x) + g(x).f'(x) \]
for all $x \in U$.
(iii) If f is real-valued and non-vanishing, then $1/f$ is differentiable and
\[ \left( \frac{1}{f} \right)'(x) = - \frac{1}{[f(x)]^2} . f'(x) \]
for all $x \in U$.
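Proof. For this, let $x \in U$ and $x_1, x_2, \dots$ be some sequence in $U \setminus \{x\}$
which is convergent to x. Then:
\begin{align*}
& \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} \\
&\leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|}
\end{align*}
and
\[ \frac{|(a.f)(x_\nu) - (a.f)(x) - [a.f'(x)](x_\nu - x)|}{|x_\nu - x|} = |a| \, \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \]
and hence
\[ \lim_{\nu \to \infty} \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 \]
and
\[ \lim_{\nu \to \infty} \frac{|(a.f)(x_\nu) - (a.f)(x) - [a.f'(x)](x_\nu - x)|}{|x_\nu - x|} = 0 . \]
If f and g are real-valued, it follows that
\begin{align*}
& \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x).g'(x) + g(x).f'(x))(x_\nu - x)|}{|x_\nu - x|} \\
&\leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \, |g(x)| + |f(x)| \, \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|} \\
&\quad + \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \, |g(x_\nu) - g(x)|
\end{align*}
and hence that
\[ \lim_{\nu \to \infty} \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x).g'(x) + g(x).f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
If f is real-valued and non-vanishing, it follows that
\begin{align*}
& \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} . f'(x)(x_\nu - x) \right|}{|x_\nu - x|} \\
&\leq \frac{1}{|f(x)|^2} \, \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|^2}{|f(x_\nu)| \, |f(x)|^2 \, |x_\nu - x|}
\end{align*}
and hence that
\[ \lim_{\nu \to \infty} \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} . f'(x)(x_\nu - x) \right|}{|x_\nu - x|} = 0 . \]
Hence, since
\[ f'(x) + g'(x) , \quad a.f'(x) , \quad f(x).g'(x) + g(x).f'(x) , \quad - \frac{1}{[f(x)]^2} . f'(x) \]
are all linear maps, $x_1, x_2, \dots$ and $x \in U$ were otherwise arbitrary, finally,
the theorem follows.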
Theorem 4.2.23. (Chain rule) Let $f : U \to \mathbb{R}^m$, $g : V \to \mathbb{R}^l$ be differentiable vector-valued functions of several variables defined on some open
subsets U of $\mathbb{R}^n$ and V of $\mathbb{R}^m$, respectively, and such that the domain of the
composition $g \circ f$ is non-trivial. Then $g \circ f$ is differentiable with
\[ (g \circ f)'(x) = g'(f(x)) \circ f'(x) \]
for all $x \in D(g \circ f)$.
Proof. For this, let $x \in D(g \circ f)$ and $x_1, x_2, \dots$ be some sequence in $D(g \circ f) \setminus \{x\}$ which is convergent to x. Then:
\begin{align*}
& \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \circ f'(x))(x_\nu - x)|}{|x_\nu - x|} \\
&\leq \frac{|g(f(x_\nu)) - g(f(x)) - g'(f(x))(f(x_\nu) - f(x))|}{|x_\nu - x|} \\
&\quad + \frac{|g'(f(x))(f(x_\nu) - f(x) - f'(x)(x_\nu - x))|}{|x_\nu - x|}
\end{align*}
and hence, obviously,
\[ \lim_{\nu \to \infty} \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \circ f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Hence, since
\[ g'(f(x)) \circ f'(x) \]
is a linear map, $x_1, x_2, \dots$ and $x \in D(g \circ f)$ were otherwise arbitrary,
finally, the theorem follows.
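In terms of representation matrices, the chain rule says that the matrix of $(g \circ f)'(x)$ is the product of the matrices of $g'(f(x))$ and $f'(x)$. The following sketch compares both sides numerically; the maps `f` and `g`, the point `x0` and the helpers `jacobian` and `matmul` are hypothetical, and the Jacobians are only central-difference approximations.

```python
import math

def jacobian(func, x, h=1e-6):
    """Central-difference approximation of the representation matrix of func'(x)."""
    m = len(func(x))
    rows = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        fp, fm = func(forward), func(backward)
        for i in range(m):
            rows[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return rows

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Hypothetical differentiable maps f : R^2 -> R^3 and g : R^3 -> R^2.
def f(x):
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def g(y):
    return [y[0] + y[1] * y[2], math.exp(y[0]) * y[2]]

def g_after_f(x):
    return g(f(x))

x0 = [0.7, -1.2]
lhs = jacobian(g_after_f, x0)                        # matrix of (g o f)'(x0)
rhs = matmul(jacobian(g, f(x0)), jacobian(f, x0))    # matrix of g'(f(x0)) times matrix of f'(x0)
```

Up to the accuracy of the difference quotients, `lhs` and `rhs` agree entry by entry.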
Definition 4.2.24. Let $n \in \mathbb{N}^*$, $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}^n$ be functions of
several variables such that $D_f \cap D_g \neq \phi$. We define the product function
$f.g : D_f \cap D_g \to \mathbb{R}^n$ by
\[ (f.g)(x) := (f(x) \cdot g_1(x), \dots, f(x) \cdot g_n(x)) \]
for all $x \in D_f \cap D_g$ where $g_1, \dots, g_n : D_g \to \mathbb{R}$ are the component functions of g.
By application of Theorem 4.2.8, we get as a corollary the chain rule for
partial derivatives. The last is frequently applied, e.g., in connection with
coordinate transformations. A typical example for the last application is
given subsequently.
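Corollary 4.2.25. (Chain rule for partial derivatives) Let $f : U \to
\mathbb{R}^m$, $g : V \to \mathbb{R}^l$ be differentiable vector-valued functions of several variables defined on some open subsets U of $\mathbb{R}^n$ and V of $\mathbb{R}^m$ and such that the
domain of the composition $g \circ f$ is non-trivial. Then for each $x \in D(g \circ f)$,
$i \in \{1, \dots, n\}$:
\[ \frac{\partial (g \circ f)}{\partial x_i}(x) = \frac{\partial f_1}{\partial x_i}(x) . \frac{\partial g}{\partial x_1}(f(x)) + \dots + \frac{\partial f_m}{\partial x_i}(x) . \frac{\partial g}{\partial x_m}(f(x)) . \]
Proof. By Theorem 4.2.23 and Theorem 4.2.8, it follows that
\begin{align*}
\frac{\partial (g \circ f)}{\partial x_i}(x) &= [(g \circ f)'(x)](e_i) = [g'(f(x)) \circ f'(x)](e_i) \\
&= g'(f(x))(f'(x)(e_i)) = g'(f(x)) \left( \frac{\partial f_1}{\partial x_i}(x), \dots, \frac{\partial f_m}{\partial x_i}(x) \right) \\
&= \frac{\partial f_1}{\partial x_i}(x) . \frac{\partial g}{\partial x_1}(f(x)) + \dots + \frac{\partial f_m}{\partial x_i}(x) . \frac{\partial g}{\partial x_m}(f(x))
\end{align*}
and hence the corollary.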
The following gives a typical application of the chain rule for partial derivatives.
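Example 4.2.26. (Polar coordinates) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be differentiable. Calculate all partial derivatives of first order of $\bar{f} : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ \bar{f}(r, \varphi) := f(r \cos\varphi, r \sin\varphi) \]
for all $(r, \varphi) \in \mathbb{R}^2$. Solution: We define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g(r, \varphi) := (r \cos\varphi, r \sin\varphi) \]
for all $(r, \varphi) \in \mathbb{R}^2$. Then $g$ is differentiable and $\bar{f} = f \circ g$. Hence we get by Corollary 4.2.25 that
\begin{align*}
\frac{\partial \bar{f}}{\partial r}(r, \varphi) &= \cos\varphi \, \frac{\partial f}{\partial x}(r \cos\varphi, r \sin\varphi) + \sin\varphi \, \frac{\partial f}{\partial y}(r \cos\varphi, r \sin\varphi) , \\
\frac{\partial \bar{f}}{\partial \varphi}(r, \varphi) &= - r \sin\varphi \, \frac{\partial f}{\partial x}(r \cos\varphi, r \sin\varphi) + r \cos\varphi \, \frac{\partial f}{\partial y}(r \cos\varphi, r \sin\varphi)
\end{align*}
for all $(r, \varphi) \in \mathbb{R}^2$. Solving the previous system for the partial derivatives of $f$ leads to the more useful formula
\begin{align*}
\frac{\partial f}{\partial x}(r \cos\varphi, r \sin\varphi) &= \cos\varphi \, \frac{\partial \bar{f}}{\partial r}(r, \varphi) - \frac{\sin\varphi}{r} \, \frac{\partial \bar{f}}{\partial \varphi}(r, \varphi) , \\
\frac{\partial f}{\partial y}(r \cos\varphi, r \sin\varphi) &= \sin\varphi \, \frac{\partial \bar{f}}{\partial r}(r, \varphi) + \frac{\cos\varphi}{r} \, \frac{\partial \bar{f}}{\partial \varphi}(r, \varphi)
\end{align*}
for all $r \in \mathbb{R} \setminus \{0\}$ and $\varphi \in \mathbb{R}$.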
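The polar coordinate formulas of Example 4.2.26 can be checked numerically. The following is a minimal sketch, not part of the original text: the sample function `f`, the helper names and the step size are assumptions chosen for illustration, and derivatives in polar coordinates are approximated by central differences.

```python
import math

def f(x, y):
    # Hypothetical sample function; any differentiable f could be used.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Exact partial derivatives of the sample function above.
    return (2 * x * y, x**2 + math.cos(y))

def f_bar(r, phi):
    # f expressed in polar coordinates, as in Example 4.2.26.
    return f(r * math.cos(phi), r * math.sin(phi))

def central_diff(func, t, step=1e-6):
    # Symmetric difference quotient approximating func'(t).
    return (func(t + step) - func(t - step)) / (2 * step)

def polar_partials_chain_rule(r, phi):
    # Partial derivatives of f_bar obtained from the chain rule.
    fx, fy = grad_f(r * math.cos(phi), r * math.sin(phi))
    d_r = math.cos(phi) * fx + math.sin(phi) * fy
    d_phi = -r * math.sin(phi) * fx + r * math.cos(phi) * fy
    return d_r, d_phi

def cartesian_partials_from_polar(r, phi):
    # Inverse formulas (r must be non-zero), fed with numerical
    # approximations of the polar partial derivatives.
    d_r = central_diff(lambda s: f_bar(s, phi), r)
    d_phi = central_diff(lambda p: f_bar(r, p), phi)
    fx = math.cos(phi) * d_r - math.sin(phi) / r * d_phi
    fy = math.sin(phi) * d_r + math.cos(phi) / r * d_phi
    return fx, fy

if __name__ == "__main__":
    print(polar_partials_chain_rule(1.5, 0.7))
    print(cartesian_partials_from_polar(1.5, 0.7))
```

Comparing `cartesian_partials_from_polar(r, phi)` with `grad_f` evaluated at the corresponding Cartesian point gives agreement up to the discretization error of the difference quotients.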
Problems
1) Calculate the partial derivatives of f : D Ñ R, and in this way
conclude the differentiability of f . In addition, calculate f 1 p1, 2q
and the Taylor-polynomial of total degree ¤ 1 (‘Linearization’) at
p1, 2q.
a) f px, y q : 4x3
b)
2y 3 3xy for px, y q P D : R2 ,
f px, y q : 8x3 5x2 y 2
c) f px, y, z q : xy
yz
7y 3 for px, y q P D : R2 ,
xz for px, y, z q P D : R3 ,
d) f px, y q : xy for px, y q P D : tpx, y q P R2 : x ¡ 0u ,
a
e) f px, y q : arccospx{ x2
f)
g)
h)
y 2 q for px, y q P D : R2 z t0u ,
f px, y q : arctanpy {xq for px, y q P D : tpx, y q P R2 : x 0u ,
f px, y q : ln x
a
x2
y2
for px, y q P D : R2 z t0u ,
f px, y, z q : exyz for px, y, z q P D : R3 ,
i) f px, y, z q : xyz for px, y, z q P D : tpx, y, z q P R2 : x ¡ 0u .
2) Find a function whose zero set coincides with the tangent plane to
the surface at the point p.
a) S1 : tpx, y, z q P R3 : x
b)
c)
d)
e)
f)
g)
h)
i)
3u , p p1, 1, 1q ,
S2 : tpx, y, z q P R : xyz 2u , p p1, 2, 1q ,
S3 : tpx, y, z q P R3 : x2 y 2 2u , p p1, 1, 1q ,
S4 : tpx, y, z q P R3 : x2 y 2 z 0u , p p0, 0, 0q ,
S5 : tpx, y, z q P R3 : x2 y 2 z 2 1u , p p1, 1, 1q ,
S6 : tpx, y, z q P R3 : x2 y 2 z 2 3u , p p1, 1, 1q ,
S7 : tpx, y, z q P R3 : x2 y 2 2z 2 1u , p p1, 1, 1q ,
S8 : tpx, y, z q P R3 : x2 3xy 2 4z 1 0u ,
p p1, 2, 3q ,
S9 : tpx, y, z q P R3 : sinpxyz q 1{2u , p p1, π, 1{6q ,
y
3) Use the chain rule to calculate
Bg p2, 1q
Bu
where
for all u, v
for all x, y
z
3
g pu, v q : f
and
Bg p2, 1q
Bv
?uv , 1 ln u 2
v
¡ 0 and f is a differentiable function such that
Bf px, yq y , Bf px, yq x ,
Bx
By
P R.
4) Let f be a differentiable function with partial derivatives
Bf px, yq x , Bf px, yq y
Bx
By
P R. Define
g pϕ, θq : f pcos ϕ p2 cos θq, sin ϕ p2
for all ϕ, θ P R. By using the chain rule, calculate
Bg pϕ, θq
Bϕ
for all ϕ, θ P R.
for all x, y
cos θqq
5) Use the chain rule to calculate
Bg p1, π{6q
Br
where
and
Bg
Bϕ p1, π{6q
g pr, ϕq : f pr cos ϕ, r sin ϕq
for all r, ϕ P R and f is a differentiable function such that
6)
7)
Bf px, yq 3x2 y , Bf px, yq 3y2 x
Bx
By
for all x, y P R.
Let f : R3 Ñ R be a differentiable function satisfying
Bf px, y, zq x , Bf px, y, zq y , Bf px, y, zq z
Bx
By
Bz
for all x, y, z P R. Define the function g by
g pr, θ, ϕq : f pr sin θ cos ϕ, r sin θ sin ϕ, r cos θq
for all r, θ, ϕ P R. By using the chain rule, calculate
Bg p1, π{4, 0q .
Bθ
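Let $U$ be a non-empty subset of $\mathbb{R}^n$ where $n \in \mathbb{N}^*$. Further, let $f, g : U \to \mathbb{R}$ be partially differentiable, $I$ a non-empty open interval of $\mathbb{R}$ such that $I \supset \mathrm{Ran}(f)$, $h : I \to \mathbb{R}$ differentiable and $a \in \mathbb{R}$. Show that
a) $\nabla(f + g) = \nabla f + \nabla g$ , $\nabla(a.f) = a.\nabla f$ ,
b) $\nabla(f g) = f.\nabla g + g.\nabla f$ ,
c) $\nabla f^k = k f^{k-1}.\nabla f$ , $k \in \mathbb{N}^*$ ,
d) $\nabla(f/g) = (1/g).\nabla f - (f/g^2).\nabla g$ , if $g^{-1}(\{0\}) = \emptyset$ ,
e) $\nabla(h \circ f) = (h' \circ f).\nabla f$ .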
8) Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be twice differentiable functions. Define $u(t, x) := f(x - t) + g(x + t)$ for all $(t, x) \in \mathbb{R}^2$. Calculate all partial derivatives of $u$ up to second order and conclude that $u$ satisfies
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 . \]
The last is called the wave equation in one space dimension (for a function $u$ which is to be determined).
9) Determine whether f is a solution of Laplace’s equation. If applicable, a, b P R.
a) f px, y q : a ex cospy q
b ex sinpy q for px, y q P D : R2 ,
b)
f px, y q : x3 3xy 2 for px, y q P D : R2 ,
d)
f px, y q : x sinpx
c) f px, y q : p1{2q lnpx2
e) f px, y q : arctan
yq
y 2 q for px, y q P D : R2 z t0u ,
y cospx
x y
1 xy
y q for px, y q P D : R2 ,
for px, y q P D : tpx, y q P R2 : xy
f)
g)
h)
i)
1u ,
f px, y, z q : e
for px, y, z q P D : R3 ,
f px, y, z q : a e5x sinp3y q cosp4z q b e5x cosp3y q cosp4z q
for px, y, z q P D : R3 ,
f px, y, z q : x3 2xy 2 xz 2 for px, y, z q P D : R3 ,
a
f px, y, z q : 1{ x2 y 2 z 2 for px, y, z q P D : R3 z t0u .
xyz
10) Let f, g : R Ñ R be differentiable, but otherwise arbitrary. Define
u : R R3 Ñ R by
upt, xq :
1
|x| r f p t |x| q
gp t
|x| q s
for all t P R pR3 z t0uq. Calculate all partial derivatives of u up to
second order and verify that u satisfies
B2 u 4u 0
B t2
(4.2.6)
where
p4uqpt, xq : r4upt, qspxq
for all pt, xq P R R3 . The equation (4.2.6) is called the wave
equation in three space dimensions (for a function u which is to be
determined).
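11) (Transformation of the Laplace operator into polar coordinates) Define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g(r, \varphi) := (r \cos\varphi, r \sin\varphi) \]
for all $(r, \varphi) \in \mathbb{R}^2$. Further, let $u : \mathbb{R}^2 \to \mathbb{R}$ be of class $C^2$. Then
\[ (\Delta u)(g(r, \varphi)) = \frac{\partial^2 \bar{u}}{\partial r^2}(r, \varphi) + \frac{1}{r} \frac{\partial \bar{u}}{\partial r}(r, \varphi) + \frac{1}{r^2} \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \varphi) \]
for all $r \in \mathbb{R} \setminus \{0\}$, $\varphi \in \mathbb{R}$ where $\bar{u} := u \circ g$.
12) (Transformation of the Laplace operator into cylindrical coordinates) Define $g : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[ g(r, \varphi, z) := (r \cos\varphi, r \sin\varphi, z) \]
for all $(r, \varphi, z) \in \mathbb{R}^3$. Further, let $u : \mathbb{R}^3 \to \mathbb{R}$ be of class $C^2$. Then
\[ (\Delta u)(g(r, \varphi, z)) = \frac{\partial^2 \bar{u}}{\partial r^2}(r, \varphi, z) + \frac{1}{r} \frac{\partial \bar{u}}{\partial r}(r, \varphi, z) + \frac{1}{r^2} \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \varphi, z) + \frac{\partial^2 \bar{u}}{\partial z^2}(r, \varphi, z) \]
for all $r \in \mathbb{R} \setminus \{0\}$, $(\varphi, z) \in \mathbb{R}^2$ where $\bar{u} := u \circ g$.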
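13) (Transformation of the Laplace operator into spherical coordinates) Define $g : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[ g(r, \theta, \varphi) := (r \sin\theta \cos\varphi, r \sin\theta \sin\varphi, r \cos\theta) \]
for all $(r, \theta, \varphi) \in \mathbb{R}^3$. Further, let $u : \mathbb{R}^3 \to \mathbb{R}$ be of class $C^2$. Then
\begin{align*}
(\Delta u)(g(r, \theta, \varphi)) = {} & \frac{\partial^2 \bar{u}}{\partial r^2}(r, \theta, \varphi) + \frac{2}{r} \frac{\partial \bar{u}}{\partial r}(r, \theta, \varphi) \\
& + \frac{1}{r^2 \sin^2\theta} \left[ \sin^2\theta \, \frac{\partial^2 \bar{u}}{\partial \theta^2}(r, \theta, \varphi) + \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \theta, \varphi) + \sin\theta \cos\theta \, \frac{\partial \bar{u}}{\partial \theta}(r, \theta, \varphi) \right]
\end{align*}
for all $r \in \mathbb{R} \setminus \{0\}$, $\theta \in \mathbb{R} \setminus \{k\pi : k \in \mathbb{Z}\}$ and $\varphi \in \mathbb{R}$ where $\bar{u} := u \circ g$.
14) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} x^2 y / (x^4 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
a) Show that $f$ is discontinuous at the origin.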
Fig. 166: Graph and contour map of f from Problem 13.
b) Show that $f$ is partially differentiable at the origin into every direction, i.e.,
\[ \lim_{h \to 0, h \neq 0} \frac{f(h.u) - f(0)}{h} \]
exists for every $(u_1, u_2) \in \mathbb{R}^2$ such that $u_1^2 + u_2^2 = 1$.
15) As in Example 4.2.13, define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} (3 x^2 y - y^3) / (x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
Show that $f$ is partially differentiable at the origin into every direction, i.e.,
\[ \lim_{h \to 0, h \neq 0} \frac{f(h.u) - f(0)}{h} \]
exists for every $(u_1, u_2) \in \mathbb{R}^2$ such that $u_1^2 + u_2^2 = 1$.
16) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} x^3 / (x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
a) Show that $f$ is not differentiable in the origin.
b) Show that $f \circ g$ is differentiable for any differentiable path $g$ in $\mathbb{R}^2$ that passes through the origin.
Fig. 167: Graph and contour map of f from Problem 16.
4.3 Applications of Differentiation
In this section, we give the main applications of differentiation of functions of several variables. These include a generalization of Taylor’s theorem, applications to finding maxima and minima, and Lagrange’s multiplier rule for finding maxima and minima in the presence of additional constraints.
A function of several variables $f$ from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ was said to be partially differentiable in the $i$-th coordinate, where $i \in \{1, \dots, n\}$, at some $x \in U$ if the corresponding real-valued function of one real variable
\[ f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n) \]
is differentiable at $x_i$ in the sense of Calculus I. In that case, we defined
\[ \frac{\partial f}{\partial x_i}(x) := [f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) . \]
In the following, we rewrite the right hand side of the last equation for the purpose of generalization. Since $U$ is open, there is $\varepsilon > 0$ such that the open ball $U_\varepsilon(x)$ around $x$ is contained in $U$. For this reason, we can define an auxiliary function $h : (-\varepsilon, \varepsilon) \to \mathbb{R}$ by
\[ h(t) := f(x + t.e_i) = f(x_1, \dots, x_{i-1}, x_i + t, x_{i+1}, \dots, x_n) \]
for all $t \in (-\varepsilon, \varepsilon)$ where $e_i$ denotes the $i$-th canonical basis vector of $\mathbb{R}^n$. As a consequence of the chain rule for functions in one variable, $h$ is differentiable in $0$ with derivative
\[ h'(0) = [f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) \]
and hence
\[ \frac{\partial f}{\partial x_i}(x) = h'(0) . \]
In this sense,
\[ \frac{\partial f}{\partial x_i}(x) \]
is a derivative of $f$ at the point $x$ in the direction $e_i$ of the $i$-th coordinate axis. Of course, such a derivative can potentially be defined in any direction, not just in the directions of the coordinate axes. This is done in the definition below, which will lead to a geometrical interpretation of the gradient $\nabla f$ for a differentiable function of several variables $f$.
Definition 4.3.1. (Directional derivatives) A function of several variables $f$ defined on some open subset $U$ of $\mathbb{R}^n$ is said to be differentiable in the direction of some unit vector $u \in \mathbb{R}^n$ at some $x \in U$ if the auxiliary function $h : I \to \mathbb{R}$, defined by
\[ h(t) := f(x + t.u) \]
for every $t \in I$ and some open interval $I$ around $0$, is differentiable at $0$ in the sense of Calculus I. In this case, we define:
\[ \frac{\partial f}{\partial u}(x) := h'(0) . \]
Theorem 4.3.2. Let $n \in \mathbb{N}^*$, $f$ be some differentiable function defined on some open subset $U$ of $\mathbb{R}^n$ and $u \in \mathbb{R}^n$ be some unit vector. Then $f$ is differentiable in the direction of $u$ at all points of $U$ and
\[ \frac{\partial f}{\partial u}(x) = (\nabla f)(x) \cdot u = \cos(\alpha) \, |(\nabla f)(x)| \]
for all $x \in U$ where $\alpha$ denotes the angle between $(\nabla f)(x)$ and $u$.
Proof. For this, let $x \in U$ and define the path $g : I \to \mathbb{R}^n$ by
\[ g(t) := x + t.u \]
for every $t \in I$ and some open interval $I$ of $\mathbb{R}$ around $0$ such that $\mathrm{Ran}(g) \subset U$. According to Example 4.2.11, $g$ is differentiable, and moreover according to Example 4.2.10 its derivative is given by
\[ [g'(t)](1) = u \]
for all $t \in I$. Hence it follows by Theorem 4.2.23 that $f \circ g$ is differentiable and that
\[ (f \circ g)'(0) = [f'(g(0))](u) = (\nabla f)(x) \cdot u \]
where Example 4.2.10(ii) has been used. Hence the theorem follows.
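Theorem 4.3.2 can be illustrated numerically by comparing $h'(0)$ from Definition 4.3.1 with $(\nabla f)(x) \cdot u$. The sketch below is an illustration only; the sample function, the point and the unit vector are assumptions and do not come from the text.

```python
def f(x, y):
    # Hypothetical sample function of two variables.
    return x**2 + 3 * x * y

def grad_f(x, y):
    # Exact gradient of the sample function.
    return (2 * x + 3 * y, 3 * x)

def directional_derivative(func, point, u, step=1e-6):
    # Numerical h'(0) for the auxiliary function h(t) = f(x + t.u).
    x, y = point
    def h(t):
        return func(x + t * u[0], y + t * u[1])
    return (h(step) - h(-step)) / (2 * step)

def gradient_formula(point, u):
    # Right hand side of Theorem 4.3.2: inner product of gradient and u.
    gx, gy = grad_f(*point)
    return gx * u[0] + gy * u[1]

if __name__ == "__main__":
    point, u = (1.0, 2.0), (0.6, 0.8)   # u is a unit vector
    print(directional_derivative(f, point, u), gradient_formula(point, u))
```

For this sample, both values agree with $8 \cdot 0.6 + 3 \cdot 0.8 = 7.2$ up to rounding.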
We consider again the situation in the previous theorem. If $x_0 \in U$ and, more generally than the path $g$ considered in the previous proof, $g$ is a differentiable path from some open interval $I$ around $0$ to $\mathbb{R}^n$ such that $\mathrm{Ran}(g) \subset U$, $g(0) = x_0$ and
\[ g_i'(0) = u_i , \]
where $g_i$ is the $i$-th component map of $g$, for all $i \in \{1, \dots, n\}$, then $f \circ g$ is differentiable, and it follows by the chain rule that
\[ (f \circ g)'(0) = [f'(g(0))](u) = (\nabla f)(x_0) \cdot u . \]
In particular if the range of $g$ is contained in the level set of $f$ through $x_0$, then $f \circ g$ is constant of value $f(x_0)$. Hence
\[ 0 = (f \circ g)'(0) = (\nabla f)(x_0) \cdot u . \]
Since $u$ is a tangent vector to $g$ in $x_0$ and the range of $g$ is contained in the level set of $f$ through the point $x_0$, $u$ is also tangent to that level set in the same point. Since this is true for every such $g$, in this sense $(\nabla f)(x_0)$ is perpendicular to the level set in $x_0$. Note that we excluded the question of existence of differentiable paths with values in that level set, and we did not define the tangent space to that set in $x_0$. Such questions are answered in courses on differential geometry. In part (ii) of the following remark, we give a corresponding result without proof.
Fig. 168: The left picture shows gradients and the level set $S$ through the point $(1, 0)$ of $f$ from Example 4.3.4. The right picture shows the graph of $f$, $S$ and directions of steepest increase of $f$.
Remark 4.3.3. (Interpretation of the gradient) Let $n$, $f$ and $U$ be as in Theorem 4.3.2. If the gradient vector $(\nabla f)(x_0)$ of $f$ in $x_0 \in U$ is non vanishing, then
(i)
\[ |(\nabla f)(x_0)|^{-1}.(\nabla f)(x_0) \quad \text{and} \quad - |(\nabla f)(x_0)|^{-1}.(\nabla f)(x_0) \]
are the directions of steepest ascent and steepest descent of $f$ at $x_0$, respectively. The rate of the ascent and descent is given by
\[ |(\nabla f)(x_0)| \quad \text{and} \quad - |(\nabla f)(x_0)| , \]
respectively.
(ii) If $f$ is moreover of class $C^1$, $(\nabla f)(x_0)$ is perpendicular to the level set (or contour) of $f$ at $x_0$. Hence the equation of the tangent plane to this set is given by
\[ (\nabla f)(x_0) \cdot (x - x_0) = 0 \]
and its normal line through $x_0$ by
\[ x_0 + \lambda.(\nabla f)(x_0) \]
where $\lambda \in \mathbb{R}$.
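Example 4.3.4. The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) := x^2 + y^2 \]
for every $(x, y) \in \mathbb{R}^2$ is of class $C^1$. The corresponding gradients are given by
\[ (\nabla f)(x, y) = (2x, 2y) \]
for every $(x, y) \in \mathbb{R}^2$. In particular,
\[ (\nabla f)(0) = 0 . \]
Hence the directions of steepest increase / decrease of $f$ in $(x, y) \in \mathbb{R}^2 \setminus \{0\}$ are
\[ \left( \frac{x}{\sqrt{x^2 + y^2}} , \frac{y}{\sqrt{x^2 + y^2}} \right) , \quad - \left( \frac{x}{\sqrt{x^2 + y^2}} , \frac{y}{\sqrt{x^2 + y^2}} \right) . \]
Since $f(1, 0) = 1$, the level set of $f$ through $(1, 0)$ coincides with
\[ \{(x, y) \in \mathbb{R}^2 : f(x, y) = x^2 + y^2 = 1\} = S^1_1(0) . \]
See Fig. 168.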
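The statements of Remark 4.3.3 for the function of Example 4.3.4 can be probed with a small computation. The following sketch is an illustration under assumptions: the helper names and the brute-force sampling of unit vectors are chosen here and are not part of the text.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + y^2 from Example 4.3.4.
    return (2 * x, 2 * y)

def steepest_ascent_direction(x, y):
    # Normalized gradient, as in Remark 4.3.3 (i).
    gx, gy = grad_f(x, y)
    norm = math.hypot(gx, gy)
    if norm == 0:
        raise ValueError("gradient vanishes; no direction of steepest ascent")
    return (gx / norm, gy / norm)

def best_sampled_direction(x, y, samples=3600):
    # Brute force: the sampled unit vector maximizing (grad f) . u.
    gx, gy = grad_f(x, y)
    def rate(k):
        angle = 2 * math.pi * k / samples
        return gx * math.cos(angle) + gy * math.sin(angle)
    best = max(range(samples), key=rate)
    angle = 2 * math.pi * best / samples
    return (math.cos(angle), math.sin(angle))

def gradient_dot_tangent(s):
    # Inner product of the gradient with the tangent of the unit circle
    # (the level set through (1, 0)) at the point (cos s, sin s).
    gx, gy = grad_f(math.cos(s), math.sin(s))
    return gx * (-math.sin(s)) + gy * math.cos(s)

if __name__ == "__main__":
    print(steepest_ascent_direction(3.0, 4.0))
    print(best_sampled_direction(3.0, 4.0))
    print(gradient_dot_tangent(0.9))
```

The sampled maximizer approximates the normalized gradient, and the inner product with the tangent of the level set vanishes up to rounding, in line with part (ii) of the remark.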
In the next step, we derive a generalization of Taylor’s theorem to functions of several variables. The idea for such a generalization is as follows. For the description, let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$, $x_0 \in U$ and $h \in \mathbb{R}^n$ such that $x_0 + t.h \in U$ for all $t \in (-\varepsilon, 1 + \varepsilon)$ where $\varepsilon > 0$. Our goal is the derivation of a relationship between $f(x_0 + h)$ and the values of $f$ as well as of its partial derivatives, as far as they exist, in the point $x_0$. For this, we define the auxiliary path $g : (-\varepsilon, 1 + \varepsilon) \to \mathbb{R}^n$ by
\[ g(t) := x_0 + t.h \]
for every $t \in (-\varepsilon, 1 + \varepsilon)$. Then $f \circ g$ is a function of one real variable such that
\[ (f \circ g)(0) = f(x_0) , \quad (f \circ g)(1) = f(x_0 + h) . \]
In the next step, we apply Taylor’s theorem, Theorem 2.5.25, to $f \circ g$, in accordance with the differentiability properties of $f \circ g$, and choose the expansion point $t_0 := 0$. For this, derivatives of $f \circ g$ in $0$ need to be known. By application of the chain rule for functions in several variables, such derivatives can be expressed in terms of partial derivatives of $f$ in $g(0) = x_0$ and partial derivatives of $g$. The last are constant functions. In this way, we arrive at a relationship of the required type. The following lemma derives the form of the derivatives of $f \circ g$ which are needed in this procedure.
Lemma 4.3.5. Let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ and of class $C^r$ for some $r \in \mathbb{N}^*$. In addition, let $x \in U$, $h \in \mathbb{R}^n$ and $I$ be some open interval of $\mathbb{R}$ around $0$ such that $x + t.h \in U$ for all $t \in I$. Finally, define $g : I \to U$ by $g(t) := x + t.h$ for all $t \in I$. Then
\[ (f \circ g)^{(r)} = [(h \cdot \nabla)^r f] \circ g \]
where
\[ (h \cdot \nabla)^k f \]
for $k \in \{2, \dots, r\}$ is defined recursively by
\[ (h \cdot \nabla)^k f := (h \cdot \nabla)[(h \cdot \nabla)^{k-1} f] . \]
Compare Example 4.2.10 (ii).
Proof. The proof proceeds by induction. For $r = 1$, it follows by Remark 4.2.15 that $f$ is differentiable. Moreover, obviously, $g$ is differentiable. Hence by Example 4.2.10 and the chain rule, Theorem 4.2.23, it follows that
\[ (f \circ g)'(t) = f'(g(t))\big([g'(t)](1)\big) = f'(g(t))(h) = [(h \cdot \nabla) f](g(t)) \]
for all $t \in I$ and hence the statement for $r = 1$. Now assume that the statement is valid for some $s \in \mathbb{N}^*$ such that $1 \leq s \leq r - 1$. Then it follows by Remark 4.2.15 that $(h \cdot \nabla)^s f$ is differentiable and by the analogous arguments applied in the first step that
\[ (f \circ g)^{(s+1)} = [(h \cdot \nabla)^{s+1} f] \circ g . \]
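As an illustration of Lemma 4.3.5, the following spells out $(h \cdot \nabla)^k f$ for $n = 2$ and $k \in \{1, 2\}$. It is a sketch under the assumption that $f$ is of class $C^2$, so that by Schwarz’s Theorem 4.2.18 the mixed second order partial derivatives coincide; the labels $h_1, h_2$ for the components of $h$ are chosen here for illustration.

```latex
\begin{align*}
(h \cdot \nabla) f &= h_1 \frac{\partial f}{\partial x_1} + h_2 \frac{\partial f}{\partial x_2} , \\
(h \cdot \nabla)^2 f &= h_1^2 \frac{\partial^2 f}{\partial x_1^2}
  + 2 h_1 h_2 \frac{\partial^2 f}{\partial x_1 \partial x_2}
  + h_2^2 \frac{\partial^2 f}{\partial x_2^2} , \\
(f \circ g)''(t) &= [(h \cdot \nabla)^2 f](x + t.h) .
\end{align*}
```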
With the help of the previous lemma, we can now state and prove Taylor’s formula for functions in several variables.
Theorem 4.3.6. (Taylor’s formula) Let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ and of class $C^r$ for some $r \in \mathbb{N}^*$. Moreover, let $x_0 \in U$ and $h \in \mathbb{R}^n \setminus \{0\}$ be such that $x_0 + t.h \in U$ for all $t \in [0, 1]$. Then there is $\tau \in (0, 1)$ such that
\begin{align*}
f(x_0 + h) = {} & f(x_0) + \frac{1}{1!} [(h \cdot \nabla) f](x_0) + \dots + \frac{1}{(r-1)!} [(h \cdot \nabla)^{r-1} f](x_0) \\
& + \frac{1}{r!} [(h \cdot \nabla)^r f](x_0 + \tau.h) . \tag{4.3.1}
\end{align*}
Proof. First, since $x_0 + t.h \in U$ for all $t \in [0, 1]$ and $U$ is an open subset of $\mathbb{R}^n$, there is some open interval $I$ from $\mathbb{R}$ containing $[0, 1]$ and such that $x_0 + t.h \in U$ for all $t \in I$. Hence we can define $g : I \to U$ by $g(t) := x_0 + t.h$ for all $t \in I$. Since $f$ is of class $C^r$, it follows by Lemma 4.3.5 that the real-valued function $f \circ g$ of one variable is $r$ times continuously differentiable. Hence it follows by Taylor’s theorem for functions of one variable, Theorem 2.5.25, that there is some $\tau \in (0, 1)$ such that
\[ (f \circ g)(1) = (f \circ g)(0) + \frac{1}{1!} (f \circ g)'(0) + \dots + \frac{1}{(r-1)!} (f \circ g)^{(r-1)}(0) + \frac{1}{r!} (f \circ g)^{(r)}(\tau) . \]
Finally, from this follows (4.3.1) by application of Lemma 4.3.5.
Theorem 4.3.7. (Estimate of the remainder in Taylor’s formula) Let $f, U, r, x_0, h$ be as in Theorem 4.3.6. Moreover, let $K$ be a bound for all partial derivatives of $f$ on $U$ of order $r$. Finally, define the remainder term by
\[ R_r(x_0 + h) := \frac{1}{r!} [(h \cdot \nabla)^r f](x_0 + \tau.h) . \]
Then there is a number $C \in \mathbb{N}^*$ depending only on $r$ and $n$ such that
\[ |R_r(x_0 + h)| \leq \frac{C K}{r!} \, |h|^r . \tag{4.3.2} \]
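Proof. Obviously, $(h \cdot \nabla)^r f$ is of the form
\[ (h \cdot \nabla)^r f = \sum_{i_1 + \dots + i_n = r} c_{i_1 \dots i_n} \, h_1^{i_1} \cdots h_n^{i_n} \, \frac{\partial^r f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}} \]
where the numbers $c_{i_1 \dots i_n}$ come from a multinomial expansion. Hence
\[ |(h \cdot \nabla)^r f| \leq C K |h|^r \]
where
\[ C := \sum_{i_1 + \dots + i_n = r} c_{i_1 \dots i_n} \]
depends only on $n$ and $r$. Hence (4.3.2) follows.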
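If the coefficients are taken to be exactly the multinomial coefficients (an assumption made here for illustration; it uses that the partial derivatives of $f$ up to order $r$ commute), the constant $C$ can be evaluated in closed form by the multinomial theorem:

```latex
\[
C = \sum_{i_1 + \dots + i_n = r} \frac{r!}{i_1! \cdots i_n!}
  = (\underbrace{1 + \dots + 1}_{n \text{ summands}})^r = n^r .
\]
```

For instance, $n = 2$ and $r = 3$ give $C = 1 + 3 + 3 + 1 = 8$.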
Definition 4.3.8. Let $f, U, r, x_0, h$ and $\tau$ be as in Theorem 4.3.6. Then we call the function $p_{r-1} : \mathbb{R}^n \to \mathbb{R}$ defined by
\[ p_{r-1}(x) := f(x_0) + \frac{1}{1!} [((x - x_0) \cdot \nabla) f](x_0) + \dots + \frac{1}{(r-1)!} [((x - x_0) \cdot \nabla)^{r-1} f](x_0) \]
for all $x \in \mathbb{R}^n$, the Taylor polynomial of $f$ of total degree $\leq r - 1$ at $x_0$ and
\[ R_r(x) := \frac{1}{r!} [((x - x_0) \cdot \nabla)^r f](x_0 + \tau.(x - x_0)) \]
its remainder term at $x = x_0 + h$.
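Example 4.3.9. Define $f_9 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ by
\[ f_9(x, y) := \frac{x y}{x^2 + y^2} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Calculate the Taylor polynomial of $f_9$ of total degree $\leq 2$ at $(1, 1)$, and give an estimate of its remainder term at the point $(1.1, 1.2)$.
Solution: Obviously, $f_9$ is of class $C^\infty$ on $\mathbb{R}^2 \setminus \{0\}$. As a consequence of Schwarz’s Theorem 4.2.18 and the symmetry of $f_9$ under exchange of coordinates, there is only 1 ‘independent’ first order partial derivative as well as 2 independent second order and third order partial derivatives of $f_9$, respectively. In particular,
\begin{align*}
\frac{\partial f_9}{\partial x}(x, y) &= \frac{y (y^2 - x^2)}{(x^2 + y^2)^2} , &
\frac{\partial^2 f_9}{\partial x \partial y}(x, y) &= - \frac{x^4 - 6 x^2 y^2 + y^4}{(x^2 + y^2)^3} , \\
\frac{\partial^2 f_9}{\partial x^2}(x, y) &= - \frac{2 x y (3 y^2 - x^2)}{(x^2 + y^2)^3} , &
\frac{\partial^3 f_9}{\partial x^2 \partial y}(x, y) &= \frac{2 x^5 - 28 x^3 y^2 + 18 x y^4}{(x^2 + y^2)^4} , \\
\frac{\partial^3 f_9}{\partial x^3}(x, y) &= - \frac{6 y \, (x^4 - 6 x^2 y^2 + y^4)}{(x^2 + y^2)^4} \tag{4.3.3}
\end{align*}
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Hence we have for $x_0 := (1, 1)$, small enough $h = (h_x, h_y) \in \mathbb{R}^2$ and some $\tau \in [0, 1]$:
\begin{align*}
f_9(x_0 + h) &= f_9(x_0) + \frac{1}{1!} [(h \cdot \nabla) f_9](x_0) + \frac{1}{2!} [(h \cdot \nabla)^2 f_9](x_0) + \frac{1}{3!} [(h \cdot \nabla)^3 f_9](x_0 + \tau.h) \\
&= f_9(x_0) + \frac{\partial f_9}{\partial x}(x_0) \, h_x + \frac{\partial f_9}{\partial y}(x_0) \, h_y \\
&\quad + \frac{1}{2} \left[ \frac{\partial^2 f_9}{\partial x^2}(x_0) \, h_x^2 + 2 \, \frac{\partial^2 f_9}{\partial x \partial y}(x_0) \, h_x h_y + \frac{\partial^2 f_9}{\partial y^2}(x_0) \, h_y^2 \right] + R_3(x_0 + h) \\
&= \frac{1}{2} - \frac{1}{4} (h_x - h_y)^2 + R_3(x_0 + h)
\end{align*}
where
\begin{align*}
R_3(x_0 + h) = \frac{1}{6} \bigg[ & \frac{\partial^3 f_9}{\partial x^3}(x_0 + \tau.h) \, h_x^3 + 3 \, \frac{\partial^3 f_9}{\partial x^2 \partial y}(x_0 + \tau.h) \, h_x^2 h_y \\
& + 3 \, \frac{\partial^3 f_9}{\partial x \partial y^2}(x_0 + \tau.h) \, h_x h_y^2 + \frac{\partial^3 f_9}{\partial y^3}(x_0 + \tau.h) \, h_y^3 \bigg] .
\end{align*}
Hence the Taylor polynomial $p_2$ of $f_9$ of total degree $\leq 2$ at $(1, 1)$ is given by
\[ p_2(x, y) = \frac{1}{2} - \frac{1}{4} ((x - 1) - (y - 1))^2 = \frac{1}{2} - \frac{1}{4} (x - y)^2 \]
for all $(x, y) \in \mathbb{R}^2$.
Further for $(x, y)$ on the line segment between $x_0$ and $x_1 := (1.1, 1.2)$, it follows that
\begin{align*}
\left| \frac{\partial^3 f_9}{\partial x^3}(x, y) \right| &= \left| 6 y \, \frac{x^4 - 6 x^2 y^2 + y^4}{(x^2 + y^2)^4} \right| \leq 6 |y| \, \frac{x^4 + 6 x^2 y^2 + y^4}{(x^2 + y^2)^4} \\
&\leq 6 \cdot 1.2 \cdot \frac{(1.1)^4 + 6 (1.1)^2 (1.2)^2 + (1.2)^4}{16} = 6.29645 , \\
\left| \frac{\partial^3 f_9}{\partial y^3}(x, y) \right| &= \left| 6 x \, \frac{x^4 - 6 x^2 y^2 + y^4}{(x^2 + y^2)^4} \right| \leq 6 |x| \, \frac{x^4 + 6 x^2 y^2 + y^4}{(x^2 + y^2)^4} \\
&\leq 6 \cdot 1.1 \cdot \frac{(1.1)^4 + 6 (1.1)^2 (1.2)^2 + (1.2)^4}{16} = 5.77174 , \\
\left| \frac{\partial^3 f_9}{\partial x^2 \partial y}(x, y) \right| &= \left| \frac{2 x^5 - 28 x^3 y^2 + 18 x y^4}{(x^2 + y^2)^4} \right| \leq \frac{2 |x|^5 + 28 |x|^3 y^2 + 18 |x| y^4}{(x^2 + y^2)^4} \\
&\leq \frac{2 (1.1)^5 + 28 (1.1)^3 (1.2)^2 + 18 \cdot 1.1 \, (1.2)^4}{16} = 6.12151 , \\
\left| \frac{\partial^3 f_9}{\partial x \partial y^2}(x, y) \right| &= \left| \frac{2 y^5 - 28 y^3 x^2 + 18 y x^4}{(x^2 + y^2)^4} \right| \leq \frac{2 |y|^5 + 28 |y|^3 x^2 + 18 |y| x^4}{(x^2 + y^2)^4} \\
&\leq \frac{2 (1.2)^5 + 28 (1.2)^3 (1.1)^2 + 18 \cdot 1.2 \, (1.1)^4}{16} = 5.94662
\end{align*}
and hence that
\[ |R_3(x_0 + h)| \leq \frac{1}{6} \left[ (0.1)^3 \cdot 6.29645 + 3 \, (0.1)^2 \cdot 0.2 \cdot 6.12151 + 3 \cdot 0.1 \, (0.2)^2 \cdot 5.94662 + (0.2)^3 \cdot 5.77174 \right] \leq 0.03 . \]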
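The estimate of Example 4.3.9 can be compared with the actual approximation error. The sketch below is only an illustration: the function names are chosen here, and the four constants are the bounds for the third order partial derivatives quoted in the example.

```python
def f9(x, y):
    # The function of Example 4.3.9.
    return x * y / (x**2 + y**2)

def p2(x, y):
    # Taylor polynomial of total degree <= 2 at (1, 1).
    return 0.5 - 0.25 * (x - y) ** 2

def remainder_bound(hx, hy):
    # Bound for |R_3(x0 + h)| built from the bounds of the third order
    # partial derivatives on the segment from (1, 1) to (1.1, 1.2).
    k_xxx, k_xxy, k_xyy, k_yyy = 6.29645, 6.12151, 5.94662, 5.77174
    return (abs(hx) ** 3 * k_xxx
            + 3 * hx**2 * abs(hy) * k_xxy
            + 3 * abs(hx) * hy**2 * k_xyy
            + abs(hy) ** 3 * k_yyy) / 6

if __name__ == "__main__":
    error = abs(f9(1.1, 1.2) - p2(1.1, 1.2))
    print(error, remainder_bound(0.1, 0.2))
```

The actual error at $(1.1, 1.2)$ turns out to be much smaller than the bound, which is not surprising since the bound only uses crude estimates of the third order partial derivatives.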
In the next step, we apply differentiation to the finding of local maxima / minima of functions in several variables. For motivation, let $f$ be such a function which is defined on some open subset $U$ of $\mathbb{R}^n$ and assumes a maximum or minimum in $x_0 \in U$. Again, we use paths to investigate the behavior of $f$ near $x_0$. For this, let $u \in \mathbb{R}^n$ be such that $|u| = 1$. Since $U$ is open, there is an open interval $I$ of $\mathbb{R}$ around $0$ such that the auxiliary path $g : I \to \mathbb{R}^n$, defined by
\[ g(t) := x_0 + t.u \]
for every $t \in I$ has its range in $U$. If $f$ is differentiable, $f \circ g$ is differentiable and assumes a maximum or a minimum, respectively, in $0$ since $(f \circ g)(0) = f(x_0)$. Hence it follows by Calculus I that the derivative of $f \circ g$ in $0$ vanishes, and it follows by the chain rule, Theorem 4.2.23, that
\[ 0 = (f \circ g)'(0) = u \cdot [(\nabla f)(x_0)] . \]
Since this is true for every such $u$, the last implies that the gradient of $f$ vanishes in $x_0$
\[ (\nabla f)(x_0) = 0 . \]
Below, we will derive the same result by more elementary means and without the assumption of differentiability of $f$.
Definition 4.3.10. (Local minima and maxima) Let $n \in \mathbb{N}^*$ and $f$ be some real-valued function which is defined on some open subset $U$ of $\mathbb{R}^n$. Then we say that $f$ has a local minimum, maximum at $x_0 \in U$ if there is an open ball $U(x_0)$ around $x_0$ such that
\[ f(x) \geq f(x_0) \]
for all $x \in U(x_0)$ and
\[ f(x) \leq f(x_0) \]
for all $x \in U(x_0)$, respectively.
Theorem 4.3.11. (Necessary condition for the existence of a local minimum/maximum) Let $n \in \mathbb{N}^*$, and let $f$ be a function defined on some open subset $U$ of $\mathbb{R}^n$ which has a local minimum/maximum at $x_0 \in U$, and which is partially differentiable at $x_0$ in each coordinate direction. Then
\[ \frac{\partial f}{\partial x_i}(x_0) = 0 , \quad i \in \{1, \dots, n\} . \]
Note that in the case that $f$ is differentiable in $x_0$, the last is equivalent to the vanishing of the derivative of $f$ in $x_0$
\[ f'(x_0) = 0 . \tag{4.3.4} \]
Also in cases where the range of $f$ is part of $\mathbb{R}^m$ for some $m \in \mathbb{N}^*$, such a point $x_0$ satisfying (4.3.4) will be called a critical point of $f$.
Proof. If $f$ has a local minimum (maximum) at $x_0 \in U$, it follows for $i \in \{1, \dots, n\}$ and sufficiently small $h \in \mathbb{R}^*$ that
\[ \frac{1}{h} \left[ f(x_{01}, \dots, x_{0(i-1)}, x_{0i} + h, x_{0(i+1)}, \dots, x_{0n}) - f(x_{01}, \dots, x_{0(i-1)}, x_{0i}, x_{0(i+1)}, \dots, x_{0n}) \right] \]
is $\geq (\leq) \, 0$ and $\leq (\geq) \, 0$, for $h > 0$ and $h < 0$, respectively. Therefore
\[ \frac{\partial f}{\partial x_i}(x_0) \]
is at the same time $\geq 0$ and $\leq 0$ and hence, finally, equal to $0$.
In particular, the following example shows that the vanishing of the gradient
of a function of several variables in a point of its domain does not always
indicate that the function assumes a maximum or a minimum in that point.
Example 4.3.12. Define the differentiable function $f_{10} : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f_{10}(x, y) := x^2 - y^2 \]
for all $(x, y) \in \mathbb{R}^2$. Then
\[ \frac{\partial f_{10}}{\partial x}(x, y) = 2x , \quad \frac{\partial f_{10}}{\partial y}(x, y) = -2y \]
for all $(x, y) \in \mathbb{R}^2$ and hence $(0, 0)$ is a critical point of $f_{10}$, but, obviously, not a local minimum or maximum. It is a so called ‘saddle point’. Note that the graph of $f_{10}$ is a quadric, namely a hyperbolic paraboloid. Since hyperbolic paraboloids look similar to saddles, these are also often called ‘saddle surfaces’.
Fig. 169: Graph of $f_{10}$.
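That the origin is not a local extremum of $f_{10}$ can be made explicit by exhibiting, in every ball around the origin, points with larger and with smaller values. The following sketch does this for a few radii; the helper name `saddle_witness` is an assumption chosen for illustration.

```python
def f10(x, y):
    # The function of Example 4.3.12.
    return x**2 - y**2

def grad_f10(x, y):
    # Partial derivatives of f10.
    return (2 * x, -2 * y)

def saddle_witness(radius):
    # Inside the ball of the given radius around the origin, f10 is
    # larger than f10(0, 0) on the x-axis and smaller on the y-axis.
    return f10(radius / 2, 0.0) > f10(0.0, 0.0) > f10(0.0, radius / 2)

if __name__ == "__main__":
    print(grad_f10(0, 0))
    print([saddle_witness(r) for r in (1.0, 1e-3, 1e-6)])
```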
In Calculus I, we derived a sufficient condition, in terms of the second order
derivative, for a function to assume a local minimum / maximum in a point
of its domain. In the following, we do the same for functions in several
variables. The proof of that result uses Taylor’s formula and the following
lemma. The proof of the last is given in the appendix.
Lemma 4.3.13. (Sylvester’s criterion) Let $n \in \mathbb{N}^*$, $A = (A_{ij})_{i,j \in \{1, \dots, n\}}$ be a real symmetric $n \times n$ matrix, i.e., such that $A_{ij} = A_{ji}$ for all $i, j \in \{1, \dots, n\}$. Then $A$ is positive definite, i.e.,
\[ \sum_{i,j = 1, \dots, n} A_{ij} h_i h_j > 0 \]
for all $h \in \mathbb{R}^n \setminus \{0\}$, if and only if all leading principal minors $\det(A_k)$, $k = 1, \dots, n$, of $A$ are $> 0$. Here
\[ A_k := (A_{ij})_{i,j \in \{1, \dots, k\}} , \quad k \in \{1, \dots, n\} . \]
Proof. See the proof of Theorem 5.3.8 in the appendix.
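Example 4.3.14. For the real symmetric matrix
\[ A := \begin{pmatrix} 1 & 2 & 0 \\ 2 & 5 & 3 \\ 0 & 3 & 11 \end{pmatrix} , \]
it follows that
\begin{align*}
\det(A_1) &= \det(1) = 1 > 0 , \\
\det(A_2) &= \begin{vmatrix} 1 & 2 \\ 2 & 5 \end{vmatrix} = 1 \cdot 5 - 2 \cdot 2 = 1 > 0 , \\
\det(A_3) &= \begin{vmatrix} 1 & 2 & 0 \\ 2 & 5 & 3 \\ 0 & 3 & 11 \end{vmatrix} = 1 \cdot 5 \cdot 11 - 3 \cdot 3 \cdot 1 - 11 \cdot 2 \cdot 2 = 2 > 0
\end{align*}
and hence that
\[ \sum_{i,j = 1,2,3} A_{ij} h_i h_j > 0 \]
for all $h \in \mathbb{R}^3 \setminus \{0\}$. Note that this can also be seen directly from
\begin{align*}
\sum_{i,j = 1,2,3} A_{ij} h_i h_j &= h_1 (h_1 + 2 h_2) + h_2 (2 h_1 + 5 h_2 + 3 h_3) + h_3 (3 h_2 + 11 h_3) \\
&= h_1^2 + 4 h_1 h_2 + 5 h_2^2 + 6 h_2 h_3 + 11 h_3^2 = (h_1 + 2 h_2)^2 + (h_2 + 3 h_3)^2 + 2 h_3^2 > 0
\end{align*}
for all $h \in \mathbb{R}^3 \setminus \{0\}$.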
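Sylvester’s criterion lends itself to a direct computation. The following is a minimal sketch, assuming small matrices given as lists of rows; the function names are chosen here, and the determinant is computed by cofactor expansion, which is only sensible for small $n$.

```python
def determinant(matrix):
    # Cofactor expansion along the first row (fine for small matrices).
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total

def leading_principal_minors(matrix):
    # det(A_k) for k = 1, ..., n, where A_k is the upper left k x k block.
    n = len(matrix)
    return [determinant([row[:k] for row in matrix[:k]])
            for k in range(1, n + 1)]

def is_positive_definite(matrix):
    # Sylvester's criterion for a real symmetric matrix.
    return all(minor > 0 for minor in leading_principal_minors(matrix))

if __name__ == "__main__":
    a = [[1, 2, 0], [2, 5, 3], [0, 3, 11]]   # matrix of Example 4.3.14
    print(leading_principal_minors(a), is_positive_definite(a))
```

Applied to the matrix of Example 4.3.14, the leading principal minors are 1, 1 and 2, in agreement with the calculation above.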
Theorem 4.3.15. (Sufficient condition for the existence of a local minimum/maximum) Let $n \in \mathbb{N}^*$ and $f$ be a real-valued function on some open subset $U$ of $\mathbb{R}^n$ which is of class $C^2$. Finally, let $x_0$ be a critical point for $f$. Then $f$ has a local minimum/maximum in $x_0$ if all leading principal minors of its Hessian matrix at $x_0$
\[ H(x_0) := \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(x_0) \right)_{i,j \in \{1, \dots, n\}} \]
are $> 0$/all leading principal minors of $-H(x_0)$ are $> 0$.
Proof. First, since $x_0$ is a critical point of $f$, we conclude by Taylor’s formula, Theorem 4.3.6, (together with Theorem 4.2.18) that for every $h$ from some sufficiently small ball $U_\varepsilon(0)$, $\varepsilon > 0$, around the origin
\[ f(x_0 + h) = f(x_0) + \frac{1}{2} [(h \cdot \nabla)^2 f](x_0 + \tau.h) = f(x_0) + \frac{1}{2} \sum_{i,j = 1, \dots, n} (H(x_0 + \tau.h))_{ij} \, h_i h_j \tag{4.3.5} \]
where $\tau \in [0, 1]$. Now, if all leading principal minors of $H(x_0)$ are $> 0$/all leading principal minors of $-H(x_0)$ are $> 0$, and since all leading principal minors of the Hessian of $f$ define continuous functions on $U$, $\varepsilon$ can be chosen such that also all leading principal minors of the Hessian of $f$ are $> 0$/all leading principal minors of the negative of the Hessian of $f$ are $> 0$ at all points from $U_\varepsilon(x_0)$. Hence it follows from (4.3.5) and by Lemma 4.3.13 that
\[ f(x_0 + h) \geq f(x_0) \quad ( f(x_0 + h) \leq f(x_0) ) \]
for all $h \in U_\varepsilon(0)$ and hence, finally, the theorem.
Example 4.3.16. The function $f_9 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$, see Example 4.3.9, is of class $C^2$ and has a critical point in $(1, 1)$. By (4.3.3), the negative of the Hessian matrix of $f_9$ in $(1, 1)$ is given by
\[ -H(1, 1) = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix} . \]
Hence the principal sub-determinants of $-H(1, 1)$ are given by $1/2$ and $0$, and hence Theorem 4.3.15 is not applicable. Nevertheless, $f_9$ has even a global maximum at $(1, 1)$ because $f_9(1, 1) = 1/2$ and
\[ \frac{x y}{x^2 + y^2} \leq \frac{1}{2} \, \frac{x^2 + y^2}{x^2 + y^2} = \frac{1}{2} \]
for every $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. This example demonstrates that the assumptions of Theorem 4.3.15 for the existence of a local minimum/maximum are not necessary.
Fig. 170: Graph of $f_9$.
In the case of functions of one real variable, it was shown that a continuous function $f : [a, b] \to \mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$, which is differentiable on the open interval $(a, b)$, assumes its extrema either in a critical point in $(a, b)$ or in the boundary points $a$ or $b$ of $[a, b]$. The same procedure can be applied to a continuous function $f$ of several variables that is differentiable on a bounded open set $U$ in the domain and whose domain arises as the closure of $U$. As a consequence, that domain is compact, and hence the function assumes a minimum and a maximum. A standard method for finding those extreme values compares the values of $f$ in critical points of $f$ in $U$ to the values of $f$ on the boundary of its domain.
Fig. 171: Graph of the constraint surface for $A = 6$.
Frequently in applications, maxima and minima of functions need to be
found whose domains are unbounded. In such a case, it is often possible to
find the maximum / minimum by decomposing the domain into a compact
subset C and an unbounded subset such that the function assumes a value
on C which is larger / smaller than the values assumed on the unbounded
set. In that case, the maximum / minimum exists and is assumed in C. Such
a case is considered in the following example.
Example 4.3.17. Find the length, width and height of a parallelepiped of given area $A > 0$ and maximal volume.
Solution: The volume $V$ and area $A$ of a rectangular box of length $x > 0$, width $y > 0$ and height $z > 0$ are given by:
\[ V = x y z , \quad A = 2 (x y + x z + y z) , \]
respectively. Hence if existent, we need to find the maximum of the function $V : [0, \infty) \times [0, \infty) \to \mathbb{R}$ defined by
\[ V(x, y) := \frac{x y}{x + y} \left( \frac{A}{2} - x y \right) \]
for every $(x, y) \in ([0, \infty) \times [0, \infty)) \setminus \{0\}$, and
\[ V(0, 0) := 0 . \]
Note for later application of Theorem 4.1.14 that, obviously, $V$ is of class $C^\infty$ on $(0, \infty) \times (0, \infty)$ as well as continuous on $D(V) \setminus \{0\}$. In addition because of
\[ \frac{x y}{x + y} \leq \frac{1}{2} \, \frac{(x + y)^2}{x + y} = \frac{1}{2} (x + y) \]
for every $(x, y) \in D(V) \setminus \{0\}$, it follows the continuity of $V$ in $(0, 0)$ and hence, finally, the continuity of $V$. Also, note that
\[ V(x, y) \leq \frac{A^2}{4} \, \frac{1}{\sqrt{x^2 + y^2}} \tag{4.3.6} \]
for every $(x, y) \in D(V) \setminus \{0\}$. This is obvious for $x y \geq A/2$ because in this case $V \leq 0$, whereas for $x y \leq A/2$, $(x, y) \neq (0, 0)$, it follows that
\[ V(x, y) = \frac{x y}{x + y} \left( \frac{A}{2} - x y \right) \leq \frac{A^2}{4} \, \frac{1}{x + y} \leq \frac{A^2}{4} \, \frac{1}{\sqrt{x^2 + y^2}} . \]
In the next step, we determine the critical points of $V$ on $(0, \infty) \times (0, \infty)$. The partial derivatives of $V$ are given by
\[ \frac{\partial V}{\partial x}(x, y) = \frac{y^2 \left( \frac{A}{2} - x^2 - 2 x y \right)}{(x + y)^2} , \quad \frac{\partial V}{\partial y}(x, y) = \frac{x^2 \left( \frac{A}{2} - y^2 - 2 x y \right)}{(x + y)^2} \]
for $x, y \in (0, \infty)$.
Fig. 172: Graph of $V$ for the case $A = 6$.
Hence the critical points of $V$ on $(0, \infty) \times (0, \infty)$ are given by the solutions of the system
\[ x^2 + 2 x y - \frac{A}{2} = 0 , \quad 2 x y + y^2 - \frac{A}{2} = 0 \]
which has the unique solution
\[ x_0 := y_0 := \sqrt{\frac{A}{6}} . \]
In $(x_0, y_0)$, the volume $V$ assumes the value
\[ V(x_0, y_0) = \left( \frac{A}{6} \right)^{3/2} . \tag{4.3.7} \]
Now we define the subset $C$ of $\mathbb{R}^2$ by
\[ C := D(V) \cap B_R(0) , \quad R := \frac{27}{4} \sqrt{A} . \]
Then $C$ is in particular closed and bounded and hence compact. According to Theorem 4.1.14, $V$ assumes a maximum value on $C$. Since $V$ vanishes on both axes and because of (4.3.7) and (4.3.6), it follows that $V$ does not assume its maximum on the boundary of $C$. Hence $V$ assumes its maximum in the interior of $C$, and it follows by Theorem 4.3.11 and the previous analysis that this happens in the point $(x_0, y_0)$. By (4.3.6), it follows that $V$ assumes its (global) maximum in $(x_0, y_0)$ and that its maximum value is given by (4.3.7). Note that this implies that the box is a cube.
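The result of Example 4.3.17 can be cross-checked by a crude grid search over the reduced volume function. This is a sketch under assumptions: the search window `upper` and the grid resolution are chosen here for illustration and are not taken from the text, and a grid search is of course no substitute for the argument above.

```python
import math

def volume(x, y, area):
    # Reduced volume function V(x, y) of Example 4.3.17.
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x + y) * (area / 2 - x * y)

def critical_point(area):
    # The unique critical point x0 = y0 = sqrt(A / 6).
    side = math.sqrt(area / 6)
    return side, side

def grid_maximum(area, upper, steps=400):
    # Largest value of V on a uniform grid over [0, upper] x [0, upper].
    best = (0.0, 0.0, 0.0)
    for i in range(steps + 1):
        for j in range(steps + 1):
            x, y = upper * i / steps, upper * j / steps
            value = volume(x, y, area)
            if value > best[0]:
                best = (value, x, y)
    return best

if __name__ == "__main__":
    print(critical_point(6.0), volume(1.0, 1.0, 6.0))
    print(grid_maximum(6.0, upper=4.0))
```

For $A = 6$ the critical point is $(1, 1)$ with $V(1, 1) = 1 = (A/6)^{3/2}$, and the grid search returns the same point because it lies on the chosen grid.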
The previous example solves the problem of finding the maximal volume of a parallelepiped of a given area. Problems of this type are called maximum / minimum problems with constraints. In such problems, the value of a function needs to be maximized / minimized under further conditions that restrict the elements of the domain which are considered in the maximization / minimization. In the previous example, the first was given by the function $V : [0, \infty)^3 \to \mathbb{R}$ that associates to every parallelepiped of length $x \geq 0$, width $y \geq 0$ and height $z \geq 0$ the corresponding volume
\[ V(x, y, z) := x y z . \]
The second was given by the condition that demanded that the area of the parallelepiped is equal to some prescribed area $A > 0$, i.e., that $x, y, z$ satisfy the additional equation
\[ A = 2 (x y + x z + y z) . \]
In the previous example, in the case that $z > 0$, we solved the last equation for $z$ and used this in the definition of $V$ in order to arrive at a ‘reduced’ function defined on $[0, \infty)^2$ which was subsequently maximized. In general, the last can lead to the presence of very complicated expressions in the definition of the ‘reduced’ function or may not be possible at all in terms of elementary functions. In such cases, the following method of Lagrange multipliers is helpful.
For its motivation, let $f$ be a real-valued ‘constraint’ function defined and of class $C^1$ on a non-empty open subset $U$ of $\mathbb{R}^n$, where $n \in \mathbb{N} \setminus \{0, 1\}$, and let
\[ S := \{x \in U : f(x) = 0\} , \]
the ‘constraint surface’, be such that
\[ (\nabla f)(x) \neq 0 \]
for every $x \in S$.
Fig. 173: Sketch of the constraint surface $S$.
Finally, let $g : U \to \mathbb{R}$ be differentiable, and assume that the restriction $g|_S$ of $g$ to $S$ has a maximum or minimum in $p \in S$. Then it follows for every differentiable path $\gamma : I \to S$ through $p$, where $I$ is some open interval around $0$ and $\gamma(0) = p$, that $g \circ \gamma$ has a maximum / minimum in $0$. Hence the derivative of $g \circ \gamma$ in $0$ vanishes. Therefore, we conclude by help of the chain rule for functions in several variables, Theorem 4.2.23, and Example 4.2.10 that
\[ (g \circ \gamma)'(0) = (\nabla g)(p) \cdot \gamma'(0) = 0 . \]
Since $f \circ \gamma$ vanishes identically, the same argument gives $(\nabla f)(p) \cdot \gamma'(0) = 0$. Hence $(\nabla f)(p)$ and $(\nabla g)(p)$ are both orthogonal to the ‘$(n-1)$-dimensional’ tangent space of $S$ at $p$ and therefore parallel. As a consequence, there is a so called ‘Lagrange multiplier’ $\lambda \in \mathbb{R}$ such that
\[ (\nabla g)(p) = \lambda.(\nabla f)(p) . \]
In this way, since in addition $f(p) = 0$, we arrive at $n + 1$ equations for the $n + 1$ unknowns given by $\lambda$ and the components of $p$.
Example 4.3.18. For this, we consider again the situation from Example 4.3.17. The constraint surface $S$ is given by the zero set of $f : U \to \mathbb{R}$, where
\[ U := \{(x, y, z) \in \mathbb{R}^3 : x > 0 \wedge y > 0 \wedge z > 0\} , \]
defined by
\[ f(x, y, z) := x y + x z + y z - \frac{A}{2} \]
for every $(x, y, z) \in U$. Note that
\[ (\nabla f)(x) = (y + z, x + z, x + y) \neq 0 \]
for every $(x, y, z) \in U$. The function $V : U \to \mathbb{R}$, defined by
\[ V(x, y, z) := x y z \]
for every $(x, y, z) \in U$, is to be maximized on $S$. According to the previous analysis, there is a real $\lambda$ such that
\[ (\nabla V)(x) = (y z, x z, x y) = \lambda.(\nabla f)(x) = \lambda.(y + z, x + z, x + y) . \]
Hence it follows that $\lambda \neq 0$ and
\[ \frac{1}{\lambda} = \frac{1}{y} + \frac{1}{z} = \frac{1}{x} + \frac{1}{z} = \frac{1}{x} + \frac{1}{y} \]
and therefore that
\[ x = y = z \]
and, finally by using the constraint equation $f(x, y, z) = 0$, that
\[ x = y = z = \sqrt{\frac{A}{6}} \]
which is identical to the result of Example 4.3.17.
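The Lagrange condition of Example 4.3.18 can be verified at the computed point. The sketch below is an illustration; the function names are chosen here, and the multiplier value $\lambda = x/2$ is derived from $y z = \lambda (y + z)$ with $x = y = z$.

```python
import math

def grad_volume(x, y, z):
    # Gradient of V(x, y, z) = xyz.
    return (y * z, x * z, x * y)

def grad_constraint(x, y, z):
    # Gradient of f(x, y, z) = xy + xz + yz - A/2.
    return (y + z, x + z, x + y)

def constraint(x, y, z, area):
    # Constraint function f; its zero set is the constraint surface S.
    return x * y + x * z + y * z - area / 2

def lagrange_point(area):
    # Candidate from Example 4.3.18 and the corresponding multiplier.
    side = math.sqrt(area / 6)
    return (side, side, side), side / 2

if __name__ == "__main__":
    point, multiplier = lagrange_point(6.0)
    residual = [gv - multiplier * gf
                for gv, gf in zip(grad_volume(*point), grad_constraint(*point))]
    print(point, multiplier, residual, constraint(*point, 6.0))
```

For $A = 6$ the point is $(1, 1, 1)$ with $\lambda = 1/2$; both the residual of $(\nabla V)(p) = \lambda.(\nabla f)(p)$ and the value of the constraint function vanish there.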
Usually, the proof of the Lagrange multiplier rule is based on the so called ‘implicit function theorem’ which itself is a consequence of the so called ‘inverse mapping theorem’. These theorems are not considered in the course; for instance, see [63], XVIII, §4, Theorem 4.6 and XVIII, §3, Theorem 3.1. In the following, we use [76] for the proof. We remark that the last reference gives a more general Lagrange multiplier rule that also applies to constraints given in the form of inequalities.
Theorem 4.3.19. (Lagrange multipliers) Let $n, m \in \mathbb{N}^*$ and $g, f_1, \dots, f_m$ be functions of class $C^1$ defined on some open subset $U$ of $\mathbb{R}^n$. Finally, assume that the restriction $g|_S$ of $g$ to the constraint surface $S$, defined by
\[ S := \{ x \in U : f_1(x) = \dots = f_m(x) = 0 \} , \]
assumes a minimum/maximum value in $p \in S$. Then there are ‘Lagrange multipliers’ $\lambda_0, \dots, \lambda_m \in \mathbb{R}$ that are not all $0$ and such that
\[ \lambda_0 . (\nabla g)(p) + \lambda_1 . (\nabla f_1)(p) + \dots + \lambda_m . (\nabla f_m)(p) = 0 . \]
Proof. First, we consider the case that $g|_S$ assumes a minimum value in $p$. For this, let $\varepsilon_0 > 0$ be such that the closed ball $B_{\varepsilon_0}(p)$ is contained in $U$. In addition, for every $M > 0$, we define an auxiliary function $h_M : U_{\varepsilon_0}(p) \to \mathbb{R}$ of class $C^1$ by
\[ h_M(x) := g(x) - g(p) + |x - p|^2 + M \sum_{k=1}^{m} f_k^2(x) \]
for every $x \in U_{\varepsilon_0}(p)$. In a first step, we conclude that for every $0 < \varepsilon \le \varepsilon_0$, there is $M(\varepsilon) > 0$ such that
\[ h_{M(\varepsilon)}(x) > 0 \]
for all $x \in S_\varepsilon(p)$. Otherwise, there is $0 < \varepsilon \le \varepsilon_0$ for which there is no $M > 0$ such that
\[ h_M(x) > 0 \]
for all $x \in S_\varepsilon(p)$. Hence for such $\varepsilon$ and any $N \in \mathbb{N}^*$, there is $x_N \in S_\varepsilon(p)$ such that
\[ h_N(x_N) \le 0 \tag{4.3.8} \]
or, equivalently, such that
\[ \sum_{k=1}^{m} f_k^2(x_N) \le - \frac{1}{N} \left[ g(x_N) - g(p) + \varepsilon^2 \right] . \tag{4.3.9} \]
Therefore, as a consequence of the boundedness and closedness of $S_\varepsilon(p)$ and by application of Bolzano-Weierstrass’ Theorem 4.1.9, there exists a strictly increasing sequence $N_1, N_2, \dots$ of non-zero natural numbers such that the corresponding sequence $x_{N_1}, x_{N_2}, \dots$ is convergent to some $x_* \in S_\varepsilon(p)$. By performing the limit in (4.3.9), it follows that $x_*$ belongs to the constraint surface $S$ and hence that $g(x_*) \ge g(p)$. But the last implies that $h_M(x_*) \ge \varepsilon^2$ for every $M > 0$ and hence that (4.3.8) cannot be valid for every $N \in \mathbb{N}^*$. Hence for the second step, let $0 < \varepsilon \le \varepsilon_0$ and $M(\varepsilon) > 0$ be such that
\[ h_{M(\varepsilon)}(x) > 0 \]
for all $x \in S_\varepsilon(p)$. Then there is $x_\varepsilon \in U_\varepsilon(p)$ and a unit vector $(\lambda_0(\varepsilon), \dots, \lambda_m(\varepsilon)) \in \mathbb{R}^{m+1}$ such that
\[ \lambda_0(\varepsilon) . \left[ (\nabla g)(x_\varepsilon) + 2 . (x_\varepsilon - p) \right] + \sum_{k=1}^{m} \lambda_k(\varepsilon) . (\nabla f_k)(x_\varepsilon) = 0 . \]
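This can be proved as follows. By Theorem 4.1.14, the restriction of $h_{M(\varepsilon)}$ to $B_\varepsilon(p)$ assumes a minimum value in some point $x_\varepsilon \in B_\varepsilon(p)$. Since $h_{M(\varepsilon)}(p) = 0$, it follows that $x_\varepsilon \in U_\varepsilon(p)$ and that
\[ (\nabla h_{M(\varepsilon)})(x_\varepsilon) = (\nabla g)(x_\varepsilon) + 2 . (x_\varepsilon - p) + 2 M(\varepsilon) . \sum_{k=1}^{m} f_k(x_\varepsilon) . (\nabla f_k)(x_\varepsilon) = 0 \]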
which implies the above statement. In the last step, we choose a sequence $\varepsilon_1, \varepsilon_2, \dots$ in the open interval between $0$ and $\varepsilon_0$ which is convergent to $0$. In particular, we choose it such that the corresponding sequence
\[ (\lambda_0(\varepsilon_1), \dots, \lambda_m(\varepsilon_1)), (\lambda_0(\varepsilon_2), \dots, \lambda_m(\varepsilon_2)), \dots \]
is convergent to a unit vector $(\lambda_0, \dots, \lambda_m)$ in $\mathbb{R}^{m+1}$. This is possible as a consequence of Bolzano-Weierstrass’ Theorem 4.1.9. Since
\[ \lim_{k \to \infty} x_{\varepsilon_k} = p , \]
we conclude that
\[ \lambda_0 . (\nabla g)(p) + \sum_{k=1}^{m} \lambda_k . (\nabla f_k)(p) = 0 . \]
Finally, if $g|_S$ assumes a maximum value in $p$, then $-g|_S$ assumes a minimum value in $p$ and the statement of the theorem follows by application of the just proved result to $-g|_S$.
The following gives a standard example for the application of the Lagrange multiplier rule: finding the extrema of the restriction, to the unit sphere around the origin, of a quadratic form that is associated to a matrix. This leads naturally to the notion of eigenvalues of matrices and their associated eigenvectors.
Example 4.3.20. Let $n \in \mathbb{N}^*$, $(a_{kl})_{k,l \in \{1,\dots,n\}}$ be a family of real numbers and $g : \mathbb{R}^n \to \mathbb{R}$ be defined by
\[ g(x) := \sum_{k,l=1}^{n} a_{kl} x_k x_l \]
for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Since $S_1^n(0) = f^{-1}(\{0\})$, where $f : \mathbb{R}^n \to \mathbb{R}$ is defined by
\[ f(x) := |x|^2 - 1 = -1 + \sum_{i=1}^{n} x_i^2 \]
for all $x \in \mathbb{R}^n$, is compact, the restriction of $g$ to $S_1^n(0)$ assumes a minimum and a maximum. Let $x$ be a point where this restriction assumes an extremum. Then it follows by Theorem 4.3.19 the existence of real $\lambda_0, \lambda_1$ such that $\lambda_0^2 + \lambda_1^2 \neq 0$ and
\[ \lambda_0 . (\nabla g)(x) + \lambda_1 . (\nabla f)(x) = 0 . \]
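Since
\[ g(x) = \sum_{k,l=1}^{n} a_{kl} x_k x_l = \sum_{k,l=1}^{n} a_{lk} x_l x_k = \sum_{k,l=1}^{n} a_{lk} x_k x_l \]
and hence
\[ g(x) = \frac{1}{2} \sum_{k,l=1}^{n} (a_{kl} + a_{lk}) \, x_k x_l \]
for all $x \in \mathbb{R}^n$, it follows that
\[ (\nabla g)(x) = \left( \sum_{k=1}^{n} (a_{1k} + a_{k1}) x_k , \dots , \sum_{k=1}^{n} (a_{nk} + a_{kn}) x_k \right) , \quad (\nabla f)(x) = 2x . \]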
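As a consequence, $x$ satisfies the following system of equations
\[ \lambda_0 \sum_{k=1}^{n} (a_{ik} + a_{ki}) x_k + 2 \lambda_1 x_i = 0 , \]
for $i = 1, \dots, n$. Further, since $x \neq 0$, it follows that $\lambda_0 \neq 0$ and hence that the last system is equivalent to
\[ \sum_{k=1}^{n} \frac{1}{2} (a_{ik} + a_{ki}) \, x_k = \lambda x_i , \]
for $i = 1, \dots, n$ where
\[ \lambda := - \frac{\lambda_1}{\lambda_0} . \]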
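By introducing matrix notation, the last system is equivalent to
\[ \begin{pmatrix} \bar{a}_{11} & \cdots & \bar{a}_{1n} \\ \vdots & & \vdots \\ \bar{a}_{n1} & \cdots & \bar{a}_{nn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \lambda . \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \tag{4.3.10} \]
where $\bar{a}_{kl} := (a_{kl} + a_{lk})/2$ for $k = 1, \dots, n$, $l = 1, \dots, n$ and the multiplication sign on the left hand side of the last equation denotes matrix multiplication. As a side remark, in general, if $\lambda \in \mathbb{R}$ and $x \in \mathbb{R}^n \setminus \{0\}$ satisfy such a matrix equation, $\lambda$ is called an eigenvalue of the matrix and $x$ an eigenvector of the matrix corresponding to $\lambda$. By Theorem 5.3.6 from the appendix, it follows that $\lambda$ satisfies
\[ \begin{vmatrix} \bar{a}_{11} - \lambda & \bar{a}_{12} & \cdots & \bar{a}_{1n} \\ \bar{a}_{21} & \bar{a}_{22} - \lambda & \cdots & \bar{a}_{2n} \\ \vdots & \vdots & & \vdots \\ \bar{a}_{n1} & \bar{a}_{n2} & \cdots & \bar{a}_{nn} - \lambda \end{vmatrix} = 0 \]
which leads to a polynomial equation for $\lambda$. After solution of that equation and substitution of the calculated values for $\lambda$ into (4.3.10), the solutions of the remaining system can be easily found.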
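For $n = 2$ the procedure of Example 4.3.20 can be carried out by hand or in a few lines of code. The following sketch is an illustration only: the coefficient matrix `a` and all function names are hypothetical. It symmetrizes the coefficients, solves the quadratic equation $\det(\bar{a} - \lambda) = 0$, and compares the two roots with a brute-force sampling of $g$ on the unit circle. The comparison uses that $g(x) = \lambda |x|^2 = \lambda$ at a solution $x$ of (4.3.10) with $|x| = 1$.

```python
import math

def symmetrize(a):
    """Return the matrix with entries (a[k][l] + a[l][k]) / 2."""
    n = len(a)
    return [[(a[k][l] + a[l][k]) / 2 for l in range(n)] for k in range(n)]

def eigenvalues_2x2_symmetric(s):
    """Roots of det(s - lam * I) = 0 for a symmetric 2x2 matrix s."""
    tr = s[0][0] + s[1][1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    disc = math.sqrt(max(0.0, tr * tr - 4 * det))  # non-negative for symmetric s
    return ((tr - disc) / 2, (tr + disc) / 2)

def g(a, x):
    """Quadratic form g(x) = sum over k, l of a[k][l] * x[k] * x[l]."""
    n = len(a)
    return sum(a[k][l] * x[k] * x[l] for k in range(n) for l in range(n))

a = [[2.0, 3.0], [1.0, 2.0]]   # hypothetical, non-symmetric coefficients
lam_min, lam_max = eigenvalues_2x2_symmetric(symmetrize(a))

# Brute-force check: sample g on the unit circle.
count = 100000
samples = [g(a, (math.cos(2 * math.pi * i / count), math.sin(2 * math.pi * i / count)))
           for i in range(count)]
```

Here $g(x) = 2 (x_1 + x_2)^2$, so the minimum and maximum on the unit circle are $0$ and $4$, in agreement with the two eigenvalues of the symmetrized matrix.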
Problems
1) Find the rate of change of $f : D \to \mathbb{R}$ at the point $p$ in the direction of $v$. In addition, find the direction of steepest ascent / steepest descent of $f$ in $p$ and the associated rates.
a) f px, y q : x2 2xy 3y 2 for all px, y q P D : R2 , p p1, 2q,
v p2, 1q ,
b) f px, y q : y cospxy q for all px, y q P D : R2 , p p0, 2q,
v pcospπ {3q, sinpπ {3qq ,
c) f px, y q : x expp2px2 y 2 qq for all px, y q P D : R2 ,
p p1, 0q, v p1, 3q ,
d) f px, y q : lnpx2 y 2 q for all px, y q P D : R2 zt0u, p p1, 1q, v p3, 3q ,
e) f px, y, z q : xy yz xz for all px, y, z q P D : R3 , p p1, 2, 1q, v p1, 1, 1q ,
f) f px, y, z q : 5x2 3xy xyz for all px, y, z q P D : R3 ,
p p3, 4, 5q, v p1, 1, 1q ,
g) f px, y, z q : xyz px{y q py {z q pz {xq for all x ¡ 0, y ¡ 0,
z ¡ 0, p p2, 1, 4q, v p1, 1, 1q .
2) Decide whether the matrix is symmetric and, if so, whether it is positive definite.
A1 :
A4 :
3
7
2
4
4
1
, A2 :
1
3
4
A6 : 2
5
A1 :
A4 :
6
1
9k
k
k
A6 : 5
8
2
6
, A3 :
1
3
1
5
3
4 , A7 : 3
6
1
k
k
2
3
4
1
2k
4
2
3
, A5 : 1
1
3) Decide which values of k
5
12
4
5
3
1
1 ,
5
3
3
1
3
,
2
1
1 .
9
P R make the matrix positive definite.
, A2 :
3
k
10
, A5 : 2
5k
, A3 :
k
2
2 5k
9
3 ,
3
7
8
6
k
4 , A7 : k
5
2k
2 7k
k
4k
4k
1
,
2
7k .
14
4) Calculate the Taylor polynomial of $f : D \to \mathbb{R}$ of total degree $\le 2$ at $p$, and estimate the corresponding remainder term on $B$.
a) f px, y q : sinpx y q for all px, y q P R2 , p p0, 0q, B tpx, yq P R2 : |x| ¤ 1 ^ |y| ¤ 1u ,
b) f px, y q : ex y for all px, y q P R2 , p p0, 0q, B tpx, y q P
R2 : |x| ¤ 1 ^ |y | ¤ 1u ,
c) f px, y q : p1 x y q1{2 for all px, y q P R2 such that y ¥ p1
xq, p p0, 0q, B tpx, y q P R2 : |x| ¤ 1{2 ^ |y | ¤ 1{2u ,
d) f px, y q : xy for all x ¡ 0, y P R, p p1, 1q, B tpx, y q P
R2 : |x 1| ¤ 1{10 ^ |y 1| ¤ 1{10u .
5) Find the maximum and minimum values, as far as they exist, of $f : D \to \mathbb{R}$ and the points where they are assumed. If applicable, $a, b \in \mathbb{R}$.
a) f px, y q : xp2
4y q 5x2 y 2 for px, y q P D : R2 ,
x
b) f px, y q : xy 1 y for px, y q P D : R2 ,
2
c) f px, y q : 2 2x 5x2 2y p4 x 5y q
Ñ
for px, y q P D : R2 ,
d)
f px, y q : x3
xy
e) f px, y q : x4
y 3 for px, y q P D : R2 ,
y 4 2x2
for px, y q P D : R2 ,
pa3 {xq pa3 {yq
for px, y q P D : tpx, y q P R2 : x ¡ 0 ^ y ¡ 0u ,
g) f px, y q : x3 y 3 9xy 27
for px, y q P D : tpx, y q P R2 : 0 ¤ x ¤ 4 ^ 0 ¤ y ¤ 4u ,
h) f px, y q : x4 y 4 2x2 4xy 2y 2
for px, y q P D : tpx, y q P R2 : 0 ¤ x ¤ 2 ^ 0 ¤ y ¤ 2u ,
i) f px, y q : epx y q pax2 by 2 q for px, y q P D : R2 ,
where a, b ¡ 0 ,
j) f px, y, z q : xyz p4a x y z q for px, y q P D : R3 ,
k) f px, y, z q : px3 y 3 z 3 q{pxyz q for
px, y, zq P D : tpx, y, zq P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u ,
l) f px, y, z q : rx{py z qs ry {px z qs rz {px y qs for
px, y, zq P D : tpx, y, zq P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u .
Find the maximum and minimum values of g : D Ñ R on the set(s)
f)
f px, y q : x2
xy
2
6)
4xy 2y 2
y2
2
S and the points where they are assumed. Give reasons for the existence of such values.
a) f px, y q : x2
2xy
y 2 for px, y q P D : R2 ,
on S : tpx, y q P R2 : x2 2x
b)
c)
d)
e)
f)
0u ,
f px, y q : x
2y for px, y q P D : R2 ,
on S : tpx, y q P R2 : x4 y 4 1u ,
f px, y q : x2 y 2 for px, y q P D : R2 ,
on S : tpx, y q P R2 : 3 px2 y 2 q 2xy 1u
f px, y q : x2 xy y 2 for px, y q P D : R2 ,
on S : tpx, y q P R2 : x2 y 2 1u ,
f px, y q : xy for px, y q P D : R2 ,
on S : tpx, y q P R2 : x2 y 2 1u ,
f px, y, z q : xyz for px, y, z q P D : R3 ,
2
y2
2
,
on S : tpx, y, z q P R3 : x2
g)
h)
i)
3u ,
f px, y, z q : x
2y
3z for px, y, z q P D : R3 ,
on S1 : tpx, y, z q P R3 : x2 y 2 z 2 1u ,
S2 : tpx, y, z q P R3 : x 2y 3z 0u ,
f px, y, z q : x2 y 2 z 2 for px, y, z q P D : R3 ,
on S1 : tpx, y, z q P R3 : x y z 0u ,
S2 : tpx, y, z q P R3 : px2 y 2 z 2 q2 x2 2y 2 4z 2 u ,
f px, y, z q : sinpx{2q sinpy {2q sinpz {2q
for px, y, z q P D : tpx, y, z q P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u ,
on S : tpx, y, z q P R3 : x y z π u .
2
2
y2
z2
2
7) Let $p > 0$. Determine the triangle with largest circumscribed area and perimeter $2p$.
8) Determine the point inside a quadrilateral $V$ with minimal sum of squares of distances from the corners.
9) Determine the point inside a quadrilateral $V$ with minimal sum of distances from the corners.
10) Determine the triangle with maximal sum of squares of side lengths and corners on a circle.
11) Let $p \in \mathbb{R}^3 \setminus \{0\}$. Determine the plane of largest distance from the origin among all planes through $p$.
12) Let $a > b > c > 0$ and
\[ E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} . \]
Find the point of $E$ that has largest distance from the origin.
Fig. 174: Archimedes’ determination of the volume of paraboloidal solids of revolution, see text. (Labels in the figure: $O$, $P$, $r$, $C$, $h$.)
4.4 Integration of Functions of Several Variables
Archimedes’ determination of the volumes of paraboloidal, hyperboloidal and ellipsoidal solids of revolution can be seen as early examples of integration of functions of several variables. All these solids are symmetric with respect to rotations around a line segment, the so called ‘axis of symmetry’, that is part of the body. In particular, he showed that the volume $V$ of a paraboloidal solid of revolution $P$ inscribed in a circular cylinder $C$ with radius $r$ and height $h$ is one half of the volume $V_C$ of $C$
\[ V = \frac{1}{2} V_C , \]
see Fig 174. For the proof, he divides the symmetry axis into $n \in \mathbb{N}^*$ equal parts of length $h/n$. Through the points of division, $A_0, A_1, \dots, A_n$, he passes planes parallel to the base. On the circular sections that these planes cut out of the surface of the solid, he constructs inscribed and circumscribed cylindrical frusta as indicated in Fig 175. The last displays the intersections of the boundaries of the solid and the frusta with a plane containing the axis of symmetry $OA$ of the body. The points $B_0, B_1, \dots, B_n$ are intersection points of the circular sections with the plane.

Fig. 175: Archimedes’ determination of the volume of paraboloidal solids of revolution, see text. The points labeled $A(i)$, $B(i)$ in the figure are denoted in the text by $A_i$ and $B_i$, respectively, where $i \in \{0, \dots, n\}$; in particular, $O = A(0) = B(0)$ and $A = A(n)$.

By summing the volumes of the frusta, he arrives at the inequality
\[ V_C \sum_{i=1}^{n-1} \frac{[l(A_i B_i)]^2}{n r^2} = \sum_{i=1}^{n-1} \pi \, [l(A_i B_i)]^2 \, \frac{h}{n} \le V \le \sum_{i=1}^{n} \pi \, [l(A_i B_i)]^2 \, \frac{h}{n} = V_C \sum_{i=1}^{n} \frac{[l(A_i B_i)]^2}{n r^2} \tag{4.4.1} \]
where $l(A_i B_i)$ denotes the length of the line segment $A_i B_i$ for $i \in \{0, \dots, n\}$ and $V_C = \pi r^2 h$ is the volume of the cylinder $C$. Further, since the bounding curve in Fig 175 is a parabola, it follows by ancient Greek knowledge on parabolic segments, see (iii) in Example 3.5.26, that
\[ \frac{i}{n} = \frac{i h / n}{h} = \frac{[l(A_i B_i)]^2}{r^2} \]
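for every $i \in \{1, \dots, n\}$. Hence it follows from (4.4.1) that
\[ \frac{1}{2} \left( 1 - \frac{1}{n} \right) = \frac{1}{n^2} \, \frac{(n-1) n}{2} = \frac{1}{n^2} \sum_{i=1}^{n-1} i = \sum_{i=1}^{n-1} \frac{[l(A_i B_i)]^2}{n r^2} \le \frac{V}{V_C} \le \sum_{i=1}^{n} \frac{[l(A_i B_i)]^2}{n r^2} = \frac{1}{n^2} \sum_{i=1}^{n} i = \frac{1}{n^2} \, \frac{n (n+1)}{2} = \frac{1}{2} \left( 1 + \frac{1}{n} \right) . \]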
As a consequence,
\[ \left| \frac{V}{V_C} - \frac{1}{2} \right| \le \frac{1}{2n} . \]
In order to conclude from the last that
\[ \frac{V}{V_C} = \frac{1}{2} , \tag{4.4.2} \]
Archimedes had to employ a usual ‘double reductio ad absurdum’ argument, i.e., to lead both assumptions that $V / V_C < 1/2$ and that $V / V_C > 1/2$ to a contradiction which leaves only the option that $V / V_C = 1/2$. This can be done as follows. If $V / V_C = (1/2) \pm \varepsilon$ for some $\varepsilon > 0$, it follows for $n > 1/(2\varepsilon)$ that
\[ \left| \frac{V}{V_C} - \frac{1}{2} \right| = \varepsilon > \frac{1}{2n} \]
which contradicts the preceding estimate for $|V / V_C - 1/2|$. Hence the only remaining possibility is that $V / V_C = 1/2$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis.
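The two-sided estimate for $V / V_C$ can be tabulated directly. The sketch below is an illustration only (the function name `archimedes_bounds` is hypothetical): it sums the squared radius ratios $[l(A_i B_i)]^2 / r^2 = i/n$ of the inscribed and circumscribed frusta in exact rational arithmetic and shows both bounds closing in on $1/2$.

```python
from fractions import Fraction

def archimedes_bounds(n):
    """Lower and upper bounds for V / V_C obtained from n slices of the axis.

    Uses [l(A_i B_i)]^2 / r^2 = i / n for the i-th circular section.
    """
    lower = sum((Fraction(i, n) for i in range(1, n)), Fraction(0)) / n       # inscribed
    upper = sum((Fraction(i, n) for i in range(1, n + 1)), Fraction(0)) / n   # circumscribed
    return lower, upper

for n in (1, 10, 100, 1000):
    lo, up = archimedes_bounds(n)
    print(n, float(lo), float(up))
```

The printed bounds equal $(1/2)(1 - 1/n)$ and $(1/2)(1 + 1/n)$, so their difference is $1/n$.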
By introduction of a Cartesian coordinate system with origin in $A$ and $z$-axis in the direction of the line segment from $A$ to $O$, we achieve that $P$ is enclosed by the $x, y$-plane and the graph of $f_P : U_r(0) \to \mathbb{R}$ defined by
\[ f_P(x, y) := h \left( 1 - \frac{x^2 + y^2}{r^2} \right) \]
for all $(x, y) \in U_r(0)$. Below, the Riemann integral of $f_P$, giving the volume enclosed by the $x, y$-plane and the graph of $f_P$, will be defined essentially by a construction similar to that of Archimedes and denoted by
\[ \int_{U_r(0)} f_P(x, y) \, dx \, dy . \]
Then the previous shows that
\[ \int_{U_r(0)} f_P(x, y) \, dx \, dy = \frac{1}{2} \int_{U_r(0)} h \, dx \, dy \]
where the integral on the right hand side of the last equation is the Riemann integral of the constant function of value $h$ on $U_r(0)$.
It is worth noting that Archimedes’ inscribed and circumscribed cylindrical frusta are associated to a partition of the range of $f_P$ rather than to a partition of its domain. Partitions of the range are an important tool in Lebesgue integration, whereas partitions of the domain provide the basis for Riemann integration.
After this introduction, we start with natural definitions of intervals in $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, volume of intervals, partitions of intervals and corresponding lower and upper sums of bounded functions. In large parts, the following presentation of Riemann integration of functions of several variables is analogous to that of Calculus I for functions in one variable.
Definition 4.4.1.
(i) Let $a, b \in \mathbb{R}$ be such that $a \le b$ and $[a, b]$ be the corresponding closed interval in $\mathbb{R}$. A partition $P$ of $[a, b]$ is an ordered sequence $(a_0, \dots, a_\nu)$ of elements of $[a, b]$, where $\nu$ is an element of $\mathbb{N}$, such that
\[ a = a_0 \le a_1 \le \dots \le a_\nu = b . \]
Since $(a, b)$ is such a partition of $[a, b]$, the set of all partitions of that interval is non-empty. A partition $P'$ of $[a, b]$ is called a refinement of $P$ if $P$ is a subsequence of $P'$.
(ii) Let $n \in \mathbb{N}$ be such that $n \ge 2$. A closed interval $I$ of $\mathbb{R}^n$ is the product of $n$ closed intervals $I_1, \dots, I_n$ of $\mathbb{R}$:
\[ I = I_1 \times \dots \times I_n . \]
We define the volume $v(I)$ of $I$ as the product of the lengths $l(I_i)$ of the intervals $I_i$, $i \in \{1, \dots, n\}$
\[ v(I) := l(I_1) \cdots l(I_n) . \]
A partition $P$ of $I$ is a sequence $(P_1, \dots, P_n)$ consisting of partitions $P_i$ of $I_i$, $i \in \{1, \dots, n\}$. A partition $P' = (P_1', \dots, P_n')$ of $I$ is called a refinement of a partition $P = (P_1, \dots, P_n)$ if $P_i'$ is a refinement of $P_i$ for every $i \in \{1, \dots, n\}$. A partition $P = ((a_{10}, \dots, a_{1\nu_1}), \dots, (a_{n0}, \dots, a_{n\nu_n}))$ induces a division of $I$ into, in general non-disjoint, closed subintervals
\[ I = \bigcup_{j_1=0}^{\nu_1 - 1} \dots \bigcup_{j_n=0}^{\nu_n - 1} I_{j_1 \dots j_n} , \quad I_{j_1 \dots j_n} := [a_{1 j_1}, a_{1 (j_1+1)}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] , \]
for $j_1 = 0, \dots, \nu_1 - 1$; $\dots$; $j_n = 0, \dots, \nu_n - 1$. The size of $P$ is defined as the maximum of all the lengths of these subintervals. In addition, we define for every bounded function $f$ on $I$ the lower sum $L(f, P)$ and upper sum $U(f, P)$ corresponding to $P$ by
\[ L(f, P) := \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} \inf \{ f(x) : x \in I_{j_1 \dots j_n} \} \, v(I_{j_1 \dots j_n}) , \]
\[ U(f, P) := \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} \sup \{ f(x) : x \in I_{j_1 \dots j_n} \} \, v(I_{j_1 \dots j_n}) . \]
Note that if $K > 0$ is such that $|f(x)| \le K$ for all $x \in I$, it follows that
\[ -K \le \inf \{ f(x) : x \in J \} \le \sup \{ f(x) : x \in J \} \le K \]
for every subset $J$ of $I$ and hence that
\[ \begin{aligned}
|L(f, P)| &\le \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} | \inf \{ f(x) : x \in I_{j_1 \dots j_n} \} | \, v(I_{j_1 \dots j_n}) \le K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} v(I_{j_1 \dots j_n}) \\
&= K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} (a_{1 (j_1+1)} - a_{1 j_1}) \cdots (a_{n (j_n+1)} - a_{n j_n}) = K \, v(I) , \\
|U(f, P)| &\le \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} | \sup \{ f(x) : x \in I_{j_1 \dots j_n} \} | \, v(I_{j_1 \dots j_n}) \le K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} v(I_{j_1 \dots j_n}) = K \, v(I) .
\end{aligned} \]
As a consequence, the sets
\[ \{ L(f, P) : P \in \mathcal{P} \} , \quad \{ U(f, P) : P \in \mathcal{P} \} \]
are bounded where $\mathcal{P}$ denotes the set of all partitions of $I$.

Fig. 176: Divisions of $[0, 1] \times [0, 1]$ induced by $P_0$ and $P_1$, respectively, see Example 4.4.2.
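Example 4.4.2. Consider the closed interval $I := [0, 1] \times [0, 1]$ in $\mathbb{R}^2$ and the continuous function $f : I \to \mathbb{R}$ defined by $f(x, y) := x$ for all $(x, y) \in I$. Then
\[ P_0 := ((0, 1), (0, 1)) , \quad P_1 := ((0, 1/2, 1), (0, 1/2, 1)) \]
are partitions of $I$. Also, $P_1$ is a refinement of $P_0$. Finally,
\[ L(f, P_0) = 0 \cdot 1 = 0 , \quad U(f, P_0) = 1 \cdot 1 = 1 , \]
\[ L(f, P_1) = 0 \cdot \left( \tfrac{1}{2} \right)^2 + 0 \cdot \left( \tfrac{1}{2} \right)^2 + \tfrac{1}{2} \cdot \left( \tfrac{1}{2} \right)^2 + \tfrac{1}{2} \cdot \left( \tfrac{1}{2} \right)^2 = \tfrac{1}{4} , \]
\[ U(f, P_1) = \tfrac{1}{2} \cdot \left( \tfrac{1}{2} \right)^2 + \tfrac{1}{2} \cdot \left( \tfrac{1}{2} \right)^2 + 1 \cdot \left( \tfrac{1}{2} \right)^2 + 1 \cdot \left( \tfrac{1}{2} \right)^2 = \tfrac{3}{4} \]
and hence
\[ L(f, P_0) \le L(f, P_1) \le U(f, P_1) \le U(f, P_0) . \]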
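Lower and upper sums of this kind are easy to compute for simple functions. The following sketch is an illustration under a stated assumption, not a general implementation: it assumes that $f$ is non-decreasing in each variable, so that on every subinterval the infimum is attained at the lower-left corner and the supremum at the upper-right corner. The function name `lower_upper_sums` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def lower_upper_sums(f, xs, ys):
    """Lower and upper sums of f for the partition (xs, ys) of a rectangle.

    Assumption: f is non-decreasing in each variable, so the infimum on a
    subinterval is f at its lower-left corner and the supremum is f at its
    upper-right corner.
    """
    lower = upper = Fraction(0)
    for j1, j2 in product(range(len(xs) - 1), range(len(ys) - 1)):
        x0, x1 = xs[j1], xs[j1 + 1]
        y0, y1 = ys[j2], ys[j2 + 1]
        volume = (x1 - x0) * (y1 - y0)
        lower += f(x0, y0) * volume
        upper += f(x1, y1) * volume
    return lower, upper

f = lambda x, y: x
half = Fraction(1, 2)
part0 = ([Fraction(0), Fraction(1)], [Fraction(0), Fraction(1)])
part1 = ([Fraction(0), half, Fraction(1)], [Fraction(0), half, Fraction(1)])
sums_part0 = lower_upper_sums(f, *part0)
sums_part1 = lower_upper_sums(f, *part1)
```

For $f(x, y) = x$ this gives the pairs $(0, 1)$ and $(1/4, 3/4)$ for the trivial partition and its refinement, respectively.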
Intuitively, it is to be expected that a refinement of a partition of an interval
leads to a decrease of corresponding upper sums and an increase of corresponding lower sums as has also been found in the special case in the
previous example. Indeed, this intuition is correct.
Lemma 4.4.3. Let $n \in \mathbb{N}$ be such that $n \ge 2$, $I = I_1 \times \dots \times I_n$ be a closed interval of $\mathbb{R}^n$ and $f$ be a bounded real-valued function on $I$. Further, let $P = (P_1, \dots, P_n)$, $P' = (P_1', \dots, P_n')$ be partitions of $I$, and in particular let $P'$ be a refinement of $P$. Then
\[ L(f, P) \le L(f, P') \le U(f, P') \le U(f, P) . \tag{4.4.3} \]
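Proof. The middle inequality is obvious from the definition of lower and upper sums given in Def 4.4.1(ii). Obviously for the proof of the remaining inequalities, it is sufficient (by the method of induction) to assume that there is $i_0 \in \{1, \dots, n\}$ such that $P_i' = P_i$ for $i \neq i_0$, for simplicity of notation, we assume $i_0 = 1$, and $P_1' = (a_{10}, a_{11}', a_{11}, \dots, a_{1\nu_1})$ where $a_{11}' \in I_1$ is such that
\[ a_{10} \le a_{11}' \le a_{11} . \]
Here we again simplified for notational reasons. Then
\[ \begin{aligned}
L(f, P') - L(f, P) = \sum_{j_2=0}^{\nu_2 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} \Big[ \, & \inf \{ f(x) : x \in [a_{10}, a_{11}'] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] \} \; v([a_{10}, a_{11}'] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) \\
+ \, & \inf \{ f(x) : x \in [a_{11}', a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] \} \; v([a_{11}', a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) \\
- \, & \inf \{ f(x) : x \in [a_{10}, a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] \} \; v([a_{10}, a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) \Big] \\
\ge \sum_{j_2=0}^{\nu_2 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} & \inf \{ f(x) : x \in [a_{10}, a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] \} \\
\cdot \Big[ \, & v([a_{10}, a_{11}'] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) + v([a_{11}', a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) \\
- \, & v([a_{10}, a_{11}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}]) \Big] = 0 .
\end{aligned} \]
Analogously, it follows that
\[ U(f, P') - U(f, P) \le 0 \]
and hence, finally, (4.4.3).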
As a consequence of their definition, lower sums are smaller than upper
sums. It is not difficult to show that the same is true for the supremum of
the lower sums and the infimum of the upper sums.
Theorem 4.4.4. Let $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$ where $n \in \mathbb{N}$ is such that $n \ge 2$. Then
\[ \sup ( \{ L(f, P) : P \in \mathcal{P} \} ) \le \inf ( \{ U(f, P) : P \in \mathcal{P} \} ) . \tag{4.4.4} \]
Proof. By Lemma 4.4.3, it follows for all $P_1, P_2 \in \mathcal{P}$ that
\[ L(f, P_1) \le L(f, P) \le U(f, P) \le U(f, P_2) , \]
where $P \in \mathcal{P}$ is some common refinement of $P_1$ and $P_2$, and hence that
\[ \sup ( \{ L(f, P_1) : P_1 \in \mathcal{P} \} ) \le U(f, P_2) \]
and hence (4.4.4).
As a consequence of Lemma 4.4.3 and since every partition $P$ of some closed interval $I$ of $\mathbb{R}^n$ is a refinement of the trivial partition containing only the coordinates of the initial and endpoints, we can make the following definition.
Definition 4.4.5. (The Riemann integral, I) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$, and denote by $\mathcal{P}$ the set consisting of all partitions of $I$. We say that $f$ is Riemann-integrable on $I$ if
\[ \sup ( \{ L(f, P) : P \in \mathcal{P} \} ) = \inf ( \{ U(f, P) : P \in \mathcal{P} \} ) . \]
In that case, we define the integral of $f$ on $I$ by
\[ \int_I f \, dv := \sup ( \{ L(f, P) : P \in \mathcal{P} \} ) = \inf ( \{ U(f, P) : P \in \mathcal{P} \} ) . \]
We also use sometimes the notation
\[ \int_I f(x_1, \dots, x_n) \, dx_1 \dots dx_n \]
for the integral indicating a Cartesian coordinate system. In particular if $f(x) \ge 0$ for all $x \in I$, we define the volume under the graph of $f$ by
\[ \int_I f \, dv . \]
Example 4.4.6. Let $f$ be a constant function of value $a \in \mathbb{R}$ on some closed interval $I$ of $\mathbb{R}^n$ where $n \in \mathbb{N}$ is such that $n \ge 2$. In particular, $f$ is bounded. Further, let $P = ((a_{10}, \dots, a_{1\nu_1}), \dots, (a_{n0}, \dots, a_{n\nu_n}))$ be a partition of $I$ and
\[ I = \bigcup_{j_1=0}^{\nu_1 - 1} \dots \bigcup_{j_n=0}^{\nu_n - 1} I_{j_1 \dots j_n} , \quad I_{j_1 \dots j_n} := [a_{1 j_1}, a_{1 (j_1+1)}] \times \dots \times [a_{n j_n}, a_{n (j_n+1)}] , \]
for $j_1 = 0, \dots, \nu_1 - 1$; $\dots$; $j_n = 0, \dots, \nu_n - 1$ be the induced division of $I$ into closed subintervals of $I$. Then
\[ \begin{aligned}
L(f, P) = U(f, P) &= a \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} v(I_{j_1 \dots j_n}) \\
&= a \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} (a_{1 (j_1+1)} - a_{1 j_1}) \cdots (a_{n (j_n+1)} - a_{n j_n}) = a \, v(I) .
\end{aligned} \]
Hence $f$ is Riemann-integrable and
\[ \int_I f \, dv = a \, v(I) . \]
Note that according to the previous example, the integral of every function
defined on an interval with one vanishing side is zero. The values of the
function on such an interval do not affect the value of the integral. This
observation will lead further down to the definition of so called zero sets.
Example 4.4.7. Consider the closed interval $I := [0, 1]^2$ of $\mathbb{R}^2$ and the function $f : I \to \mathbb{R}$ defined by
\[ f(x, y) := x y \]
for all $(x, y) \in I$. Since
\[ |f(x, y)| = |x| \cdot |y| \le 1 \]
for all $(x, y) \in I$, $f$ is bounded. For every $n \in \mathbb{N}^*$, define the partition $P_n$ of $I$ by
\[ P_n := \left( \left( 0, \frac{1}{n}, \dots, \frac{n}{n} \right) , \left( 0, \frac{1}{n}, \dots, \frac{n}{n} \right) \right) . \]
Calculate $L(f, P_n)$ and $U(f, P_n)$ for all $n \in \mathbb{N}^*$. What is the value of
\[ \int_I f \, dv \; ? \]
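Solution: We have
\[ I = \bigcup_{j_1=0}^{n-1} \bigcup_{j_2=0}^{n-1} \left[ \frac{j_1}{n}, \frac{j_1 + 1}{n} \right] \times \left[ \frac{j_2}{n}, \frac{j_2 + 1}{n} \right] , \]
and
\[ L(f, P_n) = \sum_{j_1=0}^{n-1} \sum_{j_2=0}^{n-1} \frac{j_1 j_2}{n^2} \, \frac{1}{n^2} = \frac{1}{n^4} \left( \sum_{j_1=0}^{n-1} j_1 \right) \left( \sum_{j_2=0}^{n-1} j_2 \right) = \frac{1}{n^4} \left[ \frac{n (n-1)}{2} \right]^2 = \frac{1}{4} \left( 1 - \frac{1}{n} \right)^2 , \]
\[ U(f, P_n) = \sum_{j_1=0}^{n-1} \sum_{j_2=0}^{n-1} \frac{(j_1 + 1)(j_2 + 1)}{n^2} \, \frac{1}{n^2} = \frac{1}{n^4} \left( \sum_{j_1=1}^{n} j_1 \right) \left( \sum_{j_2=1}^{n} j_2 \right) = \frac{1}{n^4} \left[ \frac{n (n+1)}{2} \right]^2 = \frac{1}{4} \left( 1 + \frac{1}{n} \right)^2 . \]
Hence
\[ \lim_{n \to \infty} L(f, P_n) = \lim_{n \to \infty} U(f, P_n) = \frac{1}{4} . \]
As a consequence, it follows that
\[ \frac{1}{4} \le \sup ( \{ L(f, P) : P \in \mathcal{P} \} ) \]
and
\[ \inf ( \{ U(f, P) : P \in \mathcal{P} \} ) \le \frac{1}{4} \]
and hence by Theorem 4.4.4 that
\[ \sup ( \{ L(f, P) : P \in \mathcal{P} \} ) = \inf ( \{ U(f, P) : P \in \mathcal{P} \} ) = \frac{1}{4} . \]
Hence $f$ is Riemann-integrable and
\[ \int_I f \, dv = \frac{1}{4} . \]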
Note that the product
\[ \left( \int_0^1 x \, dx \right) \left( \int_0^1 y \, dy \right) \]
gives the same value. That this is not just accidental will be seen later on. This result can be obtained by application of Fubini’s theorem given below.
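The sums of Example 4.4.7 can be recomputed by brute force. The following sketch is an illustration only (the function names are hypothetical): it evaluates the double sums for $f(x, y) = xy$ over the uniform partition $P_n$ of $[0, 1]^2$ in exact rational arithmetic and compares them with the closed forms $\frac{1}{4} (1 - 1/n)^2$ and $\frac{1}{4} (1 + 1/n)^2$. Since $f$ is non-decreasing in each variable on $[0, 1]^2$, the infimum and supremum on each subinterval are taken at its lower-left and upper-right corner.

```python
from fractions import Fraction

def sums_xy(n):
    """Lower and upper sums of f(x, y) = x*y on [0,1]^2 for the uniform partition P_n."""
    cell = Fraction(1, n * n)  # volume of each subinterval
    lower = sum((Fraction(j1 * j2, n * n) * cell
                 for j1 in range(n) for j2 in range(n)), Fraction(0))
    upper = sum((Fraction((j1 + 1) * (j2 + 1), n * n) * cell
                 for j1 in range(n) for j2 in range(n)), Fraction(0))
    return lower, upper

def closed_forms(n):
    """Closed-form expressions for the same lower and upper sums."""
    return (Fraction(1, 4) * (1 - Fraction(1, n)) ** 2,
            Fraction(1, 4) * (1 + Fraction(1, n)) ** 2)

for n in (1, 2, 10, 100):
    lo, up = sums_xy(n)
    print(n, float(lo), float(up))
```

Both sums approach $1/4$ as $n$ grows, squeezing the value of the integral between them.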
In the past, we have seen many examples where special properties of functions, such as continuity, differentiability and integrability, are automatically ‘transferred’ to sums, products and quotients. This fact considerably simplified the decision whether a given function is continuous, differentiable or integrable. In many cases, such a decision is an obvious consequence of the continuity, differentiability or integrability of elementary functions. For this reason, it is natural to ask whether multiples, sums, products and quotients of integrable functions of several variables are integrable as well. Indeed, this is the case for multiples and sums as stated in the theorem below. Within the definition of Riemann-integrability of functions of several variables above, we also defined the volume under the graph of a positive integrable function in terms of its integral. In view of applications, this is reasonable only if that integral is positive. This positivity is a simple consequence of the positivity of the lower sums of such functions.
Theorem 4.4.8. Let $n \in \mathbb{N}$ be such that $n \ge 2$ and $f, g$ be bounded and Riemann-integrable on some closed interval $I$ of $\mathbb{R}^n$ and $a \in \mathbb{R}$. Then $f + g$ and $a f$ are bounded and Riemann-integrable on $I$ and
\[ \int_I (f + g) \, dv = \int_I f \, dv + \int_I g \, dv , \quad \int_I a f \, dv = a \int_I f \, dv . \]
If $f$ is in addition positive, then
\[ \int_I f \, dv \ge 0 . \]
Proof. In the following, we denote by $\mathcal{P}$ the set of all partitions of $I$. First, if $M_1 > 0$ and $M_2 > 0$ are such that $|f(x)| \le M_1$ and $|g(x)| \le M_2$, then
\[ |(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 , \]
\[ |(c f)(x)| = |c f(x)| = |c| \cdot |f(x)| \le |c| M_1 \]
for all $x \in I$ and hence $f + g$ and $c f$ are bounded for every $c \in \mathbb{R}$. Second, it follows for every subinterval $J$ of $I$ that
\[ \inf \{ f(x) : x \in J \} + \inf \{ g(x) : x \in J \} \le f(x) + g(x) = (f + g)(x) , \]
\[ (f + g)(x) = f(x) + g(x) \le \sup \{ f(x) : x \in J \} + \sup \{ g(x) : x \in J \} \]
for all $x \in J$ and hence that
\[ \begin{aligned}
\inf \{ f(x) : x \in J \} + \inf \{ g(x) : x \in J \} &\le \inf \{ (f + g)(x) : x \in J \} \le \sup \{ (f + g)(x) : x \in J \} \\
&\le \sup \{ f(x) : x \in J \} + \sup \{ g(x) : x \in J \} .
\end{aligned} \]
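Hence it follows for every partition $P$ of $I$ that
\[ L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P) . \]
If $\nu \in \mathbb{N}^*$, by refining partitions, we can construct $P_\nu \in \mathcal{P}$ such that
\[ \int_I f \, dv - \frac{1}{2\nu} \le L(f, P_\nu) , \quad \int_I g \, dv - \frac{1}{2\nu} \le L(g, P_\nu) , \]
\[ U(f, P_\nu) \le \int_I f \, dv + \frac{1}{2\nu} , \quad U(g, P_\nu) \le \int_I g \, dv + \frac{1}{2\nu} . \]
Hence
\[ \int_I f \, dv + \int_I g \, dv - \frac{1}{\nu} \le L(f + g, P_\nu) \le U(f + g, P_\nu) \le \int_I f \, dv + \int_I g \, dv + \frac{1}{\nu} \]
and
\[ \begin{aligned}
\int_I f \, dv + \int_I g \, dv - \frac{1}{\nu} &\le \sup \{ L(f + g, P) : P \in \mathcal{P} \} \\
&\le \inf \{ U(f + g, P) : P \in \mathcal{P} \} \le \int_I f \, dv + \int_I g \, dv + \frac{1}{\nu} .
\end{aligned} \]
Since the last is true for every $\nu \in \mathbb{N}^*$, we conclude that
\[ \sup \{ L(f + g, P) : P \in \mathcal{P} \} = \inf \{ U(f + g, P) : P \in \mathcal{P} \} = \int_I f \, dv + \int_I g \, dv . \]
Hence $f + g$ is Riemann-integrable and
\[ \int_I (f + g) \, dv = \int_I f \, dv + \int_I g \, dv . \]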
Further, if $c \ge 0$, it follows for every subinterval $J$ of $I$ that
\[ \inf \{ c f(x) : x \in J \} = c \inf \{ f(x) : x \in J \} , \quad \sup \{ c f(x) : x \in J \} = c \sup \{ f(x) : x \in J \} \]
and hence that
\[ L(c f, P) = c \, L(f, P) , \quad U(c f, P) = c \, U(f, P) \]
for every partition $P$ of $I$. The last implies that
\[ \sup \{ L(c f, P) : P \in \mathcal{P} \} = c \sup \{ L(f, P) : P \in \mathcal{P} \} = c \int_I f \, dv , \]
\[ \inf \{ U(c f, P) : P \in \mathcal{P} \} = c \inf \{ U(f, P) : P \in \mathcal{P} \} = c \int_I f \, dv . \]
If $c \le 0$, it follows for every subinterval $J$ of $I$ that
\[ \inf \{ c f(x) : x \in J \} = c \sup \{ f(x) : x \in J \} , \quad \sup \{ c f(x) : x \in J \} = c \inf \{ f(x) : x \in J \} \]
and hence that
\[ L(c f, P) = c \, U(f, P) , \quad U(c f, P) = c \, L(f, P) \]
for every partition $P$ of $I$. The last implies that
\[ \sup \{ L(c f, P) : P \in \mathcal{P} \} = c \inf \{ U(f, P) : P \in \mathcal{P} \} = c \int_I f \, dv , \]
\[ \inf \{ U(c f, P) : P \in \mathcal{P} \} = c \sup \{ L(f, P) : P \in \mathcal{P} \} = c \int_I f \, dv . \]
Hence it follows in both cases that
\[ \int_I c f \, dv = c \int_I f \, dv . \]
Finally, if $f$ is such that $f(x) \ge 0$ for all $x \in I$, then
\[ \inf \{ f(x) : x \in J \} \ge 0 \]
for all subintervals $J$ of $I$ and hence
\[ L(f, P) \ge 0 \]
for every partition $P$ of $I$. As a consequence,
\[ \int_I f \, dv = \sup \{ L(f, P) : P \in \mathcal{P} \} \ge 0 . \]
The Riemann integral can be viewed as a map into the real numbers with domain given by the set of bounded Riemann-integrable functions over some closed interval $I$ of $\mathbb{R}^n$ where $n \in \mathbb{N}$ is such that $n \ge 2$. According to the previous theorem, that map is ‘linear’, i.e., the integral of the sum of such functions is equal to the sum of their corresponding integrals, and the integral of a scalar multiple of such a function is given by that multiple of the integral of that function. In addition, it is positive, in the sense that it maps such functions which are in addition positive, i.e., which assume only positive ($\ge 0$) values, into a positive real number. It is easy to see that the linearity and positivity of the map imply also its monotony, i.e., if such functions $f$ and $g$ satisfy $f \le g$, defined by $f(x) \le g(x)$ for all $x \in I$, then the integral of $f$ is smaller than or equal to the integral of $g$.
Corollary 4.4.9. (Monotony of the integral) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $f, g$ be bounded and Riemann-integrable on some closed interval $I$ of $\mathbb{R}^n$, and in addition let $f(x) \le g(x)$ for all $x \in I$. Then
\[ \int_I f \, dv \le \int_I g \, dv . \]
Proof. For this, we define the auxiliary function $h : I \to \mathbb{R}$ by $h(x) := g(x) - f(x)$ for all $x \in I$. According to the previous Theorem, $h$ is bounded and Riemann-integrable. Finally, since $f(x) \le g(x)$ for all $x \in I$, it follows that $h(x) \ge 0$ for all $x \in I$. Hence it follows by the linearity and positivity of the integral that
\[ 0 \le \int_I h \, dv = \int_I g \, dv + \int_I [-f] \, dv = \int_I g \, dv - \int_I f \, dv \]
and hence that
\[ \int_I f \, dv \le \int_I g \, dv . \]
We have seen that the integral of every function defined on an interval with
one vanishing side is zero. The values of the function on such an interval
do not affect the value of the integral. The reason behind this behavior is,
of course, the fact that we defined the volume of an interval as the product
of its side lengths. Hence the volume of an interval with one vanishing
side is zero. Such intervals are examples of so called negligible sets which
are similar to zero sets defined in connection with Riemann integration for
functions in one variable. The values assumed by a function on a negligible
set do not influence the value of the integral. The following definition uses
the intuition that such sets should have, in some sense, a vanishing volume.
Definition 4.4.10. (Negligible sets) A subset $K$ of $\mathbb{R}^n$ is said to be negligible if for every $\varepsilon > 0$ there exists a finite number of closed intervals $I_1, \dots, I_\nu$ of $\mathbb{R}^n$ whose union contains $K$ and which is such that
\[ v(I_1) + \dots + v(I_\nu) < \varepsilon . \]
Example 4.4.11. Any interval of $\mathbb{R}^n$ with at least one side of vanishing length is negligible.
Remark 4.4.12. Obviously, negligible subsets are bounded, the closure of a negligible subset is negligible, and finite unions of negligible subsets are also negligible.
The proofs of the remaining theorems in this section use either more advanced knowledge of topological properties of subsets of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, than developed in this course or use the inverse mapping theorem for vector-valued functions in several variables which was not considered in the previous section. For this reason, these proofs will not be given in the following, but can be found in [63].
Intuitively, an interval of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, with at least one side of vanishing length is of ‘lower dimension’ than $n$. Hence, it might be suspected that also other ‘lower dimensional’ subsets of $\mathbb{R}^n$ could be negligible such as parts of curves in $\mathbb{R}^2$ or parts of surfaces in $\mathbb{R}^3$. Indeed, this intuition is correct, if such sets are images of bounded subsets of $\mathbb{R}^m$, where $m \in \mathbb{N}$ is such that $1 \le m < n$, under maps of class $C^1$ as detailed in the following theorem.
Theorem 4.4.13. Let $m, n \in \mathbb{N}^*$, $B$ be a bounded subset of $\mathbb{R}^m$ and $U$ an open subset of $\mathbb{R}^m$ containing $B$. Finally, let $n > m$ and $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., such that each of its component functions is of class $C^1$. Then $f(B)$ is negligible.
Proof. See [63], XX, §2, Proposition 2.2.
Example 4.4.14. Show that
\[ S_r^1(0) = \{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 = r^2 \} , \]
where $r \ge 0$, is a negligible subset of $\mathbb{R}^2$. Solution: For this, we define $f : (-2\pi, 2\pi) \to \mathbb{R}^2$ by
\[ f(t) := (r \cos t, r \sin t) \]
for every $t \in (-2\pi, 2\pi)$. Then $f$ is of class $C^1$ and $\mathrm{Ran}(f) = S_r^1(0)$. Hence according to Theorem 4.4.13, $S_r^1(0)$ is a negligible subset of $\mathbb{R}^2$.
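The negligibility of the circle can also be seen directly from Definition 4.4.10 by an explicit covering. The following sketch is an illustration only and is not the argument behind Theorem 4.4.13; the function name `circle_cover` and the choice of squares are assumptions made here. It places `count` closed squares with centers equally spaced on the circle and side length equal to the arc length $s = 2\pi r / \mathrm{count}$ between neighboring centers. Every point of the circle is within Euclidean distance $s/2$ of some center and therefore lies in the corresponding square, while the total area $4 \pi^2 r^2 / \mathrm{count}$ becomes arbitrarily small.

```python
import math

def circle_cover(r, count):
    """Closed squares covering the circle of radius r, and their total area.

    Consecutive centres are an arc length s = 2*pi*r/count apart and each
    square has side s, so every point of the circle lies in some square.
    """
    side = 2 * math.pi * r / count
    squares = []
    for k in range(count):
        t = 2 * math.pi * k / count
        cx, cy = r * math.cos(t), r * math.sin(t)
        squares.append(((cx - side / 2, cx + side / 2),
                        (cy - side / 2, cy + side / 2)))
    total_area = count * side ** 2  # equals 4*pi^2*r^2 / count
    return squares, total_area

def is_covered(point, squares):
    """Check whether a point lies in at least one of the closed squares."""
    x, y = point
    return any(x0 <= x <= x1 and y0 <= y <= y1 for (x0, x1), (y0, y1) in squares)

squares, total_area = circle_cover(1.0, 1000)
```

For $r = 1$ and $1000$ squares the total area is about $0.04$; increasing `count` pushes it below any prescribed $\varepsilon > 0$.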
So far, we proved existence of the integral only in a few simple cases. The following theorem gives a criterion for the Riemann-integrability of a function which is sufficient for most applications.
Theorem 4.4.15. (Existence of Riemann integrals) Let $n \in \mathbb{N}$ be such that $n \ge 2$.
(i) Let $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$. Moreover, let $f$ be continuous in all points of $I$, except from points of a negligible subset of $I$. Then $f$ is Riemann-integrable on $I$.
(ii) If $g$ is some function on $I$ such that $f(x) = g(x)$ for all $x \in I$, except from points of a negligible subset of $I$, then $g$ is Riemann-integrable on $I$ and
\[ \int_I f \, dv = \int_I g \, dv . \]
Proof. See [63], XX, §1, Theorem 1.3.
Since
\[ |f(x)| = \sqrt{[f(x)]^2} \]
for every $x \in I$, if $f$ is a bounded function on some closed interval $I$ of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, which is continuous in all points of $I$, except from points from a negligible subset of $I$, we conclude by application of the previous theorem that also $|f|$ is bounded and Riemann-integrable. Since
\[ -|f(x)| \le f(x) \le |f(x)| \]
for all $x \in I$, it follows by the monotony of the Riemann integral, Corollary 4.4.9, that
\[ - \int_I |f| \, dv \le \int_I f \, dv \le \int_I |f| \, dv \]
and hence that
\[ \left| \int_I f \, dv \right| \le \int_I |f| \, dv . \]
The last estimate is frequently applied. As a consequence, we proved the following theorem.
Theorem 4.4.16. Let $n \in \mathbb{N}$ be such that $n \ge 2$ and $f$ be bounded on some closed interval $I$ of $\mathbb{R}^n$. Further, let $f$ be continuous in all points of $I$, except from points in a negligible subset of $I$. Then $|f|$ is bounded and Riemann-integrable and
\[ \left| \int_I f \, dv \right| \le \int_I |f| \, dv . \]
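Example 4.4.17. Let $f : \{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 \le r^2 \} \to \mathbb{R}$ be some continuous function where $r \ge 0$. Define $\hat{f} : [-r, r]^2 \to \mathbb{R}$ by
\[ \hat{f}(x, y) := \begin{cases} f(x, y) & \text{for } (x, y) \in D(f) \\ 0 & \text{for } (x, y) \in [-r, r]^2 \setminus D(f) . \end{cases} \]
Then $\hat{f}$ is bounded, since $f$ is continuous on a compact set, and everywhere continuous, except possibly on $S_r^1(0)$ which is according to Example 4.4.14 a negligible subset of $\mathbb{R}^2$. Hence according to Theorem 4.4.15, $\hat{f}$ is Riemann-integrable on $[-r, r]^2$.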
In Example 4.4.7, we have seen that
\[ \int_{[0,1]^2} xy \, dx\,dy = \left( \int_0^1 x \, dx \right) \left( \int_0^1 y \, dy \right) . \]
This result can be obtained by a simple application of the following theorem of Guido Fubini. This theorem is of major importance for the evaluation of integrals in $\mathbb{R}^{n+m}$, where $n, m \in \mathbb{N}^*$, since it reduces that evaluation to the calculation of such integrals in $\mathbb{R}^n$ and $\mathbb{R}^m$. If applicable, by successive application of the theorem, the evaluation of integrals in $\mathbb{R}^n$, $n \in \mathbb{N}^*$ such that $n \geq 2$, can be reduced to the calculation of integrals for functions in one variable. For the evaluation of the latter, the powerful fundamental theorem of calculus is available.
Theorem 4.4.18. (Fubini’s Theorem) Let $m, n \in \mathbb{N}^*$ and $I$, $J$ be closed intervals of $\mathbb{R}^m$ and $\mathbb{R}^n$, respectively. Further, let $f : I \times J \to \mathbb{R}$ be Riemann-integrable on $I \times J$. Finally, let $f(x, \cdot)$ be Riemann-integrable on $J$ for all $x \in I$, except on a negligible subset of $I$. Then the function on $I$ which associates to every $x \in I$ the value
\[ \int_J f(x, y) \, dy \]
is Riemann-integrable on $I$ and
\[ \int_{I \times J} f(x, y) \, dx \, dy = \int_I \left( \int_J f(x, y) \, dy \right) dx . \]
Proof. See [63], XX, §3, Theorem 3.1.
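The statement of Fubini's theorem can be illustrated numerically. The following is a minimal sketch, not part of the original text: it compares a two-dimensional midpoint Riemann sum of $f(x, y) = xy$ over $[0,1]^2$ with the corresponding iterated one-dimensional sums. The helper names `midpoint_1d`, `midpoint_2d` and `iterated` as well as the grid size 200 are assumptions made only for this illustration.

```python
def midpoint_1d(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def midpoint_2d(f, a, b, c, d, n):
    """Midpoint Riemann sum of f over [a, b] x [c, d] on an n-by-n grid."""
    hx = (b - a) / n
    hy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy


def iterated(f, a, b, c, d, n):
    """Iterated sum: integrate in y for fixed x, then integrate in x."""
    def inner(x):
        return midpoint_1d(lambda y: f(x, y), c, d, n)
    return midpoint_1d(inner, a, b, n)


def integrand(x, y):
    return x * y


double_sum = midpoint_2d(integrand, 0.0, 1.0, 0.0, 1.0, 200)
iterated_sum = iterated(integrand, 0.0, 1.0, 0.0, 1.0, 200)
print(double_sum, iterated_sum)  # both approximate 1/4
```

Both values approximate the product of the one-dimensional integrals of $x$ and $y$ over $[0,1]$, that is, $1/4$.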
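Example 4.4.19. Let $r > 0$. Define $f : [-r, r]^2 \to \mathbb{R}$ by
\[ f(x, y) := 1 \]
if $x^2 + y^2 \leq r^2$ and $0$ otherwise. According to Example 4.4.17, $f$ is Riemann-integrable on $[-r, r]^2$, and we conclude by Theorems 4.4.18, 4.4.15 that
\[ \int_{[-r,r]^2} f(x, y) \, dx \, dy = \int_{-r}^{r} \left( \int_{-r}^{r} f(x, y) \, dy \right) dx = \int_{-r}^{r} \left( \int_{-\sqrt{r^2 - x^2}}^{\sqrt{r^2 - x^2}} dy \right) dx = 2 \int_{-r}^{r} \sqrt{r^2 - x^2} \, dx \]
\[ = \left[ x \sqrt{r^2 - x^2} + r^2 \arcsin\left( \frac{x}{r} \right) \right]_{-r}^{r} = \pi r^2 \]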
which is the area of a circular disk of radius $r$. Note that obtaining this result requires only knowledge of the values of $f$ on $B_r^2(0)$. Hence it appears natural to define
\[ \int_{B_r^2(0)} dx \, dy := \int_{[-r,r]^2} f(x, y) \, dx \, dy , \]
since $f$ is the unique extension of the constant function of value $1$ on $B_r^2(0)$ to a function on $[-r, r]^2$ which is constant of value zero on $[-r, r]^2 \setminus B_r^2(0)$. Also it is obvious that if $g$ is an analogous extension of the constant function of value $1$ on $B_r^2(0)$ to some interval $I \supset B_r^2(0)$, it follows that
\[ \int_I g(x, y) \, dx \, dy = \int_{[-r,r]^2} f(x, y) \, dx \, dy \]
as it should be since the symbol
\[ \int_{B_r^2(0)} dx \, dy \]
does not contain any reference to an interval or an extension of the integrand. This suggests the following definition.
Definition 4.4.20. (The Riemann integral, II) Let $n \in \mathbb{N}^*$ be such that $n \geq 2$, $\Omega$ be a bounded subset of $\mathbb{R}^n$ whose boundary is negligible and $f$ be a bounded function on $\Omega$. In addition, let $I \supset \Omega$ be a bounded closed interval, and let $\hat{f} : I \to \mathbb{R}$ defined by
\[ \hat{f}(x) := \begin{cases} f(x) & \text{if } x \in \Omega \\ 0 & \text{if } x \in I \setminus \Omega \end{cases} \]
be Riemann integrable. Then we define
\[ \int_\Omega f \, dv := \int_I \hat{f} \, dv . \]
For the proof that this definition is independent of the interval $I$, we refer to the final part of [63], XX, § 1 on admissible sets and functions. In addition, as a particular case when $f$ is constant of value $1$, we define the $n$-dimensional volume $V$ of $\Omega$ by
\[ V := \int_\Omega dv . \]
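As a rough numerical illustration of the definition (a sketch only, not taken from the text), the volume of a bounded set can be approximated by a midpoint Riemann sum of the extension by zero of the constant function $1$ over an enclosing closed interval. The function name `volume_by_zero_extension`, the grid size and the two enclosing squares are hypothetical choices; the two results indicate that the approximate value does not depend on the chosen interval, up to discretization error.

```python
import itertools
import math


def volume_by_zero_extension(indicator, lower, upper, n):
    """Midpoint Riemann sum over the closed interval with corners
    `lower` and `upper` of the function that is 1 on Omega and 0 elsewhere.

    indicator(point) returns True if point lies in Omega.
    """
    dim = len(lower)
    steps = [(upper[k] - lower[k]) / n for k in range(dim)]
    cell_volume = math.prod(steps)
    hits = 0
    for index in itertools.product(range(n), repeat=dim):
        point = [lower[k] + (index[k] + 0.5) * steps[k] for k in range(dim)]
        if indicator(point):
            hits += 1
    return hits * cell_volume


r = 1.0


def in_disk(p):
    return p[0] ** 2 + p[1] ** 2 <= r ** 2


# Two different enclosing intervals I for the same set Omega.
area_small = volume_by_zero_extension(in_disk, [-r, -r], [r, r], 400)
area_large = volume_by_zero_extension(in_disk, [-2 * r, -2 * r], [2 * r, 2 * r], 400)
print(area_small, area_large, math.pi * r ** 2)
```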
Fig. 177: Area under a graph of a function $f$. See Example 4.4.21.
The following two examples give further applications of Fubini’s theorem. In particular, they indicate that the earlier definitions of area / volume under the graph of functions are consistent with Definition 4.4.20 of the $n$-dimensional volume of subsets in $\mathbb{R}^n$, where $n \in \mathbb{N}^*$ is such that $n \geq 2$.
Example 4.4.21. In the following, we calculate
\[ \int_\Omega dx\,dy \]
where $\Omega \subset \mathbb{R}^2$ is the region under the graph of a function $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only positive ($\geq 0$) values, i.e., $\Omega$ is given by
\[ \Omega := \{(x, y) \in \mathbb{R}^2 : a \leq x \leq b \wedge 0 \leq y \leq f(x)\} , \]
see Fig 177. In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open interval $I$ of $\mathbb{R}$ containing $[a, b]$. As a consequence, the graph of $f$ is part of the image of the map $h : I \to \mathbb{R}^2$ of class $C^1$ defined by
\[ h(x) := (x, \hat{f}(x)) \]
for every $x \in I$ and hence is negligible. From this, we conclude that the boundary of $\Omega$, given by
\[ \bigcup_{i=1}^{4} B_i \]
where
\[ B_1 := [a, b] \times \{0\} , \quad B_2 := \{b\} \times [0, f(b)] , \quad B_3 := G(f) , \quad B_4 := \{a\} \times [0, f(a)] , \]
is a negligible set. Hence it follows by Fubini’s theorem, Theorem 4.4.18, that
\[ \int_\Omega dx\,dy = \int_a^b \left( \int_0^{f(x)} dy \right) dx = \int_a^b f(x) \, dx . \]
Hence the value of
\[ \int_\Omega dx\,dy \]
coincides with the area under the graph of $f$ as defined in Calculus I.
Example 4.4.22. In the following, we calculate
\[ \int_\Omega dx\,dy\,dz \]
where $\Omega \subset \mathbb{R}^3$ is the region under the graph of a function $f : [a, b] \times [c, d] \to \mathbb{R}$, where $a, b, c, d \in \mathbb{R}$ are such that $a < b$ and $c < d$, that assumes only positive ($\geq 0$) values, i.e., $\Omega$ is given by
\[ \Omega := \{(x, y, z) \in \mathbb{R}^3 : a \leq x \leq b \wedge c \leq y \leq d \wedge 0 \leq z \leq f(x, y)\} , \]
see Fig 178. In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open set $U$ of $\mathbb{R}^2$ containing $[a, b] \times [c, d]$. As a consequence, the graph of $f$ is part of the image of the map $h : U \to \mathbb{R}^3$ of class $C^1$ defined by
\[ h(x, y) := (x, y, \hat{f}(x, y)) \]
for all $(x, y) \in U$ and hence is negligible. From this, we conclude that the boundary of $\Omega$, given by
\[ \bigcup_{i=1}^{6} B_i , \]
where
\[ B_1 := [a, b] \times [c, d] \times \{0\} , \quad B_2 := G(f) , \]
\[ B_3 := \{(x, c, \lambda f(x, c)) : (x, \lambda) \in [a, b] \times [0, 1]\} , \]
\[ B_4 := \{(x, d, \lambda f(x, d)) : (x, \lambda) \in [a, b] \times [0, 1]\} , \]
\[ B_5 := \{(a, y, \lambda f(a, y)) : (y, \lambda) \in [c, d] \times [0, 1]\} , \]
\[ B_6 := \{(b, y, \lambda f(b, y)) : (y, \lambda) \in [c, d] \times [0, 1]\} , \]
is a negligible set. For this, note that for $x_0 \in [a, b]$, $y_0 \in [c, d]$ also the maps that associate to every $(x, \lambda)$ the value
\[ (x, y_0, \lambda \hat{f}(x, y_0)) , \]
and to every $(y, \lambda)$ the value
\[ (x_0, y, \lambda \hat{f}(x_0, y)) \]
are defined as well as of class $C^1$ on open subsets of $\mathbb{R}^2$ containing $[a, b] \times [0, 1]$ and $[c, d] \times [0, 1]$, respectively. Therefore also $B_3$, $B_4$, $B_5$ and $B_6$ are negligible. Hence it follows by Fubini’s theorem, Theorem 4.4.18, that
\[ \int_\Omega dx\,dy\,dz = \int_{[a,b] \times [c,d]} \left( \int_0^{f(x,y)} dz \right) dx\,dy = \int_{[a,b] \times [c,d]} f(x, y) \, dx\,dy . \]
Hence the value of
\[ \int_\Omega dx\,dy\,dz \]
coincides with the volume under the graph of $f$ as defined in Definition 4.4.5.

Fig. 178: Volume under a graph of a function $f$. See Example 4.4.22.
Often in applications, the integrand of an integral in $\mathbb{R}^n$, where $n \in \mathbb{N}^*$ is such that $n \geq 2$, has a certain symmetry. In such cases, integration by change of variables is often useful. The following theorem will also play a major role in the subsequent section on generalizations of the fundamental theorem of calculus to integrals in $\mathbb{R}^n$.
Theorem 4.4.23. (Change of variable formula) Let $n \in \mathbb{N}^*$ be such that $n \geq 2$ and $I$ be a closed interval of $\mathbb{R}^n$ contained in some open subset $U$. Moreover, let $g : U \to \mathbb{R}^n$ be continuously differentiable with a continuously differentiable inverse. Finally, let $f$ be a Riemann-integrable function over $g(I)$. Then
\[ \int_{g(I)} f \, dv = \int_I (f \circ g) \, |\det g'| \, dv \]
where $\det g' : U \to \mathbb{R}$ is defined by
\[ (\det g')(x) := \det \left( \frac{\partial g_i}{\partial x_j}(x) \right)_{i,j = 1, \dots, n} \]
for all $x \in U$.
Proof. See [63], XX, §4, Corollary 4.6.
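The following sketch, which is not part of the original text, checks the change of variable formula numerically in one simple case. All concrete choices are assumptions made for illustration: the linear map $g(u, v) = (2u + v, u + 3v)$ with $\det g' = 5$, the interval $I = [0, 1]^2$, the integrand $f(x, y) = xy$, and midpoint Riemann sums with hypothetical grid sizes. The left-hand side is approximated by summing the extension by zero of $f$ over the enclosing interval $[0, 3] \times [0, 4]$ of the parallelogram $g(I)$.

```python
def g(u, v):
    """Illustrative linear map; its Jacobian determinant is 2*3 - 1*1 = 5."""
    return (2.0 * u + v, u + 3.0 * v)


def g_inverse(x, y):
    """Inverse of g, obtained by solving x = 2u + v, y = u + 3v."""
    return ((3.0 * x - y) / 5.0, (2.0 * y - x) / 5.0)


def f(x, y):
    return x * y


def right_hand_side(n):
    """Midpoint sum of (f o g) * |det g'| over I = [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = g((i + 0.5) * h, (j + 0.5) * h)
            total += f(x, y) * 5.0
    return total * h * h


def left_hand_side(n):
    """Midpoint sum over [0, 3] x [0, 4] of f extended by zero outside g(I)."""
    hx, hy = 3.0 / n, 4.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            u, v = g_inverse(x, y)
            if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
                total += f(x, y)
    return total * hx * hy


rhs = right_hand_side(200)
lhs = left_hand_side(600)
print(lhs, rhs)  # both approximate 205/12
```

The exact common value is $5 \int_{[0,1]^2} (2u + v)(u + 3v) \, du \, dv = 205/12$; the left-hand sum converges more slowly because the indicator of $g(I)$ is discontinuous along its boundary.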
The following is a typical application of change of variables.
Example 4.4.24. Show that
\[ \int_{B_R^2(0)} x \, dx\,dy = \int_{B_R^2(0)} y \, dx\,dy = 0 \qquad (4.4.5) \]
for every $R > 0$. Solution: For this, let $R > 0$. We define $g_1 : \mathbb{R}^2 \to \mathbb{R}^2$ and $g_2 : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g_1(x, y) := (-x, y) , \quad g_2(x, y) := (x, -y) \]
for all $(x, y) \in \mathbb{R}^2$. The maps $g_1$, $g_2$ are continuously differentiable with inverse $g_1$ and $g_2$, respectively. In particular,
\[ g_1(B_R^2(0)) = g_2(B_R^2(0)) = B_R^2(0) \]
and
\[ \det g_1' = \det g_2' = -1 . \]
Since $f_1 : B_R^2(0) \to \mathbb{R}$ and $f_2 : B_R^2(0) \to \mathbb{R}$, defined by $f_1(x, y) := x$ and $f_2(x, y) := y$, respectively, for every $(x, y) \in B_R^2(0)$, are continuous, it follows by Example 4.4.17 and change of variables that
\[ \int_{B_R^2(0)} x \, dx\,dy = \int_{g_1(B_R^2(0))} x \, dx\,dy = \int_{B_R^2(0)} (-x) \, dx\,dy = - \int_{B_R^2(0)} x \, dx\,dy , \]
\[ \int_{B_R^2(0)} y \, dx\,dy = \int_{g_2(B_R^2(0))} y \, dx\,dy = \int_{B_R^2(0)} (-y) \, dx\,dy = - \int_{B_R^2(0)} y \, dx\,dy \]
and hence (4.4.5).
Also the application of change of variables in the following example is
typical.
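Example 4.4.25. (A basic oscillatory integral) Let $k \in \mathbb{R}$. Show that
\[ \int_0^\infty \frac{\sin(kx)}{x} \, dx = \begin{cases} -\frac{\pi}{2} & \text{if } k < 0 \\ 0 & \text{if } k = 0 \\ \frac{\pi}{2} & \text{if } k > 0 \end{cases} \qquad (4.4.6) \]
and that
\[ \left| \int_0^R \frac{\sin(kx)}{x} \, dx \right| \leq \frac{\pi}{2} + 1 \qquad (4.4.7) \]
for every $R \geq 0$. Solution: For this, let $R > 0$. Then it follows by application of the fundamental theorem of calculus that
\[ \int_0^R \frac{\sin(x)}{x} \, dx = - \int_0^R \left\{ \frac{1}{x} e^{-x \sin(\pi/2)} \sin(x \cos(\pi/2)) - \frac{1}{x} e^{-x \sin(0)} \sin(x \cos(0)) \right\} dx = \int_0^R \left\{ \int_0^{\pi/2} h(x, \theta) \, d\theta \right\} dx \]
where $h : [0, R] \times [0, \pi/2] \to \mathbb{R}$ is defined by
\[ h(x, \theta) := e^{-x \sin\theta} \sin(x \cos\theta) \cos\theta + \sin\theta \, e^{-x \sin\theta} \cos(x \cos\theta) \]
for all $(x, \theta) \in [0, R] \times [0, \pi/2]$. Since $h$ is continuous, $h$ is Riemann-integrable. Therefore, it follows by Fubini’s theorem that
\[ \int_0^R \left\{ \int_0^{\pi/2} h(x, \theta) \, d\theta \right\} dx = \int_{[0,R] \times [0,\pi/2]} h(x, \theta) \, dx \, d\theta . \]
In addition, $g : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ g(\theta, x) := (x, \theta) \]
for all $(\theta, x) \in \mathbb{R}^2$ is bijective and continuously differentiable. Since $g^{-1} = g$, the inverse of $g$ is continuously differentiable, too. Further,
\[ g'(\theta, x) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
and
\[ |\det(g'(\theta, x))| = 1 \]
for all $(\theta, x) \in \mathbb{R}^2$. Hence it follows by change of variables and by Fubini’s theorem that
\[ \int_{[0,R] \times [0,\pi/2]} h(x, \theta) \, dx \, d\theta = \int_{[0,\pi/2] \times [0,R]} h(x, \theta) \, d\theta \, dx \]
\[ = \int_0^{\pi/2} \left\{ \int_0^R \left[ e^{-x \sin\theta} \sin(x \cos\theta) \cos\theta + \sin\theta \, e^{-x \sin\theta} \cos(x \cos\theta) \right] dx \right\} d\theta \]
\[ = \int_0^{\pi/2} \left[ - e^{-R \sin\theta} \cos(R \cos\theta) + 1 \right] d\theta = \frac{\pi}{2} - \int_0^{\pi/2} e^{-R \sin\theta} \cos(R \cos\theta) \, d\theta . \]
Since
\[ \left| \int_0^{\pi/2} e^{-R \sin\theta} \cos(R \cos\theta) \, d\theta \right| \leq \int_0^{\pi/2} e^{-2R\theta/\pi} \, d\theta = \frac{\pi}{2R} \left( 1 - e^{-R} \right) \leq \frac{\pi}{2R} , \]
we conclude that
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx \right| \leq \frac{\pi}{2} \left( 1 + \frac{1}{R} \right) \]
and that
\[ \int_0^\infty \frac{\sin(x)}{x} \, dx = \frac{\pi}{2} . \]
Further since
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx \right| \leq \int_0^R \left| \frac{\sin(x)}{x} \right| dx \leq R , \]
we arrive at
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx \right| \leq \frac{\pi}{2} + 1 . \]
From the previous results, we conclude (4.4.6) and (4.4.7) as follows. First, we note that (4.4.6) and (4.4.7) are trivially satisfied if $k = 0$. For $k > 0$ and $R > 0$ it follows by change of variables that
\[ \int_0^R \frac{\sin(kx)}{x} \, dx = \int_0^{kR} \frac{\sin(y)}{y} \, dy \]
and hence (4.4.6) and (4.4.7). Finally for $k < 0$ and $R > 0$, it follows, since the sine function is odd, that
\[ \int_0^R \frac{\sin(kx)}{x} \, dx = - \int_0^R \frac{\sin(|k|x)}{x} \, dx \]
and hence also in this case the validity of (4.4.6) and (4.4.7).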
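The estimates obtained in the preceding example can be checked numerically. The sketch below is an illustration only: the function name `sinc_integral`, the use of the composite Simpson rule and the sample values of $R$ are assumptions, not part of the text. It tests the bound $\left| \int_0^R \sin(x)/x \, dx - \pi/2 \right| \leq \pi/(2R)$, which follows from the representation derived above, together with (4.4.7) for $k = 1$.

```python
import math


def sinc_integral(R, n=20000):
    """Composite Simpson approximation of the integral of sin(x)/x over [0, R].

    n must be even; the integrand is continued by its limit 1 at x = 0.
    """
    def integrand(x):
        return 1.0 if x == 0.0 else math.sin(x) / x

    h = R / n
    total = integrand(0.0) + integrand(R)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


for R in (1.0, 5.0, 20.0, 100.0):
    value = sinc_integral(R)
    distance = abs(value - math.pi / 2.0)
    # distance to pi/2 versus the bound pi/(2R) derived in the example
    print(R, value, distance, math.pi / (2.0 * R))
```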
For the application of the change of variable formula, transformations $g$ are needed that have a differentiable inverse. For this reason, we need to exclude certain sets from the domains of polar, cylindrical and spherical coordinate transformations that were included in those definitions given in Calculus II. Usually, this does not restrict their usefulness in an essential way since those sets are negligible sets.
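Example 4.4.26. (Polar coordinates) Define $g : (0, \infty) \times (-\pi, \pi) \to \mathbb{R}^2$ by
\[ g(r, \varphi) := (r \cos\varphi, r \sin\varphi) \]
for all $(r, \varphi) \in (0, \infty) \times (-\pi, \pi)$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\}) \]
and a continuously differentiable inverse $g^{-1} : \mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\}) \to \mathbb{R}^2$ given by
\[ g^{-1}(x, y) = \begin{cases} \left( \sqrt{x^2 + y^2} , \arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \geq 0 \\ \left( \sqrt{x^2 + y^2} , -\arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $(x, y) \in \mathrm{Ran}(g) = \mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\})$. In particular,
\[ (\det g')(r, \varphi) := \begin{vmatrix} \cos\varphi & -r \sin\varphi \\ \sin\varphi & r \cos\varphi \end{vmatrix} = r \]
for all $(r, \varphi) \in (0, \infty) \times (-\pi, \pi)$.

Fig. 179: Domain of integration $\Omega$ in Example 4.4.27.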
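Example 4.4.27. Let $\varepsilon, R \in \mathbb{R}$ be such that $\varepsilon < R$. Calculate
\[ \int_\Omega \frac{1}{4\pi} \ln(x^2 + y^2) \, dx\,dy \]
where
\[ \Omega := \{(x, y) \in \mathbb{R}^2 : x \geq 0 \wedge \varepsilon^2 \leq x^2 + y^2 \leq R^2\} . \]
Solution: First, we note that $\Omega$ is bounded with a negligible boundary since the latter is given by the union of the negligible sets $\{0\} \times [\varepsilon, R]$, $\{0\} \times [-R, -\varepsilon]$ and subsets of the negligible sets $S_\varepsilon^1(0)$ and $S_R^1(0)$. Further, $f : \Omega \to \mathbb{R}$ defined by
\[ f(x, y) := \frac{1}{4\pi} \ln(x^2 + y^2) \]
for every $(x, y) \in \Omega$ is continuous. Hence, we conclude that $f$ is Riemann-integrable, and it follows by use of polar coordinates and application of Theorems 4.4.23, 4.4.18 that
\[ \int_\Omega \frac{1}{4\pi} \ln(x^2 + y^2) \, dx\,dy = \frac{1}{2\pi} \int_{[\varepsilon, R] \times [-\pi/2, \pi/2]} r \ln(r) \, dr\,d\varphi = \frac{1}{2} \int_\varepsilon^R r \ln(r) \, dr \]
\[ = \left[ \frac{r^2}{4} \left( \ln(r) - \frac{1}{2} \right) \right]_\varepsilon^R = \frac{R^2}{4} \left( \ln(R) - \frac{1}{2} \right) - \frac{\varepsilon^2}{4} \left( \ln(\varepsilon) - \frac{1}{2} \right) . \]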
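As a plausibility check of the value obtained by polar coordinates, the following sketch (an illustration under stated assumptions, not part of the text) compares the closed form with a Cartesian midpoint Riemann sum of the integrand extended by zero to the interval $[0, R] \times [-R, R]$. The function names, the grid size and the sample values $\varepsilon = 1$, $R = 2$ are hypothetical choices.

```python
import math


def closed_form(eps, R):
    """Value of the integral as obtained in the example via polar coordinates."""
    return (R ** 2 / 4.0) * (math.log(R) - 0.5) - (eps ** 2 / 4.0) * (math.log(eps) - 0.5)


def cartesian_sum(eps, R, n=400):
    """Midpoint sum over [0, R] x [-R, R] of the integrand extended by zero
    outside the half annulus Omega."""
    hx = R / n
    hy = 2.0 * R / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = -R + (j + 0.5) * hy
            s = x * x + y * y
            if eps ** 2 <= s <= R ** 2:
                total += math.log(s) / (4.0 * math.pi)
    return total * hx * hy


print(closed_form(1.0, 2.0), cartesian_sum(1.0, 2.0))
```

For $\varepsilon = 1$ and $R = 2$ the closed form reduces to $\ln 2 - 3/8$.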
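Example 4.4.28. (Cylindrical coordinates) Define $g : (0, \infty) \times (-\pi, \pi) \times \mathbb{R} \to \mathbb{R}^3$ by
\[ g(r, \varphi, z) := (r \cos\varphi, r \sin\varphi, z) \]
for all $(r, \varphi, z) \in (0, \infty) \times (-\pi, \pi) \times \mathbb{R}$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}) \]
and a continuously differentiable inverse
\[ g^{-1} : \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}) \to \mathbb{R}^3 \]
given by
\[ g^{-1}(x, y, z) = \begin{cases} \left( \sqrt{x^2 + y^2} , \arccos\left( x / \sqrt{x^2 + y^2} \right) , z \right) & \text{if } y \geq 0 \\ \left( \sqrt{x^2 + y^2} , -\arccos\left( x / \sqrt{x^2 + y^2} \right) , z \right) & \text{if } y < 0 \end{cases} \]
for all $(x, y, z) \in \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R})$. In particular,
\[ (\det g')(r, \varphi, z) := \begin{vmatrix} \cos\varphi & -r \sin\varphi & 0 \\ \sin\varphi & r \cos\varphi & 0 \\ 0 & 0 & 1 \end{vmatrix} = r \]
for all $(r, \varphi, z) \in (0, \infty) \times (-\pi, \pi) \times \mathbb{R}$.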
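Example 4.4.29. (Spherical coordinates) Define $g : (0, \infty) \times (0, \pi) \times (-\pi, \pi) \to \mathbb{R}^3$ by
\[ g(r, \theta, \varphi) := (r \sin\theta \cos\varphi, r \sin\theta \sin\varphi, r \cos\theta) \]
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi) \times (-\pi, \pi)$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}) \]
and a continuously differentiable inverse $g^{-1} : \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}) \to \mathbb{R}^3$ given by
\[ g^{-1}(\mathbf{r}) = \begin{cases} \left( |\mathbf{r}| , \arccos(z / |\mathbf{r}|) , \arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \geq 0 \\ \left( |\mathbf{r}| , \arccos(z / |\mathbf{r}|) , -\arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $\mathbf{r} = (x, y, z) \in \mathbb{R}^3 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R})$. In particular,
\[ (\det g')(r, \theta, \varphi) := \begin{vmatrix} \sin\theta \cos\varphi & r \cos\theta \cos\varphi & -r \sin\theta \sin\varphi \\ \sin\theta \sin\varphi & r \cos\theta \sin\varphi & r \sin\theta \cos\varphi \\ \cos\theta & -r \sin\theta & 0 \end{vmatrix} = r^2 \sin\theta \]
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi) \times (-\pi, \pi)$.

Fig. 180: For $a = 0$, $b = 1$ and $f(z) := z$ for all $z \in [0, 1]$, the volume of revolution $S$ from Example 4.4.30 is a solid cone of height 1.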
In the following, we give some typical applications of integration of functions in several variables in the calculation of volumes of solid bodies, mechanics and probability theory.
Example 4.4.30. (Volume of a solid of revolution) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a, b] \to [0, \infty)$ be a continuous function whose restriction to $(a, b)$ is continuously differentiable and
\[ S := \left\{ (x, y, z) \in \mathbb{R}^3 : 0 \leq (x^2 + y^2)^{1/2} \leq f(z) \wedge z \in [a, b] \right\} . \]
Note that $S$ is rotationally symmetric around the $z$-axis and can be thought of as obtained from a region in the $x,z$-plane that is rotated around the $z$-axis. The volume $V$ of $S$ is given by
\[ V = \pi \int_a^b f^2(z) \, dz . \]
This can be proved as follows. For this, we define $\rho : S \to \mathbb{R}$ by $\rho(x) := 1$ for all $x \in S$. As a constant map, $\rho$ is continuous. We notice that $\partial S$ is given by the union of
\[ A_1 := \{(x, y, z) \in \mathbb{R}^3 : a < z < b \wedge x^2 + y^2 - f^2(z) = 0\} , \]
\[ A_2 := \{(x, y, a) \in \mathbb{R}^3 : x^2 + y^2 \leq f^2(a)\} , \]
\[ A_3 := \{(x, y, b) \in \mathbb{R}^3 : x^2 + y^2 \leq f^2(b)\} . \]
Further, $A_1$ is the image of the map $f_1 : (-2\pi, 2\pi) \times (a, b) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_1(\varphi, z) := (f(z) \cos\varphi, f(z) \sin\varphi, z) \]
for every $(\varphi, z) \in (-2\pi, 2\pi) \times (a, b)$, $A_2$ is a subset of the image of the map $f_2 : (-1, 1 + f(a)) \times (-2\pi, 2\pi) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_2(r, \varphi) := (r \cos\varphi, r \sin\varphi, a) \]
for every $(r, \varphi) \in (-1, 1 + f(a)) \times (-2\pi, 2\pi)$ and $A_3$ is a subset of the image of the map $f_3 : (-1, 1 + f(b)) \times (-2\pi, 2\pi) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_3(r, \varphi) := (r \cos\varphi, r \sin\varphi, b) \]
for every $(r, \varphi) \in (-1, 1 + f(b)) \times (-2\pi, 2\pi)$. Hence it follows that $\partial S$ is negligible. Further, if $I$ is some closed interval such that $I \supset S$, it follows that
\[ \hat{\rho}(x) := \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{if } x \in I \setminus S \end{cases} \]
is continuous, except in points from a negligible subset of $\mathbb{R}^3$. Therefore $\hat{\rho}$ is Riemann-integrable. Hence we can apply the Theorem of Fubini to conclude that
\[ V = \int_S dx\,dy\,dz = \int_a^b \left( \int_{B_{f(z)}^2(0)} dx\,dy \right) dz = \pi \int_a^b f^2(z) \, dz . \qquad (4.4.8) \]
As described in the introduction to this section, Archimedes showed that the volume $V$ of a paraboloidal solid of revolution inscribed in a circular cylinder $C$ with radius $r$ and height $h$ is one half of the volume $V_C$ of $C$. This result follows also from (4.4.8). In this, $a = 0$, $b = h$ and
\[ f(z) = r \sqrt{1 - \frac{z}{h}} \]
for every $z \in [0, h]$. Hence
\[ V = \pi r^2 \int_0^h \left( 1 - \frac{z}{h} \right) dz = \pi r^2 \left[ z - \frac{z^2}{2h} \right]_0^h = \frac{1}{2} \pi r^2 h = \frac{1}{2} V_C . \]
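Formula (4.4.8) lends itself to a direct numerical check. The sketch below is an illustration only: the function name `revolution_volume`, the midpoint rule and the sample values $r = 2$, $h = 3$ are assumptions. It evaluates $\pi \int_a^b f^2(z) \, dz$ for the paraboloidal profile of Archimedes and, as a second case, for the profile $f(z) = \sqrt{1 - z^2}$ of a unit ball, whose volume is $4\pi/3$.

```python
import math


def revolution_volume(f, a, b, n=10000):
    """Midpoint approximation of pi * integral of f(z)^2 over [a, b]."""
    h = (b - a) / n
    return math.pi * h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))


r, height = 2.0, 3.0


def paraboloid_profile(z):
    return r * math.sqrt(1.0 - z / height)


def unit_ball_profile(z):
    return math.sqrt(1.0 - z * z)


paraboloid = revolution_volume(paraboloid_profile, 0.0, height)
ball = revolution_volume(unit_ball_profile, -1.0, 1.0)
# paraboloid versus half the enclosing cylinder, ball versus 4*pi/3
print(paraboloid, 0.5 * math.pi * r ** 2 * height)
print(ball, 4.0 * math.pi / 3.0)
```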
Example 4.4.31. Calculate the volume $V_S$ of a solid sphere of radius $r > 0$ and the volume $V_C$ of a circular cylinder of radius $r$ and height $h > 0$. Solution: With $a = -r$, $b = r$,
\[ f(z) := \sqrt{r^2 - z^2} \]
for every $z \in [-r, r]$, it follows from (4.4.8) that
\[ V_S = \pi \int_{-r}^{r} (r^2 - z^2) \, dz = \pi \left[ r^2 z - \frac{z^3}{3} \right]_{-r}^{r} = \frac{4\pi}{3} r^3 . \]
Finally, with $a = 0$, $b = h$,
\[ f(z) := r \]
for every $z \in [0, h]$, it follows from (4.4.8) that
\[ V_C = \pi \int_0^h r^2 \, dz = \pi r^2 h . \]
Since the solid sphere of radius $r > 0$ can be inscribed in a circular cylinder of radius $r$ and height $2r$, we conclude that
\[ V_S = \frac{4\pi}{3} r^3 = \frac{2}{3} V_r \]
where $V_r$ denotes the volume of that cylinder. Also this result was derived by Archimedes in ‘On the sphere and cylinder’. He required that on his tombstone be carved a sphere inscribed in a right circular cylinder whose height equals its diameter. After Archimedes’ death, Cicero restored his tomb with this inscription, see Fig 181.

Fig. 181: Solid sphere of radius $r > 0$ inscribed in a right circular cylinder whose height equals its diameter, see Example 4.4.31.
Example 4.4.32. Calculate the volume $V$ of the solid ellipsoid
\[ V_E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \leq 1 \right\} \]
where $a, b, c > 0$. Solution: First, it follows by Example 3.5.45 that the boundary of $V_E$, i.e., the ellipsoid
\[ E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} , \]
is a negligible set. Hence it follows the existence of
\[ \int_{V_E} dx\,dy\,dz . \]
Further, we note that
\[ V_E = g(U_1(0)) \]
where the scale function $g : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by
\[ g(x, y, z) := (ax, by, cz) \]
for all $(x, y, z) \in \mathbb{R}^3$. In particular, $g$ is continuously differentiable with a continuously differentiable inverse $g^{-1} : \mathbb{R}^3 \to \mathbb{R}^3$ given by
\[ g^{-1}(x, y, z) := (x/a , y/b , z/c) \]
for all $(x, y, z) \in \mathbb{R}^3$ and
\[ \det(g'(x, y, z)) = abc \]
for all $(x, y, z) \in \mathbb{R}^3$. Hence, we conclude by change of variables and the previous example that
\[ V = \int_{V_E} dx\,dy\,dz = abc \int_{U_1(0)} dx\,dy\,dz = \frac{4\pi}{3} abc . \]
Example 4.4.33. (Total mass, center of mass and inertia tensor of a mass distribution) If $\rho : V \to [0, \infty)$ is the mass distribution (mass density) of a solid body occupying the region $V$ in $\mathbb{R}^3$, its total mass $M$, center of mass $r_C$ and inertia tensor $(I_{ij})_{i,j \in \{1,2,3\}}$ are defined by:
\[ M := \int_V \rho \, dx\,dy\,dz , \]
\[ r_C := \left( \frac{1}{M} \int_V x \rho \, dx\,dy\,dz , \ \frac{1}{M} \int_V y \rho \, dx\,dy\,dz , \ \frac{1}{M} \int_V z \rho \, dx\,dy\,dz \right) , \]
\[ I_{11} := \int_V (y^2 + z^2) \rho \, dx\,dy\,dz , \quad I_{22} := \int_V (x^2 + z^2) \rho \, dx\,dy\,dz , \quad I_{33} := \int_V (x^2 + y^2) \rho \, dx\,dy\,dz \]
and
\[ I_{ij} := - \int_V x_i x_j \rho \, dx\,dy\,dz \]
if $i \neq j$, if existent. In the integrands, $x, y, z$, $x_1 := x$, $x_2 := y$, $x_3 := z$ denote the coordinate projections of $\mathbb{R}^3$.
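Example 4.4.34. (Center of mass of a cylindrical rod) Calculate the center of mass of the rod
\[ R := \{ (x, y, z) : x^2 + y^2 \leq r^2 \wedge z \in [0, h] \} \]
of radius $r > 0$ and height $h > 0$ for the mass distribution $\rho : R \to [0, \infty)$ defined by
\[ \rho(x, y, z) := \rho_0 + \frac{\rho_1 - \rho_0}{h} \, z \]
for all $(x, y, z) \in R$, where $\rho_0, \rho_1 \geq 0$. Solution: According to Example 4.4.30, $\partial R$ is negligible. Further, if $I$ is some closed interval such that $I \supset R$, it follows that
\[ \hat{\rho}(x, y, z) := \begin{cases} \rho(x, y, z) & \text{if } (x, y, z) \in R \\ 0 & \text{if } (x, y, z) \in I \setminus R \end{cases} \]
is continuous, except possibly in points of a negligible subset of $\mathbb{R}^3$. Therefore $\hat{\rho}$ is Riemann-integrable.

Fig. 182: Cylindrical rod $R$ of radius $r$ and height $h$ from Example 4.4.34.

Hence we can apply the Theorem of Fubini to conclude that the mass $M$ and the center of mass $r_C = (x_C, y_C, z_C)$ of the rod are given by
\[ M = \int_R \rho(x, y, z) \, dx\,dy\,dz = \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h} z \right) \left( \int_{B_r^2(0)} dx\,dy \right) dz = \pi r^2 \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h} z \right) dz = \frac{\rho_0 + \rho_1}{2} \, \pi r^2 h , \]
\[ x_C = \frac{1}{M} \int_R x \rho(x, y, z) \, dx\,dy\,dz = \frac{1}{M} \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h} z \right) \left( \int_{B_r^2(0)} x \, dx\,dy \right) dz = 0 , \]
\[ y_C = \frac{1}{M} \int_R y \rho(x, y, z) \, dx\,dy\,dz = \frac{1}{M} \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h} z \right) \left( \int_{B_r^2(0)} y \, dx\,dy \right) dz = 0 , \]
\[ z_C = \frac{1}{M} \int_R z \rho(x, y, z) \, dx\,dy\,dz = \frac{1}{M} \int_0^h \left( \rho_0 z + \frac{\rho_1 - \rho_0}{h} z^2 \right) \left( \int_{B_r^2(0)} dx\,dy \right) dz \]
\[ = \frac{\pi r^2}{M} \int_0^h \left( \rho_0 z + \frac{\rho_1 - \rho_0}{h} z^2 \right) dz = \frac{\pi r^2}{M} \cdot \frac{\rho_0 + 2\rho_1}{6} \, h^2 = \frac{1}{3} \, \frac{\rho_0 + 2\rho_1}{\rho_0 + \rho_1} \, h . \]
Note that
\[ 0 \leq z_C = \frac{1}{3} \, \frac{\rho_0 + 2\rho_1}{\rho_0 + \rho_1} \, h \leq \frac{1}{3} \, \frac{3\rho_0 + 3\rho_1}{\rho_0 + \rho_1} \, h = h \]
and hence that the center of mass lies inside the rod.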
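The closed forms for $M$ and $z_C$ can be cross-checked numerically. The sketch below is an illustration only and rests on assumptions: after the reduction by Fubini's theorem used in the example, the remaining one-dimensional integrals over $[0, h]$ are approximated by midpoint sums; the function name `rod_mass_and_zc` and the sample values of $\rho_0$, $\rho_1$, $r$, $h$ are hypothetical.

```python
import math


def rod_mass_and_zc(rho0, rho1, r, h, n=2000):
    """Midpoint sums for the mass M and the z-coordinate z_C of the center
    of mass of the rod with density rho0 + (rho1 - rho0) * z / h."""
    dz = h / n
    cross_section = math.pi * r ** 2  # area of the disk B_r^2(0)
    mass = 0.0
    moment = 0.0
    for i in range(n):
        z = (i + 0.5) * dz
        density = rho0 + (rho1 - rho0) / h * z
        mass += density * cross_section * dz
        moment += z * density * cross_section * dz
    return mass, moment / mass


rho0, rho1, r, h = 1.0, 3.0, 0.5, 2.0
mass, z_c = rod_mass_and_zc(rho0, rho1, r, h)
print(mass, 0.5 * (rho0 + rho1) * math.pi * r ** 2 * h)
print(z_c, (rho0 + 2.0 * rho1) / (3.0 * (rho0 + rho1)) * h)
```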
Example 4.4.35. (Probability theory) A function $\rho : \Omega \to [0, \infty)$ from a subset $\Omega$ of $\mathbb{R}^n$, $n \in \mathbb{N}^*$, such that
\[ \int_\Omega \rho \, dv = 1 \]
can be interpreted as a joint probability distribution for the random variables $x_1, \dots, x_n$ on the sample space $\Omega$. The elements of $\Omega$ are called sample points and represent the possible outcomes of experiments. The probability $P\{(x_1, \dots, x_n) \in D\}$ for the event that the outcome of an experiment $(x_1, \dots, x_n)$ is a member of a subset $D \subset \Omega$ is given by
\[ P\{(x_1, \dots, x_n) \in D\} = \int_D \rho \, dv , \]
if existent. The mean or expected value $E(f)$ for the measurement of a random variable $f : \Omega \to \mathbb{R}$ in an experiment is defined by
\[ E(f) := \int_\Omega f \rho \, dv , \]
if existent.
Fig. 183: Density maps of $\rho_{(1,2)}$, $\rho_{(1,3)}$ from Example 4.4.36. Darker colors correspond to smaller function values.
Example 4.4.36. (Identical fermionic particles confined to a one-dimensional box) Consider two idealized ‘one-dimensional’ identical fermionic point particles of mass $m > 0$ confined to the interval $[0, 1]$, but not subject to other forces. In a quantum mechanical description, the probability distributions for the position of the particles in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m} |k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y) := 2 \, [ \sin(k_1 \pi x) \sin(k_2 \pi y) - \sin(k_2 \pi x) \sin(k_1 \pi y) ]^2 \]
for all $x, y \in [0, 1]$, where $k \in \mathbb{N}^{*2}$ satisfies $k_1 \neq k_2$. For every such $k$ calculate the expectation value $\langle x + y \rangle$ for the sum of the positions of the particles
\[ \langle x + y \rangle := \int_{[0,1]^2} (x + y) \, \rho_k(x, y) \, dx\,dy . \]
Solution: For $k \in \mathbb{N}^{*2}$ such that $k_1 \neq k_2$ and $x, y \in [0, 1]$, it follows that
\[ \rho_k(x, y) = 2 \, [ \sin^2(k_1 \pi x) \sin^2(k_2 \pi y) + \sin^2(k_2 \pi x) \sin^2(k_1 \pi y) - 2 \sin(k_1 \pi x) \sin(k_2 \pi x) \sin(k_1 \pi y) \sin(k_2 \pi y) ] \]
\[ = \frac{1}{2} [ 1 - \cos(2 k_1 \pi x) ] [ 1 - \cos(2 k_2 \pi y) ] + \frac{1}{2} [ 1 - \cos(2 k_2 \pi x) ] [ 1 - \cos(2 k_1 \pi y) ] \]
\[ - [ \cos((k_1 - k_2) \pi x) - \cos((k_1 + k_2) \pi x) ] \, [ \cos((k_1 - k_2) \pi y) - \cos((k_1 + k_2) \pi y) ] , \]
where it has been used that
\[ \sin(\alpha) \sin(\beta) = \frac{1}{2} [ \cos(\alpha - \beta) - \cos(\alpha + \beta) ] \]
for all $\alpha, \beta \in \mathbb{R}$. Hence it follows by Fubini’s theorem that
\[ \langle x \rangle = \frac{1}{2} \int_0^1 x \, [ 1 - \cos(2 k_1 \pi x) ] \, dx + \frac{1}{2} \int_0^1 x \, [ 1 - \cos(2 k_2 \pi x) ] \, dx \]
\[ = \frac{1}{2} \left[ \frac{x^2}{2} - \frac{\cos(2 k_1 \pi x)}{4 k_1^2 \pi^2} - \frac{x \sin(2 k_1 \pi x)}{2 k_1 \pi} \right]_0^1 + \frac{1}{2} \left[ \frac{x^2}{2} - \frac{\cos(2 k_2 \pi x)}{4 k_2^2 \pi^2} - \frac{x \sin(2 k_2 \pi x)}{2 k_2 \pi} \right]_0^1 = \frac{1}{2} . \]
Further, for the calculation of $\langle y \rangle$, we use change of variables. For this, we define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g(x, y) := (y, x) \]
for all $(x, y) \in \mathbb{R}^2$. The map $g$ is continuously differentiable with inverse $g$. In particular,
\[ g([0, 1]^2) = [0, 1]^2 \]
and
\[ \det g' = -1 . \]
Hence it follows by change of variables that
\[ \langle y \rangle = \int_{[0,1]^2} y \, \rho_k(x, y) \, dx\,dy = \int_{g([0,1]^2)} y \, \rho_k(x, y) \, dx\,dy = \int_{[0,1]^2} x \, \rho_k(y, x) \, dx\,dy = \int_{[0,1]^2} x \, \rho_k(x, y) \, dx\,dy = \langle x \rangle \]
and hence, finally, that $\langle x + y \rangle = 1$.
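The result $\langle x + y \rangle = 1$ can be checked numerically for individual states. The following sketch is an illustration under assumptions: the function names `rho` and `expectation`, the midpoint rule on a $200 \times 200$ grid and the sample states $k = (1, 2)$ and $k = (2, 3)$ are hypothetical choices. It also checks the normalization of the density.

```python
import math


def rho(k1, k2, x, y):
    """Density of the two-particle fermionic state labeled by k = (k1, k2)."""
    a = math.sin(k1 * math.pi * x) * math.sin(k2 * math.pi * y)
    b = math.sin(k2 * math.pi * x) * math.sin(k1 * math.pi * y)
    return 2.0 * (a - b) ** 2


def expectation(weight, k1, k2, n=200):
    """Midpoint sum over [0, 1]^2 of weight(x, y) * rho_k(x, y)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += weight(x, y) * rho(k1, k2, x, y)
    return total * h * h


for k1, k2 in ((1, 2), (2, 3)):
    norm = expectation(lambda x, y: 1.0, k1, k2)
    mean_sum = expectation(lambda x, y: x + y, k1, k2)
    print((k1, k2), norm, mean_sum)  # both values approximate 1
```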
Fig. 184: Density maps of $\rho_{(2,3)}$, $\rho_{(2,4)}$ from Example 4.4.36. Darker colors correspond to smaller function values.
Problems
1) Evaluate the following iterated integrals.
» 2 » 1
a)
0
0
px2
2y q dx dy ,
» 3 » 5
b)
3
» 4 »
y2 4
2
c)
3
1
px
π{2
2
» 1 "» 1 » 1
r sin ϕ dr dϕ ,
0
0
?1
» 1 "» 1x » 1xy
f)
» 1 #» ?1x2 »
0
2
0
e)
0
2y q dx dy ,
dy
px yq2 dx ,
» π{2 » 3 cos ϕ
d)
0
0
2) Calculate
0
z
*
dy dx ,
*
xyz dz dy dx ,
?1x y
2
g)
0
dz
x y
0
2
»
x dxdy
D
+
dz
a
dy dx .
1 |px, y, z q|2
where D € R2 is the compact set that is contained in the first and
fourth quadrant as well as is bounded by the y-axis and
tpx, yq P R2 : x
y 2 1 0u .
Sketch D.
3) By using polar coordinates, calculate
»
T
where
T :
!
px
y q dxdy
pr cos ϕ, r sin ϕq P R2 : 0 r ¤ 1 ^ π6 ¤ ϕ ¤ π3
)
.
Sketch T and g 1 pT q where g is the polar coordinate transformation.
4) Calculate
»
x dxdydz
E
where
E :
5) Calculate
!
px, y, zq P [0, 8q3 : x
y
z
2
)
1
.
»
y dxdy
D
where D € R2 is the area of the triangle with corners p0, 0q, p1{2, 0q
and p1{2, 1q. Sketch D.
6) By using polar coordinates, calculate
\[ \int_T xy \, dx\,dy \]
where $T$ is the compact subset of $\mathbb{R}^2$ that is bounded by the coordinate axes and
\[ \left\{ (x, (1 - x^2)^{1/2}) \in \mathbb{R}^2 : 0 \leq x \leq 1 \right\} . \]
Sketch $T$ and $g^{-1}(T)$ where $g$ is the polar coordinate transformation.
7) Calculate
\[ \int_E z \, dx\,dy\,dz \]
where $E$ is the compact subset of $\mathbb{R}^3$ in the first octant that is bounded by the coordinate surfaces and
\[ \{(x, y, z) \in \mathbb{R}^3 : x + 2y + 3z = 1\} . \]
Sketch $E$.
8) Calculate
»
x2 dxdy
D
where D is the compact subset of R2 that is contained in the first and
fourth quadrant as well as is bounded by the y-axis and
tpx, yq P R2 : x
y 2 0u ,
tpx, yq P R2 : x y 2 0u .
Sketch D.
9) By using polar coordinates, calculate
\[ \int_T xy \, dx\,dy \]
where $T$ is the compact subset of $\mathbb{R}^2$ that is contained in the first quadrant as well as is bounded by both coordinate axes and
\[ \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 4\} . \]
Sketch $T$ and $g^{-1}(T)$ where $g$ is the polar coordinate transformation.
10) Calculate
»
z dxdydz
E
where E € R3 is the compact set contained in the first octant which
is bounded by the coordinate surfaces and
tp1, y, zq P R3 : y P R ^ z P Ru , tpx, y, zq P R3 : z
Sketch E.
2y
2u .
11) Calculate
\[ \int_D x^2 y \, dx\,dy \]
where $D$ is the compact subset of $\mathbb{R}^2$ that is contained in the upper half-plane as well as is bounded by the $x$-axis and
\[ \{(x, y) \in \mathbb{R}^2 : x^2 + y = 4\} . \]
Sketch $D$.
12) By using polar coordinates, calculate
\[ \int_T x \, dx\,dy \]
where $T$ is the compact subset of $\mathbb{R}^2$ that is contained in the first quadrant as well as is bounded by both coordinate axes and
\[ \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\} , \quad \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 4\} . \]
Sketch $T$ and $g^{-1}(T)$ where $g$ is the polar coordinate transformation.
13) Calculate
\[ \int_E z^2 \, dx\,dy\,dz \]
where $E$ is the compact subset of $\mathbb{R}^3$ that is contained in
\[ \{(x, y, z) \in \mathbb{R}^3 : z \geq 0\} \]
and is bounded by
\[ \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z - 9 = 0\} . \]
Sketch $E$.
14) Calculate the volume of a solid ellipsoid with half-axes $a, b, c > 0$.
15) Calculate the center of mass and the inertia tensor of a solid hemisphere
\[ \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + (z - r)^2 \leq r^2 \wedge 0 \leq z \leq r\} \]
for a mass distribution which is constant of value $\rho_0 \geq 0$.
Fig. 185: Density maps of $\rho_{(1,1)}$, $\rho_{(1,2)}$ from Problem 18. Darker colors correspond to smaller function values.
16) Calculate the center of mass and the inertia tensor of a solid cone of height $h \geq 0$
\[ \{(x, y, z) \in \mathbb{R}^3 : a^2 (x^2 + y^2) \leq z^2 \wedge 0 \leq z \leq h\} , \]
where $a > 0$, for a mass distribution which is constant of value $\rho_0 \geq 0$.
17) (Buffon’s needle problem) A needle of length $L > 0$ is thrown in a random fashion onto a smooth table ruled with parallel lines separated by a distance $d > L$. For simplicity, associate to all lines a common orientation. Denote by $x \in [0, d/2]$ the minimal distance of the center of the needle to the lines and by $\theta \in [0, \pi]$ the angle between the direction of the needle and the direction of the lines. Under the assumption that $x$ and $\theta$ are uniformly distributed, the joint probability distribution $\rho : [0, d/2] \times [0, \pi] \to [0, \infty)$ of $x, \theta$ is given by
\[ \rho(x, \theta) = \frac{2}{\pi d} \]
for all $(x, \theta) \in [0, d/2] \times [0, \pi]$.
a) Determine the set $S \subset [0, d/2] \times [0, \pi]$ corresponding to all events that cause the needle to intersect a ruled line.
b) Calculate the probability $p_S$ of this event, given by
\[ p_S = \int_S \rho(x, \theta) \, dx\,d\theta . \]
Fig. 186: Density maps of $\rho_{(2,2)}$, $\rho_{(2,3)}$ from Problem 18. Darker colors correspond to smaller function values.
18) (Identical bosonic particles confined to a one-dimensional box) Consider two idealized ‘one-dimensional’ identical bosonic point particles of mass $m > 0$ confined to the interval $[0, 1]$, but not subject to other forces. In a quantum mechanical description, the probability distributions for the position of the particles in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m} |k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y) := 2 \, [ \sin(k_1 \pi x) \sin(k_2 \pi y) + \sin(k_2 \pi x) \sin(k_1 \pi y) ]^2 \]
for all $x, y \in [0, 1]$, where $k \in \mathbb{N}^{*2}$. For every such $k$ calculate the expectation value $\langle x + y \rangle$ for the sum of the positions of the particles
\[ \langle x + y \rangle := \int_{[0,1]^2} (x + y) \, \rho_k(x, y) \, dx\,dy . \]
19) A point particle of mass $m > 0$ is confined to a cube $[0, 1]^3$, but not subject to other forces. In a quantum mechanical description, the probability distributions for the position of the particle in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m} |k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y, z) := 8 \sin^2(k_1 \pi x) \sin^2(k_2 \pi y) \sin^2(k_3 \pi z) \]
for all $x, y, z \in [0, 1]$ where $k = (k_1, k_2, k_3) \in \mathbb{N}^{*3}$.
Fig. 187: Density maps of $\rho_{(1,1,1)}(\cdot, \cdot, 0.5)$, $\rho_{(1,2,1)}(\cdot, \cdot, 0.5)$. See Problem 19. Darker colors correspond to smaller function values.
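a) For every $k = (k_1, k_2, k_3) \in \mathbb{N}^{*3}$, calculate the expectation values $\langle x \rangle$, $\langle y \rangle$, $\langle z \rangle$ for the components of the position of the particle
\[ \langle x \rangle = \int_{[0,1]^3} x \, \rho_k(x, y, z) \, dx\,dy\,dz , \]
\[ \langle y \rangle = \int_{[0,1]^3} y \, \rho_k(x, y, z) \, dx\,dy\,dz , \]
\[ \langle z \rangle = \int_{[0,1]^3} z \, \rho_k(x, y, z) \, dx\,dy\,dz . \]
b) For every $k \in \mathbb{N}^{*3}$, calculate the standard deviations $\sigma_1, \sigma_2, \sigma_3$ for the expectation values from part a)
\[ \sigma_1^2 = \int_{[0,1]^3} (x - \langle x \rangle)^2 \rho_k(x, y, z) \, dx\,dy\,dz , \]
\[ \sigma_2^2 = \int_{[0,1]^3} (y - \langle y \rangle)^2 \rho_k(x, y, z) \, dx\,dy\,dz , \]
\[ \sigma_3^2 = \int_{[0,1]^3} (z - \langle z \rangle)^2 \rho_k(x, y, z) \, dx\,dy\,dz . \]
c) Calculate and compare the probability of finding the particle in the volume $[0, a]^3$ that includes a corner and in $[(1 - a)/2, (1 + a)/2]^3$ around the center of the cube where $0 < a < 1$.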
20) Calculate the volume of the compact subset of R3 that is bounded by
the given surfaces.
a) S1
t px, y, zq P R3 : x2
y2
zu
,
Fig. 188: Density maps of $\rho_{(2,1,1)}(\cdot, \cdot, 0.5)$, $\rho_{(2,2,1)}(\cdot, \cdot, 0.5)$. See Problem 19. Darker colors correspond to smaller function values.
t px, y, zq P R3 : x4 y4 2 px2 y2 q u ,
S3 t px, y, z q P R3 : z 0 u ,
S1 t px, y, z q P R3 : x2 y 2 z 2 4 u ,
S2 t px, y, z q P R3 : px2 y 2 q2 4 px2 y 2 q u
S1 t px, y, z q P R3 : x2 y 2 z 2 9 u ,
S2 t px, y, z q P R3 : x2 y 2 3 |x| u .
S2
b)
c)
,
21) (Generalized polar coordinates)
a) Let $a, b, \alpha > 0$. Define $g : (0, \infty) \times (0, \pi/2) \to \mathbb{R}^2$ by
\[ g(r, \varphi) := (a r \cos^\alpha \varphi, \, b r \sin^\alpha \varphi) \]
for all $(r, \varphi) \in (0, \infty) \times (0, \pi/2)$. Find the range of $g$. Show that the restriction of $g$ in range to $\mathrm{Ran}(g)$ is a continuously differentiable bijection with a continuously differentiable inverse. In particular, calculate that inverse and $\det(g')$.
b) Calculate the area of
\[ S := \{(x, y) \in \mathbb{R}^2 : |x|^{1/2} + |y|^{1/2} \leq R^{1/2}\} \]
for $R > 0$ by use of suitable generalized polar coordinates from part a).
22) (Generalized spherical coordinates)
a) Let $a, b, c, \alpha > 0$. Define $g : (0, \infty) \times (0, \pi/2) \times (0, \pi/2) \to \mathbb{R}^3$ by
\[ g(r, \theta, \varphi) := (a r \sin^\alpha \theta \cos^\alpha \varphi, \, b r \sin^\alpha \theta \sin^\alpha \varphi, \, c r \cos^\alpha \theta) \]
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi/2) \times (0, \pi/2)$. Find the range of $g$. Show that the restriction of $g$ in range to $\mathrm{Ran}(g)$ is a continuously differentiable bijection with a continuously differentiable inverse. In particular, calculate that inverse and $\det(g')$.
b) Calculate the volume of
\[ S := \{(x, y, z) \in \mathbb{R}^3 : |x|^{1/2} + |y|^{1/2} + |z|^{1/2} \leq R^{1/2}\} \]
for $R > 0$ by use of suitable generalized spherical coordinates from part a).

Fig. 189: Graphical depiction of $S$ from Problem 21 for the case $R = 1$.

Fig. 190: Graphical depiction of $S$ from Problem 22 for the case $R = 1$.
4.5 Vector Calculus
We recall that we identify points in $\mathbb{R}^k$, where $k \in \mathbb{N}^*$ is such that $k \geq 2$, with position vectors, see the remarks preceding Definition 3.5.8. In addition, as was explained in the beginning of Section 3.5.8, we also identify tangent vectors that are associated to points in space with position vectors. In applications, the nature of the quantities involved can be inferred only from the context of the problem. But, at least in the experience of the author, apart from ‘transformations’ which map points into points, most maps in applications describe ‘physical fields’, i.e., maps that have as domain a set of points and as range a set of real numbers or a set of tangent vectors. In the latter case, such maps associate to every point from the domain a tangent vector that is ‘attached’ to that point. The remaining part of the course studies this latter type of maps, which are also called vector fields. Hence for the interpretation of the results, the reader should imagine that the value of a vector field in a point $p$ of its domain is a position vector which has been parallel transported in space such that its starting point is $p$ instead of the origin of a Cartesian coordinate system. The notion of parallel transport is made precise in courses in differential geometry. On the other hand, any vector-valued function of several variables $f$ from a nontrivial subset $D$ of some $\mathbb{R}^n$ into some $\mathbb{R}^m$ can be interpreted as a vector field assigning vectors to points in space. Therefore, the previous remarks gain importance only in connection with the interpretation of the results.
Example 4.5.1. Define v : tpx, y, z q P R3 : 1 ¤ x2
v px, y, z q :
y2
¤ 4u Ñ R3 by
2
2
2
2
y x x2 y y2 2 , x x x2 y y2 2 , 0
for all px, y, z q P R3 satisfying 1 ¤ x2 y 2 ¤ 4. The map v describes the
velocity field of a viscous incompressible flow, a so called ‘Couette flow’,
between concentric cylinders of radius 1 and 2 rotating at the same rate,
but in counterclockwise and clockwise direction, respectively. The velocity
field in a point on these cylinders coincides with the speed of that point, i.e.,
due to viscous friction forces, the fluid sticks to the cylinders and, in this
way, is carried along with the cylinders. To achieve better visualization,
Fig 191 shows the field of directions $|v|^{-1}.v$ corresponding to $v$. This field of directions
is not defined in points of the circle of radius $\sqrt{2}$ around the origin.

Fig. 191: Direction field corresponding to Couette flow between counter-rotating concentric cylinders of radius 1 and 2. See Example 4.5.1.
Example 4.5.2. Define $E : \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}^3$ by
\[
E(x,y,z) := -\,\frac{1}{(x^2+y^2+z^2)^{3/2}} \, . (x,y,z)
\]
for all $(x,y,z) \in \mathbb{R}^3 \setminus \{0\}$. $E$ describes the electrical field created by a
negative unit charge. To achieve better visualization, Fig 192 shows the
field of directions $|E|^{-1}.E$ corresponding to $E$.
Fig. 192: Direction field corresponding to an electrical field created by a negative point charge. See Example 4.5.2.

The following example motivates the subsequent definition of path integrals.
Example 4.5.3. (Motivation for the definition of path integrals.) For
this, let $F$ be a continuous map from some open subset $U$ of $\mathbb{R}^3$ into $\mathbb{R}^3$ (the
‘force field’) and $r$ be a twice continuously differentiable map from some
open interval $I$ of $\mathbb{R}$ into $U$ (the trajectory of a point particle parametrized
by time) which satisfies
\[
m \, r''(t) = F(r(t))
\]
for every $t \in I$ (Newton’s equation of motion) where $m > 0$ (the mass of
the particle). Then
\[
\left( \frac{m}{2} \, v^2 \right)'(t) = m \, r'(t) \cdot r''(t) = r'(t) \cdot F(r(t))
\]
for every $t \in I$ where $v := r'$ (the velocity of the particle), and hence
\[
\frac{m}{2} \, v^2(t_1) - \frac{m}{2} \, v^2(t_0) = \int_{t_0}^{t_1} r'(t) \cdot F(r(t)) \, dt .
\]
Hence the right hand side of the previous equation describes the difference
of the ‘kinetic energies’ of the particle at $t_1$ and $t_0$. Further, if $F$ is in
addition ‘conservative’, i.e., if there is some $V : U \to \mathbb{R}$ (a ‘potential
function’) of class $C^1$ such that
\[
F = - \nabla V ,
\]
then we conclude by the chain rule that
\[
\int_{t_0}^{t_1} r'(t) \cdot F(r(t)) \, dt = - \int_{t_0}^{t_1} r'(t) \cdot (\nabla V)(r(t)) \, dt
= - \int_{t_0}^{t_1} (V \circ r)'(t) \, dt = V(r(t_0)) - V(r(t_1))
\]
and hence that the function (‘the total energy of the particle’)
\[
\frac{m}{2} \, v^2 + V \circ r
\]
is constant (‘Energy conservation’).
Definition 4.5.4. Let $n \in \mathbb{N}^*$, $F$ be a continuous map from some open
subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, $a, b \in \mathbb{R}$ such that $a \le b$ and $r : [a,b] \to \mathbb{R}^n$ be
a regular $C^1$-path in $U$, i.e., the restriction of a continuously differentiable
map from some open interval $I \supset [a,b]$ into $U$. Then we define the path
integral of $F$ along $r$ by
\[
\int_r F \cdot dr := \int_a^b r'(t) \cdot F(r(t)) \, dt .
\]
Remark 4.5.5. Note that we do not demand that $r$ is injective.
A simple example of a regular $C^1$-path which is not injective is given by
$r : [-1,1] \to \mathbb{R}^2$ defined by
\[
r(t) := (t^2, 1)
\]
for every $t \in [-1,1]$. This path begins at the point $(1,1)$, moves on to
$(0,1)$ from where it returns to $(1,1)$.
Fig. 193: Direction field associated to $F$ from Example 4.5.6 and $\mathrm{Ran}(r)$ for $a = 1$.

Example 4.5.6. Define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[
F(x,y) := \left( - \frac{y}{x^2+y^2} \, , \; \frac{x}{x^2+y^2} \right)
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$ and the parametrization $r : [0, 2\pi] \to \mathbb{R}^2$ of the circle of radius $a > 0$
around the origin by
\[
r(t) := a.(\cos t, \sin t)
\]
for all $t \in [0, 2\pi]$. Then
\[
\int_r F \cdot dr = \int_0^{2\pi} r'(t) \cdot F(r(t)) \, dt
= \int_0^{2\pi} (- a \sin t, a \cos t) \cdot \left( - \frac{a \sin t}{a^2} \, , \; \frac{a \cos t}{a^2} \right) dt
= \int_0^{2\pi} dt = 2\pi .
\]
Note that this result does not depend on the radius $a$.
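The value $2\pi$ from Example 4.5.6 can be checked numerically. The following is only a sketch: the helper `path_integral` (a midpoint rule applied to the defining formula of Definition 4.5.4) and the sample radii are choices made here for illustration and are not part of the text.

```python
import math

def path_integral(F, r, dr, a, b, n=2000):
    """Midpoint-rule approximation of int_a^b r'(t) . F(r(t)) dt (Definition 4.5.4)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        Fx, Fy = F(*r(t))
        vx, vy = dr(t)
        total += (vx * Fx + vy * Fy) * h
    return total

def F(x, y):
    # vector field of Example 4.5.6
    s = x * x + y * y
    return (-y / s, x / s)

def circle(a):
    # parametrization of the circle of radius a and its derivative
    r = lambda t: (a * math.cos(t), a * math.sin(t))
    dr = lambda t: (-a * math.sin(t), a * math.cos(t))
    return r, dr

# sample radii (illustrative assumption)
results = {a: path_integral(F, *circle(a), 0.0, 2 * math.pi) for a in (0.5, 1.0, 3.0)}
for a, value in results.items():
    print(a, value, 2 * math.pi)
```

For each radius the approximation agrees with $2\pi$ up to rounding, in line with the remark that the result does not depend on $a$.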
The following shows that the value of a path integral is unchanged if the
path is replaced by another that has the same range and that traverses the
range in the ‘same way’ as the first path.
Theorem 4.5.7. (Invariance under reparametrization) Let $n \in \mathbb{N}^*$, $F$ be
a continuous map from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$ and $r : [a,b] \to \mathbb{R}^n$
be a regular $C^1$-path in $U$. Further, let $g : [c,d] \to [a,b]$ be continuously differentiable with a continuously differentiable inverse (i.e., there is
an extension $\hat{g} : I_1 \to I_2$ of $g$, where $I_1, I_2$ are open intervals of $\mathbb{R}$ such
that $I_1 \supset [c,d]$, $I_2 \supset [a,b]$ and such that $\hat{g}$ is continuously differentiable
with a continuously differentiable inverse) and such that $g'(x) > 0$ for all
$x \in [c,d]$. Then
\[
\int_r F \cdot dr = \int_{r \circ g} F \cdot dr .
\]
Proof. By the change of variables and the chain rule, it follows that
\begin{align*}
\int_r F \cdot dr &= \int_a^b r'(t) \cdot F(r(t)) \, dt
= \int_c^d r'(g(s)) \cdot F(r(g(s))) \, g'(s) \, ds \\
&= \int_c^d (r \circ g)'(s) \cdot F((r \circ g)(s)) \, ds
= \int_{r \circ g} F \cdot dr .
\end{align*}
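The invariance under reparametrization can be illustrated numerically. In the sketch below, the field $F(x,y) = (-y, x)$, the path $r(t) = (t, t^2)$ on $[0,1]$ and the reparametrization $g(s) = (s + s^2)/2$ on $[0,1]$ with $g' > 0$ are assumptions chosen only for illustration.

```python
def midpoint(fun, a, b, n=4000):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

def F(x, y):
    return (-y, x)

def r(t):
    # path r(t) = (t, t^2) on [0, 1]
    return (t, t * t)

def dr(t):
    return (1.0, 2.0 * t)

def g(s):
    # reparametrization of [0, 1] onto [0, 1]
    return 0.5 * (s + s * s)

def dg(s):
    # g'(s) = (1 + 2 s) / 2 > 0 on [0, 1]
    return 0.5 * (1.0 + 2.0 * s)

def integrand_r(t):
    Fx, Fy = F(*r(t))
    vx, vy = dr(t)
    return vx * Fx + vy * Fy

def integrand_rg(s):
    # chain rule: (r o g)'(s) = g'(s) r'(g(s))
    Fx, Fy = F(*r(g(s)))
    vx, vy = dr(g(s))
    return dg(s) * (vx * Fx + vy * Fy)

I_r = midpoint(integrand_r, 0.0, 1.0)    # exact value 1/3
I_rg = midpoint(integrand_rg, 0.0, 1.0)  # same value by the theorem
print(I_r, I_rg)
```

Both approximations are close to $1/3$, the exact value of $\int_0^1 t^2 \, dt$.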
The following defines for every regular $C^1$-path $r$ an inverse path $r^{-}$ whose
domain and range are the same as those of $r$, but which traverses the range in the
opposite way, i.e., in particular, starts at the endpoint of $r$ and ends at the
starting point of $r$. The replacement of $r$ in a path integral by $r^{-}$ leads to a
change in sign.
Definition 4.5.8. (Change of orientation/inverse path) Let $n$, $F$ and $r$ be as
in Definition 4.5.4. Then, we define the inverse path $r^{-}$ to $r$ by
\[
r^{-}(t) := r(a + b - t)
\]
for all $t \in [a,b]$. Then it follows by Theorem 3.1.9 and change of variables
that
\begin{align*}
\int_{r^{-}} F \cdot dr &= \int_a^b (r^{-})'(t) \cdot F(r^{-}(t)) \, dt
= - \int_a^b r'(a + b - t) \cdot F(r(a + b - t)) \, dt \\
&= - \int_{-b}^{-a} r'(t + a + b) \cdot F(r(t + a + b)) \, dt
= - \int_a^b r'(t) \cdot F(r(t)) \, dt
= - \int_r F \cdot dr .
\end{align*}
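The change of sign under reversal of the path can also be observed numerically. The field $F(x,y) = (xy, x + y)$ and the path $r(t) = (t, t^2)$ on $[0,1]$ in the following sketch are illustrative assumptions. The exact value of the integral along $r$ is $17/12$.

```python
def midpoint(fun, a, b, n=4000):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

a, b = 0.0, 1.0

def F(x, y):
    return (x * y, x + y)

def r(t):
    return (t, t * t)

def dr(t):
    return (1.0, 2.0 * t)

def r_inv(t):
    # inverse path: r^-(t) = r(a + b - t)
    return r(a + b - t)

def dr_inv(t):
    # (r^-)'(t) = -r'(a + b - t)
    vx, vy = dr(a + b - t)
    return (-vx, -vy)

def path_integral(path, dpath):
    def integrand(t):
        Fx, Fy = F(*path(t))
        vx, vy = dpath(t)
        return vx * Fx + vy * Fy
    return midpoint(integrand, a, b)

I_forward = path_integral(r, dr)
I_backward = path_integral(r_inv, dr_inv)
print(I_forward, I_backward)
```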
Path integrals occur frequently in the form of ‘boundary integrals’. In these
cases the ranges of the paths are parts of the boundaries of subsets of $\mathbb{R}^n$, where
$n \in \mathbb{N}$ is such that $n \ge 2$, and often the whole boundary of the set needs to
be traversed. Since such a boundary can contain corners, e.g., in the case
of the boundary of the interior of a rectangle, it is useful to define also path
integrals along paths that are only piecewise $C^1$.
Definition 4.5.9.
(i) A piecewise regular $C^1$-path $r$ is a sequence $(r_1, \dots, r_\nu)$ of regular
$C^1$-paths $r_1, \dots, r_\nu$, where $\nu \in \mathbb{N}^*$, with coinciding endpoints of $r_i$
and starting points of $r_{i+1}$ for each $i \in \{1, \dots, \nu - 1\}$. Further, we say
that $r$ is closed if the endpoint of $r_\nu$ coincides with the initial point of
$r_1$.
(ii) Further, for a continuous vector field $F : U \to \mathbb{R}^n$, where $U$ is some
open subset of $\mathbb{R}^n$ containing the ranges of all $r_i$, $i \in \{1, \dots, \nu\}$, we
define the path integral along $r$ by
\[
\int_r F \cdot dr := \int_{r_1} F \cdot dr + \dots + \int_{r_\nu} F \cdot dr .
\]
We already noticed in Example 4.5.3 that the value of a path integral depends only on the endpoints of the path in the case that the vector field is
the gradient of a function of class $C^1$. Such functions are called potentials
or potential functions in physics.
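Theorem 4.5.10. (Path independence) Let $n \in \mathbb{N}^*$, $F$ be a continuous
map from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$ and $r$ be a piecewise regular
$C^1$-path in $U$ from $x_0$ to $x_1$. Finally, let $V : U \to \mathbb{R}$ be of class $C^1$ and
such that $F = \nabla V$. Then
\[
\int_r F \cdot dr = V(x_1) - V(x_0) .
\]
Proof. Since $r$ is a piecewise regular $C^1$-path in $U$ from $x_0$ to $x_1$, there are
$\nu \in \mathbb{N}^*$ along with regular $C^1$-paths $r_1 : [a_1, b_1] \to U, \dots, r_\nu : [a_\nu, b_\nu] \to U$
such that $r = (r_1, \dots, r_\nu)$, $r_1(a_1) = x_0$ and $r_\nu(b_\nu) = x_1$. Then it follows by the chain rule
that
\begin{align*}
\int_r F \cdot dr &= \int_{r_1} F \cdot dr + \dots + \int_{r_\nu} F \cdot dr \\
&= \int_{a_1}^{b_1} r_1'(t) \cdot (\nabla V)(r_1(t)) \, dt + \dots + \int_{a_\nu}^{b_\nu} r_\nu'(t) \cdot (\nabla V)(r_\nu(t)) \, dt \\
&= \int_{a_1}^{b_1} (V \circ r_1)'(t) \, dt + \dots + \int_{a_\nu}^{b_\nu} (V \circ r_\nu)'(t) \, dt \\
&= V(r_1(b_1)) - V(r_1(a_1)) + \dots + V(r_\nu(b_\nu)) - V(r_\nu(a_\nu)) \\
&= V(r_\nu(b_\nu)) - V(r_1(a_1)) = V(x_1) - V(x_0) ,
\end{align*}
where the intermediate terms cancel since the endpoint of $r_i$ coincides with the starting point of $r_{i+1}$ for each $i \in \{1, \dots, \nu - 1\}$.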
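Path independence can be observed numerically for a gradient field. In the following sketch, the potential $V(x,y) = x^2 y + \sin y$, its gradient $F = (2xy, x^2 + \cos y)$ and the two paths from $(0,0)$ to $(2,1)$ are assumptions made for illustration only.

```python
import math

def midpoint(fun, a, b, n=4000):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

def V(x, y):
    return x * x * y + math.sin(y)

def F(x, y):
    # F = grad V
    return (2.0 * x * y, x * x + math.cos(y))

def path_integral(r, dr, a=0.0, b=1.0):
    def integrand(t):
        Fx, Fy = F(*r(t))
        vx, vy = dr(t)
        return vx * Fx + vy * Fy
    return midpoint(integrand, a, b)

# two different regular C^1-paths from (0, 0) to (2, 1)
I_line = path_integral(lambda t: (2.0 * t, t), lambda t: (2.0, 1.0))
I_parabola = path_integral(lambda t: (2.0 * t, t * t), lambda t: (2.0, 2.0 * t))
expected = V(2.0, 1.0) - V(0.0, 0.0)
print(I_line, I_parabola, expected)
```

Both approximations agree with $V(2,1) - V(0,0) = 4 + \sin 1$ up to the discretization error.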
From Schwarz’s Theorem 4.2.18 follows a simple necessary condition for
the existence of a potential for a vector field F whose component functions
are all of class C 1 .
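Theorem 4.5.11. (Necessary conditions for the existence of a potential)
Let $n \in \mathbb{N}$ be such that $n \ge 2$, $F = (F_1, \dots, F_n)$ be a map of class $C^1$ (i.e., all
$F_1, \dots, F_n$ are of class $C^1$) from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and
let $V : U \to \mathbb{R}$ be of class $C^2$ and such that $F = \nabla V$. Then
\[
\frac{\partial F_i}{\partial x_j} - \frac{\partial F_j}{\partial x_i} = 0 \; , \quad i, j \in \{1, \dots, n\} \, , \; i \neq j .
\]
Proof. For every $i, j \in \{1, \dots, n\}$ such that $i \neq j$, it follows by Schwarz’s
Theorem 4.2.18 that
\[
\frac{\partial F_i}{\partial x_j} = \frac{\partial^2 V}{\partial x_j \partial x_i} = \frac{\partial^2 V}{\partial x_i \partial x_j} = \frac{\partial F_j}{\partial x_i} .
\]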
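Remark 4.5.12. For the cases $n = 2, 3$, the condition from Theorem 4.5.11 is equivalent
to the vanishing of the so-called rotational field $\mathrm{curl}\, F$ of $F$:
\[
\mathrm{curl}\, F :=
\begin{cases}
\dfrac{\partial F_2}{\partial x} - \dfrac{\partial F_1}{\partial y} & \text{if } n = 2 \\[2ex]
\left( \dfrac{\partial F_3}{\partial y} - \dfrac{\partial F_2}{\partial z} \, , \; \dfrac{\partial F_1}{\partial z} - \dfrac{\partial F_3}{\partial x} \, , \; \dfrac{\partial F_2}{\partial x} - \dfrac{\partial F_1}{\partial y} \right) & \text{if } n = 3 .
\end{cases}
\]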
The following example shows that not for every vector field that satisfies
the conditions from Theorem 4.5.11 there is a potential function.
Example 4.5.13. Let $F$ be as in Example 4.5.6, i.e., define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[
F(x,y) := \left( - \frac{y}{x^2+y^2} \, , \; \frac{x}{x^2+y^2} \right)
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Then $F$ is of class $C^1$ and
\[
\frac{\partial F_x}{\partial y}(x,y) = \frac{\partial F_y}{\partial x}(x,y) = \frac{y^2 - x^2}{(x^2+y^2)^2}
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$, where $F_x, F_y$ denote the component functions of $F$, and hence $\mathrm{curl}\, F$ vanishes on $\mathbb{R}^2 \setminus \{0\}$.
But in
Example 4.5.6, we found closed regular $C^1$-paths $r$ such that
\[
\int_r F \cdot dr \neq 0 .
\]
As a consequence, the existence of a potential function for $F$ would lead to
a contradiction to Theorem 4.5.10. Hence there is no such potential. Note
that, for every $R > 0$, the same reasoning excludes also the existence of $V : U_R(0) \setminus \{0\} \to \mathbb{R}$
of class $C^1$ such that
\[
F(x,y) = (\nabla V)(x,y)
\]
for all $(x,y) \in U_R(0) \setminus \{0\}$. Hence it is natural to assume that
this fact is caused by the singular behavior of $F$ in the origin. Indeed, the
following theorem shows that this assumption is correct in the sense that
there would be such a potential if $F$ could be extended to a vector field $\hat{F}$
of class $C^1$ on $\mathbb{R}^2$ such that $\mathrm{curl}\, \hat{F}$ vanishes also in the origin.
Criteria, like the following, providing the existence of potential functions
for vector fields satisfying certain conditions, are generally called ‘Poincaré
lemmas’ after Henri Poincaré. Below, we give only the simplest criterion
of this type. For its proof, the potential functions are explicitly constructed.
Theorem 4.5.14. (Sufficient conditions for the existence of a potential,
Poincaré Lemma) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $U$ be an open subset of
$\mathbb{R}^n$ which is star-shaped with respect to some $x_0 \in U$, i.e., such that for all
$x \in U$ also the line segment $\{x_0 + t.(x - x_0) : t \in [0,1]\}$ is contained in $U$.
Further, let $F = (F_1, \dots, F_n) : U \to \mathbb{R}^n$ be of class $C^1$ and such that
\[
\frac{\partial F_i}{\partial x_j} - \frac{\partial F_j}{\partial x_i} = 0
\]
for every $i, j \in \{1, \dots, n\}$ such that $i \neq j$. Then there is a potential
$V : U \to \mathbb{R}$ of class $C^2$ such that $F = \nabla V$.
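Proof. Define $V : U \to \mathbb{R}$ by
\[
V(x) := \int_{r_x} F \cdot dr = \sum_{i=1}^{n} (x_i - x_{0i}) \, H_i(x) \; , \quad
H_i(x) := \int_0^1 F_i(x_0 + t.(x - x_0)) \, dt ,
\]
where $r_x(t) := x_0 + t.(x - x_0)$ for all $t \in [0,1]$, for all $x \in U$. Now let
$x \in U$. Since $U$ is open there is $d > 0$ such that $U_d(x) \subset U$. Then by Taylor’s
formula Theorem 4.3.6, it follows for all $h \in U_d(0) \setminus \{0\}$ that
\[
\frac{1}{h} \, [ H_i(x + h.e_j) - H_i(x) ] = \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0) + \tau h.e_j) \, dt
\]
for some $\tau \in [0,1]$. Now $r_j : [0,1] \times [0,d] \to U$ defined by
\[
r_j(t,s) := x_0 + t.(x - x_0) + s.e_j \; , \quad (t,s) \in [0,1] \times [0,d]
\]
is obviously continuous and hence its image compact, since its domain is
compact, too. Since
\[
\frac{\partial F_i}{\partial x_j} : U \to \mathbb{R}
\]
is continuous, it is in particular uniformly continuous on $\mathrm{Ran}\, r_j$. Hence for
any $\varepsilon > 0$ there is $\delta > 0$ such that
\[
\left| \frac{\partial F_i}{\partial x_j}(x_2) - \frac{\partial F_i}{\partial x_j}(x_1) \right| < \varepsilon
\]
whenever $x_1, x_2 \in \mathrm{Ran}\, r_j$ and $|x_2 - x_1| < \delta$.
In particular for $h \in \mathbb{R}$ such that $|h| < \delta$, it follows that
\[
\left| \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0) + \tau h.e_j) \, dt
- \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0)) \, dt \right| \le \varepsilon
\]
and hence also that
\[
\left| \frac{1}{h} \, [ H_i(x + h.e_j) - H_i(x) ]
- \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0)) \, dt \right| \le \varepsilon .
\]
Since $\varepsilon > 0$ is otherwise arbitrary, it follows that $H_i$ is partially differentiable in the $j$-th coordinate direction with partial derivative given by
\[
\frac{\partial H_i}{\partial x_j}(x) = \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0)) \, dt \; , \quad x \in U .
\]
Moreover, analogous reasoning shows that
\[
\frac{\partial H_i}{\partial x_j}
\]
is continuous. Hence $V$ is of class $C^1$. In particular,
\begin{align*}
\frac{\partial V}{\partial x_j}(x)
&= \sum_{i=1}^{n} (x_i - x_{0i}) \int_0^1 t \, \frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0)) \, dt
+ \int_0^1 F_j(x_0 + t.(x - x_0)) \, dt \\
&= \sum_{i=1}^{n} (x_i - x_{0i}) \int_0^1 t \, \frac{\partial F_j}{\partial x_i}(x_0 + t.(x - x_0)) \, dt
+ \int_0^1 F_j(x_0 + t.(x - x_0)) \, dt \\
&= \int_0^1 \frac{d}{dt} \left[ \, t \, F_j(x_0 + t.(x - x_0)) \right] dt
= \Big[ \, t \, F_j(x_0 + t.(x - x_0)) \Big]_0^1 = F_j(x) \; , \quad j \in \{1, \dots, n\}
\end{align*}
and hence, finally, it follows also that $V$ is of class $C^2$.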
Remark 4.5.15. For the case $n = 2$, the statement of Theorem 4.5.14 is
also true for the more general case of an open simply-connected $U$. For the
proof see [63], XVI, §5, Theorem 5.4.
Example 4.5.16. For $n \in \mathbb{N}$ such that $n \ge 2$, any open convex subset of $\mathbb{R}^n$,
i.e., any open subset $S$ of $\mathbb{R}^n$ such that for all $x, y \in S$ also $\{x + t.(y - x) : t \in [0,1]\} \subset S$, like $\mathbb{R}^n$ itself and any open ball in $\mathbb{R}^n$, is star-shaped with
respect to any of its elements.
Example 4.5.17. We define $F : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[
F(x,y,z) := (y^2 z^3, \, 2 x y z^3, \, 3 x y^2 z^2)
\]
for all $x, y, z \in \mathbb{R}$. In particular, $F$ is of class $C^1$ and
\[
\mathrm{curl}\, F = 0 .
\]
Since $\mathbb{R}^3$ is star-shaped with respect to the origin, there is a potential
$V : \mathbb{R}^3 \to \mathbb{R}$ of class $C^2$ such that $F = \nabla V$. Such a potential is not uniquely
determined since the gradients of constant functions vanish. Integration of
the corresponding equations shows that $V : \mathbb{R}^3 \to \mathbb{R}$ defined by
\[
V(x,y,z) := x y^2 z^3
\]
for all $x, y, z \in \mathbb{R}$ is a potential function for $F$.
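The construction in the proof of Theorem 4.5.14 can be tried out on the field of Example 4.5.17. The following sketch approximates $V(x) = \int_0^1 (x - x_0) \cdot F(x_0 + t.(x - x_0)) \, dt$ with $x_0 = 0$ by Simpson's rule and compares it with $x y^2 z^3$. The quadrature routine and the sample points are assumptions chosen here for illustration.

```python
def simpson(fun, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = fun(a) + fun(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * fun(a + k * h)
    return s * h / 3

def F(x, y, z):
    # vector field of Example 4.5.17
    return (y**2 * z**3, 2 * x * y * z**3, 3 * x * y**2 * z**2)

def potential(p, p0=(0.0, 0.0, 0.0)):
    """Path integral of F along the line segment from p0 to p."""
    d = tuple(pi - qi for pi, qi in zip(p, p0))
    def integrand(t):
        q = tuple(qi + t * di for qi, di in zip(p0, d))
        return sum(di * Fi for di, Fi in zip(d, F(*q)))
    return simpson(integrand, 0.0, 1.0)

def V_exact(x, y, z):
    return x * y**2 * z**3

# sample points (illustrative assumption)
points = [(1.0, 2.0, -1.0), (0.5, -1.5, 2.0), (-2.0, 0.3, 0.7)]
errors = [abs(potential(p) - V_exact(*p)) for p in points]
print(errors)
```

With $x_0 = 0$ the construction reproduces $x y^2 z^3$ itself. For another base point the result would differ from it by a constant, in line with the remark that a potential is not uniquely determined.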
Problems
1) Calculate
\[
\int_r F \cdot dr .
\]
Note that the paths in d)-g) all start and end at the same points.
a) Fpx, y q : py, 2xq , x, y
b)
c)
d)
e)
f)
P R , rptq : pt, t2 q , t P r0, 1s
Fpx, y q : p3y, 4xq , x, y P R ,
rptq : pt2 , t3 q , t P r0, 2s ,
Fpx, y q : px2 3xy, xy y 2 q , x, y P R ,
rptq : pcos t, sin tq , t P rπ, π s ,
Fpx, y q : p2xy, x2 q , x, y P R ,
rptq : p2t, tq , t P r0, 1s ,
Fpx, y q : p2xy, x2 q , x, y P R ,
rptq : p2t, t1{2 q , t P r0, 1s ,
Fpx, y q : p2xy, x2 q , x, y P R ,
,
r : pr1 , r2 q , r1 ptq : p0, tq , t P r0, 1s ,
r2 psq : ps, 1q , s P r0, 2s ,
PR,
r : pr1 , r2 q , r1 ptq : pt, 0q , t P r0, 2s ,
r2 psq : p2, sq , s P r0, 1s ,
h) Fpx, y, z q : py z, z x, x y q , x, y, z P R ,
rptq : pcos t, sin t, tq , t P r0, 2π s ,
i) Fpx, y, z q : py, z, xq , x, y, z P R ,
rptq : pR cos α cos t, R cos α sin t, R sin αq , t P r0, 2π s ,
αPR .
If possible, find a potential function V : DpF q Ñ R of class C 1 for
F where DpF q denotes the domain of F . Otherwise, give reasons
g)
2)
Fpx, y q : p2xy, x2 q , x, y
why there is no such function.
a) Fpx, y q : p1
b)
c) Fpx, y q : p2xy
d)
xq , x, y
y, 1
Fpx, y q : px, 2y q , x, y
y2
PR
PR
,
,
2xy 2 , x2
2xy
Fpx, y q : pe cos y, e sin y q , x, y
x
x
PR
2x2 y q , x, y
e) Fpx, y, z q : py z , 2xyz , 2xy z q , x, y, z
2
,
,
PR ,
f) Fpx, y, z q : px, x z y, 3x y z q , x, y, z P R ,
g) Fpx, y, z q : |px, y, z q|3 px, y, z q , px, y, z q P R3 z t0u .
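3) Let $n \in \mathbb{N} \setminus \{0, 1\}$, $f : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$ be continuous. Define
$F : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}^n$ by
\[
F(x) := f(x).x
\]
for all $x \in \mathbb{R}^n \setminus \{0\}$. Calculate
\[
\int_r F \cdot dr
\]
where $r$ is a regular $C^1$-path whose range is part of $S_r^n(0)$ for some
$r > 0$.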
4) Define F : R2 zt0u Ñ R2 by
F px, y q :
for all px, y q P R2 zt0u.
2
2
px2 2xyy2 q2 , pxx2 yy2 q2
Fig. 194: Direction field associated to $F$ from Problem 4 and $\mathrm{Ran}(r)$ for $a = 1$.
a) Calculate
\[
\int_r F \cdot dr
\]
where $a > 0$ and $r : [0, 2\pi] \to \mathbb{R}^2$ is given by
\[
r(t) := a.(\cos t, \sin t)
\]
for all $t \in [0, 2\pi]$. Note the difference of the result to that of Example 4.5.6.
b) If possible, find a potential function $V : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ of class
$C^1$ for $F$. Otherwise, give reasons why such a function does not
exist.
c) Calculate
\[
\int_r F \cdot dr
\]
where $r$ is any regular $C^1$-path that assumes values in $\mathbb{R}^2 \setminus \{0\}$
and has initial point $p$ and end point $q$.
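5) As in Example 4.5.6, define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[
F(x,y) := \left( - \frac{y}{x^2+y^2} \, , \; \frac{x}{x^2+y^2} \right)
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Further, let $a > 0$, $r : [0,1] \to \mathbb{R}^2 \setminus \{0\}$ be a
regular $C^1$-path such that $r(0) = r(1) = (-a, 0)$ and such that the
$y$-component of $r$ is $< 0$ on $(0, \varepsilon)$ and $> 0$ on $(1 - \varepsilon, 1)$ for some
$\varepsilon > 0$.
a) Find a potential $V : \mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\}) \to \mathbb{R}$ of class $C^1$ for the
restriction of $F$ to $\mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\})$.
b) Calculate
\[
\int_r F \cdot dr .
\]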
4.6 Generalizations of the Fundamental Theorem of Calculus

In the following, we consider generalizations of the fundamental theorem
of calculus, Theorem 2.6.21, to vector-valued functions of several variables.
Those generalizations have important applications in the theory of partial
differential equations and connected areas, e.g., electrodynamics and fluid
mechanics.

Fig. 195: Domain of integration in the motivation of Green’s formula. See text.

For motivation, we calculate the integral of a partial derivative of a function
in two variables over the region $\Omega \subset \mathbb{R}^2$ under the graph of a function
$f : [a,b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only
positive ($\ge 0$) values, i.e., $\Omega$ is given by
\[
\Omega := \{(x,y) \in \mathbb{R}^2 : a \le x \le b \, \wedge \, 0 \le y \le f(x)\} ,
\]
see Fig 195. In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open interval $I$ of $\mathbb{R}$ containing $[a,b]$. As a consequence, the graph of $f$ is part of the image of the map
$h : I \to \mathbb{R}^2$ of class $C^1$ defined by
\[
h(x) := (x, \hat{f}(x))
\]
for every $x \in I$ and hence is negligible. From this, we conclude that the
boundary of $\Omega$, given by
\[
\bigcup_{i=1}^{4} B_i ,
\]
where
\[
B_1 := [a,b] \times \{0\} \, , \; B_2 := \{b\} \times [0, f(b)] \, , \;
B_3 := G(f) \, , \; B_4 := \{a\} \times [0, f(a)] ,
\]
is a negligible set. Further, let $U$ be an open subset of $\mathbb{R}^2$ containing $\Omega$ and
$F_1 : U \to \mathbb{R}$ be of class $C^1$. For later use, we define a corresponding vector
field $F : U \to \mathbb{R}^2$ by
\[
F(x,y) := (F_1(x,y), 0)
\]
for all $(x,y) \in U$. Then, we conclude by Fubini’s theorem and the fundamental theorem of calculus, Theorem 2.6.21, that
\begin{align*}
\int_\Omega \frac{\partial F_1}{\partial y} \, dxdy
&= \int_a^b \left( \int_0^{f(x)} [F_1(x, \cdot)]'(y) \, dy \right) dx
= \int_a^b [ F_1(x, f(x)) - F_1(x, 0) ] \, dx \\
&= \int_a^b F_1(x, f(x)) \, dx - \int_a^b F_1(x, 0) \, dx .
\end{align*}
We observe that the last two integrals ‘are’ in fact path integrals. Indeed, if
we define $r_1 : [a,b] \to \mathbb{R}^2$, $r_3 : [a,b] \to \mathbb{R}^2$ by
\[
r_1(x) := (x, 0) \, , \; r_3(x) := (x, f(x))
\]
for every $x \in [a,b]$, then $r_1, r_3$ are regular $C^1$-paths traversing parts of the
boundary of $\Omega$ and
\[
\int_{r_1} F \cdot dr = \int_a^b F_1(x, 0) \, dx \; , \quad
\int_{r_3^{-}} F \cdot dr = - \int_a^b F_1(x, f(x)) \, dx ,
\]
where $r_3^{-}$ denotes the inverse path to $r_3$, see Definition 4.5.8.
Further, if we define $r_2 : [0, f(b)] \to \mathbb{R}^2$, $r_4 : [0, f(a)] \to \mathbb{R}^2$ by
\[
r_2(s) := (b, s) \, , \; r_4(t) := (a, t)
\]
for every $s \in [0, f(b)]$ and $t \in [0, f(a)]$, then $r_2, r_4$ are regular $C^1$-paths
traversing the remaining parts of the boundary of $\Omega$ such that
\[
\int_{r_2} F \cdot dr = \int_{r_4^{-}} F \cdot dr = 0
\]
since the tangent vectors of the paths are orthogonal to $F$ in every point.
Hence the piecewise $C^1$-path $r := (r_1, r_2, r_3^{-}, r_4^{-})$ traverses the whole boundary of $\Omega$ such that
\[
\int_\Omega \frac{\partial F_1}{\partial y} \, dxdy = - \int_r F \cdot dr . \tag{4.6.1}
\]
The last is a special case of the so-called ‘Green’s formula’, see Theorems 4.6.5,
4.6.7.
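The special case (4.6.1) can be checked numerically for a concrete region. In the sketch below, $f(x) = 1 + x^2$ on $[0,1]$ and $F_1(x,y) = x y^2$ are assumptions chosen for illustration. For this choice both sides of (4.6.1) have absolute value $7/6$.

```python
def midpoint(fun, a, b, n=400):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

a, b = 0.0, 1.0

def f(x):
    # upper boundary of the region (illustrative choice)
    return 1.0 + x * x

def F1(x, y):
    return x * y * y

def dF1_dy(x, y):
    return 2.0 * x * y

# iterated integral of dF1/dy over the region under the graph of f
area_integral = midpoint(lambda x: midpoint(lambda y: dF1_dy(x, y), 0.0, f(x)), a, b)

# F = (F1, 0), so only the x-component of each tangent vector contributes
I_r1 = midpoint(lambda x: F1(x, 0.0), a, b)   # along r1(x) = (x, 0)
I_r3 = midpoint(lambda x: F1(x, f(x)), a, b)  # along r3(x) = (x, f(x))
I_r2 = 0.0  # tangent (0, 1) of r2 is orthogonal to F
I_r4 = 0.0  # tangent (0, 1) of r4 is orthogonal to F

# r = (r1, r2, r3^-, r4^-): reversed pieces enter with a minus sign
boundary_integral = I_r1 + I_r2 - I_r3 - I_r4
print(area_integral, -boundary_integral)
```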
We make several observations about the structure of the last result.
First, it reduces the calculation of the integral of a derivative of a function
in two variables to that of a path integral, i.e., essentially to the calculation
of an integral of a function of one variable. This is similar to the fundamental theorem of calculus if we interpret the evaluation of differences of
an antiderivative at the endpoints of an interval of integration as a kind of
‘integration’ in ‘0-dimensions’. In this sense, we can view the result as a
generalization of the fundamental theorem of calculus. The ‘derivative’
\[
- \frac{\partial F_1}{\partial y} \tag{4.6.2}
\]
of the vector field $F$ does not look very natural. Later on, we will see that
that derivative is given by
\[
\mathrm{curl}\, F = \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}
\]
for more general vector fields $F$ where $F_2$ denotes the corresponding second component function. In the case that $F_2$ vanishes, this reduces to
(4.6.2). Also the last derivative does not seem very natural since it is
unsymmetrical in the components of the vector field. An understanding
of the structure of such derivatives can be achieved by the introduction of so-called
‘differential forms’ as is done in differential geometry courses. See
also [63], XXI. Such forms will not be introduced in this course and consequently no explanation of the structure of such derivatives will be given.
Apart from practical reasons, the mathematical reason for not introducing
differential forms is the fact that, beyond the explanation of that structure
of the derivatives, differential forms are not of much further use in this connection because they usually make unnecessarily strong assumptions on the
differentiability of vector fields / differential forms. As a consequence, the
integral theorems of Green, Stokes and Gauss obtained by those methods
are usually weak, even compared to those from the present text that uses
quite elementary methods. By use of the Lebesgue integral, the methods in
this text would lead to far stronger results.
An additional observation concerns the definition of the vector field F .
What would have been the result if we had defined $F$ by
\[
F(x,y) := (0, F_1(x,y))
\]
for all $(x,y) \in U$, instead? Also this would have been a good choice, and
we would have arrived at a special case of a version of Gauss’ theorem, see
Theorem 4.6.27, in two space dimensions. The main difference of that approach is that it arrives at boundary integrals that are not path integrals, but
integrals that describe the flow of a vector field through the boundary. In
connection with Gauss’ and Stokes’ theorems, we will have to define such
flow integrals later on. From the discussion, the reader can also correctly
conclude that there are several forms of generalizations of the fundamental
theorem of calculus to vector-valued functions of several variables. What
form of generalization is used depends on the application at hand.
A final observation concerns the peculiar way the path $r$ traverses the boundary
of $\Omega$ in (4.6.1). It starts at the point $(a, 0)$, proceeds through $(b, 0)$,
$(b, f(b))$, $(a, f(a))$ and ends in $(a, 0)$. In this way, it traverses the points
of the boundary in counterclockwise direction. The last direction is also
called ‘mathematically positive’. The reader might wonder how this direction can be decided in general without the use of geometric intuition. For
this, we observe that the path $r$ separates $\mathbb{R}^2$ into two regions, a part which
is ‘outside’ the boundary of $\Omega$ and a part that is ‘inside’ that boundary. In
every point of $r$ for which there exists a corresponding tangent, there are
two directions that are orthogonal to the tangent. One is pointing towards
the outside and the other one is pointing towards the inside. For such points
on $B_1, B_2, B_3, B_4$, the outward pointing direction is given by
\[
(0, -1) \, , \; (1, 0) \, , \; \alpha(x).(-f'(x), 1) \, , \; (-1, 0)
\]
for every $x \in (a,b)$, respectively, where
\[
\alpha(x) := \frac{1}{\sqrt{1 + [f'(x)]^2}}
\]
for every $x \in (a,b)$, and the direction of the tangent is given by
\[
(1, 0) \, , \; (0, 1) \, , \; - \alpha(x).(1, f'(x)) \, , \; (0, -1)
\]
for every $x \in (a,b)$, respectively. Therefore, we conclude that their corresponding determinants are given by
\begin{align*}
& \det((0,-1), (1,0)) = 1 \, , \; \det((1,0), (0,1)) = 1 \, , \\
& \det(\alpha(x).(-f'(x), 1), \, - \alpha(x).(1, f'(x))) = 1 \, , \; \det((-1,0), (0,-1)) = 1
\end{align*}
for every $x \in (a,b)$. Since the change of all signs of the entries of one
row of a determinant leads to an overall change of sign, we conclude that
those determinants are all equal to $-1$ if we replace all outward pointing
directions by the corresponding inward pointing directions. Hence in the
application of (4.6.1), the piecewise $C^1$-path $r$ traversing the boundary of
$\Omega$ needs to be chosen in such a way that, in all points where a tangent vector
exists, the determinant of the orthogonal direction pointing towards the outside and the tangent vector is $> 0$. Still, the question remains whether every
closed continuous path $r : [c,d] \to \mathbb{R}^2$, where $c, d \in \mathbb{R}$ are such that $c < d$,
without self-intersection, i.e., such that $r(t_1) \neq r(t_2)$ for different $t_1, t_2 \in (c,d)$,
separates $\mathbb{R}^2$ into two regions that both have its range as boundary. Indeed,
this is the case according to the Jordan curve theorem. For an elementary
proof of this theorem, see [80].
After this introduction, we start with the definition of the orientation of
$n$-tuples of vectors in $\mathbb{R}^n$ where $n \in \mathbb{N}$ is such that $n \ge 2$.
Definition 4.6.1. (The orientation of $n$-tuples of vectors in $\mathbb{R}^n$) Let $n \in \mathbb{N}$
be such that $n \ge 2$, $(a_1, \dots, a_n)$ be an $n$-tuple of vectors in $\mathbb{R}^n$. Then we
say that $(a_1, \dots, a_n)$ is positively oriented, negatively oriented if
\[
\det(a_1, \dots, a_n) > 0
\]
and
\[
\det(a_1, \dots, a_n) < 0 ,
\]
respectively. Note that exchanging the order of two elements in a positively
oriented $n$-tuple leads to a negatively oriented $n$-tuple and vice versa. In
particular, since
\[
\det(e_1, \dots, e_n) = 1 > 0
\]
the $n$-tuple $(e_1, \dots, e_n)$ consisting of the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$
is positively oriented.
Example 4.6.2. If $a = (a_1, a_2) \in \mathbb{R}^2 \setminus \{0\}$ and $b \in \mathbb{R}^2 \setminus \{0\}$ has the same direction as the rotation of $a$ in counterclockwise (= mathematically positive)
sense around the origin by the angle $\alpha \in (0, \pi)$, then
\[
b = \lambda.(a_1 \cos(\alpha) - a_2 \sin(\alpha), \; a_1 \sin(\alpha) + a_2 \cos(\alpha))
\]
for some $\lambda > 0$ and
\[
\det(a, b) = \lambda
\begin{vmatrix}
a_1 & a_1 \cos(\alpha) - a_2 \sin(\alpha) \\
a_2 & a_1 \sin(\alpha) + a_2 \cos(\alpha)
\end{vmatrix}
= \lambda \, |a|^2 \sin(\alpha) > 0
\]
and hence the pair $(a, b)$ in $\mathbb{R}^2$ is positively oriented. This fact is often used
to decide whether a given pair of vectors in $\mathbb{R}^2$ is positively oriented.

Fig. 196: Since $0 < \alpha < \pi$, the pair of vectors $(a, b)$ in $\mathbb{R}^2$ is positively oriented. See Example 4.6.2.

Fig. 197: The triple of vectors $(a, b, a \times b)$ in $\mathbb{R}^3$ is positively oriented. See Example 4.6.3.
Example 4.6.3. If $a, b \in \mathbb{R}^3$ are vectors that are not multiples of each other,
it follows by Remark 3.5.15 and Definition 3.5.18 that
\[
\det(a, b, a \times b) = \det(a \times b, a, b) = |a \times b|^2 > 0
\]
and hence that the triple $(a, b, a \times b)$ in $\mathbb{R}^3$ is positively oriented. In applications, this is often used for the construction of positively oriented triples
in $\mathbb{R}^3$.
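The two determinant criteria of Examples 4.6.2 and 4.6.3 are easy to evaluate numerically. In the following sketch, the helper functions `det2`, `det3` and `cross` as well as the sample vectors, the angle and the factor $\lambda$ are assumptions chosen for illustration.

```python
import math

def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c, computed as a . (b x c)."""
    bc = cross(b, c)
    return a[0] * bc[0] + a[1] * bc[1] + a[2] * bc[2]

# Example 4.6.2: b is a positive multiple of a rotated counterclockwise by alpha
a = (2.0, 1.0)
alpha, lam = 0.7, 1.5
b = (lam * (a[0] * math.cos(alpha) - a[1] * math.sin(alpha)),
     lam * (a[0] * math.sin(alpha) + a[1] * math.cos(alpha)))
d2 = det2(a, b)
d2_expected = lam * (a[0] ** 2 + a[1] ** 2) * math.sin(alpha)

# Example 4.6.3: the triple (u, w, u x w)
u = (1.0, 2.0, 3.0)
w = (-1.0, 0.5, 2.0)
n = cross(u, w)
d3 = det3(u, w, n)
d3_expected = sum(c * c for c in n)
print(d2, d2_expected, d3, d3_expected)
```

Both determinants are positive, so the pair and the triple are positively oriented in the sense of Definition 4.6.1.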
4.6.1 Green’s Theorem
We continue with Green’s theorem for images of rectangles under certain differentiable maps. The basis for its proof is given by the following
Lemma referring to transformation properties of the curl of a vector field.
The lemma can be proved by a straightforward calculation using the chain
rule in the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18.
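Lemma 4.6.4. Let $V$ be a non-empty open subset of $\mathbb{R}^2$ and $F = (F_1, F_2) : V \to \mathbb{R}^2$
be differentiable. Further, let $g = (g_1, g_2) : D(g) \to \mathbb{R}^2$ be
defined and of class $C^2$ on a non-empty open subset $D(g)$ of $\mathbb{R}^2$ and such
that $g(D(g)) \subset V$. Then
\begin{align*}
& \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right]
- \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] \\
& \quad = \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \det(g') .
\end{align*}
Proof. The proof proceeds by a simple calculation using the chain rule in
the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18.
\begin{align*}
& \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right]
- \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] \\
& = \frac{\partial (F_1 \circ g)}{\partial x} \frac{\partial g_1}{\partial y} + (F_1 \circ g) \frac{\partial^2 g_1}{\partial x \partial y}
+ \frac{\partial (F_2 \circ g)}{\partial x} \frac{\partial g_2}{\partial y} + (F_2 \circ g) \frac{\partial^2 g_2}{\partial x \partial y} \\
& \quad - \frac{\partial (F_1 \circ g)}{\partial y} \frac{\partial g_1}{\partial x} - (F_1 \circ g) \frac{\partial^2 g_1}{\partial y \partial x}
- \frac{\partial (F_2 \circ g)}{\partial y} \frac{\partial g_2}{\partial x} - (F_2 \circ g) \frac{\partial^2 g_2}{\partial y \partial x} \\
& = \frac{\partial (F_1 \circ g)}{\partial x} \frac{\partial g_1}{\partial y} + \frac{\partial (F_2 \circ g)}{\partial x} \frac{\partial g_2}{\partial y}
- \frac{\partial (F_1 \circ g)}{\partial y} \frac{\partial g_1}{\partial x} - \frac{\partial (F_2 \circ g)}{\partial y} \frac{\partial g_2}{\partial x} \\
& = \left[ \left( \frac{\partial F_1}{\partial x} \circ g \right) \frac{\partial g_1}{\partial x} + \left( \frac{\partial F_1}{\partial y} \circ g \right) \frac{\partial g_2}{\partial x} \right] \frac{\partial g_1}{\partial y}
+ \left[ \left( \frac{\partial F_2}{\partial x} \circ g \right) \frac{\partial g_1}{\partial x} + \left( \frac{\partial F_2}{\partial y} \circ g \right) \frac{\partial g_2}{\partial x} \right] \frac{\partial g_2}{\partial y} \\
& \quad - \left[ \left( \frac{\partial F_1}{\partial x} \circ g \right) \frac{\partial g_1}{\partial y} + \left( \frac{\partial F_1}{\partial y} \circ g \right) \frac{\partial g_2}{\partial y} \right] \frac{\partial g_1}{\partial x}
- \left[ \left( \frac{\partial F_2}{\partial x} \circ g \right) \frac{\partial g_1}{\partial y} + \left( \frac{\partial F_2}{\partial y} \circ g \right) \frac{\partial g_2}{\partial y} \right] \frac{\partial g_2}{\partial x} \\
& = \left( \frac{\partial F_1}{\partial y} \circ g \right) \left( \frac{\partial g_2}{\partial x} \frac{\partial g_1}{\partial y} - \frac{\partial g_2}{\partial y} \frac{\partial g_1}{\partial x} \right)
+ \left( \frac{\partial F_2}{\partial x} \circ g \right) \left( \frac{\partial g_1}{\partial x} \frac{\partial g_2}{\partial y} - \frac{\partial g_1}{\partial y} \frac{\partial g_2}{\partial x} \right) \\
& = \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \det(g') .
\end{align*}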
Green’s theorem for images of rectangles under certain differentiable maps
is a consequence of the previous Lemma, Lemma 4.6.4, and change of
variables, Theorem 4.4.23.
Theorem 4.6.5. (Green’s theorem for images of rectangles) Let $a, b, c, d \in \mathbb{R}$
be such that $a < b$ and $c < d$ and $I := [a,b] \times [c,d]$, $I_0 := (a,b) \times (c,d)$. Further, let $U \supset I$ be an open subset of $\mathbb{R}^2$, $g : U \to \mathbb{R}^2$ be twice
continuously differentiable such that the induced map from $U$ to $g(U)$ is
bijective with a continuously differentiable inverse and such that $\det(g') > 0$.
Finally, let $V \supset g(I)$ be an open subset of $\mathbb{R}^2$ and $F = (F_1, F_2) : V \to \mathbb{R}^2$
be continuously differentiable. Then
\[
\int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_r F \cdot dr \tag{4.6.3}
\]
for any piecewise $C^1$-parametrization $r$ of the boundary of $g(I_0)$ which is
of the same orientation as the piecewise $C^2$-path $(r_c, r_b, r_d^{-}, r_a^{-})$ where
\[
r_c(x) := g(x, c) \, , \; r_b(y) := g(b, y) \, , \; r_d(x) := g(x, d) \, , \; r_a(y) := g(a, y)
\]
for all $x \in [a,b]$ and $y \in [c,d]$.

Fig. 198: Illustration for the proof of Green’s theorem, Theorem 4.6.5.
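Proof. In a first step, we consider the set $g(I_0)$. Since $g$ is twice continuously differentiable with a continuously differentiable inverse, $g(I_0)$ is a
bounded open subset in $\mathbb{R}^2$. Further, the restriction of
\[
\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}
\]
to $g(I_0)$ is bounded. In addition, it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of this function to a function, defined on a
closed subinterval $J$ of $\mathbb{R}^2$ containing $g(I_0)$ and assuming the value zero in
the points of $J \setminus g(I_0)$, is Riemann-integrable. Hence by Theorem 4.4.23,
it follows in a second step that
\[
\int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy
= \int_{I_0} \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \det(g') \, dxdy
\]
and hence by the previous Lemma 4.6.4 that
\begin{align*}
\int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy
&= \int_{I_0} \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] dxdy \\
& \quad - \int_{I_0} \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] dxdy . \tag{4.6.4}
\end{align*}
Further, by Fubini’s Theorem 4.4.18 and the fundamental theorem of calculus Theorem 2.6.21, it follows that
\begin{align*}
& \int_{I_0} \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] dxdy \tag{4.6.5} \\
& = \int_c^d \left[ (F_1 \circ g)(b,y) \, \frac{\partial g_1}{\partial y}(b,y) + (F_2 \circ g)(b,y) \, \frac{\partial g_2}{\partial y}(b,y) \right] dy \\
& \quad - \int_c^d \left[ (F_1 \circ g)(a,y) \, \frac{\partial g_1}{\partial y}(a,y) + (F_2 \circ g)(a,y) \, \frac{\partial g_2}{\partial y}(a,y) \right] dy
\end{align*}
and
\begin{align*}
& \int_{I_0} \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] dxdy \tag{4.6.6} \\
& = \int_a^b \left[ (F_1 \circ g)(x,d) \, \frac{\partial g_1}{\partial x}(x,d) + (F_2 \circ g)(x,d) \, \frac{\partial g_2}{\partial x}(x,d) \right] dx \\
& \quad - \int_a^b \left[ (F_1 \circ g)(x,c) \, \frac{\partial g_1}{\partial x}(x,c) + (F_2 \circ g)(x,c) \, \frac{\partial g_2}{\partial x}(x,c) \right] dx .
\end{align*}
Since $r_b'(y) = (\partial g / \partial y)(b,y)$, the first integral on the right hand side of (4.6.5) is the path integral of $F$ along $r_b$, and analogously the remaining three integrals are the path integrals of $F$ along $r_a$, $r_d$ and $r_c$, respectively.
Finally, (4.6.3) follows from (4.6.4), (4.6.5) and (4.6.6).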
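Formula (4.6.3) can be tested numerically on a simple example. The sketch below assumes the polar coordinate map $g(\rho, \theta) = (\rho \cos\theta, \rho \sin\theta)$ on $I = [1,2] \times [0, \pi/2]$, for which $\det(g') = \rho > 0$, and the field $F(x,y) = (-y^3, x^3)$. Both are choices made for illustration. The tangent vectors of the boundary paths are approximated by central differences.

```python
import math

def midpoint(fun, a, b, n=400):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

def path_integral(F, r, a, b, n=2000, eps=1e-6):
    """Approximates int_a^b r'(t) . F(r(t)) dt with a central-difference tangent."""
    def integrand(t):
        x0, y0 = r(t - eps)
        x1, y1 = r(t + eps)
        Fx, Fy = F(*r(t))
        return (Fx * (x1 - x0) + Fy * (y1 - y0)) / (2 * eps)
    return midpoint(integrand, a, b, n)

def g(rho, theta):
    # polar coordinates; det g' = rho > 0 on the rectangle below
    return (rho * math.cos(theta), rho * math.sin(theta))

def F(x, y):
    return (-y**3, x**3)

def curl_F(x, y):
    # dF2/dx - dF1/dy
    return 3.0 * x * x + 3.0 * y * y

a, b, c, d = 1.0, 2.0, 0.0, math.pi / 2

# left hand side of (4.6.3), evaluated by the change of variables (x, y) = g(rho, theta)
lhs = midpoint(lambda rho: midpoint(lambda th: curl_F(*g(rho, th)) * rho, c, d), a, b)

# right hand side along (r_c, r_b, r_d^-, r_a^-); inverse paths enter with a minus sign
rhs = (path_integral(F, lambda x: g(x, c), a, b)
       + path_integral(F, lambda y: g(b, y), c, d)
       - path_integral(F, lambda x: g(x, d), a, b)
       - path_integral(F, lambda y: g(a, y), c, d))
print(lhs, rhs, 45 * math.pi / 8)
```

For this example both sides equal $45\pi/8$, and the two approximations agree with this value up to the discretization error.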
Remark 4.6.6. For the orientation of the piecewise $C^2$-path $(r_c, r_b, r_d^{-}, r_a^{-})$
note that the region $g(I_0)$ is bounded and hence that there is an outward pointing unit normal for every point on its boundary, apart from the corner points
$g(a,c)$, $g(b,c)$, $g(b,d)$ and $g(a,d)$. Outward pointing vectors are given by
\begin{align*}
v_c(x,c) &:= - \left( \frac{\partial g_1}{\partial y}(x,c) \, , \; \frac{\partial g_2}{\partial y}(x,c) \right) , \\
v_b(b,y) &:= \left( \frac{\partial g_1}{\partial x}(b,y) \, , \; \frac{\partial g_2}{\partial x}(b,y) \right) , \\
v_d(x,d) &:= \left( \frac{\partial g_1}{\partial y}(x,d) \, , \; \frac{\partial g_2}{\partial y}(x,d) \right) , \\
v_a(a,y) &:= - \left( \frac{\partial g_1}{\partial x}(a,y) \, , \; \frac{\partial g_2}{\partial x}(a,y) \right)
\end{align*}
for every $x \in (a,b)$ and $y \in (c,d)$. In particular, as a consequence of the
assumption that $\det(g') > 0$ in the previous Theorem 4.6.5, it follows that
\[
\det(v_c(x,c), r_c'(x)) \, , \; \det(v_b(b,y), r_b'(y)) \, , \; \det(v_d(x,d), - r_d'(x)) \, , \;
\det(v_a(a,y), - r_a'(y)) > 0
\]
for all $x \in (a,b)$ and $y \in (c,d)$, where $- r_d'(x)$ and $- r_a'(y)$ are tangent vectors of the inverse paths $r_d^{-}$ and $r_a^{-}$ in the points $g(x,d)$ and $g(a,y)$, respectively. Hence the orientation for the piecewise $C^1$-parametrization $r$ of the boundary of $g(I_0)$ in Theorem 4.6.5 has to be such
that the outward unit normal and the tangent vector in a point of the boundary of $g(I_0)$ are positively oriented in every point of the boundary, apart
from a finite number of points. This orientation is indicated in Fig. 198.

Fig. 199: Illustration for the proof of Green’s theorem, Theorem 4.6.7.
From Green’s theorem for images of rectangles, we can conclude Green’s
theorem for regions bounded by graphs. The last has wider applications.
Theorem 4.6.7. (Green’s theorem for regions bounded by graphs) Let
$a, b \in \mathbb{R}$ be such that $a < b$, $f_1 : [a,b] \to \mathbb{R}$ and $f_2 : [a,b] \to \mathbb{R}$ be
restrictions of twice continuously differentiable functions defined on open
intervals containing $[a,b]$. In addition, let $f_1, f_2$ be such that $f_1(x) < f_2(x)$
for all $x \in (a,b)$.$^1$ Further, let
\[
\Omega := \{(x,y) \in \mathbb{R}^2 : a < x < b \, \wedge \, f_1(x) < y < f_2(x)\} .
\]
In particular, let $\Omega$ be such that there is a $0 < \delta < (b - a)/2$ such that the
corresponding sets $\Omega \cap ((a, a + \delta) \times \mathbb{R})$ and $\Omega \cap ((b - \delta, b) \times \mathbb{R})$ are convex.
Finally, let $F = (F_1, F_2) : V \to \mathbb{R}^2$ be continuously differentiable where
$V$ is an open subset of $\mathbb{R}^2$ containing $\Omega$ and its boundary. Then:
\[
\int_\Omega \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_r F \cdot dr \tag{4.6.7}
\]
for any piecewise $C^1$-parametrization $r$ of the boundary of $\Omega$ which is of
the same (‘mathematically positive’, ‘counterclockwise’) orientation as the
piecewise $C^2$-path $(r_1, r_b, r_2^{-}, r_a^{-})$ where
\begin{align*}
r_1(x) &:= (x, f_1(x)) \, , \; r_b(\lambda) := (b, f_1(b) + \lambda \, [f_2(b) - f_1(b)]) \, , \\
r_2(x) &:= (x, f_2(x)) \, , \; r_a(\lambda) := (a, f_1(a) + \lambda \, [f_2(a) - f_1(a)])
\end{align*}
for all $x \in [a,b]$ and $\lambda \in [0,1]$.
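Formula (4.6.7) can be checked numerically for a region bounded by two graphs. The following sketch assumes $f_1(x) = x^2$ and $f_2(x) = x$ on $[0,1]$, so that the vertical parts of the boundary degenerate to points, together with the field $F(x,y) = (-y^2, xy)$. All of these are illustrative choices. For this example both sides of (4.6.7) equal $1/5$.

```python
def midpoint(fun, a, b, n=400):
    """Midpoint-rule approximation of int_a^b fun(t) dt."""
    h = (b - a) / n
    return sum(fun(a + (k + 0.5) * h) for k in range(n)) * h

def path_integral(F, r, a, b, n=2000, eps=1e-6):
    """Approximates int_a^b r'(t) . F(r(t)) dt with a central-difference tangent."""
    def integrand(t):
        x0, y0 = r(t - eps)
        x1, y1 = r(t + eps)
        Fx, Fy = F(*r(t))
        return (Fx * (x1 - x0) + Fy * (y1 - y0)) / (2 * eps)
    return midpoint(integrand, a, b, n)

a, b = 0.0, 1.0
f1 = lambda x: x * x   # lower graph
f2 = lambda x: x       # upper graph, f1 < f2 on (0, 1)

def F(x, y):
    return (-y * y, x * y)

def curl_F(x, y):
    # dF2/dx - dF1/dy = y + 2 y
    return 3.0 * y

# left hand side of (4.6.7) as an iterated integral
lhs = midpoint(lambda x: midpoint(lambda y: curl_F(x, y), f1(x), f2(x)), a, b)

# boundary pieces r1, rb, r2, ra as in the theorem
r1 = lambda x: (x, f1(x))
r2 = lambda x: (x, f2(x))
rb = lambda lam: (b, f1(b) + lam * (f2(b) - f1(b)))
ra = lambda lam: (a, f1(a) + lam * (f2(a) - f1(a)))

# orientation (r1, rb, r2^-, ra^-): inverse paths enter with a minus sign
rhs = (path_integral(F, r1, a, b) + path_integral(F, rb, 0.0, 1.0)
       - path_integral(F, r2, a, b) - path_integral(F, ra, 0.0, 1.0))
print(lhs, rhs)
```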
Proof. For this, define the open subset U of R2 by U : pa, bq R and
g : U Ñ U by
g px, λq : px , f1 pxq
1
λ [f2 pxq f1 pxq]q
Note that we do not demand that f1 paq f2 paq or that f1 pbq f2 pbq. As a consequence, in Fig. 199, each of the line segments of the boundary of Ω that are parallel to
the y-axis can consist of one point, only.
705
for all px, λq P U . In particular, g is bijective, of class C 2 with an inverse
of class C 2 given by
g 1 px, y q x ,
y f1 pxq
f2 pxq f1 pxq
for all px, y q P U .
detpg 1 px, λqq f2 pxq f1 pxq ¡ 0
for all px, λq P U . Further,
g ppa, bq p0, 1qq Ω ,
(4.6.8)
and hence Ω is an open subset of R2 . The validity of (4.6.8) can be seen as
follows. First, for px, λq P pa, bq p0, 1q, it follows that
f1 pxq f1 pxq
λ [f2 pxq f1 pxq] f1 pxq
f2 pxq f1 pxq f2 pxq
and hence that g px, λq P Ω. Second, for px, y q P Ω, it follows that
0 y f1 pxq
f2 pxq f1 pxq
ff2ppxxqq ff1ppxxqq 1
2
1
and hence that g 1 px, y q P pa, bq p0, 1q. In the following, let 0 δ pb aq{2 be such that the corresponding sets Ω X ppa, a δq Rq and
Ω X ppb δ, bq Rq are convex. Further, let 0 ε δ, Iε : ra ε, b εs r0, 1s and I0,ε : pa ε, b εq p0, 1q. Then U  Iε and V  g pIε q.
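Hence it follows by Theorem 4.6.5 that
\[ \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx\,dy = \int_{r_\varepsilon} F \cdot dr_\varepsilon \tag{4.6.9} \]
where $r_\varepsilon$ is the piecewise $C^2$-path $(r_{1,\varepsilon}, r_{b_\varepsilon}, r_{2,\varepsilon}^{-}, r_{a_\varepsilon}^{-})$ given by
\[ r_{1,\varepsilon}(x) := (x, f_1(x)) , \quad r_{b_\varepsilon}(\lambda) := (b-\varepsilon, f_1(b-\varepsilon) + \lambda\,[f_2(b-\varepsilon) - f_1(b-\varepsilon)]) , \]
\[ r_{2,\varepsilon}(x) := (x, f_2(x)) , \quad r_{a_\varepsilon}(\lambda) := (a+\varepsilon, f_1(a+\varepsilon) + \lambda\,[f_2(a+\varepsilon) - f_1(a+\varepsilon)]) \]
for all $x \in [a+\varepsilon, b-\varepsilon]$ and $\lambda \in [0,1]$. In the following final step of the proof, we show that (4.6.7) follows from (4.6.9) by performing the limit $\varepsilon \to 0$. For this, let $\hat{f}_1, \hat{f}_2 : (a-\delta', b+\delta') \to \mathbb{R}$ be twice continuously differentiable extensions of $f_1$ and $f_2$, respectively, for some $\delta' > 0$. Then by
\[ \hat{g}(x,\lambda) := (x , \hat{f}_1(x) + \lambda\,[\hat{f}_2(x) - \hat{f}_1(x)]) \]
for all $(x,\lambda) \in (a-\delta', b+\delta') \times \mathbb{R}$, there is defined a twice continuously differentiable extension of $g$, and hence it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of
\[ \left. \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \right|_{\Omega} \]
to a function that is defined on a closed subinterval $J$ of $\mathbb{R}^2$ containing $\Omega$ and assuming the value zero in the points of $J \setminus \Omega$, is Riemann-integrable.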
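Further, it follows that
\[ \left| \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx\,dy - \int_{\Omega} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx\,dy \right| \leq 2 M_1 \Big( \max_{x \in [a,b]} [f_2(x) - f_1(x)] \Big)\, \varepsilon \]
where $M_1 \geq 0$ denotes the maximum of
\[ \left| \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right| \]
on some closed subset that is contained in $V$ and at the same time contains $\Omega$, and where the maximum of $f_2 - f_1$ bounds the height of the two strips of width $\varepsilon$ that are removed from $\Omega$. Hence,
\[ \lim_{\varepsilon \to 0} \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx\,dy = \int_{\Omega} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dx\,dy . \]
Further,
\[
\begin{aligned}
\left| \int_{r_\varepsilon} F \cdot dr_\varepsilon - \int_{r} F \cdot dr \right|
\leq{} & \left| \int_a^{a_\varepsilon} F(x, f_1(x)) \cdot (1, f_1'(x))\, dx \right| + \left| \int_a^{a_\varepsilon} F(x, f_2(x)) \cdot (1, f_2'(x))\, dx \right| \\
& + \left| \int_0^1 [f_2(a_\varepsilon) - f_1(a_\varepsilon)]\, F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda\,[f_2(a_\varepsilon) - f_1(a_\varepsilon)])\, d\lambda \right. \\
& \qquad \left. - \int_0^1 [f_2(a) - f_1(a)]\, F_2(a, f_1(a) + \lambda\,[f_2(a) - f_1(a)])\, d\lambda \right| \\
& + \left| \int_{b_\varepsilon}^{b} F(x, f_1(x)) \cdot (1, f_1'(x))\, dx \right| + \left| \int_{b_\varepsilon}^{b} F(x, f_2(x)) \cdot (1, f_2'(x))\, dx \right| \\
& + \left| \int_0^1 [f_2(b_\varepsilon) - f_1(b_\varepsilon)]\, F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda\,[f_2(b_\varepsilon) - f_1(b_\varepsilon)])\, d\lambda \right. \\
& \qquad \left. - \int_0^1 [f_2(b) - f_1(b)]\, F_2(b, f_1(b) + \lambda\,[f_2(b) - f_1(b)])\, d\lambda \right|
\end{aligned}
\]
where $r := (r_1, r_b, r_2^{-}, r_a^{-})$ and $a_\varepsilon := a + \varepsilon$, $b_\varepsilon := b - \varepsilon$. In the following, we estimate the individual terms of the last sum. First,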
\[ \left| \int_I F(x, f_1(x)) \cdot (1, f_1'(x))\, dx \right| \leq \int_I |F(x, f_1(x))|\, |(1, f_1'(x))|\, dx \leq \varepsilon M_2 (1 + M_3^2)^{1/2} , \]
\[ \left| \int_I F(x, f_2(x)) \cdot (1, f_2'(x))\, dx \right| \leq \int_I |F(x, f_2(x))|\, |(1, f_2'(x))|\, dx \leq \varepsilon M_2 (1 + M_4^2)^{1/2} \]
for every interval $I \subset [a,b]$ of length $\varepsilon$. Here $M_2 \geq 0$ denotes the maximum of $|F|$ on some closed subset that is contained in $V$ and at the same time contains $\Omega$; $M_3 \geq 0$ denotes the maximum of the restriction of $|\hat{f}_1'|$ to $[a,b]$; $M_4 \geq 0$ denotes the maximum of the restriction of $|\hat{f}_2'|$ to $[a,b]$.
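Second, it follows by use of Taylor’s Theorem 4.3.6 that
\[
\begin{aligned}
& \left| \int_0^1 [f_2(a_\varepsilon) - f_1(a_\varepsilon)]\, F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda\,[f_2(a_\varepsilon) - f_1(a_\varepsilon)])\, d\lambda - \int_0^1 [f_2(a) - f_1(a)]\, F_2(a, f_1(a) + \lambda\,[f_2(a) - f_1(a)])\, d\lambda \right| \\
& \quad \leq |f_2(a_\varepsilon) - f_1(a_\varepsilon)| \int_0^1 \big| F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda\,[f_2(a_\varepsilon) - f_1(a_\varepsilon)]) - F_2(a, f_1(a) + \lambda\,[f_2(a) - f_1(a)]) \big|\, d\lambda \\
& \qquad + |f_2(a_\varepsilon) - f_2(a) - f_1(a_\varepsilon) + f_1(a)| \int_0^1 |F_2(a, f_1(a) + \lambda\,[f_2(a) - f_1(a)])|\, d\lambda \\
& \quad \leq \left\{ M_2 (M_3 + M_4) + M_7 (M_5 + M_6) \left[ 1 + (M_3 + M_4)^2 \right]^{1/2} \right\} \varepsilon
\end{aligned}
\]
and that
\[
\begin{aligned}
& \left| \int_0^1 [f_2(b_\varepsilon) - f_1(b_\varepsilon)]\, F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda\,[f_2(b_\varepsilon) - f_1(b_\varepsilon)])\, d\lambda - \int_0^1 [f_2(b) - f_1(b)]\, F_2(b, f_1(b) + \lambda\,[f_2(b) - f_1(b)])\, d\lambda \right| \\
& \quad \leq |f_2(b_\varepsilon) - f_1(b_\varepsilon)| \int_0^1 \big| F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda\,[f_2(b_\varepsilon) - f_1(b_\varepsilon)]) - F_2(b, f_1(b) + \lambda\,[f_2(b) - f_1(b)]) \big|\, d\lambda \\
& \qquad + |f_2(b_\varepsilon) - f_2(b) - f_1(b_\varepsilon) + f_1(b)| \int_0^1 |F_2(b, f_1(b) + \lambda\,[f_2(b) - f_1(b)])|\, d\lambda \\
& \quad \leq \left\{ M_2 (M_3 + M_4) + M_7 (M_5 + M_6) \left[ 1 + (M_3 + M_4)^2 \right]^{1/2} \right\} \varepsilon .
\end{aligned}
\]
Here $M_5 \geq 0$ denotes the maximum of $|f_1|$ on $[a,b]$; $M_6 \geq 0$ denotes the maximum of $|f_2|$ on $[a,b]$; $M_7 \geq 0$ denotes the maximum of $|\nabla F_2|$ on some closed subset that is contained in $V$ and at the same time contains $\Omega$. As a consequence, it follows that
\[ \lim_{\varepsilon \to 0} \int_{r_\varepsilon} F \cdot dr_\varepsilon = \int_{r} F \cdot dr \]
and hence, finally, (4.6.7).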
Remark 4.6.8. Note that in the previous Theorem 4.6.7, the assumption of convexity of $\Omega \cap ((a, a+\delta) \times \mathbb{R})$ for some $0 < \delta < (b-a)/2$ is redundant if $f_1(a) = f_2(a)$, and the assumption of convexity of $\Omega \cap ((b-\delta, b) \times \mathbb{R})$ for some $0 < \delta < (b-a)/2$ is redundant in the case that $f_1(b) = f_2(b)$.
Remark 4.6.9. Green’s theorem can be generalized to regions which can
be dissected into regions that satisfy the demands of Theorem 4.6.5 or Theorem 4.6.7. Green’s theorems, Theorem 4.6.5 or Theorem 4.6.7, are then
applied to the parts of the dissection. In this, cuts are traversed twice, but
in opposite directions such that their contributions cancel in the sum. For
such a case, see Example 4.6.12.
Example 4.6.10. (Area of the interior of ellipse) Let a, b be strictly positive real numbers such that a b and
\[ U := \left\{ (x,y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} < 1 \right\} \]
be the interior of the ellipse with half-axes $a$ and $b$ around the origin. Then $r : [-\pi, \pi] \to \mathbb{R}^2$, defined by $r(\varphi) := (a \cos\varphi, b \sin\varphi)$ for all $\varphi \in [-\pi, \pi]$, is a $C^1$-parametrization of that ellipse with positive orientation. Finally, define the vector field $F : \mathbb{R}^2 \to \mathbb{R}^2$ by $F(x,y) := \frac{1}{2}(-y, x)$ for all $(x,y) \in \mathbb{R}^2$. Then it follows by Theorem 4.6.7 that
\[ \int_U dx\,dy = \int_r F \cdot dr = \frac{1}{2} \int_{-\pi}^{\pi} (-b \sin t, a \cos t) \cdot (-a \sin t, b \cos t)\, dt = \pi a b . \]
In this way, the area enclosed by the ellipse has been calculated by evaluation of a path integral.
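The path integral of Example 4.6.10 can also be checked numerically. The following is a minimal sketch, assuming a midpoint rule with `n` subintervals; the function name `ellipse_area_by_path_integral` and the sample half-axes are illustrative choices, not part of the text.

```python
import math

def ellipse_area_by_path_integral(a, b, n=2000):
    """Midpoint-rule value of the path integral of F(x, y) = (-y/2, x/2)
    along r(phi) = (a cos(phi), b sin(phi)), phi in [-pi, pi]."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = -math.pi + (k + 0.5) * h               # midpoint of the k-th subinterval
        x, y = a * math.cos(phi), b * math.sin(phi)  # r(phi)
        dx, dy = -a * math.sin(phi), b * math.cos(phi)  # r'(phi)
        total += 0.5 * (-y * dx + x * dy) * h        # F(r(phi)) . r'(phi) dphi
    return total

a, b = 2.0, 1.0  # illustrative half-axes
print(ellipse_area_by_path_integral(a, b), math.pi * a * b)
```

Both printed numbers should agree up to rounding, since the integrand $F(r(\varphi)) \cdot r'(\varphi)$ is the constant $ab/2$.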
Example 4.6.11. (Area of the interior of a plane curve given in polar coordinates) Let $\Omega$ be a subset of $\mathbb{R}^2$ that satisfies the assumptions for $\Omega$ in Theorem 4.6.7. Further, let $u$ be a positively oriented $C^1$-parametrization of $\partial\Omega$ given as follows. For this, let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a,b]$, $r : I \to \mathbb{R}$ and $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a,b)$ with derivatives that can be extended to continuous functions on $I$. Then by
\[ u(t) := ( r(t) \cos\varphi(t) , r(t) \sin\varphi(t) ) \]
for every $t \in I$, there is defined a $C^1$-path. Note that for $t \in I$, $r(t)$ and $\varphi(t)$ can be interpreted as polar coordinates of $u(t)$ if $r(t) > 0$ and $\varphi(t) \in (-\pi, \pi)$. In particular for $t \in (a,b)$,
\[ u'(t) = ( r'(t) \cos\varphi(t) - r(t)\, \varphi'(t) \sin\varphi(t) , \; r'(t) \sin\varphi(t) + r(t)\, \varphi'(t) \cos\varphi(t) ) . \]
As in the previous example, we define the vector field $F : \mathbb{R}^2 \to \mathbb{R}^2$ by $F(x,y) := \frac{1}{2}(-y, x)$ for all $(x,y) \in \mathbb{R}^2$. Then it follows by Theorem 4.6.7 that
\[ \int_{\Omega} dx\,dy = \int_u F \cdot du = \frac{1}{2} \int_a^b ( -r(t) \sin\varphi(t) , r(t) \cos\varphi(t) ) \cdot u'(t)\, dt = \frac{1}{2} \int_a^b r^2(t)\, \varphi'(t)\, dt . \]
In this way, the area of $\Omega$ can be calculated by a Riemann integral of a function in one variable.
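As an illustration of the formula of Example 4.6.11, the following sketch evaluates $\frac{1}{2} \int_a^b r^2(t)\, \varphi'(t)\, dt$ by a midpoint rule. The curve $r(t) = 2 + \cos t$, $\varphi(t) = t$, $t \in [-\pi, \pi]$ is an assumed test case chosen here, and it is taken for granted that the enclosed region meets the hypotheses of the theorem; the function name `polar_area` is hypothetical.

```python
import math

def polar_area(r, phi_prime, a, b, n=20000):
    """Midpoint-rule approximation of (1/2) * integral_a^b r(t)^2 * phi'(t) dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += 0.5 * r(t) ** 2 * phi_prime(t) * h
    return total

# Assumed test curve: r(t) = 2 + cos(t), phi(t) = t on [-pi, pi].
# Integrating (1/2)(4 + 4 cos t + cos^2 t) by hand gives the area 9*pi/2.
limacon_area = polar_area(lambda t: 2.0 + math.cos(t), lambda t: 1.0, -math.pi, math.pi)
print(limacon_area, 4.5 * math.pi)
```

For $r \equiv 1$ the same function returns the area $\pi$ of the unit disk, which gives a quick sanity check.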
The following example gives a typical application of Green’s theorem in the area of partial differential equations. It considers solutions of wave equations. Ultimately, it will lead to the proof of the causal behavior of the solutions, i.e., the fact that two solutions, whose values coincide on an interval $I \subset \mathbb{R}$ at time $t = 0$ and whose partial time derivatives coincide on that same set, coincide on the area of a certain ‘characteristic triangle’ that is contained in $I \times [0, \infty)$ and has $I$ as basis.
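Example 4.6.12. (An energy inequality for a wave equation in one space dimension) We consider a function $u : U \to \mathbb{R}$ of class $C^2$ that satisfies the wave equation
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} + V u = 0 , \tag{4.6.10} \]
where $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\mathrm{Ran}(V) \subset [0, \infty)$, and is such that
\[ \frac{\partial V}{\partial t} = 0 . \]
[Fig. 200: Domain of integration in Example 4.6.12. Horizontal axis $x$ with marks $\xi-\tau$, $\xi-(\tau-T)$, $\xi$, $\xi+(\tau-T)$, $\xi+\tau$; vertical axis $t$ with marks $T$, $\tau$; the region is labeled $A$.]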
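In this, $U$ is a non-empty open subset of $\mathbb{R}^2$. Then the functions $\epsilon$, $j$ defined by
\[ \epsilon := \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + \left( \frac{\partial u}{\partial x} \right)^2 + V u^2 \right] , \quad j := - \frac{\partial u}{\partial x} \frac{\partial u}{\partial t} \]
satisfy
\[
\begin{aligned}
\frac{\partial \epsilon}{\partial t} &= \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial t \partial x} + V u \frac{\partial u}{\partial t}
= \frac{\partial u}{\partial t} \left( \frac{\partial^2 u}{\partial t^2} + V u \right) + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial x \partial t} \\
&= \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial x^2} + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial x \partial t} = - \frac{\partial j}{\partial x} .
\end{aligned}
\]
Hence we conclude the conservation law
\[ \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} = 0 . \tag{4.6.11} \]
Note for later use that
\[ \epsilon(x,t) \geq |j(x,t)| \tag{4.6.12} \]
for all $(x,t) \in U$. In physical applications, $\epsilon$ is called the energy density (corresponding to $u$) and $j$ is called the energy flux density (corresponding to $u$). Integration of $\epsilon(\cdot, t)$ over an interval of $\mathbb{R}$ gives the energy of $u$ that is contained in that interval at time $t \in \mathbb{R}$. The function $j$ describes the flow of that energy. In the following, we derive an important consequence of (4.6.11). For this, let $(\xi, \tau) \in \mathbb{R} \times (0, \infty)$, and let the area enclosed by the triangle with corners $(\xi-\tau, 0)$, $(\xi+\tau, 0)$ and $(\xi, \tau)$ be contained in $U$. We integrate (4.6.11) over the subarea $A$ enclosed by the trapezoid with corners $(\xi-\tau, 0)$, $(\xi+\tau, 0)$, $(\xi+(\tau-T), T)$ and $(\xi-(\tau-T), T)$ where $0 \leq T < \tau$. We will show that the energy content at time $T$ in the interval $[\xi-(\tau-T), \xi+(\tau-T)]$ is equal to or smaller than the energy content at time $0$ in the interval $[\xi-\tau, \xi+\tau]$:
\[ \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T)\, dx \leq \int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0)\, dx . \tag{4.6.13} \]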
Indeed, it follows that
\[ 0 = \int_A \left( \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} \right) dx\,dt = \sum_{i=1}^{3} \int_{A_i} \left( \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} \right) dx\,dt = \int_r (-\epsilon, j) \cdot dr \]
where $r$ is the piecewise $C^2$-path $(r_1, r_2, r_3^{-}, r_4^{-})$,
\[ r_1(y_1) := (y_1, 0) , \quad r_2(\lambda) := (\xi + \tau - \lambda T, \lambda T) , \]
\[ r_3(y_3) := (y_3, T) , \quad r_4(\lambda) := (\xi - \tau + \lambda T, \lambda T) \]
for all $y_1 \in [\xi-\tau, \xi+\tau]$, $y_3 \in [\xi-(\tau-T), \xi+(\tau-T)]$ and $\lambda \in [0,1]$. Note that in this, we dissect $A$ into the area $A_1$ enclosed by the triangle with corners $(\xi-\tau, 0)$, $(\xi-(\tau-T), 0)$, $(\xi-(\tau-T), T)$, the area $A_2$ enclosed by the rectangle with corners $(\xi-(\tau-T), 0)$, $(\xi+(\tau-T), 0)$, $(\xi+(\tau-T), T)$, $(\xi-(\tau-T), T)$, and the area $A_3$ enclosed by the triangle with corners $(\xi+(\tau-T), 0)$, $(\xi+\tau, 0)$, $(\xi+(\tau-T), T)$, and apply Green’s Theorem 4.6.7 to these areas. The cuts are traversed twice, but in opposite directions such that their contributions cancel in the sum as indicated in Fig. 200.
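Further, we conclude that
\[
\begin{aligned}
0 ={} & - \int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0)\, dx + \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T)\, dx \\
& + \int_0^1 (-\epsilon, j)(r_2(\lambda)) \cdot (-T, T)\, d\lambda - \int_0^1 (-\epsilon, j)(r_4(\lambda)) \cdot (T, T)\, d\lambda
\end{aligned}
\]
and hence that
\[
\begin{aligned}
\int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0)\, dx - \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T)\, dx
&= \int_0^1 (-\epsilon, j)(r_2(\lambda)) \cdot (-T, T)\, d\lambda - \int_0^1 (-\epsilon, j)(r_4(\lambda)) \cdot (T, T)\, d\lambda \\
&\geq \int_0^1 (-\epsilon, |j|)(r_2(\lambda)) \cdot (-T, -T)\, d\lambda - \int_0^1 (-\epsilon, |j|)(r_4(\lambda)) \cdot (T, T)\, d\lambda \geq 0
\end{aligned}
\]
where in the last step (4.6.12) has been used. Hence (4.6.13) follows. As an application of the energy inequality (4.6.13), we assume that $v : U \to \mathbb{R}$ is another solution of (4.6.10) such that
\[ u(x,0) = v(x,0) , \quad \frac{\partial u}{\partial t}(x,0) = \frac{\partial v}{\partial t}(x,0) \]
for all $x \in [\xi-\tau, \xi+\tau]$. Then $u - v$ is a solution of (4.6.10) such that
\[ (u-v)(x,0) = 0 , \quad \frac{\partial (u-v)}{\partial t}(x,0) = 0 \]
for all $x \in [\xi-\tau, \xi+\tau]$ and hence the corresponding energy density vanishes at time $0$ on $[\xi-\tau, \xi+\tau]$. As a consequence of (4.6.13) and the positivity of the energy density, it follows that the same is true at time $T$ on $[\xi-(\tau-T), \xi+(\tau-T)]$. Since this is true for every $T \in [0, \tau)$, the first order partial derivatives of $u - v$ vanish on these intervals, so that $u - v$ is constant there with the value $0$ assumed at time $0$. Since $u - v$ is continuous, it follows that $u$ and $v$ coincide in every point from the closed area that is enclosed by the triangle with corners $(\xi-\tau, 0)$, $(\xi+\tau, 0)$ and $(\xi, \tau)$. Note that this triangle is isosceles with a right angle and $\pi/4$ radian angles at the corners $(\xi-\tau, 0)$, $(\xi+\tau, 0)$. In addition, note that
\[ (\xi, \tau) = \big( [ (\xi+\tau) + (\xi-\tau) ]/2 , \; [ (\xi+\tau) - (\xi-\tau) ]/2 \big) . \]
As a consequence, we have the following result.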
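Theorem 4.6.13. (Uniqueness of the solutions of a wave equation in one space dimension) Let $U$ be a non-empty open subset of $\mathbb{R}^2$ and $u : U \to \mathbb{R}$, $v : U \to \mathbb{R}$ be of class $C^2$ and such that
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} + V u = \frac{\partial^2 v}{\partial t^2} - \frac{\partial^2 v}{\partial x^2} + V v = 0 , \]
where $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\mathrm{Ran}(V) \subset [0, \infty)$, and satisfies
\[ \frac{\partial V}{\partial t} = 0 . \]
Further, let
\[ u(x, t_0) = v(x, t_0) , \quad \frac{\partial u}{\partial t}(x, t_0) = \frac{\partial v}{\partial t}(x, t_0) \]
for some $t_0 \in \mathbb{R}$ and all $x$ from some closed interval $[a,b]$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Then
\[ u(x,t) = v(x,t) \]
for all $(x,t)$ from the closed area that is bounded by the isosceles right triangle with corners
\[ (a, t_0) , \quad (b, t_0) , \quad ((a+b)/2, \; t_0 + (b-a)/2) , \]
provided that this area is contained in $U$.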
Proof. For the case $t_0 = 0$, the result was proved in the previous example. If $t_0 \neq 0$, then $U_0 := \{ (x, t - t_0) : (x,t) \in U \}$, $V_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto V(x, t + t_0))$, $u_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto u(x, t + t_0))$, $v_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto v(x, t + t_0))$ satisfy the assumptions of the theorem for the case $t_0 = 0$. Hence it follows that $u(x, t + t_0) = u_0(x,t) = v_0(x,t) = v(x, t + t_0)$ for all $(x,t)$ from the closed area that is enclosed by the triangle with corners
\[ (a, 0) , \quad (b, 0) , \quad ((a+b)/2, \; (b-a)/2) . \]
Hence it follows that $u(x,t) = v(x,t)$ for all $(x,t)$ from the closed area that is enclosed by the triangle with corners
\[ (a, t_0) , \quad (b, t_0) , \quad ((a+b)/2, \; t_0 + (b-a)/2) . \]
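The conservation law (4.6.11) and the estimate (4.6.12) of Example 4.6.12 can be illustrated on an explicit solution. The sketch below assumes the constant potential $V = 1$ and the plane wave $u(x,t) = \cos(kx - \omega t)$ with $\omega^2 = k^2 + 1$, which solves (4.6.10) for this $V$; the wave number, the function names and the finite-difference step are illustrative assumptions.

```python
import math

K = 1.5                           # wave number (illustrative choice)
OMEGA = math.sqrt(K * K + 1.0)    # omega^2 = k^2 + V with the assumed V = 1

def energy_density(x, t):
    """(1/2) * (u_t^2 + u_x^2 + V u^2) for u = cos(K x - OMEGA t), V = 1."""
    s, c = math.sin(K * x - OMEGA * t), math.cos(K * x - OMEGA * t)
    u_t, u_x, u = OMEGA * s, -K * s, c
    return 0.5 * (u_t ** 2 + u_x ** 2 + u ** 2)

def energy_flux(x, t):
    """j = -u_x * u_t for the same plane wave."""
    s = math.sin(K * x - OMEGA * t)
    return -(-K * s) * (OMEGA * s)

def conservation_defect(x, t, h=1e-5):
    """Central-difference value of dj/dx + d(energy density)/dt, expected near 0."""
    d_eps_dt = (energy_density(x, t + h) - energy_density(x, t - h)) / (2.0 * h)
    d_j_dx = (energy_flux(x + h, t) - energy_flux(x - h, t)) / (2.0 * h)
    return d_j_dx + d_eps_dt

for x, t in [(0.3, 0.1), (1.0, 2.0), (-2.0, 0.7)]:
    print(conservation_defect(x, t), energy_density(x, t) >= abs(energy_flux(x, t)))
```

The printed defects are only finite-difference approximations, so they are small rather than exactly zero.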
Subsequently, we derive the theorems of Gauss and Stokes. As mentioned
in the introduction, a part of the integrals occurring in these theorems describe flows of vector fields through surfaces. For the definition of such
integrals, we need to introduce the notion of parametric surfaces.
Definition 4.6.14. (Parametric surfaces) Let $p \in \mathbb{N}^*$. A $C^p$-parametric surface (in $\mathbb{R}^3$) is a pair $(S, r)$ consisting of a subset $S$ (‘the surface’) of $\mathbb{R}^3$ and an injective map (‘parametrization’) $r$ of class $C^p$ from some open subset $U$ of $\mathbb{R}^2$ into $\mathbb{R}^3$ with range $S$. To $(S, r)$ there is an associated normal field given by
\[ n(x,y) := \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \]
for every $(x,y) \in U$. Hence for every $(x_0, y_0) \in U$ the tangent plane to $S$ in $r(x_0, y_0) \in S$ is given by
\[ n(x_0, y_0) \cdot ( (x, y, z) - r(x_0, y_0) ) = 0 . \]
As a side remark, such surfaces are examples of $C^p$-manifolds defined in differential geometry.
Example 4.6.15. (Examples of parametric surfaces)

(i) Let $p \in \mathbb{N}^*$ and $f$ be a function of class $C^p$ defined on some non-empty open subset $U$ of $\mathbb{R}^2$. Then $(G(f), r_f)$ is a $C^p$-parametric surface where
\[ r_f(x,y) := (x, y, f(x,y)) \]
for all $(x,y) \in U$. The corresponding normal field $n$ is given by
\[ n(x,y) = \left( - \frac{\partial f}{\partial x}(x,y) , - \frac{\partial f}{\partial y}(x,y) , 1 \right) , \]
$(x,y) \in U$, and the tangent plane at $G(f)$ in a point $(x_0, y_0, f(x_0, y_0))$ is given by
\[ z = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\, (x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)\, (y - y_0) \]
for all $(x,y) \in \mathbb{R}^2$. The last is identical to the definition given in Definition 4.2.9.

(ii) Denote by $S^2$ the sphere of radius 1 centered at the origin. Then $(S^2 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}), r)$ is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$. Here
\[ r(\theta, \varphi) := (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta) \]
for all $\theta \in (0, \pi)$, $\varphi \in (-\pi, \pi)$. The corresponding normal field $n$ is given by
\[ n(\theta, \varphi) = \sin\theta \,.\, r(\theta, \varphi) , \]
$\theta \in (0, \pi)$, $\varphi \in (-\pi, \pi)$.

(iii) Denote by $Z^2$ the circular cylinder of radius 1 with axis given by the z-axis. Then $(Z^2 \setminus ((-\infty, 0] \times \{0\} \times \mathbb{R}), r)$ is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$ where
\[ r(\varphi, z) := (\cos\varphi, \sin\varphi, z) \]
for all $\varphi \in (-\pi, \pi)$, $z \in \mathbb{R}$. The corresponding normal field $n$ is given by
\[ n(\varphi, z) = (\cos\varphi, \sin\varphi, 0) , \]
$\varphi \in (-\pi, \pi)$, $z \in \mathbb{R}$.

[Fig. 201: Torus corresponding to $r = 1$ and $R = 2.5$. Axes: $x$, $y$, $z$.]

(iv) Denote by $T^2$ the torus obtained by rotating around the z-axis the circle of radius $r > 0$ in the y, z plane centered at the point $(0, R, 0)$ where $R > r$. Then
\[ \Big( T^2 \setminus \big[ ((-\infty, 0] \times \{0\} \times \mathbb{R}) \cup (S^1_{R-r}(0) \times \{0\}) \big] , \; r \Big) \]
is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$ where
\[ r(\varphi, \theta) := ( \cos\varphi\, (R + r \cos\theta) , \; \sin\varphi\, (R + r \cos\theta) , \; r \sin\theta ) \]
for all $\varphi, \theta \in (-\pi, \pi)$. The corresponding normal field $n$ is given by
\[ n(\varphi, \theta) = (R + r \cos\theta) \,.\, ( r \cos\varphi \cos\theta , \; r \sin\varphi \cos\theta , \; r \sin\theta ) \]
for all $\varphi, \theta \in (-\pi, \pi)$.
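The normal fields listed in Example 4.6.15 can be cross-checked against Definition 4.6.14 by approximating the partial derivatives with central differences. The sketch below does this for the sphere of item (ii); the helper names `cross`, `normal_field` and `sphere`, the step size and the sample point are assumptions made for this illustration.

```python
import math

def cross(a, b):
    """Vector product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal_field(r, x, y, h=1e-6):
    """Approximate n(x, y) = r_x(x, y) x r_y(x, y) by central differences."""
    r_x = tuple((p - m) / (2.0 * h) for p, m in zip(r(x + h, y), r(x - h, y)))
    r_y = tuple((p - m) / (2.0 * h) for p, m in zip(r(x, y + h), r(x, y - h)))
    return cross(r_x, r_y)

def sphere(theta, phi):
    """Parametrization of the unit sphere from item (ii)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

theta, phi = 1.0, 0.3  # sample parameter values
numeric = normal_field(sphere, theta, phi)
exact = tuple(math.sin(theta) * c for c in sphere(theta, phi))  # sin(theta) . r
print(numeric)
print(exact)
```

The same `normal_field` helper can be applied to the cylinder and torus parametrizations of items (iii) and (iv).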
4.6.2 Stokes’ Theorem
Below, we introduce the notion of flux integrals that describe the flow of vector fields through parametrized surfaces. Such integrals appear in Stokes’ theorem and also appear as boundary integrals in Gauss’ theorem.

[Fig. 202: Fluid volume flown through $R$ after time $\Delta t = 7$ sec for $v = (0.5, 0, 0)$ m/sec, $\Delta y = 2$ m, $\Delta z = 1$ m.]
Example 4.6.16. (Motivation for the definition of the flux of a vector field across a surface) Consider a constant flow $v = (v_x, v_y, v_z)$ (length / time) of a fluid with constant mass density $\rho$ (mass / volume) across the area $R$ enclosed by a rectangle with sides $\Delta y$, $\Delta z$ in the y, z-plane. Imagine $R$ to be part of a closed surface such that the outer normal to $R$ is given by $n := e_x$. Then the mass transported across $R$ in the direction of $n$ during the time $\Delta t$ is given by
\[ \rho\, v_x\, \Delta t\, \Delta y\, \Delta z . \]
Note that it is negative if $v_x < 0$ because of our use of the outer normal. ‘Inflow’ ($v_x < 0$) is counted negatively, whereas ‘outflow’ ($v_x > 0$) is counted positively. The corresponding rate of mass transport across $R$ is given by
\[ \rho\, v_x\, \Delta y\, \Delta z = \int_R \rho\, v \cdot n \; dy\,dz = \int_R \rho\, v \cdot \left( \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right) dy\,dz \tag{4.6.14} \]
where the parametrization $r(y,z) := (0, y, z)$ for all $(y,z)$ from the projection of $R$ into the y, z-plane has been used. Note that in the special case that $\rho = 1$ and
\[ v = n := \left| \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right|^{-1} . \left( \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right) , \]
it follows from (4.6.14) that
\[ \int_R \left| \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right| dy\,dz \]
coincides with the area of $R$.
Motivated by the previous example, we define the following.
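Definition 4.6.17. (Flux of a vector field across a $C^1$-parametric surface) Let $(S, r)$ be a $C^1$-parametric surface and $F : S \to \mathbb{R}^3$ be a continuous vector field on $S$. Finally, let
\[ (F \circ r) \cdot \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right) \]
be Riemann-integrable. Then we define the flux of $F$ across $S$ by
\[ \int_S F \cdot dS := \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dx\,dy . \]
In particular, we define the area $A$ of $S$ as the flux (if existent) corresponding to the special case that $F$ coincides with the unit normal field induced by $r$. Hence $A$ is defined by
\[ A := \int_{D(r)} \left| \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right| dx\,dy , \]
if
\[ \left| \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right| \]
is Riemann integrable.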
The following shows that the flux through parametric surfaces $(S, r_1)$, $(S, r_2)$ is the same if the parametrizations $r_1$ and $r_2$ are related by an ‘orientation preserving map’. In this sense, the value of the flux integral is determined by the vector field and the set $S$ alone. This fact is important for the use of flux integrals in applications.
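Theorem 4.6.18. (Invariance under reparametrization) Let $(S, r)$ be a $C^1$-parametric surface and $F : S \to \mathbb{R}^3$ be a continuous vector field on $S$ such that
\[ (F \circ r) \cdot \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right) \]
is Riemann-integrable. Moreover, let $V$ be an open subset of $\mathbb{R}^2$, $g : V \to D(r)$ be continuously differentiable with a continuously differentiable inverse and such that $\det(g') > 0$. Then $(S, r \circ g)$ is a $C^1$-parametric surface and
\[ \int_V F((r \circ g)(s,t)) \cdot \left( \frac{\partial (r \circ g)}{\partial s}(s,t) \times \frac{\partial (r \circ g)}{\partial t}(s,t) \right) ds\,dt = \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dx\,dy . \tag{4.6.15} \]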
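Proof. By the chain rule for partial derivatives Corollary 4.2.25, it follows that
\[ \frac{\partial (r \circ g)}{\partial s}(s,t) = \frac{\partial g_1}{\partial s}(s,t) \,.\, \frac{\partial r}{\partial x}(g(s,t)) + \frac{\partial g_2}{\partial s}(s,t) \,.\, \frac{\partial r}{\partial y}(g(s,t)) , \]
\[ \frac{\partial (r \circ g)}{\partial t}(s,t) = \frac{\partial g_1}{\partial t}(s,t) \,.\, \frac{\partial r}{\partial x}(g(s,t)) + \frac{\partial g_2}{\partial t}(s,t) \,.\, \frac{\partial r}{\partial y}(g(s,t)) \]
and hence that
\[ \frac{\partial (r \circ g)}{\partial s}(s,t) \times \frac{\partial (r \circ g)}{\partial t}(s,t) = \det(g'(s,t)) \,.\, \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right)(g(s,t)) \]
for all $(s,t) \in V$ where $g_1, g_2$ are the component maps of $g$. Hence (4.6.15), and with it the theorem, follows by the change of variable formula Theorem 4.4.23.

[Fig. 203: Sketch of $S$ from Example 4.6.19. Axes: $x$, $y$, $z$.]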
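Example 4.6.19. Calculate the flux of the vector field $F(x,y,z) := (z, x, 1)$, $(x,y,z) \in \mathbb{R}^3$, across the surface
\[ S := \{ (x,y,z) \in \mathbb{R}^3 : z \geq 0 , \; x^2 + y^2 + z = 1 \} . \]
Solution: Define
\[ r(x,y) := (x, y, 1 - x^2 - y^2) \]
for all $(x,y) \in \mathbb{R}^2$ such that $x^2 + y^2 \leq 1$. Then $(S, r)$ is a $C^2$-parametric surface and the corresponding flux across $S$ is given by
\[
\begin{aligned}
\int_S F \cdot dS &= \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dx\,dy \\
&= \int_{D(r)} (1 - x^2 - y^2, x, 1) \cdot (2x, 2y, 1)\, dx\,dy
= \int_{D(r)} \left[ 2x (1 - x^2 - y^2) + 2xy + 1 \right] dx\,dy \\
&= \int_0^1 \int_{-\pi}^{\pi} \left[ 2 r^2 (1 - r^2) \cos(\varphi) + r^3 \sin(2\varphi) + r \right] d\varphi\, dr = \pi .
\end{aligned}
\]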
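The value $\pi$ found in Example 4.6.19 can be reproduced numerically from Definition 4.6.17. The following sketch uses a midpoint rule in polar coordinates over the unit disk; the function name `flux_through_paraboloid` and the grid sizes are illustrative assumptions.

```python
import math

def flux_through_paraboloid(n_rho=200, n_phi=200):
    """Midpoint-rule approximation of the flux of F(x, y, z) = (z, x, 1)
    across r(x, y) = (x, y, 1 - x^2 - y^2), x^2 + y^2 <= 1."""
    h_rho, h_phi = 1.0 / n_rho, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * h_rho
        for k in range(n_phi):
            phi = -math.pi + (k + 0.5) * h_phi
            x, y = rho * math.cos(phi), rho * math.sin(phi)
            z = 1.0 - x * x - y * y
            F = (z, x, 1.0)                 # vector field on the surface
            n = (2.0 * x, 2.0 * y, 1.0)     # r_x x r_y for this parametrization
            dot = F[0] * n[0] + F[1] * n[1] + F[2] * n[2]
            total += dot * rho * h_rho * h_phi   # area element rho d(rho) d(phi)
    return total

print(flux_through_paraboloid(), math.pi)
```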
Example 4.6.20. Let $f$ be a function of class $C^1$ defined on a non-empty bounded open subset $U$ of $\mathbb{R}^2$. Then $(G(f), r_f)$ is a $C^1$-parametric surface where
\[ r_f(x,y) := (x, y, f(x,y)) \]
for all $(x,y) \in U$. The corresponding normal field $n$ is given by
\[ n(x,y) = \left( - \frac{\partial f}{\partial x}(x,y) , - \frac{\partial f}{\partial y}(x,y) , 1 \right) , \]
$(x,y) \in U$. If $|n| : U \to \mathbb{R}$ is Riemann integrable, the surface area $A$ of $G(f)$ is given by
\[ A = \int_U \sqrt{ 1 + |(\nabla f)(x,y)|^2 } \; dx\,dy . \]
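Example 4.6.21. (Area of a surface of revolution) Let $a, b \in \mathbb{R}$ be such that $a < b$, let $f : [a,b] \to [0, \infty)$ be a continuous function with a finite set $N_f$ of zeros which is the restriction of a continuously differentiable function defined on an open interval of $\mathbb{R}$ containing $[a,b]$, and let
\[ S := \left\{ (x,y,z) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(z) \wedge z \in [a,b] \right\} . \]
Note that $S$ is rotationally symmetric around the z-axis and can be thought of as obtained from a curve in the x, z-plane that is rotated around the z-axis. An injective parametrization of class $C^1$ of $S \setminus N$, where
\[
\begin{aligned}
N := {} & \left\{ (x, 0, z) \in \mathbb{R}^3 : x = -f(z) \wedge z \in [a,b] \right\} \\
& \cup \{ (x, y, a) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(a) \} \\
& \cup \{ (x, y, b) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(b) \} \\
& \cup \{ (0, 0, z) \in \mathbb{R}^3 : z \in N_f \} ,
\end{aligned}
\]
is given by $r : (-\pi, \pi) \times ((a,b) \setminus N_f) \to \mathbb{R}^3$ defined by
\[ r(\varphi, z) := ( f(z) \cos\varphi , f(z) \sin\varphi , z ) \]
for all $(\varphi, z) \in (-\pi, \pi) \times ((a,b) \setminus N_f)$. In particular,
\[ \frac{\partial r}{\partial \varphi}(\varphi, z) = ( -f(z) \sin\varphi , f(z) \cos\varphi , 0 ) , \]
\[ \frac{\partial r}{\partial z}(\varphi, z) = ( f'(z) \cos\varphi , f'(z) \sin\varphi , 1 ) , \]
\[ \frac{\partial r}{\partial \varphi}(\varphi, z) \times \frac{\partial r}{\partial z}(\varphi, z) = ( f(z) \cos\varphi , f(z) \sin\varphi , -f(z) f'(z) ) \]
and hence
\[ \left| \frac{\partial r}{\partial \varphi}(\varphi, z) \times \frac{\partial r}{\partial z}(\varphi, z) \right| = f(z) \left[ 1 + (f'(z))^2 \right]^{1/2} \]
for all $(\varphi, z) \in (-\pi, \pi) \times ((a,b) \setminus N_f)$. Since
\[ \left| \frac{\partial r}{\partial \varphi} \times \frac{\partial r}{\partial z} \right| \]
is Riemann integrable over $(-\pi, \pi) \times ((a,b) \setminus N_f)$, we define the area of $S$ by
\[ A := \int_{D(r)} \left| \frac{\partial r}{\partial \varphi}(\varphi, z) \times \frac{\partial r}{\partial z}(\varphi, z) \right| d\varphi\, dz = 2\pi \int_a^b f(z) \left[ 1 + (f'(z))^2 \right]^{1/2} dz . \tag{4.6.16} \]
In cases where $f$ is only almost everywhere differentiable on $(a,b)$ and such that the last integral exists as an improper Riemann integral, we use the last formula to define the area of a surface of revolution. This will be relevant in the following two examples.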
Example 4.6.22. Calculate the surface area $A_S$ of a sphere of radius $r > 0$ and the lateral surface area $A_C$ of a circular cylinder of radius $r$ and height $h > 0$. Solution: With $a = -r$, $b = r$,
\[ f(z) := \sqrt{r^2 - z^2} \]
for every $z \in [-r, r]$, it follows from (4.6.16) that
\[ A_S = 2\pi \int_{-r}^{r} \sqrt{r^2 - z^2} \left[ 1 + \frac{z^2}{r^2 - z^2} \right]^{1/2} dz = 2\pi \int_{-r}^{r} r\, dz = 4\pi r^2 . \]
Finally, with $a = 0$, $b = h$,
\[ f(z) := r \]
for every $z \in [0, h]$, it follows from (4.6.16) that
\[ A_C = 2\pi \int_0^h r\, dz = 2\pi r h . \]
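Example 4.6.23. Calculate the surface area $A_E$ of the rotational ellipsoid
\[ E := \left\{ (x,y,z) \in \mathbb{R}^3 : \frac{x^2}{R^2} + \frac{y^2}{R^2} + \frac{z^2}{r^2} = 1 \right\} \]
where $r > 0$, $R > 0$ are such that $r \neq R$. Solution: With $a = -r$, $b = r$,
\[ f(z) := \frac{R}{r} \sqrt{r^2 - z^2} \]
for every $z \in [-r, r]$, it follows from (4.6.16) that
\[
\begin{aligned}
A_E &= 2\pi \int_{-r}^{r} \frac{R}{r} \sqrt{r^2 - z^2} \left[ 1 + \frac{R^2}{r^2} \frac{z^2}{r^2 - z^2} \right]^{1/2} dz
= 2\pi R \int_{-r}^{r} \sqrt{ 1 - \frac{r^2 - R^2}{r^4}\, z^2 } \; dz \\
&= 4\pi R \int_0^r \sqrt{ 1 - \frac{r^2 - R^2}{r^4}\, z^2 } \; dz .
\end{aligned}
\]
Hence in the case that $r > R$ and by defining
\[ \varepsilon := \frac{1}{r^2} \sqrt{r^2 - R^2} , \]
we arrive at
\[
\begin{aligned}
A_E &= 4\pi R \int_0^r \sqrt{1 - \varepsilon^2 z^2} \; dz
= \frac{2\pi R}{\varepsilon} \left[ \varepsilon z \sqrt{1 - \varepsilon^2 z^2} + \arcsin(\varepsilon z) \right]_0^r
= \frac{2\pi R}{\varepsilon} \left[ \varepsilon r \sqrt{1 - \varepsilon^2 r^2} + \arcsin(\varepsilon r) \right] \\
&= \frac{2\pi r^2 R}{\sqrt{r^2 - R^2}} \left[ \frac{1}{r} \sqrt{r^2 - R^2}\, \sqrt{ 1 - \frac{r^2 - R^2}{r^2} } + \arcsin\!\left( \frac{1}{r} \sqrt{r^2 - R^2} \right) \right] \\
&= 2\pi r R \left[ \frac{R}{r} + \left( 1 - \frac{R^2}{r^2} \right)^{-1/2} \arcsin\!\left( \sqrt{ 1 - \frac{R^2}{r^2} } \right) \right] .
\end{aligned}
\]
In the case that $r < R$ and by defining
\[ \varepsilon := \frac{1}{r^2} \sqrt{R^2 - r^2} , \]
we arrive at
\[
\begin{aligned}
A_E &= 4\pi R \int_0^r \sqrt{1 + \varepsilon^2 z^2} \; dz
= \frac{2\pi R}{\varepsilon} \left[ \varepsilon z \sqrt{1 + \varepsilon^2 z^2} + \operatorname{arsinh}(\varepsilon z) \right]_0^r
= \frac{2\pi R}{\varepsilon} \left[ \varepsilon r \sqrt{1 + \varepsilon^2 r^2} + \operatorname{arsinh}(\varepsilon r) \right] \\
&= \frac{2\pi r^2 R}{\sqrt{R^2 - r^2}} \left[ \frac{1}{r} \sqrt{R^2 - r^2}\, \sqrt{ 1 + \frac{R^2 - r^2}{r^2} } + \operatorname{arsinh}\!\left( \frac{1}{r} \sqrt{R^2 - r^2} \right) \right] \\
&= 2\pi r R \left[ \frac{R}{r} + \left( \frac{R^2}{r^2} - 1 \right)^{-1/2} \operatorname{arsinh}\!\left( \sqrt{ \frac{R^2}{r^2} - 1 } \right) \right]
\end{aligned}
\]
where arsinh denotes the inverse function to sinh. For its existence, note that
\[ \sinh'(x) = \cosh(x) \geq 1 \]
for every $x \in \mathbb{R}$. Hence
\[ \sinh(x) = \int_0^x \cosh(y)\, dy \geq x \]
for $x \geq 0$ and
\[ \sinh(x) = - \int_x^0 \sinh'(y)\, dy \leq x \]
for $x \leq 0$. Hence it follows by Theorem 2.5.18 that sinh is bijective as well as that
\[ \operatorname{arsinh}'(x) = \frac{1}{\sinh'(\operatorname{arsinh}(x))} = \frac{1}{\cosh(\operatorname{arsinh}(x))} = \frac{1}{\sqrt{ 1 + \sinh^2(\operatorname{arsinh}(x)) }} = \frac{1}{\sqrt{1 + x^2}} \]
for every $x \in \mathbb{R}$.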
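The closed forms of Examples 4.6.22 and 4.6.23 can be compared with a direct numerical evaluation of (4.6.16). The sketch below is an illustration under stated assumptions: it uses a midpoint rule, which avoids the end points where $f'$ is unbounded, and the function names `revolution_area` and `ellipsoid_area_closed_form` are hypothetical.

```python
import math

def revolution_area(f, f_prime, a, b, n=20000):
    """Midpoint-rule approximation of 2*pi * integral_a^b f(z) sqrt(1 + f'(z)^2) dz."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        z = a + (k + 0.5) * h
        total += f(z) * math.sqrt(1.0 + f_prime(z) ** 2) * h
    return 2.0 * math.pi * total

def ellipsoid_area_closed_form(r, R):
    """Closed form of Example 4.6.23; requires r != R."""
    if r > R:
        s = math.sqrt(1.0 - (R / r) ** 2)
        return 2.0 * math.pi * r * R * (R / r + math.asin(s) / s)
    s = math.sqrt((R / r) ** 2 - 1.0)
    return 2.0 * math.pi * r * R * (R / r + math.asinh(s) / s)

def ellipsoid_area_numeric(r, R):
    """Area from (4.6.16) with f(z) = (R/r) sqrt(r^2 - z^2)."""
    f = lambda z: (R / r) * math.sqrt(r * r - z * z)
    f_prime = lambda z: -(R / r) * z / math.sqrt(r * r - z * z)
    return revolution_area(f, f_prime, -r, r)

for r, R in [(2.0, 1.0), (1.0, 2.0)]:
    print(ellipsoid_area_numeric(r, R), ellipsoid_area_closed_form(r, R))
```

Calling `ellipsoid_area_numeric(r, r)` gives an approximation of the sphere area $4\pi r^2$ of Example 4.6.22, for which the closed form above is not applicable.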
We continue with Stokes’ theorem which relates the flow of the curl of a
vector field through a parametric surface to the path integral of that field
along the boundary of the surface. The following version of Stokes’ theorem is a consequence of Green’s theorem and transformation properties of
the curl of a vector field under certain differentiable maps.
Theorem 4.6.24. (Stokes’ theorem) Let $(S, r)$ be a $C^2$-parametric surface and $F$ be a vector field of class $C^1$ defined on an open set containing $S$.¹ In particular, let $r$ be such that

(i) Its domain $\Omega$ is a non-empty bounded open subset of $\mathbb{R}^2$ for which Green’s theorem is valid, i.e., there is a piecewise $C^1$-path $\alpha : I \to \mathbb{R}^2$ defined on some non-empty closed interval $I$ of $\mathbb{R}$ and traversing the boundary of $\Omega$ such that Green’s identity
\[ \int_{\Omega} \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) dx\,dy = \int_{\alpha} f \cdot d\alpha \tag{4.6.17} \]
is valid for every continuously differentiable $f = (f_1, f_2) : V \to \mathbb{R}^2$ defined on some open subset $V$ of $\mathbb{R}^2$ containing $\Omega$ and its boundary.

(ii) $r$ is the restriction to $\Omega$ of a map $\hat{r}$ of class $C^2$ defined on an open subset containing $\Omega$ and its boundary.

Then
\[ \int_S \operatorname{curl} F \cdot dS = \int_{\gamma} F \cdot d\gamma \]
where $\gamma$ is any piecewise $C^1$-parametrization of $\partial S := \mathrm{Ran}(\hat{r} \circ \alpha)$ which is of the same orientation as $\hat{r} \circ \alpha$.

¹ Note that we do not make any further assumptions on the normal field.

[Fig. 204: Illustration for the proof of Stokes’ theorem, Theorem 4.6.24. Shown are $\Omega$ and $\partial\Omega$ in the x, y-plane and $S$, $\partial S$ in x, y, z-space.]
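Proof. In the following, for simplicity of notation, we denote $\hat{r}$ by the symbol $\rho$. First, since $\rho \circ \alpha$ is a piecewise $C^1$-path, it follows that
\[
\int_{\rho \circ \alpha} F \cdot d(\rho \circ \alpha) = \sum_{i=1}^{3} \int_{\alpha} [ (F_i \circ \rho)\, \nabla \rho_i ] \cdot d\alpha
= \sum_{i=1}^{3} \int_{\Omega} \left\{ \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right] \right\} dx\,dy .
\]
Further, it follows for every $i \in \{1, 2, 3\}$ that
\[
\begin{aligned}
& \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right]
= \frac{\partial (F_i \circ \rho)}{\partial x} \frac{\partial \rho_i}{\partial y} - \frac{\partial (F_i \circ \rho)}{\partial y} \frac{\partial \rho_i}{\partial x} \\
& \quad = \left[ \left( \frac{\partial F_i}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_i}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} + \left( \frac{\partial F_i}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_i}{\partial y} \\
& \qquad - \left[ \left( \frac{\partial F_i}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_i}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} + \left( \frac{\partial F_i}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_i}{\partial x}
\end{aligned}
\]
and by using
\[
\frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} = \left( \frac{\partial \rho_2}{\partial x} \frac{\partial \rho_3}{\partial y} - \frac{\partial \rho_3}{\partial x} \frac{\partial \rho_2}{\partial y} , \; \frac{\partial \rho_3}{\partial x} \frac{\partial \rho_1}{\partial y} - \frac{\partial \rho_1}{\partial x} \frac{\partial \rho_3}{\partial y} , \; \frac{\partial \rho_1}{\partial x} \frac{\partial \rho_2}{\partial y} - \frac{\partial \rho_2}{\partial x} \frac{\partial \rho_1}{\partial y} \right)
\]
that
\[
\begin{aligned}
& \frac{\partial}{\partial x} \left[ (F_1 \circ \rho) \frac{\partial \rho_1}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_1 \circ \rho) \frac{\partial \rho_1}{\partial x} \right] \\
& \quad = \left[ \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} + \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_1}{\partial y} - \left[ \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} + \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_1}{\partial x} \\
& \quad = \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 - \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3 ,
\end{aligned}
\]
\[
\begin{aligned}
& \frac{\partial}{\partial x} \left[ (F_2 \circ \rho) \frac{\partial \rho_2}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_2 \circ \rho) \frac{\partial \rho_2}{\partial x} \right] \\
& \quad = \left[ \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_2}{\partial y} - \left[ \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_2}{\partial x} \\
& \quad = \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3 - \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1 ,
\end{aligned}
\]
\[
\begin{aligned}
& \frac{\partial}{\partial x} \left[ (F_3 \circ \rho) \frac{\partial \rho_3}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_3 \circ \rho) \frac{\partial \rho_3}{\partial x} \right] \\
& \quad = \left[ \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} \right] \frac{\partial \rho_3}{\partial y} - \left[ \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} \right] \frac{\partial \rho_3}{\partial x} \\
& \quad = \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1 - \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 .
\end{aligned}
\]
Hence by using that
\[ \operatorname{curl} F = \left( \frac{\partial F_3}{\partial x_2} - \frac{\partial F_2}{\partial x_3} , \; \frac{\partial F_1}{\partial x_3} - \frac{\partial F_3}{\partial x_1} , \; \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \right) , \]
we arrive at
\[
\begin{aligned}
& \sum_{i=1}^{3} \left\{ \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right] \right\} \\
& \quad = \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 - \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3
+ \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3 \\
& \qquad - \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1
+ \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1 - \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 \\
& \quad = [ (\operatorname{curl} F) \circ \rho ] \cdot \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right) .
\end{aligned}
\]
Hence, finally, it follows that
\[ \int_{\rho \circ \alpha} F \cdot d(\rho \circ \alpha) = \int_{\Omega} [ (\operatorname{curl} F) \circ \rho ] \cdot \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right) dx\,dy = \int_S \operatorname{curl} F \cdot dS . \]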
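Example 4.6.25. Let $S$ and $F$ be as in Example 4.6.19. A simple calculation gives that
\[ F(x,y,z) = (\operatorname{curl} A)(x,y,z) \]
for all $x, y, z \in \mathbb{R}$ where
\[ A(x,y,z) := - \left( y , \frac{z^2}{2} , \frac{x^2}{2} \right) \]
for all $x, y, z \in \mathbb{R}$. Hence it follows by Theorem 4.6.24 that
\[ \int_S F \cdot dS = \int_{\partial S} A \cdot dr . \]
The boundary $\partial S$ is given by the circle of radius 1 in the x, y-plane centered at the origin. Note that the parametrization $r(x,y) = (x, y, 1 - x^2 - y^2)$ of Example 4.6.19 coincides on the unit circle of $\mathbb{R}^2$ with the inclusion $\iota$ of $\mathbb{R}^2$ into $\mathbb{R}^3$ given by $\iota(x,y) := (x, y, 0)$ for every $(x,y) \in \mathbb{R}^2$. Therefore, a $C^1$-parametrization of $\partial S$ satisfying the assumptions of Theorem 4.6.24 is given by $r(t) := (\cos t, \sin t, 0)$ for all $t \in [-\pi, \pi]$. Hence
\[
\begin{aligned}
\int_S F \cdot dS &= - \int_{-\pi}^{\pi} \left( \sin(t) , 0 , \cos^2(t)/2 \right) \cdot ( -\sin(t) , \cos(t) , 0 )\, dt = \int_{-\pi}^{\pi} \sin^2(t)\, dt \\
&= \frac{1}{2} \int_{-\pi}^{\pi} ( 1 - \cos(2t) )\, dt = \pi - \frac{1}{4} \Big[ \sin(2t) \Big]_{-\pi}^{\pi} = \pi
\end{aligned}
\]
which is identical to the result of Example 4.6.19.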
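The boundary integral of Example 4.6.25 is easy to evaluate numerically. The sketch below approximates the path integral of $A(x,y,z) = -(y, z^2/2, x^2/2)$ along $t \mapsto (\cos t, \sin t, 0)$ by a midpoint rule; the function name `circulation_of_A` and the number of subintervals are illustrative assumptions.

```python
import math

def circulation_of_A(n=2000):
    """Midpoint-rule approximation of the path integral of
    A(x, y, z) = -(y, z^2/2, x^2/2) along (cos t, sin t, 0), t in [-pi, pi]."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = -math.pi + (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 0.0
        A = (-y, -0.5 * z * z, -0.5 * x * x)
        tangent = (-math.sin(t), math.cos(t), 0.0)   # derivative of the path
        total += (A[0] * tangent[0] + A[1] * tangent[1] + A[2] * tangent[2]) * h
    return total

print(circulation_of_A(), math.pi)
```

The result agrees with the flux $\pi$ computed directly in Example 4.6.19, as Stokes’ theorem predicts.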
4.6.3 Gauss’ Theorem
The final theorem of this course is Gauss’ theorem for images of cuboids. It relates the volume integral of a certain derivative of a vector field, the so-called ‘divergence’ of the field, to the flow of the vector field through the boundary of the volume. Its proof is similar to that of Green’s theorem for images of rectangles, i.e., it is based on a technical lemma referring to transformation properties of the divergence of a vector field. That lemma is given next. Its proof consists of a straightforward calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18.
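Lemma 4.6.26. Let $V$ be a non-empty open subset of $\mathbb{R}^3$ and $F = (F_1, F_2, F_3) : V \to \mathbb{R}^3$ be differentiable. Further, let $g = (g_1, g_2, g_3) : D(g) \to \mathbb{R}^3$ be defined and of class $C^2$ on a non-empty open subset $D(g)$ of $\mathbb{R}^3$ and such that $g(D(g)) \subset V$. Then
\[
\frac{\partial}{\partial x} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right) \right]
+ \frac{\partial}{\partial y} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right) \right]
+ \frac{\partial}{\partial z} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right) \right]
= [ (\operatorname{div} F) \circ g ]\, \det(g')
\]
where
\[ \operatorname{div} F := \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} . \]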
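Proof. The proof proceeds by a simple calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18. In the following, we indicate partial derivatives in the coordinate directions x, y, z by the index $,x$, $,y$ and $,z$, respectively. In a first step, we prove that
\[ (g_{,y} \times g_{,z})_{,x} + (g_{,z} \times g_{,x})_{,y} + (g_{,x} \times g_{,y})_{,z} = 0 . \]
For this, we note that
\[
\begin{aligned}
g_{,y} \times g_{,z} &= ( g_{2,y} g_{3,z} - g_{3,y} g_{2,z} , \; g_{3,y} g_{1,z} - g_{1,y} g_{3,z} , \; g_{1,y} g_{2,z} - g_{2,y} g_{1,z} ) , \\
g_{,z} \times g_{,x} &= ( g_{2,z} g_{3,x} - g_{3,z} g_{2,x} , \; g_{3,z} g_{1,x} - g_{1,z} g_{3,x} , \; g_{1,z} g_{2,x} - g_{2,z} g_{1,x} ) , \\
g_{,x} \times g_{,y} &= ( g_{2,x} g_{3,y} - g_{3,x} g_{2,y} , \; g_{3,x} g_{1,y} - g_{1,x} g_{3,y} , \; g_{1,x} g_{2,y} - g_{2,x} g_{1,y} ) .
\end{aligned}
\]
Hence
\[
\begin{aligned}
& (g_{,y} \times g_{,z})_{1,x} + (g_{,z} \times g_{,x})_{1,y} + (g_{,x} \times g_{,y})_{1,z} \\
& \quad = ( g_{2,y} g_{3,z} - g_{3,y} g_{2,z} )_{,x} + ( g_{2,z} g_{3,x} - g_{3,z} g_{2,x} )_{,y} + ( g_{2,x} g_{3,y} - g_{3,x} g_{2,y} )_{,z} \\
& \quad = g_{2,yx} g_{3,z} - g_{3,yx} g_{2,z} + g_{2,y} g_{3,zx} - g_{3,y} g_{2,zx} + g_{2,zy} g_{3,x} - g_{3,zy} g_{2,x} \\
& \qquad + g_{2,z} g_{3,xy} - g_{3,z} g_{2,xy} + g_{2,xz} g_{3,y} - g_{3,xz} g_{2,y} + g_{2,x} g_{3,yz} - g_{3,x} g_{2,yz} = 0 ,
\end{aligned}
\]
\[
\begin{aligned}
& (g_{,y} \times g_{,z})_{2,x} + (g_{,z} \times g_{,x})_{2,y} + (g_{,x} \times g_{,y})_{2,z} \\
& \quad = ( g_{3,y} g_{1,z} - g_{1,y} g_{3,z} )_{,x} + ( g_{3,z} g_{1,x} - g_{1,z} g_{3,x} )_{,y} + ( g_{3,x} g_{1,y} - g_{1,x} g_{3,y} )_{,z} \\
& \quad = g_{3,yx} g_{1,z} - g_{1,yx} g_{3,z} + g_{3,y} g_{1,zx} - g_{1,y} g_{3,zx} + g_{3,zy} g_{1,x} - g_{1,zy} g_{3,x} \\
& \qquad + g_{3,z} g_{1,xy} - g_{1,z} g_{3,xy} + g_{3,xz} g_{1,y} - g_{1,xz} g_{3,y} + g_{3,x} g_{1,yz} - g_{1,x} g_{3,yz} = 0 ,
\end{aligned}
\]
\[
\begin{aligned}
& (g_{,y} \times g_{,z})_{3,x} + (g_{,z} \times g_{,x})_{3,y} + (g_{,x} \times g_{,y})_{3,z} \\
& \quad = ( g_{1,y} g_{2,z} - g_{2,y} g_{1,z} )_{,x} + ( g_{1,z} g_{2,x} - g_{2,z} g_{1,x} )_{,y} + ( g_{1,x} g_{2,y} - g_{2,x} g_{1,y} )_{,z} \\
& \quad = g_{1,yx} g_{2,z} - g_{2,yx} g_{1,z} + g_{1,y} g_{2,zx} - g_{2,y} g_{1,zx} + g_{1,zy} g_{2,x} - g_{2,zy} g_{1,x} \\
& \qquad + g_{1,z} g_{2,xy} - g_{2,z} g_{1,xy} + g_{1,xz} g_{2,y} - g_{2,xz} g_{1,y} + g_{1,x} g_{2,yz} - g_{2,x} g_{1,yz} = 0 .
\end{aligned}
\]
Since according to the chain rule Corollary 4.2.25
\[
\begin{aligned}
(F \circ g)_{,x} &= g_{1,x} . (F_{,x} \circ g) + g_{2,x} . (F_{,y} \circ g) + g_{3,x} . (F_{,z} \circ g) , \\
(F \circ g)_{,y} &= g_{1,y} . (F_{,x} \circ g) + g_{2,y} . (F_{,y} \circ g) + g_{3,y} . (F_{,z} \circ g) , \\
(F \circ g)_{,z} &= g_{1,z} . (F_{,x} \circ g) + g_{2,z} . (F_{,y} \circ g) + g_{3,z} . (F_{,z} \circ g) ,
\end{aligned}
\]
it follows in a second step, using the identity proved in the first step, that
\[
\begin{aligned}
& [ (F \circ g) \cdot (g_{,y} \times g_{,z}) ]_{,x} + [ (F \circ g) \cdot (g_{,z} \times g_{,x}) ]_{,y} + [ (F \circ g) \cdot (g_{,x} \times g_{,y}) ]_{,z} \\
& \quad = (F \circ g)_{,x} \cdot (g_{,y} \times g_{,z}) + (F \circ g)_{,y} \cdot (g_{,z} \times g_{,x}) + (F \circ g)_{,z} \cdot (g_{,x} \times g_{,y}) \\
& \quad = [ g_{1,x} . (F_{,x} \circ g) + g_{2,x} . (F_{,y} \circ g) + g_{3,x} . (F_{,z} \circ g) ] \cdot (g_{,y} \times g_{,z}) \\
& \qquad + [ g_{1,y} . (F_{,x} \circ g) + g_{2,y} . (F_{,y} \circ g) + g_{3,y} . (F_{,z} \circ g) ] \cdot (g_{,z} \times g_{,x}) \\
& \qquad + [ g_{1,z} . (F_{,x} \circ g) + g_{2,z} . (F_{,y} \circ g) + g_{3,z} . (F_{,z} \circ g) ] \cdot (g_{,x} \times g_{,y}) \\
& \quad = (F_{,x} \circ g) \cdot [ g_{1,x} . (g_{,y} \times g_{,z}) + g_{1,y} . (g_{,z} \times g_{,x}) + g_{1,z} . (g_{,x} \times g_{,y}) ] \\
& \qquad + (F_{,y} \circ g) \cdot [ g_{2,x} . (g_{,y} \times g_{,z}) + g_{2,y} . (g_{,z} \times g_{,x}) + g_{2,z} . (g_{,x} \times g_{,y}) ] \\
& \qquad + (F_{,z} \circ g) \cdot [ g_{3,x} . (g_{,y} \times g_{,z}) + g_{3,y} . (g_{,z} \times g_{,x}) + g_{3,z} . (g_{,x} \times g_{,y}) ] .
\end{aligned}
\]
Further, it follows by inserting the components of the vector products listed above that
\[
\begin{aligned}
g_{1,x} . (g_{,y} \times g_{,z}) + g_{1,y} . (g_{,z} \times g_{,x}) + g_{1,z} . (g_{,x} \times g_{,y}) &= ( \det(g') , 0 , 0 ) , \\
g_{2,x} . (g_{,y} \times g_{,z}) + g_{2,y} . (g_{,z} \times g_{,x}) + g_{2,z} . (g_{,x} \times g_{,y}) &= ( 0 , \det(g') , 0 ) , \\
g_{3,x} . (g_{,y} \times g_{,z}) + g_{3,y} . (g_{,z} \times g_{,x}) + g_{3,z} . (g_{,x} \times g_{,y}) &= ( 0 , 0 , \det(g') )
\end{aligned}
\]
and hence, finally, that
\[ [ (F \circ g) \cdot (g_{,y} \times g_{,z}) ]_{,x} + [ (F \circ g) \cdot (g_{,z} \times g_{,x}) ]_{,y} + [ (F \circ g) \cdot (g_{,x} \times g_{,y}) ]_{,z} = [ \operatorname{div}(F) \circ g ]\, \det(g') . \]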
Gauss’ theorem for images of cuboids under certain differentiable maps
is a consequence of the previous Lemma, Lemma 4.6.26, and change of
variables, Theorem 4.4.23.
Theorem 4.6.27. (Gauss’ theorem for images of cuboids) Let $a_1, b_1, a_2, b_2, a_3, b_3 \in \mathbb{R}$ be such that $a_i < b_i$ for $i = 1, 2, 3$ and $I := [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3]$, $I_0 := (a_1, b_1) \times (a_2, b_2) \times (a_3, b_3)$. Further, let $U \supset I$ be an open subset of $\mathbb{R}^3$, $g : U \to \mathbb{R}^3$ be twice continuously differentiable such that the induced map from $U$ to $g(U)$ is bijective with a continuously differentiable inverse and such that $\det(g') > 0$. Finally, let $V \supset g(I)$ be an open subset of $\mathbb{R}^3$ and $F = (F_1, F_2, F_3) : V \to \mathbb{R}^3$ be continuously differentiable. Then
\[
\int_{g(I_0)} \operatorname{div} F \; dx\,dy\,dz = \int_{S_{a_1}} F \cdot dS + \int_{S_{b_1}} F \cdot dS + \int_{S_{a_2}} F \cdot dS + \int_{S_{b_2}} F \cdot dS + \int_{S_{a_3}} F \cdot dS + \int_{S_{b_3}} F \cdot dS \tag{4.6.18}
\]
where $(S_{a_1}, r_{a_1})$, $(S_{b_1}, r_{b_1})$, $(S_{a_2}, r_{a_2})$, $(S_{b_2}, r_{b_2})$, $(S_{a_3}, r_{a_3})$, $(S_{b_3}, r_{b_3})$ are $C^2$-parametric surfaces defined by
\[ r_{a_1}(y,z) := g(a_1, z, y) , \quad r_{b_1}(y,z) := g(b_1, y, z) , \quad r_{a_2}(x,z) := g(x, a_2, z) , \]
\[ r_{b_2}(x,z) := g(z, b_2, x) , \quad r_{a_3}(x,y) := g(y, x, a_3) , \quad r_{b_3}(x,y) := g(x, y, b_3) \]
for all $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$ and
\[ S_{a_1} := \mathrm{Ran}(r_{a_1}) , \quad S_{b_1} := \mathrm{Ran}(r_{b_1}) , \quad S_{a_2} := \mathrm{Ran}(r_{a_2}) , \]
\[ S_{b_2} := \mathrm{Ran}(r_{b_2}) , \quad S_{a_3} := \mathrm{Ran}(r_{a_3}) , \quad S_{b_3} := \mathrm{Ran}(r_{b_3}) . \]

[Fig. 205: $g(I)$ and some outer normal vectors. Illustration for the proof of Gauss’ theorem, Theorem 4.6.27. Axes: $x$, $y$, $z$.]
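Proof. In a first step, we consider the set $g(I_0)$. Since $g$ is twice continuously differentiable with a continuously differentiable inverse, $g(I_0)$ is a bounded open subset in $\mathbb{R}^3$. Further, the restriction of $\operatorname{div} F$ to $g(I_0)$ is bounded. In addition, it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of $\operatorname{div} F$ to a function, defined on a closed subinterval $J$ of $\mathbb{R}^3$ containing $g(I_0)$ and assuming the value zero in the points of $J \setminus g(I_0)$, is Riemann-integrable. Hence by Theorem 4.4.23, it follows in a second step that
\[ \int_{g(I_0)} \operatorname{div} F \; dx\,dy\,dz = \int_{I_0} [ (\operatorname{div} F) \circ g ]\, \det(g') \; dx\,dy\,dz \]
and hence by the previous Lemma 4.6.26 that
\[
\begin{aligned}
\int_{g(I_0)} \operatorname{div} F \; dx\,dy\,dz
={} & \int_{I_0} \frac{\partial}{\partial x} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right) \right] dx\,dy\,dz
+ \int_{I_0} \frac{\partial}{\partial y} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right) \right] dx\,dy\,dz \\
& + \int_{I_0} \frac{\partial}{\partial z} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right) \right] dx\,dy\,dz .
\end{aligned}
\]
Finally, from this follows (4.6.18) by Fubini’s Theorem 4.4.18 and the fundamental theorem of calculus Theorem 2.6.21.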
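Remark 4.6.28. Since the region $g(I_0)$ is bounded, there is an outward pointing unit normal for every point on its boundary, apart from the corner points $g(a_1, a_2, a_3)$, $g(a_1, b_2, a_3)$, $g(a_1, a_2, b_3)$, $g(a_1, b_2, b_3)$, $g(b_1, a_2, a_3)$, $g(b_1, b_2, a_3)$, $g(b_1, a_2, b_3)$ and $g(b_1, b_2, b_3)$. Outward pointing vectors are given by
\[
v_{a_1}(a_1, y, z) := - \left( \frac{\partial g_1}{\partial x}, \frac{\partial g_2}{\partial x}, \frac{\partial g_3}{\partial x} \right)(a_1, y, z) , \quad
v_{b_1}(b_1, y, z) := \left( \frac{\partial g_1}{\partial x}, \frac{\partial g_2}{\partial x}, \frac{\partial g_3}{\partial x} \right)(b_1, y, z) ,
\]
\[
v_{a_2}(x, a_2, z) := - \left( \frac{\partial g_1}{\partial y}, \frac{\partial g_2}{\partial y}, \frac{\partial g_3}{\partial y} \right)(x, a_2, z) , \quad
v_{b_2}(x, b_2, z) := \left( \frac{\partial g_1}{\partial y}, \frac{\partial g_2}{\partial y}, \frac{\partial g_3}{\partial y} \right)(x, b_2, z) ,
\]
\[
v_{a_3}(x, y, a_3) := - \left( \frac{\partial g_1}{\partial z}, \frac{\partial g_2}{\partial z}, \frac{\partial g_3}{\partial z} \right)(x, y, a_3) , \quad
v_{b_3}(x, y, b_3) := \left( \frac{\partial g_1}{\partial z}, \frac{\partial g_2}{\partial z}, \frac{\partial g_3}{\partial z} \right)(x, y, b_3) ,
\]
for every $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$. Normal vectors on the boundary corresponding to the parametrizations $r_{a_1}$, $r_{b_1}$, $r_{a_2}$, $r_{b_2}$, $r_{a_3}$, $r_{b_3}$ are given by
\[
n_{a_1}(a_1, y, z) := \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial y} \right)(a_1, y, z) , \quad
n_{b_1}(b_1, y, z) := \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right)(b_1, y, z) ,
\]
\[
n_{a_2}(x, a_2, z) := \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial z} \right)(x, a_2, z) , \quad
n_{b_2}(x, b_2, z) := \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right)(x, b_2, z) ,
\]
\[
n_{a_3}(x, y, a_3) := \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial x} \right)(x, y, a_3) , \quad
n_{b_3}(x, y, b_3) := \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right)(x, y, b_3)
\]
for every $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$. In particular, as a consequence of (3.5.7) and the assumption in the previous Theorem 4.6.27 that $\det(g') > 0$, the scalar product of each outward pointing vector with the corresponding normal vector equals $\det(g')$, for instance
\[
v_{a_1} \cdot n_{a_1} = - \frac{\partial g}{\partial x} \cdot \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial y} \right) = \frac{\partial g}{\partial x} \cdot \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right) = \det(g') ,
\]
so that the orthogonal projections of the outward pointing vectors onto the corresponding normal vectors are everywhere $> 0$. Hence the parametrizations of the boundary surfaces $S_{a_1}$, $S_{b_1}$, $S_{a_2}$, $S_{b_2}$, $S_{a_3}$, $S_{b_3}$ in Theorem 4.6.27 are such that the corresponding normal vectors point out of $g(I_0)$ in every point of the boundary, apart from finitely many points, as is indicated in Fig. 205.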
Remark 4.6.29. Gauss' theorem, too, can be generalized to a larger class of regions of $\mathbb{R}^3$ that can be dissected into regions that satisfy the requirements of Theorem 4.6.27. The last theorem is then applied to the parts of the dissection. In this, the integration over each cut is performed twice, but with normal fields in opposite directions, such that the two contributions cancel in the sum. We will not give such generalizations in the following.
The regions considered in the following examples and problems satisfy the
requirements of such more general theorems.
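Example 4.6.30. Calculate
\[
\int_S F \cdot dS ,
\]
using Gauss' theorem, where
\[
F(x, y, z) := \left( xy , \; y^2 + e^{x z^2} , \; \sin(xy) \right) , \quad x, y, z \in \mathbb{R}
\]
and $S$ is the surface of the region $E$ bounded by the parabolic cylinder $z = 1 - x^2$ and the planes $y = 0$, $z = 0$ and $y + z = 2$.
Solution: Since $\operatorname{div} F = y + 2y = 3y$, by Gauss' and Fubini's Theorem, it follows
\[
\int_S F \cdot dS = \int_E 3y \, dx\,dy\,dz
= \int_{-1}^{1} \left( \int_0^{1 - x^2} \left( \int_0^{2 - z} 3y \, dy \right) dz \right) dx
= \frac{3}{2} \int_{-1}^{1} \left( \int_0^{1 - x^2} (2 - z)^2 \, dz \right) dx
\]
\[
= - \frac{1}{2} \int_{-1}^{1} \left[ (x^2 + 1)^3 - 8 \right] dx
= \frac{1}{2} \int_{-1}^{1} \left( 7 - 3x^2 - 3x^4 - x^6 \right) dx
= \frac{1}{2} \left[ 7x - x^3 - \frac{3}{5} x^5 - \frac{1}{7} x^7 \right]_{-1}^{1}
= \frac{184}{35} .
\]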
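The value obtained in Example 4.6.30 can be cross-checked numerically. The following is a minimal sketch, not part of the original text: it approximates the triple integral of $\operatorname{div} F = 3y$ over $E$ by a nested midpoint rule, assuming (as in the computation above) that $x$ ranges over $[-1, 1]$. The function names `divergence` and `flux_via_gauss` and the grid size `n` are illustrative choices.

```python
from fractions import Fraction


def divergence(x, y, z):
    # div F = d(xy)/dx + d(y^2 + exp(x z^2))/dy + d(sin(xy))/dz = y + 2y + 0
    return 3.0 * y


def flux_via_gauss(n=60):
    """Midpoint-rule approximation of the integral of div F over E.

    E is described by -1 <= x <= 1, 0 <= z <= 1 - x^2, 0 <= y <= 2 - z.
    """
    total = 0.0
    hx = 2.0 / n
    for i in range(n):
        x = -1.0 + (i + 0.5) * hx
        z_top = 1.0 - x * x
        hz = z_top / n
        for j in range(n):
            z = (j + 0.5) * hz
            y_top = 2.0 - z
            hy = y_top / n
            for k in range(n):
                y = (k + 0.5) * hy
                total += divergence(x, y, z) * hx * hy * hz
    return total


# Exact value from the antiderivative 7x - x^3 - (3/5)x^5 - (1/7)x^7 on [-1, 1].
exact = Fraction(1, 2) * 2 * (7 - 1 - Fraction(3, 5) - Fraction(1, 7))
approx = flux_via_gauss()
print(exact, float(exact), approx)
```

The midpoint rule is only second-order accurate, so the approximation agrees with $184/35$ to a few digits; increasing `n` tightens the agreement at the cost of run time.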
Fig. 206: Sketch of V .

The following example gives a typical application of Gauss' theorem in the area of partial differential equations. It considers solutions of wave equations. Ultimately, it will lead to the proof of the causal behavior of the solutions, i.e., the fact that two solutions, whose values coincide on a circular area $A_C$ at time $t = 0$ and whose partial time derivatives coincide on that same set, coincide on the volume of a certain 'characteristic solid cone' that is contained in $A_C \times [0, \infty)$ and has $A_C$ as base.
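Example 4.6.31. (An energy inequality for a wave equation in two space dimensions) We consider a function $u : U \to \mathbb{R}$ of class $C^2$ that satisfies the wave equation
\[
\frac{\partial^2 u}{\partial t^2} - \triangle u + V u = 0 \tag{4.6.19}
\]
where $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\operatorname{Ran}(V) \subset [0, \infty)$, and is such that
\[
\frac{\partial V}{\partial t} = 0 .
\]
In this, $U$ is a non-empty open subset of $\mathbb{R}^3$. In addition, we define for every partially differentiable $f : U \to \mathbb{R}$, every partially differentiable $F : U \to \mathbb{R}^2$ and every twice partially differentiable $f : U \to \mathbb{R}$
\[
(\nabla f)(t, x, y) := [\nabla f(t, \cdot)](x, y) , \quad (\operatorname{div} F)(t, x, y) := [\operatorname{div} F(t, \cdot)](x, y) , \quad (\triangle f)(t, x, y) := [\triangle f(t, \cdot)](x, y) ,
\]
respectively, for all $(t, x, y) \in U$. Then the function $\epsilon$ and the vector field $j$ defined by
\[
\epsilon := \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + |\nabla u|^2 + V u^2 \right] , \quad j := - \frac{\partial u}{\partial t} \, \nabla u
\]
satisfy
\[
\frac{\partial \epsilon}{\partial t} - \operatorname{div} \left( \frac{\partial u}{\partial t} \nabla u \right)
= \frac{\partial}{\partial t} \, \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + |\nabla u|^2 + V u^2 \right] - \operatorname{div} \left( \frac{\partial u}{\partial t} \nabla u \right)
\]
\[
= \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + (\nabla u) \cdot \nabla \frac{\partial u}{\partial t} + V u \frac{\partial u}{\partial t} - \left( \nabla \frac{\partial u}{\partial t} \right) \cdot (\nabla u) - \frac{\partial u}{\partial t} \triangle u
= \frac{\partial u}{\partial t} \left( \frac{\partial^2 u}{\partial t^2} - \triangle u + V u \right) = 0 .
\]
Hence we conclude the conservation law
\[
\operatorname{div} j + \frac{\partial \epsilon}{\partial t} = 0 . \tag{4.6.20}
\]
Note for later use that
\[
\epsilon(x, y, t) \geq |j(x, y, t)| \tag{4.6.21}
\]
for all $(x, y, t) \in U$, which follows from $|j| = |\partial u / \partial t| \, |\nabla u| \leq \frac{1}{2} [ (\partial u / \partial t)^2 + |\nabla u|^2 ]$ and $V u^2 \geq 0$. In physical applications, $\epsilon$ is called the energy density (corresponding to $u$) and $j$ is called the energy flux density (corresponding to $u$). Integration of $\epsilon(\cdot, t)$ over subsets of $\mathbb{R}^2$ (if the corresponding integral exists) gives the energy of $u$ that is contained in that subset at time $t \in \mathbb{R}$. The vector field $j$ describes the flow of that energy. In the following, we derive an important consequence of (4.6.20). For this, let $(x, y, t) \in \mathbb{R}^3$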
be such that $t > 0$. Further, let $T \in [0, t)$. We define the solid backward characteristic cone $SC_{x,y,t}$ with apex $(x, y, t)$ by
\[
SC_{x,y,t} := \left\{ (\xi, \eta, \tau) \in \mathbb{R}^3 : \tau \leq t - |(x - \xi, y - \eta)| \right\} .
\]
Further, we assume that
\[
U \supset SC_{x,y,t} \cap \left( [0, T] \times \mathbb{R}^2 \right) .
\]
Then
\[
\int_{B_{t-T}(x,y)} (\epsilon(u))(T, \cdot) \, d\xi\,d\eta \leq \int_{B_t(x,y)} (\epsilon(u))(0, \cdot) \, d\xi\,d\eta . \tag{4.6.22}
\]

Fig. 207: Sketch of the domain of integration in Example 4.6.31.
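This can be shown as follows. It follows from (4.6.20) by Gauss' Theorem that
\[
0 = \int_{SC_{x,y,t} \cap ([0,T] \times \mathbb{R}^2)} \left[ \frac{\partial \epsilon}{\partial t} - \nabla \cdot \left( \frac{\partial u}{\partial t} \nabla u \right) \right](\xi, \eta, \tau) \, d\xi\,d\eta\,d\tau
= \int_{\partial ( SC_{x,y,t} \cap ([0,T] \times \mathbb{R}^2) )} \left( \epsilon , - \frac{\partial u}{\partial t} \nabla u \right) \cdot v \, dS
\]
\[
= \int_{B_{t-T}(x,y)} (\epsilon(u))(T, \cdot) \, d\xi\,d\eta - \int_{B_t(x,y)} (\epsilon(u))(0, \cdot) \, d\xi\,d\eta + \int_{C_{x,y,t} \cap ([0,T] \times \mathbb{R}^2)} \left( \epsilon(u) , - \frac{\partial u}{\partial t} \nabla u \right) \cdot v \, dS
\]
where $v$ denotes the outer unit normal field on the boundary surface
\[
\partial \left( SC_{x,y,t} \cap \left( [0, T] \times \mathbb{R}^2 \right) \right)
\]
of $SC_{x,y,t} \cap ( [0, T] \times \mathbb{R}^2 )$. A parametrization of $C_{x,y,t}$ is given by $f : \mathbb{R}^2 \to \mathbb{R}^3$ defined by
\[
f(\xi, \eta) := (\xi, \eta, t - |(x - \xi, y - \eta)|)
\]
for every $(\xi, \eta) \in \mathbb{R}^2$. In particular, listing the time component first, as in $(\epsilon, - (\partial u / \partial t) \nabla u)$,
\[
v(f(\xi, \eta)) = \frac{1}{\sqrt{2}} \left( 1 , \; - \frac{x - \xi}{|(x - \xi, y - \eta)|} , \; - \frac{y - \eta}{|(x - \xi, y - \eta)|} \right)
\]
for every $(\xi, \eta) \in \mathbb{R}^2 \setminus \{ (x, y) \}$. In addition, it follows for such $(\xi, \eta)$ that
\[
2 \sqrt{2} \left[ \left( \epsilon , - \frac{\partial u}{\partial t} \nabla u \right) \cdot v \right](\zeta)
= \left( \frac{\partial u}{\partial t}(\zeta) \right)^2 + |(\nabla u)(\zeta)|^2 + V(\zeta) \, (u(\zeta))^2
+ \frac{2}{|(x - \xi, y - \eta)|} \, \frac{\partial u}{\partial t}(\zeta) \, (x - \xi, y - \eta) \cdot (\nabla u)(\zeta)
\]
\[
\geq \left( \frac{\partial u}{\partial t}(\zeta) \right)^2 + |(\nabla u)(\zeta)|^2 + V(\zeta) \, (u(\zeta))^2 - 2 \left| \frac{\partial u}{\partial t}(\zeta) \right| \, |(\nabla u)(\zeta)| \geq 0
\]
where $\zeta := f(\xi, \eta)$. Hence the integral over $C_{x,y,t} \cap ([0, T] \times \mathbb{R}^2)$ is positive, and (4.6.22) follows. As an application of the energy inequality (4.6.22), we assume that $v : U \to \mathbb{R}$ is another solution of (4.6.19) such that
\[
u(x, y, 0) = v(x, y, 0) , \quad \frac{\partial u}{\partial t}(x, y, 0) = \frac{\partial v}{\partial t}(x, y, 0)
\]
for all $(x, y) \in B_t(x, y)$. Then $u - v$ is a solution of (4.6.19) such that
\[
(u - v)(x, y, 0) = 0 , \quad \frac{\partial (u - v)}{\partial t}(x, y, 0) = 0
\]
for all $(x, y) \in B_t(x, y)$ and hence the corresponding energy density vanishes at time $0$ on $B_t(x, y)$. As a consequence of (4.6.22) and the positivity of the energy density, it follows that the same is true at time $T$ on $B_{t-T}(x, y)$. Since this is true for every $T \in [0, t)$ and since $u - v$ is continuous, it follows that $u$ and $v$ coincide in every point of
\[
SC_{x,y,t} \cap \left( [0, t] \times \mathbb{R}^2 \right) .
\]
As a consequence, we have the following result.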
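Theorem 4.6.32. (Uniqueness of the solutions of a wave equation in two space dimensions) Let $(x, y, t) \in \mathbb{R}^3$ be such that $t > 0$. We define the solid backward characteristic cone $SC_{x,y,t}$ with apex $(x, y, t)$ by
\[
SC_{x,y,t} := \left\{ (\xi, \eta, \tau) \in \mathbb{R}^3 : \tau \leq t - |(x - \xi, y - \eta)| \right\} .
\]
Further, let $U$ be an open subset of $\mathbb{R}^3$ such that
\[
U \supset SC_{x,y,t} \cap \left( \mathbb{R}^2 \times [0, t] \right)
\]
and $u, v \in C^2(U, \mathbb{R})$ be such that
\[
\frac{\partial^2 u}{\partial t^2} - \triangle u + V u = \frac{\partial^2 v}{\partial t^2} - \triangle v + V v ,
\]
\[
u(x, y, 0) = v(x, y, 0) , \quad \frac{\partial u}{\partial t}(x, y, 0) = \frac{\partial v}{\partial t}(x, y, 0)
\]
for all $(x, y) \in B_t(x, y)$. In this,
\[
(\triangle u)(t, x, y) := [\triangle u(t, \cdot)](x, y)
\]
respectively, for all $(t, x, y) \in U$, and $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\operatorname{Ran}(V) \subset [0, \infty)$, and satisfies
\[
\frac{\partial V}{\partial t} = 0 .
\]
Then $u$ and $v$ coincide on $SC_{x,y,t} \cap ( \mathbb{R}^2 \times [0, t] )$.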
Problems
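1) Decide whether the tuple of vectors is positively or negatively oriented
a) ((1, 2), (3, 4)) , b) ((1, 1), (0, 1)) ,
c) ((3, 7), (2, 1)) , d) ((9, 4), (10, 3)) ,
e) ((0, 4, 2), (8, 1, 3), (9, 12, 5)) ,
f) ((2, 2, 14), (3, 1, 1), (2, 9, 9)) ,
g) ((1, 1, 2), (4, 3, 1), (8, 2, 5)) ,
h) ((2, 7, 9), (1, 2, 4), (8, 8, 1)) .
2) Let $U$ be a non-empty open subset of $\mathbb{R}^3$; $f : U \to \mathbb{R}$, $v : U \to \mathbb{R}^3$ and $w : U \to \mathbb{R}^3$ be partially differentiable; $f_1 : U \to \mathbb{R}$, $v_1 = (v_{1x}, v_{1y}, v_{1z}) : U \to \mathbb{R}^3$ and also $v_2 : U \to \mathbb{R}^3$ be of class $C^2$. Show that
a) $\operatorname{rot}(\nabla f_1) = 0$ ,
b) $\operatorname{div}(\operatorname{rot} v_1) = 0$ ,
c) $\operatorname{div}(\nabla f_1) = \triangle f_1$ ,
d) $\operatorname{rot}(f . v) = f . (\operatorname{rot} v) + (\nabla f) \times v$ ,
e) $\operatorname{div}(f . v) = f \, (\operatorname{div} v) + (\nabla f) \cdot v$ ,
f) $\operatorname{rot}(\operatorname{rot} v_1) = \nabla(\operatorname{div} v_1) - \triangle v_1$ ,
g) $\operatorname{div}(v \times w) = w \cdot (\operatorname{rot} v) - v \cdot (\operatorname{rot} w)$
where $\triangle v_1 := (\triangle v_{1x}, \triangle v_{1y}, \triangle v_{1z})$ .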
3) Let $a, b, c, d \in \mathbb{R}$ be such that $a < c < d < b$. Further, let $f_1 : (a, b) \to \mathbb{R}$ and $f_2 : (a, b) \to \mathbb{R}$ be twice differentiable and such that $f_1(x) < f_2(x)$ for all $x \in (a, b)$. Find a twice continuously differentiable bijective $g : \mathbb{R} \times (a, b) \to \mathbb{R} \times (a, b)$ whose inverse is twice continuously differentiable and which is such that
\[
g([-1, 1] \times [c, d]) = \{ (x, y) \in \mathbb{R} \times [c, d] : f_1(y) \leq x \leq f_2(y) \} .
\]
4) Calculate the area of Gpf q of f : U
Ñ R defined by
U : tpx, y q P R2 : x2 y 2 1u
and f px, y q : 3 3x 7y for all px, y q P U .
5) Let D be the open subset of R2 that is bounded by the triangle with
corners p0, 0q, p1, 0q, p1, 1q. Calculate the area of
tpx, y, zq P R3 : 3x2
7y z
0uXtpx, y, zq P R3 : px, yq P Du .
Fig. 208: An astroid.
6) Calculate the area of
tpx, y, zq P R3 : x2
y2 z2
1uXtpx, y, zq P R3 : x2
y2
xu .
7) Calculate the surface areas of the tori from Example 4.6.15.
8) By calculation of a path integral, find the compact area that is bounded by the astroid
\[
\{ (a \cos^3 t, a \sin^3 t) \in \mathbb{R}^2 : t \in [0, 2\pi) \}
\]
where $a > 0$.
9) By calculation of a path integral, find the compact area that is bounded by the cardioid
\[
\{ (a \cos t \, (1 + \cos t), a \sin t \, (1 + \cos t)) \in \mathbb{R}^2 : t \in [0, 2\pi) \}
\]
where $a > 0$.
10) By calculation of a path integral, find the compact area that is bounded by the folium of Descartes
\[
\{ (x, y) \in \mathbb{R}^2 : x^3 + y^3 - 3axy = 0 \}
\]
where $a > 0$.

Fig. 209: A cardioid.

Fig. 210: A folium of Descartes.
11) Use Stokes’ theorem to calculate the surface integral
»
S
where
curl A dS
Apx, y, z q : pz, x, y q
P R and
S : t px, y, z q P R3 : x2
for all x, y, z
y2
z40^z
¥ 0u .
For this, assume a normal field with positive z-component. Sketch
S.
12) Use Stokes’ theorem to calculate the surface integral
»
S
where
curl A dS
Apx, y, z q : p2yz, 0, xy q ,
P R and
S : t px, y, z q P R3 : x2
for all x, y, z
y2 z2
9 ^ 0 ¤ z ¤ 4z u .
For this, assume a normal field pointing away from the z-axis. Sketch
S.
13) By using Stokes’ theorem, calculate
»
S
where
for all x, y, z
curl A dS
Apx, y, z q : p2y, 3x, z 2 q
P R and S is the closed upper half surface of the sphere
t px, y, zq P R3 : x2 y2 z2 9 u .
For this, assume a normal field with positive z-component.
14) Use Gauss’ theorem to calculate the surface integral
»
S
A dS
where
Apx, y, z q : px2 , xy, y 2 q
for all x, y, z P R and S is the compact region in the first octant
bounded by the coordinate planes and
t px, y, zq P R3 : x
2y
z
1u .
Sketch S.
15) Use Gauss’ theorem to calculate the surface integral
»
S
where
A dS
Apx, y, z q : p3x2 , 6xy, z 2 q
for all x, y, z P R and S is the compact region in the first octant
bounded by the coordinate planes and
t px, y, zq P R3 : x 2 u , t px, y, zq P R3 : z
y2
1u .
Sketch S.
16) By using Gauss’ theorem, calculate
»
S
where
A dS
Apx, y, z q : p2xy
z, y 2 , x 3y q
for all x, y, z P R and S is the compact region in the first octant
bounded by the coordinate planes and
t px, y, zq P R3 : 2x
2y
z
6u .
5 Appendix
5.1 Construction of the Real Number System
Already the ancient Greeks discovered that there was a need to go beyond rational numbers. For instance, they found that there is no rational number to measure the length of the diagonal $d$ of a square with sides of length 1. By the Pythagorean theorem, that length satisfies the equation $d^2 = 2$. In Example 2.2.15, we proved that this equation has no rational solution, which was also known to the Greeks of that time. Still, they did not develop the concept of real numbers. In its final form, that concept was developed only in the 19th century.
In the following, we construct the real number system following an approach by Georg Cantor (1872) as completion of the rational number system. For this, in a first step, we identify $\mathbb{Q}$ with a space containing equivalence classes of Cauchy sequences of rational numbers.
Definition 5.1.1. (Cauchy sequences in $\mathbb{Q}$) Let $x = x_1, x_2, x_3, \dots$ be a sequence of rational numbers. We say that $x$ is a Cauchy sequence if for every rational $\varepsilon > 0$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that
\[
|x_m - x_n| < \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. Such a sequence is necessarily bounded by some rational number since this leads in the special case $\varepsilon = 1$ to
\[
|x_k| \leq \max\{ |x_l| : l = 1, \dots, n_0 \} + |x_k - x_{n_0} + x_{n_0}|
\leq \max\{ |x_l| : l = 1, \dots, n_0 \} + |x_k - x_{n_0}| + |x_{n_0}|
\leq 1 + |x_{n_0}| + \max\{ |x_l| : l = 1, \dots, n_0 \}
\]
for $k \in \mathbb{N}^*$ such that $k \geq n_0$ and
\[
|x_k| \leq \max\{ |x_l| : l = 1, \dots, n_0 \} \leq 1 + |x_{n_0}| + \max\{ |x_l| : l = 1, \dots, n_0 \}
\]
for $k \in \mathbb{N}^*$ such that $k \leq n_0$. We denote the set of all such sequences by the symbol $\mathcal{C}$. For $x, y \in \mathcal{C}$, we define sequences $x + y$ and $x \cdot y$ of rational numbers by
\[
x + y := x_1 + y_1, x_2 + y_2, x_3 + y_3, \dots
\]
\[
x \cdot y := x_1 \cdot y_1, x_2 \cdot y_2, x_3 \cdot y_3, \dots .
\]
Since
\[
|(x + y)_m - (x + y)_n| = |x_m - x_n + y_m - y_n| \leq |x_m - x_n| + |y_m - y_n|
\]
\[
|(x \cdot y)_m - (x \cdot y)_n| = |x_m y_m - x_n y_n| = |x_m (y_m - y_n) + (x_m - x_n) y_n|
\leq |x_m| \, |y_m - y_n| + |y_n| \, |x_m - x_n| \leq C_x |y_m - y_n| + C_y |x_m - x_n|
\]
where $C_x$, $C_y$ are rational bounds for $x$ and $y$, respectively, it follows that $x + y \in \mathcal{C}$ and $x \cdot y \in \mathcal{C}$. Finally, we define for every $x \in \mathcal{C}$ a corresponding sequence $-x \in \mathcal{C}$ by
\[
-x := -x_1, -x_2, \dots .
\]
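As an illustration of Definition 5.1.1, the following minimal sketch (not part of the original text) builds a sequence of rational numbers by the Newton iteration $x_{k+1} = (x_k + 2/x_k)/2$, whose squares approach 2, and checks the Cauchy condition on a finite initial segment. The choice of this particular sequence and the helper names `newton_sqrt2` and `is_cauchy_from` are assumptions made for the example; a finite check can of course only suggest, not prove, the Cauchy property.

```python
from fractions import Fraction


def newton_sqrt2(count):
    """First `count` terms of x_{k+1} = (x_k + 2/x_k)/2 with x_1 = 1 (all rational)."""
    terms = [Fraction(1)]
    while len(terms) < count:
        x = terms[-1]
        terms.append((x + 2 / x) / 2)
    return terms


def is_cauchy_from(terms, n0, eps):
    """Check |x_m - x_n| < eps for all listed m, n >= n0 (1-based indices).

    The largest difference within the tail equals max(tail) - min(tail).
    """
    tail = terms[n0 - 1:]
    return max(tail) - min(tail) < eps


x = newton_sqrt2(8)
print(x[:4])                                   # 1, 3/2, 17/12, 577/408
print(is_cauchy_from(x, 4, Fraction(1, 1000)))
print(all(t * t != 2 for t in x))              # no term solves d^2 = 2
```

Working with `Fraction` keeps every term an exact rational number, which matches the setting of the definition: both the sequence and the tolerance $\varepsilon$ are rational.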
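Definition 5.1.2. We define an equivalence relation '$\sim$' on $\mathcal{C}$ as follows. We say that $x, y \in \mathcal{C}$ are equivalent and denote this by $x \sim y$ if for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[
|x_n - y_n| < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Indeed, '$\sim$' is reflexive since for every $x \in \mathcal{C}$ and every rational $\varepsilon > 0$ it follows that
\[
|x_n - x_n| = 0 < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq 1$ and hence that $x \sim x$. Also, '$\sim$' is symmetric, since for $x, y \in \mathcal{C}$ such that $x \sim y$ and rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[
|x_n - y_n| < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. This implies that
\[
|y_n - x_n| < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$ and hence that $y \sim x$. Finally, if $x, y, z \in \mathcal{C}$ are such that $x \sim y$ and $y \sim z$ and $\varepsilon$ is some rational number $> 0$, it follows the existence of $n_0 \in \mathbb{N}^*$ such that
\[
|x_n - y_n| < \varepsilon / 2 , \quad |y_n - z_n| < \varepsilon / 2
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence
\[
|x_n - z_n| = |x_n - y_n + y_n - z_n| \leq |x_n - y_n| + |y_n - z_n| < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Therefore it follows that $x \sim z$.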
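The equivalence of Definition 5.1.2 can be explored on two different rational sequences whose squares approach 2. The sketch below is an illustration under stated assumptions, not part of the original text: `newton_sqrt2` is the Newton iteration, `pell_sqrt2` uses the recurrence $(p, q) \mapsto (p + 2q, p + q)$, and `smallest_n0` searches a finite initial segment for an index $n_0$ as required by the definition. All three names are hypothetical helpers.

```python
from fractions import Fraction


def newton_sqrt2(count):
    """Terms of x_{k+1} = (x_k + 2/x_k)/2 with x_1 = 1."""
    terms = [Fraction(1)]
    while len(terms) < count:
        terms.append((terms[-1] + 2 / terms[-1]) / 2)
    return terms


def pell_sqrt2(count):
    """Fractions p/q generated by (p, q) -> (p + 2q, p + q), starting from 1/1."""
    p, q, terms = 1, 1, []
    for _ in range(count):
        terms.append(Fraction(p, q))
        p, q = p + 2 * q, p + q
    return terms


def smallest_n0(x, y, eps):
    """Smallest 1-based n0 with |x_n - y_n| < eps for all listed n >= n0, else None."""
    n0 = None
    for n in range(len(x), 0, -1):
        if abs(x[n - 1] - y[n - 1]) < eps:
            n0 = n
        else:
            break
    return n0


x = newton_sqrt2(10)
y = pell_sqrt2(10)
print(smallest_n0(x, y, Fraction(1, 1000)))
```

On the listed terms the two sequences differ by $1/408$ at index 4 and by less than $1/1000$ from index 5 on, which is the behavior the definition asks for with $\varepsilon = 1/1000$; both sequences would therefore represent the same equivalence class.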
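Lemma 5.1.3. Let $x \in \mathcal{C}$ and $\bar{x}$ be a subsequence of $x$. Then $\bar{x} \in \mathcal{C}$ and $\bar{x} \sim x$.
Proof. Since $\bar{x}$ is a subsequence of $x$, there is a strictly increasing sequence $k_1, k_2, \dots$ of elements of $\mathbb{N}^*$ such that $\bar{x} = x_{k_1}, x_{k_2}, \dots$ and such that $k_n \geq n$ for all $n \in \mathbb{N}^*$. Since $x \in \mathcal{C}$, for rational $\varepsilon > 0$, there is $n_0 \in \mathbb{N}^*$ such that
\[
|x_m - x_n| < \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. In particular, this implies that
\[
|x_{k_m} - x_{k_n}| < \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$ since the last implies that $k_m \geq m \geq n_0$ and $k_n \geq n \geq n_0$. Hence it follows that $\bar{x} \in \mathcal{C}$. Also, it follows that
\[
|\bar{x}_n - x_n| = |x_{k_n} - x_n| < \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$ since the last implies that $k_n \geq n \geq n_0$.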
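Definition 5.1.4. (Cantor real numbers) For every $x \in \mathcal{C}$, we define the associated Cantor real number as the equivalence class $[x]$ defined by
\[
[x] := \{ y : y \in \mathcal{C} \wedge y \sim x \} .
\]
Also, we define the set $\mathbf{C}$ of Cantor real numbers by
\[
\mathbf{C} := \{ [x] : x \in \mathcal{C} \} .
\]
For $[x], [y] \in \mathbf{C}$, where $x, y \in \mathcal{C}$, we define a corresponding sum and a corresponding product by
\[
[x] + [y] := [x + y] , \quad [x] \cdot [y] := [x \cdot y] .
\]
Indeed, this is possible, since it follows for $\bar{x}, \bar{y} \in \mathcal{C}$ satisfying $[\bar{x}] = [x]$ and $[\bar{y}] = [y]$ that
\[
[\bar{x} + \bar{y}] = [x + y] , \quad [\bar{x} \cdot \bar{y}] = [x \cdot y] .
\]
This can be seen as follows. First, since $[\bar{x}] = [x]$ and $[\bar{y}] = [y]$, it follows that $\bar{x} \sim x$ and that $\bar{y} \sim y$. Hence for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[
|\bar{x}_n - x_n| < \varepsilon / 2 , \quad |\bar{y}_n - y_n| < \varepsilon / 2
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. For such $n$, it follows that
\[
|(\bar{x} + \bar{y})_n - (x + y)_n| = |\bar{x}_n - x_n + \bar{y}_n - y_n| \leq |\bar{x}_n - x_n| + |\bar{y}_n - y_n| < \varepsilon
\]
and therefore that
\[
(\bar{x} + \bar{y}) \sim (x + y) .
\]
Also, if $C_x$, $C_{\bar{y}}$ are rational bounds for $x$ and $\bar{y}$, respectively, and $\varepsilon$ is some rational number $> 0$, then there is $n_0 \in \mathbb{N}^*$ such that
\[
C_{\bar{y}} \, |\bar{x}_n - x_n| < \varepsilon / 2 , \quad C_x \, |\bar{y}_n - y_n| < \varepsilon / 2
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. For such $n$, it follows that
\[
|(\bar{x} \cdot \bar{y})_n - (x \cdot y)_n| = |\bar{x}_n \bar{y}_n - x_n y_n| = |(\bar{x}_n - x_n) \bar{y}_n + x_n (\bar{y}_n - y_n)|
\leq |\bar{y}_n| \, |\bar{x}_n - x_n| + |x_n| \, |\bar{y}_n - y_n| \leq C_{\bar{y}} \, |\bar{x}_n - x_n| + C_x \, |\bar{y}_n - y_n| < \varepsilon
\]
and hence that
\[
(\bar{x} \cdot \bar{y}) \sim (x \cdot y) .
\]
Finally, we define the embedding $\iota$ of $\mathbb{Q}$ into $\mathbf{C}$ by
\[
\iota(q) := [q, q, \dots]
\]
for all $q \in \mathbb{Q}$. It is an obvious consequence of the definitions that
\[
\iota(q + \bar{q}) = \iota(q) + \iota(\bar{q}) , \quad \iota(q \cdot \bar{q}) = \iota(q) \cdot \iota(\bar{q}) ,
\]
for all $q, \bar{q} \in \mathbb{Q}$.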
Theorem 5.1.5. $(\mathbf{C}, +, \cdot)$ is a field, i.e., the following holds for all $x, y, z \in \mathcal{C}$:
(i) $([x] + [y]) + [z] = [x] + ([y] + [z])$ , (Associativity of addition)
(ii) $[x] + [y] = [y] + [x]$ , (Commutativity of addition)
(iii) $[x] + \iota(0) = [x]$ , (Existence of a neutral element for addition)
(iv) $[x] + [-x] = \iota(0)$ , (Existence of inverse elements for addition)
(v) $([x] \cdot [y]) \cdot [z] = [x] \cdot ([y] \cdot [z])$ , (Associativity of multiplication)
(vi) $[x] \cdot [y] = [y] \cdot [x]$ , (Commutativity of multiplication)
(vii) $[x] \cdot \iota(1) = [x]$ , (Existence of a neutral element for multiplication)
(viii) If $[x] \neq \iota(0)$, then there is $w \in \mathcal{C}$ such that $[x] \cdot [w] = \iota(1)$ , (Existence of inverse elements for multiplication)
(ix) $[x] \cdot ([y] + [z]) = [x] \cdot [y] + [x] \cdot [z]$ . (Distributive law)
Proof. The validity of the statements (i)-(vii) and (ix) is an obvious consequence of the analogous laws for rational numbers and the definition of the addition and multiplication on $\mathbf{C}$. For the proof of (viii), let $x \in \mathcal{C}$ be such that $[x] \neq \iota(0)$. As a consequence, it is not true that for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[
|x_n| < \varepsilon \tag{5.1.1}
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Therefore, there is a rational $\delta > 0$ for which there is no $n_0 \in \mathbb{N}^*$ such that (5.1.1), with $\varepsilon$ replaced by $\delta$, is valid for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. This implies the existence of a strictly increasing sequence $n_1, n_2, \dots$ of natural numbers such that
\[
|x_{n_k}| \geq \delta
\]
for all $k \in \mathbb{N}^*$. According to Lemma 5.1.3, $\bar{x} := x_{n_1}, x_{n_2}, \dots \in \mathcal{C}$ and $\bar{x} \sim x$. The last implies that $[\bar{x}] = [x]$. Therefore, we can assume without restriction that
\[
|x_n| \geq \delta
\]
for all $n \in \mathbb{N}^*$. We define $w := 1/x_1, 1/x_2, \dots$. Then
\[
|w_m - w_n| = \left| \frac{1}{x_m} - \frac{1}{x_n} \right| = \frac{|x_m - x_n|}{|x_m| \, |x_n|} \leq \frac{|x_m - x_n|}{\delta^2} .
\]
Hence if $\varepsilon$ is rational such that $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ is such that
\[
|x_m - x_n| < \delta^2 \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$, then also
\[
|w_m - w_n| \leq \varepsilon
\]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. As a consequence, $w \in \mathcal{C}$ and $[x] \cdot [w] = \iota(1)$.
In the next step, after preparation by a Lemma, we define an order relation '$<$' on $\mathbf{C}$.
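Lemma 5.1.6. Let $x, y \in \mathcal{C}$ be such that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
x_n \leq y_n - \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Further let $\bar{x}, \bar{y} \in \mathcal{C}$ be such that $\bar{x} \sim x$ and $\bar{y} \sim y$. Then, there are a rational $\bar{\varepsilon} > 0$ and a $\bar{n}_0 \in \mathbb{N}^*$ such that
\[
\bar{x}_n \leq \bar{y}_n - \bar{\varepsilon}
\]
for all $n \in \mathbb{N}^*$ such that $n \geq \bar{n}_0$.
Proof. Since $\bar{x} \sim x$ and $\bar{y} \sim y$, it follows the existence of $N \in \mathbb{N}^*$ such that
\[
\bar{x}_n - x_n \leq |\bar{x}_n - x_n| < \varepsilon / 4 , \quad y_n - \bar{y}_n \leq |\bar{y}_n - y_n| < \varepsilon / 4
\]
for all $n \in \mathbb{N}^*$ such that $n \geq N$. Hence it follows for $n \in \mathbb{N}^*$ satisfying $n \geq \max\{ n_0, N \}$ that
\[
\bar{x}_n < x_n + \frac{\varepsilon}{4} \leq y_n - \varepsilon + \frac{\varepsilon}{4} < \bar{y}_n - \varepsilon + \frac{\varepsilon}{4} + \frac{\varepsilon}{4}
\]
and hence that
\[
\bar{x}_n \leq \bar{y}_n - \bar{\varepsilon}
\]
where $\bar{\varepsilon} := \varepsilon / 2$.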
As a consequence of the previous lemma, it is meaningful to define the following.
Definition 5.1.7. For $[x], [y] \in \mathbf{C}$, we say that $[x]$ is smaller than $[y]$ and denote this by
\[
[x] < [y]
\]
if there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
x_n \leq y_n - \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Further, we say that $[x] \in \mathbf{C}$ is equal or smaller than $[y] \in \mathbf{C}$ and denote this by
\[
[x] \leq [y]
\]
if $[x] < [y]$ or if $[x] = [y]$. Finally, we define for every $[x] \in \mathbf{C}$ its absolute value $|[x]|$ by
\[
|[x]| := \begin{cases} [x] & \text{if } \iota(0) \leq [x] \\ -[x] & \text{if } [x] < \iota(0) . \end{cases}
\]
It is an obvious consequence of the definitions that for $q_1, q_2 \in \mathbb{Q}$ it follows in the case $q_1 < q_2$ that
\[
\iota(q_1) < \iota(q_2) ,
\]
in the case $q_1 \leq q_2$ that
\[
\iota(q_1) \leq \iota(q_2)
\]
and that
\[
|\iota(q_1)| = \iota(|q_1|) .
\]
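Theorem 5.1.8. Let $[x], [y] \in \mathbf{C}$. Then
(i) $\iota(0) \not< \iota(0)$ ;
(ii) if $[x] \neq \iota(0)$, then either $\iota(0) < [x]$ or $[x] < \iota(0)$ ;
(iii) if $\iota(0) < [x]$ and $\iota(0) < [y]$, then
\[
\iota(0) < [x] + [y] , \quad \iota(0) < [x] \cdot [y] .
\]
Proof. '(i)': The proof is indirect. Assume that $\iota(0) < \iota(0)$. Then there is a rational $\varepsilon > 0$ such that $0 \leq - \varepsilon$, which is a contradiction. '(ii)': For this, let $[x] \neq \iota(0)$. Further, assume that both $[x] \not< \iota(0)$ and $\iota(0) \not< [x]$. Then it is not true that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
x_n \leq - \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence for every rational $\varepsilon > 0$ and every $n_0 \in \mathbb{N}^*$, there is $n \in \mathbb{N}^*$ such that $n \geq n_0$ and
\[
x_n > - \varepsilon .
\]
Therefore, there is a subsequence $\bar{x}$ of $x$ such that
\[
\bar{x}_n > - \frac{1}{n}
\]
for every $n \in \mathbb{N}^*$. Since $\bar{x} \in [x]$ according to Lemma 5.1.3, we can assume without restriction that
\[
x_n > - \frac{1}{n}
\]
for every $n \in \mathbb{N}^*$. Further, since $\iota(0) \not< [x]$, it is not true that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
0 \leq x_n - \varepsilon
\]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence for every rational $\varepsilon > 0$ and every $n_0 \in \mathbb{N}^*$, there is $n \in \mathbb{N}^*$ such that $n \geq n_0$ and
\[
x_n < \varepsilon .
\]
Therefore, there is a subsequence $\bar{x}$ of $x$ such that
\[
\bar{x}_n < \frac{1}{n}
\]
for every $n \in \mathbb{N}^*$. Since $\bar{x} \in [x]$ according to Lemma 5.1.3, we can assume without restriction that
\[
- \frac{1}{n} < x_n < \frac{1}{n}
\]
for every $n \in \mathbb{N}^*$. Obviously, this implies that $x \sim 0, 0, \dots$ and hence that $[x] = \iota(0)$, which is a contradiction. Hence it follows that either $\iota(0) < [x]$ or $[x] < \iota(0)$ are true or that both of these inequalities are true. The last implies that there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
0 \leq x_n - \varepsilon
\]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$ and also that there are a rational $\delta > 0$ and $m_0 \in \mathbb{N}^*$ such that
\[
x_n \leq - \delta
\]
for all $n \in \mathbb{N}^*$ satisfying $n \geq m_0$, which is a contradiction. '(iii)': For this, let $\iota(0) < [x]$ and $\iota(0) < [y]$. Then, there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
\varepsilon \leq x_n , \quad \varepsilon \leq y_n
\]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. Hence it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$ that
\[
2 \varepsilon \leq x_n + y_n , \quad \varepsilon^2 \leq x_n \cdot y_n
\]
and finally that
\[
\iota(0) < [x] + [y] , \quad \iota(0) < [x] \cdot [y] .
\]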
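Theorem 5.1.9. Let $[x], [y]$ be elements of $\mathbf{C}$ such that $[x] < [y]$. Then there is $q \in \mathbb{Q}$ such that
\[
[x] < \iota(q) < [y] . \tag{5.1.2}
\]
Proof. Since $[x] < [y]$, there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[
x_n \leq y_n - \varepsilon
\]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. Further, since $x, y \in \mathcal{C}$, there is $m_0 \in \mathbb{N}^*$ such that $m_0 \geq n_0$ and
\[
|x_{m_0} - x_n| < \varepsilon / 4 , \quad |y_{m_0} - y_n| < \varepsilon / 4
\]
for all $n \in \mathbb{N}^*$ satisfying $n \geq m_0$. We define $q := (x_{m_0} + y_{m_0}) / 2$. Then it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq m_0$ that
\[
q - x_n = \frac{1}{2} (x_{m_0} + y_{m_0}) - x_n = x_{m_0} - x_n + \frac{1}{2} (y_{m_0} - x_{m_0}) > \frac{\varepsilon}{4}
\]
\[
y_n - q = y_n - \frac{1}{2} (x_{m_0} + y_{m_0}) = y_n - y_{m_0} + \frac{1}{2} (y_{m_0} - x_{m_0}) > \frac{\varepsilon}{4} .
\]
Hence it follows (5.1.2).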
Definition 5.1.10. Let $[x_1], [x_2], \dots$ be a sequence of elements of $\mathbf{C}$ and $[x] \in \mathbf{C}$.
(i) We call $[x_1], [x_2], \dots$ a Cauchy sequence if for every $[\varepsilon] \in \mathbf{C}$ such that $\iota(0) < [\varepsilon]$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that
\[
|[x_m] - [x_n]| < [\varepsilon]
\]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$.
(ii) We define
\[
\lim_{n \to \infty} [x_n] = [x]
\]
if for every $[\varepsilon] \in \mathbf{C}$ such that $\iota(0) < [\varepsilon]$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that for all $n \geq n_0$:
\[
|[x_n] - [x]| < [\varepsilon] . \tag{5.1.3}
\]
In this case, we say that the sequence $[x_1], [x_2], \dots$ is convergent to $[x]$.
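Theorem 5.1.11.
(i) (The rational numbers are dense in the real numbers) Let $[x]$ be some element of $\mathbf{C}$. Then
\[
\lim_{n \to \infty} \iota(x_n) = [x] .
\]
(ii) (Completeness of the real number system) Every Cauchy sequence in $\mathbf{C}$ is convergent.
Proof. '(i)': For this, let $[\varepsilon] \in \mathbf{C}$ be such that $\iota(0) < [\varepsilon]$. Then there is a rational $\delta > 0$ such that
\[
\iota(0) < \iota(\delta) < [\varepsilon] / 2
\]
as a consequence of Theorem 5.1.9. Further, it follows for $m \in \mathbb{N}^*$ that
\[
\iota(x_m) - [x] = [x_m - x_1, x_m - x_2, \dots]
\]
and, since $x \in \mathcal{C}$, the existence of $n_0 \in \mathbb{N}^*$ such that
\[
|x_m - x_n| < \delta
\]
for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$. Hence it follows for such $m, n$ that
\[
- 2 \delta + \delta < x_m - x_n < 2 \delta - \delta
\]
and therefore that
\[
- [\varepsilon] < - 2 \, \iota(\delta) \leq \iota(x_m) - [x] \leq 2 \, \iota(\delta) < [\varepsilon] .
\]
This implies that
\[
|\iota(x_m) - [x]| < [\varepsilon]
\]
for all $m \in \mathbb{N}^*$ satisfying $m \geq n_0$. '(ii)': For this, let $[x_1], [x_2], \dots$ be a Cauchy sequence in $\mathbf{C}$. In addition, for every $n \in \mathbb{N}^*$, let $q_n \in \mathbb{Q}$ be such that
\[
[x_n] < \iota(q_n) < [x_n] + \iota(1/n) .
\]
Such $q_n$ exists according to Theorem 5.1.9. In the following, we will show that
\[
\lim_{n \to \infty} [x_n] = [q]
\]
where
\[
q := q_1, q_2, \dots .
\]
First, we show that $q \in \mathcal{C}$. For this, let $\delta > 0$ be rational and $n_0 \in \mathbb{N}^*$ be such that $n_0 > 4 / \delta$ and such that
\[
|[x_m] - [x_n]| < \iota(\delta) / 2
\]
for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$. For such $m$ and $n$, it follows that
\[
|\iota(q_m) - \iota(q_n)| = |\iota(q_m) - [x_m] + [x_m] - [x_n] + [x_n] - \iota(q_n)|
\leq |\iota(q_m) - [x_m]| + |[x_m] - [x_n]| + |[x_n] - \iota(q_n)|
< \iota(1/m) + \iota(1/n) + |[x_m] - [x_n]| < \iota(\delta)
\]
and hence also that
\[
|q_m - q_n| < \delta .
\]
Further, let $[\varepsilon] \in \mathbf{C}$ be such that $\iota(0) < [\varepsilon]$. Since according to (i)
\[
\lim_{n \to \infty} \iota(q_n) = [q] ,
\]
it follows the existence of $n_0 \in \mathbb{N}^*$ such that $\iota(1/n_0) < [\varepsilon] / 2$ and such that for all $n \geq n_0$:
\[
|\iota(q_n) - [q]| < [\varepsilon] / 2 .
\]
This also implies that
\[
|[x_n] - [q]| = |[x_n] - \iota(q_n) + \iota(q_n) - [q]| \leq |[x_n] - \iota(q_n)| + |\iota(q_n) - [q]| < \iota(1/n) + ([\varepsilon] / 2) \leq [\varepsilon] .
\]
5.2 Lebesgue's Criterion for Riemann-integrability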
Apart from notational changes and additions, we follow Sect. 7.26 of [5] in
the proof of Lebesgue’s criterion for Riemann-integrability, Theorem 2.6.13.
Theorem 5.2.1. Let $S_0, S_1, \dots$ be a sequence of subsets of measure zero of $\mathbb{R}$. Then the union $S$ of these subsets has measure zero, too.
Proof. Given $\varepsilon > 0$, for each $k \in \mathbb{N}$ there is a sequence $I_{k0}, I_{k1}, \dots$ of open subintervals of $\mathbb{R}$ such that the union of these intervals contains $S_k$ and at the same time such that
\[
\lim_{n \to \infty} \sum_{m=0}^{n} l(I_{km}) < \frac{\varepsilon}{2^{k+1}} .
\]
The sequence of all intervals $I_{kl}$, where $k, l \in \mathbb{N}$, is countable; the union of all these intervals contains $S$ and
\[
\lim_{n \to \infty} \sum_{k=0}^{n} l(I_{\varphi(k)}) \leq \lim_{l \to \infty} \sum_{k=0}^{l} \frac{\varepsilon}{2^{k+1}} = \varepsilon
\]
where $\varphi : \mathbb{N} \to \mathbb{N}^2$ is some bijection.
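The geometric choice of lengths in the proof of Theorem 5.2.1 can be illustrated in the simplest case, where each set $S_k$ is a single point. The following minimal sketch (not part of the original text) covers the first terms of an enumeration of rational points in $[0, 1]$ by open intervals of length $\varepsilon / 2^{k+1}$ and sums the lengths exactly. The enumeration, the helper names `cover` and `total_length`, and the value of `eps` are assumptions made for the example.

```python
from fractions import Fraction


def cover(points, eps):
    """Open interval of length eps / 2**(k+1) centred at the k-th point."""
    intervals = []
    for k, p in enumerate(points):
        half = Fraction(eps) / 2 ** (k + 2)   # half of eps / 2**(k+1)
        intervals.append((p - half, p + half))
    return intervals


def total_length(intervals):
    """Sum of the lengths b - a of the intervals (a, b)."""
    return sum(b - a for a, b in intervals)


# Finitely many terms of an enumeration of the rationals in [0, 1];
# repetitions are harmless for the covering argument.
points = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
eps = Fraction(1, 100)
intervals = cover(points, eps)

print(len(points), total_length(intervals) < eps)
```

For $N$ points the total length is $\varepsilon (1 - 2^{-N})$, which stays below $\varepsilon$ no matter how many points of the enumeration are included; this is the same summation that bounds the total length in the proof.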
Definition 5.2.2. (Oscillation of a function) Let $I$ be some non-trivial interval of $\mathbb{R}$ and $f : I \to \mathbb{R}$ be some bounded function. Then we define for every non-trivial subset $S$ of $I$ the oscillation $\Omega_f(S)$ of $f$ on $S$ by
\[
\Omega_f(S) := \sup\{ f(x) - f(y) : x \in S \wedge y \in S \} .
\]
(Note that the set in the previous identity is bounded from above by $\sup\{ f(y) : y \in I \} - \inf\{ f(y) : y \in I \}$. In addition, note that $\Omega_f(S)$ is positive.) Further, we define for each $x \in I$ the oscillation $\omega_f(x)$ of $f$ at $x$ by the limit
\[
\omega_f(x) := \lim_{\delta \to 0} \Omega_f((x - \delta, x + \delta) \cap I)
\]
of the decreasing function that associates to every $\delta > 0$ the value of $\Omega_f((x - \delta, x + \delta) \cap I)$.
Theorem 5.2.3. Let $I$ be some non-trivial interval of $\mathbb{R}$ and $f : I \to \mathbb{R}$ be some bounded function and $x \in I$. Then $f$ is continuous in $x$ if and only if $\omega_f(x) = 0$.
Proof. First, we consider the case that $f$ is continuous in $x$. Then for every $n \in \mathbb{N}^*$ there are $x_n, y_n \in (x - 1/n, x + 1/n) \cap I$ such that
\[
|f(x_n) - f(y_n) - \Omega_f((x - 1/n, x + 1/n) \cap I)| \leq \frac{1}{n} .
\]
Hence $f(x_1) - f(y_1) - \Omega_f((x - 1, x + 1) \cap I)$, $f(x_2) - f(y_2) - \Omega_f((x - 1/2, x + 1/2) \cap I)$, $\dots$ is a null sequence. Since both sequences $x_1, x_2, \dots$ and $y_1, y_2, \dots$ are converging to $x$ and since $f$ is continuous in $x$, it follows by Theorem 2.3.4 that $\omega_f(x) = 0$. Finally, we consider the case that $\omega_f(x) = 0$. Assume that $f$ is not continuous in $x$. Hence there is some $\varepsilon > 0$ along with a sequence $x_1, x_2, \dots$ in $I \setminus \{x\}$ which is convergent to $x$, but such that
\[
|f(x_n) - f(x)| \geq \varepsilon .
\]
Hence
\[
\Omega_f((x - \delta_n, x + \delta_n) \cap I) \geq \varepsilon
\]
for all $n \in \mathbb{N}^*$ where $\delta_n := 2 |x_n - x|$ for all $n \in \mathbb{N}^*$. Since $\delta_1, \delta_2, \dots$ is converging to $0$, it follows that $\omega_f(x) \geq \varepsilon$, in contradiction to $\omega_f(x) = 0$. Hence $f$ is continuous in $x$.
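The statement of Theorem 5.2.3 can be made tangible numerically. The sketch below is an illustration only, not part of the original text: it approximates $\Omega_f((x - \delta, x + \delta))$ by sampling $f$ on a finite grid, which in general gives a lower estimate of the supremum, and evaluates it for shrinking $\delta$. The functions `oscillation_on` and `oscillation_at`, the grid size, and the two test functions are assumptions made for the example, and the interval $I$ is taken to be all of $\mathbb{R}$.

```python
def oscillation_on(f, lo, hi, samples=2001):
    """Approximate Omega_f on (lo, hi) by max f - min f over an interior grid."""
    step = (hi - lo) / (samples + 1)
    values = [f(lo + (i + 1) * step) for i in range(samples)]
    return max(values) - min(values)


def oscillation_at(f, x, deltas=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Values of Omega_f((x - d, x + d)) for shrinking d; their limit is omega_f(x)."""
    return [oscillation_on(f, x - d, x + d) for d in deltas]


def step_fn(t):
    # Discontinuous at 0: jumps from 0 to 1.
    return 1.0 if t > 0 else 0.0


def square(t):
    # Continuous everywhere.
    return t * t


print(oscillation_at(step_fn, 0.0))   # stays at the jump height
print(oscillation_at(square, 0.0))    # shrinks towards 0
```

For the step function the sampled oscillation stays equal to the jump height however small $\delta$ becomes, while for the continuous function it decays roughly like $\delta^2$, in line with the characterization of continuity by $\omega_f(x) = 0$.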
Theorem 5.2.4. Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $\omega_f(x) < \varepsilon$ for every $x \in [a, b]$ and some $\varepsilon > 0$. Then there is $\delta > 0$ such that for every closed subinterval $I$ of $[a, b]$ of length smaller than $\delta$ it follows
\[
\Omega_f(I) < \varepsilon .
\]
Proof. First, it follows from the assumptions that for each $x \in [a, b]$ there is some $\delta_x > 0$ such that
\[
\Omega_f((x - \delta_x, x + \delta_x) \cap [a, b]) < \varepsilon .
\]
The family of sets $(x - \delta_x / 2, x + \delta_x / 2)$, where $x \in [a, b]$, is an open covering of $[a, b]$ and hence by the compactness of $[a, b]$ there are $x_1, x_2, \dots, x_n \in [a, b]$, where $n$ is some element of $\mathbb{N}^*$, such that $[a, b]$ is contained in the union of $(x_1 - \delta_{x_1} / 2, x_1 + \delta_{x_1} / 2)$, $(x_2 - \delta_{x_2} / 2, x_2 + \delta_{x_2} / 2)$, $\dots$, $(x_n - \delta_{x_n} / 2, x_n + \delta_{x_n} / 2)$. Now define $\delta := \min\{ \delta_{x_1} / 2, \delta_{x_2} / 2, \dots, \delta_{x_n} / 2 \}$ and let $I$ be some closed subinterval of $[a, b]$ of length smaller than $\delta$. Further, let $k$ be some element of $\{ 1, 2, \dots, n \}$ such that $(x_k - \delta_{x_k} / 2, x_k + \delta_{x_k} / 2) \cap I \neq \emptyset$. Then,
\[
I \subset (x_k - \delta_{x_k}, x_k + \delta_{x_k}) ,
\]
since $l(I) < \delta$, and hence $\Omega_f(I) < \varepsilon$.
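Theorem 5.2.5. Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $\varepsilon > 0$. Then
\[
J_\varepsilon := \{ x \in [a, b] : \omega_f(x) \geq \varepsilon \}
\]
is a closed subset of $[a, b]$.
Proof. Let $x$ be some element of the complement $[a, b] \setminus J_\varepsilon$. Then $\omega_f(x) < \varepsilon$ and hence there is some $\delta > 0$ such that $\Omega_f((x - \delta, x + \delta) \cap [a, b]) < \varepsilon$. In particular, it follows for every element $y \in (x - \delta, x + \delta) \cap [a, b]$ that $\omega_f(y) < \varepsilon$ and as a consequence that $(x - \delta, x + \delta) \cap [a, b]$ is contained in $[a, b] \setminus J_\varepsilon$. Hence $[a, b] \setminus J_\varepsilon$ is open in $[a, b]$ and therefore $J_\varepsilon$ is a closed subset of $[a, b]$.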
We prove now Theorem 2.6.13:
Theorem 5.2.6. (Lebesgue's criterion for Riemann-integrability) Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $D$ be the set of discontinuities of $f$. Then $f$ is Riemann-integrable if and only if $D$ is a set of measure zero.
Proof. First, assume that $D$ is not of measure zero. Then $D$ is non-empty and by Theorem 5.2.3 it follows that $\omega_f(x) > 0$ for every $x \in D$. Hence
\[
D = \bigcup_{n=1}^{\infty} J_{1/n} . \tag{5.2.1}
\]
Since the union in (5.2.1) is countable, by Theorem 5.2.1 it follows the existence of some $n \in \mathbb{N}^*$ such that $J_{1/n}$ is not a set of measure zero. Hence there is some $\varepsilon > 0$ such that the sum of the lengths of the intervals corresponding to any covering of $J_{1/n}$ by open intervals is $\geq \varepsilon$. Now let $P$ be some partition of $[a, b]$ with corresponding closed intervals $I_0, I_1, \dots, I_k$ where $k \in \mathbb{N}$. Further, denote by $S$ the subset of $\{ 0, 1, \dots, k \}$ containing only those indices $j \in \{ 0, 1, \dots, k \}$ for which the intersection of the interior of $I_j$ and $J_{1/n}$ is non-empty. Then the open intervals corresponding to $I_j$, $j \in S$, cover $J_{1/n}$, except possibly for a finite set, which is a set of measure zero. Hence the sum of their lengths is $\geq \varepsilon$. Therefore
\[
U(f, P) - L(f, P) = \sum_{j=0}^{k} \left[ \sup\{ f(x) : x \in I_j \} - \inf\{ f(x) : x \in I_j \} \right] l(I_j)
\geq \sum_{j \in S} \left[ \sup\{ f(x) : x \in I_j \} - \inf\{ f(x) : x \in I_j \} \right] l(I_j)
\geq \frac{1}{n} \sum_{j \in S} l(I_j) \geq \frac{\varepsilon}{n}
\]
and hence $f$ is not Riemann-integrable. Finally, assume that $D$ is a set of measure zero and consider again (5.2.1). Further, let $n \in \mathbb{N}^*$ be such that $1/n < (b - a)/2$. Then $J_{1/n}$ is by Theorem 5.2.5 compact and has measure zero as a subset of a set of measure zero. Hence there is a covering of $J_{1/n}$ by a finite number of open intervals for which the corresponding sum of lengths is smaller than $1/n$. Without restriction we can assume that those intervals are pairwise disjoint. Denote by $A_n$ the union of those intervals. Then the complement $B_n := [a, b] \setminus A_n$ is the union of a finite number of closed subintervals of $[a, b]$. Let $I$ be such a subinterval. Then $\omega_f(x) < 1/n$ for each $x \in I$ and hence by Theorem 5.2.4 there is a partition of $I$ such that $\Omega_f(I') < 1/n$ for any induced subinterval $I'$. All those partitions induce a partition $P_n$ of $[a, b]$. Now consider some refinement $P \in \mathcal{P}$ of $P_n$ with corresponding closed intervals $I_0, I_1, \dots, I_k$ where $k \in \mathbb{N}$. Further, denote by $S$ the subset of $\{ 0, 1, \dots, k \}$ containing only those indices $j \in \{ 0, 1, \dots, k \}$ for which $I_j \cap J_{1/n} \neq \emptyset$. Then
\[
\sum_{j \notin S} \left[ \sup\{ f(x) : x \in I_j \} - \inf\{ f(x) : x \in I_j \} \right] l(I_j) \leq \frac{1}{n} \sum_{j \notin S} l(I_j) \leq \frac{b - a}{n} ,
\]
\[
\sum_{j \in S} \left[ \sup\{ f(x) : x \in I_j \} - \inf\{ f(x) : x \in I_j \} \right] l(I_j) \leq (M - m) \sum_{j \in S} l(I_j) \leq \frac{M - m}{n} ,
\]
where $M := \sup\{ f(x) : x \in [a, b] \}$ and $m := \inf\{ f(x) : x \in [a, b] \}$, and hence
\[
U(f, P) - L(f, P) \leq \frac{b - a + M - m}{n} .
\]
Therefore
\[
\inf(\{ U(f, P) : P \in \mathcal{P} \}) \leq U(f, P) \leq L(f, P) + \frac{b - a + M - m}{n}
\leq \sup(\{ L(f, P) : P \in \mathcal{P} \}) + \frac{b - a + M - m}{n} .
\]
Since this is true for any $n \in \mathbb{N}^*$ such that $1/n < (b - a)/2$, from this it follows by Theorem 4.4.4 that $f$ is Riemann-integrable.
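5.3 Properties of the Determinant
Lemma 5.3.1. (Leibniz' formula for the determinant) Let $n \in \mathbb{N}^*$ and $(a_1, \dots, a_n)$ be an $n$-tuple of elements of $\mathbb{R}^n$. Then
(i)
\[
\det(a_1, \dots, a_n) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma) \, a_{1\sigma(1)} \cdots a_{n\sigma(n)}
\]
where $S_n$ denotes the set of permutations of $\{ 1, \dots, n \}$, i.e., the set of all bijections from $\{ 1, \dots, n \}$ to $\{ 1, \dots, n \}$ and
\[
\operatorname{sign}(\sigma) := s(\sigma(1), \dots, \sigma(n)) = \prod_{i,j=1, \, i<j}^{n} \operatorname{sgn}(\sigma(j) - \sigma(i))
\]
for all $\sigma \in S_n$. Note that $\operatorname{sign}(\sigma) = 1$ if the number of pairs $(i, j) \in \{ 1, \dots, n \}^2$ such that $i < j$ and $\sigma(j) < \sigma(i)$ is even, whereas $\operatorname{sign}(\sigma) = -1$ if that number is odd.
(ii)
\[
\operatorname{sign}(\sigma) = \prod_{i,j=1, \, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i}
\]
for all $\sigma \in S_n$.
(iii)
\[
\operatorname{sign}(\tau \circ \sigma) = \operatorname{sign}(\tau) \cdot \operatorname{sign}(\sigma)
\]
for all $\tau, \sigma \in S_n$.
(iv) In addition, let $n \geq 2$. Further, let $\tau$ be a transposition, i.e., an element of $S_n$ for which there are elements $k, l \in \{ 1, \dots, n \}$ such that $k \neq l$ and such that $\tau(i) = i$ for all $i \in \{ 1, \dots, n \} \setminus \{ k, l \}$, $\tau(k) = l$ and $\tau(l) = k$. Then there is $\sigma \in S_n$ such that
\[
\tau = \sigma \circ \tau_0 \circ \sigma^{-1}
\]
where $\tau_0 \in S_n$ is the transposition defined by $\tau_0(i) := i$ for all $i \in \{ 1, \dots, n \} \setminus \{ 1, 2 \}$, $\tau_0(1) := 2$ and $\tau_0(2) := 1$. Note that from this it follows by (iii) that
\[
\operatorname{sign}(\tau) = \operatorname{sign}(\sigma \circ \tau_0 \circ \sigma^{-1}) = \operatorname{sign}(\sigma) \cdot \operatorname{sign}(\tau_0) \cdot \operatorname{sign}(\sigma^{-1}) = \operatorname{sign}(\tau_0) = -1 .
\]
(v) In addition, let $n \geq 2$. For every $\sigma \in S_n$, there is $k \in \mathbb{N}^*$ and a sequence $\tau_1, \dots, \tau_k$ of transpositions in $S_n$ such that
\[
\sigma = \tau_1 \circ \cdots \circ \tau_k .
\]
(vi) For every $i \in \{ 1, \dots, n \}$, define
\[
\bar{a}_i := \sum_{j=1}^{n} a_{ji} \, e_j
\]
where $e_1, \dots, e_n$ is the canonical basis of $\mathbb{R}^n$. Then
\[
\det(\bar{a}_1, \dots, \bar{a}_n) =
\begin{vmatrix} a_{11} & \cdots & a_{n1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{nn} \end{vmatrix}
=
\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}
= \det(a_1, \dots, a_n) .
\]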
Proof. ‘(i)’: The statement of (i) is a direct consequence of Definition 3.5.18 and the definitions given in (i). ‘(ii)’: For this let $\sigma \in S_n$ and denote by $m$ the number of pairs $(i,j) \in \{1, \dots, n\}^2$ such that $i < j$ and $\sigma(j) < \sigma(i)$. Then
\[ \prod_{i,j=1,\, i<j}^{n} (\sigma(j) - \sigma(i)) = \prod_{\substack{i,j=1,\, i<j, \\ \sigma(i)<\sigma(j)}}^{n} (\sigma(j) - \sigma(i)) \cdot (-1)^m \prod_{\substack{i,j=1,\, i<j, \\ \sigma(j)<\sigma(i)}}^{n} |\sigma(j) - \sigma(i)| = (-1)^m \prod_{i,j=1,\, i<j}^{n} |\sigma(j) - \sigma(i)| = (-1)^m \prod_{i,j=1,\, i<j}^{n} (j - i) \]
where the last equality uses the bijectivity of $\sigma$. Hence it follows that
\[ \mathrm{sign}(\sigma) = (-1)^m = \prod_{i,j=1,\, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i} . \]
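‘(iii)’: First, it follows from (ii) that
\[ \mathrm{sign}(\tau \circ \sigma) = \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{j - i} = \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{i,j=1,\, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i} = \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \mathrm{sign}(\sigma) . \]
Further,
\[ \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} = \prod_{\substack{i,j=1,\, i<j, \\ \sigma(i)<\sigma(j)}}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{\substack{i,j=1,\, i<j, \\ \sigma(j)<\sigma(i)}}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \]
\[ = \prod_{\substack{i,j=1,\, i<j, \\ \sigma(i)<\sigma(j)}}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{\substack{i,j=1,\, i>j, \\ \sigma(i)<\sigma(j)}}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} = \prod_{\substack{i,j=1, \\ \sigma(i)<\sigma(j)}}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} = \prod_{i,j=1,\, i<j}^{n} \frac{\tau(j) - \tau(i)}{j - i} = \mathrm{sign}(\tau) \]
where the last two equalities use the bijectivity of $\sigma$. Hence, finally, it follows that
\[ \mathrm{sign}(\tau \circ \sigma) = \mathrm{sign}(\tau)\, \mathrm{sign}(\sigma) . \]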
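‘(iv)’: For this, let $k, l \in \{1, \dots, n\}$ be such that $k \neq l$ and such that $\tau(i) = i$ for all $i \in \{1, \dots, n\} \setminus \{k, l\}$, $\tau(k) = l$ and $\tau(l) = k$. Further, let $\sigma$ be some element of $S_n$ such that $\sigma(1) = k$ and $\sigma(2) = l$. Then
\[ (\sigma \circ \tau_0 \circ \sigma^{-1})(k) = (\sigma \circ \tau_0)(1) = \sigma(2) = l , \]
\[ (\sigma \circ \tau_0 \circ \sigma^{-1})(l) = (\sigma \circ \tau_0)(2) = \sigma(1) = k \]
and for $i \in \{1, \dots, n\} \setminus \{k, l\}$
\[ (\sigma \circ \tau_0 \circ \sigma^{-1})(i) = (\sigma \circ \sigma^{-1})(i) = i . \]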
‘(v)’: If $\sigma$ coincides with the identity transformation on $\{1, \dots, n\}$, then $\sigma = \tau \circ \tau$ for any transposition $\tau \in S_n$. If $\sigma$ differs from the identity transformation on $\{1, \dots, n\}$, then there is $i_1 \in \{1, \dots, n\}$ such that $\sigma(i) = i$ for all $i \in \{1, \dots, i_1 - 1\}$, where we define $\{1, \dots, 0\} := \emptyset$, and $\sigma(i_1) \neq i_1$. The last implies that $\sigma(i_1) > i_1$. We define the transposition $\tau_1 \in S_n$ by $\tau_1(i_1) := \sigma(i_1)$, $\tau_1(\sigma(i_1)) := i_1$ and $\tau_1(i) := i$ for all $i \in \{1, \dots, n\} \setminus \{i_1, \sigma(i_1)\}$. Then $\sigma_1 := \tau_1 \circ \sigma$ satisfies $\sigma_1(i) = i$ for all $i \in \{1, \dots, i_1\}$. Continuing this process, we arrive after finitely many steps at a sequence of transpositions $\tau_1, \dots, \tau_k$ in $S_n$, where $k$ is some element of $\mathbb{N}^*$, such that
\[ \mathrm{id}_{\{1,\dots,n\}} = \tau_k \circ \cdots \circ \tau_1 \circ \sigma . \]
Then
\[ \sigma = \tau_1^{-1} \circ \cdots \circ \tau_k^{-1} = \tau_1 \circ \cdots \circ \tau_k . \]
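‘(vi)’: It follows by (i), (iii) that
\[ \det(a_1, \dots, a_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma^{-1}(\sigma(1))\,\sigma(1)} \cdots a_{\sigma^{-1}(\sigma(n))\,\sigma(n)} \]
\[ = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma^{-1})\, a_{\sigma^{-1}(1)\,1} \cdots a_{\sigma^{-1}(n)\,n} = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, a_{\sigma(1)\,1} \cdots a_{\sigma(n)\,n} = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, \bar{a}_{1\sigma(1)} \cdots \bar{a}_{n\sigma(n)} = \det(\bar{a}_1, \dots, \bar{a}_n) . \]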
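The statements of Lemma 5.3.1 can be checked numerically for small $n$. The following Python sketch is an illustration only and is not part of the original text: the function names `sign`, `det_leibniz` and `compose` are hypothetical, permutations are stored as 0-based tuples, and the sum over $S_n$ has $n!$ terms, so this is practical only for small $n$.

```python
from itertools import permutations


def sign(sigma):
    """Sign of a permutation (tuple of 0-based images), via the inversion count."""
    n = len(sigma)
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if sigma[j] < sigma[i]
    )
    return -1 if inversions % 2 else 1


def det_leibniz(rows):
    """Determinant of the n-tuple `rows` as the sum over all permutations, as in (i)."""
    n = len(rows)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= rows[i][sigma[i]]
        total += term
    return total


def compose(tau, sigma):
    """Composition tau after sigma of two permutations."""
    return tuple(tau[sigma[i]] for i in range(len(sigma)))


a = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
a_bar = [list(column) for column in zip(*a)]  # the tuple from (vi)
print(det_leibniz(a), det_leibniz(a_bar))
```

Both printed values agree, in line with (vi), and `sign(compose(tau, sigma))` equals `sign(tau) * sign(sigma)` for all pairs in $S_3$, in line with (iii).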
Theorem 5.3.2. (Properties of the determinant) Let $n \in \mathbb{N}^*$, $e_1, \dots, e_n$ the canonical basis of $\mathbb{R}^n$, $(a_1, \dots, a_n)$ an $n$-tuple of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$, $a_i' \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$ and $j \in \{1, \dots, n\}$ such that $j > i$. Then

(i)
\[ \det(e_1, \dots, e_n) = 1 , \]

(ii)
\[ \det(a_1, \dots, a_i + a_i', \dots, a_n) = \det(a_1, \dots, a_i, \dots, a_n) + \det(a_1, \dots, a_i', \dots, a_n) , \]
\[ \det(a_1, \dots, \alpha\, a_i, \dots, a_n) = \alpha \det(a_1, \dots, a_n) , \]

(iii) if $n \geq 2$, then
\[ \det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = - \det(a_1, \dots, a_j, \dots, a_i, \dots, a_n) , \]

(iv) if $n \geq 2$ and $a_i = a_j$, then
\[ \det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = 0 . \]
Proof. ‘(i)’:
\[ \det(e_1, \dots, e_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, e_{1k_1} \cdots e_{nk_n} = s(1, \dots, n) = \prod_{i,j=1,\, i<j}^{n} \mathrm{sgn}(j - i) = 1 . \]
‘(ii)’:
\[ \det(a_1, \dots, a_i + a_i', \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, a_{1k_1} \cdots (a_{ik_i} + a'_{ik_i}) \cdots a_{nk_n} \]
\[ = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, a_{1k_1} \cdots a_{ik_i} \cdots a_{nk_n} + \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, a_{1k_1} \cdots a'_{ik_i} \cdots a_{nk_n} = \det(a_1, \dots, a_i, \dots, a_n) + \det(a_1, \dots, a_i', \dots, a_n) , \]
\[ \det(a_1, \dots, \alpha\, a_i, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, a_{1k_1} \cdots (\alpha a)_{ik_i} \cdots a_{nk_n} = \alpha \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n)\, a_{1k_1} \cdots a_{ik_i} \cdots a_{nk_n} = \alpha \det(a_1, \dots, a_i, \dots, a_n) . \]
‘(iii)’: For this, we define the $n$-tuple $(b_1, \dots, b_n)$ of elements of $\mathbb{R}^n$ by $b_k := a_k$ if $k \in \{1, \dots, n\} \setminus \{i, j\}$, $b_i := a_j$ and $b_j := a_i$. Then it follows by Lemma 5.3.1 (iv) that
\[ \det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n)\, a_{1k_1} \cdots a_{ik_i} \cdots a_{jk_j} \cdots a_{nk_n} \]
\[ = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n)\, b_{1k_1} \cdots b_{jk_i} \cdots b_{ik_j} \cdots b_{nk_n} = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_j, \dots, k_i, \dots, k_n)\, b_{1k_1} \cdots b_{jk_j} \cdots b_{ik_i} \cdots b_{nk_n} \]
\[ = - \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n)\, b_{1k_1} \cdots b_{ik_i} \cdots b_{jk_j} \cdots b_{nk_n} = - \det(a_1, \dots, a_j, \dots, a_i, \dots, a_n) . \]
‘(iv)’: The statement of (iv) is a simple consequence of (iii).
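As a hedged illustration (not from the original text), the properties (i) to (iv) of Theorem 5.3.2 can be spot-checked on a concrete triple of vectors in $\mathbb{R}^3$. The helper names `det` and `add` and the sample vectors are assumptions made for this sketch; `det` evaluates the Leibniz sum directly.

```python
from itertools import permutations


def det(rows):
    """Determinant of an n-tuple of vectors via the Leibniz sum over permutations."""
    n = len(rows)
    total = 0
    for sigma in permutations(range(n)):
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if sigma[j] < sigma[i]
        )
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= rows[i][sigma[i]]
        total += term
    return total


def add(u, v):
    """Componentwise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]


a1, a2, a3 = [1, 2, 0], [0, 1, 4], [3, -1, 2]
a2_prime = [2, 0, 5]
alpha = 7

# (ii): additivity and homogeneity in the second slot
lhs_add = det([a1, add(a2, a2_prime), a3])
rhs_add = det([a1, a2, a3]) + det([a1, a2_prime, a3])
lhs_scale = det([a1, [alpha * x for x in a2], a3])

# (iii) and (iv): swapping two entries flips the sign, equal entries give zero
swapped = det([a3, a2, a1])
repeated = det([a1, a2, a1])
print(lhs_add == rhs_add, lhs_scale, swapped, repeated)
```

A check on one example is of course no proof; it only makes the sign conventions in (ii) and (iii) concrete.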
Theorem 5.3.3. (Uniqueness of the determinant) Let $n \in \mathbb{N}^*$ and $w$ be a map which associates to every $n$-tuple of elements of $\mathbb{R}^n$ a real number. In particular, let $w$ be such that

(i) for the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$
\[ w(e_1, \dots, e_n) = 1 , \]

(ii) for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$, $a_i' \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$
\[ w(a_1, \dots, a_i + a_i', \dots, a_n) = w(a_1, \dots, a_i, \dots, a_n) + w(a_1, \dots, a_i', \dots, a_n) , \]
\[ w(a_1, \dots, \alpha\, a_i, \dots, a_n) = \alpha\, w(a_1, \dots, a_n) , \]

(iii) if $n \geq 2$, for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$, $a_i' \in \mathbb{R}^n$ and $j \in \{1, \dots, n\}$ such that $j > i$
\[ w(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = - w(a_1, \dots, a_j, \dots, a_i, \dots, a_n) . \]

Then $w = \det$.

Proof. For this, let $(a_1, \dots, a_n)$ be an $n$-tuple of elements of $\mathbb{R}^n$. Then it follows by (ii), (iii) that
\[ w(a_1, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} a_{1k_1} \cdots a_{nk_n}\, w(e_{k_1}, \dots, e_{k_n}) = \sum_{\sigma \in S_n} a_{1\sigma(1)} \cdots a_{n\sigma(n)}\, w(e_{\sigma(1)}, \dots, e_{\sigma(n)}) . \]
Further, it follows by Lemma 5.3.1 (v), (iii) and (i) that
\[ w(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \det(e_{\sigma(1)}, \dots, e_{\sigma(n)}) \]
and therefore, finally, that
\[ w(a_1, \dots, a_n) = \det(a_1, \dots, a_n) . \]
Theorem 5.3.4. (Bases of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$.

(i) Let $r \in \mathbb{N}^*$ and $v_1, \dots, v_r$ be a basis of $\mathbb{R}^n$, i.e., a sequence of vectors in $\mathbb{R}^n$ which is such that for every $w \in \mathbb{R}^n$, there is a unique $r$-tuple $(\alpha_1, \dots, \alpha_r)$ of real numbers such that
\[ w = \sum_{k=1}^{r} \alpha_k v_k . \]
Then $r = n$.

(ii) In addition, let $r \in \mathbb{N}^*$ and $v_1, \dots, v_r \in \mathbb{R}^n$ be no basis of $\mathbb{R}^n$, but be linearly independent, i.e., such that the equation
\[ \sum_{k=1}^{r} \alpha_k v_k = 0 \]
for some real $\alpha_1, \dots, \alpha_r$ implies that
\[ \alpha_1 = \dots = \alpha_r = 0 . \]
Then there are $m \in \mathbb{N}^*$ and vectors $w_1, \dots, w_m$ in $\mathbb{R}^n$ such that $v_1, \dots, v_r, w_1, \dots, w_m$ is a basis of $\mathbb{R}^n$.

(iii) Let $v_1, \dots, v_n \in \mathbb{R}^n$ be linearly independent. Then $v_1, \dots, v_n$ is a basis of $\mathbb{R}^n$.
Proof. ‘(i)’: Since $v_1, \dots, v_r$ and the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ are both bases, there are real numbers $\alpha_{ik}$, $i \in \{1, \dots, r\}$, $k \in \{1, \dots, n\}$ and $\beta_{jl}$, $j \in \{1, \dots, n\}$, $l \in \{1, \dots, r\}$ such that
\[ v_i = \sum_{k=1}^{n} \alpha_{ik}\, e_k , \quad e_j = \sum_{l=1}^{r} \beta_{jl}\, v_l \]
for every $i \in \{1, \dots, r\}$ and $j \in \{1, \dots, n\}$. For such $i, j$, it follows that
\[ v_i = \sum_{k=1}^{n} \alpha_{ik}\, e_k = \sum_{k=1}^{n} \sum_{l=1}^{r} \alpha_{ik} \beta_{kl}\, v_l , \quad e_j = \sum_{l=1}^{r} \beta_{jl}\, v_l = \sum_{k=1}^{n} \sum_{l=1}^{r} \beta_{jl} \alpha_{lk}\, e_k \]
and hence that
\[ \sum_{k=1}^{n} \alpha_{ik} \beta_{ki} = 1 , \quad \sum_{l=1}^{r} \beta_{jl} \alpha_{lj} = 1 . \]
This implies that
\[ r = \sum_{i=1}^{r} \sum_{k=1}^{n} \alpha_{ik} \beta_{ki} = \sum_{j=1}^{n} \sum_{l=1}^{r} \beta_{jl} \alpha_{lj} = n . \]
‘(ii)’: Since the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ is a basis, it follows that every element of $\mathbb{R}^n$ can be represented as a linear combination of the vectors $v_1, \dots, v_r, e_1, \dots, e_n$. In a first step, we consider the sequence of vectors $v_1, \dots, v_r, e_1$. If $e_1$ is a linear combination of $v_1, \dots, v_r$, then we drop $e_1$ from the sequence $v_1, \dots, v_r, e_1, \dots, e_n$ and still every element of $\mathbb{R}^n$ can be represented as a linear combination of the vectors from the remaining sequence. Otherwise, we keep $e_1$ in the sequence. Note that in this case $v_1, \dots, v_r, e_1$ are linearly independent. Continuing this process, we arrive at a sequence of vectors $w_1, \dots, w_m$ in $\mathbb{R}^n$, where $m \in \mathbb{N}^*$, such that $v_1, \dots, v_r, w_1, \dots, w_m$ is linearly independent and such that every element of $\mathbb{R}^n$ can be represented as a linear combination of its members. This also implies that $v_1, \dots, v_r, w_1, \dots, w_m$ is a basis of $\mathbb{R}^n$. ‘(iii)’: The proof is indirect. Assume that $v_1, \dots, v_n$ is no basis of $\mathbb{R}^n$. Then by (ii) $v_1, \dots, v_n$ can be extended to a basis by adding a non-zero number of vectors from $\mathbb{R}^n$. That basis has at least $n + 1$ members which contradicts (i).
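The extension procedure in the proof of (ii) is constructive and can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original text: the names `rank` and `extend_to_basis` are hypothetical, linear dependence is tested by comparing ranks computed with Gaussian elimination over exact rationals, and a canonical basis vector is kept exactly when it is not a linear combination of the vectors collected so far.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def extend_to_basis(vs, n):
    """Go through e_1, ..., e_n and keep each one that enlarges the span."""
    basis = [list(v) for v in vs]
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        if rank(basis + [e_k]) > rank(basis):
            basis.append(e_k)
    return basis


vs = [[1, 1, 0], [0, 1, 1]]  # linearly independent, but no basis of R^3
basis = extend_to_basis(vs, 3)
print(basis)
```

For this input only $e_1$ is kept, so the resulting basis has exactly three members, in agreement with (i).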
Definition 5.3.5. (The determinant of a linear map) Let $n \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^n$ be linear, i.e. such that
\[ A(x + y) = A(x) + A(y) , \quad A(\alpha x) = \alpha A(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Then, obviously, by
\[ w(a_1, \dots, a_n) := \det(A(a_1), \dots, A(a_n)) \]
for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, there is given a map $w$ satisfying the conditions (ii) and (iii) in Theorem 5.3.3. Hence according to that theorem, $w$ is a multiple of $\det$. In the following, we call the corresponding factor the determinant of $A$ and denote it by $\det(A)$. By definition, it follows that
\[ \det(A(a_1), \dots, A(a_n)) = \det(A) \det(a_1, \dots, a_n) \]
for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$ and hence that
\[ \det(A) = \det(A(e_1), \dots, A(e_n)) \]
where $e_1, \dots, e_n$ is the canonical basis of $\mathbb{R}^n$.
Theorem 5.3.6. Let $n \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^n$, $B : \mathbb{R}^n \to \mathbb{R}^n$ be linear. Then

(i) $\det(A \circ B) = \det(A) \det(B)$,

(ii) $A$ is bijective if and only if $\det(A) \neq 0$.

Proof. For this, let $e_1, \dots, e_n$ be the canonical basis of $\mathbb{R}^n$. ‘(i)’: From Definition 5.3.5, it follows that
\[ \det(A \circ B) = \det\big( (A \circ B)(e_1), \dots, (A \circ B)(e_n) \big) = \det\big( A(B(e_1)), \dots, A(B(e_n)) \big) = \det(A) \det\big( B(e_1), \dots, B(e_n) \big) = \det(A) \det(B) . \]
‘(ii)’: If $A$ is bijective, then
\[ \det(A^{-1}) \det(A) = \det(A^{-1} \circ A) = \det(\mathrm{id}_{\mathbb{R}^n}) = \det(e_1, \dots, e_n) = 1 \]
and hence $\det(A) \neq 0$. On the other hand, if
\[ \det(A) = \det(A(e_1), \dots, A(e_n)) \neq 0 \]
and $\alpha_1, \dots, \alpha_n$ are real numbers such that
\[ \sum_{k=1}^{n} \alpha_k A(e_k) = 0 , \]
then $\alpha_1 = \dots = \alpha_n = 0$. Otherwise, there is $i \in \{1, \dots, n\}$ such that $\alpha_i \neq 0$ and
\[ A(e_i) = - \sum_{k=1,\, k \neq i}^{n} \frac{\alpha_k}{\alpha_i}\, A(e_k) \]
and hence
\[ \det(A) = \det(A(e_1), \dots, A(e_n)) = 0 . \]
Hence the vectors $A(e_1), \dots, A(e_n)$ are linearly independent and constitute a basis of $\mathbb{R}^n$. Therefore, for every $v \in \mathbb{R}^n$, there are real $\alpha_1, \dots, \alpha_n$ such that
\[ v = \sum_{k=1}^{n} \alpha_k A(e_k) = A\Big( \sum_{k=1}^{n} \alpha_k e_k \Big) \]
and hence $A$ is surjective. Finally, we show that $A$ is also injective. For this, assume that there are $v, w \in \mathbb{R}^n$ such that $Av = Aw$. Then
\[ 0 = A(v - w) = A\Big( \sum_{k=1}^{n} (v_k - w_k)\, e_k \Big) = \sum_{k=1}^{n} (v_k - w_k)\, A(e_k) . \]
Since $A(e_1), \dots, A(e_n)$ are linearly independent, this implies that $v_k = w_k$ for every $k \in \{1, \dots, n\}$ and hence that $v = w$.
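Definition 5.3.7. (Linear maps) Let $n, m \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^m$.

(i) We say that $A$ is linear if
\[ A(x + y) = A(x) + A(y) , \quad A(\alpha x) = \alpha A(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$.

(ii) If $A$ is linear,
\[ A(x) = A\Big( \sum_{j=1}^{n} x_j e^n_j \Big) = \sum_{j=1}^{n} x_j A(e^n_j) = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}\, x_j\, e^m_i \]
where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$, $A_{ij}$ denotes the component of $A(e^n_j)$ in the direction of $e^m_i$, so that $A$ is determined by its values on the canonical basis of $\mathbb{R}^n$. On the other hand, obviously, if
\[ (A_{ij})_{(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}} \]
is a given family of real numbers, then by
\[ A(x) := \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}\, x_j\, e^m_i \]
for all $x \in \mathbb{R}^n$, there is defined a linear map $A : \mathbb{R}^n \to \mathbb{R}^m$. Interpreting the elements of $\mathbb{R}^n$ and $\mathbb{R}^m$ as column vectors and defining the $m \times n$ matrix $M_A$ by
\[ M_A := \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix} , \]
the last is equivalent to
\[ A(x) := M_A \cdot x = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \]
where the multiplication sign denotes a particular case of matrix multiplication defined below
\[ (M_A \cdot x)_i := \sum_{j=1}^{n} A_{ij}\, x_j \]
for every $x \in \mathbb{R}^n$ and every $i = 1, \dots, m$. In this case, we call $M_A$ the representation matrix of $A$ with respect to the bases $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$.

(iii) If $A$ is linear with representation matrix $M_A$, $l \in \mathbb{N}^*$ and $B : \mathbb{R}^m \to \mathbb{R}^l$ is linear with representation matrix $M_B$, it follows that
\[ (B \circ A)(x) = B(A(x)) = \sum_{i=1}^{l} \sum_{k=1}^{m} B_{ik}\, (A(x))_k\, e^l_i = \sum_{i=1}^{l} \sum_{k=1}^{m} B_{ik} \Big( \sum_{j=1}^{n} A_{kj}\, x_j \Big) e^l_i = \sum_{i=1}^{l} \sum_{j=1}^{n} \Big( \sum_{k=1}^{m} B_{ik} A_{kj} \Big) x_j\, e^l_i \]
for every $x \in \mathbb{R}^n$ and hence that the representation matrix $M_{B \circ A}$ of $B \circ A$ is given by
\[ M_{B \circ A} = \begin{pmatrix} \sum_{k=1}^{m} B_{1k} A_{k1} & \cdots & \sum_{k=1}^{m} B_{1k} A_{kn} \\ \vdots & & \vdots \\ \sum_{k=1}^{m} B_{lk} A_{k1} & \cdots & \sum_{k=1}^{m} B_{lk} A_{kn} \end{pmatrix} . \]
For this reason, we define the matrix product $M_B \cdot M_A$ of $M_B$ and $M_A$ such that $M_B \cdot M_A = M_{B \circ A}$. Hence
\[ M_B \cdot M_A := \begin{pmatrix} \sum_{k=1}^{m} B_{1k} A_{k1} & \cdots & \sum_{k=1}^{m} B_{1k} A_{kn} \\ \vdots & & \vdots \\ \sum_{k=1}^{m} B_{lk} A_{k1} & \cdots & \sum_{k=1}^{m} B_{lk} A_{kn} \end{pmatrix} . \]

(iv) If $A$ is linear with representation matrix $M_A$ and $m = n$, we define the determinant $\det(M_A)$ of $M_A$ by
\[ \det(M_A) := \det(A) . \]
In this case, it follows by Lemma 5.3.1 (vi) that
\[ \det(M_A) = \det(A(e^n_1), \dots, A(e^n_n)) = \det\Big( \sum_{i=1}^{n} A_{i1}\, e^n_i , \dots, \sum_{i=1}^{n} A_{in}\, e^n_i \Big) = \begin{vmatrix} A_{11} & \cdots & A_{n1} \\ \vdots & & \vdots \\ A_{1n} & \cdots & A_{nn} \end{vmatrix} = \begin{vmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{vmatrix} . \]
If in addition $B : \mathbb{R}^n \to \mathbb{R}^n$ is linear with representation matrix $M_B$, it follows by Theorem 5.3.6 (i) that
\[ \det(M_B \cdot M_A) = \det(M_B) \det(M_A) . \]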
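The relation between composition of linear maps and the matrix product in (iii), and the product rule for determinants in (iv), can be illustrated by a small Python sketch. The helper names `mat_vec`, `mat_mul` and `det2` and the sample matrices are assumptions made for this illustration; matrices are stored as lists of rows.

```python
def mat_vec(M, x):
    """Apply the representation matrix M to the column vector x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def mat_mul(MB, MA):
    """Matrix product: entry (i, j) is the sum over k of B_ik * A_kj."""
    rows_b, inner, cols_a = len(MB), len(MA), len(MA[0])
    return [
        [sum(MB[i][k] * MA[k][j] for k in range(inner)) for j in range(cols_a)]
        for i in range(rows_b)
    ]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


MA = [[1, 2, 0], [0, 1, -1]]    # A : R^3 -> R^2
MB = [[2, 1], [0, 3], [1, -1]]  # B : R^2 -> R^3
x = [1, -2, 4]

composed = mat_vec(MB, mat_vec(MA, x))     # B(A(x))
via_product = mat_vec(mat_mul(MB, MA), x)  # (M_B . M_A) . x
print(composed, via_product)

P = [[1, 2], [3, 4]]
Q = [[0, 1], [5, -2]]
print(det2(mat_mul(P, Q)), det2(P) * det2(Q))
```

Both lines print matching values: applying $B$ after $A$ agrees with multiplying by $M_B \cdot M_A$, and the determinant of the product equals the product of the determinants.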
Lemma 5.3.8. (Sylvester's criterion) Let $n \in \mathbb{N}^*$, $A = (A_{ij})_{i,j \in \{1, \dots, n\}}$ be a real symmetric $n \times n$ matrix, i.e., such that $A_{ij} = A_{ji}$ for all $i, j \in \{1, \dots, n\}$. Then $A$ is positive definite, i.e.,
\[ \sum_{i,j=1,\dots,n} A_{ij}\, h_i h_j > 0 \]
for all $h \in \mathbb{R}^n \setminus \{0\}$, if and only if all leading principal minors $\det(A_k)$, $k = 1, \dots, n$, of $A$ are $> 0$. Here
\[ A_k := (A_{ij})_{i,j \in \{1, \dots, k\}} , \quad k \in \{1, \dots, n\} . \]
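Proof. First, we derive an auxiliary result. For this let $n \geq 2$, $\det(A_{n-1}) \neq 0$ and let $\alpha_1, \dots, \alpha_{n-1}$ be some real numbers. We define a linear map $T : \mathbb{R}^n \to \mathbb{R}^n$ by
\[ T(h) := h - h_n . (\alpha_1, \dots, \alpha_{n-1}, 0) \]
for every $h \in \mathbb{R}^n$. In particular, $T$ is bijective with inverse
\[ T^{-1}(\bar{h}) = \bar{h} + \bar{h}_n . (\alpha_1, \dots, \alpha_{n-1}, 0) \]
for every $\bar{h} \in \mathbb{R}^n$. Further, let $\bar{h} \in \mathbb{R}^n$ and $h := T^{-1}(\bar{h})$. Then
\[ \sum_{i,j=1}^{n} A_{ij}\, h_i h_j = \sum_{i,j=1}^{n-1} A_{ij}\, h_i h_j + 2 \sum_{i=1}^{n-1} A_{in}\, h_i h_n + A_{nn} h_n^2 \]
\[ = \sum_{i,j=1}^{n-1} A_{ij}\, (\bar{h}_i + \bar{h}_n \alpha_i)(\bar{h}_j + \bar{h}_n \alpha_j) + 2 \sum_{i=1}^{n-1} A_{in}\, (\bar{h}_i + \bar{h}_n \alpha_i)\, \bar{h}_n + A_{nn} \bar{h}_n^2 \]
\[ = \sum_{i,j=1}^{n-1} A_{ij}\, \bar{h}_i \bar{h}_j + 2 \bar{h}_n \sum_{i=1}^{n-1} \Big( A_{in} + \sum_{j=1}^{n-1} \alpha_j A_{ij} \Big) \bar{h}_i + \Big( A_{nn} + 2 \sum_{i=1}^{n-1} A_{in} \alpha_i + \sum_{i,j=1}^{n-1} A_{ij}\, \alpha_i \alpha_j \Big) \bar{h}_n^2 . \]
Since $\det(A_{n-1}) \neq 0$, the column vectors of $A_{n-1}$ are linearly independent, and hence there is a unique $(n-1)$-tuple $(\alpha_1, \dots, \alpha_{n-1})$ of real numbers such that
\[ A_{in} + \sum_{j=1}^{n-1} \alpha_j A_{ij} = 0 \]
for all $i \in \{1, \dots, n-1\}$. Therefore by choosing these $\alpha_1, \dots, \alpha_{n-1} \in \mathbb{R}$, it follows that
\[ \sum_{i,j=1}^{n} A_{ij}\, h_i h_j = b_{nn} \bar{h}_n^2 + \sum_{i,j=1}^{n-1} A_{ij}\, \bar{h}_i \bar{h}_j \tag{5.3.1} \]
where
\[ b_{nn} := A_{nn} + 2 \sum_{i=1}^{n-1} A_{in} \alpha_i + \sum_{i,j=1}^{n-1} A_{ij}\, \alpha_i \alpha_j . \]
As a consequence, $A$ is positive definite if and only if
\[ \bar{A} := \begin{pmatrix} A_{11} & \cdots & A_{1\,n-1} & 0 \\ \vdots & & \vdots & \vdots \\ A_{n-1\,1} & \cdots & A_{n-1\,n-1} & 0 \\ 0 & \cdots & 0 & b_{nn} \end{pmatrix} \]
is positive definite. Note that by Leibniz's formula, Lemma 5.3.1 (i), it follows that
\[ \det(\bar{A}) = b_{nn} \det(A_{n-1}) . \]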
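Further, if $M$ denotes the representation matrix of $T$, then
\[ \sum_{i,j=1}^{n} A_{ij}\, h_i h_j = \sum_{i,j=1}^{n} \bar{A}_{ij}\, \bar{h}_i \bar{h}_j = \sum_{i,j=1}^{n} \bar{A}_{ij}\, (T(h))_i\, (T(h))_j = \sum_{i,j,k,l=1}^{n} \bar{A}_{ij}\, M_{ik} M_{jl}\, h_k h_l = \sum_{i,j,k,l=1}^{n} M_{ki}\, \bar{A}_{kl}\, M_{lj}\, h_i h_j = \sum_{i,j=1}^{n} (M^t \bar{A} M)_{ij}\, h_i h_j , \]
where
\[ M^t := (M_{ji})_{i,j \in \{1, \dots, n\}} , \]
and hence, since $M^t \bar{A} M$ is symmetric,
\[ A = M^t \bar{A} M . \]
Therefore, we conclude that
\[ \det(\bar{A}) = b_{nn} \det(A_{n-1}) = [\, \det(M) \,]^{-2} \det(A) . \tag{5.3.2} \]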
With the help of the auxiliary result, the proof of the lemma proceeds by induction over $n$. The statement of the lemma is obviously true in the case $n = 1$. In the following, we assume that it is true for some $n \in \mathbb{N}^*$ and consider the case where $n$ is increased by $1$. If $A$ is positive definite, it follows, in particular, that
\[ \sum_{i,j=1,\dots,n} A_{ij}\, h_i h_j > 0 \]
for all $h \in \mathbb{R}^n \setminus \{0\}$ and therefore also that $A_n$ is positive definite. As a consequence, according to the inductive assumption, the leading principal minors $\det(A_k)$, $k = 1, \dots, n$ are $> 0$. Further, it follows by (5.3.1) that $b_{nn} > 0$ and hence by (5.3.2) that $\det(A) > 0$. On the other hand, if all leading principal minors of $A$ are $> 0$, it follows by the inductive assumption that $A_n$ is positive definite and by (5.3.2) that $b_{nn} > 0$. Therefore, it follows by (5.3.1) that $A$ is positive definite.
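Sylvester's criterion translates directly into a finite test. The sketch below is an illustration only, under stated assumptions: the names `det`, `leading_principal_minors` and `is_positive_definite` are hypothetical, the input is assumed to be a symmetric matrix with rational entries, and determinants are computed by Gaussian elimination with exact fractions rather than by Leibniz' formula.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    n = len(rows)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if rows[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result  # a row swap changes the sign
        result *= rows[col][col]
        for i in range(col + 1, n):
            factor = rows[i][col] / rows[col][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[col])]
    return result


def leading_principal_minors(A):
    """det(A_k) for k = 1, ..., n, where A_k is the upper left k x k block."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_definite(A):
    """Sylvester's criterion for a symmetric matrix A."""
    return all(minor > 0 for minor in leading_principal_minors(A))


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
B = [[1, 2], [2, 1]]
print(leading_principal_minors(A), is_positive_definite(A))
print(leading_principal_minors(B), is_positive_definite(B))
```

For `B` the second minor is negative, and indeed the quadratic form takes the value $1 - 4 + 1 = -2$ at $h = (1, -1)$.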
5.4 The Inverse Mapping Theorem
Theorem 5.4.1. (Banach fixed point theorem for closed subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$, $B$ be a non-empty closed subset of $\mathbb{R}^n$ and $f : B \to B$ be a contraction, i.e., such that
\[ |f(x) - f(y)| \leq \alpha\, |x - y| \tag{5.4.1} \]
for all $x, y \in B$ and some $\alpha \in [0, 1)$. Then $f$ has a unique fixed point, i.e., there is a unique $x^* \in B$ such that
\[ f(x^*) = x^* . \]
Further,
\[ |x - x^*| \leq \frac{|x - f(x)|}{1 - \alpha} \tag{5.4.2} \]
and
\[ \lim_{\nu \to \infty} f^\nu(x) = x^* \]
for every $x \in B$ where $f^\nu$ for $\nu \in \mathbb{N}$ is inductively defined by $f^0 := \mathrm{id}_B$ and $f^{k+1} := f \circ f^k$, for $k \in \mathbb{N}$.
Proof. Note that (5.4.1) implies that $f$ is continuous. Further, define $F : B \to \mathbb{R}$ by
\[ F(x) := |x - f(x)| \]
for all $x \in B$. Now let $x \in B$. Then it follows that
\[ |f^{\nu + \mu'}(x) - f^{\nu + \mu}(x)| \leq \sum_{k=\mu}^{\mu' - 1} |f^{\nu + k + 1}(x) - f^{\nu + k}(x)| \leq \sum_{k=\mu}^{\mu' - 1} \alpha^{\nu + k} F(x) \leq \frac{\alpha^\nu}{1 - \alpha}\, F(x) \]
for all $\nu, \mu, \mu' \in \mathbb{N}$ such that $\mu' \geq \mu$. Hence the components of $(f^\nu(x))_{\nu \in \mathbb{N}}$ are Cauchy sequences. Hence it follows by Theorem 2.3.17, Theorem 3.5.47 and the closedness of $B$ that this sequence is convergent to some $x^* \in B$. Further, it follows by the continuity of $f$ that $x^*$ is a fixed point of $f$. In addition, if $\bar{x} \in B$ is some fixed point of $f$, then
\[ |x^* - \bar{x}| = |f(x^*) - f(\bar{x})| \leq \alpha\, |x^* - \bar{x}| \]
and hence $\bar{x} = x^*$ since the assumption $\bar{x} \neq x^*$ leads to the contradiction that $1 \leq \alpha$. Finally, let $y$ be some element of $B$. Then it follows that
\[ |y - x^*| = |y - f(x^*)| = |y - f(y) + f(y) - f(x^*)| \leq |y - f(y)| + |f(y) - f(x^*)| \leq F(y) + \alpha\, |y - x^*| \]
and hence (5.4.2).
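The proof is constructive: the iterates $f^\nu(x)$ converge to the fixed point, and (5.4.2) gives a computable error bound. The following Python sketch illustrates this in the one-dimensional case. It is an assumption-laden example, not part of the original text: the function name `fixed_point`, the tolerance and the choice $f = \cos$ on $B = [0, 1]$ with contraction constant $\alpha = \sin 1$ (from the mean value theorem) are all chosen for illustration.

```python
import math


def fixed_point(f, x, alpha, tol=1e-12, max_iter=1000):
    """Iterate x, f(x), f(f(x)), ... until the bound from (5.4.2) is below tol.

    Returns the current iterate together with the bound |x - f(x)| / (1 - alpha)
    on its distance to the fixed point.
    """
    for _ in range(max_iter):
        fx = f(x)
        bound = abs(x - fx) / (1 - alpha)
        if bound <= tol:
            return x, bound
        x = fx
    raise RuntimeError("no convergence within max_iter iterations")


# cos maps [0, 1] into [cos 1, 1], and |cos'| <= sin 1 < 1 on [0, 1]
x_star, bound = fixed_point(math.cos, 1.0, math.sin(1.0))
print(x_star, bound)
```

The stopping rule uses only quantities available during the iteration, which is the practical content of the estimate (5.4.2).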
Lemma 5.4.2. Let $n \in \mathbb{N}^*$, $\Omega$ be an open subset of $\mathbb{R}^n$ containing $0$, $f : \Omega \to \mathbb{R}^n$ be of class $C^1$, i.e., such that all corresponding component maps $f_1, \dots, f_n$ are of class $C^1$, and such that $f(0) = 0$ and $f'(0) = \mathrm{id}_{\mathbb{R}^n}$. Then there are open subsets $U$, $V$ of $\mathbb{R}^n$ such that $0 \in U \subset \Omega$, $0 \in V$ and

(i) $|f(x) - f(y)| \geq \frac{1}{2}\, |x - y|$ for all $x, y \in U$,

(ii) $f(U) = V$,

(iii) $f_0 : U \to V$ defined by $f_0(x) := f(x)$ for every $x \in U$ is bijective, and $f_0^{-1}$ is a continuous map which is differentiable in $0$ such that
\[ (f_0^{-1})'(0) = \mathrm{id}_{\mathbb{R}^n} . \]
Proof. For this, we define $F : \Omega \to \mathbb{R}^n$ by
\[ F(x) := x - f(x) \]
for every $x \in \Omega$. Then $F$ is of class $C^1$ such that $F(0) = 0$ and $F'(0) = 0$. Let $\nu_0 \in \mathbb{N}^*$ be such that $B_{1/\nu_0}(0) \subset \Omega$, $i \in \{1, \dots, n\}$ and $F_i$ be the $i$-th component function corresponding to $F$. In particular, $|\nabla F_i|$ is continuous and $|(\nabla F_i)(0)| = 0$. If $x_\nu \in B_{1/\nu}(0)$ is such that
\[ |(\nabla F_i)(x_\nu)| = \max_{x \in B_{1/\nu}(0)} |(\nabla F_i)(x)| \]
for every $\nu \in \mathbb{N}^*$ satisfying $\nu \geq \nu_0$, then
\[ x_{\nu_0}, x_{\nu_0 + 1}, \dots \]
is convergent to $0$ and hence the corresponding sequence
\[ |(\nabla F_i)(x_{\nu_0})|, |(\nabla F_i)(x_{\nu_0 + 1})|, \dots \]
is convergent to $0$. As a consequence, there is $\nu_{0i} \in \mathbb{N}^*$, $\nu_{0i} \geq \nu_0$, such that
\[ |(\nabla F_i)(x)| \leq \frac{1}{2 \sqrt{n}} \]
for all $x \in U_{1/\nu_{0i}}(0)$. Since this is true for $i = 1, \dots, n$, we conclude that there is $\nu \in \mathbb{N}^*$, $\nu \geq \nu_0$, such that
\[ |(\nabla F_i)(x)| \leq \frac{1}{2 \sqrt{n}} \]
for all $i \in \{1, \dots, n\}$ and $x \in U_{1/\nu}(0)$. Further, according to Taylor's formula, Theorem 4.3.6, for $i \in \{1, \dots, n\}$ and $x, y \in U_{1/\nu}(0)$, there is $\tau \in [0, 1]$ such that
\[ F_i(x) - F_i(y) = (x - y) \cdot (\nabla F_i)(y + \tau (x - y)) . \]
Hence
\[ |F_i(x) - F_i(y)| = |(x - y) \cdot (\nabla F_i)(y + \tau (x - y))| \leq |x - y| \cdot |(\nabla F_i)(y + \tau (x - y))| \leq \frac{1}{2 \sqrt{n}}\, |x - y| \]
and therefore
\[ |F(x) - F(y)| \leq \frac{1}{2}\, |x - y| . \]
As a consequence, it follows for $x, y \in U_{1/\nu}(0)$ that
\[ |f(x) - f(y)| = |f(x) - x + x - y + y - f(y)| = |x - y + F(y) - F(x)| \geq |x - y| - |F(x) - F(y)| \geq \frac{1}{2}\, |x - y| . \tag{5.4.3} \]
In the following, we define
\[ U := U_{1/\nu}(0) , \quad V := f(U) . \]
Then $f_0 : U \to V$ defined by $f_0(x) := f(x)$ for every $x \in U$ is bijective. In the next step, we show that $V \subset \mathbb{R}^n$ is open. For this, let $y_0 \in V$ and $x_0 \in U$ be such that $y_0 = f(x_0)$. Further, let $r > 0$ be such that $B_r(x_0) \subset U$. In the following, we will show that $U_{r/2}(y_0) \subset V$. For this, let $y \in U_{r/2}(y_0)$. We define $F_y : B_r(x_0) \to \mathbb{R}^n$ by
\[ F_y(x) := F(x) + y = x - f(x) + y \]
for every $x \in B_r(x_0)$. Then
\[ |F_y(x) - F_y(\bar{x})| = |F(x) - F(\bar{x})| \leq \frac{1}{2}\, |x - \bar{x}| \]
for all $x, \bar{x} \in B_r(x_0)$ and
\[ |F_y(x) - x_0| = |F_y(x) - F_y(x_0) + F_y(x_0) - x_0| \leq \frac{1}{2}\, |x - x_0| + |y - y_0| \leq r \]
for all $x \in B_r(x_0)$. Hence the restriction of $F_y$ in image to $B_r(x_0)$ is a contraction, and Theorem 5.4.1 yields the existence of a fixed point $x \in B_r(x_0)$ of that map. The last also implies that $f(x) = y$. Hence we conclude that $U_{r/2}(y_0) \subset f(B_r(x_0)) \subset V$ and therefore that $V$ is open. Further, it follows by (5.4.3) that
\[ |f_0^{-1}(f_0(x)) - f_0^{-1}(f_0(y))| \leq 2\, |f_0(x) - f_0(y)| \]
for all $x, y \in U$ and hence that $f_0^{-1}$ is continuous. To simplify notation in the following, we define
\[ g := f_0^{-1} . \]
Finally, let $y_1, y_2, \dots$ be some sequence in $V \setminus \{0\}$ that is convergent to $0$. Then
\[ \frac{|g(y_\nu) - g(0) - (y_\nu - 0)|}{|y_\nu - 0|} = \frac{|f(g(y_\nu)) - g(y_\nu)|}{|y_\nu|} = \frac{|f(g(y_\nu)) - f(g(0)) - (g(y_\nu) - g(0))|}{|g(y_\nu) - g(0)|} \cdot \frac{|g(y_\nu) - g(0)|}{|y_\nu - 0|} \leq 2\, \frac{|f(g(y_\nu)) - f(g(0)) - (g(y_\nu) - g(0))|}{|g(y_\nu) - g(0)|} . \]
As a consequence of the continuity of $g$, it follows that $g(y_1), g(y_2), \dots$ is convergent to $0$. Hence it follows from the last and the differentiability of $f$ in $0$ that
\[ \lim_{\nu \to \infty} \frac{|g(y_\nu) - g(0) - (y_\nu - 0)|}{|y_\nu - 0|} = 0 \]
and therefore that $g$ is differentiable in $0$ with derivative $\mathrm{id}_{\mathbb{R}^n}$.
Theorem 5.4.3. (Inverse mapping theorem) Let $n \in \mathbb{N}^*$, $U$ be an open subset of $\mathbb{R}^n$, $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., such that all corresponding component maps are of class $C^1$, and $x_0 \in U$ such that $f'(x_0)$ is bijective. Then there are open subsets $U_{x_0} \subset U$, $V_{f(x_0)} \subset \mathbb{R}^n$ containing $x_0$ and $f(x_0)$, respectively, and such that $f|_{U_{x_0}}$ defines a bijection onto $V_{f(x_0)}$ whose inverse is differentiable.
Proof. First, we notice that the function $\det(f')$ which associates
\[ \det(f'(x)) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma)\, \frac{\partial f_{\sigma(1)}}{\partial x_1}(x) \cdots \frac{\partial f_{\sigma(n)}}{\partial x_n}(x) \]
to every $x \in U$ is continuous. Hence it follows the existence of an open subset $U' \subset U$ containing $x_0$ and such that $f'(x)$ is bijective for every $x \in U'$. Otherwise, every open subset of $U$ containing $x_0$ also contains a point $x$ such that $f'(x)$ is not bijective and hence such that
\[ \det(f'(x)) = 0 . \]
Then it follows the existence of a sequence $x_1, x_2, \dots$ in $U$ that is convergent to $x_0$ and such that
\[ \det(f'(x_\nu)) = 0 \]
for every $\nu \in \mathbb{N}^*$. Since $\det(f')$ is continuous, this leads to
\[ 0 = \lim_{\nu \to \infty} \det(f'(x_\nu)) = \det(f'(x_0)) \]
and hence to the contradiction that $f'(x_0)$ is not bijective. Further, let $\varepsilon > 0$ be such that $U_\varepsilon(x_0) \subset U'$. We define the auxiliary map $h : U_\varepsilon(0) \to \mathbb{R}^n$ by
\[ h(x) := (f'(x_0))^{-1} \big( f(x + x_0) - f(x_0) \big) \]
for every $x \in U_\varepsilon(0)$. In particular, $h$ is of class $C^1$, $h(0) = 0$ and $h'(0) = \mathrm{id}_{\mathbb{R}^n}$. Hence according to the previous lemma, applied with $\Omega := U_\varepsilon(0)$, there are open subsets $U$, $V$ of $\mathbb{R}^n$ such that $0 \in U \subset \Omega$, $0 \in V$ and such that $h_0 : U \to V$ defined by $h_0(x) := h(x)$ for every $x \in U$ is bijective with an inverse $h_0^{-1}$ which is differentiable in $0$. Hence it follows that by $f_0(x) := f(x)$ for every $x \in U_0$, where
\[ U_0 := \{ x + x_0 : x \in U \} , \quad V_0 := \{ f'(x_0)(y) + f(x_0) : y \in V \} , \]
there is defined a bijective map $f_0 : U_0 \to V_0$ between open subsets $U_0$, $V_0$ of $\mathbb{R}^n$ whose inverse is differentiable in $f(x_0)$. Further, by application of the previous reasoning to $f_0$ and every $x \in U_0 \setminus \{x_0\}$, it follows that the inverse of $f_0$ is differentiable.
The following lemma is of use in the proof of the change of variables formula for multiple integrals. The proof of the lemma uses methods analogous to those used in the proof of Theorem 5.4.3.

Lemma 5.4.4. Let $n \in \mathbb{N}^*$, $U$ be an open subset of $\mathbb{R}^n$ and $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., whose corresponding component maps $f_1, \dots, f_n$ are of class $C^1$, such that $f(0) = 0$, $f'(0) = \mathrm{id}_{\mathbb{R}^n}$. Finally, let $r > 0$ and $0 < \varepsilon < 1$ be such that $B_r(0) \subset U$ and
\[ \max_{i \in \{1, \dots, n\}} \sum_{j=1}^{n} \left| \frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(y) \right| \leq \frac{\varepsilon}{\sqrt{n}} \]
for all $x, y \in B_r(0)$. Then for every $y \in B_{r(1 - \varepsilon)}(0)$, there is a uniquely determined $x \in B_r(0)$ such that $f(x) = y$.
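Proof. For this, let $y \in B_{r(1 - \varepsilon)}(0)$. We define for every $x \in B_r(0)$ a corresponding $g_y(x)$ by
\[ g_y(x) := x - f(x) + y . \]
By Taylor's formula, Theorem 4.3.6, it follows for every $x \in B_r(0)$, $i \in \{1, \dots, n\}$, the existence of $\tau \in [0, 1]$ such that
\[ |f_i(x) - x_i| = |f_i(x) - f_i(0) - x \cdot (\nabla f_i)(0)| = |x \cdot (\nabla f_i)(\tau x) - x \cdot (\nabla f_i)(0)| \leq |x| \cdot |(\nabla f_i)(\tau x) - (\nabla f_i)(0)| \]
\[ \leq |x| \sum_{j=1}^{n} \left| \frac{\partial f_i}{\partial x_j}(\tau x) - \frac{\partial f_i}{\partial x_j}(0) \right| \leq \frac{r \varepsilon}{\sqrt{n}} \]
and hence that
\[ |f(x) - x| \leq r \varepsilon . \]
The last implies that
\[ |g_y(x)| \leq |x - f(x)| + |y| \leq r \varepsilon + r (1 - \varepsilon) = r \]
and hence that the range of $g_y$ is part of $B_r(0)$. Further, it follows for $x_1, x_2 \in B_r(0)$ and $i \in \{1, \dots, n\}$ the existence of $\tau \in [0, 1]$ such that
\[ |g_{yi}(x_1) - g_{yi}(x_2)| = |f_i(x_1) - x_1 \cdot (\nabla f_i)(0) - (f_i(x_2) - x_2 \cdot (\nabla f_i)(0))| = |f_i(x_1) - f_i(x_2) - (x_1 - x_2) \cdot (\nabla f_i)(0)| \]
\[ = |(x_1 - x_2) \cdot (\nabla f_i)(x_1 + \tau (x_2 - x_1)) - (x_1 - x_2) \cdot (\nabla f_i)(0)| \leq |x_1 - x_2| \cdot |(\nabla f_i)(x_1 + \tau (x_2 - x_1)) - (\nabla f_i)(0)| \leq \frac{\varepsilon}{\sqrt{n}}\, |x_1 - x_2| \]
and hence that
\[ |g_y(x_1) - g_y(x_2)| \leq \varepsilon\, |x_1 - x_2| . \]
Since $\varepsilon < 1$, this implies that $g_y$ is a contraction and therefore has a unique fixed point $x \in B_r(0)$ according to Theorem 5.4.1. Since $x \in B_r(0)$ is a fixed point of $g_y$ if and only if $f(x) = y$, the statement of the lemma follows.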
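The proof suggests a simple numerical scheme for solving $f(x) = y$: iterate the map $g_y(x) = x - f(x) + y$. The Python sketch below is an illustration only, with assumptions chosen for the example: the map $f(x_1, x_2) = (x_1 + 0.1\, x_2^2,\ x_2 + 0.1\, x_1^2)$, the radius $r = 0.5$, $\varepsilon = 0.3$ and the names `f` and `solve` are not from the original text. For this $f$, the partial derivatives differ by at most $0.2$ on $B_r(0)$, which is below $\varepsilon / \sqrt{2}$, and the chosen $y$ satisfies $|y| < r (1 - \varepsilon) = 0.35$.

```python
import math


def f(x):
    """Example map with f(0) = 0 and f'(0) = identity (an assumption of this sketch)."""
    x1, x2 = x
    return (x1 + 0.1 * x2 ** 2, x2 + 0.1 * x1 ** 2)


def solve(y, steps=60):
    """Approximate the solution of f(x) = y by iterating g_y(x) = x - f(x) + y from 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x


y = (0.2, -0.1)
x = solve(y)
residual = math.hypot(f(x)[0] - y[0], f(x)[1] - y[1])
print(x, residual)
```

The fixed number of steps is a simplification; a stopping rule based on the bound (5.4.2) of Theorem 5.4.1 with $\alpha = \varepsilon$ would be the more careful choice.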
References
[1] Abel N H 1826, Untersuchungen über die Reihe ..., J. reine angew. Math., 1, 311-339.
[2] Abramowitz M and Stegun I A (eds.) 1984, Pocketbook of Mathematical Functions,
Harri Deutsch, Thun.
[3] Alonso M, Finn E J 1967, 1967, 1968, Fundamental university physics, Vols I - III,
Addison-Wesley, Reading, Mass.
[4] Anon E 1969, Note on Simpson’s rule, Amer. Math. Month., 76, 929-930.
[5] Apostol T M 2002, Mathematical analysis, 2nd ed., Narosa Publishing House, New
Delhi.
[6] Ayres F, Mendelson E 1999, Calculus, McGraw-Hill, New York.
[7] Baron M E 1969, The origins of the infinitesimal calculus, Dover, New York.
[8] Beyer H R 2007, Beyond partial differential equations, Springer Lecture Notes in
Mathematics 1898, Springer, Berlin.
[9] Beyer H R 1999, On the completeness of the quasinormal modes of the Pöschl-Teller
potential, Commun. Math. Phys. 204, 397-423.
[10] Boole G 1847, The mathematical analysis of logic, Cambridge, Macmillan, Barclay
and Macmillan.
[11] Bolyai J 1832, Appendix scientiam spatii absolute veram exhibens: a veritate aut,
falsitate Axiomatis XI Euclidei (a priori haud unquam decidenda) independentem;
adjecta ad casum falsitatis, quadratura circuli geometrica, in: Bolyai F 1832, Tentamen juventutem studiosam in elementa matheseos purae, elementaris ac sublimioris
methodo intuitiva, evidentiaque huic propria, introducendi. Cum Appendice triplici,
vol. 1., Maros-Vasarhelyini. German translation in: Engel F, Staeckel P 1913, Urkunden zur Geschichte der nichteuklidischen Geometrie, Band 2, Teil 1, 2, Teubner,
Leipzig.
[12] Bolzano B 1817, Rein analytischer Beweis des Lehrsatzes, dass zwischen je zwey
Werthen, die ein entgegengesetztes Resultat gewaehren, wenigstens eine reelle
Wurzel der Gleichung liege, in: Jourdain P E B (ed.) 1905, Ostwald’s Klassiker
der exakten Wissenschaften, 153, Leipzig, Engelmann.
[13] Bottazzini U 1986, The higher calculus: A history of real and complex analysis from
Euler to Weierstrass, Springer, New York.
[14] Boyer C B 1949, The concepts of the calculus: A critical and historical discussion
of the derivative and the integral, Reprint, Hafner, New York.
[15] Boyer C B 1988, History of analytic geometry, Scholar’s bookshelf, Princeton.
[16] Boyer C B 1968, A history of mathematics, Wiley, New York.
[17] Bronson R 1989, Matrix operations, McGraw-Hill, New York.
[18] Browder A 2001, Mathematical analysis, corr. 3rd print., Springer, New York.
[19] Buck R C 1965, Advanced calculus, 2nd ed., McGraw-Hill, New York.
[20] Budak B M, Samarskii A A, Tikhonov A N 1964, A collection of problems on mathematical physics, MacMillan, New York.
[21] Cantor M 1880, 1892, 1898, 1908, Vorlesungen über Geschichte der Mathematik,
Vols 1 - 4, Teubner, Leipzig.
[22] Cauchy A L 1821 / 1885, Algebraische Analysis, (Translated from French), Springer,
Berlin.
[23] Cheney W 2001, Analysis for applied mathematics, Springer, New York.
[24] Cavalieri B 1647, Exercitationes geometricae sex, Bologna.
[25] Coleman A J 1951, A simple proof of Stirling’s formula, Amer. Math. Month., 58,
334-336.
[26] Coleman A J 1954, The probability integral, Amer. Math. Month., 61, 710-711.
[27] Cramer G 1750, Introduction a l’analyse des lignes courbes algebriques, Freres
Cramer & Cl. Philbert, Geneve.
[28] Dantscher V 1908, Vorlesungen ueber die Weierstrasssche Theorie der irrationalen
Zahlen, Teubner, Leipzig.
[29] Demidovich B (ed.) 1989, Problems in mathematical analysis, 7th printing, Mir Publishers, Moscow.
[30] Dirichlet G L 1829, Sur la convergence des series trigonometriques qui servent a
representer une fonction arbitraire entre des limites donnees, J. reine angew. Math.,
4, 157-169.
[31] Dedekind R 1887 / 1918, 4th ed., Was sind und was sollen die Zahlen, Vieweg,
Braunschweig.
[32] Drager L D, Foote R L 1986, The contraction mapping lemma and the inverse function theorem in advanced calculus, Amer. Math. Month., 93, 52-54.
[33] Dunkel O 1917, Discussions: Relating to the exponential function, Amer. Math.
Month., 24, 244-246.
[34] Enderton H B 1977, Elements of set theory, Academic Press, New York.
[35] Erdelyi A, Magnus W, Oberhettinger F, Tricomi F G (eds.) 1953, Higher transcendental functions, Vol. I, McGraw-Hill, New York.
[36] Edwards C H 1979, The historical development of the calculus, Springer, New York.
[37] Hutchins R M (ed.) 1952, Great books of the western world, Vol. II, Euclid,
Archimedes, Apollonius of Perga, Nicomachus, Encyclopedia Britannica Inc.,
Chicago.
[38] Euler L 1748 / 1885, Einleitung in die Analysis des Unendlichen, (Translated from
Latin), Springer, Berlin.
[39] Fischer G 2003, Lineare Algebra, 14th ed., Vieweg, Wiesbaden.
[40] Ford J 1995, Avoiding the exchange lemma, Amer. Math. Month., 102, 350-351.
[41] Fourier J 1822 / 1955, The analytical theory of heat, (Translated from French),
reprint, New York, Dover.
[42] Galileo G 1638 / 2002, Dialogues concerning two new sciences, (Translated from
Italian), Philadelphia: Running Press.
[43] Gelbaum B R, Olmsted J M H 2003, Counterexamples in analysis, Dover Publications, New York.
[44] Giesy D P 1972, Still another elementary proof that $\sum 1/k^2 = \pi^2/6$, Math. Mag., 45, 148-149.
[45] Goldrei D 1996, Classic set theory, Chapman & Hall, London.
[46] Goursat E 1904, 1916, 1917, A course in mathematical analysis, Vols I - III, Ginn,
Boston.
[47] Grabiner J V 1983, The changing concept of change: The derivative from Fermat to
Weierstrass, Math. Mag., 56, 195-206.
[48] Guenter N M, Kusmin R O 1971, Aufgabensammlung zur hoeheren Mathematik,
Vols I, II, DVW, Berlin.
[49] Haaser N B, Sullivan J A 1971, Real analysis, Van Nostrand, New York.
[50] Hairer E, Wanner G 2000, Analysis by its history, corr. 3rd printing, Springer, New
York.
[51] Halmos P R 1958, Finite-dimensional vector spaces, Van Nostrand, New York.
[52] Hamilton N, Landin J 1961, Set theory and the structure of arithmetic, Allyn and
Bacon, Boston.
[53] Hazewinkel M (ed.) 2002, Encyclopaedia of mathematics, (Accessible online at:
http://eom.springer.de/), Springer, Berlin.
[54] Hille E 1964, 1966, Analysis, Vols I, II, Blaisdell, New York.
[55] Hardy G H 1909, The integral $\int_0^\infty \frac{\sin x}{x}\, dx$, Math. Gaz., 5, 98-103.
[56] Huygens C 1673, Horologium oscillatorium sive de motu pendulorum, Paris.
[57] Katz V J 1993, A history of mathematics: An introduction, HarperCollins, New York.
[58] Klein F 1979, Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert, reprint, Berlin, Springer.
[59] Kleiner I 1991, Rigor and proof in mathematics: A historical perspective, Math.
Mag., 64, 291-314.
[60] Kleiner I, Movshovitz-Hadar N 1994, The role of paradoxes in the evolution of mathematics, Amer. Math. Month., 101, 963-974.
[61] Konnully A O 1968, Relation between the beta and the gamma function, Math. Mag.,
41, 37-39.
[62] Landau L D, Lifschitz E M 1979, Lehrbuch der theoretischen Physik, Band III, 9.
Aufl., Quantenmechanik, Akademie-Verlag, Berlin.
[63] Lang S 1997, Undergraduate analysis, 2nd ed., Springer, New York.
[64] Lang S 1969, Real analysis, Addison-Wesley, Reading, MA.
[65] Lebedev N N, Skalskaya I P, Uflyand Ya S 1966, Problems in mathematical physics,
Pergamon, Oxford.
[66] Leibniz G Wilhelm 1684, Nova methodus pro maximis et minimis, itemque tangentibus, quae nec fractas nec irrationales quantitates moratur, et singulare pro illis
calculi genus, Act. Erudit. Lips., in: Pertz G H (ed.) 1859, Leibnizens gesammelte
Werke, Dritte Folge, Mathematik, Vierter Band, H W Schmidt: Hannover, 220-226.
[67] Leibniz G W 1686, De geometria recondita et analysi indivisibilium atque infinitorum, Act. Erudit. Lips., in: Pertz G H (ed.), Leibnizens gesammelte Werke, Dritte
Folge, Mathematik, Vierter Band, H W Schmidt: Hannover, 226-233.
[68] Leibniz G W 1693, Supplementum geometriae dimensoriae, seu generalissima omnium tetragonismorum effectio per motum: similiterque multiplex constructio lineae
ex data tangentium conditione, Act. Erudit. Lips., in: Pertz G H (ed.), Leibnizens
gesammelte Werke, Dritte Folge, Mathematik, Vierter Band, H W Schmidt: Hannover, 294-301.
[69] de L’Hospital G F A 1696, Analyse des infiniment petits, pour l’intelligence des
lignes courbes, De l’Imprimerie Royale, Paris.
[70] Lipschutz S 1998, Set theory and related topics, 2nd ed., McGraw-Hill, New York.
[71] Lipschutz S, Lipson M L 2001, Linear algebra, 3rd ed., McGraw-Hill, New York.
[72] Lobachevsky N I 1829-1830, On Elements of Geometry, Kazan Vestn. 4, XXV,
books II-III, 178-187 (1829), book IV, 228-241 (1829), XXVII, book XI-XII, 227-243 (1829), XXVIII, book III-IV, 251-283 (1830), XIX, books VII-VIII, 571-636
(1830).
[73] Loomis L 1974, Calculus, Addison-Wesley, Reading, Mass.
[74] Maclaurin C 1748, A treatise of algebra in three parts, London.
[75] Margaris A 1990, First order mathematical logic, Dover Publications, New York.
[76] McShane E J 1973, The Lagrange multiplier rule, Amer. Math. Month., 80, 922-925.
[77] Mendelson E 1988, 3000 solved problems in calculus, McGraw-Hill, New York.
[78] Mercator N 1668, Logarithmotechnica, Londini.
[79] Messiah A 1999, Quantum mechanics, Dover Publications, New York.
[80] van Mill J 1989, Infinite-dimensional topology, North-Holland, Amsterdam.
[81] De Morgan A 1848, On the syllogism: III; and on logic in general, Trans. Camb.
Phil. Soc., 10, 173-230.
[82] Newton I 1669 / 1712, De analysi per aequationes numero terminorum infinitas,
published in: Collins D J 1712, Commercium epistolicum D. Johannis Collins et
aliorum de analysi promota, London, 3-20.
[83] Newton I 1671 / 1736, Methodus fluxionum et serierum infinitarum, Cum ejusdem
applicatione ad curvarum geometriam, Anglice edita a J. Colsono, Londini, 1736.
[84] Peano G 1890, Sur une courbe, qui remplit toute une aire plane, Math. Ann., 36,
157-160.
[85] Peiffer J, Dahan-Dalmedico A 1994, Wege und Irrwege: Eine Geschichte der Mathematik, Wissenschaftliche Buchgesellschaft, Darmstadt.
[86] Remmert R 1998, Theory of complex functions, 4th corrected printing, Springer, New
York.
[87] Riemann B 1854 / 1868, Ueber die Darstellbarkeit einer Function durch eine
trigonometrische Reihe, Habilitationsschrift 1854, Abhandlungen der Königlichen
Gesellschaft der Wissenschaften zu Göttingen, 13, 1868.
[88] Rudin W 1976, Principles of mathematical analysis, 3rd ed., McGraw-Hill, Singapore.
[89] Salas S, Hille E, Etgen G 2003, Calculus: One and several variables, 9th ed., Wiley,
New York.
[90] Samuels S M 1966, A simplified proof of a sufficient condition for a positive definite
quadratic form, Amer. Math. Month., 73, 297-298.
[91] Stewart J 1999, Calculus: Early transcendentals, 4th ed., Brooks/Cole Publishing
Company, Pacific Grove.
[92] Stirling J 1730, Methodus differentialis, London.
[93] Stoll R R 1963, Set theory and logic, Freeman, San Francisco.
[94] Stromberg K R 1981, An introduction to classical real analysis, Wadsworth, Belmont.
[95] Struik D J 1969, A source book in mathematics, 1200-1800, Harvard University
Press, Cambridge.
[96] Toeplitz O 1949, Die Entwicklung der Infinitesimalrechnung, Bd. I., Springer, Berlin,
engl. trans.: Toeplitz O 1963, The calculus: A genetic approach, University of
Chicago Press, Chicago.
[97] Venkatachaliengar K 1962, Elementary proofs of the infinite product for sin z and
allied formulae, Amer. Math. Month., 69, 541-545.
[98] Wallis J 1656, Arithmetica infinitorum, Oxford.
[99] Weierstrass K 1872, Ueber continuirliche Functionen eines reellen Arguments, die
fuer keinen Werth des Letzteren einen bestimmten Differentialquotienten besitzen,
gelesen: Akad. Wiss. 18. Juli 1872.
[100] Whittaker E T, Watson G N 1952, A course of modern analysis, 4th ed. reprint,
Cambridge University Press, Cambridge.
[101] Wrede R, Spiegel M R 2002, Theory and problems of advanced calculus, McGraw-Hill, New York.
[102] Wussing H 1989, Vorlesungen zur Geschichte der Mathematik, 2. Aufl., VEB,
Berlin.
Index of Notation
$\neg$, not, 21
$\wedge$, and, 21
$\vee$, or, 21
$\Rightarrow$, if . . . then, 21
$\Leftrightarrow$, . . . if and only if . . . , 21
$\in$, belongs to, 32
$\notin$, does not belong to, 32
$\phi$, empty set, 32
$\mathbb{N}$, $\{0, 1, 2, \dots\}$, 33
$\mathbb{N}^*$, $\{1, 2, \dots\}$, 33
$\mathbb{Z}$, $\{\dots, -2, -1, 0, 1, 2, \dots\}$, 33
$\mathbb{Z}^*$, $\{\dots, -2, -1, 1, 2, \dots\}$, 33
$:$, such that, 33
$\mathbb{Q}$, rational numbers, 33
$\mathbb{Q}^*$, non-zero rational numbers, 33
$\mathbb{R}$, real numbers, 33
$\mathbb{R}^*$, non-zero real numbers, 33
$\subset$, subset, 33
$:=$, per definition, 34
$[a, b]$, closed interval, 34
$(a, b)$, open interval, 34
$(a, b]$, half-open interval, 34
$[a, b)$, half-open interval, 34
$[c, \infty)$, unbounded closed interval, 34
$(c, \infty)$, unbounded open interval, 34
$(-\infty, d]$, unbounded closed interval, 34
$(-\infty, d)$, unbounded open interval, 34
$\cup$, union, 34
$\cap$, intersection, 34
$\setminus$, relative complement, 34
$A \times B$, Cartesian product of $A$ and $B$, 36
$A_1 \times \dots \times A_n$, n-fold Cartesian product of $A_1, \dots, A_n$, 36
$\times_{i=1}^{n} A_i$, n-fold Cartesian product of $A_1, \dots, A_n$, 36
$A^n$, n-fold Cartesian product of $A$, 36
$\bigcup$, union, 38
$\bigcap$, intersection, 38
$D(f)$, domain of $f$, 45
$f(x)$, value of $f$ at $x$, 45
$f(x)$, image of $x$ under $f$, 45
$f(A')$, image of $A'$ under $f$, 45
$\mathrm{Ran}(f)$, range or image of $f$, 45
$f^{-1}(B')$, inverse image of $B'$ under $f$, 45
$f|_{A'}$, restriction of $f$ to $A'$, 45
$G(f)$, graph of $f$, 47
$f^{-1}$, inverse map, 48
$\circ$, composition, 54
$\mathrm{id}_C$, identity map on $C$, 55
$\lim_{n \to \infty}$, limit, 63
$\sup$, supremum, 80
$\inf$, infimum, 80
$\exp$, exponential function, 84
$e^x$, exponential function, 84
$\lim_{x \to a}$, limit, 93
$\lim_{x \to \infty}$, limit, 93
$\lim_{x \to -\infty}$, limit, 93
$f_1 + f_2$, sum of functions, 102
$a f_1$, multiple of a function, 102
$f_1 \cdot f_2$, product of functions, 103
$1/f_1$, quotient of functions, 103
$\approx$, approximately, 122
$\lim_{h \to 0, h \neq 0}$, limit, 122
$f'(x)$, derivative of $f$ in $x$, 128
$f'$, derivative of $f$, 128
$f^{(k)}$, k-th derivative of $f$, 128
$f''$, 2nd order derivative of $f$, 128
$f'''$, 3rd order derivative of $f$, 128
$\sinh$, hyperbolic sine, 158
$\cosh$, hyperbolic cosine, 158
$\tanh$, hyperbolic tangent, 158
$\sum_{k=m}^{n}$, Sum from $k = m$ to $n$, 168
$n!$, n factorial, 168
$\binom{n}{k}$, binomial coefficient, 210
$\int_a^b f(x)\,dx$, integral of $f$ over $[a, b]$, 223
$[F(x)]|_a^b$, $F(b) - F(a)$, 238
$\prod$, product symbol, 275
$\sum_{k=1}^{\infty} x_k$, sum of a sequence, 346
$\binom{\nu}{n}$, binomial coefficient, 429
$\overrightarrow{pq}$, oriented line segment between $p$ and $q$, 445
$S_r^n(a)$, sphere of radius $r$ around $a$ in $\mathbb{R}^n$, 446
$\overrightarrow{pq}$, oriented line segment between $p$ and $q$, 451
$[\overrightarrow{pq}]$, vector associated to $\overrightarrow{pq}$, 455
$[\overrightarrow{pq}] + [\overrightarrow{rs}]$, sum of vectors, 455
$\lambda.[\overrightarrow{pq}]$, scalar multiple of a vector, 456
$|[\overrightarrow{pq}]|$, length of a vector, 456
$[\overrightarrow{pq}] \cdot [\overrightarrow{rs}]$, scalar product of vectors, 457
$a \times b$, vector product of vectors in $\mathbb{R}^3$, 465
$\mathrm{sgn}$, signum function, 470
$U_\varepsilon(x)$, open ball of radius $\varepsilon$ centered at $x$, 550
$B_\varepsilon(x)$, closed ball of radius $\varepsilon$ centered at $x$, 550
$S_\varepsilon(x)$, sphere of radius $\varepsilon$ centered at $x$, 550
$f_1 + f_2$, sum of functions, 556
$a.f_1$, multiple of a function, 556
$f_1 \cdot f_2$, product of functions, 557
$1/f_1$, quotient of functions, 557
$g \circ f$, composition of functions, 558
$\frac{\partial f}{\partial x_i}$, partial derivative, 570
$f'(x)$, derivative of $f$ in $x$, 574
$(\nabla f)(x)$, gradient of $f$ in $x$, 574
$C^p$, continuously partially differentiable up to order $p$, 580
$C^\infty$, $C^p$ for all $p \in \mathbb{N}$, 580
$\nabla$, gradient operator, 580
$\triangle$, Laplace operator, 583
$f.g$, product of functions, 588
$\int_I f\,dv$, integral of $f$ on $I$, 634
$\int_r F \cdot dr$, path integral of $F$ along $r$, 680
$\int_S F \cdot dS$, flux of $F$ across $S$, 720
$S_n$, permutation group, 766
$\mathrm{sign}$, signum function, 766
Index of Terminology
$\mathbb{R}^n$
addition, 454, 457
bases, 773
canonical basis, 457
Cauchy-Schwarz inequality, 441, 460
length, 454, 459
linear independent vectors, 773
orthogonal projection, 461
Pythagorean theorem, 461
scalar multiplication, 454, 457
scalar product, 454, 460
subset
area, 646
boundary, 555
boundary point, 555
bounded, 550
closed, 550
closure, 551, 552
compact, 550
convex, 688
curve, 520
inner point, 555
interior, 555
negligible, 641
open, 550
simply-connected, 688
star-shaped, 686
volume, 646
triangle inequality, 459
Analytical geometry
area of a parallelogram, 463
conic sections, 478, 508
ellipse, 485, 493
hyperbola, 487
parabola, 479
coordinates, 492
cylindrical, 509, 594, 656
generalized polar, 674
generalized spherical, 674
polar, 492, 589, 593, 654
spherical, 511, 594, 656
Cramer’s rule, 477
distance of a point from a plane, 474
distance of two lines, 473, 474
distance of two planes, 474
ellipse, 495
hyperbola, 496
lines, 472
metric space, 440
parabola, 494
planes, 472
quadrics, 500
cylinder, 500
ellipsoid, 502, 512
elliptic cone, 505
elliptic cylinder, 502, 510
elliptic paraboloid, 503
hyperbolic cylinder, 502
hyperbolic paraboloid, 505
hyperboloid of one sheet, 506
hyperboloid of two sheets, 506
parabolic cylinder, 500
saddle surface, 503
triangle
centroid, 477
circumcenter, 477
orthocenter, 477
triangle inequality, 440
vector spaces, 450
norms, 459
vectors, 450, 454
addition, 454
length, 454
linear independence, 477
orientation, 698
orthogonality, 454
position, 454
scalar multiplication, 454
scalar product, 454
scalar triple product, 468
unit vector, 454
vector product, 465
volume
parallelepiped, 468
Applications
n-dimensional volume, 646
ancient knowledge on parabolic segments, 480
Archimedes’ conoids and spheroids, 626, 657
Archimedes’ measurement of the circle, 60
Archimedes’ quadrature of the parabola, 211, 338
area
circular cylinder, 725
interior of a plane curve, 710
interior of an ellipse, 710
parallelogram, 463, 471
parametric surface, 720
rotational ellipsoid, 725
set, 646
sphere, 725
surface of revolution, 723
under a graph, 223
arithmetic series, 420
astroid, 538, 745
average speed, 125
Babylonian roots, 195
Bessel functions, 233, 279, 416, 431
Beta function, 325
cardioid, 538, 745
Cartesian leaf, 307
center of mass, 661, 662, 670
circular arch, 209
conchoid of Nicomedes, 141
confluent hypergeometric functions, 437
conservation law, 712, 740
constant of motion, 524
continuous compound interest, 82
Couette flow, 677
cycloid, 141, 537
differential equation, 142, 152, 156, 157, 241, 258, 266, 294, 309
direction field, 677
electric circuit, 156
electric field of a point charge, 678
elliptic integral, 312
energy conservation, 152, 680
energy inequality, 156, 713, 741
error function, 428
Euler constant, 354
Fermat’s principle, 207
floor function, 239
folium of Descartes, 745
force field, 152, 678
conservative, 153, 680
potential function, 153, 680
total energy of a point particle, 152, 680
Fourier coefficients, 245
free fall, 87, 206
with low viscous friction, 247
with viscous friction, 206
gamma function, 318, 324, 326, 329, 330, 355
Gaussian integrals, 320, 322
ground state energy, 206
harmonic oscillator, 156
heat conduction, 42
Hermite polynomials, 437
hypergeometric functions, 436
ideal gas law, 87
inertia tensor, 661, 670
instantaneous speed, 126
Kepler problem, 525
angular momentum, 525
energy, 525
Lenz vector, 525
Levi-Civita’s transformation, 528
kinetic energy, 524, 679
Laplace equation, 583, 592
Laplace operator, 583
cylindrical form, 594
polar form, 593
spherical form, 594
largest viewing angle, 207
latitude, 512
Legendre functions, 438
Leibniz’s ‘arithmetical quadrature of the circle’, 378
length
curve, 535
path, 531, 536, 540
longitude, 512
Mathematica 5.1 error, 264, 293
Newton’s equation of motion, 142, 152, 247, 309, 524, 679
oscillatory integral, 651
partial differential equation, 143, 592, 593
particle paths, 524
potential function, 683
probability theory, 664
Buffon’s needle problem, 671
quantum field theory, 395
quantum statistics, 395
quantum theory
confined identical bosons, 671
confined identical fermions, 665
confined particle, 672
harmonic oscillator, 335
hydrogen atom, 333
Riemann’s zeta function, 350, 364, 394, 395
Schwarzschild black hole, 164
simple pendulum, 307, 308, 335
Snell’s law, 207
Stirling’s formula, 272
strophoid, 266
total mass, 661
trajectory of a point particle, 679
transverse vibrations of a beam, 209
travel distance, 218
velocity field of a point particle, 679
volume
set, 646
solid cylinder, 657
solid ellipsoid, 657
solid of revolution, 626, 657
solid sphere, 657
under a graph, 634
Wallis product, 270
wave equation, 42, 88, 143, 592, 593, 711, 715, 739
Curves
astroid, 538, 745
auxiliary lines
normal, 124
subnormal, 124
subtangent, 124
tangent, 124
cardioid, 538, 745
Cartesian leaf, 307
circle, 445
conchoid of Nicomedes, 141
cycloid, 141, 537
ellipse, 485, 495
folium of Descartes, 124, 745
helix, 521
hyperbola, 487, 496
length, 535
parabola, 479, 494
parallelogram, 463
plane-filling, 403
straight lines, 472
strophoid, 266
Elementary logic
compound, 22
connectives, 21
contraposition, 26
contrapositive, 22
indirect proof, 24, 27, 119, 151
logical law, 26
negation, 22
proof by cases, 26
proposition, 21
rule of inference, 26
statement, 21
tautology, 26
transitivity, 26
truth table, 22
truth values, 21
Elementary set theory
sets, 31
Cartesian product, 36
countable, 92
disjoint, 36
equality, 33
intersection, 34
relative complement, 34
subset, 33
types of definition, 33
uncountable, 92
union, 34
Zermelo-Russell paradoxon, 39
Functions
definition, 44
determinant, 470, 766, 775
Leibniz formula, 766
domain, 44
image, 44
inverse image, 44
of n variables, 46
of one variable, 46
of several variables, 46
range, 44
restriction, 44
signum, 470, 766
zero set, 44
Functions of one variable
B
definition, 325
$F(a, b; c; \cdot)$, 436
Hn , 437
Jn , 279
integral representation, 233
Jν
integral representation, 431
power series, 416
$M(a, b, \cdot)$, 437
Pν , 438
Γ
$\Gamma(1/2)$, 324
connection to ζ, 394
definition, 318
duplication formula, 330
Gauss formula, 329
limit of beta function, 326
reflection formula, 330
Stirling’s formula, 272
Weierstrass formula, 355
arccos, 105, 163
arcsin, 105, 163
arctan, 105, 163
arsinh, 727
cos, 105, 137
infinite product, 275
power series, 425
cosh, 158, 209
exp, 105, 154
characterization, 152
convexity, 176
definition, 83
derivative, 130
power series, 425
ln, 105, 154, 163
sin, 105, 131
infinite product, 275
power series, 425
sinh, 158, 208
tan, 105, 137
tanh, 158
erf
definition, 428
power series, 428
ζ
$\zeta(2)$, 395
definition, 350
extension to $(0, 1)$, 364
integral representation, 394
antisymmetric, 262, 265
bisection method, 98
bounded, 96
concave, 174, 177
continuous, 90
continuous extension, 108
removable singularities, 108
singularities, 108
contraction, 195
convex, 174, 177
critical point, 146
derivative, 127
of higher order, 127
differentiable, 127
chain rule, 136
concave, 174, 177
convex, 174, 177
inverse functions, 162
linear approximation, 172
linearization, 172
product rule, 134
quotient rule, 134
sum rule, 134
Taylor’s formula, 172
differential equation, 416
Dirichlet’s function, 91
discontinuous, 90
everywhere, 91
extremum, 122
fixed point, 119, 194, 195
floor, 239
implicit differentiation, 141
increasing, 153
integral representation, 233
limits at infinity, 111
maximum, 93, 146, 181
minimum, 93, 146, 181
not differentiable, 132
nowhere differentiable, 400
periodic, 265
polynomial, 104, 136
powers, 129, 166
rational, 117
removable singularity, 110
Riemann integral, 223
additivity, 235
area under graph, 223
Cauchy-Schwarz inequality, 246
change of variables, 250
improper, 312
integration by parts, 267
Lebesgue criterion, 232
linearity, 226
Midpoint rule, 298
partial fractions, 281
positivity, 226
simple limit theorem, 389
Simpson’s rule, 280, 303
Trapezoid rule, 280, 301
strictly increasing, 101
symmetric, 264, 265
Taylor expansion, 423
uniformly continuous, 532
Functions of several variables, 542
composition, 558
continuous, 547, 548
contours, 542
critical point, 607
derivative, 574
differentiable, 568, 576
chain rule, 587
product rule, 585
quotient rule, 585
sum rule, 585
Taylor’s formula, 602
directional derivative, 597
discontinuous, 547, 548
domain, 542
gradient, 598
graph, 542
level set, 542
maximum, 553, 606, 610
minimum, 553, 606, 610
not differentiable, 578
of class $C^p$, 580
of class $C^\infty$, 580
partial derivative, 139, 570
of higher order, 139, 570
partially differentiable, 139, 570, 576
product, 557, 588
quotient, 557
range, 542
Riemann integral, 634
change of variables, 650
existence, 643
Fubini’s theorem, 644
negligible sets, 641
volume under a graph, 634
scalar multiple, 556
sum, 556
tangent plane, 574
Taylor expansion, 602
Taylor polynomial, 574
zero set, 44
General
basic problem solving strategy, 352
steps in the analysis of a series, 368
Infinite products
Γ, 329, 355
cos, 275
sin, 275
Wallis product, 270
Maps
bijective, 48
bilinear, 521
composition, 54
definition, 44
domain, 44
graph, 47
identity map, 55
image, 44
injective, 48
inverse image, 44
Laplace operator, 583
linear, 567, 775, 777
representation matrix, 567, 777
Nabla operator, 580
one-to-one, 48
one-to-one and onto, 48
onto, 48
paths, 520
continuous, 520
differentiable, 520
helix, 521
length, 531, 533
non-rectifiable, 531
rectifiable, 531
tangent vector, 520
permutation, 766
permutation group, 766
transposition, 766
quadratic form, 620
range, 44
restriction, 44
surjective, 48
Matrices
associated quadratic form, 620
definition, 567, 777
eigenvalues, 620
eigenvectors, 620
multiplication, 567, 777
symmetric, 608, 779
positive definite, 608, 779
Real numbers
Babylonian roots, 195
completeness, 759
construction, 749
density of rational numbers, 91, 759
intervals, 34
length, 219
partition, 219
subset
bounded from above, 79
bounded from below, 79
infimum, 80
measure zero, 231
supremum, 80
Sequences
bounded, 63
bounded from above, 77
bounded from below, 78
Cauchy, 74
convergent, 63, 515
decreasing, 78
divergent, 63
increasing, 77
limit laws, 68, 518
limits preserve inequalities, 72
subsequence, 76
Series, 345
ζ-type, 357
Abel’s test, 365
absolutely summable, 368
alternating harmonic, 364, 413
arithmetic, 423
Binomial series, 429
Cauchy product, 418
comparison test, 353
conditionally summable, 368
convergent, 345
Cauchy’s characterization, 370
Dirichlet test, 363
divergent, 345
geometric, 346
harmonic, 347, 354
harmonic type, 352, 357
integral test, 349
not summable, 345
of functions, 378
power series, 406
uniform convergence, 387
Weierstrass’ test, 391
ratio test, 371
rearrangement, 359
root test, 373
summable, 345
summation by parts, 362
Surfaces
area, 720, 723
cylinder, 500
ellipsoid, 502
elliptic cone, 505
elliptic paraboloid, 503
hyperbolic paraboloid, 505
hyperboloid of one sheet, 506
hyperboloid of two sheets, 506
parallelepiped, 468
planes, 472
quadrics, 500
saddle surface, 503
sphere, 445
Theorems
Abel, 412
Binomial, 210
Bolzano-Weierstrass, 76, 549
change of variables, 250, 650
contraction mapping lemma, 195
extended mean value, 183
Fubini, 644
fundamental theorem of calculus, 238
Gauss, 734
Green, 701, 705
intermediate value, 96
L’Hospital’s rule, 185
Lagrange multiplier rule, 618
mean value, 150
Newton, 202
Poincare lemma, 686
Rolle, 149
Schwarz, 581
Stokes, 727
Taylor, 170, 423
Vector calculus
area
parametric surface, 720
closed path, 683
flux across a surface, 718, 720
Gauss theorem, 734
Green’s theorem, 701, 705
inverse path, 682
parametrized surfaces, 716
normal field, 716
tangent space, 716
path integrals, 680
piecewise regular $C^1$-path, 683
Poincare lemma, 686
potential, 684
regular $C^1$-path, 680
Stokes’ theorem, 727
vector field, 677
Vector-valued functions, 542
Vector-valued functions of several variables, 542
of class $C^1$, 642