Calculus: A Modern Approach

Horst R. Beyer
Louisiana State University (LSU)
Center for Computation and Technology (CCT)
328 Johnston Hall
Baton Rouge, LA 70803, USA

Dedicated to the Holy Spirit

Contents

1 Introduction
  1.1 Short Introduction
  1.2 Background
  1.3 The General Approach of the Text
    1.3.1 Motivational Parts
    1.3.2 Core Theoretical Parts
    1.3.3 Parts Containing Examples and Problems
  1.4 Miscellaneous Aspects of the Approach
  1.5 Requirements of Applications
  1.6 Remarks on the Role of Abstraction in Natural Sciences
2 Calculus I
  2.1 A Sketch of the Development of Rigor in Calculus and Analysis
  2.2 Basics
    2.2.1 Elementary Mathematical Logic
    2.2.2 Sets
    2.2.3 Maps
  2.3 Limits and Continuous Functions
    2.3.1 Limits of Sequences of Real Numbers
    2.3.2 Continuous Functions
  2.4 Differentiation
  2.5 Applications of Differentiation
  2.6 Riemann Integration
3 Calculus II
  3.1 Techniques of Integration
    3.1.1 Change of Variables
    3.1.2 Integration by Parts
    3.1.3 Partial Fractions
    3.1.4 Approximate Numerical Calculation of Integrals
  3.2 Improper Integrals
  3.3 Series of Real Numbers
  3.4 Series of Functions
  3.5 Analytical Geometry and Elementary Vector Calculus
    3.5.1 Metric Spaces
    3.5.2 Vector Spaces
    3.5.3 Conic Sections
    3.5.4 Polar Coordinates
    3.5.5 Quadric Surfaces
    3.5.6 Cylindrical and Spherical Coordinates
    3.5.7 Limits in $\mathbb{R}^n$
    3.5.8 Paths in $\mathbb{R}^n$
4 Calculus III
  4.1 Vector-valued Functions of Several Variables
  4.2 Derivatives of Vector-valued Functions of Several Variables
  4.3 Applications of Differentiation
  4.4 Integration of Functions of Several Variables
  4.5 Vector Calculus
  4.6 Generalizations of the Fundamental Theorem of Calculus
    4.6.1 Green’s Theorem
    4.6.2 Stokes’ Theorem
    4.6.3 Gauss’ Theorem
5 Appendix
  5.1 Construction of the Real Number System
  5.2 Lebesgue’s Criterion for Riemann-integrability
  5.3 Properties of the Determinant
  5.4 The Inverse Mapping Theorem
References
Index of Notation
Index of Terminology

1 Introduction

1.1 Short Introduction

This text is an enlargement of lecture notes written for Calculus I, II and III courses given at the Department of Mathematics of Louisiana State University in Baton Rouge. It follows the syllabi for these courses at LSU. It is devised mainly for teaching standard entry level university calculus courses, but can also be used for teaching courses in advanced calculus or undergraduate analysis oriented towards calculations and applications, and for self-study. The reasons for devising a text of such threefold nature are explained in Section 1.3. The text is also unique in its special attention to the needs of applications and in its unusually elaborate motivations coming from the history of mathematics and from applications. As a result, the text introduces early on basic material that is needed in applied sciences, in particular from the area of differential equations. Its motivations follow Otto Toeplitz’ famous ‘genetic’ method [96].

1.2 Background

Currently, the content coverage and approach in standard calculus texts appear static. Indeed, such courses teach to a large extent views of the 18th century. On the other hand, the demand for analysis skills of increasing sophistication and abstraction in applications is still unbroken. As pointed out in Section 1.6, the need for a higher level of mathematical sophistication in physics, the discipline which is most fundamental for applications, was a byproduct, in particular, of the study of atomic systems. Hence the mathematics education of physicists needs to go beyond calculus. A study of functional analysis, especially that of the spectral theorems of self-adjoint linear operators in Hilbert spaces, considerably enhances the understanding of quantum theory beyond that given in standard quantum mechanics texts.
Such knowledge is extremely helpful in the study of the more advanced quantum theory of fields and, very likely, also for the formulation and understanding of more advanced unified quantum field theories that are still to come. Also in the engineering sciences, the need for higher mathematical sophistication is visible, in particular, in connection with the solution of partial differential equations (PDE). PDE dominate current applications, and functional analysis also provides the basis for their treatment. A good example of the application of functional analytic methods is the method of finite elements, which is widely used in the engineering sciences for the solution of boundary value problems of elliptic differential equations. Also, questions about the relation of approximate solutions, provided by numerical methods, to the solutions of the original PDE gain importance and hence lead into the area of functional analysis.

The mathematical thinking taught in current standard calculus courses provides no proper basis for more advanced courses in the area of analysis, in particular, courses in advanced calculus or undergraduate analysis (footnote 1). As a consequence, the latter do not build on any previous knowledge of calculus, but start completely anew (footnote 2). Frequently, students from natural sciences and engineering, who form a major part of classes, do not attend such advanced courses, mostly for reasons of time. As a consequence, standard calculus courses frequently lead these students into a dead end. In today’s time, where the speed of development of all parts of society rapidly increases, such a procedure no longer appears appropriate. Since a major rise of the mathematical level of standard calculus courses does not appear feasible without losing the bulk of students, the result is a dilemma. The goal of providing a basic calculus education to a large mass of students that is at the same time suitable as a basis for more advanced analysis courses and also for increased demands for analysis skills of higher mathematical sophistication in applications seems unreachable. Visibly, current standard calculus courses pursue only the first part of this goal.

Footnote 1: This is not surprising since precisely that thinking led calculus into a serious crisis at the beginning of the 19th century. Only after that crisis was overcome was the development of more advanced mathematical fields possible.

Footnote 2: Of course, this is not very efficient. Also, significantly, students of mathematics often face substantial problems in the first decisive parts of such courses, which demand a considerably higher level of abstraction. Usually, this problem cannot be avoided by offering honors calculus courses, since most often there are not enough students to fill such courses. Also, the latter are not always taught on a significantly higher level than standard calculus courses.

1.3 The General Approach of the Text

The text tries to reach the whole goal, instead. As is suitable for calculus courses, it has a strong orientation towards calculations, but consistently uses mathematical methods of the 20th century, in particular, the basic concepts of sets and maps, for the development of calculus. It is mainly the use of these efficient concepts that distinguishes 20th century mathematics from older mathematics. In addition, special care was taken to include material that is needed early on in applied sciences, in particular from the area of differential equations. Details on the latter are given in Section 1.5. As a consequence, the text rests on Chapter 2, Basics, of Calculus I that introduces the concepts of sets and maps. Due to their inherent simplicity, these concepts can be understood by the majority of students. This introduction is preceded by a short subsection on elementary mathematical logic to explain the meaning of the notion of a proof.
This takes into account the experience that a large number of students have difficulties in understanding that meaning. It is also hoped that this subsection convinces some students, if still necessary, that they are capable of understanding proofs. Therefore, Chapter 2 should be covered in detail in class. Its thorough study will provide the student with the basic tools that are essential for the understanding of modern mathematics. A student who has mastered this chapter will realize in the following that a main step in the solution of a problem is its reformulation in terms of the ‘language’ provided in Chapter 2. After that, the solution of a large number of problems is obvious. As a consequence, he or she will gradually realize that the seemingly ‘challenging’ nature of many standard calculus problems is due to an inadequate formulation. In this way, the student will learn to appreciate the power of the provided ‘language’ which will guide him or her through the rest of the course.

Mostly, chapters consist of three parts: an introductory motivational part, a core theoretical part and a part containing examples and problems.

1.3.1 Motivational Parts

Those parts consider historical mathematical problems or problems from applications that led to the development of the mathematics in the theoretical part of the chapter. Such problems often have a certain ‘directness’ which is suitable to catch students’ attention and should help every student to get an idea ‘why’ certain mathematics was developed and ‘what mathematics is good for’. In the author’s experience, practically all students have a high interest in such parts and, if these are given, are more inclined to follow subsequent more theoretical investigations. Also, motivations of this type are largely missing in the standard calculus texts known to the author. In this, the text follows Otto Toeplitz’ ‘genetic’ method, suggested in 1926 and realized in his ‘Die Entwicklung der Infinitesimalrechnung, Bd. I.’ from 1949 [96].
To the knowledge of the author, the present text is the first that implements Toeplitz’ method to a large extent and at the same time is capable of covering a three semester course in calculus. On the other hand, differently from Toeplitz, the text does not follow the historical order of the mathematical development because, from today’s perspective, that development was not very efficient. Also, the formal approach to mathematics, with Hilbert as its main proponent, made clear that ‘understanding’ in mathematics is ‘structural understanding’. The latter is an achievement of the 20th century. Presenting the material in the historical order would obstruct the path towards such understanding and be contrary to the intentions of the text.

Also, wherever possible, motivation is taken from applications. This is suitable, in particular, for students from natural sciences and engineering. This includes introductions to sections like that on improper integrals, which uses motivation from the mechanics of periodic motion, where improper integrals occur naturally in the analysis. Also, a large number of examples and problems consider basic problems related to theoretical mechanics, general relativity and quantum mechanics. In this, it pays off that the author is a mathematical physicist who has a first hand research knowledge of these areas. As a consequence, those problems are realistic.

In cases where prototypical problems seemed unavailable, purely historical sketches of the development were used for the purpose of motivation. For instance, such an approach was used in the introduction to the section on set theory. That introduction points out the fact that the original object of study of set theory was the concept of the infinite and that initial resistance against the theory had its roots in ancient Greek philosophical views of the infinite that were still not completely overcome at the time.
The motivational introductions should be accessible to every student and be gone through in detail in class.

1.3.2 Core Theoretical Parts

Those parts give a rigorous development of essential parts of the machinery of analysis. Essentially, they are on the level of a standard undergraduate analysis or advanced calculus text, like Lang’s ‘Undergraduate Analysis’ [63], but proofs are intentionally more detailed and have been simplified as far as possible. For this purpose, current mathematical literature, in particular, the American Mathematical Monthly and the Mathematics Magazine, has also been systematically searched. For instance, this led to the adoption of E. J. McShane’s proof of Lagrange’s multiplier rule [76] which does not use the implicit mapping theorem. Also simplifications suggested by [4], [25], [26], [32], [33], [40], [44], [61], [90] and [97] have been used. As a consequence, the text can also be used to teach undergraduate analysis or advanced calculus courses oriented towards calculations and applications.

In class, the statements of the most important theorems should appear on the blackboard to teach students to work with these statements, even if the corresponding proofs are not fully understood or are skipped. For reasons of time, it is to be expected that a number of proofs have to be omitted or can only be indicated in class. On the other hand, students from mathematics, and also from natural sciences and engineering, are advised to go through proofs that have been omitted or only indicated in class, in self-study. To facilitate such deeper study, this text gives students the chance to look up the full proofs without the necessity for a time consuming study of a large number of other sources (footnote 1). The latter is no easy task for a beginner and usually lacks efficiency. For this reason, the text is also devised for unguided self-study and is very explicit.
In particular, it tries to give also elementary steps in calculations to such an extent that they become evident. As a consequence, large parts of the text should not even need paper and pencil.

Footnote 1: In particular, such study is complicated by different choices of notation. Of course, the author would not discourage students from such study if there is sufficient time, but, generally, a dense undergraduate curriculum should not leave much time for that.

1.3.3 Parts Containing Examples and Problems

The majority of problems and examples are of a type and level occurring in standard university calculus texts in the US, but consistently reformulated in modern terms. The problems are mostly calculational in nature, as is appropriate for calculus courses that are also suitable for students of applied sciences. According to experience, the mastery of the study of applied sciences needs, at the minimum, technical mathematical skills. Sometimes, the opinion is uttered that the advent of mathematical software tools, like Mathematica, Maple and Matlab, has made such skills redundant. In fact, this is not the case, since the use of such software has led to the consideration of problems whose complexity would have prevented an attack in the past. For instance, viewed from the perspective of the algebraic manipulation associated with such problems, this complexity is reflected in the output of such programs. Simplification algorithms cannot possibly know what the user’s intentions are. Hence the user has to guide the software to a useful answer without knowledge of that answer. This process needs a lot of mathematical experience and skill. As a consequence, efficient use of such programs presupposes technical mathematical skills and experience and even a form of structural understanding of mathematical manipulations. In addition, it is well-known that such programs are not completely free of errors. Particular examples are given on pages 263 and 292 of the text.
Therefore, users need to perform routine checks of the results of such programs, which also requires mathematical skills (footnote 1).

Footnote 1: Compared to these requirements, the effort for learning the correct syntax of such programs is relatively low.

The examples appear throughout in the form of fully worked problems. As a consequence, they do not only exemplify the theory, but at the same time teach problem solving and prepare for exams. This procedure is particularly helpful for beginners. Wherever possible, the results of examples have been checked with Mathematica 5.1. Also, a large number of examples and problems consider basic problems from applications, in particular, from theoretical mechanics, general relativity and quantum mechanics. In this, it pays off that the author is a mathematical physicist who has a first hand research knowledge of these areas. As a consequence, those problems are realistic. Every calculus student needs to solve those problems and be able to understand those examples. In particular, in class, the examples should be covered in detail.

1.4 Miscellaneous Aspects of the Approach

(i) The text tries to introduce only essential mathematical structures and terminology and only in places where they are of direct subsequent use. In particular, mathematical notions are developed only to the level needed in the sequel of the text, thereby stressing their tool character.

(ii) Material which is used in the text, but whose development would cause a major disruption of the course, like the proof of Lebesgue’s characterization of Riemann integrability, is deferred to the appendix to make it accessible to interested students. In addition, the appendix contains a complete version of Cantor’s construction of the real numbers as equivalence classes of Cauchy sequences of rational numbers. Today, it is well-known that the whole of analysis and calculus rests on a construction of the real number system.
Therefore, mainly for students of mathematics, such a construction has been included. The frequently used introduction of the real number system by a complicated set of axioms, for example, as in [63], has been avoided since such an introduction is likely to appear implausible, in particular, to such students.

(iii) The basic limit notion of the text is that of limits of sequences. Continuous limits are introduced as a derived concept, but their use is usually avoided. In particular, the definition of the continuity of functions proceeds by means of the conceptually simpler notion of ‘sequential continuity’, instead of the equivalent classical ε, δ-approach. Generally, the latter approach is often problematic for beginners.

(iv) The text contains 210 diagrams whose role is to assist intuition, but not to create the illusion of being able to replace any argument inside a proof. Mistakenly, the latter is sometimes assumed by students. For this reason, it is explained in the introduction of the section on the development of rigor in calculus and analysis why geometric intuition is no longer regarded as a valid tool in mathematical proofs. Still, good diagrams can be useful for the formulation of conjectures.

(v) In general, theorems contain their full set of assumptions, so that a study of their environment is not necessary for their understanding. For the same reason, occasionally, shorter definitions appear as part of theorems, and theorems as well as definitions also contain material that would normally appear only in subsequent remarks.

1.5 Requirements of Applications

The bulk of material needed early on in applied sciences is from the area of differential equations. In physics, this has been the case since the advent of Newtonian mechanics in the 17th century. The advent of quantum theory made it necessary, in particular, to go beyond differential equations on to abstract evolution equations, see, e.g., [8].
Of course, the treatment of differential equations cannot be comprehensive in calculus courses, but a number of important cases can already be treated with methods from calculus. Such cases have been included in this text as examples of calculus applications and in problem sections. For instance, second order differential equations with constant coefficients are already treated in the section on applications of differentiation in Calculus I. The uniqueness of the solutions of such an equation can be proved with the help of an energy inequality. The solutions are found with the help of a simple transformation that eliminates the first order derivative of the unknown function. A two-parametric family of solutions of the resulting equation is easily found. Within the sections on Riemann integration and its applications, separable first order differential equations are solved by integration. The solutions of the equation of motion for a simple pendulum are considered in the introduction to the section on improper integrals in Calculus II. Solutions of Bessel’s differential equation are derived by the method of power series in the section on series of functions. The derivation of solutions of the hypergeometric and the confluent hypergeometric differential equations is part of the subsequent problem section. Connected to differential equations are special functions, in particular, the Gamma and the Beta function. The latter are defined and studied within the section on improper Riemann integrals. That section also derives well-known values of certain exponential integrals used in quantum theory and probability theory and a standard integral representation for Riemann’s zeta function.

In addition, in applications the need often arises to integrate discontinuous functions as well as functions over unbounded domains. Usually, those needs are due to idealizations that make problems accessible to direct analytical calculation.
Such ‘model systems’ are still the main source for the development of an intuitive understanding of natural phenomena (footnote 1). For this reason, applications need an integration theory which is capable of integrating a large class of functions. Lebesgue’s integration theory is well suited for this purpose. Still, for reasons of practicability, the text develops Riemann’s integration theory, though close to its limits. In particular, Lebesgue’s characterization of Riemann integrability is given inside the text, but its proof is deferred to the appendix. For the integration of functions of several variables, we use Serge Lang’s approach to Riemann integration from [63]. This approach is capable of integrating bounded functions, defined on closed bounded intervals, that are continuous except at the points of a ‘negligible’ set. Negligible sets can be covered by a finite number of intervals with an associated sum of volumes which can be made smaller than every preassigned real number $> 0$. Hence negligible sets are particular bounded sets of Lebesgue measure zero.

Footnote 1: The rising importance of numerical investigations has not changed that and likely cannot.

1.6 Remarks on the Role of Abstraction in Natural Sciences

Examples of the fact that the most fundamental of the natural sciences, physics, has always operated on a level of abstraction similar to that of mathematics are easy to find. A first example comes from Newtonian mechanics, whose development was intertwined with that of calculus. The former theory describes strict point particles, that is, particles without any spatial extension. Of course, such point particles have never been observed experimentally and therefore constitute an abstraction that has its roots in ancient Greek geometry. They have always been regarded as an idealization of a much more complicated reality.
Still, the assumption of Newtonian point particles led to predictions that were in excellent agreement with observations and measurement until the advent of quantum theory in the first quarter of the 20th century. Einstein’s theory of special relativity has been the cause of another abstraction to enter physics, namely the unification of time and space into a four dimensional space-time. Such unification led to a remarkable simplification of that theory. Since it is the belief of most physicists that the ‘simplicity’ of a description, that is consistent with the experimental facts and that predicts new phenomena that are subsequently observed, at least partially, reflects an objective reality, nowadays this unification is a commonly used abstraction. A further abstraction is due to Einstein’s theory of general relativity that absorbed the gravitational field into the geometry of the four dimensional space-time. Subsequently, quantum theory led to the description of matter by elements of abstract Hilbert spaces with corresponding physical observables being spectral measures of self-adjoint operators in this space. In the algebraic quantum theory of fields, observables are elements of a von Neumann algebra, and physical states of the field are positive linear forms on the algebra. The above indicates that the development of physics towards the understanding of deeper aspects of nature was paralleled by the application of mathematical methods of increasing sophistication. In order to avoid the occurrence of errors, the last also necessitated an increasing stress on mathematical rigor in physics. Current physics is as abstract as mathematics since it studies practically exclusively phenomena that cannot be perceived by human senses, but only indirectly by help of highly sophisticated experimental equipment. Hence, similar to mathematics, in physics visual intuition is no longer of much help in the analysis of phenomena. 
In contrast, the development of physics supports the view that theories based on direct human perception inevitably contain extrapolations on the nature of things which ultimately turn out to be seriously flawed. Finally, in current speculative physical theories, i.e., theories without experimental evidence, there is nothing else available than mathematical consistency and rigor to give such theories credibility. Those can only try to ‘replace’ experiment, temporarily, by mathematical consistency and rigor, although ultimately only the outcome of experiments decides on the ‘truth’ of a physical theory.

Viewed from this perspective, it is quite obvious that calculus courses need to go into the direction of increased mathematical sophistication in order to narrow a widening gap to contemporary applications. In this connection, it needs to be remembered that after the advent of quantum theory, it was recognized that the laws of quantum theory also provide the basis for the laws of chemistry. Therefore, it is to be expected that the other natural sciences and the engineering sciences follow the development of physics towards the use of more subtle mathematical methods. Such a trend is already obvious.

Acknowledgments

I am indebted to Kostas Kokkotas, Tübingen, for suggesting the inclusion of a number of valuable examples in the text.

2 Calculus I

2.1 A Sketch of the Development of Rigor in Calculus and Analysis

It is evident that a science that leads to contradictory statements loses its value. Therefore, the occurrence of such an event sends a shock wave through the scientific community. The immediate response is an analysis of the validity of the reasoning that leads to the contradiction.
In case that reasoning appears to be ‘valid’, i.e., if the contradiction can be derived by generally accepted rules of inference (‘logic’) from assumptions that are generally believed to be true (‘axioms’), the field is in a crisis because those assumptions and/or rules need to be revised until the contradiction is resolved. If this succeeds, it has to be determined whether all previously obtained results of the science are derivable from the revised basis. Potentially, a large number of results could be lost in this way.

Probably the first example of a serious crisis in mathematics is the discovery in ancient Greece around 450 B.C. that the length, $\sqrt{2}$, of a diagonal of a square with sides of length 1 is no rational number, a fact that will be proved in Example 2.2.15 below. Tradition attributes this discovery to a member of the Pythagorean school of thought. The fundamental assumption of that school was that the essence of everything is expressible in terms of whole numbers and their ratios, i.e., of quantities which are discrete in character. As a consequence of the discovery, that line of thought lost its basis. As a result, Plato’s school of thought completely reorganized the mathematical knowledge of the time by giving it an exclusively geometric basis. In this, the product of two lengths is not another length, but an area, for instance, that of a rectangle. Hence the equation $x^2 = 2$ can be solved geometrically, for instance, by constructing a square with edge $x$ whose area is equal to the area of a rectangle with sides 2 and 1. As a consequence, algebraic equations were solved in terms of geometric quantities. On the other hand, viewed from today’s perspective, that approach bypassed the problem of irrational quantities rather than solving it, and can be seen as a prime reason for a major delay of the development of mathematical calculus / analysis. The latter was developed as late as the 17th century in Western Europe.
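For orientation, the classical argument by contradiction for the irrationality of $\sqrt{2}$ can be sketched as follows. This is only a minimal sketch in standard notation; the integers $p$, $q$ and $r$ are symbols introduced here for illustration, and the proof given in Example 2.2.15 may be organized differently.

```latex
Suppose that $\sqrt{2} = p/q$ with integers $p, q \geq 1$ that have no common factor. Then
\begin{align*}
p^2 &= 2 q^2 , && \text{so $p^2$ is even, hence $p$ is even, say $p = 2r$,} \\
4 r^2 &= 2 q^2 , && \text{so $q^2 = 2 r^2$ is even, hence $q$ is even.}
\end{align*}
Thus $2$ divides both $p$ and $q$, contradicting the choice of $p$ and $q$.
Hence $\sqrt{2}$ is not a rational number.
```

The argument uses only that the square of an odd integer is odd, which is the kind of arithmetic statement on which the Pythagorean view relied.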
The crisis gave important reasons for the development of the axiomatic method in mathematics in ancient Greece, i.e., proof by deduction from explicitly stated postulates. Without doubt, this method is the single most important contribution of ancient Greece to mathematics and has remained the basis of mathematics until today. In style, modern mathematics texts, including the present text, mirror that of the epoch making thirteen books of Euclid’s Elements written around 300 B.C. [37]. Previous Egyptian and Babylonian mathematics made no distinction between exact and approximate results, nor were there indications of logical proofs or derivations. On the other hand, the Egyptians and Babylonians already had quite accurate approximations for $\pi$ and square roots that were needed in land survey. For instance, the Egyptians determined the value of $\pi$ within an error of $2 \cdot 10^{-2}$ and the value of $\sqrt{2}$ within an error of $10^{-4}$. The Babylonians were already familiar with the so called Pythagorean theorem and determined the value of $\pi$ within an error of $10^{-7}$ and the value of $\sqrt{2}$ within an error of $10^{-6}$.

In order to be considered as properly established in ancient Greece, a theorem had to be given a geometric meaning. This tradition continued in the Middle Ages and the Renaissance in the West. Geometric intuition was more trusted than insight into the nature of numbers. In the early phases of the development of calculus / analysis in the 17th and 18th century, and also in the views of its founding fathers Isaac Newton and Gottfried Wilhelm Leibniz, geometric intuition was of major importance, but in the sequel it was gradually replaced by arithmetic. A major factor in this process was the construction of non-euclidean geometries by Nicolai Lobachevsky (1829) [72], Janos Bolyai (1831) [11] and earlier, but unpublished, by Gauss. In his ‘Elements’, Euclid bases geometry on five postulates that are assumed to be valid.
Generally, only the first four of them were considered geometrically intuitive, whereas the fifth, the so-called parallel postulate, was expected to be a consequence of the other postulates. For about 2000 years, an enormous effort went into the investigation of this question. The construction of non-Euclidean geometries which satisfy the first four, but not the fifth, of Euclid’s postulates proved the independence of the parallel postulate from the other postulates. This result stripped Euclidean geometry of the central role it had retained for about 2000 years.

The final removal of geometric intuition as a means of mathematical proof was caused by a number of geometrically non-intuitive results of calculus / analysis, in particular, the demonstration of the existence of a continuous nowhere differentiable function by Karl Weierstrass in 1872 [99], see Example 3.4.13, and the construction of a plane-filling continuous curve by Giuseppe Peano in 1890 [84], see Example 3.4.14. Weierstrass conceived and in large part carried out a program known as the arithmetization of analysis, under which analysis is based on a rigorous development of the real number system. This is the common approach to this day. For this reason, Weierstrass is often considered the father of modern analysis. A common rigorous development of the real number system by use of Cauchy sequences is given in Appendix 5.1.

Today, reference to geometric intuition is not considered a valid argument in the proof of a theorem. Of course, such intuition might give hints on how to carry out such a proof, but the means of the proof itself are purely formal. This situation is similar to that of blindfold chess, i.e., the playing of a game of chess without seeing the board. That formal approach was suggested by David Hilbert for the foundation of mathematics and has become the standard of most working mathematicians.
It culminated in the collective works of a group of mathematicians publishing under the pseudonym ‘Bourbaki’. The series comprises 40 monographs that became a standard reference on the fundamental aspects of modern mathematics.

2.2 Basics

2.2.1 Elementary Mathematical Logic

In the 17th century, Leibniz suggested the construction of a universal language for the whole of mathematics that allows the formalization of proofs. In 1671, he constructed a mechanical calculator, the step reckoner, that was capable of performing multiplication, division and the calculation of square roots. Also in view of his involvement in the construction of other mechanical devices, like pumps, hydraulic presses, windmills, lamps, submarines and clocks, it is likely that he envisioned machines that ultimately could perform proofs. The first scientific works on the algebraization of Aristotelian logic appeared in 1847 [10] and 1858 [81], by George Boole and Augustus De Morgan, respectively.

The formation of mathematical logic as an independent mathematical discipline is linked with Hilbert’s program on formal axiomatic systems, mentioned in Section 2.1, that resulted from the recognition of the unreliability of geometrical intuition. That program called for a formalization of all of mathematics in axiomatic form, together with a proof that it is free from contradictions, i.e., that it is what is called ‘consistent’. The consistency proof itself was to be carried out using only what Hilbert called ‘finitary’ methods. Subsequently, neither Leibniz’s nor Hilbert’s vision has been achieved. However, what has been achieved is sufficient for most working mathematicians today. In the following, we present only the very basics of symbolic logic and display some basic types of methods of proof in simple cases.
Despite its brevity, this section is very important because the given logical rules for correct mathematical reasoning will be in constant use throughout the book (as well as throughout the whole of mathematics) without explicit mention. Therefore, its careful study is advised to the reader. The reader should also fill in additional steps in proofs whenever he/she feels the necessity for this. This should become a routine operation for the rest of the book as well. In the author’s experience, this is necessary to fathom the material.

Definition 2.2.1. (Statements) A statement (or proposition) is an assertion that can be determined as true or false. Often abstract letters like $A, B, C, \dots$ are used for their representation.

Example 2.2.2. The following are statements:
(i) George Washington was the first president of the United States,
(ii) $2 + 2 = 27$,
(iii) There are no positive integers $a$, $b$, $c$ and $n$ with $n > 2$ such that $a^n + b^n = c^n$. (Fermat’s conjecture)
The following are not statements:
(iv) Which way to the Union Station?,
(v) Go jump into the lake!

Definition 2.2.3. (Truth values) The truth value of a statement is denoted by ‘T’ if it is true and by ‘F’ if it is false.

Example 2.2.4. For example, the statement
\[ 9 + 16 = 25 \tag{2.2.1} \]
is true and therefore has truth value ‘T’, whereas the statement
\[ 9 + 16 = 26 \]
is false and therefore has truth value ‘F’. Also, the statement Example 2.2.2 (i) is true and the statement Example 2.2.2 (ii) is false. The statement Example 2.2.2 (iii) remained undecided for more than 350 years until it was proved to be true by Andrew Wiles in 1995.

Definition 2.2.5. (Connectives) Connectives like ‘and’, ‘or’, ‘not’, . . . stand for operations on statements.

Connective                        Symbol               Name
‘not’                             $\neg$               Negation
‘and’                             $\wedge$             Conjunction
‘or’                              $\vee$               Disjunction
‘if . . . then’                   $\Rightarrow$        Conditional
‘. . . if and only if . . .’      $\Leftrightarrow$    Bi-conditional

Example 2.2.6. For example, the statement ‘It is not the case that $9 + 16 = 25$’ is the negation (or ‘contrapositive’) of (2.2.1).
It can be stated more simply as $9 + 16 \neq 25$. Other examples are compounds like the following.

Example 2.2.7.
(i) Tigers are cats and alligators are reptiles,
(ii) Tigers are cats or (tigers are) reptiles,
(iii) If some tigers are cats, and some cats are black, then some tigers are black,
(iv) $9 + 16 = 25$ if and only if $8 + 15 = 23$.

Definition 2.2.8. (Truth tables) A truth table is a pictorial representation of all possible outcomes of the truth value of a compound sentence. The connectives are defined by the following truth tables for all statements $A$ and $B$.

$A$   $B$   $\neg A$   $A \wedge B$   $A \vee B$   $A \Rightarrow B$   $A \Leftrightarrow B$
T     T     F          T              T            T                   T
T     F     F          F              T            F                   F
F     T     T          F              T            T                   F
F     F     T          F              F            T                   T

Note that the compound $A \vee B$ is true if at least one of the statements $A$ and $B$ is true. This is different from the normal usage of ‘or’ in English. It can be described as ‘and/or’. Therefore, the statement 2.2.7 (ii) is true. Also, the statements 2.2.7 (i) and 2.2.7 (iv) are true. Also, note that from a true statement $A$ there cannot follow a false statement $B$, i.e., in that case the truth value of $A \Rightarrow B$ is false. This can be used to identify invalid arguments and also provides the logical basis for so-called indirect proofs.

Note that valid rules of inference do not only come from logic, but also from the field (Arithmetic, Number Theory, Set Theory, ...) the statement is associated with. For instance, the equivalence 2.2.7 (iv) is concluded by arithmetic rules, not by logic. Those rules could turn out to be inconsistent with logic in that they allow a false statement to be concluded from a true statement. Such rules would have to be abandoned. An example of this is given by the statement 2.2.7 (iii). Although the first two statements are true, the whole statement is false because there are no black tigers. In the following, the occurrence of such a contradiction is indicated by a special symbol. Note that the rule of inference in 2.2.7 (iii) is invalid even if there were black tigers.

Example 2.2.9.
(Inconsistent rules) Assume that the real numbers are part of a larger collection of ‘ideal numbers’ for which there is a multiplication which reduces to the usual multiplication if the factors are real. Further, assume that for every ideal number $z$ there is a square root $\sqrt{z}$, i.e., such that
\[ (\sqrt{z})^2 = z , \]
which is identical to the positive square root if $z$ is real and positive. Finally, assume that for all ideal numbers $z_1$, $z_2$, it holds that
\[ \sqrt{z_1 z_2} = \sqrt{z_1} \, \sqrt{z_2} . \]
Note that the last rule is correct if $z_1$ and $z_2$ are both real and positive. Then we arrive at the following contradiction:
\[ -1 = (\sqrt{-1})^2 = \sqrt{-1} \, \sqrt{-1} = \sqrt{(-1)(-1)} = \sqrt{1} = 1 . \]
Hence an extension of the real numbers with all these properties does not exist.

A simple example of an indirect proof is the following.

Example 2.2.10. (Indirect proof) Prove that there are no integers $m$ and $n$ such that
\[ 2m + 4n = 45 . \tag{2.2.2} \]

Proof. The proof is indirect. Assume the opposite, i.e., that there are integers $m$ and $n$ such that (2.2.2) is true. Then the left hand side of the equation is divisible without remainder by 2, whereas the right hand side is not. This is a contradiction. Hence the opposite of the assumption is true. This is what we wanted to prove.

Example 2.2.11. Calculate the truth table of the statements
\[ [(A \Rightarrow B) \wedge (B \Rightarrow C)] \Rightarrow (A \Rightarrow C) \quad \text{(Transitivity)} , \]
\[ [(A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C)] \Rightarrow C \quad \text{(Proof by cases)} , \]
\[ (\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B) \quad \text{(Contraposition)} . \tag{2.2.3} \]
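The three statements of Example 2.2.11 can also be checked mechanically before the tables are worked out by hand. The following Python sketch is an illustration only and not part of the original text; the helper names `implies`, `iff` and `is_tautology` are assumptions introduced here. It enumerates all assignments of truth values and tests whether a compound statement is true in every row of its truth table.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth value of 'if a then b': false only if a is true and b is false."""
    return (not a) or b


def iff(a: bool, b: bool) -> bool:
    """Truth value of 'a if and only if b'."""
    return a == b


def is_tautology(formula, n_vars: int) -> bool:
    """Return True if `formula` is true for every assignment of its n_vars variables."""
    return all(formula(*row) for row in product([True, False], repeat=n_vars))


# The three statements of Example 2.2.11 (hypothetical names chosen for this sketch).
def transitivity(a, b, c):
    return implies(implies(a, b) and implies(b, c), implies(a, c))


def proof_by_cases(a, b, c):
    return implies((a or b) and implies(a, c) and implies(b, c), c)


def contraposition(a, b):
    return iff(implies(not b, not a), implies(a, b))


if __name__ == "__main__":
    for name, formula, n in [
        ("Transitivity", transitivity, 3),
        ("Proof by cases", proof_by_cases, 3),
        ("Contraposition", contraposition, 2),
    ]:
        print(name, is_tautology(formula, n))
```

Such a check only confirms the final column of each table; writing out the intermediate columns by hand, as in the solution, remains the better exercise.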
Solution: The truth tables of the three statements in (2.2.3) are as follows.

$A$   $B$   $C$   $A \Rightarrow B$   $B \Rightarrow C$   $(A \Rightarrow B) \wedge (B \Rightarrow C)$
T     T     T     T                   T                   T
T     T     F     T                   F                   F
T     F     T     F                   T                   F
T     F     F     F                   T                   F
F     T     T     T                   T                   T
F     T     F     T                   F                   F
F     F     T     T                   T                   T
F     F     F     T                   T                   T

$A$   $B$   $C$   $A \Rightarrow C$   $[(A \Rightarrow B) \wedge (B \Rightarrow C)] \Rightarrow (A \Rightarrow C)$
T     T     T     T                   T
T     T     F     F                   T
T     F     T     T                   T
T     F     F     F                   T
F     T     T     T                   T
F     T     F     T                   T
F     F     T     T                   T
F     F     F     T                   T

$A$   $B$   $C$   $A \vee B$   $A \Rightarrow C$   $B \Rightarrow C$   $(A \Rightarrow C) \wedge (B \Rightarrow C)$
T     T     T     T            T                   T                   T
T     T     F     T            F                   F                   F
T     F     T     T            T                   T                   T
T     F     F     T            F                   T                   F
F     T     T     T            T                   T                   T
F     T     F     T            T                   F                   F
F     F     T     F            T                   T                   T
F     F     F     F            T                   T                   T

$A$   $B$   $C$   $(A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C)$   $[(A \vee B) \wedge (A \Rightarrow C) \wedge (B \Rightarrow C)] \Rightarrow C$
T     T     T     T                                                               T
T     T     F     F                                                               T
T     F     T     T                                                               T
T     F     F     F                                                               T
F     T     T     T                                                               T
F     T     F     F                                                               T
F     F     T     F                                                               T
F     F     F     F                                                               T

$A$   $\neg A$   $B$   $\neg B$   $\neg B \Rightarrow \neg A$   $A \Rightarrow B$   $(\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B)$
T     F          T     F          T                             T                   T
T     F          F     T          F                             F                   T
F     T          T     F          T                             T                   T
F     T          F     T          T                             T                   T

The members of (2.2.3) are so-called tautologies, i.e., statements that are true independent of the truth values of their variables. At the same time, they are frequently used rules of inference in mathematics, i.e., for all statements $A$, $B$ and $C$, the truth of the left hand side (in large brackets) of the relations implies the truth of the corresponding right hand side.

Example 2.2.12. (Transitivity) Consider the statements
(i) If Mike is a tiger, then he is a cat,
(ii) If Mike is a cat, then he is a mammal,
(iii) If Mike is a tiger, then he is a mammal.
Statements (i), (ii) are both true. Hence the truth of (iii) follows by the transitivity of $\Rightarrow$ (and since ‘Mike’, the tiger of LSU, is indeed a tiger, he is also a mammal).

Example 2.2.13. (Proof by cases) Prove that
\[ n + |n - 1| \geq 1 \tag{2.2.4} \]
for all integers $n$.

Proof. For this, let $n$ be some integer. We consider the cases $n \leq 1$ and $n \geq 1$. If $n$ is an integer such that $n \leq 1$, then $n - 1 \leq 0$ and therefore
\[ n + |n - 1| = n + 1 - n = 1 \geq 1 . \]
If $n$ is an integer such that $n \geq 1$, then $n - 1 \geq 0$ and therefore
\[ n + |n - 1| = n + n - 1 = 2n - 1 \geq 2 - 1 = 1 . \]
Hence in both cases (2.2.4) is true. The statement follows since any integer is $\leq 1$ and/or $\geq 1$.

Example 2.2.14. (Contraposition) Prove that if the square of an integer is even, then the integer itself is even.

Proof.
We define statements $A$, $B$ as ‘The square of the integer (in question) is even’ and ‘The integer (in question) is even’, respectively. Hence $\neg B$ corresponds to the statement ‘The integer (in question) is odd’, and $\neg A$ corresponds to the statement ‘The square of the integer (in question) is odd’. Hence the statement follows by contraposition if we can prove that the square of any odd integer is odd. For this, let $n$ be some odd integer. Then there is an integer $m$ such that $n = 2m + 1$. Hence
\[ n^2 = (2m + 1)^2 = 4m^2 + 4m + 1 = 2 \, (2m^2 + 2m) + 1 \]
is an odd integer and the statement follows.

Based on the result in the previous example, we can now prove the result mentioned in Section 2.1 that there is no rational number whose square is equal to 2.

Example 2.2.15. (Indirect proof) Prove that there is no rational number whose square is 2.

Proof. The proof is indirect. Assume on the contrary that there is such a number $r$. Without restriction, we can assume that $r = p/q$ where $p$, $q$ are integers without common divisor different from 1 and that $q \neq 0$. By definition,
\[ r^2 = \left( \frac{p}{q} \right)^2 = \frac{p^2}{q^2} = 2 . \]
Hence it follows that
\[ p^2 = 2 q^2 \]
and therefore by the previous example that 2 is a divisor of $p$. Hence there is an integer $\bar{p}$ such that $p = 2 \bar{p}$. Substitution of this identity into the previous equation gives
\[ 2 \bar{p}^2 = q^2 . \]
Hence it follows again by the previous example that 2 is also a divisor of $q$. As a consequence, $p$, $q$ have 2 as a common divisor, which is in contradiction to the assumption. Hence there is no rational number whose square is equal to 2.

Problems

1) Decide which of the following are statements.
a) Did you solve the problem?
b) Solve the problem!
c) The solution is correct.
d) Maria has green eyes.
e) Soccer is the national sport in many countries.
f) Soccer is the national sport in Germany.
g) During the last year, soccer had the most spectators among all sports in Germany.
h) Explain your solution!
i) Can you explain your solution?
j) Indeed, the solution is correct, but can you explain it?
k) The solution is correct; please, demonstrate it on the blackboard.

2) Translate the following composite sentences into symbolic notation using letters for basic statements which contain no connectives.
a) Either John is taller than Henry, or I am subject to an optical illusion.
b) If John’s car breaks down, then he either has to come by bus or by taxi.
c) Fred will stay in Europe, and he or George will visit Rome.
d) Fred will stay in Europe and visit Rome, or George will visit Rome.
e) I will travel by train or by plane.
f) Neither Newton nor Einstein created quantum theory.
g) If and only if the sun is shining, I will go swimming today; in case I go swimming, I will have an ice cream.
h) If students are tired or distracted, then they don’t study well.
i) If students focus on learning, their knowledge will increase; and if they don’t focus on learning, their knowledge will remain unchanged.

3) Denote by $M$, $T$, $W$ the statements ‘Today is Monday’, ‘Today is Tuesday’ and ‘Today is Wednesday’, respectively. Further, denote by $S$ the statement ‘Yesterday was Sunday’. Translate the following statements into proper English.
a) $M \to (T \vee W)$ ,
b) $S \leftrightarrow M$ ,
c) $S \wedge (M \vee T)$ ,
d) $(S \to T) \vee M$ ,
e) $M \leftrightarrow (T \wedge (\neg W)) \vee S$ ,
f) $(M \leftrightarrow T) \wedge ((\neg W) \vee S)$ .

4) By use of truth tables, prove that
a) $\neg (\neg A) \Leftrightarrow A$ ,
b) $(A \wedge B) \Leftrightarrow (B \wedge A)$ ,
c) $(A \vee B) \Leftrightarrow (B \vee A)$ ,
d) $(A \Leftrightarrow B) \Leftrightarrow (B \Leftrightarrow A)$ ,
e) $\neg (A \wedge B) \Leftrightarrow (\neg A) \vee (\neg B)$ ,
f) $\neg (A \vee B) \Leftrightarrow (\neg A) \wedge (\neg B)$ ,
g) $(A \to B) \Leftrightarrow (\neg A) \vee B$ ,
h) $A \wedge (B \wedge C) \Leftrightarrow (A \wedge B) \wedge C$ ,
i) $A \vee (B \vee C) \Leftrightarrow (A \vee B) \vee C$ ,
k) $A \vee (B \wedge C) \Leftrightarrow (A \vee B) \wedge (A \vee C)$ ,
l) $A \wedge (B \vee C) \Leftrightarrow (A \wedge B) \vee (A \wedge C)$ ,
for arbitrary statements $A$, $B$ and $C$.

5) Assume that $a\ (b\ c) = a\ b\ c$ for all real $a$, $b$ and $c$ is a valid arithmetic rule of inference. Derive from this a contradiction to the valid arithmetic statement that $0 \neq 1$. Therefore, conclude that the enlargement of the field of arithmetic by addition of the above rule would lead to an inconsistent field.
6) Prove indirectly that $3n\ 2$ is odd if $n$ is an odd integer.

7) Prove indirectly that there are no integers $m > 0$ and $n > 0$ such that $m^2\ n^2 = 1$.

8) If $a$, $b$ and $c$ are odd integers, then there is no rational number $x$ such that $ax^2 + bx + c = 0$. [Hint: Assume that there is such a rational number $x = r/s$ where $r$, $s \neq 0$ are integers without common divisor. Show that this implies the equation $r(ar + bs) = -cs^2$, which is contradictory.]

9) Prove that there is an infinite number of prime numbers, i.e., of natural numbers $\geq 2$ that are divisible without remainder only by 1 and by that number itself. [Hint: Assume the opposite and construct a number which is larger than the largest prime number, but not divisible without remainder by any of the prime numbers.]

10) Prove by cases that $|x\ 1|\ |x\ 2| \leq 3$ for all real $x$.

11) Prove by cases that $|x\ 1|\ |x\ 2| \geq 3$ for all real $x$.

12) Prove by cases that
\[ \left| \frac{a}{b} \right| = \frac{|a|}{|b|} \]
for all real numbers $a$, $b$ such that $b \neq 0$.

13) Prove by cases that if $n$ is an integer, then $n^3$ is of the form $9k + r$ where $k$ is some integer and $r$ is equal to $-1$, $0$ or $1$.

14) Prove that if $n$ is an integer, then $n^5 - n$ is divisible by 5. [Hint: Factor the polynomial $n^5 - n$ as far as possible. Then consider the cases that $n$ is of the form $n = 5q + r$ where $q$ is an integer and $r$ is equal to 0, 1, 2, 3 or 4.]

2.2.2 Sets

Set theory was created by Georg Cantor between the years 1874 and 1897. Its development was triggered by the general effort to develop a rigorous basis for calculus / analysis in the 19th century. As we shall see later, for this it is necessary to treat infinite collections of real numbers. Since antiquity, most mathematicians did not consider collections of infinitely many objects as valid objects of thought. This is likely due to the influence of ancient Greek philosophy, in particular that of Aristotle (384-322 B.C.), that dominated thinking in the West up to the 18th century.
According to Aristotle, the infinite is imperfect, unfinished and therefore unthinkable; it is formless and confused. Hence it had to be excluded from consideration. Precisely such consideration is undertaken by set theory. For this reason, Cantor’s work initially received much criticism and was accused of dealing with fictions. Once its use for calculus / analysis was understood, attitudes began to change, and by the beginning of the 20th century, set theory was recognized as a distinct branch of mathematics. Finally, it even provided the basis for the whole of mathematics in the work of Bourbaki mentioned in Section 2.1. Today, the notions of set theory seem so natural that the sometimes fierce debates at the time of its creation are hard to understand.

In the following, only the very basics of Cantor’s original formulation of set theory are given, which is sufficient for the purposes of the book. Today, that approach is called ‘naive’ set theory because it uses a definition of sets which is too broad and leads to contradictions if its full generality is exploited. One such contradiction, the so-called Zermelo-Russell paradox, is described at the end of this section. So a more restrictive definition of sets is needed to avoid such contradictions. For this, we refer to books on axiomatic set theory. In the following, such paradoxes will not play a role because calculus / analysis naturally deals with a far reduced class of sets which satisfy the more restrictive definition of axiomatic set theory.

Like the previous section, this section is very important because the given notions of set theory will be in consistent use throughout the book as an efficient unifying language, but without going as far as Bourbaki’s work. Therefore, its careful study is advised to the reader. Like the material of the previous section, its apparent simplicity should not lead to an underestimation of its importance.
Precisely the achievement of such simplicity is the ultimate goal of the whole of mathematics because it signals a full understanding of the studied object. Complexity just signals a deficient understanding. In addition, from a practical point of view, such simplicity drastically reduces the chance of the occurrence of errors. In the following, we adopt the naive definition of sets given by Cantor.

Definition 2.2.16. (Sets) A set is an aggregation of definite, different objects of our intuition or of our thinking, to be conceived as a whole. Those objects are called the elements of the set. This implies that for a given set $A$ and any given object $a$ it follows that either $a$ is an element of $A$ or it is not. The first is denoted by
\[ a \in A , \]
and the second is denoted by
\[ a \notin A . \]
The set without any elements, the so-called ‘empty set’, is denoted by $\phi$.

Example 2.2.17. Examples of sets are
the set of all cats,
the set of the lowercase letters of the Latin alphabet,
the set of odd integers.

Definition 2.2.18. (Elements) For a set $A$, the following statements have the same meaning:
$a$ is in $A$,
$a$ is an element of $A$,
$a$ is a member of $A$,
$a \in A$.

Given some not necessarily different objects $x_1, x_2, \dots$, the set containing these objects is denoted by
\[ \{ x_1, x_2, \dots \} . \]
In particular, we define the set of natural numbers $\mathbb{N}$, the set of natural numbers $\mathbb{N}^*$ without 0, the set of integers $\mathbb{Z}$ and the set of integers $\mathbb{Z}^*$ without 0 by

Definition 2.2.19. (Natural numbers, integers)
\[ \mathbb{N} := \{ 0, 1, 2, 3, \dots \} , \quad \mathbb{N}^* := \{ 1, 2, 3, \dots \} , \]
\[ \mathbb{Z} := \{ 0, 1, -1, 2, -2, 3, -3, \dots \} , \quad \mathbb{Z}^* := \{ 1, -1, 2, -2, 3, -3, \dots \} . \]

Another way of defining a set is by a property characterizing its elements, i.e., by a property which is shared by all its elements, but not by any other object:
\[ \{ x : x \text{ has the property } P(x) \} . \]
It is read as: ‘The set of all $x$ such that $P(x)$’. In this, the symbol ‘:’ is read as ‘such that’.
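For finite collections, both ways of writing down a set have a close analogue in Python’s built-in `set` type. The following sketch is an illustration only and not part of the original text; the finite range `universe`, standing in for the integers, and the names `odd` and `listed` are assumptions made here, since an infinite set cannot be enumerated on a computer.

```python
# A finite stand-in for the integers (an assumption of this sketch).
universe = range(-10, 11)

# Set-builder notation {x : x has the property P(x)}, with P(x) = 'x is odd',
# restricted to the finite universe above.
odd = {x for x in universe if x % 2 != 0}

# A set given by listing not necessarily different objects:
# repeated objects are counted only once.
listed = {1, 1, 2, 3, 5}

# Membership corresponds to the operators `in` and `not in`.
print(3 in odd)        # 3 is an element of odd
print(4 not in odd)    # 4 is not an element of odd
print(listed == {1, 2, 3, 5})
```

The comprehension mirrors the reading ‘the set of all x such that P(x)’: the part after `if` plays the role of the property P.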
In particular, we define the set of rational numbers $\mathbb{Q}$, the set of rational numbers $\mathbb{Q}^*$ without 0, the set of real numbers $\mathbb{R}$ and the set of real numbers $\mathbb{R}^*$ without 0 by

Definition 2.2.20. (Rational and real numbers)
\[ \mathbb{Q} := \{ p/q : p \in \mathbb{Z} \wedge q \in \mathbb{N} \wedge q \neq 0 \} , \quad \mathbb{Q}^* := \{ p/q : p \in \mathbb{Z}^* \wedge q \in \mathbb{N} \wedge q \neq 0 \} , \]
\[ \mathbb{R} := \{ x : x \text{ is a real number} \} , \quad \mathbb{R}^* := \{ x : x \text{ is a non-zero real number} \} . \]

Definition 2.2.21. (Subsets, equality of sets) For all sets $A$ and $B$, we define
\[ A \subset B :\Leftrightarrow \text{Every element of } A \text{ is also an element of } B \]
and say that ‘$A$ is a subset of $B$’, ‘$A$ is contained in $B$’, ‘$A$ is included in $B$’ or ‘$A$ is part of $B$’. Finally, we define
\[ A = B :\Leftrightarrow A \subset B \wedge B \subset A \Leftrightarrow A \text{ and } B \text{ contain the same elements} . \]
Here and in the following, wherever meaningful, the symbol ‘:’ in front of other symbols means and is read as ‘per definition’.

Example 2.2.22. For instance,
\[ \{ 1, 1, 2, 3, 5 \} \subset \{ 1, 1, 2, 3, 5, 8, 13 \} , \]
\[ \{ 1, 1, 2, 3, 5 \} \subset \{ [ (1 + \sqrt{5})^n - (1 - \sqrt{5})^n ] / (2^n \sqrt{5}) : n \in \mathbb{N}^* \} , \]
\[ \{ 1, 2, 3, 3, 5, 1 \} = \{ 1, 2, 3, 5 \} , \]
\[ \{ 1, 1, 2, 3, 5, \dots \} = \{ [ (1 + \sqrt{5})^n - (1 - \sqrt{5})^n ] / (2^n \sqrt{5}) : n \in \mathbb{N}^* \} . \]

In particular, we define subsets of $\mathbb{R}$, so-called intervals, by

Definition 2.2.23.
\[ [a, b] := \{ x \in \mathbb{R} : a \leq x \leq b \} , \quad (a, b) := \{ x \in \mathbb{R} : a < x < b \} , \]
\[ [a, b) := \{ x \in \mathbb{R} : a \leq x < b \} , \quad (a, b] := \{ x \in \mathbb{R} : a < x \leq b \} , \]
\[ [c, \infty) := \{ x \in \mathbb{R} : x \geq c \} , \quad (c, \infty) := \{ x \in \mathbb{R} : x > c \} , \]
\[ (-\infty, d) := \{ x \in \mathbb{R} : x < d \} , \quad (-\infty, d] := \{ x \in \mathbb{R} : x \leq d \} \]
for all $a, b \in \mathbb{R}$ such that $a \leq b$ and $c, d \in \mathbb{R}$.

We define the following operations on sets.

Fig. 1: Two subsets A and B of the plane.
Fig. 2: Union and intersection of A and B. The last is given by the blue domain.
Fig. 3: The relative complement of B in A.

Definition 2.2.24. (Operations on sets, I) For all sets $A$ and $B$, we define

(i) their union $A \cup B$, read: ‘$A$ union $B$’, by
\[ A \cup B := \{ x : x \in A \vee x \in B \} , \]

(ii) and their intersection $A \cap B$, read: ‘$A$ intersection $B$’, by
\[ A \cap B := \{ x : x \in A \wedge x \in B \} . \]
If $A \cap B = \phi$, we say that $A$ and $B$ are disjoint.

(iii)
the relative complement of $B$ in $A$, $A \setminus B$, read: ‘$A$ without $B$’ or ‘$A$ minus $B$’, by
\[ A \setminus B := \{ x : x \in A \wedge x \notin B \} . \]

(iv) their cross (or Cartesian / direct) product $A \times B$, read: ‘$A$ cross $B$’, by
\[ A \times B := \{ (x, y) : x \in A \wedge y \in B \} \]
where ordered pairs $(x_1, y_1)$, $(x_2, y_2)$ are defined equal,
\[ (x_1, y_1) = (x_2, y_2) , \]
if and only if $x_1 = x_2$ and $y_1 = y_2$. We also use the notation $A^2$ for $A \times A$. More generally, we define for $n \in \mathbb{N}$ such that $n \geq 3$ and sets $A_1, \dots, A_n$ the corresponding Cartesian product
\[ A_1 \times \dots \times A_n \tag{2.2.5} \]
to consist of all ordered $n$-tuples $(x_1, \dots, x_n)$ of elements $x_1 \in A_1, \dots, x_n \in A_n$. Also in this case, we define such ordered $n$-tuples $(x_1, \dots, x_n)$ and $(y_1, \dots, y_n)$ to be equal if and only if all their components are equal, i.e., if and only if $x_1 = y_1, \dots, x_n = y_n$. We also use the notation
\[ \bigtimes_{i=1}^{n} A_i \]
for (2.2.5) and, in the case that $A_1, \dots, A_n$ are all equal to some set $A$, the notation $A^n$. Finally, we define $\mathbb{R}^1 := \mathbb{R}$.

Fig. 4: Subsets A of the real line and B of the plane and their cross product.

Example 2.2.25.
\[ \{ 1, 2, 3, 5, 8, 13 \} \cup \{ 1, 3, 4, 7, 11, 18 \} = \{ 1, 2, 3, 4, 5, 7, 8, 11, 13, 18 \} , \]
\[ \{ 1, 2, 3, 5, 8, 13 \} \cap \{ 1, 3, 4, 7, 11, 18 \} = \{ 1, 3 \} , \]
\[ \{ 1, 2, 3, 5, 8, 13 \} \setminus \{ 1, 2, 3, 5 \} = \{ 8, 13 \} , \]
\[ \{ 1, 2, 3, 5, 1 \} \setminus \{ 1 \} = \{ 2, 3, 5 \} , \]
\[ \{ 1, 2 \} \times \{ 1, 3, 4 \} = \{ (1, 1), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4) \} . \]

We also define unions and intersections of arbitrary families of sets.

Definition 2.2.26. (Operations on sets, II) Let $I$ be some non-empty set and for every $i \in I$ the corresponding $A_i$ an associated set. Then we define
\[ \bigcup_{i \in I} A_i := \{ x : x \in A_i \text{ for some } i \in I \} , \]
\[ \bigcap_{i \in I} A_i := \{ x : x \in A_i \text{ for all } i \in I \} . \]

Example 2.2.27. Determine
\[ \bigcup_{n \in \mathbb{N}^*} [ 1/n, 1 ] , \quad \bigcap_{n \in \mathbb{N}^*} [ 0, 1/n ] . \]

Solution: By definition,
\[ S_1 := \bigcup_{n \in \mathbb{N}^*} [ 1/n, 1 ] = \{ x : x \in [ 1/n, 1 ] \text{ for some } n \in \mathbb{N}^* \} . \]
Any $x \in \mathbb{R}$ such that $x > 1$ or $x \leq 0$ is not contained in any of the sets $[ 1/n, 1 ]$, $n \in \mathbb{N}^*$, and hence also not contained in their union $S_1$.
On the other hand, if $x \in \mathbb{R}$ is such that $0 < x \leq 1$, then
\[ \frac{1}{n} \leq x \leq 1 \]
if $n \in \mathbb{N}^*$ is such that $n \geq 1/x$. Hence for such $n$, $x \in [ 1/n, 1 ]$ and hence $x \in S_1$. As a consequence,
\[ \bigcup_{n \in \mathbb{N}^*} [ 1/n, 1 ] = (0, 1] . \]
Further, by definition,
\[ S_2 := \bigcap_{n \in \mathbb{N}^*} [ 0, 1/n ] = \{ x : x \in [ 0, 1/n ] \text{ for all } n \in \mathbb{N}^* \} . \]
No $x \in \mathbb{R}$ such that $x < 0$ is contained in any of the $[ 0, 1/n ]$, $n \in \mathbb{N}^*$, and hence also not contained in $S_2$. The number 0 is contained in all of these sets and hence also contained in $S_2$. If $x \in \mathbb{R}$ is such that $x > 0$, then
\[ \frac{1}{n} < x \]
for $n \in \mathbb{N}^*$ such that $n > 1/x$. Hence for such $n$, $x \notin [ 0, 1/n ]$ and therefore $x \notin S_2$. As a consequence,
\[ \bigcap_{n \in \mathbb{N}^*} [ 0, 1/n ] = \{ 0 \} . \]

The naive Definition 2.2.16 of sets leads to paradoxes like that of Zermelo-Russell (1903): Assume that there is a set of all sets that don’t contain themselves as an element:
\[ S := \{ x : x \text{ is a set} \wedge x \notin x \} . \]
Since $S$ is assumed to be a set, either $S \in S$ or $S \notin S$. From the assumption that $S \in S$, it follows by the definition of $S$ that $S \notin S$, a contradiction. Hence it follows that $S \notin S$. From $S \notin S$, it follows by the definition of $S$ that $S \in S$, again a contradiction. Hence there is no such set. Bertrand Russell also used a statement about a barber to illustrate this principle. If a barber cuts the hair of exactly those who do not cut their own hair, does the barber cut his own hair?

So a more restrictive definition of sets is needed to avoid such contradictions. For this, we refer to books on axiomatic set theory. In the following, such paradoxes will not play a role because we don’t use the full generality of Definition 2.2.16. Calculus / analysis naturally deals with a far reduced class of sets which satisfy the more restrictive definition of axiomatic set theory.

Problems

1) For each pair of sets, decide whether or not the following sets are equal:
$A := \{ 2, 3 \}$, $B := \{ 3, 2 \} \cup \phi$, $C := \{ 2, 3 \} \cup \{ \phi \}$, $D := \{ x \in \mathbb{R} : x^2\ x\ 6 = 0 \}$, $E := \{ \phi, 2, 3 \}$, $F := \{ 2, 3, 2 \}$, $G := \{ 2, \phi, \phi, 3 \}$.

2) Simplify
\[ \{ 2, 3 \} \cup \{ \{ 2 \}, \{ 3 \} \} \cup \{ 2, \{ 3 \} \} \cup \{ \{ 2 \}, 3 \} . \]

3) Decide whether
\[ \{ 1, 3 \} \in \{ 1, 3, \{ 1, 7 \}, \{ 1, 3, 7 \} \} . \]
Justify your answer.
4) Let $A := \{ \phi, \{ 1 \}, \{ 1, 3 \}, \{ 3, 4 \} \}$. Determine for each of the following statements whether it is true or false.
a) $1 \in A$ ,
b) $\{ 1 \} \subset A$ ,
c) $\{ 1 \} \in A$ ,
d) $\{ 1, 3 \} \subset A$ ,
e) $\{ \{ 1, 3 \} \} \in A$ ,
f) $\phi \in A$ ,
g) $\phi \subset A$ ,
h) $\{ \phi \} \subset A$ .

5) Give an example of sets $A$, $B$, $C$ such that $A \in B$ and $B \in C$, but $A \notin C$.

6) Sketch the following sets
$A := \{ (x, y) \in \mathbb{R}^2 : x\ y\ 1 = 0 \}$ ,
$B := \{ (x, y) \in \mathbb{R}^2 : 2x\ 3y\ 5 = 0 \}$ ,
$C := \{ (x, y) \in \mathbb{R}^2 : x^2\ y^2 = 1 \}$ ,
D : {(0, 1)} , E : {(1, 1)} , F : {(0, 1)} , G : {(1, 0)} , H : {(2, 3)} , I : {(4, 1)} ,
J : {x ∈ R : 4 ≤ x ≤ 2} , K : {0} , L : {1}
into an $xy$-diagram and calculate $A \cap B$, $A \cap C$, $(A \cap B) \cap C$, $A \cap (B \cap C)$, $B \cap (J \times L)$, $C \cap (J \times K)$, $A \setminus B$, $B \setminus A$, $B \cup E$, $(C \cup F) \cup G$.

7) Let $A$, $B$ and $C$ be sets. Show that
a) If $A \subset B$ and $B \subset C$, then $A \subset C$ ,
b) $A \cup B = B \cup A$ ,
c) $A \cap B = B \cap A$ ,
d) $A \cup (B \cup C) = (A \cup B) \cup C$ ,
e) $A \cap (B \cap C) = (A \cap B) \cap C$ ,
f) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ ,
g) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ ,
h) $C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$ ,
i) $C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$ .

2.2.3 Maps

The development of the concept of a function and its generalization, i.e., the concept of a map (or ‘mapping’), are further major achievements of Western culture that have no counterpart in ancient Greek mathematics. The first concept underwent considerable changes until it reached its current meaning. The principal objects of study of the calculus in the 17th century were geometric objects, in particular curves, but not functions in their current meaning. Also, the variables associated with those objects had a geometrical meaning, like abscissas, ordinates and tangents. The term function first appeared in the works of Leibniz. In particular, he asserts that a tangent is a function of a curve. This only very roughly matches the modern notion of a function. Newton’s method of ‘fluxions’ applies to ‘fluents’, not to functions. For Newton, a curve is generated by a continuous motion of a point he called ‘fluent’ because he thought of it as a flowing quantity.
The ‘fluxion’, or rate at which it flowed, was the point’s velocity. Under the influence of analytic geometry, in the first half of the 18th century, the geometric concept of variables was replaced by the concept of a function as an equation or analytic expression composed of variables and numbers. Admissible analytic expressions were those that involved the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. Subsequently, as a consequence of the study of the solutions of the wave equation in one space dimension (‘the Vibrating-String Problem’), the concept of a function was enlarged to include functions that are piecewise defined on intervals by several analytic expressions and functions (in the sense of curves) drawn ‘free-hand’ and possibly not expressible by any combination of analytic expressions.

The final step in the evolution of the function concept was made by Gustav Lejeune Dirichlet in 1829 [30] in a paper which gave a precise meaning to Fourier’s work from 1822 [41] on heat conduction. In that work, Fourier claimed that ‘any’ function defined over an interval $(-l, l)$ can be represented by his series over this interval. Not only by modern standards, Fourier’s statement and proof were insufficient, but a proof or disproof of that statement presupposed a clear definition of the concept of a function. For Dirichlet, $y$ is a function of a variable $x$, defined on the interval $a < x < b$, if to every value of the variable $x$ in this interval there corresponds a definite value of the variable $y$. Also, it is irrelevant in what way this correspondence is established.
Already in 1887 [31], Dirichlet generalized the concept of a function to that of a mapping:

‘By a mapping of a system $S$ a law is understood, in accordance with which to each determinate element $s$ of $S$ there is associated a determinate object, which is called the image of $s$ and is denoted by $\varphi(s)$; we say too, that $\varphi(s)$ corresponds to the element $s$, that $\varphi(s)$ is caused or generated by the mapping $\varphi$ out of $s$, that $s$ is transformed by the mapping $\varphi$ into $\varphi(s)$.’

This definition practically coincides with the modern definition of maps given below.

Fourier’s claim pinpointed a major weakness in the mathematics of the 18th century. On the one hand, the insufficiency of Fourier’s ‘proof’ was obvious to the mathematical community at the time. On the other hand, the notion of a function was too nebulously defined for it to be convincingly claimed that his result was false. This clearly signaled that those mathematical notions (or the ‘mathematical language’) were too imprecise to deal with such questions and that more precise notions had to be developed. This makes clear the size of Dirichlet’s achievement. He had to solve simultaneously two intertwined problems, namely the giving of a precise mathematical meaning to Fourier’s result and the development of a mathematical framework where this is possible. In particular, it was not clear whether such a thing was possible at all. To this day, such problems are common in mathematics related to applications.

Fig. 5: Points in the set A and their images in the set B under the map f are connected by arrows. Compare Definition 2.2.28.

A careful study of this section, too, is advised to the reader. It introduces the current notion of maps and gives efficient means for their description which will be used throughout the book. Whenever reference is made to a function or to a map in the following, imagining a picture similar to Fig. 5 should be helpful to the reader.
Mathematically, it is possible to identify a map with a set, namely its graph, see Definition 2.2.33. Doing so exclusively is not advisable since the graph often does not provide any visual help, in particular in cases when it is a subset of a space of more than 3 dimensions. The latter is frequently the case in applications. In addition, it often hinders intuition since maps are frequently used to describe transformations. It is more advisable to consider the graph of a map as one of the options to describe or visualize the latter. Indeed, this option will frequently be used in Calculus I and II. Other such options, becoming relevant in Calculus III, are sometimes contour and density maps. With increasing complexity of the considered problems, also in applications, the options for a meaningful visualization of the involved maps rapidly decrease, and an abstract view of maps becomes essential.

Definition 2.2.28. (Maps) Let $A$ and $B$ be non-empty sets.

(i) A map (or mapping) $f$ from $A$ into $B$, denoted by $f : A \to B$, is an association which associates to every element of $A$ a corresponding element of $B$. If $B$ is a subset of the real numbers, we call $f$ a function. We call $A$ the domain of $f$. If $f$ is given, we also use the short notation $D(f)$ for the domain of $f$.

(ii) For every $x \in A$, we call $f(x)$ the value of $f$ at $x$ or the image of $x$ under $f$.

(iii) For any subset $A'$ of $A$, we call the set $f(A')$ containing all the images of its elements under $f$,
\[ f(A') := \{ f(x) : x \in A' \} , \tag{2.2.6} \]
the image of $A'$ under $f$. In particular, we call $f(A)$ the range or image of $f$. If $f$ is given, we also use the short notation
\[ \mathrm{Ran}(f) := f(A) = \{ f(x) : x \in A \} \]
for the range or image of $f$.

(iv) For any subset $B'$ of $B$, we call the subset $f^{-1}(B')$ of $A$ containing all those elements which are mapped into $B'$,
\[ f^{-1}(B') := \{ x \in A : f(x) \in B' \} , \]
the inverse image of $B'$ under $f$.
In particular, if $f$ is a function, we call
$$f^{-1}(\{0\}) = \{ x \in A : f(x) = 0 \}$$
the set of zeros of $f$ or the zero set of $f$.

(v) For any subset $A'$ of $A$, we define the restriction of $f$ to $A'$ as the map $f|_{A'} : A' \to B$ defined by
$$f|_{A'}(x) := f(x)$$
for all $x \in A'$.

Remark 2.2.29. (Variables) We will not introduce a precise notion of 'variables' in the following because it would be redundant. Still, a residue of this historic notion is present in the commonly used characterization of functions as functions of one variable, several variables or $n$ variables, where $n \in \mathbb{N}$ is such that $n \geq 2$. Also in this text, we will refer to a function whose domain is a subset of $\mathbb{R}$ as a function of one variable and to a function whose domain is a subset of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \geq 2$, as a function of several variables or a function of $n$ variables.

Remark 2.2.30. In the following, we make the general assumption of basic knowledge of integer powers and $n$-th roots, where $n \in \mathbb{N}^*$, as well as of the functions
$$\sin : \mathbb{R} \to \mathbb{R} , \quad \arcsin : [-1, 1] \to [-\pi/2, \pi/2] , \quad \cos : \mathbb{R} \to \mathbb{R} , \quad \arccos : [-1, 1] \to [0, \pi] ,$$
$$\tan : (-\pi/2, \pi/2) \to \mathbb{R} , \quad \arctan : \mathbb{R} \to (-\pi/2, \pi/2) , \quad \exp : \mathbb{R} \to \mathbb{R} , \quad \ln : (0, \infty) \to \mathbb{R}$$
as provided by high school mathematics. Still, we give definitions of some of these functions later on to exemplify methods of calculus.

Example 2.2.31. Define $f : \mathbb{Z} \to \mathbb{Z}$ by $f(n) := n^2$ for all $n \in \mathbb{Z}$. Moreover, let $g$ be the restriction of $f$ to $\mathbb{N}$. Calculate
$$f(\mathbb{Z}) , \quad f(\{-2, -1, 0, 1, 2\}) , \quad f^{-1}(\{-1, 0, 1\}) , \quad f^{-1}(\{6\}) , \quad g^{-1}(\{-1, 0, 1\}) .$$
Solution:
$$f(\mathbb{Z}) = \{ n^2 : n \in \mathbb{N} \} , \quad f(\{-2, -1, 0, 1, 2\}) = \{0, 1, 4\} , \quad f^{-1}(\{-1, 0, 1\}) = \{-1, 0, 1\} ,$$
$$f^{-1}(\{6\}) = \emptyset , \quad g^{-1}(\{-1, 0, 1\}) = \{0, 1\} .$$

Example 2.2.32. Define $f : D_f \to \mathbb{R}$ and $g : D_g \to \mathbb{R}$ such that

(a) $f(x) = \sqrt{x + 2}$ for all $x \in D_f$,

(b) $g(x) = 1/(x^2 - x)$ for all $x \in D_g$,

and such that $D_f$ and $D_g$ are maximal. Find the domains $D_f$ and $D_g$. Give explanations.

Solution: In case (a), the inequality $x + 2 \geq 0$ $(\Leftrightarrow x \geq -2)$ has to be satisfied in order that the square root is defined. Hence $D_f := \{ x \in \mathbb{R} : x \geq -2 \}$
and $f : D_f \to \mathbb{R}$ is defined by $f(x) := \sqrt{x + 2}$ for all $x \in D_f$. In case (b), the denominator has to be different from zero in order that the quotient is defined. Because of
$$x^2 - x = x(x - 1) = 0 \Leftrightarrow x \in \{0, 1\} ,$$
we conclude that $D_g := \{ x \in \mathbb{R} : x \neq 0 \wedge x \neq 1 \}$ and that $g : D_g \to \mathbb{R}$ is defined by $g(x) := 1/(x^2 - x)$ for all $x \in D_g$.

Definition 2.2.33. (Graph of a map) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. Then we define the graph of $f$ by
$$G(f) := \{ (x, f(x)) \in A \times B : x \in A \} .$$

Example 2.2.34. Sketch the graphs of the functions $f$ and $g$ from Example 2.2.32. Solution: See Fig. 6 and Fig. 7.

Example 2.2.35. Find the ranges of the functions in Example 2.2.32.

Solution: Since the square root assumes only non-negative values, we conclude that $f(D_f) \subset \{ y : y \geq 0 \}$. Further, for every $y \in [0, \infty)$, it follows that
$$\sqrt{(y^2 - 2) + 2} = y$$
and hence that $\{ y : y \geq 0 \} \subset f(D_f)$ and, finally, that $f(D_f) = \{ y : y \geq 0 \}$. Further, for $x < 0$ or $x > 1$, it follows that $x(x - 1) > 0$ and hence that $g(x) > 0$. For $0 < x < 1$, it follows that
$$-\frac{1}{4} \leq \left( x - \frac{1}{2} \right)^2 - \frac{1}{4} = x(x - 1) < 0$$
and hence that $g(x) \leq -4$. Hence it follows that
$$g(D_g) \subset \{ y : y > 0 \} \cup \{ y : y \leq -4 \} .$$
Finally, for any real $y$ such that $(y > 0) \vee (y \leq -4)$, it follows that
$$g\left( \frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}} \right) = y$$
and hence that $g(D_g) = \{ y : y > 0 \} \cup \{ y : y \leq -4 \}$.

A map is called injective (or one-to-one) if no two points from its domain are mapped onto the same point. A map into a set $B$ is called surjective (or onto) if every element from $B$ is the image of some element from its domain. Finally, a map is called bijective (or one-to-one and onto) if it is injective and surjective.

Definition 2.2.36. (Injectivity, surjectivity, bijectivity) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. We define

(i) $f$ is injective (or one-to-one) if different elements of $A$ are mapped into different elements of $B$, or equivalently if
$$f(x) = f(y) \Rightarrow x = y$$
for all $x, y \in A$. In this case, we define the inverse map $f^{-1}$ as the map from $f(A)$ into $A$ which associates to every $y \in f(A)$ the element $x \in A$ such that $f(x) = y$.
(ii) $f$ is surjective (or onto) if every element of $B$ is the image of some element(s) of $A$:
$$f(A) = B .$$

(iii) $f$ is bijective (or one-to-one and onto) if it is both injective and surjective. In this case, the domain of the inverse map is the whole of $B$.

Fig. 6: $G(f)$ from Example 2.2.32.

Fig. 7: $G(g)$ from Example 2.2.32.

Example 2.2.37. Let $f$ and $g$ be as in Example 2.2.31. In addition, define $h : \mathbb{Z} \to \mathbb{Z}$ by $h(n) := n + 1$ for all $n \in \mathbb{Z}$. Decide whether $f$, $g$ and $h$ are injective, surjective or bijective. If existent, give the corresponding inverse function(s).

Solution: $f$ is neither injective (and hence also not bijective) nor surjective, for instance, because of
$$f(-1) = f(1) = 1 , \quad 2 \notin f(\mathbb{Z}) .$$
$g$ is injective because if $m$ and $n$ are some natural numbers such that $g(m) = g(n)$, then it follows that
$$0 = m^2 - n^2 = (m - n)(m + n)$$
and hence that $m = n \vee m = -n$ and therefore, since $g$ has the natural numbers as its domain, that $m = n$. The inverse $g^{-1} : g(\mathbb{N}) \to \mathbb{N}$ is given by $g^{-1}(l) = \sqrt{l}$ for all $l \in g(\mathbb{N})$. $g$ is not surjective (and hence also not bijective), for instance, since $2 \notin g(\mathbb{N})$. $h$ is injective because if $m$ and $n$ are some integers such that $h(m) = h(n)$, then it follows that
$$0 = m + 1 - (n + 1) = m - n$$
and hence that $m = n$. $h$ is surjective (and hence as a whole bijective) because for any integer $n$ we have $h(n - 1) = n$. The inverse function $h^{-1} : \mathbb{Z} \to \mathbb{Z}$ is given by $h^{-1}(n) = n - 1$ for all $n \in \mathbb{Z}$.

The following theorem characterizes the injectivity, surjectivity and bijectivity of a map in terms of its graph. In the special case of functions defined on subsets of the real numbers, the theorem can be stated as follows. Such a function is injective if and only if every parallel to the x-axis intersects its graph in at most one point.
If such a function maps into the set $B$, then it is surjective, bijective, respectively, if and only if every parallel to the x-axis through a point from $B$ (regarded as a subset of the y-axis) intersects its graph in at least one point and in precisely one point, respectively.

Theorem 2.2.38. Let $A$ and $B$ be sets and $f : A \to B$ be a map. Further, define for every $y \in B$ the corresponding intersection $G_{fy}$ by
$$G_{fy} := G(f) \cap \{ (x, y) : x \in A \} .$$
Then

(i) $f$ is injective if and only if $G_{fy}$ contains at most one point for all $y \in B$.

(ii) $f$ is surjective if and only if $G_{fy}$ is non-empty for all $y \in B$.

(iii) $f$ is bijective if and only if $G_{fy}$ contains exactly one point for all $y \in B$.

Proof. (i) The proof is indirect. Assume that there is $y \in B$ such that $G_{fy}$ contains two different points $(x_1, y)$ and $(x_2, y)$. Then, since $G_{fy}$ is part of $G(f)$, it follows that $y = f(x_1) = f(x_2)$ and hence, since by assumption $x_1 \neq x_2$, that $f$ is not injective. Conversely, assume that $f$ is not injective. Then there are different $x_1, x_2 \in A$ such that $f(x_1) = f(x_2)$. Hence $G_{f f(x_1)}$ contains the two different points $(x_1, f(x_1))$ and $(x_2, f(x_1))$. (ii) If $f$ is surjective, then for any $y \in B$ there is some $x \in A$ such that $y = f(x)$ and hence $(x, y) \in G_{fy}$. On the other hand, if $G_{fy}$ is non-empty for all $y \in B$, then for every $y \in B$ there is some $x \in A$ such that $(x, y) \in G_{fy}$ and hence, since $G_{fy}$ is part of $G(f)$, such that $y = f(x)$. Hence $f$ is surjective. (iii) is an obvious consequence of (i) and (ii).

Fig. 8: $G(f)$ from Example 2.2.32 and parallels to the x-axis.

Fig. 9: $G(g)$ from Example 2.2.32 and parallels to the x-axis.

Example 2.2.39. Apply Theorem 2.2.38 to investigate the injectivity of $f$ and $g$ from Example 2.2.32. Solution: Fig. 8 and Fig. 9 suggest that $f$ is injective, but not surjective, and that $g$ is neither injective nor surjective.

Example 2.2.40.
Show that $f$ and the restriction of $g$ to
$$\{ x : x \geq 1/2 \wedge x \neq 1 \} ,$$
where $f$ and $g$ are from Example 2.2.32, are injective and calculate their inverses.

Solution: If $x_1, x_2$ are any real numbers $\geq -2$ such that $f(x_1) = f(x_2)$, then
$$\sqrt{x_1 + 2} = \sqrt{x_2 + 2}$$
and hence $x_1 + 2 = x_2 + 2$ and $x_1 = x_2$. Hence $f$ is injective. Further, for every $y$ in the range of $f$ there is $x \geq -2$ such that
$$y = \sqrt{x + 2}$$
and hence $x = y^2 - 2$. Therefore $f^{-1}(y) = y^2 - 2$ for all $y$ from the range of $f$. Further, if $x_1$ and $x_2$ are some real numbers $\geq 1/2$ different from $1$ and such that
$$\frac{1}{x_1^2 - x_1} = \frac{1}{x_2^2 - x_2} ,$$
then
$$x_1^2 - x_1 - (x_2^2 - x_2) = (x_1 - x_2)(x_1 + x_2 - 1) = 0$$
and hence $x_1 = x_2$, since $x_1 + x_2 - 1 = 0$ is possible only for $x_1 = x_2 = 1/2$. Hence the restriction of $g$ to $\{ x : x \geq 1/2 \wedge x \neq 1 \}$ is injective. Finally, if $y$ is some real number in the range of this restriction, then $y$ is in particular different from zero and
$$y = \frac{1}{x^2 - x} ,$$
hence
$$\left( x - \frac{1}{2} \right)^2 = \frac{1}{y} + \frac{1}{4} \quad \text{and} \quad x = \frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}} .$$
Therefore the inverse of that restriction of $g$ associates the value
$$\frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}}$$
to every $y$ from the range of that restriction of $g$.

The next definition introduces the composition of maps, which corresponds to the application of maps in sequence.

Definition 2.2.41. (Composition) Let $A$, $B$, $C$ and $D$ be sets. Further, let $f : A \to B$ and $g : C \to D$ be maps. We define the composition $g \circ f : f^{-1}(B \cap C) \to D$ (read: '$g$ after $f$') by
$$(g \circ f)(x) := g(f(x))$$
for all $x \in f^{-1}(B \cap C)$. Note that $g \circ f$ is trivial, i.e., with an empty domain, for instance, if $B \cap C = \emptyset$. Also note that $f^{-1}(B \cap C) = A$ if $B \subset C$.

Example 2.2.42. Calculate $f \circ f$, $h \circ h$, $f \circ h$ and $h \circ f$ where $f$, $h$ are defined as in Example 2.2.31, Example 2.2.37, respectively.

Solution: Obviously, all these maps map $\mathbb{Z}$ into itself. Moreover, for every $n \in \mathbb{Z}$:
$$(f \circ f)(n) = f(f(n)) = f(n^2) = (n^2)^2 = n^4 ,$$
$$(h \circ h)(n) = h(h(n)) = h(n + 1) = (n + 1) + 1 = n + 2 ,$$
$$(h \circ f)(n) = h(f(n)) = h(n^2) = n^2 + 1 ,$$
$$(f \circ h)(n) = f(h(n)) = f(n + 1) = (n + 1)^2 = n^2 + 2n + 1 .$$
Note in particular that $h \circ f \neq f \circ h$.

Example 2.2.43. Let $A$ and $B$ be sets. Moreover, let $f : A \to B$ be some injective map. Calculate $f^{-1} \circ f$. Assume that $f$ is also surjective (and hence as a whole bijective) and calculate also $f \circ f^{-1}$ for this case.
Solution: To every $y \in f(A)$, the map $f^{-1}$ associates the corresponding $x \in A$ which satisfies $f(x) = y$. In particular, it associates to $f(x)$ the element $x$ for all $x \in A$. Hence
$$f^{-1} \circ f = \mathrm{id}_A , \quad f \circ f^{-1} = \mathrm{id}_{f(A)}$$
where for every set $C$ the corresponding map $\mathrm{id}_C : C \to C$ is defined by $\mathrm{id}_C(x) := x$ for all $x \in C$. Further, if $f$ is bijective, then $f(A) = B$ and hence $f \circ f^{-1} = \mathrm{id}_B$.

The following theorem gives a relation between the graph of an injective map and the graph of its inverse. In the special case of functions defined on subsets of the real numbers, the theorem characterizes the graph of the inverse of such a function as the reflection of the graph of that function about the line $\{ (x, x) \in \mathbb{R}^2 : x \in \mathbb{R} \}$.

Theorem 2.2.44. (Graphs of inverses of maps) Let $A$ and $B$ be sets and $f : A \to B$ be an injective map. Moreover, define $R : A \times B \to B \times A$ by $R(x, y) := (y, x)$ for all $x \in A$ and $y \in B$. Then the graph of the inverse map is given by
$$G(f^{-1}) = R(G(f)) .$$

Proof. '$\subset$': Let $(y, f^{-1}(y))$ be an element of $G(f^{-1})$. Then $y \in f(A)$ and $f^{-1}(y) \in A$ is such that $f(f^{-1}(y)) = y$. Therefore $(f^{-1}(y), y) \in G(f)$ and
$$(y, f^{-1}(y)) = R(f^{-1}(y), y) \in R(G(f)) .$$
'$\supset$': Let $(f(x), x)$ be some element of $R(G(f))$. Then $f^{-1}(f(x)) = x$ and hence
$$(f(x), x) = (f(x), f^{-1}(f(x))) \in G(f^{-1}) .$$

Fig. 10: $G(f)$, $G(f^{-1})$ from Example 2.2.32 and the reflection axis.

Example 2.2.45. Apply Theorem 2.2.44 to the graph of the function $f$ from Example 2.2.32 to draw the graph of its inverse. (See Example 2.2.40.) Solution: See Fig. 10.

Fig. 11: Subsets of $\mathbb{R}^2$. Which is the graph of a function?

Problems

1) Find
$$f([0, \pi/2]) , \quad f^{-1}(\{1\}) , \quad f^{-1}(\{3\}) , \quad f^{-1}([0, 2]) .$$
In addition, find the maximal domain $D \subset \mathbb{R}$ that contains the point $\pi/8$ and is such that $f|_D$ is injective. Finally, calculate the inverse of the map $h : D \to f(D)$ defined by $h(x) := f(x)$ for all $x \in D$.
a) f pxq : 2 sinp3xq , x P R , b) f pxq : 3 cosp2xq , x P R , c) f pxq : tanpx{2q{3 , x P tx P pp2k 1qπ, p2k 1qπ q : k P Zu . 1 for x P R 2) Define f : R Ñ R and g : R zt1u Ñ R by f pxq : x and g pxq : px 1q2 {px 1q for x P R zt1u. Is f g? 3) Let f : Df Ñ R be defined such that the given equation below is satisfied for all x P Df and such that Df R is a maximal. In each of the cases, find the corresponding Df , the range of f , and draw the graph of f : a) b) c) d) e) f) g) h) f pxq x2 3 , ? f pxq 1{ x , f pxq 1{p1 xq , f pxq x2 |x| , f pxq x{|x| , f pxq |x|1{3 , f pxq |x2 1| , a f pxq sinpxq . 4) Which of the subsets of R2 in Fig. 11 is the graph of a function? Give reasons. 5) Find the function whose graph is given by a) b) c) px, yq P R2 : x2 y x px, yq P R2 : x y{py px, yq P R2 : y2 6xy ( 10 , ( 1q , 9x2 ( 0 . 6) In each of the following cases, find a bijective function that has domain D and range R and calculate its inverse. 57 tx P R : 1 ¤ x ¤ 2u, R tx P R : 3 ¤ x ¤ 7u , tx P R : 1 ¤ x ¤ 1u, R tx P R : x ¥ 3u . Define f : Df Ñ R and g : Dg Ñ R such that a x1 f pxq : x2 9 , g pxq : 2 x3 for all x P Df , x P Dg , respectively, and such Df and Dg are a) D b) D 7) a) maximal. Find the domains and ranges of the functions f and g. Give explanations. b) If possible, calculate pf g qp5q and pg f qp5q. Give explanations. c) f is injective (= ‘one to one’). Calculate its inverse. 8) Is there a function which is identical to its inverse? Is there more then one such function? 9) Define f : R Ñ R, g : R Ñ R and h : R Ñ R by f pxq : 1 x , g pxq : 1 x x2 , hpxq : 1 x for every x P R. Calculate 10) pf f qpxq , pf gqpxq , pg f qpxq , pg gqpxq , pf hqpxq , ph f qpxq , pg hqpxq , ph gqpxq , ph hqpxq , rf pg hqspxq , rpf gq hspxq for every x P R. Define f : R Ñ R, g : R Ñ R and h : tx P R : x ¡ 0u Ñ R by f pxq : x a , g pxq : ax , hpxq : xa for every x in the corresponding domain where a P R. 
For each of these functions and every $n \in \mathbb{N}^*$, determine the $n$-fold composition with itself.

11) Define $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := \left[ 1 + (2 + x)^{1/3} \right]^{1/7} , \quad g(x) := \cos(2x)$$
for every $x \in \mathbb{R}$. Express $f$ and $g$ as a composition of four functions, none of which is the identity function. In addition, in the case of $g$, the sine function should be among those functions.

12) Let $A$ and $B$ be sets, $f : A \to B$ and $B_1$, $B_2$ be subsets of $B$. Show that
$$f^{-1}(B_1 \cup B_2) = f^{-1}(B_1) \cup f^{-1}(B_2) , \quad f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2) .$$

13) Express the area of an equilateral triangle as a function of the length of a side.

14) Express the surface area of a sphere of radius $r > 0$ as a function of its volume.

15) Consider a circle $S^1_r$ of radius $r > 0$ around the origin of an xy-diagram. Express the length of its intersections with parallels to the y-axis as a function of their distance from the y-axis. Determine the domain and range of that function.

16) From each corner of a rectangular cardboard of side lengths $a > 0$ and $b > 0$, a square of side length $x \geq 0$ is removed, and the edges are turned up to form an open box. Express the volume of the box as a function of $x$ and determine the domain of that function.

17) Consider a body in the earth's gravitational field which is at rest at time $t = 0$ and at height $s_0 > 0$ above the surface. Its height $s$ and speed $v$ as a function of time $t$ are given by
$$s(t) = s_0 - \frac{1}{2} g t^2 , \quad v(t) = g t$$
where $g$ is approximately $9.81\,\mathrm{m/s^2}$. Determine the domain and range of the functions $s$ and $v$. In addition, express $s$ as a function of the speed and determine domain and range.

Fig. 12: Hexagons inscribed in and circumscribed about the unit circle.

2.3 Limits and Continuous Functions

2.3.1 Limits of Sequences of Real Numbers

For motivation of infinite processes, we consider one of their early examples, namely Archimedes' measurement of the circle. Archimedes considered regular polygons of $6, 12, 24, \dots$
sides inscribed in and circumscribed about the unit circle in order to achieve rational estimates of its circumference of increasing accuracy. Since trigonometric functions were not known at his time, in contrast to the reasoning below, he used elementary geometric methods to derive the relation (2.3.1) below. Such a derivation is given as an exercise. See Problem 6 below.

For every $n = 6, 12, 24, \dots$, we define a corresponding $s_n$ as the circumference of the inscribed regular polygon of $n$ sides. Since geometric intuition suggests that the shortest connection of two points in the plane is a straight line, we expect $s_n$ to give a lower bound of the circumference of the unit circle, i.e., of $2\pi$. For the same reason, we expect, see Fig. 13, that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing. The proof of this is given as an exercise. See Problem 7 below. In particular,
$$s_n = n \, l_n$$
where $l_n$ is the length of the side of the polygon.

Fig. 13: Illustration of Archimedes' measurement of the circle. The dots in the corners C and D indicate right angles.

From Fig. 13, we conclude that
$$\frac{l_n}{2} = \sin\left( \frac{\pi}{n} \right) , \quad \frac{l_{2n}}{2} = \sin\left( \frac{\pi}{2n} \right) .$$
Further, it follows that
$$\sin\left( \frac{\pi}{n} \right) = \sin\left( 2 \cdot \frac{\pi}{2n} \right) = 2 \sin\left( \frac{\pi}{2n} \right) \cos\left( \frac{\pi}{2n} \right) = 2 \sin\left( \frac{\pi}{2n} \right) \sqrt{1 - \sin^2\left( \frac{\pi}{2n} \right)}$$
and hence that
$$\sin^4\left( \frac{\pi}{2n} \right) - \sin^2\left( \frac{\pi}{2n} \right) + \frac{1}{4} \sin^2\left( \frac{\pi}{n} \right) = 0 .$$
The last implies that
$$\sin^2\left( \frac{\pi}{2n} \right) = \frac{1}{2} \left[ 1 - \sqrt{1 - \sin^2\left( \frac{\pi}{n} \right)} \right]$$
and hence that
$$l_{2n}^2 = 4 \sin^2\left( \frac{\pi}{2n} \right) = 2 \left[ 1 - \sqrt{1 - \frac{l_n^2}{4}} \right] = \frac{l_n^2 / 2}{1 + \sqrt{1 - \frac{1}{4} l_n^2}} .$$
Finally, we arrive at the recursion relation
$$l_{2n}^2 = \frac{l_n^2}{2 + \sqrt{4 - l_n^2}} \qquad (2.3.1)$$
which Archimedes used to obtain the length of the sides of the $2n$-gon from that of the $n$-gon. He started from $l_6 = 1$ to obtain
$$l_{12}^2 = \frac{1}{2 + \sqrt{3}} = 2 - \sqrt{3} .$$
In the next step, he used the approximation
$$\sqrt{3} < \frac{1351}{780}$$
to obtain a lower bound for $s_{12}$. Continuing in this fashion up to the 96-gon, he arrived at the approximation
$$s_{96} \approx 6 \, \frac{20}{71}$$
which gives the circumference of the circle, i.e., $2\pi$, within an error of $2 \cdot 10^{-3}$. Note that far better approximations to $2\pi$ were already known to the ancient Babylonians.
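The recursion (2.3.1) is easy to iterate numerically. The following Python lines are a minimal sketch, not part of Archimedes' argument: the function name `archimedes_lower_bounds` is chosen here for illustration only, and floating point square roots replace his rational estimates. Starting from $l_6 = 1$, the sketch produces $s_n = n \, l_n$ for $n = 6, 12, 24, 48, 96$.

```python
import math

def archimedes_lower_bounds(doublings):
    """Return the pairs (n, s_n) for n = 6, 12, 24, ... obtained from (2.3.1).

    l_sq holds l_n^2, the squared side length of the inscribed regular n-gon.
    """
    n = 6
    l_sq = 1.0                      # l_6 = 1 for the hexagon in the unit circle
    bounds = [(n, n * math.sqrt(l_sq))]
    for _ in range(doublings):
        # recursion (2.3.1): l_{2n}^2 = l_n^2 / (2 + sqrt(4 - l_n^2))
        l_sq = l_sq / (2.0 + math.sqrt(4.0 - l_sq))
        n *= 2
        bounds.append((n, n * math.sqrt(l_sq)))
    return bounds

bounds = archimedes_lower_bounds(4)   # up to the 96-gon
for n, s in bounds:
    print(n, s, 2.0 * math.pi - s)    # polygon, circumference, remaining gap to 2*pi
```

Each printed circumference stays below $2\pi$, and the gap shrinks with every doubling, in line with the expectation that $s_6, s_{12}, s_{24}, \dots$ is increasing.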
More important is the fact that this method could be used to calculate $2\pi$ to arbitrary precision, i.e., within an error less than an arbitrarily small preassigned error bound $\varepsilon > 0$. Given such an error bound $\varepsilon > 0$, and taking into account that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing, we expect that there is some corresponding natural number $N$ such that
$$2\pi - s_{2n} < \varepsilon$$
for all natural numbers $n$ such that $n \geq N$. Indeed, this expectation turns out to be correct later. Since
$$2\pi - s_{2n} = |s_{2n} - 2\pi|$$
for all $n \in \mathbb{N}$, $n \geq 6$, we note that our expectation is equivalent to the statement that for every arbitrary preassigned error bound $\varepsilon > 0$, there is some corresponding natural number $N$ such that
$$|s_{2n} - 2\pi| < \varepsilon$$
for all natural numbers $n$ such that $n \geq N$. The last is also used to define the limit of a sequence of real numbers in general.

Fig. 14: Dodecagon inscribed in a unit circle.

Definition 2.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$ and $x \in \mathbb{R}$. Then we define
$$\lim_{n \to \infty} x_n = x$$
if for every $\varepsilon > 0$, there is a corresponding $n_0$ such that for all $n \geq n_0$
$$|x_n - x| < \varepsilon ,$$
i.e., from the $n_0$-th member on, all remaining members of the sequence are within a distance from $x$ which is less than $\varepsilon$. (Footnote 1: As a consequence, only finitely many members have distance $\geq \varepsilon$ from $x$.) In this case, we say that the sequence $x_1, x_2, \dots$ is convergent to $x$. Note that this implies that for every $\varepsilon > 0$
$$|x_n| = |x_n - x + x| \leq |x_n - x| + |x| \leq \varepsilon + |x|$$
for all $n \in \mathbb{N}^*$, apart from finitely many members of the sequence, and hence that $x_1, x_2, \dots$ is bounded, i.e., that there is $M \geq 0$ such that $|x_n| \leq M$ for all $n \in \mathbb{N}^*$. If the sequence is not convergent to any real number, we call the sequence divergent.

Fig. 15: $(n, (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.

Example 2.3.2. Let $a$ be some real number and $x_n := a$ for all $n \in \mathbb{N}^*$. Then
$$\lim_{n \to \infty} x_n = a .$$
Indeed, if $\varepsilon > 0$ is given, then
$$|x_n - a| = |a - a| = 0 < \varepsilon$$
for all $n \in \mathbb{N}^*$. Hence we can choose $N = 1$.
Note that in this simple case, the chosen $N$ works for every $\varepsilon > 0$. In general, this will be impossible.

Fig. 16: $(n, (-1)^n (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.

Fig. 17: $(n, (n^2 + 1)/n)$ for $n = 1$ to $n = 50$ and an asymptote.

Example 2.3.3. Investigate whether the following limits exist.

(i) $$\lim_{n \to \infty} \frac{n + 1}{n} , \qquad (2.3.2)$$

(ii) $$\lim_{n \to \infty} (-1)^n \, \frac{n + 1}{n} , \qquad (2.3.3)$$

(iii) $$\lim_{n \to \infty} \frac{n^2 + 1}{n} . \qquad (2.3.4)$$

Solution: Fig. 15, Fig. 16 and Fig. 17 suggest that the limit (2.3.2) is $1$, whereas the limits (2.3.3), (2.3.4) don't exist. Indeed,
$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 . \qquad (2.3.5)$$
For the proof, let $\varepsilon$ be some real number $> 0$. Further, let $n_0$ be some natural number $> 1/\varepsilon$. Then it follows for every $n \in \mathbb{N}^*$ such that $n \geq n_0$:
$$\left| \frac{n + 1}{n} - 1 \right| = \frac{1}{n} \leq \frac{1}{n_0} < \varepsilon$$
and hence the statement (2.3.5).

The proof that (2.3.3) does not exist proceeds indirectly. Assume on the contrary that there is some $x \in \mathbb{R}$ such that
$$\lim_{n \to \infty} (-1)^n \, \frac{n + 1}{n} = x .$$
Then there is some $n_0 \in \mathbb{N}^*$ such that
$$\left| (-1)^n \, \frac{n + 1}{n} - x \right| < \frac{1}{4}$$
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Without restriction of generality, we can assume that $n_0 \geq 4$. Then it follows for any even $n \in \mathbb{N}^*$ such that $n \geq n_0$:
$$|x - 1| = \left| \frac{n + 1}{n} - x - \frac{1}{n} \right| \leq \left| \frac{n + 1}{n} - x \right| + \frac{1}{n} \leq \frac{1}{4} + \frac{1}{n_0} \leq \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$$
and for any odd $n \in \mathbb{N}^*$ such that $n \geq n_0$:
$$|x + 1| = \left| -\frac{n + 1}{n} - x + \frac{1}{n} \right| \leq \left| -\frac{n + 1}{n} - x \right| + \frac{1}{n} \leq \frac{1}{4} + \frac{1}{n_0} \leq \frac{1}{4} + \frac{1}{4} = \frac{1}{2} ,$$
and hence we arrive at the contradiction that
$$2 = |x + 1 - (x - 1)| \leq |x + 1| + |x - 1| \leq \frac{1}{2} + \frac{1}{2} = 1 .$$
Hence our assumption that (2.3.3) exists is false.

The proof that (2.3.4) does not exist proceeds indirectly, too. Assume on the contrary that there is some $x \in \mathbb{R}$ such that
$$\lim_{n \to \infty} \frac{n^2 + 1}{n} = x .$$
Further, let $\varepsilon$ be some real number $> 0$. Finally, let $n_0$ be some natural number $\geq |x| + \varepsilon$. Then it follows for $n \geq n_0$ that
$$\left| \frac{n^2 + 1}{n} - x \right| = \left| n - x + \frac{1}{n} \right| = n - x + \frac{1}{n} > n - x \geq |x| + \varepsilon - x \geq \varepsilon .$$
Hence there is an infinite number of members of the sequence that have a distance from $x$ which is greater than $\varepsilon$. This contradicts the existence of a limit of (2.3.4). Hence such a limit does not exist.
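The three sequences of Example 2.3.3 can also be inspected numerically. The following Python lines are a minimal sketch under the assumption that floating point evaluation is accurate enough for the chosen range of $n$. The names `first_index`, `x`, `y` and `z` are introduced here for illustration only, and a finite check of this kind supports the proofs above without replacing them.

```python
import math

def first_index(eps):
    """Smallest natural number n0 with n0 > 1/eps, as chosen in the proof of (2.3.5)."""
    return math.floor(1.0 / eps) + 1

def x(n):
    """Member of the sequence in (2.3.2)."""
    return (n + 1) / n

def y(n):
    """Member of the sequence in (2.3.3)."""
    return (-1) ** n * (n + 1) / n

def z(n):
    """Member of the sequence in (2.3.4)."""
    return (n * n + 1) / n

eps = 0.01
n0 = first_index(eps)

# From n0 on, the members of (2.3.2) stay within eps of 1 (checked on a finite range).
tail_ok = all(abs(x(n) - 1.0) < eps for n in range(n0, n0 + 10_000))

# Neighbouring members of (2.3.3) stay more than 2 apart, so they cannot
# all be close to one and the same number.
gap = abs(y(1000) - y(1001))

print(n0, tail_ok, gap, z(1000))
```

The value `z(1000)` exceeds $1000$, which reflects the estimate $(n^2 + 1)/n > n$ behind the divergence of (2.3.4).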
The alert reader might have noticed that Definition 2.3.1 would be inconsistent, and would have to be abandoned, if it turned out that some sequence has more than one limit. Part (i) of the following Theorem 2.3.4 says that this is impossible. In particular, this theorem says that a sequence in $\mathbb{R}$ can have at most one limit (in part (i)), that the sequence consisting of the sums of the members of convergent sequences in $\mathbb{R}$ converges to the sum of their limits (in part (ii)), that the sequence consisting of the products of the members of convergent sequences in $\mathbb{R}$ converges to the product of their limits (in part (iii)) and that the sequence consisting of the inverses of the members of a sequence convergent to a non-zero real number converges to the inverse of that number (in part (iv)).

Theorem 2.3.4. (Limit Laws) Let $x_1, x_2, \dots$; $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ and $x, \bar{x}, y \in \mathbb{R}$.

(i) If
$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} x_n = \bar{x} ,$$
then $\bar{x} = x$.

(ii) If
$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} y_n = y ,$$
then
$$\lim_{n \to \infty} (x_n + y_n) = x + y .$$

(iii) If
$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} y_n = y ,$$
then
$$\lim_{n \to \infty} x_n y_n = x y .$$

(iv) If
$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad x \neq 0 ,$$
then
$$\lim_{n \to \infty} \frac{1}{x_n} = \frac{1}{x} .$$

Proof. '(i)': The proof is indirect. Assume that the assumption in (i) is true and that $x \neq \bar{x}$. Then there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ satisfying $n \geq n_0$:
$$|x_n - x| < \frac{1}{2} |\bar{x} - x| \quad \text{and} \quad |x_n - \bar{x}| < \frac{1}{2} |\bar{x} - x| .$$
Hence it follows the contradiction that
$$|\bar{x} - x| = |\bar{x} - x_n + x_n - x| \leq |\bar{x} - x_n| + |x_n - x| < |\bar{x} - x| .$$
Hence it follows that $\bar{x} = x$.

'(ii)': Assume that the assumption in (ii) is true. Further, let $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ with $n \geq n_0$:
$$|x_n - x| < \frac{\varepsilon}{2} \quad \text{and} \quad |y_n - y| < \frac{\varepsilon}{2}$$
and hence
$$|x_n + y_n - (x + y)| \leq |x_n - x| + |y_n - y| < \varepsilon .$$

'(iii)': Assume that the assumption in (iii) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta (\delta + |x| + |y|) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ with $n \geq n_0$:
$$|x_n - x| < \frac{\delta}{2} \quad \text{and} \quad |y_n - y| < \frac{\delta}{2} .$$
Then
$$|x_n y_n - x y| = |x_n y_n - x_n y + x_n y - x y| \leq |x_n| \, |y_n - y| + |x_n - x| \, |y| \leq |x_n - x| \, |y_n - y| + |x| \, |y_n - y| + |x_n - x| \, |y| < \varepsilon .$$

'(iv)': Assume that the assumption in (iv) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta < |x|$ and $\delta / (|x| (|x| - \delta)) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ satisfying $n \geq n_0$:
$$\big| \, |x_n| - |x| \, \big| \leq |x_n - x| < \delta ,$$
and hence also
$$|x_n| > |x| - \delta > 0$$
and
$$\left| \frac{1}{x_n} - \frac{1}{x} \right| = \frac{|x_n - x|}{|x_n| \, |x|} < \frac{\delta}{(|x| - \delta) \, |x|} < \varepsilon .$$

Remark 2.3.5. The previous theorem is of fundamental importance in the investigation of sequences. Usually, it is applied as follows. First, a given sequence of real numbers is decomposed into combinations of sums, products and quotients of sequences whose convergence is already known. Then the application of the theorem proves the convergence of the sequence and allows the calculation of its limit if the limits of those constituents are known.

Example 2.3.6. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n}$$
for all $n \in \mathbb{N}^*$.

Solution: In Example 2.3.3, we proved that
$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 .$$
Since
$$\frac{1}{n} = \frac{n + 1}{n} + (-1)$$
for every $n \in \mathbb{N}^*$, it follows by Theorem 2.3.4 and Example 2.3.2 the existence of
$$\lim_{n \to \infty} \frac{1}{n}$$
and that
$$\lim_{n \to \infty} \frac{1}{n} = \lim_{n \to \infty} \frac{n + 1}{n} + \lim_{n \to \infty} (-1) = 1 + (-1) = 0 .$$

Example 2.3.7. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n + a}$$
for all $n \in \mathbb{N}^*$ and $a \geq 0$.

Solution: First, we notice that
$$x_n = \frac{1}{n + a} = \frac{1}{1 + \frac{a}{n}} \cdot \frac{1}{n} \qquad (2.3.6)$$
for every $n \in \mathbb{N}^*$. Further, by Theorem 2.3.4, Example 2.3.2 and Example 2.3.6, it follows the existence of
$$\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right)$$
and that
$$\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right) = \lim_{n \to \infty} 1 + \lim_{n \to \infty} a \cdot \lim_{n \to \infty} \frac{1}{n} = 1 + a \cdot 0 = 1 .$$
Since the last is different from $0$, it follows by Theorem 2.3.4 that
$$\lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} = \frac{1}{\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right)} = \frac{1}{1} = 1 .$$
Finally, again by application of Theorem 2.3.4, it follows from this and Example 2.3.6 the convergence of $x_1, x_2, \dots$ and that
$$\lim_{n \to \infty} x_n = \lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} \cdot \lim_{n \to \infty} \frac{1}{n} = 1 \cdot 0 = 0 .$$

Remark 2.3.8.
Note that the result in the last example is unchanged if $a$ is some arbitrary real number. Only if $a$ is some integer $< 0$, the term $x_{-a}$ has to be excluded from the sequence because it is undefined.

Example 2.3.9. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{3n + 2}{2n + 1}$$
for all $n \in \mathbb{N}^*$.

Solution: First, we notice that
$$x_n = \frac{3n + 2}{2n + 1} = \frac{\frac{3}{2} (2n + 1) + \frac{1}{2}}{2n + 1} = \frac{3}{2} + \frac{1}{2} \cdot \frac{1}{2n + 1} = \frac{3}{2} + \frac{1}{4} \cdot \frac{1}{n + \frac{1}{2}} .$$
Hence it follows by Theorem 2.3.4, Example 2.3.2 and Example 2.3.7 the convergence of $x_1, x_2, \dots$ and that
$$\lim_{n \to \infty} x_n = \lim_{n \to \infty} \frac{3}{2} + \lim_{n \to \infty} \frac{1}{4} \cdot \lim_{n \to \infty} \frac{1}{n + \frac{1}{2}} = \frac{3}{2} + \frac{1}{4} \cdot 0 = \frac{3}{2} .$$

The following is a comparison theorem that allows us to conclude the convergence of one of the involved sequences from the convergence of the other.

Theorem 2.3.10. Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of real numbers such that
$$|x_n| \leq y_n$$
for all $n \in \mathbb{N}^*$. Further, let
$$\lim_{n \to \infty} y_n = 0 .$$
Then
$$\lim_{n \to \infty} x_n = 0 .$$

Proof. Let $\varepsilon > 0$. Since $y_1, y_2, \dots$ is convergent to $0$, there is $n_0 \in \mathbb{N}^*$ such that
$$|x_n| \leq y_n = |y_n| < \varepsilon$$
for all $n \geq n_0$. Hence it follows that $x_1, x_2, \dots$ is convergent to $0$.

Example 2.3.11. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where
$$x_n := \frac{1}{n^2 + a^2}$$
for all $n \in \mathbb{N}^*$ and $a \in \mathbb{R}$.

Solution: We note that for every $n \in \mathbb{N}^*$
$$\frac{1}{n^2 + a^2} \leq \frac{1}{n} .$$
Hence it follows by Theorem 2.3.10 and Example 2.3.6 that
$$\lim_{n \to \infty} x_n = 0 .$$

The following theorem is often used in the analysis of convergent sequences whose limits cannot readily be determined. In this way, by approximation of the members of the sequence, estimates of its limit can frequently be derived.

Theorem 2.3.12. (Limits preserve inequalities) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ converging to $x, y \in \mathbb{R}$, respectively. Further, let $x_n \leq y_n$ for all $n \in \mathbb{N}^*$. Then also $x \leq y$.

Proof. The proof is indirect. Assume on the contrary that $x > y$. Then it follows the existence of an $n \in \mathbb{N}^*$ such that both
$$x - x_n \leq |x_n - x| < \frac{1}{2} (x - y) , \quad y_n - y \leq |y_n - y| < \frac{1}{2} (x - y)$$
and hence the contradiction
$$x - y \leq x - y + y_n - x_n = (x - x_n) + (y_n - y) < x - y .$$
Hence $x \leq y$.

Example 2.3.13. Define the sequence $x_1, x_2, \dots$ recursively by
$$x_{n+1} := \frac{1}{2} \left( x_n + \frac{a}{x_n} \right)$$
for all $n \in \mathbb{N}^*$ where $x_1 > 0$ and $a \geq 0$. Show that
$$\lim_{n \to \infty} x_n \geq \sqrt{a}$$
if $x_1, x_2, \dots$ converges.

Solution: For every $x > 0$, it follows that
$$0 \leq (x - \sqrt{a})^2 = x^2 - 2 \sqrt{a} \, x + a$$
and hence that
$$\frac{1}{2} \left( x + \frac{a}{x} \right) = \frac{1}{2x} \, (x^2 + a) \geq \frac{2 \sqrt{a} \, x}{2x} = \sqrt{a} . \qquad (2.3.7)$$
Therefore, since $x_1 > 0$, it follows inductively that $x_n > 0$ for all $n \in \mathbb{N}^*$ and hence that
$$x_n \geq \sqrt{a}$$
for all $n \in \mathbb{N}^* \setminus \{1\}$. Hence if $x_1, x_2, \dots$ is convergent, it follows by Theorem 2.3.12 that its limit is $\geq \sqrt{a}$, as claimed.

In many cases, in particular those related to applications where sequences are often defined recursively, it is not obvious how to decide whether a given sequence is convergent or divergent. Then one usually tries first to establish the existence of a limit by application of a very general theorem, i.e., a theorem that is applicable to a very large class of sequences that have only few specific properties. If the sequence is found to be convergent, the determination of its limit or the derivation of estimates of that limit is performed in subsequent steps. The derivation of such general theorems is the goal in the following. For this, we notice that Definition 2.3.1 is not of much use for deciding the convergence of a given sequence if there is no obvious candidate for its limit. Therefore it is natural to ask whether there is a general way to decide that convergence without reference to a limit. Indeed, this is possible by means of the so-called Cauchy criterion. For its formulation, we need the notion of Cauchy sequences. Roughly speaking, a sequence $x_1, x_2, \dots$ of real numbers is called a Cauchy sequence if for every arbitrary preassigned error bound $\varepsilon > 0$, after omission of finitely many terms of the sequence, the distance between every two members of the remaining sequence is smaller than $\varepsilon$.

Definition 2.3.14.
(Cauchy sequences) We call a sequence $x_1, x_2, \dots$ of real numbers a Cauchy sequence if for every $\varepsilon > 0$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that
$$|x_m - x_n| < \varepsilon$$
for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$.

Example 2.3.15. Define $x_1 := 0$, $x_2 := 1$ and
$$x_{n+2} := \frac{1}{2} (x_n + x_{n+1})$$
for all $n \in \mathbb{N}^*$. Show that $x_1, x_2, \dots$ is a Cauchy sequence.

Solution: First, it follows for every $n \in \mathbb{N}^*$ that $x_{n+2}$ is the midpoint of the interval $I_n$ between $x_n$ and $x_{n+1}$ given by
$$I_n := [x_n, x_{n+1}] \ \text{ if } x_n \leq x_{n+1} \quad \text{and} \quad I_n := [x_{n+1}, x_n] \ \text{ if } x_n > x_{n+1} .$$
Further,
$$x_{n+2} - x_{n+1} = \frac{1}{2} (x_n + x_{n+1}) - x_{n+1} = -\frac{1}{2} (x_{n+1} - x_n) .$$
Hence it follows by the method of induction that
$$I_1 \supset I_2 \supset I_3 \supset \dots$$
and that
$$x_{n+1} - x_n = \frac{(-1)^{n-1}}{2^{n-1}}$$
for all $n \in \mathbb{N}^*$. As a consequence, if $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ is such that $2^{1 - n_0} < \varepsilon$, then it follows for $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$ that $x_m \in I_{n_0}$ and $x_n \in I_{n_0}$ and therefore that
$$|x_m - x_n| \leq \frac{1}{2^{n_0 - 1}} < \varepsilon .$$
Hence $x_1, x_2, \dots$ is a Cauchy sequence. See Fig. 18.

Fig. 18: $(n, x_n)$ from Example 2.3.15 for $n = 1$ to $n = 50$.

The following is easy to show.

Theorem 2.3.16. Every convergent sequence of real numbers is a Cauchy sequence.

Proof. For this, let $x_1, x_2, \dots$ be a sequence of real numbers converging to some $x \in \mathbb{R}$ and $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}^*$ such that
$$|x_n - x| < \varepsilon / 2$$
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. The last implies that
$$|x_m - x_n| = |x_m - x - (x_n - x)| \leq |x_m - x| + |x_n - x| < \varepsilon$$
for all $n, m \in \mathbb{N}^*$ satisfying $n \geq n_0$ and $m \geq n_0$. Hence $x_1, x_2, \dots$ is a Cauchy sequence.

The converse statement that every Cauchy sequence of real numbers is convergent is not obvious, but a deep property of the real number system. This is proved in the Appendix, see the proof of Theorem 5.1.11 in the framework of Cantor's construction of the real number system by completion of the rational numbers using Cauchy sequences. The most important parts of calculus / analysis are based on the following theorem or, equivalently, on the Bolzano-Weierstrass theorem below.

Theorem 2.3.17.
(Completeness of the real numbers) Every Cauchy sequence of real numbers is convergent.

Proof. See the proof of Theorem 5.1.11 in the Appendix.

In the following, we derive far-reaching consequences of the completeness of the real numbers.

Theorem 2.3.18. (Bolzano-Weierstrass) For every bounded sequence $x_1, x_2, \dots$ of real numbers there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent.

Proof. For this, let $x_1, x_2, \dots$ be a bounded sequence of real numbers. Then we define
$$S := \{ x_1, x_2, \dots \} .$$
In case that $S$ is finite, there is a subsequence $x_{n_1}, x_{n_2}, \dots$ which is constant and hence convergent. In case that $S$ is infinite, we choose some element $x_{n_1}$ of the sequence. Since $S$ is bounded, there is $a > 0$ such that $S \subset I_1 := [-a/4, a/4]$. At least one of the intervals $[-a/4, 0]$, $[0, a/4]$ contains infinitely many elements of $S$. We choose such an interval $I_2$ and $x_{n_2} \in I_2$ such that $n_2 > n_1$. In particular, $I_2 \subset I_1$. Bisecting $I_2$ into two intervals, we can choose a subinterval $I_3 \subset I_2$ containing infinitely many elements of $S$ and $x_{n_3} \in I_3$ such that $n_3 > n_2$. Continuing this process, we arrive at a sequence of intervals $I_1, I_2, \dots$ such that
$$I_1 \supset I_2 \supset \dots$$
and such that the length of $I_k$ is $a / 2^k$ for every $k \in \mathbb{N}^*$. Also, we arrive at a subsequence $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ such that $x_{n_k} \in I_k$ for every $k \in \mathbb{N}^*$. For given $\varepsilon > 0$, there is $k_0 \in \mathbb{N}^*$ such that $a / 2^{k_0} < \varepsilon$. Further, let $k, l \in \mathbb{N}^*$ be such that $k \geq k_0$ and $l \geq k_0$. Then it follows that $x_{n_k} \in I_{k_0}$, $x_{n_l} \in I_{k_0}$ and therefore that
$$|x_{n_k} - x_{n_l}| \leq a / 2^{k_0} < \varepsilon .$$
Hence $x_{n_1}, x_{n_2}, \dots$ is a Cauchy sequence and therefore convergent according to Theorem 2.3.17.

For the following, the Bolzano-Weierstrass theorem will be fundamental. It will be applied in the proofs of a number of important theorems, for instance, Theorem 2.3.33, Theorem 2.3.44 and Theorem 3.5.59.
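The bisection process in the proof of Theorem 2.3.18 can be imitated on a finite part of a sequence. The following Python lines are a heuristic sketch only: on finitely many members, 'contains infinitely many elements' has no meaning, so the sketch assumes as a stand-in that the half containing more of the remaining members is the right one to keep. The name `bisection_subsequence` and the test sequence $(-1)^n (1 + 1/n)$ are chosen here for illustration.

```python
def bisection_subsequence(x, a, steps):
    """Imitate the interval halving from the proof on the finite list x.

    Assumes |x[i]| <= a/4 for all i. Returns the chosen list positions
    (0-based, strictly increasing) and the nested intervals I_1, I_2, ...
    """
    lo, hi = -a / 4.0, a / 4.0          # I_1 = [-a/4, a/4]
    indices = [0]                        # any member may serve as x_{n_1}
    intervals = [(lo, hi)]
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        later = range(indices[-1] + 1, len(x))
        left = [i for i in later if lo <= x[i] <= mid]
        right = [i for i in later if mid <= x[i] <= hi]
        # stand-in for 'infinitely many': keep the half with more later members
        if len(left) >= len(right):
            hi, pool = mid, left
        else:
            lo, pool = mid, right
        if not pool:                     # finite data exhausted
            break
        indices.append(pool[0])          # next member lies in the new interval
        intervals.append((lo, hi))
    return indices, intervals

a = 8.0
x = [(-1) ** n * (1.0 + 1.0 / n) for n in range(1, 2001)]
indices, intervals = bisection_subsequence(x, a, steps=8)
for i, (lo, hi) in zip(indices, intervals):
    print(i + 1, x[i], (lo, hi))         # index n_k, member x_{n_k}, interval I_k
```

For this bounded but divergent test sequence, the selected members settle near one of its two accumulation points, while the lengths of the intervals halve in every step as in the proof.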
Also the following theorem is an important and frequently applied consequence of the Bolzano-Weierstrass theorem. Until the beginning of the 19th century its statement must have been considered as geometrically obvious because it was used without mention. For instance, in Augustin-Louis Cauchy's textbook 'Cours d'analyse' from 1821 [22], it is implicitly used in the proof of the intermediate value theorem, see Theorem 2.3.37 below, but without proof. From today's perspective, it is clear that such geometric intuition was based on an illusion.

Theorem 2.3.19. Let $x_1, x_2, \dots$ be an increasing sequence of real numbers, i.e., such that $x_n \leq x_{n+1}$ for all $n \in \mathbb{N}^*$, which is also bounded from above, i.e., for which there is $M \geq 0$ such that $x_n \leq M$ for all $n \in \mathbb{N}^*$. Then $x_1, x_2, \dots$ is convergent.

Proof. Since $x_1, x_2, \dots$ is increasing and bounded from above, it follows that this sequence is also bounded. Hence according to the previous theorem, there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent. We denote the limit of such a sequence by $x$. Then
\[ x_n \leq x \]
for all $n \in \mathbb{N}^*$. Otherwise, there is $m \in \mathbb{N}^*$ such that $x_m > x$. If $k_0 \in \mathbb{N}^*$ is such that $n_{k_0} \geq m$, then
\[ x_{n_k} \geq x_m > x \]
for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. This implies that
\[ \lim_{k \to \infty} x_{n_k} \geq x_m > x , \]
which is a contradiction. Further, for $\varepsilon > 0$, there is $k_0 \in \mathbb{N}^*$ such that
\[ |x_{n_k} - x| < \varepsilon \]
for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. Hence it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq n_{k_0}$ that
\[ |x_n - x| = x - x_n \leq x - x_{n_{k_0}} = |x_{n_{k_0}} - x| < \varepsilon . \]
Therefore, $x_1, x_2, \dots$ is convergent to $x$.

Corollary 2.3.20. Let $x_1, x_2, \dots$ be a decreasing sequence of real numbers, i.e., such that $x_{n+1} \leq x_n$ for all $n \in \mathbb{N}^*$, which is also bounded from below, i.e., for which there is a real $M \geq 0$ such that $x_n \geq -M$ for all $n \in \mathbb{N}^*$. Then $x_1, x_2, \dots$ is convergent.

Proof. The sequence $-x_1, -x_2, \dots$ is increasing, bounded from above and therefore convergent to a real number $x$ by the previous theorem. Hence $x_1, x_2, \dots$ is convergent to $-x$.

Example 2.3.21. Show that the sequence $x_1, x_2, \dots$ defined by $x_1 := 1/2$ and
\[ x_n := \frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)} \]
for all $n \in \mathbb{N}^* \setminus \{1\}$ is convergent. Solution: The sequence $x_1, x_2, \dots$ is bounded from below by $0$. In addition,
\[ x_{n+1} = \frac{2n + 1}{2(n + 1)} \, x_n \leq x_n \]
for all $n \in \mathbb{N}^*$ and hence $x_1, x_2, \dots$ is decreasing. Hence $x_1, x_2, \dots$ is convergent according to Corollary 2.3.20. See Fig. 19.

Fig. 19: $(n, x_n)$ from Example 2.3.21 for $n = 1$ to $n = 50$.

Definition 2.3.22. Let $S$ be a non-empty subset of $\mathbb{R}$. We say that $S$ is bounded from above (bounded from below) if there is $M \in \mathbb{R}$ such that $x \leq M$ ($x \geq M$) for all $x \in S$.

The following theorem can be considered as a variation of Theorem 2.3.19 which is also in frequent use. Its power will be demonstrated in the subsequent example.

Theorem 2.3.23. Let $S$ be a non-empty subset of $\mathbb{R}$ which is bounded from above (bounded from below). Then there is a least upper bound (largest lower bound) of $S$ which will be called the supremum of $S$ (infimum of $S$) and denoted by $\sup S$ ($\inf S$).

Proof. First, we consider the case that $S$ is bounded from above. For this, we define the subsets $A$, $B$ of $\mathbb{R}$ as the set of all real numbers that are no upper bounds of $S$ and the set of all upper bounds of $S$, respectively,
\[ A := \{a \in \mathbb{R} : \text{There is } x \in S \text{ such that } x > a\} , \quad B := \{b \in \mathbb{R} : x \leq b \text{ for all } x \in S\} . \]
Since $S$ is non-empty and bounded from above, these sets are non-empty. In addition, for every $a \in A$ and every $b \in B$, it follows that $a < b$. Let $a_1 \in A$ and $b_1 \in B$. Recursively, we construct an increasing sequence $a_1, a_2, \dots$ in $A$ and a decreasing sequence $b_1, b_2, \dots$ in $B$ by
\[ a_{n+1} := \begin{cases} (a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in A \\ a_n & \text{if } (a_n + b_n)/2 \in B \end{cases} , \qquad b_{n+1} := \begin{cases} b_n & \text{if } (a_n + b_n)/2 \in A \\ (a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in B \end{cases} \]
for every $n \in \mathbb{N}^*$. According to Theorem 2.3.19 and Corollary 2.3.20, both sequences are convergent to real numbers $a$ and $b$, respectively. Since
\[ b_n - a_n = (b_1 - a_1)/2^{n-1} \]
for all $n \in \mathbb{N}^*$, it follows that $a = b$.
In the following, we show that $b = \sup S$. For every $x \in S$, it follows that $x \leq b_n$ for all $n \in \mathbb{N}^*$ and hence that $x \leq b$. Hence $b$ is an upper bound of $S$. Let $\bar{b}$ be an upper bound of $S$ such that $\bar{b} < b$. Then, since $a = b$, there is $n \in \mathbb{N}^*$ such that $\bar{b} < a_n$. Since $a_n$ is no upper bound for $S$, the same is also true for $\bar{b}$, which is a contradiction. Therefore, $b$ is the smallest upper bound of $S$, i.e., $b = \sup S$. Finally, we consider the case that $S$ is bounded from below. Then $-S := \{-x : x \in S\}$ is bounded from above. Obviously, a real number $a$ is a lower bound of $S$ if and only if $-a$ is an upper bound of $-S$. Hence $-\sup(-S)$ is the largest lower bound of $S$, i.e., $\inf S$ exists and equals $-\sup(-S)$.

Example 2.3.24. Prove that there is a real number $x$ such that $x^2 = 2$. Solution: For this, we define
\[ S := \{y \in \mathbb{R} : 0 \leq y^2 \leq 2\} . \]
Since $0 \in S$, $S$ is non-empty. Further, $S$ does not contain real numbers $y \geq 2$ since the last inequality implies that
\[ y^2 - 2 = (y - 2)(y + 2) + 2 \geq 2 . \]
Hence $S$ is bounded from above. We define $x := \sup S$. In the following, we prove that $x^2 = 2$ by excluding that $x^2 < 2$ and that $x^2 > 2$. First, we assume that $x^2 < 2$. Then it follows for $n \in \mathbb{N}^*$ that
\[ \left(x + \frac{1}{n}\right)^2 - 2 = x^2 - 2 + \frac{2x}{n} + \frac{1}{n^2} \leq x^2 - 2 + \frac{2x}{n} + \frac{1}{n} = x^2 - 2 + \frac{2x + 1}{n} . \]
Hence if $n \geq (2x + 1)/(2 - x^2)$, it follows that
\[ \left(x + \frac{1}{n}\right)^2 \leq 2 \]
and therefore that $x + (1/n) \in S$. As a consequence, $x$ is no upper bound for $S$. Second, we assume that $x^2 > 2$. Then it follows for $\varepsilon > 0$ that
\[ (x - \varepsilon)^2 - 2 = x^2 - 2 - 2\varepsilon x + \varepsilon^2 \geq x^2 - 2 - 2\varepsilon x . \]
Hence if $\varepsilon < (x^2 - 2)/(2x)$, it follows that
\[ (x - \varepsilon)^2 > 2 . \]
As a consequence, $x$ is not the smallest upper bound for $S$. Finally, it follows that $x^2 = 2$. Note that according to Example 2.2.15, $x$ is no rational number.

Below, we define the exponential function as a limit of sequences. This function is of fundamental importance for applications. It appears in a natural way in the description of physical systems throughout the whole of physics. One prominent example is the description of radioactive decay.
Its discovery is often attributed to Jacob Bernoulli, who became familiar with calculus through a correspondence with Leibniz, resulting from his study of the problem of continuous compound interest. For motivation, we briefly sketch the problem in the following. For this, we assume that a bank account contains $a > 0$ Dollars and pays $100 \cdot x$ percent interest per year where $x$ is some real number. Of course, in practice $x \geq 0$. If the interest is paid once at the end of the year, the account contains
\[ a_1 := a + x \cdot a = a \, (1 + x) \]
Dollars at the end of the year. If the interest is paid semiannually, after $1/2$ years the account contains
\[ a + \frac{x}{2} \, a = a \left(1 + \frac{x}{2}\right) \]
Dollars and after one year
\[ a_2 := a \left(1 + \frac{x}{2}\right) + \frac{x}{2} \, a \left(1 + \frac{x}{2}\right) = a \left(1 + \frac{x}{2}\right)^2 \geq a_1 \]
Dollars. Analogously, if the interest is paid $n$ times per year where $n \in \mathbb{N}^*$, the account contains
\[ a_n := a \left(1 + \frac{x}{n}\right)^n \]
Dollars after one year. Bernoulli investigated the question whether this amount would grow indefinitely with the increase of $n$ or whether it would stay bounded. Indeed, as we shall see below, the sequence $a_1, a_2, \dots$ is converging to a real number which is denoted by $a e^x$ or $a \exp(x)$. For simplicity, below we restrict $n$ to powers of $2$. This is an approach of Otto Dunkel, 1917 [33], which avoids the use of Bernoulli's inequality. This restriction can be removed later, for instance, with the help of L'Hospital's theorem, Theorem 2.5.38.

Fig. 20: $(n, x_n)$, $(n, y_n)$ from Lemma 2.3.25 and $(n, e)$ for $n = 1$ to $n = 15$.

Lemma 2.3.25. Let $x \in \mathbb{R}$. Define
\[ x_n := \left(1 + \frac{x}{2^n}\right)^{(2^n)} , \quad y_n := \left(1 - \frac{x}{2^n}\right)^{-(2^n)} \]
for all $n \in \mathbb{Z}$. Then for all $n \in \mathbb{N}^*$ such that $m := 2^{n-1} > |x|$:
\[ 0 < x_{n-1} \leq x_n \leq y_n \leq y_{n-1} \tag{2.3.8} \]
and
\[ 1 - \frac{x^2}{2m} \leq \frac{x_n}{y_n} \leq 1 . \tag{2.3.9} \]

Proof. For this let $n \in \mathbb{N}^*$ be such that $m := 2^{n-1} > |x|$. Then
\[ 1 + \frac{x}{m} > 0 , \quad 1 - \frac{x}{m} > 0 \]
and hence
\[ \left(1 + \frac{x}{2m}\right)^2 = 1 + \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 + \frac{x}{m} > 0 , \quad \left(1 - \frac{x}{2m}\right)^2 = 1 - \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 - \frac{x}{m} > 0 . \]
Taking the $m$-th power of the first and the $(-m)$-th power of the second estimate gives $0 < x_{n-1} \leq x_n$ and $0 < y_n \leq y_{n-1}$.
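Finally, it follows that
\[ y_n - x_n = \left(1 - \frac{x}{2m}\right)^{-2m} \left\{ 1 - \left[\left(1 + \frac{x}{2m}\right)\left(1 - \frac{x}{2m}\right)\right]^{2m} \right\} = \left(1 - \frac{x}{2m}\right)^{-2m} \left[ 1 - \left(1 - \frac{x^2}{4m^2}\right)^{2m} \right] \]
\[ = \left(1 - \frac{x}{2m}\right)^{-2m} \frac{x^2}{4m^2} \left[ \left(1 - \frac{x^2}{4m^2}\right)^{0} + \left(1 - \frac{x^2}{4m^2}\right)^{1} + \dots + \left(1 - \frac{x^2}{4m^2}\right)^{2m-1} \right] . \]
Since each of the $2m$ summands in the last bracket lies in $(0, 1]$, it follows that $0 \leq y_n - x_n \leq y_n \, x^2/(2m)$ and hence $x_n \leq y_n$ and (2.3.9).

Note that the sequence $y_1, y_2, \dots$ in Lemma 2.3.25 is, from some index on, decreasing and bounded from below by $0$ and hence convergent according to Corollary 2.3.20. Hence we can define the following:

Definition 2.3.26. We define the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ by
\[ \exp(x) := e^x := \lim_{n \to \infty} \left(1 - \frac{x}{2^n}\right)^{-(2^n)} \]
for all $x \in \mathbb{R}$.

Then we conclude

Theorem 2.3.27.

(i)
\[ \lim_{n \to \infty} \left(1 + \frac{x}{2^n}\right)^{(2^n)} = e^x \]
and $e^x > 0$ for all $x \in \mathbb{R}$.

(ii)
\[ 1 + x \leq \left(1 + \frac{x}{2^n}\right)^{(2^n)} \leq e^x \leq \left(1 - \frac{x}{2^n}\right)^{-(2^n)} \leq \frac{1}{1 - x} \tag{2.3.10} \]
for all $x \in \mathbb{R}$ such that $|x| < 1$ and all $n \in \mathbb{N}$.

(iii)
\[ e^{x + y} = e^x e^y \]
for all $x, y \in \mathbb{R}$.

Proof. From (2.3.9), it follows for every $x \in \mathbb{R}$:
\[ \lim_{n \to \infty} \frac{x_n}{y_n} = 1 \]
and hence by the limit laws, Theorem 2.3.4, that
\[ \lim_{n \to \infty} y_n = \lim_{n \to \infty} \frac{x_n}{y_n} \cdot \lim_{n \to \infty} y_n = \lim_{n \to \infty} x_n \]
and by (2.3.8) and Theorem 2.3.12 that $e^x > 0$ for all $x \in \mathbb{R}$. Further, if $|x| < 1$, the estimates (2.3.10) follow from (2.3.8) and Theorem 2.3.12. Finally, if $y \in \mathbb{R}$ and $n \in \mathbb{N}$ is such that $m := 2^n > \max\{4|x|, 4|y|, 2|x||y|\}$, then
\[ \frac{\left(1 + \frac{x}{m}\right)^m \left(1 + \frac{y}{m}\right)^m}{\left(1 + \frac{x + y}{m}\right)^m} = \left(1 + \frac{h_m}{m}\right)^m \]
where
\[ h_m := \frac{x y}{m + x + y} \]
is such that $|h_m| < 1$. Hence by (2.3.10)
\[ 1 + h_m \leq \left(1 + \frac{h_m}{m}\right)^m \leq \frac{1}{1 - h_m} , \]
and, since $h_m$ converges to $0$, it follows by Theorem 2.3.4 and Theorem 2.3.12 that
\[ \frac{e^x e^y}{e^{x + y}} = \lim_{n \to \infty} \left(1 + \frac{h_m}{m}\right)^m = 1 . \]

Problems

1) Below are given the first 8 terms of a sequence $x_1, x_2, \dots$. For each find a representation $x_n = f(n)$, $n = 1, \dots, 8$, where $f$ is an appropriate function.
a) 2, 4, 6, 8, 10, 12, 14, 16,
b) 2, 4, 8, 16, 32, 64, 128, 256,
c) 1, 1, 1, 1, 1, 1, 1, 1,
d) 1, 3, 6, 10, 15, 21, 28, 36,
e) 1, 3/4, 5/7, 7/10, 9/13, 11/16, 13/19, 15/22,
f) 2, 0, 2, 0, 2, 0, 2, 0,
g) 5/7, 0, 7/9, 0, 9/11, 0, 11/13, 0,
h) 1, 1, 4/6, 8/24, 16/120, 32/720, 64/5040, 128/40320,
i) 0, 1, 0, 1, 0, 1, 0, 1,
j) 0, 1, 0, 1, 0, 1, 0, 1,
k) 0, 1, 0, 0, 0, 1, 0, 1, [0, 0, 0, 1,]
l) 0, 1, 0, 0, 0, 1, 0, 1, [0, 0, 0, 1].

2) Prove the convergence of the sequence and calculate its limit.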
For this use only the limit laws, the fact that a constant sequence converges to that respective constant and the fact that lim p1{nq 0 . n Ñ8 Give details. 86 a) b) c) d) e) f) g) xn xn xn xn xn xn xn : 1 p1{nq, n P N , : 5 p2q p1{nq 3 p1{nq2 , n P N , : r1 p4q p1{nqs{r2 3 p1{nq2 s, n P N , : 3{n2 , n P N , : p2n 1q{pn 3q, n P N , : p3n2 6n 10q{p7n2 3n 5q, n P N , : p3n2 6n 10q{p7n3 3n 5q, n P N . 3) Determine in each case whether the given sequence is convergent or divergent. Give reasons. If it is convergent, calculate the limit. a) xn : n 1 n 1 b) xn : d) xn : p1qn n p1qn c) xn : p1qn 1 e) xn : sinpnπ q f) xn : sin g) xn : h) xn : n2 n2 1 i) xn : j) xn : n2 n n3 1 n n n2 1 n3 n2 1 for every n P N . 1 n nπ 2 cospnπ q 4) The table displays pairs pn, sn q, n 1, . . . , 10, where sn is the measured height in meters of a free falling body over the ground after n{10 seconds and at rest at initial height 4m. p1, 3.951q p2, 3.804q p3, 3.559q p4, 3.216q p5, 2.775q p6, 2.236q p7, 1.599q p8, 0.864q . Draw these points into an xy-diagram where the values of n appear on the x-axis and the values of sn on the y-axis. Find a representation sn f pn{10q, n 1, . . . , 10, where f is an appropriate function, and predict the time when the body hits the ground. 5) The table displays pairs p2n{10, Ln q, n 1, ..., 8, where 2n{10 is the pressure in atmospheres (atm) of an ideal gas (, at constant temperature of 20 degrees Celsius,) confined to a volume which is proportional to the length Ln . The last is measured in millimeters (mm). p0.2, 672q p0.4, 336q p0.6, 224q p0.8, 168q p1.0, 134.4q p1.2, 112q p1.4, 96q p1.6, 84q 87 Draw these points into an xy-diagram where the values of n appear on the x-axis and the values of Ln on the y-axis. Find a representation Ln f p2n{10q, n 1, . . . , 8, where f is an appropriate function, and predict L10 . 6) Like Archimedes, derive the recursion relation (2.3.1) by elementary geometric reasoning without the use of trigonometric functions. 
7) Reconsider Archimedes' measurement of the circle and calculate the recursion relation for the sequence of circumferences $s_6, s_{12}, s_{24}, \dots$ that corresponds to (2.3.1). In addition, prove that this sequence is increasing as well as bounded from above and hence convergent.

2.3.2 Continuous Functions

This section starts the investigation of properties of functions defined on subsets of the real numbers. Alongside the notion of a function, the notion of the continuity of a function underwent considerable changes until it reached its current meaning. In his textbook 'Introductio ad analysin infinitorum' from 1748 [38], Leonhard Euler defines a function as an equation or analytic expression composed of variables and numbers. Admissible analytic expressions were those that involved the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. This common property of functions was also called 'continuity in form'. The study of the solutions of the wave equation in one space dimension ('the Vibrating String Problem') made necessary the consideration of compounds of such functions. Such were called 'discontinuous' functions by Euler. This included functions (in the sense of curves) that are traced by the free motion of the hand and therefore not subject to any law of continuity in form. Unlike modern definitions of continuity of a function, continuity in the sense of Euler included the differentiability of the function in the modern sense. The last concept will be defined in Section 2.4. Hence the term continuous was used to indicate a kind of regularity of the function. The same is true today.

The modern definition of continuity goes back to a publication of Bernhard Bolzano from 1817 [12].
The literal translation of the (German) title is 'Purely analytical proof of the theorem, that between each two values which guarantee an opposing result, at least one real root of the equation lies.' The phrase 'opposing result' means an opposite sign, and the theorem in question is the intermediate value theorem, see Theorem 2.3.37 below. In this paper, he criticizes that the known proofs of that theorem still make reference to geometric intuition although such arguments were already considered inadequate in pure mathematics at the time. He argues that the concept of continuity should be understood in the following sense.

A function $f(x)$ varies according to the law of continuity for all values of $x$ which lie inside or outside certain limits if for every such $x$ the value of the difference $f(x + \omega) - f(x)$ can be made smaller than any given quantity if $\omega$ can be assumed as small as one wishes.

Essentially the same formulation can also be found in Cauchy's textbook 'Cours d'analyse' from 1821 [22]. This formulation practically coincides with a modern definition. It is important to note that, at first sight and unlike Bolzano's, Cauchy's definition makes reference to infinitesimal quantities. The use of such quantities, which have their roots in ancient Greek philosophy, was quite common at that time. Among others, Johannes Kepler, Newton, Leibniz, Jacob Bernoulli, Euler and Cauchy, prior to the writing of his 'Cours d'analyse', made use of them. Jean le Rond d'Alembert, Joseph Louis Lagrange, Bolzano and others distrusted that concept and tried to avoid it. On the other hand, Cauchy replaces the concept of fixed infinitesimally small quantities by a definition of infinitesimals in terms of an essentially modern concept of limits. In this way, he 'reconciles rigor with infinitesimals' and became an important and influential promoter of rigor in calculus / analysis.

In modern calculus / analysis, infinitesimals are not part of the real number system.
Following Cauchy, their role has been replaced by the rigorous concept of limits. The assumption of continuity of the involved function is sufficient to prove the intermediate value theorem, although neither Bolzano nor Cauchy could give a completely satisfactory proof according to modern standards because a rigorous foundation of the real number system was still missing. An additional important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume a maximum and also a minimum value. See Theorem 2.3.33 below.

Below, we define the continuity of a function as the property to 'preserve limits'. This form of the definition goes back to Heinrich Eduard Heine and is called 'sequential continuity' in more general situations (than functions defined on subsets of the real numbers).

Definition 2.3.28. (Continuity) Let $f : D \to \mathbb{R}$ be a function and $x \in D$. Then we say $f$ is continuous in $x$ if for every sequence $x_1, x_2, \dots$ of elements in $D$ from
\[ \lim_{\nu \to \infty} x_\nu = x \]
it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f\!\left(\lim_{\nu \to \infty} x_\nu\right) \; [\, = f(x) \,] . \]
If $f$ is not continuous in $x$, we say $f$ is discontinuous in $x$. Also we say $f$ is continuous if $f$ is continuous in all points of its domain $D$.

Example 2.3.29. (Basic examples for continuous functions.) Let $a$, $b$ be real numbers and $f : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) := a x + b \]
for all $x \in \mathbb{R}$. Then $f$ is continuous.

Proof. Let $x$ be some real number and $x_1, x_2, \dots$ be a sequence of real numbers converging to $x$. Then for any given $\varepsilon > 0$, there is $n_0 \in \mathbb{N}^*$ such that for $n \in \mathbb{N}^*$ with $n \geq n_0$:
\[ |a| \cdot |x_n - x| < \varepsilon \]
and hence also that
\[ |f(x_n) - f(x)| = |a x_n + b - (a x + b)| = |a x_n - a x| = |a| \cdot |x_n - x| < \varepsilon \]
and
\[ \lim_{n \to \infty} f(x_n) = f(x) . \]

An example for a function which is discontinuous in one point.

Example 2.3.30. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \frac{x}{|x|} \]
for $x \neq 0$ and $f(0) := 1$. Then
\[ \lim_{n \to \infty} \frac{1}{n} = 0 \quad \text{and} \quad \lim_{n \to \infty} \left(-\frac{1}{n}\right) = 0 , \]
but
\[ \lim_{n \to \infty} f\!\left(\frac{1}{n}\right) = 1 \quad \text{and} \quad \lim_{n \to \infty} f\!\left(-\frac{1}{n}\right) = -1 . \]
Hence $f$ is discontinuous at the point $0$. See Fig. 21. Such a discontinuity is called a 'jump discontinuity'.
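The sequence test of Definition 2.3.28 can be tried out on Example 2.3.30. The sketch below is only an illustration in Python; it assumes that the exceptional point of the example is $0$ (the point to which the two sequences $1/n$ and $-1/n$ converge) and that $f$ takes the value $1$ there. It evaluates $f$ along both sequences.

```python
def f(x):
    """Function of Example 2.3.30: x/|x| away from 0, and the value 1 at 0 (assumed)."""
    return 1.0 if x == 0 else x / abs(x)

# Two sequences converging to 0, one from the right and one from the left.
right_values = [f(1 / n) for n in range(1, 11)]
left_values = [f(-1 / n) for n in range(1, 11)]

print(right_values[-1], left_values[-1], f(0))
```

Along $1/n$ the values of $f$ stay at the value $f(0)$, while along $-1/n$ they do not approach it, so the condition of Definition 2.3.28 fails for at least one sequence. A numerical experiment of this kind can expose a discontinuity, but it cannot prove continuity, since the definition quantifies over all sequences.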
The following gives an example of a function that is discontinuous in every point of its domain and is known as Dirichlet's function. It was given in Dirichlet's 1829 paper [30] which gave a precise meaning to Fourier's work from 1822 [41] on heat conduction. As described in the beginning of Section 2.2.3, that paper also gave the first modern definition of functions. His example clearly demonstrates that he moved considerably past his time with his concept of functions since such a type of function had not been considered before.

Example 2.3.31. (Dirichlet's function, a function which is nowhere continuous) Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]
for every $x \in \mathbb{R}$. For the proof that $f$ is everywhere discontinuous, let $x \in \mathbb{R}$. Then $x$ is either rational or irrational. If $x$ is rational, then $x_n := x + \sqrt{2}/n$ for every $n \in \mathbb{N}^*$ is irrational. (Otherwise, $\sqrt{2} = n (x_n - x)$ is a rational number.) Hence
\[ \lim_{n \to \infty} f(x_n) = 0 \neq 1 = f(x) , \]
and $f$ is discontinuous in $x$. If $x$ is irrational, by construction of the real number system, see Theorem 5.1.11 (i) in the Appendix, there is a sequence of rational numbers $x_1, x_2, \dots$ that is convergent to $x$. Hence
\[ \lim_{n \to \infty} f(x_n) = 1 \neq 0 = f(x) , \]
and $f$ is discontinuous in $x$ also in this case.

In the following, we define 'continuous' limits of the form
\[ \lim_{x \to a} f(x) \]
where $f$ is some function and $a$ some real number or $-\infty$, $\infty$. In classical (= 'pre-modern') understanding, the symbol was understood as the variable $x$ approaching $a$ in a 'continuous' way, an understanding that was heavily dependent on geometric intuition. Nowadays, there are good reasons to distrust such an intuition resulting from Cantor's classification of infinite sets. That classification separates infinite sets into those that are countable and those that are not. The last are called 'uncountable'. A countable set is a set which is the image of an injective map with domain $\mathbb{N}$.
It can be shown that the sets $\mathbb{Z}$ and $\mathbb{Q}$ are countable, but that $\mathbb{R}$ and also any interval of $\mathbb{R}$ containing more than one point is uncountable. Therefore, the geometric intuition of the variable $x$ approaching $a$ in a continuous way would involve the visualization of an uncountable set which can be considered humanly impossible. For this reason, it can very well be said that a large part of classical calculus / analysis used arguments that were based on illusions, even if one excludes its frequent use of infinitesimal quantities from the consideration.

The following definition introduces notation which is in frequent use in other textbooks of calculus / analysis. We will use it only occasionally.

Definition 2.3.32. (Continuous limits) Let $f$ be a function defined on a subset of $\mathbb{R}$, $a \in \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$ and $b \in \mathbb{R}$.

(i) We say that a sequence $x_1, x_2, \dots$ of real numbers converges to $\infty$ or $-\infty$ if for every $n \in \mathbb{N}$ there are only finitely many members that are $\leq n$ or $\geq -n$, respectively.

(ii) If there is a sequence $x_1, x_2, x_3, \dots$ in the domain of $f$ that converges to $a$, we define
\[ \lim_{x \to a} f(x) = b \]
if for every such sequence it follows that
\[ \lim_{n \to \infty} f(x_n) = b . \]

An important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume a maximum value and a minimum value. The corresponding theorem is a direct consequence of the Bolzano-Weierstrass theorem, Theorem 2.3.18.

Theorem 2.3.33. (Existence of maxima and minima of continuous functions on compact intervals) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then there is $x_0 \in [a, b]$ such that
\[ f(x_0) \geq f(x) \quad (\, f(x_0) \leq f(x) \,) \]
for all $x \in [a, b]$.

Fig. 21: Graph of $f$ from Example 2.3.30.

Fig. 22: Graph of $f$ from Example 2.3.36.

Proof. For this, in a first step, we show that $f$ is bounded and hence that $\sup f([a, b])$ exists.
In the final step, we show that there is $c \in [a, b]$ such that $f(c) = \sup f([a, b])$. For this, we use the Bolzano-Weierstrass theorem. The proof that $f$ is bounded is indirect. Assume on the contrary that $f$ is unbounded. Then there is a sequence $x_1, x_2, \dots$ in $[a, b]$ such that
\[ |f(x_n)| > n \tag{2.3.11} \]
for all $n \in \mathbb{N}^*$. Hence according to Theorem 2.3.18, there is a subsequence $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $c \in [a, b]$. Note that the corresponding sequence $f(x_{n_1}), f(x_{n_2}), \dots$ is not converging as a consequence of (2.3.11). But, since $f$ is continuous, it follows that
\[ f(c) = \lim_{k \to \infty} f(x_{n_k}) , \]
which is a contradiction. Hence $f$ is bounded. Therefore let $M := \sup f([a, b])$. Then for every $n \in \mathbb{N}^*$ there is a corresponding $c_n \in [a, b]$ such that
\[ |f(c_n) - M| < \frac{1}{n} . \tag{2.3.12} \]
Again, according to Theorem 2.3.18, there is a subsequence $c_{n_1}, c_{n_2}, \dots$ of $c_1, c_2, \dots$ converging to some element $c \in [a, b]$. Also, as a consequence of (2.3.12), the corresponding sequence $f(c_{n_1}), f(c_{n_2}), \dots$ is converging to $M$ and by continuity of $f$ to $f(c)$. Hence $f(c) = M$ and by the definition of $M$:
\[ f(c) = M \geq f(x) \]
for all $x \in [a, b]$. By applying the previous reasoning to the continuous function $-f$, the existence of a $c' \in [a, b]$ follows such that
\[ -f(c') \geq -f(x) \]
and hence also
\[ f(c') \leq f(x) \]
for all $x \in [a, b]$.

As a by-product of the proof of the previous theorem, we proved that every continuous function defined on a bounded closed interval of $\mathbb{R}$ is bounded in the following sense.

Definition 2.3.34. (Boundedness of functions) We call a function $f$ bounded if there is $M > 0$ such that
\[ |f(x)| \leq M \]
for all $x$ from its domain.

An example for an unbounded function defined on a bounded closed interval of $\mathbb{R}$ is given by the function $f$ from Example 2.3.36 below.

Corollary 2.3.35. Every continuous function defined on a bounded closed interval of $\mathbb{R}$ is bounded.

A simple example of a function which is discontinuous in one point and does not assume a maximal value is:

Example 2.3.36.
Define $f : [0, 1] \to \mathbb{R}$ by
\[ f(x) := \begin{cases} 1 \quad x^2 & \text{if } 0 \leq x < 1/2 \\ (x \quad 1)^2 & \text{if } 1/2 \leq x \leq 1 . \end{cases} \]
See Fig. 22.

Another important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume all values between those at the interval ends.

Theorem 2.3.37. (Intermediate value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Further, let $f(a) < f(b)$ and $\gamma \in (f(a), f(b))$. Then there is $x \in (a, b)$ such that
\[ f(x) = \gamma . \]

Proof. Define
\[ S := \{x \in [a, b] : f(x) \leq \gamma\} . \]
Then $S$ is non-empty, since $a \in S$, and bounded from above by $b$. Hence $c := \sup S$ exists and is contained in $[a, b]$. Further, there is a sequence $x_1, x_2, \dots$ in $S$ such that
\[ |x_n - c| \leq \frac{1}{n} \tag{2.3.13} \]
for all $n \in \mathbb{N}^*$. Hence $x_1, x_2, \dots$ is converging to $c$, and it follows by the continuity of $f$ that
\[ \lim_{n \to \infty} f(x_n) = f(c) . \]
Moreover, since $f(x_n) \leq \gamma$ for all $n \in \mathbb{N}^*$, it follows that $f(c) \leq \gamma$. As a consequence, $c \neq b$. Now for every $x \in (c, b]$, it follows that $f(x) > \gamma$ because otherwise $c$ is not an upper bound of $S$. Moreover, since $c \neq b$, there exists a sequence $y_1, y_2, \dots$ in $(c, b]$ which is converging to $c$. Further, because of the continuity of $f$,
\[ \lim_{n \to \infty} f(y_n) = f(c) \]
and hence $f(c) \geq \gamma$. Finally, it follows that $f(c) = \gamma$ and therefore also that $c \neq a$ and $c \neq b$.

The following corollary displays a main application of the intermediate value theorem: If $f$ is a continuous function defined on a closed interval of $\mathbb{R}$ whose values at the interval ends have a different relative sign, i.e., one of those is $< 0$ and the other one is $> 0$, then there is $x$ in the domain of $f$ such that
\[ f(x) = 0 . \]

Corollary 2.3.38. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Moreover, let $f(a) < 0$ and $f(b) > 0$. Then there is $x \in (a, b)$ such that $f(x) = 0$.

Example 2.3.39. Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := x^3 + x + 1 \]
for all $x \in \mathbb{R}$. Then by Theorems 2.3.46, 2.3.48 below, $f$ is continuous.

Fig. 23: Graph of $f$ from Example 2.3.39.
Also, it follows that $f(-1) = -1 < 0$ and $f(0) = 1 > 0$ and hence by Corollary 2.3.38 that $f$ has a zero in $(-1, 0)$. See Fig. 23.

Remark 2.3.40. Note in the previous example that the value ($0.375$) of $f$ in the mid point $-0.5$ of $[-1, 0]$ is $> 0$. Hence it follows by Corollary 2.3.38 that there is a zero in the interval $[-1, -0.5]$. The iteration of this process is called the 'bisection method'. It is used to approximate zeros of continuous functions.

Polynomial functions, defined on the whole of $\mathbb{R}$, of an odd order necessarily assume the value $0$ since they assume values of different relative sign for large negative and large positive arguments. That the same is not true in general for polynomial functions of even order can be seen from the fact that, for instance, the polynomial function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := 1 + x^2 \]
for all $x \in \mathbb{R}$ does not assume the value zero.

Theorem 2.3.41. Let $n$ be a natural number and $a_0, a_1, \dots, a_{2n}$ be real numbers. Define the polynomial $p : \mathbb{R} \to \mathbb{R}$ by
\[ p(x) := a_0 + a_1 x + \dots + a_{2n} x^{2n} + x^{2n+1} \]
for all $x \in \mathbb{R}$. Then there is some $x \in \mathbb{R}$ such that $p(x) = 0$.

Proof. Below in Example 2.3.49, it is proved that $p$ is continuous. Further, define
\[ x_0 := 1 + \max\{|a_0|, |a_1|, \dots, |a_{2n}|\} . \]
Then
\[ |a_0 + a_1 x_0 + \dots + a_{2n} x_0^{2n}| \leq |a_0| + |a_1| \cdot |x_0| + \dots + |a_{2n}| \cdot |x_0|^{2n} \leq (x_0 - 1)(1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < x_0^{2n+1} \]
and hence $p(x_0) > 0$. Also
\[ |a_0 + a_1 (-x_0) + \dots + a_{2n} (-x_0)^{2n}| \leq |a_0| + |a_1| \cdot |x_0| + \dots + |a_{2n}| \cdot |x_0|^{2n} \leq (x_0 - 1)(1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < -(-x_0)^{2n+1} \]
and hence $p(-x_0) < 0$. Hence according to Theorem 2.3.37, there is $x \in [-x_0, x_0]$ such that $p(x) = 0$.

The 'converse' of Theorem 2.3.37 is not true, i.e., a function that assumes all values between those at its interval ends is not necessarily continuous on that interval. This can be seen, for instance, from the following example.

Example 2.3.42. Define $f : [0, 2/\pi] \to \mathbb{R}$ by
\[ f(x) := \sin(1/x) \]
for $0 < x \leq 2/\pi$ and $f(0) := 0$. Then $f$ is not continuous (in $0$), but assumes all values in the interval $[f(0), f(2/\pi)] = [0, 1]$.
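The bisection method of Remark 2.3.40 can be sketched in a few lines of Python. The sketch assumes that the function of Example 2.3.39 is $f(x) = x^3 + x + 1$, which reproduces the values $f(-1) = -1$, $f(0) = 1$ and $f(-0.5) = 0.375$ quoted there; the function name `bisect_zero` and the tolerance parameter `tol` are illustrative choices and not part of the text.

```python
def bisect_zero(f, a, b, tol=1e-10):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("bisection needs f(a) < 0 < f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid
        if value > 0:
            b = mid          # a zero lies in [a, mid] by Corollary 2.3.38
        else:
            a = mid          # a zero lies in [mid, b] by Corollary 2.3.38
    return (a + b) / 2

def f(x):
    """Assumed form of the polynomial of Example 2.3.39."""
    return x**3 + x + 1

root = bisect_zero(f, -1.0, 0.0)
print(root, f(root))
```

Each pass through the loop halves the interval while keeping a sign change at its ends, so the first step replaces $[-1, 0]$ by $[-1, -0.5]$ exactly as in the remark. The existence of the zero is guaranteed by the corollary; the code only narrows down its location.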
Note also that the function $f$ from Example 2.3.42 has an infinite number of zeros, located at $1/(n\pi)$ for $n \in \mathbb{N}^*$.

Fig. 24: Graph of $f$ from Example 2.3.42.

A useful property of continuous functions for theoretical investigations such as Theorem 2.3.44 below is that they map intervals of $\mathbb{R}$ that are contained in their domain onto intervals of $\mathbb{R}$.

Theorem 2.3.43. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then the range of $f$ is given by
\[ f([a, b]) = [\alpha, \beta] \tag{2.3.14} \]
for some $\alpha, \beta \in \mathbb{R}$ such that $\alpha \leq \beta$.

Proof. Denote by $\alpha$, $\beta$ the minimum value and the maximum value of $f$, respectively, which exist according to Theorem 2.3.33. Then for every $x \in [a, b]$
\[ \alpha \leq f(x) \leq \beta . \]
Further, let $x_m, x_M \in [a, b]$ be such that $f(x_m) = \alpha$ and $f(x_M) = \beta$, respectively. Finally, denote by $I$ the interval $[x_m, x_M]$ if $x_m \leq x_M$ and $[x_M, x_m]$ if $x_M < x_m$. Then the restriction $f|_I$ of $f$ to $I$ is continuous and, according to Theorem 2.3.37 (applied to the function $-f|_I$ if $x_M < x_m$), every value of $[\alpha, \beta]$ is in its range.

Intuitively, for instance, as a consequence of Theorem 2.2.44, it is to be expected that the inverse of an injective continuous function is itself continuous. Indeed, this is true.

Theorem 2.3.44. Let $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, be continuous and strictly increasing, i.e., for all $x_1, x_2 \in [a, b]$ such that $x_1 < x_2$ it follows that $f(x_1) < f(x_2)$. Then the inverse function $f^{-1}$ is continuous, too.

Proof. From the property that $f$ is strictly increasing, it follows that $f$ is also injective. Further, from Theorem 2.3.43 the existence of $\alpha, \beta \in \mathbb{R}$ follows such that the range of $f$ is given by $[\alpha, \beta]$ and hence that $f^{-1} : [\alpha, \beta] \to [a, b]$. Now let $y$ be some element of $[\alpha, \beta]$ and $y_1, y_2, \dots$ be some sequence of elements of $[\alpha, \beta]$ that is converging to $y$, but such that $f^{-1}(y_1), f^{-1}(y_2), \dots$ is not converging to $f^{-1}(y)$. Then there is an $\varepsilon > 0$ along with a subsequence $y_{n_1}, y_{n_2}, \dots$ of $y_1, y_2, \dots$
such that
\[ |f^{-1}(y_{n_k}) - f^{-1}(y)| \geq \varepsilon \tag{2.3.15} \]
for all $k \in \mathbb{N}^*$. According to the Bolzano-Weierstrass theorem, Theorem 2.3.18, there is a subsequence $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ of $y_{n_1}, y_{n_2}, \dots$ such that
\[ \lim_{l \to \infty} f^{-1}(y_{n_{k_l}}) = x \tag{2.3.16} \]
for some $x \in [a, b]$. Hence it follows by the continuity of $f$ that
\[ \lim_{l \to \infty} y_{n_{k_l}} = f(x) \]
and $y = f(x)$, since $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ is also convergent to $y$, but from (2.3.15) it follows by (2.3.16) that
\[ x \neq f^{-1}(y) , \]
which, since $f$ is injective, leads to the contradiction that
\[ y \neq f(x) . \]
Hence such $y$ and sequence $y_1, y_2, \dots$ don't exist and $f^{-1}$ is continuous.

In the case of sequences, the limit laws, see Theorem 2.3.4, stated that sums, products and quotients (if defined) of convergent sequences are convergent to the corresponding sum, product, quotient (if defined) of their limits. A typical application of these limit laws consisted in the decomposition of a given sequence into sums, products, quotients of sequences whose convergence is already known. Then the application of the limit laws proved the convergence of the sequence and allowed the calculation of its limit if the limits of those constituents are known.

Theorems similar in structure to that of the limit laws for sequences hold for continuous functions and are given below. Sums, products, quotients (wherever defined) and compositions of continuous functions are continuous. Indeed, this is a simple consequence of the limit laws, Theorem 2.3.4, and the definition of continuity. According to Theorem 2.3.44 the same is true for the inverse of an injective continuous function. A typical application of the thus obtained theorems consists in the decomposition of a given function into sums, products, quotients, compositions and inverses of functions whose continuity is already known. Then the application of those theorems proves the continuity of that function. In this way, the proof of continuity of a given function is greatly simplified and, usually, obvious.
Therefore, in such obvious cases in future, the continuity of the function will be just stated, but not explicitly proved. Definition 2.3.45. Let f1 : D1 Ñ R, f2 : D2 Ñ R be functions such that D1 XD2 φ. Moreover, let a P R. Then we define pf1 f2 q : D1 XD2 Ñ R (read: ‘f plus g’) and a f1 : D1 Ñ R (read: ‘a times f ’) by pf1 f2 qpxq : f1 pxq f2 pxq for all x P D1 X D2 and pa f1qpxq : a f1pxq for all x P D1 . Theorem 2.3.46. Let f1 : D1 Ñ R, f2 : D2 Ñ R be functions such that D1 X D2 φ. Moreover let a P R. Then it follow by Theorem 2.3.4 that 102 (i) if f1 and f2 are both continuous in x continuous in x, too, P D1 X D2, then f1 f2 is (ii) if f1 is continuous in x P D1 , then a f1 is continuous in x, too. Definition 2.3.47. Let f1 : D1 Ñ R, f2 : D2 Ñ R be functions such that D1 X D2 φ. Then we define f1 f2 : D1 X D2 Ñ R (read: ‘f1 times f2 ’) by pf1 f2qpxq : f1pxq f2pxq for all x P D1 X D2 . If moreover Ranpf1 q D1 Ñ R (read: ‘1 over f1 ’) by R, then we define 1{f1 : p1{f1qpxq : 1{f1pxq for all x P D1 . Theorem 2.3.48. Let f1 : D1 D1 X D2 φ. Ñ R, f2 : D2 Ñ R be functions such that (i) If f1 and f2 are both continuous in x continuous in x, too. P D1 X D2, then f1 f2 is (ii) If f1 is such that Ranpf1 q R as well as continuous in x P D1 , then 1{f1 is continuous in x, too. Proof. For the proof of (i), let x1 , x2 , . . . be some sequence in D1 which converges to x. Then for any ν P N X D2 |pf1 f2qpxν q pf1 f2qpxq| |f1pxν qf2pxν q f1pxqf2pxq| |f1pxν qf2pxν q f1pxqf2pxν q f1pxqf2pxν q f1pxqf2pxq| ¤ |f1pxν q f1pxq| |f2pxν q| |f1pxq| |f2pxν q f2pxq| ¤ |f1pxν q f1pxq| |f2pxν q f2pxq| |f1pxν q f1pxq| |f2pxq| |f1pxq| |f2pxν q f2pxq| and hence, obviously, lim Ñ8pf1 f2 qpxν q pf1 f2 qpxq . ν 103 For the proof of (ii), let x1 , x2 , . . . be some sequence in D1 which converges to x. Then for any ν P N |p1{f1qpxν q p1{f1qpxq| |1{f1pxν q 1{f1pxq| |f1pxν q f1pxq|{r |f1pxν q| |f1pxq| s and hence, obviously, lim p1{f1 qpxν q p1{f1 qpxq . 
In the following, we give two examples for the application of Theorem 2.3.46 and Theorem 2.3.48.

Example 2.3.49. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers. Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$ defined by
$$p(x) := a_0 + a_1 x + \dots + a_n x^n$$
for all $x \in \mathbb{R}$, is continuous.

Proof. The proof is a simple consequence of Example 2.3.29, Theorem 2.3.46 and Theorem 2.3.48.

Example 2.3.50. Explain why the function
$$f(x) := \frac{x^3 + 2x^2 + x + 1}{x^2 - 3x + 2} \tag{2.3.17}$$
is continuous at every number in its domain. State that domain.

Solution: The domain $D$ is given by those real numbers for which the denominator of the expression (2.3.17) is different from 0. Hence it is given by
$$D = \mathbb{R} \setminus \{1, 2\} .$$
Further, as a consequence of Example 2.3.49, the polynomials $p_1 : \mathbb{R} \to \mathbb{R}$, $p_2 : D \to \mathbb{R}$ defined by
$$p_1(x) := x^3 + 2x^2 + x + 1 , \quad p_2(x) := x^2 - 3x + 2$$
for all $x \in \mathbb{R}$ and $x \in D$, respectively, are continuous. Since $p_2(D) \subset \mathbb{R} \setminus \{0\}$, it follows by Theorem 2.3.48 that the function $1/p_2$ is continuous. Finally from this, it follows by Theorem 2.3.48 that $p_1/p_2 = p_1 \cdot (1/p_2)$ is continuous.

Theorem 2.3.51. Let $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}$ be functions and $D_g$ be a subset of $\mathbb{R}$. Moreover let $x \in D_f$, $f(x) \in D_g$, $f$ be continuous in $x$ and $g$ be continuous in $f(x)$. Then $g \circ f$ is continuous in $x$.

Proof. For this, let $x_1, x_2, \dots$ be a sequence in $D(g \circ f)$ converging to $x$. Then $f(x_1), f(x_2), \dots$ is a sequence in $D_g$. Moreover, since $f$ is continuous in $x$, it follows that
$$\lim_{\nu \to \infty} f(x_\nu) = f(x) .$$
Finally, since $g$ is continuous in $f(x)$, it follows that
$$\lim_{\nu \to \infty} (g \circ f)(x_\nu) = \lim_{\nu \to \infty} g(f(x_\nu)) = g(f(x)) = (g \circ f)(x) .$$

Example 2.3.52. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := |x|$$
for all $x \in \mathbb{R}$, is continuous.

Solution: Define the polynomial $p_2 : \mathbb{R} \to \mathbb{R}$ by $p_2(x) := x^2$ for every $x \in \mathbb{R}$. According to Example 2.3.49, $p_2$ is continuous. Then $f = s_2 \circ p_2$, where $s_2$ denotes the square-root function on $[0, \infty)$, which, by Theorem 2.3.44, is continuous as inverse of the strictly increasing restriction of $p_2$ to $[0, \infty)$. Hence $f$ is continuous by Theorem 2.3.51.
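The decomposition of the modulus function into the square-root function and the polynomial $x \mapsto x^2$ from Example 2.3.52 can also be explored numerically. The following Python sketch is an illustration only and not part of the proof. The names `p2`, `s2`, `compose` and `sequence_gaps` are assumptions made for this example, and $x_\nu = x + 10^{-\nu}$ is just one of the many sequences that converge to $x$.

```python
import math


def p2(x):
    """The polynomial p2(x) = x^2."""
    return x * x


def s2(y):
    """The square-root function on [0, infinity)."""
    return math.sqrt(y)


def compose(g, f):
    """Return the composition g o f as a new function."""
    return lambda x: g(f(x))


# The modulus function written as the composition s2 o p2.
f = compose(s2, p2)


def sequence_gaps(func, x, count=6):
    """Return |func(x_nu) - func(x)| along x_nu = x + 10**(-nu), nu = 1..count."""
    return [abs(func(x + 10.0 ** (-nu)) - func(x)) for nu in range(1, count + 1)]


for x in (-2.0, 0.0, 1.5):
    print(x, f(x), sequence_gaps(f, x))
```

Shrinking gaps along a single sequence do not prove continuity, since the definition demands this for every sequence converging to $x$. That general statement is what Theorem 2.3.51 provides.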
Example 2.3.53. The functions $\sin : \mathbb{R} \to \mathbb{R}$ and $\exp : \mathbb{R} \to \mathbb{R}$ are continuous. Show that $\arcsin : [-1, 1] \to [-\pi/2, \pi/2]$, $\cos : \mathbb{R} \to \mathbb{R}$, $\arccos : [-1, 1] \to [0, \pi]$, $\tan : (-\pi/2, \pi/2) \to \mathbb{R}$, $\arctan : \mathbb{R} \to (-\pi/2, \pi/2)$ and the natural logarithm function $\ln : (0, \infty) \to \mathbb{R}$ are continuous.

Solution: Since the restriction of $\sin$ to $[-\pi/2, \pi/2]$ and $\exp$ are strictly increasing and hence injective, their inverses $\arcsin$ and $\ln$ are continuous according to Theorem 2.3.44. Further, since
$$\cos(x) = \sin\left(x + \frac{\pi}{2}\right)$$
for all $x \in \mathbb{R}$, the cosine function is continuous as composition of continuous functions according to Theorem 2.3.51. Further, the restriction of $\cos$ to $[0, \pi]$ is strictly decreasing, hence injective, and therefore its inverse $\arccos$ is continuous according to Theorem 2.3.44. Also, $\tan : \mathbb{R} \setminus \{k\pi + (\pi/2) : k \in \mathbb{Z}\} \to \mathbb{R}$ defined by
$$\tan(x) := \frac{\sin(x)}{\cos(x)}$$
for every $x \in \mathbb{R} \setminus \{k\pi + (\pi/2) : k \in \mathbb{Z}\}$ is continuous according to Theorem 2.3.48 as quotient of continuous functions. Finally, the restriction of $\tan$ to $(-\pi/2, \pi/2)$ is strictly increasing and hence its inverse $\arctan$ is continuous according to Theorem 2.3.44.

Fig. 25: Graph of sin, arcsin and asymptotes.

Fig. 26: Graph of cos, arccos and asymptotes.

Fig. 27: Graph of tan, arctan and asymptotes.

Fig. 28: Graph of exp, ln.

Fig. 29: Sketch for Example 2.3.54. The dots in the corners B and F indicate right angles.

It is not uncommon that, in a first step, in the definition of a continuous function $f$ certain real numbers have to be excluded from the domain since the expression used for the definition is not defined in those points. Such points are called singularities of $f$, although they are not part of the domain of $f$. Most frequent is the case that the definition in a point would involve division by 0. Since this division is not defined, that point has to be excluded from
the domain of $f$. In particular in applications, singularities of functions are points of interest. For instance, in physics they often signal the breakdown of theories at such locations. In case that there is a continuous function $\hat{f}$ whose restriction to the domain of $f$ coincides with $f$ and whose domain, in addition, contains a singularity of $f$, then that singularity is called removable and $\hat{f}$ a continuous extension of $f$. If $x_s \in \mathbb{R}$ is a singularity of $f$ and if there is a sequence $x_1, x_2, \dots$ in the domain of $f$ that is convergent to $x_s$, then it follows by the assumed continuity of $\hat{f}$ that
$$\lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \hat{f}(x_n) = \hat{f}(x_s)$$
and hence that every continuous extension of $f$ containing $x_s$ in its domain assumes the same value in $x_s$. Continuous functions with singularities that are not removable are easy to construct. For instance, $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ defined by $f(x) := 1/x$ has a singularity at $x = 0$ and the sequence
$$f(1/1), f(1/2), f(1/3), \dots$$
diverges. Since
$$\lim_{n \to \infty} \frac{1}{n} = 0 ,$$
it follows that there is no continuous extension of $f$. The following is an often appearing case of a removable singularity.

Fig. 30: Graphs of f (red) and h (blue) from Example 2.3.54.

Example 2.3.54. (Removable singularities) Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := \frac{\sin(x)}{x}$$
for every $x \in \mathbb{R} \setminus \{0\}$ and $f(0) := 1$. Then $f$ is continuous.

Proof: By Theorem 2.3.48, the continuity of $\sin$ and of the linear function $p : \mathbb{R} \to \mathbb{R}$, defined by $p(x) := x$, $x \in \mathbb{R}$, see Example 2.3.29, the continuity of $f$ in all points of $\mathbb{R} \setminus \{0\}$ follows. The proof that $f$ is also continuous in $x = 0$ follows from the following inequality (compare Fig. 30):
$$\left| \frac{\sin(x)}{x} - 1 \right| \leq \frac{1}{\cos(x)} - 1 , \tag{2.3.18}$$
for all $x \in (-\pi/2, \pi/2) \setminus \{0\}$. For its derivation and in a first step, we assume that $0 < x < \pi/2$ and consider the triangle $ADF$ in Fig. 29, in particular the areas $A(ABF)$, $A(ACF)$ and $A(ADF)$ of the triangle $ABF$, the circular sector $ACF$ and the triangle $ADF$, respectively. Then we have the following relation:
$$A(ABF) \leq A(ACF) \leq A(ADF)$$
and hence
$$\frac{1}{2} \sin(x) \cos(x) \leq \frac{x}{2} \leq \frac{\tan(x)}{2}$$
and
$$\cos(x) \leq \frac{\sin(x)}{x} \leq \frac{1}{\cos(x)} .$$
From this follows, by the symmetries of $\sin$, $\cos$ under sign change of the argument, the same inequality for $-\pi/2 < x < 0$. Hence for $x \in (-\pi/2, \pi/2) \setminus \{0\}$:
$$\frac{\sin(x)}{x} - 1 \leq \frac{1}{\cos(x)} - 1$$
and
$$1 - \frac{\sin(x)}{x} \leq 1 - \cos(x) \leq \frac{1}{\cos(x)} - 1$$
and hence finally (2.3.18). Now since $h : (-\pi/2, \pi/2) \to \mathbb{R}$ defined by
$$h(x) := \frac{1}{\cos(x)} - 1$$
for all $x \in (-\pi/2, \pi/2)$ is continuous, the continuity of $f$ also in $x = 0$ follows by (2.3.18) and Theorem 2.3.10.

Remark 2.3.55. The alert reader might have noticed that geometric intuition was used in the derivation of the inequality (2.3.18) that is also used further on, although such intuition is no longer admitted in proofs. Indeed, this could be avoided by introducing the sine and cosine functions by their power series expansions, see Example 3.4.27 from Calculus II, but this would take us too far off course.

Often, in particular in applications, functions occur that are defined on unbounded intervals of the real numbers. For instance, such appear in the description of the frequently occurring physical systems of infinite extension, like the motion of planets and comets around the sun. In such cases the behavior of the function near $+\infty$ and/or $-\infty$ is of interest. Such study would be much simplified if $+\infty$ and $-\infty$ were part of the real numbers, which is not the case. But there is a simple method to reduce the discussion of the behavior of a function near $+\infty$ and/or $-\infty$ to that of a related function near 0. It is based on the fact that the auxiliary function $h := (\mathbb{R} \setminus \{0\} \to \mathbb{R}, x \mapsto 1/x)$ maps large positive real numbers to small positive numbers and large negative real numbers to small negative numbers. Hence the behavior of a function $f$ near $\pm\infty$ is completely determined by the behavior of the function $\bar{f} := f \circ h$ near 0. This fact provides a simple method for the calculation of limits at infinity.

Theorem 2.3.56. (Limits at infinity) Let $a > 0$ and $L$ be some real number.

(i) If $f : [a, \infty) \to \mathbb{R}$ is continuous, then
$$\lim_{x \to \infty} f(x) = L$$
if and only if the transformed function $\bar{f} : [0, 1/a] \to \mathbb{R}$ defined by
$$\bar{f}(x) := f(1/x)$$
for all $x \in (0, 1/a]$ and $\bar{f}(0) := L$ is continuous in 0. In this case, we call the parallel to the $x$-axis through $(0, L)$ a 'horizontal asymptote of $G(f)$ for large positive $x$'.

(ii) If $f : (-\infty, -a] \to \mathbb{R}$ is continuous, then
$$\lim_{x \to -\infty} f(x) = L$$
if and only if the transformed function $\bar{f} : [-1/a, 0] \to \mathbb{R}$ defined by
$$\bar{f}(x) := f(1/x)$$
for all $x \in [-1/a, 0)$ and $\bar{f}(0) := L$ is continuous in 0. In this case, we call the parallel to the $x$-axis through $(0, L)$ a 'horizontal asymptote of $G(f)$ for large negative $x$'.

Proof. "(i)": If
$$\lim_{x \to \infty} f(x) = L , \tag{2.3.19}$$
we conclude as follows. Let $x_1, x_2, \dots$ be a sequence in $(0, 1/a]$ that is convergent to 0. As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that
$$x_n = |x_n| \leq \frac{1}{m + 1}$$
for all $n \in \mathbb{N}$ such that $n \geq N$. This implies that
$$\frac{1}{x_n} \geq m + 1 \geq m$$
for all $n \in \mathbb{N}$ such that $n \geq N$. Hence it follows from (2.3.19) that
$$L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar{f}(x_n) .$$
Obviously, this also implies that
$$\lim_{n \to \infty} \bar{f}(x_n) = L$$
for sequences $x_1, x_2, \dots$ in $[0, 1/a]$ that are convergent to 0 and hence that $\bar{f}$ is continuous in 0. On the other hand, if $\bar{f}$ is continuous in 0, we conclude as follows. Let $x_1, x_2, \dots$ be a sequence in $[a, \infty)$ which contains only finitely many members that are $\leq m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that
$$x_n > m + 1$$
for all $n \in \mathbb{N}$ satisfying $n \geq N$. This also implies that
$$\left| \frac{1}{x_n} \right| = \frac{1}{x_n} < \frac{1}{m + 1}$$
for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that
$$\lim_{n \to \infty} \frac{1}{x_n} = 0$$
and hence by the continuity of $\bar{f}$ in 0 that
$$L = \lim_{n \to \infty} \bar{f}\left( \frac{1}{x_n} \right) = \lim_{n \to \infty} f(x_n) .$$
Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.19) follows.

"(ii)": The proof is analogous to that of (i). If
$$\lim_{x \to -\infty} f(x) = L , \tag{2.3.20}$$
we conclude as follows. Let $x_1, x_2, \dots$ be a sequence in $[-1/a, 0)$ that is convergent to 0.
As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that
$$x_n = -|x_n| \geq -\frac{1}{m + 1}$$
for all $n \in \mathbb{N}$ such that $n \geq N$. This implies that
$$\frac{1}{x_n} \leq -(m + 1) \leq -m$$
for all $n \in \mathbb{N}$ such that $n \geq N$. Hence it follows from (2.3.20) that
$$L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar{f}(x_n) .$$
Obviously, this also implies that
$$\lim_{n \to \infty} \bar{f}(x_n) = L$$
for sequences $x_1, x_2, \dots$ in $[-1/a, 0]$ that are convergent to 0 and hence that $\bar{f}$ is continuous in 0. On the other hand, if $\bar{f}$ is continuous in 0, we conclude as follows. Let $x_1, x_2, \dots$ be a sequence in $(-\infty, -a]$ which contains only finitely many members that are $\geq -m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that
$$x_n < -(m + 1)$$
for all $n \in \mathbb{N}$ satisfying $n \geq N$. This also implies that
$$\left| \frac{1}{x_n} \right| = -\frac{1}{x_n} < \frac{1}{m + 1}$$
for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that
$$\lim_{n \to \infty} \frac{1}{x_n} = 0$$
and hence by the continuity of $\bar{f}$ in 0 that
$$L = \lim_{n \to \infty} \bar{f}\left( \frac{1}{x_n} \right) = \lim_{n \to \infty} f(x_n) .$$
Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.20) follows.

Fig. 31: G(f) and asymptote for Example 2.3.57.

Example 2.3.57. Consider the function $f : [1, \infty) \to \mathbb{R}$ defined by
$$f(x) := \frac{x^2 - 1}{x^2 + 1}$$
for all $x \in [1, \infty)$. Then the transformed function $\bar{f} : [0, 1] \to \mathbb{R}$, defined by
$$\bar{f}(x) := \frac{1 - x^2}{1 + x^2}$$
for all $x \in [0, 1]$, is continuous and hence, since $\bar{f}(x) = f(1/x)$ for all $x \in (0, 1]$ and $\bar{f}(0) = 1$, it follows that
$$\lim_{x \to \infty} f(x) = 1 .$$
Hence $y = 1$ is a horizontal asymptote of $G(f)$ for large positive $x$. See Fig. 31.

Fig. 32: G(f) and asymptotes for Example 2.3.58.

Example 2.3.58. Find the limits
$$\lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} .$$
Solution: Define $f : \{x \in \mathbb{R} : x \neq 5/3\} \to \mathbb{R}$ by
$$f(x) := \frac{\sqrt{2x^2 + 1}}{3x - 5}$$
for all $x \in \mathbb{R}$ with $x \neq 5/3$. For every $x \neq 0$ such that $1/x \neq 5/3$, we have
$$f(1/x) = \frac{x}{|x|} \cdot \frac{\sqrt{2 + x^2}}{3 - 5x} .$$
Since $5/3 < 2$, the restrictions of $f$ to $[2, \infty)$ and $(-\infty, -2]$ are continuous, and the corresponding transformed functions $\bar{f}$ are given by the continuous functions
$$\bar{f}(x) := \frac{\sqrt{2 + x^2}}{3 - 5x}$$
for all $x \in [0, 1/2]$ and
$$\bar{f}(x) := -\frac{\sqrt{2 + x^2}}{3 - 5x}$$
for all $x \in [-1/2, 0]$, respectively, and hence by Theorem 2.3.56
$$\lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = \frac{\sqrt{2}}{3} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = -\frac{\sqrt{2}}{3} .$$
3 See Fig. 32. Problems 1) Show the continuity of the function f . For this, use only Theorems 2.3.46, 2.3.48, 2.3.51 on sums, products/quotients, compositions of continuous functions, and the continuity of constant functions/the identity function idR on R. a) b) c) d) e) f pxq : x 7 , x P R , f pxq : x2 , x P R , f pxq : 3{x , x P R , f pxq : px 3q{px 8q , x P R zt8u , f pxq : px2 3x 2q{px2 2x 2q , x P R . 2) Assume that f and g are continuous functions in x f p0q 2 and lim r2f pxq 3g pxqs 1 . Calculate g p1q. 0 such that Ñ0 x In the following, it can be assumed that rational functions, i.e., quotients of polynomial functions, are continuous on their domain of definition. In addition, it can be assumed that the exponential function, the natural logarithm function, the general power function, the sine and cosine function and the tangent function are continuous. 3) For arbitrary c, d P R, define fc,d : R Ñ R by fc,d pxq : $ 2 ' &1 x {p 1q if x P p8, 1q cx d if x P r1, 1s ' %? 4x 5 if x P p1, 8q for all x P R. Determine c, d such that the corresponding fc,d is everywhere continuous. Give reasons. 117 4) For arbitrary c P R, define fc : r0, 8q Ñ R by fc pxq : # x sinp1{xq if x P p0, 8q c if x 0 for all x P r0, 8q. Determine c such that the corresponding fc is everywhere continuous. Explain your answer. 5) For every k P R, define fk : r1{3, 8q zt1u Ñ R by fk pxq : #? ? if x P r1{3, 8q zt1u . if x 1 3x 1 2x 2 x 1 k For what value of k is fk continuous? Give explanations. 6) Define the function f : R Ñ R by f pxq : x4 10x 15 for all real x. Use your calculator to find an interval of length 1{100 which contains a zero of f (i.e, some real x such that f pxq 0). Give explanations. 7) Determine in each case whether the given sequence has a limit. If there is one, calculate that limit. Otherwise, give arguments why there is no limit. xn : a) ? n ? 1 n , xn : b) ? ?n n 1 for all n P N. 8) Find the limits. 
a) lim e1{n n Ñ8 cospnq c) lim nÑ8 n lnpnq d) lim nÑ8 n 1{n e) lim n n Ñ8 f) lim n a Ñ8 n2 , b) lim cos n Ñ8 1 n π , ? , Hint: Use that lnpnq ¤ 2 n for n ¥ 1 . , Hint: Use d) 6n n , 1{3 1{3 Ñ8 n pn 1q g) lim n n Hint: Use that a b a2 , a3 b3 for all a b ab b2 118 , h) lim Ñ0 p1 h hq2{3 1 h , Hint: Use the hint in g) . 9) Calculate the limits. 17n 4 a) lim sin , b) lim tan nÑ8 xÑ2 n 5 2 x 8x 15 c) lim . xÑ5 x5 3x 5x 2 7 , In each case, give explanations. 10) Find the limits a) b) c) d) e) f) g) h) i) limxÑ8 rx{px 1qs , limxÑ8 rx{px 1qs , limxÑ8 rpsin xq{xs , limxÑ8 rp3x3 2x2 5x 4q { p2x3 x2 ? limxÑ8 p x{ 1 x2 q , ? limxÑ8 p x{ 1 x2 q , ? ? limxÑ8 p 3x2 2x { 2x2 5 q , ?2 ?2 limxÑ8 p x ? 2 3 x ?1 q ,2 limxÑ8 p x 4x 5 x 2q. 11) Define f : R Ñ R by f pxq : x 5qs , # x if x is rational 0 if x is irrational for every x P R. Find the points of discontinuity of f . 12) Define f : R Ñ R by f pxq : 0 if x 0 or if x is irrational and f pm{nq : 1{n if m P Z and n P N have no common divisor greater than 1. Find the points of discontinuity of f . 13) Let f and g be functions from R to R whose restrictions to Q coincide. Show that f g. 14) Let D R, f : D Ñ R be continuous in some x P D and f pxq ¡ 0. By an indirect proof, show that there is ε ¡ 0 such that f pxq ¡ 0 for all x P D X px ε, x εq. 15) Let a, b P R such that a b and f : ra, bs Ñ ra, bs. By use of the intermediate value theorem, show that f has a fixed point, i.e., that there is x P ra, bs such that f pxq x. 119 16) Use the intermediate value theorem to prove that for every a ¥ 0 there is a uniquely determined x ¥ 0 such that x2 a. That x is ? denoted by a. 120 y 0.25 0.5 1 x Fig. 33: Graph of A and its point with maximum ordinate. 2.4 Differentiation Possibly, the first mathematician to use the derivative concept in some implicit form is Pierre de Fermat in his calculation of maximum / minimum ordinate values of curves in Cartesian coordinate systems and in his way of determination of tangents at the points of curves. 
The first may be due to the observation that the ordinate values of a curve near a maximum (or a minimum) change very little near the abscissa of its location, differently to other points of the curve. It is not clear whether this was his real motivation because he never published his method, but only described it in communications to other mathematicians from 1637 onwards. Also in these instances, he did not explain its logical basis so that its general validity was quickly questioned. On the other hand, his procedure suggests that observation as the basis of the method. For display of the method, he considers the problem of finding the maximal area of a rectangle with perimeter $2b$ where $b > 0$. If $x \geq 0$ denotes the width of such a rectangle, the corresponding area is given by
$$A(x) := x (b - x) ,$$
see Fig. 33. If $(x_0, A(x_0))$ is the point of $G(A)$ with maximal ordinate, then
$$A(x_0 + h) = (x_0 + h)(b - x_0 - h) = x_0 (b - x_0) + h (b - 2x_0 - h) = A(x_0) + h (b - 2x_0 - h) \approx A(x_0) \tag{2.4.1}$$
for $h$ such that $x_0 + h \in D(A) = [0, b]$ and of small absolute value, where $\approx$ means 'approximately'. Hence if $h \neq 0$,
$$b - 2x_0 - h \approx 0 .$$
By neglecting the term $h$ on the left hand side of the last relation, he arrives at the equation
$$b - 2x_0 = 0 \tag{2.4.2}$$
and hence at $x_0 = b/2$ which gives $A(x_0) = b^2/4$. Indeed, the rectangle with perimeter $2b$ of maximal area is given by a square with sides $b/2$. We note that from (2.4.1) it follows that
$$\frac{A(x_0 + h) - A(x_0)}{h} = b - 2x_0 - h$$
if $h \neq 0$. Hence the equation (2.4.2) is equivalent to the demand that
$$\lim_{h \to 0, h \neq 0} \frac{A(x_0 + h) - A(x_0)}{h} = 0$$
where the addition of $h \neq 0$ in the limit symbol indicates that only sequences with non-vanishing members are admitted. In modern calculus / analysis, the limit on the left of the last equation is called the derivative of $A$ in $x_0$ and is denoted by $A'(x_0)$. Hence in modern terms, Fermat demands that $A'(x_0) = 0$.
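Fermat's observation can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names `area` and `difference_quotient` and the sample value `b = 2.0` are chosen for this example and do not come from the text. It evaluates the quotient $(A(x_0 + h) - A(x_0))/h$, which in exact arithmetic equals $b - 2x_0 - h$, once at the square $x_0 = b/2$ and once at another width.

```python
def area(x, b):
    """Area A(x) = x (b - x) of a rectangle with width x and perimeter 2b."""
    return x * (b - x)


def difference_quotient(func, x0, h):
    """Return (func(x0 + h) - func(x0)) / h for a non-vanishing h."""
    if h == 0:
        raise ValueError("h must be non-zero")
    return (func(x0 + h) - func(x0)) / h


b = 2.0  # arbitrary sample value for half the perimeter


def A(x):
    """Area function for the sample value of b."""
    return area(x, b)


for h in (0.1, 0.01, 0.001):
    # In exact arithmetic the quotient equals b - 2*x0 - h.
    at_square = difference_quotient(A, b / 2, h)  # exact value: -h
    elsewhere = difference_quotient(A, b / 4, h)  # exact value: b/2 - h
    print(f"h={h}: x0=b/2 -> {at_square:.6f}, x0=b/4 -> {elsewhere:.6f}")
```

At $x_0 = b/2$ the quotient shrinks together with $h$, whereas at $x_0 = b/4$ it stays near $b/2$. This matches the remark that ordinate values change very little near the abscissa of a maximum.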
Indeed, the vanishing of the derivative in a point is necessary, but not sufficient, for a (differentiable) function to assume an extremum, i.e., a minimum or maximum value, in that point, see Theorem 2.5.1.

Fermat uses a similar method for the determination of tangent lines to curves. To a greater extent, such were not studied until the middle of the 17th century. Apart from Archimedes' construction of tangent lines to his spiral, in ancient Greece, tangents were constructed only in few simple cases, namely for ellipses, parabolas and hyperbolas where they were defined as lines that touch the curve in only one point. In general, this definition is too imprecise. In particular, the concept of differentiation also gives a precise meaning to tangent lines to curves.

Fig. 34: Depiction to Fermat's method of determination of tangents.

For the description of Fermat's method, we consider Fig. 34 which displays the graph of a function $f$ together with its tangent at the point $(a, f(a))$ and the normal to the tangent in this point. By definition, the tangent goes through the point $(a, f(a))$ and hence is determined once we know the location of its intersection $(a - c, 0)$ with the $x$-axis where $c$ is the unknown. For the determination of $c$, Fermat considers the triangles with corners $(a - c, 0)$, $(a, 0)$, $(a, f(a))$ and $(a - c, 0)$, $(a + h, 0)$, $(a + h, f(a + h))$ to be approximately similar in the case of a tangent. These triangles would be exactly similar only if the point $(a + h, f(a + h))$ lay on the tangent. In general, the error of the approximation becomes smaller with smaller $h$. The approximation gives the relation
$$\frac{f(a + h)}{f(a)} \approx \frac{c + h}{c}$$
or
$$(c + h) f(a) \approx c f(a + h) .$$
The last gives
$$c \approx \frac{h f(a)}{f(a + h) - f(a)} .$$
If $f$ is explicitly given, Fermat proceeds further by performing the division and neglecting $h$ as in his previous method. For instance if $f(x) = x^2$ for all $x \in \mathbb{R}$, then
$$\frac{h f(a)}{f(a + h) - f(a)} = \frac{h a^2}{(a + h)^2 - a^2} = \frac{h a^2}{2ah + h^2} = \frac{a^2}{2a + h}$$
which leads to $c = a/2$.
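The behavior of Fermat's approximate subtangent for shrinking $h$ can be tabulated with a few lines of Python. This is a hedged sketch, not Fermat's own procedure: the function names `subtangent_estimate` and `square` and the sample point `a = 3.0` are assumptions made for this example. For $f(x) = x^2$ the estimate equals $a^2/(2a + h)$ in exact arithmetic.

```python
def subtangent_estimate(func, a, h):
    """Fermat's approximate subtangent c = h f(a) / (f(a + h) - f(a))."""
    increment = func(a + h) - func(a)
    if increment == 0:
        raise ZeroDivisionError("f(a + h) equals f(a); choose another h")
    return h * func(a) / increment


def square(x):
    """The sample function f(x) = x^2 used in the text."""
    return x * x


a = 3.0  # arbitrary sample abscissa
for h in (1.0, 0.1, 0.001):
    # For f(x) = x^2 the exact value is a^2 / (2a + h), which tends to a/2.
    print(h, subtangent_estimate(square, a, h))
```

For the sample point the printed values approach $a/2 = 1.5$ as $h$ decreases, in agreement with the value obtained by neglecting $h$.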
Indeed, this is the correct result. Also, using modern notation and assuming that
$$f'(a) := \lim_{h \to 0, h \neq 0} \frac{f(a + h) - f(a)}{h} \neq 0 ,$$
the following
$$c = \lim_{h \to 0, h \neq 0} \frac{h f(a)}{f(a + h) - f(a)} = \frac{f(a)}{f'(a)}$$
gives the correct result. As a side remark, in older literature, the directed line segment from $(a, f(a))$ to $(a - c, 0)$ is called the tangent line in $(a, f(a))$ and its projection onto the $x$-axis the corresponding subtangent. In addition, the directed line segment from $(a, f(a))$ to $(a + d, 0)$ is called the normal in $(a, f(a))$ and its projection onto the $x$-axis the corresponding subnormal.

When Fermat's method was reported to René Descartes by Marin Mersenne in 1638, Descartes attacked it as not generally valid. He proposed as a challenge the curve
$$C := \{(x, y) \in \mathbb{R}^2 : x^3 + y^3 = 3axy\} , \tag{2.4.3}$$
$a \in \mathbb{R}$, which since then is known as 'Folium of Descartes'. Indeed, Fermat's 'method' produced the right results, and ultimately Descartes conceded its validity.

Fig. 35: Folium of Descartes for the case $a = 1$, compare (2.4.3).

A further candidate for the first mathematician to use the derivative concept in some implicit form is Galileo Galilei. In 1589, using inclined planes, Galileo discovered experimentally that in vacuum all bodies, regardless of their weight, shape, or composition, are uniformly accelerated in exactly the same way, and that the fallen distance $s$ is proportional to the square of the elapsed time $t$:
$$s(t) = \frac{1}{2} g t^2 \tag{2.4.4}$$
for all $t \in \mathbb{R}$ where $g \approx 9.8\,\mathrm{m/sec^2}$ is the gravitational acceleration. This result was in contradiction to the generally accepted traditional theory of Aristotle that assumed that heavier objects fall faster than lighter ones. On the Third Day of his 'Discorsi' from 1638 [42], he discusses uniform and naturally accelerated motion. The idea that the velocity is the same as a derivative can be read between the lines. Even a recognition of the fundamental theorem of calculus, see Theorem 2.6.19, is visible in this special case.
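To make the law (2.4.4) concrete, the following Python sketch tabulates average speeds of a falling body over shrinking time intervals. It is an illustration under stated assumptions: the names `fallen_distance` and `average_speed` and the sample time `t = 2.0` are chosen for this example, and the value of `G` is the rounded value quoted above. In exact arithmetic, the average speed between the times $t$ and $t + h$ equals $g (t + h/2)$.

```python
G = 9.8  # gravitational acceleration in m/sec^2, rounded value from the text


def fallen_distance(t, g=G):
    """Fallen distance s(t) = g t^2 / 2 according to (2.4.4)."""
    return 0.5 * g * t * t


def average_speed(t, h, g=G):
    """Traveled distance divided by elapsed time between t and t + h, h != 0."""
    if h == 0:
        raise ValueError("h must be non-zero")
    return (fallen_distance(t + h, g) - fallen_distance(t, g)) / h


t = 2.0  # arbitrary sample time in seconds
for h in (1.0, 0.1, 0.001, -0.001):
    # In exact arithmetic the average speed equals g * (t + h / 2).
    print(h, average_speed(t, h))
```

For the sample time the printed values approach $g t = 19.6$ from above for positive $h$ and from below for negative $h$.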
A modern way of deduction would proceed, for instance, as follows. For this, we consider the average speed of a falling body described by (2.4.4), i.e., the traveled distance divided by the elapsed time, during the time interval $[t, t + h]$, if $h > 0$, and $[t + h, t]$, if $h < 0$, respectively, for some $t \in \mathbb{R}$ and $h \in \mathbb{R} \setminus \{0\}$. Then
$$\frac{s(t + h) - s(t)}{t + h - t} = \frac{g}{2h} \left[ (t + h)^2 - t^2 \right] = \frac{g}{2h} \left( 2ht + h^2 \right) = g \left( t + \frac{h}{2} \right) .$$
Hence it follows by Example 2.3.29 that
$$\lim_{h \to 0, h \neq 0} \frac{s(t + h) - s(t)}{t + h - t} = g t ,$$
which suggests itself as (and indeed is the) definition of the instantaneous speed $v(t)$ of the body at time $t$:
$$v(t) := s'(t) := \lim_{h \to 0, h \neq 0} \frac{s(t + h) - s(t)}{t + h - t} = g t .$$
For a geometrical interpretation of the limit
$$\lim_{h \to 0, h \neq 0} \frac{s(t + h) - s(t)}{t + h - t} ,$$
also in more general situations where $s$ is not necessarily given by (2.4.4), note that the quotient
$$\frac{s(t + h) - s(t)}{t + h - t}$$
gives the slope of the line segment ('secant') between the points $(t, s(t))$ and $(t + h, s(t + h))$ on the graph of $s$ for every $h \neq 0$. In the limit $h \to 0$ that slope approaches the slope of the tangent to $G(s)$ in the point $(t, s(t))$. Hence in particular, a geometrical interpretation of $s'(t) = v(t)$ is the slope of the tangent to $G(s)$ at the point $(t, s(t))$, see Fig. 36.

Fig. 36: G(s), secant line and tangent at (0.4, s(0.4)). Horizontal axis: t [sec], vertical axis: s(t) [m].

As for the definition of the continuity of functions, Cauchy, in his textbook 'Cours d'analyse' from 1821 [22] and by using Lagrange's notation, terminology and Lagrange's characterization of the derivative in terms of inequalities, was the first to give a definition of the derivative of a function based on limits which is very near to the modern definition. Still, his understanding of limits was different from the modern understanding. This was not without consequences. During the early 19th century, it resulted in the general belief that every continuous function is everywhere differentiable, except perhaps at finitely many points.
Even several 'proofs' of this 'fact' appeared during that time. Therefore, it came as a shock when in 1872 [99] Weierstrass proved the existence of a continuous function which is nowhere differentiable, see Example 3.4.13. For the first time, this result signaled the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.

Definition 2.4.1. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x \in (a, b)$ and $c \in \mathbb{R}$. We say $f$ is differentiable in $x$ with derivative $c$ if for all sequences $x_0, x_1, \dots$ in $(a, b) \setminus \{x\}$ which are convergent to $x$ it follows that
$$\lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} = c .$$
In this case, we define the derivative $f'(x)$ of $f$ in $x$ by
$$f'(x) := c .$$
Further, we say $f$ is differentiable if $f$ is differentiable in all points of its domain $(a, b)$. In that case, we call the function $f' : (a, b) \to \mathbb{R}$ associating to every $x \in (a, b)$ the corresponding $f'(x)$ the derivative of $f$. Higher order derivatives of $f$ are defined recursively. If $f^{(k)}$ is differentiable for $k \in \mathbb{N} \setminus \{0\}$, we define the derivative $f^{(k+1)}$ of order $k + 1$ of $f$ by
$$f^{(k+1)} := (f^{(k)})' ,$$
where we set $f^{(1)} := f'$. In that case, $f$ will be referred to as $(k + 1)$-times differentiable. Frequently, we also use the notation $f'' := f^{(2)}$ and $f''' := f^{(3)}$.

The differentiability of a function in a point of its domain implies also its continuity in that point. This is a simple consequence of the definition of differentiability and the limit laws, Theorem 2.3.4. That the converse is not true in general can be seen from Example 2.4.6 or Example 2.4.7. Moreover in Calculus II, we give an example of a continuous function which is not differentiable in any point of its domain, see Example 3.4.13.

Theorem 2.4.2. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable in $x \in (a, b)$. Then $f$ is also continuous in $x$.

Proof. Let $x_0, x_1, \dots$ be a sequence in $(a, b)$ which is convergent to $x$.
Obviously, it is sufficient to assume that $x_0, x_1, \dots$ is a sequence in $(a, b) \setminus \{x\}$. Then it follows by the limit laws, Theorem 2.3.4, that
$$\lim_{n \to \infty} (f(x_n) - f(x)) = \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} \cdot \lim_{n \to \infty} (x_n - x) = 0$$
and hence that
$$\lim_{n \to \infty} f(x_n) = f(x) .$$

Similar to the case of continuous functions, we shall see later on, see Theorems 2.4.8, 2.4.10, that sums, products, quotients (wherever defined) and compositions of differentiable functions are differentiable. Indeed, this is another simple consequence of the limit laws, Theorem 2.3.4, and the definition of differentiability. As usual, a typical application of those theorems consists in the decomposition of a given function into sums, products, quotients and compositions of functions whose differentiability is already known. Then the application of those theorems proves the differentiability of that function and allows the calculation of its derivative. To provide a basis for the application of those theorems, in the following, we prove the differentiability of some elementary functions, namely powers, the exponential function and the sine function, from the definition of differentiability and by use of their special properties. In this process, we also explicitly calculate the derivatives.

Example 2.4.3. Let $c \in \mathbb{R}$, $n \in \mathbb{N}$ and $f, g : \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) := c , \quad g(x) := x^n$$
for all $x \in \mathbb{R}$. Then $f, g$ are differentiable and
$$f'(x) = 0 , \quad g'(x) = n x^{n-1}$$
for all $x \in \mathbb{R}$.

Proof. Let $x \in \mathbb{R}$ and $x_0, x_1, \dots$ be a sequence of numbers in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Then:
$$\lim_{\nu \to \infty} \frac{f(x_\nu) - f(x)}{x_\nu - x} = \lim_{\nu \to \infty} 0 = 0 .$$
Further, for any $\nu \in \mathbb{N}$:
$$\frac{g(x_\nu) - g(x)}{x_\nu - x} = \frac{(x_\nu)^n - x^n}{x_\nu - x} = (x_\nu)^{n-1} + (x_\nu)^{n-2} x + \dots + x_\nu x^{n-2} + x^{n-1}$$
and hence by Example 2.3.49:
$$\lim_{\nu \to \infty} \frac{g(x_\nu) - g(x)}{x_\nu - x} = x^{n-1} + x^{n-2} x + \dots + x x^{n-2} + x^{n-1} = n x^{n-1} .$$

In the next example, we show that the derivative of the exponential function is given by that function itself.
As we shall see later, this fact along with the fact that $\exp(0) = 1$ can be used to characterize the exponential function, see Example 2.5.8.

Example 2.4.4. The exponential function is differentiable with
$$\exp'(x) = \exp(x)$$
for all $x \in \mathbb{R}$.

Proof. First, we prove that $\exp$ is differentiable in 0 with derivative $e^0 = 1$. For this, let $h_1, h_2, \dots$ be some sequence in $\mathbb{R} \setminus \{0\}$ which is convergent to 0. Moreover, let $n_0 \in \mathbb{N}$ be such that $|h_n| < 1$ for $n \geq n_0$. Then for any such $n$:
$$\frac{e^{h_n} - e^0}{h_n - 0} - e^0 = \frac{e^{h_n} - (1 + h_n)}{h_n} .$$
We consider the cases $h_n > 0$ and $h_n < 0$. In the first case, it follows by (2.3.10) and some calculation that
$$0 \leq \frac{e^{h_n} - (1 + h_n)}{h_n} \leq \frac{h_n}{4} \cdot \frac{3 - h_n}{\left( 1 - \frac{h_n}{2} \right)^2} \leq \frac{12}{4} h_n = 3 h_n .$$
Analogously, it follows in the second case that
$$h_n \leq \frac{h_n}{1 - h_n} \leq \frac{e^{h_n} - (1 + h_n)}{h_n} \leq 0 .$$
Hence it follows in both cases that
$$\left| \frac{e^{h_n} - (1 + h_n)}{h_n} \right| \leq 3 |h_n|$$
and therefore by Theorem 2.3.10 that
$$\lim_{n \to \infty} \frac{e^{h_n} - (1 + h_n)}{h_n} = 0 .$$
Now let $x \in \mathbb{R}$ and $x_1, x_2, \dots$ be some sequence in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Then
$$\frac{e^{x_n} - e^x}{x_n - x} - e^x = e^x \cdot \frac{e^{x_n - x} - [1 + (x_n - x)]}{x_n - x} ,$$
and hence it follows by Theorem 2.3.4 and the previous result that
$$\lim_{n \to \infty} \frac{e^{x_n} - e^x}{x_n - x} = e^x$$
and therefore the statement.

Example 2.4.5. The sine function is differentiable with
$$\sin'(x) = \cos(x)$$
for all $x \in \mathbb{R}$.

Proof. Let $x \in \mathbb{R}$ and $x_1, x_2, \dots$ be some sequence in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Further define $h_n := x_n - x$, $n \in \mathbb{N}$. Then it follows by the addition theorems for the trigonometric functions that
$$\begin{aligned}
\frac{\sin(x_n) - \sin(x)}{x_n - x} &= \frac{\sin(x + h_n) - \sin(x)}{h_n} = \sin(x) \, \frac{\cos(h_n) - 1}{h_n} + \cos(x) \, \frac{\sin(h_n)}{h_n} \\
&= -\sin(x) \, \frac{h_n}{2} \left[ \frac{\sin(h_n/2)}{h_n/2} \right]^2 + \cos(x) \, \frac{\sin(h_n)}{h_n}
\end{aligned}$$
and hence by Example 2.3.54 and Theorem 2.3.4 that
$$\lim_{n \to \infty} \frac{\sin(x_n) - \sin(x)}{x_n - x} = \cos(x) .$$

Fig. 37: Graph of the modulus function. See Example 2.4.6.

We give two examples of continuous functions that are not differentiable in points of their domains. In the first case, this is due to the presence of a 'corner' in the graph of the function.
In such a point no tangent to the graph exists and hence the function is not differentiable in the corresponding point of its domain. In the second case, the non-differentiability is due to the fact that there is a vertical tangent to the graph. Since the derivative of a function $f$ in a point $p$ of its domain gives the slope of the tangent to its graph at the point $(p, f(p))$, the derivative in $p$ would have to be infinite in order to account for a vertical tangent, but infinity is not a real number. Therefore, a function is not differentiable in such a point $p$.

Example 2.4.6. The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := |x|$$
for all $x \in \mathbb{R}$, is not differentiable in 0, because
$$\lim_{n \to \infty} \frac{\left| \frac{1}{n} \right| - |0|}{\frac{1}{n} - 0} = 1 \neq -1 = \lim_{n \to \infty} \frac{\left| -\frac{1}{n} \right| - |0|}{-\frac{1}{n} - 0} .$$
See Fig. 37.

Example 2.4.7. The function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := x^{1/3}$$
for all $x \in \mathbb{R}$, is not differentiable in 0, because the sequence
$$\frac{\left( \frac{1}{n} \right)^{1/3} - 0^{1/3}}{\frac{1}{n} - 0} = n^{2/3}$$
has no limit for $n \to \infty$. See Fig. 38.

Fig. 38: Graph of f from Example 2.4.7.

As mentioned above, similar to the case of continuous functions, sums, products, quotients (wherever defined) and compositions of differentiable functions are differentiable. This is a simple consequence of the limit laws, Theorem 2.3.4, and the definition of differentiability. A typical application of the thus obtained theorems consists in the decomposition of a given function into sums, products, quotients, compositions of functions whose differentiability is already known. Then the application of those theorems proves the differentiability of that function and allows the calculation of its derivative from the derivatives of the constituents of the decomposition. In this way, the proof of differentiability of a given function is greatly simplified and, usually, obvious. Also, the calculation of its derivative is reduced to a simple mechanical procedure if the derivatives of the constituents of the decomposition are known.
Therefore, in such obvious cases in future, the differentiability of the function will be just stated and its derivative will be given without explicit proof.

Theorem 2.4.8. (Sum rule, product rules and quotient rule) Let $f, g$ be two differentiable functions from some open interval $I$ into $\mathbb{R}$ and $a \in \mathbb{R}$.

(i) Then $f + g$, $a \cdot f$ and $f \cdot g$ are differentiable with
$$(f + g)'(x) = f'(x) + g'(x) , \quad (a \cdot f)'(x) = a \cdot f'(x) ,$$
$$(f \cdot g)'(x) = f(x) \cdot g'(x) + g(x) \cdot f'(x)$$
for all $x \in I$.

(ii) If $f$ is non-vanishing for all $x \in I$, then $1/f$ is differentiable and
$$\left( \frac{1}{f} \right)'(x) = -\frac{f'(x)}{[f(x)]^2}$$
for all $x \in I$.

Proof. For this let $x \in I$ and $x_1, x_2, \dots$ be some sequence in $I \setminus \{x\}$ which is convergent to $x$. Then:
$$\begin{aligned}
&\frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} \\
&\quad \leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|}
\end{aligned}$$
and
$$\frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = |a| \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|}$$
and hence
$$\lim_{\nu \to \infty} \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} = 0$$
and
$$\lim_{\nu \to \infty} \frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = 0 .$$
Further, it follows that
$$\begin{aligned}
&\frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} \\
&\quad \leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \cdot |g(x)| + |f(x)| \cdot \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|} \\
&\qquad + \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \cdot |g(x_\nu) - g(x)|
\end{aligned}$$
and hence that
$$\lim_{\nu \to \infty} \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 .$$
If $f$ does not vanish in any point of its domain $I$, it follows that
$$\begin{aligned}
&\frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} f'(x)(x_\nu - x) \right|}{|x_\nu - x|} \\
&\quad \leq \frac{1}{|f(x)|^2} \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|^2}{|f(x_\nu)| \cdot |f(x)|^2 \cdot |x_\nu - x|}
\end{aligned}$$
and hence that
$$\lim_{\nu \to \infty} \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} f'(x)(x_\nu - x) \right|}{|x_\nu - x|} = 0 .$$
Finally, since $x_1, x_2, \dots$ and $x \in I$ were otherwise arbitrary, the theorem follows.

As a simple application of Theorem 2.4.8, we prove the differentiability of polynomial functions and calculate their derivatives.

Example 2.4.9. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers.
Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$, defined by
\[ p(x) := a_0 + a_1 x + \dots + a_n x^n \]
for all $x \in \mathbb{R}$, is differentiable and
\[ p'(x) = a_1 + \dots + n a_n x^{n-1} \]
for all $x \in \mathbb{R}$.

Proof. The proof is a simple consequence of Example 2.4.3 and Theorem 2.4.8.

Theorem 2.4.10. (Chain rule) Let $f : I \to \mathbb{R}$, $g : J \to \mathbb{R}$ be differentiable functions defined on some open intervals $I, J$ of $\mathbb{R}$ and such that the domain of the composition $g \circ f$ is not empty. Then $g \circ f$ is differentiable with
\[ (g \circ f)'(x) = g'(f(x)) \cdot f'(x) \]
for all $x \in D(g \circ f)$.

Proof. For this let $x \in D(g \circ f)$ and $x_1, x_2, \dots$ be some sequence in $D(g \circ f) \setminus \{x\}$ which is convergent to $x$. Then:
\[ \begin{aligned} &\frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} \\ &\quad \leq \frac{|g(f(x_\nu)) - g(f(x)) - g'(f(x))(f(x_\nu) - f(x))|}{|x_\nu - x|} + \frac{|g'(f(x)) (f(x_\nu) - f(x) - f'(x)(x_\nu - x))|}{|x_\nu - x|} \end{aligned} \]
and hence, obviously,
\[ \lim_{\nu \to \infty} \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Finally, since $x_1, x_2, \dots$ and $x \in D(g \circ f)$ were otherwise arbitrary, the theorem follows.

A typical application of the chain rule is given in the following example. The cosine function is equal to the composition of the sine function and the translation $(\mathbb{R} \to \mathbb{R}, x \mapsto x + (\pi/2))$. Since both of these functions are differentiable, by Theorem 2.4.10, the same is true for their composition. In addition, from the derivatives of these functions, the derivative of their composition, i.e., the cosine function, can be calculated by use of the same theorem. In preparation for the calculation of the derivative of the inverse tangent function, we also show the differentiability of the tangent function and calculate its derivative from the derivatives of the sine and the cosine with the help of Theorem 2.4.8.

Example 2.4.11. The cosine and the tangent function are differentiable with
\[ \cos'(x) = - \sin(x) \]
for all $x \in \mathbb{R}$ and
\[ \tan'(x) = \frac{1}{\cos^2(x)} = 1 + \tan^2(x) \]
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.

Proof.
Since
\[ \cos(x) = \sin\left(x + \frac{\pi}{2}\right) \]
for all $x \in \mathbb{R}$, it follows by Examples 2.4.5, 2.4.3 and Theorem 2.4.8 (i.e., the 'sum rule') and Theorem 2.4.10 (i.e., the 'chain rule') that $\cos$ is differentiable with derivative
\[ \cos'(x) = \cos\left(x + \frac{\pi}{2}\right) = - \sin(x) \]
for all $x \in \mathbb{R}$. Further, because of
\[ \tan(x) = \frac{\sin(x)}{\cos(x)} \]
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$, it follows by Example 2.4.5 and Theorem 2.4.8 (i.e., the 'quotient rule') that $\tan$ is differentiable with derivative
\[ \tan'(x) = \frac{\cos(x) \cos(x) - \sin(x)(-\sin(x))}{\cos^2(x)} = 1 + \tan^2(x) = \frac{1}{\cos^2(x)} \]
for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.

Functions from applications frequently depend on several variables, i.e., are defined on subsets of $\mathbb{R}^n$ for some $n \in \mathbb{N}$ such that $n \geq 2$. For such functions, the concept of differentiation will be formulated in Calculus III. The calculation of the corresponding derivatives can be reduced to the calculation of derivatives of functions in one variable with the help of the concept of partial derivatives. The latter was developed soon after that of differentiation because of applications. The historic view of the partial derivative was that of treating all variables of an analytic expression as constant, apart from one. In this way, one obtains an analytic expression in one variable that can be differentiated in the usual way. The result was called a partial derivative of the original expression. The modern definition of partial derivatives is very similar. To define the partial derivative of a function $f$ in several variables, we consider an auxiliary partial function which results from $f$ by restricting its domain to those points whose components are all given constants, apart from one of the components. The result is a function defined on a subset of $\mathbb{R}$. In general, this function depends on the above constants.
The derivative of the auxiliary function in some point $p$ of its domain, if it exists, is called the partial derivative of $f$ in the point whose components are the given constants apart from the remaining component which is given by $p$.

Definition 2.4.12. Let $f : U \to \mathbb{R}$ be a function of several variables where $U$ is a subset of $\mathbb{R}^n$, $n \in \mathbb{N} \setminus \{0, 1\}$. In particular, let $i \in \{1, \dots, n\}$, $x \in U$ be such that the corresponding function
\[ f(x_1, \dots, x_{i-1}, \cdot \, , x_{i+1}, \dots, x_n) \]
is differentiable at $x_i$. In this case, we say that $f$ is partially differentiable at $x$ in the $i$-th coordinate direction, and we define:
\[ \frac{\partial f}{\partial x_i}(x) := [ f(x_1, \dots, x_{i-1}, \cdot \, , x_{i+1}, \dots, x_n) ]'(x_i) . \]
If $f$ is partially differentiable in the $i$-th coordinate direction at every point $x$ of its domain, we call $f$ partially differentiable in the $i$-th coordinate direction and denote by $\partial f / \partial x_i$ the map which associates to every $x \in U$ the corresponding $(\partial f / \partial x_i)(x)$. Partial derivatives of $f$ of higher order are defined recursively. If $\partial f / \partial x_i$ is partially differentiable in the $j$-th coordinate direction, where $j \in \{1, \dots, n\}$, we denote the partial derivative of $\partial f / \partial x_i$ in the $j$-th coordinate direction by
\[ \frac{\partial^2 f}{\partial x_j \partial x_i} . \]
Such is called a partial derivative of $f$ of second order. In the case $j = i$, we set
\[ \frac{\partial^2 f}{\partial x_i^2} := \frac{\partial^2 f}{\partial x_i \partial x_i} . \]
Partial derivatives of $f$ of higher order than $2$ are defined accordingly.

Example 2.4.13. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := x^3 + x^2 y^3 - 2 y^2 \]
for all $x, y \in \mathbb{R}$. Find
\[ \frac{\partial f}{\partial x}(2, 1) \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) . \]
Solution: We have
\[ f(x, 1) = x^3 + x^2 - 2 \quad \text{and} \quad f(2, y) = 8 + 4 y^3 - 2 y^2 \]
for all $x, y \in \mathbb{R}$. Hence it follows that
\[ \frac{\partial f}{\partial x}(x, 1) = 3 x^2 + 2 x , \quad \frac{\partial f}{\partial y}(2, y) = 12 y^2 - 4 y \]
for all $x, y \in \mathbb{R}$, and, finally, that
\[ \frac{\partial f}{\partial x}(2, 1) = 16 \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) = 8 . \]

Example 2.4.14. Define $f : \mathbb{R}^3 \to \mathbb{R}$ by
\[ f(x, y, z) := x^2 y^3 z + 3x + 4y + 6z + 5 \]
for all $x, y, z \in \mathbb{R}$. Find
\[ \frac{\partial f}{\partial x}(x, y, z) , \quad \frac{\partial f}{\partial y}(x, y, z) \quad \text{and} \quad \frac{\partial f}{\partial z}(x, y, z) \]
for all $x, y, z \in \mathbb{R}$.
Solution: Since in partial differentiation with respect to one variable all other variables are held constant, we conclude that
\[ \frac{\partial f}{\partial x}(x, y, z) = 2 x y^3 z + 3 , \quad \frac{\partial f}{\partial y}(x, y, z) = 3 x^2 y^2 z + 4 , \quad \frac{\partial f}{\partial z}(x, y, z) = x^2 y^3 + 6 \]
for all $x, y, z \in \mathbb{R}$.

Problems

1) By the basic definition of derivatives, calculate the derivative of the function $f$.
a) $f(x) := 1/x$ , $x \in (0, \infty)$ ,
b) $f(x) := (x + 1)/(x - 1)$ , $x \in \mathbb{R} \setminus \{1\}$ ,
c) $f(x) := \sqrt{x}$ , $x \in (0, \infty)$ .

2) Calculate the slope of the tangent to $G(f)$ at the point $(1, f(1))$ and its intersection with the $x$-axis.
a) $f(x) := x^2 + 3x + 1$ , $x \in \mathbb{R}$ ,
b) $f(x) := (3x + 2)/(4x - 5)$ , $x \in \mathbb{R} \setminus \{5/4\}$ ,
c) $f(x) := e^{3x}$ , $x \in \mathbb{R}$ .

3) Calculate the derivatives of the functions $f_1, \dots, f_8$ with maximal domains in $\mathbb{R}$ defined by
a) $f_1(x) := 5 x^8 + 2 x^5 + 6$ , $f_2(\theta) := 3 \sin(\theta) + 4 \cos(\theta)$ ,
b) $f_3(t) := (1 + 3 t^2 + 5 t^4)(t^2 + 8)$ , $f_4(x) := 3 e^x [\sin(x) + 6 \cos(x)]$ ,
c) $f_5(t) := \dfrac{3 t^2 + t^5}{t^3 + 8}$ , $f_6(\varphi) := \dfrac{5 \cos(\varphi)}{\tan(\varphi)} + 4$ ,
d) $f_7(x) := \sin(3/x^2)$ , $f_8(t) := e^{4 \sin(7t)}$ .

4) A differentiable function $f$ satisfies the given equation for all $x$ from its domain. Calculate the slope of the tangent to $G(f)$ in the specified point $P$ without solving the equations for $f(x)$.
a) $x^2 + (f(x))^2 = 1$ , $P = (1/\sqrt{2}, 1/\sqrt{2})$ ,
b) $(x - 1)^2 [ x^2 + (f(x))^2 ] - 4 x^2 = 0$ , $P = (1 + \sqrt{2}, 1 + \sqrt{2})$ ,
c) $x + \sqrt{f(x) [2 - f(x)]} = \arccos(1 - f(x))$ , $P = ((\pi/4) - (1/\sqrt{2}), 1 - (1/\sqrt{2}))$ .
Remark: The curve in c) is a cycloid which is the trajectory of a point of a circle rolling along a straight line. The curve in b) is named after Nicomedes (3rd century B.C.), who used it to solve the problem of trisecting an angle.

5) Give a function $f : \mathbb{R} \to \mathbb{R}$ such that
a) $f'(t) = 1$ for all $t \in \mathbb{R}$ and such that $f(0) = 2$ ,
b) $f'(t) = 2 f(t)$ for all $t \in \mathbb{R}$ and such that $f(0) = 1$ ,
c) $f'(t) = 2 f(t) + 3$ for all $t \in \mathbb{R}$ and such that $f(0) = 1$ .

6) Let $I$ be a non-empty open interval in $\mathbb{R}$ and $p, q \in \mathbb{R}$.
Further, let $f : I \to \mathbb{R}$. Show that
\[ f''(x) + p f'(x) + q f(x) = 0 \]
for all $x \in I$ if and only if
\[ \bar{f}''(x) - \left( \frac{p^2}{4} - q \right) \bar{f}(x) = 0 \]
for all $x \in I$ where $\bar{f} : I \to \mathbb{R}$ is defined by
\[ \bar{f}(x) := e^{px/2} f(x) \]
for all $x \in I$.

7) Newton's equation of motion for a point particle of mass $m \geq 0$ moving on a straight line is given by
\[ m f''(t) = F(f(t)) \qquad (2.4.5) \]
for all $t$ from some time interval $I \subset \mathbb{R}$, where $f(t)$ is the position of the particle at time $t$, and $F(x)$ is the external force at the point $x$. For the specified force, give a solution function $f : \mathbb{R} \to \mathbb{R}$ of (2.4.5) that contains 2 free real parameters.
a) $F(x) = F_0$ , $x \in \mathbb{R}$ where $F_0$ is some real parameter ,
b) $F(x) = -kx$ , $x \in \mathbb{R}$ where $k$ is some real parameter .

8) Newton's equation of motion for a point particle of mass $m \geq 0$ moving on a straight line under the influence of a viscous friction is given by
\[ m f''(t) = - \lambda f'(t) \qquad (2.4.6) \]
for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$, and $\lambda \in [0, \infty)$ is a parameter describing the strength of the friction. Give a solution function $f$ of (2.4.6) that contains 2 free real parameters.

9) For all $(x, y)$ from the domain, calculate the partial derivatives $(\partial f / \partial x)(x, y)$, $(\partial f / \partial y)(x, y)$ of the given function $f$.
a) $f(x, y) := x^4 + 2 x^2 y^2 + 3x + 4y + 1$ , $(x, y) \in \mathbb{R}^2$ ,
b) $f(x, y) := 3 x^2 + 2x + 1$ , $(x, y) \in \mathbb{R}^2$ ,
c) $f(x, y) := \sin(xy)$ , $(x, y) \in \mathbb{R}^2$ .

10) Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be twice differentiable functions. Define
\[ u(t, x) := f(x - t) + g(x + t) \]
for all $(t, x) \in \mathbb{R}^2$. Calculate
\[ \frac{\partial u}{\partial t}(t, x) , \quad \frac{\partial u}{\partial x}(t, x) , \quad \frac{\partial^2 u}{\partial t^2}(t, x) , \quad \frac{\partial^2 u}{\partial x^2}(t, x) \]
for all $(t, x) \in \mathbb{R}^2$. Conclude that $u$ satisfies
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \]
which is called the wave equation in one space dimension (for a function $u$ which is to be determined).

2.5 Applications of Differentiation

The applications of differentiation are manifold. We start with the application to the finding of maxima and minima of functions. For motivation, we consider a continuous function $f$ defined on a closed interval $[a, b]$ where $a, b \in \mathbb{R}$ are such that $a \leq b$.
According to Theorem 2.3.33, $f$ assumes a maximum and minimum value, i.e., there are $x_M, x_m \in [a, b]$ such that
\[ f(x_M) \geq f(x) , \quad f(x_m) \leq f(x) \]
for all $x \in [a, b]$. The values $f(x_M)$, $f(x_m)$ are called the maximum and minimum value of $f$, respectively. These values are uniquely determined because if $\bar{x}_M, \bar{x}_m \in [a, b]$ are such that
\[ f(\bar{x}_M) \geq f(x) , \quad f(\bar{x}_m) \leq f(x) \]
for all $x \in [a, b]$, it follows by definition of $x_M, \bar{x}_M, x_m, \bar{x}_m$ that
\[ f(x_M) \geq f(\bar{x}_M) , \quad f(x_m) \leq f(\bar{x}_m) \]
as well as that
\[ f(\bar{x}_M) \geq f(x_M) , \quad f(\bar{x}_m) \leq f(x_m) \]
and hence that
\[ f(x_M) = f(\bar{x}_M) , \quad f(x_m) = f(\bar{x}_m) . \]
On the other hand, a function can assume its maximum value and/or its minimum value in more than one point. For instance, the function $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$ assumes its maximum value $3$ and its minimum value $1$ in the points $\pi/2, 5\pi/2$ and $3\pi/2, 7\pi/2$, respectively, see Fig. 39.

Fig. 39: Graph and segments of tangents of a function, $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$, that assumes both its maximum and its minimum value in several points of its domain. Note that the tangents in those points are horizontal corresponding to a vanishing derivative in those points.

After this digression, we continue with the discussion of the maximum and minimum values of $f$. Each of them can be assumed either at a boundary point $a$ or $b$ of the interval or in a point of the open interval $(a, b)$. In the latter case, if the function is differentiable on $(a, b)$, differentiation can be used to determine the position(s) where they are assumed. We remember that the function $A$ from Fermat's example at the beginning of Section 2.4 assumed its maximum value in the midpoint of its domain and that his way of finding its position was equivalent to the demand of a vanishing derivative at the position of a maximum value. Indeed, this is also true for a minimum value. With precise definitions of limits and derivatives at hand, both follow from very simple observations.
By definition of $x_M$, it follows that
\[ f(x) - f(x_M) \leq 0 \]
for all $x \in D(f)$. As a consequence, we conclude that
\[ \frac{f(x) - f(x_M)}{x - x_M} \leq 0 \]
if $b > x > x_M$ and
\[ \frac{f(x) - f(x_M)}{x - x_M} \geq 0 \]
if $a < x < x_M$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_M, b)$, $(a, x_M)$ that converges to $x_M$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_M) \leq 0$ and $f'(x_M) \geq 0$, respectively, and hence that $f'(x_M) = 0$. Also, by definition of $x_m$, it follows that
\[ f(x) - f(x_m) \geq 0 \]
for all $x \in D(f)$. As a consequence, we conclude that
\[ \frac{f(x) - f(x_m)}{x - x_m} \geq 0 \]
if $b > x > x_m$ and
\[ \frac{f(x) - f(x_m)}{x - x_m} \leq 0 \]
if $a < x < x_m$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_m, b)$, $(a, x_m)$ that converges to $x_m$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_m) \geq 0$ and $f'(x_m) \leq 0$, respectively, and hence that $f'(x_m) = 0$.

Hence in case that the restriction of $f$ to $(a, b)$ is differentiable, the standard procedure of finding the maximum and minimum values of $f$ proceeds by finding the zeros of the derivative of the restriction, subsequent calculation of the corresponding function values of $f$ in those zeros and comparison of the obtained values with the function values of $f$ at $a$ and $b$. The maximum, minimum value of these function values is the maximum and minimum value of $f$, respectively.

Theorem 2.5.1. (Necessary condition for the existence of a local minimum/maximum) Let $f$ be a differentiable real-valued function on some open interval $I$ of $\mathbb{R}$. Further, let $f$ have a local minimum / maximum at some $x_0 \in I$, i.e., let
\[ f(x_0) \leq f(x) \; / \; f(x_0) \geq f(x) \]
for all $x$ such that $x_0 - \varepsilon < x < x_0 + \varepsilon$, for some $\varepsilon > 0$. Then
\[ f'(x_0) = 0 , \]
i.e., $x_0$ is a so called 'critical point' for $f$.

Fig. 40: $G(f)$ from Example 2.5.2.

Proof. If $f$ has a local minimum/maximum at $x_0 \in I$, then it follows for sufficiently small $h \in \mathbb{R} \setminus \{0\}$ that
\[ \frac{1}{h} [ f(x_0 + h) - f(x_0) ] \]
is $\geq (\leq) \, 0$ and $\leq (\geq) \, 0$, for $h > 0$ and $h < 0$, respectively.
Therefore, it follows by Theorem 2.3.12 that $f'(x_0)$ is at the same time $\geq 0$ and $\leq 0$ and hence, finally, equal to $0$.

Example 2.5.2. Find the critical points of $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := x^4 + x^3 + 1 \]
for all $x \in \mathbb{R}$. Solution: The critical points of $f$ are the solutions of the equation
\[ 0 = f'(x) = 4 x^3 + 3 x^2 = x^2 (4x + 3) \]
and hence given by $x = 0$ and $x = -3/4$. See Fig. 40. Note that $f$ has a local extremum at $x = -3/4$, but not at $x = 0$. Hence the condition in Theorem 2.5.1 is necessary, but not sufficient for the existence of a local extremum.

Fig. 41: $G(f)$ from Example 2.5.3.

Example 2.5.3. Find the maximum and minimum values of $f : [-\pi, \pi] \to \mathbb{R}$ defined by
\[ f(x) := x - 2 \cos(x) \]
for all $x \in [-\pi, \pi]$. Solution: Since $f$ is continuous, such values exist according to Theorem 2.3.33. Those points, where these values are assumed, can be either on the boundary of the domain, i.e., in the points $-\pi$ or $\pi$, where $f$ assumes the values $2 - \pi$ and $2 + \pi$, respectively, or inside the interval, i.e., in the open interval $(-\pi, \pi)$. In the last case, according to Theorem 2.5.1 those are critical points of the restriction of $f$ to this interval. The last are given by
\[ x = - \frac{5\pi}{6} , \; - \frac{\pi}{6} \]
since
\[ f'(x) = 1 + 2 \sin(x) \]
for all $x \in (-\pi, \pi)$. Now
\[ f\left(- \frac{\pi}{6}\right) = - \frac{\pi}{6} - \sqrt{3} , \quad f\left(- \frac{5\pi}{6}\right) = \sqrt{3} - \frac{5\pi}{6} \]
and hence the minimum value of $f$ is $-(\pi/6) - \sqrt{3}$ (assumed inside the interval) and its maximum value is $\pi + 2$ (assumed at the right boundary of the interval). See Fig. 41.

The following is a theorem of Michel Rolle, published in 1691, which he used in his method of cascades devised to find intervals around zeros of polynomial functions that contain no other roots. In this connection, the subsequent theorem gives that the open interval $I$ that is contained in the domain of a continuous function and that has two subsequent roots of that function as end points, contains at least one zero of the derivative of the restriction of that function to $I$ if that restriction is differentiable.

Theorem 2.5.4.
(Rolle’s theorem) Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and $f(a) = f(b)$. Then there is $c \in (a, b)$ such that $f'(c) = 0$.

Proof. Since $f$ is continuous, according to Theorem 2.3.33 $f$ assumes its minimum and maximum value in some points $x_0 \in [a, b]$ and $x_1 \in [a, b]$, respectively. Now if one of these points is contained in the open interval $(a, b)$, the derivative of $f$ in that point vanishes by Theorem 2.5.1. Otherwise, if both of those points are at the interval ends $a$, $b$ it follows that
\[ f(a) \leq f(x) \leq f(b) = f(a) \]
for all $x \in [a, b]$. Hence in this case, $f$ is a constant function, and it follows by Example 2.4.3 that $f'(c) = 0$ for every $c \in (a, b)$. Hence in both cases the statement of the theorem follows.

The following example provides a typical application of Rolle’s theorem.

Example 2.5.5. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := x^3 + x + 1 \]
for all $x \in \mathbb{R}$, has exactly one zero. (Compare Example 2.3.39.)

Proof. $f$ is continuous and, because of $f(-1) = -1 < 0$ and $f(0) = 1 > 0$ and Corollary 2.3.38, has a zero $x_0$ in $(-1, 0)$. See Fig. 23. Further, $f$ is differentiable with $f'(x) = 3 x^2 + 1 > 0$ for all $x \in \mathbb{R}$. Now assume that there is another zero $x_1 \neq x_0$. Then it follows by Theorem 2.5.4 the existence of a zero of $f'$ in the interval with endpoints $x_0$ and $x_1$, which contradicts the fact that $f'(x) > 0$ for all $x \in \mathbb{R}$. Hence $f$ has exactly one zero.

Fig. 42: Illustration of the statement of the mean value theorem 2.5.6.

The mean value theorem is a simple generalization of Rolle’s theorem which will be frequently used in the following. Its use as a central theoretical tool in calculus / analysis was initiated by Cauchy. Its proof proceeds by construction of an appropriate auxiliary function which allows the application of Rolle’s theorem.
For a simple geometrical interpretation of the statement of the mean value theorem, we consider a continuous function $f$ defined on a closed interval of $\mathbb{R}$ with left end point $a$ and right end point $b$, where $a < b$, which is differentiable on $(a, b)$. Then according to the theorem, there is a tangent to the graph of the restriction of $f$ to $(a, b)$ with slope identical to the slope of the line segment ('secant') from $(a, f(a))$ to $(b, f(b))$, see Fig. 42.

Theorem 2.5.6. (Mean value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$. Then there is $c \in (a, b)$ such that
\[ \frac{f(b) - f(a)}{b - a} = f'(c) . \]

Proof. Define the auxiliary function $h : [a, b] \to \mathbb{R}$ by
\[ h(x) := f(x) - \frac{f(b) - f(a)}{b - a} (x - a) - f(a) \]
for all $x \in [a, b]$. Then $h$ is continuous as well as differentiable on $(a, b)$ with
\[ h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} \]
for all $x \in (a, b)$ and $h(a) = h(b) = 0$. Hence by Theorem 2.5.4 there is $c \in (a, b)$ such that
\[ h'(c) = f'(c) - \frac{f(b) - f(a)}{b - a} = 0 . \]

Intuitively, it should be expected that every function which is defined on an open interval of $\mathbb{R}$ and has a vanishing derivative is a constant function. Indeed, this can be seen as a first important consequence of the mean value theorem.

Theorem 2.5.7. Let $f : (a, b) \to \mathbb{R}$ be differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f'(x) = 0$ for all $x \in (a, b)$. Then $f$ is a constant function.

Proof. The proof is indirect. Assume that $f$ is not a constant function. Then there are $x_1, x_2 \in (a, b)$ satisfying $x_1 < x_2$ and $f(x_1) \neq f(x_2)$. Hence it follows by Theorem 2.5.6 the existence of $c \in (x_1, x_2)$ such that
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0 \]
and hence that $f(x_1) = f(x_2)$, which contradicts the choice of $x_1$ and $x_2$. Hence $f$ is a constant function.

Typically, the previous theorem is applied in proofs of uniqueness of solutions of differential equations and in the derivation of so called 'conserved quantities' of physical systems as in the subsequent examples.

Example 2.5.8.
(A characterization of the exponential function) Let $a, b \in \mathbb{R}$ be such that $a < 0$ and $b > 0$. Find all solutions $f : (a, b) \to \mathbb{R}$ of the differential equation
\[ f'(x) = f(x) \]
for all $x \in (a, b)$ that satisfy $f(0) = 1$. Solution: We know that $f(x) := \exp(x)$ for every $x \in (a, b)$ satisfies all these demands. Indeed, it follows with the help of the previous theorem, Theorem 2.5.7, that there is no other solution. This can be seen as follows. Let $f$ be some function that satisfies these requirements. Then we define the auxiliary function $h : (a, b) \to \mathbb{R}$ by
\[ h(x) := \exp(-x) f(x) \]
for all $x \in (a, b)$. As a consequence, $h$ is differentiable with a derivative $h'$ satisfying
\[ h'(x) = - \exp(-x) f(x) + \exp(-x) f'(x) = - \exp(-x) f(x) + \exp(-x) f(x) = 0 \]
for all $x \in (a, b)$. Hence it follows by Theorem 2.5.7 that $h$ is a constant function of value $h(0) = f(0) = 1$ which has the consequence that $f(x) = \exp(x)$ for all $x \in (a, b)$.

Example 2.5.9. (Energy conservation) Newton's equation of motion for a point particle of mass $m \geq 0$ moving on a straight line is given by
\[ m f''(t) = F(f(t)) \qquad (2.5.1) \]
for all $t$ from some non-empty open time interval $I \subset \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $F(x)$ is the external force at the point $x$. Assume that $F = -V'$ where $V$ is a differentiable function from an open interval $J \supset \mathrm{Ran}(f)$. Show that $E : I \to \mathbb{R}$ defined by
\[ E(t) := \frac{m}{2} (f'(t))^2 + V(f(t)) \qquad (2.5.2) \]
for all $t \in I$ is a constant function. Solution: It follows by Theorem 2.4.8, Theorem 2.4.10 and (2.5.1) that $E$ is differentiable with derivative
\[ E'(t) = m f'(t) f''(t) + V'(f(t)) f'(t) = f'(t) \, [ m f''(t) - F(f(t)) ] = 0 \]
for all $t \in I$. Hence according to Theorem 2.5.7, $E$ is a constant function. In physics, its value is called the total energy of the particle. As a consequence, the finding of the solutions of (2.5.1), which is second order in the derivatives, is reduced to the solution of (2.5.2), which is only first order in the derivatives, for an assumed value of the total energy.
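The conservation statement of Example 2.5.9 can be checked numerically. The following is a minimal sketch under illustrative assumptions that are not part of the example: the values $m = 2$, $F(x) = -2x$ (so that $V(x) = x^2$), and the solution $f(t) = A \cos(t) + B \sin(t)$ with the arbitrarily chosen constants `A` and `B`. The helper name `energy` is hypothetical.

```python
import math

def energy(f, df, V, m, t):
    """Evaluate E(t) = (m/2) * f'(t)^2 + V(f(t)), cf. (2.5.2)."""
    return 0.5 * m * df(t) ** 2 + V(f(t))

# Assumed illustrative data: m = 2 and F(x) = -2x, hence V(x) = x^2.
# Then f(t) = A cos(t) + B sin(t) solves m f''(t) = F(f(t)).
m, A, B = 2.0, 1.5, -0.5
f = lambda t: A * math.cos(t) + B * math.sin(t)
df = lambda t: -A * math.sin(t) + B * math.cos(t)
V = lambda x: x * x

# Sample E on a grid of times; up to rounding, all values should coincide.
energies = [energy(f, df, V, m, 0.1 * k) for k in range(100)]
spread = max(energies) - min(energies)
print(energies[0], spread)
```

For this choice, $E(t) = (f'(t))^2 + (f(t))^2 = A^2 + B^2$, so the printed spread should be of the size of floating-point rounding errors only.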
Utilizing the interpretation of the values of the derivative of a function as providing the slopes of tangents at its graph, it is to be expected that a differentiable function is increasing (decreasing) on intervals where its derivative assumes positive values (negative values), i.e., values that are $\geq 0$ ($\leq 0$). That this intuition is correct is shown by the following theorem. Its statement can be regarded as another important consequence of the mean value theorem.

Theorem 2.5.10. Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and such that $f'(x) > 0$ ( $f'(x) \geq 0$ ) for every $x \in (a, b)$. Then $f$ is strictly increasing ( increasing ) on $[a, b]$, i.e.,
\[ f(x) < f(y) \quad ( \, f(x) \leq f(y) \, ) \]
for all $x, y \in [a, b]$ that satisfy $x < y$.

Proof. Let $x$ and $y$ be some elements of $[a, b]$ such that $x < y$. Then the restriction of $f$ to the interval $[x, y]$ satisfies the assumptions of Theorem 2.5.6, and hence there is $c \in (x, y)$ such that
\[ f(y) = f(x) + f'(c)(y - x) > f(x) \quad ( \, \geq f(x) \, ) . \]

Fig. 43: Graphs of $\exp$ and approximations. See Example 2.5.12.

Typically, the previous theorem is used in the derivation of lower and upper bounds for the values of functions or more generally in the comparison of functions and, in particular, in the proof of injectivity of functions. The subsequent examples provide such applications.

Example 2.5.11. Show that the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is strictly increasing. Solution: By Example 2.4.4 and Theorem 2.3.27 it follows that $\exp'(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. Hence it follows by Theorem 2.5.10 that $\exp$ is strictly increasing. Hence there is an inverse function to $\exp$ which is called the natural logarithm and is denoted by $\ln$. See Fig. 28.

Example 2.5.12. Show that

(i)
\[ e^x > 1 \qquad (2.5.3) \]
for all $x \in (0, \infty)$.

(ii)
\[ e^x > x + 1 \qquad (2.5.4) \]
for all $x \in (0, \infty)$. (Compare Theorem 2.3.27.)

Proof. Define the continuous function $f : [0, \infty) \to \mathbb{R}$ by $f(x) := e^x - 1$ for all $x \in [0, \infty)$.
Then $f$ is differentiable on $(0, \infty)$ with $f'(x) = e^x > 0$ for all $x \in (0, \infty)$. Hence $f$ is strictly increasing according to Theorem 2.5.10, and (2.5.3) follows since $f(0) = e^0 - 1 = 0$. Further, define the continuous function $g(x) := e^x - 1 - x$ for all $x \in [0, \infty)$. Then $g$ is differentiable on $(0, \infty)$ with $g'(x) = e^x - 1 > 0$ for all $x \in (0, \infty)$ where (2.5.3) has been applied. Hence (2.5.4) follows by Theorem 2.5.10 since $g(0) = e^0 - 0 - 1 = 0$.

From Example 2.5.11 and (2.5.4), it follows by the intermediate value theorem, Theorem 2.3.37, that $\exp( [0, \infty) ) = [1, \infty)$ and hence by part (iii) of Theorem 2.3.27 that the range of $\exp$ is given by $(0, \infty)$ which therefore is also the domain of its inverse function $\ln$. As a consequence, $\exp$ is a strictly increasing bijective map from $\mathbb{R}$ onto $(0, \infty)$. See Fig. 28.

Example 2.5.13. Show that
\[ \ln(a \cdot b) = \ln(a) + \ln(b) \]
for all $a, b > 0$. Solution: For $a, b > 0$, it follows by Theorem 2.3.27 that
\[ \ln(a \cdot b) = \ln\left( e^{\ln(a)} \cdot e^{\ln(b)} \right) = \ln\left( e^{\ln(a) + \ln(b)} \right) = \ln(a) + \ln(b) . \]

In Example 2.5.9, we derived a conserved quantity for the solutions of a differential equation, a special case of Newton's equation of motion. Ignoring the physical dimensions of the involved quantities in that example, in the special case that $m = 2$, $F(x) = -2x$ for all $x \in \mathbb{R}$, the function $E : I \to \mathbb{R}$, defined by
\[ E(t) := (f(t))^2 + (f'(t))^2 \]
for all $t \in I$ and a solution $f$ of the differential equation
\[ f''(t) + f(t) = 0 \]
for all $t \in I$, was found to be a constant function. The value of the corresponding constant is called the total energy that is associated to $f$. An important feature of that quantity is its positivity. In the subsequent theorem, we show that estimates on the growth of the same function $E$ defined for solutions of the related differential equation (2.5.5) can be used to show the uniqueness of the solutions of that differential equation. The key for this is the following lemma whose proof provides a further application of Theorem 2.5.10.
Differential equations of the form (2.5.5) appear frequently in applications, for instance, in the description of the amplitudes of oscillations of damped harmonic oscillators in mechanics and in the description of the current as a function of time in simple electric circuits in electrodynamics.

Lemma 2.5.14. (An 'energy' inequality for solutions of a differential equation) Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $f : I \to \mathbb{R}$ satisfy the differential equation
\[ f''(x) + p f'(x) + q f(x) = 0 \qquad (2.5.5) \]
for all $x \in I$. Finally, define
\[ E(x) := (f(x))^2 + (f'(x))^2 \]
for all $x \in I$. Then for all $x \in I$
\[ 0 \leq E(x) \leq E(x_0) \, e^{k |x - x_0|} \]
where
\[ k := 1 + 2 |p| + |q| . \]

Proof. Since $f$ is twice differentiable, $E$ is differentiable such that
\[ \begin{aligned} E'(x) &= 2 f(x) f'(x) + 2 f'(x) f''(x) = 2 f(x) f'(x) - 2 \, [ \, p f'(x) + q f(x) \, ] \, f'(x) \\ &= 2 (1 - q) f(x) f'(x) - 2 p \, (f'(x))^2 \end{aligned} \]
for all $x \in I$. Hence $E'$ is continuous and satisfies
\[ \begin{aligned} |E'(x)| &\leq 2 (1 + |q|) \, |f'(x)| \cdot |f(x)| + 2 |p| \, (f'(x))^2 \\ &\leq (1 + |q|) \left[ (f(x))^2 + (f'(x))^2 \right] + 2 |p| \, (f'(x))^2 \leq k E(x) \end{aligned} \]
for all $x \in I$ where it has been used that
\[ 2 \, |f'(x)| \cdot |f(x)| \leq (f(x))^2 + (f'(x))^2 . \]
As a consequence,
\[ - k E(x) \leq E'(x) \leq k E(x) \]
for all $x \in I$. We continue analyzing the consequences of these inequalities. For this, we define auxiliary functions $E_r$, $E_l$ by
\[ E_r(x) := e^{-kx} E(x) , \quad E_l(x) := e^{kx} E(x) \]
for all $x \in I$. Then
\[ E_r'(x) = e^{-kx} ( E'(x) - k E(x) ) \leq 0 , \quad E_l'(x) = e^{kx} ( E'(x) + k E(x) ) \geq 0 \]
for all $x \in I$. Hence $E_r$ is decreasing, which is equivalent to the increasing of $-E_r$, and $E_l$ is increasing. Hence it follows by Theorem 2.5.10 that
\[ E(x) \leq E(x_0) \, e^{k (x - x_0)} = E(x_0) \, e^{k |x - x_0|} \]
for $x \geq x_0$ and that
\[ E(x) \leq E(x_0) \, e^{k (x_0 - x)} = E(x_0) \, e^{k |x - x_0|} \]
for $x \leq x_0$.

The unique dependence of the solutions of (2.5.6) on 'initial data', $f(x_0)$ and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$, is a simple consequence of the preceding lemma.

Theorem 2.5.15. Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $y_0, y_0' \in \mathbb{R}$.
Then there is at most one function $f : I \to \mathbb{R}$ such that
\[ f''(x) + p f'(x) + q f(x) = 0 \qquad (2.5.6) \]
for all $x \in I$ and at the same time such that
\[ f(x_0) = y_0 , \quad f'(x_0) = y_0' . \]

Proof. For this, let $f, \bar{f} : I \to \mathbb{R}$ be such that
\[ f''(x) + p f'(x) + q f(x) = \bar{f}''(x) + p \bar{f}'(x) + q \bar{f}(x) = 0 \]
for all $x \in I$ and
\[ f(x_0) = \bar{f}(x_0) = y_0 , \quad f'(x_0) = \bar{f}'(x_0) = y_0' . \]
Then $u := f - \bar{f}$ satisfies
\[ u''(x) + p u'(x) + q u(x) = 0 \]
for all $x \in I$ and
\[ u(x_0) = u'(x_0) = 0 . \]
Hence $(u(x_0))^2 + (u'(x_0))^2 = 0$, and it follows by Lemma 2.5.14 that $u(x) = 0$ for all $x \in I$ and hence that $f = \bar{f}$.

Of course, of main interest for applications are the solutions of (2.5.6). These are obtained by reducing the solution of this equation to the solution of the special cases corresponding to $p = 0$. The solutions of the latter are obvious. Their representation is simplified by use of hyperbolic functions which are introduced next.

Definition 2.5.16. We define the hyperbolic sine function $\sinh$, the hyperbolic cosine function $\cosh$ and the hyperbolic tangent function $\tanh$ by
\[ \sinh(x) := \frac{1}{2} \left( e^x - e^{-x} \right) , \quad \cosh(x) := \frac{1}{2} \left( e^x + e^{-x} \right) , \quad \tanh(x) := \frac{\sinh(x)}{\cosh(x)} \]
for all $x \in \mathbb{R}$.

Fig. 44: Graphs of the hyperbolic sine and cosine function.

Fig. 45: Graphs of the hyperbolic tangent function and asymptotes given by the graphs of the constant functions on $\mathbb{R}$ of values $-1$ and $1$.

Obviously, $\sinh$, $\tanh$ are antisymmetric and $\cosh$ is symmetric, i.e.,
\[ \sinh(-x) = - \sinh(x) , \quad \cosh(-x) = \cosh(x) , \quad \tanh(-x) = - \tanh(x) \]
for all $x \in \mathbb{R}$. Also these functions are differentiable and, in particular,
\[ \sinh' = \cosh , \quad \cosh' = \sinh \]
similarly to the sine and cosine functions. Another resemblance to these functions is the relation
\[ \begin{aligned} \cosh^2(x) - \sinh^2(x) &= \frac{1}{4} \left( e^x + e^{-x} \right)^2 - \frac{1}{4} \left( e^x - e^{-x} \right)^2 \\ &= \frac{1}{4} \left[ \left( e^x + e^{-x} \right) + \left( e^x - e^{-x} \right) \right] \left[ \left( e^x + e^{-x} \right) - \left( e^x - e^{-x} \right) \right] = \frac{1}{4} \cdot 2 e^x \cdot 2 e^{-x} = 1 \end{aligned} \]
for all $x \in \mathbb{R}$. In particular, this implies that
\[ \tanh'(x) = \frac{\cosh^2(x) - \sinh^2(x)}{\cosh^2(x)} = 1 - \tanh^2(x) = \frac{1}{\cosh^2(x)} \]
for all $x \in \mathbb{R}$.
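The relations for the hyperbolic functions stated above lend themselves to a quick numerical sanity check. The following is a minimal sketch, assuming nothing beyond Definition 2.5.16; the helper `central_difference`, the step size `h` and the sample points are illustrative choices and not part of the text.

```python
import math

def sinh(x):
    """Hyperbolic sine as in Definition 2.5.16."""
    return 0.5 * (math.exp(x) - math.exp(-x))

def cosh(x):
    """Hyperbolic cosine as in Definition 2.5.16."""
    return 0.5 * (math.exp(x) + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent as the quotient sinh / cosh."""
    return sinh(x) / cosh(x)

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient, an approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

points = [-1.5, 0.0, 0.7, 2.0]

# Deviation from cosh^2(x) - sinh^2(x) = 1 at the sample points.
identity_error = max(abs(cosh(x) ** 2 - sinh(x) ** 2 - 1.0) for x in points)

# Deviation of the numerical derivative of tanh from 1 - tanh^2(x).
derivative_error = max(
    abs(central_difference(tanh, x) - (1.0 - tanh(x) ** 2)) for x in points
)
print(identity_error, derivative_error)
```

Both printed numbers should be small; the second one is limited by the accuracy of the difference quotient rather than by the identity itself.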
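The solutions of (2.5.6) corresponding to 'initial data', $f(x_0)$ and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$, are obtained in the proof of the following theorem by considering a function that is related to $f$. As a consequence of (2.5.6), that function is a solution of a differential equation of the form (2.5.6) with $p = 0$. The solutions of these special equations are obvious.

Theorem 2.5.17. Let $p, q \in \mathbb{R}$, $D := (p^2/4) - q$ and $x_0, y_0, y_0' \in \mathbb{R}$. Then the unique solution to
\[ f''(x) + p f'(x) + q f(x) = 0 \]
satisfying $f(x_0) = y_0$ and $f'(x_0) = y_0'$ is given by
\[ f(x) = e^{-p(x - x_0)/2} \left[ y_0 \cosh( D^{1/2} (x - x_0) ) + D^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) \sinh( D^{1/2} (x - x_0) ) \right] \]
for $x \in \mathbb{R}$ if $D > 0$,
\[ f(x) = e^{-p(x - x_0)/2} \left[ y_0 + \left( \frac{p y_0}{2} + y_0' \right) (x - x_0) \right] \]
for $x \in \mathbb{R}$ if $D = 0$ and
\[ f(x) = e^{-p(x - x_0)/2} \left[ y_0 \cos( |D|^{1/2} (x - x_0) ) + |D|^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) \sin( |D|^{1/2} (x - x_0) ) \right] \]
for $x \in \mathbb{R}$ if $D < 0$.

Proof. For this, we first notice that a function $h : \mathbb{R} \to \mathbb{R}$ satisfies
\[ h''(x) + p h'(x) + q h(x) = 0 \qquad (2.5.7) \]
for all $x \in \mathbb{R}$ if and only if
\[ \bar{h}''(x) - \left( \frac{p^2}{4} - q \right) \bar{h}(x) = 0 \qquad (2.5.8) \]
for all $x \in \mathbb{R}$ where $\bar{h} : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \bar{h}(x) := e^{px/2} h(x) \qquad (2.5.9) \]
for all $x \in \mathbb{R}$. Indeed, it follows by Theorem 2.4.8 that $\bar{h}$ is twice differentiable if and only if $h$ is twice differentiable and in this case that
\[ \bar{h}'(x) = e^{px/2} \left[ h'(x) + \frac{p}{2} h(x) \right] , \quad \bar{h}''(x) = e^{px/2} \left[ h''(x) + p h'(x) + \frac{p^2}{4} h(x) \right] \]
for all $x \in \mathbb{R}$. The last implies that
\[ \begin{aligned} \bar{h}''(x) - \left( \frac{p^2}{4} - q \right) \bar{h}(x) &= e^{px/2} \left[ h''(x) + p h'(x) + \frac{p^2}{4} h(x) \right] - \left( \frac{p^2}{4} - q \right) e^{px/2} h(x) \\ &= e^{px/2} \left( h''(x) + p h'(x) + q h(x) \right) \end{aligned} \]
for all $x \in \mathbb{R}$. Hence (2.5.8) is satisfied for all $x \in \mathbb{R}$ if and only if (2.5.7) is satisfied for all $x \in \mathbb{R}$. In addition, $h(x_0) = y_0$ and $h'(x_0) = y_0'$ if and only if
\[ \bar{h}(x_0) = y_0 \, e^{p x_0/2} , \quad \bar{h}'(x_0) = \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} . \qquad (2.5.10) \]
For the solution of (2.5.8) and (2.5.10), we consider three cases. If $D := (p^2/4) - q > 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0 \, e^{p x_0/2} \cosh( D^{1/2} (x - x_0) ) + D^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} \sinh( D^{1/2} (x - x_0) ) \]
for $x \in \mathbb{R}$. If $D = 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0 \, e^{p x_0/2} + \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} (x - x_0) \]
for $x \in \mathbb{R}$.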
If $D < 0$, then a solution to (2.5.8) and (2.5.10) is given by
\[ \bar{h}(x) = y_0 \, e^{p x_0/2} \cos( |D|^{1/2} (x - x_0) ) + |D|^{-1/2} \left( \frac{p y_0}{2} + y_0' \right) e^{p x_0/2} \sin( |D|^{1/2} (x - x_0) ) \]
for $x \in \mathbb{R}$. Hence, finally, the statement of this theorem follows by (2.5.9) and by Theorem 2.5.15.

According to Theorem 2.3.44, the inverse of a strictly increasing continuous function $f$ defined on a closed interval $[a, b]$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$, is continuous, too. If the restriction of $f$ to $(a, b)$ is in addition differentiable with strictly positive derivative, then the restriction of $f^{-1}$ to $(f(a), f(b))$ is also differentiable. Moreover, the following theorem gives an often used representation of the derivative of the latter in terms of the derivative of $f$.

Theorem 2.5.18. (Derivatives of inverse functions) Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and such that $f'(x) > 0$ for every $x \in (a, b)$. Then the inverse function $f^{-1}$ is defined on $[f(a), f(b)]$ as well as differentiable on $(f(a), f(b))$ with
\[ \left( f^{-1} \right)'(y) = \frac{1}{f'( f^{-1}(y) )} \qquad (2.5.11) \]
for all $y \in (f(a), f(b))$.

Proof. By Theorem 2.5.10, it follows that $f$ is strictly increasing and hence that there is an inverse function $f^{-1}$ for $f$. Further, by Theorem 2.3.44 $f^{-1}$ is continuous, and by Theorem 2.3.43 it follows that $f([a, b]) = [f(a), f(b)]$ and hence that $f^{-1}$ is defined on $[f(a), f(b)]$. Now let $y \in (f(a), f(b))$ and $y_1, y_2, \dots$ be a sequence in $(f(a), f(b)) \setminus \{y\}$ which is convergent to $y$. Then $f^{-1}(y_1), f^{-1}(y_2), \dots$ is a sequence in $(a, b) \setminus \{ f^{-1}(y) \}$ which, by the continuity of $f^{-1}$, converges to $f^{-1}(y)$. Hence it follows for every index $n$ that
\[ \frac{f^{-1}(y_n) - f^{-1}(y)}{y_n - y} = \left[ \frac{f( f^{-1}(y_n) ) - f( f^{-1}(y) )}{f^{-1}(y_n) - f^{-1}(y)} \right]^{-1} \]
and hence by the differentiability of $f$ in $f^{-1}(y)$, by $f'( f^{-1}(y) ) > 0$ and by Theorem 2.3.4 the statement (2.5.11).

The following examples give two applications of the previous theorem. The second example is from the field of General Relativity.

Example 2.5.19.
Calculate the derivative of $\ln$, $\arcsin$, $\arccos$ and $\arctan$. Solution: By Theorem 2.5.18, it follows that
$$\ln'(x) = \frac{1}{\exp'(\ln(x))} = \frac{1}{\exp(\ln(x))} = \frac{1}{x}$$
for every $x \in (0, \infty)$,
$$\arcsin'(x) = \frac{1}{\sin'(\arcsin(x))} = \frac{1}{\cos(\arcsin(x))} = \frac{1}{\sqrt{1 - \sin^2(\arcsin(x))}} = \frac{1}{\sqrt{1 - x^2}},$$
$$\arccos'(x) = \frac{1}{\cos'(\arccos(x))} = -\frac{1}{\sin(\arccos(x))} = -\frac{1}{\sqrt{1 - \cos^2(\arccos(x))}} = -\frac{1}{\sqrt{1 - x^2}}$$
for all $x \in (-1, 1)$ and
$$\arctan'(x) = \frac{1}{\tan'(\arctan(x))} = \frac{1}{(1 + \tan^2)(\arctan(x))} = \frac{1}{1 + x^2}$$
for every $x \in \mathbb{R}$.

Fig. 46: Graph of the auxiliary function $h$ from Example 2.5.20.

Example 2.5.20. In terms of Kruskal coordinates, the radial coordinate projection $r : \Omega \to (0, \infty)$ of the Schwarzschild solution of Einstein's field equation is given by
$$r(u, v) = h^{-1}(u^2 - v^2)$$
for all $(v, u) \in \Omega$ where $h : (0, \infty) \to (-1, \infty)$ is defined by
$$h(x) := \left( \frac{x}{2M} - 1 \right) e^{x/(2M)}$$
for all $x \in (0, \infty)$. Here
$$\Omega := \{ (v, u) \in \mathbb{R}^2 : u^2 - v^2 > -1 \},$$
and $M > 0$ is the mass of the black hole. In addition, geometrical units are used where the speed of light and the gravitational constant have the value 1. Finally, $h$ is bijective and $h^{-1}$ is differentiable. Calculate
$$\frac{\partial r}{\partial v}, \quad \frac{\partial r}{\partial u}$$
for all $(v, u) \in \Omega$. Solution: For this, let $(v, u) \in \Omega$. In a first step, we conclude by Theorem 2.5.18 that
$$\frac{\partial r}{\partial v}(u, v) = -2v\, (h^{-1})'(u^2 - v^2) = -2v\, [\, h'(h^{-1}(u^2 - v^2)) \,]^{-1} = -2v\, [\, h'(r(u, v)) \,]^{-1},$$
$$\frac{\partial r}{\partial u}(u, v) = 2u\, (h^{-1})'(u^2 - v^2) = 2u\, [\, h'(h^{-1}(u^2 - v^2)) \,]^{-1} = 2u\, [\, h'(r(u, v)) \,]^{-1}.$$
Since
$$h'(x) = \frac{1}{2M}\, e^{x/(2M)} + \frac{1}{2M} \left( \frac{x}{2M} - 1 \right) e^{x/(2M)} = \frac{x}{4M^2}\, e^{x/(2M)}$$
for every $x > 0$, this implies that
$$\frac{\partial r}{\partial v}(u, v) = -8M^2 v\, \frac{e^{-r/(2M)}}{r}(u, v), \quad \frac{\partial r}{\partial u}(u, v) = 8M^2 u\, \frac{e^{-r/(2M)}}{r}(u, v).$$

Fig. 47: Graphs of power functions corresponding to positive ($\geq 0$) and negative ($\leq 0$) $a$, respectively. See Definition 2.5.21.

The following defines general powers of strictly positive ($> 0$) real numbers in terms of the exponential function and its inverse, the natural logarithm function.

Definition 2.5.21. (General powers) For every $a \in \mathbb{R}$, we define the corresponding power function by
$$x^a := e^{a \cdot \ln x}$$
for all $x > 0$.
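Definition 2.5.21 translates directly into a computation. The following Python sketch is only an illustration: the function name `power` is an assumption made here and does not appear in the text. It evaluates $x^a$ as $e^{a \cdot \ln x}$ and compares the result with Python's built-in power operator at a few sample points.

```python
import math


def power(x: float, a: float) -> float:
    """General power x**a for x > 0, computed as exp(a * ln x) as in Definition 2.5.21."""
    if x <= 0:
        raise ValueError("the base x must be strictly positive")
    return math.exp(a * math.log(x))


# Compare with the built-in power operator at a few sample points.
for x, a in [(2.0, 0.5), (3.0, -1.5), (10.0, 0.0)]:
    assert math.isclose(power(x, a), x ** a, rel_tol=1e-12)
```

The guard on `x` mirrors the restriction to strictly positive bases in the definition; the comparison uses a relative tolerance because both sides are floating point approximations.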
By Theorem 2.4.10, the power function $((0, \infty) \to \mathbb{R}, x \mapsto x^a)$ is differentiable with derivative
$$\frac{a}{x}\, e^{a \cdot \ln x} = \frac{a}{x}\, e^{\ln x}\, e^{(a-1) \cdot \ln x} = a\, e^{(a-1) \cdot \ln x} = a\, x^{a-1}$$
in $x > 0$. Also the following calculational rules are simple consequences of the definition of general powers and basic properties of the exponential function and its inverse.

Example 2.5.22. Show that
$$x^0 = 1, \quad x^a y^a = (xy)^a, \quad x^a x^b = x^{a+b}, \quad (x^a)^b = x^{ab}$$
for all $x, y > 0$ and $a, b \in \mathbb{R}$. Solution: By Definition 2.5.21, it follows for such $x$, $y$, $a$ and $b$ that
$$x^0 = e^{0 \cdot \ln x} = e^0 = 1,$$
$$x^a y^a = e^{a \cdot \ln x}\, e^{a \cdot \ln y} = e^{a \cdot \ln x + a \cdot \ln y} = e^{a (\ln x + \ln y)} = e^{a \ln(xy)} = (xy)^a,$$
$$x^a x^b = e^{a \cdot \ln x}\, e^{b \cdot \ln x} = e^{a \cdot \ln x + b \cdot \ln x} = e^{(a+b) \cdot \ln x} = x^{a+b},$$
$$(x^a)^b = e^{b \cdot \ln x^a} = e^{b \cdot \ln e^{a \ln x}} = e^{b\, a \ln x} = e^{ab \ln x} = x^{ab}.$$

Fig. 48: Graphs of $\ln$ and polynomial approximations corresponding to $a = 1/2$, $1$ and $2$. See Example 2.5.23.

The following derives frequently used polynomial approximations of the natural logarithm function as a further example for the application of Theorem 2.5.10. A verbalization of the estimate (2.5.12) is that the natural logarithm '$\ln(x)$ is growing more slowly than any positive power of $x$ for large $x$'.

Example 2.5.23. Show that for every $a > 0$
$$\ln(x) < \frac{1}{a}\, (x^a - 1) \tag{2.5.12}$$
for all $x > 1$. (See Exercise 2.3.2 for an application of the case $a = 1/2$.) Solution: Define the continuous function $f : [1, \infty) \to \mathbb{R}$ by
$$f(x) := \frac{1}{a}\, (x^a - 1) - \ln(x)$$
for all $x \geq 1$. Then $f$ is differentiable on $(1, \infty)$ with
$$f'(x) = \frac{1}{a} \cdot \frac{a}{x}\, e^{a \cdot \ln x} - \frac{1}{x} = \frac{e^{a \cdot \ln x} - 1}{x} > 0$$
for $x > 1$ and $f(1) = 0$. Hence (2.5.12) follows by Theorem 2.5.10.

Another important consequence of Theorem 2.5.6 is given by Taylor's theorem which is frequently employed in applications. For its formulation, we need to introduce some additional terminology.

Definition 2.5.24. If $m, n \in \mathbb{N}$ are such that $m \leq n$ and $a_m, \dots, a_n \in \mathbb{R}$, we define
$$\sum_{k=m}^{n} a_k := a_m + a_{m+1} + \dots + a_n.$$
Note that, as a consequence of the associative law for addition, it is not necessary to indicate the order in which the summation is to be performed.
Further, obviously,
$$\sum_{k=m}^{n} (a_k + b_k) = \sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k \quad \text{and} \quad \sum_{k=m}^{n} \lambda\, a_k = \lambda \sum_{k=m}^{n} a_k$$
for every $\lambda \in \mathbb{R}$ and $b_m, \dots, b_n \in \mathbb{R}$. In addition, we define for every $n \in \mathbb{N}$ the corresponding factorial $n!$ recursively by
$$0! := 1, \quad (k+1)! := (k+1)\, k!$$
for every $k \in \mathbb{N}$. Hence in particular, $1! = 1$, $2! = 2$, $3! = 6$, $4! = 24$, $5! = 120$ and so forth.

For the motivation of Taylor's theorem, we consider a twice continuously differentiable function $f$ defined on an open subinterval $(a, b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a, b)$. According to the mean value theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that
$$\frac{f(x) - f(x_0)}{x - x_0} = f'(\xi).$$
This implies that
$$f(x) = f(x_0) + (x - x_0)\, f'(\xi).$$
Further, by the same reasoning, it follows the existence of $\zeta$ in the open interval between $x_0$ and $\xi$ such that
$$f'(\xi) = f'(x_0) + (\xi - x_0)\, f''(\zeta).$$
Hence we conclude that
$$f(x) = f(x_0) + (x - x_0)\, f'(\xi) = f(x_0) + (x - x_0)\, [\, f'(x_0) + (\xi - x_0)\, f''(\zeta) \,] = f(x_0) + (x - x_0)\, f'(x_0) + (x - x_0)(\xi - x_0)\, f''(\zeta)$$
and
$$|f(x) - f(x_0) - (x - x_0)\, f'(x_0)| \leq |x - x_0|^2\, |f''(\zeta)|. \tag{2.5.13}$$
Since $f''$ is continuous, we conclude that for every arbitrary preassigned error bound $\varepsilon > 0$ there is an interval $I$ around $x_0$ such that
$$|f(x) - f(x_0) - (x - x_0)\, f'(x_0)| \leq \varepsilon$$
for every $x \in I$. Hence the restriction of $f$ to $I$ can be approximated within an error $\varepsilon$ by the restriction to $I$ of the linear polynomial function
$$p_1(x) := f(x_0) + (x - x_0)\, f'(x_0)$$
for all $x \in \mathbb{R}$. This polynomial is called the linearization of $f$ around the point $x_0$. Note that
$$p_1(x_0) = f(x_0), \quad p_1'(x_0) = f'(x_0).$$
Therefore, $p_1$ is the uniquely determined linear, i.e. of order $\leq 1$, polynomial that assumes the value $f(x_0)$ in $x_0$ and whose derivative assumes the value $f'(x_0)$ in $x_0$. In particular, its graph coincides with the tangent to the graph of $f$ in $x_0$. In applications, functions are frequently replaced by their linearizations around appropriate points to simplify subsequent reasoning.
Often, this is done without performing an error estimate like (2.5.13) in the hope that the error introduced by the replacement is in some sense 'small'. If $f$ is sufficiently often differentiable, it is to be expected that $f$ can be described with higher precision near $x_0$ by polynomials of higher order than 1. Indeed, this is true and Taylor's theorem provides such so called Taylor polynomials $p_n$ for $n \in \mathbb{N}$ with $n > 1$. It is tempting to speculate that $p_n$ is the uniquely determined polynomial of order $\leq n$ such that
$$p_n(x_0) = f(x_0), \quad p_n^{(k)}(x_0) = f^{(k)}(x_0)$$
for $k = 1, \dots, n$. In that case, $p_n$ is easily determined to be of the form
$$p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k$$
for all $x \in \mathbb{R}$ where we set $f^{(0)} := f$ and $f$ is assumed to be $(n+1)$-times continuously differentiable. Indeed, this speculation turns out to be correct. We first give Taylor's theorem in a form which resembles that of the mean value theorem. Its proof proceeds by application of the latter to a skillfully constructed auxiliary function.

Theorem 2.5.25. (Taylor's theorem) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $a$ and $b$ be two different elements from $I$. Then there is $c$ in the open interval between $a$ and $b$ such that
$$f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\, (b - a)^k + \frac{f^{(n)}(c)}{n!}\, (b - a)^n \tag{2.5.14}$$
where $f^{(0)} := f$ and $(b - a)^0 := 1$.

Proof. Define the auxiliary function $g : I \to \mathbb{R}$ by
$$g(x) := f(b) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}\, (b - x)^k$$
for all $x \in I$. Then it follows that $g(b) = 0$ and moreover that $g$ is differentiable with
$$g'(x) = -\sum_{k=0}^{n-1} \frac{f^{(k+1)}(x)}{k!}\, (b - x)^k + \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{(k-1)!}\, (b - x)^{k-1} = -\frac{f^{(n)}(x)}{(n-1)!}\, (b - x)^{n-1}$$
for all $x \in I$. Define a further auxiliary function $h : I \to \mathbb{R}$ by
$$h(x) := g(x) - \left( \frac{b - x}{b - a} \right)^n g(a)$$
for all $x \in I$. Then it follows that $h(a) = h(b) = 0$ and that $h$ is differentiable with
$$h'(x) = -\frac{f^{(n)}(x)}{(n-1)!}\, (b - x)^{n-1} + n\, \frac{(b - x)^{n-1}}{(b - a)^n}\, g(a)$$
for all $x \in I$. Hence according to Theorem 2.5.4, there is $c$ in the open interval between $a$ and $b$ such that
$$0 = h'(c) = -\frac{f^{(n)}(c)}{(n-1)!}\, (b - c)^{n-1} + n\, \frac{(b - c)^{n-1}}{(b - a)^n}\, g(a)$$
which implies (2.5.14).

Taylor's Theorem 2.5.25 is usually applied in the following form.

Corollary 2.5.26. (Taylor's formula) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval of length $L$ and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $x_0 \in I$ and $C \geq 0$ be such that $|f^{(n)}(x)| \leq C$ for all $x \in I$. Then
$$\left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k \right| \leq \frac{C L^n}{n!}$$
for all $x \in I$.

Remark 2.5.27. The polynomial
$$p_{n-1}(x) := \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k$$
for all $x \in \mathbb{R}$ in Corollary 2.5.26 is called 'the $(n-1)$-degree Taylor polynomial of $f$ centered at $x_0$'. In particular, it follows (for the case $n = 2$) that
$$p_1(x) = f(x_0) + f'(x_0)\, (x - x_0)$$
for all $x \in \mathbb{R}$, which is also called the 'linearization or linear approximation of $f$ at $x_0$', and
$$|f(x) - p_1(x)| \leq \frac{C L^2}{2}$$
if $C \geq 0$ is such that $|f''(x)| \leq C$ for all $x \in I$. In applications, one often meets the notation
$$f(x) \approx p_1(x)$$
saying that $f$ and $p_1$ are approximately the same near $x_0$. If the error can be seen to be 'negligible' for the application, this often leads to a replacement of $f$ by its linearization.

Fig. 49: Graphs of $f$ and $p_1$ from Example 2.5.28.

Example 2.5.28. Calculate the linearization $p_1$ of $f : [-1, \infty) \to \mathbb{R}$ defined by
$$f(x) := \sqrt{1 + x}$$
for all $x \in [-1, \infty)$ at $x = 0$, and estimate its error on the interval $[0, 1/2]$. Solution: $f$ is twice differentiable on $(-1, \infty)$ with
$$f'(x) = \frac{1}{2}\, (1 + x)^{-1/2}, \quad f''(x) = -\frac{1}{4}\, (1 + x)^{-3/2}.$$
Hence $p_1$ is given by
$$p_1(x) = 1 + \frac{1}{2}\, x$$
for all $x \in \mathbb{R}$. Because of
$$|f''(x)| = \frac{1}{4}\, (1 + x)^{-3/2} \leq \frac{1}{4}$$
for all $x \in [0, 1/2]$, it follows from Remark 2.5.27, with $C = 1/4$ and $L = 1/2$, and from $f(x) \geq 1$ for such $x$ that the absolute value of the relative error satisfies
$$\frac{|p_1(x) - f(x)|}{|f(x)|} \leq \frac{1}{32}$$
for all $x \in [0, 1/2]$.

We know that the first derivative of a function $f$ in a point $p$ of its domain provides the slope of the tangent to the graph of the function in the point $(p, f(p))$. Hence it is natural to ask whether there is a geometrical interpretation of the second derivative.
Indeed, such an interpretation can be given in terms of the way the graph of the function 'bends'. This can be seen with the help of Taylor's theorem. For this, we consider a three times continuously differentiable function $f$ defined on an open subinterval $(a, b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a, b)$. According to Taylor's theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}\, (x - x_0)^2 + \frac{f'''(\xi)}{6}\, (x - x_0)^3.$$
If $f''(x_0) \neq 0$, since $f'''$ is continuous, it follows for $x$ sufficiently near to $x_0$ that
$$\frac{|f''(x_0)|}{2}\, (x - x_0)^2 > \frac{|f'''(\xi)|}{6}\, |x - x_0|^3$$
and hence that
$$f(x) > f(x_0) + f'(x_0)(x - x_0)$$
if $f''(x_0) > 0$ and
$$f(x) < f(x_0) + f'(x_0)(x - x_0)$$
if $f''(x_0) < 0$. Hence if $f''(x_0) > 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ exceeds the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies above the tangent at $x_0$. In this case, we say that $f$ is locally convex at $x_0$. If $f''(x_0) < 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ is smaller than the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies below the tangent at $x_0$. In this case, we say that $f$ is locally concave at $x_0$.

Definition 2.5.29. (Convexity / concavity of a differentiable function) Let $f : (a, b) \to \mathbb{R}$ be differentiable where $a, b \in \mathbb{R}$ are such that $a < b$. We call $f$ convex (concave) if
$$f(x) > f(x_0) + f'(x_0)(x - x_0) \quad (\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,)$$
for all $x_0, x \in (a, b)$ such that $x_0 \neq x$.

The following theorem proves the convexity / concavity of a function under less restrictive assumptions than our motivational analysis above.

Theorem 2.5.30. Let $f : (a, b) \to \mathbb{R}$ be twice differentiable on $(a, b)$, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in (a, b)$. Then
$$f(x) > f(x_0) + f'(x_0)(x - x_0) \quad (\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,)$$
for all $x_0, x \in (a, b)$ such that $x_0 \neq x$, i.e., '$f$ is convex' ('$f$ is concave').

Proof.
First, we consider the case that $f''(x) > 0$ for all $x \in (a, b)$. For this, let $x_0 \in (a, b)$ and $x \in (a, b)$ be such that $x > x_0$. According to Theorem 2.5.6, there is $c \in (x_0, x)$ such that
$$\frac{f(x) - f(x_0)}{x - x_0} = f'(c).$$
By Theorem 2.5.10, it follows that $f'$ is strictly increasing on $[x_0, x]$ and hence that
$$\frac{f(x) - f(x_0)}{x - x_0} = f'(c) > f'(x_0)$$
and that
$$f(x) > f(x_0) + f'(x_0)(x - x_0). \tag{2.5.15}$$
Analogously, for $x \in (a, b)$ such that $x < x_0$, it follows that there is $c \in (x, x_0)$ such that
$$\frac{f(x_0) - f(x)}{x_0 - x} = f'(c)$$
and that $f'$ is strictly increasing on $[x, x_0]$ and hence that
$$\frac{f(x_0) - f(x)}{x_0 - x} = f'(c) < f'(x_0)$$
which implies (2.5.15). In the remaining case that $f''(x) < 0$ for all $x \in (a, b)$, application of the previous to $-f$ gives
$$-f(x) > -f(x_0) - f'(x_0)(x - x_0)$$
and hence
$$f(x) < f(x_0) + f'(x_0)(x - x_0)$$
for all $x_0 \in (a, b)$ and $x \in (a, b) \setminus \{x_0\}$.

Fig. 50: Graphs of $\exp$ along with linearizations around $x = 1$, $2$ and $3$.

Example 2.5.31. The exponential function $\exp$ is convex because of $\exp''(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. See Fig. 50.

Example 2.5.32. Find the intervals of convexity and concavity of $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := x^4 + x^3 - 2x^2 + 1$$
for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with
$$f'(x) = 4x^3 + 3x^2 - 4x,$$
$$f''(x) = 12x^2 + 6x - 4 = 12 \left( x^2 + \frac{1}{2}\, x - \frac{1}{3} \right) = 12 \left( x + \frac{1}{4} + \sqrt{\frac{19}{48}} \right) \left( x + \frac{1}{4} - \sqrt{\frac{19}{48}} \right)$$
for all $x \in \mathbb{R}$. Hence $f$ is convex on the intervals
$$\left( -\infty,\, -\frac{1}{4} - \sqrt{\frac{19}{48}} \right), \quad \left( -\frac{1}{4} + \sqrt{\frac{19}{48}},\, \infty \right)$$
and concave on the interval
$$\left( -\frac{1}{4} - \sqrt{\frac{19}{48}},\, -\frac{1}{4} + \sqrt{\frac{19}{48}} \right).$$

Fig. 51: Graph of $f$ from Example 2.5.32 and parallels to the $y$-axis through its inflection points.

The following theorem gives another useful characterization of convexity for a function defined on an interval $I$ of $\mathbb{R}$. Such a function is convex if and only if for every $x, y \in I$ such that $x < y$ the graph of $f|_{(x, y)}$ lies below the straight line ('secant') between $(x, f(x))$ and $(y, f(y))$.

Theorem 2.5.33. Let $f : (a, b) \to \mathbb{R}$ be differentiable on $(a, b)$ where $a, b \in \mathbb{R}$ are such that $a < b$.
Then $f$ is convex if and only if
$$f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x} = f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x}$$
for all $x, y, z \in (a, b)$ such that $x < z < y$.

Fig. 52: Graph of a convex function (black) and secant (blue). Compare Theorem 2.5.33.

Proof. If $f$ is convex, we conclude as follows. For the first step, let $x, y \in (a, b)$ be such that $x < y$. As a consequence of the convexity of $f$, it follows that
$$f(y) > f(x) + f'(x)(y - x), \quad f(x) > f(y) + f'(y)(x - y)$$
and hence that
$$f'(x) < \frac{f(y) - f(x)}{y - x} < f'(y).$$
This is true for all $x, y \in (a, b)$ such that $x < y$. Note that this implies that $f'$ is strictly increasing. For the second step, let $x, y, z \in (a, b)$ be such that $x < z < y$. By the mean value theorem, Theorem 2.5.6, it follows the existence of $\xi \in (x, y)$ such that
$$\frac{f(y) - f(x)}{y - x} = f'(\xi).$$
In the case that $\xi \leq z$, it follows with the help of the first step that
$$\frac{f(y) - f(x)}{y - x} = f'(\xi) \leq f'(z) < \frac{f(y) - f(z)}{y - z}$$
and hence that
$$f(z) < f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x}.$$
In the case that $z \leq \xi$, it follows with the help of the first step that
$$\frac{f(y) - f(x)}{y - x} = f'(\xi) \geq f'(z) > \frac{f(z) - f(x)}{z - x}$$
and hence that
$$f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x}.$$
On the other hand, if
$$f(z) < f(x) + (z - x)\, \frac{f(y) - f(x)}{y - x} = f(y) - (y - z)\, \frac{f(y) - f(x)}{y - x}$$
for all $x, y, z \in (a, b)$ such that $x < z < y$, we conclude as follows. For this, note that the previous implies that
$$\frac{f(z) - f(x)}{z - x} < \frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z}.$$
In the following, let $x, y, z, \xi \in (a, b)$ be such that $x < z < \xi < y$. It follows from the assumption that
$$\frac{f(z) - f(x)}{z - x} < \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x}.$$
From this, it follows by taking the limit $z \to x$ that
$$f'(x) \leq \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x}.$$
This implies that
$$f'(x) < \frac{f(y) - f(x)}{y - x}$$
and therefore that
$$f(y) > f(x) + f'(x)(y - x). \tag{2.5.16}$$
Also, it follows from the assumption that
$$\frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} < \frac{f(y) - f(\xi)}{y - \xi}.$$
From this, it follows by taking the limit $\xi \to y$ that
$$\frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} \leq f'(y).$$
This implies that
$$\frac{f(y) - f(x)}{y - x} < f'(y)$$
and therefore that
$$f(x) > f(y) + f'(y)(x - y). \tag{2.5.17}$$
Since (2.5.16) is true for all $x, y \in (a, b)$ such that $x < y$, we conclude the following for $x, y \in (a, b)$ such that $y < x$:
$$f(x) > f(y) + f'(y)(x - y).$$
Finally, from this and (2.5.17), it follows that
$$f(x) > f(y) + f'(y)(x - y)$$
for all $x, y \in (a, b)$ such that $x \neq y$.

A typical application of Theorems 2.5.30 and 2.5.33 is given in the following example, which derives an occasionally used lower bound for the sine function.

Fig. 53: Graph of the sine function (black) and secant (blue). Compare Example 2.5.34.

Example 2.5.34. Show that
$$\sin(x) \geq 2x/\pi \tag{2.5.18}$$
for all $x \in [0, \pi/2]$. Solution: By application of Theorem 2.5.30, it follows that the restriction of $-\sin$ to $(0, \pi)$ is convex. According to Theorem 2.5.33, this implies that
$$-\sin(x) \leq -\sin(1/n) - [\, x - (1/n) \,]\, \frac{1 - \sin(1/n)}{(\pi/2) - (1/n)}$$
for all $x \in [1/n, \pi/2]$ where $n \in \mathbb{N}^*$. By taking the limit $n \to \infty$, this leads to
$$-\sin(x) \leq -2x/\pi$$
for all $x \in (0, \pi/2]$. From the last and the fact that (2.5.18) is trivially satisfied for $x = 0$, it follows the validity of (2.5.18) for all $x \in [0, \pi/2]$.

We know that the vanishing of the first derivative in a point $x$ of the domain is a necessary, but in general not sufficient, condition for a differentiable function $f$ to assume a local maximum or minimum in $x$. In that case, the tangent to the graph of $f$ in the point $(x, f(x))$ is horizontal. If $f$ is in addition twice continuously differentiable such that $f''(x) < 0$ ($f''(x) > 0$), then it follows by the continuity of $f''$ that the restriction of $f''$ to a sufficiently small interval around $x$ assumes strictly negative (strictly positive) values, hence that the restriction of $f$ to that interval is concave (convex), and therefore that $x$ marks the position of a local maximum (minimum) of $f$.

Fig. 54: Graph of $f$ from Example 2.5.36.

Theorem 2.5.35. (Sufficient condition for the existence of a local minimum/maximum) Let $f$ be a twice continuously differentiable real-valued function on some open interval $I$ of $\mathbb{R}$.
Further, let $x_0 \in I$ be a critical point of $f$ such that $f''(x_0) > 0$ ($f''(x_0) < 0$). Then $f$ has a local minimum (maximum) at $x_0$.

Proof. Since $f''$ is continuous with $f''(x_0) > 0$ ($f''(x_0) < 0$), there is an open interval $J$ around $x_0$ such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in J$. (Otherwise there is for every $n \in \mathbb{N}^*$ some $y_n \in I$ such that $|y_n - x_0| < 1/n$ and $f''(y_n) \leq 0$ ($f''(y_n) \geq 0$). In particular, this implies that $\lim_{n \to \infty} y_n = x_0$ and by the continuity of $f''$ also that $\lim_{n \to \infty} f''(y_n) = f''(x_0)$. Hence it follows by Theorem 2.3.12 that $f''(x_0) \leq 0$ ($f''(x_0) \geq 0$).) Hence it follows by Theorem 2.5.30 that $f(x) > f(x_0)$ ($f(x) < f(x_0)$) for all $x \in J \setminus \{x_0\}$.

Example 2.5.36. Find the values of the local maxima and minima of
$$f(x) := \ln\left( \frac{5}{4} + \sin^2(x) \right)$$
for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with
$$f'(x) = \frac{\sin(2x)}{\frac{5}{4} + \sin^2(x)}, \quad f''(x) = \frac{2 \cos(2x)}{\frac{5}{4} + \sin^2(x)} - \frac{\sin^2(2x)}{\left( \frac{5}{4} + \sin^2(x) \right)^2}$$
for all $x \in \mathbb{R}$. Hence the critical points of $f$ are at $x_k := k\pi/2$, $k \in \mathbb{Z}$, and for each $k \in \mathbb{Z}$:
$$f''(x_k) = \frac{2\, (-1)^k}{\frac{5}{4} + \sin^2(x_k)}.$$
Hence it follows by Theorem 2.5.35 that, for each $k \in \mathbb{Z}$, $f$ has a local minimum of value $\ln(5/4)$ at $x_{2k}$ and a local maximum of value $\ln(9/4)$ at $x_{2k+1}$.

Another important consequence of Theorem 2.5.6 (or its equivalent, Rolle's theorem) is given by Cauchy's extended mean value theorem which is the basis for the proof of L'Hospital's rule, Theorem 2.5.38, for the calculation of indeterminate forms. The proof of the extended mean value theorem proceeds by application of Rolle's theorem to a skillfully devised auxiliary function.

Theorem 2.5.37. (Cauchy's extended mean value theorem) Let $f, g : [a, b] \to \mathbb{R}$ be continuous functions where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f, g$ be continuously differentiable on $(a, b)$ and such that $g'(x) \neq 0$ for all $x \in (a, b)$. Then there is $c \in (a, b)$ such that
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}. \tag{2.5.19}$$

Proof.
Since $g'$ is continuous with $g'(x) \neq 0$ for all $x \in (a, b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a, b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a, b)$. Since $g$ is continuous, from this also follows that $g(b) \neq g(a)$. Define the auxiliary function $h : [a, b] \to \mathbb{R}$ by
$$h(x) := f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\, (g(x) - g(a))$$
for all $x \in [a, b]$. Then $h$ is continuous as well as differentiable on $(a, b)$ such that
$$h'(x) = f'(x) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(x)$$
for all $x \in (a, b)$ and $h(a) = h(b) = 0$. Hence according to Theorem 2.5.4, there is $c \in (a, b)$ such that
$$h'(c) = f'(c) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(c) = 0$$
which implies (2.5.19).

L'Hospital's rule goes back to Johann Bernoulli who instructed the young French marquis Guillaume François Antoine de L'Hospital in 1692 in the new Leibnizian discipline of calculus during a visit in Paris. Johann signed a contract under which, in return for a regular salary, he agreed to send L'Hospital his discoveries in mathematics, to be used as the marquis might wish. The result was that one of Johann's chief contributions to calculus from 1694 has ever since been known as L'Hospital's rule on indeterminate forms after its publication in L'Hospital's book 'Analyse des infiniment petits' in 1696 [69]. L'Hospital's book was the first textbook on calculus and was met with great success.

We already met an indeterminate form in Example 2.3.54 where it was proved that
$$\lim_{x \to 0,\, x \neq 0} \frac{\sin(x)}{x} = 1. \tag{2.5.20}$$
Formally, that limit is of the 'indeterminate' type
$$\frac{0}{0}$$
where the last formal expression is obtained by replacing $\sin(x)$ and $x$ in the quotient $\sin(x)/x$ by
$$\lim_{x \to 0,\, x \neq 0} \sin(x) \quad \text{and} \quad \lim_{x \to 0,\, x \neq 0} x,$$
respectively. Since $\sin$ and the identical function on $\mathbb{R}$ are continuous and according to the limit laws, this expression would give the correct result for the limit (2.5.20) if it involved division by a non-zero number.
But, since division by zero is not defined, that expression is not defined and hence 'indeterminate'. The following theorem treats also indeterminate limits of the type
$$\frac{\infty}{\infty}.$$
The calculation of limits of other indeterminate types can usually be reduced to the calculation of limits of these two types. L'Hospital's rule is a simple consequence of Cauchy's extended mean value theorem.

Theorem 2.5.38. (Indeterminate forms/L'Hospital's rule) Let $f : (a, b) \to \mathbb{R}$ and $g : (a, b) \to \mathbb{R}$ be continuously differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $g'(x) \neq 0$ for all $x \in (a, b)$. Further, let
$$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \tag{2.5.21}$$
or let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a, b)$ as well as
$$\lim_{x \to a} \frac{1}{|f(x)|} = \lim_{x \to a} \frac{1}{|g(x)|} = 0. \tag{2.5.22}$$
Finally, let
$$\lim_{x \to a} \frac{f'(x)}{g'(x)}$$
exist. Then
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \tag{2.5.23}$$

Proof. Since $g'$ is continuous with $g'(x) \neq 0$ for all $x \in (a, b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a, b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a, b)$. First, we consider the case (2.5.21). Then $f$ and $g$ can be extended to continuous functions on $[a, b)$ assuming the value $0$ in $a$. Now, let $x_0, x_1, \dots$ be a sequence of elements of $(a, b)$ converging to $a$. Then by Theorem 2.5.37 for every $n \in \mathbb{N}$ there is a corresponding $c_n \in (a, x_n)$ such that
$$\frac{f(x_n)}{g(x_n)} = \frac{f'(c_n)}{g'(c_n)}.$$
Obviously, the sequence $c_0, c_1, \dots$ is converging to $a$, and hence it follows that
$$\lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(c_n)}{g'(c_n)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \tag{2.5.24}$$
and hence, finally, that (2.5.23). Finally, we consider the second case. So let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a, b)$, and in addition let (2.5.22) be satisfied. Further, let $x_0, x_1, \dots$ be some sequence of elements of $(a, b)$ converging to $a$ and let $b' \in (a, b)$. Because of (2.5.22), there is $n_0 \in \mathbb{N}$ such that
$$\left| \frac{f(b')}{f(x_n)} \right| < 1 \quad \text{and} \quad \left| \frac{g(b')}{g(x_n)} \right| < 1$$
for all $n \in \mathbb{N}$ such that $n \geq n_0$. Then according to Theorem 2.5.37 for any such $n$, there is a corresponding $c_n \in (x_n, b')$ such that
$$\frac{f(x_n) - f(b')}{g(x_n) - g(b')} = \frac{f'(c_n)}{g'(c_n)}.$$
Hence it follows that
$$\frac{f(x_n)}{g(x_n)} = \frac{1 - \frac{g(b')}{g(x_n)}}{1 - \frac{f(b')}{f(x_n)}} \cdot \frac{f'(c_n)}{g'(c_n)}$$
and since $c_0, c_1, \dots$ is converging to $a$, by (2.5.22) and Theorem 2.3.4, it follows the relation (2.5.24) and hence, finally, (2.5.23).

Example 2.5.39. Find
$$\lim_{x \to 0} x \ln(x).$$
Solution: Define $f(x) := \ln(x)$ and $g(x) := 1/x$ for all $x \in (0, 1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(x) = -1/x^2 \neq 0$ for all $x \in (0, 1)$. Further, $|f(x)| = |\ln(x)| > 0$, $|g(x)| = |1/x| = 1/|x| > 0$ for all $x \in (0, 1)$. Finally, (2.5.22) is satisfied and
$$\lim_{x \to 0} \frac{f'(x)}{g'(x)} = \lim_{x \to 0} (-x) = 0.$$
Hence according to Theorem 2.5.38:
$$\lim_{x \to 0} x \ln(x) = 0.$$

Example 2.5.40. Determine
$$\lim_{x \to \infty} x e^{-x}.$$
Solution: Define $f(y) := 1/y$ and $g(y) := \exp(1/y)$ for all $y \in (0, 1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(y) = -y^{-2} \exp(1/y) \neq 0$ for all $y \in (0, 1)$. Further, $|f(y)| = 1/|y| > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0, 1)$. Finally, (2.5.22) is satisfied and
$$\lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0} \frac{1}{e^{1/y}} = 0.$$
Hence according to Theorem 2.5.38:
$$\lim_{x \to \infty} x e^{-x} = \lim_{y \to 0} \frac{1/y}{e^{1/y}} = 0.$$

Example 2.5.41. Calculate
$$\lim_{x \to \infty} x^2 e^{-x}.$$
Solution: Define $f(y) := 1/y^2$ and $g(y) := \exp(1/y)$ for all $y \in (0, 1)$. Then $f$ and $g$ are continuously differentiable as well as such that $g'(y) = -\exp(1/y)/y^2 \neq 0$ for all $y \in (0, 1)$. Further, $|f(y)| = 1/y^2 > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0, 1)$. Finally, (2.5.22) is satisfied and by Example 2.5.40
$$\lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0} \frac{2/y}{e^{1/y}} = 0.$$
Hence according to Theorem 2.5.38:
$$\lim_{x \to \infty} x^2 e^{-x} = \lim_{y \to 0} \frac{1/y^2}{e^{1/y}} = 0.$$

Remark 2.5.42. Recursively in this way, it can be shown that
$$\lim_{x \to \infty} x^n e^{-x} = 0$$
for all $n \in \mathbb{N}$.

That the condition that $g'(x) \neq 0$ for all $x \in (a, b)$ in Theorem 2.5.38 is not redundant can be seen from the following example.

Example 2.5.43.
For this define
$$f(x) := \frac{2}{x} + \sin\left( \frac{2}{x} \right), \quad g(x) := \left[ \frac{2}{x} + \sin\left( \frac{2}{x} \right) \right] e^{\sin(1/x)}$$
for all $x \in (0, 2/5)$. Then $f$ and $g$ are continuously differentiable and satisfy
$$\lim_{x \to 0} \frac{1}{|f(x)|} = \lim_{x \to 0} \frac{1}{|g(x)|} = 0.$$
Since
$$\frac{f(x)}{g(x)} = e^{-\sin(1/x)}$$
for all $x \in (0, 2/5)$, $(f/g)(x)$ does not have a limit value for $x \to 0$. Further, it follows that
$$f'(x) = -\frac{2}{x^2} \left[ 1 + \cos\left( \frac{2}{x} \right) \right] = -\frac{4}{x^2} \cos^2\left( \frac{1}{x} \right),$$
$$g'(x) = -\frac{1}{x^2} \cos\left( \frac{1}{x} \right) e^{\sin(1/x)} \left[ \frac{2}{x} + \sin\left( \frac{2}{x} \right) + 4 \cos\left( \frac{1}{x} \right) \right]$$
and hence that
$$\frac{f'(x)}{g'(x)} = \frac{4x \cos(1/x)\, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)}$$
for all
$$x \in (0, 2/5) \setminus \left\{ \frac{2}{(2k+1)\pi} : k \in \mathbb{N} \right\}.$$
We notice that
$$\lim_{x \to 0} \frac{4x \cos(1/x)\, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)} = 0.$$
This does not contradict Theorem 2.5.38 since $g'$ has zeros of the form $2/((2k+1)\pi)$, $k \in \mathbb{N}$. Hence there is no $b > 0$ such that the restrictions of $f$ and $g$ to $(0, b)$ would satisfy the assumptions in Theorem 2.5.38.

The following example shows that in general the existence of
$$\lim_{x \to a} \frac{f(x)}{g(x)}$$
does not imply the existence of
$$\lim_{x \to a} \frac{f'(x)}{g'(x)}.$$

Example 2.5.44. For this, define
$$f(x) := x \sin(1/x^2) \exp(-1/x), \quad g(x) := \exp(-1/x)$$
for all $x > 0$. Then
$$\lim_{x \to 0} f(x) = 0, \quad \lim_{x \to 0} g(x) = 0, \quad \lim_{x \to 0} \frac{f(x)}{g(x)} = 0.$$
Further,
$$f'(x) = \frac{1}{x^2} \left[ x(x+1) \sin(1/x^2) - 2 \cos(1/x^2) \right] \exp(-1/x), \quad g'(x) = \frac{1}{x^2} \exp(-1/x),$$
$$\frac{f'(x)}{g'(x)} = x(x+1) \sin(1/x^2) - 2 \cos(1/x^2)$$
for all $x > 0$. Hence $f'/g'$ does not have a limit value for $x \to 0$.

For the motivation of the following contraction mapping lemma, we consider a method of calculating square roots of numbers which can be traced back to ancient Greek times, but there are indications that this method was already known in ancient Babylonia. For this, we consider the problem of approximation of $\sqrt{N}$ by fractions where $N$ is some non-zero natural number. If $q$ is some non-zero positive rational number such that
$$q < \sqrt{N},$$
then it follows that
$$q^2 < N, \quad q < \frac{N}{q}$$
and hence that
$$\frac{N}{q} > \frac{N}{\sqrt{N}} = \sqrt{N}.$$
Hence, the arithmetic mean
$$\bar{q} := \frac{1}{2} \left( q + \frac{N}{q} \right)$$
of $q$ and $N/q$, which is the midpoint of the interval $[q, N/q]$, might be a better approximation to $\sqrt{N}$ than $q$.
Indeed, a little calculation gives that
$$\bar{q}^2 - N = \frac{1}{4} \left( q + \frac{N}{q} \right)^2 - N = \frac{1}{4} \left( q^2 + 2N + \frac{N^2}{q^2} \right) - N = \frac{1}{4} \left( q^2 - 2N + \frac{N^2}{q^2} \right) = \frac{1}{4} \left( q - \frac{N}{q} \right)^2 = \frac{1}{4q^2}\, (N - q^2)^2 = \frac{1}{4} \left( \frac{N}{q^2} - 1 \right) (N - q^2) > 0$$
and hence that
$$\bar{q}^2 - N < N - q^2$$
if
$$\frac{1}{4} \left( \frac{N}{q^2} - 1 \right) < 1$$
which is equivalent to $N < 5q^2$. Hence if
$$q^2 < N < 5q^2,$$
then $\bar{q}$ is a better approximation to $\sqrt{N}$ than $q$. Note that $\bar{q}$ does not satisfy the same inequalities since $\bar{q}^2 > N$. On the other hand,
$$\bar{\bar{q}} := \frac{1}{2} \left( \bar{q} + \frac{N}{\bar{q}} \right)$$
satisfies
$$\bar{\bar{q}}^2 - N = \frac{1}{4\bar{q}^2}\, (N - \bar{q}^2)^2 = \frac{1}{4} \left( 1 - \frac{N}{\bar{q}^2} \right) (\bar{q}^2 - N) > 0$$
and hence is a better approximation to $\sqrt{N}$ than $\bar{q}$ since
$$\frac{1}{4} \left( 1 - \frac{N}{\bar{q}^2} \right) < 1.$$
Hence by continuing this process, we arrive at rational approximations to $\sqrt{N}$ whose accuracy increases in every step. For instance for $N = 2$ and $q = 1$ (note that $q^2 < N < 5q^2$ since $1 < 2 < 5$), we arrive at the following rational approximations to $\sqrt{2}$:
$$\frac{3}{2}, \quad \frac{17}{12}, \quad \frac{577}{408}, \quad \frac{665857}{470832}, \quad \frac{886731088897}{627013566048}.$$
The value $17/12$, which gives $\sqrt{2}$ within an error of $3 \cdot 10^{-3}$, was used as a common rough approximation of $\sqrt{2}$ by the Babylonians. Starting from $q = 17/12$, Babylonian arithmetic leads to the fraction
$$1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} = \frac{30547}{21600}$$
which was found on the Babylonian tablet YBC 7289 and gives $\sqrt{2}$ within an error of $6 \cdot 10^{-7}$.

Fig. 55: Graph of $T$ for the case $N = 2$ and auxiliary curves.

A modern interpretation of the process in terms of maps is that
$$q, \quad T(q), \quad T(T(q)) = (T \circ T)(q), \quad T(T(T(q))) = (T \circ (T \circ T))(q), \quad \dots,$$
where $T : (0, \infty) \to (0, \infty)$ is defined by
$$T(x) := \frac{1}{2} \left( x + \frac{N}{x} \right)$$
for every $x > 0$, gives a sequence of approximations to $\sqrt{N}$ of increasing accuracy. We expect that
$$\lim_{n \to \infty} T^n(q) = \sqrt{N}$$
where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0, \infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$. Indeed, if $x_0, x_1, \dots$ converges to some element $x^* \in (0, \infty)$, where $x > 0$ and
$$x_k := T^k(x)$$
for all $k \in \mathbb{N}$, then
$$x_{k+1} = T^{k+1}(x) = T(T^k(x)) = T(x_k) = \frac{1}{2} \left( x_k + \frac{N}{x_k} \right),$$
and hence it follows by the limit laws that
$$x^* = \lim_{k \to \infty} x_{k+1} = \lim_{k \to \infty} \frac{1}{2} \left( x_k + \frac{N}{x_k} \right) = \frac{1}{2} \left( x^* + \frac{N}{x^*} \right).$$
As a consequence, in this case, $x^*$ satisfies the equation
$$\frac{1}{2} \left( x^* - \frac{N}{x^*} \right) = 0$$
or equivalently
$$(x^*)^2 - N = 0$$
which implies that
$$x^* = \sqrt{N}$$
since it was assumed that $x^* > 0$.

Fig. 56: $(n, x_n)$ for $x = 1$ and $n = 1$ to $n = 20$.

It is natural to ask in what sense $\sqrt{N}$ is a particular point for the map $T$. For this, we notice that
$$T(\sqrt{N}) = \frac{1}{2} \left( \sqrt{N} + \frac{N}{\sqrt{N}} \right) = \sqrt{N},$$
that is, $T$ maps $\sqrt{N}$ onto itself, i.e., $\sqrt{N}$ is a so called 'fixed point' of the map $T$. Also, every fixed point $x^*$ of $T$ satisfies the equation
$$x^* = \frac{1}{2} \left( x^* + \frac{N}{x^*} \right)$$
which is equivalent to $x^* = \sqrt{N}$, i.e., there is no other fixed point of $T$. Finally, it is natural to ask whether there is a special property of the map that leads to the convergence of $x_0, x_1, \dots$. For this, we notice that for $x \geq \sqrt{N}$ and $y \geq \sqrt{N}$, it follows that $N \leq xy$ and hence that
$$|T(x) - T(y)| = \frac{1}{2} \left| x + \frac{N}{x} - y - \frac{N}{y} \right| = \frac{1}{2} \left| x - y - \frac{N}{xy}\, (x - y) \right| = \frac{1}{2} \left( 1 - \frac{N}{xy} \right) |x - y| \leq \frac{1}{2}\, |x - y|.$$
This leads to
$$|T(T(x)) - T(T(y))| \leq \frac{1}{2}\, |T(x) - T(y)| \leq \frac{1}{4}\, |x - y|$$
and inductively to
$$|T^k(x) - T^k(y)| \leq \frac{1}{2^k}\, |x - y|$$
for all $k \in \mathbb{N}$. Since $\sqrt{N}$ is a fixed point of $T$, this implies that
$$|T^k(x) - \sqrt{N}| \leq \frac{1}{2^k}\, |x - \sqrt{N}|$$
and hence that
$$\lim_{k \to \infty} T^k(x) = \sqrt{N}. \tag{2.5.25}$$
Since for $x < \sqrt{N}$, as already observed above, it follows that
$$(T(x))^2 - N = \frac{(x^2 - N)^2}{4x^2} > 0$$
and hence that $T(x) > \sqrt{N}$, we conclude that (2.5.25) holds for all $x > 0$. In addition, we notice that the fact that $N \in \mathbb{N}$ was nowhere used in the previous discussion. As a consequence, summarizing that discussion, we proved the following result.

Theorem 2.5.45. (Babylonian method of approximating roots of real numbers, I) Let $a > 0$ and $T : (0, \infty) \to (0, \infty)$ be defined by
$$T(x) := \frac{1}{2} \left( x + \frac{a}{x} \right)$$
for every $x > 0$. Then
$$\lim_{k \to \infty} T^k(x) = \sqrt{a}$$
for every $x > 0$, where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0, \infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.

Functions $T$ satisfying
$$|T(x) - T(y)| \leq \alpha\, |x - y|$$
for some $0 \leq \alpha < 1$ and all $x, y$ of their domain are called contractions.
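The iteration of Theorem 2.5.45 is easy to try out. The following Python sketch is an illustration only: the names `babylonian_step` and `babylonian_iterates` are assumptions made here and are not part of the text. It uses exact rational arithmetic to compute the first iterates $T(1), T^2(1), \dots$ for $a = 2$, which are the rational approximations to $\sqrt{2}$ listed in the discussion above, and checks the contraction estimate $|T(x) - T(y)| \leq \frac{1}{2} |x - y|$ for one pair of points $x, y \geq \sqrt{2}$.

```python
from fractions import Fraction
import math


def babylonian_step(x: Fraction, a: Fraction) -> Fraction:
    """One application of the map T(x) = (x + a/x) / 2."""
    return (x + a / x) / 2


def babylonian_iterates(x: Fraction, a: Fraction, steps: int) -> list:
    """Return the list [T(x), T^2(x), ..., T^steps(x)] of exact iterates."""
    iterates = []
    for _ in range(steps):
        x = babylonian_step(x, a)
        iterates.append(x)
    return iterates


a = Fraction(2)
approximations = babylonian_iterates(Fraction(1), a, 5)
for k, q in enumerate(approximations, start=1):
    # Exact fraction together with its floating point distance to sqrt(2).
    print(k, q, abs(float(q) - math.sqrt(2.0)))

# Contraction estimate for two sample points x, y >= sqrt(2).
x, y = Fraction(3, 2), Fraction(2)
assert abs(babylonian_step(x, a) - babylonian_step(y, a)) <= abs(x - y) / 2
```

Exact fractions are used so that the iterates can be compared directly with the fractions in the text; for larger step counts the numerators and denominators roughly double in length with every step, so floating point arithmetic would be the more practical choice.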
We notice from the previous discussion that if such a function $T$ has a fixed point $x^*$ and maps its domain into that domain, then it follows as above that
$$\lim_{k \to \infty} T^k(x) = x^*$$
for all $x \in D(T)$. On the other hand, in many cases the existence of such a fixed point is not obvious, but it can be shown with the help of Theorem 2.3.33 if the domain of $T$ is a closed interval of $\mathbb{R}$. This is the additional point that is treated in Lemma 2.5.46.

Lemma 2.5.46. (Contraction mapping lemma on the real line) Let $T : [a, b] \to \mathbb{R}$ be such that $T([a, b]) \subset [a, b]$ where $a, b \in \mathbb{R}$ are such that $a < b$. In addition, let $T$ be a contraction, i.e., let there exist $\alpha \in [0, 1)$ such that
$$|T(x) - T(y)| \leq \alpha\, |x - y| \tag{2.5.26}$$
for all $x, y \in [a, b]$. Then $T$ has a unique fixed point, i.e., a unique $x^* \in [a, b]$ such that
$$T(x^*) = x^*.$$
Further,
$$|x - x^*| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.27}$$
and
$$\lim_{n \to \infty} T^n(x) = x^* \tag{2.5.28}$$
for every $x \in [a, b]$ where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{[a, b]}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.

Proof. Note that (2.5.26) implies that $T$ is continuous. Further, define the hence continuous function $f : [a, b] \to \mathbb{R}$ by $f(x) := |x - T(x)|$ for all $x \in [a, b]$. Note that $x \in [a, b]$ is a fixed point of $T$ if and only if it is a zero of $f$. By Theorem 2.3.33, $f$ assumes its minimum in some point $x^* \in [a, b]$. Hence
$$0 \leq f(x^*) \leq f(T(x^*)) = |T(x^*) - T(T(x^*))| \leq \alpha\, |x^* - T(x^*)| = \alpha f(x^*)$$
and therefore $f(x^*) = 0$ since the assumption $f(x^*) \neq 0$ leads to the contradiction that $1 \leq \alpha$. If $\bar{x} \in [a, b]$ is a fixed point of $T$, then
$$|x^* - \bar{x}| = |T(x^*) - T(\bar{x})| \leq \alpha\, |x^* - \bar{x}|$$
and hence $\bar{x} = x^*$ since the assumption $\bar{x} \neq x^*$ leads to the contradiction that $1 \leq \alpha$. Finally, let $x \in [a, b]$. Then
$$|x - x^*| = |x - T(x^*)| = |x - T(x) + T(x) - T(x^*)| \leq |x - T(x)| + |T(x) - T(x^*)| \leq f(x) + \alpha\, |x - x^*|$$
and hence (2.5.27). Further from (2.5.27) and
$$|T^n(x) - x^*| = |T^n(x) - T^n(x^*)| \leq \alpha^n\, |x - x^*| \leq \frac{\alpha^n}{1 - \alpha}\, f(x),$$
it follows (2.5.28) since $\lim_{n \to \infty} \alpha^n = 0$.

The following example applies the previous lemma to the Babylonian method of approximating roots of real numbers.
It uses more widely applicable methods to prove that the domain of the function $T$ is invariant and that $T$ is a contraction.

Example 2.5.47. (Babylonian method of approximating roots of real numbers, II) Let $a > 0$ and $N \in \mathbb{N}$ be such that $N^2 > a$. Finally, define $T : [\sqrt{a}, N] \to \mathbb{R}$ by
$$T(x) := \frac{1}{2}\left(x + \frac{a}{x}\right)$$
for all $x \in [\sqrt{a}, N]$. Then
$$\lim_{n \to \infty} T^{n}(N) = \sqrt{a}. \tag{2.5.29}$$

Proof. First, we note that
$$T(\sqrt{a}) = \sqrt{a}, \qquad T(N) = \frac{1}{2}\left(N + \frac{a}{N}\right) < N,$$
and hence that $\sqrt{a}$ is a fixed point of $T$. Further, $T$ is twice differentiable on $(\sqrt{a}, N)$ with derivatives
$$T'(x) = \frac{1}{2}\left(1 - \frac{a}{x^2}\right) = \frac{x^2 - a}{2x^2} > 0, \qquad T''(x) = \frac{a}{x^3} > 0$$
for all $x \in (\sqrt{a}, N)$. Hence $T, T'$ are strictly increasing according to Theorem 2.3.44, $T([\sqrt{a}, N]) \subset [\sqrt{a}, N]$ and
$$0 \leq T'(x) \leq \frac{1}{2}\left(1 - \frac{a}{N^2}\right) < \frac{1}{2}.$$
In particular, it follows by Theorem 2.5.6 that
$$|T(x) - T(y)| \leq \frac{1}{2}\,|x - y|$$
for all $x, y \in [\sqrt{a}, N]$. By Lemma 2.5.46, it follows that $T$ has a unique fixed point, which hence is given by $\sqrt{a}$, and in particular (2.5.29).

For instance, for $N = 2$ and $a = 2$, we get in this way the first five approximating fractions
$$\frac{3}{2},\ \frac{17}{12},\ \frac{577}{408},\ \frac{665857}{470832},\ \frac{886731088897}{627013566048}$$
with corresponding errors (according to (2.5.27)) equal to or smaller than
$$\frac{1}{6},\ \frac{1}{204},\ \frac{1}{235416},\ \frac{1}{313506783024},\ \frac{1}{555992422174934068969056}.$$

Fig. 57: Graph of $(\mathbb{R} \to \mathbb{R},\ x \mapsto x^3 - 2x - 5)$.

In 1669, Newton submitted a paper with title 'De analysi per aequationes numero terminorum infinitas' to the Royal Society. This paper was published only much later in 1712 [82]. Among other things, Newton introduces by example an iterative method for the approximation of zeros of differentiable functions which is now named after him. For this, he considers the equation
$$x^3 - 2x - 5 = 0. \tag{2.5.30}$$
As a first approximation to the solution in the interval $[2, 3]$, compare Fig. 57, he uses $x = 2$.
Substitution of $x = 2 + p$ into (2.5.30) gives
$$0 = x^3 - 2x - 5 = (2 + p)^3 - 2(2 + p) - 5 = 8 + 12p + 6p^2 + p^3 - 4 - 2p - 5 = -1 + 10p + 6p^2 + p^3.$$
Neglecting higher order terms in $p$ than first order, i.e., effectively replacing the last polynomial in $p$ by its linearization around $p = 0$, he arrives at the equation
$$-1 + 10p = 0$$
and hence at $p = 1/10$. In this way, he arrives at $x = 2.1$ as a second approximation to the solution. He then substitutes $x = 2.1 + q$ into (2.5.30) to obtain
$$0 = x^3 - 2x - 5 = (2.1 + q)^3 - 2(2.1 + q) - 5 = 9.261 + 13.23q + 6.3q^2 + q^3 - 4.2 - 2q - 5 = 0.061 + 11.23q + 6.3q^2 + q^3.$$
Again, neglecting higher order terms in $q$ than first order, i.e., effectively replacing the last polynomial in $q$ by its linearization around $q = 0$, he arrives at the equation
$$0.061 + 11.23q = 0$$
and hence at $q = -0.0054$, where only the leading digits are retained. In this way, he arrives at the rounded result $x = 2.0946$ as a third approximation to the solution which approximates that solution within an error of $5 \cdot 10^{-5}$.

It has to be taken into account that Newton's paper does not contain references to his fluxions or fluents. On the other hand, in spirit, his procedure matches today's version of the method. The only difference is that today's method does not involve substitutions. It proceeds as follows. We define $f := (\mathbb{R} \to \mathbb{R},\ x \mapsto x^3 - 2x - 5)$. Starting from the first approximation $x_0 = 2$ of its zero, we calculate the linearization $p_{10}$ of $f$ around $x_0$. Since $f'(x) = 3x^2 - 2$ for all $x \in \mathbb{R}$, we arrive at
$$p_{10}(x) = f(x_0) + f'(x_0)(x - x_0) = -1 + 10\,(x - 2) = -21 + 10x$$
for all $x \in \mathbb{R}$. Effectively replacing the function $f$ by its linearization $p_{10}$, we arrive at the equation
$$-21 + 10x = 0$$
and hence, as Newton, at the second approximation $x_1 = 2.1$. In the second step, we calculate the linearization $p_{11}$ of $f$ around $x_1$. It is given by
$$p_{11}(x) = f(x_1) + f'(x_1)(x - x_1) = 0.061 + 11.23\,(x - 2.1) = -23.522 + 11.23x$$
for all $x \in \mathbb{R}$.
Again, effectively replacing the function $f$ by its linearization $p_{11}$, we arrive at the equation
$$-23.522 + 11.23x = 0$$
and hence, as Newton, at the third approximation $x_2 = 2.0946$ by repeating Newton's way of rounding the result.

From today's perspective, Newton's method can be viewed as a particular application of the contraction mapping lemma. This is also used below to prove the convergence of the method and to provide an error estimate. The method is iterative and used to approximate solutions of the equation
$$f(x) = 0$$
where $f : I \to \mathbb{R}$ is a differentiable function on a non-trivial open interval $I$ of $\mathbb{R}$. Starting from an approximation $x_n \in I$ to such a solution, the correction $x_{n+1}$ is given by the zero of the linearization around $x_n$,
$$f(x_n) + f'(x_n)(x - x_n), \quad x \in \mathbb{R},$$
and hence by
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{2.5.31}$$
assuming $f'(x_n) \neq 0$, thereby essentially replacing the function $f$ by its linearization around $x_n$.

It is instructive to analyze the recursion (2.5.31) in a little more detail, where we assume that $f'$ is in addition continuous. For this, let us assume that $f(x_n) > 0$. If $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be located to the left (= towards smaller values) of $x_n$ and, indeed, in this case, $x_{n+1}$ is to the left of $x_n$. If $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect the solution to be located to the right (= towards larger values) of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. If $f(x_n) < 0$ and $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be to the right of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. Finally, if $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect that the solution is to the left of $x_n$ and also $x_{n+1}$ is to the left of $x_n$. Hence the recursion (2.5.31) shows a very intuitive behavior.
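The recursion (2.5.31) translates almost literally into code. The following is a minimal sketch under the assumption that the derivative is supplied explicitly; the function name `newton` and its parameters are illustrative choices, not notation from the text. It is applied to Newton's own equation (2.5.30).

```python
def newton(f, df, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        slope = df(x)
        if slope == 0:
            # The linearization around x has no zero in this case.
            raise ZeroDivisionError("f'(x_n) = 0: the Newton step is undefined")
        xs.append(x - f(x) / slope)
    return xs


if __name__ == "__main__":
    # Newton's example: f(x) = x^3 - 2x - 5, started at x_0 = 2.
    f = lambda x: x**3 - 2 * x - 5
    df = lambda x: 3 * x**2 - 2
    for n, x_n in enumerate(newton(f, df, 2.0, 5)):
        print(n, x_n, f(x_n))
```

The printed residuals $f(x_n)$ give a quick check of how fast the iterates approach the zero; no rounding of intermediate results is performed here, in contrast to Newton's hand calculation.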
On the other hand, for this reasoning to make sense, the solution should be very near to $x_n$. In particular, in cases where $x_n$ is near to a critical point of $f$, the method usually fails because it leads to corrections of a much too large size. Finally, since the graph of the linearization of $f$ around $x_n$ gives the tangent to the graph of $f$ in the point $(x_n, f(x_n))$, $x_{n+1}$ gives the abscissa of the intersection of that tangent with the x-axis. This fact gives a geometric interpretation to Newton's method.

Fig. 58: Graph of $f$ from Example 2.5.48 ($a = 2$) and Newton steps starting from $x_0 = 4$.

The following example shows that the Babylonian method of approximating roots of real numbers can be seen as a particular case of Newton's method.

Example 2.5.48. Let $a > 0$. Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) := x^2 - a$$
for all $x \in \mathbb{R}$. Then
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - a}{2x_n} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right)$$
for $x_n \neq 0$, which is the iteration used in Theorem 2.5.45.

Theorem 2.5.49. (Newton's method) Let $f$ be a twice differentiable real-valued function on a non-trivial open interval $I$ of $\mathbb{R}$. Further, let $I$ contain a zero $x_0$ of $f$ and be such that $f'(x) \neq 0$ for all $x \in I$ and in particular such that
$$\left|\frac{f(x) f''(x)}{[f'(x)]^2}\right| \leq \alpha$$
for all $x \in I$ and some $\alpha \in \mathbb{R}$ satisfying $0 \leq \alpha < 1$. Then
$$\lim_{n \to \infty} T^{n}(x) = x_0 \tag{2.5.32}$$
for all $x \in I$ where
$$T(x) := x - \frac{f(x)}{f'(x)}$$
for all $x \in I$. Finally,
$$|x - x_0| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.33}$$
for all $x \in I$.

Proof. First, it follows that $T$ is differentiable with derivative
$$T'(x) = \frac{f(x) f''(x)}{[f'(x)]^2}$$
for all $x \in I$ and that $x_0$ is a fixed point of $T$. By Theorem 2.5.6 it follows that
$$\left|\frac{T(x) - T(x_0)}{x - x_0}\right| = \left|\frac{T(x) - x_0}{x - x_0}\right| < 1$$
for all $x \in I$ different from $x_0$ and hence that
$$|T(x) - x_0| \leq |x - x_0| \tag{2.5.34}$$
for all $x \in I$. Now let $[a, b]$, where $a, b \in \mathbb{R}$ are such that $a < b$, be some closed subinterval of $I$ containing $x_0$. Then it follows by (2.5.34) that $T([a, b]) \subset [a, b]$ and by Theorem 2.5.6 that
$$\left|\frac{T(x) - T(y)}{x - y}\right| \leq \alpha$$
for all $x, y \in [a, b]$ satisfying $x \neq y$ and hence that
$$|T(x) - T(y)| \leq \alpha\,|x - y|$$
for all $x, y \in [a, b]$.
Hence by Lemma 2.5.46, the relations (2.5.32) and (2.5.33) follow for all $x \in [a, b]$.

Fig. 59: Zero of $f$ from Example 2.5.50 given by the x-coordinate of the intersection of two graphs.

The following example gives an application of Newton's method to a standard problem from quantum theory.

Example 2.5.50. Find an approximation $x_1$ to the solution of
$$x_0 = \cos(x_0)$$
such that $|x_0 - x_1| < 10^{-6}$.

Solution: Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := x - \cos(x)$ for all $x \in \mathbb{R}$. Then $f$ is infinitely often differentiable with
$$f'(x) = 1 + \sin(x), \qquad f''(x) = \cos(x), \qquad \frac{f(x) f''(x)}{[f'(x)]^2} = \frac{\cos(x)\,(x - \cos(x))}{(1 + \sin(x))^2}$$
where only in the last identity it has to be assumed that $x$ is different from $-\pi/2 + 2k\pi$ for all $k \in \mathbb{Z}$. Further,
$$f\!\left(\frac{\pi}{6}\right) = \frac{\pi}{6} - \frac{\sqrt{3}}{2} < 0, \qquad f\!\left(\frac{\pi}{4}\right) = \frac{\pi}{4} - \frac{1}{\sqrt{2}} > 0,$$
and hence according to Theorem 2.3.37, $f$ has a zero in the open interval $I := (\pi/6, \pi/4)$. Also
$$f'(x) = 1 + \sin(x) > 1 + \sin(\pi/6) = 3/2 > 0$$
for all $x \in I$. Further,
$$\left(\frac{f f''}{{f'}^2}\right)'(x) = \frac{3\cos(x) + x \sin(x) - 2x}{(1 + \sin(x))^2}$$
and
$$3\cos(x) + x \sin(x) - 2x \geq 3\cos\!\left(\frac{\pi}{4}\right) + \frac{\pi}{6}\sin\!\left(\frac{\pi}{6}\right) - 2 \cdot \frac{\pi}{4} = \frac{3}{\sqrt{2}} - \frac{5\pi}{12} > 0$$
and hence $f f''/{f'}^2$ is strictly increasing on $[\pi/6, \pi/4]$ as a consequence of Theorem 2.5.10. Therefore,
$$-\frac{1}{27}\,(9 - \sqrt{3}\,\pi) = \frac{\cos\!\left(\frac{\pi}{6}\right)\left(\frac{\pi}{6} - \cos\!\left(\frac{\pi}{6}\right)\right)}{\left(1 + \sin\!\left(\frac{\pi}{6}\right)\right)^2} < \frac{f(x) f''(x)}{[f'(x)]^2} < \frac{\cos\!\left(\frac{\pi}{4}\right)\left(\frac{\pi}{4} - \cos\!\left(\frac{\pi}{4}\right)\right)}{\left(1 + \sin\!\left(\frac{\pi}{4}\right)\right)^2} = \frac{\pi - 2\sqrt{2}}{8 + 6\sqrt{2}}$$
and
$$\left|\frac{f(x) f''(x)}{[f'(x)]^2}\right| < \frac{1}{27}\,(9 - \sqrt{3}\,\pi) < \alpha := \frac{1}{3} < 1$$
for all $x \in I$. Starting the iteration from $0.7$ gives to six decimal places
$$0.739436,\ 0.739085$$
with the corresponding errors
$$0.000527006,\ 4.08749 \cdot 10^{-8}.$$
Hence the zero $x_0$ of $f$ in the interval $I$ agrees with
$$x_1 = 0.739085$$
to six decimal places. That there is no further zero of $f$ can be concluded as follows. Since the derivative of $f$ does not vanish in the interval $(-\pi/2, \pi/2)$, it follows by Theorem 2.5.4 that there are no other zeros in this interval. Further, for $|x| \geq \pi/2\ (> 1)$ there is no zero of $f$ because $|\cos(x)| \leq 1$ for all $x \in \mathbb{R}$. The quantity
$$U_2 + (U_1 - U_2)\,x_0^2$$
is the ground state energy of a particle in a finite square well potential with $U_3 = U_1$, $\gamma = 0$, $KL = 2$.
See [79]. Problems 1) Give the maximum and minimum values of f and the points where they are assumed. a) b) c) d) e) f) g) f pxq : x2 5x 7 , x P r5, 0s , f ptq : t3 6t2 9t 14 , t P r5, 0s , f psq : s4 p8{3qs3 6s2 1 , s P r5, 5s , f ptq : 4pt 3q2 pt2 1q , t P r1, 4s , f pxq : p9x 12q{p3x2 4q , x P r1, 0s , f pxq : px2 x 1q exppxq , x P r0.3, 1.5s , ? f pxq : exppx{ 3 q cospxq , x P r0, 8q . 2) Consider a projectile that is shot into the atmosphere. If v ¥ 0 is the component of its speed at initial time 0 in the vertical direction, its height z ptq above ground at time t ¥ 0 is given by z ptq vt gt2 {2 where g 9.81m{s2 is the acceleration due to gravity and it is assumed that z p0q 0. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground. 3) Reconsider the situation from previous problem, but now with inclusion of a viscous frictional force opposing the motion of the projectile. Then z ptq α rpv αg qp1 exppt{αqq gts where it is again assumed that z p0q 0. Here α m{λ where m ¡ 0 is the mass of the projectile and λ ¡ 0 is a parameter describing the strength of the friction. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground. 206 4) Let a ¡ 0 and b ¡ 0. Find an equation for the straight line through the point pa, bq that cuts from the first quadrant a triangle of minimum area. State that area. 5) Let a ¡ 0 and b ¡ 0. Find an equation for the straight line through the point pa, bq whose intersection with the first quadrant is shortest. State the length of that intersection. 6) Find the maximal volume of a cylinder of given surface area A ¡ 0. 7) From each corner of a rectangular cardboard of side lengths a ¡ 0 and b ¡ 0, a square of side length x ¥ 0 is removed, and the edges are turned up to form an open box. Find the value of x for which the volume of that box is maximal. 
8) A rectangular movie screen on a wall is h1 -meters above the floor and h2 -meters high. Imagine yourself sitting in front of the screen and looking into the direction of its center. Measured in this direction, what distance x from the wall will give you the largest viewing angle θ of the movie screen? [This is the angle between the straight lines that connect your eyes to the lowest and the highest points on the screen.] Assume that the height of your eyes above the floor is hs meters where hs h1 . 9) Imagine that the upper half-plane H : R p0, 8q and the lower half-plane H : R p8, 0q of R2 are filled with different ‘physical media’ with the xaxis being the interface I. Further, let px1 , y1 q P H , px2 , y2 q P H . Light rays in both media proceed along straight lines and at constant speeds v1 and v2 , respectively. According to Fermat’s principle, a ray connecting px1 , y1 q and px2 , y2 q chooses the path that takes the least time. Show that that path satisfies Snell’s law, i.e., sinpθ1 q{ sinpθ2 q v1 {v2 where θ1 (θ2 ) is the angle of the part of the ray in H Y I (H Y I) with the normal to the xaxis originating from its intersection with I. 10) For the following functions find the intervals of increase and decrease, the local maximum and minimum values and their locations and the intervals of convexity and concavity and the inflection points. Use the gathered information to sketch the graph of the function. If available, check your result with a graphing device. a) f psq : 7s4 3s2 1 , s P R , b) f ptq : t4 p8{3qt3 6t2 3 , t P R , c) f pxq : 4px 3q2 px2 1q , x P R . 207 11) For the following functions find vertical and horizontal asymptotes, the intervals of increase and decrease, the local maximum and minimum values and their locations and the intervals of convexity and concavity and the inflection points. Use the gathered information to sketch the graph of the function. If available, check your result with a graphing device. a) f pxq : x{p1 x2 q , x P R , ? 
b) f pxq : x2 1 x , x P R , ? ? c) f pxq : p9x 12q{p3x2 4q , x P R zt2{ 3, 2{ 3u . 12) Calculate the linearization of f around the given point. a) b) c) d) e) f) g) f pxq : p1 xqn , x ¡ 1 , around x 0 where n P R , f pxq : lnpxq , x ¡ 0 , around x 1 , f pϕq : sinpϕq , ϕ P R , around ϕ 0 , f pϕq : tanpϕq , ϕ P pπ {2, π {2q , around ϕ 0 , f pxq : sinhpxq : pex ex q{2 , x P R , around x 0 , f pϕq : lnrp5{4q cosp3ϕqs , ϕ P R , around ϕ0 3π {4 , f pxq : p3x2 x 5q{p5x2 6x 3q , x P R ztx P R : 5x2 6x 3 0u , around x 1 . 13) Show that a) b) c) d) e) f) g) p1 p1 xqn ¡ 1 nx for all x ¡ 0 and n ¥ 1 , xqn 1 nx for all x ¡ 0 and 0 n 1 , ln x ¤ x 1 for all x ¡ 0 , sinpϕq ϕ for all ϕ ¡ 0 , tanpϕq ¡ ϕ for all ϕ P p0, π {2q , sinhpxq : pex ex q{2 ¡ x for all x ¡ 0 . ln x ¥ px 1q{x for all x ¡ 0 . 14) Calculate a x a x , b) lim 1 , xÑ8 xÑ8 x x x tanpxq tanpxq , d) lim , c) lim xÑ0 1 cospxq xÑ0 x sinpxq 1 1 e) lim ? , f) lim , xÑ0 xÑ0 x x sinpxq lnpxq r lnpxq sn 2 , g) lim , h) lim xÑ8 xÑ8 x x a) lim 1 208 lnpxq Ñ1 tanpπxq i) lim x l) n) lim r sinpxq s Ñ0 x , j) lim xx , k) Ñ0 lim xa{ lnpxq , Ñ0 p q , m) lim x1{x , xÑ8 x x tan x lim xsinpxq , o) lim r cosp1{xq sx , Ñ0 x Ñ8 x cosp3xq cosp2xq p) lim , q) xÑ0 x2 where n P N, a P R. 1 cospπxq Ñ1 x2 2x 1 lim x 15) Explain why Newton’s method fails to find the zero(s) of f in the following cases. a) f pxq : x2 x6 , x P R , with initial approximation x 1{2 , b) f pxq : x1{3 , x P R . 16) A circular arch of length L ¡ 0 and height h ¡ 0 is to be constructed where L{h ¡ π. a) Show that x : L{p2rq, where r ¡ 0 is the radius of the corresponding circle, satisfies the transcendental equation cospxq 1 2h x. L b) Assume that L{h 7. By Newton’s method, find an approximation x0 to x such that |x0 x| 106 . 
17) The characteristic frequencies of the transverse oscillations of a string of length L ¡ 0 with fixed left end and right end subject to the boundary condition v 1 pLq hv pLq 0, where v : r0, Ls Ñ R is the amplitude of deflection of the string and h P R, is given by ω x{L where x tan x (2.5.35) hL [20]. Assume hL 1{3, and find by Newton’s method an approximation x0 to the smallest solution x ¡ 0 of (2.5.35) such that |x0 x| 106 . 18) The characteristic frequencies of the transverse vibrations of a homogeneous beam of length L ¡ 0 with fixed ends are given by ω rEJ {pρS qs1{2 px{Lq2 where coshpxq cospxq 1 , (2.5.36) E is Young’s modulus, J is the moment of inertia of a transverse section, S is the area of the section, ρ is the density of the material 209 of the beam, and coshpy q : pey ey q{2 for all y P R [65]. By Newton’s method, find an approximation x0 to the smallest solution x ¡ 0 of (2.5.36) such that |x0 x| 106 . 19) (Binomial theorem) Let n P N . Define f : p1, 8q Ñ R by n ¸ f pxq : k 0 for all x P defined by p1, 8q where the so called ‘binomial coefficients’ are n 0 for every k n k x k : 1 , n k : 1 n pn 1q pn pk 1qq k! P N . a) Show that xqf 1 pxq nf pxq p1 for all x P p1, 8q. b) Conclude from part a) that f pxq p1 xqn for all x P p1, 8q. c) Show the binomial theorem, i.e., that px yq n n ¸ k 0 for all x, y P R. 210 n k nk x y k y 1 A 1 x Fig. 60: The yellow area A enclosed by the graph of f : p r0, 1s Ñ R, x ÞÑ 1 x2 q and the coordinate axes is determined by Archimedes’ method. 2.6 Riemann Integration An early example of integration is given by Archimedes’ quadrature of the segment of the parabola. For this, he presents two proofs. Here, we display his first proof because it anticipates the definition of the Riemann integral. The second proof will be given at beginning of Section 3.3 on series of real numbers. 
We use his method to calculate the area $A$ of the parabolic segment
$$\{(x, y) \in \mathbb{R}^2 : x \in [0, 1] \wedge 0 \leq y \leq 1 - x^2\}$$
that is contained in the rectangle $[0, 1] \times [0, 1]$, see Fig. 60. He approximates $A$ by what would be called upper and lower sums today, but the construction of those sums was geometrically motivated. We slightly alter that construction, but otherwise closely follow his method. For this, we divide the interval $[0, 1]$ on the x-axis into intervals of equal lengths, for instance, into four intervals
$$[0, 1/4],\ [1/4, 2/4],\ [2/4, 3/4],\ [3/4, 4/4]$$
of equal lengths $1/4$.

Fig. 61: The yellow area gives the upper bound $U_4$ for $A$, compare text.

Then the sum $U_4$ of the areas of the two-dimensional intervals
$$[0, 1/4] \times [0, 1 - (0/4)^2],\quad [1/4, 2/4] \times [0, 1 - (1/4)^2],\quad [2/4, 3/4] \times [0, 1 - (2/4)^2],\quad [3/4, 4/4] \times [0, 1 - (3/4)^2]$$
given by
$$U_4 = \frac{1}{4} \sum_{k=0}^{3} \left(1 - \frac{k^2}{4^2}\right)$$
exceeds $A$, and the sum $L_4$ of the areas of the two-dimensional intervals
$$[0, 1/4] \times [0, 1 - (1/4)^2],\quad [1/4, 2/4] \times [0, 1 - (2/4)^2],\quad [2/4, 3/4] \times [0, 1 - (3/4)^2],\quad [3/4, 4/4] \times [0, 1 - (4/4)^2]$$
given by
$$L_4 = \frac{1}{4} \sum_{k=1}^{4} \left(1 - \frac{k^2}{4^2}\right)$$
is smaller than $A$,
$$L_4 \leq A \leq U_4.$$

Fig. 62: The yellow area gives the lower bound $L_4$ for $A$, compare text.

In the same way, by division of the x-domain into intervals of equal lengths $1/n$, where $n \in \mathbb{N}^*$, we arrive at
$$U_n = \frac{1}{n} \sum_{k=0}^{n-1} \left(1 - \frac{k^2}{n^2}\right), \qquad L_n = \frac{1}{n} \sum_{k=1}^{n} \left(1 - \frac{k^2}{n^2}\right)$$
and the inequalities
$$L_n \leq A \leq U_n.$$
Since
$$U_n - L_n = \frac{1}{n},$$
we conclude that
$$L_n \leq A \leq L_n + \frac{1}{n}.$$
Further,
$$L_n = \frac{1}{n} \sum_{k=1}^{n} \left(1 - \frac{k^2}{n^2}\right) = 1 - \frac{1}{n^3} \sum_{k=1}^{n} k^2 = 1 - \frac{(n + 1)(2n + 1)}{6n^2} = 1 - \frac{1}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right)$$
where it has been used that
$$\sum_{k=1}^{n} k^2 = \frac{1}{6}\, n (n + 1)(2n + 1).$$
The last formula was known to Archimedes. He proved it in his treatise on spirals [37]. Of course, it is tempting (and correct) to take the limit $n \to \infty$ to conclude that
$$A \geq \lim_{n \to \infty} L_n = \frac{2}{3}, \qquad A \leq \lim_{n \to \infty} \left(L_n + \frac{1}{n}\right) = \lim_{n \to \infty} L_n = \frac{2}{3}$$
and hence that
$$A = \frac{2}{3}. \tag{2.6.1}$$
Below, the Riemann integral of $f : [0, 1] \to \mathbb{R}$ defined by $f(x) := 1 - x^2$ for every $x \in [0, 1]$ will be defined essentially as the common limit of the sequences $L_1, L_2, \dots$ and $U_1, U_2, \dots$, which give the area enclosed by the graph of $f$ and the coordinate axes, and denoted by
$$\int_0^1 f(x)\, dx$$
where Leibniz's sign $\int$ is a stylized S and is intended to remind of the summation involved in the definition of the integral. Hence the previous reasoning shows that
$$\int_0^1 f(x)\, dx = \frac{2}{3}.$$
Note that (2.6.1) presupposes an intuitive geometric notion of the area $A$. Today, the limits would be used for the definition of $A$. As derivatives of functions are used to define tangents at curves, integrals of functions are used to define areas (or volumes in Calculus III). Also, note that the whole calculation, including the limit value, uses only rational numbers and therefore does not pose a problem to ancient Greek mathematics. In other cases where the quadrature failed, like the quadrature of the circle, that area was not describable by a rational number. Finally, instead of (2.6.1), Archimedes showed an equivalent result that expressed $A$ in terms of a rational multiple of the area of a triangle inscribed into the parabolic segment. For the last result, we refer to the beginning of Section 3.3 in Calculus II on series of real numbers.

We return to the question of showing that $A = 2/3$. Since there was no limit concept at the time, this proof had to be performed by a so-called 'double reductio ad absurdum', i.e., by leading both assumptions that $A < 2/3$ and that $A > 2/3$ to a contradiction, which leaves only the option that $A = 2/3$. Since
$$\frac{2}{3} - \frac{1}{n} \leq \frac{2}{3} - \frac{3n + 1}{6n^2} = L_n \leq A \leq U_n = \frac{2}{3} + \frac{3n - 1}{6n^2} \leq \frac{2}{3} + \frac{1}{n},$$
this can be done as follows. For this, we assume that $A = (2/3) + \varepsilon$ for some $\varepsilon > 0$. Then, it follows for $n > 1/\varepsilon$ that
$$\frac{2}{3} + \varepsilon = A \leq \frac{2}{3} + \frac{1}{n} < \frac{2}{3} + \varepsilon.$$
On the other hand, if $A = (2/3) - \varepsilon$ for some $\varepsilon > 0$, it follows for $n > 1/\varepsilon$ that
$$\frac{2}{3} - \varepsilon = A \geq \frac{2}{3} - \frac{1}{n} > \frac{2}{3} - \varepsilon.$$
Hence the only remaining possibility is that $A = 2/3$.
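The sums $L_n$ and $U_n$ for $f(x) = 1 - x^2$ on $[0, 1]$ can be checked with exact rational arithmetic. The following is a small sketch, not part of the original text; the function names `lower_sum` and `upper_sum` are illustrative. It evaluates both sums directly from their definitions so that the identities $U_n - L_n = 1/n$ and $L_n = 2/3 - (3n + 1)/(6n^2)$ can be compared against computed values.

```python
from fractions import Fraction


def lower_sum(n):
    """L_n = (1/n) * sum_{k=1}^{n} (1 - k^2 / n^2) for f(x) = 1 - x^2 on [0, 1]."""
    return sum(Fraction(1) - Fraction(k * k, n * n) for k in range(1, n + 1)) / n


def upper_sum(n):
    """U_n = (1/n) * sum_{k=0}^{n-1} (1 - k^2 / n^2) for f(x) = 1 - x^2 on [0, 1]."""
    return sum(Fraction(1) - Fraction(k * k, n * n) for k in range(0, n)) / n


if __name__ == "__main__":
    for n in (4, 10, 100, 1000):
        L, U = lower_sum(n), upper_sum(n)
        # Both columns enclose 2/3, and their difference is exactly 1/n.
        print(n, float(L), float(U), U - L == Fraction(1, n))
```

Because $f$ is decreasing on $[0, 1]$, the infimum on each subinterval is attained at its right end and the supremum at its left end, which is why the two sums differ only in the range of $k$.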
Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis. A generalization of Archimedes' result to natural powers of $x$ was made only in the 17th century by Descartes and Fermat, but unpublished, and in 1647 by Bonaventura Cavalieri [24]. The next decisive step was the discovery of the fundamental theorem of calculus independently by Newton [83] and Leibniz [68], see Theorems 2.6.19, 2.6.21, i.e., the realization that differentiation and integration are inverse processes.

Fig. 63: $S_6(1.2)$ is given by the yellow area under $G(v)$. Axes: $t$ [sec], $v(t)$ [m/sec].

For motivation of that theorem, we go back to the start of Section 2.4 to the discussion of Galileo's results on bodies in free fall near the surface of the earth. Starting from the fallen distance $s(t)$ at time $t$,
$$s(t) = \frac{1}{2}\, g t^2 \tag{2.6.2}$$
for all $t \geq 0$, we determined the instantaneous speed $v(t)$ of the body at time $t$ as the derivative
$$v(t) = s'(t) = gt$$
where $g = 9.81\,\mathrm{m/sec^2}$ is the acceleration of the earth's gravitational field. We now investigate the reverse question, how to calculate $s(t)$ from the instantaneous speeds between times $0$ and $t$. There are two main approaches to this problem.

The first uses that $v(t) = s'(t)$ for every $t > 0$ and concludes that $s$ is the 'anti-derivative' of $v$ such that $s(0) = 0$ and hence (by application of Theorem 2.5.7) is given by (2.6.2). A second approach, leading to integration, uses the following relation between $s$ and $v$. For every $t > 0$ and $n \in \mathbb{N}^*$, it follows that
$$s(t) - s(0) = \sum_{k=0}^{n-1} \left[ s\!\left(\frac{k+1}{n}\,t\right) - s\!\left(\frac{k}{n}\,t\right) \right] = \sum_{k=0}^{n-1} \frac{s\!\left(\frac{k+1}{n}\,t\right) - s\!\left(\frac{k}{n}\,t\right)}{\frac{k+1}{n}\,t - \frac{k}{n}\,t} \left(\frac{k+1}{n}\,t - \frac{k}{n}\,t\right).$$
For $k \in \{0, \dots, n - 1\}$,
$$\frac{s\!\left(\frac{k+1}{n}\,t\right) - s\!\left(\frac{k}{n}\,t\right)}{\frac{k+1}{n}\,t - \frac{k}{n}\,t}$$
is the average speed in the time interval
$$\left[\frac{k}{n}\,t,\ \frac{k+1}{n}\,t\right].$$
In this case, it is given by
$$\frac{s\!\left(\frac{k+1}{n}\,t\right) - s\!\left(\frac{k}{n}\,t\right)}{\frac{k+1}{n}\,t - \frac{k}{n}\,t} = \frac{ngt}{2}\left[\left(\frac{k+1}{n}\right)^2 - \left(\frac{k}{n}\right)^2\right] = g\,\frac{k}{n}\,t + \frac{gt}{2n} = v\!\left(\frac{k}{n}\,t\right) + \frac{gt}{2n}$$
in terms of the instantaneous speed $v$ at the beginning of the time interval.
Hence, we conclude that
$$s(t) - s(0) = \sum_{k=0}^{n-1} v\!\left(\frac{k}{n}\,t\right) \frac{t}{n} + \frac{gt^2}{2n} = S_n(t) + \frac{gt^2}{2n}$$
where
$$S_n(t) := \sum_{k=0}^{n-1} v\!\left(\frac{k}{n}\,t\right) \frac{t}{n}.$$
This leads to
$$s(t) - s(0) = \lim_{n \to \infty} \sum_{k=0}^{n-1} v\!\left(\frac{k}{n}\,t\right) \frac{t}{n}.$$
Note that the sum $S_n(t)$ has the geometrical interpretation of an area under $G(v)$, see Fig. 63. Below, the limit
$$\lim_{n \to \infty} \sum_{k=0}^{n-1} v\!\left(\frac{k}{n}\,t\right) \frac{t}{n}$$
will coincide with the integral of the function $v$ over the interval $[0, t]$ which is denoted by
$$\int_0^t v(\tau)\, d\tau.$$
Hence
$$s(t) - s(0) = \int_0^t v(\tau)\, d\tau$$
gives the relation between instantaneous speed and the distance traveled between times $0$ and $t$. It is satisfied for the motion in one dimension in general. The last relation gives the connection between the integral of $v$ over the interval $[0, t]$, $t > 0$, and its anti-derivative $s$. It constitutes a special case of the fundamental theorem of calculus and is valid for a wide class of functions $v$. From the knowledge of an anti-derivative $s$ of $v$, i.e., some function $s$ such that $s'(\tau) = v(\tau)$ for all $\tau \in [0, t]$, this relation allows the calculation of the integral of $v$ over the interval $[0, t]$.

As a consequence of the discovery of the fundamental theorem of calculus, during the 18th century, the integral was generally regarded as the inverse of the derivative, i.e., the statement of the fundamental theorem of calculus was used to define the integral. Only in cases where an anti-derivative could not be found, definitions of the integral as a limit of some sort of sums or an area under a curve were used to derive approximations. In particular, the notion of area was still considered intuitive such that no precise definition was needed. At the beginning of the 19th century, the work of Fourier made it necessary to define integrals also of discontinuous functions. Cauchy was the first to give a definition for continuous functions. Still, it contained an unnatural element in a preference of function values assumed at left ends of intervals used to subdivide the domain of such a function.
The first fully satisfactory definition, applicable to a large class of discontinuous functions, was given by Bernhard Riemann in 1854 in his habilitation thesis [87]. The equivalent definition used in this text is due to Jean-Gaston Darboux.

After this introduction, we start with natural definitions of the length of intervals, partitions of intervals and corresponding lower and upper sums of bounded functions. Such sums already appeared in the previous calculation of the area of the parabolic segment and in the motivation of the fundamental theorem of calculus. They corresponded to partitions of intervals into subintervals of equal length. In the limit of vanishing length, we arrived at the area $A$ as well as at integrals of $v$. Below, the size of a partition generalizes that length. On the other hand, we will allow for much more general partitions of intervals in the definition of the integral. As a consequence, those partitions cannot be characterized by a single parameter, and hence a definition of the integral in form of a simple limit is not possible. Such a limit is replaced by the supremum of lower sums and the infimum of upper sums, which are required to coincide for integrable functions.

Definition 2.6.1.

(i) Let $a, b \in \mathbb{R}$ be such that $a \leq b$. We define the lengths of the corresponding intervals $(a, b)$, $(a, b]$, $[a, b)$, $[a, b]$ by
$$l((a, b)) = l((a, b]) = l([a, b)) = l([a, b]) := b - a.$$
A partition $P$ of $[a, b]$ is an ordered sequence $(a_0, \dots, a_\nu)$ of elements of $[a, b]$ such that
$$a = a_0 \leq a_1 \leq \dots \leq a_\nu = b$$
where $\nu$ is an element of $\mathbb{N}^*$. Since $(a, b)$ is such a partition of $[a, b]$, the set of all partitions of that interval is non-empty. A partition $P'$ of $[a, b]$ is called a refinement of $P$ if $P$ is a subsequence of $P'$.

(ii) A partition $P = (a_0, \dots, a_\nu)$ of a bounded closed interval $I$ of $\mathbb{R}$ induces a division of $I$ into, in general non-disjoint, subintervals
$$I = \bigcup_{j=0}^{\nu - 1} I_j, \qquad I_j := [a_j, a_{j+1}], \quad j = 0, \dots, \nu - 1.$$
The size of $P$ is defined as the maximum of the lengths of these subintervals. In addition, we define for every bounded function $f$ on $I$ the lower sum $L(f, P)$ and upper sum $U(f, P)$ corresponding to $P$ by:
$$L(f, P) := \sum_{j=0}^{\nu - 1} \inf\{f(x) : x \in I_j\}\; l(I_j), \qquad U(f, P) := \sum_{j=0}^{\nu - 1} \sup\{f(x) : x \in I_j\}\; l(I_j).$$
Note that if $K > 0$ is such that $|f(x)| \leq K$ for all $x \in I$, it follows that
$$-K \leq \inf\{f(x) : x \in J\} \leq \sup\{f(x) : x \in J\} \leq K$$
for every non-empty subset $J$ of $I$ and hence that
$$|L(f, P)| \leq \sum_{j=0}^{\nu - 1} |\inf\{f(x) : x \in I_j\}|\; l(I_j) \leq K \sum_{j=0}^{\nu - 1} l(I_j) = K\, l(I),$$
$$|U(f, P)| \leq \sum_{j=0}^{\nu - 1} |\sup\{f(x) : x \in I_j\}|\; l(I_j) \leq K \sum_{j=0}^{\nu - 1} l(I_j) = K\, l(I).$$
As a consequence, the sets
$$\{L(f, P) : P \in \mathcal{P}\}, \qquad \{U(f, P) : P \in \mathcal{P}\}$$
are bounded where $\mathcal{P}$ denotes the set of all partitions of $I$.

Example 2.6.2. Consider the interval $I := [0, 1]$ and the continuous function $f : I \to \mathbb{R}$ defined by $f(x) := x$ for all $x \in I$.
$$P_0 := (0, 1), \qquad P_1 := (0, 1/2, 1)$$
are partitions of $I$. The size of $P_0$ is $1$, whereas the size of $P_1$ is $1/2$. Also, $P_1$ is a refinement of $P_0$. Finally,
$$L(f, P_0) = 0 \cdot 1 = 0, \qquad U(f, P_0) = 1 \cdot 1 = 1,$$
$$L(f, P_1) = 0 \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}, \qquad U(f, P_1) = \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{3}{4}$$
and hence
$$L(f, P_0) \leq L(f, P_1) \leq U(f, P_1) \leq U(f, P_0).$$

Intuitively, it is to be expected that a refinement of a partition of an interval leads to a decrease of corresponding upper sums and an increase of corresponding lower sums, as has also been found in the special case in the previous example. Indeed, this intuition is correct.

Lemma 2.6.3. Let $f$ be a bounded real-valued function on a closed interval $I$ of $\mathbb{R}$. Further, let $P, P'$ be partitions of $I$, and in particular let $P'$ be a refinement of $P$. Then
$$L(f, P) \leq L(f, P') \leq U(f, P') \leq U(f, P). \tag{2.6.3}$$

Proof. The middle inequality is obvious from the definition of lower and upper sums given in Def 2.6.1(ii). Further, let $P = (a_0, \dots, a_\nu)$ be a partition of $[a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$.
Obviously, for the proof of the remaining inequalities it is sufficient (by the method of induction) to assume that $P' = (a_0, a'_1, a_1, \dots, a_\nu)$ where $a'_1 \in I$ is such that $a_0 \leq a'_1 \leq a_1$ and where we simplified to keep the notation simple. Then
$$L(f, P') - L(f, P) = \inf\{f(x) : x \in [a_0, a'_1]\}\; l([a_0, a'_1]) + \inf\{f(x) : x \in [a'_1, a_1]\}\; l([a'_1, a_1]) - \inf\{f(x) : x \in [a_0, a_1]\}\; l([a_0, a_1])$$
$$\geq \inf\{f(x) : x \in [a_0, a_1]\}\, \{\, l([a_0, a'_1]) + l([a'_1, a_1]) - l([a_0, a_1]) \,\} = 0.$$
Analogously, it follows that
$$U(f, P') - U(f, P) \leq 0$$
and hence, finally, (2.6.3).

As a consequence of their definition, lower sums are smaller than upper sums. It is not difficult to show that the same is true for the supremum of the lower sums and the infimum of the upper sums.

Theorem 2.6.4. Let $f$ be a bounded real-valued function on the interval $[a, b]$ of $\mathbb{R}$ and $\mathcal{P}$ be the set of all partitions of $[a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \leq b$. Then
$$\sup(\{L(f, P) : P \in \mathcal{P}\}) \leq \inf(\{U(f, P) : P \in \mathcal{P}\}). \tag{2.6.4}$$

Proof. By Lemma 2.6.3, it follows for all $P_1, P_2 \in \mathcal{P}$ that
$$L(f, P_1) \leq L(f, P) \leq U(f, P) \leq U(f, P_2),$$
where $P \in \mathcal{P}$ is some corresponding common refinement, and hence that
$$\sup(\{L(f, P_1) : P_1 \in \mathcal{P}\}) \leq U(f, P_2)$$
and (2.6.4).

As a consequence of Lemma 2.6.3 and since every partition $P$ of some interval of $\mathbb{R}$ is a refinement of the trivial partition containing only its initial and endpoints, we can make the following definition.

Definition 2.6.5. (The Riemann integral) Let $f$ be a bounded real-valued function on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \leq b$. Denote by $\mathcal{P}$ the set consisting of all partitions of $[a, b]$. We say that $f$ is Riemann-integrable on $[a, b]$ if
$$\sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}).$$
In that case, we define the integral of $f$ on $[a, b]$ by
$$\int_a^b f(x)\, dx := \sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}).$$
In particular, if $f(x) \geq 0$ for all $x \in [a, b]$, we define the area $A$ under the graph of $f$ by
$$A := \int_a^b f(x)\, dx.$$

Example 2.6.6. Let $f$ be a constant function of value $c \in \mathbb{R}$ on some interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \leq b$. In particular, $f$ is bounded. Further, let $P = (a_0, \dots, a_\nu)$ be a partition of $[a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$. Then
$$L(f, P) = U(f, P) = \sum_{k=0}^{\nu - 1} c \cdot l([a_k, a_{k+1}]) = \sum_{k=0}^{\nu - 1} c\,(a_{k+1} - a_k) = c \sum_{k=0}^{\nu - 1} (a_{k+1} - a_k) = c\,(b - a).$$
Hence all lower and upper sums are equal to $c\,(b - a)$. As a consequence, $f$ is Riemann-integrable and
$$\int_a^b f(x)\, dx = c\,(b - a).$$
Note that this result can be restated as saying that
$$\int_a^b dx$$
is given by the difference of the values of the antiderivative $([a, b] \to \mathbb{R},\ x \mapsto x)$ of the integrand at $b$ and $a$. That this is not just accidental will be seen later on. The same is also true in more general cases as specified in the version Theorem 2.6.21 of the fundamental theorem of calculus.

Note that according to the previous example, the integral of every function defined on an interval containing precisely one point is zero. The value of the function in this point does not affect the value of the integral. This observation will lead further down to the definition of so-called zero sets.

Example 2.6.7. Consider the function $f : [a, b] \to \mathbb{R}$ defined by
$$f(x) := x$$
for all $x \in [a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \leq b$. Since
$$|f(x)| = |x| \leq \max\{|a|, |b|\}$$
for every $x \in [a, b]$, $f$ is bounded. For every $n \in \mathbb{N}^*$, define the partition $P_n$ of $[a, b]$ by
$$P_n := \left(a,\ a + \frac{b - a}{n},\ \dots,\ a + \frac{n\,(b - a)}{n} = b\right).$$
Calculate $L(f, P_n)$ and $U(f, P_n)$ for all $n \in \mathbb{N}^*$. Show that $f$ is Riemann-integrable over $[a, b]$ and calculate the value of
$$\int_a^b f(x)\, dx.$$
Solution: We have

\[
[a,b] = \bigcup_{j=0}^{n-1} \left[ a + \frac{j(b-a)}{n}, \; a + \frac{(j+1)(b-a)}{n} \right]
\]

and

\[
\begin{aligned}
L(f,P_n) &= \sum_{j=0}^{n-1} \left( a + \frac{j(b-a)}{n} \right) \frac{b-a}{n} = a(b-a) + \frac{(b-a)^2}{n^2} \sum_{j=0}^{n-1} j \\
&= a(b-a) + \frac{(b-a)^2}{n^2} \cdot \frac{n}{2}(n-1) = a(b-a) + \frac{(b-a)^2}{2} \left( 1 - \frac{1}{n} \right) , \\
U(f,P_n) &= \sum_{j=0}^{n-1} \left( a + \frac{(j+1)(b-a)}{n} \right) \frac{b-a}{n} = a(b-a) + \frac{(b-a)^2}{n^2} \sum_{j=0}^{n-1} (j+1) \\
&= a(b-a) + \frac{(b-a)^2}{n^2} \cdot \frac{n}{2}(n+1) = a(b-a) + \frac{(b-a)^2}{2} \left( 1 + \frac{1}{n} \right) .
\end{aligned}
\]

Hence

\[
\lim_{n \to \infty} L(f,P_n) = \lim_{n \to \infty} U(f,P_n) = a(b-a) + \frac{(b-a)^2}{2} = \frac{1}{2} (b^2 - a^2) .
\]

As a consequence, it follows that

\[
\frac{1}{2} (b^2 - a^2) \le \sup(\{L(f,P) : P \in \mathcal{P}\})
\]

and that

\[
\inf(\{U(f,P) : P \in \mathcal{P}\}) \le \frac{1}{2} (b^2 - a^2)
\]

and hence by Theorem 2.6.4 that

\[
\sup(\{L(f,P) : P \in \mathcal{P}\}) = \inf(\{U(f,P) : P \in \mathcal{P}\}) = \frac{1}{2} (b^2 - a^2)
\]

where $\mathcal{P}$ denotes the set of partitions of $[a,b]$. Hence $f$ is Riemann-integrable and

\[
\int_a^b x \, dx = \frac{1}{2} (b^2 - a^2) .
\]

Note that the last result can be restated as saying that

\[
\int_a^b x \, dx
\]

is given by the difference of the values of the antiderivative $([a,b] \to \mathbb{R}, x \mapsto x^2/2)$ of the integrand at $b$ and $a$. That this is not just accidental will be seen later on. The same is also true in more general cases as specified in the version Theorem 2.6.21 of the fundamental theorem of calculus.

In the past, we have seen that the property of convergence of a sequence as well as of the continuity and differentiability of functions is automatically 'transferred' to sums, products and quotients, see Theorems 2.3.4, 2.3.46, 2.3.48 and 2.4.8. This fact considerably simplified the decision whether a given sequence is convergent or given functions are continuous or differentiable. In many cases, this is an obvious consequence of the convergence of elementary sequences as well as of the continuity or differentiability of elementary functions. For these reasons, it is natural to ask whether multiples, sums, products and quotients of integrable functions are integrable as well. Indeed, this is the case for multiples, sums and products. In the case of quotients, this is the case if the divisor is in addition nowhere vanishing, and if the quotient is bounded.
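Returning to Example 2.6.7, the closed forms obtained for $L(f,P_n)$ and $U(f,P_n)$ can be checked numerically. The following is a minimal sketch, not part of the text's development; the function names `lower_upper_sums` and `closed_forms` are chosen here for illustration only. It uses the fact that $f(x) = x$ is increasing, so that the infimum and supremum on each subinterval are attained at the left and right endpoint, respectively.

```python
def lower_upper_sums(a, b, n):
    """Lower and upper sums of f(x) = x for the uniform partition P_n of [a, b]."""
    h = (b - a) / n
    # f(x) = x is increasing: on [a + j*h, a + (j+1)*h] the infimum is the
    # value at the left endpoint and the supremum the value at the right one.
    lower = sum((a + j * h) * h for j in range(n))
    upper = sum((a + (j + 1) * h) * h for j in range(n))
    return lower, upper


def closed_forms(a, b, n):
    """Closed forms for L(f, P_n) and U(f, P_n) as derived in Example 2.6.7."""
    base = a * (b - a)
    half = (b - a) ** 2 / 2
    return base + half * (1 - 1 / n), base + half * (1 + 1 / n)


if __name__ == "__main__":
    a, b = -1.0, 2.0
    for n in (1, 10, 100, 1000):
        lower, upper = lower_upper_sums(a, b, n)
        # Both sums enclose (b^2 - a^2) / 2 and approach it as n grows.
        print(n, lower, upper, (b ** 2 - a ** 2) / 2)
```

The gap between the upper and the lower sum equals $(b-a)^2/n$, which is why both sequences are squeezed onto the same limit.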
The corresponding proof is relatively simple in the case of multiples and sums of integrable functions and is part of the following theorem. In the case of products and quotients, the statement is a consequence of Lebesgue's criterion for Riemann-integrability, Theorem 2.6.13, which is proved in the appendix.

Within the definition of Riemann-integrability above, we also defined the area under the graph of a positive integrable function in terms of its integral. This is reasonable in view of applications only if that integral is positive. This positivity is a simple consequence of the positivity of the lower sums.

Theorem 2.6.8. (Linearity and positivity of the integral) Let $f, g$ be bounded and Riemann-integrable on the interval $[a,b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$ and $c \in \mathbb{R}$. Then $f + g$ and $cf$ are Riemann-integrable on $[a,b]$ and

\[
\int_a^b (f(x) + g(x))\, dx = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx , \qquad \int_a^b c f(x)\, dx = c \int_a^b f(x)\, dx .
\]

If $f$ is in addition positive, then

\[
\int_a^b f(x)\, dx \ge 0 .
\]

Proof. In the following, we denote by $\mathcal{P}$ the set of all partitions of $[a,b]$. First, if $M_1 > 0$ and $M_2 > 0$ are such that $|f(x)| \le M_1$ and $|g(x)| \le M_2$, then

\[
|(f+g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 , \qquad |(cf)(x)| = |c f(x)| = |c| \, |f(x)| \le |c| M_1
\]

for all $x \in [a,b]$ and hence $f + g$ and $cf$ are bounded for every $c \in \mathbb{R}$. Second, it follows for every subinterval $J$ of $I := [a,b]$ that

\[
\begin{aligned}
\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} &\le f(x) + g(x) = (f+g)(x) , \\
(f+g)(x) = f(x) + g(x) &\le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\}
\end{aligned}
\]

for all $x \in J$ and hence that

\[
\begin{aligned}
\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} &\le \inf\{(f+g)(x) : x \in J\} \\
&\le \sup\{(f+g)(x) : x \in J\} \le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\} .
\end{aligned}
\]

Hence it follows for every partition $P$ of $I$ that

\[
L(f,P) + L(g,P) \le L(f+g,P) \le U(f+g,P) \le U(f,P) + U(g,P) .
\]

If $n \in \mathbb{N}^*$, by refining partitions, we can construct $P_n \in \mathcal{P}$ such that

\[
\begin{aligned}
\int_a^b f(x)\, dx - \frac{1}{2n} &< L(f,P_n) , & \int_a^b g(x)\, dx - \frac{1}{2n} &< L(g,P_n) , \\
U(f,P_n) &< \int_a^b f(x)\, dx + \frac{1}{2n} , & U(g,P_n) &< \int_a^b g(x)\, dx + \frac{1}{2n} .
\end{aligned}
\]

Hence

\[
\int_a^b f(x)\, dx + \int_a^b g(x)\, dx - \frac{1}{n} \le L(f+g,P_n) \le U(f+g,P_n) \le \int_a^b f(x)\, dx + \int_a^b g(x)\, dx + \frac{1}{n}
\]

and

\[
\begin{aligned}
\int_a^b f(x)\, dx + \int_a^b g(x)\, dx - \frac{1}{n} &\le \sup\{L(f+g,P) : P \in \mathcal{P}\} \\
&\le \inf\{U(f+g,P) : P \in \mathcal{P}\} \le \int_a^b f(x)\, dx + \int_a^b g(x)\, dx + \frac{1}{n} .
\end{aligned}
\]

Since the last is true for every $n \in \mathbb{N}^*$, we conclude that

\[
\sup\{L(f+g,P) : P \in \mathcal{P}\} = \inf\{U(f+g,P) : P \in \mathcal{P}\} = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx .
\]

Hence $f + g$ is Riemann-integrable and

\[
\int_a^b (f(x) + g(x))\, dx = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx .
\]

Further, if $c \ge 0$, it follows for every subinterval $J$ of $I$ that

\[
\inf\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\} , \qquad \sup\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\}
\]

and hence that

\[
L(cf,P) = c \, L(f,P) , \qquad U(cf,P) = c \, U(f,P)
\]

for every partition $P$ of $I$. The last implies that

\[
\begin{aligned}
\sup\{L(cf,P) : P \in \mathcal{P}\} &= c \, \sup\{L(f,P) : P \in \mathcal{P}\} = c \int_a^b f(x)\, dx , \\
\inf\{U(cf,P) : P \in \mathcal{P}\} &= c \, \inf\{U(f,P) : P \in \mathcal{P}\} = c \int_a^b f(x)\, dx .
\end{aligned}
\]

If $c \le 0$, it follows for every subinterval $J$ of $I$ that

\[
\inf\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\} , \qquad \sup\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\}
\]

and hence that

\[
L(cf,P) = c \, U(f,P) , \qquad U(cf,P) = c \, L(f,P)
\]

for every partition $P$ of $I$. The last implies that

\[
\begin{aligned}
\sup\{L(cf,P) : P \in \mathcal{P}\} &= c \, \inf\{U(f,P) : P \in \mathcal{P}\} = c \int_a^b f(x)\, dx , \\
\inf\{U(cf,P) : P \in \mathcal{P}\} &= c \, \sup\{L(f,P) : P \in \mathcal{P}\} = c \int_a^b f(x)\, dx .
\end{aligned}
\]

Hence it follows in both cases that

\[
\int_a^b c f(x)\, dx = c \int_a^b f(x)\, dx .
\]

Finally, if $f$ is such that $f(x) \ge 0$ for all $x \in I$, then

\[
\inf\{f(x) : x \in J\} \ge 0
\]

for all subintervals $J$ of $I$ and hence $L(f,P) \ge 0$ for every partition $P$ of $I$. As a consequence,

\[
\int_a^b f(x)\, dx = \sup\{L(f,P) : P \in \mathcal{P}\} \ge 0 .
\]

The Riemann integral can be viewed as a map into the real numbers with domain given by the set of bounded Riemann-integrable functions over some bounded closed interval $I$ of $\mathbb{R}$. According to the previous theorem, that map is 'linear', i.e., the integral of the sum of such functions is equal to the sum of their corresponding integrals and the integral of a scalar multiple of such a function is given by that multiple of the integral of that function.
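The linearity stated in Theorem 2.6.8 can be illustrated numerically. The following is a minimal sketch under an assumption that is not part of the text: the integral is approximated by a Riemann sum with midpoint tags (which always lies between the corresponding lower and upper sum), and the helper name `riemann_sum` is introduced here only for illustration. Such sums are themselves additive and homogeneous in the integrand, which is the mechanism exploited in the proof above.

```python
import math


def riemann_sum(f, a, b, n=1000):
    """Midpoint Riemann sum of f over [a, b] using n equal subintervals."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    a, b, c = 0.0, math.pi, -2.5
    f = math.sin
    g = lambda x: x * x

    int_f = riemann_sum(f, a, b)
    int_g = riemann_sum(g, a, b)

    # Additivity: the sum for f + g agrees with the sum of the two sums
    # up to floating-point rounding.
    print(riemann_sum(lambda x: f(x) + g(x), a, b), int_f + int_g)

    # Homogeneity: the sum for c*f agrees with c times the sum for f.
    print(riemann_sum(lambda x: c * f(x), a, b), c * int_f)

    # Positivity: g >= 0 on [a, b], so its sum is non-negative.
    print(int_g >= 0.0)
```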
In addition, the integral is positive, in the sense that it maps such functions which are in addition positive, i.e., which assume only positive ($\ge 0$) values, into a positive real number. It is easy to see that the linearity and positivity of the map imply also its monotony, i.e., if such functions $f$ and $g$ satisfy $f \le g$, defined by $f(x) \le g(x)$ for all $x \in I$, then the integral of $f$ is smaller than or equal to the integral of $g$.

Corollary 2.6.9. (Monotony of the integral) Let $f, g$ be bounded and Riemann-integrable on the interval $[a,b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$, and in addition let $f(x) \le g(x)$ for all $x \in [a,b]$. Then

\[
\int_a^b f(x)\, dx \le \int_a^b g(x)\, dx .
\]

Proof. For this, we define the auxiliary function $h : [a,b] \to \mathbb{R}$ by $h(x) := g(x) - f(x)$ for all $x \in [a,b]$. According to Theorem 2.6.8, $h$ is bounded and Riemann-integrable. Finally, since $f(x) \le g(x)$ for all $x \in [a,b]$, it follows that $h(x) \ge 0$ for all $x \in [a,b]$. Hence it follows by the linearity and positivity of the integral that

\[
0 \le \int_a^b h(x)\, dx = \int_a^b g(x)\, dx + \int_a^b [-f(x)]\, dx = \int_a^b g(x)\, dx - \int_a^b f(x)\, dx
\]

and hence that

\[
\int_a^b f(x)\, dx \le \int_a^b g(x)\, dx .
\]

The reader might have wondered why we did not define divisions of intervals induced by partitions in such a way that they contain only pairwise disjoint intervals, although that would have been possible. In our definition subsequent intervals in a division contain a common point. Hence, in a certain sense, associated upper and lower sums count the values of the function in such points twice. The reason for our definition is that it is technically simpler than one which uses pairwise disjoint intervals and that the use of a definition of the latter type would have led to the same integral. The last is reflected by the fact that values of functions in individual points don't influence the value of the integral. For this note that by Example 2.6.6, it follows that the integral of any function defined on an interval containing only one point is zero.
The value of the function in this point does not affect the value of the integral. The reason behind this behavior is, of course, the fact that we defined the length of intervals as the difference between their right and left boundary. Hence the length of an interval containing only one point is zero. Such intervals are examples of so-called zero sets. The values assumed by a function on a zero set do not influence the value of the integral. There are several definitions of zero sets possible. The following common definition uses the intuition that they should be, in some sense, of vanishing length.

Definition 2.6.10. (Sets of measure zero) A subset $S$ of $\mathbb{R}$ is said to have measure zero if for every $\varepsilon > 0$ there is a corresponding sequence $I_0, I_1, \dots$ of open subintervals of $\mathbb{R}$ such that the union of those intervals contains $S$ and at the same time such that

\[
\lim_{n \to \infty} \sum_{k=0}^{n} l(I_k) < \varepsilon .
\]

Remark 2.6.11. Note that any finite subset of $\mathbb{R}$ and also any subset of a set of measure zero has measure zero.

Theorem 2.6.12. Every countable subset $S$ of $\mathbb{R}$ is a set of measure zero.

Proof. Since $S$ is countable, there is a bijection $\varphi : \mathbb{N} \to S$. Let $\varepsilon > 0$ and define for each $n \in \mathbb{N}$ the corresponding interval $I_n := (\varphi(n) - \varepsilon/2^{n+3}, \varphi(n) + \varepsilon/2^{n+3})$. Then for each $N \in \mathbb{N}$:

\[
\sum_{n=0}^{N} l(I_n) = \frac{\varepsilon}{4} \sum_{n=0}^{N} \left( \frac{1}{2} \right)^{n} = \frac{\varepsilon}{4} \cdot \frac{1 - \left( \frac{1}{2} \right)^{N+1}}{1 - \frac{1}{2}} = \frac{\varepsilon}{2} \left[ 1 - \left( \frac{1}{2} \right)^{N+1} \right]
\]

and hence

\[
\lim_{N \to \infty} \sum_{k=0}^{N} l(I_k) = \frac{\varepsilon}{2} < \varepsilon .
\]

So far, we proved existence of the integral only in a few simple cases. The following celebrated theorem due to Henri Lebesgue changes this. It gives a characterization of Riemann-integrability. Because of its technical character, the proof is transferred to the Appendix.

Theorem 2.6.13. (Lebesgue's criterion for Riemann-integrability) Let $f : [a,b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $D$ be the set of discontinuities of $f$. Then $f$ is Riemann-integrable if and only if $D$ is a set of measure zero.

Proof. See the proof of Theorem 5.2.6 in the Appendix.

Remark 2.6.14.
A property is said to hold almost everywhere on a subset $S$ of $\mathbb{R}$ if it holds everywhere on $S$ except for a set of measure zero. Thus, Theorem 2.6.13 states that a bounded function on a non-trivial bounded and closed interval of $\mathbb{R}$ is Riemann-integrable if and only if the function is almost everywhere continuous.

Since

\[
|f(x)| = \sqrt{[f(x)]^2}
\]

for every $x \in [a,b]$, if $f$ is bounded and Riemann-integrable on the interval $[a,b]$ of $\mathbb{R}$, where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$, we conclude by application of the previous theorem that also $|f|$ is bounded and Riemann-integrable. Since

\[
-|f(x)| \le f(x) \le |f(x)|
\]

for all $x \in [a,b]$, it follows by the monotony of the Riemann integral, Corollary 2.6.9, that

\[
- \int_a^b |f(x)|\, dx \le \int_a^b f(x)\, dx \le \int_a^b |f(x)|\, dx
\]

and hence that

\[
\left| \int_a^b f(x)\, dx \right| \le \int_a^b |f(x)|\, dx .
\]

The last estimate is frequently applied. For a first application, see Example 2.6.16. As a consequence, we proved the following theorem.

Theorem 2.6.15. Let $f$ be bounded and Riemann-integrable on the interval $[a,b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$. Then $|f|$ is bounded and Riemann-integrable and

\[
\left| \int_a^b f(x)\, dx \right| \le \int_a^b |f(x)|\, dx .
\]

Example 2.6.16. For many functions that are important for applications, there are integral representations which are often crucial for the derivation of their properties. For instance, for every $n \in \mathbb{Z}$, the corresponding Bessel function of the first kind $J_n$ satisfies

\[
J_n(x) = \frac{1}{\pi} \int_0^{\pi} \cos(x \sin\theta - n\theta)\, d\theta
\]

for all $x \in \mathbb{R}$ and is a solution of the differential equation

\[
x^2 f''(x) + x f'(x) + (x^2 - n^2) f(x) = 0
\]

for all $x \in \mathbb{R}$. By Theorem 2.6.15 and Corollary 2.6.9, the simple estimate

\[
|J_n(x)| \le \frac{1}{\pi} \int_0^{\pi} |\cos(x \sin\theta - n\theta)|\, d\theta \le \frac{1}{\pi} \int_0^{\pi} d\theta = 1
\]

follows for all $x \in \mathbb{R}$, and hence $J_n$ is a bounded function. Bessel functions occur frequently in the description of physical systems that are 'axially symmetric', i.e., symmetric with respect to rotations around an axis.

Fig. 64: Graph of $J_0$.
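The integral representation in Example 2.6.16 lends itself to a quick numerical experiment. The following is a minimal sketch, assuming a midpoint Riemann sum as the approximation of the integral; the function name `bessel_j` and the number of subintervals are choices made here for illustration and do not appear in the text. The sketch evaluates the representation on a grid and confirms that the computed values respect the bound $|J_n(x)| \le 1$ derived above.

```python
import math


def bessel_j(n, x, subintervals=2000):
    """Approximate J_n(x) = (1/pi) * int_0^pi cos(x sin(t) - n t) dt.

    The integral is replaced by a midpoint Riemann sum, which is an
    assumption of this sketch and not an exact evaluation.
    """
    h = math.pi / subintervals
    total = 0.0
    for k in range(subintervals):
        theta = (k + 0.5) * h
        total += math.cos(x * math.sin(theta) - n * theta)
    return total * h / math.pi


if __name__ == "__main__":
    # Sample the approximation for a few orders and arguments and report
    # the largest absolute value that occurs.
    largest = max(
        abs(bessel_j(n, 0.5 * j)) for n in range(0, 4) for j in range(-40, 41)
    )
    print("max |J_n(x)| on the grid:", largest)
    print("J_0(0) =", bessel_j(0, 0.0))
```

For $x = 0$ and $n = 0$ the integrand is constant of value 1, so the sketch returns 1 up to rounding, which shows that the bound cannot be improved.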
Within the definition of Riemann-integrability above, we also defined the area under the graph of a bounded integrable function that assumes only positive ($\ge 0$) values in terms of its integral. Geometric intuition suggests that areas are additive, that is, if $A$ is the set under the graph of a bounded integrable function and $A$ is the disjoint union of two such sets $B$ and $C$, we expect that the area of $A$ is equal to the sum of the areas of $B$ and $C$. Indeed in the following, it will be shown that this intuition is reflected in the additivity of the integral.

Theorem 2.6.17. (Additivity of upper and lower Integrals) Let $f : [a,b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$ and $c \in [a,b]$. Then

\[
\begin{aligned}
\sup(\{L(f,P) : P \in \mathcal{P}\}) &= \sup(\{L(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) + \sup(\{L(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\}) , \\
\inf(\{U(f,P) : P \in \mathcal{P}\}) &= \inf(\{U(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) + \inf(\{U(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\})
\end{aligned}
\]

where $\mathcal{P}$, $\mathcal{P}_{[a,c]}$, $\mathcal{P}_{[c,b]}$ denote the set consisting of all partitions of $[a,b]$, $[a,c]$ and $[c,b]$, respectively.

Proof. For this, let $P_1 = (a_0, \dots, a_\nu) \in \mathcal{P}_{[a,c]}$ and $P_2 = (a_{\nu+1}, \dots, a_{\nu+\mu}) \in \mathcal{P}_{[c,b]}$, where $\nu, \mu$ are some elements of $\mathbb{N}^*$, and $P := (a_0, \dots, a_\nu, a_{\nu+1}, \dots, a_{\nu+\mu})$ the corresponding element of $\mathcal{P}$. Then

\[
L(f,P) = L(f|_{[a,c]},P_1) + L(f|_{[c,b]},P_2) , \qquad U(f,P) = U(f|_{[a,c]},P_1) + U(f|_{[c,b]},P_2) .
\]

Now let $\varepsilon > 0$. Obviously because of Lemma 2.6.3, we can assume without restriction that $P$ is such that

\[
\begin{aligned}
\sup(\{L(f,P) : P \in \mathcal{P}\}) - L(f,P) &\le \frac{\varepsilon}{3} , \\
\sup(\{L(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) - L(f|_{[a,c]},P_1) &\le \frac{\varepsilon}{3} , \\
\sup(\{L(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\}) - L(f|_{[c,b]},P_2) &\le \frac{\varepsilon}{3} .
\end{aligned}
\]

Then also

\[
\left| \sup(\{L(f,P) : P \in \mathcal{P}\}) - \sup(\{L(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) - \sup(\{L(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\}) \right| \le \varepsilon .
\]

Analogously because of Lemma 2.6.3, we can also assume without restriction that $P$ is such that

\[
\begin{aligned}
U(f,P) - \inf(\{U(f,P) : P \in \mathcal{P}\}) &\le \frac{\varepsilon}{3} , \\
U(f|_{[a,c]},P_1) - \inf(\{U(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) &\le \frac{\varepsilon}{3} , \\
U(f|_{[c,b]},P_2) - \inf(\{U(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\}) &\le \frac{\varepsilon}{3} .
\end{aligned}
\]

Then also

\[
\left| \inf(\{U(f,P) : P \in \mathcal{P}\}) - \inf(\{U(f|_{[a,c]},P) : P \in \mathcal{P}_{[a,c]}\}) - \inf(\{U(f|_{[c,b]},P) : P \in \mathcal{P}_{[c,b]}\}) \right| \le \varepsilon .
\]

Since $\varepsilon > 0$ is arbitrary, both claimed equalities follow.

Corollary 2.6.18. (Additivity of the Riemann Integral) Let $f : [a,b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$, and $c \in [a,b]$. Then

\[
\int_a^b f(x)\, dx = \int_a^c f(x)\, dx + \int_c^b f(x)\, dx .
\]

Proof. The statement is a simple consequence of Theorem 2.6.13 and Theorem 2.6.17.

So far, we calculated the value of the integral only in some simple cases and from its definition. At the moment, by help of the linearity of the integral and the results in these cases, we can calculate integrals of linear functions over bounded closed intervals of $\mathbb{R}$, only. The next fundamental theorem will give us a powerful tool for such calculation. Below, that fundamental theorem will be given in two variations. Both are direct consequences of the additivity. The first displays that integration and differentiation are inverse processes. The second is a consequence of the first. For a certain class of integrands, it allows the calculation of the integral from the knowledge of the values of an antiderivative of its integrand at the ends of the interval of integration.

Theorem 2.6.19. Let $f : [a,b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then $F : [a,b] \to \mathbb{R}$ defined by

\[
F(x) := \int_a^x f(t)\, dt
\]

for every $x \in [a,b]$ is continuous. Furthermore, if $f$ is continuous in some point $x \in (a,b)$, then $F$ is differentiable in $x$ and

\[
F'(x) = f(x) .
\]

Proof. For $x, y \in [a,b]$, it follows by the Corollaries 2.6.18, 2.6.9 that

\[
|F(y) - F(x)| = \left| \int_x^y f(t)\, dt \right| \le M \, |y - x|
\]

if $y \ge x$ as well as that

\[
|F(y) - F(x)| = \left| \int_y^x f(t)\, dt \right| \le M \, |y - x|
\]

if $y < x$, where $M \ge 0$ is such that $|f(t)| \le M$ for all $t \in [a,b]$, and hence the continuity of $F$. Further, let $f$ be continuous in some point $x \in (a,b)$. Hence given $\varepsilon > 0$, there is $\delta > 0$ such that

\[
|f(t) - f(x)| < \varepsilon
\]

for all $t \in [a,b]$ such that $|t - x| < \delta$.
(Otherwise, there is some $\varepsilon > 0$ along with a sequence $t_1, t_2, \dots$ in $[a,b]$ such that $|f(t_n) - f(x)| \ge \varepsilon$ and $|t_n - x| < 1/n$ for all $n \in \mathbb{N}^*$. Then $t_1, t_2, \dots$ is converging to $x$, but $f(t_1), f(t_2), \dots$ is not convergent to $f(x)$.) Now let $h \in \mathbb{R}$ be such that $|h| < \delta$ and small enough such that $x + h \in (a,b)$. We consider the cases $h > 0$ and $h < 0$. In the first case, it follows by Theorem 2.6.13 and Corollaries 2.6.18, 2.6.9 that

\[
\begin{aligned}
\left| \frac{F(x+h) - F(x)}{h} - f(x) \right| &= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t)\, dt - \int_a^x f(t)\, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \left[ \int_a^x f(t)\, dt + \int_x^{x+h} f(t)\, dt - \int_a^x f(t)\, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \int_x^{x+h} [f(t) - f(x)]\, dt \right| \le \frac{1}{h} \int_x^{x+h} |f(t) - f(x)|\, dt \le \varepsilon .
\end{aligned}
\]

Analogously, in the second case it follows that

\[
\begin{aligned}
\left| \frac{F(x+h) - F(x)}{h} - f(x) \right| &= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t)\, dt - \int_a^x f(t)\, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t)\, dt - \int_a^{x+h} f(t)\, dt - \int_{x+h}^x f(t)\, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \int_{x+h}^{x} [f(t) - f(x)]\, dt \right| \le \frac{1}{|h|} \int_{x+h}^{x} |f(t) - f(x)|\, dt \le \varepsilon .
\end{aligned}
\]

Hence it follows that

\[
\lim_{h \to 0, h \ne 0} \frac{F(x+h) - F(x)}{h} = f(x)
\]

and that $F$ is differentiable in $x$ with derivative $f(x)$.

Remark 2.6.20. Note that because of Theorem 2.6.13, the function $F$ in Theorem 2.6.19 is differentiable with derivative $f(x)$ for almost all $x \in (a,b)$.

Theorem 2.6.21. (Fundamental Theorem of Calculus) Let $f : [a,b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $F$ be a continuous function on $[a,b]$ as well as differentiable on $(a,b)$ such that $F'(x) = f(x)$ for all $x \in (a,b)$. Then

\[
\int_a^b f(x)\, dx = F(b) - F(a) .
\]

In calculations, we sometimes use the notation

\[
[F(x)] \big|_a^b := F(b) - F(a) .
\]

Proof. Let $\varepsilon > 0$ and $P = (a_0, \dots, a_\nu)$ be a partition of $[a,b]$ where $\nu$ is an element of $\mathbb{N}^*$. By Theorem 2.5.6 for every $j \in \{0, 1, \dots, \nu - 1\}$, there is a corresponding $c_j \in [a_j, a_{j+1}]$ such that

\[
F(a_{j+1}) - F(a_j) = F'(c_j)(a_{j+1} - a_j)
\]

where we define $F'(a) := f(a)$ and $F'(b) := f(b)$. Hence

\[
F(b) - F(a) = \sum_{j=0}^{\nu-1} [F(a_{j+1}) - F(a_j)] = \sum_{j=0}^{\nu-1} f(c_j)(a_{j+1} - a_j)
\]

and

\[
L(f,P) \le F(b) - F(a) \le U(f,P) .
\]
Hence

\[
\sup(\{L(f,P) : P \in \mathcal{P}\}) \le F(b) - F(a) \le \inf(\{U(f,P) : P \in \mathcal{P}\}) = \sup(\{L(f,P) : P \in \mathcal{P}\})
\]

and therefore the integral of $f$ on $[a,b]$ equals $F(b) - F(a)$.

Example 2.6.22. Calculate

\[
\int_0^{\pi} 7 \sin\left( \frac{x}{3} \right) dx .
\]

Solution: By

\[
f(x) := 7 \sin\left( \frac{x}{3} \right)
\]

for all $x \in [0,\pi]$, there is defined a continuous and hence Riemann-integrable function on $[0,\pi]$. Further by

\[
F(x) := -21 \cos\left( \frac{x}{3} \right)
\]

for all $x \in [0,\pi]$, there is defined a continuous function on $[0,\pi]$ which is differentiable on $(0,\pi)$ such that $F'(x) = f(x)$ for all $x \in (0,\pi)$. Hence by Theorem 2.6.21

\[
\int_0^{\pi} 7 \sin\left( \frac{x}{3} \right) dx = -21 \cos\left( \frac{\pi}{3} \right) + 21 \cos(0) = 21 - \frac{21}{2} = \frac{21}{2} .
\]

Example 2.6.23. A simple number theoretic function is the greatest integer or floor function defined by

\[
[x] := n
\]

for all $x \in [n, n+1)$ and $n \in \mathbb{Z}$. Calculate

\[
\int_0^x [y]\, dy , \qquad \int_x^0 [y]\, dy
\]

for all $x \ge 0$ and $x < 0$, respectively. Solution: Note that the greatest integer function is almost everywhere continuous and hence according to Theorem 2.6.13 also Riemann-integrable on any bounded closed interval of $\mathbb{R}$. For every $n \in \mathbb{N}$ and every $x \in [n, n+1)$, it follows by Corollary 2.6.18 and Theorem 2.6.21 that

\[
\begin{aligned}
\int_0^x [y]\, dy &= \int_0^n [y]\, dy + \int_n^x [y]\, dy = \sum_{k=0}^{n-1} \int_k^{k+1} [y]\, dy + \int_n^x n \, dy \\
&= \sum_{k=0}^{n-1} \int_k^{k+1} k \, dy + n(x - n) = \sum_{k=0}^{n-1} k + n(x - n) \\
&= \frac{n}{2}(n-1) + n(x - n) = n \left( x - \frac{n+1}{2} \right) = [x] \left( x - \frac{[x]+1}{2} \right) .
\end{aligned}
\]

Analogously, it follows for every $n \in \mathbb{Z}$ such that $n \le -1$ and every $x \in [n, n+1)$ that

\[
\begin{aligned}
\int_x^0 [y]\, dy &= \int_x^{n+1} [y]\, dy + \int_{n+1}^0 [y]\, dy = \int_x^{n+1} n \, dy + \sum_{k=n+1}^{-1} \int_k^{k+1} [y]\, dy \\
&= n(n + 1 - x) + \sum_{k=n+1}^{-1} \int_k^{k+1} k \, dy = n(n + 1 - x) + \sum_{k=n+1}^{-1} k \\
&= n(n + 1 - x) - \frac{n}{2}(n+1) = -n \left( x - \frac{n+1}{2} \right) = -[x] \left( x - \frac{[x]+1}{2} \right) .
\end{aligned}
\]

See Fig. 65.

Fig. 65: Graph of the greatest integer function and anti-derivative.

A basic method for the evaluation of integrals with trigonometric integrands consists in the application of the addition theorems for sine and cosine.

Example 2.6.24. Calculate

\[
\int_0^{\pi} \sin(m\theta) \sin(n\theta)\, d\theta
\]

where $m, n \in \mathbb{N}^*$.
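Solution: By help of the addition theorem for the cosine function, it follows that

\[
\begin{aligned}
\cos((m+n)\theta) &= \cos(m\theta)\cos(n\theta) - \sin(m\theta)\sin(n\theta) , \\
\cos((m-n)\theta) &= \cos(m\theta)\cos(n\theta) + \sin(m\theta)\sin(n\theta) ,
\end{aligned}
\]

and hence that

\[
\sin(m\theta)\sin(n\theta) = \frac{1}{2} \left[ \cos((m-n)\theta) - \cos((m+n)\theta) \right]
\]

for all $\theta \in \mathbb{R}$. This leads to

\[
\begin{aligned}
\int_0^{\pi} \sin(m\theta)\sin(n\theta)\, d\theta &= \frac{1}{2} \int_0^{\pi} \left[ \cos((m-n)\theta) - \cos((m+n)\theta) \right] d\theta \\
&= \frac{1}{2(m-n)} \left[ \sin((m-n)\theta) \right]_0^{\pi} - \frac{1}{2(m+n)} \left[ \sin((m+n)\theta) \right]_0^{\pi} = 0
\end{aligned}
\]

if $m \ne n$ and

\[
\int_0^{\pi} \sin(m\theta)\sin(n\theta)\, d\theta = \frac{1}{2} \int_0^{\pi} \left[ 1 - \cos((m+n)\theta) \right] d\theta = \frac{\pi}{2} - \frac{1}{2(m+n)} \left[ \sin((m+n)\theta) \right]_0^{\pi} = \frac{\pi}{2}
\]

if $m = n$.

Example 2.6.25. Find the solutions of the following ('differential') equation for $f : \mathbb{R} \to \mathbb{R}$:

\[
f'(x) = e^{2x} + \sin(3x) \qquad (2.6.5)
\]

for all $x \in \mathbb{R}$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Hence it follows by Theorem 2.6.21 that

\[
\begin{aligned}
f(x) - f(x_0) &= \int_{x_0}^{x} f'(y)\, dy = \int_{x_0}^{x} \left( e^{2y} + \sin(3y) \right) dy = \left[ \frac{1}{2} e^{2y} - \frac{1}{3} \cos(3y) \right]_{x_0}^{x} \\
&= \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) - \frac{1}{2} e^{2x_0} + \frac{1}{3} \cos(3x_0)
\end{aligned}
\]

where $x_0 \in \mathbb{R}$ and $x > x_0$. Hence

\[
f(x) = \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c
\]

for all $x \in \mathbb{R}$ where

\[
c = f(0) - \frac{1}{2} + \frac{1}{3} = f(0) - \frac{1}{6} .
\]

On the other hand if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$ is defined by

\[
f_c(x) := \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c
\]

for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.5) for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given by the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f_c(0) - (1/6)$. Hence for every $y_0 \in \mathbb{R}$, there is precisely one solution of the differential equation with 'initial value' $f(0) = y_0$. The same is true for initial values given in any other point of $\mathbb{R}$.

Example 2.6.26. Find the solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$:

\[
f'(x) + a f(x) = 3 \qquad (2.6.6)
\]

for all $x \in \mathbb{R}$ where $a \in \mathbb{R}^*$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Further, by using the auxiliary function $h : \mathbb{R} \to \mathbb{R}$ defined by

\[
h(x) := e^{ax}
\]

for every $x \in \mathbb{R}$, it follows that

\[
(hf)'(x) = h(x) f'(x) + h'(x) f(x) = e^{ax} f'(x) + a e^{ax} f(x) = e^{ax} [f'(x) + a f(x)] = 3 e^{ax}
\]

for all $x \in \mathbb{R}$.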
Hence it follows by Theorem 2.6.21 that

\[
(hf)(x) - (hf)(x_0) = \int_{x_0}^{x} 3 e^{ay}\, dy = \frac{3}{a} e^{ax} - \frac{3}{a} e^{a x_0}
\]

and therefore that

\[
(hf)(x) = (hf)(x_0) + \frac{3}{a} \left( 1 - e^{a x_0} \right) + \frac{3}{a} \left( e^{ax} - 1 \right)
\]

for $x > x_0$ where $x_0 \in \mathbb{R}$. From this, we conclude that

\[
(hf)(x) = \frac{3}{a} \left( e^{ax} - 1 \right) + c
\]

and

\[
f(x) = \frac{3}{a} \left( 1 - e^{-ax} \right) + c \, e^{-ax}
\]

for all $x \in \mathbb{R}$ where $c$ is defined by $c := f(0)$. On the other hand if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$ is defined by

\[
f_c(x) := \frac{3}{a} \left( 1 - e^{-ax} \right) + c \, e^{-ax}
\]

for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.6) for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given by the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f_c(0)$. Hence for every $c \in \mathbb{R}$, there is precisely one solution of the differential equation with 'initial value' $f(0) = c$. The same is true for initial values given in any other point of $\mathbb{R}$.

Problems

1) Calculate »3 px 2 a) 2 »2 c) »2 e) 1 »1 g) » x 0 π 2 { 1 3 0 »π j) l) { π 2 »3 1 1 , x | 5x 3 |2 2 0 » 2π q) 0 » 2π s) 0 ? 3x f) 3x x h) | sinpx{2q| dx k) 1 4|x |3x 4| dx , 1| dx p) , 2 | dx » π{6 sinpmθq sinpnθq dθ , , 1| |x , dx , , »3 dx 5 x3 , , 2 dx 1 2 sinpx{2q cospx{2q dx dx 5 |x 1| |x 4 o) , 2 2 » 3π 3 »5 dx sinpπxq dx π »4 dx 1 |x 1| 3 »5 n) d) , 0 »2 , 9 t2{3 dt b) 4 sinpxq cos2 pxq dx 5 »2 m) , ?4x 3x 2x2 i) 7q dx 5x pe2x 3xq dx 1 »2 π{6 | cosp3xq| dx » 2π , , r) 0 , sinpmθq cospnθq dθ , cospmθq cospnθq dθ where m, n P N .

2) Define $f : \mathbb{R} \to \mathbb{R}$ by

\[
f(x) := \begin{cases} \sqrt{3} \, (1 + x) & \text{if } x \le -1/2 \\ \sqrt{3}/2 & \text{if } -1/2 < x < 1/2 \\ \sqrt{3} \, (1 - x) & \text{if } 1/2 \le x \end{cases}
\]

for every $x \in \mathbb{R}$. Calculate the area in $\mathbb{R}^2$ that is enclosed by the graph of $f$ and the $x$-axis. Verify your result using facts from elementary geometry. Use the result to calculate the area enclosed by a hexagon of side length 1.

3) Calculate the area in R2 that is enclosed by the graphs of the polynomials p1 pxq : 1 p7{2q x x2 , p2 pxq : 4 p7{2q x where x P R. x2

4) Calculate the area in $\mathbb{R}^2$ that is enclosed by the curve

\[
C := \{ (x,y) \in \mathbb{R}^2 : y^2 - 4x^2 + 4x^4 = 0 \} .
\]

5) Show that for all $x \in [0, \pi/2]$.
cospxq ¤ 1 x2 π 6) Find the solutions to the differential equation for f : R Ñ R. a) f 1 pxq 3f pxq x{2 , x P R , b) f 1 pxq 3f pxq ex{4 , x P R , c) 2f 1 pxq f pxq 3 ex , x P R . 7) Consider the following differential equation for f : R Ñ R. f 2 pxq 3x for all x P R. 4 a) Find the solutions of this equation. b) Find that solution which satisfies f p0q 1 and f 1 p0q 2. c) Find that solution which satisfies f p0q 2 and f p1q 3. 8) Calculate a0 : 1 2π bk : 1 π for all k P N . » » 2π 0 2π 0 f pxq dx , ak : 1 π » 2π 0 cospkxqf pxq dx , sinpkxqf pxq dx a) f pxq : # 1 245 1 if x P r0, π s , if x P pπ, 2π s b) f pxq : x for all x P r0, 2π s , c) f pxq : # x if x P r0, π s . 2π x if x P pπ, 2π s Remark: These are the coefficients of the Fourier expansion of f . The representation f pxq lim n # Ñ8 a0 lim n ¸ + rak cospkxq bk sinpkxqs k 1 is valid for every point x P r0, 2π s of continuity of f . 9) Calculate the area in R2 that is enclosed by the ellipse C : " 2 px, yq P R : xa2 2 y2 b2 1 * where a, b ¡ 0. 10) Calculate the area in R2 that is enclosed by the branches of hyperbolas " * y2 x2 C1 : px, y q P R : y ¥ 0 ^ 2 2 1 , a b " * p y cq2 x2 2 C2 : px, y q P R : y ¤ c ^ b2 1 a2 2 where a, b ¡ 0 and c ¡ a. 11) Let a, b P R be such that a b. Further, let f : ra, bs Ñ R be positive, i.e., such that f pxq ¥ 0 for all x P ra, bs, and assume a value ¡ 0 in some point of ra, bs. Show that »b a f pxq dx ¡ 0 . 12) Let a, b P R be such that a b. Further, let f : ra, bs Ñ R and g : ra, bs Ñ R be bounded and Riemann-integrable. Show the following Cauchy-Schwartz inequality for integrals: » b a f pxqg pxq dx 2 ¤ » b f pxq dx » b 2 a 246 g pxq dx 2 a . In addition, show that equality holds if and only if there are α,β P R satisfying that α2 β 2 0 and such that αf βg 0. Hint: Consider » b as a function of λ P R. 
a r f pxq λ g pxq s2 dx 13) Newton’s equation of motion for a point particle of mass m moving on a straight line is given by mf 2 ptq F pf ptqq ¥ 0 (2.6.7) for all t P R where f ptq is the position of the particle at time t and F pxq is the external force at the point x. For the specified force, calculate the solution function f of (2.6.7) with initial position f p0q x0 and initial speed f 1 p0q v0 where x0 , v0 P R. a) F pxq 0 , x P R , b) F pxq F0 , x P R where F0 is some real parameter . 14) Newton’s equation of motion for a point particle of mass m ¥ 0 moving on a straight line under the influence of a viscous friction is given by mf 2 ptq λf 1 ptq (2.6.8) for all t P R where f ptq is the position of the particle at time t and λ ¡ 0 is a parameter describing the strength of the friction. Calculate the solution function f of (2.6.8) with initial position f p0q x0 and initial speed f 1 p0q v0 where x0 , v0 P R. Investigate, whether f has a limit value for t Ñ 8. 15) Newton’s equation of motion for a point particle of mass m ¥ 0 moving on a straight line under the influence of low viscous friction, for instance friction exerted by air, is given by mf 2 ptq λ pf 1 ptqq2 (2.6.9) for all t P R where f ptq is the position of the particle at time t and λ ¡ 0 is a parameter describing the strength of the friction. Find solutions f of (2.6.9) with initial position f p0q x0 and initial speed f 1 p0q v0 where x0 , v0 P R. 16) Consider a projectile that is shot into the atmosphere. According to Newton’s equation of motion, the height f ptq above ground at time t P R satisfies the equation mf 2 ptq g λ pf 1 ptqq2 247 (2.6.10) for all t P R where g 9.81m{s2 is the acceleration due to gravity and λ ¡ 0 is a parameter describing the strength of the friction. Find solutions f of (2.6.10) with initial height f p0q z0 and initial speed component f 1 p0q v0 where z0 , v0 P R. 
3 Calculus II

3.1 Techniques of Integration

This section studies standard techniques of integration, namely the methods of change of variables (also referred to as 'integration by substitution'), integration by parts, integration by decomposition of rational integrands into partial fractions and, finally, approximate numerical calculation of integrals.

3.1.1 Change of Variables

The method of change of variables (also referred to as 'integration by substitution') is based on the chain rule for differentiation. For motivation, we consider a continuously differentiable and increasing function $g$ defined on a non-trivial open interval $I$ of $\mathbb{R}$ and a continuously differentiable function $F$ that is defined on an open interval containing $\mathrm{Ran}(g)$. Further, let $c, d \in I$ be such that $c < d$. Then it follows by the chain rule for differentiation that $F \circ g : I \to \mathbb{R}$ is continuously differentiable with derivative given by

\[
(F \circ g)'(u) = F'(g(u)) \, g'(u)
\]

for all $u \in I$. Further, it follows by the fundamental theorem of calculus, Theorem 2.6.21, that

\[
\begin{aligned}
\int_{g(c)}^{g(d)} F'(x)\, dx &= F(g(d)) - F(g(c)) = (F \circ g)(d) - (F \circ g)(c) \\
&= \int_c^d (F \circ g)'(u)\, du = \int_c^d F'(g(u)) \, g'(u)\, du .
\end{aligned}
\]

Hence by defining $f := F'$, we arrive at the formula for the change of variables

\[
\int_{g(c)}^{g(d)} f(x)\, dx = \int_c^d f(g(u)) \, g'(u)\, du
\]

for $f$. We note that the previous reasoning proves the validity of this equation if, in addition to the assumptions above on $g$, $c$ and $d$, $f$ is a continuous function that is defined on an open interval containing $\mathrm{Ran}(g)$ for which there is an antiderivative $F$, i.e., for which there is a differentiable function $F : D(f) \to \mathbb{R}$ such that $F'(x) = f(x)$ for all $x \in D(f)$. In the proof of the following theorem, the last is concluded from the continuity of the function $f$ and the fundamental theorem of calculus in the form of Theorem 2.6.19.

Theorem 3.1.1. (Change of variables) Let $c, d \in \mathbb{R}$ be such that $c < d$.
Further, let $g : [c,d] \to \mathbb{R}$ be continuous, such that $g(c) \le g(d)$ and continuously differentiable on $(c,d)$ with a derivative which can be extended to a continuous function on $[c,d]$. Finally, let $I$ be an open interval of $\mathbb{R}$ containing $g([c,d])$ and $f : I \to \mathbb{R}$ be continuous. Then

\[
\int_{g(c)}^{g(d)} f(x)\, dx = \int_c^d f(g(u)) \, g'(u)\, du . \qquad (3.1.1)
\]

Proof. In the special case that $g$ is a constant function, the statement of the theorem is obviously true. In the remainder of this proof, we consider the case of a non-constant $g$. We denote by $g'$ the extension of the derivative of $g|_{(c,d)}$ to a continuous function on $[c,d]$ and define $G : [c,d] \to \mathbb{R}$ by

\[
G(u) := \int_c^u f(g(\bar{u})) \, g'(\bar{u})\, d\bar{u}
\]

for all $u \in [c,d]$. By Theorem 2.6.19 it follows that $G$ is continuous as well as differentiable on $(c,d)$ with

\[
G'(u) = f(g(u)) \, g'(u)
\]

for all $u \in (c,d)$. Further, we define $F : [x_0, x_1] \to \mathbb{R}$ by

\[
F(x) := \int_{x_0}^{x} f(x')\, dx' - \int_{x_0}^{g(c)} f(x')\, dx'
\]

for all $x \in [x_0, x_1]$ where $x_0, x_1 \in I$ are such that $x_0$ is smaller than the minimum value of $g$ and $x_1$ is larger than the maximum value of $g$, respectively. By Theorem 2.6.19 it follows that $F$ is continuous as well as differentiable on $(x_0, x_1)$ with

\[
F'(x) = f(x)
\]

for all $x \in (x_0, x_1)$. Hence it follows by Theorems 2.3.51, 2.4.10 that $F \circ g$ is continuous as well as differentiable on $(c,d)$ with

\[
(F \circ g)'(u) = f(g(u)) \, g'(u) = G'(u)
\]

for all $u \in (c,d)$. From Theorem 2.5.7 and $F(g(c)) = G(c) = 0$, it follows that $F \circ g = G$ and hence by Corollary 2.6.18 also (3.1.1).

Example 3.1.2. Calculate

\[
\int_1^3 x \sqrt{x - 1}\, dx .
\]

Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := u + 1$ for all $u \in \mathbb{R}$. Then $g$ is increasing and continuously differentiable with a derivative function constant of value 1. In particular, $g(0) = 1$ and $g(2) = 3$. Further, we define the continuous function $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := x \sqrt{|x - 1|}$ for all $x \in \mathbb{R}$. Hence it follows by Theorem 3.1.1 that

\[
\begin{aligned}
\int_1^3 x \sqrt{x - 1}\, dx &= \int_{g(0)}^{g(2)} f(x)\, dx = \int_0^2 f(g(u)) \, g'(u)\, du = \int_0^2 (u + 1) \sqrt{u}\, du \\
&= \int_0^2 \left( u^{3/2} + u^{1/2} \right) du = \left[ \frac{2}{5} u^{5/2} + \frac{2}{3} u^{3/2} \right]_0^2 = \left[ \frac{2}{15} (3u + 5) \, u^{3/2} \right]_0^2 \\
&= \frac{22}{15} \, 2^{3/2} = \frac{44}{15} \sqrt{2} .
\end{aligned}
\]

Note that we could have achieved this result also by the following simpler reasoning.

\[
\begin{aligned}
\int_1^3 x \sqrt{x - 1}\, dx &= \int_1^3 (x - 1 + 1) \sqrt{x - 1}\, dx = \int_1^3 \left[ (x - 1)^{3/2} + (x - 1)^{1/2} \right] dx \\
&= \left[ \frac{2}{5} (x - 1)^{5/2} + \frac{2}{3} (x - 1)^{3/2} \right]_1^3 = \left[ \frac{2}{15} (3x + 2) (x - 1)^{3/2} \right]_1^3 \\
&= \frac{22}{15} \, 2^{3/2} = \frac{44}{15} \sqrt{2} .
\end{aligned}
\]

Simple substitutions can often be avoided by application of such simple 'tricks'. Below, we will give some examples where this is not the case.

Example 3.1.3. Calculate

\[
\int_1^2 \frac{\sin(\ln x)}{x}\, dx .
\]

Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := e^u$ for all $u \in \mathbb{R}$. Then $g$ is increasing and continuously differentiable with derivative $g'(u) = e^u$ for all $u \in \mathbb{R}$. In particular, $g(0) = 1$ and $g(\ln 2) = \exp(\ln 2) = 2$. Further, we define the continuous function $f : (0, \infty) \to \mathbb{R}$ by $f(x) := \sin(\ln x)/x$ for all $x > 0$. Then, it follows by Theorem 3.1.1 that

\[
\begin{aligned}
\int_1^2 \frac{\sin(\ln x)}{x}\, dx &= \int_{g(0)}^{g(\ln 2)} f(x)\, dx = \int_0^{\ln 2} f(g(u)) \, g'(u)\, du = \int_0^{\ln 2} \frac{\sin(\ln e^u)}{e^u} \, e^u\, du \\
&= \int_0^{\ln 2} \sin u \, du = [-\cos u] \big|_0^{\ln 2} = 1 - \cos(\ln 2) = 1 - \cos\left( \frac{\ln 2}{2} + \frac{\ln 2}{2} \right) \\
&= 1 - \cos^2\left( \frac{\ln 2}{2} \right) + \sin^2\left( \frac{\ln 2}{2} \right) = 2 \sin^2\left( \frac{\ln 2}{2} \right)
\end{aligned}
\]

where, in particular, the addition theorem for the cosine was applied. The reason for continuing the simplification of the result $1 - \cos(\ln 2)$ is motivated by applications. Usually in applications, a calculation of the previous type is only a small step in a sequence of steps toward a final result. Hence, typically, such a result would be needed as input for the next step. Therefore, it is useful to reduce results in their 'size' in order to avoid a final result of even larger 'size'. Usually, the implications of results of relatively large 'size' are less obvious. Note also that the final expression makes obvious the positivity of the integral which is due to the positivity of the integrand in the interval of integration.
The last can be seen from the inequality
$$0 \le \ln x \le x - 1 \le \pi$$
for all $x \in [1,2]$ where the inequality (2.5.12) for the case $a = 1$ was applied. Quite generally, such a consistency check of the signs of results can avoid errors.

Also in this case, the application of change of variables could have been avoided. Usually, for a successful application of the method of change of variables, the presence of an 'inner function' in the integrand is needed. The function $g$ in Theorem 3.1.1 is then defined in such a way that that inner function is simplified. In many simple cases, the derivative of that inner function is also present in the integrand. Often, this can be used to 'guess' an antiderivative $F$ of the integrand. For instance in this case, an obvious candidate for an inner function is the natural logarithm function $\ln$. Since $\ln'(x) = 1/x$ for all $x > 0$, we see that its derivative is also present in the integrand. Hence a first guess (incorrect) for such $F$ might be $F(x) := \sin(\ln x)$ for all $x > 0$. Then it would follow by the chain rule for differentiation that
$$F'(x) = \cos(\ln x)\cdot\frac{1}{x} = \frac{\cos(\ln x)}{x}$$
for all $x > 0$. $F'$ does not coincide with the integrand on the interval $[1,2]$ because of the presence of the cosine function instead of the sine function. Of course, there is a simple remedy for this. A second (correct) guess for such $F$ would be $F(x) := -\cos(\ln x)$ for all $x > 0$. As a consequence of the chain rule for differentiation, this gives
$$F'(x) = \sin(\ln x)\cdot\frac{1}{x} = \frac{\sin(\ln x)}{x}$$
and hence that $F$ is an antiderivative of the integrand. Hence, we conclude by the fundamental theorem of calculus that
$$\int_1^2 \frac{\sin(\ln x)}{x}\,dx = \left[-\cos(\ln x)\right]_1^2 = 1 - \cos(\ln 2) = 2\sin^2\!\left(\tfrac{\ln 2}{2}\right) .$$

We give now in succession four examples of more serious applications of change of variables. The first three give standard trigonometric substitutions whose goal is the removal of square roots in integrands.
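The fourth example gives a standard substitution that is used to transform rational expressions in sine and cosine functions of the same argument into rational expressions of the new variable.

Example 3.1.4. Calculate
$$\int_0^x \frac{dy}{\sqrt{y^2 + a^2}}$$
for every $x > 0$ where $a > 0$. Solution: Define $g : (-\pi/2, \pi/2) \to \mathbb{R}$ by $g(\theta) := a\tan\theta$ for all $\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that
$$g'(\theta) = a\,(1 + \tan^2\theta)$$
for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by $g^{-1}(x) = \arctan(x/a)$ for all $x \in \mathbb{R}$. By Theorem 3.1.1
$$\begin{aligned}
\int_0^x \frac{dy}{\sqrt{y^2 + a^2}} &= \int_{g(0)}^{g(g^{-1}(x))} \frac{dy}{\sqrt{y^2 + a^2}} = \int_0^{g^{-1}(x)} \frac{g'(\theta)\,d\theta}{\sqrt{(g(\theta))^2 + a^2}} = \int_0^{g^{-1}(x)} \frac{d\theta}{\cos\theta} \\
&= \ln\!\left(\frac{1 + \sin(g^{-1}(x))}{\cos(g^{-1}(x))}\right) = \ln\!\left(\frac{x}{a} + \sqrt{1 + \frac{x^2}{a^2}}\right) .
\end{aligned}$$

Example 3.1.5. Calculate
$$\int_0^2 \sqrt{9 - x^2}\,dx .$$
Solution: Define $g : (-\pi/2, \pi/2) \to (-3,3)$ by $g(\theta) := 3\sin\theta$ for all $\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that $g'(\theta) = 3\cos\theta$ for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by $g^{-1}(x) = \arcsin(x/3)$ for all $x \in (-3,3)$. By Theorem 3.1.1
$$\begin{aligned}
\int_0^2 \sqrt{9 - x^2}\,dx &= \int_{g(0)}^{g(\arcsin(2/3))} \sqrt{9 - x^2}\,dx = \int_0^{\arcsin(2/3)} \sqrt{9 - (g(\theta))^2}\;g'(\theta)\,d\theta \\
&= 9\int_0^{\arcsin(2/3)} \cos^2\theta\,d\theta = \frac{9}{2}\int_0^{\arcsin(2/3)} \left(1 + \cos(2\theta)\right) d\theta \\
&= \frac{9}{2}\left[\arcsin\tfrac{2}{3} + \tfrac{1}{2}\sin\!\left(2\arcsin\tfrac{2}{3}\right)\right] = \frac{9}{2}\left[\arcsin\tfrac{2}{3} + \tfrac{2}{3}\cos\!\left(\arcsin\tfrac{2}{3}\right)\right] \\
&= \sqrt{5} + \frac{9}{2}\arcsin\tfrac{2}{3} .
\end{aligned}$$

Example 3.1.6. Calculate
$$\int_4^5 \frac{\sqrt{x^2 - 9}}{x^4}\,dx .$$
Solution: Define $g : (0, \pi/2) \to (3,\infty)$ by $g(\theta) := 3/\cos\theta$ for all $\theta \in (0, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that
$$g'(\theta) = \frac{3\sin\theta}{\cos^2\theta}$$
for all $\theta \in (0, \pi/2)$. The inverse $g^{-1}$ is given by $g^{-1}(x) = \arccos(3/x)$ for all $x \in (3,\infty)$. By Theorem 3.1.1
$$\begin{aligned}
\int_4^5 \frac{\sqrt{x^2 - 9}}{x^4}\,dx &= \int_{g(\arccos(3/4))}^{g(\arccos(3/5))} \frac{\sqrt{x^2 - 9}}{x^4}\,dx = \int_{\arccos(3/4)}^{\arccos(3/5)} \frac{\sqrt{(g(\theta))^2 - 9}}{(g(\theta))^4}\,g'(\theta)\,d\theta \\
&= \frac{1}{9}\int_{\arccos(3/4)}^{\arccos(3/5)} \cos\theta\,\sin^2\theta\,d\theta = \frac{1}{27}\left[\sin^3(\arccos(3/5)) - \sin^3(\arccos(3/4))\right] \\
&= \frac{1}{27}\left[\left(\tfrac{4}{5}\right)^3 - \left(\tfrac{\sqrt{7}}{4}\right)^3\right] .
\end{aligned}$$

Example 3.1.7. Calculate
$$\int_0^{\pi/2} \frac{d\theta}{4\cos\theta + 5} .$$
Solution: Define $g : \mathbb{R} \to (-\pi, \pi)$ by $g(x) := 2\arctan x$ for all $x \in \mathbb{R}$.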
This is a standard substitution to transform a rational integrand in $\sin$ and $\cos$ into a rational integrand. Then $g$ is bijective as well as continuously differentiable such that
$$g'(x) = \frac{2}{1 + x^2}$$
for all $x \in \mathbb{R}$. The inverse $g^{-1}$ is given by $g^{-1}(\theta) = \tan(\theta/2)$ for all $\theta \in (-\pi, \pi)$. By Theorem 3.1.1
$$\begin{aligned}
\int_0^{\pi/2} \frac{d\theta}{5 + 4\cos\theta} &= \int_{g(0)}^{g(1)} \frac{d\theta}{5 + 4\cos\theta} = \int_0^1 \frac{g'(x)\,dx}{5 + 4\cos(g(x))} = \int_0^1 \frac{g'(x)\,dx}{5 + 4\left[2\cos^2\!\left(\frac{g(x)}{2}\right) - 1\right]} \\
&= \int_0^1 \frac{g'(x)\,dx}{5 + 4\left[\dfrac{2}{1 + \tan^2(g(x)/2)} - 1\right]} = \int_0^1 \frac{\dfrac{2}{1+x^2}\,dx}{5 + 4\left[\dfrac{2}{1 + x^2} - 1\right]} \\
&= 2\int_0^1 \frac{dx}{9 + x^2} = \frac{2}{3}\arctan\tfrac{1}{3} .
\end{aligned}$$
Note that
$$\cos(g(x)) = \frac{1 - x^2}{1 + x^2} , \qquad \sin(g(x)) = \frac{2x}{1 + x^2}$$
for all $x \in \mathbb{R}$.

Fig. 66: Graphs of solutions of the differential equation (3.1.2) in the case that $a = 1$ with initial values $-\pi$, $-\pi/2$, $\pi/2$ and $\pi$ at $x = 0$. Compare Example 3.1.8.

The following example gives a typical application of change of variables to the solution of ('separable') ordinary differential equations of the first order.

Example 3.1.8. Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.
$$f'(x) = a\sin(f(x)) \tag{3.1.2}$$
for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0,\pi)$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Since $f(0) \in (0,\pi)$, the continuity of $f$ implies the existence of $c, d \in \mathbb{R}$ such that $c < 0 < d$ and such that $f([c,d]) \subset (0,\pi)$. Since $a > 0$ and the sine function is $\ge 0$ on the interval $[0,\pi]$, it follows from (3.1.2) that
$$f(x_1) = f(x_0) + f(x_1) - f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x)\,dx = f(x_0) + \int_{x_0}^{x_1} a\sin(f(x))\,dx \ge f(x_0)$$
for all $x_0, x_1 \in [c,d]$ such that $x_0 \le x_1$. In addition, the restriction of $f$ to $[c,d]$ is non-constant since the sine function has no zeros on $(0,\pi)$. Hence we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c,d]$ that
$$a\,(x - c) = \int_c^x a\,du = \int_c^x \frac{f'(u)}{\sin(f(u))}\,du = \int_{f(c)}^{f(x)} \frac{d\theta}{\sin\theta} .$$
Further, it follows by use of the transformation $g$ from the previous Example 3.1.7 and Theorem 3.1.1 that
$$\int_{f(c)}^{f(x)} \frac{d\theta}{\sin\theta} = \int_{g(\tan(f(c)/2))}^{g(\tan(f(x)/2))} \frac{d\theta}{\sin\theta} = \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{g'(t)\,dt}{\sin(g(t))} = \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{dt}{t} = \ln\!\left(\frac{\tan(f(x)/2)}{\tan(f(c)/2)}\right) .$$
Hence it follows that
$$a\,(x - c) = \ln\!\left(\frac{\tan(f(x)/2)}{\tan(f(c)/2)}\right) \tag{3.1.3}$$
which leads to
$$f(x) = 2\arctan\!\left(\tan\!\left(\frac{f(c)}{2}\right) e^{a(x-c)}\right) . \tag{3.1.4}$$
From (3.1.3) for $x = 0$, we conclude that
$$\tan\!\left(\frac{f(c)}{2}\right) = e^{ac}\tan\!\left(\frac{f(0)}{2}\right) .$$
Substituting this identity into (3.1.4) gives
$$f(x) = 2\arctan\!\left(\tan\!\left(\frac{f(0)}{2}\right) e^{ax}\right) .$$
On the other hand, for every $c \in (-\pi, \pi)$, it follows by elementary calculation that $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) := 2\arctan\!\left(\tan\!\left(\frac{c}{2}\right) e^{ax}\right)$$
for all $x \in \mathbb{R}$ satisfies (3.1.2) and $f(0) = c$. As a side remark, note that for every $k \in \mathbb{Z}$ the constant function of value $k\pi$ is a solution of (3.1.2). In addition, if $f$ is a solution of (3.1.2), then for every $k \in \mathbb{Z}$ also $f_k : \mathbb{R} \to \mathbb{R}$ defined by $f_k(x) := f(x) + 2\pi k$ for every $x \in \mathbb{R}$ is a solution of (3.1.2).

For the motivation of the following theorem, we consider the map $R : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $R(x,y) := (-x, y)$ for all $(x,y) \in \mathbb{R}^2$. A geometrical interpretation of $R$ is that of a reflection in the $y$-axis. This can be seen as follows. Let $(x,y)$ be some point in $\mathbb{R}^2$. Then the line segment from $(x,y)$ to $R(x,y) = (-x,y)$, at the intersection $(0,y)$ with the $y$-axis, is at a right angle with the $y$-axis and both points $(x,y)$ and $R(x,y)$ are at a distance $|x|$ from the $y$-axis. Therefore, $R$ meets the geometrical definition of the reflection in the $y$-axis.

Fig. 67: The line segment from $(1,3)$ to $R(1,3) = (-1,3)$ intersects the $y$-axis at a right angle and is halved by that axis. The yellow rectangles are mapped onto each other by $R$. Compare text.

Intuitively (according to elementary geometry), we would not expect that
such reflection changes areas, i.e., if $S$ is some subset of $\mathbb{R}^2$ of area $A$, then we expect that the set $R(S)$ has the same area. For instance, a rectangle $[a,b] \times [c,d]$ in $\mathbb{R}^2$, where $a \le b$ and $c \le d$, is mapped by $R$ into the rectangle
$$R([a,b] \times [c,d]) = [-b,-a] \times [c,d] .$$
Both rectangles have the same area $(b-a)(d-c)$. Within the definition of Riemann-integrability above, we defined the area under the graph of a bounded integrable $f : [a,b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only positive ($\ge 0$) values by
$$\int_a^b f(x)\,dx .$$
We consider the associated function $\bar{f} : [-b,-a] \to \mathbb{R}$ defined by $\bar{f}(x) := f(-x)$ for all $x \in [-b,-a]$. We claim that the graph of $\bar{f}$ is the image of the graph of $f$ under $R$, i.e.,
$$G(\bar{f}) = R(G(f)) .$$
Indeed, if $x \in [-b,-a]$, then $-x \in [a,b]$ and
$$(x, \bar{f}(x)) = (x, f(-x)) = R(-x, f(-x)) \in R(G(f)) .$$
Also, if $x \in [a,b]$, then $-x \in [-b,-a]$ and
$$R(x, f(x)) = (-x, f(x)) = (-x, f(-(-x))) = (-x, \bar{f}(-x)) \in G(\bar{f}) .$$
Therefore, we expect that $\bar{f}$ is bounded, integrable and that the area under the graph of $\bar{f}$ is equal to the area under the graph of $f$, i.e., that
$$\int_a^b f(x)\,dx = \int_{-b}^{-a} f(-x)\,dx . \tag{3.1.5}$$
Indeed, it is shown within the proof of the following theorem that this is the case. Note that we can view this result as a kind of change of variables. For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := -x$. Then $g$ is decreasing and continuously differentiable with a derivative function which is constant of value $-1$. Hence $g$ does not satisfy the assumptions of Theorem 3.1.1. In particular, $g(-a) = a$ and $g(-b) = b$. A formal application of the change of variable formula (3.1.1) would give
$$\int_a^b f(x)\,dx = \int_{-a}^{-b} f(g(u))\,g'(u)\,du \qquad \text{(incorrect)}$$
which does not make sense according to our definitions because $-a > -b$. The correct formula (3.1.5) can be 'obtained' from this formula by exchange of the integration limits.

Theorem 3.1.9. Let $f$ be a bounded Riemann-integrable function on $[a,b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then
$$\int_a^b f(x)\,dx = \int_{-b}^{-a} f(-x)\,dx .$$

Fig.
68: The graphs of $([-2,-1] \to \mathbb{R},\ x \mapsto (-x)^2)$ and $([1,2] \to \mathbb{R},\ x \mapsto x^2)$ are reflection symmetric with respect to the $y$-axis. Compare text.

Proof. Define $\bar{f} : [-b,-a] \to \mathbb{R}$ by $\bar{f}(x) := f(-x)$ for all $x \in [-b,-a]$. Then $\bar{f}$ is bounded, and for any partition $P = (a_0, \dots, a_\nu)$ of $[a,b]$ where $\nu \in \mathbb{N}^*$, $a_0, \dots, a_\nu \in [a,b]$, $\bar{P} := (-a_\nu, \dots, -a_0)$ is a partition of $[-b,-a]$, and in particular
$$L(\bar{f}, \bar{P}) = L(f, P) , \qquad U(\bar{f}, \bar{P}) = U(f, P) .$$
Analogously, for any partition $P = (a_0, \dots, a_\nu)$ of $[-b,-a]$ where $\nu \in \mathbb{N}^*$, $a_0, \dots, a_\nu \in [-b,-a]$, $\bar{P} := (-a_\nu, \dots, -a_0)$ is a partition of $[a,b]$, and in particular
$$L(\bar{f}, P) = L(f, \bar{P}) , \qquad U(\bar{f}, P) = U(f, \bar{P}) .$$
Hence the set consisting of the lower sums of $\bar{f}$ is equal to the set of lower sums of $f$ and the set consisting of the upper sums of $\bar{f}$ is equal to the corresponding set of upper sums of $f$. In particular, $\bar{f}$ is Riemann-integrable and its integral coincides with that of $f$.

The following example displays a typical application of the previous theorem to functions $f$ that are defined on intervals that are symmetric to the origin, i.e., of the form $[-a,a]$, where $a \ge 0$, as well as bounded, integrable and antisymmetric, i.e., such that $f(-x) = -f(x)$ for all $x \in [-a,a]$. Their integrals vanish.

Example 3.1.10. Calculate
$$\int_{-1}^{1} 3\sin(2x)\,dx .$$
Solution: By Theorem 3.1.9, it follows that
$$\int_{-1}^{1} 3\sin(2x)\,dx = \int_{-1}^{1} 3\sin(-2x)\,dx = -\int_{-1}^{1} 3\sin(2x)\,dx$$
and hence that
$$\int_{-1}^{1} 3\sin(2x)\,dx = 0 .$$
A variation of the previous reasoning is displayed in the next example.

Example 3.1.11. Calculate
$$\int_0^{\pi} x\sin^2(x)\,dx .$$
Solution: First by Theorem 3.1.9, it follows that
$$\int_0^{\pi} x\sin^2(x)\,dx = \int_{-\pi}^{0} (-x)\sin^2(-x)\,dx = -\int_{-\pi}^{0} x\sin^2(x)\,dx .$$
Further, it follows by Theorem 3.1.1 and Example 2.6.24 that
$$\int_{-\pi}^{0} x\sin^2(x)\,dx = \int_0^{\pi} (y - \pi)\sin^2(y - \pi)\,dy = \int_0^{\pi} y\sin^2(y)\,dy - \pi\int_0^{\pi} \sin^2(y)\,dy = \int_0^{\pi} y\sin^2(y)\,dy - \frac{\pi^2}{2}$$
and, finally, that
$$\int_0^{\pi} x\sin^2(x)\,dx = \frac{\pi^2}{4} .$$

Another typical application of Theorem 3.1.9 applies to functions $f$ defined on intervals that are symmetric to the origin, i.e., of the form $[-a,a]$, where $a \ge 0$, that are bounded, integrable and symmetric, i.e., such that $f(-x) = f(x)$ for all $x \in [-a,a]$. The value of the integral of such a function is twice the value of the corresponding integral of its restriction to $[0,a]$.

Example 3.1.12. Show that
$$\int_{-\pi}^{\pi} \frac{\sin(x)}{x}\,dx = 2\int_0^{\pi} \frac{\sin(x)}{x}\,dx .$$
Solution: By Corollary 2.6.18 and Theorem 3.1.9, it follows that
$$\begin{aligned}
\int_{-\pi}^{\pi} \frac{\sin(x)}{x}\,dx &= \int_{-\pi}^{0} \frac{\sin(x)}{x}\,dx + \int_0^{\pi} \frac{\sin(x)}{x}\,dx \\
&= \int_0^{\pi} \frac{\sin(-x)}{-x}\,dx + \int_0^{\pi} \frac{\sin(x)}{x}\,dx = 2\int_0^{\pi} \frac{\sin(x)}{x}\,dx .
\end{aligned}$$

Remark 3.1.13. The solution of the following problem n) from 1) illustrates the general rule that one should never blindly rely on computer programs. In Mathematica 5.1, the command Integrate[(x^2 2x 4)^{3/2}, {x, 1, 2}] gives the output 1 (68 16 27 Log[3]) which is incorrect.

Problems

1) Calculate the value of the integral. For this, if the antiderivative of the integrand is not obvious, use a suitable substitution. »1 p2x 1q { dx , b) »1 1 2 a) 0 0 u p2u 1q1{2 du , »1 c) x p3x 1 g) 2 »2 1 x 0 x 0 tanpθq dθ »4b 3 k) 3 »6c b » 3 7 o) 3 »2 q) 1 »π 2 ?dx2 2 x 4 dx ? x2 5 x2 x a 1 0 »π 2 { π{2 »2 , x1{3 1 ?x dx , x2 1 »π 2 { r) 0 sinp2θq dθ , »2 sinpθq 0 ?x x 2 u 12 du dx , 4q3{2 , 2 , dx x 9 dθ sinp3θq 2 » π{2 , 4u u2 px2 2x ?dx2 t) cos4 pθq dθ sin4 pθq cos4 pθq dx 4 1 p) , 2 n) »3 , j) x1{2 l) ds , »4 x P r0, π {2q , ?x dx 3 m) , 3 2 2 » u) s 2s2 px 2q2 sin x x 2 dx , f) 3 ? »1 »3 sinp xq ? dx , h) u eu {2 du i) s) ? 1 2 »5 e) »3 1q { dx , d) 2 , , dθ 2 cospθq , .

2) Let $a \in \mathbb{R}$, $f : [-a,a] \to \mathbb{R}$ be Riemann-integrable and $g : \mathbb{R} \to \mathbb{R}$ be Riemann-integrable over every interval $[b,c]$ where $b, c \in \mathbb{R}$ are such that $b \le c$. Show that

a) $\displaystyle \int_{-a}^{a} f(x)\,dx = 0$ if $f$ is antisymmetric, i.e., if $f(-x) = -f(x)$ for all $x \in [-a,a]$.

b) $\displaystyle \int_{-a}^{a} f(x)\,dx = 2\int_0^{a} f(x)\,dx$ if $f$ is symmetric, i.e., if $f(-x) = f(x)$ for all $x \in [-a,a]$.
c) $\displaystyle \int_b^c g(x)\,dx = \int_{b+\tau}^{c+\tau} g(x)\,dx$, if $b, c \in \mathbb{R}$ are such that $b \le c$ and $g$ is periodic with period $\tau \ge 0$, i.e., if $g(x + \tau) = g(x)$ for all $x \in \mathbb{R}$.

3) Calculate the area in $(-\infty, 0]^2$ that is enclosed by the strophoid
$$C := \{ (x,y) \in \mathbb{R}^2 : (a - x)\,y^2 - (a + x)\,x^2 = 0 \}$$
where $a > 0$.

4) Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values. f 1 pxq 2 cospf pxqq 3 for all $x \in \mathbb{R}$, $f(0) \in [-\pi, \pi)$.

3.1.2 Integration by Parts

The method of integration by parts is based on the product rule for differentiation. For motivation, we consider continuous functions $F : [a,b] \to \mathbb{R}$ and $G : [a,b] \to \mathbb{R}$ whose restrictions to the open interval $(a,b)$ are differentiable with derivatives which can be extended to bounded Riemann-integrable functions $f : [a,b] \to \mathbb{R}$ and $g : [a,b] \to \mathbb{R}$, respectively. Then it follows by the fundamental theorem of calculus and the product rule for differentiation that
$$\begin{aligned}
F(b)G(b) - F(a)G(a) &= \int_a^b (FG)'(x)\,dx = \int_a^b \left[F'(x)G(x) + F(x)G'(x)\right] dx \\
&= \int_a^b F'(x)G(x)\,dx + \int_a^b F(x)G'(x)\,dx = \int_a^b f(x)G(x)\,dx + \int_a^b F(x)g(x)\,dx
\end{aligned}$$
and hence that
$$\int_a^b F(x)g(x)\,dx = F(b)G(b) - F(a)G(a) - \int_a^b f(x)G(x)\,dx .$$
We note the sign change and how antiderivatives, denoted by capital letters, switch positions inside the integrals. A typical application of the last formula consists in the following steps. The integrand of a given integral needs to be represented by a product of two functions. The first function will be differentiated in the process, and its derivative will appear as the first factor in the transformed integrand. For the second function an antiderivative should be available. That antiderivative will appear as the second factor in the transformed integrand. The final result is obtained in form of a difference.
The minuend is given by the difference of the product of the first factor with the antiderivative of the second factor evaluated at the upper limit of integration and the value of that product at the lower limit of integration. The subtrahend is given by the integral over the original interval of integration with the transformed integrand.

Theorem 3.1.14. (Integration by parts) Let $f$, $g$ be bounded Riemann-integrable functions on $[a,b]$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a < b$. Further, let $F$, $G$ be continuous functions on $[a,b]$ which are differentiable on $(a,b)$ and such that $F'(x) = f(x)$ and $G'(x) = g(x)$ for all $x \in (a,b)$. Then
$$\int_a^b F(x)g(x)\,dx = F(b)G(b) - F(a)G(a) - \int_a^b f(x)G(x)\,dx .$$

Proof. First as a consequence of Theorem 2.6.13, $fG$ and $Fg$ are both Riemann-integrable as products of Riemann-integrable functions. Moreover, $FG$ is continuous and differentiable on $(a,b)$ such that
$$(FG)'(x) = f(x)G(x) + F(x)g(x)$$
for all $x \in (a,b)$, and $fG + Fg$ is Riemann-integrable by Theorem 2.6.8 as a sum of Riemann-integrable functions. Hence by Theorem 2.6.21
$$\int_a^b f(x)G(x)\,dx + \int_a^b F(x)g(x)\,dx = \int_a^b \left[f(x)G(x) + F(x)g(x)\right] dx = F(b)G(b) - F(a)G(a) .$$

The first example gives a typical application of integration by parts where the occurrence of the derivative of the first factor in the transformed integrand is used to lower the order of a polynomial appearing in the original integral.

Example 3.1.15. Calculate
$$\int_0^{\pi} x\cos(3x)\,dx .$$
Solution: Define $F, G, f, g : [0,\pi] \to \mathbb{R}$ by
$$F(x) := x , \quad g(x) := \cos(3x) , \quad f(x) := 1 , \quad G(x) := \tfrac{1}{3}\sin(3x)$$
for all $x \in [0,\pi]$. Hence by Theorems 3.1.14, 2.6.21:
$$\int_0^{\pi} x\cos(3x)\,dx = -\frac{1}{3}\int_0^{\pi} \sin(3x)\,dx = \frac{1}{9}\cos(3\pi) - \frac{1}{9}\cos(0) = -\frac{2}{9} .$$

Another typical application consists in a repeated use of integration by parts until the original integral reappears, but multiplied by a factor which is different from $1$. In such a case the resulting equation can be solved for the original integral.

Example 3.1.16. Calculate
$$\int_0^{\pi} e^x\sin(2x)\,dx .$$
Solution: Define $F, G, f, g : [0,\pi] \to \mathbb{R}$ by
$$F(x) := e^x , \quad g(x) := \sin(2x) , \quad f(x) := e^x , \quad G(x) := -\tfrac{1}{2}\cos(2x)$$
for all $x \in [0,\pi]$. Then by Theorem 3.1.14,
$$\int_0^{\pi} e^x\sin(2x)\,dx = \frac{1}{2}\,(1 - e^{\pi}) + \frac{1}{2}\int_0^{\pi} e^x\cos(2x)\,dx . \tag{3.1.6}$$
To determine the last integral, define $F, G, f, g : [0,\pi] \to \mathbb{R}$ by
$$F(x) := e^x , \quad g(x) := \cos(2x) , \quad f(x) := e^x , \quad G(x) := \tfrac{1}{2}\sin(2x)$$
for all $x \in [0,\pi]$. Then by Theorem 3.1.14,
$$\frac{1}{2}\int_0^{\pi} e^x\cos(2x)\,dx = -\frac{1}{4}\int_0^{\pi} e^x\sin(2x)\,dx \tag{3.1.7}$$
and hence by (3.1.6), (3.1.7) finally:
$$\int_0^{\pi} e^x\sin(2x)\,dx = -\frac{2}{5}\,(e^{\pi} - 1) .$$

Of course, every integrand can be represented by its product with the constant function of value $1$. Such a representation can sometimes lead to a successful application of the method of partial integration as in the following example.

Example 3.1.17. Calculate
$$\int_1^{e} \ln(4x)\,dx .$$
Solution: Define $F, G, f, g : [1,e] \to \mathbb{R}$ by
$$F(x) := \ln(4x) , \quad g(x) := 1 , \quad f(x) := \frac{1}{x} , \quad G(x) := x$$
for all $x \in [1,e]$. Then by Theorem 3.1.14,
$$\int_1^{e} \ln(4x)\,dx = (1 + \ln 4)\,e - \ln 4 - \int_1^{e} dx = (e - 1)\ln 4 + 1 .$$

Often, the method of partial integration can be used to derive a recursion relation for an integral containing a parameter. Such a case is considered in the following example. In particular, its result will lead to the subsequent Wallis' product representation of $\pi$.

Example 3.1.18. Calculate
$$I_n := \int_0^{\pi} \sin^n(x)\,dx$$
for $n \in \mathbb{N}^*$. Solution: For $n = 1, 2$, we conclude that
$$\int_0^{\pi} \sin(x)\,dx = \left[-\cos(x)\right]_0^{\pi} = 2 , \qquad \int_0^{\pi} \sin^2(x)\,dx = \frac{1}{2}\int_0^{\pi} \left[1 - \cos(2x)\right] dx = \frac{1}{2}\left[x - \tfrac{1}{2}\sin(2x)\right]_0^{\pi} = \frac{\pi}{2} .$$
For $n \ge 3$, we conclude by partial integration that
$$\begin{aligned}
I_n &= \int_0^{\pi} \sin^n(x)\,dx = \int_0^{\pi} \sin^{n-1}(x)\sin(x)\,dx \\
&= \left[\sin^{n-1}(x)\,(-\cos(x))\right]_0^{\pi} - \int_0^{\pi} (n-1)\sin^{n-2}(x)\cos(x)\,[-\cos(x)]\,dx \\
&= (n-1)\int_0^{\pi} \sin^{n-2}(x)\cos^2(x)\,dx = (n-1)\int_0^{\pi} \sin^{n-2}(x)\,[1 - \sin^2(x)]\,dx = (n-1)(I_{n-2} - I_n)
\end{aligned}$$
and hence that
$$I_n = \frac{n-1}{n}\,I_{n-2} .$$
Hence we conclude by induction that
$$I_{2k} = \pi\cdot\frac{1}{2}\cdot\frac{3}{4}\cdots\frac{2k-1}{2k} , \qquad I_{2k+1} = 2\cdot\frac{2}{3}\cdot\frac{4}{5}\cdots\frac{2k}{2k+1}$$
for all $k \in \mathbb{N} \setminus \{0,1\}$.
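The recursion $I_n = \frac{n-1}{n} I_{n-2}$ from Example 3.1.18 can be cross-checked numerically. The following is a minimal sketch, not part of the original text: the function names `wallis_integral` and `simpson` are illustrative choices, and the composite Simpson rule is used here only as an independent approximation of the integral, under the assumption that a fixed number of panels is accurate enough for these smooth integrands.

```python
import math


def wallis_integral(n: int) -> float:
    """Value of I_n (integral of sin^n over [0, pi]) via I_n = (n-1)/n * I_(n-2)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Starting values from the example: I_1 = 2 and I_2 = pi/2.
    value = 2.0 if n % 2 == 1 else math.pi / 2.0
    start = 3 if n % 2 == 1 else 4
    for m in range(start, n + 1, 2):
        value *= (m - 1) / m
    return value


def simpson(f, a: float, b: float, panels: int = 2000) -> float:
    """Composite Simpson rule on [a, b]; the panel count is rounded up to an even number."""
    if panels % 2:
        panels += 1
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for n in range(1, 9):
        by_recursion = wallis_integral(n)
        by_quadrature = simpson(lambda x, n=n: math.sin(x) ** n, 0.0, math.pi)
        print(n, by_recursion, by_quadrature)
```

Both columns of the printed table should agree to many digits, which is consistent with the closed forms for $I_{2k}$ and $I_{2k+1}$ obtained by induction.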
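The result from the previous example leads to John Wallis' product representation of $\pi$ which will be used in the subsequent derivation of Stirling's formula and in the calculation of Gaussian integrals.

Fig. 69: Sequences $a_1, a_2, \dots$ and $b_1, b_2, \dots$ from the proof of Wallis' product representation for $\pi$, Theorem 3.1.19, that converge to $\pi$ from below and above, respectively.

Theorem 3.1.19. (Wallis' product representation of $\pi$, 1656, [98])
$$\lim_{k \to \infty} 4(k+1)\left[\frac{2}{3}\cdot\frac{4}{5}\cdots\frac{2k}{2k+1}\right]^2 = \pi .$$

Proof. In this, we are using the notation from the previous example. Since $0 \le \sin(x) \le 1$ for all $x \in [0,\pi]$, it follows that
$$\sin^{n+1}(x) = \sin(x)\sin^n(x) \le \sin^n(x)$$
for all $x \in [0,\pi]$ and hence that
$$I_{n+1} = \int_0^{\pi} \sin^{n+1}(x)\,dx \le \int_0^{\pi} \sin^n(x)\,dx = I_n$$
for all $n \in \mathbb{N}^*$. As a consequence,
$$2\cdot\frac{2}{3}\cdots\frac{2k}{2k+1} = I_{2k+1} \le I_{2k} = \pi\cdot\frac{1}{2}\cdots\frac{2k-1}{2k} \le I_{2k-1} = 2\cdot\frac{2}{3}\cdots\frac{2(k-1)}{2k-1}$$
and
$$\begin{aligned}
a_k := (4k+2)\left[\frac{2}{3}\cdots\frac{2k}{2k+1}\right]^2 &= 2\cdot\left(\frac{2}{3}\cdots\frac{2k}{2k+1}\right)\cdot\left(\frac{2}{1}\cdot\frac{4}{3}\cdots\frac{2k}{2k-1}\right) \le \pi \\
&\le 2\cdot\left(\frac{2}{3}\cdots\frac{2(k-1)}{2k-1}\right)\cdot\left(\frac{2}{1}\cdot\frac{4}{3}\cdots\frac{2k}{2k-1}\right) = 4k\left[\frac{2}{3}\cdots\frac{2(k-1)}{2k-1}\right]^2 =: b_k
\end{aligned}$$
for $k \in \mathbb{N} \setminus \{0,1,2\}$. Further,
$$\begin{aligned}
\frac{a_{k+1}}{a_k} &= \frac{4k+6}{4k+2}\left(\frac{2(k+1)}{2k+3}\right)^2 = \frac{8(k+1)^2}{(4k+2)(2k+3)} = \frac{8k^2 + 16k + 8}{8k^2 + 16k + 6} > 1 , \\
\frac{b_{k+1}}{b_k} &= \frac{4(k+1)}{4k}\left(\frac{2k}{2k+1}\right)^2 = \frac{4k^2 + 4k}{4k^2 + 4k + 1} < 1 , \\
\frac{b_k}{a_k} &= \frac{4k}{4k+2}\left(\frac{2k+1}{2k}\right)^2 = \frac{2k+1}{2k} = 1 + \frac{1}{2k}
\end{aligned}$$
for all $k \in \mathbb{N} \setminus \{0,1,2\}$. Hence the sequences $a_3, a_4, \dots$ and $b_3, b_4, \dots$ are convergent, as an increasing sequence that is bounded from above by $\pi$ and a decreasing sequence that is bounded from below by $\pi$, respectively, and, since $b_k / a_k$ converges to $1$, they converge to the same limit $\pi$. Finally, the statement follows from
$$4(k+1)\left[\frac{2}{3}\cdots\frac{2k}{2k+1}\right]^2 = \frac{4k+4}{4k+2}\,a_k .$$

Essentially as an application of Wallis' product formula, we prove Stirling's asymptotic formula for factorials which is often used in applications.

Theorem 3.1.20. (Stirling's formula, 1730, [92])
$$\lim_{n \to \infty} \frac{n!}{\sqrt{n}\,(n/e)^n} = \sqrt{2\pi} . \tag{3.1.8}$$

Fig. 70: Graph of $((0,\infty) \to \mathbb{R},\ x \mapsto \Gamma(x+1) / [(x/e)^x \sqrt{2\pi x}\,])$. Note that $\Gamma(n+1) = n!$ for every $n \in \mathbb{N}$. See Theorem 3.1.20.

Proof. First, we notice that $\ln$ is concave since $\ln''(x) = -1/x^2 < 0$ for all $x > 0$.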
Hence it follows by Theorem 2.5.33 that
$$\int_x^{x+1} \ln(y)\,dy \ge \int_x^{x+1} \left[\ln(x) + (y - x)\ln\!\left(\frac{x+1}{x}\right)\right] dy = \ln(x) + \frac{1}{2}\ln\!\left(\frac{x+1}{x}\right) = \frac{1}{2}\left[\ln(x) + \ln(x+1)\right]$$
for all $x > 0$. In addition, it follows from the Definition 2.5.29 of the concavity of a differentiable function that
$$\ln(y) \le \ln(x) + \frac{y - x}{x} , \qquad \ln(y) \le \ln(x+1) + \frac{y - (x+1)}{x+1} ,$$
where $x > 0$ and $y > 0$, and hence that
$$\int_x^{x+1} \ln(y)\,dy \le \ln(x) + \frac{1}{2x} , \qquad \int_x^{x+1} \ln(y)\,dy \le \ln(x+1) - \frac{1}{2(x+1)}$$
and
$$\int_x^{x+1} \ln(y)\,dy - \frac{1}{2}\left[\ln(x) + \ln(x+1)\right] \le \frac{1}{2}\left[\ln(x) + \frac{1}{2x}\right] + \frac{1}{2}\left[\ln(x+1) - \frac{1}{2(x+1)}\right] - \frac{1}{2}\left[\ln(x) + \ln(x+1)\right] = \frac{1}{4}\left(\frac{1}{x} - \frac{1}{x+1}\right) .$$
Hence it follows that
$$0 \le \int_x^{x+1} \ln(y)\,dy - \frac{1}{2}\left[\ln(x) + \ln(x+1)\right] \le \frac{1}{4}\left(\frac{1}{x} - \frac{1}{x+1}\right)$$
for all $x > 0$ and hence that
$$0 \le \int_1^{n} \ln(y)\,dy - \frac{1}{2}\sum_{k=1}^{n-1} \left[\ln(k) + \ln(k+1)\right] \le \frac{1}{4}\sum_{k=1}^{n-1} \left(\frac{1}{k} - \frac{1}{k+1}\right) = \frac{1}{4}\left(1 - \frac{1}{n}\right) \le \frac{1}{4} .$$
Therefore, we conclude that the sequence $S_1, S_2, \dots$, where
$$\begin{aligned}
S_n &:= \int_1^{n} \ln(y)\,dy - \frac{1}{2}\sum_{k=1}^{n-1} \left[\ln(k) + \ln(k+1)\right] = \left[y\ln(y) - y\right]_1^{n} - \ln(n!) + \frac{\ln(n)}{2} \\
&= n\ln(n) - (n - 1) - \ln(n!) + \frac{\ln(n)}{2} = 1 - \ln\!\left(\frac{n!}{\sqrt{n}\,(n/e)^n}\right)
\end{aligned}$$
for every $n \in \mathbb{N}^*$, is increasing as well as bounded from above and therefore convergent to an element of the closed interval from $0$ to $1/4$. Hence it follows also the existence of
$$\lim_{n \to \infty} \frac{n!}{\sqrt{n}\,(n/e)^n}$$
which will be denoted by $a$ in the following. For the determination of its value, we use Wallis' product. According to Theorem 3.1.19
$$\begin{aligned}
\frac{\pi}{2} &= \lim_{k \to \infty} 2(k+1)\left[\frac{2}{3}\cdots\frac{2k}{2k+1}\right]^2 = \lim_{k \to \infty} \frac{2(k+1)}{2k+1}\cdot\frac{(2^k k!)^4}{(2k+1)\,[(2k)!]^2} = \lim_{k \to \infty} \frac{(2^k k!)^4}{(2k+1)\,[(2k)!]^2} \\
&= \lim_{k \to \infty} \frac{k}{2(2k+1)}\left[\frac{k!}{\sqrt{k}\,(k/e)^k}\right]^4 \left[\frac{\sqrt{2k}\,(2k/e)^{2k}}{(2k)!}\right]^2 = \frac{1}{4}\cdot\frac{a^4}{a^2} = \frac{a^2}{4} .
\end{aligned}$$
Hence it follows that $a = \sqrt{2\pi}$ and, finally, (3.1.8).

The example below gives another application of the method of partial integration to an integrand containing a parameter which leads to Euler's famous product representations of the sine and the cosine. These representations will be used later on in the proof of the reflection formula for the gamma function.
For the formulation of these representations, we need to introduce the product symbol.

Definition 3.1.21. (Product symbol) If $I$ is some non-empty finite index set and $a_i \in \mathbb{R}$ for every $i \in I$, the symbol
$$\prod_{i \in I} a_i$$
denotes the product of all $a_i$ where $i$ runs through the elements of $I$. Note that, as a consequence of the commutativity and associativity of multiplication, the order in which the products are performed is inessential.

Example 3.1.22. (Euler's product representation of the sine and cosine, 1748, [38]) Show that for every $x \in \mathbb{R}$
$$\sin\!\left(\frac{\pi x}{2}\right) = \frac{\pi x}{2}\lim_{n \to \infty} \prod_{k=1}^{n} \left(1 - \frac{x^2}{4k^2}\right) , \qquad \cos\!\left(\frac{\pi x}{2}\right) = \lim_{n \to \infty} \prod_{k=0}^{n} \left(1 - \frac{x^2}{(2k+1)^2}\right) . \tag{3.1.9}$$
Solution: For this, we define for every $n \in \mathbb{N}$ a corresponding $I_n : \mathbb{R} \to \mathbb{R}$ by
$$I_n(x) := \int_0^{\pi/2} \cos(xt)\cos^n(t)\,dt$$
for every $x \in \mathbb{R}$. In particular, this implies that
$$I_0(x) = \frac{\pi}{2}\cdot\begin{cases} 1 & \text{if } x = 0 , \\ \sin(\pi x/2)/(\pi x/2) & \text{if } x \ne 0 , \end{cases}$$
for $x \notin \{-1,1\}$
$$I_1(x) = \int_0^{\pi/2} \cos(xt)\cos(t)\,dt = \frac{1}{2}\int_0^{\pi/2} \left[\cos((x+1)t) + \cos((x-1)t)\right] dt = \frac{1}{2}\left[\frac{\sin((x+1)t)}{x+1} + \frac{\sin((x-1)t)}{x-1}\right]_0^{\pi/2} = \frac{\cos(\pi x/2)}{1 - x^2} ,$$
for $x \in \{-1,1\}$
$$I_1(x) = \int_0^{\pi/2} \cos^2(t)\,dt = \frac{1}{2}\int_0^{\pi/2} \left[1 + \cos(2t)\right] dt = \frac{1}{2}\left[t + \tfrac{1}{2}\sin(2t)\right]_0^{\pi/2} = \frac{\pi}{4}$$
and hence
$$I_1(x) = \begin{cases} \pi/4 & \text{if } x \in \{-1,1\} , \\ \cos(\pi x/2)/(1 - x^2) & \text{if } x \notin \{-1,1\} . \end{cases}$$
In the following, let $x \in \mathbb{R}$. For $n \in \mathbb{N} \setminus \{0,1\}$, we conclude by partial integration that
$$\begin{aligned}
x^2 I_n(x) &= x\int_0^{\pi/2} \cos^n(t)\,x\cos(xt)\,dt = \left[x\cos^n(t)\sin(xt)\right]_0^{\pi/2} + n\int_0^{\pi/2} \sin(t)\cos^{n-1}(t)\,x\sin(xt)\,dt \\
&= \left[-n\sin(t)\cos^{n-1}(t)\cos(xt)\right]_0^{\pi/2} + n\int_0^{\pi/2} \left[\cos^n(t) - (n-1)\sin^2(t)\cos^{n-2}(t)\right]\cos(xt)\,dt \\
&= n\int_0^{\pi/2} \left[n\cos^n(t) - (n-1)\cos^{n-2}(t)\right]\cos(xt)\,dt = n^2 I_n(x) - n(n-1)\,I_{n-2}(x) .
\end{aligned}$$
Therefore, it follows that
$$I_{n-2}(x) = \frac{n^2 - x^2}{n(n-1)}\,I_n(x)$$
and hence that
$$\frac{I_{n-2}(x)}{I_{n-2}(0)} = \left(1 - \frac{x^2}{n^2}\right)\frac{I_n(x)}{I_n(0)} .$$
From this, it follows by induction that
$$\frac{I_0(x)}{I_0(0)} = \frac{I_{2n}(x)}{I_{2n}(0)}\prod_{k=1}^{n} \left(1 - \frac{x^2}{(2k)^2}\right) , \qquad \frac{I_1(x)}{I_1(0)} = \frac{I_{2n+1}(x)}{I_{2n+1}(0)}\prod_{k=1}^{n} \left(1 - \frac{x^2}{(2k+1)^2}\right)$$
for every $n \in \mathbb{N}^*$. In the following, we show that
$$\lim_{n \to \infty} \frac{I_n(x)}{I_n(0)} = 1 . \tag{3.1.10}$$
For this, we note that
$$|\cos(xt) - 1| = |\cos(|x|t) - 1| = \left|\int_0^{|x|t} \sin(s)\,ds\right| \le |x|\,t$$
for $t \ge 0$. Hence it follows for every $n \in \mathbb{N}^*$ that
$$|I_n(x) - I_n(0)| = \left|\int_0^{\pi/2} \left[\cos(xt) - 1\right]\cos^n(t)\,dt\right| \le |x|\int_0^{\pi/2} t\cos(t)\cos^{n-1}(t)\,dt \le |x|\int_0^{\pi/2} \sin(t)\cos^{n-1}(t)\,dt = \frac{|x|}{n}$$
where it has been used that $t\cos(t) \le \sin(t)$ for $0 \le t \le \pi/2$. Hence it follows (3.1.10) and, finally, (3.1.9). For this, note that the second relation in (3.1.9) is trivially satisfied for $x \in \{-1,1\}$.

The following application of the method of partial integration to an integrand containing a parameter leads to a recursion formula that will be used in the method of integration of rational expressions by decomposition into partial fractions displayed in the next section.

Example 3.1.23. Let $m$ be some natural number $\ge 1$, $a \in \mathbb{R}$ and $c > 0$. Define $F, G, f, g : \mathbb{R} \to \mathbb{R}$ by
$$F(y) := (y^2 + c^2)^{-m} , \quad g(y) := 1 , \quad f(y) := -2my\,(y^2 + c^2)^{-(m+1)} , \quad G(y) := y$$
for all $y \in \mathbb{R}$. Then by Theorem 3.1.14 for every $x > a$
$$\begin{aligned}
\int_a^x \frac{dy}{(y^2 + c^2)^m} &= \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m\int_a^x \frac{y^2\,dy}{(y^2 + c^2)^{m+1}} \\
&= \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m\int_a^x \frac{dy}{(y^2 + c^2)^m} - 2mc^2\int_a^x \frac{dy}{(y^2 + c^2)^{m+1}}
\end{aligned}$$
and hence it follows the recursion (or 'reduction') formula
$$\int_a^x \frac{dy}{(y^2 + c^2)^{m+1}} = \frac{1}{2mc^2}\left[\frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m}\right] + \frac{2m-1}{2mc^2}\int_a^x \frac{dy}{(y^2 + c^2)^m} ,$$
which is used in the method of integration by decomposition into partial fractions below.

The following final example gives another typical application of the method of partial integration. Also in this, the integrand contains a parameter. The method is used to derive an estimate for a special function, a Bessel function, defined in terms of an integral. It is a remarkable fact that estimates even of elementary functions are often easier to achieve by help of integral representations.

Example 3.1.24. Show that
$$|J_n(x)| \le \frac{2}{\pi}\cdot\frac{x}{n^2 - x^2}$$
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$ such that $0 \le x < n$. Solution: Define
$$F(\theta) := \frac{1}{x\cos\theta - n} , \quad g(\theta) := (x\cos\theta - n)\cos(x\sin\theta - n\theta) , \quad f(\theta) := \frac{x\sin\theta}{(x\cos\theta - n)^2} , \quad G(\theta) := \sin(x\sin\theta - n\theta)$$
for all $\theta \in [0,\pi]$.
Then by Theorem 3.1.14,
$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(x\sin\theta - n\theta)\,d\theta = -\frac{1}{\pi}\int_0^{\pi} \frac{x\sin\theta}{(x\cos\theta - n)^2}\,\sin(x\sin\theta - n\theta)\,d\theta ,$$
and hence
$$|J_n(x)| \le \frac{1}{\pi}\int_0^{\pi} \frac{x\sin\theta}{(x\cos\theta - n)^2}\,d\theta = \frac{2}{\pi}\cdot\frac{x}{n^2 - x^2} .$$

Problems

1) Calculate the value of the integral. In this, where applicable, $n \in \mathbb{N}^*$. »3 4t e5t dt , b) 2 a) 0 » π{2 0 ϕ r sinp2ϕq 3 cosp7ϕq s dϕ , »π c) 0 »1 e) 0 »π g) 0 »2 eϕ cosp2ϕq dϕ , d) x2 arctanp3xq dx x sinpnxq dx , » 1{?2 1 f) »3 , lnp2xq dx x2 h) 1 0 , lnp2x2 xn lnpxq dx 1q dx , .

2) Derive a reduction formula where the integral is expressed in terms of the same integral with a smaller $n$. In this, $n \in \mathbb{N}^*$, $a \in \mathbb{R}$, $x \ge a$ and, where applicable, $m \in \mathbb{N}$, $b, c \in \mathbb{R}^*$. »x a) »ax y n eby dy c) »ax e) »ax g) a »x sinn py q dy , b) »x , d) a y n cospby q dy , a cosn py q dy y n sinpby q dy »x f) a ecy sinpby q dy , , , y m rlnpy qsn dy »x h) a , ecy cospby q dy

3) Let $I$ be some non-empty open interval of $\mathbb{R}$, $h : I \to \mathbb{R}$ a map and $a, b \in I$ be such that $a < b$.

a) If $h$ is twice differentiable on $I$ and such that $h(a) = h(b) = 0$, show that
$$\int_a^b h(x)\,dx = \frac{1}{2}\int_a^b (x - b)(x - a)\,h''(x)\,dx .$$

b) If $h$ is four times differentiable on $I$ and such that $h(a) = h(b) = h'(a) = h'(b) = 0$, show that
$$\int_a^b h(x)\,dx = \frac{1}{24}\int_a^b (x - b)^2 (x - a)^2\,h^{(iv)}(x)\,dx .$$

[Remark: Note that if $h = f - p$ where $f : I \to \mathbb{R}$ is twice and four times differentiable, respectively, and $p : I \to \mathbb{R}$ is a polynomial function of the order $1$, $3$, respectively, then $h'' = f''$, $h^{(iv)} = f^{(iv)}$, respectively. In connection with the above formulas, this fact is used in the estimation of the errors for the Trapezoid Rule / Simpson Rule for the numerical approximation of integrals. See Section 3.1.4.]

4) Let $a, b \in \mathbb{R}$ be such that $a < b$ and $f, g : [a,b] \to \mathbb{R}$ be restrictions to $[a,b]$ of twice continuously differentiable functions defined on open intervals of $\mathbb{R}$ containing $[a,b]$. In addition, let $f(a) = f(b) = 0$ and $g(a) = g(b) = 0$.

a) Show that
$$\int_a^b g(x)\,f''(x)\,dx = \int_a^b g''(x)\,f(x)\,dx .$$
b) In addition, assume that $f$ and $g$ solve the differential equations
$$-f''(x) + U(x)\,f(x) = \lambda\,f(x) , \qquad -g''(x) + U(x)\,g(x) = \mu\,g(x)$$
where $U : [a,b] \to \mathbb{R}$ is continuous and $\lambda, \mu \in \mathbb{R}$ are such that $\lambda \ne \mu$. Show that
$$\int_a^b f(x)\,g(x)\,dx = 0 .$$

3.1.3 Partial Fractions

The method of integration of rational expressions by decomposition into partial fractions is suggested by the following simple observation. For this, let $a_1, a_2, A_1, A_2 \in \mathbb{R}$. Then
$$\frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} = \frac{A_1 (x - a_2) + A_2 (x - a_1)}{(x - a_1)(x - a_2)} = \frac{(A_1 + A_2)\,x - (A_1 a_2 + A_2 a_1)}{x^2 - (a_1 + a_2)\,x + a_1 a_2}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. Note that for the left hand side of the last equation, as a function of $x$, there is an antiderivative which is given by
$$A_1 \ln(|x - a_1|) + A_2 \ln(|x - a_2|)$$
for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$. On the other hand, for a given quotient $p/q$ of polynomials $p$ of first order and $q$ of second order an antiderivative is usually not obvious. Here we exclude the case that the quotient can be reduced to the quotient of a zero order polynomial and a first order polynomial. Also, we assume that the coefficient of the leading order of $q$ is equal to $1$ which can always be achieved by appropriate definition of $p$ and $q$. Therefore, for the purpose of integration, it is natural to try to represent such a quotient $p/q$ in the form
$$\frac{p(x)}{q(x)} = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} \tag{3.1.11}$$
for all $x \in \mathbb{R} \setminus [\,\{a_1, a_2\} \cup q^{-1}(\{0\})\,]$ and for some $a_1, a_2 \in \mathbb{R}$, $A_1, A_2 \in \mathbb{R} \setminus \{0\}$ such that $a_1 \ne a_2$. In this, we notice that the vanishing of one of the coefficients $A_1, A_2$ or $a_1 = a_2$ would lead to the excluded case that the quotient can be reduced to a quotient of a zero order polynomial and a first order polynomial. In the following, we will determine $A_1$, $A_2$, $a_1$ and $a_2$. We immediately note from the singular behavior of the right hand side of equation (3.1.11) near $a_1$ and $a_2$ that $q$ needs to vanish in the points $a_1$ and $a_2$. This can also be shown as follows. The equation (3.1.11) implies that
$$\left[A_1 (x - a_2) + A_2 (x - a_1)\right] q(x) = p(x)\,(x - a_1)(x - a_2)$$
for all $x \in \mathbb{R} \setminus [\,\{a_1, a_2\} \cup q^{-1}(\{0\})\,]$.
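Hence
$$\begin{aligned}
A_1 (a_1 - a_2)\,q(a_1) &= \lim_{x \to a_1} \left[A_1 (x - a_2) + A_2 (x - a_1)\right] q(x) = \lim_{x \to a_1} p(x)\,(x - a_1)(x - a_2) = 0 , \\
A_2 (a_2 - a_1)\,q(a_2) &= \lim_{x \to a_2} \left[A_1 (x - a_2) + A_2 (x - a_1)\right] q(x) = \lim_{x \to a_2} p(x)\,(x - a_1)(x - a_2) = 0 .
\end{aligned}$$
Since $A_1 \ne 0$, $A_2 \ne 0$ and $a_1 \ne a_2$, this implies that
$$q(a_1) = q(a_2) = 0 .$$
Hence $q$ has the two different zeros $a_1, a_2$ and $q(x) = (x - a_1)(x - a_2)$ for all $x \in \mathbb{R}$. Then (3.1.11) implies that
$$p(x) = A_1 (x - a_2) + A_2 (x - a_1)$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ and therefore that
$$p(a_1) = \lim_{x \to a_1} p(x) = A_1 (a_1 - a_2) , \qquad p(a_2) = \lim_{x \to a_2} p(x) = A_2 (a_2 - a_1) .$$
The last system gives
$$A_1 = \frac{p(a_1)}{a_1 - a_2} , \qquad A_2 = \frac{p(a_2)}{a_2 - a_1} .$$
Indeed, if $p$ is a polynomial of first order and $a_1, a_2 \in \mathbb{R}$ are such that $a_1 \ne a_2$, then
$$\frac{p(a_1)}{a_1 - a_2}\cdot\frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1}\cdot\frac{1}{x - a_2} = \frac{1}{a_2 - a_1}\cdot\frac{-p(a_1)(x - a_2) + p(a_2)(x - a_1)}{(x - a_1)(x - a_2)}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. In addition,
$$\frac{-p(a_1)(a_1 - a_2) + p(a_2)(a_1 - a_1)}{a_2 - a_1} = p(a_1) , \qquad \frac{-p(a_1)(a_2 - a_2) + p(a_2)(a_2 - a_1)}{a_2 - a_1} = p(a_2) .$$
Hence
$$\frac{-p(a_1)(x - a_2) + p(a_2)(x - a_1)}{a_2 - a_1} = p(x)$$
for all $x \in \mathbb{R}$ and
$$\frac{p(x)}{(x - a_1)(x - a_2)} = \frac{p(a_1)}{a_1 - a_2}\cdot\frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1}\cdot\frac{1}{x - a_2}$$
for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ gives a decomposition as required. In particular, an antiderivative of
$$\left(\mathbb{R} \setminus \{a_1, a_2\} \to \mathbb{R},\ x \mapsto \frac{p(x)}{(x - a_1)(x - a_2)}\right)$$
is given by
$$\frac{p(a_1)}{a_1 - a_2}\,\ln(|x - a_1|) + \frac{p(a_2)}{a_2 - a_1}\,\ln(|x - a_2|)$$
for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$.

As noticed above, a decomposition of the type (3.1.11) is impossible if the polynomial $q$ has a double zero or no real zero. For this reason, we try to find a similar decomposition also for these cases. If $q$ has a double zero $a \in \mathbb{R}$, then $p/q$ is given by
$$\frac{p(x)}{(x - a)^2}$$
for all $x \in \mathbb{R} \setminus \{a\}$. Then
$$\frac{p(x)}{(x - a)^2} = \frac{p(a) + p'(a)(x - a)}{(x - a)^2} = \frac{p'(a)}{x - a} + \frac{p(a)}{(x - a)^2}$$
for all $x \in \mathbb{R} \setminus \{a\}$. Hence an antiderivative of
$$\left(\mathbb{R} \setminus \{a\} \to \mathbb{R},\ x \mapsto \frac{p(x)}{(x - a)^2}\right)$$
is given by
$$p'(a)\,\ln(|x - a|) - \frac{p(a)}{x - a}$$
for every $x \in \mathbb{R} \setminus \{a\}$. Finally, if $q$ has no real zero, then
$$q(x) = x^2 + cx + d = \left(x + \frac{c}{2}\right)^2 + d - \frac{c^2}{4}$$
for all $x \in \mathbb{R}$ where $c, d \in \mathbb{R}$ are such that
$$d > \frac{c^2}{4} .$$
Further, $p$ is given by $p(x) = ax + b$ for all $x \in \mathbb{R}$ and some $a, b \in \mathbb{R}$.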
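Then
\[ \frac{p(x)}{q(x)} = \frac{a x + b}{x^2 + c x + d} = \frac{a x + \frac{ac}{2} + b - \frac{ac}{2}}{x^2 + c x + d} = \frac{1}{2} \, \frac{2 a x + a c}{x^2 + c x + d} + \left( b - \frac{ac}{2} \right) \frac{1}{\left( x + \frac{c}{2} \right)^2 + d - \frac{c^2}{4}} \]
\[ = \frac{a}{2} \, \frac{2 x + c}{x^2 + c x + d} + \frac{b - \frac{ac}{2}}{d - \frac{c^2}{4}} \; \frac{1}{1 + \left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)^2} \]
for all $x \in \mathbb{R}$. The first summand on the right hand side of the last equation, as a function of $x$, has an antiderivative given by
\[ \frac{a}{2} \ln(x^2 + c x + d) \]
for all $x \in \mathbb{R}$. Hence it remains to find an antiderivative for the second summand. Since we know from Calculus I that
\[ \arctan'(x) = \frac{1}{1 + x^2} \]
for all $x \in \mathbb{R}$, such is given by
\[ \frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}} \, \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right) \]
for all $x \in \mathbb{R}$. Note that in the last step, we could also have employed change of variables, but the procedure here is more direct. Hence in the case that
\[ d > \frac{c^2}{4} , \]
an antiderivative of $p/q$, given by
\[ \frac{a x + b}{x^2 + c x + d} \]
for every $x \in \mathbb{R}$, is given by
\[ \frac{a}{2} \ln(x^2 + c x + d) + \frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}} \, \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right) \]
for all $x \in \mathbb{R}$.

The previous analysis can be generalized to quotients of the form $p/q$ where $p, q$ are polynomials of order $m$ and $n$, respectively, such that $m < n$. The result is given below without proof. The proof can be found in texts on function theory, that is, the theory of functions of one complex variable. For readers who already know complex numbers, we just indicate how their introduction might be helpful in this respect. For this, we consider the case that $p(x) = 1$ and $q(x) = x^2 + 1$ for all $x \in \mathbb{R}$. The polynomial $q$ has no real zero, but if we extend $q$ to the complex plane by
\[ \bar{q}(z) := z^2 + 1 \]
for every complex number $z$, then $\bar{q}$ has the roots $i$ and $-i$, where $i$ denotes the imaginary unit, since
\[ \bar{q}(i) = i^2 + 1 = -1 + 1 = 0 , \qquad \bar{q}(-i) = (-i)^2 + 1 = -1 + 1 = 0 . \]
In particular,
\[ \frac{1}{\bar{q}(z)} = \frac{i}{2} \left( \frac{1}{z + i} - \frac{1}{z - i} \right) \]
for every complex $z$ different from $i$ and $-i$. As reflected in this example, the introduction of complex numbers allows in every case the decomposition of the extension of $p/q$ to complex numbers into sums of multiples of functions that assume the values
\[ \frac{1}{z - a} , \; \dots , \; \frac{1}{(z - a)^{\mu(a)}} \]
in every complex $z$ not among the zeros of that extension of $q$, where $a$ runs through the zeros of that extension and for every such $a$ the symbol $\mu(a)$ denotes the corresponding multiplicity. This fact simplifies the discussion significantly.

Lemma 3.1.25. Let $p, q : \mathbb{R} \to \mathbb{R}$ be polynomials of degree $m, n \in \mathbb{N}$, respectively, where $m < n$. Finally, let $a_1, \dots, a_r$ be the (possibly empty) sequence of pairwise different real roots of $q$, where $r \in \mathbb{N}$, and let $m_1, \dots, m_r$ be the sequence of the corresponding multiplicities.

(i) There are $s \in \mathbb{N}$ along with (possibly empty and apart from reordering unique) sequences $(b_{r+1}, c_{r+1}), \dots, (b_{r+s}, c_{r+s})$ of pairwise different elements of $\mathbb{R} \times (0, \infty)$ and $m_{r+1}, \dots, m_{r+s}$ of positive integers such that
\[ q(x) = q_n \, (x - a_1)^{m_1} \cdots (x - a_r)^{m_r} \, [(x - b_{r+1})^2 + c_{r+1}]^{m_{r+1}} \cdots [(x - b_{r+s})^2 + c_{r+s}]^{m_{r+s}} \]
for all $x \in \mathbb{R}$ where $q_n$ is the coefficient of the $n$th order of $q$.

(ii) There are unique sequences of real numbers $A_{11}, \dots, A_{1 m_1}, \dots, A_{r1}, \dots, A_{r m_r}$ and pairs of real numbers $(B_{r+1,1}, C_{r+1,1}), \dots, (B_{r+1,m_{r+1}}, C_{r+1,m_{r+1}}), \dots, (B_{r+s,1}, C_{r+s,1}), \dots, (B_{r+s,m_{r+s}}, C_{r+s,m_{r+s}})$, respectively, such that
\[ \frac{p(x)}{q(x)} = \frac{A_{11}}{x - a_1} + \dots + \frac{A_{1 m_1}}{(x - a_1)^{m_1}} + \dots + \frac{A_{r1}}{x - a_r} + \dots + \frac{A_{r m_r}}{(x - a_r)^{m_r}} \]
\[ + \frac{B_{r+1,1} \, x + C_{r+1,1}}{(x - b_{r+1})^2 + c_{r+1}} + \dots + \frac{B_{r+1,m_{r+1}} \, x + C_{r+1,m_{r+1}}}{[(x - b_{r+1})^2 + c_{r+1}]^{m_{r+1}}} + \dots \]
\[ + \frac{B_{r+s,1} \, x + C_{r+s,1}}{(x - b_{r+s})^2 + c_{r+s}} + \dots + \frac{B_{r+s,m_{r+s}} \, x + C_{r+s,m_{r+s}}}{[(x - b_{r+s})^2 + c_{r+s}]^{m_{r+s}}} \]
for all $x \in \mathbb{R} \setminus \{a_1, \dots, a_r\}$.

Proof. See Function Theory.

Corollary 3.1.26. Let $p$, $q$, $m$, $n$; $a_1, \dots, a_r$, $m_1, \dots, m_r$, $(b_{r+1}, c_{r+1}), \dots, (b_{r+s}, c_{r+s})$, $m_{r+1}, \dots, m_{r+s}$, $A_{11}, \dots, A_{1 m_1}, \dots, A_{r1}, \dots, A_{r m_r}$ and $(B_{r+1,1}, C_{r+1,1}), \dots, (B_{r+s,m_{r+s}}, C_{r+s,m_{r+s}})$ be as in Lemma 3.1.25. Then by
\[ F(x) := A_{11} \ln(|x - a_1|) + \dots + \frac{A_{1 m_1}}{1 - m_1} \, \frac{1}{(x - a_1)^{m_1 - 1}} + \dots + A_{r1} \ln(|x - a_r|) + \dots + \frac{A_{r m_r}}{1 - m_r} \, \frac{1}{(x - a_r)^{m_r - 1}} \]
\[ + \frac{B_{r+1,1}}{2} \ln[(x - b_{r+1})^2 + c_{r+1}] + \frac{b_{r+1} B_{r+1,1} + C_{r+1,1}}{\sqrt{c_{r+1}}} \arctan\left( \frac{x - b_{r+1}}{\sqrt{c_{r+1}}} \right) + \dots \]
\[ + \frac{B_{r+1,m_{r+1}}}{2 (1 - m_{r+1})} \, \frac{1}{[(x - b_{r+1})^2 + c_{r+1}]^{m_{r+1} - 1}} + (b_{r+1} B_{r+1,m_{r+1}} + C_{r+1,m_{r+1}}) \, F_{r+1}(x) + \dots \]
\[ + \frac{B_{r+s,1}}{2} \ln[(x - b_{r+s})^2 + c_{r+s}] + \frac{b_{r+s} B_{r+s,1} + C_{r+s,1}}{\sqrt{c_{r+s}}} \arctan\left( \frac{x - b_{r+s}}{\sqrt{c_{r+s}}} \right) + \dots \]
\[ + \frac{B_{r+s,m_{r+s}}}{2 (1 - m_{r+s})} \, \frac{1}{[(x - b_{r+s})^2 + c_{r+s}]^{m_{r+s} - 1}} + (b_{r+s} B_{r+s,m_{r+s}} + C_{r+s,m_{r+s}}) \, F_{r+s}(x) \]
for all $x \in \mathbb{R} \setminus \{a_1, \dots, a_r\}$, there is defined an anti-derivative $F$ of $p/q$. In this, terms containing a factor of the form $1/(1 - m_j)$ occur only for multiplicities $m_j \geq 2$. Here $F_{r+1}, \dots, F_{r+s} : \mathbb{R} \to \mathbb{R}$ denote anti-derivatives satisfying
\[ F_{r+l}'(x) = \frac{1}{[(x - b_{r+l})^2 + c_{r+l}]^{m_{r+l}}} \]
for all $x \in \mathbb{R}$ and $l \in \{1, \dots, s\}$. Note that such functions can be calculated by the recursion formula from Example 3.1.23.

In the following, we give five examples of typical applications of the previous lemma and its corollary. The fifth example gives such an application to the solution of a ('separable') first order differential equation.

Example 3.1.27. Calculate
\[ \int_0^2 \frac{4}{x^2 - 9} \, dx . \]
Solution:
\[ \int_0^2 \frac{4}{x^2 - 9} \, dx = \int_0^2 \frac{4}{(x - 3)(x + 3)} \, dx = \frac{2}{3} \int_0^2 \left( \frac{1}{x - 3} - \frac{1}{x + 3} \right) dx \]
\[ = \frac{2}{3} \, ( \ln(|2 - 3|) - \ln(|2 + 3|) ) - \frac{2}{3} \, ( \ln(|-3|) - \ln(|3|) ) = - \frac{2}{3} \ln(5) , \]
where it has been used that for every function $f$
\[ \frac{1}{(f(x) + a)(f(x) + b)} = \frac{1}{b - a} \left[ \frac{1}{f(x) + a} - \frac{1}{f(x) + b} \right] , \qquad (3.1.12) \]
where $a, b \in \mathbb{R}$ are such that $a \neq b$ and $x \in D(f)$ is such that $f(x) \notin \{-a, -b\}$. The previous identity is also of use in applications of the method of integration by partial fractions to more complicated situations.

Example 3.1.28. Calculate
\[ \int_0^3 \frac{3x + 4}{x^2 + 2x + 2} \, dx . \]
Solution:
\[ \int_0^3 \frac{3x + 4}{x^2 + 2x + 2} \, dx = \frac{3}{2} \int_0^3 \frac{2x + 2}{x^2 + 2x + 2} \, dx + \int_0^3 \frac{dx}{(x + 1)^2 + 1} \]
\[ = \frac{3}{2} \ln(3^2 + 2 \cdot 3 + 2) + \arctan(3 + 1) - \frac{3}{2} \ln(2) - \arctan(1) = \frac{3}{2} \ln\left( \frac{17}{2} \right) + \arctan(4) - \frac{\pi}{4} . \]

Example 3.1.29. Calculate
\[ \int_1^2 \frac{1}{x^2 (x^2 + 1)^2} \, dx . \]
Solution: Since the integrand is a restriction of the composition of the maps $( \mathbb{R} \setminus \{0, -1\} \to \mathbb{R}, x \mapsto 1/[x (x + 1)^2] )$ and $( \mathbb{R} \to \mathbb{R}, x \mapsto x^2 )$, by Lemma 3.1.25 there are $A, B, C \in \mathbb{R}$ such that
\[ \frac{1}{x^2 (x^2 + 1)^2} = \frac{A}{x^2} + \frac{B}{x^2 + 1} + \frac{C}{(x^2 + 1)^2} \qquad (3.1.13) \]
for all $x \neq 0$. Hence for all $x \in \mathbb{R}$
\[ 1 = A (x^2 + 1)^2 + B x^2 (x^2 + 1) + C x^2 = (A + B) x^4 + (2A + B + C) x^2 + A \]
and hence $A = 1$, $B = -1$ and $C = -1$.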
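As a quick check of the coefficients just found, the following sketch compares both sides of (3.1.13) in exact rational arithmetic. The function names `lhs` and `rhs` and the sample points are illustrative choices, not part of the text. With $t = x^2$, the identity $1 = A(t+1)^2 + B t (t+1) + C t$ is between polynomials of degree at most two in $t$, so agreement at three different values of $t$ already confirms it.

```python
from fractions import Fraction

def lhs(x):
    # Integrand of Example 3.1.29.
    return 1 / (x**2 * (x**2 + 1)**2)

def rhs(x, A=1, B=-1, C=-1):
    # Right hand side of (3.1.13) with the coefficients found above.
    return A / x**2 + B / (x**2 + 1) + C / (x**2 + 1)**2

# Sample points with pairwise different squares; exact arithmetic
# avoids any floating point tolerance.
samples = [Fraction(1, 3), Fraction(1), Fraction(-5, 2), Fraction(7)]
checks = [lhs(x) == rhs(x) for x in samples]
print(checks)
```

Changing any of the three coefficients in `rhs` makes at least one of the comparisons fail.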
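Hence it follows by the recursion formula from Example 3.1.23 that
\[ \int_1^2 \frac{dx}{x^2 (x^2 + 1)^2} = \int_1^2 \frac{dx}{x^2} - \int_1^2 \frac{dx}{x^2 + 1} - \int_1^2 \frac{dx}{(x^2 + 1)^2} = \frac{1}{2} - \arctan(2) + \frac{\pi}{4} - \int_1^2 \frac{dx}{(x^2 + 1)^2} \]
\[ = \frac{1}{2} - \arctan(2) + \frac{\pi}{4} - \left[ \frac{x}{2 (x^2 + 1)} \right]_1^2 - \frac{1}{2} \int_1^2 \frac{dx}{x^2 + 1} = \frac{1}{2} - \arctan(2) + \frac{\pi}{4} - \left( \frac{1}{5} - \frac{1}{4} \right) - \frac{1}{2} \left( \arctan(2) - \frac{\pi}{4} \right) \]
\[ = \frac{11}{20} - \frac{3}{2} \left( \arctan(2) - \frac{\pi}{4} \right) . \]
Another way of arriving at the decomposition (3.1.13) is by help of the identity (3.1.12), which leads to
\[ \frac{1}{x^2 (x^2 + 1)^2} = \frac{1}{x^2 (x^2 + 1)} \, \frac{1}{x^2 + 1} = \left( \frac{1}{x^2} - \frac{1}{x^2 + 1} \right) \frac{1}{x^2 + 1} = \frac{1}{x^2 (x^2 + 1)} - \frac{1}{(x^2 + 1)^2} = \frac{1}{x^2} - \frac{1}{x^2 + 1} - \frac{1}{(x^2 + 1)^2} \]
for all $x \in \mathbb{R} \setminus \{0\}$.

Example 3.1.30. Calculate
\[ \int_a^x \frac{dy}{1 + y^4} \]
where $a \in \mathbb{R}$ and $x \geq a$.

Fig. 71: Graph of the antiderivative $F$ of $f(x) := 1/(1 + x^4)$, $x \in \mathbb{R}$, satisfying $F(0) = 0$. Compare Example 3.1.30.

Solution: Since $x^4 + 1 > 0$ for all $x \in \mathbb{R}$, according to Lemma 3.1.25 there are $b, c, d, e \in \mathbb{R}$ such that
\[ y^4 + 1 = (y^2 + b y + c)(y^2 + d y + e) \qquad (3.1.14) \]
\[ = y^4 + d y^3 + e y^2 + b y^3 + b d y^2 + b e y + c y^2 + c d y + c e = y^4 + (b + d) y^3 + (c + e + b d) y^2 + (b e + c d) y + c e \]
for all $y \in \mathbb{R}$. This equation is satisfied if and only if
\[ b + d = 0 , \quad c + e + b d = 0 , \quad b e + c d = 0 , \quad c e = 1 . \]
From the first equation, we conclude that $d = -b$, which leads to the equivalent reduced system
\[ d = -b , \quad e + c = b^2 , \quad b (e - c) = 0 , \quad c e = 1 . \]
The assumption that $b = 0$ leads to $e = -c$ and $1 = c e = -c^2$, which is impossible. Hence it follows that $b \neq 0$. Therefore, the third equation of the last system leads to the equivalent reduced system
\[ d = -b , \quad b^2 = 2c , \quad e = c , \quad c^2 = 1 \]
which has the solution $c = e = 1$ and $b = \sqrt{2}$, $d = -\sqrt{2}$. (The other remaining solution $c = e = 1$ and $b = -\sqrt{2}$, $d = \sqrt{2}$ results in a reordering of the factors in (3.1.14).) Hence it follows that
\[ y^4 + 1 = (y^2 + \sqrt{2} \, y + 1)(y^2 - \sqrt{2} \, y + 1) \]
for all $y \in \mathbb{R}$. Note that the last could also have been derived more simply as follows:
\[ 1 + y^4 = 1 + 2 y^2 + y^4 - 2 y^2 = (y^2 + 1)^2 - (\sqrt{2} \, y)^2 = (y^2 + \sqrt{2} \, y + 1)(y^2 - \sqrt{2} \, y + 1) , \]
valid for all $y \in \mathbb{R}$. Further, according to Lemma 3.1.25 (ii) there are uniquely determined $A, B, C, D \in \mathbb{R}$ such that
\[ \frac{1}{1 + y^4} = \frac{A y + B}{y^2 + \sqrt{2} \, y + 1} + \frac{C y + D}{y^2 - \sqrt{2} \, y + 1} \qquad (3.1.15) \]
for all $y \in \mathbb{R}$. In particular, this implies that
\[ \frac{1}{1 + y^4} = \frac{1}{1 + (-y)^4} = \frac{- A y + B}{y^2 - \sqrt{2} \, y + 1} + \frac{- C y + D}{y^2 + \sqrt{2} \, y + 1} \]
for all $y \in \mathbb{R}$.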
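Since $A$, $B$, $C$ and $D$ are uniquely determined by the equations (3.1.15) for every $y \in \mathbb{R}$, it follows that $C = -A$ and $D = B$. Hence we conclude that there are uniquely determined $A, B \in \mathbb{R}$ such that
\[ \frac{1}{1 + y^4} = \frac{A y + B}{y^2 + \sqrt{2} \, y + 1} - \frac{A y - B}{y^2 - \sqrt{2} \, y + 1} \]
for all $y \in \mathbb{R}$. In particular, for $y = 0$,
\[ 1 = \frac{1}{1 + 0^4} = 2B \]
and hence $B = 1/2$. Also, for $y = 1$,
\[ \frac{1}{2} = \frac{1}{1 + 1^4} = \frac{A + \frac{1}{2}}{2 + \sqrt{2}} - \frac{A - \frac{1}{2}}{2 - \sqrt{2}} = \frac{\left( A + \frac{1}{2} \right)(2 - \sqrt{2}) - \left( A - \frac{1}{2} \right)(2 + \sqrt{2})}{2} = 1 - \sqrt{2} \, A \]
and hence $A = \sqrt{2}/4$. We conclude that
\[ \frac{1}{1 + y^4} = \frac{1}{4} \left[ \frac{\sqrt{2} \, y + 2}{y^2 + \sqrt{2} \, y + 1} - \frac{\sqrt{2} \, y - 2}{y^2 - \sqrt{2} \, y + 1} \right] \]
\[ = \frac{\sqrt{2}}{8} \left[ \frac{2 y + \sqrt{2}}{y^2 + \sqrt{2} \, y + 1} - \frac{2 y - \sqrt{2}}{y^2 - \sqrt{2} \, y + 1} \right] + \frac{\sqrt{2}}{4} \left[ \frac{\sqrt{2}}{(\sqrt{2} \, y + 1)^2 + 1} + \frac{\sqrt{2}}{(\sqrt{2} \, y - 1)^2 + 1} \right] \]
for all $y \in \mathbb{R}$. Hence it follows that
\[ \int_a^x \frac{dy}{1 + y^4} = \frac{\sqrt{2}}{8} \left[ \ln\left( \frac{x^2 + \sqrt{2} \, x + 1}{x^2 - \sqrt{2} \, x + 1} \right) - \ln\left( \frac{a^2 + \sqrt{2} \, a + 1}{a^2 - \sqrt{2} \, a + 1} \right) \right] \]
\[ + \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2} \, x + 1) - \arctan(\sqrt{2} \, a + 1) \right] + \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2} \, x - 1) - \arctan(\sqrt{2} \, a - 1) \right] . \]

Remark 3.1.31. The previous example gives another illustration of the general rule that one should never blindly rely on computer programs. In Mathematica 5.1, the command
Integrate[1/(1 + x^4), x]
gives the output
\[ \frac{1}{4 \sqrt{2}} \left( -2 \, \mathrm{ArcTan}[1 - \sqrt{2} \, x] + 2 \, \mathrm{ArcTan}[1 + \sqrt{2} \, x] - \mathrm{Log}[-1 + \sqrt{2} \, x - x^2] + \mathrm{Log}[1 + \sqrt{2} \, x + x^2] \right) \]
which is incorrect. A first inspection of the last formula reveals that the argument of the first natural logarithm function becomes negative for large $x$ such that the logarithm is not defined. This gives a first indication that the expression is incorrect. Comparison with the result from Example 3.1.30 shows that the sign of that argument has to be reversed.

Fig. 72: Graphs of the solutions $f_0$, $f_{1/4}$, $f_{1/2}$, $f_{3/4}$ and $f_1$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.

Example 3.1.32. Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.
\[ f'(x) = a f(x) (1 - f(x)) \qquad (3.1.16) \]
for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0, 1)$.
Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Since $f(0) \in (0, 1)$, the continuity of $f$ implies the existence of $c, d \in \mathbb{R}$ such that $c < 0 < d$ and such that $f([c, d]) \subset (0, 1)$.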
Since $a > 0$ and the function $a f (1 - f)$ is $> 0$ on the interval $[c, d]$, it follows from (3.1.2) that
\[ f(x_1) = f(x_1) - f(x_0) + f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x) \, dx = f(x_0) + \int_{x_0}^{x_1} a f(x) (1 - f(x)) \, dx \geq f(x_0) \]
for all $x_0, x_1 \in [c, d]$ such that $x_0 \leq x_1$. In addition, the restriction of $f$ to $[c, d]$ is non-constant since the function $( \mathbb{R} \to \mathbb{R}, x \mapsto a x (1 - x) )$ has no zeros on $(0, 1)$. Hence we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c, d]$ that
\[ a (x - c) = \int_c^x a \, dy = \int_c^x \frac{f'(y)}{f(y) (1 - f(y))} \, dy = \int_{f(c)}^{f(x)} \frac{du}{u (1 - u)} = \int_{f(c)}^{f(x)} \left( \frac{1}{u} + \frac{1}{1 - u} \right) du = \ln\left( \frac{f(x)}{f(c)} \, \frac{1 - f(c)}{1 - f(x)} \right) \]
and hence that
\[ \frac{f(c)}{1 - f(c)} \, e^{-ac} \, e^{ax} = \frac{f(x)}{1 - f(x)} = \frac{1}{1 - f(x)} - 1 . \]
For $x = 0$, this implies that
\[ \frac{f(c)}{1 - f(c)} \, e^{-ac} = \frac{f(0)}{1 - f(0)} \]
and hence that
\[ \frac{f(0)}{1 - f(0)} \, e^{ax} = \frac{1}{1 - f(x)} - 1 . \]
Finally, this leads to
\[ f(x) = 1 - \frac{1}{1 + \frac{f(0)}{1 - f(0)} \, e^{ax}} = \frac{f(0) \, e^{ax}}{1 - f(0) + f(0) \, e^{ax}} . \]
On the other hand, for every $c \in \mathbb{R} \setminus \{0\}$, it follows by elementary calculation that the function $f_c$ defined by
\[ f_c(x) := \frac{c \, e^{ax}}{1 - c + c \, e^{ax}} \]
for $x \in \mathbb{R}$ if $0 < c \leq 1$, and for $x \in \mathbb{R} \setminus \{ a^{-1} \ln((c - 1)/c) \}$ if $c > 1$ or $c < 0$, satisfies (3.1.16). Note also that $f_0$, defined as the constant function of value zero on $\mathbb{R}$, is a further solution of (3.1.16) such that $f_0(0) = 0$.

Fig. 73: Graphs of the solutions $f_2$ and $f_4$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.

Problems

1) Calculate the integral.
»2 a) 2 »4 c) 3 »3 e) 2 »1 g) 3u u2 »4 0 3 x3 x2 6x2 du 1{2 , d) 0 3 12x 3x 4x2 x3 4x x4 2x2 8 4 x 4 5 dx 1 x2 0 6x »3 , f) 1 »3 , h) 3 »4 j) 2 , 9 »1 l) , , x2 1 dx x4 4x2 4 , , 7 dx , x3 3x2 1 15x2 10x 24 dx , 4x3 x4 dx x2 3x 1 dx x3 2x2 7x 4 3x x4 0 1 3x 1 dx x3 7x 6 dx dx , 2x b) »1 , 3x 5 dx 4 x 4x2 3 » 1{2 k) du u2 1 x3 i) u 12 u3 x2 »2 2 6x2 4x 1 »0 m) 2 x4 »1 n) 0 »3 o) 0 »2 p) 1 2x2 1 dx x3 9x2 11x 4 x3 x 1 x4 3x3 3x2 7x x2 6 1 x4 2x3 3x2 4x 2 x4 2x3 x2 3x3 5x2 4 9x 6 , dx , dx , dx .

3.1.4 Approximate Numerical Calculation of Integrals

Usually, in cases where an evaluation of a given integral in terms of known functions appears to be impossible, one resorts to approximation methods. Basic numerical methods for this, the midpoint rule, the trapezoid rule and Simpson's rule, are given within this section. Each of them uses approximations of integrands analogous to those leading to upper and lower sums in the definition of the Riemann integral. For this, partitions of the interval of integration $I$ are used which induce divisions into subintervals of equal length. Generally, the decrease of that length leads to better approximations. On each subinterval, the corresponding restriction of the integrand $f : I \to \mathbb{R}$ is replaced by a certain polynomial approximation characteristic for each method. The integral of $f$ over $I$ is then approximated by the sum of the integrals of the approximating polynomials over the subintervals.

The midpoint rule uses on each subinterval the constant polynomial whose value coincides with the value of $f$ in the midpoint of that interval. This is equivalent to the approximation of $f$ by its linearization around the midpoint of the subinterval, since the integral of the non-constant part of that linear polynomial over the subinterval vanishes. This is the reason why the midpoint rule leads to results which are similar in accuracy to those of the trapezoid rule. The trapezoid rule approximates $f$ on each subinterval by the linear polynomial that interpolates between the values of $f$ at the interval ends, i.e., by that linear polynomial that assumes the same values as $f$ at both ends of the subinterval. Finally, Simpson's rule approximates $f$ on each subinterval by the quadratic polynomial that interpolates between the values of $f$ at the end points and at the midpoint of that interval.
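The three rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `midpoint`, `trapezoid` and `simpson` are chosen here for illustration, the integrand $1/x$ on $[1, 2]$ with four subintervals is picked as a test case because its exact integral is $\ln(2)$, and rational arithmetic is used so that the sums come out as exact fractions.

```python
from fractions import Fraction
from math import log

def midpoint(f, a, b, n):
    # Constant approximation by the value of f at each subinterval midpoint.
    h = (b - a) / n
    return h * sum(f(a + (i + Fraction(1, 2)) * h) for i in range(n))

def trapezoid(f, a, b, n):
    # Linear interpolation between the values of f at the subinterval ends.
    h = (b - a) / n
    return h * sum((f(a + i * h) + f(a + (i + 1) * h)) / 2 for i in range(n))

def simpson(f, a, b, n):
    # Quadratic interpolation through both ends and the midpoint
    # of each of the n subintervals.
    h = (b - a) / n
    return h / 6 * sum(
        f(a + i * h) + 4 * f(a + (i + Fraction(1, 2)) * h) + f(a + (i + 1) * h)
        for i in range(n)
    )

f = lambda x: 1 / x
a, b, n = Fraction(1), Fraction(2), 4

m = midpoint(f, a, b, n)   # 4448/6435
t = trapezoid(f, a, b, n)  # 1171/1680
s = simpson(f, a, b, n)    # 1498711/2162160

for name, value in (("midpoint", m), ("trapezoid", t), ("simpson", s)):
    print(name, value, float(value), abs(float(value) - log(2)))
```

Passing floating point numbers instead of `Fraction` values for `a` and `b` works as well when exact fractions are not needed; the `Fraction(1, 2)` offsets are then converted automatically.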
From this description, it might be expected that among those methods, Simpson's rule is the most accurate, followed by the trapezoid rule and the midpoint rule. Indeed, Simpson's rule is the most accurate, which is also reflected in the fact that its error bound is proportional to $n^{-4}$, where $n$ is the number of subintervals of the division. On the other hand, the error bounds of both the midpoint rule and the trapezoid rule are proportional to $n^{-2}$. Neither of these two rules is the more accurate one in every case. In the examples below, the midpoint rule gives the better approximation, in line with the error bounds of Theorems 3.1.33 and 3.1.35, where the bound for the midpoint rule is half of that for the trapezoid rule. All these methods can lead to poor results in the case of an oscillating $f$ as long as the length of the subintervals is comparable to the average distance of subsequent minima and maxima of $f$. Such cases are depicted in the figures below.

The key for the following derivation of an error estimate for the midpoint rule is the observation that the associated integrals over the subintervals coincide with those of the linearization of the integrand around the midpoints. As a consequence, the remainder estimate of Corollary 2.5.26 to Taylor's theorem can be applied.

Theorem 3.1.33. (Midpoint Rule) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a, b] \to \mathbb{R}$ be bounded and twice differentiable on $(a, b)$ such that $|f''(x)| \leq K$ for all $x \in (a, b)$ and some $K \geq 0$. Then

(i)
\[ \left| \int_a^b f(x) \, dx - f\left( \frac{a + b}{2} \right) (b - a) \right| \leq \frac{K}{24} (b - a)^3 . \]

(ii) In addition, let $n \in \mathbb{N}^{*}$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$. Then
\[ \left| \int_a^b f(x) \, dx - h \sum_{i=0}^{n-1} f\left( \frac{a_i + a_{i+1}}{2} \right) \right| \leq \frac{K (b - a)^3}{24 \, n^2} . \]

Fig. 74: Midpoint approximation.

Proof. (i) By Corollary 2.5.26 to Taylor's theorem, it follows that
\[ |f(x) - p_1(x)| \leq \frac{K}{2} \left( x - \frac{a + b}{2} \right)^2 \]
for all $x \in (a, b)$ where
\[ p_1(x) := f\left( \frac{a + b}{2} \right) + f'\left( \frac{a + b}{2} \right) \left( x - \frac{a + b}{2} \right) \]
for all $x \in \mathbb{R}$ is the first-degree Taylor polynomial of $f$ centered around $(a + b)/2$. Further,
\[ \int_a^b p_1(x) \, dx = f\left( \frac{a + b}{2} \right) (b - a) + f'\left( \frac{a + b}{2} \right) \int_a^b \left( x - \frac{a + b}{2} \right) dx \]
\[ = f\left( \frac{a + b}{2} \right) (b - a) + \frac{1}{2} \, f'\left( \frac{a + b}{2} \right) \left[ \left( x - \frac{a + b}{2} \right)^2 \right]_a^b = f\left( \frac{a + b}{2} \right) (b - a) . \]
In addition,
\[ \left| \int_a^b f(x) \, dx - \int_a^b p_1(x) \, dx \right| \leq \int_a^b |f(x) - p_1(x)| \, dx \leq \frac{K}{2} \int_a^b \left( x - \frac{a + b}{2} \right)^2 dx = \frac{K}{6} \left[ \left( x - \frac{a + b}{2} \right)^3 \right]_a^b = \frac{K}{24} (b - a)^3 . \]
(ii) is a simple consequence of (i), applied to each of the $n$ subintervals of length $h$.

Example 3.1.34. We use the midpoint rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
For this, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ h \sum_{i=0}^{3} f\left( \frac{a_i + a_{i+1}}{2} \right) = \frac{1}{4} \sum_{i=0}^{3} \frac{2}{a_i + a_{i+1}} = \frac{1}{2} \left[ \frac{1}{\frac{4}{4} + \frac{5}{4}} + \frac{1}{\frac{5}{4} + \frac{6}{4}} + \frac{1}{\frac{6}{4} + \frac{7}{4}} + \frac{1}{\frac{7}{4} + \frac{8}{4}} \right] \]
\[ = 2 \left( \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} \right) = \frac{4448}{6435} \approx 0.691 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693 . \]
The result of this application of the midpoint rule gives $\ln(2)$ within an error of $2 \cdot 10^{-3}$. Since
\[ |f''(x)| = \frac{2}{x^3} \leq 2 \]
for all $x \in (1, 2)$, Theorem 3.1.33 (ii) leads to the error bound
\[ \left| \frac{4448}{6435} - \ln(2) \right| \leq \frac{2}{24 \cdot 16} = \frac{1}{192} < 6 \cdot 10^{-3} . \]

Fig. 75: Trapezoid approximation.

The following derivation of an error estimate for the trapezoid rule exploits the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the interval ends. By partial integration, the integral of such a difference can be transformed into an integral containing the second order derivative of the difference instead. Since the approximating polynomial is only of first order, the last coincides with the second order derivative of the restriction of the integrand. This leads to an error estimate in terms of a bound on the second derivative of $f$.

Theorem 3.1.35. (Trapezoid Rule) Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be twice continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f''(x)| \leq K$ for all $x \in (a, b)$ and some $K \geq 0$. Then

(i)
\[ \left| \int_a^b f(x) \, dx - \frac{f(a) + f(b)}{2} \, (b - a) \right| \leq \frac{K}{12} (b - a)^3 . \]

(ii) In addition, let $n \in \mathbb{N}^{*}$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$.
Then
\[ \left| \int_a^b f(x) \, dx - h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} \right| \leq \frac{K (b - a)^3}{12 \, n^2} . \]

Proof. (i) Define
\[ p(x) := f(a) + \frac{f(b) - f(a)}{b - a} \, (x - a) \]
for all $x \in \mathbb{R}$ and $h := f - p|_I$. In particular, it follows that $h(a) = h(b) = 0$ and
\[ \int_a^b p(x) \, dx = \frac{f(a) + f(b)}{2} \, (b - a) . \]
By partial integration, it follows that
\[ \int_a^b h(x) \, dx = \frac{1}{2} \int_a^b (x - b)(x - a) \, h''(x) \, dx = \frac{1}{2} \int_a^b (x - b)(x - a) \, f''(x) \, dx \]
and hence that
\[ \left| \int_a^b h(x) \, dx \right| \leq \frac{1}{2} \int_a^b (b - x)(x - a) \, |f''(x)| \, dx \leq \frac{K}{2} \int_a^b (b - x)(x - a) \, dx = \frac{K}{12} (b - a)^3 . \]
(ii) is a simple consequence of (i).

Example 3.1.36. As before with the midpoint rule, we use the trapezoid rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
Again, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ h \sum_{i=0}^{3} \frac{f(a_i) + f(a_{i+1})}{2} = \frac{1}{8} \sum_{i=0}^{3} \left( \frac{1}{a_i} + \frac{1}{a_{i+1}} \right) = \frac{1}{8} \left( \frac{4}{4} + \frac{4}{5} + \frac{4}{5} + \frac{4}{6} + \frac{4}{6} + \frac{4}{7} + \frac{4}{7} + \frac{4}{8} \right) \]
\[ = \frac{1}{2} \left( \frac{1}{4} + \frac{2}{5} + \frac{2}{6} + \frac{2}{7} + \frac{1}{8} \right) = \frac{1171}{1680} \approx 0.697 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693 . \]
The result of this application of the trapezoid rule gives $\ln(2)$ within an error of $4 \cdot 10^{-3}$. Since
\[ |f''(x)| = \frac{2}{x^3} \leq 2 \]
for all $x \in (1, 2)$, Theorem 3.1.35 (ii) leads to the error bound
\[ \left| \frac{1171}{1680} - \ln(2) \right| \leq \frac{2}{12 \cdot 16} = \frac{1}{96} < 11 \cdot 10^{-3} . \]

The following derivation of an error estimate for Simpson's rule is similar to that for the trapezoid rule. Again, it uses partial integration to exploit the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the endpoints and also in the middle of the interval. This leads to an error estimate in terms of a bound on the fourth derivative of the integrand.

Theorem 3.1.37. (Simpson's Rule) Let $h > 0$, $I$ be some open interval of $\mathbb{R}$ containing $[-h, h]$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $|f^{(iv)}(x)| \leq K$ for all $x \in (-h, h)$ and some $K \geq 0$. Then
\[ \left| \int_{-h}^{h} f(x) \, dx - \frac{1}{3} \, [ f(-h) + 4 f(0) + f(h) ] \, h \right| \leq \frac{K}{90} \, h^5 . \]

Proof.
Define
\[ p(x) := \left\{ \frac{1}{2} \, [ f(-h) + f(h) ] - f(0) \right\} \frac{x^2}{h^2} + \frac{1}{2} \, [ f(h) - f(-h) ] \, \frac{x}{h} + f(0) \]
for all $x \in \mathbb{R}$ and $g := f - p|_I$. Then $g(-h) = g(0) = g(h) = 0$ and
\[ \int_{-h}^{h} p(x) \, dx = \left\{ \frac{1}{2} \, [ f(-h) + f(h) ] - f(0) \right\} \frac{2h}{3} + f(0) \, 2h = \frac{1}{3} \, [ f(-h) + 4 f(0) + f(h) ] \, h . \]
By partial integration, it follows that
\[ \int_{-h}^{0} (x + h)^3 (3x - h) \, g^{(iv)}(x) \, dx + \int_0^h (x - h)^3 (3x + h) \, g^{(iv)}(x) \, dx = \int_0^h (x - h)^3 (3x + h) \, [ g^{(iv)}(x) + g^{(iv)}(-x) ] \, dx = 72 \int_{-h}^{h} g(x) \, dx \]
and hence, since $g^{(iv)} = f^{(iv)}$ on $(-h, h)$, that
\[ \left| \int_{-h}^{h} g(x) \, dx \right| \leq \frac{K}{36} \int_0^h (h - x)^3 (3x + h) \, dx = \frac{K}{36} \left[ \frac{3}{5} (h - x)^5 - h (h - x)^4 \right]_0^h = \frac{K}{90} \, h^5 . \]

Fig. 76: Simpson's approximation.

Corollary 3.1.38. Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f^{(iv)}(x)| \leq K$ for all $x \in (a, b)$ and some $K \geq 0$. Finally, let $n \in \mathbb{N}^{*}$, $h := (b - a)/n$ and $a_i := a + i h$ for all $i \in \{0, \dots, n\}$. Then
\[ \left| \int_a^b f(x) \, dx - \frac{h}{6} \sum_{i=0}^{n-1} [ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) ] \right| \leq \frac{K (b - a)^5}{2880 \, n^4} . \]
Note that
\[ \frac{h}{6} \sum_{i=0}^{n-1} [ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) ] = \frac{2}{3} \, h \sum_{i=0}^{n-1} f((a_i + a_{i+1})/2) + \frac{1}{3} \, h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} \]
hence equals the sum of two-thirds of the corresponding sum for the midpoint rule and one-third of the corresponding sum for the trapezoid rule.

Proof. The corollary is a simple consequence of Theorem 3.1.37, applied to each of the $n$ subintervals, which have half-length $h/2$.

Example 3.1.39. As before with the midpoint and trapezoid rule, we use Simpson's rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
Again, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ \frac{h}{6} \sum_{i=0}^{3} [ f(a_i) + 4 f((a_i + a_{i+1})/2) + f(a_{i+1}) ] = \frac{2}{3} \cdot \frac{4448}{6435} + \frac{1}{3} \cdot \frac{1171}{1680} = \frac{1498711}{2162160} \approx 0.693155 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to six decimal places. Also, the corresponding sums for the midpoint rule and the trapezoid rule have been used. To six decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693147 . \]
The result of this application of Simpson's rule gives $\ln(2)$ within an error of $8 \cdot 10^{-6}$.
Since
\[ |f^{(iv)}(x)| = \frac{24}{x^5} \leq 24 \]
for all $x \in (1, 2)$, Corollary 3.1.38 leads to the error bound
\[ \left| \frac{1498711}{2162160} - \ln(2) \right| \leq \frac{24}{2880 \cdot 256} = \frac{1}{30720} < 4 \cdot 10^{-5} . \]

Problems

1) Calculate the integral. In addition, evaluate the integral approximately, using the midpoint rule, the trapezoid rule and Simpson's rule. In this, subdivide the interval of integration into 4 intervals of equal length. Compare the approximation to the exact result.
\[ \text{a)} \ \int_0^1 \frac{du}{(1 + u)^2} , \qquad \text{b)} \ \int_0^1 \frac{3 u^2}{(1 + u^3)^2} \, du , \qquad \text{c)} \ \int_0^1 \frac{2x}{(1 + x^2)^2} \, dx . \]

2) By using Simpson's rule, approximate the area in $\mathbb{R}^2$ that is enclosed by the Cartesian leaf
\[ C := \{ (x, y) \in \mathbb{R}^2 : 3 \sqrt{2} \, (y^2 - x^2) + 2 x (x^2 + 3 y^2) = 0 \} . \]
In this, subdivide the interval of integration into 4 intervals of equal length. Compare the approximation to the exact result which is given by 1.5.

3) The time for one complete swing ('period') $T$ of a pendulum with length $L > 0$ is given by
\[ T = \frac{2 \sqrt{L/g}}{\sqrt{1 - k^2}} \, [ \, \pi - 2 k^2 I(k) \, ] \]
where
\[ I(k) := \int_0^1 \frac{\sqrt{1 - u^2}}{\sqrt{1 - k^2 u^2} \, \left( \sqrt{1 - k^2} + \sqrt{1 - k^2 u^2} \right)} \, du , \]
$\theta_0 \in (-\pi/2, \pi/2)$ is the initial angle of elongation from the position of rest of the pendulum, $k := |\sin(\theta_0/2)|$, and where $g$ is the acceleration of the Earth's gravitational field. By using Simpson's rule, approximate $T$ for $\theta_0 = \pi/4$. In this, subdivide the interval of integration into 4 intervals of equal length.

3.2 Improper Integrals

A large number of integrals in applications are 'improper' in the sense that they are not Riemann integrals of functions over bounded closed intervals of $\mathbb{R}$. For instance in physics, integrals over unbounded sets occur naturally in the description of systems of infinite extension, which are basic for physics. Another important source of improper integrals is the theory of special functions, where the majority of integral representations is in the form of improper Riemann integrals (or, alternatively, Lebesgue integrals). Also special functions have important applications.
The majority of them appear as solutions of differential equations from applications, like Bessel functions, hypergeometric functions, confluent hypergeometric functions or elliptic functions. Others, like the gamma function or the beta function, appear naturally in the definitions of the former. For this reason, in this section we also introduce basic special functions, the gamma function and the beta function, by help of such integral representations and derive their basic properties. In particular, Legendre's duplication formula, Euler's reflection formula and Gauss' representation for the gamma function are proved in this section. In applications, these results are often needed also for complex arguments. As is known, these follow from those for real arguments by help of the principle of analytic continuation. In addition, elementary properties of Gaussian integrals are derived that are frequently used in quantum theory and in probability theory. Original proofs of some of these results used improper double integrals. In the meantime, more elementary proofs have been found that allow their derivation already at an early stage in a calculus course. In particular, we use results from [26] and [61].

For motivation, in the following we consider the problem of the calculation of the period of a simple pendulum in Earth's gravitational field, which leads in a natural way to an improper Riemann integral. A simple pendulum is defined as a particle of mass $m > 0$ suspended from a point $O$ by a string of length $L > 0$ and of negligible mass. During the time of development of calculus in the 17th century, such motion was considered in 1673 by the inventor of the pendulum clock, Christiaan Huygens [56]. In the analysis below, we use Newton's equation of motion. The last was not known to Huygens at that time.

Fig. 77: A simple pendulum. The dashed line marks the rest position. Compare text.
Newton's equation of motion gives the following differential equation for the angle of elongation $\theta$ from the rest position of the pendulum as a function of time.
\[ \theta'' + \frac{g}{L} \sin \theta = 0 \qquad (3.2.1) \]
where $g$ is the acceleration of Earth's gravitational field. The general solution of this equation is not expressible in terms of elementary functions, but only in terms of special functions called 'elliptic functions'. In the following, instead of finding the solutions of (3.2.1), we pursue the goal of finding the time $\tau$ for the pendulum to reach the angle $0$ after release from rest at initial time $0$, i.e., $\theta'(0) = 0$, and with initial elongation $\theta_0 \in (0, \pi/2)$. The time $\tau$ corresponds to one-fourth of the time necessary for completion of one complete swing, i.e., to one-fourth of the period of the pendulum. For this, we assume that there is a unique solution $\theta : \mathbb{R} \to \mathbb{R}$ of (3.2.1) such that $\theta(0) = \theta_0$, $\theta'(0) = 0$, $0 \in \mathrm{Ran}(\theta)$, and we define $\tau$ as the smallest non-negative zero of $\theta$,
\[ \tau := \min \, [ \, \theta^{-1}(\{0\}) \cap [0, \infty) \, ] . \]
Only this solution of (3.2.1), whose existence and uniqueness can be proved, we consider in the following. Note that these assumptions imply that $\theta$ is twice differentiable and, as a particular consequence of (3.2.1), that $\theta''$ is continuous.

In a first step, we use the conserved energy for the solutions of (3.2.1), see Example 2.5.9, to derive a differential equation for $\theta$ that contains no higher order derivatives of $\theta$ than of first order. Multiplication of (3.2.1) by $\theta'$ gives
\[ 0 = \theta' \theta'' + \frac{g}{L} \, \theta' \sin \theta = \left[ \frac{1}{2} \, \theta'^{\,2} - \frac{g}{L} \cos \theta \right]' . \]
Hence it follows by Theorem 2.5.7 that the function inside the brackets is constant and therefore that
\[ \frac{1}{2} \, (\theta'(t))^2 - \frac{g}{L} \cos \theta(t) = \frac{1}{2} \, (\theta'(0))^2 - \frac{g}{L} \cos \theta(0) = - \frac{g}{L} \cos \theta_0 \]
which leads to
\[ (\theta'(t))^2 = \frac{2g}{L} \, [ \cos \theta(t) - \cos \theta_0 ] \]
for every $t \in \mathbb{R}$. The solution of the last equation for $\theta'(t)$ for some $t \in \mathbb{R}$ requires the knowledge of the sign of $\theta'(t)$.
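By the fundamental theorem of calculus, it follows from (3.2.1) that
\[ \theta'(t) = \theta'(t) - \theta'(0) = \int_0^t \theta''(s) \, ds = - \frac{g}{L} \int_0^t \sin \theta(s) \, ds \leq 0 \]
for all $t \in [0, \tau]$ where it has been used that $\theta(\tau) = 0$ and $\theta(t) \in [0, \theta_0] \subset [0, \pi/2)$. Both follow from the definition of $\tau$. Hence, we conclude that
\[ \theta'(t) = - \sqrt{\frac{2g}{L}} \, \sqrt{\cos \theta(t) - \cos \theta_0} \]
for all $t \in [0, \tau]$. Since $\theta'(t) < 0$ for all $t \in (0, \tau)$, it follows by Theorems 2.3.44, 2.5.10 and 2.5.18 that for the restriction of $\theta$ to the interval $[0, \tau]$ there is a strictly decreasing continuous inverse function $\theta^{-1} : [0, \theta_0] \to \mathbb{R}$ whose restriction to $(0, \theta_0)$ is differentiable such that
\[ (\theta^{-1})'(\varphi) = \frac{1}{\theta'(\theta^{-1}(\varphi))} = - \sqrt{\frac{L}{2g}} \, \frac{1}{\sqrt{\cos \theta(\theta^{-1}(\varphi)) - \cos \theta_0}} = - \sqrt{\frac{L}{2g}} \, \frac{1}{\sqrt{\cos \varphi - \cos \theta_0}} = - \frac{1}{2} \sqrt{\frac{L}{g}} \, \frac{1}{\sqrt{\sin^2\left( \frac{\theta_0}{2} \right) - \sin^2\left( \frac{\varphi}{2} \right)}} \]
for all $\varphi \in (0, \theta_0)$ where the addition theorem for the cosine has been used to conclude that
\[ \cos \alpha = \cos\left( \frac{\alpha}{2} + \frac{\alpha}{2} \right) = \cos^2\left( \frac{\alpha}{2} \right) - \sin^2\left( \frac{\alpha}{2} \right) = 1 - 2 \sin^2\left( \frac{\alpha}{2} \right) \]
for every $\alpha \in \mathbb{R}$. Hence it follows by the fundamental theorem of calculus that
\[ \tau - \theta^{-1}(\varphi) = \theta^{-1}(0) - \theta^{-1}(\varphi) = - [ \theta^{-1}(\varphi) - \theta^{-1}(0) ] = - \int_0^{\varphi} (\theta^{-1})'(\bar{\varphi}) \, d\bar{\varphi} \]
\[ = \frac{1}{2} \sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\bar{\varphi}}{\sqrt{\sin^2\left( \frac{\theta_0}{2} \right) - \sin^2\left( \frac{\bar{\varphi}}{2} \right)}} = \frac{1}{2k} \sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\bar{\varphi}}{\sqrt{1 - k^{-2} \sin^2\left( \frac{\bar{\varphi}}{2} \right)}} \]
for every $\varphi \in [0, \theta_0)$ where $k \in (0, 1)$ is defined by
\[ k := \sin\left( \frac{\theta_0}{2} \right) . \]
By use of the substitution $\sigma : [0, \sin(\varphi/2)/k] \to \mathbb{R}$ defined by
\[ \sigma(u) := 2 \arcsin(k u) \]
for every $u \in [0, \sin(\varphi/2)/k]$, we arrive at
\[ \tau - \theta^{-1}(\varphi) = \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k} \sin\left( \frac{\varphi}{2} \right)} \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}} . \]
Finally, since $\theta^{-1} : [0, \theta_0] \to \mathbb{R}$ is continuous and $\theta^{-1}(\theta_0) = 0$, we conclude that
\[ \tau = \lim_{\varphi \to \theta_0} \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k} \sin\left( \frac{\varphi}{2} \right)} \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}} = \lim_{u \to 1} \sqrt{\frac{L}{g}} \int_0^{u} \frac{d\bar{u}}{\sqrt{(1 - \bar{u}^2)(1 - k^2 \bar{u}^2)}} . \]
It would be natural to indicate the last by
\[ \tau = \sqrt{\frac{L}{g}} \int_0^1 \frac{d\bar{u}}{\sqrt{(1 - \bar{u}^2)(1 - k^2 \bar{u}^2)}} , \]
but the integrand of the last 'integral' is not defined at $\bar{u} = 1$ and its restriction to the interval $[0, 1)$ is an unbounded function. Hence the last 'integral' is no Riemann integral. The definitions below turn it into an improper Riemann integral defined by
\[ \int_0^1 \frac{d\bar{u}}{\sqrt{(1 - \bar{u}^2)(1 - k^2 \bar{u}^2)}} := \lim_{u \to 1} \int_0^{u} \frac{d\bar{u}}{\sqrt{(1 - \bar{u}^2)(1 - k^2 \bar{u}^2)}} . \]
As a side remark, we mention that
\[ \int_0^{u} \frac{d\bar{u}}{\sqrt{(1 - \bar{u}^2)(1 - k^2 \bar{u}^2)}} \]
for $0 \leq u \leq 1$ is called an elliptic integral of the first kind (in Jacobian form) and is denoted by the symbol $F(u|k)$.

Definition 3.2.1. (Improper Riemann integrals)

(i) Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and $f : [a, b) \to \mathbb{R}$ be almost everywhere continuous. Then $F : [a, b) \to \mathbb{R}$, defined by
\[ F(x) := \int_a^x f(y) \, dy \]
for every $x \in [a, b)$, is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there is $L \in \mathbb{R}$ such that
\[ \lim_{x \to b} F(x) = \lim_{x \to b} \int_a^x f(y) \, dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y) \, dy := \lim_{x \to b} \int_a^x f(y) \, dy . \]

(ii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R}$ be such that $a < b$ if $a \neq -\infty$ and $f : (a, b] \to \mathbb{R}$ be almost everywhere continuous. Then $F : (a, b] \to \mathbb{R}$ defined by
\[ F(x) := \int_x^b f(y) \, dy \]
for every $x \in (a, b]$ is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there is some $L \in \mathbb{R}$ such that
\[ \lim_{x \to a} F(x) = \lim_{x \to a} \int_x^b f(y) \, dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y) \, dy := \lim_{x \to a} \int_x^b f(y) \, dy . \]

(iii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $a \neq -\infty$ and $b \neq \infty$. Further, let $f : (a, b) \to \mathbb{R}$ be almost everywhere continuous. We say that $f$ is improper Riemann-integrable if both $f|_{(a,c]}$ and $f|_{[c,b)}$ are improper Riemann-integrable for some $c \in (a, b)$. In this case, we define
\[ \int_a^b f(x) \, dx := \int_a^c f(x) \, dx + \int_c^b f(x) \, dx . \]
That this definition is indeed independent of $c$ is a consequence of the additivity of the Riemann integral, Theorem 2.6.18. The proof of this will be given in the second remark below.

Remark 3.2.2.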
Note that according to the previous definition, the restrictions to $(a, b]$, $[a, b)$ or $(a, b)$ of a continuous function defined on a bounded closed interval $[a, b]$, where $a, b \in \mathbb{R}$ are such that $a < b$, are improper Riemann-integrable, and that the values of the associated improper integrals all coincide with the Riemann integral of that function.

Remark 3.2.3. In the following, we use the notation from Definition 3.2.1. That Definition 3.2.1 (iii) is independent of $c \in (a, b)$ can be seen as follows. For this, let $d \in (c, b)$ and let $a_1, a_2, \dots$ in $(a, d]$ and $b_1, b_2, \dots$ in $[d, b)$ be sequences that are convergent to $a$ and $b$, respectively. Then it follows by the additivity of the Riemann integral, Theorem 2.6.18, for sufficiently large $k \in \mathbb{N}$ that
\[ \int_{a_k}^{c} f(x) \, dx + \int_c^d f(x) \, dx = \int_{a_k}^{d} f(x) \, dx , \qquad \int_c^d f(x) \, dx + \int_d^{b_k} f(x) \, dx = \int_c^{b_k} f(x) \, dx . \]
Hence it follows by the limit laws that
\[ \int_a^c f(x) \, dx + \int_c^d f(x) \, dx = \lim_{k \to \infty} \int_{a_k}^{c} f(x) \, dx + \int_c^d f(x) \, dx = \lim_{k \to \infty} \int_{a_k}^{d} f(x) \, dx , \]
\[ \int_c^b f(x) \, dx - \int_c^d f(x) \, dx = \lim_{k \to \infty} \int_c^{b_k} f(x) \, dx - \int_c^d f(x) \, dx = \lim_{k \to \infty} \int_d^{b_k} f(x) \, dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_a^c f(x) \, dx + \int_c^d f(x) \, dx = \int_a^d f(x) \, dx , \qquad \int_c^b f(x) \, dx - \int_c^d f(x) \, dx = \int_d^b f(x) \, dx . \]
The last implies that
\[ \int_a^c f(x) \, dx + \int_c^b f(x) \, dx = \int_a^d f(x) \, dx + \int_d^b f(x) \, dx . \]
The case that $d \in (a, c)$ is analogous. Let $a_1, a_2, \dots$ in $(a, d]$ and $b_1, b_2, \dots$ in $[d, b)$ be convergent to $a$ and $b$, respectively. Then it follows by the additivity of the Riemann integral, Theorem 2.6.18, for sufficiently large $k \in \mathbb{N}$ that
\[ \int_{a_k}^{d} f(x) \, dx + \int_d^c f(x) \, dx = \int_{a_k}^{c} f(x) \, dx , \qquad \int_d^c f(x) \, dx + \int_c^{b_k} f(x) \, dx = \int_d^{b_k} f(x) \, dx . \]
Hence it follows by the limit laws that
\[ \int_a^c f(x) \, dx - \int_d^c f(x) \, dx = \lim_{k \to \infty} \int_{a_k}^{c} f(x) \, dx - \int_d^c f(x) \, dx = \lim_{k \to \infty} \int_{a_k}^{d} f(x) \, dx , \]
\[ \int_c^b f(x) \, dx + \int_d^c f(x) \, dx = \lim_{k \to \infty} \int_c^{b_k} f(x) \, dx + \int_d^c f(x) \, dx = \lim_{k \to \infty} \int_d^{b_k} f(x) \, dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_a^c f(x) \, dx - \int_d^c f(x) \, dx = \int_a^d f(x) \, dx , \qquad \int_c^b f(x) \, dx + \int_d^c f(x) \, dx = \int_d^b f(x) \, dx . \]
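The last implies that
\[
\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^d f(x)\,dx + \int_d^b f(x)\,dx .
\]

In the following, we give two prime examples of improper integrals whose integrands are restrictions of powers of the identity function on $(0,\infty)$. The first example shows that such a power is improper Riemann-integrable over an interval $(0,a]$, where $a > 0$, if and only if its exponent is greater than $-1$. The second example shows that such a power is improper Riemann-integrable over an interval $[a,\infty)$, where $a > 0$, if and only if its exponent is smaller than $-1$.

Example 3.2.4. Define $f_\alpha := ((0,a] \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$, where $a > 0$. Show that

(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha < 1$ and
\[
\int_0^a \frac{dx}{x^\alpha} = \frac{1}{1-\alpha}\, a^{1-\alpha} .
\]

(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \ge 1$.

Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $\varepsilon \in (0,a)$, it follows that
\[
\int_\varepsilon^a \frac{dx}{x^\alpha} = \left[ \frac{x^{1-\alpha}}{1-\alpha} \right]_\varepsilon^a = \frac{a^{1-\alpha} - \varepsilon^{1-\alpha}}{1-\alpha}
\]
and for $\alpha = 1$ that
\[
\int_\varepsilon^a \frac{dx}{x} = [\ln(x)]_\varepsilon^a = \ln(a) - \ln(\varepsilon) .
\]
For $\varepsilon \to 0$, the term $\varepsilon^{1-\alpha}$ converges to $0$ if $\alpha < 1$ and is unbounded if $\alpha > 1$, and $\ln(\varepsilon)$ is unbounded. Hence the statements follow.

Example 3.2.5. Define $f_\alpha := ([a,\infty) \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$, where $a > 0$. Show that

(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha > 1$ and
\[
\int_a^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha-1}\, \frac{1}{a^{\alpha-1}} .
\]

(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \le 1$.

Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $x > a$, it follows that
\[
\int_a^x \frac{dy}{y^\alpha} = \left[ \frac{y^{1-\alpha}}{1-\alpha} \right]_a^x = \frac{x^{1-\alpha} - a^{1-\alpha}}{1-\alpha}
\]
and for $\alpha = 1$ that
\[
\int_a^x \frac{dy}{y} = [\ln(y)]_a^x = \ln(x) - \ln(a)
\]
and hence the statements.

The following gives an important criterion for improper Riemann integrability. It is based on the fact that for every bounded, continuous and increasing function $F : [a,b) \to \mathbb{R}$, where $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ are such that $a < b$ if $b \neq \infty$, the limit
\[
\lim_{x \to b} F(x)
\]
exists. Within the following theorem, $F$ is an antiderivative of the absolute value of an almost everywhere continuous integrand. In this connection, the theorem is applied by showing that this absolute value has an improper Riemann-integrable majorant.

Theorem 3.2.6. Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and $f : [a,b) \to \mathbb{R}$ be almost everywhere continuous.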
Finally, let $G : [a,b) \to \mathbb{R}$, defined by
\[
G(x) := \int_a^x |f(y)|\,dy
\]
for all $x \in [a,b)$, be bounded. Then $f$ and $|f|$ are improper Riemann-integrable.

Proof. Let $(b_n)_{n \in \mathbb{N}}$ be a sequence in $[a,b)$ which is convergent to $b$. Since $G$ is bounded and increasing, $\sup \operatorname{Ran} G$ exists. As a consequence, for given $\varepsilon > 0$, there is some $c \in [a,b)$ such that $\sup \operatorname{Ran} G - G(c) < \varepsilon$. Since $G$ is increasing, it also follows that $\sup \operatorname{Ran} G - G(x) < \varepsilon$ for all $x \in [c,b)$. Further, there is $n_0 \in \mathbb{N}$ such that $b_n \ge c$ for all $n \in \mathbb{N}$ satisfying $n \ge n_0$. Therefore it also follows that $|\sup \operatorname{Ran} G - G(b_n)| < \varepsilon$ for $n \in \mathbb{N}$ satisfying $n \ge n_0$ and hence finally
\[
\lim_{n \to \infty} G(b_n) = \sup \operatorname{Ran} G .
\]
Hence $|f|$ is improper Riemann-integrable and
\[
\int_a^b |f(x)|\,dx = \sup \operatorname{Ran} G .
\]
Further, for every $x \in [a,b)$, it follows that
\[
\int_a^x (|f(y)| - f(y))\,dy \le 2 \int_a^x |f(y)|\,dy \le 2 \sup \operatorname{Ran} G .
\]
Since $|f| - f \ge 0$, the first part of the proof applies to $|f| - f$. Hence $|f| - f$ and therefore also $f$ is improper Riemann-integrable.

As an application of the previous theorem, the next example defines the gamma function. The latter extends the factorial function $(\mathbb{N} \to \mathbb{N}, n \mapsto n!)$ to a function whose domain consists of all real numbers that are not negative integers, in such a way that the functional relationship $(n+1)! = (n+1)\, n!$ for all $n \in \mathbb{N}$ is preserved. The proof of this is given in Example 3.2.8. A main reason for the importance of the gamma function for applications is the fact that it appears naturally in the definition of many special functions that are solutions of differential equations from applications.

Example 3.2.7. Show that $f_y := ((0,\infty) \to \mathbb{R}, x \mapsto e^{-x} x^{y-1})$ is improper Riemann-integrable for every $y > 0$. Hence we can define the gamma function $\Gamma : (0,\infty) \to \mathbb{R}$ by
\[
\Gamma(y) := \int_0^\infty e^{-x} x^{y-1}\,dx
\]
for all $y \in (0,\infty)$.

Fig. 78: Graphs of the gamma function $\Gamma$ (left) and $1/\Gamma$.

Solution: Let $y > 0$.
For every $\varepsilon \in (0,1)$, it follows that
\[
\int_\varepsilon^1 e^{-x} x^{y-1}\,dx \le \int_\varepsilon^1 x^{y-1}\,dx \le \frac{1}{y}
\]
and hence by Theorem 3.2.6 that $f_y|_{(0,1]}$ is improper Riemann-integrable and that
\[
\int_0^1 e^{-x} x^{y-1}\,dx \le \frac{1}{y} .
\]
Further, $h_y := ([1,\infty) \to \mathbb{R}, x \mapsto e^{-x/2} x^{y-1})$ has a maximum at $x_0 := \max\{1, 2(y-1)\}$. Hence it follows for every $R \ge 1$ that
\[
\int_1^R e^{-x} x^{y-1}\,dx \le h_y(x_0) \int_1^R e^{-x/2}\,dx \le 2 h_y(x_0)\, e^{-1/2}
\]
and by Theorem 3.2.6 that $f_y|_{[1,\infty)}$ is improper Riemann-integrable. Note for later use that
\[
\Gamma(1) = \int_0^1 e^{-x}\,dx + \lim_{R \to \infty} \int_1^R e^{-x}\,dx = \lim_{R \to \infty} \int_0^R e^{-x}\,dx = \lim_{R \to \infty} (1 - e^{-R}) = 1 .
\]

Example 3.2.8. Show that
\[
\Gamma(y+1) = y\, \Gamma(y) \tag{3.2.2}
\]
for all $y > 0$ and hence that
\[
\Gamma(n+1) = n! \tag{3.2.3}
\]
for all $n \in \mathbb{N}$.

Solution: By partial integration, it follows for every $y > 0$, $\varepsilon \in (0,1)$ and $R \in (1,\infty)$ that
\[
\int_\varepsilon^R e^{-x} x^y\,dx = e^{-\varepsilon} \varepsilon^y - e^{-R} R^y + y \int_\varepsilon^R e^{-x} x^{y-1}\,dx
\]
and hence (3.2.2) by taking the limits $\varepsilon \to 0$ and $R \to \infty$. Since $\Gamma(1) = 1 = 0!$, (3.2.3) follows from this by induction.

As another example of an application of Theorem 3.2.6, the next example defines Gaussian integrals. Such integrals appear in quantum theory in the study of the quantization of the harmonic oscillator, which is of fundamental importance for physics. In addition, they appear naturally in the study of the normal distribution in probability theory. This distribution is frequently used to describe the propagation of random errors occurring in measurements of physical quantities.

Example 3.2.9. (Gaussian integrals, I) Show that $f_{m,n} := ([0,\infty) \to \mathbb{R}, x \mapsto x^m e^{-n x^2/2})$ is improper Riemann-integrable for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$. In particular, show that $I : \mathbb{N} \times \mathbb{N}^* \to \mathbb{R}$, defined by
\[
I(m,n) := \int_0^\infty x^m e^{-n x^2/2}\,dx
\]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$, satisfies
\[
I(m+2, n) = \frac{m+1}{n}\, I(m,n) \tag{3.2.4}
\]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and, in particular,
\[
I(2k+1, n) = \frac{2^k\, k!}{n^{k+1}} , \qquad
I(2(k+1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\, \frac{1}{\sqrt{n}}\, I(0,1) \tag{3.2.5}
\]
for all $k \in \mathbb{N}$.
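Before turning to the proof, the identities claimed in Example 3.2.9 can be checked numerically. The following is only a rough sketch and not part of the text's argument: the helper names `simpson` and `gaussian_moment`, the truncation point `upper` and the number of subintervals are assumptions chosen for illustration. The improper integral is replaced by a proper one over $[0, \text{upper}]$, which is harmless here because the integrand decays like $e^{-n x^2/2}$.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        # Interior nodes alternate between weight 4 (odd) and weight 2 (even).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def gaussian_moment(m, n, upper=20.0):
    """Approximate I(m, n) by integrating x^m exp(-n x^2 / 2) over [0, upper].

    The cut-off `upper` is an assumption of this sketch; for n >= 1 the
    neglected tail is far below double precision.
    """
    return simpson(lambda x: x**m * math.exp(-n * x * x / 2), 0.0, upper)


if __name__ == "__main__":
    for m, n in [(0, 1), (1, 2), (3, 2)]:
        lhs = gaussian_moment(m + 2, n)
        rhs = (m + 1) / n * gaussian_moment(m, n)
        # The two columns should agree up to the quadrature error.
        print(m, n, lhs, rhs)
```

Comparing `gaussian_moment(2 * k + 1, n)` with `2**k * math.factorial(k) / n**(k + 1)` gives a similar check of the first identity in (3.2.5).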
Solution: First, it follows for $n \in \mathbb{N}^*$ and $x \ge 1$ that
\[
\int_0^x y\, e^{-n y^2/2}\,dy = -\frac{1}{n} \int_0^x (-n y)\, e^{-n y^2/2}\,dy = -\frac{1}{n} \left[ e^{-n y^2/2} \right]_0^x = \frac{1}{n} \left( 1 - e^{-n x^2/2} \right) ,
\]
\[
\int_0^x e^{-n y^2/2}\,dy = \int_0^1 e^{-n y^2/2}\,dy + \int_1^x e^{-n y^2/2}\,dy \le \int_0^1 e^{-n y^2/2}\,dy + \int_0^x y\, e^{-n y^2/2}\,dy
\]
and hence by Theorem 3.2.6 that $f_{0,n}$ and $f_{1,n}$ are improper Riemann-integrable as well as that
\[
I(1,n) = \int_0^\infty y\, e^{-n y^2/2}\,dy = \frac{1}{n} . \tag{3.2.6}
\]
Further, according to Example 2.5.12, $e^x \ge 1$ and $e^x \ge x$ for all $x \ge 0$. Hence it follows that
\[
e^x \ge e^x - 1 = \int_0^x e^y\,dy \ge \int_0^x y\,dy = \frac{x^2}{2}
\]
for all $x \ge 0$ and in this way inductively that
\[
e^x \ge \frac{x^m}{m!}
\]
for all $x \ge 0$ and $m \in \mathbb{N}$. In addition, it follows for $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and $x \ge 0$ that
\[
\int_0^x y^{m+2} e^{-n y^2/2}\,dy = -\frac{1}{n} \int_0^x y^{m+1} (-n y)\, e^{-n y^2/2}\,dy
= -\frac{1}{n} \left[ y^{m+1} e^{-n y^2/2} \right]_0^x + \frac{m+1}{n} \int_0^x y^m e^{-n y^2/2}\,dy
\]
\[
= -\frac{1}{n}\, x^{m+1} e^{-n x^2/2} + \frac{m+1}{n} \int_0^x y^m e^{-n y^2/2}\,dy . \tag{3.2.7}
\]
Since
\[
x^{m+1} e^{-n x^2/2} \le (m+1)!\; e^{x - n x^2/2} ,
\]
we notice that
\[
\lim_{x \to \infty} \frac{1}{n}\, x^{m+1} e^{-n x^2/2} = 0 .
\]
Hence the improper Riemann-integrability of $f_{m,n}$ for all $m \in \mathbb{N}$ and $n \in \mathbb{N}^*$ as well as the validity of (3.2.4) for all $m \in \mathbb{N}$ and $n \in \mathbb{N}^*$ follow inductively from (3.2.7). Further, it follows from (3.2.6) and (3.2.4) by induction that
\[
I(2k+1, n) = \frac{2^k\, k!}{n^{k+1}} , \qquad
I(2(k+1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\, I(0,n)
\]
for all $k \in \mathbb{N}$ and, finally, as a consequence of
\[
\int_0^x e^{-n y^2/2}\,dy = \frac{1}{\sqrt{n}} \int_0^{\sqrt{n}\, x} e^{-u^2/2}\,du
\]
that
\[
I(2(k+1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\, \frac{1}{\sqrt{n}}\, I(0,1) .
\]

Equation (3.2.5) reduces the calculation of the Gaussian integrals $I(m,1)$ for even $m \in \mathbb{N}$ to the calculation of
\[
\int_0^\infty e^{-x^2/2}\,dx .
\]
The determination of this integral is the object of the following example. As an application of the result, the value of $\Gamma(1/2)$ is calculated in the subsequent example.

Example 3.2.10. (Gaussian integrals, II) Together with Wallis' product representation of $\pi$ from Theorem 3.1.19, the results of Example 3.2.9 allow the calculation of $I(0,1)$ as follows.
Employing the notation of Example 3.2.9, in a first step, we conclude for m P N, n P N that 0 »x 2 y py tq eny {2 dy m 0 »x 2t and hence that ym 1 e ny 2 {2 dy ym y 2 eny {2 dy t2 ym 2 2 eny {2 dy 0 m 2 2 eny {2 dy » x y 0 » x y 2 »0 x 0 » x »x 2 m 1 2 eny {2 dy 2 0 for all x ¥ 0 and, finally, I pm 1, nq m 2 2 eny {2 dy 0 ¡0 a I pm, nq I pm 2, nq . In particular, since according to (3.2.4) I pn 1, nq I pn 1, nq , I pn 2, nq n 1 n I pn, nq , we conclude that I pn 1, nq a I pn, nq I pn I pn 2, nq , I pn, nq I pn 1, nq I pn a 2, nq c n n 1, nq I pn 1 I pn 1, nq and hence that I pn, nq I pn 1, nq I pn 323 2, nq . 2, nq In particular, the case n 2k 2k k! p2k 1qk 1 I p2k 1 p32k p12kqk 1 1q k 1 2p2k pk1qk1q2! 1 where k P N, leads to 1q I p2k 1, 2k ?2k1 1 2, 2k I p0, 1q I p2k 1q 3, 2k 1q and hence to d 2k 4pk d 1 ? 2 k 1q 2 4 2k 3 p2k 1q 1 3 ? 2 k 1 1 2k 2q 2k 2k 4pk 2 I p0, 1q 2 4 2pk 1q 3 p2k 3q Finally, taking the limit k Ñ 8 in the last expression and applying Wallis’ product representation of π (3.2.5) leads to I p0, 1q »8 0 2 ey {2 dy Example 3.2.11. Show that Γp1{2q ? c π . 2 (3.2.8) π. Solution: For this, let ε, R ¡ 0. By change of variables, it follows that »R ε 1 2 ey {2 dy ? » R2 {2 2 { x1{2 ex dx ε2 2 and hence by taking the limits that c π 2 ?12 Γp1{2q . 324 40 2 z 20 1.5 0 0 1 y 0.5 0.5 1 x 1.5 2 0 Fig. 79: Graph of the Beta function. As another example of an application of Theorem 3.2.6, the next example defines Euler’s beta function. Example 3.2.12. ( Beta function, I ) Show that fx,y : pp0, 1q Ñ R, x ÞÑ tx1 p1 tqy1 q is improper Riemann-integrable for all x ¡ 0, y ¡ 0. Hence we can define the Beta function B : p0, 8q2 Ñ R by B px, y q : »1 0 tx1 p1 tqy1 dt for all x ¡ 0, y ¡ 0. Solution: For this, let x ε, δ P p0, 1{2q. Then » 1{2 ε ¤ x1 t p1 tqy1 dt ¤ 2 » 1δ { 1 2 x 1 » 1{2 x1 1 2 x 1{2 x 2t 2 1 x t dt ¤ ε x 1 x ε ε x 2 , t p1 tqy1 dt ¤ 2 x 1 ¡ 0, y ¡ 0. In addition, let » 1δ { 1 2 p1 tq y 1 325 y 2p1 y tq 1δ { 1 2 2 y y 1 2 δ ¤ y 1 y y1 1 2 . 
Hence it follows by Theorem 3.2.6 that fx,y |p0,1{2s and fx,y |p1{2,1q are improper Riemann-integrable and that » 1{2 0 1 t p1tqy1 dt ¤ x 1 x1 »1 1 2 x 1 t p1tqy1 dt ¤ x 1 , { 1 2 y1 y 1 2 . As a consequence, fx,y is improper Riemann-integrable and satisfies »1 1 t p1 tqy1 dt ¤ x 1 0 x1 1 y 1 2 x y1 1 2 . The next example represents the Gamma function essentially as a limit of the beta function. As another example of an application of Theorem 3.2.6, the next example defines Euler’s beta function. Example 3.2.13. ( Beta function, II ) Show that x lim Ñ8 y B px, y q Γpxq (3.2.9) y for all x ¡ 0. Solution: For this, let x ε, δ P p0, 1{2q. Then » 1δ ε 0, y ¡ 2. In addition, let tx1 p1 tqy1 dt y1 1 ¡ » py1qp1δq s x1 1 s y1 y1 y1 py1qε y1 » py1qp1δq s 1 x1 ds . py 1qx py1qε s 1 y 1 Further, » py1qp1δq x1 s 1 py1qε y1 s y1 326 ds » py1qp1δq py1qε ds sx1 es ds ¤ » py1qp1δq py1qε x1 s s e 1 1 s e ds . y1 s y1 We consider the auxiliary function h : r0, 8q Ñ R by hpsq : 1 1 for all s P r0, 8q. Then hp0q p0, 8q with derivative h 1 psq y1 s y1 es 0, h is continuous and differentiable on s 1 y1 y2 s es y1 ¡0 for 0 s y 1. Hence it follows for s P r0, y 1s that »s |hpsq| hpsq ¤ ¤ »s 0 u y1 1 u y2 y1 eu du exp u py 2q ln 1 du y1 y1 »s u u u u exp u py 2q du exp du y1 y1 0 y1 0 y1 e s2 2py 1q u u 0 »s where the case a that 1 of (2.5.12) has been used. » py1qp1δq x1 s s e 1 py1qε » py1qp1δq ¤ py1qε sx1 es 1 s y1 Hence it follows further y1 s e ds e s2 e ds ¤ Γpx 2py 1q 2py 1q From the previous, we conclude that |py 1qxB px, yq Γpxq| ¤ 2py e 1q Γpx 327 2q 2q . and hence that lim y x B px, y q Γpxq . y Ñ8 The following example expresses the beta function in terms of the gamma function. Example 3.2.14. ( Beta function, III ) Show that Γpxq Γpy q Γpx y q (3.2.10) y B px x (3.2.11) B px, y q for all x, y ¡ 0. Solution: For this, let x, y by use of partial integration that B px, y For this, let ε, δ » 1δ 1q 1, y q . P p0, 1{2q. Then 1δ 1 x y y t p1 tq dt t p1 tq y x x 1 x ε ¡ 0. 
In the first step, we show ε x1 rp1 δqxδy εxp1 εqy s y x » 1δ ε » 1δ ε tx p1 tqy1 dt tx p1 tqy1 dt which implies (3.2.11). Further, it follows that » 1δ ε » 1δ ε » 1δ t p1 tqy dt x 1 p1 t ε tx p1 tqy1 dt tq t p1 tqy1 dt x 1 » 1δ ε tx1 p1 tqy1 dt and hence that B px, y 1q B px 1, y q B px, y q . As a consequence, we obtain from (3.2.11) the equation B px, y q B px, y 1q B px 328 1, y q x y y B px, y 1q which results in B px, y 1q B px, y q . x y By induction, we conclude from (3.2.12) that B px, y nq y (3.2.12) y py 1q py n 1q B px, y q y q px y 1q px y n 1q px for every n P N . In particular, 1 2 pn 1q , y py 1q py n 1q 1 2 pn 1q y, nq px yq px y 1q px y B py, nq B px n 1q (3.2.13) where it has been used that B pz, 1q for every z »1 0 tz1 dt 1 z ¡ 0. Hence it follows that B py, nqB px, y nq B px, y q B px y, nq x y x y n n n B py,nnxq pyyB pxnq y,B pnx,q y nq . From this follows (3.2.10) by taking the limit n Ñ 8 and applying (3.2.9). Note that, as a consequence of (3.2.9) and the first identity of (3.2.13), we arrive at Gauss’ representation of the gamma function Γpxq nlim Ñ8pn nlim Ñ8 x px 1qx B px, n nx n! 1q px nq for every x ¡ 0. 329 x 1q nlim Ñ8 n B px, n 1q Theorem 3.2.15. (Gauss’ representation of the gamma function) For every x ¡ 0 nx n! Γpxq lim . (3.2.14) nÑ8 x px 1q px nq As an application of Gauss’ representation of the gamma function and the product representation of the sine, (3.1.9), we prove the reflection formula for Γ. Theorem 3.2.16. (Euler’s reflection formula for the gamma function) The equation π Γpxq Γp1 xq (3.2.15) sinpπxq holds for all 0 x 1. Proof. For this, let 0 x 1. Then it follows by (3.1.9) that Γpxq Γp1 xq nx n! 1q px x px nq n n! nlim Ñ8 p1 xq p2 xq rpn 1q xs n pn!q2 x1 nlim Ñ8 p1 x2 q pn2 x2 q pn 1q x 1 π 1 πx x1 nlim . 2 2 x x Ñ8 1 2 1 2 x sin p πx q sin p πx q 1 n lim Ñ8 n 1 x Remark 3.2.17. Note that the reflection formula (3.2.15) can and is used to extend the gamma function to negative values of its argument. See Fig. 80. 
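Gauss' representation (3.2.14) and the reflection formula (3.2.15) lend themselves to a quick numerical plausibility check. The sketch below is an illustration only: the function names `gauss_gamma` and `reflection_defect` are hypothetical, and the reference values come from Python's standard `math.gamma`, not from the text. The product in (3.2.14) is evaluated through logarithms (using `math.lgamma(n + 1)` for $\ln(n!)$) so that large $n$ does not overflow.

```python
import math


def gauss_gamma(x, n):
    """n-th term of Gauss' representation: n^x n! / (x (x+1) ... (x+n)), x > 0."""
    # Sum logarithms instead of multiplying, since n! overflows quickly.
    log_term = x * math.log(n) + math.lgamma(n + 1)
    log_term -= sum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_term)


def reflection_defect(x):
    """Gamma(x) Gamma(1 - x) - pi / sin(pi x) for 0 < x < 1; should be near 0."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # The terms approach Gamma(1/2) = sqrt(pi) only slowly as n grows.
        print(n, gauss_gamma(0.5, n), math.sqrt(math.pi))
    print(reflection_defect(0.3))
```

The slow convergence visible in the output suggests that (3.2.14) is mainly of theoretical interest; for actual evaluation one would rely on a library routine such as `math.gamma`.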
As final examples for the application of improper integrals, Legendre’s duplication formula for the gamma function is proved, and an occasionally occurring integral is evaluated in terms of the gamma function. 330 y y 10 4 3 2 1 -1.5 -0.5 2 3 4 x 1 -4 1 -1.5 2 3 4 x -1 -10 Fig. 80: Graphs of the extensions of the gamma function Γ (left) and 1{Γ to negative values of the argument. Example 3.2.18. Show Legendre’s duplication formula for the gamma function 1 (3.2.16) Γp2xq ? 22x1 Γpxq Γpx p1{2qq π for all x ¡ 0. Solution: For this, let x ¡ 0 and ε, δ P p0, 1{2q. Then »1 Γpxq Γpxq Γp2xq B px, xq rtp1 tqsx1 dt 0 Further, it follows by change of variables that » 1δ ε rtp1 tqs » p1{2qδ 1 4 dt x 1 u » 1δ " ε x 1 2 du 2 p1{2q » 12δ 12x 2 p1 v2qx1 dv 2ε1 » 12δ 212x p1 v2qx1 dv ε 0 t 1 2 2 2x 1 2 » p1{2qδ 1 2 p1{2q p1 v q 1 p2uq2 dv 2 x 1 2ε 1 331 t ε »0 1 2 *x1 x1 dt du 2 1 2x » 12δ 0 p1 v q » 12ε dv 2 x 1 0 p1 v q dv . 2 x 1 (3.2.17) Further, it follows by change of variables that »b p1 v q dv 1 » b2 2 x 1 a 2 a2 y 1{2 p1 y qx1 dy , where 0 a b, and hence by taking the limit a Ñ 0 that »b p1 v q dv 1 » b2 2 x 1 0 2 0 y 1{2 p1 y qx1 dy . Hence it follows from (3.2.17) that » 1δ ε rtp1 tqsx1 dt 22x » p12δq2 0 » p12εq2 y 1{2 p1 y qx1 dy 0 y 1{2 p1 y qx1 dy and by taking the limits that Γpxq Γpxq Γp2xq 212xB p1{2, xq 212x ΓΓppx1{2qp1Γ{p2xqqq . Example 3.2.19. Show that » π{2 0 sin pθq cos pθq dθ µ ν for all µ, ν ¡ 1{2. Solution: For this, let µ, ν Then it follows by change of variables that » 1δ ε Γ µ2 1 Γ 2 Γ µ2 ν ν 1 2 1 (3.2.18) ¡ 1{2 and ε, δ P p0, 1{2q. tpµ1q{2 p1 tqpν 1q{2 dt » arcsinp?1δ q ? arcsinp ε q 2 sinpθq cospθq rsin2 pθqspµ1q{2 r1 sin2 pθqspν 1q{2 dθ 332 2 » arcsinp?1δ q ? arcsinp ε q sinµ pθq cosµ pθq dθ and hence by taking the limits that Γ µ 1 Γ ν21 2 Γ µ2 ν 1 B µ 2 1 ν , 1 2 2 » π{2 0 sinµ pθq cosν pθq dθ . Problems 1) Show the existence in the sense of an improper Riemann integral and calculate the value. In this, if applicable, s, a ¡ 0. 
»1 a) 0 »1 lnpxq dx , esx dx , »8 c) b) 0 x lnpxq dx , »8 d) 0 0 »8 esx sinpaxq dx , e) esx cospaxq dx , f) 0 » 8 ?x »8 e g) 0 ?x dx , h) 8 »8 ?x e dx , 0 x expp x2 q dx , 8 dx dx , i) , j) 4 x3 3 0 x 0 1 »8 »8 dx dx , l) , k) x x 2 e e x a2 8» 8 »8 8 dx dx m) , n) 2 3 2 5x 6 2x 3x 0 x 0 x »8 » 6 . 2) The radial part of the ‘wave function’ of an electron in a bound state around a proton is given by Rnl : p0, 8q Ñ R where n P N , l P t0, . . . , n 1u are the principal quantum number and the azimuthal quantum number, respectively [62]. Calculate the expectation value xry of the radial position of the electron in the corresponding state given by ³8 3 2 xry ³08 r2 Rnl2 prq dr a r Rnl prq dr 0 333 y y 0.2 0.5 0.4 0.3 0.1 0.2 0.1 1 2 3 4 r y 2 4 6 8 10 12 4 8 12 16 20 24 4 8 12 16 20 24 r y 0.2 0.1 0.1 0.05 2 4 6 8 10 12 r y r y 0.1 0.1 0.05 0.05 4 Fig. 81: Graphs of Problem 2. 8 12 16 20 24 r r pp0, 8q Ñ R, r ÞÑ r2 Rnl2 prqq corresponding to a) to f). 334 Compare where a 0.529 108 cm is the Bohr radius. a) R10 prq 2 er ? b) R20 prq , ? r r{2 2 1 e 2 2 6 r{2 re , 12 ? 2 3 2 2 2 d) R30 prq 1 r r er{3 9 3 27 8 1 2 ? e) R31 prq r r er{3 , 6 27 6 4 f) R32 prq ? r2 er{3 81 30 c) R21 prq for all r , , ¡ 0. 3) The ‘wave function’ of a ‘harmonic oscillator’, i.e., the ‘wave function’ of a point particle of mass m ¡ 0 under the influence of a linear restoring force, is given by ψn : R Ñ R where n P N is the principal quantum number [3]. Calculate the expectation value xxy of the position of the mass point in the corresponding state given by ³8 8 x ψn pxq dx . 2 8 ψn pxq dx pmω{~q1{2 , ω pk{mq1{2 , k xxy 2 ³8 [In this, a ¡ 0 is the spring’s constant and ~ is the reduced Planck’s constant.] a) ψ0 pxq b) ψ1 pxq c) ψ2 pxq d) ψ3 pxq ?aπ 1{2 a ? 2 π ? a ea 1{2 1{2 8 π a ? 48 π 2 { x2 2 , 2axea 2 { x2 2 , p4a2 x2 2q ea x {2 1{2 2 2 , p8a3 x3 12axq ea x {2 2 2 for all x P R. 
4) The time for one complete swing (‘period’) T of a pendulum with 335 y y 0.6 0.4 0.2 0.1 -4 2 -2 4 x -4 -2 y -2 4 2 4 x y 0.4 -4 2 0.4 2 4 x -4 -2 x Fig. 82: Squares of the wave functions of a harmonic oscillator. Compare Problem 3. 336 length L ¡ 0 is given by d T 2 L g »1 1 a p1 du qp1 k2 u2 q u2 where θ0 P pπ {2, π {2q is the initial angle of elongation from the position of rest of the pendulum, k : | sinpθ0 {2q|, and where g is the acceleration of the Earth’s gravitational field. Show that the corresponding integral exists in the improper Riemann sense. Split the integrand into a Riemann integrable and an improper Riemann integrable part where the last leads on an integral that can easily be calculated. In this way, we give another representation of T that involves only a proper Riemann integral. 337 E E A A D B C C Fig. 83: Archimedes’ construction in the quadrature of the parabola. Refer to text. 3.3 Series of Real Numbers In this section, we start the study of series of real numbers. A special case of an important series, the geometric series, already appeared in Archimedes’ second proof of his quadrature of the parabola. For motivation, this second proof is considered in the following. For this, we consider a parabola along with a line segment AE between two points A and E on that parabola and the point C of smallest distance from AE. See Fig 83. Archimedes proved that the area of the parabolic segment ACE is 4{3 of the area of the inscribed triangle with corners A, C and E. He did this by dissecting the parabolic segment iteratively by triangles constructed from line segments between points on the parabola as follows. In the first step, two triangles with corners A, B, C and C, D, E are constructed in the same way from the line segments AC and CE, respectively, as the triangle with corners A, B, C was constructed from the line segment AE, i.e., the points B and D are the points of minimal distance from AC and CE, respectively. 
Then the same process is continued with the line segments AB, BC, CD, DE leading to four new triangles and so forth. At the time of Archimedes writing of his quadrature of the parabola, the 338 E G D A I B C Fig. 84: Auxiliary diagram for the description of results on parabolic segments used in Archimedes’ proof. Refer to text. following facts were known to be true for every line segment AE on a parabola. See Fig 84. (i) The tangent to the point C on the parabola of largest distance from AE is parallel to AE. (ii) The parallel to the axis of the parabola through C halves every line segment BD between two points B and D on the parabola that is parallel to AB. (iii) If I, G are the points of intersection of the parallel to the axis through C with BD and AE, respectively, then CI CG BI q2 ppAG q2 . (3.3.1) Note that they imply that AT ¤ AP ¤ 2AT (3.3.2) where AP denotes the area of the parabolic segment ACE and AT denotes the area of the inscribed triangle with corners A, C and E. See Fig 85. 339 E G A M C L Fig. 85: The double of the area of the triangle ACE gives an upper bound for the area of the parabolic segment ACE. Refer to text. E F G H K A D J I B C Fig. 86: Archimedes’ construction of quadrature of the parabola including auxiliary lines (dashed) and points. Refer to text. 340 Archimedes did not prove these facts, but referred for such proofs to earlier works on conics by Euclid and Aristaeus. We will give proofs in Example 3.5.26 below using methods from analytical geometry. By help of this knowledge, Archimedes concluded that the areas of the triangles ABC, CDE are 1{4 of the areas of the triangles ACG and GCE, respectively. See Fig 86. This can seen as follows. We denote by I the intersection of the parallel to AE through B with the parallel to the axis through C. Note that Fig 86 suggests that its prolongation goes through the point D, but this will not be used in the following. That this is indeed the case will be side result of the proof. 
Further, we denote by G the intersection of AE with the parallel to the axis through C. Finally, we denote by J, H the intersections of the parallel to the axis of the parabola through B with AC and AE, respectively. Since this parallel halves AC, BH and CG as well as BI, HG are parallel, we conclude that AH HG 21 AG , BI HG , BH IG and hence by (3.3.1) that CI CG pHGq2 1 . BI q2 ppAG q2 p2HGq2 4 (3.3.3) Further, the triangles with corners AJH and ACG are similar. Hence JH CG 1 AH . AG 2 In particular, by help of the last and (3.3.3), it follows that BJ BH JH IG JH CG CI JH 34 CG JH 32 JH JH 21 JH . Hence the triangles ABC and ACH have the side AC in common and the corresponding height of the triangle ABC ( distance from AC to B) is 341 half of that corresponding height of the triangle ACH ( distance from AC to H). Hence the area of the triangle ABC is half the area of the triangle ACH. Now also the triangles ACH and ACG have the side AC in common and the corresponding height of the triangle ACH ( distance from AC to H) is half of that corresponding height of the triangle ACG ( distance from AC to G). Hence it follows that the area of the triangle ABC is 1{4 of the area of the triangle ACG. The reasoning is analogous for the areas of the triangles CDE and GCE, respectively. See Fig 86. For this, we denote by I the intersection of the parallel to AE through D with the parallel to the axis through C. Note that this definition of the point I could conflict with its previous definition. But only the last definition will be used in the following, and a by product of the proof is that these points indeed coincide. As before, we denote by G the intersection of AE with the parallel to the axis through C. Finally, we denote by K, F the intersections of the parallel to the axis of the parabola through D with CE and AE, respectively. 
Since this parallel halves CE and CG, DF as well as ID, GF are parallel, we conclude that GF FE 21 GE , ID GF , DF IG and hence by (3.3.1) that CI CG pGF q2 1 . IDq2 ppGE q2 p2GF q2 4 (3.3.4) Note that this implies, that both previous definitions of I coincide. Further, the triangles with corners FKE and GCE are similar. Hence KF CG FE GE 12 . In particular, by help of the last and (3.3.4), it follows that DK DF KF IG KF 34 CG KF 342 23 KF KF 12 KF . Hence the triangles CDE and CEF have the side CE in common and the corresponding height of the triangle CDE ( distance from CE to D) is half of that corresponding height of the triangle CEF ( distance from CE to F ). Hence the area of the triangle CDE is half the area of the triangle CEF . Now also the triangles CEF and GCE have the side CE in common and the corresponding height of the triangle CEF ( distance from CE to F ) is half of that corresponding height of the triangle GCE ( distance from CE to G). Hence it follows that the area of the triangle CDE is 1{4 of the area of the triangle GCE As a consequence, the sum of the areas of the triangles ABC and CDE is 1{4 of the area of the triangle ACE. Hence it follows that ¤ AP AT ¤ 2 A4T AT 4 and inductively that ¤ AP AT AT 4n 1 k ņ k 0 1 4 ¤ 2 4An T1 (3.3.5) for every n P N. At this point observes that 1 1 3 4n 1 1 4n 1 43 4n1 1 13 41n for every n P N which leads to 1 1 n 3 4 1 13 41n n¸1 k 0 ņ k 0 1 4 k 4n 1 1 k 1 4 343 1 1 n 3 4 1 ņ k 0 k 1 4 for every n P N. Hence it follows that ņ 1 1 3 4n k k 0 1 4 1 1 3 40 0̧ for every n P N. For every n P N, this leads to AT 4n 1 ¤ AP AT 4 3 k 0 1 1 3 4n k 1 4 43 ¤ 2 4An T1 which is equivalent to 7 AT AT AT 1 4 ¤ A P AT n 1 n 1 n 3 4 4 3 4 3 AT 1 10 AT AT ¤ 2 4n 1 3 4n 3 4n 1 . (3.3.6) Differently to Archimedes, we can conclude from this by help of Theorem 2.3.12 directly that 4 AP AT . 
That is, $A_P = \tfrac{4}{3} A_T$. Since the limit concept was not developed at that time, Archimedes had to employ the usual 'double reductio ad absurdum' argument for this, i.e., to lead both assumptions, that $A_P < 4A_T/3$ and that $A_P > 4A_T/3$, to a contradiction, which leaves only the option that $A_P = 4A_T/3$. This can be done as follows. First, we notice that $A_P \ge 4A_T/3$ according to (3.3.6). Therefore the assumption that $A_P = 4A_T/3 - \varepsilon$ for some $\varepsilon > 0$ contradicts (3.3.6). Second, we assume that $A_P = 4A_T/3 + \varepsilon$ for some $\varepsilon > 0$. Then it follows for $n \in \mathbb{N}$ satisfying
\[
n > \frac{\ln\!\left( \frac{10\, A_T}{3\, \varepsilon} \right)}{\ln 4}
\]
that
\[
A_P - \frac{4}{3} A_T > \frac{10}{3}\, \frac{A_T}{4^{n+1}} ,
\]
which contradicts (3.3.6). Hence the only remaining possibility is that $A_P = 4A_T/3$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis.

A modern way of stating Archimedes' result can be given as follows. Since it follows from (3.3.5) that
\[
\frac{1}{A_T} \left( A_P - 2\, \frac{A_T}{4^{n+1}} \right) \le \sum_{k=0}^n \left( \frac{1}{4} \right)^k \le \frac{1}{A_T} \left( A_P - \frac{A_T}{4^{n+1}} \right) \le \frac{A_P}{A_T} \tag{3.3.7}
\]
for every $n \in \mathbb{N}$, the sequence $S_0, S_1, \dots$, defined by
\[
S_n := \sum_{k=0}^n \left( \frac{1}{4} \right)^k
\]
for every $n \in \mathbb{N}$, is increasing and bounded from above by $A_P/A_T$ and hence convergent. In particular, it follows from (3.3.7) by Theorem 2.3.12 that
\[
A_P = A_T \lim_{n \to \infty} \sum_{k=0}^n \left( \frac{1}{4} \right)^k .
\]
In the following, the natural notation
\[
\sum_{k=0}^\infty \left( \frac{1}{4} \right)^k := \lim_{n \to \infty} \sum_{k=0}^n \left( \frac{1}{4} \right)^k
\]
will be used and referred to as the 'sum of the sequence $x_0, x_1, \dots$', defined by
\[
x_k := \left( \frac{1}{4} \right)^k
\]
for every $k \in \mathbb{N}$. In addition, the sequence $S_0, S_1, \dots$ will be called 'the sequence of partial sums of $x_0, x_1, \dots$'. Sequences of partial sums are also called 'series'. In this sense, Archimedes calculates the sum of the sequence $1, q, q^2, \dots$ for the case $q = 1/4$, which is given by $4/3$. The series corresponding to the sequences $1, q, q^2, \dots$, where $q$ runs through all real numbers, are called 'geometric series'.

Definition 3.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$. We say that $x_1, x_2, \dots$ is summable if the corresponding sequence of partial sums $S_1, S_2, \dots$
, defined by
\[
S_n := \sum_{k=1}^n x_k \tag{3.3.8}
\]
for every $n \in \mathbb{N}^*$, is convergent to some real number. In this case, the sum of $x_1, x_2, \dots$ is denoted by
\[
\sum_{k=1}^\infty x_k . \tag{3.3.9}
\]
Otherwise, we say that $x_1, x_2, \dots$ is not summable. The sequence in (3.3.8) is also called a series and, in case of its convergence, a convergent series with its sum denoted by (3.3.9). In case of its divergence, that series is called divergent.

In the following, we give two examples of series that play an important role in the analysis of the convergence of series: the geometric series and the harmonic series. The former contain a real parameter. The corresponding geometric series converges if and only if the absolute value of that parameter is smaller than 1. The harmonic series is divergent.

Example 3.3.2. (Geometric series) Let $x \in \mathbb{R}$. In the following, we use the convention that $x^0 := 1$. Show that the so-called geometric series $S_0, S_1, \dots$, defined by
\[
S_n := \sum_{k=0}^n x^k
\]
for every $n \in \mathbb{N}$, is convergent if and only if $|x| < 1$. In the last case, show that
\[
\sum_{k=0}^\infty x^k = \frac{1}{1-x} .
\]
Solution: Note that in the case $x = 1$, it follows that $S_n = n + 1$ and hence the divergence of the corresponding geometric series. For $x \neq 1$, it follows that
\[
x\, S_n = \sum_{k=0}^n x^{k+1} = \sum_{k=1}^{n+1} x^k = S_n - 1 + x^{n+1}
\]
and hence that
\[
S_n = \frac{1 - x^{n+1}}{1 - x} .
\]
As a consequence, the sequence of partial sums is convergent if and only if $|x| < 1$, and in this case
\[
\sum_{k=0}^\infty x^k = \lim_{n \to \infty} S_n = \frac{1}{1-x} .
\]

Fig. 87: Partial sums of the harmonic series and graphs of $\ln$ and $\tfrac{1}{2} + [2 \ln(2)]^{-1} \ln$.

Example 3.3.3. (Harmonic series) Show that the harmonic series, defined by
\[
S_n := \sum_{k=1}^n \frac{1}{k}
\]
for every $n \in \mathbb{N}^*$, is divergent.

Solution: For every $n \in \mathbb{N} \setminus \{0, 1\}$, it follows that
\[
\sum_{k=1}^{2^n} \frac{1}{k} \ge \sum_{k=1}^{2^0} \frac{1}{k} + \sum_{k=2^1+1}^{2^2} \frac{1}{k} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{k}
\ge \sum_{k=1}^{2^0} \frac{1}{k} + \sum_{k=2^1+1}^{2^2} \frac{1}{2^2} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{2^n}
\]
\[
= 1 + (2^2 - 2^1)\, \frac{1}{2^2} + \dots + (2^n - 2^{n-1})\, \frac{1}{2^n} = 1 + \frac{n-1}{2} = \frac{n+1}{2} = \frac{1}{2} + \frac{\ln(2^n)}{2 \ln(2)}
\]
and hence the divergence of the harmonic series.

Remark 3.3.4.
Note that, because series are sequences of partial sums, we can apply the limit laws of Theorem 2.3.4 to series.

Often, a given series consists of the partial sums corresponding to a sequence of the form $f(1), f(2), \dots$ where $f : [1,\infty) \to \mathbb{R}$ is some function. For instance, in the case of a geometric series corresponding to $q > 0$, such a function is given by
\[
f(x) := e^{(x-1) \ln q}
\]
for $x \ge 1$, and in the case of the harmonic series, such a function is given by
\[
f(x) := \frac{1}{x}
\]
for $x \ge 1$. We note that in such a case, the sequence of partial sums of $f(1), f(2), \dots$, defined by
\[
\sum_{k=1}^n f(k)
\]
for every $n \in \mathbb{N}^*$, has the form of a Riemann sum, i.e., the form of sums used in the definition of the Riemann integral, corresponding to a decomposition of $[1,\infty)$ into the intervals $[1,2], [2,3], \dots$ of length 1. Hence, we would expect that there is a relationship between the existence of the improper Riemann integral of $f$ and the convergence of the series. Indeed, this is true for a particular class of functions $f$.

Theorem 3.3.5. (Integral test) Let $f : [1,\infty) \to \mathbb{R}$ be positive, decreasing and almost everywhere continuous. Then $f(1), f(2), \dots$ is summable if and only if $f$ is improper Riemann-integrable. In this case,
\[
\int_1^\infty f(x)\,dx \le \sum_{k=1}^\infty f(k) \le f(1) + \int_1^\infty f(x)\,dx \tag{3.3.10}
\]
as well as
\[
\int_{m+1}^\infty f(x)\,dx \le \sum_{k=m+1}^\infty f(k) \le \int_m^\infty f(x)\,dx \tag{3.3.11}
\]
for every $m \in \mathbb{N}^*$.

Proof. For this, we define the auxiliary function $g : [1,\infty) \to \mathbb{R}$ by $g(1) := f(2)$ as well as $g(x) := f(k+1)$ for all $x \in (k, k+1]$ and $k \in \mathbb{N}^*$. Then
\[
\int_m^{n+1} g(x)\,dx = \sum_{k=m}^n \int_k^{k+1} g(x)\,dx = \sum_{k=m+1}^{n+1} f(k)
\]
for every $m, n \in \mathbb{N}^*$ such that $m \le n$. If $f$ is improper Riemann-integrable, it follows because of $|g| \le f$ and by Theorem 3.2.6 that $g$ is improper Riemann-integrable and hence that $f(1), f(2), \dots$ is summable and
\[
\sum_{k=m+1}^\infty f(k) \le \int_m^\infty f(x)\,dx
\]
for every $m \in \mathbb{N}^*$. If, on the other hand, $f(1), f(2), \dots$ is summable, we define the auxiliary function $h : [1,\infty) \to \mathbb{R}$ by $h(x) := f(k)$ for all $x \in [k, k+1)$ and $k \in \mathbb{N}^*$.
Then
\[
\int_m^x f(y)\,dy \le \int_m^x h(y)\,dy \le \sum_{k=m}^\infty f(k)
\]
for every $m \in \mathbb{N}^*$ and $x \in [m,\infty)$. Hence it follows by Theorem 3.2.6 that $f$ is improper Riemann-integrable and that
\[
\sum_{k=m}^\infty f(k) \ge \int_m^\infty f(y)\,dy
\]
for every $m \in \mathbb{N}^*$.

Remark 3.3.6. Note that (3.3.11) can be used to estimate the remainder terms of the series.

The following two examples give applications of the integral test to further series that play an important role in the analysis of the convergence of series. In particular, the following example defines Riemann's zeta function, which has important applications in the description of the distribution of the prime numbers. Further applications are in quantum statistical physics and quantum field theory. Finally, there is a famous problem concerning the zeros of the extension of Riemann's zeta function to complex numbers. All even integers that are smaller than 0 are zeros of that extension. Riemann's conjecture from 1859 claims that all other zeros have the real part $1/2$. It is not yet known whether this is true. The solution to this problem would have profound consequences in the theory of numbers.

Example 3.3.7. (Riemann's zeta function) Show that by
\[
\zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s}
\]
for every $s \in (1,\infty)$ there is defined a function $\zeta : (1,\infty) \to \mathbb{R}$. This function is called Riemann's zeta function.

Solution: For every $s \in (1,\infty)$, the corresponding function $f_s : [1,\infty) \to \mathbb{R}$, defined by $f_s(x) := 1/x^s$ for every $x \ge 1$, is positive, decreasing and continuous and, by Example 3.2.5, improper Riemann-integrable. Hence the statement follows from Theorem 3.3.5. In addition, it follows by (3.3.10) that
\[
\frac{1}{s-1} = \int_1^\infty f_s(x)\,dx \le \zeta(s) \le 1 + \int_1^\infty f_s(x)\,dx = \frac{s}{s-1} .
\]

Fig. 88: Graphs of $\zeta$ (black), $1/(s-1)$ (blue) and $s/(s-1)$ (red).

Fig. 89: Partial sums of the series from Example 3.3.8 for the case $p = 1$.

Example 3.3.8. Let $p \ge 1$. Determine whether the sequence $a_2, a_3, \dots$ defined by
\[ a_n := \sum_{k=2}^{n} \frac{1}{k\,(\ln(k))^p} \]
for every $n \in \mathbb{N} \setminus \{0,1\}$ is convergent or divergent. Solution: For this, we define the auxiliary function $h : [2,\infty) \to \mathbb{R}$ by $h(x) := x\,(\ln(x))^p$ for every $x \geq 2$. Then $h$ is strictly positive, strictly increasing and continuous and hence $f : [1,\infty) \to \mathbb{R}$ defined by $f(x) := 1/[(x+1)(\ln(x+1))^p]$ for every $x \geq 1$ is positive, strictly decreasing and continuous. Further, for $p = 1$:
\[ \int_1^n \frac{dx}{(x+1)\ln(x+1)} = \ln(\ln(n+1)) - \ln(\ln(2)) \]
for every $n \in \mathbb{N}^*$. Using that $\ln(\ln(2^m)) = \ln(m \ln(2))$ for every $m \in \mathbb{N}^*$, it follows that $f$ is not improper Riemann-integrable. Hence the divergence of the corresponding sequence $a_2, a_3, \dots$ follows by Theorem 3.3.5. For $p > 1$ it follows that
\[ \int_1^x \frac{dy}{(y+1)(\ln(y+1))^p} = \frac{1}{1-p}\left[ (\ln(x+1))^{1-p} - (\ln(2))^{1-p} \right] \]
for every $x \geq 1$ and hence the improper Riemann-integrability of $f$ and by Theorem 3.3.5 the convergence of the corresponding sequence $a_2, a_3, \dots$.

The following comparison test is often applied to decide the convergence of a given series. For motivation, we investigate the convergence of the series $S_1, S_2, \dots$ defined by
\[ S_n := \sum_{k=1}^{n} \frac{1}{k^2 + 2} \]
for all $n \in \mathbb{N}^*$. A basic strategy in the solution of any problem is to investigate whether that problem has a peculiarity that prevents its immediate solution. Indeed, without the addition of 2 in the denominator of the summands, $S_1, S_2, \dots$ would coincide with the zeta series corresponding to $s = 2$ which was shown to converge. In such cases, it is often possible to reduce, in some sense, the solution of the given problem to the solution of the simpler problem. For instance in this case, we notice that
\[ S_n = \sum_{k=1}^{n} \frac{1}{k^2 + 2} \leq \sum_{k=1}^{n} \frac{1}{k^2} \leq \sum_{k=1}^{\infty} \frac{1}{k^2} = \zeta(2) \]
for every $n \in \mathbb{N}^*$. Hence $S_1, S_2, \dots$ is an increasing sequence that is bounded from above and therefore convergent (with a sum that is smaller than $\zeta(2)$). The following theorem generalizes this method of comparison of series.

Theorem 3.3.9. (Comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of positive real numbers.
Further, let $x_n \leq c\,y_n$ for all $n \in \{N, N+1, \dots\}$ where $c \geq 0$ and $N$ is some element of $\mathbb{N}^*$. If $y_1, y_2, \dots$ is summable, then $x_1, x_2, \dots$ is summable, too.

Proof. If $y_1, y_2, \dots$ is summable, it follows that
\[ \sum_{k=N}^{n} x_k \leq \sum_{k=N}^{n} c\,y_k = c \sum_{k=N}^{n} y_k \leq c \sum_{k=1}^{\infty} y_k \]
for every $n \in \mathbb{N}^*$ satisfying $n \geq N$. Hence the sequence of partial sums of $x_1, x_2, \dots$ is increasing (since $x_k \geq 0$ for all $k \in \mathbb{N}^*$) and bounded from above by $\sum_{k=1}^{N-1} x_k + c \sum_{k=1}^{\infty} y_k$ and therefore convergent.

In Example 3.3.3, we proved that the harmonic series is divergent by showing that
\[ \sum_{k=1}^{2^n} \frac{1}{k} \geq \frac{\ln(2^n) + \ln(2)}{2 \ln(2)} \]
for every $n \in \mathbb{N}$. In addition, Fig. 87 supported the validity of the more general estimate
\[ \sum_{k=1}^{n} \frac{1}{k} \geq \frac{\ln(n) + \ln(2)}{2 \ln(2)} \]
for every $n \in \mathbb{N}^*$. The last could indicate a logarithmic increase of the partial sums of the harmonic series with the number of summands. Indeed, as another application of the previous theorem, the following example proves the more precise statement that
\[ \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right) = \gamma \]
where $\gamma$ is a real number in the interval $[0,1]$ called Euler's constant. To seven decimal places, $\gamma$ is given by 0.5772156.

Example 3.3.10. Show that the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \]
for all $n \in \mathbb{N}^*$ is convergent. See Fig. 87. Solution: For this, we define an auxiliary sequence $b_1, b_2, \dots$ by
\[ b_n := \sum_{k=1}^{n} \frac{1}{k} - \ln(n+1) = a_n + \ln(n) - \ln(n+1) = a_n - \ln\left( \frac{n+1}{n} \right) \]
for all $n \in \mathbb{N}^*$. Then
\[ b_{n+1} - b_n = \frac{1}{n+1} - \ln\left( \frac{n+2}{n+1} \right) = \frac{1}{n+1} - \int_0^1 \frac{dx}{x+n+1} = \frac{1}{n+1} \int_0^1 \frac{x}{x+n+1}\,dx \]
and hence
\[ 0 \leq b_{n+1} - b_n \leq \frac{1}{(n+1)^2} \]
for all $n \in \mathbb{N}^*$. Therefore $b_1, b_2, \dots$ is increasing and bounded from above since
\[ b_n = b_1 + \sum_{k=1}^{n-1} (b_{k+1} - b_k) \leq b_1 + \sum_{k=1}^{\infty} \frac{1}{k^2} \]
for all $n \in \mathbb{N} \setminus \{0,1\}$. Hence $b_1, b_2, \dots$ and $a_1, a_2, \dots$ are convergent. The constant
\[ \gamma := \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right) \]
is known as Euler's constant. Presently, it is not yet known whether it is rational or irrational.
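The convergence in Example 3.3.10 can be observed numerically. The following is a minimal Python sketch, not part of the original text; the function name `euler_sequences` and the sample values of $n$ are illustrative assumptions. It evaluates $a_n$ and $b_n$ for a few $n$. One would expect the values of $b_n$ to increase and those of $a_n$ to decrease towards a common value near 0.5772156, with $a_n - b_n = \ln(1 + 1/n)$.

```python
import math


def euler_sequences(n):
    """Return (a_n, b_n) from Example 3.3.10 for an integer n >= 1.

    a_n = 1 + 1/2 + ... + 1/n - ln(n)
    b_n = 1 + 1/2 + ... + 1/n - ln(n + 1)
    """
    # fsum limits rounding error when adding many small terms.
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n), harmonic - math.log(n + 1)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10000):
        a_n, b_n = euler_sequences(n)
        print(f"n = {n:5d}   b_n = {b_n:.7f}   a_n = {a_n:.7f}")
```

Since $b_n \leq \gamma \leq a_n$, each pair of printed values brackets Euler's constant, and the width of the bracket shrinks roughly like $1/n$.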
Since
\[ \ln(n+1) = \int_1^{n+1} \frac{dx}{x} = \sum_{k=1}^{n} \int_k^{k+1} \frac{dx}{x} \leq \sum_{k=1}^{n} \frac{1}{k} , \qquad \ln(n) = \int_1^{n} \frac{dx}{x} = \sum_{k=1}^{n-1} \int_k^{k+1} \frac{dx}{x} \geq \sum_{k=1}^{n-1} \frac{1}{k+1} = \sum_{k=1}^{n} \frac{1}{k} - 1 , \]
it follows that
\[ 0 \leq \ln\left( 1 + \frac{1}{n} \right) \leq a_n \leq 1 \]
for every $n \in \mathbb{N} \setminus \{0,1\}$ and hence that $0 \leq \gamma \leq 1$. To seven decimal places it is given by 0.5772156.

The following example derives Weierstrass' representation of the gamma function as a simple consequence of the previous result and Gauss' representation (3.2.14) of the gamma function.

Example 3.3.11. (Weierstrass' representation of the gamma function) Show Weierstrass' representation of the gamma function
\[ \frac{1}{\Gamma(x)} = x\,e^{\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} \left( 1 + \frac{x}{k} \right) e^{-x/k} \tag{3.3.12} \]
for every $x > 0$ where $\gamma$ is Euler's constant. Solution: For this, let $x > 0$. According to (3.2.14), $\Gamma(x)$ is given by
\[ \Gamma(x) = \lim_{n \to \infty} \frac{n^x\,n!}{x\,(x+1) \cdots (x+n)} = x^{-1} \lim_{n \to \infty} n^x \prod_{k=1}^{n} \left( 1 + \frac{x}{k} \right)^{-1}. \]
Further,
\[ n^x = \exp(x \ln(n)) = \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \exp\left( \sum_{k=1}^{n} \frac{x}{k} \right) = \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} e^{x/k}. \]
Hence it follows that
\[ \Gamma(x) = x^{-1} \lim_{n \to \infty} \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} \frac{e^{x/k}}{1 + \frac{x}{k}} = x^{-1} e^{-\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} \frac{e^{x/k}}{1 + \frac{x}{k}} \]
which implies (3.3.12).

The following comparison test is a simple consequence of the comparison test from Theorem 3.3.9.

Theorem 3.3.12. (Limit comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of positive real numbers. Further, let
\[ \lim_{n \to \infty} \frac{x_n}{y_n} = 1. \]
(Note that this implies that $y_n > 0$ for all $n \in \{N, N+1, \dots\}$ and some $N \in \mathbb{N}^*$.) Then $x_1, x_2, \dots$ is summable if and only if $y_1, y_2, \dots$ is summable.

Proof. Since $\lim_{k \to \infty} (x_k / y_k) = 1$, there is $N \in \mathbb{N}^*$ satisfying $N \geq 2$ and such that
\[ \frac{1}{2} \leq \frac{x_k}{y_k} \leq \frac{3}{2} \]
for all $k \in \mathbb{N}^*$ such that $k \geq N$. In particular, this implies that
\[ 0 \leq y_k \leq 2 x_k , \qquad 0 \leq x_k \leq \frac{3 y_k}{2} \]
for all $k \in \mathbb{N}^*$ such that $k \geq N$. Hence it follows by help of Theorem 3.3.9 that the sequence $x_N, x_{N+1}, \dots$ is summable if and only if $y_N, y_{N+1}, \dots$ is summable. Since
\[ \sum_{k=1}^{n} x_k = \sum_{k=1}^{N-1} x_k + \sum_{k=N}^{n} x_k , \qquad \sum_{k=1}^{n} y_k = \sum_{k=1}^{N-1} y_k + \sum_{k=N}^{n} y_k \]
for every $n \in \mathbb{N}^*$ satisfying $n \geq N$, the last also implies the statement of the theorem.

In the following, we give three typical applications of the previous comparison tests. In particular, the two subsequent examples study series which are frequently used in the analysis of the convergence of given series. Furthermore, the following example, along with the fact that the sequence whose members are all equal to 1 is not summable, shows that the sequence $1, 1/2^s, 1/3^s, \dots$ is not summable for $s \leq 1$ and hence that $\zeta(s)$ cannot be defined for $s \leq 1$ in the same way as for $s > 1$.

Example 3.3.13. Let $p < 1$. Determine whether the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \frac{1}{n^p} \]
for all $n \in \mathbb{N}^*$ is summable. Solution: Since $p < 1$, it follows that
\[ \frac{1}{n^p} \geq \frac{1}{n} \]
for all $n \in \mathbb{N}^*$ and hence by Theorem 3.3.9 and Example 3.3.3 that $a_1, a_2, \dots$ is not summable.

Example 3.3.14. Let $p < 1$. Determine whether the sequence $a_2, a_3, \dots$ defined by
\[ a_n := \sum_{k=2}^{n} \frac{1}{k\,(\ln(k))^p} \]
for every $n \in \mathbb{N} \setminus \{0,1\}$ is convergent or divergent. Solution: Since $p < 1$, it follows for every $k \geq 3$ that
\[ (\ln(k))^{1-p} \geq 1^{1-p} = 1 , \]
and hence that
\[ \frac{1}{k\,(\ln(k))^p} \geq \frac{1}{k \ln(k)}. \]
Hence it follows by Theorem 3.3.9 and Example 3.3.8 that the sequence $a_2, a_3, \dots$ is divergent.

Example 3.3.15. Determine whether the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \frac{3n^2 + n + 1}{(n^5 + 2)^{1/2}} \]
for all $n \in \mathbb{N}^*$ is summable. Solution: Define
\[ b_n := \frac{3}{n^{1/2}} \]
for all $n \in \mathbb{N}^*$. Then $b_1, b_2, \dots$ is not summable according to Example 3.3.13. In addition, it follows that
\[ \lim_{n \to \infty} \frac{a_n}{b_n} = 1 \]
and hence by Theorem 3.3.12 also that $a_1, a_2, \dots$ is not summable.

In the 17th and 18th centuries, it was generally assumed that a reordering of the members of a sequence leads to a sequence which is summable if and only if the same is true for the original sequence, and that in that case the sums of both sequences coincide. Indeed, we will see in the following that this is true for absolutely summable sequences that include sequences of positive ($\geq 0$) real numbers.
On the other hand, we will also see that the above statement is false in more general cases. This false belief led to contradictions which plagued the calculus in those centuries. We present one example from that time [60] of a too naive handling of series resulting from a reordering of a sequence whose members alternate in sign. Since 1668 [78], it was known that
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \ln(2) , \]
a fact that will be proved in Example 3.4.19. On the other hand, it was argued that therefore
\[ \ln(2) = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots \]
\[ = \left( 1 + \frac{1}{3} + \frac{1}{5} + \dots \right) + \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) - 2 \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) \]
\[ = \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots \right) - \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots \right) = 0. \]
It should be noted that already the second line in the above 'derivation' cannot be concluded by the limit laws because all three series inside the brackets diverge. Hence the above can also be viewed as a classic example of the false treatment of $\infty$ as a real number which was quite common at that time. The discovery of such apparent contradictions contributed essentially to a re-examination and rigorous founding of the theory of infinite series.

A simple example for the fact that the reordering of a sequence can affect its sum is the following. For this, we consider a reordering of the sequence $a_1, a_2, \dots$ defined by
\[ a_k := \frac{(-1)^{k+1}}{k} \]
for every $k \in \mathbb{N}^*$. The partial sums of this sequence are called the alternating harmonic series whose sum was also considered in the above 'derivation'.

Example 3.3.16. (A rearrangement of the alternating harmonic series) For this, we define the sequence $a_1, a_2, \dots$ by
\[ a_k := \frac{(-1)^{k+1}}{k} \]
for every $k \in \mathbb{N}^*$, and the sequence $b_1, b_2, \dots$ by
\[ b_{3k-2} := \frac{1}{4k-3} = \frac{(-1)^{4k-2}}{4k-3} = a_{4k-3} , \]
\[ b_{3k-1} := \frac{1}{4k-1} = \frac{(-1)^{4k}}{4k-1} = a_{4k-1} , \]
\[ b_{3k} := -\frac{1}{2k} = \frac{(-1)^{2k+1}}{2k} = a_{2k} \]
for every $k \in \mathbb{N}^*$.

Fig. 90: Partial sums of the alternating harmonic series and its rearrangement from Example 3.3.16.

From the last, we conclude that the sequence $b_1, b_2, \dots$ contains only members of the sequence $a_1, a_2, \dots$.
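The two sequences just defined can be compared numerically, in the spirit of Fig. 90. The following is a minimal Python sketch, not part of the original text; the function names `a`, `b` and `partial_sum` and the chosen cut-offs are illustrative assumptions. It only evaluates finitely many partial sums and therefore proves nothing about convergence.

```python
import math


def a(k):
    """k-th member (k = 1, 2, ...) of the alternating harmonic sequence."""
    return (-1) ** (k + 1) / k


def b(j):
    """j-th member (j = 1, 2, ...) of the rearranged sequence of Example 3.3.16."""
    # Write j + 2 = 3k + r with r in {0, 1, 2}, i.e. j = 3k - 2 + r.
    k, r = divmod(j + 2, 3)
    if r == 0:
        return a(4 * k - 3)   # b_{3k-2} = a_{4k-3}
    if r == 1:
        return a(4 * k - 1)   # b_{3k-1} = a_{4k-1}
    return a(2 * k)           # b_{3k}   = a_{2k}


def partial_sum(term, n):
    """Return term(1) + ... + term(n)."""
    return math.fsum(term(k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (9, 30, 300, 3000):
        print(f"n = {n:4d}   sum a_k = {partial_sum(a, n):.6f}"
              f"   sum b_k = {partial_sum(b, n):.6f}")
```

In such a run, the partial sums of $a_1, a_2, \dots$ stay near $\ln(2) \approx 0.6931$, whereas those of $b_1, b_2, \dots$ stay above $5/6$ and settle near $1.0397$. The latter agrees with $\frac{3}{2} \ln(2)$, the value usually quoted for this rearrangement pattern; that value is an outside fact and is not derived in the text at this point.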
The fact that it contains all of them can be seen as follows. For this, let $k \in \mathbb{N}^*$. If $k$ is even, then
\[ b_{3(k/2)} = a_{2(k/2)} = a_k. \]
If $k$ is odd, then there is $l \in \mathbb{N}^*$ such that $k = 2l - 1$. If $l$ is even, then
\[ b_{3(l/2)-1} = a_{4(l/2)-1} = a_{2l-1} = a_k. \]
Finally, if $l$ is odd, then
\[ b_{3((l+1)/2)-2} = a_{4((l+1)/2)-3} = a_{2l-1} = a_k. \]
Hence, $b_1, b_2, \dots$ is a reordering of $a_1, a_2, \dots$. The ninth partial sum corresponding to $a_1, a_2, \dots$ is given by
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} , \]
whereas the ninth partial sum corresponding to $b_1, b_2, \dots$ is given by
\[ 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6}. \]
Assuming the convergence of the alternating harmonic series which is proved in Example 3.3.19, it follows that
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} < 1 - \frac{1}{2} + \frac{1}{3} = \frac{5}{6}. \]
Further, because of
\[ b_{3k-2} + b_{3k-1} + b_{3k} = \frac{8k-3}{2k(4k-3)(4k-1)} > 0 \]
for every $k \in \mathbb{N}^*$, it follows that
\[ \sum_{k=1}^{3n} b_k \geq \frac{5}{6} \]
for every $n \in \mathbb{N}^*$, with strict inequality for $n \geq 2$. Therefore, either $b_1, b_2, \dots$ is not summable (!), or
\[ \sum_{k=1}^{\infty} b_k > \frac{5}{6} > \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}. \quad (!) \]

In the following, we continue the study of series with a view to sums of alternating sequences. Any sequence $y_1, y_2, \dots$ of real numbers can be represented in the equivalent form $x_1 |y_1|, x_2 |y_2|, \dots$ where the sequence $x_1, x_2, \dots$ assumes values in $\{-1, 1\}$. In this sense, $y_1, y_2, \dots$ is always a product of a bounded sequence that describes sign changes and a sequence of positive numbers. In the case that the partial sums of $x_1, x_2, \dots$ stay bounded, as is the case for alternating $y_1, y_2, \dots$, consideration of this product structure is helpful in the analysis of the convergence of the series that corresponds to $y_1, y_2, \dots$. The basis for such analysis is provided by the following summation by parts formula which resembles the formula for partial integration.

Theorem 3.3.17. (Summation by parts) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of real numbers and $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$.
Then
\[ \sum_{k=m}^{n} x_k y_k = (S_n - c)\,y_{n+1} - (S_{m-1} - c)\,y_m - \sum_{k=m}^{n} (S_k - c)(y_{k+1} - y_k) \]
for all $m, n \in \mathbb{N}^*$ such that $n \geq m$ and all $c \in \mathbb{R}$ where we define $S_0 := 0$.

Proof. It follows for all $m, n \in \mathbb{N}^*$ such that $n \geq m$ that
\[ \sum_{k=m}^{n} x_k y_k = \sum_{k=m}^{n} (S_k - S_{k-1})\,y_k = \sum_{k=m}^{n} S_k y_k - \sum_{k=m-1}^{n-1} S_k y_{k+1} \]
\[ = \sum_{k=m}^{n} S_k y_k - \sum_{k=m}^{n} S_k y_{k+1} + S_n y_{n+1} - S_{m-1} y_m = S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} S_k (y_{k+1} - y_k) \]
\[ = S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k - c)(y_{k+1} - y_k) - c \sum_{k=m}^{n} (y_{k+1} - y_k) \]
\[ = S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k - c)(y_{k+1} - y_k) - c\,y_{n+1} + c\,y_m \]
\[ = (S_n - c)\,y_{n+1} - (S_{m-1} - c)\,y_m - \sum_{k=m}^{n} (S_k - c)(y_{k+1} - y_k). \]

The following Dirichlet's test is mainly a consequence of the summation by parts formula. This test is frequently used in connection with the summation of alternating sequences, also because it provides a very simple estimate of the error resulting from the truncation of the series after finitely many terms.

Theorem 3.3.18. (Dirichlet's test) Let $x_1, x_2, \dots$ be a sequence of real numbers such that its partial sums form a bounded sequence and $y_1, y_2, \dots$ be a decreasing sequence of real numbers such that $\lim_{k \to \infty} y_k = 0$. Then the sequence $x_1 y_1, x_2 y_2, \dots$ is summable,
\[ \sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \tag{3.3.13} \]
and for every $n \in \mathbb{N}^*$
\[ \left| \sum_{k=n+1}^{\infty} x_k y_k \right| \leq (M_2 - M_1)\,y_{n+1} \tag{3.3.14} \]
where $M_1, M_2 \in \mathbb{R}$ are a lower bound and upper bound, respectively, of the partial sums of $x_1, x_2, \dots$.

Proof. For this let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$ and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Then by Theorem 3.3.17
\[ \sum_{k=1}^{n} x_k y_k = (S_n - M_1)\,y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \]
as well as
\[ 0 \leq \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \leq \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - M_1)(y_1 - y_{n+1}) \leq (M_2 - M_1)\,y_1 \]
for all $n \in \mathbb{N}^*$. Therefore the sequence
\[ \sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}) , \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}) , \ \dots \]
is increasing as well as bounded from above and hence convergent. Therefore, since $\lim_{k \to \infty} y_k = 0$, it follows the summability of $x_1 y_1, x_2 y_2, \dots$ and (3.3.13).

Fig. 91: Graph of an extended Riemann's zeta function $\zeta$.
Finally, it follows for every $n \in \mathbb{N}^*$ that
\[ \sum_{k=n+1}^{\infty} x_k y_k = -(S_n - M_1)\,y_{n+1} + \sum_{k=n+1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \]
and hence
\[ \sum_{k=n+1}^{\infty} x_k y_k \geq -(S_n - M_1)\,y_{n+1} \geq -(M_2 - M_1)\,y_{n+1} , \]
\[ \sum_{k=n+1}^{\infty} x_k y_k \leq -(S_n - M_1)\,y_{n+1} + \sum_{k=n+1}^{\infty} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - S_n)\,y_{n+1} \leq (M_2 - M_1)\,y_{n+1} \]
and (3.3.14).

Example 3.3.19. Let $s > 0$. Determine whether the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \frac{(-1)^{n-1}}{n^s} \]
for all $n \in \mathbb{N}^*$ is summable. Solution: Define
\[ x_n := (-1)^{n-1} , \qquad y_n := \frac{1}{n^s} \]
for all $n \in \mathbb{N}^*$. Then the partial sums $S_1, S_2, \dots$ of $x_1, x_2, \dots$ oscillate between 0 and 1 and $y_1, y_2, \dots$ is decreasing as well as convergent to 0. Hence by Theorem 3.3.18 $a_1, a_2, \dots$ is summable and
\[ \sum_{k=1}^{\infty} a_k = \sum_{k=0}^{\infty} \left[ \frac{1}{(2k+1)^s} - \frac{1}{(2k+2)^s} \right] = \sum_{k=0}^{\infty} \left[ \frac{1}{(2k+1)^s} + \frac{1}{(2k+2)^s} \right] - 2 \sum_{k=0}^{\infty} \frac{1}{(2k+2)^s} = \zeta(s) - 2^{1-s} \zeta(s) = (1 - 2^{1-s})\,\zeta(s) \]
if, in addition, $s > 1$. Note that the last formula can be, and is, used to define $\zeta$ on $(0,1)$. See Fig. 91.

In some cases where Dirichlet's test cannot be applied, Abel's test is of use. Abel's test, too, is mainly a consequence of the summation by parts formula.

Theorem 3.3.20. (Abel's test) Let $x_1, x_2, \dots$ be a summable sequence of real numbers and $y_1, y_2, \dots$ a decreasing convergent sequence of real numbers. Then the sequence $x_1 y_1, x_2 y_2, \dots$ is summable and
\[ \sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \left( \sum_{k=1}^{\infty} x_k - M_1 \right) \lim_{k \to \infty} y_k + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \tag{3.3.15} \]
where $M_1 \in \mathbb{R}$ is a lower bound of the partial sums of $x_1, x_2, \dots$.

Proof. For this, let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$ and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Further, let $M_3, M_4 \in \mathbb{R}$ be lower and upper bounds, respectively, of $y_1, y_2, \dots$. Then by Theorem 3.3.17
\[ \sum_{k=1}^{n} x_k y_k = (S_n - M_1)\,y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \]
as well as
\[ 0 \leq \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \leq \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - M_1)(y_1 - y_{n+1}) \leq (M_2 - M_1)(M_4 - M_3) \]
for all $n \in \mathbb{N}^*$. Therefore, the sequence
\[ \sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}) , \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}) , \ \dots \]
is increasing as well as bounded from above and hence convergent. Finally, it follows the summability of $x_1 y_1, x_2 y_2, \dots$
and (3.3.15) by the limit laws for sequences.

The following example gives an application of Abel's test.

Example 3.3.21. Show that the sequence $a_1, a_2, \dots$ defined by
\[ a_{2n-1} := \frac{n-1}{n^2} , \qquad a_{2n} := -\frac{1}{n+1} \]
for every $n \in \mathbb{N}^*$ is summable. Solution: We note that
\[ |a_{2n}| - |a_{2n-1}| = \frac{1}{n+1} - \frac{n-1}{n^2} = \frac{1}{n^2 (n+1)} > 0 , \]
\[ |a_{2n}| - |a_{2n+1}| = \frac{1}{n+1} - \frac{n}{(n+1)^2} = \frac{1}{(n+1)^2} > 0 \]
for all $n \in \mathbb{N}^*$ and hence that the sequence $|a_1|, |a_2|, \dots$ is neither decreasing nor increasing. Hence Dirichlet's test cannot be directly applied. On the other hand, Abel's test can be applied successfully as follows. For this, we define
\[ x_1 := 1 , \quad x_{2n} := -\frac{1}{n} , \quad x_{2n+1} := \frac{1}{n+1} , \qquad y_1 := 0 , \quad y_{2n} := \frac{n}{n+1} , \quad y_{2n+1} := \frac{n}{n+1} \]
for every $n \in \mathbb{N}^*$. Then
\[ x_1 y_1 = 0 = a_1 , \qquad x_{2n} y_{2n} = -\frac{1}{n} \cdot \frac{n}{n+1} = -\frac{1}{n+1} = a_{2n} , \qquad x_{2n+1} y_{2n+1} = \frac{1}{n+1} \cdot \frac{n}{n+1} = \frac{n}{(n+1)^2} = a_{2n+1} \]
for all $n \in \mathbb{N}^*$. The partial sums of the sequence $x_1, x_2, \dots$ are given by
\[ \sum_{k=1}^{2n-1} x_k = \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n-1} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n-1} \frac{1}{k} = \frac{1}{n} , \]
\[ \sum_{k=1}^{2n} x_k = \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k} = 0 , \]
for every $n \in \mathbb{N}^*$ such that $n \geq 2$, and hence $x_1, x_2, \dots$ is summable. Further, $-y_1, -y_2, \dots$ is decreasing and convergent to $-1$. Hence it follows by Abel's test that $-a_1, -a_2, \dots$ is summable. Therefore, $a_1, a_2, \dots$ is summable, too.

Fig. 92: Sequences of absolute values and partial sums of the sequence from Example 3.3.21.

In the following, we define and study absolutely summable sequences. Any reordering of such a sequence leads to a convergent series whose sum coincides with the sum of the original series. In applications mainly absolutely summable sequences occur. Exceptions are rare. One such exception is described in [9].

Definition 3.3.22. (Absolute summability) A sequence $x_1, x_2, \dots$ of real numbers is said to be absolutely summable if the corresponding sequence $|x_1|, |x_2|, \dots$ is summable. It is called conditionally summable if it is summable, but $|x_1|, |x_2|, \dots$ is not.
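The distinction made in Definition 3.3.22 can be illustrated numerically. The following is a minimal Python sketch, not part of the original text; the function names and cut-offs are illustrative assumptions. It compares the sequence $(-1)^{k+1}/k$, which is summable by Example 3.3.19 while its absolute values form the harmonic sequence, with the sequence $(-1)^{k+1}/k^2$, whose absolute values are summable by Example 3.3.7. Finite partial sums can of course only suggest, not prove, the behaviour.

```python
import math


def partial_sum(term, n):
    """Return term(1) + ... + term(n)."""
    return math.fsum(term(k) for k in range(1, n + 1))


def alternating_harmonic(k):
    """Conditionally summable example: (-1)^(k+1) / k."""
    return (-1) ** (k + 1) / k


def alternating_squares(k):
    """Absolutely summable example: (-1)^(k+1) / k^2."""
    return (-1) ** (k + 1) / k ** 2


if __name__ == "__main__":
    examples = (("(-1)^(k+1)/k", alternating_harmonic),
                ("(-1)^(k+1)/k^2", alternating_squares))
    for n in (10, 1000, 100000):
        for name, term in examples:
            signed = partial_sum(term, n)
            absolute = partial_sum(lambda k: abs(term(k)), n)
            print(f"n = {n:6d}  {name:15s}  sum x_k = {signed:.6f}"
                  f"  sum |x_k| = {absolute:.6f}")
```

For the first sequence the column of absolute sums keeps growing, roughly like $\ln(n)$, while the signed sums settle; for the second sequence both columns settle.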
Of course, the previous definition is reasonable only if any absolutely summable sequence is summable, too. The last is easy to prove.

Theorem 3.3.23. Any absolutely summable sequence of real numbers is summable.

Proof. For this, let $x_1, x_2, \dots$ be some absolutely summable sequence of real numbers. Then $x_1 + |x_1|, x_2 + |x_2|, \dots$ is a sequence of positive real numbers and
\[ \sum_{k=1}^{n} (x_k + |x_k|) \leq 2 \sum_{k=1}^{n} |x_k| \leq 2 \sum_{k=1}^{\infty} |x_k| \]
for all $n \in \mathbb{N}^*$. Hence the sequence of partial sums corresponding to the sequence $x_1 + |x_1|, x_2 + |x_2|, \dots$ is increasing as well as bounded from above and hence convergent. Therefore, $x_1 + |x_1|, x_2 + |x_2|, \dots$ is summable. Hence it follows by the limit laws that $x_1, x_2, \dots$ is summable, too.

Remark 3.3.24. Note that the previous definition and theorem reduce the decision whether a given sequence is absolutely summable (and therefore also summable) to the decision whether a corresponding sequence of positive real numbers is summable. Usually, the decision of the last is relatively easy, and we already developed a number of tools for this. For this reason, the second step in the analysis is often the inspection whether the sequence is absolutely summable. Usually, the first step inspects whether the summability of the sequence can be concluded by help of the limit laws from the already known summability of certain sequences, or whether there are obvious reasons why the sequence is not summable. If this fails, absolute summability is investigated. If this also fails, the applicability of Dirichlet's test or Abel's test is investigated next.

Example 3.3.25. In Example 3.3.2, we have seen that the geometric series, defined by
\[ S_n := \sum_{k=0}^{n} x^k \]
for every $n \in \mathbb{N}$, where $x^0 := 1$, is convergent if and only if $|x| < 1$. In the last case, this also implies that the geometric series defined by
\[ \bar{S}_n := \sum_{k=0}^{n} |x^k| = \sum_{k=0}^{n} |x|^k , \]
where $|x|^0 := 1$, is convergent and hence that $1, x, x^2, \dots$ is absolutely summable.

Example 3.3.26.
Determine whether the sequence
\[ \frac{\sin(1)}{1^2} , \frac{\sin(2)}{2^2} , \frac{\sin(3)}{3^2} , \dots \tag{3.3.16} \]
is absolutely summable. Solution: For every $k \in \mathbb{N}^*$, it follows that
\[ \left| \frac{\sin(k)}{k^2} \right| \leq \frac{1}{k^2}. \]
Hence it follows by Example 3.3.7 and Theorem 3.3.9 that the sequence (3.3.16) is absolutely summable.

Example 3.3.27. The examples of the harmonic series, Example 3.3.3, and the alternating harmonic series, i.e., the case $s = 1$ in Example 3.3.19, show that not every summable sequence is absolutely summable.

The following characterization of summability is sometimes useful in the analysis of sequences and will be used later on. It is a simple consequence of the definition of summability of a sequence and the completeness of the real number system in the form of Theorem 2.3.17.

Theorem 3.3.28. (Cauchy's characterization of summable sequences) A sequence $x_1, x_2, \dots$ of real numbers is summable if and only if the corresponding sequence of partial sums is a Cauchy sequence, i.e., if and only if for every $\varepsilon > 0$, there is some $N \in \mathbb{N}^*$ such that
\[ \left| \sum_{k=m}^{n} x_k \right| \leq \varepsilon \]
for all $m, n \in \mathbb{N}^*$ satisfying $n \geq m \geq N$.

Proof. First, if $x_1, x_2, \dots$ is a sequence of real numbers whose corresponding sequence of partial sums is a Cauchy sequence, then it follows by Theorem 2.3.17 that the last sequence is convergent and hence that $x_1, x_2, \dots$ is summable. If $x_1, x_2, \dots$ is a summable sequence of real numbers, then the corresponding sequence of partial sums is convergent and hence also a Cauchy sequence according to Theorem 2.3.17. The last can also be proved directly as follows. For this, let $\varepsilon > 0$. Since $x_1, x_2, \dots$ is summable, there is $N \in \mathbb{N}^*$ such that
\[ \left| \sum_{k=1}^{m} x_k - \sum_{k=1}^{\infty} x_k \right| \leq \frac{\varepsilon}{2} \]
for all $m \in \mathbb{N}^*$ satisfying $m \geq N$. Hence it follows for all $m, n \in \mathbb{N}^*$ such that $n \geq m \geq N + 1$ that
\[ \left| \sum_{k=m}^{n} x_k \right| = \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{m-1} x_k \right| \leq \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{\infty} x_k \right| + \left| \sum_{k=1}^{\infty} x_k - \sum_{k=1}^{m-1} x_k \right| \leq \varepsilon. \]

The following corollary is often used to show that a given sequence is not summable.

Corollary 3.3.29. Let $x_1, x_2, \dots$ be a summable sequence of real numbers.
Then
\[ \lim_{n \to \infty} x_n = 0. \]

Example 3.3.30. We consider the sequence $x_1, x_2, \dots$ defined by
\[ x_n := (-1)^n\,\frac{n+1}{n} \]
for every $n \in \mathbb{N}^*$. If $x_1, x_2, \dots$ were convergent to 0, every one of its subsequences would also converge to zero. On the other hand,
\[ \lim_{n \to \infty} x_{2n} = \lim_{n \to \infty} \frac{2n+1}{2n} = 1. \]
Hence $x_1, x_2, \dots$ is not convergent to 0 and therefore also not summable.

In the following, we give the two most important tests, the ratio test and the root test, for the decision whether a given sequence is absolutely summable or not. Both tests compare, by application of Theorem 3.3.9, the corresponding series to geometric series. Usually, the structure of the members of the sequence decides which of the tests is applied. The ratio test uses for this the ratio of the absolute values of subsequent members and the root test the $n$-th root of the absolute value of the $n$-th member. Since the structure of the last is often more complicated than that of the ratio, the ratio test is more frequently applied.

Theorem 3.3.31. (Ratio test) Let $x_1, x_2, \dots$ be a sequence of real numbers.

(i) If there are $q \in (0,1)$ and $N \in \mathbb{N}^*$ such that
\[ \left| \frac{x_{n+1}}{x_n} \right| \leq q \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$, then $x_1, x_2, \dots$ is absolutely summable. Note that this can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.

(ii) If there is $N \in \mathbb{N}^*$ such that
\[ \left| \frac{x_{n+1}}{x_n} \right| \geq 1 \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$, then $x_1, x_2, \dots$ is not summable. This, too, can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.

Proof. '(i)': For this, let $q \in (0,1)$ and $N \in \mathbb{N}^*$ be such that $|x_{n+1}| \leq q\,|x_n|$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows by induction that
\[ |x_n| \leq |x_N|\,q^{n-N} \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$. Hence the absolute summability of $x_1, x_2, \dots$ follows by Example 3.3.2 and Theorem 3.3.9. '(ii)': For this let $N \in \mathbb{N}^*$ be such that $|x_{n+1}| / |x_n| \geq 1$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$.
Then it follows by induction that
\[ |x_n| \geq |x_N| \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$ and hence, since $x_N \neq 0$, that $x_1, x_2, \dots$ is not converging to 0. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.

Example 3.3.32. Find all real values $x$ for which the sequence
\[ \frac{x^0}{0!} , \frac{x^1}{1!} , \frac{x^2}{2!} , \dots \]
is summable. Solution: For $x = 0$, the corresponding sequence is obviously absolutely summable. For $x \in \mathbb{R} \setminus \{0\}$, it follows that
\[ \lim_{n \to \infty} \left| \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} \right| = \lim_{n \to \infty} \frac{|x|}{n+1} = 0 , \]
so that the ratios are eventually bounded by, e.g., $q = 1/2$, and hence by Theorem 3.3.31 the absolute summability of the corresponding sequence.

Theorem 3.3.33. (Root test) Let $x_1, x_2, \dots$ be a sequence of real numbers.

(i) If there are $q \in [0,1)$ and $N \in \mathbb{N}^*$ such that
\[ |x_n|^{1/n} \leq q \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$, then $x_1, x_2, \dots$ is absolutely summable.

(ii) If there is $N \in \mathbb{N}^*$ such that
\[ |x_n|^{1/n} \geq 1 \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$, then $x_1, x_2, \dots$ is not summable.

Proof. '(i)': For this, let $q \in [0,1)$ and $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \leq q$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows that
\[ |x_n| \leq q^n \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$ and hence by Example 3.3.2 and Theorem 3.3.9 the absolute summability of $x_1, x_2, \dots$. '(ii)': For this let $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \geq 1$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows that
\[ |x_n| \geq 1 \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$ and hence that $x_1, x_2, \dots$ is not converging to 0. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.

Example 3.3.34. Determine whether the sequence
\[ (-1)^2 \frac{1}{(\ln 2)^2} , \ (-1)^3 \frac{1}{(\ln 3)^3} , \ (-1)^4 \frac{1}{(\ln 4)^4} , \ \dots \]
is summable. Solution: It follows that
\[ \lim_{n \to \infty} \left| (-1)^n \frac{1}{(\ln n)^n} \right|^{1/n} = \lim_{n \to \infty} \frac{1}{\ln n} = 0 \]
and hence by Theorem 3.3.33 the absolute summability of the sequence.

Example 3.3.35. Note that in the case of the sequence $a_1, a_2, \dots$ defined by
\[ a_n := \frac{1}{n^s} \]
for all $n \in \mathbb{N}^*$, where $s > 0$, neither the ratio nor the root test can be applied, since
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \left( \frac{n}{n+1} \right)^s = 1^s = 1 , \qquad \lim_{n \to \infty} |a_n|^{1/n} = \lim_{n \to \infty} n^{-s/n} = \lim_{n \to \infty} e^{-s \ln(n)/n} = e^0 = 1. \]

Finally, by application of Cauchy's characterization of summable sequences, Theorem 3.3.28, we prove that every reordering of an absolutely summable sequence leads to a convergent series whose sum coincides with the sum of the original series.

Theorem 3.3.36. (Rearrangements of absolutely convergent series) Let $x_1, x_2, \dots$ be an absolutely summable sequence of real numbers. Further, let $f : \mathbb{N}^* \to \mathbb{N}^*$ be bijective. Then the sequence $x_{f(1)}, x_{f(2)}, \dots$ is also absolutely summable and
\[ \sum_{k=1}^{\infty} x_k = \sum_{k=1}^{\infty} x_{f(k)}. \tag{3.3.17} \]

Proof. First, it follows that the sequence of partial sums of $|x_{f(1)}|, |x_{f(2)}|, \dots$ is increasing with upper bound $\sum_{k=1}^{\infty} |x_k|$ and hence convergent. Hence $x_{f(1)}, x_{f(2)}, \dots$ is absolutely summable. Further, let $\varepsilon > 0$. By Theorem 3.3.28, there is $N \in \mathbb{N}^*$ such that for all $n, m \in \mathbb{N}^*$ satisfying $n \geq m \geq N$, it follows that
\[ \sum_{k=m}^{n} |x_k| \leq \varepsilon. \]
Since $f$ is bijective, there is $N_f \in \mathbb{N}^*$ such that
\[ \{1, \dots, N\} \subset \{f(1), \dots, f(N_f)\}. \]
Hence it follows for every $n \in \mathbb{N}^*$ satisfying $n \geq \max\{N, N_f\}$:
\[ \left| \sum_{k=1}^{n} x_{f(k)} - \sum_{k=1}^{n} x_k \right| \leq \varepsilon. \]
Hence it follows also (3.3.17).

Problems

1) Express the periodic decimal expansion as a fraction. a) 0.9 , b) 0.3 , c) 0.377 .

2) Calculate a), b): 8̧ 1 1 8̧ 3p1 p1qn q 1 , b) , a) 2n 4 n n 4 n1 n1
\[ \text{c)} \ \sum_{n=1}^{\infty} \frac{1}{n(n+3)} , \qquad \text{d)} \ \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+3)}. \]

3) Determine whether the sequence $a_1, a_2, \dots$ is absolutely summable, conditionally summable or not summable. a) ak a) ak c) ak k1 k 4 5 , plnpk 1 1qqk arctanpkq , k 4{3 1 p3q pk!1 g) ak 1{k p1qk ke2 b) 2 d) ak , f) , h) 31{k , p2k2 1q! , ? ? k 3 k ak k ak 2 p1qk k2k k 3 2 , j) p2kq! , l) a kk , ak k p3kq! k! i) ak k) k 2k 1 k2 q k e) ak , 51{k , p1qk a ak b) p1qk k2 k 3 ak e3kk{2 , 3 , , ak k) where k 1 1 k k P N . rlnkpk2qs 3 , l) ak

4) Determine the values $q \geq 1$ for which the corresponding sequence $a_3, a_4, \dots$ defined by
\[ a_k := \frac{1}{k \ln(k)\,[\ln(\ln(k))]^q} , \]
$k \in \{3, 4, \dots\}$, is summable. Give reasons for your answer.

5) Define $a_{4k} := a_{4k+1} := 1$, $a_{4k+2} := a_{4k+3} := -1$ for every $k \in \mathbb{N}$. Determine whether the sequence
\[ \frac{a_k}{\sqrt[3]{k+7}} , \]
$k \in \mathbb{N}$, is absolutely summable, conditionally summable or not summable. Give reasons for your answer.

6) Estimate the error if the sum of the first $N$ terms is used as an approximation of the series. a) a) 8̧ 1 , N 3 , b) n2 n1 8̧ p1qn 1 n 1 n , N 7 8̧ 1 n r ln pnqs2 , N n2 , b) 8̧ p1qn n2 n 1 9 1 , N , 14 .

7) Calculate the sum correct to 3 decimal places 8̧ 1 8̧ 1 , b) , 4 5 lnpnq n n n1 n2 8̧ 8̧ p1qn p2n1 q! , b) p1qn a) n1 n1 a) 1 n! n2n .

8) A rubber ball falls from initial height 3m. Whenever it hits the ground, it bounces up $3/4$-th of the previous height. What total distance is covered by the ball before it comes to rest?

9) If $a_1, a_2, \dots$ is a sequence of real numbers such that
\[ \lim_{n \to \infty} a_n = 0 , \]
does this imply the summability of the sequence? Give reasons for your answer.

10) Give an example for a convergent sequence of real numbers $a_1, a_2, \dots$ and a divergent sequence of real numbers $b_1, b_2, \dots$ satisfying
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = 1 , \qquad \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = 1. \]

11) Give an example for a convergent sequence of real numbers $a_1, a_2, \dots$ and a divergent sequence of real numbers $b_1, b_2, \dots$ satisfying
\[ \lim_{n \to \infty} (a_n)^{1/n} = 1 , \qquad \lim_{n \to \infty} (b_n)^{1/n} = 1. \]

12) Assume that $a_1, a_2, \dots$ is a summable sequence of positive real numbers. Show that the sequence $a_1^2, a_2^2, \dots$ is also summable. Is the last also generally true if members of $a_1, a_2, \dots$ can be negative?

3.4 Series of Functions

One main application of series is in the form of series of functions, i.e., series containing one or more parameters varying in a certain domain. For motivation, we consider Leibniz's 'arithmetical quadrature of the circle' from 1673. By 'arithmetical' quadrature, Leibniz meant a representation of an area as a sum of an infinite sequence of rational numbers.
He arrived at that 'Leibniz's series' for the area of the circle through application of his 'transmutation theorem' by which he could also derive essentially all known plane quadrature results at the time. From today's perspective, the statement of that theorem is a consequence of the method of integration by parts and the change of variable formula. For this reason, Leibniz's transmutation theorem will not be used in the following. For instance, see [36] for information on that theorem. As a consequence, the first steps in the derivation below will not look very natural. Leibniz uses the following representation
\[ S := \{ (x,y) \in \mathbb{R}^2 : (x-1)^2 + y^2 = 1 \} \]
of a circle of radius 1 and center $(1,0)$. Since $(x,y) \in S$ if and only if
\[ 0 = (x-1)^2 + y^2 - 1 = x^2 - 2x + 1 + y^2 - 1 = y^2 - (2x - x^2) , \]
the area of the upper left quarter of the circle is given by the area below the graph of $([0,1] \to \mathbb{R}, x \mapsto \sqrt{2x - x^2})$, see Fig. 93. Hence
\[ \frac{\pi}{4} = \int_0^1 \sqrt{2x - x^2}\,dx. \]

Fig. 93: The Leibniz series gives a representation of the area of a quarter of a circle of radius 1 in terms of a sum of an infinite sequence of rational numbers.

Instead of applying Leibniz's 'transmutation theorem', we proceed as follows. First, it follows by partial integration that
\[ \int_0^1 \sqrt{2x - x^2}\,dx = \left[ x \sqrt{2x - x^2} \right]_0^1 - \int_0^1 x\,\frac{2 - 2x}{2 \sqrt{2x - x^2}}\,dx = 1 - \int_0^1 \frac{(2x - x^2) - x}{\sqrt{2x - x^2}}\,dx = 1 - \int_0^1 \sqrt{2x - x^2}\,dx + \int_0^1 \frac{x}{\sqrt{2x - x^2}}\,dx \]
and hence that
\[ \int_0^1 \sqrt{2x - x^2}\,dx = \frac{1}{2} + \frac{1}{2} \int_0^1 \frac{x}{\sqrt{2x - x^2}}\,dx = \frac{1}{2} + \frac{1}{2} \int_0^1 \sqrt{\frac{x}{2 - x}}\,dx. \]
Second, we use change of variables. For this, we define
\[ g(u) := \frac{2u^2}{1 + u^2} \]
for every $u \in \mathbb{R}$. In particular, $g$ is continuously differentiable with derivative
\[ g'(u) = \frac{4u\,(1 + u^2) - 2u^2 \cdot 2u}{(1 + u^2)^2} = \frac{4u}{(1 + u^2)^2} \]
for all $u \in \mathbb{R}$ and hence $g$ is strictly increasing on $[0,\infty)$ and $g(0) = 0$, $g(1) = 1$. Also,
\[ \frac{g(u)}{2 - g(u)} = \frac{\frac{2u^2}{1 + u^2}}{2 - \frac{2u^2}{1 + u^2}} = \frac{\frac{2u^2}{1 + u^2}}{\frac{2}{1 + u^2}} = u^2 \]
for all $u \in \mathbb{R}$.
Hence it follows by change of variables and partial integration that
\[ \frac{1}{2} + \frac{1}{2} \int_0^1 \sqrt{\frac{x}{2 - x}}\,dx = \frac{1}{2} + \frac{1}{2} \int_{g(0)}^{g(1)} \sqrt{\frac{x}{2 - x}}\,dx = \frac{1}{2} + \frac{1}{2} \int_0^1 \sqrt{\frac{g(u)}{2 - g(u)}}\,g'(u)\,du \]
\[ = \frac{1}{2} + \frac{1}{2} \int_0^1 u\,g'(u)\,du = \frac{1}{2} + \frac{1}{2} \left[ u\,g(u) \right]_0^1 - \frac{1}{2} \int_0^1 g(u)\,du = 1 - \int_0^1 \frac{u^2}{1 + u^2}\,du. \]
In this way, as also Leibniz by use of his 'transmutation method', we arrive at the equation
\[ \frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2}\,du. \]
In the next step, Leibniz uses that $1/(1 + u^2)$ is the limit of a geometrical series
\[ \frac{1}{1 + u^2} = \frac{1}{1 - (-u^2)} = \sum_{k=0}^{\infty} (-u^2)^k = \sum_{k=0}^{\infty} (-1)^k u^{2k} \tag{3.4.1} \]
for every $u \in \mathbb{R}$ satisfying $|u| < 1$, where we use the usual convention that $x^0 := 1$ for all $x \in \mathbb{R}$, and concludes that
\[ \frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2}\,du = 1 - \int_0^1 \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)}\,du = 1 - \sum_{k=0}^{\infty} (-1)^k \int_0^1 u^{2(k+1)}\,du \]
\[ = 1 - \sum_{k=0}^{\infty} (-1)^k \frac{1}{2k + 3} = 1 + \sum_{k=0}^{\infty} (-1)^{k+1} \frac{1}{2(k+1) + 1} = 1 + \sum_{n=1}^{\infty} (-1)^n \frac{1}{2n + 1} \]
to arrive at the Leibniz series
\[ \frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n + 1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots. \tag{3.4.2} \]

Fig. 94: Members of Leibniz' series and their limit value, $\pi/4$.

Note that the previous derivation proceeds without any reference to trigonometric functions or their inverses. Indeed, as is proved in Example 3.4.20 later on, (3.4.2) turns out to be correct, but the exchange of integration and summation in the last part of the derivation needs justification. In particular, it has to be taken into account that the sum in (3.4.1) diverges at the right end $u = 1$ of the interval of integration. In the following, we consider the situation in the last part of Leibniz's derivation in a little more detail. For this, we define for every $n \in \mathbb{N}$ the corresponding function $f_n : [0,1] \to \mathbb{R}$ by
\[ f_n(u) := \sum_{k=0}^{n} (-1)^k u^{2(k+1)} \]
for every $u \in [0,1]$. In particular, $f_n$ is continuous and hence Riemann-integrable for every $n \in \mathbb{N}$ and
\[ \lim_{n \to \infty} f_n(u) = \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} = \frac{u^2}{1 + u^2} \]
for every $u \in [0,1)$. Also the 'limit function' $f : [0,1] \to \mathbb{R}$ of the sequence of functions $f_0, f_1, \dots$ defined by
\[ f(u) := \frac{u^2}{1 + u^2} \]
for every $u \in [0,1]$ is continuous and hence Riemann-integrable. In this situation, Leibniz uses that
\[ \int_0^1 f(u)\,du = \int_0^1 \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)}\,du = \sum_{k=0}^{\infty} \int_0^1 (-1)^k u^{2(k+1)}\,du = \lim_{n \to \infty} \int_0^1 f_n(u)\,du. \]
In this special case, this exchange of integration and summation turns out to be correct. On the other hand, it is not difficult to find cases where such exchange leads to incorrect results, see Example 3.4.2 (iii). It is important to note that such exchanges of integration, differentiation and other limit operations were quite common from the 17th century up to the first quarter of the 19th century. For instance, a mathematician like Euler would not have hesitated to perform such an exchange without seeing a necessity for justification. A main goal of this section is to derive general conditions on a sequence of functions $f_0, f_1, \dots$ and the limit function $f$ such that
\[ \int_a^b f(u)\,du = \lim_{n \to \infty} \int_a^b f_n(u)\,du \]
where $a, b \in \mathbb{R}$ are such that $a \leq b$, $f_n : [a,b] \to \mathbb{R}$ is continuous for every $n \in \mathbb{N}$, $f_0(x), f_1(x), \dots$ is convergent for every $x \in [a,b]$ and $f : [a,b] \to \mathbb{R}$ is defined by
\[ f(x) := \lim_{n \to \infty} f_n(x) \]
for every $x \in [a,b]$. Further equally important goals are the derivation of analogous conditions for situations where integration is replaced by differentiation and other limit operations. For the last, there is an important historic example. In his textbook 'Cours d'analyse' from 1821 [22], Cauchy gives a false theorem, along with an incorrect proof, that would imply that the continuity of the members of such a sequence $f_0, f_1, \dots$ entails the continuity of the limit function $f$. Indeed, the last, and therefore also Cauchy's 'theorem', is incorrect, see Example 3.4.2 (i). Apparently, this has been noticed first by Niels Henrik Abel in 1826 [1] who gave a counterexample.

After the previous introduction, we start the following with a collection of counterexamples. For this, we make the following definition.

Definition 3.4.1.
(The pointwise limit of a sequence of functions) Let $f_1, f_2, \dots$ be a sequence of functions defined on subsets of $\mathbb{R}$. We define the pointwise limit of $f_1, f_2, \dots$ as the function $f$ defined by
\[ f(x) := \lim_{n \to \infty} f_n(x) \]
for all $x$ in the intersection of the domains of $f_1, f_2, \dots$ which are such that the corresponding sequence $f_1(x), f_2(x), \dots$ is convergent. Note that $f$ has an empty domain in case that no such $x$ exist.

Applying the previous terminology, the following Example shows the following facts. Part (i) shows that the pointwise limit of a sequence of continuous functions is not always a continuous function. Part (ii) shows that the sequence of derivatives, associated to a sequence of differentiable functions that converge pointwise to a differentiable function $f$, does not always converge pointwise to the derivative of $f$. Finally, Part (iii) shows that the sequence of integrals, associated to a sequence of Riemann-integrable functions that converge pointwise to a Riemann-integrable function $f$, does not always converge to the integral of $f$.

Example 3.4.2. (Examples of limits of sequences of functions)

(i) Define the sequence of infinitely often differentiable functions $f_1, f_2, \dots$ by
\[ f_n(x) := \frac{x^{2n}}{1 + x^{2n}} \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$. Then
\[ \left( \lim_{n \to \infty} f_n \right)(x) := \lim_{n \to \infty} f_n(x) = \begin{cases} 0 & \text{if } |x| < 1 \\ \frac{1}{2} & \text{if } x \in \{-1, 1\} \\ 1 & \text{if } |x| > 1 \end{cases} \]
and hence $\lim_{n \to \infty} f_n$ is not a continuous function.

Fig. 95: Graphs of the first five functions of the series from Example 3.4.2(i).

Fig. 96: Graphs of the first 4 functions of the series from Example 3.4.2(ii).

Fig. 97: Graphs of the derivatives of the first 4 functions of the series from Example 3.4.2(ii).

Fig. 98: Graphs of the first 10 functions of the series from Example 3.4.2(iii).

(ii) Define the sequence of differentiable functions $g_1, g_2, \dots$ by
\[ g_n(x) := \frac{\sin(nx)}{n} \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$.
Then
\[ \left( \lim_{n \to \infty} g_n \right)(x) := \lim_{n \to \infty} g_n(x) = 0 \]
for all $x \in \mathbb{R}$ and hence $\lim_{n \to \infty} g_n$ is a differentiable function. On the other hand, because of
\[ g_n'(x) = \cos(nx) \]
for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$, the limit of $g_1'(x), g_2'(x), \dots$ does not exist for any $x \in \mathbb{R} \setminus \{ 2\pi k : k \in \mathbb{Z} \}$, and
\[ \lim_{n \to \infty} g_n'(0) = 1 \neq 0 = \left( \lim_{n \to \infty} g_n \right)'(0) . \]
Hence the sequence $g_1', g_2', \dots$ of derivatives of $g_1, g_2, \dots$ does not converge pointwise to the derivative of $\lim_{n \to \infty} g_n$.

(iii) Define the sequence of continuous functions $h_1, h_2, \dots$ by
\[ h_n(x) := n x (1 - x^2)^n \]
for all $n \in \mathbb{N}^*$ and $x \in [0,1]$. Then
\[ \left( \lim_{n \to \infty} h_n \right)(x) := \lim_{n \to \infty} h_n(x) = 0 \]
for all $x \in [0,1]$ defines a continuous function. On the other hand,
\[ \int_0^1 h_n(x) \, dx = \frac{1}{2} \, \frac{n}{n + 1} \]
for every $n \in \mathbb{N}^*$ and hence
\[ \frac{1}{2} = \lim_{n \to \infty} \int_0^1 h_n(x) \, dx \neq \int_0^1 \left( \lim_{n \to \infty} h_n \right)(x) \, dx = 0 , \]
i.e., the limit of the integrals of $h_1, h_2, \dots$ over $[0,1]$ is different from the integral of its limit function over $[0,1]$.

The following defines the notion of uniform convergence of a sequence of functions. It is more restrictive than that of pointwise convergence and, if present, will turn out to allow the exchange of the operations of integration, differentiation and the taking of limits. Its definition is likely due to Christoph Gudermann, the teacher of Weierstrass. From 1841 on, Weierstrass used it routinely in his work on power series. Through Weierstrass’ lectures on analysis at Berlin in 1859 and 1860, the mathematical world slowly became aware of the importance of the concept.

Definition 3.4.3. (Uniform convergence) A sequence $f_1, f_2, \dots$ of functions on some non-empty subset $T$ of $\mathbb{R}$ is said to be uniformly convergent to some function $f : T \to \mathbb{R}$, if for every $\varepsilon > 0$ there is some $N \in \mathbb{N}^*$ such that
\[ |f_n(x) - f(x)| < \varepsilon \]
for all $x \in T$ and all $n \in \mathbb{N}^*$ such that $n \geq N$. Note that the last is equivalent to the requirement that the graph of $f_n$ is contained in the ‘uniform neighborhood of size $\varepsilon$’
\[ \{ (x,y) \in T \times \mathbb{R} : y \in ( f(x) - \varepsilon, f(x) + \varepsilon ) \} \]
around the graph of $f$ for all $n \in \mathbb{N}^*$ such that $n \geq N$, see Fig 99.
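As a rough numerical illustration of why the sequence $h_1, h_2, \dots$ from Example 3.4.2(iii) cannot converge uniformly to $0$, the following sketch approximates $\sup_{x \in [0,1]} |h_n(x) - 0|$ on a grid and compares a midpoint-rule value of $\int_0^1 h_n(x)\,dx$ with $n/(2(n+1))$. The function names, the grid sizes and the use of the midpoint rule are assumptions made for this illustration only; they are not part of the text.

```python
def h(n, x):
    """h_n(x) = n x (1 - x^2)^n from Example 3.4.2(iii)."""
    return n * x * (1.0 - x * x) ** n


def sup_norm_on_grid(n, samples=20001):
    """Grid approximation of sup over [0, 1] of |h_n(x) - 0| (assumed grid size)."""
    return max(abs(h(n, i / (samples - 1))) for i in range(samples))


def midpoint_integral(n, samples=20000):
    """Midpoint-rule approximation of the integral of h_n over [0, 1]."""
    width = 1.0 / samples
    return width * sum(h(n, (i + 0.5) * width) for i in range(samples))


# Columns: n, approximate sup-distance to the limit function 0,
# approximate integral, closed-form integral n / (2 (n + 1)).
for n in (1, 10, 100):
    print(n, sup_norm_on_grid(n), midpoint_integral(n), n / (2 * (n + 1)))
```

If the sketch behaves as intended, the sup-distance grows with $n$ instead of dropping below any fixed $\varepsilon$, which is exactly what Definition 3.4.3 excludes, while the integrals stay near $1/2$.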
Fig. 99: A uniform neighborhood of size $1/4$ around $\sin(2x)$.

Subsequently, we prove three statements on the validity of the exchange of the operations of integration, differentiation and the performing of (other) limits.

Theorem 3.4.4. (Uniform limits of continuous functions are continuous) Let $f_1, f_2, \dots$ be a sequence of functions on some non-empty subset $T$ of $\mathbb{R}$ which is uniformly convergent to some function $f : T \to \mathbb{R}$. Further, let all $f_1, f_2, \dots$ be continuous at some point $x_0 \in T$. Then $f$ is continuous at $x_0$, too, i.e.,
\[ \lim_{x \to x_0} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to x_0} f_n(x) . \]

Proof. For this, let $a_1, a_2, \dots$ be some sequence in $T$ which is convergent to $x_0$ and $\varepsilon > 0$. Then for every $m, n \in \mathbb{N}^*$
\[ |f(a_n) - f(x_0)| = |f(a_n) - f_m(a_n) + f_m(a_n) - f_m(x_0) + f_m(x_0) - f(x_0)| \leq |f(a_n) - f_m(a_n)| + |f_m(a_n) - f_m(x_0)| + |f_m(x_0) - f(x_0)| . \tag{3.4.3} \]
Since $f_1, f_2, \dots$ is uniformly convergent to $f$, there is $m_0 \in \mathbb{N}^*$ such that
\[ |f_{m_0}(x) - f(x)| \leq \frac{\varepsilon}{3} \]
for all $x \in T$. Further, since $f_{m_0}$ is continuous in $x_0$, there is $N \in \mathbb{N}^*$ such that
\[ |f_{m_0}(a_n) - f_{m_0}(x_0)| \leq \frac{\varepsilon}{3} \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Hence it follows by (3.4.3), applied with $m = m_0$, that
\[ |f(a_n) - f(x_0)| \leq \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$.

Theorem 3.4.5. (A simple limit theorem for Riemann integrals) Let $f_1, f_2, \dots$ be a sequence of almost everywhere continuous functions on $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a \leq b$, which is uniformly convergent to some almost everywhere continuous function $f : [a,b] \to \mathbb{R}$. Then
\[ \lim_{n \to \infty} \int_a^b f_n(x) \, dx = \int_a^b \left( \lim_{n \to \infty} f_n \right)(x) \, dx = \int_a^b f(x) \, dx . \]

Proof. For this, let $\varepsilon > 0$. Since $f_1, f_2, \dots$ is uniformly convergent to $f$, there is $N \in \mathbb{N}^*$ such that
\[ |f_n(x) - f(x)| \leq \varepsilon \]
for all $x \in [a,b]$ and all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Hence, for every such $n$,
\[ \left| \int_a^b f_n(x) \, dx - \int_a^b f(x) \, dx \right| \leq \int_a^b |f_n(x) - f(x)| \, dx \leq \int_a^b \varepsilon \, dx = (b - a) \, \varepsilon . \]

Theorem 3.4.6. Let $f_1, f_2, \dots$ be a sequence of continuous functions on $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a \leq b$, such that $f_1(x_0), f_2(x_0), \dots$ is convergent for some $x_0 \in [a,b]$.
Further, let the restriction of each $f_n$, $n \in \mathbb{N}^*$, to $(a,b)$ be differentiable with a derivative that can be extended to a continuous function $f_n'$ on $[a,b]$. Finally, let the sequence $f_1', f_2', \dots$ be uniformly convergent to some continuous function $g : [a,b] \to \mathbb{R}$. Then $f_1, f_2, \dots$ is uniformly convergent to a continuous function $f : [a,b] \to \mathbb{R}$ whose restriction to $(a,b)$ is differentiable with derivative given by $g|_{(a,b)}$. Hence in particular,
\[ \lim_{n \to \infty} f_n'(x) = \left( \lim_{n \to \infty} f_n \right)'(x) \tag{3.4.4} \]
for all $x \in (a,b)$.

Proof. By Theorem 2.6.21, it follows that
\[ f_n(x) = \int_a^x f_n'(y) \, dy + f_n(a) \]
for all $n \in \mathbb{N}^*$ and $x \in [a,b]$. Further, from this follows by the convergence of $f_1(x_0), f_2(x_0), \dots$, the uniform convergence of $f_1', f_2', \dots$ to $g$ and Theorem 3.4.5 the pointwise convergence of $f_1(x), f_2(x), \dots$ to some $f : [a,b] \to \mathbb{R}$ and
\[ f(x) = \int_a^x g(y) \, dy + \lim_{n \to \infty} f_n(a) \]
for all $x \in [a,b]$. Further, from this follows by Theorem 2.6.19 and the continuity of $g$ that $f$ is continuous with its restriction to $(a,b)$ being differentiable with derivative given by $g|_{(a,b)}$. Finally, from
\[ |f_n(x) - f(x)| \leq \int_a^x |f_n'(y) - g(y)| \, dy + \left| f_n(a) - \lim_{n \to \infty} f_n(a) \right| \]
valid for every $n \in \mathbb{N}^*$ and $x \in [a,b]$, the uniform convergence of $f_1', f_2', \dots$ to $g$ and the convergence of $f_1(a), f_2(a), \dots$ to $\lim_{n \to \infty} f_n(a)$, it follows the uniform convergence of $f_1, f_2, \dots$ to $f$.

Remark 3.4.7. Note that Example 3.4.2(ii) shows that the assumption of uniform convergence of the sequence $f_1, f_2, \dots$ alone in Theorem 3.4.6 does not guarantee the validity of (3.4.4) in general.

The previous three theorems demonstrate the usefulness of the notion of uniform convergence of a sequence of functions. Therefore, it is important to derive criteria that indicate the presence of such a convergence. In this connection, we give only the simplest and at the same time most useful criterion, due to Weierstrass.

Theorem 3.4.8. (Weierstrass test) Let $f_1, f_2, \dots$
be a sequence of functions on some non-empty subset $T$ of $\mathbb{R}$ for which there is a summable sequence $M_1, M_2, \dots$ of positive real numbers such that
\[ |f_n(x)| \leq M_n \]
for all $x \in T$ and $n \in \mathbb{N}^*$. Then the series $S_1, S_2, \dots$ defined by
\[ S_n := \sum_{k=1}^{n} f_k \]
for all $n \in \mathbb{N}^*$ is uniformly convergent to a function $S : T \to \mathbb{R}$ on $T$. Also, the series $S_1(x), S_2(x), \dots$ is absolutely convergent to $S(x)$ for all $x \in T$.

Proof. For this let $x \in T$. Then by Theorem 3.3.9, it follows the absolute summability of $f_1(x), f_2(x), \dots$. Hence we can define $S : T \to \mathbb{R}$ by $S(x) := \sum_{k=1}^{\infty} f_k(x)$ for all $x \in T$. Further for given $\varepsilon > 0$, there is $N \in \mathbb{N}^*$ such that
\[ \left| \sum_{k=1}^{n} M_k - \sum_{k=1}^{\infty} M_k \right| \leq \varepsilon \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. For such $n$, it follows that
\[ \left| \sum_{k=1}^{n} f_k(x) - S(x) \right| \leq \sum_{k=n+1}^{\infty} |f_k(x)| \leq \sum_{k=n+1}^{\infty} M_k \leq \varepsilon \]
for all $x \in T$ and hence the uniform convergence of $S_1, S_2, \dots$ to $S$.

The following two examples give standard applications of Theorem 3.4.5 and Theorem 3.4.6, respectively.

Example 3.4.9. Calculate
\[ \sum_{n=1}^{\infty} \frac{x^n}{n} \]
for all $x \in (-1,1)$.

Solution: We notice that by formal ‘termwise differentiation’ of this sum with respect to $x$ we arrive at a geometrical series. This fact will be exploited in the following. For this, we define $f_n(x) := x^n$ for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$. Then
\[ |f_n(x)| \leq ( \max\{ |a|, |b| \} )^n \]
for all $n \in \mathbb{N}$ and $x \in [a,b]$ where $a, b \in \mathbb{R}$ are such that $-1 < a \leq b < 1$. Since $| \max\{ |a|, |b| \} | < 1$, it follows by Theorem 3.4.8 the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[a,b]} \]
for $n \to \infty$ as well as the absolute summability of $f_0(x), f_1(x), \dots$ for all $x \in (-1,1)$. Further, it follows by Theorem 3.4.5 that
\[ \sum_{k=0}^{\infty} \frac{b^{k+1} - a^{k+1}}{k + 1} = \lim_{n \to \infty} \int_a^b S_{fn}(x) \, dx = \int_a^b \lim_{n \to \infty} S_{fn}(x) \, dx = \int_a^b \frac{dx}{1 - x} = \ln\left( \frac{1 - a}{1 - b} \right) \]
and hence, finally, that
\[ \sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n + 1} = - \ln(1 - x) \]
for all $x \in (-1,1)$.

Example 3.4.10. Calculate
\[ \sum_{n=1}^{\infty} n x^n \]
for all $x \in (-1,1)$.
Solution: We note that
\[ \sum_{n=1}^{\infty} n x^n = \sum_{n=1}^{\infty} \left[ (n + 1) x^n - x^n \right] = \sum_{n=1}^{\infty} (n + 1) x^n - \frac{x}{1 - x} \]
for all $x \in (-1,1)$ and that by formal integration of the last sum with respect to $x$ we arrive at a geometric series. This fact will be exploited in the following. For this, we define
\[ f_n(x) := x^n , \quad g_n(x) := n x^{n-1} \]
for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$, where $g_0 := 0$. Then
\[ |f_n(x)| \leq ( \max\{ |a|, |b| \} )^n , \quad |g_n(x)| \leq n \, ( \max\{ |a|, |b| \} )^{n-1} \]
for all $n \in \mathbb{N}$ and $x \in [a,b]$ where $a, b \in \mathbb{R}$ are such that $-1 < a \leq b < 1$. Since $| \max\{ |a|, |b| \} | < 1$, it follows by Theorem 3.4.8 the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[a,b]} , \quad S_{gn} := \sum_{k=0}^{n} g_k|_{[a,b]} \]
for $n \to \infty$ as well as the absolute summability of $f_0(x), f_1(x), \dots$ and $g_0(x), g_1(x), \dots$ for all $x \in (-1,1)$. Hence it follows by Theorem 3.4.6 that
\[ \sum_{k=0}^{\infty} k x^{k-1} = \lim_{n \to \infty} S_{gn}(x) = \lim_{n \to \infty} S_{fn}'(x) = \left( \lim_{n \to \infty} S_{fn} \right)'(x) = \frac{1}{(1 - x)^2} \]
for all $x \in (a,b)$. Hence, finally, it follows that
\[ \sum_{n=1}^{\infty} n x^n = \frac{x}{(1 - x)^2} \]
for all $x \in (-1,1)$.

The following example gives a less simple application of Theorem 3.4.5 to an improper integral from a well-known representation of the Riemann zeta function. In such applications, Theorem 3.4.5 is applied to every member of a sequence of Riemann integrals whose limit coincides with the improper integral.

Example 3.4.11. Show that
\[ \zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \tag{3.4.5} \]
for all $s > 1$.

Solution: For this, let $s > 1$. According to (2.5.4), $e^x - 1 > x$ for all $x \in (0,\infty)$ and hence
\[ \frac{x^{s-1}}{e^x - 1} < x^{s-2} \]
for all $x \in (0,1]$. Further, an easy calculation shows that $e^x - 1 > e^{x/2}$ for all $x \geq 1$ and hence that
\[ \frac{x^{s-1}}{e^x - 1} < x^{s-1} e^{-x/2} \]
for all $x \geq 1$. Hence by Examples 3.2.4, 3.2.7 and Theorem 3.2.6, it follows the improper Riemann integrability of $f := ((0,\infty) \to \mathbb{R}, x \mapsto x^{s-1}/(e^x - 1))$. Further, define for $k \in \mathbb{N}$
\[ f_k(x) := x^{s-1} e^{-(k+1)x} \]
for every $x \in (0,\infty)$. Then it follows for $\varepsilon, R \in \mathbb{R}$ such that $0 < \varepsilon < R$ and any $x \in [\varepsilon, R]$
\[ |f_k(x)| \leq R^{s-1} e^{-(k+1)\varepsilon} . \]
Hence by Theorem 3.4.8, it follows the uniform convergence of
\[ S_{fn} := \sum_{k=0}^{n} f_k|_{[\varepsilon,R]} \]
for $n \to \infty$ to $f|_{[\varepsilon,R]}$ as well as the absolute summability of $f_0(x), f_1(x), \dots$
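for all $x \in (0,\infty)$. Hence it follows by Theorem 3.4.5 that
\[ \int_{\varepsilon}^{R} \frac{x^{s-1}}{e^x - 1} \, dx = \sum_{k=0}^{\infty} \int_{\varepsilon}^{R} x^{s-1} e^{-(k+1)x} \, dx = \sum_{k=1}^{\infty} \frac{1}{k^s} \int_{k\varepsilon}^{kR} y^{s-1} e^{-y} \, dy \leq \Gamma(s) \, \zeta(s) \]
and hence that
\[ \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \leq \Gamma(s) \, \zeta(s) . \]
On the other hand,
\[ \sum_{k=1}^{n} \frac{1}{k^s} \int_{k\varepsilon}^{kR} y^{s-1} e^{-y} \, dy \leq \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \]
for every $n \in \mathbb{N}^*$ and hence
\[ \Gamma(s) \sum_{k=1}^{n} \frac{1}{k^s} \leq \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \]
and, finally,
\[ \Gamma(s) \, \zeta(s) \leq \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx . \]
We note that (3.4.5) gives that
\[ \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \frac{1}{\Gamma(2)} \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \zeta(2) . \]
The first integral has applications in statistical mechanics / quantum field theory. Therefore, the knowledge of the value of $\zeta(2)$ is useful. Indeed, there are quite a number of elementary proofs for the well-known fact that
\[ \zeta(2) = \frac{\pi^2}{6} . \tag{3.4.6} \]
We use for this the approach from [44]. From this result, we conclude that
\[ \int_0^{\infty} \frac{x}{e^x - 1} \, dx = \frac{\pi^2}{6} . \]
More generally, a representation similar to (3.4.6) is also known for $\zeta(2n)$ for $n \in \mathbb{N}^* \setminus \{1\}$.

Example 3.4.12. There is a fairly elementary way to show that
\[ \zeta(2) = \frac{\pi^2}{6} . \tag{3.4.7} \]
For this, we define for every $n \in \mathbb{N}^*$ a corresponding $S_n : \mathbb{R} \to \mathbb{R}$ by
\[ S_n(x) := \frac{1}{2} + \sum_{k=1}^{n} \cos(kx) \]
for every $x \in \mathbb{R}$. In a first step, it follows that
\[ S_n(x) = \frac{\sin[(2n + 1) \, x/2]}{2 \sin(x/2)} \]
for all $x \in \mathbb{R} \setminus \{ 2\pi k : k \in \mathbb{Z} \}$ and every $n \in \mathbb{N}^*$. The proof proceeds by induction on $n \in \mathbb{N}^*$. First, it follows by the addition theorem for the sine function that
\[ 2 \sin\left( \frac{x}{2} \right) S_1(x) = 2 \sin\left( \frac{x}{2} \right) \left[ \frac{1}{2} + \cos(x) \right] = \sin\left( \frac{x}{2} \right) + 2 \sin\left( \frac{x}{2} \right) \cos(x) = \sin(x) \cos\left( \frac{x}{2} \right) - \cos(x) \sin\left( \frac{x}{2} \right) + 2 \sin\left( \frac{x}{2} \right) \cos(x) = \sin\left( \frac{x}{2} \right) \cos(x) + \cos\left( \frac{x}{2} \right) \sin(x) = \sin\left( \frac{3x}{2} \right) \]
for every $x \in \mathbb{R}$ and hence the validity of the statement in the case $n = 1$. Further, if the statement is true for some $n \in \mathbb{N}^*$, then it follows by the addition theorem for the sine function that
\[ 2 \sin\left( \frac{x}{2} \right) S_{n+1}(x) = 2 \sin\left( \frac{x}{2} \right) S_n(x) + 2 \sin\left( \frac{x}{2} \right) \cos[(n + 1)x] = \sin\left( \frac{(2n + 1) x}{2} \right) + 2 \sin\left( \frac{x}{2} \right) \cos[(n + 1)x] = - \sin\left( \frac{x}{2} \right) \cos[(n + 1)x] + \cos\left( \frac{x}{2} \right) \sin[(n + 1)x] + 2 \sin\left( \frac{x}{2} \right) \cos[(n + 1)x] = \sin\left( \frac{x}{2} \right) \cos[(n + 1)x] + \cos\left( \frac{x}{2} \right) \sin[(n + 1)x] = \sin\left( \frac{(2n + 3) x}{2} \right) \]
for every $x \in \mathbb{R}$ and hence the validity of the statement where $n$ is replaced by $n + 1$. Hence the statement holds for all $n \in \mathbb{N}^*$. In the following let $n \in \mathbb{N}^*$.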
Then
\[ \int_0^{\pi} x S_n(x) \, dx = \int_0^{\pi} \frac{x}{2} \, dx + \sum_{k=1}^{n} \int_0^{\pi} x \cos(kx) \, dx = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left[ \frac{x \sin(kx)}{k} \right]_0^{\pi} - \sum_{k=1}^{n} \frac{1}{k} \int_0^{\pi} \sin(kx) \, dx = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left[ \frac{\cos(kx)}{k^2} \right]_0^{\pi} = \frac{\pi^2}{4} + \sum_{k=1}^{n} \left[ \frac{(-1)^k}{k^2} - \frac{1}{k^2} \right] . \]
Further, by integration by parts, it follows that
\[ \int_0^{\pi} x S_n(x) \, dx = \int_0^{\pi} x \, \frac{\sin[(2n + 1) \, x/2]}{2 \sin(x/2)} \, dx = \int_0^{\pi} \frac{x/2}{\sin(x/2)} \, \sin[(2n + 1) \, x/2] \, dx = - \frac{2}{2n + 1} \left[ \frac{x/2}{\sin(x/2)} \cos[(2n + 1) \, x/2] \right]_0^{\pi} + \frac{1}{2n + 1} \int_0^{\pi} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \cos[(2n + 1) \, x/2] \, dx . \]
Note that in this derivation it has been used that
\[ \lim_{x \to 0} \frac{x}{\sin(x)} = 1 , \quad \lim_{x \to 0} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} = 0 \]
which follows from an application of L’Hospital’s theorem, Theorem 2.5.38. As a consequence, the function $((0, 2\pi) \to \mathbb{R}, x \mapsto (x/2)/\sin(x/2))$ and its derivative have uniquely determined extensions to continuous functions on $[0, 2\pi)$. Further, by using the last fact, it follows that
\[ \left| \int_0^{\pi} \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \cos[(2n + 1) \, x/2] \, dx \right| \leq \int_0^{\pi} \left| \frac{\sin(x/2) - (x/2) \cos(x/2)}{\sin^2(x/2)} \right| dx \]
and hence that
\[ \lim_{n \to \infty} \int_0^{\pi} x S_n(x) \, dx = 0 . \]
Hence it follows that
\[ \frac{\pi^2}{4} + \sum_{k=1}^{\infty} \left[ \frac{(-1)^k}{k^2} - \frac{1}{k^2} \right] = 0 . \]
The last implies that
\[ \zeta(2) = \frac{\pi^2}{4} + \sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} = \frac{\pi^2}{4} - \zeta(2) + 2 \sum_{k=1}^{\infty} \frac{1}{(2k)^2} = \frac{\pi^2}{4} - \frac{\zeta(2)}{2} \]
and hence (3.4.7).

Cauchy, in his textbook ‘Cours d’analyse’ from 1821 [22], was the first to give nearly modern definitions of the continuity and differentiability of functions based on limits. Still, his understanding of limits was different from the modern understanding. During the early 19th century, it resulted in the general belief that every continuous function is everywhere differentiable, except perhaps at finitely many points. Even several ‘proofs’ of this ‘fact’ appeared during that time. One such ‘proof’ is due to André-Marie Ampère. Therefore, it came as a shock when in 1872 [99] Weierstrass proved the existence of a continuous function which is nowhere differentiable. For the first time, this result signaled the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.
As an application of uniform convergence of series of functions, the following gives such an example of a continuous function which is nowhere differentiable. It differs from Weierstrass’ original example, but the construction of the function and the subsequent reasoning are analogous. Weierstrass’ key idea is the construction of a continuous function $f$ which is highly oscillating in the neighborhood of every point $x$ of its domain in such a way that for every $M > 0$ and in every neighborhood of $x$, there is $\bar{x} \in D(f)$ such that the corresponding absolute value of the slope of the secant between $(x, f(x))$ and $(\bar{x}, f(\bar{x}))$ satisfies
\[ \left| \frac{f(\bar{x}) - f(x)}{\bar{x} - x} \right| \geq M . \]
Hence, there is a sequence $x_1, x_2, \dots$ in $D(f) \setminus \{x\}$ which is convergent to $x$ and such that the corresponding sequence
\[ \left| \frac{f(x_1) - f(x)}{x_1 - x} \right| , \left| \frac{f(x_2) - f(x)}{x_2 - x} \right| , \dots \]
is unbounded. As a consequence, the sequence
\[ \frac{f(x_1) - f(x)}{x_1 - x} , \frac{f(x_2) - f(x)}{x_2 - x} , \dots \]
cannot be convergent and hence $f$ is not differentiable in $x$. It is this key idea which signals for the first time the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.

Further, Weierstrass’ method of construction is suitable for the construction of a whole class of continuous functions that are nowhere differentiable. Hence, it cannot be said that such examples are in any sense isolated or pathological. The method rather supports the view that such functions are generic.

Fig. 100: Graph of the auxiliary function $h$ in Example 3.4.13.

Fig. 101: Graphs of $(\mathbb{R} \to \mathbb{R}, x \mapsto \sum_{k=0}^{n} (3/4)^k h(4^k x))$ for $n = 1, 2, 3$ and $10$. Compare Example 3.4.13.

Example 3.4.13. (A continuous nowhere differentiable function) In the first step, we define an auxiliary function $h : \mathbb{R} \to \mathbb{R}$ by
\[ h(x + 2k) := |x| \]
for all $-1 \leq x < 1$ and $k \in \mathbb{Z}$. This implies that
\[ |h(x) - h(y)| \leq |x - y| \]
for all $x, y \in \mathbb{R}$. For the proof, let $x, y \in \mathbb{R}$ and $n \in \mathbb{Z}$ be such that $2n \leq \min\{x, y\}$. Then
\[ h(v) = \int_{2n}^{v} g(u) \, du \]
for all $v \in \mathbb{R}$, where $g : \mathbb{R} \to \mathbb{R}$ is defined by
\[ g(y + 2k) := \begin{cases} -1 & \text{if } -1 \leq y < 0 \\ 1 & \text{if } 0 \leq y < 1 \end{cases} \]
for all $-1 \leq y < 1$ and $k \in \mathbb{Z}$. Hence if $x \leq y$
\[ |h(x) - h(y)| = \left| \int_x^y g(u) \, du \right| \leq \int_x^y |g(u)| \, du = y - x = |x - y| \]
and if $y \leq x$
\[ |h(x) - h(y)| = \left| \int_y^x g(u) \, du \right| \leq \int_y^x |g(u)| \, du = x - y = |x - y| . \]
In the next step, we define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) := \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n h(4^n x) \]
for all $x \in \mathbb{R}$. Since
\[ \left| \left( \frac{3}{4} \right)^n h(4^n x) \right| \leq \left( \frac{3}{4} \right)^n \]
for all $x \in \mathbb{R}$, the summability of the sequence $(3/4), (3/4)^2, \dots$ and the continuity of $(\mathbb{R} \to \mathbb{R}, x \mapsto (3/4)^n h(4^n x))$ for every $n \in \mathbb{N}$, it follows by Theorems 3.4.4, 3.4.8 that $f$ is continuous. In the following, we show that $f$ is nowhere differentiable. For this, let $x \in \mathbb{R}$ and $m \in \mathbb{N}^*$. Define
\[ \delta_m := \begin{cases} 4^{-m}/2 & \text{if } (4^m x, 4^m x + (1/2)) \text{ contains no integer} \\ -4^{-m}/2 & \text{if } (4^m x - (1/2), 4^m x) \text{ contains no integer} \end{cases} \]
and
\[ \gamma_{mn} := \frac{h(4^n (x + \delta_m)) - h(4^n x)}{\delta_m} \]
for every $n \in \mathbb{N}$. When $n > m$, $4^n \delta_m = \pm 4^{n-m}/2$ is an even integer which implies that $\gamma_{mn} = 0$. When $n \leq m$, it follows that
\[ |\gamma_{mn}| = \frac{|h(4^n (x + \delta_m)) - h(4^n x)|}{|\delta_m|} \leq \frac{|4^n (x + \delta_m) - 4^n x|}{|\delta_m|} = 4^n . \]
Moreover, by the choice of $\delta_m$, no integer lies strictly between $4^m x$ and $4^m (x + \delta_m)$, so that $h$ has constant slope $\pm 1$ between these two points and $|\gamma_{mm}| = 4^m$. We conclude that
\[ \left| \frac{f(x + \delta_m) - f(x)}{\delta_m} \right| = \left| \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n \frac{h(4^n (x + \delta_m)) - h(4^n x)}{\delta_m} \right| = \left| \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n \gamma_{mn} \right| = \left| \sum_{n=0}^{m} \left( \frac{3}{4} \right)^n \gamma_{mn} \right| \geq \left( \frac{3}{4} \right)^m 4^m - \sum_{n=0}^{m-1} \left( \frac{3}{4} \right)^n 4^n = 3^m - \sum_{n=0}^{m-1} 3^n = \frac{1}{2} \, ( 3^m + 1 ) . \]
Hence the sequence
\[ \frac{f(x + \delta_1) - f(x)}{\delta_1} , \frac{f(x + \delta_2) - f(x)}{\delta_2} , \dots \]
is unbounded and hence not convergent, but
\[ \lim_{m \to \infty} \delta_m = 0 . \]
Therefore, $f$ is not differentiable in $x$.

As another application of uniform convergence of series of functions, we construct a continuous plane-filling curve, i.e., continuous functions $f_1$ and $f_2$ from $[0,1]$ to $[0,1]$ such that the corresponding map $f : [0,1] \to [0,1]^2$, defined by
\[ f(t) := (f_1(t), f_2(t)) \]
for every $t \in [0,1]$, is surjective. The first construction of such a curve, by Peano in 1890, shocked the mathematical community. The map $f$ can be viewed as a parametrization of its range. Since we experience the domain of $f$ as ‘one-dimensional’, intuition expects the range of $f$ to be ‘one-dimensional’ as well, i.e., to be a ‘curve’.
But in this special case that curve is the ‘two-dimensional’ interval $[0,1]^2$. In addition, $f$ is continuous in the sense that the corresponding projections $f_1, f_2$ on the coordinate axes are continuous. This seems to contradict common sense. Also here, the method of construction is sufficiently general to exclude that the result could be called isolated or pathological.

Fig. 102: Curve from Example 3.4.14 resulting from truncating the sums after $k = 4$.

Example 3.4.14. (A plane-filling curve) Let $h : [0,2] \to \mathbb{R}$ be some continuous function such that $h(t) = 0$ for all $t \in [0, 1/3] \cup [5/3, 2]$ and $h(t) = 1$ for all $t \in [2/3, 4/3]$. We consider the 2-periodic continuous extension of this function to the whole of $\mathbb{R}$ which will also be denoted by $h$. Then we define $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ by
\[ f_1(t) := \sum_{k=1}^{\infty} \frac{h(3^{2k-2} t)}{2^k} , \quad f_2(t) := \sum_{k=1}^{\infty} \frac{h(3^{2k-1} t)}{2^k} \]
for all $t \in \mathbb{R}$. By Theorem 3.4.8, it follows that both series converge pointwise absolutely as well as uniformly on $\mathbb{R}$ and hence by Theorem 3.4.4 that $f_1$ and $f_2$ are continuous.

Fig. 103: Graph of truncated $f_1$ from Example 3.4.14 corresponding to truncation of the sum after $k = 1, 2, 3$ and $4$.

In the following, we show that $f([0,1]) = [0,1]^2$ where $f : \mathbb{R} \to \mathbb{R}^2$ is the continuous curve defined by $f(t) := (f_1(t), f_2(t))$ for all $t \in \mathbb{R}$. For this, let $x, y \in [0,1]$ and
\[ x = \sum_{k=1}^{\infty} \frac{x_k}{2^k} , \quad y = \sum_{k=1}^{\infty} \frac{y_k}{2^k} \]
their binary representation where $x_1, x_2, \dots$ and $y_1, y_2, \dots$ are in $\{0, 1\}$. Then we define $t \in [0,1]$ by
\[ t := 2 \sum_{k=1}^{\infty} \frac{t_k}{3^k} \]
where $t_{2k-1} := x_k$ and $t_{2k} := y_k$ for all $k \in \mathbb{N}^*$. Then it follows for every $n \in \mathbb{N}$ that
\[ h(3^n t) = h\left( 2 \sum_{k=1}^{\infty} \frac{t_k}{3^{k-n}} \right) = h\left( 2 \sum_{k=n+1}^{\infty} \frac{t_k}{3^{k-n}} \right) = h\left( 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^k} \right) , \]
where the second equality uses that $h$ is 2-periodic. In case that $t_{n+1} = 0$,
\[ 0 \leq 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^k} \leq 2 \sum_{k=2}^{\infty} \frac{1}{3^k} = \frac{1}{3} \]
and hence $h(3^n t) = 0 = t_{n+1}$, and in case that $t_{n+1} = 1$,
\[ \frac{2}{3} \leq 2 \sum_{k=1}^{\infty} \frac{t_{n+k}}{3^k} \leq 2 \sum_{k=1}^{\infty} \frac{1}{3^k} = 1 , \]
and hence $h(3^n t) = 1 = t_{n+1}$.
As a consequence,
\[ f_1(t) = \sum_{k=1}^{\infty} \frac{h(3^{2k-2} t)}{2^k} = \sum_{k=1}^{\infty} \frac{t_{2k-1}}{2^k} = \sum_{k=1}^{\infty} \frac{x_k}{2^k} = x , \quad f_2(t) = \sum_{k=1}^{\infty} \frac{h(3^{2k-1} t)}{2^k} = \sum_{k=1}^{\infty} \frac{t_{2k}}{2^k} = \sum_{k=1}^{\infty} \frac{y_k}{2^k} = y . \]
Note in particular that $f(0) = (0,0)$ and $f(1) = (1,1)$.

Fig. 104: Graph of truncated $f_2$ from Example 3.4.14 corresponding to truncation of the sum after $k = 1, 2, 3$ and $4$.

In the remainder of this section, we study power series, which are sequences of polynomials $p_0, p_1, \dots$ that are associated to a sequence of coefficients $a_0, a_1, \dots$ of real numbers and an ‘expansion point’ $x_0 \in \mathbb{R}$, and are defined by
\[ p_n(x) := \sum_{k=0}^{n} a_k (x - x_0)^k \]
for every $x \in \mathbb{R}$ and $n \in \mathbb{N}$ where $(x - x_0)^0 := 1$ for every $x \in \mathbb{R}$. Particular examples are the Taylor polynomials corresponding to a function $f$ defined as well as infinitely often differentiable on a non-trivial open interval $I$. According to Taylor’s theorem, Theorem 2.5.25, for $x_0 \in I$, $x \in I$ and $n \in \mathbb{N}$, there is $\xi_n$ in the closed interval between $x_0$ and $x$ such that
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + \frac{f^{(n+1)}(\xi_n)}{(n + 1)!} (x - x_0)^{n+1} \]
where $f^{(0)} := f$ and $(x - x_0)^0 := 1$. The Taylor series of $f$ around $x_0$ is defined as the power series of Taylor polynomials $p_0, p_1, \dots$ corresponding to the sequence $f(x_0), f'(x_0), \dots$ and the expansion point $x_0$ where
\[ p_n(x) := \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]
for every $x \in \mathbb{R}$ and $n \in \mathbb{N}$.

First, the question will be investigated for what values of $x$ a given power series converges. In case that the sequence of coefficients has only finitely many non-zero members, this is of course trivially the case for every $x \in \mathbb{R}$. Hence in this connection, this case is not further considered. The following lemma shows that a ‘too fast growing sequence of coefficients’ leads to a power series that converges only in the expansion point. In such a case, we say that the series has the convergence radius $0$.

Lemma 3.4.15. Let $a_0, a_1, \dots$
be a sequence of real numbers such that the set
\[ \{ |a_1|, |a_2|^{1/2}, |a_3|^{1/3}, \dots \} \]
is unbounded. Then the sequence $a_0, a_1 x, a_2 x^2, \dots$ is not summable for any non-zero real $x$.

Proof. For this, let $x$ be some non-zero real number. Then also
\[ \{ |a_1 x|, |a_2 x^2|^{1/2}, |a_3 x^3|^{1/3}, \dots \} \]
is unbounded and hence there exists for every $N \in \mathbb{N}$ some $n \in \mathbb{N}^*$ such that
\[ |a_n x^n| \geq N^n . \]
Hence the sequence $a_n x^n$, $n \in \mathbb{N}$, does not converge to zero and therefore is also not summable by Corollary 3.3.29.

The following fundamental theorem gives important insight into the convergence properties of power series whose coefficients satisfy a certain growth condition. By application of the root test, Theorem 3.3.33, it shows that such a series converges on a symmetric open interval $(x_0 - r, x_0 + r)$ around the expansion point $x_0$ where $r > 0$ is the so called radius of convergence and is defined in terms of the coefficients. That radius $r$ is maximal since the power series diverges for all $x$ in the complement of the closed interval $[x_0 - r, x_0 + r]$. Further, the series converges uniformly on every closed interval of $\mathbb{R}$ that is contained in $(x_0 - r, x_0 + r)$. Finally, the power series originating from the given series by differentiation has the same radius of convergence as the original series.

Theorem 3.4.16. (Power series) Let $x_0 \in \mathbb{R}$, $a_0, a_1, \dots$ be a sequence of real numbers which contains infinitely many non-zero members and is such that the set
\[ \{ |a_1|, |a_2|^{1/2}, |a_3|^{1/3}, \dots \} \]
is bounded from above. Finally, let $N \in \mathbb{N}^*$ and
\[ r_N := \frac{1}{\sup \{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \}} . \]

(i) Then the sequence $a_0, a_1 (x - x_0), a_2 (x - x_0)^2, \dots$ is absolutely summable for every $x \in (x_0 - r_N, x_0 + r_N)$, and the series of polynomials defined by
\[ S_n(x) := \sum_{k=0}^{n} a_k (x - x_0)^k \]
for every $x \in \mathbb{R}$ is uniformly convergent on $[x_0 - r', x_0 + r']$ for every $r'$ satisfying $0 \leq r' < r_N$.

(ii) The number
\[ r := \frac{1}{\lim_{N \to \infty} \sup \{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \}} , \]
where we set $1/0 := \infty$, is called the radius of convergence of the ‘power series’ $S_0, S_1, \dots$. For any $x \in \mathbb{R}$ such that $|x - x_0| > r$, $S_0(x), S_1(x), \dots$ is divergent.

(iii) The power series $S_0, S_1, \dots$ and $S_0', S_1', \dots$ have the same radius of convergence.

Proof. ‘(i)’: First, it follows by the definition of $r_N$ that
\[ |a_n|^{1/n} \leq \frac{1}{r_N} \]
for all $n \in \mathbb{N}$ satisfying $n \geq N$. Further let $x \in (x_0 - r_N, x_0 + r_N)$, then
\[ |a_n (x - x_0)^n|^{1/n} = |a_n|^{1/n} |x - x_0| \leq \frac{|x - x_0|}{r_N} < 1 \]
for $n \in \mathbb{N}$ satisfying $n \geq N$, and hence it follows by Theorem 3.3.33 that the sequence $a_0, a_1 (x - x_0), a_2 (x - x_0)^2, \dots$ is absolutely summable. Further, let $r'$ be such that $0 \leq r' < r_N$. Then it follows for every $k \in \{ N, N + 1, \dots \}$ and every $x \in [x_0 - r', x_0 + r']$ that
\[ |a_k (x - x_0)^k| \leq |a_k| \, r'^{\,k} \leq \left( \frac{r'}{r_N} \right)^k \]
and hence, obviously, by Theorem 3.4.8 the uniform convergence of $S_0, S_1, \dots$ on $[x_0 - r', x_0 + r']$.

‘(ii)’: First, it follows that $r$ is well-defined since $1/r_1, 1/r_2, \dots$ is decreasing and bounded from below by $0$ and hence convergent to some non-negative real number. In the case that this number is different from zero, it follows that
\[ \lim_{N \to \infty} \sup \{ |a_N|^{1/N}, |a_{N+1}|^{1/(N+1)}, \dots \} = \frac{1}{r} . \]
Hence, if $x \in \mathbb{R}$ is such that $|x - x_0| > r$, there is an infinite number of $n \in \mathbb{N}^*$ such that
\[ |a_n|^{1/n} > \frac{1}{|x - x_0|} \]
and hence the sequence $a_n (x - x_0)^n$, $n \in \mathbb{N}$, does not converge to zero and therefore is also not summable by Corollary 3.3.29.

‘(iii)’: First, we note that the series $S_0', S_1', \dots$ is convergent for some $x \in \mathbb{R}$ if and only if $(\mathrm{id}_{\mathbb{R}} - x_0) \, S_0', (\mathrm{id}_{\mathbb{R}} - x_0) \, S_1', \dots$ is convergent in $x$ and hence that both power series have the same radius of convergence. Further for $k \in \{ 3, 4, \dots \}$, it follows by (2.5.12) that
\[ |a_k|^{1/k} \leq e^{\ln(k)/k} \, |a_k|^{1/k} = ( k \, |a_k| )^{1/k} \leq e \, |a_k|^{1/k} \]
and that $\exp(\ln(3)/3), \exp(\ln(4)/4), \dots$ is decreasing and convergent to $1$. Hence, obviously, it follows that the convergence radii of $(\mathrm{id}_{\mathbb{R}} - x_0) \, S_0', (\mathrm{id}_{\mathbb{R}} - x_0) \, S_1', \dots$ and $S_0, S_1, \dots$ are the same.
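The quantities $r_N$ of Theorem 3.4.16 can be explored numerically. The sketch below replaces the supremum over all $n \geq N$ by a maximum over a finite window of indices, which is an assumption made only for this illustration and is harmless here because $|a_n|^{1/n}$ is monotone for the chosen coefficients. The names `radius_estimate`, `derivative_coeff`, `geometric` and the window length are hypothetical. The example uses $a_n = 2^n$, for which the radius of convergence is $1/2$, and the coefficients $(n+1) a_{n+1}$ of the termwise differentiated series, for which part (iii) predicts the same radius.

```python
import math


def radius_estimate(coeff, N, window=200):
    """1 / max{|a_n|^(1/n) : N <= n < N + window}, a finite stand-in for r_N (N >= 1)."""
    s = max(abs(coeff(n)) ** (1.0 / n) for n in range(N, N + window))
    return math.inf if s == 0.0 else 1.0 / s


def derivative_coeff(coeff):
    """Coefficients (n + 1) a_{n+1} of the termwise differentiated power series."""
    return lambda n: (n + 1) * coeff(n + 1)


def geometric(n):
    """a_n = 2^n, a power series with radius of convergence 1/2."""
    return 2.0 ** n


# Columns: N, estimate for the series, estimate for the differentiated series.
for N in (5, 50, 400):
    print(N, radius_estimate(geometric, N),
          radius_estimate(derivative_coeff(geometric), N))
```

For the differentiated series the estimates should increase towards $1/2$ as $N$ grows, reflecting that $\exp(\ln(k)/k)$ tends to $1$ in the proof of part (iii).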
By application of Theorems 3.4.6, 3.4.5, we immediately conclude the following important corollary.

Corollary 3.4.17. Let $x_0$, $a_0, a_1, \dots$ and $r$ be as in Theorem 3.4.16. Then $f : (x_0 - r, x_0 + r) \to \mathbb{R}$ defined by
\[ f(x) := \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
for all $x \in (x_0 - r, x_0 + r)$ is infinitely often differentiable with derivative
\[ f^{(n)}(x) = \sum_{k=n}^{\infty} \frac{k!}{(k - n)!} \, a_k (x - x_0)^{k-n} \]
and in particular
\[ a_n = \frac{f^{(n)}(x_0)}{n!} \]
for every $n \in \mathbb{N}$ and $x \in (x_0 - r, x_0 + r)$. Further for every $a, b \in \mathbb{R}$ such that $a \leq b$ and $[a,b] \subset (x_0 - r, x_0 + r)$,
\[ \int_a^b f(x) \, dx = \sum_{k=0}^{\infty} a_k \int_a^b (x - x_0)^k \, dx = \sum_{k=0}^{\infty} \frac{a_k}{k + 1} \left[ (b - x_0)^{k+1} - (a - x_0)^{k+1} \right] . \]

Proof. The statement is a simple consequence of Theorems 3.4.16, 3.4.6 and 3.4.5.

It remains the question concerning the convergence of a power series with convergence radius $r > 0$ and expansion point $x_0$ in the points $x_0 - r$ and $x_0 + r$. From examples, we will see later that the series can be divergent in both points, convergent in one of them or convergent in both. On the other hand, if the series is convergent in such a point, then the sum of the series as a function of $x \in (x_0 - r, x_0 + r)$ is extendable to a continuous function on the interval resulting from $(x_0 - r, x_0 + r)$ by addition of that point.

Theorem 3.4.18. (Abel’s theorem) Let $x_0 \in \mathbb{R}$, $a_0, a_1, \dots$, $r$ be as in Theorem 3.4.16 and $f : (x_0 - r, x_0 + r) \to \mathbb{R}$ be defined by
\[ f(x) := \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
for all $x \in (x_0 - r, x_0 + r)$. Further, let $a_0 r^0, a_1 r^1, \dots$ be summable. Then
\[ \lim_{x \to x_0 + r} f(x) = \sum_{k=0}^{\infty} a_k r^k . \]

Proof. For this, let $S_{-1}, S_0, S_1, \dots$ be the sequence of partial sums of $a_0 r^0, a_1 r^1, \dots$, $S_{-1} := 0$ and
\[ S := \sum_{k=0}^{\infty} a_k r^k . \]
Then it follows that
\[ \sum_{k=0}^{n} a_k (x - x_0)^k = \sum_{k=0}^{n} a_k r^k \left( \frac{x - x_0}{r} \right)^k = \sum_{k=0}^{n} (S_k - S_{k-1}) \left( \frac{x - x_0}{r} \right)^k = S_n \left( \frac{x - x_0}{r} \right)^n + \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{n-1} S_k \left( \frac{x - x_0}{r} \right)^k \]
for every $x \in (x_0 - r, x_0 + r)$, $n \in \mathbb{N}$ and hence by Theorem 3.3.9
\[ f(x) = \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} S_k \left( \frac{x - x_0}{r} \right)^k . \]
Further, if $M > 0$ is some upper bound for $|S_0|, |S_1|, \dots$, it follows for given $\varepsilon > 0$, $n_0 \in \mathbb{N}^*$ such that $|S_n - S| \leq \varepsilon/2$ for all $n \in \{ n_0, n_0 + 1, \dots \}$ and $x \in (x_0 - r, x_0 + r)$ satisfying
\[ r + x_0 - x < \min\left\{ r, \frac{r \, \varepsilon}{2 n_0 (M + |S|)} \right\} \]
that
\[ |f(x) - S| = \left| \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} (S_k - S) \left( \frac{x - x_0}{r} \right)^k \right| \leq \left( 1 - \frac{x - x_0}{r} \right) \sum_{k=0}^{\infty} |S_k - S| \left( \frac{|x - x_0|}{r} \right)^k \leq n_0 (M + |S|) \left( 1 - \frac{x - x_0}{r} \right) + \frac{\varepsilon}{2} \leq \varepsilon . \]

Example 3.4.19. By Examples 3.3.19, 3.4.9 and the previous theorem, it follows that the sum of the alternating harmonic series is given by
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{n + 1} = \ln(2) . \]

Fig. 105: Partial sums of the alternating harmonic series and graph of the constant function of value $\ln(2)$.

Example 3.4.20. As another application of Theorem 3.4.18, we prove Leibniz’s result that
\[ \frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2n + 1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots . \]
We use as basis that
\[ \frac{\pi}{4} = 1 - \int_0^1 \frac{u^2}{1 + u^2} \, du \]
which was shown in the introduction to this section. Since
\[ \frac{1}{1 + u^2} = \frac{1}{1 - (-u^2)} = \sum_{k=0}^{\infty} (-u^2)^k = \sum_{k=0}^{\infty} (-1)^k u^{2k} \]
for every $u \in \mathbb{R}$ satisfying $|u| < 1$, it follows that
\[ \frac{u^2}{1 + u^2} = \sum_{k=0}^{\infty} (-1)^k u^{2(k+1)} = \sum_{l=1}^{\infty} (-1)^{l-1} u^{2l} \tag{3.4.8} \]
for every $u \in \mathbb{R}$ satisfying $|u| < 1$ where we use the usual convention that $x^0 := 1$ for all $x \in \mathbb{R}$. Hence, taking into account that the sequence $0, 1, -1, 1, -1, \dots$ diverges, the power series corresponding to the sequence $0, 1, -1, 1, -1, \dots$ has convergence radius $1$. Hence it follows by Corollary 3.4.17 for every $0 \leq x < 1$ that
\[ 1 - \int_0^x \frac{u^2}{1 + u^2} \, du = 1 - \int_0^x \sum_{l=1}^{\infty} (-1)^{l-1} u^{2l} \, du = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \int_0^x u^{2l} \, du = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \left[ \frac{u^{2l+1}}{2l + 1} \right]_0^x = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \frac{x^{2l+1}}{2l + 1} . \]
Further, by Dirichlet’s test, it follows that the sequence $0, 1/3, -1/5, 1/7, \dots$ is summable. Since the function that associates to every $x \in [0,1]$ the value
\[ \int_0^x \frac{u^2}{1 + u^2} \, du \]
is in particular continuous, it follows by Theorem 3.4.18 that
\[ 1 - \int_0^1 \frac{u^2}{1 + u^2} \, du = 1 - \sum_{l=1}^{\infty} (-1)^{l-1} \frac{1}{2l + 1} \]
and hence Leibniz’ result (3.4.2).

The following example gives a standard application of power series expansions to the solution of differential equations. In this, it is assumed that the solution can be expanded into a power series around some expansion point.
Usually, that expansion point is chosen to be a point where additional information on the solution is available. In the next step, the function in the differential equation is replaced by the power series. As a result of a subsequent calculation, a power series is obtained whose sum as a function of the variable vanishes in every point of its still unknown domain. As a consequence of Corollary 3.4.17, all coefficients of the last power series need to vanish. Usually, this leads to a recursion relation for the coefficients of the power series for the solution. If this recursion relation can be solved, one tries to determine the radius of convergence of the corresponding series. If that radius is greater than zero, it follows that the obtained power series is indeed a solution of the differential equation. Precisely in this way, the majority of special functions of mathematics have been found. The associated differential equations had their roots in applications. This is also true for the following differential equation, which is related to Bessel's differential equation by a simple transformation.

Fig. 106: Graphs of $J_0$, $J_1$ and $J_2$.

Example 3.4.21. Let $\nu \in [0, \infty)$. Find a solution $f_\nu : \mathbb{R} \to \mathbb{R}$ of the differential equation
\[ x f_\nu''(x) + (2\nu + 1) f_\nu'(x) + x f_\nu(x) = 0 \qquad (3.4.9) \]
for all $x \in \mathbb{R}$ and such that $f_\nu(0) = 1 / \Gamma(\nu + 1)$.

Solution: We assume that $f_\nu$ has a representation as a power series around 0,
\[ f_\nu(x) = \sum_{k=0}^{\infty} a_k x^k \qquad (3.4.10) \]
for all $x \in (-r, r)$, where $a_0, a_1, \dots$ is some sequence of real numbers with corresponding convergence radius $r > 0$ which are to be determined. Then it follows by Corollary 3.4.17 that
\[ 0 = x f_\nu''(x) + (2\nu + 1) f_\nu'(x) + x f_\nu(x) = \sum_{k=1}^{\infty} k (k-1) a_k x^{k-1} + (2\nu + 1) \sum_{k=1}^{\infty} k a_k x^{k-1} + \sum_{k=0}^{\infty} a_k x^{k+1} \]
\[ = \sum_{k=1}^{\infty} k (k + 2\nu) a_k x^{k-1} + \sum_{k=0}^{\infty} a_k x^{k+1} = (2\nu + 1) a_1 + \sum_{k=0}^{\infty} \left[ (k+2)(k + 2\nu + 2) a_{k+2} + a_k \right] x^{k+1} , \]
which is satisfied for all $x \in (-r, r)$ if
\[ a_0 = \frac{1}{\Gamma(\nu + 1)} , \quad a_1 = 0 , \quad a_{k+2} = - \frac{a_k}{(k+2)(k + 2\nu + 2)} \]
for every $k \in \mathbb{N}$ or explicitly if
\[ a_{2k} = \frac{(-1/4)^k}{k! \, \Gamma(\nu + k + 1)} , \quad a_{2k+1} = 0 \]
for all $k \in \mathbb{N}$.
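The recursion just derived can be checked numerically. The following Python sketch is an illustration only and not part of the text's argument; the function names `bessel_coefficients`, `closed_form`, `f_nu` and `residual` as well as the truncation order `count` are assumptions made for this example. It generates the coefficients from the recursion, compares them with the explicit formula, and evaluates the left-hand side of (3.4.9) for the truncated series.

```python
import math


def bessel_coefficients(nu, count):
    """Coefficients a_0, ..., a_{count-1} from the recursion
    a_0 = 1/Gamma(nu+1), a_1 = 0, a_{k+2} = -a_k / ((k+2)(k+2nu+2))."""
    a = [0.0] * count
    a[0] = 1.0 / math.gamma(nu + 1.0)
    for k in range(count - 2):
        a[k + 2] = -a[k] / ((k + 2) * (k + 2 * nu + 2))
    return a


def closed_form(nu, k):
    """Explicit formula a_{2k} = (-1/4)^k / (k! Gamma(nu+k+1))."""
    return (-0.25) ** k / (math.factorial(k) * math.gamma(nu + k + 1.0))


def f_nu(nu, x, count=60):
    """Truncated power series sum_{k < count} a_k x^k (hypothetical cut-off)."""
    a = bessel_coefficients(nu, count)
    return sum(c * x ** k for k, c in enumerate(a))


def residual(nu, x, count=60):
    """Value of x f'' + (2nu+1) f' + x f for the truncated series,
    using termwise differentiation as in Corollary 3.4.17."""
    a = bessel_coefficients(nu, count)
    f = sum(c * x ** k for k, c in enumerate(a))
    df = sum(k * c * x ** (k - 1) for k, c in enumerate(a) if k >= 1)
    d2f = sum(k * (k - 1) * c * x ** (k - 2) for k, c in enumerate(a) if k >= 2)
    return x * d2f + (2 * nu + 1) * df + x * f
```

For moderate arguments, `residual(nu, x)` should be of the order of rounding errors, and the even coefficients from `bessel_coefficients` should agree with `closed_form` up to floating-point accuracy.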
Since
\[ \lim_{k \to \infty} \frac{a_{2(k+1)} x^{2(k+1)}}{a_{2k} x^{2k}} = - \lim_{k \to \infty} \frac{x^2}{4 (k+1)(k + \nu + 1)} = 0 \]
for every $x \in \mathbb{R} \setminus \{0\}$, it follows by Theorem 3.3.31 that the convergence radius of the corresponding power series is infinite and hence that $f_\nu : \mathbb{R} \to \mathbb{R}$ defined by (3.4.10) has the required properties. In terms of $f_\nu$, the so-called Bessel function $J_\nu$ of the first kind and of order $\nu$ is given by
\[ J_\nu(x) := \left( \frac{x}{2} \right)^{\nu} f_\nu(x) = \left( \frac{x}{2} \right)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(\nu + k + 1)} \left( \frac{x^2}{4} \right)^{k} \qquad (3.4.11) \]
for all $x \in (0, \infty)$. By (3.4.11), (3.4.9), it follows that $J_\nu$ satisfies the differential relation
\[ x^2 J_\nu''(x) + x J_\nu'(x) + (x^2 - \nu^2) J_\nu(x) = 0 \]
for all $x \in (0, \infty)$.

As a consequence of the absence of a clear notion of limits, in the 17th and 18th century, power series were generally treated like polynomials. Of course, the product of two polynomials is another polynomial. The standard way to show this is to use the distributive law to write the product as a combination of powers of the variable and then to collect like powers of the variable. From the last, the coefficients of the resulting polynomial can be read off. For the above reason, the same was done for power series, which led to the definition of the product of power series. If in that definition the value 1 is substituted for the variable, we arrive at the so-called Cauchy product of series. Indeed, the following shows that the Cauchy product of an absolutely summable and a summable sequence is summable with corresponding sum given by the product of the sums of the factors.

Theorem 3.4.22. (Cauchy product of series) Let $a_0, a_1, \dots$ and $b_0, b_1, \dots$ be absolutely summable and summable, respectively, sequences of real numbers. We define
\[ c_n := \sum_{k=0}^{n} a_k b_{n-k} \qquad (3.4.12) \]
for all $n \in \mathbb{N}$. Then $c_0, c_1, \dots$ is summable and
\[ \sum_{k=0}^{\infty} c_k = \left( \sum_{k=0}^{\infty} a_k \right) \left( \sum_{k=0}^{\infty} b_k \right) . \]

Proof. For this, let $A_0, A_1, \dots$, $B_0, B_1, \dots$ and $C_0, C_1, \dots$ be the sequences of partial sums of $a_0, a_1, \dots$, $b_0, b_1, \dots$ and $c_0, c_1, \dots$, respectively.
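Further, let $\beta_n := B_n - B$ for every $n \in \mathbb{N}$ where
\[ B := \sum_{k=0}^{\infty} b_k . \]
In a first step, it follows by induction that
\[ C_n = \sum_{k=0}^{n} a_k B_{n-k} \]
and hence that
\[ C_n = \sum_{k=0}^{n} a_k (B + \beta_{n-k}) = A_n B + \sum_{k=0}^{n} a_k \beta_{n-k} \]
for every $n \in \mathbb{N}$. Since $\lim_{n \to \infty} \beta_n = 0$, for given $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $|\beta_n| \le \varepsilon$ for all $n \in \{N, N+1, \dots\}$. Hence for such $n$
\[ \left| \sum_{k=0}^{n} a_k \beta_{n-k} \right| = \left| \sum_{k=0}^{n} a_{n-k} \beta_k \right| \le \sum_{k=0}^{n} |a_{n-k}| \, |\beta_k| = \sum_{k=0}^{N} |a_{n-k}| \, |\beta_k| + \sum_{k=N+1}^{n} |a_{n-k}| \, |\beta_k| \le M \sum_{k=n-N}^{n} |a_k| + \varepsilon \sum_{k=0}^{\infty} |a_k| \]
where $M > 0$ is such that $|\beta_k| \le M$ for all $k \in \mathbb{N}$. Further, by Theorem 3.3.28, there is $N' \in \mathbb{N}$ such that
\[ \sum_{k=n-N}^{n} |a_k| \le \varepsilon \]
for all $n \in \{N + N', N + N' + 1, \dots\}$. Hence for such $n$
\[ \left| \sum_{k=0}^{n} a_{n-k} \beta_k \right| \le \left( M + \sum_{k=0}^{\infty} |a_k| \right) \varepsilon . \]
As a consequence, the second summand in the above representation of $C_n$ converges to 0 for $n \to \infty$, and hence $C_0, C_1, \dots$ converges to the product of the sums of $a_0, a_1, \dots$ and $b_0, b_1, \dots$.

The following example shows that the Cauchy product of two conditionally summable sequences is not necessarily summable.

Example 3.4.23. Define
\[ a_n := b_n := \frac{(-1)^n}{\sqrt{n+1}} \quad \text{and} \quad c_n := \sum_{k=0}^{n} a_k b_{n-k} \]
for all $n \in \mathbb{N}$. Then $a_0, a_1, \dots$ and $b_0, b_1, \dots$ are conditionally summable. Further,
\[ c_n = \sum_{k=0}^{n} \frac{(-1)^k}{\sqrt{k+1}} \, \frac{(-1)^{n-k}}{\sqrt{n-k+1}} = (-1)^n \sum_{k=0}^{n} \frac{1}{\sqrt{(k+1)(n-k+1)}} = (-1)^n \sum_{k=0}^{n} \frac{1}{\sqrt{\left( \frac{n}{2} + 1 \right)^2 - \left( \frac{n}{2} - k \right)^2}} \]
and hence
\[ |c_n| \ge \frac{2(n+1)}{n+2} \]
for all $n \in \mathbb{N}$. As a consequence, $c_0, c_1, \dots$ does not converge to 0 and therefore is not summable.

The following example gives an application of the Cauchy product of series to the summation of arithmetic series. As a result, we will obtain a systematic method for the derivation of sums of arithmetic series. The key idea comes from the observation that the coefficients of the Cauchy product, see (3.4.12), are given by the partial sums of the first series if $b_k = 1$ for all $k \in \mathbb{N}$.

Example 3.4.24. (Summation of arithmetic series) Show that
\[ \sum_{k=1}^{n} \left[ k (k+1) \cdots (k + m - 1) \right] = \frac{1}{m+1} \, n (n+1) \cdots (n + m) \qquad (3.4.13) \]
holds for all $m, n \in \mathbb{N}^*$.

Solution: For this, let $n \in \mathbb{N}$. In a first step, it follows that
\[ \frac{n!}{(1 - x)^{n+1}} = \sum_{k=0}^{\infty} \frac{(k+n)!}{k!} \, x^k \]
for all $|x| < 1$, including the absolute summability of the series. The proof proceeds by induction over $n \in \mathbb{N}$. For the case $n = 0$, this follows from Example 3.3.2.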
If the statement is true for some $n \in \mathbb{N}$, we conclude by Theorem 3.4.16 and Corollary 3.4.17 that
\[ \frac{(n+1)!}{(1 - x)^{n+2}} = \sum_{k=1}^{\infty} k \, \frac{(k+n)!}{k!} \, x^{k-1} = \sum_{k=0}^{\infty} (k+1) \, \frac{(k+n+1)!}{(k+1)!} \, x^k = \sum_{k=0}^{\infty} \frac{(k+n+1)!}{k!} \, x^k \]
for all $|x| < 1$, including the absolute summability of the last series, and hence the validity of the statement where $n$ is replaced by $n + 1$. Further, it follows by Theorem 3.4.22 that
\[ \frac{1}{1 - x} \, \frac{n!}{(1 - x)^{n+1}} = \left( \sum_{k=0}^{\infty} x^k \right) \left( \sum_{k=0}^{\infty} \frac{(k+n)!}{k!} \, x^k \right) = \sum_{k=0}^{\infty} \left( \sum_{l=0}^{k} \frac{(l+n)!}{l!} \right) x^k \]
for $|x| < 1$. Since
\[ \frac{1}{1 - x} \, \frac{n!}{(1 - x)^{n+1}} = \frac{1}{n+1} \, \frac{(n+1)!}{(1 - x)^{n+2}} = \frac{1}{n+1} \sum_{k=0}^{\infty} \frac{(k+n+1)!}{k!} \, x^k \]
for $|x| < 1$, it follows by Corollary 3.4.17 that
\[ \sum_{l=1}^{k+1} l (l+1) \cdots (l + n - 1) = \sum_{l=0}^{k} (l+1)(l+2) \cdots (l+n) = \sum_{l=0}^{k} \frac{(l+n)!}{l!} = \frac{1}{n+1} \, \frac{(k+n+1)!}{k!} = \frac{1}{n+1} \, (k+1)(k+2) \cdots (k+n+1) \]
for all $k \in \mathbb{N}$ if $n \in \mathbb{N}^*$, and hence (3.4.13) for all $m, n \in \mathbb{N}^*$ (with $n$ and $k+1$ in the roles of $m$ and $n$, respectively).

From (3.4.13), we can iteratively determine the sum of the arithmetic series
\[ S_m(n) := \sum_{k=1}^{n} k^m \]
for all $m, n \in \mathbb{N}^*$. We carry the procedure through for $m = 1$ to 4. For $m = 1$, according to (3.4.13),
\[ S_1(n) = \sum_{k=1}^{n} k = \frac{1}{2} \, n (n+1) \]
for all $n \in \mathbb{N}^*$. For $m = 2$, according to (3.4.13),
\[ S_2(n) + S_1(n) = \sum_{k=1}^{n} \left[ k (k+1) \right] = \frac{1}{3} \, n (n+1)(n+2) \]
and hence
\[ S_2(n) = \frac{1}{3} \, n (n+1)(n+2) - \frac{1}{2} \, n (n+1) = \frac{1}{6} \, n (n+1)(2n+1) \]
for all $n \in \mathbb{N}^*$. For $m = 3$, according to (3.4.13),
\[ S_3(n) + 3 S_2(n) + 2 S_1(n) = \sum_{k=1}^{n} \left[ k (k+1)(k+2) \right] = \frac{1}{4} \, n (n+1)(n+2)(n+3) \]
and hence
\[ S_3(n) = \frac{1}{4} \, n (n+1)(n+2)(n+3) - \frac{1}{2} \, n (n+1)(2n+1) - n (n+1) = \frac{1}{4} \, n (n+1) \left( n^2 + 5n + 6 - 4n - 2 - 4 \right) = \frac{1}{4} \, n (n+1)(n^2 + n) = \frac{1}{4} \, n^2 (n+1)^2 \]
for all $n \in \mathbb{N}^*$. For $m = 4$, according to (3.4.13),
\[ S_4(n) + 6 S_3(n) + 11 S_2(n) + 6 S_1(n) = \sum_{k=1}^{n} \left[ k (k+1)(k+2)(k+3) \right] = \frac{1}{5} \, n (n+1)(n+2)(n+3)(n+4) \]
and hence
\[ S_4(n) = \frac{1}{5} \, n (n+1)(n+2)(n+3)(n+4) - \frac{3}{2} \, n^2 (n+1)^2 - \frac{11}{6} \, n (n+1)(2n+1) - 3 n (n+1) \]
\[ = \frac{1}{30} \, n (n+1) \left[ 6 (n+2)(n+3)(n+4) - 45 n (n+1) - 55 (2n+1) - 90 \right] \]
\[ = \frac{1}{30} \, n (n+1) \left[ 3 \left( 2n^3 + 18 n^2 + 52 n + 48 - 15 n^2 - 15 n - 30 \right) - 55 (2n+1) \right] \]
\[ = \frac{1}{30} \, n (n+1) \left[ 3 \left( 2n^3 + 3 n^2 + 37 n + 18 \right) - 55 (2n+1) \right] \]
\[ = \frac{1}{30} \, n (n+1) \left[ 3 (n^2 + n + 18)(2n+1) - 55 (2n+1) \right] = \frac{1}{30} \, n (n+1)(2n+1)(3n^2 + 3n - 1) \]
for all $n \in \mathbb{N}^*$. As a consequence, we arrive at the following results:
\[ S_1(n) = \frac{1}{2} \, n (n+1) , \quad S_2(n) = \frac{1}{6} \, n (n+1)(2n+1) , \quad S_3(n) = \frac{1}{4} \, n^2 (n+1)^2 , \quad S_4(n) = \frac{1}{30} \, n (n+1)(2n+1)(3n^2 + 3n - 1) \]
for all $n \in \mathbb{N}^*$.

The following gives a simple and useful criterion for the convergence of the Taylor series of a function. It is a simple consequence of Taylor's theorem, Theorem 2.5.25, and the fact that $n!$ grows faster with $n \in \mathbb{N}$ than $a^n$ where $a$ is any positive real number.

Theorem 3.4.25. (Taylor expansions) Let $I$ be a non-trivial open interval, $a \in I$ and $f : I \to \mathbb{R}$ be infinitely often differentiable and such that there are $M \ge 0$ and $N \in \mathbb{N}$ such that
\[ |f^{(n)}(x)| \le M^n \]
for all $x \in I$ and $n \in \{N, N+1, \dots\}$. Then
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} \, (x - a)^k \qquad (3.4.14) \]
for all $x \in I$.

Proof. By Theorem 2.5.25, for every $x \in I$ and $n \in \{N, N+1, \dots\}$, there is some $c_{x,n}$ in the closed interval between $a$ and $x$ such that
\[ \left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!} \, (x - a)^k \right| = \frac{|f^{(n)}(c_{x,n})|}{n!} \, |x - a|^n \le \frac{(M |x - a|)^n}{n!} \]
and hence
\[ \lim_{n \to \infty} \left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!} \, (x - a)^k \right| = 0 . \]

Corollary 3.4.26. Let $I$, $a \in I$, $f : I \to \mathbb{R}$ and $M \ge 0$ be as in Theorem 3.4.25. Then
\[ f'(x) = \sum_{k=0}^{\infty} \frac{f^{(k+1)}(a)}{k!} \, (x - a)^k \]
for all $x \in I$, and $F : I \to \mathbb{R}$, defined by
\[ F(x) := \sum_{k=1}^{\infty} \frac{f^{(k-1)}(a)}{k!} \, (x - a)^k \]
for all $x \in I$, is an anti-derivative of $f$.

Proof. Obviously, without restriction, we can assume that $M, N \ge 1$. Further, let $g := f'$. Then $g$ is infinitely often differentiable and
\[ |g^{(n)}(x)| = |f^{(n+1)}(x)| \le M^{n+1} \le (M^2)^n \]
for all $n \in \{N, N+1, \dots\}$ and $x \in I$. Hence it follows by Theorem 3.4.25 that
\[ g(x) = \sum_{k=0}^{\infty} \frac{f^{(k+1)}(a)}{k!} \, (x - a)^k \qquad (3.4.15) \]
for all $x \in I$. Further, let $c, d \in I$ be such that $c < a < d$. Then we define $F : (c, d) \to \mathbb{R}$ by
\[ F(x) := \int_c^x f(y) \, dy - \int_c^a f(y) \, dy \]
for every $x \in (c, d)$. Then $F$ is infinitely often differentiable with its first derivative given by the restriction of $f$ to the interval $(c, d)$. Further, $F(a) = 0$ and
\[ |F^{(n)}(x)| = |f^{(n-1)}(x)| \le M^{n-1} \le M^n \]
for all $n \in \{N+1, N+2, \dots\}$ and $x \in (c, d)$. Hence it follows by Theorem 3.4.25 that
\[ F(x) = \sum_{k=1}^{\infty} \frac{f^{(k-1)}(a)}{k!} \, (x - a)^k \qquad (3.4.16) \]
for all $x \in (c, d)$.

Example 3.4.27. By Theorem 3.4.25 it follows that
\[ e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} , \quad \cos(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} , \quad \sin(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!} \]
for all $x \in \mathbb{R}$.

Fig. 107: Graphs of the exponential function and corresponding Taylor polynomials of orders 0, 1 and 2 around 0.

Fig. 108: Graphs of the cosine function and corresponding Taylor polynomials of orders 0, 2, 4 around 0.

Fig. 109: Graphs of the sine function and corresponding Taylor polynomials of orders 1, 3, 5 around 0.

Fig. 110: Graphs of the error function, an associated asymptote and corresponding Taylor polynomials of orders 1, 3, 5 around 0.

Example 3.4.28. Find the power series expansion around zero of the error function defined by
\[ \operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2} \, dy \]
for all $x \ge 0$.

Solution: By Example 3.4.27, it follows that
\[ e^{-y^2} = \sum_{k=0}^{\infty} (-1)^k \frac{y^{2k}}{k!} \]
for all $y \in \mathbb{R}$, including the absolute summability of this sum and also the uniform convergence of the sequence of functions $S_0, S_1, \dots$ defined by
\[ S_n(y) := \sum_{k=0}^{n} (-1)^k \frac{y^{2k}}{k!} \]
for every $n \in \mathbb{N}$ and $y \in \mathbb{R}$ on every bounded closed subinterval of $\mathbb{R}$. Hence it follows by Theorem 3.4.5 that
\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-y^2} \, dy = \frac{2}{\sqrt{\pi}} \int_0^x \sum_{k=0}^{\infty} (-1)^k \frac{y^{2k}}{k!} \, dy = \frac{2}{\sqrt{\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \int_0^x y^{2k} \, dy = \frac{2}{\sqrt{\pi}} \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1) \, k!} \]
for all $x \ge 0$.

Example 3.4.29. Find the first three terms in the Taylor expansion of
\[ f(x) := e^{ax} \cos(x) \]
for all $x \in \mathbb{R}$ where $a \in \mathbb{R}$.

Solution: By Theorem 3.4.22 and Example 3.4.27, it follows that the first three terms in the Taylor expansion of $f$ are given by
\[ c_0 + c_1 x + c_2 x^2 , \]
where
\[ c_0 = a_0 b_0 , \quad c_1 = a_0 b_1 + a_1 b_0 , \quad c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0 , \]
\[ a_0 = 1 , \quad a_1 = a , \quad a_2 = a^2 / 2 , \quad b_0 = 1 , \quad b_1 = 0 , \quad b_2 = -1/2 , \]
and hence by
\[ 1 + a x + \frac{a^2 - 1}{2} \, x^2 \]
for all $x \in \mathbb{R}$.

The binomial series was discovered by Newton in 1665, inspired by Wallis' paper 'Arithmetica infinitorum' [98]. He never published this result, but describes its derivation in letters from 1676 to Leibniz. The corresponding Taylor polynomials are routinely used in applications for the purpose of approximation.
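As a hedged illustration of such approximations, the following Python sketch evaluates Taylor polynomials of $(1 + x)^\nu$ around 0 using the generalized binomial coefficients $\nu (\nu - 1) \cdots (\nu - (n - 1)) / n!$. The function names `binomial_coefficient` and `binomial_partial_sum` and the sample values are assumptions chosen for this example only.

```python
def binomial_coefficient(nu, n):
    """Generalized binomial coefficient nu (nu-1) ... (nu-(n-1)) / n!,
    with the value 1 for n = 0 (empty product)."""
    result = 1.0
    for j in range(n):
        result *= (nu - j) / (j + 1)
    return result


def binomial_partial_sum(nu, x, order):
    """Taylor polynomial of degree `order` of (1 + x)^nu around 0."""
    return sum(binomial_coefficient(nu, n) * x ** n for n in range(order + 1))


# Hypothetical sample values: approximate the square root of 1.2.
nu, x = 0.5, 0.2
for order in (1, 2, 4, 8):
    approximation = binomial_partial_sum(nu, x, order)
    print(order, approximation, abs(approximation - (1 + x) ** nu))
```

For $|x| < 1$ the printed errors decrease with increasing order, and for $\nu \in \mathbb{N}$ all coefficients with $n > \nu$ vanish, so that the polynomial of degree $\nu$ already reproduces $(1 + x)^\nu$ up to rounding.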
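Theorem 3.4.25 is not strong enough for the derivation of the binomial series, but there is a simpler method of proof by consideration of an associated differential equation. That method is employed in the following.

Example 3.4.30. (Binomial series) Let $\nu \in \mathbb{R}$. Show that
\[ (1 + x)^{\nu} = \sum_{n=0}^{\infty} \binom{\nu}{n} x^n \]
for all $x \in (-1, 1)$ if $\nu \notin \mathbb{N}$ and all $x \in (-1, \infty)$ if $\nu \in \mathbb{N}$. The coefficients in the series are called 'binomial coefficients'. They are defined by
\[ \binom{\nu}{0} := 1 , \quad \binom{\nu}{n} := \frac{1}{n!} \, \nu (\nu - 1) \cdots (\nu - (n - 1)) \]
for every $n \in \mathbb{N}^*$. Note that
\[ \binom{\nu}{n} = \frac{\Gamma(\nu + 1)}{\Gamma(\nu - n + 1) \, \Gamma(n + 1)} \]
for all $n \in \mathbb{N}$ satisfying $n < \nu + 1$. The series is called a binomial series. Note that in case that $\nu \in \mathbb{N}$, the series terminates since
\[ \binom{\nu}{\nu + k} = 0 \]
for all $k \in \mathbb{N}^*$. In this case, the series coincides with a finite sum.

Solution: First, we notice that there is $n_0 \in \mathbb{N}$ such that
\[ \binom{\nu}{n} = 0 \]
for $n \in \mathbb{N}$ satisfying $n \ge n_0$ if and only if $\nu \in \mathbb{N}$. In this case, the power series coincides with a finite sum and its convergence radius is therefore infinite. In the case that $\nu \notin \mathbb{N}$, it follows that
\[ \left| \frac{\binom{\nu}{n+1} x^{n+1}}{\binom{\nu}{n} x^{n}} \right| = \left| \frac{n!}{(n+1)!} \, \frac{\nu (\nu - 1) \cdots (\nu - n)}{\nu (\nu - 1) \cdots (\nu - (n - 1))} \right| |x| = \frac{|\nu - n|}{n + 1} \, |x| = \frac{n - \nu}{n + 1} \, |x| \]
for all $x \in \mathbb{R} \setminus \{0\}$ and $n \in \mathbb{N}$ satisfying $n \ge \nu$, which converges to $|x|$ for $n \to \infty$. Hence it follows by the ratio test, Theorem 3.3.31, that the series is absolutely summable for every $x \in (-1, 1)$ and not summable for every $|x| > 1$. As a consequence, in this case, the convergence radius of the power series is equal to 1. In the following, we define $I := (-1, \infty)$ if $\nu \in \mathbb{N}$, $I := (-1, 1)$ if $\nu \notin \mathbb{N}$ and $f : I \to \mathbb{R}$ by
\[ f(x) := \sum_{n=0}^{\infty} \binom{\nu}{n} x^n \]
for all $x \in I$. Then,
\[ (1 + x) f'(x) = (1 + x) \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n-1} = \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n-1} + \sum_{n=1}^{\infty} n \binom{\nu}{n} x^{n} = \sum_{n=0}^{\infty} (n+1) \binom{\nu}{n+1} x^{n} + \sum_{n=0}^{\infty} n \binom{\nu}{n} x^{n} = \sum_{n=0}^{\infty} \left[ (n+1) \binom{\nu}{n+1} + n \binom{\nu}{n} \right] x^{n} \]
for every $x \in I$. In particular,
\[ (0 + 1) \binom{\nu}{1} + 0 \cdot \binom{\nu}{0} = \nu = \nu \binom{\nu}{0} \]
and
\[ (n+1) \binom{\nu}{n+1} + n \binom{\nu}{n} = (n+1) \, \frac{1}{(n+1)!} \, \nu (\nu - 1) \cdots (\nu - n) + n \, \frac{1}{n!} \, \nu (\nu - 1) \cdots (\nu - (n - 1)) = \frac{1}{n!} \, \nu (\nu - 1) \cdots (\nu - (n - 1)) \left[ (\nu - n) + n \right] = \nu \binom{\nu}{n} \]
for every $n \in \mathbb{N}^*$. Hence it follows that
\[ (1 + x) f'(x) = \nu f(x) \]
for all $x \in I$. The last is equivalent to
\[ \left[ (1 + \mathrm{id}_I)^{-\nu} f \right]'(x) = 0 \qquad (3.4.17) \]
for all $x \in I$.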
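Then it follows by Theorem 2.5.7 that $(1 + \mathrm{id}_I)^{-\nu} f$ is a constant function and, since $f(0) = 1$, that
\[ f(x) = (1 + x)^{\nu} \]
for all $x \in I$.

The following gives a standard example for the derivation of power series for functions defined in terms of integrals. The proved integral representation for Bessel functions of the first kind is of frequent use in applications [2].

Example 3.4.31. Show that
\[ J_\nu(x) = \frac{1}{\sqrt{\pi} \, \Gamma\!\left( \nu + \frac{1}{2} \right)} \left( \frac{x}{2} \right)^{\nu} \int_0^{\pi} \cos(x \cos \theta) \sin^{2\nu} \theta \, d\theta \]
for all $x > 0$ and $\nu \ge 0$.

Solution: For this, let $\nu \ge 0$. Then it follows by use of the power series expansion of $\cos$ and Theorem 3.4.5 that
\[ \int_0^{\pi} \cos(x \cos \theta) \sin^{2\nu} \theta \, d\theta = \int_0^{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} \cos^{2k} \theta \, \sin^{2\nu} \theta \, d\theta = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!} \, x^{2k} \int_0^{\pi} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta . \]
Further, it follows by change of variables and (3.2.18) that
\[ \int_0^{\pi} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta = \int_0^{\pi/2} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta + \int_{\pi/2}^{\pi} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta \]
\[ = \int_0^{\pi/2} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta + \int_0^{\pi/2} \sin^{2\nu}\!\left( \bar{\theta} + \frac{\pi}{2} \right) \cos^{2k}\!\left( \bar{\theta} + \frac{\pi}{2} \right) d\bar{\theta} \]
\[ = \int_0^{\pi/2} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta + \int_0^{\pi/2} \cos^{2\nu} \bar{\theta} \, \sin^{2k} \bar{\theta} \, d\bar{\theta} = \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma\!\left( k + \frac{1}{2} \right)}{\Gamma(k + \nu + 1)} . \]
For $k \ne 0$, it follows by Legendre's duplication formula (3.2.16) for $\Gamma$ that
\[ \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma\!\left( k + \frac{1}{2} \right)}{\Gamma(k + \nu + 1)} = \sqrt{\pi} \, 2^{1 - 2k} \, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) \Gamma(2k)}{\Gamma(k + \nu + 1) \, \Gamma(k)} = \sqrt{\pi} \, 2^{1 - 2k} \, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) 2k \, \Gamma(2k)}{\Gamma(k + \nu + 1) \, 2k \, \Gamma(k)} = \sqrt{\pi} \, 2^{-2k} \, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) (2k)!}{\Gamma(k + \nu + 1) \, k!} \]
and hence that
\[ \int_0^{\pi} \sin^{2\nu} \theta \cos^{2k} \theta \, d\theta = \sqrt{\pi} \, 2^{-2k} \, \frac{\Gamma\!\left( \nu + \frac{1}{2} \right) (2k)!}{\Gamma(k + \nu + 1) \, k!} \]
for all $k \in \mathbb{N}$ where as usual $0! := 1$. This leads to
\[ \int_0^{\pi} \cos(x \cos \theta) \sin^{2\nu} \theta \, d\theta = \sqrt{\pi} \, \Gamma\!\left( \nu + \frac{1}{2} \right) \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 1)} \left( \frac{x}{2} \right)^{2k} \]
and hence to
\[ \frac{1}{\sqrt{\pi} \, \Gamma\!\left( \nu + \frac{1}{2} \right)} \left( \frac{x}{2} \right)^{\nu} \int_0^{\pi} \cos(x \cos \theta) \sin^{2\nu} \theta \, d\theta = \left( \frac{x}{2} \right)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 1)} \left( \frac{x}{2} \right)^{2k} = J_\nu(x) \]
where the last equality is a consequence of (3.4.11).

Problems

1) Find the interval of convergence of the given series 8̧ a) xn 8̧ , b) xn 1 n2 n0 1 n 8̧ xn c) , d) 3n pn 1q n0 8̧ p3x 2qn n 0 e) 8̧ n 0 g) 5n xn lnpn 2q n0 , f) , h) , 8̧ px 1qn ? , n 1 n0 8̧ n! 10n n0 xn , 8̧ xn plnpn 2qqn n0 for real x.

2) Find the Taylor series of $f$ around $x_0$ and the corresponding convergence radius and interval of convergence.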
a) f pxq : 4x{p1 b) c) d) e) f) g) h) i) j) k) l) 2x 3x2 q , x P R zt1{3, 1u ; x0 f pxq : sinpxq , x P R ; x0 π {4 , f pxq : sin pxq , x P R ; x0 π {4 , f pxq : lnp1 xq , x 1 ; x0 0 , f pxq : sinhpxq , x P R ; x0 0 , f pxq : coshpxq , x P R ; x0 0 , f pxq : x3 3x 7 , x P R ; x0 3 f pxq : 3x , x P R ; x0 0 , f pxq : 1{p1 x x2 q , x P R ; x0 0 f pxq : 1{p1 x3 q , x P R ; x0 0 , f pxq : ex {2 , x P R ; x0 0 , f pxq : lnp1 x2 q , x P R ; x0 0 . 2 2 433 , , 0 , 3) Find the Maclaurin series of f pxq : 1 x2 1 , x P R, and use the result to determine the Maclaurin series of arctan. Finally, show that 8̧ p1qn π . 4 2n 1 n0 4) The Maclaurin series for f : p1, 1q Ñ R defined by for all x P too slowly. f pxq : lnp1 xq p8, 1q is not useful for computation since converging a) Find the Maclaurin series for g : p1, 1q Ñ R defined by g pxq : ln 1 x 1x for all x P p1, 1q. Also, find the convergence radius and interval of convergence of the series. b) Show that the error of truncating the series after n P N terms is equal or smaller than 2 2n 2n 1 x 1 1 x2 for x P p1, 1q. c) Compute lnp2q to four decimal places by using the series obtained in a). Show the accuracy of your result by using the estimate from bq. 5) By use of the Cauchy product of series, determine the Maclaurin series of f . a) f pxq : p1 xq2 , x 1 b) f pxq : lnp1 c) f pxq : rlnp1 xq{p1 , xq , x ¡ 1 xqs , x ¡ 1 2 6) By use of the Cauchy product of series show that for all x, y P R. exppxq exppy q exppx 434 yq . , y 0.5 -1 1 3 5 x Fig. 111: Graphs of f from Problem 11 and the constant function of value 1 which is an asymptote for large positive values of the argument. 7) Calculate the first k nonzero terms in the Taylor expansion of f around x0 . a) b) c) f pxq : ecospxq , x P R ; x0 0, k3 , f pxq : cospxq{p1 xq , |x| 1 ; x0 0 , k 6 ? f pxq : expp x q , x P r0, 8q ; x0 1 , k 3 , , 8) Use the Taylor series of the sine function around π {4 to approximate its value at 470 degrees correctly to five decimal places. 
9) Evaluate » 1{2 0 1 dx x6 to four-decimal-place accuracy by using a suitable power series expansion. Give reasons for the validity of your calculation. 10) Calculate the leading first four digits of »1 0 cospeu q du by using a suitable power series expansion. Give reasons for the validity of your calculation. 435 11) Define f : R Ñ R by f pxq : # 0 if x ¤ 0 . expp1{xq if x ¡ 0 a) Show that f is infinitely often differentiable. b) Calculate the Maclaurin series of f and show that it does not converge to f pxq for any x ¡ 0. 12) By a power series expansion around x 0, find a solution of the differential equation satisfying the given boundary conditions. Determine the convergence radius of the series. a) for all x P R; b) for all x 1; c) for all x P R; d) f 2 pxq 2f 1 pxq f pxq 0 f p0q 0 , f 1 p0q 1 . p1 xq2 f 2 pxq 2f pxq 0 f p0q f 1 p0q 1 . xf 2 pxq 2f 1 pxq f p0q 1 . xf 2 pxq f 1 pxq for all x P R; 13) Let a, b ¡ 0 and c ¡ a xf pxq 0 p3 xqf pxq 0 f 2 p0q 2 . b. a) Find the convergence radius r of the Gauss hypergeometric series 8̧ Γpa nqΓpb nq xn Γpcq ΓpaqΓpbq n0 Γpc nq n! for x P R. 436 b) Show that the corresponding hypergeometric function f , which is generally denoted by the symbol ‘F pa, b; c; q’ in the literature, satisfies the hypergeometric differential equation x p1 xqf 2 pxq r c pa b 1qxsf 1 pxq abf pxq 0 , x P pr, rq. c) Show that arctanpxq , x 1 lnp1 xq , F p1{2, 1; 3{2, x2 q 2x 1 x lnp1 xq F p1, 1; 2, xq , x F p1{2, 1; 3{2, x2 q for 0 |x| r1{2 , 0 |x| r, respectively. 14) Let a, b ¡ 0. a) Find the radius r of convergence of the confluent hypergeometric series 8̧ Γpa nq xn Γpbq Γpaq n0 Γpb nq n! for x P R. b) Show that the corresponding confluent hypergeometric function f , which is generally denoted by the symbol ‘M pa, b, q’ in the literature, satisfies the confluent hypergeometric differential equation x f 2 pxq pb xqf 1 pxq af pxq 0 , x P pr, rq. c) Show that M pa, a, xq ex , M p1, 2, 2xq ex for all x P pr, rq, 0 |x| r, respectively. 
$\dfrac{\sinh(x)}{x}$

15) Let $n \in \mathbb{N}$. By a power series expansion around $x = 0$, find a solution $H_n : \mathbb{R} \to \mathbb{R}$ of Hermite's differential equation
\[ f''(x) - 2x f'(x) + 2n f(x) = 0 , \]
$x \in \mathbb{R}$, satisfying
\[ H_n(0) = (-1)^{n/2} \, \frac{n!}{(n/2)!} , \quad H_n'(0) = 0 \]
if $n$ is even and
\[ H_n(0) = 0 , \quad H_n'(0) = 2 \, (-1)^{(n-1)/2} \, \frac{n!}{[(n-1)/2]!} \]
if $n$ is odd. Determine the convergence radius of the associated power series around $x = 0$.

Fig. 112: Graphs of Hermite polynomials $H_0$, $H_1$, $H_2$, $H_3$.

16) Let $\nu \in \mathbb{R}$. By a power series expansion around $x = 1$, find a solution of Legendre's differential equation
\[ (1 - x^2) f''(x) - 2x f'(x) + \nu (\nu + 1) f(x) = 0 , \]
$x \in (-1, 3)$, satisfying $f(1) = 1$. Determine the convergence radius of the associated power series around $x = 1$. What happens if $\nu \in \mathbb{N}$?

Fig. 113: Graphs of Legendre polynomials $P_0$, $P_1$, $P_2$, $P_3$.

3.5 Analytical Geometry and Elementary Vector Calculus

The invention of analytical geometry was another important mathematical development of the 17th century. Usually, this invention is attributed to the works of François Viète, Fermat and Descartes. Indeed, those works put more stress on the use of algebraic reasoning within proofs of geometric statements. On the other hand, they also conformed in large parts to ancient Greek mathematical traditions. Sometimes, probably influenced by the fact that his philosophy was almost revolutionary in its break with ancient Greek philosophy, Descartes is claimed to be the prime inventor of analytical geometry. Such a claim is also reflected in the name of the 'Cartesian' coordinates. On the other hand, his work in this area was much less radical. Also, it appears nowadays that he is not the inventor of the Cartesian coordinates [15]. The coordinates used by Nicholas Oresme (1320-1382) for a graphical representation of functions are much closer to their modern use than Descartes'. Here it also has to be taken into account that, like the lawyer Fermat, Descartes was no professional mathematician.
From today's point of view, his main interest was in philosophy. Apparently, the invention of analytical geometry was a gradual process that started already in ancient Greece in the work of Pappus and received its main impetus much later from the development of calculus.

From today's perspective, the goal of analytical geometry is the replacement of intuition in the solution of geometric problems by algebraic calculations. This goal is contrary to ancient Greek mathematics, which gave meaning to the solutions of algebraic equations through geometrical constructions. Today, in contrast to geometric intuition, algebraic arguments are considered a valid tool in mathematical proofs. To accomplish its goal, analytical geometry introduces a purely auxiliary Cartesian coordinate system, that is, a coordinate system that is not related in any essential way to the nature of the geometrical problem at hand, that allows a unique identification of points in the plane and in space by a pair or triple, respectively, of real numbers called the Cartesian coordinates of the point. For a simple prototypical example of the analytical geometric approach, see Example 3.5.5, which investigates the elementary geometric bisection of line segments in the framework of analytic geometry. For a more complicated example, see Example 3.5.26, which proves ancient Greek knowledge on the properties of line segments of parabolas which Archimedes used in his quadrature of the parabola. The last transcends analytic geometry somewhat since it involves not just algebra in the analysis, but also methods from calculus, and therefore belongs to the area of differential geometry.

3.5.1 Metric Spaces

Basic to geometry is the notion of the length of line segments or the distance of points. Later on, we will give a definition of the Euclidean distance between points in $\mathbb{R}^2$ and $\mathbb{R}^3$ which is motivated by the Pythagorean law of elementary geometry.
For $n \in \mathbb{N}$ such that $n \ge 4$, that definition is generalized in a straightforward manner to points of $\mathbb{R}^n$. In many cases, notions of distance have been found that share certain properties of the Euclidean distance. That observation led to the definition of a metric space, which allows the formulation of statements that are valid in all those cases.

Definition 3.5.1. A metric space is a pair $(M, d)$ consisting of a non-empty set $M$, whose elements we shall call points, and a ('distance'- or 'metric'-) function $d : M \times M \to \mathbb{R}$ such that

(i) $d(p, q) \ge 0$ and $d(p, q) = 0$ if and only if $p = q$, for all $p, q \in M$ ('Positive definiteness'),

(ii) $d(p, q) = d(q, p)$ for all $p, q \in M$ (Symmetry),

(iii) $d(p, q) \le d(p, r) + d(r, q)$ for all $p, q, r \in M$ (Triangle inequality).

The following inequality will be used later on in the proof that $\mathbb{R}^n$, $n \in \mathbb{N}^*$, equipped with the Euclidean distance function is indeed a metric space. It is also frequently used in other connections.

Lemma 3.5.2. (Cauchy-Schwarz inequality) Let $n \in \mathbb{N}^*$ and $(a_1, \dots, a_n), (b_1, \dots, b_n) \in \mathbb{R}^n$. Then
\[ \left| \sum_{j=1}^{n} a_j b_j \right| \le \left( \sum_{j=1}^{n} a_j^2 \right)^{1/2} \left( \sum_{j=1}^{n} b_j^2 \right)^{1/2} . \qquad (3.5.1) \]
In addition, if $b_j \ne 0$ for some $j \in \{1, \dots, n\}$, then equality holds in (3.5.1) if and only if $a_j = (C / B) \, b_j$ for all $j = 1, \dots, n$ where
\[ B := \sum_{j=1}^{n} b_j^2 , \quad C := \sum_{j=1}^{n} a_j b_j . \]

Proof. In addition, define
\[ A := \sum_{j=1}^{n} a_j^2 . \]
Then it follows that
\[ 0 \le \sum_{j=1}^{n} (B a_j - C b_j)^2 = \sum_{j=1}^{n} \left( B^2 a_j^2 - 2 B C a_j b_j + C^2 b_j^2 \right) = A B^2 - 2 B C^2 + C^2 B = B (A B - C^2) \]
and hence (3.5.1) in case $B \ne 0$. In the remaining case $B = 0$, it follows that $b_1 = \dots = b_n = 0$ and hence also (3.5.1). Further, if $b_j \ne 0$ for some $j \in \{1, \dots, n\}$, then equality holds in (3.5.1) if and only if $a_j = (C / B) \, b_j$ for all $j = 1, \dots, n$.

Fig. 114: The square of the distance between two points $p$, $q$ in the plane is given by the sum of squares of the distances between $p$, $r$ and between $r$, $q$.

The definition of the Euclidean distance in $\mathbb{R}^2$ and $\mathbb{R}^3$ is motivated by the Pythagorean law of elementary geometry.
For such motivation, we consider first the case $\mathbb{R}^2$, see Fig. 114. For this, let $p = (p_1, p_2)$ and $q = (q_1, q_2)$ be points in $\mathbb{R}^2$. Then we introduce the auxiliary point $r = (q_1, p_2)$. Since $p$ and $r$ are at the same height, i.e., share the same $y$-coordinate, the length of the line segment $pr$ is given by the length $|p_1 - q_1|$ of its orthogonal projection onto the $x$-axis. Also, since $r$ and $q$ share the same $x$-coordinate, the length of the line segment $rq$ is given by the length $|p_2 - q_2|$ of its orthogonal projection onto the $y$-axis. Since the triangle $prq$ has a right angle in the corner $r$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $pq$ is given by
\[ \sqrt{|p_1 - q_1|^2 + |p_2 - q_2|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2} . \]

Fig. 115: The square of the distance between two points $p$, $q$ in space is given by the sum of squares of the distances between $p$, $r$, between $r$, $s$ and between $s$, $q$.

The situation in $\mathbb{R}^3$ is similar, see Fig. 115. For this, let $p = (p_1, p_2, p_3)$ and $q = (q_1, q_2, q_3)$ be points in $\mathbb{R}^3$. We introduce two auxiliary points $r = (q_1, p_2, p_3)$ and $s = (q_1, q_2, p_3)$. Since $p$ and $r$ share the same $y$- and $z$-coordinates, the length of the line segment $pr$ is given by the length $|p_1 - q_1|$ of its orthogonal projection onto the $x$-axis. Also, since $r$ and $s$ share the same $x$- and $z$-coordinates, the length of the line segment $rs$ is given by the length $|p_2 - q_2|$ of its orthogonal projection onto the $y$-axis. Since the triangle $prs$ has a right angle in the corner $r$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $ps$ is given by
\[ \sqrt{|p_1 - q_1|^2 + |p_2 - q_2|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2} . \]

Fig. 116: According to the triangle inequality, the distance between the points $p$, $q$ is smaller than or equal to the sum of the distances between $p$, $r$ and between $r$, $q$.

Further, since $s$ and $q$ share the same $x$- and $y$-coordinates, the length of the line segment $sq$ is given by the length $|p_3 - q_3|$ of its orthogonal projection onto the $z$-axis.
Since the triangle $psq$ has a right angle in the corner $s$, we conclude by the Pythagorean law of elementary geometry that the length of the line segment $pq$ is given by
\[ \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + |p_3 - q_3|^2} = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2} . \]

Example 3.5.3. Let $n \in \mathbb{N}^*$. Show that $(\mathbb{R}^n, e_n)$, where $e_n : \mathbb{R}^n \times \mathbb{R}^n \to [0, \infty)$ is the usual Euclidean distance function defined by
\[ e_n(x, y) := \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2} \]
for all $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$, is a metric space.

Solution: The positive definiteness and symmetry of $e_n$ are obvious. Further, it follows by Lemma 3.5.2 that
\[ (e_n(x, y))^2 = \sum_{j=1}^{n} (x_j - y_j)^2 = \sum_{j=1}^{n} (x_j - z_j + z_j - y_j)^2 = \sum_{j=1}^{n} (x_j - z_j)^2 + 2 \sum_{j=1}^{n} (x_j - z_j)(z_j - y_j) + \sum_{j=1}^{n} (z_j - y_j)^2 \]
\[ \le (e_n(x, z))^2 + 2 \left| \sum_{j=1}^{n} (x_j - z_j)(z_j - y_j) \right| + (e_n(z, y))^2 \le (e_n(x, z))^2 + 2 \, e_n(x, z) \, e_n(z, y) + (e_n(z, y))^2 = (e_n(x, z) + e_n(z, y))^2 \]
and hence that
\[ e_n(x, y) \le e_n(x, z) + e_n(z, y) \]
for all $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n) \in \mathbb{R}^n$ and $z = (z_1, \dots, z_n) \in \mathbb{R}^n$.

Fig. 117: Circle of radius 2 and center $(-1, 1)$.

Example 3.5.4. Let $n \in \mathbb{N}^*$ and $a = (a_1, \dots, a_n) \in \mathbb{R}^n$. Find a function $f$ whose zero set is given by the sphere $S_r^n(a)$ of radius $r > 0$ with center $a$.

Fig. 118: Sphere of radius 3 centered at the point (1, 2, 1).

Solution: A sphere of radius $r > 0$ and center $a$ contains precisely those points $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ which have Euclidean distance $r$ from $a$. Hence $S_r^n(a)$ is given by
\[ S_r^n(a) = \left\{ (x_1, \dots, x_n) \in \mathbb{R}^n : \sum_{j=1}^{n} (x_j - a_j)^2 = r^2 \right\} . \]
In particular, for $n = 2$, $S_r^2(a)$ is called a circle of radius $r$ around $a$. Hence such a function $f$ is given by $f : \mathbb{R}^n \to \mathbb{R}$ defined by
\[ f(x_1, \dots, x_n) := \sum_{j=1}^{n} (x_j - a_j)^2 - r^2 \]
for all $(x_1, \dots, x_n) \in \mathbb{R}^n$.

We exemplify the goal of analytic geometry, i.e., the replacement of intuition in the solution of geometric problems by algebraic calculations, in a simple example which proves the correctness of the elementary geometric construction of the bisection of a line segment.

Fig. 119: Elementary geometric bisection of a line segment. See Example 3.5.5.

Example 3.5.5. (Bisection of a line segment) Prove the elementary geometric construction of the bisection of a line segment.

Solution: For this, let $p$ and $q$ be two different points in the plane. We introduce a Cartesian coordinate system in the following way. The point $p$ is chosen as the origin of the system, and the direction of the $x$-axis is chosen to coincide with the direction of the oriented line segment from $p$ to $q$. Hence, $p = (0, 0)$ and $q = (a, 0)$ where $a > 0$ is the distance between $p$ and $q$. The elementary geometric construction of the bisection of the line segment $pq$ involves drawing circles of radius $r > a/2$ around $p$ and $q$. The line segment between the intersection points of the circles halves the line segment $pq$. A point $(x, y) \in \mathbb{R}^2$ is an intersection point of the circles if and only if its coordinates satisfy the following equations:
\[ x^2 + y^2 = r^2 , \quad (x - a)^2 + y^2 = r^2 . \]
Subtraction of these equations gives
\[ 0 = x^2 + y^2 - (x - a)^2 - y^2 = x^2 - x^2 + 2 a x - a^2 = 2 a x - a^2 . \]
The last equation is equivalent to the equation $x = a/2$. Hence $(x, y) \in \mathbb{R}^2$ is an intersection point if and only if $x = a/2$ and
\[ r^2 = x^2 + y^2 = \frac{a^2}{4} + y^2 . \]
As a consequence, the intersection points are given by
\[ \left( \frac{a}{2} , \, - \frac{1}{2} \sqrt{4 r^2 - a^2} \right) , \quad \left( \frac{a}{2} , \, \frac{1}{2} \sqrt{4 r^2 - a^2} \right) . \]
The line segment between these points, given by
\[ L := \left\{ \left( \frac{a}{2} , \, \frac{2t - 1}{2} \sqrt{4 r^2 - a^2} \right) : 0 \le t \le 1 \right\} , \]
intersects $pq$ indeed in its midpoint $(a/2, 0)$. It might be argued that the above choice of the Cartesian coordinate system involved geometric intuition. On the other hand, a similar, but more complicated, calculation can also be performed for the case of a completely arbitrary Cartesian coordinate system. The whole calculation and reasoning can be performed without any geometric intuition once the geometric problem is translated into a set of algebraic equations. The last is the spirit of the analytic geometric approach.

Problems

1) Which of the following sets are circles? Find center and radius where this is the case.
a) b) c) d) e) f) g) tpx, yq P R2 : x2 tpx, yq P R2 : x2 tpx, yq P R2 : x2 tpx, yq P R2 : x2 tpx, yq P R2 : x2 tpx, yq P R2 : x2 tpx, yq P R2 : 4x2 y2 y2 y x 2y 2y 3x 2 3x 2 6x y 0u 2y , 1 0u 4 0u 1 0u , , , 5y 4 0u , y 12x 8y 43 0u 4y 2 2x 8y 1 0u y 2 2 448 , . 2) Find a function whose zero set is a circle with center p3, 2q passing through p2, 1q. 3) Find a function whose zero set is a circle with center pa, aq passing through the origin where a P R. 4) Find a function whose zero set is a circle passing through all three points p1, 2q, p1, 0q and p3, 2q. 5) Decide whether the points p0, 3q, p2, 0q and p4, 1q lie on a circle. If this is the case, find its center and radius. 6) Find functions whose zero sets are circles with center p1, 2q that touch a) the x-axis, b) the y-axis. 7) Find the intersection S1 X S2 where tpx, yq P R2 : 3 3x 2x2 S2 tpx, y q P R2 : 3 2x 3x2 S1 4y 0u , 0u . 2y 2 4y 3y 2 8) Which of the following sets are spheres? Where this is the case, find center and radius. tpx, y, zq P R3 : x2 y2 z2 2x 3y z 0u , b) tpx, y, z q P R3 : x2 y 2 z 2 4x 5 0u , c) tpx, y, z q P R3 : x2 y 2 z 2 6x y z 1 0u , d) tpx, y, z q P R3 : x2 y 2 z 2 x y z 1 0u , e) tpx, y, z q P R3 : x2 y 2 z 2 3x y 2z 21u , f) tpx, y, z q P R3 : x2 y 2 z 2 3x y 2z u , g) tpx, y, z q P R3 : 3px2 y 2 z 2 q 2x z 4u . Find a function whose zero set is sphere with center p1, 1, 2q and a) 9) radius 3. What is the intersection of the sphere and the xz-plane? 10) Find a function whose zero set is a sphere that passes through the point p2, 3, 1q and is centered in p3, 1, 1q. 11) Find functions whose zero sets are spheres with center p1, 3, 2q that touch a) the xy-plane, b) the yz-plane, c) the xz-plane. 12) Show that the spheres tpx, y, zq P R3 : 49px2 S2 tpx, y, z q P R3 : 49px2 S1 y2 y 2 z 2 q 32x z 2 q 20x 8y 26z 26y 10z have only one point in common, and find its coordinates. 449 8 0u 2 0u , 3.5.2 Vector Spaces In the following, the notion of vectors will be introduced. 
In applications, a quantity which has magnitude, direction and a point of attack is called vectorial. Physical examples are force, velocity, acceleration, momentum, angular momentum, torque and so forth. The definition of vectors below does not take into account a point of attack. Therefore, a vectorial quantity in applications is a pair consisting of a point (of attack) and a vector in the sense below. The same is true for tangent vectors to surfaces defined in differential geometry, which are also attached to points of the surface. To simplify notation in applications, however, the point of attack is often not indicated if it is clear from the context. This is often confusing for the beginner. An additional complication arises from the fact that vectors are often denoted by tuples of real numbers. In such cases, it can be concluded only from the context whether a given tuple refers to the coordinates of a point or to the components of a vector.

The following defines a vector as a set of parallel oriented line segments in $\mathbb{R}^n$ where $n \in \mathbb{N}^*$. Two points $p, q \in \mathbb{R}^n$ define a line segment $pq$ between $p$ and $q$. In addition, we can give such a line segment an orientation by saying that it originates in $p$ and ends in $q$. We denote the resulting oriented line segment by $\overrightarrow{pq}$. Note that $\overrightarrow{qp}$ is different from $\overrightarrow{pq}$, although the underlying line segments (which are subsets of $\mathbb{R}^n$) are identical. Alternatively, $\overrightarrow{pq}$ can be interpreted as the pair $(p,q) \in (\mathbb{R}^n)^2$. For the analytic description of parallelism, we use translations $T_a : \mathbb{R}^n \to \mathbb{R}^n$, $a = (a_1, \dots, a_n) \in \mathbb{R}^n$, defined by
\[ T_a(x) := (x_1 + a_1, \dots, x_n + a_n) \]
for every $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Note that $T_a$ 'translates' the coordinates of every point in $\mathbb{R}^n$ in the same way. Therefore $T_a$ also preserves the Euclidean distance between points:
\[ e_n(x,y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2} = \sqrt{\sum_{j=1}^{n} \big( x_j + a_j - (y_j + a_j) \big)^2} = e_n(T_a(x), T_a(y)) \tag{3.5.2} \]
Fig.
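120: Line segment between the points $(1, 1)$ and $(2, 1)$ in the plane and images under the translations $T_a$ where $a = (2, 0), (0, 1), (2, 1)$.

for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ and $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Oriented line segments $\overrightarrow{pq}$ and $\overrightarrow{rs}$, where $p, q, r, s \in \mathbb{R}^n$, will be called equivalent if $r = T_a(p)$ and $s = T_a(q)$ for some $a \in \mathbb{R}^n$. If this is the case, we indicate it by $\overrightarrow{pq} \sim \overrightarrow{rs}$. Finally, below it will be shown that the relation $\sim$ has similar properties to '$=$', and a vector will be defined as a set of equivalent oriented line segments.

Definition 3.5.6. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then we define
\[ S := \{ \text{All oriented line segments } \overrightarrow{pq} \text{ from } p \text{ to } q : p, q \in \mathbb{R}^n \} . \]
Further, we define on $S$ the relation $\sim$ by
\[ \overrightarrow{pq} \sim \overrightarrow{rs} \]
for $p, q, r, s \in \mathbb{R}^n$ if
\[ r = T_a(p) \quad \text{and} \quad s = T_a(q) \]
for some $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ where the translation $T_a : \mathbb{R}^n \to \mathbb{R}^n$ is defined by
\[ T_a(x) := (x_1 + a_1, \dots, x_n + a_n) \]
for every $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Note that such $a$ is unique and, in particular, that
\[ s = T_{(q_1 - p_1, \dots, q_n - p_n)}(r) . \]

Fig. 121: Equivalent oriented line segments. See Definition 3.5.6.

Theorem 3.5.7. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then $\sim$ is an equivalence relation, i.e., it follows for all $p, q, r, s, t, u \in \mathbb{R}^n$ that
\[ \overrightarrow{pq} \sim \overrightarrow{pq} \quad (\sim \text{ is reflexive}) , \]
\[ \overrightarrow{pq} \sim \overrightarrow{rs} \;\Rightarrow\; \overrightarrow{rs} \sim \overrightarrow{pq} \quad (\sim \text{ is symmetric}) , \]
\[ \overrightarrow{pq} \sim \overrightarrow{rs} \,\wedge\, \overrightarrow{rs} \sim \overrightarrow{tu} \;\Rightarrow\; \overrightarrow{pq} \sim \overrightarrow{tu} \quad (\sim \text{ is transitive}) . \]

Proof. '$\sim$ is reflexive': For this, let $p, q \in \mathbb{R}^n$. Then $p = T_{(0,\dots,0)}(p)$ and $q = T_{(0,\dots,0)}(q)$ and hence $\overrightarrow{pq} \sim \overrightarrow{pq}$. '$\sim$ is symmetric': For this, let $p, q, r, s \in \mathbb{R}^n$ and $\overrightarrow{pq} \sim \overrightarrow{rs}$. Then there is $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ such that $r = T_a(p)$ and $s = T_a(q)$. Hence
\[ p = T_{(-a_1, \dots, -a_n)}(r) \quad \text{and} \quad q = T_{(-a_1, \dots, -a_n)}(s) \]
and $\overrightarrow{rs} \sim \overrightarrow{pq}$. '$\sim$ is transitive': For this, let $p, q, r, s, t, u \in \mathbb{R}^n$ and $\overrightarrow{pq} \sim \overrightarrow{rs}$ and $\overrightarrow{rs} \sim \overrightarrow{tu}$. Then there are $a = (a_1, \dots, a_n), b = (b_1, \dots, b_n) \in \mathbb{R}^n$ such that
\[ r = T_a(p) \quad \text{and} \quad s = T_a(q) \]
as well as such that
\[ t = T_b(r) \quad \text{and} \quad u = T_b(s) . \]
Hence
\[ t = T_{(a_1 + b_1, \dots, a_n + b_n)}(p) \quad \text{and} \quad u = T_{(a_1 + b_1, \dots, a_n + b_n)}(q) \]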
and $\overrightarrow{pq} \sim \overrightarrow{tu}$.

In a first step, the following defines a vector as a set of equivalent oriented line segments. Every element of such a set is called a representative of the vector. Subsequently the addition, scalar multiplication, length and scalar product of vectors are defined. The addition of two vectors is defined as follows. First, we choose a representative of the first vector. Next, we choose the representative of the second vector whose initial point coincides with the end point of the first representative. Then the sum of the vectors is defined as the vector corresponding to the oriented line segment from the initial point of the first representative to the end point of the second representative.

Fig. 122: Vector addition. See Definition 3.5.8 (iii).

Scalar multiples of a vector are defined similarly. First, we choose the representative of the vector whose initial point coincides with the origin of the coordinate system. For $\lambda \in \mathbb{R}$, we define the $\lambda$-fold of the vector as the vector corresponding to the oriented line segment from the origin to the point which results from the endpoint of the representative by multiplication of each of its coordinates by $\lambda$. Subsequently, the length of a vector is defined as the Euclidean distance between the initial point and the endpoint of a representative. Geometrically, the scalar product of two vectors can be interpreted as the product of the length of the orthogonal projection of the representative of the first vector onto the representative of the second vector with the length of the second representative. In this, it is assumed that both representatives have the same initial point. Below, we use another equivalent and more convenient definition in terms of the law of cosines, namely as one half of the difference between the sum of the squares of the lengths of the representatives and the square of the length of a difference of the representatives.
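The geometric operations just described can be tried out on concrete representatives. The following is a minimal Python sketch, assuming that an oriented line segment is stored as a pair of coordinate tuples; the function names (`translate`, `equivalent`, `add`, `scale`, `length`, `dot`) are hypothetical and chosen only for this illustration. Scalar multiplication is applied here to an arbitrary representative, keeping its initial point fixed, which yields the same vector as the construction through the origin.

```python
import math

def translate(a, x):
    """Translation T_a: shift every coordinate of x by the matching entry of a."""
    return tuple(xj + aj for xj, aj in zip(x, a))

def equivalent(pq, rs):
    """True if r = T_a(p) and s = T_a(q) for some a (necessarily a = r - p)."""
    (p, q), (r, s) = pq, rs
    a = tuple(rj - pj for pj, rj in zip(p, r))
    return translate(a, q) == tuple(s)

def add(pq, rs):
    """Sum of two vectors given by representatives: move rs so that it starts
    at the end point of pq, then join the start of pq to the moved end point."""
    (p, q), (r, s) = pq, rs
    shift = tuple(qj - rj for qj, rj in zip(q, r))
    return (tuple(p), translate(shift, s))

def scale(lam, pq):
    """Scalar multiple: keep the initial point, stretch the segment by lam."""
    p, q = pq
    return (tuple(p), tuple(pj + lam * (qj - pj) for pj, qj in zip(p, q)))

def length(pq):
    """Euclidean distance between initial point and end point."""
    p, q = pq
    return math.sqrt(sum((qj - pj) ** 2 for pj, qj in zip(p, q)))

def dot(pq, rs):
    """Scalar product via the law of cosines:
    one half of (|v|^2 + |w|^2 - |v - w|^2)."""
    diff = add(pq, scale(-1, rs))
    return 0.5 * (length(pq) ** 2 + length(rs) ** 2 - length(diff) ** 2)

pq = ((1, 1), (4, 5))   # a representative with components (3, 4)
rs = ((0, 2), (2, 1))   # a representative with components (2, -1)
total = add(pq, rs)     # a representative of the sum
```

Replacing `pq` by the equivalent segment `((0, 0), (3, 4))` changes the representative returned by `add`, but not its equivalence class, which is the reason the operations are well-defined on vectors.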
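These definitions of vectors and of operations on vectors are very geometrical in nature, but lead to notations that are inconvenient for use in calculations. Fortunately, a more convenient notation, suitable for calculation, can be obtained from the observation that there is a natural bijection $\iota$ between $\mathbb{R}^n$ and the set of vectors. Indeed, every vector has a natural representative which has the origin as its starting point. By abuse of language, we will call such representatives position vectors. Hence by mapping every point of $\mathbb{R}^n$ to the vector whose natural representative ends in that point, we obtain a bijection $\iota$ from $\mathbb{R}^n$ onto the set of vectors in part (vii) of the following definition. Subsequently, we define with the help of $\iota$ operations on $\mathbb{R}^n$ which correspond to the operations defined on vectors. In this way, the elements of $\mathbb{R}^n$ will in future also be position vectors, whereas so far the elements of $\mathbb{R}^n$ were only interpreted as $n$-tuples of coordinates that are associated to points with the help of a Cartesian coordinate system. It can be concluded only from the context of the problem whether a given tuple refers to the coordinates of a point or to the components of a position vector.

Fig. 123: Scalar multiplication. See Definition 3.5.8 (iv).

Definition 3.5.8. Let $n \in \mathbb{N} \setminus \{0, 1\}$. We define:

(i) For arbitrary $p, q \in \mathbb{R}^n$, the equivalence class $[\overrightarrow{pq}]$ corresponding to $\overrightarrow{pq}$ by
\[ [\overrightarrow{pq}] := \{ \overrightarrow{rs} : \overrightarrow{rs} \sim \overrightarrow{pq} , \; r, s \in \mathbb{R}^n \} . \]
Every such equivalence class is called a vector.

(ii) The set of all vectors by
\[ S/\!\sim \; := \{ [\overrightarrow{pq}] : p, q \in \mathbb{R}^n \} . \]

(iii) For arbitrary $p, q, r, s \in \mathbb{R}^n$, the sum $[\overrightarrow{pq}] + [\overrightarrow{rs}]$ of $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then there is a unique $v \in \mathbb{R}^n$ such that $\overrightarrow{uv} \in [\overrightarrow{rs}]$, and we define
\[ [\overrightarrow{pq}] + [\overrightarrow{rs}] := [\overrightarrow{tv}] . \]
Note that every element of $[\overrightarrow{pq}]$ can be represented as
\[ \overrightarrow{T_a(t) T_a(u)} \tag{3.5.3} \]
for some $a \in \mathbb{R}^n$. Then
\[ \overrightarrow{T_a(u) T_a(v)} \in [\overrightarrow{rs}] \quad \text{and} \quad \overrightarrow{T_a(t) T_a(v)} \in [\overrightarrow{tv}] . \]
This shows that $[\overrightarrow{pq}] + [\overrightarrow{rs}]$ is well-defined.

(iv) For every $p, q \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$, the scalar multiple $\lambda . [\overrightarrow{pq}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then
\[ \lambda . [\overrightarrow{pq}] := [\, \overrightarrow{t \, (t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n))} \,] . \]
Since every element of $[\overrightarrow{pq}]$ can be represented in the form of (3.5.3) for some $a \in \mathbb{R}^n$, it follows that
\[ \overrightarrow{t_a \, (t_{a1} + \lambda (u_{a1} - t_{a1}), \dots, t_{an} + \lambda (u_{an} - t_{an}))} = \overrightarrow{T_a(t) \, T_a((t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n)))} \in [\, \overrightarrow{t \, (t_1 + \lambda (u_1 - t_1), \dots, t_n + \lambda (u_n - t_n))} \,] \]
where $t_a := T_a(t)$, $u_a := T_a(u)$. This shows that $\lambda . [\overrightarrow{pq}]$ is well-defined.

(v) For arbitrary $p, q \in \mathbb{R}^n$, the length $|[\overrightarrow{pq}]|$ of $[\overrightarrow{pq}]$ as follows. For this, let $\overrightarrow{tu} \in [\overrightarrow{pq}]$. Then
\[ |[\overrightarrow{pq}]| := \sqrt{(u_1 - t_1)^2 + \dots + (u_n - t_n)^2} , \]
i.e., as the Euclidean distance of the points $t$ and $u$. Since every element of $[\overrightarrow{pq}]$ can be represented as (3.5.3) for some $a \in \mathbb{R}^n$, it follows that
\[ \sqrt{(u_{a1} - t_{a1})^2 + \dots + (u_{an} - t_{an})^2} = \sqrt{(u_1 - t_1)^2 + \dots + (u_n - t_n)^2} \]
where $t_a := T_a(t)$, $u_a := T_a(u)$. This shows that $|[\overrightarrow{pq}]|$ is well-defined. Vectors of length one are called unit vectors.

(vi) For arbitrary $p, q, r, s \in \mathbb{R}^n$ the scalar product (or dot product) $[\overrightarrow{pq}] \cdot [\overrightarrow{rs}]$ of $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ by
\[ [\overrightarrow{pq}] \cdot [\overrightarrow{rs}] := \frac{1}{2} \left( |[\overrightarrow{pq}]|^2 + |[\overrightarrow{rs}]|^2 - |[\overrightarrow{pq}] - [\overrightarrow{rs}]|^2 \right) . \]
Note that according to the law of cosines
\[ [\overrightarrow{pq}] \cdot [\overrightarrow{rs}] = |[\overrightarrow{pq}]| \, |[\overrightarrow{rs}]| \, \cos\big( \angle([\overrightarrow{pq}], [\overrightarrow{rs}]) \big) , \]
if $|[\overrightarrow{pq}]| \neq 0$, $|[\overrightarrow{rs}]| \neq 0$ and where the angle $\angle([\overrightarrow{pq}], [\overrightarrow{rs}]) \in [0, \pi]$ between $[\overrightarrow{pq}]$ and $[\overrightarrow{rs}]$ is defined by the angle between representatives of both equivalence classes originating from the same point. Vectors with a vanishing scalar product are called orthogonal to each other.

(vii) The bijective map $\iota : \mathbb{R}^n \to S/\!\sim$ by
\[ \iota(x) := [\overrightarrow{Ox}] \]
for all $x \in \mathbb{R}^n$ where $O$ denotes the origin defined by $O := (0, \dots, 0) \in \mathbb{R}^n$.

(viii) For arbitrary $x, y \in \mathbb{R}^n$ and arbitrary $\lambda \in \mathbb{R}$,
\[ x + y := \iota^{-1}(\iota(x) + \iota(y)) , \quad \lambda . x := \iota^{-1}(\lambda . \iota(x)) , \quad |x| := |\iota(x)| , \quad x \cdot y := \iota(x) \cdot \iota(y) . \]

The following theorem derives the properties of the operations of addition and scalar multiplication that are induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding operations for vectors.

Theorem 3.5.9. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then

(i)
\[ x + y = (x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n) \]
and
\[ a . x = a . (x_1, \dots, x_n) = (a \, x_1, \dots, a \, x_n) \]
for all $x, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$.

(ii) $(\mathbb{R}^n, +, .)$ is a real vector space with $0 := (0, \dots, 0)$ as neutral element and for each $x \in \mathbb{R}^n$ with $-x := (-x_1, \dots, -x_n)$ as corresponding inverse element, i.e., the following holds:
\[ x + y = y + x \quad \text{(Addition is commutative)} , \]
\[ (x + y) + z = x + (y + z) \quad \text{(Addition is associative)} , \]
\[ 0 + x = x \quad \text{($0$ is a neutral element)} , \]
\[ x + (-x) = 0 \quad \text{($-x$ is inverse to $x$)} \]
as well as
\[ 1 . x = x , \quad (ab) . x = a . (b . x) , \quad (a + b) . x = a . x + b . x , \quad a . (x + y) = a . x + a . y \]
for all $x, y, z \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$.

(iii) The sequence of $n$ vectors $e_1, \dots, e_n$, defined by
\[ e_1 := (1, 0, \dots, 0) , \; \dots , \; e_n := (0, \dots, 0, 1) , \]
is a basis of $\mathbb{R}^n$, i.e., for every $x \in \mathbb{R}^n$, we have
\[ x = x_1 . e_1 + \dots + x_n . e_n \]
and the coefficients $x_1, \dots, x_n$ in this representation are uniquely determined, i.e., if
\[ x = \bar{x}_1 . e_1 + \dots + \bar{x}_n . e_n \]
for some $\bar{x}_1, \dots, \bar{x}_n \in \mathbb{R}$, then $\bar{x}_1 = x_1, \dots, \bar{x}_n = x_n$. The sequence $e_1, \dots, e_n$ is called the canonical basis of $\mathbb{R}^n$.

Proof. '(i)': For this, let $x = (x_1, \dots, x_n), y = (y_1, \dots, y_n) \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$. Then
\[ x + y := \iota^{-1}(\iota(x) + \iota(y)) = \iota^{-1}\big( [\overrightarrow{Ox}] + [\overrightarrow{Oy}] \big) = \iota^{-1}\big( [\overrightarrow{Ox}] + [\overrightarrow{T_x(O) T_x(y)}] \big) = \iota^{-1}\big( [\overrightarrow{O \, T_x(y)}] \big) = \iota^{-1} \iota((x_1 + y_1, \dots, x_n + y_n)) = (x_1 + y_1, \dots, x_n + y_n) \]
and
\[ \lambda . x := \iota^{-1}(\lambda . \iota(x)) = \iota^{-1}\big( \lambda . [\overrightarrow{Ox}] \big) = \iota^{-1}\big( [\overrightarrow{O \, (\lambda x_1, \dots, \lambda x_n)}] \big) = \iota^{-1} \iota((\lambda x_1, \dots, \lambda x_n)) = (\lambda x_1, \dots, \lambda x_n) . \]
(ii) and (iii) are trivial consequences of the definitions and the algebraic properties of the real numbers.

The following theorem derives the properties of the notion of length that is induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding notion for vectors.

Theorem 3.5.10. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then

(i) for every $x \in \mathbb{R}^n$
\[ |x| = \sqrt{x_1^2 + \dots + x_n^2} . \]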
(ii) The absolute value satisfies the defining properties of a norm on $\mathbb{R}^n$, i.e.,
\[ |x| \geq 0 \text{ and } |x| = 0 \text{ if and only if } x = 0 \quad \text{(Positive definiteness)} , \]
\[ |a . x| = |a| \, |x| \quad \text{(Homogeneity)} , \]
\[ |x + y| \leq |x| + |y| \quad \text{(Triangle inequality)} \]
for all $x, y \in \mathbb{R}^n$ and $a \in \mathbb{R}$.

Proof. '(i)': For this, let $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then
\[ |x| := |\iota(x)| = |[\overrightarrow{Ox}]| = \sqrt{x_1^2 + \dots + x_n^2} . \]
'(ii)': The positive definiteness and homogeneity of the absolute value are straightforward consequences of the definitions. The triangle inequality follows from the corresponding property of the metric $e_n$. For this, let $x = (x_1, \dots, x_n), y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then
\[ |x + y| = e_n(O, x + y) \leq e_n(O, x) + e_n(x, x + y) = |x| + |y| . \]

The following theorem derives the properties of the scalar product that is induced on $\mathbb{R}^n$, $n \in \mathbb{N}^*$, by $\iota$ and the corresponding product for vectors.

Theorem 3.5.11. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then

(i) for all $x, y \in \mathbb{R}^n$:
\[ x \cdot y = \sum_{k=1}^{n} x_k y_k . \]

(ii) This product satisfies the defining properties of a scalar product on a real vector space, i.e.,
\[ x \cdot y = y \cdot x \quad \text{(Symmetry)} , \]
\[ (x + y) \cdot z = x \cdot z + y \cdot z \quad \text{(Additivity in the first variable)} , \]
\[ (a . x) \cdot y = a \, (x \cdot y) \quad \text{(Homogeneity in the first variable)} , \]
\[ x \cdot x \geq 0 \text{ and } x \cdot x = 0 \text{ if and only if } x = 0 \quad \text{(Positive definiteness)} \]
for all $x, y, z \in \mathbb{R}^n$ and $a \in \mathbb{R}$. As a consequence, it satisfies the important Cauchy-Schwarz inequality
\[ |x \cdot y| \leq |x| \, |y| \tag{3.5.4} \]
for all $x, y \in \mathbb{R}^n$. In particular, in the case that $y \neq 0$, equality holds in (3.5.4) if and only if
\[ x = \frac{x \cdot y}{|y|^2} . y . \tag{3.5.5} \]

Proof. '(i)': For this, let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n) \in \mathbb{R}^n$. Then
\[ x \cdot y := \iota(x) \cdot \iota(y) = \frac{1}{2} \left( |[\overrightarrow{Ox}]|^2 + |[\overrightarrow{Oy}]|^2 - |[\overrightarrow{Ox}] - [\overrightarrow{Oy}]|^2 \right) = \frac{1}{2} \left( |x|^2 + |y|^2 - |x - y|^2 \right) = \frac{1}{2} \sum_{k=1}^{n} \left[ x_k^2 + y_k^2 - (x_k - y_k)^2 \right] = \sum_{k=1}^{n} x_k y_k . \]
'(ii)': The symmetry, additivity, homogeneity in the first variable, and positive definiteness of the dot product are obvious. Let $a := |y|^2$ and $b := x \cdot y$. For the case $y = 0$, inequality (3.5.4) is trivially satisfied.
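If $y \neq 0$, then $|y| > 0$ and
\[ 0 \leq (a . x - b . y) \cdot (a . x - b . y) = a^2 |x|^2 - 2ab \, (x \cdot y) + b^2 |y|^2 = |y|^4 |x|^2 - 2 |y|^2 (x \cdot y)^2 + (x \cdot y)^2 |y|^2 = |y|^2 \left[ \, |x|^2 |y|^2 - (x \cdot y)^2 \, \right] , \]
and hence (3.5.4) follows. In particular, equality holds in (3.5.4) if and only if (3.5.5) is true.

Note that the basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ is in particular orthonormal with respect to the Euclidean scalar product, i.e., it satisfies
\[ e_i \cdot e_j = 0 , \text{ if } i \neq j \quad \text{(Orthogonality)} \]
and
\[ e_i \cdot e_j = 1 , \text{ if } i = j \quad \text{(Normalization)} \tag{3.5.6} \]
for all $i, j \in \{1, \dots, n\}$.

We continue the section with applications of vectors. Part (i) of the following theorem rephrases the Pythagorean theorem in terms of vectors. Part (ii) shows that the minimal distance of a given point $y \in \mathbb{R}^n$ from
\[ \{ \lambda . x : \lambda \in \mathbb{R} \} , \]
where $x$ is some non-zero vector, is assumed in the orthogonal projection of the position vector $y$ onto the direction of $x$.

Theorem 3.5.12. Let $n \in \mathbb{N} \setminus \{0, 1\}$. Then

(i)
\[ |x + y|^2 = |x|^2 + |y|^2 \]
for all orthogonal $x, y \in \mathbb{R}^n$.

Fig. 124: Orthogonal projections $y'$, $z'$ of $y$ and $z$, respectively, onto the direction of $x$.

(ii) For every $x \in \mathbb{R}^n \setminus \{0\}$ and every $y \in \mathbb{R}^n$, there is a unique vector $P_x(y) \in \{ \lambda . x : \lambda \in \mathbb{R} \}$ such that
\[ |y - P_x(y)| = \min \{ |y - \lambda . x| : \lambda \in \mathbb{R} \} . \]
$P_x(y)$ is called the orthogonal projection of $y$ onto the direction of $x$. In particular, it is given by
\[ P_x(y) := \frac{y \cdot x}{|x|^2} . x , \]
$y - P_x(y)$ is orthogonal to $x$ and
\[ |y - P_x(y)| = \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - (x \cdot y)^2} . \]

Proof. '(i)': Let $x = (x_1, \dots, x_n), y = (y_1, \dots, y_n) \in \mathbb{R}^n$ be orthogonal. Then $x \cdot y = 0$ and hence
\[ |x + y|^2 = \sum_{k=1}^{n} (x_k + y_k)^2 = \sum_{k=1}^{n} x_k^2 + \sum_{k=1}^{n} y_k^2 + 2 \sum_{k=1}^{n} x_k y_k = |x|^2 + |y|^2 . \]
'(ii)': First, it follows for every $\lambda \in \mathbb{R}$ that
\[ y - \lambda . x = \left( y - \frac{y \cdot x}{|x|^2} . x \right) + \left( \frac{y \cdot x}{|x|^2} - \lambda \right) . x , \qquad \left( y - \frac{y \cdot x}{|x|^2} . x \right) \cdot x = y \cdot x - \frac{(y \cdot x)(x \cdot x)}{|x|^2} = 0 \]
and hence by (i) that
\[ |y - \lambda . x|^2 = \left| y - \frac{y \cdot x}{|x|^2} . x \right|^2 + \left( \frac{y \cdot x}{|x|^2} - \lambda \right)^2 |x|^2 . \]
Hence
\[ |y - \lambda . x| \geq \left| y - \frac{y \cdot x}{|x|^2} . x \right| \]
for all $\lambda \in \mathbb{R}$ and
\[ |y - \lambda . x| = \left| y - \frac{y \cdot x}{|x|^2} . x \right| \]
if and only if
\[ \lambda = \frac{y \cdot x}{|x|^2} . \]
Finally,
\[ \left| y - \frac{y \cdot x}{|x|^2} . x \right| = \sqrt{ |y|^2 - \frac{(y \cdot x)^2}{|x|^2} } = \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - (x \cdot y)^2} . \]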
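The statements of Theorem 3.5.12 (ii) lend themselves to a quick numerical check. The following is a minimal Python sketch under the assumption that vectors are stored as tuples of floats; the helper names (`dot`, `norm`, `project`, `distance_to_line`) and the sample vectors are hypothetical and serve only as an illustration.

```python
import math

def dot(x, y):
    """Euclidean scalar product: sum of x_k * y_k."""
    return sum(xk * yk for xk, yk in zip(x, y))

def norm(x):
    """Length |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def project(x, y):
    """Orthogonal projection P_x(y) = (y . x / |x|^2) . x, for non-zero x."""
    c = dot(y, x) / dot(x, x)
    return tuple(c * xk for xk in x)

def distance_to_line(x, y):
    """Closed form (1/|x|) * sqrt(|x|^2 |y|^2 - (x . y)^2) for the distance
    of y from the line of all multiples of x."""
    return math.sqrt(dot(x, x) * dot(y, y) - dot(x, y) ** 2) / norm(x)

x = (1.0, 2.0, 2.0)
y = (3.0, 1.0, 4.0)
p = project(x, y)
residual = tuple(yk - pk for yk, pk in zip(y, p))   # y - P_x(y)

# Distances |y - lam.x| for a few sample values of lam, for comparison.
samples = [norm(tuple(yk - lam * xk for xk, yk in zip(x, y)))
           for lam in (-2.0, 0.0, 1.0, 13.0 / 9.0, 3.0)]
```

In this sketch `residual` is orthogonal to `x` up to rounding, `norm(residual)` agrees with `distance_to_line(x, y)`, and none of the sampled distances is smaller. The Cauchy-Schwarz inequality (3.5.4) is what guarantees that the radicand in `distance_to_line` is non-negative.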
From elementary geometry, it is known that the area of a plane parallelogram is given by the product of the length of one of its sides and the length of the corresponding height. The following application of the vector methods often allows a simpler calculation of that area if the location of its corners is known with respect to a Cartesian coordinate system.

Example 3.5.13. Let $n \in \mathbb{N} \setminus \{0, 1\}$ and $Opqr$ be a parallelogram where $p, q, r$ are points in $\mathbb{R}^n$. Show that its area $A$ is given by
\[ A = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = |a| \, |b| \sin(\alpha) \]
where $a = [\overrightarrow{Op}]$ and $b = [\overrightarrow{Or}]$ and $\alpha := \angle(a, b) \in [0, \pi]$.

Solution: See Fig. 125. If $a$ and $b$ are multiples of one another, $A$ vanishes. This is consistent with the above formula since in that case $|a \cdot b| = |a| \, |b|$ according to Theorem 3.5.11. In the remaining cases let $P(b)$ denote the orthogonal projection of $b$ onto the direction of $a$. The areas of the triangles $Orr'$ and $pqq'$ are identical. Hence the area of $Opqr$ is given by
\[ A = |a| \, |b - P(b)| = |a| \left| b - \frac{a \cdot b}{|a|^2} . a \right| = |a| \, \frac{1}{|a|} \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} . \]
Note that
\[ |a|^2 |b|^2 - (a \cdot b)^2 > 0 \]
according to the Cauchy-Schwarz inequality of Theorem 3.5.11. Further, since
\[ a \cdot b = |a| \, |b| \cos(\alpha) , \]
it follows that
\[ A = \sqrt{|a|^2 |b|^2 (1 - \cos^2(\alpha))} = |a| \, |b| \sin(\alpha) . \]

Fig. 125: Calculation of the area of the parallelogram $Opqr$. See Example 3.5.13.

On $\mathbb{R}^3$, it is possible to define a product that associates to every pair of vectors another vector and that shares some of the properties of the multiplication of real numbers. That product is also important for applications, e.g., in electrodynamics. In the following, we motivate the definition of that vector product $a \times b$ for $a = (a_1, a_2, a_3), b = (b_1, b_2, b_3) \in \mathbb{R}^3$ which are assumed not to be multiples of each other. Natural candidates for the definition of $a \times b$ are vectors that are orthogonal to both $a$ and $b$. Orthogonal vectors to $a$ are given by
\[ \alpha . (a_2, -a_1, 0) + \beta . (0, -a_3, a_2) \]
where $\alpha, \beta \in \mathbb{R}$.
In this, we assume in addition that $a_2 \neq 0$ which excludes that $(a_2, -a_1, 0)$ and $(0, -a_3, a_2)$ are multiples of each other. The condition
\[ b \cdot [\, \alpha . (a_2, -a_1, 0) + \beta . (0, -a_3, a_2) \,] = 0 \]
leads to
\[ -(a_1 b_2 - a_2 b_1) \, \alpha + (a_2 b_3 - a_3 b_2) \, \beta = 0 \]
which is satisfied if
\[ \alpha = \frac{\gamma}{a_2} \, (a_2 b_3 - a_3 b_2) , \qquad \beta = \frac{\gamma}{a_2} \, (a_1 b_2 - a_2 b_1) \]
for some $\gamma \in \mathbb{R}$. Then
\[ \alpha . (a_2, -a_1, 0) + \beta . (0, -a_3, a_2) = \gamma . (a_2 b_3 - a_3 b_2 , \, a_3 b_1 - a_1 b_3 , \, a_1 b_2 - a_2 b_1) . \]
To restrict the final parameter $\gamma$, we calculate the square of the norm of this vector for $\gamma = 1$. In this, we drop all the additional restricting assumptions on $a, b \in \mathbb{R}^3$ made above. This gives
\[ (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 \]
\[ = a_2^2 b_3^2 + a_3^2 b_2^2 - 2 a_2 b_2 a_3 b_3 + a_3^2 b_1^2 + a_1^2 b_3^2 - 2 a_1 b_1 a_3 b_3 + a_1^2 b_2^2 + a_2^2 b_1^2 - 2 a_1 b_1 a_2 b_2 \]
\[ = |a|^2 |b|^2 - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2 = |a|^2 |b|^2 - (a \cdot b)^2 \]
which according to Example 3.5.13 is the square of the area of the parallelogram determined by $a, b$ if $a, b \in \mathbb{R}^3 \setminus \{0\}$. This suggests the following definition.

Fig. 126: Vector product of two vectors $a$ and $b$.

Definition 3.5.14. For all $a, b \in \mathbb{R}^3$, we define the corresponding product $a \times b \in \mathbb{R}^3$ by
\[ a \times b := (a_2 b_3 - a_3 b_2 , \, a_3 b_1 - a_1 b_3 , \, a_1 b_2 - a_2 b_1) . \]
A simple calculation shows that
\[ a \cdot (a \times b) = b \cdot (a \times b) = 0 , \]
i.e., that $a$ and $b$ are both orthogonal to $a \times b$, and by the foregoing that
\[ |a \times b| = \sqrt{|a|^2 |b|^2 - (a \cdot b)^2} = |a| \, |b| \sin(\alpha) , \]
where $\alpha := \angle(a, b) \in [0, \pi]$, which according to Example 3.5.13 is the area of the parallelogram determined by $a, b$.

Remark 3.5.15. Let $a, b \in \mathbb{R}^3 \setminus \{0\}$. Then it follows by Example 3.5.13 that $a$ and $b$ are parallel if and only if
\[ a \times b = 0 . \]

The vector product satisfies the following rules that are frequently applied, e.g., in electrodynamics.

Theorem 3.5.16. Let $a, b, c, d \in \mathbb{R}^3$ and $\lambda \in \mathbb{R}$. Then

(i) $e_1 \times e_2 = e_3$, $e_3 \times e_1 = e_2$, $e_2 \times e_3 = e_1$,
(ii) $a \times b = - b \times a$,
(iii) $(\lambda . a) \times b = \lambda . (a \times b)$,
(iv) $(a + b) \times c = a \times c + b \times c$,
(v) $a \cdot (b \times c) = c \cdot (a \times b)$,
(vi) $a \times (b \times c) = (a \cdot c) . b - (a \cdot b) . c$,
(vii) $(a \times b) \cdot (c \times d) = (a \cdot c)(b \cdot d) - (a \cdot d)(b \cdot c)$.

Proof. The relations (i) to (iv) are obvious.
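'(v)':
\[ a \cdot (b \times c) = a \cdot (b_2 c_3 - b_3 c_2 , \, b_3 c_1 - b_1 c_3 , \, b_1 c_2 - b_2 c_1) \]
\[ = a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1 \]
\[ = c_1 (a_2 b_3 - a_3 b_2) + c_2 (a_3 b_1 - a_1 b_3) + c_3 (a_1 b_2 - a_2 b_1) = c \cdot (a \times b) . \]
'(vi)':
\[ a \times (b \times c) = \big( a_2 (b \times c)_3 - a_3 (b \times c)_2 , \; a_3 (b \times c)_1 - a_1 (b \times c)_3 , \; a_1 (b \times c)_2 - a_2 (b \times c)_1 \big) \]
\[ = \big( a_2 (b_1 c_2 - b_2 c_1) - a_3 (b_3 c_1 - b_1 c_3) , \; a_3 (b_2 c_3 - b_3 c_2) - a_1 (b_1 c_2 - b_2 c_1) , \; a_1 (b_3 c_1 - b_1 c_3) - a_2 (b_2 c_3 - b_3 c_2) \big) \]
\[ = \big( a_2 c_2 b_1 + a_3 c_3 b_1 - a_2 b_2 c_1 - a_3 b_3 c_1 , \; a_3 c_3 b_2 + a_1 c_1 b_2 - a_3 b_3 c_2 - a_1 b_1 c_2 , \; a_1 c_1 b_3 + a_2 c_2 b_3 - a_1 b_1 c_3 - a_2 b_2 c_3 \big) \]
\[ = \big( a_1 c_1 b_1 + a_2 c_2 b_1 + a_3 c_3 b_1 - a_1 b_1 c_1 - a_2 b_2 c_1 - a_3 b_3 c_1 , \; a_3 c_3 b_2 + a_1 c_1 b_2 + a_2 c_2 b_2 - a_2 b_2 c_2 - a_3 b_3 c_2 - a_1 b_1 c_2 , \; a_1 c_1 b_3 + a_2 c_2 b_3 + a_3 c_3 b_3 - a_3 b_3 c_3 - a_1 b_1 c_3 - a_2 b_2 c_3 \big) \]
\[ = (a \cdot c) . b - (a \cdot b) . c . \]
'(vii)':
\[ (a \times b) \cdot (c \times d) = (a_2 b_3 - a_3 b_2 , \, a_3 b_1 - a_1 b_3 , \, a_1 b_2 - a_2 b_1) \cdot (c_2 d_3 - c_3 d_2 , \, c_3 d_1 - c_1 d_3 , \, c_1 d_2 - c_2 d_1) \]
\[ = a_2 b_3 c_2 d_3 + a_3 b_2 c_3 d_2 - a_2 b_3 c_3 d_2 - a_3 b_2 c_2 d_3 + a_3 b_1 c_3 d_1 + a_1 b_3 c_1 d_3 - a_3 b_1 c_1 d_3 - a_1 b_3 c_3 d_1 + a_1 b_2 c_1 d_2 + a_2 b_1 c_2 d_1 - a_1 b_2 c_2 d_1 - a_2 b_1 c_1 d_2 \]
\[ = a_2 c_2 b_3 d_3 + a_3 c_3 b_2 d_2 + a_3 c_3 b_1 d_1 + a_1 c_1 b_3 d_3 + a_1 c_1 b_2 d_2 + a_2 c_2 b_1 d_1 - (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \]
\[ = (a \cdot c)(b \cdot d) - a_1 c_1 b_1 d_1 - a_2 c_2 b_2 d_2 - a_3 c_3 b_3 d_3 - (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \]
\[ = (a \cdot c)(b \cdot d) - \big[ a_1 d_1 b_1 c_1 + a_2 d_2 b_2 c_2 + a_3 d_3 b_3 c_3 + (a_2 d_2 b_3 c_3 + a_3 d_3 b_2 c_2 + a_3 d_3 b_1 c_1 + a_1 d_1 b_3 c_3 + a_1 d_1 b_2 c_2 + a_2 d_2 b_1 c_1) \big] \]
\[ = (a \cdot c)(b \cdot d) - (a \cdot d)(b \cdot c) . \]

From elementary geometry, it is known that the volume of a parallelepiped in space is given by the product of the area of one of its bases and the length of the corresponding height. As another application of the vector product, the following often allows a simpler calculation of that volume if the location of its corners is known with respect to a Cartesian coordinate system.

Example 3.5.17. Show that the volume $V$ of the parallelepiped with sides $a, b, c \in \mathbb{R}^3$ is given by the absolute value of the scalar triple product $a \cdot (b \times c)$:
\[ V = | a \cdot (b \times c) | \quad \big( = | c \cdot (a \times b) | = | b \cdot (c \times a) | \big) . \]

Fig. 127: The volume of a parallelepiped with sides $a$, $b$ and $c$ is given by $|a \cdot (b \times c)|$.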
Solution: The volume is equal to the length of the orthogonal projection $P(c)$ of $c$ onto the direction of $a \times b$ times the area of the parallelogram with sides $a, b$. Compare Fig. 127. Hence
\[ V = \left| c \cdot \frac{a \times b}{|a \times b|} \right| \, |a \times b| = | c \cdot (a \times b) | \]
where it is assumed that $a$ and $b$ are not scalar multiples of each other. If they are, $V$ vanishes, which is consistent with the above formula since in that case also $a \times b = 0$.

Determinants appear naturally in the representation of solutions of systems of linear equations, i.e., systems of equations where no powers of higher than first order are present, see Problem 13. This was recognized as early as 1693 by Leibniz in a letter to L'Hospital where such systems were studied that had parametric coefficients instead of explicitly given numbers. Today, the corresponding rule for solving such systems in terms of determinants is called Cramer's rule after Gabriel Cramer who published this rule in 1750 in a textbook [27]. The same rule was also published posthumously in 1748 in [74], two years after Colin Maclaurin's death. Among other things, determinants provide a simple way of determining areas of parallelograms in $\mathbb{R}^2$ and volumes of parallelepipeds in $\mathbb{R}^3$ if the locations of their corners are known with respect to a Cartesian coordinate system. In addition, they generalize the notions of area and volume to higher dimensions.

Definition 3.5.18. (The determinant function) Let $n \in \mathbb{N}^*$.

(i) For every $(k_1, \dots, k_n) \in \mathbb{Z}^n$, we define
\[ s(k_1, \dots, k_n) := \prod_{i,j=1, \, i<j}^{n} \operatorname{sgn}(k_j - k_i) \]
where the signum function $\operatorname{sgn} : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \operatorname{sgn}(x) := \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 . \end{cases} \]
In this, we use the convention that the empty product is equal to $1$. Note that this definition implies that $s(k_1, \dots, k_n) = 0$ if the ordered sequence $(k_1, \dots, k_n)$ contains two equal integers. Also, note for the case of pairwise different integers $k_1, \dots, k_n$ that $s(k_1, \dots, k_n) = 1$ if the number of pairs $(i, j) \in \{1, \dots, n\}^2$ such that $i < j$ and $k_j < k_i$ is even, whereas $s(k_1, \dots, k_n) = -1$ if that number is odd.

(ii) For every ordered $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, we define a corresponding determinant
\[ \det(a_1, \dots, a_n) := \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} := \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1 k_1} \cdots a_{n k_n} . \]

Note that according to the remark in (i), the sum in this definition has only to be taken over $n$-tuples $k_1, \dots, k_n$ which are permutations of $1, \dots, n$. Further, note that this definition leads in the case $n = 1$ to
\[ \det(a_1) = a_1 \]
and in the case $n = 2$ to
\[ \det(a, b) = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1 \]
for every $(a, b)$ such that $a, b \in \mathbb{R}^2$. Note that in the last case
\[ [\, \det(a, b) \,]^2 = |a|^2 |b|^2 - (a \cdot b)^2 . \]
Hence the area $A$ of the parallelogram with sides $a$ and $b$ is given by
\[ A = | \det(a, b) | . \]
Finally, in the case $n = 3$, this definition leads to
\[ \det(a, b, c) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \tag{3.5.7} \]
\[ = a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1 = a \cdot (b \times c) \]
\[ = a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - (a_3 b_2 c_1 + a_1 b_3 c_2 + a_2 b_1 c_3) \]
for every $(a, b, c)$ such that $a, b, c \in \mathbb{R}^3$. Note that
\[ \det(a, b, c) = a \cdot (b \times c) . \]
Hence the volume $V$ of the parallelepiped with sides $a$, $b$ and $c$ is given by
\[ V = | \det(a, b, c) | . \]

Fig. 128: Line corresponding to $x_0$ and $v$, see Definition 3.5.20.

Remark 3.5.19. Formally, for ease of remembrance, we can write
\[ a \times b = \begin{vmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]
thereby utilizing the representation of the determinant in terms of minors given in (3.5.7).

The interpretation of the elements of $\mathbb{R}^n$, $n \in \mathbb{N} \setminus \{0, 1\}$, as position vectors starting from the origin of a Cartesian coordinate system allows the following transparent definitions of lines and planes.

Definition 3.5.20. (Lines and Planes) Let $n \in \mathbb{N} \setminus \{0, 1\}$ and $x_0, v \in \mathbb{R}^n$, where $x_0 = (x_{01}, x_{02}, x_{03})$ in the case $n = 3$.

(i) We define a corresponding line by
\[ \{ x_0 + t . v \in \mathbb{R}^n : t \in \mathbb{R} \} . \]

(ii) In addition, let $n = 3$ and $w \in \mathbb{R}^n$ be such that $v, w$ are not multiples of one another.
Then we define a plane corresponding to $x_0, v, w$ by
\[ \{ x_0 + t . v + s . w \in \mathbb{R}^n : t, s \in \mathbb{R} \} . \]
Note that this set is equal to
\[ \{ x \in \mathbb{R}^n : n \cdot (x - x_0) = 0 \} = \{ x \in \mathbb{R}^n : n_1 x_1 + n_2 x_2 + n_3 x_3 - (n_1 x_{01} + n_2 x_{02} + n_3 x_{03}) = 0 \} \]
where $n = (n_1, n_2, n_3)$ is some normal vector to $v$ and $w$, i.e., some non-trivial multiple of $v \times w$.

Fig. 129: Plane corresponding to $x_0$, $v$ and $w$, see Definition 3.5.20.

Example 3.5.21. Calculate the distance $d$ between the two lines
\[ L_1 := \{ (4, 3, 1) + t . (1, 1, 2) \in \mathbb{R}^3 : t \in \mathbb{R} \} \]
and
\[ L_2 := \{ (1, 0, -3) + t . (1, 1, 2) \in \mathbb{R}^3 : t \in \mathbb{R} \} . \]
Solution: $L_1$ and $L_2$ are parallel. Therefore $d$ is given by the distance of some point on $L_1$, like $(4, 3, 1)$, from $L_2$, which by Theorem 3.5.12 is given by the length of (note that $(4, 3, 1) - (1, 0, -3) = (3, 3, 4)$)
\[ (3, 3, 4) - \frac{1}{6} \, [ (3, 3, 4) \cdot (1, 1, 2) ] . (1, 1, 2) = \frac{2}{3} . (1, 1, -1) . \]
Hence $d = 2 \sqrt{3} / 3$.

Example 3.5.22. Let $x_0, v, w \in \mathbb{R}^3$ be such that $v, w$ are not multiples of one another. Finally, let $E$ be the corresponding plane and $u \in \mathbb{R}^3$. Show that the distance $d(u, E)$ of $u$ from $E$ is given by
\[ d(u, E) = \frac{|(u - x_0) \cdot (v \times w)|}{|v \times w|} . \]
Solution: For this, let
\[ n := \frac{1}{|v \times w|} . (v \times w) , \qquad u_0 := u - x_0 . \]
Then it follows for every $t, s \in \mathbb{R}$ that
\[ |u_0 - t . v - s . w| = |(u_0 \cdot n) . n + u_0 - (u_0 \cdot n) . n - t . v - s . w| \geq |(u_0 \cdot n) . n| = |u_0 \cdot n| . \]
Also because $u_0 - (u_0 \cdot n) . n$ is normal to $n$, it follows that there are $t, s \in \mathbb{R}$ such that $t . v + s . w = u_0 - (u_0 \cdot n) . n$. Hence
\[ d(u, E) := \min \{ |u - x| : x \in E \} = |u_0 \cdot n| . \]

Example 3.5.23. Find the distance $d$ between the planes $x + y + 3z = 1$ and $2x + 2y + 6z = -0.5$.

Solution: Since the normals $n_1 = (1, 1, 3)$ and $n_2 = (2, 2, 6)$ are multiples of each other, these planes are parallel. Hence $d$ is given by the distance of some point on the first plane, like $(0, 0, 1/3)$, from the second plane. Therefore
\[ d = \frac{1}{\sqrt{11}} \, (1, 1, 3) \cdot [ (0, 0, 1/3) - (0, 0, -1/12) ] = \frac{5}{44} \sqrt{11} . \]

Example 3.5.24. Let $x_{01}, x_{02}, v, w \in \mathbb{R}^3$ and in particular let $v, w$ be no multiples of one another. Calculate the distance $d$ of the ('skew') lines
\[ L_1 := \{ x_{01} + t . v \in \mathbb{R}^3 : t \in \mathbb{R} \} \]
and
\[ L_2 := \{ x_{02} + s . w \in \mathbb{R}^3 : s \in \mathbb{R} \} . \]
Solution: For all $t, s \in \mathbb{R}$,
\[ |x_{01} + t . v - (x_{02} + s . w)| = |x_{01} - x_{02} + t . v + (-s) . w| . \]
Hence d is equal to the distance of the plane tx01 x02 t.v s.w P R3 : t, s P Ru from the origin, i.e, by d |px01 x02q pv wq| . |v w | Problems # ]q, of the vector that cor1) Determine the representative, i.e., ι1 p[ pq # responds to the oriented line segment pq between the points p and q given by a list of their coordinates with respect to a Cartesian coordinate system. In addition, calculate the distance between p and q. a) b) c) d) e) f) g) h) p p1, 2q , q p4, 7q , p p1, 2q , q p3, 4q , p p1, 3q , q p1, 2q , p p2, 4q , q p3, 4q , p p1, 3, 2q , q p3, 4, 7q , p p1, 2, 2q , q p3, 1, 4q , p p3, 1, 2q , q p1, 5, 2q , p p7, 1, 4q , q p3, 9, 4q . 3b, 3a 4b, a b, the angle θ between a and b, 2) Calculate |a|, 2a a vector of length one in the direction of a, the orthogonal projection of a onto the direction of b and, if at all possible, a b of the vectors a and b. a) a p3, 5q , b p1, 3q , 475 a p7, 1q , q b) p1, 3q , c) p p6, 9q , q p2, 4q , d) p p1, 5q , q p2, 6q , e) p p2, 3, 1q , q p3, 7, 7q , p p1, 3, 1q , q p4, 2, 3q , p p5, 2, 2q , q p2, 6, 3q p p9, 2, 3q , q p5, 8, 2q . f) g) h) , 3) Calculate detpa, bq, detpa, b, cq, respectively. a) a p4, 0q , b p7, 0q , b) a p8, 3q , b p2, 6q , c) a p8, 9q , b p5, 2q , d) a p4, 0, 1q , b p2, 0, 1q , c p3, 0, 9q , e) a p3, 1, 4q , b p1, 9, 1q , c p2, 3, 1q , f) a p4, 2, 3q , b p3, 5, 2q , c p4, 4, 1q . 4) Find suitable vectors x0 and v such that L tx0 tv : t P Ru where a) L tpx, y q P R2 : 3x b) L tpx, y q P R : 4y 2 c) L tpx, y q P R : 7x 4y 3u 2 d) 3y L tpx, y, z q P R : x 3 e) L tpx, y, z q P R : 3z 3 f) L tpx, y, z q P R : 5x 5u , 0u 3y 4z 7 ^ 9x 3 , 6y . 5 ^ 3x 2y z 3 0 ^ 8y 9y 0u 6z 12z , 0u 5) Find suitable vectors x0 , v and w such that P tx0 tv sw P R3 : t, s P Ru where a) P tpx, y, zq P R3 : x 3y 476 2z 1u 1u , . , tpx, y, zq P R3 : 3x 5u P tpx, y, z q P R3 : 2x 9y b) P , c) z 0u . 6) Calculate the distance between the lines L1 and L2 . 
tpx, y, zq P R3 : 12x 3y 2z 4 ^ x y z 0u , L2 tpx, y, z q P R3 : 4x y z 5 ^ 9x 3x 4z 7u , L1 tpx, y, z q P R3 : x 3y z 1 ^ y z 0u , L2 tpx, y, z q P R3 : 2x y 18z 4 ^ 5x 3y 1u , L1 tpx, y, z q P R3 : 9x 7y 2z 8 ^ x y z 0u , L2 tpx, y, z q P R3 : 22x 4y 14z 19 ^ 28x 7z 12u a) L1 b) c) 7) Calculate the distance of the point p from the plane P . a) p p5, 12, 3q , P b) c) tpx, yq P R3 : x 2y 3z 9u , p p0, 0, 0q , P tpx, y q P R3 : x y z 1u , p p6, 7, 2q , P tpx, y q P R3 : 18x y 9z 3u . 8) Let A, B and C be the vertices of a triangle. Find the representative # # # of [ AB ] [ BC ] [ CA ]. 9) Show that the line joining the midpoints of two sides of a triangle is parallel to and one-half the length of the third side. 10) Show that the medians of a triangle intersect in one point which is called the centroid. 11) Show that the perpendicular bisectors of a plane triangle intersect in one point which is called the circumcenter. 12) Show that the altitudes of a triangle, i.e., the the straight lines through the vertexes which are perpendicular to the opposite sides, intersect in one point which is called orthocenter. 13) a) Show that vectors a, b P R2 are linearly dependent, i.e., such that there are α, β P R such that α a β b 0 and α2 β 2 0, if and only if detpa, bq 0. b) Show that vectors a, b, c P R3 are linearly dependent, i.e., such that there are α, β, γ P R such that α a β b γ c 0 and α2 β 2 γ 2 0, if and only if detpa, b, cq 0. 477 . 14) (Cramer’s rule in two and three dimensions) a) Let a, b, c P R2 . Show that the of equation pa x, b xq c has a unique solution x P R2 if and only if detpa, bq 0. For that case express the solution x only in terms of detpc, bq, detpa, cq and detpa, bq. The result is called ‘Cramer’s rule’ in two dimensions. b) Let a, b, c, d P R3 . Show that the of equation pa x, b x, c xq d has a unique solution x P R3 if and only if detpa, b, cq 0. 
For that case express the solution $x$ only in terms of $\det(d, b, c)$, $\det(a, d, c)$, $\det(a, b, d)$ and $\det(a, b, c)$. The result is called 'Cramer's rule' in three dimensions.

3.5.3 Conic Sections

Conic sections were already known in ancient Greece. They were found by Menaechmus, a student of Eudoxus, in the search for curves that were suitable for the solution of the Delian problem. This problem comprises the construction, from the edge of a cube and with compass and straightedge alone, of the edge of a second cube of double volume. From today's perspective, this reduces essentially to the construction of a line segment of length $2^{1/3}$ with compass and straightedge alone. In his search, Menaechmus found ellipses, parabolas and hyperbolas as intersections of right circular cones with planes. For this, see Problem 7 in Section 3.5.5 on quadrics. From an analytic geometric point of view, ellipses, parabolas and hyperbolas are zero sets of quadratic polynomials in the coordinates of Cartesian coordinate systems in the plane. In the following, for each case, such a polynomial will be derived starting from a geometric definition of the curve.

A parabola is a subset of the plane consisting of those points that are equidistant from a given line, called its directrix, and a given point, called its focus. The point bisecting the distance between the focus and the directrix is called its vertex; it is on the parabola. The infinite line through the vertex which is perpendicular to the directrix is called the axis of the parabola.

Fig. 130: Parabola and corresponding directrix and focus. Compare Example 3.5.25.

Example 3.5.25. (Parabolas) Find a function whose zero set is a parabola $P$ with vertex in the origin, focus in the upper half-plane at $(0, p)$ and axis given by the $y$-axis of a Cartesian coordinate system.

Solution: As a consequence of the assumptions the directrix is given by $y = -p$.
Hence $(x, y) \in P$ if and only if
\[ \sqrt{x^2 + (y - p)^2} = |y + p| \;\Leftrightarrow\; x^2 + (y - p)^2 = (y + p)^2 \;\Leftrightarrow\; x^2 = 4py \]
and therefore $x = 0$ if $p = 0$, i.e., in this case the parabola is given by the $y$-axis, and $y = x^2/(4p)$ if $p \neq 0$. Hence
\[ P = \{(x, y) \in \mathbb{R}^2 : x = 0\} \]
if $p = 0$ and
\[ P = \left\{(x, y) \in \mathbb{R}^2 : y - \frac{x^2}{4p} = 0\right\} \]
if $p \neq 0$.

Fig. 131: Auxiliary diagram for the description of ancient Greek knowledge on parabolic segments. See Example 3.5.26.

Below, we give another typical example for the approach of analytical geometry, i.e., the replacement of intuition in the solution of geometric problems by algebraic calculations based on the introduction of an auxiliary Cartesian coordinate system. Parts of the example transcend analytic geometry since they also apply methods from calculus. Hence as a whole, the example belongs to the area of differential geometry.

Example 3.5.26. (Ancient Greek knowledge on parabolic segments) As an application of the previous example, we prove the following facts for chords $AE$ of parabolas, which provided the basis of Archimedes' quadrature of the parabola. See Fig 131.

(i) The tangent to the point $C$ on the parabola of largest distance from $AE$ is parallel to $AE$.

(ii) The parallel to the axis of the parabola through $C$ halves every line segment $BD$ between two points $B$ and $D$ on the parabola that is parallel to $AE$.

(iii) If $I$, $G$ are the points of intersection of the parallel to the axis through $C$ with $BD$ and $AE$, respectively, then
\[ \frac{\overline{CI}}{\overline{CG}} = \frac{(\overline{BI})^2}{(\overline{AG})^2} . \tag{3.5.8} \]

For the proofs, we consider the parabola $P$ given by the graph of $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \frac{x^2}{4p} \]
for every $x \in \mathbb{R}$ where $p > 0$. In addition, let $(x_1, y_1)$ and $(x_2, y_2)$ be two different points of $P$. Note that this implies that $x_2 \neq x_1$. Without loss of generality, we can assume that $x_1 < x_2$. Then the line segment $L$ between $(x_1, y_1)$ and $(x_2, y_2)$ is given by the graph of the map $h : [x_1, x_2] \to \mathbb{R}$ defined by
\[ h(x) := a (x - x_1) + y_1 \]
where
\[ a := \frac{y_2 - y_1}{x_2 - x_1} = \frac{1}{x_2 - x_1} \left( \frac{x_2^2}{4p} - \frac{x_1^2}{4p} \right) = \frac{1}{4p} \, \frac{x_2^2 - x_1^2}{x_2 - x_1} = \frac{1}{4p} \, (x_1 + x_2) . \]

First, we establish (i) in the following. For every $(x, y) \in \mathbb{R}^2$, we have the following decomposition
\[ (x, y) = \frac{x + a y}{1 + a^2} . (1, a) + \frac{y - a x}{1 + a^2} . (-a, 1) \tag{3.5.9} \]
where the vectors $(1, a)$ and $(-a, 1)$ are orthogonal with respect to the Euclidean scalar product. In particular, the last gives for $x \in [x_1, x_2]$
\begin{align*}
(x, h(x)) &= \frac{x + a h(x)}{1 + a^2} . (1, a) + \frac{h(x) - a x}{1 + a^2} . (-a, 1) \\
&= \frac{x + a^2 (x - x_1) + a y_1}{1 + a^2} . (1, a) + \frac{a (x - x_1) + y_1 - a x}{1 + a^2} . (-a, 1) \\
&= \frac{(1 + a^2) x + a (y_1 - a x_1)}{1 + a^2} . (1, a) + \frac{y_1 - a x_1}{1 + a^2} . (-a, 1) .
\end{align*}
Further, let $(x_0, y_0) \in P$ be such that $x_0 \in [x_1, x_2]$. Then (3.5.9) gives
\[ (x_0, y_0) = \frac{x_0 + a y_0}{1 + a^2} . (1, a) + \frac{y_0 - a x_0}{1 + a^2} . (-a, 1) . \]
Hence the square of the distance of the points $(x, h(x))$, where $x \in [x_1, x_2]$, and $(x_0, y_0)$ is given by
\begin{align*}
& \left| \left[ \frac{(1 + a^2) x + a (y_1 - a x_1)}{1 + a^2} - \frac{x_0 + a y_0}{1 + a^2} \right] . (1, a) + \left[ \frac{y_1 - a x_1}{1 + a^2} - \frac{y_0 - a x_0}{1 + a^2} \right] . (-a, 1) \right|^2 \\
&= \left| \frac{(1 + a^2) x + a (y_1 - y_0 - a x_1) - x_0}{1 + a^2} . (1, a) + \frac{y_1 - y_0 - a (x_1 - x_0)}{1 + a^2} . (-a, 1) \right|^2 \\
&= \frac{[\,(1 + a^2) x + a (y_1 - y_0 - a x_1) - x_0\,]^2 + [\,y_1 - y_0 - a (x_1 - x_0)\,]^2}{1 + a^2} .
\end{align*}
The zero of the first summand in the numerator of the last expression is given by
\[ x = \frac{x_0 - a (y_1 - y_0 - a x_1)}{1 + a^2} = x_1 + \frac{x_0 - x_1 + a (y_0 - y_1)}{1 + a^2} . \]
As a function of $x_0$, it is increasing, and it assumes at $x_0 = x_1$ the value $x = x_1$ and at $x_0 = x_2$ the value
\[ x = x_1 + \frac{x_2 - x_1 + a (y_2 - y_1)}{1 + a^2} = x_1 + \frac{x_2 - x_1 + a \left( \frac{x_2^2}{4p} - \frac{x_1^2}{4p} \right)}{1 + a^2} = x_1 + (x_2 - x_1) \, \frac{1 + a \, \frac{x_1 + x_2}{4p}}{1 + a^2} = x_2 . \]
Hence it follows that
\[ \frac{x_0 - a (y_1 - y_0 - a x_1)}{1 + a^2} \in [x_1, x_2] \]
for all $x_0 \in [x_1, x_2]$ and therefore that the minimal distance $d((x_0, y_0), L)$ of $(x_0, y_0)$ from the line segment $L$ is given by
\[ d((x_0, y_0), L) = \frac{|y_1 - y_0 - a (x_1 - x_0)|}{\sqrt{1 + a^2}} \]
and is assumed in precisely one point on $L$ with abscissa
\[ x_1 + \frac{x_0 - x_1 + a (y_0 - y_1)}{1 + a^2} . \]
Further, the distance function $D : [x_1, x_2] \to \mathbb{R}$ defined by
\[ D(x_0) := d((x_0, f(x_0)), L) \]
for every $x_0 \in [x_1, x_2]$ is continuous and hence assumes a maximum value. Since
\[ D(x_1) = 0 , \quad D(x_2) = \frac{|y_1 - y_2 - a (x_1 - x_2)|}{\sqrt{1 + a^2}} = 0 , \]
that maximum is assumed in the open interval between $x_1$ and $x_2$. Since $([0, \infty) \to \mathbb{R}, x \mapsto x^2)$ is strictly increasing, $D$ and $D^2$ assume maxima in the same points.
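Since $D^2$ is differentiable on the open interval between $x_1$ and $x_2$, we conclude that the derivative of $D^2$ vanishes in such a point $x_0$ in the open interval between $x_1$ and $x_2$, i.e., that
\[ 0 = (D^2)'(x_0) = \frac{2 \, [\,y_1 - y_0 - a (x_1 - x_0)\,]}{1 + a^2} \left( a - \frac{x_0}{2p} \right) = \frac{2 D(x_0)}{\sqrt{1 + a^2}} \left( a - \frac{x_0}{2p} \right) , \]
where $y_0 = f(x_0)$ and where it has been used that $y_1 - y_0 - a (x_1 - x_0) = h(x_0) - f(x_0) \geq 0$. Since $D(x_0) > 0$ in a point of maximum, it follows that $x_0 = 2pa$. As a consequence, the maximal distance $d((x_0, f(x_0)), L)$ is assumed in precisely one point with coordinates
\[ (x_0, f(x_0)) = (2pa, \, pa^2) . \tag{3.5.10} \]
Finally, the slope of the tangent to the graph of $f$ in this point is given by
\[ f'(2pa) = \frac{2pa}{2p} = a \]
and hence equal to the slope of the line segment $L$. Hence, indeed, the tangent to the graph of $f$ in this point is parallel to the line segment $L$.

Finally, we establish (ii) and (iii) in the following. For this, let $(x_3, y_3), (x_4, y_4) \in P$ be such that $x_3 < x_4$ and such that
\[ \frac{y_4 - y_3}{x_4 - x_3} = a . \]
Then the intersection of the parallel to the axis through the point (3.5.10) with the line segment between $(x_3, y_3)$, $(x_4, y_4)$ is given by
\[ (x, y) = (2pa, \, a (2pa - x_3) + y_3) . \]
Since $a = (x_3 + x_4)/(4p)$, we conclude that
\[ x = 2pa = \frac{x_3 + x_4}{2} , \]
that
\[ y = a (2pa - x_3) + y_3 = \frac{x_3 + x_4}{4p} \cdot \frac{x_4 - x_3}{2} + y_3 = \frac{x_4^2 - x_3^2}{8p} + y_3 = \frac{y_4 - y_3}{2} + y_3 = \frac{y_3 + y_4}{2} \]
and hence that
\[ (x, y) = \left( \frac{x_3 + x_4}{2} , \frac{y_3 + y_4}{2} \right) . \]
Hence, the distances of $(x_3, y_3)$ and $(x_4, y_4)$ from $(x, y)$ are indeed the same:
\[ \sqrt{\left( x_3 - \frac{x_3 + x_4}{2} \right)^2 + \left( y_3 - \frac{y_3 + y_4}{2} \right)^2} = \sqrt{\left( \frac{x_3 - x_4}{2} \right)^2 + \left( \frac{y_3 - y_4}{2} \right)^2} = \sqrt{\left( x_4 - \frac{x_3 + x_4}{2} \right)^2 + \left( y_4 - \frac{y_3 + y_4}{2} \right)^2} . \]
We notice that the distance $d((x_3, y_3), (x, y))$ equals
\[ \sqrt{(2pa - x_3)^2 + a^2 (2pa - x_3)^2} = \sqrt{1 + a^2} \, |2pa - x_3| = \sqrt{1 + a^2} \, \left| \frac{x_3 + x_4}{2} - x_3 \right| = \frac{1}{2} \sqrt{1 + a^2} \, (x_4 - x_3) . \]
Further, the distance $d((x_0, y_0), (x, y))$ between $(x_0, y_0)$ and $(x, y)$ is given by
\[ a (2pa - x_3) + y_3 - pa^2 = a (pa - x_3) + \frac{x_3^2}{4p} = \frac{(2pa - x_3)^2}{4p} = \frac{(x_4 - x_3)^2}{16p} . \]
Finally, it follows that
\[ \frac{d((x_0, y_0), (x, y))}{[\, d((x_3, y_3), (x, y)) \,]^2} = \frac{(x_4 - x_3)^2}{16p} \cdot \frac{4}{(1 + a^2) (x_4 - x_3)^2} = \frac{1}{4p \, (1 + a^2)} . \]
Since the right-hand side is the same for every chord of slope $a$, in particular for $BD$ and for $AE$, this gives (3.5.8).

An ellipse is a subset of the plane consisting of those points for which the sum of the distances from two points, called foci, is constant.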
Because of the triangle inequality for the Euclidean distance, that constant is greater than or equal to the distance of the foci. If the constant is non-zero, the ratio between the distance of the foci and the constant is called the eccentricity of the ellipse. The line connecting the foci of an ellipse is called eccentric line and its midpoint the center of the ellipse.

Fig. 132: Ellipse and corresponding foci. Compare Example 3.5.27.

Example 3.5.27. (Ellipses) Find a function whose zero set is an ellipse $E$ with foci at $(-c, 0)$ and $(c, 0)$ and constant $2a$ where $a \geq c \geq 0$. Solution: $(x, y) \in E$ if and only if
\[ d_1 + d_2 = 2a \tag{3.5.11} \]
where $d_1 := \sqrt{(x + c)^2 + y^2}$ and $d_2 := \sqrt{(x - c)^2 + y^2}$. In case that $a = c = 0$, this is equivalent to $x = y = 0$, i.e., the ellipse is given by the origin. In the following, let $a > 0$. Then
\[ 2a \, (d_1 - d_2) = d_1^2 - d_2^2 = (x + c)^2 + y^2 - (x - c)^2 - y^2 = 4cx \]
and
\[ d_1 - d_2 = \frac{2c}{a} \, x . \tag{3.5.12} \]
Hence (3.5.11) is equivalent to the combination of (3.5.11) and (3.5.12), and therefore to
\[ d_1 = a + \frac{c}{a} \, x , \quad d_2 = a - \frac{c}{a} \, x . \tag{3.5.13} \]
We consider two cases. In case that $a = c$, equations (3.5.13) are equivalent to $|x| \leq c$ and $y = 0$, i.e., in this case, the ellipse is given by the line segment $[-c, c] \times \{0\}$. In case that $a > c$, equations (3.5.13) are equivalent to
\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \tag{3.5.14} \]
and the condition
\[ |x| \leq \frac{a^2}{c} . \tag{3.5.15} \]
Now the assumption $|x| > a^2/c$ and (3.5.14) lead to the contradiction that
\[ 0 > 1 - \frac{a^2}{c^2} > \frac{y^2}{a^2 - c^2} . \]
Hence (3.5.14) implies (3.5.15), and (3.5.13) is equivalent to (3.5.14). Hence
\[ E = \{(0, 0)\} \]
if $a = c = 0$,
\[ E = \{(x, 0) \in \mathbb{R}^2 : -c \leq x \leq c\} \]
if $a > 0$, $c = a$ and
\[ E = \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \right\} \]
if $a > 0$, $a > c$. Note that $E$ is a circle of radius $a$ if $c = 0$.

Fig. 133: Hyperbolas, corresponding foci and asymptotes (dashed). Compare Example 3.5.28.

A hyperbola is a subset of the plane consisting of those points for which the difference of the distances from two points, called foci, is constant.
Because of the triangle inequality for the Euclidean distance, the absolute value of that constant is smaller than or equal to the distance of the foci. If the constant is non-zero, the ratio between the distance of the foci and its absolute value is called the eccentricity of the hyperbola.

Example 3.5.28. (Hyperbolas) Find a function whose zero set is a hyperbola $H$ with foci at $(-c, 0)$ and $(c, 0)$, where $c \geq 0$, and constant $2a$ such that $|a| \leq c$. Solution: $(x, y) \in H$ if and only if
\[ d_1 - d_2 = 2a \tag{3.5.16} \]
where $d_1 := \sqrt{(x + c)^2 + y^2}$ and $d_2 := \sqrt{(x - c)^2 + y^2}$. We consider three cases. In case that $c = a = 0$, equation (3.5.16) is satisfied by all $(x, y) \in \mathbb{R}^2$, i.e., the hyperbola is given by the whole plane. In case that $c > 0$, $a = 0$, (3.5.16) is equivalent to $x = 0$ and $y \in \mathbb{R}$, i.e., the hyperbola is given by the $y$-axis. In case that $c > 0$, $a \neq 0$, it follows that
\[ 2a \, (d_1 + d_2) = d_1^2 - d_2^2 = (x + c)^2 + y^2 - (x - c)^2 - y^2 = 4cx \]
and that
\[ d_1 + d_2 = \frac{2c}{a} \, x . \tag{3.5.17} \]
Hence (3.5.16) is equivalent to the combination of (3.5.16) and (3.5.17), and therefore to
\[ d_1 = \frac{c}{a} \, x + a , \quad d_2 = \frac{c}{a} \, x - a . \tag{3.5.18} \]
Equations (3.5.18) are equivalent to
\[ \frac{c^2 - a^2}{a^2} \, x^2 - y^2 = c^2 - a^2 \tag{3.5.19} \]
and the condition
\[ \frac{x}{a} \geq \frac{|a|}{c} . \tag{3.5.20} \]
In case that $a \in \{-c, c\}$, (3.5.19) and (3.5.20) are equivalent to $x \leq -c$ and $y = 0$, $x \geq c$ and $y = 0$, respectively, i.e., the hyperbola is given by the respective half-lines. In case that $|a| < c$, equations (3.5.19) and (3.5.20) are equivalent to
\[ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \tag{3.5.21} \]
and
\[ x \geq \frac{a^2}{c} \]
if $a > 0$ and
\[ x \leq - \frac{a^2}{c} \]
if $a < 0$, respectively. The assumption $0 \leq x < a^2/c$ or $-a^2/c < x \leq 0$ leads together with (3.5.21) to the contradiction
\[ \frac{y^2}{c^2 - a^2} = \frac{x^2}{a^2} - 1 < \frac{a^2}{c^2} - 1 \leq 0 . \]
Hence (3.5.19) and (3.5.20) are equivalent to (3.5.21) and $x > 0$ if $a > 0$, and to (3.5.21) and $x < 0$ if $a < 0$. We conclude that
\[ H = \mathbb{R}^2 \]
if $c = a = 0$,
\[ H = \{(0, y) \in \mathbb{R}^2 : y \in \mathbb{R}\} \]
if $c > 0$, $a = 0$,
\[ H = \{(x, 0) \in \mathbb{R}^2 : x \leq -c\} \]
if $c > 0$, $a = -c$,
\[ H = \{(x, 0) \in \mathbb{R}^2 : x \geq c\} \]
if $c > 0$, $a = c$,
\[ H = \left\{ (x, y) \in \mathbb{R}^2 : x = a \sqrt{1 + \frac{y^2}{c^2 - a^2}} \right\} \]
if $c > a > 0$ and
\[ H = \left\{ (x, y) \in \mathbb{R}^2 : x = a \sqrt{1 + \frac{y^2}{c^2 - a^2}} \right\} \]
if $c > -a > 0$.

Remark 3.5.29.
Note that from an analytic algebraic point of view conics are zero sets of second order polynomials in the coordinates of Cartesian coordinate systems in the plane. Later on in Section 3.5.5, quadrics will be defined as corresponding sets in three-dimensional space.

Problems

1) Find the vertex, focus and the directrix of the parabola.
a) $\{(x, y) \in \mathbb{R}^2 : y^2\ 3x\ 0\}$,
b) $\{(x, y) \in \mathbb{R}^2 : y^2\ x/2\}$,
c) $\{(x, y) \in \mathbb{R}^2 : y\ 5\ 2(x\ 3)^2\}$,
d) $\{(x, y) \in \mathbb{R}^2 : y\ x^2\ x\}$,
e) $\{(x, y) \in \mathbb{R}^2 : y\ 2\ x^2\}$,
f) $\{(x, y) \in \mathbb{R}^2 : y\ x^2\ 3x\ 1\}$.

2) Find a function whose zero set coincides with $P$.
a) $P$ is the parabola with focus $(1, 3)$ and directrix $\{(x, y) \in \mathbb{R}^2 : x\ 2y\ 1\ 0\}$,
b) $P$ is the parabola with focus $(4, 3)$ and directrix $\{(x, y) \in \mathbb{R}^2 : x\ 2y\ 1\}$,
c) $P$ is the parabola with focus $(1, 2)$ and directrix $\{(x, y) \in \mathbb{R}^2 : 2x\ y\ 3\}$.

3) Find the location of the foci and the eccentricity of the ellipse.
a) $\{(x, y) \in \mathbb{R}^2 : x^2 + 2y^2 = 6\}$,
b) $\{(x, y) \in \mathbb{R}^2 : 5x^2 + 11y^2 = 10\}$,
c) $\{(x, y) \in \mathbb{R}^2 : 2x^2 + 4y^2 = 5\}$,
d) $\{(x, y) \in \mathbb{R}^2 : 3x^2 + 2y^2 = 1\}$,
e) $\{(x, y) \in \mathbb{R}^2 : 6x^2 + 7y^2 = 4\}$,
f) $\{(x, y) \in \mathbb{R}^2 : y\ (x^2/3)\ (y^2/4)\ (1/2)\}$.

4) The lines connecting the foci of the following ellipses are parallel to the $x$-axis. Find the location of their foci and their eccentricities.
a) $\{(x, y) \in \mathbb{R}^2 : 3x^2\ 2y^2\ 3x\ (4y/3)\ (1/36)\}$,
b) $\{(x, y) \in \mathbb{R}^2 : x^2\ 3y^2\ 4x\ 12y\ 12\}$,
c) $\{(x, y) \in \mathbb{R}^2 : 4x^2\ 2y^2\ 8x\ 12y\ 21\ 0\}$.

5) Find a function whose zero set is an ellipse of eccentricity 2 and foci at $(1, 1)$, $(2, 2)$.

6) Find the location of the foci and the eccentricity of the hyperbola.
a) $\{(x, y) \in \mathbb{R}^2 : 2x^2\ y^2\ 5\}$,
b) $\{(x, y) \in \mathbb{R}^2 : 7x^2\ 9y^2\ 9\}$,
c) $\{(x, y) \in \mathbb{R}^2 : 3x^2\ 5y^2\ 4\}$,
d) $\{(x, y) \in \mathbb{R}^2 : 4x^2\ y^2\ 2\}$,
e) $\{(x, y) \in \mathbb{R}^2 : 7x^2\ 4y^2\ 1\}$,
f) $\{(x, y) \in \mathbb{R}^2 : y\ (x^2/4)\ (y^2/2)\ (1/4)\}$.

7) The lines connecting the foci of the following hyperbolas are parallel to the $x$-axis. Find the location of their foci and their eccentricities.
a) $\{(x, y) \in \mathbb{R}^2 : 2x^2\ 4y^2\ (2x/3)\ 4y\ (35/18)\}$,
b) $\{(x, y) \in \mathbb{R}^2 : 3x^2\ y^2\ 12x\ 4y\ 7\}$,
c) $\{(x, y) \in \mathbb{R}^2 : 2x^2\ 3y^2\ 4x\ 18y\ 30\}$.
8) Find a function whose zero set is a hyperbola of eccentricity 4 and foci at $(1, 1)$, $(2, 2)$.

9) Show that the given set is an ellipse.
a) $\left\{ \left( a \, \frac{1 - t^2}{1 + t^2} , \, b \, \frac{2t}{1 + t^2} \right) \in \mathbb{R}^2 : t \in \mathbb{R} \right\}$,
b) $\{ (a \cos(\theta), b \sin(\theta)) : \theta \in \mathbb{R} \}$
where $a > 0$ and $b > 0$.

10) Show that the given set is a hyperbola.
a) $\left\{ \left( a \, \frac{1 + t^2}{1 - t^2} , \, b \, \frac{2t}{1 - t^2} \right) \in \mathbb{R}^2 : -1 < t < 1 \right\}$,
b) $\left\{ \left( \frac{a}{\cos(\theta)} , \, b \tan(\theta) \right) : \theta \in (-\pi/2, \pi/2) \right\}$,
c) $\{ (a \cosh(t), b \sinh(t)) : t \in \mathbb{R} \}$
where $a > 0$ and $b > 0$.

3.5.4 Polar Coordinates

In addition to Cartesian coordinate systems, there are other options to coordinate the points in the plane. Most important in this respect are polar coordinate systems, see Fig 134. In physics applications, such coordinates are generally applied if the system is, in a certain sense, symmetric under rotations around a particular point. In such cases, that point is chosen as the origin of the polar coordinate system. In these situations, polar coordinates considerably simplify the analysis of the system compared to Cartesian coordinate systems.

Polar coordinate systems use as coordinates the distance $r$ of a point $p$ from an origin $O$ and the angle $\varphi$ of the line segment $Op$ with a given half-line originating from $O$. For example in Fig 134, the latter is given by the positive $x$-axis of a Cartesian coordinate system. We immediately notice two problems here. First, the origin does not correspond to a unique pair of coordinates $r$ and $\varphi$ and hence has to be excluded. Second, there are various ways to measure the angle from the positive $x$-axis. For instance, if we let $\varphi$ run in the interval $[0, 2\pi]$, then the points on the half-line
\[ H_{+} := \{(x, 0) \in \mathbb{R}^2 : x > 0\} \]
don't correspond to unique pairs of coordinates $r$ and $\varphi$. Hence in this case, we need to exclude one of the angles $0$ or $2\pi$. But then $\varphi$ 'jumps' for points on $H_{+}$ depending on whether we approach such a point from below $H_{+}$ or from above. Such behavior along $H_{+}$ is usually undesirable for applications. For this reason, below $\varphi$ runs in the interval $(-\pi, \pi]$.
Then the jump occurs only on
\[ H_{-} := \{(x, 0) \in \mathbb{R}^2 : x < 0\} \]
which is usually acceptable for applications. Of course, we could also have chosen $[-\pi, \pi)$ for that purpose. That would have led to a different coordinatization of the points on $H_{-}$ only. On the other hand, we will see later in Calculus III that, in certain applications, $H_{-}$ needs to be excluded from coordinatization because the transformation $g$ below, from polar coordinates to Cartesian coordinates, is not everywhere differentiable. Hence usually, the convention used for the coordinatization of $H_{-}$ has no important consequences.

Example 3.5.30. (Polar coordinates) Define
\[ g : (0, \infty) \times (-\pi, \pi] \to \mathbb{R}^2 \setminus \{(0, 0)\} \]
by
\[ g(r, \varphi) := (r \cos \varphi, r \sin \varphi) \]
for all $r \in (0, \infty)$, $\varphi \in (-\pi, \pi]$. Then $g$ is bijective with the inverse
\[ g^{-1} : \mathbb{R}^2 \setminus \{(0, 0)\} \to (0, \infty) \times (-\pi, \pi] \]
given by
\[ g^{-1}(x, y) = \begin{cases} \left( \sqrt{x^2 + y^2} , \ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \geq 0 \\ \left( \sqrt{x^2 + y^2} , \ -\arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}$.

Fig. 134: Polar coordinates $r$, $\varphi$ of a point $p$ in the plane. $r$ is the Euclidean distance of $O$ and $p$. Compare Example 3.5.30.

Example 3.5.31. Find a parametrization of the ellipse
\[ E := \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\} , \]
i.e., a bijective map whose range coincides with $E$, where $a, b > 0$. Solution: Define the scale transformation $f : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ f(x, y) := (ax, by) \]
for all $x, y \in \mathbb{R}$. Then
\[ f(S^1(0)) = E . \]
Employing polar coordinates for parametrization, $S^1(0)$ is given by
\[ S^1(0) = \{ (\cos \varphi, \sin \varphi) \in \mathbb{R}^2 : -\pi < \varphi \leq \pi \} . \]
Hence
\[ E = \{ (a \cos \varphi, b \sin \varphi) \in \mathbb{R}^2 : -\pi < \varphi \leq \pi \} , \]
and a parametrization of $E$ is given by $h : (-\pi, \pi] \to \mathbb{R}^2$ defined by
\[ h(\varphi) := (a \cos \varphi, b \sin \varphi) \]
for all $\varphi \in (-\pi, \pi]$.

The following three examples use polar coordinates for the parametrization of ellipses, parabolas and hyperbolas that have foci in the origin of a Cartesian coordinate system. The results are frequently applied in astronomy, in the description of the motion of objects in the gravitational field of a central object.
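The transformation $g$ of Example 3.5.30, its inverse and the ellipse parametrization of Example 3.5.31 can be checked numerically. The following is only a minimal sketch in Python: the function names `g`, `g_inv` and `ellipse_point` are chosen here for illustration, and the clamping of the argument of `acos` is an assumption added to guard against rounding, not part of the formulas in the text.

```python
import math

def g(r, phi):
    """Polar -> Cartesian for r > 0 and -pi < phi <= pi (Example 3.5.30)."""
    return (r * math.cos(phi), r * math.sin(phi))

def g_inv(x, y):
    """Cartesian -> polar, following the arccos formula of Example 3.5.30."""
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("the origin has no polar coordinates")
    # Clamp to [-1, 1]: rounding may push x / r slightly outside (assumption).
    angle = math.acos(max(-1.0, min(1.0, x / r)))
    return (r, angle if y >= 0 else -angle)

def ellipse_point(a, b, phi):
    """Point h(phi) of the ellipse with half-axes a, b (Example 3.5.31)."""
    return (a * math.cos(phi), b * math.sin(phi))

if __name__ == "__main__":
    for r, phi in [(2.0, 2.5), (0.5, -1.2), (1.0, math.pi)]:
        print((r, phi), "->", g_inv(*g(r, phi)))
    x, y = ellipse_point(3.0, 2.0, 0.7)
    print(x**2 / 3.0**2 + y**2 / 2.0**2)  # should be close to 1
```

Points on the negative $x$-axis are mapped by `g_inv` to the angle $\pi$, in line with the choice of the interval $(-\pi, \pi]$. In the astronomical setting just mentioned, $r$ and $\varphi$ describe the position of an object moving in the gravitational field of a central object placed at the origin.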
Such motion in the gravitational field of a central object proceeds on ellipses, parabolas or hyperbolas with the position of the central object as a focus.

Example 3.5.32. (Polar representation of parabola with focus in the origin) Let $p > 0$. Show that
\[ P_p = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r (1 + \cos \varphi) = 2p \} \tag{3.5.22} \]
is a parabola with focus at the origin and directrix given by the parallel to the $y$-axis through the point $(2p, 0)$. Solution: For this, denote by $P_p$ the parabola with focus at the origin and directrix given by the parallel to the $y$-axis through the point $(2p, 0)$. Then $(x, y) \in P_p$ if and only if
\[ \sqrt{x^2 + y^2} = \sqrt{(x - 2p)^2} = |x - 2p| . \tag{3.5.23} \]
The equation
\[ \sqrt{x^2 + y^2} = x - 2p \]
implies that
\[ x - 2p = \sqrt{x^2 + y^2} \geq x \]
and hence that $-2p \geq 0$ which is in contradiction to the assumptions. Hence this equation has no solution in $\mathbb{R}^2$. Therefore (3.5.23) is equivalent to
\[ \sqrt{x^2 + y^2} = -x + 2p \]
and
\[ P_p = \{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + x = 2p \} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin P_p$, we conclude (3.5.22). Note that as a consequence of the foregoing, (3.5.23) is equivalent to
\[ x^2 + y^2 = (2p - x)^2 = x^2 - 4px + 4p^2 \]
and hence to
\[ y^2 = 4p \, (p - x) . \]

Example 3.5.33. (Polar representation of ellipses with focus in the origin) Define for $a > 0$ and $0 \leq \varepsilon < 1$ the corresponding ellipse $E_{a,\varepsilon}$ with center $(-a\varepsilon, 0)$, foci at $(-2a\varepsilon, 0)$, $(0, 0)$ and eccentricity $\varepsilon$ by
\[ E_{a,\varepsilon} := \left\{ (x, y) \in \mathbb{R}^2 : \frac{(x + a\varepsilon)^2}{a^2} + \frac{y^2}{a^2 (1 - \varepsilon^2)} = 1 \right\} . \]
Show that
\[ E_{a,\varepsilon} = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r (1 + \varepsilon \cos \varphi) = a (1 - \varepsilon^2) \} . \tag{3.5.24} \]
Solution: In Example 3.5.27, we showed for $a > 0$ and $0 \leq c < a$ that the following equations are equivalent for $(x, y) \in \mathbb{R}^2$:
\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \]
and
\[ \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a . \]
Hence if $a > 0$ and $0 \leq \varepsilon < 1$, then also the equations
\[ \frac{(x + a\varepsilon)^2}{a^2} + \frac{y^2}{a^2 (1 - \varepsilon^2)} = 1 \]
and
\[ \sqrt{(x + 2a\varepsilon)^2 + y^2} + \sqrt{x^2 + y^2} = 2a \]
are equivalent for $(x, y) \in \mathbb{R}^2$.
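The last equation is equivalent to
\[ (x + 2a\varepsilon)^2 + y^2 = \left( 2a - \sqrt{x^2 + y^2} \right)^2 \tag{3.5.25} \]
since the equation
\[ \sqrt{(x + 2a\varepsilon)^2 + y^2} = \sqrt{x^2 + y^2} - 2a \]
leads by use of the triangle inequality to
\[ |(x, y) - (-2a\varepsilon, 0)| = |(x, y)| - 2a \leq |(x, y) - (-2a\varepsilon, 0)| - 2a (1 - \varepsilon) \]
and therefore has no solution in $\mathbb{R}^2$ since $a > 0$ and $\varepsilon < 1$. Further, (3.5.25) is equivalent to
\[ x^2 + y^2 + 4a^2 \varepsilon^2 + 4a\varepsilon x = x^2 + y^2 - 4a \sqrt{x^2 + y^2} + 4a^2 \]
which is equivalent to
\[ \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) . \]
As a consequence, we arrive at the representation
\[ E_{a,\varepsilon} = \left\{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) \right\} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin E_{a,\varepsilon}$, we conclude (3.5.24).

Fig. 135: Parabola, ellipse, hyperbola and asymptotes corresponding to the parameters $p = 1$, $\varepsilon = 1/2$ and $a (1 - \varepsilon^2) = 1$, $\varepsilon = 3/2$ and $a (1 - \varepsilon^2) = 1$, respectively. Compare Examples 3.5.32, 3.5.33 and 3.5.34.

Example 3.5.34. (Polar representation of hyperbola with focus in the origin) Define for $a < 0$ and $\varepsilon > 1$ the corresponding hyperbola $H_{a,\varepsilon}$ with center $(-a\varepsilon, 0)$, foci at $(0, 0)$, $(-2a\varepsilon, 0)$ and eccentricity $\varepsilon$ by
\[ H_{a,\varepsilon} := \left\{ (x, y) \in \mathbb{R}^2 : x \leq - \frac{a}{\varepsilon} \, (\varepsilon^2 - 1) \wedge \frac{(x + a\varepsilon)^2}{a^2} - \frac{y^2}{a^2 (\varepsilon^2 - 1)} = 1 \right\} . \]
Show that
\[ H_{a,\varepsilon} = \{ (r \cos \varphi, r \sin \varphi) \in \mathbb{R}^2 : r > 0 \wedge -\pi < \varphi \leq \pi \wedge r (1 + \varepsilon \cos \varphi) = a (1 - \varepsilon^2) \} . \tag{3.5.26} \]
Solution: In Example 3.5.28, we showed for $a < 0$ and $c > -a$ that the following equations are equivalent for $(x, y) \in \mathbb{R}^2$:
\[ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \]
together with the condition that
\[ x \leq - \frac{a^2}{c} \]
and
\[ \sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = 2a . \]
Therefore also the equations
\[ \frac{(x - c)^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1 \]
together with the condition that
\[ x \leq \frac{1}{c} \, (c^2 - a^2) \]
and
\[ \sqrt{x^2 + y^2} - \sqrt{(x - 2c)^2 + y^2} = 2a \]
are equivalent. Hence if $a < 0$ and $\varepsilon > 1$, the equations
\[ \frac{(x + a\varepsilon)^2}{a^2} - \frac{y^2}{a^2 (\varepsilon^2 - 1)} = 1 \]
together with the condition that
\[ x \leq - \frac{a}{\varepsilon} \, (\varepsilon^2 - 1) \]
and
\[ \sqrt{x^2 + y^2} - \sqrt{(x + 2a\varepsilon)^2 + y^2} = 2a \]
are equivalent. The last equation is equivalent to
\[ (x + 2a\varepsilon)^2 + y^2 = \left( \sqrt{x^2 + y^2} - 2a \right)^2 \tag{3.5.27} \]
since the equation
\[ 2a - \sqrt{x^2 + y^2} = \sqrt{(x + 2a\varepsilon)^2 + y^2} \]
has no solution in $\mathbb{R}^2$ since $a < 0$. Further, (3.5.27) is equivalent to
\[ x^2 + y^2 + 4a^2 \varepsilon^2 + 4a\varepsilon x = x^2 + y^2 - 4a \sqrt{x^2 + y^2} + 4a^2 \]
which is equivalent to
\[ \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) . \]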
As a consequence, we arrive at the representation
\[ H_{a,\varepsilon} = \left\{ (x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} + \varepsilon x = a (1 - \varepsilon^2) \right\} . \]
Finally, since $g$ from Example 3.5.30 is bijective and $(0, 0) \notin H_{a,\varepsilon}$, we conclude (3.5.26).

Problems

1) Sketch the image of the set under $g$ from Example 3.5.30 on polar coordinates.
a) $\{(r, \varphi) \in D(g) : r\ 3\}$,
b) $\{(r, \varphi) \in D(g) : r > 2\}$,
c) $\{(r, \varphi) \in D(g) : -\pi/2 \leq \varphi \leq \pi/2\}$,
d) $\{(r, \varphi) \in D(g) : 3\pi/4 \leq \varphi \leq 5\pi/6\}$,
e) $\{(r, \varphi) \in D(g) : -3\pi/4 \leq \varphi \leq \pi/4\}$,
f) $\{(r, \varphi) \in D(g) : 1 < r < 2 \wedge \pi/6 \leq \varphi \leq \pi/3\}$,
g) $\{(r, \varphi) \in D(g) : 0 < r < 1 \wedge -\pi/3 \leq \varphi \leq \pi/6\}$.

2) Find a function whose zero set coincides with $g(C)$ where $g$ is the transformation from Example 3.5.30 on polar coordinates. In addition, sketch $g(C)$.
a) $C = \{(r, \varphi) \in D(g) : r\ 4\}$,
b) $C = \{(r, \varphi) \in D(g) : 1\ r\,[\,2 \cos(\varphi)\ \sin(\varphi)\,]\ 0\}$,
c) $C = \{(r, \varphi) \in D(g) : r\ 2 \cos(\varphi)\}$,
d) $C = \{(r, \varphi) \in D(g) : r\ 1 / [\,2\ \cos(\varphi)\,]\}$,
e) $C = \{(r, \varphi) \in D(g) : r\ \sin^2(\varphi)\}$,
f) $C = \{(r, \varphi) \in D(g) : r^2 \sin(2\varphi)\ 1\}$,
g) $C = \{(r, \varphi) \in D(g) : r^2\ 2 \sin(\varphi)\}$.

3) Find a function whose zero set coincides with $g^{-1}(C)$ where $g$ is the transformation from Example 3.5.30 on polar coordinates.
a) $C = \{(x, y) \in \mathbb{R}^2 : x\ 3\}$,
b) $C = \{(x, y) \in \mathbb{R}^2 : 3x\ 2y\ 7\}$,
c) $C = \{(x, y) \in \mathbb{R}^2 : 3x^2\ y^2\ 9\}$,
d) $C = \{(x, y) \in \mathbb{R}^2 : y\ 4x^2\ 1\ 0\}$,
e) $C = \{(x, y) \in \mathbb{R}^2 : 8x^2\ 4y^2\ 1\ 0\}$,
f) $C = \{(x, y) \in \mathbb{R}^2 : 3xy\ 4\}$,
g) $C = \{(x, y) \in \mathbb{R}^2 : 2x^2\ 3x\ y^2\ 1\}$.

4) Show that $g$ from Example 3.5.30 on polar coordinates is bijective by verifying that $g^{-1}(g(r, \varphi)) = (r, \varphi)$ for all $(r, \varphi) \in D(g)$ and $g(g^{-1}(x, y)) = (x, y)$ for all $(x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}$.

3.5.5 Quadric Surfaces

Quadric surfaces are zero sets of second order polynomials in the coordinates of Cartesian coordinate systems in space. All quadrics are unique only up to rigid transformations, i.e., compositions of translations and rotations, in space. In the following, we give a brief discussion of the most important normal forms of quadrics. A detailed study of their geometric properties is the object of courses in differential geometry.
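The two statements above, that a quadric is the zero set of a second order polynomial and that it is determined only up to rigid transformations, can be illustrated by a small numerical experiment. The sketch below is written under assumptions made only for this illustration: the polynomial is stored as $q(p) = p^{T} A p + b \cdot p + c$ with a symmetric $3 \times 3$ matrix $A$, the rigid transformation is $p \mapsto R p + t$ with a rotation $R$ about the $z$-axis, and all function names and sample values are hypothetical.

```python
import math

def quadric(A, b, c):
    """Return q with q(p) = p^T A p + b . p + c for p in R^3 (A symmetric)."""
    def q(p):
        quad = sum(A[i][j] * p[i] * p[j] for i in range(3) for j in range(3))
        lin = sum(b[i] * p[i] for i in range(3))
        return quad + lin + c
    return q

def rotation_z(alpha):
    """Matrix of the rotation about the z-axis by the angle alpha."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]

def apply(R, t, p):
    """Rigid transformation p -> R p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def moved_quadric(A, b, c, R, t):
    """Coefficients (A2, b2, c2) of the polynomial y -> q(R^T (y - t))."""
    A2 = [[sum(R[i][k] * A[k][l] * R[j][l] for k in range(3) for l in range(3))
           for j in range(3)] for i in range(3)]          # A2 = R A R^T
    Rb = [sum(R[i][j] * b[j] for j in range(3)) for i in range(3)]
    A2t = [sum(A2[i][j] * t[j] for j in range(3)) for i in range(3)]
    b2 = [Rb[i] - 2.0 * A2t[i] for i in range(3)]
    c2 = (sum(t[i] * A2t[i] for i in range(3))
          - sum(Rb[i] * t[i] for i in range(3)) + c)
    return A2, b2, c2

# Sample data (assumed): the polynomial x^2 + y^2/4 - 1 and one of its zeros.
A = [[1.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]
c = -1.0
R, t = rotation_z(0.7), [1.0, -2.0, 3.0]
p = [math.cos(0.3), 2.0 * math.sin(0.3), 5.0]

q2 = quadric(*moved_quadric(A, b, c, R, t))
print(abs(q2(apply(R, t, p))) < 1e-9)  # the moved point is a zero of q2
```

The moved zero set is again the zero set of a second order polynomial, here with coefficients $A_2 = R A R^{T}$, $b_2 = R b - 2 A_2 t$ and $c_2 = t^{T} A_2 t - (R b) \cdot t + c$; this is why it suffices to discuss normal forms.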
Example 3.5.35. For every plane curve $C$, the set $C \times \mathbb{R}$ is called a cylinder where we identify the pair $((x, y), z)$ and the triple $(x, y, z)$ for all $x, y, z \in \mathbb{R}$. Examples are:

(i) The parabolic cylinder
\[ Z_P := \{ (x, y, z) \in \mathbb{R}^3 : x^2 - 4py = 0 \} \]
where $p > 0$. The intersection of $Z_P$ with every plane parallel to the $xy$-plane is a parabola.

(ii) The elliptic cylinder
\[ Z_E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\} \]
where $a, b > 0$. The intersection of $Z_E$ with every plane parallel to the $xy$-plane is an ellipse.

(iii) The hyperbolic cylinder
\[ Z_H := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \right\} \]
where $a, b > 0$. The intersection of $Z_H$ with every plane parallel to the $xy$-plane is a hyperbola.

Fig. 136: Example of a parabolic cylinder. Compare Example 3.5.35.

Fig. 137: Example of an elliptic cylinder. Compare Example 3.5.35.

Fig. 138: Example of a hyperbolic cylinder. Compare Example 3.5.35.

Example 3.5.36. The surface
\[ E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} , \]
where $a, b, c > 0$, is an ellipsoid with half-axes $a$, $b$ and $c$. The intersection of $E$ with a plane parallel to a coordinate plane is an ellipse, a point, or the empty set. $E$ may be viewed as a 'deformed' sphere, because it is the image of $S^2(0)$ under the scale transformation $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
\[ f(x, y, z) := (ax, by, cz) \]
for all $(x, y, z) \in \mathbb{R}^3$.

Fig. 139: Example of an ellipsoid. Compare Example 3.5.36.

Example 3.5.37. The surface
\[ EP := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z}{c} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \right\} , \]
where $a, b, c > 0$, is called an elliptic paraboloid. The intersection of $EP$ with a plane parallel to the $xy$-plane is an ellipse, a point or the empty set. The intersection of $EP$ with a plane containing the $z$-axis is a parabola.

Fig. 140: Example of an elliptic paraboloid. Compare Example 3.5.37.

Fig. 141: Example of a hyperbolic paraboloid. Compare Example 3.5.38.

Fig. 142: Example of an elliptic cone. Compare Example 3.5.39.

Example 3.5.38.
The surface
\[ HP := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z}{c} = \frac{x^2}{a^2} - \frac{y^2}{b^2} \right\} , \]
where $a, b, c > 0$, is called a hyperbolic paraboloid. The intersection of $HP$ with a plane parallel to the $xy$-plane is a hyperbola. The intersection of $HP$ with a plane containing the $z$-axis is a parabola. The surface looks similar to a 'saddle' and is therefore often called a 'saddle surface'.

Example 3.5.39. The surface
\[ EC := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} \right\} , \]
where $a, b, c > 0$, is called an elliptic cone. The intersection of $EC$ with a plane parallel to the $xy$-plane is an ellipse, with midpoint given by its intersection with the $z$-axis, or a point called its vertex. The intersection of $EC$ with a plane containing the $z$-axis consists of two straight lines crossing in the vertex.

Fig. 143: Example of a hyperboloid of one sheet. Compare Example 3.5.40.

Example 3.5.40. The surface
\[ H_1 := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 \right\} , \]
where $a, b, c > 0$, is called a hyperboloid of one sheet. The intersection of $H_1$ with a plane parallel to the $xy$-plane is an ellipse with midpoint given by its intersection with the $z$-axis. The intersection with a plane containing the $z$-axis consists of two hyperbolas.

Example 3.5.41. The surface
\[ H_2 := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2} + 1 \right\} , \]
where $a, b, c > 0$, is called a hyperboloid of two sheets. The intersection of $H_2$ with a plane parallel to the $xy$-plane is an ellipse with midpoint given by its intersection with the $z$-axis, a point or the empty set. The intersection with a plane containing the $z$-axis consists of two hyperbolas.

Fig. 144: Example of a hyperboloid of two sheets. Compare Example 3.5.41.

Problems

1) Describe and sketch the surface.
a) $\{(x, y, z) \in \mathbb{R}^3 : 2y^2\ 3z^2\ 9\}$,
b) $\{(x, y, z) \in \mathbb{R}^3 : z\ 6x^2\ 1\}$,
c) $\{(x, y, z) \in \mathbb{R}^3 : x\ 2y^2\ 3\}$,
d) $\{(x, y, z) \in \mathbb{R}^3 : xz\ 12\}$,
e) $\{(x, y, z) \in \mathbb{R}^3 : 3x^2\ 5y^2\ 7\}$.

2) Find the intersections of the surface with the coordinate planes. In this way, identify the surface and sketch it.
a) $\{(x, y, z) \in \mathbb{R}^3 : 2x^2\ 3y^2\ z^2\ 4\}$,
b) $\{(x, y, z) \in \mathbb{R}^3 : 9x^2\ 3y^2\ 5z^2\ 12\}$,
c) $\{(x, y, z) \in \mathbb{R}^3 : x^2\ 2y^2\ 4z^2\ 3\}$,
d) $\{(x, y, z) \in \mathbb{R}^3 : y^2\ 6x^2\ 4z^2\}$,
e) $\{(x, y, z) \in \mathbb{R}^3 : 4x^2\ 3z^2\ 2y\}$,
f) $\{(x, y, z) \in \mathbb{R}^3 : 4z^2\ 3x^2\ 2y\ 0\}$,
g) $\{(x, y, z) \in \mathbb{R}^3 : x^2\ 4y^2\ 3z^2\ 1\ 0\}$.

3) Identify the surfaces.
a) $\{(x, y, z) \in \mathbb{R}^3 : z^2\ 4\}$,
b) $\{(x, y, z) \in \mathbb{R}^3 : 2y^2\ 3z^2\ 0\}$,
c) $\{(x, y, z) \in \mathbb{R}^3 : x^2\ 4xy\ 4y^2\ 2\}$,
d) $\{(x, y, z) \in \mathbb{R}^3 : (x\ 2y)^2\ 2 (x\ z)^2\}$.

4) Find a function whose zero set consists of all points that are equidistant from $(0, 0, 1)$ and the coordinate plane $\{(x, y, z) \in \mathbb{R}^3 : z\ 1\}$. Identify the surface.

5) Find a function whose zero set consists of all points whose distance from the $z$-axis is 3 times the distance from the $xy$-plane. Identify the surface.

6) Show that through every point of each of the following surfaces go two straight lines that are contained in that surface.
a) The elliptic cone $EC$,
b) the hyperbolic paraboloid $HP$,
c) the hyperboloid of one sheet $H_1$.

7) (Conic sections) Let $\alpha \in [0, \pi/2]$ and $C$ be the circular cone defined by
\[ C := \{ (x, y, z) \in \mathbb{R}^3 : z^2 = x^2 + y^2 \} . \]
a) Find a function whose zero set coincides with the cone $C_\alpha$ resulting from $C$ by clockwise rotation in the $yz$-plane around $(0, 0, 1)$ and about the angle $\alpha$. The symmetry axis of that cone is given by $\{ (0,\ t \sin(\alpha),\ 1\ t \cos(\alpha)) : t \in \mathbb{R} \}$ and its vertex by $(0,\ \sin(\alpha),\ 1\ \cos(\alpha))$.
b) Find the intersection of $C_\alpha$ with the plane parallel to the $xy$-plane through $(0, 0, 1)$. Classify those curves.

3.5.6 Cylindrical and Spherical Coordinates

There are also other options than Cartesian coordinate systems to coordinate the points in space. Most important in this respect are two derivatives of polar coordinates in the plane, cylindrical and spherical coordinates.

In physics applications, cylindrical coordinates are generally applied if the system is, in a certain sense, symmetric under rotations around an axis.
In such a case, cylindrical coordinates are defined with respect to a Cartesian coordinate system whose $z$-axis coincides with that symmetry axis. Subsequently, see Fig 145, the cylindrical coordinate system uses the $z$-coordinate of a point and the polar coordinates $r$ and $\varphi$ of the end point of the orthogonal projection of its position vector onto the $xy$-plane as coordinates. Since in this way, every point on the $z$-axis is projected onto the origin, the points on the $z$-axis are not covered by this coordinatization. Usually in such situations, cylindrical coordinates considerably simplify the analysis of the system compared to the use of a Cartesian coordinate system.

Example 3.5.42. (Cylindrical coordinates) Define
\[ g : (0, \infty) \times (-\pi, \pi] \times \mathbb{R} \to \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \]
by
\[ g(r, \varphi, z) := (r \cos \varphi, r \sin \varphi, z) \]
for all $(r, \varphi, z) \in (0, \infty) \times (-\pi, \pi] \times \mathbb{R}$. Then $g$ is bijective with inverse
\[ g^{-1} : \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \to (0, \infty) \times (-\pi, \pi] \times \mathbb{R} \]
given by
\[ g^{-1}(x, y, z) = \begin{cases} \left( \sqrt{x^2 + y^2} , \ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) , \ z \right) & \text{if } y \geq 0 \\ \left( \sqrt{x^2 + y^2} , \ -\arccos\!\left( x / \sqrt{x^2 + y^2} \right) , \ z \right) & \text{if } y < 0 \end{cases} \]
for all $(x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$.

Fig. 145: Cylindrical coordinates $r$, $\varphi$, $z$ of a point $p$ in space. $q$ is the orthogonal projection of $p$ onto the $xy$-plane, $r$ is the Euclidean distance of $O$ and $q$, $\varphi$ the angle of the line from $O$ to $q$ with the $x$-axis. Compare Example 3.5.42.

Example 3.5.43. Find a parametrization of the cylinder
\[ Z_1 := \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = 1 \} , \]
i.e., a bijective map whose range coincides with $Z_1$. Solution: Employing cylindrical coordinates, $Z_1$ is given by
\[ Z_1 = \{ (\cos \varphi, \sin \varphi, z) : \varphi \in (-\pi, \pi], z \in \mathbb{R} \} , \]
and a parametrization of $Z_1$ is given by $h : (-\pi, \pi] \times \mathbb{R} \to \mathbb{R}^3$ defined by
\[ h(\varphi, z) := (\cos \varphi, \sin \varphi, z) \]
for all $\varphi \in (-\pi, \pi]$, $z \in \mathbb{R}$.

In physics applications, spherical coordinates are generally applied if the system is, in a certain sense, symmetric under rotations around a point. In such a case, spherical coordinates are defined with respect to a Cartesian coordinate system whose origin $O$ coincides with that point.
Subsequently, see Fig 146, spherical coordinates use the distance $r$ of a point $p$ from $O$, the angle $\theta$ of the line segment $Op$ from the positive $z$-axis and the polar angle $\varphi$ of the end point of the orthogonal projection onto the $xy$-plane of the position vector corresponding to $p$. Since in this way, every point on the $z$-axis is projected onto the origin, also here the points on the $z$-axis are not covered by the coordinatization. Usually in such situations, spherical coordinates considerably simplify the analysis of the system compared to the use of a Cartesian coordinate system.

Fig. 146: Spherical coordinates $r$, $\theta$, $\varphi$ of a point $p$ in space. $r$ is the Euclidean distance of $O$ and $p$, $\theta$ the angle between the line from $O$ to $p$ and the $z$-axis, $q$ the orthogonal projection of $p$ onto the $xy$-plane, $\varphi$ the angle of the line from $O$ to $q$ with the $x$-axis. Compare Example 3.5.44.

Example 3.5.44. (Spherical coordinates) Define
\[ g : (0, \infty) \times (0, \pi) \times (-\pi, \pi] \to \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \]
by
\[ g(r, \theta, \varphi) := (r \sin \theta \cos \varphi, r \sin \theta \sin \varphi, r \cos \theta) \]
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi) \times (-\pi, \pi]$. Then $g$ is bijective with inverse
\[ g^{-1} : \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R}) \to (0, \infty) \times (0, \pi) \times (-\pi, \pi] \]
given by
\[ g^{-1}(\mathbf{r}) = \begin{cases} \left( |\mathbf{r}| , \ \arccos(z / |\mathbf{r}|) , \ \arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \geq 0 \\ \left( |\mathbf{r}| , \ \arccos(z / |\mathbf{r}|) , \ -\arccos\!\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $\mathbf{r} = (x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$. In analogy with the situation on the globe, for $(x, y, z) \in \mathbb{R}^3 \setminus (\{0\} \times \{0\} \times \mathbb{R})$ the second and third component of $g^{-1}((x, y, z))$ can be called the co-latitude and longitude, respectively, of $(x, y, z)$. Note that for a point on the northern hemisphere $\pi/2$ minus its co-latitude gives its latitude, whereas for a point on the southern hemisphere the latitude is given by the difference of its co-latitude and $\pi/2$.

Example 3.5.45. Find a parametrization of
\[ E_0 := E \setminus \{ (0, 0, -c), (0, 0, c) \} , \]
where $E$ is the ellipsoid defined by
\[ E := \left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} \]
and $a, b, c > 0$, i.e., find a bijective map whose range coincides with $E_0$.
Solution: $E$ is the image of $S^2(0)$ under the scale transformation $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
\[ f(x, y, z) := (ax, by, cz) \]
for all $(x, y, z) \in \mathbb{R}^3$. Employing spherical coordinates,
\[ S^2(0) \setminus \{ (0, 0, -1), (0, 0, 1) \} = \{ (\sin \theta \cos \varphi, \sin \theta \sin \varphi, \cos \theta) \in \mathbb{R}^3 : \theta \in (0, \pi), \varphi \in (-\pi, \pi] \} . \]
Hence the punctured ellipsoid $E \setminus \{ (0, 0, -c), (0, 0, c) \}$ is given by
\[ \{ (a \sin \theta \cos \varphi, b \sin \theta \sin \varphi, c \cos \theta) \in \mathbb{R}^3 : \theta \in (0, \pi), \varphi \in (-\pi, \pi] \} , \]
and a parametrization of it is given by $h : (0, \pi) \times (-\pi, \pi] \to \mathbb{R}^3$ defined by
\[ h(\theta, \varphi) := (a \sin \theta \cos \varphi, b \sin \theta \sin \varphi, c \cos \theta) \]
for all $\theta \in (0, \pi)$, $\varphi \in (-\pi, \pi]$.

Problems

1) Describe the image of the set under $g$ from Example 3.5.42 on cylindrical coordinates.
a) $\{(r, \varphi, z) \in D(g) : r\ 3 \wedge -1 < z < 1\}$,
b) $\{(r, \varphi, z) \in D(g) : r > 2 \wedge 0 < z < 3\}$,
c) $\{(r, \varphi, z) \in D(g) : 1 < r < 2\}$,
d) $\{(r, \varphi, z) \in D(g) : -\pi/2 \leq \varphi \leq \pi/2 \wedge z\ 1\}$,
e) $\{(r, \varphi, z) \in D(g) : 3\pi/4 \leq \varphi \leq 5\pi/6 \wedge z \leq 0\}$,
f) $\{(r, \varphi, z) \in D(g) : -3\pi/4 \leq \varphi \leq \pi/4 \wedge z \geq 1\}$,
g) $\{(r, \varphi, z) \in D(g) : 0 < r < 1 \wedge -\pi/3 \leq \varphi \leq \pi/6\}$.

2) Find a function $f : U \to \mathbb{R}$ whose zero set coincides with $g(S)$ where $g$ is the transformation from Example 3.5.42 on cylindrical coordinates. In addition, sketch $g(S)$.
a) $S = \{(r, \varphi, z) \in D(g) : r\ 3\}$,
b) $S = \{(r, \varphi, z) \in D(g) : z\ 2r\}$,
c) $S = \{(r, \varphi, z) \in D(g) : z\ 3r \sin(\varphi)\ 12\}$,
d) $S = \{(r, \varphi, z) \in D(g) : z^2\ 4r^2\ 1\}$,
e) $S = \{(r, \varphi, z) \in D(g) : 6z\ 2r^2\ 3\ 0\}$,
f) $S = \{(r, \varphi, z) \in D(g) : z^2\ 3\ 5r^2\}$,
g) $S = \{(r, \varphi, z) \in D(g) : r\ 2 \cos(\varphi)\}$.

3) Find a function whose zero set coincides with $g^{-1}(S)$ where $g$ is the transformation from Example 3.5.42 on cylindrical coordinates.
a) $S = \{(x, y, z) \in \mathbb{R}^3 : 2x\ 6y\ z\ 1\}$,
b) $S = \{(x, y, z) \in \mathbb{R}^3 : x^2\ y^2\ 4\}$,
c) $S = \{(x, y, z) \in \mathbb{R}^3 : x^2\ y^2\ 3z^2\ 2\}$,
d) $S = \{(x, y, z) \in \mathbb{R}^3 : 2x^2\ 2y^2\ 9z\}$,
e) $S = \{(x, y, z) \in \mathbb{R}^3 : x^2\ y^2\ 2z^2\ 5\}$,
f) $S = \{(x, y, z) \in \mathbb{R}^3 : 4x^2\ 4y^2\ z^2\ 1\ 0\}$,
g) $S = \{(x, y, z) \in \mathbb{R}^3 : 8x^2\ y^2\ 3z^2\ 0\}$.

4) Describe the image of the set under $g$ from Example 3.5.44 on spherical coordinates.
a) b) c) d) e) f) tpr, ϕ, θq P Dpgq : r 3u , tpr, ϕ, θq P Dpgq : r ¡ 2u , tpr, ϕ, θq P Dpgq : 1 r 8u , tpr, ϕ, θq P Dpgq : 0 ¤ θ ¤ π{4u , tpr, ϕ, θq P Dpgq : π{6 ¤ θ ¤ π{4u , tpr, ϕ, θq P Dpgq : r P r1, 2s ^ θ P rπ{6, π{3s ^ ϕ P rπ{6, π{3su . 5) Find a function whose zero set coincides with g pS q with g from Example 3.5.44 on spherical coordinates. In addition, sketch g pS q. tpr, ϕ, zq P Dpgq : r 5 0u , S tpr, ϕ, z q P Dpg q : ϕ π {6u , S tpr, ϕ, z q P Dpg q : θ π {4u , S tpr, ϕ, z q P Dpg q : r 6 cospθqu , S tpr, ϕ, z q P Dpg q : r sinpθq 4u , S tpr, ϕ, z q P Dpg q : r cospθq 2u , S tpr, ϕ, z q P Dpg q : r2 cosp2ϕq sin2 pθq 1u a) S b) c) d) e) f) g) . 6) Find a function whose zero set coincides with g 1 pS q where g is the transformation from Example 3.5.44 on spherical coordinates. tpx, y, zq P R2 : x2 S tpx, y, z q P R2 : x2 S tpx, y, z q P R2 : px2 a) S b) c) 514 y2 z 2 2y 0u , 3u , y q 4z 2 px2 y 2 qu y 2 2 2 . 7) Show that g from Example 3.5.42 on cylindrical coordinates is bijective by verifying that g 1 pg pr, ϕ, z qq pr, ϕ, z q for all pr, ϕ, z q P Dpg q and g pg 1 px, y, z qq px, y, z q for all px, y, z q P R3 z pt0u t0u Rq. 8) Show that g from Example 3.5.44 on spherical coordinates is bijective by verifying that g 1 pg pr, ϕ, θqq pr, ϕ, θq for all pr, ϕ, θq P Dpg q and g pg 1 px, y, z qq px, y, z q for all px, y, z q P R3 ztp0, 0, 0qu. 3.5.7 Limits in Rn Within this section, we assume that n P N zt0, 1u. The concept of limits of sequences of real numbers has been fundamental for our development of Calculus I. The same will be true for Calculus III which develops in particular the calculus for functions defined on subsets of Rn . The following definition is analogous to the corresponding definition in Calculus I. The main difference is the replacement of the modulus function by the Euclidean distance in Rn . Definition 3.5.46. Let x1 , x2 , . . . be a sequence of elements of Rn and x P Rn . 
We define
$$\lim_{m \to \infty} x_m = x$$
if for every $\varepsilon > 0$ there is a corresponding $m_0$ such that for all $m \geq m_0$:
$$[\, e_n(x_m, x) = \,]\; |x_m - x| < \varepsilon .$$

Fig. 147: A sequence in space is convergent if and only if all its coordinate projections converge. Compare Theorem 3.5.47.

The following theorem states that a sequence of elements in $\mathbb{R}^n$ converges to some $x \in \mathbb{R}^n$ if and only if for every $i \in \{1, \dots, n\}$ the corresponding sequence of its $i$-th components converges in $\mathbb{R}$ to the $i$-th component of $x$. In this way, the question of convergence or non-convergence of a sequence in $\mathbb{R}^n$ is reduced to the question of convergence or non-convergence of sequences of real numbers.

Theorem 3.5.47. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}^n$ where $n \in \mathbb{N} \setminus \{0, 1\}$ and $x \in \mathbb{R}^n$. Then
$$\lim_{m \to \infty} x_m = x$$
if and only if
$$\lim_{m \to \infty} x_{mj} = x_j$$
for all $j = 1, \dots, n$.

Proof. First, we note that
$$\max_{j = 1, \dots, n} |y_j| \leq |y| \leq |y_1| + \dots + |y_n|$$
for all $y \in \mathbb{R}^n$. Hence if $\lim_{m \to \infty} x_m = x$, $\varepsilon > 0$ is given and $m_0$ is such that for all $m \geq m_0$
$$|x_m - x| < \varepsilon ,$$
then also for every $j \in \{1, \dots, n\}$ and every $m \geq m_0$
$$|x_{mj} - x_j| < \varepsilon$$
and hence also $\lim_{m \to \infty} x_{mj} = x_j$. On the other hand, if for every $j \in \{1, \dots, n\}$
$$\lim_{m \to \infty} x_{mj} = x_j ,$$
$\varepsilon > 0$ is given and for every $j \in \{1, \dots, n\}$ the corresponding $m_{0j}$ is such that for every $m \geq m_{0j}$
$$|x_{mj} - x_j| < \frac{\varepsilon}{n} ,$$
then it follows for every $m \geq m_{01} + m_{02} + \dots + m_{0n}$ that
$$|x_m - x| < \varepsilon ,$$
and hence that $\lim_{m \to \infty} x_m = x$.

Example 3.5.48. Calculate
$$\lim_{n \to \infty} \left( \frac{\sin(n)}{n},\; \frac{n^2}{n^2 + 1},\; \frac{2}{n} \right) .$$
Solution:
$$\lim_{n \to \infty} \left( \frac{\sin(n)}{n},\; \frac{n^2}{n^2 + 1},\; \frac{2}{n} \right) = \left( \lim_{n \to \infty} \frac{\sin(n)}{n},\; \lim_{n \to \infty} \frac{n^2}{n^2 + 1},\; \lim_{n \to \infty} \frac{2}{n} \right) = (0, 1, 0) .$$

As a corollary, we obtain from the limit laws for sequences of real numbers limit laws valid for sequences in $\mathbb{R}^n$.
In particular, part (i) states that a sequence in Rn can have at most one limit point, part (ii) states that the sequence consisting of the sums of the members of convergent sequences in Rn is convergent against the sum of their limits, and part (iii) states that the sequence consisting of scalar multiples of the members of a convergent sequence in Rn converges against that scalar multiple of its limit. Corollary 3.5.49. Let x1 , x2 , . . . ; y1 , y2 , . . . be sequences of elements of Rn ; x, x̄, y P Rn and a P R. (i) If then x̄ x. lim Ñ8 xm x and m lim Ñ8 xm x and m m (ii) If m then lim Ñ8pxm m lim Ñ8 xm x̄ , lim Ñ8 ym y, ym q x (iii) If lim Ñ8 xm m then lim Ñ8 a.xm m y. x, a.x . Problems 1) If existent, calculate the limit of the sequence pxn , yn , f pxn , yn qq, n P N . Otherwise, show non-existence of the limit. Where applicable, a P R. a) xn : 1 a 2xy 2 , yn : , f px, y q : 2 , px, y q P R2 z t0u n n x y2 518 1 a 2xy 2 , px, y q P R2 z t0u , yn : , f px, y q : 2 n n x y4 1 1 2xy 2 xn : 2 , yn : , f px, y q : 2 , px, y q P R2 z t0u n n x y4 1 a xy xn : , yn : , f px, y q : 2 , px, y q P R2 z t0u n n x y2 1 a x y xn : , yn : , f px, y q : 2 , px, y q P R2 z t0u n n x y2 a x2 1 , px, y q P R2 z t0u xn : , yn : , f px, y q : 2 n n x y2 1 a x xn : , yn : , f px, y q : 2 , px, y q P R2 z t0u n n x y2 1 a x2 y 2 , xn : , yn : , f px, y q : 3 n n x y3 px, yq P R2 z tpx, xq : x P Ru b) xn : c) d) e) f) g) h) 1 e1{n x2 y 2 , , yn : , f px, y q : 3 n n x y3 px, yq P R2 z tpx, xq : x P Ru i) xn : 1 a x3 , yn : , f px, y q : 2 n n x 2 2 px, yq P R z tpx, x q : x P Ru j) xn : k) y3 , y x3 1 e1{n , yn : 2 , f px, y q : 2 n n x 2 2 px, yq P R z tpx, x q : x P Ru xn : a x3 1 , yn : , f px, y q : n n x 2 px, yq P R z tpx, xq : x P Ru l) xn : y3 , y y3 , y 1 a x2 y 2 , yn : , f px, y q : 4 , n n x y4 px, yq P R2 z t0u m) xn : a x2 y 2 1 , yn : , f px, y q : 2 , n n x y2 px, yq P R2 z t0u . n) xn : 2) Prove Corollary 3.5.49. 
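As a numerical illustration of Theorem 3.5.47, the following minimal Python sketch evaluates the sequence from Example 3.5.48 and checks the chain of inequalities $\max_j |y_j| \leq |y| \leq |y_1| + \dots + |y_n|$ used in the proof. The precise form of the sequence, in particular the second component $n^2/(n^2+1)$, is an assumption read from the example, and the helper names `x`, `euclidean_distance` and `bounds` are hypothetical.

```python
import math

def x(n):
    """n-th member of the sequence in R^3 (assumed form of Example 3.5.48)."""
    return (math.sin(n) / n, n * n / (n * n + 1), 2.0 / n)

def euclidean_distance(p, q):
    """Euclidean distance of two points given as tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Claimed limit, obtained component by component.
limit = (0.0, 1.0, 0.0)

def bounds(n):
    """Return (max_j |y_j|, |y|, sum_j |y_j|) for y = x(n) - limit."""
    y = [a - b for a, b in zip(x(n), limit)]
    return (max(abs(c) for c in y),
            euclidean_distance(x(n), limit),
            sum(abs(c) for c in y))

for n in (10, 100, 1000, 10000):
    smallest, middle, largest = bounds(n)
    print(n, smallest, middle, largest)
```

The middle value is the Euclidean distance to the limit; it is squeezed between the largest component error and the sum of all component errors, which is why convergence of all components and convergence in $\mathbb{R}^n$ are equivalent.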
3.5.8 Paths in $\mathbb{R}^n$

As simple examples of functions assuming values in $\mathbb{R}^n$, $n \in \mathbb{N}^* \setminus \{1\}$, the following section considers paths. These have as their domains intervals of $\mathbb{R}$. Paths occur frequently in applications, e.g., in the description of the motion of a point particle in Newtonian mechanics. In such applications, the domain of a path is a time interval and its range is a curve in space. To every time $t$ from the domain, the path associates the corresponding position of the particle in space. In particular, such a path needs to satisfy Newton's differential equations of motion, given later on, in order to describe a possible motion of a point particle in nature.

The following defines the continuity and differentiability of a path in terms of the corresponding properties of its component functions. This should not be surprising in view of Theorem 3.5.47. In Calculus III, we give more general definitions for the continuity and differentiability of vector-valued functions of several variables. In the special case of paths, those definitions are equivalent to the definitions below.

In the definition of the derivative of paths, we meet tangent vectors for the first time. These have not only magnitude and direction, but also a point of attack. If $u$ is a path and $s$ is an element of the domain of $u$, the value of the derivative of $u$ in $s$, $u'(s)$, if existent, is a (tangent-) vector that has as point of attack (or 'is attached to') the point $u(s)$. This point of attack is not indicated in the notation, which is often confusing for the beginner but is standard practice in calculus / analysis courses and in applications. The present text follows this convention as well. This does not lead to any serious complications for the problems considered in this text, but the reader should keep this fact in mind when interpreting the results. Hence, as is usual in other calculus texts, tangent vectors will be treated as position vectors.
As a consequence, the derivative of a path will be a path, too. A proper definition of tangent vectors is given in most courses in differential geometry.

Definition 3.5.50. Let $n \in \mathbb{N}^*$.

(i) A path is a map $u : I \to \mathbb{R}^n$ from some non-empty subinterval $I$ of $\mathbb{R}$ into $\mathbb{R}^n$. The range of a path is frequently called a curve.

(ii) A path $u : I \to \mathbb{R}^n$ is called continuous if all corresponding component functions $u_i : I \to \mathbb{R}$ that associate to every $t \in I$ the $i$-th component of $u(t)$, $i \in \{1, \dots, n\}$, are continuous.

(iii) A path $u : I \to \mathbb{R}^n$ is said to be differentiable in some inner point $t_0 \in I$, i.e., some point $t_0 \in I$ for which there is some $\varepsilon > 0$ such that $(t_0 - \varepsilon, t_0 + \varepsilon) \subset I$, if all corresponding component functions $u_i : I \to \mathbb{R}$, $i \in \{1, \dots, n\}$, are differentiable in $t_0$. In this case, we define its derivative in $t_0$ by
$$u'(t_0) := (u_1'(t_0), \dots, u_n'(t_0)) .$$
The last will also be called the tangent vector to $u$ in $u(t_0)$.

Example 3.5.51. Calculate the derivative of the path $u : \mathbb{R} \to \mathbb{R}^3$ defined by
$$u(t) := (\cos t, \sin t, t)$$
for all $t \in \mathbb{R}$. Solution: $u$ is differentiable since all its component functions are differentiable in the sense of Calculus I. Hence
$$u'(t) = (-\sin t, \cos t, 1)$$
for all $t \in \mathbb{R}$. See Fig. 148.

Fig. 148: Tangent vector $v$ at a point of a helix. Compare Example 3.5.51.

In applications, one frequently needs to calculate derivatives of paths that are composed of other paths. Rules for the differentiation of such frequently occurring 'compositions' are given in a subsequent theorem and are simple consequences of the following theorem.

Theorem 3.5.52. Let $l, m, n \in \mathbb{N}^*$ and $\lambda : \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}^n$ be a bilinear map, i.e., such that
$$\lambda(\alpha.x + \beta.y, z) = \alpha.\lambda(x, z) + \beta.\lambda(y, z) , \quad \lambda(x, \alpha.z + \beta.w) = \alpha.\lambda(x, z) + \beta.\lambda(x, w)$$
for all $x, y \in \mathbb{R}^l$, $z, w \in \mathbb{R}^m$ and $\alpha, \beta \in \mathbb{R}$. Further, let $I$ be a non-void open interval of $\mathbb{R}$ and $u : I \to \mathbb{R}^l$, $v : I \to \mathbb{R}^m$ be differentiable paths.
Then the path $\lambda(u, v) : I \to \mathbb{R}^n$ defined by
$$[\lambda(u, v)](t) := \lambda(u(t), v(t))$$
for all $t \in I$ is differentiable, and
$$[\lambda(u, v)]'(t) = \lambda(u'(t), v(t)) + \lambda(u(t), v'(t))$$
for all $t \in I$.

Proof. For this, let $e^l_1, \dots, e^l_l$ and $e^m_1, \dots, e^m_m$ be the canonical bases of $\mathbb{R}^l$ and $\mathbb{R}^m$, respectively. It follows by the bilinearity of $\lambda$ that
$$\lambda(x, z) = \sum_{j=1}^{l} \sum_{k=1}^{m} (x_j z_k) . \lambda(e^l_j, e^m_k)$$
for all $x \in \mathbb{R}^l$, $z \in \mathbb{R}^m$. Hence
$$[\lambda(u, v)]_i = \sum_{j=1}^{l} \sum_{k=1}^{m} [\lambda(e^l_j, e^m_k)]_i \, u_j v_k$$
is differentiable by Theorem 2.4.8 with derivative
$$[\lambda(u, v)]_i'(t) = \sum_{j=1}^{l} \sum_{k=1}^{m} [\lambda(e^l_j, e^m_k)]_i \, \big( u_j'(t) v_k(t) + u_j(t) v_k'(t) \big) = [\lambda(u'(t), v(t)) + \lambda(u(t), v'(t))]_i$$
for all $t \in I$ and $i \in \{1, \dots, n\}$.

Theorem 3.5.53. Let $n \in \mathbb{N}^*$, $I, J$ be non-void open intervals of $\mathbb{R}$, $u, v : I \to \mathbb{R}^n$ differentiable paths, $f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$ be differentiable. Then

(i) $u + v : I \to \mathbb{R}^n$, defined by $(u + v)(t) := u(t) + v(t)$ for every $t \in I$, is differentiable and
$$(u + v)'(t) = u'(t) + v'(t)$$
for all $t \in I$.

(ii) $f.u : I \to \mathbb{R}^n$, defined by $(f.u)(t) := f(t).u(t)$ for every $t \in I$, is differentiable and
$$(f.u)'(t) = f'(t).u(t) + f(t).u'(t)$$
for all $t \in I$.

(iii) $u \cdot v : I \to \mathbb{R}$, defined by $(u \cdot v)(t) := u(t) \cdot v(t)$ for every $t \in I$, is differentiable and
$$(u \cdot v)'(t) = u'(t) \cdot v(t) + u(t) \cdot v'(t)$$
for all $t \in I$.

(iv) if $n = 3$, then $u \times v : I \to \mathbb{R}^3$, defined by $(u \times v)(t) := u(t) \times v(t)$ for every $t \in I$, is differentiable and
$$(u \times v)'(t) = u'(t) \times v(t) + u(t) \times v'(t)$$
for all $t \in I$.

(v) if $\mathrm{Ran}\, g \subset I$, then $u \circ g : J \to \mathbb{R}^n$ is differentiable and
$$(u \circ g)'(t) = g'(t).u'(g(t))$$
for all $t \in J$.

Proof. '(i)-(iv)' are consequences of Theorem 3.5.52. '(v)': It follows by Theorem 2.4.10 that $(u \circ g)_i = u_i \circ g$ is differentiable with derivative
$$[(u \circ g)_i]'(t) = u_i'(g(t)) \, g'(t) = [g'(t).u'(g(t))]_i$$
for all $t \in J$, $i \in \{1, \dots, n\}$.

Example 3.5.54. Let $r$ be a twice differentiable path (the trajectory of a point particle parametrized by time) from some non-void open interval $I$ of $\mathbb{R}$ into $\mathbb{R}^3$ and satisfying
$$m.r''(t) = 0$$
for all $t \in I$ (Newton's equation of motion without external forces) where $m > 0$ (the mass of the particle).
Then
$$\left( \frac{m}{2} \, v^2 \right)'(t) = m \, r'(t) \cdot r''(t) = 0$$
for all $t \in I$ where $v := r'$ (the velocity field of the particle) and $v^2 := v \cdot v$. Hence it follows by Theorem 2.5.7 that the function $\frac{m}{2} v^2$ (the kinetic energy of the particle) is constant ('is a constant of motion').

Example 3.5.55. (Kepler problem, I) Let $r$ be a twice differentiable path (the trajectory of a point particle parametrized by time) from some non-void open interval $I$ of $\mathbb{R}$ into $\mathbb{R}^3 \setminus \{0\}$ satisfying
$$m . r''(t) = - \frac{\gamma m M}{|r(t)|^3} . r(t) \qquad (3.5.28)$$
for all $t \in I$ (Newton's equation of motion for a point particle under the influence of the gravitational field of a point mass located at the origin) where $m, M, \gamma > 0$ (the mass of the particle, the mass of the gravitational source, the gravitational constant). Show that the total energy $E : I \to \mathbb{R}$ of the system, the angular momentum $L : I \to \mathbb{R}^3$ and the Lenz vector $A : I \to \mathbb{R}^3$ defined by
$$E(t) := \frac{m}{2} \, r'(t) \cdot r'(t) - \frac{\gamma m M}{|r(t)|} ,$$
$$L(t) := r(t) \times [m . r'(t)] = m . r(t) \times r'(t) ,$$
$$A(t) := m . \left[ r'(t) \times L(t) - \frac{\gamma m M}{|r(t)|} . r(t) \right]$$
for every $t \in I$ are constant.

Solution: It follows by Theorem 3.5.53, (3.5.28) and Theorem 3.5.16 (vi) that
$$E'(t) = m \, r'(t) \cdot r''(t) + \frac{\gamma m M}{|r(t)|^3} \, r(t) \cdot r'(t) = 0 ,$$
$$L'(t) = m . r'(t) \times r'(t) + m . r(t) \times r''(t) = 0$$
for all $t \in I$ and
$$a'(t) = r''(t) \times L(t) + \frac{\gamma m M}{|r(t)|^3} \, [r(t) \cdot r'(t)] . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t)$$
$$= r''(t) \times \big( r(t) \times [m . r'(t)] \big) - m . [r'(t) \cdot r''(t)] . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t)$$
$$= m . [r'(t) \cdot r''(t)] . r(t) - m . [r(t) \cdot r''(t)] . r'(t) - m . [r'(t) \cdot r''(t)] . r(t) - \frac{\gamma m M}{|r(t)|} . r'(t) = 0$$
for all $t \in I$ where $a : I \to \mathbb{R}^3$ is defined by $a(t) := m^{-1} . A(t)$ for all $t \in I$. Hence it follows by Theorem 2.5.7 that $E$, $L$ and $A$ are constant.

In the following, we derive necessary consequences of these conservation laws. In this, we denote by $E$, $L$ and $A$ the corresponding constants and $L := |L|$, $A := |A|$. In particular, we assume that $A \neq 0$, $L \neq 0$ and denote for $t \in I$ by $\theta(t) \in (-\pi, \pi]$ the 'polar' angle between $A$ and $r(t)$.
Then it follows by Theorem 3.5.16 (v) that
$$A \, |r(t)| \cos(\theta(t)) = A \cdot r(t) = m \, r(t) \cdot \left[ r'(t) \times L - \frac{\gamma m M}{|r(t)|} . r(t) \right] = m \, r(t) \cdot [\, r'(t) \times L \,] - \gamma m^2 M \, |r(t)|$$
$$= m \, L \cdot [\, r(t) \times r'(t) \,] - \gamma m^2 M \, |r(t)| = L^2 - \gamma m^2 M \, |r(t)|$$
and hence that
$$|r(t)| \, [\, 1 + \varepsilon \cos(\theta(t)) \,] = p$$
for every $t \in I$ where
$$\varepsilon := \frac{A}{\gamma m^2 M} , \quad p := \frac{L^2}{\gamma m^2 M} .$$
In addition, it follows by Theorem 3.5.16 (v), (vii) for $t \in I$ that
$$A^2 = m^2 \left| r'(t) \times L - \frac{\gamma m M}{|r(t)|} . r(t) \right|^2 = m^2 \left[ (r'(t) \times L) \cdot (r'(t) \times L) - \frac{2 \gamma m M}{|r(t)|} \, r(t) \cdot (r'(t) \times L) + \gamma^2 m^2 M^2 \right]$$
$$= m^2 \left[ L^2 |r'(t)|^2 - [\, L \cdot r'(t) \,]^2 - \frac{2 \gamma m M}{|r(t)|} \, L \cdot (r(t) \times r'(t)) + \gamma^2 m^2 M^2 \right]$$
$$= m^2 \left[ L^2 \left( \frac{2E}{m} + \frac{2 \gamma M}{|r(t)|} \right) - \frac{2 \gamma M}{|r(t)|} \, L^2 + \gamma^2 m^2 M^2 \right] = m^2 \left[ \gamma^2 m^2 M^2 + \frac{2 E L^2}{m} \right]$$
and hence that
$$\varepsilon = \sqrt{1 + \frac{2 E L^2}{\gamma^2 m^3 M^2}} .$$
As a consequence,
$$\varepsilon \begin{cases} < 1 & \text{if } E < 0 \\ = 1 & \text{if } E = 0 \\ > 1 & \text{if } E > 0 . \end{cases}$$
By its definition, $L$ is orthogonal to $r'(t)$ for every $t \in I$. Hence it follows for every $t_0, t_1 \in I$ satisfying $t_0 < t_1$ that
$$L \cdot [\, r(t_1) - r(t_0) \,] = \int_{t_0}^{t_1} L \cdot r'(t) \, dt = 0 .$$
Hence the motion of the particle proceeds in a plane $S$ with normal vector $n_3 := L^{-1} . L$.

In the following, we make the natural assumption that $\theta(t)$ assumes all values in $(-\pi, \pi]$. Then $S$ contains the origin. This can be seen as follows. By assumption, there are $t_0, t_1 \in I$ such that
$$\theta(t_0) = \frac{\pi}{2} , \quad \theta(t_1) = - \frac{\pi}{2}$$
and hence
$$L \cdot r(t_0) = L \cdot r(t_1) = 0 , \quad A \cdot r(t_0) = A \cdot r(t_1) = 0 , \quad |r(t_0)| = |r(t_1)| .$$
Therefore, we conclude, by noting that $L \cdot A = 0$, that
$$r(t_0) = - r(t_1)$$
and hence that
$$\frac{1}{2} \, (r(t_1) + r(t_0)) = 0 \in S .$$
Therefore, it follows from Examples 3.5.32, 3.5.33, 3.5.34 that $\mathrm{Ran}(r)$ is a conic in $S$ with one focus in the origin. In particular, the conic is an ellipse, parabola or hyperbola if $E < 0$, $E = 0$ or $E > 0$, respectively. Note that the previous was derived from the assumption of the existence of a solution of (3.5.28) with the prescribed properties. Indeed, that existence can be proved, and this is done, for instance, in courses in theoretical mechanics.

Example 3.5.56.
(Kepler problem, II, Levi-Civita's transformation) We continue the discussion from the previous example and present Tullio Levi-Civita's ingenious method to transform (3.5.28) into a form whose solutions are obvious. His key idea is the ansatz (3.5.31) which transforms ellipses that have a focus in the origin into ellipses with centers in the origin.

In the first step, we introduce a new time variable. For this, let $t_0 \in I$ and $I_0 := (t_0, \infty) \cap I$. We define a time function $\tau : I_0 \to \mathbb{R}$ by
$$\tau(t) := \int_{t_0}^{t} \frac{dt'}{|r(t')|}$$
for all $t \in I_0$. Then $\tau$ is strictly increasing, and hence according to Theorem 2.5.18, $\tau$, regarded as a map onto its range, which is an open interval $J_0$, has a differentiable inverse which will be denoted by the symbol $\tau^{-1}$ in the following. In particular, we define $\xi := r \circ \tau^{-1}$. Then
$$(\tau^{-1})' = \frac{1}{\tau' \circ \tau^{-1}} = |\xi| , \quad \xi' = |\xi| . (r' \circ \tau^{-1}) , \quad \xi'' = |\xi|' . (r' \circ \tau^{-1}) + |\xi|^2 . (r'' \circ \tau^{-1})$$
and hence
$$r' \circ \tau^{-1} = \frac{1}{|\xi|} . \xi' , \quad r'' \circ \tau^{-1} = \frac{1}{|\xi|^2} . \xi'' - \frac{|\xi|'}{|\xi|^3} . \xi' .$$
Hence it follows from (3.5.28) that
$$|\xi| . \xi'' - |\xi|' . \xi' + \gamma M . \xi = 0 \qquad (3.5.29)$$
and
$$E = \left[ \frac{m}{2} \, |r'|^2 - \frac{\gamma m M}{|r|} \right] \circ \tau^{-1} = \frac{m}{2} \, \frac{|\xi'|^2}{|\xi|^2} - \frac{\gamma m M}{|\xi|} . \qquad (3.5.30)$$
For the next step, we assume that $r$ and hence also $\xi$ assume values in the $x,y$-plane only. Note that the discussion in the previous example indicates that it is reasonable to search for such solutions. For the solutions of (3.5.29), we make the ansatz
$$\xi_1 = u^2 - v^2 , \quad \xi_2 = 2uv \qquad (3.5.31)$$
where $u : J_0 \to \mathbb{R}$, $v : J_0 \to \mathbb{R}$ are twice differentiable functions that are to be found. Then
$$|\xi| = u^2 + v^2 , \quad |\xi|' = 2uu' + 2vv' , \quad \xi_1' = 2uu' - 2vv' , \quad \xi_2' = 2u'v + 2uv' ,$$
$$|\xi'|^2 = (2uu' - 2vv')^2 + (2u'v + 2uv')^2 = 4 (u^2 + v^2)(u'^2 + v'^2) ,$$
$$\xi_1'' = 2uu'' - 2vv'' + 2u'^2 - 2v'^2 , \quad \xi_2'' = 2u''v + 4u'v' + 2uv'' .$$
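Substitution of the ansatz into (3.5.29) leads to
$$(u^2 + v^2)(2uu'' - 2vv'' + 2u'^2 - 2v'^2) - (2uu' + 2vv')(2uu' - 2vv') + \gamma M (u^2 - v^2)$$
$$= 2(u^2 + v^2)(uu'' - vv'') + 2(u^2 + v^2)(u'^2 - v'^2) - 4(u^2 u'^2 - v^2 v'^2) + \gamma M (u^2 - v^2)$$
$$= 2(u^2 + v^2)(uu'' - vv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] (u^2 - v^2) = 0 ,$$
$$2(u^2 + v^2)(u''v + 2u'v' + uv'') - 4(uu' + vv')(u'v + uv') + 2 \gamma M uv$$
$$= 2(u^2 + v^2)(u''v + uv'') + 4 \left[ (u^2 + v^2) u'v' - (uu' + vv')(u'v + uv') \right] + 2 \gamma M uv$$
$$= 2(u^2 + v^2)(u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2uv = 0$$
and hence, by adding $u$ times the first equation to $v$ times the second equation, to
$$2(u^2 + v^2) u (uu'' - vv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] u (u^2 - v^2) + 2(u^2 + v^2) v (u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2uv^2$$
$$= (u^2 + v^2) \left\{ 2(u^2 + v^2) u'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] u \right\} = 0$$
and, by subtracting $v$ times the first equation from $u$ times the second equation, to
$$2(u^2 + v^2) u (u''v + uv'') + \left[ \gamma M - 2(u'^2 + v'^2) \right] 2u^2 v - 2(u^2 + v^2) v (uu'' - vv'') - \left[ \gamma M - 2(u'^2 + v'^2) \right] v (u^2 - v^2)$$
$$= (u^2 + v^2) \left\{ 2(u^2 + v^2) v'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] v \right\} = 0$$
and hence to
$$2(u^2 + v^2) u'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] u = 0 , \quad 2(u^2 + v^2) v'' + \left[ \gamma M - 2(u'^2 + v'^2) \right] v = 0 .$$
Substitution of the ansatz into (3.5.30) leads to
$$E = 2m \, \frac{u'^2 + v'^2}{u^2 + v^2} - \frac{\gamma m M}{u^2 + v^2}$$
which leads to the system of equations
$$u'' - \frac{E}{2m} \, u = 0 , \quad v'' - \frac{E}{2m} \, v = 0 .$$
The solutions of the last equations are given by Theorem 2.5.17.

In the following we define the arc length of a curve as the supremum of the lengths of inscribed polygons. The length of any such polygon should not exceed the length of the path, since intuitively we expect straight lines to be the shortest connection between two points. This suggests the following definition.

Definition 3.5.57. Let $n \in \mathbb{N}^*$ and $u : I \to \mathbb{R}^n$ be a path where $I$ is some non-empty closed subinterval of $\mathbb{R}$. We say that $u$ is rectifiable, non-rectifiable if the set
$$\left\{ \sum_{\mu = 0}^{\nu - 1} |u(t_\mu) - u(t_{\mu + 1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\}$$
is bounded or unbounded, respectively. In case $u$ is rectifiable, we define its length $L(u)$ by
$$L(u) := \sup \left\{ \sum_{\mu = 0}^{\nu - 1} |u(t_\mu) - u(t_{\mu + 1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\} .$$

Example 3.5.58. (A non-rectifiable continuous path) Define $u : [0, 1] \to \mathbb{R}^2$ by
$$u(t) := \left( t, \; (1 - t) \cos\!\left( \frac{\pi}{2(1 - t)} \right) \right)$$
for all $t \in [0, 1)$ and $u(1) := (1, 0)$. Then $u$ is a continuous path. For every $n \in \mathbb{N}^*$, we define a partition $(t_0, \dots$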
$, t_{2n+1})$ of $[0, 1]$ by
$$t_0 := 0 , \quad t_{2k-1} := 1 - \frac{1}{2(2k - 1)} , \quad t_{2k} := 1 - \frac{1}{4k}$$
for $k = 1, \dots, n$ and $t_{2n+1} := 1$. Then
$$\sum_{\mu = 0}^{2n} |u(t_\mu) - u(t_{\mu + 1})| \geq \sum_{k=1}^{n} |u(t_{2k-1}) - u(t_{2k})| \geq \sum_{k=1}^{n} \left( \frac{1}{2(2k - 1)} + \frac{1}{4k} \right) \geq \frac{1}{2} \sum_{k=1}^{n} \frac{1}{k} .$$
Hence
$$\left\{ \sum_{\mu = 0}^{\nu - 1} |u(t_\mu) - u(t_{\mu + 1})| : P = (t_0, \dots, t_\nu) \in \mathcal{P} \right\}$$
is unbounded, and $u$ is non-rectifiable.

Fig. 149: Graph of the non-rectifiable continuous path $u$ from Example 3.5.58.

Below, we will give a formula for the calculation of the length of $C^1$-paths. The proof of that formula uses the following simple consequence of the Bolzano-Weierstrass theorem.

Theorem 3.5.59. (Uniform continuity) Let $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, be continuous. Then $f$ is uniformly continuous, i.e., for every $\varepsilon > 0$ there is some $\delta > 0$ such that for all $x, y \in [a, b]$ it follows from $|x - y| \leq \delta$ that $|f(x) - f(y)| \leq \varepsilon$.

Proof. The proof is indirect. Assuming the opposite, there is some $\varepsilon > 0$ for which the statement is not true. Hence for every $n \in \mathbb{N}^*$, there are $x_n, y_n \in [a, b]$ such that $|x_n - y_n| \leq 1/n$ and at the same time such that $|f(x_n) - f(y_n)| > \varepsilon$. According to the Bolzano-Weierstrass theorem, Theorem 2.3.18, there are subsequences $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $x \in [a, b]$ and $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ of $y_{n_1}, y_{n_2}, \dots$ converging to some element $y \in [a, b]$. Hence it follows by the continuity of the modulus function (see Example 2.3.52), the continuity of $f$, Theorem 2.3.4 and Theorem 2.3.12 that $x = y$ and $|f(x) - f(y)| \geq \varepsilon$, which is a contradiction since $\varepsilon > 0$.

Theorem 3.5.60. Let $n \in \mathbb{N}^*$, $a, b \in \mathbb{R}$ be such that $a \leq b$, and $u : [a, b] \to \mathbb{R}^n$ be continuous and differentiable on $(a, b)$ such that its derivative on $(a, b)$ can be extended to a continuous path $u'$ on $[a, b]$; such a path will be called a $C^1$-path in the following. Then $u$ is rectifiable and
$$L(u) = \int_a^b |u'(t)| \, dt .$$

Proof. For this, let $\nu \in \mathbb{N}^*$, $(t_0, \dots, t_\nu) \in \mathcal{P}$ and $\mu \in \{0, \dots, \nu - 1\}$. Then by Theorem 2.6.21,
$$|u(t_\mu) - u(t_{\mu + 1})|^2 = \sum_{k=1}^{n} |u_k(t_\mu) - u_k(t_{\mu + 1})|^2 = \sum_{k=1}^{n} \left| \int_{t_\mu}^{t_{\mu + 1}} u_k'(t) \, dt \right|^2 .$$
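By Theorem 3.5.2,
$$\sum_{k=1}^{n} \left| \int_{t_\mu}^{t_{\mu + 1}} u_k'(t) \, dt \right|^2 = \int_{t_\mu}^{t_{\mu + 1}} \sum_{k=1}^{n} \left( \int_{t_\mu}^{t_{\mu + 1}} u_k'(s) \, ds \right) u_k'(t) \, dt \leq \left[ \sum_{k=1}^{n} \left| \int_{t_\mu}^{t_{\mu + 1}} u_k'(t) \, dt \right|^2 \right]^{1/2} \int_{t_\mu}^{t_{\mu + 1}} |u'(t)| \, dt .$$
Hence
$$\sum_{k=1}^{n} \left| \int_{t_\mu}^{t_{\mu + 1}} u_k'(t) \, dt \right|^2 \leq \left( \int_{t_\mu}^{t_{\mu + 1}} |u'(t)| \, dt \right)^2$$
and
$$|u(t_\mu) - u(t_{\mu + 1})| \leq \int_{t_\mu}^{t_{\mu + 1}} |u'(t)| \, dt$$
as well as
$$\sum_{\mu = 0}^{\nu - 1} |u(t_\mu) - u(t_{\mu + 1})| \leq \int_a^b |u'(t)| \, dt .$$
Hence $u$ is rectifiable and
$$L(u) \leq \int_a^b |u'(t)| \, dt .$$
For the proof of the opposite inequality, let $\varepsilon > 0$. Since $u'$ is continuous, it follows by application of Theorem 3.5.59 to its component functions the existence of $\delta > 0$ such that for all $s, t \in [a, b]$
$$|u'(s) - u'(t)| \leq \varepsilon \quad \text{if } |s - t| \leq \delta .$$
Let $\nu \in \mathbb{N}^*$, $(t_0, \dots, t_\nu) \in \mathcal{P}$ of size $\leq \delta$, $\mu \in \{0, \dots, \nu - 1\}$. Then
$$|u'(t)| = |u'(t) - u'(t_\mu) + u'(t_\mu)| \leq |u'(t_\mu)| + \varepsilon$$
for all $t \in [t_\mu, t_{\mu + 1}]$. Hence
$$\int_{t_\mu}^{t_{\mu + 1}} |u'(t)| \, dt \leq (|u'(t_\mu)| + \varepsilon) \, l([t_\mu, t_{\mu + 1}]) = \left| \int_{t_\mu}^{t_{\mu + 1}} [\, u'(t) + u'(t_\mu) - u'(t) \,] \, dt \right| + \varepsilon \, l([t_\mu, t_{\mu + 1}])$$
$$\leq \left| \int_{t_\mu}^{t_{\mu + 1}} u'(t) \, dt \right| + \left| \int_{t_\mu}^{t_{\mu + 1}} [\, u'(t_\mu) - u'(t) \,] \, dt \right| + \varepsilon \, l([t_\mu, t_{\mu + 1}]) \leq |u(t_\mu) - u(t_{\mu + 1})| + (1 + \sqrt{n}) \, l([t_\mu, t_{\mu + 1}]) \, \varepsilon$$
where integration of vector-valued functions is defined component-wise. Hence, since the lengths $l([t_\mu, t_{\mu + 1}])$ sum to $b - a$,
$$\int_a^b |u'(t)| \, dt \leq \sum_{\mu = 0}^{\nu - 1} |u(t_\mu) - u(t_{\mu + 1})| + (1 + \sqrt{n})(b - a) \, \varepsilon \leq L(u) + (1 + \sqrt{n})(b - a) \, \varepsilon$$
and, finally, since $\varepsilon > 0$ was arbitrary,
$$\int_a^b |u'(t)| \, dt \leq L(u) .$$

Usually in applications, the length of curves, i.e., ranges of paths, is of more interest. The length of a curve should not depend on a parametrization / path. Below, the invariance of the length of paths under reparametrization is proved. As a consequence, we will define the length of a curve as the length of an injective $C^1$-path whose range coincides with the curve, if existent.

Theorem 3.5.61. (Invariance of the length of paths under reparametrizations) Let $n \in \mathbb{N}^*$, $a, b \in \mathbb{R}$ be such that $a \leq b$ and $u : [a, b] \to \mathbb{R}^n$ be a $C^1$-path. Further, let $c, d \in \mathbb{R}$ be such that $c \leq d$, and let $g : [c, d] \to [a, b]$ be continuous, increasing (not necessarily strictly) such that $g(c) = a$, $g(d) = b$, and differentiable on $(c, d)$ with its derivative on $(c, d)$ being extendable to a continuous function on $[c, d]$.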
Then $u \circ g$ is a $C^1$-path and
$$L(u \circ g) = L(u) .$$
For this reason, we define the length $L(\mathrm{Ran}\, u)$ of the curve $\mathrm{Ran}\, u$ by
$$L(\mathrm{Ran}\, u) := L(u)$$
if $u$ is in addition injective.

Proof. First, $u \circ g$ is continuous, differentiable on $(c, d)$ with its derivative on $(c, d)$ having the continuous extension $g' . (u' \circ g)$. Hence $u \circ g$ is a $C^1$-path, and it follows by Theorem 3.1.1 that
$$L(u) = \int_{g(c)}^{g(d)} |u'(t)| \, dt = \int_c^d |(u' \circ g)(s)| \, g'(s) \, ds = \int_c^d |g'(s) . (u' \circ g)(s)| \, ds = \int_c^d |(u \circ g)'(s)| \, ds = L(u \circ g) .$$

Example 3.5.62. Calculate the length of the circle $S^1_r(0)$ of radius $r > 0$ around the origin. Solution: An injective parametrization of the part of $S^1_r(0)$ in the upper half-plane is given by the $C^1$-path $u : [0, \pi] \to \mathbb{R}^2$ defined by
$$u(\varphi) := (r \cos\varphi, \; r \sin\varphi)$$
for every $\varphi \in [0, \pi]$. Since
$$L(u) = \int_0^\pi |u'(\varphi)| \, d\varphi = \int_0^\pi |(-r \sin\varphi, \; r \cos\varphi)| \, d\varphi = r \int_0^\pi d\varphi = \pi r ,$$
it follows that
$$L(S^1_r(0)) = 2 \pi r .$$

Example 3.5.63. (Length of plane paths given in polar coordinates) Let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a, b]$, and let $r : I \to \mathbb{R}$ and $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a, b)$ with derivatives that can be extended to continuous functions on $I$. Then by
$$u(t) := (\, r(t) \cos\varphi(t), \; r(t) \sin\varphi(t) \,)$$
for every $t \in I$, there is defined a $C^1$-path. Note that for $t \in I$, $r(t)$ and $\varphi(t)$ can be interpreted as polar coordinates of $u(t)$ if $r(t) > 0$ and $\varphi(t) \in (-\pi, \pi)$. In particular for $t \in (a, b)$,
$$u'(t) = (\, r'(t) \cos\varphi(t) - r(t) \varphi'(t) \sin\varphi(t), \; r'(t) \sin\varphi(t) + r(t) \varphi'(t) \cos\varphi(t) \,)$$
and hence
$$|u'(t)|^2 = [\, r'(t) \cos\varphi(t) - r(t) \varphi'(t) \sin\varphi(t) \,]^2 + [\, r'(t) \sin\varphi(t) + r(t) \varphi'(t) \cos\varphi(t) \,]^2 = r'^2(t) + r^2(t) \, \varphi'^2(t) .$$
As a consequence, the length of $u$ is given by
$$L(u) = \int_a^b \sqrt{ r'^2(t) + r^2(t) \, \varphi'^2(t) } \; dt .$$

Problems
upxq : px, 4x 7 q , x P I : r0, 1s , uptq : p2t3 , 3t2 q , t P I : r2, 5s , upxq : px, 2x4 p16x2 q1 q , x P I : r1, 2s , upxq : px, x2{3 q , x P I : r2, 3s , upxq : px, 128px5 {15q p8x3 q1 q , x P I : r1, 3s , upθq : pθ, ln cos θq , θ P I : rπ {8, π {4s , upsq : ps, cosh sq , s P I : r1, 8s , upθq : p2θ, cosp3θq, sinp3θqq , θ P I : r0, π s , ? uptq : pt2 {2, 2 t, lnptqq , t P I : r2, 7s , uptq : pt, cosh t, sinh tq , t P I : r0, 4s , upv q : p2v 3 , cos v v sin v, v cos v sin v q , v P I : r0, π {2s , uptq : p2et , et sin t, et cos tq , t P I : r3, 4s . 1) Calculate the length of the path u : I a) b) c) d) e) f) g) h) i) j) k) l) 2) Calculate the length of the curve C. a) C : tpx, y q P R2 : x2{3 b) c) d) e) f) g) y 2{3 9 ^ 1 ¤ x ¤ 3u , C : tpx, y q P R2 : x 2y 3{2 3 ^ 0 ¤ y ¤ 2u , C : tpx, y q P R2 : y 3 4x2 0 ^ 3 ¤ x ¤ 4u , C : tpx, y q P R2 : 1 px4 {3q xy 0 ^ 2 ¤ x ¤ 3u , C : tpx, y q P R2 : 8y 2 9px 1q2 ^ x ¥ 1 , 0 ¤ y ¤ 1u , ? C : tpx, y q P R2 : y x p3 2xq 0 ^ 0 ¤ x ¤ 4u , C : tpx, y q P R2 : xy px4 {2q 24 ^ 1 ¤ x ¤ 5u . 537 y 1 0.5 2Π Π x Fig. 150: A cycloid. 3) A cycloid is the trajectory of a point of a circle rolling along a straight line. Calculate the length of the part of the cycloid tpapt sin tq, ap1 cos tq : t P Ru between the points p0, 0q and p2πa, 0q where a ¡ 0. 4) An astroid is the trajectory of a point on a circle of radius R{4 rolling on the inside of a circle of radius R ¡ 0. Calculate the length of the part of the astroid tpR cos3 t, R sin3 tq : t P Ru between the points pR, 0q and p0, Rq. 5) A cardioid is the trajectory of a point on a circle rolling on the inside of a circle of the same radius. Calculate the length of the cardioid tpa cos ϕp1 where a ¡ 0. cos ϕq, a sin ϕp1 cos ϕqq : ϕ P r0, 2π qu 6) Consider all real-valued functions on [0,1] such that f p0q 1, f p1q 1 and that are continuously differentiable on the interval p0, 1q with a derivative that has a continuous extension to [0,1]. Find that function whose Graph is shortest. 
Give reasons for your answer.

7) Let $b > a > 0$. Consider all $C^1$-paths $u : [0,1] \to \mathbb{R}^2$ such that $u(0) = (a, 0)$, $u(1) = (b, 0)$ and such that
$$u(t) = (\, r(t) \cos\varphi(t), \; r(t) \sin\varphi(t) \,)$$
for every $t \in [0,1]$ where $r : [0,1] \to \mathbb{R}$ and $\varphi : [0,1] \to \mathbb{R}$ are continuous, continuously differentiable on the interval $(0, 1)$ with derivatives that have continuous extensions to $[0,1]$. Characterize the shortest paths. What is the common range of all these paths?

Fig. 151: An astroid.

Fig. 152: A cardioid.

8) Let $a, b, c, d \in \mathbb{R}$ be such that $a^2 + b^2 = 1$ and $c^2 + d^2 = 1$. Consider all $C^1$-paths $u : [0,1] \to \mathbb{R}^3$ on the sphere of radius 1 around the origin such that $u(0) = (a, 0, b)$, $u(1) = (c, 0, d)$ and such that
$$u(t) = (\, \sin(\theta(t)) \cos(\varphi(t)), \; \sin(\theta(t)) \sin(\varphi(t)), \; \cos(\theta(t)) \,)$$
for every $t \in [0,1]$ where $\theta : [0,1] \to \mathbb{R}$ and $\varphi : [0,1] \to \mathbb{R}$ are continuous, continuously differentiable on the interval $(0, 1)$ with derivatives that have continuous extensions to $[0,1]$. Characterize the shortest paths. What is the common range of all these paths?

9) (Length of space paths given in spherical coordinates) Let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a, b]$, and let $r : I \to \mathbb{R}$, $\theta : I \to \mathbb{R}$, $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a, b)$ with derivatives that can be extended to continuous functions on $I$. Define
$$u(t) := (\, r(t) \sin\theta(t) \cos\varphi(t), \; r(t) \sin\theta(t) \sin\varphi(t), \; r(t) \cos\theta(t) \,)$$
for every $t \in I$. Note that for $t \in I$, $r(t)$, $\theta(t)$ and $\varphi(t)$ can be interpreted as spherical coordinates of $u(t)$ if $r(t) > 0$, $\theta(t) \in (0, \pi)$ and $\varphi(t) \in (-\pi, \pi)$. Show that $u$ is a $C^1$-path of length
$$L(u) = \int_a^b \sqrt{ r'^2(t) + r^2(t) \left[ \theta'^2(t) + \sin^2\theta(t) \, \varphi'^2(t) \right] } \; dt .$$

4 Calculus III

4.1 Vector-valued Functions of Several Variables

This section starts the investigation of maps with domains in $\mathbb{R}^n$ and ranges in $\mathbb{R}^m$ where $m, n \in \mathbb{N}^*$ are such that at least one of them is greater than 1, i.e., such that $n^2 + m^2 > 2$. For brevity, we will call such maps vector-valued functions of several variables.
Today, the vast majority of applications lead to the consideration of such maps. Here it has to be remembered that we identify points in $\mathbb{R}^k$, where $k \in \mathbb{N}$ is such that $k \geq 2$, with position vectors, see the remarks preceding Definition 3.5.8. In addition, as was explained in the beginning of Section 3.5.8, we also identify tangent vectors that are associated to points in space with position vectors. In applications, the nature of the involved quantities can be inferred only from the context of a problem. But, at least in the experience of the author, most maps in applications describe 'physical fields', i.e., maps that have as domain a set of points and as range a set of real numbers or a set of tangent vectors. In the last case, such maps associate to every point from the domain a tangent vector that is 'attached' to that point. Notable exceptions are 'transformations' which map points into points. For the most part of this course, mathematically, the precise nature of the objects will not play a role. Only in parts of the subsequent section on applications of differentiation and in later sections on vector analysis will that nature play a role in the interpretation of the results.

Although the case that $m = n = 1$ is not the main object of investigation in the following, a guiding principle of Calculus III is the generalization of main results of Calculus I to the case of vector-valued functions of several variables. In this way, results are achieved that reduce in the case $m = n = 1$ to familiar results from Calculus I. Often it is clear already from the structure of the last results and their proofs whether they are likely to allow generalization or not. This kind of structural thinking can be viewed as an outflow of the formal approach to mathematics suggested by Hilbert. It has been very fruitful in the 20th century. In particular, it resulted in a restructuring of previous mathematical knowledge in a very efficient and aesthetic way.
The structuring of the whole course, Calculus I - III, can be viewed as an outgrowth of this formal approach.

Definition 3.5.46 from Calculus II gives a simple example of the above guiding principle. This definition simply replaces the modulus function in the corresponding definition for sequences of real numbers by the Euclidean distance function in order to arrive at a definition of the convergence of sequences in $\mathbb{R}^k$ where $k \in \mathbb{N}$ is such that $k \geq 2$. From a notational point of view, both definitions are practically identical. Subsequently, we proved that a sequence of elements in $\mathbb{R}^k$ converges to some $x \in \mathbb{R}^k$ if and only if for every $i \in \{1, \dots, k\}$ the corresponding sequence of $i$-th components converges in $\mathbb{R}$ to the $i$-th component of $x$. In this way, the question of convergence or non-convergence of a sequence in $\mathbb{R}^k$ was reduced to the question of convergence or non-convergence of sequences of real numbers. The last, i.e., reduction to results of Calculus I, is another guiding principle in Calculus III. For instance, Taylor's theorem, Theorem 4.3.6, for functions in several variables is a direct consequence of the corresponding theorem, Theorem 2.5.25, for functions in one variable.

Definition 4.1.1. (Vector-valued functions of several variables) A vector-valued function is a map from a non-trivial subset of $\mathbb{R}^n$ into $\mathbb{R}^m$ where $n \in \mathbb{N}^*$ and $m \in \mathbb{N}^* \setminus \{1\}$. A function of several variables is a map $f$ from a non-trivial subset $D$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ for some $n \in \mathbb{N}^* \setminus \{1\}$ and $m \in \mathbb{N}^*$. A vector-valued function of several variables is a vector-valued function and / or a function of several variables.

Definition 4.1.2. In accordance with Definitions 2.2.28, 2.2.33, for such a function $f$, we define

(i) the domain of $f$ by $D(f) := D$,

(ii) the range of $f$ by $\mathrm{Ran}(f) := \{ f(x) : x \in D \}$,

(iii) the Graph of $f$ by $G(f) := \{ (x, f(x)) : x \in D \}$,

(iv) the level set (or contour) of $f$ corresponding to some $c \in \mathbb{R}^m$ by $f^{-1}(c) := \{ x \in D : f(x) = c \}$.

Fig. 153: Range of $\gamma_1$.
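To make the notion of a level set from Definition 4.1.2 (iv) concrete, the following small Python sketch tests numerically whether sample points belong to a level set. The function $f(x, y) = 1/|(x, y)|$ on $\mathbb{R}^2 \setminus \{0\}$ is used purely as an illustration, and the helper `in_level_set` together with its tolerance `tol` is a hypothetical construction for floating point arithmetic, not part of the definition.

```python
import math

def f(x, y):
    """Illustrative function f(x, y) = 1/|(x, y)| on R^2 without the origin."""
    return 1.0 / math.hypot(x, y)

def in_level_set(point, c, tol=1e-12):
    """Numerical membership test for the level set {x in D : f(x) = c}."""
    return abs(f(*point) - c) <= tol

c = 2.0
radius = 1.0 / c  # for this f, the level set for c > 0 is the circle of radius 1/c

# Eight sample points on the circle of radius 1/c around the origin.
circle = [(radius * math.cos(2 * math.pi * k / 8),
           radius * math.sin(2 * math.pi * k / 8)) for k in range(8)]

print(all(in_level_set(p, c) for p in circle))  # sample points on the circle
print(in_level_set((1.0, 0.0), c))              # a point off the circle
```

For this particular function, every level set belonging to a value $c > 0$ is a circle around the origin, whereas the level sets for $c \leq 0$ are empty.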
The following are examples of vector-valued functions of several variables.

Example 4.1.3.
(i) $\gamma_1 : \mathbb{R} \to \mathbb{R}^2$ defined by $\gamma_1(t) := (\cos(t), \sin(t))$ for every $t \in \mathbb{R}$,
(ii) $\gamma_2 : \mathbb{R} \to \mathbb{R}^3$ defined by $\gamma_2(t) := (\cos(t), \sin(t), t)$ for every $t \in \mathbb{R}$,
(iii) $f_3 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ defined by $f_3(x) := 1/|x|$ for every $x \in \mathbb{R}^2 \setminus \{0\}$,
(iv) $f_4 : \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}$ defined by $f_4(x) := 1/|x|$ for every $x \in \mathbb{R}^3 \setminus \{0\}$.

Fig. 154: Range of $\gamma_2$.

Fig. 155: Truncated graph of $f_3$.

Fig. 156: Contour map of $f_3$. Darker colors correspond to lower values of $f_3$.

Example 4.1.4.
(i) Find the maximal domain $D(g)$ of $g$ such that
\[ g(x, y) = \sqrt{36 - 9x^2 - 4y^2} \]
for all $(x, y) \in D(g)$.
Solution: The domain of $g$ is the subset of $\mathbb{R}^2$ consisting of all those $(x, y) \in \mathbb{R}^2$ for which $\sqrt{36 - 9x^2 - 4y^2}$ is defined. Hence it is given by
\[ \{ (x, y) \in \mathbb{R}^2 : (x/2)^2 + (y/3)^2 \leq 1 \} . \]
Geometrically, this set is the closed region bounded by the ellipse centered at the origin with half axes 2 and 3.

Fig. 157: $D(g)$.

(ii) Find the range of $g$.
Solution: $\mathrm{Ran}(g) = [0, 6]$. (Proof: For every $(x, y) \in D(g)$, it follows that
\[ 0 \leq 36 - 9x^2 - 4y^2 \leq 36 \]
and hence also that
\[ 0 \leq \sqrt{36 - 9x^2 - 4y^2} \leq 6 . \]
Therefore, $\mathrm{Ran}(g) \subset [0, 6]$. In addition, for every $z \in [0, 6]$,
\[ g\left( \tfrac{1}{3} \sqrt{36 - z^2}, 0 \right) = z . \]
Hence it follows also that $\mathrm{Ran}(g) \supset [0, 6]$ and, finally, that $\mathrm{Ran}(g) = [0, 6]$.)

Fig. 158: Graph of $f$ from Example 4.1.6.

Analogous to the corresponding definition in Calculus I, the next definition characterizes continuity of a vector-valued function of several variables at a point by its property to commute with limits taken at that point.

Definition 4.1.5. Let $f : D \to \mathbb{R}^m$ be a vector-valued function of several variables and $x \in D$. We say that $f$ is continuous in $x$ if for every sequence $x_1, x_2, \dots$ of elements in $D$ from
\[ \lim_{\nu \to \infty} x_\nu = x \]
it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f\left( \lim_{\nu \to \infty} x_\nu \right) \quad [\, = f(x) \,] . \]
Otherwise, we say that $f$ is discontinuous in $x$.
Moreover, we say that $f$ is continuous if $f$ is continuous in all points of its domain $D$. Otherwise, we say that $f$ is discontinuous.

As a reminder of discontinuity of functions defined on subsets of $\mathbb{R}$, we give the following example.

Example 4.1.6. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) := \frac{x}{|x|} \]
for every $x \in \mathbb{R} \setminus \{0\}$ and $f(0) := 1$. Then
\[ \lim_{n \to \infty} \frac{1}{n} = 0 \quad \text{and} \quad \lim_{n \to \infty} \left( -\frac{1}{n} \right) = 0 , \]
but
\[ \lim_{n \to \infty} f\left( \frac{1}{n} \right) = 1 \quad \text{and} \quad \lim_{n \to \infty} f\left( -\frac{1}{n} \right) = -1 . \]
Hence $f$ is discontinuous at the point $0$. See Fig. 158.

Example 4.1.7. Consider the function of several variables $f_5 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ defined by
\[ f_5(x, y) := \frac{x^2 - y^2}{x^2 + y^2} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Then
\[ f_5(x, 0) = 1 , \quad f_5(0, y) = -1 \]
for all $x, y \in \mathbb{R} \setminus \{0\}$, and hence there is no extension of $f_5$ to a continuous function defined on $\mathbb{R}^2$. Note that for every real $a$
\[ f_5(x, ax) = \frac{1 - a^2}{1 + a^2} \]
for all $x \in \mathbb{R} \setminus \{0\}$. Hence for every $b \in (-1, 1]$, there is a real number $a$ such that
\[ \lim_{x \to 0, x \neq 0} f_5(x, ax) = b . \]

Fig. 159: Graph and contour map of $f_5$. In the latter, darker colors correspond to lower values of $f_5$.

Example 4.1.8. (Basic examples of continuous functions.) Let $n \in \mathbb{N}^*$.
(i) Constant vector-valued functions on $\mathbb{R}^n$ are continuous as a consequence of Theorem 3.5.47.
(ii) For $i \in \{1, \dots, n\}$, define the projection $p_i : \mathbb{R}^n \to \mathbb{R}$ of $\mathbb{R}^n$ onto the $i$-th component by
\[ p_i(x) := x_i \]
for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Then $p_i$ is continuous as a consequence of Theorem 3.5.47.

In the case of functions of one real variable, one main application of continuity came from the fact that continuous functions defined on bounded and closed intervals assume a maximum value and a minimum value. The same is true for continuous functions of several variables. Such functions assume a maximum value and a minimum value on so called 'compact' subsets, defined below, of their domain. Again, as in the case of functions of one real variable, this property is a consequence of the Bolzano-Weierstrass theorem for sequences in $\mathbb{R}^n$.
The last theorem will be given next. It is a simple consequence of its counterpart, Theorem 2.3.18, for sequences of real numbers.

Theorem 4.1.9. (Bolzano-Weierstrass) Let $n \in \mathbb{N}^*$ and $x_1, x_2, \dots$ be a bounded sequence in $\mathbb{R}^n$, i.e., for which there is $M > 0$ such that
\[ |x_k| \leq M \]
for all $k \in \mathbb{N}^*$. Then there is a subsequence, i.e., a sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers, which is convergent in $\mathbb{R}^n$.

Proof. Since $x_1, x_2, \dots$ is bounded, the corresponding sequences of components $x_{1k}, x_{2k}, \dots$, $k = 1, \dots, n$, are also bounded. Therefore, as a consequence of an $n$-fold application of Theorem 2.3.18 and an application of Theorem 3.5.47, there exists a subsequence $x_{n_1}, x_{n_2}, \dots$, where $n_1, n_2, \dots$ is a strictly increasing sequence of non-zero natural numbers, which is convergent in $\mathbb{R}^n$.

For $n \in \mathbb{N}^*$ such that $n \geq 2$, subsets of $\mathbb{R}^n$ show more variety than subsets of the real numbers. In the following, we define subclasses of such sets that play a particular role in calculus. Among these, open subsets play a role similar to that of open intervals of $\mathbb{R}$. Differentiability will be defined only for functions defined on such sets because differences of neighboring function values need to be considered, and for every point $x$ in such a set, there is $\varepsilon > 0$ such that $x + h$ is also contained in that set for all $h$ satisfying $|h| < \varepsilon$. Compact subsets generalize aspects of bounded closed intervals of $\mathbb{R}$. In particular, we will see below that continuous functions of several variables assume a maximum value and a minimum value on compact subsets of their domains.

Definition 4.1.10. (Open, closed and compact subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$.
(i) A subset $U$ of $\mathbb{R}^n$ is called open if for every $x \in U$ there is an 'open ball of some radius $\varepsilon > 0$ around $x$'
\[ U_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| < \varepsilon \} \]
which is contained in $U$. In particular, $\emptyset$, $\mathbb{R}^n$, and every open ball of radius $\varepsilon > 0$ around $x \in \mathbb{R}^n$ are open.
Obviously, arbitrary unions of open subsets of $\mathbb{R}^n$ and intersections of finitely many open subsets of $\mathbb{R}^n$ are open.
(ii) A subset $A$ of $\mathbb{R}^n$ is called closed if its complement $\mathbb{R}^n \setminus A$ is open. In particular, the so called 'closed ball of radius $\varepsilon > 0$ around $x \in \mathbb{R}^n$'
\[ B_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| \leq \varepsilon \} \]
and the sphere of radius $\varepsilon$ centered at $x$
\[ S_\varepsilon(x) := \{ y \in \mathbb{R}^n : |y - x| = \varepsilon \} \]
are closed. As a consequence of the last remark in (i), arbitrary intersections of closed subsets of $\mathbb{R}^n$ and unions of finitely many closed subsets of $\mathbb{R}^n$ are closed. In particular, we define for every subset $S$ of $\mathbb{R}^n$ its corresponding closure $\bar{S}$ as the intersection of all closed subsets of $\mathbb{R}^n$ that contain $S$. Hence $\bar{S}$ is the smallest closed subset of $\mathbb{R}^n$ that contains $S$.
(iii) A subset $K$ of $\mathbb{R}^n$ is called compact if it is closed and bounded, i.e., if it is closed and contained in some open ball $U_R(0)$ of some radius $R > 0$ around the origin.

Among others, the following example shows that the notions of openness and closedness of intervals of $\mathbb{R}$ used in Calculus I coincide with those notions from the previous definition.

Example 4.1.11. Let $a, b, c \in \mathbb{R}$ be such that $a \leq b$.
(i) The interval $(a, b)$ is bounded and open. This can be seen as follows. First, $(a, b)$ is bounded since $(a, b) \subset (-M, M)$ where $M := \max\{ |a|, |b| \}$. Second, $(a, b)$ is open since it follows for every $x \in (a, b)$ that $(x - \varepsilon, x + \varepsilon) \subset (a, b)$ where $\varepsilon := \min\{ x - a, b - x \}$.
(ii) Since
\[ (-\infty, c) = \bigcup_{n=0}^{\infty} (-n + c, c) , \quad (c, \infty) = \bigcup_{n=0}^{\infty} (c, c + n) , \]
(i) also implies that $(-\infty, c)$ and $(c, \infty)$ are open.
(iii) Since
\[ (-\infty, c] = \mathbb{R} \setminus (c, \infty) , \quad [c, \infty) = \mathbb{R} \setminus (-\infty, c) , \]
(ii) implies that $(-\infty, c]$ and $[c, \infty)$ are closed.
(iv) The interval $[a, b]$ is compact. This can be seen as follows. First, $[a, b]$ is bounded since $[a, b] \subset (-M, M)$ for $M > \max\{ |a|, |b| \}$. Second, $[a, b]$ is closed since
\[ \mathbb{R} \setminus [a, b] = (-\infty, a) \cup (b, \infty) \]
is open as a union of open subsets of $\mathbb{R}$.
(v) For $a < b$, the closure of $(a, b)$ coincides with $[a, b]$. This can be seen as follows. First, $[a, b]$ is a closed subset of $\mathbb{R}$ that contains $(a, b)$.
Since, by definition, the closure of $(a, b)$ is the smallest closed subset of $\mathbb{R}$ that contains $(a, b)$, the closure of $(a, b)$ is a subset of $[a, b]$. We show now indirectly that $b$ is contained in every closed subset $C$ of $\mathbb{R}$ that contains $(a, b)$. Otherwise, there is such $C$ for which this is not the case. Hence $b$ is contained in $\mathbb{R} \setminus C$. Since the last set is open, there is $\varepsilon > 0$ such that $(b - \varepsilon, b + \varepsilon) \subset \mathbb{R} \setminus C$. But the intersection of $(b - \varepsilon, b + \varepsilon)$ with $(a, b)$ is non-empty. Hence $C$ does not contain $(a, b)$, which is a contradiction. Analogously, it follows that $a$ is contained in every closed subset $C$ of $\mathbb{R}$ that contains $(a, b)$. Otherwise, there is such $C$ for which this is not the case. Hence $a$ is contained in $\mathbb{R} \setminus C$. Since the last set is open, there is $\varepsilon > 0$ such that $(a - \varepsilon, a + \varepsilon) \subset \mathbb{R} \setminus C$. But the intersection of $(a - \varepsilon, a + \varepsilon)$ with $(a, b)$ is non-empty. Hence $C$ does not contain $(a, b)$, which is again a contradiction. Hence every closed subset of $\mathbb{R}$ that contains $(a, b)$ also contains $[a, b]$. Therefore, the closure of $(a, b)$ also contains $[a, b]$ and hence coincides with $[a, b]$.

The closure $\bar{S}$ of a subset $S$ of $\mathbb{R}^n$, $n \in \mathbb{N}^*$, was defined as the intersection of all closed subsets that contain $S$. It will turn out to be useful to have a characterization of $\bar{S}$ that makes reference only to the set $S$, but not to any other set. Such a characterization is given below.

Theorem 4.1.12. Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$. Then the closure $\bar{S}$ of $S$ consists of all $x \in \mathbb{R}^n$ for which there is a sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$.

Proof. If $x \in \bar{S}$, there are two cases. In case that $x \in S$, the constant sequence $x, x, \dots$ is a sequence in $S$ that is converging to $x$. If $x \notin S$ and $U$ is some open subset of $\mathbb{R}^n$ that contains $x$, it follows that $U$ also contains a point of $S$. Otherwise, it follows that $\bar{S} \setminus U = \bar{S} \cap (\mathbb{R}^n \setminus U)$ is a closed subset of $\mathbb{R}^n$ that contains $S$ and hence that $\bar{S} \subset \bar{S} \setminus U$, which implies that $x \notin \bar{S}$. In particular, by applying the previous to $U_{1/\nu}(x)$ for every $\nu \in \mathbb{N}^*$, we obtain a sequence $x_1, x_2, \dots$
of elements of $S$ that is convergent to $x$ by construction. On the other hand, if $x \in \mathbb{R}^n$ is such that there is a sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$ and $A$ is a closed subset of $\mathbb{R}^n$ that contains $S$, it follows that $x \in A$. Otherwise, $x$ is contained in the open set $\mathbb{R}^n \setminus A$ and there is $\varepsilon > 0$ such that $U_\varepsilon(x) \subset \mathbb{R}^n \setminus A$. Hence if $\nu \in \mathbb{N}^*$ is such that $|x_\nu - x| < \varepsilon$, it follows that $x_\nu \in \mathbb{R}^n \setminus A$ and hence that $x_\nu \notin S$, which is a contradiction. As a consequence, $x$ is also contained in $\bar{S}$, which is the intersection of all closed subsets $A$ of $\mathbb{R}^n$ that contain $S$.

Example 4.1.13. Let $a, b \in \mathbb{R}$ be such that $a < b$. Then the closure of the intervals $(a, b)$, $(a, b]$, $[a, b)$ and $[a, b]$ is given by $[a, b]$.

Note that the statement of the previous theorem implies that convergent sequences in $\mathbb{R}^n$, $n \in \mathbb{N}^*$, whose members are elements of a closed subset of $\mathbb{R}^n$ converge to an element of that subset. This is the additional fact that is necessary for the proof that continuous functions defined on compact subsets of $\mathbb{R}^n$ assume a maximum value and a minimum value. The procedure of that proof itself is analogous to the proof of the similar statement from Calculus I.

Theorem 4.1.14. (Existence of maxima and minima of continuous functions on compact subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$, $K \subset \mathbb{R}^n$ a non-empty compact subset and $f : K \to \mathbb{R}$ be continuous. Then there are (not necessarily uniquely determined) $x_{\min} \in K$ and $x_{\max} \in K$ such that
\[ f(x_{\max}) \geq f(x) , \quad f(x_{\min}) \leq f(x) \]
for all $x \in K$.

Proof. For this, in a first step, we show that $f$ is bounded and hence that $\sup f(K)$ exists. In the final step, we show that there is $c \in K$ such that $f(c) = \sup f(K)$. For both, we use the Bolzano-Weierstrass theorem. The proof that $f$ is bounded is indirect. Assume on the contrary that $f$ is unbounded. Then there is a sequence $x_1, x_2, \dots$ in $K$ such that
\[ f(x_n) > n \tag{4.1.1} \]
for all $n \in \mathbb{N}$. Hence according to Theorems 4.1.9, 4.1.12, there is a subsequence $x_{k_1}, x_{k_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $c \in K$.
Note that the corresponding sequence $f(x_{k_1}), f(x_{k_2}), \dots$ is not converging as a consequence of (4.1.1). But, since $f$ is continuous, it follows that
\[ f(c) = \lim_{\nu \to \infty} f(x_{k_\nu}) , \]
which is a contradiction. Hence $f$ is bounded. Therefore, let $M := \sup f(K)$. Then for every $n \in \mathbb{N}^*$ there is a corresponding $c_n \in K$ such that
\[ |f(c_n) - M| < \frac{1}{n} . \tag{4.1.2} \]
Again, according to Theorems 4.1.9, 4.1.12, there is a subsequence $c_{k_1}, c_{k_2}, \dots$ of $c_1, c_2, \dots$ converging to some element $c \in K$. Also, as a consequence of (4.1.2), the corresponding sequence $f(c_{k_1}), f(c_{k_2}), \dots$ is converging to $M$. Hence it follows by the continuity of $f$ that $f(c) = M$ and by the definition of $M$ that
\[ f(c) = M \geq f(x) \]
for all $x \in K$. By applying the previous reasoning to the continuous function $-f$, it follows that there is a $c' \in K$ such that
\[ -f(c') \geq -f(x) \]
for all $x \in K$. Hence it follows that
\[ f(c') \leq f(x) \]
for all $x \in K$.

In the case of functions of one real variable, it was shown that a continuous function $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, which is differentiable on the open interval $(a, b)$, assumes its extrema either in a critical point in $(a, b)$ or in the boundary points $a$ or $b$ of $[a, b]$. The same is true for functions of several variables. For this purpose, we define the notion of inner points and boundary points of subsets of $\mathbb{R}^n$, $n \in \mathbb{N}^*$.

Definition 4.1.15. (Inner points and boundary points of subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$.
(i) We call $x \in S$ an inner point of $S$ if there is $\varepsilon > 0$ such that $U_\varepsilon(x) \subset S$. In particular, we call the set of inner points of $S$ the interior of $S$ and denote this set by $S^{\circ}$. Obviously, $S^{\circ}$ is the largest open set that is contained in $S$.
(ii) We call $x \in \mathbb{R}^n$ a boundary point of $S$ if for every $\varepsilon > 0$ the corresponding $U_\varepsilon(x)$ contains a point from $S$ and a point from $\mathbb{R}^n \setminus S$. Hence a boundary point of $S$ cannot be an inner point of $S$. We call the set of boundary points of $S$ the boundary of $S$ and denote this set by $\partial S$.

Example 4.1.16. Let $a, b \in \mathbb{R}$ be such that $a < b$.
Then the interior of the intervals $(a, b)$ and $[a, b]$ is given by $(a, b)$. The boundary of $(a, b)$ and $[a, b]$ is given by $\{a, b\}$. We note that the closure of $(a, b)$, i.e., $[a, b]$, is the union of the interior of $(a, b)$, i.e., $(a, b)$, and the boundary of $(a, b)$, i.e., $\{a, b\}$. The last is true for every subset of $\mathbb{R}^n$, $n \in \mathbb{N}^*$.

Theorem 4.1.17. (Decomposition of the closure of subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$ and $S \subset \mathbb{R}^n$. Then
\[ \bar{S} = S^{\circ} \cup \partial S . \]

Proof. First, we note that $S^{\circ} \subset S \subset \bar{S}$. If $x \in \partial S$, then for every $\nu \in \mathbb{N}^*$ there is $x_\nu \in S$ such that $|x_\nu - x| < 1/\nu$. Hence
\[ \lim_{\nu \to \infty} x_\nu = x \]
and $x \in \bar{S}$. Hence it follows that $\bar{S} \supset S^{\circ} \cup \partial S$. If $x \in \bar{S}$, then either $x \in S^{\circ}$ or $x \in \partial S$. Otherwise, there is $\varepsilon > 0$ such that $U_\varepsilon(x)$ is contained in $\mathbb{R}^n \setminus S$. In this case, there is no sequence $x_1, x_2, \dots$ of elements of $S$ that is convergent to $x$ and hence $x \notin \bar{S}$. Hence it follows that $\bar{S} \subset S^{\circ} \cup \partial S$ and finally that $\bar{S} = S^{\circ} \cup \partial S$.

If defined, sums, scalar multiples, products, quotients and compositions of continuous vector-valued functions of several variables are continuous, as is also the case for functions in one real variable. This is a simple consequence of the limit laws, Theorems 3.5.49, 2.3.4, and the definition of continuity. The associated proofs are analogous to those of the corresponding statements for functions in one real variable in Calculus I. As usual, a typical application of the theorems obtained in this way consists in the decomposition of a given vector-valued function of several variables into sums, scalar multiples, products, quotients, and compositions of vector-valued functions of several variables whose continuity is already known. Then the application of those theorems proves the continuity of that function. In this way, the proof of continuity of a given vector-valued function of several variables is greatly simplified and, usually, obvious. Therefore, in such obvious cases, the continuity of such a function will in future be just stated, but not explicitly proved.

Definition 4.1.18.
Let $f_1 : D_1 \to \mathbb{R}^m$, $f_2 : D_2 \to \mathbb{R}^m$ be vector-valued functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. Moreover, let $a \in \mathbb{R}$. We define $(f_1 + f_2) : D_1 \cap D_2 \to \mathbb{R}^m$ and $a.f_1 : D_1 \to \mathbb{R}^m$ by
\[ (f_1 + f_2)(x) := f_1(x) + f_2(x) \]
for all $x \in D_1 \cap D_2$ and
\[ (a.f_1)(x) := a.f_1(x) \]
for all $x \in D_1$.

Theorem 4.1.19. Let $f_1 : D_1 \to \mathbb{R}^m$, $f_2 : D_2 \to \mathbb{R}^m$ be vector-valued functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. Moreover, let $a \in \mathbb{R}$. Then by Corollary 3.5.49:
(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 + f_2$ is continuous in $x$, too.
(ii) If $f_1$ is continuous in $x \in D_1$, then $a.f_1$ is continuous in $x$, too.

Definition 4.1.20. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions of several variables such that $D_1 \cap D_2 \neq \emptyset$. We define $f_1 \cdot f_2 : D_1 \cap D_2 \to \mathbb{R}$ by
\[ (f_1 \cdot f_2)(x) := f_1(x) \cdot f_2(x) \]
for all $x \in D_1 \cap D_2$. If moreover $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$, we define $1/f_1 : D_1 \to \mathbb{R}$ by
\[ (1/f_1)(x) := 1/f_1(x) \]
for all $x \in D_1$.

Theorem 4.1.21. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions of several variables such that $D_1 \cap D_2 \neq \emptyset$.
(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 \cdot f_2$ is continuous in $x$, too.
(ii) If $f_1$ is such that $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$ as well as continuous in $x \in D_1$, then $1/f_1$ is continuous in $x$, too.

Proof. For the proof of (i), let $x_1, x_2, \dots$ be some sequence in $D_1 \cap D_2$ which converges to $x$. Then it follows for every $\nu \in \mathbb{N}^*$ that
\[
\begin{aligned}
|(f_1 \cdot f_2)(x_\nu) - (f_1 \cdot f_2)(x)| &= |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x)| \\
&= |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x_\nu) + f_1(x) f_2(x_\nu) - f_1(x) f_2(x)| \\
&\leq |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| \\
&\leq |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| + |f_1(x_\nu) - f_1(x)| \cdot |f_2(x)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)|
\end{aligned}
\]
and hence, obviously, that
\[ \lim_{\nu \to \infty} (f_1 \cdot f_2)(x_\nu) = (f_1 \cdot f_2)(x) . \]
For the proof of (ii), let $x_1, x_2, \dots$ be some sequence in $D_1$ which converges to $x$. Then it follows for every $\nu \in \mathbb{N}^*$ that
\[ |(1/f_1)(x_\nu) - (1/f_1)(x)| = |1/f_1(x_\nu) - 1/f_1(x)| = \frac{|f_1(x_\nu) - f_1(x)|}{|f_1(x_\nu)| \cdot |f_1(x)|} \]
and hence, obviously, that
\[ \lim_{\nu \to \infty} (1/f_1)(x_\nu) = (1/f_1)(x) . \]

Definition 4.1.22.
Let $f : D_f \to \mathbb{R}^m$ and $g : D_g \to \mathbb{R}^p$ be vector-valued functions of several variables and $D_g$ be a subset of $\mathbb{R}^m$. We define $g \circ f : D(g \circ f) \to \mathbb{R}^p$ by
\[ D(g \circ f) := \{ x \in D_f : f(x) \in D(g) \} \]
and
\[ (g \circ f)(x) := g(f(x)) \]
for all $x \in D(g \circ f)$.

Theorem 4.1.23. Let $f : D_f \to \mathbb{R}^m$, $g : D_g \to \mathbb{R}^p$ be vector-valued functions of several variables and $D_g$ be a subset of $\mathbb{R}^m$. Moreover, let $x \in D_f$, $f(x) \in D_g$, $f$ be continuous in $x$ and $g$ be continuous in $f(x)$. Then $g \circ f$ is continuous in $x$.

Proof. For this, let $x_1, x_2, \dots$ be a sequence in $D(g \circ f)$ converging to $x$. Then $f(x_1), f(x_2), \dots$ is a sequence in $D_g$. Moreover, since $f$ is continuous in $x$, it follows that
\[ \lim_{\nu \to \infty} f(x_\nu) = f(x) . \]
Finally, since $g$ is continuous in $f(x)$, it follows that
\[ \lim_{\nu \to \infty} (g \circ f)(x_\nu) = \lim_{\nu \to \infty} g(f(x_\nu)) = g(f(x)) = (g \circ f)(x) . \]

Example 4.1.24. In the following, we conclude that $f_5 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ from Example 4.1.7, defined by
\[ f_5(x, y) := \frac{x^2 - y^2}{x^2 + y^2} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$, is continuous. For this, we define for every $i \in \{1, 2\}$ the corresponding projection $p_i : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ of $\mathbb{R}^2 \setminus \{0\}$ onto the $i$-th component by
\[ p_i(x) := x_i \]
for every $x = (x_1, x_2) \in \mathbb{R}^2 \setminus \{0\}$. By Theorem 3.5.47, $p_i$ is continuous, i.e., continuous in every point of its domain $\mathbb{R}^2 \setminus \{0\}$. We arrive at the following representation of $f_5$:
\[ f_5 = [\, p_1 \cdot p_1 + (-1).(p_2 \cdot p_2) \,] \cdot ( 1 / [\, p_1 \cdot p_1 + p_2 \cdot p_2 \,] ) . \]
Hence the continuity of $f_5$ follows by application of Theorems 4.1.19, 4.1.21. Note that another way of concluding the continuity of the second factor
\[ 1 / [\, p_1 \cdot p_1 + p_2 \cdot p_2 \,] \]
is by means of Theorems 4.1.19(i), 4.1.21(i) and 4.1.23, using the continuity of the function $(\mathbb{R} \setminus \{0\} \to \mathbb{R}, x \mapsto 1/x)$ known from Calculus I.

Problems

1) Find the maximal domain $D(f)$ of $f$ such that a) f px, y q f px, y q 5 1 2x2 y 2 , a p3x 2yq2 , c) f px, y q p1{px 1qq p1{y 2 q , d) f px, y q lnpx 3y q , e) f px, y q arccosp2xq lnpxy q , a f) f px, y q px2 y 2 3qp1 x2 y 2 q , a g) f px, y q p x y 2 q1{2 , a ? ?
h) f px, y, z q x 3 y 2 z 1 , i) f px, y, z q arccospxq arccospy q arccosp1 z q , j) f px, y, z q lnpxyz q , a k) f px, y, z q 1 x2 2y 2 4z 2 , l) f px, y, z q arcsinpx 3y 6z q for all px, y q P Dpf q or px, y, z q P Dpf q. Find the maximal domain Dpg q, level sets and range of g such that a) g px, y q x 2y , b) g px, y q 2x3 3y , c) g px, y q y {x2 , d) g px, y q 3x2 5y 2 7 , e) g px, y q 4x2 2y 2 1 , f) g px, y q px{3q 2y 2 6 , b) 2) a 560 g px, y q 3{rpx g) g px, y q h) a i) g px, y q px 2 1qpy 1qs , x 3y , 3q{py 1q , j) g px, y, z q x 5y g px, y, z q 6x 2 k) 2z 3 , 2y l) g px, y, z q x2 y 2 2 z2 3 , 4z 2 9 for all px, y q P Dpg q or px, y, z q P Dpg q. In addition, for the cases a) - i), draw a contour map showing several curves and sketch Gpg q. 3) Where is the function h : D Ñ R continuous and why? In particular, decide whether h is continuous or discontinuous at the origin p0, 0q. Give reasons. x2 xy 2 for px, y q P D : R2 z t0u , a) hpx, y q : 2 x y2 hp0q : 1 , 3xy 2 for px, y q P D : R2 z t0u , x2 y 2 hp0q : 0 , b) hpx, y q : c) hpx, y q : xy 2 for px, y q P D : R2 z t0u , y4 x2 hp0q : 0 , d) hpx, y q : xy x2 hp0q : 0 , x e) hpx, y q : 2 x hp0q : 1 , y2 y for px, y q P D : R2 z t0u , y2 f) hpx, y q : y2 hpx, y q : y x2 hp0q : 1 , g) h) x2 hp0q : 0 , hpx, y q : for px, y q P D : R2 z t0u , y2 for px, y q P D : R2 z t0u , y2 for px, y q P D : R2 z t0u , x2 y 2 for px, y q P D : R2 z tpx, xq : x P Ru , y3 x3 hp0q : 0 , 561 x3 x2 hp0q : 0 , y3 for px, y q P D : R2 z tpx, x2 q : x P Ru , y x3 x hp0q : 0 , y3 for px, y q P D : R2 z tpx, xq : x P Ru , y i) hpx, y q : j) hpx, y q : k) hpx, y q : x2 y 2 for px, y q P D : R2 z t0u , y4 x4 hp0q : 0 , x2 y 2 for px, y q P D : R2 z t0u , x2 y 2 hp0q : 0 , xy xz yz m) hpx, y, z q : a for px, y, z q P D : R3 z t0u , x2 y 2 z 2 l) hpx, y q : hp0q : 0 , n) hpx, y, z q : hp0q : 0 , o) hpx, y, z q : hp0q : 1{2 . 
x2 xyz for px, y, z q P D : R3 z t0u , y2 z2 xy x2 yz y2 z2 for px, y, z q P D : R3 z t0u ,

4) For every $n \in \mathbb{N}^*$, show that $( \, |\cdot| : \mathbb{R}^n \to \mathbb{R}, x \mapsto |x| \, )$ is continuous.

5) Let $p, q \in \mathbb{R}$. Further, define f : R2 z t0u Ñ R by |x|p |y|q for px, yq P D : R2 z t0u , f px, y q : 2 x xy y 2 f p0q : 0 . Find necessary and sufficient conditions on $p$ and $q$ such that $f$ is continuous.

6) Sketch the subsets of $\mathbb{R}^n$ and determine whether they are bounded, unbounded, open, closed and compact. In addition, determine their interior, closure and boundary. a) The intervals I1 : p3, 4q , I2 : r1, 2q , I3 : p1, 3s , I4 : r1, 3s , I5 : p1, 8q , I6 : r0, 8q , I7 : p8, 3s , I8 : p8, 1s , b) the 2-dimensional intervals I9 : tpx, y q : 1 x 4 ^ 0 y 1u , I10 : tpx, y q : 0 ¤ x ¤ 4 ^ 3 ¤ y ¤ 1u , I11 : tpx, y q : 0 x ¤ 4 ^ 3 ¤ y ¤ 1u , I12 : tpx, y q : x ¡ 0 ^ y 3u , I13 : tpx, y q : x ¥ 1 ^ y ¥ 4u , I14 : tpx, y q : x 1 ^ y ¥ 2u , c) the sets S1 : tpx, y q P R2 : xy S2 : tpx, y q P R : 9x ¡ 1u , 36u , S3 : tpx, y q P R : x y ¤ 1u , S4 : tpx, y q P R2 : 3x2 y 2 ¡ 3u , S5 : tpx, y q P R2 : x2 y 2 ¤ 5u , S6 : tpx, y q P R2 : 2px 1q2 y 2 ¤ 3u , S7 : tpx, y q P R2 : x y 2 ¤ 2u , S8 : tpx, y, z q P R3 : x2 2y 2 z 2 ¤ 4u , S9 : tpx, y, z q P R3 : x2 3y 2 2z 2 ¤ 1u , S10 : tpx, y, z q P R3 : 4x2 y 2 z 2 ¡ 2u , S11 : tpx, y, z q P R3 : 9x2 3y 2 4z 2 ¥ 4u , S12 : tpx, y, z q P R3 : x2 4y 2 z 2 9u . 2 2 2 2 4y 2 2

7) Let $n \in \mathbb{N}^*$.
a) Show that the union of any number of open subsets of $\mathbb{R}^n$ and the intersection of a finite number of open subsets of $\mathbb{R}^n$ are open.
b) Show that the intersection of any number of closed subsets of $\mathbb{R}^n$ and the union of a finite number of closed subsets of $\mathbb{R}^n$ are closed.
c) Give an example of an intersection of non-empty open subsets of $\mathbb{R}$ which is non-empty and closed.
d) Give an example of a union of non-empty closed subsets of $\mathbb{R}$ which is open.

8) Let $S, T \subset \mathbb{R}^n$ where $n \in \mathbb{N}^*$.
a) Show that $\partial S$ is closed.
b) Show that $S$ is closed if and only if $\partial S \subset S$.
c) Show that $\partial S = \partial(\mathbb{R}^n \setminus S)$.
d) Show that S̄¯ T . S.
e) Show that $\overline{S \cup T} = \bar{S} \cup \bar{T}$.
f) Show that $\overline{S \cap T} \subset \bar{S} \cap \bar{T}$.
g) Give an example that shows that in general $\overline{S \cap T} \neq \bar{S} \cap \bar{T}$.

9) Let $n, m \in \mathbb{N}^*$ and $f : \mathbb{R}^n \to \mathbb{R}^m$. Show that $f$ is continuous if and only if $f^{-1}(U)$ is open for every open subset $U$ of $\mathbb{R}^m$.

4.2 Derivatives of Vector-valued Functions of Several Variables

In the following, we define derivatives of such functions as linear maps. For the motivation of that definition, we use the guiding principle mentioned in the introduction to Calculus III, i.e., we try to generalize the corresponding definition from Calculus I to vector-valued functions of several variables. In Calculus I, we defined the following. A function $f : (a, b) \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, is differentiable in $x \in (a, b)$ with derivative $c \in \mathbb{R}$ if for all sequences $x_0, x_1, \dots$ in $(a, b) \setminus \{x\}$ which are convergent to $x$ it follows that
\[ \lim_{\nu \to \infty} \frac{f(x_\nu) - f(x)}{x_\nu - x} = c . \tag{4.2.1} \]
If the last is the case, we defined the derivative $f'(x)$ of $f$ in $x$ by
\[ f'(x) := c . \]
In the next step, we investigate whether the defining equation (4.2.1) could also be used in the case of a vector-valued function of several variables $f$. In general, in that case $x_0, x_1, \dots$ is a sequence in $\mathbb{R}^n$, $n \in \mathbb{N}^*$, and $f(x_0) - f(x), f(x_1) - f(x), \dots$ is a sequence in $\mathbb{R}^m$, $m \in \mathbb{N}^*$. We immediately notice therefore that (4.2.1) cannot be directly applied to this situation since division by elements of $\mathbb{R}^n$ is not defined. We try to remedy that by going back to the situation from Calculus I with the goal of rewriting (4.2.1) in an equivalent way such that generalization to the situation of a vector-valued function of several variables is possible. Indeed, (4.2.1) is equivalent to
\[ \lim_{\nu \to \infty} \left[ \frac{f(x_\nu) - f(x)}{x_\nu - x} - c \right] = 0 . \]
Further, since
\[ \left| \frac{f(x_\nu) - f(x)}{x_\nu - x} - c \right| = \left| \frac{f(x_\nu) - f(x) - c \cdot (x_\nu - x)}{x_\nu - x} \right| = \frac{|f(x_\nu) - f(x) - c \cdot (x_\nu - x)|}{|x_\nu - x|} \]
for every $\nu \in \mathbb{N}$, (4.2.1) is equivalent to
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - c \cdot (x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Going back to the situation of a vector-valued function of several variables, we see that there is only one obstacle left for generalization of the last, namely the interpretation of $c$. In this situation, $c$ cannot correspond to a real number in general since $f(x_\nu) - f(x) \in \mathbb{R}^m$ and $x_\nu - x \in \mathbb{R}^n$ for $\nu \in \mathbb{N}$, and in general $n \neq m$. Hence $c$ needs to be a map from $\mathbb{R}^n$ to $\mathbb{R}^m$. In the case of a function in one variable, this map is given by the linear function $\mu_c : \mathbb{R} \to \mathbb{R}$ defined by
\[ \mu_c(x) := c \cdot x \]
for every $x \in \mathbb{R}$. The map $\mu_c$ has the following simple properties:
\[
\begin{aligned}
\mu_c(x + y) &= c \cdot (x + y) = c \cdot x + c \cdot y = \mu_c(x) + \mu_c(y) , \\
\mu_c(\alpha \cdot x) &= c \cdot (\alpha \cdot x) = \alpha \cdot (c \cdot x) = \alpha \cdot \mu_c(x)
\end{aligned}
\]
for all $x, y \in \mathbb{R}$ and $\alpha \in \mathbb{R}$. If on the other hand $\lambda : \mathbb{R} \to \mathbb{R}$ is such that
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha \cdot x) = \alpha \cdot \lambda(x) \tag{4.2.2} \]
for all $x, y \in \mathbb{R}$ and $\alpha \in \mathbb{R}$, then
\[ \lambda(x) = \lambda(x \cdot 1) = x \cdot \lambda(1) = \lambda(1) \cdot x \]
for every $x \in \mathbb{R}$ and hence
\[ \lambda = \mu_{\lambda(1)} . \]
As a consequence, there is a one-to-one correspondence between functions $\lambda$ on $\mathbb{R}$ with the property (4.2.2) and real numbers. In the following, maps $\lambda : \mathbb{R}^n \to \mathbb{R}^m$, where $n, m \in \mathbb{N}^*$, with the property that
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha.x) = \alpha.\lambda(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ are called 'linear'. Such maps are considered next. Subsequently, a map $f$ from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ will be said to be differentiable in $x \in U$ if there is a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$ such that for all sequences $x_1, x_2, \dots$ in $U \setminus \{x\}$ which are convergent to $x$ it follows that
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} = 0 . \]

Definition 4.2.1. (Linear maps) Let $n, m \in \mathbb{N}^*$ and $\lambda : \mathbb{R}^n \to \mathbb{R}^m$. We say that $\lambda$ is linear if
\[ \lambda(x + y) = \lambda(x) + \lambda(y) , \quad \lambda(\alpha x) = \alpha \lambda(x) \]
for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Since in that case
\[ \lambda(x) = \lambda\left( \sum_{j=1}^{n} x_j e^n_j \right) = \sum_{j=1}^{n} x_j \lambda(e^n_j) = \sum_{i=1}^{m} \sum_{j=1}^{n} \Lambda_{ij} x_j e^m_i , \]
where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical basis of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$, $\Lambda_{ij}$ denotes the component of $\lambda(e^n_j)$ in the direction of $e^m_i$, such $\lambda$ is determined by its values on the canonical basis of $\mathbb{R}^n$.
On the other hand, obviously, if $(\Lambda_{ij})_{(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}}$ is a given family of real numbers, then by
\[ \lambda(x) := \sum_{i=1}^{m} \sum_{j=1}^{n} \Lambda_{ij} x_j e^m_i \]
for all $x \in \mathbb{R}^n$, there is defined a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$. Interpreting the elements of $\mathbb{R}^n$ and $\mathbb{R}^m$ as column vectors and defining the $m \times n$ matrix $\Lambda$ by
\[ \Lambda := \begin{pmatrix} \Lambda_{11} & \cdots & \Lambda_{1n} \\ \vdots & & \vdots \\ \Lambda_{m1} & \cdots & \Lambda_{mn} \end{pmatrix} , \]
the last is equivalent to
\[ \lambda(x) := \Lambda \cdot x = \begin{pmatrix} \Lambda_{11} & \cdots & \Lambda_{1n} \\ \vdots & & \vdots \\ \Lambda_{m1} & \cdots & \Lambda_{mn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} , \]
where the multiplication sign denotes a particular case of matrix multiplication defined by
\[ (\Lambda \cdot x)_i := \sum_{j=1}^{n} \Lambda_{ij} x_j \]
for every $x \in \mathbb{R}^n$ and $i = 1, \dots, m$. In this case, we call $\Lambda$ the representation matrix of $\lambda$ with respect to the bases $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$.

Definition 4.2.2. (Differentiability) A vector-valued function of several variables $f$ from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ is said to be differentiable in $x \in U$ if there is a linear map $\lambda : \mathbb{R}^n \to \mathbb{R}^m$ such that for all sequences $x_1, x_2, \dots$ in $U \setminus \{x\}$ which are convergent to $x$:
\[ \lim_{\nu \to \infty} \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} = 0 . \]
Since in that case, it follows that
\[
\begin{aligned}
|f(x_\nu) - f(x)| &= \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \, |x_\nu - x| \\
&= \left| \frac{f(x_\nu) - f(x) - \lambda(x_\nu - x)}{|x_\nu - x|} + \frac{\lambda(x_\nu - x)}{|x_\nu - x|} \right| \, |x_\nu - x| \\
&\leq \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} \, |x_\nu - x| + |\lambda(x_\nu - x)| \\
&\leq \frac{|f(x_\nu) - f(x) - \lambda(x_\nu - x)|}{|x_\nu - x|} \, |x_\nu - x| + \sum_{i=1}^{m} \sum_{j=1}^{n} |\Lambda_{ij}| \cdot |x_\nu - x|
\end{aligned}
\]
for every $\nu \in \mathbb{N}^*$, where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical basis of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$, $\Lambda_{ij}$ denotes the component of $\lambda(e^n_j)$ in the direction of $e^m_i$, the differentiability of $f$ in $x$ also implies the continuity of $f$ in $x$.

Example 4.2.3. Define $f_6 : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f_6(x, y) := 2x^2 + y^2 \]
for all $x, y \in \mathbb{R}$. Then $f_6$ is differentiable, in particular, at the point $(1, 1)$. This can be seen as follows: For $x, y \in \mathbb{R} \setminus \{1\}$, we calculate:
\[
\begin{aligned}
f_6(x, y) &= 2x^2 + y^2 = f_6(1, 1) + 2x^2 + y^2 - 3 \\
&= f_6(1, 1) + 2[(x - 1)^2 + 2(x - 1)] + (y - 1)^2 + 2(y - 1) \\
&= f_6(1, 1) + 4(x - 1) + 2(y - 1) + 2(x - 1)^2 + (y - 1)^2 .
\end{aligned}
\]
Hence
\[ f_6(x, y) - f_6(1, 1) - 4(x - 1) - 2(y - 1) = 2(x - 1)^2 + (y - 1)^2 \]
and
\[ \frac{|f_6(x, y) - f_6(1, 1) - 4(x - 1) - 2(y - 1)|}{|(x, y) - (1, 1)|} = \frac{2(x - 1)^2 + (y - 1)^2}{\sqrt{(x - 1)^2 + (y - 1)^2}} \leq 2|x - 1| + |y - 1| . \]
Hence for every sequence $(x_1, y_1), (x_2, y_2), \dots$ in $\mathbb{R}^2 \setminus \{(1, 1)\}$ which is convergent to $(1, 1)$:
\[ \lim_{\nu \to \infty} \frac{|f_6(x_\nu, y_\nu) - f_6(1, 1) - 4(x_\nu - 1) - 2(y_\nu - 1)|}{|(x_\nu, y_\nu) - (1, 1)|} = 0 . \]
As a consequence, a linear map $\lambda : \mathbb{R}^2 \to \mathbb{R}$ satisfying the conditions of Definition 4.2.2 is given by
\[ \lambda(x, y) := 4x + 2y \]
for all $(x, y) \in \mathbb{R}^2$. The plane (see Fig. 160)
\[ z = f_6(1, 1) + \lambda(x - 1, y - 1) = 4x + 2y - 3 , \]
$x, y \in \mathbb{R}$, is called the tangent plane of the graph of $f_6$ at the point $(1, 1)$.

Fig. 160: Graph of $f_6$ together with its tangent plane at $(1, 1)$.

In Calculus I, we already defined partial derivatives of functions in several variables. The following gives a natural generalization to vector-valued functions of several variables.

Definition 4.2.4. (Partial differentiability) A vector-valued function of several variables $f$ from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ is said to be partially differentiable in the $i$-th coordinate, where $i \in \{1, \dots, n\}$, at some $x \in U$ if for all $j \in \{1, \dots, m\}$ the corresponding real-valued function of one real variable $f_j(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)$ is differentiable at $x_i$ in the sense of Calculus I. In that case, we define:
\[ \frac{\partial f}{\partial x_i}(x) := \left( [f_1(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i), \dots, [f_m(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) \right) . \]
If $f$ is partially differentiable in the $i$-th coordinate direction at every $x \in U$, we call $f$ partially differentiable in the $i$-th coordinate direction and denote by $\partial f / \partial x_i$ the map which associates to every $x \in U$ the corresponding $(\partial f / \partial x_i)(x)$. Partial derivatives of $f$ of higher order are defined recursively. If $\partial f / \partial x_i$ is partially differentiable in the $j$-th coordinate direction, where $j \in \{1, \dots, n\}$, we denote the partial derivative of $\partial f / \partial x_i$ in the $j$-th coordinate direction by
\[ \frac{\partial^2 f}{\partial x_j \partial x_i} . \]
Such is called a partial derivative of $f$ of second order. In the case $j = i$, we set
\[ \frac{\partial^2 f}{\partial x_i^2} := \frac{\partial^2 f}{\partial x_i \partial x_i} . \]
Partial derivatives of $f$ of higher order than two are defined accordingly.

Example 4.2.5. Define $f_7 : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f_7(x, y) := x^3 + x^2 y^3 - 2y^2 \]
for all $x, y \in \mathbb{R}$. Find
\[ \frac{\partial f_7}{\partial x}(2, 1) \quad \text{and} \quad \frac{\partial f_7}{\partial y}(2, 1) . \]
Solution: We have
\[ f_7(x, 1) = x^3 + x^2 - 2 \quad \text{and} \quad f_7(2, y) = 8 + 4y^3 - 2y^2 \]
for all $x, y \in \mathbb{R}$. Hence it follows that
\[ \frac{\partial f_7}{\partial x}(x, 1) = 3x^2 + 2x , \quad \frac{\partial f_7}{\partial y}(2, y) = 12y^2 - 4y , \]
$x, y \in \mathbb{R}$, and, finally, that
\[ \frac{\partial f_7}{\partial x}(2, 1) = 16 \quad \text{and} \quad \frac{\partial f_7}{\partial y}(2, 1) = 8 . \]

Example 4.2.6. Define $f : \mathbb{R}^3 \to \mathbb{R}$ by
\[ f(x, y, z) := x^2 y^3 z + 3x + 4y + 6z + 5 \]
for all $x, y, z \in \mathbb{R}$. Find
\[ \frac{\partial f}{\partial x}(x, y, z) , \quad \frac{\partial f}{\partial y}(x, y, z) \quad \text{and} \quad \frac{\partial f}{\partial z}(x, y, z) \]
for all $x, y, z \in \mathbb{R}$.
Solution: Since in partial differentiation with respect to one variable all other variables are held constant, we conclude that
\[ \frac{\partial f}{\partial x}(x, y, z) = 2x y^3 z + 3 , \quad \frac{\partial f}{\partial y}(x, y, z) = 3x^2 y^2 z + 4 , \quad \frac{\partial f}{\partial z}(x, y, z) = x^2 y^3 + 6 \]
for all $x, y, z \in \mathbb{R}$.

The following example shows that, in contrast to a function that is differentiable in a point of its domain, a function that is partially differentiable in a point of its domain is not necessarily continuous in that point.

Example 4.2.7. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) := \begin{cases} xy / (x^2 + y^2) & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x, y) = 0 . \end{cases} \]
Then
\[ \lim_{n \to \infty} f(1/n, a/n) = \lim_{n \to \infty} \frac{a / n^2}{(1 + a^2) / n^2} = \frac{a}{1 + a^2} \]
for every $a \in \mathbb{R}$ and hence $f$ is discontinuous in the origin. On the other hand,
\[ \lim_{h \to 0, h \neq 0} \frac{f(h, 0) - f(0, 0)}{h} = 0 , \quad \lim_{h \to 0, h \neq 0} \frac{f(0, h) - f(0, 0)}{h} = 0 \]
and hence
\[ \frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = 0 . \]

Fig. 161: Graph and contour map of $f$ from Example 4.2.7. In the latter, darker colors correspond to lower values of $f$.

There are two loose ends here. First, we have not yet shown that the linear map occurring in the definition of differentiability of vector-valued functions in several variables is unique. Second, the relation between the notions of differentiability and partial differentiability of such functions is still unclear. Both will be settled by the next theorem.

Theorem 4.2.8.
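Let $f$ be a vector-valued function of several variables from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$. Furthermore, let $f$ be differentiable in $\mathbf{x}$ and $\lambda : \mathbb{R}^n \to \mathbb{R}^m$ be some linear map such that for all sequences $\mathbf{x}_1, \mathbf{x}_2, \dots$ in $U \setminus \{\mathbf{x}\}$ which are convergent to $\mathbf{x}$:
\[
\lim_{\nu \to \infty} \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - \lambda(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0 .
\]
Then $f$ is partially differentiable at $\mathbf{x}$ in the $i$-th coordinate with
\[
\lambda(\mathbf{e}_i) = \frac{\partial f}{\partial x_i}(\mathbf{x})
\]
for all $i \in \{1,\dots,n\}$. In particular, it follows that
\[
\lambda(\mathbf{y}) = y_1 \frac{\partial f}{\partial x_1}(\mathbf{x}) + \dots + y_n \frac{\partial f}{\partial x_n}(\mathbf{x}) ,
\]
for all $\mathbf{y} \in \mathbb{R}^n$.

Proof. Let $i \in \{1,\dots,n\}$ and $t_1, t_2, \dots$ be some null sequence in $\mathbb{R}^*$. Then the sequence $\mathbf{x}_1, \mathbf{x}_2, \dots$, defined by
\[
\mathbf{x}_\nu := \mathbf{x} + t_\nu . \mathbf{e}_i
\]
for all $\nu \in \mathbb{N}^*$, converges to $\mathbf{x}$. Also its members are contained in $U$ for large enough $\nu$. For such $\nu$, it follows that
\[
\begin{aligned}
\lim_{\nu \to \infty} \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - \lambda(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
&= \lim_{\nu \to \infty} \frac{|f(x_1,\dots,x_{i-1},x_i + t_\nu,x_{i+1},\dots,x_n) - f(\mathbf{x}) - t_\nu . \lambda(\mathbf{e}_i)|}{|t_\nu|} \\
&= \lim_{\nu \to \infty} \left| \frac{f(x_1,\dots,x_{i-1},x_i + t_\nu,x_{i+1},\dots,x_n) - f(\mathbf{x})}{t_\nu} - \lambda(\mathbf{e}_i) \right| = 0 .
\end{aligned}
\]
Hence, since this is true for all null sequences $t_1, t_2, \dots$ in $\mathbb{R}^*$, the statement of the theorem follows.

As a result of the previous theorem, we can now define the following.

Definition 4.2.9. (Derivatives of vector-valued functions in several variables) Let $f$ be a vector-valued function of several variables from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$. In addition, let $f$ be differentiable in $\mathbf{x} \in U$, and let $\lambda$ be as in Definition 4.2.2. According to Theorem 4.2.8, $\lambda$ is uniquely defined by the properties stated in Definition 4.2.2.

(i) We define the derivative $f'(\mathbf{x})$ of $f$ at $\mathbf{x}$ by
\[
f'(\mathbf{x}) := \lambda .
\]
According to Theorem 4.2.8, $\lambda$ is given by:
\[
f'(\mathbf{x})(\mathbf{y}) = y_1 \frac{\partial f}{\partial x_1}(\mathbf{x}) + \dots + y_n \frac{\partial f}{\partial x_n}(\mathbf{x}) ,
\]
for all $\mathbf{y} \in \mathbb{R}^n$. Note that if the elements of $\mathbb{R}^n$, $\mathbb{R}^m$ are interpreted as column vectors, the representation matrix of $f'(\mathbf{x})$ with respect to the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$ is given by
\[
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(\mathbf{x}) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{x}) \\
\vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(\mathbf{x}) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{x})
\end{pmatrix}
\]
where $f_1, \dots, f_m$ are the component functions of $f$.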
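(ii) We call the function $p_1 : \mathbb{R}^n \to \mathbb{R}^m$ defined by
\[
p_1(\mathbf{y}) := f(\mathbf{x}) + f'(\mathbf{x})(\mathbf{y} - \mathbf{x}) = f(\mathbf{x}) + (y_1 - x_1) \frac{\partial f}{\partial x_1}(\mathbf{x}) + \dots + (y_n - x_n) \frac{\partial f}{\partial x_n}(\mathbf{x})
\]
for all $\mathbf{y} \in \mathbb{R}^n$, the Taylor polynomial of $f$ of total degree $\leq 1$ at $\mathbf{x}$.

(iii) If $f$ is in addition real-valued, we call the graph of $p_1$ the tangent plane to $G(f)$ in the point $(\mathbf{x}, f(\mathbf{x}))$.

Important special cases of the previous definition are given in the following example.

Example 4.2.10.

(i) Let $f$ be a vector-valued function of several variables from some open interval $I$ in $\mathbb{R}$ into $\mathbb{R}^m$ which is differentiable at some $t \in I$. Then
\[
[f'(t)](1) = ((f_1)'(t), \dots, (f_m)'(t))
\]
where the derivatives on the right hand side are in the sense of Calculus I.

(ii) Let $f$ be a function of several variables from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}$ which is differentiable at some point $\mathbf{x} \in U$. Then
\[
[f'(\mathbf{x})](\mathbf{y}) = [(\mathbf{y} \cdot \nabla) f](\mathbf{x})
\]
for all $\mathbf{y} \in \mathbb{R}^n$ where the gradient of $f$ in $\mathbf{x}$, $(\nabla f)(\mathbf{x})$, is defined by
\[
(\nabla f)(\mathbf{x}) := \left( \frac{\partial f}{\partial x_1}(\mathbf{x}), \dots, \frac{\partial f}{\partial x_n}(\mathbf{x}) \right)
\]
and
\[
[(\mathbf{y} \cdot \nabla) f](\mathbf{x}) := \mathbf{y} \cdot (\nabla f)(\mathbf{x})
\]
for every $\mathbf{y} \in \mathbb{R}^n$.

Example 4.2.11. (Basic examples of differentiable functions) Let $n, m \in \mathbb{N}^*$.

(i) Constant vector-valued functions on $\mathbb{R}^n$ are differentiable with zero derivative.

(ii) Any linear map from $\mathbb{R}^n$ into $\mathbb{R}^m$ is differentiable and its derivative is given by the same linear map at any $\mathbf{x} \in \mathbb{R}^n$.

The following criterion for differentiability is usually sufficient for applications.

Theorem 4.2.12. (A sufficient criterion for differentiability) Let $f$ be a function of several variables from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}$. Moreover let $f$ be partially differentiable in all coordinates, and let those partial derivatives define continuous functions on $U$. Then $f$ is differentiable.

Proof. For $\mathbf{x} \in U$ and $\mathbf{y} \in U \setminus \{\mathbf{x}\}$, it follows by the mean value theorem for functions of one real variable, Theorem 2.5.6, that
\[
\begin{aligned}
f(\mathbf{y}) - f(\mathbf{x}) &= f(y_1, y_2, \dots, y_n) - f(x_1, y_2, \dots, y_n) \\
&\quad + f(x_1, y_2, \dots, y_n) - f(x_1, x_2, \dots, y_n) \\
&\quad + \dots + f(x_1, x_2, \dots, x_{n-1}, y_n) - f(x_1, x_2, \dots, x_n) \\
&= \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) \, (y_1 - x_1) + \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) \, (y_2 - x_2) \\
&\quad + \dots + \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) \, (y_n - x_n)
\end{aligned}
\]
where for each $i \in \{1,\dots,n\}$ the corresponding $c_i$ is some element of the closed interval between $x_i$ and $y_i$. Hence
\[
\begin{aligned}
& f(\mathbf{y}) - f(\mathbf{x}) - (y_1 - x_1) \frac{\partial f}{\partial x_1}(\mathbf{x}) - \dots - (y_n - x_n) \frac{\partial f}{\partial x_n}(\mathbf{x}) \\
&= \left[ \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) - \frac{\partial f}{\partial x_1}(\mathbf{x}) \right] (y_1 - x_1)
+ \left[ \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) - \frac{\partial f}{\partial x_2}(\mathbf{x}) \right] (y_2 - x_2) \\
&\quad + \dots + \left[ \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) - \frac{\partial f}{\partial x_n}(\mathbf{x}) \right] (y_n - x_n)
\end{aligned}
\]
and
\[
\begin{aligned}
& \frac{\left| f(\mathbf{y}) - f(\mathbf{x}) - (y_1 - x_1) \frac{\partial f}{\partial x_1}(\mathbf{x}) - \dots - (y_n - x_n) \frac{\partial f}{\partial x_n}(\mathbf{x}) \right|}{|\mathbf{y} - \mathbf{x}|} \\
&\leq \left| \frac{\partial f}{\partial x_1}(c_1, y_2, \dots, y_n) - \frac{\partial f}{\partial x_1}(\mathbf{x}) \right|
+ \left| \frac{\partial f}{\partial x_2}(x_1, c_2, \dots, y_n) - \frac{\partial f}{\partial x_2}(\mathbf{x}) \right| \\
&\quad + \dots + \left| \frac{\partial f}{\partial x_n}(x_1, x_2, \dots, x_{n-1}, c_n) - \frac{\partial f}{\partial x_n}(\mathbf{x}) \right| .
\end{aligned}
\]
Hence, by the continuity of the partial derivatives of $f$, it follows the differentiability of $f$ in $\mathbf{x}$ and
\[
f'(\mathbf{x})(\mathbf{y}) = y_1 \frac{\partial f}{\partial x_1}(\mathbf{x}) + \dots + y_n \frac{\partial f}{\partial x_n}(\mathbf{x}) ,
\]
for all $\mathbf{y} \in \mathbb{R}^n$. Finally, since $\mathbf{x}$ was otherwise arbitrary, it follows the differentiability of $f$ on $U$.

Example 4.2.13. (A continuous and partially differentiable function that is not differentiable) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f(x,y) := \begin{cases} (3x^2 y - y^3)/(x^2 + y^2) & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x,y) = 0 . \end{cases}
\]
As a consequence of Theorem 4.2.12, the restriction of $f$ to $\mathbb{R}^2 \setminus \{0\}$ is differentiable. In addition, since
\[
|3x^2 y - y^3| = |y| \cdot |3x^2 - y^2| \leq 3 |y| (x^2 + y^2)
\]
for every $(x,y) \in \mathbb{R}^2 \setminus \{0\}$, it follows that $f$ is continuous at the origin. Further,
\[
\lim_{h \to 0, h \neq 0} \frac{f(h,0) - f(0,0)}{h} = 0 , \qquad
\lim_{h \to 0, h \neq 0} \frac{f(0,h) - f(0,0)}{h} = \lim_{h \to 0, h \neq 0} \frac{-h}{h} = -1
\]
and hence
\[
\frac{\partial f}{\partial x}(0,0) = 0 , \quad \frac{\partial f}{\partial y}(0,0) = -1 .
\]

Fig. 162: Graph and contour map of $f$ from Example 4.2.13. In the last, darker colors correspond to lower values of $f$.

We lead the assumption that $f$ is differentiable in the origin to a contradiction. If $f$ is differentiable in the origin, it follows by Theorem 4.2.8 that
\[
f'(0)(\mathbf{h}) = \frac{\partial f}{\partial x}(0) \, h_1 + \frac{\partial f}{\partial y}(0) \, h_2 = -h_2
\]
for every $\mathbf{h} = (h_1, h_2) \in \mathbb{R}^2$. Hence it follows for every sequence $\mathbf{h}_1, \mathbf{h}_2, \dots$, $\mathbf{h}_\nu = (h_{1\nu}, h_{2\nu})$, in $\mathbb{R}^2 \setminus \{0\}$ that is convergent to $0$ that
\[
\lim_{\nu \to \infty} \frac{|f(\mathbf{h}_\nu) + h_{2\nu}|}{|\mathbf{h}_\nu|}
= \lim_{\nu \to \infty} \frac{1}{|\mathbf{h}_\nu|} \left| \frac{3 h_{1\nu}^2 h_{2\nu} - h_{2\nu}^3}{|\mathbf{h}_\nu|^2} + h_{2\nu} \right|
= \lim_{\nu \to \infty} \frac{4 h_{1\nu}^2 |h_{2\nu}|}{|\mathbf{h}_\nu|^3} = 0 .
\]
But in the case that $h_{1\nu} := h_{2\nu} := 1/\nu$ for all $\nu \in \mathbb{N}^*$, it follows that
\[
\lim_{\nu \to \infty} \frac{4 h_{1\nu}^2 |h_{2\nu}|}{|\mathbf{h}_\nu|^3} = \sqrt{2} .
\]
Hence $f$ is not differentiable in the origin. Compare Fig. 162 which indicates that there is no tangential plane to $G(f)$ in the origin. Note that
\[
\frac{\partial f}{\partial x}(x,y) = \frac{8 x y^3}{(x^2 + y^2)^2}
\]
for $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Hence $\partial f / \partial x$ is discontinuous in the origin, and Theorem 4.2.12 is not applicable to $f$.

The following classes of functions appear frequently in applications.

Definition 4.2.14. We say that a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ is of class $C^p$ for some $p \in \mathbb{N}^*$, if it is partially differentiable to all orders up to $p$, inclusively, and if all those partial derivatives define continuous functions on $U$. This includes partial derivatives of the order zero, i.e., that function itself is continuous. A real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ is said to be $C^\infty$ if it is of class $C^p$ for all $p \in \mathbb{N}^*$.

Remark 4.2.15. As a consequence of the previous definition, Theorem 4.2.12 can be restated as saying that every real-valued function which is defined on some open subset of $\mathbb{R}^n$ and which is of class $C^1$ is also differentiable.

Definition 4.2.16. (Gradient operator) Let $n \in \mathbb{N}^*$. We define for every real-valued function $f$ which is defined as well as partially differentiable in all coordinate directions on some open subset $U$ of $\mathbb{R}^n$
\[
(\nabla f)(\mathbf{x}) := \left( \frac{\partial f}{\partial x_1}(\mathbf{x}), \dots, \frac{\partial f}{\partial x_n}(\mathbf{x}) \right) \tag{4.2.3}
\]
for all $\mathbf{x}$ from its domain. We call the map $\nabla$ which associates to every such $f$ the corresponding $\nabla f$, the gradient operator.

Example 4.2.17. Define $f_8 : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f_8(x,y) := x^3 + x^2 y^3 - 2y^2
\]
for $x, y \in \mathbb{R}$. Find the second partial derivatives of $f_8$.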
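Solution: We have for all $x, y \in \mathbb{R}$:
\[
\begin{aligned}
\frac{\partial f_8}{\partial x}(x,y) &= 3x^2 + 2xy^3 , & \frac{\partial f_8}{\partial y}(x,y) &= 3x^2 y^2 - 4y , \\
\frac{\partial^2 f_8}{\partial x^2}(x,y) &= 6x + 2y^3 , & \frac{\partial^2 f_8}{\partial y^2}(x,y) &= 6x^2 y - 4 , \\
\frac{\partial^2 f_8}{\partial x \partial y}(x,y) &= 6xy^2 , & \frac{\partial^2 f_8}{\partial y \partial x}(x,y) &= 6xy^2 .
\end{aligned}
\]
We notice that the second mixed partial derivatives in the last example were identical. This is true for a large class of functions.

Theorem 4.2.18. (H. A. Schwarz, 1843 - 1921) Let $f$ be some real-valued function on some open subset $U$ of $\mathbb{R}^n$ which is of class $C^2$. Then
\[
\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \tag{4.2.4}
\]
for $i, j \in \{1,\dots,n\}$.

Proof. If $i = j$ the statement is trivially satisfied. For $i \neq j$, $\mathbf{x} \in U$ and sufficiently small $h_i, h_j \neq 0$, it follows by the mean value theorem for functions of one variable, Theorem 2.5.6, that there are $s_i, t_i$ in the open interval between $0$ and $h_i$ and $s_j, t_j$ in the open interval between $0$ and $h_j$ such that
\[
\begin{aligned}
& f(\mathbf{x} + h_j.\mathbf{e}_j + h_i.\mathbf{e}_i) - f(\mathbf{x} + h_i.\mathbf{e}_i) - f(\mathbf{x} + h_j.\mathbf{e}_j) + f(\mathbf{x}) \\
&= h_i \left[ \frac{\partial f}{\partial x_i}(\mathbf{x} + s_i.\mathbf{e}_i + h_j.\mathbf{e}_j) - \frac{\partial f}{\partial x_i}(\mathbf{x} + s_i.\mathbf{e}_i) \right]
= h_i h_j \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x} + s_i.\mathbf{e}_i + s_j.\mathbf{e}_j)
\end{aligned}
\]
and
\[
\begin{aligned}
& f(\mathbf{x} + h_i.\mathbf{e}_i + h_j.\mathbf{e}_j) - f(\mathbf{x} + h_j.\mathbf{e}_j) - f(\mathbf{x} + h_i.\mathbf{e}_i) + f(\mathbf{x}) \\
&= h_j \left[ \frac{\partial f}{\partial x_j}(\mathbf{x} + t_j.\mathbf{e}_j + h_i.\mathbf{e}_i) - \frac{\partial f}{\partial x_j}(\mathbf{x} + t_j.\mathbf{e}_j) \right]
= h_i h_j \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x} + t_i.\mathbf{e}_i + t_j.\mathbf{e}_j) .
\end{aligned}
\]
Since $h_i, h_j$ are otherwise arbitrary, from this and the continuity of
\[
\frac{\partial^2 f}{\partial x_i \partial x_j} , \quad \frac{\partial^2 f}{\partial x_j \partial x_i}
\]
follows (4.2.4) and hence, finally, the theorem.

Fig. 163: Graph of $f$ from Example 4.2.19.

Example 4.2.19. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f(x,y) := \begin{cases} xy (x^2 - y^2)/(x^2 + y^2) & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x,y) = 0 . \end{cases}
\]
Then
\[
\begin{aligned}
\frac{\partial f}{\partial x}(0,y) &= \lim_{h \to 0, h \neq 0} \frac{f(h,y) - f(0,y)}{h} = \lim_{h \to 0, h \neq 0} y \, \frac{h^2 - y^2}{h^2 + y^2} = -y , \\
\frac{\partial f}{\partial y}(x,0) &= \lim_{h \to 0, h \neq 0} \frac{f(x,h) - f(x,0)}{h} = \lim_{h \to 0, h \neq 0} x \, \frac{x^2 - h^2}{x^2 + h^2} = x
\end{aligned}
\]
for all $x, y \in \mathbb{R}$. Further,
\[
\begin{aligned}
\frac{\partial^2 f}{\partial x \partial y}(0,0) &= \lim_{h \to 0, h \neq 0} \frac{1}{h} \left[ \frac{\partial f}{\partial y}(h,0) - \frac{\partial f}{\partial y}(0,0) \right] = 1 , \\
\frac{\partial^2 f}{\partial y \partial x}(0,0) &= \lim_{h \to 0, h \neq 0} \frac{1}{h} \left[ \frac{\partial f}{\partial x}(0,h) - \frac{\partial f}{\partial x}(0,0) \right] = -1
\end{aligned}
\]
and hence
\[
\frac{\partial^2 f}{\partial x \partial y}(0,0) \neq \frac{\partial^2 f}{\partial y \partial x}(0,0) .
\]
Note that
\[
\frac{\partial^2 f}{\partial x^2}(x,y) = - \frac{4 x y^3 (x^2 - 3y^2)}{(x^2 + y^2)^3}
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$.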
The function that associates the right hand side of the last equation to every $(x,y) \in \mathbb{R}^2 \setminus \{0\}$ cannot be extended to a continuous function on $\mathbb{R}^2$. Hence $f$ is not of class $C^2$ and Theorem 4.2.18 is not applicable to $f$.

Fig. 164: Contour maps of $\partial f / \partial x$ and $\partial f / \partial y$ from Example 4.2.19. Darker colors correspond to lower values of the functions.

The following Laplace operator appears frequently in partial differential equations from applications.

Definition 4.2.20. (Laplace operator, Laplace equation) Let $n \in \mathbb{N}^*$. We define for every real-valued function $f$ which is defined on some open subset $U$ of $\mathbb{R}^n$ and twice partially differentiable in every coordinate direction
\[
\triangle f := \sum_{i=1}^{n} \frac{\partial^2 f}{\partial x_i^2} . \tag{4.2.5}
\]
We call the map $\triangle$ which associates to every such $f$ the corresponding $\triangle f$ the Laplace operator. In particular, if such $f$ is mapped into the zero function defined on the domain of $f$, $f$ is called a solution of the Laplace equation
\[
\triangle f = 0 .
\]
Note that the zero on the right hand side denotes the function of value zero defined on the domain of $f$.

Fig. 165: Graph and contour map of $f$ from Example 4.2.21. In the last, darker colors correspond to lower values of $f$.

Example 4.2.21. (A solution of Laplace's equation) Define $f : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ by
\[
f(x,y) := \frac{x}{x^2 + y^2}
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Then
\[
\begin{aligned}
\frac{\partial f}{\partial x}(x,y) &= \frac{x^2 + y^2 - 2x^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2} , \qquad
\frac{\partial f}{\partial y}(x,y) = - \frac{2xy}{(x^2 + y^2)^2} , \\
\frac{\partial^2 f}{\partial x^2}(x,y) &= \frac{-2x (x^2 + y^2)^2 - 4x (y^2 - x^2)(x^2 + y^2)}{(x^2 + y^2)^4}
= \frac{-2x (x^2 + y^2) - 4x (y^2 - x^2)}{(x^2 + y^2)^3} = - 2x \, \frac{3y^2 - x^2}{(x^2 + y^2)^3} , \\
\frac{\partial^2 f}{\partial y^2}(x,y) &= \frac{-2x (x^2 + y^2)^2 + 8xy^2 (x^2 + y^2)}{(x^2 + y^2)^4}
= \frac{-2x (x^2 + y^2) + 8xy^2}{(x^2 + y^2)^3} = - 2x \, \frac{x^2 - 3y^2}{(x^2 + y^2)^3} = - \frac{\partial^2 f}{\partial x^2}(x,y)
\end{aligned}
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Hence $f$ is a solution of Laplace's equation.

The differentiation of vector-valued functions of several variables follows rules analogous to the case known from Calculus I.
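So there is a sum rule, a rule for scalar multiples, a product rule, a quotient rule and a chain rule. The corresponding proofs are analogous to those from Calculus I.

Theorem 4.2.22. (Rules of differentiation) Let $f, g$ be two differentiable vector-valued functions of several variables from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ and $a \in \mathbb{R}$.

(i) Then $f + g$ and $a.f$ are differentiable and
\[
(f + g)'(\mathbf{x}) = f'(\mathbf{x}) + g'(\mathbf{x}) , \quad (a.f)'(\mathbf{x}) = a.f'(\mathbf{x})
\]
for all $\mathbf{x} \in U$.

(ii) If $f, g$ are both real-valued, then $f \cdot g$ is differentiable and
\[
(f \cdot g)'(\mathbf{x}) = f(\mathbf{x}).g'(\mathbf{x}) + g(\mathbf{x}).f'(\mathbf{x})
\]
for all $\mathbf{x} \in U$.

(iii) If $f$ is real-valued and non-vanishing, then $1/f$ is differentiable and
\[
\left( \frac{1}{f} \right)'(\mathbf{x}) = - \frac{1}{[f(\mathbf{x})]^2} . f'(\mathbf{x})
\]
for all $\mathbf{x} \in U$.

Proof. For this, let $\mathbf{x} \in U$ and $\mathbf{x}_1, \mathbf{x}_2, \dots$ be some sequence in $U \setminus \{\mathbf{x}\}$ which is convergent to $\mathbf{x}$. Then:
\[
\begin{aligned}
& \frac{|(f + g)(\mathbf{x}_\nu) - (f + g)(\mathbf{x}) - (f'(\mathbf{x}) + g'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \\
&\leq \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
+ \frac{|g(\mathbf{x}_\nu) - g(\mathbf{x}) - g'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
\end{aligned}
\]
and
\[
\frac{|(a.f)(\mathbf{x}_\nu) - (a.f)(\mathbf{x}) - [a.f'(\mathbf{x})](\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
= |a| \cdot \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
\]
and hence
\[
\lim_{\nu \to \infty} \frac{|(f + g)(\mathbf{x}_\nu) - (f + g)(\mathbf{x}) - (f'(\mathbf{x}) + g'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0
\]
and
\[
\lim_{\nu \to \infty} \frac{|(a.f)(\mathbf{x}_\nu) - (a.f)(\mathbf{x}) - [a.f'(\mathbf{x})](\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0 .
\]
If $f$ and $g$ are real-valued, it follows that
\[
\begin{aligned}
& \frac{|(f \cdot g)(\mathbf{x}_\nu) - (f \cdot g)(\mathbf{x}) - (f(\mathbf{x}).g'(\mathbf{x}) + g(\mathbf{x}).f'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \\
&\leq \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \cdot |g(\mathbf{x})|
+ |f(\mathbf{x})| \cdot \frac{|g(\mathbf{x}_\nu) - g(\mathbf{x}) - g'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \\
&\quad + \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \cdot |g(\mathbf{x}_\nu) - g(\mathbf{x})|
\end{aligned}
\]
and hence that
\[
\lim_{\nu \to \infty} \frac{|(f \cdot g)(\mathbf{x}_\nu) - (f \cdot g)(\mathbf{x}) - (f(\mathbf{x}).g'(\mathbf{x}) + g(\mathbf{x}).f'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0 .
\]
If $f$ is real-valued and non-vanishing, it follows that
\[
\begin{aligned}
& \frac{\left| \frac{1}{f(\mathbf{x}_\nu)} - \frac{1}{f(\mathbf{x})} + \frac{1}{[f(\mathbf{x})]^2} . f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x}) \right|}{|\mathbf{x}_\nu - \mathbf{x}|} \\
&\leq \frac{1}{|f(\mathbf{x})|^2} \cdot \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x}) - f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|}
+ \frac{|f(\mathbf{x}_\nu) - f(\mathbf{x})|^2}{|f(\mathbf{x}_\nu)| \cdot |f(\mathbf{x})|^2 \cdot |\mathbf{x}_\nu - \mathbf{x}|}
\end{aligned}
\]
and hence that
\[
\lim_{\nu \to \infty} \frac{\left| \frac{1}{f(\mathbf{x}_\nu)} - \frac{1}{f(\mathbf{x})} + \frac{1}{[f(\mathbf{x})]^2} . f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x}) \right|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0 .
\]
Hence, since
\[
f'(\mathbf{x}) + g'(\mathbf{x}) , \quad a.f'(\mathbf{x}) , \quad f(\mathbf{x}).g'(\mathbf{x}) + g(\mathbf{x}).f'(\mathbf{x}) , \quad - \frac{1}{[f(\mathbf{x})]^2} . f'(\mathbf{x})
\]
are all linear maps, and $\mathbf{x}_1, \mathbf{x}_2, \dots$ and $\mathbf{x} \in U$ were otherwise arbitrary, finally, the theorem follows.

Theorem 4.2.23.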
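(Chain rule) Let $f : U \to \mathbb{R}^m$, $g : V \to \mathbb{R}^l$ be differentiable vector-valued functions of several variables defined on some open subsets $U$ of $\mathbb{R}^n$ and $V$ of $\mathbb{R}^m$, respectively, and such that the domain of the composition $g \circ f$ is non trivial. Then $g \circ f$ is differentiable with
\[
(g \circ f)'(\mathbf{x}) = g'(f(\mathbf{x})) \circ f'(\mathbf{x})
\]
for all $\mathbf{x} \in D(g \circ f)$.

Proof. For this, let $\mathbf{x} \in D(g \circ f)$ and $\mathbf{x}_1, \mathbf{x}_2, \dots$ be some sequence in $D(g \circ f) \setminus \{\mathbf{x}\}$ which is convergent to $\mathbf{x}$. Then:
\[
\begin{aligned}
& \frac{|(g \circ f)(\mathbf{x}_\nu) - (g \circ f)(\mathbf{x}) - (g'(f(\mathbf{x})) \circ f'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} \\
&\leq \frac{|g(f(\mathbf{x}_\nu)) - g(f(\mathbf{x})) - g'(f(\mathbf{x}))(f(\mathbf{x}_\nu) - f(\mathbf{x}))|}{|\mathbf{x}_\nu - \mathbf{x}|}
+ \frac{|g'(f(\mathbf{x}))(f(\mathbf{x}_\nu) - f(\mathbf{x}) - f'(\mathbf{x})(\mathbf{x}_\nu - \mathbf{x}))|}{|\mathbf{x}_\nu - \mathbf{x}|}
\end{aligned}
\]
and hence
\[
\lim_{\nu \to \infty} \frac{|(g \circ f)(\mathbf{x}_\nu) - (g \circ f)(\mathbf{x}) - (g'(f(\mathbf{x})) \circ f'(\mathbf{x}))(\mathbf{x}_\nu - \mathbf{x})|}{|\mathbf{x}_\nu - \mathbf{x}|} = 0 .
\]
Hence, since $g'(f(\mathbf{x})) \circ f'(\mathbf{x})$ is a linear map, and $\mathbf{x}_1, \mathbf{x}_2, \dots$ and $\mathbf{x} \in D(g \circ f)$ were otherwise arbitrary, finally, the theorem follows.

Definition 4.2.24. Let $n \in \mathbb{N}^*$, $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}^n$ be functions of several variables such that $D_f \cap D_g \neq \emptyset$. We define the product function $f.g : D_f \cap D_g \to \mathbb{R}^n$ by
\[
(f.g)(\mathbf{x}) := (f(\mathbf{x}) \cdot g_1(\mathbf{x}), \dots, f(\mathbf{x}) \cdot g_n(\mathbf{x}))
\]
for all $\mathbf{x} \in D_f \cap D_g$ where $g_1, \dots, g_n : D_g \to \mathbb{R}$ are the component functions of $g$.

By application of Theorem 4.2.8, we get as a corollary the chain rule for partial derivatives. The last is frequently applied, e.g., in connection with coordinate transformations. A typical example for the last application is given subsequently.

Corollary 4.2.25. (Chain rule for partial derivatives) Let $f : U \to \mathbb{R}^m$, $g : V \to \mathbb{R}^l$ be differentiable vector-valued functions of several variables defined on some open subsets $U$ of $\mathbb{R}^n$ and $V$ of $\mathbb{R}^m$ and such that the domain of the composition $g \circ f$ is non trivial. Then for each $\mathbf{x} \in D(g \circ f)$, $i \in \{1,\dots,n\}$:
\[
\frac{\partial (g \circ f)}{\partial x_i}(\mathbf{x}) = \frac{\partial f_1}{\partial x_i}(\mathbf{x}) . \frac{\partial g}{\partial x_1}(f(\mathbf{x})) + \dots + \frac{\partial f_m}{\partial x_i}(\mathbf{x}) . \frac{\partial g}{\partial x_m}(f(\mathbf{x})) .
\]

Proof. By Theorem 4.2.23 and Theorem 4.2.8, it follows that
\[
\begin{aligned}
\frac{\partial (g \circ f)}{\partial x_i}(\mathbf{x}) &= [(g \circ f)'(\mathbf{x})](\mathbf{e}_i) = [g'(f(\mathbf{x})) \circ f'(\mathbf{x})](\mathbf{e}_i)
= g'(f(\mathbf{x}))(f'(\mathbf{x})(\mathbf{e}_i)) \\
&= g'(f(\mathbf{x})) \left( \frac{\partial f_1}{\partial x_i}(\mathbf{x}), \dots, \frac{\partial f_m}{\partial x_i}(\mathbf{x}) \right) \\
&= \frac{\partial f_1}{\partial x_i}(\mathbf{x}) . \frac{\partial g}{\partial x_1}(f(\mathbf{x})) + \dots + \frac{\partial f_m}{\partial x_i}(\mathbf{x}) . \frac{\partial g}{\partial x_m}(f(\mathbf{x}))
\end{aligned}
\]
and hence the corollary.

The following gives a typical application of the chain rule for partial derivatives.

Example 4.2.26. (Polar coordinates) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be differentiable. Calculate all partial derivatives of first order of $\bar{f} : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[
\bar{f}(r, \varphi) := f(r \cos \varphi, r \sin \varphi)
\]
for all $(r, \varphi) \in \mathbb{R}^2$.
Solution: We define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[
g(r, \varphi) := (r \cos \varphi, r \sin \varphi)
\]
for all $(r, \varphi) \in \mathbb{R}^2$. Then $g$ is differentiable and $\bar{f} = f \circ g$. Hence we get by Corollary 4.2.25 that
\[
\begin{aligned}
\frac{\partial \bar{f}}{\partial r}(r, \varphi) &= \cos \varphi \, \frac{\partial f}{\partial x}(r \cos \varphi, r \sin \varphi) + \sin \varphi \, \frac{\partial f}{\partial y}(r \cos \varphi, r \sin \varphi) , \\
\frac{\partial \bar{f}}{\partial \varphi}(r, \varphi) &= - r \sin \varphi \, \frac{\partial f}{\partial x}(r \cos \varphi, r \sin \varphi) + r \cos \varphi \, \frac{\partial f}{\partial y}(r \cos \varphi, r \sin \varphi)
\end{aligned}
\]
for all $(r, \varphi) \in \mathbb{R}^2$. Solving the previous system for the partial derivatives of $f$ leads to the more useful formulas
\[
\begin{aligned}
\frac{\partial f}{\partial x}(r \cos \varphi, r \sin \varphi) &= \cos \varphi \, \frac{\partial \bar{f}}{\partial r}(r, \varphi) - \frac{\sin \varphi}{r} \, \frac{\partial \bar{f}}{\partial \varphi}(r, \varphi) , \\
\frac{\partial f}{\partial y}(r \cos \varphi, r \sin \varphi) &= \sin \varphi \, \frac{\partial \bar{f}}{\partial r}(r, \varphi) + \frac{\cos \varphi}{r} \, \frac{\partial \bar{f}}{\partial \varphi}(r, \varphi)
\end{aligned}
\]
for all $r \in \mathbb{R} \setminus \{0\}$ and $\varphi \in \mathbb{R}$.

Problems

1) Calculate the partial derivatives of $f : D \to \mathbb{R}$, and in this way conclude the differentiability of $f$. In addition, calculate $f'(1,2)$ and the Taylor polynomial of total degree $\leq 1$ ('Linearization') at $(1,2)$.
a) $f(x,y) := 4x^3 \,\square\, 2y^3 \,\square\, 3xy$ for $(x,y) \in D := \mathbb{R}^2$ ,
b) $f(x,y) := 8x^3 \,\square\, 5x^2 y^2 \,\square\, 7y^3$ for $(x,y) \in D := \mathbb{R}^2$ ,
c) $f(x,y,z) := xy \,\square\, yz \,\square\, xz$ for $(x,y,z) \in D := \mathbb{R}^3$ ,
d) $f(x,y) := x^y$ for $(x,y) \in D := \{(x,y) \in \mathbb{R}^2 : x > 0\}$ ,
e) $f(x,y) := \arccos\bigl(x / \sqrt{x^2 + y^2}\bigr)$ for $(x,y) \in D := \mathbb{R}^2 \setminus \{0\}$ ,
f) $f(x,y) := \arctan(y/x)$ for $(x,y) \in D := \{(x,y) \in \mathbb{R}^2 : x \neq 0\}$ ,
g) $f(x,y) := \ln\bigl(x \,\square\, \sqrt{x^2 + y^2}\bigr)$ for $(x,y) \in D := \mathbb{R}^2 \setminus \{0\}$ ,
h) $f(x,y,z) := e^{xyz}$ for $(x,y,z) \in D := \mathbb{R}^3$ ,
i) $f(x,y,z) := x^{yz}$ for $(x,y,z) \in D := \{(x,y,z) \in \mathbb{R}^3 : x > 0\}$ .

2) Find a function whose zero set coincides with the tangent plane to the surface at the point $p$.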
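a) $S_1 := \{(x,y,z) \in \mathbb{R}^3 : x + y + z = 3\}$ , $p = (1,1,1)$ ,
b) $S_2 := \{(x,y,z) \in \mathbb{R}^3 : xyz = 2\}$ , $p = (1,2,1)$ ,
c) $S_3 := \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 = 2\}$ , $p = (1,1,1)$ ,
d) $S_4 := \{(x,y,z) \in \mathbb{R}^3 : x^2 \,\square\, y^2 \,\square\, z = 0\}$ , $p = (0,0,0)$ ,
e) $S_5 := \{(x,y,z) \in \mathbb{R}^3 : x^2 \,\square\, y^2 \,\square\, z^2 = 1\}$ , $p = (1,1,1)$ ,
f) $S_6 := \{(x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 3\}$ , $p = (1,1,1)$ ,
g) $S_7 := \{(x,y,z) \in \mathbb{R}^3 : x^2 \,\square\, y^2 \,\square\, 2z^2 = 1\}$ , $p = (1,1,1)$ ,
h) $S_8 := \{(x,y,z) \in \mathbb{R}^3 : x^2 \,\square\, 3xy^2 \,\square\, 4z \,\square\, 1 = 0\}$ , $p = (1,2,3)$ ,
i) $S_9 := \{(x,y,z) \in \mathbb{R}^3 : \sin(xyz) = 1/2\}$ , $p = (1, \pi, 1/6)$ .

3) Use the chain rule to calculate
\[
\frac{\partial g}{\partial u}(2,1) \quad \text{and} \quad \frac{\partial g}{\partial v}(2,1)
\]
where
\[
g(u,v) := f\left( \sqrt{uv} , \frac{1}{2} \ln \frac{u}{v} \right)
\]
for all $u, v > 0$ and $f$ is a differentiable function such that
\[
\frac{\partial f}{\partial x}(x,y) = y , \quad \frac{\partial f}{\partial y}(x,y) = x ,
\]
for all $x, y \in \mathbb{R}$.

4) Let $f$ be a differentiable function with partial derivatives
\[
\frac{\partial f}{\partial x}(x,y) = x , \quad \frac{\partial f}{\partial y}(x,y) = y
\]
for all $x, y \in \mathbb{R}$. Define
\[
g(\varphi, \theta) := f(\cos \varphi \, (2 \,\square\, \cos \theta), \sin \varphi \, (2 \,\square\, \cos \theta))
\]
for all $\varphi, \theta \in \mathbb{R}$. By using the chain rule, calculate
\[
\frac{\partial g}{\partial \varphi}(\varphi, \theta)
\]
for all $\varphi, \theta \in \mathbb{R}$.

5) Use the chain rule to calculate
\[
\frac{\partial g}{\partial r}(1, \pi/6) \quad \text{and} \quad \frac{\partial g}{\partial \varphi}(1, \pi/6)
\]
where
\[
g(r, \varphi) := f(r \cos \varphi, r \sin \varphi)
\]
for all $r, \varphi \in \mathbb{R}$ and $f$ is a differentiable function such that
\[
\frac{\partial f}{\partial x}(x,y) = 3x^2 \,\square\, y , \quad \frac{\partial f}{\partial y}(x,y) = 3y^2 \,\square\, x
\]
for all $x, y \in \mathbb{R}$.

6) Let $f : \mathbb{R}^3 \to \mathbb{R}$ be a differentiable function satisfying
\[
\frac{\partial f}{\partial x}(x,y,z) = x , \quad \frac{\partial f}{\partial y}(x,y,z) = y , \quad \frac{\partial f}{\partial z}(x,y,z) = z
\]
for all $x, y, z \in \mathbb{R}$. Define the function $g$ by
\[
g(r, \theta, \varphi) := f(r \sin \theta \cos \varphi, r \sin \theta \sin \varphi, r \cos \theta)
\]
for all $r, \theta, \varphi \in \mathbb{R}$. By using the chain rule, calculate
\[
\frac{\partial g}{\partial \theta}(1, \pi/4, 0) .
\]

7) Let $U$ be a non-empty open subset of $\mathbb{R}^n$ where $n \in \mathbb{N}^*$. Further, let $f, g : U \to \mathbb{R}$ be partially differentiable, $I$ a non-empty open interval of $\mathbb{R}$ such that $I \supset \mathrm{Ran}(f)$, $h : I \to \mathbb{R}$ differentiable and $a \in \mathbb{R}$. Show that
a) $\nabla(f + g) = \nabla f + \nabla g$ ,
b) $\nabla(a.f) = a.\nabla f$ ,
c) $\nabla(f \cdot g) = f.\nabla g + g.\nabla f$ ,
d) $\nabla f^k = k f^{k-1}.\nabla f$ , $k \in \mathbb{N}^*$ ,
e) $\nabla(f/g) = (1/g).\nabla f - (f/g^2).\nabla g$ , if $g^{-1}(\{0\}) = \emptyset$ ,
f) $\nabla(h \circ f) = (h' \circ f).\nabla f$ .

8) Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be twice differentiable functions. Define
\[
u(t,x) := f(x - t) + g(x + t)
\]
for all $(t,x) \in \mathbb{R}^2$.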
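Calculate all partial derivatives of $u$ up to second order and conclude that $u$ satisfies
\[
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 .
\]
The last is called the wave equation in one space dimension (for a function $u$ which is to be determined).

9) Determine whether $f$ is a solution of Laplace's equation. If applicable, $a, b \in \mathbb{R}$.
a) $f(x,y) := a \, e^x \cos(y) \,\square\, b \, e^x \sin(y)$ for $(x,y) \in D := \mathbb{R}^2$ ,
b) $f(x,y) := x^3 \,\square\, 3xy^2$ for $(x,y) \in D := \mathbb{R}^2$ ,
c) $f(x,y) := (1/2) \ln(x^2 + y^2)$ for $(x,y) \in D := \mathbb{R}^2 \setminus \{0\}$ ,
d) $f(x,y) := x \sin(x \,\square\, y) \,\square\, y \cos(x \,\square\, y)$ for $(x,y) \in D := \mathbb{R}^2$ ,
e) $f(x,y) := \arctan\left( \dfrac{x \,\square\, y}{1 - xy} \right)$ for $(x,y) \in D := \{(x,y) \in \mathbb{R}^2 : xy \neq 1\}$ ,
f) $f(x,y,z) := e^{xyz}$ for $(x,y,z) \in D := \mathbb{R}^3$ ,
g) $f(x,y,z) := a \, e^{5x} \sin(3y) \cos(4z) \,\square\, b \, e^{5x} \cos(3y) \cos(4z)$ for $(x,y,z) \in D := \mathbb{R}^3$ ,
h) $f(x,y,z) := x^3 \,\square\, 2xy^2 \,\square\, xz^2$ for $(x,y,z) \in D := \mathbb{R}^3$ ,
i) $f(x,y,z) := 1 / \sqrt{x^2 + y^2 + z^2}$ for $(x,y,z) \in D := \mathbb{R}^3 \setminus \{0\}$ .

10) Let $f, g : \mathbb{R} \to \mathbb{R}$ be twice differentiable, but otherwise arbitrary. Define $u : \mathbb{R} \times (\mathbb{R}^3 \setminus \{0\}) \to \mathbb{R}$ by
\[
u(t, \mathbf{x}) := \frac{1}{|\mathbf{x}|} \, [ \, f(t - |\mathbf{x}|) + g(t + |\mathbf{x}|) \, ]
\]
for all $(t, \mathbf{x}) \in \mathbb{R} \times (\mathbb{R}^3 \setminus \{0\})$. Calculate all partial derivatives of $u$ up to second order and verify that $u$ satisfies
\[
\frac{\partial^2 u}{\partial t^2} - \triangle u = 0 \tag{4.2.6}
\]
where
\[
(\triangle u)(t, \mathbf{x}) := [\triangle u(t, \cdot)](\mathbf{x})
\]
for all $(t, \mathbf{x}) \in \mathbb{R} \times (\mathbb{R}^3 \setminus \{0\})$. The equation (4.2.6) is called the wave equation in three space dimensions (for a function $u$ which is to be determined).

11) (Transformation of the Laplace operator into polar coordinates) Define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[
g(r, \varphi) := (r \cos \varphi, r \sin \varphi)
\]
for all $(r, \varphi) \in \mathbb{R}^2$. Further, let $u : \mathbb{R}^2 \to \mathbb{R}$ be of class $C^2$. Show that
\[
(\triangle u)(g(r, \varphi)) = \frac{\partial^2 \bar{u}}{\partial r^2}(r, \varphi) + \frac{1}{r} \frac{\partial \bar{u}}{\partial r}(r, \varphi) + \frac{1}{r^2} \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \varphi)
\]
for all $r \in \mathbb{R}^*$, $\varphi \in \mathbb{R}$ where $\bar{u} := u \circ g$.

12) (Transformation of the Laplace operator into cylindrical coordinates) Define $g : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[
g(r, \varphi, z) := (r \cos \varphi, r \sin \varphi, z)
\]
for all $(r, \varphi, z) \in \mathbb{R}^3$. Further, let $u : \mathbb{R}^3 \to \mathbb{R}$ be of class $C^2$. Show that
\[
(\triangle u)(g(r, \varphi, z)) = \frac{\partial^2 \bar{u}}{\partial r^2}(r, \varphi, z) + \frac{1}{r} \frac{\partial \bar{u}}{\partial r}(r, \varphi, z) + \frac{1}{r^2} \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \varphi, z) + \frac{\partial^2 \bar{u}}{\partial z^2}(r, \varphi, z)
\]
for all $r \in \mathbb{R}^*$, $(\varphi, z) \in \mathbb{R}^2$ where $\bar{u} := u \circ g$.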
13) (Transformation of the Laplace operator into spherical coordinates) Define $g : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[
g(r, \theta, \varphi) := (r \sin \theta \cos \varphi, r \sin \theta \sin \varphi, r \cos \theta)
\]
for all $(r, \theta, \varphi) \in \mathbb{R}^3$. Further, let $u : \mathbb{R}^3 \to \mathbb{R}$ be of class $C^2$. Show that
\[
\begin{aligned}
(\triangle u)(g(r, \theta, \varphi)) &= \frac{\partial^2 \bar{u}}{\partial r^2}(r, \theta, \varphi) + \frac{2}{r} \frac{\partial \bar{u}}{\partial r}(r, \theta, \varphi) \\
&\quad + \frac{1}{r^2 \sin^2 \theta} \left[ \frac{\partial^2 \bar{u}}{\partial \varphi^2}(r, \theta, \varphi) + \sin^2 \theta \, \frac{\partial^2 \bar{u}}{\partial \theta^2}(r, \theta, \varphi) + \sin \theta \cos \theta \, \frac{\partial \bar{u}}{\partial \theta}(r, \theta, \varphi) \right]
\end{aligned}
\]
for all $r \in \mathbb{R}^*$, $\theta \in \mathbb{R} \setminus \{k \pi : k \in \mathbb{Z}\}$ and $\varphi \in \mathbb{R}$ where $\bar{u} := u \circ g$.

14) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f(x,y) := \begin{cases} x^2 y/(x^4 + y^2) & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x,y) = 0 . \end{cases}
\]
a) Show that $f$ is discontinuous at the origin.
b) Show that $f$ is partially differentiable at the origin into every direction, i.e.,
\[
\lim_{h \to 0, h \neq 0} \frac{f(h.\mathbf{u}) - f(0)}{h}
\]
exists for every $\mathbf{u} = (u_1, u_2) \in \mathbb{R}^2$ such that $u_1^2 + u_2^2 = 1$.

Fig. 166: Graph and contour map of $f$ from Problem 14.

15) As in Example 4.2.13, define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f(x,y) := \begin{cases} (3x^2 y - y^3)/(x^2 + y^2) & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x,y) = 0 . \end{cases}
\]
Show that $f$ is partially differentiable at the origin into every direction, i.e.,
\[
\lim_{h \to 0, h \neq 0} \frac{f(h.\mathbf{u}) - f(0)}{h}
\]
exists for every $\mathbf{u} = (u_1, u_2) \in \mathbb{R}^2$ such that $u_1^2 + u_2^2 = 1$.

16) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[
f(x,y) := \begin{cases} x^3/(x^2 + y^2) & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{0\} \\ 0 & \text{if } (x,y) = 0 . \end{cases}
\]
a) Show that $f$ is not differentiable in the origin.
b) Show that $f \circ g$ is differentiable for any differentiable path $g$ in $\mathbb{R}^2$ that passes through the origin.

Fig. 167: Graph and contour map of $f$ from Problem 16.

4.3 Applications of Differentiation

In this section, we give the main applications of differentiation of functions of several variables. These include a generalization of Taylor's theorem, applications to the finding of maxima and minima, and Lagrange's multiplier rule for the finding of maxima and minima in the presence of additional constraints.
A function of several variables $f$ from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}$ was said to be partially differentiable in the $i$-th coordinate, where $i \in \{1,\dots,n\}$, at some $\mathbf{x} \in U$ if the corresponding real-valued function of one real variable $f(x_1,\dots,x_{i-1},\cdot,x_{i+1},\dots,x_n)$ is differentiable at $x_i$ in the sense of Calculus I. In that case, we defined
\[
\frac{\partial f}{\partial x_i}(\mathbf{x}) := [f(x_1,\dots,x_{i-1},\cdot,x_{i+1},\dots,x_n)]'(x_i) .
\]
In the following, we rewrite the right hand side of the last equation for the purpose of generalization. Since $U$ is open, there is $\varepsilon > 0$ such that the open ball $U_\varepsilon(\mathbf{x})$ around $\mathbf{x}$ is contained in $U$. For this reason, we can define an auxiliary function $h : (-\varepsilon, \varepsilon) \to \mathbb{R}$ by
\[
h(t) := f(\mathbf{x} + t.\mathbf{e}_i) = f(x_1,\dots,x_{i-1},x_i + t,x_{i+1},\dots,x_n)
\]
for all $t \in (-\varepsilon, \varepsilon)$ where $\mathbf{e}_i$ denotes the $i$-th canonical basis vector of $\mathbb{R}^n$. As a consequence of the chain rule for functions in one variable, $h$ is differentiable in $0$ with derivative
\[
h'(0) = [f(x_1,\dots,x_{i-1},\cdot,x_{i+1},\dots,x_n)]'(x_i)
\]
and hence
\[
\frac{\partial f}{\partial x_i}(\mathbf{x}) = h'(0) .
\]
In this sense,
\[
\frac{\partial f}{\partial x_i}(\mathbf{x})
\]
is a derivative of $f$ at the point $\mathbf{x}$ in the direction $\mathbf{e}_i$ of the $i$-th coordinate axis. Of course, such a derivative can potentially be defined in any direction, not just in the directions of the coordinate axes. This is done in the definition below, which will lead to a geometrical interpretation of the gradient $\nabla f$ for differentiable functions of several variables $f$.

Definition 4.3.1. (Directional derivatives) A function of several variables $f$ defined on some open subset $U$ of $\mathbb{R}^n$ is said to be differentiable in the direction of some unit vector $\mathbf{u} \in \mathbb{R}^n$ at some $\mathbf{x} \in U$ if the auxiliary function $h : I \to \mathbb{R}$, defined by
\[
h(t) := f(\mathbf{x} + t.\mathbf{u})
\]
for every $t \in I$ and some open interval $I$ around $0$, is differentiable at $0$ in the sense of Calculus I. In this case, we define:
\[
\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) := h'(0) .
\]

Theorem 4.3.2. Let $n \in \mathbb{N}^*$, $f$ be some differentiable function defined on some open subset $U$ of $\mathbb{R}^n$ and $\mathbf{u} \in \mathbb{R}^n$ be some unit vector.
Then $f$ is differentiable in the direction of $\mathbf{u}$ at all points of $U$ and
\[
\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}) = (\nabla f)(\mathbf{x}) \cdot \mathbf{u} = \cos(\alpha) \, |(\nabla f)(\mathbf{x})|
\]
for all $\mathbf{x} \in U$ where $\alpha$ denotes the angle between $(\nabla f)(\mathbf{x})$ and $\mathbf{u}$.

Proof. For this, let $\mathbf{x} \in U$ and define the path $g : I \to \mathbb{R}^n$ by
\[
g(t) := \mathbf{x} + t.\mathbf{u}
\]
for every $t \in I$ and some open interval $I$ of $\mathbb{R}$ around $0$ such that $\mathrm{Ran}(g) \subset U$. According to Example 4.2.11, $g$ is differentiable, and moreover according to Example 4.2.10 its derivative is given by
\[
[g'(t)](1) = \mathbf{u}
\]
for all $t \in I$. Hence it follows by Theorem 4.2.23 that $f \circ g$ is differentiable and that
\[
(f \circ g)'(0) = [f'(g(0))](\mathbf{u}) = (\nabla f)(\mathbf{x}) \cdot \mathbf{u}
\]
where Example 4.2.10 (ii) has been used. Hence the theorem follows.

We consider again the situation in the previous theorem. If $\mathbf{x}_0 \in U$ and, more generally than the path $g$ considered in the previous proof, $g$ is a differentiable path from some open interval $I$ around $0$ to $\mathbb{R}^n$ such that $\mathrm{Ran}(g) \subset U$, $g(0) = \mathbf{x}_0$ and $g_i'(0) = u_i$, where $g_i$ is the $i$-th component map of $g$, for all $i \in \{1,\dots,n\}$, then $f \circ g$ is differentiable, and it follows by the chain rule that
\[
(f \circ g)'(0) = [f'(g(0))](\mathbf{u}) = (\nabla f)(\mathbf{x}_0) \cdot \mathbf{u} .
\]
In particular if the range of $g$ is contained in the level set of $f$ through $\mathbf{x}_0$, then $f \circ g$ is constant of value $f(\mathbf{x}_0)$. Hence
\[
0 = (f \circ g)'(0) = (\nabla f)(\mathbf{x}_0) \cdot \mathbf{u} .
\]
Since $\mathbf{u}$ is a tangent vector to $g$ in $\mathbf{x}_0$ and the range of $g$ is contained in the level set of $f$ through the point $\mathbf{x}_0$, $\mathbf{u}$ is also tangent to that level set in the same point. Since this is true for every such $g$, in this sense $(\nabla f)(\mathbf{x}_0)$ is perpendicular to the level set in $\mathbf{x}_0$. Note that we excluded the question of existence of differentiable paths with values in that level set, and we did not define the tangent space to that set in $\mathbf{x}_0$. Such questions are answered in courses on differential geometry. In part (ii) of the following remark, we give a corresponding result without proof.

Fig. 168: The left picture shows gradients and the level set $S$ through the point $(1,0)$ of $f$ from Example 4.3.4.
The right picture shows the graph of $f$, $S$ and directions of steepest increase of $f$.

Remark 4.3.3. (Interpretation of the gradient) Let $n$, $f$ and $U$ be as in Theorem 4.3.2. If the gradient vector $(\nabla f)(\mathbf{x}_0)$ of $f$ in $\mathbf{x}_0 \in U$ is non vanishing, then

(i)
\[
|(\nabla f)(\mathbf{x}_0)|^{-1}.(\nabla f)(\mathbf{x}_0) \quad \text{and} \quad - |(\nabla f)(\mathbf{x}_0)|^{-1}.(\nabla f)(\mathbf{x}_0)
\]
are the directions of steepest ascent and steepest descent of $f$ at $\mathbf{x}_0$, respectively. The rate of the ascent and descent is given by
\[
|(\nabla f)(\mathbf{x}_0)| \quad \text{and} \quad - |(\nabla f)(\mathbf{x}_0)| ,
\]
respectively.

(ii) If $f$ is moreover of class $C^1$, $(\nabla f)(\mathbf{x}_0)$ is perpendicular to the level set (or contour) of $f$ at $\mathbf{x}_0$. Hence the equation of the tangent plane to this set is given by
\[
(\nabla f)(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) = 0
\]
and its normal line through $\mathbf{x}_0$ by
\[
\mathbf{x}_0 + \lambda.(\nabla f)(\mathbf{x}_0)
\]
where $\lambda \in \mathbb{R}$.

Example 4.3.4. The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[
f(x,y) := x^2 + y^2
\]
for every $(x,y) \in \mathbb{R}^2$ is of class $C^1$. The corresponding gradients are given by
\[
(\nabla f)(x,y) = (2x, 2y)
\]
for every $(x,y) \in \mathbb{R}^2$. In particular,
\[
(\nabla f)(0) = 0 .
\]
Hence the directions of steepest increase / decrease of $f$ in $(x,y) \in \mathbb{R}^2 \setminus \{0\}$ are
\[
\left( \frac{x}{\sqrt{x^2 + y^2}} , \frac{y}{\sqrt{x^2 + y^2}} \right) , \quad - \left( \frac{x}{\sqrt{x^2 + y^2}} , \frac{y}{\sqrt{x^2 + y^2}} \right) .
\]
Since $f(1,0) = 1$, the level set of $f$ through $(1,0)$ coincides with
\[
\{(x,y) \in \mathbb{R}^2 : f(x,y) = x^2 + y^2 = 1\} = S^1_1(0) .
\]
See Fig. 168.

In the next step, we derive a generalization of Taylor's theorem to functions of several variables. The idea for such a generalization is as follows. For the description, let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$, $\mathbf{x}_0 \in U$ and $\mathbf{h} \in \mathbb{R}^n$ such that $\mathbf{x}_0 + t.\mathbf{h} \in U$ for all $t \in (-\varepsilon, 1 + \varepsilon)$ where $\varepsilon > 0$. Our goal is the derivation of a relationship between $f(\mathbf{x}_0 + \mathbf{h})$ and the values of $f$ as well as of its partial derivatives, so far existent, in the point $\mathbf{x}_0$. For this, we define the auxiliary path $g : (-\varepsilon, 1 + \varepsilon) \to \mathbb{R}^n$ by
\[
g(t) := \mathbf{x}_0 + t.\mathbf{h}
\]
for every $t \in (-\varepsilon, 1 + \varepsilon)$. Then $f \circ g$ is a function of one real variable such that
\[
(f \circ g)(0) = f(\mathbf{x}_0) , \quad (f \circ g)(1) = f(\mathbf{x}_0 + \mathbf{h}) .
\]
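The reduction to a function of one variable just described can be sketched numerically. The following minimal Python example uses $f(x,y) = x^2 + y^2$ from Example 4.3.4. The names `f`, `grad_f`, `phi` and the particular choices of the point `x0` and the displacement `h` are illustrative assumptions, not part of the text. It checks the two end point identities and compares a central difference quotient of $f \circ g$ at $0$ with $\mathbf{h} \cdot (\nabla f)(\mathbf{x}_0)$, as the chain rule suggests.

```python
# Sketch: restrict f to the path g(t) = x0 + t*h and study phi = f o g.

def f(x, y):
    """f(x, y) = x^2 + y^2, as in Example 4.3.4."""
    return x**2 + y**2

def grad_f(x, y):
    """Gradient of f, computed by hand: (2x, 2y)."""
    return (2 * x, 2 * y)

x0 = (1.0, 0.0)    # expansion point (illustrative choice)
h = (0.5, -0.25)   # displacement vector (illustrative choice)

def phi(t):
    """phi(t) = f(x0 + t*h), a function of one real variable."""
    return f(x0[0] + t * h[0], x0[1] + t * h[1])

# The end points of the path reproduce f(x0) and f(x0 + h).
assert phi(0.0) == f(*x0)
assert phi(1.0) == f(x0[0] + h[0], x0[1] + h[1])

# Central difference approximation of phi'(0) versus h . (grad f)(x0).
eps = 1e-6
numeric = (phi(eps) - phi(-eps)) / (2 * eps)
gx, gy = grad_f(*x0)
exact = h[0] * gx + h[1] * gy
print(numeric, exact)
```

Since `phi` is an ordinary function of one real variable, the one-variable tools of Calculus I apply to it directly, which is the point of introducing the auxiliary path.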
In the next step, we apply Taylor's theorem, Theorem 2.5.25, to $f \circ g$, in accordance with the differentiability properties of $f \circ g$, and choose the expansion point $t_0 := 0$. For this, derivatives of $f \circ g$ in $0$ need to be known. By application of the chain rule for functions in several variables, such derivatives can be expressed in terms of partial derivatives of $f$ in $g(0) = \mathbf{x}_0$ and partial derivatives of $g$. The last are constant functions. In this way, we arrive at a relationship of the required type. The following lemma derives the form of the derivatives of $f \circ g$ which are needed in this procedure.

Lemma 4.3.5. Let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ and of class $C^r$ for some $r \in \mathbb{N}^*$. In addition, let $\mathbf{x} \in U$, $\mathbf{h} \in \mathbb{R}^n$ and $I$ be some open interval of $\mathbb{R}$ around $0$ such that $\mathbf{x} + t.\mathbf{h} \in U$ for all $t \in I$. Finally, define $g : I \to U$ by $g(t) := \mathbf{x} + t.\mathbf{h}$ for all $t \in I$. Then
\[
(f \circ g)^{(r)} = [(\mathbf{h} \cdot \nabla)^r f] \circ g
\]
where $(\mathbf{h} \cdot \nabla)^k f$ for $k \in \{2,\dots,r\}$ is defined recursively by
\[
(\mathbf{h} \cdot \nabla)^k f := (\mathbf{h} \cdot \nabla)[(\mathbf{h} \cdot \nabla)^{k-1} f] .
\]
Compare Example 4.2.10 (ii).

Proof. The proof proceeds by induction. For $r = 1$, it follows by Remark 4.2.15 that $f$ is differentiable. Moreover, $g$ is differentiable. Hence by Example 4.2.10 and the chain rule, Theorem 4.2.23, it follows that
\[
(f \circ g)'(t) = f'(g(t))([g'(t)](1)) = f'(g(t))(\mathbf{h}) = [(\mathbf{h} \cdot \nabla) f](g(t))
\]
for all $t \in I$ and hence the statement for $r = 1$. Now assume that the statement is valid for some $s \in \mathbb{N}^*$ such that $1 \leq s \leq r - 1$. Then it follows by Remark 4.2.15 that $(\mathbf{h} \cdot \nabla)^s f$ is differentiable and by the analogous arguments applied in the first step that
\[
(f \circ g)^{(s+1)} = [(\mathbf{h} \cdot \nabla)^{s+1} f] \circ g .
\]

By help of the previous lemma, we can now state and prove Taylor's formula for functions in several variables.

Theorem 4.3.6. (Taylor's formula) Let $f$ be a real-valued function defined on some open subset $U$ of $\mathbb{R}^n$ and of class $C^r$ for some $r \in \mathbb{N}^*$. Moreover, let $\mathbf{x}_0 \in U$ and $\mathbf{h} \in \mathbb{R}^n \setminus \{0\}$ such that $\mathbf{x}_0 + t.\mathbf{h} \in U$ for all $t \in [0,1]$. Then there is $\tau \in (0,1)$ such that
\[
f(\mathbf{x}_0 + \mathbf{h}) = f(\mathbf{x}_0) + \frac{1}{1!} [(\mathbf{h} \cdot \nabla) f](\mathbf{x}_0) + \dots + \frac{1}{(r-1)!} [(\mathbf{h} \cdot \nabla)^{r-1} f](\mathbf{x}_0) + \frac{1}{r!} [(\mathbf{h} \cdot \nabla)^r f](\mathbf{x}_0 + \tau.\mathbf{h}) . \tag{4.3.1}
\]

Proof. First, since $\mathbf{x}_0 + t.\mathbf{h} \in U$ for all $t \in [0,1]$ and $U$ is an open subset of $\mathbb{R}^n$, there is some open interval $I$ of $\mathbb{R}$ containing $[0,1]$ and such that $\mathbf{x}_0 + t.\mathbf{h} \in U$ for all $t \in I$. Hence we can define $g : I \to U$ by $g(t) := \mathbf{x}_0 + t.\mathbf{h}$ for all $t \in I$. Since $f$ is of class $C^r$, it follows by Lemma 4.3.5 that the real-valued function $f \circ g$ of one variable is $r$ times continuously differentiable. Hence it follows by Taylor's theorem for functions of one variable, Theorem 2.5.25, that there is some $\tau \in (0,1)$ such that
\[
(f \circ g)(1) = (f \circ g)(0) + \frac{1}{1!} (f \circ g)'(0) + \dots + \frac{1}{(r-1)!} (f \circ g)^{(r-1)}(0) + \frac{1}{r!} (f \circ g)^{(r)}(\tau) .
\]
Finally, from this follows (4.3.1) by application of Lemma 4.3.5.

Theorem 4.3.7. (Estimate of the remainder in Taylor's formula) Let $f, U, r, \mathbf{x}_0, \mathbf{h}$ be as in Theorem 4.3.6. Moreover, let $K$ be a bound for all partial derivatives of $f$ on $U$ of order $r$. Finally, define the remainder term by
\[
R_r(\mathbf{x}_0 + \mathbf{h}) := \frac{1}{r!} [(\mathbf{h} \cdot \nabla)^r f](\mathbf{x}_0 + \tau.\mathbf{h}) .
\]
Then there is a number $C \in \mathbb{N}^*$ depending only on $r$ and $n$ such that
\[
|R_r(\mathbf{x}_0 + \mathbf{h})| \leq \frac{CK}{r!} \, |\mathbf{h}|^r . \tag{4.3.2}
\]

Proof. The function $(\mathbf{h} \cdot \nabla)^r f$ is of the form
\[
(\mathbf{h} \cdot \nabla)^r f = \sum_{i_1 + \dots + i_n = r} c_{i_1 \dots i_n} \, h_1^{i_1} \cdots h_n^{i_n} \, \frac{\partial^r f}{\partial x_1^{i_1} \dots \partial x_n^{i_n}}
\]
where the numbers $c_{i_1 \dots i_n}$ come from a multinomial expansion. Hence
\[
|(\mathbf{h} \cdot \nabla)^r f| \leq CK |\mathbf{h}|^r
\]
where
\[
C := \sum_{i_1 + \dots + i_n = r} c_{i_1 \dots i_n}
\]
depends only on $n$ and $r$. Hence it follows (4.3.2).

Definition 4.3.8. Let $f, U, r, \mathbf{x}_0, \mathbf{h}$ and $\tau$ be as in Theorem 4.3.6. Then we call the function $p_{r-1} : \mathbb{R}^n \to \mathbb{R}$ defined by
\[
p_{r-1}(\mathbf{x}) := f(\mathbf{x}_0) + \frac{1}{1!} [((\mathbf{x} - \mathbf{x}_0) \cdot \nabla) f](\mathbf{x}_0) + \dots + \frac{1}{(r-1)!} [((\mathbf{x} - \mathbf{x}_0) \cdot \nabla)^{r-1} f](\mathbf{x}_0)
\]
for all $\mathbf{x} \in \mathbb{R}^n$, the Taylor polynomial of $f$ of total degree $\leq r - 1$ at $\mathbf{x}_0$ and
\[
R_r(\mathbf{x}) := \frac{1}{r!} [((\mathbf{x} - \mathbf{x}_0) \cdot \nabla)^r f](\mathbf{x}_0 + \tau.(\mathbf{x} - \mathbf{x}_0))
\]
its remainder term at $\mathbf{x} = \mathbf{x}_0 + \mathbf{h}$.

Example 4.3.9. Define $f_9 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ by
\[
f_9(x,y) := \frac{xy}{x^2 + y^2}
\]
for all $(x,y) \in \mathbb{R}^2 \setminus \{0\}$. Calculate the Taylor polynomial of $f_9$ of total degree $\leq 2$ at $(1,1)$, and give an estimate of its remainder term at the point $(1.1, 1.2)$.
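Solution: Obviously, $f_9$ is of class $C^\infty$ on $\mathbb{R}^2 \setminus \{0\}$. As a consequence of Schwarz's Theorem 4.2.18 and the symmetry of $f_9$ under exchange of coordinates, there is only 1 'independent' first order partial derivative as well as 2 independent second order and third order partial derivatives of $f_9$, respectively. In particular,

$$\frac{\partial f_9}{\partial x}(x, y) = \frac{y \, (y^2 - x^2)}{(x^2 + y^2)^2} \, , \quad \frac{\partial^2 f_9}{\partial x \partial y}(x, y) = - \frac{x^4 - 6x^2y^2 + y^4}{(x^2 + y^2)^3} \, ,$$

$$\frac{\partial^2 f_9}{\partial x^2}(x, y) = - \frac{2xy \, (3y^2 - x^2)}{(x^2 + y^2)^3} \, , \quad \frac{\partial^3 f_9}{\partial x^2 \partial y}(x, y) = \frac{2x^5 - 28x^3y^2 + 18xy^4}{(x^2 + y^2)^4} \, ,$$

$$\frac{\partial^3 f_9}{\partial x^3}(x, y) = - \frac{6y \, (x^4 - 6x^2y^2 + y^4)}{(x^2 + y^2)^4} \, , \qquad (4.3.3)$$

for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Hence we have for $x_0 := (1, 1)$, small enough $h \in \mathbb{R}^2$ and some $\tau \in [0, 1]$:

$$\begin{aligned}
f_9(x_0 + h) &= f_9(x_0) + \frac{1}{1!} [(h \cdot \nabla) f_9](x_0) + \frac{1}{2!} [(h \cdot \nabla)^2 f_9](x_0) + \frac{1}{3!} [(h \cdot \nabla)^3 f_9](x_0 + \tau.h) \\
&= f_9(x_0) + \frac{\partial f_9}{\partial x}(x_0) \, h_x + \frac{\partial f_9}{\partial y}(x_0) \, h_y + \frac{1}{2} \left[ \frac{\partial^2 f_9}{\partial x^2}(x_0) \, h_x^2 + 2 \, \frac{\partial^2 f_9}{\partial x \partial y}(x_0) \, h_x h_y + \frac{\partial^2 f_9}{\partial y^2}(x_0) \, h_y^2 \right] + R_3(x_0 + h) \\
&= \frac{1}{2} - \frac{1}{4} (h_x - h_y)^2 + R_3(x_0 + h)
\end{aligned}$$

where

$$R_3(x_0 + h) = \frac{1}{6} \left[ \frac{\partial^3 f_9}{\partial x^3}(x_0 + \tau.h) \, h_x^3 + 3 \, \frac{\partial^3 f_9}{\partial x^2 \partial y}(x_0 + \tau.h) \, h_x^2 h_y + 3 \, \frac{\partial^3 f_9}{\partial x \partial y^2}(x_0 + \tau.h) \, h_x h_y^2 + \frac{\partial^3 f_9}{\partial y^3}(x_0 + \tau.h) \, h_y^3 \right] .$$

Hence the Taylor polynomial $p_2$ of $f_9$ of total degree $\leq 2$ at $(1, 1)$ is given by

$$p_2(x, y) = \frac{1}{2} - \frac{1}{4} ((x - 1) - (y - 1))^2 = \frac{1}{2} - \frac{1}{4} (x - y)^2$$

for all $(x, y) \in \mathbb{R}^2$. Further for $(x, y)$ on the line segment between $x_0$ and $x_1 := (1.1, 1.2)$, it follows that

$$\left| \frac{\partial^3 f_9}{\partial x^3}(x, y) \right| = 6|y| \, \frac{|x^4 - 6x^2y^2 + y^4|}{(x^2 + y^2)^4} \leq 6|y| \, \frac{x^4 + 6x^2y^2 + y^4}{(x^2 + y^2)^4} \leq 6 \cdot 1.2 \cdot \frac{(1.1)^4 + 6 (1.1)^2 (1.2)^2 + (1.2)^4}{16} \approx 6.29645 \, ,$$

$$\left| \frac{\partial^3 f_9}{\partial y^3}(x, y) \right| = 6|x| \, \frac{|x^4 - 6x^2y^2 + y^4|}{(x^2 + y^2)^4} \leq 6|x| \, \frac{x^4 + 6x^2y^2 + y^4}{(x^2 + y^2)^4} \leq 6 \cdot 1.1 \cdot \frac{(1.1)^4 + 6 (1.1)^2 (1.2)^2 + (1.2)^4}{16} \approx 5.77174 \, ,$$

$$\left| \frac{\partial^3 f_9}{\partial x^2 \partial y}(x, y) \right| = \frac{|2x^5 - 28x^3y^2 + 18xy^4|}{(x^2 + y^2)^4} \leq \frac{2|x|^5 + 28|x|^3 y^2 + 18|x| y^4}{(x^2 + y^2)^4} \leq \frac{2 (1.1)^5 + 28 (1.1)^3 (1.2)^2 + 18 \cdot 1.1 \cdot (1.2)^4}{16} \approx 6.12151 \, ,$$

$$\left| \frac{\partial^3 f_9}{\partial x \partial y^2}(x, y) \right| = \frac{|2y^5 - 28y^3x^2 + 18yx^4|}{(x^2 + y^2)^4} \leq \frac{2|y|^5 + 28|y|^3 x^2 + 18|y| x^4}{(x^2 + y^2)^4} \leq \frac{2 (1.2)^5 + 28 (1.2)^3 (1.1)^2 + 18 \cdot 1.2 \cdot (1.1)^4}{16} \approx 5.94662$$

and hence that

$$|R_3(x_0 + h)| \leq \frac{1}{6} \left[ (0.1)^3 \cdot 6.29645 + 3 \cdot (0.1)^2 \cdot 0.2 \cdot 6.12151 + 3 \cdot 0.1 \cdot (0.2)^2 \cdot 5.94662 + (0.2)^3 \cdot 5.77174 \right] \leq 0.03 \, .$$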
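The estimate of Example 4.3.9 can be checked numerically. The following is a minimal sketch in plain Python and not part of the original solution; the names `f9`, `p2` and `remainder_bound` are chosen here for illustration. It evaluates the Taylor polynomial at $(1.1, 1.2)$, recomputes the bound on the remainder from the four derivative estimates, where the denominator $(x^2 + y^2)^4$ is bounded from below by $16$ on the segment, and compares both with the actual error.

```python
def f9(x, y):
    """f9(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x**2 + y**2)


def p2(x, y):
    """Taylor polynomial of f9 of total degree <= 2 at (1, 1)."""
    return 0.5 - 0.25 * (x - y) ** 2


def remainder_bound(hx, hy, x_max, y_max, denom_min):
    """Bound on |R_3| from the third order derivative estimates.

    x_max, y_max bound |x|, |y| on the segment; denom_min is a lower
    bound for (x^2 + y^2)^4 on the segment.
    """
    q = x_max**4 + 6 * x_max**2 * y_max**2 + y_max**4
    d_xxx = 6 * y_max * q / denom_min
    d_yyy = 6 * x_max * q / denom_min
    d_xxy = (2 * x_max**5 + 28 * x_max**3 * y_max**2
             + 18 * x_max * y_max**4) / denom_min
    d_xyy = (2 * y_max**5 + 28 * y_max**3 * x_max**2
             + 18 * y_max * x_max**4) / denom_min
    return (abs(hx) ** 3 * d_xxx
            + 3 * hx**2 * abs(hy) * d_xxy
            + 3 * abs(hx) * hy**2 * d_xyy
            + abs(hy) ** 3 * d_yyy) / 6


error = abs(f9(1.1, 1.2) - p2(1.1, 1.2))
bound = remainder_bound(0.1, 0.2, 1.1, 1.2, 16.0)
print(f"actual error = {error:.6f}, bound = {bound:.6f}")
```

The actual error at $(1.1, 1.2)$ is far smaller than the bound, which is expected since the derivative estimates replace every term by its absolute value.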
In the next step, we apply differentiation to the finding of local maxima / minima of functions in several variables. For motivation, let $f$ be such a function which is defined on some open subset $U$ of $\mathbb{R}^n$ and assumes a maximum or minimum in $x_0 \in U$. Again, we use paths to investigate the behavior of $f$ near $x_0$. For this, let $u \in \mathbb{R}^n$ be such that $|u| = 1$. Since $U$ is open, there is an open interval $I$ of $\mathbb{R}$ around $0$ such that the auxiliary path $g : I \to \mathbb{R}^n$, defined by $g(t) := x_0 + t.u$ for every $t \in I$, has its range in $U$. If $f$ is differentiable, $f \circ g$ is differentiable and assumes a maximum or a minimum, respectively, in $0$ since $(f \circ g)(0) = f(x_0)$. Hence it follows by Calculus I that the derivative of $f \circ g$ in $0$ vanishes, and it follows by the chain rule, Theorem 4.2.23, that

$$0 = (f \circ g)'(0) = u \cdot [(\nabla f)(x_0)] \, .$$

Since this is true for every such $u$, the last implies that the gradient of $f$ vanishes in $x_0$:

$$(\nabla f)(x_0) = 0 \, .$$

Below, we will derive the same result by more elementary means and without the assumption of differentiability of $f$.

Definition 4.3.10. (Local minima and maxima) Let $n \in \mathbb{N}^*$ and $f$ be some real-valued function which is defined on some open subset $U$ of $\mathbb{R}^n$. Then we say that $f$ has a local minimum, maximum at $x_0 \in U$ if there is an open ball $U(x_0)$ around $x_0$ such that $f(x) \geq f(x_0)$ for all $x \in U(x_0)$ and $f(x) \leq f(x_0)$ for all $x \in U(x_0)$, respectively.

Theorem 4.3.11. (Necessary condition for the existence of a local minimum/maximum) Let $n \in \mathbb{N}^*$, and let $f$ be a function defined on some open subset $U$ of $\mathbb{R}^n$ which has a local minimum/maximum at $x_0 \in U$, and which is partially differentiable at $x_0$ in each coordinate direction. Then

$$\frac{\partial f}{\partial x_i}(x_0) = 0 \, , \quad i \in \{1, \dots, n\} \, .$$

Note that in the case that $f$ is differentiable in $x_0$, the last is equivalent to the vanishing of the derivative of $f$ in $x_0$:

$$f'(x_0) = 0 \, . \qquad (4.3.4)$$

Also in cases where the range of $f$ is part of $\mathbb{R}^m$ for some $m \in \mathbb{N}^*$, such a point $x_0$ satisfying (4.3.4) will be called a critical point of $f$.

Proof. If $f$ has a local minimum (maximum) at $x_0 \in U$, it follows for $i \in \{1, \dots, n\}$ and sufficiently small $h \in \mathbb{R} \setminus \{0\}$ that

$$\frac{1}{h} \left[ f(x_{01}, \dots, x_{0(i-1)}, x_{0i} + h, x_{0(i+1)}, \dots, x_{0n}) - f(x_{01}, \dots, x_{0(i-1)}, x_{0i}, x_{0(i+1)}, \dots, x_{0n}) \right]$$

is $\geq (\leq) \, 0$ and $\leq (\geq) \, 0$, for $h > 0$ and $h < 0$, respectively. Therefore

$$\frac{\partial f}{\partial x_i}(x_0)$$

is at the same time $\geq 0$ and $\leq 0$ and hence, finally, equal to $0$.

The following example shows that the vanishing of the gradient of a function of several variables in a point of its domain does not always indicate that the function assumes a maximum or a minimum in that point.

Example 4.3.12. Define the differentiable function $f_{10} : \mathbb{R}^2 \to \mathbb{R}$ by

$$f_{10}(x, y) := x^2 - y^2$$

for all $(x, y) \in \mathbb{R}^2$. Then

$$\frac{\partial f_{10}}{\partial x}(x, y) = 2x \, , \quad \frac{\partial f_{10}}{\partial y}(x, y) = -2y$$

for all $(x, y) \in \mathbb{R}^2$ and hence $(0, 0)$ is a critical point of $f_{10}$, but, obviously, not a local minimum or maximum. It is a so called 'saddle point'. Note that the graph of $f_{10}$ is a quadric, namely a hyperbolic paraboloid. Since hyperbolic paraboloids look similar to saddles, these are also often called 'saddle surfaces'.

Fig. 169: Graph of $f_{10}$.

In Calculus I, we derived a sufficient condition, in terms of the second order derivative, for a function to assume a local minimum / maximum in a point of its domain. In the following, we do the same for functions in several variables. The proof of that result uses Taylor's formula and the following lemma. The proof of the last is given in the appendix.

Lemma 4.3.13. (Sylvester's criterion) Let $n \in \mathbb{N}^*$, $A = (A_{ij})_{i,j \in \{1, \dots, n\}}$ be a real symmetric $n \times n$ matrix, i.e., such that $A_{ij} = A_{ji}$ for all $i, j \in \{1, \dots, n\}$. Then $A$ is positive definite, i.e.,

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j > 0$$

for all $h \in \mathbb{R}^n \setminus \{0\}$, if and only if all leading principal minors $\det(A_k)$, $k = 1, \dots, n$, of $A$ are $> 0$. Here

$$A_k := (A_{ij})_{i,j \in \{1, \dots, k\}} \, , \quad k \in \{1, \dots, n\} \, .$$

Proof. See the proof of Theorem 5.3.8 in the appendix.

Example 4.3.14.
For the real symmetric matrix

$$A := \begin{pmatrix} 1 & 2 & 0 \\ 2 & 5 & 3 \\ 0 & 3 & 11 \end{pmatrix} ,$$

it follows that

$$\det(A_1) = \det(1) = 1 > 0 \, , \quad \det(A_2) = \begin{vmatrix} 1 & 2 \\ 2 & 5 \end{vmatrix} = 1 \cdot 5 - 2 \cdot 2 = 1 > 0 \, ,$$

$$\det(A_3) = \begin{vmatrix} 1 & 2 & 0 \\ 2 & 5 & 3 \\ 0 & 3 & 11 \end{vmatrix} = 1 \cdot 5 \cdot 11 - 3 \cdot 3 \cdot 1 - 11 \cdot 2 \cdot 2 = 2 > 0$$

and hence that

$$\sum_{i,j=1}^{3} A_{ij} h_i h_j > 0$$

for all $h \in \mathbb{R}^3 \setminus \{0\}$. Note that this can also be seen directly from

$$\begin{aligned}
\sum_{i,j=1}^{3} A_{ij} h_i h_j &= h_1 (h_1 + 2h_2) + h_2 (2h_1 + 5h_2 + 3h_3) + h_3 (3h_2 + 11h_3) \\
&= h_1^2 + 4h_1h_2 + 5h_2^2 + 6h_2h_3 + 11h_3^2 = (h_1 + 2h_2)^2 + (h_2 + 3h_3)^2 + 2h_3^2 > 0
\end{aligned}$$

for all $h \in \mathbb{R}^3 \setminus \{0\}$.

Theorem 4.3.15. (Sufficient condition for the existence of a local minimum/maximum) Let $n \in \mathbb{N}^*$ and $f$ be a real-valued function on some open subset $U$ of $\mathbb{R}^n$ which is of class $C^2$. Finally, let $x_0$ be a critical point for $f$. Then $f$ has a local minimum/maximum in $x_0$ if all leading principal minors of its Hessian matrix at $x_0$,

$$H(x_0) := \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(x_0) \right)_{i,j \in \{1, \dots, n\}} ,$$

are $> 0$ / all leading principal minors of $-H(x_0)$ are $> 0$.

Proof. First, since $x_0$ is a critical point of $f$, we conclude by Taylor's formula, Theorem 4.3.6, (together with Theorem 4.2.18) that for every $h$ from some sufficiently small ball $U_\varepsilon(0)$, $\varepsilon > 0$, around the origin

$$f(x_0 + h) = f(x_0) + \frac{1}{2} [(h \cdot \nabla)^2 f](x_0 + \tau.h) = f(x_0) + \frac{1}{2} \sum_{i,j=1}^{n} (H(x_0 + \tau.h))_{ij} \, h_i h_j \qquad (4.3.5)$$

where $\tau \in [0, 1]$. Now, if all leading principal minors of $H(x_0)$ are $> 0$ / all leading principal minors of $-H(x_0)$ are $> 0$, and since all leading principal minors of the Hessian of $f$ define continuous functions on $U$, $\varepsilon$ can be chosen such that also all leading principal minors of the Hessian of $f$ are $> 0$ / all leading principal minors of the negative of the Hessian of $f$ are $> 0$ at all points from $U_\varepsilon(x_0)$. Hence it follows from (4.3.5) and by Lemma 4.3.13 that

$$f(x_0 + h) \geq f(x_0) \quad ( \, f(x_0 + h) \leq f(x_0) \, )$$

for all $h \in U_\varepsilon(0)$ and hence, finally, the theorem.

Example 4.3.16. The function $f_9 : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$, see Example 4.3.9, is of class $C^2$ and has a critical point in $(1, 1)$. By (4.3.3), the negative of the Hessian matrix of $f_9$ in $(1, 1)$ is given by

$$-H(1, 1) = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix} .$$

Fig. 170: Graph of $f_9$.
Hence the leading principal minors of $-H(1, 1)$ are given by $1/2$ and $0$, and hence Theorem 4.3.15 is not applicable. Nevertheless, $f_9$ has even a global maximum at $(1, 1)$ because $f_9(1, 1) = 1/2$ and

$$\frac{xy}{x^2 + y^2} \leq \frac{1}{2} \, \frac{x^2 + y^2}{x^2 + y^2} = \frac{1}{2}$$

for every $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. This example demonstrates that the assumptions of Theorem 4.3.15 for the existence of a local minimum/maximum are not necessary.

In the case of functions of one real variable, it was shown that a continuous function $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, which is differentiable on the open interval $(a, b)$, assumes its extrema either in a critical point in $(a, b)$ or in the boundary points $a$ or $b$ of $[a, b]$. The same procedure can be applied to a continuous function $f$ of several variables that is differentiable on a bounded open set $U$ in the domain and whose domain arises as the closure of $U$. As a consequence, that domain is compact, and hence the function assumes a maximum and a minimum. A standard method for finding those extreme values compares the values of $f$ in critical points of $f$ in $U$ to the values of $f$ on the boundary of its domain.

Frequently in applications, maxima and minima of functions need to be found whose domains are unbounded. In such a case, it is often possible to find the maximum / minimum by decomposing the domain into a compact subset $C$ and an unbounded subset such that the function assumes a value on $C$ which is larger / smaller than the values assumed on the unbounded set. In that case, the maximum / minimum exists and is assumed in $C$. Such a case is considered in the following example.

Fig. 171: Graph of the constraint surface for $A = 6$.

Example 4.3.17. Find the length, width and height of a parallelepiped of given area $A > 0$ and maximal volume.

Solution: The volume $V$ and area $A$ of a rectangular box of length $x > 0$, width $y > 0$ and height $z > 0$ are given by:

$$V = xyz \, , \quad A = 2 (xy + xz + yz) \, ,$$

respectively.
Hence if existent, we need to find the maximum of the function $V : [0, \infty) \times [0, \infty) \to \mathbb{R}$ defined by

$$V(x, y) := \frac{xy}{x + y} \left( \frac{A}{2} - xy \right)$$

for every $(x, y) \in ([0, \infty) \times [0, \infty)) \setminus \{0\}$, and

$$V(0, 0) := 0 \, .$$

Note for later application of Theorem 4.1.14 that, obviously, $V$ is of class $C^\infty$ on $(0, \infty) \times (0, \infty)$ as well as continuous on $D(V) \setminus \{0\}$. In addition because of

$$\frac{xy}{x + y} \leq \frac{1}{2} \, \frac{(x + y)^2}{x + y} = \frac{1}{2} (x + y)$$

for every $(x, y) \in D(V) \setminus \{0\}$, it follows the continuity of $V$ in $(0, 0)$ and hence, finally, the continuity of $V$. Also, note that

$$V(x, y) \leq \frac{A^2}{4} \, \frac{1}{\sqrt{x^2 + y^2}} \qquad (4.3.6)$$

for every $(x, y) \in D(V) \setminus \{0\}$. This is obvious for $xy \geq A/2$ because in this case $V \leq 0$, whereas for $xy \leq A/2$, $(x, y) \neq (0, 0)$, it follows that

$$V(x, y) = \left( \frac{A}{2} - xy \right) \frac{xy}{x + y} \leq \frac{A^2}{4} \, \frac{1}{x + y} \leq \frac{A^2}{4} \, \frac{1}{\sqrt{x^2 + y^2}} \, .$$

In the next step, we determine the critical points of $V$ on $(0, \infty) \times (0, \infty)$. The partial derivatives of $V$ are given by

$$\frac{\partial V}{\partial x}(x, y) = y^2 \, \frac{\frac{A}{2} - x^2 - 2xy}{(x + y)^2} \, , \quad \frac{\partial V}{\partial y}(x, y) = x^2 \, \frac{\frac{A}{2} - y^2 - 2xy}{(x + y)^2}$$

for $x, y \in (0, \infty)$. Hence the critical points of $V$ on $(0, \infty) \times (0, \infty)$ are given by the solutions of the system

$$x^2 + 2xy - \frac{A}{2} = 0 \, , \quad 2xy + y^2 - \frac{A}{2} = 0$$

which has the unique solution

$$x_0 := y_0 := \sqrt{\frac{A}{6}} \, .$$

In $(x_0, y_0)$, the volume $V$ assumes the value

$$V(x_0, y_0) = \left( \frac{A}{6} \right)^{3/2} . \qquad (4.3.7)$$

Fig. 172: Graph of $V$ for the case $A = 6$.

Now we define the subset $C$ of $\mathbb{R}^2$ by

$$C := D(V) \cap B_R(0) \, , \quad R := \frac{27}{4} \sqrt{A} \, .$$

Then $C$ is in particular closed and bounded and hence compact. According to Theorem 4.1.14, $V$ assumes a maximum value on $C$. Since $V$ vanishes on both axes and because of (4.3.7) and (4.3.6), it follows that $V$ does not assume its maximum on the boundary of $C$. Hence $V$ assumes its maximum in the interior of $C$, and it follows by Theorem 4.3.11 and the previous analysis that this happens in the point $(x_0, y_0)$. By (4.3.6), it follows that $V$ assumes its (global) maximum in $(x_0, y_0)$ and that its maximum value is given by (4.3.7). Note that this implies that the box is a cube.
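The result of Example 4.3.17 can be cross-checked by a crude numerical search. The following Python sketch is an illustration only and not part of the original solution; the names `box_volume` and `grid_maximum`, the choice $A = 6$ and the grid resolution are assumptions made here. It evaluates the reduced volume function on a uniform grid over the square $[0, R]^2$ with $R = \frac{27}{4}\sqrt{A}$ and compares the largest grid value with the value $(A/6)^{3/2}$ at the critical point.

```python
import math


def box_volume(x, y, area):
    """Reduced volume V(x, y) = xy (A/2 - xy) / (x + y), with V(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (area / 2.0 - x * y) / (x + y)


def grid_maximum(area, radius, steps=500):
    """Largest value of V on a uniform grid over [0, radius]^2."""
    best = (0.0, 0.0, 0.0)  # (value, x, y)
    for i in range(steps + 1):
        x = radius * i / steps
        for j in range(steps + 1):
            y = radius * j / steps
            v = box_volume(x, y, area)
            if v > best[0]:
                best = (v, x, y)
    return best


area = 6.0                                 # illustrative choice of A
radius = 27.0 / 4.0 * math.sqrt(area)      # R from the definition of C
x0 = math.sqrt(area / 6.0)                 # critical point x0 = y0
v0 = box_volume(x0, x0, area)              # should equal (A/6)^(3/2)
v_grid, x_grid, y_grid = grid_maximum(area, radius)
print(f"critical value {v0:.6f}, best grid value {v_grid:.6f} "
      f"at ({x_grid:.3f}, {y_grid:.3f})")
```

A grid search of this kind proves nothing, but it is a quick way to catch a sign error in the partial derivatives: the best grid point should lie close to $(x_0, y_0)$ and its value should not exceed $V(x_0, y_0)$.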
The previous example solves the problem of finding the maximal volume of a parallelepiped of a given area. Problems of this type are called maximum / minimum problems with constraints. In such problems, the value of a function needs to be maximized / minimized under further conditions that restrict the elements of the domain which are considered in the maximization / minimization. In the previous example, the first was given by the function $V : [0, \infty)^3 \to \mathbb{R}$ that associates to every parallelepiped of length $x \geq 0$, width $y \geq 0$ and height $z \geq 0$ the corresponding volume

$$V(x, y, z) := xyz \, .$$

The second was given by the condition that demanded that the area of the parallelepiped is equal to some prescribed area $A > 0$, i.e., that $x, y, z$ satisfy the additional equation

$$A = 2 (xy + xz + yz) \, .$$

In the previous example, in the case that $z > 0$, we solved the last equation for $z$ and used this in the definition of $V$ in order to arrive at a 'reduced' function defined on $[0, \infty)^2$ which was subsequently maximized. In general, the last can lead to the presence of very complicated expressions in the definition of the 'reduced' function or may not be possible at all in terms of elementary functions. In such cases, the following method of Lagrange multipliers is helpful.

For its motivation, let $f$ be a real-valued 'constraint' function defined and of class $C^1$ on a non-empty open subset $U$ of $\mathbb{R}^n$, where $n \in \mathbb{N} \setminus \{0, 1\}$, and let

$$S := \{ x \in U : f(x) = 0 \} \, ,$$

the 'constraint surface', be such that $(\nabla f)(x) \neq 0$ for every $x \in S$. Finally, let $g : U \to \mathbb{R}$ be differentiable, and assume that the restriction $g|_S$ of $g$ to $S$ has a maximum or minimum in $p \in S$. Then it follows for every differentiable path $\gamma : I \to S$ through $p$, where $I$ is some open interval around $0$ and $\gamma(0) = p$, that $g \circ \gamma$ has a maximum / minimum in $0$. Hence the derivative of $g \circ \gamma$ in $0$ vanishes.

Fig. 173: Sketch of the constraint surface $S$, showing the point $p$ and the gradient $(\nabla f)(p)$.
Therefore, we conclude by help of the chain rule for functions in several variables, Theorem 4.2.23, and Example 4.2.10 that

$$(g \circ \gamma)'(0) = (\nabla g)(p) \cdot \gamma'(0) = 0 \, .$$

Hence $(\nabla f)(p)$ and $(\nabla g)(p)$ are both orthogonal to the '$(n-1)$-dimensional' tangent space of $S$ at $p$ and therefore parallel. As a consequence, there is a so called 'Lagrange multiplier' $\lambda \in \mathbb{R}$ such that

$$(\nabla g)(p) = \lambda.(\nabla f)(p) \, .$$

In this way, since in addition $f(p) = 0$, we arrive at $n + 1$ equations for the $n + 1$ unknowns given by $\lambda$ and the components of $p$.

Example 4.3.18. For this, we consider again the situation from Example 4.3.17. The constraint surface $S$ is given by the zero set of $f : U \to \mathbb{R}$, where

$$U := \{ (x, y, z) \in \mathbb{R}^3 : x > 0 \wedge y > 0 \wedge z > 0 \} \, ,$$

defined by

$$f(x, y, z) := xy + xz + yz - \frac{A}{2}$$

for every $(x, y, z) \in U$. Note that

$$(\nabla f)(x, y, z) = (y + z, x + z, x + y) \neq 0$$

for every $(x, y, z) \in U$. The function $V : U \to \mathbb{R}$, defined by

$$V(x, y, z) := xyz$$

for every $(x, y, z) \in U$, is to be maximized on $S$. According to the previous analysis, there is a real $\lambda$ such that

$$(\nabla V)(x, y, z) = (yz, xz, xy) = \lambda.(\nabla f)(x, y, z) = \lambda.(y + z, x + z, x + y) \, .$$

Hence it follows that $\lambda \neq 0$ and

$$\frac{1}{\lambda} = \frac{1}{y} + \frac{1}{z} = \frac{1}{x} + \frac{1}{z} = \frac{1}{x} + \frac{1}{y}$$

and therefore that

$$x = y = z$$

and, finally by using the constraint equation $f(x, y, z) = 0$, that

$$x = y = z = \sqrt{\frac{A}{6}}$$

which is identical to the result of Example 4.3.17.

Usually, the proof of the Lagrange multiplier rule is based on the so called 'implicit function theorem' which itself is a consequence of the so called 'inverse mapping theorem'. The last are not considered in the course; for instance, see [63], XVIII, §4, Theorem 4.6 and XVIII, §3, Theorem 3.1. In the following, we use for the proof [76]. We remark that within the last reference there is given a more general Lagrange multiplier rule that applies also to constraints given in form of inequalities.

Theorem 4.3.19. (Lagrange multipliers) Let $n, m \in \mathbb{N}^*$ and $g, f_1, \dots, f_m$ be functions of class $C^1$ defined on some open subset $U$ of $\mathbb{R}^n$.
Finally, assume that the restriction $g|_S$ of $g$ to the constraint surface $S$, defined by

$$S := \{ x \in U : f_1(x) = \dots = f_m(x) = 0 \} \, ,$$

assumes a minimum/maximum value in $p \in S$. Then there are 'Lagrange multipliers' $\lambda_0, \dots, \lambda_m \in \mathbb{R}$ that are not all $0$ and such that

$$\lambda_0.(\nabla g)(p) + \lambda_1.(\nabla f_1)(p) + \dots + \lambda_m.(\nabla f_m)(p) = 0 \, .$$

Proof. First, we consider the case that $g|_S$ assumes a minimum value in $p$. For this, let $\varepsilon_0 > 0$ be such that the closed ball $B_{\varepsilon_0}(p)$ is contained in $U$. In addition for every $M > 0$, we define an auxiliary function $h_M : U_{\varepsilon_0}(p) \to \mathbb{R}$ of class $C^1$ by

$$h_M(x) := g(x) - g(p) + |x - p|^2 + M \sum_{k=1}^{m} f_k^2(x)$$

for every $x \in U_{\varepsilon_0}(p)$. In a first step, we conclude that for every $0 < \varepsilon \leq \varepsilon_0$, there is $M(\varepsilon) > 0$ such that

$$h_{M(\varepsilon)}(x) > 0$$

for all $x \in S_\varepsilon(p)$. Otherwise, there is $0 < \varepsilon \leq \varepsilon_0$ for which there is no $M > 0$ such that

$$h_M(x) > 0$$

for all $x \in S_\varepsilon(p)$. Hence for such $\varepsilon$ and any $N \in \mathbb{N}^*$, there is $x_N \in S_\varepsilon(p)$ such that

$$h_N(x_N) \leq 0 \qquad (4.3.8)$$

or, equivalently, such that

$$\sum_{k=1}^{m} f_k^2(x_N) \leq - \frac{1}{N} \left[ g(x_N) - g(p) + \varepsilon^2 \right] . \qquad (4.3.9)$$

Therefore, as a consequence of the boundedness and closedness of $S_\varepsilon(p)$ and by application of Bolzano-Weierstrass' Theorem 4.1.9, it follows the existence of a strictly increasing sequence $N_1, N_2, \dots$ of non-zero natural numbers such that the corresponding sequence $x_{N_1}, x_{N_2}, \dots$ is convergent to some $x_* \in S_\varepsilon(p)$. By performing the limit in (4.3.9), it follows that $x_*$ belongs to the constraint surface $S$ and hence that $g(x_*) \geq g(p)$. But, the last implies that

$$h_M(x_*) \geq \varepsilon^2$$

for every $M > 0$ and hence that (4.3.8) cannot be valid for every $N \in \mathbb{N}^*$.

Hence for the second step, let $0 < \varepsilon \leq \varepsilon_0$ and $M(\varepsilon) > 0$ be such that

$$h_{M(\varepsilon)}(x) > 0$$

for all $x \in S_\varepsilon(p)$. Then there is $x_\varepsilon \in U_\varepsilon(p)$ and a unit vector $(\lambda_0(\varepsilon), \dots, \lambda_m(\varepsilon)) \in \mathbb{R}^{m+1}$ such that

$$\lambda_0(\varepsilon).[(\nabla g)(x_\varepsilon) + 2.(x_\varepsilon - p)] + \sum_{k=1}^{m} \lambda_k(\varepsilon).(\nabla f_k)(x_\varepsilon) = 0 \, .$$

This can be proved as follows. By Theorem 4.1.14, the restriction of $h_{M(\varepsilon)}$ to $B_\varepsilon(p)$ assumes a minimum value in some point $x_\varepsilon \in B_\varepsilon(p)$.
Since $h_{M(\varepsilon)}(p) = 0$, it follows that $x_\varepsilon \in U_\varepsilon(p)$ and that

$$(\nabla h_{M(\varepsilon)})(x_\varepsilon) = (\nabla g)(x_\varepsilon) + 2.(x_\varepsilon - p) + 2M(\varepsilon). \sum_{k=1}^{m} f_k(x_\varepsilon).(\nabla f_k)(x_\varepsilon) = 0$$

which implies the above statement. In the last step, we choose a sequence $\varepsilon_1, \varepsilon_2, \dots$ in the open interval between $0$ and $\varepsilon_0$ which is convergent to $0$. In particular, we choose it such that the corresponding sequence

$$(\lambda_0(\varepsilon_1), \dots, \lambda_m(\varepsilon_1)), \, (\lambda_0(\varepsilon_2), \dots, \lambda_m(\varepsilon_2)), \dots$$

is convergent to a unit vector $(\lambda_0, \dots, \lambda_m)$ in $\mathbb{R}^{m+1}$. This is possible as a consequence of Bolzano-Weierstrass' Theorem 4.1.9. Since

$$\lim_{k \to \infty} x_{\varepsilon_k} = p \, ,$$

we conclude that

$$\lambda_0.(\nabla g)(p) + \sum_{k=1}^{m} \lambda_k.(\nabla f_k)(p) = 0 \, .$$

Finally, if $g|_S$ assumes a maximum value in $p$, then $-g|_S$ assumes a minimum value in $p$ and the statement of the theorem follows by application of the just proved result to $-g|_S$.

The following gives a standard example for the application of the Lagrange multiplier rule to the finding of the extrema of the restriction, to the unit sphere around the origin, of a quadratic form that is associated to a matrix. This leads naturally to the notion of eigenvalues of matrices and their associated eigenvectors.

Example 4.3.20. Let $n \in \mathbb{N}^*$, $(a_{kl})_{k,l \in \{1, \dots, n\}}$ be a family of real numbers and $g : \mathbb{R}^n \to \mathbb{R}$ be defined by

$$g(x) := \sum_{k,l=1}^{n} a_{kl} x_k x_l$$

for all $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. Since $S_1^n(0) = f^{-1}(\{0\})$, where $f : \mathbb{R}^n \to \mathbb{R}$ is defined by

$$f(x) := |x|^2 - 1 = -1 + \sum_{i=1}^{n} x_i^2$$

for all $x \in \mathbb{R}^n$, is compact, the restriction of $g$ to $S_1^n(0)$ assumes a minimum and a maximum. Let $x$ be a point where this restriction assumes an extremum. Then it follows by Theorem 4.3.19 the existence of real $\lambda_0, \lambda_1$ such that $\lambda_0^2 + \lambda_1^2 \neq 0$ and

$$\lambda_0.(\nabla g)(x) + \lambda_1.(\nabla f)(x) = 0 \, .$$

Since

$$g(x) = \sum_{k,l=1}^{n} a_{kl} x_k x_l = \sum_{k,l=1}^{n} a_{lk} x_l x_k = \sum_{k,l=1}^{n} a_{lk} x_k x_l = \frac{1}{2} \sum_{k,l=1}^{n} (a_{kl} + a_{lk}) \, x_k x_l$$

for all $x \in \mathbb{R}^n$, it follows that

$$(\nabla g)(x) = \left( \sum_{k=1}^{n} (a_{1k} + a_{k1}) x_k \, , \dots , \sum_{k=1}^{n} (a_{nk} + a_{kn}) x_k \right) , \quad (\nabla f)(x) = 2x \, .$$

As a consequence, $x$ satisfies the following system of equations

$$\lambda_0 \sum_{k=1}^{n} (a_{ik} + a_{ki}) x_k + 2 \lambda_1 x_i = 0 \, ,$$

for $i = 1, \dots, n$.
Further, since x 0, it follows that λ0 that the last system is equivalent to ņ k 1 for i 1, . . . , n where 1 paik 2 aki q xk 0 and hence λ xi , λ : λ1 . λ0 By introducing matrix notation, the last system is equivalent to ān1 : pakl ā11 ā1n x1 x1 (4.3.10) λ. ānn xn xn alk q{2 for k 1, . . . , n, l 1, . . . , n and the mul- where ākl tiplication sign on the left hand side of the last equation denotes matrix 621 multiplication. As a side remark, in general, if λ P R and x P Rn zt0u satisfy such a matrix equation, λ is called an eigenvalue of the matrix and x an eigenvector of the matrix corresponding to λ. By Theorem 5.3.6 from the appendix, it follows that λ satisfies λ ā 11 ā21 ān1 ā12 ā22 λ ān2 ā1n ā2n λ 0 ānn which leads on a polynomial equation for λ. After solution of that equation and substitution of the calculated values for λ into (4.3.10), the solutions of the remaining system can be easily found. Problems 1) Find the rate of change of f : D Ñ R at the point p in the direction of v. In addition, find the direction of steepest ascent / steepest descent of f in p and the associated rates. a) f px, y q : x2 2xy 3y 2 for all px, y q P D : R2 , p p1, 2q, v p2, 1q , b) f px, y q : y cospxy q for all px, y q P D : R2 , p p0, 2q, v pcospπ {3q, sinpπ {3qq , c) f px, y q : x expp2px2 y 2 qq for all px, y q P D : R2 , p p1, 0q, v p1, 3q , d) f px, y q : lnpx2 y 2 q for all px, y q P D : R2 zt0u, p p1, 1q, v p3, 3q , e) f px, y, z q : xy yz xz for all px, y, z q P D : R3 , p p1, 2, 1q, v p1, 1, 1q , f) f px, y, z q : 5x2 3xy xyz for all px, y, z q P D : R3 , p p3, 4, 5q, v p1, 1, 1q , g) f px, y, z q : xyz px{y q py {z q pz {xq for all x ¡ 0, y ¡ 0, z ¡ 0, p p2, 1, 4q, v p1, 1, 1q . 622 2) Decide whether the matrix is symmetric and in case whether it is positive definite. 
A1 : A4 : 3 7 2 4 4 1 , A2 : 1 3 4 A6 : 2 5 A1 : A4 : 6 1 9k k k A6 : 5 8 2 6 , A3 : 1 3 1 5 3 4 , A7 : 3 6 1 k k 2 3 4 1 2k 4 2 3 , A5 : 1 1 3) Decide which values of k 5 12 4 5 3 1 1 , 5 3 3 1 3 , 2 1 1 . 9 P R make the matrix positive definite. , A2 : 3 k 10 , A5 : 2 5k , A3 : k 2 2 5k 9 3 , 3 7 8 6 k 4 , A7 : k 5 2k 2 7k k 4k 4k 1 , 2 7k . 14 4) Calculate the Taylor polynomial of f : D Ñ R of total degree ¤ 2 at p, and estimate the corresponding remainder term on B. a) f px, y q : sinpx y q for all px, y q P R2 , p p0, 0q, B tpx, yq P R2 : |x| ¤ 1 ^ |y| ¤ 1u , b) f px, y q : ex y for all px, y q P R2 , p p0, 0q, B tpx, y q P R2 : |x| ¤ 1 ^ |y | ¤ 1u , c) f px, y q : p1 x y q1{2 for all px, y q P R2 such that y ¥ p1 xq, p p0, 0q, B tpx, y q P R2 : |x| ¤ 1{2 ^ |y | ¤ 1{2u , d) f px, y q : xy for all x ¡ 0, y P R, p p1, 1q, B tpx, y q P R2 : |x 1| ¤ 1{10 ^ |y 1| ¤ 1{10u . 5) Find the maximum and minimum values, so far existent, of f : D R and the points where they are assumed. If applicable, a, b P R. a) f px, y q : xp2 4y q 5x2 y 2 for px, y q P D : R2 , x b) f px, y q : xy 1 y for px, y q P D : R2 , 2 c) f px, y q : 2 2x 5x2 2y p4 x 5y q 623 Ñ for px, y q P D : R2 , d) f px, y q : x3 xy e) f px, y q : x4 y 3 for px, y q P D : R2 , y 4 2x2 for px, y q P D : R2 , pa3 {xq pa3 {yq for px, y q P D : tpx, y q P R2 : x ¡ 0 ^ y ¡ 0u , g) f px, y q : x3 y 3 9xy 27 for px, y q P D : tpx, y q P R2 : 0 ¤ x ¤ 4 ^ 0 ¤ y ¤ 4u , h) f px, y q : x4 y 4 2x2 4xy 2y 2 for px, y q P D : tpx, y q P R2 : 0 ¤ x ¤ 2 ^ 0 ¤ y ¤ 2u , i) f px, y q : epx y q pax2 by 2 q for px, y q P D : R2 , where a, b ¡ 0 , j) f px, y, z q : xyz p4a x y z q for px, y q P D : R3 , k) f px, y, z q : px3 y 3 z 3 q{pxyz q for px, y, zq P D : tpx, y, zq P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u , l) f px, y, z q : rx{py z qs ry {px z qs rz {px y qs for px, y, zq P D : tpx, y, zq P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u . 
Find the maximum and minimum values of g : D Ñ R on the set(s) f) f px, y q : x2 xy 2 6) 4xy 2y 2 y2 2 S and the points where they are assumed. Give reasons for the existence of such values. a) f px, y q : x2 2xy y 2 for px, y q P D : R2 , on S : tpx, y q P R2 : x2 2x b) c) d) e) f) 0u , f px, y q : x 2y for px, y q P D : R2 , on S : tpx, y q P R2 : x4 y 4 1u , f px, y q : x2 y 2 for px, y q P D : R2 , on S : tpx, y q P R2 : 3 px2 y 2 q 2xy 1u f px, y q : x2 xy y 2 for px, y q P D : R2 , on S : tpx, y q P R2 : x2 y 2 1u , f px, y q : xy for px, y q P D : R2 , on S : tpx, y q P R2 : x2 y 2 1u , f px, y, z q : xyz for px, y, z q P D : R3 , 2 y2 2 624 , on S : tpx, y, z q P R3 : x2 g) h) i) 3u , f px, y, z q : x 2y 3z for px, y, z q P D : R3 , on S1 : tpx, y, z q P R3 : x2 y 2 z 2 1u , S2 : tpx, y, z q P R3 : x 2y 3z 0u , f px, y, z q : x2 y 2 z 2 for px, y, z q P D : R3 , on S1 : tpx, y, z q P R3 : x y z 0u , S2 : tpx, y, z q P R3 : px2 y 2 z 2 q2 x2 2y 2 4z 2 u , f px, y, z q : sinpx{2q sinpy {2q sinpz {2q for px, y, z q P D : tpx, y, z q P R3 : x ¡ 0 ^ y ¡ 0 ^ z ¡ 0u , on S : tpx, y, z q P R3 : x y z π u . 2 2 y2 z2 2 7) Let p ¡ 0. Determine the triangle with largest circumscribed area and perimeter 2p. 8) Determine the point inside a quadrilateral V with minimal sum of squares of distances from the corners. 9) Determine the point inside a quadrilateral V with minimal sum of distances from the corners. 10) Determine the triangle with maximal sum of squares of side lengths and corners on a circle. 11) Let p P R3 z t0u. Determine the plane of largest distance from the origin among all planes through p. 12) Let a ¡ b ¡ c ¡ 0 and E : " 2 px, y, zq P R : xa2 3 y2 b2 z2 c2 1 * . Find the point of E that has largest distance from the origin. 625 O P r C h Fig. 174: Archimedes determination of the volume of paraboloidal solids of revolution, see text. 
4.4 Integration of Functions of Several Variables

Archimedes' determination of the volumes of paraboloidal, hyperboloidal and ellipsoidal solids of revolution can be seen as early examples of integration of functions of several variables. All these volumes are symmetric with respect to rotations around a line segment, the so called 'axis of symmetry', that is part of the body. In particular, he showed that the volume $V$ of a paraboloidal solid of revolution $P$ inscribed in a circular cylinder $C$ with radius $r$ and height $h$ is one half of the volume $V_C$ of $C$:

$$V = \frac{1}{2} V_C \, ,$$

see Fig 174. For the proof, he divides the symmetry axis into $n \in \mathbb{N}^*$ equal parts of length $h/n$. Through the points of division, $A_0, A_1, \dots, A_n$, he passes planes parallel to the base. On the circular sections that these planes cut out of the surface of the solid, he constructs inscribed and circumscribed cylindrical frustra as indicated in Fig 175. The last displays the intersections of the boundaries of the solid and the frustra with a plane containing the axis of symmetry $OA$ of the body. The points $B_0, B_1, \dots, B_n$ are intersection points of the circular sections with the plane.

Fig. 175: Archimedes' determination of the volume of paraboloidal solids of revolution, see text. The points labeled $A(i)$, $B(i)$ in the figure are denoted in the text by $A_i$ and $B_i$, respectively, where $i \in \{0, \dots, n\}$; in particular, $O = A_0 = B_0$ and $A = A_n$.

By summing the volumes of the frustra, he arrives at the inequality

$$V_C \sum_{i=1}^{n-1} \frac{[l(A_iB_i)]^2}{n r^2} = \sum_{i=1}^{n-1} \pi \, [l(A_iB_i)]^2 \, \frac{h}{n} \leq V \leq \sum_{i=1}^{n} \pi \, [l(A_iB_i)]^2 \, \frac{h}{n} = V_C \sum_{i=1}^{n} \frac{[l(A_iB_i)]^2}{n r^2} \qquad (4.4.1)$$

where $l(A_iB_i)$ denotes the length of the line segment $A_iB_i$ for $i \in \{0, \dots, n\}$ and $V_C = \pi r^2 h$ is the volume of the cylinder $C$.
Further, since the bounding curve in Fig 175 is a parabola, it follows by ancient Greek knowledge on parabolic segments, see (iii) in Example 3.5.26, that

$$\frac{i}{n} = \frac{i h / n}{h} = \frac{[l(A_iB_i)]^2}{r^2}$$

for every $i \in \{1, \dots, n\}$. Hence it follows from (4.4.1) that

$$\frac{1}{2} \left( 1 - \frac{1}{n} \right) = \frac{1}{n^2} \, \frac{(n-1) n}{2} = \frac{1}{n^2} \sum_{i=1}^{n-1} i = \sum_{i=1}^{n-1} \frac{[l(A_iB_i)]^2}{n r^2} \leq \frac{V}{V_C} \leq \sum_{i=1}^{n} \frac{[l(A_iB_i)]^2}{n r^2} = \frac{1}{n^2} \sum_{i=1}^{n} i = \frac{1}{n^2} \, \frac{n (n+1)}{2} = \frac{1}{2} \left( 1 + \frac{1}{n} \right) .$$

As a consequence,

$$\left| \frac{V}{V_C} - \frac{1}{2} \right| \leq \frac{1}{2n} \, . \qquad (4.4.2)$$

In order to conclude from the last that

$$\frac{V}{V_C} = \frac{1}{2} \, ,$$

Archimedes had to employ a usual 'double reductio ad absurdum' argument, i.e., to lead both assumptions that $V / V_C < 1/2$ and that $V / V_C > 1/2$ to a contradiction which leaves only the option that $V / V_C = 1/2$. This can be done as follows. If $| V / V_C - (1/2) | = \varepsilon$ for some $\varepsilon > 0$, it follows for $n > 1/(2\varepsilon)$ that

$$\left| \frac{V}{V_C} - \frac{1}{2} \right| = \varepsilon > \frac{1}{2n}$$

which contradicts (4.4.2). Hence the only remaining possibility is that $V / V_C = 1/2$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis.

By introduction of a Cartesian coordinate system with origin in $A$ and $z$-axis in the direction of the line segment from $A$ to $O$, we achieve that $P$ is enclosed by the $x,y$-plane and the graph of $f_P : U_r(0) \to \mathbb{R}$ defined by

$$f_P(x, y) := h \left( 1 - \frac{x^2 + y^2}{r^2} \right)$$

for all $(x, y) \in U_r(0)$. Below, the Riemann integral of $f_P$, giving the volume enclosed by the $x,y$-plane and the graph of $f_P$, will be defined essentially by a construction similar to that of Archimedes and denoted by

$$\int_{U_r(0)} f_P(x, y) \, dx dy \, .$$

Then the previous shows that

$$\int_{U_r(0)} f_P(x, y) \, dx dy = \frac{1}{2} \int_{U_r(0)} h \, dx dy$$

where the integral on the right hand side of the last equation is the Riemann integral of the constant function of value $h$ on $U_r(0)$. It is worth noting that Archimedes' inscribed and circumscribed cylindrical frustra are associated to a partitioning of the range of $f_P$ (such partitions are an important tool in Lebesgue integration) rather than to a partition of the domain. Partitions of the last type provide the basis for Riemann integration.
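The two bounds in Archimedes' argument are easy to reproduce exactly. The following short Python sketch is an illustration added here, not part of the historical argument; the function name `archimedes_bounds` is hypothetical. It uses exact rational arithmetic to sum the contributions $(1/n)(i/n)$ of the inscribed and circumscribed frustra to the ratio $V / V_C$ and compares them with $\frac{1}{2}(1 - \frac{1}{n})$ and $\frac{1}{2}(1 + \frac{1}{n})$.

```python
from fractions import Fraction


def archimedes_bounds(n):
    """Return (lower, upper) bounds for V / V_C obtained from n frusta.

    The parabola gives [l(A_i B_i)]^2 / r^2 = i / n, so the frustum with
    base radius l(A_i B_i) contributes (1/n) * (i/n) to the ratio V / V_C.
    """
    zero = Fraction(0)
    # Inscribed frusta: i = 1, ..., n-1.
    lower = sum((Fraction(i, n) for i in range(1, n)), zero) / n
    # Circumscribed frusta: i = 1, ..., n.
    upper = sum((Fraction(i, n) for i in range(1, n + 1)), zero) / n
    return lower, upper


for n in (1, 2, 10, 1000):
    lower, upper = archimedes_bounds(n)
    print(n, lower, upper, float(upper - lower))
```

The gap between the two bounds is exactly $1/n$, so both squeeze the ratio $V / V_C$ towards $1/2$ as $n$ grows, which is the content of (4.4.2).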
After this introduction, we start with natural definitions of intervals in $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \geq 2$, volume of intervals, partitions of intervals and corresponding lower and upper sums of bounded functions. In large parts, the following presentation of Riemann integration of functions of several variables is analogous to that of Calculus I for functions in one variable.

Definition 4.4.1.

(i) Let $a, b \in \mathbb{R}$ be such that $a \leq b$ and $[a, b]$ be the corresponding closed interval in $\mathbb{R}$. A partition $P$ of $[a, b]$ is an ordered sequence $(a_0, \dots, a_\nu)$ of elements of $[a, b]$, where $\nu$ is an element of $\mathbb{N}^*$, such that

$$a = a_0 \leq a_1 \leq \dots \leq a_\nu = b \, .$$

Since $(a, b)$ is such a partition of $[a, b]$, the set of all partitions of that interval is non-empty. A partition $P'$ of $[a, b]$ is called a refinement of $P$ if $P$ is a subsequence of $P'$.

(ii) Let $n \in \mathbb{N}$ be such that $n \geq 2$. A closed interval $I$ of $\mathbb{R}^n$ is the product of $n$ closed intervals $I_1, \dots, I_n$ of $\mathbb{R}$:

$$I = I_1 \times \dots \times I_n \, .$$

We define the volume $v(I)$ of $I$ as the product of the lengths $l(I_i)$ of the intervals $I_i$, $i \in \{1, \dots, n\}$:

$$v(I) := l(I_1) \cdots l(I_n) \, .$$

A partition $P$ of $I$ is a sequence $(P_1, \dots, P_n)$ consisting of partitions $P_i$ of $I_i$, $i \in \{1, \dots, n\}$. A partition $P' = (P_1', \dots, P_n')$ of $I$ is called a refinement of a partition $P = (P_1, \dots, P_n)$ of $I$ if $P_i'$ is a refinement of $P_i$ for every $i \in \{1, \dots, n\}$. A partition $P = ((a_{10}, \dots, a_{1\nu_1}), \dots, (a_{n0}, \dots, a_{n\nu_n}))$ induces a division of $I$ into, in general non-disjoint, closed subintervals

$$I = \bigcup_{j_1=0}^{\nu_1 - 1} \dots \bigcup_{j_n=0}^{\nu_n - 1} I_{j_1 \dots j_n} \, , \quad I_{j_1 \dots j_n} := [a_{1j_1}, a_{1(j_1+1)}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}] \, ,$$

for $j_1 = 0, \dots, \nu_1 - 1$; $\dots$; $j_n = 0, \dots, \nu_n - 1$. The size of $P$ is defined as the maximum of all the lengths of these subintervals. In addition, we define for every bounded function $f$ on $I$ the lower sum $L(f, P)$ and upper sum $U(f, P)$ corresponding to $P$ by

$$L(f, P) := \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} \inf \{ f(x) : x \in I_{j_1 \dots j_n} \} \, v(I_{j_1 \dots j_n}) \, ,$$

$$U(f, P) := \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} \sup \{ f(x) : x \in I_{j_1 \dots j_n} \} \, v(I_{j_1 \dots j_n}) \, .$$

Note that if $K > 0$ is such that $|f(x)| \leq K$ for all $x \in I$, it follows that

$$-K \leq \inf \{ f(x) : x \in J \} \leq \sup \{ f(x) : x \in J \} \leq K$$

for every non-empty subset $J$ of $I$ and hence that

$$\begin{aligned}
|L(f, P)| &\leq \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} | \inf \{ f(x) : x \in I_{j_1 \dots j_n} \} | \, v(I_{j_1 \dots j_n}) \leq K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} v(I_{j_1 \dots j_n}) \\
&= K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} (a_{1(j_1+1)} - a_{1j_1}) \cdots (a_{n(j_n+1)} - a_{nj_n}) = K v(I) \, ,
\end{aligned}$$

$$|U(f, P)| \leq \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} | \sup \{ f(x) : x \in I_{j_1 \dots j_n} \} | \, v(I_{j_1 \dots j_n}) \leq K \sum_{j_1=0}^{\nu_1 - 1} \dots \sum_{j_n=0}^{\nu_n - 1} v(I_{j_1 \dots j_n}) = K v(I) \, .$$

As a consequence, the sets

$$\{ L(f, P) : P \in \mathcal{P} \} \, , \quad \{ U(f, P) : P \in \mathcal{P} \}$$

are bounded where $\mathcal{P}$ denotes the set of all partitions of $I$.

Fig. 176: Divisions of $[0, 1] \times [0, 1]$ induced by $P_0$ and $P_1$, respectively, see Example 4.4.2.

Example 4.4.2. Consider the closed interval $I := [0, 1] \times [0, 1]$ in $\mathbb{R}^2$ and the continuous function $f : I \to \mathbb{R}$ defined by $f(x, y) := x$ for all $(x, y) \in I$. Then

$$P_0 := ((0, 1), (0, 1)) \, , \quad P_1 := ((0, 1/2, 1), (0, 1/2, 1))$$

are partitions of $I$. Also, $P_1$ is a refinement of $P_0$. Finally,

$$L(f, P_0) = 0 \cdot 1 = 0 \, , \quad U(f, P_0) = 1 \cdot 1 = 1 \, ,$$

$$L(f, P_1) = 0 \cdot \left( \frac{1}{2} \right)^2 + 0 \cdot \left( \frac{1}{2} \right)^2 + \frac{1}{2} \cdot \left( \frac{1}{2} \right)^2 + \frac{1}{2} \cdot \left( \frac{1}{2} \right)^2 = \frac{1}{4} \, ,$$

$$U(f, P_1) = \frac{1}{2} \cdot \left( \frac{1}{2} \right)^2 + \frac{1}{2} \cdot \left( \frac{1}{2} \right)^2 + 1 \cdot \left( \frac{1}{2} \right)^2 + 1 \cdot \left( \frac{1}{2} \right)^2 = \frac{3}{4}$$

and hence

$$L(f, P_0) \leq L(f, P_1) \leq U(f, P_1) \leq U(f, P_0) \, .$$

Intuitively, it is to be expected that a refinement of a partition of an interval leads to a decrease of corresponding upper sums and an increase of corresponding lower sums as has also been found in the special case in the previous example. Indeed, this intuition is correct.

Lemma 4.4.3. Let $n \in \mathbb{N}$ be such that $n \geq 2$, $I = I_1 \times \dots \times I_n$ be a closed interval of $\mathbb{R}^n$ and $f$ be a bounded real-valued function on $I$. Further, let $P = (P_1, \dots, P_n)$, $P' = (P_1', \dots, P_n')$ be partitions of $I$, and in particular let $P'$ be a refinement of $P$. Then

$$L(f, P) \leq L(f, P') \leq U(f, P') \leq U(f, P) \, . \qquad (4.4.3)$$

Proof.
The middle inequality is obvious from the definition of lower and upper sums given in Definition 4.4.1(ii). For the proof of the remaining inequalities, it is sufficient (by the method of induction) to assume that there is $i_0 \in \{1, \dots, n\}$ such that $P_i' = P_i$ for $i \ne i_0$. For simplicity of notation, we assume $i_0 = 1$ and $P_1' = (a_{10}, a_{11}', a_{11}, \dots, a_{1\nu_1})$, where $a_{11}' \in I_1$ is such that
\[ a_{10} \le a_{11}' \le a_{11} . \]
Here we again simplified for notational reasons. Then
\[ L(f,P') - L(f,P) = \sum_{j_2=0}^{\nu_2-1} \cdots \sum_{j_n=0}^{\nu_n-1} \Big( \inf\{f(x) : x \in [a_{10}, a_{11}'] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]\} \, v([a_{10}, a_{11}'] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) \]
\[ + \inf\{f(x) : x \in [a_{11}', a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]\} \, v([a_{11}', a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) \]
\[ - \inf\{f(x) : x \in [a_{10}, a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]\} \, v([a_{10}, a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) \Big) \]
\[ \ge \sum_{j_2=0}^{\nu_2-1} \cdots \sum_{j_n=0}^{\nu_n-1} \inf\{f(x) : x \in [a_{10}, a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]\} \, \Big( v([a_{10}, a_{11}'] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) \]
\[ + v([a_{11}', a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) - v([a_{10}, a_{11}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}]) \Big) = 0 . \]
Analogously, it follows that
\[ U(f,P') - U(f,P) \le 0 \]
and hence, finally, (4.4.3).

As a consequence of their definition, lower sums are smaller than or equal to upper sums. It is not difficult to show that the same is true for the supremum of the lower sums and the infimum of the upper sums.

Theorem 4.4.4. Let $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$. Then
\[ \sup(\{L(f,P) : P \in \mathcal{P}\}) \le \inf(\{U(f,P) : P \in \mathcal{P}\}) . \tag{4.4.4} \]

Proof. By Lemma 4.4.3, it follows for all $P_1, P_2 \in \mathcal{P}$ that
\[ L(f,P_1) \le L(f,P) \le U(f,P) \le U(f,P_2) , \]
where $P \in \mathcal{P}$ is some corresponding common refinement, and hence that
\[ \sup(\{L(f,P_1) : P_1 \in \mathcal{P}\}) \le U(f,P_2) \]
and hence (4.4.4).

As a consequence of Lemma 4.4.3 and since every partition $P$ of some closed interval $I$ of $\mathbb{R}^n$ is a refinement of the trivial partition containing only the coordinates of the initial and endpoints, we can make the following definition.

Definition 4.4.5.
(The Riemann integral, I) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$, and denote by $\mathcal{P}$ the set consisting of all partitions of $I$. We say that $f$ is Riemann-integrable on $I$ if
\[ \sup(\{L(f,P) : P \in \mathcal{P}\}) = \inf(\{U(f,P) : P \in \mathcal{P}\}) . \]
In that case, we define the integral of $f$ on $I$ by
\[ \int_I f \, dv := \sup(\{L(f,P) : P \in \mathcal{P}\}) = \inf(\{U(f,P) : P \in \mathcal{P}\}) . \]
We also sometimes use the notation
\[ \int_I f(x_1, \dots, x_n) \, dx_1 \dots dx_n \]
for the integral, indicating a Cartesian coordinate system. In particular, if $f(x) \ge 0$ for all $x \in I$, we define the volume under the graph of $f$ by
\[ \int_I f \, dv . \]

Example 4.4.6. Let $f$ be a constant function of value $a \in \mathbb{R}$ on some closed interval $I$ of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$. In particular, $f$ is bounded. Further, let
\[ P = ((a_{10}, \dots, a_{1\nu_1}), \dots, (a_{n0}, \dots, a_{n\nu_n})) \]
be a partition of $I$ and
\[ I = \bigcup_{j_1=0}^{\nu_1-1} \cdots \bigcup_{j_n=0}^{\nu_n-1} I_{j_1 \dots j_n} , \qquad I_{j_1 \dots j_n} := [a_{1j_1}, a_{1(j_1+1)}] \times \dots \times [a_{nj_n}, a_{n(j_n+1)}] , \]
for $j_1 = 0, \dots, \nu_1 - 1$; \dots; $j_n = 0, \dots, \nu_n - 1$, be the induced division of $I$ into closed subintervals of $I$. Then
\[ L(f,P) = U(f,P) = \sum_{j_1=0}^{\nu_1-1} \cdots \sum_{j_n=0}^{\nu_n-1} a \, v(I_{j_1 \dots j_n}) = a \sum_{j_1=0}^{\nu_1-1} \cdots \sum_{j_n=0}^{\nu_n-1} (a_{1(j_1+1)} - a_{1j_1}) \cdots (a_{n(j_n+1)} - a_{nj_n}) = a \, v(I) . \]
Hence $f$ is Riemann-integrable and
\[ \int_I f \, dv = a \, v(I) . \]

Note that, by the same reasoning, the integral of every bounded function defined on an interval with one vanishing side is zero, since all induced subintervals of such an interval have zero volume. The values of the function on such an interval do not affect the value of the integral. This observation will lead further down to the definition of so called zero sets.

Example 4.4.7. Consider the closed interval $I := [0,1]^2$ of $\mathbb{R}^2$ and the function $f : I \to \mathbb{R}$ defined by
\[ f(x,y) := xy \]
for all $(x,y) \in I$. Since
\[ |f(x,y)| = |x| \cdot |y| \le 1 \]
for all $(x,y) \in I$, $f$ is bounded. For every $n \in \mathbb{N}^*$, define the partition $P_n$ of $I$ by
\[ P_n := \left( \left( 0, \frac{1}{n}, \dots, \frac{n}{n} \right) , \left( 0, \frac{1}{n}, \dots, \frac{n}{n} \right) \right) . \]
Calculate $L(f,P_n)$ and $U(f,P_n)$ for all $n \in \mathbb{N}^*$.
What is the value of
\[ \int_I f \, dv \, ? \]
Solution: We have
\[ I = \bigcup_{j_1=0}^{n-1} \bigcup_{j_2=0}^{n-1} \left[ \frac{j_1}{n}, \frac{j_1+1}{n} \right] \times \left[ \frac{j_2}{n}, \frac{j_2+1}{n} \right] \]
and
\[ L(f,P_n) = \sum_{j_1=0}^{n-1} \sum_{j_2=0}^{n-1} \frac{j_1 j_2}{n^2} \cdot \frac{1}{n^2} = \frac{1}{n^4} \left( \sum_{j_1=0}^{n-1} j_1 \right) \left( \sum_{j_2=0}^{n-1} j_2 \right) = \frac{1}{n^4} \left[ \frac{n(n-1)}{2} \right]^2 = \frac{1}{4} \left( 1 - \frac{1}{n} \right)^2 , \]
\[ U(f,P_n) = \sum_{j_1=0}^{n-1} \sum_{j_2=0}^{n-1} \frac{(j_1+1)(j_2+1)}{n^2} \cdot \frac{1}{n^2} = \frac{1}{n^4} \left( \sum_{j_1=1}^{n} j_1 \right) \left( \sum_{j_2=1}^{n} j_2 \right) = \frac{1}{n^4} \left[ \frac{n(n+1)}{2} \right]^2 = \frac{1}{4} \left( 1 + \frac{1}{n} \right)^2 . \]
Hence
\[ \lim_{n \to \infty} L(f,P_n) = \lim_{n \to \infty} U(f,P_n) = \frac{1}{4} . \]
As a consequence, it follows that
\[ \frac{1}{4} \le \sup(\{L(f,P) : P \in \mathcal{P}\}) \quad \text{and} \quad \inf(\{U(f,P) : P \in \mathcal{P}\}) \le \frac{1}{4} \]
and hence by Theorem 4.4.4 that
\[ \sup(\{L(f,P) : P \in \mathcal{P}\}) = \inf(\{U(f,P) : P \in \mathcal{P}\}) = \frac{1}{4} . \]
Hence $f$ is Riemann-integrable and
\[ \int_I f \, dv = \frac{1}{4} . \]
Note that the product
\[ \int_0^1 x \, dx \cdot \int_0^1 y \, dy \]
gives the same value. That this is not just accidental will be seen later on. This result can be obtained by application of Fubini's theorem given below.

In the past, we have seen many examples where special properties of functions, such as continuity, differentiability and integrability, are automatically 'transferred' to sums, products and quotients. This fact also considerably simplified the decision whether a given function is continuous, differentiable or integrable. In many cases, this is an obvious consequence of the continuity, differentiability or integrability of elementary functions. For this reason, it is natural to ask whether multiples, sums, products and quotients of integrable functions of several variables are integrable as well. Indeed, this is the case for multiples and sums, as stated in the theorem below.

Within the definition of Riemann-integrability of functions of several variables above, we also defined the volume under the graph of a positive integrable function in terms of its integral. This is reasonable in view of applications only if that integral is positive. This positivity is a simple consequence of the positivity of the lower sums of such functions.

Theorem 4.4.8.
Let $n \in \mathbb{N}$ be such that $n \ge 2$, $f, g$ be bounded and Riemann-integrable on some closed interval $I$ of $\mathbb{R}^n$ and $c \in \mathbb{R}$. Then $f + g$ and $cf$ are bounded and Riemann-integrable on $I$ and
\[ \int_I (f + g) \, dv = \int_I f \, dv + \int_I g \, dv , \qquad \int_I cf \, dv = c \int_I f \, dv . \]
If $f$ is in addition positive, then
\[ \int_I f \, dv \ge 0 . \]

Proof. In the following, we denote by $\mathcal{P}$ the set of all partitions of $I$. First, if $M_1 > 0$ and $M_2 > 0$ are such that $|f(x)| \le M_1$ and $|g(x)| \le M_2$ for all $x \in I$, then
\[ |(f+g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 , \qquad |(cf)(x)| = |c f(x)| = |c| \cdot |f(x)| \le |c| M_1 \]
for all $x \in I$ and hence $f + g$ and $cf$ are bounded for every $c \in \mathbb{R}$. Second, it follows for every subinterval $J$ of $I$ that
\[ \inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} \le f(x) + g(x) = (f+g)(x) , \]
\[ (f+g)(x) = f(x) + g(x) \le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\} \]
for all $x \in J$ and hence that
\[ \inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} \le \inf\{(f+g)(x) : x \in J\} \le \sup\{(f+g)(x) : x \in J\} \le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\} . \]
Hence it follows for every partition $P$ of $I$ that
\[ L(f,P) + L(g,P) \le L(f+g,P) \le U(f+g,P) \le U(f,P) + U(g,P) . \]
If $\nu \in \mathbb{N}^*$, by refining partitions, we can construct $P_\nu \in \mathcal{P}$ such that
\[ \int_I f \, dv - \frac{1}{2\nu} \le L(f,P_\nu) , \qquad U(f,P_\nu) \le \int_I f \, dv + \frac{1}{2\nu} , \]
\[ \int_I g \, dv - \frac{1}{2\nu} \le L(g,P_\nu) , \qquad U(g,P_\nu) \le \int_I g \, dv + \frac{1}{2\nu} . \]
Hence
\[ \int_I f \, dv + \int_I g \, dv - \frac{1}{\nu} \le L(f+g,P_\nu) \le U(f+g,P_\nu) \le \int_I f \, dv + \int_I g \, dv + \frac{1}{\nu} \]
and
\[ \int_I f \, dv + \int_I g \, dv - \frac{1}{\nu} \le \sup\{L(f+g,P) : P \in \mathcal{P}\} \le \inf\{U(f+g,P) : P \in \mathcal{P}\} \le \int_I f \, dv + \int_I g \, dv + \frac{1}{\nu} . \]
Since the last is true for every $\nu \in \mathbb{N}^*$, we conclude that
\[ \sup\{L(f+g,P) : P \in \mathcal{P}\} = \inf\{U(f+g,P) : P \in \mathcal{P}\} = \int_I f \, dv + \int_I g \, dv . \]
Hence $f + g$ is Riemann-integrable and
\[ \int_I (f+g) \, dv = \int_I f \, dv + \int_I g \, dv . \]
Further, if $c \ge 0$, it follows for every subinterval $J$ of $I$ that
\[ \inf\{cf(x) : x \in J\} = c \inf\{f(x) : x \in J\} , \qquad \sup\{cf(x) : x \in J\} = c \sup\{f(x) : x \in J\} \]
and hence that
\[ L(cf,P) = c \, L(f,P) , \qquad U(cf,P) = c \, U(f,P) \]
for every partition $P$ of $I$.
The last implies that
\[ \sup\{L(cf,P) : P \in \mathcal{P}\} = c \sup\{L(f,P) : P \in \mathcal{P}\} = c \int_I f \, dv , \]
\[ \inf\{U(cf,P) : P \in \mathcal{P}\} = c \inf\{U(f,P) : P \in \mathcal{P}\} = c \int_I f \, dv . \]
If $c \le 0$, it follows for every subinterval $J$ of $I$ that
\[ \inf\{cf(x) : x \in J\} = c \sup\{f(x) : x \in J\} , \qquad \sup\{cf(x) : x \in J\} = c \inf\{f(x) : x \in J\} \]
and hence that
\[ L(cf,P) = c \, U(f,P) , \qquad U(cf,P) = c \, L(f,P) \]
for every partition $P$ of $I$. The last implies that
\[ \sup\{L(cf,P) : P \in \mathcal{P}\} = c \inf\{U(f,P) : P \in \mathcal{P}\} = c \int_I f \, dv , \]
\[ \inf\{U(cf,P) : P \in \mathcal{P}\} = c \sup\{L(f,P) : P \in \mathcal{P}\} = c \int_I f \, dv . \]
Hence it follows in both cases that $cf$ is Riemann-integrable and
\[ \int_I cf \, dv = c \int_I f \, dv . \]
Finally, if $f$ is such that $f(x) \ge 0$ for all $x \in I$, then
\[ \inf\{f(x) : x \in J\} \ge 0 \]
for all subintervals $J$ of $I$ and hence
\[ L(f,P) \ge 0 \]
for every partition $P$ of $I$. As a consequence,
\[ \int_I f \, dv = \sup\{L(f,P) : P \in \mathcal{P}\} \ge 0 . \]

The Riemann integral can be viewed as a map into the real numbers with domain given by the set of bounded Riemann-integrable functions over some closed interval $I$ of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$. According to the previous theorem, that map is 'linear', i.e., the integral of the sum of such functions is equal to the sum of their corresponding integrals, and the integral of a scalar multiple of such a function is given by that multiple of the integral of that function. In addition, it is positive, in the sense that it maps such functions which are in addition positive, i.e., which assume only positive ($\ge 0$) values, into a positive real number. It is easy to see that the linearity and positivity of the map also imply its monotony, i.e., if such functions $f$ and $g$ satisfy $f \le g$, defined by $f(x) \le g(x)$ for all $x \in I$, then the integral of $f$ is smaller than or equal to the integral of $g$.

Corollary 4.4.9. (Monotony of the integral) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $f, g$ be bounded and Riemann-integrable on some closed interval $I$ of $\mathbb{R}^n$, and in addition let $f(x) \le g(x)$ for all $x \in I$. Then
\[ \int_I f \, dv \le \int_I g \, dv . \]

Proof.
For this, we define the auxiliary function $h : I \to \mathbb{R}$ by $h(x) := g(x) - f(x)$ for all $x \in I$. According to the previous theorem, $h$ is bounded and Riemann-integrable. Finally, since $f(x) \le g(x)$ for all $x \in I$, it follows that $h(x) \ge 0$ for all $x \in I$. Hence it follows by the linearity and positivity of the integral that
\[ 0 \le \int_I h \, dv = \int_I g \, dv + \int_I [-f] \, dv = \int_I g \, dv - \int_I f \, dv \]
and hence that
\[ \int_I f \, dv \le \int_I g \, dv . \]

We have seen that the integral of every bounded function defined on an interval with one vanishing side is zero. The values of the function on such an interval do not affect the value of the integral. The reason behind this behavior is, of course, the fact that we defined the volume of an interval as the product of its side lengths. Hence the volume of an interval with one vanishing side is zero. Such intervals are examples of so called negligible sets, which are similar to the zero sets defined in connection with Riemann integration for functions in one variable. The values assumed by a function on a negligible set do not influence the value of the integral. The following definition uses the intuition that such sets should have, in some sense, a vanishing volume.

Definition 4.4.10. (Negligible sets) A subset $K$ of $\mathbb{R}^n$ is said to be negligible if for every $\varepsilon > 0$ there exists a finite number of closed intervals $I_1, \dots, I_\nu$ of $\mathbb{R}^n$ whose union contains $K$ and which is such that
\[ v(I_1) + \dots + v(I_\nu) < \varepsilon . \]

Example 4.4.11. Any interval of $\mathbb{R}^n$ with at least one side of vanishing length is negligible.

Remark 4.4.12. Obviously, negligible subsets are bounded, the closure of a negligible subset is negligible and finite unions of negligible subsets are also negligible.

The proofs of the remaining theorems in this section use either more advanced knowledge of topological properties of subsets of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, than developed in this course or use the inverse mapping theorem for vector-valued functions in several variables which was not considered in the previous section.
For this reason, these proofs will not be given in the following, but can be found in [63].

Intuitively, an interval of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, with at least one side of vanishing length is of 'lower dimension' than $n$. Hence, it might be suspected that also other 'lower dimensional' subsets of $\mathbb{R}^n$ could be negligible, such as parts of curves in $\mathbb{R}^2$ or parts of surfaces in $\mathbb{R}^3$. Indeed, this intuition is correct if such sets are images of bounded subsets of $\mathbb{R}^m$, where $m \in \mathbb{N}^*$ is such that $1 \le m < n$, under maps of class $C^1$, as detailed in the following theorem.

Theorem 4.4.13. Let $m, n \in \mathbb{N}^*$, $B$ be a bounded subset of $\mathbb{R}^m$ and $U$ an open subset of $\mathbb{R}^m$ containing $B$. Finally, let $n > m$ and $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., such that each of its component functions is of class $C^1$. Then $f(B)$ is negligible.

Proof. See [63], XX, §2, Proposition 2.2.

Example 4.4.14. Show that
\[ S_r^1(0) = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 = r^2\} , \]
where $r \ge 0$, is a negligible subset of $\mathbb{R}^2$.

Solution: For this, we define $f : (-2\pi, 2\pi) \to \mathbb{R}^2$ by
\[ f(t) := (r \cos t, r \sin t) \]
for every $t \in (-2\pi, 2\pi)$. Then $f$ is of class $C^1$ and $\mathrm{Ran}(f) = S_r^1(0)$. Hence according to Theorem 4.4.13, $S_r^1(0)$ is a negligible subset of $\mathbb{R}^2$.

So far, we proved existence of the integral only in a few simple cases. The following theorem gives a criterion for the Riemann-integrability of a function which is sufficient for most applications.

Theorem 4.4.15. (Existence of Riemann integrals) Let $n \in \mathbb{N}$ be such that $n \ge 2$.

(i) Let $f$ be a bounded real-valued function on some closed interval $I$ of $\mathbb{R}^n$. Moreover, let $f$ be continuous in all points of $I$, except from points of a negligible subset of $I$. Then $f$ is Riemann-integrable on $I$.

(ii) If $g$ is some bounded function on $I$ such that $f(x) = g(x)$ for all $x \in I$, except from points of a negligible subset of $I$, then $g$ is Riemann-integrable on $I$ and
\[ \int_I f \, dv = \int_I g \, dv . \]

Proof. See [63], XX, §1, Theorem 1.3.
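As a numerical illustration of Definition 4.4.10, Example 4.4.14 and Theorem 4.4.15 (i), the following Python sketch computes lower and upper sums of the function that is 1 on the closed unit disk and 0 elsewhere, for uniform partitions of $[-1,1]^2$. This function is discontinuous exactly on the circle $S_1^1(0)$. The function name `disk_sums` and the restriction to uniform partitions are assumptions made only for this illustration; they do not appear in the text. The difference of upper and lower sum is the total volume of those subintervals that meet the disk without being contained in it. Each of these subintervals meets the circle, so the decay of this difference under refinement reflects the negligibility of the circle.

```python
import math


def disk_sums(n, r=1.0):
    """Lower and upper sum of the indicator function of the closed disk
    x^2 + y^2 <= r^2 for the uniform partition of [-r, r]^2 into n * n
    subintervals (hypothetical helper, for illustration only)."""
    h = 2.0 * r / n          # side length of every subinterval
    lower = 0.0
    upper = 0.0
    for j1 in range(n):
        x0, x1 = -r + j1 * h, -r + (j1 + 1) * h
        for j2 in range(n):
            y0, y1 = -r + j2 * h, -r + (j2 + 1) * h
            # Squared distance from the origin to the farthest corner.
            far2 = max(abs(x0), abs(x1)) ** 2 + max(abs(y0), abs(y1)) ** 2
            # Squared distance from the origin to the nearest point of the
            # subinterval (clamp the origin into the subinterval).
            nx = min(max(x0, 0.0), x1)
            ny = min(max(y0, 0.0), y1)
            near2 = nx * nx + ny * ny
            if far2 <= r * r:    # subinterval inside the disk: infimum is 1
                lower += h * h
            if near2 <= r * r:   # subinterval meets the disk: supremum is 1
                upper += h * h
    return lower, upper


if __name__ == "__main__":
    for n in (10, 40, 160):
        lo, up = disk_sums(n)
        print(n, lo, up, up - lo)
```

Since the partition for $n = 40$ refines the one for $n = 10$, and the one for $n = 160$ refines the one for $n = 40$, the printed lower sums increase and the upper sums decrease, in agreement with Lemma 4.4.3, while both enclose the value $\pi$.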
Since
\[ |f(x)| = \sqrt{[f(x)]^2} \]
for every $x \in I$, if $f$ is a bounded function on some closed interval $I$ of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, which is continuous in all points of $I$, except from points from a negligible subset of $I$, we conclude by application of the previous theorem that also $|f|$ is bounded and Riemann-integrable. Since
\[ -|f(x)| \le f(x) \le |f(x)| \]
for all $x \in I$, it follows by the monotony of the Riemann integral, Corollary 4.4.9, that
\[ -\int_I |f| \, dv \le \int_I f \, dv \le \int_I |f| \, dv \]
and hence that
\[ \left| \int_I f \, dv \right| \le \int_I |f| \, dv . \]
The last estimate is frequently applied. As a consequence, we proved the following theorem.

Theorem 4.4.16. Let $n \in \mathbb{N}$ be such that $n \ge 2$ and $f$ be bounded on some closed interval $I$ of $\mathbb{R}^n$. Further, let $f$ be continuous in all points of $I$, except from points in a negligible subset of $I$. Then $|f|$ is bounded and Riemann-integrable and
\[ \left| \int_I f \, dv \right| \le \int_I |f| \, dv . \]

Example 4.4.17. Let $f : \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 \le r^2\} \to \mathbb{R}$ be some continuous function, where $r \ge 0$. Define $\hat{f} : [-r,r]^2 \to \mathbb{R}$ by
\[ \hat{f}(x,y) := \begin{cases} f(x,y) & \text{for } (x,y) \in D(f) \\ 0 & \text{for } (x,y) \in [-r,r]^2 \setminus D(f) . \end{cases} \]
Then $\hat{f}$ is everywhere continuous, except possibly on $S_r^1(0)$, which is according to Example 4.4.14 a negligible subset of $\mathbb{R}^2$. Hence according to Theorem 4.4.15, $\hat{f}$ is Riemann-integrable on $[-r,r]^2$.

In Example 4.4.7, we have seen that
\[ \int_{[0,1]^2} xy \, dx \, dy = \int_0^1 x \, dx \cdot \int_0^1 y \, dy . \]
This result can be obtained by a simple application of the following theorem of Guido Fubini. This theorem is of major importance for the evaluation of integrals in $\mathbb{R}^{n+m}$, where $n, m \in \mathbb{N}^*$, since it reduces that evaluation to the calculation of such integrals in $\mathbb{R}^n$ and $\mathbb{R}^m$. If applicable, by successive application of the theorem, the evaluation of integrals in $\mathbb{R}^n$, $n \in \mathbb{N}$ such that $n \ge 2$, can be reduced to the calculation of integrals for functions in one variable. For the evaluation of the last, the powerful fundamental theorem of calculus is available.

Theorem 4.4.18.
(Fubini's Theorem) Let $m, n \in \mathbb{N}^*$ and $I, J$ be closed intervals of $\mathbb{R}^m$ and $\mathbb{R}^n$, respectively. Further, let $f : I \times J \to \mathbb{R}$ be Riemann-integrable on $I \times J$. Finally, let $f(x, \cdot)$ be Riemann-integrable on $J$ for all $x \in I$, except on a negligible subset of $I$. Then the function on $I$ which associates to every $x \in I$ the value
\[ \int_J f(x,y) \, dy \]
is Riemann-integrable on $I$ and
\[ \int_{I \times J} f(x,y) \, dx \, dy = \int_I \left( \int_J f(x,y) \, dy \right) dx . \]

Proof. See [63], XX, §3, Theorem 3.1.

Example 4.4.19. Let $r > 0$. Define $f : [-r,r]^2 \to \mathbb{R}$ by $f(x,y) := 1$ if $x^2 + y^2 \le r^2$ and $0$ otherwise. According to Example 4.4.17, $f$ is Riemann-integrable on $[-r,r]^2$, and we conclude by Theorems 4.4.18, 4.4.15 that
\[ \int_{[-r,r]^2} f(x,y) \, dx \, dy = \int_{-r}^{r} \left( \int_{-r}^{r} f(x,y) \, dy \right) dx = \int_{-r}^{r} \left( \int_{-\sqrt{r^2 - x^2}}^{\sqrt{r^2 - x^2}} dy \right) dx \]
\[ = 2 \int_{-r}^{r} \sqrt{r^2 - x^2} \, dx = \left[ x \sqrt{r^2 - x^2} + r^2 \arcsin\left( \frac{x}{r} \right) \right]_{-r}^{r} = \pi r^2 , \]
which is the area of a circular disk of radius $r$. Note that this result can be achieved only by knowledge of the values of $f$ on $B_r^2(0)$. Hence it appears natural to define
\[ \int_{B_r^2(0)} dx \, dy := \int_{[-r,r]^2} f(x,y) \, dx \, dy , \]
since $f$ is the unique extension of the constant function of value $1$ on $B_r^2(0)$ to a function on $[-r,r]^2$ which is constant of value zero on $[-r,r]^2 \setminus B_r^2(0)$. Also, it is obvious that if $g$ is an analogous extension of the constant function of value $1$ on $B_r^2(0)$ to some interval $I \supset B_r^2(0)$, it follows that
\[ \int_I g(x,y) \, dx \, dy = \int_{[-r,r]^2} f(x,y) \, dx \, dy , \]
as it should be, since the symbol
\[ \int_{B_r^2(0)} dx \, dy \]
does not contain any reference to an interval or an extension of the integrand. This suggests the following definition.

Definition 4.4.20. (The Riemann integral, II) Let $n \in \mathbb{N}$ be such that $n \ge 2$, $\Omega$ be a bounded subset of $\mathbb{R}^n$ whose boundary is negligible and $f$ be a bounded function on $\Omega$. In addition, let $I \supset \Omega$ be a bounded closed interval, and let $\hat{f} : I \to \mathbb{R}$ defined by
\[ \hat{f}(x) := \begin{cases} f(x) & \text{if } x \in \Omega \\ 0 & \text{if } x \in I \setminus \Omega \end{cases} \]
be Riemann-integrable. Then we define
\[ \int_\Omega f \, dv := \int_I \hat{f} \, dv . \]
For the proof that this definition is independent of the interval $I$, we refer to the final part of [63], XX, §1 on admissible sets and functions. In addition, as a particular case when $f$ is constant of value $1$, we define the $n$-dimensional volume $V$ of $\Omega$ by
\[ V := \int_\Omega dv . \]

Fig. 177: Area under a graph of a function $f$. See Example 4.4.21.

The following two examples give further applications of Fubini's theorem. In particular, they indicate that previous definitions of area / volume under the graph of functions are consistent with the previous definition, Definition 4.4.20, of the $n$-dimensional volume of subsets in $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$.

Example 4.4.21. In the following, we calculate
\[ \int_\Omega dx \, dy , \]
where $\Omega \subset \mathbb{R}^2$ is the region under the graph of a function $f : [a,b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only positive ($\ge 0$) values, i.e., $\Omega$ is given by
\[ \Omega := \{(x,y) \in \mathbb{R}^2 : a \le x \le b \wedge 0 \le y \le f(x)\} , \]
see Fig 177. In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open interval $I$ of $\mathbb{R}$ containing $[a,b]$. As a consequence, the graph of $f$ is part of the image of the map $h : I \to \mathbb{R}^2$ of class $C^1$ defined by
\[ h(x) := (x, \hat{f}(x)) \]
for every $x \in I$ and hence is negligible. From this, we conclude that the boundary of $\Omega$, given by
\[ \bigcup_{i=1}^{4} B_i , \]
where
\[ B_1 := [a,b] \times \{0\} , \quad B_2 := \{b\} \times [0, f(b)] , \quad B_3 := G(f) , \quad B_4 := \{a\} \times [0, f(a)] , \]
is a negligible set. Hence it follows by Fubini's theorem, Theorem 4.4.18, that
\[ \int_\Omega dx \, dy = \int_a^b \left( \int_0^{f(x)} dy \right) dx = \int_a^b f(x) \, dx . \]
Hence the value of
\[ \int_\Omega dx \, dy \]
coincides with the area under the graph of $f$ as defined in Calculus I.

Example 4.4.22. In the following, we calculate
\[ \int_\Omega dx \, dy \, dz , \]
where $\Omega \subset \mathbb{R}^3$ is the region under the graph of a function $f : [a,b] \times [c,d] \to \mathbb{R}$, where $a, b, c, d \in \mathbb{R}$ are such that $a < b$ and $c < d$, that assumes only positive ($\ge 0$) values, i.e., $\Omega$ is given by
\[ \Omega := \{(x,y,z) \in \mathbb{R}^3 : a \le x \le b \wedge c \le y \le d \wedge 0 \le z \le f(x,y)\} , \]
see Fig 178.
In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open set $U$ of $\mathbb{R}^2$ containing $[a,b] \times [c,d]$. As a consequence, the graph of $f$ is part of the image of the map $h : U \to \mathbb{R}^3$ of class $C^1$ defined by
\[ h(x,y) := (x, y, \hat{f}(x,y)) \]
for all $(x,y) \in U$ and hence is negligible.

Fig. 178: Volume under a graph of a function $f$. See Example 4.4.22.

From this, we conclude that the boundary of $\Omega$, given by
\[ \bigcup_{i=1}^{6} B_i , \]
where
\[ B_1 := [a,b] \times [c,d] \times \{0\} , \qquad B_2 := G(f) , \]
\[ B_3 := \{(x, c, \lambda f(x,c)) : (x,\lambda) \in [a,b] \times [0,1]\} , \]
\[ B_4 := \{(x, d, \lambda f(x,d)) : (x,\lambda) \in [a,b] \times [0,1]\} , \]
\[ B_5 := \{(a, y, \lambda f(a,y)) : (y,\lambda) \in [c,d] \times [0,1]\} , \]
\[ B_6 := \{(b, y, \lambda f(b,y)) : (y,\lambda) \in [c,d] \times [0,1]\} , \]
is a negligible set. For this, note that for $x_0 \in [a,b]$, $y_0 \in [c,d]$ also the maps that associate to every $(x,\lambda)$ the value
\[ (x, y_0, \lambda \hat{f}(x,y_0)) , \]
and to every $(y,\lambda)$ the value
\[ (x_0, y, \lambda \hat{f}(x_0,y)) , \]
are defined as well as of class $C^1$ on open subsets of $\mathbb{R}^2$ containing $[a,b] \times [0,1]$ and $[c,d] \times [0,1]$, respectively. Therefore also $B_3$, $B_4$, $B_5$ and $B_6$ are negligible. Hence it follows by Fubini's theorem, Theorem 4.4.18, that
\[ \int_\Omega dx \, dy \, dz = \int_{[a,b] \times [c,d]} \left( \int_0^{f(x,y)} dz \right) dx \, dy = \int_{[a,b] \times [c,d]} f(x,y) \, dx \, dy . \]
Hence the value of
\[ \int_\Omega dx \, dy \, dz \]
coincides with the volume under the graph of $f$ as defined in Definition 4.4.5.

Often in applications, the integrand of an integral in $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \ge 2$, has a certain symmetry. In such cases, integration by change of variables is often useful. The following theorem will also play a major role in the subsequent section on generalizations of the fundamental theorem of calculus to integrals in $\mathbb{R}^n$.

Theorem 4.4.23. (Change of variable formula) Let $n \in \mathbb{N}$ be such that $n \ge 2$ and $I$ be a closed interval of $\mathbb{R}^n$ contained in some open subset $U$ of $\mathbb{R}^n$. Moreover, let $g : U \to \mathbb{R}^n$ be continuously differentiable with a continuously differentiable inverse. Finally, let $f$ be a Riemann-integrable function over $g(I)$.
Then
\[ \int_{g(I)} f \, dv = \int_I (f \circ g) \, |\det g'| \, dv , \]
where $\det g' : U \to \mathbb{R}$ is defined by
\[ (\det g')(x) := \det\left( \frac{\partial g_i}{\partial x_j}(x) \right)_{i,j = 1, \dots, n} \]
for all $x \in U$.

Proof. See [63], XX, §4, Corollary 4.6.

The following is a typical application of change of variables.

Example 4.4.24. Show that
\[ \int_{B_R^2(0)} x \, dx \, dy = \int_{B_R^2(0)} y \, dx \, dy = 0 \tag{4.4.5} \]
for every $R > 0$.

Solution: For this, let $R > 0$. We define $g_1 : \mathbb{R}^2 \to \mathbb{R}^2$ and $g_2 : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g_1(x,y) := (-x, y) , \qquad g_2(x,y) := (x, -y) \]
for all $(x,y) \in \mathbb{R}^2$. The maps $g_1$, $g_2$ are continuously differentiable with inverse $g_1$ and $g_2$, respectively. In particular,
\[ g_1(B_R^2(0)) = g_2(B_R^2(0)) = B_R^2(0) \]
and
\[ \det g_1' = \det g_2' = -1 . \]
Since $f_1 : B_R^2(0) \to \mathbb{R}$ and $f_2 : B_R^2(0) \to \mathbb{R}$, defined by $f_1(x,y) := x$ and $f_2(x,y) := y$, respectively, for every $(x,y) \in B_R^2(0)$, are continuous, it follows by Example 4.4.17 and change of variables that
\[ \int_{B_R^2(0)} x \, dx \, dy = \int_{g_1(B_R^2(0))} x \, dx \, dy = \int_{B_R^2(0)} (-x) \, dx \, dy = -\int_{B_R^2(0)} x \, dx \, dy , \]
\[ \int_{B_R^2(0)} y \, dx \, dy = \int_{g_2(B_R^2(0))} y \, dx \, dy = \int_{B_R^2(0)} (-y) \, dx \, dy = -\int_{B_R^2(0)} y \, dx \, dy , \]
and hence (4.4.5).

Also the application of change of variables in the following example is typical.

Example 4.4.25. (A basic oscillatory integral) Let $k \in \mathbb{R}$. Show that
\[ \int_0^\infty \frac{\sin(kx)}{x} \, dx = \begin{cases} \frac{\pi}{2} & \text{if } k > 0 \\ 0 & \text{if } k = 0 \\ -\frac{\pi}{2} & \text{if } k < 0 \end{cases} \tag{4.4.6} \]
and that
\[ \left| \int_0^R \frac{\sin(kx)}{x} \, dx \right| \le \frac{\pi}{2} + 1 \tag{4.4.7} \]
for every $R \ge 0$.

Solution: For this, let $R > 0$. Then it follows by application of the fundamental theorem of calculus that
\[ \int_0^R \frac{\sin(x)}{x} \, dx = -\int_0^R \frac{1}{x} \left\{ e^{-x \sin(\pi/2)} \sin(x \cos(\pi/2)) - e^{-x \sin(0)} \sin(x \cos(0)) \right\} dx = \int_0^R \left\{ \int_0^{\pi/2} h(x,\theta) \, d\theta \right\} dx , \]
where $h : [0,R] \times [0,\pi/2] \to \mathbb{R}$ is defined by
\[ h(x,\theta) := e^{-x \sin\theta} \sin(x \cos\theta) \cos\theta + \sin\theta \, e^{-x \sin\theta} \cos(x \cos\theta) \]
for all $(x,\theta) \in [0,R] \times [0,\pi/2]$. Since $h$ is continuous, $h$ is Riemann-integrable. Therefore, it follows by Fubini's theorem that
\[ \int_0^R \left\{ \int_0^{\pi/2} h(x,\theta) \, d\theta \right\} dx = \int_{[0,R] \times [0,\pi/2]} h(x,\theta) \, dx \, d\theta . \]
In addition, $g : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
\[ g(\theta,x) := (x,\theta) \]
for all $(\theta,x) \in \mathbb{R}^2$ is bijective and continuously differentiable.
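Since $g^{-1} = g$, the inverse of $g$ is continuously differentiable, too. Further,
\[ g'(\theta,x) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
and
\[ |\det(g'(\theta,x))| = 1 \]
for all $(\theta,x) \in \mathbb{R}^2$. Hence it follows by change of variables and by Fubini's theorem that
\[ \int_{[0,R] \times [0,\pi/2]} h(x,\theta) \, dx \, d\theta = \int_{[0,\pi/2] \times [0,R]} h(x,\theta) \, d\theta \, dx \]
\[ = \int_0^{\pi/2} \left\{ \int_0^R \left[ e^{-x \sin\theta} \sin(x \cos\theta) \cos\theta + \sin\theta \, e^{-x \sin\theta} \cos(x \cos\theta) \right] dx \right\} d\theta \]
\[ = \int_0^{\pi/2} \left[ 1 - e^{-R \sin\theta} \cos(R \cos\theta) \right] d\theta = \frac{\pi}{2} - \int_0^{\pi/2} e^{-R \sin\theta} \cos(R \cos\theta) \, d\theta . \]
Since $\sin\theta \ge 2\theta/\pi$ for all $\theta \in [0,\pi/2]$ and hence
\[ \left| \int_0^{\pi/2} e^{-R \sin\theta} \cos(R \cos\theta) \, d\theta \right| \le \int_0^{\pi/2} e^{-2R\theta/\pi} \, d\theta = \frac{\pi}{2R} \left( 1 - e^{-R} \right) \le \frac{\pi}{2R} , \]
we conclude that
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx - \frac{\pi}{2} \right| \le \frac{\pi}{2R} \]
and that
\[ \int_0^\infty \frac{\sin(x)}{x} \, dx = \frac{\pi}{2} . \]
Further, since
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx \right| \le \int_0^R \left| \frac{\sin(x)}{x} \right| dx \le R , \]
we arrive, by using the first estimate for $R \ge \pi/2$ and the second for $R < \pi/2$, at
\[ \left| \int_0^R \frac{\sin(x)}{x} \, dx \right| \le \frac{\pi}{2} + 1 . \]
From the previous results, we conclude (4.4.6) and (4.4.7) as follows. First, we note that (4.4.6) and (4.4.7) are trivially satisfied if $k = 0$. For $k > 0$ and $R > 0$ it follows by change of variables that
\[ \int_0^R \frac{\sin(kx)}{x} \, dx = \int_0^{kR} \frac{\sin(y)}{y} \, dy \]
and hence (4.4.6) and (4.4.7). Finally, for $k < 0$ and $R > 0$, it follows that
\[ \int_0^R \frac{\sin(kx)}{x} \, dx = -\int_0^R \frac{\sin(|k|x)}{x} \, dx \]
and hence also in this case the validity of (4.4.6) and (4.4.7).

For the application of the change of variable formula, transformations $g$ are needed that have a differentiable inverse. For this reason, we need to exclude certain sets from the domains of polar, cylindrical and spherical coordinate transformations that were included in those definitions given in Calculus II. Usually, this does not restrict their usefulness in an essential way since those sets are negligible sets.

Example 4.4.26. (Polar coordinates) Define $g : (0,\infty) \times (-\pi,\pi) \to \mathbb{R}^2$ by
\[ g(r,\varphi) := (r \cos\varphi, r \sin\varphi) \]
for all $(r,\varphi) \in (0,\infty) \times (-\pi,\pi)$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^2 \setminus ((-\infty,0] \times \{0\}) \]
and a continuously differentiable inverse $g^{-1} : \mathbb{R}^2 \setminus ((-\infty,0] \times \{0\}) \to \mathbb{R}^2$ given by
\[ g^{-1}(x,y) = \begin{cases} \left( \sqrt{x^2 + y^2} , \arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \ge 0 \\ \left( \sqrt{x^2 + y^2} , -\arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $(x,y) \in \mathrm{Ran}(g) = \mathbb{R}^2 \setminus ((-\infty,0] \times \{0\})$.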
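In particular,
\[ (\det g')(r,\varphi) = \begin{vmatrix} \cos\varphi & -r \sin\varphi \\ \sin\varphi & r \cos\varphi \end{vmatrix} = r \]
for all $(r,\varphi) \in (0,\infty) \times (-\pi,\pi)$.

Fig. 179: Domain of integration $\Omega$ in Example 4.4.27.

Example 4.4.27. Let $\varepsilon, R \in \mathbb{R}$ be such that $0 < \varepsilon < R$. Calculate
\[ \int_\Omega \frac{1}{4\pi} \ln(x^2 + y^2) \, dx \, dy , \]
where
\[ \Omega := \{(x,y) \in \mathbb{R}^2 : x \ge 0 \wedge \varepsilon^2 \le x^2 + y^2 \le R^2\} . \]
Solution: First, we note that $\Omega$ is bounded with a negligible boundary, since the last is given by the union of the negligible sets $\{0\} \times [\varepsilon,R]$, $\{0\} \times [-R,-\varepsilon]$ and subsets of the negligible sets $S_\varepsilon^1(0)$ and $S_R^1(0)$. Further, $f : \Omega \to \mathbb{R}$ defined by
\[ f(x,y) := \frac{1}{4\pi} \ln(x^2 + y^2) \]
for every $(x,y) \in \Omega$ is continuous. Hence, we conclude that $f$ is Riemann-integrable, and it follows by use of polar coordinates and application of Theorems 4.4.23, 4.4.18 that
\[ \int_\Omega \frac{1}{4\pi} \ln(x^2 + y^2) \, dx \, dy = \frac{1}{2\pi} \int_{[\varepsilon,R] \times [-\pi/2,\pi/2]} r \ln(r) \, dr \, d\varphi = \frac{1}{2} \int_\varepsilon^R r \ln(r) \, dr \]
\[ = \frac{1}{2} \left[ \frac{r^2}{2} \left( \ln(r) - \frac{1}{2} \right) \right]_\varepsilon^R = \frac{R^2}{4} \left( \ln(R) - \frac{1}{2} \right) - \frac{\varepsilon^2}{4} \left( \ln(\varepsilon) - \frac{1}{2} \right) . \]

Example 4.4.28. (Cylindrical coordinates) Define $g : (0,\infty) \times (-\pi,\pi) \times \mathbb{R} \to \mathbb{R}^3$ by
\[ g(r,\varphi,z) := (r \cos\varphi, r \sin\varphi, z) \]
for all $(r,\varphi,z) \in (0,\infty) \times (-\pi,\pi) \times \mathbb{R}$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}) \]
and a continuously differentiable inverse $g^{-1} : \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}) \to \mathbb{R}^3$ given by
\[ g^{-1}(x,y,z) = \begin{cases} \left( \sqrt{x^2 + y^2} , \arccos\left( x / \sqrt{x^2 + y^2} \right) , z \right) & \text{if } y \ge 0 \\ \left( \sqrt{x^2 + y^2} , -\arccos\left( x / \sqrt{x^2 + y^2} \right) , z \right) & \text{if } y < 0 \end{cases} \]
for all $(x,y,z) \in \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R})$. In particular,
\[ (\det g')(r,\varphi,z) = \begin{vmatrix} \cos\varphi & -r \sin\varphi & 0 \\ \sin\varphi & r \cos\varphi & 0 \\ 0 & 0 & 1 \end{vmatrix} = r \]
for all $(r,\varphi,z) \in (0,\infty) \times (-\pi,\pi) \times \mathbb{R}$.

Example 4.4.29. (Spherical coordinates) Define $g : (0,\infty) \times (0,\pi) \times (-\pi,\pi) \to \mathbb{R}^3$ by
\[ g(r,\theta,\varphi) := (r \sin\theta \cos\varphi, r \sin\theta \sin\varphi, r \cos\theta) \]
for all $(r,\theta,\varphi) \in (0,\infty) \times (0,\pi) \times (-\pi,\pi)$. Then $g$ is continuously differentiable with
\[ \mathrm{Ran}(g) = \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}) \]
and a continuously differentiable inverse $g^{-1} : \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}) \to \mathbb{R}^3$ given by
\[ g^{-1}(\mathbf{r}) = \begin{cases} \left( |\mathbf{r}| , \arccos(z/|\mathbf{r}|) , \arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y \ge 0 \\ \left( |\mathbf{r}| , \arccos(z/|\mathbf{r}|) , -\arccos\left( x / \sqrt{x^2 + y^2} \right) \right) & \text{if } y < 0 \end{cases} \]
for all $\mathbf{r} = (x,y,z) \in \mathbb{R}^3 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R})$. In particular,
\[ (\det g')(r,\theta,\varphi) = \begin{vmatrix} \sin\theta \cos\varphi & r \cos\theta \cos\varphi & -r \sin\theta \sin\varphi \\ \sin\theta \sin\varphi & r \cos\theta \sin\varphi & r \sin\theta \cos\varphi \\ \cos\theta & -r \sin\theta & 0 \end{vmatrix} = r^2 \sin\theta \]
for all $(r,\theta,\varphi) \in (0,\infty) \times (0,\pi) \times (-\pi,\pi)$.

In the following, we give some typical applications of integration of functions in several variables in the calculation of volumes of solid bodies, mechanics and probability theory.

Fig. 180: For $a = 0$, $b = 1$ and $f(z) := z$ for all $z \in [0,1]$, the volume of revolution $S$ from Example 4.4.30 is a solid cone of height 1.

Example 4.4.30. (Volume of a solid of revolution) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a,b] \to [0,\infty)$ be a continuous function whose restriction to $(a,b)$ is continuously differentiable and
\[ S := \left\{ (x,y,z) \in \mathbb{R}^3 : 0 \le (x^2 + y^2)^{1/2} \le f(z) \wedge z \in [a,b] \right\} . \]
Note that $S$ is rotationally symmetric around the $z$-axis and can be thought of as obtained from a region in the $x,z$-plane that is rotated around the $z$-axis. The volume $V$ of $S$ is given by
\[ V = \pi \int_a^b f^2(z) \, dz . \]
This can be proved as follows. For this, we define $\rho : S \to \mathbb{R}$ by $\rho(x) := 1$ for all $x \in S$. As a constant map, $\rho$ is continuous. We notice that $\partial S$ is given by the union of
\[ A_1 := \{(x,y,z) \in \mathbb{R}^3 : a < z < b \wedge x^2 + y^2 - f^2(z) = 0\} , \]
\[ A_2 := \{(x,y,a) \in \mathbb{R}^3 : x^2 + y^2 \le f^2(a)\} , \]
\[ A_3 := \{(x,y,b) \in \mathbb{R}^3 : x^2 + y^2 \le f^2(b)\} . \]
Further, $A_1$ is the image of the map $f_1 : (-2\pi,2\pi) \times (a,b) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_1(\varphi,z) := (f(z) \cos\varphi, f(z) \sin\varphi, z) \]
for every $(\varphi,z) \in (-2\pi,2\pi) \times (a,b)$, $A_2$ is a subset of the image of the map $f_2 : (-1, 1 + f(a)) \times (-2\pi,2\pi) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_2(r,\varphi) := (r \cos\varphi, r \sin\varphi, a) \]
for every $(r,\varphi) \in (-1, 1 + f(a)) \times (-2\pi,2\pi)$ and $A_3$ is a subset of the image of the map $f_3 : (-1, 1 + f(b)) \times (-2\pi,2\pi) \to \mathbb{R}^3$ of class $C^1$ defined by
\[ f_3(r,\varphi) := (r \cos\varphi, r \sin\varphi, b) \]
for every $(r,\varphi) \in (-1, 1 + f(b)) \times (-2\pi,2\pi)$. Hence it follows that $\partial S$ is negligible.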
Further, if $I$ is some closed interval such that $I \supset S$, it follows that
\[ \hat{\rho}(x) := \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{if } x \in I \setminus S \end{cases} \]
is continuous, except in points from a negligible subset of $\mathbb{R}^3$. Therefore $\hat{\rho}$ is Riemann-integrable. Hence we can apply the Theorem of Fubini to conclude that
\[ V = \int_S dx \, dy \, dz = \int_a^b \left( \int_{B_{f(z)}^2(0)} dx \, dy \right) dz = \pi \int_a^b f^2(z) \, dz . \tag{4.4.8} \]

As described in the introduction to this section, Archimedes showed that the volume $V$ of a paraboloidal solid of revolution inscribed in a circular cylinder $C$ with radius $r$ and height $h$ is one half of the volume $V_C$ of $C$. This result follows also from (4.4.8). In this, $a = 0$, $b = h$ and
\[ f(z) = r \sqrt{1 - \frac{z}{h}} \]
for every $z \in [0,h]$. Hence
\[ V = \pi r^2 \int_0^h \left( 1 - \frac{z}{h} \right) dz = \pi r^2 \left[ z - \frac{z^2}{2h} \right]_0^h = \frac{1}{2} \pi r^2 h = \frac{1}{2} V_C . \]

Example 4.4.31. Calculate the volume $V_S$ of a solid sphere of radius $r > 0$ and the volume $V_C$ of a circular cylinder of radius $r$ and height $h > 0$.

Solution: With $a = -r$, $b = r$,
\[ f(z) := \sqrt{r^2 - z^2} \]
for every $z \in [-r,r]$, it follows from (4.4.8) that
\[ V_S = \pi \int_{-r}^{r} (r^2 - z^2) \, dz = \pi \left[ r^2 z - \frac{z^3}{3} \right]_{-r}^{r} = \frac{4\pi}{3} r^3 . \]

Fig. 181: Solid sphere of radius $r > 0$ inscribed in a right circular cylinder whose height equals its diameter, see Example 4.4.31.

Finally, with $a = 0$, $b = h$,
\[ f(z) := r \]
for every $z \in [0,h]$, it follows from (4.4.8) that
\[ V_C = \pi \int_0^h r^2 \, dz = \pi r^2 h . \]
Since the solid sphere of radius $r > 0$ can be inscribed in a circular cylinder of radius $r$ and height $2r$, we conclude that
\[ V_S = \frac{4\pi}{3} r^3 = \frac{2}{3} V_r , \]
where $V_r$ denotes the volume of that cylinder. Also this result was derived by Archimedes in 'On the sphere and cylinder'. He required that on his tombstone be carved a sphere inscribed in a right circular cylinder whose height equals its diameter. After Archimedes' death, Cicero restored his tomb with this inscription, see Fig 181.

Example 4.4.32. Calculate the volume $V$ of the solid ellipsoid
\[ V_E := \left\{ (x,y,z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1 \right\} , \]
where $a, b, c > 0$.
Solution: First, it follows by Example 3.5.45 that the boundary of $V_E$, i.e., the ellipsoid
\[ E := \left\{ (x,y,z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\} , \]
is a negligible set. Hence it follows the existence of
\[ \int_{V_E} dx \, dy \, dz . \]
Further, we note that
\[ V_E = g(U_1(0)) , \]
where the scale function $g : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by
\[ g(x,y,z) := (ax, by, cz) \]
for all $(x,y,z) \in \mathbb{R}^3$. In particular, $g$ is continuously differentiable with a continuously differentiable inverse $g^{-1} : \mathbb{R}^3 \to \mathbb{R}^3$ given by
\[ g^{-1}(x,y,z) := (x/a, y/b, z/c) \]
for all $(x,y,z) \in \mathbb{R}^3$ and
\[ \det(g'(x,y,z)) = abc \]
for all $(x,y,z) \in \mathbb{R}^3$. Hence, we conclude by change of variables and the previous example that
\[ V = \int_{V_E} dx \, dy \, dz = abc \int_{U_1(0)} dx \, dy \, dz = \frac{4\pi}{3} abc . \]

Example 4.4.33. (Total mass, center of mass and inertia tensor of a mass distribution) If $\rho : V \to [0,\infty)$ is the mass distribution (mass density) of a solid body occupying the region $V$ in $\mathbb{R}^3$, its total mass $M$, center of mass $r_C$ and inertia tensor $(I_{ij})_{i,j \in \{1,2,3\}}$ are defined by:
\[ M := \int_V \rho \, dx \, dy \, dz , \]
\[ r_C := \left( \frac{1}{M} \int_V x \rho \, dx \, dy \, dz , \; \frac{1}{M} \int_V y \rho \, dx \, dy \, dz , \; \frac{1}{M} \int_V z \rho \, dx \, dy \, dz \right) , \]
\[ I_{11} := \int_V (y^2 + z^2) \rho \, dx \, dy \, dz , \quad I_{22} := \int_V (x^2 + z^2) \rho \, dx \, dy \, dz , \quad I_{33} := \int_V (x^2 + y^2) \rho \, dx \, dy \, dz \]
and
\[ I_{ij} := -\int_V x_i x_j \rho \, dx \, dy \, dz \]
if $i \ne j$, if existent. In the integrands, $x$, $y$, $z$, $x_1 := x$, $x_2 := y$, $x_3 := z$ denote the coordinate projections of $\mathbb{R}^3$.

Example 4.4.34. (Center of mass of a cylindrical rod) Calculate the center of mass of the rod
\[ R := \{(x,y,z) : x^2 + y^2 \le r^2 \wedge z \in [0,h]\} \]
of radius $r > 0$ and height $h > 0$ for the mass distribution $\rho : R \to [0,\infty)$ defined by
\[ \rho(x,y,z) := \rho_0 + \frac{\rho_1 - \rho_0}{h} \, z \]
for all $(x,y,z) \in R$, where $\rho_0, \rho_1 \ge 0$ are such that $\rho_0 + \rho_1 > 0$.

Solution: According to Example 4.4.30, $\partial R$ is negligible. Further, if $I$ is some closed interval such that $I \supset R$, it follows that
\[ \hat{\rho}(x,y,z) := \begin{cases} \rho(x,y,z) & \text{if } (x,y,z) \in R \\ 0 & \text{if } (x,y,z) \in I \setminus R \end{cases} \]
is continuous, possibly except from points of a negligible subset of $\mathbb{R}^3$.

Fig. 182: Cylindrical rod $R$ of radius $r$ and height $h$ from Example 4.4.34.

Therefore $\hat{\rho}$ is Riemann-integrable.
Hence we can apply the Theorem of Fubini to conclude that the mass $M$ and the center of mass $r_C = (x_C, y_C, z_C)$ of the rod are given by
\[ M = \int_R \rho(x, y, z)\,dxdydz = \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h}\,z \right) \left( \int_{B_r^2(0)} dxdy \right) dz = \pi r^2 \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h}\,z \right) dz = \frac{\rho_0 + \rho_1}{2}\,\pi r^2 h , \]
\[ x_C = \frac{1}{M} \int_R x \rho(x, y, z)\,dxdydz = \frac{1}{M} \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h}\,z \right) \left( \int_{B_r^2(0)} x\,dxdy \right) dz = 0 , \]
\[ y_C = \frac{1}{M} \int_R y \rho(x, y, z)\,dxdydz = \frac{1}{M} \int_0^h \left( \rho_0 + \frac{\rho_1 - \rho_0}{h}\,z \right) \left( \int_{B_r^2(0)} y\,dxdy \right) dz = 0 , \]
\[ z_C = \frac{1}{M} \int_R z \rho(x, y, z)\,dxdydz = \frac{1}{M} \int_0^h \left( \rho_0 z + \frac{\rho_1 - \rho_0}{h}\,z^2 \right) \left( \int_{B_r^2(0)} dxdy \right) dz = \frac{\pi r^2}{M} \int_0^h \left( \rho_0 z + \frac{\rho_1 - \rho_0}{h}\,z^2 \right) dz = \frac{\pi r^2}{M}\,\frac{\rho_0 + 2\rho_1}{6}\,h^2 = \frac{1}{3}\,\frac{\rho_0 + 2\rho_1}{\rho_0 + \rho_1}\,h . \]
Note that
\[ 0 \leq z_C = \frac{1}{3}\,\frac{\rho_0 + 2\rho_1}{\rho_0 + \rho_1}\,h \leq \frac{1}{3}\,\frac{3\rho_0 + 3\rho_1}{\rho_0 + \rho_1}\,h = h \]
and hence that the center of mass lies inside the rod.

Example 4.4.35. (Probability theory) A function $\rho : \Omega \to [0, \infty)$ from a subset $\Omega$ of $\mathbb{R}^n$, $n \in \mathbb{N}^*$, such that
\[ \int_\Omega \rho\,dv = 1 \]
can be interpreted as a joint probability distribution for the random variables $x_1, \dots, x_n$ on the sample space $\Omega$. The elements of $\Omega$ are called sample points and represent the possible outcomes of experiments. The probability $P\{(x_1, \dots, x_n) \in D\}$ for the event that the outcome of an experiment $(x_1, \dots, x_n)$ is a member of a subset $D \subset \Omega$ is given by
\[ P\{(x_1, \dots, x_n) \in D\} = \int_D \rho\,dv , \]
if existent. The mean or expected value $E(f)$ for the measurement of a random variable $f : \Omega \to \mathbb{R}$ in an experiment is defined by
\[ E(f) := \int_\Omega f \rho\,dv , \]
if existent.

Fig. 183: Density maps of $\rho_{(1,2)}$, $\rho_{(1,3)}$ from Example 4.4.36. Darker colors correspond to smaller function values.

Example 4.4.36. (Identical fermionic particles confined to a one-dimensional box) Consider two idealized ‘one-dimensional’ identical fermionic point particles of mass $m > 0$ confined to the interval $[0, 1]$, but not subject to other forces.
In a quantum mechanical description, the probability distributions for the position of the particles in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m}\,|k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y) := 2\,[\, \sin(k_1 \pi x) \sin(k_2 \pi y) - \sin(k_2 \pi x) \sin(k_1 \pi y) \,]^2 \]
for all $x, y \in [0, 1]$, where $k \in (\mathbb{N}^*)^2$ satisfies $k_1 \neq k_2$. For every such $k$ calculate the expectation value $\langle x + y \rangle$ for the sum of the positions of the particles
\[ \langle x + y \rangle = \int_{[0,1]^2} (x + y)\,\rho_k(x, y)\,dxdy . \]
Solution: For $k \in (\mathbb{N}^*)^2$ such that $k_1 \neq k_2$ and $x, y \in [0, 1]$, it follows that
\[ \rho_k(x, y) = 2\,[\, \sin^2(k_1 \pi x) \sin^2(k_2 \pi y) + \sin^2(k_2 \pi x) \sin^2(k_1 \pi y) - 2 \sin(k_1 \pi x) \sin(k_2 \pi x) \sin(k_1 \pi y) \sin(k_2 \pi y) \,] \]
\[ = \frac{1}{2}\,[\, 1 - \cos(2 k_1 \pi x) \,]\,[\, 1 - \cos(2 k_2 \pi y) \,] + \frac{1}{2}\,[\, 1 - \cos(2 k_2 \pi x) \,]\,[\, 1 - \cos(2 k_1 \pi y) \,] \]
\[ \quad - [\, \cos((k_1 - k_2) \pi x) - \cos((k_1 + k_2) \pi x) \,]\,[\, \cos((k_1 - k_2) \pi y) - \cos((k_1 + k_2) \pi y) \,] , \]
where it has been used that
\[ \sin(\alpha) \sin(\beta) = \frac{1}{2}\,[\, \cos(\alpha - \beta) - \cos(\alpha + \beta) \,] \]
for all $\alpha, \beta \in \mathbb{R}$. Since $k_1 \neq k_2$, the integrals of the last product over $y \in [0, 1]$ vanish. Hence it follows by Fubini's theorem that
\[ \langle x \rangle = \frac{1}{2} \int_0^1 x\,[\, 1 - \cos(2 k_1 \pi x) \,]\,dx + \frac{1}{2} \int_0^1 x\,[\, 1 - \cos(2 k_2 \pi x) \,]\,dx \]
\[ = \frac{1}{2} \left[ \frac{x^2}{2} - \frac{\cos(2 k_1 \pi x)}{4 k_1^2 \pi^2} - \frac{x \sin(2 k_1 \pi x)}{2 k_1 \pi} \right]_0^1 + \frac{1}{2} \left[ \frac{x^2}{2} - \frac{\cos(2 k_2 \pi x)}{4 k_2^2 \pi^2} - \frac{x \sin(2 k_2 \pi x)}{2 k_2 \pi} \right]_0^1 = \frac{1}{2} . \]
Further, for the calculation of $\langle y \rangle$, we use change of variables. For this, we define $g : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ g(x, y) := (y, x) \]
for all $(x, y) \in \mathbb{R}^2$. The map $g$ is continuously differentiable with inverse $g$. In particular, $g([0, 1]^2) = [0, 1]^2$ and
\[ \det g' = -1 . \]
Hence it follows by change of variables that
\[ \langle y \rangle = \int_{[0,1]^2} y\,\rho_k(x, y)\,dxdy = \int_{g([0,1]^2)} y\,\rho_k(x, y)\,dxdy = \int_{[0,1]^2} x\,\rho_k(y, x)\,dxdy = \int_{[0,1]^2} x\,\rho_k(x, y)\,dxdy = \langle x \rangle \]
and hence, finally, that
\[ \langle x + y \rangle = 1 . \]

Fig. 184: Density maps of $\rho_{(2,3)}$, $\rho_{(2,4)}$ from Example 4.4.36. Darker colors correspond to smaller function values.

Problems

1) Evaluate the following iterated integrals.
» 2 » 1 a) 0 0 px2 2y q dx dy , » 3 » 5 b) 3 » 4 » y2 4 2 c) 3 1 px π{2 2 » 1 "» 1 » 1 r sin ϕ dr dϕ , 0 0 ?1 » 1 "» 1x » 1xy f) » 1 #» ?1x2 » 0 2 0 e) 0 2y q dx dy , dy px yq2 dx , » π{2 » 3 cos ϕ d) 0 0 2) Calculate 0 z * dy dx , * xyz dz dy dx , ?1x y 2 g) 0 dz x y 0 2 » x dxdy D 667 + dz a dy dx . 1 |px, y, z q|2 where D R2 is the compact set that is contained in the first and fourth quadrant as well as is bounded by the y-axis and tpx, yq P R2 : x y 2 1 0u . Sketch D. 3) By using polar coordinates, calculate » T where T : ! px y q dxdy pr cos ϕ, r sin ϕq P R2 : 0 r ¤ 1 ^ π6 ¤ ϕ ¤ π3 ) . Sketch T and g 1 pT q where g is the polar coordinate transformation. 4) Calculate » x dxdydz E where E : 5) Calculate ! px, y, zq P [0, 8q3 : x y z 2 ) 1 . » y dxdy D where D R2 is the area of the triangle with corners p0, 0q, p1{2, 0q and p1{2, 1q. Sketch D. 6) By using polar coordinates, calculate » xy dxdy T where T is the compact subset of R2 that is bounded by the coordinate axes and ! ) p x, p1 x2 q1{2 q P R2 : 0 ¤ x ¤ 1 . Sketch T and g 1 pT q where g is the polar coordinate transformation. 668 7) Calculate » z dxdydz E where E is the compact subset of R3 in the first octant that is bounded by the coordinate surfaces and tpx, y, zq P R3 : x 2y 3z 1u . Sketch E. 8) Calculate » x2 dxdy D where D is the compact subset of R2 that is contained in the first and fourth quadrant as well as is bounded by the y-axis and tpx, yq P R2 : x y 2 0u , tpx, yq P R2 : x y 2 0u . Sketch D. 9) By using polar coordinates, calculate » xy dxdy T where T is the compact subset of R2 that is contained in the first quadrant as well as is bounded by both coordinate axes and tpx, yq P R2 : x2 y2 4u . Sketch T and g 1 pT q where g is the polar coordinate transformation. 10) Calculate » z dxdydz E where E R3 is the compact set contained in the first octant which is bounded by the coordinate surfaces and tp1, y, zq P R3 : y P R ^ z P Ru , tpx, y, zq P R3 : z Sketch E. 669 2y 2u . 
11) Calculate
\[ \int_D x^2 y\,dxdy \]
where $D$ is the compact subset of $\mathbb{R}^2$ that is contained in the upper half-plane as well as is bounded by the x-axis and
\[ \{ (x, y) \in \mathbb{R}^2 : x^2 + y = 4 \} . \]
Sketch $D$.

12) By using polar coordinates, calculate
\[ \int_T x\,dxdy \]
where $T$ is the compact subset of $\mathbb{R}^2$ that is contained in the first quadrant as well as is bounded by both coordinate axes and
\[ \{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1 \} , \quad \{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 = 4 \} . \]
Sketch $T$ and $g^{-1}(T)$ where $g$ is the polar coordinate transformation.

13) Calculate
\[ \int_E z^2\,dxdydz \]
where $E$ is the compact subset of $\mathbb{R}^3$ that is contained in $\{ (x, y, z) \in \mathbb{R}^3 : z \geq 0 \}$ and is bounded by
\[ \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z - 9 = 0 \} . \]
Sketch $E$.

14) Calculate the volume of a solid ellipsoid with half-axes $a, b, c > 0$.

15) Calculate the center of mass and the inertia tensor of a solid hemisphere
\[ \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + (z - r)^2 \leq r^2 \wedge 0 \leq z \leq r \} \]
for a mass distribution which is constant of value $\rho_0 \geq 0$.

Fig. 185: Density maps of $\rho_{(1,1)}$, $\rho_{(1,2)}$ from Problem 18. Darker colors correspond to smaller function values.

16) Calculate the center of mass and the inertia tensor of a solid cone of height $h \geq 0$
\[ \{ (x, y, z) \in \mathbb{R}^3 : a^2 (x^2 + y^2) \leq z^2 \wedge 0 \leq z \leq h \} , \]
where $a > 0$, for a mass distribution which is constant of value $\rho_0 \geq 0$.

17) (Buffon's needle problem) A needle of length $L > 0$ is thrown in a random fashion onto a smooth table ruled with parallel lines separated by a distance $d > L$. For simplicity, associate to all lines a common orientation. Denote by $x \in [0, d/2]$ the minimal distance of the center of the needle to the lines and by $\theta \in [0, \pi]$ the angle between the direction of the needle and the direction of the lines. Under the assumption that $x$ and $\theta$ are uniformly distributed, the joint probability distribution $\rho : [0, d/2] \times [0, \pi] \to [0, \infty)$ of $x$, $\theta$ is given by
\[ \rho(x, \theta) = \frac{2}{\pi d} \]
for all $(x, \theta) \in [0, d/2] \times [0, \pi]$.
a) Determine the set $S \subset [0, d/2] \times [0, \pi]$ corresponding to all events that cause the needle to intersect a ruled line.
b) Calculate the probability $p_S$ of the last event given by
\[ p_S = \int_S \rho(x, \theta)\,dxd\theta . \]

Fig. 186: Density maps of $\rho_{(2,2)}$, $\rho_{(2,3)}$ from Problem 18. Darker colors correspond to smaller function values.

18) (Identical bosonic particles confined to a one-dimensional box) Consider two idealized ‘one-dimensional’ identical bosonic point particles of mass $m > 0$ confined to the interval $[0, 1]$, but not subject to other forces. In a quantum mechanical description, the probability distributions for the position of the particles in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m}\,|k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y) := 2\,[\, \sin(k_1 \pi x) \sin(k_2 \pi y) + \sin(k_2 \pi x) \sin(k_1 \pi y) \,]^2 \]
for all $x, y \in [0, 1]$, where $k \in \mathbb{N}^2$. For every such $k$ calculate the expectation value $\langle x + y \rangle$ for the sum of the positions of the particles
\[ \langle x + y \rangle = \int_{[0,1]^2} (x + y)\,\rho_k(x, y)\,dxdy . \]

19) A point particle of mass $m > 0$ is confined to a cube $[0, 1]^3$, but not subject to other forces. In a quantum mechanical description, the probability distributions for the position of the particle in basic stationary states of energy
\[ E = \frac{\pi^2 \hbar^2}{2m}\,|k|^2 , \]
where $\hbar$ is the reduced Planck constant, are given by
\[ \rho_k(x, y, z) := 8 \sin^2(k_1 \pi x) \sin^2(k_2 \pi y) \sin^2(k_3 \pi z) \]
for all $x, y, z \in [0, 1]$ where $k = (k_1, k_2, k_3) \in \mathbb{N}^3$.

Fig. 187: Density maps of $\rho_{(1,1,1)}(\cdot, \cdot, 0.5)$, $\rho_{(1,2,1)}(\cdot, \cdot, 0.5)$. See Problem 19. Darker colors correspond to smaller function values.

a) For every $k = (k_1, k_2, k_3) \in \mathbb{N}^3$, calculate the expectation values $\langle x \rangle$, $\langle y \rangle$, $\langle z \rangle$ for the components of the position of the particle
\[ \langle x \rangle = \int_{[0,1]^3} x\,\rho_k(x, y, z)\,dxdydz , \quad \langle y \rangle = \int_{[0,1]^3} y\,\rho_k(x, y, z)\,dxdydz , \quad \langle z \rangle = \int_{[0,1]^3} z\,\rho_k(x, y, z)\,dxdydz . \]
b) For every $k \in \mathbb{N}^3$, calculate the standard deviations $\sigma_1, \sigma_2, \sigma_3$ for the expectation values from part a)
\[ \sigma_1^2 = \int_{[0,1]^3} (x - \langle x \rangle)^2\,\rho_k(x, y, z)\,dxdydz , \]
\[ \sigma_2^2 = \int_{[0,1]^3} (y - \langle y \rangle)^2\,\rho_k(x, y, z)\,dxdydz , \]
\[ \sigma_3^2 = \int_{[0,1]^3} (z - \langle z \rangle)^2\,\rho_k(x, y, z)\,dxdydz . \]
c) Calculate and compare the probability of finding the particle in the volume $[0, a]^3$ that includes a corner and in $[(1 - a)/2, (1 + a)/2]^3$ around the center of the cube where $0 < a < 1$.

Fig. 188: Density maps of $\rho_{(2,1,1)}(\cdot, \cdot, 0.5)$, $\rho_{(2,2,1)}(\cdot, \cdot, 0.5)$. See Problem 19. Darker colors correspond to smaller function values.

20) Calculate the volume of the compact subset of $\mathbb{R}^3$ that is bounded by the given surfaces.
a) $S_1 = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z \}$, $S_2 = \{ (x, y, z) \in \mathbb{R}^3 : x^4 + y^4 = 2 (x^2 + y^2) \}$, $S_3 = \{ (x, y, z) \in \mathbb{R}^3 : z = 0 \}$,
b) $S_1 = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 4 \}$, $S_2 = \{ (x, y, z) \in \mathbb{R}^3 : (x^2 + y^2)^2 = 4 (x^2 - y^2) \}$,
c) $S_1 = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 9 \}$, $S_2 = \{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = 3 |x| \}$.

21) (Generalized polar coordinates)
a) Let $a, b, \alpha > 0$. Define $g : (0, \infty) \times (0, \pi/2) \to \mathbb{R}^2$ by
\[ g(r, \varphi) := (a r \cos^\alpha \varphi ,\ b r \sin^\alpha \varphi) \]
for all $(r, \varphi) \in (0, \infty) \times (0, \pi/2)$. Find the range of $g$. Show that the restriction of $g$ in range to $\mathrm{Ran}(g)$ is a continuously differentiable bijection with a continuously differentiable inverse. In particular, calculate that inverse and $\det(g')$.
b) Calculate the area of
\[ S := \{ (x, y) \in \mathbb{R}^2 : |x|^{1/2} + |y|^{1/2} \leq R^{1/2} \} \]
for $R > 0$ by use of suitable generalized polar coordinates from part a).

Fig. 189: Graphical depiction of $S$ from Problem 21 for the case $R = 1$.

22) (Generalized spherical coordinates)
a) Let $a, b, c, \alpha > 0$. Define $g : (0, \infty) \times (0, \pi/2) \times (0, \pi/2) \to \mathbb{R}^3$ by
\[ g(r, \theta, \varphi) := (a r \sin^\alpha \theta \cos^\alpha \varphi ,\ b r \sin^\alpha \theta \sin^\alpha \varphi ,\ c r \cos^\alpha \theta) \]
for all $(r, \theta, \varphi) \in (0, \infty) \times (0, \pi/2) \times (0, \pi/2)$. Find the range of $g$.

Fig. 190: Graphical depiction of $S$ from Problem 22 for the case $R = 1$.
Show that the restriction of $g$ in range to $\mathrm{Ran}(g)$ is a continuously differentiable bijection with a continuously differentiable inverse. In particular, calculate that inverse and $\det(g')$.
b) Calculate the volume of
\[ S := \{ (x, y, z) \in \mathbb{R}^3 : |x|^{1/2} + |y|^{1/2} + |z|^{1/2} \leq R^{1/2} \} \]
for $R > 0$ by use of suitable generalized spherical coordinates from part a).

4.5 Vector Calculus

We recall that we identify points in $\mathbb{R}^k$, where $k \in \mathbb{N}$ is such that $k \geq 2$, with position vectors, see the remarks preceding Definition 3.5.8. In addition, as was explained in the beginning of Section 3.5.8, we also identify tangent vectors that are associated to points in space with position vectors. In applications, the nature of the involved quantities can be concluded only from the context of a problem. But, at least to the experience of the author, apart from ‘transformations’ which map points into points, most maps in applications describe ‘physical fields’, i.e., maps that have as domain a set of points and as range a set of real numbers or a set of tangent vectors. In the last case, such maps associate to every point from the domain a tangent vector that is ‘attached’ to that point. The remaining part of the course studies the last type of maps which are also called vector fields. Hence for the interpretation of the results, the reader should imagine that the value of a vector field in a point $p$ of its domain is a position vector which has been parallel transported in space such that its starting point is $p$ instead of the origin of a Cartesian coordinate system. The notion of parallel transport is made precise in courses in differential geometry. On the other hand, any vector-valued function of several variables $f$ from a nontrivial subset $D$ of some $\mathbb{R}^n$ into some $\mathbb{R}^m$ can be interpreted as a vector field assigning vectors to points in space. Therefore, the previous remarks gain importance only in connection with the interpretation of the results.

Example 4.5.1.
Define $v : \{ (x, y, z) \in \mathbb{R}^3 : 1 \leq x^2 + y^2 \leq 4 \} \to \mathbb{R}^3$ by
\[ v(x, y, z) := \left( - y\,\frac{2 - x^2 - y^2}{x^2 + y^2} ,\ x\,\frac{2 - x^2 - y^2}{x^2 + y^2} ,\ 0 \right) \]
for all $(x, y, z) \in \mathbb{R}^3$ satisfying $1 \leq x^2 + y^2 \leq 4$. The map $v$ describes the velocity field of a viscous incompressible flow, a so called ‘Couette flow’, between concentric cylinders of radius 1 and 2 rotating at the same rate, but in counterclockwise and clockwise direction, respectively. The velocity field in a point on these cylinders coincides with the velocity of that point, i.e., due to viscous friction forces, the fluid sticks to the cylinders and, in this way, is carried along with the cylinders. To achieve better visualization, Fig 191 shows the field of directions $|v|^{-1}.v$ corresponding to $v$. The first is not defined in points of the circle of radius $\sqrt{2}$ around the origin.

Fig. 191: Direction field corresponding to Couette flow between counter-rotating concentric cylinders of radius 1 and 2. See Example 4.5.1.

Example 4.5.2. Define $E : \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}^3$ by
\[ E(x, y, z) := - \frac{1}{(x^2 + y^2 + z^2)^{3/2}}\,(x, y, z) \]
for all $(x, y, z) \in \mathbb{R}^3 \setminus \{0\}$. $E$ describes the electrical field created by a negative unit charge. To achieve better visualization, Fig 192 shows the field of directions $|E|^{-1}.E$ corresponding to $E$.

Fig. 192: Direction field corresponding to an electrical field created by a negative point charge. See Example 4.5.2.

The following example motivates the subsequent definition of path integrals.

Example 4.5.3. (Motivation for the definition of path integrals.) For this, let $F$ be a continuous map from some open subset $U$ in $\mathbb{R}^3$ into $\mathbb{R}^3$ (the ‘force field’) and $r$ be a twice continuously differentiable map from some open interval $I$ of $\mathbb{R}$ into $U$ (the trajectory of a point particle parametrized by time) which satisfies
\[ m\,r''(t) = F(r(t)) \]
for every $t \in I$ (Newton's equation of motion) where $m > 0$ (the mass of the particle).
Then
\[ \left( \frac{m}{2}\,v^2 \right)'(t) = m\,r'(t) \cdot r''(t) = r'(t) \cdot F(r(t)) \]
for every $t \in I$ where $v := r'$ (the velocity field of the particle), and hence
\[ \frac{m}{2}\,v^2(t_1) - \frac{m}{2}\,v^2(t_0) = \int_{t_0}^{t_1} r'(t) \cdot F(r(t))\,dt \]
for all $t_0, t_1 \in I$. Hence the right hand side of the previous equation describes the difference of the ‘kinetic energies’ of the particle at $t_1$ and $t_0$. Further, if $F$ is in addition ‘conservative’, i.e., if there is some $V : U \to \mathbb{R}$ (a ‘potential function’) of class $C^1$ such that
\[ F = - \nabla V , \]
then we conclude by the chain rule that
\[ \int_{t_0}^{t_1} r'(t) \cdot F(r(t))\,dt = - \int_{t_0}^{t_1} r'(t) \cdot (\nabla V)(r(t))\,dt = - \int_{t_0}^{t_1} (V \circ r)'(t)\,dt = V(r(t_0)) - V(r(t_1)) \]
and hence that the function (‘the total energy of the particle’)
\[ \frac{m}{2}\,v^2 + V \circ r \]
is constant (‘Energy conservation’).

Definition 4.5.4. Let $n \in \mathbb{N}^*$, $F$ be a continuous map from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, $a, b \in \mathbb{R}$ such that $a \leq b$ and $r : [a, b] \to \mathbb{R}^n$ be a regular $C^1$-path in $U$, i.e., the restriction of a continuously differentiable map from some open interval $I \supset [a, b]$ into $U$. Then we define the path integral of $F$ along $r$ by
\[ \int_r F \cdot dr := \int_a^b r'(t) \cdot F(r(t))\,dt . \]

Remark 4.5.5. Note that we don't demand that $r$ is necessarily injective. A simple example for a regular $C^1$-path which is not injective is given by $r : [-1, 1] \to \mathbb{R}^2$ defined by
\[ r(t) := (t^2, 1) \]
for every $t \in [-1, 1]$. This path begins at the point $(1, 1)$, moves on to $(0, 1)$ from where it returns to $(1, 1)$.

Fig. 193: Direction field associated to $F$ from Example 4.5.6 and $\mathrm{Ran}(r)$ for $a = 1$.

Example 4.5.6. Define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[ F(x, y) := \left( - \frac{y}{x^2 + y^2} ,\ \frac{x}{x^2 + y^2} \right) \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$ and the parametrization $r : [0, 2\pi] \to \mathbb{R}^2$ of the circle of radius $a > 0$ around the origin by
\[ r(t) := a.(\cos t, \sin t) \]
for all $t \in [0, 2\pi]$. Then
\[ \int_r F \cdot dr = \int_0^{2\pi} r'(t) \cdot F(r(t))\,dt = \int_0^{2\pi} (- a \sin t ,\ a \cos t) \cdot \left( - \frac{a \sin t}{a^2} ,\ \frac{a \cos t}{a^2} \right) dt = \int_0^{2\pi} dt = 2\pi . \]
Note that this result does not depend on the radius $a$.
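The path integral of Definition 4.5.4 can be approximated numerically. The following is only a minimal sketch, assuming a midpoint rule with a fixed number of subintervals; the helper names `path_integral` and `circle` are illustrative and not taken from the text. It evaluates the integral of the field $F(x, y) = (-y, x)/(x^2 + y^2)$ from Example 4.5.6 along circles of several radii around the origin.

```python
import math

def path_integral(F, r, dr, a, b, n=2000):
    """Midpoint-rule approximation of int_a^b r'(t) . F(r(t)) dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y = r(t)          # point on the path
        vx, vy = dr(t)       # tangent vector r'(t)
        fx, fy = F(x, y)     # field value at that point
        total += (vx * fx + vy * fy) * h
    return total

def F(x, y):
    """Vector field from Example 4.5.6, defined away from the origin."""
    s = x * x + y * y
    return (-y / s, x / s)

def circle(radius):
    """Parametrization of the circle of the given radius and its derivative."""
    r = lambda t: (radius * math.cos(t), radius * math.sin(t))
    dr = lambda t: (-radius * math.sin(t), radius * math.cos(t))
    return r, dr

# Radii chosen arbitrarily for illustration.
values = []
for radius in (0.5, 1.0, 3.0):
    r, dr = circle(radius)
    values.append(path_integral(F, r, dr, 0.0, 2.0 * math.pi))

print(values)
```

For each radius the integrand $r'(t) \cdot F(r(t))$ is identically 1, so all three approximations should agree with $2\pi$ up to rounding.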
The following shows that the value of a path integral is unchanged if the path is replaced by another that has the same range and that traverses the range in the ‘same way’ as the first path.

Theorem 4.5.7. (Invariance under reparametrization) Let $n \in \mathbb{N}^*$, $F$ be a continuous map from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$ and $r : [a, b] \to \mathbb{R}^n$ be a regular $C^1$-path in $U$. Further, let $g : [c, d] \to [a, b]$ be continuously differentiable with a continuously differentiable inverse (i.e., there is an extension $\hat{g} : I_1 \to I_2$ of $g$, where $I_1, I_2$ are open intervals of $\mathbb{R}$ such that $I_1 \supset [c, d]$, $I_2 \supset [a, b]$ and such that $\hat{g}$ is continuously differentiable with a continuously differentiable inverse) and such that $g'(x) > 0$ for all $x \in [c, d]$. Then
\[ \int_r F \cdot dr = \int_{r \circ g} F \cdot dr . \]
Proof. By the change of variables and the chain rule, it follows that
\[ \int_r F \cdot dr = \int_a^b r'(t) \cdot F(r(t))\,dt = \int_c^d r'(g(s)) \cdot F(r(g(s)))\,g'(s)\,ds = \int_c^d (r \circ g)'(s) \cdot F((r \circ g)(s))\,ds = \int_{r \circ g} F \cdot dr . \]

The following defines for every regular $C^1$-path $r$ an inverse path $r^{-}$ whose domain and range are the same as that of $r$, but traverses the range in the opposite way, i.e., in particular, starts at the endpoint of $r$ and ends at the starting point of $r$. The replacement of $r$ in a path integral by $r^{-}$ leads to a change in sign.

Definition 4.5.8. (Change of orientation/inverse path) Let $n$, $F$ and $r$ be as in Definition 4.5.4. Then, we define the inverse path $r^{-}$ to $r$ by
\[ r^{-}(t) := r(a + b - t) \]
for all $t \in [a, b]$. Then it follows by Theorem 3.1.9 and change of variables that
\[ \int_{r^{-}} F \cdot dr = \int_a^b (r^{-})'(t) \cdot F(r^{-}(t))\,dt = - \int_a^b r'(a + b - t) \cdot F(r(a + b - t))\,dt = - \int_a^b r'(t) \cdot F(r(t))\,dt = - \int_r F \cdot dr . \]

Path integrals occur frequently in form of ‘boundary integrals’. In these cases the ranges of paths are parts of the boundaries of subsets of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \geq 2$, and often the whole boundary of the set needs to be traversed.
Since such a boundary can contain corners, e.g., in the case of the boundary of the interior of a rectangle, it is useful to define also path integrals along paths that are only piecewise $C^1$.

Definition 4.5.9.
(i) A piecewise regular $C^1$-path $r$ is a sequence $(r_1, \dots, r_\nu)$ of regular $C^1$-paths $r_1, \dots, r_\nu$, where $\nu \in \mathbb{N}^*$, with coinciding endpoints of $r_i$ and starting points of $r_{i+1}$ for each $i \in \{1, \dots, \nu - 1\}$. Further, we say that $r$ is closed if the endpoint of $r_\nu$ coincides with the initial point of $r_1$.
(ii) Further, for a continuous vector field $F : U \to \mathbb{R}^n$, where $U$ is some open subset of $\mathbb{R}^n$ containing the ranges of all $r_i$, $i \in \{1, \dots, \nu\}$, we define the path integral along $r$ by
\[ \int_r F \cdot dr := \int_{r_1} F \cdot dr + \dots + \int_{r_\nu} F \cdot dr . \]

We already noticed in Example 4.5.3 that the value of a path integral depends only on the endpoints of the path in the case that the vector field is the gradient of a function of class $C^1$. Such functions are called potentials or potential functions in physics.

Theorem 4.5.10. (Path independence) Let $n \in \mathbb{N}^*$, $F$ be a continuous map from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$ and $r$ be a piecewise regular $C^1$-path in $U$ from $x_0$ to $x_1$. Finally, let $V : U \to \mathbb{R}$ be of class $C^1$ and such that $F = \nabla V$. Then
\[ \int_r F \cdot dr = V(x_1) - V(x_0) . \]
Proof. Since $r$ is a piecewise regular $C^1$-path in $U$ from $x_0$ to $x_1$, there are $\nu \in \mathbb{N}^*$ along with regular $C^1$-paths $r_1 : [a_1, b_1] \to U, \dots, r_\nu : [a_\nu, b_\nu] \to U$ such that $r_1(a_1) = x_0$ and $r_\nu(b_\nu) = x_1$. Then it follows by the chain rule that
\[ \int_r F \cdot dr = \int_{r_1} F \cdot dr + \dots + \int_{r_\nu} F \cdot dr = \int_{a_1}^{b_1} r_1'(t) \cdot (\nabla V)(r_1(t))\,dt + \dots + \int_{a_\nu}^{b_\nu} r_\nu'(t) \cdot (\nabla V)(r_\nu(t))\,dt \]
\[ = \int_{a_1}^{b_1} (V \circ r_1)'(t)\,dt + \dots + \int_{a_\nu}^{b_\nu} (V \circ r_\nu)'(t)\,dt = V(r_1(b_1)) - V(r_1(a_1)) + \dots + V(r_\nu(b_\nu)) - V(r_\nu(a_\nu)) \]
\[ = V(r_\nu(b_\nu)) - V(r_1(a_1)) = V(x_1) - V(x_0) . \]

From Schwarz's Theorem 4.2.18 follows a simple necessary condition for the existence of a potential for a vector field $F$ whose component functions are all of class $C^1$.

Theorem 4.5.11.
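(Necessary conditions for the existence of a potential) Let $n \in \mathbb{N}$ be such that $n \geq 2$, $F = (F_1, \dots, F_n)$ be a map of class $C^1$ (i.e., all $F_1, \dots, F_n$ are of class $C^1$) from some open subset $U$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and let $V : U \to \mathbb{R}$ be of class $C^2$ and such that $F = \nabla V$. Then
\[ \frac{\partial F_i}{\partial x_j} - \frac{\partial F_j}{\partial x_i} = 0 , \quad i, j \in \{1, \dots, n\} ,\ i \neq j . \]
Proof. For every $i, j \in \{1, \dots, n\}$ such that $i \neq j$, it follows by Schwarz's Theorem 4.2.18 that
\[ \frac{\partial F_i}{\partial x_j} = \frac{\partial^2 V}{\partial x_j \partial x_i} = \frac{\partial^2 V}{\partial x_i \partial x_j} = \frac{\partial F_j}{\partial x_i} . \]

Remark 4.5.12. For the cases $n = 2, 3$, the condition (4.5.11) is equivalent to the vanishing of the so called rotational field $\mathrm{curl}\,F$ of $F$:
\[ \mathrm{curl}\,F := \begin{cases} \dfrac{\partial F_2}{\partial x} - \dfrac{\partial F_1}{\partial y} & \text{if } n = 2 \\[2ex] \left( \dfrac{\partial F_3}{\partial y} - \dfrac{\partial F_2}{\partial z} ,\ \dfrac{\partial F_1}{\partial z} - \dfrac{\partial F_3}{\partial x} ,\ \dfrac{\partial F_2}{\partial x} - \dfrac{\partial F_1}{\partial y} \right) & \text{if } n = 3 . \end{cases} \]

The following example shows that not for every vector field that satisfies the conditions from Theorem 4.5.11 there is a potential function.

Example 4.5.13. Let $F$ be as in Example 4.5.6, i.e., define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[ F(x, y) := \left( - \frac{y}{x^2 + y^2} ,\ \frac{x}{x^2 + y^2} \right) \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Then $F$ is of class $C^1$ and
\[ \frac{\partial F_x}{\partial y}(x, y) = \frac{\partial F_y}{\partial x}(x, y) = \frac{y^2 - x^2}{(x^2 + y^2)^2} \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$ and hence $\mathrm{curl}\,F$ vanishes on $\mathbb{R}^2 \setminus \{0\}$. But in Example 4.5.6, we found closed regular $C^1$-paths $r$ such that
\[ \int_r F \cdot dr \neq 0 . \]
As a consequence, the existence of a potential function for $F$ would lead to a contradiction to Theorem 4.5.10. Hence there is no such potential. Note that the same reasoning excludes also, for every $R > 0$, the existence of $V : U_R(0) \setminus \{0\} \to \mathbb{R}$ of class $C^1$ such that $F(x, y) = (\nabla V)(x, y)$ for all $(x, y) \in U_R(0) \setminus \{0\}$. Hence it is natural to assume that this fact is caused by the singular behavior of $F$ in the origin. Indeed, the following theorem shows that this assumption is correct in the sense that there would be such a potential if $F$ could be extended to a vector field $\hat{F}$ of class $C^1$ on $\mathbb{R}^2$ such that $\mathrm{curl}\,\hat{F}$ vanishes also in the origin.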
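The role of the singularity can be made visible numerically. The sketch below is an illustration under stated assumptions, not part of the text's argument: it approximates $\mathrm{curl}\,F$ for the field of Example 4.5.13 by central differences (the step size `h` and the sample points are arbitrary choices), and compares the path integral along a circle around the origin with that along a circle that does not enclose the origin. The helper names `curl2d` and `circle_integral` are hypothetical.

```python
import math

def F(x, y):
    """Vector field from Example 4.5.13, defined away from the origin."""
    s = x * x + y * y
    return (-y / s, x / s)

def curl2d(F, x, y, h=1e-5):
    """Central-difference approximation of dF2/dx - dF1/dy at (x, y)."""
    dF2_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2.0 * h)
    dF1_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2.0 * h)
    return dF2_dx - dF1_dy

def circle_integral(F, cx, cy, radius, n=4000):
    """Midpoint-rule path integral of F along a counterclockwise circle."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        vx = -radius * math.sin(t)   # tangent vector components
        vy = radius * math.cos(t)
        fx, fy = F(x, y)
        total += (vx * fx + vy * fy) * h
    return total

# Sample points away from the origin, chosen arbitrarily.
curls = [curl2d(F, x, y) for (x, y) in [(1.0, 2.0), (-0.5, 0.3), (2.0, -1.0)]]

around_origin = circle_integral(F, 0.0, 0.0, 1.0)   # circle enclosing the origin
away_from_origin = circle_integral(F, 3.0, 0.0, 1.0)  # circle not enclosing it

print(curls, around_origin, away_from_origin)
```

The approximate curl values should be close to zero at every sample point, while only the closed path that encircles the origin yields a nonvanishing integral, in line with the discussion above.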
Criteria, like the following, providing the existence of potential functions for vector fields satisfying certain conditions, are generally called ‘Poincaré lemmas’ after Henri Poincaré. Below, we give only the simplest criterion of this type. For its proof, the potential functions are explicitly constructed.

Theorem 4.5.14. (Sufficient conditions for the existence of a potential, Poincaré Lemma) Let $n \in \mathbb{N}$ be such that $n \geq 2$, $U$ be an open subset of $\mathbb{R}^n$ which is star-shaped with respect to some $x_0 \in U$, i.e., such that for all $x \in U$ also the line segment $\{ x_0 + t.(x - x_0) : t \in [0, 1] \}$ is contained in $U$. Further, let $F = (F_1, \dots, F_n) : U \to \mathbb{R}^n$ be of class $C^1$ and such that
\[ \frac{\partial F_i}{\partial x_j} - \frac{\partial F_j}{\partial x_i} = 0 \]
for every $i, j \in \{1, \dots, n\}$ such that $i \neq j$. Then there is a potential $V : U \to \mathbb{R}$ of class $C^2$ such that $F = \nabla V$.
Proof. Define $V : U \to \mathbb{R}$ by
\[ V(x) := \int_{r_x} F \cdot dr = \sum_{i=1}^{n} (x_i - x_{0i})\,H_i(x) , \quad H_i(x) := \int_0^1 F_i(x_0 + t.(x - x_0))\,dt , \]
where $r_x(t) := x_0 + t.(x - x_0)$ for all $t \in [0, 1]$, for all $x \in U$. Now let $x \in U$. Since $U$ is open there is $d > 0$ such that $U_d(x) \subset U$. Then by Taylor's formula Theorem 4.3.6, it follows for all $h \in U_d(0)$, $h \neq 0$, that
\[ \frac{1}{h}\,[\, H_i(x + h.e_j) - H_i(x) \,] = \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0) + \tau h.e_j)\,dt \]
for some $\tau \in [0, 1]$. Now $r_j : [0, 1] \times [0, d] \to U$ defined by
\[ r_j(t, s) := x_0 + t.(x - x_0) + s.e_j , \quad (t, s) \in [0, 1] \times [0, d] \]
is obviously continuous and hence its image compact, since its domain is compact, too. Since
\[ \frac{\partial F_i}{\partial x_j} : U \to \mathbb{R} \]
is continuous, it is in particular uniformly continuous on $\mathrm{Ran}\,r_j$. Hence for any $\varepsilon > 0$ there is $\delta > 0$ such that
\[ \left| \frac{\partial F_i}{\partial x_j}(x_2) - \frac{\partial F_i}{\partial x_j}(x_1) \right| < \varepsilon \]
whenever $x_1, x_2 \in \mathrm{Ran}\,r_j$ and $|x_2 - x_1| < \delta$. In particular for $h \in \mathbb{R}$ such that $|h| < \delta$, it follows that
\[ \left| \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0) + \tau h.e_j)\,dt - \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0))\,dt \right| \leq \frac{\varepsilon}{2} \]
and hence also that
\[ \left| \frac{1}{h}\,[\, H_i(x + h.e_j) - H_i(x) \,] - \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0))\,dt \right| < \varepsilon . \]
Since $\varepsilon > 0$ is arbitrary otherwise, it follows that $H_i$ is partially differentiable in the $j$-th coordinate direction with partial derivative given by
\[ \frac{\partial H_i}{\partial x_j}(x) = \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0))\,dt , \quad x \in U . \]
Moreover, analogous reasoning shows that
\[ \frac{\partial H_i}{\partial x_j} \]
is continuous. Hence $V$ is of class $C^1$. In particular,
\[ \frac{\partial V}{\partial x_j}(x) = \sum_{i=1}^{n} (x_i - x_{0i}) \int_0^1 t\,\frac{\partial F_i}{\partial x_j}(x_0 + t.(x - x_0))\,dt + \int_0^1 F_j(x_0 + t.(x - x_0))\,dt \]
\[ = \sum_{i=1}^{n} (x_i - x_{0i}) \int_0^1 t\,\frac{\partial F_j}{\partial x_i}(x_0 + t.(x - x_0))\,dt + \int_0^1 F_j(x_0 + t.(x - x_0))\,dt = \left[\, t\,F_j(x_0 + t.(x - x_0)) \,\right]_0^1 = F_j(x) , \]
$j \in \{1, \dots, n\}$, where the condition on the partial derivatives of $F$ has been used in the second step, and hence, finally, it follows also that $V$ is of class $C^2$.

Remark 4.5.15. For the case $n = 2$, the statement of Theorem 4.5.14 is also true for the more general case of an open simply-connected $U$. For the proof see [63], XVI, §5, Theorem 5.4.

Example 4.5.16. For $n \in \mathbb{N}$ such that $n \geq 2$, any open convex subset of $\mathbb{R}^n$, i.e., any open subset $S$ of $\mathbb{R}^n$ such that for all $x, y \in S$ also $\{ x + t.(y - x) : t \in [0, 1] \} \subset S$, like $\mathbb{R}^n$ itself and any open ball in $\mathbb{R}^n$, is star-shaped with respect to any of its elements.

Example 4.5.17. We define $F : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[ F(x, y, z) := (y^2 z^3 ,\ 2 x y z^3 ,\ 3 x y^2 z^2) \]
for all $x, y, z \in \mathbb{R}$. In particular, $F$ is of class $C^1$ and
\[ \mathrm{curl}\,F = 0 . \]
Since $\mathbb{R}^3$ is star-shaped with respect to the origin, there is a potential $V : \mathbb{R}^3 \to \mathbb{R}$ of class $C^2$ such that $F = \nabla V$. Such a potential is not uniquely determined since the gradients of constant functions vanish. Integration of the corresponding equations shows that $V : \mathbb{R}^3 \to \mathbb{R}$ defined by
\[ V(x, y, z) := x y^2 z^3 \]
for all $x, y, z \in \mathbb{R}$ is a potential function for $F$.

Problems

1) Calculate
\[ \int_r F \cdot dr . \]
Note that the paths in d)-g) all start and end at the same points.
a) Fpx, y q : py, 2xq , x, y b) c) d) e) f) P R , rptq : pt, t2 q , t P r0, 1s Fpx, y q : p3y, 4xq , x, y P R , rptq : pt2 , t3 q , t P r0, 2s , Fpx, y q : px2 3xy, xy y 2 q , x, y P R , rptq : pcos t, sin tq , t P rπ, π s , Fpx, y q : p2xy, x2 q , x, y P R , rptq : p2t, tq , t P r0, 1s , Fpx, y q : p2xy, x2 q , x, y P R , rptq : p2t, t1{2 q , t P r0, 1s , Fpx, y q : p2xy, x2 q , x, y P R , 689 , r : pr1 , r2 q , r1 ptq : p0, tq , t P r0, 1s , r2 psq : ps, 1q , s P r0, 2s , PR, r : pr1 , r2 q , r1 ptq : pt, 0q , t P r0, 2s , r2 psq : p2, sq , s P r0, 1s , h) Fpx, y, z q : py z, z x, x y q , x, y, z P R , rptq : pcos t, sin t, tq , t P r0, 2π s , i) Fpx, y, z q : py, z, xq , x, y, z P R , rptq : pR cos α cos t, R cos α sin t, R sin αq , t P r0, 2π s , αPR . If possible, find a potential function V : DpF q Ñ R of class C 1 for F where DpF q denotes the domain of F . Otherwise, give reasons g) 2) Fpx, y q : p2xy, x2 q , x, y why there is no such function. a) Fpx, y q : p1 b) c) Fpx, y q : p2xy d) xq , x, y y, 1 Fpx, y q : px, 2y q , x, y y2 PR PR , , 2xy 2 , x2 2xy Fpx, y q : pe cos y, e sin y q , x, y x x PR 2x2 y q , x, y e) Fpx, y, z q : py z , 2xyz , 2xy z q , x, y, z 2 , , PR , f) Fpx, y, z q : px, x z y, 3x y z q , x, y, z P R , g) Fpx, y, z q : |px, y, z q|3 px, y, z q , px, y, z q P R3 z t0u . Let n P N zt0, 1u, f : Rn zt0u Ñ R be continuous. Define F Rn zt0u Ñ Rn by F pxq : f pxq.x for all x P Rn zt0u. Calculate 2 2 2 2 3) PR 2 » r : F dr where r is a regular C 1 -path whose range is part of Srn p0q for some r ¡ 0. 4) Define F : R2 zt0u Ñ R2 by F px, y q : for all px, y q P R2 zt0u. 2 2 px2 2xyy2 q2 , pxx2 yy2 q2 690 2 y 1 0 -1 -2 -2 0 x -1 1 2 Fig. 194: Direction field of associated to F from Problem 4 and Ranprq for a 1. a) Calculate » r F dr where a ¡ 0 and r : r0, 2π s Ñ R2 is given by rptq : a.pcos t, sin tq for all t P R. Note the difference of the result to that of Example 4.5.6. 
b) If possible, find a potential function $V : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ of class $C^1$ for $F$. Otherwise, give reasons why such a function does not exist.
c) Calculate
\[ \int_r F \cdot dr \]
where $r$ is any regular $C^1$-path that assumes values in $\mathbb{R}^2 \setminus \{0\}$ and has initial point $p$ and end point $q$.

5) As in Example 4.5.6, define $F : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}^2$ by
\[ F(x, y) := \left( - \frac{y}{x^2 + y^2} ,\ \frac{x}{x^2 + y^2} \right) \]
for all $(x, y) \in \mathbb{R}^2 \setminus \{0\}$. Further, let $a > 0$, $r : [0, 1] \to \mathbb{R}^2 \setminus \{0\}$ be a regular $C^1$-path such that $r(0) = r(1) = (a, 0)$ and such that the y-component of $r$ is $< 0$ on $(0, \varepsilon)$ and $> 0$ on $(1 - \varepsilon, 1)$ for some $\varepsilon > 0$.
a) Find a potential $V : \mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\}) \to \mathbb{R}$ of class $C^1$ for the restriction of $F$ to $\mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\})$.
b) Calculate
\[ \int_r F \cdot dr . \]

4.6 Generalizations of the Fundamental Theorem of Calculus

In the following, we consider generalizations of the fundamental theorem of calculus, Theorem 2.6.21, to vector-valued functions of several variables. Those generalizations have important applications in the theory of partial differential equations and connected areas, e.g., electrodynamics and fluid mechanics. For motivation, we calculate the integral of a partial derivative of a function in two variables over the region $\Omega \subset \mathbb{R}^2$ under the graph of a function $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only positive ($\geq 0$) values, i.e., $\Omega$ is given by
\[ \Omega := \{ (x, y) \in \mathbb{R}^2 : a \leq x \leq b \wedge 0 \leq y \leq f(x) \} , \]
see Fig 195.

Fig. 195: Domain of integration in the motivation of Green's formula. See text.

In addition, we assume that $f$ is the restriction of a continuously differentiable function $\hat{f}$ defined on an open interval $I$ of $\mathbb{R}$ containing $[a, b]$. As a consequence, the graph of $f$ is part of the image of the map $h : I \to \mathbb{R}^2$ of class $C^1$ defined by
\[ h(x) := (x, \hat{f}(x)) \]
for every $x \in I$ and hence is negligible. From this, we conclude that the boundary of $\Omega$, given by
\[ \bigcup_{i=1}^{4} B_i , \]
where
\[ B_1 := [a, b] \times \{0\} , \quad B_2 := \{b\} \times [0, f(b)] , \quad B_3 := G(f) , \quad B_4 := \{a\} \times [0, f(a)] , \]
is a negligible set.
Further, let $U$ be an open subset of $\mathbb{R}^2$ containing $\Omega$ and $F_1 : U \to \mathbb{R}$ be of class $C^1$. For later use, we define a corresponding vector field $F : U \to \mathbb{R}^2$ by
\[ F(x, y) := (F_1(x, y), 0) \]
for all $(x, y) \in U$. Then, we conclude by Fubini's theorem and the fundamental theorem of calculus, Theorem 2.6.21, that
\[ \int_\Omega \frac{\partial F_1}{\partial y}\,dxdy = \int_a^b \left( \int_0^{f(x)} [F_1(x, \cdot)]'(y)\,dy \right) dx = \int_a^b [\, F_1(x, f(x)) - F_1(x, 0) \,]\,dx = \int_a^b F_1(x, f(x))\,dx - \int_a^b F_1(x, 0)\,dx . \]
We observe that the last two integrals ‘are’ in fact path integrals. Indeed, if we define $r_1 : [a, b] \to \mathbb{R}^2$, $r_3 : [a, b] \to \mathbb{R}^2$ by
\[ r_1(x) := (x, 0) , \quad r_3(x) := (x, f(x)) \]
for every $x \in [a, b]$, then $r_1, r_3$ are regular $C^1$-paths traversing parts of the boundary of $\Omega$ and
\[ \int_{r_1} F \cdot dr = \int_a^b F_1(x, 0)\,dx , \quad \int_{r_3} F \cdot dr = \int_a^b F_1(x, f(x))\,dx . \]
Further, by defining $r_2 : [0, f(b)] \to \mathbb{R}^2$, $r_4 : [0, f(a)] \to \mathbb{R}^2$ by
\[ r_2(s) := (b, s) , \quad r_4(t) := (a, t) \]
for every $s \in [0, f(b)]$ and $t \in [0, f(a)]$, then $r_2, r_4$ are regular $C^1$-paths traversing the remaining parts of the boundary of $\Omega$ such that
\[ \int_{r_2} F \cdot dr = \int_{r_4} F \cdot dr = 0 \]
since the tangent vectors of the paths are orthogonal to $F$ in every point. Hence the piecewise $C^1$-path $r := (r_1, r_2, r_3^{-}, r_4^{-})$ traverses the whole boundary of $\Omega$ such that
\[ - \int_\Omega \frac{\partial F_1}{\partial y}\,dxdy = \int_r F \cdot dr . \tag{4.6.1} \]
The last is a special case of so called ‘Green's formula’, see Theorems 4.6.5, 4.6.7. We make several observations about the structure of the last result. First, it reduces the calculation of the integral of a derivative of a function in two variables to that of a path integral, i.e., essentially to the calculation of an integral of a function of one variable. This is similar to the fundamental theorem of calculus if we interpret the evaluation of differences of an antiderivative at the endpoints of an interval of integration as a kind of ‘integration’ in ‘0-dimensions’. In this sense, we can view the result as a generalization of the fundamental theorem of calculus.
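The special case of Green's formula derived above can be checked numerically for a concrete choice of data. The following is a sketch only: the choices $a = 0$, $b = 1$, $f(x) = 1 + x^2$ and $F_1(x, y) = x y^2$ are hypothetical examples that do not appear in the text, and the integrals are approximated by a simple midpoint rule. The boundary is traversed counterclockwise, along $r_1$, $r_2$ and then along $r_3$ and $r_4$ in reversed direction.

```python
def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

# Hypothetical example data (not from the text).
a, b = 0.0, 1.0
f = lambda x: 1.0 + x * x          # upper boundary of the region
df = lambda x: 2.0 * x             # its derivative
F1 = lambda x, y: x * y * y        # first component of the vector field
dF1_dy = lambda x, y: 2.0 * x * y  # partial derivative of F1 with respect to y
F = lambda x, y: (F1(x, y), 0.0)   # vector field F = (F1, 0)

# Integral of dF1/dy over the region, computed as an iterated integral.
area_integral = midpoint(
    lambda x: midpoint(lambda y: dF1_dy(x, y), 0.0, f(x), 200), a, b, 400
)

def path_integral(r, dr, t0, t1, n=2000):
    """Approximation of the integral of r'(t) . F(r(t)) over [t0, t1]."""
    def integrand(t):
        x, y = r(t)
        vx, vy = dr(t)
        fx, fy = F(x, y)
        return vx * fx + vy * fy
    return midpoint(integrand, t0, t1, n)

# The four pieces of the counterclockwise boundary path.
pieces = [
    (lambda t: (t, 0.0), lambda t: (1.0, 0.0), a, b),            # r1
    (lambda s: (b, s), lambda s: (0.0, 1.0), 0.0, f(b)),         # r2
    (lambda t: (a + b - t, f(a + b - t)),
     lambda t: (-1.0, -df(a + b - t)), a, b),                    # r3 reversed
    (lambda t: (a, f(a) - t), lambda t: (0.0, -1.0), 0.0, f(a)), # r4 reversed
]
boundary_integral = sum(path_integral(*p) for p in pieces)

print(area_integral, boundary_integral)
```

For this choice of data a direct calculation gives $\int_\Omega \partial F_1 / \partial y \, dxdy = \int_0^1 x\,(1 + x^2)^2\,dx = 7/6$, so the two printed numbers should be close to $7/6$ and $-7/6$, respectively.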
The 'derivative'
\[ -\frac{\partial F_1}{\partial y} \tag{4.6.2} \]
of the vector field $F$ does not look very natural. Later on, we will see that this derivative is given by
\[ \operatorname{curl} F = \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \]
for more general vector fields $F$, where $F_2$ denotes the corresponding second component function. In the case that $F_2$ vanishes, this reduces to (4.6.2). The last derivative does not seem very natural either, since it is unsymmetrical in the components of the vector field. An understanding of the structure of such derivatives can be achieved by the introduction of so-called 'differential forms', as is done in differential geometry courses. See also [63], XXI. Such forms will not be introduced in this course, and consequently no explanation of the structure of such derivatives will be given. Apart from practical reasons, the mathematical reason for not introducing differential forms is the fact that, beyond the explanation of that structure of the derivatives, differential forms are not of much further use in this connection, because they usually make unnecessarily strong assumptions on the differentiability of vector fields / differential forms. As a consequence, the integral theorems of Green, Stokes and Gauss obtained by those methods are usually weak, even compared to those from the present text, which uses quite elementary methods. By use of the Lebesgue integral, the methods in this text would lead to far stronger results.

An additional observation concerns the definition of the vector field $F$. What would have been the result if we had defined $F$ by
\[ F(x,y) := (0, F_1(x,y)) \]
for all $(x,y) \in U$, instead? This would also have been a good choice, and we would have arrived at a special case of a version of Gauss' theorem, see Theorem 4.6.27, in two space dimensions. The main difference of that approach is that it arrives at boundary integrals that are not path integrals, but integrals that describe the flow of a vector field through the boundary.
In connection with Gauss' and Stokes' theorems, we will have to define such flow integrals later on. From the discussion, the reader can also correctly conclude that there are several forms of generalizations of the fundamental theorem of calculus to vector-valued functions of several variables. What form of generalization is used depends on the application at hand.

A final observation concerns the peculiar way the path $r$ traverses the boundary of $\Omega$ in (4.6.1). It starts at the point $(a,0)$, proceeds through $(b,0)$, $(b,f(b))$, $(a,f(a))$ and ends in $(a,0)$. In this way, it traverses the points of the boundary in counterclockwise direction. The last direction is also called 'mathematically positive'. The reader might wonder how this direction can be decided in general without the use of geometric intuition. For this, we observe that the path $r$ separates $\mathbb{R}^2$ into two regions, a part which is 'outside' the boundary of $\Omega$ and a part that is 'inside' that boundary. In every point of $r$ for which there exists a corresponding tangent, there are two directions that are orthogonal to the tangent. One is pointing towards the outside and the other one is pointing towards the inside. For such points on $B_1$, $B_2$, $B_3$, $B_4$, the outward pointing direction is given by
\[ (0,-1) , \quad (1,0) , \quad \alpha(x).(-f'(x), 1) , \quad (-1,0) \]
for every $x \in (a,b)$, respectively, where
\[ \alpha(x) := \frac{1}{\sqrt{1 + [f'(x)]^2}} \]
for every $x \in (a,b)$, and the direction of the tangent is given by
\[ (1,0) , \quad (0,1) , \quad \alpha(x).(-1, -f'(x)) , \quad (0,-1) \]
for every $x \in (a,b)$, respectively. Therefore, we conclude that their corresponding determinants are given by
\begin{align*}
& \det((0,-1),(1,0)) = 1 , \quad \det((1,0),(0,1)) = 1 , \\
& \det(\alpha(x).(-f'(x),1), \alpha(x).(-1,-f'(x))) = 1 , \quad \det((-1,0),(0,-1)) = 1
\end{align*}
for every $x \in (a,b)$.
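The determinant test can be tried numerically. The sketch below assumes a hypothetical sample function $f(x) = 1 + \sin x$ (not from the text) and evaluates, on each of the boundary pieces $B_1, \dots, B_4$, the determinant of the outward pointing direction and the tangent direction listed above. The helper names `det2` and `boundary_determinants` are illustrative.

```python
import math

def det2(u, v):
    """Determinant of the 2x2 matrix with rows u and v."""
    return u[0] * v[1] - u[1] * v[0]

def df(x):
    """Derivative of the hypothetical sample function f(x) = 1 + sin(x)."""
    return math.cos(x)

def boundary_determinants(x):
    """det(outward direction, tangent direction) on B1, B2, B3, B4."""
    alpha = 1.0 / math.sqrt(1.0 + df(x) ** 2)
    outward = [(0.0, -1.0), (1.0, 0.0), (-alpha * df(x), alpha), (-1.0, 0.0)]
    tangent = [(1.0, 0.0), (0.0, 1.0), (-alpha, -alpha * df(x)), (0.0, -1.0)]
    return [det2(n, t) for n, t in zip(outward, tangent)]

print(boundary_determinants(0.3))  # four values, each equal to 1 up to rounding
```

Replacing each outward direction by its negative flips the sign of every entry of the returned list.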
Since the change of all signs of the entries of one row of a determinant leads to an overall change of sign, we conclude that those determinants are all equal to $-1$ if we replace all outward pointing directions by the corresponding inward pointing directions. Hence in the application of (4.6.1), the piecewise $C^1$-path $r$ traversing the boundary of $\Omega$ needs to be chosen in such a way that, in all points where a tangent vector exists, the determinant of the orthogonal direction pointing towards the outside and the tangent vector is $> 0$. Still, the question remains whether every closed continuous path $r : [c,d] \to \mathbb{R}^2$, where $c, d \in \mathbb{R}$ are such that $c < d$, without self-intersection, i.e., such that $r(t_1) \neq r(t_2)$ for different $t_1, t_2 \in (c,d)$, separates $\mathbb{R}^2$ into two regions that both have its range as boundary. Indeed, this is the case according to the Jordan curve theorem. For an elementary proof of this theorem, see [80].

After this introduction, we start with the definition of the orientation of $n$-tuples of vectors in $\mathbb{R}^n$ where $n \in \mathbb{N}$ is such that $n \geq 2$.

Definition 4.6.1. (The orientation of $n$-tuples of vectors in $\mathbb{R}^n$) Let $n \in \mathbb{N}$ be such that $n \geq 2$ and $(a_1, \dots, a_n)$ be an $n$-tuple of vectors in $\mathbb{R}^n$. Then we say that $(a_1, \dots, a_n)$ is positively oriented, negatively oriented if
\[ \det(a_1, \dots, a_n) > 0 \quad \text{and} \quad \det(a_1, \dots, a_n) < 0 , \]
respectively. Note that exchanging the order of two elements in a positively oriented $n$-tuple leads to a negatively oriented $n$-tuple and vice versa. In particular, since
\[ \det(e_1, \dots, e_n) = 1 > 0 , \]
the $n$-tuple $(e_1, \dots, e_n)$ consisting of the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ is positively oriented.

Example 4.6.2.
If $a = (a_1, a_2) \in \mathbb{R}^2 \setminus \{0\}$ and $b \in \mathbb{R}^2 \setminus \{0\}$ has the same direction as the rotation of $a$ in counterclockwise (= mathematically positive) sense around the origin by the angle $\alpha \in (0,\pi)$, then
\[ b = \lambda.(a_1 \cos(\alpha) - a_2 \sin(\alpha), a_1 \sin(\alpha) + a_2 \cos(\alpha)) \]
for some $\lambda > 0$ and
\[ \det(a,b) = \lambda \begin{vmatrix} a_1 & a_2 \\ a_1 \cos(\alpha) - a_2 \sin(\alpha) & a_1 \sin(\alpha) + a_2 \cos(\alpha) \end{vmatrix} = \lambda \, |a|^2 \sin(\alpha) > 0 \]
and hence the pair $(a,b)$ in $\mathbb{R}^2$ is positively oriented. This fact is often used to decide whether a given pair of vectors in $\mathbb{R}^2$ is positively oriented.

Fig. 196: Since $0 < \alpha < \pi$, the pair of vectors $(a,b)$ in $\mathbb{R}^2$ is positively oriented. See Example 4.6.2.

Fig. 197: The triple of vectors $(a, b, a \times b)$ in $\mathbb{R}^3$ is positively oriented. See Example 4.6.3.

Example 4.6.3. If $a, b \in \mathbb{R}^3$ are vectors that are not multiples of each other, it follows by Remark 3.5.15 and Definition 3.5.18 that
\[ \det(a, b, a \times b) = \det(a \times b, a, b) = |a \times b|^2 > 0 \]
and hence that the triple $(a, b, a \times b)$ in $\mathbb{R}^3$ is positively oriented. In applications, this is often used for the construction of positively oriented triples in $\mathbb{R}^3$.

4.6.1 Green's Theorem

We continue with Green's theorem for images of rectangles under certain differentiable maps. The basis for its proof is given by the following lemma referring to transformation properties of the curl of a vector field. The lemma can be proved by a straightforward calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz's theorem 4.2.18.

Lemma 4.6.4. Let $V$ be a non-empty open subset of $\mathbb{R}^2$ and $F = (F_1, F_2) : V \to \mathbb{R}^2$ be differentiable. Further, let $g = (g_1, g_2) : D(g) \to \mathbb{R}^2$ be defined and of class $C^2$ on a non-empty open subset $D(g)$ of $\mathbb{R}^2$ and such that $g(D(g)) \subset V$. Then
\[ \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] = \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \cdot \det(g') . \]

Proof. The proof proceeds by a simple calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz's theorem 4.2.18.
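\begin{align*}
& \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] \\
& = \frac{\partial (F_1 \circ g)}{\partial x} \frac{\partial g_1}{\partial y} + (F_1 \circ g) \frac{\partial^2 g_1}{\partial x \partial y} + \frac{\partial (F_2 \circ g)}{\partial x} \frac{\partial g_2}{\partial y} + (F_2 \circ g) \frac{\partial^2 g_2}{\partial x \partial y} \\
& \quad - \frac{\partial (F_1 \circ g)}{\partial y} \frac{\partial g_1}{\partial x} - (F_1 \circ g) \frac{\partial^2 g_1}{\partial y \partial x} - \frac{\partial (F_2 \circ g)}{\partial y} \frac{\partial g_2}{\partial x} - (F_2 \circ g) \frac{\partial^2 g_2}{\partial y \partial x} \\
& = \frac{\partial (F_1 \circ g)}{\partial x} \frac{\partial g_1}{\partial y} + \frac{\partial (F_2 \circ g)}{\partial x} \frac{\partial g_2}{\partial y} - \frac{\partial (F_1 \circ g)}{\partial y} \frac{\partial g_1}{\partial x} - \frac{\partial (F_2 \circ g)}{\partial y} \frac{\partial g_2}{\partial x} \\
& = \left[ \left( \frac{\partial F_1}{\partial x} \circ g \right) \frac{\partial g_1}{\partial x} + \left( \frac{\partial F_1}{\partial y} \circ g \right) \frac{\partial g_2}{\partial x} \right] \frac{\partial g_1}{\partial y} + \left[ \left( \frac{\partial F_2}{\partial x} \circ g \right) \frac{\partial g_1}{\partial x} + \left( \frac{\partial F_2}{\partial y} \circ g \right) \frac{\partial g_2}{\partial x} \right] \frac{\partial g_2}{\partial y} \\
& \quad - \left[ \left( \frac{\partial F_1}{\partial x} \circ g \right) \frac{\partial g_1}{\partial y} + \left( \frac{\partial F_1}{\partial y} \circ g \right) \frac{\partial g_2}{\partial y} \right] \frac{\partial g_1}{\partial x} - \left[ \left( \frac{\partial F_2}{\partial x} \circ g \right) \frac{\partial g_1}{\partial y} + \left( \frac{\partial F_2}{\partial y} \circ g \right) \frac{\partial g_2}{\partial y} \right] \frac{\partial g_2}{\partial x} \\
& = \left( \frac{\partial F_1}{\partial y} \circ g \right) \left( \frac{\partial g_2}{\partial x} \frac{\partial g_1}{\partial y} - \frac{\partial g_2}{\partial y} \frac{\partial g_1}{\partial x} \right) + \left( \frac{\partial F_2}{\partial x} \circ g \right) \left( \frac{\partial g_1}{\partial x} \frac{\partial g_2}{\partial y} - \frac{\partial g_1}{\partial y} \frac{\partial g_2}{\partial x} \right) \\
& = \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \cdot \det(g') .
\end{align*}

Green's theorem for images of rectangles under certain differentiable maps is a consequence of the previous lemma, Lemma 4.6.4, and change of variables, Theorem 4.4.23.

Theorem 4.6.5. (Green's theorem for images of rectangles) Let $a, b, c, d \in \mathbb{R}$ be such that $a < b$ and $c < d$ and $I := [a,b] \times [c,d]$, $I_0 := (a,b) \times (c,d)$. Further, let $U \supset I$ be an open subset of $\mathbb{R}^2$, $g : U \to \mathbb{R}^2$ be twice continuously differentiable such that the induced map from $U$ to $g(U)$ is bijective with a continuously differentiable inverse and such that $\det(g') > 0$. Finally, let $V \supset g(I)$ be an open subset of $\mathbb{R}^2$ and $F = (F_1, F_2) : V \to \mathbb{R}^2$ be continuously differentiable. Then
\[ \int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_r F \cdot dr \tag{4.6.3} \]
for any piecewise $C^1$-parametrization $r$ of the boundary of $g(I_0)$ which is of the same orientation as the piecewise $C^2$-path $(r_c, r_b, r_d^{-}, r_a^{-})$ where
\[ r_c(x) := g(x,c) , \quad r_b(y) := g(b,y) , \quad r_d(x) := g(x,d) , \quad r_a(y) := g(a,y) \]
for all $x \in [a,b]$ and $y \in [c,d]$.

Fig. 198: Illustration for the proof of Green's theorem, Theorem 4.6.5.

Proof. In a first step, we consider the set $g(I_0)$. Since $g$ is twice continuously differentiable with a continuously differentiable inverse, $g(I_0)$ is a bounded open subset of $\mathbb{R}^2$. Further, the restriction of
\[ \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \]
to $g(I_0)$ is bounded.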
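In addition, it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of this function to a function, defined on a closed subinterval $J$ of $\mathbb{R}^2$ containing $g(I_0)$ and assuming the value zero in the points of $J \setminus g(I_0)$, is Riemann-integrable. Hence by Theorem 4.4.23, it follows in a second step that
\[ \int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_{I_0} \left[ \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \circ g \right] \det(g') \, dxdy \]
and hence by the previous Lemma 4.6.4 that
\begin{align*}
\int_{g(I_0)} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = {} & \int_{I_0} \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] dxdy \\
& - \int_{I_0} \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] dxdy . \tag{4.6.4}
\end{align*}
Further, by Fubini's Theorem 4.4.18 and the fundamental theorem of calculus, Theorem 2.6.21, it follows that
\begin{align*}
& \int_{I_0} \frac{\partial}{\partial x} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial y} + (F_2 \circ g) \frac{\partial g_2}{\partial y} \right] dxdy \tag{4.6.5} \\
& = \int_c^d \left[ (F_1 \circ g)(b,y) \frac{\partial g_1}{\partial y}(b,y) + (F_2 \circ g)(b,y) \frac{\partial g_2}{\partial y}(b,y) \right] dy \\
& \quad - \int_c^d \left[ (F_1 \circ g)(a,y) \frac{\partial g_1}{\partial y}(a,y) + (F_2 \circ g)(a,y) \frac{\partial g_2}{\partial y}(a,y) \right] dy
\end{align*}
and
\begin{align*}
& \int_{I_0} \frac{\partial}{\partial y} \left[ (F_1 \circ g) \frac{\partial g_1}{\partial x} + (F_2 \circ g) \frac{\partial g_2}{\partial x} \right] dxdy \tag{4.6.6} \\
& = \int_a^b \left[ (F_1 \circ g)(x,d) \frac{\partial g_1}{\partial x}(x,d) + (F_2 \circ g)(x,d) \frac{\partial g_2}{\partial x}(x,d) \right] dx \\
& \quad - \int_a^b \left[ (F_1 \circ g)(x,c) \frac{\partial g_1}{\partial x}(x,c) + (F_2 \circ g)(x,c) \frac{\partial g_2}{\partial x}(x,c) \right] dx .
\end{align*}
Finally, (4.6.3) follows from (4.6.4), (4.6.5) and (4.6.6).

Remark 4.6.6. For the orientation of the piecewise $C^2$-path $(r_c, r_b, r_d^{-}, r_a^{-})$ note that the region $g(I_0)$ is bounded and hence that there is an outward pointing unit normal for every point on its boundary, apart from the corner points $g(a,c)$, $g(b,c)$, $g(b,d)$ and $g(a,d)$. Outward pointing vectors are given by
\begin{align*}
v_c(x,c) & = - \left( \frac{\partial g_1}{\partial y}(x,c), \frac{\partial g_2}{\partial y}(x,c) \right) , \quad
v_b(b,y) = \left( \frac{\partial g_1}{\partial x}(b,y), \frac{\partial g_2}{\partial x}(b,y) \right) , \\
v_d(x,d) & = \left( \frac{\partial g_1}{\partial y}(x,d), \frac{\partial g_2}{\partial y}(x,d) \right) , \quad
v_a(a,y) = - \left( \frac{\partial g_1}{\partial x}(a,y), \frac{\partial g_2}{\partial x}(a,y) \right)
\end{align*}
for every $x \in (a,b)$ and $y \in (c,d)$.

Fig. 199: Illustration for the proof of Green's theorem, Theorem 4.6.7.

In particular, as a consequence of the assumption that $\det(g') > 0$ in the previous Theorem 4.6.5, it follows that
\[ \det(v_c(x,c), r_c'(x)) > 0 , \quad \det(v_b(b,y), r_b'(y)) > 0 , \quad -\det(v_d(x,d), r_d'(x)) > 0 , \quad -\det(v_a(a,y), r_a'(y)) > 0 \]
for all $x \in (a,b)$ and $y \in (c,d)$.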
Hence the orientation for the piecewise $C^1$-parametrization $r$ of the boundary of $g(I_0)$ in Theorem 4.6.5 has to be such that the outward unit normal and the tangent vector in a point of the boundary of $g(I_0)$ are positively oriented in every point of the boundary, apart from a finite number of points. This orientation is indicated in Fig. 198.

From Green's theorem for images of rectangles, we can conclude Green's theorem for regions bounded by graphs. The last has wider applications.

Theorem 4.6.7. (Green's theorem for regions bounded by graphs) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f_1 : [a,b] \to \mathbb{R}$ and $f_2 : [a,b] \to \mathbb{R}$ be restrictions of twice continuously differentiable functions defined on open intervals containing $[a,b]$. In addition, let $f_1$, $f_2$ be such that $f_1(x) < f_2(x)$ for all $x \in (a,b)$.[1] Further, let
\[ \Omega := \{ (x,y) \in \mathbb{R}^2 : a < x < b \wedge f_1(x) < y < f_2(x) \} . \]
In particular, let $\Omega$ be such that there is a $0 < \delta < (b-a)/2$ such that the corresponding sets $\Omega \cap ((a, a+\delta) \times \mathbb{R})$ and $\Omega \cap ((b-\delta, b) \times \mathbb{R})$ are convex. Finally, let $F = (F_1, F_2) : V \to \mathbb{R}^2$ be continuously differentiable where $V$ is an open subset of $\mathbb{R}^2$ containing $\Omega$ and its boundary. Then
\[ \int_\Omega \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_r F \cdot dr \tag{4.6.7} \]
for any piecewise $C^1$-parametrization $r$ of the boundary of $\Omega$ which is of the same ('mathematically positive', 'counterclockwise') orientation as the piecewise $C^2$-path $(r_1, r_b, r_2^{-}, r_a^{-})$, where
\begin{align*}
r_1(x) & := (x, f_1(x)) , \quad r_b(\lambda) := (b, f_1(b) + \lambda [f_2(b) - f_1(b)]) , \\
r_2(x) & := (x, f_2(x)) , \quad r_a(\lambda) := (a, f_1(a) + \lambda [f_2(a) - f_1(a)])
\end{align*}
for all $x \in [a,b]$ and $\lambda \in [0,1]$.

[1] Note that we do not demand that $f_1(a) < f_2(a)$ or that $f_1(b) < f_2(b)$. As a consequence, in Fig. 199, each of the line segments of the boundary of $\Omega$ that are parallel to the $y$-axis can consist of one point only.

Proof. For this, define the open subset $U$ of $\mathbb{R}^2$ by $U := (a,b) \times \mathbb{R}$ and $g : U \to U$ by
\[ g(x,\lambda) := (x, f_1(x) + \lambda [f_2(x) - f_1(x)]) \]
for all $(x,\lambda) \in U$.
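In particular, $g$ is bijective, of class $C^2$ with an inverse of class $C^2$ given by
\[ g^{-1}(x,y) = \left( x, \frac{y - f_1(x)}{f_2(x) - f_1(x)} \right) \]
for all $(x,y) \in U$. Moreover,
\[ \det(g'(x,\lambda)) = f_2(x) - f_1(x) > 0 \]
for all $(x,\lambda) \in U$. Further,
\[ g((a,b) \times (0,1)) = \Omega , \tag{4.6.8} \]
and hence $\Omega$ is an open subset of $\mathbb{R}^2$. The validity of (4.6.8) can be seen as follows. First, for $(x,\lambda) \in (a,b) \times (0,1)$, it follows that
\[ f_1(x) < f_1(x) + \lambda [f_2(x) - f_1(x)] < f_1(x) + f_2(x) - f_1(x) = f_2(x) \]
and hence that $g(x,\lambda) \in \Omega$. Second, for $(x,y) \in \Omega$, it follows that
\[ 0 < \frac{y - f_1(x)}{f_2(x) - f_1(x)} < \frac{f_2(x) - f_1(x)}{f_2(x) - f_1(x)} = 1 \]
and hence that $g^{-1}(x,y) \in (a,b) \times (0,1)$.

In the following, let $0 < \delta < (b-a)/2$ be such that the corresponding sets $\Omega \cap ((a, a+\delta) \times \mathbb{R})$ and $\Omega \cap ((b-\delta, b) \times \mathbb{R})$ are convex. Further, let $0 < \varepsilon < \delta$, $I_\varepsilon := [a+\varepsilon, b-\varepsilon] \times [0,1]$ and $I_{0,\varepsilon} := (a+\varepsilon, b-\varepsilon) \times (0,1)$. Then $U \supset I_\varepsilon$ and $V \supset g(I_\varepsilon)$. Hence it follows by Theorem 4.6.5 that
\[ \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_{r_\varepsilon} F \cdot dr_\varepsilon \tag{4.6.9} \]
where $r_\varepsilon$ is the piecewise $C^2$-path $(r_{1,\varepsilon}, r_{b-\varepsilon}, r_{2,\varepsilon}^{-}, r_{a+\varepsilon}^{-})$ given by
\begin{align*}
r_{1,\varepsilon}(x) & := (x, f_1(x)) , \quad r_{b-\varepsilon}(\lambda) := (b-\varepsilon, f_1(b-\varepsilon) + \lambda [f_2(b-\varepsilon) - f_1(b-\varepsilon)]) , \\
r_{2,\varepsilon}(x) & := (x, f_2(x)) , \quad r_{a+\varepsilon}(\lambda) := (a+\varepsilon, f_1(a+\varepsilon) + \lambda [f_2(a+\varepsilon) - f_1(a+\varepsilon)])
\end{align*}
for all $x \in [a+\varepsilon, b-\varepsilon]$ and $\lambda \in [0,1]$.

In the following final step of the proof, we show that (4.6.7) follows from (4.6.9) by performing the limit $\varepsilon \to 0$. For this, let $\hat{f}_1, \hat{f}_2 : (a-\delta', b+\delta') \to \mathbb{R}$ be twice continuously differentiable extensions of $f_1$ and $f_2$, respectively, for some $\delta' > 0$. Then by
\[ \hat{g}(x,\lambda) := (x, \hat{f}_1(x) + \lambda [\hat{f}_2(x) - \hat{f}_1(x)]) \]
for all $(x,\lambda) \in (a-\delta', b+\delta') \times \mathbb{R}$, there is defined a twice continuously differentiable extension of $g$, and hence it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of
\[ \left. \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \right|_\Omega \]
to a function that is defined on a closed subinterval $J$ of $\mathbb{R}^2$ containing $\Omega$ and assuming the value zero in the points of $J \setminus \Omega$, is Riemann-integrable.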
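Further, since $\Omega \setminus g(I_{0,\varepsilon})$ is contained in the union of two strips of width $\varepsilon$, it follows that
\[ \left| \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy - \int_\Omega \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy \right| \leq 2 M_1 \max_{x \in [a,b]} [f_2(x) - f_1(x)] \, \varepsilon \]
where $M_1 > 0$ denotes the maximum of
\[ \left| \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right| \]
on some closed subset that is contained in $V$ and at the same time contains $\Omega$. Hence,
\[ \lim_{\varepsilon \to 0} \int_{g(I_{0,\varepsilon})} \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy = \int_\Omega \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) dxdy . \]
Further,
\begin{align*}
& \left| \int_{r_\varepsilon} F \cdot dr_\varepsilon - \int_r F \cdot dr \right| \\
& \leq \left| \int_a^{a_\varepsilon} F(x, f_1(x)) \cdot (1, f_1'(x)) \, dx \right| + \left| \int_a^{a_\varepsilon} F(x, f_2(x)) \cdot (1, f_2'(x)) \, dx \right| \\
& \quad + \left| [f_2(a_\varepsilon) - f_1(a_\varepsilon)] \int_0^1 F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda [f_2(a_\varepsilon) - f_1(a_\varepsilon)]) \, d\lambda - [f_2(a) - f_1(a)] \int_0^1 F_2(a, f_1(a) + \lambda [f_2(a) - f_1(a)]) \, d\lambda \right| \\
& \quad + \left| \int_{b_\varepsilon}^{b} F(x, f_1(x)) \cdot (1, f_1'(x)) \, dx \right| + \left| \int_{b_\varepsilon}^{b} F(x, f_2(x)) \cdot (1, f_2'(x)) \, dx \right| \\
& \quad + \left| [f_2(b_\varepsilon) - f_1(b_\varepsilon)] \int_0^1 F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda [f_2(b_\varepsilon) - f_1(b_\varepsilon)]) \, d\lambda - [f_2(b) - f_1(b)] \int_0^1 F_2(b, f_1(b) + \lambda [f_2(b) - f_1(b)]) \, d\lambda \right|
\end{align*}
where $r := (r_1, r_b, r_2^{-}, r_a^{-})$ and $a_\varepsilon := a + \varepsilon$, $b_\varepsilon := b - \varepsilon$. In the following, we estimate the individual terms of the last sum. First,
\begin{align*}
\left| \int_I F(x, f_1(x)) \cdot (1, f_1'(x)) \, dx \right| & \leq \int_I |F(x, f_1(x))| \cdot |(1, f_1'(x))| \, dx \leq \varepsilon M_2 (1 + M_3^2)^{1/2} , \\
\left| \int_I F(x, f_2(x)) \cdot (1, f_2'(x)) \, dx \right| & \leq \int_I |F(x, f_2(x))| \cdot |(1, f_2'(x))| \, dx \leq \varepsilon M_2 (1 + M_4^2)^{1/2}
\end{align*}
for every interval $I \subset [a,b]$ of length $\varepsilon$. Here $M_2 \geq 0$ denotes the maximum of $|F|$ on some closed subset that is contained in $V$ and at the same time contains $\Omega$; $M_3 \geq 0$ denotes the maximum of the restriction of $|\hat{f}_1'|$ to $[a,b]$; $M_4 \geq 0$ denotes the maximum of the restriction of $|\hat{f}_2'|$ to $[a,b]$.

Second, it follows by use of Taylor's Theorem 4.3.6 that
\begin{align*}
& \left| [f_2(a_\varepsilon) - f_1(a_\varepsilon)] \int_0^1 F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda [f_2(a_\varepsilon) - f_1(a_\varepsilon)]) \, d\lambda - [f_2(a) - f_1(a)] \int_0^1 F_2(a, f_1(a) + \lambda [f_2(a) - f_1(a)]) \, d\lambda \right| \\
& \leq |f_2(a_\varepsilon) - f_1(a_\varepsilon)| \int_0^1 |F_2(a_\varepsilon, f_1(a_\varepsilon) + \lambda [f_2(a_\varepsilon) - f_1(a_\varepsilon)]) - F_2(a, f_1(a) + \lambda [f_2(a) - f_1(a)])| \, d\lambda \\
& \quad + |f_2(a_\varepsilon) - f_2(a) - f_1(a_\varepsilon) + f_1(a)| \int_0^1 |F_2(a, f_1(a) + \lambda [f_2(a) - f_1(a)])| \, d\lambda \\
& \leq \left\{ M_2 (M_3 + M_4) + M_7 (M_5 + M_6) \left[ 1 + (M_3 + M_4)^2 \right]^{1/2} \right\} \varepsilon
\end{align*}
and that
\begin{align*}
& \left| [f_2(b_\varepsilon) - f_1(b_\varepsilon)] \int_0^1 F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda [f_2(b_\varepsilon) - f_1(b_\varepsilon)]) \, d\lambda - [f_2(b) - f_1(b)] \int_0^1 F_2(b, f_1(b) + \lambda [f_2(b) - f_1(b)]) \, d\lambda \right| \\
& \leq |f_2(b_\varepsilon) - f_1(b_\varepsilon)| \int_0^1 |F_2(b_\varepsilon, f_1(b_\varepsilon) + \lambda [f_2(b_\varepsilon) - f_1(b_\varepsilon)]) - F_2(b, f_1(b) + \lambda [f_2(b) - f_1(b)])| \, d\lambda \\
& \quad + |f_2(b_\varepsilon) - f_2(b) - f_1(b_\varepsilon) + f_1(b)| \int_0^1 |F_2(b, f_1(b) + \lambda [f_2(b) - f_1(b)])| \, d\lambda \\
& \leq \left\{ M_2 (M_3 + M_4) + M_7 (M_5 + M_6) \left[ 1 + (M_3 + M_4)^2 \right]^{1/2} \right\} \varepsilon .
\end{align*}
Here $M_5 \geq 0$ denotes the maximum of $|f_1|$ on $[a,b]$; $M_6 \geq 0$ denotes the maximum of $|f_2|$ on $[a,b]$; $M_7 \geq 0$ denotes the maximum of $|\nabla F_2|$ on some closed subset that is contained in $V$ and at the same time contains $\Omega$. As a consequence, it follows that
\[ \lim_{\varepsilon \to 0} \int_{r_\varepsilon} F \cdot dr_\varepsilon = \int_r F \cdot dr \]
and hence, finally, (4.6.7).

Remark 4.6.8. Note that in the previous Theorem 4.6.7, the assumption of convexity of $\Omega \cap ((a, a+\delta) \times \mathbb{R})$ for some $0 < \delta < (b-a)/2$ is redundant if $f_1(a) < f_2(a)$, and the assumption of convexity of $\Omega \cap ((b-\delta, b) \times \mathbb{R})$ for some $0 < \delta < (b-a)/2$ is redundant in the case that $f_1(b) < f_2(b)$.

Remark 4.6.9. Green's theorem can be generalized to regions which can be dissected into regions that satisfy the demands of Theorem 4.6.5 or Theorem 4.6.7. Green's theorems, Theorem 4.6.5 or Theorem 4.6.7, are then applied to the parts of the dissection. In this, cuts are traversed twice, but in opposite directions such that their contributions cancel in the sum. For such a case, see Example 4.6.12.

Example 4.6.10. (Area of the interior of an ellipse) Let $a$, $b$ be strictly positive real numbers and
\[ U := \left\{ (x,y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} < 1 \right\} \]
be the interior of the ellipse with half-axes $a$ and $b$ around the origin. Then $r : [-\pi,\pi] \to \mathbb{R}^2$, defined by
\[ r(\varphi) := (a \cos \varphi, b \sin \varphi) \]
for all $\varphi \in [-\pi,\pi]$, is a $C^1$-parametrization of that ellipse with positive orientation. Finally, define the vector field $F : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ F(x,y) := \tfrac{1}{2}.(-y, x) \]
for all $(x,y) \in \mathbb{R}^2$. Then it follows by Theorem 4.6.7 that
\[ \int_U dxdy = \int_r F \cdot dr = \frac{1}{2} \int_{-\pi}^{\pi} (-b \sin t, a \cos t) \cdot (-a \sin t, b \cos t) \, dt = \pi a b . \]
In this way, the area enclosed by the ellipse has been calculated by evaluation of a path integral.

Example 4.6.11. (Area of the interior of a plane curve given in polar coordinates) Let $\Omega$ be a subset of $\mathbb{R}^2$ that satisfies the assumptions for $\Omega$ in Theorem 4.6.7. Further, let $u$ be a positively oriented $C^1$-parametrization of $\partial \Omega$ given as follows. For this, let $a, b \in \mathbb{R}$ be such that $a \leq b$, $I := [a,b]$, $r : I \to \mathbb{R}$ and $\varphi : I \to \mathbb{R}$ be continuous as well as differentiable on $(a,b)$ with derivatives that can be extended to continuous functions on $I$. Then by
\[ u(t) := ( r(t) \cos \varphi(t) , r(t) \sin \varphi(t) ) \]
for every $t \in I$, there is defined a $C^1$-path. Note that for $t \in I$, $r(t)$ and $\varphi(t)$ can be interpreted as polar coordinates of $u(t)$ if $r(t) > 0$ and $\varphi(t) \in (-\pi,\pi)$. In particular for $t \in (a,b)$,
\[ u'(t) = ( r'(t) \cos \varphi(t) - r(t) \varphi'(t) \sin \varphi(t) , r'(t) \sin \varphi(t) + r(t) \varphi'(t) \cos \varphi(t) ) . \]
As in the previous example, we define the vector field $F : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ F(x,y) := \tfrac{1}{2}.(-y, x) \]
for all $(x,y) \in \mathbb{R}^2$. Then it follows by Theorem 4.6.7 that
\[ \int_\Omega dxdy = \int_u F \cdot du = \frac{1}{2} \int_a^b (-r(t) \sin \varphi(t) , r(t) \cos \varphi(t)) \cdot u'(t) \, dt = \frac{1}{2} \int_a^b r^2(t) \, \varphi'(t) \, dt . \]
In this way, the area of $\Omega$ can be calculated by a Riemann integral of a function in one variable.

The following example gives a typical application of Green's theorem in the area of partial differential equations. It considers solutions of wave equations. Ultimately, it will lead to the proof of the causal behavior of the solutions, i.e., the fact that two solutions, whose values coincide on an interval $I \subset \mathbb{R}$ at time $t = 0$ and whose partial time derivatives coincide on that same set, coincide on the area of a certain 'characteristic triangle' that is contained in $I \times [0,\infty)$ and has $I$ as basis.

Example 4.6.12.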
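(An energy inequality for a wave equation in one space dimension) We consider a function $u : U \to \mathbb{R}$ of class $C^2$ that satisfies the wave equation
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} + V u = 0 , \tag{4.6.10} \]
where $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\mathrm{Ran}(V) \subset [0,\infty)$, and is such that
\[ \frac{\partial V}{\partial t} = 0 . \]
In this, $U$ is a non-empty open subset of $\mathbb{R}^2$.

Fig. 200: Domain of integration in Example 4.6.12.

Then the functions $\epsilon$, $j$ defined by
\[ \epsilon := \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + \left( \frac{\partial u}{\partial x} \right)^2 + V u^2 \right] , \quad j := -\frac{\partial u}{\partial x} \frac{\partial u}{\partial t} \]
satisfy
\begin{align*}
\frac{\partial \epsilon}{\partial t} & = \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial t \partial x} + V u \frac{\partial u}{\partial t} = \frac{\partial u}{\partial t} \left( \frac{\partial^2 u}{\partial x^2} - V u \right) + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial x \partial t} + V u \frac{\partial u}{\partial t} \\
& = \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial x^2} + \frac{\partial u}{\partial x} \frac{\partial^2 u}{\partial x \partial t} = -\frac{\partial j}{\partial x} .
\end{align*}
Hence we conclude the conservation law
\[ \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} = 0 . \tag{4.6.11} \]
Note for later use that
\[ \epsilon(x,t) \geq |j(x,t)| \tag{4.6.12} \]
for all $(x,t) \in U$. In physical applications, $\epsilon$ is called the energy density (corresponding to $u$) and $j$ is called the energy flux density (corresponding to $u$). Integration of $\epsilon(\cdot,t)$ over an interval of $\mathbb{R}$ gives the energy of $u$ that is contained in that interval at time $t \in \mathbb{R}$. The function $j$ describes the flow of that energy.

In the following, we derive an important consequence of (4.6.11). For this, let $(\xi,\tau) \in \mathbb{R} \times (0,\infty)$, and let the area enclosed by the triangle with corners $(\xi-\tau,0)$, $(\xi+\tau,0)$ and $(\xi,\tau)$ be contained in $U$. We integrate (4.6.11) over the subarea $A$ enclosed by the trapezoid with corners $(\xi-\tau,0)$, $(\xi+\tau,0)$, $(\xi+(\tau-T),T)$ and $(\xi-(\tau-T),T)$ where $0 \leq T < \tau$. We will show that the energy content at time $T$ in the interval $[\xi-(\tau-T), \xi+(\tau-T)]$ is equal to or smaller than the energy content at time $0$ in the interval $[\xi-\tau, \xi+\tau]$:
\[ \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T) \, dx \leq \int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0) \, dx . \tag{4.6.13} \]
Indeed, it follows that
\[ 0 = \int_A \left( \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} \right) dxdt = \sum_{i=1}^{3} \int_{A_i} \left( \frac{\partial j}{\partial x} + \frac{\partial \epsilon}{\partial t} \right) dxdt = \int_r (-\epsilon, j) \cdot dr \]
where $r$ is the piecewise $C^2$-path $(r_1, r_2, r_3^{-}, r_4^{-})$ given by
\begin{align*}
r_1(y_1) & := (y_1, 0) , \quad r_2(\lambda) := (\xi + \tau - \lambda T, \lambda T) , \\
r_3(y_3) & := (y_3, T) , \quad r_4(\lambda) := (\xi - \tau + \lambda T, \lambda T)
\end{align*}
for all $y_1 \in [\xi-\tau, \xi+\tau]$, $y_3 \in [\xi-(\tau-T), \xi+(\tau-T)]$ and $\lambda \in [0,1]$.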
Note that in this, we dissect $A$ into the area $A_1$ enclosed by the triangle with corners $(\xi-\tau,0)$, $(\xi-(\tau-T),0)$, $(\xi-(\tau-T),T)$, the area $A_2$ enclosed by the rectangle with corners $(\xi-(\tau-T),0)$, $(\xi+(\tau-T),0)$, $(\xi+(\tau-T),T)$, $(\xi-(\tau-T),T)$, and the area $A_3$ enclosed by the triangle with corners $(\xi+(\tau-T),0)$, $(\xi+\tau,0)$, $(\xi+(\tau-T),T)$, and apply Green's Theorem 4.6.7 to these areas. The cuts are traversed twice, but in opposite directions such that their contributions cancel in the sum, as indicated in Fig. 200. Further, we conclude that
\begin{align*}
0 = {} & -\int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0) \, dx + \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T) \, dx \\
& + \int_0^1 (-\epsilon, j)(r_2(\lambda)) \cdot (-T, T) \, d\lambda - \int_0^1 (-\epsilon, j)(r_4(\lambda)) \cdot (T, T) \, d\lambda
\end{align*}
and hence that
\begin{align*}
& \int_{\xi-\tau}^{\xi+\tau} \epsilon(x,0) \, dx - \int_{\xi-(\tau-T)}^{\xi+(\tau-T)} \epsilon(x,T) \, dx \\
& = \int_0^1 (-\epsilon, j)(r_2(\lambda)) \cdot (-T, T) \, d\lambda - \int_0^1 (-\epsilon, j)(r_4(\lambda)) \cdot (T, T) \, d\lambda \\
& \geq \int_0^1 (\epsilon, -|j|)(r_2(\lambda)) \cdot (T, T) \, d\lambda + \int_0^1 (\epsilon, -|j|)(r_4(\lambda)) \cdot (T, T) \, d\lambda \geq 0
\end{align*}
where in the last step (4.6.12) has been used. Hence (4.6.13) follows.

As an application of the energy inequality (4.6.13), we assume that $v : U \to \mathbb{R}$ is another solution of (4.6.10) such that
\[ u(x,0) = v(x,0) , \quad \frac{\partial u}{\partial t}(x,0) = \frac{\partial v}{\partial t}(x,0) \]
for all $x \in [\xi-\tau, \xi+\tau]$. Then $u - v$ is a solution of (4.6.10) such that
\[ (u-v)(x,0) = 0 , \quad \frac{\partial (u-v)}{\partial t}(x,0) = 0 \]
for all $x \in [\xi-\tau, \xi+\tau]$, and hence the corresponding energy density vanishes at time $0$ on $[\xi-\tau, \xi+\tau]$. As a consequence of (4.6.13) and the positivity of the energy density, it follows that the same is true at time $T$ on $[\xi-(\tau-T), \xi+(\tau-T)]$. In particular, the partial time derivative of $u - v$ vanishes there. Since this is true for every $T \in [0,\tau)$ and since $u - v$ is continuous and vanishes at time $0$ on $[\xi-\tau, \xi+\tau]$, it follows that $u$ and $v$ coincide in every point from the closed area that is enclosed by the triangle with corners $(\xi-\tau,0)$, $(\xi+\tau,0)$ and $(\xi,\tau)$. Note that this triangle is isosceles with a right angle and $\pi/4$ radian angles at the corners $(\xi-\tau,0)$, $(\xi+\tau,0)$. In addition, note that
\[ (\xi,\tau) = \left( \, [ (\xi-\tau) + (\xi+\tau) ]/2 , \, [ (\xi+\tau) - (\xi-\tau) ]/2 \, \right) . \]
As a consequence, we have the following result.

Theorem 4.6.13.
(Uniqueness of the solutions of a wave equation in one space dimension) Let $U$ be a non-empty open subset of $\mathbb{R}^2$ and $u : U \to \mathbb{R}$, $v : U \to \mathbb{R}$ be of class $C^2$ and such that
\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} + V u = \frac{\partial^2 v}{\partial t^2} - \frac{\partial^2 v}{\partial x^2} + V v = 0 , \]
where $V : U \to \mathbb{R}$ is continuous, assumes only positive values, i.e., $\mathrm{Ran}(V) \subset [0,\infty)$, and satisfies
\[ \frac{\partial V}{\partial t} = 0 . \]
Further, let
\[ u(x,t_0) = v(x,t_0) , \quad \frac{\partial u}{\partial t}(x,t_0) = \frac{\partial v}{\partial t}(x,t_0) \]
for some $t_0 \in \mathbb{R}$ and all $x$ from some closed interval $[a,b]$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Then
\[ u(x,t) = v(x,t) \]
for all $(x,t)$ from the closed area that is bounded by the isosceles right triangle with corners
\[ (a,t_0) , \quad (b,t_0) , \quad ((a+b)/2, t_0 + (b-a)/2) . \]

Proof. For the case $t_0 = 0$, the result was proved in the previous example. If $t_0 \neq 0$, then $U_0 := \{ (x, t - t_0) : (x,t) \in U \}$, $V_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto V(x, t + t_0))$, $u_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto u(x, t + t_0))$, $v_0 := (U_0 \to \mathbb{R}, (x,t) \mapsto v(x, t + t_0))$ satisfy the assumptions of the theorem for the case $t_0 = 0$. Hence it follows that
\[ u(x, t + t_0) = u_0(x,t) = v_0(x,t) = v(x, t + t_0) \]
for all $(x,t)$ from the closed area that is enclosed by the triangle with corners
\[ (a,0) , \quad (b,0) , \quad ((a+b)/2, (b-a)/2) . \]
Hence it follows that
\[ u(x,t) = v(x,t) \]
for all $(x,t)$ from the closed area that is enclosed by the triangle with corners
\[ (a,t_0) , \quad (b,t_0) , \quad ((a+b)/2, t_0 + (b-a)/2) . \]

Subsequently, we derive the theorems of Gauss and Stokes. As mentioned in the introduction, a part of the integrals occurring in these theorems describe flows of vector fields through surfaces. For the definition of such integrals, we need to introduce the notion of parametric surfaces.

Definition 4.6.14. (Parametric surfaces) Let $p \in \mathbb{N}^*$. A $C^p$-parametric surface (in $\mathbb{R}^3$) is a pair $(S,r)$ consisting of a subset $S$ ('the surface') of $\mathbb{R}^3$ and an injective map ('parametrization') $r$ of class $C^p$ from some open subset $U$ of $\mathbb{R}^2$ into $\mathbb{R}^3$ with range $S$. To $(S,r)$ there is an associated normal field given by
\[ n(x,y) := \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \]
for every $(x,y) \in U$.
Hence for every $(x_0,y_0) \in U$ the tangent plane to $S$ in $r(x_0,y_0) \in S$ is given by
\[ n(x_0,y_0) \cdot ((x,y,z) - r(x_0,y_0)) = 0 . \]
As a side remark, such surfaces are examples of $C^p$-manifolds defined in differential geometry.

Example 4.6.15. (Examples of parametric surfaces)

(i) Let $p \in \mathbb{N}^*$ and $f$ be a function of class $C^p$ defined on some non-empty open subset $U$ of $\mathbb{R}^2$. Then $(G(f), r_f)$ is a $C^p$-parametric surface where $r_f(x,y) := (x,y,f(x,y))$ for all $(x,y) \in U$. The corresponding normal field $n$ is given by
\[ n(x,y) = \left( -\frac{\partial f}{\partial x}(x,y), -\frac{\partial f}{\partial y}(x,y), 1 \right) , \]
$(x,y) \in U$, and the tangent plane at $G(f)$ in a point $(x_0,y_0,f(x_0,y_0))$ is given by
\[ z = f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0) \, (x - x_0) + \frac{\partial f}{\partial y}(x_0,y_0) \, (y - y_0) \]
for all $(x,y) \in \mathbb{R}^2$. The last is identical to the definition given in Definition 4.2.9.

(ii) Denote by $S^2$ the sphere of radius $1$ centered at the origin. Then $(S^2 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}), r)$ is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$. Here
\[ r(\theta,\varphi) := (\sin \theta \cos \varphi, \sin \theta \sin \varphi, \cos \theta) \]
for all $\theta \in (0,\pi)$, $\varphi \in (-\pi,\pi)$. The corresponding normal field $n$ is given by
\[ n(\theta,\varphi) = \sin \theta \,.\, r(\theta,\varphi) , \]
$\theta \in (0,\pi)$, $\varphi \in (-\pi,\pi)$.

(iii) Denote by $Z^2$ the circular cylinder of radius $1$ with axis given by the $z$-axis. Then $(Z^2 \setminus ((-\infty,0] \times \{0\} \times \mathbb{R}), r)$ is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$ where
\[ r(\varphi,z) := (\cos \varphi, \sin \varphi, z) \]
for all $\varphi \in (-\pi,\pi)$, $z \in \mathbb{R}$. The corresponding normal field $n$ is given by
\[ n(\varphi,z) = (\cos \varphi, \sin \varphi, 0) , \]
$\varphi \in (-\pi,\pi)$, $z \in \mathbb{R}$.

Fig. 201: Torus corresponding to $r = 1$ and $R = 2.5$.

(iv) Denote by $T^2$ the torus obtained by rotating around the $z$-axis the circle of radius $r > 0$ in the $y,z$-plane centered at the point $(0,R,0)$ where $R > r$. Then
\[ \left( T^2 \setminus \left[ ((-\infty,0] \times \{0\} \times \mathbb{R}) \cup (S^1_{R-r}(0) \times \{0\}) \right] , r \right) \]
is a $C^p$-parametric surface for every $p \in \mathbb{N}^*$ where
\[ r(\varphi,\theta) := (\cos \varphi \, (R + r \cos \theta), \sin \varphi \, (R + r \cos \theta), r \sin \theta) \]
for all $\varphi, \theta \in (-\pi,\pi)$. The corresponding normal field $n$ is given by
\[ n(\varphi,\theta) = (R + r \cos \theta).(r \cos \varphi \cos \theta, r \sin \varphi \cos \theta, r \sin \theta) \]
for all $\varphi, \theta \in (-\pi,\pi)$.
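The normal field of the torus in (iv) can be checked against the definition $n = \partial r / \partial \varphi \times \partial r / \partial \theta$ by finite differences. The following Python sketch assumes the sample radii $r = 1$ and $R = 2.5$ of Fig. 201 and compares a central-difference approximation with the closed form $(R + r \cos \theta).(r \cos \varphi \cos \theta, r \sin \varphi \cos \theta, r \sin \theta)$. The function names and the step size are illustrative choices, not part of the text.

```python
import math

R_BIG, R_SMALL = 2.5, 1.0   # sample radii R > r > 0 (as in Fig. 201)

def torus(phi, theta):
    """Parametrization r(phi, theta) of the torus."""
    w = R_BIG + R_SMALL * math.cos(theta)
    return (w * math.cos(phi), w * math.sin(phi), R_SMALL * math.sin(theta))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normal_numeric(phi, theta, h=1e-6):
    """Central-difference approximation of dr/dphi x dr/dtheta."""
    d_phi = [(p - m) / (2 * h)
             for p, m in zip(torus(phi + h, theta), torus(phi - h, theta))]
    d_theta = [(p - m) / (2 * h)
               for p, m in zip(torus(phi, theta + h), torus(phi, theta - h))]
    return cross(d_phi, d_theta)

def normal_formula(phi, theta):
    """Closed form (R + r cos(theta)).(r cos(phi)cos(theta), r sin(phi)cos(theta), r sin(theta))."""
    w = R_BIG + R_SMALL * math.cos(theta)
    return (w * R_SMALL * math.cos(phi) * math.cos(theta),
            w * R_SMALL * math.sin(phi) * math.cos(theta),
            w * R_SMALL * math.sin(theta))

print(normal_numeric(0.7, -1.2))
print(normal_formula(0.7, -1.2))  # the two triples agree up to discretization error
```

The same pattern applies to the sphere in (ii) and the cylinder in (iii) after replacing `torus` and `normal_formula`.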
4.6.2 Stokes' Theorem

Below, we introduce the notion of flux integrals that describe the flow of vector fields through parametrized surfaces. Such integrals appear in Stokes' theorem and also appear as boundary integrals in Gauss' theorem.

Fig. 202: Fluid volume flown through $R$ after time $\Delta t = 7$ sec for $v = (0.5, 0, 0)$ m/sec, $\Delta y = 2$ m, $\Delta z = 1$ m.

Example 4.6.16. (Motivation for the definition of the flux of a vector field across a surface) Consider a constant flow $v = (v_x, v_y, v_z)$ (length / time) of a fluid with constant mass density $\rho$ (mass / volume) across the area $R$ enclosed by a rectangle with sides $\Delta y$, $\Delta z$ in the $y,z$-plane. Imagine $R$ to be part of a closed surface such that the outer normal to $R$ is given by $n := e_x$. Then the change of mass inside the volume after time $\Delta t$ due to the flow across $R$ is given by
\[ \rho \, v_x \, \Delta t \, \Delta y \, \Delta z . \]
Note that it is negative if $v_x < 0$ because of our use of the outer normal. 'Inflow' ($v_x < 0$) is counted negatively, whereas 'outflow' ($v_x > 0$) is counted positively. The rate of change of mass in the volume due to the flow across $R$ is given by
\[ \rho \, v_x \, \Delta y \, \Delta z = \int_R \rho \, v \cdot n \, dydz = \int_R \rho \, v \cdot \left( \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right) dydz \tag{4.6.14} \]
where the parametrization $r(y,z) := (0,y,z)$ for all $(y,z)$ from the projection of $R$ into the $y,z$-plane has been used. Note that in the special case that $\rho = 1$ and
\[ v = n := \left| \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right|^{-1} . \left( \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right) , \]
it follows from (4.6.14) that
\[ \int_R \left| \frac{\partial r}{\partial y} \times \frac{\partial r}{\partial z} \right| dydz \]
coincides with the area of $R$.

Motivated by the previous example, we define the following.

Definition 4.6.17. (Flux of a vector field across a $C^1$-parametric surface) Let $(S,r)$ be a $C^1$-parametric surface and $F : S \to \mathbb{R}^3$ be a continuous vector field on $S$. Finally, let
\[ (F \circ r) \cdot \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right) \]
be Riemann-integrable. Then we define the flux of $F$ across $S$ by
\[ \int_S F \cdot dS := \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dxdy . \]
In particular, we define the area $A$ of $S$ as the flux (if existent) corresponding to the special case that $F$ coincides with the unit normal field induced by $r$. Hence $A$ is defined by
\[ A := \int_{D(r)} \left| \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right| dxdy , \]
if
\[ \left| \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right| \]
is Riemann-integrable.

The following shows that the flux through parametric surfaces $(S,r_1)$, $(S,r_2)$ is the same if the parametrizations $r_1$ and $r_2$ are related by an 'orientation preserving map'. In this sense, the value of the flow integral is determined by the vector field and the set $S$ alone. This fact is important for the use of flow integrals in applications.

Theorem 4.6.18. (Invariance under reparametrization) Let $(S,r)$ be a $C^1$-parametric surface and $F : S \to \mathbb{R}^3$ be a continuous vector field on $S$ such that
\[ (F \circ r) \cdot \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right) \]
is Riemann-integrable. Moreover, let $V$ be an open subset of $\mathbb{R}^2$, $g : V \to D(r)$ be continuously differentiable with a continuously differentiable inverse and such that $\det(g') > 0$. Then $(S, r \circ g)$ is a $C^1$-parametric surface and
\begin{align*}
& \int_V F((r \circ g)(s,t)) \cdot \left( \frac{\partial (r \circ g)}{\partial s}(s,t) \times \frac{\partial (r \circ g)}{\partial t}(s,t) \right) dsdt \\
& = \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dxdy . \tag{4.6.15}
\end{align*}

Proof. By the chain rule for partial derivatives, Corollary 4.2.25, it follows that
\begin{align*}
\frac{\partial (r \circ g)}{\partial s}(s,t) & = \frac{\partial g_1}{\partial s}(s,t) . \frac{\partial r}{\partial x}(g(s,t)) + \frac{\partial g_2}{\partial s}(s,t) . \frac{\partial r}{\partial y}(g(s,t)) , \\
\frac{\partial (r \circ g)}{\partial t}(s,t) & = \frac{\partial g_1}{\partial t}(s,t) . \frac{\partial r}{\partial x}(g(s,t)) + \frac{\partial g_2}{\partial t}(s,t) . \frac{\partial r}{\partial y}(g(s,t))
\end{align*}
and hence that
\[ \frac{\partial (r \circ g)}{\partial s}(s,t) \times \frac{\partial (r \circ g)}{\partial t}(s,t) = \det(g'(s,t)) . \left( \frac{\partial r}{\partial x} \times \frac{\partial r}{\partial y} \right)(g(s,t)) \]
for all $(s,t) \in V$, where $g_1$, $g_2$ are the component maps of $g$. Hence (4.6.15), and with it the theorem, follows by the change of variable formula, Theorem 4.4.23.

Fig. 203: Sketch of $S$ from Example 4.6.19.

Example 4.6.19. Calculate the flux of the vector field
\[ F(x,y,z) := (z, x, 1) , \]
$(x,y,z) \in \mathbb{R}^3$, across the surface
\[ S := \{ (x,y,z) \in \mathbb{R}^3 : z \geq 0 , \; x^2 + y^2 + z = 1 \} . \]
Solution: Define
\[ r(x,y) := (x, y, 1 - x^2 - y^2) \]
for all $(x,y) \in \mathbb{R}^2$ such that $x^2 + y^2 \leq 1$.
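Then $(S,r)$ is a $C^2$-parametric surface and the corresponding flux across $S$ is given by
\begin{align*}
\int_S F \cdot dS & = \int_{D(r)} F(r(x,y)) \cdot \left( \frac{\partial r}{\partial x}(x,y) \times \frac{\partial r}{\partial y}(x,y) \right) dxdy \\
& = \int_{D(r)} (1 - x^2 - y^2, x, 1) \cdot (2x, 2y, 1) \, dxdy = \int_{D(r)} \left[ 2x (1 - x^2 - y^2) + 2xy + 1 \right] dxdy \\
& = \pi + \int_0^1 \int_{-\pi}^{\pi} \left[ 2 r^2 (1 - r^2) \cos(\varphi) + r^3 \sin(2\varphi) \right] d\varphi \, dr = \pi .
\end{align*}

Example 4.6.20. Let $f$ be a function of class $C^1$ defined on a non-empty bounded open subset $U$ of $\mathbb{R}^2$. Then $(G(f), r_f)$ is a $C^1$-parametric surface where $r_f(x,y) := (x,y,f(x,y))$ for all $(x,y) \in U$. The corresponding normal field $n$ is given by
\[ n(x,y) = \left( -\frac{\partial f}{\partial x}(x,y), -\frac{\partial f}{\partial y}(x,y), 1 \right) , \]
$(x,y) \in U$. If $|n| : U \to \mathbb{R}$ is Riemann-integrable, the surface area $A$ of $G(f)$ is given by
\[ A = \int_U \sqrt{ 1 + |(\nabla f)(x,y)|^2 } \, dxdy . \]

Example 4.6.21. (Area of a surface of revolution) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a,b] \to [0,\infty)$ be a continuous function with a finite set $N_f$ of zeros which is the restriction of a continuously differentiable function defined on an open interval of $\mathbb{R}$ containing $[a,b]$ and
\[ S := \left\{ (x,y,z) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(z) \wedge z \in [a,b] \right\} . \]
Note that $S$ is rotationally symmetric around the $z$-axis and can be thought of as obtained from a curve in the $x,z$-plane that is rotated around the $z$-axis. An injective parametrization of class $C^1$ of $S \setminus N$, where
\begin{align*}
N := {} & \left\{ (x,0,z) \in \mathbb{R}^3 : x = -f(z) \wedge z \in [a,b] \right\} \cup \{ (x,y,a) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(a) \} \\
& \cup \{ (x,y,b) \in \mathbb{R}^3 : (x^2 + y^2)^{1/2} = f(b) \} \cup \{ (0,0,z) \in \mathbb{R}^3 : z \in N_f \} ,
\end{align*}
is given by $r : (-\pi,\pi) \times ((a,b) \setminus N_f) \to \mathbb{R}^3$ defined by
\[ r(\varphi,z) := (f(z) \cos \varphi, f(z) \sin \varphi, z) \]
for all $(\varphi,z) \in (-\pi,\pi) \times ((a,b) \setminus N_f)$. In particular,
\begin{align*}
\frac{\partial r}{\partial \varphi}(\varphi,z) & = (-f(z) \sin \varphi, f(z) \cos \varphi, 0) , \\
\frac{\partial r}{\partial z}(\varphi,z) & = (f'(z) \cos \varphi, f'(z) \sin \varphi, 1) , \\
\frac{\partial r}{\partial \varphi}(\varphi,z) \times \frac{\partial r}{\partial z}(\varphi,z) & = (f(z) \cos \varphi, f(z) \sin \varphi, -f(z) f'(z))
\end{align*}
and hence
\[ \left| \frac{\partial r}{\partial \varphi}(\varphi,z) \times \frac{\partial r}{\partial z}(\varphi,z) \right| = f(z) \left[ 1 + (f'(z))^2 \right]^{1/2} \]
for all $(\varphi,z) \in (-\pi,\pi) \times ((a,b) \setminus N_f)$. Since
\[ \left| \frac{\partial r}{\partial \varphi} \times \frac{\partial r}{\partial z} \right| \]
is Riemann-integrable over $(-\pi,\pi) \times ((a,b) \setminus N_f)$, we define the area of $S$ by
\[ A := \int_{D(r)} \left| \frac{\partial r}{\partial \varphi}(\varphi,z) \times \frac{\partial r}{\partial z}(\varphi,z) \right| d\varphi \, dz = 2\pi \int_a^b f(z) \left[ 1 + (f'(z))^2 \right]^{1/2} dz . \tag{4.6.16} \]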
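Formula (4.6.16) is easy to evaluate numerically. The following Python sketch uses a composite midpoint rule, which never evaluates the integrand at the endpoints $a$ and $b$. The function name `revolution_area`, the number of subintervals and the two test profiles (a cone $f(z) = z$ on $[0,1]$ and the half-circle $f(z) = \sqrt{1 - z^2}$ on $[-1,1]$) are assumptions made for illustration and are not taken from the text.

```python
import math

def revolution_area(f, df, a, b, n=20000):
    """Approximate 2*pi * integral_a^b f(z) * sqrt(1 + f'(z)^2) dz by the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        z = a + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += f(z) * math.sqrt(1.0 + df(z) ** 2)
    return 2.0 * math.pi * h * total

# Cone f(z) = z on [0, 1]: lateral area pi * radius * slant height = sqrt(2) * pi.
cone = revolution_area(lambda z: z, lambda z: 1.0, 0.0, 1.0)

# Half-circle profile of the unit sphere: the integrand is identically 1.
sphere = revolution_area(lambda z: math.sqrt(1.0 - z * z),
                         lambda z: -z / math.sqrt(1.0 - z * z), -1.0, 1.0)

print(cone, math.sqrt(2.0) * math.pi)
print(sphere, 4.0 * math.pi)
```

For the half-circle profile the derivative is unbounded near the endpoints, but the product $f(z) [1 + (f'(z))^2]^{1/2}$ stays bounded, which is why the midpoint rule still works here.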
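(4.6.16)

In cases where $f$ is only almost everywhere differentiable on $(a,b)$ and such that the last integral exists as an improper Riemann integral, we use the last formula to define the area of a surface of revolution. This will be relevant in the following two examples.

Example 4.6.22. Calculate the surface area $A_S$ of a sphere of radius $r > 0$ and the lateral surface area $A_C$ of a circular cylinder of radius $r$ and height $h > 0$.

Solution: With $a = -r$, $b = r$,
\[ f(z) := \sqrt{r^2 - z^2} \]
for every $z \in [-r, r]$, it follows from (4.6.16) that
\[ A_S = 2\pi \int_{-r}^{r} \sqrt{r^2 - z^2} \left[ 1 + \left( - \frac{z}{\sqrt{r^2 - z^2}} \right)^2 \right]^{1/2} dz = 2\pi \int_{-r}^{r} r \, dz = 4 \pi r^2 . \]
Finally, with $a = 0$, $b = h$, $f(z) := r$ for every $z \in [0, h]$, it follows from (4.6.16) that
\[ A_C = 2\pi \int_0^h r \, dz = 2 \pi r h . \]

Example 4.6.23. Calculate the surface area $A_E$ of the rotational ellipsoid
\[ E := \left\{ (x,y,z) \in \mathbb{R}^3 : \frac{x^2}{R^2} + \frac{y^2}{R^2} + \frac{z^2}{r^2} = 1 \right\} \]
where $r > 0$, $R > 0$ are such that $r \neq R$.

Solution: With $a = -r$, $b = r$,
\[ f(z) := \frac{R}{r} \sqrt{r^2 - z^2} \]
for every $z \in [-r, r]$, it follows from (4.6.16) that
\[ A_E = 2\pi \int_{-r}^{r} \frac{R}{r} \sqrt{r^2 - z^2} \left[ 1 + \frac{R^2}{r^2} \, \frac{z^2}{r^2 - z^2} \right]^{1/2} dz = 2 \pi R \int_{-r}^{r} \sqrt{1 - \frac{r^2 - R^2}{r^4} \, z^2} \; dz = 4 \pi R \int_0^r \sqrt{1 - \frac{r^2 - R^2}{r^4} \, z^2} \; dz . \]
Hence in the case that $r > R$ and by defining
\[ \varepsilon := \frac{1}{r^2} \sqrt{r^2 - R^2} , \]
we arrive at
\[ A_E = 4 \pi R \int_0^r \sqrt{1 - \varepsilon^2 z^2} \; dz = \frac{2 \pi R}{\varepsilon} \left[ \varepsilon z \sqrt{1 - \varepsilon^2 z^2} + \arcsin(\varepsilon z) \right]_0^r = \frac{2 \pi R}{\varepsilon} \left[ \varepsilon r \sqrt{1 - \varepsilon^2 r^2} + \arcsin(\varepsilon r) \right] \]
\[ = \frac{2 \pi r^2 R}{\sqrt{r^2 - R^2}} \left[ \frac{1}{r} \sqrt{r^2 - R^2} \; \frac{R}{r} + \arcsin\left( \frac{1}{r} \sqrt{r^2 - R^2} \right) \right] = 2 \pi r R \left[ \frac{R}{r} + \frac{1}{\sqrt{1 - \frac{R^2}{r^2}}} \arcsin\left( \sqrt{1 - \frac{R^2}{r^2}} \right) \right] . \]
In the case that $r < R$ and by defining
\[ \varepsilon := \frac{1}{r^2} \sqrt{R^2 - r^2} , \]
we arrive at
\[ A_E = 4 \pi R \int_0^r \sqrt{1 + \varepsilon^2 z^2} \; dz = \frac{2 \pi R}{\varepsilon} \left[ \varepsilon z \sqrt{1 + \varepsilon^2 z^2} + \operatorname{arsinh}(\varepsilon z) \right]_0^r = \frac{2 \pi R}{\varepsilon} \left[ \varepsilon r \sqrt{1 + \varepsilon^2 r^2} + \operatorname{arsinh}(\varepsilon r) \right] \]
\[ = \frac{2 \pi r^2 R}{\sqrt{R^2 - r^2}} \left[ \frac{1}{r} \sqrt{R^2 - r^2} \; \frac{R}{r} + \operatorname{arsinh}\left( \frac{1}{r} \sqrt{R^2 - r^2} \right) \right] = 2 \pi r R \left[ \frac{R}{r} + \frac{1}{\sqrt{\frac{R^2}{r^2} - 1}} \operatorname{arsinh}\left( \sqrt{\frac{R^2}{r^2} - 1} \right) \right] \]
where arsinh denotes the inverse function to sinh. For its existence, note that
\[ \sinh'(x) = \cosh(x) \geq 1 \]
for every $x \in \mathbb{R}$. Hence
\[ \sinh(x) = \int_0^x \cosh(y) \, dy \geq x \]
for $x \geq 0$ and
\[ \sinh(x) = - \int_x^0 \sinh'(y) \, dy \leq x \]
for $x \leq 0$. Hence it follows by Theorem 2.5.18 that sinh is bijective as well as that
\[ \operatorname{arsinh}'(x) = \frac{1}{\sinh'(\operatorname{arsinh}(x))} = \frac{1}{\cosh(\operatorname{arsinh}(x))} = \frac{1}{\sqrt{1 + \sinh^2(\operatorname{arsinh}(x))}} = \frac{1}{\sqrt{1 + x^2}} \]
for every $x \in \mathbb{R}$.

We continue with Stokes’ theorem which relates the flow of the curl of a vector field through a parametric surface to the path integral of that field along the boundary of the surface. The following version of Stokes’ theorem is a consequence of Green’s theorem and transformation properties of the curl of a vector field under certain differentiable maps.

Theorem 4.6.24. (Stokes’ theorem) Let $(S, r)$ be a $C^2$-parametric surface and $F$ be a vector field of class $C^1$ defined on an open set containing $S$.$^1$ In particular, let $r$ be such that

(i) Its domain $\Omega$ is a non-empty bounded open subset of $\mathbb{R}^2$ for which Green’s theorem is valid, i.e., there is a piecewise $C^1$-path $\alpha : I \to \mathbb{R}^2$ defined on some non-empty closed interval $I$ of $\mathbb{R}$ and traversing the boundary of $\Omega$ such that Green’s identity
\[ \int_\Omega \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) dx\,dy = \int_\alpha f \cdot d\alpha \quad (4.6.17) \]
is valid for every continuously differentiable $f = (f_1, f_2) : V \to \mathbb{R}^2$ defined on some open subset $V$ of $\mathbb{R}^2$ containing $\Omega$ and its boundary.

(ii) $r$ is the restriction to $\Omega$ of a map $\hat{r}$ of class $C^2$ defined on an open subset containing $\Omega$ and its boundary.

Then
\[ \int_S \operatorname{curl} F \cdot dS = \int_\gamma F \cdot d\gamma \]
where $\gamma$ is any piecewise $C^1$-parametrization of $\partial S := \operatorname{Ran}(\hat{r} \circ \alpha)$ which is of the same orientation as $\hat{r} \circ \alpha$.

$^1$ Note that we do not make any further assumptions on the normal field.

[Fig. 204: Illustration for the proof of Stokes’ theorem, Theorem 4.6.24.]

Proof. In the following, for simplicity of notation, we denote $\hat{r}$ by the symbol $\rho$. First, since $\rho \circ \alpha$ is a piecewise $C^1$-path, it follows by (4.6.17) that
\[ \int_{\rho \circ \alpha} F \cdot d(\rho \circ \alpha) = \sum_{i=1}^{3} \int_\alpha \left[ (F_i \circ \rho) \nabla \rho_i \right] \cdot d\alpha = \sum_{i=1}^{3} \int_\Omega \left\{ \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right] \right\} dx\,dy . \]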
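Further, it follows for every $i \in \{1, 2, 3\}$ that
\[ \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right] = \frac{\partial (F_i \circ \rho)}{\partial x} \frac{\partial \rho_i}{\partial y} - \frac{\partial (F_i \circ \rho)}{\partial y} \frac{\partial \rho_i}{\partial x} \]
\[ = \left[ \left( \frac{\partial F_i}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_i}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} + \left( \frac{\partial F_i}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_i}{\partial y} - \left[ \left( \frac{\partial F_i}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_i}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} + \left( \frac{\partial F_i}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_i}{\partial x} \]
and by using
\[ \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} = \left( \frac{\partial \rho_2}{\partial x} \frac{\partial \rho_3}{\partial y} - \frac{\partial \rho_3}{\partial x} \frac{\partial \rho_2}{\partial y} , \; \frac{\partial \rho_3}{\partial x} \frac{\partial \rho_1}{\partial y} - \frac{\partial \rho_1}{\partial x} \frac{\partial \rho_3}{\partial y} , \; \frac{\partial \rho_1}{\partial x} \frac{\partial \rho_2}{\partial y} - \frac{\partial \rho_2}{\partial x} \frac{\partial \rho_1}{\partial y} \right) \]
that
\[ \frac{\partial}{\partial x} \left[ (F_1 \circ \rho) \frac{\partial \rho_1}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_1 \circ \rho) \frac{\partial \rho_1}{\partial x} \right] = \left[ \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} + \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_1}{\partial y} - \left[ \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} + \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_1}{\partial x} \]
\[ = - \left( \frac{\partial F_1}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3 + \left( \frac{\partial F_1}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 , \]
\[ \frac{\partial}{\partial x} \left[ (F_2 \circ \rho) \frac{\partial \rho_2}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_2 \circ \rho) \frac{\partial \rho_2}{\partial x} \right] = \left[ \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial x} \right] \frac{\partial \rho_2}{\partial y} - \left[ \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \frac{\partial \rho_3}{\partial y} \right] \frac{\partial \rho_2}{\partial x} \]
\[ = \left( \frac{\partial F_2}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_3 - \left( \frac{\partial F_2}{\partial x_3} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1 , \]
\[ \frac{\partial}{\partial x} \left[ (F_3 \circ \rho) \frac{\partial \rho_3}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_3 \circ \rho) \frac{\partial \rho_3}{\partial x} \right] = \left[ \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial x} + \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial x} \right] \frac{\partial \rho_3}{\partial y} - \left[ \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \frac{\partial \rho_1}{\partial y} + \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \frac{\partial \rho_2}{\partial y} \right] \frac{\partial \rho_3}{\partial x} \]
\[ = - \left( \frac{\partial F_3}{\partial x_1} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_2 + \left( \frac{\partial F_3}{\partial x_2} \circ \rho \right) \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right)_1 . \]
Hence by using that
\[ \operatorname{curl} F = \left( \frac{\partial F_3}{\partial x_2} - \frac{\partial F_2}{\partial x_3} , \; \frac{\partial F_1}{\partial x_3} - \frac{\partial F_3}{\partial x_1} , \; \frac{\partial F_2}{\partial x_1} - \frac{\partial F_1}{\partial x_2} \right) , \]
we arrive at
\[ \sum_{i=1}^{3} \left\{ \frac{\partial}{\partial x} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial y} \right] - \frac{\partial}{\partial y} \left[ (F_i \circ \rho) \frac{\partial \rho_i}{\partial x} \right] \right\} = \left[ (\operatorname{curl} F) \circ \rho \right] \cdot \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right) . \]
Hence, finally, it follows that
\[ \int_{\rho \circ \alpha} F \cdot d(\rho \circ \alpha) = \int_\Omega \left[ (\operatorname{curl} F) \circ \rho \right] \cdot \left( \frac{\partial \rho}{\partial x} \times \frac{\partial \rho}{\partial y} \right) dx\,dy = \int_S \operatorname{curl} F \cdot dS . \]

Example 4.6.25. Let $S$ and $F$ be as in Example 4.6.19. A simple calculation gives that
\[ F(x,y,z) = (\operatorname{curl} A)(x,y,z) \]
for all $x, y, z \in \mathbb{R}$ where
\[ A(x,y,z) := \left( -y , \; - \frac{z^2}{2} , \; - \frac{x^2}{2} \right) \]
for all $x, y, z \in \mathbb{R}$. Hence it follows by Theorem 4.6.24 that
\[ \int_S F \cdot dS = \int_{\partial S} A \cdot dr . \]
The boundary $\partial S$ is given by the circle of radius 1 in the $x,y$-plane centered at the origin. Note that $r(\partial S) = \iota(\partial S)$ where $\iota$ is the inclusion of $\mathbb{R}^2$ into $\mathbb{R}^3$ given by $\iota(x,y) := (x, y, 0)$ for every $(x,y) \in \mathbb{R}^2$.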
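Therefore, a $C^1$-parametrization of $\partial S$ satisfying the assumptions of Theorem 4.6.24 is given by
\[ r(t) := (\cos t, \sin t, 0) \]
for all $t \in [-\pi, \pi]$. Hence
\[ \int_S F \cdot dS = \int_{-\pi}^{\pi} \left( - \sin(t), \, 0, \, - \cos^2(t)/2 \right) \cdot \left( - \sin(t), \, \cos(t), \, 0 \right) dt = \int_{-\pi}^{\pi} \sin^2(t) \, dt = \frac{1}{2} \int_{-\pi}^{\pi} \left( 1 - \cos(2t) \right) dt = \pi - \frac{1}{4} \Big[ \sin(2t) \Big]_{-\pi}^{\pi} = \pi \]
which is identical to the result of Example 4.6.19.

4.6.3 Gauss’ Theorem

The final theorem of this course is Gauss’ theorem for images of cuboids. It relates the volume integral of a certain derivative of a vector field, the so called ‘divergence’ of the field, to the flow of the vector field through the boundary of the volume. Its proof is similar to that of Green’s theorem for images of rectangles, i.e., it is based on a technical lemma referring to transformation properties of the divergence of a vector field. That lemma is given next. Its proof consists in a straightforward calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18.

Lemma 4.6.26. Let $V$ be a non-empty open subset of $\mathbb{R}^3$ and $F = (F_1, F_2, F_3) : V \to \mathbb{R}^3$ be differentiable. Further, let $g = (g_1, g_2, g_3) : D(g) \to \mathbb{R}^3$ be defined and of class $C^2$ on a non-empty open subset $D(g)$ of $\mathbb{R}^3$ and such that $g(D(g)) \subset V$. Then
\[ \frac{\partial}{\partial x} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right) \right] + \frac{\partial}{\partial y} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right) \right] + \frac{\partial}{\partial z} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right) \right] = \left[ (\operatorname{div} F) \circ g \right] \det(g') \]
where
\[ \operatorname{div} F := \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} . \]

Proof. The proof proceeds by a simple calculation using the chain rule in the form of Corollary 4.2.25 and Schwarz’s theorem 4.2.18. In the following, we indicate partial derivatives in the coordinate directions $x, y, z$ by the index ${}_{,x}$, ${}_{,y}$ and ${}_{,z}$, respectively. In a first step, we prove that
\[ (g_{,y} \times g_{,z})_{,x} + (g_{,z} \times g_{,x})_{,y} + (g_{,x} \times g_{,y})_{,z} = 0 . \]
For this, we note that
\[ g_{,y} \times g_{,z} = (g_{2,y} g_{3,z} - g_{3,y} g_{2,z} , \; g_{3,y} g_{1,z} - g_{1,y} g_{3,z} , \; g_{1,y} g_{2,z} - g_{2,y} g_{1,z}) , \]
\[ g_{,z} \times g_{,x} = (g_{2,z} g_{3,x} - g_{3,z} g_{2,x} , \; g_{3,z} g_{1,x} - g_{1,z} g_{3,x} , \; g_{1,z} g_{2,x} - g_{2,z} g_{1,x}) , \]
\[ g_{,x} \times g_{,y} = (g_{2,x} g_{3,y} - g_{3,x} g_{2,y} , \; g_{3,x} g_{1,y} - g_{1,x} g_{3,y} , \; g_{1,x} g_{2,y} - g_{2,x} g_{1,y}) . \]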
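Hence
\[ (g_{,y} \times g_{,z})_{1,x} + (g_{,z} \times g_{,x})_{1,y} + (g_{,x} \times g_{,y})_{1,z} = (g_{2,y} g_{3,z} - g_{3,y} g_{2,z})_{,x} + (g_{2,z} g_{3,x} - g_{3,z} g_{2,x})_{,y} + (g_{2,x} g_{3,y} - g_{3,x} g_{2,y})_{,z} \]
\[ = g_{2,yx} g_{3,z} - g_{3,yx} g_{2,z} + g_{2,y} g_{3,zx} - g_{3,y} g_{2,zx} + g_{2,zy} g_{3,x} - g_{3,zy} g_{2,x} + g_{2,z} g_{3,xy} - g_{3,z} g_{2,xy} + g_{2,xz} g_{3,y} - g_{3,xz} g_{2,y} + g_{2,x} g_{3,yz} - g_{3,x} g_{2,yz} = 0 , \]
\[ (g_{,y} \times g_{,z})_{2,x} + (g_{,z} \times g_{,x})_{2,y} + (g_{,x} \times g_{,y})_{2,z} = (g_{3,y} g_{1,z} - g_{1,y} g_{3,z})_{,x} + (g_{3,z} g_{1,x} - g_{1,z} g_{3,x})_{,y} + (g_{3,x} g_{1,y} - g_{1,x} g_{3,y})_{,z} \]
\[ = g_{3,yx} g_{1,z} - g_{1,yx} g_{3,z} + g_{3,y} g_{1,zx} - g_{1,y} g_{3,zx} + g_{3,zy} g_{1,x} - g_{1,zy} g_{3,x} + g_{3,z} g_{1,xy} - g_{1,z} g_{3,xy} + g_{3,xz} g_{1,y} - g_{1,xz} g_{3,y} + g_{3,x} g_{1,yz} - g_{1,x} g_{3,yz} = 0 , \]
\[ (g_{,y} \times g_{,z})_{3,x} + (g_{,z} \times g_{,x})_{3,y} + (g_{,x} \times g_{,y})_{3,z} = (g_{1,y} g_{2,z} - g_{2,y} g_{1,z})_{,x} + (g_{1,z} g_{2,x} - g_{2,z} g_{1,x})_{,y} + (g_{1,x} g_{2,y} - g_{2,x} g_{1,y})_{,z} \]
\[ = g_{1,yx} g_{2,z} - g_{2,yx} g_{1,z} + g_{1,y} g_{2,zx} - g_{2,y} g_{1,zx} + g_{1,zy} g_{2,x} - g_{2,zy} g_{1,x} + g_{1,z} g_{2,xy} - g_{2,z} g_{1,xy} + g_{1,xz} g_{2,y} - g_{2,xz} g_{1,y} + g_{1,x} g_{2,yz} - g_{2,x} g_{1,yz} = 0 . \]
Since according to the chain rule, Corollary 4.2.25,
\[ (F \circ g)_{,x} = g_{1,x} \, (F_{,x} \circ g) + g_{2,x} \, (F_{,y} \circ g) + g_{3,x} \, (F_{,z} \circ g) , \]
\[ (F \circ g)_{,y} = g_{1,y} \, (F_{,x} \circ g) + g_{2,y} \, (F_{,y} \circ g) + g_{3,y} \, (F_{,z} \circ g) , \]
\[ (F \circ g)_{,z} = g_{1,z} \, (F_{,x} \circ g) + g_{2,z} \, (F_{,y} \circ g) + g_{3,z} \, (F_{,z} \circ g) , \]
it follows in a second step that
\[ [(F \circ g) \cdot (g_{,y} \times g_{,z})]_{,x} + [(F \circ g) \cdot (g_{,z} \times g_{,x})]_{,y} + [(F \circ g) \cdot (g_{,x} \times g_{,y})]_{,z} \]
\[ = (F \circ g)_{,x} \cdot (g_{,y} \times g_{,z}) + (F \circ g)_{,y} \cdot (g_{,z} \times g_{,x}) + (F \circ g)_{,z} \cdot (g_{,x} \times g_{,y}) \]
\[ = [g_{1,x} (F_{,x} \circ g) + g_{2,x} (F_{,y} \circ g) + g_{3,x} (F_{,z} \circ g)] \cdot (g_{,y} \times g_{,z}) + [g_{1,y} (F_{,x} \circ g) + g_{2,y} (F_{,y} \circ g) + g_{3,y} (F_{,z} \circ g)] \cdot (g_{,z} \times g_{,x}) + [g_{1,z} (F_{,x} \circ g) + g_{2,z} (F_{,y} \circ g) + g_{3,z} (F_{,z} \circ g)] \cdot (g_{,x} \times g_{,y}) \]
\[ = (F_{,x} \circ g) \cdot [g_{1,x} (g_{,y} \times g_{,z}) + g_{1,y} (g_{,z} \times g_{,x}) + g_{1,z} (g_{,x} \times g_{,y})] + (F_{,y} \circ g) \cdot [g_{2,x} (g_{,y} \times g_{,z}) + g_{2,y} (g_{,z} \times g_{,x}) + g_{2,z} (g_{,x} \times g_{,y})] + (F_{,z} \circ g) \cdot [g_{3,x} (g_{,y} \times g_{,z}) + g_{3,y} (g_{,z} \times g_{,x}) + g_{3,z} (g_{,x} \times g_{,y})] . \]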
Further, it follows that
\[ g_{1,x} (g_{,y} \times g_{,z}) + g_{1,y} (g_{,z} \times g_{,x}) + g_{1,z} (g_{,x} \times g_{,y}) \]
\[ = g_{1,x} (g_{2,y} g_{3,z} - g_{3,y} g_{2,z} , \; g_{3,y} g_{1,z} - g_{1,y} g_{3,z} , \; g_{1,y} g_{2,z} - g_{2,y} g_{1,z}) + g_{1,y} (g_{2,z} g_{3,x} - g_{3,z} g_{2,x} , \; g_{3,z} g_{1,x} - g_{1,z} g_{3,x} , \; g_{1,z} g_{2,x} - g_{2,z} g_{1,x}) + g_{1,z} (g_{2,x} g_{3,y} - g_{3,x} g_{2,y} , \; g_{3,x} g_{1,y} - g_{1,x} g_{3,y} , \; g_{1,x} g_{2,y} - g_{2,x} g_{1,y}) \]
\[ = (\det(g'), 0, 0) , \]
\[ g_{2,x} (g_{,y} \times g_{,z}) + g_{2,y} (g_{,z} \times g_{,x}) + g_{2,z} (g_{,x} \times g_{,y}) \]
\[ = g_{2,x} (g_{2,y} g_{3,z} - g_{3,y} g_{2,z} , \; g_{3,y} g_{1,z} - g_{1,y} g_{3,z} , \; g_{1,y} g_{2,z} - g_{2,y} g_{1,z}) + g_{2,y} (g_{2,z} g_{3,x} - g_{3,z} g_{2,x} , \; g_{3,z} g_{1,x} - g_{1,z} g_{3,x} , \; g_{1,z} g_{2,x} - g_{2,z} g_{1,x}) + g_{2,z} (g_{2,x} g_{3,y} - g_{3,x} g_{2,y} , \; g_{3,x} g_{1,y} - g_{1,x} g_{3,y} , \; g_{1,x} g_{2,y} - g_{2,x} g_{1,y}) \]
\[ = (0, \det(g'), 0) , \]
\[ g_{3,x} (g_{,y} \times g_{,z}) + g_{3,y} (g_{,z} \times g_{,x}) + g_{3,z} (g_{,x} \times g_{,y}) \]
\[ = g_{3,x} (g_{2,y} g_{3,z} - g_{3,y} g_{2,z} , \; g_{3,y} g_{1,z} - g_{1,y} g_{3,z} , \; g_{1,y} g_{2,z} - g_{2,y} g_{1,z}) + g_{3,y} (g_{2,z} g_{3,x} - g_{3,z} g_{2,x} , \; g_{3,z} g_{1,x} - g_{1,z} g_{3,x} , \; g_{1,z} g_{2,x} - g_{2,z} g_{1,x}) + g_{3,z} (g_{2,x} g_{3,y} - g_{3,x} g_{2,y} , \; g_{3,x} g_{1,y} - g_{1,x} g_{3,y} , \; g_{1,x} g_{2,y} - g_{2,x} g_{1,y}) \]
\[ = (0, 0, \det(g')) \]
and hence, finally, that
\[ [(F \circ g) \cdot (g_{,y} \times g_{,z})]_{,x} + [(F \circ g) \cdot (g_{,z} \times g_{,x})]_{,y} + [(F \circ g) \cdot (g_{,x} \times g_{,y})]_{,z} = [\operatorname{div}(F) \circ g] \det(g') . \]

Gauss’ theorem for images of cuboids under certain differentiable maps is a consequence of the previous lemma, Lemma 4.6.26, and change of variables, Theorem 4.4.23.

[Fig. 205: $g(I)$ and some outer normal vectors. Illustration for the proof of Gauss’ theorem, Theorem 4.6.27.]

Theorem 4.6.27. (Gauss’ theorem for images of cuboids) Let $a_1, b_1, a_2, b_2, a_3, b_3 \in \mathbb{R}$ be such that $a_i < b_i$ for $i = 1, 2, 3$ and $I := [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3]$, $I_0 := (a_1, b_1) \times (a_2, b_2) \times (a_3, b_3)$. Further, let $U \supset I$ be an open subset of $\mathbb{R}^3$, $g : U \to \mathbb{R}^3$ be twice continuously differentiable such that the induced map from $U$ to $g(U)$ is bijective with a continuously differentiable inverse and such that $\det(g') > 0$. Finally, let $V \supset g(I)$ be an open subset of $\mathbb{R}^3$ and $F = (F_1, F_2, F_3) : V \to \mathbb{R}^3$ be continuously differentiable.
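Then
\[ \int_{g(I_0)} \operatorname{div} F \, dx\,dy\,dz = \int_{S_{a_1}} F \cdot dS + \int_{S_{b_1}} F \cdot dS + \int_{S_{a_2}} F \cdot dS + \int_{S_{b_2}} F \cdot dS + \int_{S_{a_3}} F \cdot dS + \int_{S_{b_3}} F \cdot dS \quad (4.6.18) \]
where $(S_{a_1}, r_{a_1})$, $(S_{b_1}, r_{b_1})$, $(S_{a_2}, r_{a_2})$, $(S_{b_2}, r_{b_2})$, $(S_{a_3}, r_{a_3})$, $(S_{b_3}, r_{b_3})$ are $C^2$-parametric surfaces defined by
\[ r_{a_1}(y,z) := g(a_1, z, y) , \quad r_{b_1}(y,z) := g(b_1, y, z) , \quad r_{a_2}(x,z) := g(x, a_2, z) , \]
\[ r_{b_2}(x,z) := g(z, b_2, x) , \quad r_{a_3}(x,y) := g(y, x, a_3) , \quad r_{b_3}(x,y) := g(x, y, b_3) \]
for all $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$ and
\[ S_{a_1} := \operatorname{Ran}(r_{a_1}) , \; S_{b_1} := \operatorname{Ran}(r_{b_1}) , \; S_{a_2} := \operatorname{Ran}(r_{a_2}) , \; S_{b_2} := \operatorname{Ran}(r_{b_2}) , \; S_{a_3} := \operatorname{Ran}(r_{a_3}) , \; S_{b_3} := \operatorname{Ran}(r_{b_3}) . \]

Proof. In a first step, we consider the set $g(I_0)$. Since $g$ is twice continuously differentiable with a continuously differentiable inverse, $g(I_0)$ is a bounded open subset of $\mathbb{R}^3$. Further, the restriction of $\operatorname{div} F$ to $g(I_0)$ is bounded. In addition, it follows by Theorem 4.4.13 and Theorem 4.4.15 that the extension of $\operatorname{div} F$ to a function, defined on a closed subinterval $J$ of $\mathbb{R}^3$ containing $g(I_0)$ and assuming the value zero in the points of $J \setminus g(I_0)$, is Riemann-integrable. Hence by Theorem 4.4.23, it follows in a second step that
\[ \int_{g(I_0)} \operatorname{div} F \, dx\,dy\,dz = \int_{I_0} \left[ (\operatorname{div} F) \circ g \right] \det(g') \, dx\,dy\,dz \]
and hence by the previous Lemma 4.6.26 that
\[ \int_{g(I_0)} \operatorname{div} F \, dx\,dy\,dz = \int_{I_0} \frac{\partial}{\partial x} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right) \right] dx\,dy\,dz + \int_{I_0} \frac{\partial}{\partial y} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right) \right] dx\,dy\,dz + \int_{I_0} \frac{\partial}{\partial z} \left[ (F \circ g) \cdot \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right) \right] dx\,dy\,dz . \]
Finally, from this follows (4.6.18) by Fubini’s Theorem 4.4.18 and the fundamental theorem of calculus, Theorem 2.6.21.

Remark 4.6.28. Since the region $g(I_0)$ is bounded, there is an outward pointing unit normal for every point on its boundary, apart from the corner points $g(a_1, a_2, a_3)$, $g(a_1, b_2, a_3)$, $g(a_1, a_2, b_3)$, $g(a_1, b_2, b_3)$, $g(b_1, a_2, a_3)$, $g(b_1, b_2, a_3)$, $g(b_1, a_2, b_3)$ and $g(b_1, b_2, b_3)$.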
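Outward pointing vectors are given by
\[ v_{a_1}(a_1, y, z) = - \left( \frac{\partial g_1}{\partial x}(a_1, y, z), \, \frac{\partial g_2}{\partial x}(a_1, y, z), \, \frac{\partial g_3}{\partial x}(a_1, y, z) \right) , \quad v_{b_1}(b_1, y, z) = \left( \frac{\partial g_1}{\partial x}(b_1, y, z), \, \frac{\partial g_2}{\partial x}(b_1, y, z), \, \frac{\partial g_3}{\partial x}(b_1, y, z) \right) , \]
\[ v_{a_2}(x, a_2, z) = - \left( \frac{\partial g_1}{\partial y}(x, a_2, z), \, \frac{\partial g_2}{\partial y}(x, a_2, z), \, \frac{\partial g_3}{\partial y}(x, a_2, z) \right) , \quad v_{b_2}(x, b_2, z) = \left( \frac{\partial g_1}{\partial y}(x, b_2, z), \, \frac{\partial g_2}{\partial y}(x, b_2, z), \, \frac{\partial g_3}{\partial y}(x, b_2, z) \right) , \]
\[ v_{a_3}(x, y, a_3) = - \left( \frac{\partial g_1}{\partial z}(x, y, a_3), \, \frac{\partial g_2}{\partial z}(x, y, a_3), \, \frac{\partial g_3}{\partial z}(x, y, a_3) \right) , \quad v_{b_3}(x, y, b_3) = \left( \frac{\partial g_1}{\partial z}(x, y, b_3), \, \frac{\partial g_2}{\partial z}(x, y, b_3), \, \frac{\partial g_3}{\partial z}(x, y, b_3) \right) , \]
for every $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$. Normal vectors on the boundary corresponding to the parametrizations $r_{a_1}, r_{b_1}, r_{a_2}, r_{b_2}, r_{a_3}, r_{b_3}$ are given by
\[ n_{a_1}(a_1, y, z) := \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial y} \right)(a_1, y, z) , \quad n_{b_1}(b_1, y, z) := \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial z} \right)(b_1, y, z) , \]
\[ n_{a_2}(x, a_2, z) := \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial z} \right)(x, a_2, z) , \quad n_{b_2}(x, b_2, z) := \left( \frac{\partial g}{\partial z} \times \frac{\partial g}{\partial x} \right)(x, b_2, z) , \]
\[ n_{a_3}(x, y, a_3) := \left( \frac{\partial g}{\partial y} \times \frac{\partial g}{\partial x} \right)(x, y, a_3) , \quad n_{b_3}(x, y, b_3) := \left( \frac{\partial g}{\partial x} \times \frac{\partial g}{\partial y} \right)(x, y, b_3) \]
for every $x \in (a_1, b_1)$, $y \in (a_2, b_2)$ and $z \in (a_3, b_3)$. In particular, as a consequence of (3.5.7) and the assumption in the previous Theorem 4.6.27 that $\det(g') > 0$, it follows that the orthogonal projections of the outgoing vectors onto the corresponding normal vectors are everywhere $> 0$. Hence the parametrizations of the boundary surfaces $S_{a_1}, S_{b_1}, S_{a_2}, S_{b_2}, S_{a_3}, S_{b_3}$ in Theorem 4.6.27 are such that the corresponding normal vectors point out of $g(I_0)$ in every point of the boundary, apart from finitely many points, as is indicated in Fig. 205.

Remark 4.6.29. Also Gauss’ theorem can be generalized to a larger class of regions of $\mathbb{R}^3$ that can be dissected into regions that satisfy the requirements of Theorem 4.6.27. The last theorem is then applied to the parts of the dissection. In this, integrations over cuts are performed twice, but with normal fields in opposite directions such that their contributions cancel in the sum. We will not give such generalizations in the following.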
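The cancellation over cuts described in Remark 4.6.29 can be illustrated numerically. The following Python sketch is not part of the development of the text. It assumes the simplest admissible map, $g$ equal to the identity, the unit cube cut by the plane $x = 1/2$, and a test field $F(x,y,z) := (x^2, xy, yz)$ that is chosen here purely for illustration. All function names are hypothetical. Since the integrands are polynomials of low degree, the midpoint rule used below is exact up to rounding.

```python
# Hypothetical test field, chosen only for illustration (not from the text):
# F(x, y, z) = (x^2, x*y, y*z), so that div F = 2x + x + y = 3x + y.
def field(x, y, z):
    return (x * x, x * y, y * z)


def divergence(x, y, z):
    return 3.0 * x + y


def midpoint_2d(func, a, b, c, d, n=40):
    """Midpoint rule for the integral of func(u, v) over [a, b] x [c, d]."""
    hu, hv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        for j in range(n):
            total += func(u, c + (j + 0.5) * hv)
    return total * hu * hv


def face_flux(box, axis, side):
    """Flux of F through one face of an axis-aligned box, outward normal.

    box is ((x0, x1), (y0, y1), (z0, z1)). side 0 is the lower face with
    outward normal -e_axis, side 1 the upper face with outward normal +e_axis.
    """
    sign = 1.0 if side == 1 else -1.0
    first, second = [k for k in range(3) if k != axis]

    def integrand(u, v):
        point = [0.0, 0.0, 0.0]
        point[axis] = box[axis][side]
        point[first], point[second] = u, v
        return sign * field(*point)[axis]

    (a, b), (c, d) = box[first], box[second]
    return midpoint_2d(integrand, a, b, c, d)


def boundary_flux(box):
    """Sum of the outward fluxes through all six faces of the box."""
    return sum(face_flux(box, axis, side)
               for axis in range(3) for side in (0, 1))


def volume_integral(box, n=20):
    """Midpoint rule for the integral of div F over the box."""
    (x0, x1), (y0, y1), (z0, z1) = box
    hz = (z1 - z0) / n
    total = 0.0
    for k in range(n):
        z = z0 + (k + 0.5) * hz
        total += midpoint_2d(lambda x, y: divergence(x, y, z),
                             x0, x1, y0, y1, n)
    return total * hz


cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
left = ((0.0, 0.5), (0.0, 1.0), (0.0, 1.0))
right = ((0.5, 1.0), (0.0, 1.0), (0.0, 1.0))

cut_from_left = face_flux(left, 0, 1)    # normal +e_x on the cut x = 1/2
cut_from_right = face_flux(right, 0, 0)  # normal -e_x on the same cut

print(volume_integral(cube), boundary_flux(cube))
print(boundary_flux(left) + boundary_flux(right))
print(cut_from_left + cut_from_right)
```

Under these assumptions, the volume integral of the divergence over the cube, the flux through the boundary of the cube and the sum of the boundary fluxes of the two halves agree, because the two contributions of the cut have opposite signs and cancel.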
The regions considered in the following examples and problems satisfy the requirements of such more general theorems.

Example 4.6.30. Calculate
\[ \int_S F \cdot dS , \]
using Gauss’ theorem, where
\[ F(x,y,z) := \left( xy , \; y^2 + e^{x z^2} , \; \sin(xy) \right) , \quad x, y, z \in \mathbb{R} , \]
and $S$ is the surface of the region $E$ bounded by the parabolic cylinder $z = 1 - x^2$ and the planes $y = 0$, $z = 0$ and $y + z = 2$.

Solution: Since $\operatorname{div} F = y + 2y + 0 = 3y$, it follows by Gauss’ and Fubini’s theorem that
\[ \int_S F \cdot dS = \int_E 3y \, dx\,dy\,dz = \int_{-1}^{1} \int_0^{1 - x^2} \left( \int_0^{2 - z} 3y \, dy \right) dz\,dx = \frac{3}{2} \int_{-1}^{1} \int_0^{1 - x^2} (2 - z)^2 \, dz\,dx \]
\[ = \frac{1}{2} \int_{-1}^{1} \left( 7 - 3x^2 - 3x^4 - x^6 \right) dx = \left[ 7x - x^3 - \frac{3}{5} x^5 - \frac{1}{7} x^7 \right]_0^1 = \frac{184}{35} . \]

[Fig. 206: Sketch of $V$.]

The following example gives a typical application of Gauss’ theorem in the area of partial differential equations. It considers solutions of wave equations. Ultimately, it will lead to the proof of the causal behavior of the solutions, i.e., the fact that two solutions, whose values coincide on a circular area $A_C$ at time $t = 0$ and whose partial time derivatives coincide on that same set, coincide on the volume of a certain ‘characteristic solid cone’ that is contained in $A_C \times [0, \infty)$ and has $A_C$ as basis.

Example 4.6.31. (An energy inequality for a wave equation in two space dimensions) We consider a function $u : U \to \mathbb{R}$ of class $C^2$ that satisfies the wave equation
\[ \frac{\partial^2 u}{\partial t^2} - \Delta u + V u = 0 \quad (4.6.19) \]
where $V : U \to \mathbb{R}$ is continuous, assumes only non-negative values, i.e., $\operatorname{Ran}(V) \subset [0, \infty)$, and is such that
\[ \frac{\partial V}{\partial t} = 0 . \]
In this, $U$ is a non-empty open subset of $\mathbb{R}^3$. In addition, we define for every partially differentiable $f : U \to \mathbb{R}$, every partially differentiable $F : U \to \mathbb{R}^2$ and every twice partially differentiable $f : U \to \mathbb{R}$
\[ (\nabla f)(t,x,y) := [\nabla f(t, \cdot)](x,y) , \quad (\operatorname{div} F)(t,x,y) := [\operatorname{div} F(t, \cdot)](x,y) , \quad (\Delta f)(t,x,y) := [\Delta f(t, \cdot)](x,y) , \]
respectively, for all $(t,x,y) \in U$.
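Then the function $\epsilon$ and the vector field $j$ defined by
\[ \epsilon := \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + |\nabla u|^2 + V u^2 \right] , \quad j := - \frac{\partial u}{\partial t} \, \nabla u \]
satisfy
\[ \frac{\partial \epsilon}{\partial t} + \operatorname{div} j = \frac{\partial}{\partial t} \, \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^2 + |\nabla u|^2 + V u^2 \right] - \operatorname{div} \left( \frac{\partial u}{\partial t} \, \nabla u \right) \]
\[ = \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + (\nabla u) \cdot \nabla \frac{\partial u}{\partial t} + V u \frac{\partial u}{\partial t} - \left( \nabla \frac{\partial u}{\partial t} \right) \cdot (\nabla u) - \frac{\partial u}{\partial t} \, \Delta u = \frac{\partial u}{\partial t} \left( \frac{\partial^2 u}{\partial t^2} - \Delta u + V u \right) = 0 . \]
Hence we conclude the conservation law
\[ \operatorname{div} j + \frac{\partial \epsilon}{\partial t} = 0 . \quad (4.6.20) \]
Note for later use that
\[ \epsilon(x,y,t) \geq |j(x,y,t)| \quad (4.6.21) \]
for all $(x,y,t) \in U$. In physical applications, $\epsilon$ is called the energy density (corresponding to $u$) and $j$ is called the energy flux density (corresponding to $u$). Integration of $\epsilon(\cdot, t)$ over subsets of $\mathbb{R}^2$ (if the corresponding integral exists) gives the energy of $u$ that is contained in that subset at time $t \in \mathbb{R}$. The vector field $j$ describes the flow of that energy. In the following, we derive an important consequence of (4.6.20). For this, let $(x,y,t) \in \mathbb{R}^3$ be such that $t > 0$. Further, let $T \in [0, t)$. We define the solid backward characteristic cone $SC_{x,y,t}$ with apex $(x,y,t)$ by
\[ SC_{x,y,t} := \left\{ (\xi, \eta, \tau) \in \mathbb{R}^3 : \tau \leq t - |(x - \xi, y - \eta)| \right\} . \]

[Fig. 207: Sketch of the domain of integration in Example 4.6.31.]

Further, we assume that
\[ U \supset SC_{x,y,t} \cap \left( [0, T] \times \mathbb{R}^2 \right) . \]
Then
\[ \int_{B^2_{t-T}(x,y)} \epsilon(u)(T, \cdot) \, d\xi\,d\eta \leq \int_{B^2_t(x,y)} \epsilon(u)(0, \cdot) \, d\xi\,d\eta . \quad (4.6.22) \]
This can be shown as follows. It follows from (4.6.20) by Gauss’ theorem that
\[ 0 = \int_{SC_{x,y,t} \cap ([0,T] \times \mathbb{R}^2)} \left[ \frac{\partial \epsilon}{\partial t} - \nabla \cdot \left( \frac{\partial u}{\partial t} \, \nabla u \right) \right](\xi, \eta, \tau) \, d\xi\,d\eta\,d\tau = \int_{\partial (SC_{x,y,t} \cap ([0,T] \times \mathbb{R}^2))} \left( \epsilon , \, - \frac{\partial u}{\partial t} \, \nabla u \right) \cdot v \; dS \]
\[ = \int_{B^2_{t-T}(x,y)} \epsilon(T, \cdot) \, d\xi\,d\eta - \int_{B^2_t(x,y)} \epsilon(0, \cdot) \, d\xi\,d\eta + \int_{C_{x,y,t} \cap ([0,T] \times \mathbb{R}^2)} \left( \epsilon(u) , \, - \frac{\partial u}{\partial t} \, \nabla u \right) \cdot v \; dS \]
where $v$ denotes the outer unit normal field on the boundary surface
\[ \partial \left( SC_{x,y,t} \cap \left( [0, T] \times \mathbb{R}^2 \right) \right) \]
of $SC_{x,y,t} \cap ([0, T] \times \mathbb{R}^2)$. A parametrization of $C_{x,y,t}$ is given by $f : \mathbb{R}^2 \to \mathbb{R}^3$ defined by
\[ f(\xi, \eta) := (\xi, \eta, t - |(x - \xi, y - \eta)|) \]
for every $(\xi, \eta) \in \mathbb{R}^2$. In particular,
\[ v(f(\xi, \eta)) = \frac{1}{\sqrt{2}} \left( 1 , \; - \frac{x - \xi}{|(x - \xi, y - \eta)|} , \; - \frac{y - \eta}{|(x - \xi, y - \eta)|} \right) \]
for every $(\xi, \eta) \in \mathbb{R}^2$. In addition, it follows for such $(\xi, \eta)$ that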
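\[ 2 \sqrt{2} \left[ \left( \epsilon , \, - \frac{\partial u}{\partial t} \, \nabla u \right) \cdot v \right](\zeta) = \left( \frac{\partial u}{\partial t}(\zeta) \right)^2 + |(\nabla u)(\zeta)|^2 + V(\zeta) \, (u(\zeta))^2 + \frac{2}{|(x - \xi, y - \eta)|} \, \frac{\partial u}{\partial t}(\zeta) \, (x - \xi, y - \eta) \cdot (\nabla u)(\zeta) \]
\[ \geq \left( \frac{\partial u}{\partial t}(\zeta) \right)^2 + |(\nabla u)(\zeta)|^2 + V(\zeta) \, (u(\zeta))^2 - 2 \left| \frac{\partial u}{\partial t}(\zeta) \right| \, |(\nabla u)(\zeta)| \geq 0 \]
where $\zeta := f(\xi, \eta)$. Hence (4.6.22) follows.

As an application of the energy inequality (4.6.22), we assume that $v : U \to \mathbb{R}$ is another solution of (4.6.19) such that
\[ u(\xi, \eta, 0) = v(\xi, \eta, 0) , \quad \frac{\partial u}{\partial t}(\xi, \eta, 0) = \frac{\partial v}{\partial t}(\xi, \eta, 0) \]
for all $(\xi, \eta) \in B_t(x,y)$. Then $u - v$ is a solution of (4.6.19) such that
\[ (u - v)(\xi, \eta, 0) = 0 , \quad \frac{\partial (u - v)}{\partial t}(\xi, \eta, 0) = 0 \]
for all $(\xi, \eta) \in B_t(x,y)$ and hence the corresponding energy density vanishes at time 0 on $B_t(x,y)$. As a consequence of (4.6.22) and the positivity of the energy density, it follows that the same is true at time $T$ on $B_{t-T}(x,y)$. Since this is true for every $T \in [0, t)$ and since $u - v$ is continuous, it follows that $u$ and $v$ coincide in every point of
\[ SC_{x,y,t} \cap \left( [0, t] \times \mathbb{R}^2 \right) . \]
As a consequence, we have the following result.

Theorem 4.6.32. (Uniqueness of the solutions of a wave equation in two space dimensions) Let $(x,y,t) \in \mathbb{R}^3$ be such that $t > 0$. We define the solid backward characteristic cone $SC_{x,y,t}$ with apex $(x,y,t)$ by
\[ SC_{x,y,t} := \left\{ (\xi, \eta, \tau) \in \mathbb{R}^3 : \tau \leq t - |(x - \xi, y - \eta)| \right\} . \]
Further, let $U$ be an open subset of $\mathbb{R}^3$ such that
\[ U \supset SC_{x,y,t} \cap \left( \mathbb{R}^2 \times [0, t] \right) \]
and $u, v \in C^2(U, \mathbb{R})$ be such that
\[ \frac{\partial^2 u}{\partial t^2} - \Delta u + V u = \frac{\partial^2 v}{\partial t^2} - \Delta v + V v , \]
\[ u(\xi, \eta, 0) = v(\xi, \eta, 0) , \quad \frac{\partial u}{\partial t}(\xi, \eta, 0) = \frac{\partial v}{\partial t}(\xi, \eta, 0) \]
for all $(\xi, \eta) \in B_t(x,y)$. In this,
\[ (\Delta u)(t,x,y) := [\Delta u(t, \cdot)](x,y) \]
for all $(t,x,y) \in U$, and $V : U \to \mathbb{R}$ is continuous, assumes only non-negative values, i.e., $\operatorname{Ran}(V) \subset [0, \infty)$, and satisfies
\[ \frac{\partial V}{\partial t} = 0 . \]
Then $u$ and $v$ coincide on $SC_{x,y,t} \cap (\mathbb{R}^2 \times [0, t])$.

Problems

1) Decide whether the tuple of vectors is positively or negatively oriented
a) ((1, 2), (3, 4)) , b) ((1, 1), (0, 1)) , c) ((3, 7), (2, 1)) , d) ((9, 4), (10, 3)) ,
e) ((0, 4, 2), (8, 1, 3), (9, 12, 5)) , f) ((2, 2, 14), (3, 1, 1), (2, 9, 9)) ,
g) ((1, 1, 2), (4, 3, 1), (8, 2, 5)) , h) ((2, 7, 9), (1, 2, 4), (8, 8, 1)) .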
Let U be a non-empty open subset of R3 ; f : U Ñ R, v : U Ñ R3 and w : U Ñ R3 be partially differentiable; f1 : U Ñ R, v1 pv1x , v1y , v1z q : U Ñ R3 and also v2 : U Ñ R3 be of class C 2 . a) c) 2) Show that a) rot p∇f1 q 0 , b) div prot v1 q 0 , c) div p∇f1 q 4f1 , f) rot prot v1 q ∇pdiv v1 q 4v1 , d) rot pf.vq f.prot vq p∇f q v , e) div pf.vq f pdiv vq p∇f q v , g) div pv wq w prot vq v prot wq where 4v1 : p4v1x , 4v1y , 4v1z q . 3) Let a, b, c, d P R be such that a c d b. Further, let f1 : pa, bq Ñ R and f2 : pa, bq Ñ R be twice differentiable and such that f1 pxq f2 pxq for all x P pa, bq. Find a twice continuously differentiable bijective g : R pa, bq Ñ R pa, bq whose inverse is twice continuously differentiable and which is such that g pr1, 1s rc, dsq tpx, y q P R rc, ds : f1 py q ¤ x ¤ f2 py qu . 4) Calculate the area of Gpf q of f : U Ñ R defined by U : tpx, y q P R2 : x2 y 2 1u and f px, y q : 3 3x 7y for all px, y q P U . 5) Let D be the open subset of R2 that is bounded by the triangle with corners p0, 0q, p1, 0q, p1, 1q. Calculate the area of tpx, y, zq P R3 : 3x2 7y z 0uXtpx, y, zq P R3 : px, yq P Du . 744 y 1 0.5 -1 0.5 -0.5 1 x -0.5 -1 Fig. 208: An astroid. 6) Calculate the area of tpx, y, zq P R3 : x2 y2 z2 1uXtpx, y, zq P R3 : x2 y2 xu . 7) Calculate the surface areas of the tori from Example 4.6.15. 8) By calculation of a path integral, find the compact area that is bounded by the astroid where a ¡ 0. t pa cos3 t, a sin3 tq P R2 : t P r0, 2πq u 9) By calculation of a path integral, find the compact area that is bounded by the cardioid t pa cos t p1 where a ¡ 0. cos tq, a sin t p1 cos tqq P R2 : t P r0, 2π q u 10) By calculation of a path integral, find the compact area that is bounded by the folium of Descartes where a ¡ 0. t px, yq P R2 : x3 745 y 3 3axy 0u y 1 1 2 1 2 1 x 3 2 1 - 2 -1 Fig. 209: A cardioid. y 2 1 -2 1 -1 -1 -2 Fig. 210: A folium of Descartes. 
746 2 x 11) Use Stokes’ theorem to calculate the surface integral » S where curl A dS Apx, y, z q : pz, x, y q P R and S : t px, y, z q P R3 : x2 for all x, y, z y2 z40^z ¥ 0u . For this, assume a normal field with positive z-component. Sketch S. 12) Use Stokes’ theorem to calculate the surface integral » S where curl A dS Apx, y, z q : p2yz, 0, xy q , P R and S : t px, y, z q P R3 : x2 for all x, y, z y2 z2 9 ^ 0 ¤ z ¤ 4z u . For this, assume a normal field pointing away from the z-axis. Sketch S. 13) By using Stokes’ theorem, calculate » S where for all x, y, z curl A dS Apx, y, z q : p2y, 3x, z 2 q P R and S is the closed upper half surface of the sphere t px, y, zq P R3 : x2 y2 z2 9 u . For this, assume a normal field with positive z-component. 14) Use Gauss’ theorem to calculate the surface integral » S A dS 747 where Apx, y, z q : px2 , xy, y 2 q for all x, y, z P R and S is the compact region in the first octant bounded by the coordinate planes and t px, y, zq P R3 : x 2y z 1u . Sketch S. 15) Use Gauss’ theorem to calculate the surface integral » S where A dS Apx, y, z q : p3x2 , 6xy, z 2 q for all x, y, z P R and S is the compact region in the first octant bounded by the coordinate planes and t px, y, zq P R3 : x 2 u , t px, y, zq P R3 : z y2 1u . Sketch S. 16) By using Gauss’ theorem, calculate » S where A dS Apx, y, z q : p2xy z, y 2 , x 3y q for all x, y, z P R and S is the compact region in the first octant bounded by the coordinate planes and t px, y, zq P R3 : 2x 748 2y z 6u . 5 5.1 Appendix Construction of the Real Number System Already the ancient Greeks discovered that there was a need to go beyond rational numbers. For instance, they found that there is no rational number to measure the length of the diagonal d of a square with sides of length 1. By the Pythagorean theorem that length satisfies the equation d2 2. In Example 2.2.15, we proved that this equation has no rational solution which was also known to the Greek’s of that time. 
Still, they did not develop the concept of real numbers. In its final form, that concept was developed only in the 19th century. In the following, we construct the real number system following an approach by Georg Cantor (1872) as completion of the rational number system. For this, in a first step, we identify $\mathbb{Q}$ with a space containing equivalence classes of Cauchy sequences of rational numbers.

Definition 5.1.1. (Cauchy sequences in $\mathbb{Q}$) Let $x = x_1, x_2, x_3, \dots$ be a sequence of rational numbers. We say that $x$ is a Cauchy sequence if for every rational $\varepsilon > 0$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that
\[ |x_m - x_n| < \varepsilon \]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. Such a sequence is necessarily bounded by some rational number since this leads in the special case $\varepsilon = 1$ to
\[ |x_k| \leq \max\{ |x_l| : l = 1, \dots, n_0 \} + |x_k - x_{n_0} + x_{n_0}| \leq \max\{ |x_l| : l = 1, \dots, n_0 \} + |x_k - x_{n_0}| + |x_{n_0}| \leq 1 + |x_{n_0}| + \max\{ |x_l| : l = 1, \dots, n_0 \} \]
for $k \in \mathbb{N}^*$ such that $k \geq n_0$ and
\[ |x_k| \leq \max\{ |x_l| : l = 1, \dots, n_0 \} \leq 1 + |x_{n_0}| + \max\{ |x_l| : l = 1, \dots, n_0 \} \]
for $k \in \mathbb{N}^*$ such that $k \leq n_0$. We denote the set of all such sequences by the symbol $C$. For $x, y \in C$, we define sequences $x + y$ and $x \cdot y$ of rational numbers by
\[ x + y := x_1 + y_1, \, x_2 + y_2, \, x_3 + y_3, \dots , \quad x \cdot y := x_1 \cdot y_1, \, x_2 \cdot y_2, \, x_3 \cdot y_3, \dots . \]
Since
\[ |(x + y)_m - (x + y)_n| = |x_m - x_n + y_m - y_n| \leq |x_m - x_n| + |y_m - y_n| , \]
\[ |(x \cdot y)_m - (x \cdot y)_n| = |x_m y_m - x_n y_n| = |x_m (y_m - y_n) + (x_m - x_n) y_n| \leq |x_m| \cdot |y_m - y_n| + |y_n| \cdot |x_m - x_n| \leq C_x |y_m - y_n| + C_y |x_m - x_n| \]
where $C_x, C_y$ are rational bounds for $x$ and $y$, respectively, it follows that $x + y \in C$ and $x \cdot y \in C$. Finally, we define for every $x \in C$ a corresponding sequence $-x \in C$ by
\[ -x := -x_1, \, -x_2, \dots . \]

Definition 5.1.2. We define an equivalence relation ‘$\sim$’ on $C$ as follows. We say that $x, y \in C$ are equivalent and denote this by $x \sim y$ if for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[ |x_n - y_n| < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Indeed, ‘$\sim$’ is reflexive since for every $x \in C$ and every rational $\varepsilon > 0$ it follows that
\[ |x_n - x_n| = 0 < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq 1$ and hence that $x \sim x$.
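Also, ‘$\sim$’ is symmetric, since for $x, y \in C$ such that $x \sim y$ and rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[ |x_n - y_n| < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. This implies that
\[ |y_n - x_n| < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$ and hence that $y \sim x$. Finally, if $x, y, z \in C$ are such that $x \sim y$ and $y \sim z$ and $\varepsilon$ is some rational number $> 0$, it follows the existence of $n_0 \in \mathbb{N}^*$ such that
\[ |x_n - y_n| < \varepsilon/2 , \quad |y_n - z_n| < \varepsilon/2 \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence
\[ |x_n - z_n| = |x_n - y_n + y_n - z_n| \leq |x_n - y_n| + |y_n - z_n| < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Therefore it follows that $x \sim z$.

Lemma 5.1.3. Let $x \in C$ and $\bar{x}$ be a subsequence of $x$. Then $\bar{x} \in C$ and $\bar{x} \sim x$.

Proof. Since $\bar{x}$ is a subsequence of $x$, there is a strictly increasing sequence $k_1, k_2, \dots$ of elements of $\mathbb{N}^*$ such that $\bar{x} = x_{k_1}, x_{k_2}, \dots$ and such that $k_n \geq n$ for all $n \in \mathbb{N}^*$. Since $x \in C$, for rational $\varepsilon > 0$, there is $n_0 \in \mathbb{N}^*$ such that
\[ |x_m - x_n| < \varepsilon \]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. In particular, this implies that
\[ |x_{k_m} - x_{k_n}| < \varepsilon \]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$ since the last implies that $k_m \geq m \geq n_0$ and $k_n \geq n \geq n_0$. Hence it follows that $\bar{x} \in C$. Also, it follows that
\[ |\bar{x}_n - x_n| = |x_{k_n} - x_n| < \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$ since the last implies that $k_n \geq n \geq n_0$. Hence $\bar{x} \sim x$.

Definition 5.1.4. (Cantor real numbers) For every $x \in C$, we define the associated Cantor real number as the equivalence class $[x]$ defined by
\[ [x] := \{ y : y \in C \wedge y \sim x \} . \]
Also, we define the set $C^*$ of Cantor real numbers by
\[ C^* := \{ [x] : x \in C \} . \]
For $[x], [y] \in C^*$, where $x, y \in C$, we define a corresponding sum and a corresponding product by
\[ [x] + [y] := [x + y] , \quad [x] \cdot [y] := [x \cdot y] . \]
Indeed, this is possible, since it follows for $\bar{x}, \bar{y} \in C$ satisfying $[\bar{x}] = [x]$ and $[\bar{y}] = [y]$ that
\[ [\bar{x} + \bar{y}] = [x + y] , \quad [\bar{x} \cdot \bar{y}] = [x \cdot y] . \]
This can be seen as follows. First, since $[\bar{x}] = [x]$ and $[\bar{y}] = [y]$, it follows that $\bar{x} \sim x$ and that $\bar{y} \sim y$. Hence for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[ |\bar{x}_n - x_n| < \varepsilon/2 , \quad |\bar{y}_n - y_n| < \varepsilon/2 \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$.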
For such $n$, it follows that
\[ |(\bar{x} + \bar{y})_n - (x + y)_n| = |\bar{x}_n - x_n + \bar{y}_n - y_n| \leq |\bar{x}_n - x_n| + |\bar{y}_n - y_n| < \varepsilon \]
and therefore that
\[ (\bar{x} + \bar{y}) \sim (x + y) . \]
Also, if $C_x, C_{\bar{y}}$ are rational bounds for $x$ and $\bar{y}$, respectively, and $\varepsilon$ is some rational number $> 0$, then there is $n_0 \in \mathbb{N}^*$ such that
\[ C_{\bar{y}} \, |\bar{x}_n - x_n| < \varepsilon/2 , \quad C_x \, |\bar{y}_n - y_n| < \varepsilon/2 \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. For such $n$, it follows that
\[ |(\bar{x} \cdot \bar{y})_n - (x \cdot y)_n| = |\bar{x}_n \bar{y}_n - x_n y_n| = |(\bar{x}_n - x_n) \bar{y}_n + x_n (\bar{y}_n - y_n)| \leq |\bar{y}_n| \cdot |\bar{x}_n - x_n| + |x_n| \cdot |\bar{y}_n - y_n| \leq C_{\bar{y}} \, |\bar{x}_n - x_n| + C_x \, |\bar{y}_n - y_n| < \varepsilon \]
and hence that
\[ (\bar{x} \cdot \bar{y}) \sim (x \cdot y) . \]
Finally, we define the embedding $\iota$ of $\mathbb{Q}$ into $C^*$ by
\[ \iota(q) := [q, q, \dots] \]
for all $q \in \mathbb{Q}$. It is an obvious consequence of the definitions that
\[ \iota(q + \bar{q}) = \iota(q) + \iota(\bar{q}) , \quad \iota(q \cdot \bar{q}) = \iota(q) \cdot \iota(\bar{q}) , \]
for all $q, \bar{q} \in \mathbb{Q}$.

Theorem 5.1.5. $(C^*, +, \cdot)$ is a field, i.e., the following holds for all $x, y, z \in C$:

(i) $([x] + [y]) + [z] = [x] + ([y] + [z])$ , (Associativity of addition)
(ii) $[x] + [y] = [y] + [x]$ , (Commutativity of addition)
(iii) $[x] + \iota(0) = [x]$ , (Existence of a neutral element for addition)
(iv) $[x] + [-x] = \iota(0)$ , (Existence of inverse elements for addition)
(v) $([x] \cdot [y]) \cdot [z] = [x] \cdot ([y] \cdot [z])$ , (Associativity of multiplication)
(vi) $[x] \cdot [y] = [y] \cdot [x]$ , (Commutativity of multiplication)
(vii) $[x] \cdot \iota(1) = [x]$ , (Existence of a neutral element for multiplication)
(viii) If $[x] \neq \iota(0)$, then there is $w \in C$ such that $[x] \cdot [w] = \iota(1)$ , (Existence of inverse elements for multiplication)
(ix) $[x] \cdot ([y] + [z]) = [x] \cdot [y] + [x] \cdot [z]$ . (Distributive law)

Proof. The validity of the statements (i)-(vii) and (ix) is an obvious consequence of the analogous laws for rational numbers and the definition of the addition and multiplication on $C^*$. For the proof of (viii), let $x \in C$ be such that $[x] \neq \iota(0)$. As a consequence, it is not true that for every rational $\varepsilon > 0$ there is $n_0 \in \mathbb{N}^*$ such that
\[ |x_n| < \varepsilon \quad (5.1.1) \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Therefore, there is a rational $\delta > 0$ for which there is no $n_0 \in \mathbb{N}^*$ such that (5.1.1), with $\varepsilon$ replaced by $\delta$, is valid for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. This implies the existence of a strictly increasing sequence $n_1, n_2, \dots$
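of natural numbers such that
\[ |x_{n_k}| \geq \delta \]
for all $k \in \mathbb{N}^*$. According to Lemma 5.1.3, $\bar{x} := x_{n_1}, x_{n_2}, \dots \in C$ and $\bar{x} \sim x$. The last implies that $[\bar{x}] = [x]$. Therefore, we can assume without restriction that
\[ |x_n| \geq \delta \]
for all $n \in \mathbb{N}^*$. We define
\[ w := 1/x_1, \, 1/x_2, \dots . \]
Then
\[ |w_m - w_n| = \left| \frac{1}{x_m} - \frac{1}{x_n} \right| = \frac{|x_m - x_n|}{|x_m| \cdot |x_n|} \leq \frac{|x_m - x_n|}{\delta^2} . \]
Hence if $\varepsilon$ is rational such that $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ is such that
\[ |x_m - x_n| < \delta^2 \varepsilon \]
for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$, then also
\[ |w_m - w_n| < \varepsilon \]
for all $m, n \in \mathbb{N}^*$ such that $m \geq n_0$ and $n \geq n_0$. As a consequence, $w \in C$ and $[x] \cdot [w] = \iota(1)$.

In the next step, after preparation by a lemma, we define an order relation ‘$<$’ on $C^*$.

Lemma 5.1.6. Let $x, y \in C$ be such that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ x_n \leq y_n - \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Further let $\bar{x}, \bar{y} \in C$ be such that $\bar{x} \sim x$ and $\bar{y} \sim y$. Then, there are a rational $\bar{\varepsilon} > 0$ and an $\bar{n}_0 \in \mathbb{N}^*$ such that
\[ \bar{x}_n \leq \bar{y}_n - \bar{\varepsilon} \]
for all $n \in \mathbb{N}^*$ such that $n \geq \bar{n}_0$.

Proof. Since $\bar{x} \sim x$ and $\bar{y} \sim y$, it follows the existence of $N \in \mathbb{N}^*$ such that
\[ \bar{x}_n - x_n \leq |\bar{x}_n - x_n| < \varepsilon/4 , \quad y_n - \bar{y}_n \leq |\bar{y}_n - y_n| < \varepsilon/4 \]
for all $n \in \mathbb{N}^*$ such that $n \geq N$. Hence it follows for $n \in \mathbb{N}^*$ satisfying $n \geq \max\{n_0, N\}$ that
\[ \bar{x}_n < x_n + \frac{\varepsilon}{4} \leq y_n - \varepsilon + \frac{\varepsilon}{4} < \bar{y}_n - \varepsilon + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} \]
and hence that
\[ \bar{x}_n \leq \bar{y}_n - \bar{\varepsilon} \]
where $\bar{\varepsilon} := \varepsilon/2$.

As a consequence of the previous lemma, it is meaningful to define the following.

Definition 5.1.7. For $[x], [y] \in C^*$, we say that $[x]$ is smaller than $[y]$ and denote this by
\[ [x] < [y] \]
if there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ x_n \leq y_n - \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Further, we say that $[x] \in C^*$ is equal or smaller than $[y] \in C^*$ and denote this by
\[ [x] \leq [y] \]
if $[x] < [y]$ or if $[x] = [y]$. Finally, we define for every $[x] \in C^*$ its absolute value $|[x]|$ by
\[ |[x]| := \begin{cases} [x] & \text{if } \iota(0) \leq [x] \\ [-x] & \text{if } [x] < \iota(0) \end{cases} . \]
It is an obvious consequence of the definitions that for $q_1, q_2 \in \mathbb{Q}$ it follows in the case $q_1 < q_2$ that
\[ \iota(q_1) < \iota(q_2) , \]
in the case $q_1 \leq q_2$ that
\[ \iota(q_1) \leq \iota(q_2) \]
and that
\[ |\iota(q_1)| = \iota(|q_1|) . \]

Theorem 5.1.8. Let $[x], [y] \in C^*$.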
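Then

(i) $\iota(0) \not< \iota(0)$ ;
(ii) if $[x] \neq \iota(0)$, then either $\iota(0) < [x]$ or $[x] < \iota(0)$ ;
(iii) if $\iota(0) < [x]$ and $\iota(0) < [y]$, then $\iota(0) < [x] + [y]$ , $\iota(0) < [x] \cdot [y]$ .

Proof. ‘(i)’: The proof is indirect. Assume that $\iota(0) < \iota(0)$. Then there is a rational $\varepsilon > 0$ such that $0 \leq -\varepsilon$, which is a contradiction.

‘(ii)’: For this, let $[x] \neq \iota(0)$. Further, assume that both $[x] \not< \iota(0)$ and $\iota(0) \not< [x]$. Then it is not true that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ x_n \leq -\varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence for every rational $\varepsilon > 0$ and every $n_0 \in \mathbb{N}^*$, there is $n \in \mathbb{N}^*$ such that $n \geq n_0$ and
\[ x_n > -\varepsilon . \]
Therefore, there is a subsequence $\bar{x}$ of $x$ such that
\[ \bar{x}_n > -\frac{1}{n} \]
for every $n \in \mathbb{N}^*$. Since $\bar{x} \in [x]$ according to Lemma 5.1.3, we can assume without restriction that
\[ x_n > -\frac{1}{n} \]
for every $n \in \mathbb{N}^*$. Further, since $\iota(0) \not< [x]$, it is not true that there is a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ 0 \leq x_n - \varepsilon \]
for all $n \in \mathbb{N}^*$ such that $n \geq n_0$. Hence for every rational $\varepsilon > 0$ and every $n_0 \in \mathbb{N}^*$, there is $n \in \mathbb{N}^*$ such that $n \geq n_0$ and
\[ x_n < \varepsilon . \]
Therefore, there is a subsequence $\bar{x}$ of $x$ such that
\[ \bar{x}_n < \frac{1}{n} \]
for every $n \in \mathbb{N}^*$. Since $\bar{x} \in [x]$ according to Lemma 5.1.3, we can assume without restriction that
\[ -\frac{1}{n} < x_n < \frac{1}{n} \]
for every $n \in \mathbb{N}^*$. Obviously, this implies that $x \sim 0, 0, \dots$ and hence that $[x] = \iota(0)$, in contradiction to the assumption. Hence it follows that either $\iota(0) < [x]$ or $[x] < \iota(0)$ are true or that both of these inequalities are true. The last implies that there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ 0 \leq x_n - \varepsilon \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$ and also that there are a rational $\delta > 0$ and $m_0 \in \mathbb{N}^*$ such that
\[ x_n \leq -\delta \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq m_0$. This is impossible for $n \geq \max\{n_0, m_0\}$, and hence exactly one of the two inequalities is true.

‘(iii)’: For this, let $\iota(0) < [x]$ and $\iota(0) < [y]$. Then, there are a rational $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ such that
\[ \varepsilon \leq x_n , \quad \varepsilon \leq y_n \]
for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. Hence it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$ that
\[ 2\varepsilon \leq x_n + y_n , \quad \varepsilon^2 \leq x_n \cdot y_n \]
and finally that
\[ \iota(0) < [x] + [y] , \quad \iota(0) < [x] \cdot [y] . \]

Theorem 5.1.9. Let $[x], [y]$ be elements of $C^*$ such that $[x] < [y]$. Then there is $q \in \mathbb{Q}$ such that
\[ [x] < \iota(q) < [y] . \quad (5.1.2) \]

Proof.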
Since [x] [y], there are a rational ε ¡ 0 and n0 xn for all n P N satisfying n such that m0 ¥ n0 and (5.1.2) P N such that ¤ yn ε ¥ n0. Further, since x, y P C, there is m0 P N |xm xn| ε{4 , |ym yn| ε{4 for all n P N satisfying n ¥ m0 . We define q : pxm follows for all n P N satisfying n ¥ m0 that 0 0 0 q xn 21 pxm yn q yn 12 pxm 0 [y]. Then ym0 q xn 0 xm xn 0 ym0 q yn ym0 ym0 q{2. Then it 1 ε p ym0 xm0 q ¡ 2 4 1 ε pym0 xm0 q ¡ 4 . 2 Hence it follows (5.1.2). Definition 5.1.10. Let [x1 ], [x2 ], . . . be a sequence of elements of C and [x] P C . (i) We call [x1 ], [x2 ], . . . a Cauchy sequence if for every [ε] that ιp0q [ε] there is a corresponding n0 P N such that |[xm] [xn]| [ε] for all m, n P N such that m ¥ n0 and n ¥ n0 . 758 P C such (ii) We define lim [xn ] [x] Ñ8 n if for every [ε] P C such that ιp0q n0 P N such that for all n ¥ n0 : [ε] there is a corresponding |[xn] [x]| [ε] . (5.1.3) In this case, we say that the sequence [x1 ], [x2 ], . . . is convergent to [x]. Theorem 5.1.11. (i) (The rational numbers are dense in the real numbers) Let [x] be some element of C . Then lim Ñ8 ιpxn q [x] . n (ii) (Completeness of the real number system) Every Cauchy sequence in C is convergent. Proof. ‘(i)’: For this, let [ε] rational δ ¡ 0 such that P C such that ιp0q [ε]. Then there is a ιp0q ιpδ q [ε]{2 as a consequence of Theorem 5.1.2. Further, it follows for m P N that ιpxm q [x] [xm x1 , xm x2 , . . . ] and, since x P C, the existence of n0 P N such that |x m x n | δ for all m, n P N satisfying m ¥ n0 and n ¥ n0 . Hence it follows for such m, n that 2δ δ δ xm xn δ 2δ δ 759 and therefore that ε 2 ιpδq ¤ ιpxmq [x] ¤ 2 ιpδq ε . This implies that |ιpxmq [x]| ε for all m P N satisfying m ¥ n0 . ‘(ii)’: For this, let [x1 ], [x2 ], . . . be a Cauchy sequence in C . In addition, for every n P N , let qn P Q be such that [xn ] ιpqn q [xn ] ιp1{nq . Such qn exists according to Theorem 5.1.2. In the following, we will show that lim [xn ] [q] nÑ8 where q : q1 , q2 , . . . . 
First, we show that $q \in C$. For this, let $\delta > 0$ be rational and $n_0 \in \mathbb{N}^*$ be such that $n_0 > 4/\delta$ and such that

$$|[x_m] - [x_n]| < \iota(\delta)/2$$

for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$. For such $m$ and $n$, it follows that

$$|\iota(q_m) - \iota(q_n)| = |\iota(q_m) - [x_m] + [x_m] - [x_n] + [x_n] - \iota(q_n)|$$
$$\leq |\iota(q_m) - [x_m]| + |[x_m] - [x_n]| + |[x_n] - \iota(q_n)| < \iota(1/m) + \iota(1/n) + |[x_m] - [x_n]| < \iota(\delta)$$

and hence also that $|q_m - q_n| < \delta$. Further, let $[\varepsilon] \in C/{\sim}$ be such that $\iota(0) < [\varepsilon]$. Since according to (i)

$$\lim_{n \to \infty} \iota(q_n) = [q] ,$$

it follows the existence of $n_0 \in \mathbb{N}^*$ such that $\iota(1/n_0) < [\varepsilon]/2$ and such that for all $n \geq n_0$:

$$|\iota(q_n) - [q]| < [\varepsilon]/2 .$$

This also implies that

$$|[x_n] - [q]| = |[x_n] - \iota(q_n) + \iota(q_n) - [q]| \leq |[x_n] - \iota(q_n)| + |\iota(q_n) - [q]| < \iota(1/n) + ([\varepsilon]/2) < [\varepsilon] .$$

5.2 Lebesgue’s Criterion for Riemann-integrability

Apart from notational changes and additions, we follow Sect. 7.26 of [5] in the proof of Lebesgue’s criterion for Riemann-integrability, Theorem 2.6.13.

Theorem 5.2.1. Let $S_0, S_1, \dots$ be a sequence of subsets of measure zero of $\mathbb{R}$. Then the union $S$ of these subsets has measure zero, too.

Proof. Given $\varepsilon > 0$, for each $k \in \mathbb{N}$ there is a sequence $I_{k0}, I_{k1}, \dots$ of open subintervals of $\mathbb{R}$ such that the union of these intervals contains $S_k$ and at the same time such that

$$\lim_{n \to \infty} \sum_{m=0}^{n} l(I_{km}) < \frac{\varepsilon}{2^{k+1}} .$$

The sequence of all intervals $I_{kl}$, where $k, l \in \mathbb{N}$, is countable; the union of all these intervals contains $S$ and

$$\lim_{n \to \infty} \sum_{k=0}^{n} l(I_{\varphi(k)}) < \lim_{l \to \infty} \sum_{k=0}^{l} \frac{\varepsilon}{2^{k+1}} = \varepsilon$$

where $\varphi : \mathbb{N} \to \mathbb{N}^2$ is some bijection.

Definition 5.2.2. (Oscillation of a function) Let $I$ be some non-trivial interval of $\mathbb{R}$ and $f : I \to \mathbb{R}$ be some bounded function. Then we define for every non-trivial subset $S$ of $I$ the oscillation $\Omega_f(S)$ of $f$ on $S$ by

$$\Omega_f(S) := \sup\{ f(x) - f(y) : x \in S \wedge y \in S \} .$$

(Note that the set in the previous identity is bounded from above by $\sup\{f(y) : y \in I\} - \inf\{f(y) : y \in I\}$. In addition, note that $\Omega_f(S)$ is positive.)
Further, we define for each $x \in I$ the oscillation $\omega_f(x)$ of $f$ at $x$ by the limit

$$\omega_f(x) := \lim_{\delta \to 0+} \Omega_f((x - \delta, x + \delta) \cap I)$$

of the monotonically increasing function that associates to every $\delta > 0$ the value of $\Omega_f((x - \delta, x + \delta) \cap I)$.

Theorem 5.2.3. Let $I$ be some non-trivial interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be some bounded function and $x \in I$. Then $f$ is continuous in $x$ if and only if $\omega_f(x) = 0$.

Proof. First, we consider the case that $f$ is continuous in $x$. Then for every $n \in \mathbb{N}^*$ there are $x_n, y_n \in (x - 1/n, x + 1/n) \cap I$ such that

$$|f(x_n) - f(y_n) - \Omega_f((x - 1/n, x + 1/n) \cap I)| \leq \frac{1}{n} .$$

Hence $f(x_1) - f(y_1) - \Omega_f((x - 1, x + 1) \cap I)$, $f(x_2) - f(y_2) - \Omega_f((x - 1/2, x + 1/2) \cap I)$, $\dots$ is a null sequence. Since both sequences $x_1, x_2, \dots$ and $y_1, y_2, \dots$ are converging to $x$ and since $f$ is continuous in $x$, it follows by Theorem 2.3.4 that $\omega_f(x) = 0$. Finally, we consider the case that $\omega_f(x) = 0$. Assume that $f$ is not continuous in $x$. Hence there is some $\varepsilon > 0$ along with a sequence $x_1, x_2, \dots$ in $I \setminus \{x\}$ which is convergent to $x$, but such that

$$|f(x_n) - f(x)| \geq \varepsilon$$

for all $n \in \mathbb{N}^*$. Hence

$$\Omega_f((x - \delta_n, x + \delta_n) \cap I) \geq \varepsilon$$

for all $n \in \mathbb{N}^*$ where $\delta_n := 2 |x_n - x|$ for all $n \in \mathbb{N}^*$. Since $\delta_1, \delta_2, \dots$ is converging to $0$, it follows that $\omega_f(x) \geq \varepsilon$, which contradicts $\omega_f(x) = 0$. Hence $f$ is continuous in $x$.

Theorem 5.2.4. Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $\omega_f(x) < \varepsilon$ for every $x \in [a, b]$ and some $\varepsilon > 0$. Then there is $\delta > 0$ such that for every closed subinterval $I$ of $[a, b]$ of length smaller than $\delta$ it follows

$$\Omega_f(I) < \varepsilon .$$

Proof. First, it follows from the assumptions that for each $x \in [a, b]$ there is some $\delta_x > 0$ such that

$$\Omega_f((x - \delta_x, x + \delta_x) \cap [a, b]) < \varepsilon .$$

The family of sets $(x - \delta_x/2, x + \delta_x/2)$, where $x \in [a, b]$, is an open covering of $[a, b]$ and hence by the compactness of $[a, b]$ there are $x_1, x_2, \dots, x_n \in [a, b]$, where $n$ is some element of $\mathbb{N}^*$, such that $[a, b]$ is contained in the union of $(x_1 - \delta_{x_1}/2, x_1 + \delta_{x_1}/2)$, $(x_2 - \delta_{x_2}/2, x_2 + \delta_{x_2}/2)$, $\dots$, $(x_n - \delta_{x_n}/2, x_n + \delta_{x_n}/2)$. Now define

$$\delta := \min\{ \delta_{x_1}/2, \delta_{x_2}/2, \dots, \delta_{x_n}/2 \}$$

and let $I$ be some closed subinterval of $[a, b]$ of length smaller than $\delta$. Further, let $k$ be some element of $\{1, 2, \dots, n\}$ such that $(x_k - \delta_{x_k}/2, x_k + \delta_{x_k}/2) \cap I \neq \emptyset$. Then

$$I \subset (x_k - \delta_{x_k}, x_k + \delta_{x_k}) ,$$

since $l(I) < \delta$, and hence $\Omega_f(I) < \varepsilon$.

Theorem 5.2.5. Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $\varepsilon > 0$. Then

$$J_\varepsilon := \{ x \in [a, b] : \omega_f(x) \geq \varepsilon \}$$

is a closed subset of $[a, b]$.

Proof. Let $x$ be some element of the complement $[a, b] \setminus J_\varepsilon$. Then $\omega_f(x) < \varepsilon$ and hence there is some $\delta > 0$ such that $\Omega_f((x - \delta, x + \delta) \cap [a, b]) < \varepsilon$. In particular, it follows for every element $y \in (x - \delta, x + \delta) \cap [a, b]$ that $\omega_f(y) < \varepsilon$ and as a consequence that $(x - \delta, x + \delta) \cap [a, b]$ is contained in $[a, b] \setminus J_\varepsilon$. Hence $[a, b] \setminus J_\varepsilon$ is open in $[a, b]$ and therefore $J_\varepsilon$ is a closed subset of $[a, b]$.

We prove now Theorem 2.6.13:

Theorem 5.2.6. (Lebesgue’s criterion for Riemann-integrability) Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $D$ be the set of discontinuities of $f$. Then $f$ is Riemann-integrable if and only if $D$ is a set of measure zero.

Proof. First, assume that $D$ is not of measure zero. Then $D$ is non-empty and by Theorem 5.2.3 it follows that $\omega_f(x) > 0$ for every $x \in D$. Hence

$$D = \bigcup_{n=1}^{\infty} J_{1/n} . \qquad (5.2.1)$$

Since the union in (5.2.1) is countable, by Theorem 5.2.1 it follows the existence of some $n \in \mathbb{N}^*$ such that $J_{1/n}$ is not a set of measure zero. Hence there is some $\varepsilon > 0$ such that the sum of the lengths of the intervals corresponding to any covering of $J_{1/n}$ by open intervals is $\geq \varepsilon$. Now let $P$ be some partition of $[a, b]$ with corresponding closed intervals $I_0, I_1, \dots, I_k$ where $k \in \mathbb{N}$. Further, denote by $S$ the subset of $\{0, 1, \dots, k\}$ containing only those indices $j \in \{0, 1, \dots, k\}$ for which the intersection of the interior of $I_j$ and $J_{1/n}$ is non-empty. Then the open intervals corresponding to $I_j$, $j \in S$, cover $J_{1/n}$, except possibly for a finite set, which is a set of measure zero.
Hence the sum of their lengths is $\geq \varepsilon$. Therefore

$$U(f, P) - L(f, P) = \sum_{j=0}^{k} [\sup\{f(x) : x \in I_j\} - \inf\{f(x) : x \in I_j\}] \, l(I_j)$$
$$\geq \sum_{j \in S} [\sup\{f(x) : x \in I_j\} - \inf\{f(x) : x \in I_j\}] \, l(I_j) \geq \frac{1}{n} \sum_{j \in S} l(I_j) \geq \frac{\varepsilon}{n}$$

and hence $f$ is not Riemann-integrable. Finally, assume that $D$ is a set of measure zero and consider again (5.2.1). Further, let $n \in \mathbb{N}^*$ be such that $1/n < (b - a)/2$. Then $J_{1/n}$ is by Theorem 5.2.5 compact and has measure zero as a subset of a set of measure zero. Hence there is a covering of $J_{1/n}$ by a finite number of open intervals for which the corresponding sum of lengths is smaller than $1/n$. Without restriction we can assume that those intervals are pairwise disjoint. Denote by $A_n$ the union of those intervals. Then the complement $B_n := [a, b] \setminus A_n$ is the union of a finite number of closed subintervals of $[a, b]$. Let $I$ be such a subinterval. Then $\omega_f(x) < 1/n$ for each $x \in I$ and hence by Theorem 5.2.4 there is a partition of $I$ such that $\Omega_f(I') < 1/n$ for any induced subinterval $I'$. All those partitions induce a partition $P_n$ of $[a, b]$. Now consider some refinement $P \in \mathcal{P}$ of $P_n$ with corresponding closed intervals $I_0, I_1, \dots, I_k$ where $k \in \mathbb{N}$. Further, denote by $S$ the subset of $\{0, 1, \dots, k\}$ containing only those indices $j \in \{0, 1, \dots, k\}$ for which $I_j \cap J_{1/n} \neq \emptyset$. Then

$$\sum_{j \notin S} [\sup\{f(x) : x \in I_j\} - \inf\{f(x) : x \in I_j\}] \, l(I_j) \leq \frac{1}{n} \sum_{j \notin S} l(I_j) \leq \frac{b - a}{n} ,$$

$$\sum_{j \in S} [\sup\{f(x) : x \in I_j\} - \inf\{f(x) : x \in I_j\}] \, l(I_j) \leq (M - m) \sum_{j \in S} l(I_j) \leq \frac{M - m}{n} ,$$

where $M := \sup\{f(x) : x \in [a, b]\}$ and $m := \inf\{f(x) : x \in [a, b]\}$, and hence

$$U(f, P) - L(f, P) \leq \frac{b - a + M - m}{n} .$$

Therefore

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) \leq \inf(\{U(f, P) : P \in \mathcal{P}\}) \leq U(f, P) \leq L(f, P) + \frac{b - a + M - m}{n} \leq \sup(\{L(f, P) : P \in \mathcal{P}\}) + \frac{b - a + M - m}{n} .$$

Since this is true for any $n \in \mathbb{N}^*$ such that $1/n < (b - a)/2$, from this it follows by Theorem 4.4.4 that $f$ is Riemann-integrable.

5.3 Properties of the Determinant

Lemma 5.3.1. (Leibniz’ formula for the determinant) Let $n \in \mathbb{N}^*$ and $(a_1, \dots, a_n)$ be an $n$-tuple of elements of $\mathbb{R}^n$. Then

(i)

$$\det(a_1, \dots, a_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, a_{1\sigma(1)} \cdots a_{n\sigma(n)}$$

where $S_n$ denotes the set of permutations of $\{1, \dots, n\}$, i.e., the set of all bijections from $\{1, \dots, n\}$ to $\{1, \dots, n\}$, and

$$\mathrm{sign}(\sigma) := s(\sigma(1), \dots, \sigma(n)) = \prod_{i,j=1,\, i<j}^{n} \mathrm{sgn}(\sigma(j) - \sigma(i))$$

for all $\sigma \in S_n$. Note that $\mathrm{sign}(\sigma) = 1$ if the number of pairs $(i, j) \in \{1, \dots, n\}^2$ such that $i < j$ and $\sigma(j) < \sigma(i)$ is even, whereas $\mathrm{sign}(\sigma) = -1$ if that number is odd.

(ii)

$$\mathrm{sign}(\sigma) = \prod_{i,j=1,\, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i}$$

for all $\sigma \in S_n$.

(iii)

$$\mathrm{sign}(\tau \circ \sigma) = \mathrm{sign}(\tau) \, \mathrm{sign}(\sigma)$$

for all $\tau, \sigma \in S_n$.

(iv) In addition, let $n \geq 2$. Further, let $\tau$ be a transposition, i.e., an element of $S_n$ for which there are elements $k, l \in \{1, \dots, n\}$ such that $k \neq l$ and such that $\tau(i) = i$ for all $i \in \{1, \dots, n\} \setminus \{k, l\}$, $\tau(k) = l$ and $\tau(l) = k$. Then there is $\sigma \in S_n$ such that

$$\tau = \sigma \circ \tau_0 \circ \sigma^{-1}$$

where $\tau_0 \in S_n$ is the transposition defined by $\tau_0(i) := i$ for all $i \in \{1, \dots, n\} \setminus \{1, 2\}$, $\tau_0(1) := 2$ and $\tau_0(2) := 1$. Note that from this it follows by (iii) that

$$\mathrm{sign}(\tau) = \mathrm{sign}(\sigma \circ \tau_0 \circ \sigma^{-1}) = \mathrm{sign}(\sigma) \, \mathrm{sign}(\tau_0) \, \mathrm{sign}(\sigma^{-1}) = \mathrm{sign}(\tau_0) = -1 .$$

(v) In addition, let $n \geq 2$. For every $\sigma \in S_n$, there is $k \in \mathbb{N}^*$ and a sequence $\tau_1, \dots, \tau_k$ of transpositions in $S_n$ such that

$$\sigma = \tau_1 \circ \dots \circ \tau_k .$$

(vi) For every $i \in \{1, \dots, n\}$, define

$$\bar{a}_i := \sum_{j=1}^{n} a_{ji} \, e_j$$

where $e_1, \dots, e_n$ is the canonical basis of $\mathbb{R}^n$. Then

$$\det(\bar{a}_1, \dots, \bar{a}_n) = \begin{vmatrix} a_{11} & \cdots & a_{n1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} = \det(a_1, \dots, a_n) .$$

Proof. '(i)': The statement of (i) is a direct consequence of Definition 3.5.18 and the definitions given in (i).

'(ii)': For this, let $\sigma \in S_n$ and denote by $m$ the number of pairs $(i, j) \in \{1, \dots, n\}^2$ such that $i < j$ and $\sigma(j) < \sigma(i)$. Then

$$\prod_{i,j=1,\, i<j}^{n} (\sigma(j) - \sigma(i)) = \prod_{i,j=1,\, i<j,\, \sigma(i)<\sigma(j)}^{n} (\sigma(j) - \sigma(i)) \cdot (-1)^m \prod_{i,j=1,\, i<j,\, \sigma(j)<\sigma(i)}^{n} |\sigma(j) - \sigma(i)|$$
$$= (-1)^m \prod_{i,j=1,\, i<j}^{n} |\sigma(j) - \sigma(i)| = (-1)^m \prod_{i,j=1,\, i<j}^{n} (j - i)$$

where the last equality uses the bijectivity of $\sigma$. Hence it follows that

$$\mathrm{sign}(\sigma) = (-1)^m = \prod_{i,j=1,\, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i} .$$

'(iii)': First, it follows from (ii) that

$$\mathrm{sign}(\tau \circ \sigma) = \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{j - i} = \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{i,j=1,\, i<j}^{n} \frac{\sigma(j) - \sigma(i)}{j - i}$$
$$= \prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \mathrm{sign}(\sigma) .$$

Moreover,

$$\prod_{i,j=1,\, i<j}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} = \prod_{i,j=1,\, i<j,\, \sigma(i)<\sigma(j)}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{i,j=1,\, i<j,\, \sigma(j)<\sigma(i)}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)}$$
$$= \prod_{i,j=1,\, i<j,\, \sigma(i)<\sigma(j)}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} \cdot \prod_{i,j=1,\, i>j,\, \sigma(i)<\sigma(j)}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)}$$
$$= \prod_{i,j=1,\, \sigma(i)<\sigma(j)}^{n} \frac{(\tau \circ \sigma)(j) - (\tau \circ \sigma)(i)}{\sigma(j) - \sigma(i)} = \prod_{i,j=1,\, i<j}^{n} \frac{\tau(j) - \tau(i)}{j - i} = \mathrm{sign}(\tau)$$

where the last two equalities use the bijectivity of $\sigma$. Hence, finally, it follows that

$$\mathrm{sign}(\tau \circ \sigma) = \mathrm{sign}(\tau) \, \mathrm{sign}(\sigma) .$$

'(iv)': For this, let $k, l \in \{1, \dots, n\}$ be such that $k \neq l$ and such that $\tau(i) = i$ for all $i \in \{1, \dots, n\} \setminus \{k, l\}$, $\tau(k) = l$ and $\tau(l) = k$. Further, let $\sigma$ be some element of $S_n$ such that $\sigma(1) = k$ and $\sigma(2) = l$. Then

$$(\sigma \circ \tau_0 \circ \sigma^{-1})(k) = (\sigma \circ \tau_0)(1) = \sigma(2) = l , \quad (\sigma \circ \tau_0 \circ \sigma^{-1})(l) = (\sigma \circ \tau_0)(2) = \sigma(1) = k$$

and for $i \in \{1, \dots, n\} \setminus \{k, l\}$

$$(\sigma \circ \tau_0 \circ \sigma^{-1})(i) = (\sigma \circ \sigma^{-1})(i) = i .$$

'(v)': If $\sigma$ coincides with the identity transformation on $\{1, \dots, n\}$, then $\sigma = \tau \circ \tau$ for any transposition $\tau \in S_n$. If $\sigma$ differs from the identity transformation on $\{1, \dots, n\}$, then there is $i_1 \in \{1, \dots, n\}$ such that $\sigma(i) = i$ for all $i \in \{1, \dots, i_1 - 1\}$, where we define $\{1, \dots, 0\} := \emptyset$, and $\sigma(i_1) \neq i_1$. The last implies that $\sigma(i_1) > i_1$. We define the transposition $\tau_1 \in S_n$ by $\tau_1(i_1) := \sigma(i_1)$, $\tau_1(\sigma(i_1)) := i_1$ and $\tau_1(i) := i$ for all $i \in \{1, \dots, n\} \setminus \{i_1, \sigma(i_1)\}$. Then $\sigma_1 := \tau_1 \circ \sigma$ satisfies $\sigma_1(i) = i$ for all $i \in \{1, \dots, i_1\}$. Continuing this process, we arrive after a finite number of steps at a sequence of transpositions $\tau_1, \dots, \tau_k$ in $S_n$, where $k$ is some element of $\mathbb{N}^*$, such that

$$\mathrm{id}_{\{1, \dots, n\}} = \tau_k \circ \dots \circ \tau_1 \circ \sigma .$$

Then

$$\sigma = \tau_1^{-1} \circ \dots \circ \tau_k^{-1} = \tau_1 \circ \dots \circ \tau_k .$$

'(vi)': It follows by (i), (iii) that

$$\det(a_1, \dots, a_n) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, a_{1\sigma(1)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, a_{\sigma^{-1}(\sigma(1))\, \sigma(1)} \cdots a_{\sigma^{-1}(\sigma(n))\, \sigma(n)}$$
$$= \sum_{\sigma \in S_n} \mathrm{sign}(\sigma^{-1}) \, a_{\sigma^{-1}(1)\, 1} \cdots a_{\sigma^{-1}(n)\, n} = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, a_{\sigma(1)\, 1} \cdots a_{\sigma(n)\, n}$$
$$= \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, \bar{a}_{1\sigma(1)} \cdots \bar{a}_{n\sigma(n)} = \det(\bar{a}_1, \dots, \bar{a}_n) .$$

Theorem 5.3.2. (Properties of the determinant) Let $n \in \mathbb{N}^*$, $e_1, \dots, e_n$ the canonical basis of $\mathbb{R}^n$, $(a_1, \dots, a_n)$ an $n$-tuple of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$, $a_i' \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$ and $j \in \{1, \dots, n\}$ such that $j > i$. Then

(i) $\det(e_1, \dots, e_n) = 1$,

(ii)

$$\det(a_1, \dots, a_i + a_i', \dots, a_n) = \det(a_1, \dots, a_i, \dots, a_n) + \det(a_1, \dots, a_i', \dots, a_n) ,$$
$$\det(a_1, \dots, \alpha a_i, \dots, a_n) = \alpha \det(a_1, \dots, a_n) ,$$

(iii) if $n \geq 2$, then

$$\det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = - \det(a_1, \dots, a_j, \dots, a_i, \dots, a_n) ,$$

(iv) if $n \geq 2$ and $a_i = a_j$, then

$$\det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = 0 .$$

Proof. '(i)':

$$\det(e_1, \dots, e_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, e_{1k_1} \cdots e_{nk_n} = s(1, \dots, n) = \prod_{i,j=1,\, i<j}^{n} \mathrm{sgn}(j - i) = 1 .$$

'(ii)':

$$\det(a_1, \dots, a_i + a_i', \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1k_1} \cdots (a_{ik_i} + a'_{ik_i}) \cdots a_{nk_n}$$
$$= \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1k_1} \cdots a_{ik_i} \cdots a_{nk_n} + \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1k_1} \cdots a'_{ik_i} \cdots a_{nk_n}$$
$$= \det(a_1, \dots, a_i, \dots, a_n) + \det(a_1, \dots, a_i', \dots, a_n) ,$$

$$\det(a_1, \dots, \alpha a_i, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1k_1} \cdots (\alpha a)_{ik_i} \cdots a_{nk_n}$$
$$= \alpha \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_n) \, a_{1k_1} \cdots a_{ik_i} \cdots a_{nk_n} = \alpha \det(a_1, \dots, a_i, \dots, a_n) .$$

'(iii)': For this, we define the $n$-tuple $(b_1, \dots, b_n)$ of elements of $\mathbb{R}^n$ by $b_k := a_k$ if $k \in \{1, \dots, n\} \setminus \{i, j\}$, $b_i := a_j$ and $b_j := a_i$. Then it follows by Lemma 5.3.1 (iv) that

$$\det(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n) \, a_{1k_1} \cdots a_{ik_i} \cdots a_{jk_j} \cdots a_{nk_n}$$
$$= \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n) \, b_{1k_1} \cdots b_{jk_i} \cdots b_{ik_j} \cdots b_{nk_n}$$
$$= \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_j, \dots, k_i, \dots, k_n) \, b_{1k_1} \cdots b_{jk_j} \cdots b_{ik_i} \cdots b_{nk_n}$$
$$= - \sum_{k_1, \dots, k_n = 1}^{n} s(k_1, \dots, k_i, \dots, k_j, \dots, k_n) \, b_{1k_1} \cdots b_{ik_i} \cdots b_{jk_j} \cdots b_{nk_n} = - \det(a_1, \dots, a_j, \dots, a_i, \dots, a_n) .$$

'(iv)': The statement of (iv) is a simple consequence of (iii).

Theorem 5.3.3. (Uniqueness of the determinant) Let $n \in \mathbb{N}^*$ and $w$ be a map which associates to every $n$-tuple of elements of $\mathbb{R}^n$ a real number. In particular, let $w$ be such that

(i) for the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$

$$w(e_1, \dots, e_n) = 1 ,$$

(ii) for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$, $a_i' \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$

$$w(a_1, \dots, a_i + a_i', \dots, a_n) = w(a_1, \dots, a_i, \dots, a_n) + w(a_1, \dots, a_i', \dots, a_n) ,$$
$$w(a_1, \dots, \alpha a_i, \dots, a_n) = \alpha \, w(a_1, \dots, a_n) ,$$

(iii) if $n \geq 2$, for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, n\}$ such that $j > i$

$$w(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = - w(a_1, \dots, a_j, \dots, a_i, \dots, a_n) .$$

Then $w = \det$.

Proof. For this, let $(a_1, \dots, a_n)$ be an $n$-tuple of elements of $\mathbb{R}^n$. Then it follows by (ii), (iii) that

$$w(a_1, \dots, a_n) = \sum_{k_1, \dots, k_n = 1}^{n} a_{1k_1} \cdots a_{nk_n} \, w(e_{k_1}, \dots, e_{k_n}) = \sum_{\sigma \in S_n} a_{1\sigma(1)} \cdots a_{n\sigma(n)} \, w(e_{\sigma(1)}, \dots, e_{\sigma(n)}) .$$

Further, it follows by Lemma 5.3.1 (v), (iii) and (i) that

$$w(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \det(e_{\sigma(1)}, \dots, e_{\sigma(n)})$$

and therefore, finally, that

$$w(a_1, \dots, a_n) = \det(a_1, \dots, a_n) .$$

Theorem 5.3.4. (Bases of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$.

(i) Let $r \in \mathbb{N}^*$ and $v_1, \dots, v_r$ be a basis of $\mathbb{R}^n$, i.e., a sequence of vectors in $\mathbb{R}^n$ which is such that for every $w \in \mathbb{R}^n$, there is a unique $r$-tuple $(\alpha_1, \dots, \alpha_r)$ of real numbers such that

$$w = \sum_{k=1}^{r} \alpha_k v_k .$$

Then $r = n$.

(ii) In addition, let $r \in \mathbb{N}^*$ and $v_1, \dots, v_r \in \mathbb{R}^n$ be no basis of $\mathbb{R}^n$, but be linearly independent, i.e., such that the equation

$$\sum_{k=1}^{r} \alpha_k v_k = 0$$

for some real $\alpha_1, \dots, \alpha_r$ implies that $\alpha_1 = \dots = \alpha_r = 0$. Then there are $m \in \mathbb{N}^*$ and vectors $w_1, \dots, w_m$ in $\mathbb{R}^n$ such that $v_1, \dots, v_r, w_1, \dots, w_m$ is a basis of $\mathbb{R}^n$.

(iii) Let $v_1, \dots, v_n \in \mathbb{R}^n$ be linearly independent. Then $v_1, \dots, v_n$ is a basis of $\mathbb{R}^n$.

Proof. '(i)': Since $v_1, \dots, v_r$ and the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ are both bases, there are real numbers $\alpha_{ik}$, $i \in \{1, \dots, r\}$, $k \in \{1, \dots, n\}$, and $\beta_{jl}$, $j \in \{1, \dots, n\}$, $l \in \{1, \dots, r\}$, such that

$$v_i = \sum_{k=1}^{n} \alpha_{ik} e_k , \quad e_j = \sum_{l=1}^{r} \beta_{jl} v_l$$

for every $i \in \{1, \dots, r\}$ and $j \in \{1, \dots, n\}$. For such $i, j$, it follows that

$$v_i = \sum_{k=1}^{n} \alpha_{ik} e_k = \sum_{k=1}^{n} \sum_{l=1}^{r} \alpha_{ik} \beta_{kl} v_l , \quad e_j = \sum_{l=1}^{r} \beta_{jl} v_l = \sum_{k=1}^{n} \sum_{l=1}^{r} \beta_{jl} \alpha_{lk} e_k$$

and hence that

$$\sum_{k=1}^{n} \alpha_{ik} \beta_{ki} = 1 , \quad \sum_{l=1}^{r} \beta_{jl} \alpha_{lj} = 1 .$$

This implies that

$$r = \sum_{i=1}^{r} \sum_{k=1}^{n} \alpha_{ik} \beta_{ki} = \sum_{j=1}^{n} \sum_{l=1}^{r} \beta_{jl} \alpha_{lj} = n .$$

'(ii)': Since the canonical basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ is a basis, it follows that every element of $\mathbb{R}^n$ can be represented as a linear combination of the vectors $v_1, \dots, v_r, e_1, \dots, e_n$. In a first step, we consider the sequence of vectors $v_1, \dots, v_r, e_1$. If $e_1$ is a linear combination of $v_1, \dots, v_r$, then we drop $e_1$ from the sequence $v_1, \dots, v_r, e_1, \dots, e_n$ and still every element of $\mathbb{R}^n$ can be represented as a linear combination of the vectors from the remaining sequence. Otherwise, we keep $e_1$ in the sequence. Note that in this case $v_1, \dots, v_r, e_1$ are linearly independent. Continuing this process, we arrive at a sequence of vectors $w_1, \dots, w_m$ in $\mathbb{R}^n$, where $m \in \mathbb{N}^*$, such that $v_1, \dots, v_r, w_1, \dots, w_m$ is linearly independent and such that every element of $\mathbb{R}^n$ can be represented as a linear combination of its members. This also implies that $v_1, \dots, v_r, w_1, \dots, w_m$ is a basis of $\mathbb{R}^n$.

'(iii)': The proof is indirect. Assume that $v_1, \dots, v_n$ is no basis of $\mathbb{R}^n$. Then by (ii) $v_1, \dots, v_n$ can be extended to a basis by adding a non-zero number of vectors from $\mathbb{R}^n$. That basis has at least $n + 1$ members, which contradicts (i).

Definition 5.3.5. (The determinant of a linear map) Let $n \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^n$ be linear, i.e., such that

$$A(x + y) = A(x) + A(y) , \quad A(\alpha x) = \alpha A(x)$$

for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Then, obviously, by

$$w(a_1, \dots, a_n) := \det(A(a_1), \dots, A(a_n))$$

for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$, there is given a map $w$ satisfying the conditions (ii) and (iii) in Theorem 5.3.3. Hence according to that theorem, $w$ is a multiple of $\det$. In the following, we call the corresponding factor the determinant of $A$ and denote it by $\det(A)$. By definition, it follows that

$$\det(A(a_1), \dots, A(a_n)) = \det(A) \cdot \det(a_1, \dots, a_n)$$

for every $n$-tuple $(a_1, \dots, a_n)$ of elements of $\mathbb{R}^n$ and hence that

$$\det(A) = \det(A(e_1), \dots, A(e_n))$$

where $e_1, \dots, e_n$ is the canonical basis of $\mathbb{R}^n$.

Theorem 5.3.6. Let $n \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^n$, $B : \mathbb{R}^n \to \mathbb{R}^n$ be linear. Then

(i) $\det(A \circ B) = \det(A) \cdot \det(B)$,

(ii) $A$ is bijective if and only if $\det(A) \neq 0$.

Proof. For this, let $e_1, \dots, e_n$ be the canonical basis of $\mathbb{R}^n$.

'(i)': From Definition 5.3.5, it follows that

$$\det(A \circ B) = \det((A \circ B)(e_1), \dots, (A \circ B)(e_n)) = \det(A(B(e_1)), \dots, A(B(e_n)))$$
$$= \det(A) \cdot \det(B(e_1), \dots, B(e_n)) = \det(A) \cdot \det(B) .$$

'(ii)': If $A$ is bijective, then

$$\det(A^{-1}) \cdot \det(A) = \det(A^{-1} \circ A) = \det(\mathrm{id}_{\mathbb{R}^n}) = \det(e_1, \dots, e_n) = 1$$

and hence $\det(A) \neq 0$. On the other hand, if

$$\det(A) = \det(A(e_1), \dots, A(e_n)) \neq 0$$

and $\alpha_1, \dots, \alpha_n$ are real numbers such that

$$\sum_{k=1}^{n} \alpha_k A(e_k) = 0 ,$$

then $\alpha_1 = \dots = \alpha_n = 0$. Otherwise, there is $i \in \{1, \dots, n\}$ such that $\alpha_i \neq 0$ and

$$A(e_i) = - \sum_{k=1,\, k \neq i}^{n} \frac{\alpha_k}{\alpha_i} A(e_k)$$

and hence

$$\det(A) = \det(A(e_1), \dots, A(e_n)) = 0 .$$

Hence the vectors $A(e_1), \dots, A(e_n)$ are linearly independent and constitute a basis of $\mathbb{R}^n$. Therefore, for every $v \in \mathbb{R}^n$, there are real $\alpha_1, \dots, \alpha_n$ such that

$$v = \sum_{k=1}^{n} \alpha_k A(e_k) = A\left( \sum_{k=1}^{n} \alpha_k e_k \right)$$

and hence $A$ is surjective. Finally, we show that $A$ is also injective. For this, assume that there are $v, w \in \mathbb{R}^n$ such that $Av = Aw$. Then

$$0 = A(v - w) = A\left( \sum_{k=1}^{n} (v_k - w_k) e_k \right) = \sum_{k=1}^{n} (v_k - w_k) A(e_k) .$$

Since $A(e_1), \dots, A(e_n)$ are linearly independent, this implies that $v_k = w_k$ for every $k \in \{1, \dots, n\}$ and hence that $v = w$.
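The two statements of Theorem 5.3.6 can be checked on small examples. The following is a minimal Python sketch and not part of the original text: the helper names `sign`, `det` and `matmul` are illustrative assumptions, `det` evaluates Leibniz' formula from Lemma 5.3.1 (i) with the rows of the matrix playing the role of the vectors $a_1, \dots, a_n$, and each linear map is assumed to be given by its matrix, with composition computed by the usual row-by-column product.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images.

    Counts the pairs i < j with perm[j] < perm[i], as in Lemma 5.3.1 (i).
    """
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[j] < perm[i]
    )
    return -1 if inversions % 2 else 1


def det(rows):
    """Leibniz formula: sum over sigma of sign(sigma) * prod_i rows[i][sigma(i)]."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= rows[i][perm[i]]
        total += term
    return total


def matmul(b, a):
    """Row-by-column product of two matrices (matrix of the composed map)."""
    return [
        [sum(b[i][k] * a[k][j] for k in range(len(a))) for j in range(len(a[0]))]
        for i in range(len(b))
    ]


# Hypothetical sample data with integer entries, so all arithmetic is exact.
A = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
B = [[1, 2, 0], [0, 1, 1], [4, 0, 1]]
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]  # second row is twice the first row

print(det(A), det(B), det(matmul(A, B)))  # multiplicativity, part (i)
print(det(S))  # linearly dependent rows, so the map is not bijective, part (ii)
```

With these sample matrices the script prints `5 9 45` and `0`. The sum over all $n!$ permutations is only practical for small $n$; it is used here because it mirrors the formula of the lemma, not because it is an efficient way to compute determinants.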
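Definition 5.3.7. (Linear maps) Let $n, m \in \mathbb{N}^*$ and $A : \mathbb{R}^n \to \mathbb{R}^m$.

(i) We say that $A$ is linear if

$$A(x + y) = A(x) + A(y) , \quad A(\alpha x) = \alpha A(x)$$

for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$.

(ii) If $A$ is linear,

$$A(x) = A\left( \sum_{j=1}^{n} x_j e^n_j \right) = \sum_{j=1}^{n} x_j A(e^n_j) = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} x_j e^m_i$$

where $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$ denote the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and for every $i = 1, \dots, m$, $j = 1, \dots, n$, $A_{ij}$ denotes the component of $A(e^n_j)$ in the direction of $e^m_i$, so that $A$ is determined by its values on the canonical basis of $\mathbb{R}^n$. On the other hand, obviously, if $(A_{ij})_{(i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}}$ is a given family of real numbers, then by

$$A(x) := \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} x_j e^m_i$$

for all $x \in \mathbb{R}^n$, there is defined a linear map $A : \mathbb{R}^n \to \mathbb{R}^m$. Interpreting the elements of $\mathbb{R}^n$ and $\mathbb{R}^m$ as column vectors and defining the $m \times n$ matrix $M_A$ by

$$M_A := \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix} ,$$

the last is equivalent to

$$A(x) := M_A \cdot x = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

where the multiplication sign denotes a particular case of the matrix multiplication defined below,

$$(M_A \cdot x)_i := \sum_{j=1}^{n} A_{ij} x_j$$

for every $x \in \mathbb{R}^n$ and every $i = 1, \dots, m$. In this case, we call $M_A$ the representation matrix of $A$ with respect to the bases $e^n_1, \dots, e^n_n$ and $e^m_1, \dots, e^m_m$.

(iii) If $A$ is linear with representation matrix $M_A$, $l \in \mathbb{N}^*$ and $B : \mathbb{R}^m \to \mathbb{R}^l$ is linear with representation matrix $M_B$, it follows that

$$(B \circ A)(x) = B(A(x)) = \sum_{i=1}^{l} \sum_{k=1}^{m} B_{ik} (A(x))_k \, e^l_i = \sum_{i=1}^{l} \sum_{k=1}^{m} B_{ik} \sum_{j=1}^{n} A_{kj} x_j \, e^l_i = \sum_{i=1}^{l} \sum_{j=1}^{n} \left( \sum_{k=1}^{m} B_{ik} A_{kj} \right) x_j \, e^l_i$$

for every $x \in \mathbb{R}^n$ and hence that the representation matrix $M_{B \circ A}$ of $B \circ A$ is given by

$$M_{B \circ A} = \begin{pmatrix} \sum_{k=1}^{m} B_{1k} A_{k1} & \cdots & \sum_{k=1}^{m} B_{1k} A_{kn} \\ \vdots & & \vdots \\ \sum_{k=1}^{m} B_{lk} A_{k1} & \cdots & \sum_{k=1}^{m} B_{lk} A_{kn} \end{pmatrix} .$$

For this reason, we define the matrix product $M_B \cdot M_A$ of $M_B$ and $M_A$ such that

$$M_B \cdot M_A = M_{B \circ A} .$$

Hence

$$M_B \cdot M_A := \begin{pmatrix} \sum_{k=1}^{m} B_{1k} A_{k1} & \cdots & \sum_{k=1}^{m} B_{1k} A_{kn} \\ \vdots & & \vdots \\ \sum_{k=1}^{m} B_{lk} A_{k1} & \cdots & \sum_{k=1}^{m} B_{lk} A_{kn} \end{pmatrix} .$$

(iv) If $A$ is linear with representation matrix $M_A$ and $m = n$, we define the determinant $\det(M_A)$ of $M_A$ by

$$\det(M_A) := \det(A) .$$

In this case, it follows by Lemma 5.3.1 (vi) that

$$\det(M_A) = \det(A(e^n_1), \dots, A(e^n_n)) = \det\left( \sum_{i=1}^{n} A_{i1} e^n_i , \dots, \sum_{i=1}^{n} A_{in} e^n_i \right) = \begin{vmatrix} A_{11} & \cdots & A_{n1} \\ \vdots & & \vdots \\ A_{1n} & \cdots & A_{nn} \end{vmatrix} = \begin{vmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{vmatrix} .$$

If in addition $B : \mathbb{R}^n \to \mathbb{R}^n$ is linear with representation matrix $M_B$, it follows by Theorem 5.3.6 (i) that

$$\det(M_B \cdot M_A) = \det(M_B) \cdot \det(M_A) .$$

Lemma 5.3.8. (Sylvester’s criterion) Let $n \in \mathbb{N}^*$ and $A = (A_{ij})_{i,j \in \{1, \dots, n\}}$ be a real symmetric $n \times n$ matrix, i.e., such that $A_{ij} = A_{ji}$ for all $i, j \in \{1, \dots, n\}$. Then $A$ is positive definite, i.e.,

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j > 0$$

for all $h \in \mathbb{R}^n \setminus \{0\}$, if and only if all leading principal minors $\det(A_k)$, $k = 1, \dots, n$, of $A$ are $> 0$. Here

$$A_k := (A_{ij})_{i,j \in \{1, \dots, k\}} , \quad k \in \{1, \dots, n\} .$$

Proof. First, we derive an auxiliary result. For this, let $n \geq 2$, $\det(A_{n-1}) \neq 0$ and let $\alpha_1, \dots, \alpha_{n-1}$ be some real numbers. We define a linear map $T : \mathbb{R}^n \to \mathbb{R}^n$ by

$$T(h) := h - h_n . (\alpha_1, \dots, \alpha_{n-1}, 0)$$

for every $h \in \mathbb{R}^n$. In particular, $T$ is bijective with inverse

$$T^{-1}(\bar{h}) = \bar{h} + \bar{h}_n . (\alpha_1, \dots, \alpha_{n-1}, 0)$$

for every $\bar{h} \in \mathbb{R}^n$. Further, let $\bar{h} \in \mathbb{R}^n$ and $h := T^{-1}(\bar{h})$. Then

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j = \sum_{i,j=1}^{n-1} A_{ij} h_i h_j + 2 \sum_{i=1}^{n-1} A_{in} h_i h_n + A_{nn} h_n^2$$
$$= \sum_{i,j=1}^{n-1} A_{ij} (\bar{h}_i + \bar{h}_n \alpha_i)(\bar{h}_j + \bar{h}_n \alpha_j) + 2 \sum_{i=1}^{n-1} A_{in} (\bar{h}_i + \bar{h}_n \alpha_i) \bar{h}_n + A_{nn} \bar{h}_n^2$$
$$= \left( A_{nn} + 2 \sum_{i=1}^{n-1} A_{in} \alpha_i + \sum_{i,j=1}^{n-1} A_{ij} \alpha_i \alpha_j \right) \bar{h}_n^2 + 2 \bar{h}_n \sum_{i=1}^{n-1} \left( A_{in} + \sum_{j=1}^{n-1} \alpha_j A_{ij} \right) \bar{h}_i + \sum_{i,j=1}^{n-1} A_{ij} \bar{h}_i \bar{h}_j .$$

Since $\det(A_{n-1}) \neq 0$, the column vectors of $A_{n-1}$ are linearly independent, and hence there is a unique $(n-1)$-tuple $(\alpha_1, \dots, \alpha_{n-1})$ of real numbers such that

$$A_{in} + \sum_{j=1}^{n-1} \alpha_j A_{ij} = 0$$

for all $i \in \{1, \dots, n-1\}$. Therefore, by choosing these $\alpha_1, \dots, \alpha_{n-1}$, it follows that

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j = b_{nn} \bar{h}_n^2 + \sum_{i,j=1}^{n-1} A_{ij} \bar{h}_i \bar{h}_j \qquad (5.3.1)$$

where

$$b_{nn} := A_{nn} + 2 \sum_{i=1}^{n-1} A_{in} \alpha_i + \sum_{i,j=1}^{n-1} A_{ij} \alpha_i \alpha_j .$$

As a consequence, $A$ is positive definite if and only if

$$\bar{A} := \begin{pmatrix} A_{11} & \cdots & A_{1\,n-1} & 0 \\ \vdots & & \vdots & \vdots \\ A_{n-1\,1} & \cdots & A_{n-1\,n-1} & 0 \\ 0 & \cdots & 0 & b_{nn} \end{pmatrix}$$

is positive definite. Note that by Leibniz’s formula, Lemma 5.3.1 (i), it follows that

$$\det(\bar{A}) = b_{nn} \cdot \det(A_{n-1}) .$$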
Further, if $M$ denotes the representation matrix of $T$, then

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j = \sum_{i,j=1}^{n} \bar{A}_{ij} \bar{h}_i \bar{h}_j = \sum_{i,j=1}^{n} \bar{A}_{ij} (T(h))_i (T(h))_j = \sum_{i,j,k,l=1}^{n} \bar{A}_{ij} M_{ik} M_{jl} h_k h_l$$
$$= \sum_{i,j,k,l=1}^{n} M_{ki} \bar{A}_{kl} M_{lj} h_i h_j = \sum_{i,j=1}^{n} (M^t \cdot \bar{A} \cdot M)_{ij} h_i h_j ,$$

where $M^t := (M_{ji})_{i,j \in \{1, \dots, n\}}$, and hence, since $M^t \cdot \bar{A} \cdot M$ is symmetric,

$$A = M^t \cdot \bar{A} \cdot M .$$

Therefore, we conclude that

$$\det(\bar{A}) = b_{nn} \cdot \det(A_{n-1}) = [\det(M)]^{-2} \cdot \det(A) . \qquad (5.3.2)$$

With the help of the auxiliary result, the proof of the theorem proceeds by induction over $n$. The statement of the theorem is obviously true in the case $n = 1$. In the following, we assume that it is true for some $n \in \mathbb{N}^*$ and consider the case where $n$ is increased by $1$; here and below, (5.3.1) and (5.3.2) are applied with $n + 1$ in place of $n$. If $A$ is positive definite, it follows, in particular, that

$$\sum_{i,j=1}^{n} A_{ij} h_i h_j > 0$$

for all $h \in \mathbb{R}^n \setminus \{0\}$ and therefore also that $A_n$ is positive definite. As a consequence, according to the inductive assumption, the leading principal minors $\det(A_k)$, $k = 1, \dots, n$, are $> 0$. Further, it follows by (5.3.1) that $b_{nn} > 0$ and hence by (5.3.2) that $\det(A) > 0$. On the other hand, if all leading principal minors of $A$ are $> 0$, it follows by the inductive assumption that $A_n$ is positive definite and by (5.3.2) that $b_{nn} > 0$. Therefore, it follows by (5.3.1) that $A$ is positive definite.

5.4 The Inverse Mapping Theorem

Theorem 5.4.1. (Banach fixed point theorem for closed subsets of $\mathbb{R}^n$) Let $n \in \mathbb{N}^*$, $B$ be a non-empty closed subset of $\mathbb{R}^n$ and $f : B \to B$ be a contraction, i.e., such that

$$|f(x) - f(y)| \leq \alpha \cdot |x - y| \qquad (5.4.1)$$

for all $x, y \in B$ and some $\alpha \in [0, 1)$. Then $f$ has a unique fixed point, i.e., a unique $x^* \in B$ such that

$$f(x^*) = x^* .$$

Further,

$$|x - x^*| \leq \frac{|x - f(x)|}{1 - \alpha} \qquad (5.4.2)$$

and

$$\lim_{\nu \to \infty} f^\nu(x) = x^*$$

for every $x \in B$, where $f^\nu$ for $\nu \in \mathbb{N}$ is inductively defined by $f^0 := \mathrm{id}_B$ and $f^{k+1} := f \circ f^k$ for $k \in \mathbb{N}$.

Proof. Note that (5.4.1) implies that $f$ is continuous. Further, define $F : B \to \mathbb{R}$ by

$$F(x) := |x - f(x)|$$

for all $x \in B$. Now let $x \in B$.
Then it follows that

$$|f^{\nu + \mu'}(x) - f^{\nu + \mu}(x)| \leq \sum_{k=\mu}^{\mu' - 1} |f^{\nu + k + 1}(x) - f^{\nu + k}(x)| \leq \sum_{k=\mu}^{\mu' - 1} \alpha^{\nu + k} F(x) \leq \frac{\alpha^{\nu + \mu}}{1 - \alpha} \, F(x)$$

for all $\nu, \mu, \mu' \in \mathbb{N}$ such that $\mu' \geq \mu$. Hence the components of $(f^\nu(x))_{\nu \in \mathbb{N}}$ are Cauchy sequences. Hence it follows by Theorem 2.3.17, Theorem 3.5.47 and the closedness of $B$ that this sequence is convergent to some $x^* \in B$. Further, it follows by the continuity of $f$ that $x^*$ is a fixed point of $f$. In addition, if $\bar{x} \in B$ is some fixed point of $f$, then

$$|x^* - \bar{x}| = |f(x^*) - f(\bar{x})| \leq \alpha \cdot |x^* - \bar{x}|$$

and hence $\bar{x} = x^*$, since the assumption $\bar{x} \neq x^*$ leads to the contradiction that $1 \leq \alpha$. Finally, let $y$ be some element of $B$. Then it follows that

$$|y - x^*| = |y - f(x^*)| = |y - f(y) + f(y) - f(x^*)| \leq |y - f(y)| + |f(y) - f(x^*)| \leq F(y) + \alpha \cdot |y - x^*|$$

and hence (5.4.2).

Lemma 5.4.2. Let $n \in \mathbb{N}^*$, $\Omega$ be an open subset of $\mathbb{R}^n$ containing $0$, $f : \Omega \to \mathbb{R}^n$ be of class $C^1$, i.e., such that all corresponding component maps $f_1, \dots, f_n$ are of class $C^1$, and such that $f(0) = 0$ and $f'(0) = \mathrm{id}_{\mathbb{R}^n}$. Then there are open subsets $U$, $V$ of $\mathbb{R}^n$ such that $0 \in U \subset \Omega$, $0 \in V$ and

(i) $|f(x) - f(y)| \geq \frac{1}{2} \, |x - y|$ for all $x, y \in U$,

(ii) $f(U) = V$,

(iii) $f_0 : U \to V$ defined by $f_0(x) := f(x)$ for every $x \in U$ is bijective, and $f_0^{-1}$ is a continuous map which is differentiable in $0$ such that $(f_0^{-1})'(0) = \mathrm{id}_{\mathbb{R}^n}$.

Proof. For this, we define $F : \Omega \to \mathbb{R}^n$ by

$$F(x) := x - f(x)$$

for every $x \in \Omega$. Then $F$ is of class $C^1$ such that $F(0) = 0$ and $F'(0) = 0$. Let $\nu_0 \in \mathbb{N}^*$ be such that $B_{1/\nu_0}(0) \subset \Omega$, $i \in \{1, \dots, n\}$ and $F_i$ be the $i$-th component function corresponding to $F$. In particular, $|\nabla F_i|$ is continuous and $|(\nabla F_i)(0)| = 0$. If $x_\nu \in B_{1/\nu}(0)$ is such that

$$|(\nabla F_i)(x_\nu)| = \max_{x \in B_{1/\nu}(0)} |(\nabla F_i)(x)|$$

for every $\nu \in \mathbb{N}^*$ satisfying $\nu \geq \nu_0$, then $x_{\nu_0}, x_{\nu_0 + 1}, \dots$ is convergent to $0$ and hence the corresponding sequence

$$|(\nabla F_i)(x_{\nu_0})| , \, |(\nabla F_i)(x_{\nu_0 + 1})| , \dots$$

is convergent to $0$. As a consequence, there is $\nu_{0i} \in \mathbb{N}^*$, $\nu_{0i} \geq \nu_0$, such that

$$|(\nabla F_i)(x)| \leq \frac{1}{2\sqrt{n}}$$

for all $x \in U_{1/\nu_{0i}}(0)$. Since this is true for $i = 1, \dots, n$, we conclude that there is $\nu \in \mathbb{N}^*$, $\nu \geq \nu_0$, such that

$$|(\nabla F_i)(x)| \leq \frac{1}{2\sqrt{n}}$$

for all $i \in \{1, \dots, n\}$ and $x \in U_{1/\nu}(0)$. Further, according to Taylor’s formula, Theorem 4.3.6, for $i \in \{1, \dots, n\}$ and $x, y \in U_{1/\nu}(0)$, there is $\tau \in [0, 1]$ such that

$$F_i(x) - F_i(y) = (x - y) \cdot (\nabla F_i)(y + \tau (x - y)) .$$

Hence

$$|F_i(x) - F_i(y)| = |(x - y) \cdot (\nabla F_i)(y + \tau (x - y))| \leq |x - y| \cdot |(\nabla F_i)(y + \tau (x - y))| \leq \frac{1}{2\sqrt{n}} \, |x - y|$$

and therefore

$$|F(x) - F(y)| \leq \frac{1}{2} \, |x - y| .$$

As a consequence, it follows for $x, y \in U_{1/\nu}(0)$ that

$$|f(x) - f(y)| = |f(x) - x + x - y + y - f(y)| = |x - y + F(y) - F(x)| \geq |x - y| - |F(x) - F(y)| \geq \frac{1}{2} \, |x - y| . \qquad (5.4.3)$$

In the following, we define

$$U := U_{1/\nu}(0) , \quad V := f(U) .$$

Then $f_0 : U \to V$ defined by $f_0(x) := f(x)$ for every $x \in U$ is bijective. In the next step, we show that $V \subset \mathbb{R}^n$ is open. For this, let $y_0 \in V$ and $x_0 \in U$ be such that $y_0 = f(x_0)$. Further, let $r > 0$ be such that $B_r(x_0) \subset U$. In the following, we will show that $U_{r/2}(y_0) \subset V$. For this, let $y \in U_{r/2}(y_0)$. We define $F_y : B_r(x_0) \to \mathbb{R}^n$ by

$$F_y(x) := F(x) + y = x - f(x) + y$$

for every $x \in B_r(x_0)$. Then

$$|F_y(x) - F_y(\bar{x})| = |F(x) - F(\bar{x})| \leq \frac{1}{2} \, |x - \bar{x}|$$

for all $x, \bar{x} \in B_r(x_0)$ and

$$|F_y(x) - x_0| = |F_y(x) - F_y(x_0) + F_y(x_0) - x_0| \leq \frac{1}{2} \, |x - x_0| + |y - y_0| \leq r$$

for all $x \in B_r(x_0)$. Hence the restriction of $F_y$ in image to $B_r(x_0)$ is a contraction, and it follows by Theorem 5.4.1 the existence of a fixed point $x \in B_r(x_0)$ of that map. The last also implies that $f(x) = y$. Hence we conclude that

$$U_{r/2}(y_0) \subset f(B_r(x_0)) \subset V$$

and therefore that $V$ is open. Further, it follows by (5.4.3) that

$$|f_0^{-1}(f_0(x)) - f_0^{-1}(f_0(y))| \leq 2 \, |f_0(x) - f_0(y)|$$

for all $x, y \in U$ and hence that $f_0^{-1}$ is continuous. To simplify notation in the following, we define $g := f_0^{-1}$. Finally, let $y_1, y_2, \dots$ be some sequence in $V \setminus \{0\}$ that is convergent to $0$. Then

$$\frac{|g(y_\nu) - g(0) - (y_\nu - 0)|}{|y_\nu - 0|} = \frac{|f(g(y_\nu)) - g(y_\nu)|}{|y_\nu|} = \frac{|f(g(y_\nu)) - g(y_\nu)|}{|g(y_\nu) - g(0)|} \cdot \frac{|g(y_\nu) - g(0)|}{|y_\nu - 0|}$$
$$\leq 2 \, \frac{|f(g(y_\nu)) - f(g(0)) - (g(y_\nu) - g(0))|}{|g(y_\nu) - g(0)|} .$$

As a consequence of the continuity of $g$, it follows that $g(y_1), g(y_2), \dots$ is convergent to $0$.
Hence it follows from the last and the differentiability of $f$ in $0$ that

$$\lim_{\nu \to \infty} \frac{|g(y_\nu) - g(0) - (y_\nu - 0)|}{|y_\nu - 0|} = 0$$

and therefore that $g$ is differentiable in $0$ with derivative $\mathrm{id}_{\mathbb{R}^n}$.

Theorem 5.4.3. (Inverse mapping theorem) Let $n \in \mathbb{N}^*$, $U$ be an open subset of $\mathbb{R}^n$, $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., such that all corresponding component maps are of class $C^1$, and $x_0 \in U$ be such that $f'(x_0)$ is bijective. Then there are open subsets $U_{x_0} \subset U$, $V_{f(x_0)} \subset \mathbb{R}^n$ containing $x_0$ and $f(x_0)$, respectively, and such that $f|_{U_{x_0}}$ defines a bijection onto $V_{f(x_0)}$ whose inverse is differentiable.

Proof. First, we notice that the function $\det(f')$ which associates

$$\det(f'(x)) = \sum_{\sigma \in S_n} \mathrm{sign}(\sigma) \, \frac{\partial f_{\sigma(1)}}{\partial x_1}(x) \cdots \frac{\partial f_{\sigma(n)}}{\partial x_n}(x)$$

to every $x \in U$ is continuous. Hence it follows the existence of an open subset $U' \subset U$ containing $x_0$ and such that $f'(x)$ is bijective for every $x \in U'$. Otherwise, every open subset of $U$ containing $x_0$ also contains a point $x$ such that $f'(x)$ is not bijective and hence such that $\det(f'(x)) = 0$. Then it follows the existence of a sequence $x_1, x_2, \dots$ in $U$ that is convergent to $x_0$ and such that

$$\det(f'(x_\nu)) = 0$$

for every $\nu \in \mathbb{N}^*$. Since $\det(f')$ is continuous, this leads to

$$0 = \lim_{\nu \to \infty} \det(f'(x_\nu)) = \det(f'(x_0))$$

and hence to the fact that $f'(x_0)$ is not bijective, which contradicts the assumption. Further, let $\varepsilon > 0$ be such that $U_\varepsilon(x_0) \subset U'$. We define the auxiliary map $h : U_\varepsilon(0) \to \mathbb{R}^n$ by

$$h(x) := (f'(x_0))^{-1} ( f(x + x_0) - f(x_0) )$$

for every $x \in U_\varepsilon(0)$. In particular, $h$ is of class $C^1$, $h(0) = 0$ and $h'(0) = \mathrm{id}_{\mathbb{R}^n}$. Hence according to the previous lemma, there are open subsets $U$, $V$ of $\mathbb{R}^n$ such that $0 \in U \subset U_\varepsilon(0)$, $0 \in V$ and such that $h_0 : U \to V$ defined by $h_0(x) := h(x)$ for every $x \in U$ is bijective with an inverse $h_0^{-1}$ which is differentiable in $0$. Hence it follows that by $f_0(x) := f(x)$ for every $x \in U_0$, where

$$U_0 := \{ x + x_0 : x \in U \} , \quad V_0 := \{ f'(x_0)(y) + f(x_0) : y \in V \} ,$$

there is defined a bijective map $f_0 : U_0 \to V_0$ between open subsets $U_0$, $V_0$ of $\mathbb{R}^n$ whose inverse is differentiable in $f(x_0)$.
Further, by application of the previous reasoning to $f_0$ and every $x \in U_0 \setminus \{x_0\}$, it follows that the inverse of $f_0$ is differentiable.

The following lemma is of use in the proof of the change of variables formula for multiple integrals. The proof of the lemma uses methods analogous to those used in the proof of Theorem 5.4.3.

Lemma 5.4.4. Let $n \in \mathbb{N}^{*}$, $U$ be an open subset of $\mathbb{R}^n$ and $f : U \to \mathbb{R}^n$ be of class $C^1$, i.e., whose corresponding component maps $f_1, \dots, f_n$ are of class $C^1$, such that $f(0) = 0$, $f'(0) = \mathrm{id}_{\mathbb{R}^n}$. Finally, let $r > 0$ and $0 < \varepsilon < 1$ be such that $B_r(0) \subset U$ and
\[
\max_{i \in \{1,\dots,n\}} \sum_{j=1}^{n} \left| \frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(y) \right| \leq \frac{\varepsilon}{\sqrt{n}}
\]
for all $x, y \in B_r(0)$. Then for every $y \in B_{r(1-\varepsilon)}(0)$, there is a uniquely determined $x \in B_r(0)$ such that $f(x) = y$.

Proof. For this, let $y \in B_{r(1-\varepsilon)}(0)$. We define for every $x \in B_r(0)$ a corresponding $g_y(x)$ by
\[
g_y(x) := x - f(x) + y .
\]
By Taylor's formula, Theorem 4.3.6, it follows for every $x \in B_r(0)$, $i \in \{1,\dots,n\}$, the existence of $\tau \in [0,1]$ such that
\[
|f_i(x) - x_i| = |f_i(x) - f_i(0) - x \cdot (\nabla f_i)(0)| = |x \cdot (\nabla f_i)(\tau x) - x \cdot (\nabla f_i)(0)| \leq |x| \cdot |(\nabla f_i)(\tau x) - (\nabla f_i)(0)| \leq |x| \sum_{j=1}^{n} \left| \frac{\partial f_i}{\partial x_j}(\tau x) - \frac{\partial f_i}{\partial x_j}(0) \right| \leq \frac{r \varepsilon}{\sqrt{n}}
\]
and hence that
\[
|f(x) - x| \leq r \varepsilon .
\]
The last implies that
\[
|g_y(x)| \leq |x - f(x)| + |y| \leq r \varepsilon + r (1 - \varepsilon) = r
\]
and hence that the range of $g_y$ is part of $B_r(0)$. Further, it follows for $x_1, x_2 \in B_r(0)$ and $i \in \{1,\dots,n\}$ the existence of $\tau \in [0,1]$ such that
\[
|g_{yi}(x_1) - g_{yi}(x_2)| = |f_i(x_1) - x_1 \cdot (\nabla f_i)(0) - (f_i(x_2) - x_2 \cdot (\nabla f_i)(0))| = |f_i(x_1) - f_i(x_2) - (x_1 - x_2) \cdot (\nabla f_i)(0)|
\]
\[
= |(x_1 - x_2) \cdot (\nabla f_i)(x_1 + \tau (x_2 - x_1)) - (x_1 - x_2) \cdot (\nabla f_i)(0)| \leq |x_1 - x_2| \cdot |(\nabla f_i)(x_1 + \tau (x_2 - x_1)) - (\nabla f_i)(0)| \leq \frac{\varepsilon}{\sqrt{n}} \, |x_1 - x_2|
\]
and hence that
\[
|g_y(x_1) - g_y(x_2)| \leq \varepsilon \, |x_1 - x_2| .
\]
Since $\varepsilon < 1$, this implies that $g_y$ is a contraction and therefore has a unique fixed point $x \in B_r(0)$ according to Theorem 5.4.1. Since $x \in B_r(0)$ is a fixed point of $g_y$ if and only if $f(x) = y$, the statement of the lemma follows.
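The fixed-point argument in the proof of Lemma 5.4.4 can be tried out numerically. The following is a minimal sketch, not part of the text's development: it iterates the map $g_y(x) = x - f(x) + y$ from the proof, starting at $0$, until successive iterates agree to a tolerance. The function name `solve_by_contraction`, the sample map `example_map`, the target value and the tolerance are all illustrative assumptions. The sample map satisfies $f(0) = 0$ and $f'(0) = \mathrm{id}_{\mathbb{R}^2}$, and its partial derivatives vary only slightly near $0$, so $g_y$ can be expected to be a contraction there.

```python
def solve_by_contraction(f, y, tol=1e-12, max_iter=200):
    """Approximate a solution x of f(x) = y by iterating g_y(x) = x - f(x) + y.

    f maps a list of n floats to a list of n floats; y is a list of n floats.
    The iteration starts at 0 and stops once two successive iterates differ
    by less than tol in the maximum norm.
    """
    x = [0.0] * len(y)
    for _ in range(max_iter):
        fx = f(x)
        # One application of g_y, componentwise.
        x_new = [xi - fi + yi for xi, fi, yi in zip(x, fx, y)]
        step = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if step < tol:
            return x
    raise RuntimeError("fixed-point iteration did not converge")


def example_map(x):
    # Hypothetical sample map with f(0) = 0 and derivative id at 0.
    return [x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * x[0] ** 2]


y_target = [0.2, -0.1]
x_solution = solve_by_contraction(example_map, y_target)
# Size of f(x) - y in the maximum norm; small if the iteration succeeded.
residual = max(abs(fi - yi) for fi, yi in zip(example_map(x_solution), y_target))
print(x_solution, residual)
```

The iteration uses only evaluations of $f$, in line with the proof, which needs no inverse of $f'$ once $f'(0) = \mathrm{id}_{\mathbb{R}^n}$ has been arranged. For a map that does not satisfy a smallness condition of the kind stated in the lemma, the loop may fail to converge, which is why the sketch raises an error after `max_iter` steps.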
Index of Notation

$\neg$ , not
$\wedge$ , and
$\vee$ , or
$\Rightarrow$ , if . . . then
$\Leftrightarrow$ , . . . if and only if . . .
$\in$ , belongs to
$\notin$ , does not belong to
$\phi$ , empty set
$\mathbb{N}$ , $\{0, 1, 2, \dots\}$
$\mathbb{N}^{*}$ , $\{1, 2, \dots\}$
$\mathbb{Z}$ , $\{\dots, -2, -1, 0, 1, 2, \dots\}$
$\mathbb{Z}^{*}$ , $\{\dots, -2, -1, 1, 2, \dots\}$
$:$ , such that
$\mathbb{Q}$ , rational numbers
$\mathbb{Q}^{*}$ , non-zero rational numbers
$\mathbb{R}$ , real numbers
$\mathbb{R}^{*}$ , non-zero real numbers
$\subset$ , subset
$:=$ , per definition
$[a, b]$ , closed interval
$(a, b)$ , open interval
$(a, b]$ , half-open interval
$[a, b)$ , half-open interval
$[c, \infty)$ , unbounded closed interval
$(c, \infty)$ , unbounded open interval
$(-\infty, d]$ , unbounded closed interval
$(-\infty, d)$ , unbounded open interval
$\cup$ , union
$\cap$ , intersection
$\setminus$ , relative complement
$A \times B$ , Cartesian product of $A$ and $B$
$A_1 \times \dots \times A_n$ , $n$-fold Cartesian product of $A_1, \dots, A_n$
$\times_{i=1}^{n} A_i$ , $n$-fold Cartesian product of $A_1, \dots, A_n$
$A^n$ , $n$-fold Cartesian product of $A$
$\bigcup$ , union
$\bigcap$ , intersection
$D(f)$ , domain of $f$
$f(x)$ , value of $f$ at $x$
$f(x)$ , image of $x$ under $f$
$f(A')$ , image of $A'$ under $f$
$\mathrm{Ran}(f)$ , range or image of $f$
$f^{-1}(B')$ , inverse image of $B'$ under $f$
$f|_{A'}$ , restriction of $f$ to $A'$
$G(f)$ , graph of $f$
$f^{-1}$ , inverse map
$\circ$ , composition
$\mathrm{id}_C$ , identity map on $C$
$\lim_{n \to \infty}$ , limit
$\sup$ , supremum
$\inf$ , infimum
$\exp$ , exponential function
$e^x$ , exponential function
$\lim_{x \to a}$ , limit
$\lim_{x \to \infty}$ , limit
$\lim_{x \to -\infty}$ , limit
$f_1 + f_2$ , sum of functions
$a f_1$ , multiple of a function
$f_1 \cdot f_2$ , product of functions
$1/f_1$ , quotient of functions
$\approx$ , approximately
$\lim_{h \to 0, h \neq 0}$ , limit
$f'(x)$ , derivative of $f$ in $x$
$f'$ , derivative of $f$
$f^{(k)}$ , $k$-th derivative of $f$
$f''$ , 2nd order derivative of $f$
$f'''$ , 3rd order derivative of $f$
$\sinh$ , hyperbolic sine
$\cosh$ , hyperbolic cosine
$\tanh$ , hyperbolic tangent
$\sum_{k=m}^{n}$ , sum from $k = m$ to $n$
$n!$ , $n$ factorial
$\binom{n}{k}$ , binomial coefficient
$\int_a^b f(x) \, dx$ , integral of $f$ over $[a, b]$
$[F(x)]|_a^b$ , $F(b) - F(a)$
$\prod$ , product symbol
$\sum_{k=1}^{\infty} x_k$ , sum of a sequence
$\binom{\nu}{n}$ , binomial coefficient
$\overrightarrow{pq}$ , oriented line segment between $p$ and $q$
$S_r^n(a)$ , sphere of radius $r$ around $a$ in $\mathbb{R}^n$
$[\overrightarrow{pq}]$ , vector associated to $\overrightarrow{pq}$
$[\overrightarrow{pq}] + [\overrightarrow{rs}]$ , sum of vectors
$\lambda . [\overrightarrow{pq}]$ , scalar multiple of a vector
$|[\overrightarrow{pq}]|$ , length of a vector
$[\overrightarrow{pq}] \cdot [\overrightarrow{rs}]$ , scalar product of vectors
$a \times b$ , vector product of vectors in $\mathbb{R}^3$
$\mathrm{sgn}$ , signum function
$U_\varepsilon(x)$ , open ball of radius $\varepsilon$ centered at $x$
$B_\varepsilon(x)$ , closed ball of radius $\varepsilon$ centered at $x$
$S_\varepsilon(x)$ , sphere of radius $\varepsilon$ centered at $x$
$a . f_1$ , multiple of a function
$g \circ f$ , composition of functions
$\frac{\partial f}{\partial x_i}$ , partial derivative
$(\nabla f)(x)$ , gradient of $f$ in $x$
$C^p$ , continuously partially differentiable up to order $p$
$C^\infty$ , $C^p$ for all $p \in \mathbb{N}^{*}$
$\nabla$ , gradient operator
$\triangle$ , Laplace operator
$f . g$ , product of functions
$\int_I f \, dv$ , integral of $f$ on $I$
$\int_r F \cdot dr$ , path integral of $F$ along $r$
$\int_S F \cdot dS$ , flux of $F$ across $S$
$S_n$ , permutation group
$\mathrm{sign}$ , signum function