1 Mathematics of Change

Modeling the change of physical systems is central to the applications of mathematics: the speedometer in your car, which measures the change in position of your car over time; the stock market, which models the change in the value of stocks over time; and the motion of a satellite, to mention a few. Modeling of change was first addressed by the Italian mathematician Galileo (1564 - 1642) in the early part of the 17th century in the context of describing the motion of a falling ball. Later in the century Isaac Newton (1643 - 1727) developed a comprehensive theory in his famous work Philosophiae naturalis principia mathematica (1687), in which he applied his ideas to many areas, including celestial mechanics: the motions of the earth, the moon, and the planets.

The foundation of all mathematics lies with the work of the early Greek mathematicians, in particular Euclid (325 - 265 B.C.) and Archimedes (287 - 212 B.C.). But surprisingly enough, a mathematical theory of motion waited for nearly 2,000 years. One reason perhaps is that the ideas of the famous philosopher Aristotle (384 - 322 B.C.) carried such weight that progress beyond his analysis of motion, which contained major errors, was difficult.

1.1 Driving to Montreal

When I drive from Toronto to Montreal, a distance of roughly 500 km, and the trip takes 5 hours, I calculate my average speed to be 100 km/hr; that is, average speed = distance/time. Midway along I notice a sign saying that the distance to the nearest rest station is 40 km; I'm interested because I need gas. Twenty minutes later I roll up to the gas station, and I am able to calculate the average speed for this portion of the trip to be 120 km/hr.

So we all know how to calculate average speed over some specific time interval, but as I sit in the car and look at the speedometer, I have a notion of the instantaneous speed of the car. But what is this? How is instantaneous speed measured from one moment to the next?
The speedometer, as we know, measures only an approximation: it measures the distance traveled over a small interval of time, say 1 second. It does this by measuring the number of revolutions of the axle in one second. This translates to a certain number of revolutions of the wheels in one second, which in turn translates to a certain distance traveled in one second. So our impression of instantaneous speed is gained by simply taking the time interval over which we measure distance to be small enough. But what we find then is still an average speed.

I have just measured the diameter of the wheels on my car to be 60 cm. Thus with one revolution of the wheels, my car travels $0.6\pi$ meters. So with 50 revolutions in a second, it travels $30\pi$ meters in a second, or $\frac{3600}{1000}(30\pi) = 108\pi \approx 339$ km/hr.

The question now is, what is instantaneous speed? We certainly have the concept within us. One way to define it would be to take the readings of increasingly accurate speedometers, ones that measure revolutions over every tenth or every thousandth of a second. This involves looking at some sort of limit: the limit of the readings of ever more accurate speedometers.

Figure 1: average speed of falling object

1.2 Falling objects

It has been shown experimentally that an object that has been let fall travels a distance of $f(t) = 4.9t^2$ meters in $t$ seconds. Is there a way to analytically determine the instantaneous speed, otherwise known as velocity, at any time after the object has been let fall? Yes, there is. In essence we simply measure the average speed over increasingly small intervals of time; that is, we look at increasingly accurate speedometers. Suppose the time at which we wish to determine the velocity is $t_0$ seconds after the object has been let fall, and let's consider a small time interval over which we can calculate the average speed, say the interval from $t_0$ to $t_0 + 1/n$, where $n$ is some large number.
So we need to calculate the distance traveled over this small amount of time divided by the time elapsed, which is $t_0 + 1/n - t_0 = 1/n$. In figure 1 the amount fallen between the two times $t = 0.5$ and $t = 1.5$ is represented by the vertical distance between the points of intersection of the chord (the magenta line) with the graph. The time elapsed is the horizontal distance between the points. The slope of the magenta line is then the distance divided by the time. It is the average speed over the time interval $[0.5, 1.5]$.

Figure 2: Average speed lines approach tangent

In figure 2 a second line, in green, is drawn which corresponds to the smaller time interval $[1, 1.15]$. As the interval shrinks until the difference of the times is close to zero, the corresponding line approaches the blue line tangent to the graph.

In more generality, let's compute the average speed of the object over the time interval $[t_0, t_0 + \frac{1}{n}]$; in other words, the distance traveled divided by the length of the time interval. From a geometric point of view this becomes the task of computing the slope of the line joining the points $(t_0, f(t_0))$ and $(t_0 + \frac{1}{n}, f(t_0 + \frac{1}{n}))$. We get

$$\frac{f(t_0 + \frac{1}{n}) - f(t_0)}{t_0 + \frac{1}{n} - t_0} = \frac{4.9(t_0 + \frac{1}{n})^2 - 4.9t_0^2}{\frac{1}{n}} = \frac{4.9t_0^2 + 9.8\frac{t_0}{n} + \frac{4.9}{n^2} - 4.9t_0^2}{\frac{1}{n}} = 9.8t_0 + \frac{4.9}{n}.$$

As $n$ gets very large this number gets closer and closer to $9.8t_0$. We express this by saying that the limit of the quotient as $n$ goes to infinity is $9.8t_0$. We write

$$\lim_{n\to\infty} \frac{f(t_0 + \frac{1}{n}) - f(t_0)}{t_0 + \frac{1}{n} - t_0} = 9.8t_0.$$

1.3 The derivative

What we have done above carries over to any function, provided it is possible to have a unique tangent to the graph at the point of interest. So instead of a function describing falling objects, let's consider an arbitrary function $f : A \to \mathbb{R}$ defined on some open interval $A$, and suppose that $f$ is such that at each point $(x, f(x))$ on the graph of $f$ there is a unique tangent.
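The falling-object limit from section 1.2 can also be checked numerically. The following Python sketch evaluates the average speed over $[t_0, t_0 + 1/n]$ for growing $n$; the function names and the sample time $t_0 = 2$ are illustrative assumptions, not part of the text.

```python
def distance_fallen(t):
    """Distance in meters fallen after t seconds: f(t) = 4.9 t^2."""
    return 4.9 * t ** 2


def average_speed(t0, n):
    """Average speed in m/s over the time interval [t0, t0 + 1/n]."""
    dt = 1.0 / n
    return (distance_fallen(t0 + dt) - distance_fallen(t0)) / dt


t0 = 2.0  # hypothetical sample time, chosen only for illustration
for n in (1, 10, 100, 1000, 10000):
    # Each value should equal 9.8 * t0 + 4.9 / n up to rounding error.
    print(n, average_speed(t0, n))
print("limiting value 9.8 * t0 =", 9.8 * t0)
```

The printed averages differ from $9.8t_0$ by the term $4.9/n$, which shrinks as $n$ grows; this is the same behaviour the algebra in section 1.2 shows.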
Just as before we modeled the rate of change in distance fallen as a function of time, let's do the same thing for $f$. Given a point $a \in A$ we will model the instantaneous rate of change of $f$ at the point $a$, and as before we do this by first setting up an expression for an average rate of change as the slope of a line joining two points of the graph.

In figure 3 we compute the average rate of change from a point $a$ to a point $a + h$ as the slope of the red line, where $h$ is some small nonzero real number, negative or positive, such that $a + h$ remains an element of the domain $A$. The instantaneous rate of change at $a$ is then the slope of the green line, which is tangent to the graph at the point $(a, f(a))$. As $h$ gets smaller and smaller, the red line approaches the green. That is, the slope of the green tangent line is determined by the slopes of the red lines as a limit as $h$ gets very small. The instantaneous rate of change of $f$ at $a$ is also called the derivative of $f$ at $a$, and is written $f'(a)$. We summarize as follows.

Figure 3: derivative at a is slope of green tangent

Definition 1 Given a function $f : A \to \mathbb{R}$ defined on some open interval $A$, let $a \in A$ and suppose that the graph of $f$ possesses a unique tangent line at the point $(a, f(a))$. The derivative of $f$ at $a \in A$ is defined to be the slope of the tangent to the graph of $f$ at the point $(a, f(a))$. It is calculated as the limit

$$f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h}.$$

Given a function $f$ as above, if the derivative exists at each point in the domain $A$ of $f$, we can in effect describe a new function, which at every point $x \in A$ gives the derivative of $f$ at the point $x$. This function is referred to simply as $f'$; that is, $f'(x)$ is the value of the derivative of $f$ at $x$.

There are a number of different notational devices used for expressing derivatives.
In particular, if a function is described by an equation such as $y = 2x^2 + 3x + 1$, then the derivative is sometimes referred to as $\frac{dy}{dx}$ or $\frac{d}{dx}(2x^2 + 3x + 1)$. Other times we write $Df(x)$ to mean the same as $f'(x)$.

The problem with the definition of a derivative is that we do not really know what is meant by such a limit, nor what it would mean for the limit not to exist; that is, under what conditions would it not be possible to form a line tangent to the graph? The latter question is not too hard to answer. First we must be clear as to what we mean by the word tangent.

Definition 2 Given any curve $C$ in the plane and a point $P$ on $C$, a line is tangent to the curve $C$ if
• in some small region about $P$ the line does not intersect the curve except at the point $P$, or
• the line corresponds precisely to a portion of the curve.

What then does it mean for a function not to have a unique (i.e. one and only one) tangent to the graph at a point $(a, f(a))$? There are three cases. The first doesn't really count, since we have prefaced all this by saying that the function is defined at the point $a$.
• the function is not defined at the point $a$. For instance the function $f(x) = \frac{1}{x - 1}$ is not defined at $x = 1$.
• there is a break in the graph of $f$ at the point $a$; if this should happen the function is said to be discontinuous at $a$. The step function is an example of a function that is discontinuous at each integer, see figure 4.
• the graph of $f$ has a corner at $(a, f(a))$, so that it is possible to define more than one line tangent at $(a, f(a))$. An example is the absolute value function, which has a corner at $x = 0$, see figure 5.

Figure 4: multiple tangents at point of discontinuity

Figure 5: multiple tangents at a corner

The notion of a tangent line is intuitive and easy to grasp, but how does one calculate the slope of such a line? The only way is by the sort of approximation which we have been talking about.
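The contrast between a point with a unique tangent and a corner can be seen numerically. The Python sketch below evaluates the quotient $(f(a + h) - f(a))/h$ from Definition 1 for small positive and negative $h$, once for the sample function $y = 2x^2 + 3x + 1$ at $a = 1$ and once for the absolute value function at $a = 0$. The helper name `difference_quotient` and the choice of sample points are assumptions made for illustration.

```python
def difference_quotient(f, a, h):
    """Slope of the line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def parabola(x):
    """The sample function y = 2x^2 + 3x + 1 mentioned in the text."""
    return 2 * x ** 2 + 3 * x + 1


for h in (0.1, 0.01, 0.001):
    print(
        h,
        difference_quotient(parabola, 1.0, h),   # slope from the right
        difference_quotient(parabola, 1.0, -h),  # slope from the left
        difference_quotient(abs, 0.0, h),        # corner, from the right
        difference_quotient(abs, 0.0, -h),       # corner, from the left
    )
```

For the parabola both one-sided slopes equal $7 \pm 2h$ and so approach the common value 7, whereas for the absolute value function they stay at $1$ and $-1$ however small $h$ becomes, so no single tangent slope emerges at the corner.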
Given the function $f$ and the point $a \in A$, the expression

$$\frac{f(a + h) - f(a)}{h}$$

is referred to as the difference quotient. It is in fact just another function, whose variable however is $h$; that is, we have a new function

$$g(h) = \frac{f(a + h) - f(a)}{h},$$

and what we are asking is then the same as asking what is meant by the limit $\lim_{h\to 0} g(h)$. This question is just a special case of a more general question: what precisely do we mean when we say, for some function $f$, that the functional values $f(x)$ get closer and closer to some number $L$ as the points $x$ get closer and closer to a fixed value $a$?

1.4 Limits

One way of thinking about limits of a function is to construct a sequence, as we did in the case of falling objects, by looking at terms of the form $f(a + \frac{1}{n})$ for $n$ an integer. If $n$ is always taken as positive, the domain values $a + \frac{1}{n}$ approach $a$ from the right as $n$ gets large. In this case we are looking at a right-hand limit, which we write as $\lim_{n\to\infty} f(a + \frac{1}{n})$. In the same way, if $n$ is always negative, then the terms $a + \frac{1}{n}$ bit by bit approach $a$ from the left as $n$ gets large in absolute value, and the corresponding limit is said to be a left-hand limit.

Another way of thinking of this is to replace $a + \frac{1}{n}$ by some arbitrary number $x$ which in a limit gets closer and closer to $a$. In this case we write $\lim_{x\to a^+} f(x)$ to denote the right-hand limit, when $x$ is always chosen to be greater than $a$, and $\lim_{x\to a^-} f(x)$ to denote the left-hand limit.

Example 3 Let the function $f$ be defined by $f(x) = x$ in the case that $x \le 1$, and by $f(x) = x + 1$ if $x > 1$. Then the left-hand limit is $\lim_{x\to 1^-} f(x) = 1$, whereas the right-hand limit is $\lim_{x\to 1^+} f(x) = 2$.

It happens rather frequently that a right-hand limit does not equal a left-hand limit; this occurs when the graph jumps at $a$. See figure 6.
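The one-sided limits in Example 3 can be explored with a short Python sketch that evaluates $f$ along the sequences $1 + \frac{1}{n}$ and $1 - \frac{1}{n}$. Using $a - \frac{1}{n}$ with positive $n$ is the same as using $a + \frac{1}{n}$ with negative $n$; the loop bounds are an arbitrary illustrative choice.

```python
def f(x):
    """Example 3: f(x) = x for x <= 1 and f(x) = x + 1 for x > 1."""
    return x if x <= 1 else x + 1


a = 1.0
for n in (1, 10, 100, 1000):
    from_left = f(a - 1.0 / n)   # points approaching a from below
    from_right = f(a + 1.0 / n)  # points approaching a from above
    print(n, from_left, from_right)
```

The left column settles toward 1 and the right column toward 2. A table like this only suggests the values of the limits; it does not prove them.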
If the left-hand limit equals the right-hand limit, then we say that the limit exists and we write simply $\lim_{x\to a} f(x)$.

Figure 6: limit from left = 1 whereas limit from right = 2

Figure 7: For $a - \delta < x < a + \delta$ and $x \ne a$, $(x, f(x))$ in green stripe

Example 4 Let the function $f$ be defined so that at a point $a$ there is a jump discontinuity, which simply means that the function is nicely continuous, in the sense that the graph may be drawn without lifting pen from paper, except at the point $x = a$, where it abruptly jumps to some distant point. For a particular example, suppose $f$ is defined so that

$f(x) = 2$ for $x \ne 1$, and $f(x) = 4$ for $x = 1$.

Note that $\lim_{x\to 1^-} f(x) = 2$ and $\lim_{x\to 1^+} f(x) = 2$, so the limit at 1 exists and equals 2, whereas $f(1) = 4$.

The precise definition of the limit of a sequence has already been mulled over, and this allows a firm footing for a limit of the form $\lim_{n\to\infty} f(a + \frac{1}{n})$. The more general definition, in which we substitute $x$ for $a + \frac{1}{n}$, involves some minor changes, which we show below in Definition 5. There are situations in which the sequence version and this general version do not coincide. We will postpone the details until later.

Definition 5 Limit A function $f$ has a limit $L$ as $x$ approaches $a$ means that: for every small number $\varepsilon > 0$, there exists a number $\delta > 0$ so that for every $x$, if $x$ is within $\delta$ of $a$ and $x \ne a$, then $f(x)$ is within $\varepsilon$ of $L$. In other words,

$$\forall \varepsilon > 0 \ \exists \delta > 0 \text{ so that } \forall x, \quad 0 < |x - a| < \delta \Rightarrow |f(x) - L| < \varepsilon.$$

Phrased another way, $L$ is a limit of the function $f$ at the point $a$ if: given an arbitrary measure of closeness $\varepsilon$, which determines an open interval $(L - \varepsilon, L + \varepsilon)$ centered at $L$ on the $y$ axis, there exists another measure of closeness $\delta$ and a corresponding interval $(a - \delta, a + \delta)$ on the $x$ axis so that if $x$ is between $a - \delta$ and $a + \delta$ but not equal to $a$, namely $0 < |x - a| < \delta$, then $f(x)$ is on the $y$ axis between $L - \varepsilon$ and $L + \varepsilon$. See figure 7.

We can also make precise the notions of left limit and right limit.
All we need to do is to modify the above definition of the limit in such a way that for the left limit the values for $x$ are always less than $a$, and for the right limit the values for $x$ are always greater than $a$. The definitions are as follows.

Definition 6 Left limit A function $f$ has a left limit $L$ as $x$ approaches $a$ means that: for every small number $\varepsilon > 0$, there exists a number $\delta > 0$ so that for every $x$, if $x$ is within $\delta$ of $a$ and less than $a$, then $f(x)$ is within $\varepsilon$ of $L$. In other words,

$$\forall \varepsilon > 0 \ \exists \delta > 0 \text{ so that } \forall x, \quad 0 < a - x < \delta \Rightarrow |f(x) - L| < \varepsilon.$$

Definition 7 Right limit A function $f$ has a right limit $L$ as $x$ approaches $a$ means that: for every small number $\varepsilon > 0$, there exists a number $\delta > 0$ so that for every $x$, if $x$ is within $\delta$ of $a$ and greater than $a$, then $f(x)$ is within $\varepsilon$ of $L$. In other words,

$$\forall \varepsilon > 0 \ \exists \delta > 0 \text{ so that } \forall x, \quad 0 < x - a < \delta \Rightarrow |f(x) - L| < \varepsilon.$$

There are situations in which a limit of the function exists, say $\lim_{x\to a} f(x) = L$, but the value $L$ of the limit does not coincide with the value $f(a)$. Consider the following example.

Example 8 Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = 1$ for $x \ne 0$ and $f(0) = 2$. Then $\lim_{x\to 0} f(x) = 1$, whereas $f(0) = 2$.

1.4.1 Arithmetic combinations of functions

We need to talk about limits because we need to state and prove some things about derivatives, but first we need some more machinery. The techniques we will develop for computing derivatives require that we first break the definition of a given function into basic parts and then apply differentiation rules to the parts. In order to do this we need to know how to add, multiply, and divide functions.
Given two functions $f : A \to \mathbb{R}$ and $g : A \to \mathbb{R}$ defined on an interval $A$, we define $f + g$, $f \cdot g$, $f/g$, and a fourth operation, which is called scalar multiplication, as follows:
• the sum $f + g$ is defined by $f + g : x \mapsto f(x) + g(x)$; that is, $(f + g)(x) = f(x) + g(x)$
• the product $f \cdot g$ is defined by $f \cdot g : x \mapsto f(x)g(x)$; that is, $(f \cdot g)(x) = f(x)g(x)$
• the quotient $f/g$ is defined by $f/g : x \mapsto \frac{f(x)}{g(x)}$, provided that $g(x) \ne 0$; that is, $(f/g)(x) = \frac{f(x)}{g(x)}$
• for a constant $k \in \mathbb{R}$, the scalar multiple $kf$ of $f$ is defined by $kf : x \mapsto k(f(x))$; that is, $(kf)(x) = k(f(x))$

Example 9 Let $f$ be defined by $f(x) = x^3$ and let $g$ be defined by $g(x) = \sqrt{x}$.
1. $f + g$ is then defined by $(f + g)(x) = x^3 + \sqrt{x}$
2. $f \cdot g$ is defined by $(f \cdot g)(x) = x^3\sqrt{x}$
3. $f/g$ is defined for $x > 0$ by $(f/g)(x) = \frac{x^3}{\sqrt{x}}$

1.5 Limit facts

Calculating limits and calculating derivatives, which of course are also limits, can sometimes be difficult. Luckily there are some easy results that allow us to simplify the task by breaking the limit up into parts which then can be more easily evaluated. These results can be simply stated: the limit of a sum is the sum of the limits, the limit of a product is the product of the limits, the limit of a quotient is the quotient of the limits, provided the denominator is not zero, and the limit of a composition is the composition of the limits. The words however can be misleading. The proper statements are as follows.

Limit Theorems Given functions $f : A \to \mathbb{R}$ and $g : A \to \mathbb{R}$, where $A$ is an open interval, suppose that $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = M$. Then
• Sum of limits: $\lim_{x\to a} (f(x) + g(x)) = L + M$
• Product of limits: $\lim_{x\to a} f(x)g(x) = LM$
• Quotient of limits: $\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{L}{M}$, provided $M \ne 0$ and $g(x) \ne 0$ in some small interval containing the point $a$.
• Composition of limits: provided $\lim_{y\to M} f(y)$ exists, then $\lim_{x\to a} (f \circ g)(x) = \lim_{y\to M} f(y)$

Using the definition of a limit one can prove each of the limit rules.
To show you how it goes, I'll prove the first. You may notice that the proof is very similar to the proof that the limit of a sum of sequences equals the sum of the limits. The other properties listed can be similarly proved, although the proofs are more involved.

Theorem 10 In the context of the above, $\lim_{x\to a} (f(x) + g(x)) = L + M$.

Proof. According to Definition 5, we need to show that for arbitrary $\varepsilon > 0$ there exists a $\delta > 0$ such that for arbitrary $x$,

$$0 < |x - a| < \delta \Rightarrow |f(x) + g(x) - (L + M)| < \varepsilon.$$

Let's begin by choosing an arbitrary but fixed $\varepsilon > 0$. Now, since $\lim_{x\to a} f(x) = L$, considering $\varepsilon/2$ instead of $\varepsilon$ and applying Definition 5, we see that there is some number $\delta_1 > 0$ such that for arbitrary $x$,

$$0 < |x - a| < \delta_1 \Rightarrow |f(x) - L| < \varepsilon/2. \qquad (1)$$

Similarly, since $\lim_{x\to a} g(x) = M$, there exists, according to Definition 5 again, some number $\delta_2 > 0$ such that for arbitrary $x$,

$$0 < |x - a| < \delta_2 \Rightarrow |g(x) - M| < \varepsilon/2. \qquad (2)$$

From lines (1) and (2) above, we see that if we set $\delta$ to be the minimum of $\delta_1$ and $\delta_2$, it follows that for $0 < |x - a| < \delta$ both $|f(x) - L| < \varepsilon/2$ and $|g(x) - M| < \varepsilon/2$ are true. We can then add these two inequalities. So we have:

$$0 < |x - a| < \delta \Rightarrow |f(x) - L| + |g(x) - M| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

However, by the triangle inequality, which says that for any two numbers the absolute value of the sum is less than or equal to the sum of the absolute values, i.e. $|a + b| \le |a| + |b|$, we conclude that

$$|f(x) + g(x) - (L + M)| = |f(x) - L + g(x) - M| \le |f(x) - L| + |g(x) - M| < \varepsilon.$$

Hence, we have shown that for arbitrary $\varepsilon > 0$ there exists $\delta > 0$ so that for all $x$,

$$0 < |x - a| < \delta \Rightarrow |f(x) + g(x) - (L + M)| < \varepsilon.$$

1.6 Differentiation Formulas

Being able to find the derivative of a function at a point is an important analytic tool, but if one always had to go through the elaborate process of looking at successive approximations, it would be a rather clumsy tool. Fortunately, there are four general results that allow, in most cases, a quick calculation.
These results are proved using the definition of the derivative in terms of a limit. The results are the sum rule, the product rule, the quotient rule, and the chain rule. But before we get to them, it will be useful if we calculate a few derivatives the long way. The first is the so-called Power Rule. The proof is short, so we will include it.

Theorem 11 Power Rule If $f(x) = x^n$ for some positive integer $n$, then $f'(x) = nx^{n-1}$ for all real numbers $x$.

Proof. Let $a$ be an arbitrary fixed number. I want to show that $f'(a) = na^{n-1}$. The trick here is to realize that

$$x^n - a^n = (x - a)(x^{n-1} + x^{n-2}a + x^{n-3}a^2 + \cdots + xa^{n-2} + a^{n-1}).$$

To verify this, start multiplying out the right hand side. You will see that all the terms cancel except for $x^n$ and $-a^n$. Using this result, we go to the definition, written with $x = a + h$ so that $h \to 0$ becomes $x \to a$, and calculate

$$f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{x\to a} \frac{x^n - a^n}{x - a} = \lim_{x\to a} \frac{(x - a)(x^{n-1} + x^{n-2}a + \cdots + xa^{n-2} + a^{n-1})}{x - a}$$

$$= \lim_{x\to a} \left(x^{n-1} + x^{n-2}a + \cdots + xa^{n-2} + a^{n-1}\right) = a^{n-1} + a^{n-2}a + \cdots + aa^{n-2} + a^{n-1} = na^{n-1}.$$

The power rule, expressed above for exponents that are positive integers, may be extended to the case where the exponent is any real number. Later we shall prove this for the case in which the exponent is an arbitrary rational number.

Theorem 12 Extended Power Rule If $r$ is an arbitrary real number, then

$$\frac{d}{dx}(x^r) = rx^{r-1}.$$

A constant function has a graph which is a horizontal straight line, and it is not hard to see by examining the difference quotient that the derivative of any such function is zero. The proof is left as an exercise.

Theorem 13 If $c$ is some constant and $f(x) = c$ for all $x$, then $f'(x) = 0$ for all $x$.

Example 14
1. Let $f(x) = \sqrt{x}$. Then $f'(x) = \frac{1}{2}x^{-\frac{1}{2}} = \frac{1}{2\sqrt{x}}$.
2. Let $g(x) = \frac{1}{\sqrt{x}}$. Then $g'(x) = -\frac{1}{2}x^{-\frac{3}{2}} = -\frac{1}{2x^{\frac{3}{2}}}$.

1.7 Sum, Product, Quotient, and Chain Rules

In what follows let $f$ and $g$ be two functions both of which have derivatives at a point $a$, denoted $f'(a)$ and $g'(a)$.
• Sum Rule the derivative of $f + g$ at a point $a$ is the derivative of $f$ at $a$ plus the derivative of $g$ at $a$; that is,

$$(f + g)'(a) = f'(a) + g'(a).$$

In simple words, the derivative of a sum is the sum of the derivatives.

Example 15 Let $h(x) = x^3 + \sqrt{x}$. Then letting $f(x) = x^3$ and $g(x) = \sqrt{x}$, we have

$$h'(x) = 3x^2 + \frac{1}{2\sqrt{x}}.$$

• Product Rule the derivative of $f \cdot g$ at $a$ is the derivative of $f$ at $a$ times the value of $g$ at $a$ plus the value of $f$ at $a$ times the derivative of $g$ at $a$; that is,

$$(f \cdot g)'(a) = f'(a)g(a) + f(a)g'(a).$$

• Quotient Rule provided $g(a) \ne 0$, the derivative of $f/g$ at $a$ is the derivative of $f$ at $a$ times $g(a)$ minus the derivative of $g$ at $a$ times $f(a)$, all divided by $g(a)$ squared; that is,

$$(f/g)'(a) = \frac{f'(a)g(a) - f(a)g'(a)}{(g(a))^2}.$$

• Chain Rule Let $g : A \to \mathbb{R}$ and $f : B \to \mathbb{R}$ be two functions such that the composition $f \circ g$ makes sense, in the sense that the range of $g$, namely $g(A)$, is contained in $B$. Further suppose that the derivative of $f$ at $b = g(a)$ exists. Under these conditions the derivative of $f \circ g$ at $a$ is the derivative of $f$ at $b = g(a)$ times the derivative of $g$ at $a$; that is,

$$(f \circ g)'(a) = f'(g(a))\,g'(a).$$

1.7.1 Examples

1. Let $h(x) = kg(x)$ for a constant $k$. Setting $f(x) = k$, knowing that $f'(x) = 0$, and using the product rule, we see that $h'(x) = kg'(x)$.

2. Let $h(x) = (x^2 + 4x + 1)(x^5 + 3x^4 + x^2 + 3)$. To find the derivative one could multiply the two polynomials and then use the power rule, the sum rule, and the result shown just above. But to avoid the initial multiplication, we can use the product rule and simplify the calculation. We get

$$h'(x) = (2x + 4)(x^5 + 3x^4 + x^2 + 3) + (x^2 + 4x + 1)(5x^4 + 12x^3 + 2x),$$

which could be simplified, but for exercises involving only differentiation, simplification is not necessary.

3. Let $h(x) = \frac{x^3 + 2x^2 + x + 2}{x^2 + 1}$. Using the quotient rule we get

$$h'(x) = \frac{(3x^2 + 4x + 1)(x^2 + 1) - (x^3 + 2x^2 + x + 2)(2x)}{(x^2 + 1)^2}.$$

4. Let $h(x) = (x^2 + 2x + 1)^{45}$.
So $h$ is the composition of the two functions $g(x) = x^2 + 2x + 1$ and $f(y) = y^{45}$. Using the chain rule, we then calculate the derivative of $h$ as

$$h'(x) = f'(g(x))\,g'(x) = 45(x^2 + 2x + 1)^{44}(2x + 2).$$

Observe that, for $h$ as written, the only practical way of calculating the derivative is to use the chain rule. Multiplying out $(x^2 + 2x + 1)^{45}$ would be a nightmare.
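The derivatives found with the product, quotient, and chain rules in the examples of section 1.7.1 can be sanity-checked numerically. The Python sketch below compares each formula with a symmetric (central) difference estimate, which is a standard numerical approximation and not the one-sided difference quotient used in the text. The function names, the step size, and the sample point $x = 0.5$ are assumptions chosen for illustration.

```python
import math


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) by the slope between x - step and x + step."""
    return (func(x + step) - func(x - step)) / (2 * step)


def product_example(x):
    return (x**2 + 4*x + 1) * (x**5 + 3*x**4 + x**2 + 3)


def product_example_derivative(x):
    # Product rule: f'g + fg'
    return ((2*x + 4) * (x**5 + 3*x**4 + x**2 + 3)
            + (x**2 + 4*x + 1) * (5*x**4 + 12*x**3 + 2*x))


def quotient_example(x):
    return (x**3 + 2*x**2 + x + 2) / (x**2 + 1)


def quotient_example_derivative(x):
    # Quotient rule: (f'g - fg') / g^2
    return (((3*x**2 + 4*x + 1) * (x**2 + 1)
             - (x**3 + 2*x**2 + x + 2) * (2*x)) / (x**2 + 1)**2)


def chain_example(x):
    return (x**2 + 2*x + 1) ** 45


def chain_example_derivative(x):
    # Chain rule: f'(g(x)) g'(x) with f(y) = y^45 and g(x) = x^2 + 2x + 1
    return 45 * (x**2 + 2*x + 1) ** 44 * (2*x + 2)


EXAMPLES = [
    ("product rule", product_example, product_example_derivative),
    ("quotient rule", quotient_example, quotient_example_derivative),
    ("chain rule", chain_example, chain_example_derivative),
]

x0 = 0.5  # hypothetical sample point
for label, func, derivative in EXAMPLES:
    estimate = central_difference(func, x0)
    exact = derivative(x0)
    # Compare with a relative tolerance, since the values differ hugely in size.
    print(label, estimate, exact, math.isclose(estimate, exact, rel_tol=1e-6))
```

Agreement at one sample point does not prove a formula, but a mismatch would reveal a slip such as a dropped term or a misplaced parenthesis.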