Carnegie Mellon University Research Showcase @ CMU Department of Mathematical Sciences Mellon College of Science 1995 The chain rule for higher derivatives Walter Noll Carnegie Mellon University Follow this and additional works at: http://repository.cmu.edu/math This Technical Report is brought to you for free and open access by the Mellon College of Science at Research Showcase @ CMU. It has been accepted for inclusion in Department of Mathematical Sciences by an authorized administrator of Research Showcase @ CMU. For more information, please contact [email protected]. NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying of this document without permission of its author may be prohibited by law. The Chain Rule for Higher Derivatives by Walter Noll Department of Mathematics Carnegie Mellon University Pittsburgh, PA 15213 Research Report No. 95-184 November, 1995 510.6 C28R 95-184 The Chain Rule for Higher Derivatives by Walter Noll, November 1995 1. Introduction. Perhaps the most important theorem of elementary differential calculus is the Chain Rule, It states, roughly, that the composite of two differentiable functions is again differentiable, and it gives a formula for the derivative of this composite. A Chain Rule of Order n should state, roughly, that the composite of two functions that are n times differentiable is again n times differentiable, and it should give a formula for the n'th derivative of this composite. Many texbooks contain a Chain Rule of order 2 and perhaps 3, but I do not know of a single one that contains a Chain Rule of arbitrary order n with an explicit and useful formula. The main purpose of this paper is to derive such a formula. In the last section, I will show how the Chain Rule of higher order can be used to prove that the composite of two real-analytic functions is again realanalytic. The usual proof uses complex extensions of the real-analytic functions and basic theorems of Complex Analysis. However, this usual proof cannot easily be extended to the case when the real-analytic functions are replaced by mappings between higher-dimensional spaces. The proof presented here can. In fact, it will be used in the second volume of my book entitled "Finite-Dimensional Spaces; Algebra, Geometry, and Analysis", now being written. (The first volume appeared in 1987.) In this paper, we need some precise and efficient notation and terminology, which, unfortunately, has not yet become standard. (It is described in some detail in Chapter 0 of Vol.1 of my book mentioned above.) We denote the set of all natural numbers, with zero included, by IN. the set of all real numbers by R , and the set of all positive real numbers, with zero included, by P . We use IN x : = ]N\{0} for the set of all non-zero natural numbers and F x := P \{0} for the set of all strictly positive real numbers. Given n £ IN we use the notation n' := {k £ IN | 1 < k < n} for the set of all natural numbers from 1 to n. We have 0' = 0. A subset J of R is called an interval if for all s,t £ J we have [s,t] C J. The empty set and singletons are intervals; all other intervals are said to be genuine. The cardinality of a finite set 5 is denoted by # 5 . A family a := (a?- | / £ J) is specified by a procedure which assigns to each i in a given set / , called the index set of a, an object a t , which is then called a term of a. The family a should not be confused with its range {a, | i £ / } , which is the set of all terms of a. The set of all families indexed on I with terms in a given set 5 is denoted by S1. If 5 contains a. zero-element 0 of some sort, for example if 5 is IN, R , or P , we define the support of a given a £ S1 by Supta := {i £ / | ax ^ 0 } . (11) The set of all families in S1 that have finite support will be denoted by 5 ( / ) := {a £ S1 | Supta is finite} . (12) Families indexed on IN or IN* are called sequences. Assume now that an additive monoid P is given. This means that an addition and a zero are defined for P such that the usual rules hold. Examples are P := IN, R , and P . If / is a finite set and a G P1 is given, one can unambiguously define the sum ]T? Y (13) by using the associative and commatative laws of addition. If / is the empty set we have E a = 0. The third notation of (13) is preferred in displayed formulas, but the first and second are very useful in involved formulas. If the set I is not finite but a G P^ 7) one can still define E a and use the notations (13) by putting E a := ] T ( a , | i G Supta) . (14) Assume now that the given monoid P is a commutative multiplicative monoid instead of an additive monoid. Examples are P := IN, R , P , IN X , and P x . Then the considerations of the preceding two paragraphs apply when ^2 i s replaced by f| in (13) and (14) and if 0 is replaced by 1 in the definition (11) of support. 2. Patterns of partitions. Recall that a partition of a given set 5 is a pair-wise disjoint collection of non-empty subsets of 5 whose union is all of 5. The member-sets of a partitions are called pieces. Definition 1. The pattern n(V) G IV finite set S is defined by of a given partition V of a given (n(V))i := #{/v G V | #/v = i} for all i G JVX (21) Roughly, the pattern of V tells us how many pieces of V have a given cardinality. Let a finite set 5 be given. It is clear that (22) for every partition V of 5. Given n G K , we use the notation n)i:={p€lN(1N;X)| Y, ' > ' = »} ielN 9 TX (23) We now put n := # S . Then Un is the set of all possible patterns of partitions of 5. Denote the set of all partitions of S by Part 5. Given p G II n , we use the notation \p : = #{V G Part S \p = ir(V)} . (24) for the number of all partitions of 5 that have the pattern p. This number depends only on p G IN* \ which determines ?? = # S via (22); it does not depend on the particular nature of 5 itself. \ not zero, be given and choose k 6 W x such that Lemma 1. Let p G J!V pk ^ 0. Define q G JV(iNfX) by Putting n : = £)(?>,- I * € ^r><)< u ' e ^ a v e £ ( * 9* I* £ ^ r>< ) = A 77 ~ (26) «- Proof : We choose a set 5 with n = # 5 . We then construct a partition of S as follows. First we choose a subset K of 5 with k = #K and then a partition Q. of S\K with pattern q as defined by (25). Then QU {A'} is a partition of 5 with pattern p. Every partition of S with patterns p can be obtained in this manner. Since there are (£) subsets A of 5 with #A~ = k and, given such a A', there are Xq partitions of S\K with pattern q. we obtain partitions of 5 with pattern p in (JJ)Ag different ways. However, since every partition of S with pattern p contains pu pieces A' with #A' = k, the procedure described above produces one and the same partitition of S in pu different ways. We conclude that (26) is valid. | Theorem 1. For all p G -EV * we have >Pi • Proof: We use induction over /?. If ?? := 0 we must have p = ( 0 , . . . ) , the sequence all of whose terms are zero. Then (27) is valid with A(o } = 1 because the empty set has only one partition, namely the empty one, and because 0' = 0, making the product in the denominator on the right side of (27) equal to 1. Now let p G IN( \ not zero, be given and put n := ]T(z px \ i G IN X ). Assume that (27) becomes valid after p is replaced by a q G IN( * G 1NX) < n, so that g = n(<iMW \iem\) • ] with m : = (2S) We now choose k G IN x such that pk ^ 0 and we define q G JN(ES ) by (25). Then in := n - A: < ?? and hence, by (26) of the Lemma and by (28), we have _ /??\ 1 m! P ~ W 1 (n-fc)! Ui!)1" | t € m l \ { f c } ) ' which easily yields (27), completing the induction. | 3. The Chain Rule of order n. In this section, we assume that genuine intervals J and K and functions / : J —• K and g : K —> R are given, so that the composite g o / : J —• R makes sense. The most common form of the ordinary Chain Rule is this: if / and g are differentiate, so is g o f and we have (g o / ) • = [g* o / ) /*. (We follow Newton and denote derivatives by superscript dots rather than primes.) The following is a preliminary form of the Chain Rule of order n. L e m m a 2. Let n G IS be given. If both f and g are n times differentiable, so is g o f and {(Jo /)<">= Y, {9'*V) ° f) U f{*K) • (3D Proof: We proceed by induction. If ?? := 0 we have ??' = 0 and hence 0 is the only member of Part n). Therefore, the product on the right side of (31) reduces to 1 and (31) becomes(<y o / ) ( 0 ) = (g o f) = (#<0) o / ) . Assume now that ?? G IN is given, that / and g are n + 1 times differentiate, and that (31) is valid for the given /?. Then / . g{^p) and / ( * / v] are different iable for all V G Part ??J and all 7v G V. Hence we can apply the ordinary Chain Rule and the Product Rule to differentiate (31) and obtain °nr PGPart /?! LG^ n h'eT\{L} Now, the partitions of (?? + 1)' can be classified as follows. (1) The first class consists of partitions that contain {n + 1} as a piece. They are of the form V9 := V U {{?? + 1}} with V e Part ??] . (2) The second class consists of partitions that do not contain {n + 1} as a piece. They are of the form V':=(V\{L}) U { L U { n +l } } with 4 VePartn) a n dL e V . The summands of the first sum on the right side of (32) are of the form (</ (#n °/) n fi#K>) when V belongs to the first class. The second term on the right side of (32) can be written as a sum with summands that are also of the form (33), except that V now belongs to the second class. Therefore, (32) reduces to (31) with n replaced by n + 1, and the induction is complete. | Theorem 2 (Chain Rule of order n). Let n G IV be given. If both f and g are n times different iable, so is g o / and (gof){n)= where II n is defined by (23) and Xp is given explicitly by (27). Proof: We observe that each summand on the right side of (31) depends on the partition V only through its pattern TT(V). NOW, in view of (24). Xp is the number of partitions of w having a given pattern p G II n . Therefore, (31) of Lemma 2 reduces to (34) when equal terms in the summation are put together. | 4. Summations. We assume that an index set / is given and we denote the collection of all finite subsets of / by F i n / . Let a G P 7 be given. As was pointed out in Sect.l, the sum E a is meaningful if / is finite or if a has finite support. Even if / is infinite and a fails to have finite support, one can still assign a meaning to S a , possibly oc, by putting So := sup{ Y^ «* IA ' € Fin/} G F := P U {oo} . (41) We say that a is summable if E a < oc. Most of the rules for summations remain valid for sums in the extended sense of (41). For example, if V is a partition of / , we have Kev If the index set is finite, the following proposition can be proved fairly easily by induction over # / using the Binomial Theorem. The proposition is also valid if / is infinite and if sums are defined by (41). Proposition 1. Let n G IV and a G P1 be given and put r , , : = { p e I V ( / ) | Zp = n} . (43) Then The following results are variations of standard theorems involving "power series". TAfX Proposition 2. Let a sequence a G P be given. Theii the following conditions are equivalent: (a) The set { %/oZ \ n G W x } is bounded. (b) For some h e Px, the set {cinhn \ n G W x } is bounded. (c) For some ft G P x , we have lim^^oo anhn = 0. (d) For some ft G P x , the sequence (anhn \ n G W x ) is suinmabie. Proposition 3. Assume that a givej^ sequence a G P ' satisfies one, and hence all, of the conditions of Prop.2. Then we can choose ft G P x such that the sequence {antn \ n G Nx) is summable for all t G [0,ft]and lim Y ant« = 0 . (45) 4. Composition of analytic functions. We assume now that non-empty open intervals J and A' and a C°°-function / : J > K and x G J are given. For every /? G IN * and d G P x with [x — d,x + d] C J, we use the notation a n (/;aNd):=sup{|/ ( l l ) (.r + 6)| | ^ G [ - ^ r i ] } e P . (51) The supremum on the right side of (51) is actually a maximum and hence belongs to P because all the derivatives of / are continuous. Definition 2. We say that f is analytic near x G J if there is a d G P x with [x — d,x + d] C J such that the sequence a := {-^an(f:x.d) \ n G 1NX) satisfies one. and hence all. of the conditions (a) - (d) of Prop.2 of Sect.4. Assume that / is analytic near x and choose d satisfying the conditions of Def.2. By (c) of Prop.2 we may choose /? G P x such that lim an(f;x,d)— ?i-^oo hn 77,! =0. (52) In view of the notation (51), (52) ensures that the remainder term in the Taylorexpansion of / near x goes to zero and hence that f(x + t) = f(x)+ lim Y f{k)(x)*— (53) n — oc for all t G P% satisfying |<| < min{/?.c/}. One can prove, conversely, that / is analytic in the sense of Def.2 if (53) holds for all t with |/| less than some strictly positive number. Therefore, our Def.2 of analyticity is equivalent to the one usually given in the textbooks. We assume now that non-empty open intervals J and K and C^-functions / : J —> A" and g :—• R are given. By Thm.2, g o f :—> R is then also a C°°-function. 6 Theorem 3 (Composition-Theorem for Analyticity). Let x G J be given and put y := f(x). Assume that f is analytic near x and that g is analytic near y. Then g o / is analytic near x. Proof: Let d! G P x with [y — d1, y + d1] C A" be given. Since / is continuous at x and since y = / ( # ) , we can choose d G P x with [x — d, x + d\ C J such that |/(.r + s) — y | < d' for all s G [—d, d]. This means that for every s G [—d, </] there is ^' G [—d',d'] such that f(x + s) = y + s*. Therefore, using the notation (51) with / , x, d, and n replaced by g, y, d1, and m, respectively, we have \(g(m) o / ) ( . : + 5)| = \g{m)(f(x + s))\ < *m(g\yJ) (54) for all 6 G [—d, c?] and rn G 1NX. We note that this statement remains valid if d is replaced by any smaller number in P x . Now let ?? G IN* be given. Using Thin.2, the inequality (54), and the notation (51) we see that \(gof){n)(.r + s)\< is valid for all 6 G [—d,d]. Using the notation (51) again, this time with / replaced by g o / , we conclude that ] T Ap a(Ep)(<j: y, d') J\ (a,-(/; ,r, c/))Pt . an(gof;x.d)< (55) Now let t G P be given. Using the formula (27) of Thm.l aaid the fact that t = n((*f')Pl> I * € nJ) for all p G II n , we conclude from (55) that n < Using the notation (43) with / := IN x , it is evident that both {Tn \ n G IN x } and {II n | 7? G N x } are partitions of 1N(INX ) \{(0,.. .)}• Summing (56) over n G ]NX and using the formula (42) with first with V replaced by {II n | n 6 { r n | n e IN*}, we find that < IN TX } and then again with V replaced by Using Prop.l of Sect.4 with / := INx . it follows that tn a ^^y,df)^( °n{gof;x,d)-< £ n€IN X ^ tk\m 1/ ]T <Tk(f;x,d)-) . (57) ™" This inequality is valid for all d! € P x with [y - d\ y + d'] C A', all d G P x with [x — d, x + d] C J that are small enough, and all / £ P . Since # is analytic near x, we can, by Def.2, fix df such that the condition (d) of Prop.2 of Sect.4 is valid for the sequence | 777. €1NX) . a:=(^(Jm(g^d') Hence we may choose h' G P x such that _ , „>' oo . (58) Since / is analytic near ,r. we can. by Def.2. fix d £ P x such that not only (57) holds but also Prop.3 of Sect.4 can be applied to the sequence | k e INx ) . a:=(-ak.(f:.v.d) Hence we may choose /?. G P x such that hk Ck(f',x.cl)— <hf . (59) Using (57) with the choice t := h it follows from (59) and (58) that <rn(gof;x,d)—< 77! V <Tm{g\y,d9)—r < oc . m ' x Hence condition (cl) of Prop.2 of Sect.4 is satisfied for the sequence a :=((jn(gof:x.d)^ \ n G INx ) . By Def.2 it follows that g o / is analytic near x. | APR 1 5 2004
© Copyright 2025 Paperzz