Taylor polynomials

UCSC
Supplementary notes
AMS/ECON 11A
© 2008, Yonatan Katznelson
1. Recovering a polynomial from its derivatives.
If $n$ is a positive integer, then the higher order derivatives of the function $f(x) = x^n$ are all easy to compute:
$$f'(x) = nx^{n-1}, \quad f''(x) = n(n-1)x^{n-2}, \quad \ldots, \quad f^{(n)}(x) = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1 = n!.$$
Since $f^{(n)}(x)$ is a constant function, it follows that $f^{(n+1)}(x) = 0$, and in fact, $f^{(k)}(x) = 0$ for any $k > n$. Next, if $P(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ is a polynomial of degree $n$, then
$$\begin{aligned}
P'(x) &= na_n x^{n-1} + (n-1)a_{n-1}x^{n-2} + \cdots + 2a_2 x + a_1,\\
P''(x) &= n(n-1)a_n x^{n-2} + (n-1)(n-2)a_{n-1}x^{n-3} + \cdots + 2a_2,\\
P'''(x) &= n(n-1)(n-2)a_n x^{n-3} + (n-1)(n-2)(n-3)a_{n-1}x^{n-4} + \cdots + 6a_3,\\
&\;\;\vdots\\
P^{(n)}(x) &= n! \cdot a_n.
\end{aligned}$$
Once again it follows, as above, that if $k > n$, then $P^{(k)}(x) = 0$. It is particularly instructive to study the sequence of constant terms of the $n + 1$ polynomials $P(x), P'(x), P''(x), \ldots, P^{(n)}(x)$. Starting with $P(x)$, the constant terms of these polynomials are
$$a_0, \quad a_1, \quad 2a_2, \quad 6a_3, \quad 24a_4, \quad \ldots, \quad n! \cdot a_n,$$
that is, the constant term of $P^{(m)}(x)$ is equal to $m! \cdot a_m$.†
Now, the constant term of any polynomial is equal to the value of that polynomial at
x = 0, so we can summarize the observations above with the following fact.
Fact 1 If $P(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$, then $a_0 = P(0)$, and for $m = 1, \ldots, n$
$$a_m = \frac{P^{(m)}(0)}{m!}. \tag{1.1}$$
This fact is more than a list of formulas. A polynomial is completely determined by its
coefficients, and Fact 1 says that the coefficients of a polynomial are completely determined
by its value and the values of its derivatives at x = 0. This means that if we know the
value of the polynomial at x = 0, and we know its rate of change (i.e., the value of its first
derivative) at x = 0, and we know how its derivative is changing (i.e., the value of its second
derivative) at x = 0, etc., then we can compute the value of the polynomial at any other
point.
† For a positive integer $m$, the product $1 \cdot 2 \cdot 3 \cdots (m-1) \cdot m$ is denoted by $m!$ (pronounced ‘m factorial’). For consistency’s sake, $0!$ is defined to be equal to 1.
In other words, if we know everything there is to know about the behavior of a polynomial
at the point x = 0, then we know everything there is to know about the polynomial at every
point.
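Fact 1 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `derivative` and `recover_coefficients` and the sample polynomial $P(x) = 7 - 2x + 5x^3$ are illustrative assumptions. It differentiates a coefficient list repeatedly, reads off the constant term each time, and divides by $m!$ as in (1.1).

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [a0, a1, ..., an] (ascending powers)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def recover_coefficients(coeffs):
    """Recover a_m = P^(m)(0) / m! by repeated differentiation (Fact 1)."""
    recovered = []
    current = list(coeffs)
    m = 0
    while current:
        value_at_zero = current[0]  # constant term of P^(m) equals P^(m)(0)
        recovered.append(value_at_zero / factorial(m))
        current = derivative(current)
        m += 1
    return recovered

# Hypothetical example: P(x) = 7 - 2x + 5x^3
print(recover_coefficients([7, -2, 0, 5]))  # [7.0, -2.0, 0.0, 5.0]
```

The recovered list matches the original coefficients, which is exactly the statement that the values $P(0), P'(0), \ldots, P^{(n)}(0)$ determine the polynomial.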
There is, as it turns out, nothing special about the point x = 0, in this regard.
Example 1. Suppose that $f(x)$ is a polynomial of degree 2, and we know that $f(1) = 1$, $f'(1) = 1$ and $f''(1) = -3$. What is $f(2)$? More generally, what is $f(x)$ equal to, for any $x$?
To answer these questions, we write $f(x) = a_2 x^2 + a_1 x + a_0$, and use the given data to find $a_0$, $a_1$ and $a_2$. To begin, we note that
$$f'(x) = 2a_2 x + a_1 \quad \text{and} \quad f''(x) = 2a_2.$$
Now, looking at the given values of $f(1)$, $f'(1)$ and $f''(1)$ gives us three equations in the three unknowns, $a_0$, $a_1$ and $a_2$. Specifically,
$$\begin{aligned}
1 &= f(1) = a_2 + a_1 + a_0\\
1 &= f'(1) = 2a_2 + a_1\\
-3 &= f''(1) = 2a_2.
\end{aligned}$$
From the third equation, we immediately find that $a_2 = -3/2$; plugging this into the second equation gives $1 = -3 + a_1$, so $a_1 = 4$; and plugging these two values into the first equation gives
$$1 = -\frac{3}{2} + 4 + a_0 \implies a_0 = -\frac{3}{2}.$$
Thus, the polynomial is $f(x) = -1.5x^2 + 4x - 1.5$, and $f(2) = 0.5$.
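The back-substitution in Example 1 can be replayed in a few lines of Python. This is only an illustrative sketch; the variable names are my own choices and are not from the notes.

```python
# Data from Example 1: f(1) = 1, f'(1) = 1, f''(1) = -3
f1, df1, d2f1 = 1, 1, -3

a2 = d2f1 / 2        # from -3 = 2*a2
a1 = df1 - 2 * a2    # from  1 = 2*a2 + a1
a0 = f1 - a2 - a1    # from  1 = a2 + a1 + a0

def f(x):
    """The quadratic determined by the three data values."""
    return a2 * x**2 + a1 * x + a0

print(a2, a1, a0)  # -1.5 4.0 -1.5
print(f(2))        # 0.5
```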
To simplify the process of finding a polynomial from data about its value and the values of its derivatives at a given point, $x_0$, we use the following ‘trick’. Suppose that $f(x)$ is a polynomial of degree $n$, and we have the data
$$f(x_0), \quad f'(x_0), \quad f''(x_0), \quad \ldots, \quad f^{(n)}(x_0).$$
Instead of writing $f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0$, and using the data to derive equations for the coefficients $a_n, a_{n-1}, \ldots, a_0$ (as we did in the example above), we write
$$f(x) = c_n (x - x_0)^n + c_{n-1}(x - x_0)^{n-1} + \cdots + c_0, \tag{1.2}$$
and use the data to immediately find the coefficients $c_n, c_{n-1}, \ldots, c_0$. Note that the coefficients $c_j$ will generally not be the same as the coefficients $a_j$. On the other hand, this is not important, because it is just as easy to find the values of $f(x)$ using the expression in (1.2) as it is using the original (and more traditional) expression.
The reason that the expression in (1.2) is the ‘right’ one to use, given the data, becomes clear once you start differentiating it. We have
$$\begin{aligned}
f'(x) &= nc_n (x - x_0)^{n-1} + (n-1)c_{n-1}(x - x_0)^{n-2} + \cdots + 2c_2 (x - x_0) + c_1,\\
f''(x) &= n(n-1)c_n (x - x_0)^{n-2} + (n-1)(n-2)c_{n-1}(x - x_0)^{n-3} + \cdots + 2c_2,\\
f'''(x) &= n(n-1)(n-2)c_n (x - x_0)^{n-3} + (n-1)(n-2)(n-3)c_{n-1}(x - x_0)^{n-4} + \cdots + 6c_3,\\
&\;\;\vdots\\
f^{(n)}(x) &= n! \cdot c_n,
\end{aligned}$$
analogously to the formulas preceding Fact 1. Also, analogously to Fact 1, the constant term in the polynomial $f^{(m)}(x)$ is $m! \cdot c_m$, and so
$$f^{(m)}(x_0) = m! \cdot c_m, \tag{1.3}$$
since all the non-constant terms in $f^{(m)}(x)$ have a factor of $(x - x_0)$, so when you evaluate $f^{(m)}(x)$ at $x = x_0$, all the non-constant terms vanish. Dividing both sides of (1.3) by $m!$ proves the following theorem.
Theorem 1 If $f(x)$ is a polynomial of degree $n$ and $x_0$ is any point on the real number line, then
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n. \tag{1.4}$$
In other words, if $f(x)$ is a polynomial of degree $n$ and $x_0$ is any point on the real number line, then $f(x)$ is completely determined by its value and the values of its derivatives at the point $x_0$.
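Formula (1.4) can be evaluated directly from the derivative data, without first solving for the coefficients of the powers of $x$. A minimal Python sketch follows; the function name `taylor_eval` and its argument layout are assumptions made for illustration. Used with the data of Example 1, it reproduces $f(2) = 0.5$.

```python
from math import factorial

def taylor_eval(derivs_at_x0, x0, x):
    """Evaluate the right-hand side of (1.4).

    derivs_at_x0 is the list [f(x0), f'(x0), ..., f^(n)(x0)].
    """
    return sum(d / factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))

# Data of Example 1: f(1) = 1, f'(1) = 1, f''(1) = -3
print(taylor_eval([1, 1, -3], x0=1, x=2))  # 0.5
```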
Example 2. Suppose that $g(x)$ is a cubic polynomial (degree 3), and that $g(2) = 1$, $g'(2) = -1$, $g''(2) = 4$ and $g'''(2) = 9$. Find $g(4)$.
According to Theorem 1,
$$g(4) = g(2) + g'(2)(4 - 2) + \frac{g''(2)}{2}(4 - 2)^2 + \frac{g'''(2)}{6}(4 - 2)^3 = 1 - 1 \cdot 2 + 2 \cdot 2^2 + \frac{9}{6} \cdot 2^3 = 19.$$
2. The Taylor polynomial of a function.
If $f(x)$ is any function whose derivatives of order up to and including $n$ are all defined at the point $x_0$, then the expression on the right-hand side of (1.4) is perfectly well defined, and is in fact a polynomial of degree (at most) $n$.
What is no longer generally true, however, is that the function $f(x)$ is equal to this polynomial, unless $f(x)$ is a polynomial of degree (at most) $n$ to begin with.
Definition. The polynomial
$$T_{f,n}(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \tag{2.1}$$
is called the $n$th degree Taylor polynomial of $f(x)$, centered at $x = x_0$.
If the function $f(x)$ is a polynomial (of degree at most $n$), then Theorem 1 says that $f(x) = T_{f,n}(x)$. On the other hand, if $f(x)$ is not a polynomial, then it can’t possibly be equal to $T_{f,n}(x)$, since $T_{f,n}(x)$ is a polynomial.‡
‡ Though $f(x)$ and $T_{f,n}(x)$ may well be equal at certain points, for example $f(x_0) = T_{f,n}(x_0)$.
Example 3. Let’s find the 4th degree Taylor polynomial of $f(x) = \ln x$, centered at $x = 1$. First, we compute the derivatives up to and including order 4:
$$f'(x) = x^{-1}, \quad f''(x) = -x^{-2}, \quad f'''(x) = 2x^{-3} \quad \text{and} \quad f^{(4)}(x) = -6x^{-4}.$$
Evaluating $\ln x$ and its derivatives at $x = 1$, and using the definition of $T_4(x)$, above, we find that
$$\begin{aligned}
T_{\ln,4}(x) &= \ln 1 + 1^{-1} \cdot (x - 1) + \frac{-(1)^{-2}}{2} \cdot (x - 1)^2 + \frac{2 \cdot 1^{-3}}{6}(x - 1)^3 + \frac{-6 \cdot 1^{-4}}{24}(x - 1)^4\\
&= (x - 1) - \frac{1}{2}(x - 1)^2 + \frac{1}{3}(x - 1)^3 - \frac{1}{4}(x - 1)^4.
\end{aligned}$$
The graphs of y = ln x (blue line) and y = T4 (x) (broken red line) appear in the figure
below, and this figure highlights two important features of the Taylor polynomial.
Figure 1: The graphs of $\ln x$ and $T_4(x)$.
First of all, generally speaking, the Taylor polynomial $T_4(x)$ behaves completely differently than $\ln x$. For example, as we can easily see in Figure 1, if $x > 3$, then $T_4(x) < -1$ (and $T_4(x)$ is decreasing), while $\ln x > 1$ (and $\ln x$ is increasing).
On the other hand, if we only look in the vicinity of the point $x = 1$, the two functions are almost identical. In Figure 1, the two graphs are indistinguishable when $|x - 1| < 1/2$.§ If you prefer numerical evidence, you can evaluate $\ln x$ and $T_{\ln,4}(x)$ on your favorite calculator, at points close to 1, and see how far apart the values are. According to my TI-30XA
• $\ln 1.5 - T_{\ln,4}(1.5) \approx 0.0044234$,
• $\ln 1.25 - T_{\ln,4}(1.25) \approx 0.0001618$ and
• $\ln 1.1 - T_{\ln,4}(1.1) \approx 0.0000018$.
The errors of approximation have all been rounded to 7 decimal places, and as you can see, the closer $x$ is to 1, the better the approximation.
§ The graphing utility that I use rounds plot positions to 4 decimal places, so points on the two graphs that are less than 0.0005 apart may appear to coincide.
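The calculator check above can also be run as a short Python sketch. The function name `t4_ln` is an illustrative choice; the polynomial is the one found in Example 3.

```python
from math import log

def t4_ln(x):
    """4th degree Taylor polynomial of ln x centered at x = 1 (Example 3)."""
    u = x - 1
    return u - u**2 / 2 + u**3 / 3 - u**4 / 4

for x in (1.5, 1.25, 1.1):
    error = log(x) - t4_ln(x)
    print(f"ln {x} - T4({x}) = {error:.7f}")
# ln 1.5 - T4(1.5) = 0.0044234
# ln 1.25 - T4(1.25) = 0.0001618
# ln 1.1 - T4(1.1) = 0.0000018
```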
3. Taylor’s theorem.
Example 3 suggests that the Taylor polynomial, $T_{f,n}(x)$, centered at $x_0$, may provide a good approximation to the function $f(x)$, as long as $|x - x_0|$ is sufficiently small. For this reason, $T_{f,n}(x)$ is also called the $n$th order Taylor approximation to $f(x)$. The question is, how good is this approximation? More precisely:
Question: How small is $|f(x) - T_{f,n}(x)|$, and how does this depend on $n$ and on $|x - x_0|$?
The answer is given by Taylor’s theorem.
Theorem 2 (Taylor’s theorem) If the $(n+1)$st derivative of the function $f(x)$ is defined in the interval $(x_0 - a, x_0 + a)$, then for every $x$ between $x_0 - a$ and $x_0 + a$
$$f(x) - T_{f,n}(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}, \tag{3.1}$$
for some point $\xi$ between $x$ and $x_0$.
Comments:
a. The point $\xi$ at which $f^{(n+1)}(x)$ is evaluated in (3.1) is generally not known. Taylor’s theorem guarantees that there is such a point, but doesn’t say how to find it.
b. The proof of this theorem goes a little beyond the scope of this course. We will revisit the topic in 11B.
c. The difference $f(x) - T_{f,n}(x)$ is often called the remainder term, and denoted by $R_n(x)$. The formula for the remainder that appears in (3.1) is Lagrange’s form of the remainder. Taylor’s original form of the remainder involves an integral (which is one of the reasons that we will revisit the topic in 11B).
If $|x - x_0| < 1$ and $n$ is large, then $|x - x_0|^{n+1}$ is generally much smaller than $|x - x_0|$. So, we might expect $R_n(x)$ to become smaller and smaller as $n$ grows larger, and we might expect $T_{f,n}(x)$ to provide an increasingly accurate approximation to $f(x)$. This is true for many functions (but not for all), and if $|x - x_0| \geq 1$, then there is no guarantee that $T_{f,n+1}(x)$ will give a better approximation than $T_{f,n}(x)$.
Example 4. The sixth degree Taylor polynomial for $\ln x$, centered at $x = 1$ is
$$T_{\ln,6}(x) = (x - 1) - \frac{1}{2}(x - 1)^2 + \frac{1}{3}(x - 1)^3 - \frac{1}{4}(x - 1)^4 + \frac{1}{5}(x - 1)^5 - \frac{1}{6}(x - 1)^6,$$
as you can check for yourself. The graph of this function appears in Figure 2 (as a dashed green line), together with the graphs of $\ln x$ and $T_{\ln,4}(x)$. In the figure, you can see that the graph of $T_{\ln,6}(x)$ remains close to the graph of $\ln x$ for longer than the graph of $T_{\ln,4}(x)$, though it is hard to see whether $T_{\ln,6}(x)$ provides a better approximation to $\ln x$ in the portion of the figure where all three graphs come together. On the other hand, as $x$ moves further away from $x_0 = 1$ (e.g., when $x \geq 2.5$), $T_{\ln,6}(x)$ is actually further from $\ln x$ than $T_{\ln,4}(x)$.
Figure 2: The graphs of $\ln x$, $T_{\ln,4}(x)$ and $T_{\ln,6}(x)$.
To verify that $T_{\ln,6}(x)$ does give the better approximation near $x = 1$, we return to the calculator. Once again, using my handy-dandy TI-30XA, I found that the errors of approximation to $\ln 1.5$, $\ln 1.25$ and $\ln 1.1$ are
• $\ln 1.5 - T_{\ln,6}(1.5) \approx 0.0007776$,
• $\ln 1.25 - T_{\ln,6}(1.25) \approx 0.0000072$ and
• $\ln 1.1 - T_{\ln,6}(1.1) \approx 0.000000013$.
Comparing these errors to the errors in Example 3, we see that $T_{\ln,6}(x)$ produces much more accurate approximations than $T_{\ln,4}(x)$ at these points.
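Both halves of Example 4 (better accuracy near $x = 1$, worse accuracy far from it) can be seen side by side in a short Python sketch. The function name `taylor_ln` is an illustrative assumption, and the general coefficient $(-1)^{k+1}/k$ used in it simply extends the pattern visible in $T_{\ln,4}$ and $T_{\ln,6}$.

```python
from math import log

def taylor_ln(x, n):
    """n-th degree Taylor polynomial of ln x centered at x = 1.

    Assumes the coefficient pattern (-1)^(k+1) / k seen in T4 and T6.
    """
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n + 1))

for x in (1.1, 1.25, 1.5, 2.5, 3.0):
    err4 = abs(log(x) - taylor_ln(x, 4))
    err6 = abs(log(x) - taylor_ln(x, 6))
    better = "T6" if err6 < err4 else "T4"
    print(f"x = {x}: |error T4| = {err4:.9f}, |error T6| = {err6:.9f}, better: {better}")
```

For the three points with $|x - 1| < 1$ the sixth degree polynomial wins, while at $x = 2.5$ and $x = 3$ the fourth degree polynomial is closer to $\ln x$, in line with the remark that there is no guarantee of improvement when $|x - x_0| \geq 1$.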
4. The mean value theorem.
When $n = 0$, Equation (3.1) (in Taylor’s theorem) reads
$$f(x) - f(x_0) = f'(\xi)(x - x_0),$$
where $\xi$ is some point between $x$ and $x_0$. This equation is important in its own right, and is called the mean value theorem. In fact, Taylor’s theorem should really be thought of as a generalization of the mean value theorem.¶
¶ The proof of the mean value theorem is actually fairly simple, and the mean value theorem can be used to prove Taylor’s theorem.
Dividing both sides of the equation above by $(x - x_0)$ gives a simple geometric interpretation to the mean value theorem. Namely, the fact that
$$\frac{f(x) - f(x_0)}{x - x_0} = f'(\xi)$$
means that there is a point $\xi$ between $x_0$ and $x$, such that the tangent line to the graph $y = f(x)$ at the point $(\xi, f(\xi))$ is parallel to the secant connecting the points $(x_0, f(x_0))$ and $(x, f(x))$. See Figure 3 for an illustration.
Figure 3: Illustration of the mean value theorem. The points $(x_0, f(x_0))$, $(\xi, f(\xi))$ and $(x, f(x))$ are marked on the graph.
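The theorem only asserts that a point $\xi$ exists, but for a specific function it can be located numerically. The Python sketch below is an illustration under assumptions of my own: the choice $f(x) = \ln x$ with $x_0 = 1$ and $x = 2$, the helper name `find_mvt_point`, and the use of bisection are not from the notes. For this choice, $f'(\xi) = 1/\xi$ must equal the secant slope $\ln 2$, so the exact answer is $\xi = 1/\ln 2$.

```python
from math import log

def find_mvt_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection search for xi in [lo, hi] with f'(xi) = slope.

    Assumes f'(lo) - slope and f'(hi) - slope have opposite signs.
    """
    g_lo = f_prime(lo) - slope
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f_prime(mid) - slope
        if (g_lo > 0) == (g_mid > 0):
            lo, g_lo = mid, g_mid  # sign unchanged: root lies in the right half
        else:
            hi = mid               # sign changed: root lies in the left half
    return (lo + hi) / 2

x0, x = 1.0, 2.0
slope = (log(x) - log(x0)) / (x - x0)   # slope of the secant line
xi = find_mvt_point(lambda t: 1 / t, slope, x0, x)
print(xi, 1 / log(2))  # the two values agree to many decimal places
```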
5. Estimating the error of approximation.
Note: This section may be skipped,‖ but I encourage you to read it to get an idea of how Taylor’s theorem (and similar ideas) can be used in ‘real life’.
When using an approximation method to estimate the value of $f(x)$, it is usually important to have some idea of the size of the error of approximation. The point is that, in reality, we don’t usually know the true value of $f(x)$. Thus, it is often not enough to say that $f(x) \approx A$,∗∗ and we need to have an estimate for the possible size of the error $|f(x) - A|$. In other words, we would like to be able to come up with an inequality of the form $|f(x) - A| < \varepsilon$.
Taylor’s theorem allows us to do just that, by giving a fairly precise formula for the difference
$$R_n(x) = f(x) - T_{f,n}(x)$$
in Equation (3.1). I say ‘fairly precise’, because of the fact that we don’t know the exact value of $\xi$, the point where $f^{(n+1)}(x)$ is evaluated. Nonetheless, in many examples, we can still arrive at decent estimates for the size of the error of approximation, $|R_n(x)|$.††
‖ I.e., you won’t be tested on the material discussed here.
∗∗ For example, when building a bridge or a space shuttle.
Example 5. I’ll illustrate these ideas by using the Taylor polynomial, centered at $x_0 = 25$, for the square root function $f(x) = x^{1/2}$ to compute (approximate) square roots of numbers that are close to 25. I’ll start with the 4th degree Taylor polynomial. I’ll also drop the subscript $f$ from $T_{f,n}$, e.g., I’ll write $T_4(x)$ instead of $T_{\sqrt{x},4}(x)$, since we will be discussing the same function throughout.
First we need the derivatives of $f(x) = \sqrt{x}$ up to and including order 5 (for the remainder). These are:
$$f'(x) = \frac{1}{2x^{1/2}}, \quad f''(x) = \frac{-1}{4x^{3/2}}, \quad f'''(x) = \frac{3}{8x^{5/2}}, \quad f^{(4)}(x) = \frac{-15}{16x^{7/2}} \quad \text{and} \quad f^{(5)}(x) = \frac{105}{32x^{9/2}},$$
as you should check for yourself.
Using $f$, $f'$, $f''$, $f'''$ and $f^{(4)}$, we write out $T_4(x)$ from the definition of the Taylor polynomial, (2.1):
$$\begin{aligned}
T_4(x) &= (25)^{1/2} + \frac{(x - 25)}{(1!) \cdot 2 \cdot (25)^{1/2}} - \frac{(x - 25)^2}{(2!) \cdot 4 \cdot (25)^{3/2}} + \frac{3(x - 25)^3}{(3!) \cdot 8 \cdot (25)^{5/2}} - \frac{15(x - 25)^4}{(4!) \cdot 16 \cdot (25)^{7/2}}\\
&= 5 + \frac{(x - 25)}{10} - \frac{(x - 25)^2}{1000} + \frac{(x - 25)^3}{50000} - \frac{(x - 25)^4}{2000000}.
\end{aligned}$$
Next, we use (3.1) to get a handle on the remainder. According to that formula,
$$R_4(x) = \frac{f^{(5)}(\xi)}{5!}(x - 25)^5 = \frac{105}{(5!) \cdot 32 \cdot \xi^{9/2}}(x - 25)^5,$$
where, as always, $\xi$ is some unknown number between $x$ and 25. The fact that $\xi$ is unknown means that we can’t know the precise value of $R_4(x)$. But we can still use a simple argument to get good upper bounds for $|R_4(x)|$.
First, consider the case that $x > 25$. This means that $\xi > 25$ (because $\xi$ is between $x$ and 25). It follows that $\xi^{9/2} > (25)^{9/2} = 1953125$, so $1/\xi^{9/2} < 1/1953125$. Since $105/(5! \cdot 32) = 7/256$, this implies that
$$R_4(x) = \frac{7}{256\,\xi^{9/2}}(x - 25)^5 < \frac{7}{256 \cdot 1953125}(x - 25)^5 = \frac{7}{500000000}(x - 25)^5.$$
Moreover, the remainder is positive when $x > 25$, meaning that $\sqrt{x} > T_4(x)$, so we obtain both upper and lower bounds for $\sqrt{x}$. I.e., if $x$ is any number greater than 25, then
$$T_4(x) < \sqrt{x} < T_4(x) + \frac{7(x - 25)^5}{500000000}. \tag{5.1}$$
†† Taylor’s form of the remainder, which uses a definite integral, is in a certain sense more precise than the form I am using here. More on that in 11B.
For example,
$$T_4(30) = 5 + \frac{5}{10} - \frac{25}{1000} + \frac{125}{50000} - \frac{625}{2000000} = 5.4771875, \quad \text{and} \quad \frac{7 \cdot (30 - 25)^5}{500000000} = 0.00004375,$$
so (5.1) gives the estimate
$$5.4771875 < \sqrt{30} < 5.47723125.$$
This means that we know with certainty that $\sqrt{30} = 5.477\ldots$.‡‡ If we choose $x$ closer to 25, then we can expect the accuracy to improve, because the factor $(x - 25)^5$ in the remainder term will shrink. For example, $T_4(25.5) = 5.04975246875$, and
$$R_4(25.5) < \frac{7 \cdot (1/2)^5}{500000000} = 0.0000000004375,$$
so (5.1) gives
$$5.04975246875 < \sqrt{25.5} < 5.0497524691875.$$
This implies that $\sqrt{25.5} = 5.04975246\ldots$.
Next, if $x < 25$, then $x < \xi < 25$, so $\xi^{-9/2} < x^{-9/2}$. This means that for $x < 25$,
$$|R_4(x)| = \frac{105\,\xi^{-9/2}|x - 25|^5}{3840} < \frac{7x^{-9/2}|x - 25|^5}{256}. \tag{5.2}$$
I use absolute values here, because $R_4(x)$ is negative when $x < 25$, since the factor $(x - 25)^5$ is negative in this case.
For $x = 16$, this gives a predicted error of no more than
$$|R_4(16)| < \frac{7 \cdot (16)^{-9/2} \cdot 9^5}{256} < 0.00616.$$
The 4th degree Taylor approximation gives
$$T_4(16) = 5 - \frac{9}{10} - \frac{81}{1000} - \frac{729}{50000} - \frac{6561}{2000000} = 4.0011395,$$
so the actual error is about 0.00114. The prediction for the error is correct (since $0.00616 > 0.00114$), and it has the correct order of magnitude (i.e., the same number of zeros after the dot).
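The computations in Example 5 are easy to reproduce. The Python sketch below is illustrative only: the function names `t4_sqrt` and `remainder_bound` are my own, and the bounds coded in them are the ones from (5.1) and (5.2).

```python
def t4_sqrt(x):
    """T4(x) for sqrt(x), centered at x0 = 25 (coefficients from Example 5)."""
    u = x - 25
    return 5 + u / 10 - u**2 / 1000 + u**3 / 50000 - u**4 / 2000000

def remainder_bound(x):
    """Upper bound on |R4(x)|: (5.1) when x > 25, (5.2) when x < 25."""
    if x > 25:
        return 7 * (x - 25) ** 5 / 500000000
    return 7 * x ** (-4.5) * abs(x - 25) ** 5 / 256

for x in (30, 25.5, 16):
    approx = t4_sqrt(x)
    actual_error = abs(x ** 0.5 - approx)
    print(f"x = {x}: T4 = {approx:.13f}, "
          f"actual error = {actual_error:.3e}, bound = {remainder_bound(x):.3e}")
```

In each of the three cases the actual error printed is smaller than the corresponding bound, as the theory predicts.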
If we want to compute more accurate approximate values for $\sqrt{x}$ than the ones given by $T_4(x)$ in the example above, we need to either (i) increase the degree of the approximation (i.e., use $T_n(x)$, with $n > 4$), or (ii) make $|x - x_0|$ smaller, by changing $x_0$, or both.
The second option is actually quite easy to do. Our only requirement for $x_0$, besides being close to $x$, is that it have a known (by which I mean rational) square root.
Example 6. Suppose that we want an approximation of $\sqrt{30}$ that is so accurate that the first 10 decimal digits are known to be correct.
‡‡ By which I mean that we know with certainty what the first three decimal digits of $\sqrt{30}$ are.
If $x_0 < 30$, and $T_4(x)$ is the Taylor polynomial of degree 4 of $\sqrt{x}$, centered at $x_0$, then
$$\sqrt{30} - T_4(30) = \frac{7(30 - x_0)^5}{256\,\xi^{9/2}} < \frac{7(30 - x_0)^5}{256\,x_0^{9/2}},$$
using the formula for the remainder and the same arguments as in Example 5.† By choosing $x_0$ very close to 30, we can make the factor $(30 - x_0)^5$ very small, and get a very accurate approximation.
We already know that $5.4771 < \sqrt{30} < 5.4773$, and to keep things simple, I’ll use $x_0 = (5.47)^2 = 29.9209$, which means that $30 - x_0 = 0.0791 < 0.1$. Notice that by choosing $x_0$ to be the square of a known number, we automatically know $x_0^{1/2}$ ($= 5.47$).
Plugging these numbers into the estimate for the remainder we have
$$R_4(30) < \frac{7 \cdot (0.1)^5}{256 \cdot (5.47)^9} = 0.0000000000000623\ldots.$$
All that remains is to compute $T_4(30)$. Since $f^{(4)}(x_0)/4! = -15/(384\,x_0^{7/2})$, we have
$$\begin{aligned}
T_4(30) &= x_0^{1/2} + \frac{(30 - x_0)}{2x_0^{1/2}} - \frac{(30 - x_0)^2}{8x_0^{3/2}} + \frac{3(30 - x_0)^3}{48x_0^{5/2}} - \frac{15(30 - x_0)^4}{384x_0^{7/2}}\\
&= 5.47 + \frac{0.0791}{10.94} - \frac{(0.0791)^2}{8 \cdot (5.47)^3} + \frac{3 \cdot (0.0791)^3}{48 \cdot (5.47)^5} - \frac{15 \cdot (0.0791)^4}{384 \cdot (5.47)^7}\\
&= 5.47722557505164\ldots
\end{aligned}$$
Since $T_4(30) < \sqrt{30} < T_4(30) + 0.0000000000000623\ldots$, we have
$$5.47722557505164 < \sqrt{30} < 5.47722557505171.$$
The upper and lower bounds for $\sqrt{30}$ have the same first 12 decimal digits, so we know with certainty that these first 12 decimal digits are correct, i.e.,
$$\sqrt{30} = 5.477225575051\ldots.$$
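Because the digits in Example 6 sit at the edge of what a pocket calculator (or ordinary floating point) can resolve, it is worth redoing the arithmetic exactly. The Python sketch below is an illustration, not the method used in the notes: it relies on the standard `fractions` and `decimal` modules, the names `s`, `h`, `t4`, `bound` and `to_decimal` are my own, and the fourth-order coefficient is taken to be $f^{(4)}(x_0)/4! = -15/(384\,x_0^{7/2})$. Since $x_0^{1/2} = 5.47$ is rational, every term of $T_4(30)$ is a rational number.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

getcontext().prec = 40      # working precision for the Decimal conversions

s = Fraction(547, 100)      # s = sqrt(x0) = 5.47, so x0 = s**2 = 29.9209
h = 30 - s**2               # h = 30 - x0 = 0.0791

# T4(30) in exact rational arithmetic; odd powers of s stand for x0^(k/2)
t4 = (s + h / (2 * s) - h**2 / (8 * s**3)
      + 3 * h**3 / (48 * s**5) - 15 * h**4 / (384 * s**7))

# Remainder bound 7 * (0.1)^5 / (256 * 5.47^9), as in the text
bound = 7 * Fraction(1, 10) ** 5 / (256 * s**9)

def to_decimal(q):
    """Convert a Fraction to a Decimal at the current context precision."""
    return Decimal(q.numerator) / Decimal(q.denominator)

lower = to_decimal(t4)
upper = to_decimal(t4 + bound)
print(lower)
print(upper)
print(lower < Decimal(30).sqrt() < upper)  # True
```

Both printed bounds begin with 5.477225575051, so the first 12 decimal digits of $\sqrt{30}$ are pinned down.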
† I.e., $x_0 < \xi$, so $1/x_0^{9/2} > 1/\xi^{9/2}$.