Orthogonal Polynomials and Least Squares Approximations, cont'd

Jim Lambers
MAT 460/560
Fall Semester 2009-10
Lecture 37 Notes
These notes correspond to Section 8.2 in the text.
Previously, we learned that the problem of finding the polynomial $f_n(x)$, of degree $n$, that best approximates a function $f(x)$ on an interval $[a, b]$ in the least squares sense, i.e., that minimizes
$$\|f_n - f\| = \left( \int_a^b [f_n(x) - f(x)]^2 \, dx \right)^{1/2},$$
is easy to solve if we represent $f_n(x)$ as a linear combination of orthogonal polynomials,
$$f_n(x) = \sum_{j=0}^n c_j p_j(x).$$
Each polynomial $p_j(x)$ is of degree $j$, and the polynomials $p_0(x), p_1(x), \ldots, p_n(x)$ are orthogonal with respect to the inner product
$$\langle f, g \rangle = \int_a^b f(x) g(x) \, dx.$$
That is,
$$\langle p_k, p_j \rangle = \int_a^b p_k(x) p_j(x) \, dx = 0, \quad k \neq j.$$
Given this sequence of orthogonal polynomials, the coefficients $c_j$ in the linear combination used to compute $f_n(x)$ are given by
$$c_j = \frac{\langle p_j, f \rangle}{\langle p_j, p_j \rangle}, \quad j = 0, 1, \ldots, n.$$
Now, we focus on the task of finding such a sequence of orthogonal polynomials.
Recall the process known as Gram-Schmidt orthogonalization for obtaining a set of orthogonal vectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ from a set of linearly independent vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$:
$$\mathbf{p}_1 = \mathbf{a}_1,$$
$$\mathbf{p}_2 = \mathbf{a}_2 - \frac{\mathbf{p}_1 \cdot \mathbf{a}_2}{\mathbf{p}_1 \cdot \mathbf{p}_1} \mathbf{p}_1,$$
$$\vdots$$
$$\mathbf{p}_n = \mathbf{a}_n - \sum_{j=1}^{n-1} \frac{\mathbf{p}_j \cdot \mathbf{a}_n}{\mathbf{p}_j \cdot \mathbf{p}_j} \mathbf{p}_j.$$
By normalizing each vector $\mathbf{p}_j$, we obtain a unit vector
$$\mathbf{q}_j = \frac{1}{\|\mathbf{p}_j\|} \mathbf{p}_j,$$
and a set of orthonormal vectors $\{\mathbf{q}_j\}_{j=1}^n$: they are orthogonal ($\mathbf{q}_k \cdot \mathbf{q}_j = 0$ for $k \neq j$) and unit vectors ($\mathbf{q}_j \cdot \mathbf{q}_j = 1$).
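The process above translates directly into code. The following sketch (an illustration, not from the notes) applies classical Gram-Schmidt to the columns of a matrix and normalizes the result; the matrix `A` is a made-up example of linearly independent vectors.

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt on the columns of A (assumed linearly
    independent): returns orthogonal columns P and orthonormal columns Q."""
    n = A.shape[1]
    P = np.zeros_like(A, dtype=float)
    for k in range(n):
        p = A[:, k].astype(float)
        # subtract the projections onto the previously computed p_j
        for j in range(k):
            p -= (P[:, j] @ A[:, k]) / (P[:, j] @ P[:, j]) * P[:, j]
        P[:, k] = p
    Q = P / np.linalg.norm(P, axis=0)   # normalize each column
    return P, Q

# example: three linearly independent vectors in R^3
A = np.array([[1., 1., 0.],
              [1., 0., 1.],
              [0., 1., 1.]])
P, Q = gram_schmidt(A)
print(np.round(Q.T @ Q, 10))  # should print the 3x3 identity matrix
```

Note that `Q.T @ Q` being the identity is exactly the statement that the $\mathbf{q}_j$ are orthonormal.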
We can use a similar process to compute a set of orthogonal polynomials. For simplicity, we will require that all polynomials in the set be monic; that is, their leading (highest-degree) coefficient must equal 1. We then define $p_0(x) = 1$. Then, because $p_1(x)$ is supposed to be of degree 1, it must have the form $p_1(x) = x - \alpha_1$ for some constant $\alpha_1$. To ensure that $p_1(x)$ is orthogonal to $p_0(x)$, we compute their inner product, and obtain
$$0 = \langle p_0, p_1 \rangle = \langle 1, x - \alpha_1 \rangle,$$
so we must have
$$\alpha_1 = \frac{\langle 1, x \rangle}{\langle 1, 1 \rangle}.$$
For 𝑗 > 1, we start by setting 𝑝𝑗 (π‘₯) = π‘₯π‘π‘—βˆ’1 (π‘₯), since 𝑝𝑗 should be of degree one greater
than that of π‘π‘—βˆ’1 , and this satisfies the requirement that 𝑝𝑗 be monic. Then, we need to subtract
polynomials of lower degree to ensure that 𝑝𝑗 is orthogonal to 𝑝𝑖 , for 𝑖 < 𝑗. To that end, we apply
Gram-Schmidt orthogonalization and obtain
𝑝𝑗 (π‘₯) = π‘₯π‘π‘—βˆ’1 (π‘₯) βˆ’
π‘—βˆ’1
βˆ‘
βŸ¨π‘π‘– , π‘₯π‘π‘—βˆ’1 ⟩
𝑖=0
βŸ¨π‘π‘– , 𝑝𝑖 ⟩
𝑝𝑖 (π‘₯).
However, by the definition of the inner product, βŸ¨π‘π‘– , π‘₯π‘π‘—βˆ’1 ⟩ = ⟨π‘₯𝑝𝑖 , π‘π‘—βˆ’1 ⟩. Furthermore, because
π‘₯𝑝𝑖 is of degree 𝑖 + 1, and π‘π‘—βˆ’1 is orthogonal to all polynomials of degree less than 𝑗, it follows that
βŸ¨π‘π‘– , π‘₯π‘π‘—βˆ’1 ⟩ = 0 whenever 𝑖 < 𝑗 βˆ’ 1.
We have shown that sequences of orthogonal polynomials satisfy a three-term recurrence relation
$$p_j(x) = (x - \alpha_j) p_{j-1}(x) - \beta_{j-1}^2 p_{j-2}(x), \quad j > 1,$$
where the recursion coefficients $\alpha_j$ and $\beta_{j-1}^2$ are defined to be
$$\alpha_j = \frac{\langle p_{j-1}, x p_{j-1} \rangle}{\langle p_{j-1}, p_{j-1} \rangle}, \quad j > 1,$$
$$\beta_j^2 = \frac{\langle p_{j-1}, x p_j \rangle}{\langle p_{j-1}, p_{j-1} \rangle} = \frac{\langle x p_{j-1}, p_j \rangle}{\langle p_{j-1}, p_{j-1} \rangle} = \frac{\langle p_j, p_j \rangle}{\langle p_{j-1}, p_{j-1} \rangle} = \frac{\|p_j\|^2}{\|p_{j-1}\|^2}, \quad j \geq 1.$$
Note that $\langle x p_{j-1}, p_j \rangle = \langle p_j, p_j \rangle$ because $x p_{j-1}$ differs from $p_j$ by a polynomial of degree at most $j - 1$, which is orthogonal to $p_j$. The recurrence relation is also valid for $j = 1$, provided that we define $p_{-1}(x) \equiv 0$, and $\alpha_1$ is defined as above. That is,
$$p_1(x) = (x - \alpha_1) p_0(x), \quad \alpha_1 = \frac{\langle p_0, x p_0 \rangle}{\langle p_0, p_0 \rangle}.$$
If we also define the recursion coefficient $\beta_0$ by
$$\beta_0^2 = \langle p_0, p_0 \rangle,$$
and then define
$$q_j(x) = \frac{p_j(x)}{\beta_0 \beta_1 \cdots \beta_j},$$
then the polynomials $q_0, q_1, \ldots, q_n$ are also orthogonal, and
$$\langle q_j, q_j \rangle = \frac{\langle p_j, p_j \rangle}{\beta_0^2 \beta_1^2 \cdots \beta_j^2} = \langle p_j, p_j \rangle \cdot \frac{\langle p_{j-1}, p_{j-1} \rangle}{\langle p_j, p_j \rangle} \cdots \frac{\langle p_0, p_0 \rangle}{\langle p_1, p_1 \rangle} \cdot \frac{1}{\langle p_0, p_0 \rangle} = 1.$$
That is, these polynomials are orthonormal.
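The three-term recurrence and the normalization by $\beta_0 \beta_1 \cdots \beta_j$ can be checked numerically. The sketch below is an illustration, not code from the notes; it assumes $[a, b] = [-1, 1]$, builds the monic $p_j$ from the recurrence, and verifies that the resulting $q_j$ are orthonormal.

```python
import math
from numpy.polynomial import Polynomial as P

def inner(f, g, a=-1.0, b=1.0):
    """<f, g> = integral_a^b f(x) g(x) dx, exact for polynomials."""
    F = (f * g).integ()
    return F(b) - F(a)

def orthonormal_polys(n, a=-1.0, b=1.0):
    """Monic p_j via p_j = (x - alpha_j) p_{j-1} - beta_{j-1}^2 p_{j-2},
    then q_j = p_j / (beta_0 beta_1 ... beta_j)."""
    x = P([0.0, 1.0])
    ps = [P([1.0])]                                   # p_0(x) = 1
    for j in range(1, n + 1):
        alpha = inner(ps[-1], x * ps[-1], a, b) / inner(ps[-1], ps[-1], a, b)
        p = (x - alpha) * ps[-1]
        if j > 1:
            # beta_{j-1}^2 = <p_{j-1}, p_{j-1}> / <p_{j-2}, p_{j-2}>
            p -= inner(ps[-1], ps[-1], a, b) / inner(ps[-2], ps[-2], a, b) * ps[-2]
        ps.append(p)
    qs, prod = [], 1.0
    for j, p in enumerate(ps):
        # beta_0^2 = <p_0, p_0>; beta_j^2 = <p_j, p_j> / <p_{j-1}, p_{j-1}>
        beta2 = inner(p, p, a, b) if j == 0 else \
            inner(p, p, a, b) / inner(ps[j-1], ps[j-1], a, b)
        prod *= math.sqrt(beta2)
        qs.append(p / prod)
    return ps, qs

ps, qs = orthonormal_polys(3)
print([round(inner(q, q), 12) for q in qs])  # each entry should be 1.0
```

On $[-1, 1]$ this reproduces the monic Legendre polynomials $1$, $x$, $x^2 - \tfrac{1}{3}$, $x^3 - \tfrac{3}{5}x$, up to the final normalization.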
If we consider the inner product
$$\langle f, g \rangle = \int_{-1}^1 f(x) g(x) \, dx,$$
then a sequence of orthogonal polynomials, with respect to this inner product, can be defined as follows:
$$L_0(x) = 1, \quad L_1(x) = x, \quad L_{j+1}(x) = \frac{2j+1}{j+1} x L_j(x) - \frac{j}{j+1} L_{j-1}(x), \quad j = 1, 2, \ldots.$$
These are known as the Legendre polynomials. One of their most important applications is in the
construction of Gaussian quadrature rules. Specifically, the roots of 𝐿𝑛 (π‘₯), for 𝑛 β‰₯ 1, are the nodes
of a Gaussian quadrature rule for the interval [βˆ’1, 1]. However, they can also be used to easily
compute continuous least-squares polynomial approximations, as the following example shows.
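As a quick check of the recurrence above (an illustrative sketch, not from the notes), we can generate the first few Legendre polynomials and confirm their orthogonality on $[-1, 1]$:

```python
from numpy.polynomial import Polynomial as P

def inner(f, g):
    """<f, g> on [-1, 1], exact for polynomials."""
    F = (f * g).integ()
    return F(1.0) - F(-1.0)

def legendre(n):
    """L_0,...,L_n from L_{j+1} = (2j+1)/(j+1) x L_j - j/(j+1) L_{j-1}."""
    x = P([0.0, 1.0])
    Ls = [P([1.0]), x]
    for j in range(1, n):
        Ls.append((2*j + 1) / (j + 1) * x * Ls[j] - j / (j + 1) * Ls[j-1])
    return Ls[:n+1]

Ls = legendre(4)
print(Ls[2](1.0))                      # L_2(1) = 1
print(round(inner(Ls[1], Ls[2]), 12))  # 0.0: L_1 and L_2 are orthogonal
```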
Example We will use Legendre polynomials to approximate $f(x) = \cos x$ on $[-\pi/2, \pi/2]$ by a quadratic polynomial. First, we note that the first three Legendre polynomials, which are the ones of degree 0, 1 and 2, are
$$L_0(x) = 1, \quad L_1(x) = x, \quad L_2(x) = \frac{1}{2}(3x^2 - 1).$$
However, it is not practical to use these polynomials directly to approximate 𝑓 (π‘₯), because they
are orthogonal with respect to the inner product defined on the interval [βˆ’1, 1], and we wish to
approximate 𝑓 (π‘₯) on [βˆ’πœ‹/2, πœ‹/2].
To obtain orthogonal polynomials on $[-\pi/2, \pi/2]$, we replace $x$ by $2t/\pi$, where $t$ belongs to $[-\pi/2, \pi/2]$, in the Legendre polynomials, which yields
$$\tilde{L}_0(t) = 1, \quad \tilde{L}_1(t) = \frac{2t}{\pi}, \quad \tilde{L}_2(t) = \frac{1}{2} \left( \frac{12}{\pi^2} t^2 - 1 \right).$$
Then, we can express our quadratic approximation $f_2(x)$ of $f(x)$ by the linear combination
$$f_2(x) = c_0 \tilde{L}_0(x) + c_1 \tilde{L}_1(x) + c_2 \tilde{L}_2(x),$$
where
$$c_j = \frac{\langle f, \tilde{L}_j \rangle}{\langle \tilde{L}_j, \tilde{L}_j \rangle}, \quad j = 0, 1, 2.$$
Computing these inner products yields
$$\langle f, \tilde{L}_0 \rangle = \int_{-\pi/2}^{\pi/2} \cos t \, dt = 2,$$
$$\langle f, \tilde{L}_1 \rangle = \int_{-\pi/2}^{\pi/2} \frac{2t}{\pi} \cos t \, dt = 0,$$
$$\langle f, \tilde{L}_2 \rangle = \int_{-\pi/2}^{\pi/2} \frac{1}{2} \left( \frac{12}{\pi^2} t^2 - 1 \right) \cos t \, dt = \frac{2}{\pi^2} (\pi^2 - 12),$$
$$\langle \tilde{L}_0, \tilde{L}_0 \rangle = \int_{-\pi/2}^{\pi/2} 1 \, dt = \pi,$$
$$\langle \tilde{L}_1, \tilde{L}_1 \rangle = \int_{-\pi/2}^{\pi/2} \left( \frac{2t}{\pi} \right)^2 dt = \frac{\pi}{3},$$
$$\langle \tilde{L}_2, \tilde{L}_2 \rangle = \int_{-\pi/2}^{\pi/2} \left[ \frac{1}{2} \left( \frac{12}{\pi^2} t^2 - 1 \right) \right]^2 dt = \frac{\pi}{5}.$$
It follows that
$$c_0 = \frac{2}{\pi}, \quad c_1 = 0, \quad c_2 = \frac{5}{\pi} \cdot \frac{2}{\pi^2} (\pi^2 - 12) = \frac{10}{\pi^3} (\pi^2 - 12),$$
and therefore
$$f_2(x) = \frac{2}{\pi} + \frac{5}{\pi^3} (\pi^2 - 12) \left( \frac{12}{\pi^2} x^2 - 1 \right) \approx 0.98016 - 0.4177 x^2.$$
This approximation is shown in Figure 1. β–‘
Figure 1: Graph of cos π‘₯ (solid blue curve) and its continuous least-squares quadratic approximation
(red dashed curve) on [βˆ’πœ‹/2, πœ‹/2]
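The coefficients in this example can be reproduced numerically. The sketch below (an illustration, not from the notes) computes each $c_j$ with `scipy.integrate.quad` and expands $f_2$ in powers of $x$, recovering the approximation above.

```python
import math
from scipy.integrate import quad

# scaled Legendre polynomials L~_j on [-pi/2, pi/2]
Lt = [lambda t: 1.0,
      lambda t: 2 * t / math.pi,
      lambda t: 0.5 * (12 * t**2 / math.pi**2 - 1)]

a, b = -math.pi / 2, math.pi / 2
f = math.cos

# c_j = <f, L~_j> / <L~_j, L~_j>
c = []
for Lj in Lt:
    num, _ = quad(lambda t: f(t) * Lj(t), a, b)
    den, _ = quad(lambda t: Lj(t) ** 2, a, b)
    c.append(num / den)

# expand f2 = c0 L~_0 + c2 L~_2 in powers of x (c1 vanishes by symmetry)
const = c[0] - 0.5 * c[2]
x2 = 6 * c[2] / math.pi**2
print(round(const, 5), round(x2, 4))  # approximately 0.98016 and -0.4177
```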
It is possible to compute sequences of orthogonal polynomials with respect to other inner products. A generalization of the inner product that we have been using is defined by
$$\langle f, g \rangle = \int_a^b f(x) g(x) w(x) \, dx,$$
where $w(x)$ is a weight function. To be a weight function, it is required that $w(x) \geq 0$ on $(a, b)$, and that $w(x)$ not be identically zero on any subinterval of $(a, b)$. So far, we have only considered the case of $w(x) \equiv 1$.
Another weight function of interest is
$$w(x) = \frac{1}{\sqrt{1 - x^2}}, \quad -1 < x < 1.$$
A sequence of polynomials that is orthogonal with respect to this weight function, and the associated inner product
$$\langle f, g \rangle = \int_{-1}^1 f(x) g(x) \frac{dx}{\sqrt{1 - x^2}},$$
is the sequence of Chebyshev polynomials
$$C_0(x) = 1, \quad C_1(x) = x, \quad C_{j+1}(x) = 2x C_j(x) - C_{j-1}(x), \quad j = 1, 2, \ldots,$$
which can also be defined by
$$C_j(x) = \cos(j \cos^{-1} x), \quad -1 \leq x \leq 1.$$
It is interesting to note that if we let $x = \cos \theta$, then
$$\langle f, C_j \rangle = \int_{-1}^1 f(x) \cos(j \cos^{-1} x) \frac{dx}{\sqrt{1 - x^2}} = \int_0^\pi f(\cos \theta) \cos j\theta \, d\theta.$$
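Both definitions of $C_j$, and the change of variables $x = \cos \theta$, can be verified numerically. In the sketch below (an illustration, not from the notes; the choices $f(x) = e^x$ and $j = 3$ are arbitrary), the singular weight is handled with `quad`'s `weight='alg'` option.

```python
import math
from scipy.integrate import quad

def cheb(j, x):
    """C_j(x) via the recurrence C_{j+1} = 2x C_j - C_{j-1}."""
    if j == 0:
        return 1.0
    c_prev, c = 1.0, x
    for _ in range(j - 1):
        c_prev, c = c, 2 * x * c - c_prev
    return c

# the recurrence agrees with the closed form cos(j arccos x)
for x in (-0.9, -0.3, 0.2, 0.7):
    assert abs(cheb(5, x) - math.cos(5 * math.acos(x))) < 1e-12

# <f, C_j> computed both ways; weight='alg' with wvar=(-0.5, -0.5)
# supplies the factor 1/sqrt(1 - x^2) on [-1, 1]
f = math.exp
lhs, _ = quad(lambda x: f(x) * cheb(3, x), -1, 1,
              weight='alg', wvar=(-0.5, -0.5))
rhs, _ = quad(lambda th: f(math.cos(th)) * math.cos(3 * th), 0, math.pi)
print(abs(lhs - rhs) < 1e-6)  # True
```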
In later lectures, we will investigate continuous and discrete least-squares approximation of functions
by linear combinations of trigonometric polynomials such as cos π‘—πœƒ or sin π‘—πœƒ, which will reveal one
of the most useful applications of Chebyshev polynomials.