Jim Lambers
MAT 772
Fall Semester 2010-11
Lecture 12 Notes
These notes correspond to Sections 9.2 and 9.3 in the text.
Best Approximation in the 2-norm
Suppose that we wish to obtain a function $f_n(x)$ that is a linear combination of given functions $\{\varphi_j(x)\}_{j=0}^n$, and best fits a function $f(x)$ at a discrete set of data points $\{(x_i, f(x_i))\}_{i=1}^m$ in a least-squares sense. That is, we wish to find constants $\{c_j\}_{j=0}^n$ such that
$$\sum_{i=1}^m [f_n(x_i) - f(x_i)]^2 = \sum_{i=1}^m \left[ \sum_{j=0}^n c_j \varphi_j(x_i) - f(x_i) \right]^2$$
is minimized. This can be accomplished by solving a system of $n+1$ linear equations for the $\{c_j\}$, known as the normal equations.
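As a concrete sketch of the discrete case, the following Python code (assuming NumPy and the monomial basis $\varphi_j(x) = x^j$, which the text has not yet fixed; both are illustrative choices) assembles and solves the normal equations for a small data set:

```python
import numpy as np

def discrete_least_squares(x, y, n):
    """Fit sum_{j=0}^n c_j x**j to data (x_i, y_i) via the normal equations.

    The normal-equations matrix has entries A[k, j] = sum_i phi_j(x_i) phi_k(x_i)
    and right-hand side b[k] = sum_i phi_k(x_i) y_i.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # V[i, j] = phi_j(x_i) = x_i**j, so A = V^T V and b = V^T y
    V = np.vander(x, n + 1, increasing=True)
    A = V.T @ V
    b = V.T @ y
    return np.linalg.solve(A, b)

# Data sampled from the exact quadratic y = 1 + x^2 is reproduced
# (up to roundoff), since it lies in the span of the basis.
c = discrete_least_squares([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 5.0, 10.0], 2)
```

Here the fit recovers the coefficients $(1, 0, 1)$ of $1 + x^2$ exactly, because the data already lies in the span of the basis functions.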
Now, suppose we have a continuous set of data. That is, we have a function $f(x)$ defined on an interval $[a,b]$, and we wish to approximate it as closely as possible, in some sense, by a function $f_n(x)$ that is a linear combination of given functions $\{\varphi_j(x)\}_{j=0}^n$. If we choose $m$ equally spaced points $\{x_i\}_{i=1}^m$ in $[a,b]$, and let $m \to \infty$, we obtain the continuous least-squares problem of finding the function
$$f_n(x) = \sum_{j=0}^n c_j \varphi_j(x)$$
that minimizes
$$E(c_0, c_1, \ldots, c_n) = \int_a^b [f_n(x) - f(x)]^2\,dx = \int_a^b \left[ \sum_{j=0}^n c_j \varphi_j(x) - f(x) \right]^2 dx.$$
To obtain the coefficients $\{c_j\}_{j=0}^n$, we can proceed as in the discrete case. We compute the partial derivative of $E(c_0, c_1, \ldots, c_n)$ with respect to each $c_k$ and obtain
$$\frac{\partial E}{\partial c_k} = 2 \int_a^b \varphi_k(x) \left[ \sum_{j=0}^n c_j \varphi_j(x) - f(x) \right] dx,$$
and requiring that each partial derivative be equal to zero yields the normal equations
$$\sum_{j=0}^n \left[ \int_a^b \varphi_j(x) \varphi_k(x)\,dx \right] c_j = \int_a^b \varphi_k(x) f(x)\,dx, \quad k = 0, 1, \ldots, n.$$
We can then solve this system of equations to obtain the coefficients $\{c_j\}_{j=0}^n$. This system can be solved as long as the functions $\{\varphi_j(x)\}_{j=0}^n$ are linearly independent. That is, the condition
$$\sum_{j=0}^n c_j \varphi_j(x) \equiv 0, \quad x \in [a,b],$$
is only true if $c_0 = c_1 = \cdots = c_n = 0$. In particular, this is the case if, for $j = 0, 1, \ldots, n$, $\varphi_j(x)$ is a polynomial of degree $j$. This can be proved using a simple inductive argument.
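Linear independence can also be checked through the Gram matrix $G_{jk} = \langle \varphi_j, \varphi_k \rangle$, which is nonsingular exactly when the functions are linearly independent. A small sketch (an illustrative choice: monomials on $[0,1]$, where $G_{jk} = \int_0^1 x^{j+k}\,dx = 1/(j+k+1)$, computed in exact rational arithmetic):

```python
from fractions import Fraction

def gram_monomials(n):
    """Gram matrix of {1, x, ..., x^n} on [0, 1]:
    G[j][k] = integral_0^1 x^(j+k) dx = 1/(j+k+1)."""
    return [[Fraction(1, j + k + 1) for k in range(n + 1)] for j in range(n + 1)]

def det(M):
    """Determinant by cofactor expansion along the first row (exact for Fractions)."""
    m = len(M)
    if m == 1:
        return M[0][0]
    total = Fraction(0)
    for k in range(m):
        minor = [row[:k] + row[k + 1:] for row in M[1:]]
        total += (-1) ** k * M[0][k] * det(minor)
    return total

# A nonzero determinant confirms that {1, x, x^2} are linearly independent on [0, 1].
d = det(gram_monomials(2))
```

Note that this Gram matrix is precisely a Hilbert matrix, which foreshadows the conditioning trouble in the example that follows: nonsingular in exact arithmetic, but nearly singular numerically.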
Example We approximate $f(x) = e^x$ on the interval $[0,5]$ by a fourth-degree polynomial
$$f_4(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4.$$
The normal equations have the form
$$\sum_{j=0}^4 a_{kj} c_j = b_k, \quad k = 0, 1, \ldots, 4,$$
or, in matrix-vector form, $A\mathbf{c} = \mathbf{b}$, where
$$a_{kj} = \int_0^5 x^k x^j\,dx = \int_0^5 x^{j+k}\,dx = \frac{5^{j+k+1}}{j+k+1}, \quad j, k = 0, 1, \ldots, 4,$$
$$b_k = \int_0^5 x^k e^x\,dx, \quad k = 0, 1, \ldots, 4.$$
Integration by parts yields the recurrence
$$b_k = 5^k e^5 - k b_{k-1}, \quad b_0 = e^5 - 1.$$
Solving this system of equations yields the polynomial
$$f_4(x) = 2.3002 - 6.226x + 9.5487x^2 - 3.86x^3 + 0.6704x^4.$$
As Figure 1 shows, this polynomial is barely distinguishable from $e^x$ on $[0,5]$.
However, it should be noted that the matrix $A$ is closely related to the $n \times n$ Hilbert matrix $H_n$, which has entries
$$[H_n]_{ij} = \frac{1}{i+j-1}, \quad 1 \le i, j \le n.$$
This matrix is famous for being highly ill-conditioned, meaning that solutions to systems of linear equations involving this matrix that are computed using floating-point arithmetic are highly sensitive to roundoff error. In fact, the matrix $A$ in this example has a condition number of $1.56 \times 10^7$, which means that a relative change of size $\epsilon$ in the right-hand side vector $\mathbf{b}$, with entries $b_k$, can cause a relative change as large as $1.56 \times 10^7 \epsilon$ in the solution $\mathbf{c}$. $\square$
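The computation in this example can be reproduced numerically. The sketch below (assuming NumPy) builds $A$ from the closed-form integrals, fills $\mathbf{b}$ with the integration-by-parts recurrence, and checks both the quality of the fit and the ill-conditioning of $A$:

```python
import math
import numpy as np

n = 4  # degree of the approximating polynomial

# a_kj = integral_0^5 x^(j+k) dx = 5^(j+k+1)/(j+k+1)
A = np.array([[5.0 ** (j + k + 1) / (j + k + 1) for j in range(n + 1)]
              for k in range(n + 1)])

# b_k = integral_0^5 x^k e^x dx via the recurrence b_k = 5^k e^5 - k b_{k-1}
b = np.empty(n + 1)
b[0] = math.exp(5) - 1
for k in range(1, n + 1):
    b[k] = 5.0 ** k * math.exp(5) - k * b[k - 1]

c = np.linalg.solve(A, b)

# The computed polynomial stays close to e^x across [0, 5] ...
xs = np.linspace(0.0, 5.0, 200)
p = sum(c[j] * xs ** j for j in range(n + 1))
max_err = np.max(np.abs(p - np.exp(xs)))

# ... even though A is severely ill-conditioned (about 1.56e7 per the text).
cond = np.linalg.cond(A)
```

Despite the large condition number, five unknowns in double precision still leave several accurate digits; the conditioning problem becomes fatal only as the degree grows.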
Figure 1: Graphs of $f(x) = e^x$ (red dashed curve) and fourth-degree continuous least-squares polynomial approximation $f_4(x)$ on $[0,5]$ (blue solid curve)
Inner Product Spaces
As the preceding example shows, it is important to choose the functions $\{\varphi_j(x)\}_{j=0}^n$ wisely, so that
the resulting system of normal equations is not unduly sensitive to round-oο¬ errors. An even better
choice is one for which this system can be solved analytically, with relatively few computations. An
ideal choice of functions is one for which the task of computing $f_{n+1}(x)$ can reuse the computations needed to compute $f_n(x)$.
To that end, recall that two $n$-vectors $\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle$ and $\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle$ are orthogonal if
$$\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^n u_i v_i = 0,$$
where $\mathbf{u} \cdot \mathbf{v}$ is the dot product, or inner product, of $\mathbf{u}$ and $\mathbf{v}$.
By viewing functions defined on an interval $[a,b]$ as infinitely long vectors, we can generalize the inner product, and the concept of orthogonality, to functions. To that end, we define the inner product of two real-valued functions $f(x)$ and $g(x)$ defined on the interval $[a,b]$ by
$$\langle f, g \rangle = \int_a^b f(x) g(x)\,dx.$$
Then, we say $f$ and $g$ are orthogonal with respect to this inner product if $\langle f, g \rangle = 0$.
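This inner product is easy to approximate numerically. The sketch below (an illustrative choice: the midpoint rule, applied to the classical example of $\sin x$ and $\cos x$ on $[-\pi, \pi]$) confirms that those two functions are orthogonal while $\sin x$ is not orthogonal to itself:

```python
import math

def inner(f, g, a, b, m=20000):
    """Approximate <f, g> = integral_a^b f(x) g(x) dx by the composite midpoint rule."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(m))

# sin and cos are orthogonal on [-pi, pi] ...
ip_sin_cos = inner(math.sin, math.cos, -math.pi, math.pi)

# ... but <sin, sin> = integral of sin^2 over a full period = pi > 0
ip_sin_sin = inner(math.sin, math.sin, -math.pi, math.pi)
```

This pair of orthogonal functions is exactly the mechanism behind Fourier series, where the basis $\{1, \cos kx, \sin kx\}$ is orthogonal on $[-\pi, \pi]$.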
In general, an inner product on a vector space $V$ over $\mathbb{R}$, be it continuous or discrete, has the following properties:
1. $\langle f + g, h \rangle = \langle f, h \rangle + \langle g, h \rangle$ for all $f, g, h \in V$
2. $\langle cf, g \rangle = c \langle f, g \rangle$ for all $c \in \mathbb{R}$ and all $f, g \in V$
3. $\langle f, g \rangle = \langle g, f \rangle$ for all $f, g \in V$
4. $\langle f, f \rangle \ge 0$ for all $f \in V$, and $\langle f, f \rangle = 0$ if and only if $f = 0$.
This inner product can be used to define the norm of a function, which generalizes the concept of the magnitude of a vector to functions, and therefore provides a measure of the "magnitude" of a function. Recall that the magnitude of a vector $\mathbf{v}$, denoted by $\|\mathbf{v}\|$, can be defined by
$$\|\mathbf{v}\| = (\mathbf{v} \cdot \mathbf{v})^{1/2}.$$
Along similar lines, we define the 2-norm of a function $f(x)$ defined on $[a,b]$ by
$$\|f\|_2 = (\langle f, f \rangle)^{1/2} = \left( \int_a^b [f(x)]^2\,dx \right)^{1/2}.$$
As we will see, it can be verified that this function does in fact satisfy the properties required of a norm. The continuous least-squares problem can then be described as the problem of finding
$$f_n(x) = \sum_{j=0}^n c_j \varphi_j(x)$$
such that
$$\|f_n - f\|_2 = \left( \int_a^b [f_n(x) - f(x)]^2\,dx \right)^{1/2}$$
is minimized. This minimization can be performed over $C[a,b]$, the space of functions that are continuous on $[a,b]$, but it is not necessary for a function $f(x)$ to be continuous for $\|f\|_2$ to be defined. Rather, we consider the space $L^2(a,b)$, the space of real-valued functions such that $|f(x)|^2$ is integrable over $(a,b)$.
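A short numerical sketch of the 2-norm (again using the midpoint rule, an illustrative choice): $\|x\|_2 = 1/\sqrt{3}$ on $[0,1]$, and, echoing the remark about $L^2(a,b)$, a discontinuous step function still has a perfectly well-defined 2-norm:

```python
import math

def norm2(f, a, b, m=20000):
    """2-norm of f on [a, b]: (integral_a^b f(x)^2 dx)^(1/2), midpoint rule."""
    h = (b - a) / m
    s = sum(f(a + (i + 0.5) * h) ** 2 for i in range(m))
    return math.sqrt(h * s)

# ||x||_2 on [0, 1] = (integral_0^1 x^2 dx)^(1/2) = 1/sqrt(3)
nx = norm2(lambda x: x, 0.0, 1.0)

# A step function is not in C[0, 1], but it is in L^2(0, 1):
# its squared integral is 0.5, so its 2-norm is sqrt(0.5).
nstep = norm2(lambda x: 1.0 if x >= 0.5 else 0.0, 0.0, 1.0)
```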
One very important property that $\|\cdot\|_2$ has is that it satisfies the Cauchy-Schwarz inequality
$$|\langle f, g \rangle| \le \|f\|_2 \|g\|_2, \quad f, g \in V.$$
This can be proven by noting that for any scalar $c \in \mathbb{R}$,
$$c^2 \|f\|_2^2 + 2c \langle f, g \rangle + \|g\|_2^2 = \|cf + g\|_2^2 \ge 0.$$
The left side is a quadratic polynomial in $c$. In order for this polynomial to not have any negative values, it must either have complex roots or a double real root. This is the case if the discriminant satisfies
$$4\langle f, g \rangle^2 - 4\|f\|_2^2 \|g\|_2^2 \le 0,$$
from which the Cauchy-Schwarz inequality immediately follows. By setting $c = 1$ and applying this inequality, we immediately obtain the triangle-inequality property of norms.
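The Cauchy-Schwarz inequality is easy to observe numerically. A sketch (illustrative choices: $f(x) = x$ and $g(x) = e^x$ on $[0,1]$, with midpoint-rule quadrature), where $\langle f, g \rangle = \int_0^1 x e^x\,dx = 1$ by integration by parts:

```python
import math

def inner(f, g, a, b, m=10000):
    """Midpoint-rule approximation of <f, g> = integral_a^b f(x) g(x) dx."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(m))

def norm2(f, a, b, m=10000):
    """2-norm of f on [a, b], induced by the inner product."""
    return math.sqrt(inner(f, f, a, b, m))

# |<f, g>| <= ||f||_2 ||g||_2 for f(x) = x, g(x) = exp(x) on [0, 1]
f = lambda x: x
g = math.exp
lhs = abs(inner(f, g, 0.0, 1.0))   # exact value: integral_0^1 x e^x dx = 1
rhs = norm2(f, 0.0, 1.0) * norm2(g, 0.0, 1.0)
```

Here `lhs` is approximately $1$, while `rhs` $= (1/\sqrt{3})\sqrt{(e^2-1)/2} \approx 1.03$, so the inequality holds with a little room to spare; equality occurs only when $f$ and $g$ are scalar multiples of each other.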
Suppose that we can construct a set of functions $\{\varphi_j(x)\}_{j=0}^n$ that is orthogonal with respect to the inner product of functions on $[a,b]$. That is,
$$\langle \varphi_j, \varphi_k \rangle = \int_a^b \varphi_j(x) \varphi_k(x)\,dx = \begin{cases} 0 & j \ne k \\ \alpha_j > 0 & j = k \end{cases}.$$
Then, the normal equations simplify to a trivial system
$$\left[ \int_a^b [\varphi_k(x)]^2\,dx \right] c_k = \int_a^b \varphi_k(x) f(x)\,dx, \quad k = 0, 1, \ldots, n,$$
or, in terms of norms and inner products,
$$\|\varphi_k\|_2^2\, c_k = \langle \varphi_k, f \rangle, \quad k = 0, 1, \ldots, n.$$
It follows that the coefficients $\{c_j\}_{j=0}^n$ of the least-squares approximation $f_n(x)$ are simply
$$c_k = \frac{\langle \varphi_k, f \rangle}{\|\varphi_k\|_2^2}, \quad k = 0, 1, \ldots, n.$$
If the constants $\{\alpha_j\}_{j=0}^n$ above satisfy $\alpha_j = 1$ for $j = 0, 1, \ldots, n$, then we say that the orthogonal set of functions $\{\varphi_j(x)\}_{j=0}^n$ is orthonormal. In that case, the solution to the continuous least-squares problem is simply given by
$$c_k = \langle \varphi_k, f \rangle, \quad k = 0, 1, \ldots, n.$$
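A sketch of this shortcut (illustrative choices: the first three Legendre polynomials, which are orthogonal on $[-1,1]$, approximating $f(x) = x^2$, with midpoint-rule quadrature). Each coefficient $c_k = \langle \varphi_k, f \rangle / \|\varphi_k\|_2^2$ is computed independently, with no linear system to solve:

```python
def inner(f, g, a, b, m=20000):
    """Midpoint-rule approximation of <f, g> = integral_a^b f(x) g(x) dx."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(m))

# First three Legendre polynomials: an orthogonal set on [-1, 1]
phi = [lambda x: 1.0,
       lambda x: x,
       lambda x: (3.0 * x * x - 1.0) / 2.0]

f = lambda x: x * x  # function to approximate

# c_k = <phi_k, f> / ||phi_k||_2^2, each coefficient independent of the others
c = [inner(p, f, -1.0, 1.0) / inner(p, p, -1.0, 1.0) for p in phi]

# Since x^2 lies in the span of the basis, the fit reproduces it exactly:
# x^2 = (1/3) * 1 + 0 * x + (2/3) * (3x^2 - 1)/2
approx = lambda x: sum(ck * p(x) for ck, p in zip(c, phi))
```

Note that adding a higher-degree basis function would leave $c_0, c_1, c_2$ unchanged, which is exactly the reuse property motivated at the start of this section.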
Next, we will learn how sets of orthogonal polynomials can be computed.