Math 221, Section 5.5
Least-squares problems
(1) Linearly independent columns
(2) Linearly dependent columns
The problem
- This section concerns the case where Ax = b has no solution.
- The best one can do is to find an x̂ that makes Ax̂ as close as possible to b.
- The least-squares problem is to find an x̂ that makes ||b − Ax̂|| (the square root of a sum of squares) as small as possible.
The aim
We have to find
- the least-squares solution x̂
- the best approximation b̂ = Ax̂ of b in Col A
Therefore, b̂ is the orthogonal projection of b onto Col A.
The formula
Suppose that
- the columns a1, ..., an of A are linearly independent.
- A may not be invertible (it may be a rectangular matrix), but A^T A is invertible (see Theorem 14 on the last page).
The best approximation is given by the orthogonal projection
b̂ = Ax̂ = A (A^T A)^{-1} A^T b
The least-squares solution is therefore
x̂ = (A^T A)^{-1} A^T b
In other words, x̂ is the unique solution of the normal equation
(A^T A) x̂ = A^T b
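The normal equation follows from the fact that b̂ = Ax̂ is the orthogonal projection of b onto Col A. The short derivation below fills in that step; it is a sketch of the standard argument, written with a_j for the j-th column of A, and is not taken from the slides themselves.

```latex
\begin{align*}
b - A\hat{x} \perp \operatorname{Col} A
  &\iff a_j^T (b - A\hat{x}) = 0 \quad \text{for } j = 1, \dots, n \\
  &\iff A^T (b - A\hat{x}) = 0 \\
  &\iff A^T A\,\hat{x} = A^T b .
\end{align*}
```

When the columns of A are linearly independent, A^T A is invertible, and multiplying both sides by (A^T A)^{-1} gives the formula for x̂.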
Example
In practice, we compute x̂ first and then compute the projection b̂ = Ax̂.
Given A and b [entries missing in the source]:
(a) Find the orthogonal projection of b onto Col A, and
(b) find a least-squares solution of Ax = b.
(Note: not a solution of Ax = b itself, because there is no solution!)
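Solution:
\[
A^T A = \begin{bmatrix} 6 & -11 \\ -11 & 22 \end{bmatrix},
\qquad
(A^T A)^{-1} = \frac{1}{11}\begin{bmatrix} 22 & 11 \\ 11 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 6/11 \end{bmatrix},
\qquad
A^T b = \begin{bmatrix} -4 \\ 11 \end{bmatrix}.
\]
Therefore, the normal equation is
\[
\begin{bmatrix} 6 & -11 \\ -11 & 22 \end{bmatrix} \hat{x} = \begin{bmatrix} -4 \\ 11 \end{bmatrix}
\quad\Rightarrow\quad
\hat{x} = (A^T A)^{-1} A^T b = \begin{bmatrix} 2 & 1 \\ 1 & 6/11 \end{bmatrix} \begin{bmatrix} -4 \\ 11 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix},
\]
and b̂ = Ax̂ [entries missing in the source].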
Linearly dependent case
If the columns a1, ..., an of A are linearly dependent:
- A^T A is not invertible.
- We cannot invert A^T A to compute x̂ = (A^T A)^{-1} A^T b directly.
- So we need to solve the normal equation (A^T A) x̂ = A^T b by row reduction.
- We get infinitely many solutions for x̂.
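Example
(Note: a1 = a2 + a3.) Find all least-squares solutions of Ax = b.
Solution:
- The matrix
\[
A^T A =
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{bmatrix}
\]
is not invertible.
- Solve (A^T A) x̂ = A^T b by row reduction:
\[
[\,A^T A \mid A^T b\,] =
\left[\begin{array}{ccc|c} 4 & 2 & 2 & 14 \\ 2 & 2 & 0 & 4 \\ 2 & 0 & 2 & 10 \end{array}\right]
\to \cdots \to
\left[\begin{array}{ccc|c} 1 & 0 & 1 & 5 \\ 0 & 1 & -1 & -3 \\ 0 & 0 & 0 & 0 \end{array}\right]
\quad \text{(RREF)}
\]
Therefore,
\[
\hat{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} 5 \\ -3 \\ 0 \end{bmatrix} + z \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}.
\]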
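The row reduction in this example can be checked mechanically. The sketch below is one possible way to do it, using exact rational arithmetic from Python's standard library; the helper name `rref` is made up for this illustration, and the input is the augmented matrix [A^T A | A^T b] from the example.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue  # no pivot in this column
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Scale the pivot row so that the pivot equals 1.
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

# Augmented matrix [A^T A | A^T b] from the example.
augmented = [
    [4, 2, 2, 14],
    [2, 2, 0, 4],
    [2, 0, 2, 10],
]
reduced = rref(augmented)
for row in reduced:
    print([str(v) for v in row])
```

The third column has no pivot, so z is a free variable, which is why the normal equation has infinitely many solutions here.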
Best-fitting line (optional)
Idea: Given a set of data points (x1, y1), ..., (xm, ym) in the plane,
find the least-squares line (also called the best-fitting line) that
best fits these points.
Sec 5.6, Ex 1. Given
(0, 1), (1, 1), (2, 2), (3, 2).
Aim: Find the line y = mx + c that best fits the given points.
Of course, the system of equations (plug in the points)
c = 1
m + c = 1
2m + c = 2
3m + c = 2
has no solution for (m, c).
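Rewrite the equations in the form Ax = b:
\[
\begin{bmatrix} 0 & 1 \\ 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \end{bmatrix}
\]
- Ax = b has no solution. We instead solve the normal equation A^T A x̂ = A^T b:
\[
\begin{bmatrix} 0 & 1 & 2 & 3 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix}
\begin{bmatrix} \hat{m} \\ \hat{c} \end{bmatrix}
=
\begin{bmatrix} 0 & 1 & 2 & 3 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \end{bmatrix}
\]
\[
\Rightarrow\quad
\begin{bmatrix} 14 & 6 \\ 6 & 4 \end{bmatrix}
\begin{bmatrix} \hat{m} \\ \hat{c} \end{bmatrix}
=
\begin{bmatrix} 11 \\ 6 \end{bmatrix}
\qquad \leftarrow \text{normal equation}
\]
\[
\Rightarrow\quad
\begin{bmatrix} \hat{m} \\ \hat{c} \end{bmatrix}
=
\begin{bmatrix} 14 & 6 \\ 6 & 4 \end{bmatrix}^{-1}
\begin{bmatrix} 11 \\ 6 \end{bmatrix}
=
\begin{bmatrix} 0.2 & -0.3 \\ -0.3 & 0.7 \end{bmatrix}
\begin{bmatrix} 11 \\ 6 \end{bmatrix}
=
\begin{bmatrix} 0.4 \\ 0.9 \end{bmatrix}.
\]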
In other words, the line
y = m̂x + ĉ = 0.4x + 0.9
best fits the data.
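The same computation can be reproduced in a few lines of code. The sketch below assumes nothing beyond the four data points of the example; the function name `fit_line` is made up for this illustration. It forms the 2 × 2 normal equation from sums over the data and solves it exactly with fractions.

```python
from fractions import Fraction

def fit_line(points):
    """Least-squares line y = m*x + c through (x, y) points via the normal equation."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxx = sum(Fraction(x) * x for x, _ in points)
    sxy = sum(Fraction(x) * y for x, y in points)
    # Normal equation: [[sxx, sx], [sx, n]] [m, c]^T = [sxy, sy]^T
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("x-values do not vary; the columns of A are dependent")
    m = (n * sxy - sx * sy) / det
    c = (sxx * sy - sx * sxy) / det
    return m, c

points = [(0, 1), (1, 1), (2, 2), (3, 2)]
m_hat, c_hat = fit_line(points)
print(m_hat, c_hat)  # 2/5 9/10
```

Here sxx = 14, sx = 6 and n = 4 are exactly the entries of A^T A, and sxy = 11, sy = 6 are the entries of A^T b, so the result agrees with m̂ = 0.4 and ĉ = 0.9.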
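(optional) Remark
Last time we worked through the example from Sec 5.3.
We computed that
- \[ U^T U = \begin{bmatrix} 35 & 0 \\ 0 & 14 \end{bmatrix} \;\Rightarrow\; (U^T U)^{-1} = \begin{bmatrix} 1/35 & 0 \\ 0 & 1/14 \end{bmatrix}. \]
- \[ \mathrm{Proj}_W = U (U^T U)^{-1} U^T = \begin{bmatrix} 9/10 & 0 & -3/10 \\ 0 & 1 & 0 \\ -3/10 & 0 & 1/10 \end{bmatrix}. \]
- \[ \hat{y} = \mathrm{Proj}_W\, y = \begin{bmatrix} 9/10 & 0 & -3/10 \\ 0 & 1 & 0 \\ -3/10 & 0 & 1/10 \end{bmatrix} \begin{bmatrix} 5 \\ -9 \\ 5 \end{bmatrix} = \begin{bmatrix} 3 \\ -9 \\ -1 \end{bmatrix}. \]
What if we take a non-orthogonal basis?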
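Take another basis
\[
a_1 = u_1 + u_2 = \begin{bmatrix} -6 \\ -3 \\ 2 \end{bmatrix},
\qquad
a_2 = u_1 - u_2 = \begin{bmatrix} 0 \\ -7 \\ 0 \end{bmatrix}.
\]
- \[ A^T A = \begin{bmatrix} 49 & 21 \\ 21 & 49 \end{bmatrix} \;\Rightarrow\; (A^T A)^{-1} = \begin{bmatrix} 1/40 & -3/280 \\ -3/280 & 1/40 \end{bmatrix}. \]
- Check:
\[
A (A^T A)^{-1} A^T
= \begin{bmatrix} -6 & 0 \\ -3 & -7 \\ 2 & 0 \end{bmatrix}
\begin{bmatrix} 1/40 & -3/280 \\ -3/280 & 1/40 \end{bmatrix}
\begin{bmatrix} -6 & -3 & 2 \\ 0 & -7 & 0 \end{bmatrix}
= \begin{bmatrix} 9/10 & 0 & -3/10 \\ 0 & 1 & 0 \\ -3/10 & 0 & 1/10 \end{bmatrix},
\]
which is the same as U (U^T U)^{-1} U^T.
Note: for the orthogonal projection formula
\[
\hat{y} = \frac{y \cdot u_1}{u_1 \cdot u_1}\, u_1 + \frac{y \cdot u_2}{u_2 \cdot u_2}\, u_2
\]
we still require the basis {u1, u2} to be orthogonal.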
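A quick numerical check of this remark: the sketch below builds A (A^T A)^{-1} A^T from both bases and compares the results. The vectors u1 = (−3, −5, 1) and u2 = (−3, 2, 1) are not listed on the slides; they are recovered here from a1 = u1 + u2 and a2 = u1 − u2, and all helper names are made up for this illustration.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def projection_matrix(columns):
    """P = A (A^T A)^{-1} A^T, where A has the two given vectors as its columns."""
    a = transpose([[Fraction(v) for v in col] for col in columns])
    at = transpose(a)
    return matmul(matmul(a, inverse_2x2(matmul(at, a))), at)

u1, u2 = [-3, -5, 1], [-3, 2, 1]   # orthogonal basis: (a1 + a2)/2 and (a1 - a2)/2
a1, a2 = [-6, -3, 2], [0, -7, 0]   # non-orthogonal basis of the same subspace W

p_u = projection_matrix([u1, u2])
p_a = projection_matrix([a1, a2])
print(p_u == p_a)

# Project y = (5, -9, 5) onto W.
y_hat = [row[0] for row in matmul(p_u, [[5], [-9], [5]])]
print([str(v) for v in y_hat])
```

The projection matrix depends only on the subspace W, not on the basis chosen for it, so both calls give the same matrix.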
Quiz ∞
The bold numbers indicate similar types of questions that may
appear in Quiz ∞.
Sec 5.5: 1,5,9,13,17,18a,b,c,d,23,24
You may check the solution at:
http://www.slader.com/textbook/9780321385178-linearalgebra-and-its-applications-4th-edition/
Note: Sec 5.5 in the UBC edition becomes Sec 6.5 in the 4th
edition.