UCSD—MAE 127: lecture 13 (Gille)
Least Squares Fitting in Matrix Form
Now we’ll return to our problem of solving an overdetermined matrix equation Ax = b. We might
erroneously imagine that we could just multiply b by the inverse of A to find x. That won’t work, since A is
not square, so its inverse is not defined. In any case, our cost function specifies that we will try to minimize
the squared error. In matrix form, this is
$$\epsilon = \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} a_{ij} x_j - b_i \Big)^2 = (Ax - b)^T (Ax - b) = x^T A^T A x - 2 x^T A^T b + b^T b \tag{1}$$
and $\epsilon$ is just a scalar. We’d like to make $\epsilon$ as small as possible. To find its minimum, we set the derivative
equal to zero:
$$\frac{\partial \epsilon}{\partial x} = 2 A^T A x - 2 A^T b = 0. \tag{2}$$
The matrix AT A is a square matrix, so we can try to compute its inverse. Thus
$$x = (A^T A)^{-1} A^T b. \tag{3}$$
This is the basic equation for a least-squares fit. In Matlab, you’d code it as:
x=inv(A'*A)*A'*b;
So for the example that we’ve considered of shoe size versus height, we’d write the following:
h=height; s=shoe_size;   % column vectors of equal length
A=[ones(length(s),1) s]; % N-by-2: a column of ones and a column of shoe sizes
x=inv(A'*A)*A'*h;
This yields the result reported in the last lecture: $x_1 = 58.4$ and $x_2 = 1.14$.
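As a quick check of equation (3), the sketch below builds a small made-up data set and compares the normal-equation solution with Matlab's backslash operator, which solves the same least-squares problem without forming the inverse explicitly. The values of s and h are invented for illustration; they are not the class shoe-size data.

```matlab
% Synthetic example: these numbers are invented, not the class data.
s = [6; 7; 8; 9; 10; 11; 12];          % hypothetical shoe sizes (column vector)
h = [65; 66; 68; 68; 70; 71; 72];      % hypothetical heights (column vector)

A = [ones(length(s),1) s];             % N-by-2 matrix: column of ones, column of s
x_normal = inv(A'*A)*A'*h;             % equation (3)
x_backslash = A\h;                     % Matlab's built-in least-squares solve

disp([x_normal x_backslash])           % the two columns should agree
disp(max(abs(x_normal - x_backslash))) % difference should be at round-off level
```

For a small, well-conditioned problem like this one the two answers agree to round-off. Backslash is generally the safer choice when the columns of A are nearly dependent.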
The great thing about this is that we can solve for an arbitrarily complicated system. So if we’d like to
fit $h = x_1 + x_2 s + x_3 s^2 + x_4 s^3$, we define our matrix $A$ to be:
$$A = \begin{bmatrix} 1 & s_1 & s_1^2 & s_1^3 \\ 1 & s_2 & s_2^2 & s_2^3 \\ 1 & s_3 & s_3^2 & s_3^3 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & s_N & s_N^2 & s_N^3 \end{bmatrix}. \tag{4}$$
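A minimal sketch of the cubic fit, assuming a noise-free synthetic data set so that the recovered coefficients can be compared with known ones. The values in s and x_true are invented for illustration. The solve uses backslash rather than inv(A'*A)*A'*h; it targets the same least-squares problem but behaves better numerically when the columns of A differ greatly in size, as powers of s do.

```matlab
% Synthetic example: noise-free cubic with known (invented) coefficients.
s = (1:10)';                               % hypothetical predictor values
x_true = [2; -1; 0.5; 0.1];                % x1..x4 chosen for illustration
h = x_true(1) + x_true(2)*s + x_true(3)*s.^2 + x_true(4)*s.^3;

A = [ones(length(s),1) s s.^2 s.^3];       % matrix (4): columns 1, s, s^2, s^3
x = A\h;                                   % least-squares solution

disp([x_true x])                           % recovered coefficients beside the true ones
```

With real, noisy data the recovered coefficients would not match any "true" values exactly; the point here is only that adding columns to A is all it takes to fit a more complicated function.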
If we’d like to fit temperature data to a sinusoidal seasonal cycle, we look for a fit to
$T = x_1 + x_2 \cos(2\pi t/365.25) + x_3 \sin(2\pi t/365.25)$, where $T$ is temperature and $t$ is time in days. Here we fit both
a sine and a cosine, because we don’t know when temperature should be maximum. By solving this way, we
can find the point in time when the fitted temperature is maximum. Thus our matrix $A$ is:
$$A = \begin{bmatrix} 1 & \cos(t_r) & \sin(t_r) \\ \vdots & \vdots & \vdots \end{bmatrix}, \tag{5}$$
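where $t_r = 2\pi t/365.25$ is the time in radians. We can represent this as a cosine with an amplitude $\alpha$ and
phase $\phi$, so that $T = x_1 + \alpha \cos(t_r + \phi) = x_1 + \alpha \cos(t_r) \cos(\phi) - \alpha \sin(t_r) \sin(\phi)$. That means that
$x_2 = \alpha \cos(\phi)$ and $x_3 = -\alpha \sin(\phi)$. Therefore, $\alpha = \sqrt{x_2^2 + x_3^2}$ and $\phi = \mathrm{atan}(-x_3/x_2)$, or, to allow $\phi$ to
cover the full circle rather than only half of it, you’d code it as $\phi = \mathrm{atan2}(-x_3, x_2)$, which returns values from $-\pi$ to $\pi$.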
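The conversion from the fitted coefficients to amplitude and phase can be checked with a sketch like the one below. It builds a synthetic seasonal cycle with a known amplitude and phase, fits it with matrix (5), and recovers both. The sample times, the mean of 15, and the values alpha_true and phi_true are all invented for illustration.

```matlab
% Synthetic example: build a seasonal cycle with known amplitude and phase.
t = (0:5:730)';                      % hypothetical sample times in days
tr = 2*pi*t/365.25;                  % time in radians
alpha_true = 3; phi_true = 2.5;      % invented amplitude and phase (radians)
T = 15 + alpha_true*cos(tr + phi_true);

A = [ones(length(t),1) cos(tr) sin(tr)];   % matrix (5)
x = A\T;                                   % least-squares fit

alpha = sqrt(x(2)^2 + x(3)^2);       % amplitude
phi = atan2(-x(3), x(2));            % phase, between -pi and pi
disp([alpha phi])                    % should be close to 3 and 2.5
```

Using atan(-x(3)/x(2)) here instead would return a phase off by pi, because x(2) is negative for this choice of phi_true.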
On the other hand, we can’t use this linear least-squares fitting method to solve for $x$ in $T = x_1 \cos(x_2 t_r)$,
because $T$ is not a linear function of the unknowns: $x_2$ sits inside the cosine, whose value is then multiplied by $x_1$.
For this we’d have to resort to nonlinear fitting procedures, which
are never as tidy as linear least-squares fitting and are beyond the scope of this class.
Let’s use the sinusoidal form above to fit an annual cycle to the buoy temperature data that you used in
the first three problem sets. We’ll use the following Matlab commands:
[Figure 1 image: two panels, water temperature (C) and air temperature (C), each plotted against time (years) from 2000 to 2005]
Figure 1: (left) Measured water temperature from buoy in Santa Monica Basin with red least-squares fitted
temperature. (right) Measured air temperature from buoy with fitted curve.
load buoy.mat
% values above 900 are treated as missing-data flags
water_temp(water_temp>900)=NaN;
air_temp(air_temp>900)=NaN;
time=datenum(year,month,day,hour,0,0);
% columns: constant, cosine, and sine of the annual cycle (time must be a column vector)
A=[ones(length(time),1) cos(time*2*pi/365.25) sin(time*2*pi/365.25)];
% fit using only the rows with valid data
index=find(~isnan(water_temp));
x_water=inv(A(index,:)'*A(index,:))*A(index,:)'*water_temp(index);
index=find(~isnan(air_temp));
x_air=inv(A(index,:)'*A(index,:))*A(index,:)'*air_temp(index);
%plot Ax and T
subplot(1,2,1); plot(time,water_temp,time,A*x_water,'r')
datetick
xlabel('Time (years)'); ylabel('Water temperature (C)')
subplot(1,2,2); plot(time,air_temp,time,A*x_air,'r')
datetick
xlabel('Time (years)'); ylabel('Air temperature (C)')
The result, plotted in Figure 1, yields $x_1 = 16.8$°C, $x_2 = -2.64$°C, and $x_3 = -1.66$°C for water and
$x_1 = 15.5$°C, $x_2 = -1.93$°C, and $x_3 = -1.50$°C for air.
Often we want to know whether changes in air temperature lead or follow changes in water temperature.
The least-squares fit alone won’t tell us that, but if we represent it as an amplitude and phase, we immediately
see the differences. To do this we compute:
a_air=sqrt(x_air(2).^2+x_air(3).^2);
p_air=365.25/(2*pi)*atan2(-x_air(3),x_air(2));
a_water=sqrt(x_water(2).^2+x_water(3).^2);
p_water=365.25/(2*pi)*atan2(-x_water(3),x_water(2));
Air temperature has an amplitude of 2.4°C; water has an amplitude of 3.1°C. The value of p_air is 144 days.
Since the fitted cosine peaks when $t_r + \phi = 0$, this implies that air temperature reaches a maximum at
day 365.25 - 144 = 221 (around August 8 or 9).
Water temperature reaches a maximum slightly earlier, at day 215. (These results are slightly surprising to
me: we tend to think that near the coast, the air should warm and cool slightly faster than the ocean, and so
reach its maximum slightly earlier than the water. If I were doing research with this data, at this juncture I’d
spend a few days comparing the statistics of this buoy with nearby buoy measurements to decide if I really
trusted the archived data values.)
Orthogonality and Least-Squares Fits
Let’s think about one important detail of our fitting procedure. What would happen if we wanted to fit
$T = x_1 + x_2 t_r + x_3 t_r$, that is, to fit two constants to the same variable? In this case, clearly $x_2$ and $x_3$ are
completely indistinguishable. What happens when we try to use our redundant functions in our matrix $A$?



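$$A = \begin{bmatrix} 1 & t_1 & t_1 \\ 1 & t_2 & t_2 \\ \vdots & \vdots & \vdots \\ 1 & t_N & t_N \end{bmatrix}. \tag{6}$$
Then we find:
$$A^T A = \begin{bmatrix} N & \sum t_i & \sum t_i \\ \sum t_i & \sum t_i^2 & \sum t_i^2 \\ \sum t_i & \sum t_i^2 & \sum t_i^2 \end{bmatrix}. \tag{7}$$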
The second and third rows of $A^T A$ are identical, which tells us that the third row is adding no additional
information to the system. As a result, the matrix $A^T A$ is singular, and we won’t be able to find an inverse
for it. (In Matlab, when you try to do the inversion, you’ll see a message, “Warning: Matrix is singular to
working precision.”)
You probably wouldn’t try to fit coefficients to two identical functions, but you might do something
that was fairly similar. For example, $T = x_2 t_r + x_3 \sin(t_r)$ poses a similar problem when $t_r$ is near zero,
because $\sin(t_r) \approx t_r$ there. In this case, the rows of $A^T A$ might not be identical, but they might be nearly
the same, so that Matlab would warn you that the matrix is close to singular.
Similarly, you’ll have trouble if you try $T = x_1 + x_2 t_r + x_3 (1 + t_r)$, because the third function, $1 + t_r$,
is just the sum of the first two.
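One way to spot these problems before inverting is to look at the rank and condition number of the matrix. The sketch below is illustrative only: the sample times are invented, and rank and cond are used as diagnostics that the lecture itself does not call for.

```matlab
% Hypothetical sample times, chosen only to illustrate the diagnostics.
t = (1:20)';
A_dup = [ones(length(t),1) t t];           % matrix (6): repeated column
A_combo = [ones(length(t),1) t 1+t];       % third column = first + second

disp(rank(A_dup))                          % 2, not 3: one column is redundant
disp(rank(A_combo))                        % 2, not 3: same problem in disguise
disp(cond(A_dup'*A_dup))                   % very large (or Inf): effectively singular

% Nearly redundant columns: t and sin(t) for t close to zero.
t_small = linspace(0, 0.01, 50)';
A_near = [t_small sin(t_small)];
disp(cond(A_near))                         % large: the columns are nearly parallel
```

A rank smaller than the number of columns means some coefficients cannot be distinguished. A very large condition number means they can be distinguished only poorly, so the fitted values will be sensitive to noise.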