Linear Regression: Method of Least Squares
The Method of Least Squares is a procedure for determining the best-fit line to data; the proof
uses simple calculus and linear algebra. The basic problem is to find the best-fit straight
line y = a + bx given that, for i ∈ {1,…,n}, the pairs (x_i, y_i) are observed.
The form of the fitted curve is
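$$ y = a + bx $$
where a is the y intercept and b is the slope, b = tan α (α is the angle between the line and the x axis).
Sum of squares of errors:
$$ q = \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right]^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2 $$
Setting the partial derivatives of q to zero:
$$ \frac{\partial q}{\partial a} = 0 \;\Rightarrow\; -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0 $$
$$ \frac{\partial q}{\partial b} = 0 \;\Rightarrow\; -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0 $$
In matrix form:
$$ \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix} $$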
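The two conditions above form a 2x2 linear system that can be assembled from the sums and solved directly. The following is a minimal Matlab sketch under stated assumptions: the data set is made up for illustration (points lying exactly on y = 1 + 2x, not taken from the examples), and the names A, rhs and coef are likewise chosen here for illustration.

```matlab
% Illustrative data (assumed, not from the text): points exactly on y = 1 + 2x
x = [0, 1, 2, 3];
y = [1, 3, 5, 7];
n = numel(x);                       % number of observations

% Normal equations: [n sum(x); sum(x) sum(x.^2)] * [a; b] = [sum(y); sum(x.*y)]
A   = [n,      sum(x);
       sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];

coef = A \ rhs;                     % solve the 2x2 linear system
a = coef(1);                        % y intercept
b = coef(2);                        % slope
fprintf('a = %.4f, b = %.4f\n', a, b);
```

Because the assumed points lie exactly on a line, the solve should recover a = 1 and b = 2 up to rounding.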
Example 1: Find a 1st order polynomial y=a+bx for the values given in the table.
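x      y
-5     -2
2      4
7      3.5

$$ n = 3 $$
$$ \sum_{i=1}^{3} x_i = -5 + 2 + 7 = 4 $$
$$ \sum_{i=1}^{3} y_i = -2 + 4 + 3.5 = 5.5 $$
$$ \sum_{i=1}^{3} x_i y_i = (-5)(-2) + 2 \cdot 4 + 7 \cdot 3.5 = 42.5 $$
$$ \sum_{i=1}^{3} x_i^2 = (-5)^2 + 2^2 + 7^2 = 78 $$
$$ \begin{bmatrix} 3 & 4 \\ 4 & 78 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 5.5 \\ 42.5 \end{bmatrix} $$
$$ \begin{bmatrix} a \\ b \end{bmatrix} = \frac{1}{218} \begin{bmatrix} 78 & -4 \\ -4 & 3 \end{bmatrix} \begin{bmatrix} 5.5 \\ 42.5 \end{bmatrix} $$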
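% Matlab check: fit the Example 1 data with polyfit
clc; clear
x = [-5, 2, 7];                  % observed x values
y = [-2, 4, 3.5];                % observed y values
p = polyfit(x, y, 1)             % first-order fit; p = [b a] (slope, then intercept)
x1 = -5:0.01:7;                  % fine grid for drawing the fitted line
yx = polyval(p, x1);             % evaluate the fit on the grid
plot(x, y, 'ro', x1, yx, 'b')    % data as red circles, fit as blue line
xlabel('x value')
ylabel('y value')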
a=1.188
b=0.484
y=1.188+0.484x
With Matlab:
[Figure: data points and fitted curve; horizontal axis "x value", vertical axis "y value"]
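As a cross-check of Example 1, the normal equations can be built from the data and solved numerically, then compared with polyfit. This is a hedged sketch rather than part of the original solution; the names A, rhs and coef are assumptions for illustration. Note that polyfit returns coefficients with the highest power first, so the slope comes before the intercept.

```matlab
% Example 1 data, as given in the text
x = [-5, 2, 7];
y = [-2, 4, 3.5];

% Assemble the normal equations from the sums
A   = [numel(x), sum(x);
       sum(x),   sum(x.^2)];        % expected [3 4; 4 78]
rhs = [sum(y); sum(x.*y)];          % expected [5.5; 42.5]

coef = A \ rhs;                     % coef(1) = a, coef(2) = b

% Compare with polyfit, which returns [slope, intercept]
p = polyfit(x, y, 1);
fprintf('normal equations: a = %.3f, b = %.3f\n', coef(1), coef(2));
fprintf('polyfit:          a = %.3f, b = %.3f\n', p(2), p(1));
```

Both printed lines should agree with the hand calculation, a = 1.188 and b = 0.484.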
Example 2:
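x      y
0      200
3      230
5      240
8      270
10     290

$$ \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix} $$
$$ \begin{bmatrix} 5 & 26 \\ 26 & 198 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1230 \\ 6950 \end{bmatrix} $$
$$ \begin{bmatrix} a \\ b \end{bmatrix} = \frac{1}{314} \begin{bmatrix} 198 & -26 \\ -26 & 5 \end{bmatrix} \begin{bmatrix} 1230 \\ 6950 \end{bmatrix} $$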
y=a+bx
y=200.13 + 8.82x
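% Matlab check: fit the Example 2 data with polyfit
clc; clear
x = [0, 3, 5, 8, 10];            % observed x values
y = [200, 230, 240, 270, 290];   % observed y values
p = polyfit(x, y, 1)             % first-order fit; p = [b a] (slope, then intercept)
x1 = -1:0.01:12;                 % fine grid for drawing the fitted line
yx = polyval(p, x1);             % evaluate the fit on the grid
plot(x, y, 'ro', x1, yx, 'b')    % data as red circles, fit as blue line
xlabel('x value')
ylabel('y value')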
[Figure: data points and fitted curve; horizontal axis "x value", vertical axis "y value"]
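The two zero-derivative conditions of the method say that the residuals of the fitted line sum to zero, both on their own and when weighted by x. The following Matlab sketch checks this numerically for the Example 2 data; the variable names r, q, g1 and g2 are assumptions introduced here for illustration.

```matlab
% Example 2 data, as given in the text
x = [0, 3, 5, 8, 10];
y = [200, 230, 240, 270, 290];

p = polyfit(x, y, 1);               % p(1) = slope b, p(2) = intercept a
r = y - polyval(p, x);              % residuals y_i - (a + b*x_i)

q  = sum(r.^2);                     % sum of squares of errors
g1 = sum(r);                        % should vanish: condition dq/da = 0
g2 = sum(x .* r);                   % should vanish: condition dq/db = 0
fprintf('q = %.4f, sum(r) = %.2e, sum(x.*r) = %.2e\n', q, g1, g2);
```

Both g1 and g2 should come out at rounding-error level, which confirms that the polyfit line satisfies the same conditions as the hand solution.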
Example 4:
The change in the interior temperature of an oven with respect to time is shown in the
figure. It is desired to model the relationship between the temperature (T) and time
(t) by a first-order polynomial, T = c1 t + c2. Determine the coefficients c1 and c2.
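[Figure: oven temperature T (°C) versus time t (min.) with the line T(t) = c1 t + c2; c1 is the slope and c2 is the intercept]

t (min.)    T (°C)
0           175
5           204
10          200
15          212

With x_i = t_i and y_i = T_i:
$$ \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} c_2 \\ c_1 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix} $$
$$ n = 4 $$
$$ \sum x_i = 0 + 5 + 10 + 15 = 30 $$
$$ \sum x_i^2 = 0^2 + 5^2 + 10^2 + 15^2 = 350 $$
$$ \sum y_i = 175 + 204 + 200 + 212 = 791 $$
$$ \sum x_i y_i = 0 \cdot 175 + 5 \cdot 204 + 10 \cdot 200 + 15 \cdot 212 = 6200 $$
$$ \begin{bmatrix} 4 & 30 \\ 30 & 350 \end{bmatrix} \begin{bmatrix} c_2 \\ c_1 \end{bmatrix} = \begin{bmatrix} 791 \\ 6200 \end{bmatrix} $$
$$ \begin{bmatrix} c_2 \\ c_1 \end{bmatrix} = \frac{1}{4 \cdot 350 - 30 \cdot 30} \begin{bmatrix} 350 & -30 \\ -30 & 4 \end{bmatrix} \begin{bmatrix} 791 \\ 6200 \end{bmatrix} $$
$$ c_2 = 181.7, \qquad c_1 = 2.14 $$
$$ T(t) = 2.14\, t + 181.7 $$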
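Expanding the matrix product row by row makes the arithmetic behind the two coefficients explicit. The following worked check uses only the sums of this example (n = 4, Σx = 30, Σx² = 350, Σy = 791, Σxy = 6200):

$$ c_2 = \frac{350 \cdot 791 - 30 \cdot 6200}{4 \cdot 350 - 30 \cdot 30} = \frac{276850 - 186000}{500} = \frac{90850}{500} = 181.7 $$
$$ c_1 = \frac{-30 \cdot 791 + 4 \cdot 6200}{4 \cdot 350 - 30 \cdot 30} = \frac{-23730 + 24800}{500} = \frac{1070}{500} = 2.14 $$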
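With Matlab:
clc; clear
x = [0, 5, 10, 15];              % time t (min.)
y = [175, 204, 200, 212];        % temperature T (°C)
p = polyfit(x, y, 1)             % first-order fit; p = [c1 c2] (slope, then intercept)
t = 0:0.01:15;                   % fine time grid for drawing the fitted line
T = polyval(p, t);               % fitted temperature on the grid
plot(x, y, 'ro', t, T, 'b')      % data as red circles, fit as blue line
xlabel('x value')
ylabel('y value')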