Linear Regression

Introduction to Data Science Algorithms
Jordan Boyd-Graber and Michael Paul
SLIDES ADAPTED FROM FEDERICO
Deriving OLS
• Common theme in data science:
◦ Build model
◦ Write error model
◦ Derive how to minimize error
• Practice for OLS (other models next week)
Model and Objective
Model
$$y_i = b_0 + b_1 x_i + e_i \tag{1}$$
Error
$$e_i = y_i - b_0 - b_1 x_i \tag{2}$$
Objective
$$\ell \equiv \sum_i e_i^2 \tag{3}$$
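As a minimal sketch of the model, error, and objective above, the following pure-Python snippet computes the residuals e_i and their sum of squares for a candidate intercept and slope. The function names `residuals` and `sse` and the toy data are assumptions made for illustration; they do not come from the slides.

```python
# Toy data invented for illustration; the points lie exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def residuals(b0, b1, xs, ys):
    """Return e_i = y_i - b0 - b1 * x_i for every data point."""
    return [y - b0 - b1 * x for x, y in zip(xs, ys)]


def sse(b0, b1, xs, ys):
    """Objective: the sum of squared residuals."""
    return sum(e * e for e in residuals(b0, b1, xs, ys))


print(sse(1.0, 2.0, xs, ys))  # the true line: no error
print(sse(0.0, 2.0, xs, ys))  # intercept off by 1: every residual is 1
```

A worse choice of b0 or b1 gives a larger objective, which is what the rest of the derivation minimizes.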
Partial Derivatives
Intercept
$$\frac{\partial \ell}{\partial b_0} = \frac{\partial}{\partial b_0} \sum_i (y_i - b_0 - b_1 x_i)^2 = -2 \sum_i (y_i - b_0 - b_1 x_i) \tag{4}$$
Slope
$$\frac{\partial \ell}{\partial b_1} = \frac{\partial}{\partial b_1} \sum_i (y_i - b_0 - b_1 x_i)^2 = -2 \sum_i x_i (y_i - b_0 - b_1 x_i) \tag{5}$$
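One way to sanity-check the two partial derivatives is to compare them with central finite differences of the objective. The sketch below does that in pure Python; the helper names (`sse`, `grad_sse`, `numeric_grad`), the step size `h`, and the data are illustrative assumptions, not part of the slides.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


def grad_sse(b0, b1, xs, ys):
    """Analytic partial derivatives with respect to b0 and b1."""
    d_b0 = -2.0 * sum(y - b0 - b1 * x for x, y in zip(xs, ys))
    d_b1 = -2.0 * sum(x * (y - b0 - b1 * x) for x, y in zip(xs, ys))
    return d_b0, d_b1


def numeric_grad(b0, b1, xs, ys, h=1e-6):
    """Central finite-difference approximation of the same partials."""
    d_b0 = (sse(b0 + h, b1, xs, ys) - sse(b0 - h, b1, xs, ys)) / (2 * h)
    d_b1 = (sse(b0, b1 + h, xs, ys) - sse(b0, b1 - h, xs, ys)) / (2 * h)
    return d_b0, d_b1


# Toy data invented for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]

analytic = grad_sse(0.5, 0.5, xs, ys)
numeric = numeric_grad(0.5, 0.5, xs, ys)
print(analytic)
print(numeric)
```

The two results should agree up to small rounding error, since the objective is quadratic in b0 and b1.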
System of Equations with Two Unknowns
Solve for Intercept
Set the partial derivative with respect to $b_0$ to zero:
$$0 = -2 \sum_i (y_i - b_0 - b_1 x_i) \tag{6}$$
Multiply by $-\frac{1}{2}$, distribute sum:
$$0 = \sum_i y_i - \sum_i b_0 - b_1 \sum_i x_i \tag{7}$$
$b_0$ is constant, so $\sum_i b_0 = N b_0$; move to LHS:
$$N b_0 = \sum_i y_i - b_1 \sum_i x_i \tag{8}$$
Divide by $N$:
$$b_0 = \frac{\sum_i y_i}{N} - b_1 \frac{\sum_i x_i}{N} \tag{9}$$
Write the averages as means:
$$b_0 = \bar{y} - b_1 \bar{x} \tag{10}$$
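The intercept result can be checked numerically: for any fixed slope, choosing b0 as the mean of y minus b1 times the mean of x should make the residuals sum to zero, which is the condition that the intercept partial derivative vanishes. The snippet below is a small sketch of that check; `mean`, `intercept_for`, and the data are hypothetical names and values chosen for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def intercept_for(b1, xs, ys):
    """Intercept that zeroes the b0 partial derivative for a given slope b1."""
    return mean(ys) - b1 * mean(xs)


# Toy data invented for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]

residual_sums = []
for b1 in (0.0, 0.5, 2.0):
    b0 = intercept_for(b1, xs, ys)
    # Sum of residuals; should be (numerically) zero for every b1.
    residual_sums.append(sum(y - b0 - b1 * x for x, y in zip(xs, ys)))

print(residual_sums)
```

This holds for every slope, not only the optimal one, which is why the slope still has to be found from the second equation.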
System of Equations with Two Unknowns
Solve for Intercept
$$b_0 = \bar{y} - b_1 \bar{x} \tag{10}$$
Solve for Slope
Set the partial derivative with respect to $b_1$ to zero:
$$0 = -2 \sum_i x_i (y_i - b_0 - b_1 x_i) \tag{11}$$
Multiply by $-\frac{1}{2}$, distribute sum and $x_i$:
$$0 = \sum_i x_i y_i - b_0 \sum_i x_i - b_1 \sum_i x_i^2 \tag{12}$$
Move last term to LHS:
$$b_1 \sum_i x_i^2 = \sum_i x_i y_i - b_0 \sum_i x_i \tag{13}$$
Substitute the intercept solution $b_0 = \frac{\sum_i y_i}{N} - b_1 \frac{\sum_i x_i}{N}$:
$$b_1 \sum_i x_i^2 = \sum_i x_i y_i - \left( \frac{\sum_i y_i}{N} - b_1 \frac{\sum_i x_i}{N} \right) \sum_i x_i \tag{14}$$
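Setting both partial derivatives to zero gives two linear equations in b0 and b1: N b0 + b1 Σx = Σy and b0 Σx + b1 Σx² = Σxy. The slides solve them by substitution; as a cross-check, the sketch below solves the same 2x2 system with Cramer's rule instead. This is a different route from the one on the slides, and the function name `solve_normal_equations` and the data are assumptions made for illustration.

```python
def solve_normal_equations(xs, ys):
    """Solve the two zero-gradient equations for (b0, b1) via Cramer's rule.

    System:
        N * b0      + sum_x  * b1 = sum_y
        sum_x * b0  + sum_xx * b1 = sum_xy
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("x values are all identical; no unique solution")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Toy data invented for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]
b0, b1 = solve_normal_equations(xs, ys)
print(b0, b1)
```

Substitution and Cramer's rule must agree because they solve the same pair of equations; substitution has the advantage of exposing the interpretable form of the intercept.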
Solve for Slope (continued)
$$b_1 \sum_i x_i^2 = \sum_i x_i y_i - \left( \frac{\sum_i y_i}{N} - b_1 \frac{\sum_i x_i}{N} \right) \sum_i x_i$$
Multiplying out the last term:
$$b_1 \sum_i x_i^2 = \sum_i x_i y_i - \frac{\sum_i y_i \sum_i x_i}{N} + b_1 \frac{\left( \sum_i x_i \right)^2}{N}$$
Move last term to LHS:
$$b_1 \sum_i x_i^2 - b_1 \frac{\left( \sum_i x_i \right)^2}{N} = \sum_i x_i y_i - \frac{\sum_i y_i \sum_i x_i}{N}$$
Factor out $b_1$:
$$b_1 \left[ \sum_i x_i^2 - \frac{\left( \sum_i x_i \right)^2}{N} \right] = \sum_i x_i y_i - \frac{\sum_i y_i \sum_i x_i}{N}$$
Divide by the bracketed term:
$$b_1 = \frac{\sum_i x_i y_i - \frac{\sum_i y_i \sum_i x_i}{N}}{\sum_i x_i^2 - \frac{\left( \sum_i x_i \right)^2}{N}}$$
Ratio of the sum of the cross-products of x and y over the sum of squares for x, both taken about the means:
$$b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$
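Putting the two results together, a minimal pure-Python sketch of the closed-form fit might look like the following. It computes the slope as (Σxy − ΣxΣy/N) / (Σx² − (Σx)²/N) and then the intercept as the mean of y minus the slope times the mean of x. The function name `ols_fit`, the error handling, and the data are illustrative assumptions rather than anything specified in the slides.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (b0, b1)."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and the same length")

    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    # Denominator: sum of squares of x about its mean.
    denom = sum_xx - sum_x ** 2 / n
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")

    # Numerator: sum of cross-products of x and y about their means.
    b1 = (sum_xy - sum_x * sum_y / n) / denom
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


# Toy data invented for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]
b0, b1 = ols_fit(xs, ys)

# Both partial derivatives of the objective should vanish at the solution.
grad_b0 = -2.0 * sum(y - b0 - b1 * x for x, y in zip(xs, ys))
grad_b1 = -2.0 * sum(x * (y - b0 - b1 * x) for x, y in zip(xs, ys))
print(b0, b1, grad_b0, grad_b1)
```

The raw-sum form used here mirrors the derivation but can lose precision when the x values are large relative to their spread; the mean-centered form of the same ratio is usually the safer choice in practice.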