Linear Regression

https://xkcd.com/1007/
Slides from:
CSC321: Intro to Machine Learning and Neural Networks, Winter 2016
Andrew Ng
Michael Guerzhoy
Training set of housing prices (Portland, OR)

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Notation:
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
Housing Prices (Portland, OR)

[Scatter plot: Price (in 1000's of dollars) vs. Size (feet²)]

Supervised Learning: given the "right answer" for
each example in the data.
Regression Problem: predict real-valued output.
How do we represent h?

Training Set → Learning Algorithm → h

Size of house (x) → h → Estimated price

Linear regression with one variable:
univariate linear regression.
Training Set

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: h_θ(x) = θ0 + θ1·x
Parameters: θ0, θ1
How to choose the θ's?
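The hypothesis above can be sketched in a few lines of Python. The table values are the training set from the slides; the particular θ's below are illustrative, not fitted:

```python
# Univariate linear regression hypothesis: h_theta(x) = theta0 + theta1 * x.
# Training set from the slides (Portland, OR housing prices).
sizes = [2104, 1416, 1534, 852]    # x: size in feet^2
prices = [460, 232, 315, 178]      # y: price in $1000's

def h(theta0, theta1, x):
    """Predicted price (in $1000's) for a house of size x."""
    return theta0 + theta1 * x

# Illustrative (not fitted) parameter values:
theta0, theta1 = 50.0, 0.15
predictions = [h(theta0, theta1, x) for x in sizes]
print(predictions)
```

Choosing θ0, θ1 well is exactly the question the cost function answers below.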
[Three example plots of h_θ(x) = θ0 + θ1·x for different choices of θ0 and θ1.]

Idea: choose θ0, θ1 so that h_θ(x) is close to y
for our training examples (x, y).
Quadratic cost function – on the board
[Left: h_θ(x) (for fixed θ1, this is a function of x).
Right: J_θ0(θ1) (a function of the parameter θ1, for a fixed θ0).]
Hypothesis: h_θ(x) = θ0 + θ1·x
Parameters: θ0, θ1
Cost Function: J(θ0, θ1) = (1/2m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))²
Goal: minimize J(θ0, θ1) over θ0, θ1
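The cost function can be evaluated directly on the training set from the slides; the candidate parameter values below are illustrative:

```python
# Squared-error cost J(theta0, theta1) = (1/2m) * sum_i (h(x_i) - y_i)^2
# on the Portland training set from the slides.
sizes = [2104, 1416, 1534, 852]
prices = [460, 232, 315, 178]

def J(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# The goal is to minimize J over (theta0, theta1); compare two candidates:
print(J(0.0, 0.0, sizes, prices))   # all-zero parameters: large cost
print(J(0.0, 0.2, sizes, prices))   # closer to the data: smaller cost
```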
Cost Function Surface Plot
Contour Plots
• For a function F(x, y) of two variables, assign
  different colours to different values of F
• Pick some values to plot
• The result will be contours – curves in the graph
  along which the value of F(x, y) is constant
[Left: h_θ(x) (for fixed θ0, θ1, this is a function of x).
Right: J(θ0, θ1) (a function of the parameters θ0, θ1).]
Cost Function Contour Plot
[Contour-plot slides. Left: h_θ(x) for a particular (θ0, θ1) (for fixed θ0, θ1,
this is a function of x). Right: the corresponding point on the contour plot of
J(θ0, θ1) (a function of the parameters θ0, θ1).]
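The surface/contour idea can be reproduced numerically: evaluate J over a grid of (θ0, θ1) values, exactly as a contour plot does, and look for the lowest cell. The toy data below (not the housing set) is chosen so the minimizer is known to be (0, 1):

```python
# Evaluate the cost surface J(theta0, theta1) on a parameter grid and locate
# the smallest cell -- a numerical version of the contour plot.
import numpy as np

# Toy data (x, y) = (1,1), (2,2), (3,3): the exact minimum is theta0=0, theta1=1.
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 3.0])

theta0_grid = np.linspace(-2, 2, 81)
theta1_grid = np.linspace(-1, 3, 81)
T0, T1 = np.meshgrid(theta0_grid, theta1_grid)

# J(theta0, theta1) = (1/2m) * sum_i (theta0 + theta1*x_i - y_i)^2,
# broadcast over the whole grid at once.
m = len(xs)
residuals = T0[..., None] + T1[..., None] * xs - ys   # shape (81, 81, 3)
J = (residuals ** 2).sum(axis=-1) / (2 * m)           # shape (81, 81)

i, j = np.unravel_index(np.argmin(J), J.shape)
print(T0[i, j], T1[i, j])   # grid cell at the minimum: approximately (0, 1)
```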
Have some function J(θ0, θ1)
Want min over θ0, θ1 of J(θ0, θ1)

Outline:
• Start with some θ0, θ1
• Keep changing θ0, θ1 to reduce J(θ0, θ1)
  until we hopefully end up at a minimum

Gradient Descent – on the board
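The outline above can be sketched as batch gradient descent for univariate linear regression. The update rule is the standard one derived on the board; the toy data has known minimizer (θ0 = 0, θ1 = 1), and the learning rate α = 0.1 is an illustrative choice:

```python
# Batch gradient descent for univariate linear regression.
# Update rule (simultaneous for both parameters):
#   theta0 := theta0 - alpha * (1/m) * sum_i (h(x_i) - y_i)
#   theta1 := theta1 - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_i
# Toy data (x, y) = (1,1), (2,2), (3,3), so the minimum is theta0=0, theta1=1.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

def gradient_descent(xs, ys, alpha=0.1, iters=2000):
    m = len(xs)
    theta0 = theta1 = 0.0
    for _ in range(iters):
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errs) / m
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m
        # Simultaneous update of both parameters:
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)   # converges to approximately (0, 1)
```

Note that on the raw housing data (sizes around 2000 feet²) this learning rate would diverge; features are usually rescaled first.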
J(0,1)
0
1
J(0,1)
0
1
For Linear Regression, J is bowl-shaped (“convex”)
[Sequence of slides showing successive gradient descent steps. Left: h_θ(x) for
the current (θ0, θ1) (for fixed θ0, θ1, this is a function of x). Right: the
corresponding point on the contour plot of J(θ0, θ1) (a function of the
parameters θ0, θ1).]
Linear Regression vs. k-Nearest Neighbours
[Scatter plot of a binary classification problem.
Orange: y = 1
Blue: y = 0]
Linear Regression vs. k-Nearest Neighbours
• Linear Regression: the boundary can only be linear
• Nearest Neighbours: the boundary can be more
  complex
• Which is better?
• Depends on what the actual boundary looks like
• Depends on whether we have enough data to figure out
the correct complex boundary
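The contrast can be sketched on a toy one-dimensional version of the orange/blue problem. The data, and the rule of thresholding the regression output at 0.5 to get a class label, are illustrative assumptions, not from the slides:

```python
# Toy 1-D binary classification (orange: y=1, blue: y=0).
# Linear regression fits one global line and thresholds it at 0.5;
# k-NN predicts the majority label among the k closest training points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

def linreg_predict(x, xs, ys):
    # Closed-form least squares for one feature: y_hat = a + b*x.
    m = len(xs)
    xbar = sum(xs) / m
    ybar = sum(ys) / m
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys)) / \
        sum((xi - xbar) ** 2 for xi in xs)
    a = ybar - b * xbar
    return 1 if a + b * x >= 0.5 else 0

def knn_predict(x, xs, ys, k=3):
    # Majority vote among the k nearest neighbours of x.
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if votes > k / 2 else 0

print(linreg_predict(2.4, xs, ys), knn_predict(2.4, xs, ys))
```

On this linearly separable toy set the two agree; k-NN pulls ahead when the true boundary is not a single line (and enough data is available), which is exactly the trade-off stated above.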