
4: Regression
CSC 4510 – Machine Learning
Dr. Mary-Angela Papalaskari
Department of Computing Sciences
Villanova University
Course website:
www.csc.villanova.edu/~map/4510/
The slides in this presentation are adapted from:
• The Stanford online ML course http://www.ml-class.org/
CSC 4510 - M.A. Papalaskari - Villanova University
Housing Prices (Portland, OR)
[Figure: scatter plot of price (in 1000s of dollars, axis 0 to 500) against size (feet², axis 0 to 3000); source: data file]
Supervised Learning: Given the “right answer” for each example in the data.
Regression Problem: Predict real-valued output.
Training set of housing prices (Portland, OR)

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Notation:
m = Number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable
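As a minimal sketch of this notation, the four rows shown in the table can be held in two parallel Python lists. The variable names and the helper `example` are hypothetical, and the rows elided by "…" are not reproduced, so `m` here counts only the rows shown. The indexing convention (x(i), y(i)) for the i-th example is an assumption taken from the Stanford course the slides are adapted from.

```python
# The four rows shown on the slide; the elided "…" rows are not included.
sizes_sqft = [2104, 1416, 1534, 852]   # x: size in feet^2 (input / feature)
prices_k = [460, 232, 315, 178]        # y: price in $1000s (output / target)

m = len(sizes_sqft)                    # number of training examples shown


def example(i):
    """Return the i-th training example (x(i), y(i)), counting from 1."""
    return sizes_sqft[i - 1], prices_k[i - 1]


print(m, example(1))
```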
Training Set → Learning Algorithm → h
Size of house → h → Estimated price
Linear Hypothesis: hθ(x) = θ0 + θ1·x
(Univariate linear regression)
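The linear hypothesis for one input variable, hθ(x) = θ0 + θ1·x, can be sketched as a small Python function. The function name `h` follows the slide; the parameter values below are hypothetical and chosen only to show the call, not fitted to the data.

```python
def h(x, theta0, theta1):
    """Linear hypothesis: estimated price (in $1000s) for a house of size x (feet^2)."""
    return theta0 + theta1 * x


# Hypothetical parameters, for illustration only (not learned from the training set).
estimate = h(2000, theta0=50.0, theta1=0.1)
print(estimate)
```

Different choices of θ0 and θ1 give different lines, which is why the learning algorithm's job is to pick them.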
Hypothesis: hθ(x) = θ0 + θ1·x
θi’s: Parameters
How to choose the θi’s?
Idea:
• Choose θ0, θ1 so that hθ(x) is close to y for our training examples
What are good measures of being “close”?
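Hypothesis: hθ(x) = θ0 + θ1·x
Parameters: θ0, θ1
Cost Function: J(θ0, θ1) = (1/(2m)) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))^2
Goal: minimize J(θ0, θ1) over θ0, θ1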
hθ(x): for fixed θ0, θ1, this is a function of x
J(θ0, θ1): a function of the parameters θ0, θ1
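A minimal sketch of a squared-error cost, assuming the form used in the Stanford course these slides are adapted from: J(θ0, θ1) = (1/(2m)) Σ (hθ(x^(i)) − y^(i))^2. The tiny data set below is hypothetical and lies exactly on the line y = x, so the cost is zero at θ0 = 0, θ1 = 1 and grows as the line moves away from the points.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1): half the mean squared prediction error."""
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data lying exactly on y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# For fixed data, J is a function of the parameters only.
for theta1 in (0.0, 0.5, 1.0):
    print(theta1, cost(0.0, theta1, xs, ys))
```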
Have some function J(θ0, θ1)
Want to minimize J(θ0, θ1) over θ0, θ1
Outline:
• Start with some θ0, θ1
• Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum
J(0,1)
1
0
J(0,1)
1
0
Next time: Gradient descent algorithm for univariate linear regression
(update θ0 and θ1 simultaneously)
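The slides defer the algorithm itself to the next lecture, so the following is only a sketch under assumptions: it uses the standard gradient descent rule θj := θj − α·∂J/∂θj with the squared-error cost J(θ0, θ1) = (1/(2m)) Σ (hθ(x^(i)) − y^(i))^2. The learning rate `alpha`, the iteration count, and the toy data are hypothetical. The point it illustrates is the simultaneous update: both gradients are computed from the old parameter values before either parameter changes.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit theta0, theta1 by gradient descent on the squared-error cost (sketch)."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # start with some theta0, theta1
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
        # Simultaneous update: both gradients above used the old parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data lying exactly on y = x.
theta0, theta1 = gradient_descent([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(theta0, theta1)
```

On the raw housing data, where sizes are in the thousands, a step size like this one would diverge; rescaling the input or using a much smaller `alpha` would be needed.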
Exercise: Let’s use Excel to find h(x) for the housing prices example, using the data file
(if you need a spreadsheet refresher try this: http://www.ncsu.edu/labwrite/res/gt/gt-reg-home.html#cal)
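For readers who want to check a spreadsheet result, here is a rough Python sketch of an ordinary least-squares line fit, which is the kind of fit a spreadsheet linear trendline produces. It uses only the four rows shown on the training-set slide, not the full data file, so the numbers will differ from a fit on the complete data. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = theta0 + theta1 * x for one input variable."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    theta1 = sxy / sxx                  # slope
    theta0 = y_mean - theta1 * x_mean   # intercept
    return theta0, theta1


# Only the four rows shown on the slide; the full data file is not reproduced here.
sizes_sqft = [2104, 1416, 1534, 852]
prices_k = [460, 232, 315, 178]

theta0, theta1 = fit_line(sizes_sqft, prices_k)
print(f"h(x) = {theta0:.2f} + {theta1:.4f} * x")
```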