WEEK 2 — SOFT COMPUTING & MACHINE LEARNING
YOSI KRISTIAN

Gradient Descent for Linear Regression

Gradient Descent (Single-Variable Linear Regression)
Have some function $J(\theta_0, \theta_1)$; want $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$.
Outline:
• Start with some initial $\theta_0, \theta_1$.
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum.

[Illustration: surface plot of $J(\theta_0, \theta_1)$ over $\theta_0$ and $\theta_1$; starting from different points, gradient descent can descend into different local minima.]

The Algorithm
Gradient descent algorithm:
repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (simultaneously for $j = 0$ and $j = 1$)
}
Correct (simultaneous update):
  $temp_0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
  $temp_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
  $\theta_0 := temp_0$;  $\theta_1 := temp_1$
Incorrect (sequential update): assigning $\theta_0$ before computing the $\theta_1$ derivative means the $\theta_1$ update sees the new $\theta_0$, which is no longer a gradient step from the original point.

Algorithm Explained
$\alpha$ is the learning rate; the partial-derivative term sets the direction and relative size of each step.
• If $\alpha$ is too small, gradient descent can be slow.
• If $\alpha$ is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.

Fixed Learning Rate
Gradient descent can converge to a local minimum even with the learning rate $\alpha$ fixed: as we approach a local minimum, the derivative shrinks, so gradient descent automatically takes smaller steps. There is no need to decrease $\alpha$ over time.

Applying Gradient Descent to Linear Regression
Linear regression model:
  $h_\theta(x) = \theta_0 + \theta_1 x$
  $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \big(h_\theta(x^{(i)}) - y^{(i)}\big)^2$
Substituting the derivatives of $J$ into the update rule gives:
repeat until convergence {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \big(h_\theta(x^{(i)}) - y^{(i)}\big)$
  $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \big(h_\theta(x^{(i)}) - y^{(i)}\big)\, x^{(i)}$
}
(update $\theta_0$ and $\theta_1$ simultaneously)

Remember the Local-Minimum Problem?
It won't happen here: the linear-regression cost $J(\theta_0, \theta_1)$ is convex (bowl-shaped), so it has a single global minimum.

"Batch" Gradient Descent
"Batch": each step of gradient descent uses all the training examples.

Visualization
[Illustration sequence: on the left, $h_\theta(x)$ plotted over the data (for fixed $\theta_0, \theta_1$, a function of $x$); on the right, the contour plot of $J(\theta_0, \theta_1)$ (a function of the parameters). Each gradient-descent step moves the parameter point toward the center of the contours as the fitted line improves.]
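The batch update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the assignment solution: the synthetic data (points on $y = 2x + 1$), the learning rate, and the iteration count are my own choices.

```python
# Minimal sketch of batch gradient descent for one-variable linear regression.
# Data and hyperparameters (alpha, n_iters) are illustrative assumptions.

def gradient_descent(xs, ys, alpha=0.1, n_iters=1000):
    """Fit h(x) = theta0 + theta1*x by minimizing
    J = (1/2m) * sum((h(x_i) - y_i)^2) with simultaneous updates."""
    theta0, theta1 = 0.0, 0.0          # start at (0, 0)
    m = len(xs)
    for _ in range(n_iters):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Compute BOTH gradients before updating either parameter
        # (the "correct: simultaneous update" from the slides).
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Synthetic data on the line y = 2x + 1 (an assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
t0, t1 = gradient_descent(xs, ys)
print(round(t0, 3), round(t1, 3))   # → 1.0 2.0
```

Because each step sums over all five examples, this is exactly the "batch" variant named above.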
Homework
Create a program to demonstrate gradient descent on a one-variable linear regression problem.
• Use the diamond data.
• Input: 1 variable. Output: 1 variable.
• Visualize your program (MSE and the regression line).
• The parameters $\theta_0, \theta_1$ must be manually initializable.

Linear Regression with Multiple Variables

Multiple Features
Previously: one feature $x$ (e.g. house size) predicting one output $y$ (price), with $h_\theta(x) = \theta_0 + \theta_1 x$.
Now: multiple features (variables).
Notation:
  $n$ = number of features
  $x^{(i)}$ = input (features) of the $i$-th training example
  $x_j^{(i)}$ = value of feature $j$ in the $i$-th training example
Hypothesis:
  $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$
For convenience of notation, define $x_0 = 1$, so that
  $h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x$.
This is multivariate linear regression.

Gradient Descent for Multiple-Variable Linear Regression
Hypothesis: $h_\theta(x) = \theta^T x$
Parameters: $\theta_0, \theta_1, \dots, \theta_n$, treated as one $(n+1)$-dimensional vector $\theta$.
Cost function:
  $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \big(h_\theta(x^{(i)}) - y^{(i)}\big)^2$
Gradient descent:
repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$
} (simultaneously update for every $j = 0, \dots, n$)

New algorithm ($n \ge 1$):
repeat {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \big(h_\theta(x^{(i)}) - y^{(i)}\big)\, x_j^{(i)}$
} (simultaneously update $\theta_j$ for $j = 0, \dots, n$)
Previously ($n = 1$) we had separate updates for $\theta_0$ and $\theta_1$; with $x_0^{(i)} = 1$, both cases are the same formula.

Gradient Descent in Practice I: Feature Scaling
Idea: make sure features are on a similar scale.
E.g. $x_1$ = size (0–2000 feet²), $x_2$ = number of bedrooms (1–5). With ranges this different, the contours of $J(\theta)$ are long and thin, and gradient descent converges slowly.
Feature scaling: get every feature into approximately a $-1 \le x_j \le 1$ range.
Mean normalization: replace $x_j$ with $x_j - \mu_j$ to make features have approximately zero mean (do not apply to $x_0 = 1$).

Gradient Descent in Practice II: Choosing the Learning Rate
Making sure gradient descent is working correctly: plot $J(\theta)$ against the number of iterations; $J(\theta)$ should decrease on every iteration.
Example automatic convergence test: declare convergence if $J(\theta)$ decreases by less than some small $\epsilon$ (e.g. $10^{-3}$) in one iteration.
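The multivariate update and mean normalization described above can be sketched as follows. The toy housing-style data (prices generated from an assumed linear rule), the learning rate, and the iteration count are illustrative assumptions, not course material.

```python
# Sketch: gradient descent for multivariate linear regression with
# mean normalization / feature scaling. Data are made up for illustration.

def scale_features(X):
    """Replace each feature x_j with (x_j - mu_j) / s_j, where s_j is the
    feature's range (max - min). Returns the scaled matrix plus (mu, s)."""
    n = len(X[0])
    mu = [sum(row[j] for row in X) / len(X) for j in range(n)]
    s = [(max(row[j] for row in X) - min(row[j] for row in X)) or 1.0
         for j in range(n)]
    scaled = [[(row[j] - mu[j]) / s[j] for j in range(n)] for row in X]
    return scaled, mu, s

def gradient_descent_multi(X, y, alpha=0.5, n_iters=5000):
    """theta_j := theta_j - alpha*(1/m)*sum_i (h(x_i) - y_i)*x_ij,
    with x_0 = 1 prepended to every example (simultaneous update)."""
    Xb = [[1.0] + row for row in X]            # add the x_0 = 1 column
    m, n = len(Xb), len(Xb[0])
    theta = [0.0] * n
    for _ in range(n_iters):
        errors = [sum(t * x for t, x in zip(theta, row)) - yi
                  for row, yi in zip(Xb, y)]
        # All gradients computed before any parameter changes.
        grads = [sum(e * row[j] for e, row in zip(errors, Xb)) / m
                 for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grads)]
    return theta

# Toy housing-style data: price = 100 + 0.5*size + 20*bedrooms (assumed).
X = [[1000, 2], [1500, 3], [2000, 3], [2500, 4], [3000, 5]]
y = [100 + 0.5 * size + 20 * beds for size, beds in X]
Xs, mu, s = scale_features(X)
theta = gradient_descent_multi(Xs, y)
# Predict for a 1800 ft^2, 3-bedroom house, scaling with the SAME mu, s.
pred = theta[0] + sum(t * (v - m_) / s_
                      for t, v, m_, s_ in zip(theta[1:], [1800, 3], mu, s))
print(round(pred, 1))   # close to 100 + 0.5*1800 + 20*3 = 1060.0
```

Note that scaling lets a fairly large fixed $\alpha$ work; on the raw features (size in the thousands) the same $\alpha$ would diverge.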
Making Sure Gradient Descent Is Working (contd.)
If $J(\theta)$ increases with the number of iterations, or repeatedly goes up and down, gradient descent is not working: use a smaller $\alpha$.
For sufficiently small $\alpha$, $J(\theta)$ should decrease on every iteration. But if $\alpha$ is too small, gradient descent can be slow to converge.
Summary:
• If $\alpha$ is too small: slow convergence.
• If $\alpha$ is too large: $J(\theta)$ may not decrease on every iteration; it may not converge.
To choose $\alpha$, try a range of values spaced roughly threefold apart, e.g. …, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, …

Homework
Create a program to demonstrate gradient descent on a multiple-variable linear regression problem.
• Use the housing data.
• Input: 2 variables. Output: 1 variable.
• The parameters $\theta$ must be manually initializable, and $\alpha$ must be customizable.
• Apply feature scaling.

Features and Polynomial Regression
Housing-price prediction: the choice of features matters. Instead of fitting a straight line of price ($y$) against size ($x$), we can define new features that are powers of existing ones, e.g. $x_1 = \text{size}$, $x_2 = \text{size}^2$, $x_3 = \text{size}^3$, and fit
  $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3$
with the same multivariate linear-regression machinery. Feature scaling becomes especially important here, because the powers of size span very different ranges.

Fin.
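The learning-rate diagnostics above — tracking $J(\theta)$ per iteration, the automatic convergence test, and spotting divergence — can be sketched like this. The data, the two $\alpha$ values, and the threshold $\epsilon$ are illustrative assumptions.

```python
# Sketch of convergence diagnostics: record J(theta) every iteration, stop
# when the per-iteration decrease drops below eps (the automatic convergence
# test), and observe that a too-large alpha makes J grow instead of shrink.

def run_gd(xs, ys, alpha, max_iters=10000, eps=1e-3):
    """One-variable linear regression; returns (theta0, theta1, J_history)."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    history = []
    for _ in range(max_iters):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        J = sum(e * e for e in errors) / (2 * m)
        # Automatic convergence test: J decreased, but by less than eps.
        if history and 0 <= history[-1] - J < eps:
            history.append(J)
            break
        history.append(J)
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1, history

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]          # points on y = 2x + 1 (assumed)

_, _, ok = run_gd(xs, ys, alpha=0.1)                  # small enough: J falls
_, _, bad = run_gd(xs, ys, alpha=0.7, max_iters=50)   # too large: J grows
print(ok[-1] < ok[0], bad[-1] > bad[0])   # → True True
```

Plotting `ok` and `bad` against the iteration index reproduces the two diagnostic curves from the slides: a steadily decreasing $J(\theta)$ versus a diverging one.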